Research Collection

Doctoral Thesis

Structure of the human transcription factor TFIIF

Author(s): Gaiser, Florian

Publication Date: 2000

Permanent Link: https://doi.org/10.3929/ethz-a-003928353

Rights / License: In Copyright - Non-Commercial Use Permitted


ETH Library Diss. ETH No. 13735

Structure of the human transcription factor TFIIF

A dissertation submitted to the Swiss Federal Institute of Technology Zurich

for the degree of Doctor of Natural Sciences

presented by Florian Gaiser, Dipl. Natw. ETH, born May 29, 1972, from Munich (Germany)

Prof. Dr. T. J. Richmond, examiner

Prof. Dr. M. Grütter, co-examiner

Zurich, 2000

Table of contents

Table of contents

List of figures

List of tables

Abstract

Zusammenfassung

Abbreviations

1 Introduction

1.1 Eukaryotic gene expression and the role of transcription

1.2 The Role of TFIIF in RNA polymerase II transcription

1.2.1 Assembly of the RNA polymerase II preinitiation complex (PIC)

1.2.2 The RNA polymerase II holoenzyme

1.2.3 Initiation of phosphodiester bond formation by RNA polymerase II

1.2.4 Transcript elongation by RNA polymerase II

1.2.5 TFIIF in activated transcription

1.3 Domain structure of human TFIIF

1.3.1 Domain structure of RAP74

1.3.2 Domain structure of RAP30

1.4 Outline of the thesis project and its goals

2 Materials and Methods

2.1 Materials and Apparatus

2.1.1 Sources of general chemicals

2.1.2 Purification of polyethylene glycol (PEG)

2.1.3 Buffers and solutions

2.1.4 Media for bacterial growth

2.1.5 Enzymes for DNA-subcloning

2.1.6 Proteolytic enzymes

2.1.7 General apparatus

2.1.8 HPLC equipment

2.1.9 FPLC equipment

2.1.10 Centrifuges

2.1.11 Heavy atom compounds

2.1.12 Crystallization, X-ray sources and X-ray detection

2.1.13 Computing

2.2 DNA analysis

2.2.1 Concentration

2.2.2 DNA Polyacrylamide gel electrophoresis (DNA PAGE)

2.2.3 Electrophoretic DNA markers

2.3 DNA purification

2.3.1 Ethanol precipitation of DNA fragments

2.3.2 Spun column purification

2.3.3 Purification of synthetic oligonucleotides by denaturing gel electrophoresis

2.3.4 Agarose gel purification

2.3.5 Medium scale alkaline lysis plasmid preparation

2.4 Cloning methods

2.4.1 Competent cell preparation

2.4.2 Restriction digestion

2.4.3 Dephosphorylation

2.4.4 Ligation

2.4.5 Plasmid transformation

2.4.6 Design of PCR primers

2.4.7 PCR subcloning with cohesive-ended inserts

2.4.8 PCR subcloning with blunt-end inserts

2.4.9 PCR-subcloning based site directed mutagenesis (PCR-SDM)

2.4.10 DNA sequencing

2.4.11 PCR screening

2.4.12 Small scale alkaline lysis (miniprep)

2.5 Protein analysis

2.5.1 Concentration

2.5.2 SDS Polyacrylamide gel electrophoresis (SDS PAGE)

2.5.3 Electroblotting and N-terminal sequencing

2.5.4 Mass spectrometry

2.5.5 Limited proteolysis

2.5.6 Analytical gel filtration

2.5.7 Dynamic light scattering

2.5.8 Ellman's assay

2.6 Protein expression methods

2.6.1 Small scale expression test in 2xTY-media

2.6.2 Large (6 l) scale expression in 2xTY-media

2.6.3 Small scale expression test in SeMet-M9-media

2.6.4 Large (3 l) scale expression in SeMet-M9-media

2.6.5 Solubility test

2.7 Protein purification methods

2.7.1 Purification of human RAP74(2-517)

2.7.2 Purification of the C-terminal domain of human RAP74(364-517)

2.7.3 Purification of the N-terminal domain of human RAP74

2.7.4 Purification of human RAP30 and its N-terminal domain

2.7.5 Refolding and purification of human TFIIF

2.7.6 Refolding and purification of the RAP30/74-interaction domains

2.8 Crystallization screens

2.9 MeHgNO3-prelabelling of TFIIF

3 Design of crystallization constructs based on human TFIIF domain structure by limited proteolysis

3.1 General scheme for limited proteolytic analysis

3.2 Results of limited endoproteolysis

3.3 Results of limited exoproteolysis

3.4 Design of crystallization constructs based on the approximate domain structure of human TFIIF by endoproteolysis

3.5 Design of more crystallization constructs based on precise definition of domain boundaries by exoproteolysis

4 Crystallization of the RAP30/74-interaction domains of human TFIIF

4.1 Introduction of a general scheme for crystallization screening

4.2 Preparation of native crystals and collection of native data

4.2.1 Initial crystallization screening

4.2.2 Refinement of initial crystallization conditions

4.2.3 Cryoprotection screening and flash cooling

4.2.4 Dehydration screening

4.2.5 Collection of native data for RAP30(2-119)/RAP74(2-158)

4.2.6 Collection of native data for RAP30(2-119)/RAP74(2-154)

4.2.7 Collection of native data for RAP30(2-119)/RAP74(2-172)

4.3 Discussion of crystallization screening and data collection

5 Screening for heavy atom derivatives for the RAP30/74-interaction domains of human TFIIF

5.1 Introduction of a general scheme for derivative screening

5.2 Derivative crystal preparation and derivative data collection

5.2.1 Derivative crystal screening by heavy atom compound soaking

5.2.2 Derivative crystal screening by heavy atom compound cocrystallization

5.2.3 Investigation of cysteine-alanine mutant proteins for derivative screening

5.2.4 Selenomethionine in vivo prelabeling of wild type and methionine-mutant proteins

5.2.5 Methylmercury in vitro prelabeling

5.3 Discussion of derivative screening and data collection

6 Crystal structure of the RAP30/74 interaction domains of human TFIIF

6.1 Overview of the structure determination process from phase determination to model validation

6.2 Phase determination and phase improvement by density modification and phase cycling

6.3 Model building and refinement

6.4 A novel heterodimerization fold of the RAP30/74 interaction domains of human TFIIF at 1.7 Å resolution

6.4.1 A novel "triple barrel" heterodimerization fold

6.4.2 The RAP74 "arm domain"

6.4.3 The C-terminal α-helix of RAP74(2-172)

7 Conclusions and future perspectives

References

Appendix

Human RAP74 amino acid and DNA sequences

Human RAP30 amino acid and DNA sequences

Multiple sequence alignment of RAP74 homologues

Multiple sequence alignment of RAP30 homologues

Acknowledgements

Curriculum Vitae

List of figures

Figure 1: Basal transcription by RNA polymerase II

Figure 2: Structures of the RNA polymerase II transcription machinery

Figure 3: Domain structure of human TFIIF

Figure 4: PCR based site-directed mutagenesis

Figure 5: Limited proteolysis assays

Figure 6: General scheme for limited proteolytic analysis

Figure 7: Initial RUNNING-assays with various endoproteases

Figure 8: Inhibitor testing with STOPPING-assay

Figure 9: Efficiency of inhibition and relative stability of TFIIF proteolytic fragments

Figure 10: HPLC-MS-analysis of chymotryptic digestion product

Figure 11: Initial RUNNING- and STOPPING-assays with carboxypeptidases

Figure 12: Comparison of various reaction conditions for exoproteases

Figure 13: Design of N-terminal RAP30 and RAP74 constructs for crystallization

Figure 14: General scheme of crystallization screening

Figure 15: Refinement of crystallization conditions: RAP30(2-119)/RAP74(2-172)

Figure 16: Crystals of RAP30(2-119)/RAP74(2-154), RAP30(2-119)/RAP74(2-158) and alignment of the unit cell axes b and b" with crystal morphology

Figure 17: Crystal forms of RAP30/74 interaction domain of human TFIIF before and after dehydration

Figure 18: Radiation damage during high resolution data collection of native RAP30(2-119)/RAP74(2-172)

Figure 19: General scheme of derivative screening

Figure 20: Methylmercury prelabeled crystals of RAP30(2-119)/RAP74(2-154)

Figure 21: XAFS-experiment and theoretical plot of the mercury anomalous scattering components

Figure 22: Stability of crystalline diffraction limits during MAD-data collection

Figure 23: Process of structure determination for RAP30/74 interaction domains of human TFIIF

Figure 24: Anomalous difference Patterson map

Figure 25: Vector recombination diagram

Figure 26: Phasing Power and R of final heavy atom model

Figure 27: Effect of heavy atom parameter refinement in MLPHARE and density modification with DM

Figure 28: B-factor distribution of the water molecules

Figure 29: Mean Cα-B-factors and mean NCS-Cα-distances for the RAP30/74-interaction domains of human TFIIF after superposition

Figure 30: Structure of RAP30/74-interaction domains of human TFIIF

Figure 31: Electron density maps from the "triple barrel" core

Figure 32: Superposition of the four copies of the RAP30/74-interaction domain

Figure 33: Solvent accessible surfaces

Figure 34: Sequence alignments of RAP30(2-119) and RAP74(2-172)

Figure 35: Functional aspects of the RAP30/74-interaction domains

Figure 36: Molecular weight of RAP30/74-interaction domain complexes as determined by gel filtration and dynamic light scattering

List of tables

Table 1: The general transcription factors of human RNA polymerase II

Table 2: Cloning strains and plasmids

Table 3: Sequencing-Primers

Table 4: PCR-Primers

Table 5: Endoproteases

Table 6: Exoproteases

Table 7: Bacterial expression strains and expression plasmids

Table 8: Results of limited endoproteolysis of full length TFIIF

Table 9: Results of limited exoproteolysis of RAP30(2-119)/RAP74(2-172) with Carboxypeptidases

Table 10: Refinement of crystallization conditions (incomplete list of parameters)

Table 11: Parameters in cryoprotection and dehydration screening

Table 12: Overview of the crystallization of RAP30/74-complexes and the crystallization screens used

Table 13: Variation of diffraction limits of RAP30(2-124)/RAP74(2-118) with additive salt

Table 14: Final crystallization conditions for RAP30/74 interaction domains of human TFIIF

Table 15: Crystallographic characterization of crystals used for post growth treatment screening

Table 16: Minimal effective concentrations for cryoprotectants

Table 17: Minimal temperatures for slow cooling gradients

Table 18: Cryoprotection experiments with RAP30(2-124)/RAP74(2-158)

Table 19: Evaluation of different kinds of PEG for dehydration

Table 20: Final dehydration protocol for RAP30(2-119)/RAP74(2-158)

Table 21: Native datasets of RAP30(2-119)/RAP74(2-158)

Table 22: Native dataset of RAP30(2-119)/RAP74(2-154)

Table 23: Native dataset of RAP30(2-119)/RAP74(2-172)

Table 24: Parameters in derivative screening by soaking method

Table 25: Derivative screening with RAP30(2-119)/RAP74(2-158) crystals

Table 26: Derivative screening with RAP30(2-119)/RAP74(2-154) crystals

Table 27: Derivative screening with RAP30(2-119)/RAP74(2-172) crystals

Table 28: Heavy atom cocrystallization with RAP30(2-119)/RAP74(2-172)

Table 29: Characterization of cysteine-alanine mutant crystals

Table 30: Characterization of selenomethionine labeled (methionine mutant) crystals

Table 31: Native dataset of selenomethionine in vivo prelabeled RAP30(2-119)-L106M/RAP74(2-172)-F127M

Table 32: MAD-data collection for methylmercury prelabeled RAP30(2-119)/RAP74(2-172)

Table 33: Refinement of the RAP30/74-interaction domains of human TFIIF

Abstract

The human transcription factor IIF (TFIIF) consists minimally of a heterodimer of the RNA polymerase II associated proteins RAP30 and RAP74. They have multiple functions along the transcription process, which involves first the assembly of general transcription factors (GTFs) including TFIID, TFIIB, TFIIF, TFIIE, TFIIH and RNA polymerase II on the promoter to form a preinitiation complex, followed by initiation of phosphodiester bond formation, promoter clearance and finally elongation of the nascent pre-mRNA chain. According to previous solution and mutagenesis experiments, the functions of RAP30 and RAP74 are mediated through interactions of distinct structural domains with RNA Pol II, TFIIB, TAFII250 and DNA. The heterodimerization sites of RAP30 and RAP74 are located in their respective N-terminal domains.

In this thesis, the domain structure of TFIIF as derived from these previous studies was confirmed by limited proteolysis of recombinant RAP30/74-complexes with endo- and exoproteases. The breakdown products were identified by N-terminal sequencing and combined HPLC-MS-analysis. Comparison of these proteolytic fragments with the previously defined functional domains showed close correspondence, which is consistent with three distinct structural and functional domains per TFIIF-subunit. The limited proteolysis results were used to design RAP30 and RAP74 constructs for crystallization screening with RAP30/RAP74 complexes. A complex of the N-terminal RAP30/74-interaction domains, RAP30(2-119) and RAP74(2-172), crystallized. Upon dehydration with polyethylene glycol the diffraction limit of the crystals was extended from 3.8 to 1.7 Å resolution, such that the first structure of the RAP30/74-interaction domains could be solved. The crystal structure reveals a new dimerization motif termed a "triple barrel". The intimately interwoven secondary structure elements of the two TFIIF subunits enclose a large buried interaction surface between the TFIIF-subunits, which indicates very stable binding. Closer analysis of the structure suggests that interactions with the transcription apparatus are mediated not only by this tripartite β-barrel, but also via flexible loops and loosely associated α-helices and β-strands extending from it. This model is a foundation for site-targeted mutagenesis studies to further understand the role of TFIIF in basal and activated transcription.

Zusammenfassung

The human transcription factor IIF (TFIIF) contains at least one heterodimer of the RNA polymerase II associated proteins RAP30 and RAP74. The two TFIIF subunits take part in every step of the transcription of genomic DNA into mRNA. Transcription begins when the general transcription factors (GTFs) TFIID, TFIIB, TFIIF, TFIIE and TFIIH assemble with RNA polymerase II into an initiation complex. After the first phosphodiester bonds have been formed, RNA polymerase breaks away from the transcription factors and leaves the promoter. When the gene has been transcribed completely, the polymerase falls off the DNA and a new transcription cycle begins. Biochemical and molecular biological experiments have shown that distinct functional domains of the TFIIF subunits take part in the individual steps of transcription through specific contacts with RNA polymerase II, TFIIB, TAFII250 and DNA.

In this thesis, the domain structure of TFIIF derived from the studies mentioned above was confirmed by partial proteolysis of recombinant TFIIF with endo- and exoproteases. The breakdown products were sequenced from the N-terminus and identified by HPLC-MS analysis. The proteolytic fragments agreed largely with the previously defined functional domains. Accordingly, there are three structural and functional domains per TFIIF subunit. These results were used to design RAP30 and RAP74 constructs for the search for crystallization conditions of a RAP30/74 heterodimer. A complex of the N-terminal RAP30/74 binding domains, RAP30(2-119) and RAP74(2-172), crystallized. Dehydration with polyethylene glycol improved the resolution of these crystals from 3.8 Å to 1.7 Å, so that the first structure of the RAP30/74 binding domains could be solved. The crystal structure shows a new dimerization motif, which was named "triple barrel". The tightly interwoven secondary structure elements of the two TFIIF subunits enclose a large, water-free contact surface, which points to stable binding. A closer analysis of the structure indicates that not only the "triple barrel" can interact with the transcription apparatus, but also the flexible loops, α-helices and β-strands that are loosely attached to the core of the structure. This model is the starting point for targeted mutagenesis studies to better understand the role of TFIIF in basal and activated transcription.

Abbreviations

bp: Base pairs

DEAE: Diethylaminoethyl

DNA: Deoxyribonucleic acid

DNase: Deoxyribonuclease

dsDNA: Double-stranded DNA

FPLC: Low pressure liquid chromatography

GTF: General transcription factor

HPLC: High pressure liquid chromatography

kbp: Kilo base pairs

kDa: Kilodalton

LMWM: Low molecular weight marker

Mr: Molecular weight

MS: Mass spectrometry

MWCO: Molecular weight cut-off

NCS: Non-crystallographic symmetry

NMR: Nuclear magnetic resonance spectroscopy

OAc: Acetate

ORF: Open reading frame

PCR: Polymerase chain reaction

PIC: Preinitiation complex

Pol II: RNA polymerase II

RAP: RNA polymerase II associated protein

RMSD: Root mean square deviation

RNA: Ribonucleic acid

RNase: Ribonuclease

rpm: Rotations per minute

RT: Room temperature

ssDNA: Single-stranded DNA

TAF: TATA-box binding protein associated factor

TBE: Tris-borate-EDTA

TE: Tris-EDTA

TF: Transcription factor

Remark: Abbreviations of chemical names are listed in section 2.1.

1 Introduction

1.1 Eukaryotic gene expression and the role of transcription

The flow of genetic information from the genetic disposition as stored in the genome (genotype) to the appearance and functionality of living cells (phenotype) is a two step process called gene expression. It involves first transcription of the respective DNA sequences into messenger RNA (mRNA) by DNA-dependent RNA polymerases, then translation of the mRNA code into protein sequence, which is performed by the ribosomes. Depending on metabolic state, environment or its specific task, the cell has to activate the appropriate subset of its genes. Regulation of gene expression is crucial for normal growth and development. It occurs mainly at the level of DNA transcription, although translation is tightly regulated as well [1].

In eukaryotes, DNA transcription is performed by three different DNA-dependent RNA polymerases denoted as RNA polymerases I, II and III. They are structurally similar to one another, comprising 10 or more subunits with molecular weights between 10 and 220 kDa. Some of the subunits are common, others unique. RNA polymerase I synthesizes the large ribosomal RNAs essential for translation. RNA polymerase III makes a variety of very small stable RNAs including small 5S ribosomal RNA and transfer RNAs. However, all genes whose RNAs will be translated into proteins, as well as small nuclear RNA genes that are involved in RNA processing, are transcribed by RNA polymerase II.

Transcription of a specific protein gene by RNA polymerase II is the product of a collaboration among the basal transcription machinery [4] that is common to all genes, gene specific activators and repressors, as well as chromatin and its modifying enzymes [8]. The intriguing complexity of basal transcription and its regulation becomes evident when we consider the number of components involved. Even the simple eukaryote S. cerevisiae has approximately 200 transcriptional regulators which are responsible for regulation of 6200 genes. At the promoter of each gene, a specific subset of these regulators interacts with one or more of approximately 50 factors that form the basal transcription initiation machinery. Yet another set of components gets involved when it comes to regulation of elongation and finally transcript termination and release [10]. While the countless regulatory pathways and mechanisms of activated transcription involve many different factors and are poorly understood, the structures and mechanisms of basal transcription have been elucidated to a large extent. Preinitiation and initiation stages of basal transcription have received the most attention in the past, but recent studies on transcription have also focused on the RNA polymerase II elongation complex, such that a reasonably complete model of the entire basal transcription process can be presented (Figure 1).

First, a preinitiation complex (PIC) is assembled on the promoter DNA. In eukaryotes, specific promoter recognition and transcription initiation requires, in addition to RNA polymerase II, several general transcription factors (GTFs): TBP/TFIID, TFIIA, TFIIB, TFIIE, TFIIF and TFIIH [4]. For many promoters this set of GTFs is necessary and sufficient to support accurate transcription at a basal level. These protein factors may comprise a single polypeptide chain (TFIIB) or as many as twelve subunits (Table 1), which are conserved between yeast and men. Upon PIC assembly, a superhelical turn of promoter DNA is wrapped around the PIC to assist ATP-driven promoter melting [14]. After DNA strand separation the first 5-10 RNA phosphodiester bonds are formed. Phosphorylation of the C-terminal domain (CTD) of the largest RNA polymerase II subunit is followed by ATP-driven promoter clearance [15]. Upon promoter escape, most of the general initiation factors dissociate sequentially from the polymerase and are recycled for the next round of transcription, similar to the residual TFIID-TFIIA-promoter complex [16]. The elongating polymerase proceeds along the DNA template in little bursts like an inchworm. Whenever it faces an obstacle, it requires help from additional factors to proceed until specific termination signals result in release of the complete mRNA transcript and template dissociation by RNA polymerase II [17]. Before the enzyme can enter the next round of transcription, the CTD has to be dephosphorylated by a phosphatase [18] (Figure 1).

The general transcription factor TFIIF is the only basal, i.e. essential, transcription factor that remains bound to RNA polymerase II during the whole cycle of transcription initiation, elongation, termination, and recycling. It consists of two subunits called RAP30 and RAP74 (= RNA Polymerase Associated Proteins), which are involved at each individual stage of transcription (Table 1). It has been shown previously that TFIIF transcription functions are carried out by distinct functional domains of the RAP-subunits. The goal of the presented study is to define the structural domains of TFIIF, relate these to function and solve the high resolution crystal structure of the RAP30/74-complex.

Figure 1: Basal transcription by RNA polymerase II.

Components of the basal transcription machinery of the stepwise PIC-assembly model are in black. Components that are only part of the "holoenzyme" model or activated transcription are drawn in light gray.

[Figure not reproduced: diagram of the transcription cycle on promoter DNA (TATA near -30, INR at +1), with the stages stepwise PIC assembly, PIC closed, DNA wrapping, DNA melting, PIC open, initiation, promoter escape, elongation (competent and incompetent), pausing/arrest, termination, CTD dephosphorylation and recycling; labelled components include TFIIB, TFIIE, TFIIF/RNA pol II and TFIIH.]

1.2 The Role of TFIIF in RNA polymerase II transcription

1.2.1 Assembly of the RNA polymerase II preinitiation complex (PIC)

Transcription of protein coding genes by RNA polymerase II starts with the assembly of the preinitiation complex (PIC) on promoter DNA, which must contain core promoter elements that can direct RNA polymerase II to a specific transcription start site. This distinguishes accurate from general transcription initiation at random start sites. The most common core promoter element is the TATA-box (consensus TATAa/tAa/t), located near positions -30 to -25 with respect to the transcription start site (+1). Another abundant control element is the pyrimidine-rich initiator (Inr, consensus YYANt/aYY), which overlaps the transcription start sites of many TATA-containing promoters. Both core promoter elements are present at the Adenovirus Major Late Promoter (AdMLP), which has been widely used for the in vitro experiments that led to the current model of basal transcription by RNA polymerase II [11]. The prevalence of the AdMLP-system has led to a TATA-box dominated view of transcription initiation. Thus this introduction will focus on transcription initiation from TATA-box containing promoters. But there are also TATA-less promoters which do not contain consensus TATA-boxes or even TATA-related sequences. The mechanisms of transcription initiation from these promoters have been extensively reviewed elsewhere.
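As a small illustration of the two consensus sequences quoted above, the following Python sketch translates them into regular expressions and scans a DNA string for exact matches. It is only a sketch under stated assumptions: the names `consensus_to_regex` and `find_core_promoter_elements` and the example sequence are hypothetical and not part of this thesis; "a/t" is read as "A or T" (written W here), Y as any pyrimidine (C or T) and N as any base. Real promoters frequently deviate from the consensus, so exact matching of this kind would miss many functional elements.

```python
import re

# Symbols used in the consensus strings quoted in the text.
# "W" stands in for the "a/t" positions (A or T); Y = pyrimidine; N = any base.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "T": "T",
           "W": "[AT]", "Y": "[CT]", "N": "[ACGT]"}

TATA_CONSENSUS = "TATAWAW"  # TATAa/tAa/t
INR_CONSENSUS = "YYANWYY"   # YYANt/aYY


def consensus_to_regex(consensus):
    """Compile a consensus string into a regular expression.

    A lookahead is used so that overlapping matches are also reported.
    """
    body = "".join(SYMBOLS[symbol] for symbol in consensus)
    return re.compile("(?=(" + body + "))")


def find_core_promoter_elements(sequence):
    """Return 0-based start positions and matched text for each element."""
    seq = sequence.upper()
    hits = {}
    for name, consensus in (("TATA", TATA_CONSENSUS), ("Inr", INR_CONSENSUS)):
        pattern = consensus_to_regex(consensus)
        hits[name] = [(m.start(), m.group(1)) for m in pattern.finditer(seq)]
    return hits


if __name__ == "__main__":
    # Hypothetical test sequence, not a real promoter.
    example = "GGCTATAAAAGGCCGGCCTCACTCTGG"
    for element, matches in find_core_promoter_elements(example).items():
        print(element, matches)
```

The positions returned are offsets within the input string; relating them to the transcription start site (+1) would require knowing where +1 lies in the sequence, which this sketch does not assume.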

Not only the core promoter DNA sequence but also the organization of chromatin is essential to the regulation of transcription initiation. Nucleosomes have been shown to repress transcription in many different ways. First, they occlude sites for protein binding to DNA, thereby interfering with the formation of a preinitiation complex. Second, arrays of nucleosomes can adopt higher order structures repressing transcription of entire chromosomal domains. Finally, nucleosomes may condense to heterochromatin and repress gene expression in a hereditary manner. Most in vitro studies on eukaryotic transcription have been performed on non-chromatin templates, which is reflected in the current models. Thus, it is no surprise that the basal transcription machinery described below is not sufficient to transcribe nucleosomal DNA efficiently. Additional factors are needed, which are under intense investigation [8,10].

Even on non-chromatin templates, specific promoter recognition and preinitiation complex (PIC) formation requires, in addition to RNA polymerase II, a whole series of general transcription initiation factors (GTFs): TBP/TFIID, TFIIA, TFIIB, TFIIE, TFIIF and TFIIH. Since the mechanism for basal transcription initiation has been established based on stepwise in vitro assembly of the preinitiation complex (PIC) on the AdML-promoter, the prevailing idea is that the components of the PIC engage the promoter DNA sequentially [4,11-14]. More recent studies, however, suggest that in vivo transcription involves a preassembled holoenzyme containing RNA polymerase II and all GTFs essential for initiation (Figure 1). For practical reasons the components of the basal transcription machinery will be discussed along the classical preinitiation complex assembly pathway, and modifications based on the holoenzyme model will be interspersed where necessary.

The first general transcription factor to assemble with promoter DNA is TFIID. It is the only GTF that can bind core promoter elements in a sequence specific way. Therefore, TFIID orients and places the whole transcription machinery with respect to the transcription start site. Human TFIID consists of TBP (= TATA-box Binding Protein), which directly recognizes the TATA-box, and 12 TBP Associated Factors (TAFs), which have been implicated in direct and indirect DNA binding [11,27,28] (Table 1). The X-ray crystal structure of the conserved TBP core domain bound to DNA shows a saddle shaped molecule that has a hydrophobic cavity for binding across the minor groove of a sharply bent TATA-box. Two phenylalanine side chains are wedged between the bases to stabilize the DNA distortion [29]. The structure correlates perfectly with TBP's function as a protein that sits on DNA, creating a stable platform for binding further TFIID-subunits (TAFs) and general transcription factors (Figure 2A). TAFII250 is the backbone of the TFIID-complex. It binds directly to TBP and most of the other TAFs, which all together provide an extensive interaction surface for transcriptional activators and repressors. It was a surprise when the solution structure of the N-terminal TBP-interaction domain of TAFII250 with TBP revealed that the TAF was not bound to the back of TBP but occupied the DNA-binding site. In this complex TAFII250 mimics the hydrophobic and charged surfaces of the TATA-box minor groove, which supports reports that TAFII250 can displace TBP from promoter DNA and repress PIC assembly [31]. In addition, TAFII250 has significant RAP74-kinase and histone acetyl transferase activities. The first is important for transcriptional activation [32,33]; the latter may be involved in chromatin remodeling during preinitiation complex formation [24,34]. The histone homology regions of four of the smaller TAFs (dTAF42/62, hTAF18/20) adopt the canonical histone fold, consisting of two short alpha-helices flanking a long central alpha-helix. Like histones H3 and H4, the two pairs of TAFs form intimate heterodimers by extensive hydrophobic contacts. In solution and in the crystalline state, the dTAFII42/dTAFII62 complex exists as a heterotetramer, resembling the (H3/H4)2 heterotetrameric core of the histone octamer, suggesting that TFIID contains histone octamer-like substructures [36] (Figure 2A).

The TFIID-TATA-complex is stabilized by TFIIA binding to the upstream site of the TATA-box and TBP. TFIIA is a boot-shaped molecule composed of three intricately intertwined subunits that require cofolding (Table 1). Interaction of TFIIA with the N-terminal region of core TBP involves antiparallel β-strands, while the DNA is bound through unspecific backbone contacts of basic surface residues [37,38] (Figure 2B). The precise role of TFIIA in basal transcription initiation has been controversial, because the requirement of TFIIA depends on the in vitro transcription system used [11]. Basal transcription assays with recombinant TBP do not depend on TFIIA, and RNA polymerase II holoenzyme preparations sufficient for promoter-specific transcription do not even contain TFIIA [26] (Figure 1). However, in vitro transcription assays with highly purified TFIID fractions and recombinant GTFs require TFIIA [4,11]. TFIIA also has a well established role in mediating transcriptional activation signals to the basal transcription machinery. It is contacted by many activators including the general cofactor PC4 but is also involved in antirepression of transcription initiation [4,7].

The next general transcription factor to enter the nascent preinitiation complex is TFIIB. It makes direct contacts to TBP and the DNA downstream and upstream of the TATA-element. The acidic C-terminal stirrup of TBP is clamped in the basic cleft between the two cyclin-like repeats of the TFIIB C-terminal domain (Figure 2B). Despite remarkable structural similarity of the TFIIB repeats with the cell cycle protein cyclin A, there is no evidence that TFIIB regulates the activity of a cyclin dependent kinase. Instead, TFIIB stabilizes the kinked DNA-conformation of the TBP-TATA-box-complex through contacts with the phosphoribose backbone upstream and downstream of the TATA-box [12]. The N-terminal 50 amino acids of TFIIB form a zinc finger motif which has been found in many DNA-binding proteins but which in this case does not seem to bind DNA but protein [40] (Figure 2B). The N-terminus of TFIIB is important for accurate transcription start site selection and probably contributes to the binding surface for the incoming TFIIF/RNA polymerase II complex [41]. The overall shape of this binding surface has been presented at very low resolution (35 Å) based on electron microscopy image reconstruction of the TFIID-TFIIA-TFIIB (DAB) complex without DNA [42] (Figure 2C). This study revealed a horseshoe-shaped structure for TFIID. The TAFs are arranged around TBP in three lobes (A, B, C) with diameters around 60 Å that are connected by 15 Å bridges. TFIIA and TFIIB are located at these two bridges on opposite sides of the 65 Å cavity which probably holds the promoter DNA. TBP is located at the top of the cavity. This complex structure presents an extensive binding interface for RNA polymerase II.

Human RNA polymerase II (hRPB) itself is a 500 kDa complex consisting of 12 subunits ranging from 220 to 10 kDa43,44 (Table 1). The X-ray crystal structures of a 10 subunit yeast RNA polymerase II show a globular molecule with many grooves, channels and lobes45,46 (Figure 2D, E). The two largest subunits, yRPB1 and yRPB2, form the scaffold of the structure with a deep cleft between them which harbors the active site magnesium ion. Around the periphery, each of the smaller subunits occurs in a single copy. The structure is held together by secondary structure elements of the two largest subunits that traverse the catalytic cleft and a subcomplex of yRPB3, yRPB10, yRPB11, and yRPB12 at one end of the cleft. The largest three subunits, hRPB1-3, are highly conserved from yeast to man and show significant homology to the E. coli RNA polymerase subunits β, β' and α. They are essential for yeast viability and are probably responsible for RNA catalysis. Some smaller subunits of RNA polymerase II, yRPB5, yRPB6, yRPB8, yRPB10, and yRPB12, are also found in RNA polymerase I and III. They are involved in nuclear localization as well as maintenance and regulation of transcriptional efficiency. Surprisingly, the three yeast subunits yRPB4, yRPB7, and yRPB9 are not essential for yeast viability. There are three more small subunits attached to human RNA polymerase II (hRPB10-hRPB12). The yeast homologue of one of them is vital for the yeast cell despite its small size (46 amino acids)47. Although the three large RNA polymerase II subunits share many features with the three prokaryotic core subunits, the largest subunit of RNA polymerase II (RPB1) has a unique C-terminal domain (CTD) that is not shared by its prokaryotic homologue or human RNA polymerases I and III.

Human CTD contains 53 copies of the heptapeptide YSPTSPS. Reversible phosphorylation of these CTD repeats regulates initiation, promoter escape, elongation and recycling of the polymerase. Specific phosphatases and kinases act processively on the domain, which results in complete conversion between dephosphorylated RNA polymerase II (IIA), which is initiation competent, and a phosphorylated form (IIO), which is elongation competent48.
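The size of the phosphorylation target described above can be illustrated with a small count. The following is only a minimal sketch under a stated assumption: it models the CTD as 53 exact copies of the consensus heptapeptide, whereas real repeats may deviate from the consensus. The function names are hypothetical and chosen for illustration.

```python
# Idealized CTD model (assumption): 53 exact copies of the consensus
# heptapeptide YSPTSPS. Real CTD repeats may deviate from the consensus.
HEPTAD = "YSPTSPS"
N_REPEATS = 53


def build_ctd(heptad: str = HEPTAD, n_repeats: int = N_REPEATS) -> str:
    """Return the idealized CTD sequence as a one-letter amino acid string."""
    return heptad * n_repeats


def count_hydroxyl_residues(sequence: str) -> dict:
    """Count residues carrying a hydroxyl group (Ser, Thr, Tyr),
    i.e. the residue types that can in principle be phosphorylated."""
    return {residue: sequence.count(residue) for residue in "STY"}


ctd = build_ctd()
counts = count_hydroxyl_residues(ctd)
print(len(ctd))  # 371 residues in the idealized domain
print(counts)    # {'S': 159, 'T': 53, 'Y': 53}
```

Under this idealization the domain offers 265 hydroxyl-bearing residues in 371 amino acids, which gives a sense of why processive kinases and phosphatases can shift the polymerase between two clearly distinct forms.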

At this stage of the RNA polymerase II transcription cycle the general transcription factor TFIIF comes into play. In solution it is tightly associated with RNA polymerase II. TFIIF binds to the second largest subunit of RNA polymerase II (RPB2)48 through the C-terminal domain of RAP7452 and the central domain of RAP30. The latter has sequence homology to E. coli σ70-factor region 2, which binds to the bacterial RNA polymerase (Figure 3)51. The position of TFIIF on the RNA polymerase II catalytic face has been derived from the DNA trajectory54 in combination with photo-crosslinking and copper phenanthroline footprinting experiments55. In the preinitiation complex, TFIIF interacts with the DNA major groove between the TATA-box and the transcription start site54 (Figure 2D).

Before the TFIIF/RNA polymerase II complex can be incorporated into the preinitiation complex, specific phosphatases have to recycle it into an initiation competent dephosphorylated form (IIA). Yeast CTD-phosphatase is essential for growth of S. cerevisiae56. Human CTD-phosphatase, FCP1, is part of the RNA polymerase II holoenzyme and also binds to the C-terminal domain of RAP74, which activates the enzyme fivefold18. This C-terminal domain also includes DNA and RNA polymerase II binding sites, and accessibility to these sites is affected by N-terminal and central regions of RAP74. Central sequences of RAP74 mask RNA polymerase II binding, while N-terminal sequences (amino acids 1-84) counteract the effect58. Similarly, central sequences of RAP74 mask the CTD-phosphatase activation and N-terminal sequences, especially amino acids 74-84, compensate for this18. Whether this reflects effects on RNA polymerase II binding or changes to the direct interaction with FCP1 is not known. Further, TFIIF may support substrate binding by the phosphatase since it also interacts with CTD60. Once the underphosphorylated RNA polymerase II is incorporated into the preinitiation complex, TFIIB inhibits the TFIIF-dependent stimulation of the phosphatase59. Since TFIIB also binds to FCP118, it may function to displace FCP1 from its activator RAP74. Alternatively, TFIIB may compete for the CTD-substrate60. Inhibition of the CTD-phosphatase favors phosphorylation of CTD by specific kinases, which will become important for promoter escape at a later stage. In addition to CTD phosphatase regulation, TFIIF functions in analogy to the σ70-factor from E. coli: RAP30 prevents unspecific DNA-binding by RNA polymerase II61 and both TFIIF subunits contribute to inhibition of general transcription from random start sites52,58. Instead, they recruit the enzyme to the preinitiation complex for accurate transcription from specific start sites through direct interactions with components of the preassembled DAB-complex, which are DNA, TFIIB, and TFIID.

Like the E. coli σ70-factor, RAP30 has a cryptic DNA binding site at its C-terminus which loses DNA affinity in the context of the full-length protein62 but contacts the DNA in the preinitiation complex. The NMR structure of this domain shows a "winged" helix-turn-helix motif which is very similar to the structures of other DNA-binding proteins like the linker histone H5 or HNF-363 (Figure 2B). The C-terminal domain of RAP74 also binds to DNA58.

The interaction of TFIIF with TFIIB is not only important for CTD phosphatase regulation but also for stabilization of the preinitiation complex and transcription start site selection. In yeast, a single point mutation within the N-terminus of TFIIB (E62K) leads to transcription initiation from incorrect start sites. This mutation can be compensated for by RAP74 suppressor mutations, indicating that the interaction between TFIIB and TFIIF is important for correct spacing between the TATA-box and the catalytic center of RNA polymerase II at the transcription start site41,64,65. This may be a consequence of the close association of TFIIB with TFIIF, which involves the N-terminus and the first cyclin repeat of TFIIB (amino acids 1-201)66 as well as the C-terminal domain of RAP74 and/or the N-terminal half of RAP3052 (Figure 3). Since the RAP74 domain disrupts the RAP30-TFIIB interaction in a competition assay, the RAP74-TFIIB interaction is probably the physiologically more relevant one52. It may compete with the RAP74-FCP1 contact and therefore contribute to promoter clearance by RNA polymerase II.

The preinitiation complex is stabilized further through direct interactions of TFIID components with RAP30 and RAP74. Alanine scanning only revealed a minor contact between TBP and TFIIF67, whereas the TFIID scaffold TAFII250 interacts directly with RAP74 and two associated TAFs, TAFII100 and TAFII80, bind to RAP30 or RAP74, respectively68,70.

Upon binding of the TFIIF/RNA polymerase II complex, a loop of DNA, approximately 50-80 bp in length, is wrapped around the preinitiation complex (Figure 1). Then TFIIE enters the preinitiation complex through direct interactions with RNA polymerase II, TBP, TFIIB and TFIIF. TFIIE consists of two subunits called TFIIE-α and TFIIE-β (Table 1). TFIIE-α associates with RAP30 while TFIIE-β prefers RAP3071. Together with TFIIE-α, the N-terminal domain of RAP74 (amino acids 2-205) tightens the DNA loop around the PIC, inducing several DNA-protein contacts by polymerase subunits (RPB1, RPB2), TFIIF, and TFIIE-α72 (Figure 3). Wrapping and bending of promoter DNA assists DNA strand separation and phosphodiester bond formation not only in initiation but also during elongation73-75.

TFIIH is recruited to complete the preinitiation complex through a direct contact of its ERCC-3 helicase subunit with TFIIE-β71 (Figure 1). TFIIH exists in two different forms in the cell. "Core TFIIH", consisting of six subunits, associates either with five RAD proteins to form the "repairosome" involved in nucleotide excision repair (NER) or with a three-subunit protein kinase complex referred to as TFIIK or CAK. The NER activity of TFIIH is coupled to active transcription of the damaged genes, which indicates that TFIIH has a dual role in transcription initiation and elongation. The "core TFIIH-CAK" complex represents "holo TFIIH" (Table 1). Except for the TAFII250 subunit of TFIID, "holo TFIIH" is the only general transcription factor with catalytic activities, which include the CTD-kinase activity of CAK and the ATP-dependent DNA helicase activities of the two largest subunits (ERCC3, ERCC2). These participate in transcription initiation and promoter clearance by RNA polymerase II as described below.

Table 1: The general transcription factors of human RNA polymerase II.

Adapted from Coulombe (1994) and others. [Table body not recoverable from the source text. Columns: GTF, subunit(s), Mr (kDa), essential in yeast, properties. Rows cover the subunits of TFIIA, TFIIB, TFIID (TBP and TAFIIs), TFIIF (RAP30, RAP74), TFIIE, TFIIH (including ERCC3, ERCC2, CDK7, cyclin H and MAT1) and RNA polymerase II (RPB1 to RPB12); n.d.: not determined.]

1.2.2 The RNA polymerase II holoenzyme

The presented mechanism for preinitiation complex formation was established based on stepwise in vitro assembly of purified general transcription factors (GTFs) with RNA polymerase II onto the AdML-promoter4,11. A much more realistic model involves a preassembled RNA polymerase II holoenzyme that contains the RNA polymerase II subunits associated with many if not all general transcription initiation factors, the mediator complex, chromatin remodeling factors, some elongation factors and even termination factors8,26. Apparently, the machinery necessary for all steps of basal and activated transcription in vivo comes together in this holoenzyme before it binds to the promoter and forms a preinitiation complex (Figure 1).

The number and type of GTFs present in various holoenzyme preparations depend on the source of the holoenzyme and the isolation method. Some holoenzyme preparations were capable of promoter-specific transcription initiation, indicating that the full set of basal GTFs can be associated with RNA polymerase II. TFIIA, however, has not yet been purified as a component of the holoenzyme. In addition to the general initiation factors, the general elongation factor TFIIS has been copurified with RNA polymerase II. TFIIS and similar elongation factors are probably not sufficient to enhance elongation rates in vivo, since they fail to significantly activate elongation on chromatin templates in vitro48. This implies that nucleosomes are a major obstacle to RNA polymerase II initiation and elongation. Probably, additional chromatin remodeling factors like the SWI/SNF complex or FACT, which are associated with the RNA polymerase II holoenzyme, are needed for full initiation and elongation activity on chromatin templates48,76-78. Another major component of the RNA polymerase II holoenzyme is the mediator complex, which is required to support transcriptional activation in vitro. Nearly 20 subunits are associated with the dephosphorylated CTD of RNA polymerase IIA5,8. In addition, the RNA polymerase II holoenzyme contains factors involved in transcript termination and polymerase recycling. Three "cleavage and polyadenylation stimulating factors" (CPSF) are associated with the CTD of phosphorylated RNA polymerase IIO. The FCP1 phosphatase, which may be involved in RNA polymerase IIO recycling, was mentioned before.

1.2.3 Initiation of phosphodiester bond formation by RNA polymerase II

DNA wrapping and promoter melting

Initiation of phosphodiester bond formation requires melting of the DNA double helix around the transcription start site. This isomerization is referred to as "open complex formation". It involves bending of the promoter DNA by TBP, wrapping of 80-90 bp of DNA around the polymerase (base pairs -55 to +25)14 and another bend through the RNA polymerase active site72 (Figure 1). Tightening of the DNA loop between these two bends is induced by TFIIF and TFIIE and results in double helix unwinding at the transcription start site72. RAP74 contacts the DNA loop around the preinitiation complex along its entire length and induces DNA contacts by the two catalytic RNA polymerase II subunits (RPB1, RPB2), TFIIE-α and RAP3072. Since the contact area of TFIIF with DNA is approximately 300 Å long, it has been proposed that there may be two copies of TFIIF in the preinitiation complex. Gel filtration results also suggested that TFIIF formed an α2β2-heterotetramer in solution, and more recent studies have defined a RAP74 homodimerization domain within RAP74(172-205) which was proposed to stabilize these heterotetramers72. The deletion mutant RAP74(1-172) supports formation of the preinitiation complex at wild-type level but has very low activity in single round transcription. DNA-crosslinking experiments suggest that this transcriptional defect can be attributed to its inability to tightly wrap DNA around the preinitiation complex and induce the necessary protein-DNA contacts. Like tetramerization, DNA wrapping involves sequences of RAP74 between N172 and E205 (probably T158 to M177), indicating that tetramerization may be necessary for DNA wrapping72,74 (Figure 3).
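The length argument above can be checked with simple arithmetic. The sketch below assumes the canonical B-form rise of 3.4 Å per base pair and treats the wrapped DNA as a straight contour, which ignores bending and local distortion; the function names are hypothetical.

```python
# Assumption: canonical B-form DNA rise per base pair, in angstroms.
RISE_PER_BP_ANGSTROM = 3.4


def span_in_bp(upstream: int, downstream: int) -> int:
    """Base pairs between two promoter coordinates.

    Promoter numbering has no position 0, so the span from -55 to +25
    covers 55 + 25 = 80 base pairs.
    """
    if not (upstream < 0 < downstream):
        raise ValueError("expected a negative upstream and a positive downstream coordinate")
    return -upstream + downstream


def contour_length_angstrom(n_bp: int) -> float:
    """Contour length of n_bp base pairs of idealized B-form DNA."""
    return n_bp * RISE_PER_BP_ANGSTROM


wrapped_bp = span_in_bp(-55, 25)
print(wrapped_bp)                                  # 80
print(round(contour_length_angstrom(80), 1))       # 272.0
print(round(contour_length_angstrom(90), 1))       # 306.0
```

Under these assumptions, 80-90 bp of wrapped DNA corresponds to roughly 270-310 Å of contour length, which is the same order as the approximately 300 Å contact area quoted for TFIIF.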

Progression from an established preinitiation complex to phosphodiester bond initiation and promoter clearance requires at least one (probably two) ATP-dependent steps attributed to the presence of DNA helicase and protein kinase activities in TFIIH. Whether TFIIF and TFIIE are sufficient to unwind the DNA completely for initiation or whether the helicase activity of TFIIH is necessary at this step is still under discussion80. The exact requirements of promoter melting seem to depend on the purity of the components of the in vitro transcription systems, DNA topology and the particularities of the respective promoter sequence71,76. Interestingly, transcription initiation from negatively supercoiled DNA templates does not depend on TFIIE and TFIIH, and negative supercoiling partially compensates for some TFIIF mutants compromising DNA wrapping and initiation75,81. This indicates that DNA wrapping by TFIIF is essential for transcription initiation and can be sufficient to support it. The helicase and kinase activities of TFIIH seem to be more important for promoter escape than for promoter melting and initiation76,84.

Formation of the first phosphodiester bonds

TFIIF promotes the formation of the first phosphodiester bond and enhances processivity in the very early stages of pre-mRNA polymerization (< 9 nucleotides). Earlier studies have pointed out that RAP30 on its own is sufficient for preinitiation complex formation, template strand separation and first phosphodiester bond formation16,86 and that RAP74 is only required for early elongation and possibly for promoter clearance by RNA polymerase II86,87. However, recent studies have shown that the early stages of transcription involve both the RAP30 and RAP74 subunits of TFIIF. Mutants of RAP30 with deletions in the C-terminal DNA binding domain do not support formation of the first phosphodiester bond but are effective in C-tailed template run-off transcription81,88. In RAP74, the region between E155 and M177 is dispensable for preinitiation complex formation but enhances accurate initiation. RAP74 mutations in this region primarily affected the formation of the first phosphodiester bond. As mentioned above, untwisting of the DNA helix partially compensates for the RAP74 mutants, indicating that the mutated amino acids are involved in DNA wrapping. Interestingly, the effect on initiation efficiency correlates with the effect on transcript elongation. Thus, DNA wrapping or a similar mechanism may be important for both initiation and elongation73-75.

Once the first phosphodiester bond has been formed, TFIIF suppresses abortive transcription by very early RNA polymerase II elongation intermediates by increasing their processivity. In the absence of TFIIF, transcription is aborted after the first phosphodiester bond, while in the presence of TFIIF, transcripts of 4-10 nucleotides appear before TFIIE, TFIIH and ATP are added. TFIIF seems to prevent release of the nascent transcript before it has grown long enough to remain stably associated with elongating polymerase (Figure 3). Elongation defective RAP30 mutants with deletions at the end of the N-terminal and in the central domain are also defective at enhancing processivity of very early elongation complexes88. RAP74 mutations in the region from E155 to M177 also have similar consequences on initiation (including very early elongation) and elongation73-75. Of all single RAP74 point mutations, E155A, W164A, N172A, I176A and M177A caused the greatest defects in transcription. These mutants had activities very similar to RAP74(2-158), which lacks the entire critical region75. Even the most severely affected mutants entered the preinitiation complex with wild-type affinity but did not support transcription at detectable levels. Comparison with former crosslinking experiments again suggests that the critical residues of RAP74 are involved in DNA wrapping72. As pointed out before, DNA wrapping by TFIIF converts or keeps RNA polymerase II in a synthesis-competent activated conformation, which is as important for initiation of phosphodiester bond formation as for elongation of very short or longer transcripts75.

Additional proof for the idea of a general role of TFIIF in initiation and elongation comes from competition experiments. Wild-type TFIIF can dissociate from and rebind to stalled RNA polymerase II complexes, but TFIIF remains stably bound to elongating complexes. However, mutant TFIIF with RAP74-I176A is readily replaced by wild-type TFIIF even in elongating complexes16,75. Therefore the mutant TFIIF complexes have an elongation incompetent conformation. Arrhenius analysis of the temperature dependence of the elongation rate also showed that TFIIF affects the population of active RNA polymerase II complexes (activity) and not the elongation rate (activation energy), as reflected in the unchanged pattern of pause sites in the presence and absence of TFIIF. Altogether this indicates that TFIIF may regulate the competent-incompetent equilibrium of RNA polymerase II during initiation, elongation and possibly during transcript termination.
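The logic of the Arrhenius argument can be made explicit with a toy model. The sketch below is an illustration, not the analysis used in the cited work: it assumes the observed rate is the product of an active fraction of complexes and a standard Arrhenius term, and all numerical values (prefactor, activation energy, active fractions, temperatures) are hypothetical.

```python
import math

GAS_CONSTANT = 8.314  # J / (mol K)


def observed_rate(temp_k: float, active_fraction: float,
                  prefactor: float = 1.0e12,
                  activation_energy: float = 80_000.0) -> float:
    """Toy model: rate = active_fraction * A * exp(-Ea / (R T)).

    prefactor (A) and activation_energy (Ea, J/mol) are hypothetical values.
    """
    return active_fraction * prefactor * math.exp(
        -activation_energy / (GAS_CONSTANT * temp_k))


def arrhenius_slope(t_low: float, t_high: float, active_fraction: float) -> float:
    """Slope of ln(rate) versus 1/T between two temperatures; equals -Ea/R."""
    y_low = math.log(observed_rate(t_low, active_fraction))
    y_high = math.log(observed_rate(t_high, active_fraction))
    return (y_high - y_low) / (1.0 / t_high - 1.0 / t_low)


# Hypothetical active fractions without and with the factor.
slope_without = arrhenius_slope(290.0, 310.0, active_fraction=0.2)
slope_with = arrhenius_slope(290.0, 310.0, active_fraction=0.8)
rate_ratio = observed_rate(300.0, 0.8) / observed_rate(300.0, 0.2)

print(math.isclose(slope_without, slope_with, rel_tol=1e-9))  # True: same Ea
print(round(rate_ratio, 6))                                   # 4.0
```

In this model a change in the active fraction shifts the Arrhenius plot vertically without changing its slope, which is the signature described in the text: higher overall activity with an unchanged activation energy.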

Promoter clearance

After incorporation of the first 4-10 nucleotides into the RNA transcript, RNA polymerase must escape the strong protein-protein and protein-DNA contacts of the initiation complex. Promoter clearance is driven by ATP-dependent DNA helicases and protein kinases which act on the preinitiation complex and promoter DNA to release an elongation competent RNA polymerase II complex71,76,89. All general transcription initiation factors (GTFs) but TFIIF are lost along promoter escape in an ordered fashion. TFIIB is released during formation of the first 10 phosphodiester bonds while TFIID and TFIIA remain bound to the promoter16. This residual promoter complex is recycled for the next round of transcription from this promoter (Figure 1), but before polymerase II leaves the promoter, RAP74 is phosphorylated by ATP-dependent kinase activities contained in TAFII250 and TFIIH.

TAFII250 consists of N- and C-terminal serine kinase domains that are connected by a central binding site for the N-terminal domain of RAP7468 (amino acids 2-139, Figure 3). The C-terminal kinase domain requires the central RAP74 binding site of TAFII250 for activity while the N-terminal kinase can phosphorylate RAP74 autonomously, but efficient phosphorylation requires both kinase domains32,90. The RAP74 phosphorylation sites are within amino acids 206-256. There are five serine residues and no threonine or tyrosine residues in this region. Four serines are clustered in 217-224, which are evolutionarily conserved. This site is the most probable substrate for TAFII250. TAFII250-dependent phosphorylation of TFIIF may be relevant in general transcription activation or connected to cell cycle control91,92. TAFII250 is encoded by CCG1, a gene that overcomes a G1 cell cycle arrest in temperature-sensitive cell lines. Therefore the role of TAFII250 in transcription activation may reflect its cell cycle control function. RNA polymerase II activity is low in S/G2 phase and high in early G1 phase. TFIID is the rate limiting component accounting for this difference. TFIID preparations from S/G2 phase are more efficient in phosphorylating TFIIF than preparations from early G1 phase. TFIIF transcription activity is upregulated upon cell cycle dependent phosphorylation, but the role of TAFII250-dependent phosphorylation of RAP74 in cell cycle regulation is not absolutely clear92. Phosphorylation of other components of the transcription machinery may be important. For example, TAFII250 can phosphorylate TFIIA-γ and TFIIE-β at a rate tenfold reduced with respect to RAP7491.

There is a second RAP74-kinase activity contained in the preinitiation complex: CDK7, a subunit of "holo TFIIH" (Table 1). CDK7 is part of the tripartite CAK complex (CDK7-cyclin H-MAT1) that can dissociate from "holo TFIIH". However, RAP74 is only phosphorylated by TFIIH and not by free CAK93-94. "Core TFIIH" is involved in substrate binding for CAK, which occurs via TFIIE, which bridges the ERCC-3 subunit of "core TFIIH" and TFIIF71.

Originally, the CDK7 activity of TFIIH was not described in the context of TFIIF regulation but in the context of promoter clearance. CDK7 phosphorylates the CTD of initiation competent RNA polymerase IIA, converting it to elongation competent RNA polymerase IIO71,76,77,79,95. Additional CTD-kinase activities are contained in the mediator complex of the RNA polymerase II holoenzyme3-5. CTD-phosphorylation occurs during initiation of phosphodiester bond formation and promoter escape15. Whether phosphorylation has a functional role in these steps or is a mere consequence of increased processivity of the kinases is not really known, although CTD phosphorylation interferes with the CTD-TBP/TFIID interaction in vitro19. The activity of CDK7 is balanced by the CTD-phosphatase FCP1, which is downregulated in the functional preinitiation complex18,59. TFIIH dissociates from RNA polymerase II after initiation. It is not clear whether FCP1 remains associated with elongation competent RNA polymerase II and whether it can arrest the complex by delaying phosphorylation of the CTD. But there are some indications that CTD-phosphatases may function as negative effectors in elongation under control of polymerase-bound TFIIF74. The CTD of pausing RNA polymerase on Drosophila heat shock genes is underphosphorylated and transcription of the HIV-1 LTR depends on cellular CTD-kinases (see below)18.

After CTD-phosphorylation the mediator complex and TFIIE dissociate from the initiation complex during the formation of the first 10 phosphodiester bonds71,97. This has important mechanistic implications as TFIIE was found to stimulate TFIIH CTD-kinase activity but to inhibit TFIIH helicase activity, which drives promoter escape94. This helicase activity is essential for transcription in yeast. Therefore, TFIIE release is an important checkpoint between transcription initiation and promoter release71.

Before TFIIH helicase derepression, the initiation complex is only able to cycle short abortive transcripts. Afterwards, the TFIIH-associated helicase ERCC3 extends the melted region of the DNA at the transcription start site in the direction of transcription. The reaction depends on hydrolysis of ATP, which provides the energy necessary for DNA strand separation and preinitiation complex disassembly. When the RNA transcript reaches a critical length (10-60 nucleotides)16, the RNA polymerase elongation complex must be separated from the residual promoter complex. At this stage, TFIIF, which is engaged in DNA wrapping and strand separation in initiation, seems to hinder promoter clearance although it also assists TFIIH to overcome this impediment. RAP30 mutants with deletions in the C-terminal DNA binding domain strongly stimulate elongation by RNA polymerase II but do not support transcription initiation88. Surprisingly, these deletion mutants are more effective in supporting promoter escape than wild-type RAP30 in the absence of TFIIH. In combination with both wild-type and mutant RAP30, TFIIH stimulates promoter clearance, but stimulation is much greater with the wild-type factor, indicating that the DNA binding C-terminal domain of RAP30 cooperates with TFIIH upon promoter clearance. After incorporation of approximately 60 nucleotides into the RNA transcript, TFIIH is the last general transcription factor to leave the productive RNA polymerase elongation complex (Figure 2D).

Figure 2: Structures of the RNA polymerase II transcription machinery.

A: TFIID components: dTAFII42/62 histone fold tetramer, hTAFII18/hTAFII28 histone fold dimer, TBP/TATA-box complex. B: General transcription factors: RAP30 C-terminal domain, TBP/TFIIA/TATA-box complex, TBP/TFIIB/TATA-box complex39, TFIIB N-terminal domain. C: Image reconstruction of the TFIID-TFIIA-TFIIB complex. The position of TBP is marked and the hypothetical DNA position is indicated. D: RNA polymerase II transcription complex. Side view of the envelope of the yeast enzyme in combination with the DNA and RNA paths derived from electron microscopy studies and the hypothetical position of the TFIIF subunits RAP30 and RAP74 in the initiation and elongation complex. E: RNA polymerase II backbone of the 10 subunit yeast enzyme.

1.2.4 Transcript elongation by RNA polymerase II

Elongation is far from being a monotonous series of phosphodiester bond synthesis reactions. RNA polymerase proceeds in saltatory movements like an inchworm. RNA polymerase II has two RNA binding sites. As the RNA chain grows, the first site is filled but there is no forward motion because RNA polymerase II is kept in place by the second RNA binding site. When the first site is jammed with 5-10 nucleotides, the polymerase assumes a strained conformation which is relaxed when the second binding site releases the growing RNA and the polymerase jumps forward. As a result of this discontinuous movement and local DNA, RNA and chromatin structure, large variation is seen in polymerase dwell times at different positions48,98. There are actual elongation blocks referred to as pausing and arrest sites that are involved in transcriptional regulation and termination. However, dwell time is not only modulated by nucleic acid structure but also by auxiliary transcription factors, availability of nucleotide substrates, and perhaps cotranscriptional processing of the primary transcript. Further, an equilibrium between elongation competent and incompetent conformations of RNA polymerase II exists. After each nucleotide addition, the polymerase either remains in an active conformation and continues RNA synthesis or the enzyme assumes an elongation incompetent conformation, which leads to longer dwell times that are known to favor pausing, arrest or termination99 (Figure 1). As mentioned above, TFIIF seems to influence this equilibrium through a mechanism related to DNA wrapping, which is apparently relevant for initiation and promoter escape but also for productive elongation.

The position of TFIIF in the initiation complex with respect to the RNA polymerase II active site and the transcribed DNA has been derived from DNA crosslinking results89,100,101 and the DNA trajectory on the polymerase54. The path of DNA and RNA through the wrinkled surface of RNA polymerase II was recently redefined by electron crystallography with an arrested elongation complex (Figure 2D, E). The striking result was that the DNA did not enter the active site at the center of the molecule as a double helix through a 25 Å channel as proposed before but as a single strand along a 14 Å DNA groove on the surface of the molecule. The two paws that hold the upstream DNA in place are provided by the RNA polymerase II subunits RPB1, RPB9 and RPB546. The RNA transcript is ejected through a 50 Å long RNA groove that can be closed to a narrow channel of 12-15 Å in diameter by a hinged domain referred to as RNA clamp45. The clamp is composed of the RPB1 N-terminal region, the RPB2 C-terminal region and RPB646. TFIIF is located close to this hinged domain, which may play a role in processivity control of RNA transcription and release of the terminated transcript. TFIIF also controls the exit point of the DNA from the active site54, which is important for DNA wrapping, i.e. enhancing polymerization rates and pausing site read through.

RNA polymerase II dwelling at pausing sites is transiently in a non-processive state which it may overcome autonomously99, but there is a long list of stimulatory factors that help pausing site read through including Elongin, ELL, ELL2, CSB, TAT-SF1 and last but not least TFIIF78. Other general transcription factors like TFIIE and TFIIH were proposed to be such "processivity factors", but TFIIF is the only one that interacts with RNA polymerase IIA and IIO16,71,83. TFIIF remains bound to RNA polymerase II (IIO) during elongation. TFIIF enhances the ability of RNA polymerase II to use limiting levels of NTPs, which accelerates polymerization rates. This reduces the probability that RNA polymerase II will be locked in an elongation incompetent form at a pausing site88. At pausing sites where RNA polymerase II adopts an elongation incompetent, i.e. stalled, conformation, TFIIF can be released from and rebound to the polymerase, but in the elongation competent form the tight wrap of DNA also holds TFIIF in place75. RAP30 is more tightly bound to the elongation complex than RAP7486. Recent results also indicate that RAP74 can be replaced by the TAT-SF1 elongation coactivator in elongating RNA polymerase II104. Therefore, TFIIF is not only involved in basal level transcript elongation but also in the regulation of elongation rates.

Long dwell times at pausing sites favor RNA polymerase II arrest. Arrested RNA polymerase II is in a permanently non-processive state which may result in premature transcript termination unless auxiliary activators like TFIIS or P-TEFb release the enzyme"8 99. It seems to be a common theme of elongation regulation that early polymerase elongation complexes are arrested at promoter proximal sites by negative regulators (DSIF, factor 2) and that read through depends on (gene specific) releasing factors10. TFIIF has no direct effect on arrested RNA polymerase II, but one of the numerous mechanisms of elongation activation by HIV-1 TAT involves TFIIF's DNA wrapping or FCP1 regulatory functions. Since this mechanism also engages cellular components like the elongation activator P-TEFb, the role of TFIIF in reinitiating transcription from arrested RNA polymerase II elongation complexes may be more general105.

In this context, the regulation of TFIIF activity becomes important. So far, only the effects of trans- and autophosphorylation have been investigated106, but TFIIF is also modified by acetyltransferases107 and poly(ADP-ribose) polymerase108. Phosphorylation of RAP74 during initiation by TAFII250 and TFIIH has been discussed aboveA2909-. The functional implications of these posttranslational modifications are discussed below.

Phosphorylation of both TFIIF subunits was demonstrated by in vitro and in vivo experiments49,109,110. However, there is not much data about the effect of RAP30 phosphorylation, and in some in vivo studies phosphorylation of RAP30 could not be shown at all91. In MELC (murine erythroleukemia cells), phosphorylated RAP30 was found which was dephosphorylated upon differentiation, suggesting that RAP30 activity is regulated by reversible phosphorylation41". However, it is mainly the reversible phosphorylation of RAP74 that regulates TFIIF transcription activity91. In a system with no RAP30 phosphorylation, alkaline phosphatase-treated TFIIF showed reduced activity for in vitro transcription in both the free round and single round assays. Stimulation of elongation of mRNA synthesis was also reduced by the phosphatase treatment. All effects can be attributed to changes in the RAP74 subunit, which may also interact more efficiently with RAP30 in its phosphorylated form91.

There is evidence for three transphosphorylation sites within the central domain of RAP74 at positions 207-230, 271-283 and 335-344 92,106. Each sequence has a casein kinase II consensus site (SXXE) which is phosphorylated using HeLa whole cell extracts, while TAFII250 phosphorylates only the first site92. All phosphorylation sites are outside the N-terminal domain of RAP74 (amino acids 2-217) that is necessary and sufficient for single round accurate transcription. Consistently, reversible phosphorylation has no effect on initiation but affects elongation and possibly multiple round transcription. A mechanistic explanation may be that reversibly phosphorylated residues in the central domain of RAP74 affect the accessibility of the RAP74 C-terminal domain to its interaction partners18-'18. These are RNA polymerase II, DNA and the CTD phosphatase FCP1. Thus, reversible transphosphorylation might regulate elongation efficiency, pausing site read through and transcript termination by altering TFIIF activity in DNA wrapping, CTD phosphatase activation or RNA polymerase II binding.
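The consensus site mentioned above, read here literally as a serine followed by any two residues and a glutamate (SXXE), can be located in a protein sequence with a short pattern search. The following is only a minimal sketch: the function name and the demo sequence are hypothetical, and the demo string is not the RAP74 sequence.

```python
import re


def find_sxxe_sites(sequence):
    """Return 1-based positions of the serine in every S-X-X-E match.

    A zero-width lookahead is used so that overlapping motifs are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=S..E)", sequence.upper())]


# Hypothetical demo sequence, for illustration only (not RAP74).
demo_sequence = "MASDEEGSSAEK"
print(find_sxxe_sites(demo_sequence))
```

Positions reported this way would still have to be checked against the experimentally mapped regions (207-230, 271-283 and 335-344), since a sequence match alone does not show that a site is used.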

Recently it was shown that RAP74 itself possesses a serine-threonine kinase activity in its C-terminal domain, leading to autophosphorylation of S385 and T389 10". Mutation analysis suggests that autophosphorylation of both sites downregulates elongation rates but does not affect pausing site read through or initiation, which both require the N-terminal domain of RAP74. Again, DNA binding and wrapping, RNA polymerase interaction or FCP1 regulation by the RAP74 C-terminal domain may be affected. Autophosphorylation may prepare termination by extending dwelling times between the individual polymerization steps11* or by stimulating the CTD phosphatase FCP1 18.

Termination by RNA polymerase II is poorly understood. It depends on reduced elongation rates (see above), pausing, RNA termination signals and RNA cleavage factors. Some of these termination factors, the "cleavage and polyadenylation stimulating factors" (CPSF), are associated with the RNA polymerase II holoenzyme79 (Figure 1). Once RNA polymerase II is released from its DNA template, it has to be recycled into an initiation competent form, which involves dephosphorylation of its CTD repeats by FCP1. Whether CTD dephosphorylation is responsible for termination or just occurs upon recycling of RNA polymerase II for multiple round transcription is not clear. Interestingly, multiple round transcription depends on the C-terminal domain of RAP74, which happens to activate the CTD phosphatase FCP1 74. Recycling involves reassembly of the RNA polymerase II holoenzyme or stepwise reconstruction of the preinitiation complex as described above. The cycle of RNA polymerase II in basal transcription is thus completed.

1.2.5 TFIIF in activated transcription

TFIIF is not only involved in basal level transcription but is an integral part of gene regulation pathways involving a multitude of transcriptional activators and repressors such as Serum Response Factor (SRF)112,113 or Estrogen Receptor (ER)114. Interactions with prominent proto-oncogene products such as jun/fos and related bZIP family members like JunB, ATF2, ATF4, C/EBP and CREB115,116, as well as with c-Myc117 and Retinoblastoma protein (Rb)33, have been reported. The following paragraphs give a short overview of the factors and mechanisms involved and their connections to the basal transcription cycle of RNA polymerase II.

Serum Response Factor (SRF)

TFIIF is required for SRF-dependent activation of transcription, which is essential for the induction of the c-fos proto-oncogene113. Serum Response Factor, SRF, belongs to the MADS-box binding protein superfamily. It is a dimeric transcription factor which binds to the serum response element in the c-fos promoter1 ltS. The C-terminal domain of RAP74 has a weak binding affinity for the DNA-binding domain of SRF, which is not necessary for SRF-dependent activation. The hypercharged central domain of RAP74 is, however, necessary for this activation and binds to the activation domain of SRF112. This interaction may help to recruit RNA polymerase II to the preinitiation complex or modulate the regulatory activity of the central domain of RAP74 on transcription initiation and elongation.

Retinoblastoma protein (Rb)

Retinoblastoma protein, Rb, is a tumor suppressor whose inactivation has been implicated in a variety of sporadic and familial cancers. Rb has been shown to stimulate or repress the activity of specific promoters in a cell cycle dependent manner. One mechanism of transcriptional regulation by Rb involves direct interactions with the N-terminal kinase domain of TAFII250. Rb protein inhibits autophosphorylation of TAFII250 and transphosphorylation of RAP74, and possibly other targets. Two tumor-associated mutants of Rb still bind the amino terminus of TAFII250 but lack the ability to inhibit its RAP74 kinase activity. Since phosphorylation of RAP74 stimulates transcript elongation01-'*, Rb-regulated genes are overexpressed when the Rb-dependent control over RAP74 phosphorylation by TAFII250 is interrupted33.

c-Myc

The product of the proto-oncogene c-myc is a nuclear phosphoprotein involved in cellular growth, differentiation and programmed cell death. It belongs to the basic Helix-Loop-Helix superfamily but contains an additional leucine zipper dimerization domain which is responsible for heterodimerization with another transcriptional regulator called max. The N-terminal transactivation domain of c-myc (amino acids 1-143) binds RAP74. The RAP74 interaction may be involved in recruitment of RNA polymerase II to the preinitiation complex. The transactivation domain of c-Myc also binds to the Rb protein mentioned above, which triggers phosphorylation of RAP74 by the TAFII250 kinase. It is tempting to think that Rb protein can recruit TAFII250 kinase to RAP74 when it is bound to the c-myc transactivation domain117.

Basic leucine zipper (bZIP) regulators

Several transcriptional regulator proteins that are members of the basic leucine zipper (bZIP) superfamily can interact with TFIIF subunits in vitro. It is not clear whether any of these interactions are essential for the function of these regulators, but it has been suggested that interactions of the bZIP transcriptional regulators with the TFIIF subunits could facilitate loading of RNA polymerase II onto the preinitiation complex through direct interactions with the general transcription factors. The proto-oncogene products jun and fos bind to TFIIE-34, RAP74 and RAP30, which show stable association with jun/fos heterodimers in presence and absence of AP-1 site DNA113. Similar interactions were shown for fos and jun homodimers. Other bZIP transcriptional regulators were assayed for RAP30 and RAP74 binding. JunB, ATF2, ATF4, C/EBP and CREB homodimers showed interaction with RAP30116, whereas RAP74 bound homodimers of Jun and JunB, and weakly ATF^11"1. These interactions are not general properties of all bZIP family members but rather appear to be specific for proteins within the immediate Fos and Jun families. Stable association with RAP30 or RAP74 requires just the core domains of jun and fos comprising the leucine zipper and DNA-binding regions. Even one of the basic domains can be removed11''.

Estrogen receptor (ER)

The estrogen receptor, ER, is a ligand dependent transcription regulatory factor that belongs to the nuclear receptor (NR) superfamily. Several co-activators that interact with ER in a hormone dependent fashion have been identified. One of them is TAFII30, which may be responsible for propagation of the hormone signals through ER to the basal transcription machinery. But the estrogen receptor also influences preinitiation complex formation directly and promotes incorporation of TFIIB, TFIIF and TFIIE, which leads to a 3-4 fold stimulation of transcription upon ER addition114.

AP-2 regulator

Ras oncogene activation or knock-out results in deregulation of the transcription factor AP-2, which is the critical mechanism of ras oncogenesis. AP-2 overexpression quenches AP-2 dependent transcription activation, which is indicative of coactivators being involved. AP-2 interacts with PC4, poly(ADP-ribose) polymerase (PARP) and RAP74. PARP binds to the C-terminus of AP-2 and acts as a coactivator, while the PC4 coactivator interacts with the AP-2 N-terminus. RAP74 associates with AP-2 via the central region that contains a portion of its DNA-binding domain (amino acids 166-277). Although RAP74 has been ruled out as a coactivator of AP-2 transcriptional activation, the interaction may be significant because PARP (formerly TFIIC) may participate in regulation of RNA polymerase II transcription through ADP-ribosylation of the TFIIF subunits119.

TFIIF in activated elongation

As described above, TFIIF has a role not only in transcription initiation but also in elongation and elongation control. RAP30 binds to two coactivators of HIV-1 TAT protein activated elongation: SPT5 and TAT-SF1. It was shown that a significant fraction of cellular TAT-SF1 is bound to RAP30 such that RAP74 is dissociated from RAP30 104. However, not only TAT-SF1 protein but also RAP74 phosphorylation is absolutely required for TAT activated elongation105. These contradictory results suggest that the displacement of RAP74 in the first study was caused by the antibodies used and not by TAT-SF1. Since TAT-SF1 binds to the SPT5 protein as well, the mutual interaction of RAP30, TAT-SF1 and SPT5 could stabilize the complex on RNA polymerase II104.

Originally, SPT5 was isolated as a component of DSIF78. This negative elongation factor arrests RNA polymerase II shortly after the transcription start. To resume elongation, the positive elongation factor P-TEFb is required. It contains a CTD kinase activity (CDK9) that signals the release of RNA polymerase II from the DSIF arrest site. In addition, SPT5 and P-TEFb are essential for HIV-1 TAT activated elongation of longer transcripts. The TAT protein affects transcription at several stages. During initiation, it stimulates CTD phosphorylation by TFIIH as well as phosphorylation of RAP74 by TAFII250 1*. It also binds weakly to the FCP1 CTD phosphatase and may interrupt the TFIIF dependent stimulation of the enzyme and thus stimulate promoter escape18. During elongation, HIV-1 TAT protein binds to the transactivation response element hairpin (TAR) on the growing mRNA. The viral activator recruits P-TEFb, which hyperphosphorylates RNA polymerase II at the CTD and thus allows elongation to proceed7^.

The cellular coactivators SPT5 and TAT-SF1, which are bound to RAP30 and RNA polymerase II, are absolutely required for TAR read through. SPT5 is known to stimulate elongation once promoter proximal arrest is overcome, but the physiological role of TAT-SF1 is not known104. In viral transcription, TAT-SF1 tightly associates with TAT protein and also binds to P-TEFb. These interactions would place TAT-SF1 in the appropriate location to play a role in the activation of elongation not only by viral TAT protein but also under physiological conditions104. Two coactivators of transcript elongation are bound to RAP30. This leads to the assumption that at least part of the stimulatory effect is caused by TFIIF. Indeed, it was found that phosphorylation of RAP74 correlates with TAT-activated transcription. TAT has been reported to stimulate the TAFII250 and P-TEFb kinases. Whether one of the two kinases phosphorylates RAP74 in this context is not known. Phosphorylation of TFIIF stimulates its elongation activation function, which helps to jump-start stalled elongation complexes whose CTDs were hyperphosphorylated by P-TEFb105.

1.3 Domain structure of human TFIIF

Along the description of the transcription cycle of RNA polymerase II, it has often been pointed out that the various functions of TFIIF in initiation and elongation are carried out by distinct functional domains of the TFIIF subunits. The functions of both RAP30 and RAP74 are clustered in three domains, referred to as the "N-terminal", "central" and "C-terminal" domains of RAP30 and RAP74, respectively (Figure 3). The domains and their functions mentioned in the previous paragraphs are summarized below.

1.3.1 Domain structure of RAP74

The functional N-terminal domain of RAP74 consists of the first 217 amino acids. Just RAP74(2-139)68 is sufficient to bind TAFII250, while RAP74(2-154) provides a minimal RAP30 binding domain. RAP74(2-172) is sufficient for incorporation of RNA polymerase II into a minimally functional preinitiation complex and supports minimal levels of transcription-'18-73-74-112. Mutagenic analyses showed that the region of RAP74 from T154 to M177 is particularly important both for stimulating the rate of elongation by RNA polymerase II and for supporting productive initiation"^. This involves tightening of a DNA loop around RNA polymerase II, referred to as "DNA wrapping", which requires RAP74(2-205). TFIIF containing RAP74(2-205) contacts this loop along its entire length of 80-90 bp, which led to the idea that two copies of TFIIF associated into an α2β2-heterotetramer were needed for DNA wrapping and that tetramerization depends on sequences within RAP74(172-205)14,72. However, for full single round, accurate transcription activity, RAP74(2-217) is necessary, which includes additional sequences for first phosphodiester bond formation, processivity and elongation'3"75. The multiple sequence alignment of RAP74 homologues shows two well conserved domains at the N- and C-terminus connected by a variable linker with many deletion and insertion sequences encompassing amino acids 206-362 109-1 i'U-''-i-\. This region has an exceptional accumulation of charged residues that occurs in less than 4% of all eukaryotic proteins111 and is also readily digested by various proteases124 (Section 3.2). These observations suggest that the central linker domain of RAP74 is flexible and solvent exposed. It is the target of various protein kinases, interacts with the SRF activation domain and contributes to multiple round transcription74,106,112. This indicates that the central domain of RAP74 has regulatory functions during initiation and elongation in that it limits the accessibility of the RAP74 C-terminal domain to its binding partners DNA, RNA polymerase II and the CTD phosphatase FCP1 18. This effect has been referred to as masking and is compensated for by N-terminal sequences of RAP74 (amino acids 74-8^)18. The C-terminal domain of RAP74 also binds to TFIIB52, which counteracts TFIIF dependent stimulation of FCP1. In combination with autophosphorylation10'1, this may be essential for RNA polymerase II regulation throughout the transcription cycle, especially for promoter escape and for polymerase recycling in multiple round transcription. The SRF DNA-binding domain also interacts with the RAP74 C-terminal domain and may contribute to TFIIF regulation112.
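The nested N-terminal truncations of RAP74 described above lend themselves to a simple interval lookup. The sketch below is illustrative only: the residue ranges are the ones quoted in this section, while the dictionary layout, the region labels and the function name are assumptions made for the example.

```python
# Residue ranges as quoted in the text; labels and layout are illustrative.
RAP74_REGIONS = {
    "RAP30 binding": (2, 154),
    "minimal preinitiation complex activity": (2, 172),
    "DNA wrapping": (2, 205),
    "full single round transcription": (2, 217),
    "central linker": (206, 362),
}


def regions_containing(residue, regions=RAP74_REGIONS):
    """Return the sorted names of all regions whose range includes the residue."""
    return sorted(
        name
        for name, (start, end) in regions.items()
        if start <= residue <= end
    )


# Residue 160 lies beyond the minimal RAP30 binding region but inside
# the longer functional constructs.
print(regions_containing(160))
```

Because the constructs are nested, a residue near the N-terminus belongs to several regions at once, which is why the lookup returns a list rather than a single label.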

1.3.2 Domain structure of RAP30

Multiple sequence alignment of RAP30 homologues shows that the three domains are not as well defined as in the RAP74 alignment1"2 (Section 0). RAP30 is also divided into N- and C-terminal domains separated by a proteolytically susceptible region of approximately 30 amino acids (124-157) (Section 3.2, Figure 1). This apparently flexible central region is necessary for the elongation activity of RAP30 and shows sequence homology to σ factor region 2, which binds to bacterial core polymerase'118^. The central domain is in close contact with RNA polymerase II, which is shown by protection of RAP30 casein kinase II sites in the TFIIF/RNA polymerase II complex. But there is no evidence that the central region of RAP30 is a functional RNA polymerase II binding site1,0. The functional N-terminal domain of RAP30 has not been well defined. Amino acids 2-98 are sufficient to bind RAP74, while the extension to amino acid 152 is required for TFIIB binding'12. The N-terminal domain contains sequences necessary for transcription initiation and elongation (amino acids 1^-45), which may be responsible for specific functional interactions with other components of the transcription machinery or for proper assembly of TFIIF and the preinitiation complex88. The C-terminal domain of RAP30 (amino acids 160-249) is necessary for transcription initiation and binds nonspecifically to DNA02. The solution structure of this domain shows a winged helix-turn-helix DNA binding motif'11 (Figure 2B).

Figure 3: Domain structure of human TFIIF.

Functional domains of RAP30 and RAP74 were assigned according to literature as cited in the text. Proteolysis results are taken from sections 3.2 and 3.3 of this study. Full length human TFIIF was digested with elastase and chymotrypsin. Truncated TFIIF containing RAP74(2-202) and RAP30(2-119) was digested with aminopeptidase I (Aspergillus griseus) and mixtures of carboxypeptidase A, carboxypeptidase P, carboxypeptidase Y and carboxypeptidase B. Proteolytic fragments from several time points of the digestion reactions were fractionated with reversed phase HPLC and analyzed by N-terminal sequencing and MALDI-TOF mass spectrometry. Vertical arrows indicate endoproteolytic cutting sites. Diagonal arrows indicate exoproteolytic cutting sites. The large arrows denote cleavage sites observed within 10 minutes and small arrows denote cleavage sites observed within 3 hours to 4 days.

[Figure 3 image not reproduced. Panels: RAP30 (249 amino acids) and RAP74 (517 amino acids), each annotated with interaction partners, binding regions and proteolytic cleavage sites.]

1.4 Outline of the thesis project and its goals

The role of TFIIF along the RNA polymerase II transcription cycle has been discussed in great detail. The various functional regions of TFIIF are clustered in specific regions of the amino acid sequence of its RAP74 and RAP30 subunits, which led to the concept of functional domains (Figure 3). In addition, it was pointed out that the understanding of the mechanisms of basal and activated transcription depends on determining atomic structures of all components and their complexes involved (Figure 2). Therefore, the goal of this study was to define the structural domains of TFIIF, relate these to function and solve a high resolution X-ray crystal structure of the RAP30/74 complex. This approach was designed to answer the questions whether RAP30 and RAP74 form a tight complex or are only loosely attached and what the subunit stoichiometry of TFIIF is. Discussion of the results from previous biochemical analyses in the light of the protein structure should lead to a more detailed understanding of the mechanisms by which TFIIF organizes RNA polymerase II transcription (Figure 1) and provide the basis for specific questions about TFIIF structure-function relationships.

Initial crystallization trials with full length TFIIF were not successful. Therefore, structure determination had to proceed in two steps: first, definition of the RAP30 and RAP74 domain boundaries; second, crystallization and structure determination of these domains and their complexes.

Limited proteolysis has been the method of choice for elucidation of protein domain structure and structure-function relationships1. Chapter 3 of this thesis describes how recombinant TFIIF was treated with endo- and exoproteases (Figure 6). The breakdown products were identified by N-terminal sequencing and HPLC fractionation followed by mass spectrometry. An alignment between these protease resistant fragments and the previously defined functional domains of the TFIIF subunits showed close correspondence (Figure 3). This supported the idea that distinct structural domains of RAP74 and RAP30 have specific functions. Chapter 3 contains a detailed description of how the limited proteolysis results were used to design RAP30 and RAP74 constructs for crystallization screening with RAP30/RAP74 complexes. Refolding and purification of the RAP30/RAP74 interaction domains are documented in the Materials and Methods (Chapter 2). Crystallization screening and data collection with native and derivative RAP30/74 crystals are the subject of the following chapters 4 and 5. Subsequently, phase determination and improvement leading to structure determination of the RAP30/74 heterodimer at 1.7 Å resolution are presented, followed by a discussion of the structure and its functional implications (Chapter 6). In the concluding chapter 7, the conclusions and future perspectives of the structural work with TFIIF subunits are summarized.

2 Materials and Methods

2.1 Materials and Apparatus

2.1.1 Sources of general chemicals

Chemical Manufacturer Number

1 10 Phemnthiolim lluk-i "00

1 10 Pi imiiiodtoint i luk i SSO

1 12 Di iminododeeiiie I luki _)/0

1 12 Uodn uiednmiiie 1 luki >°970

i 12 l)otkc inok i id : luki 14 ÜM)

I t Dilhi thuiul Dil) 1 luk i l S 19

s Dnmini „ iiKthslpnitini : luk 1 iM20 8 Di iniinopeiuin Juki 1 ( HtXIIltdlol _|Iluki_ J.S291 1 7 ' )90 QnmmelupPin 4 luT> 1 8 Di iminooiPuit J luki '00 1 8 Oi iminoottun riuki 10»7T

"" nuimdiimnie OctiinethtleiK ill mini Tkiki Ï 200

A _ 1_9 Pi.immojsRiji mi lliki 2^190 1 9 \ontm di icid luki Tl __ _ 170_ \ 1 2 Dukoxy ulunniii rihi snh its (dd VIPi Ph uni lei 2/ t819 01 S 2 i Dikoxytytosuu diphosphate Ph u m it i 21 18 M 01 ddCTP] _ __

i 2_J_ Di(ko\yi umosiiiL nphosph itt kklGlP Pint mi i 2" 4817 01 8 2 ] Pitkoxy thymidine diphosphate ddl IP Phumi i 2/ 4S^1 01 i * l\ox\ lduiesnie mphosph tu (d Ml Ph uni i i 2J.1SS0 01,

2 PXoxy yüttmt s diphosphate (ilCtl I h it nn i 2 1860 _ 0]_ ^ 2 Dioxynnnesint tnphosph »u tlGIP Ph u tniti 18 0 02 _ _ 2 Deoxythymiilnii s lnphosphiti dTIP Plmnnci 7 1880 0) _ 2 Meici] toelhinol bnxo I luki n(9o 1 Methyl ^4 pinl-inidiol i,MPD) Huki _22 l2 7 Moiph bndtlhmt ullonieaud MES ] luk i _ (9_8)0 4 (2 tlyditxstth)!) pipn i/ms 1 Oh em ulitnieieid I1EPES I luki _s4ls9

n _6_Ammoi inioi _Tcid J luk 1 7b0 8 Ilydioxvqumohnt riuk-i --SQ70

Acilomtul 1 1 hei Sel lltltlt \/062"'17

Airvlinmk Huki l 1699

Adenosine diphosphate B Khun ei M-innhe S]QC)7T

Adenosine liiphosplnle ( \1P1 I luki i 20f 0 Adipic leid luki ->P0

\m niml it d Hilt 1 Is 1 / 0

A uosi St ipiq r\tc "iCHCL

Ambeilit MB BDIÏ oOO S

<\.mnioiii i solution MeKk 18(0

Ammonium teil ik 1 luk 1 t ;<")()

Amnionium e hi Mide luki

Ammonium tithe dip en phii[ hits luki )9 si

Ammoniuin nilntc luki 0188 1

Ammonium suit lit Tluki 09)81

\mpieiliin ( h mu Bum ihwi ( 18010

Intl Inptetie Dil )l s )/ J

Bitnmi ehlond 1 luki S W0

But/amidm 1207 \ _ ^ JTJuki

Bin- h\diox>tlh\lhmino lus hydioxt methyl) roetlnnt (Bislns ] luk 1 14880

Bon i id 1 luki

_ Jsr<2L

Bos m stium ilbumin iBS \ Bpthnn i 2'8C1 Biomoph nol blue I luki ^ IS0^0_ C it xly he leitl sodium s Ut 108 tO ^ Juki

( idiimini mtntt Ithihedi it luk i 09 11

<- C. lesuim thlondt Ink i "09 0 L lltiuni itel it J luk_l _10v

1 i Chloi impheinitl I luk 1 8 34

Chloiokim F luki »8(90

Oint icid UK nohydntt l luki /488

(obillou mti id Htxihydnte Fluki ,0833

CoomiK.ii bnllnntblut K ""sO 1 luki i"818

Cupiit mti ik liihydi ite 1 riuki (119/ 1

tupnt tillit Ptntihvdnte 1 1 luki 61/10

L\0 Biotine Mu k 21111

FK+) Cilu o monohydiit M i k 8^t

D( t- Siithii i t (sun is ) lluki 8 1100

s CX-t-l Suei 1 ink i 8 409"

D(0 litlnh dihedi id 1 luki 10'10

Dtoxuibt mill i itidlimlxh 1 imbliDW MBII tin nn SPOO 1 DeXtllll suit it S15111 181 08

Oi immoniumhetiio n till it 1 luki 09833

Dichloitliiluoiniith m Put 1 is \C

Pusopiopy iilu i ph spin! PT1 1 riuki S 0

DinmthvKulphiMl UMSO t luki Ht K) Di pot issium hvdi n ph «ihn mi lit ,111kl a '14

Di Pol is siuni fuit il 1 luk 1 co-, VI

llodetvl imm ' llkl 141/0

Dow Coimn tin united ill on 1c- 10000cS BI11I C302f^iC

Dowix s()\8 I luk 1 41143

rthmol in IlPI C 1 luki 018-^8

Dili inohmin t lut, i 01100

1 (hidiuinbi mi i Si i i I 87ll

1 thykiie lyct I riuk 1 0)780 , 1 Fthy k nedi mime t tiiittdt mtldi dnini s ill dihedi He IPi \ 1 luki 03679

I liymmmonuckotitk inliumsilt TAIN B him» 1 M umlvim 15 418 Cl \b 4 1 oiminudt I luki 1 /c"0

Poimit ici 1 hum tit Si 1111 ' -1O2

1 (dutii ild he 1 ^0co m vi iter riuk i )l

Cilvtsiol inhvtliou ] luki 49" 0

Glynn s i ))i

CiKey) htm M t k P)

lycel Ivtin luk i s3 Cilytyl t 1 MO , | Ouuudm hydiothl utk (Ci InlIC 1 lluki 10)1(0

Ikxnniint eobilt(IIt) ehloiid M lu h )M))

IlexideiuiM leid (Pilmitmit lutP 1 luki (PO

Ilydiochloii leid (H( 1 h luki 9 b

Imidizok 1 luki .83|

liondin thltude Meick 3)1

1 Isopicptn | Nth lit i b thi Isopiopsl P iheti id (IPKt) | bi h m n P80 L Minuit I luki 08130

L \i nulle I luk 1031

I \spui m riuki 11130

i \spiilit 111 1 luki 1)817

1 ( y tin JJllk 1 0 _ qo__

I Glut mm H itl I lllk 1 I s 4430

I Cilutinim 1 luki Uli 1 1 lllstltlllle 1 luki s '11 J L Isole ui me 1 luki SSSA ,

1 ilhium lut itt 1 1 )\ JJuk _ Lithium thloutk 3 luki '•M 7_ il 1 lithium nid luk s t

1 îlhium Sulfite 1 llkl if 1 '

ï 1 utm 11 k 18i()

T I s sm 1 lut 1 (1l)Tl)

L Mtlhu nui 1 luk i --6 432

L Ph mlilinin i 80'0 __ _ _ J_liik __ I Pu lin riuki 317)0 ,

L S un i luki , 84960_ 1 lin on me 1 luki 33180 L liypltphin OlLk (O) L leu in IJuki ) 8)() I \ ihn 1 luki 4 20

Ml llespim lClUltl Tluka 6308 1

Mi msnim ehlondi riuk i 61064

Mi iiesmmmtnte riuki ( 303 1

Mi lit nun suliite Fluki bM40

Mille i id 1 luki OP 10

Mummt i II Kttite 1 luki (31,"

Mm un si(II) elilondt Fluk PVP

Mm m till md it 1 luk l 63S47

Mt thmol 1 luk l bss4)

Mi the k ik bilk I luk l (67P

Mithykikthloiitk 11 luk 1 66 10 NN \ N tttum thsk thy k m Inmin (11 Ml D) 3 '689 JMukt _ _

N N Dimethyl dotkthlimine \ oxitk LD\()) Hut I 40131

NN nitthvkiie bisunl imid (bis iciv hmtdi) 1 luki 666b"

Ni inn mud j luki -1140 __

Nuktl (II) md ite J lllk 1 "2P3

Nitktl IPehloiidt 1 lllk 1

"2lL _

' i- Niti tin nnid ldeiniie thnuelet Ink (N \D B -> i iv i tes 3s Nituc md ( r uk t 8P80 OO Bis(i iminopiopyl) polvpiopvknt IvcopOO 1 luki 148,2 bell D Ottyl Gltitopynnositk J lllk 1 13083

Pu ittin i il I luki 6 MS

Phenol 1 Pkl / '61 1

Pipencme 1 4 bi 0 tthuie ullonu ieid)(PlPl SI 1 luki S0636

Polvtlhvi m Keol 1000 (PEG 10001 1 llkl 81190

Poleethyk ne Iviol 10000 (PI 0 10000) I llkl 3P80

Polv Uni n Iv el POO (PI O 1300) 1 lllk 3P10

Polyethyl ne Ivtol '000 (PFG i000 I luki 31221 1

i ! m Keol ">000 mononielhyklhti (PI (, )00 MMP 1 luk 1 Polyithy , 81 121

Polyethylene Jvtol 400 (PI 0 400) 1 uk 81170

Keol 4000 (PI O 1 llkl 31 10 Polyillrvkm , 4000)

Polvtlhy kn Keol 6000 (PLG 6000) I lllk 1 8| 111

Polyethylene heil S0O()(P, G 3000) 1 lllk 1 81^68

Polvtlhyknt Ktoi i()000 (PK 20000) 1 lllk 31 00

Polyethylen Kt oie dut id f 00 I luk 1 8 PU

Polv 1 Jut miii i id 1 luki 81 P

Polv I K sm 1 luk 1 SP 1

Pol issium Kit iti lluki (9()s|

Potissmm i h loi id jllikl 60130

i Pot issmrn dihi dio i n phosphik 1 luki 60 s()

Pot issiumhtxiey innokn lUlIB 1 luki 60100

Pot issium md lit 1 luki OUI)

Riboflivin i rnonophosphik sodium silt 1 MV I lllkl 83810

Rubidium till iiick ' luki 317)89

Rubidium mti it 1 luk I 8 1010

Si ik m IK 1 i u ssL I MC 3(1042

st t But m 1 riuki 19140

Seien m tin mini Se M 1 riuki »3981

Silicone il D( "Ot 1 lOmPi I lllk 1 8pll Sihe m t il P( Ol DmPt riuki _83lll

Sihei md if Fluk 8S S _

Sodium did til sultitt (SDS 1 buk i P 1

Sodium it i He 1 luki 1 18 1

Sodium izitk luki '1 )) J _ Sodium b lohe In t \Mii h ^ 834 I

Sodium t hi sud 1 «llkl 1 79

Sodium dibs dio, tiiph pint Pitsdiu 1 iuki '1(43 Sodium hvdi sxitk M_uk I 198 Sodium mti ik JJiik i /I 88 Sptinndiiiet tiihydi hPn 1 ^Huki S-605 Sp imiiit led ihydioehkiitk 1 lllkl 81610

Sliontium mil it 1 lllk 1 88899 Svieiimc it id _Huki 1 0/9

Sulluii Hid 1 luk 1 P 17

Suil Dtttl "l Site n kit Il »nptm K nth JIR2 410 Ptti i Kim 1 luki SP(0

Teti imethvlinimoiiium chlond Fluk i 8 mo

tliiiiiiui I luki JilCO

u ins D 1 tenu icid Fluki 9) 00

Tiieui M ick 860'

Tn thin limmt 1 luki X) 79

Tnothsl inun M lek 8 p

nillu 1 ICell 1 1 1(11 A) 1 lllkl ipoo

tu S duim tid K Pihv kit riuki 1] (0s

IntinX 100 1 luki UP

Tii/nnbis Tus) Si nn llpj Lu i BPH K POSiO

\yl n y m 1 Ihki PC )0

Xyhl 1 Iliki P6P

i istl xtn t '"pit (P(

\tt ibiuni indite i ent ihvdi n l luki 18730

/m i i n M !Ck 8301

/nit chliu 1 1 Ilk 1 i?0

Zme mli it I luk X 181

7me nil it riik )6-.()0

2.1.2 Purification of polyethylene glycol (PEG)

Polyethylene glycol of various lengths (PEG 400 to PEG 20000) was purified following a protocol developed by Ray and Puvathingal and adapted by Dr. Song Tan1"1*. 1000 ml of a 50% (w/v) PEG solution in MilliQ water were prepared and the refractive index at 20 °C was measured (typically nD = 1.397-1.400). The PEG solution was placed in a 1 liter bottle, and 55 µl concentrated sulfuric acid (95-97%) and 10 g Dowex 50X8 acid form were added. The bottle was placed on a spiral mixer and the solution was slowly mixed overnight at room temperature. The Dowex resin was removed with a sintered glass filter and the solution was transferred to a 2000 ml Erlenmeyer flask. Sodium borohydride (1 g) was added and the viscous solution was mixed in a shaking air incubator at room temperature at 130-150 rpm. After 10 minutes the pH of a 1:10 dilution of the PEG solution was checked. If the pH was less than 8.0, additional sodium borohydride (0.5 g) was added. If the pH was greater than pH 8.0, more Dowex 50X8 acid form (5 g) was added. Every 15 minutes the pH was checked as before. If the pH was then greater than pH 8.0, more Dowex 50X8 (5 g) was added; otherwise the solution was shaken for another 15 minutes and then filtered through a sintered glass filter to remove the Dowex 50X8 resin. The PEG solution was then degassed for 10 minutes before it was passed through an Amberlite MB-1 column (2.5 cm x 20 cm, flushed with 500 ml i0% methanol and 1000 ml MilliQ water) at a flow rate of about 1 ml/min. The first 150-200 ml of the eluate were discarded. The rest was collected in a bottle wrapped in aluminum foil. Finally, the eluate was mixed well before its refractive index at 20 °C was measured again in order to determine the final PEG concentration (typically 45-47% (w/v) PEG), and it was stored at 20 °C in 50 ml aliquots.

2.1.3 Buffers and solutions

Buffs rs and solutions Composition 1 xMBbulfu 1 (NIB TOO" 1) iOmMBi lus Pitpmei II 0 lOmM MP P

i m\i nn

1 x NIB huit i (NI B 00 10 mM tu < lpt1 / ) 10 mM MCP

80 mM N k I

1 mM P11 t x NI B buff i NI B 0 1 MniM In Ci pli J lOmM M ( 1

100 mM N i( 1

mM nn

J _ _ ^ IxNTBbuttei 4(NIB >00 4 OmM lu 0\t ]H 0 10 mM M OP

P mM kO \

,.I'"M _, 1 x 1B1 j 3)mMTii | 3)mMb 11 U I

i i [ m\l 1 PI \ ' ) M pit issmm/8 M netitt M p ti mm i etit

1 i 31 1 ti i 1

_ 6 x DN \ Ihidin buftei sm ml br m pheii I blu

s m ml xv i n s m 1

s h i k

_ jpPnMJJ \pI13_j _ BiniHIbull i NLB 1 PI ) | 1 ) m\i In C 1 pH i 1 ^ niM \ iC 1

l inMDl [

Bltt d sum , P s \ ih lut ihm I

1 , i ii i i

Blot ninnin hutki 1 mV in

* m\! K m

Blot stun 1 \\ v unitl blick

4 in thin 1 UPI C ntle)

i 3 y \ i li nid

CIA \V v s etil i t tin

1 \ \ i uns 1 ik ihol Comptteilt cell silt solution ( C Si 440 mM M ( 11 140 mM M SO t

lIOmMkt 1 PoRIbuttii(NFB 1011 100 mM hisUpH lOmM M LP

s(PnM NP1

( 0 sf Tut nX 100

Fix toi stqutniii t îs P \ s nid i i i

10 v v ) t ih m 1 rsimiinidt thee 18 t >îminii 1

d nni/ luilh Vmb i lue MB lOniMFOl \pII3 psis bultel P mM kl e o i-sniM In CSpUS ) lOmMIDI \| 113

M) ill mM N i 1ÏP04

HniMklI l>()t PmMNlnci

3 s mM N i( 1

NiOH/SPS lutnn 0 i MNiOIÏ

1 V, V Se S

Phuiol« I\ P ph ni 11 P P (UiliNiudi 38

Piotein Sel destam T-c (v/v) ethanoi 3%(v/v)acctK Kid Piotein Del tix Is" (v/v) ethanoi

MV (v/v) an tic leid

Piotem _il loidnu buüu (POI B) 0 t nu ml biomophtnol blue Pi mMBisTusClpI16 8

if) iv PJitiiol

t w v)SDS

i 10r v s i nv.tt îpt ihmol 1 (PORB 3()mM his Piottm _ luniiinjbiittti *3 \1 Ktlllt

^< i wv srs

s IP uni i t un ni ml ( t mi n BU e R in Pi inn I tix ,

F4 ON \ h mo i buHti > mMliisCl pH 8

1 ) mM \1 ( 1 lOnAimi 1 t mM \ IT

0 s m mi ps\

' 1 mm al bullet "33 „At lusU pli 8 "sniMNk 1

| ' IQniMMpP

8 17 ION A pol) met ist dilution hulhi Ph 11 nu 1 \2 008 \

T7 1 llx 1 mix ptMtini llM eiCl 1 P 1 uM AC XV * __^„._„___ I ' DN \ pohtiisi mix (6 7s pi pel cloiii ) 13 111MIM 1

I 0 ixl 1 llx 1 mix

" j i OS x I ON Vptlvm 11 e iilution bullet

0 i u( 1 ul P d \TP

) 1 ' I nil J3N x P Pint 1 is " II (H) (P1 10 mM ItiPlpll 3i ) 1 mMrOl \

~ DiI8_0 11 10 11 I K»mM In ( 1 pll 80 I I mM IPTXpIPO

tl (P M) 1 ) mM 1 ri C i pll 8 (

Xent DN 3 polv mu ise untion buiki i|JmM Ins( I pile 3 lOmMkC) lOmMiMItPPOt

I 2m\l M SOI 1 1 PiionX 10) | ( ompetent it 11 Ii tntioi mutton butlei "Tl mM PIPI S "s 1 inAlkCI 1 P mVt ( Pi I s s niMMP P 17 ON \ poivnieiise lummttion mixes J4)mM IusClpIi7 8

I Ml m M N i( 1

JOmMM'Ck

P()ni\PlNIP

lOmMddXIPtX \C ( 1 i \ PCR sa t nun kau ion mix 122 2 pi pu clotk) 1 \ etil PN \ polv mens tt iitiouhuttti I :M)mMdNiP

1 s 0 u\i [oiivmi P( R pnniti 0 s uMuseise P( R pimri

) "1 unit nl \ slit ON \ pt ly niui ^9

2.1.4 Media for bacterial growth

Gl o-vvlh medium C om position

2x13 dm m nu 1 C (vv/P biet livpt n

1 0 n (\v/v vt i t exti i t

8 0 p v, v N k 1

Se Mit M) M tu A M ) s ilt

0 4P vv v lu

40 m 1 t tith inmm 11 but nitthnnm t nli7 1 p intch ) i mM SO M 4 si tihz I se] n it K

DuMl SOI i nh/ t i int K

ni 1 f ithIMN Nn m I Pviil mu xm mi Ilnmin sleiihz d ep n itt.lv)

10 ni s ml 1 n m thnmn si nh/ 1 pu n K

i 1 B m dium 1 v. v lu u tivptnit

P (mlvi m e\h ut

0 ip (v, v) Ken 1 VI i niMkiHP04kIPP01(st nl,/_i _j_ int K 131 i u (ht IP vv v but tivj t ne

1 0e (v, v) vei t xli ut

(8 iv v N PI

st I (w v) i ir

2.1.5 Enzymes for DNA-subcloning

I1 ii/yiiii Mmul ittm n Numbti

— ( Alk dm i h ] h n i ÎP Bo nun M muh un BimlII i tu 11 su ii7vmt _NT Bithb i(T — Bsm II stn ti n nzvni Trci nli i t

B 1 Bl i siu p n n/vm NI Bi hl ip

1 to RV i tu h n n/vm NEBi hl 1 PS

FeoRI i lu ti n enzyme NLBi hl Kit

Ilmiill] lesin ti n n/vm NI Biohb I PL Ms| I 1 tll tl 11 U17VI11 NTBiohl 1PS__ Nile It tu u n m/vm NEBi hl un

Ribonutlt i \ KN is A Si mi k P( 0

Stlll I le lllttl )ll ll/VIlle NE Bnl il tus 1 4 DNA 11 is NkBnhl OJl Itptlvnuel ni 1 kini PNk) NE Bithb

Tl DN <\ ptlvni i is Phuiin n 1 08 u

V ni DN\ p Ivniei i M Bi hb 184

Xb 11 ic tiittion eii7vni 1 tri nti MRl 1 R 3

2.1.6 Proteolytic enzymes

1 n/vmt Munit Il till tl \ umbe i

xminop pti h I Si m \>m

i\minipepii 11 M B him t M mnh 0)1

Cut tidi t xv| t[ _\_ î tm^_ M mnh im C ub l h B xvj | nniç i M mnh un

Cuhoxvi ptiji-_ej^ 11 m, i M mnh _n 14 (JJJ_

C ub ti i i i xy p 1 j- 3J lu n i 31 innh n 81

CJivniihyppin _nim_ i M innh m 10331 1

El i 11 eh un i Miniih un I 8p _ _

P un uki

îp _

Pi t un k i ehtin i M innh m

Subiih m lukn ss fs

I ehim«si M mnheini 1 PI ) 40

2.1.7 General apparatus

Fquipmcnt Supphci Usage

ABI P8BDN V svnthtsi/ i xpphtd Biosisiems Svnth-sis ol olnouutktidt s I T CT) Spettiopolninittti /10 \S( O On cul ii tin hi iism sptetiomettv Dnpfrn/el ( 80 O R13CO Stonpe ol tells pioteins md DN \ DNA IhtimilCvtln 480 C Ptikm Flniei PCR

DP 801 Put m Solutions Dynimit Lijit St ilkim^ rititiophoims j 1 bo\t C imbiu I 1 ctiophoitsis P let Uophoi si

EktUophoissispovtii supph 21° I kB Hiomnn 1 kctioph >1 1 Fue/e divu I vovn P 1 Lelblkl Hill lis tvophili7itn.il il pioteins nid DN \ I 30 iet7ti ( L Buiktnloi Su i i u ] mein nid DN V Giltlivti ssn Bio Rid kltt hophoiesi

Oyiitoiv vv itti bull shikti XI Neu Brunsiiieh Sel Iltltlt In ub ition c t b k U u i litmid eultuie s

Heitm bloïkDn Blotk PB 1 echn 1 il ill OlllltleOtltles ! DepiOkttl lut libit oi Bm) MU Iklell Intubation e t bute na a^u plate s lut ub Ut i shikti Ci2s New Biunsivitk sneiititit Ineubinon ol bi tan liqui 1 uilluies Mil h Qvv Un Mlllipoie Vvitti Puuhi ilion Siskm plIMetti PlIMS8Pucision R idiome tel Cope.nh.l_ ill pli Itle (suit mellts Sinket CntomitWR Buidei X llobun C nniissi stumn V de st iimiiji

PItllllee P md l Milhpoit Pli telll toile ntntioo

Ultimum piottssoi XI II ut Systems 11 Iv 1

1 i ( ni IL \ Specliophotom u i m l\ ib ipti m sptt uostopv

N i\ Spistioplutom In ispn U Pimi lel l O 1 but m liqui 1 eiiltuies \ 1 Speed 1 St 00 \ S iv nu ( Miteiih ih m t piol ms m 1 DN \ l ilnthmibi Sehull X Sthltieh ( n nn ih n 11 pi i ms

' I 111 llllll IX Pllke ek Kunk \( ( 11 lv 1

° l\ 11 uiMllumin Uoi 1J Mitiovu I kB Biomm i Ithidiumbi nui visuih/mon V ituum m tnttiliM Sehull X S Illtlthel t n nu Ui n I pi km

Vitiium tt iitiim Uoi Millipoi C Ulttllli 10 n I pi ! ins

VoïkxOein PM Belldel X Ikb m Mixin

Wittrbikh lempettt U 8D 1 e hllt C mis t uniii d st iinin0

Wateib ub II like kP Hnkt 1 i Ui i 1 C

X Riv Mtdu ilPilmRX ! un \ut u idi iv it s els 11| qn nun, , 41

2.1.8 HPLC equipment

All HPLC chromatography was carried out using Gilson equipment (model 302 pump, model 811 mixing chamber, model 302 manometer, model 116 dual wavelength UV detector and model 20^ fraction collector). HPLC control was effected by Rainin Dynamax software (version 1.1, Macintosh).

Hl'l C t iliimn Sut Supplui __ _ _ 8 PI 31 jvPU 11 mm \ 13( mn T~\ iij_sto SP HPV H nui x 1 mn [T v i isk __ l'henvl P \\ H mm \ 18( mm Tiv ii TSk

JyudosulJ)) s C3 t x s ) nin Mi h iv Ni

Supeidex 1 IK 0 llx m n 1 huma i

2.1.9 FPLC equipment

All FPLC chromatography was carried out using Gilson equipment (model ^ Minipuls peristaltic pump, model 116 dual wavelength UV detector and model 20 i fraction collector).

rPI C it sins M imif ii tuiei Nunibti llpiun [hu CI P 1 h u m i ii 1 h 1

S phitivl Ilk S P( : h u ma 11 I 3

S phiiryl Ilk S ( Phumu ii 8i

S | hi ixl HI S tfX 1 h u i ii

S i h 11 x C m In n PI u ixi n

S | h I x P n in n ! I u mai 11 t 1

2.1.10 Centrifuges

Centrifugations in this work are documented using the formalism (rpm, t, T, c, r). This stands for a centrifugation at rpm rotations per minute for t minutes at temperature T in the centrifuge c using the rotor r. Example: (18000 rpm, 20 min, RT, RC26, SS34).
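The formalism above is regular enough to be read mechanically. The following is a minimal sketch in Python, not part of the original methods; the names `Spin` and `parse_spin` are hypothetical, and it assumes that fields are separated by commas and that the rotor field may be omitted, as in the tabletop and benchtop centrifugations later in this chapter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Spin:
    """One centrifugation step written as (rpm, t, T, c, r)."""
    rpm: int                      # rotations per minute
    minutes: float                # duration t in minutes
    temperature: str              # temperature T, kept as text, e.g. "RT"
    centrifuge: str               # centrifuge c, e.g. "RC26" or "tabletop"
    rotor: Optional[str] = None   # rotor r; absent for fixed-rotor centrifuges


def parse_spin(text: str) -> Spin:
    """Parse a string such as '(18000 rpm, 20 min, RT, RC26, SS34)'."""
    fields = [f.strip() for f in text.strip().strip("()").split(",")]
    if len(fields) not in (4, 5):
        raise ValueError(f"expected 4 or 5 fields, got {len(fields)}: {text!r}")
    rpm = int(fields[0].replace("rpm", "").strip())
    # Remove the longer unit name first so that "minutes" is not cut to "utes".
    minutes = float(fields[1].replace("minutes", "").replace("min", "").strip())
    rotor = fields[4] if len(fields) == 5 else None
    return Spin(rpm, minutes, fields[2], fields[3], rotor)


example = parse_spin("(18000 rpm, 20 min, RT, RC26, SS34)")
```

The temperature is deliberately kept as text because the protocols use both "RT" and numeric values.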

Centrifuge R il i ni ix i îdms

Centn lu, 8)1 I n ht j I[ ] ni il I x 1 n, 1 )

Mmitu e 1 1 ihl l | II 11 u win, in bu k l P'

S SoivillRC ( j lu ( RC ( ,\ ,11 SS 4 IU \ P II il u nxePt ui 1 1 ) )

( Sx IIE\ I4P) Il ri u tix i P m 1 111

(S ITTX 1 M 3 II n u lix P) ml K 42

2.1.11 Heavy atom compounds

Chum il i(impound Supplie] Nnmbei \mmom 1111 hexithloioniditefh IJuki JJ) 40

Ammonium hex ithloitosinit (I\ 1 luki (10 T ) ammonium hex i hi npilhdite I\ 1 luki (O_C0 Ammonium m Kb hi J.D8S0 Tlukl — Ammonium uti uhloi p ill idite U Tiuk i Ol WO

Aninioniumt ii uhloiopl itmiic II I luki i 01 0 i

Bikei s ihmtiitui ü \inhi B10K r BisPlhsl n di iniiniplittiiuindl ) hi n le su m "'S 0100

eis Plumiim II) dnmuie dit hit ml S in PI 3

- i m ipium hPudt 1 luk 1 1 I 0

ClldoliniUm Itetlte tetllllVtlllt Sti ni s 6PI llydio ni ltd ithloio nu itedll) hydi ik Sti_m ) 1P0 Indium IX ehl ink hexihydi tie 1 P 1 luk_i _ )_ 1 inlhinum hi in it 1 lllk ! 01 P)

1 elltdl) 1 t ll tllllVtll lie 1 lllk

JJïkLL -

leltKIV 1 ht SU n * 8 0

Mtieuiv IP mil ite m nohvdiit Su m 0) 8001

Me llivlnieiiuiv milite Si i P 3] P

Sodium v. In imite 1 luki 0 (

p Chkiim ituivb n/ u i it Si nn C 891

Pill ilium blind lluki 7C0-.0

( p hloi m ituivplinivl till m in I Si nn C IMP

Ptl 1 m ni il V 111 IUI lit I 1 luk 01)0

Poll limit til v in i hüll ik TI tnhv Ii ite \l lu 1 13 PC 8

Pol issium t ti unli ipl itmil 11 \1 In h "-4 )"'(

Simiiiuiii hlond lluki S 4140

Sodium tlhvlm i uuth i ihvlitt LM1S ï lllkl 1 P)

Sodium hix itbloit phtm Hi 13' riuki 188 )

Sodium tiiinsiit dihy hit ! luki on

letiikist iietexvnx r mi niethm ) 1'VMM) Stun 3 ) ""O )()

Thilliumh) ie tit Su m P 8P(

tnns Phi mu indll tb inline dithioiitk Si 1111 1 PP

Pnnielliv 1 1 id ml ik \n m u n j

2.1.12 Crystallization, X-ray sources and X-ray detection

1 quipmin! Supplitl lsc_

Multiw hl F 1 tu P B el n I h km t n 11] [ ( is 1 illi/iti n ml s i ikm

Min Bnl ( il mi i s t m 1 Ciy Siltm, ii | Sillet 11 1 1 BiVei nitthumvistositv S ihn

_ — hl us t ipill 111 \ n phy si R uth C iv Hl m untiiijj

Pho n b \S M) ) phoimi, I up X i iv 1 t tti n

in b \S sxi ml R I Inn, plu UJj_ X nv i (ein n

Irai, m pi u 13 u m \t \k it nth miunt 1 X nv d tettnn

i i f tuck tilth mein Huh i X nv d t tti n

-

Inn, pi it bis \1\Ri lid X i iv d t eu in double mint i 1 tills 1 v 1 1 X seule i p I1V

RotUin mod X nv ntiUei Ri iku X i iv im FT C md Rl P)0

Synthioh n 1 uk pe m Sv n hi ton Rili id iPi ihtv X nv tu btuiihn BMP SNbl md'P 1 1 t

Piv t d bee/in mil t 11 I dlv tkv 1 tlipt tquipnit nt t Sirni 1 P1 l " i' d lib 1

tl ell h 1

Civ til e hn tt Id is su im FI S s\ et ms S uni 1 Pi Million 43

2.1.13 Computing

Vet sum Ref. I Sciipl/ Routine Dcsciiplion \ S 1 138 VilORF molecular replacement seltiotationTunctioii

\RP-\\ XRP tixsiallosiaphit letinement with updating ol the model m ical spate P\l) eombuiation antl stating of civstullogiapluc teilettion data

PPORPPONV inteitonversiüii oi vanous coordinate tnrmats

PROSSEP .oniputution ol anomalous leHtenm, tactors PM dcnsitv modiiication and NCS-aieiaguig

Fl'I last 1 ounei tianstoim

riNONCS deteetionot NOS opeiateiis toiinheavv atom sites

GETPX teal space loiulalion seat oh lotation ot NCS axes m leal spate

llkl PI OT genetalionpsmtl > puvession pieluies liomentui dalaseis

M \M \3C PP4 tonveision h.tween VI \M \ and CCP4 map and mask loim.its

\t \PM \SK lnteiconvctsion iU' map antl mask loiniats and untP

MLPH\RE m.tvimum likelihood heaw atom lelmemenl and phase culmlation

Mr/2vaiious com e mon oi CPP I ieflection tiles lo CN.S foimat

MPZPUMP viewin» oi lefk-etion data tiles

M 17t UPS lellettion data tiles utihtv piogiam

NPSMASK NC S masks manipul mon piogiam (oveilap lemoval. , ihtmp

NPO moiceiile and map plotlm« PDßsrrr manipulation ot POR tiles

PFPKMAX peak stau hm- m maps

POLXRR1N s. IllOl.ltlOil-tlllletlOlls

PROCUPCK stnietmc ..nidation

PROTIN piepaialion ol lesiiamts nies loi XRIPWARP and REI MAP

RTTMAC ei vsiallogiuphk tonuigate nathent minimization n fine nient

ROTXPR1 P piodueiion ol \FI7 tips m (oim suitable toi input lo SC \I \

ROI'M\T uiieiionvnsion ol vanous lolation ainlc loiniats

RSPS heave aiom position sc.uohmg in thtfeiente Palleison maps SCVL'X staling ol multiple observations oi icflections SP \I I IT sealing ot isomoiphcus derivative tlaia

SCA1 PPACK2M1/ toiivmion ot menvd s.itlepack output into MT7 loimat

SI AI I stiut tun hittoi tale illation anel X-iav retmement usina ITT

SIPMAA takiilatimi of 1 ounei eoeffieieiits using calculated ph.isvs SOLOMON dcnsitv niothlKalion b\ solvent thppin» and NCS-aveiagmg

S OR WATER soiiins ol vsatei molecules bv the piolein chain and NPS

TRI NCATE tonveision ol ictkttion intensitv to sirucluie tat loi amplitudes

VECR1T vctloi spate teliiu ment of neavv aloni sites

VEC1 ORS geiieialion ot Patte ison vectois liom atomic cooithnales

XPIC)P84I)RIVER niwuie. ol PPPI (oini.it maps

IPs IP PN S 0 8, 0 9a J but ied_siu fat e mp bulled siutaeC aitessibiJitx ealtulalion 1 tontaet mp dem mi in tonl.u tine icsidues between two tes I ot atoms

geneiate_e.m mp «enu lie i ooidimt- md slimline tile toi simple models

lies del del nit s non nxst illoeiapnu simnie tiv tonstiamts and lestiamP

iioss_totation mp sioss lotaiion 'unetiin loi moPulai leplaceninit

iinalation mp .Hitoinaud tianshtion s.anh foi nxhetulat leplacemint

anneal mp ei\stilloiiiaphk simmaud annealing lefmenient

bnoup mp gtouped Ogioup.pn leudik linn snauuil B-laitoi ic line me lit

bindilidiiai lesti uned mdiv idu il 1) hit lei lelmemenl

iiuniiiii/e mp tiisialloiiiaphk eoniugak nadient minimization seiiiienitnt

moikl map mp demon dcnsitv maps using phase inhumation horn a model niodtl...stais mp civ siallogiaphu model statistics

e>plmii7e_aveiage mp tuids eiptiniuni toinhmalion o! tooitiinale flics

opumi/e via mp optimize vi.nX iav tvometiv uemhO

riaitl mp civstalloniaphk luiekhodi nlmtmeul

sa_oniit map mp annealed omii map with spheneal omitted legion

w atet pelote mp delete w.iiei molecules bv analv sis of e lectron deiisip map

vvatei pick mp pish w.Uei molceiiles in cleelion density map

make_e\ mp seuip anav toi s ioss \ ilklation usina a landein seiet tion of data

i^pdhsuluiussion mn pipelines paijn] mioini ition u quired loi PDP suhnv^ssirm 44

Paik.iSt Veision Rtf Suipt/ Routine Desinption 138 189 F)I I AVI 12/99 PLIAVk finding ol motifs and let oamlion ol piotems with similar told

I SQMAN supei position md rally sis of multiple piotems md folds hi I SSI geneiitionof SSF fiks foi DLKVt tiomPDB files ' Ol NZO 1110 140 OEX70 uilomitixin i ol i ivv u [lection dit i

110 SC VII P \Ck I 11 0 SPVIlPVCk md mtigmn of i iw u flection dita snhng _

O 622 141 O iiKxltl builduu envelope ^mention md stiuetme vilidition

OOPS 02/99 141 OOPS stlUtlUlt V ill lit! ill

1 PI J p ' RWE 02/90 M \M \ m isk nn mon md minipuhtion

t OX1 \ niei ilion 11 NC S nnsks loiniu il spue eouehtion mips

M \P\1 \N muupuhtion ot nxq md misk toinnts uidpuimuns n \ 1 \M \N niinipuhtion otiethction dihsets

IMP mime mail ot NCS opn it is

1 SSFNS le il spiee s nehm, Ol fl I tllellts md motilities

i 141 n\ SIIMT 1 6 SU\PI In iv s ilimpt m n in Inn bv sinnm-hy minimization methol

i 118 146 SIIFIX SI11IX lu ivv itompi iti ns in hm bvPitl ison mthluett me Ihitl

> M3R\tn\ 4 0 1 17 M \R\ IFW vi win oi iivv i Sk hon mn e 46

2.3 DNA purification«7

2.3.1 Ethanol precipitation of DNA fragments

DNA fragments from 30 to 1000 bp in length were precipitated by adding 2.5 volumes of absolute ethanol and 0.1 volumes of 3 M NaOAc pH 5.2. The precipitated DNA was harvested by 5 min centrifugation at 13000 rpm in the benchtop centrifuge at room temperature. The DNA pellets were resuspended in TE(10, 0.1).
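The volumes used for ethanol precipitation follow directly from the sample volume. The sketch below is illustrative only: the function name is hypothetical, and it assumes that both "volumes" refer to the starting sample volume, which the text does not state explicitly.

```python
def ethanol_precipitation_volumes(sample_ul: float) -> dict:
    """Return the additions (in microliters) for one ethanol precipitation.

    Assumption: 0.1 and 2.5 volumes are both relative to the starting sample.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    naoac_ul = 0.1 * sample_ul      # 3 M NaOAc pH 5.2
    ethanol_ul = 2.5 * sample_ul    # absolute ethanol
    total_ul = sample_ul + naoac_ul + ethanol_ul
    return {
        "naoac_3M_ul": naoac_ul,
        "ethanol_ul": ethanol_ul,
        "total_ul": total_ul,
        # Final acetate concentration after all additions, in mol/l.
        "naoac_final_M": 3.0 * naoac_ul / total_ul,
    }


volumes = ethanol_precipitation_volumes(100.0)
```

Under this assumption the final NaOAc concentration is 3 M x 0.1 / 3.6, or about 0.083 M, independent of the sample volume.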

2.3.2 Spun column purification

Spun columns were constructed by placing siliconized glass wool in the base of a 1 ml plastic syringe barrel and filling the barrel with Sephadex G25 medium or Sephacryl HR S400, equilibrated in the appropriate buffer. The syringe was placed inside a 15 ml Falcon polypropylene tube and pre-equilibrated by centrifugation (2000 rpm, 5 min, RT, tabletop). The sample was applied to the column, which was placed in a fresh 15 ml tube containing a 1.5 ml Eppendorf tube without cap, and eluted by centrifugation as above.

2.3.3 Purification of synthetic oligonucleotides by denaturing gel electrophoresis

Oligonucleotides were synthesized at 0.2 µmol scale on an ABI 308B DNA synthesizer (Applied Biosystems) and de-protected by incubation at 55° C for 14 hours. The de-protected synthesis product was dried in a Speedvac and resuspended in 100 µl H2O. 25 µl formamide dye was added to an equal volume of the crude oligonucleotide stock. The solution was boiled for 3 min and loaded onto a iO-l^o; (1:20 bisacrylamide:acrylamide), 7 M urea, 1 x TBE gel (of) cm x 20 cm x 1 mm). The gel was run at 40 W for 5-6 hours, removed from the plates and wrapped in Saran wrap (Dow). Bands were visualized by their absorbance against a fluorescent TLC plate (Merck) under UV illumination at 254 nm, and thus could be excised. The gel fragments were placed in a 6-8 kDa MWCO dialysis bag together with 1 ml H2O and dialyzed against two changes of a large volume of H2O. The liquid in the bag was isolated by filtering through a 0.45 µm filter (Sartorius) and the DNA was dried in a Speedvac. The oligonucleotide was resuspended in 100 µl TE(10, 0.1).

2.3.4 Agarose gel purification

DNA (<2 pmol per 10 mm lane) in 1 x DNA gel loading buffer was applied to 0.8-1.2% agarose minigels (8 cm x 8 cm x 0.5 cm, SeaKem ME grade, 1 x TBE, 1 µl 10 mg/ml ethidium bromide in the gel), and electrophoresed at 70 V for 40 min. The desired bands were excised while being visualized on a UV transilluminator. The agarose slices were then placed in a 0.5 ml safe-lock Eppendorf tube prepared with a 0.5 mm hole pierced in the bottom and siliconized glass wool above this. The 0.5 ml tube was placed in an open 1.5 ml Eppendorf tube and centrifuged (7000 rpm, 5 min, RT, benchtop). The DNA was used for cloning purposes without further purification.

2.3.5 Medium scale alkaline lysis plasmid preparation

To purify plasmids for preparative work, a scaled-up version of the alkaline lysis miniprep as adapted by Dr. Song Tan was employed. A 100 ml 2xTY culture was inoculated with a single colony from a freshly restreaked TYE agar plate and grown to saturation overnight at 37° C. The cells were pelleted in a 250 ml centrifuge bottle (6000 rpm, 5 min, RT, RC 26, GSA). The pelleted cells were resuspended in 5 ml lysis buffer in a 50 ml Falcon polypropylene tube. Then 10 ml NaOH/SDS solution were added and the solution was shaken vigorously. After 5 minutes incubation on ice, 10 ml ice-cold 3 M potassium/5 M acetate solution was added and the suspension was mixed gently. After another 5 minutes incubation on ice the tube was centrifuged (4000 rpm, 10 minutes, 4° C, tabletop). The supernatant was filtered through a sintered glass funnel into another 50 ml Falcon polypropylene tube.

Isopropanol (12.5 ml) was added and the mixture was centrifuged (4000 rpm, 10 minutes, 4° C, tabletop). The pellet was washed with 70% ethanol, resuspended in 300 µl TE(10, 50) and transferred to a 1.5 ml Eppendorf tube. After addition of 2.5 µl of 10 mg/ml RNase (DNase free) the sample was incubated at 37° C for 20 minutes. The reaction was extracted twice with phenol and once with CIA. Two 150 µl aliquots of the aqueous phase were purified over two Sephacryl HR S400 spun columns equilibrated with TE(10, 0.1). The eluate was suitable for restriction digest, sequencing, and cloning.

2.4 Cloning methods147

2.4.1 Competent cell preparation

The E. coli strains (Table 2, Table 7) were obtained from Dr. Song Tan and restreaked on TYE agar plates. Using colonies from this plate, a 250 ml culture (250 ml 2xTY media with 6 ml competent cell salts solution) was grown to an optical density of 0.6 at 600 nm. The culture was chilled on ice for 10 minutes and then centrifuged (4000 rpm, 10 min, 4° C, RC 26, GSA). The pelleted cells were resuspended in 80 ml ice-cold transformation buffer and the centrifugation was repeated. The cell pellet was resuspended in 20 ml of ice-cold transformation buffer and the suspension was transferred to a sterile polypropylene tube. DMSO (1.5 ml) was added with swirling and the cells were incubated on ice for 10 minutes. Aliquots of 100 µl were dispensed into 1.5 ml Eppendorf tubes and flash frozen in liquid nitrogen. The cells were stored at -80° C and gave a transformation efficiency of 10^6-10^7 colonies per µg of pUC19HS.

Table 2: Cloning strains and plasmids.

St i mis

101 h n tvpe hsips tin upIDIi pi VB1I ItuD'fpi \PB htQ7DMlsl Tbl wisnoinnlly

i wn n 1 i ! i u 11 He in 1 Ihn « ml i n Ion i ui\ th T |hmil\ihihi n ussity 1 i Ml

i . P nnnipul ill n PI ismiels

* PL ( n i il 1 el v tl r t f PSf in itilhn t n puipt imp Ip nivup mi[ isjjn ^

pt I i C mp nein il m F / ] i Kein \pies iens\ tcniiuivin in inipinlhn i sisiine n Du Jv toi h iv iT phi piomittr whith is i ii mzitl uilv bv th 7 RN \ piivm i is but n t bv th In t RN Xpolymnis (usai with BL It 11)IM j _ pRLI h V unni t tpIT i it nt until« t tinloi mil mm p dit tRN \ lin t t VGA (cinstimt IP j I)i SonL 1 ui) j _ _ _ _ ^ 1

2.4.2 Restriction digestion

For analytical purposes, 0.5 µg DNA or 5 µl of an alkaline lysis miniprep (Section 2.4.12) were mixed with 5 µl digestion mix (1x NEB reaction buffer, 0 4 D µl restriction enzyme A, 041 µl 1 restriction enzyme B). After two hours incubation at 37° C, 2 µl of 6 x DNA gel loading buffer were added and the reaction products were analyzed by DNA PAGE. For preparative purposes, 2-5 µg of DNA were digested with 1 > units of restriction enzyme in a volume of ^0 µl of the recommended New England Biolabs-supplied buffer, which was supplemented with 10 mM DTT and 0.1 mg/ml BSA. After two hours incubation at 37° C, 6 µl DNA gel loading buffer were added. Samples were electrophoresed on an agarose gel as described for preparative gel purification methods above.

2.4.3 Dephosphorylation

For dephosphorylation of digested vectors and DNA fragments, calf intestinal alkaline phosphatase (CIP) was added to the digestion reaction after two hours incubation at 37° C (see above). For vectors with protruding 5' termini 0.1 units of CIP were added, and for blunt or recessed 5' termini 1.0 units of CIP. After 45 min incubation at 50° C the samples were phenol/CIA extracted twice, CIA extracted and finally ethanol precipitated. The DNA pellet was taken up in 8 µl TE(10, 0.1) and 2 µl DNA loading buffer. The desired vector or DNA fragment was isolated by agarose gel purification (Section 2.3.4) (adapted by Dr. Song Tan).

2.4.4 Ligation

For cohesive-ended ligation, approximately 30 ng (2 µl) of agarose gel purified vector DNA and 2 µl of agarose gel purified insert DNA were mixed in a volume of 10 µl 1x T4 DNA ligase buffer. Then 4 units of T4 DNA ligase were added. Reactions were incubated for 1-2 hours at RT or overnight at 16° C and then used for plasmid transformation. For blunt-ended ligation, 40 units of T4 ligase were used (adapted by Dr. Song Tan).

2.4.5 Plasmid transformation

Plasmid DNA (0.1-1.0 ng, or 2-5 µl of the ligation reaction product) was added to 200 µl competent cells which had been thawed on ice. The cell suspension was incubated on ice for 40 min. The cells were heat-shocked at 42° C for 30 seconds and immediately chilled on ice for 10-20 seconds. 400 µl of 2xTY media were added and the cells were incubated for 40 min at 37° C in a shaking air incubator at 210 rpm. The transformed cells (500 µl) were plated onto TYE agar plates containing appropriate antibiotics.

2.4.6 Design of PCR primers

PCR primers were designed for a melting temperature Tm of more than 50° C and checked for primer dimerization and secondary priming sites with the programs PRIMER (version 0.5; Lincoln, S. E., Daly, M. J., and Lander, E. S., Whitehead Institute for Biomedical Research, Cambridge, MA, USA) and AMPLIFY (version 2.5; B. Engels, University of Wisconsin, Madison, WI, USA) with default parameters.
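The melting temperature criterion can be illustrated with a rough estimate. The sketch below uses the Wallace rule (2° C per A or T, 4° C per G or C), which is a substitute chosen here for illustration; the algorithms actually used by PRIMER and AMPLIFY are not described in this section and are not reproduced. The function names are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    This is an approximation intended for short oligonucleotides only; it is
    not the method implemented in PRIMER or AMPLIFY.
    """
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def passes_tm_threshold(primer: str, threshold_c: float = 50.0) -> bool:
    """Apply the design criterion of a Tm of more than 50 degrees C."""
    return wallace_tm(primer) > threshold_c
```

A primer that passes this check would still need to be screened for dimerization and secondary priming sites, which this estimate does not address.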

2.4.7 PCR subcloning with cohesive-ended inserts

DNA for subcloning was amplified by polymerase chain reaction (PCR). In a reaction volume of 100 µl, 10 to 100 ng of plasmid DNA containing the target sequence to be amplified was incubated on ice in a 0.5 ml safe-lock Eppendorf tube containing 0.5 µM forward PCR primer, 0.5 µM reverse PCR primer and Vent DNA polymerase reaction buffer with 0.2 mM dNTP. Vent DNA polymerase (1 unit) was added, 60 µl of mineral oil was overlaid and the samples were transferred to a thermocycler which had been pre-warmed to 95° C. The samples were incubated at 95° C for 2 min. Then amplification was automatically carried out using the following scheme: 5 cycles of 30 seconds at 95° C, 60 seconds per kilobase of expected PCR product at a temperature 5° C lower than the lowest melting temperature of the two primers, and extension at 75° C for 40 seconds, followed by 15 cycles of 30 sec at 95° C, 60 seconds per kilobase of expected PCR product at 55° C and 40 sec at 75° C. A final incubation of 2 min at 75° C was used, followed by holding at 0° C. DNA products were analyzed by DNA PAGE. The reaction products were phenol/CIA extracted twice, CIA extracted and ethanol precipitated. The DNA was resuspended in 20 µl TE(10, 0.1) and used for restriction digestion followed by ligation and transformation.
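The cycling scheme can be written down as data, which makes the dependence on product length and primer melting temperature explicit. This is a minimal sketch with hypothetical names; it encodes the step durations as stated in the text and ignores ramp times and the final hold, so the computed time is a lower bound on the actual run time.

```python
def cohesive_end_pcr_program(product_kb: float, lowest_primer_tm_c: float) -> list:
    """Return the cycling scheme as blocks of (temperature in C, seconds)."""
    if product_kb <= 0:
        raise ValueError("expected product length must be positive")
    length_scaled_s = 60.0 * product_kb  # 60 seconds per kilobase of product
    return [
        {"cycles": 1, "steps": [(95.0, 120.0)]},              # initial incubation
        {"cycles": 5, "steps": [(95.0, 30.0),
                                (lowest_primer_tm_c - 5.0, length_scaled_s),
                                (75.0, 40.0)]},
        {"cycles": 15, "steps": [(95.0, 30.0),
                                 (55.0, length_scaled_s),
                                 (75.0, 40.0)]},
        {"cycles": 1, "steps": [(75.0, 120.0)]},              # final incubation
    ]


def total_seconds(program: list) -> float:
    """Sum of programmed step times, excluding ramping and the final hold."""
    return sum(block["cycles"] * sum(t for _, t in block["steps"])
               for block in program)


program = cohesive_end_pcr_program(product_kb=1.0, lowest_primer_tm_c=60.0)
```

For a 1 kb product each cycle takes 130 seconds, so the 20 cycles plus the two 2 min incubations amount to 2840 seconds of programmed time.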

2.4.8 PCR subcloning with blunt-end inserts

DNA for subcloning was amplified by polymerase chain reaction (PCR). For cloning of blunt-ended PCR products the PCR primers were phosphorylated. In a volume of 50 µl 1 x T4 DNA ligase buffer, five units of T4 polynucleotide kinase (PNK) were added to 12 pmol of the synthetic PCR primer. After 30 minutes incubation at 37° C the reaction was phenol/CIA extracted twice, CIA extracted and purified over a Sephadex G25 medium spun column. The eluate was ethanol precipitated and taken up in 20 µl TE(10, 0.1) to yield a 10 µM solution of the PCR primer. The PCR reaction was carried out as normal (Section 2.4.7). PCR products were then phenol/CIA extracted twice, CIA extracted and purified over a Sephadex G25 medium spun column (Section 2.3.2). The eluate was ethanol precipitated and taken up in 50 µl 1 x T4 DNA ligase buffer. After 10 minutes incubation at 70° C, one unit of PNK and 1 mM ATP were added to a final volume of 56 µl. After 30 minutes incubation at 37° C the reaction was phenol/CIA extracted twice, CIA extracted and purified over a Sephadex G25 medium spun column. The eluate was ethanol precipitated, taken up in 10 µl TE(10, 0.1) and used for restriction digestion followed by ligation and transformation (adapted by Dr. Thomas Rechsteiner).

2.4.9 PCR-subcloning based site directed mutagenesis (PCR-SDM)

Point mutations were introduced using a PCR method developed by Ito et al. The protocol was adapted by Dr. Song Tan for mutating genes inserted between the XbaI and BamHI sites in the multiple cloning site of the pET3a vector (T7 expression system) (Figure 4). In a first PCR reaction the whole gene is amplified using the forward primer STO432, which eliminates the XbaI site at the 5'-end of the gene ("template PCR"), and the reverse primer STO314. In a second PCR reaction the site-directed mutation is introduced using the forward primer STO431, which preserves the XbaI site, and a reverse mutagenesis primer. The reverse mutagenesis primer has 15 bases on either side of the mutated bases. The product of this "mutagenesis PCR" contains the desired mutation but normally lacks the 3'-end of the gene. So, the product of the "mutagenesis PCR" is elongated further using the "template PCR" product as the template ("combination PCR") and the PCR primers STO431 and STO314. Products from this "combination PCR" will either contain the mutation and a functional XbaI site or no mutation and a dysfunctional XbaI site. Therefore, only mutated inserts can be digested with both XbaI and BamHI and ligated properly into pET3a.
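The selection principle of the method can be restated as a small model. The sketch below is only an illustration of the logic described above; the class and function names are hypothetical, and it assumes that every full-length product carries an intact BamHI site at its 3'-end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProduct:
    name: str
    has_mutation: bool
    xbai_functional: bool
    bamhi_functional: bool = True  # assumption: intact in all full-length products


def clonable_into_pet3a(product: PcrProduct) -> bool:
    """A product can be ligated only if both XbaI and BamHI can cut it."""
    return product.xbai_functional and product.bamhi_functional


# The two kinds of full-length product expected after the combination PCR.
products = [
    # Primed from the elongated mutagenesis product: XbaI site preserved.
    PcrProduct("elongated mutagenesis product", has_mutation=True,
               xbai_functional=True),
    # Derived from the template PCR product: XbaI site eliminated.
    PcrProduct("template PCR product", has_mutation=False,
               xbai_functional=False),
]

selected = [p for p in products if clonable_into_pet3a(p)]
```

In this model the restriction digest acts as the filter: the unmutated product is never selected because its XbaI site was removed in the template PCR.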

Figure 4: PCR based site-directed mutagenesis.

[Figure 4 schematic. Steps shown: template PCR with STO432 and STO314; mutagenesis PCR with STO431 and the mutagenesis primer; combination PCR with STO431 and STO314; XbaI/BamHI restriction digestion; subcloning into XbaI/BamHI-digested pET3a.]

The "template PCR" was carried out in a reaction volume of 100 µl. pET3a plasmid DNA (10 to 100 ng in TE(10, 0.1)) containing the target sequence to be amplified was incubated on ice in a 0.5 ml safe-lock Eppendorf tube containing 0.5 µM STO432, 0.5 µM STO314, and Vent DNA polymerase reaction buffer with 0.2 mM dNTP. Vent DNA polymerase (1 unit) was added, 60 µl of mineral oil were overlaid, and the samples were transferred to a thermocycler which had been pre-warmed to 95° C. The samples were incubated at 95° C for 2 min and then the amplification was automatically carried out using 20 cycles of 30 seconds at 95° C, 60 seconds per kilobase of expected PCR product at 50° C and extension at 75° C for 40 seconds. A final incubation of 2 min at 75° C was used, followed by holding at 0° C. DNA products were analyzed by DNA PAGE (Section 2.2.2) and purified by preparative agarose gel electrophoresis (Section 2.3.4) in order to separate the reaction products from remaining template and PCR primer DNA.

The "mutagenesis PCR" was carried out in a reaction volume of 100 µl. pET3a plasmid DNA (10 to 100 ng in TE(10, 0.1)) containing the target sequence to be amplified was incubated on ice in a 0.5 ml safe-lock Eppendorf tube containing 0.5 µM STO431, 0.5 µM reverse mutagenesis primer, and Vent DNA polymerase reaction buffer with 0.2 mM dNTP. Vent DNA polymerase (1 unit) was added, 60 µl of mineral oil was overlaid, and the samples were transferred to a thermocycler which had been pre-warmed to 95° C. The samples were incubated at 95° C for 2 min and then the amplification was automatically carried out using the following scheme: 5 cycles of 30 seconds at 95° C, 60 seconds per kilobase of expected PCR product at 40° C and extension at 75° C for 40 seconds, followed by 15 cycles of 30 sec at 95° C, 60 seconds per kilobase of expected PCR product at 55° C and 40 sec at 75° C. A final incubation of 2 min at 75° C was used, followed by holding at 0° C. DNA products were analyzed by DNA PAGE and purified by preparative agarose gel electrophoresis in order to separate the reaction products from remaining template and PCR primer DNA.

Combination PCR was carried out in a reaction volume of 100 µl. 5 µl of "template PCR" product and 5 µl of "mutagenesis PCR" product were incubated on ice in a 0.5 ml safe-lock Eppendorf tube containing 0.5 µM STO431, 0.5 µM STO314, and Vent DNA polymerase reaction buffer with 0.2 mM dNTP. Vent DNA polymerase (1 unit, New England Biolabs) was added, 60 µl of mineral oil (Sigma) was overlaid, and the samples were transferred to a thermocycler which had been pre-warmed to 95° C. The samples were incubated at 95° C for 2 min and then the amplification was automatically carried out using 18 cycles of 30 seconds at 95° C, 60 seconds per kilobase of expected PCR product at 55° C, and extension at 75° C for 40 seconds. A final incubation of 2 min at 75° C was used, followed by holding at 0° C. DNA products were analyzed by DNA PAGE (Section 2.2.2). The reaction products were phenol/CIA extracted twice, CIA extracted and ethanol precipitated. The DNA was resuspended in 20 µl TE(10, 0.1) and used for restriction digestion followed by ligation and transformation (Sections 2.4.4, 2.4.5).

2.4.10 DNA sequencing

Sequencing of dsDNA was carried out by the dideoxy chain termination method using T7 DNA polymerase. 2 µg of plasmid DNA in 20 µl TE(10, 0.1) were alkali denatured at room temperature for 5 min by addition of 2 µl of 2 M NaOH, 2 mM EDTA solution. The denatured DNA was ethanol precipitated by adding 10 µl 0.9 M NaOAc pH 5.2 and /A µl absolute ethanol. The DNA pellet was air dried for 6 minutes and resuspended in 11 µl of a mixture containing 10 pmol of sequencing primer and 2 µl of T7 anneal buffer. The sample was incubated for 10 min at U ( and 10 min at room temperature. To the primer-annealed sample 6 µl T7 DNA polymerase mix were added. The mixture was incubated at room temperature for 5 min. Aliquots (4.5 µl) were pipetted on the side of 4 wells of a microtitre plate (Falcon, Becton Dickinson) with each well containing 2.5 µl of one of the four T7 DNA polymerase termination mixes. The microtitre plate was centrifuged briefly to mix the reagents (500 rpm, 5 seconds, RT, tabletop) and incubated at D C for 5 min before 2 µl of formamide dyes was added to each well to stop the reaction. Samples were then incubated for 10 min in a 90° C oven and loaded onto a denaturing polyacrylamide gel. Sequencing gels (6% (1:20 bisacrylamide:acrylamide), 7 M urea, 0 ^ 5 x TBE gradient, 60 cm x 20 cm x 0.1 mm, using siliconized glass plates) were run for approximately 150 min at 45 W using 0.5 x TBE in the upper and lower buffer reservoirs. The gels were fixed for 15 min, dried onto blotting paper (Sihl Papier) in a vacuum dryer and exposed to X-ray film at room temperature for 3-18 hours. The sequencing primers used are summarized below.

Table 3: Sequencing-Primers.

Iduititj Siqiitnie Deseiiptlf n "siou j \\1 \( ( \C1( \C I \JP j i i ip ]ii n ii, ] i i xi 1 i i PI i [ kit i in 111 111 I SJ012 ( CC CR \\C \( ( ( ( I 11 R s i |ii n in, | uni il i | 1 1 i | 1 T 1 i mi 11 1 11 I ST()4_) I 1 \ \C I 1 C C ( 1\_UP( \<_C I is n 1 [ unie i t i ( is i m i I |ii in, 111 C I is 11 hsnn I

S1Q4P ( IK K U C \V1 U i( 3 R i | limn I HI n. n \ ti I 11 u \" ot j P C ! i i ) h mi i

SlOllI ( ( I 11 C( ( \( K U 1( ( K ( S ijii n m, i uni i I i k VP 0 mt mil

STOIP ncccc \( c \ccclece All1 n m,. | unk i I i K XI' 1 ni u il

VOCCCOCCIC UO\OCi\( S |ii nein, punni toi K \P i int n il SjOUS __ _ SIOI1 ( \(C( I(((\C( «fit _S ijii n îng ptiron t i R \P I ml r< oi y \ v ( u w \( ( \ \i c ( A1Ï2 mm,, puni i I i k VP I il mil

HO C\\\((KC(CHC(1\ S qu n ill" punxr I i R VP 4 ml m 1

H 03 ( \(( ICC \C( 1C \C IC \ S pi ntin° pinrni Pi R VP I C t imiiiil

K Ol ( C K 1 KCl \\K VIC \\l \( \ Siqu n m, puni r 1 i i\ VI __C_tiimnnl S5

Table 4: PCR-Primers.

St Idenlilv qui ntt Desuiption Tm m* S104P IGCGICCGGCGrvG VG P< nnid pnnxi loi ikiuii ml Ihe. ORF ->{ pLI ipRC13i tO 7

_ mdpbllJd St() IP OGGTl'lt t CCI VG V VVI V VTTT I oiwii piinin toi SIMM usm MR mitnt 1 into th ORr of P > C ,|tl'l SH) P ( C(P PI VG VVU V VI11 R ITT Praiiei piimn toi 1 ( R n mn ineHuie SOFm into th 17 s VVCTTI ORP IjETPindpRPI i s 10 11 C 1 1 ICGGGCITTCn I VC C \C R sei pnmei f i I CK mn ml ni SOI in into th O j OiO pi i i mipRT 1 i SIO ps CCC1CIAG VV\1 VVTIHC 11 1 lorn ndpnm i t i nn SOI in intPh OKI ofpFPiind 33

VVCTTAVG VVGGVGVI R1 I i , j

SI04P 1 V\GTlGGGrv

SUMP PirOlOIGGWlIGlOV R i s i J.HI1H1 1 i CK i mn md queiitun ot ptT( O hi s 1 ph mi 1 rooi ( ( GGVKCTl V( 1 VCT1 I K \( R Site ( nil i I i k\P 1 PI si 1 P<< < \CC u m | ETI 1 k VI ll ni| 1 « rooi C( Ct VGG1 3 \UCTl roil VCV k \ 1 pi mi 1 Pi k V 1 P) VIIKRC usm ] 1 1 11 1 R V i l mihi lOO (ICC \K (T VC I I VVC \ \CT k \ i j mn i t i k VP 11 P 1< 1 I \( G\( ( l usm pF 11 1 R \1 4 i t mpht 1G0S c < c< \ i ( c rr v vc t 11 r " v( c v ksmse nun i 1 i R VP 11" s ( ( vc c V usm« pf 1 1 Ul R VI 1 t t mpnt rcio; (.CGGVKCTl VCICt ITG VGG Reinsi [ nniei pi K VP 1 M si) S

u in 1 11 1 R VP ii t 1 it ~ (J.GVVGT _ pi _ ni) 1GO10 GVVGGVGVI VTVC VI VTGGPG Fen« ud pi inn i t i K XI 1 M t <• 1 (GCTGt IG u nnpFTI 11 RVP Ii i n, ht 1GOU t GGG VT( C 1 I \TTCI K Gl CGC I Rivers puni r lor R VI IIP) r (( 1CPTGG1CCTGGIP1 u m pLTllilRVt P im|ht 1G013 CGGGVlt riTVVGC rGGlCGG Ritus- punit i Pi k VP ) 1 11 1 1 C VI 1CA usm pU ltd RVP 0 is[ ni| lu 1GOP ( CPC VICCT1 AC HP 1C 1 VI n Riv i i puni i Pi RVP 11 t M S c,< \\inm\ usm, | I PI) i RVP'0 i t m[ lu 1GOP (GGG VIC CI 1 VGC/1 VCGGTGV Rev is puni i Pi RVP (MM < G VGCG usnuiEIll 1 RVP » isfimiliP [GO IS C GGGA1C CI 1A VGC1GG1C G V Riv is pundit ikVPl 1 111 ( ip Vusm pLI h "' GtTIC VGCTCIIIGI VCC RVP 3(1 11 ) i teni|lit

I GOD ic( gc \i ir vec icrnc î v(( R s ist, puniei t n k VP 1 1 n 1

vci vnee î tec vno vcvgci t F« VI u mil i k VP ill f m| lu TUCK V

tgop) C V VC \I( ( K U It VVVGTC V R v i | n i i t i k Vi 1 1 nd

TIC VC VC ( IC \( H( 1 PPM i u ji 1 i R VI 0 1 il it mj ht I G02t ( CI KG VVCGC CCCC1CCCGV R vi i ni i t i k VP 1 1 1 2V nd ( omet ( lev v( \n i \i (1 Vu m i LI i R VI 111 lii mpht

FGO> C ( 1 ICC V V< C C( t ( C 1C( ( ( \ Fx i 1 [ iiniu ink) lit n 1

C VCTG( G1C U( VH1 V( j V( Il VI u m pi 1 i k VP l 1 1 ) i lemphti G VCGH lGO"> Gt IC WORK ( CK ( VI ( ( ( R v i puni t toi R VP 4 J 1 n 1 V( C CTC U K t V( L4Ptusm>_jsy_Pk\I_l II il mpht Ko înitul ( \ ilip nd n i tl t i nun

2.4.11 PCR screening

A method developed by Dr. Song Tan was used to assay transformed E. coli cells for the correct length of vector insert. From a TYE plate with colonies from freshly transformed E. coli cells, 6-24 single colonies were picked with sterile loops, which were swirled in 50 µl water in 1.5 ml Eppendorf tubes. The wet loop was finally streaked on an agar plate to preserve the isolate for later growth. The cell suspension in the Eppendorf tube was vigorously vortexed for 15 seconds before 1 µl of the suspension was added to 10 µl ice-cold PCR screening reaction mix containing the appropriate PCR primers in a 0.5 ml safe-lock Eppendorf tube. Two drops of mineral oil were overlaid with a Pasteur pipette and the reaction mixtures were kept on ice until the samples were transferred to a thermocycler which had been pre-warmed to 95° C. The samples were incubated at 95° C for 2 min and then the amplification was carried out automatically using 25-30 cycles of 30 seconds at 95° C, 60 seconds per kilobase of expected PCR product at a temperature [value illegible] lower than the lowest melting temperature of the two primers, and extension at 75° C for 40 seconds. A final incubation of 2 min at 75° C was used, followed by holding at 0° C. The reaction product was analyzed by DNA PAGE.

2.4.12 Small scale alkaline lysis (miniprep)

A method developed by Sambrook et al. and adapted by Dr. Thomas Rechsteiner was used to assay transformed E. coli cells for the correct length of vector insert. From a TYE plate with colonies from freshly transformed E. coli cells, 6-24 single colonies were picked with sterile toothpicks and used to inoculate 5 ml TB-cultures. The cultures were incubated overnight at 37° C in a shaking air incubator at 210 rpm. The overnight cultures were centrifuged (4000 rpm, 5 min, RT, tabletop) in 15 ml polypropylene tubes (Falcon, Becton Dickinson). The pelleted cells were resuspended in 100 µl lysis buffer and transferred to 1.5 ml Eppendorf tubes. 200 µl NaOH/SDS was then added, and the tubes were shaken vigorously prior to incubation at room temperature for 3 minutes. 300 µl of ice-cold 5/2.5 M KOAc/HOAc solution were added to each, and the tubes were shaken briefly. After centrifugation of the samples (13000 rpm, 5 min, RT, benchtop centrifuge), 600 µl of the supernatant were extracted with 500 µl phenol/CIA. After ethanol precipitation the DNA pellet was resuspended in 50 µl TE(10, 0.1) containing 40 µg ml⁻¹ DNase-free RNase A and incubated for 30 min at 37° C. The DNA was used for analytical scale restriction digestion and sequencing.

2.5 Protein analysis

2.5.1 Concentration

Protein concentrations were determined by UV-absorption spectroscopy. The optical density (OD) at 280 nm was measured against a corresponding buffer control sample. Protein concentrations were calculated using extinction coefficients as calculated with the program PEPTIDESORT (GCG, version 9.5).
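The conversion from a buffer-corrected absorbance at 280 nm to a protein concentration follows the Beer-Lambert law, c = A / (ε · l). The following is a minimal sketch of that arithmetic; the function name, the 1 cm default path length and all numbers in the example are assumptions chosen for illustration, not values from this work.

```python
def protein_concentration(a280, epsilon_m_cm, mol_weight_da,
                          path_cm=1.0, dilution=1.0):
    """Return (molar concentration in M, mass concentration in mg/ml).

    a280          buffer-corrected absorbance at 280 nm
    epsilon_m_cm  molar extinction coefficient in M^-1 cm^-1
    mol_weight_da molecular mass in Da (g/mol)
    dilution      factor by which the sample was diluted before measuring
    """
    if epsilon_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 * dilution / (epsilon_m_cm * path_cm)
    mg_per_ml = molar * mol_weight_da   # mol/l * g/mol = g/l = mg/ml
    return molar, mg_per_ml


if __name__ == "__main__":
    # Hypothetical protein: epsilon = 30000 M^-1 cm^-1, 20 kDa, A280 = 0.6
    molar, mg_ml = protein_concentration(0.6, 30000.0, 20000.0)
    print(f"{molar * 1e6:.1f} uM, {mg_ml:.2f} mg/ml")
```

The calculation is only as good as the extinction coefficient, which here would come from the sequence-based prediction mentioned above.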

2.5.2 SDS Polyacrylamide gel electrophoresis (SDS PAGE)

Proteins were analyzed using an 18% (1:60 bisacrylamide:acrylamide), 0.75 M TrisCl pH 8.8, 0.1% SDS separating gel covered with a 5% (1:20 bisacrylamide:acrylamide), 120 mM BisTrisCl pH 6.8, 0.1% SDS stacking gel. The gel size was 10 cm x 10 cm x 0.5 mm (minigels). Samples were diluted with an equal volume of PGLB and boiled for 2 minutes before loading. Gels were run for 60 min at 10 W using 50 mM TrisCl, 0.1 M glycine, 0.1% SDS as running buffer. After SDS PAGE, protein bands were visualized by staining for 5 minutes at room temperature with protein gel staining solution and destained by heating to 55° C in protein gel destaining solution for 30-60 minutes.

- 1 on moleeuhr m is)ln ptotem 111,11 kel >I V1VVM Ph.umaua , <)4io ol 9 I kPa (P UOa I ) kPa >"klW 33 kPa 1 14 kDa

2.5.3 Electroblotting and N-terminal sequencing

For electrotransfer of proteins from SDS PA gels to PVDF membranes, an apparatus and protocol developed by Dr. Gerhard Frank was used. The protein sample was fractionated by SDS PAGE. Meanwhile, PVDF membrane (7.5 cm x 7.5 cm, Bio-Rad) was soaked in methanol (HPLC grade) for 30 minutes and in blot running buffer for another 50 minutes. Then membrane and gel were inserted into the electroblotting apparatus described by Dr. Gerhard Frank. Electroblotting was run at 50 V for two hours. The PVDF membrane was stained with blot staining solution for 20 minutes at room temperature while the gel was checked for blot efficiency by Coomassie staining. The blot was destained with blot destaining solution with several changes until the destaining bath remained colorless. The membrane was air dried at room temperature. For sequencing, the appropriate bands were cut out with a sterile scalpel and submitted for N-terminal sequencing by Edman degradation and HPLC to the Protein Service Laboratory of the Institute of Molecular Biology and Biophysics (ETH Hönggerberg).

2.5.4 Mass spectrometry

Protein samples for mass spectrometry were prepared by reversed phase HPLC. 1-2 mg of the protein of interest were dissolved in 0.1% TFA and loaded at 1 ml min⁻¹ onto a Nucleosil 300-5 C8 HPLC column in 0.1% TFA. The protein was eluted with 70% acetonitrile, 0.1% TFA. The peak fractions as detected at 280 nm were analyzed by SDS PAGE and pooled. The pooled fractions were evaporated to dryness (Speed Vac, Savant, model SC100A), resuspended in 20 µl H2O and submitted for mass spectrometry analysis to the Protein Service Laboratory of the Institute of Biochemistry (ETH Zürich).

Desalting of proteins for mass spectrometry

Column: Nucleosil 300-5 C8, Macherey-Nagel [dimensions illegible]
Temperature: RT
Flow rate: 1 ml min⁻¹
Sample: 1-2 mg protein in 0.1% TFA
Detection: 280 nm
Flow cell: Gilson [specification illegible]
Buffer A: 0.1% TFA
Buffer B: 70% acetonitrile, 0.1% TFA
Gradient: step gradient to 100% B

2.5.5 Analytical gel filtration

Analytical gel filtration samples (200 µl) were dialyzed against the respective elution buffer. The samples were injected onto a calibrated gel filtration column (Superdex HR 200, Pharmacia) and eluted as indicated below. The chromatogram was recorded electronically and evaluated off-line (Section 2.1.8). The peak fractions as detected at 280 nm were analyzed by SDS PAGE.

Analytical gel filtration

Column: Superdex HR 200, 10 x 300 mm
Temperature: 4° C
Flow rate: 0.2-0.3 ml min⁻¹
Sample: protein in elution buffer [amount illegible]
Detection: 280 nm
Flow cell: Gilson [specification illegible]
Elution buffer: 20 mM HEPES [pH illegible], KCl [concentration illegible], 0.1 mM EDTA, 0.1 mM DTT
Gel filtration LMW calibration kit: 13.7 kDa Ribonuclease A, 43 kDa Ovalbumin, 67 kDa Albumin
Gel filtration HMW calibration kit: 158 kDa Aldolase, 232 kDa Catalase, 440 kDa Ferritin, 669 kDa Thyroglobulin
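A calibrated gel filtration column is commonly evaluated by fitting a straight line to log10(molecular mass) against elution volume for the standards and reading off the apparent mass of an unknown peak. The sketch below shows that fit in plain Python. It is an illustration under stated assumptions: the masses are the nominal values of the calibration proteins, while every elution volume is invented for the example and is not a measurement from this work; the function names are likewise hypothetical.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(mass) = intercept + slope * elution_volume.

    standards: list of (elution_volume_ml, mass_kda) pairs.
    """
    xs = [ve for ve, _ in standards]
    ys = [math.log10(mass) for _, mass in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def apparent_mass_kda(elution_volume_ml, intercept, slope):
    """Apparent molecular mass of a peak from the calibration line."""
    return 10 ** (intercept + slope * elution_volume_ml)


if __name__ == "__main__":
    # Elution volumes below are invented placeholders, not measured data.
    standards = [
        (17.6, 13.7),   # Ribonuclease A
        (15.3, 43.0),   # Ovalbumin
        (14.4, 67.0),   # Albumin
        (12.8, 158.0),  # Aldolase
        (12.0, 232.0),  # Catalase
        (10.7, 440.0),  # Ferritin
    ]
    intercept, slope = fit_calibration(standards)
    print(round(apparent_mass_kda(13.5, intercept, slope)), "kDa (illustrative)")
```

The apparent mass obtained this way assumes a globular shape; elongated or partially unfolded proteins elute earlier than their mass alone would suggest.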

2.5.6 Limited proteolysis

Limited proteolysis was performed according to three assays that are referred to as RUNNING-, PREINCUBATION- and STOPPING-assay (Figure 5).

Figure 5: Limited proteolysis assays.

A) RUNNING ASSAY (preliminary screens, time courses, preparative scale): reaction, inhibition, denaturation, SDS PAGE.

B) PREINCUBATION ASSAY (maximal protease concentration): preincubation, reaction, denaturation, SDS PAGE.

C) STOPPING ASSAY (stability of fragments): reaction, inhibition, denaturation, SDS PAGE.

The RUNNING-assay represents the standard limited proteolysis reaction with a defined end (Figure 5A). 18 µl of 20 µM (truncated) TFIIF (20-30 µg) in the appropriate reaction buffer were placed in a 1 ml Eppendorf tube. 2 µl of the respective protease in water at the appropriate concentration (0.01-10000 µg ml⁻¹) were added. After reaction times between 10 minutes and 72 hours at the appropriate reaction temperature, the tubes were centrifuged briefly to collect condensation droplets and 2 µl of inhibitor stock solution were added. After 10 minutes incubation at room temperature, PGLB (20 µl) was added and the samples were immediately boiled for 2 minutes. The samples were considered to be stable after this treatment and stored on ice or at -20 °C before analysis on SDS PA gels. For preparative work the reactions were scaled up to 900 µl of 20 µM (truncated) TFIIF (1-1.5 mg) and 100 µl of protease solution (for details see Table 5 and Table 6).
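Since the protease stock is always added as one tenth of the reaction volume, the working protease concentration and the protease-to-substrate mass ratio follow directly from the stock concentration. The sketch below works through that arithmetic for the analytical (18 µl + 2 µl) and preparative (900 µl + 100 µl) scales. The function name, the 25 µg and 1250 µg substrate amounts (mid-points of the ranges given above) and the 100 µg ml⁻¹ stock are assumptions made for illustration.

```python
def protease_in_reaction(stock_ug_per_ml, protease_ul=2.0,
                         substrate_ul=18.0, substrate_ug=25.0):
    """Return (final protease conc. in ug/ml, protease mass in ug,
    protease:substrate mass ratio) for one limited proteolysis reaction."""
    total_ul = protease_ul + substrate_ul
    final_ug_per_ml = stock_ug_per_ml * protease_ul / total_ul
    protease_ug = stock_ug_per_ml * protease_ul / 1000.0   # ul -> ml
    ratio = protease_ug / substrate_ug
    return final_ug_per_ml, protease_ug, ratio


if __name__ == "__main__":
    # Analytical scale with a hypothetical 100 ug/ml protease stock
    print(protease_in_reaction(100.0))
    # Preparative scale: same stock, 900 ul substrate (about 1250 ug assumed)
    print(protease_in_reaction(100.0, protease_ul=100.0,
                               substrate_ul=900.0, substrate_ug=1250.0))
```

Both scales dilute the stock tenfold, so a protease concentration found in the analytical assay carries over to preparative work without recalculation as long as the substrate concentration is unchanged.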

The PREINCUBATION-assay determines the maximal protease concentration that is completely inhibited with a given inhibition protocol (Figure 5B). 2 µl of the respective protease in water at the appropriate concentration (0.01-10000 µg ml⁻¹) were placed in a 1 ml Eppendorf tube. 2 µl of inhibitor stock solution were added. After a preincubation of 10 minutes at RT the tubes were centrifuged briefly to collect condensation droplets and 18 µl of 20 µM (truncated) TFIIF (20-30 µg) in the appropriate reaction buffer were added. After a reaction time of 10 minutes at room temperature, PGLB (20 µl) was added and the samples were immediately placed in a boiling water bath for 2 minutes. The samples were considered to be stable after this treatment and stored on ice or at -20 °C before analysis on SDS PA gels (for details see Table 5 and Table 6).

The STOPPING-assay checks the stability of the proteolytic fragments after the inhibition step (Figure 5C). 18 µl of 20 µM (truncated) TFIIF (20-30 µg) in the appropriate reaction buffer were placed in a 1 ml Eppendorf tube. 2 µl of the respective protease in water at the appropriate concentration (0.01-10000 µg ml⁻¹) were added. After reaction times between 10 minutes and 72 hours at the appropriate reaction temperature, the tubes were centrifuged briefly to collect condensation droplets and 2 µl of inhibitor stock solution were added. After incubation between 10 minutes and 24 hours at room temperature, PGLB (20 µl) was added and the samples were immediately placed in a boiling water bath for 2 minutes. The samples were considered to be stable after this treatment and stored on ice or at -20 °C before analysis on SDS PA gels (for details see Table 5 and Table 6).

Table 5: Endoproteases.

Endopioteases Spetiuiitv Reaction conditions Met h am sm Inlubitoi stoik solution

Tiypsm n\ 3 e 0 00001 1 nig ml ennme Senne-piotcasc ' (bovine pane n'as) \ Vu Iss 20mVl InsClpIlSO 10 mg ml Petabloc EC 3 4 21 1 3 no speniicits 200 tnVl XaCl 0 lmMI PI VpIPO 0 s mM PIT

RI 10 POmm

Cliymottvpsm n X P -e 0 00001-1 nig mi en/inie Sentie-piotc ise ibovmepantieas) \ hp Plu Ivi 20mVITnPlpIIS 0 10 mo ml Pefablot FC U 21 1 Pin I eu Via Vsp G1P 20i>mM\aCl Y no spnltleltv 0 I mM rnx V pi I 8 o 0 3 mM PIT

RI 10 ISO mm

n \ i 0 Elasiase ^ 00001 1 me ml enzt me Sonne piotease (poitine païuicas) \ V il I eu Ile Via GIv 20mM hisPlpIPVi 10 mg/ml Petabloc EC ) 12 PO Sei 200mVl\iCl

V no spenticiP 0 lmMlDl VpïPSO 0 PiiMÜPI

RT 10 POmm

Papain n X / V c 0 00001 1 mg n.l eii/vnk. Cv steine piotease (Caiica papas .u XP Ajo Lss Glu Un 20rnVI lnsCI pll SO FC W 22 2 Glv Ivi POmVlXiCl

Moie bonds aie eleived at spurn 0 1 mV1I PI VpilSU

I ate o iniMDTT

RT 10 POmm

Pioteinase K n \ 3 . O000P ] „nn,] enziroe Cv sleun piotease Outil ichiuni albunii V ihph ilk uomalic 30mV1 InsClpIISO

FC 3 1 21 PS itmno kkP PO tnVl \ iCl

3 no xp.nheip 0 1 mV! UP VpIPSO I' smVlPIT

RI in PO nun

Subtilmn unspecitii 'iOOOOI ] mg ml en/vme Senne pioteasi (Bdiillus subtibs) 20mV1 his( IpH P, 1 ( î 4 21 1-1 300nAi NaCl 0 1 mM PPT V pll 3 0 0 s mV! PIT

RI lo POmm

Table 6: Exoproteases.

Cm bow peptidases. Spetiflnh Re aetion loiiditions Miihanisni Inhibiloi stoik solution

Caibowpiptidase 3 uiispentii 0 1 nn ml eii/vnk Sinne piokasi (Sacihaiomvei-s ceitvisi.O Releast ol Vsp and GIv n PnVtBislnP IplIoT IP (PP OPT EC 1 H 6 1 iciaided 20 mV! KCl

1 niVIPPI \

^ C P

^ Caipoxspeptidase P unspecitii 1 nu nil en/vme Senile piotease

*- (Pénicillium lanthiiii llunP Release ot Sei and Gl\ is mMNiO VcpII 16 P, (V'V)PIPI VC Ulbl let.uded 20 mVl KCl

1 niMiOl V

"O C -h

Caiboxvpepliilas. \ anspititii 0 Ol im ml en/s me 7n pioO.ns

(bos mt panneasj RJe.ist ol Gls Vsp Glv LPs is s mM IiisCl pll / 3 S|)V Ol'v) solid lllea l 3 1 P 1 iitaided No iclease ol Vig and 2omMKCl 30nMPPPV s" Pio t Psh i() mVI o Phenaiithtolim

inFlOHHO- 1 1

C u bow peptidase B specilic 0 001 m> ml en/vme Zn piotoas^ (poiellle paillle is) Releases Arg ans I ys PnMTtnClpH s 30O Pi s i solid mea f 5 I P 2 20 mivt KCl 30 mVl EDI V

»^ C 13 h s() mVl o Phenanlhiolme

mEtOII HO I 1 62

AmuiopepticUsts Speiilints Reatlion conditions Mtthamsm Inhibitoi stoik solution

M Aminopeplidisi unspe ih 0 03 nn/ml en/vnit Mulillo piotc lie (jioicnit kindney) \ Pi X V ] X Gin in not SiiuMTiisClpIISO 341P etile 1 20 mM Kcl

^ C Mb

Aminopi ptiel i e unkii sn 0 (P mo ml n/ynii, In piote ast (Streptonivee s iistiis) ImMTii 01 pll 3 FC Mil PpllMKel

7 C ( > th

2.5.7 Dynamic light scattering

Before crystallization, protein samples were assayed for aggregation by dynamic light scattering (DP-801, small sample volume upgrade, software v. 2.0, Protein Solutions). The protein sample was dialyzed against BSB20K (1 mM BisTrisCl pH 7.0, 0.5 mM DTE, 0.1 mM EDTA pH [...] 20-45 mg ml⁻¹ for crystallization and then diluted to 3 mg ml⁻¹ with BSB20K for the dynamic light scattering experiment. Minimally 10 measurements were recorded for averaging.

2.5.8 Ellman's assay

Protein samples were concentrated in Ultrafree-4 concentrators (Millipore) to approximately 200 µM (100 µl) divided by the number of free thiol groups per protein molecule. The concentration buffer was exchanged against the Ellman's assay buffer (80 mM Na2HPO4/NaH2PO4 pH 8.0, 1 mM EDTA pH 8.0, [7 M urea], EAB) by gel filtration (Sephadex G25M spun column in EAB, Section 2.[illegible].2). The eluate (approximately 200 µl) was diluted with EAB to a final volume of 500 µl. A UV-absorption spectrum of this sample was recorded (from 222 nm; upper limit illegible), then 20 µl of 4 mg ml⁻¹ dithionitrobenzoate (DTNB) in EAB were added. After 15 minutes incubation at room temperature in the dark, the UV absorption change at 412 nm was recorded. The concentration of free thiol groups per protein molecule was calculated using an extinction coefficient (412 nm) of ε = 13600 M⁻¹ cm⁻¹ for the 2-nitro-5-thiobenzoate anion.
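The last step of the assay is a short calculation: the absorbance change at 412 nm gives the concentration of released 2-nitro-5-thiobenzoate, and dividing by the protein concentration in the cuvette gives the number of free thiol groups per molecule. The following is a minimal sketch, assuming the classical Ellman extinction coefficient of 13600 M⁻¹ cm⁻¹, a 1 cm cuvette and invented example readings; the function name is hypothetical.

```python
EPSILON_TNB_412 = 13600.0  # M^-1 cm^-1, classical Ellman value (assumed)


def free_thiols_per_molecule(delta_a412, protein_molar,
                             path_cm=1.0, epsilon=EPSILON_TNB_412):
    """Free thiol groups per protein molecule.

    delta_a412    absorbance change at 412 nm after adding DTNB
    protein_molar protein concentration in the cuvette (M), i.e. after
                  dilution by the added DTNB solution
    """
    if protein_molar <= 0:
        raise ValueError("protein concentration must be positive")
    thiol_molar = delta_a412 / (epsilon * path_cm)
    return thiol_molar / protein_molar


if __name__ == "__main__":
    # Invented readings: delta A412 = 0.272 at 10 uM protein
    print(round(free_thiols_per_molecule(0.272, 10e-6), 2))
```

Adding 20 µl of DTNB solution to 500 µl of sample dilutes the protein by a factor of 500/520, which should be applied to the protein concentration before it is passed in.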

2.6 Protein expression methods

2.6.1 Small scale expression test in 2xTY-media

For protein expression in conventional 2xTY-media, the E. coli expression strains BL21 (DE3) and BL21 (DE3) pLysS were used (Table 7). In the evening, 50 µl of competent cells were transformed with 1 µl of the appropriate expression plasmid at 100-300 ng µl⁻¹ (Table 7). The cell solution was plated onto TYE agar plates containing the appropriate antibiotics (usually 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) and incubated overnight at 37° C. The next morning, 100 ml 2xTY-medium (inclusive 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) were inoculated with 5-10 colonies of the BL21 transformants and incubated at 37° C in a shaking incubator at 210 rpm. Typically, expression was induced at an OD600 of 0.4-0.5. A sample of 500 µl uninduced cells was pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB and boiled immediately for 2 minutes before adding 200 µl of 0.2 M IPTG to the bacterial culture to a final concentration of 0.4 mM. After 60, 120, and 180 minutes more samples of 500 µl were drawn, pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB, boiled immediately for 2 minutes and analyzed on 18% SDS PA gels. In the end 50 ml of the remaining culture were pelleted (2000 rpm, 5 min, RT, tabletop), resuspended in 10 ml T100 (20 mM Tris-Cl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine), shock frozen in liquid nitrogen and stored at -20° C. This sample was used for the solubility test (Section 2.6.5).
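The induction step is a simple dilution (c1 · V1 = c2 · V2): 200 µl of 0.2 M IPTG in a 100 ml culture gives approximately 0.4 mM. The sketch below reproduces this calculation and its inverse; the function names are illustrative, and the small volume removed for the uninduced sample is neglected.

```python
def final_iptg_mM(stock_M, stock_ml, culture_ml):
    """Final IPTG concentration (mM) after adding stock to a culture."""
    return stock_M * 1000.0 * stock_ml / (culture_ml + stock_ml)


def stock_volume_ml(target_mM, culture_ml, stock_M):
    """Stock volume (ml) needed for a target concentration,
    neglecting the volume added by the stock itself."""
    return (target_mM / 1000.0) * culture_ml / stock_M


if __name__ == "__main__":
    print(round(final_iptg_mM(0.2, 0.2, 100.0), 3), "mM in a 100 ml culture")
    print(round(stock_volume_ml(0.4, 500.0, 0.2), 3), "ml stock per 500 ml culture")
```

The same ratio of stock to culture volume applies at any scale, so a fivefold larger culture needs a fivefold larger volume of the same stock.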

2.6.2 Large (6 l) scale expression in 2xTY-media

For protein expression in conventional 2xTY-media, the E. coli expression strains BL21 (DE3) and BL21 (DE3) pLysS were used (Table 7). Early in the morning, 50 µl of competent cells were transformed with 1 µl of the appropriate expression plasmid at 100-300 ng µl⁻¹ (Table 7). The cell solution was plated onto TYE-agar plates containing the appropriate antibiotics (usually 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) and incubated for 10-12 h at 37° C. In the evening, a starter culture of 100 ml 2xTY-medium (inclusive 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) was inoculated with 5-10 colonies of the BL21 transformants and incubated overnight at 20° C in a shaking air incubator at 210 rpm. By the next morning the starter culture typically reached an OD600 from 0.2 to 0.5. From that starter culture 12 x 8 ml were used to inoculate 12 x 500 ml 2xTY-medium (inclusive 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol). The cultures were incubated at 37° C in a shaking air incubator at 210 rpm until the OD600 had reached 0.4-0.55. Two samples of 500 µl uninduced cells from two different flasks were pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB and boiled immediately for 2 minutes before adding 1 ml of 0.2 M IPTG to each 500 ml bacterial culture, reaching a final concentration of 0.4 mM IPTG. After a typical expression time of 3-4 h, two samples of 500 µl were drawn from the same two flasks, pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB, boiled immediately for 2 minutes and analyzed on 18% SDS PA gels. The cells were pelleted in 500 ml centrifuge bottles (6000 rpm, 5 min, RT, RC 26, GS3). The cells were then resuspended in 100 ml T100 (20 mM Tris-Cl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine), transferred to a 250 ml polypropylene beaker, shock frozen in liquid nitrogen and stored at -20° C.

Table 7: Bacterial expression strains and expression plasmids.

Bacterial strain Ivvprcssinn plasmid BP21 (FTPipFvsS pETPt~RVP'4il l'en pPPaRVP'OiPltoi pEPaTPVP IP PP pi 1 Pi-R VPPtFPt) Genotype pHPaRVPM. PPPl pl PaRVPPfl IP) T ompl hsdS..(i,- nv) ?a/dem (DE3) pMIPiRVP74GO \P pLI3a-R\P30(l 131) pi ysS(Cmp) pri7a-RAP"4(PI8P pEl )a-RAP740 P021 Giosvth Conditions pm.i-R VPP« 1 2031

?x 13 - media

' 100 pg ml ampicillm pRET>aRVP7t, POPPA

23 ug nil Jilotamphemiol 37 C pEFTa RAPPi FP2V pETPi R VP3()< Fl Fp CPOV tIPP

BL21 (PEP pLTliel-RVPPti 1 51P pPrild RVP30p 3 pi)

Genotvpe

b- ompl l>\t!1 u mOePP/' iPE7i

GiovUh Conditions

Pl 3-media

l°o Ug ml ampieillin P C RSUiDLTipl ssS pPI \i-R VPPG IP. pPI nRVPPiPIPP 12P1 pFP.i-RVPVO 1 Pp P106V1 Genotvpe pPPa-R VPPPPPVi pVl pIPPiRVPPl! 1 PP-1 sTAI

E- ni)lh\ilS u - m -i i) Prfn'i nn' pFPPa-RAPP, I P2P pEPnRVPPp 1 10)- i PI 's pl v sSi Cm s E12PH PM LP3A1 I sfAl

Giottlh Conditions SeVlet MO-media

100 pg ml ample dim

25 ug ml chloiamphemeol 17 C 65

2.6.3 Small scale expression test in SeMet-M9-media

For protein expression in SeMet-M9-media the E. coli expression strain B834 (DE3) pLysS was used (Table 7). Early in the morning, 50 µl of competent cells were transformed with 1 µl of the appropriate expression plasmid at 100-300 ng µl⁻¹ (Table 7). The cell solution was plated onto TYE-agar plates containing the appropriate antibiotics (100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) and incubated for 10-12 h at 37° C. In the evening, 5 ml Met-M9-media containing ordinary methionine (inclusive 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) were inoculated with 5-10 colonies of the B834 transformants and incubated overnight at 25° C in a shaking incubator at 210 rpm. The next morning this starter culture was used to inoculate 95 ml SeMet-M9-media containing selenomethionine. After 24 h incubation at 37° C expression was induced at an OD600 of 0.1-0.3. A sample of 500 µl uninduced cells was pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB and boiled immediately for 2 minutes before adding 200 µl of 0.2 M IPTG to the bacterial culture to a final concentration of 0.4 mM. After 3, 6 and 24 h more samples of 500 µl were drawn, pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB, boiled immediately for 2 minutes and analyzed on 18% SDS PA gels. In the end 50 ml of the remaining culture were pelleted (2000 rpm, 5 min, RT, tabletop), resuspended in 10 ml T100 (20 mM Tris-Cl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine), shock frozen in liquid nitrogen and stored at -20° C. This sample was used for the solubility test (Section 2.6.5).

2.6.4 Large (3 l) scale expression in SeMet-M9-media

For protein expression in SeMet-M9-media the E. coli expression strain B834 (DE3) pLysS was used (Table 7). Early in the morning, 50 µl of competent cells were transformed with 1 µl of the appropriate expression plasmid at 100-300 ng µl⁻¹ (Table 7). The cell solution was plated onto TYE-agar plates containing the appropriate antibiotics (100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) and incubated for 10-12 h at 37° C. In the evening, a starter culture of 15 ml Met-M9-medium containing ordinary methionine (inclusive 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol) was inoculated with 5-10 colonies of the B834 transformants and incubated overnight at 27° C in a shaking air incubator at 210 rpm. By the next morning the starter culture typically reached an OD600 from 0.2 to 0.3. From that starter culture 6 x 2.5 ml were used to inoculate 6 x 500 ml SeMet-M9-media (inclusive 100 µg ml⁻¹ ampicillin and 25 µg ml⁻¹ chloramphenicol). The cultures were incubated at 37° C in a shaking air incubator at 210 rpm until the OD600 had reached 0.3-0.4, which took 8-9 hours. Two samples of 500 µl uninduced cells from two different flasks were pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB and boiled immediately for 2 minutes before adding 1 ml of 0.2 M IPTG to each 500 ml bacterial culture to a final concentration of 0.4 mM IPTG. After a typical expression time of 20-22 h, two samples of 500 µl were drawn from the same two flasks, pelleted (13000 rpm, 5 min, RT, benchtop), resuspended in 50 µl PGLB, boiled immediately for 2 minutes and analyzed on 18% SDS PA gels. The cells were pelleted in 500 ml centrifuge bottles (6000 rpm, 5 min, RT, RC 26, GS3). The cells were then resuspended in 100 ml T100 (20 mM Tris-Cl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 1 mM DTE, 1 mM benzamidine), transferred to a 250 ml polypropylene beaker, shock frozen in liquid nitrogen and stored at -20° C.

2.6.5 Solubility test

A sample of a 50 ml expression test cell culture, pelleted and shock frozen in 10 ml T100 (Sections 2.6.1, 2.6.3), was thawed in a lukewarm water bath. The cells were put on ice for 10 minutes before sonicating the sample with 5 x 10 pulses of 1 second using the smallest probe of the Sonicator at level 5. The fluid suspension was stored on ice. 25 µl of the whole cell extract were mixed with 25 µl PGLB and boiled immediately for 2 minutes. 500 µl of the whole cell extract were pelleted (13000 rpm, 5 min, RT, benchtop). The pellet was resuspended in 1000 µl PGLB and boiled for 2 minutes. 25 µl of the supernatant were mixed with 25 µl PGLB and boiled for 2 minutes. Equal volumes of the samples from the whole cell extract, the pellet and the supernatant were analyzed on an 18% SDS PA gel.
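Loading equal volumes is a fair comparison only if each gel sample contains the same amount of original extract per microlitre. The short check below, a sketch with a hypothetical helper name, confirms that the three samples described above are all diluted twofold relative to the whole cell extract.

```python
def extract_equivalents_per_ul(extract_ul, final_ul):
    """Volume of original extract represented by 1 ul of gel sample."""
    return extract_ul / final_ul


samples = {
    # 25 ul extract + 25 ul PGLB
    "whole cell extract": extract_equivalents_per_ul(25, 50),
    # pellet from 500 ul extract resuspended in 1000 ul PGLB
    "pellet": extract_equivalents_per_ul(500, 1000),
    # 25 ul supernatant + 25 ul PGLB
    "supernatant": extract_equivalents_per_ul(25, 50),
}

if __name__ == "__main__":
    for name, value in samples.items():
        print(f"{name}: {value} ul extract per ul sample")
```

With identical dilution factors, band intensities in the pellet and supernatant lanes can be compared directly with the whole cell extract lane to judge the soluble fraction.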

2.7 Protein purification methods

2.7.1 Purification of human RAP74( 2-517)

Cell lysis

The cells from a 6 l large scale expression experiment, pelleted and shock frozen in 100 ml T100 (Section 2.6.2), were thawed in a lukewarm water bath. The thawed cells were mixed with an equal volume of TU100(8M) (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 1 mM benzamidine, 10 mM 2-mercaptoethanol, 100 mM NaCl, 8 M urea). The sample was put on ice for 10 minutes before sonication with 10 x 10 pulses of 1 second using the largest probe of the Sonicator at maximum level. The thin fluid whole cell extract was stored on ice.

Heparine-Sepharose chromatography

25 µl of the whole cell extract were mixed with 25 µl PGLB and boiled immediately for 2 minutes. Then the suspension was centrifuged in polypropylene tubes (18000 rpm, 20 min, 4° C, RC 26, SS34). In the cold room the supernatant was loaded at 3 ml min⁻¹ onto a Heparine-Sepharose column (Biorad, 3 x 6 cm) in TU100(4M) (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 1 mM benzamidine, 10 mM 2-mercaptoethanol, 100 mM NaCl, 4 M urea). The column was washed with TU100(4M), then TU200(4M) (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 1 mM benzamidine, 10 mM 2-mercaptoethanol, 200 mM NaCl, 4 M urea) until the baseline at 280 nm was nearly flat again. RAP74 was eluted with TU[illegible]00(4M) (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 1 mM benzamidine, 10 mM 2-mercaptoethanol, 200 mM NaCl, 4 M urea) in a single large peak as detected at 280 nm. The peak fractions were analyzed on an 18% SDS PA gel and pooled. The sample was dialyzed in MWCO 6-8 kDa dialysis bags against 1000 ml AU100(4M) (20 mM NaOAc pH 5.2, 0.1 mM EDTA, 100 mM NaCl, 10 mM 2-mercaptoethanol, 4 M urea) at 4° C.

Cation exchange HPLC-chromatography

After Heparine-Sepharose chromatography the protein was applied onto a SP-5PW HPLC column. The column was equilibrated with AU100(4M) (20 mM NaOAc pH 5.2, 0.1 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 4 M urea). Maximally 15 mg of RAP74 were loaded and eluted with an appropriate gradient (see below). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against 1000 ml AU50(4M) (NaOAc pH 5.2 [concentration illegible], 0.1 mM EDTA, 50 mM NaCl, 0.5 mM DTT, 4 M urea) at 4° C.

Purification of RAP74 over a SP-5PW HPLC-column

Column DEVbSTW ISk »P mm \ PO mm

1 e nipt 1 Kill t 1\T

1 low i ite. 4 ml mm

Simple ^ P m ot R VP74 \ tumitnl domain m 1I ) FXliclion POnm

Hon ci tl OiPin 0 2nim Oi ul

Bufln V VI iOMVPPOniMXiOV pHs> 0 1 mXIl Pl V pll 3 0 lOOmMXiCl 1 ) mVl nieieiptoeHlnnol 1 XI m i)

Bull i B VII000 4VI) P mM In CI pll 0 s mVlI PI V ] U S 0 s()( mVl X Pl P mVl Pn inpti ihmol Vim i)

Consli ik 1 Gl idiint

7RVP 1 „ M 20 40« B 40 mm

Anion exchange HPLC-chromatography

The dialyzed peak fractions from cation exchange HPLC chromatography were purified further over a DEAE-5PW HPLC column. Beforehand, the pH of the sample was adjusted by adding 1 M TrisCl pH 8.0 to a final concentration of 20 mM. The column was equilibrated with TU50(4M) (20 mM TrisCl pH 8.0, 0.1 mM EDTA pH 8.0, 50 mM NaCl, 0.5 mM DTT, 4 M urea). Maximally 15 mg of RAP74 were loaded and eluted with an appropriate gradient (see below). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against 1000 ml 10 mM NaOAc pH 5.2, 10 mM 2-mercaptoethanol and [number illegible] times 1000 ml 10 mM 2-mercaptoethanol at 4° C. The solution was shock frozen and lyophilized. The white powder was stored at -20° C.

Purification of RAP74 over a DEAE-5PW HPLC-column

Column: DEAE-5PW TSK, 21.5 mm x 150 mm
Temperature: RT
Flow rate: 4 ml/min
Sample: < 15 mg of RAP74 in TU50(4M)
Detection: 280 nm
Flow cell: Gilson
Buffer A: TU50(4M) (20 mM TrisCl pH 8.0, 0.1 mM EDTA pH 8.0, 50 mM NaCl, 0.5 mM DTT, 4 M urea)
Buffer B: [name and composition not legible in the source scan]
Construct / gradient: RAP74(2-517) [gradient not legible in the source scan]

Analysis

After these purification steps the protein was judged more than 98% pure by SDS PA gel electrophoresis. In order to make sure that the proteins had been cloned, expressed and purified correctly, 1-2 mg of each protein stock were desalted over a reversed phase HPLC-column (Macherey-Nagel, N300-5 C8) and submitted for mass spectroscopic analysis and N-terminal sequencing (Sections 2.5.3, 2.5.4).

Construct | pI | Mr | MS | ΔMr | N-Term | ε / M⁻¹ cm⁻¹ | ε / mg⁻¹ ml cm⁻¹
RAP74(2-517) | [remaining values not legible in the source scan] | 34140 | 0.591

2.7.2 Purification of the C-terminal domain of human RAP74(364-517)

Cell lysis

The cells from a 6 l large scale expression experiment, pelleted and shock frozen in 100 ml T100 (Section 2.6.2), were thawed in a lukewarm water bath. The chromosomal DNA of the lysed cells was broken up by mixing in the Ultra-Turrax three times for 30 s at level 10 with one minute incubations on ice in between. Then the suspension was centrifuged in polypropylene tubes (18000 rpm, 20 min, 4° C, RC 26, SS34).
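The spin conditions above are given in rpm for a specific rotor. As a rough aid for reproducing them on other equipment, the following sketch converts rotor speed to relative centrifugal force with the standard formula RCF = 1.118e-5 x r x rpm². The rotor radius is an assumption (a commonly quoted maximum radius for an SS-34 type rotor) and is not stated in the thesis.

```python
def relative_centrifugal_force(rpm: float, radius_cm: float) -> float:
    """Return the relative centrifugal force (multiples of g)
    for a rotor speed in rpm and a rotation radius in cm."""
    return 1.118e-5 * radius_cm * rpm ** 2


# ASSUMPTION: 10.7 cm maximum radius for an SS-34 type rotor (not from the thesis).
SS34_RMAX_CM = 10.7

rcf_lysate_spin = relative_centrifugal_force(18000, SS34_RMAX_CM)
print(f"18000 rpm corresponds to about {rcf_lysate_spin:,.0f} x g at r_max")
```

The value applies to the bottom of the tube only; the force at the top of the liquid column is lower.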

Heparine-Sepharose chromatography

In the cold room the supernatant was loaded at 3 ml/min onto a Heparine-Sepharose-column (Biorad, 3x6 cm) in T100 (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 1 mM benzamidine, 10 mM 2-mercaptoethanol, 100 mM NaCl). The column was washed with T100, T200 (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 200 mM NaCl) and T300 (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 300 mM NaCl) until the base line was nearly flat again. RAP74(364-517) was eluted with a linear gradient from 300 to 1500 mM NaCl using a Bio-Rad 585 Gradient Former with 40 ml of T300 and T1500 (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 1500 mM NaCl) each. The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against 2000 ml A50 (20 mM NaOAc pH 5.2, 0.1 mM EDTA, 10 mM 2-mercaptoethanol, 50 mM NaCl) and 2000 ml T50 (20 mM TrisCl pH 7.6, 0.1 mM EDTA, 0.5 mM DTT, 50 mM NaCl) at 4° C.
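To relate a collected fraction to the salt concentration at which it eluted, the linear gradient described above can be sketched as follows. This assumes an ideal two-chamber gradient former with equal chamber volumes (40 ml each, 80 ml in total), so that the NaCl concentration rises linearly with delivered volume; dead volumes and mixing delays are ignored, and the function name is purely illustrative.

```python
def gradient_concentration(delivered_ml: float,
                           start_mm: float = 300.0,
                           end_mm: float = 1500.0,
                           total_ml: float = 80.0) -> float:
    """NaCl concentration (mM) leaving an ideal linear gradient former
    after `delivered_ml` of the total gradient volume has been delivered."""
    if not 0.0 <= delivered_ml <= total_ml:
        raise ValueError("delivered volume must lie within the gradient volume")
    return start_mm + (end_mm - start_mm) * delivered_ml / total_ml


# Concentration at the start, the midpoint and the end of the gradient.
for volume in (0.0, 40.0, 80.0):
    print(f"{volume:5.1f} ml -> {gradient_concentration(volume):6.1f} mM NaCl")
```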

Hydrophobic interaction HPLC-chromatography

The sample was mixed with 0.586 volumes of 100% saturated ammonium sulfate to yield a final concentration of 1.5 M ammonium sulfate. In the cold room, a Phenyl-5PW HPLC-column was equilibrated with TN1500. Maximally 20 mg of RAP74(364-517) were loaded at 4 ml/min and eluted with an appropriate gradient (see below). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against A50 (20 mM NaOAc pH 5.2, 0.1 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 50 mM NaCl) in two changes at 4° C.
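The ammonium sulfate adjustment above is a simple dilution of a saturated stock. The sketch below reproduces the arithmetic under two assumptions that are not stated in the thesis: the saturated stock is taken as about 4.06 M (the value implied by the quoted figures of 0.586 volumes and 1.5 M), and volumes are treated as additive.

```python
# ASSUMPTION: molarity of the saturated stock, back-calculated from the quoted
# 0.586 volumes -> 1.5 M; the thesis does not state this value.
SATURATED_AMMONIUM_SULFATE_M = 4.06


def final_molarity(volumes_added: float, stock_molarity: float) -> float:
    """Molarity after adding `volumes_added` sample volumes of stock
    to one volume of salt-free sample (additive volumes assumed)."""
    return stock_molarity * volumes_added / (1.0 + volumes_added)


def volumes_needed(target_molarity: float, stock_molarity: float) -> float:
    """Sample volumes of stock required to reach `target_molarity`."""
    return target_molarity / (stock_molarity - target_molarity)


print(round(final_molarity(0.586, SATURATED_AMMONIUM_SULFATE_M), 3))
print(round(volumes_needed(1.5, SATURATED_AMMONIUM_SULFATE_M), 3))
```

Any ammonium sulfate or other salt already present in the sample would shift the result slightly.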

Purification of RAP74(364-517) over a Phenyl-5PW HPLC-column

Column: Phenyl-5PW TSK, 21.5 mm x 150 mm
Temperature: 4° C
Flow rate: 4 ml/min
Sample: < 20 mg of RAP74 C-terminal domain in TN1500
Detection: 280 nm
Flow cell: Gilson
Buffer A: TN1500 [composition not legible in the source scan]
Buffer B: [name and composition not legible in the source scan]
Construct / gradient: RAP74(364-517) [gradient not legible in the source scan]

Cation exchange HPLC-chromatography

The protein was further purified over a cation exchange HPLC-column. In the cold room a SP-5PW HPLC-column was equilibrated with A50 (20 mM NaOAc pH 5.2, 0.1 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 50 mM NaCl). Maximally 20 mg of RAP74(364-517) were loaded at 4 ml/min and eluted with an appropriate gradient (see below for details). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against BSB25 (1 mM BisTrisCl pH 7.0, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 25 mM KCl) in two changes at 4° C. The protein solution was concentrated with a Sartorius 13200E ultrathimble to a final concentration of 54 mg/ml (= 3.2 mM). After addition of 20% glycerol the protein preparation was flash frozen in aliquots and stored at -80° C.
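The paired mass and molar concentrations quoted above (54 mg/ml and 3.2 mM) can be cross-checked with a short conversion. This is only a sketch: the molar mass is back-calculated from those two figures rather than taken from the thesis tables, and the helper names are illustrative.

```python
def millimolar(mg_per_ml: float, molar_mass_da: float) -> float:
    """Convert a mass concentration (mg/ml, i.e. g/l) to mM."""
    return mg_per_ml / molar_mass_da * 1000.0


def implied_molar_mass(mg_per_ml: float, mm: float) -> float:
    """Molar mass (Da) implied by a mass concentration and a molar concentration."""
    return mg_per_ml / mm * 1000.0


# Molar mass implied by the quoted pair of values for RAP74(364-517).
mass_da = implied_molar_mass(54.0, 3.2)
print(f"implied molar mass: {mass_da:.0f} Da")

# ASSUMPTION: a rounded 16.9 kDa, used only to show the forward conversion.
print(f"54 mg/ml at 16.9 kDa: {millimolar(54.0, 16900.0):.1f} mM")
```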

Purification of RAP74(364-517) over a SP-5PW HPLC-column

Column: SP-5PW TSK, 21.5 mm x 150 mm
Temperature: 4° C
Flow rate: 4 ml/min
Sample: < 20 mg of RAP74 C-terminal domain in A50
Detection: 280 nm
Flow cell: Gilson
Buffer A: A50 (20 mM NaOAc pH 5.2, 0.1 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 50 mM NaCl)
Buffer B: A1000 (20 mM NaOAc pH 5.2, 0.1 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 1000 mM NaCl)
Construct / gradient: RAP74(364-517) [gradient not legible in the source scan]

Analysis

After these purification steps the protein was judged more than 98% pure by SDS PA gel electrophoresis. In order to make sure that the proteins had been cloned, expressed and purified correctly, 1-2 mg of each protein stock were desalted over a reversed phase HPLC-column (Macherey-Nagel, N300-5 C8) and submitted for mass spectroscopic analysis and N-terminal sequencing. The concentrated material was assayed for aggregation by dynamic light scattering. The sample was monodisperse (baseline: 1.000). A hydrodynamic radius of 2.9 nm was calculated for RAP74(364-517), which corresponds to a 35-40 kDa species in solution. This suggests that the protein may form a dimer in solution or may have an elongated shape.

Construct | pI | Mr | MS | ΔMr | N-Term | ε / M⁻¹ cm⁻¹ | ε / mg⁻¹ ml cm⁻¹
RAP74(364-517) | [values not legible in the source scan]

2.7.3 Purification of the N-terminal domain of human RAP74

REMARK: For the purification of selenomethionine proteins 10 mM 2-mercaptoethanol was replaced with 1 mM DTT in all buffers. The final lyophilizate was stored under argon at -20° C.

Cell lysis

The cells from a 6 l large scale expression experiment, pelleted and shock frozen in 100 ml T100 (Sections 2.6.2, 2.6.4), were thawed in a lukewarm water bath.

The cells were put on ice for 10 minutes before sonicating the sample with 10 x 10 pulses of 1 second using the largest probe of the Sonicator at maximum level. The fluid suspension was stored on ice.

Inclusion body preparation

The whole cell extract was adjusted to 150 ml with T100 (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine) and 25 µl of the whole cell extract were mixed with 25 µl PGLB and boiled immediately for 2 minutes. Then the suspension was centrifuged in polypropylene tubes (18000 rpm, 20 min, 4° C, RC 26, SS34). The pellet was washed three times with 150 ml of 1% Triton-buffer (1% (v/v) Triton X-100, 20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine) and centrifuged in a 250 ml centrifuge bottle (10000 rpm, 10 min, 4° C, RC 26, GSA). The creamy white pellet was resuspended in 50 ml T100, then centrifuged in two siliconized 30 ml Corex tubes (9000 rpm, 20 min, 4° C, RC 26, SS34). The washed inclusion bodies were solubilized in 50 ml SAGME in a 50 ml homogenizer (douncer). The homogenized material was transferred to two siliconized 30 ml Corex tubes and centrifuged (9000 rpm, 20 min, 4° C, RC 26, SS34). The supernatant was stored on ice.

Gel filtration chromatography under denaturing conditions

An S-200 gel filtration column (Pharmacia, 5x100 cm) was equilibrated with 2000 ml AU100 (20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea). The solubilized inclusion bodies (50 ml SAGME: 20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 100 mM NaCl, 6 M GdnHCl) were loaded at 3 ml/min and eluted at the same flow rate using first 400 ml AU100, then 2000 ml 0.02% sodium azide. The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled.

Anion exchange HPLC-chromatography under denaturing conditions

After gel filtration, RAP74(2-165), RAP74(2-172), RAP74(2-183) and RAP74(2-202) were first purified over a DEAE-5PW HPLC-column. RAP74(2-192), RAP74(2-154) and RAP74(2-158) were directly applied onto the SP-5PW HPLC-column (next section). Since the samples had to be loaded onto the DEAE-column at very low salt, the pooled fractions from gel filtration were dialyzed in MWCO 6-8 kDa dialysis bags against 10 mM 2-mercaptoethanol in two changes at 4° C. The clear solution was shock frozen and lyophilized. The white powder was taken up in TU0 (5 mM TrisCl pH 7.6, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 7 M urea) and the column was equilibrated with the same buffer. Maximally 15 mg RAP74 N-terminal domain were loaded at 4 ml/min. RAP74(2-165) was eluted with a gradient from 15-25% TU500 in 30 minutes. The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against 10 mM 2-mercaptoethanol in two changes at 4° C. The clear solution was shock frozen and lyophilized. The white powder was stored at -20° C. This construct did not need further purification. RAP74(2-172), RAP74(2-183) and RAP74(2-202) eluted in the flow through fractions, but the contaminating proteolytic side products were retained by the resin. These contaminations were washed off the column with TU500 (5 mM TrisCl pH 7.6, 0.5 mM EDTA pH 8.0, 500 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea). Before loading onto the next column (see below) the pH of the flow through fractions was adjusted by adding 3 M NaOAc pH 5.2 to a final concentration of 20 mM.

Purification of the N-terminal domain of RAP74 over a DEAE-5PW HPLC-column

Column: DEAE-5PW TSK, 21.5 mm x 150 mm
Temperature: RT
Flow rate: 4 ml/min
Sample: < 15 mg of RAP74 N-terminal domain in TU0
Detection: 280 nm
Flow cell: Gilson
Buffer A: TU0 (5 mM TrisCl pH 7.6, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 7 M urea)
Buffer B: TU500 (5 mM TrisCl pH 7.6, 0.5 mM EDTA pH 8.0, 500 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea)
Construct / gradient: [per-construct entries not legible in the source scan; most constructs are listed as eluting in the flow through]

Cation exchange HPLC-chromatography under denaturing conditions

After gel filtration and anion exchange chromatography the RAP74 N-terminal domain constructs were applied onto a SP-5PW HPLC-column. The column was equilibrated with AU100. Maximally 15 mg of RAP74 N-terminal domain were loaded, then eluted with an appropriate salt gradient (see below). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against 10 mM 2-mercaptoethanol in two changes at 4° C. The clear solution was shock frozen and lyophilized. The white powder was stored at -20° C.

Purification of the N-terminal domain of RAP74 over a SP-5PW HPLC-column

Column: SP-5PW TSK, 21.5 mm x 150 mm
Temperature: RT
Flow rate: 4 ml/min
Sample: < 15 mg of RAP74 N-terminal domain in AU100
Detection: 280 nm
Flow cell: Gilson
Buffer A: AU100 (20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea)
Buffer B: [name and composition not legible in the source scan]
Construct / gradient: [per-construct gradients and the accompanying remark are not legible in the source scan]

Analysis

After these purification steps the proteins were judged more than 98% pure by SDS PA gel electrophoresis. In order to make sure that the proteins had been cloned, expressed and purified correctly, 1-2 mg of each protein stock were desalted over a reversed phase HPLC-column (Macherey-Nagel, N300-5 C8) and submitted for mass spectroscopic analysis and N-terminal sequencing.

Construct | pI | Mr | MS | ΔMr | N-Term | ε / M⁻¹ cm⁻¹ | ε / mg⁻¹ ml cm⁻¹
[Rows for the RAP74 N-terminal domain constructs, their mutants and selenomethionine derivatives: values not legible in the source scan.]
n.d. = not determined

2.7.4 Purification of human RAP30 and its N-terminal domain

REMARK: For the purification of selenomethionine proteins 10 mM 2-mercaptoethanol was replaced with 1 mM DTT in all buffers. The final lyophilizate was stored under argon at -20° C.

Cell lysis

The cells from a 6 l large scale expression experiment, pelleted and shock frozen in 100 ml T100 (Sections 2.6.2, 2.6.4), were thawed in a lukewarm water bath.

The cells were put on ice for 10 minutes before sonicating the sample with 10 x 10 pulses of 1 second using the largest probe of the Sonicator at maximum level. The fluid suspension was stored on ice.

Inclusion body preparation

The whole cell extract was adjusted to 150 ml volume with T100 (20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine). Then the suspension was centrifuged in polypropylene tubes (18000 rpm, 20 min, 4° C, RC 26, SS34). The pellet was washed three times with 150 ml of 1% Triton-buffer (1% (v/v) Triton X-100, 20 mM TrisCl pH 8.0, 0.5 mM EDTA pH 8.0, 100 mM NaCl, 10 mM 2-mercaptoethanol, 1 mM benzamidine) and centrifuged in a 250 ml centrifuge bottle (10000 rpm, 10 min, 4° C, RC 26, GSA). The creamy white pellet was resuspended in 50 ml T100, then the sample was centrifuged in two siliconized 30 ml Corex tubes (9000 rpm, 20 min, 4° C, RC 26, SS34). The washed inclusion bodies were solubilized in 50 ml SAGME in a 50 ml homogenizer (douncer). The homogenized material was transferred to two siliconized 30 ml Corex tubes and centrifuged (9000 rpm, 20 min, 4° C, RC 26, SS34). The supernatant was stored on ice.

Gel filtration chromatography under denaturing conditions

An S-200 gel filtration column (Pharmacia, 5x100 cm) was equilibrated with 2000 ml AU50 (20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 50 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea). The solubilized inclusion bodies (50 ml SAGME: 20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 10 mM 2-mercaptoethanol, 100 mM NaCl, 6 M GdnHCl) were loaded at 3 ml/min and eluted at the same flow rate using first 400 ml AU100, then 2000 ml 0.02% sodium azide. The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled.

Cation exchange HPLC-chromatography under denaturing conditions

After gel filtration RAP30 and RAP30 N-terminal domain constructs were applied onto a SP-5PW HPLC-column. The column was equilibrated with AU50. Maximally 15 mg of RAP30 or RAP30 N-terminal domain were loaded and then eluted with an appropriate salt gradient (see below). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were dialyzed in MWCO 6-8 kDa dialysis bags against 10 mM 2-mercaptoethanol in two changes at 4° C. The clear solution was shock frozen and lyophilized. The white powder was stored at -20° C.

Purification of RAP30 constructs over a SP-5PW HPLC-column

Column: SP-5PW TSK, 21.5 mm x 150 mm
Temperature: RT
Flow rate: 4 ml/min
Sample: < 15 mg of RAP30 or RAP30 N-terminal domain in AU50
Detection: 280 nm
Flow cell: Gilson
Buffer A: AU50 (20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 50 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea)
Buffer B: AU500 (20 mM NaOAc pH 5.2, 0.5 mM EDTA pH 8.0, 500 mM NaCl, 10 mM 2-mercaptoethanol, 7 M urea)
Construct / gradient: [per-construct gradients for the RAP30 constructs, mutants and selenomethionine derivatives are not legible in the source scan]

Analysis

After these purification steps the proteins were judged more than 98% pure by SDS PA gel electrophoresis. In order to make sure that the proteins had been cloned, expressed and purified correctly, 1-2 mg of each protein stock were desalted over a reversed phase HPLC-column (Macherey-Nagel, N300-5 C8) and submitted for mass spectroscopic analysis and N-terminal sequencing.

Construct | pI | Mr | MS | ΔMr | N-Term | ε / M⁻¹ cm⁻¹ | ε / mg⁻¹ ml cm⁻¹
[Rows for RAP30, its N-terminal domain constructs, mutants and selenomethionine derivatives: values not legible in the source scan.]
n.d. = not determined

2.7.5 Refolding and purification of human TFIIF

Lyophilized RAP30 and RAP74 were dissolved in equimolar amounts in HUK500 (20 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 1 mM DTT, 500 mM KCl, 4 M urea) to a final concentration of 1.25 mg/ml. After 1 hour incubation at room temperature the sample was dialyzed in 100 ml MWCO 6-8 kDa dialysis bags against 2000 ml HK500 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.25 mM DTT, 500 mM KCl) at 4° C. After four hours the dialysis buffer was changed to 2000 ml HK100 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.25 mM DTT, 100 mM KCl). After an overnight equilibration in the cold room the dialysis buffer was changed again to another 2000 ml HK100. After four more hours of dialysis the refolded TFIIF was purified over a DEAE-5PW HPLC-column. In the cold room a DEAE-5PW HPLC-column was equilibrated with TK100 (20 mM TrisCl pH 8.0, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 100 mM KCl). Maximally 12 mg of TFIIF were loaded at 4 ml/min and eluted with an appropriate salt gradient (see below). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were extensively dialyzed in MWCO 6-8 kDa dialysis bags against BSB200K (1 mM BisTrisCl pH 7.0, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 200 mM KCl) in two changes at 4° C. The protein solution was concentrated with a Sartorius 13200E ultrathimble to a final concentration of 15 mg/ml (0.5 mM). After addition of 20% glycerol containing 200 mM KCl the protein preparation was flash frozen in aliquots and stored at -80° C.
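The stepwise dialysis above removes the urea by repeated equilibration against urea-free buffer. A rough estimate of the residual urea after each buffer change can be sketched as follows. The assumptions are labeled in the code: the sample volume is taken as 100 ml (the stated bag size, which may exceed the actual sample volume), each step is assumed to reach full equilibrium, and volume changes during dialysis are ignored.

```python
def residual_concentration(initial: float, sample_ml: float,
                           buffer_ml: float, changes: int) -> float:
    """Concentration of a freely dialyzable solute after `changes` rounds of
    equilibrium dialysis against fresh solute-free buffer."""
    dilution_per_change = sample_ml / (sample_ml + buffer_ml)
    return initial * dilution_per_change ** changes


# ASSUMPTIONS: 100 ml sample, complete equilibration in every step.
UREA_START_M = 4.0
for n in (1, 2, 3):
    left = residual_concentration(UREA_START_M, 100.0, 2000.0, n)
    print(f"after {n} change(s): {left * 1000:.2f} mM urea")
```

Because the first step lasts only four hours, real residual concentrations are probably somewhat higher than this equilibrium estimate.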

Purification of refolded human TFIIF

Column: DEAE-5PW TSK, 21.5 mm x 150 mm
Temperature: 4° C
Flow rate: 4 ml/min
Sample: < 12 mg of refolded human TFIIF in HK100
Detection: 280 nm
Flow cell: Gilson
Buffer A: TK100 (20 mM TrisCl pH 8.0, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 100 mM KCl)
Buffer B: [name and composition not legible in the source scan]
Construct / gradient: RAP74(2-517)/RAP30 [gradient not legible in the source scan]

Analysis

After these purification steps the protein was judged more than 98% pure by SDS PA gel electrophoresis. The concentrated material was assayed for aggregation by dynamic light scattering. The sample was monodisperse (baseline: 1.000) and TFIIF had an apparent molecular weight of 158 kDa.

2.7.6 Refolding and purification of the RAP30/74-interaction domains

Lyophilized RAP30 and RAP74 were dissolved in equimolar amounts in HUK500 (20 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 1 mM DTT, 500 mM KCl, 4 M urea) to a final concentration of 1.00 mg/ml. After 1 hour incubation at room temperature, the sample was dialyzed in 50 ml MWCO 6-8 kDa dialysis bags against 2000 ml HK100 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 100 mM KCl) at 4° C. After four hours, the dialysis buffer was changed to 2000 ml HK100 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.25 mM DTT, 100 mM KCl). After an overnight equilibration in the cold room, the refolded truncated TFIIF-complex was purified over a SP-5PW HPLC-column in the cold room. The column was equilibrated with HK100 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 100 mM KCl). Maximally 12 mg of RAP30/74-complex were loaded at 4 ml/min and eluted with an appropriate salt gradient (see table). The peak fractions as detected at 280 nm were analyzed on 18% SDS PA gels and pooled. The pooled fractions were extensively dialyzed in MWCO 6-8 kDa dialysis bags against BSB20K (1 mM BisTrisCl pH 7.0, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 20 mM KCl) in two changes at 4° C. The protein solution was concentrated with a Sartorius 13200E ultrathimble to 20-40 mg/ml. After addition of 20% glycerol at 20 mM KCl, the protein preparation was flash frozen in aliquots and stored at -80° C.
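Dissolving two subunits "in equimolar amounts" at a fixed total protein concentration means splitting the total mass in proportion to the molar masses. The sketch below shows that arithmetic. The molar masses are hypothetical round numbers chosen only for illustration (roughly the size of the full-length subunits); the values for a given pair of constructs would have to be taken from the construct tables.

```python
def equimolar_masses(total_mg: float, molar_masses_da: dict) -> dict:
    """Split a total protein mass so that every component is present
    in the same molar amount (mass is proportional to molar mass)."""
    mass_sum = sum(molar_masses_da.values())
    return {name: total_mg * mass / mass_sum
            for name, mass in molar_masses_da.items()}


# HYPOTHETICAL molar masses in Da, for illustration only.
example_masses = {"RAP30": 28_000.0, "RAP74": 58_000.0}

# 10 ml at 1.00 mg/ml total protein corresponds to 10 mg of protein.
portions = equimolar_masses(10.0, example_masses)
for name, mg in portions.items():
    print(f"{name}: {mg:.2f} mg")
```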

Purification of the RAP30/74-heterodimer over a SP-5PW HPLC-column

Column: SP-5PW TSK, 21.5 mm x 150 mm
Temperature: 4° C
Flow rate: 4 ml/min
Sample: < 12 mg of refolded human TFIIF in HK100
Detection: 280 nm
Flow cell: Gilson
Buffer A: HK100 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 100 mM KCl)
Buffer B: HK500 (5 mM HEPES pH 7.5, 0.1 mM EDTA pH 8.0, 0.5 mM DTT, 500 mM KCl)

Gradients by complex (rows: RAP74 construct; columns: RAP30 construct)

Columns: RAP30(2-119) | RAP30(2-124) | RAP30(2-135) | RAP30(2-151) | RAP30(2-249)
Rows: RAP74(2-154), RAP74(2-158), RAP74(2-163), RAP74(2-164), RAP74(2-165), RAP74(2-172), RAP74(2-183), RAP74(2-192), RAP74(2-202), RAP74(2-517)
[The individual gradient entries (% B and duration in min) are not legible in the source scan.]

Bold: cloned fragment; normal: proteolytic fragment. Including all mutants or selenomethionine derivatives.

Analysis

After these purification steps the protein was judged more than 98% pure by SDS PAGE. For some complexes the concentrated material was assayed for aggregation in BSB20 at 3-5 mg/ml by dynamic light scattering (DP-801; Protein Solutions).

RAP74 | RAP30 | pI | Mr | ε / M⁻¹ cm⁻¹ | ε / mg⁻¹ ml cm⁻¹ | baseline | distribution | hydrodynamic radius / nm | Mr,app (kDa)
[Rows for the individual RAP74/RAP30 complexes: values not legible in the source scan.]
monod = monodisperse, polyd = polydisperse

2.8 Crystallization screens

If not stated otherwise, crystallization screens were set up as hanging drop experiments in 24 well Falcon plates. 4 µl protein solution (10-20 mg/ml) were mixed with 4 µl reservoir solution and equilibrated against 500 µl reservoir solution at room temperature and 4° C.
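Mixing equal volumes of protein and reservoir solution halves the starting concentration of every component in the drop. The following sketch makes that explicit for the 4 µl + 4 µl drops described above. The 18% precipitant value is only an illustrative reservoir composition, and the function name is hypothetical.

```python
def starting_concentration(stock_conc: float, stock_ul: float,
                           other_ul: float) -> float:
    """Concentration of a component right after mixing `stock_ul` of its
    stock with `other_ul` of a solution that does not contain it."""
    return stock_conc * stock_ul / (stock_ul + other_ul)


# Protein at 10-20 mg/ml mixed 4 µl + 4 µl with reservoir solution.
protein_low = starting_concentration(10.0, 4.0, 4.0)
protein_high = starting_concentration(20.0, 4.0, 4.0)
# Illustrative precipitant: 18% in the reservoir, absent from the protein stock.
precipitant = starting_concentration(18.0, 4.0, 4.0)

print(f"protein in drop: {protein_low:.0f}-{protein_high:.0f} mg/ml")
print(f"precipitant in drop: {precipitant:.0f}%")
```

During vapor diffusion the drop loses water to the reservoir, so these values are only the starting point of the experiment.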

Hampton Research Crystal Screen I

Based on the Hampton Research Crystal Screen I, the following conditions were set up:

[Table of the conditions set up: entries not legible in the source scan.]

Hampton Research Crystal Screen II

Based on the Hampton Research Crystal Screen II, the following conditions were set up:

[Table of the conditions set up: entries not legible in the source scan.]

Hampton Research Detergent Screen I

The Hampton Research Detergent Screen I was set up as a sitting drop experiment using microbridges in a 24 well plate. 4 µl protein solution (10-20 mg/ml) were mixed with 1 µl of a 1:10 dilution of the Detergent Screen stock solution before [volume not legible] µl reservoir buffer were added. The sitting drop was equilibrated against 500 µl reservoir buffer at room temperature and/or 4° C. The reservoir buffer contained the appropriate salts, buffers, reductants and precipitation agents for the crystallization of the respective protein (e.g. 18% PEG6000, 50 mM BisTrisCl pH 6.5, 100 mM LiNO3, 1 mM DTT).

[Table of detergents and their concentrations by well: entries not legible in the source scan.]

Song's factorial design 44

The reservoir buffer contained the appropriate precipitation agent for crystallization of the respective protein and the components given in the table below.

[Table of buffer and salt components by well: entries not legible in the source scan.]

Divalent screen (Do you like divalents?)

The reservoir buffer contained the appropriate salts, buffers, and precipitation agents for the crystallization of the respective protein (e.g. 18% PEG6000, 50 mM BisTrisCl pH 6.5, 100 mM LiNO3) as well as the divalent given in the table below, but no reducing agent.

[Table of wells: A1 is the control; the remaining wells each contain 10 mM of a metal salt. The individual salts are not legible in the source scan.]

Monovalent screen (Do you like salt?)

The reservoir buffer contained the appropriate buffers, reductants and precipitation agents for the crystallization of the respective protein (e.g. 18% PEG6000, 50 mM BisTrisCl pH 6.5, 1 mM DTT) as well as the salts given in the table below, but no reducing agent.

[Table of wells: A1 is the control; the remaining wells each contain a salt, mostly at 100 mM. The individual salts are not legible in the source scan.]

Electrostatic crosslinker screen

The electrostatic crosslinker screen is based on a screen proposed by Cudney et al. The screen was set up as a hanging drop experiment using 24 well plates. 4 µl protein solution (10-20 mg/ml) were mixed with 2 µl of a 1:2-1:10 dilution of the electrostatic crosslinker screen stock solution before 4 µl reservoir buffer were added. The drop was equilibrated against 500 µl reservoir buffer at room temperature and/or 4° C. The reservoir buffer contained the appropriate salts, buffers, reductants and precipitation agents for the crystallization of the respective protein (e.g. 18% PEG6000, 50 mM BisTrisCl pH 6.5, 100 mM LiNO3, 1 mM DTT).

Al watei tontiole Cl Dodecylamiiu pli 7 0 ">00 mM A2 Polv I flutarntc itid pli 7 0 21 mg/ml C2 higlycine pli 7 0 230 mM "7 AI PoK I I vsine" pli 0 ""i nn ml C 1 Peti-iglyunc pli 7 0 10 mM V4 sultile ( Eextian pH ()20iw/\) Cl Gpcy] .give nie pli 7 0 210 nAl AI PEG 600di u id pIPO 10 ml es Vliln icid pH7 0 POmVl A6 1 12 Dodu mon îcid pIP i pi s-tt Of Pilmitmir leid 7 0 K)OP(sil) t pli Bl 7s Vzelm llld pIPO mM El 1 S Di immoxmiL pH / 0 230 mM LP nid Ki Vdipmi pIPO lïAt P2 'Il Enmmodoeki in pIPO lOtPUO ' B) liHiniatimt and pli "P ""~0 nAl E7 7 0 230 mVl ^ Speirmn pli Bln VtPP pH 1 î n Et Speimidnii pll / 0 7s0mM Ltlltklle PlMOle pli ( s l Ei j Iiieilnnohmirn pll 0 PC) nAl hW^ ' u ^n E 1 Vmmoe ipionic id pH mM 1 Dnnun 2 irUhtlpeiitiiii pli / t) P0 nAl

Heavy atom screen

For cocrystallization of proteins with heavy atom compounds, a screen based on the given general heavy atom soaking stock solutions was used. The screen was set up as a hanging drop experiment in 24 well plates. 2 µl protein solution (10-20 mg ml⁻¹) were mixed with 6 µl reservoir buffer before 2 µl of the diluted heavy atom stock solution were added to yield a final heavy atom concentration of 1 mM. The hanging drop was equilibrated against 500 µl reservoir buffer at 4 °C. The reservoir buffer contained the appropriate salts, buffers, and precipitation agents for the crystallization of the respective protein (e.g. 18% PEG6000, 50 mM BisTrisCl pH 6.5, 100 mM LiNO3), but no reducing agent.
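The dilution of each heavy atom stock follows from the drop composition given above (2 µl protein, 6 µl reservoir buffer, 2 µl diluted stock, final concentration 1 mM). The following is a minimal sketch of that arithmetic; the function name and the two example stock concentrations are illustrative assumptions and are not taken from the screen table.

```python
def stock_dilution_factor(stock_molar, final_mM=1.0, added_ul=2.0, drop_ul=10.0):
    """Fold dilution of a heavy atom stock so that `added_ul` of the diluted
    stock in a drop of `drop_ul` total volume gives `final_mM`."""
    # Concentration the diluted stock must have before it is added to the drop.
    working_mM = final_mM * drop_ul / added_ul
    return stock_molar * 1000.0 / working_mM


# Illustrative stock concentrations (assumed values).
for stock in (0.1, 0.5):
    print(f"{stock} M stock -> dilute 1:{stock_dilution_factor(stock):.0f}")
```

With a 10 µl drop the diluted stock has to be 5 mM, so the required dilution scales linearly with the stock concentration.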

ink diluti n I Ilk eliiulion " Al no Vu 0 WI I ipi ( 1 k lit I 0 1 M 1 P r A2 Pli«) Vc) 0 i VI 1 10) I O OVO 11 vTi 1 »o 4 A, pXICBV o 1 VI 1 20 C SmCI 0 s XI 1 100

Al i xns 1 s XI 1 100 C4 PiCl U XI I 100

i Vi Bikei s Dimticuinl o vi 1 to ( 1 rvnov 1 1X1 1 so

V6 VPH XO 0 1X1 1 1(3, XHtPdC I 0 1 VI 1 7(1 , Bl II, CNOO W 1 XI 1 20 Pl Od O V 0 XI I 10

B> IIVuCl 0 1 X1 1 100 P2 tuCI 0 1 I 100 L M_ IP kPtCl, 0 ' VI 1 60 "El uXHpOsCI 0 1 XI 1/20 ' Bl kVuCX 0 1X1 17S0 Et XiPtCl 0i XI 1/100

Bl NlliPdC 1, 0 i M 17100 L71 Xi wo 0 s VI i , icyq B6 kPlCX, 0 IM 17100 EC timsPtiXH i ( I 0 I VI 1/ 0

Ptxk dilution ' ~

Al usPKME, C 1 O I XI 1 "O

V1 p X1C S V o o xt I i (

Al Phentl IOIXO 10 XI 1 6 6

A4 1X1 X X 1 ) XI l~l f I.

Ai IiidninnIV Khlond ( s XI 1 100 Bi 'etliskn dintmielpl itiniim ehPnel 4X1 1 SO Bl kPliXO I VI I 20 B2 PdC 1. 0 4 XI 1 SO

« Pittien itetownxiiuit nx th m O" M 1 6 6 83

2.9 MeHgNO3-prelabelling of TFIIF

The RAP30/74-interaction domains of TFIIF were refolded, purified and concentrated as usual (Section 2.7.6). To 500 µl concentrated protein solution (approximately 10 mg ml⁻¹ in BSB20K) 1 M DTT was added to a final concentration of 10 mM and the mixture was incubated for one hour at room temperature. The sample was applied onto a Sephadex G25 medium gravity column (PD10, Pharmacia) in BSB20K (no DTT) and 10 fractions of 1 ml were eluted with BSB20K (no DTT). The peak fractions as detected at 280 nm were pooled. Free sulfhydryl groups were quantitated with Ellman's assay (usually 0.75-0.9 free sulfhydryl groups per cysteine residue). The sample was diluted with BSB20K to a final protein concentration of 50 µM and 6 equivalents methylmercury nitrate per cysteine residue were added. After overnight incubation at room temperature the sample was concentrated to a final volume of 500 µl with an Ultrafree-4 concentrator (MWCO 5 kDa, Millipore) and applied onto a Sephadex G25 medium gravity column (PD10, Pharmacia) in BSB20K (no DTT). The protein was eluted as described above. Free sulfhydryl groups were quantitated with Ellman's assay again (usually no signal above background). The peak fractions as detected at 280 nm were concentrated to a final concentration of 20 mg ml⁻¹. Sodium azide (0.01% (w/v)) was added before the material was used for crystallization. Crystallization conditions did not contain any reducing agents such as 2-mercaptoethanol or DTT.
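The amount of methylmercury nitrate needed for the labelling step follows from the protein concentration (50 µM) and the 6 equivalents per cysteine stated above. The sketch below illustrates the calculation; the number of cysteines, the sample volume and the stock concentration are hypothetical placeholders, since none of them is given in this section.

```python
def mehg_final_conc_uM(protein_uM=50.0, n_cys=4, equivalents=6.0):
    """Final reagent concentration for `equivalents` per cysteine.
    n_cys=4 is a hypothetical value, not the cysteine count of TFIIF."""
    return protein_uM * n_cys * equivalents


def stock_volume_ul(sample_ml, final_uM, stock_mM):
    """Volume of reagent stock to add (C1*V1 = C2*V2), neglecting the
    small volume increase caused by the addition itself."""
    return sample_ml * 1000.0 * final_uM / (stock_mM * 1000.0)


final = mehg_final_conc_uM()                       # reagent concentration in µM
volume = stock_volume_ul(10.0, final, stock_mM=100.0)  # assumed 10 ml sample, 100 mM stock
print(f"{final:.0f} µM reagent, add {volume:.0f} µl of stock")
```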

3 Design of crystallization constructs based on human TFIIF domain structure by limited proteolysis

3.1 General scheme for limited proteolytic analysis

For the reasons outlined in the previous section, full length human TFIIF was put through limited proteolytic analysis, which has been the method of choice to distinguish between tightly packed domains and flexible, solvent exposed regions in protein structures. These hinges and loops are generally more susceptible to proteolytic cleavage than compact domains because their protein backbone can be accommodated or even transiently unfolded by proteases much more efficiently. The (meta)stable fragments in limited proteolysis therefore correspond to distinct, tightly packed structural domains.

Figure 6: General scheme for limited proteolytic analysis.

Flowchart steps: initial reaction conditions with various proteases; development of inhibition protocols; refinement of reaction conditions; N-terminal sequencing; on-membrane subdigestion; C-terminal sequencing; Blot-MS analysis; HPLC-MS analysis; MS analysis; functional information, proteolytic map of target protein and sequence analysis; putative domain structure of target protein.

Limited proteolytic analysis (Figure 6) starts with preliminary trials using several proteases to find approximate conditions that yield analyzable, distinct proteolytic fragments. Orders of magnitude in time, enzyme:substrate ratio, temperature and buffer conditions (pH, salt) are sampled. Comparison of the digestion products from different reactions may reveal common patterns or fragments. These are of special interest because they indicate that the enzymes are guided by the substrate structure and not by sequence specificity. In a second step efficient inhibition ("stopping") protocols are developed to improve reproducibility and allow handling and storage of stable mixtures of proteolytic fragments, e.g. for HPLC-MS analysis. Typically proteolysis is stopped by adding denaturants like SDS to the reaction mixture, which is immediately boiled and then analyzed on SDS-PA gels. This type of inhibition is reversible and involves detergents, which makes it unsuitable for MS analysis of the proteolytic fragments. Furthermore, most proteases are much more stable towards denaturants and heat than target proteins and may not even be completely inactivated by this treatment. The proteases retain enough residual activity to cut the unfolding target proteins into unspecific and irreproducible breakdown products that no longer show any correlation with structural elements. The proteases must therefore be inactivated completely and irreversibly by appropriate inhibitors before denaturants are added. With the inhibition protocols at hand the reaction conditions are refined to optimize the yield of the fragments of interest. Their N-termini are determined by Edman degradation after blotting on PVDF membranes. The C-termini can be extracted by five different methods:

1) On-membrane subdigestion of the blotted proteolytic fragments followed by further Edman degradation.

2) C-terminal sequencing of the blotted proteolytic fragments.

3) Blot-MS analysis by IR-MALDI mass spectrometry of the blotted fragments directly from the PVDF membrane.

4) HPLC-MS-analysis of the fragments as applied in this study, which involves scaling up of the reactions to preparative amounts (1-2 mg) and HPLC fractionation followed by mass spectrometry (see below).

5) In simple cases, direct MS analysis of the unfractionated digestion mixtures can yield satisfactory results.

The choice between these methods depends on the complexity of the problem and the equipment available. In general, methods 1) to 3) should be preferred because they allow an unequivocal assignment of the corresponding N- and C-termini. Methods 4) and 5) suffer from the fact that peaks in mass spectra of protein mixtures can not easily be assigned to their corresponding bands, i.e. N-termini, in the digestion patterns. Combination of mass spectroscopical data with N-terminal sequencing results involves some interpretation which may lead to error but yields the proteolytic map of the target protein as well. Finally, interpretation of the proteolytic map together with functional information (e.g. deletion analyses) and sequence analysis (multiple sequence alignment, secondary structure prediction) yields the putative domain structure of the target protein.

3.2 Results of limited endoproteolysis

Full length, recombinant TFIIF was digested with various proteases to define those that yielded analyzable patterns of distinct proteolytic fragments rather than non-specific breakdown products (Figure 7). Buffer conditions for the initial RUNNING-assays were kept constant in order to avoid conformational variability of the target protein TFIIF. Chymotrypsin, proteinase K and elastase yielded digestion patterns with discrete proteolytic fragments, but only chymotrypsin and elastase were selected for further analyses because inhibition efficiency for proteinase K was not satisfactory (Figure 7C, lane preinc.).

Figure 7: Initial RUNNING-assays with various endoproteases.

RUNNING- and PREINCUBATION-assays were performed as described in section 2.5.6. Reactions contained full length TFIIF at 1.5 mg ml⁻¹ (30 µg) and were incubated at room temperature for 10 minutes. Inhibition was omitted and PGLB added directly where indicated. Inhibition efficiency was controlled by adding water instead of inhibitor stock solution (lane contr.) and by PREINCUBATION-assay (lane preinc.).

[Gel panels: TFIIF (1500 µg/ml) digested with increasing concentrations (0-100 µg/ml) of chymotrypsin, papain, proteinase K and subtilisin; lanes without inhibition, contr. and preinc. as indicated; marker lanes LMWM.]

In a second step, efficient inhibition protocols for chymotrypsin and elastase were developed. On chymotrypsin three different inhibitors were tested (Figure 8): TPCK, acetic acid, and Pefablock. TPCK inhibition was slow and followed pseudo first order kinetics, being insensitive to the inhibitor concentration. This kind of inhibition is worthless for proteolysis experiments on the timescale of minutes because it takes 2-4 hours to completely inhibit the proteases and can not be accelerated by excess TPCK¹⁶⁷. On the timescale of minutes, acetic acid alone seemed to be at least as effective an inhibitor as TPCK, but with the major drawback that at the end of the inhibition time alkaline PGLB reversed the inhibitory effect of the acid and allowed non-specific cutting during heat denaturation (many additional bands in lanes HOAc10 and HOAc20). Pefablock caused considerable inhibition of the protease that followed second order kinetics, indicating that the inhibitory effect was sensitive to Pefablock concentration. Using high concentrations of Pefablock (10 mg ml⁻¹), proteolysis could be completely stopped within 10 minutes incubation at room temperature. Therefore Pefablock was chosen for further experiments as an effective, fast and irreversible inhibitor for chymotrypsin and elastase.
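The difference between the two kinetic regimes described above can be illustrated numerically. The following sketch compares a first-order inactivation whose rate does not depend on the inhibitor concentration (the behaviour reported for TPCK) with an inactivation whose observed rate scales with the inhibitor concentration (the behaviour reported for Pefablock). Both rate constants are hypothetical values chosen only for illustration; they were not measured in this work.

```python
import math


def residual_activity_saturated(t_min, k_per_min):
    """Remaining protease activity for a first-order inactivation whose
    rate is independent of the inhibitor concentration."""
    return math.exp(-k_per_min * t_min)


def residual_activity_second_order(t_min, k2, inhibitor_mg_ml):
    """Remaining protease activity with inhibitor in large excess:
    the observed rate constant is k2 * [inhibitor]."""
    return math.exp(-k2 * inhibitor_mg_ml * t_min)


K_TPCK = 0.03   # hypothetical rate constant, min^-1
K2_PEFA = 0.05  # hypothetical rate constant, ml mg^-1 min^-1

for conc in (1.0, 10.0):
    tpck = residual_activity_saturated(10, K_TPCK)
    pefa = residual_activity_second_order(10, K2_PEFA, conc)
    print(f"{conc:4.0f} mg/ml inhibitor, 10 min: "
          f"TPCK-like {tpck:.2f}, Pefablock-like {pefa:.3f}")
```

Under these assumptions a tenfold increase in inhibitor concentration leaves the first case unchanged but shortens the inactivation time of the second case tenfold, which is why only the latter can be tuned to a timescale of minutes.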

Figure 8: Inhibitor testing with STOPPING-assay.

The STOPPING-assay was performed as described in section 2.5.6. Reactions contained full length TFIIF at 1.5 mg ml⁻¹ and chymotrypsin as indicated and were incubated at room temperature for 10 minutes. Inhibition was omitted and PGLB added directly where indicated; otherwise incubation with the inhibitor was 10 minutes at room temperature. The following inhibitor stock solutions were tested: 1 mg/ml TPCK (TPCK1), 10 mg/ml TPCK (TPCK10), 1 mg/ml Pefablock (PEFA1), 10 mg/ml Pefablock (PEFA10), 10% (v/v) acetic acid (HOAc10), 20% (v/v) acetic acid (HOAc20). In the control lane water was used instead (lane contr.).

[Gel image: TFIIF (1500 µg/ml) digested with chymotrypsin; lanes without inhibition, contr., TPCK, Pefablock and HOAc as indicated; marker lane LMWM.]

This approximate inhibition protocol was refined further. First, the maximal protease concentration for the inhibition protocol was determined by PREINCUBATION-assay (section 2.5.6, data not shown). Pefablock is soluble in water to a maximal concentration of 10 mg ml⁻¹. This maximal stock concentration limits the maximal concentration of chymotrypsin in the reaction mixtures that can be completely inhibited on a 10 minutes timescale to 1 µg ml⁻¹. Elastase could be used up to 10 µg ml⁻¹ (data not shown). Stability of the proteolytic fragments after inhibition was assessed by STOPPING-assay, showing no change in the band pattern after inhibition (Figure 9). Finally, time resolved limited proteolysis experiments (RUNNING-assay) yielded useful information about the relative stability of the proteolytic fragments. In the following discussions, the individual bands will be referred to as labeled below (Figure 9).

Figure 9: Efficiency of inhibition and relative stability of TFIIF proteolytic fragments.

STOPPING- and RUNNING-assay were performed as described in section 2.5.6. Reactions contained full length TFIIF at 1.5 mg ml⁻¹ (30 µg) and were incubated at room temperature for 10 minutes. Reactions and incubation with the inhibitor were done at room temperature. [...] Conditions chosen for further investigation are labeled with a black dot.

[Gel images: STOPPING-assay and RUNNING-assay panels; marker lanes LMWM.]

Interesting fragments were those common to several proteases, indicating metastable fragments that were stable due to their structure and not to their sequence (e.g. C2, E2, C6, E7, C10, E12). Fragments that were stable over long reaction times were interesting as well since these either lacked cutting sites or were particularly tightly folded (e.g. C3, C5, E6, Figure 9). In order to identify the N-termini of all these fragments, digestion patterns from several timepoints of the chymotrypsin and elastase digestions (Figure 9) were blotted onto PVDF membranes to make use of the varying relative concentration of the fragments and submitted for Edman sequencing. For HPLC-MS-analysis, reactions were scaled up to preparative amounts such that 0.25 - 1.5 mg TFIIF could be digested and then stabilized with inhibitors reproducibly. The digestion mixtures were fractionated and desalted over a Nucleosil 300-5 C8 reversed phase column, which led to fractions containing only subsets of the proteolytic fragments. These fractions were analyzed on a MALDI-TOF mass spectrometer (Figure 10). The N-termini of all the proteolytic fragments contained in these HPLC-fractions were known such that mass spectra could be interpreted to obtain the C-termini of most fragments as well. For fractions containing only one or two fragments, an unequivocal assignment of certain peaks from the mass spectra to single bands or N-termini in the digestion patterns could be obtained. In more complicated cases a less reliable assignment could be achieved by comparing between HPLC-fractions with different contents when the known specificities of the proteases were taken into account. This yielded more reliable results for chymotrypsin fragments than for elastase fragments because elastase is less specific. The final results of the limited endoproteolysis are summarized below (Table 8).
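The assignment of C-termini from a known N-terminus and an observed mass can be sketched as a simple search: for each possible C-terminal residue the average mass of the fragment is calculated and compared with the measured mass within a relative tolerance. The code below is only an illustration of this principle. The sequence is a made-up toy example and not that of RAP74 or RAP30, the function names and the tolerance are assumptions, and the observed mass is taken to be the neutral average mass (for singly charged MALDI ions the proton mass would have to be subtracted first).

```python
# Average residue masses in Da (standard values for the 20 amino acids).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153


def fragment_mass(seq, start, end):
    """Average mass of residues start..end (1-based, inclusive)."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq[start - 1:end]) + WATER


def candidate_c_termini(seq, n_term, observed_mass, tol_fraction=0.001):
    """All C-terminal positions whose fragment mass matches the observed
    mass within the relative tolerance (0.1 % assumed here)."""
    hits = []
    for end in range(n_term, len(seq) + 1):
        mass = fragment_mass(seq, n_term, end)
        if abs(mass - observed_mass) <= tol_fraction * observed_mass:
            hits.append((end, seq[end - 1], round(mass, 1)))
    return hits


# Hypothetical toy sequence used purely for demonstration.
TOY_SEQ = "MAALGPSSQNVTEYVVRVPKNTTKKYNIMAFNAADKVNF"
observed = fragment_mass(TOY_SEQ, 3, 20)  # simulate a measured mass
print(candidate_c_termini(TOY_SEQ, 3, observed))
```

Candidates found in this way would still have to be checked against the known specificity of the protease, as described in the text, before a C-terminus is accepted.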

Figure 10: HPLC-MS-analysis of chymotryptic digestion product.

A: The chymotryptic fragments from a preparative digestion of full length TFIIF (RUNNING-assay, 10 minutes, RT) were fractionated by reversed phase HPLC chromatography (Nucleosil 300-5 C8; eluents containing CH3CN; 1 ml min⁻¹, 1 ml per fraction). B: Analysis of HPLC fractions on 18% SDS-PA gels. C: MALDI-TOF mass spectra (DHB matrix) of the indicated fractions after concentration in the SpeedVac.

Fragment assignments: C2 = RAP74(A2-F262), C3 = RAP74(S175-F262/M363), C5 = RAP74(A2-F174), C6 = RAP74(M363-E517).

Table 8: Results of limited endoproteolysis of full length TFIIF.

C util ne Ptels fti N t mum uc d (in 1 is hmh when thett is i suivit, uiiimbimou qu nee ml is middle when the

se.qiie.iKc is dciived tioin i mrstuie ol equeue s md inbiintoiiil woid ouch (GCC 3 P N Toitium with low conlidi nee h ivc

no elm i suit m this evoidsLiich m 1 ire Uni foi oimtt I torn th tibi Conhd n P\ P toi C tcirram n iIlIui i hi h

wh n time is i miss pLCtiurn Ii in a tiiotisn with I In muits whieh ilk s uuqiiis oeil issi nmciit of the p iks U ihe N t imini included Middle conticl nc i set v.hen th i u multipl miss sp tn ti m dictions with \ uvm° but et ihppiim tn m nt c mpositions The issi n d ] eik mu t con p n I t h i nietit tint u \p it I bl elein qu ti sp Hi its ot the

i ] slit pi t isc Povv confident is set when th i n n un quit il i i nm nt lip ik to in N terrnum contiined m the

ttatti n und t tuvestpitionbut the pi pe t lin menti P t tpeesibl tiiminlbi d on one ot tlus N t mum

t h\ moti v psin Fi agmuits _

Confident i Confidtnti

I ibtl R\P N It im C ltim \1 MS N In minus C Itinmnis _ + Pa.

( 1 4 P I s] 331 73P 7 1 ( 01 lu h lu h

( 71 V I 7 11 1 41 1( (7 P 7 1 hmli In h

s ( 1 C 1 V VP | 11 U-i! ' P lu h lu h ( t SP": PC4 i n 1 ( 1( e- + lny_h hiji i ( 1 SI 0 M ( 1 P 11 1 ) liiji luji

< 1 P I7P1 S o 3 S i 7 IP hi h lu h ... V

( o 1 p 1 I 1 1 7 141 o4 1 ) o lu h lu h

Cl 1 VIP PsP |C)1 \ 31 P hmh _ Jimti 1 c; V vu < b 1 1474 4P 4 11 <_ 11 Jimh niitl il c> 10 Ml D1( PC PP 1 > 3 ) hi h lu h -+- 7 C3 n RIO' Lsl i ns 1 )1 00 ( hi h nu 111

' i i < C) 0 ^ Pll 1 x | 1 P hlh Pus L _ _

sS 1 C) 70 V 3 1 1 i 3 )7 lu h mulet 1 ( 10 1 | R4 4 SM 1 1 1 N ) 11 1 lu h lut C 1 i 71 RI 1 I 1 SO" P S 03 hiji _miHl

( 11 74 C 3s RI s ) sO c 0 mid 11 midtl

s ( 11 10 NIC k , 7) ) nu 11l 1 w

t C 1 71 V I 1 S3 14 l hi, h mid 11

Flastast Fi agmtnts

Confidence C outillons c

i ,hd R\P N lu m C 1 c un MS t minus C L N lu Itiimniis ( u _ P

11 14 V 1 M Pl o o3| ) 7 1 ( 01 lu "h lu h

1° n v v S M 40 HO 4)3 7 PO 0 01 lu h lu h

« ^ 1 X MP 1 1 vv t

_ t - hiji

1 4 71 V Cl 70 000 lu h low 1

1 3 10 V on) PP) P V 1 1 0 11 lu h lu ll

1 C ' x' S_0 2P0( ^C 4 p hmh

- I— lnjl _ l_f 1 4 V V 0 P( ) o C O l in h lu h f I n i iPi 1 ol 1 J 1 1 ^ st In h rmdll

1 3 1 V 11 1 P 1 P4 3 1 P lu h

- •uji

> p Ll 11 PP4 11 S lu n 1W J l _\ u

Ll i VI J I M 1 1 P )l lu h miel 11

t ri | VI 1 1 P 4 J P )t lu i mid 11 LI ) n ki 1 1 J_ 1 I 43 3 s huh mid 11

PII 0 M 7 t t IP ) hui ^ ___ J_P lnji_

I II ) VP I 1 11 1 11 P i ) lu h lu h

1 l1 S RI 1 ss 3 3 1 p Pw . 17 J_^ hiji

o s V V 3 )s 1 'S lu h 1 w

o 3 7 rn 1 Sil P 7(0 p nu Idl 1 \s e U' _ 93

3.3 Results of limited exoproteolysis

The exact C-terminal domain boundaries were defined with carboxypeptidases. The truncated TFIIF-complex with the longest C-terminal extensions was used: RAP74(2-202)/RAP30(2-151). The complex was digested with carboxypeptidase P, carboxypeptidase Y, carboxypeptidase A and carboxypeptidase B. First, RUNNING- and STOPPING-assay were used to get an impression of the activity of the carboxypeptidases and of useful inhibition conditions: Carboxypeptidases A and B were active at pH 7.5 and room temperature. They could be inhibited using 50 mM EDTA and 50 mM o-phenanthroline in 50% ethanol as inhibitor stock solution, followed by addition of 50% (w/v) solid urea. Both carboxypeptidase Y and P had very low activity at pH 7.5 and room temperature. Buffer conditions had to be changed to match the optimal pH-range for those enzymes, which is pH 4.6 for carboxypeptidase P and pH 6.3 for carboxypeptidase Y. Incubation temperature had to be increased to 37 °C. Under these conditions Pefablock (10 mg ml⁻¹) did not significantly inhibit carboxypeptidase Y or P. 1% (v/v) diisopropylphosphofluoridate (DIPF) had to be used as inhibitor stock solution instead.

Mixtures of carboxypeptidases were used in order to overcome residues that were not efficiently released by one or the other enzyme. By comparing between digestion patterns from various reaction conditions, six fragments were defined that were common to most reactions. These bands were labeled from i - vi and three conditions were chosen for their identification by N-terminal sequencing and HPLC-MS-analysis. In the following discussions the individual bands will be referred to as labeled below (Figure 12).

Figure 11: Initial RUNNING- and STOPPING-assays with carboxypeptidases.

The STOPPING- and RUNNING-assay were performed as described in section 2.5.6. Reactions contained RAP30(2-151)/RAP74(2-202) at [...] mg ml⁻¹ and were incubated at the temperature indicated. Inhibitor incubation was at room temperature.

[Gel images: STOPPING-assay and RUNNING-assay panels for carboxypeptidases A, B, P and Y (concentrations in µg/ml, with and without inhibitor); marker lanes LMWM.]

Figure 12: Comparison of various reaction conditions for exoproteases.

RUNNING-assay was performed as described in section 2.5.6. Reactions contained full length TFIIF at 1.5 mg ml⁻¹ (30 µg) and were performed under the conditions indicated. Standard for the RUNNING-assay was the [...] lane of Figure [...]. Conditions chosen for further investigation are labeled with a black dot.

[Gel image: lanes YAB, PAB and P; parameters indicated per lane: carboxypeptidase A, B, Y and P concentration (µg/ml), reaction time (h), reaction temperature (°C), pH; marker lane LMWM.]

Digestion patterns obtained with carboxypeptidase P and a mixture of carboxypeptidases A, B and P are very similar. This is due to the low activity of the pancreatic enzymes in the acidic reaction buffer for carboxypeptidase P. The reaction products were analyzed at two time points: after one hour with carboxypeptidase P and after 18 hours with carboxypeptidases A, B, and P. Carboxypeptidase P first releases 10 amino acids from the C-terminus of RAP74(2-202) to yield P1 (label i). Then there is a second pausing site around the residues 165 and 166 which leads to band P2 (label iii). The lower molecular weight fragments in this reaction could not be identified. After longer reaction times (18 h) a mixture of carboxypeptidases A, B and P yields a simple four band pattern. The band with label vi is in fact a double band of RAP30(2-119) (PAB1) and RAP30(2-113) (PAB2). The two low molecular weight bands PAB3 and PAB4 are RAP74(2-60/61/62/63/64/65/68) and RAP74(85-144?), where PAB4 must be the product of a contaminating endoprotease. The digestion pattern obtained with a mixture of carboxypeptidase A, B and Y shows four major bands: the two longer fragments corresponding to label iii and iv are RAP74(2-165) (YAB1) and RAP74(2-154) (YAB2). The two shorter fragments YAB3 and YAB4 are debris of RAP30 but their C-termini could not be identified. However, by comparison with the standard chymotryptic digestion pattern approximate molecular weights can be given. The proteolysis results are summarized below (Table 9).

Table 9: Results of limited exoproteolysis of RAP30(2-151)/RAP74(2-172) with carboxypeptidases.

Confidence levels are defined as described in Table 8.

Ccmlultiict Cetil aient t

I the I RAP Nluni C Turn M MS c I N l'a minus C lei minus

i Pi /4 A2 LP): 224s2 22440 11 76 OOS'/e hph hph

it 74 A^ d ro j n d ,ul

in \ABl 74 A2 K.16Q J9pp 19ss4 6 S 6 0 oSPo 17 tl middle

" m \ \B1 74 1 A2 MdS PPS-s 194s6 69 0 V/r n d middle 7 m P2 74 \: R166 PPJ4 191 Ss 2S 6 0 1 SP , hph huh

1 tu P2 ""4 \2 Flcss 1904 J 9029 P 0 07,/ , [__ ^ISÎL— high iv YAB 2 74 \2 ris4 [""707 Pl 10 c q_ 1 OOoP lid middle

v M) YABs \2 el pS n d uLd

vi > \B4 AP el rut 7ti ~ 1_20_

vi Ps TO A2 ti 120 n d /tl 1 vi PVBli of) \2 MP) I2ôl)s :~oi (7 S OOsp lush huh

vi P\B2 ol) A2 \]J4 ppp 121 h si 1 0 07(7 h 1,7 h h /'i

o EM) 6S09 677; 0 0 S4< c P YBy oOj \2 SfvQ 7dOS 0 9 0 01CA hi ah JiuAi PAB 4 M) keS9 ^ 144 67 p Ciss9 3(7 0 09% Pgn It w nil not det t mined but dented ft mi ti 11 its ube-ctp j j iii

For the exact definition of the N-terminal domain boundaries with aminopeptidases, RAP30(2-151)/RAP74(2-172) was digested with porcine aminopeptidase M and aminopeptidase I from S. griseus. N-termini of the digestion products were analyzed by Edman sequencing. After long reaction times aminopeptidase M released the first two alanine residues from RAP74(2-172) while aminopeptidase I from S. griseus (10 µg ml⁻¹) also cleaved off the following leucine residue. The N-terminus of RAP30 was protected from both enzymes. Both aminopeptidases used contain significant amounts of endoprotease contaminations, which led to additional lower molecular weight bands on SDS-PA gels (data not shown).

3.4 Design of crystallization constructs based on the approximate domain structure of human TFIIF by endoproteolysis.

Analysis of limited endoproteolysis products of TFIIF led to a detailed understanding of the digestion processes of RAP74 and RAP30 in the context of the TFIIF-complex. Alignment of proteolytic fragments and previously defined functional domains of TFIIF-subunits shows close correspondence, supporting the idea that TFIIF-subunits are composed of three structural domains each (Figure 3).

The RAP74 N-terminal domain includes approximately the first 200 amino acids, corresponding to fragments C5 and E6. With longer reaction times these are cleaved into smaller pieces (C11, C12), indicating some flexibility around residues 70 to 85 in the RAP74 "arm domain" (Section 6.4.2). The N-terminal domain is released from larger fragments (C2, E2) that also include the central domain of RAP74 from residue 220 to 360. Once this central domain has been removed from C2 or E2, it remains resistant against further chymotryptic digestion because it has no more large hydrophobic residues. These constitute the hydrophobic cores of tightly packed domains, supporting the hypothesis that the central domain C3 is unstructured. Elastase cuts the central domain from E2 via the metastable intermediates [...] and E4. Debris of the central domain, though, is not found in the reaction mixtures, indicating again that it is not well structured and therefore readily digested to small peptides¹²⁴. For many studies with RAP74 C-terminal domain, constructs similar to C6 or E7 have been used. Upon N-terminal truncation these constructs lose their respective activities, so structural and functional domain boundaries coincide in this case (amino acids 360-517). With longer reaction times the C-terminal domain is digested further to smaller pieces (C8, C10, E10, E12, E14).

The RAP30 N-terminal domain includes roughly the first 120 amino acids, corresponding to the chymotryptic fragment [...]. The other half of RAP30 (C7) with central domain and C-terminal domain is quite resistant against further digestion. Only after long reaction times part of the C-terminal domain appears in band C11 (150-249). Elastase cuts right after RAP30 central domain (residues 120 to 150; E8, E9, E11). Again these early fragments are quite resistant against further digestion with elastase. But after longer reaction times a shorter fragment of the N-terminal domain E8/E9 appears in band E13, indicating some flexibility around residue 73 in the RAP30-loop (Section 6.4.2).

However, when interpreting limited endoproteolysis results in terms of domain boundaries, special care must be taken for three reasons: First, all proteases show some sequence specificity. Therefore, the positions of the cutting sites depend not only on the fold of the protein backbone, but also on the sequence, i.e. the presence of specific cutting sites. The minimal domain boundary may be anywhere between the actual cutting site and the next potential cutting site towards the domain core which is protected. Second, proteases do not only cut hinges between or flexible tails at the end of compact structural domains, but also flexible or transiently unfoldable sequences within domains. Therefore proteolytic fragments do not necessarily correspond to entire structural domains. An example for this are the N- and C-terminal regions of the large subunit of human TFIIA, which contribute to a common structural domain although they are separated by a large loop which is readily cleaved by proteolytic enzymes³⁷. Third, all proteases have limited access to their substrates due to steric hindrance and thus will leave behind short tails at the termini of the proteolytic fragment as compared to minimal structural domains. In order to define structural domains and choose sensible fragments for crystallization experiments, endoproteolysis results have to be complemented by functional studies with the putative structural domains and multiple sequence alignments (Figure 3). Based on such combined analyses, a first set of N-terminal RAP30 and RAP74 constructs was designed for crystallization trials with RAP30/74-complexes (Figure 13).
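The first of the three caveats above can be expressed as a small search: given an observed cutting site, the next potential (but protected) cutting site towards the domain core delimits the window in which the minimal domain boundary may lie. The sketch below illustrates this. The sets of preferred residues per protease are simplified assumptions, the sequence is a made-up toy example, and the function name is hypothetical.

```python
# Simplified, assumed P1 preferences (cleavage after these residues).
P1_RESIDUES = {
    "chymotrypsin": set("FYWLM"),
    "elastase": set("AVSGLI"),
}


def boundary_window(seq, cut_after, protease, core_side="N"):
    """Return (next_protected_site, observed_cut) as 1-based positions.

    cut_after: residue after which the protease was observed to cut.
    core_side: "N" if the domain core lies N-terminal of the cut
               (C-terminal domain boundary), "C" for the opposite case.
    The first element is None if no further potential site exists.
    """
    sites = [pos for pos, aa in enumerate(seq, start=1)
             if aa in P1_RESIDUES[protease]]
    if core_side == "N":
        inner = [s for s in sites if s < cut_after]
        nearest = max(inner) if inner else None
    else:
        inner = [s for s in sites if s > cut_after]
        nearest = min(inner) if inner else None
    return nearest, cut_after


# Hypothetical toy sequence: W3, F9 and M12 are potential chymotrypsin sites.
TOY = "GKWDEEKNFDRM"
print(boundary_window(TOY, cut_after=9, protease="chymotrypsin"))
```

In this toy case the boundary of a domain whose core lies N-terminal of the cut at position 9 could be anywhere between positions 3 and 9, which is why the proteolytic data are combined with functional studies and sequence alignments.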

Design of KAP74-constructs for crystallization screening

The C-terminal boundary of the RAP74 N-terminal domain has not yet been unambiguously defined by functional assays. It varied between residue T158 and S217 in the numerous studies, depending on the functionality under investigation (Figure 3). This is indicative of the presence of several loosely packed (secondary) structure elements that constitute the C-terminal part of the RAP74 N-terminal domain. Multiple sequence alignments in this region of RAP74 show a stretch of high homology between Q96 and K194, which is followed by hydrophilic insertion sequences of the RAP74 central domain spaced by short conserved patches like R199 to D209. There are two hypercharged patches in this sequence that are probably located at the protein surface111. The first spans between F157 and R167. The second is between D18R and K201 (Figure 13). The presence of these hydrophilic sequences indicates that the (secondary) structure elements in this region are separated by solvent exposed loops and may therefore be loosely packed. Secondary structure predictions are consistent with amphipathic α-helices from T156 to N168 and from EIS1"1 to D209. The second α-helix is very unlikely because of the insertion sequences in the multiple sequence alignment of this region (Figure 13). Elastase cuts RAP74 after amino acids A202/S203, and the first protected elastase site towards the core of the N-terminal domain is L182 (Figure 13). This indicates that the N-terminal domain of RAP74 ends anywhere in between these two positions, or even further out if elastase is cutting a loop and sequences up to D209 or F215 contribute to the domain structure. Since the functional role of residues beyond E205 was unknown at that time, the elastase fragment RAP74(2-202) (E6) was chosen for crystallization trials because it is similar to RAP74(1-205), which is necessary and sufficient for minimal transcription (Section 1.3.1). A second crystallization construct was designed based on the chymotryptic fragment RAP74(2-174). The closest protected chymotrypsin sites are W164 and M177. Therefore, RAP74(2-174) represents a decent definition of the RAP74 N-terminal domain, unless the protease cuts within a loop between secondary structure elements. Since there is a hypothetical α-helix to N168, the choice of C-terminus is restricted further to N168 to F174. There is H173 close to the C-terminus. This is not ideal for crystallization trials under physiological conditions because the histidine protonation equilibrium is a source of inhomogeneity. In addition, residues up to E171 had been shown to be necessary and sufficient to bind RAP30. Thus, it was decided to clone RAP74(2-172) for crystallization trials, and first crystals were obtained with this construct.
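The concern about a histidine close to the C-terminus can be made quantitative with the Henderson-Hasselbalch relation. The following is a minimal sketch in Python; the side-chain pKa of 6.5 and the listed pH values are illustrative assumptions, not values from this work, and the pKa of a histidine in a folded protein can deviate considerably.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a titratable group in its protonated form
    according to the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumption for illustration only: a typical histidine side-chain pKa.
ASSUMED_HIS_PKA = 6.5

if __name__ == "__main__":
    for ph in (6.0, 6.5, 7.0, 7.5):
        f = protonated_fraction(ph, ASSUMED_HIS_PKA)
        print(f"pH {ph:.1f}: {100 * f:.0f}% protonated, "
              f"{100 * (1 - f):.0f}% neutral")
```

Under these assumptions a noticeable fraction of the molecules carries a protonated histidine around neutral pH while the rest does not, which is the kind of charge heterogeneity the construct design tried to avoid.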

Design of RAP30-constructs for crystallization screening

The functional N-terminal domain of RAP30 has not been well defined yet. Amino acids 2 to 98 are sufficient to bind RAP7482, while residues up to 152 are necessary to bind TFIIB'"2, indicating that there is more than one structural element (domain) involved. This is supported by the presence of multiple protease cutting sites between E116 and T151 (Figure 13). The region interacts with RNA polymerase II and shows sequence homology to E. coli σ70 region 2, which binds to bacterial core polymerase61. Therefore, an additional central domain around amino acids 120 to 155 has been postulated (Figure 5). Multiple sequence alignments show a high degree of conservation from yeast to human from residue V110 up to the C-terminus, with two short insertion sequences after F122 and Y166 (Figure 13). These frame the RNA polymerase II binding site. The second insertion sequence is followed by a charged patch from F167 to D172. Altogether this indicates a structural break around residue Y166. Secondary structure predictions (GCG 9.5) are in keeping with four α-helices in this region: K103 Q1 12 M12SFIU Ql fs K147 N162 Dl7 7 (Figure 13). The last α-helix is very unlikely because it bridges an insertion sequence. A first construct was designed based on the elastase cutting site at I151. It is surrounded by more potential sites (I150 N152), which stresses the structural basis for this choice of cleavage site (maybe after the third α-helix). Once the peptide backbone is cut in this position, more residues are released from the N-terminus of the C-terminal fragment up to N1S 7, indicating that residues on this side of the I151 site are not structured (any more). So we decided to clone RAP30(2-151), also because this was consistent with the C-terminal end of the central domain. The second construct was based on the chymotrypsin fragment RAP30(2-124). The neighboring chymotrypsin site towards the N-terminus is F96 and lies within the RAP74 interaction domain. The fragment to F116 (C9) has a clear peak in the mass spectrum but is very unlikely (specificity); therefore it was neglected. So the choice was to use RAP30(2-124) as it was, since the exact location of the yeast insertion sequence is not clear. This fragment together with RAP74(2-172) yielded the first crystals.

Figure 13: Design of N-terminal RAP30 and RAP74 constructs for crystallization.

In the sequence alignment identical amino acids are indicated in bold and similar amino acids are marked.

[Figure 13, panels RAP74 and RAP30. Tracks in each panel: secondary structure from the X-ray structure, secondary structure predictions, hypercharged clusters, and the multiple sequence alignment of the human, Xenopus, Drosophila and yeast sequences. The RAP30 panel additionally marks the homology to σ70. Constructs that yielded crystals are marked "yes".]

3.5 Design of more crystallization constructs based on precise definition of domain boundaries by exoproteolysis

Using the first set of constructs for crystallization screening, only the combination of RAP74(2-172) with RAP30(2-124) yielded crystals. The crystals were useless for X-ray crystallographic characterization or data collection because they were too small and could not be separated from the tight oxidation skin they were growing within. Thus, more crystallization constructs were designed based on a second round of limited proteolysis on the existing constructs with exoproteases. These enzymes were used to remove flexible and solvent accessible tails which were interfering with crystallization. Unfortunately, this approach led to additional complications. Exoproteases are very slow at removing terminal amino acids. Drastic reaction conditions with stoichiometric amounts of enzymes and long reaction times had to be used to remove many amino acids from the C- and N-termini at all. Under these conditions endoprotease contaminations of the commercial exoprotease preparations contributed significant activity and therefore precluded unambiguous analysis of the experimental data. It is not clear whether the terminal tails were removed by sequential exoproteolysis, which stopped at the entry points of the peptide backbone into tightly packed domains, or whether the tails were removed by endoproteolysis, which could also occur in loop regions within a domain fold. So exoproteolysis data were combined with the former endoproteolysis results, multiple sequence alignments, secondary structure predictions and structural information from homologous proteins for designing further crystallization constructs (Figure 13).

Design of additional RAP74-constructs for crystallization screening

First, options for changing the N-terminus of the RAP74 constructs were investigated. The N-terminus of RAP74 is accessible to elastase, aminopeptidase M and aminopeptidase I from S. griseus, which manages to release residues up to L4 before it runs into a GP sequence that is generally cut inefficiently by exoproteases168. Therefore it is not clear whether the enzyme stopped because of structural protection or a sequential block. These results indicated that the N-terminus of RAP74 is solvent accessible and enters a tightly packed domain around amino acid V11, which is the first potential elastase cutting site that is protected. This idea was supported by a long insertion in the yeast sequence after V11. Nevertheless, no N-terminal truncation of RAP74 was planned. Finally, the X-ray structure justified the hypothesis, showing no density for the first five residues of the amino acid chain, which enters a β-sheet that belongs to the core of the structure at V11.

Then, the C-terminus of the RAP74 constructs was redesigned. Indications for a shorter N-terminal "core" domain came from functional studies showing that constructs from residue 2 to P159 are sufficient to bind TAFII25008. On the other hand, full single round transcription activity requires RAP74(2-217)7<;, indicating that sequences beyond A202 may be important for proper folding. But this was not known at the time, and searching concentrated on constructs shorter than RAP74(2-202), which had not yielded crystals.

RAP74(2-192) was a prominent metastable digestion product when RAP74(2-202) was digested with carboxypeptidase P. In the sequence alignment E192 lies at the end of the conserved N-terminal domain but is already part of the hypercharged insertion sequence (Figure 13). This may be the proper boundary of the N-terminal domain fold, since between L182 and A202 there is no more potential elastase cutting site. RAP74(2-183) was designed as an additional crystallization construct of intermediate length between RAP74(2-172) and RAP74(2-192). L182 is a potential elastase cutting site, but elastase did not cut there, indicating structural protection of the site. K183 was added in order to have a charged residue at the C-terminus, which may enhance solubility, and to allow for hydrophobic interactions by the lysine methylene groups. Additional constructs shorter than RAP74(2-172) were designed based on carboxypeptidase truncations to E165, K169 or T154 (Figure 13). Additional information about the C-terminal boundary of the RAP74 N-terminal domain came from expression experiments of RAP74 constructs in E. coli (BL21 (DE3) pLysS), where all constructs longer than E165 were contaminated with the same degradation products, RAP74(2-163/164/165) - 1 18 (data not shown). The residues are at the end of the putative amphipathic α-helix. The major breakdown product RAP74(2-165) was chosen for crystallization trials because it included W164, which was protected from chymotrypsin, and because it ends with a charged amino acid, which may enhance solubility. When RAP74(2-202) was digested with carboxypeptidases A, B and Y, the fragment RAP74(2-154) was a prominent digestion product. Since T154 is followed by three hydrophobic amino acids, it was decided to include these and the additional E158 to preserve hydrophobic core packing interactions. Again, a charged amino acid at the C-terminus should prevent problems with low solubility. Because the first crystals containing RAP74(2-158) were big (> 100x150x300 µm) but did not diffract beyond 3.8 Å in their native state, it was decided to clone the actual carboxypeptidase product RAP74(2-154) in order to be as close to the end of a tightly packed structure as possible.

' "' s P 6 s s Giottth conditions Its 33', PUPu- uuious uts k 7 l00m\1Ml,NO mM BPl mCl pll l mM PTl eold 100m pl hangim drop

Design of additional RAP30-constructs for crystallization screening

Since the RAP30 N-terminus is protected from all proteases applied and is well conserved, no changes were planned. Only two more RAP30 constructs with alternative C-termini were designed. Digestion of RAP30(2-151) with carboxypeptidases A, B and Y yielded a fragment slightly larger than RAP30(2-124) (no MS-data). The region is homologous to E. coli

4 Crystallization of the RAP30/74-interaction domains of hTFIIF

4.1 Introduction of a general scheme for crystallization screening

Crystallographic structure determination of biological macromolecules involves first selection and purification, then crystallization screening and development of crystal handling protocols. If well diffracting crystals can be prepared, the process continues with data collection, data processing and phasing. Once interpretable electron density maps can be obtained, model building and refinement lead to the atomic structure of the biological macromolecule1"0-171. Considering the recent advances in molecular biology, protein chemistry as well as data collection and computing techniques, crystallization clearly represents the bottleneck in crystallographic structure determination. Further, crystallization has largely remained an art rather than becoming a science with powerful theoretical foundations. Nevertheless, a general scheme for crystallization screening has emerged1 that goes well beyond mere listings of possible parameters and conditions1"0-1'5. Here an expanded version is presented, including novel techniques in crystal growth160-177 and additional aspects of crystal dehydration181-182 and cryocrystallography18' 18fS (Figure 14).

Prerequisite to any project in crystallography is the formulation of the underlying biological problem. This has no theoretical foundation but is for practical and methodological reasons. Abstracting from a specific structure to be solved to the biologically relevant question reveals many limitations to crystallization screening which are otherwise carried along as implicit premises. This has direct consequences for selection and design of the macromolecules and ligands included in crystallization screening. In the case of protein crystallography, abstract problem formulation may allow great freedom in choosing or sampling among homologues from various species, isoforms and splicing variants, mutations, fusion proteins, domains and ligands (DNA, inhibitors, antibodies). Parallel screening with various macromolecules or complexes representing the same biological problem is probably the key determinant of getting crystals of sufficient quality (resolution, mosaicity, reproducibility, size, stability). Correlations between macromolecular design and crystal formation or crystal quality may be exploited.

Figure 14: General scheme of crystallization screening.

[Figure 14, flow scheme. Stages from top to bottom, each with its goal, its options and the failure that sends the process back to an earlier stage:]

1. Ask a question of biological relevance.
2. Selection and design of macromolecule(s) and ligands (feedback from later stages: cross seeding, correlations).
3. Preparation of macromolecule(s). Goal: covalent and conformational purity. Steps: source of macromolecule, purification I, assembly and refolding, purification II, physicochemical analysis. Failures: insufficient amounts, no complex, inhomogeneities.
4. Initial crystallization screening. Goal: first crystals. Methods: sparse matrix, incomplete factorial, random grid sampling, reverse screening, weighted sampling, solubility curves. Control: visual inspection. Failure: no crystals.
5. Refinement of crystallization. Goal: reproducible, macroscopic, diffracting crystals. Classes: chemical refinement, physical refinement, macromolecule refinement. Control: crystallographic characterization, physicochemical analysis. Failures: no analysable crystals, degradation, complex dissociation.
6. Post crystallization treatment. Goal: stable, well diffracting crystals. Methods: cryoprotection, dehydration, annealing. Control: crystallographic characterization. Failure: insufficient resolution or stability.
7. Data collection and reduction.
8. Derivative screening, phasing, model building and refinement.

Once a particular macromolecule or complex has been chosen for structure determination, it has to be purified in milligram amounts to covalent and conformational homogeneity187-188. The components may be isolated from natural sources or from heterologous expression systems, or synthesized chemically. Purification to covalent homogeneity is followed by refolding and/or assembly of the macromolecule(s). Conformational homogeneity is achieved by a final purification step. The procedure is refined until certain analytical tests are passed: activity assays, gel electrophoresis, mass spectrometry, sequencing, CD-spectroscopy, analytical ultracentrifugation or gel filtration will pick up any problems with inhomogeneities189. Dynamic light scattering analysis takes a special position because a unimodal size distribution can be directly related to a high probability of successful crystallization172-190.

For initial crystallization screening several methods have been proposed. Sparse matrix screens based on analysis of the "experience of many laboratories"191 or preferably on statistics derived from the Biological Macromolecule Crystallization Database (BMCD)174 are now common practice. Single conditions are chosen that have often yielded macromolecular crystals in the past: a method that maximizes the probability of an isolated hit but also has its disadvantages. Most of the macromolecule preparation is wasted on meaningless conditions that might have been excluded for the particular case in advance, and single hits do not yield any information about the solubility curve of the macromolecule176, which becomes important once the sparse matrix screens fail. Other methods therefore start with an experience based list of significant factors for crystallization like precipitation agents, macromolecule concentration, temperature, ionic strength and pH (Table 10)160-176-102. Initial screening then proceeds with two dimensional grid searches around typical values (BMCD) of these parameters. Even if this first set of screens does not yield any crystals, much information about the "precipitation behavior" of the protein is generated. Ideally, coarse solubility curves are extracted from this method, called "random grid sampling" or "reverse screening"1"". In iterative rounds guided by intuition and experience, initial crystallization conditions may be found. It has been pointed out that this is a very inefficient way of screening large multidimensional parameter spaces and that there should be more rational ways of locating the initial crystallization conditions. Mathematical methods have been used to minimize the number of experiments needed to extract all the information for localizing initial crystallization conditions. Unfortunately both methods, "incomplete factorial screening"193 and "weighted sampling"194, are based on the unrealistic assumptions that results from crystallization screening can be rated on a single metric scale and that factors in crystallization are independent. Initial crystallization screening is controlled by visual inspection only and is therefore a highly intuitive and associative process, in which quantitative aspects are only of marginal importance because interdependences cannot be ruled out and observations are rarely comparable or even commensurable (e.g. granular precipitation, spherulites, microcrystals, skins, wrinkles, fibers). If initial crystallization screening fails, thoughtful rescreening may be performed including additional factors like novel precipitation agents, additives, crystallization methods or similar. However, changes in macromolecular selection and design, as well as in their preparation protocols, may be more helpful.
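The two dimensional grid searches mentioned above can be sketched in a few lines of Python. This is only an illustration of how such a grid is laid out; the axis values, the fixed salt concentration and the temperature are hypothetical choices and not the conditions used in this work.

```python
from itertools import product


def grid_screen(ph_values, peg_percent_values, constants=None):
    """Return one condition (a dict) per combination of pH and PEG
    concentration; entries in `constants` are copied into every condition."""
    constants = constants or {}
    return [
        {"pH": ph, "PEG_percent": peg, **constants}
        for ph, peg in product(ph_values, peg_percent_values)
    ]


# Hypothetical axes: 6 pH values x 4 precipitant concentrations
# fill one 24-well plate.
ph_axis = [5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
peg_axis = [14, 18, 22, 26]
plate = grid_screen(ph_axis, peg_axis, {"salt_mM": 100, "temperature_C": 4})

if __name__ == "__main__":
    print(len(plate))  # 24
    for condition in plate[:3]:
        print(condition)
```

Because every well differs from its neighbours in only one parameter, the transition from clear drops to precipitate can be read along each axis, which is what allows coarse solubility curves to be estimated even when no crystals appear.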

Once first crystals are observed, one must be certain that they contain all the macromolecule(s) or complexes in the desired (active) conformation and arrangement (electrophoresis, activity tests, sequencing, mass spectrometry). Unfortunately, initial crystals are often not suitable for this kind of physicochemical analysis, crystallographic characterization or even data collection (size, number, twinning, skins). So growth conditions have to be refined first to reproducibly yield macroscopic crystals. Optimization of crystallization conditions is led only by crystallographic parameters like space group, cell parameters, diffraction limits and mosaicity. Neither size nor morphology are relevant guidelines. Only now can "incomplete factorial screening" or "weighted sampling" be useful, since metric parameters like diffraction limits, crystal number or crystal size are optimized. I propose three classes of refinement parameters to be searched that cannot be precisely separated but have provided a useful heuristic basis for idea generation (Table 10).

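To illustrate the idea behind sampling only part of a full factorial design, the following Python sketch draws a random design in which every level of every factor is used equally often. It is a simplified stand-in for the published incomplete factorial procedures, not a reimplementation of them, and the factors, levels and number of experiments are hypothetical.

```python
import random
from collections import Counter


def balanced_random_design(factors, n_experiments, seed=0):
    """Draw n_experiments conditions so that each level of each factor
    occurs equally often (first-order balance). Levels are assigned by
    shuffling each factor's column independently."""
    rng = random.Random(seed)
    columns = {}
    for name, levels in factors.items():
        if n_experiments % len(levels) != 0:
            raise ValueError(f"{n_experiments} is not a multiple of the "
                             f"{len(levels)} levels of {name}")
        column = list(levels) * (n_experiments // len(levels))
        rng.shuffle(column)
        columns[name] = column
    return [{name: columns[name][i] for name in factors}
            for i in range(n_experiments)]


# Hypothetical factors; the full factorial would need 3 * 3 * 2 = 18 setups.
factors = {
    "PEG_percent": [14, 18, 22],
    "pH": [6.5, 7.0, 7.5],
    "temperature_C": [4, 20],
}
design = balanced_random_design(factors, n_experiments=12)

if __name__ == "__main__":
    for row in design:
        print(row)
    print(Counter(row["pH"] for row in design))
```

Such a design only pays off if the outcome of each setup can be scored on one numerical scale, for example a diffraction limit, which is why it fits the refinement stage rather than initial screening.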
Table 10: Refinement of crystallization conditions (incomplete list of parameters).

Chemical refinement
• Precipitation agent (concentration, kind of precipitation agent, derivatives of precipitation agent, e.g. PEG 400-20 000)
• Salt (concentration, kind of salt)
• Buffer (pH, concentration, kind of buffer)
• Additives (electrostatic crosslinkers, detergents)
• Ligands (inhibitors, substrates)
• Source and purity of all chemicals (purification of PEG)
• Crystal growth under argon or nitrogen

Physical refinement
• Crystallization method (batch, interface diffusion, vapor diffusion (hanging drop, sitting drop), dialysis)
• Containerless crystallization between oil layers (kind of oils, thickness of layers)
• Silicon/paraffin oil mixtures as vapor diffusion barrier (kinds of oils, composition, thickness of layers)
• Vapor diffusion parameters (size of drops, number of drops, slope of gradient)
• Temperature
• Experimental procedure and environment
• Crystallization in gels and on other supports
• Crystallization under micro- or hypergravity
• Crystallization in electric or magnetic fields

Macromolecular refinement
• Concentration and concentration method
• Sequence alterations (mutations, truncations)
• Labeling and derivatization (heavy atom reagents)
• Purification procedure
• Storage conditions
• Handling
• Micro-, macro- and cross seeding

Even with nice macroscopic crystals, data collection may still be impossible or useless due to instability of the crystals in the X-ray beam (synchrotron radiation) or simply due to insufficient resolution. Both problems can be addressed by "post crystallization treatment" of the existing crystals before they are exposed to X-rays.

Great reduction in X-ray induced radiation damage to macromolecular crystals can be achieved by working at low temperatures. This led to the development of cryogenic methods. The primary problems are ice formation, which generally ruins data quality (ice rings, decreasing resolution), and damage that can occur to some crystals during flash cooling even in the absence of ice. The primary problems are suppressed using cryoprotectants and sophisticated flash cooling protocols (slow cooling, temperature steps), which creates more problems since crystals may be incompatible with the cryoprotectants or flash cooling protocols applied. Similarly to initial crystal screening, there is a multidimensional optimization problem to solve. Approaches are similar too; only parameter space looks a little different (Table 11). Refinement is guided exclusively by monitoring diffraction characteristics of the crystals (resolution, mosaicity, cell parameters, space group), which may change in each step of the crystal handling protocol from harvesting to flash cooling. In some cases it has been found that very high concentrations of cryoprotecting agents were not destructive but enhanced crystallographic resolution by dehydration181-182. Refinement parameters for dehydration protocols are similar to those for cryocrystallography. Only concentrations of cryoprotecting agents are much higher than before (Table 11). Even after mounting and flash cooling, resolution limits may be enhanced by macromolecular crystal annealing. Crystals are either flash annealed by short deflection of the cold nitrogen gas stream or thawed in the original cryoprotection buffer and recooled. In some cases resolution and mosaicity improved upon these procedures.

Table 11: Parameters in cryoprotection and dehydration screening.

Cryoprotection and dehydration agents: trehalose, xylitol, glucose, sucrose, glycerol, ethylene glycol, MPD, 1,6-hexanediol, PEG, butane-2,3-diol, erythritol, methanol, isopropanol, ethanol, propylene glycol

Buffer conditions: precipitation agents, salts and buffer may correspond to growth conditions or change; increase in precipitation agent concentration

Reproducibility: make large stocks of solutions and store in aliquots; check by refractive index

Method of gradient: vapor diffusion, dialysis, step gradients, quick exchange

Details of (step) gradient: number of steps, changes/step, time/step, final condition, time in final condition, temperature (profile)

Handling of crystal: fiber loops, capillaries, buffer exchange

Mounting: loops, capillaries (removal of surface liquid with immersion oil)

Flash cooling protocol: gradient, temperature (nitrogen gas stream, liquid propane, liquid nitrogen)

Once macroscopic crystals diffracting to reasonable resolution can be prepared, mounted and exposed to X-rays with acceptable decay times, data collection can be started. Usually the oscillation method is used. Crystal orientation with respect to beam and rotation axes may become an issue when particular circumstances like special crystal orientation, very long real space axes or high mosaicity impair the completeness of data sets. Directional mounting techniques eliminate the problem. Otherwise crystals may be mounted on flexible materials that can be bent if necessary (see below). After data collection and data reduction there may be a last reason to resume crystal screening: (pseudo)merohedral twinning. Theoretically the effect can be compensated for by proper handling of the data, but sometimes, e.g. with low resolution datasets, merohedral twinning leads to serious practical problems.

4.2 Preparation of native crystals and collection of native data

4.2.1 Initial crystallization screening

As described in chapter 3, several RAP30- and RAP74-constructs were designed for refolding and crystallization of the RAP30/74-interaction domains of human TFIIF according to limited proteolysis data, multiple sequence alignments, and mutation data about minimal functional domains. The proteins were subcloned, expressed in E. coli, then purified to covalent homogeneity as shown by SDS PAGE, N-terminal sequencing and mass spectrometry (Section 2.7). TFIIF subunits were recombined for initial crystallization screening following a two dimensional grid screen (Table 12). Out of 50 possible complexes 23 were co-refolded and purified to conformational homogeneity as shown by cation exchange HPLC-chromatography (SP-5PW, TSK, Section 2.7), dynamic light scattering, and analytical gel filtration. After concentration most preparations of RAP30/74-interaction domain were monodisperse in solution (dynamic light scattering, Section 2.7, Figure 36). Some were assayed by analytical gel filtration (Sephadex 200 HR, Pharmacia, Figure 3d). They eluted in single symmetric peaks and showed no signs of aggregation (data not shown). Circular dichroism (CD) spectra of RAP30(2-124)/RAP74(2-165) and RAP30(2-124)/RAP74(2-172) showed that the proteins adopted predominantly β-sheet conformation (data not shown). Since there were no signs of inhomogeneities or instabilities (SDS PAGE), initial crystallization screening was started.

In early stages of initial crystallization screening the material and time intensive "Screen A" was set up using the hanging drop method10"1. Conditions included sparse matrix screening and experience based random grid sampling (Song's factorial screen #4). Later the reduced "Screen B" was employed for new complexes and constructs similar to those that had already been crystallized. Overall, ten different complexes could be crystallized (Table 12).

First crystals were obtained with RAP74(2-172)/RAP30(2-124) using Song's factorial screen #4 with 24% PEG 6000 at 4 °C. These crystals were thin plates (<20x100x400 µm) that showed severe twinning and were sticking tightly to a gluey protein skin over the hanging drops. Crystallization was not easy to reproduce and took 4-12 weeks. Only after several months could single crystals be collected from the coverslips (up to 20x40x80 µm). These crystals were mounted on fiber loops and flash cooled with 30% glycerol as cryoprotectant but did not diffract at all on the SNBL bending magnet beam line at the ESRF.

More crystals were grown with RAP74(2-165), RAP74(2-164) and RAP74(2-163). These fragments were isolated as proteolytic degradation products of RAP74(2-172) and RAP74(2-202) from E. coli inclusion bodies (Section 3.5). Initial crystallization conditions were found using Song's factorial screen #4 with 20-24% PEG 6000 at 4 °C. Crystals grew faster (3-6 weeks) and more reliably than with RAP74(2-172) but were not significantly larger. The crystals were also attached to a tight skin on the surface of the drop. It was impossible to isolate crystals from these skins without severe damage, and the debris was too small for biochemical or crystallographic characterization. Thus, an entire skin containing three plates was flash cooled with 30% PEG 6000 as cryoprotectant and exposed at the SNBL bending magnet beam line at the ESRF. This yielded reflections out to 3.7 Å resolution. Reciprocal spacings corresponding to 70-80 Å real space axes indicated that these plates were indeed macromolecular crystals. With recombinant RAP74(2-165) more trials could be set up. Growing the crystals in sitting drops under nitrogen, it was possible to obtain crystals without the interfering skin (<20x40x80 µm). With 30% glycerol as cryoprotectant they diffracted to 4.6 Å at the ID2 high brilliance beam line at the ESRF.

RAP30(2-119)/RAP74(2-172) and RAP30(2-119)/RAP74(2-165) yielded crystals as well (Song's factorial #4, cold room, 18-24% PEG 6000, 2-2 weeks). They were comparable to the plates described before until PEG 1500 was used as precipitation agent instead of PEG 6000. Now a few crystals up to 80x150x200 µm could be grown. Still, they were sticking to a soft skin over the crystallization drops, but now they could be isolated. SDS PAGE showed that both TFIIF subunits were contained in the crystals (data not shown). Best diffraction patterns of native crystals mounted directly from mother liquor and exposed at 4 °C showed reflections to 3.8 Å. Diffraction was anisotropic, with the worst reciprocal space direction b* diffracting only to 5-4.5 Å.

Using Song's factorial screen #4 with 18-22% PEG 1500, crystals containing RAP74(2-158) grew very fast (15 hours - 2 weeks). First small crystals (20x150x200 µm) with 30% glycerol as cryoprotectant diffracted to 3.5 Å on the ID2 high brilliance beam line at the ESRF. Later, large block or wedge shaped crystals could be harvested that were not sticking to skins on drop surfaces (2-4 weeks). With RAP30(2-124)/RAP74(2-158) they reached dimensions up to 150 x 300 x 900 µm. With RAP30(2-119)/RAP74(2-158) the largest crystals reached 300 x 600 x 800 µm. As native crystals these never diffracted beyond 3.8 Å in the best reciprocal space direction and suffered from anisotropies as described above. Crystals of RAP30(2-124)/RAP74(2-154) or RAP30(2-119)/RAP74(2-154) also grew fast and reliably under conditions similar to those of the other crystals described above (6 days to 2 weeks). Although these crystals were particularly clean and large (up to 250 x 500 x 800 µm), no reflections beyond 3.8 Å could be detected, and the crystals were also suffering from anisotropy.

Table 12: Overview of the crystallization of RAP30/74-complexes and the crystallization screens used (Section 2.8).

[Table 12, matrix. Rows: RAP74(2-154), RAP74(2-158), RAP74(2-163), RAP74(2-164), RAP74(2-165), RAP74(2-172), RAP74(2-183), RAP74(2-192), RAP74(2-202), RAP74(2-217). Columns: five RAP30 constructs, including RAP30(2-119), RAP30(2-124), RAP30(2-151) and RAP30(2-249). Each entry lists the screens (A to E) applied to the complex and the outcome.]

Screen A: 16 plates at 10 mg/ml
• PEG 6000, Song's factorial #4, RT/4 °C
• Ammonium sulfate, Song's factorial #4, RT/4 °C
• Hampton Research Crystal Screen I and II, RT/4 °C

Screen B: 4 plates at 10 mg/ml
• PEG 6000, Song's factorial design #4, RT/4 °C
• pH vs. 14-20% PEG 6000, 1 mM DTT, 4 °C
• pH vs. 14-22% PEG 6000, 100 mM NH4Cl, 1 mM DTT, 4 °C

Chemical refinement C:
• pH vs. %PEG
• pH vs. salt concentration
• pH vs. kind of PEG (PEG 400-20000) vs. %PEG
• Divalent screen
• Electrostatic crosslinkers and additives
• Salt screen
• Crystallization under argon/nitrogen

Physical refinement D:
• Sitting drops (to prevent crystals from sticking)
• Size of drops, number of drops
• Temperature
• Containerless crystallization between oil layers
• Vapor diffusion barriers

Molecular refinement E:
• pH vs. %PEG vs. protein concentration (2-20 mg/ml)
• Micro-, macroseeding
• Purification procedure
• Concentration procedure

4.2.2 Refinement of initial crystallization conditions

All crystals grown so far suffered from the same problems (skins, twinning) and had poor diffraction quality (3.8 Å, anisotropic). Searching for the crucial factor(s) responsible for these shortcomings, parameter space had to be sampled as efficiently as possible (refinement C-E, Table 12). Thus, the searching scheme included several complexes, but new sets of parameters were only investigated for one particular complex at a time. Results were then interpreted and the findings were applied for crystallization refinement of all other complexes under investigation.

The refinement process for the RAP30(2-119)/RAP74(2-172) complex will be discussed as an example for all complexes, because it led to the crystals that were used for structure refinement in the end. Initial crystallization conditions were single 8 µl hanging drops at 10 mg/ml protein containing 50% well solution at 16-18% PEG 6000, 100 mM NH4Cl, 50 mM NaHEPES pH 7.5, 1 mM DTT. Setups were equilibrated at 4 °C. After 4-6 weeks hundreds of microcrystals could be observed, growing in and under an oxidation skin (Figure 15A). It was known from other complexes (e.g. RAP30(2-124)/RAP74(2-172)) that refined conditions with PEG 6000 as precipitation agent led to somewhat larger but heavily twinned plates that were still sticking to this tight skin. Crystals without skins could be obtained by three methods. First, by growing crystals in hanging drops under argon/nitrogen atmosphere. Second, by growing crystals in sitting drops under silicon oil (12 Pa s). Both conditions had been developed with other complexes (e.g. RAP30(2-124)/RAP74(2-165)) where they led to brittle, thin and twinned plates (Figure 15B). The third method was using low molecular weight PEG (PEG 1000-2000, PEG 2000 MME) as precipitation agent. This not only removed the disturbing skins, but also reduced nucleation and yielded block- or wedge-shaped crystals of macroscopic size. MALDI-TOF mass spectrometry showed that both TFIIF subunits were contained in these crystals without proteolytic truncations (data not shown).

Figure 15: Refinement of crystallization conditions for RAP30(2-119)/RAP74(2-172).

[Figure 15 panels: (A) hanging drops, 8 µl, 10 mg/ml RAP30(2-119)/RAP74(2-172), 50% well solution, with 16% and 18% PEG 6000 in the well; (B) sitting drop under silicon oil (12 Pa s); (C) hanging drops with PEG 1500 and 100 mM of different monovalent salts (LiNO3, NH4Cl); (D) hanging drops with purified PEG 1500 at varying protein concentration; (E) hanging drops with MgCl2 added or with a paraffin/silicon oil layer on the well.]

The largest crystals were grown with 100 mM NH4Cl, KCl, and LiNO3 (Figure 15C). Ammonium chloride was chosen for further optimization rounds with three-dimensional cube screening. The effect of PEG 1500 concentration was assayed versus various buffer conditions and three hanging drops of varying protein concentration per cover slip. Starting with 5 mg/ml protein and 13.5-15% PEG 1500 in a drop at pH 7.0 and 18-20% PEG in the well was the optimal condition. Purified PEG 1500 yielded larger and cleaner crystals than technical grade PEG (Figure 15D). Conditions could be further refined using either 5 mM MgCl2 in addition to 100 mM NH4Cl or an oil layer as vapor diffusion barrier on top of the well solution (500 µl paraffin/silicon (12 Pa s) oil = 1:1, suspension) (Figure 15E).

Figure 16: Crystals of RAP30(2-119)/RAP74(2-154) and RAP30(2-119)/RAP74(2-158) and alignment of the unit cell axes b and b* with crystal morphology.

[Figure 16 panels: (A) crystals of RAP30(2-119)/RAP74(2-154) and RAP30(2-119)/RAP74(2-158); (B) alignment of the low-resolution reciprocal space direction b* with crystal morphology.]

Using similar protocols, crystal size could be improved for other complexes as well and skin formation was no longer a problem (Figure 16A). Diffraction limits, however, were insensitive to all refinement parameters. Variation of additive salts with crystals of RAP30(2-124)/RAP74(2-158) is given as an example for these efforts. All crystals had decent size (about 100 x 150 x 300 µm) but only minor effects on diffraction limits could be found (Table 13).

Table 13: Variation of the diffraction limits of RAP30(2-124)/RAP74(2-158) with additive salt.

Ciucil ti 1 1 un 111 70 PPO 000 MVtt H ) niVlI i\0 M mVf Bi In ( 1 pll ( oui niontP duecly (torn tinth t liqti it

I \] m \t i it I t I i t luui X i it « m »i iKi ikuRt O totni m I n iitouuth If built nunc i

Die aient salt lesolution

limits

7 s y

10 mM St Ci ^s V

lOmMBiC 1 ot V

lOmMMXl , s y

10 mM C sC ! -i y

lOmMRINO o -o \

Since no drastic effects on resolution limits could be found with variation of growth conditions, screening proceeded to post-crystallization treatment of the crystals. First, growth conditions for the starting material were standardized (Table 14). Large volumes of stock solutions were prepared and stored in aliquots, and PEG solutions were adjusted according to refractive index.

Table 14: Final crystallization conditions for RAP30/74-interaction domains of human TFIIF.

1 C implex RAP"4(2 h4)/RyPW(2 119) ti RAP71 PWR\lP0(2 IIQ)

Di op I hops cctu slip S til it lOnuml stump iP) ll pm died PFC. l-iOO sO mM(LiV) IiClkV) NTH CI \H NO) "smMBis l iu pH 6 ()"'() [snAFMozCl C tCl ] 0 s mM Dil 0l2smMFDl\

- pH S 0 s mM K( I 0 7s nAl Bu lus rH 0

Well IS 22 piimitdPiC, P0Ü 00 mM mon v de nt s il ( if) mM div dent silt] 30 mM Bis TnspH < O P) ' niMDTl

^ It 7 *> F qillllbl ltletl Setup RI Gt W ill in tilt ^ old l m Hlttest iflu Weeks

C O lipleX R\P"4 2 7VRVPP) 2 P91

Pic I 7 dtops u\el slip S ul it s nu mi sititiiu itpuulied h "> IsPr PEO hOO "s niMl iNO 72 s mMBis FupH s S 70

^ ()7sm\lün 0 l2sm\ltDTVpHS0 s m\t KC I 0 ^mMRu hujH 0

^ Well IS 20 < puttied PFC. hf)0 lOOmMltNO s() mM Bis Ins pH s 8 0 ImMDIT

SOOul Siluon Oil ( 12 Pis) Pnülinoil-1 lctitoj c 1 well st lull n

7 ition it m the s est I quihbi Setup RT Giov-th Id loom H in utu . weeks 119

After preliminary trials with cryoprotectants, the search focused on three crystal forms containing RAP30(2-119) and not RAP30(2-124), because the crystals with the shorter RAP30 construct seemed to be more stable with respect to cryoprotecting agents. In order to detect any changes to crystallographic parameters (resolution, mosaicity) due to post-crystallization treatment, the native crystals were well characterized (Table 15). All space groups were C2 with similar cell parameters.

Solvent content was between 31.4 and 33.5%, corresponding to Matthews volumes between 1.79 and 1.85 Å³/Da171. The low-resolution reciprocal space direction responsible for anisotropic diffraction was along the b* axis, which coincides with the crystallographic two-fold b-axis and runs approximately along the medium morphological axis of the rhombic crystals, indicating poor crystal packing in this direction (Figure 16B).

Table 15: Crystallographic characterization of crystals used for post growth treatment screening.

XiivuuitL Rmku RI P'Uotittn inoik 'etientoi / Plh PvPhtistiiK nirnoi hot 7 Kputtiu tP3h/ u I t

( usnl loim RVP74 2 ls4,/R\PoO(2 lUHnitiM spue çioup C2 Cell SS SOS P) 2 "A SS }S2 0() 116 29s 00

Resolution is t 7 \ i 0 80 7 ~0 P TJnique lellections 272" (I87i

Completeness 00 IP S4 <3< « ) Rhn 10"") t2"4P) Va It 0 (oO) Amsottopy lesolulion limit lions b" 4 s y Mosaicity 1 1 unit cell toltime 28186^6 \ (HKP, Solvent content *5 2uo (lOOPi

NC S no NCS

Civstal tonn R\P74(2 ISSVR4P¥)(2 110) mine

Spa^e tuoup C2 Cell 0 1 ()4S 10 OU 00 s88 00 116 P2 00

Resolution p 0 0 y (4 04 -7 00 y) Unique toilettions 260 (2sP Compltecness 102c (10 20 Rim 10', (H 2fP

I/O 12 6 (o 6) Amsottope lesolution limit alone b" 4 s A Mos licite 0 6"

Unit eell eolunie 27472-7 s y (KXPP

Soltuit 1. onle.nl P 41 (J009O

NC S m NCS 120

Civsril tonn RAP7412 172)/RAPW(2 119) indu

Spiet cioup C2 Ctll SS 48S ^9 727 88 8SÎ 90 I IS SS4 90

Resolution is 7 6 y {Y7() o60 \ Unique relltctions 27^1 (244i Compk nt ss i- p (9-s 4P)

Rhn 4 / < (2~9 ) Fg 20 6 (2 X) Anisotrope lesolulion limit iloiu I 4^ \ Mosaictlv 0" unit tell volume 7So^74s\ j(X)<

Solvent ce nient SP KW-

NC S n NCS

4.2.3 Cryoprotection screening and flash cooling

In order to reduce the amount of crystals that would be used up in searching for convenient cryoprotection conditions, minimal concentrations for various cryoprotectants in combination with a typical harvest buffer were determined. For this purpose, cryoprotection solution was taken up in a 0.2 mm capillary and flash cooled in liquid nitrogen. Capillaries were exposed for 15 minutes at -170 °C in a dry nitrogen gas stream. The X-ray source was a Rigaku RU200 rotating anode generator with home-made mirror boxes. The three ice rings at 3.4 Å, 3.6 Å and 3.8 Å were monitored with MARVIEW. Ice ring intensities were qualitatively assigned to four classes: clear, weak, middle, strong. Minimal cryoprotection was defined as the concentration of cryoprotecting agent that reduced the dominant ice ring at 3.6 Å to a weak-middle signal (Table 16).
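Where an ice ring of a given lattice spacing falls on the detector follows from Bragg's law. The sketch below is an illustration only: it assumes Cu Kα radiation (1.5418 Å) from the rotating anode and a flat detector normal to the beam, and the function names are hypothetical rather than part of any program used in this work.

```python
import math

CU_K_ALPHA = 1.5418  # Å; assumed wavelength (Cu K-alpha), not taken from this passage


def two_theta_deg(d_spacing, wavelength=CU_K_ALPHA):
    """Scattering angle 2*theta in degrees for a lattice spacing d (Å), from Bragg's law."""
    s = wavelength / (2.0 * d_spacing)
    if not 0.0 < s <= 1.0:
        raise ValueError("spacing cannot be observed at this wavelength")
    return 2.0 * math.degrees(math.asin(s))


def ring_radius_mm(d_spacing, distance_mm, wavelength=CU_K_ALPHA):
    """Radius of a powder (ice) ring on a flat detector placed normal to the beam."""
    angle = two_theta_deg(d_spacing, wavelength)
    if angle >= 90.0:
        raise ValueError("ring does not intersect a flat detector")
    return distance_mm * math.tan(math.radians(angle))
```

Calling `ring_radius_mm(d, distance)` with an ice ring spacing and the crystal-to-detector distance gives the radius at which the ring should be looked for on the image.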

Table 16: Minimal effective concentrations for cryoprotectants.

()"c PFG 6000 20 rr Pl (t 6000

Ci y ©protectant -196 C -196 C 1 Glvut il 20Pf 16

MPD 18V S

I Ihvlent olveol 249 10

PFG 400 "CA w- 1 6 Htxtntdiol 2S( 10

Tlellllose ,ot C

Xvlttol ol pt

Suelt Si 2S IS Pl(, P00 20

P1GMMF 2000 "V-

PI C. 6000 26f

PEG \1MF2000 20(

PI Cr 4000 40<-

PPG J00(K) 40'

* J 0( mM -Writ 1 of) mVl BT pH t loop 100 mM Kc 1 P m\l BT pH C 121

Similar tests were performed for passive slow cooling conditions. Critical for slow cooling experiments is that the cryoprotection buffer does not freeze at the final temperature of the slow cooling gradient. In preliminary experiments, approximate freezing points and safe temperatures for a series of cryoprotection systems were determined. For this purpose, 1000 µl of the cryoprotection buffer were placed in a 1.5 ml Eppendorf tube and a thermoelement was inserted through the lid. Ice formation in an acetone/dry ice bath (-75 °C) was monitored by eye (Table 17).

Table 17: Minimal temperatures for slow cooling gradients.

Harvesting buffer Cry oprotection First ice obsen ed No ice observed (Approximate (Safe temperature) freezing point) 18', PFG 151X1- 15P-MPI) 15 ' C 12° C

' 189rPIG 1500" ISP Glvcetol 18r c - 16 C 18P PEG 1500- ISP Ethvlene dvcol 10' c 18° C

18 f'( PEG 1500" ISP 1.6-Hexanediol - 17 C - 16° C

° 189i PEG 1500- 15P PEG 400 12 C - n c

IS'/rPEG 1500- 2^P Xvluol 16 C - 14° C 18PrPEG 1500"- 25P Succiose F C IP C

2scf PEG 15(XP Ise-Ttehalose 11 C - 0° C

1 lOOmM M14CI lOniMC.iCU P m\l BiPnPi pi I 3 u

For the actual cryoprotection experiments, crystals of RAP30(2-124)/RAP74(2-158) were transferred to a harvest buffer identical to growth conditions (18% PEG 1500, 100 mM NH4Cl, 50 mM Bis-TrisCl pH 6.5, cold room). Every 5 minutes crystals were transferred with a fiber loop to solutions of increasing PEG concentration from 18%, 20%, 22% to 24% PEG 1500. Then the cryoprotectant concentration was raised in 5% steps every 5 minutes. After overnight incubation in the final buffer, crystals were checked under the microscope. Those without visible cracks were mounted on fiber loops and flash cooled in liquid propane at -120 °C. Passive slow cooling was done in parallel by placing the crystal dishes in an insulating Styrofoam box with 3 kg aluminum blocks as heat reservoirs. The Styrofoam box was then placed in a -20 °C cold room and passive slow cooling was monitored with a thermoelement taped to the aluminum blocks. At the appropriate temperature (Table 17) the box was taken to the 4 °C cold room and crystals were mounted as described above.
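The step gradient described above can be written down as a simple transfer timetable. The following is a minimal sketch, not the protocol sheet actually used: the function name and the default final cryoprotectant concentration of 15% are assumptions chosen for illustration.

```python
def transfer_schedule(peg_start=18, peg_end=24, peg_step=2,
                      cryo_final=15, cryo_step=5, minutes_per_step=5):
    """Return (minute, %PEG, %cryoprotectant) tuples for a step-gradient transfer.

    PEG is raised first in peg_step increments, then the cryoprotectant is
    raised in cryo_step increments; one transfer every minutes_per_step.
    cryo_final=15 is an assumed example value.
    """
    steps = []
    minute = 0
    # Phase 1: raise the PEG concentration without cryoprotectant.
    for peg in range(peg_start, peg_end + 1, peg_step):
        steps.append((minute, peg, 0))
        minute += minutes_per_step
    # Phase 2: keep PEG at its final value and raise the cryoprotectant.
    for cryo in range(cryo_step, cryo_final + 1, cryo_step):
        steps.append((minute, peg_end, cryo))
        minute += minutes_per_step
    return steps


schedule = transfer_schedule()
```

With the defaults, the last entry of `schedule` is the buffer in which the crystals would then be left overnight.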

Table 18: Cryoprotection experiments with RAP30(2-124)/RAP74(2-158).

Crystals were exposed for 1 hour at -170 °C in a dry nitrogen gas stream. X-ray source was the Rigaku RU200 rotating anode generator with self-made mirror boxes.

Crvopiotectant icso Vpptaiantt of îeflettions limit

ISG GFeUOk ~40 A" sue \kv spots ISP PEG 400 * * \ nice spots ISP Iidnl^e l s a, lUec spots

Is', Fthvlene «Ivcol o SA nue spots IS'r 1 6 HtVUltt iol 40 y nue spots 1S<< MPF 4o y et uks elVstlls sttelkv spots isp \vloot 40 y stte ikv spo's 1S'( SlktOsC 4 o y stle ikV spots lsp Glvceio 73 y less silt ikv spots -l slow eOOlllljpO 10 C

Generally, crystal quality deteriorated using cryoprotectants. Either the high resolution limit fell below 3.8 Å or the spots were smeared out. Only trehalose and ethylene glycol had no negative effects on the visual appearance of the reflections and on resolution. 15% PEG 400, however, did enhance resolution (Table 18). Since PEG 400 had been used not only as cryoprotectant but also as dehydration agent181, more dehydration experiments were planned using various PEG preparations (PEG 1500 to PEG 10000).

4.2.4 Dehydration screening

In a first experiment on dehydration, crystals of RAP30(2-119)/RAP74(2-158) were transferred from a harvest buffer containing 18% PEG 1500 to final dehydration buffers containing 40% PEG of various molecular weight ranges (PEG 4000-10000). Crystals were transferred between solutions with a fiber loop (step gradient) and incubated in the final buffer at 4 °C overnight. For analysis, crystals were mounted on fiber loops and flash cooled in liquid nitrogen. While crystals were severely damaged in 40% PEG 4000 and 40% PEG 10000, medium molecular weight PEG preparations (PEG 6000, PEG 8000) had no such drastic effects. Cracks that formed at intermediate PEG concentrations along the morphological faces of the crystals partially reannealed in the final dehydration buffer, indicating an ordered dehydration process. Resolution was extended from 3.8 to 3.2 Å, but spots were split and smeared out (Table 19). Whether anisotropy was still a problem was not addressed at this stage. In any case, dehydration conditions had to be refined in order to avoid the unwanted side effects just mentioned.

Table 19: Evaluation of different kinds of PEG for dehydration.

Oiowth eonditi u oi hirvc t bullei 13 PFC 1700 100 mM NHiNOr 10 mM CiCI ^0 mVl Bis Tu ( 1 pH ( s 1 mVl nil D hv Intnn buffei WO 1 )0 mM Ml XO ( 10 mM Ul 30 mVI Bl* hi Cl ] II ( rxieeui wis 1 h tu m i di\ mtn n i tti im X i it touu m th Ki ikti R > n nt i with « It mil nun i b x

PEG Its Vppt 11 lllet 1 le tleetlt lis limit

* 40f PI G 4000 * v et leks UVstll lie IVlK split

40e PPG 6000 }4 \ t, Ktes e \stll spill spots 40% PFG 8000 ^2 y uks e \stll split sp op 40% PI G 10000 It silt Vs eiVstll e ompletelv

Slow transfer from 18% PEG 1500 to 40% PEG 6000 in small steps was tried on crystals of RAP30(2-124)/RAP74(2-158). Crystals did not crack any more, but the high resolution transition was still not observed. Positive controls with the fast transfer protocols were successful with these crystals, but the spots were split and streaky again.

More sophisticated transfer protocols were tried with crystals of RAP30(2-119)/RAP74(2-158) and RAP30(2-124)/RAP74(2-158) in parallel in order to isolate differences between these two crystal forms. For the vapor diffusion protocol, crystals were harvested into 10 µl drops of 18% PEG 6000, 100 mM NH4Cl, 50 mM Bis-TrisCl pH 6.0 on micro bridges in 24-well crystallization plates. Well solutions were 500 µl of 40%, 45% or 50% PEG 6000 (no buffers, no salts). Drops were equilibrated at 4 °C for 2 weeks under argon/nitrogen to prevent the formation of a PEG skin. (This vapor diffusion barrier prevents complete equilibration!) Finally, the crystals were mounted on fiber loops and exposed for 1 hour in a dry nitrogen gas stream at -170 °C. X-ray source was the Rigaku RU200 generator with self-made mirror boxes. Crystals with RAP30(2-124) cracked under all three conditions tried, while the crystals with the shorter RAP30 construct looked fine in 40% and 45% PEG 6000 and gave unsplit but streaky spots. Only 50% PEG 6000 cracked these crystals.

Dehydration by dialysis was tried as well. Crystals of the same two complexes described above were transferred to harvest buffer (18% PEG 6000, 100 mM NH4Cl, Bis-TrisCl pH 6.0) in 50 µl dialysis buttons. The buttons were sealed with 10-14 kDa MWCO dialysis membrane and first equilibrated against 20 ml dehydration buffer at 25% PEG 6000, then against 20 ml at 35%, 40% or 45% PEG 6000. Dialysis buttons were fixed to the walls of the polypropylene tubes with plastic screws so that the dehydration buffer could be vigorously stirred without damaging the crystals. Again, crystals with RAP30(2-124) cracked into small pieces under all conditions tried, while crystals of the slightly shorter complex had only minor cracks and diffracted to 3.4 Å, showing split and smeary diffraction patterns with 40% and 45% PEG 6000. At 35% PEG 6000 cryoprotection was no longer efficient: ice rings were visible and crystals diffracted only to 4 Å.

Since crystals containing RAP30(2-124) had been less stable towards dehydration conditions, further investigations concentrated on crystals of the somewhat shorter complex RAP30(2-119)/RAP74(2-158). For refinement of the step gradient protocol, crystals were harvested from the hanging drop (18% PEG 1500, 100 mM LiNO3, 50 mM Bis-TrisCl pH 6.5, 1 mM DTT) to a first dehydration buffer (18% PEG 6000, 100 mM NH4Cl, 50 mM Bis-TrisCl pH 6.0). The type of salt was changed on purpose, since most dehydration experiments had been performed with ammonium chloride in the dehydration buffers. This detail became more important later. The crystals were transferred between dehydration buffers with a 0.5 mm capillary following the scheme shown below (Table 20). Along this gradient, crystals did not crack visibly, but some showed band patterns under the polarization microscope. While crystals were incubated overnight at 4 °C in the final dehydration buffer, some cracks formed perpendicular to the longest morphological axes. The larger crystals could be split into several sections along these cracks with a micro scalpel. These pieces were mounted on fiber loops and flash cooled in liquid nitrogen. Exposures were for 1 hour in a dry nitrogen gas stream at -170 °C. The X-ray source was the Rigaku RU200 rotating anode generator with home-made mirror boxes. On screening through dozens of these crystal pieces, some showed clean diffraction patterns to 3.0 Å. Crystal annealing did not improve resolution further but was a good tool to rescue and recool crystals showing ice rings. The space group was C2 with cell dimensions similar to those in the native state (C2 type I). It was surprising that cell parameters changed again when the experiment was repeated with 100 mM LiNO3 instead of 100 mM NH4Cl in the dehydration buffers. The space group was still C2 and the crystals even diffracted to 2.9 Å, but one of the unit cell axes was now approximately twice as long (C2 type II) (Table 21). In order to standardize conditions and guarantee reproducibility, PEG concentrations along the dehydration gradient were determined via refractive index, which revealed that the final dehydration buffer contained 35.59% PEG 6000 rather than 36% PEG 6000. Large quantities of the final dehydration buffers were prepared.

then icTiaclive mdcces checked (nD 20° C = I 38^5) Bulleis weie stoicd in aliquots at

25 C (lable 20)

Table 20: Final dehydration protocol for RAP30(2-119)/RAP74(2-158).

Butfet uiiditionPoi I s < titttilfoimC7ttpe PIC (000 1 mVlMICl ) mM B TuCl ] Il ( s Butlu condition'; tor ciystil lorm (P ttji II DO COP J00 nAl I iX() ot mM Bi TiuCI pll 6 7 PEG etinctnii Uion PEG coiieenti ition rteiibition time

bv lelnetlVt nick \ 31 I L

18 PFG 6000 thin est hill lei ) 7 nun

PE PFG 6000 o mm

22 PÏG61XX) 7 mm

H PIGPKX) -7 mm

Vi' PFG PKX) o nun

2S PK. <00O -> mm

MVr PEG 601X1 o mm o2 Pfc G 6()(X) 12 oFpnl) ^0 C 1 - 3() 7 imn

- o4c PEG60IX) U Wo (nD "0 ( SP) 7 mm

W/o PFG ( (XX) ?s s9% (nl) 20 ( =: 1 WP wunuht

4.2.5 Collection of native data for RAP30(2-119)/RAP74(2-158)

Complete low resolution native datasets for RAP30(2-119)/RAP74(2-158) were collected to characterize the dehydrated crystal forms C2 type I and C2 type II and to prepare for heavy atom derivative screening (Table 21). Solvent content was 26.5% for C2 type I and 24.4% for C2 type II, corresponding to Matthews volumes of 1.67 and 1.63 Å³/Da171. The unit cell of the C2 type II crystal form is approximately twice as large as the C2 type I cell. It contains 2 molecules per asymmetric unit correlated by predominantly translational non-crystallographic symmetry along (0 0.5 0.5), as derived from visual inspection of self-Patterson maps. In order to optimize data collection conditions for the C2 type II crystals, they were mounted with their long real space axis along the diffractometer spindle axis.

Table 21: Native datasets of RAP30(2-119)/RAP74(2-158).

weie loi 0 3h" X-i iv souice was i Lxposuiu Riaïku rR îotatina anode aeneiatoi (X 1 oils A) with home mick mirtoi hov

Crystal foi m R \P74(2 ISSVRAIPOjP 119) T ___ _ type

So.iking condamne IS is s9P PEG 6000 1(X) mM \H Cl so mM Bis TiuGî nil 6 ^ Space cioup C 2

(\11 91 sh2 40 440 81 999 90 P o p(o qo

Resolution 2o o s y ^ 97 o so yj Unique ici lections 2S24 i29o (Completeness 97 1L 90 "G i

7 Rim 7 7G id i

' I/o 23 2 (16 Mosatcil} 10 Unit cell volume 2s6tP3 9 V l9Pi'rl

Solvent u>nteiu 26 sp S4 4< i

NGS no NCS

Custal loim RAP74% IsSVR Vl"G0 2 P9i ivjx II îo Soakimj uonditiotu IS S9G PF( i 6(S(X) KXlmMliNO 30 mM Bis TmC t pH 6 s Space aioup C2 Cell 90 061 og024 I4ss44 9{)9] PsOl)

Resolution 20 3 S y 79 o s0 y Unique telleetiotu 49SS s'i2 Completeness 97 SPf 94 (0

Rim s S^ M)

I/o 12 6 7 6 MosiicitY OS

~ unit tell volume 498271 0 A 90

"""" Solvintconu.nl 24 4% s' 1

NCS 2 told tuns] Hie 1 il Nt S vuth (0 0 s () S)

4.2.6 Collection of native data for RAP30(2-119)/RAP74(2-154)

Cryoprotection and dehydration conditions were adapted successfully to crystals of RAP30(2-119)/RAP74(2-154). Dehydration buffers containing 100 mM LiNO3 yielded crystals in space group C2 type II as usual, but no C2 type I crystals could be obtained until crystals were soaked with 50 mM TMLOAc for heavy atom derivative screening. Then the effect of different salts and salt concentrations on cell parameters was studied until native C2 type I crystals could be reproduced with dehydration buffers containing 200 mM KOAc. Full datasets were collected to characterize the two dehydrated crystal forms and to prepare for heavy atom derivative screening (Table 22). Solvent content was 26.8% for C2 type I and 24.5% for C2 type II, corresponding to Matthews volumes of 1.68 and 1.63 Å³/Da171. C2 type II crystals showed the same indications of translational NCS as described above for RAP30(2-119)/RAP74(2-158).

Table 22: Native dataset of RAP30(2-119)/RAP74(2-154).

Toi tk C ' Ivp II nys-iil loim exposures wui lot 30 minutes pei dc"iee \ t iv «ouïe tus i Ri iku RU200 îotitin mode

^ uni iloi 1 3413 with sell mid muni lu \ Fi th C (/ A) ftp I at til I mi xposm s »ut loi °70 seconds pu ilnee md \ nt sun ttis (ho s s SNBI (A ( V u FSkl with i VI VRPo mil pi it It tei

Civstilloim RAPPir ls4 ~RV!PQ 2 119 tvpe I ___ P t Soikiiu eonditiouo IS s{ ptG 60oo 200 mM kl) Ve s() mM Bis TnsCl pH 6 s ^ Sp k c g oup C CPU )1 "M 40 )H S2 j-sQO To 77690 P ^ 0 \ ""7 "- o Rt solution ii2y ssss 4) \

Unique îelieetie no SS 7 ""4S Complt ttness 9S 1 S3 s s

Rhn s 0 PS )

" Va 2S 3 ((

Mos nu tv 0 s unit sell volutin 2s4sls9 y (90 r

Solvtnt eontuit ">6 S% SE )

NCS no NCS

Ctvstil loim RAP74(2 ls4iRMM0 2 1191 tvpe ll | __ So os iknu conditions IS 36 c Pfco 6()(K) UK) mM LiNO SOmMBishisC pH(P Sp ICe Cl oup C2

Cell 90 (PO öS 02S 14-7 P 0 )0 ) 1 IDS 90

Rtsolution >() î S y if 1 SO A

Unique lelle t Hulls sOP O""")

Complete tie ss )) * )4 1 Rhn s S 10 7

1/0 N) 1 P t

Mos ucitv 0 s unit sell volume 199211 f y (SS 6

Solvent content ^s 74 S

NCS 2 I Id It 1 isl m ml NC S with 0 0 s 0 s)

4.2.7 Collection of native data for RAP30(2-119)/RAP74(2-172)

Adaptation of the cryoprotection and dehydration protocol to crystals of RAP30(2-119)/RAP74(2-172) was not straightforward. For the shorter complexes, crystals grown in LiNO3, KNO3 or NH4Cl with and without MgCl2 or CaCl2 had been used for dehydration without consequences for final resolution and cell parameters. Crystals containing RAP74(2-172), however, could only be successfully dehydrated when grown in the presence of LiNO3. Crystals grown with other salts like NH4Cl or KNO3 had nicer morphology but cracked into many little pieces and did not undergo the high resolution transition upon dehydration. The dehydrated crystals of RAP30(2-119)/RAP74(2-172) had space group P1 with 4-fold non-crystallographic symmetry. The arrangement of the four molecules in the asymmetric unit resembled the native C2 cell. Two pairs with 2-fold rotational non-crystallographic symmetry were correlated by translational non-crystallographic symmetry along (0.5 0.5 0). Details of this arrangement are discussed in section 6.2. Complete high resolution (9-1.7 Å) and low resolution (67-3.1 Å) data sets were collected and merged with an overall R of 6.1%. This was the dataset used for heavy atom derivative screening as well as final structure refinement. Solvent content was 29.1%, corresponding to a Matthews volume of 1.73 Å³/Da171.
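The relation between Matthews volume and solvent content quoted in this chapter can be reproduced with the usual approximation. The sketch below assumes the standard constant of 1.23 Å³/Da for the protein volume per unit mass (a partial specific volume of about 0.74 cm³/g); the constant actually used for the values in the text is not stated, so small deviations in the last digit are to be expected. Function names are illustrative.

```python
def matthews_volume(cell_volume_A3, mol_weight_Da, n_per_asu, n_sym_ops):
    """Matthews volume V_M in Å^3/Da: cell volume per unit of protein mass."""
    return cell_volume_A3 / (mol_weight_Da * n_per_asu * n_sym_ops)


def solvent_fraction(v_m, protein_constant=1.23):
    """Approximate solvent fraction from V_M.

    protein_constant = 1.23 Å^3/Da is the commonly used value and an
    assumption here, not a number taken from the thesis.
    """
    return 1.0 - protein_constant / v_m
```

For example, `solvent_fraction(1.73)` gives roughly 0.29, in line with the 29.1% quoted above for the P1 crystal form.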

Table 23: Native dataset of RAP30(2-119)/RAP74(2-172).

s I i the lu h us lud n d ms t tt i o s Exposut loi n 1 | i t i i. m 1 il f t th lott luoluti n lit i t Vnv i i some wis 111 ID1 1 4iiiKluht ib imlin it th I SI ! / M with i Oui 1 H t 0 d t t i

Cl si il loi ni ' y VP 74(2 1 %2>/R \PC) 2 i _ _k 9_ tvpt I_

Soîkin;: conditions IS as s6 s PFC, lOOinMItNO s() mM Bis InsC i pli 6 Spact a imp Pl Ci 11 4S 01s'72 290 s: Vs lOpoll )o o7() (Ol P7

" Rt solution 6-7- 1 A 1 ~7 V Unique ullePions 108916 M)P) CAompktuiess 96 S" Ps

R ( V t ) S i i/o n i î Mosueity o\ unit tell v olumt V ^MS 4 V 9o

Solvent etnttnt 29 P Ps )

NCS II ltPNCS lesemî htu ihe u îni'C unit ull ni iiutment 129

4.3 Discussion of crystallization screening and data collection

In the course of human TFIIF structure determination, the overall domain structure was determined first by limited proteolysis (Chapter 3). Then crystallographic structure determination of separate domains was attempted. In order to provide more than simply structural information, the crystal structure should answer the biologically relevant question of whether and how the TFIIF subunits RAP30 and RAP74 form a stable complex. Therefore, any combination of RAP30 and RAP74 constructs forming TFIIF-like complexes was of interest for crystallization screening. Since the RAP30/74-interaction domains of the TFIIF subunits were contained in their N-termini, a number of C-terminal truncations was designed for crystallization screening with all possible combinations. Initial screening was performed with a combination of "sparse matrix screening"101 and experience-based "random end screening" (S. Tan, unpublished). This reflects that a certain balance between chaotic and systematic search strategies must be maintained for initial crystallization screening and later crystallization refinement.

Statistical evaluation of the most likely isolated crystallization conditions leads to a chaotic search field with high probability for single hits. Large parameter spaces can be sampled very efficiently, but in case of defeat there is no information about the underlying problems. Results are non-intuitive, and even success may be difficult to interpret and reproduce because nothing is learned about the relevant parameters. On the other hand, systematic exploration of parameters and their interdependences with complete grid screening108 is time consuming and must therefore be limited to a few "essential" parameters chosen by intuition and experience. The influence of each parameter on crystallization is tested until the relevant set for a particular crystallization problem is isolated. If there is no success, experiments including as many new parameters as possible may be more promising than fine sampling along a limited set of parameters. The latter is like asking the same wrong question over and over again. This sort of failure is typical for mathematical optimization methods as implemented in "incomplete factorial screening"207 or "weighted sampling"104. These are used for refinement of initial crystallization conditions and are designed to locate the optimum within a given parameter space. In this way, new parameters that may be more significant to the present problem are never tried. Therefore, iterative analytical grid screening including as many parameters as possible is the superior method for refinement of crystallization conditions. Interpretation of trends in these complete grid screenings is more intuitive and reliable than mathematical analysis of isolated conditions. Redundancy is not a problem but leads to internal control. For crystallization refinement screening, three classes of refinement parameters have been proposed for idea generation (Table 10). Crystal morphology and size could be improved significantly using this kind of approach, but the poor diffraction quality (low resolution, anisotropy) was insensitive to all efforts.
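A complete grid screen in the sense used above is simply the full combination of all chosen parameter values. The sketch below enumerates such a grid; the helper name and the parameter values are illustrative assumptions, not the conditions of any particular screen in this work.

```python
from itertools import product


def grid_screen(**parameters):
    """Enumerate every combination of the given parameter values (complete grid)."""
    names = list(parameters)
    return [dict(zip(names, combo)) for combo in product(*parameters.values())]


# Hypothetical example values for a three-dimensional screen.
conditions = grid_screen(
    ph=[6.0, 6.5, 7.0, 7.5],
    peg_percent=[16, 18, 20, 22],
    protein_mg_per_ml=[5, 7.5, 10],
)
```

Because every combination is present, trends along one parameter can be read off at fixed values of the others, which is the internal control referred to in the text; the price is that the number of setups grows as the product of the numbers of values per parameter.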

Given this situation, it was a consequence of the described screening and refinement concept to look for additional parameters. Screening was extended to "post growth treatment" techniques like cryoprotection, dehydration and annealing. These methods represent multidimensional optimization problems just like crystallization screening. Many parameters were screened, which finally led to the well diffracting crystals used for native data collection. It is the main implication of this part of the work that crystals with poor crystallographic characteristics like fast decay, low diffraction limits, anisotropy or high mosaicity may be transformed into high quality crystals by a series of easy and fast experiments once certain prejudice is overcome.

The general rule that cryoprotection buffers have to resemble the original crystallization condition as closely as possible185 is not really important. As long as the crystals do not dissolve, gradual changes of all parameters are possible. Large differences between growth conditions and final cryoprotection conditions may even be desirable to achieve drastic effects like dehydration. The effect is discontinuous and typically accompanied by changes in cell parameters or even space group181-18--100--00 (also P. Cramer, personal communication). It has also been stressed that cryoprotection should not visibly disrupt crystals18*. Dehydration, however, may cause major rearrangements in the crystals, and in all cases known to me crystals cracked during this procedure. Prior to cracking, characteristic band patterns have been observed under the polarization microscope (FG, unpublished observation; P. Cramer, personal communication). Then cracks showed up along the morphological faces of the crystals and sometimes reannealed, which seems to be indicative of an ordered dehydration process. This suggests that badly damaged crystals should also be analyzed to discover effects on resolution limits. These crystals may even be useful for data collection if they are large enough to cut out a well diffracting piece.

The effect of dehydration on RAP30/74-interaction domain crystals was impressive but difficult to reproduce. The dehydration process was very sensitive to all parameters involved, which had to be strictly controlled. Subtle differences in type and concentration of salts in the crystallization and dehydration buffers determined the outcome of the experiment. Minor differences in the protein constructs like point mutations or 2-14 amino acid truncations at the C-termini led to different results. All native crystal forms under investigation grew in space group C2 with similar cell parameters (Table 7). Before dehydration, diffraction was significantly anisotropic and weak (3.8 Å). Upon dehydration, diffraction became isotropic and resolution was extended to 2.5-1.7 Å. Depending on conditions and constructs used, three different crystal forms were observed. They will be referred to as C2 type I, C2 type II and P1.

Crystal form C2 type I was obtained with crystals containing RAP74(2-154) or RAP74(2-158). Before dehydration, native crystals had similar a- and c-axes of around 90 Å encompassing a β-angle of 116-117° (Figure 17A). Upon dehydration the c-axis shrank by ca. 9% to 82 Å and the β-angle widened to 123°. The unit cell volume decreased by 7-10%, corresponding to a loss of 16-20% of the crystal water. Solvent content sank from 22% to 21% (Figure 17B).

Crystal form C2 type II was obtained with the same RAP74-constructs but dehydration buffers were slightly different (Table 14, Table 15). They contained 100 mM LiNO3 instead of 100 mM NH4Cl or 200 mM KOAc. After dehydration, diffraction patterns showed very narrow reciprocal spacings in keeping with doubling of the real space c-axis (ca. 146 Å). The native Patterson map showed strong peaks for translational non-crystallographic symmetry vectors at the positions (0 0.5 0.5) and (0.5 1 0.5). The easiest explanation was that doubling of the unit cell in the c-direction went along with shearing of the unit cells along the crystallographic c-axis. There is also a significant change of the β-angle from 116° to 91° (Figure 17D). The unit cell volume decreased by 19-21%, corresponding to a loss of 23-25% of the crystal water. The final solvent content was 24%, which was even lower than for C2 type I crystals.

Crystals containing RAP74(2-172) behaved differently when the standard dehydration protocol was applied. The crystallographic 2-fold symmetry b-axis was bent out of register and stretched by 11% to 48 Å. Crystals now had 2-fold rotational non-crystallographic symmetry and the space group had changed to P1. The a- and c-axes shrank to 72 Å and 82 Å but were still longer than the former crystallographic 2-fold axis. Therefore, a- and b-axis labels were inverted according to the IUC definitions (Figure 17C). The arrangement of the four molecules in the P1 unit cell was very similar to the original C2 unit cell, as supported by a strong peak for a translational non-crystallographic symmetry vector in the native Patterson map at position (0.5 0.5 0). Alternating rows of strong and weak peaks in the pseudo-precession photographs indicate that the pseudosymmetric arrangement almost fulfilled the conditions for the typical C2 systematic absences (h+k = 2n). The effect is obvious when the h0l-plane of the C2 native crystals is compared to the corresponding 0kl-plane of the P1 crystals. The original reflections are spread due to overall shrinking of the unit cell axes, and weak interspersed reflections appear at medium resolution (Figure 17C).
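The reflection condition for C-centering (h + k = 2n) and the idea of a "nearly fulfilled" condition can be illustrated with a short sketch. This is a minimal example and not the procedure used in the thesis: the function names and the toy reflection list are assumptions made for illustration. The ratio it computes would be zero for a truly C-centered lattice and small but non-zero for a pseudo-centered P1 cell.

```python
from statistics import mean

def is_c_centering_allowed(h, k, l):
    """Return True if (h, k, l) obeys the C-centering condition h + k = 2n."""
    return (h + k) % 2 == 0

def pseudo_centering_ratio(reflections):
    """Mean intensity of h+k odd reflections divided by mean of h+k even ones.

    `reflections` is an iterable of (h, k, l, intensity) tuples.
    """
    even = [i for h, k, l, i in reflections if is_c_centering_allowed(h, k, l)]
    odd = [i for h, k, l, i in reflections if not is_c_centering_allowed(h, k, l)]
    if not even or not odd:
        raise ValueError("need reflections in both parity classes")
    return mean(odd) / mean(even)

# Hypothetical toy data: strong h+k even and weak h+k odd reflections
toy = [(1, 1, 0, 100.0), (2, 0, 1, 80.0), (1, 0, 0, 5.0), (0, 1, 2, 4.0)]
ratio = pseudo_centering_ratio(toy)
```

A small ratio at low resolution that grows towards medium resolution would correspond to the alternating strong and weak rows described in the text.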

The unit cell volume decreased by only 1%, corresponding to a loss of 6% of the crystal water. The final solvent content was 20%. These P1 high resolution crystals of RAP30(2-119)/RAP74(2-172) were chosen for structure determination because this was the largest complex that could be crystallized. Also, cell constants varied much less between different batches of dehydrated crystals than with the shorter complexes, which was important for heavy atom derivative screening.
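The volume changes quoted for the three crystal forms follow from the cell parameters. The sketch below shows how such numbers can be computed with the general triclinic volume formula, which reduces to V = a·b·c·sin β for a monoclinic cell. The function names and the example cell are hypothetical, and the conversion from volume loss to water loss rests on the stated assumption that the protein volume does not change during dehydration.

```python
import math

def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in Å^3 from cell lengths (Å) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)

def fractional_volume_change(v_before, v_after):
    """Fraction of the original cell volume lost upon dehydration."""
    return (v_before - v_after) / v_before

def water_loss_fraction(volume_change, solvent_fraction_before):
    """Fraction of crystal water lost, ASSUMING all lost volume is solvent."""
    return volume_change / solvent_fraction_before

# Hypothetical monoclinic cell before and after a c-axis contraction
v0 = cell_volume(100.0, 50.0, 80.0, beta=110.0)
v1 = cell_volume(100.0, 50.0, 72.0, beta=110.0)
shrinkage = fractional_volume_change(v0, v1)
```

With real data, the cell parameters from indexing would replace the hypothetical values used here.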

Figure 17: Crystal forms of RAP30/74 interaction domain of human TFIIF before and after dehydration.

A: Diagram of the native C2 unit cell with the corresponding native Patterson peaks in the corners (black spheres). Native unit cell parameters are similar for all complexes. Below is the pseudo-precession photograph of the h0l reciprocal space plane. B: Diagram of the dehydrated C2 type I unit cell as found with crystals containing RAP74(2-154) and RAP74(2-158). Below is the pseudo-precession photograph of the h0l reciprocal space plane. C: Diagram of the dehydrated P1 unit cell of crystals with RAP74(2-172). Axes labels are defined according to the IUC definitions. The non-crystallographic 2-fold symmetry axis is shown as well. Below is the pseudo-precession photograph of the 0kl reciprocal space plane (corresponds to the former h0l plane). D: Same diagram for the dehydrated C2 type II unit cell found with crystals containing RAP74(2-154) and RAP74(2-158). The native Patterson translational non-crystallographic symmetry peaks are shown as black spheres. The original native C2 unit cells are shown in gray. Shearing is indicated by two arrows.

Panels: A, C2 native (h0l); B, C2 type I (h0l); C, P1 (0kl); D, C2 type II.

Native, high resolution data were collected on a single dehydrated P1 crystal of RAP30(2-119)/RAP74(2-172) at the ID14/4 undulator beam line at the ESRF. Due to the low symmetry space group, a full 180° of reciprocal space had to be collected. Therefore, the crystal was exposed for more than 10 minutes to one of the strongest X-ray beams available, such that the possibility of serious primary radiation damage had to be considered208. After data collection, the first frame was repeated in order to get an impression of the decrease in diffraction limits (Figure 18). At the beginning of data collection the crystal diffracted to 1.7 Å resolution with a mean intensity of 2.5 I/σ. For the whole dataset an average intensity of 1.9 I/σ was recorded to this resolution, and in the end crystalline diffraction dropped below 2 I/σ between 1.9 and 1.85 Å, while diffraction up to 2.1 Å was more or less constant during data collection. It was concluded that data beyond 2 Å were seriously affected by radiation damage. During structure refinement this problem was not systematically addressed, but it might be responsible for some discontinuities in the final electron density.
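The comparison of the first frame with its repetition amounts to asking at which resolution the mean I/σ falls below a chosen threshold. A minimal sketch of that comparison follows; the shell values are hypothetical and only loosely mimic the situation described, and the function name and the threshold of 2 are assumptions for illustration.

```python
def resolution_cutoff(shells, threshold=2.0):
    """Return d_min (Å) of the last shell with mean I/sigma >= threshold.

    `shells` is a list of (d_min, mean_i_over_sigma) pairs ordered from
    low to high resolution. Returns None if even the first shell is too weak.
    """
    cutoff = None
    for d_min, i_over_sigma in shells:
        if i_over_sigma < threshold:
            break
        cutoff = d_min
    return cutoff

# Hypothetical shell statistics for the first frame and its repetition
first_frame = [(2.5, 12.0), (2.1, 6.0), (1.9, 3.5), (1.7, 2.5)]
repeat_frame = [(2.5, 11.5), (2.1, 5.8), (1.9, 2.1), (1.7, 1.2)]

limit_start = resolution_cutoff(first_frame)
limit_end = resolution_cutoff(repeat_frame)
```

A loss of the outermost shells between the two frames, with the inner shells unchanged, is the signature of radiation damage discussed above.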

Figure 18: Radiation damage during high resolution data collection of native RAP30(2-119)/RAP74(2-172).

Intensity profiles at different stages of the data collection process are given as a dotted line for the first frame of high resolution data collection and as a dashed line for the repetition of the first frame at the end of high resolution data collection. The merged high and low resolution datasets are represented by the solid line. Horizontal axis: resolution [Å].

5 Screening for heavy atom derivatives for the RAP30/74-interaction domains of human TFIIF

5.1 Introduction of a general scheme for derivative screening

To date, de novo structure solution in macromolecular crystallography generally requires crystals with specific heavy atom sites for phase determination170,171. Only in special cases can direct methods solve the phase problem in the absence of such heavy atom sites. When heavy atom ligands are naturally present, as in metalloproteins, multiple wavelength anomalous diffraction methods can be utilized to solve the phase problem with the native crystals. Apart from these rare cases, it is the availability of appropriate heavy atom derivative crystals (isomorphous or non-isomorphous) that determines and limits the methods applicable for phase determination. Like crystallization screening, the process of derivative screening and characterization is largely empirical and typically follows the scheme presented below (Figure 19). It may represent a time limiting step in crystallographic structure determination, just as crystallization screening.

Figure 19: General scheme of derivative screening.

Flowchart: native crystal; introduction of heavy atoms (soaking, cocrystallization, prelabeling, mutation); visual inspection and chemical analysis; crystallographic characterization (non-isomorphous or isomorphous; MAD, MIR(AS), SIR(AS)); data collection and reduction; search for heavy atom sites; phasing, model building and refinement.

Well diffracting native crystals are normally the starting point of derivative screening. Yet derivative screening may be the last resort to obtain reasonably diffracting crystals, since it has been observed quite often that derivative crystals diffracted better than the native crystals. There are several methods for obtaining derivative crystals. Usually, soaking crystals in harvest buffer containing millimolar amounts of heavy atom compounds is the fastest way to obtain derivative crystals170. Since crystals may be very sensitive even to low concentrations of heavy atoms, there is again a multidimensional optimization problem, which is addressed by screening strategies similar to those mentioned above. Parameters are listed below (Table 24).

Table 24: Parameters in derivative screening by soaking method.

Heavy atom reagents: Summaries in Blundell's "Protein Crystallography" or "Methods in Enzymology".

Buffer conditions: Precipitating agents, salts and buffer. Not all heavy atom conditions are compatible with the usual growth or harvest conditions.

Reproducibility: Working with tiny amounts of the highly toxic heavy atom reagents is a problem. Many are not stable in aqueous solutions or when exposed to light. Make stock solutions and store them in aliquots.

Soaking protocol: Concentration and gradient of heavy atom compound, soaking time, temperature, light.

Back soaking: Buffer conditions, time, temperature.

Glutaraldehyde crosslinking: Buffer conditions, glutaraldehyde concentration, time, temperature.

An alternative method to produce derivative crystals is cocrystallization, where heavy atom compounds are added to the crystallization setups like ordinary additives. Optimization includes all possibilities discussed above for initial screening and refinement of crystallization conditions. More rational approaches include prelabeling of the (macro)molecules before crystallization or engineering of heavy atom sites by mutagenesis. Prelabeling of (macro)molecules may be done in vitro or in vivo. In vitro methods include labeling of synthetic DNA or ligand molecules with iodine, bromine or even heavier substituents. Proteins or DNA from biological sources can be covalently labeled using site specific reagents like mercury compounds that bind to cysteine residues, or like iodine or bromine that attack aromatic rings (electrophilic aromatic substitution)170,171. In vivo methods are restricted to recombinant proteins that are expressed in media containing labeled amino acids (selenocysteine, selenomethionine, telluromethionine)214-217. Finally, heavy atom sites for soaking, cocrystallization and prelabeling may be engineered by mutagenesis of the biological macromolecules. Of special interest are cysteine mutations218-221 and methionine mutations as discussed below.

If crystals survive the heavy atom soaking procedures or can be obtained by cocrystallization or prelabeling, mere visual inspection may be supplemented by physicochemical analysis. Localized heavy atoms may be detected by biochemical (Ellman's assay) or spectroscopic methods (UV-spectroscopy) or mass spectrometry. Crystals from prelabeled material should definitely contain localized heavy atom sites. Even if these crystals are non-isomorphous to the original native crystals and cannot be used for classical phase determination by MIR(AS) or SIR(AS) methods222, they may be saved for a MAD-experiment223 or for screening for an isomorphous native or derivative crystal form. In most cases though, no method can tell whether derivatization has been successful apart from crystallography. For searching heavy atom sites by isomorphous or anomalous difference Patterson or difference Fourier methods, nearly complete datasets (>85%) to at least 5 Å resolution have to be recorded170. There have been several propositions for shortcuts to this time consuming procedure involving statistical tests on small slices of reciprocal space140,211, but generally a definite answer is preferred. Once heavy atom sites are found, complete derivative datasets to maximal possible resolution are collected for phase determination.

5.2 Derivative crystal preparation and derivative data collection

5.2.1 Derivative crystal screening by heavy atom compound soaking

Preparation and screening of potential heavy atom derivative crystals of the RAP30/74-interaction domains involved all methods presented above: soaking, cocrystallization, prelabeling (in vivo, in vitro) and mutagenesis. First, conventional soaking was tried because it is usually the fastest and easiest way of obtaining derivative crystals. The standard soaking protocol applied is described in the Materials and Methods chapter. All crystal forms tried were very sensitive towards heavy atom compounds and cracked into small pieces even at submillimolar concentrations of most compounds utilized. In the few cases where intact crystals could be prepared, they were mounted on fiber loops and flash cooled in liquid nitrogen. Crystals were aligned with the X-ray beam by adjusting the goniometer head arcs and the bendable solder stem of the fiber loop. If the crystal diffracted beyond 4 Å with exposure times of less than 2 hours per degree, approximately 10 degrees around two major zones were collected (X-ray source: Rigaku RU200 with double focusing mirrors). After indexing and data reduction with DENZO and SCALEPACK140, cell dimensions were compared to the native crystal forms. Non-isomorphous crystals were discarded at this stage, while data from isomorphous crystals were merged with the corresponding native data using the error estimates optimal for the derivative data (SCALEPACK χ² ≈ 1).

SCALEPACK χ²-testing revealed whether differences between native and derivative data were within experimental error limits or significantly larger. SCALEPACK χ² values larger than 30 were attributed to detector problems or non-isomorphism. Those crystals were discarded. Data collection on crystals leading to values lower than 5 was stopped and the crystals were saved. If the SCALEPACK χ² of two merged datasets was between 5 and 30, data collection with this potential derivative crystal was completed and the crystals saved. Whenever possible, isomorphous difference Patterson maps between 10-4 Å and 15-5 Å were checked for heavy atom signals.
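The χ² criteria above form a simple three-way decision rule. The following sketch writes that rule down explicitly; it is only an illustration of the thresholds stated in the text (5 and 30). The function name, the returned labels, and the treatment of values exactly equal to 5 or 30 as "between" are assumptions.

```python
def classify_soak(chi2, low=5.0, high=30.0):
    """Classify a native/derivative merge by its SCALEPACK chi-square value.

    Thresholds follow the text; boundary handling is an assumption.
    """
    if chi2 > high:
        # attributed to detector problems or non-isomorphism
        return "discard"
    if chi2 < low:
        # differences within experimental error: stop collecting, keep crystal
        return "stop collection, save crystal"
    # potential derivative: finish the dataset and keep the crystal
    return "complete data collection, save crystal"

# Hypothetical chi-square values for three soaked crystals
decisions = [classify_soak(x) for x in (2.4, 12.9, 44.0)]
```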

First, crystals containing RAP30(2-119)/RAP74(2-158) of both dehydrated crystal forms were screened applying the methods and criteria described above. No convincing derivative crystal could be found (Table 25). More chemicals were tried on crystals of RAP30(2-119)/RAP74(2-154) (Table 26). Although all dehydration buffers contained LiNO3 or LiCl, which normally led to C2 type II native crystals, some heavy atom soaked crystals were of the C2 type I crystal form. So, conditions for obtaining native crystals of the same crystal form had to be developed (see above). Generally, cell parameters were quite variable for both forms of these crystals and many non-isomorphous crystals were generated. With Baker's dimercurial a potential heavy atom derivative was obtained, but data collection was never completed because heavy atom screening focused on the complex containing RAP74(2-172). Those crystals diffracted to higher resolution (1.7 Å instead of 2.5 Å) and cell parameters were not dependent on soaking conditions to such a degree. The whole soaking screen as described in the

Materials and Methods section was applied (Section 2.8). After 24 hours of incubation with the PtCl compound, EMTS, KAuCN or 4.5 mM UOAc, crystals diffracted to approximately 4 Å. Crystals soaked for 24 hours in 4.5 mM K2PtCN, 4.5 mM TMLOAc or Pb(OAc) diffracted to 2.5 Å. The latter crystals were analyzed as described above but none was found likely to be a derivative (Table 27).

Table 25: Derivative screening with RAP30(2-119)/RAP74(2-158) crystals.

St ikll Is 1 nn le s R 1 1 s mj 1 / les! R tomme ni OOlltlltl lis IM l 1 1 ] 1 i R 1

}mMl\ns + Tl s s 1 1 1 0 S 10 7 piobiblv not i denv luve 4 C cveinuht

"" 0 hmM KAuCN 11 s s S ( 1 1 X) t 70 10 3 no pe ik l'a ~> s m 4 L eV el tiuht Illicit net. Pilteison mi[ 0 ISnAlKPtCl + U -s S JPf : i 14 s SfY 13 ( nwvbt weak ilciivativt

4 C tvunuht 1 ut bid eiyst il jti Uitv

S P( )7 (7 1 SmMkPiCl 4 ib ~ OS p (O1 non isomoipht lis 4° C ovumeht

-1 oniMTMI()\t + II S too i pe. 1 oh P S 1 lobiblv nt t i denv uni 4 C tvunuht

) 1 OOlSmMMctLNO + n s "7 1 2 14 0 lo un piOblblV »t t 1 tie 1IV lllVe

4° C ov t im cht

0 iSmMKIiCl + H s t s 0 7 "b 1 -O ïo le s m

"" OMiiMkliCl -1 i s S s 1 10S 1 ss n mavbc weak dcnvaùvt

4 C ovemuhl 1 ul h 11 ciyst il (inhlv __ bs lik ik n X i it i IP t V Ik sA~

C II 7 4 V7 W> S7 111 1 p 140

Table 26: Derivative screening with RAP30(2-119)/RAP74(2-154) crystals.

Soikmc bs feiim les R mos tcql somment t compl x ^ * conditions [Al [6 1 [ ] Pi iu ^mMPtCN + 11 s 0 0 0 oll) (Js7o4 7o piohihlv not i Itnvilivi 4" C 2 ehvs

0 ISmMUOOAe + Il s S f 07 s 16/10 Ko 1 piotuWy non denv itive

4 C 2d iv s

P " OOlSmMpMtBV + 11 p 10 .+ 4 0 P20 10 7 piob it ly not i duivitiu

4°C cl s _ iv

OlsmMkliCl o s f ) O + Il '(O l(o P bSfo non isun tj huis 4e C ovtinuht

^ ~- su mM s < s IMLOAe 1 OS )4 IS s 19 2 no pe ik l'ffoP m 4 C 2itllVS dlllelUlte Plttelsou illip 0 IS mM Bikei s dmieieiii il 11 o S S s o Q -si 6 ()-o 194 maybc dcu-vatnt 4r C cvtinuhl

- OlSmMSmtl l e 0 1 0 2ot 1 S69 10 1 piobiblv nt t i denv il te

4° C t thv s

^ ^ OSmMklitl (II) oS 4 0 n" ^ PS 14 4 non iscm ij h iu 4° C ovttnuht

o^ OlSmMkliCl (R oS 0 PO 2 hC P4 non ut m q ht us 4" C ovunuhî

OSmMkAuCN I oS f 0 |)o > V "'s nwjk tin inak 4 C ovcmuht denvative

7 0 S mM NH PdCl I s 0 0 s no „ 4^ 6 9 pioblbh 1U, t dcm llne 4° C ovtinuht

OSmMPbOAv ^OOmM 1 s s 9 00 10 0 -441 1.2 maybc vei y weak KO\s 4 t ^ tins duivative

" OsmMNHPdCl 200m\l 1 i 4 9 1 1 C 1 26S )2 piobibly not idtiiv Hive k() \e 4 C ovetmcht

hs In k 01km" X nv sau R P) u 7 V 1 R to s v I SKI S\Bl t ov

C II P S 1 16 804 PS 12- 00 0 P

(ill 3P)1 IS 906 1 13 Ï33 )0 71 0040)

t II 7 ( f 39 71-s S1 PIQ( 1 )) >0

Table 27: Derivative screening with RAP30(2-119)/RAP74(2-172) crystals.

Scakm os i ntn les R nus c mp x ks| 'v comme ni eonditi ns [Al M f 1 [P] (PI bmMkPlCN Pl S L. P I6i 104o lo) piob iblv no tknv itivi

4 C S thv s limMUOp PP s 10 2 1" i0 4 non isomoiplions 4°C Sdivs

^ 4 S mM k PtCN P1 01 ) 0 -• > s 7 4 1 no pc ik I/o m V C ovtinuht ditkitnce Pniuson

"" 4 S mM [ MLO ye Pl 2 0 s o f )s 0 no pe ik I/a^-P m

4° C ove muni tlillucnce P u tel son m tp | bs biks ikm X n\ m kt 7 t o v l SRI 1014 U V

C II 13(11) 47+80 670 101 497 71 is Ids 14S 141

5.2.2 Derivative crystal screening by heavy atom compound cocrystallization

Since no derivative crystals could be produced by soaking, cocrystallization was attempted. Using the heavy atom screen described in the Materials and Methods section (Section 2.8), crystals of RAP30(2-119)/RAP74(2-172) were grown in the presence of 1-2 mM of the following compounds: TlOAc, Pb(OAc)2, K2PtCN4, KAuCN, SmCl3, Gd(OAc)3, EuCl3, Na2WO4, TMAA, MeHgNO3, BEDCPt. The underlying crystallization conditions were 1().5% purified PEG 1500, 100 mM LiNO3, 50 mM Bis-Tris-Cl pH 6.5. Crystals were dehydrated as usual but in the presence of the respective heavy atom compound. Without backsoaking, splinters of these crystals were mounted on fiber loops and flash cooled in liquid nitrogen. Isomorphous crystals diffracting beyond 3 Å could be obtained with BEDCPt, Na2WO4, EuCl3, Gd(OAc)3 and SmCl3. Complete datasets were collected. Inspection of the isomorphous difference Patterson maps using various resolution ranges and difference cut-offs, however, did not give an indication of prominent heavy atom sites (no peak I/σ > 2.5) (Table 28).

Table 28: Heavy atom cocrystallization with RAP30(2-119)/RAP74(2-172).

" ! vtete loi 0 Ps - sl,sOtuh Fotposmes pei dogue Vi.tt souu s was th, ID11 1 uiiJul itoi boamlmi ,it the I SRF tt ith a Qu.titt CTD do let ui

atom tu foim ies mos Heavy compound Ry_ _ .ompl cv, r<;y 'i i>(] Bis(cihvlenctli

Gd(OAt) - Pl U> si h 0 6 98 20f EuCl, Pl 2 9 10 0 0 7 9S 2C

SmCl, Pl 2 7 9 S 0 b 98 0

5.2.3 Investigation of cysteine-alanine mutant proteins for derivative screening

Since screening for heavy atom derivatives by soaking and cocrystallization methods had been unsuccessful, mutant proteins for further soaking and prelabeling experiments were designed. Based on the observation that even submillimolar concentrations of mercury compounds disrupted the native crystals, it was concluded that a prominent mercury binding site (presumably a cysteine residue) was involved in crystal packing, which implied that the RAP30/74 interaction domain had at least one surface exposed, reactive cysteine residue. In fact, both cysteine residues of the RAP30/74 interaction domain were accessible to Ellman's reagent (DTNB, data not shown). In a first step, alanine mutants for both wild type cysteine residues were designed (C116A-RAP30(2-119), C130A-RAP74(2-172)). It was assumed that complexes lacking one or the other critical thiol group might crystallize just as well as the wild type protein complex. Crystals then could possibly withstand soaking with mercury compounds, which would still react with the remaining cysteine residue. In a second step, additional cysteine residues would have been introduced, creating additional mercury binding sites.

Mutant crystals were obtained and dehydrated using the standard protocols for RAP30(2-119)/RAP74(2-172). Under those conditions both types of single mutant crystals underwent no high resolution transition, while wild type control crystals diffracted to 2.9 Å (Table 29). However, cell parameters changed and remained 'stuck' in between the native C2 and the desired P1 parameters. Space group was P1 with strong pseudosymmetry, leading to a diffraction pattern with alternating intensities at medium resolution and entirely C2 appearance at low resolution. Depending on crystal orientation and resolution limits, crystals were autoindexed in C2 or P1 as indicated below (Table 29). The double mutant crystals led to well diffracting dehydrated crystals with very short reciprocal spacings. This indicated considerably large cell dimensions (ca. 240 Å). A complete dataset to 3.8 Å was collected. Space group was probably C2 but the data could not be indexed and scaled properly (Table 29). These preliminary results indicated that the alanine mutants were not likely to produce crystals isomorphous to the native high resolution P1 crystals without major adaptations of the crystallization and dehydration protocols. The single mutants might still have been useful for prelabeling studies, but MeHg prelabeled crystals (Section 2.9) did not grow to sufficient size. Experiments were terminated at this stage as other approaches appeared to be more promising.

Table 29: Characterization of cysteine-alanine mutant crystals.

Pvposui ttits let 1 h ui p i 1 i in uililmtio en i trim P ( X u\ m t is th Ri iku

RIPOO n i tt itsithh m nuei nan ltnx

_

" RA,P74(2 17P CloOV | k VP"71 1""> vu

" RAtPtH 110 C PoV c 2 o y Pl o ) y

11 I llltltX ible pseii bsvmmetiP

RyPsfH" 11 0 ttt p i ^ o y Pl "Hi nn 1

psuidosemnisitv ,7 11 ITlll tels see ll VI I! P ll ) 1 Ml Co 3 3 1

PI 13 { 7 ( S_ 7 1( 1 1 r 71 3 )l )3 p 0

5.2.4 Selenomethionine in vivo prelabeling of wild type and methionine-mutant proteins

So far none of the applied screening methods (soaking, cocrystallization, mutation) had led to derivative crystals. In vivo prelabeling with selenomethionine was the next method to be tried. Given previous problems with anisomorphism and variable space groups, the prelabeled crystals should be suitable for structure determination by multiple wavelength anomalous dispersion methods (MAD). The sensitivity limit for MAD-phase determination with selenomethionine prelabeled crystals at that time was one selenomethionine per 10 kDa of protein (P. Pattison, personal communication). Wild type RAP30(2-119)/RAP74(2-172) complex has a molecular weight of 32.6 kDa and contains only two methionine residues (RAP74-M29, RAP74-M62). Two additional methionine residues had to be introduced in order to maximize chances for a successful MAD-experiment. Criteria for selection of methionine mutants were based on secondary and tertiary structure prediction routines as implemented in the GCG program suite (v. 9.1).
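The reasoning behind adding two methionine sites is a short piece of arithmetic. The sketch below reproduces it under stated assumptions: the molecular weight (32.6 kDa) and the sensitivity limit (one selenomethionine per 10 kDa) are taken as read from the text and passed in as parameters, and the function names are hypothetical.

```python
import math

def min_selenomethionines(mass_kda, kda_per_semet=10.0):
    """Smallest number of Se sites giving at least one site per `kda_per_semet`."""
    return math.ceil(mass_kda / kda_per_semet)

def additional_sites_needed(mass_kda, existing_sites, kda_per_semet=10.0):
    """Number of methionine mutations required on top of the existing sites."""
    return max(0, min_selenomethionines(mass_kda, kda_per_semet) - existing_sites)

# Values as read from the text: 32.6 kDa complex with two native methionines
required = min_selenomethionines(32.6)
to_add = additional_sites_needed(32.6, existing_sites=2)
```

Under these inputs the rule calls for four sites in total, which agrees with the statement that only combinations with four or more methionine residues per complex were taken forward.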

Suitable methionine residues should be placed in the hydrophobic core of the protein as defined by hydrophobicity and surface probability plots. Methionine residues may also be at the end of a buried secondary structure element close to the surface, as defined by secondary structure prediction, multiple sequence alignments or hydrophobic moment plots. The mutations are then closer to the movable surface, which should be more tolerant towards amino acid changes than the rigid hydrophobic core of the protein. Methionine mutations should never be placed in flexible loops, which become obvious when comparing hydrophobicity plots, flexibility predictions, secondary structure prediction and gaps/insertions in the sequence alignment. Choosing regions for mutation is followed by choosing the actual residues to be replaced. A mutant methionine should not replace a highly conserved residue. Mutations should replace large hydrophobic residues in regions with reduced conservation, especially when methionine shows up in a closely related homologous sequence. Based on these criteria the following methionine mutations were designed: RAP30(2-119)-F56M, RAP30(2-119)-L106M, RAP74(2-172)-L47M, RAP74(2-172)-F127M.

The single and double mutants were expressed under the special conditions required for incorporation of selenomethionine. Proteins were purified to homogeneity under denaturing conditions (Section 2.7). Incorporation of selenomethionine was checked by mass spectrometry. Then combinations with four or more methionine residues per complex were refolded and used for crystallization screening based on conditions similar to the wild type crystallization conditions. Complexes with the RAP30(2-119)-F56M mutation did not crystallize, while optimal crystallization conditions for the other complexes were shifted to slightly higher PEG1500 concentrations (22-24%) with respect to the selenomethionine prelabeled wild type complex, which crystallized under standard conditions (Table 30). Application of the standard dehydration protocol for RAP30(2-119)/RAP74(2-172) to the mutant crystals led to the same problems encountered with the cysteine-alanine mutants. Crystals including the RAP74(2-172)-L47M mutant did not diffract to high resolution (3.5 Å), and cell parameters as well as the space group seemed to be trapped in between the native C2 and the desired high resolution P1 crystal form. Space group was P1 with strong pseudosymmetry, leading to a diffraction pattern with alternating intensities at medium resolution and C2 appearance at low resolution. Depending on crystal orientation and resolution limits, crystals were autoindexed in C2 or P1 as indicated below (Table 30). Only crystals of RAP30(2-119)-L106M/RAP74(2-172)-F127M appeared to be of proper C2 space group. Although these crystals were not isomorphous to the native, high resolution P1 crystals, they could have been useful for a MAD experiment. In order to investigate this option a complete C2-dataset (96%) from 30 to 3 Å was collected (Table 31). The diffraction pattern showed some twinning and intensities were weak. Therefore it cannot be excluded that additional reflections indicating P1 space group with pseudosymmetry were present at higher resolution. C2-Harker sections (y=0) of anomalous difference Patterson maps for various resolution ranges and difference cut-off levels showed a number of peaks (3-4 I/σ, data not shown). The heavy atom sites were not localized as the analysis of more promising data received higher priority (see below).

Selenomethionine prelabeled wild type crystals were grown as well, since improvements in the sensitivity of the MAD-method made it now feasible to try structure determination with just two selenium atoms per 32 kDa. After dehydration under standard conditions the crystals were isomorphous to the high resolution native P1 crystal form. Screening statistics for these crystals were characteristic for all crystal forms obtained by dehydration. Of more than 200 crystals soaked, only 50 mountable crystal pieces could be collected after dehydration, 1-2 of which showed a diffraction pattern with well defined spots. These were taken to the BM14 MAD-beamline at the ESRF (λ = 1.003 Å). There, crystals diffracted to 2.5 Å but minor splitting became apparent. Because it was clear that these crystals would only allow structure determination under ideal conditions, it was decided that the beam time was more efficiently spent collecting data on the methylmercury prelabeled crystals as described below (Section 5.2.5).

Table 30: Characterization of selenomethionine labeled (methionine mutant) crystals.

1 \posm tt t In 1 hout p t ri r tn t ol I mti mi i ti im 1 ( C \n\ met \t is IP Ri ikukL700 11 utoiwith homo mile mm i Inx s

RAP74(2 172 RVP 12 r_i RVP74P P1 RAP71P 1721 wt I47M hPPl L4"Xl Pr"M

RAPoOu 110) no u)suis 11 elVstlls l scM

RAIPOR 110) Pl P\ P i3\ Pl o s y

I 106M (pseud osvmmt tn (Tills Ri (j-iseti 1 sv mincit yl

RAPsO 2 1101 no uvP ils rs6M I 10(A1

RAP->0(2 HOHvt 10 slVst lis Pl ~>9 V (isomoiphous to nitivt L

3 1 7 7 Pl I 3 ) 101 7 97 _ 1 L S3 0 3 1

Pl 43 ( SO 7 in> I n 1 1 ) 7 ( S3 1 S r s S 3 0 ]

PI I" 1 7 s] 7 104 4 91 I 104 1

Table 31: Native dataset of selenomethionine in vivo prelabeled RAP30(2-119)-L106M/RAP74(2-172)-F127M.

7 1 ixpuii tt i ki umds p r It \n\ ut w i 11 11 t4 I u lui u i 1 itiilin U th LSRt () OJ y) with i Qu 111 111) 1 t 11 i

^ Ctvstil loi in RVIPO nil 1 10CM R\P 4 PO ] 12"M

So eonditit ns ISPm PLC. (000 1(X) tiMI MO sOmMBis TnsCl pH6 S Sp ut u oup C2 Cell ss s "ft os tr" ss 741 )0 1 ir SP) 00

Rtsolution so , y oil oO 47

Unique u 1 les lions ( inomal uis ) 9962 (Ssf )

C omplt it nt s s 96 0C S4 o )

"* ( omplt luit ss hu del P uis 61 o"" S

R I of C- i

I/o S 4 Mosnuty 0 7 unit tell volunit P ""9 "9 4 \

"c Solvent content "'O

NCS i \CS 146

5.2.5 Methylmercury in vitro prelabeling

Since both cysteine residues of the RAP30/74-interaction domain were solvent accessible, an in vitro mercury prelabeling protocol was developed in parallel to in vivo selenomethionine prelabeling. Prelabeling of the single subunits with methylmercury nitrate under denaturing conditions before refolding was quantitative as verified by Ellman's assay130, but mass spectra of the complex just before crystallization showed that the methylmercury labels had been lost again. The mercury sulfur bond is similar to a covalent bond and very stable but has a certain, albeit very slow, off rate, which may be sufficient to dilute the labels away during the lengthy complex preparation involving large volumes of buffers for refolding, HPLC-purification, dialysis and concentration.

Therefore a protocol for labeling concentrated protein preparations just before crystallization was developed (Section 2.9). It was essential to incubate the protein with a six-fold molar excess of methylmercury nitrate over free thiol groups for at least 10-12 hours under exclusion of oxygen in order to quantitatively derivatize the protein thiol groups. Derivatization was monitored with Ellman's assay130 and MALDI-TOF mass spectrometry.
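The six-fold molar excess over free thiol groups translates into a reagent concentration once the protein concentration is known. The sketch below shows the unit conversion (mg/ml divided by kDa gives mM). The protein concentration and molecular weight in the example are hypothetical inputs chosen for round numbers; two thiols per complex corresponds to the two cysteine residues mentioned in the text.

```python
def reagent_concentration_mM(protein_mg_per_ml, mass_kda,
                             thiols_per_complex=2, fold_excess=6.0):
    """Methylmercury reagent concentration (mM) for a given molar excess.

    mg/ml divided by kDa equals mmol/l, because 1 kDa = 1 g/mmol.
    """
    protein_mM = protein_mg_per_ml / mass_kda
    thiol_mM = protein_mM * thiols_per_complex
    return thiol_mM * fold_excess

# Hypothetical example: 10 mg/ml of a 20 kDa complex with two free thiols
needed_mM = reagent_concentration_mM(10.0, 20.0)
```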

Crystals were grown under standard conditions for the respective complexes, excluding sulfur containing reducing agents (DTT, 2-mercaptoethanol) from all buffers. This should guarantee that crystals were isomorphous with the native crystal forms and that the methylmercury labels stayed in place. Prelabeled crystals were much smaller than native crystals. Single macroscopic crystals grew exclusively on the cover slips (Figure 20A). Free floating crystals were even smaller and formed tightly packed, twinned clusters (Figure 20B). Splinters from the crystals on the cover slips and crystal clusters were harvested for dehydration. Thereafter, the crystal clusters were separated with a micro scalpel and the splinters from the crystals on the cover slips were cut along the visible cracks. These crystal fragments (20-80 x 100-150 x 150000 µm) were mounted on fiber loops and flash cooled in liquid nitrogen. Numerous crystals had to be screened until several suitable crystal pieces could be saved for data collection.

Figure 20: Methylmercury prelabeled crystals of RAP30(2-119)/RAP74(2-154).

A: Crystals growing on the cover slip. B: Free-floating crystal clusters. [Drop and well compositions are illegible in the source.]

At the ID14/4 undulator beam line at the ESRF, crystals of methylmercury prelabeled RAP30(2-119)/RAP74(2-172) diffracted to 2.5 Å. The dehydrated crystals had space group P1 but were non-isomorphous with the native high resolution P1 crystals (Table 32). Since it had been shown by mass spectrometry that these crystals contained the methylmercury labels (data not shown), they were chosen for MAD-data collection at the BM14 bending magnet beam line at the ESRF. The optimal wavelength for collection of the "inflection point" dataset was determined to be 1.0092 Å by X-ray fluorescence spectroscopy (XAFS) (Figure 21A). Exposures were for 90 seconds per degree in a cold nitrogen gas stream (-170 °C).

Two more datasets were collected on the same flash cooled crystal. Wavelengths were selected based on theoretical calculations of the imaginary (f'') and real (f') components of anomalous scattering with respect to X-ray energy. The second dataset was collected at 1.0003 Å (12.400 keV). It will be referred to as the "peak" wavelength although it was collected far away from the mercury LIII edge. The third dataset was collected at 0.9536 Å (13.001 keV). It will be referred to as the "remote" wavelength (Figure 21B).
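The wavelengths and photon energies quoted above are related by E = hc/λ. The following minimal sketch, which is an illustration and not part of the original data processing, converts between the two; the constant and the function names are assumptions introduced here.

```python
# Conversion between X-ray wavelength (Å) and photon energy (keV), E = hc / lambda.
# HC_KEV_ANGSTROM is the product h*c expressed in keV·Å (rounded CODATA value).
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def energy_to_wavelength_angstrom(energy_kev: float) -> float:
    """Return the wavelength in Å for a photon energy given in keV."""
    return HC_KEV_ANGSTROM / energy_kev


if __name__ == "__main__":
    # Wavelengths taken from the text: inflection point and remote datasets.
    for label, lam in (("inflection point", 1.0092), ("remote", 0.9536)):
        print(f"{label}: {lam:.4f} Å = {wavelength_to_energy_kev(lam):.3f} keV")
```

With these inputs the remote wavelength of 0.9536 Å corresponds to about 13.00 keV, in agreement with the value given in the text, and the inflection point at 1.0092 Å corresponds to about 12.29 keV.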

Figure 21: XAFS experiment and theoretical plot of the mercury anomalous scattering components.

A: XAFS experiment with a crystal of methylmercury labeled RAP30(2-119)/RAP74(2-172). B: Theoretical plots of f' (real component) and f'' (imaginary component) for the mercury L edges (CROSSEC, CCP4).

[Plots, panels A and B. Horizontal axes: photon energy E (keV) and wavelength λ (Å).]

The dehydrated crystals of methylmercury prelabeled RAP30(2-119)/RAP74(2-172) had space group P1 with 4 molecules per asymmetric unit. The 4-fold non-crystallographic symmetry was similar to the original C2 unit cell arrangement, with two pairs of 2-fold rotational symmetry related by translational symmetry along (0.5 0.5 0). Details of this arrangement are discussed below (Section 6). The three datasets were 97% complete and included most of the Friedel pairs (ca. 70%). Frames were merged with an overall Rmerge of 3.6%. Solvent content was 28.4%, corresponding to a Matthews volume of 1.72 Å³/Da171 (Table 32). The MAD-data were used for phase determination, initial model building and refinement.
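The relation between Matthews volume and solvent content quoted above can be sketched as follows. The constant 1.23 Å³/Da is the commonly used approximation for the volume occupied by protein per dalton; it is an assumption here, since the text does not state which value was used, and the function name is hypothetical.

```python
# Approximate solvent fraction from the Matthews volume V_M (Å^3/Da).
# PROTEIN_VOLUME_PER_DA = 1.23 Å^3/Da is the usual textbook approximation and
# an assumption in this sketch; the thesis does not state the constant used.
PROTEIN_VOLUME_PER_DA = 1.23


def solvent_fraction(matthews_volume: float) -> float:
    """Return the estimated solvent fraction (0..1) for a given V_M."""
    return 1.0 - PROTEIN_VOLUME_PER_DA / matthews_volume


if __name__ == "__main__":
    vm = 1.72  # Å^3/Da, value quoted in the text
    print(f"V_M = {vm:.2f} Å^3/Da -> solvent content ~ {100 * solvent_fraction(vm):.1f}%")
```

For V_M = 1.72 Å³/Da this gives a solvent content of roughly 28%, close to the 28.4% reported; the small difference presumably reflects a slightly different protein volume constant.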

Table 32: MAD-data collection for methylmercury prelabeled RAP30(2-119)/RAP74(2-172).

Exposures were for 90 s/°. X-ray source was the BM14 bending magnet beam line at the ESRF with a MARCCD detector.

Crystal form:        methylmercury prelabeled RAP30(2-119)/RAP74(2-172)
Soaking conditions:  18-33.56% PEG 6000, 100 mM LiNO3, 50 mM Bis-TrisCl pH 6.5
Space group:         P1
Cell:                47.880 72.286 80.859 102.347 91.587 104.963
Resolution:          25 - 2.65 Å (2.71 - 2.65 Å)

                                  "inflection point"   "peak"            "remote"
Wavelength:                       1.0092 Å             1.0003 Å          0.9536 Å
Unique reflections (anomalous):   57629 (3536)         57519 (3473)      58006 (0664)
Completeness:                     97.0% (87.8%)        97.0% (88.2%)     97.5% (93.1%)
Completeness "Friedel pairs":     79.4% (25Jcb)        69.8% (25.4%)     72.4% (67.5%)
Rmerge:                           3.6% (38.6%)         3.6% (38.7%)      3.7% (28.1%)
I/σ:                              12.4 (1.7)           16.1 (1.7)        12.5 (2.4)
Mosaicity:                        0.7°                 0.7°              0.7°

Unit cell volume:    263095.2 Å³
Solvent content:     28.4%
NCS:                 4-fold NCS similar to the original C2 unit cell arrangement.
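The unit cell volume in Table 32 follows from the cell parameters through the general triclinic volume formula. The sketch below is an independent check rather than the program output used in the thesis; the function name is hypothetical.

```python
import math


def triclinic_cell_volume(a, b, c, alpha_deg, beta_deg, gamma_deg):
    """Volume of a general (triclinic) unit cell in Å^3.

    V = abc * sqrt(1 - cos²α - cos²β - cos²γ + 2 cosα cosβ cosγ)
    """
    ca = math.cos(math.radians(alpha_deg))
    cb = math.cos(math.radians(beta_deg))
    cg = math.cos(math.radians(gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


if __name__ == "__main__":
    # Cell parameters of the dehydrated P1 crystal form as listed in Table 32.
    volume = triclinic_cell_volume(47.880, 72.286, 80.859, 102.347, 91.587, 104.963)
    print(f"Unit cell volume: {volume:.1f} Å^3")
```

The result agrees with the tabulated 263095.2 Å³ to within rounding of the cell parameters.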

5.3 Discussion of derivative screening and data collection

Heavy atom derivative screening by conventional soaking and cocrystallization170,210,211 was seriously impaired by the prevailing variability of space groups and cell parameters. Even datasets of the same crystal form, dehydrated with the same dehydration protocol and identical dehydration buffers, were not fully isomorphous as indicated by high Rmerge values. The overall Rmerge was 12% for two native datasets of RAP30(2-119)/RAP74(2-172) collected at the SNBL and the ID14/4 beam lines at the ESRF, while the individual Rmerge-factors were only 6.5% and 3.6%225. The second problem was the low yield of well diffracting crystals with the dehydration protocols, which was typically around 5-10%. Thus very few intact crystals went into derivative screening in the first place, such that the adverse effects of heavy atom soaking and dehydration were difficult to deconvolute. Hundreds of crystals were used for time-intensive dehydration and heavy atom soaking experiments. The few more or less intact crystals that could be characterized in the X-ray beam needed long exposure times of 1-2 hours per degree. This would have required 1-2 weeks of beam time for a complete low resolution dataset as needed for heavy atom localization by conventional isomorphous difference Patterson techniques. Thus, data were processed online in order to save time. Low quality or non-isomorphous crystals were immediately discarded. SCALEPACK χ² testing140 was used to exclude crystals that were probably not derivatives. In retrospect, the applicability of this test cannot be judged since not a single derivative crystal was found with soaking or cocrystallization methods. Comparing MAD datasets of different wavelengths gives SCALEPACK χ² values between 1.2 and 2.5, which would be below the limits for a statistically significant derivative signal. Therefore, the method seems to harbor some peril because nicely isomorphous but weak derivatives may well be excluded by the proposed criteria.

As a last resort for generating heavy atom derivative crystals, in vivo and in vitro prelabeling methods were used. Given the swift success with this approach, it seems advisable to put these methods at the beginning of heavy atom screening in general. The time spent for the somewhat demanding preparation will easily be regained once crystals are obtained, since no further screening and crystallographic characterization are necessary to prove that they actually are derivative crystals with high occupancy heavy atom sites. Simple mass spectrometry will answer that question. The value of this kind of information about presence and number of heavy atom sites in derivative crystals, independent of crystallographic data, cannot be overestimated, especially in cases such as this one characterized by variable space groups, variable cell constants and low reproducibility. Even when isomorphous native crystals are not available, the prelabeled crystals can be used for structure determination by MAD methods223,226,227 as applied here for the RAP30/74-interaction domain of human TFIIF.

The remaining challenge was the accurate choice of wavelengths for MAD-data collection involving mercury atoms. The usual setting is to collect a first dataset at the LIII edge inflection point228. At this wavelength the real component of anomalous scattering f' is minimal (Figure 21B). The exact position of this inflection point depends on the chemical environment of the mercury atoms and can be determined by X-ray fluorescence spectroscopy (XAFS), since the fluorescence signal is proportional to the imaginary component f'' of anomalous scattering (Figure 21). Theoretically, a second dataset at a "remote" wavelength with maximal f'' value and a high dispersive difference ΔF with respect to the first wavelength is sufficient to solve the phase problem228. If the anomalously scattering atoms show significant white-line features, a third wavelength at the "peak" of the absorption edge is collected. However, mercury does not have empty states in the 5d atomic band, which are necessary for this additional peak of the imaginary component f'' of anomalous scattering. It has been pointed out that for mercury derivatives optimal conditions for the second dataset are found at "remote" wavelengths corresponding to energies 200-300 eV above the LI edge. There, f'' is slightly larger than at the peak of the LIII edge and a 30% increase in dispersive signal Δf' with respect to wavelengths between the LIII and LII edges can be obtained228. However, at energies above the LI edge (15 keV, 0.826 Å) the BM14 bending magnet beam line at the ESRF had much lower intensity than between the LIII and LII edges, which would have required exposure times of 200 seconds per degree as opposed to 90 seconds per degree for data collection to 2.65 Å. Based on this observation it was decided that the limited beam time was probably better spent collecting two full datasets between the LIII and LII edges instead of one beyond the LI edge, and to use the remaining beam time for characterization and data collection with selenomethionine prelabeled wild type crystals. The third "remote" dataset was rather a repetition of the second "peak" dataset and little additional phase information was obtained. This becomes obvious when comparing f' and f'' levels at the particular wavelengths in Figure 21. Nevertheless, the third dataset contributed to redundancy and over-determination of the MAD data for phase determination.

All three datasets were collected on a single crystal of prelabeled RAP30(2-119)/RAP74(2-172) in a dry nitrogen gas stream at -170 °C. To collect all 960 frames the crystal was exposed for 240 minutes to intense synchrotron X-ray radiation. However, comparing intensities and intensity profiles of the three datasets, no significant changes which might be attributed to radiation damage038 can be observed (Figure 22).

Figure 22: Stability of crystalline diffraction limits during MAD-data collection.

The intensity profile at different stages of the data collection is given as a dotted line for the first "inflection point" dataset (1.0092 Å) and as a dashed line for the second "peak" dataset (1.0003 Å). The third "remote" dataset (0.9536 Å) is represented by the solid line. The slightly higher intensity of the third dataset may be accounted for by the intensity profile of the BM14 bending magnet beam line.

[Plot. Horizontal axis: resolution (Å).]

6 Crystal structure of the RAP30/74 interaction domains of human TFIIF

6.1 Overview of the structure determination process from phase determination to model validation

The structure of the RAP30/74-interaction domains of human TFIIF was solved based on the MAD experiment described in the previous chapter. The process from the experimental MAD data to final structure refinement and validation involved more steps and iterative cycles than classical structure solution and refinement protocols as described by Kleywegt, Jones and Brünger137,229 (Figure 23). The first steps were phase determination and improvement by fourfold NCS-averaging230 and solvent flattening231,232. Phases were further refined by two rounds of "phase cycling", which employs the phases improved by density modification as restraints for heavy atom parameter refinement. Using these phases an initial model was built and refined against the "inflection point" MAD dataset to 3 Å. This involved seven classical "macrocycles" of model building and refinement as well as improvement of the NCS-masks and operators. The model comprised all four copies of the RAP30/74 interaction domain complex in the P1 unit cell, which were refined using non-crystallographic symmetry restraints. The most complete copy was subsequently used as a search model for molecular replacement233,234 against the high resolution native dataset. Four positions with reasonable packing were found in the native P1 unit cell. The four heterodimers were thoroughly rebuilt and extended in two "macrocycles" of refinement without NCS-restraints. The four partial models were then superimposed138 to generate an almost complete search model for "molecular re-replacement". Another eight "macrocycles" of rebuilding and refinement against native data to 1.7 Å led to the final model. The individual steps of the refinement process will be discussed in detail below.

Figure 23: Process of structure determination for RAP30/74 interaction domains of human TFIIF (based on Kleywegt and Jones229).

[Flow chart. Recoverable box labels, top to bottom:]
Experimental MAD data
Location of HA sites; basic NCS information
Refinement of HA parameters
Initial phases
Initial electron density maps -> initial NCS operators
Local correlation maps -> initial NCS masks
NCS refinement: skeletonization, NCS mask editing, NCS operator refinement
Phase cycling
NCS averaging & solvent flattening
Improved phases & electron density maps
Phase combination; improved model
Refinement macrocycles
Final model

6.2 Phase determination and phase improvement by density modification and phase cycling2

De novo phase determination involves heavy atom site location, parameter refinement and phase calculation. The initial heavy atom sites are found by solving a difference Patterson function. Normally, the number of heavy atom sites is not known in advance and some of them will have low occupancy, which adds to the difficulties in placing the initial heavy atom sites. For additional sites and derivatives, difference Fourier techniques can be applied, which are more definitive.

In the case of the RAP30/74 interaction domains, location of the heavy atom positions was significantly facilitated because MAD-data had been collected on in vitro prelabeled crystals. Each subunit contained one cysteine residue which was quantitatively derivatized with methylmercury before crystallization as shown by Ellmann's assay156. Heavy atom labels were preserved in the crystals as shown by mass spectrometry (data not shown). Therefore exactly two mercury sites per RAP30/74-heterodimer, or four mercury sites per RAP30/74-heterotetramer, were expected. The number of RAP30/RAP74 heterodimers in the dehydrated P1 unit cell was derived from the native C2 unit cell arrangement, which has 4 asymmetric units and room for maximally one heterodimer per asymmetric unit. This was consistent with an estimated solvent content of 33.5% for the native crystals assuming a volume of 1.0 Å³/Da (TRUNCATE). Upon dehydration the unit cell volume decreased by only 6.3%, indicating that there were still either four heterodimers or two heterotetramers in the P1 unit cell. Analysis of the basic NCS-information contained in the MAD-data revealed that not only the number but also the overall arrangement of the molecules in the native C2 unit cell had been preserved. A strong peak at (0.5 0.5 0) in the native Patterson map of the P1 dataset and alternating intensity levels between reflections with h+k = 2n+1 and h+k = 2n indicated significant translational non-crystallographic symmetry which was nicely lined up with the crystallographic translational symmetry element of the original C2 unit cell (TRUNCATE). This translational pseudosymmetry of the P1 unit cell was supplemented to a C2-like unit cell organization by 2-fold rotational NCS. The rotation axis was parallel to the c-face of the P1 unit cell as inferred from strong peaks in the self-rotation function (κ=180°, ω=90°, φ=66.9°; AMORE, POLARRFN).

2 If not stated otherwise, all programs cited in this chapter are part of the CCP4 program suite171 or RAVE142 (Section 2.1.13).
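The alternating intensity levels between reflections with h+k even and h+k odd follow directly from the translational NCS: two identical copies related by a shift t contribute F(hkl)·(1 + exp(2πi h·t)). The sketch below evaluates the modulus of this factor; it is an illustration under the idealized assumption of two exactly identical copies, and the function name is hypothetical.

```python
import cmath


def translation_factor(hkl, shift):
    """Return |1 + exp(2*pi*i * h.t)| for two identical copies related by `shift`.

    `hkl` are Miller indices, `shift` is the translation in fractional coordinates.
    A value near 0 means the reflection is (nearly) extinguished, a value near 2
    means both copies scatter in phase.
    """
    phase = 2.0 * cmath.pi * sum(h * t for h, t in zip(hkl, shift))
    return abs(1.0 + cmath.exp(1j * phase))


if __name__ == "__main__":
    ideal = (0.5, 0.5, 0.0)          # exact C-centering-like translation
    refined = (0.5040, 0.4973, 0.0)  # refined NCS translation reported later in the text
    for hkl in ((1, 0, 0), (1, 1, 0), (2, 1, 3), (3, 1, 2)):
        parity = "even" if (hkl[0] + hkl[1]) % 2 == 0 else "odd"
        print(hkl, f"h+k {parity}:",
              f"ideal {translation_factor(hkl, ideal):.3f},",
              f"refined {translation_factor(hkl, refined):.3f}")
```

For the exact (0.5 0.5 0) translation the h+k odd reflections vanish, as in a true C-centered cell; with a slightly deviating translation, or with copies that differ in orientation, they are weak but observable, which is the pattern described above.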

The eight mercury sites corresponding to the four RAP30/74-complexes in the dehydrated P1 unit cell were located by visual inspection of the difference Patterson maps and vector recombination with RSPS. In the general case, these would lead to 57 peaks in the difference Patterson maps, but in this case with the translational NCS only 26 peaks were expected. Indeed, inspection of the anomalous and dispersive difference Patterson maps revealed approximately 22-30 peaks above 3 I/σ (Figure 24).
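The expected peak counts can be reproduced by enumerating all interatomic vectors of a P1 cell. The sketch below uses made-up site coordinates on an integer grid (so that comparisons are exact); only the counting argument, not the coordinates, relates to the text.

```python
from itertools import product


def count_patterson_peaks(sites, grid=1000):
    """Count distinct interatomic vectors (including the origin peak) in a P1 cell.

    `sites` are integer grid coordinates; vectors are taken modulo `grid`,
    i.e. modulo lattice translations.
    """
    vectors = {
        tuple((a - b) % grid for a, b in zip(p, q))
        for p, q in product(sites, repeat=2)
    }
    return len(vectors)


# Hypothetical coordinates: eight unrelated sites (general case).
general_sites = [(0, 110, 700), (1, 420, 35), (4, 905, 310), (9, 260, 880),
                 (15, 615, 150), (22, 50, 490), (32, 780, 640), (34, 330, 20)]

# Hypothetical coordinates: four sites plus their copies shifted by (1/2, 1/2, 0).
half = [(0, 110, 700), (1, 420, 35), (4, 905, 310), (6, 260, 880)]
ncs_sites = half + [((x + 500) % 1000, (y + 500) % 1000, z) for x, y, z in half]

if __name__ == "__main__":
    print(count_patterson_peaks(general_sites))  # 8*7 + 1 = 57
    print(count_patterson_peaks(ncs_sites))      # 2 * (4*3 + 1) = 26
```

With an exact (0.5 0.5 0) translation, each vector u between two sites of one set coincides with the vector between the translated pair, and u + t coincides with u - t, which reduces the 57 peaks of the general case to 26.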

Figure 24: Anomalous difference Patterson map.

The anomalous difference Patterson map for the entire P1 unit cell was calculated for the "peak" dataset (1.0092 Å). Cutoff is 3.0 I/σ. [Resolution limits and contour levels are illegible in the source.]

The crystal belonged to the P1 space group. Therefore, the first heavy atom site could be placed at the origin such that Patterson space and real space became virtually identical. Every other Patterson peak represented either another heavy atom site or a vector connecting two sites. A second heavy atom site could be placed at any other strong Patterson peak. The third site could not be chosen as freely among the Patterson peaks as the second, since the connecting vector (peak) to the second site had to be present as well. Such a set of four Patterson peaks which included the origin and was consistent with two heavy atom sites in addition to the origin site was called a consistent group of peaks. These consistent groups were easily found by vector combination with RSPS. Due to the translational NCS, each consistent group that did not involve the symmetry vector (0.5 0.5 0) could be found twice, leading to pairs of consistent groups which could be used to locate 6 out of 8 mercury sites.

Unfortunately, each consistent vector group can be interpreted in several ways (Figure 25). Each vector can be attached to the origin in two different orientations, leading to six potential sites around the origin from which two must be selected. After correction for degenerate solutions which involve the same vector twice, mirror images or identical solutions, only three possible combinations remained. These three solutions for 6 heavy atom sites were completed by selecting the additional two mercury sites from difference Fourier maps (MLPHARE). By shifting the origin between the heavy atom sites of individual solutions it was shown that all three solutions were identical, as supported by identical phasing statistics with mean figures of merit of 0.490 (25-3 Å, MLPHARE, stage I, Figure 27).

Figure 25: Vector recombination diagram.

[Diagram. Recoverable labels: "Select 2 sites out of 6"; "Solution uses the same vector twice"; "Identical solutions"; "Unique solutions".]

Electron density maps calculated with these initial phases were barely interpretable at this stage. In order to ensure that the heavy atom locations were correct, three independent methods were utilized. Patterson deconvolution by symmetry minimization as implemented in SHAPE144 and sequential site selection by difference Fourier techniques with MLPHARE (CCP4) confirmed the proposed heavy atom solution. Direct methods146 as well as the Patterson interpretation algorithm145 of SHELX-97 failed to find any heavy atom sites at all. It was concluded that the vector combination solution described above was essentially correct.

Heavy atom parameters were refined further using VECREF. The program refines heavy atom positions in vector space, which results in a much larger theoretical convergence radius (2-3 Å) than refinement in reciprocal space (1-2 Å) as implemented in most other programs (MLPHARE)235. Heavy atom parameters were then refined with MLPHARE, treating MAD-data as a special case of multiple isomorphous replacement as described by Ramakrishnan227. The "inflection point" dataset was used as pseudo-native dataset, which means that dispersive differences ΔF were calculated with respect to this dataset. Optimal refinement conditions were to fix one heavy atom site at the origin, to exclude large dispersive and anomalous differences from the refinement and to fix the isotropic B-factors at 35 Å². Refinement of heavy atom positions and occupancies was performed in two rounds: first against dispersive differences, then against anomalous differences. This yielded a mean figure of merit of 0.53 (25-3 Å, MLPHARE, stage II, Figure 27). Additional refinement of heavy atom parameters was achieved in combination with NCS-averaging and solvent flattening (see below).

Iterative heavy atom parameter refinement against dispersive and anomalous differences may not yield the optimal result because it starts with very crude phase information. Therefore, improved phases after solvent flattening, NCS-averaging or model phases should be used to restrain initial heavy atom parameter refinement. Two rounds of this "phase cycling" (Cycle I, Figure 23) increased the mean figure of merit to 0.571 (25-3 Å, MLPHARE, stage III, Figure 27). Upon application of heavy atom parameter restraints even isotropic temperature factors could be refined to a reasonable average value of 58.6 Å². Without these restraints B-factors reached values above 100 Å², which impaired phasing statistics.

At this final stage of heavy atom parameter refinement, phasing power was maximal at 6 Å and dropped below 1.00 between 4 and 3.6 Å. Overall phasing power was 0.84 and 0.69 (25-2.65 Å) for the "peak" and "remote" datasets. Dispersive RCullis was around 0.90 with a shallow minimum at 6 Å. Anomalous RCullis was somewhat lower (0.75-0.80, Figure 26). These data indicated acceptable accuracy of the initial phasing information to approximately 3.5 Å, where the mean figure of merit dropped below 0.5 (mean phase error 60°). Up to the diffraction limit of the crystal (2.65 Å) initial phases were of very poor quality (Figure 26).
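The correspondence between a figure of merit of 0.5 and a mean phase error of 60° follows from the usual interpretation of the figure of merit as the expected cosine of the phase error. A minimal sketch of that conversion is given below; the function name is hypothetical.

```python
import math


def phase_error_from_fom(fom: float) -> float:
    """Return the phase error in degrees implied by a figure of merit m = <cos(delta_phi)>."""
    if not 0.0 <= fom <= 1.0:
        raise ValueError("figure of merit must lie between 0 and 1")
    return math.degrees(math.acos(fom))


if __name__ == "__main__":
    # Mean figures of merit quoted in the text for the three refinement stages.
    for stage, fom in (("I", 0.490), ("II", 0.530), ("III", 0.571)):
        print(f"stage {stage}: FOM {fom:.3f} -> ~{phase_error_from_fom(fom):.0f}° phase error")
```

This treats the mean figure of merit as if it applied to a single phase, so the values are only rough guides to the average phase error.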

Figure 26: Phasing power and RCullis of the final heavy atom model.

Phasing power and RCullis are both given as a function of resolution. Dispersive differences were calculated with respect to the "inflection point" dataset (λ 1.0092 Å). Therefore only an anomalous RCullis is given for this dataset (solid line). Phasing statistics of the "peak" dataset (λ 1.0003 Å) are given as dashed lines and for the "remote" dataset (λ 0.9536 Å) as dotted lines.

[Plot. Horizontal axis: resolution (Å), 10.00 to 4.00.]

These rather crude phases were improved by 4-fold non-crystallographic symmetry averaging230 and solvent flattening231,232. The non-crystallographic symmetry comprised not only rotational but also translational components. Therefore, it was classified as improper non-crystallographic symmetry. Depending on the software used, averaging required either four NCS-operators and a monomer mask including a single RAP30/74 heterodimer (DM) or four times four NCS-operators and a separate mask for each RAP30/74 heterodimer (SOLOMON).

By definition, the first NCS-operator is always the unity operator. The second was found based on the translational NCS-peak in the native Patterson map at (0.5 0.5 0) and the heavy atom site locations (Figure 23). The program FINDNCS detected a minor rotational component in the mainly translational symmetry (ca. 0.5°) and a slight deviation from the original (0.5 0.5 0) vector to (0.5040 0.4973 0). The third operator was obtained with the program GETAX, which located the optimal position of the 2-fold rotation axis based on a low resolution electron density map (15-4.5 Å) and the peak of the self-rotation function (axis position: 0.033 0.000 0.550). The fourth NCS-operator was constructed by combination of the translational with the rotational operator. An initial heterodimer mask was generated from a local electron density correlation map (COMA, probe radius 5.1 Å) which was calculated based on the initial electron density map (25-2.65 Å) and the four crude NCS-operators. The local correlation map was contoured to include 10% of the unit cell, which corresponds to an approximate solvent content of 30%. The raw mask was edited with MAMA and O141 until it contained only one RAP30/74 heterodimer. Some parts of the protein density were not included in this NCS-mask. This was either a consequence of low NCS-correlation or an artifact created by the crude NCS-operators. Approximate protein/solvent boundaries were determined by skeletonization of the initial electron density map (MAPMAN). Some β-strand density could be filled with BONES but it was not possible to determine their connectivity. This skeleton and the heavy atom sites were used as guides to extend the raw NCS-mask with geometrical mask elements such as spheres and cubes in order to include the whole RAP30/74 heterodimer. Finally, all sharp edges, cavities or islands (high resolution features) were removed from the mask with MAMA. This mask was then used to refine the initial NCS-operators.
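The construction of the fourth NCS-operator by combining the translational with the rotational operator can be sketched as a composition of rigid-body operators x -> Rx + t. The example below works in orthogonal Å coordinates with made-up numbers for the translation and the axis position; only the axis direction angle of 66.9° is taken from the text, and all function names are hypothetical.

```python
import math


def apply_rotation(R, v):
    """Multiply a 3x3 matrix (list of rows) with a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def twofold_about_axis(direction, point):
    """Return (R, t) for a 180° rotation about the axis through `point` along `direction`."""
    norm = math.sqrt(sum(c * c for c in direction))
    n = [c / norm for c in direction]
    # Rodrigues formula for a 180° rotation: R = 2 n n^T - I
    R = [[2.0 * n[i] * n[j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]
    Rp = apply_rotation(R, point)
    t = [point[i] - Rp[i] for i in range(3)]  # keeps points on the axis fixed
    return R, t


def compose(op2, op1):
    """Return the operator equivalent to applying op1 first, then op2."""
    R2, t2 = op2
    R1, t1 = op1
    R = [[sum(R2[i][k] * R1[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    Rt = apply_rotation(R2, t1)
    return R, [Rt[i] + t2[i] for i in range(3)]


def apply(op, x):
    """Apply the operator (R, t) to the point x."""
    R, t = op
    Rx = apply_rotation(R, x)
    return [Rx[i] + t[i] for i in range(3)]


IDENTITY = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])

if __name__ == "__main__":
    phi = math.radians(66.9)  # axis direction within the xy-plane
    twofold = twofold_about_axis((math.cos(phi), math.sin(phi), 0.0), (1.5, 0.0, 44.0))
    translation = (IDENTITY[0], [24.0, 36.0, 0.0])  # hypothetical NCS translation in Å
    fourth = compose(translation, twofold)          # rotate first, then translate
    print(apply(fourth, [10.0, 5.0, 3.0]))
```

Applying the 2-fold operator twice returns every point to its start, which is a convenient sanity check for an operator set of this kind.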

NCS-operator refinement in IMP maximizes the correlation of a given masked area of the electron density map with the NCS-related regions. The final NCS-operators therefore depend heavily on the mask, which was originally generated using the crude NCS-operators. Thus, refinement of NCS-masks and operators is interdependent. Editing of the masks based on the putative protein-solvent boundaries introduces some bias towards the low correlation regions of the original local correlation map, which has to be removed before NCS-averaging. The ultimate test for each combination of NCS-masks and operators is a final correlation map. The high correlation area should cover the entire mask that was involved in operator refinement. Otherwise, the non-crystallographic symmetry is broken and the low correlation region should be removed. In the case of improper non-crystallographic symmetry, the overlapping regions between the NCS-mask and its symmetry related positions must be removed as well (NCSMASK).

Following the above analysis, the scene was set for NCS-averaging and solvent flattening. There were two programs available for these density modification techniques, DM232 and SOLOMON231. Since there was no analytical way of determining which technique or parameterization was most appropriate for the given problem, several density modification protocols were applied in parallel and compared on the level of the electron density maps. This method provided the advantage that even very weak features could be used for model building if they appeared in all averaged maps, and that gross errors could be avoided by excluding features which appeared in only one electron density map. All protocols had in common that phases were extended in many small steps (50-500) from medium resolution (3.8 Å) to the diffraction limit of the crystal (2.65 Å). Starting from the final heavy atom parameter set, the mean figure of merit was extended from 0.571 to 0.787 (DM, 25-3 Å, stage III, Figure 27).
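A phase extension in many small steps can be pictured as a list of gradually increasing resolution limits. The sketch below spaces the limits evenly in 1/d³, so that each step adds roughly the same number of reflections; this spacing is an assumption made for illustration and is not necessarily the scheme applied by DM or SOLOMON.

```python
def phase_extension_schedule(d_start: float, d_end: float, n_steps: int):
    """Return n_steps + 1 resolution limits (Å) from d_start to d_end.

    The limits are equally spaced in 1/d^3, which is proportional to the number
    of reflections inside the resolution sphere. This spacing is an assumption
    of this sketch, not a documented program default.
    """
    s3_start = d_start ** -3
    s3_end = d_end ** -3
    return [
        (s3_start + (s3_end - s3_start) * i / n_steps) ** (-1.0 / 3.0)
        for i in range(n_steps + 1)
    ]


if __name__ == "__main__":
    # Resolution range and step count as used for the DM protocol in the text.
    schedule = phase_extension_schedule(3.8, 2.65, 500)
    print(f"{len(schedule)} limits, first {schedule[0]:.3f} Å, last {schedule[-1]:.3f} Å")
    print("every 100th limit:", [round(d, 3) for d in schedule[::100]])
```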

Figure 27: Effect of heavy atom parameter refinement in MLPHARE and density modification with DM.

The three different stages of heavy atom parameter refinement are labeled with roman numbers (black). I: Heavy atom parameter refinement including positions, isotropic B-factors and occupancies against dispersive and anomalous differences in MLPHARE (reciprocal space). II: Vector space refinement (VECREF) followed by reciprocal space refinement in MLPHARE including positions and occupancies against dispersive and anomalous differences; B-factors were set to 35 Å². III: Phases after two rounds of "phase cycling", starting with phases as described for II (black) followed by density modification (red) as restraints for reciprocal space heavy atom parameter refinement in MLPHARE including positions, occupancies and isotropic B-factors against dispersive and anomalous differences. Each set of parameters was used for initial phase calculation and a density modification protocol in DM involving 4-fold non-crystallographic symmetry averaging and solvent flattening with phase extension from 3.8 Å to 2.65 Å in 500 steps (red).

[Plot. Vertical axis: mean FOM (0 to 1); horizontal axis: resolution (Å).]

Mean FOM (25-3 Å):
Stage    MLPHARE    DM
I        0.490      0.725
II       0.530      0.784
III      0.571      0.787

6.3 Model building and refinement

The initial model of the RAP30/74-interaction domains of human TFIIF was built based on the "peak" dataset of the MAD experiment and MAD phases after density modification as described above. In seven classical "macrocycles"229 of model building and restrained refinement the "triple β-barrel" core structure of the RAP30/74 heterodimer was solved. Intermediate models were used to further refine NCS-masks and operators (Cycle III, Figure 23). Upon phase combination and additional "phase cycling" most turns connecting the β-strands of the "triple barrel" gradually became visible. Long loops and extensions from this core structure remained incomplete or were missing entirely. The MAD-models were refined with CNS137,236,237 by energy minimization and simulated annealing (4000 K), first with non-crystallographic symmetry constraints (1 heterodimer), then with non-crystallographic symmetry restraints (4 heterodimers). Finally, the four RAP30/74-models encompassed 79% of the backbone atoms (914 residues) and 31% of the side chains (363 side chains). This resulted in a crystallographic R-factor of 43% (10-3 Å, bulk solvent correction, anisotropic B-factor correction, "peak" dataset). The random test set of 4000 reflections (7%) gave an Rfree-factor of 45%.
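The crystallographic R-factor and Rfree quoted above are both the same residual, evaluated on the working set and on the withheld test set respectively. A minimal sketch with made-up amplitudes is given below; the function name and the numbers are illustrative assumptions, and no bulk solvent or anisotropic correction is modeled.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Return R = sum(|Fobs - k*Fcalc|) / sum(Fobs) for paired amplitude lists."""
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


if __name__ == "__main__":
    # Hypothetical structure factor amplitudes; the last two reflections play the
    # role of a test set that is excluded from refinement.
    f_obs = [120.0, 85.0, 40.0, 66.0, 150.0, 30.0]
    f_calc = [100.0, 95.0, 55.0, 60.0, 120.0, 45.0]
    work, free = slice(0, 4), slice(4, 6)
    print(f"R     = {r_factor(f_obs[work], f_calc[work]):.3f}")
    print(f"Rfree = {r_factor(f_obs[free], f_calc[free]):.3f}")
```

Because the test reflections never enter refinement, Rfree serves as a check against overfitting; a small gap between R and Rfree, as for the 43% and 45% reported here, is expected for an incomplete early model.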

The most complete RAP30/74-heterodimer was then used as a search model for molecular replacement against the high resolution native dataset. The cross-rotation function calculated with a searching sphere radius of 45 Å and a sampling step of 2.5° in AMORE233 showed two distinct peaks with approximately 180° difference, corresponding to the 2-fold rotational non-crystallographic symmetry. This result was reproduced with various resolution ranges between 20 and 3.5 Å. One molecule was fixed at the origin and oriented according to the highest peak in the cross-rotation function. Translation searches against data of various resolution ranges between 15 and 3.5 Å located three more RAP30/74-heterodimers in the native P1 cell. Rigid body refinement against data from 20 to 3.0 Å in AMORE yielded a solution with a correlation coefficient of 0.682 between model and electron density and a crystallographic R-factor of 47.5%. Visual inspection of the crystal packing in O141 revealed no bad contacts. The four complexes were arranged as observed in the methylmercury prelabeled P1 crystal with one exception: one molecule was rotated by 10° out of the C2-like arrangement, which was enough to break the pseudosymmetry.

This was apparent from equal intensity distribution between reflections with h+k =2n

and h+k = 2n+l (TRUNCATE). Rigid body refinement, energy minimization and simulated annealing (4000 K) in CNS against data up to 2.5 Â decreased the R-factor to

40.5% (15-2.5 A, bulk solvent correction, anisotropic B-factor correction). The random test sel of approximately 4000 rellections (4V ) gave an Rlr^-factor of 45%. There were no more NCS-rcstraints oi constraints applied. Inspection of the high resolution electron density map calculated irom 67 to 1 7 A with phases derived Irom the molecular replacement model revealed that the lour "triple barrel" core structures were very well defined even though the model was seiiously misplaced with respect to the electron density. After two classical "maciocvcles"229 of model rebuilding with O and restrained refinement with CNS the model comprised 86% of the main chain atoms

(999 ammo acids) and 70% of the side chains (805 side chains) (Cycle V. Figure 23) with a of crystallographic R-factor 36r/( (R„Ä 38' <, 15-2 5 Ä. bulk solvent correction. anistrophic B-factoi corrcciion). The tour copies of the RAP30/74-intcraction domains had been lcfmcd independently and at this point exhibited some differences. Foi example, loops and extensions which weie modeled in on copy were missing in the other copies. In ordei to extend the model furthci ihe four hcterodimers of the

RAP30/74-mteraction domains were superimposed with LSQMAN138 and combined into a single "super model" which was composed of the best defined and longest parts of all four copies. In order to remove all individuality from this model it was refined by energy minimization and simulated annealing (4000 K) in CNS using NCS-constiamls.

This search model was used for "molecular re-replacement" in the native P1 cell (Cycle IV, Figure 23). CNS2*4 was used for molecular re-replacement instead of AMORE. The re-replacement solution was refined by rigid body refinement, energy minimization and simulated annealing (8000 K) in CNS against native data between 15 and 2.5 Å. After grouped B-factor refinement the crystallographic R-factor reached 34.8% (Rfree 39.4%, 15-2.5 Å, bulk solvent correction, anisotropic B-factor correction). It took eight classical "macrocycles"229 of model rebuilding and refinement to obtain the final model (Cycle V, Figure 23). Regions with unassigned backbone trace were eliminated from the model when at termini or in loops longer than 5 amino acids. Occupancies for residues with unassigned backbone trace were set to zero if they were in loops of 5 amino acids or less. Side chain occupancies for residues with only an approximate backbone trace were set to zero and the residues were refined as glycine residues. Side chain occupancies for residues with unassigned side chain conformation were set to zero and the residues were refined as alanine residues. Resolution was slowly extended from 2.5 Å to 1.7 Å. Individual B-factor refinement and energy minimization were performed with REFMAC because it was more powerful at lowering R and Rfree than CNS. On the other hand, high temperature simulated annealing (8000 K) was performed with CNS, as was geometry optimization, which was a problem in REFMAC. Water molecules were picked from Fo-Fc maps with a 2.5 I/σ cut-off level or from 2Fo-Fc maps with a 1.2-1.5 I/σ cut-off level. CNS was used to exclude water molecules with B-factors above 100 Å² or unreasonable H-bonding geometry. Settings were such that water molecules had to be within 4.0 Å of any other atom but not closer than 3.2 Å unless the neighboring atom was an oxygen or nitrogen atom. The minimum H-bonding distance was set to 2.4 Å. Additional water molecules were picked after visual inspection of Fo-Fc and 2Fo-Fc maps when water molecules in NCS-equivalent positions suggested another water site. 688 water molecules passed the final visual inspection of the 2Fo-Fc map for a minimum electron density of 1.0 I/σ.
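The geometric acceptance criteria for water molecules can be summarized in a few lines of code. The sketch below is not the CNS water-picking script; it is a minimal Python illustration which assumes that coordinates are given in Å, that `neighbor_elements` holds element symbols, and that "within 4.0 Å of any other atom" means at least one neighbor inside that radius. The function name and the argument layout are hypothetical.

```python
import numpy as np


def accept_water(water_xyz, water_b, neighbor_xyz, neighbor_elements,
                 b_max=100.0, d_max=4.0, d_min=3.2, d_hbond_min=2.4):
    """Apply the distance and B-factor criteria to one candidate water."""
    water_xyz = np.asarray(water_xyz, dtype=float)
    neighbor_xyz = np.asarray(neighbor_xyz, dtype=float)
    elements = np.asarray(neighbor_elements)

    if water_b > b_max:
        return False  # B-factor above 100 A^2

    dist = np.linalg.norm(neighbor_xyz - water_xyz, axis=1)
    if dist.min() > d_max:
        return False  # no atom within 4.0 A

    polar = np.isin(elements, ["O", "N"])
    if np.any(dist[~polar] < d_min):
        return False  # closer than 3.2 A to an atom that is not O or N
    if np.any(dist[polar] < d_hbond_min):
        return False  # shorter than the minimum H-bonding distance
    return True


if __name__ == "__main__":
    # Hypothetical candidate 2.8 A from a carbonyl oxygen and 3.9 A from a carbon.
    ok = accept_water([0.0, 0.0, 0.0], 45.0,
                      [[2.8, 0.0, 0.0], [0.0, 3.9, 0.0]], ["O", "C"])
    print(ok)
```

The visual inspection step and the NCS-based addition of water sites described in the text are not covered by this filter.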

Figure 28: B-factor distribution of the water molecules.

x-axis: Isotropic B-factors [Å²]

The quality of the final model was assessed with CNS137 and PROCHECK135 (Table 33). The coordinate deposition file contains four RAP30 (A, C, E, G) and four RAP74 (B, D, F, H) chains with corresponding heterodimer combinations (AB, CD, EF, GH). They comprise 91% of the main chain atoms (1055 amino acids) and 82% of the side chains (937 side chains) of the four RAP30(2-119)/RAP74(2-172) heterodimers¹. The water molecules were assigned to four chains (Q, R, S, T) corresponding to the closest heterodimer (ABQ, CDR, EFS, GHT). Water atom numbers were sorted in NCS-related groups with SORTWATER (CCP4). All protein chains in the asymmetric unit show good stereochemistry with 90.6% of the non-glycine and non-proline residues in the most favored regions of the Ramachandran plot. Only Asn15 and Asp102 of RAP30 and Lys36 of RAP74 lie in the generously allowed regions. Asn15 of RAP30 is the C-terminus of the α1-helix and has to adopt this unusual conformation to allow the sharp bend into the β2-strand. Asp102 is part of a flexible turn between β6 and β7 of RAP30 which has an unusual conformation that is stabilized by a hydrogen bond between the hydroxyl group of Ser99 and the backbone nitrogen atom of Lys103. The backbone carbonyl groups of Asp35 and Lys36 of RAP74 point to the same side of the peptide backbone, forming two hydrogen bonds to the head group of Arg108 (RAP74) from a neighboring molecule, which results in an unusual backbone conformation of Lys36. For all these residues there is clear 2Fo-Fc electron density which shows that they are correctly modeled. The mean isotropic B-factor of the model is 45.4 Å², which is above the Wilson plot B-factor of 30.1 Å² for the native high resolution dataset. The B-factor r.m.s.d. for bonded main chain atoms is 4.3 Å² and for bonded side chain atoms 6.6 Å². The reason for these high values is the badly defined loop and extension regions of the structure (Figure 29). B-factors vary not only along the amino acid chains but also between the four RAP30/74 heterodimers in the asymmetric unit. The best defined copy (EF in the deposition file) has a mean Cα-B-factor of 32.4 Å² while the worst copy (CD in the deposition file) has 46.8 Å². This variability must be caused by subtle differences in (water-mediated) crystal contacts.

Ammo acids thai weie deleted loim the final model- RAP74 B2-4. R(oP72. R154P72. D2-6. D64-72.

D169-172, F2-4. F63-72. H2-6. H6*-72.11154-172(101 icsidues) Amino acids with occupancy zeio RAP74 Bl 16-119. DU 8 119. F77-79. Fit9. HI 1(7-119 (18 lesidues) Ammo duds with side chain

occupancy zero- RAP74 F21. Fl 18. F159. H21. H55 (5 residues). RAP30- A71-76. Cil C71-77. Gl 4 166

Figure 29: Mean Cα-B-factors and mean NCS-Cα-distances for the RAP30/74-interaction domains of human TFIIF after superposition.

The four molecules in the asymmetric unit have been superimposed with LSQMAN138, maximizing the overlap between the structures. Mean Cα-B-factors are shown as black solid lines. Mean NCS-Cα-distances are shown as red solid lines. The secondary structure assignments are shown below these graphs.

Panels: RAP30(2-119), RAP74(2-172)

The B-factors for the water molecules have a normal distribution with a slight distortion towards higher B-factors. The average B-factor for all water molecules is approximately 50 Å², which is a little above the overall B-factor for all protein chains (Figure 28). The r.m.s. deviations from ideal geometry as well as the r.m.s. deviations from empirical main-chain and side-chain parameters are given below (Table 33). They are all within the limits expected for protein structures refined against X-ray data to 1.7 Å. This is reflected in the positive G-factors describing log-odds scores for model backbone and side-chain geometry based on empirical distributions. The overall G-factor is 0.24, which is better than for an average X-ray structure at 1.7 Å.

(IS lesidues) Vmino acids with undctmed sidechain contoimation (tetinenttl is alanine) RAP74 1 1276 H281 F122 H2i H76 (6 residues) 167

Table 33: Refinement of the RAP30/74-interaction domains of human TFIIF.

Model1

Total atoms Replied atoms Relmed atoms Relmed watei

2X conf molecules

9220 82 36 44 (i92

R-factors

R R

102106 hkl -S96 hkl 22 3f/rt29 WrY 2(oO'( (32 ^<)'

Coordinate errors

Lu//ati \R) Lu n a ti (R, ) Stgniaa R^ SiemapRr ) ' 0 2o A 0 2S k 0 17 \ 0 19 A

B-factois'

B 1 lit s d

Vlamehuii Sidcchnn Maint h nn ancle Sidechain angle bonded bonded

45 4 \ 4 27 A. 6 S6 V s /ft \ 8 42 A-

Deviations from ideality Bonds Anales DihediaK Impiopti 0 0OSS A. 1 IV 26 23 0^

Ramachandran plot" Coie Allowed Geneicius Disallowed

90 (CA (R6 SPr)' S 9P 0 6' r 0 07

(i-factors

Dihedials C ovalem Oveiall

0 06 OS] 0 24 ( 0 ô t 0 oV

Deviation from empirical maincham parameters

CO peptide bond bad contacts'' C Ca distoition H bondi ns enei<*v 100 icsidues l kcal/mo Ij

1 6° o 1 0 9° 0 3

' 6 0" ± } 0°' 1 9+ 10 0' Ô 1 t 1 6 o z j o ;'

Deviation from empirical side-chain parameters X, (^ ) X '7-+> X, dpi X,(Se SC+ ipl X (^P) 12 7e 14" H 1 lo6° 14 7°

' ' l^'KP0' 16 s -7 i 15 r ±4 9 is 9J^4S0' 14 7' I IS 801

CNS (6.0-1.7 Å, bulk solvent correction, anisotropic B-factor correction). PROCHECK.

Empirical value for structures refined against crystallographic data to 1.7 Å resolution. Last shell (1.76-1.70 Å).

6.4 A novel heterodimerization fold of the RAP30/74 interaction domains of human TFIIF at 1.7 Å resolution

6.4.1 A novel "triple barrel" heterodimerization fold

The RAP30/74-interaction domains of human TFIIF form a heterodimer constructed from a novel "triple barrel" protein fold. Both subunits contribute to a single common core of three intricately intertwined β-barrels with overall dimensions of approximately 25x25x50 Å (Figure 30A, B). The β-strands of RAP30 and RAP74 contributing to the triple barrel have identical topology. They are related by a pseudo-twofold symmetry axis (186.6°) along the central β-barrel which superimposes the corresponding β-strands of RAP74 and RAP30 with a root-mean-square difference of 2.3 Å for 49 common Cα-positions (Figure 30A, C). The structural similarity between the TFIIF subunits is not accompanied by sequence homology, which indicates convergent evolution. No structural homologues to the RAP30/74-heterodimer were found with DALI2,18 or DEJAVU138.
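The superposition behind such a pseudo-symmetry statement can be illustrated with a least-squares fit of matched Cα-positions. The sketch below uses the Kabsch algorithm in Python as a stand-in; it is not the LSQMAN procedure used in this work, the function name is hypothetical, and the coordinates in the usage example are random points rather than TFIIF coordinates.

```python
import numpy as np


def superpose(mobile, target):
    """Least-squares superposition (Kabsch) of matched coordinate sets.

    Returns the rotation matrix, the r.m.s. difference after fitting and
    the rotation angle in degrees (reported in the range 0-180).
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    target_center = target.mean(axis=0)
    p = mobile - mobile.mean(axis=0)
    q = target - target_center

    # Singular value decomposition of the covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Guard against an improper rotation (reflection).
    sign = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, sign]) @ u.T

    fitted = p @ rotation.T + target_center
    rmsd = float(np.sqrt(np.mean(np.sum((fitted - target) ** 2, axis=1))))
    cos_angle = np.clip((np.trace(rotation) - 1.0) / 2.0, -1.0, 1.0)
    angle = float(np.degrees(np.arccos(cos_angle)))
    return rotation, rmsd, angle


if __name__ == "__main__":
    # Synthetic test: 49 random points rotated by 186.6 degrees about z.
    rng = np.random.default_rng(0)
    points = rng.normal(size=(49, 3))
    t = np.radians(186.6)
    rot_z = np.array([[np.cos(t), -np.sin(t), 0.0],
                      [np.sin(t), np.cos(t), 0.0],
                      [0.0, 0.0, 1.0]])
    _, rmsd, angle = superpose(points, points @ rot_z.T)
    print(round(rmsd, 6), round(angle, 1))
```

Since the angle is recovered from the trace of the rotation matrix, a rotation of 186.6° is reported as the equivalent 173.4° about the opposite axis direction.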

Formation of the "triple barrel" appears to require cofolding of the RAP-subunits as opposed to docking of stable monomers. Interdigitation of secondary structure elements between RAP30 and RAP74 is confirmed by the locations of both mercury-labeled cysteine residues within the RAP30 β-barrel (Figure 30C). The firm molecular "hand shake" encloses a buried surface area of 4400 Å², which is unusually large for single protein-protein interfaces2^. The interaction surface includes many contacts between aromatic side chains, indicating that the heterodimer is highly stable (Figure 31A, B). The interaction domains remain as a unit even when the first four β-strands (1-66) of RAP74 or the last two β-strands (99-119) of RAP30 are deleted"5---40. The importance of corefolding or coexpression of RAP30 with RAP74 for in vitro transcriptional activity of TFIIF, as opposed to simple addition of subunits, has been demonstrated previously",SS7 10g. This suggests that corefolding may indeed be necessary to form the tightly intertwined "triple barrel" fold and stresses the relevance of a firm RAP30/74-interaction for accurate transcription. This has been questioned in the past32-(l1-82-241-242 although both TFIIF subunits are crucial for proper assembly and stability of the preinitiation complex as well as for transcription initiation and elongation7"8187 88.24 y.

Figure 30: Structure of RAP30/74-interaction domains of human TFIIF.

A, RAP30(2-119) (red) and RAP74(2-172) (green) form a "triple barrel" β-structure (view down the axis of pseudo-twofold symmetry passing through the "central barrel"). B, View into the "RAP30 barrel" and the "RAP74 barrel" (90° rotation around the horizontal compared to A). C, Stereo view of the Cα-trace of the RAP30/74-heterodimer with every tenth amino acid numbered (view down the pseudo-twofold axis). D, Topology diagram showing the pseudo-twofold symmetry of the heterodimeric triple barrel. The cysteine amino acids (yellow) were labeled by heavy atom for the structure solution. Regions not visible in the density are indicated (dotted).

Topology diagram labels: RAP30 barrel, Central barrel, RAP74 barrel

Figure 31: Electron density maps from the "triple barrel" core.

A, The conserved β2-strand of RAP30 (below) and aromatic amino acids from both TFIIF subunits in the hydrophobic core of the "RAP30 barrel" (1.5 σ contours). B, The β7-strand of RAP74 (below) and further aromatic amino acids from both TFIIF subunits in the hydrophobic core of the "RAP74 barrel" (1.6 σ contours).

Despite the apparent requirement for cofolding of the TFIIF subunits, application of the twofold pseudo-symmetry to RAP30 or RAP74 in isolation generates hypothetical homodimers with seemingly reasonable stereochemistry. Buried interaction surfaces of 3600 Å² for RAP30-homodimers and of 4300 Å² for RAP74-homodimers are generated. It has already been shown that the single subunits RAP30 and RAP74 oligomerize in vitro81-133 but in an in vivo two-hybrid assay they did not interact109. Further investigations are needed to address the topic of stoichiometry and in vivo relevance of homo-oligomers.

The three β-barrels of the "triple barrel core" are named the "RAP30", "central" and "RAP74" barrel, respectively. They comprise 16 parallel and antiparallel β-strands which are well defined in the electron density map. These β-strands are connected by flexible solvent exposed loops, β-strands, and α-helices. These extensions from the core structure are not always well defined in the electron density map, as indicated by high temperature factors, but at least one of the four independently refined copies in the asymmetric unit is sufficiently well ordered to visualize (Figure 32).

Figure 32: Superposition of the four copies of the RAP30/74-interaction domain.

The four molecules in the asymmetric unit have been superimposed with the "explicit" routine of LSQMAN138, maximizing the overlap between the "triple barrel" core structures. Total: all modeled residues. Core: all modeled residues but loop/arm regions. Loop/arm: RAP30(67-77), RAP74(33-92). The protein Cα-trace is colored according to the isotropic Cα-B-factors from 30 Å² (blue) to 73 Å² (red).

            RAP30                           RAP74
            Cα positions   r.m.s.d. [Å]     Cα positions   r.m.s.d. [Å]
Total       118            1.28             133            1.66
Core        103            0.83             104            1.27
Loop/arm    m P 2 P                         29             2.43

This variability may be a consequence of the dehydration process.

Conformations may have diverged from a single common conformation in the native

C2 crystals with one complex per asymmetric unit. Alternatively, the variable regions might be flexible in the native crystals and could be trapped in various conformations during dehydration. Both processes are not necessarily uniform and complete across the crystal, which explains the high degree of disorder (i.e. flexibility or multiple conformations) observed in the variable parts of the structure (Figure 29). The flexible character of these regions, particularly of the loops, is most likely important for the interaction with other transcription factors of RNA polymerase II (Figure 35A).

Structure and functions of the three β-barrels.

The eight-stranded "RAP30 barrel" is composed of five β-strands from RAP30 and three β-strands from RAP74 with overall dimensions of approximately 25x25x20 Å. The hydrophobic β-barrel core is composed of tightly packed aromatic and aliphatic residues from both TFIIF subunits (Figure 31). Many of these are conserved in evolution (Figure 34). The top of the barrel is covered by the α2-helix of RAP30. The subsequent β-strands (β3-β5) contribute to a highly conserved patch on the surface of the barrel involving RAP30 K42, K43, R4, L58, and E80 as well as RAP74 V11 and E13 (Figure 33A). Deletion of this RAP30 region (amino acids 30-45) abolishes the elongation activation activity of RAP30 that suppresses abortive transcription by early RNA polymerase II elongation intermediates85-88. The mutation therefore either destabilizes the whole barrel fold, which would impair all interactions with this part of the structure, or its effect is due to the loss of a specific interaction surface supplied by the conserved residues (Figure 35B). Another potential binding site for components of the transcription machinery is RAP30(65-77), a 13 amino acid loop that extends 20 Å away from the RAP30-dominated β-barrel, exposing the hydrophobic surface of I65 and I68 to the solvent (Figure 35A). I68 is tied up in a crystal contact which may replace an in vivo binding site. The loop was only well defined in one of the NCS-related copies of the RAP30/74 interaction domain, indicating high flexibility (Figure 32). The different conformations superimpose with a root-mean-square difference of 2.4 Å for 13 common Cα-positions for all pairwise comparisons of the four copies (Figure 29, Figure 32).

The six-stranded "central barrel" with overall dimensions of approximately 25x25x15 Å is rotated by 90° around the long axis of the "triple barrel" as compared to the RAP30 and RAP74 barrels (Figure 30A, B). Each subunit contributes three β-strands to the barrel, which encompasses most of the interaction face of the RAP74/RAP30-heterodimer. Side chains pointing into the hydrophobic core are well conserved in evolution (Figure 34), including the β2-strand of RAP30 which represents the best conserved region of the whole structure. It contributes to the hydrophobic cores of both the RAP30 and the central barrels (Figure 31). Deletion of β2 (amino acids 15-30) abolishes not only initiation and elongation activities but also RAP74 binding88 (Figure 35A). The central barrel has a positively charged face surrounded by a corona of conserved residues. K115, R113, K22, and R117 from RAP74 contribute to the positive surface. The conserved residues are located in part on a poorly defined, solvent exposed loop between β7 and β8 of RAP74 which includes the absolutely conserved residues V118 and N121. RAP30 P24 and P118 as well as RAP74 N145 complete the conserved surface at one end of the central barrel, which is likely to be a binding site for additional components of the transcription apparatus (Figure 33B).

Figure 33: Solvent accessible surfaces.

A, Amino acid homologies (identical: yellow, conserved: blue). B, Electrostatic surface potential from blue (positive) to red (negative). The surface probe is 1.4 Å in radius.

Figure 34: Sequence alignments of RAP30(2-119) and RAP74(2-172).

Amino acid identities (bold) and similarities (underline) are denoted. Amino acid characteristics are displayed as follows: interface contact (3.5 Å); buried interface contribution (1.4 Å sphere); buried residue (<10% solvent exposed surface); surface residue (>30% solvent exposed surface); RAP30 barrel core; RAP74 barrel core; central barrel core.

Panels: RAP30(2-119) and RAP74(2-172), each aligned for the human, Xenopus, Drosophila and yeast sequences with the secondary structure elements (α-helices and β-strands) marked above the alignment.

The "RAP74 barrel" is complementary in structure to the RAP30 barrel by the pseudo-twofold symmetry. It contains five RAP74 and three RAP30 β-strands (Figure 30A, C). The hydrophobic core is constructed of intermeshed aromatic and aliphatic side chains (Figure 31B). Again, many of these are well conserved (Figure 34). Instead of an α-helix as for the RAP30 barrel, a loop of RAP74(34-43) spans the analogous barrel opening, which has an extensive positively charged surface comprised of R153, R151, K105, K10S, and K1CW from RAP74 (Figure 33B). RAP30 contributes a β-hairpin (β6, β7) to the RAP74 barrel. Deletion of this hairpin (amino acids 91-105) or the subsequent β8-strand (amino acids 106-120) also eliminates elongation activity of TFIIF88. Any deletion mutant of this type probably destabilizes the entire structure because of the intertwining of the RAP30 and RAP74 chains. Otherwise the effect must be assigned to the loss of a specific interaction surface since the mutants still bind to RAP74 (Figure 35A).

6.4.2 The RAP74 "arm domain"

The RAP74 β5 and β6 strands are linked by a 45 Å "arm domain" (amino acids 54-92). The extension is stabilized by the hydrogen bonds between the twisted antiparallel β-strands β4 and β5 outside the RAP74 barrel as well as by the Y57 side chain inserting into a hydrophobic pocket formed by I86 and L88 (Figure 35C). Together with W164, K89, and V87, these residues constitute a hydrophobic but solvent exposed surface on one side of the arm which either indicates a potential binding site for further factors or just represents the hydrophobic core of the arm domain (Figure 35C). The apex of the arm is highly positively charged (R73, R76, R80, R81, K82, K83) and is localized through a hydrophobic crystal contact of L75 at its tip (Figure 35A). There is a nine amino acid loop (64-72), including conserved E70, that extends from the end of the arm domain into the solvent and is not observed in the electron density map. In NCS-related molecules the arm domain adopts slightly different conformations. Only one of these conformations is well defined due to favorable crystal contacts, which indicates high flexibility of the arm domain (Figure 32). The different conformations superimpose with a root-mean-square difference of 2.4 Å for 30 common Cα-positions for all pairwise comparisons of the four copies (Figure 32, Figure 34).

Figure 35: Functional aspects of the RAP30/74-interaction domains.

A, Regions of RAP30 (β2, amino acids 15-30) and RAP74 (α1, amino acids 155-168) have been shown by mutagenesis to be involved in transcription initiation and elongation (blue). Elongation is also affected by deletion of additional regions of RAP30 (amino acids 30-45 and 91-120, purple). Components of the transcriptional machinery (i.e. RNA pol II, TBP, TAF250) are likely to interact with flexible, solvent exposed structures (yellow) such as the "arm domain" and the RAP74 α1-helix. B, The region of RAP30 from the C-terminus of the α2-helix to the C-terminus of the β3-strand is implicated in elongation (1.5 σ contours). C, The RAP74 "arm domain" contains a hydrophobic core with Y57 inserting between I86 and L88. W164 from the RAP74 α1-helix packs against the K89 side chain and hydrogen bonds to the backbone carbonyl group of E90 (1.0 σ contours).

6.4.3 The C-terminal α-helix of RAP74(2-172)

The RAP74 C-terminal residues from A157 to N168 form the α1-helix that packs against the arm domain (Figure 30). The electron density for the α-helix is clear for two NCS-related copies of the RAP30/74-interaction domain but the whole stretch from T154 to N168 is only well defined in one copy. The sequence beyond amino acid N168 could not be modeled and longer constructs did not crystallize. This region of full length RAP74 was subjected to extensive mutation analyses73-7\. Alanine replacement mutations from F138A to T154A had only minor effects on the initiation and elongation functions of TFIIF. The RAP30/RAP74-interaction was maintained although these mutations affected the RAP30/RAP74 core structure. Further mutations in the region between L155 and L182 reduced both initiation and elongation function of TFIIF while the formation of the preinitiation complex was unaffected. Since quantitative effects on initiation and elongation rate were highly correlated, a similar function of this protein region was proposed for both processes. The effect of these RAP74 mutations must be mediated through association with DNA, TBP, TFIIB or RNA polymerase II as there were no other factors included in the in vitro transcription assay. L155A and W164A were among those mutations with the largest effect. Mutation of L155 to alanine reduced transcription to the minimal levels obtained with the RAP74(2-172) deletion mutant. The residue just precedes the RAP74 α1-helix and points backwards towards the RAP74 barrel. However, there is no clearly defined hydrophobic pocket for this residue, suggesting it has an important role in the addition of another protein. Mutation of W164 to alanine has similar effects. The residue attaches the C-terminal α-helix to the arm domain through a hydrophobic contact with the K89 side chain and a hydrogen bond to the main chain of E90 (Figure 35C). On the opposite side of the C-terminal RAP74 α1-helix, six glutamic acid residues E158, E159, E161, E162 and E165 constitute a highly negatively charged surface. Three more glutamic acid residues, E5, E60 and E61, extend the acidic patch along the arm domain (Figure 33B). This may be significant for TAF250 binding, which most probably associates with the α1-helix and the arm domain based on the mutagenesis data available68.

7 Conclusions and future perspectives

The two subunits of TFIIF, RAP30 and RAP74, each consist of three distinct functional and structural domains. This has been shown by limited proteolysis of recombinant RAP30/RAP74 complexes with endo- and exoproteases. Comparison of the proteolytic breakdown products with the previously defined functional domains showed close correspondence (Figure 3). The limited proteolysis results were used to design RAP30 and RAP74 constructs for crystallization screening with RAP30/RAP74-complexes. A complex of RAP30(2-119) with RAP74(2-172) was crystallized (Figure 15). After dehydration, native crystals diffracted to 1.7 Å. Phases to 2.65 Å resolution were determined in a MAD experiment with a methylmercury-prelabeled crystal (Figure 20). The derivative crystal was non-isomorphous to the native. Therefore, the partial model based on the MAD-data was built, transferred into the native unit cell by molecular replacement and refined to a final crystallographic R-factor of 22% (Rfree 26%) (Table 33). The structure shows a heterodimer of the RAP30/74-interaction domains. Both subunits contribute secondary structure elements to a common core structure of three intricately intertwined β-barrels (Figure 30). This "triple barrel" represents a novel dimerization fold in which RAP30 and RAP74 form a single interaction domain that requires cofolding rather than two separate structural entities.

The buried solvent accessible surface of 4400 Å² is much larger than required for a normal protein-protein interaction. The dissociation equilibrium between free TFIIF-subunits and the heterodimer must be studied to answer whether this tight interaction is ever broken in the cell. Until then all studies with single TFIIF-subunits must be considered with caution because the large hydrophobic dimerization surface of the single subunits may lead to experimental artifacts when not properly occupied. One of these artifacts may be the formation of homodimers that can be generated by applying the twofold pseudo-symmetry between RAP30 and RAP74 in the heterodimer structure to the separate subunits. Whether these homodimers exist or not and what their physiological role is may be revealed by further studies.

The loops and extensions that protrude from the "triple barrel" core structure into the surrounding medium may be involved in recruiting other components of the transcriptional machinery. In addition, unusual surface charge distributions and patches of conserved surface residues have been pointed out that may be of functional significance. Future mutagenesis studies of these regions will lead to a much more detailed understanding of the role of TFIIF in basal and activated RNA polymerase II transcription (Figure 1).

The X-ray crystal structure of the RAP30/74-interaction domains clearly shows a heterodimer as opposed to a heterotetramer that had been postulated based on gel filtration, affinity chromatography and DNA-cross-linking experiments i9.20.72.m,244. However, the accurate interpretation of these data for TFIIF would not be straightforward as both subunits have at least two flexibly linked domains and would not behave as typical globular proteins. It may be argued that the RAP74(2-172) construct used for structure determination lacks the proposed homodimerization domain within RAP74(172-205)"-. However, comparison of dynamic light scattering and gel filtration results of complexes with RAP74(2-172) and RAP74(2-202) suggests that both of them are in the same oligomerization state, which is presumably dimeric (Figure 36). All complexes studied show apparent molecular weights that are significantly larger than expected for RAP30/74-heterodimers but somewhat smaller than expected for heterotetramers. As mentioned above, this can be accounted for by the non-globular, bulky shape of these complexes. These observations lead to the conclusion that either RAP74(2-202) is simply lacking the essential 3 amino acids needed for RAP74 dimerization or that the homodimerization of RAP74(2-205) is artifactual and thus irrelevant in the presence of RAP30. Still, the discrepancies may be explained by differences in the experimental conditions, but the fact that the RAP74 homo-oligomerization was observed in the absence of RAP30 reinforces the suspicion of an experimental artifact, since RAP74 is known to form larger aggregates, maybe homodimers, in solution1. Further studies are required to decide about the subunit oligomerization state of intact TFIIF, particularly in the context of transcription in vivo.

Figure 36: Molecular weight of RAP30/74-interaction domain complexes as determined by gel filtration and dynamic light scattering.

Squares represent complexes with RAP74(2-202) with various RAP30-constructs. The diamonds represent complexes containing RAP74(2-172) or even shorter RAP74-constructs. Along the black solid line the apparent molecular weight corresponds to the calculated molecular weight for a heterodimer. Heterotetramers would fall onto the black dotted line. The red solid line represents the least squares interpolation of the data. There is a linear correlation between calculated molecular weight and apparent molecular weight. This indicates that all complexes are in the same oligomerization state.
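The least squares line described in this caption can be reproduced with an ordinary linear fit of apparent against calculated molecular weight. The following Python sketch only illustrates the procedure: the function name is hypothetical and the molecular weights in the usage example are invented placeholders, not the measurements plotted in Figure 36.

```python
import numpy as np


def fit_apparent_mw(calculated_kda, apparent_kda):
    """Least squares line apparent = slope * calculated + intercept.

    Returns slope, intercept and the Pearson correlation coefficient.
    """
    x = np.asarray(calculated_kda, dtype=float)
    y = np.asarray(apparent_kda, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    correlation = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(correlation)


if __name__ == "__main__":
    # Placeholder values (kDa) for heterodimers of different construct lengths.
    calculated = [30.0, 35.0, 40.0, 45.0]
    apparent = [45.0, 52.5, 60.0, 67.5]
    slope, intercept, correlation = fit_apparent_mw(calculated, apparent)
    print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, r = {correlation:.3f}")
```

In such a plot a line of slope 1 through the origin corresponds to the heterodimer expectation and a line of slope 2 to heterotetramers; a single straight line through all points is the behavior expected when every complex shares one oligomerization state.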

Panels: Dynamic light scattering, Gel filtration

With the presentation of the crystal structure of the RAP30/74-interaction domains, structure determination of the folded parts of TFIIF is almost complete. The missing domains are the C-termmal domain of RAP74 and the central domain of

RAP30. The central domain of RAP30 (amino acids 120-150) is susceptible to proteolytic cleavage and may be unstructured without the appropriate binding partner which is probably the second largest subunit of RNA polymerase TI48. The region shows sequence homology to region 2 of the E. coli a70-factor which binds lo bacterial

RNA polymerase. Interestingly, RAP30 also binds to E. coli RNA polymerase indicating that the sequence homology is also functional''1 and that the central domain of RAP30 may adopt a conformation similar to o70 region 2 when associated to RNA polymerase II. The structure of cj70 region 2 is known109 and alignment with RAP30 central domain (amino acids 120-150) predicts two alpha helices connected by a short loop around ammo acids 135-140 (Figure 13). The C-terminal domain of RAP74

(ammo acids 364-517) was cloned, expressed purified to homogeneity. Denaturation- renaturation experiments monitored by CD-spectroscopy showed that the domain is properly folded and adopts predominantly random coil conformation (56%) with some

ß-strand secondary structure elements (359r, data not shown) Crystallization of the isolated domain was not possible but after further truncations and in combination with 181

one of its interaction partners, structure determination may be possible and the result

would be highly signilicant for understanding RNA polymerase II transcription.

Similarly, complexes of RAP30/74 domains with all known interaction partners may be prepared for structure determination. Most important may be those with other general transcription factors (TFIIB, TFIIE), RNA polymerase II, the TAFs (TAFII250, TAFII80, TAFII100), FCP1 and components of Tat-activated elongation. TFIIF complexes in combination with transcriptional activators and repressors will be interesting for obtaining insight into the mechanisms of gene regulation. Structural studies with TFIIF constructs isolated from eukaryotic expression systems may help to establish the role of posttranslational modifications in TFIIF activity.

References

1. Alberts, B. et al. Molecular Biology of the Cell. (Garland Publishing, Inc., New York, 1994).
2. Knippers, R., Philippsen, P., Schaefer, K.P. & Fanning, E. Molekulare Genetik. (Georg Thieme Verlag, Stuttgart, 1990).
3. Greenblatt, J. RNA polymerase II holoenzyme and transcriptional regulation. Curr Opin Cell Biol 9, 310-9 (1997).

4. Orphanides, G., Lagrange, T. & Reinberg, D. The general transcription factors of RNA polymerase II. Genes & Development 10, 2657-2683 (1996).
5. Björklund, S. & Kim, Y.-J. Mediator of transcriptional regulation. TIBS 21, 335-337 (1996).
6. Verrijzer, C.P. & Tjian, R. TAFs mediate transcriptional activation and promoter selectivity. TIBS 21, 338-341 (1996).
7. Kaiser, K. & Meisterernst, M. The human general co-factors. TIBS 21, 342-345 (1996).
8. Björklund, S., Almouzni, G., Davidson, I., Nightingale, K.P. & Weiss, K. Global Transcription Regulators of Eukaryotes. Cell 96, 759-767 (1999).
9. Holstege, F.C.P. & Young, R.A. Transcriptional regulation: Contending with complexity. PNAS 96, 2-4 (1999).
10. Shilatifard, A. Factors regulating the transcriptional elongation activity of RNA polymerase II. FASEB J. 12, 1457-46 (1998).
11. Roeder, R.G. The role of general initiation factors in transcription by RNA polymerase II. TIBS 21, 327-334 (1996).
12. Nikolov, D.B. & Burley, S.K. RNA polymerase II transcription initiation: A structural view. PNAS 94, 15-22 (1997).

13. Maldonado, E. & Reinberg, D. News on initiation and elongation of transcription by RNA polymerase II. Current Opinion in Cell Biology 7, 352-361 (1995).
14. Coulombe, B. & Burton, Z.F. DNA Bending and Wrapping around RNA Polymerase: a "Revolutionary" Model Describing Transcriptional Mechanisms. Microbiol. Mol. Biol. Rev. 63, 457-478 (1999).
15. Goodrich, J.A. & Tjian, R. Transcription Factors IIE and IIH and ATP Hydrolysis Direct Promoter Clearance by RNA Polymerase II. Cell 77, 145-156 (1994).
16. Zawel, L., Kumar, K.P. & Reinberg, D. Recycling of the general transcription factors during RNA polymerase II transcription. Genes & Development 9, 1479-1490 (1995).
17. Aso, T., Conaway, J.W. & Conaway, R.C. The RNA polymerase II elongation complex. FASEB J. 9, 1419-1428 (1995).
18. Archambault, J. et al. FCP1, the RAP74-interacting subunit of a human protein phosphatase that dephosphorylates the carboxyl-terminal domain of RNA polymerase IIO. J Biol Chem 273, 27593-601 (1998).
19. Flores, O., Ha, I. & Reinberg, D. Factors Involved in Specific Transcription by Mammalian RNA Polymerase II. J. Biol. Chem. 265, 5629-5634 (1990).
20. Conaway, J.W. & Conaway, R.C. A Multisubunit Transcription Factor Essential for Accurate Initiation by RNA Polymerase II. J. Biol. Chem. 264, 2357-2362 (1989).

21. Burton, Z.F., Killeen, M., Sopta, M., Ortolan, L.G. & Greenblatt, J. RAP30/74: a general initiation factor that binds to RNA polymerase II. Mol Cell Biol 8, 1602-13 (1988).
22. Smale, S.T. Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes. Biochim Biophys Acta 1351, 73-88 (1997).
23. Kornberg, R.D. Eukaryotic transcription control. TIBS 24, 46-49 (1999).
24. Workman, J.L. & Kingston, R.E. Alteration of nucleosome structure as a mechanism of transcriptional regulation. Annu Rev Biochem 67, 545-79 (1998).
25. Izban, M.G. & Luse, D.S. Factor-stimulated RNA Polymerase II Transcribes at Physiological Elongation Rates on Naked DNA but Very Poorly on Chromatin Templates. J. Biol. Chem. 267, 13647-13655 (1992).
26. Ossipow, V., Tassan, J.-P., Nigg, E.A. & Schibler, U. A Mammalian RNA Polymerase II Holoenzyme Containing All Components Required for Promoter-Specific Transcription Initiation. Cell 83, 137-146 (1995).
27. Verrijzer, C.P., Chen, J.L., Yokomori, K. & Tjian, R. Binding of TAFs to Core Elements Directs Promoter Selectivity by RNA Polymerase II. Cell 81, 1115-1125 (1995).
28. Green, M. TBP-associated factors (TAFs): multiple, selective transcriptional mediators in common complexes. TIBS 25, 59-63 (2000).
29. Kim, J.L., Nikolov, D.B. & Burley, S.K. Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature 365, 520-527 (1993).
30. Kim, Y., Geiger, J.H., Hahn, S. & Sigler, P. Crystal structure of a yeast TBP/TATA-box complex. Nature 365, 512-520 (1993).
31. Liu, D. et al. Solution structure of a TBP-TAF(II)230 complex: protein mimicry of the minor groove surface of the TATA box unwound by TBP [see comments]. Cell 94, 575-83 (1998).
32. Dikstein, R., Ruppert, S. & Tjian, R. TAFII250 Is a Bipartite Protein Kinase That Phosphorylates the Basal Transcription Factor RAP74. Cell 84, 781-790 (1996).
33. Siegert, J.L. & Robbins, P.D. Rb Inhibits the Intrinsic Kinase Activity of TATA-Binding Protein-Associated Factor TAFII250. Mol. Cell. Biol. 19, 846-854 (1999).

34. Mizzen, C.A. et al. The TAF(II)250 subunit of TFIID has histone acetyltransferase activity. Cell 87, 1261-70 (1996).
35. Xie, X. et al. Structural similarity between TAFs and the heterotetrameric core of the histone octamer. Nature 380, 316-323 (1996).

36. Birck, C. et al. Human TAF(II)28 and TAF(II)18 interact through a histone fold encoded by atypical evolutionary conserved motifs also found in the SPT3 family. Cell 94, 239-49 (1998).
37. Tan, S., Hunziker, Y., Sargent, D.F. & Richmond, T.J. Crystal structure of a yeast TFIIA/TBP/DNA complex. Nature 381, 127-134 (1996).
38. Geiger, J.H., Hahn, S., Lee, S. & Sigler, P.B. Crystal Structure of the Yeast TFIIA/TBP/DNA Complex. Science 272, 830-836 (1996).
39. Nikolov, D.B. et al. Crystal structure of a TFIIB-TBP-TATA-element ternary complex. Nature 377, 119-128 (1995).
40. Zhu, W. et al. The N-terminal domain of TFIIB from Pyrococcus furiosus forms a zinc ribbon. Nature Structural Biology 3, 122-124 (1996).

41. Pinto, I., Wu, W.H., Na, J.G. & Hampsey, M. Characterization of sua7 Mutations Defines a Domain of TFIIB Involved in Transcription Start Site Selection in Yeast. J. Biol. Chem. 269, 30569-30573 (1994).
42. Andel, F., 3rd, Ladurner, A.G., Inouye, C., Tjian, R. & Nogales, E. Three-dimensional structure of the human TFIID-IIA-IIB complex. Science 286, 2153-6 (1999).
43. Schaller, S. et al. Interactions between the full complement of human RNA polymerase II subunits. FEBS Lett 461, 253-7 (1999).
44. Acker, J. et al. Interactions between the human RNA polymerase II subunits. J Biol Chem 272, 16815-21 (1997).
45. Fu, J. et al. Yeast RNA Polymerase II at 5 Å Resolution. Cell 98, 799-810 (1999).
46. Cramer, P. et al. Architecture of RNA Polymerase II and Implications for the Transcription Mechanism. Science 288, 640-649 (2000).
47. Woychik, N.A. & Young, R.A. RNA polymerase II: subunit structure and function. Trends Biochem Sci 15, 347-51 (1990).
48. Bentley, D.L. Regulation of transcriptional elongation by RNA polymerase II. Curr. Opin. Genet. Dev. 5, 210-216 (1995).
49. Sopta, M., Carthew, R.W. & Greenblatt, J. Isolation of Three Proteins That Bind to Mammalian RNA Polymerase II. J. Biol. Chem. 260, 10353-10360 (1985).
50. Flores, O., Maldonado, E. & Reinberg, D. Factors Involved in Specific Transcription by Mammalian RNA Polymerase II. J. Biol. Chem. 264, 8913-8921 (1989).
51. McCracken, S. & Greenblatt, J. Related RNA Polymerase-Binding Regions in Human RAP30/74 and Escherichia coli Sigma 70. Science 253, 900-902 (1991).
52. Fang, S.M. & Burton, Z.F. RNA Polymerase II-associated protein (RAP74) Binds Transcription Factor (TF) IIB and Blocks TFIIB-RAP30 Binding. J. Biol. Chem. 271, 11703-11709 (1996).

53. Poglitsch, C.L. et al. Electron Crystal Structure of an RNA Polymerase II Transcription Elongation Complex. Cell 98, 791-798 (1999).
54. Kim, T.-K. et al. Trajectory of DNA in the RNA Polymerase II Transcription Preinitiation Complex. PNAS 94, 12268-12273 (1997).
55. Buratowski, S., Sopta, M., Greenblatt, J. & Sharp, P.A. RNA polymerase II-associated proteins are required for a DNA conformation change in the transcription initiation complex. Proc. Natl. Acad. Sci. USA 88, 7509-7513 (1991).

56. Chambers, R.S. & Kane, C.M. Purification and Characterization of an RNA Polymerase II Phosphatase from Yeast. J. Biol. Chem. 271, 24498-24504 (1996).

57. Archambault, J. et al. An essential component of a C-terminal domain phosphatase that interacts with transcription factor IIF in Saccharomyces cerevisiae. PNAS 94, 14301-14305 (1997).
58. Wang, B.Q. & Burton, Z.F. Functional Domains of Human RAP74 Including a Masked Polymerase Binding Domain. J. Biol. Chem. 270, 27035-27044 (1995).
59. Chambers, R.S., Wang, B.Q., Burton, Z.F. & Dahmus, M.E. The Activity of COOH-terminal Domain Phosphatase Is Regulated by a Docking Site on RNA Polymerase II and by the General Transcription Factors IIF and IIB. J. Biol. Chem. 270, 14962-14969 (1995).

60. Kang, M.E. & Dahmus, M.E. The Photoactivated Cross-linking of Recombinant C-terminal Domain to Proteins in a HeLa Cell Transcription Extract That Comigrate with Transcription Factors IIE and IIF. J. Biol. Chem. 269, 23390-23397 (1995).
61. Killeen, M.T. & Greenblatt, J.F. The General Transcription Factor RAP30 Binds to RNA Polymerase II and Prevents It from Binding Nonspecifically to DNA. Molecular and Cellular Biology 12, 30-37 (1992).
62. Tan, S., Garrett, P., Conaway, R.C. & Conaway, J.W. Cryptic DNA-binding domain in the C terminus of RNA polymerase II general transcription factor RAP30. Proc. Natl. Acad. Sci. USA 91, 9808-9812 (1994).
63. Groft, C.M., Uljon, S.N., Wang, R. & Werner, M.H. Structural homology between the RAP30 DNA-binding domain and linker histone H5: Implications for preinitiation complex assembly. PNAS 95, 9117-9122 (1998).
64. Sun, Z.-W. & Hampsey, M. Identification of the gene (SSU71/TFG1) encoding the largest subunit of transcription factor TFIIF as a suppressor of a TFIIB mutation in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 92, 3127-3131 (1995).
65. Sun, Z.-W. & Hampsey, M. Synthetic Enhancement of a TFIIB Defect by a Mutation in SSU72, an Essential Yeast Gene Encoding a Novel Protein That Affects Transcription Start Site Selection in Vitro. Mol. Cell. Biol. 16, 1557-1566 (1996).
66. Ha, I. et al. Multiple functional domains of human transcription factor IIB: distinct interactions with two general transcription factors and RNA polymerase II. Genes & Development 7, 1021-1032 (1993).
67. Tang, H., Sun, X., Reinberg, D. & Ebright, R.H. Protein-protein interactions in eukaryotic transcription initiation: Structure of the preinitiation complex. Proc. Natl. Acad. Sci. USA 93, 1119-1124 (1996).
68. Ruppert, S. & Tjian, R. Human TAFII250 interacts with RAP74: implications for RNA polymerase II initiation. Genes & Development 9, 2747-2755 (1995).
69. Dubrovskaya, A. et al. Distinct domains of hTAFII100 are required for functional interaction with transcription factor TFIIF (RAP30) and incorporation into the TFIID complex. The EMBO Journal 15, 3702-3712 (1996).
70. Hisatake, K. et al. Evolutionary conservation of human TATA-binding-polypeptide-associated factors TAFII31 and TAFII80 and interaction of TAFII80 with other TAFs and with general transcription factors. Proc. Natl. Acad. Sci. USA 92, 8195-8199 (1995).
71. Maxon, M.E., Goodrich, J.A. & Tjian, R. Transcription factor IIE binds preferentially to RNA polymerase II and recruits TFIIH: a model for promoter clearance. Genes & Development 8, 515-524 (1994).
72. Robert, F. et al. Wrapping of Promoter DNA around the RNA Polymerase II Initiation Complex Induced by TFIIF. Molecular Cell 2, 341-351 (1998).
73. Ren, D., Lei, L. & Burton, Z.F. A region within the RAP74 subunit of human transcription factor IIF is critical for initiation but dispensable for complex assembly. Mol Cell Biol 19, 7377-87 (1999).
74. Lei, L., Ren, D., Finkelstein, A. & Burton, Z.F. Functions of the N- and C-Terminal Domains of Human RAP74 in Transcriptional Initiation, Elongation, and Recycling of RNA Polymerase II. Mol. Cell. Biol. 18, 2130-2142 (1998).

75. Lei, L., Ren, D. & Burton, Z.F. The RAP74 Subunit of Human Transcription Factor IIF Has Similar Roles in Initiation and Elongation. Mol Cell Biol 19, 8372-8382 (1999).
76. Svejstrup, J.Q., Vichi, P. & Egly, J.-M. The multiple roles of transcription/repair factor TFIIH. TIBS 21, 346-350 (1996).
77. Ohkuma, Y. Multiple functions of general transcription factors TFIIE and TFIIH in transcription: possible points of regulation by trans-acting factors. J Biochem (Tokyo) 122, 481-9 (1997).
78. Reines, D., Conaway, R.C. & Conaway, J.W. Mechanism and regulation of transcriptional elongation by RNA polymerase II. Curr Opin Cell Biol 11, 342-6 (1999).
79. McCracken, S. et al. The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385, 357-61 (1997).
80. Dvir, A. et al. A Role for ATP and TFIIH in Activation of the RNA Polymerase II Preinitiation Complex Prior to Transcription Initiation. J. Biol. Chem. 271, 7245-7248 (1996).
81. Parvin, J.D., Shykind, B.M., Meyers, R.E., Kim, J. & Sharp, P.A. Multiple Sets of Basal Factors Initiate Transcription by RNA Polymerase II. J. Biol. Chem. 269, 18414-18421 (1994).
82. Tyree, C.M. et al. Identification of a minimal set of proteins that is sufficient for accurate initiation of transcription by RNA polymerase II. Genes & Development 7, 1254-1265 (1993).
83. Usheva, A. et al. Specific Interaction between the Nonphosphorylated Form of RNA Polymerase II and the TATA-Binding Protein. Cell 69, 871-881 (1992).

84. Seroz, T., Hwang, J.R., Moncollin, V. & Egly, J.M. TFIIH: a link between transcription, DNA repair and cell cycle regulation. Curr. Opin. Genet. Dev. 5, 217-221 (1995).
85. Yan, Q., Moreland, R.J., Conaway, J.W. & Conaway, R.C. Dual roles for transcription factor IIF in promoter escape by RNA polymerase II [In Process Citation]. J Biol Chem 274, 35668-75 (1999).
86. Chang, C.H., Kostrub, C.F. & Burton, Z.F. RAP30/74 (Transcription Factor IIF) Is Required for Promoter Escape by RNA Polymerase II. J. Biol. Chem. 268, 20482-20489 (1993).

87. Tan, S., Aso, T., Conaway, R.C. & Conaway, J.W. Roles for Both the RAP30 and RAP74 Subunits of Transcription Factor IIF in Transcription Initiation and Elongation by RNA Polymerase II. J. Biol. Chem. 269, 25684-25691 (1994).
88. Tan, S., Conaway, R.C

92. Yonaha, M., Tsuchiya, T. & Yasukochi, Y. Cell-cycle-dependent phosphorylation of the basal transcription factor RAP74. FEBS Letters 410, 477-480 (1997).
93. Yankulov, K.Y. & Bentley, D.L. Regulation of CDK7 substrate specificity by MAT1 and TFIIH. EMBO J 16, 1638-46 (1997).

94. Ohkuma, Y. & Roeder, R.G. Regulation of TFIIH ATPase and kinase activities by TFIIE during active initiation complex formation. Nature 368, 160-163 (1994).
95. Roy, R. et al. The MO15 Cell Cycle Kinase Is Associated with the TFIIH Transcription-DNA Repair Factor. Cell 79, 1093-1101 (1994).
96. Feaver, W.J., Svejstrup, J.Q., Henry, N.L. & Kornberg, R.D. Relationship of CDK-Activating Kinase and RNA Polymerase II CTD Kinase TFIIH/TFIIK. Cell 79, 1103-1109 (1994).

97. Svejstrup, J.Q. et al. Evidence for a mediator cycle at the initiation of transcription. PNAS 94, 6075-6078 (1997).
98. Uptain, S.M., Kane, C.M. & Chamberlin, M.J. Basic mechanisms of transcript elongation and its regulation. Annu Rev Biochem 66, 117-72 (1997).

99. Gu, W. & Reines, D. Identification of a Decay in Transcription Potential That Results in Elongation Factor Dependence of RNA Polymerase II. J. Biol. Chem. 270, 11238-11244 (1995).
100. Coulombe, B., Li, J. & Greenblatt, J. Topological Localization of the Human Transcription Factors IIA, IIB, TATA Box-binding Protein and RNA Polymerase II-associated Protein 30 on a Class II Promoter. J. Biol. Chem. 269, 19962-19967 (1994).

101. Parvin, J.D. & Sharp, P.A. DNA Topology and a Minimal Set of Basal Factors of Transcription by RNA Polymerase II. Cell 73, 533-540 (1993).

102. Price, D.H., Sluder, A.E. & Greenleaf, A.L. Dynamic Interaction between a Drosophila Transcription Factor and RNA Polymerase II. Mol. Cell. Biol. 9, 1465-1475 (1989).
103. Kephart, D.D., Wang, B.Q., Burton, Z.F. & Price, D.H. Functional Analysis of Drosophila Factor 5 (TFIIF), a General Transcription Factor. J. Biol. Chem. 269, 13536-13543 (1994).

104. Kim, J.B., Yamaguchi, Y., Wada, T., Handa, H. & Sharp, P.A. Tat-SF1 protein associates with RAP30 and human SPT5 proteins. Mol Cell Biol 19, 5960-8 (1999).
105. Zhou, M., Kashanchi, F., Jiang, H., Ge, H. & Brady, J.N. Phosphorylation of the RAP74 Subunit of TFIIF Correlates with Tat-Activated Transcription of the HIV-1 Long Terminal Repeat. Virology 268, 452-460 (2000).
106. Rossignol, M., Keriel, A., Staub, A. & Egly, J.M. Kinase activity and phosphorylation of the largest subunit of TFIIF transcription factor. J Biol Chem 274, 22387-92 (1999).
107. Imhof, A. et al. Acetylation of general transcription factors by histone acetyltransferases. Current Biology 7, 689-692 (1997).
108. Rawling, J.M. & Alvarez-Gonzalez, R. TFIIF, a basal eukaryotic transcription factor, is a substrate for poly(ADP-ribosylation). Biochem. J. 324, 249-253 (1997).
109. Aso, T. et al. Characterization of cDNA for the large subunit of the transcription initiation factor TFIIF. Nature 355, 461-463 (1992).

110. Finkelstein, A. et al. A cDNA encoding RAP74, a general initiation factor for transcription by RNA polymerase II. Nature 355, 464-467 (1992).
111. Karlin, S. Unusual charge configurations in transcription factors of the basic RNA polymerase II initiation complex. Proc. Natl. Acad. Sci. USA 90, 5593-5597 (1993).
112. Joliot, V., Demma, M. & Prywes, R. Interaction with RAP74 Subunit of TFIIF is required for transcriptional activation by serum response factor. Nature 373, 632-635 (1995).
113. Zhu, H., Joliot, V. & Prywes, R. Role of Transcription Factor TFIIF in Serum Response Factor-activated Transcription. J. Biol. Chem. 269, 3489-3497 (1994).
114. Sabbah, M., Kang, K.-L., Tora, L. & Redeuilh, G. Estrogen receptor facilitates the formation of preinitiation complex assembly: involvement of the general transcription factor TFIIB. Biochem. J. 336, 639-646 (1998).
115. Martin, M.L., Lieberman, P.M. & Curran, T. Fos-Jun Dimerization Promotes Interaction of the Basic Region with TFIIE-34 and TFIIF. Mol. Cell. Biol. 16, 2110-2118 (1996).
116. Liang, G. & Hai, T. Characterization of Human Activation Transcription Factor 4, a Transcriptional Activator That Interacts with Multiple Domains of cAMP-responsive Element-binding Protein (CREB)-binding Protein (CBP). J. Biol. Chem. 272, 24088-24095 (1997).
117. McEwan, I.J., Dahlman-Wright, K., Ford, J. & Wright, A.P.H. Functional Interaction of the c-Myc Transactivation Domain with the TATA Binding Protein: Evidence for an Induced Fit Model of Transactivation Domain Folding. Biochemistry 35, 9584-9593 (1996).
118. Pellegrini, L., Tan, S. & Richmond, T.J. Structure of serum response factor core bound to DNA. Nature 376, 490-498 (1995).
119. Kannan, P., Yu, Y., Wankhade, S. & Tainsky, M.A. PolyADP-ribose polymerase is a coactivator for AP-2 mediated transcriptional activation. Nucl. Acids Res. 27, 866-874 (1999).
120. Gong, D.W. et al. Elucidation of three putative structural subdomains by comparison of primary structure of Xenopus and human RAP74. Nucleic Acids Research 20, 6736 (1992).
121. Gong, D.-W., Horikoshi, M. & Nakatani, Y. Analysis of cDNA encoding Drosophila transcription initiation factor TFIIF alpha (RAP74). Nucleic Acids Research 21, 1492 (1993).
122. Henry, N.L. et al. TFIIF-TAF-RNA polymerase II connection. Genes & Development 8, 2868-2878 (1994).

123. Kephart, D.D. et al. Cloning of a Drosophila cDNA with sequence similarity to human transcription factor RAP74. Nucleic Acids Research 21, 1319 (1993).
124. Yong, C. et al. Structure of the human transcription factor TFIIF revealed by limited proteolysis with trypsin. FEBS Letters 435, 191-194 (1998).
125. Kobayashi, Y., Kitajima, S. & Yasukochi, Y. Isolation and nucleotide sequence of a rat cDNA homologous to human RAP30. Nucleic Acids Research 20, 1994 (1992).
126. Gong, D.W., Mortin, M.A., Horikoshi, M. & Nakatani, Y. Molecular cloning of cDNA encoding the small subunit of Drosophila transcription initiation factor TFIIF. Nucleic Acids Research 23, 1882-1886 (1995).

127. Sopta, M., Burton, Z.F. & Greenblatt, J. Structure and associated DNA-helicase activity of a general transcription initiation factor that binds to RNA polymerase II. Nature 341, 410-414 (1989).

128. Gong, D.W. et al. Imperfect conservation of a sigma factor-like subregion in Xenopus general transcription factor RAP30 [published erratum appears in Nucleic Acids Res 1993 Jul 25;21(15):3606]. Nucleic Acids Res 20, 6414 (1992).
129. Horikoshi, M., Fujita, H., Wang, J., Takada, R. & Roeder, R.G. Nucleotide and amino acid sequence of RAP30. Nucleic Acids Res 19, 5436 (1991).
130. Killeen, M.T. & Greenblatt, J.F. The general transcription factor RAP30 binds to RNA polymerase II and prevents it from binding nonspecifically to DNA. Mol Cell Biol 12, 30-7 (1992).
131. Wilson, J.E. The Use of Monoclonal Antibodies and Limited Proteolysis in Elucidation of Structure-Function Relationships in Proteins. in Methods of Biochemical Analysis, Vol. 35 (ed. Suelter, C.H.) 207-243 (John Wiley & Sons, Inc., New York, 1991).

132. Hubbard, S.J. The structural aspects of limited proteolysis of native proteins. Biochim. Biophys. Acta 1382, 191-206 (1998).
133. Wang, B.Q., Lei, L. & Burton, Z.F. Importance of codon preference for production of human RAP74 and reconstitution of the RAP30/74 complex. Protein Expression and Purification 5, 476-485 (1994).
134. Ray, W.J. & Puvathingal, J.M. A Simple Procedure for Removing Contaminating Aldehydes and Peroxides from Aqueous Solutions of Polyethylene Glycols and of Nonionic Detergents That Are Based on the Polyoxyethylene Linkage. Anal. Biochem. 146, 307-312 (1985).
135. Collaborative Computational Project, Number 4. The CCP4 suite: Programs for computational crystallography. Acta Cryst. D50, 760-763 (1994).
136. Bagby, S. et al. Solution structure of the C-terminal core domain of human TFIIB: similarity to cyclin A and interaction with TATA-binding protein. Cell 82, 857-67 (1995).
137. Brunger, A.T. et al. Crystallography & NMR System: A New Software Suite for Macromolecular Structure Determination. Acta Cryst. D54, 905-921 (1998).

138. Kleywegt, G.J. & Jones, T.A. Detecting folding motifs and similarities in protein structures. Methods in Enzymology 277, 525-545 (1997).
139. Kleywegt, G.J. Experimental assessment of differences between related protein crystal structures. Acta Cryst. D55, 1878-1884 (1999).
140. Otwinowski, Z. & Minor, W. Processing of X-ray diffraction data collected in oscillation mode. Methods in Enzymology 276, 307-326 (1997).
141. Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. Improved methods for building protein models in electron density maps and location of errors in these models. Acta Cryst. A47, 110-119 (1991).
142. Kleywegt, G.J. & Jones, T.A. Software for handling macromolecular envelopes. Acta Cryst. D55, 941-944 (1999).
143. Estermann, M.A. Solving crystal structures with the symmetry minimum function. Nucl. Instr. Meth. in Phys. Res. A 34, 126-135 (1995).

144. Estermann, M. SHAPE, Patterson deconvolution for structure solution. XTAL Manual (1999).

145. Sheldrick, G.M., Dauter, Z., Wilson, K.S., Hope, H. & Sieker, L.C. Acta Cryst. D49, 18-23 (1993).
146. Sheldrick, G.M. Acta Cryst. A46, 467-473 (1990).

147. Sambrook, J., Fritsch, E.F. & Maniatis, T. Molecular Cloning - A Laboratory Manual. (Cold Spring Harbor Press, Cold Spring Harbor, NY, USA, 1989).
148. Inoue, H., Nojima, H. & Okayama, H. High efficiency transformation of Escherichia coli with plasmids. Gene 96, 23-28 (1990).
149. Yanisch-Perron, C., Vieira, J. & Messing, J. Gene 33, 103-109 (1985).
150. Studier, F.W., Rosenberg, A.H., Dunn, J.J. & Dubendorff, J.W. Use of T7 RNA Polymerase to Direct Expression of Cloned Genes. Methods in Enzymology 185, 60-89 (1990).
151. Saiki, R.K. et al. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239, 487-91 (1988).
152. Ito, H. A general method for introducing a series of mutations into cloned DNA using the polymerase chain reaction. Gene 102, 67-70 (1991).
153. Sanger, F., Nicklen, S. & Coulson, A.R. DNA sequencing with chain-terminating inhibitors. PNAS 74, 5463-5467 (1977).
154. Tabor, S. & Richardson, C.C. DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA 84, 4767-4771 (1987).
155. Laemmli, U. Nature 227, 680-685 (1970).
156. Lottspeich, F. & Zorbas, H. Bioanalytik. (Spektrum Akademischer Verlag, Heidelberg, Berlin, 1998).
157. Hampton Research. Crystal Screen I Sparse Matrix Crystallization Screening Kit. Crystallization Research Tools Catalogue 6, 6-7 (1996).
158. Hampton Research. Crystal Screen II Sparse Matrix Crystallization Screening Kit. Crystallization Research Tools Catalogue 6, 8-9 (1996).
159. Hampton Research. Detergent Screen I prevent non-specific aggregation. Crystallization Research Tools Catalogue 9, 20 (1999).
160. Cudney, B., Patel, S., Weisgraber, K., Newhouse, Y. & McPherson, A. Screening and Optimization Strategies for Macromolecular Crystal Growth. Acta Cryst. D50, 413-423 (1994).
161. Severinova, E. et al. Domain Organization of the Escherichia coli RNA Polymerase sigma 70 Subunit. J. Mol. Biol. 263, 636-647 (1996).
162. Casagranda, F. & Wilshire, J.F. C-terminal sequencing of peptides. The thiocyanate degradation method. Methods Mol Biol 32, 335-49 (1994).
163. Inglis, A.S. Chemical procedures for C-terminal sequencing of peptides and proteins. Anal Biochem 195, 183-96 (1991).
164. Schleuder, D., Hillenkamp, F. & Strupat, K. IR-MALDI-mass analysis of electroblotted proteins directly from the membrane: comparison of different membranes, application to on-membrane digestion, and protein identification by database searching. Anal Chem 71, 3238-47 (1999).
165. Cohen, S.L., Ferre-D'Amare, A.R., Burley, S.K. & Chait, B.T. Probing the solution structure of the DNA-binding protein Max by a combination of proteolysis and mass spectrometry. Protein Science 4, 1088-1099 (1995).
166. Cohen, S.L. Domain elucidation by mass spectrometry. Structure 15, 1015-1016 (1996).

167. Shaw, E. & Ruscica, J. The Reactivity of His-57 in Chymotrypsin to Alkylation. Archives of Biochemistry and Biophysics 145, 484-489 (1971).
168. Indig, F.E., Ben-Meir, D., Spungin, A. & Blumberg, S. Investigation of neutral aminopeptidases and of neutral proteinases using a new sensitive two-stage enzymatic reaction. FEBS Lett. 255, 237-241 (1989).
169. Malhotra, A., Severinova, E. & Darst, S.A. Crystal Structure of a sigma 70 Subunit Fragment from E. coli RNA Polymerase. Cell 87, 127-136 (1996).
170. Blundell, T.L. & Johnson, L.N. Protein Crystallography. (Academic Press, New York, 1976).
171. Drenth, J. Principles of Protein X-ray Crystallography. (Springer, New York, 1994).

172. D'Arcy, A. Crystallizing Proteins - a Rational Approach? Acta Cryst. D50, 469-471 (1994).
173. Stura, E.A., Satterthwait, A.C. & Calvo, J.C. Reverse Screening. Acta Cryst. D50, 448-455 (1994).
174. Gilliland, G.L. & Bickham, D.M. The Biological Macromolecule Crystallization Database: A Tool for Developing Crystallization Strategies. METHODS 1, 6-11 (1990).
175. Gilliland, G.L. & Ladner, J.E. Crystallization of biological macromolecules for X-ray diffraction studies. Curr Opin Struct Biol 6, 595-603 (1996).
176. McPherson, A. Crystallization of Macromolecules: General Principles. in Diffraction Methods for Biological Macromolecules, Vol. Part A (eds. Wyckoff, H.W., Hirs, C.H.W. & Timasheff, S.N.) 112-119 (Academic Press, Inc., Harcourt Brace Jovanovich, Publishers, Orlando, San Diego, New York, Austin, London, Montreal, Sydney, Tokyo, Toronto, 1985).
177. Chayen, N.E. The role of oil in macromolecular crystallization. Structure 5, 1269-1274 (1997).
178. Chayen, N.E. A novel technique for containerless protein crystallization. Protein Engineering 9, 927-929 (1996).
179. Robert, M.C., Provost, K. & Lefaucheux, F. Crystallization in gels and related methods. in Crystallization of Nucleic Acids and Proteins: A Practical Approach (eds. Ducruix, A. & Giege, R.) 127-144 (IRL Press, Oxford, 1992).
180. Reiss-Husson, F. Crystallization of membrane proteins. in Crystallization of Nucleic Acids and Proteins: A Practical Approach (eds. Ducruix, A. & Giege, R.) 175-191 (IRL Press, Oxford, 1992).
181. Schick, B. & Jurnak, F. Crystal Growth and Crystal Improvement Strategies. Acta Crystallographica Section D D50, 563-568 (1994).
182. Esnouf, R.M. et al. Continuous and discontinuous changes in the unit cell of HIV-1 reverse transcriptase crystals on dehydration. Acta Crystallogr D Biol Crystallogr 54, 938-53 (1998).
183. Garman, E.F. & Schneider, T.R. Macromolecular Cryocrystallography. J. Appl. Cryst. 30, 211-237 (1997).
184. Rodgers, D.W. Cryocrystallography. Structure 2, 1135-1140 (1994).
185. Rodgers, D.W. Practical Cryocrystallography. Methods in Enzymology 276, 183-203 (1997).
186. Harp, J.M., Hanson, B.L., Timm, D.E. & Bunick, G.J. Macromolecular crystal annealing: evaluation of techniques and variables. Acta Crystallogr D Biol Crystallogr 55, 1329-54 (1999).

187. Scopes, R.K. Protein Purification: Principles and Practice. (Springer, New York, 1987).

188. Harris, E.L.V. & Angal, S. Protein purification methods: a practical approach. (IRL Press, Oxford, 1989).
189. Lorber, B. & Giege, R. Preparation and handling of biological macromolecules for crystallization. in Crystallization of Nucleic Acids and Proteins: A Practical Approach (eds. Ducruix, A. & Giege, R.) 19-46 (IRL Press, Oxford, 1992).
190. Ferre-D'Amare, A. & Burley, S. Dynamic Light Scattering in Evaluating Crystallizability of Macromolecules. Methods in Enzymology 276, 157-165 (1997).
191. Jancarik, J. & Kim, S.-H. Sparse matrix sampling: a screening method for crystallization of proteins. J. Appl. Cryst. 24, 409-411 (1991).
192. McPherson, A. Crystallization of Proteins by Variation of pH or Temperature. in Diffraction Methods for Biological Macromolecules, Vol. Part A (eds. Wyckoff, H.W., Hirs, C.H.W. & Timasheff, S.N.) 125-127 (Academic Press, Inc., Harcourt Brace Jovanovich, Publishers, Orlando, San Diego, New York, Austin, London, Montreal, Sydney, Tokyo, Toronto, 1985).
193. Carter, C.W. Design of crystallization experiments and protocols. in Crystallization of Nucleic Acids and Proteins: A Practical Approach (eds. Ducruix, A. & Giege, R.) 47-72 (IRL Press, Oxford, 1992).
194. Shieh, H.-S., Stallings, W.C., Stevens, A.M. & Stegeman, R.A. Using Sampling Techniques in Protein Crystallization. Acta Cryst. D51, 305-310 (1995).
195. Weber, P.C. Overview of Protein Crystallization Methods. Methods in Enzymology 276, 13-22 (1997).
196. Ducruix, A. & Giege, R. Crystallization of Nucleic Acids and Proteins: A Practical Approach. (IRL Press, Oxford, 1992).
197. McPherson, A. Use of Polyethylene Glycol in the Crystallization of Macromolecules. in Diffraction Methods for Biological Macromolecules, Vol. Part A (eds. Wyckoff, H.W., Hirs, C.H.W. & Timasheff, S.N.) 120-124 (Academic Press, Inc., Harcourt Brace Jovanovich, Publishers, Orlando, San Diego, New York, Austin, London, Montreal, Sydney, Tokyo, Toronto, 1985).
198. McPherson, A. Current approaches to macromolecular crystallization. Eur. J. Biochem. 189, 1-23 (1990).

199. Weiss, M.S. & Hilgenfeld, R. Dehydration leads to a phase transition in monoclinic factor XIII crystals. Acta Crystallogr D Biol Crystallogr 55, 1858-62 (1999).
200. Struck, M.M., Klug, A. & Richmond, T.J. Comparison of X-ray structures of the nucleosome core particle in two different hydration states. J Mol Biol 224, 253-64 (1992).

201. Williams, C.E. et al. Crystallization and preliminary X-ray studies on the molbindin ModG from Azotobacter vinelandii. Acta Crystallogr D Biol Crystallogr 55, 1356-8 (1999).
202. Dauter, Z. Data Collection Strategy. Methods in Enzymology 276, 326-344 (1997).
203. Maeder, A. 12 124. ETH Zuerich (1997).
204. Dumas, P., Ennifar, E. & Walter, P. Detection and treatment of twinning: an improvement and new results. Acta Crystallogr D Biol Crystallogr 55, 1179-87 (1999).

Yeales, T.O. Simple statistics for intensity data fiom twinned specimens. Acta Crxstallogr A 44, 142-4 (1988). Yeates, T.O. Detecting and overcoming crystal twinning. Methods Enzxmol 276. 344-58(1907). Chailes. W. & Carter. J. Efficient Factorial Designs and the Analysis of Macromoleculai Ciystal Cuowth Conditions Methods: A Companion to Methods m Enzxmologv 1. 12-24 (1990) Weik. M. et al. Specific chemical and stmctural damage to proteins produced by

' synchrotron ladiation. Proc Natl Acad Sa I S A 97. 623-8 (2000). Uson. I. tt Sheldrick, CM Advances in direct methods for protein crystallography Curr Opin Struct Biol 9, 643-648 (1999). Petsko, G.A. Preparation oi Isomophous Heavy-Atom Derivatives. Methods m Enzxmologv 114. 147-156 (1985) Rould, M.A. Screening for Heavy-Atom Derivatives and Obtaining Accurate Isomorphous Differences. Methods tn EnzMiwloçv 276, 461-472 (1997), Prange, T. et al Exploring hydrophobic sites m proteins with xenon or krypton. Proteins 30, 61-73 (1998k Luger. K., Madei, AW., Richmond. RK. Sargent, D.F. & Richmond, TJ, Crystal stiucture oi the nucleosome coie panicle at 2.8 A icsolulion [see comments! Nature 389. 251 -60 (1997 ) Yang. W.. Hendrickson. W.A.. Crouch, RJ & Satow, Y. Structure of Ribonuclease H Phased at 2 A Resolution by MAD Analysis of the Selenometluonyl Piotem Science 249. 1398-1404 ( 1990). De Luce, C Selenomethionine labeling oi type III AFP. thesis chapter (1998). Smith. J.L. A Thompson. A Reactivity of selenomethionine—dents in the magic bullet? Structure 6. 815-9 (1998) Buclisa. N. et al. Biomcorporation oi Telluromethmonine into Proteins A Promising Approach for X-ray Stiucture Analysis of Proteins. J. Mol. Biol 270. 616-623 0997). Nagai. K., Otibndge, C. Jessen. T.H., Li, J. & Evans, P.R. Crystal structure ol the RNA-bmding domain of the Ul small nucleai nbonucleoprolein A. Nature 348.515-520(1990). Oubride, C. Ito. N.. Teo, C.-FL, Feamley. I.

225. Diederichs, K. & Karplus, P.A. Improved R-factors for diffraction data analysis in macromolecular crystallography. Nature Structural Biology 4, 269-275 (1997).
226. Hendrickson, W.A. & Ogata, C.M. Phase Determination from Multiwavelength Anomalous Diffraction Measurements. Methods in Enzymology 276, 494-523 (1997).
227. Ramakrishnan, V. & Biou, V. Treatment of Multiwavelength Anomalous Diffraction Data as a Special Case of Multiple Isomorphous Replacement. Methods in Enzymology 276, 538-539 (1997).
228. Gonzalez, A. et al. Two-wavelength MAD phasing: in search of the optimal choice of wavelengths. Acta Crystallogr D Biol Crystallogr 55, 1449-58 (1999).
229. Kleywegt, G.J. & Jones, T.A. Model Building and Refinement Practice. Methods in Enzymology 277, 208-230 (1997).
230. Vellieux, F.M.D. & Read, R.J. Noncrystallographic Symmetry Averaging in Phase Refinement and Extension. Methods in Enzymology 276, 18-64 (1997).

231. Abrahams, J.P. & Leslie, A.G.W. Methods Used in the Structure Determination of Bovine Mitochondrial F1 ATPase. Acta Cryst. D52, 30-42 (1996).
232. Cowtan, K. & Main, P. Miscellaneous Algorithms for Density Modification. Acta Cryst. D54, 487-493 (1998).
233. Navaza, J. & Saludjian, P. AMoRe: An Automated Molecular Replacement Program Package. Methods in Enzymology 276, 581-593 (1997).
234. Brünger, A.T. Patterson Correlation Searches and Refinement. Methods in Enzymology 276, 558-580 (1997).
235. Kraut, J., Sieker, L.C., High, D.F. & Freer, S.T. PNAS 48 (1962).
236. Brünger, A.T. & Rice, L.M. Crystallographic Refinement by Simulated Annealing: Methods and Applications. Methods in Enzymology 277, 243-269 (1997).
237. Brünger, A.T., Adams, P.D. & Rice, L.M. New applications of simulated annealing in X-ray crystallography and solution NMR. Structure 5, 325-36 (1997).
238. Holm, L. & Sander, C. Protein structure comparison by alignment of distance matrices. J Mol Biol 233, 123-38 (1993).
239. Janin, J. Elusive affinities. Proteins 21, 30-9 (1995).
240. Yonaha, M. et al. Domain structure of a human general transcription initiation factor, TFIIF. Nucleic Acids Res. 21, 273-279 (1993).
241. Flores, O., Lu, H., Killeen, M., Greenblatt, J. & Burton, Z.F. The small subunit of transcription factor IIF recruits RNA polymerase II into the preinitiation complex. Proc. Natl. Acad. Sci. USA 88, 9999-10003 (1991).
242. Frank, D.J., Tyree, C.M., George, C.P. & Kadonaga, J.T. Structure and Function of the Small Subunit of TFIIF (RAP30) from Drosophila melanogaster. J. Biol. Chem. 270, 6292-6297 (1995).
243. Pan, G. & Greenblatt, J. Initiation of Transcription by RNA Polymerase II Is Limited by Melting of the Promoter DNA in the Region Immediately Upstream of the Initiation Site. J. Biol. Chem. 269, 30101-30104 (1994).
244. Kitajima, S. et al. A heteromeric transcription factor required for mammalian RNA polymerase II. Nucleic Acids Res 18, 4843-9 (1990).

Appendix

Human RAP74 amino acid and DNA sequences

The gene for human RAP74 was obtained from Q. Wang and Z. Burton in the pET11 expression vector (Novagen). The gene was resequenced completely. Vector sequences are in capitals and the open reading frame is given in lower case. Some silent point mutations were found, which are marked in bold print. Bases are numbered with respect to the pET11d origin of replication.
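The capitalization convention used in these listings (vector sequence in capitals, open reading frame in lower case) can be checked programmatically. The following is only a minimal Python sketch under stated assumptions: the function names `extract_orf` and `translate` and the short `toy` sequence are hypothetical illustrations, not part of the original work, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def extract_orf(annotated: str) -> str:
    """Keep only the lower-case letters (the open reading frame), upper-cased.

    Capital letters (vector sequence), digits, whitespace and ruler symbols
    are discarded.
    """
    return "".join(ch for ch in annotated if ch.islower()).upper()


def translate(orf: str) -> str:
    """Translate an ORF codon by codon; '*' marks a stop codon.

    Trailing bases that do not form a complete codon are ignored.
    """
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical toy input: capitals mimic vector sequence, lower case the ORF.
toy = "GGATCCatggcctgaGAATTC"
orf = extract_orf(toy)
print(orf, translate(orf))
```

Applied to a cleanly transcribed listing, the translated string could be compared against the amino acid lines printed beneath the DNA sequence to confirm that the marked point mutations are silent.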

[Sequence listing: double-stranded DNA sequence of the human RAP74 open reading frame with flanking vector sequence, base numbering, and the amino acid translation in one-letter code (residues 1-517). The listing is not legible in this copy.]

Human RAP30 amino acid and DNA sequences

The gene for human RAP30 was obtained from Q. Wang and Z. Burton in the pET11d expression vector (Novagen). The gene was resequenced completely. Some silent point mutations were found, which are marked in bold print. Vector sequences are in capitals and the open reading frame is given in lower case. Bases are numbered with respect to the pET11d origin of replication.

[Sequence listing: double-stranded DNA sequence of the human RAP30 open reading frame with flanking vector sequence, base numbering, and the amino acid translation in one-letter code (residues 1-249). The listing is not legible in this copy.]

Multiple sequence alignment of RAP74 homologues

[Alignment figure: multiple sequence alignment of the RAP74 homologues from human, Xenopus, Drosophila and yeast, with conserved residues boxed. The alignment is not legible in this copy.]

Multiple sequence alignment of RAP30 homologues

[Alignment figure: multiple sequence alignment of RAP30 homologues; the legible sequence labels are human, rat, Xenopus and Drosophila. The alignment is not legible in this copy.]

Acknowledgements

Special thanks go to Prof. Dr. Timothy J. Richmond for giving me the opportunity to work on an exciting project in his laboratory. I had the chance to learn a wide range of methods in the field.

I am grateful to Prof. Dr. Song Tan for the introduction to molecular biology techniques and for helpful discussions.

For all advice on experimental methods as well as for patience and spontaneous help, Dr. David Sargent, Dr. Thomas Rechsteiner, Dr. Armin Mäder, Dr. Imre Berger and my fellow doctoral students Michael Bleichenbacher, Markus Hassler, and Eugenio Santelli deserve special thanks.

I want to thank the remaining members of the Richmond Lab for providing an excellent working environment: Dr. Rob Coleman, Dr. Carl DeLuca, Yvonne Hunziker, Prof. Dr. Karolin Luger, Dr. Dan Fitzgerald, Dr. Andrew Flaus, Dr. Kerstin Weiss.

Assistance and advice from the technical staff at the ESRF beamlines and at the ETH Protein Service Laboratory are gratefully acknowledged: Dr. Phil Pattison (SNBL), Dr. Gordon Leonard (BM14), Dr. Sean McSweeney (ID14-4), Dr. René Brunisholz (ETH), Dr. Gerhard Frank (ETH).

Last but not least, I would like to formally thank my parents, who gave me the freedom to study whatever I wanted and always supported me as best as possible.

This work has been supported in part by the Swiss National Science Foundation and Novartis AG.

Curriculum Vitae

Florian Gaiser, Dipl. Natw. ETH

Citizen of the Federal Republic of Germany.

Born in Munich, Germany, on 29 May 1972

1978-1982 Primary School in Munich, Germany.

1982-1984 Primary School in Baar, Switzerland.

1984-1991 Secondary School

Kantonsschule Zug, Switzerland.

1991-1995 Study of Biochemistry and Organic Chemistry

Federal Institute of Technology (ETH), Zurich, Switzerland.

1995 Diploma thesis

"Herstellung von (5R,6R,7S,8R)-5,6,7,8-Tetrahydro-5-(hydroxymethyl)-indolizin-6,7,8-triolen ("Nojiripyrrolen")"

Institute of Organic Chemistry, ETH Zurich, Switzerland.

Prof. Dr. A. Vasella

1995 Diploma in Biochemistry, Molecular Biology, Genetics, Synthetic Organic Chemistry, and Bioorganic Chemistry

1995-2000 Doctoral study and preparation of this thesis

"Structure of the human transcription factor TFIIF"

Institute of Molecular Biology and Biophysics, ETH Zurich, Switzerland

Prof. Dr. T. J. Richmond