University Microfilms International, A Bell & Howell Information Company, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA, 313/761-4700, 800/521-0600

Order Number 9227401

Cloning and characterization of elongation factor G genes

Welcsh, Piri Louise, Ph.D.

The Ohio State University, 1992

Cloning and Characterization of Elongation Factor G Genes

DISSERTATION

Presented in Partial Fulfillment of the Requirement for

the Degree of Doctor of Philosophy in the Graduate

School of the Ohio State University

By

Piri Louise Welcsh, B.S.

*****

The Ohio State University

1992

Dissertation Committee:

C.A. Breitenberger

P.A. Fuerst

G.A. Marzluf

L.F. Johnson

Approved by

Adviser

Department of Molecular

my parents

ACKNOWLEDGEMENTS

I express sincere appreciation to Dr. Caroline A. Breitenberger for her guidance, insight, understanding and friendship throughout this research. I would also like to thank the other members of my dissertation committee, Drs. Paul A. Fuerst, George A. Marzluf, and Lee F. Johnson, for their suggestions and comments. Thanks go to Drs. R. Sayre, M. Teirney, K. Davis, R. Scholl, P. Perlman, R. Saldhana, T. Chang, and A. Tzagoloff and to Douglas Johnson, John Moran, and Yangsheng Zheng for their generous contributions to this project. A very special thanks goes to Douglas Johnson not only for his work on the S. cerevisiae project but also for his friendship, support and encouragement over the last two years. I would also like to thank all the past and present members of the Breitenberger laboratory as well as Dr. Elio Vanin and the past members of the Vanin laboratory. To all of the other friends I have made while at Ohio State, I thank you for your support, encouragement, and understanding.

Finally, I would like to thank my family, my parents Thomas and Marilyn, my sisters Kristan and Denise, and my brothers Jeffery, Kevin and Albert, for their encouragement and enthusiasm throughout this endeavor.

VITA

April 10, 1963 ...... Born - Youngstown, Ohio

1985 ...... B.S., Marquette University, Milwaukee, Wisconsin

FIELD OF STUDY

Major Field: Molecular Biology

TABLE OF CONTENTS

DEDICATION

ACKNOWLEDGEMENTS

VITA

LIST OF TABLES

LIST OF FIGURES

CHAPTER

I. INTRODUCTION

A. Outline
B. Objectives
C. Protein Synthesis
D. GTP-hydrolyzing proteins function as molecular switches
E. Organellar protein synthesis elongation factors in higher eukaryotes are encoded by nuclear genes
F. Targeting of nuclear-encoded proteins to cellular organelles
G. Regulation of organellar protein biosynthesis

II. MATERIALS AND METHODS

A. Strains
B. Transformation of yeast
C. Recombinant ...
D. DNA restriction analysis
E. Preparation of high molecular weight nuclear DNA
F. Design of EF-G specific oligonucleotides
G. Generation and analysis of EF-G specific PCR products
H. Cloning of PCR generated EF-G gene fragments
I. DNA sequencing
J. Genomic Southern analysis
K. Northern analysis
L. Isolation of the Synechocystis 6803 EF-G gene
M. Isolation of the Saccharomyces cerevisiae mitochondrial EF-G gene
N. Attempted isolation of the pea, Chlamydomonas reinhardtii, and Arabidopsis thaliana EF-G genes
O. Construction of the disrupted efgl::URA3 allele
P. Disruption of the wild type S. cerevisiae mitochondrial EF-G gene
Q. Analysis of the disrupted S. cerevisiae mitochondrial EF-G gene
R. Yeast mating
S. Complementation of the respiratory-defective S. cerevisiae mutant strain C155

III. Isolation and characterization of a cyanobacterial EF-G gene

A. Background and approach
B. Generation and identification of a Synechocystis 6803 EF-G specific PCR product
C. Synechocystis 6803 genomic Southern analysis
D. Isolation of three genomic clones which contain the Synechocystis 6803 EF-G gene
E. ... sequence of the EF-G gene of Synechocystis 6803
F. A comparative analysis of the primary sequence of Synechocystis 6803 EF-G to other cyanobacterial, bacterial, and archaebacterial EF-G sequences
G. Use of the Synechocystis 6803 EF-G specific PCR product as a heterologous probe

IV. Cloning and characterization of a Saccharomyces cerevisiae mitochondrial EF-G gene

A. Background and approach
B. Generation and identification of a S. cerevisiae mitochondrial EF-G specific PCR product
C. S. cerevisiae genomic Southern analysis
D. Isolation of two genomic clones which encode a S. cerevisiae mitochondrial EF-G gene
E. Elucidation of the sequence of a S. cerevisiae mitochondrial EF-G gene
F. A comparative analysis of the amino acid sequence of a S. cerevisiae mitochondrial EF-G to other known eukaryotic and prokaryotic EF-G sequences
G. Disruption of a S. cerevisiae mitochondrial EF-G gene
H. Complementation of the S. cerevisiae pet mutant strain C155 with a S. cerevisiae mitochondrial EF-G gene
I. Discussion

V. Attempted isolation and characterization of the Pisum sativum and Arabidopsis thaliana chloroplast EF-G genes

A. Background and approach
B. Generation and identification of pea and A. thaliana chlEF-G specific PCR fragments
C. Pea and A. thaliana genomic Southern analysis
D. Pea northern analysis
E. Screening of a pea cDNA library, an A. thaliana cDNA library and an A. thaliana genomic library
F. Future plans

VI. Localization of matrix associated regions (MARs) and topoisomerase II cleavage sites in the human β-globin gene cluster

A. Introduction
B. Materials and methods
C. Results
D. Discussion

LIST OF REFERENCES

LIST OF TABLES

TABLE

1. Yeast strains used for genetic analysis

2. ... vectors and their selectable markers

3. Sequence motifs in GTPase superfamily

4. Percent amino acid sequence homology between Synechocystis 6803 EF-G, A. nidulans EF-G, S. platensis EF-G, E. coli EF-G, and E. coli LepA

5. Comparison of the codon usage in Synechococcus and the Synechocystis 6803 fus gene

6. Distance in base pairs between the genes of the str operon in E. coli, S. platensis, A. nidulans, and M. luteus

7. Percent amino acid sequence homology between the two S. cerevisiae mtEF-Gs and E. coli EF-G

8. Comparison of the codon usage in S. cerevisiae and the S. cerevisiae fus gene identified in our laboratory

9. Results of yeast test crosses: aBWG7A x aKARp0 and aBWG7A efgl::URA3 x aKARp0

10. Results of densitometric scans of the in vitro MAR assays shown in Figure 46

11. Results of densitometric scans of the in vitro MAR assays shown in Figure 48

LIST OF FIGURES

FIGURES

1. Prokaryotic protein synthesis initiation

2. Prokaryotic protein synthesis elongation and termination

3. Eukaryotic protein synthesis

4. The GTPase cycle

5. GTPase cycle of EF-Tu

6. Design of EF-G specific oligonucleotides

7. Schematic representation of the PCR strategy used to generate EF-G specific PCR products

8. A restriction enzyme map of the Synechocystis 6803 fus-like gene

9. Restriction enzyme map of a S. cerevisiae mtEF-G gene

10. Schematic diagram of the construction of the allele used to disrupt a wild type S. cerevisiae mtEF-G gene

11. Nondenaturing polyacrylamide gel electrophoresis of Synechocystis 6803 and pea PCR products

12. Nucleotide sequence of the Synechocystis 6803 EF-G specific PCR product

13. Alignment of the N-terminal amino acid sequences of translocases from E. coli, hamster, Synechocystis 6803, pea, and A. thaliana

14. Synechocystis 6803 genomic Southern

15. N. crassa, Euglena gracilis, Synechocystis 6803, and pea genomic Southern probed with the Synechocystis 6803 EF-G specific PCR product

16. Nucleotide and corresponding amino acid sequence of the Synechocystis 6803 EF-G gene

17. Alignment of the amino acid sequence of translocases from Synechocystis 6803, A. nidulans, S. platensis, E. coli, and T. thermophilus

18. Dot plot comparison of the available Synechocystis 6803 DNA sequence and the entire str operon of Spirulina platensis

19. Nondenaturing polyacrylamide gel electrophoresis of S. cerevisiae PCR products

20. DNA sequence of two S. cerevisiae EF-G specific PCR products, Y1 and Y2

21. Alignment of the N-terminal amino acid sequences of translocases from E. coli and S. cerevisiae

22. S. cerevisiae genomic Southern

23. Low stringency S. cerevisiae genomic Southern

24. Nucleotide and corresponding amino acid sequence of a S. cerevisiae mtEF-G gene

25. Alignment of the amino acid sequences of translocases from E. coli and S. cerevisiae

26. Genomic Southern analysis of S. cerevisiae efgl::URA3 transformants probed with the 5' Hind III fragment

27. Genomic Southern analysis of S. cerevisiae efgl::URA3 transformants probed with the Ura 3+ fragment

28. Nondenaturing polyacrylamide gel electrophoresis of A. thaliana and pea PCR products

29. Amino acid and DNA sequence of the unrearranged and the rearranged pea PCR products

30. Alignment of two pea chlEF-G PCR products

31. Pea genomic Southern

32. Arabidopsis thaliana genomic Southern

33. Pea northern analysis

34. The α- and β-globin gene clusters

35. Map of characterized deletions within the β-globin gene cluster

36. Comparison of normal DNA and the deletions associated with 7

37. Mechanism for the generation of related deletions by the loss of a chromatin loop during DNA replication

38. Mechanism for the generation of small deletions found within the human β-globin gene cluster

39. The predicted location of a matrix attachment site

40. MAR sequences often contain topoisomerase II cleavage sites and reside next to known enhancers

41. Restriction enzyme map of AN2.1

42. Schematic representation of the in vitro procedure used to identify topoisomerase II cleavage sites

43. Schematic representation of the in vitro and in vivo procedures used to identify DNA which specifically binds nuclear matrices

44. Localization of an in vitro MAR within the mouse κ immunoglobulin gene and map of clone pG 19/45

45. Control in vitro MAR assays

46. Localization of an in vitro matrix associated region in the second intervening sequence of the human β-globin gene

47. Map of the β-globin gene and flanking regions illustrating the location of the nuclear matrix attachment site as well as the locations of specific restriction enzyme sites

48. Localization of two in vitro matrix associated regions in AN2.1

49. Map of AN2.1 illustrating the location of two matrix attachment sites as well as pertinent restriction enzyme sites

50. Identification of in vitro topoisomerase II cleavage sites in the EcoRI subclones of AN2.1

51. Map of AN2.1 illustrating the location of in vitro topoisomerase II cleavage sites for subclones p0.8, p1.0, p1.3, and p1.8

CHAPTER I

INTRODUCTION

I.A. Outline.

Chapter I includes a brief description of the objectives of this research. Chapter I also provides an introduction to prokaryotic and eukaryotic protein synthesis, GTP-hydrolyzing proteins, the cellular location of eukaryotic organellar protein synthesis factor genes, organellar protein targeting, and organellar protein biosynthesis. Chapter II details the experimental materials and methods used in the research described in this thesis. Chapter III outlines the isolation of a cyanobacterial EF-G gene from Synechocystis 6803. Chapter IV describes the isolation of a yeast mitochondrial EF-G gene. Chapter V describes the attempted isolation of plant chloroplast EF-G genes. Finally, Chapter VI describes work done in the laboratory of Dr. Elio F. Vanin. This work consisted of the identification and characterization of matrix associated regions (MARs) and topoisomerase II cleavage sites in the human β-globin gene cluster.

I.B. Objectives.

The overall objective of the research described in this

thesis has been to investigate the fundamental problem of how

events in the nucleus and cell cytoplasm are coordinated with

events inside cellular organelles. In order to understand

how a cytoplasmically synthesized protein may regulate

organellar gene expression, I have chosen to study nuclear

encoded organellar protein synthesis elongation factor G

genes.

Both mitochondria and chloroplasts have their own distinct protein synthesizing systems and, in both cases, organellar protein synthesis is required for the expression of genes encoded in organellar DNA and for the development of an intact, functional organelle. The studies described here were designed to clone and characterize nuclear encoded organellar elongation factor G genes from plants and fungi.

The specific aims of the research described here were:

1. To isolate and characterize the gene encoding elongation factor G from a cyanobacterium, specifically Synechocystis 6803.

2. To isolate and characterize a Saccharomyces cerevisiae gene encoding mitochondrial elongation factor G.

3. To isolate nuclear genes encoding chloroplast elongation factors from pea, A. thaliana, and Chlamydomonas reinhardtii using the Synechocystis 6803 EF-G specific PCR product and/or gene as a heterologous probe.

I.C. Protein synthesis.

Biochemically, peptide bond formation is a simple

reaction. However, the translation of a messenger RNA

molecule into a polypeptide product is an extremely complex

cellular phenomenon which involves the coordinated

participation of over 100 different macromolecules. In

general, protein synthesis can be subdivided into four

distinct stages: amino acid activation, initiation,

elongation, and termination.

In both prokaryotes and eukaryotes, the activation of

each amino acid is carried out by amino acid specific enzymes

known as aminoacyl-tRNA synthetases. These enzymes add an

amino acid, in an ATP dependent manner, to the 3'-terminal

ribose residue of its cognate tRNA to form an aminoacyl-tRNA.

The second step of prokaryotic protein synthesis, which is diagrammed schematically in Figure 1, involves the formation of an initiation complex. The initiation complex consists of two ribosomal subunits and N-formylmethionyl-tRNAfMet (fMet-tRNAfMet) assembled on a properly aligned mRNA. The assembly of this 70S ribosome complex requires the participation of three initiation factors designated IF-1, IF-2, and IF-3. In E. coli the initiation sequence occurs as follows: Upon completion of a cycle of protein synthesis, the 30S and 50S ribosomal subunits remain associated as an inactive 70S ribosome. IF-3 binds to the 30S subunit and, in doing so, promotes the dissociation of the 70S complex. IF-1 increases the dissociation rate, possibly by mediating IF-3 binding. Dissociation is followed by the binding of GTP, mRNA, IF-2, and fMet-tRNAfMet to the liberated 30S subunit. At this stage, IF-3 assists the 30S subunit in binding the Shine-Dalgarno sequence. Lastly, IF-3 is released and GTP is hydrolyzed as the 50S subunit joins the complex. This reaction is irreversible and results in a conformational change in the 30S subunit as well as the release of IF-1 and IF-2.

Eukaryotic initiation resembles the prokaryotic process. Eukaryotic ribosomes are slightly larger than prokaryotic ribosomes, having an average sedimentation coefficient of 80S. The eukaryotic initiator tRNA is an unformylated Met-tRNAiMet. Eukaryotes have numerous initiation factors which are designated eIF-n. Over 10 initiation factors have been identified in a number of eukaryotic systems. The most striking difference between prokaryotic and eukaryotic initiation occurs in the second stage of initiation. Eukaryotes do not have an equivalent to the prokaryotic Shine-Dalgarno sequence. Instead, eIF1a, eIF1b, and a cap binding protein (CBP) recognize the capped 5' end of the message. Translation of eukaryotic mRNAs almost always begins at the first AUG codon, although exceptions are known.

Ribosomes elongate growing polypeptide chains in a three

step cycle. This process, which is illustrated in Figure 2,

can occur at a rate of up to 40 residues/s, requires the

participation of several proteins known as elongation

factors, and can be subdivided into three stages: aminoacyl-

tRNA binding, transpeptidation, and translocation.

Aminoacyl-tRNA binding is facilitated by the formation

of a complex with GTP, elongation factor Tu (EF-Tu), and an

aminoacyl-tRNA. This complex binds the ribosome and, in a reaction that hydrolyzes GTP, the aminoacyl-tRNA is bound in a codon-anticodon dependent manner to the A site of the ribosome. Since EF-Tu has a strong affinity for GDP, another

factor, elongation factor Ts (EF-Ts), is required in prokaryotes to facilitate the conversion of the GDP-bound form of EF-Tu to the GTP-bound form.

The second stage of elongation, transpeptidation, involves the formation of a peptide bond between the amino acid of the new aminoacyl-tRNA and the amino acid or peptide in the P site of the ribosome. Transpeptidation occurs via the nucleophilic displacement of the P site tRNA by the amino group of a 3'-linked aminoacyl-tRNA in the A site. This reaction is mediated by a peptidyl transferase activity which is associated with the ribosome.

During the final stage of elongation, translocation, the uncharged tRNA is expelled from the P site and the peptidyl-tRNA in the A site is moved to the P site together with its bound mRNA, thereby exposing the adjacent codon and generating a vacant A site. Translocation is mediated by elongation factor G (EF-G), which binds the ribosome together with GTP. GTP is hydrolyzed during translocation, and this hydrolysis is apparently required for the release of EF-G from the ribosome. Hydrolysis of GTP is therefore required for the subsequent round of elongation: the binding of EF-G and the binding of EF-Tu to the ribosome are mutually exclusive, so EF-G must be released before the next EF-Tu complex can bind.
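The three steps of the elongation cycle described above can be illustrated with a short sketch. The following Python example is a toy model only: the five-codon table, the function name `elongate`, and the GTP bookkeeping (one GTP per EF-Tu delivery and one per EF-G translocation) are illustrative assumptions and are not part of the work described in this thesis.

```python
# Toy model of the three-step elongation cycle (aminoacyl-tRNA binding,
# transpeptidation, translocation). The small codon table and all names
# are illustrative assumptions, not a complete genetic code.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys", "GCU": "Ala"}


def elongate(codons):
    """Return (peptide, gtp_hydrolyzed) for a list of sense codons."""
    if not codons:
        return [], 0
    # Initiation is assumed to have placed the initiator tRNA in the P site.
    peptide = [CODON_TABLE[codons[0]]]
    gtp_hydrolyzed = 0
    for codon in codons[1:]:
        # 1. Aminoacyl-tRNA binding: EF-Tu-GTP delivers the tRNA to the A site.
        a_site_residue = CODON_TABLE[codon]
        gtp_hydrolyzed += 1
        # 2. Transpeptidation: the peptide is joined to the A site residue.
        peptide.append(a_site_residue)
        # 3. Translocation: EF-G-GTP moves the peptidyl-tRNA to the P site.
        gtp_hydrolyzed += 1
    return peptide, gtp_hydrolyzed


peptide, gtp = elongate(["AUG", "UUU", "GGC"])
```

In this sketch each residue added after the initiator costs two GTP molecules, which mirrors the two GTP-dependent steps named in the text.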

The eukaryotic chain elongation cycle is similar to the cycle described for prokaryotes (Figure 3). In eukaryotes, two subunits of eEF-1, eEF-1α and eEF-1β, function similarly to EF-Tu and EF-Ts, respectively. A third eukaryotic elongation factor, eEF-1γ, has been identified, which apparently has no prokaryotic equivalent. The function of eEF-2 is analogous to EF-G.

Translation is terminated when one of the three termination codons, UAA, UAG, or UGA, is present in the A site of the ribosome. In E. coli the termination codons have no corresponding tRNAs and are recognized instead by protein release factors (Figure 2). RF-1 recognizes UAA and UAG while RF-2 recognizes UAA and UGA. A third release factor, RF-3, binds GTP and stimulates the binding of RF-1 and RF-2. The binding of a release factor to the termination codon induces peptidyl transferase to transfer the peptidyl group to a water molecule rather than an aminoacyl-tRNA. Following this transfer, the uncharged tRNA dissociates from the ribosome, GTP is hydrolyzed, and the release factors are ejected from the ribosome. The ribosome releases the bound mRNA molecule and a new round of protein synthesis ensues.
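The codon specificity of the E. coli release factors stated above can be written as a small lookup. This is a minimal sketch; the names `RELEASE_FACTOR_CODONS`, `release_factors_for`, and `first_stop` are hypothetical and are introduced only for illustration.

```python
# Stop codon specificity of the E. coli release factors, as stated in the text:
# RF-1 recognizes UAA and UAG; RF-2 recognizes UAA and UGA.
RELEASE_FACTOR_CODONS = {
    "RF-1": {"UAA", "UAG"},
    "RF-2": {"UAA", "UGA"},
}


def release_factors_for(codon):
    """Return the sorted names of the release factors that recognize `codon`."""
    return sorted(rf for rf, codons in RELEASE_FACTOR_CODONS.items() if codon in codons)


def first_stop(codons):
    """Return the index of the first termination codon, or None if there is none."""
    for index, codon in enumerate(codons):
        if release_factors_for(codon):
            return index
    return None
```

For example, `release_factors_for("UAA")` lists both factors, whereas UAG and UGA are each recognized by only one of them.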

Eukaryotic translation termination is similar to that

described for prokaryotes; however, eukaryotic termination

only requires one release factor, eRF, which binds the

ribosome together with GTP.

Eukaryotic cells contain subcellular membrane-surrounded

organelles within which the multi-step pathway of electron

transfer harnesses liberated free energy to form ATP. In

plants, the process of photosynthesis is carried out by the

chloroplasts whereas oxidative metabolism occurs in both

plant and animal mitochondria. Eukaryotic organelles are

semiautonomous in that they carry their own DNA, which codes

for a small fraction of the organellar proteins as well as

organellar rRNAs and tRNAs. However, the majority of

organellar proteins are encoded in the nucleus, synthesized

on cytoplasmic ribosomes, and directed to their respective

organelle via a targeting mechanism which is not yet fully understood. While the process of protein synthesis in both prokaryotes and eukaryotes is well known, much less is understood about protein biosynthesis in eukaryotic organelles.

In many respects, prokaryotic protein synthesis more

closely resembles organellar protein synthesis than it

resembles eukaryotic cytoplasmic protein synthesis. For

example, organellar ribosomes (50-75S) are generally smaller

than their cytoplasmic counterparts (80-90S). Organellar

initiator tRNAs appear to be formylated. Organellar

ribosomes are also sensitive to many inhibitors of bacterial

protein synthesis such as chloramphenicol, streptomycin,

spectinomycin, and erythromycin. Moreover, organellar

ribosomes are not inhibited by low levels of cycloheximide which is known to inhibit protein synthesis on eukaryotic

cytoplasmic ribosomes. Furthermore, many of the organellar protein synthesis elongation factors may be functionally exchanged for their bacterial counterparts on prokaryotic ribosomes; however, organellar elongation factors are not active on eukaryotic cytoplasmic ribosomes.

I.D. GTP-hydrolyzing proteins function as molecular switches.

Proteins which belong to the GTPase superfamily of GTP hydrolyzing proteins can be distinguished by their common structural designs as well as their shared molecular mechanisms. All of the proteins in this superfamily function as molecular switches which change their affinities for other macromolecules by either binding or hydrolyzing GTP. GTPases are generally turned on when they bind GTP and subsequently turned off upon GTP hydrolysis. GTP-binding proteins are quite versatile and function to sort and amplify transmembrane signals, regulate protein synthesis, control cellular proliferation and differentiation, and mediate the transfer of vesicular proteins through the cytoplasm (Bourne et al., 1990).

How GTPases function as molecular switches is not well understood; however, recent crystal structures of three GTP binding proteins, EF-Tu, ras, and p21, have provided great insight into this mechanism. Furthermore, using biochemical as well as molecular approaches in the study of the role of various GTPases, researchers have established that the core mechanism of the GTPase cycle remains constant whether the protein is involved in signal transduction, protein synthesis, cellular growth, or protein trafficking.

The GTPase cycle is diagrammed schematically in Figure 4. Briefly, upon binding GTP, the protein assumes an "active" state, whereas hydrolysis of GTP converts the protein to an "inactive" conformation. In many cases the intrinsic rates of GTP hydrolysis and GDP release are quite low. Two classes of regulatory proteins have been described which increase these rates. These are guanine nucleotide releasing proteins (GNRPs), which catalyze the release of bound GDP, and GTPase activating proteins (GAPs), which stimulate the rate of GTP hydrolysis. Therefore, either GAPs or GNRPs can function to regulate cellular cycles by controlling the relative amounts of "active" versus "inactive" protein. During protein synthesis, for example, EF-Ts functions as a GNRP as it catalyzes the dissociation of GDP from EF-Tu, while the ribosome functions as a GAP, stimulating the hydrolysis of GTP after binding of the EF-Tu-GTP complex to the ribosome. This is diagrammed schematically in Figure 5.
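The switch behavior described above can be sketched as a two-state object. This is a simplified illustration under stated assumptions: the class name `GTPaseSwitch` and its methods are hypothetical, nucleotide exchange is collapsed into a single step, and no rates are modeled.

```python
class GTPaseSwitch:
    """Two-state switch: 'active' with GTP bound, 'inactive' with GDP bound."""

    def __init__(self, nucleotide="GDP"):
        if nucleotide not in ("GTP", "GDP"):
            raise ValueError("nucleotide must be 'GTP' or 'GDP'")
        self.nucleotide = nucleotide

    @property
    def active(self):
        return self.nucleotide == "GTP"

    def exchange(self):
        """GNRP-type step: release bound GDP, then bind GTP (EF-Ts acting on EF-Tu)."""
        if self.nucleotide == "GDP":
            self.nucleotide = "GTP"

    def hydrolyze(self):
        """GAP-type step: hydrolyze bound GTP to GDP (the ribosome acting on EF-Tu)."""
        if self.nucleotide == "GTP":
            self.nucleotide = "GDP"


# One turn of the cycle for an EF-Tu-like switch.
ef_tu = GTPaseSwitch()
states = [ef_tu.active]
ef_tu.exchange()
states.append(ef_tu.active)
ef_tu.hydrolyze()
states.append(ef_tu.active)
```

One full turn takes the switch from inactive to active and back, which is the sense in which GNRPs and GAPs together set the balance of "active" and "inactive" protein.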

During protein synthesis, the EF-Tu/Ts GTP/GDP cycle

appears to be crucial for codon-anticodon proofreading.

Translational accuracy, or the mechanism by which a ribosome proofreads a given codon-anticodon interaction, can be

envisaged to occur either by a selective binding mechanism,

or a kinetic mechanism. A selective binding type of editing mechanism has been shown to function in proofreading the

fidelity of amino acid attachment to tRNA molecules. A

separate catalytic site on many tRNA synthetases has been shown to be the actual site of tRNA synthetase editing. If the match is incorrect, the aminoacyl adenylate is hydrolyzed and the incorrect amino acid and AMP are released from the tRNA synthetase.

It is believed that the ribosome does not monitor the fidelity of the codon-anticodon match using a selective binding mechanism, because there is no evidence indicating the existence of a second aminoacyl-tRNA binding site that functions to exclude inaccurate codon-anticodon interactions. Recent evidence indicates that the ribosome's proofreading mechanism may actually be kinetic.

A kinetic proofreading model for codon-anticodon recognition requires only one tRNA binding site. Briefly, an initial discriminatory event would occur during the binding of a specific (correct) or a nonspecific (incorrect) tRNA to the codon. If the codon-anticodon binding is specific, GTP is hydrolyzed, the codon-anticodon interaction is fixed, and the tRNA-ribosome complex can no longer dissociate. EF-Tu-GDP dissociates from the ribosome, which commits the ribosome to form a peptide bond between the C-terminal amino acid in the growing polypeptide and the amino acid positioned in the A site of the ribosome. If, however, the codon-anticodon binding is nonspecific, the tRNA-ribosome complex dissociates before GTP hydrolysis occurs. Dissociation of the tRNA-ribosome complex aborts chain elongation, thereby permitting the ribosome to reinitiate the elongation step. This proofreading model is consistent with the existence of mutants of E. coli EF-Tu which hydrolyze GTP more slowly than wild-type, and which result in hyperaccurate translation in vivo (Jacquet and Parmeggiani, 1988).

The physical basis for the kinetic model may be the rate at which GTP is hydrolyzed. For example, if hydrolysis of GTP is delayed, this delay would allow time for proofreading the codon-anticodon interaction. The hydrolysis check would allow for timing of the strength or accuracy of the interaction between the codon and anticodon. The clock is timed such that the hydrolysis of GTP is slower than the dissociation of an incorrect codon-anticodon interaction from the ribosome. Because this check monitors only the codon-anticodon interaction, an incorrectly aminoacylated tRNA will always be incorporated.
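The kinetic argument above can be made concrete with a simple competition between two first-order events, GTP hydrolysis and tRNA dissociation. The following Python sketch is an illustration only: the function names and all rate constants are hypothetical values chosen for the example, not measurements, and the model ignores every step other than these two.

```python
def incorporation_probability(k_hydrolysis, k_dissociation):
    """Probability that GTP hydrolysis occurs before the tRNA dissociates.

    Both events are treated as competing first-order processes, so the
    probability is k_hydrolysis / (k_hydrolysis + k_dissociation).
    """
    return k_hydrolysis / (k_hydrolysis + k_dissociation)


def error_ratio(k_hydrolysis, k_off_correct, k_off_incorrect):
    """Incorrect incorporations per correct incorporation at equal tRNA supply."""
    p_incorrect = incorporation_probability(k_hydrolysis, k_off_incorrect)
    p_correct = incorporation_probability(k_hydrolysis, k_off_correct)
    return p_incorrect / p_correct


# Hypothetical rate constants in arbitrary units (assumptions, not data).
K_OFF_CORRECT = 0.1      # a correct tRNA dissociates slowly
K_OFF_INCORRECT = 100.0  # an incorrect tRNA dissociates quickly

wild_type = error_ratio(1.0, K_OFF_CORRECT, K_OFF_INCORRECT)
slow_mutant = error_ratio(0.1, K_OFF_CORRECT, K_OFF_INCORRECT)
```

Under these assumed values, slowing hydrolysis lowers the error ratio, which is the direction of the effect reported for the slowly hydrolyzing EF-Tu mutants. The same change also lowers the fraction of correct tRNAs that are accepted, so in this simple model accuracy is gained at the cost of speed.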

I.E. Organellar protein synthesis elongation factors in

higher eukaryotes are usually encoded by nuclear genes.

In prokaryotes, ribosomal protein genes are located in operons which encode two to eleven ribosomal proteins (Nomura et al., 1984). In higher plants, only remnants of these operons remain in the chloroplast genome (Tanaka et al., 1986; Thomas et al., 1988; Zhou et al., 1989). Evolutionary gene transfer, from the once free-living eubacterium to the nucleus of its respective host, is a basic corollary of the widely accepted endosymbiotic theory. Complete organellar DNA sequences have been reported for tobacco (Shinozaki et al., 1986), liverwort (Ohyama et al., 1986), and rice (Hiratsuka et al., 1989) chloroplasts as well as for numerous vertebrate, insect, and fungal mitochondria (Anderson et al., 1981; Bibb et al., 1981; Wolstenholme and Clary, 1985; Brown et al., 1985; Breitenberger and RajBhandary, 1985). These organellar genomes encode all of the rRNAs and tRNAs required for organellar protein synthesis as well as a limited number of mRNAs. These organellar encoded proteins represent only a small fraction of the proteins required for the development of an intact functional organelle. For example, only one third of the sixty ribosomal proteins required by chloroplasts are encoded by chloroplast DNA (Shinozaki et al., 1986; Ohyama et al., 1986; Hiratsuka et al., 1989). Therefore, the majority of organellar proteins must be encoded by nuclear DNA. These proteins are synthesized on cytoplasmic ribosomes and are directed to their respective organelle by organellar targeting mechanisms which are not completely understood.

The locations of some of the genes encoding chloroplast

elongation factors in higher plants have not yet been

determined. It appears that in the green alga Chlorella

vulgaris, chlEF-G is encoded in cpDNA; likewise, spinach

chloroplast EF-Tu, EF-Ts, and EF-G were originally reported to be synthesized in the chloroplast (Ciferri and Tiboni,

1976; Ciferri et al., 1979). (It is likely that the results

in spinach were an experimental artifact, as all other higher plant cpDNAs analyzed to date do not contain any elongation

factor genes.) In support of these findings, the Euglena gracilis chlEF-Tu gene has been cloned from E. gracilis cpDNA

(Montandon and Stutz, 1983) and homology to the E. coli EF-Tu gene has been detected in Chlamydomonas reinhardtii cpDNA

(Watson and Surzycki, 1982). However, it has been shown that the E. gracilis chlEF-G gene is located in the nucleus

(Breitenberger et al., 1979); likewise, the chlEF-G gene from

the unicellular green alga Cryptomonas appears to be missing

from the str operon in the plastid (Douglas, 1991).

Furthermore, the str operon has been characterized in

cyanelles (which are photosynthetic organelles distinct from

chloroplasts) of the flagellated protist Cyanophora paradoxa

(Kraus et al., 1990). In C. paradoxa the organellar EF-G gene is not located in the cyanelle str operon. More recently, the nuclear encoded Arabidopsis thaliana chlEF-Tu gene has been cloned and sequenced (Baldauf and Palmer,

1990).

Genes encoding mitochondrial elongation factors appear to be located in the nucleus. For example, the genes which encode the mitochondrial elongation factors in Neurospora crassa were initially believed to be located in the nucleus

(Barath and Kuntzel, 1972). Subsequent mtDNA sequence analyses have indicated that these genes are not found on mtDNA (Anderson et al., 1981; Bibb et al., 1981; Wolstenholme and Clary, 1985; Brown et al., 1985; Breitenberger and

RajBhandary, 1985). Furthermore, the nuclear encoded S. cerevisiae mtEF-Tu gene (Nagata et al., 1983) and two mtEF-G genes (Vambutas et al., 1991; Welcsh et al., 1992) have been cloned and sequenced.

These results suggest that in land plants the genes encoding chlEF-Tu and chlEF-G as well as the genes encoding mtEF-Tu and mtEF-G have been transferred from the organellar

genome to the nuclear genome. However, in green algae it

appears that while the chlEF-G gene has been transferred to

the nucleus, the chlEF-Tu gene remains in plastid DNA.

Furthermore, it appears that in the fungal species examined,

mitochondrial elongation factors are nuclear encoded.

I.F. Targeting of nuclear-encoded proteins to cellular

organelles.

Both chloroplasts and mitochondria synthesize a portion

of their own proteins; however, the majority of organellar proteins are nuclear-encoded and are imported after they are

synthesized on cytoplasmic ribosomes. Some cytoplasmically

synthesized organellar proteins may be destined for the outer membrane of the organelle whereas others may have to pass through as many as three organellar membranes before they reach their final destination. Since organellar biogenesis requires the coordinated expression of proteins which are synthesized in different cellular locations, nuclear-encoded organellar proteins must be directed to their respective organellar location by a complex protein trafficking mechanism which is not yet completely understood.

Protein import into both chloroplasts and mitochondria occurs post-translationally and, with a limited number of exceptions, requires energy in the form of ATP as well as an organellar membrane potential. Most imported mitochondrial

proteins and all chloroplast imported proteins studied to

date are initially synthesized as precursor proteins which

contain N-terminal amino acid extensions. These N-terminal

amino acid extensions, termed transit peptides, are cleaved

from the precursor proteins by peptidases either during or

immediately after import into the organelle; however, cleavage of the transit peptide is not a prerequisite for organellar import. Transit peptides, like the signal

sequences which direct secretory proteins to the endoplasmic reticulum, are believed to contain targeting information.

Transit peptides have been shown to be both necessary and sufficient for directing their respective passenger proteins to their proper suborganellar location. For example, the transit peptide of yeast cytochrome c oxidase subunit IV (Cox IV) can direct mouse dihydrofolate reductase

(DHFR), which is naturally a cytosolic protein, into the mitochondrial matrix in vitro (Hurt et al., 1984). Similar results have also been reported for transit peptides from human ornithine transcarbamylase (Horwich et al., 1986), yeast δ-aminolevulinate synthase (Keng et al., 1986), and yeast alcohol dehydrogenase III (van Loon et al., 1986), as well as for the chloroplast transit peptide from the small subunit of pea ribulose 1,5-bisphosphate carboxylase (Rubisco) (van den

Broeck et al., 1985).

A number of nuclear encoded organellar proteins have been identified which are not synthesized with transit

peptide extensions. Targeting information for these proteins

has been localized to their N-terminal regions. For example,

the N-terminal 41 amino acids of the yeast 70 kd outer

membrane protein (Hase et al., 1984) and the N-terminal 115

amino acids of the yeast inner membrane ATP/ADP translocator

protein (Adrian et al., 1986) contain the targeting

information for these proteins.

Chloroplast and mitochondrial transit peptides have no pronounced amino acid sequence homology. Therefore,

organellar import signals do not consist of "consensus"

sequences that are specifically recognized by either the mitochondrial or chloroplast import machinery. However, both

chloroplast and mitochondrial transit peptides are rich in basic and hydroxylated residues, lack uninterrupted stretches

of hydrophobic residues, and contain few, if any, acidic residues. Therefore, it appears that aspects of secondary or tertiary structure of transit peptides contain the organellar targeting information that is recognized by the organellar

import machinery.
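The compositional features listed above can be tallied for any candidate sequence. The following is a minimal sketch, not a predictor of import: the residue classes (Lys/Arg as basic, Ser/Thr as hydroxylated, Asp/Glu as acidic, and the chosen hydrophobic set) are assumptions made for illustration, and the example peptide is hypothetical rather than a sequence taken from this work.

```python
# Residue classes are illustrative assumptions, not definitions from the text.
BASIC = set("KR")
HYDROXYLATED = set("ST")
ACIDIC = set("DE")
HYDROPHOBIC = set("AVLIMFW")


def transit_peptide_composition(peptide: str) -> dict:
    """Count the residue classes discussed in the text for one peptide."""
    seq = peptide.upper()
    longest_run = 0
    current_run = 0
    for aa in seq:
        # Track the longest uninterrupted stretch of hydrophobic residues.
        current_run = current_run + 1 if aa in HYDROPHOBIC else 0
        longest_run = max(longest_run, current_run)
    return {
        "length": len(seq),
        "basic": sum(aa in BASIC for aa in seq),
        "hydroxylated": sum(aa in HYDROXYLATED for aa in seq),
        "acidic": sum(aa in ACIDIC for aa in seq),
        "longest_hydrophobic_run": longest_run,
    }


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the function.
    print(transit_peptide_composition("MLSRTSRKSLRTSAPRSG"))
```

A sequence with many basic and hydroxylated residues, few acidic residues, and only short hydrophobic runs would be consistent with the description above, although composition alone does not establish targeting function.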

The similarity between chloroplast and mitochondrial transit peptides has been documented experimentally. It has been shown that chloroplast and mitochondrial transit peptides can be functionally interchanged both in vivo and in vitro. For example, the Chlamydomonas reinhardtii ribulose bisphosphate carboxylase small subunit transit peptide, naturally a chloroplast transit peptide, can direct a

passenger protein into yeast mitochondria both in vitro and

in vivo (Hurt et al., 1986). While Hurt et al. were able to

show that mitochondrial transit peptides were more efficient

than chloroplast transit peptides in targeting passenger

proteins to yeast mitochondria, these experiments illustrated

that a more discriminating organellar import mechanism must

exist in plants, where similar proteins must be specifically

directed to either chloroplasts or mitochondria.

Once a protein is correctly targeted to an organelle,

how is it localized to its correct suborganellar location?

One possible mechanism predicts that transit peptides contain

multiple targeting signals which direct a protein not only to

a given organelle, but also to a given suborganellar

location. Studies on mitochondrial intermembrane space proteins, such as cytochrome b, and b2, have revealed that

these proteins are processed in two steps. Initially the

transit peptides of these intermembrane space proteins are

cleaved by a matrix-localized peptidase, indicating that at

least the transit peptides of these proteins are imported

into the mitochondrial matrix (Gasser et al., 1982). The

second peptidase, which removes the remainder of the transit peptide, is believed to be located outside of the mitochondrial matrix although its exact location has not yet been determined.

The presence of anchoring sequences within transit peptides has also been documented. The transit peptides of two other intermembrane space proteins, cytochrome c1 and cytochrome c peroxidase, contain C-terminal stretches of uncharged amino acids. These may act as anchoring signals preventing complete transfer of the protein across the inner membrane. Gene fusion experiments have shown that the first 32 amino acids of the yeast cytochrome c1 transit peptide can direct the mouse DHFR protein into the mitochondrial matrix (van Loon et al., 1986). However, when mouse DHFR was fused to the complete yeast cytochrome c1 transit peptide, which contains a stretch of 19 uncharged C-terminal amino acids, the fusion protein was localized to the intermembrane space. Therefore, it appears that membrane anchoring sectors in transit peptides play a crucial role in preventing intermembrane space proteins from localizing to the mitochondrial matrix.

A number of nuclear encoded mitochondrial proteins have been identified which lack N-terminal targeting sequences.

For example, the intermembrane space protein, cytochrome c, is synthesized without a transit peptide. Binding of cytochrome c to mitochondria is believed to be mediated by cytochrome c specific receptors. Furthermore, many mitochondrial outer membrane proteins are synthesized without transit peptides and are inserted into the outer membrane in an energy independent fashion.

Studies on the small subunit of Rubisco, chlorophyll a/b protein, and ferredoxin have revealed that chloroplast

transit peptides appear to consist of three blocks of

conserved amino acid regions which are interrupted by two

blocks of nonconserved amino acid sequence (Karlin-Neumann

and Tobin, 1986; Schmidt and Mishkind, 1986). Conserved block I contains several serine and threonine residues and is believed to function in keeping the transit peptide on the

surface of the protein during protein synthesis. Conserved block II contains a Gly-Leu-Lys motif and is believed to be the site of the initial cleavage event. Block II is also postulated to mediate receptor binding. Conserved block III has a net positive charge, contains a conserved cysteine at a potential cleavage site, and contains a conserved Gly-Gly-Arg-Val motif. Deletion studies on these regions indicate that conserved blocks I and III are essential for import whereas block II can be deleted without drastically hindering import (Reiss et al., 1987). It was also determined that cleavage in block II is not an obligatory step in organellar import.

The current model of organellar import and suborganellar targeting suggests that transit peptides contain subdomains which act in a hierarchical manner. This model predicts that most transit peptides would contain a matrix or stromal signal and that proteins destined for the matrix or stroma would contain only this signal. Proteins destined for other organellar regions would contain additional sequences which may act as anchors, thereby preventing matrix localization (yeast 70 kd outer membrane protein) or cause incomplete transfer into the matrix (yeast cytochrome c1). Furthermore, transit peptides may contain a second targeting signal. For example, plastocyanin, which is located in the lumen of chloroplasts, appears to contain a bifunctional transit peptide. The first region of the plastocyanin transit peptide acts as a stromal targeting signal and is removed by a stromal protease. Removal of the stromal targeting signal reveals a second signal which serves to direct the protein into the thylakoid lumen (Smeekens et al., 1986).

Very little is known about the mechanism of protein translocation across a membrane. However, it is thought that precursor proteins bind to the outer membrane of an organelle via a receptor mediated interaction. These receptors may be ligand specific or they may recognize "common" proteins.

Furthermore, these receptors may be bound to the outer membrane or they may be soluble, cytoplasmic components.

Binding of the precursor protein to the receptor as well as insertion into the outer membrane do not appear to require energy. The energy-requiring step appears to be translocation across the inner membrane. It has been shown that in N. crassa mitochondria, the energy-dependent step is the penetration by the transit peptide into the matrix

(Schleyer and Neupert, 1985). Clearly, many interesting questions remain to be answered about the cellular machinery and mechanisms involved in organellar

protein targeting.

I.G. Regulation of organellar protein biosynthesis.

Many of the well characterized adaptive and morphological changes that occur during plant development occur in response to light. For example, pea seedlings grown

in the dark do not synthesize chlorophyll. However,

following a brief period of illumination, chlorophyll synthesis and concomitant chloroplast development, marked by the rapid synthesis of chloroplast proteins, are readily detectable. Some of the proteins whose synthesis is light-induced are encoded in the nucleus, while others are encoded by chloroplast DNA.

The small subunit of Rubisco and the chlorophyll a/b binding protein are both examples of nuclear-encoded, light-inducible proteins. The study of the genes which encode these proteins has provided some insight into the mechanism by which light triggers gene expression in plants. 5' upstream regions have been identified in both the Rubisco and chlorophyll a/b binding protein genes which are involved in regulating the expression of these genes in response to light (Lamppa et al., 1985; Kuhlemeier et al., 1989). These 5' upstream regions, or light responsive elements, have been shown to confer phytochrome activation as well as tissue specific expression on reporter genes in transgenic plants.

A number of studies have suggested that there is light-dependent regulation of the capacity of chloroplasts to

synthesize proteins encoded by chloroplast DNA. It has been

shown that etioplasts isolated from dark-grown pea seedlings

are able to synthesize proteins when supplied with ATP (Reger

et al., 1972; Siddell and Ellis, 1975). These proteins

appear to be similar to those synthesized in chloroplasts

after 48 hours of exposure to light. However, during those

48 hours of illumination, an overall two-fold increase in

chloroplast protein-synthesizing capacity was observed.

Similar results have been obtained in studies using

etioplasts and chloroplasts from Phaseolus vulgaris (Drumm

and Margulies, 1971). This light-induced increase in the

capacity of chloroplasts to synthesize proteins is responsible for the regulation of major components of chloroplasts such as the large subunit of Rubisco (Tobin and

Silverthorne, 1985).

Increases in the protein synthesizing capacity of chloroplasts from other photosynthetic organisms have also been documented. For example, it has been shown that the protein synthesizing capacity of light-grown E. gracilis chloroplasts was an order of magnitude higher than that observed for etioplasts isolated from dark-grown E. gracilis

(Reger et al., 1972; Miller et al., 1983). Furthermore, the capacity of chloroplasts to synthesize proteins may also be developmentally regulated. A temporary increase in the

capacity of chloroplasts for protein synthesis has been

observed during germination of cucumber seeds in the light

(Walden and Leaver, 1981). However, light-induction of the

capacity of chloroplasts to synthesize proteins may not be universal. For example, no differences in protein synthetic capacity were observed for wheat etioplasts and chloroplasts

(Reger et al., 1972).

One could propose four possible mechanisms whereby the capacity of chloroplasts to synthesize proteins is light-regulated. The nuclear encoded components of the chloroplast protein synthesizing machinery could be light-regulated at the level of transcription. This type of induction may be mediated by phytochromes and may be homologous to the phytochrome mediated induction observed for the small subunit of Rubisco (Silverthorne and Tobin, 1984). It is also possible that synthesis of these nuclear encoded chloroplast proteins is light-regulated at the level of translation. The genes for these proteins may be transcribed in the dark, but the mRNAs may only be translated efficiently in the light.

These proteins may also be regulated post-translationally.

For example, in the dark these proteins may be inactive or rapidly degraded. Activation or stabilization of these proteins would occur in response to light. A thioredoxin activation system has been documented for a number of chloroplast enzymes (Buchanan, 1980). Lastly, an inhibitor of chloroplast protein synthesis may be degraded, or an

activator produced, in response to light. This mechanism

seems least likely due to the findings of researchers who were able to demonstrate that no inhibitor of chloroplast

protein synthesis was present in etioplasts; furthermore,

they were unable to detect protein synthesis activator

activity in chloroplasts (Drumm and Margulies, 1971).

While little evidence exists in support of any mechanism

in particular, it has been shown that in E. gracilis many components of the chloroplast protein synthesizing machinery are induced by light. These components include chlEF-G

(Breitenberger et al., 1979; Breitenberger and Spremulli,

1980), aminoacyl-tRNA synthetases (Hecker et al., 1974), chlEF-Tu (Spremulli, 1982), chlEF-Ts (Fox et al., 1980), and chlIF-2 (Gold and Spremulli, 1985). It has also been shown that the activity of pea chlEF-G is light regulated (Akkaya and Breitenberger, 1992).

If chloroplast protein synthesis is required for the development of a fully functional organelle, then induction of chloroplast protein synthesis factors should precede the induction of other light-induced chloroplast proteins such as the small subunit of Rubisco. As predicted, light-induction of pea chlEF-G activity precedes maximal synthesis of the small subunit of Rubisco (Akkaya and Breitenberger, 1992).

The regulation of protein biosynthesis in plant mitochondria is not as well understood as the regulation of protein biosynthesis in chloroplasts; however, it is apparent

that plant mitochondrial protein synthesis is indeed

regulated. Mitochondrial translation has been shown to

increase ten-fold during early germination of Vicia faba

(Dixon et al., 1980). Furthermore, mitochondrial protein

synthesis may be repressed in response to increased

chloroplast protein synthesis. For example, it has been

shown that E. gracilis mtEF-G activity decreases when dark-grown cells are exposed to light (Eberly and Spremulli,

1986).

In contrast to the situation in plant mitochondria, significant progress has been

made in understanding the regulation of mitochondrial protein

biogenesis in Saccharomyces cerevisiae. As in plants, yeast mitochondrial biogenesis is believed to be regulated via the

coordinated expression of both mitochondrial and nuclear genes. Furthermore, it has been shown that transcription of yeast mitochondrial genes is mainly regulated at the level of

initiation (Biswas, 1990).

One can conclude from the discussion presented here that the study of light regulated, nuclear encoded organellar elongation factors may reveal how cells are able to coordinate chloroplast and nuclear gene expression and thereby regulate chloroplast biogenesis.

CHAPTER II

MATERIALS AND METHODS

II.A. Strains

II.A.1. Strains of Escherichia coli and their growth media.

DH5α and HB101 were the host strains used for this research. Rich media (2YT) contained 1.6% bactotryptone, 1% yeast extract and 0.085 M NaCl. Supplements which were added included 50 mg/l of ampicillin, 40 mg/l X-gal, and 70 mg/l IPTG.
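The 2YT composition given above can be converted to weighed amounts for a chosen batch volume. The sketch below assumes that the percentages are weight per volume and uses a molar mass of 58.44 g/mol for NaCl; the function names are illustrative only.

```python
NACL_G_PER_MOL = 58.44  # molar mass of NaCl


def percent_wv_to_grams(percent: float, volume_l: float) -> float:
    """Grams needed for a percent (w/v) solution: 1% (w/v) = 10 g per litre."""
    return percent * 10.0 * volume_l


def molar_to_grams(molarity: float, g_per_mol: float, volume_l: float) -> float:
    """Grams needed for a given molarity and volume."""
    return molarity * g_per_mol * volume_l


def two_yt_recipe(volume_l: float = 1.0) -> dict:
    """Weighed amounts for 2YT as described in the text (assumes w/v percentages)."""
    return {
        "bactotryptone_g": percent_wv_to_grams(1.6, volume_l),
        "yeast_extract_g": percent_wv_to_grams(1.0, volume_l),
        "NaCl_g": molar_to_grams(0.085, NACL_G_PER_MOL, volume_l),
    }


if __name__ == "__main__":
    for component, grams in two_yt_recipe(1.0).items():
        print(f"{component}: {grams:.2f}")
```

Under these assumptions, 0.085 M NaCl corresponds to approximately 5 g per litre.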

II.A.2. Strains of yeast and their growth media.

The yeast strains used in this dissertation are listed

in Table 1. Non-selective growth media contained 1% yeast extract, 2% peptone, and 2% glucose (YPD). Selective growth media contained 0.17% yeast nitrogen base without amino acids or ammonium sulfate (Difco), 0.038 M (NH4)2SO4, and 2% dextrose. Yeast glycerol media contained 1% yeast extract,

2% peptone, 2% ethanol, and 3% glycerol (YEPG). Solid media

also contained 2% agar. Amino acids and other auxotrophic

supplements were added as described (Ausubel et al., 1989).

II.B. Transformation of yeast.

Yeast strains were transformed with either 1-10 µg of supercoiled plasmid DNA or 1-10 µg of linear DNA as

previously described (Ausubel et al., 1989).

II.C. Recombinant plasmids.

Table 2 lists the plasmid vectors used in this dissertation.

II.D. DNA Restriction site analysis.

Restriction enzyme digestions were performed as recommended by the enzyme manufacturer (Bethesda Research

Laboratories). Gel electrophoresis was conducted as described previously (Maniatis et al., 1982).

II.E. Preparation of high molecular weight nuclear DNA.

High molecular weight pea nuclear DNA was prepared as described previously (Watson and Thompson, 1986). High

molecular weight Saccharomyces cerevisiae genomic DNA was

prepared as described previously (Ausubel et al., 1989).

High molecular weight Arabidopsis thaliana genomic DNA was

generously provided by Dr. Randy Scholl. High molecular

weight Synechocystis 6803 genomic DNA was given to us by Dr.

Richard Sayre.

II.F. Design of EF-G specific oligonucleotides.

N-terminal amino acid sequences derived by DNA

sequencing for translocases from Escherichia coli,

Micrococcus luteus, Methanococcus vannielii, and hamster

(Zengel et al. 1984, Ohama et al. 1987, Lechner et al. 1988,

Kohno et al. 1986) are aligned in Figure 6. EF-G specific

oligonucleotides were generated which correspond to conserved

regions in the amino acid sequences of these translocases.

The conserved amino acid regions from which the

oligonucleotides were derived as well as the DNA sequence of the oligonucleotides are shown in Figure 6. The oligonucleotides were synthesized by Jane Tolley using the

Applied Biosystems Synthesizer in the

Biochemical Instrumentation Center at Ohio State.
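When oligonucleotides are derived from conserved amino acid regions, the number of distinct DNA sequences that could encode a region depends on the codon redundancy of each residue. The sketch below counts that degeneracy using the standard genetic code. The peptide in the example is hypothetical and is not one of the regions shown in Figure 6; the actual oligonucleotide design used in this work is given in that figure.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODONS_PER_AA = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (one-letter code)."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Hypothetical seven-residue region, used only as an illustration.
    print(degeneracy("AHIDAGK"))
```

Regions rich in Met and Trp, and poor in Leu, Ser, and Arg, give the lowest degeneracy and are therefore generally the more attractive targets for primer design.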

II.G. Generation and analysis of EF-G specific PCR products.

EF-G specific PCR products were generated using either

pea, A. thaliana, S. cerevisiae, or Synechocystis 6803 genomic DNA as a template as illustrated schematically in Figure 7. A given PCR reaction consisted of 0.3-2 µg of genomic DNA, 0.1 µg of each primer, 10 mM Tris, pH 8.3, 200 mM dATP, 200 mM dCTP, 200 mM dGTP, 200 mM dTTP, 3 mM MgCl2, 1 µl Perfect Match™ Enhancer (Stratagene), and 0.015 U/µl Taq DNA polymerase (Bethesda Research Laboratories) in 50 or 100 µl total volume. Amplification was performed using a Perkin

Elmer Cetus thermocycler for 30 cycles. Each cycle consisted

of 30 seconds at 94°C, 1 minute at 50°C, and 1 minute at 72°C.
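The cycling program and enzyme concentration described above translate directly into a total hold time and a number of polymerase units per reaction. The following sketch performs that arithmetic; it ignores ramp times between temperatures, which depend on the instrument and are not stated here, and the function names are illustrative.

```python
# (step name, temperature in degrees C, hold time in seconds), as described in the text
CYCLE_STEPS = [
    ("denature", 94, 30),
    ("anneal", 50, 60),
    ("extend", 72, 60),
]
CYCLES = 30
TAQ_UNITS_PER_UL = 0.015


def total_hold_seconds(cycles: int = CYCLES, steps=CYCLE_STEPS) -> int:
    """Total programmed hold time, excluding temperature ramps."""
    return cycles * sum(seconds for _, _, seconds in steps)


def taq_units(reaction_volume_ul: float) -> float:
    """Units of Taq polymerase in a reaction of the given volume."""
    return TAQ_UNITS_PER_UL * reaction_volume_ul


if __name__ == "__main__":
    print(total_hold_seconds() / 60, "minutes of programmed holds")
    for volume in (50, 100):
        print(volume, "µl reaction:", taq_units(volume), "U Taq")
```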

PCR generated products were electrophoresed on 6%

polyacrylamide gels (1 X TBE: 0.08 M Tris, pH 8.3, 0.08 M

borate, and 0.002 M EDTA). The gels were stained for 10

minutes in ethidium bromide (250 ng/ml) and visualized by

ultra-violet fluorescence. The PCR products were transferred

to nylon filters (Hybond-N, Amersham) as described previously

(Nguyen, 1989) and probed with a third EF-G specific

oligonucleotide, G1 (Figure 6). The filters were prehybridized in 6 X SSC (6 X SSC: 0.9 M sodium chloride, 0.09 M sodium citrate, pH 7.0), 5 X Denhardt's solution (5 X Denhardt's: 0.1% (w/v) Ficoll 400, 0.1% (w/v) polyvinylpyrrolidone, 0.1% (w/v) BSA), 0.05% sodium pyrophosphate, 100 µg/ml denatured herring testes DNA, and 0.5% SDS for at least one hour at room temperature. Hybridizations were performed at room temperature for 12-16 hours in 6 X SSC, 1 X Denhardt's solution, 100 µg/ml denatured herring testes DNA, 0.05% sodium pyrophosphate, and 5 X 10^6 cpm/ml of a 5' end labeled oligonucleotide probe. Oligonucleotides were 5' end labeled with [γ-32P-ATP] (3000 Ci/mmol, Amersham) using T4 Polynucleotide Kinase (Bethesda Research Laboratories). Unincorporated nucleotides were removed by passing the probe over a Sephadex G-25 spun column (Maniatis et al., 1982). The filters were washed 1 X 15 minutes in 6 X SSC and 0.05% sodium pyrophosphate at room temperature and 1 X 15 minutes in 6 X SSC and 0.05% sodium pyrophosphate at 37°C. Following washing, the filters were exposed to Kodak XAR film with intensifying screen at -70°C.
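The working strengths of SSC used for prehybridization, hybridization, and washing are obtained by diluting a concentrated stock. The sketch below applies the usual C1V1 = C2V2 relation and rescales the 6 X SSC composition defined above to other strengths. The use of a 20 X stock is an assumption made for illustration; the stock strength used for SSC is not stated in this section.

```python
# Composition of 6 X SSC as defined in the text.
SSC_6X = {"NaCl_M": 0.9, "sodium_citrate_M": 0.09}


def stock_volume_ml(final_x: float, final_volume_ml: float, stock_x: float = 20.0) -> float:
    """Volume of stock needed so that final_volume_ml is at final_x strength.

    stock_x defaults to 20, an assumed stock strength.
    """
    if final_x > stock_x:
        raise ValueError("final strength cannot exceed stock strength")
    return final_x * final_volume_ml / stock_x


def ssc_molarities(target_x: float) -> dict:
    """Component molarities at target_x strength, scaled from the 6 X definition."""
    return {name: molar * target_x / 6.0 for name, molar in SSC_6X.items()}


if __name__ == "__main__":
    print(stock_volume_ml(6, 100), "ml of stock per 100 ml of 6 X SSC")
    print(ssc_molarities(1))
```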

II.H. Cloning of PCR generated EF-G gene fragments.

The primers used to generate EF-G specific gene fragments were designed to facilitate cloning. Each primer was synthesized so that it contained a unique restriction site at its 5' terminus (Figure 6). EF-G specific PCR products, which were identified by hybridization, were excised from 6% polyacrylamide gels and eluted overnight at

37°C in 0.5 M ammonium acetate, pH 8.0. Following elution, the DNA was precipitated with 2 volumes of ethanol. The purified PCR products were cleaved with BamHI (Bethesda Research Laboratories) and EcoRI (Bethesda Research

Laboratories) and cloned into the vector pBS+

(Stratagene) using conventional cloning techniques (Ausubel

et al., 1989).

II.I. DNA sequencing.

The Sequenase Version 2.0 DNA sequencing kit (United

States Biochemical) was used for all DNA sequencing reactions. All sequencing reactions were performed on double

stranded DNA prepared as previously described (Kraft et al.,

1988). The reactions were performed essentially as suggested by the manufacturer. The products of all sequencing reactions were electrophoresed on 8 M urea, 8% polyacrylamide gels. To sequence the cloned PCR products, the M13 universal primer and the pUC/M13 reverse primer were used. To sequence the Synechocystis 6803 EF-G gene, the 1000, 550, 200, and 75 Nco I fragments of C3 (Figure 8) were subcloned into pUCBM21 (Boehringer Mannheim) and sequenced in both directions using the M13 universal primer and the pUC/M13 reverse primer. Two deletion subclones of the Synechocystis 6803 EF-G gene were constructed by Yangsheng Zhang in our laboratory (Figure 8). These clones were also sequenced using the M13 universal primer and the pUC/M13 reverse primer. Oligonucleotide primers, which were derived from the DNA sequence of the subclones, were synthesized and used to sequence the remainder of the gene (Figure 8).

To sequence the S. cerevisiae mtEF-G gene, oligonucleotides were synthesized and used as primers to sequence portions of the gene as diagrammed in Figure 9. The Erase-a-Base System Kit (Promega Corporation) was used as described by the manufacturer to generate a series of Exonuclease III deletions of the 3.8 kb C-terminal Hind III fragment of the S. cerevisiae mtEF-G gene (Figure 9). These deletions, which were generated by Douglas R. Johnson in our laboratory, were sequenced in both directions using the M13 universal primer and the pUC/M13 reverse primer. Finally, small fragments of the gene were subcloned into the vector pBS+ (Stratagene) and sequenced using both the M13 forward primer and the pUC/M13 reverse primer (Figure 9).

II.J. Genomic Southern Analysis.

3-50 µg of genomic DNA were digested to completion with various restriction enzymes, separated by gel electrophoresis, and transferred by the method of Southern

(Southern, 1975) to nylon membranes (Hybond-N, Amersham).

The membranes were prehybridized in 6 X SSPE (20 X SSPE: 3.6 M NaCl, 200 mM NaH2PO4, 20 mM EDTA, pH 7.4), 10 X Denhardt's solution, 1% SDS, and 50 µg/ml denatured calf thymus DNA for 1-3 hours at 40-60°C. Following prehybridization, the membranes were hybridized in 6 X SSPE, 1% SDS, 50 µg/ml denatured calf thymus DNA, and 5 X 10^5 - 1 X 10^6 cpm/ml of [α-32P-dCTP]-labeled DNA for 12-16 hours at 40-60°C. The membranes were washed 2 X 15 minutes in 6 X SSPE and 0.1% SDS at room temperature, 2 X 15 minutes in 3 X SSPE and 0.1% SDS at 42°C and 1 X 20 minutes in 0.1 X SSPE and 0.1% SDS at 60-65°C. All membranes were exposed to either Kodak or Fuji X-ray film with intensifying screen at -70°C.

Genomic Southern analysis was sometimes performed in dried agarose gels using a procedure suggested to us by

Elizabeth Oakley, Ohio State University. Genomic DNA was displayed on 1% agarose gels (1 X TAE: 0.04 M Tris, 0.002 M

EDTA, pH 8.0). Following electrophoresis, the DNA was denatured by soaking the gels in 50 mM NaOH for 45 minutes at room temperature. The gels were then rinsed twice in deionized water and neutralized by soaking in deionized water for 30 minutes at room temperature. The gels were then dried on 3MM Whatman filter paper (VWR Scientific) for 45 minutes on a conventional vacuum gel drier. The drying cycle consisted of 15 minutes without heat, 15 minutes at 80°C, followed by 15 minutes without heat. The dried gels were prehybridized for 1 hour at 65°C in 6 X SSC and 0.25% non-fat dry milk. Following prehybridization, 5 X 10^5 - 1 X 10^6 cpm/ml of [α-32P-dCTP]-labeled DNA was added. Hybridizations were performed for 12-16 hours at 65°C. Following hybridization, the gels were washed 2 X 20 minutes in 2 X SSC and 0.1% SDS at 65°C, and 2 X 20 minutes in 0.2 X SSC and 0.1%

SDS at 65°C. The gels were then wrapped in Saran wrap and

exposed to either Kodak or Fuji X-ray film with intensifying

screen at -70°C.

All DNA probes were generated by the Random Primers DNA

Labeling System (Bethesda Research Laboratories) using [α-32P-dCTP] (3000 Ci/mmol, Amersham).

II.K. Northern Analysis.

Total pea RNA was isolated from dark grown and light grown pea seedlings as previously described (Chomczynski and

Sacchi, 1987) by Jeonghae Park. The RNA was electrophoresed on 1.2% (w/v) formaldehyde agarose gels and transferred to nylon membranes (Hybond-N, Amersham) as previously described

(Maniatis et al., 1982). Northern blots were prehybridized

in 50% formamide, 5 X Denhardt's solution, 0.1% SDS, 5 X SSPE and 200 µg/ml denatured calf thymus DNA at 42°C for 1-3 hours. Following prehybridization, 5 X 10^5 - 1 X 10^6 cpm/ml of [α-32P-dCTP]-labeled DNA were added and the filters were incubated overnight at 42°C. The filters were washed 2 X 15 minutes in 6 X SSPE and 0.1% SDS at room temperature and 1 X 15 minutes in 1 X SSPE and 0.1% SDS at 37°C. The filters were exposed to either Kodak or Fuji X-ray film with intensifying screen at -70°C.

All DNA probes were generated by the Random Primers DNA Labeling System (Bethesda Research Laboratories) using [α-32P-dCTP] (3000 Ci/mmol, Amersham).

II.L. Isolation of the Synechocystis 6803 EF-G gene.

A Synechocystis 6803 genomic library (λZAP II Custom

Genomic Library, Stratagene) was kindly provided by Dr. Lee

McIntosh (Michigan State University, East Lansing, Michigan).

The library was screened as described previously (Ausubel et al., 1989) using the cloned Synechocystis 6803 PCR product as a probe. Following plaque purification, the pBluescript SK- plasmid was excised from the λZAP II vector as directed by the manufacturer. Three unique clones were isolated and designated C3, C11, and C13.

II.M. Isolation of the Saccharomyces cerevisiae mitochondrial EF-G gene.

A S. cerevisiae genomic library (Rose et al., 1987), constructed in the yeast shuttle vector YCp50, was kindly provided by John V. Moran (The University of Texas

Southwestern Medical Center, Dallas, Texas). The library was transformed into Library Efficiency™ HB101 Competent Cells

(Bethesda Research Laboratories) as directed by the manufacturer. Plasmid DNA was isolated from pools of transformants (Maniatis et al., 1982), digested with BamHI (Bethesda Research Laboratories), and electrophoresed on 1% agarose gels. The gels were dried, as previously described

in this section, and probed with the cloned S. cerevisiae PCR product. Positive pools were identified and the procedure was repeated until unique clones were isolated. Two clones were isolated and designated Y31 and Y33.
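The pooled screening strategy described above (testing pools of transformants, then repeating the procedure on positive pools until unique clones are obtained) can be sketched as a recursive subdivision. This is an illustrative model only: the pool sizes, the halving rule, and the `is_positive_pool` callback, which stands in for the hybridization result, are assumptions and do not describe the actual pool sizes used.

```python
def sib_select(clones, is_positive_pool, pool_size):
    """Return the clones that test positive on their own.

    clones: list of clone identifiers.
    is_positive_pool: function taking a list of clones and returning True if
        the pool gives a signal (a stand-in for probing pooled plasmid DNA).
    pool_size: number of clones per pool in this round (assumed value).
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    pools = [clones[i:i + pool_size] for i in range(0, len(clones), pool_size)]
    hits = []
    for pool in pools:
        if not is_positive_pool(pool):
            continue  # negative pools are discarded
        if len(pool) == 1:
            hits.append(pool[0])  # a unique positive clone
        else:
            # Repeat the procedure on the positive pool with smaller sub-pools.
            hits.extend(sib_select(pool, is_positive_pool, max(1, len(pool) // 2)))
    return hits


if __name__ == "__main__":
    # Hypothetical library of 100 transformants with two positive clones.
    positives = {7, 58}
    found = sib_select(list(range(100)), lambda pool: any(c in positives for c in pool), 10)
    print(found)
```

Because only positive pools are subdivided, the number of hybridizations grows with the number of positive clones rather than with the size of the library.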

II.N. Attempted Isolation of the pea, Chlamydomonas reinhardtii, and Arabidopsis thaliana chloroplast EF-G genes.

A λgt11 pea cDNA library (Clontech) was screened as described previously (Ausubel et al., 1989) using the cloned pea PCR product as a probe. However, no positive clones were isolated. Two A. thaliana libraries (a λZAP II cDNA library, kindly provided by Dr. Randy Scholl, and a λFIX genomic library, kindly provided by Dr. Keith Davis) were screened as described previously (Ausubel et al., 1989) using the A. thaliana PCR product as a probe. Again, no positive clones were identified. Finally, a λgt11 C. reinhardtii genomic library, which was kindly provided by Dr. Alice M. Curry, Yale University, New Haven, Connecticut, was screened using the Synechocystis 6803 EF-G specific PCR product as a probe as described previously (Ausubel et al., 1989). Again, no positive clones were identified.

II.O. Construction of the disrupted efg1::URA3 allele.

The restriction map of the wild type yeast mtEF-G gene is shown in Figure 9. The construction of the disrupted efg1::URA3 allele is outlined in Figure 10. The yeast URA3+ wild type gene was excised from the yeast shuttle vector YCp50 as a 1565 base pair Sma I - Nru I fragment and subcloned into the Sma I site in the vector pBS+ (Stratagene) creating pBSURA3. Next, the yeast 2.7 kb N-terminal Hind III fragment was cloned into the Hind III site of pBSURA3 creating pURAH. Finally, the yeast 3.8 kb C-terminal BamHI fragment was cloned into the BamHI site of pURAH creating pUHB. All subcloning was performed using standard techniques and was confirmed by DNA sequencing and/or restriction enzyme analysis.

II.P. Disruption of the wild type S. cerevisiae mitochondrial EF-G gene.

In order to effect the gene replacement of the wild type copy of the S. cerevisiae mtEF-G gene, pUHB was digested with

Sph I. The linear 8.2 kb Sph I fragment, which was gel purified, was used to transform the S. cerevisiae strain

BWG7A as described previously (Ausubel et al., 1989). Ura+ transformants were identified as those which grew on minimal media without uracil.

II.Q. Analysis of the disrupted Saccharomyces cerevisiae mitochondrial EF-G gene.

Total yeast genomic DNA was isolated from Ura3+ transformants as well as from untransformed cells as described previously (Ausubel et al., 1989). Approximately 2 µg of genomic DNA was digested with an appropriate restriction enzyme and was displayed on a 0.8% agarose gel. Transformants which contained the disrupted copy of the gene were identified by Southern analysis using the 2.7 kb N-terminal Hind III fragment and the 1.5 kb Sma I - Nru I URA3+ fragment as probes.

Transformants were also assayed for growth on a nonfermentable carbon source. Ura3+ transformants were replica plated onto YEPG plates and incubated for 48-72 hours at 30° C.

II.R. Yeast mating.

Yeast matings were performed in patches on solid YPD media. Approximately 5 × 10^7 cells of each mating type were patched together and incubated for 8 hours at 30° C. Following incubation, the cells were resuspended in 1 ml of sterile ddH2O. Aliquots of this suspension were diluted appropriately, plated on selective media, and incubated for 48-72 hours at 30° C.

II.S. Complementation of the respiratory-defective Saccharomyces cerevisiae mutant strain C155.

The yeast strain C155, which was shown to contain a mutation in a nuclear encoded mitochondrial EF-G gene (Vambutas et al., 1991), was transformed, as previously described (Ausubel et al., 1989), with Y31/33 as well as with YEp352/RV. YEp352/RV consists of an 8.2 kb Eco RV fragment, which contains the yeast mitochondrial EF-G gene identified by our laboratory as well as 2.4 kb of upstream sequence, inserted into the Sma I site of YEp352. Ura3+ transformants were replica plated onto glycerol media (YEPG) and incubated for 48-72 hours at 30° C.

CHAPTER III

Isolation and characterization of a cyanobacterial EF-G gene.

III.A. Background and approach.

The cyanobacteria are one of the most morphologically diverse and remarkably prosperous prokaryotic groups. It is postulated that the cyanobacteria were the first major group of phototrophs to arise with a two-stage photosynthetic pathway capable of oxidizing water to produce molecular oxygen. Therefore, cyanobacteria are generally believed to be responsible for the primordial transition in the Earth's atmosphere from anaerobic to aerobic (Hayes, 1983; Schopf et al., 1983; Walter, 1987). Moreover, molecular phylogenetic analyses have established strong homologies between cyanobacteria and the green (euglenoids, green algae, and higher plants) and red (rhodophyte) chloroplasts, thus supporting the endosymbiotic theory for the prokaryotic origin of plastids.

The evolutionary relationship between cyanobacteria and chloroplasts was initially established by RNA:DNA hybridization experiments, which demonstrated that plastid ribosomal RNAs are homologous in sequence to those of cyanobacteria and other prokaryotes and not homologous to nuclear encoded cytoplasmic rRNAs (Phillips and Carr, 1977, 1981). Pigott and Carr (1972) were able to show that total 32P-labelled rRNA from cyanobacteria hybridized strongly to Euglena gracilis chloroplast DNA. However, bacterial rRNA hybridized only weakly to this DNA and Euglena gracilis cytoplasmic rRNA did not hybridize at all. In 1975, Gruol and his colleagues produced similar results showing that Euglena gracilis chloroplast rRNA formed hybrids only with chloroplast DNA and Euglena gracilis cytoplasmic rRNA formed hybrids only with nuclear DNA. They also showed that rRNA from the cyanobacterium Anacystis nidulans was capable of forming specific hybrids only with Euglena gracilis chloroplast DNA. In 1975, Phillips and Carr purified 5S rRNA from various prokaryotic and eukaryotic sources. These purified 5S rRNAs were labelled in vitro and used in hybridization experiments with Euglena gracilis chloroplast DNA. All prokaryotic 5S rRNAs employed in this study showed more extensive hybridization to Euglena gracilis chloroplast DNA than did Euglena gracilis cytoplasmic 5S rRNA. The strongest hybridization to Euglena gracilis chloroplast DNA was observed with 5S rRNA from the cyanobacterium Gloeocapsa alpicola.

The evolutionary relationship between cyanobacteria and chloroplasts was also substantiated in the mid 1970s by partial sequence characterization of ribonuclease T1 digested ribosomal RNAs. This technique, pioneered by Woese and his colleagues, involved the complete enzymatic sequencing of all (Gp)-terminated oligonucleotides generated by ribonuclease T1 digestion of 32P-labelled rRNA (Pechman and Woese, 1972; Fox et al., 1977; Woese and Fox, 1977). This technique has been applied by Bonen and Doolittle (1975; 1976; 1978; also see Bonen et al., 1979; Doolittle and Bonen, 1981) to the 16S rRNAs of eight cyanobacterial strains as well as to the plastid 16S and cytoplasmic 18S rRNAs of Porphyridium cruentum (a red alga). Similar information has also been generated by Bonen and coworkers (1977) for the cytoplasmic 18S rRNA of wheat, by Woese and Fox (1977) for the cytoplasmic 18S rRNAs of yeast and mammals, and by Zablen and his colleagues (1975) for the plastid 16S rRNA of Euglena gracilis. From these studies, one can conclude that the plastid 16S rRNAs are more homologous to the cyanobacterial 16S rRNAs than they are to any other bacterial 16S rRNA, although they are demonstrably related to the bacterial 16S rRNAs. It is also apparent from these studies that the plastid 16S rRNA of Porphyridium is especially closely related to cyanobacterial 16S rRNAs. Furthermore, none of the plastid 16S rRNAs showed detectable homology to their respective cytoplasmic 18S rRNAs, nor was significant homology observed between any eukaryotic 18S rRNA and any prokaryotic 16S rRNA (Woese and Fox, 1977).

Further evidence for the evolutionary relationship between cyanobacteria and chloroplasts has been generated primarily by comparing the DNA sequences of genes for homologous products encoded by nuclear genomes, plastid genomes, and prokaryotic (cyanobacterial) genomes. Comparisons of this sort have consistently revealed that the sequences of plastid encoded proteins and rRNAs are more homologous to those of prokaryotes, specifically cyanobacteria, than they are to the comparable nuclear encoded cytoplasmic molecules. Examples of molecules for which this type of DNA sequence comparison has been conducted include 5S rRNAs (Dyer and Bowman, 1979), 16S and 18S rRNAs (Schwartz and Kossel, 1980; Giovannoni et al., 1988), and the tRNA(Phe) of Euglena gracilis (Larue et al., 1977). For example, Dyer and Bowman (1979) have reported the complete sequence of the chloroplast 5S rRNA gene of Lemna (duckweed) and have shown that the DNA sequence of the Lemna chloroplast 5S rRNA gene is 50% homologous to the Lemna cytoplasmic 5S rRNA gene, 63% homologous to the Escherichia coli 5S rRNA gene, and 75% homologous to the Anacystis nidulans 5S rRNA gene (Corry et al., 1974).
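The percent homology values quoted above are, in practice, percent identities computed over an alignment. The following is a minimal sketch of one common way to compute such a value, assuming that the two sequences have already been aligned and that columns containing a gap are excluded from the comparison. The cited studies do not state how gaps were treated, so that choice is an assumption, and the toy sequences are hypothetical rather than real 5S rRNA sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns at which two sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
toy_a = "GGAUGC-GUCA"
toy_b = "GGAUCCAGUCA"
print(percent_identity(toy_a, toy_b))
```

Counting gapped columns as mismatches instead would lower the reported value, which is one reason percent homology figures from different studies are not always directly comparable.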

Given the strong evolutionary relationship between cyanobacteria and plastids (chloroplasts), I chose to isolate the elongation factor G gene from Synechocystis 6803 in the hope that it could be used as a heterologous probe to identify clones which contain the nuclear encoded chloroplast EF-G genes of higher plants. Furthermore, the approach used, which involved the generation and identification of cyanobacterial EF-G specific PCR products, would confirm the feasibility of the experimental design which was subsequently employed to generate chlEF-G specific PCR products from pea and Arabidopsis thaliana and a mtEF-G specific PCR product from Saccharomyces cerevisiae.

The approach used to isolate the Synechocystis 6803 gene consisted of generating EF-G specific PCR products which were subsequently used to screen a genomic library.

III.B. Generation and identification of a Synechocystis 6803 EF-G specific PCR fragment.

EF-G specific PCR primers, which correspond to conserved regions in the N-terminal amino acid sequences of translocases from Escherichia coli, Micrococcus luteus, Methanococcus vannielii, and hamster (Zengel et al., 1984; Ohama et al., 1987; Lechner et al., 1988; Kohno et al., 1986), were generated and designated G3SA, G3SB, RevG4A, and RevG4B as indicated in Figure 6. G3SA and G3SB correspond to a highly conserved N-terminal amino acid sequence which has been shown to be involved in phosphate binding. G3SA and G3SB, which encode the peptide GHVDF, were synthesized as separate pools that divide the degeneracy in the glycine codon, thereby reducing the number of oligonucleotides in each pool. G3SA contains either a guanine or a cytosine nucleotide in the third position of the glycine codon whereas G3SB contains either a thymine or an adenine nucleotide in this position (Figure 6). Likewise, RevG4A and RevG4B correspond to another highly conserved N-terminal region also believed to be involved in phosphate binding. RevG4A and RevG4B encode the peptide HIDAG and were synthesized as separate pools that divide the degeneracy in the alanine codon. RevG4A contains either an adenine or a thymine nucleotide in the third position of the alanine codon whereas RevG4B contains either a guanine or a cytosine nucleotide in this position (Figure 6).
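The effect of splitting a degenerate primer into two pools can be illustrated with a short sketch. It enumerates every coding sequence for GHVDF and HIDAG under the standard genetic code and restricts the third base of the glycine or alanine codon as described above. Several points are assumptions made only for illustration: that the primers span all five codons in full, that every codon position is fully degenerate, and that the RevG4 primers are the reverse complements of the coding sequence. The actual oligonucleotides are those given in Figure 6, and their pool sizes may differ.

```python
from itertools import product

# Standard genetic code codons for the residues in the two peptides.
CODONS = {
    "G": ["GGA", "GGC", "GGG", "GGT"],
    "H": ["CAC", "CAT"],
    "V": ["GTA", "GTC", "GTG", "GTT"],
    "D": ["GAC", "GAT"],
    "F": ["TTC", "TTT"],
    "I": ["ATA", "ATC", "ATT"],
    "A": ["GCA", "GCC", "GCG", "GCT"],
}


def pool(peptide: str, split_residue: str, third_bases: str) -> list:
    """All coding sequences for `peptide`, with the third base of the first
    `split_residue` codon restricted to the bases in `third_bases`."""
    split_at = peptide.index(split_residue)
    choices = []
    for i, residue in enumerate(peptide):
        codons = CODONS[residue]
        if i == split_at:
            codons = [c for c in codons if c[2] in third_bases]
        choices.append(codons)
    return ["".join(combo) for combo in product(*choices)]


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical reconstruction of the four pools (see assumptions above).
g3sa = pool("GHVDF", "G", "GC")
g3sb = pool("GHVDF", "G", "TA")
revg4a = [reverse_complement(s) for s in pool("HIDAG", "A", "AT")]
revg4b = [reverse_complement(s) for s in pool("HIDAG", "A", "GC")]

print(len(g3sa), len(g3sb), len(revg4a), len(revg4b))
```

Under these assumptions each split halves the number of distinct oligonucleotides in a pool relative to a single fully degenerate primer, which raises the effective concentration of any one correctly matching primer in the reaction.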

Synechocystis 6803 EF-G specific PCR products were generated using RevG4A, RevG4B, G3SA, and G3SB as primers as outlined in Figure 7. Electrophoretic analysis of the Synechocystis 6803 PCR products indicated that, when used as described, these primers amplified numerous regions of genomic DNA (Figure 11). This result was expected, because the PCR primers were derived from amino acid sequences which have been shown to be conserved in many GTP-binding proteins (Bourne et al., 1991; also see Table 3). Therefore, it was necessary to identify those PCR products which encoded EF-G-like sequences.

Potential cyanobacterial EF-G specific PCR fragments were initially identified as those which were of the expected size based on the E. coli EF-G sequence (Figure 11). Cyanobacterial EF-G specific PCR products were further identified as those which hybridized to G1. G1, as outlined in Figure 6, is a third EF-G specific oligonucleotide. Hybridizing fragments were subcloned and sequenced to confirm that they did indeed encode an EF-G-like polypeptide (Figure 11).

The DNA sequence of the Synechocystis 6803 EF-G specific PCR product is shown in Figure 12. The amino acid sequence deduced from the DNA sequence of the cloned Synechocystis 6803 EF-G specific PCR product is shown in Figure 13 aligned to the N-terminal amino acid sequences of translocases from E. coli, hamster, Arabidopsis thaliana, and pea.

III.C. Synechocystis 6803 genomic Southern analysis.

The results of a high stringency Synechocystis 6803 genomic Southern, which was probed with the cloned Synechocystis 6803 EF-G specific PCR product, are shown in Figure 14. When the stringency of hybridization was reduced, the cloned PCR product appeared to hybridize to numerous genomic fragments (Figure 15). Again, this result was expected because the region spanned by the PCR product contains amino acid domains which have been shown to be conserved in GTP binding proteins.

III.D. Isolation of three genomic clones which contain the Synechocystis 6803 EF-G gene.

The cloned Synechocystis 6803 EF-G PCR product was used as a probe to screen a Synechocystis 6803 λZAP II genomic library which was kindly provided by Dr. Lee McIntosh. Three unique clones were identified and designated C3, C11, and C13. C13 appears to be the smallest of the three clones and contains an insert of approximately 3.0 kb. C3 contains an insert of approximately 4.5 kb while C11, the largest of the three clones, contains an insert of approximately 7.6 kb.

III.E. Nucleotide sequence of the EF-G gene of Synechocystis 6803.

The nucleotide sequence of the Synechocystis 6803 EF-G gene was determined by sequencing NcoI restriction fragments as well as deletion subclones of the gene. In addition, synthetic oligonucleotides, which were derived from the DNA sequences of the appropriate subclones, were used as sequencing primers to bridge gaps in the sequence as diagrammed in Figure 8. The complete nucleotide sequence of both strands was determined and is shown in Figure 16.

III.F. A comparative analysis of the primary sequence of Synechocystis 6803 EF-G to other cyanobacterial, bacterial, and archaebacterial EF-G sequences.

The amino acid sequence of Synechocystis 6803 EF-G is shown in Figure 17 aligned to the amino acid sequences of translocases from two other cyanobacteria, Anacystis nidulans (Meng et al., 1989) and Spirulina platensis (Buttarelli et al., 1989), and two eubacteria, Thermus thermophilus (Yakhmin et al., 1989) and Escherichia coli (Zengel et al., 1984). It is apparent from this comparison that the amino acid sequence of Synechocystis 6803 EF-G differs from other cyanobacterial, eubacterial, and archaebacterial EF-G sequences; however, the significance of these differences remains to be determined. The amino acid sequence of Anacystis nidulans EF-G is approximately 80% homologous to the Spirulina platensis EF-G sequence. However, the Synechocystis 6803 EF-G sequence is only 50% homologous to either the Anacystis nidulans or the Spirulina platensis sequence. When the three cyanobacterial EF-G amino acid sequences are compared to the Escherichia coli EF-G sequence, the homology ranges from 54-59%, with Anacystis nidulans being the most homologous to Escherichia coli and Synechocystis 6803 the least. These results are summarized in Table 4. Also shown in Table 4 is the percent identity to E. coli LepA. LepA is an E. coli protein of unknown function which shares regions of homology with E. coli translation factors such as EF-Tu, IF-2, and EF-G as well as other GTP-binding proteins (March and Inouye, 1985).

A comparison of the codon usage between the Synechocystis

6803 EF-G gene and that determined for Synechococcus is shown in Table 5.
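A codon usage comparison of this kind starts from a simple tally of the codons in each coding sequence. The sketch below shows one way such a tally might be produced, assuming an in-frame coding sequence as input. The function names and the toy sequence are hypothetical and are not taken from Table 5.

```python
from collections import Counter


def codon_usage(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def relative_frequency(counts: Counter) -> dict:
    """Convert codon counts to fractions of all codons in the sequence."""
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


# Hypothetical eight-codon sequence, for illustration only.
toy_cds = "ATGGGCCACGTGGATTTCGGCTAA"
usage = codon_usage(toy_cds)
freqs = relative_frequency(usage)
print(usage.most_common(1))
```

Two genes can then be compared codon by codon by placing their frequency tables side by side, which is the layout such a table would normally take.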

While amino acid sequence comparisons of this type are interesting, they cannot be the sole criterion by which the evolutionary relatedness of organisms in a given taxon is determined. For example, Giovannoni and his colleagues (1988) have sequenced 16S rRNAs to explore the evolutionary relationship among 30 representatives of the diverse cyanobacterial group. Their results predict that the evolutionary distance between Spirulina and Synechocystis is smaller than the evolutionary distance separating Anacystis from either Spirulina or Synechocystis. Based on their results, one would predict that the amino acid sequence of Spirulina EF-G would be more homologous to Synechocystis EF-G than either is to Anacystis EF-G. However, the Anacystis and Spirulina EF-G amino acid sequences are more homologous to each other than they are to the Synechocystis EF-G sequence.

It is also of interest to note that the Synechocystis 6803 EF-G gene does not appear to be located in the str operon. In Escherichia coli, Anacystis nidulans, and Spirulina platensis, the str operon includes the structural genes rpsL (ribosomal protein S12), rpsG (ribosomal protein S7), fus (translation elongation factor G), and tuf (translation elongation factor Tu). I have sequenced 414 base pairs of the 5' flanking region of the Synechocystis 6803 gene as well as approximately 500 base pairs of the 3' flanking region. I have not detected any significant homology to the S7 gene in the 5' flanking region. Likewise, I did not detect any homology to the EF-Tu gene in the 3' flanking sequence. The distances between the genes of the str operons from E. coli, S. platensis, A. nidulans, and M. luteus are given in Table 6. A dot plot comparing the DNA sequence of the Synechocystis 6803 EF-G gene to the entire str operon of A. nidulans is illustrated in Figure 18.
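For readers unfamiliar with the method, a dot plot marks every pair of positions at which a short window of one sequence matches a window of the other, so that a region of homology appears as a diagonal line. The following is a minimal sketch of that calculation. The window length and match threshold are arbitrary assumptions and are not the parameters used to produce Figure 18, and the toy sequences are hypothetical.

```python
def dot_plot(seq_x: str, seq_y: str, window: int = 10, min_matches: int = 8) -> list:
    """Return (i, j) pairs where the window starting at i in seq_x and the
    window starting at j in seq_y share at least `min_matches` identical bases."""
    seq_x, seq_y = seq_x.upper(), seq_y.upper()
    dots = []
    for i in range(len(seq_x) - window + 1):
        win_x = seq_x[i:i + window]
        for j in range(len(seq_y) - window + 1):
            win_y = seq_y[j:j + window]
            if sum(a == b for a, b in zip(win_x, win_y)) >= min_matches:
                dots.append((i, j))
    return dots


# Hypothetical sequences: toy_x occurs within toy_y at offset 2.
toy_x = "ACGTACGTTAGC"
toy_y = "TTACGTACGTTAGCAA"
dots = dot_plot(toy_x, toy_y, window=8, min_matches=8)
print(dots)
```

This direct comparison scales with the product of the two sequence lengths, which is acceptable for a single gene compared against an operon but would be slow for much longer sequences.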

The fact that the Synechocystis 6803 EF-G gene is not located in the str operon was unexpected, but it is not unprecedented. For example, the Synechocystis 6803 psbH gene, which encodes a 9 kDa phosphoprotein of photosystem II, has been cloned and sequenced. In chloroplast DNA, the gene shows a conserved location in the co-transcribed psbB-psbH-petB-petD operon. However, in Synechocystis 6803, the psbB gene is not located within 11 kb upstream or downstream of the psbH gene (Mayes and Barber, 1989). One possibility, therefore, is that the Synechocystis 6803 EF-G gene is simply not located in the str operon.

It is also possible that two distinct genes, one which

is located in the str operon and one which is located

elsewhere in the genome, encode EF-G in Synechocystis 6803.

For example, in Escherichia coli, two distinct genes encode

EF-Tu, tufA which is located in the str operon and tufB which maps elsewhere in the Escherichia coli genome (Jaskunas et al., 1975). If two distinct genes encoded EF-G in

Synechocystis 6803, I would not expect these two genes to differ significantly at the nucleotide level. The sequence of tufA differs from tufB in only 13 positions and the respective gene products, EF-TuA and EF-TuB are identical except for the COOH-terminal amino acid (An and Friesen,

1980; Yokota et al., 1980). If two EF-G genes are present in

Synechocystis 6803 and are as homologous to each other as the two Escherichia coli EF-Tu sequences, I would also expect two distinct bands to be present on genomic Southerns which were probed with the Synechocystis 6803 EF-G PCR product using high stringency conditions. However, under high stringency conditions, only one genomic fragment hybridized to the

Synechocystis 6803 EF-G PCR product (Figure 14).

It is also possible that two distantly related fus genes are present in all cyanobacteria, and that a second gene has gone undetected in other species because of the experimental procedures used to isolate their fus genes. The A. nidulans fus gene was isolated as part of the str operon, using the tobacco chloroplast gene for ribosomal protein S12 as a probe, whereas the S. platensis fus gene was isolated using the E. coli tufA and rps7 genes as well as C-terminal fragments of the E. coli fus gene as probes. Neither approach would necessarily have detected a second, distantly related fus gene. It is therefore necessary to reexamine these prokaryotes for the presence of an unexpected second fus gene. For example, a genetic screen has recently revealed the presence of a novel translation factor, with extensive sequence homology to EF-Tu, which is required for the incorporation of selenocysteine (Forchhammer et al., 1989).

DNA sequence information from the 5' and 3' flanking regions of the gene did not reveal any homology to either the S7 or EF-Tu genes. Therefore, DNA sequence information, in conjunction with high stringency genomic Southern analyses, suggests that in Synechocystis 6803, there is only one copy of the EF-G gene, which is not located within 414 bp downstream of the S7 gene or 500 bp upstream of the EF-Tu gene. On the other hand, in yeast, it appears that there are two mtEF-G genes which are not highly homologous to each other; therefore, the possibility remains that there are two genes in Synechocystis. This would explain the low homology of the gene we have isolated when compared to the other cyanobacterial fus genes. If two distantly related fus genes are present in Synechocystis, I probably would not have detected the other gene while screening the library because high stringency conditions were employed. If low stringency conditions were used to screen the library, I would predict that a second fus gene, if present, would be detected as well as numerous other genes with DNA sequences sufficiently homologous to the Synechocystis 6803 PCR product. These include genes which encode other GTP-binding proteins such as EF-Tu, IF-2, and LepA.

III.G. Use of the Synechocystis 6803 EF-G specific PCR product as a heterologous probe.

Genomic DNA from N. crassa, E. gracilis, and pea was digested with EcoRI and probed with the Synechocystis 6803 EF-G specific PCR product. Under low stringency conditions, numerous hybridizing fragments can be observed (Figure 15).

Based on the results of this Southern, I used the Synechocystis 6803 EF-G specific PCR product as a heterologous probe to screen a pea λgt11 cDNA library as well as a Chlamydomonas reinhardtii λgt11 genomic library which was kindly provided by Dr. Alice Curry. It was my hope that, under conditions of reduced stringency, the cyanobacterial EF-G PCR product would hybridize to the pea chlEF-G cDNA as well as to the Chlamydomonas reinhardtii chlEF-G gene. However, after numerous attempts, neither the pea chlEF-G cDNA nor the Chlamydomonas reinhardtii chlEF-G gene was isolated.

CHAPTER IV

Cloning and characterization of a Saccharomyces cerevisiae mitochondrial EF-G gene.

IV.A. Background and approach.

The mitochondrial translation system of Saccharomyces cerevisiae, like the mitochondrial and chloroplast translation systems of higher plants, is primarily composed of nuclear encoded proteins. The nuclear genes which encode these proteins are distinct from the genes which encode the homologous components of the cytoplasmic translation system. The only known exceptions to this rule in yeast are HTS (Natsoulis et al., 1986), which encodes the mitochondrial and cytoplasmic histidyl-tRNA synthetases, and VAS1 (Chatton et al., 1988), which encodes the mitochondrial and cytoplasmic valyl-tRNA synthetases.

Genes encoding numerous components of the yeast mitochondrial translation apparatus have been cloned and characterized. These genes have been cloned by screening cDNA or genomic libraries with antibodies generated against a particular protein (Natsoulis et al., 1986; Chatton et al., 1988; Partaledis and Masen, 1988), with synthetic oligonucleotide probes (Kitakawa et al., 1990; Dang and Ellis, 1990), or with heterologous DNA probes (Nagata et al., 1983).

A number of genes encoding components of the yeast mitochondrial translation system have also been isolated by complementation. Yeast strains which contain mutations in components of the mitochondrial translation system are characterized primarily by their inability to incorporate radioactive precursors into the few proteins which are synthesized endogenously as well as their tendency to delete segments of their mitochondrial DNA. This phenotype is not uncommon among the well characterized yeast petite (pet) mutations. Clones which are able to complement a particular pet mutation have been sequenced and characterized to determine the nature of the complementing genomic DNA. Using this approach, a mitochondrial ribosomal protein gene (Myers et al., 1987), a mitochondrial initiation factor gene (Vambutas et al., 1991), a mitochondrial elongation factor G gene (Vambutas et al., 1991), and a number of mitochondrial aminoacyl-tRNA synthetase genes (Myers and Tzagoloff, 1985; Pape et al., 1985; Koerner et al., 1987) have been identified.

The information generated from these studies has increased our understanding of the components of the mitochondrial protein synthesizing machinery as well as our understanding of mitochondrial biogenesis in general. For example, the extensive homology shared between the characterized proteins of the yeast mitochondrial translation system and the respective bacterial counterparts provides additional evidence for the prokaryotic nature of the mitochondrial translation system.

The study of the nuclear encoded proteins of the yeast mitochondrial translation system as well as other nuclear encoded yeast mitochondrial proteins has also contributed to our knowledge of organellar protein targeting. However, many questions about eukaryotic mitochondrial transit peptides and their respective targeting mechanisms remain to be answered.

By cloning and characterizing the nuclear encoded yeast mitochondrial elongation factor G gene, I had hoped to contribute to our understanding of the nuclear encoded components of the mitochondrial translation apparatus and the mechanism by which they are targeted to the organelle. It was also my hope that the yeast mtEF-G gene could be used as a heterologous probe to identify clones which contain the nuclear encoded mtEF-G genes of higher plants.

The approach used to isolate the yeast mtEF-G gene is the same as that previously described for pea, A. thaliana, and Synechocystis 6803. A S. cerevisiae genomic library was screened with a S. cerevisiae mtEF-G specific PCR product. Two positive clones were identified and characterized by restriction mapping and DNA sequencing.

During these experiments, it came to our attention that A. Vambutas and her colleagues had identified a clone which they believed encoded the S. cerevisiae mtEF-G gene (Vambutas et al., 1991). This clone was identified based on its ability to complement the S. cerevisiae petite mutant strain C155 (Vambutas et al., 1991). A comparative analysis of our clone and theirs, as well as complementation studies using our clone and their mutant strain, are presented.

IV.B. Generation and identification of a Saccharomyces cerevisiae mitochondrial EF-G specific PCR product.

The EF-G specific PCR primers G3SA, G3SB, RevG4A, and RevG4B, which are described in section III.B and in Figure 6, were again used as outlined in Figure 7 to amplify S. cerevisiae genomic DNA. Electrophoretic analysis of the PCR products generated revealed that G3SA, G3SB, RevG4A, and RevG4B amplified numerous regions of genomic DNA (Figure 19). As previously discussed, this result is expected and is probably due to the fact that G3SA/B and RevG4A/B were derived from amino acid sequences which have been shown to be conserved in translocases as well as other GTP binding/hydrolyzing proteins (Bourne et al., 1990). Therefore, it was again necessary to identify only those PCR products which encoded EF-G-like sequences.

Potential mtEF-G PCR fragments were initially identified as those which were of the expected size based on the E. coli sequence (Figure 19). However, if a yeast gene contains an intron, this intron is most likely to be located near the N-terminal region of the gene (Teem et al., 1984). If the yeast mtEF-G gene contained an intron in this region and G3SA/B and RevG4A/B flanked this intron, then the PCR products generated using these primers would be larger than the predicted size. Therefore, yeast mtEF-G specific PCR products were further identified as those which hybridized to G1. As discussed previously and diagrammed in Figure 6, G1 is a third EF-G specific oligonucleotide which should hybridize to PCR products that have been generated using G3SA/B and RevG4A/B as primers (Figure 7). Two hybridizing PCR products, Y1 and Y2, were cloned and sequenced to confirm that they did indeed encode EF-G-like polypeptides.

The DNA sequences of the two EF-G specific PCR products, which were generated using S. cerevisiae genomic DNA as a template, are given in Figure 20. The amino acid sequences deduced from the DNA sequences of these cloned EF-G specific PCR products are shown in Figure 21 aligned to the N-terminal amino acid sequence of E. coli EF-G.

IV.C. Saccharomyces cerevisiae genomic Southern analyses.

S. cerevisiae genomic Southerns were probed with Y1 and Y2. Y2 did not hybridize to S. cerevisiae genomic DNA when high stringency hybridization and wash conditions were employed. Given the high degree of amino acid conservation between Y2 and E. coli EF-G (Figure 21), Y2 most probably encodes a portion of a bacterial EF-G. The results of a S. cerevisiae genomic Southern which was probed with Y1 are shown in Figure 22.

As stated previously, Vambutas and coworkers have identified a gene which they believe encodes a Saccharomyces cerevisiae mitochondrial EF-G. To determine whether or not the yeast mtEF-G PCR product that I have identified is capable of hybridizing to the yeast mtEF-G gene identified by Vambutas, a low stringency yeast genomic Southern was performed using the cloned mtEF-G PCR product as a probe. When the hybridization and wash temperatures were reduced, numerous hybridizing fragments appeared (Figure 23). This result is likely due to the heterologous hybridization of the yeast mtEF-G PCR product to the yeast mtEF-G gene identified by Vambutas and to other yeast genes which encode GTP-binding proteins. It was never determined whether any of the fragments detected by the yeast mtEF-G PCR product corresponded to the gene identified by Vambutas et al. (1991).

IV.D. Isolation of two genomic clones which encode a Saccharomyces cerevisiae mitochondrial EF-G gene.

A S. cerevisiae YCp50 genomic library (Rose et al., 1987), which was kindly provided by John V. Moran, was screened using the yeast mtEF-G specific PCR product Y1 as a probe. Two clones were identified and designated Y31 and Y33. Upon detailed restriction analyses, it was determined that Y31 and Y33 are, in fact, identical. The restriction map of Y31/Y33 is shown in Figure 9.

IV.E. Elucidation of the sequence of a Saccharomyces cerevisiae mitochondrial EF-G gene.

The nucleotide sequence of the S. cerevisiae mtEF-G gene identified in our laboratory was determined by Douglas Johnson as diagrammed in Figure 9. The complete nucleotide and amino acid sequence of the yeast mtEF-G gene is shown in Figure 24. A "TATAA" box, whose sequence is "ATAAT", is located 85 bp 5' of the AUG initiation codon. A "G/C" box, whose sequence is a perfect match to the consensus sequence "GGGCGG", is located 101 bp 5' of the AUG codon. A "CAAT" box, whose sequence is "AACCAATCA", is located 139 bp 5' of the AUG codon. There are two polyadenylation signals located 14 and 94 bp 3' of the termination codon. In comparison, the gene isolated by Vambutas et al. (1991) contains no apparent "TATAA" box in the 5' untranslated sequence; however, an "AT" rich region, which may contain sequences that function as "TATAA" boxes, is located between -200 and -100 relative to the start codon. Furthermore, there are no apparent "CAAT" boxes or "G/C" boxes in the 5' flanking sequence. There are three polyadenylation signals in the 3' flanking sequence that are located 44, 72, and 107 bp 3' of the termination codon. Both genes appear to encode N-terminal transit peptides which contain regularly spaced basic amino acid residues.
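Locating promoter elements of this kind amounts to searching the 5' flanking sequence for short motifs and reporting their distance from the initiation codon. The sketch below illustrates the idea. It assumes that distance is measured from the first base of the motif to the first base of the initiation codon, a convention the text does not specify, and the function name and toy sequence are hypothetical rather than taken from Figure 24.

```python
def upstream_motifs(seq: str, start_codon_index: int, motif: str) -> list:
    """Distances (bp 5' of the start codon) at which `motif` begins.

    Assumption: distance is counted from the first base of the motif to
    the first base of the start codon at `start_codon_index`.
    """
    seq, motif = seq.upper(), motif.upper()
    upstream = seq[:start_codon_index]
    distances = []
    pos = upstream.find(motif)
    while pos != -1:
        distances.append(start_codon_index - pos)
        pos = upstream.find(motif, pos + 1)
    return distances


# Hypothetical flanking sequence followed by an ATG at index 22.
toy_seq = "CC" + "GGGCGG" + "TTTT" + "ATAAT" + "CCCCC" + "ATGAAA"
start = toy_seq.index("ATG")
print(upstream_motifs(toy_seq, start, "ATAAT"))
print(upstream_motifs(toy_seq, start, "GGGCGG"))
```

An exact string match will miss degenerate variants of a consensus, so a search like this can only suggest candidate elements and does not show that they are functional.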

IV.F. A comparative analysis of the amino acid sequence of a Saccharomyces cerevisiae mitochondrial EF-G to other known eukaryotic and prokaryotic EF-G sequences.

The amino acid sequence of the yeast mtEF-G characterized by our laboratory is shown in Figure 25 aligned to the amino acid sequence of E. coli EF-G and the S. cerevisiae mtEF-G amino acid sequence derived by Vambutas and coworkers (1991). It is apparent from this comparative analysis that the yeast clone which was identified in our laboratory does indeed encode an EF-G-like sequence. There are large insertions in our sequence relative to the sequence generated by Vambutas et al. (1991); however, there are no apparent intron consensus splice sites in these insertions, indicating that they do not correspond to introns.
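A search for intron consensus sequences can be sketched as a pattern match. The example below looks for the elements commonly cited for S. cerevisiae nuclear introns: a GTATGT 5' splice site, a TACTAAC branch point, and a pyrimidine-AG 3' splice site, in that order. The text does not state which consensus sequences were examined, so the pattern is an assumption, and the toy sequences are hypothetical.

```python
import re

# Assumed consensus elements for S. cerevisiae introns, in 5' to 3' order:
# 5' splice site GTATGT, branch point TACTAAC, 3' splice site YAG.
INTRON_PATTERN = re.compile(r"GTATGT[ACGT]*?TACTAAC[ACGT]*?[CT]AG")


def candidate_introns(seq: str) -> list:
    """Return (start, end) spans that contain all three consensus elements."""
    seq = seq.upper()
    return [(m.start(), m.end()) for m in INTRON_PATTERN.finditer(seq)]


# Hypothetical sequences, for illustration only.
with_intron = "AAA" + "GTATGT" + "CCCC" + "TACTAAC" + "TTTT" + "TAG" + "GGG"
without_intron = "ATGGCCGCCGCCGATGAT"
print(candidate_introns(with_intron))
print(candidate_introns(without_intron))
```

The absence of any match in an inserted region is consistent with that region being coding sequence, although a strict pattern such as this one would not detect an intron with divergent splice signals.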

Why S. cerevisiae would evolve in such a way as to have

two obviously separate and distinct mitochondrial elongation

factor G genes is not known. The percent amino acid identity

between our S. cerevisiae mtEF-G and E . coli and the S.

cerevisiae mtEF-G identified by Vambutas (1991) is given in

Table 7. A comparison of codon usage between these two genes

is presented in Table 8.

One would predict that the two yeast sequences would be

significantly more homologous to one another than they are to

the Escherichia coli sequence. The fact that they are as

homologous to each other as they are to the Escherichia coli

sequence suggests that the two genes diverged from the

ancestral nuclear encoded gene during a relatively early

point in evolutionary time; or, that the EF-G gene was

transferred to the nucleus from the primordial mitochondrion

on two separate occasions. A possible reason for the

existence of two mitochondrial EF-G genes in yeast will be presented in Section IV.I.

IV.G. Disruption of a Saccharomyces cerevisiae mitochondrial

EF-G gene.

The construct used to disrupt the S. cerevisiae mtEF-G

gene is diagrammed schematically in Figure 10. The 8.1 kb

SphI fragment was used to effect the disruption in strain

aBWG7A. Ura3+ transformants were selected on minimal medium which lacked uracil. Ura3+ transformants were analyzed by

Southern blotting to ensure that the constructed allele

correctly recombined and thereby replaced the endogenous

allele. Total yeast genomic DNA was isolated from eight

Ura3+ transformants, digested with EcoRI, and probed with the

2.7 kb N-terminal HindIII fragment (Figure 10). EcoRI-digested DNA was also probed with the Ura3+ gene. The results of these Southerns are shown in Figures 26 and 27.

Ura3+ transformants 2, 3, 4, 5, and 15 had the disrupted allele inserted in the correct genomic location whereas transformants 21, 23, and 24 had disrupted alleles which had undergone illegitimate recombination.

To determine whether or not the S. cerevisiae fus-like gene is an essential gene, the eight Ura3+ transformants were plated on glycerol medium (YEPG). Transformants 2, 3, 4, 5,

15, and 23 were unable to grow on glycerol whereas transformants 21 and 24 were able to grow on glycerol.

In order to show that transformants 2, 3, 4, 5, 15, and

23 have retained their mitochondrial DNA, they were crossed

to a ρ° tester strain. Mated cells were plated on glycerol medium. Theoretically, the only cells capable of growing on glycerol are those which have acquired wild type mitochondria

from the disrupted strain. The results of this cross are given in Table 9. All transformants tested contained wild type mitochondria. Therefore, it appears that the S. cerevisiae fus-like gene encodes an essential mitochondrial protein.

IV.H. Complementation of the Saccharomyces cerevisiae pet mutant strain C155 with a Saccharomyces cerevisiae mitochondrial EF-G gene.

The original S. cerevisiae mtEF-G mutant strain, C155, was kindly provided by Dr. A. Tzagoloff. C155 was transformed with our original clone Y31/33 as well as with

YRV. YRV was constructed by isolating the S. cerevisiae mtEF-G gene characterized by our laboratory on a 7 kb EcoRV fragment which was subsequently cloned into the high copy number, 2μm-based yeast shuttle vector, YEp352. Transformed cells were plated onto minimal medium without uracil and then replica plated onto glycerol medium. Ura3+ transformants did not grow on glycerol. These transformants were crossed to a ρ° tester strain to determine if they retained wild type mitochondria. However, all transformants tested were ρ°.

These results are inconclusive but suggest that our S.

cerevisiae fus-like gene is unable to complement C155 because the transformed strains were highly unstable and

always found to be ρ°.

IV.I. Discussion.

Although I isolated an EF-G specific PCR product that was unrelated to yeast (probably due to bacterial contamination of the yeast DNA), it is not surprising that I did not isolate a yeast EF-G specific PCR product which corresponded to the gene characterized by A. Vambutas. Upon careful analysis of the sequence of our gene and theirs, it became apparent that the PCR primers G3SA/B and RevG4A/B are not completely homologous to either our sequence or the yeast mtEF-G sequence published by A. Vambutas. G3SA/B matches 13 of 14 nucleotides in both sequences. This mismatch in both sequences results in the observed change in the encoded amino acid sequence. The primers encode the sequence GHVDF whereas both genes encode the sequence GHIDF. RevG4A/B matches our sequence in 14 of 14 nucleotides; however, it matches the sequence of the other yeast mtEF-G gene in 13 of 14 positions and results in an amino acid change (HIDSG versus HIDAG).

Therefore, it is possible that this nucleotide change reduced the amplification efficiency of the gene identified by A.

Vambutas. A primer which contains a single base mismatch near the 3' terminus most probably will not work as well as

a completely homologous primer.
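
The primer comparison described above can be sketched in a few lines of Python. The function below reports each mismatch between a primer and its target site as a distance from the primer's 3' end, since mismatches close to that end are the ones expected to reduce amplification most. The two 14-nucleotide sequences are hypothetical stand-ins chosen so that one encodes GHVDF and the other GHIDF; they are not the actual G3SA/B or RevG4A/B sequences, which are degenerate mixtures, and the function name is likewise an assumption.

```python
from typing import List


def primer_mismatches(primer: str, site: str) -> List[int]:
    """Return mismatch positions counted from the primer's 3' end.

    Both sequences are written 5' to 3' in the same sense and must have the
    same length. Position 1 is the 3'-terminal base of the primer.
    """
    if len(primer) != len(site):
        raise ValueError("primer and target site must have equal length")
    length = len(primer)
    return [
        length - index
        for index, (p_base, s_base) in enumerate(zip(primer.upper(), site.upper()))
        if p_base != s_base
    ]


# Hypothetical 14-mers: GGT CAC GTT GAC TT (G H V D F) versus
# GGT CAC ATT GAC TT (G H I D F). Illustration only.
hypothetical_primer = "GGTCACGTTGACTT"
hypothetical_site = "GGTCACATTGACTT"
positions = primer_mismatches(hypothetical_primer, hypothetical_site)
matches = len(hypothetical_primer) - len(positions)
print(matches, "of", len(hypothetical_primer), "match; mismatches at", positions)
```

With these example sequences the single mismatch lies eight bases from the 3' end, giving a 13 of 14 match of the kind described in the text.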

While the gene characterized by A. Vambutas may not have

been amplified efficiently using the strategy described, it

is also possible that it was not detected due to the fact

that only two positively identified clones were sequenced.

One of these was determined to be a bacterial contaminant while the other corresponded to the yeast mtEF-G described

here. Therefore, sequencing other PCR products might

identify one which corresponds to the gene isolated by A.

Vambutas.

It is unknown why yeast has two mitochondrial EF-G

genes. However, one may speculate that a novel translation

factor may be required under various growth conditions. For

example, one factor may be required when cells are grown

anaerobically versus aerobically or at one temperature versus

another. It is also possible that a novel translation factor may be required for the incorporation of novel amino acids.

In E. coli, a novel translation factor, with extensive

sequence homology to IF-2 and EF-Tu, has been identified and

appears to be required for the incorporation of

selenocysteine through a process directed by a UGA codon that

normally functions as a stop codon (Forchhammer et al.,

1989).

Novel mitochondrial translation elongation factors may

also play a role in the regulation of protein biosynthesis.

While protein synthesis regulation is traditionally believed

to occur most often at the stage of initiation, many reports

suggest that the regulation of protein biosynthesis occurs at

elongation in eukaryotic cells.

The rate of protein chain elongation on either total mRNA or on a specific mRNA has been reported to vary depending on various stimuli which may include hormone treatment, viral infection, and egg fertilization. For example, when serum is added to HeLa cells a two- to three-fold increase in the overall elongation rate is observed (Nielsen and McConkey, 1980). Examples also exist which document the change in the rate of elongation of a specific mRNA species with a minimal change in the overall elongation rate. For example, when cultured hepatoma cells are exposed to dibutyryl cyclic AMP no change in the elongation rate of total protein synthesis is observed (two amino acids per second per ribosome); however, the elongation rate for tyrosine aminotransferase was increased to ten amino acids per second per ribosome (Roper and Wicks, 1978). Therefore, changes in the rate of elongation under various physiological conditions have been well documented.

How the elongation rate is regulated is still not well understood. It is possible that the rate can be regulated by altering or modifying the concentration of components of the elongation machinery. Several examples are known where a change in the overall rate of protein synthesis is correlated with elongation factor activity; therefore, elongation factors may be primary targets for the regulation of the rate

of elongation. The nature of the modification responsible

for the regulation of elongation factor activity is not

known; however, the covalent modification of eukaryotic

elongation factors has been documented. While the role of these modifications for the most part is not known, methylation of eEF-1α in the fungus Mucor racemosus is correlated with the overall increase in the rate of elongation during the yeast-to-hyphae transition while reduction in the methylation state of eEF-1α is correlated with a decrease in eEF-1α activity during spore germination

(Fonzi et al., 1985). Furthermore, phosphorylation of eEF-2 appears to inactivate the protein, rendering it incapable of catalyzing ribosome translocation in cell free translation systems (Shestakova and Ryazanov, 1987; Nairn and Palfrey,

1987; Ryazanov et al., 1988; Ryazanov and Davydova, 1989).

Therefore, modification of elongation factors may play a role in the regulation of the rate of polypeptide chain elongation.

How can regulation of the rate of elongation affect the steady state level of a cellular protein or proteins? One can envision three possible mechanisms. A weak mRNA, that is, one for which the initiation rate constant is low, may be stimulated to be translated by inhibiting the overall elongation rate. The rate of initiation is the most common rate-limiting stage in translation and different mRNAs are translated according to their strength of initiation. When

the overall rate of elongation is inhibited, the efficiency with which each mRNA is translated is proportional to the

concentration of a given mRNA species. It has been

postulated that inhibition of the overall elongation rate

could potentially stimulate translation of weak mRNAs

relative to strong mRNAs. This is likely due to the

existence of specific initiation factors which have different

affinities for different mRNAs. Reducing the rate of elongation only limits the rate of elongation of strong mRNAs, whereas elongation of weak mRNAs is stimulated by the mobilization of the limiting factors from strong mRNAs.

Mobilization of these factors from strong mRNAs to weak mRNAs appears to be the result of "ribosome stacking" on strong mRNAs. As the overall rate of elongation is reduced, strong mRNAs are not able to initiate translation as frequently as before due to the reduced rate at which ribosomes are able to translate strong messages. Ribosomes essentially "pile up" on strong messages and thereby reduce the rate at which these messages can initiate translation. This subsequently increases the cellular concentration of initiation factors which can now initiate weak mRNAs. This phenomenon has been documented experimentally in mouse fibroblasts. When protein synthesis in these cells was partially inhibited by cycloheximide, the relative rate of synthesis of a number of proteins increased (Walden et al., 1981; Brendler et al.,

1981; Brendler et al., 1981; Godefroy-Colburn and Thach,

1981; Ray et al., 1985; Walden and Thach, 1986).

Regulation of the rate of elongation has also been known to change mRNA stability. Examples exist where mRNA stability is directly dependent on ongoing protein synthesis.

Inhibition of elongation may result in the stabilization of particular mRNA species. This may be the result of a stalled ribosome protecting a particular mRNA from degradation (Gay et al., 1989). There are also cases where inhibition of elongation stimulates mRNA degradation (Gay et al., 1989).

Inhibition of the rate of elongation may also result in the elimination of proteins with very short half-lives (Bachmair et al., 1986). One can easily imagine how the concentration of a protein with an extremely short half-life, such as the α2 repressor in yeast which has been shown to have a half-life of approximately 5 minutes (Hochstrasser and Varshavsky,

1990), may be reduced in a cell whose rate of overall polypeptide chain elongation has been decreased.
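
The effect of a short half-life can be made concrete with a simple calculation. The sketch below assumes first-order decay and a constant rate of synthesis, neither of which is established here for any particular protein; the synthesis rates are arbitrary illustrative numbers, and only the 5 minute half-life is taken from the text.

```python
import math


def fraction_remaining(minutes: float, half_life_min: float) -> float:
    """Fraction of an existing protein pool left after first-order decay."""
    return 0.5 ** (minutes / half_life_min)


def steady_state_level(synthesis_rate: float, half_life_min: float) -> float:
    """Steady-state amount when constant synthesis balances first-order decay.

    synthesis_rate is in arbitrary units per minute (an assumed value).
    """
    decay_constant = math.log(2) / half_life_min
    return synthesis_rate / decay_constant


HALF_LIFE_MIN = 5.0  # approximate half-life quoted in the text

# If synthesis stopped completely, a quarter of the pool would remain
# after two half-lives (10 minutes).
print(fraction_remaining(10.0, HALF_LIFE_MIN))

# Halving the synthesis rate (for example, by slowing elongation) halves
# the steady-state level under these assumptions.
normal_level = steady_state_level(100.0, HALF_LIFE_MIN)
slowed_level = steady_state_level(50.0, HALF_LIFE_MIN)
print(slowed_level / normal_level)
```

Under this model a protein with a long half-life reaches its new, lower steady state only slowly, whereas a protein with a 5 minute half-life approaches it within a few tens of minutes.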

Two roles for the regulation of protein synthesis at the elongation stage have been proposed. Regulation of the rate of elongation may be important in coordinating the translation of selected mRNA species. This may be the case in the activation of cells by hormones or in the case of fertilization. In both cases, a vast amount of new mRNA is generated. If these are strong mRNAs and have high rate constants of initiation, the sudden increase in these mRNAs could limit the availability of initiation factors thereby

jeopardizing the translation of weak constitutive messages.

A decrease in the rate of elongation in response to a dramatic increase in total cellular mRNA could minimize translational discrimination thereby preventing the interrupted synthesis of essential constitutively expressed proteins.

Transition of cells from one physiological state, dependent on continuous protein synthesis, to another could also be regulated at the stage of elongation. For example, metaphase arrest of maturing oocytes has been shown to be dependent on continual protein synthesis. Arrest can be overcome by inhibiting the rate of elongation (Clarke and

Masui, 1983; Zampetti-Bosseler et al., 1973; Neant and

Guerrier, 1988; Dube and Dufresne, 1990).

The discussion presented here outlines the potential global importance of the regulation of the elongation stage of translation. One can therefore easily imagine why a cell may require numerous elongation factors such as the situation documented here for S. cerevisiae mitochondria.

CHAPTER V

Attempted isolation and characterization of the Pisum sativum

and Arabidopsis thaliana chloroplast EF-G genes.

V.A. Background and approach.

Eukaryotic cellular organelles such as mitochondria and chloroplasts contain circular genomes which encode a small fraction of the organellar proteins as well as organellar rRNAs and tRNAs. Many other organellar proteins are known to be encoded by nuclear genes. These nuclear encoded organellar proteins generally include organellar protein synthesis factors such as the enzyme responsible for the translocation step in protein synthesis, elongation factor G

(EF-G). The nuclear encoded organellar proteins, which are synthesized on cytoplasmic ribosomes, are targeted to their respective organelle by a mechanism which is not completely understood. There they participate in various organellar functions such as respiration, photosynthesis, and protein synthesis.

One of the goals of the research I initiated is to

determine how light regulates the protein synthetic capacity

of plant chloroplasts. Chloroplast protein synthesis is

required for the expression of genes encoded in chloroplast

DNA and for the development of an intact, functional

organelle. It has been shown that the ability of plastids to

synthesize proteins is regulated in a light-dependent manner

(Reger et al., 1972). This light induced increase in the

capacity of chloroplasts to synthesize proteins is

responsible for the light regulation observed for numerous

chloroplast proteins. Therefore, a precise understanding of how chloroplast gene expression is light regulated would

allow us to more fully understand how light regulates chloroplast protein synthesis in general.

Recent studies on chlEF-G may help explain how light regulates the protein synthetic capacity of plant chloroplasts. It has been shown that in pea, chlEF-G activity is light regulated (Akkaya and Breitenberger, 1992).

Since chlEF-G is encoded by a nuclear gene, light regulation of its activity may play an important role in coordinating nuclear and chloroplast gene expression in plants. Isolation of genomic and/or cDNA clones which encode the pea and

Arabidopsis thaliana chlEF-G genes will allow us to study how light regulates chlEF-G gene expression, by analyzing chlEF-G transcription during early development in both light- and dark-grown seedlings. Such studies will add to our

understanding of how light regulates chloroplast gene

expression and subsequently chloroplast protein synthesis

during plant growth and development.

In order to isolate the pea and A. thaliana chloroplast

EF-G genes, I chose to generate pea and A. thaliana EF-G

specific PCR fragments which were used as probes to screen the appropriate library for the respective gene.

V.B. Generation and identification of pea and Arabidopsis

thaliana chlEF-G specific PCR fragments.

The approach used to generate both pea and A. thaliana

EF-G specific PCR products, using G3SA, G3SB, RevG4A, and

RevG4B as PCR primers is outlined in Figure 7. This approach

is identical to that described in detail in Chapter III for the generation and identification of Synechocystis 6803 EF-G specific PCR products as well as that described in Chapter IV for the generation and identification of S. cerevisiae mtEF-G specific PCR products.

Electrophoretic analysis of the pea and A. thaliana PCR products produced by this approach indicated that the primers

G3SA, G3SB, RevG4A, and RevG4B, when used as described, amplified numerous regions of genomic DNA (Figure 28). This result was expected due to the fact that G3SA, G3SB, RevG4A, and RevG4B were derived from amino acid sequences which have been shown to be conserved in translocases as well as in the

superfamily of GTP binding and hydrolyzing proteins (Bourne

et al., 1991 and Table 3). Therefore, it was necessary to

identify predominantly EF-G specific PCR products from the

total pool of PCR products.

Potential EF-G specific PCR fragments were initially

identified as those which were of the expected size based on the E. coli EF-G sequence. However, this could not be the only criterion used to identify potential EF-G specific PCR

fragments. For example, it is possible that the PCR primers

flank an intron which may be present in the genomic copy of a gene. If G3SA/B and RevG4A/B flanked an intron, I would expect EF-G specific PCR products, generated using these primers and genomic DNA as a template, to be larger than 234 base pairs, which is the predicted size based on the E. coli sequence. It is also possible that the pea and/or the A. thaliana EF-G genes contain sequences which are not present

in the E. coli sequence. Again, if the PCR primers flanked such a region, I would expect EF-G specific PCR products to be larger than the predicted size. Therefore, potential EF-G specific PCR fragments were further identified as those which hybridized to G1, a third EF-G specific oligonucleotide

(Figure 6). Hybridizing PCR fragments were subcloned and sequenced to confirm that they do indeed encode an EF-G-like peptide.

The amino acid sequence generated from the DNA sequence of the pea EF-G specific PCR product appeared to be rearranged relative to the E. coli EF-G sequence, as shown in

Figure 29. This rearrangement was the result of the presence of an internal EcoRI restriction site in the PCR product.

Since G3SA and G3SB were synthesized with an EcoRI restriction site at their 3'-termini to facilitate cloning, cleavage of the pea EF-G specific PCR product with EcoRI prior to cloning generated two EcoRI fragments, one of which was cloned rearranged relative to the other (Figure 29).
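
How an internal restriction site splits a PCR product before cloning can be shown with a short Python sketch. The function below locates every EcoRI recognition site (GAATTC, cut between G and A on the top strand) and returns the resulting top-strand fragments. The example sequence is hypothetical and carries a single internal site; it is not the pea PCR product, and the function and constant names are assumptions made for this illustration.

```python
from typing import List

ECORI_SITE = "GAATTC"
CUT_OFFSET = 1  # EcoRI cuts after the first G of the site: G^AATTC


def ecori_fragments(sequence: str) -> List[str]:
    """Split a linear top-strand sequence at every EcoRI cut position."""
    sequence = sequence.upper()
    cut_positions = []
    site_start = sequence.find(ECORI_SITE)
    while site_start != -1:
        cut_positions.append(site_start + CUT_OFFSET)
        site_start = sequence.find(ECORI_SITE, site_start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [sequence[start:end] for start, end in zip(boundaries, boundaries[1:])]


# Hypothetical product with one internal EcoRI site (illustration only).
hypothetical_product = "ATGGCTAAAGAATTCCGTACCTTG"
fragments = ecori_fragments(hypothetical_product)
print([len(fragment) for fragment in fragments])

# Ligating the two fragments into a vector in the opposite order yields an
# insert that is rearranged relative to the original product.
rearranged = "".join(reversed(fragments))
print(rearranged != hypothetical_product)
```

Screening a candidate PCR product for internal sites in this way before digestion would indicate whether a single intact insert can be expected.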

The amino acid sequences generated from the DNA sequences of the cloned pea and A. thaliana PCR products are shown aligned in Figure 13 to the N-terminal amino acid sequences of translocases from E. coli, hamster, and

Synechocystis 6803.

It was determined that the pea EF-G-like PCR product encoded a portion of the chlEF-G gene due to the fact that the 5' DNA sequence of this PCR clone was homologous to the

3' DNA sequence of a PCR clone which encodes the N-terminal portion of the pea chlEF-G gene (Figure 30) (Akkaya, 1991;

Akkaya and Breitenberger, 1992). This N-terminal chlEF-G PCR clone was generated using G3SA and an oligonucleotide which corresponds to the N-terminal amino acid sequence of pea chlEF-G, RevPG (Figure 30). The N-terminal amino acid sequence of pea chlEF-G, which was used to generate RevPG, was determined by peptide sequencing by M. Akkaya in our laboratory.

It was subsequently assumed that the A. thaliana EF-G-like PCR product encoded a portion of the A. thaliana chlEF-G gene. This assumption was based on the strong homology between the amino acid sequences generated by the A. thaliana EF-G-like PCR product and the pea chlEF-G PCR product (Figure 13). However, it has not been directly demonstrated that the

A. thaliana EF-G-like PCR product encoded a portion of the A. thaliana chlEF-G gene.

V.C. Pea and Arabidopsis thaliana Genomic Southern analyses.

The results of a pea genomic Southern, which was probed with the pea chlEF-G PCR product, are shown in Figure 31. The results of an A. thaliana genomic Southern, which was probed with the A. thaliana chlEF-G PCR product, are shown in Figure

32. The results of the pea genomic Southern, which indicated that the pea PCR product hybridized to two genomic EcoRI fragments, were predicted due to the presence of an EcoRI restriction site in the pea chlEF-G PCR clones as discussed previously (Figure 29).

V.D. Pea northern analysis.

As discussed previously, it has been shown that in pea, chlEF-G activity is light regulated (Akkaya and

Breitenberger, 1992). To study light-dependent regulation of pea chlEF-G, total RNA was isolated from pea seedlings grown under various light conditions for different lengths of time.

Standard northern analysis was performed using the cloned pea chlEF-G PCR fragment as a probe. The preliminary results obtained from this experiment indicate that in pea, chlEF-G mRNA levels are light regulated (Figure 33).

The observation that the amount of chlEF-G mRNA is greater in light grown samples when compared to dark grown indicates that either synthesis or degradation of chlEF-G mRNA is an important step in the mechanism of light regulation observed for this protein.

V.E. Screening of a pea cDNA library, an Arabidopsis thaliana cDNA library and an Arabidopsis thaliana genomic library.

The cloned A. thaliana chlEF-G PCR product was used to screen an A. thaliana cDNA library as well as an A. thaliana genomic library. The cDNA library was kindly provided by Dr.

Randy Scholl while the genomic library was kindly provided by

Dr. Keith Davis. Both Dr. Scholl and Dr. Davis have isolated positive clones from their respective libraries. I, however, was unable to isolate positive clones from either library using the A. thaliana chlEF-G PCR product as a probe.

The cloned pea chlEF-G PCR product was used as a probe to screen a pea cDNA library which was purchased by our

laboratory from Clontech. Again, I was unable to isolate

positive clones from this library. A possible explanation

for the failure of these experiments will be discussed in the

following section.

V.F. Future plans.

One of the long-term goals of my research was to examine the targeting of cytoplasmically synthesized polypeptides to organelles (mitochondria and chloroplasts) in higher plants, where the problem is made more interesting by the need to accurately direct these proteins to one or the other organelle. Closely related proteins, targeted specifically to both types of organelles, would be the ideal candidates for such a study. Both mitochondria and chloroplasts have closely related protein synthesizing machineries; therefore, the nuclear encoded organellar protein synthesis factors must also be similar. However, the mitochondrial factors must be targeted to the mitochondria while the chloroplast factors must be targeted to the chloroplasts. For this reason, I have attempted to clone and sequence the nuclear encoded organellar protein synthesis elongation factor G genes from pea and A. thaliana. By comparing the transit peptides of these proteins to each other, and to other known chloroplast and mitochondrial transit peptide sequences, I hoped to gain further insight into the mechanism by which nuclear encoded

proteins are targeted to cellular organelles.

One would predict that the experimental design outlined

above would yield PCR products which encode portions of

either the mitochondrial or chloroplast EF-G genes as well as

the cytoplasmic elongation factor, EF-2. The organellar EF-Gs

should be highly homologous to E. coli EF-G and I had

planned to use this similarity to distinguish organellar PCR

products from cytoplasmic PCR products.

Differentiating between the chloroplast and mitochondrial EF-G specific PCR products should be an easy matter. I would expect a higher degree of amino acid

identity between chloroplast and bacterial elongation factors than between chloroplast and mitochondrial sequences. This assumption is based on the fact that all known chloroplast elongation factors [A. thaliana chlEF-Tu (Baldauf and Palmer,

1990); Chlamydomonas reinhardtii chlEF-Tu (Watson and

Surzycki, 1982); Euglena gracilis EF-Tu (Montandon and Stutz,

1983)] are more homologous to the respective bacterial

sequence (Shibuya et al., 1979; Miyajima et al., 1979) than are the known mitochondrial elongation factors [Saccharomyces cerevisiae EF-Tu (Nagata et al., 1983) and EF-G (Vambutas et al., 1991; Welcsh et al., 1992)] to their respective bacterial counterparts (Zengel et al., 1984; Shibuya et al.,

1979; Miyajima et al., 1979).

Regulation of the organellar elongation factors should also differ and an analysis of the pattern of regulation could be used to distinguish the chlEF-G PCR products from the mtEF-G products. For example, chloroplast development requires the coordinated expression of both plastid and nuclear encoded proteins. Light regulation of nuclear encoded chloroplast proteins appears to play an important role in the development of photosynthetic competence in higher plants. One of the most thoroughly studied, nuclear encoded, light regulated chloroplast proteins is the small subunit of chloroplast ribulose-1,5-bisphosphate carboxylase (rbcS). It has been shown that light stimulates the accumulation of rbcS in pea leaves. Accumulation of rbcS has been shown to be the direct result of an increase in the amount of translatable rbcS mRNA in polysomes (Bedbrook et al., 1980).

The results of Akkaya and Breitenberger (1992) suggest that in pea, chlEF-G activity is light regulated. If chloroplast elongation factors are regulated at the level of transcription like rbcS, I would have expected chlEF-G PCR products to detect an increase in chlEF-G mRNA levels in seedlings grown in the light relative to those grown in the dark. However, it is also possible that chlEF-G is regulated at the translational level. If this is the case, I would have expected chlEF-G PCR products to detect relatively constant levels of chlEF-G mRNA in light- and dark-grown seedlings.

Changes in the mitochondria of photosynthetic organisms in response to light have also been documented and could be used if necessary to differentiate the mtEF-G PCR products

from the chlEF-G PCR products. For example, it has been

shown that exposure of dark grown resting Euglena to light results in a decrease in mitochondrial area (Shatz et al.,

1975) as well as a decrease in mitochondrial lipids and proteins including mtEF-G (Shatz et al., 1975; Beale et al.,

1981; Corriveau and Beal, 1986; Foley et al., 1982; Eberly et al., 1986). It has also been shown that during the development of shoots, leaves, and roots in embryonic pea plants, the number of mitochondrial genomes per cell either decreases or remains the same (Lamppa and Bendich, 1984).

Lamppa and Bendich were also able to show that the percentage of mtDNA relative to embryonic levels, decreases in both dark- and light-grown leaves. If mitochondrial biogenesis, like chloroplast biogenesis, requires the coordinated expression of both mitochondrial and nuclear encoded genes, and these genes are regulated at the level of transcription,

I would have expected mtEF-G specific PCR fragments to detect constant, or slightly decreased, levels of mtEF-G mRNA in light-grown plants relative to dark-grown. However, if mtEF-G

is regulated posttranscriptionally, I would expect mtEF-G mRNA levels to remain constant in both light- and dark-grown seedlings.

If the identity of the EF-G specific PCR products was still in question, I should have been able to distinguish the mitochondrial factors from the chloroplast factors based on

DNA sequence information generated from their respective

genes. For example, I would expect the chloroplast genes to

encode transit peptides which are similar in sequence to

other known chloroplast transit peptides while the mitochondrial genes should have encoded transit peptides which are similar in sequence to other mitochondrial transit peptides. Moreover, I would have been able to differentiate between the chloroplast and mitochondrial factors by showing

import of the chloroplast elongation factor gene products, but not the mitochondrial factors, into chloroplasts in vitro

(Mullett and Chua, 1983; Grossman et al., 1982). Ultimately,

if the identity of any of the EF-G clones was still in question, it may have been necessary to purify individual elongation factors from their respective organelle and perform peptide mapping and/or sequencing. A purification scheme, which was devised for pea chlEF-G by M. Akkaya in our

laboratory, would have been used to purify A. thaliana chlEF-G

if necessary. Purification schemes and partial purifications have been reported for E. gracilis mtEF-G

(Eberly and Spremulli, 1985) and EF-2 (Breitenberger et al.,

1979) and for wheat germ EF-2 (Lax et al., 1986). From these procedures, I would have been able to devise a purification procedure for both pea and A. thaliana mtEF-G if necessary.

While I was successful in generating and characterizing both pea and A. thaliana chlEF-G PCR products, I was unable to isolate clones from either pea or A. thaliana libraries using these PCR products as probes. I have also tried twice, albeit unsuccessfully, to clone portions of the pea and A. thaliana chlEF-G genes by isolating size selected fragments which correspond to the fragments which hybridize to the appropriate PCR product in genomic Southerns. I have also attempted, again unsuccessfully, to clone the A. thaliana chlEF-G gene using inverse PCR. One possible explanation for the failure of these experiments may be that plant organellar

EF-Gs, if expressed, are toxic to E. coli. One can imagine a scenario in which organellar EF-Gs, synthesized at a low level in cells harboring the recombinant clone, bind to E. coli ribosomes in such a manner as to render them nonfunctional thereby halting protein synthesis in the host cell. While purified organellar EF-Gs, which lack a transit peptide, are fully functional on bacterial ribosomes, EF-Gs which contain a transit peptide may be fatal to bacterial ribosomes. It is also possible that the synthesis of truncated forms of organellar EF-Gs is toxic to the host.

These truncated polypeptides may also halt host protein synthesis by lethally binding to bacterial ribosomes. In support of the hypothesis that organellar EF-Gs are toxic to host cells harboring organellar EF-G clones, a group in

Switzerland, which has successfully cloned the soybean chlEF-Tu

gene, has attempted to clone the soybean chlEF-G gene and has, to date, been unsuccessful in this endeavor (Spielmann, personal communication). I have also been unable to subclone certain fragments of the S. cerevisiae mtEF-G gene that was

isolated in our laboratory. Therefore, nonconventional

cloning methods may need to be employed in order to isolate

clones containing plant organellar EF-G genes.

Cloning vectors, which have been engineered so that

cloned sequences are not transcribed, are currently being

employed in our laboratory to assist in the cloning of the potentially toxic pea and A. thaliana chlEF-G genes.

CHAPTER VI

Localization of matrix associated regions (MARs) and

topoisomerase II cleavage sites in the human β-globin gene

cluster.

VI.A. Introduction

VI.A.1. Organization of the human β-like and α-like globin gene clusters and their expression during development.

Human hemoglobin (Hb) is a tetrameric protein which consists of two α-like and two β-like globin polypeptide chains. The human β-globin gene cluster is approximately 60 kb in length, is located on the short arm of chromosome 11, and consists of five expressed genes, ε, Aγ, Gγ, δ and β, and one pseudogene, ψβ (Fritsch et al., 1980; Gusella et al.,

1979; Lebo et al., 1979; Sanders, et al., 1980) (Figure 34).

The α-like globin gene cluster is approximately 30 kb in length, is located on the short arm of chromosome 16, and consists of three expressed genes, ζ, α2 and α1, and three pseudogenes, ψζ, ψα1 and ψα2 (Bunn et al., 1986; Hardison et al., 1986) (Figure 34).

The human globin genes undergo an orderly program of expression during prenatal and postnatal life. In early fetal development, the β-like ε gene and the α-like ζ gene are expressed in the yolk sac of the embryo. As fetal development continues, the ε and ζ genes are turned off and are replaced by the expression of the γ and α2 genes in the liver and spleen. The γ-globin genes, which encode two distinct polypeptides, Aγ and Gγ, differ at codon 136. The Aγ-gene encodes an alanine at this position whereas the Gγ-gene encodes a glycine. The α2 and γ genes continue to be expressed throughout the remainder of fetal development.

Shortly before birth, the production of the γ polypeptides decreases and production of [...] are nonallelic in that they are present on the same chromosome and code for the same protein.

VI.A.2. Molecular characterization of the β-globin gene cluster.

The entire β-globin gene cluster has been cloned (Fritsch et al., 1980) and sequenced (Efstratiadis et al., 1980; Poncz et al., 1983; Lawn et al., 1980; Spritz et al., 1980; Slightom et al., 1980). The globin genes have been found to be structurally similar in that they all consist of three exons which are separated by two introns. The first intron is characteristically smaller (110-130 bp) than the second intron (600-900 bp). The introns have the dinucleotides GT and AG at their respective 5' and 3' boundaries. These dinucleotides have been shown to be required for proper splicing. The introns also contain sequences near their 3' end that are believed to be involved in the formation of a branched RNA intermediate in nuclear splicing (Ruskin et al., 1984; Keller et al., 1984). Other conserved sequences which are found in the β-globin cluster include the "ATA" box, which is located approximately 30 nucleotides 5' of the CAP site. This sequence has been shown to be important for accurate transcription initiation. A "CCAAT" box, which is located approximately 70 nucleotides 5' of the CAP site, has been identified and is believed to be important for efficient transcription of the globin genes (Dierks et al., 1981; Grosveld et al., 1982).

The β-globin cluster also contains a number of repetitive sequences. Nine copies of the Alu I repeat have been found within the β-globin gene cluster. Alu I repeats are approximately 300 bp in length and are present in approximately 300,000 copies per haploid human genome. Three copies of the Alu I repeat have been localized 5' to the ε-globin gene, one 5' to the Gγ-globin gene, one 5' to the [illegible]-globin gene, two 5' to the δ-globin gene and two 3' to the β-globin gene (Fritsch et al., 1980; Coggins et al., 1980).

Two copies of the Kpn I family of repetitive elements have also been identified in the β-globin cluster. These sequences, which are usually 1.5-6 kb in length (Shafit-Zagardo et al., 1982), are located 5' to the δ-globin gene (Jagadeeswaran et al., 1981) and 3' to the β-globin gene (Kaufman et al., 1980; Adams et al., 1980).

VI.A.3. Characterization of large deletion mutations in the β-globin gene cluster.

Thalassemia is a genetic defect in humans which is characteristically the result of altered synthesis of the α-like or β-like chains of the normal hemoglobin tetramer. Molecular characterization of a number of thalassemias has shown that they are often the result of a wide variety of mutations which include point mutations as well as deletions ranging in size from 100 bp to 100 kb (Bunn et al., 1986) (Figure 35). Two large deletion thalassemias (thals), γδβ-thal 1 and γδβ-thal 2, have been characterized and were shown to be the result of deletions which encompass 99.6 and 95.4 kb, respectively. γ[...]like globin genes except for the adult β-globin gene; likewise, γδβ-thal 2 begins in the β-globin gene and removes all of the β-like globin genes (Vanin et al., 1983; Taramelli et al., 1986).

Another interesting mutant phenotype of the β-globin gene cluster which, in some cases, is the result of a deletion is hereditary persistence of fetal hemoglobin (HPFH). HPFH is a clinically benign and genetically heterogeneous condition which is characterized by the continued expression of fetal hemoglobin in adult life. Three deletions responsible for the HPFH phenotype have been characterized. The deletions associated with HPFH-1 and HPFH-2 appear to be approximately 105 kb in length (Collins et al., 1987) whereas HPFH-3 is the result of a 48.5 kb deletion which begins 3' to the Aγ-gene and continues in the 3' direction, deleting the δ- and β-globin genes (Henthorn et al., 1986) (Figure 35). The HPFH-1 deletion begins approximately 4 kb 5' to the δ-globin gene and extends for approximately 105 kb in the 3' direction. HPFH-2 begins approximately 5 kb 5' of the HPFH-1 5' deletion break point and continues in the 3' direction for approximately 105 kb (Vanin et al., 1983) (Figure 35).

Vanin and colleagues have compared the HPFH-1 and -2 deletions to the two large deletion thalassemias (γδβ-thal 1 and γ[...] large, encompassing approximately 100 kb each. They also have been shown to be the result of nonhomologous breakage and reunion events (Vanin et al., 1983). Furthermore, the 5' break points of γδβ-thal 1 and γδβ-thal 2 as well as the 5' break points of HPFH-1 and HPFH-2 are located approximately the same distance apart and in the same order along the chromosome as their respective 3' ends (Figure 36). For example, for [...] similar relationship is observed for their 3' break points. The same is true for HPFH-1 and HPFH-2. The HPFH-2 deletion has both its 5' and 3' ends shifted approximately 5 kb in the 5' direction relative to HPFH-1; however, the total amount of DNA lost in HPFH-2 is within 1 kb of the amount lost in HPFH-1 (Figure 36).
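
The staggered relationship described above can be restated as a small calculation. The following Python sketch is illustrative only: the coordinates are hypothetical values chosen to match the approximate figures quoted in the text (a deletion of about 105 kb and a shift of about 5 kb), and the `Deletion` class and `is_staggered` helper are names introduced here for illustration, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deletion:
    """A deletion described by its two break points on an arbitrary kb scale."""
    name: str
    five_prime_kb: float   # position of the 5' break point (hypothetical)
    three_prime_kb: float  # position of the 3' break point (hypothetical)

    @property
    def length_kb(self) -> float:
        return self.three_prime_kb - self.five_prime_kb


def is_staggered(a: Deletion, b: Deletion, tolerance_kb: float = 1.0) -> bool:
    """Return True when both ends of b are shifted in the same direction
    relative to a and the two deletions differ in length by at most
    tolerance_kb."""
    shift_5 = b.five_prime_kb - a.five_prime_kb
    shift_3 = b.three_prime_kb - a.three_prime_kb
    same_direction = shift_5 * shift_3 > 0
    similar_length = abs(a.length_kb - b.length_kb) <= tolerance_kb
    return same_direction and similar_length


# Hypothetical coordinates: the HPFH-1 5' break point is placed at 0 kb,
# and negative values lie 5' of it.
hpfh1 = Deletion("HPFH-1", 0.0, 105.0)
hpfh2 = Deletion("HPFH-2", -5.0, 100.5)

if __name__ == "__main__":
    print(hpfh2.length_kb, is_staggered(hpfh1, hpfh2))
```

Under these assumed coordinates both ends of HPFH-2 lie 5' of the corresponding HPFH-1 ends while the two deletion lengths differ by less than 1 kb, which is the pattern the comparison in the text describes.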

VI.A.4. A possible mechanism for the generation of particular deletions in the human β-globin gene cluster.

In order to describe how two distant sequences are brought within close proximity to one another so as to facilitate a nonhomologous recombination event, Vanin and colleagues have proposed the following mechanism, which is illustrated schematically in Figure 37. This mechanism links the production of the large deletions in the β-globin cluster (thal-1, thal-2, HPFH-1, and HPFH-2) to DNA replication. It predicts that specific segments of DNA are attached to the nuclear matrix. If, during DNA replication, a nonhomologous breakage and reunion event occurs near these attachment points, it would result in the loss of a complete chromatin loop and would thereby generate a characteristically large deletion. Variation between deletion break points may be attributed to the stage of replication at the time of the breakage and reunion event (Figure 37). As shown in Figure 37, the breaks which gave rise to the deletions occurred at the same position relative to the nuclear attachment site but at different times during DNA replication. A nonhomologous reunion event at the break points would result in the loss of the intervening DNA. Therefore, this model could account for the generation of a series of deletions (γ[...] HPFH-2) each of which has lost approximately the same amount of DNA.

A number of other deletions in the globin gene clusters have been characterized and may also be the result of nonhomologous breakage and reunion events. For example, the Chinese Aγ[...] (Figure 35). A similar observation has been made for a group of deletions in the α-globin gene cluster. In the α-globin deletions, the breakpoints are staggered similarly to the large β-globin deletions; however, the amount of DNA deleted is approximately 20-30 kb (Nicholls et al., 1987).

This model predicts that the amount of DNA deleted is equivalent to the size of a chromatin loop or some multiple thereof. Mammalian DNA loops have been shown to range in size from 30 kb to 200 kb with an average DNA loop being approximately 80 kb in length (Laemmli et al., 1978). Therefore, the size of the four deletions (γ[...]

The mechanism proposed by Vanin et al. predicts a strong relationship between DNA replication and the production of large deletions in the globin clusters. This mechanism may also be responsible for the generation of some of the small globin deletions. Rene Anand has characterized a β-thalassemia which is the result of a 1.4 kb deletion (Anand, 1989) (Figure 35). He compared the deletion break points of the 1.4 kb thal to two other small deletion thalassemias in the β-globin gene cluster, 4.3 kb thal and 0.6 kb thal. These deletions, which are the result of nonhomologous breakage and reunion events, may have been generated during DNA replication if one of the break points is in close proximity to a matrix attachment site.

The model for the generation of small deletions in the β-globin cluster, proposed by Anand et al., is illustrated schematically in Figure 38. This model predicts that one of the chromosomal breaks occurs at the base of a chromatin loop. The second break occurs at a sequence which may cause DNA polymerase α to pause. Sequences which arrest DNA polymerase α activity may be more likely than others to break during DNA replication due to increased torsional strain. DNA polymerase α pause sequences have been identified by Weaver and DePamphilis (1982) and are believed to contain the consensus sequence (T/A)GGAG. Rene Anand identified a sequence similar to this consensus at the appropriate break points in the 1.4 kb, 4.3 kb, and 0.6 kb thals. Based on this model and the characterization of the deletion break points for the 1.4 kb, 4.3 kb and 0.6 kb thals, Anand predicted that a matrix associated region is located in the second intron of the human β-globin gene (β-IVS2) (Figure 39).
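
A search for exact matches to the (T/A)GGAG consensus can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in the work described here: the function names and the example sequence are hypothetical, only exact matches are reported (whereas the text refers to sequences "similar to" the consensus), and both strands are scanned on the assumption that a pause site may lie on either one.

```python
import re

# Lookahead pattern so that overlapping matches to (T/A)GGAG are all reported.
PAUSE_SITE = re.compile(r"(?=([TA]GGAG))")
MOTIF_LENGTH = 5
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_pause_sites(seq: str):
    """Return sorted (position, strand) pairs for exact consensus matches.

    Positions are 0-based and always refer to the leftmost base of the
    match on the forward strand, whichever strand carries the motif.
    """
    seq = seq.upper()
    hits = [(m.start(), "+") for m in PAUSE_SITE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in PAUSE_SITE.finditer(rc):
        # Convert a reverse-strand coordinate to a forward-strand coordinate.
        hits.append((len(seq) - m.start() - MOTIF_LENGTH, "-"))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the β-globin cluster.
    print(find_pause_sites("CCTGGAGAACTCCAGG"))
```

Reporting every hit in forward-strand coordinates makes it straightforward to compare match positions with mapped break points, which is the comparison the model calls for.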

Characteristic regions of DNA have been postulated to be associated with DNA replication. These include matrix associated regions (MARS) and topoisomerase II cleavage sites. Evidence which supports the association of these sites with eukaryotic DNA replication includes the fact that DNA replication has been shown to occur at the nuclear matrix, specifically at the base of the chromatin loop. The DNA replication machinery has also been found to be bound to the nuclear matrix. In some eukaryotic organisms, there is an association between replicon size and loop size. Finally, topoisomerase II seems to be associated with the nuclear matrix, since it has been found at the base of DNA loops in mitotic chromosomes as well as in nuclear matrix preparations.

VI.A.5. Evidence for a functional relationship between chromosomal loop matrix attachment sites and the regulation of torsional stress in chromatin domains by topoisomerase II.

The fact that transcriptionally poised chromatin is under torsional stress is supported by several lines of evidence. For example, nuclease or chemical probes often selectively react with actively transcribed regions of DNA which are also the preferred targets in supercoiled, but not relaxed, naked DNA (Larson and Weintraub, 1982; Glikin et al., 1983; Kohwi-Shigematsu et al., 1983; Weintraub, 1983; Selleck et al., 1984; Han et al., 1984). In Xenopus oocytes, circular, but not linearized, recombinant plasmids are transcriptionally active (Mertz, 1982; Harland et al., 1983; Pruitt and Reeder, 1984). It has also been shown that minichromosomes enriched in transcriptionally active DNA possess torsionally unconstrained supercoiled DNA (Luchnik et al., 1982; Ryoji and Worcel, 1984, 1985; Kmiec and Worcel, 1985). Finally, novobiocin, an inhibitor of topoisomerase II activity, is also capable of arresting or reversing heat shock induction (Han et al., 1985), the preferential DNAase I sensitivity of active chromatin domains (Villeponteau et al., 1984), and the assembly and maintenance of torsionally stressed chromatin (Glikin et al., 1984; Ryoji and Worcel, 1984; Kmiec and Worcel, 1985). Therefore, it has been postulated that one function of topoisomerase II is to introduce torsional stress into transcriptionally poised chromatin (Glikin et al., 1984; North, 1985).

In order for topoisomerase II to introduce torsional stress into transcriptionally active chromatin, the linear chromosomal DNA molecules must be anchored to the nuclear matrix so that free rotation is impeded. In both interphase nuclei and metaphase chromosomes, DNA appears to be organized into large, supercoiled loops or domains of approximately 50-100 kb, the bases of which are attached to the nuclear matrix or scaffold (Benyajati and Worcel, 1976; Cook and Brazell, 1976; Paulson and Laemmli, 1977; Vogelstein, 1980). It also appears that particular DNA sequences in these loops are nonrandomly organized. For example, the nontranscribed spacer regions of Drosophila hsp70 and histone gene repeats are localized at the bases of chromosomal loops (Mirkovitch et al., 1984). Furthermore, topoisomerase II, which has been shown to be a major component of interphase nuclei (Berrios et al., 1986) and mitotic chromosomes (Earnshaw et al., 1985), also appears to be localized at the base of DNA loops in mitotic chromosomes (Earnshaw and Heck, 1985). These observations suggest a functional link between chromosomal loop attachment sites and the regulation of torsional stress in particular chromatin domains. Evidence which supports this prediction includes the fact that regions of loop anchorage shown for Drosophila genes (Mirkovitch et al., 1984) also contain strong topoisomerase II cleavage sites (Udvardy et al., 1985). Topoisomerase II also appears to be localized adjacent to the SV40 enhancer element in SV40 minichromosomes (Yang et al., 1985), and gene-specific gyration has been demonstrated, in response to a trans-acting factor, in Xenopus extracts (Kmiec and Worcel, 1985).

VI.A.6. Characteristics of chromosomal loop attachment sites.

The DNA sequences that remain tightly associated with the nuclear matrix or scaffold in vitro after nuclease digestion and extraction have been termed matrix associated regions (MARS) or scaffold attachment sites (SARS) (Gasser and Laemmli, 1987; Phi-Van and Stratling, 1990). MARS are commonly found at the boundaries of transcription units where they may mark the ends of an active chromatin domain (Mirkovitch et al., 1984; Gasser and Laemmli, 1986; Udvardy et al., 1985; Levy-Wilson and Fortier, 1990; Bode and Maass, 1988; Dijkwel and Hamlin, 1988; Jarman and Higgs, 1988; Phi-Van et al., 1990; Phi-Van and Stratling, 1990). MARS are also often located close to or within known enhancer-like regulatory elements (Gasser and Laemmli, 1987; Phi-Van and Stratling, 1990; Mirkovitch et al., 1984; Gasser and Laemmli, 1986; Udvardy et al., 1985; Levy-Wilson and Fortier, 1990; Bode and Maass, 1988; Dijkwel and Hamlin, 1988; Jarman and Higgs, 1988; Phi-Van et al., 1990; Stratling et al., 1986; Cockerill and Garrard, 1986; Cockerill et al., 1987) (Figure 40). Reporter genes that have stably integrated into genomic DNA and are flanked by particular MARS show position-independent, copy number-dependent expression (Phi-Van et al., 1990; Stief et al., 1989) and increased transcriptional activity (Phi-Van et al., 1990; Stief et al., 1989; Mielke et al., 1990; Klehr et al., 1991). Furthermore, MAR-like A+T-rich sequences isolate the regulatory influences of adjacent sequences (Kellum and Schedl, 1991).

Sequences which contain MARS are usually A+T-rich (70%) and often contain topoisomerase II consensus sequences (Gasser and Laemmli, 1987; Phi-Van and Stratling, 1990) (Figure 40). MARS often demonstrate structural features including bending (Homberger, 1989; von Kries et al., 1990), a narrow minor groove which is the direct result of oligo(dA) tracts (Adachi et al., 1989), and single-strandedness (Probst and Herzog, 1985). Recent analysis of topoisomers containing the MARS flanking the immunoglobulin heavy chain gene (IgH) enhancer has revealed that these MARS are capable of relieving superhelical strain by remaining stably unpaired (Kohwi-Shigematsu and Kohwi, 1990).
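
The A+T-richness criterion lends itself to a simple sliding-window screen. The Python sketch below is offered only as an illustration of how candidate A+T-rich stretches might be flagged in a known sequence; the window size, step and function names are assumptions made here, the 70% threshold is the approximate figure quoted above, and A+T content alone does not establish that a sequence is a MAR.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 200, step: int = 50,
                    threshold: float = 0.70):
    """Return (start, fraction) for each window whose A+T fraction is at
    least threshold. Window size and step are illustrative defaults."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        fraction = at_fraction(seq[start:start + window])
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits


if __name__ == "__main__":
    # Hypothetical sequence: an A+T-only half followed by a G+C-only half.
    example = "AT" * 50 + "GC" * 50
    print(at_rich_windows(example, window=100, step=50))
```

Windows flagged in this way would still need to be tested experimentally, for example in a matrix binding assay, before being called matrix associated regions.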

VI.A.7. Objectives.

The objectives of the research described here include the identification and localization of matrix associated regions (MARS) and topoisomerase II cleavage sites near known deletion breakpoints in the β-globin gene cluster. The deletion generation model proposed by Vanin et al. (1983) predicts that matrix associated regions as well as topoisomerase II cleavage sites should flank known deletion break points in the β-globin gene cluster. Two such regions are the second intervening sequence of the β-globin gene (β-IVS2), which was predicted to contain a MAR based on the characterization of small deletion break points in this region by Rene Anand, and a region which contains the 5' deletion break points for γδβ-thal 1 and 2. This region, which was cloned by Vanin et al., is 14 kb in length and was designated AN2.1 (Figure 41). This clone is particularly interesting due to the fact that five nonhomologous breakage and reunion events have occurred within this particular 14 kb region of the β-globin gene cluster. These nonhomologous breakage and reunion events are responsible for two Alu I repeat insertions, the generation of the 5' break points for γ[...] localization of MARS and topoisomerase II sites in these two regions of the β-globin gene cluster (β-IVS2 and AN2.1) would provide further evidence in support of the proposed deletion generation model which intimately links the production of some of the deletion break points in the cluster to DNA replication.

VI.B. Materials and Methods

VI.B.1. Topoisomerase II cleavage reactions.

Topoisomerase II cleavage reactions were performed as described previously (Spitzner and Muller, 1988); also see Figure 42. Briefly, cleavage reactions were performed in a final volume of 20 µl in a standard cleavage buffer which consisted of the following: 30 mM Tris-Cl, pH 7.6, 60 mM KCl, 15 mM β-mercaptoethanol, 8 mM MgCl2, 3 mM ATP, and 30 µg/ml BSA. 0.3-1.0 µg of supercoiled DNA substrate, 1 µl of m-AMSA [4'-(9-acridinylamino)methanesulfon-m-anisidide] (prepared as a stock solution in DMSO at a concentration of 1 mg/ml), and 4 units of purified human or chicken topoisomerase II, which was kindly provided by Dr. Mark Muller, were added to the cleavage buffer. The reactions were incubated for 20 minutes at 30° C. The reactions were terminated by addition of 40 µl of 1.5% (v/v) SDS followed by digestion with 1 µl of Proteinase K (prepared as a stock solution in ddH2O at a concentration of 10 mg/ml) for 30 minutes at 45-50° C. Following phenol extraction and ethanol precipitation, the DNAs were digested with SalI, which generated linear fragments. These fragments were separated by conventional gel electrophoresis and transferred to nitrocellulose membranes using standard techniques (Maniatis et al., 1982).
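
The concentrations quoted for the cleavage reaction can be checked with the usual dilution relationship (C1 x V1 = C2 x V2). The Python sketch below is only a worked illustration: the helper names are introduced here, the 1 M Tris-Cl stock is a hypothetical example that is not stated in the text, and the m-AMSA calculation assumes that the 20 µl final volume includes the added drug.

```python
import math


def stock_volume_ul(final_conc: float, stock_conc: float,
                    final_volume_ul: float) -> float:
    """Volume of stock (µl) needed to reach final_conc in final_volume_ul.

    Both concentrations must be expressed in the same unit.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated")
    return final_conc * final_volume_ul / stock_conc


def final_concentration(stock_conc: float, added_volume_ul: float,
                        final_volume_ul: float) -> float:
    """Concentration reached after diluting added_volume_ul of stock."""
    return stock_conc * added_volume_ul / final_volume_ul


REACTION_VOLUME_UL = 20.0  # final volume given in the text

if __name__ == "__main__":
    # Hypothetical 1 M (1000 mM) Tris-Cl stock diluted to 30 mM.
    print(stock_volume_ul(30.0, 1000.0, REACTION_VOLUME_UL))
    # m-AMSA: 1 µl of a 1 mg/ml (1000 µg/ml) stock in the 20 µl reaction.
    print(final_concentration(1000.0, 1.0, REACTION_VOLUME_UL))
```

Under the stated assumption, 1 µl of the 1 mg/ml m-AMSA stock corresponds to a final concentration of about 50 µg/ml in the cleavage reaction.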

Topoisomerase II cleavage sites were identified by probing the filters with the appropriate clone. Briefly, the filters were prehybridized in 6 X SSC, 5 X Denhardt's for 30 minutes at 65° C. DNA fragments were random primer labelled with α-32P-dCTP, denatured by adding an equal volume of 0.4 N NaOH and incubating at room temperature for 15 minutes, neutralized by adding an equal volume of 0.4 N HCl, and added to the prehybridization solution at a concentration of 1 X 10^5 cpm/ml. Hybridizations were performed at 65° C for 12-16 hours. Following hybridization, the filters were washed 2 X 30 minutes in 3 X SSC, 0.5% SDS at 68° C. The filters were exposed to Kodak X-ray film for 12-16 hours at -70° C.

VI.B.2. Preparation of human nuclear matrices.

Human nuclear matrices were prepared from HeLa cells and human placental tissue as described previously (Cockerill and Garrard, 1986). Briefly, human placental tissue was homogenized in RSB-0.25 M sucrose (RSB: 10 mM NaCl, 3 mM MgCl2, 10 mM Tris-HCl, 0.5 mM PMSF, pH 7.4) and filtered through cheesecloth. Similarly, cultured HeLa cells were washed once in phosphate-buffered saline, suspended in RSB, incubated on ice for 10 minutes, and then homogenized with a Dounce homogenizer. The nuclei (placental and HeLa cell) were pelleted, washed twice in RSB-0.25 M sucrose, suspended in RSB-2 M sucrose and centrifuged through a cushion of RSB-2 M sucrose at 24,000 rpm for 30 minutes in an SW27 rotor. Isolated nuclei were then washed once in RSB-0.25 M sucrose.

Nuclear matrices were prepared from isolated nuclei as previously described (Cockerill and Garrard, 1986). Briefly, nuclei were digested with 100 µg/ml DNAase I for 1-2 hours at 23° C. Following DNAase I digestion, the nuclei were incubated in a high salt buffer (4 M NaCl, 20 mM EDTA, and 20 mM Tris-HCl, pH 7.4). The nuclei were then extracted twice with a solution consisting of 2 M NaCl, 10 mM EDTA, 10 mM Tris-HCl, 0.5 mM PMSF, and 0.25 mg/ml BSA, pH 7.4. The matrices were washed with RSB-0.25 M sucrose and 0.25 mg/ml BSA, resuspended in the same solution, and stored at -20° C after combining with an equal volume of glycerol.

VI.B.3. Assay of DNA binding to nuclear matrices.

DNA binding assays were performed essentially as described previously (Cockerill and Garrard, 1986); also see Figure 43. Briefly, end-labeled DNA was incubated with approximately 1 X 10^7 nuclear matrices in the presence of 0.5-2.0 µg of unlabeled competitor DNA, usually sonicated E. coli DNA. After incubation on a shaker for 1-2 hours at 23° C, the matrices were pelleted, washed, solubilized in 0.5% SDS, and treated for 12-16 hours with proteinase K. Following proteinase K digestion, the DNA was phenol/chloroform extracted and EtOH precipitated. The resulting purified matrix-bound DNA fragments were resolved on 1% agarose gels. The gels were dried on 3MM Whatman paper and exposed to Kodak X-ray film for 12-48 hours at -70° C.

The mouse κ immunoglobulin gene MAR, identified by Cockerill and Garrard, was used as a control to ensure that nuclear matrix preparations were intact. The mouse κ gene MAR is contained within the 2.85 kb BamHI fragment inserted in clone pG19/45 (Figure 44). This clone was kindly provided by Dr. William Garrard. Results of a nuclear binding assay using the mouse κ gene MAR as a control are shown in Figure 45. These results indicate that increasing the amount of competitor DNA rapidly decreases the degree of nonspecific vector binding while higher levels of competitor DNA are required to compete with the specific binding of the MAR-containing fragment.

VI.C. Results.

VI.C.1. Identification of matrix associated regions in the β-globin gene cluster.

The model proposed by Anand for the generation of the 3' break points for the 4.3 kb and 1.4 kb thalassemias and the 5' break point of the 0.6 kb thalassemia predicts that these break points border a matrix associated region. Likewise, the model proposed by Elio F. Vanin predicts that a matrix associated region should be in the vicinity of the 5' break points of the large β-globin deletion thalassemias (γδβ-thal 1 and 2). To test these hypotheses, I determined whether these regions (β-IVS2 and AN2.1) could bind nuclear matrices in vitro.

The in vitro nuclear binding assay used in these studies is diagrammed schematically in Figure 43. In vitro assays were performed by incubating human placental matrices with end-labeled DNA fragments in the presence of various concentrations of competitor DNA (sheared E. coli DNA). The mouse immunoglobulin κ-gene MAR, which was kindly provided by Dr. William Garrard, was used as a control (Cockerill and Garrard, 1986) (Figure 44). Results of control in vitro MAR assays, which were performed using the mouse immunoglobulin κ-gene MAR, are shown in Figure 45.

In order to determine the location of the β-IVS2 MAR, a 0.9 kb Bam HI-Eco RI fragment, which contains all of β-IVS2, and a 1.8 kb Bam HI fragment, which lies immediately 5' of the 0.9 kb Bam HI-Eco RI fragment and contains all of exons 1 and 2 as well as β-IVS1, were used in in vitro binding assays. The results of these assays are shown in Figure 46 and are summarized in Table 10. The results were analyzed by normalizing the binding of the test fragments to that of the plasmid vector, which was given a value of 1. This analysis indicated that both the 0.9 kb Bam HI-Eco RI fragment and the 1.8 kb Bam HI fragment had a greater affinity for nuclear matrices than did the plasmid vector. However, the 0.9 kb Bam HI-Eco RI fragment had a slightly greater affinity for nuclear matrices than did the 1.8 kb Bam HI fragment, as indicated by the fact that the normalized ratio for the 0.9 kb Bam HI-Eco RI fragment continues to increase at greater concentrations of competitor DNA while the normalized ratio for the 1.8 kb Bam HI fragment decreases at greater concentrations of competitor DNA. These results indicate that both fragments are capable of binding nuclear matrices in vitro; however, they do so with different affinities. Rene Anand has shown that both of these fragments are also capable of binding to the nuclear matrix in vivo (Anand, 1989). Based on the in vitro results presented here as well as the in vivo results obtained by Anand, it was concluded that the matrix associated region begins upstream of the Bam HI site located in the second exon of the β-globin gene and extends into the second intron of the β-globin gene (Figure 47). Similar results to those presented here have been obtained by others (Higgs et al., 1988).
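
The normalization used in this analysis, in which vector binding is assigned a value of 1, can be written out explicitly. The Python sketch below is one plausible reading of that calculation and is not taken from the original work: it assumes that band intensities for the bound and input lanes are available as numbers, and the function name and example values are hypothetical.

```python
import math


def normalized_ratio(fragment_bound: float, fragment_input: float,
                     vector_bound: float, vector_input: float) -> float:
    """Binding of a test fragment relative to the plasmid vector.

    Each bound signal is first divided by its input signal; the fragment
    value is then divided by the vector value, so the vector itself
    scores exactly 1.
    """
    if fragment_input <= 0 or vector_input <= 0:
        raise ValueError("input signals must be positive")
    vector_fraction = vector_bound / vector_input
    if vector_fraction == 0:
        raise ValueError("vector binding must be nonzero to normalize")
    return (fragment_bound / fragment_input) / vector_fraction


if __name__ == "__main__":
    # Hypothetical band intensities at one competitor concentration.
    print(normalized_ratio(500.0, 1000.0, 250.0, 1000.0))
```

A ratio above 1 at a given competitor concentration indicates that the fragment is retained by the matrices more strongly than the vector, and following the ratio across competitor concentrations gives the kind of comparison between fragments that is described above.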

To test the hypothesis that a matrix associated region is located near the 5' break points of the large deletion thalassemias (γ[...] normal DNA of this region (Figure 41), was digested with Eco RI. The seven Eco RI fragments were subcloned into the vector pAT153. Each of these subclones was tested in vitro to determine if it contained a matrix associated region. The results of these assays are shown in Figure 48 and are summarized in Table 11. Briefly, the 0.9 kb Eco RI fragment, which contains the 5' deletion break point for γδβ-thal 1, and the 1.8, 1.3 and 0.8 kb Eco RI fragments, which are adjacent to one another and are located > 2 kb 3' of the 5' deletion break point for γ[...] being located in the 0.9 kb Eco RI fragment while the other appears to span the region defined by the 1.8, 1.3 and 0.8 kb Eco RI fragments (Figure 49).

VI.C.2. Identification and localization of topoisomerase II sites in the β-globin gene cluster.

Topoisomerase II is an enzyme that transiently breaks and rejoins double stranded DNA in the presence of ATP. As discussed previously, topoisomerase II is a major component of the nuclear matrix and appears to be localized to the base of DNA loops in mitotic chromosomes, and matrix associated regions of DNA often contain topoisomerase II cleavage sites (Figure 40). These observations suggest a functional link between matrix attachment sites and the modulation of torsional stress by topoisomerase II. The role that topoisomerase II may play in recombination in mammalian cells has not been determined. However, calf thymus topoisomerase II has been shown to mediate nonhomologous recombination between λ DNA molecules in vitro (Bae et al., 1988).

In order to examine whether some of the β-thalassemia deletions are the result of cleavages generated by topoisomerase II, topoisomerase II cleavage sites were mapped in vitro in a region which contains naturally occurring β-globin deletion break points as well as matrix associated regions.

The in vitro topoisomerase II assay used to map topoisomerase II cleavage sites in the AN2.1 subclones p1.8, p1.3, p1.0 and p0.8 is shown schematically in Figure 42. Both human topoisomerase II, which was kindly provided by Rene Anand, and chicken topoisomerase II, which was kindly provided by Dr. Mark Muller, were used to generate topoisomerase II cleavage fragments. No difference was observed between the human and chicken enzymes either in cleavage site specificity or relative cleavage strength. Topoisomerase II cleavage sites for the AN2.1 EcoRI subclones p1.8, p1.3, p1.0, and p0.8 were determined (Figure 50). The positions of the strongest cleavage sites within the subclones are illustrated schematically in Figure 51.

VI.D.1. Discussion.

The results presented here detail an association between MARs and the breakpoints of some of the many deletions in the β-globin gene cluster. These include the four large β-globin deletions, thal-1, thal-2, HPFH-1, and HPFH-2, for which a model involving the deletion of a chromosomal loop across its base has been proposed (Vanin et al., 1983). However, we have been unable to define a clear association between topoisomerase II cleavage sites and deletion breakpoints. For example, Rene Anand has mapped the topoisomerase II cleavage sites for the 0.9 kb Bam HI-Eco RI fragment which contains all of β-IVS2, as well as for fragments which contain the 3' breakpoint for the 1.4 kb thal, the 5' breakpoint for the 0.6 kb thal, the 3' breakpoint for the 4.3 kb Czech thal, and the 5' breakpoint for the large deletion thalassemia, γδβ-thal 1 (Anand, 1989). He was able to show that topoisomerase II sites do flank the 3' breakpoint for the 1.4 kb thal; however, the 3' breakpoint for the 4.3 kb Czech thal, the 5' breakpoint for the 0.6 kb thal, the 5' breakpoint for γ[...]

The preliminary topoisomerase II cleavage data presented here, however, do not refute the hypothesis that the large β-globin deletion breakpoints characterized in clone AN2.1 are also topoisomerase II cleavage sites. Topoisomerase II cleavage sites were mapped to subclones of AN2.1 which also were shown to contain matrix associated regions. Each fragment tested (p1.8, p1.3, p1.0, and p0.8) contained numerous topoisomerase II cleavage sites. These results are in agreement with the fact that MAR sequences, which tend to be approximately 70% AT, contain topoisomerase II consensus sites. They also support the results of other groups which have shown that topoisomerase II cleavage sites map to regions with identified MARs (Gasser and Laemmli, 1987; Phi-Van and Stratling, 1990). The exact sequences cleaved by topoisomerase II in the MAR-containing AN2.1 subclones will need to be identified. These specific topoisomerase II cleavage sites will then need to be compared to the exact deletion breakpoints in order to determine if topoisomerase II is responsible for generating the 5' deletion breakpoints for γδβ-thal 1 and 2.

Even if topoisomerase II does not generate the large β-globin deletion breakpoints, the results presented here show that it is capable of cleaving DNA located near matrix associated regions. Topoisomerase II cleavage sites should therefore be useful as enzymatic markers to identify potential matrix associated regions of chromatin; however, not all topoisomerase II cleavage sites lie in matrix associated regions. Moreover, topoisomerase II cleavage sites appear to constitute a part of the sequence recognition requirement of MARs. While topoisomerase II may function as a chromatin "loop fastener", it also appears to play a crucial role in maintaining the supercoiled state of chromatin loops, which has been implicated in both DNA replication and gene regulation and expression.

Table 1. Yeast strains used for genetic analysis.

The yeast strains used in this dissertation are listed according to their nuclear markers as well as their mitochondrial phenotypes. a/a C155, which was kindly provided by Dr. Alexander Tzagoloff, contains a mutation in a yeast mitochondrial elongation factor G gene. All other strains were kindly provided by Dr. Philip Perlman.


Table 1

Strain       Nuclear Markers       Mitochondrial Phenotypes
a/aC155      ura, mef1             ρ+
0161         ade, lys              ρ+
aKAR         leu, kar-1            ρ°
aKAR         ura, his, canr        ρ°
a/aKAR161    ade, lys              ρ+
aBWG7A       ade, leu, his, ura    ρ+

Table 2. Plasmid vectors and their selectable markers.

The recombinant plasmids used in this dissertation are listed according to their selectable phenotypes in E. coli, their selectable phenotypes in yeast and, where applicable, their yeast replicon.


Table 2

Plasmid            Size      E. coli phenotypes    Yeast phenotypes    Replicon
pBS+               3.2 kb    Ampr                  N/A                 N/A
pBluescript SK-    2.9 kb    Ampr                  N/A                 N/A
pUCBM21            2.7 kb    Ampr                  N/A                 N/A
YCp50              7.9 kb    Ampr, Tetr            Ura+                ARS1
YEp352             5.2 kb    Ampr                  Ura+                2 μm

Table 3. Sequence motifs in GTPase superfamily.

The amino acid sequences of putative N-terminal conserved regions (G1-G4) in representatives of prokaryotic and eukaryotic elongation and initiation factors, plus a consensus sequence for the regions, are shown (from Bourne et al., 1991). Bold type indicates residues conserved in nearly all GTPases; upper-case letters indicate other highly conserved amino acid residues. Species designations are: EC, E. coli; SR, Spirulina platensis; AN, Anacystis nidulans; TM, Thermotoga maritima; ML, Micrococcus luteus; EG, Euglena gracilis; SC, Saccharomyces cerevisiae; HS, Homo sapiens; LE, Lycopersicon esculentum; DD, Dictyostelium discoideum; SF, Streptococcus faecium; BS, Bacillus subtilis; TT, Thermus thermophilus; MV, Methanococcus vannielii.


Table 3

G-1 G-2 G-3 G-4
Consensus XOOOOGXXGXCKS D-IXL-T OJOODXAGJX OOOONKJ TQ

Elongation and initiation factors

Elongation factors
Prokaryotic
EEET-G 12 NigisAHiD&GKTTtt 51 DwmoqDqeRGlTItsa 84 inilDtPGHv 138 lafvNK aO
SREF-G 12 NigiaAHiDaGKTTtt 51 D*maqEreRCITItaa 84 iniiOiPGHv 138 laflN K aD
ANEF-C 12 NigiaAHiDttGKTTU 51 D*ooqEreRGITItaa 77 vniiDtPGHv 131 lvfvNKoD
ECLEPA 7 NfsilAHlQhGKSTls 42 DsodlEreRGXTIkaq 73 InfiOtPGHv 127 vpvlNKlD
EC E7TU 13 NvgtlGHvDhGKTTlt 50 DnapeDtaRG Hints 76 yahvOcPGHa 131 lvflN K cO
TMDTU 14 NvgtiGHiDhGKSTlt 51 DkapeEkaRClHnit 77 yahlDcPGHa 132 lVfiNKtD
TTETTU 14 NvgtlGHvDhGKTTlt 52 CkapeEraRGlTInta 78 yshvDePGHa 133 wfoNKvO
KL ETITJ 14 NlgtlGHvDtiGKTTU 53 DsapeErqRGITInis 79 yahvOaPGHa 134 lvalN K sD
EBETOJ 14 Nl g t 1 GHvDhGKTTl t 51 DsapeDtaRCITInta 77 yahvOcPGHa 132 wflNKaO
SC EFRJ SO HvgtlGHvDhGKTOl 67 DkapcEraRGlTIsta 113 yshvOcPGHa 131 w fvN K vO
tfVETTU 9 NvaflGrtYCoGKTTtv 61 DglkoEreRGvTXdva 87 vtivOcPGHr 145 avavNKaD
Eukaryotic
KSEF1A 9 NiwiGHvDsGKTTU 61 DklkaEreRGITIdls 87 vtllDaPGHr 149 ivgvNfaD
SCEF1A 9 HwviGHvDsGKTTtt 61 Dklk&EraRClTIdta 87 vtviDaPGHr 149 lvavNKmD
LEEF1A 9 siwiGWvDsGKTTtt 61 DklkaEreRGITIdia 87 ctviDaPGHr 149 lCCCNKaD
►SGST1 76 NwfiGWvDaGKTTig 128 DtnqcErdkGXTvevg 154 ftllDaPGKk 216 lvliNKnD
SCGST1 262 s 11 faGttvD&G KTItog 314 DtnkeErndCkTIevg 340 yttlDaPGKk 402 w w N K aO
HAKSTEREF-2 21 tosvlAWvDhGKTTlt 58 OtrkdEqeRcITlkst 100 InilOsPGHv 154 vI qd NKjbO
< < DO IT -2 21 toSVlAHvDhGKTTls 58 scradEqeRCITIkss 96 inllDsPGHv 152 < z * o

Initiation factors
Prokaryotic
EC IF2 333 wtimGHvDhGKTsU 414 tkvasgeagCITqhig 440 ilflOtPGHa 494 w avN K iD
ST IF2 288 wtlttGHvDhGKTTll 309 skvtoqeagCITqhig 335 ltflOtPGHa 389 ivavNKiO

Table 4. Percent amino acid sequence homology between Synechocystis 6803 EF-G, A. nidulans EF-G, S. platensis EF-G, E. coli EF-G, and E. coli LepA.

Percentages represent pairwise amino acid identities.


Table 4

                 A. nidulans    S. platensis    E. coli EF-G    E. coli LepA
Synechocystis    49%            51%             54%             36%
A. nidulans      X              80%             59%             36%
S. platensis                    X               58%             35%
E. coli EF-G                                    X               34%

Table 5. Comparison of the codon usage in Synechococcus

(frequency of codon per 1000 amino acids; Wada et al., 1990) and the Synechocystis 6803 fus gene (695 amino acids).
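The per-1000 frequencies compared in Table 5 can be derived from any in-frame coding sequence by counting codons and scaling by the total number of codons. The following is a minimal Python sketch of that calculation, not the procedure used to compile the table. The function name codon_usage_per_1000 is hypothetical, and the example input is only the first ten codons of the Synechocystis 6803 open reading frame shown in Figure 16.

```python
from collections import Counter


def codon_usage_per_1000(cds):
    """Return {codon: occurrences per 1000 codons} for an in-frame coding sequence."""
    # Report codons in RNA form (U instead of T), as in the table.
    seq = cds.upper().replace("T", "U")
    # Ignore any trailing bases that do not fill a complete codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence contains no complete codon")
    counts = Counter(codons)
    return {codon: 1000.0 * n / len(codons) for codon, n in counts.items()}


# First ten codons of the Synechocystis 6803 open reading frame (Figure 16).
usage = codon_usage_per_1000("ATGGAAAAAGATCTCACTCGTTACCGCAAT")
for codon in sorted(usage):
    print(codon, round(usage[codon], 1))
```

With so short an input every codon occurs once, so each frequency is 100 per 1000; a full-length gene is needed before the values become comparable to a genome-wide survey.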


Table 5

Codon    Amino acid    Synechococcus    Synechocystis 6803 fus
CGA      Arg           3                0
CGC                    19               5
CGG                    11               14
CGU                    15               11
AGA                    1                1
AGG                    0                0
CUA      Leu           5                0
CUC                    26               10
CUG                    39               9
CUU                    8                1
UUA                    5                3
UUG                    31               19
UCA      Ser           3                0
UCC                    10               17
UCG                    14               4
UCU                    11               4
AGC                    16               5
AGU                    6                3
ACA      Thr           3                3
ACC                    28               33
ACG                    10               3
ACU                    12               8
CCA      Pro           5                1
CCC                    15               19
CCG                    14               6
CCU                    9                10
GCA      Ala           21               2
GCC                    30               21
GCG                    27               13
GCU                    32               10
GGA      Gly           4                1
GGC                    37               22
GGG                    9                14
GGU                    35               26
GUA      Val           6                15
GUC                    21               2
GUG                    23               34
GUU                    20               9
AAA      Lys           17               37
AAG                    16               10
AAC      Asn           24               8
AAU                    8                7
CAA      Gln           28               12
CAG                    22               13
CAC      His           15               8
CAU                    5                4
GAA      Glu           30               51
GAG                    20               7
GAC      Asp           25               28
GAU                    22               26
UAC      Tyr           22               14
UAU                    8                8
UGC      Cys           3                1
UGU                    2                4
UUC      Phe           30               15
UUU                    20               9
AUA      Ile           0                0
AUC                    35               20
AUU                    22               27
AUG      Met           19               24
UGG      Trp           20               4

Table 6. Distance in base pairs between the genes of the str operon in E. coli, S. platensis, A. nidulans, and M. luteus.


Table 6

Organism        Distances in base pairs between genes
                S7 (rps7) to EF-G (fus)    EF-G (fus) to EF-Tu (tufA)
E. coli         100 bp                     70 bp
S. platensis    73 bp                      214 bp
A. nidulans     77 bp                      29 bp
M. luteus       215 bp                     276 bp

Table 7. Percent amino acid sequence homology between the two S. cerevisiae mtEF-Gs and E. coli EF-G.

S. cerevisiae 1 is the gene characterized in our laboratory whereas S. cerevisiae 2 is the gene characterized by A. Vambutas et al. Upper right triangle: pairwise amino acid identities. Gaps in one sequence relative to another were not included in these calculations.
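Because gapped positions were excluded from the calculations, percent identity here is the number of matching residues divided by the number of aligned columns in which both sequences carry a residue. A minimal Python sketch of that rule follows. The function name percent_identity and the use of "-" as the gap character are assumptions made for illustration (the alignment figures in this dissertation use dashes for identity and spaces for gaps), and this is not the program used to produce Table 7. The first example compares residues 9 to 28 of E. coli EF-G with the corresponding Synechocystis 6803 residues from Figure 16; the second is an invented pair that contains a gap.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns, skipping any column that contains a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


ecoli = "RYRNIGISAHIDAGKTTTTE"          # E. coli EF-G, residues 9-28
synechocystis = "RYRNIGIFAHVDAGKTTTTE"  # Synechocystis 6803 EF-G (Figure 16)
print(percent_identity(ecoli, synechocystis))

# Invented toy pair: the gapped column is ignored, leaving 3 matches in 4 columns.
print(percent_identity("AC-GT", "ACTGA"))
```

Excluding gapped columns from the denominator gives higher values than dividing by the full alignment length, which is worth keeping in mind when comparing these percentages with figures computed by other conventions.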


Table 7

                   S. cerevisiae 2    E. coli
S. cerevisiae 1    29%                33%
S. cerevisiae 2    x                  43%

Table 8. Comparison of the codon usage in S. cerevisiae (frequency of codon per 1000 amino acids; Wada et al., 1990) and the S. cerevisiae fus gene identified by our laboratory (819 amino acids).


Table 8

Codon    Amino acid    S. cerevisiae    S. cerevisiae fus
CGA      Arg           2                6
CGC                    2                2
CGG                    1                0
CGU                    7                2
AGA                    24               13
AGG                    8                9
CUA      Leu           12               14
CUC                    4                2
CUG                    8                7
CUU                    10               11
UUA                    24               25
UUG                    32               14
UCA      Ser           16               14
UCC                    14               10
UCG                    7                8
UCU                    25               17
AGC                    7                7
AGU                    12               11
ACA      Thr           16               15
ACC                    14               13
ACG                    7                8
ACU                    22               24
CCA      Pro           21               15
CCC                    6                9
CCG                    4                3
CCU                    13               10
GCA      Ala           15               16
GCC                    16               5
GCG                    5                7
GCU                    28               11
GGA      Gly           9                15
GGC                    9                11
GGG                    5                5
GGU                    35               15
GUA      Val           10               9
GUC                    15               9
GUG                    9                12
GUU                    27               21
AAA      Lys           38               45
AAG                    35               15
AAC      Asn           26               23
AAU                    31               39
CAA      Gln           30               18
CAG                    10               7
CAC      His           8                4
CAU                    12               6
GAA      Glu           49               33
GAG                    17               18
GAC      Asp           22               11
GAU                    37               36
UAC      Tyr           17               8
UAU                    17               14
UGC      Cys           4                7
UGU                    8                4
UUC      Phe           20               10
UUU                    23               14
AUA      Ile           13               19
AUC                    18               16
AUU                    31               39
AUG      Met           21               15
UGG      Trp           10               9

Table 9. Results of yeast test crosses: aBWG7A x aKAR ρ° and aBWG7A efg1::URA3 x aKAR ρ°.


Table 9

Test strain        Lane    No. of colonies on YEPG plate
aBWG7A             1       >500
Transformant 24    10      >500
Transformant 23    9       1
Transformant 21    8       >500
Transformant 15    7       1
Transformant 5     6       10
Transformant 4     5       20
Transformant 3     4       5
Transformant 2     3       2

Table 10. Results of densitometric scans of the in vitro MAR assays shown in Figure 46.
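The normalized areas reported in Table 10 are consistent with dividing each peak area by the lane A value for the same fragment, so that lane A equals 1.00. The short Python sketch below reproduces that arithmetic for the first four lanes of the 1.8 B column. The function name normalize_to_reference is hypothetical, and treating lane A as the reference is an inference from the tabulated values rather than a statement of the scanning software's method.

```python
def normalize_to_reference(areas, reference_index=0):
    """Divide every peak area by the reference lane's area and round to two decimals."""
    reference = areas[reference_index]
    if reference == 0:
        raise ValueError("reference area must be non-zero")
    return [round(area / reference, 2) for area in areas]


# Areas under the peak for the 1.8 B fragment, lanes A to D (Table 10).
areas_1_8_b = [0.687, 0.733, 6.423, 1.267]
normalized = normalize_to_reference(areas_1_8_b)
print(normalized)
```

The same call applied to the 0.9 B-E column, with 1.179 as the first element, gives the second set of normalized values.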


Table 10

        Area under peak        Normalized Area
Lane    1.8 B     0.9 B-E      1.8 B     0.9 B-E
A.      0.687     1.179        1.00      1.00
B.      0.733     1.139        1.07      0.97
C.      6.423     5.836        9.35      4.95
D.      1.267     9.735        1.84      8.26
E.      5.45      15.700       7.93      13.32
F.      7.70      28.571       11.21     24.23
G.      6.33      58.333       9.21      49.48

Table 11. Results of densitometric scans of the in vitro MAR assays shown in Figure 48.


Table 11

Lane    p4.0     p3.0     p1.8     p1.5     p1.3      p1.0      p0.8
A.      0.539    1.558    1.884    0.998    1.962     1.397     1.772
B.      0.341    0.626    1.229    0.741    1.178     0.874     1.204
C.      0.307    0.697    1.594    0.849    1.589     1.076     1.502
D.      0.494    0.836    1.499    0.791    1.379     1.128     1.392
E.      0.209    0.711    2.679    1.416    3.215     2.050     3.127
F.      0.330    0.928    2.682    1.468    3.336     2.199     3.361
G.      0.282    1.165    5.959    2.826    10.129    7.100     10.390
H.      0.304    1.139    8.017    4.365    17.930    11.565    18.817
I.      0.221    0.733    6.647    3.779    18.709    11.686    18.09

Figure 1. Prokaryotic protein synthesis initiation.

A schematic representation of the formation of the prokaryotic initiation complex for protein synthesis is illustrated. Note that the methionyl-tRNA complexes directly to the P site on the ribosome. All other aminoacyl-tRNAs initially complex with the A site. Once the initiation complex is formed, GTP is hydrolyzed and the initiation factors are displaced. The following symbols represent the respective components of the protein synthesizing machinery: mRNA, 30S subunit, 50S subunit, 70S ribosome, and tRNA.

[Schematic diagram; legible labels include mRNA, fMet-tRNA, IF-1, IF-2, IF-3, GTP, 30S preinitiation complex, GDP + Pi, and 70S ribosome.]

Figure 1.

Figure 2. Prokaryotic protein synthesis elongation and termination.

A schematic representation of prokaryotic elongation and termination is illustrated. Note the addition of the second aminoacyl-tRNA to the ribosome complex and the accompanying

EF-Tu, EF-Ts cycle. Also note the translocation reaction, which involves the displacement of the discharged tRNA from the P site and the movement of the peptidyl tRNA to the P site. Finally, note the termination reaction, which occurs when a stop codon is recognized by a release factor.

Following recognition, GTP is hydrolyzed and peptidyl transferase transfers the polypeptide chain to a water molecule.

[Schematic diagram; legible labels include 70S ribosome, mRNA, aa-tRNA, EF-Tu, GTP, GDP + Pi, EF-G, and RFs.]

Figure 2.

Figure 3. Eukaryotic protein synthesis.

A schematic representation of the elongation cycle of eukaryotic protein synthesis. Note that the eukaryotic elongation reaction is very similar to the corresponding prokaryotic reaction.

[Schematic diagram; legible labels include translocation, mRNA, and peptidyl transfer.]

Figure 3.

Figure 4. The GTPase cycle.

A schematic representation of the basic GTPase cycle is illustrated (from Bourne et al., 1990). GTP binding activates GTPases whereas GTP hydrolysis inactivates GTPases.

GTP hydrolysis is enhanced by GAPs (GTPase activating proteins). GNRPs (guanine nucleotide releasing proteins) enhance GDP dissociation.

[Schematic diagram; legible labels: empty state, inactive state (GDP), active state (GTP), GNRPs, GAPs.]

Figure 4.

Figure 5. GTPase cycle of EF-Tu.

The bacterial EF-Tu, EF-Ts GTPase cycle is illustrated

(from Bourne et al., 1990). Tu, Bacterial EF-Tu; Ts, bacterial EF-Ts; aa-tRNA, aminoacyl-tRNA; Ribo, mRNA-bound ribosome.

[Schematic diagram; legible labels: Ts, Tu, GTP, aa-tRNA, Ribo.]

Figure 5.

Figure 6. Design of EF-G-specific oligonucleotides.

N-terminal amino acid sequences derived by DNA sequencing for translocases from E. coli, M. luteus, M. vannielii, and hamster (Zengel et al., 1984; Ohama et al., 1987; Yakhnin et al., 1989; Lechner et al., 1988; Kohno et al., 1986) are shown aligned to one another. The underlined amino acids in the E. coli sequence have been implicated in guanine nucleotide binding. Dashes indicate that the sequence is the same as that of E. coli and spaces indicate deletions in one sequence relative to the others. The amino acid sequences which were used to derive the oligonucleotide sequences are overlined.

The sequences of oligonucleotides used in PCR reactions are shown in uppercase bold type while the restriction enzyme sites, which were incorporated into the 5' terminal sequences of the oligonucleotides to facilitate cloning, are shown in lower case bold type. The sequence of oligonucleotide G1, a third EF-G specific oligonucleotide which was used in hybridization experiments, is also shown in bold type.

5'-ccggatcCAYATHGAYGCWGG-3' REVG4SA
5'-ccggatcCAYATHGAYGCSGG-3' REVG4SB

1. Escherichia coli EF-G MARTTPIARYRNIGISAHIDAGKTTTTE
2. Micrococcus luteus EF-G ------ML-DLHKV----- M ------
4. Methanococcus vannielii EF-2 MGRRAKMVEKVKSLMETHDQI-- M-- C H LSD
5. Hamster EF-2 MVNFTVDQIR-IMDKK-NI-- MSVI-- V-H-- S-L-D
phosphate binding
3'-CTRACCTACCTYGTYCT-5' G1

1. RILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAATTAFWSG MAKQYEPHRINIID
2. -H------L-- T G-- T------K------V-C-- ND -Q----
4. NL-AGA- MISKDLAGDQLAL-FD-E-AA YA-NVSMVHEY NGKEYL-- L--
5. SLVCKA- IIASARAGETRFT-TRKD C K-T-ISL-YELSENDLNFIKQS-DGSGFL-- L--

3'-CCHGTRCAMCTRAActtaagcg-5' G3SA
3'-CCSGTRCANCTRAActtaagcg-5' G3SB

1. TPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWRQANKYKVPRIAFVNKMDRMGANFLKWNO
2. N------V L------A-FDGKE--E------D--D----- C----- KL--D-YFT-DT
4. ------GGD-T-A-- Al----V-C E-- M-- T-- L-- L-E--KPVL-I--V--LINELKLTPEE
5. S------SS--TAAL--T--- LV-VDC-S--CV-T-- L-- IAERIKPVLMM ALLELQLEPEE
phosphate binding    G binding

Figure 6.

Figure 7. Schematic representation of the PCR strategy used to generate EF-G specific PCR products.

Organellar EF-G N-terminal amino acid consensus sequences are boxed. Primers RevG4A and RevG4B have been synthesized to account for the degeneracy in the alanine codon whereas G3SA and G3SB have been synthesized to account for the degeneracy in the glycine codon. The size of the expected PCR products, which is based on the E. coli sequence, is also given.
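The way a pair of primers can share the degeneracy of a single codon can be illustrated by expanding the IUPAC ambiguity codes in the oligonucleotide sequences given in Figure 6. The Python sketch below is an illustration only. The function names degeneracy and expand are hypothetical, the mapping is the standard IUPAC nucleotide code, and only the 14-nucleotide priming regions (without the restriction-site tails) are used.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    total = 1
    for base in oligo.upper():
        total *= len(IUPAC[base])
    return total


def expand(oligo):
    """List every unambiguous sequence represented by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


rev_g4sa = "CAYATHGAYGCWGG"  # priming region of REVG4SA (Figure 6)
rev_g4sb = "CAYATHGAYGCSGG"  # priming region of REVG4SB (Figure 6)
print(degeneracy(rev_g4sa), degeneracy(rev_g4sb))

# The two alanine codon variants together cover all four GCN codons.
print(sorted(set(expand("GCW")) | set(expand("GCS"))))
```

Splitting the alanine codon between two oligonucleotides (GCW and GCS) halves the degeneracy of each primer pool relative to a single primer carrying GCN.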


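REVG4A-CCGGATCCACATCGACGCTGG (bases printed above the degenerate positions: T T T A)
REVG4B-CCGGATCCACATCGACGCCGG (bases printed above the degenerate positions: A T T T G)

H2N ... HIDAG ... GHVDF ... COOH

CCCGTGCANCTGAACTTAAGCG-G3SA (bases printed above the degenerate positions: G A A)
CCTGTGCANCTGAACTTAAGCG-G3SB (bases printed above the degenerate positions: A A A)

PCR: 234 bp fragment

Figure 7.

Figure 8. A restriction enzyme map of the Synechocystis 6803 fus-like gene.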

A restriction enzyme map of the Synechocystis 6803 fus-like gene is given. The AUG codon is located approximately 90 bp 5' of the Hpa I site whereas the termination codon is located approximately 85 bp 3' of the 3'-most Nco I site. Arrows indicate sequence generated using synthetic oligonucleotides. Closed arrow heads designate the 3'-ends of two deletion subclones, Cla I and Sma I, which were also sequenced. Open arrow heads designate the four Nco I fragments which were subcloned and sequenced using both the M13 universal primer and the pUC/M13 reverse primer. The restriction sites are: H, Hind III; N, Nco I; K, Kpn I; S, Sma I; H, Hpa I.

[Restriction map; legible labels: 500 bp scale bar, Sma I, Cla I.]

Figure 8.

Figure 9. Restriction enzyme map of a S. cerevisiae mtEF-G gene.

A restriction enzyme map of a S. cerevisiae mtEF-G gene is given. Arrows indicate sequence generated using synthetic oligonucleotides. Closed arrow heads indicate the 5' ends of deletion subclones which were also sequenced. DNA fragments which were subcloned and sequenced are indicated by open arrow heads. The AUG codon is located 64 bp 3' of the 5' Hind III site. The UAG codon is located 27 base pairs 3' of the Pvu I site. The restriction sites are: H, Hind III; S, Sal I; A, Acc I; C, Cla I; B, Bam HI; Sp, Sph I; P, Pvu I.

[Restriction map; legible labels: 550 bp scale bar; site labels as printed: A H S C B H S P.]

Figure 9.

Figure 10. Schematic diagram of the construction of the allele used to disrupt a wild type S. cerevisiae mtEF-G gene.

The 2.7 kb Hind III fragment and the 3.8 kb Bam HI fragment were subcloned into the vector pBS+. The construct was cleaved with Sph I and the liberated 8.1 kb fragment was band isolated and used to effect the gene replacement in aBWG7A.

[Construction diagram; legible labels: 2.7 kb, 3.8 kb, Sph I, pBS+, URA3+, 0.8 kb.]

Figure 10.

Figure 11. Nondenaturing polyacrylamide gel electrophoresis of Synechocystis 6803 and pea PCR products.

Lane 1 contains PCR products generated using pea genomic DNA as a template with the primers RevG4B and G3. Lanes 3, 5, 7, 9, 11, and 13 contain PCR products generated using Synechocystis 6803 genomic DNA as a template with the primers RevG4A and G3SA, RevG4A and G3SB, RevG4A and G3, RevG4B and G3SA, RevG4B and G3SB, and RevG4B and G3, respectively. Lanes 2, 4, 6, 8, 10, and 12 are the respective negative controls for the cyanobacterial reactions (no template DNA). PCR products in the expected size range are indicated by the arrow. The marker sizes (lane m) are indicated in base pairs.

[Gel image; lanes m and 1 to 13; marker sizes as printed, in base pairs: 1444, 736, 501, 489, 476, 331, 242, 190, 147, 110, 67.]

Figure 11.

Figure 12. Nucleotide sequence of the Synechocystis 6803 EF-G specific PCR product.

The nucleotide sequence of the Synechocystis 6803 EF-G specific PCR product is given. Slashes designate primer sequence.


Synechocystis 6803 PCR Product:

5'//////////////GGTAAAACCACCACCACCGAACGGATTTTGAAGTTAA

CCGGGAGAATCCATAAACTCGGCGAGGTACGCGAAGGCGAATCCACCATGGACT

TCATGGAGCAGGAAGCGGAGCGGGGTATCACCATCCAATCGGCGGCCACCAGTT

GTTTTTGGAAAGATCATCAACTCAATGTTATCGACACC//////////////3'

Figure 12.

Figure 13. Alignment of the N-terminal amino acid sequences of translocases from E. coli, hamster, Synechocystis 6803, pea, and A. thaliana.

N-terminal amino acid sequences derived by DNA sequencing for translocases from E. coli (Zengel et al., 1984) and hamster (Kohno et al., 1986) are aligned. The N-terminal amino acid sequence of pea chloroplast EF-G, which was obtained by protein sequencing, is shown in bold type. The sequences obtained from PCR products of Synechocystis 6803, pea, and A. thaliana are also given. The underlined amino acids in the E. coli sequence have been implicated in guanine nucleotide binding. Dashes indicate that the sequence is the same as that of E. coli, and spaces indicate deletions in one sequence relative to the others. Slashes indicate PCR primers. The pea product was originally obtained as two individual clones due to the presence of an EcoRI restriction site within the coding region.

1. Escherichia coli EF-G MARTTPIARYRNIGISAHIDAGKTTTTE
2. Hamster EF-2 MVNFTVDQIR-IMDKK-NI-- MSVI-- V-H-- S-L-D
3. Synechocystis 6803 EF-G (PCR) /////-----
4. Pea chl EF-G (PCR) ATEDGK-AV-LKD------
5. Arabidopsis thaliana chl EF-G (PCR) /////-----
phosphate binding

1. RILFYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAATTAFWSG MAKQYEPHRINIID
2. SLVCKA- IIASARAGETRFT-TRKD---- C K-T-ISL-YELSENDLNFIKQS-DGSGFL-- L--
3. KL-- RI-- L AE-ES F A------Q ---- SC--KD -QL-V--
4. -----P-R-Y------E-T------T -- DK ------
5. Y R-Y------E-T------T -- DK ------

1. TPGHVDFTIEVERSMRVLDGAVMVYCAVGGVQPQSETVWRQANKYKVPRIAFVNKMDRMGANFLKWNO
2. S------SS-- TAAL-- T LV-VDC-S-- CV-T L IAERIKPVLMM----- ALLELQLEPEE
3. -- \\\\\
4. -~\\\\\
5. -- \\\\\
phosphate binding    G binding

Figure 13.

Figure 14. Synechocystis 6803 genomic Southern.

Ten micrograms of Synechocystis 6803 genomic DNA were digested with HindIII (lane 1) and EcoRI (lane 2) and analyzed by Southern blotting using the cloned Synechocystis 6803 PCR product as a probe. Lanes 3 and 4 correspond to lanes 1 and 2 respectively. The marker sizes are listed in kilobase pairs.

[Southern blot image.]

Figure 14.

Figure 15. N. crassa, Euglena gracilis, Synechocystis 6803, and pea genomic Southern probed with the Synechocystis 6803 EF-G specific PCR product.

Lanes 1 and 2 contain 5 μg of N. crassa genomic DNA which was digested with HindIII and EcoRI, respectively. Lanes 3 and 4 contain 5 μg of E. gracilis genomic DNA cleaved with HindIII and EcoRI, respectively. Lane 5 contains 5 μg of Synechocystis 6803 genomic DNA digested with HindIII whereas lanes 7, 8, and 9 contain 40 μg of pea genomic DNA digested with HindIII, EcoRI, and BamHI, respectively. Lane 6 contains marker DNA. The filter was hybridized to the Synechocystis 6803 EF-G specific PCR product in 6X SSPE at 37 °C. Washes were performed in 1X SSPE at 49 °C. Marker sizes are as indicated.

[Southern blot image.]

Figure 15.

Figure 16. Nucleotide and corresponding amino acid sequence of the Synechocystis 6803 EF-G gene.
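The correspondence between the nucleotide sequence and the amino acid sequence printed beneath it can be checked by conceptual translation. The Python sketch below is an illustration under the assumption that the standard genetic code applies; the function name translate is hypothetical, and the input is the first twelve codons of the open reading frame as printed in Figure 16.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA sequence, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Codons containing unrecognized characters are reported as X.
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First twelve codons of the Synechocystis 6803 open reading frame (Figure 16).
peptide = translate("ATGGAAAAAGATCTCACTCGTTACCGCAATATTGGT")
print(peptide)
```

The result can be compared directly with the lower-case single-letter translation shown under the first line of the open reading frame.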


CCTCCCCTTTGGCCTGTCGTTGTGGCATAAAGTCTAGCTAAGGCCCAGAACCCCGACCGGCAAAC AACACGACCCCATAACCTATACATCCAACCGCCCCTCTGTTTGCTGTTGGCTAAAATTTACACGT AAATTCCTATGGAAAAAGATCTCACTCGTTACCGCAATATTGGTATTTTCGCCCACGTAGATGCG mekdXtryrnigifahvda GGTAAAACCACCACCACCGAACGGATTTTGAAGTTAACCGGGAGAATCCATAAACTCGGCGAGGT gktttterilkltgrihkXgev ACGCGAAGGCGAATCCACCATGGACTTCATCGAGCAGGAAGCGGAGCGGGGTATCACCATCCAAT regestJndfmeqeaergitiqs CGGCGGCCACCAGTTGTTTTTGGAAAGATCATCAACTCAATGTTATCGACACCCCTGGCCACGTG aatscfwkdhqlnvidtpghv GACTTCACTATTGAAG7TTACCGCTCCCTGAAAGTGTTGGATGGCGGCATTGGGGTATTTTGTGG dftievyreXkvIdggigvfcg CTCCGGCGGTGTGGAACCTCAGTCGGAAACCAACTGGCGCTATGCGAACGATTCTAAAGTCGCCC sggvepqsetnwry -a n d s k v a r GTTTGATTTACATTAATAAACTTGACCGTACCGGGGCAGATTTTTACCGGGTAGTTAAGCAGGTG X i yinkXdrtgadfyrvvkqv GAAACGGTGCTCGGGGCTAAACCCTTGGTGATGACTCTGCCCATTGGGACAGAAAATGATTTCGT etvXgakpXvmtXpigtendfv TGGTGTGGTGGATATTCTCACGGAAAAAGCCTATATCTGGGATGACTCCGGCGACCCGGAAAAAT gvvdiXtekayiwddsgdpeky ATGAAATCACCGACATTCCTGCTGACATGGTGGACGATGTGGCCACCTACCGGGAAATGCTGATC e i t d ipadmvddvatyremX i GAAACAGCGGTGGAACAGGACGATGATTTGATGGAAAAATACCTGGAAGGGGAAGAAATCAGCAT et.a veqddd ltneky 1 egee i s i CGATGACATCAAGCGTTGTATCCGTACCGGCACCCGTAAGTTGGACTTTTTCCCCACCTATGGCG ddikrcirtgtrkldffptygg GTTCCTCCTTTAAAAACAAAGGGGTACAACTGGTGCTGGATGCGGTGGTGGACTACTTGCCCAAC ssfknkgvqXvXdavvdyXpn CCCAAAGAAGTACCTCCTCAACCGGAAGTGGATTTAGAAGGGGAAGAAACCGGCAACTACGCCAT pkevppqpevdlegeetgnyai TGTTGATCCAGAAGCCCCCCTGCGGGCCCTGGCGTTCAAAATTATGGACGACCGCTTTGGTGCTT vdpeapXraXafkinddrfgaX TGACCTTCACCCGGATTTACTCCGGTACCCTCAGTAAAGGGGACACCATTCTCAATACTGCCACC tftriysgtXskgdtiXntat GGTAAAACAGAACGCATTGGTCGTTTGGTGGAAATGCACGCCGATTCCCGGGAAGAAATTGAGTC gkterigrlvemhadsreeies CGCTCAAGCCGGGGACATTGTGGCGATCGTGGGCATGAAAAATGTGCAAACGGGCCACACTCTCT aqagdivaivgmknvqtghtlc GTGACCCCAAAAATCCGGCCACTTTGGAACCCATGGTCTTCCCCGACCCGGTGATCTCCATTGCC dpknpatlepmvfpdpvisia ATTAAGCCCAAGAAAAAAGGCATGGACGAAAAATTGGGTATGGCTCTGAGCAAAATGGTGCAGGA ikpkkkgmdekXgmaXskmvqe AGATCCTTCCTTCCAAGTGGAAACCGACGAAGAAAGCGGTGAAACCATCATTAAGGGGATGGGGG 
dpsfqvetdeesgetiikgmge AATTGCACTTGGATATCAAAATGGATATTCTCAAGCGTACCCATGGTGTAGAAGTGGAAATGGGT IhXdikmdilkrthgvevejng AAACCCCAGGTGGCCTACCGGGAATCCATTACCCAGCAGGTATCGGATACCTATGTACACAAGAA kpqvayresitqqvsdtyvhkk GCAGTCCGGTGGTTCTGGTCAGTATGCCAAAATTGACTACATCGTGGAGCCTGGGGAACCCGGTT qsggsgqyakidyivepgepgs CTGGCTTCCAGTTCGAGTCCAAAGTAACCGGTGGTAACGTGCCCCGGGAATATTGGCCCGCAGTA gfqf eskvtggnvpreywpav CAAAAAGGTTTTGACCAGAGTGTGGTTAAAGGCGTTTTGGCTGGCTATCCTGTGGTGGACTTGAA qkgfdqsvvkgvXagypvvdXk AGTTACCCTCACCGACGGTGGCTTCCACCCTGTAGACTCTTCGGCGATCGCCTTTGAAATTGCGG vtXtdggfhpvdssaiafeiaa CCAAAGCTGGTTACCGTCAAAGCTTACCCAAAGCCAAACCCCAAATTTTGGAACCGATTATGGCG kagyrqsXpkakpqiXepina GTGGATGTGTTCACCCCGGAAGACCATATGGGGGATGTAATCGGCGATTTGAACCGTCGTCGGGG vdvftpedhmgdvigdXnrrrg catgatcaaatcccaggaaactggccccatgggagtgcgggtaaaagcggatgtgcctttgagcg zniksqetgpmgvrvkadvpXse aaatgttcggctacattggtgatttgcggaccatgacttccggtcggggtcaattctccatggtg mfgyigdXrtmtBgrgqfsinv tttgaccactacgctccctgccccaccaacgttgccgaggaagtaatcaaagaagccaaagaacg fdhyapcptnvaeevikeaker gcaagccgctgcttaattccaaatttgactgtctcaaatgt q a a a

Figure 16.

Figure 17. Alignment of the amino acid sequence of translocases from Synechocystis 6803, A. nidulans, S. platensis, E. coli, and T. thermophilus.

Translocases from A. nidulans (Meng et al., 1989), S. platensis (Buttarelli et al., 1989), E. coli (Zengel et al., 1984), and T. thermophilus (Yakhnin et al., 1989) are aligned to a translocase from Synechocystis 6803. Dashes indicate identical amino acids whereas spaces indicate gaps in one sequence relative to the other.

[Alignment figure; the sequence rows are not legible in this copy. Legible row labels: Synechocystis 6803 EF-G, Anacystis nidulans EF-G, Spirulina platensis EF-G, Escherichia coli EF-G.]

Figure 17.

Figure 18. Dot plot comparison of the available Synechocystis 6803 DNA sequence and the entire str operon of Spirulina platensis.

The DNA sequence of the Synechocystis 6803 fus-like sequence (horizontal axis) was compared to the entire str operon of S. platensis (Buttarelli et al., 1989) (vertical axis) using the DotPlot program of DNA Star with a window of

20 and 65% homology.
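A dot plot with these settings places a point wherever a 20-nucleotide window from one sequence matches a 20-nucleotide window from the other at 65% or more of its positions (at least 13 of 20). The Python sketch below illustrates that general windowed comparison; it is not the algorithm of the DNA Star DotPlot program, whose internal details are not described here. The function name dot_plot is hypothetical, and the example sequence is the first 40 nucleotides of the Synechocystis 6803 PCR product in Figure 12, compared against itself.

```python
def dot_plot(seq_x, seq_y, window=20, min_percent=65):
    """Return (i, j) start positions whose windows match at min_percent or more."""
    seq_x, seq_y = seq_x.upper(), seq_y.upper()
    hits = []
    for i in range(len(seq_x) - window + 1):
        window_x = seq_x[i:i + window]
        for j in range(len(seq_y) - window + 1):
            window_y = seq_y[j:j + window]
            matches = sum(a == b for a, b in zip(window_x, window_y))
            # Integer comparison avoids floating-point rounding at the threshold.
            if matches * 100 >= min_percent * window:
                hits.append((i, j))
    return hits


fragment = "GGTAAAACCACCACCACCGAACGGATTTTGAAGTTAACCG"
hits = dot_plot(fragment, fragment)
print(len(hits))
```

Comparing a sequence with itself always yields the main diagonal; in a comparison of two related operons, conserved genes appear as diagonal runs offset by the difference in their positions.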

[Dot plot image.]

Figure 18.

Figure 19. Nondenaturing polyacrylamide gel electrophoresis of S. cerevisiae PCR products.

Lanes 1, 2, 3, and 4 contain PCR products generated using S. cerevisiae genomic DNA as a template with the primers RevG4A and G3SA, RevG4A and G3SB, RevG4A and G3, RevG4B and G3SA, RevG4B and G3SB, and RevG4B and G3, respectively. PCR products in the expected size range are indicated by the arrow.

Figure 19.

Figure 20. DNA sequence of two S. cerevisiae EF-G specific PCR products, Y1 and Y2.

The nucleotide sequence of two S. cerevisiae EF-G specific PCR products is given. Slashes designate primer sequence.

1. S. cerevisiae PCR product Y2
2. S. cerevisiae PCR product Y1

1. 5' //////////////GGTAAAACAACTACAACTGAACGTATTTTGTTCTA
2. 5' //////////////GGTAAAACTACCACAACAGAGAGGATGCTTTATTA

1. CACAGGTGTATCTCACAAAATTGGTGAAGTACACGACGGTGCAGCAACAATGGACTGGATGGAACAAGAGCAAGAGCGTGG
2. TGCAGGAATCTCAAAGCATATTGGAGACGTCGATACTGGTGATACGATAACTGATTTTTTAGAGCAGGAGAGATCCCGAGG

1. TATTACAATTACCTCTGCTGCTACGACTTGTTTCTGGTCTGGTATGGGTAACCAATTCGAACAACACCGTATCAACGTAAT
2. TATCACAATTCAAAGCGCTGCGATTTCATTCCCATGGAGAAATACTTTTGCAATAAATCTTATTGATACG///////////

1. TGACACC////////////// 3'
2. /// 3'

Figure 20.

Figure 21. Alignment of the N-terminal amino acid sequences of translocases from E. coli and S. cerevisiae.

The N-terminal amino acid sequence derived by DNA sequencing for E. coli EF-G (Zengel et al., 1984) is shown aligned to the amino acid sequences generated from the DNA sequence of two S. cerevisiae EF-G specific PCR products.

Slashes indicate primer sequence.


1. E. coli EF-G MARTTPIARYRNIGISAHIDAGKTTTTERIL
2. S. cerevisiae Y1 /////------M-
3. Y2 /////------

1. FYTGVNHKIGEVHDGAATMDWMEQEQERGITITSAATTAFWSGMAKQYE HR
2. Y-A-ISKH-- D-DT-DTIT-FL--- RS-----Q---ISFP- RNT FA

1. INIIDTPGHVDF
2. -- L ////
3. -- V ////

Figure 21.

Figure 22. S. cerevisiae genomic Southern.

Five micrograms of S. cerevisiae genomic DNA was digested with either Bam HI or Hind III and probed with the S. cerevisiae mtEF-G specific PCR product Y1. Lane 1 contains Bam HI digested DNA. Lane 2 contains Hind III digested DNA. The sizes of the hybridizing fragments are given in kilobase pairs.

[Southern blot image; fragment size labels as printed: 53 kb, 14.]

Figure 22.

Figure 23. Low stringency S. cerevisiae genomic Southern.

Five micrograms of S. cerevisiae genomic DNA was digested with Pvu I (Lane 1) or Sal I (Lane 2) and probed with the S. cerevisiae mtEF-G specific PCR product. The blot was hybridized in 6X SSC at 45 °C and washed in 2X SSC at 45 °C.

[Southern blot image; lanes 1 and 2.]

Figure 23.

Figure 24. Nucleotide and corresponding amino acid sequence of a S. cerevisiae mtEF-G gene.

N-terminal "CAT", "TATA", and "GC" boxes as well as two C-terminal polyadenylation sites are underlined.


aaaaacttgatgtgtgtaacaaagaaaatatccggttcagagatccgaacaaatgatagt CGTACCCTTCCCCTTAGTGTTAACTCGTTTAAGCAGGCTATGATGATTAGTAATCGTTTC AGCCTCTTCTCATTTCGACAATTTTTGAGGCCTTCCCTTAGTGAGAGGGGAGATAACCAi ICAGCACAGAGAACGACGACGGCGCGGAGGCTGCCCCCTTAAAAGAAqllMIGTGAAAT AGATCAACAAGCTTTCTGGTGGCAAAACGGCAGTCAGAGATATACAAAAGGGGCAAGATA AGTGAGGGAGGCTATGTGCAAATCGAATGTCCGCAGATGGGCCCCCCCAAGAGTAAACAT mvkwnvrrvagarvni TACCAAGAACCCATTAAGTGTAATTAACGTTGGAAGTAGATACTTATCGACCGCAAGAAG sknrlsvinvqsrylstarn TCCACTGTCTAAGGTTAGAAATATAGGAATTATAGCCCATATCCATGCAGGTAAAACTAC p 1 s k v r n iqiiahidagktt CACAACAGAGAGGATGCTTTATTATGCAGGAATCTCAAAGCATATTGGAGACGTCGACAC tternlyyaqiskhigdvdt TGGCGATACGATAACTGATTTTTTAGAGCAGGAGAGATCCCGAGGTATCACAATTCAAAC •3 d t i tdfleqersrgitiqs CGCTGCGATTTCATTCCCATGGCGAAATACTTTTGCAATAAATCTTATTGATACGCCGGG a a i s f p w rntfainl idtpq ACATATAGAnTTACTTTTGAAGTGATTAGGGCTTTGAAAGTTATTGATTCATGCGTGGT h i dftfeviralkvidscvv CATTTTGGACGCTGTTGCCGGCGTAGAGGCACAGACAGAGAAGGTTTGGAAGCAAAGTAA i 1 d a v a veaqtekvgtqsk ATCGAAGCCCAAAATCTGCTTTATCAACAAAATGGATCGTATGGGGGCGAGTTTCAACCA s k p k i cf i nkndrragasf nh TACAGTCAATCACTTGATAAATAAATTTATGAGAGGAACAACTACCAAACCCGTGTTGGT t v n d 1 inkfnrqtttkpvlv AAACATTCCTTACTATCGAAAACAACCGACTAGCAATGATTATGnTTCCAAGGTGTTAT nipyyrkqptsndyvfqqvi TGACGTCGTAAATGGGAAGCGATTAACCTGGAATCCAGAAAATCCTGATGAAATTATCCT d v v n q k r ltwnpenpdei iv CGATGAGTTGGATGGTACTrCTCTTGAGCAGTGTAATCGTTGTAGGGAATCCATGATAGA d e ldgtslcqcnrcresnie CACTTTGACTGAATACGACGAAGATTTAGTCCAGCACITCTTAGAGCAGGCAGAAGGCGA tlteydedlvqhf leeaegd TTACTCCAAGGTTTCCGCACAGTTTTTGAACCCTTCAATAAGAAAACTGACTATGAAGAA yskvsaqf lnasi rk ltmkn TATGATTGTACCTGTCTTATGCGGTCCGTCTTTTAAAAATATTGGCGTCCACCCTCTATT miypv lcgasfkn iqvqpl 1 AGATGCTATAGTTAATTATCTCCCTTCGCCCATTGAAGCCGAACTACCAGAACTAAATGA daivnylpspieaelpelnd TAAAACCGTTCCAATGAAATATGACCCCAAAGTTGGATGTTTAGTAAACAACAACAAAAA kt vpmkydpkvqc lvnnnkn TCTTTGCATTGCTCTTGCATTCAAAGTGATTACGGATCCAATCAGGGGTAAACAAATATT lcialafkvitdpirgkqif CATTAGAATTTATTCAGGTACACTAAACAGTGGCAATACCGTTTATAATTCTACTACTGG i r i y 
sgtlnsgntvynsttg GGAAAAATTTAAATTGGGTAAATTACTGATACCCCATGCGGGAACATCACAACCTGTAAA c k fklgkl liphaqtsqpvn TATTTTAACTGCGGGCCAAATCGGATTGCTTACTGGCTCTACAGTCGAAAACAACATATC iltaqqiqlltqstvennis TACTGGAGATACACTAATAACGCATTCGTCGAAGAAGGATGGGTTGAAATCACTTGATAA tgdtl ithsskkdglksldk AAAAAAGGAGTTGACGTTAAAAATCAATTCTATCTTCATTCCCCCACCAGTTTTTGGTGT kkcltlkinsifipppvfgv

TTCCATCGAACCAAGCACTTTAAGTAATAAAAAATCGATCGAACAAGCTTTAAACACTTT siepr tlsnkksmeealn t 1 AATTACTGAAGATCCCAGTCTTTCAATTTCTCAAAATGATGAGACTGGCCAAACGGTTTT itedpslsisqndetgqtvl AAACGGTATGGGTGAATTACACCTTGAAATTGCCAAGGATCGATTGGTTAATGACCTAAA nqmgelhleiakdrlvndlk AGCAGATGTTGAATTTGGTCAGTTGATGGTGTCCTACAAAGAAACAATTAATTCAGAAAC advefgqlmvsyketinset AAATATAGAAACGTATGAGAGTGATGATGGATATAGATTCTCCTTATCTCTGCTACCAAA nietyesddgyrfslsl 1 p n TTCCGATGCCCTTCCTAACTGCCTTGCATATCCATTAGGAGTCAACGAGAATTTTCTGAT adalpnclayplqvn e n f 1 i AATGGAAAAAAATGGTAATTGGGATAAAGAATGGAAATACCAAGTTTCTTTCGAATCAAT oeknqnwdkewkyqvstesi TCTAAATTCTATTATTGCAAGTTGCATTGTGGGACTTCAAAGAGGCGGAAAAATAGCTAA lnsiiascivglqrqgkian CTTTCCCTTATATGCATGCTCCATCAAAATCAATAGCGATTGGTCAGTCCCACCTGATAT fplyacsikinsdwsvppdi TGAAACTCCACAAGAAATCTTAAAAATTACTAGAAACTTAATTTTTAAAGCTCTCAATGA etpqeilkitrnlifkalnd CTTAAAACCTGAAAAATACAACCTTCTGGAACCCATCATGAATCTTGATCTAACAATCCC lkpokynllepimnldltip TCAATCTGATGTrCGTACCGTGTTACAAGATCTAACACGAGCAAGAAAGGCTCAAATTCT qsdvqtvlqdltqarkaq i 1 GTCTATTGAAGACGAATCAAGCGTGTCAAATTCTGGCGCATCCACTTGTAATTCTCCAGA siedessvsnsqastcnspe GAACAGCAATAGGATATATATTCCCTCTGATGCTGTTACAACCCTACACGCAACCAAAGA nonriyipsdavttlhatkd TAAAAAAAACACTCAAGAGACCAGTTCTAATGTAAAAAAAATTATAAAAGCGAAAGTGCC kkntqetssnvkk i ikakvp ACTAACGCAAATTACCACCTACACCAATAACCTAAGGAGCTTATCGCAAGGGAGGGGTGA lreittytnklrslsqgrgc GTTCAATATTGAATATTCCGATATGGAGAAAGTTACCAACGATCGCCTACAATCAATACT fnieysdmekvtndrlqsil TCACGACTTGTAGCAAAAATGTCCTTMIiaaCGCCTTGTTATATCTTrATCAAAATGGT h d 1 TTTTACAACTATCGAATAGTGTGAACCGATAACTGTGTAGTTGTATfiHA&IAGAACnT ACTTTTGCGCAGGAACAAACATTTTATTCCGACATAGGCATACCTTGCCTCAGAGATCTT TATATTCTTGTTCCTTCGTGGCCCAATGGTCACGGCGTCTTGGCTACCAACCAGGAGATT CCAGGTCGAGTC

Figure 24.

Figure 25. Alignment of the amino acid sequences of translocases from E. coli and S. cerevisiae.

Translocases from E. coli (Zengel et al., 1984) and S. cerevisiae (Vambutas et al., 1991; Welcsh et al., 1992) are shown aligned to one another. Dashes indicate identical amino acids whereas spaces indicate gaps in one sequence relative to the other.
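The display convention described in this legend (dashes for identical residues, spaces for gaps) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `render_relative` and `percent_identity` are hypothetical, the inputs are assumed to be pre-aligned strings of equal length with a space as the gap character, and the second toy sequence below is invented for the example rather than taken from the figure.

```python
def render_relative(reference: str, other: str, gap: str = " ") -> str:
    """Render `other` against `reference`: '-' for identity, ' ' for a gap."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for ref_res, res in zip(reference, other):
        if res == gap:
            out.append(" ")      # gap in the second sequence
        elif res == ref_res:
            out.append("-")      # identical residue
        else:
            out.append(res)      # differing residue is printed as-is
    return "".join(out)


def percent_identity(reference: str, other: str, gap: str = " ") -> float:
    """Fraction of identical residues over positions with no gap in either sequence."""
    pairs = [(a, b) for a, b in zip(reference, other) if a != gap and b != gap]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    ref = "MARTTPIARY"    # toy reference fragment
    other = "MSRT PIAKY"  # invented second sequence with one gap
    print(render_relative(ref, other))
    print(round(percent_identity(ref, other), 3))
```

Counting identity only over ungapped positions is one of several common conventions; a different denominator (for example, the full alignment length) would give a lower value.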

184 1. Escherichia coli EF-G 2. Saccharomyces cerevisiae mtEF-G HWKW 3. Saccharomyces cerevisiae mtEF-G MSVQKMMWVPPKMVGGRIPFFTCSKVFSGFSRR 1. MARTTPIARYRNIGISAHIDAGKTTTTERILFYTGV1JH 2 . NVRRWAGARVNISKNRLSVINVGSRYLSTARS-LSKV 1------M-Y-A-ISK 3. SFHESPLARSTWEEEKVLVDEIKQKLTPDDIGICNKL------S--- F-- V-Y— KRIK 1. KIGEV HDGAATMDWMEQEQERGITITSAATTAFWSGMAKQYEPHRINIIDTPGHVDFTIEVE 2. H— D- DT-DTIT-FL RS---- Q---- ISFP-RNTF A— L----- 1-- F—I 3. A-H— RGRDNVG-K— S-DL-R-K--- Q--- YCS-DKEG-N- HF-L----- 1------1. RSMRVLDGAVMVYCAVGGVQPQSETVWRQANKYKVPRIAFVNKMDRMGANFLKWNQIKTRLGAN 2 . -ALK-I-SC-VILD— A— EA-T-K— K-SK SK-K-C-X------S-NHT— DLINKFMRG 3 . -AL------L-V S S-TV--D— MRR-N VT-I------SDPFRAIE-LNSK-KIP 1. PVPLQLAIGAEEHFTGWDLVKMKAINWNDADQGVTFEYEDIPADMVELANEWHQ 2. TTTKPVLVNI-YYRKQPTSNDYV-Q— I-V-NG-RLT- NPENPDEIIVDELDGTSL-QC-RCRE 3. AAAV-IPV-S-SSLS INRV— Y-KG-N-EII-KGPV-ENLKP-ME-KR- 1. NLIESAAEASEELHEKYLGGEELTEAEIKG ALRQRVLNNEIILVTCGSAFKNKGVQAMLD 2. SM— TLT-YD-D-VQHF-EEA-GDYSKVSAQFLNASI-KLTMK-M—VP-L— AS 1--- PL— 3. L TL-DVDD-MA-MF-EEK-P-TQQ— D -I-RSTIARSFTP-LM LA-T-I-PV— 1. AVIDYLPSPVDV PAINGILDDGKDTPA ERHASDDEPFSALAFKIATDPFVGNLTFFRVYSG 2. -IVN IEAEL-EL-DKTVPM-YD-KVGCLVNNNKNLCI VI IR-KQI-I-I-- 3. -IV N-SE- L-TA— VSNNEAKV NLVPAVQQ— VG----LEEGKY -Q— YV Q- 1. NVSGDTVLNSVKAARERFGRIVQMHANKREEIKEVRAGDIAAAIGLKDVTTGDTLCDPDAPIILE 2. L N— Y— TTGEKFKL-KLLIP— GTSQPVNILT— Q-GLLT-STVENNIS-GT-LITHSSKK 3 . LRK-NYIT-VKTGKKVKVA-L-R— SSEH-DVD— GS-E-C -TFGI-CAS FT-GSVQYSMS 1. RMEFPEPVISIAVEPKTKADQEKMGLALGRLAKEDPSFRVHTDEESNQ 2. DGLKSLDKKKELTLKINSIFI-P— FGVSI— R-LSNKKS-EE— NT-IT LSISQND-TG- 3 . S-YV-DA-V-LS-TSNS--- ASNFSK—N-FQ-T--- KF-P— HE 1. TIIAGMGELHLDIIVDRMKREFNVEANVGKPQVAYRETIRQKVTDVEGKHAKQSGGRGQYGHWI 2. -VLN------E-AK— LVNDLKADVEF-QLM-S-K NSE-NI-TYES DD-Y-FSLSLLPN 3. -- S------E-Y-E—R—Y—DCVT-----S S- TIPA-FDYT-K A R-IG 1. DMYPLEPGSNPKGYEFINDIKGGVIPGEYIPAVDKGIQEQLKAGPLAGYP 2. SDALPNCLAYPLGVNENFLIMEKNGNWDK-WKYQVSFES-LNSI-ASCIV-L-RGG-IANFPL-A 3. TLSPVDDIT— NI-ETA-V— R— DK-LA-CG— FE-VCEK I-HR 1. 
WDMGIRLHFGSYHDVDSSELAFKLAASIAFK EGFKKAKPVLLEPIMKVEVETPEENTGDVIGD 2. CSIKINSDWSVPPDIETPQ-ILKITRNL-FKALNDL-PE-YN------NLDLTI-QSDV-T-LQ- 3. -L-VKMPIND-AI-A N— S— T-TMS— R DA-LR-Q— IH N-S-TS-N-FQ-N L 1. LS RRRGMLKGQESEVTGVKI 2 . -TGARKAQILSIEDESSVSNSGASTCNSPENSNRIYIPSDAVTTLHATKDKKNT— T-SNVKKI 3. -N KLQAVI-DT-NGHDEF X. HAEVPLSEMFGYATQLRSLTKGRASYTMEFLKYDEAPSNVAQAVIEARGK 2 . IK-K R-ITT-TNK--- SQ— GEFNI-YSDHEKVTNDRL-SILHDL 3 . TLK— CA— T F— S— AS-Q-KGEFSL— SH-APTAPH-QKEL-SEFQ-KQAKK

Figure 25.

Figure 26. Genomic Southern analysis of S. cerevisiae efg1::URA3 transformants probed with the 5' Hind III fragment.

Lane 1 contains DNA from nontransformed cells. Lanes 3-10 contain DNA from transformants 2, 3, 4, 5, 15, 21, 23, and 24, respectively. Lane 2 contains marker DNA.


Figure 26.

Figure 27. Genomic Southern analysis of S. cerevisiae efg1::URA3 transformants probed with the Ura 3+ fragment.

Lane 1 contains DNA from nontransformed cells. Lanes 3-10 contain DNA from transformants 2, 3, 4, 5, 15, 21, 23, and 24, respectively. Lane 2 contains marker DNA.

[Gel image: lanes 1-10]

Figure 27.

Figure 28. Nondenaturing polyacrylamide gel electrophoresis of A. thaliana and pea PCR products.

Lanes 1 and 2 contain PCR products generated using A. thaliana genomic DNA as a template and RevG4B and G3SA, and RevG4B and G3SB as primers, respectively. Lanes 3, 4, 5, and 6 contain PCR products generated using pea genomic DNA as a template with the primers RevG4A and G3SA, RevG4A and G3SB, RevG4B and G3SA, and RevG4B and G3SB, respectively. PCR products in the expected size range are indicated by the arrow. The markers (lane M) are indicated in base pairs.

Figure 28.

Figure 29. Amino acid and DNA sequence of the unrearranged and the rearranged pea PCR products.

The DNA and amino acid sequence of the unrearranged pea chlEF-G PCR product is shown in panel A. The DNA sequence is given in upper case letters. The corresponding amino acid sequence is given in lower case letters. Shown in bold type are the PCR primers RevG4A and G3SA. The arrow indicates the location of the internal EcoRI restriction site. Panel B contains the rearranged pea chlEF-G PCR product aligned to the E. coli sequence. Slashes designate PCR primers. Dashes indicate identical amino acid residues. Spaces indicate a gap in one sequence relative to the other.

Pea chlEF-G PCR (unrearranged) RevG4A-AAGACAACTACAACGGAAAGAATTCAAGTC

TACATGTCCCGGAGTATCAATGATATTGATCCTGTGCTTATCCCAAAACGTGGTGGTTGCAGCAGAAGTAAT G3SA-p t d i inirhkdwftttaasti

TGTAATCCCTCTTTCTTGTTCCTGTTCCATCCAGTCCATTGTGGCTGTCCCCTCGTGCACCTCTCCAATTTT tigreqeqemwdmtatgehvegik

GTAGTTTCTTCCGGGATAGAACAGAATTC y n r g p y f

1. Escherichia coli EF-G marttpiaryrnigisahidagktttterilfytgvn 2. Pea chlEF-G PCR (rearranged) /////------p-r-

1. hkigevhdgaatmdwmeqeqergititsaattafwsgmakqyephriniidtpghvdf 2 . y ------e-t------1 — dk------/////

Figure 29.

Figure 30. Alignment of two pea chlEF-G PCR products.

Shown in bold are the PCR primers RevPG and RevG4A. The amino acid sequence used to generate RevPG was determined by Mahinur Akkaya by peptide sequencing. RevPG was used in conjunction with G3SA to generate a chlEF-G PCR product whose DNA sequence is shown in line 1. Shown in line 2 is the partial DNA sequence of the pea chlEF-G PCR product generated using RevG4A and G3SA as primers as discussed in the text. The location of the Eco RI site is indicated with an arrow.


1. Pea chlEF-G RevPG-TGTCCCGTTGAAGGATTATCGCAA 2. Pea chlEF-G

1. CATTGGCATCATGGCTCACATAGACGCTGGAAAGACAACTACAACGGAAAGAATTC 2. RevG4A-AAGACAACTACAACGGAAAGAATTC i

Figure 30.

Figure 31. Pea genomic Southern.

Twenty micrograms of total pea genomic DNA was digested with EcoRI and analyzed by Southern blotting, using the cloned pea EF-G PCR product as a probe. The probe detected two fragments due to the presence of an EcoRI restriction site in the PCR product. The sizes of the two fragments detected are 4.0 kb and 1.9 kb as illustrated.
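The reasoning in this legend (a probe that spans an internal EcoRI site hybridizes to two genomic EcoRI fragments) can be illustrated with a minimal Python sketch. It assumes a linear, single-stranded representation of the sequence, uses the standard EcoRI recognition sequence GAATTC cut after the G, and applies it to the RevG4A primer region printed in Figure 29. The function names are hypothetical, and the sketch says nothing about the sizes of the genomic fragments themselves.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence (palindromic)
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC


def ecori_cut_positions(seq: str) -> list:
    """Return 0-based cut positions for every EcoRI site in a linear sequence."""
    seq = seq.upper()
    positions = []
    start = seq.find(ECORI_SITE)
    while start != -1:
        positions.append(start + ECORI_CUT_OFFSET)
        start = seq.find(ECORI_SITE, start + 1)
    return positions


def digest(seq: str) -> list:
    """Split a linear sequence at each EcoRI cut position."""
    bounds = [0] + ecori_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # RevG4A primer region as printed in Figure 29 (contains one EcoRI site).
    probe_region = "AAGACAACTACAACGGAAAGAATTCAAGTC"
    for fragment in digest(probe_region):
        print(len(fragment), fragment)
```

A single internal site yields two probe segments, each of which lies on a different genomic EcoRI fragment, which is consistent with the two bands described above.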


Figure 31.

Figure 32. Arabidopsis thaliana genomic Southern.

Three micrograms of A. thaliana genomic DNA were digested with HindIII (lane 4), EcoRI (lane 5), and BamHI (lane 6) and stained with ethidium bromide. Lanes 4-6 were analyzed by Southern blotting, using the cloned A. thaliana EF-G PCR product as a probe. Lanes 1, 2, and 3 correspond to lanes 4, 5, and 6, respectively. The marker sizes (lane M) are given in kilobase pairs.


Figure 32.

Figure 33. Pea northern analysis.

Twenty micrograms of 10 day dark grown total pea RNA was loaded in lane 1, twenty micrograms of 4 day light grown total pea RNA was loaded in lane 2, and twenty micrograms of 10 day light grown total pea RNA was loaded in lane 3. The RNA samples were analyzed by northern blotting, using the cloned pea EF-G PCR product as a probe. The probe detected a 2.3 kb light-induced RNA product as indicated by the arrow.

[Northern blot image: 2.3 kb band indicated]

Figure 33.

Figure 34. The α- and β-globin gene clusters (from Anand, 1989).

A scale, in kilobase pairs, is given in the top line. A schematic representation of the α-like globin gene cluster is shown in the second line. The β-like globin gene cluster is shown in line three. Solid boxes represent exons while open boxes represent introns.

[Figure: THE GLOBIN GENES. Scale in kb, 0 to 80, above the α-like and β-like globin gene clusters]

Figure 34.

Figure 35. Map of characterized deletions within the β-globin gene cluster (from Anand, 1989).

Shown schematically in the top line is a map of the β-globin gene cluster. β-globin genes are represented as closed boxes. The bars below the map represent the amount of DNA deleted and the location of a given deletion relative to the normal DNA of this region. Open boxes indicate that the precise deletion breakpoints have not been determined. Breaks in solid bars indicate that various amounts of DNA have been omitted from a given deletion map. The designation β° indicates that β-globin is not synthesized from a chromosome harboring this deletion. Likewise, Gγ Aγ(δβ)° indicates that neither δ nor β globin is synthesized from chromosomes harboring this deletion. Thalassemia (Thal); Hereditary Persistence of Fetal Hemoglobin (HPFH).

[Deletion map image: labels include β° Thal, Hb Lepore, Hb Kenya, (δβ)° Thal, and HPFH deletions from several populations]

Figure 35.

Figure 36. Comparison of normal DNA and the deletions associated with γδβ-thalassemia 1 and 2 and HPFH 1 and 2 (from Vanin et al., 1983).

Letters A through Z represent normal DNA; the globin gene cluster is underlined. Brackets surround regions that are deleted in the respective hemoglobinopathies.

206 globin gene cluster Normal ABCDE FGHI-JKIMNOPQ---RSTUVWXYZ

YS/J-thal 1 A B C D E f| ]o P Q---RSTUVWXYZ

YS/3- thal 2 abcoefgh J Jo---RSTUVWXYZ

HPFH 2 ABCDEFGH I ---J K Lm[ jv W X Y Z

HPFH 1 ABCDEFGH I---J K LMn[ JwXYZ

Figure 36.

Figure 37. Mechanism for the generation of related deletions by the loss of a chromatin loop during DNA replication (from Vanin et al., 1983).

Points A, B, C, and P, Q, R indicate unreplicated DNA while points A', B', C' and P', Q', R' designate DNA that has been replicated. Ellipses represent matrix attachment sites and their associated DNA replication machinery while breakage and reunion sites are indicated by an asterisk.

[Figure: schematic showing Deletion 1, Deletion 2, and Deletion 3]

Figure 37.

Figure 38. Mechanism for the generation of small deletions found within the human β-globin gene cluster (from Anand, 1989).

Points A, B, C, D, E, F, G, H, I represent DNA prior to replication while points A', B', C', D', E', F', G', H', I' indicate newly replicated DNA. A dashed line designates DNA that is anchored to the matrix. Small closed circles designate DNA polymerase α molecules while large open circles indicate matrix attachment sites. Asterisks above chromosomal loops indicate that the loop was removed during the generation of the corresponding deletion shown to the right. Brackets flank the amount of DNA removed for a given deletion. Note that E corresponds to a matrix attachment site and that all deletions flank this site.

[Figure: schematic of chromatin loops and the resulting deletions, points A through I]

Figure 38.

Figure 39. The predicted location of a matrix attachment site (from Anand, 1989).

The predicted location of a matrix attachment site in the second intervening sequence of the β-globin gene (β-IVS2) is illustrated. E1, E2, and E3 represent the three exons of the human β-globin gene. The three deletion break points, characterized by Anand and used to predict the location of the β-IVS2 MAR, are also shown.

[Figure: β-GLOBIN GENE, showing the MAR and the 1.4 kb, 4.3 kb, and 0.6 kb β°-Thal deletions]

Figure 39.

Figure 40. MAR sequences often contain topoisomerase II cleavage sites and reside next to known enhancers (from Cockerill et al., 1986).

Arrows indicate Drosophila topoisomerase II consensus sequences. Other A/T rich sequences are indicated by open and closed triangles. Open boxes designate exons, whereas open horizontal arrows indicate transcription units. Solid boxes indicate enhancers (except for the switch region, Sμ); cross-hatched boxes indicate MARs.

[Figure: maps of the mouse κ IgL, rabbit κ IgL, mouse μ IgH, Drosophila hsp70 (87A7 distal), and Drosophila histone cluster regions, with a legend for the topoisomerase II consensus and the A/T rich sequences AATATTTTT and ATATTT]

Figure 40.

Figure 41. Restriction enzyme map of AN2.1 (from Vanin et al., 1983).

The restriction enzyme map of AN2.1 is shown. The single-headed arrow indicates the location of an Alu I repetitive element. The double-headed arrow indicates the location of a second Alu I repetitive element of undetermined orientation. Triangles indicate the 5' deletion break points for 7

sites are: B, Bam HI; Bg, Bgl II; E, Eco RI; H, Hind III; K, Kpn I.

Figure 41.

Figure 42. Schematic representation of the in vitro procedure used to identify topoisomerase II cleavage sites.

[Flow diagram: Topo II, mAMSA → Sal I → Separate on 1% Agarose Gel → Southern Transfer → Probe with Bam-Sal Fragment → Determine location of cleavage site]

Figure 42.

Figure 43. Schematic representation of the in vitro and in vivo procedures used to identify DNA which specifically binds nuclear matrices (from Cockerill et al., 1986).

[Figure: flow diagram of the in vitro and in vivo MAR assays, starting from nuclei]

Figure 43.

Figure 44. Localization of an in vitro MAR within the mouse κ immunoglobulin gene and map of clone pG19/45 (from Cockerill et al., 1986).

Pertinent restriction enzyme sites within the mouse κ immunoglobulin gene are shown. DNA segments are illustrated as lines; open boxes indicate exons while closed boxes indicate enhancers. Cross-hatched boxes indicate MARs. pA indicates the site of polyadenylation. Lκ, Vκ, J, Eκ, and Cκ indicate leader, variable, joining, enhancer, and constant regions, respectively, of the mouse κ gene.

[Figure: restriction map showing HindIII, BamHI, and EcoRI sites]

Figure 44.

Figure 45. Control in vitro MAR assays.

In vitro DNA binding assays were performed as described using the mouse immunoglobulin κ gene MAR (pG19/45) as a control. Lanes A-G contain decreasing amounts of competitor E. coli DNA, beginning with 50 µg/ml in lane A and decreasing in increments of 10 µg/ml to lane E. Lane F contains 5 µg/ml of competitor DNA whereas lane G contains no competitor DNA. Electrophoretically resolved matrix bound fragments from assays using the 2.85 kb BamHI-HindIII fragment from pG19/45 as a probe are shown. The vector fragment was used as a control as illustrated.

[Gel image: bands labeled Vector and Mouse κ MAR]

Figure 45.

Figure 46. Localization of an in vitro matrix associated region in the second intervening sequence of the human β-globin gene.

In vitro DNA binding assays were performed as described. Lanes A-H contain increasing amounts of competitor E. coli DNA beginning with no DNA in lane A and increasing in increments of 10 µg/lane to lane H, which contains 70 µg of competitor DNA. Panel A contains electrophoretically resolved matrix bound fragments from assays using the 1.8 kb Bam HI fragment as a probe while panel B contains matrix bound fragments using the 0.9 kb Bam HI-Eco RI fragment as a probe.
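As a quick consistency check on the titration described in this legend, the lane-by-lane competitor amounts can be generated from the stated start and step. This is a minimal sketch; the helper name `competitor_series` is hypothetical and the units (µg per lane) are taken from the legend.

```python
def competitor_series(lanes: str, start: float, step: float) -> dict:
    """Map each lane label to its competitor DNA amount for a linear titration."""
    return {lane: start + i * step for i, lane in enumerate(lanes)}


# Figure 46: lanes A-H, no competitor in lane A, 10 µg increments per lane.
figure_46 = competitor_series("ABCDEFGH", start=0, step=10)

if __name__ == "__main__":
    for lane, amount in figure_46.items():
        print(lane, amount)
```

With eight lanes and a 10 µg step, the last lane carries 70 µg, which agrees with the value given for lane H.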

[Gel images: two panels, lanes A-H in each]

Figure 46.

Figure 47. Map of the β-globin gene and flanking regions illustrating the location of the nuclear matrix attachment site as well as the locations of specific restriction enzyme sites.

Solid boxes designate exons while open boxes indicate introns. Eco RI (E); Bam HI (B).

[Figure: map showing the MAR in IVS 2, with Bam HI (B) and Eco RI (E) sites delimiting the 1.8 kb and 0.9 kb fragments]

Figure 47.

Figure 48. Localization of two in vitro matrix associated regions in AN2.1.

In vitro DNA binding assays were performed as described. Lanes A-I contain increasing amounts of competitor E. coli DNA beginning with no DNA in lane A and increasing in increments of 10 µg/ml to lane F. Lane G contains 100 µg/ml of competitor DNA whereas lane I contains 150 µg/ml of competitor DNA. Electrophoretically resolved matrix bound fragments from assays using the Eco RI subclones of AN2.1 as probes are shown. pUC 19 was used as a control as illustrated.

[Gel image: lanes a-i; vector band indicated]

Figure 49. Map of AN2.1 illustrating the location of two matrix attachment sites as well as pertinent restriction enzyme sites.

The restriction enzyme map of AN2.1 is shown. The dashed lines indicate the locations of matrix associated regions. The single-headed arrow indicates the location of an Alu I repetitive element whereas the double-headed arrow indicates the location of a second Alu I repetitive element of undetermined orientation. Triangles indicate the 5' deletion breakpoints for 7

restriction enzyme sites are: B, BamHI; Bg, BglII; E, EcoRI; H, HindIII; K, KpnI. The sizes of the EcoRI subclones are listed in kilobase pairs.

[Figure: restriction map showing the MARs and the 1.3, 0.8, and 1.8 kb EcoRI subclones; scale bar, 1 kbp]

Figure 49.

Figure 50. Identification of in vitro topoisomerase II cleavage sites in the EcoRI subclones of AN2.1.

In vitro topoisomerase II cleavage reactions were performed as described. Lane 1 contains marker DNA. Lanes 2-8 contain topoisomerase II cleavage fragments generated using p4.0, p3.0, p1.8, p1.5, p1.3, p1.0, and p0.8, respectively, as templates whereas lane 9 contains topoisomerase II cleavage fragments generated using the vector pAT153 as a template. The locations of the most intense cleavage sites for p1.8, p1.3, p1.0, and p0.8 were determined. It should be noted that while p1.8 appears to contain no topoisomerase II cleavage sites, two faint bands are detectable on the original autoradiogram.

Figure 50.

Figure 51. Map of AN2.1 illustrating the location of in vitro topoisomerase II cleavage sites for subclones p0.8, p1.0, p1.3, and p1.8.

The restriction enzyme map of AN2.1 is shown. The vertical arrows indicate the locations of in vitro topoisomerase II cleavage sites. The horizontal arrows indicate the locations of Alu I repetitive elements whereas triangles designate the locations of the 5' breakpoints for 7

BamHI; Bg, BglII; E, EcoRI; H, HindIII; K, KpnI. The sizes of the EcoRI subclones are listed in kilobase pairs.

[Figure: restriction map of AN2.1 showing the 0.9, 3.0, 4.0, 1.5, 1.3, 0.8, and 1.8 kb EcoRI subclones and the γδβ thal 1 and γδβ thal 2 breakpoints; scale bar, 1 kbp]

Figure 51.

LIST OF REFERENCES

Adachi, Y., Kas, E. and Laemmli, U.K. (1989). EMBO J. 8, 3997.

Adams, J.W., Kaufman, R.E., Kretschmer, D.J., Harrison, M. and Nienhuis, A.W. (1980). Nuc. Acids Res. 8, 6113.

Adrian, McCammon, M.T., Montgomery, D.L. and Douglas, M.G. (1986). Mol. Cell. Biol. 6, 626.

Akkaya, M.S. (1991). Ph.D. Thesis. The Ohio State University, Columbus, Ohio.

Akkaya, M.S. and Breitenberger, C.A. (1992). Plant. Mol. Bio. Manuscript accepted.

An, G. and Friesen, J.D. (1980). Gene 12, 33.

Anand, R. (1989). Ph.D. Thesis. The Ohio State University, Columbus, Ohio.

Anderson, S., Bankier, A.T., Barrell, B.G., de Bruijn, M.H.L., Coulson, A.R., Drouin, J., Eperon, I.C., Nierlich, D.P., Roe, B.A., Sanger, F., Schreier, P.H., Smith, A.J.H., Staden, R. and Young, I.G. (1981). Nature 290, 457.

Ausubel, F., Brent, R., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (eds.), (1989). In, Current Protocols In Molecular Biology. Greene Publishing Associates and Wiley Interscience, Vols. 1 & 2.

Bachmair, A., Finley, D. and Varshavsky, A. (1986). Science 234, 179.

Bae, Y.S., Kawasaki, I., Ikeda, H. and Liu, L.F. (1988). Proc. Natl. Acad. Sci. USA 85, 2076.

Baldauf, S.L. and Palmer, J.D. (1990). Nature 344, 262.

Barath, Z. and Kuntzel, H. (1972). Proc. Natl. Acad. Sci. USA 69, 1371.

Beale, S.I., Foley, T. and Dzelzkalns, V. (1981). Proc. Natl. Acad. Sci. USA 78, 1666.

Bedbrook, J.R., Smith, S.M. and Ellis, R.J. (1980). Nature 287, 692.

Benyajati, C. and Worcel, A. (1976). Cell 9, 393.

Berrios, M., Osheroff, N. and Fisher, P.A. (1985). Proc. Natl. Acad. Sci. USA 82, 4142.

Bibb, M.J., van Etten, R.A., Wright, C.T., Walberg, M.W. and Clayton, D.A. (1981). Cell 26, 167.

Biswas, T.K. (1990). Proc. Natl. Acad. Sci. USA 87, 9338.

Bode, J., Kohwi, Y., Dickinson, L., Joh, T., Klehr, D., Mielke, C. and Kohwi-Shigematsu, T. (1982). Science 255, 195.

Bode, J. and Maass, K. (1988). Biochemistry 27, 4706.

Bonen, L. and Doolittle, W.F. (1975). Proc. Natl. Acad. Sci. USA 72, 2310.

Bonen, L. and Doolittle, W.F. (1976). Nature 261, 669.

Bonen, L. and Doolittle, W.F. (1978). J. Mol. Evol. 10, 283.

Bonen, L., Cunningham, R.S., Gray, M.W. and Doolittle, W.F. (1977). Nuc. Acids Res. 4, 663.

Bonen, L., Doolittle, W.F. and Fox, G.E. (1979). Can. J. Biochem. 57, 879.

Bourne, H.R., Sanders, D.A. and McCormick, F. (1990). Nature 348, 125.

Bourne, H.R., Sanders, D.A. and McCormick, F. (1991). Nature 349, 117.

Breitenberger, C.A., Graves, M.C. and Spremulli, L.L. (1979). Arch. Biochem. Biophys. 194, 265.

Breitenberger, C.A., Moore, M.N., Russell, D.W. and Spremulli, L.L. (1979). Anal. Biochem. 99, 434.

Breitenberger, C.A. and Spremulli, L.L. (1980). J. Biol. Chem. 255, 9814.

Breitenberger, C.A. and RajBhandary, U.L. (1985). Trends in Biochem. Sci. 10, 478.

Brendler, T., Godefroy-Colburn, T., Carlill, R.D. and Thach, R.E. (1981). J. Biol. Chem. 256, 11747.

Brendler, T., Godefroy-Colburn, T., Yu, S. and Thach, R.E. (1981). J. Biol. Chem. 256, 11755.

Brown, T.A., Waring, R.B., Scazzocchio, C. and Davies, R.W. (1985). Curr. Genet. 9, 113.

Buchanan, B.B. (1980). Ann. Rev. Plant Phys. 31, 341.

Bunn, H.F. and Forget, B.G. (1986). Hemoglobin: Molecular, Genetic and Clinical Aspects. W. B. Saunders Company, 169-222.

Buttarelli, F.R., Calogero, R.A., Tiboni, O., Gualerzi, C.O. and Pon, C.L. (1989). Mol. Gen. Genet. 217, 97.

Chatton, B., Walter, P., Ebel, J-P., Lacroute, F. and Fasiolo, F. (1988). J. Biol. Chem. 262, 52.

Chomczynski, P. and Sacchi, N. (1987). Analytical Biochemistry 162, 156.

Ciferri, O., Di Pasquale, G. and Tiboni, O. (1979). Eur. J. Biochem. 102, 331.

Ciferri, O. and Tiboni, O. (1976). Plant Sci. Lett. 7, 455.

Clarke, H.J. and Masui, Y. (1983). Dev. Biol. 97, 291.

Cockerill, P.N. and Garrard, W.T. (1986). Cell 44, 273.

Cockerill, P.N., Yuen, M-H. and Garrard, W.T. (1987). J. Biol. Chem. 262, 5394.

Coggins, L.W., Grindlay, G.J., Krass, J., Slater, A.A., Montague, P., Stinson, M.A. and Paul, J. (1980). Nuc. Acids Res. 8, 3319.

Collins, F.S., Cole, J.L., Lockwood, W.K. and Iannuzzi, M.C. (1987). Blood 70, 1797.

Collins, F.S., Drumm, M.T., Cole, J.L., Lockwood, W.K., Vande Woude, G.F. and Iannuzzi, M.C. (1987). Science 235, 1046.

Cook, P.R. and Brazell, I.A. (1975). J. Cell Sci. 19, 261.

Corriveau, J.L. and Beale, S.I. (1986). Plant Sci. 45, 9.

Corry, M.J., Payne, P.I. and Dyer, T.A. (1974). FEBS Lett. 46, 63.

Dang, H. and Ellis, S.R. (1990). Nuc. Acids Res. 18, 6895.

Dierks, P., van Ooyen, A., Mantel, N. and Weissmann, C. (1981). Proc. Natl. Acad. Sci. USA 78, 1411.

Dijkwel, P.A. and Hamlin, J.L. (1988). Mol. Cell. Biol. 8, 5398.

Dixon, L.K., Forde, B.G., Forde, J. and Leave, C.J. (1980) in Kroon, A. M. and Saccone, C. (eds.), In The Organization and Expression of the Mitochondrial Genome. Elsevier, Amsterdam, pp. 365-368.

Doolittle, W.F. and Bonen, L. (1981). Ann. N.Y. Acad. Sci. 361, 248.

Douglas, S.E. (1991). Curr. Genet. 19, 289.

Drumm, H.E. and Margulies, M.M. (1971). Plant Phys. 45, 435.

Dube, F. and Dufresne, I. (1990). J. Exp. Zool. 256, 323.

Dyer, T.A. and Bowman, C.M. (1979). J. Biochem. 183, 595.

Earnshaw, W.C. and Heck, M.M.S. (1985). J. Cell Biol. 100, 1716.

Earnshaw, W.C., Halligan, B., Cooke, C.A., Heck, M.M.S. and Liu, L.F. (1985). J. Cell Biol. 100, 1706.

Eberly, S.L. and Spremulli, L.L. (1985). Arch. Biochem. Biophys. 243, 246.

Eberly, S.L. and Spremulli, L.L. (1986). Arch. Biochem. Biophys. 245, 338.

Efstratiadis, A., Posakony, J.W., Maniatis, T., Lawn, R.M., O'Connell, C., Spritz, R.A., DeRiel, J.K., Forget, B.G., Weissman, S.M., Slingtom, J.L., Blechl, A.E., Smithies, O., Baralle, F.E., Shoulders, C.C. and Proudfoot, N.J. (1980). Cell 21, 653.

Foley, T. and Beale, S.I. (1982). Plant Physiol. 70, 1495.

Fonzi, W.A., Katayama, C., Leathers, T. and Sypherd, P.S. (1985). Mol. Cell. Biol. 5, 1100.

Forchhammer, K., Leinfelder, W. and Bock, A. (1989). Nature 342, 453.

Fox, L., Magrum, L.J., Balch, W.E., Wolfe, R.S. and Woese, C.R. (1977). Proc. Natl. Acad. Sci. USA 74, 4537.

Fox, L., Erion, J., Tarnowski, J., Spremulli, L., Brot, N. and Weissbach, H. (1980). J. Biol. Chem. 255, 6018.

Fritsch, E.F., Lawn, R.M. and Maniatis, T. (1980). Cell 19, 959.

Gasser, S.M., Ohashi, A., Daum, G., Bohni, P.C., Gibson, J., Reid, G.A., Yonetani, T. and Schatz, G. (1982). Proc. Natl. Acad. Sci. USA 79, 267.

Gasser, S.M. and Laemmli, U.K. (1986). Cell 46, 521.

Gasser, S.M. and Laemmli, U.K. (1987). Trends Genet. 3, 16.

Gay, D.A., Sisodia, S.S. and Cleveland, D.W. (1989). Proc. Natl. Acad. Sci. USA 86, 5763.

Gilkin, G.C., Gargiulo, G., Rena-Descalzi, L. and Worcel, A. (1983). Nature 303, 770.

Gilkin, G.C., Ruberti, I. and Worcel, A. (1984). Cell 37, 33.

Giovannoni, S.J., Turner, S., Olsen, G., Barns, S., Lane, D.J. and Pace, N. (1988). J. Bacteriol. 170, 3584.

Godefroy-Colburn, T. and Thach, R.E. (1981). J. Biol. Chem. 11762.

Gold, J.C. and Spremulli, L.L. (1985). J. Biol. Chem. 260, 14897.

Gross, D.S. and Garrard, W.T. (1987). Trends Biochem. Sci. 12, 293.

Grossman, A.R., Bartlett, S.G., Schmidt, G.W., Mullet, J.E. and Chua, N-H. (1982). J. Biol. Chem. 257, 1558.

Grosveld, G.C., de Boer, E., Shewmaker, C.K. and Flavell, R.A. (1982). Nature 295, 120.

Gruol, D., Rawson, J.R.Y. and Haselkorn, R. (1975). Euglena. Biochim. Biophys. Acta 414, 20.

Gusella, J., Varsanyi-Breiner, A., Kao, E.T., Jones, C., Puck, T.T., Keys, C., Orkin, S. and Housman, D. (1979). Proc. Natl. Acad. Sci. USA 76, 5239.

Han, S., Udvary, A. and Schedl, P. (1984). J. Mol. Biol. 179, 469.

Han, S., Udvary, A. and Schedl, P. (1985). J. Mol. Biol. 183, 13.

Hardison, R.C., Sawada, I., Cheng, J-F., Shen, C-K. and Schmid, C.W. (1986). Nuc. Acids Res. 14, 1903.

Harland, R.M., Weintraub, H. and McKnight, S.L. (1983). Nature 302, 38.

Hase, Muller, U., Riezman, H. and Schatz, G. (1984). EMBO J. 3, 3157.

Hayes, J. M. (1983). In, J. W. Schopf (ed.), The Earth's earliest biosphere, its origins and evolution. Princeton University Press, Princeton, N. J.

Hecker, L., Egan, J., Reynolds, R.J., Nix, C.E., Schiff, J.A. and Barnett, W.E. (1974). Proc. Natl. Acad. Sci. USA 71, 1910.

Henthorn, P.S., Mager, D.L., Huisman, T.H.J. and Smithies, O. (1986). Proc. Natl. Acad. Sci. USA 83, 5194.

Hiratsuka, J., Shimada, H., Whittier, R., Ishibashi, T., Sakamoto, M., Mori, M., Kondo, C., Honji, C., Sun, C-R., Meng, B-Y., Li, Y-Q., Kanno, A., Nishizawa, Y., Hirai, A., Shinozaki, K. and Sugiura, M. (1989). Mol. Gen. Genet. 217, 185.

Hochstrasser, M. and Varshavsky, A. (1990). Cell 61, 697.

Homberger, H.P. (1989). Chromosoma 98, 99.

Horwich, Kalousek, F., Fenton, W.A., Pollock, R.A. and Rosenberg, L.E. (1986). Cell 44, 451.

Hurt, E.C., Pesold-Hurt, B. and Schatz, G. (1984). FEBS Lett. 178, 306.

Hurt, E.C., Soltanifar, N., Goldschmidt-Clermont, M., Rochaix, J-D. and Schatz, G. (1986). EMBO J. 5, 1343.

Jagadeeswaran, P. et al., (1981). In, Brown, D., and Fox, C.F. (eds), Developmental Biology Using Purified Genes. New York, Academic Press. ICN-UCLA Symp. Mol. Biol. Vol. XXIII, 71-84.

Jacquet, E., and Parmeggian, A. (1988). EMBO J. 2, 2861.

Jarman, A.P. and Higgs, D.R. (1988). EMBO J. X, 3337.

Jaskunas, S.R., Lindahl, L., Nomura, L., and Burgess, R.R. (1975). Nature 257, 458.

Karlin-Neuman, G.A. and Tobin, E.M. (1986). EMBO J. 5, 9-13.

Kaufman, R.E., Kretschmer, P.J., Adams, J.W., Coon, H.C., Anderson, W.F. and Nienhuis, A.W. (1980). Proc. Natl. Acad. Sci. USA 77, 4229.

Keller, E.B. and Noon, W.A. (1984). Proc. Natl. Acad. Sci. USA 81, 7417.

Kellum, R. and Schedl, P. (1991). Cell 64, 941.

Keng, T., Alani, E. and Guarente, L. (1986). Mol. Cel. Bio. 6, 355.

Khowi-Shigematsu, T., Gelinas, R. and Weintraub, H. (1983). Proc. Natl. Acad. Sci. USA 80, 4389.

Khowi-Shigematsu, T. and Khowi, Y. (1990). Biochemistry 29, 9551.

Kitakawa, M., Grohmann, L., Graack, H-R. and Isone, K. (1990). Nuc. Acids Res. 18, 1521.

Klehr, D., Maass, K. and Bode, J. (1991). Biochemistry 30, 1264.

Kmiec, E.B. and Worcel, A. (1985). Cell 41, 945.

Kohno, K., Uchida, T., Ohkubo, H., Nakanishi, S., Nakanish, T., Fukui, T., Ohtsuka, E., Ikehara, M. and Okada, Y. (1986). Proc. Natl. Acad. Sci. USA 83, 4978.

Koerner, T.J., Meyers, A.M., Lee, S. and Tzagoloff, A. (1987). J. Biol. Chem. 262, 3690.

Kraft, R., Tardiff, J., Krauter, K.S. and Leinwand, L.A. (1988). Biotechniques 6 No. 6, 544.

Kraus, M., Gotz, M. and Loffelhardt, W. (1990). Plant Molecular Biology 15, 561.

Kuhlemeier, C., Strittmatter, G., Ward, K. and Chua, N-H. (1989). Plant Cell 1, 471.

Laemmli, U., Cheng, S.M., Adolph, K.W., Paulson, J.R., Brown, J. A. and Bawmbach, W.R. (1977). Cold Spring Harbor Symposium.

Lamppa, G.K. and Bendich, A.J. (1984). Planta 162, 463.

Lamppa, G.K., Nagy, F. and Chua, N-H. (1985). Nature 316, 750.

Larson, A. and Weintraub, H. (1982). Cell 29, 609.

Lawn, R.M., Efstratiadis, A., O'Connell, C. and Maniatis, T. (1980). Cell 21, 647.

Lax, S.R., Lauer, S.J., Browning, K.S. and Ravel, J.M. (1986). Meth. Enzymol. 118, 109.

Lebo, R.V., Carrano, A.V., Burkhart-Schultz, K., Dozy, A.M., Yu, L-C. and Kan, Y-W. (1979). Proc. Natl. Acad. Sci. USA 76, 5804.

Lechner, K., Heller, G. and Bock, A. (1988). Nucl. Acids Res. 16, 7817.

Levy-Wilson, B. and Fortier, C. (1990). J. Biol. Chem. 264, 21196.

Lindahl, L. and Zengel, J.M. (1986). Ann. Rev. Genet. 20, 297.

Luchnik, A.N., Bakayev, V.V., Zbarsky, I.B. and Georgiev, G.P. (1982). EMBO J. 1, 1353.

Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.

March, P.E. and Inouye, M. (1985). Proc. Natl. Acad. Sci. USA 82, 7500.

Mayes, S.R. and Barber, J. (1990). Nuc. Acids Res. 18, 194.

Meng, B.Y., Shinozaki, K., and Sugiura, M. (1989). Mol. Gen. Genet. 216, 25.

Mertz, J.E. (1982). Mol. Cell. Biol. 2, 1608.

Meyers, A.M., Crivellone, M.D., Koerner, T.J. and Tzagoloff, A. (1987). J. Biol. Chem. 262, 16822.

Meyers, A.M. and Tzagoloff, A. (1985). J. Biol. Chem. 260, 15371.

Mielke, C., Kohwi, Y., Kohwi-Shigematsu, T. and Bode, J. (1990). Biochemistry 29, 7475.

Miller, M.E., Jurgenson, J.E., Reardon, E.M. and Price, C.A. (1983). J. Biol. Chem. 258, 14478.

Mirkovitch, J., Mirault, M-E. and Laemmli, U.K. (1984). Cell 39, 223.

Miyajima, A., Shibuya, M. and Kaziro, Y. (1979). FEBS Lett. 102, 207.

Montandon, P.E. and Stutz, E. (1983). Nucl. Acids Res. 11, 5877.

Mullett, J.E. and Chua, N-H. (1983). Meth. Enzymol. 97, 502.

Nagata, S., Tsunetsugu-Yokota, Y., Naito, A. and Kaziro, Y. (1983). Proc. Natl. Acad. Sci. USA 80, 6192.

Nairn, A.C. and Palfrey, H.C. (1987). J. Biol. Chem. 262, 17299.

Natsoulis, G., Hilger, F. and Fink, G. (1986). Cell 46, 235.

Neant, I. and Guerrier, P. (1988). Develop. 102, 505.

Nicholls, R.D., Fischel-Ghodsian, N. and Higgs, D.R. (1987). Cell 49, 369.

Nielsen, P.J. and McConkey, E.H. (1980). J. Cell. Physiol. 104, 69.

Nguyen, T.D. (1989). Biotechniques 2 No. 3, 238.

Nomura M . , Gourse, R. and Baughman, G. (1984). Ann. Rev. Biochem. 53, 75.

North, G. (1985). Nature 316. 394.

Ohama, T., Yamao, A., Muto, A. and Osawa, S. (1987). J. Bacteriol. 169, 4770.

Ohyama, K., Fukuzawa, H., Kohchi, T., Shirai, H., Sano, T., Sano, S., Umesono, K., Shiki, Y., Takeuchi, M., Chang, Z., Aota, S., Inokuchi, H. and Ozeki, H. (1986). Nature 322, 572.

Pape, L.K., Koerner, T.J. and Tzagoloff, A. (1985). J. Biol. Chem. 260, 15362.

Partaledis, J.A. and Mason, T.L. (1988). Mol. Cell. Biol. 8, 3647.

Passavant, C.W., Stiegler, G.L. and Hallick, R.B. (1983). J. Biol. Chem. 258, 693.

Paulson, J.R. and Laemmli, U.K. (1977). Cell 12, 817.

Pechman, K.J. and Woese, C.R. (1972). J. Mol. Evol. 1, 230.

Phillips, D.O. and Carr, N.G. (1975). FEBS Lett. 60, 94.

Phillips, D.O. and Carr, N.G. (1977). Taxon 26, 3.

Phillips, D.O. and Carr, N.G. (1981). Ann. N.Y. Acad. Sci. 361, 298.

Phi-Van, L., von Kries, J.P., Ostertag, W. and Stratling, W. (1990). Mol. Cell. Biol. 10, 2302.

Phi-Van, L. and Stratling, W.H. (1990). Prog. Mol. Subcell. Biol. 11, 1.

Pigott, G.H. and Carr, N.G. (1972). Euglena gracilis Science, N.Y. 175, 1259.

Poncz, M., Schwartz, E., Ballantine, M. and Surrey, S. (1983). J. Biol. Chem. 258, 11599.

Probst, H. and Herzog, R. (1985). Eur. J. Biochem. 146, 167.

Pruitt, S.C. and Reeder, R.H. (1984). J. Mol. Biol. 174, 121.

Ray, A., Walden, W.E., Brendler, T., Zenger, V.E. and Thach, R.E. (1985). Biochem. 24, 7525.

Reger, B.J., Smillie, R.M. and Fuller, R.C. (1972). Plant Phys. 50, 19.

Reger, B.J., Smillie, R.M. and Fuller, R.C. (1972). Plant Phys. 50, 24.

Reiss, B., Wasmann, C. and Bohnert, H. (1987). Mol. Gen. Genet. 209, 116.

Roper, M.D. and Wicks, W.D. (1978). Proc. Natl. Acad. Sci. USA 75, 40.

Rose, M.D., Novick, P., Thomas, J.H., Botstein, D. and Fink, G.R. (1987). Gene 60, 237.

Ruskin, B., Krainer, A.R., Maniatis, T. and Green, M.R. (1984). Cell 38, 317.

Ryazanov, A.G., Shestakova, E.A. and Natapov, P.G. (1988). Nature 334, 170.

Ryazanov, A.G. and Davydova, E.K. (1989). FEBS Lett. 251, 187.

Ryoji, M. and Worcel, A. (1984). Cell 37, 21.

Ryoji, M. and Worcel, A. (1985). Cell 40, 923.

Sanders-Haig, L., Anderson, W.F. and Francke, U. (1980). Nature 283, 683.

Schantz, R. et al. (1975). Plant Sci. Lett. 5, 313.

Schleyer and Neupert (1985). Cell 43, 339.

Schmidt, G.W. and Mishkind, M.L. (1986). Ann. Rev. Biochem. 55, 879.

Schopf, J.W., Hayes, J.M. and Walter, M.R. (1983). In, J.W. Schopf (ed.), The Earth's earliest biosphere, its origins and evolution. Princeton University Press, Princeton, N.J.

Schwartz and Kossel, H. (1980). Nature 283, 739.

Selleck, S.B., Elgin, S.C. and Cartwright, I.L. (1984). J. Mol. Biol. 178, 17.

Shafit-Zagardo, B., Brown, F.L., Maio, J.J. and Adams, J.W. (1982). Gene 20, 397.

Shestakova, E.A. and Ryazanov, A.G. (1987). Dokl. Akad. Nauk. SSSR 297, 1495 (in Russian).

Shibuya, M., Nashimoto, H. and Kaziro, Y. (1979). Mol. Gen. Genet. 170, 231.

Shinozaki, K., Ohme, M., Tanka, M., Wakasugi, T., Hayashida, N., Matsubayashi, T., Zaita, N., Chunwongse, J., Obokata, J., Yamaguchi-Shinozaki, K., Ohto, C., Torazawa, K., Meng, B.Y., Sugita, M., Deno, H., Kamogashira, T., Yamada, K., Kusada, J., Takaiwa, F., Kato, A., Tohdoh, N., Shimada, H. and Sugiura, M. (1986). EMBO J. 5, 2043.

Siddell, S.G. and Ellis, R.J. (1975). Biochem. J. 146, 675.

Silverthorne, J. and Tobin, E.M. (1984). Proc. Natl. Acad. Sci. USA 81, 1112.

Slightom, J.L., Blechl, A.E. and Smithies, O. (1980). Cell 21, 627.

Smeekens, S., De Groot, M., Van Binsbergen, J. and Weisbeek, P. (1986). Cell 46, 365.

Southern, E. (1975). J. Mol. Biol. 98, 503.

Spitzner, J.R. and Muller, M.T. (1988). Nucl. Acids Res. 16, 5533.

Spremulli, L.L. (1982). Arch. Biochem. Biophys. 214, 734.

Spritz, R.A., DeRiel, J.K., Forget, B.G. and Weissman, S.M. (1980). Cell 21, 638.

Stief, A., Winter, D.M., Stratling, W.H. and Sippel, A.E. (1989). Nature 341, 343.

Stratling, W.H. and Dolle, A. (1986). Biochemistry 25, 495.

Tanka, M., Wakasugi, T., Sugita, M., Shinozaki, K. and Sugiura, M. (1986). Proc. Natl. Acad. Sci. USA 82, 6030.

Taramelli, R., Kioussis, D., Vanin, E., Bartram, K., Groffen, J., Hurst, J. and Grosveld, F.G. (1986). Nucl. Acids Res. 14, 7017.

Taylor, F.R. (1986). In: Lee, J.L. and Frederick, J.F. (eds), Endocytobiology III, pp. 1-16. Ann. N.Y. Acad. Sci.

Teem, J.L., Abovich, N., Kaufer, N.F., Schwindinger, W.F., Warner, J.R., Levy, A., Woolford, J., Leer, R.J., van Raamsdonk-Duim, M.M.C., Mager, W.H., Planta, R.J., Schultz, L., Frieser, J.D., Fried, H. and Rosbash, M. (1984). Nucl. Acids Res. 12, 8295.

Thomas, F., Massenet, O., Dorne, A.M. and Mache, R. (1988). Nucl. Acids Res. 16, 2461.

Tiboni, O., Cantoni, R., Creti, R., Cammarano, P. and Sanangelantoni, A. (1991). J. Mol. Evol. 33, 142.

Tiboni, O. and Di Pasquale, G. (1987). Biochim. Biophys. Acta 908, 113.

Tobin, E.M. and Silverthorne, J. (1985). Ann. Rev. Plant Phys. 36, 569.

Udvardy, A., Schedl, P., Sander, M. and Hsieh, T. (1985). Cell 40, 933.

Udvardy, A., Maine, E. and Schedl, P. (1985). J. Mol. Biol. 185, 341.

Vambutas, A., Ackerman, S.H. and Tzagoloff, A. (1991). Eur. J. Biochem. 201, 643.

Van den Broeck, G., Timko, M.P., Kausch, A.P., Cashmore, A.R., Van Montagu, M. and Herrera-Estrella, L. (1985). Nature 313, 358.

Van Loon, A., Brandli, A.W. and Schatz, G. (1986). Cell 44, 801.

Vanin, E.F., Henthorn, P.S., Kioussis, D., Grosveld, F. and Smithies, O. (1983). Cell 35, 701.

Villeponteau, B., Lundell, M. and Martinson, H. (1984). Cell 39, 469.

Vogelstein, B., Pardoll, D.M. and Coffey, D.S. (1980). Cell 22, 79.

Von Kries, J.P., Phi-Van, L., Diekmann, S. and Stratling, W.H. (1990). Nucl. Acids Res. 18, 3881.

Walden, R. and Leaver, C.J. (1981). Plant Phys. 67, 1090.

Walden, W.E., Godefroy-Colburn, T. and Thach, R.E. (1981). J. Biol. Chem. 256, 11739.

Walden, W.E. and Thach, R.E. (1986). Biochem. 25, 2033.

Walter, M.R. (1983). In, J.W. Schopf (ed.), The Earth's earliest biosphere, its origins and evolution. Princeton University Press, Princeton, N.J.

Watson, J.C. and Surzycki, S.J. (1982). Proc. Natl. Acad. Sci. USA 79, 2264.

Watson, J.C. and Thompson, W.F. (1986). Purification and Restriction Endonuclease Analysis of Plant Nuclear DNA. Methods in Enzymology. Academic Press, Inc., 118, 57-75.

Weaver, D.T. and DePamphilis, M.L. (1982). J. Biol. Chem. 257, 2075.

Weintraub, H. (1983). Cell 32, 1191.

Welcsh, P.L., Johnson, D.R. and Breitenberger, C.A. (1992). Manuscript in preparation.

Wolstenholme, D.R. and Clary, D.O. (1985). Genetics 109, 725.

Woese, C.R. and Fox, G.E. (1977). Proc. Natl. Acad. Sci. USA 74, 5088.

Yakhnin, A., Vorozheykina, D., and Matvienko, N. (1989). Nucl. Acids Res. 21, 8863.

Yang, L., Rowe, T.C., Nelson, E.M. and Liu, L.F. (1985). Cell 41, 127.

Yokota, T., Sugosaki, H., Takanami, M., and Kaziro, Y. (1980). Gene 12, 25.

Zablen, L.B., Kissil, M.S., Woese, C.R. and Buetow, D.F. (1975). Proc. Natl. Acad. Sci. USA 72, 2418.

Zampetti-Bosseler, F., Huez, G. and Brachet, J. (1973). Exp. Cell Res. 78, 383.

Zengel, J.M., Archer, R.H., and Lindahl, L. (1984). Nucl. Acids Res. 12, 2181.

Zhou, K-X., Quigley, F., Massenet, O. and Mache, R. (1989). Mol. Gen. Genet. 216, 439.