Proc. Natl. Acad. Sci. USA Vol. 82, pp. 5676-5680, September 1985 Biochemistry

Synthesis of sperm and late histone cDNAs of the sea urchin with a primer complementary to the conserved 3' terminal palindrome: Evidence for tissue-specific and more general histone gene variants

(cDNA cloning/H2A and H2B histone variants/developmental regulation/gene conversion)

MEINRAD BUSSLINGER AND ALCIDE BARBERIS

Institut für Molekularbiologie II der Universität Zürich, Hönggerberg, 8093 Zürich, Switzerland

Communicated by Max L. Birnstiel, May 6, 1985

ABSTRACT We have cloned histone cDNAs from total RNA isolated from testis and from gastrula-stage embryos of the sea urchin Psammechinus miliaris. The reverse transcription of histone mRNAs was specifically primed with an oligonucleotide that is complementary to the conserved palindromic sequence present at the 3' end of nonpolyadenylylated histone mRNAs. Two sperm H2B, two late H2B, and three late H2A variant cDNA clones were isolated and characterized by DNA sequence analysis. These cDNA clones were used to study the accumulation of histone mRNA during sea urchin embryogenesis. The different late H2A and H2B mRNAs are present in as few as 200 copies in the egg and each accumulate to 3-5 × 10^5 molecules in the gastrula embryo. One of the late mRNAs, the H2A-3 mRNA, is also abundant in testis RNA and codes for the H2A variant present in sperm chromatin. The late H2A-3 protein is therefore a more prevalent H2A variant of the sea urchin. In contrast, the two sperm H2B mRNAs are found in testes but not ovaries and embryos of the sea urchin, suggesting that the sperm H2B genes are expressed only during spermatogenesis. In addition, evidence for gene conversion between two late H2A gene variants is presented.

The sea urchin genome contains four developmentally regulated histone gene families coding for the early, late, sperm, and cleavage-stage histones (reviewed in refs. 1 and 2). The early histone proteins are synthesized during the first hours of sea urchin development up to the hatching blastula stage. Subsequently the synthesis of these early histones is replaced by that of the late proteins (3-5). The sperm histones are found in the highly condensed male pronucleus (6, 7), whereas the so-called cleavage-stage proteins are the histones in the egg and early cleavage-stage chromatin (8, 9). This program of differential histone gene expression appears to be the result of transcriptional as well as posttranscriptional regulation, at least in the case of the early and late histone genes that have been analyzed (10-12).

The highly repetitive early histone genes outnumber the late, sperm, and cleavage-stage histone gene variants by two orders of magnitude in the sea urchin genome (13, 14). Furthermore, the various histone gene families have diverged far from each other (13) and therefore cross-hybridize poorly with one another. Both these facts prevent the efficient isolation of the less abundant histone gene variants from a chromosomal DNA library by screening with early histone gene probes. Homologous probes for the late, sperm, and cleavage-stage histone genes are therefore required for their cloning. Maxson et al. (14) have used highly purified late histone mRNA as a probe for the isolation of the late histone genes of the sea urchin Strongylocentrotus purpuratus. In order to eliminate the risk of contamination by early histone mRNAs, however, it would be preferable to first clone the variant histone mRNAs as cDNAs from a specific tissue or developmental stage and then use these cDNAs for the isolation of the corresponding chromosomal histone genes.

All nonpolyadenylylated histone mRNAs analyzed to date contain a highly conserved palindromic sequence at their 3' end (1, 15). We have exploited this histone-specific sequence to clone variant histone cDNAs from total sea urchin RNA by priming reverse transcription with an oligonucleotide complementary to this conserved sequence. We have demonstrated the potential of this approach by cloning sperm and late H2A and H2B cDNAs from testis and gastrula RNA of the sea urchin Psammechinus miliaris. These cDNA clones were used to study the accumulation of the respective histone mRNAs during embryogenesis.

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

MATERIALS AND METHODS

Materials. Sea urchins (P. miliaris) were obtained from the Marine Biological Station Millport (Millport, Scotland) and avian myeloblastosis virus reverse transcriptase from Life Sciences (St. Petersburg, FL). The oligonucleotide primer was kindly provided by R. A. Flavell (Biogen, Boston); Escherichia coli adenylate transferase, by M. Billeter (University of Zurich); and calf thymus terminal deoxynucleotidyltransferase, by C. Weissmann (University of Zurich).

Preparation of RNA. Sea urchin embryos were cultured at 16°C as described (16). Total RNA was prepared from embryos, testes, and ovaries by phenol extraction and was then freed from DNA and polysaccharides by purification on CsCl step gradients (17).

Reverse Transcription of Histone mRNAs. Total RNA (20 µg) and 20 pmol of synthetic primer were denatured in 3 µl of 10 mM Tris Cl, pH 7.5/1 mM EDTA at 65°C for 2 min and then were hybridized for 30 min by lowering the temperature slowly from 20°C to 0°C. cDNA synthesis was started by adding 17 µl of the reaction mixture containing 10 units of reverse transcriptase; 50 mM Tris Cl (pH 8.3); 30 mM KCl; 8 mM MgCl2; 1 mM dithiothreitol; 0.1 mM each dATP, dCTP, dGTP, and dTTP; and 10 µCi (1 Ci = 37 GBq) of [α-32P]dCTP. After 1 hr at 42°C and subsequent RNA hydrolysis in 0.2 M NaOH at 42°C for another hour, the cDNA was fractionated by electrophoresis in an 8% polyacrylamide sequencing gel.

Cloning of Histone cDNA. For preparative cDNA synthesis, 200 µg of total RNA was used. cDNA transcripts of 300-700 nucleotides were eluted from polyacrylamide gels and cloned as described in detail by Land et al. (18). The cDNAs were elongated at the 3' end with dTMP by calf thymus terminal deoxynucleotidyltransferase prior to second-strand synthesis with avian myeloblastosis virus reverse transcriptase using (dA)12-18 as primer. The double-stranded cDNA was separated from unreacted primer on Sephadex G-150 in 10 mM Tris Cl, pH 7.5/1 mM EDTA. After a further tailing step with dTTP and terminal deoxynucleotidyltransferase, the cDNA was hybridized at 25°C in 0.5 M NaCl to pSP64 DNA (19) that had been poly(A)-tailed at the Pst I site. The hybrid then was used to transfect E. coli strain SK 1592 (20). Histone cDNA clones were identified by colony hybridization with nick-translated h22 histone DNA (21) at 65°C in 0.3 M NaCl/0.03 M sodium citrate/10% (wt/vol) dextran sulfate/0.2% bovine serum albumin/0.2% Ficoll/0.2% polyvinylpyrrolidone. Filters were first washed under nonstringent conditions (0.3 M NaCl/0.03 M sodium citrate, 65°C, 1 hr) and, after autoradiography, a second time under stringent conditions (15 mM NaCl/1.5 mM sodium citrate, 65°C, 1 hr).

DNA Sequencing. The nucleotide sequences of DNA inserts were determined by the Maxam and Gilbert method (22) from restriction sites in the cDNA insert as well as in the polylinker of the plasmid pSP64.

Nuclease S1 Analysis. About 0.1 pmol of a 5'-end-labeled DNA fragment was hybridized to 10 µg or 40 µg of RNA prior to nuclease S1 digestion and gel electrophoresis as described (23).

RESULTS

Reverse Transcription of Early Histone mRNAs. Most histone mRNAs are not polyadenylylated and are consequently not amenable to standard cDNA cloning procedures. Since these histone mRNAs, however, terminate in a highly conserved palindromic sequence (1, 15, 24), we used a synthetic oligonucleotide complementary to the 5' part of this sequence as a primer for the reverse transcription of histone mRNA (Fig. 1). The specificity of cDNA synthesis was first tested with early histone mRNAs as templates, since they should be abundant enough in early blastula RNA to give rise to detectable levels of histone cDNA. Furthermore, early histone mRNAs can be efficiently labeled in vivo with [3H]uridine, facilitating their identification in polyacrylamide gels (Fig. 1, lane a). When total RNA from early embryos was used for reverse transcription in the presence of the oligonucleotide primer and [α-32P]dCTP, a pattern of full-length cDNA bands (Fig. 1, lane b) was obtained which directly reflects that of their respective histone mRNAs (lane a). In addition, we always obtained a small prominent cDNA band which we could identify as a cDNA transcript of the 5' terminal 139 nucleotides of the 26S rRNA (unpublished results). None of these cDNA transcripts were detected when the synthetic oligonucleotide was omitted from the reaction mixture (Fig. 1, lane c).

[Figure 1 image. (Upper) Gel with lanes M, a, b, and c; bands labeled H3, H2A, H2B, and H4; marker sizes 527, 404, 309, 242, 238, 201, 190, 180, 160, 147, and 122 nucleotides. (Lower) Schematic of a histone mRNA (5' cap, ATG, TERM, 3' palindrome) with the annealed primer and the synthesized cDNA. Sequences as extracted: histone mRNA "AAC GGCuCU UUUC AGAGCC ACCA"; primer "3' CCGGA AAAG TC 5'".]

FIG. 1. Reverse transcription of early histone mRNAs. (Upper) Total sea urchin RNA from early blastula embryos labeled with [3H]uridine (lane a) was used for reverse transcription in the presence of [α-32P]dCTP with (lane b) or without (lane c) the synthetic oligonucleotide primer. Both RNA and cDNA were electrophoresed in a 7 M urea/8% polyacrylamide gel prior to fluorography. The small prominent cDNA band in lane b was identified as a cDNA transcript of the 5' terminal 139 nucleotides of the 26S rRNA (unpublished results). 5'-End-labeled pBR322 DNA digested with Hpa II was used as markers (lane M, sizes given at left in nucleotides). (Lower) The conserved palindromic sequence at the 3' end of nonpolyadenylylated histone mRNAs (1, 15, 24) is shown together with the sequence of the complementary oligonucleotide used as primer. TERM, termination codon.

Cloning and Identification of Late and Sperm Histone cDNA. Total RNA isolated from sea urchin testis or from gastrula-stage embryos was used for reverse transcription by the above method. cDNA transcripts of 300-700 nucleotides were isolated from polyacrylamide gels and inserted into the Pst I site of plasmid pSP64 (19), essentially according to Land et al. (18). Both the testis and gastrula cDNA libraries were highly enriched in histone cDNA clones, since about 10% of all colonies hybridized, under nonstringent conditions (300 mM NaCl at 65°C), to nick-translated early histone gene probes of the Psammechinus repeat unit h22 (21). One-fifth of the positive colonies from the gastrula cDNA library were identified as early histone cDNA clones on the basis of their strong hybridization signals after stringent washing in 15 mM NaCl at 65°C. A number of potential late and sperm histone cDNA clones (i.e., clones that formed unstable heteroduplexes with early H2A or H2B gene probes under the above conditions) were selected for DNA sequencing.

The DNA sequences of the H2A and H2B cDNA clones are shown in Figs. 2 and 3 and the deduced amino acid sequences are shown in Fig. 4. Clones pcSH2B-1 and pcSH2B-2, isolated from the testis cDNA bank, were identified as sperm H2B cDNA clones, since their deduced amino acid sequences are identical with the known partial sequences of the two different H2B variants present in sperm chromatin of P. miliaris (25). Likewise, clones pcLH2B-1 and pcLH2B-2 from the gastrula cDNA library contain the coding sequences for two different late H2B proteins which correspond to the two partially sequenced late H2B variants of the related sea urchin Parechinus angulosus (7). The gastrula cDNA clones pcLH2A-1, pcLH2A-2.1, and pcLH2A-2.2 code for typical H2A proteins and their assignment as late H2A cDNA clones was based solely on the accumulation of their respective mRNAs during late embryogenesis (Fig. 5). The two partial but overlapping cDNA clones pcLH2A-2.1 and pcLH2A-2.2 together contain the entire protein-coding sequence of the late H2A-2 mRNA (see legend to Fig. 2). The complete protein sequence is known for the sperm H2A variant of P. miliaris (6) and P. angulosus (26). It is identical in both cases with the deduced amino acid sequence of clone pcLH2A-3 from the testis cDNA library. The same H2A cDNA was twice cloned from RNA of gastrula embryos (data not shown), indicating that the "sperm" H2A gene is also expressed during late embryogenesis. We refer to this histone variant as late H2A-3 protein.

Steady-State Levels of Sperm and Late Histone mRNAs at Different Developmental Stages. The expression of the sperm and late histone genes was studied by quantitative nuclease S1 analysis. The cloned histone cDNA probes were used to identify and titrate the different H2A and H2B mRNAs in total RNA of developing embryos. Fig. 5 is a compilation of

1 5 10 SerGlyArgGlyLysGlyAl aLysAl aLysGly a late H2A-3 (pcLH2A-3) ACAGTTCTCGTTTCAACTCCGAAATCGATAAACTAACAAATCAT GTCTGGACGTGGTAAAGGCGCTAAGGCTAAGGGA b early H2A (h22) ATTCAAGCCAGCGCACATCGCTTCGTTCACAACCTCGCTTCGCTCTCAGCTCGTTAACCAACCAACCATC.ITCTGGAAGAGGTAAAAGTGGAAAGGCCCGTACC c late H2A-l (pcLH2A-l) ATTAGTTTCATTCGTCAACCGATCAACTAAACAAAATCAT G TCTGGACGTGGCAAAGGAAAGGCTAAGGGCACT d late H2A-2 (pcLH2A-2.1/2) CAAAACTAAATCATC AIG TCTGGACGTGGTAAAGGAGCTAAGGCAAAGAGC

1 5 20 25 30 35 40 45 50 55 LysAl aLysSerArgSerSerArgAl aG 1yLeuGl nPheProV al1Gl1yArgV al1H isArgPheLeuArgLysGl1yAsnTyrAl1aAsnArgV al G1yAl1aGl yAl aProV al1TyrLeuAl1aAl1aV alLeuGl u a AAGGCAAAGAGCCGTTCATCCCGTGCAGGACTTCAGTTCCCCGTCGGTCGTGTCCACCGCTTCCTCCGCAAGGGCAACTATGCCAACCGTGTTGGTGCTGGAGCCCCAGTCTACTTGGCTGCCGTTCTCGAA b AAGGCAAAATCTCGTTCATC CCGCGCTGGTCTC CAGTTCC CAGTGGGACGTGTTCAC CGATTT CTACGCAAAGGCAAC TAT GCAAAGAGGGT CGGCGGTGGGGCAC CAGT CTACATGGC CGC TGT CTTGGAG c AAGTCAAAGACGC GTTCATC CCGC GCAGGACTT CAGTTC CCAGT CGGT CGTGTGCAC CGTT T CTTGAAGAAGGGCAACTACGGAT CCCGTGTCGGAGCTGGTGC C CCAGTGT ACC TCGCAGC CGTAC TCGAG d AAGGCTAAGAGCCGCTCATCCCGTGCAGGACTTCAGTTCCCTGTCGGCCGTGTGCAC CGTTTCTTGAAGAAGGGCAACTACGGCAACCGTGTTGGAGCTGGTGCCCCAGTGTACCTCGCAGCCGTC CTCGAG

60 65 70 75 80 85 90 95 TyrLeuAl aAl aGlu Il1eLeuGl uLeuAl aGlyAsnAl aAl aArgAspAsnLysLysThrArgI 1e Il1eProArgHi sLeuGl nLeuAla Il1eArgAsnAspGl uGl uLeuAsnLysLeuLeuGl yGlyVal a TACTTGGCAGC TGAGATC CTCGAGT TGGCAGGCAACGC CGC TCGCGACAACAAGAAGAC CCGTATCATC C CCC GTCAC TT GCAGC TCGC CAT CAGGAAC GACGAGGAGTTGAACAAGC TT CTTGGAGGAGTT b TACTTGACTGCCGAAATTCTCGAGCTCGCTGGCAACGCTGCTCGCGACAACAAGAAATCTAGGATCATTCCCCGTCATCTTCAACTTGCCGTGCGCAACGACGAAGAACTCAACAAGCTC CTCGGAGGGGTG c TACCTCACCGCTGAGATCCTCGAGCTCGCCGGCAACGCCGCCCGCGACAACAAGAAGAGCAGGATCATCCCCCGTCATCTTCAGTTGGCTGTCCGCAACGACGAGGAGCTCAACAAGCTTCTCGGAGGAGTC d TACCTCACCGCTGAGATCCTCGAGCTCGCCGGCAACGCCGCC CGCGACAACAAGAAGAGCAGGATCATCCCCCGTCATCTTCAGTTGGCTGTCCGCAACGACGAGGAGCTCAACAAGCTT CTCGGAGGAGTC

100 105 110 115 120 125
ThrIleAlaGlnGlyGlyValLeuProAsnIleGlnAlaValLeuLeuProLysLysThrGlySerLysSerSerLys
a ACCATCGCCCAGGGTGGTGTCCTCCCAAACATCCAGGCCGTCCTTCTCC CCAAGAAGACTGGCTCAAAGTCCTCCAAGESAGAGTTGC TCTTTGCTGCAGCTAATACAAAi~CTTA
b ACGATCGCCCAAGGTGGTGTCCTGCCCAACATCCAAGCCGTGCTGCTTCCCAAGAAGACAGGCAAATCAAGCffA-TTTGTTTGCTAC CTCTTGCAACCTCAACAACGGCCCTTATCAGGGCCACCA
c ACCATCGCTCAGGGTGGTGTCCTCCCCAACATCCAGGCTGTCCT CCTC CCCAAGAAGAC CGCCAAGGCCTCCAAAITAAIGAAGGGACTTCTGTCATCTCAAAGTAGAACA& C ITCTTrA
d AC CAT CGCT CAGGGTGGTGTCCT C CCCAACAT CCAGGC TGTCC TT CT C CCCAAGAAGAC CGGCAAGT CTGCALLIGAAGGGAC TCGTCT CGT CTCAAGCAACGTTT"

FIG. 2. Late H2A cDNA sequences of the sea urchin. The deduced amino acid sequence for the late H2A-3 protein is shown above the late H2A-3 cDNA sequence. The entire mRNA coding sequence is shown for the early H2A gene of the histone DNA clone h22 (21). The cDNA clones pcLH2A-2.1 and pcLH2A-2.2 are identical in the overlapping sequences from codon 28 to codon 96 and together contain the entire coding sequence of the late H2A-2 protein. The start and stop codons of translation are boxed, and the primer-complementary sequences are underlined by wavy lines.

several different experiments showing the S1-resistant DNA bands corresponding to the fully protected cDNA sequences. From the known specific activity of the S1 probes and an RNA content of 3 ng per embryo (27), we estimated the absolute number of histone mRNA molecules present in the embryo.

The two different sperm H2B mRNAs of P. miliaris are highly abundant in testis RNA and cannot be detected throughout early development. The early H2A and H2B mRNAs are stored as maternal mRNAs in ≈2 × 10^5 copies per egg. The levels of these early mRNAs increase about 10-fold from fertilization up to the 128-cell stage and then decline steadily during later development. As in the case of the late H3 and H4 transcripts of Lytechinus pictus (12), the two late H2B mRNAs and the three late H2A mRNAs of P. miliaris begin to accumulate in the early blastula and each reach a level of 3-5 × 10^5 molecules in the gastrula embryo. No S1-resistant DNA fragment was detected with the late H2A-1 and H2B-2 probes in total egg RNA. The S1 signals of the late H2A-2, H2A-3, and H2B-1 probes are near the limit of detection and correspond, at most, to 200 mRNA molecules per egg. Consequently the different late H2A and H2B mRNAs of P. miliaris accumulate at least 1500-fold from the egg up to the gastrula embryo. The late H2A-3 mRNA coding for the "sperm" H2A protein is very abundant in testis RNA, from which it was initially cloned as cDNA. The late H2A-2 mRNA is present at low levels in testis RNA, as are the late H2B-1, H2A-2, and H2A-3 mRNAs in the ovary. Due to the heterogeneous nature of both tissues, these less abundant mRNAs may be expressed in cell types other than spermatocytes or oocytes, respectively.

DISCUSSION

We have modified the oligonucleotide-primed reverse transcription technique to synthesize histone cDNA. The great

1 5 10 15 20 ProSerGl nLysSerProThrLysArgSerProThrLysArgSerProGl nLysGlyGlyLysGlyAl a a sperm H2B-1 (pcSH2B-l) CATTTTGATTAATATCAAAAGTAATCaCCGTCTCAGAAGAGTCCCACCAAGCGGAGTCCGACAAAGCGTAGC CCCCAGAAGGGAGGCAAAGGAGCC b sperm H2B-2 (pcSH2B-2) c early H2B ( h 2 2 ) ACTCACAGTACCAAAAGCATTGCTCGTGACACTCGCATCGTTCTGCTCCTAAGACATCAGAAAACTTCATCTCACC CTCCAACAGGTCAGGTCGCTAAGAAA d late H2B-l (pcLH2B-1) TTCTCATTGTCTGCTACGACTCTGAACTCAA1 CCTGCTAAAGCACAAGCCGCTGGAAAG e 1 ate H2B-2 (pcLH2B-2) AGTTGAACCTATCGACATCCACAACAAACAAT CCTGCCAAACAAACCAGCGGAAAGGGAGCA

25 30 35 40 45 50 55 60 65 LysArgGlyGlyLysAl aGl yLysArgArgArgGi yVal A1aVal1LysArgArgArgArgArgArgGl uSerTyrGly Il1eTyr Il1eTyrLysVal LeuLysGl nVal Hi sProAspThrGly Il1eSerSer a AAACGTGGAGGAAAGGCAGGCAAACGT CGACGTGGAGTT GC TGTAAAGCGT CGACGC CGCAGACGTGAAAGCTACGGAAT CTACAT C TACAAGGTGTT GAAGCAAGTT CATC C CGACAC TGGTATT T CCAGC b ...... AAGGAATGTAGTCAAGCGTCGCCGACGTCGACGTGAGAGCTATGGCATTTACATCTACAAGGTCCTCAAGCAAGTCCACCCAGACACCGGAATCTCTAGC c GGC T CCAAGAAAGCAGTCAAAC CACC TC GTGC TAGCGGTGGCAAGAAGAGGCATAGGAAAAGGAAGGAGAGC TACGGTATC TACATC TACAAAGTCC TCAAGCAGGT TCAC CC TGACACTGGTGT CT CCAGC d AAGGGAT CGAAGAAGGC CAAGGC CCCCAGGC CTAGC GGCGACAAGAAGAGGCGCAGAAAGCGCAAGGAAT CCTACGGAAT CTACAT CTACAAGGTT CTGAAGCAGGT CCAC CCAGACAC TGGTAT CTC CAGC e AAGAAGGCCGGTAAGGCCAAGGGACGCCCAGCCGGCGCCAGCAAGACCCGTCGCCGTAAGCGCAAGGAAAGCTACGGAATCTACATCTACAAGGTTCTGAAGCAGGTCCACCCCGACACTGGCATCTCCAGC

70 75 80 85 90 95 100 105 110 ArgAl aMetSerV al1MetAsnSerPheV al1AsnAspVal1PheGl1uArg Il1eAl1aSerGl1uAl1aGl1yArgLeuThrThrTyrAsnArgArgAsnThrV al SerSerArgGl uV al1Gl1nThrAl1aV alArgLeu a CGTGC CATGTC CGTCATGAACAGCTTTGTCAACGACGT CT TCGAGCGCATTGCTTC CGAAGCAGGC CGC CTTAC CACC TACAAC CGCAGAAACAC CGTGT CCAGC CGAGAGGTACAGACCGC TGTC CGCC TT b CGTGGCATGT CCGTCATGAACAGCTTCGT CAAC GATGT CTT CGAGCGCAT CGC CGGAGAAGC CT C TCGTT TGAC CAGC GCTAAC CGAAGAAGCACCATAAGTAGC CGTGAAATC CAGACTGC TGTT CGC CTG c C GGGC CATGACAAT CAT GAACAGC TTTGTCAACGATAT CTT CGAGCGGAT CGC CGGCGAAGCC TCC CGT C TCAC CCAGTACAACAAGAAGT CAAC CAT CAGTAGC CGGGAGAT TCAGAC CGC CGTGCGC CTT d CGTGCCATGTC CATCATGAACAGTTTCGTCAACGATGTCTTCGAGCGCATTGC CGCCGAAGCTTCCCGTCTTGCCCACTACAACAAAAAGTCCACCATCACCAGTCGTGAGGT CCAGACCGCTGTCAGACTT e AAAGCCATGTCCATCATGAACAGCTTCGTCAACGATGTCTTCGAGCGCATCGCCGGTGAGGCTTCCCGTCTTGCCCACTAC.....

115 120 125 130 136
LeuLeuProGlyGluLeuAlaLysHisAlaValSerGluGlyThrLysAlaValThrLysTyrThrThrSerArg
a CTACTCCCTGGAGAGTTGGCCAAGCACGCCGTTTCAGAAGGAACCAAGGCTGTGACAAAGTACACCACGTCTCG A AGACGATAGATTAGAGGGGAGAGACCATCTCGAAACAAAABQGC5CTTTTCAG
b CTCCTCCCTGGAGAGCTGGCGAAGCATGCCGTCTCCGAGGGTACCAAGGCCGTGACAAAATACACCACCGCCACGC TGTCAAACAGACCAACCAGGCGATGCCTAAACAAMACGGCTCTTTTCAG
c CTTCTCCCAGGAGAGTTGGCCAAGCACGCCGTGAGTGAGGGGACCAAAGCAGTGACCAAGTACACCACCGCCAA A CGGTTACACCCTAGTCCCTTCGGACTGACAACGGCCCTTTTCAGGGCCACCA
d CTCCTTCCCGGAGAATTGGCCAAGCACGCCGTCTCTGAGGGCACCAAGGCTGTGACCAAGTACACCACCTCCAA A TTGGAAACTTCTTGTTTCTAATCTCGCAACAAAC"CCCTTTTCAG

FIG. 3. Sperm and late H2B cDNA sequences of the sea urchin. The deduced amino acid sequence for the sperm H2B-1 protein is shown above the sperm H2B-1 cDNA sequence. The entire mRNA coding sequence is shown for the early H2B gene of the histone DNA clone h22 (21). The cDNA clones pcSH2B-2 and pcLH2B-2 contain only partial cDNA copies of the sperm and late H2B-2 mRNAs, respectively. The start and stop codons of translation are boxed, and the primer-complementary sequences are underlined by wavy lines.
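The conserved 3' palindrome that the primer targets can be checked directly on the sequence data. The sketch below is illustrative only: it takes the last 24 nucleotides of row c of Fig. 3 (early H2B, clone h22) as they appear in this extracted text and searches for a stem whose reverse complement reappears after a short loop. The function names and the stem and loop lengths (6 and 4 nucleotides) are assumptions chosen for the illustration, not parameters stated by the authors.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(dna))


def find_hairpins(seq: str, stem_len: int = 6, loop_len: int = 4):
    """Scan seq for a stem whose reverse complement follows after a loop.

    Returns a list of (start_index, left_stem, loop, right_stem) tuples,
    with start_index counted from 0.
    """
    hits = []
    span = 2 * stem_len + loop_len
    for i in range(len(seq) - span + 1):
        left = seq[i:i + stem_len]
        loop = seq[i + stem_len:i + stem_len + loop_len]
        right = seq[i + stem_len + loop_len:i + span]
        if reverse_complement(left) == right:
            hits.append((i, left, loop, right))
    return hits


# 3' end of the early H2B (h22) sequence, transcribed from row c of Fig. 3.
EARLY_H2B_3PRIME_END = "CAACGGCCCTTTTCAGGGCCACCA"

hairpins = find_hairpins(EARLY_H2B_3PRIME_END)
for start, left, loop, right in hairpins:
    print(f"stem {left} / loop {loop} / stem {right} at offset {start}")
```

With these inputs the scan reports a single hairpin, GGCCCT and AGGGCC separated by TTTC, followed by the ACCA tail. The same check can be run on the 3' end of row b of Fig. 2.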

A
10 20 30 40 50 60 70 80 90 100 110 120
late H2A-3 SGRGKGAKAKGKAKSRSSRAGLQFPVGRVHRFLRKGNYANRVGAGAPVYLAVLEYLAAEILELAGNARDNKKTRIIPRHLQLAIRNDEELNKLLGGVTIAQGGVLPNIQAVLLPKKTGSKSSK
early H2A  -----SGKART-A-S------R----AK---G -----M------T------S------V------GKSS
late H2A-1 -----GKAKGT-S-T------K----GS---A-----L------T------S------V------K------MASK
late H2A-2 -----GAKAKS-A-S------K----GN---A-----L------T------S------V------GKSA

B
10 20 30 40 50 60 70 80 90 100 110 120 130
sperm H2B-1 PSQKSPTKRSPTKRSPQKGGKGAKRGGKAGKRRRGVAVKRRRRRRESYGIY IYKVLKQVHPDTGISSRAMSVMNSFVNDVFERIASEAGRLTTYNRRNTVSSREVQTAVRLLLPGELAKHAVSEGTKAVTKYTTSR
sperm H2B-2 .....RNVVK-R-R-R------I--RG-SV------V-----G--S--TSA-RRS-IS---I------ARR
early H2B   APTGQVAKKGSKKAVKPPRASGGKK-H-K-K------V--RA-TI ------I-----G--S--TQY-KKS-IS---I------AK
late H2B-1  PAKAQAAGKKGSKKAKAPRPSGDKK-R-K-K------I--RA-SI ------V----- A--S--AHY-KKS-IT ---V------SK
late H2B-2  PAKQTSGKGAKKAGKAKGRPAGASKT-R-K-K ------I--KA-SI ------V----- G--S--AHY .....

FIG. 4. Sea urchin H2A and H2B protein sequences. The amino acid sequences of the H2A (A) and H2B variants (B) were deduced from the DNA sequences shown in Figs. 2 and 3. The NH2-terminal pentapeptide repeats of the sperm H2B-1 protein are indicated by arrows. Dashes represent residues that are identical in all variants. The one-letter code for amino acids is as follows: A, Ala; C, Cys; D, Asp; E, Glu; F, Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N, Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; Y, Tyr.
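The deduction of protein sequence from cDNA sequence can be reproduced with the standard genetic code. The following sketch is an illustration, not the authors' procedure: it translates the first 11 codons of the late H2A-3 cDNA (the nucleotides that follow the ATG in row a of Fig. 2, as transcribed from this text) and yields the one-letter sequence used in Fig. 4. The constant and function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Codons 1-11 of late H2A-3 (pcLH2A-3), transcribed from row a of Fig. 2.
H2A3_CODONS_1_TO_11 = "TCTGGACGTGGTAAAGGCGCTAAGGCTAAGGGA"

peptide = translate(H2A3_CODONS_1_TO_11)
print(peptide)
```

The result, SGRGKGAKAKG, matches the start of the late H2A-3 row in panel A and the three-letter sequence SerGlyArgGlyLysGlyAlaLysAlaLysGly printed above the cDNA in Fig. 2.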

potential of this approach was demonstrated by cloning H2A and H2B cDNAs from sea urchin testis and gastrula embryos. So far only fragmentary protein sequence information was available for the two sperm H2B (25) and the two late H2B variants (7), and it was entirely lacking for three late H2A proteins identified only by electrophoresis on Triton/acid/urea gels (3, 4). We have now cloned and sequenced the cDNAs for these H2A and H2B variants. This revealed that one of the late H2A histones, the H2A-3 protein, is identical in sequence with the "sperm" H2A variant (6, 26). In addition we have isolated several late H3 and H4 cDNA clones (data not shown), further supporting the generality of our approach to histone cDNA cloning. This method might ultimately be useful for elucidating the hitherto unknown primary structure of the cleavage-stage histones of the sea urchin, provided their mRNAs contain the 3' terminal palindrome.

The sea urchin H2A variants differ by 8-10% amino acid substitutions from each other (Fig. 6) and by 10-16% from vertebrate H2A histones (28-30). H2A proteins are known to evolve at an approximately constant rate of 1% amino acid substitutions per 60 million years (31). Applying this mutation rate to our sequence data (8-10% divergence at 1% per 60 million years corresponds to roughly 480-600 million years), we find that the H2A gene variants have apparently arisen by gene duplications as early as 500 million years ago. This estimation of the divergence time, however, is strictly valid only for orthologous genes of different species. Since the paralogous histone genes of Psammechinus may have exchanged sequence information (for instance, by gene conversion), the calculated 500 million years must be regarded as a minimal estimate of the actual divergence time for these sea urchin histone gene variants. Indeed, gene conversion has apparently been responsible for homogenizing the protein coding sequences of two nonallelic pairs of late H3 and H4 genes in the sea urchin Lytechinus (32). As shown in Fig. 6a, a recent gene conversion event may also explain the high homology between the late H2A-1 and H2A-2 genes of Psammechinus in their central and COOH-terminal protein coding sequences. By contrast, sequence differences are evenly distributed between the late H2A-2 gene and the late H2A-3 as well as the early H2A genes (Fig. 6 b and c).

The late H2A and H2B mRNAs of P. miliaris are stored in

[Figure 5 image. Autoradiograph panels A and B, with lanes including blastula stages; rows labeled late H2A-2, late H2A-3, sperm H2B-1, sperm H2B-2, early H2A, and early H2B.]

FIG. 5. Developmental profile of histone mRNA accumulation. S1 probes of the different histone cDNA clones were 5'-end-labeled at a suitable restriction site in the cDNA insert and then were hybridized to 10 µg (A) or 40 µg (B) of total RNA isolated from testis, ovary, or developing embryos prior to nuclease S1 digestion and gel electrophoresis. Only the relevant part of the autoradiograph with the S1 signal of the fully protected cDNA sequence is shown for the different S1 mapping experiments. The autoradiographs in B were exposed for a longer time than those in A. To calculate the absolute number of histone mRNA molecules per embryo, the radioactivity in the bands was quantitated by scintillation counting or by comparison with radiolabeled marker DNA fragments. Testis and ovary RNA were extracted from sea urchins 1-2 months before their fertility period.
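The quantitation described in this legend amounts to simple arithmetic, sketched below under stated assumptions. The RNA content of 3 ng per embryo, the 10 µg of RNA per hybridization, and the copy numbers (200 per egg, 3-5 × 10^5 per gastrula) come from the text. The band radioactivity and probe specific activity in the example are hypothetical placeholder values, and the function names are invented for illustration; the authors do not give their exact calculation.

```python
AVOGADRO = 6.022e23          # molecules per mole
RNA_PER_EMBRYO_NG = 3.0      # total RNA per embryo, from the text (ref. 27)


def embryo_equivalents(rna_ug: float) -> float:
    """Number of embryos represented by a given mass of total RNA."""
    return rna_ug * 1000.0 / RNA_PER_EMBRYO_NG


def molecules_per_embryo(band_cpm: float, probe_cpm_per_pmol: float,
                         rna_ug: float) -> float:
    """Convert protected-probe radioactivity into mRNA copies per embryo.

    Assumes one protected probe molecule per mRNA molecule.
    """
    protected_pmol = band_cpm / probe_cpm_per_pmol
    protected_molecules = protected_pmol * 1e-12 * AVOGADRO
    return protected_molecules / embryo_equivalents(rna_ug)


def fold_accumulation(copies_egg: float, copies_gastrula: float) -> float:
    """Fold increase in mRNA copies between egg and gastrula."""
    return copies_gastrula / copies_egg


# Values from the text: at most 200 copies per egg, 3-5 x 10^5 per gastrula.
min_fold = fold_accumulation(200, 3e5)

# Hypothetical measurement: 1000 cpm protected, probe at 1e6 cpm/pmol, 10 ug RNA.
example_copies = molecules_per_embryo(1000.0, 1e6, 10.0)

print(f"minimum fold accumulation: {min_fold:.0f}")
print(f"example copies per embryo: {example_copies:.3g}")
```

The lower bound of the fold calculation reproduces the "at least 1500-fold" accumulation quoted in the Results. The hypothetical measurement lands near 2 × 10^5 copies per embryo only because the placeholder inputs were chosen to give a value of that order.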

[Figure 6 image. Schematic comparisons a, b, and c of the H2A coding region between the ATG and TAA codons, with vertical bars marking sequence differences.]

FIG. 6. DNA sequence comparison of sea urchin H2A genes. The late H2A-2 gene was compared to the late H2A-1 gene (a), the late H2A-3 gene (b), and the early H2A gene (c). The DNA sequences are those shown in Fig. 2; vertical bars indicate either an insertion/deletion or a point mutation.
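The comparison drawn in this figure can be approximated by listing the positions at which two aligned sequences differ. The sketch below is a simplified illustration that handles point mutations only (no insertions or deletions). It uses the first ten codons of the Fig. 2 block that begins TyrLeuAlaAlaGlu, transcribed from rows a, c, and d of this extracted text, so OCR errors in those rows would carry over. The function and variable names are hypothetical.

```python
def mismatch_positions(seq_a: str, seq_b: str) -> list:
    """Return 1-based positions where two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]


# Ten codons from the central coding region, transcribed from Fig. 2.
LATE_H2A_3 = "TACTTGGCAGCTGAGATCCTCGAGTTGGCA"   # row a
LATE_H2A_1 = "TACCTCACCGCTGAGATCCTCGAGCTCGCC"   # row c
LATE_H2A_2 = "TACCTCACCGCTGAGATCCTCGAGCTCGCC"   # row d

diff_h2a2_vs_h2a1 = mismatch_positions(LATE_H2A_2, LATE_H2A_1)
diff_h2a2_vs_h2a3 = mismatch_positions(LATE_H2A_2, LATE_H2A_3)

print("H2A-2 vs H2A-1:", diff_h2a2_vs_h2a1)
print("H2A-2 vs H2A-3:", diff_h2a2_vs_h2a3)
```

In this short window the late H2A-2 and H2A-1 sequences are identical, whereas H2A-2 and H2A-3 differ at seven positions. That is the kind of local contrast that panels a and b display over the whole coding region.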

the egg as maternal mRNAs in as few as 200 copies, in contrast to the more prevalent early histone mRNAs (Fig. 5). These late histone mRNAs accumulate during blastulation to 3-5 × 10^5 molecules in the gastrula-stage embryo, in agreement with the developmental profile of late histone mRNA accumulation studied in two other sea urchin species (12, 14). The late H2A-3 mRNA coding for the "sperm" H2A protein is abundant in sea urchin testis as well as in gastrula embryos and is present in small amounts even in the ovary, the unfertilized egg, and the early cleavage stages (Fig. 5). The H2A-3 gene therefore does not appear to be subject to tight developmental and tissue-specific regulation. In contrast, the two sperm H2B variants appear to be tissue-specific, since their mRNAs are found in testes but not in ovaries and embryos of the sea urchin. These sperm proteins differ from the other H2B variants by an NH2-terminal extension of characteristic pentapeptide repeats (Fig. 4), which seem to impart to the sperm H2B variants an important function in sperm chromatin condensation (7, 33).

We are grateful to Dr. R. A. Flavell (Biogen, Boston, MA) for generously providing the synthetic oligonucleotide; to Drs. M. L. Birnstiel, D. Schümperli, and G. Gilmartin for critical reading of the manuscript; to S. Degiacomi for excellent technical assistance; and to F. Ochsenbein for graphical work. This work was supported by the State of Zurich and the Swiss National Research Foundation (Grant 3.484-0.83).

1. Hentschel, C. C. & Birnstiel, M. L. (1981) Cell 25, 301-313.
2. Maxson, R., Cohn, R., Kedes, L. & Mohun, T. (1983) Annu. Rev. Genet. 17, 239-277.
3. Cohen, L. H., Newrock, K. M. & Zweidler, A. (1975) Science 190, 994-997.
4. Newrock, K. M., Cohen, L. H., Hendricks, M. B., Donnelly, R. J. & Weinberg, E. S. (1978) Cell 14, 327-336.
5. Grunstein, M. (1978) Proc. Natl. Acad. Sci. USA 75, 4135-4139.
6. Wouters, D., Sautiere, P. & Biserte, G. (1978) Eur. J. Biochem. 90, 231-239.
7. von Holt, C., de Groot, P., Schwager, S. & Brandt, W. F. (1984) in Histone Genes and Histone Gene Expression, eds. Stein, G., Stein, J. & Marzluff, W. (Wiley, New York), pp. 65-105.
8. Carroll, A. G. & Ozaki, H. (1979) Exp. Cell Res. 119, 307-315.
9. Newrock, K. M., Alfageme, C. R., Nardi, R. V. & Cohen, L. H. (1977) Cold Spring Harbor Symp. Quant. Biol. 42, 421-431.
10. Mauron, A., Kedes, L., Hough-Evans, B. R. & Davidson, E. H. (1982) Dev. Biol. 94, 425-434.
11. Maxson, R. E. & Wilt, F. H. (1982) Dev. Biol. 94, 435-440.
12. Knowles, J. A. & Childs, G. J. (1984) Proc. Natl. Acad. Sci. USA 81, 2411-2415.
13. Childs, G., Nocente-McGrath, C., Lieber, T., Holt, C. & Knowles, J. A. (1982) Cell 31, 383-393.
14. Maxson, R., Mohun, T., Gormezano, G., Childs, G. & Kedes, L. (1983) Nature (London) 301, 120-125.
15. Busslinger, M., Portmann, R. & Birnstiel, M. L. (1979) Nucleic Acids Res. 6, 2997-3008.
16. Gross, K., Probst, E., Schaffner, W. & Birnstiel, M. (1976) Cell 8, 455-469.
17. Glisin, V., Crkvenjakov, R. & Byus, C. (1974) Biochemistry 13, 2633-2637.
18. Land, H., Grez, M., Hauser, H., Lindenmaier, W. & Schütz, G. (1981) Nucleic Acids Res. 9, 2251-2266.
19. Melton, D. A., Krieg, P. A., Rebagliati, M. R., Maniatis, T., Zinn, K. & Green, M. R. (1984) Nucleic Acids Res. 12, 7035-7056.
20. Peacock, S. L., McIver, C. M. & Monahan, J. J. (1981) Biochim. Biophys. Acta 655, 243-250.
21. Schaffner, W., Kunz, G., Daetwyler, H., Telford, J., Smith, H. O. & Birnstiel, M. L. (1978) Cell 14, 655-671.
22. Maxam, A. M. & Gilbert, W. (1980) Methods Enzymol. 65, 499-560.
23. Busslinger, M., Moschonas, N. & Flavell, R. A. (1981) Cell 27, 289-298.
24. Hentschel, C., Irminger, J.-C., Bucher, P. & Birnstiel, M. L. (1980) Nature (London) 285, 147-151.
25. Strickland, M., Strickland, W. N., Brandt, W. F. & von Holt, C. (1978) Biochim. Biophys. Acta 536, 289-297.
26. Strickland, W. N., Strickland, M. S., de Groot, P. C. & von Holt, C. (1980) Eur. J. Biochem. 109, 151-158.
27. Davidson, E. H., Hough-Evans, B. R. & Britten, R. J. (1982) Science 217, 17-26.
28. Moorman, A. F. M., De Boer, P. A. J., De Laaf, R. T. M. & Destrée, O. H. J. (1982) FEBS Lett. 144, 235-241.
29. D'Andrea, R., Harvey, R. & Wells, J. R. E. (1981) Nucleic Acids Res. 9, 3119-3128.
30. Zhong, R., Roeder, R. G. & Heintz, N. (1983) Nucleic Acids Res. 11, 7409-7425.
31. Wilson, A. C., Carlson, S. S. & White, T. J. (1977) Annu. Rev. Biochem. 46, 573-639.
32. Roberts, S. B., Weisser, K. E. & Childs, G. (1984) J. Mol. Biol. 174, 647-662.
33. Green, G. R. & Poccia, D. L. (1985) Dev. Biol. 108, 235-245.