MOLECULAR AND CELLULAR BIOLOGY, Feb. 1990, p. 681-688 Vol. 10, No. 2 0270-7306/90/020681-08$02.00/0 Copyright © 1990, American Society for Microbiology

Mouse Zfx Is Similar to Zfy-2: Each Contains an Acidic Activating Domain and 13 Zinc Fingers GRAEME MARDON,"2 SHIUH-WEN LUOH,"2 ELIZABETH M. SIMPSON,"12 GRACE GILL,3t LAURA G. BROWN,"12 AND DAVID C. PAGE' 2* Whitehead Institute for Biomedical Research, Nine Cambridge Center, Cambridge, Massachusetts 021421; Department ofBiology, Massachusetts Institute of Technology, Cambridge, Massachusetts 021392; and Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts 021383 Received 29 September 1989/Accepted 13 November 1989

The Zfy is located on the Y of placental mammals and encodes a zinc finger protein which may serve as the primary sex-determining signal. A related gene, Zfx, is similarly conserved on the . Unlike that in most mammals, the mouse genome contains four homologous zinc finger loci: Zfy-1, Zfy-2, Zfx, and Zfa (on an autosome). We report that, in contrast to the mouse Zfy , Zfx is widely transcribed in embryos, newborns, and adults, both male and female. Moreover, Zfx transcripts contain long 3' untranslated sequences which are phylogenetically conserved. Zfa is a processed gene derived from Zfx. An analysis of cDNA clones demonstrated that Zfx encodes a 799-amino-acid protein that is 70% identical to the mouse Zfy-l and Zfy-2 . Zfx, Zfy-l, and Zfy-2 contain highly acidic amino-terminal domains and carboxy-terminal regions containing 13 zinc fingers. When fused to the DNA-binding domain of GAL4, the acidic domains of Zfx and Zfy-2 activated transcription in yeast cells.

In humans and mice, the sex of a developing embryo is have a two-domain structure, with an acidic amino-terminal determined by one or more genes on the Y chromosome (8, region and 13 zinc fingers in the carboxy-terminal half. Zfx 16, 40). The ZFY gene, located in the sex-determining region and Zfy proteins may both function in the mouse as se- of the human Y chromosome, may serve as the primary quence-specific transcriptional activators, as substantiated sex-determining signal (33). ZFY encodes a protein with 13 here by the ability of their acidic domains to activate Cys-Cys-His-His zinc fingers, a nucleic acid binding motif transcription in yeast cells. first described in Xenopus transcription factor IIIA (4, 29). Homologs of ZFY are found on the Y of all MATERIALS AND METHODS placental mammals examined. A related gene, ZFX, is also Northern (RNA) and Southern blotting. Poly(A)+ RNAs conserved among mammals and has been mapped to the X were prepared from FVB/N mouse tissues or whole embryos chromosome in both humans (33) and mice (30, 31; D. C. Page, C. M. Disteche, E. M. Simpson, A. de la Chapelle, M. as previously described (25). DNA inserts of plasmids were Anderson, T. Alitalo, L. G. Brown, and P. Green, Genom- purified, radiolabeled by random-primer synthesis, and hy- ics, in press). The zinc finger domains of human ZFY and bridized to RNA or DNA blots for 16 h at 370C (low stringency), 420C (medium stringency), or 470C (high strin- ZFX are 97% identical, suggesting that the two proteins may gency) as previously described (25, 33). The nylon mem- recognize the same DNA or RNA sequence (39). ZFY and branes were washed three times for 30 min each at 50'C (low ZFX may both function in sex determination (33). stringency), 600C (medium stringency), or 650C (high strin- Two zinc finger genes, Zfj-J and Zfr-2, are located in the gency) in 0.lx SSC (lx SSC is 0.15 M NaCl plus 0.015 M sex-determining region of the mouse Y chromosome (33). sodium citrate)40.1% sodium dodecyl sulfate and exposed at This is the result of an intrachromosomal duplication that -80°C for 1 to 7 days to X-ray film backed by an intensifying occurred mouse evolution The two mouse Zfy during (24). screen. genes may encode functionally redundant products; both Cloning and nucleotide sequence analysis. cDNA libraries genes are transcribed in the adult testis, and the predicted were prepared from FVB/N newborn liver poly(A)+ RNA as proteins are 95% identical. Both Zfy proteins are composed previously described (25). Following ligation to EcoRI adap- of two domains: an amino-terminal region that is highly tors, the double-stranded cDNA was size fractionated by acidic and a carboxy-terminal domain that contains 13 zinc agarose gel electrophoresis. After electroelution, the 5- to fingers (1, 25). The combination of an acidic activating region lO-kilobase (kb) fraction was ligated to XZAP (Stratagene). and a DNA-binding motif is characteristic of several eucary- An unamplified library of 5 x 105 recombinants was otic transcription factors, e.g., yeast GAL4 and GCN4 and screened by using the human genomic insert of plasmid the mammalian receptor glucocorticoid (11-13, 23). Thus, pDP1007 as a probe. Six cDNA clones were identified, each the putative sex-determining genes Zfy-1 and Zfy-2 appear to encode sequence-specific activators of transcription. containing an insert of 5.7 to 6.8 kb. Comparative restriction mapping suggested that all six clones derive from the same In this report, an analysis of cDNA clones demonstrates locus. The cDNA inserts of two clones were transferred into that Zfx encodes a protein that is 70% identical to mouse the EcoRI site of Bluescript (Stratagene), generating plas- Zfy-1 and Zfy-2. Like the Zfy proteins, Zfx is predicted to mids pDP1115 and pDP1119. Plasmids pDP1193 and pDP1194 contain 1.9- and 2.1-kb * Corresponding author. EcoRI fragments of Zfx and Zfa, respectively (see Fig. 2B, t Present address: Department of Biochemistry, University of probe 1), cloned into the EcoRI site of Bluescript. Both California, Berkeley, CA 94720. inserts derive from FVB/N genomic mouse DNA. 681 682 MARDON ET AL. MOL. CELL. BIOL.

Nucleotide sequencing was carried out by dideoxy-chain xx xY termination (37), using synthetic oligonucleotide primers. Construction of GAL4 fusion genes. The acidic amino- kb terminal domains of Zfx and Zfy-2 were fused, in frame, to the DNA-binding domain of GAL4 (amino acids 1 to 148). First, sequences between the HindIII and BamHI sites of Bluescript KS (-) were replaced by an adaptor containing an NcoI site: 5'-AGCTTCGCCATGGAG AGCGGTACCTCCTAG-5' The NcoI-PstI (with XbaI linker) fragment of Zfy-2 cDNA pDP1122 (25) or NcoI-PvuII (with XbaI linker) fragment of Zfx cDNA pDP1115 was cloned into the NcoI and XbaI sites of the modified Bluescript vector. ClaI-XbaI fragments from the resulting clones were inserted into the ClaI and XbaI sites of pGG25AX, a single-copy yeast plasmid which ex- presses GAL4(1-148) from an ADH1 promoter. The resulting GAL4-Zfx and GAL4-Zfy-2 fusion plasmids are, respec- tively, pDP1199 and pDP1200. The net charge on these portions of Zfx (residues 1 to 330) and Zfy-2 (residues 1 to FIG. 1. Human ZFY detects transcripts in male and female 314) was calculated as the sum of acidic residues (glutamic mouse embryos. Human ZFY probe pDP1007 was hybridized at and aspartic acid) minus the sum of basic residues (arginine moderate stringency to poly(A)' RNAs prepared from female (XX) and lysine). and male (XY) whole embryos at 14 days postcoitum. The sizes of Yeast transformation and assay of 0-galactosidase activity. transcripts detected are indicated. GGY1:SV15 is a AGAL4 yeast strain bearing an integrated GAL1:lacZ fusion with a single near-consensus GAL4- indicating an X chromosomal location, while the 2.1-kb binding site upstream (9). pDP1199 and pDP1200, as well as EcoRI fragment hybridizes with equal intensity in both plasmids expressing intact GAL4 (pGG22) or the GAL4 sexes, suggesting an autosomal origin. The same 1.9- and DNA-binding domain alone (pGG25AX), were introduced 2.1-kb EcoRI fragments are detected with human ZEY probe into GGYl:SV15 after treatment with lithium acetate (15). pDP1007 (results not shown). The 1.9-kb EcoRI fragment is Yeast cells were grown in selective media lacking histidine derived from mouse Zfx, while the 2.1-kb fragment is from but containing 2% galactose, 3% glycerol, and 2% lactic mouse Zfa (30; Page et al., in press) (Fig. 2B). This result acid, pH 6. P-Galactosidase assays (9) were performed in suggests that cDNA clone pDP1115 is derived from either triplicate; the standard error was less than 15%. Immuno- Zfx or Zfa. The additional sequences detected by probe 1 precipitation of the various GAL4 derivatives with an anti- probably correspond to other exons represented in the GAL4(1-147) serum revealed no significant differences in the mouse cDNA clone but absent in human ZFYgenomic clone amounts of the proteins (data not shown). pDP1007. A portion of the 3' untranslated (UTR) sequence of the cDNA (Fig. 2A, probe 2) also detects two EcoRI RESULTS fragments in mouse genomic DNA: a 5.3-kb fragment from the X chromosome and an 8.2-kb fragment that is autosomal. Human ZFY detects transcripts in both female and male The 3'-most portion of the cDNA (Fig. 2A, probe 3) detects mice. The genomic insert of plasmid pDP1007 encodes the only the 5.3-kb EcoRI fragment, demonstrating that cDNA zinc finger domain of human ZFY (33). When hybridized to clone pDP1115 is derived from Zfx (Fig. 2B). mouse genomic DNAs at low stringency, pDP1007 detects Using Northern analysis, we determined that Zfx cDNA related sequences on the Y chromosome (Zfy-J and Zfy-2), clone pDP1115 corresponds to the 7.5-kb transcript detected the X chromosome (Zfx), and autosome 10 (Zfa) (30-33; in whole embryo RNA (Fig. 1). Probes 1 and 2 detect the Page et al., in press). At moderate stringency, pDP1007 same 6- and 7.5-kb transcripts in mouse poly(A)' RNA as hybridizes only to Zfx and Zfa. When hybridized to Northern those seen with human ZEY probe pDP1007 (Fig. 1 and 2C). blots at moderate stringency, pDP1007 detects transcripts of Probe 3, however, detects only the 7.5-kb RNA (Fig. 2C), 6 and 7.5 kb in mouse embryo RNAs (Fig. 1). These demonstrating that cDNA clone pDP1115 corresponds to transcripts, which are present in both males and females, that particular transcript. As probe 3 is an X chromosome- must derive from either Zfx or Zfa. Similar results were specific probe (Fig. 2B), we conclude that the 7.5-kb tran- obtained with newborn male liver RNA (results not shown). script derives from mouse Zfx. Southern and Northern In order to further characterize these transcripts, a cDNA hybridizations with probes prepared from other portions of library prepared from newborn male mouse liver RNA was Zfx cDNA clone pDP1115 suggest that the 6-kb transcript is screened with human ZFY probe pDP1007 under the same also derived from mouse Zfx (results not shown). conditions used for Northern analysis. One of the cDNA Mouse Zfx is widely transcribed. Transcripts of 6 and 7.5 clones isolated, pDP1115, contains an insert of 6.8 kb (Fig. kb are detected by Zfx cDNA probe 1 in all mouse tissues 2A). examined, including the following, both male and female: A cDNA clone derived from mouse Zfx. Using Southern whole embryos at 12, 14, 16, and 18 days postcoitum; and analysis, we determined that cDNA clone pDP1115 is de- newborn and adult brain, gonad, heart, kidney, liver, lung, rived from mouse Zfx. The 5'-most 2.7 kb of the cDNA (Fig. and spleen (Fig. 3). In general, the two transcripts are more 2A, probe 1) detects 1.9- and 2.1-kb EcoRI fragments in abundant in whole embryos and adult tissues than in new- genomic mouse DNA (Fig. 2B). The 1.9-kb EcoRI fragment born tissues. However, relatively high levels of Zfx tran- hybridizes with twice the intensity in females as in males, scripts are present in newborn testis. VOL. 10, 1990 MOUSE Zfx AND Zfy PROTEINS ARE SIMILAR 683

E H H H I A. ZfxcDNA Eq :1 An

P 2 1 Probe: 1 2 3

B. Genomic xx xY xx xY xx XY Southern

8.2 -- - Zfa

5.3- @_ 5.3- _M -Zfx

2.1 it - Zfa 1.9 Zfx

C. Northerns xx xY xx XY xx xY

7.5- _m 7.5- 7.5- __ 6 -*0 6F-I.

FIG. 2. A cDNA clone derived from mouse Zfx. (A) Zfx cDNA clone pDP1115, 6.8 kb in length, is shown schematically. The single long open reading frame is depicted as an open box. A polyadenosine run (An) is found at the 3' end of the cDNA. EcoRI (E) and HindIII (H) restriction sites are indicated. (B) Restriction fragments 1 (2.7 kb), 2 (1.8 kb), and 3 (0.6 kb) were gel purified from an EcoRI-HindIII digest of pDP1115 and hybridized at high stringency to EcoRI-digested mouse genomic DNAs (BALB/c). Exposures of 2 to 5 days are shown. The sizes of fragments detected are indicated in kilobases. Zfa and Zfx refer to the zinc finger homologs on mouse chromosomes 10 and X, respectively. (C) Probes 1, 2, and 3 were hybridized at high stringency to poly(A)+ RNAs prepared from whole embryos at 14 days postcoitum. Filters were exposed for 1 to 4 days. Transcripts of 6 and 7.5 kb are indicated.

Mouse Zfx and Zfy genes encode similar proteins. Nucleo- alternate splicing. As a result of alternate polyadenylation, tide sequence analysis of mouse Zfx cDNA clone pDP1115, the 3' UTRs of these two cDNAs differ in length by 1.1 kb. 6.8 kb in length, revealed a single, long open reading frame The significance of alternate processing of Zfx transcripts (Fig. 4). The first AUG in this frame (Fig. 4, position 1) remains to be determined. occurs in a sequence context that is favorable for initiation of The predicted Zfx amino acid sequence is very similar to translation (17). Beginning at this putative initiation codon, both mouse Zfy-1 and Zfy-2 (1, 25) over the entire length of the open reading frame encodes a protein 799 amino acids in the protein (Fig. 5). The Zfx and Zfy proteins have nearly length, with a predicted molecular weight of 90,000. The identical amino and carboxy termini. Like the Zfy proteins, AUG codon is preceded by a 5' leader of 145 nucleotides Zfx has two large domains: a highly acidic amino-terminal containing stop codons in all three reading frames. A puta- portion, with nearly 25% aspartic or glutamic acid residues, tive 3' untranslated sequence of4,163 bases is followed by 65 and a carboxy-terminal region encompassing 13 zinc fingers. adenosines. A canonical AATAAA polyadenylation signal The acidic domains of Zfx and Zfy are 60% identical, with (7) occurs 26 nucleotides 5' of the poly(A) track. Several seven insertions or deletions of 1 to 11 amino acids. In other potential polyadenylation signals are also present contrast, the zinc finger domains are 80% identical, with no within the 3' UTR. insertions or deletions. In Zfx, as in the Zfy proteins, a short Comparative analysis of a second cDNA clone suggests basic sequence is located between the acidic and zinc finger that Zfx transcripts undergo alternate splicing and polyade- domains (Fig. 5). Similar basic regions serve as nuclear nylation, which may account for the difference in lengths of localization signals in several proteins (5, 6, 19, 22). the 6- and 7.5-kb Zfx transcripts (Fig. 2C). The nucleotide The 3' UTR of Zfx is conserved in mammals. Nucleotide sequence of cDNA clone pDP1119, 5.7 kb in length, is sequence analysis of Zfx cDNA clone pDP1115 revealed no identical to that ofZfx cDNA pDP1115 from nucleotides -31 open reading frame greater than 125 codons in the more than to +5458 (Fig. 4). However, sequences 5' of position -31 in 4 kb following the zinc finger coding region. Nonetheless, the two clones appear unrelated, perhaps the result of when hybridized to genomic DNAs from a variety of mam- 684 MARDON ET AL. MOL. CELL. BIOL.

whole embryos newborn tissues adult tissues

'a

c I~ co &- e 0, .> L I- 0 :2 C - _ - I, co :Ii2Y U) m0 a ! -YJ .j n rm m- r-m--mIr --I '''----'' '' '- kb

_ _ - - 9 p - 7.5 _8_6. _ Ztx 0 6 4b - 6 probe 0@

tubuein *_ _ _ _e_ _ - 4 a _ 2 probe is 4. go40t

FIG. 3. Zfx is widely transcribed in mice. The coding region of a mouse Zfx cDNA (Fig. 2A, probe 1) was hybridized at high stringency to poly(A)+ RNAs from the 35 sources shown. In each pair of RNA samples, female is to the left and male is to the right. Transcripts of 6 and 7.5 kb are indicated. (Newborn mouse ovary also contains the 6- and 7.5-kb transcripts [data not shown].) As a control, the filters were hybridized with an ot-tubulin probe (lower panel). Other hybridizations to these filters reveal that the female newborn liver and male adult kidney RNAs are partially degraded (data not shown). mals, the Zfx 3' UTR detects sequences conserved on the X Acidic domains of Zfx and Zfy-2 activate transcription in chromosome. In each species, the Zfx 3' UTR detects one or yeast cells. Many eucaryotic transcription factors are com- two restriction fragments common to male and female but posed of two domains, one which binds DNA and another, about twice as intense in female (Fig. 6). These conserved the activating region, which interacts with the transcription sequences are present in both the 6- and 7.5-kb transcripts machinery (3, 35). The activating regions of many such detected in mouse RNA (Fig. 2C, probe 2). factors are rich in acidic amino acids (11-13, 23). Therefore, Zfa is a processed gene derived from Zfx. Zfx transcripts we tested whether the acidic portions of Zfx or Zfy-2 could and the Zfa genomic locus exhibit extended sequence simi- function as activating regions in vivo. Fusion proteins com- larity, with sequences derived from the Zfx 5' UTR, coding posed of the acidic region of either Zfx (residues 1 to 330; net region, and 3' UTR cross-hybridizing to Zfa at high strin- charge, -77) or Zfy-2 (residues 1 to 314; net charge, -68) gency (Fig. 2B) (unpublished data). Nonetheless, in compar- attached to the DNA-binding region of yeast GAL4 (residues ison with its homologs in mice and humans, the Zfa locus is 1 to 148) were assayed for their ability to stimulate transcrip- condensed; Zfx cDNAs hybridize to sequences spanning at tion in yeast cells. The P-galactosidase activity produced least 4.0 kb but no more than 11.5 kb of Zfa genomic DNA. from a GAL1:lacZ fusion gene bearing a single GAL4- In contrast, the human ZFY and ZFX genes span 50 and 70 binding site upstream is a measure of transcriptional activa- kb, respectively, and the mouse Zfy-J, ZJf-2, and Zfx loci tion by a given GAL4 derivative. Both the GAL4-Zfx and are of comparable size (33, 39; E. M. Simpson and S.-W. GAL4-Zfy-2 fusion proteins activate transcription, whereas Luoh, unpublished data; A. Schneider-Gadicke, P. Beer- the DNA-binding region of GAL4 alone does not (23) (Fig. Romero, L. G. Brown, G. Mardon, S.-W. Luoh, and D. C. 8). The GAL4-Zfy-2 fusion activates transcription nearly as Page, Nature [London], in press). well as wild-type GAL4, while the GAL4-Zfx fusion is about Given the condensed size of the Zfa locus-and the fact 20-fold less effective. that an autosomal homolog of Zfx is not found in most placental mammals-we hypothesized that Zfa might be a processed gene derived from a Zfx transcript. Indeed, by DISCUSSION nucleotide sequence comparison, we found that the Zfa genomic locus lacks at least one intron present in Zfx (Fig. Mouse Zfx and Zfy genes encode similar proteins. The 7). In Zfx, the alignment of genomic and cDNA sequences existence of two ZJy genes on the mouse Y chromosome is reveals an intron immediately 5' to the zinc finger exon. the result of an intrachromosomal duplication during rodent (Human ZFX and ZFY transcripts are spliced at precisely evolution (24). The Zfy-1 and Zfy-2 proteins (1, 25) are 95% the same site [39].) On both sides of this splice, the Zfa identical (Fig. 5) and may be functionally redundant. The genomic sequence is nearly identical to that of the Zfx mouse Zfx protein is 70% identical to the mouse Zfy proteins cDNA; the intron has been precisely excised in Zfa. On the and exhibits the same two-domain structure: the amino- basis of the small size of Zfa, the lack of at least one intron, terminal half is acidic, while the carboxy-terminal half con- and sequence similarity to Zfx cDNAs, we conclude that the tains 13 Cys-Cys-His-His zinc fingers (Fig. 5). Zfa locus derives from a processed Zfx transcript. The Acidic and DNA-binding domains are found in combina- absence of such an autosomal locus in most placental tion in many eucaryotic transcription activators (11-13, 23). mammals suggests that this retroposition occurred during By analogy, both Zfx and Zfy will probably function as rodent evolution. Since some processed genes encode func- sequence-specific activators of transcription. When fused to tional proteins (e.g., human PGK-2 [26]), we cannot rule out the GAL4 DNA-binding domain, the acidic regions of both a functional role for Zfa. Zfx and Zfy-2 indeed activate transcription in yeast cells VOL. 10, 1990 MOUSE Zfx AND Zfy PROTEINS ARE SIMILAR 685

-145 CCGGCCTGCAGCACCCGCCACCGTC -198 GGGCGCTCAGAACGCGCGGTCGAGGCGCGTTCTGGAAACGGGCGGGGACCGGGGCAGTGCCGCCCGGCGGTTGGGGAGG -12 0 GCCGACCGCCGCAC GACCGTCCGGTGCGTATAACTGTGCCCAGCTAACCTTGTAAGATGAATTGGGGAGCATCCCCTCTTT TTTGAAACAGGTGTT CTGGCTGATGAGAATTTAAAGGCC -120 CGGCTGAGGGGTCCCGGGTGGGCGACGTGGCCGACCCGGGGCACCGTGCGGGGGCGAGGCCTGGCGAGGAGGCTAAGGCTGCAGGCATG 1 M D E D G L E L Q QQ A P N S F F D A T G A G A T H M D G N Q I V V E V Q E T V 1 ATGGATGAAGATGGACTTGAATTACMACAACAAGCACCAAACTCATTTTTTGAT GCAACAGGAGCTGGTGCTACACACATG GATGGTAATCAAATTGTTGTGGAAGT TCAAGAAACGGTT 41 Y V S D V V D S D I T V H N Y V P D D P D S V V I Q D V I E D V V I E D V Q C T 121 TATGTGTCAGATGTTGTGGATTCCGACATAACCGTACATAACTATGTTCCCGATGACCCAGATTCAGTTGTCATCCAGGATGTCATTGAGGATGTTGTTATAGAAGATGTCCAGTGTACA 91 D I M D E A D V S E T V I I P E Q V L D S D V T E E V S L T H C T V P D D V L A 241 GACATCATGGACGAAGCAGATGTATCGGAAACAGTCATCATCCCTGAACAAGTGCTGGATTCAGATGTAACCGAAGAAGTTTCTTTAACACATTGCACAGTTCCAGATGATGTTTTAGCT 121 S D I T S A S I S M P E H V L T S E S I H V S D V G H V E H V V H D S V V E A E 361 TCTGATATCACTTCAGCCTCAATATCTATGCCGGAACATGTCTTGACAAGTGAATCTATACATGTGTCTGACGTTGGTCACGTTGAACATGTGGTTCATGATAGTGTAGTAGAAGCAGAA 161 I V T D P L A A D V V S E E V L V A D C A S E A V I D A N G I P V N Q Q D E E K 481 ATCGTCACAGATCCTCTGGCCGCTGATGTTGTCTCAGAAGAAGTGTTGGTAGCAGACTGTGCCTCTGAAGCAGTCATAGATGCCAACGGGATCCCTGTGAACCAGCAGGATGAAGAGAAA 201 N N C E D Y L M I S L D D A G K I E H D G S S G L T M D N E T E I D P C K V D G 601 AACAACTGTGAGGACTACCTTATGATCTCTTTGGATGATGCTGGTAAAATAGAACATGATGGTTCATCTGGACTGACTATGGACAACGAAACAGAAATTGACCCTTGTAAAGTGGATGGC 241 T C P E V I K V Y I F K A D P G E D D L G G T V D I V E S E P E N E H G V E L L 721 ACTTGCCCTGAAGTCATCAAGGTGTACATTTTTAAAGCTGATCCTGGAGAAGATGACTTAGGTGGAACTGTAGATATCGTCGAAAGTGAGCCTGAGAATGAGCATGGAGTTGAACTACTC 281 D P N N S I R V P R E K M V Y M A V N D S Q Q E E E E L N V A E I A D E V Y M E 841 GATCCAAACAATAGTATTCGCGTTCCCAGGGAAAAGATGGTTTATATGGCTGTTAACGACTCTCAGCAAGAAGAAGAAGAACTGAATGTCGCTGAGATTGCTGATGAAGTTTATATGGAA 321 V I V G E E D A A A A A A A A V H E Q Q V E D N E N K T F M P I A W A A A Y G N 961 GTGATTGTTGGAGAGGAGGATGCTGCAGCTGCAGCCGCAGCCGCAGTGCATGAACAGCAAGTGGAGGACAACGAAATGAAAACCTTCATGCCGATTGCATGGGCAGCTGCTTACGGTAAT 361 N S D G I E N R N G T A S A L L H I D E S A G L G R L A K 0 K P K K R R R P D S 1081 AATTCTGATGGAATTGAAAACCCGAATGGCACTGCAAGTGCCCTCTTGCACATAGATGAGTCTGCTGGCCTCGGCAGACTGGCTAAACAAAAACCAAAGAAAAGGAGAAGACCTGATTCC 401 R Q Y Q T A I I I G P D G H P L T V Y P C M I C G K K F K S R G F L K R H M K N 1201 AGGCAGTACCAAACAGCAATAATTATTGGCCCAGATGGACATCCTCTGACTGTCTACCCTTGCATGATTTGTGGCAAGAAGTTTAAATCGAGAGGTTTCTTGAAAAGACACATGAAAAAC 441 H P E H L A K K K Y R C T D C D Y T T N K K I S L H N H L E S H K L T S K A E K 1321 CATCCTGAACACCTTGCCAAGAAAAAGTACCGCTGTACTGACTGTGATTACACTACCAACAAGAAGATAAGTTTACACAATCACCTGGAGAGCCACAAGCTAACCAGCAAGGCAGAGAAA 481 A I E C D E C G K H F S H A G A L F T H K N V H K E K G A N K N H K C K F C E Y 1441 GCCATAGAATGCGATGAGTGCGGAAAGCATTTCTCTCATCCTGGCGCTTTGTTTACCCATAAAATGGTGCATAAGGAAAAAGGAGCCMCAAAATGCACAAGTGTAAGTTCTGTGAATAT 521 E T A E 0 G L L N R H L L A V H S K N F P H I C V E C G K G F R H P S E L K K H 1561 GAAACAGCTGAACAAGGCCTATTGAATCGCCATCTTTTGGCAGTCCACAGCAAGAATTTTCCTCACATTTGTGTAGAGTGTGGCAAAGGTTTCCGCCATCCGTCAGAGCTCAAAAAGCAC 561 M R I H T G E K P Y E C Q Y C E Y R S A D S S N L K T H V K T K H S K E M P F K 1681 ATGCGAATCCATACTGGAGAMAGCCCTATGAATGCCAGTACTGCGAATACAGGTCTGCAGATTCTTCTAACTTGAAAACCCATGTAAAAACTAAGCATAGTAAAGAGATGCCATTCAAG 601 C D I C L L T F S D T K E V Q Q H A L V H 0 E S K T H Q C L H C D H K S S N S S 1801 TGTGACATTTGTCTTCTGACTTTCTCAGATACCAAAGAGGTGCAACAACATGCTCTTGTCCACCAAGAGAGCAAAACTCACCAGTGTTTGCATTGTGATCACAAGAGTTCAAACTCAAGT 641 D L K R H I I S V H T K D Y P H K C D M C D K G F H R P S E L K K H V A A H K G 1921 GATTTGAAACGACACATAATTTCAGTTCATACGAAGGACTATCCTCATAAGTGTGACATGTGCGATAAAGGCTTTCATAGGCCTTCAGAACTCAAGAAACATGTGGCTGCCCACAAGGGT 681 K K N H Q C R H C D F K I A D P F V L S R H I L S V H T K D L P F R C K R C R K 2041 AAAAAAATGCACCAGTGTAGACATTGTGACTTTAAGATTGCAGATCCGTTTGTACTAAGTCGCCATATTCTCTCGGTTCATACAAAGGACCTTCCATTTAGGTGTAAGAGATGTAGAAAA 721 G F R Q Q S E L K K H M K T H S G R K V Y 0 C E Y C E Y S T T D A S G F K R H V 2161 GGATTTCGGCAACAAAGTGAGCTTAAAAAGCATATGAAGACCCACAGTGGCCGGAAAGTGTATCAGTGTGAGTACTGTGAATATAGCACTACAGATGCCTCAGGCTTTAAGCGGCACGTT 761 I S I H T K D Y P H R C E Y C K K G F R R P S E K N Q H I M R H H K E V G L P 799 2281 ATCTCCATTCATACGAAAGACTATCCTCACCGCTGTGAGTACTGCAAGAAAGGATTCCGAAGACCCTCGGAAAAGAACCAGCACATAATGCGACATCATAAAGAAGTTGGCCTGCCCTAA 2401 CAGTACTCTTCATAGCTGTTTGTAGAGAACTGGCCTTGAAACAGAAAATTCATTTAAAAGCCAATCAGTCTTGTTCACATACAATACTGTATATTGATTTATGCTGTGTACAAATAGGAT 2521 TATTGCTTCTAGTTGACTGTTTTTACGTTTTGTTCAATAGTGTGTTCTGAATTCTATTCAGTTTGTTTMTACATGGGAAAAAGCAGCMCACGTTAGTTGCTTTTACTAGTAATCCC 2641 TGATTCTATACTGAAGTTTTCTATCTTAGAAGTTTTATATTTATTTTAAAATATTTACCTTGCTTACCTTGATGGTACTCTTGTAAGACCAGTTAACTCATGGTMCTCTGCATTGGTAA 2761 CTGAAAGGATTCATGTGGACTTTTTGTCTCACAAATTTCTCACAATTAAATCGTCAAGAGACATCTMTGGAATGGGAGATTTTGTATGGTGAGGTCTCATGCTCGTMCATGGAAGTCA 2881 TTTACTTGTCTTGCTTAGTAATGTAGACTGTATGACAATGAMGTTTTCCTATGAACTCTTGTGTCCCTGGCATCGCTGTGTAGAAGAGCAGTAGCTGGGTTTGTGTTTACCTTTCTTTC 3001 TGATTTTTTTTCCTTTTTAGTGGAGAAAATACACTTTTTTTTTCCTTTAGACTCTTTCTTTACATCTTCTGTCAGTGTAGTAAACTTTTAGAAAACCTTGTTGMAATTTTTGCTTGATC 3121 CTGTTGTATTTTGATTATTTTATCTGTGCTGCTTTGTCTTGGMTGGTTGTGTGCTCCAAATGMATTTMCTGMGACTGCATTTTGGAATCTCCTAGGGTMCTTGTGGCTCATAGG 3241 ATCTTTTGCAACTTTATATATGTMATGTACCCTGMTTATATATATCTATCTATATATACACATATATATGTAACATGTATCTGTGTGTATTGCTTATTTTACATACTTATACACACAA 3361 CCCCCAGTAGTAGTTCTTTAAAATCTATAATGAAAAGTATTMMTTTACAGTAACATGAMGATCCAGGGATGCMGAGAGAGCATTTTGTMGTCTTGCTCTTCAGAGAGACTACTCAG 3481 GTGAAGAATTAGMGGAAAATAAGGACACTAGTATTTTTAMGAGTAMGGTATTTTCTTTTAAATATCTTTGGTAATTGAAAMTAGAGGTTMGATGTTTATAGATACMTGTTTTCA 3601 TACAGTTTCAACTCCATGCCTTTATATTTTTCTGAAAAGCTMTGGGTATCCAGACATGMTCCCCCAGTTCTCTCTGATMGCCGGAGAGGGAACTCTTGTCTGACCAAGCGAGGGGGA 3721 CAGAMGGGAGTGATGGCTCCGTGGACCAGTGAACGGGTGCTAAGGAGGACTTAGACTCGGCCAGTTGAGCCTGCTCTGTTAGTTCTTAGTTAACTAGAAAAACTTMCACTCGCTTGTC 3841 TTGGATTACACGATAGAGGTGTGGTCCMGTGTGTGAACAGTCTGAGGCCTCACTTGGATCCTTACCAAGTCTTTCACAGTTTTGTTTTTGAGAGCTGGCMTTTGTAACATACAGGTCT 3961 GATCGCTGATAAAATTTGTAAAAATMGTTGGATATTAAGATTTTTGTTAAGATGTCTTTTTGGAAAACCGTTTCTAGAGTCCCTGGAATTGGCTGTTGGCACACGGGTCTGGCACATCA 4081 CGGGAGACTTAGATGTTTCTTCCCACTCGATAGCTCCCTTGCTGCCTCATTATGGTGCTCCATCCCTTTCGTGCCGTTAGTTTTGACCTTCCATCCTTCTCTTTGTCATCMGATTTGTA 4201 TTCAAAAGTTATATTTTTGGGGTTAGATGTCTTGAGTATTTGGTGTTTGGCMMCCTCTGAAGTGCTAGATCGATTTAGTCTAGTTCTAAATCMATTCTTAAGGCTGGTATGAAGTCCA 4321 CTCAATGCCAGTCMTGCCAAGGCAGGGTTCCCTMGCTTCCCACCCGCTAGGACTCCTTTTACCTTTGTAGCCTCTGCMTGTGCTTTAAMTACATCACMCTATGATTTGTTTTA 4441 TTGCCCTGTTATTCTTGAAAGCTGTCTTTTTTGTGAGMCTTTTMMCCTACTTCATTTTTTCCCCAGAAGTMTATMMCACCTGACATTGAMTTTATGMMTTTMGMTCATCT 4561 ACAATCAGTACAGAACMMTGATTGGACTATTTMCATATATTTTTTTMATAAACTGAMGATAAGGMCACTTTACACMTTTATATTTCTCTTTACATAATCTGGAACCACACAGTT 4681 CTTTTCTTTTTAAAGCACAATATTGAAGCCTTTAAAAGGTATTTMGGGTTTGGTCMGTGMTATMMTGTGTCTGTATAAAMGAAAATGAAGTAGTCACTGTTACGTACTGACATT 4801 AGTTACMCCTAGTTTTAATTCTTMAACAATTTTGATTAGCAAAGCTAAMAAGGGTGTTTCAGTTAAATGTTTTAAAGAGGTACAGATTTTTACAAGGACATAATATMGTTATTGTT 4921 CTGTAGAAATATCCTATTAAATATTGTATGTCCCTCCCTCTGTATACTTTGTAMAAAAGTAAAATACATAAAAAGAAAATCATATAGGGGATGTGTGACATTATTGTAATTGTGTACTT 5041 GAGAATAACGTGCAAAAATAAAAATCAGAATATTTTCCTGTTAACGMTGTTCAGTCTATTTGATACCAGTACTMGTTAAAGCTTTTTCTTAAGAMGMAATGTACAGTTTTTGTAAA 5161 CCTAATAAACATCAAAAGCAGTGGATTATTTTCATCCCCCTATTTCTTAATTCCTTTTATGCAGCAGTGGAATGCAAAATGCTTGGTTGCTTTGAATTTTGTGACTTGGATGCAAATACT 5281 GGTATTTCAGTCCTGTGAATTTGCMGTAACATTGAGAAGTTMGAGGTGATTGGTGAGTCCTTTGATGAGCATGTCCTTGAMTTCATTTTTTCTTTTATCTTTGTMATGATTTGTTT 5401 TATTAGCATCTCTGATTCCTGTTATAAATTACCCACCCATAAAGTGGGTTTGAAAGCACTTMGATTTGTTTTATTATCAGTTCACMCACATTTCATCTGAGGCCCTACCTTTTAGTTA 5521 CCCATTTGTGCCCAGGTTGGAGGCCTCTAGTAGACTCAGATCTTGGAGGATGGTCACTGTCGTAGTGAAMTGGCAAGTTGGATAAAGGTTGGGAATCCTGAGAATGTGCAGCACTTTTG 5641 ATATTAACAATGAGATTCTGTMTCTAGAAACCAGCGTTGGTCTGAGAGATGAGGTGACAAATMCTAMCCTTGAGGTACATGTCCACAAGMTGTGGTGGAAGTCAGTTGTGAGTCTG 5761 AGCATACCTGATCTCACCCATTCTGCTAAAATATGCTGCCTTATCGTTGCCAGGGATCTCGGTATTMTTTCCTGCAAAATCTTACATTTGTGTMGGAAATGCAAGCTCAGTCAGCATC 5881 TTAAAGAGGATCCATGTCCAAATGATGCCTGTCAGCAAGGTAGCCTGCCCTCTTAGTCCAGTCCCCAGTGAGCCGGGACGAGGCATGACCTGTGTTTCAGCAGGCTCCCCGGGTTGTTC 6001 TGCTCCAGGCTCAGGTTAGTGAAAGCGATCCTAGAGCACTTGAGGAACGGMMGCTTACTTGGTGTGATGTTTTGTGTAGTTGTCAAGGGGCTCACMMGCAGCTAGAGATGGTGCAG 6121 TGAGAATACATTATAATCCAGAGACTGACTGCCTTTAGCCTTTCCGTAAAAGTCCTGAGTCTTGTCAGTCACTMGTCCGTCCTTCAGTGCACGTCTGMCAGCCTGTCCCTCCTACTCT 6241 TACATAGCCCCCAGTGTTGCATTCCTCAAACGTAGGAAGACTTGATAAGTAAAGAGAAGGACTGAAAATTGTGTGTTCTTGTGTGTTTAATTGGCCCAGAGTTAATAAAGTAATTCATAT 6361 TAAAAAGGACATTTTGCTGCATTTTACAACTTTTTAGGAG;TCAAAGTTAAGTTTGCCCTATGGATTTTGTTTTTATCTTTTGTTTCATGTATATATTGTTTGCCTTTTTCATAAAATAA 6481 TTCTTGGATTTGTTATATATTGTTCCTGTTATTTTTGACATCTTTGCTATTGTAAATAAATTCCTATTTTGTTTTAAGTT _ 6601 AAAAAAAAAAAAAAAAAWAAAAAAA 6625 FIG. 4. Nucleotide sequence of mouse Zfx cDNAs and predicted amino acid sequence of the encoded protein. The nucleotide sequence of the inserts of mouse Zfx cDNA clones pDP1115 and pDP1119 is shown. The predicted 799-amino-acid sequence is given above the corresponding nucleotide sequence. The numbering of nucleotides and amino acids begins with the first in-frame AUG codon. The sequences of the two cDNAs are identical from -31 to +5458. Two distinct 5' untranslated regions of 145 nucleotides (in pDP1115, top line) and 198 nucleotides (in pDP1119, bottom line) are shown; the sequences of the two clones converge at position -31. pDP1119 terminates with 40 adenosines following position +5458 (not shown), while pDP1115 terminates with 65 adenosines following position +6560. 686 MARDON ET AL. MOL. CELL. BIOL.

Zfx M D E D G L E L Q Q Q A P N S F F D A T G A G A T H M D G N Q I V V E V Q E T V Y V S D V V D S D I T V H N Y V P D D P - - - Zfy-1 E I S T P E E E K . G I D V .... S. D . F L A N V F . N

. .. D . . Zfy-2 E I T P E E E K L G I D. V S D S . F L N - - - V . F . . D S V V I Q D V I E D V V I E D V Q C T D I M D E A D V S E T V I I P E Q V L D S D V T E E V S L T H C T V P D D V L A S D I T G .. . . . L ...... H S. H . L E ...NT I D N...... L G T A A Q .F L I I -.T.G N .L...L ... H S H . L.E T I D N...... L T A A Q F L I I -.T.S S A S I S M P E H V L T S E S I H V S D V G H V E H V V H D S V V E A E I V T D P L A A D V V S E E V L V A D C A S E A V I D A| T.LT. N. .A . F.Q I. .L T MV . . I .TSDI... V ... . S ..I . T.LT.... N. .A. . . N. . . F Q I .L. .R .T. . . .T 1D...... S acidic N G I P V N Q Q D E E K N N C E D Y L M I S L D D A G K I E H D G S S G L T M D N E T E I D P C K V D G T C P E V I K V Y I F K domain

S .M.L E ...... T . .. N D I E P S D L E ..E V ..NA S T S S.L.E A S . . L. . S .M.L E .. D A R I ...... E P S T D E .. . E V N A S E. T SAS SS.LL E SC.L. A D P G E D D L G G T V D I V E S E P E N E H G V E L L D P N N S I R V P R E K M V Y M A V N D S Q Q E E E E L N V A E I A D E .S E V V.E.IHA. . . . T KGNE A VT.QST.V-...... S.S. . K. . D T E. - S E V VEIQA. . . . TDGNEAVT Q R T . . H . .K V NI- LAS. K. . .DTK -

V Y M E V I V G E E D A A A A A A A A V - H E Q Q V E D N E M K T - F M P I A W A A A Y G N N S D G I E N R N G T A S A L L H I - . . . G G T D T P E. . .DVSIAAL. . .T. . . D . .E DQV ...... N Q D .GGT. -.DPE.- . G DVS I AA L T P .. . E V Q.A ..m H D E S A G L G R L A'K Q K P K K R R R P D S RiQ Y Q T A I I I G P D G H P L T V Y P C M I C G K K F K S R G F L K R H M K N H .G. .D P. . S. K K.E K. . K S F V A Q .TRT R . . . F ...... T K R ... . T.. .G. .D .V P. S. .KK .E. K . VA..QT..FTKR.. . P E H L A K K K Y R C T D C D Y T T N K K I S L H N H L E S H K L T S K A E K A I E C D E C G K H F S H A G A L F T H K M VIN

N H . . . M . T T T ..VY.Y.A...N.N..H [. ..E....M..N.. .1...IDSNT .N.. I...TT.T..T..D.R.NLS .. . T.C.T M.C T .T MN. K E K G A N K M H K C K F C E Y E T A E Q G L L N R H L L A V H S K N F P H I C V E C G K G F R H P S E L K K H M R I H

. T .V T C . . . . .D . . . . . T H . . . VR K . . . . G ...... A . . . . I 1 3

T...... I. ~~~~~~~~.RK D T H .VR K A . . I Vzinc T G E K P Y E Q Y E Y R S A D S S N L K T V KT K H S K E M P FK D I L L T F S D T K E V Q Q A LV fingers . .. K ...... I . S . . . . I . L. G ...... A...... A VL . .. K ...... I . S . . . . I . L...... A...... AVL Q E S K T H Q L H D H K S S N S S D L K R I IsSV H T K D Y P HK D M D K G F H R P S E L K K V A A . . . R . . . S. N ...... A . . S...... T

...... R S. N ...... A S ...... T K G K K M H Q R H D F K I A D P F V L S R I LS V N T K D L P FR K R MK T . . . N S P . . . L . . H . . . A . . N V . K . E . . C . . Q T S...... K Q. . NR N S P L H . . . . . N V . . . . . S.K... I. A .K K E Q. C . . Q T 799 Y Q C E Y C E Y S T T D A S G F K R H V I I H T K D Y P HR E Y K K G F R R P S E K N Q I M R H K E V G L P

...... S K . S D F ......

. . _ . . . . S . K . D F ...... L

FIG. 5. Predicted amino acid sequence of mouse Zfx is 70% identical to mouse Zfy-1 and Zfy-2. The predicted amino acid sequences of mouse Zfx (799 residues), mouse Zfy-1 (782 residues [1]), and mouse Zfy-2 (783 residues [25]) are compared. Dots represent identity to Zfx, while dashes indicate gaps in one sequence compared with another. A short basic sequence (dashed box) is located between large acidic and zinc finger domains. In the zinc finger region, the sequences are aligned as six and one-half repeats of a 57-amino-acid unit (each composed of two fingers and two linkers), as previously reported for human ZFY and ZFX and mouse Zfy-2 (25, 33, 39) and which we note is also present in mouse Zfy-1 (1). The invariant cysteines (C) and histidines (H), characteristic of zinc finger domains, are boxed.

(Fig. 8). The machinery with which transcription factors truncated acidic domain (Schneider-Gadicke et....al., in press). interact appears to be conserved between yeasts and mam- Although we have no direct evidence of alternative splicing mals; the intact mammalian estrogen and glucocorticoid within the coding region of mouse Zfx, we note that mouse receptors will, for example, activate transcription in yeast Zfx contains an internal ATG (Fig. 4, codon 227) that could cells, given appropriate binding sites (28, 38). Therefore, we allow production of a similar truncated isoform. (A methio- believe that our yeast experiments shed light on the function nine codon is also present in homologous positions in Zfy-1 of the Zfx and Zfy proteins in mice. and Zfy-2 [Fig. 5].) The sequence similarity between the products of Zfx and Are the Zfx and Zfy genes functionally distinct? Although Zfy reinforces previous evidence (33, 39) that these X- and Zfx and Zfy encode closely related proteins, the functions of Y-chromosomal genes evolved from a single, common an- the genes are not necessarily identical. First, their transcrip- cestral gene prior to the radiation of placental mammals. A tion patterns are quite distinct. Zfy transcripts have been comparison of human and mouse zinc finger sequences detected only in the adult testis (25, 31). In contrast, Zfx suggests that Zfx is more conserved than is Zfy. In this appears to be widely transcribed in embryos, newborns, and domain, the ZFX and Zfx proteins are 99.5% identical, adults, both female and male (Fig. 3). differing by just two amino acid substitutions, while the ZFY Second, Zfx and Zfy differ dramatically in their 3' UTRs. and Zfy proteins are only 80% identical. However, human While the 3' UTRs of Zfy transcripts are less than 200 bases ZFY is 97% identical to ZFX and Zfx. Thus, the mouse Zfy long, Zfx transcripts contain 3' UTRs of 3 to 4 kb (Fig. 4). proteins have diverged from the other X- and Y-linked The mouse Zfx 3' UTR cross-hybridized to X-linked, pre- human and mouse zinc finger proteins. sumably Zfx-associated sequences in all placental mammals A comparison of the amino-terminal portions of the human tested (Fig. 6). In fact, nucleotide sequence analysis of a ZFX and mouse Zfx proteins reveals 92% amino acid iden- human ZFX cDNA reveals that its 3' UTR is 83% identical to tity, with just two gaps in the aligned sequences (Schneider- 2.7 kb of the mouse Zfx 3' UTR (Schneider-Gadicke et al., in Gadicke et al., in press). As a result of alternative splicing, press). Conservation of long segments of 3' UTRs has been the human ZFX gene also encodes an isoform with a observed in several other genes (14, 18, 20, 21, 34). Though VOL. 10, 1990 MOUSE Zfx AND Zfy PROTEINS ARE SIMILAR 687

a Constructs P-Galactosidase :3 . ~ c) 0._ 0 C cm 0 Activity o D 0 0 s" O A GAL41-148 > x > x > x > X > x x >- x >- x x x x x x x x x x x x x x x x x kb GAL4 / Zfx 1 - 330 24

GAL4 /Zfy-2 1 -314 445 - 8 wt GAL4 4/- 554 6 FIG. 8. The acidic domains of Zfx and Zfy-2 activate transcrip- tion in yeast cells. Fusion proteins composing the DNA-binding region of GAL4 (residues 1 to 148) (black boxes) and the acidic, amino-terminal domains of either mouse Zfx (residues 1 to 330) or Zfy-2 (residues 1 to 314) were assayed for their ability to activate * 4 transcription in yeast cells. The activating domain (residues 149 to 881) of wild-type (wt) GAL4 is shown as a stippled box. Single-copy plasmids encoding each of the indicated GAL4 derivatives were A _ * - 3 introduced into yeast cells deleted for GAL4 and bearing an inte- grated GAL1:IacZ fusion gene with a single GAL4-binding site upstream. The relative P-galactosidase activity stimulated by each GAL4 derivative is shown. Immunoprecipitation of the various GAL4 derivatives with an anti-GAL4(1-147) serum revealed no significant differences in the amounts of the proteins (results not 2 shown).

is a more potent activator despite being less acidic than the corresponding portion of Zfx (net charge, -68 versus -77, respectively). Thus, other structural features must affect the efficiency of activation (10). Even if Zfx and Zfy bind to the same DNA sequences and regulate the same genes, as suggested for the human homologs (39), their regulatory effects may be strikingly different. Such issues will be more FIG. 6. The 3' UTR of Zfx is conserved in mammals. A portion directly accessible once a binding site has been identified. of the 3' UTR of a mouse Zfx cDNA (Fig. 2A, probe 2) was hybridized at low stringency to EcoRI-digested male (XY) and Sex-determining function for Zfx and Zfy? If Zfx and Zfy female (XX) genomic DNAs from a variety of mammals. A scale in are involved in gonadal sex determination, then it might be kilobases is shown at the right. expected that these genes are expressed in mid-gestation embryos, when testis differentiation is first detectable histo- logically. Such transcription is observed for Zfx but has not the c-fos 3' UTR has been implicated in transforming poten- been reported for Zfy. Of course, these observations do not tial and message stability (20, 27, 36) and the protamine 3' prove or disprove a role for either gene in primary sex UTR regulates translation (2), the function of conserved 3' determination. Zfx is also transcribed in many adult tissues, UTR domains is generally not well understood. implying that its functions are not limited to embryos. Third, as assayed in yeast cells, the acidic domains of the Similarly, the mouse Zfy genes are transcribed in the adult Zfx and Zfy proteins differ in the degree to which they (but not newborn) testis, suggesting a role in male reproduc- activate transcription. Mutations in an activating region of tion (25). GAL4 have similar effects on activation in mammalian and If the Zfy genes function in primary sex determination, yeast cells, suggesting that determinants of activation po- then they must be functionally distinct from the Zfx gene. tency are conserved in yeasts and mammals (G. Gill, unpub- Although the mouse Zfy and Zfx genes appear to encode lished data). We have shown, in yeast cells, that a GAL4- similar proteins, three findings suggest that the genes are not Zfy-2 fusion protein activates transcription 18 times as functionally interchangeable. First, the transcription pat- effectively as a similar GAL4-Zfx fusion (Fig. 8). Mutational terns of Zjfy and Zfx differ dramatically. Second, mouse Zfx analysis of a GAL4-activating region demonstrated a strong transcripts contain long, conserved 3' untranslated se- correlation between the acidity of the region and potency of quences that may have functional consequences. Third, Zfy activation (9). However, the amino-terminal portion of Zfy-2 may be far more potent than Zfx as an activator of transcrip-

splice

Zfx genomic ACACATTAGAGTTT.G..G...CTAA.TCAGCAT.TTTTGAT.ACTTATG.TCCTTTCCT.TCTT......

Zfa genomic G ...... Zfx cDNA CTCGGCAGACTGGCTAAACAAAAACCAAAGAAAAGGAGAAGACCTGATTCCAGGCAGTACCAAACAGCAATAATTATTGGCCCAGATGGACATCCTCTGACTGTCTACCCTT FIG. 7. Zfa lacks an intron present in Zfx. Partial nucleotide sequences of genomic clones from Zfx (pDP1193) and Zfa (pDP1194) are compared with the corresponding Zfx cDNA sequence. Dots represent identity to the Zfx cDNA, while dashes indicate gaps in one sequence compared with another. The arrow indicates the 3' end of an intron present in the Zfx genomic clone but absent from both the Zfx cDNAs and the Zfa genomic clone. This splice occurs after nucleotide + 1216 in the Zfx cDNAs. 688 MARDON ET AL. MOL. CELL. BIOL. tion. The functional relationship of these genes and their 1988. Effect of basic and nonbasic amino acid substitutions on putative role in sex determination are being tested in trans- transport induced by simian virus 40 T-antigen synthetic peptide genic mice. nuclear transport signals. Mol. Cell. Biol. 8:2722-2729. 20. Lee, W. M., C. Lin, and T. Curran. 1988. Activation of the transforming potential of the human fos proto-oncogene re- ACKNOWLEDGMENTS quires message stabilization and results in increased amounts of We thank our colleagues, especially Douglas Vollrath, for critical partially modifiedfos protein. Mol. Cell. Biol. 8:5521-5527. review of the manuscript and Rebecca Mosher for preparation of 21. Lemaire, C., R. Heilig, and J. L. Mandel. 1988. The chicken mammalian genomic DNAs. dystrophin cDNA: striking conservation of the C-terminal cod- This study was supported by grants from the National Institutes of ing and 3' untranslated regions between man and chicken. Health, the Whitaker Health Sciences Fund, the Lucille P. Markey EMBO J. 7:4157-4162. Charitable Trust, and the Searle Scholars Program of The Chicago 22. Loewinger, L., and F. McKeon. 1988. Mutations in the nuclear Community Trust. E.M.S. was supported by a fellowship from the lamin proteins resulting in their aberrant assembly in the cyto- Medical Research Council of Canada. G.G. was supported by Public plasm. EMBO J. 7:2301-2309. Health Service grant GM32308 to Mark Ptashne from the National 23. Ma, J., and M. Ptashne. 1987. Deletion analysis of GAL4 Institutes of Health. defines two transcriptional activating segments. Cell 48:847- 853. LITERATURE CITED 24. Mardon, G., R. Mosher, C. M. Disteche, Y. Nishioka, A. 1. Ashworth, A., S. Swift, and N. Affara. 1989. Sequence of cDNA McLaren, and D. C. Page. 1989. Duplication, deletion, and for murine Zfy-J, a candidate for Tdy. Nucleic Acids Res. polymorphism in the sex-determining region of the mouse Y 17:2864. chromosome. Science 243:78-80. 2. Braun, R. E., J. J. Peschon, R. R. Behringer, R. L. Brinster, and 25. Mardon, G., and D. C. Page. 1989. The sex-determining region R. D. Palmiter. 1989. Protamine 3'-untranslated sequences reg- of the mouse Y chromosome encodes a protein with a highly ulate temporal translational control and subcellular localization acidic domain and 13 zinc fingers. Cell 56:765-770. of growth hormone in spermatids of transgenic mice. Genes 26. McCarrey, J. R., and K. Thomas. 1987. Human testis-specific Dev. 3:793-802. PGK gene lacks introns and possesses characteristics of a 3. Brent, R., and M. Ptashne. 1985. A eukaryotic transcriptional processed gene. Nature (London) 326:501-505. activator bearing the DNA specificity of a prokaryotic repres- 27. Meilink, F., T. Curran, A. D. Miller, and I. M. Verma. 1985. sor. Cell 43:729-736. Removal of a 67 sequence in the noncoding region of 4. Brown, R. S., C. Sander, and P. Argos. 1985. The primary the proto-oncogenefos converts it to a transforming gene. Proc. structure of transcription factor TFIIIA has 12 consecutive Natl. Acad. Sci. USA 82:4987-4991. repeats. FEBS Lett. 186:271-274. 28. Metzger, D., J. H. White, and P. Chambon. 1988. The human 5. Dang, C. V., and W. M. F. Lee. 1988. Identification of the oestrogen receptor functions in yeast. Nature (London) 334: human c-myc protein nuclear translocation signal. Mol. Cell. 31-36. Biol. 8:4048-4054. 29. Miller, J., A. D. McLachlan, and A. Klug. 1985. Repetitive 6. Dingwall, C., and R. A. Laskey. 1986. Protein import into the zinc-binding domains in the protein transcription factor IIIA cell nucleus. Annu. Rev. Cell Biol. 2:367-390. from Xenopus oocytes. EMBO J. 4:1609-1614. 7. Fitzgerald, M., and T. Shenk. 1981. The sequence 5'-AAUAAA- 30. Mitchell, M., D. Simon, N. Affara, M. Ferguson-Smith, P. 3' forms part of the recognition site for polyadenylation of late Avner, and C. Bishop. 1989. Localization of murine X and SV40 mRNAs. Cell 24:251-260. autosomal sequences homologous to the human Y located 8. Ford, C. E., 0. J. Miller, P. E. Polani, J. C. de Alameida, and testis-determining region. Genetics 121:803-809. J. H. Briggs. 1959. A sex-chromosome anomaly in a case of 31. Nagamine, C. M., K. Chan, C. A. Kozak, and Y.-F. Lau. 1989. gonadal dysgenesis (Turner's syndrome). Lancet i:711. Chromosome mapping and expression of a putative testis- 9. Gill, G., and M. Ptashne. 1987. Mutants of GAL4 protein altered determining gene in mouse. Science 243:80-83. in an activation function. Cell 51:121-126. 32. Page, D. C. 1988. Is ZFYthe sex-determining gene on the human 10. Giniger, E., and M. Ptashne. 1987. Transcription in yeast Y chromosome? Philos. Trans. R. Soc. Lond. B Biol. Sci. activated by a putative amphipathic a helix linked to a DNA 322:155-157. binding unit. Nature (London) 330:670-672. 33. Page, D. C., R. Mosher, E. M. Simpson, E. M. C. Fisher, G. 11. Godowski, P. J., D. Picard, and K. R. Yamamoto. 1988. Signal Mardon, J. Pollack, B. McGillivray, A. de la Chapelle, and L. G. transduction and transcriptional regulation by glucocorticoid Brown. 1987. The sex-determining region of the human Y receptor-LexA fusion proteins. Science 241:812-816. chromosome encodes a finger protein. Cell 51:1091-1104. 12. Hollenberg, S. M., and R. M. Evans. 1988. Multiple and coop- 34. Ponte, P., S.-Y. Ng, J. Engel, P. Gunning, and L. Kedes. 1984. erative trans-activation domains of the human glucocorticoid Evolutionary conservation in the untranslated regions of actin receptor. Cell 55:899-906. mRNAs: DNA sequence of a human beta-actin cDNA. Nucleic 13. Hope, I. A., and K. Struhl. 1986. Functional dissection of a Acids Res. 12:1687-1696. eukaryotic transcriptional activator protein, GCN4, of yeast. 35. Ptashne, M. 1988. How eukaryotic transcriptional activators Cell 46:885-894. work. Nature (London) 335:683-689. 14. Hsu, C.-Y., and F. R. Frankel. 1988. Conserved and unique 36. Rahmsdorf, H. J., A. Schonthal, P. Angel, M. Litfin, U. Ruther, sequences in the 3'-untranslated region of rat smooth-muscle and P. Herrlich. 1987. Posttranscriptional regulation of c-fos alpha-actin mRNA. Gene 69:345-348. mRNA expression. Nucleic Acids Res. 15:1643-1659. 15. Ito, H., Y. Fukuda, K. Murata, and A. Kimura. 1983. Transfor- 37. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequenc- mation of intact yeast cells treated with alkali cations. J. ing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. Bacteriol. 53:163-168. USA 74:5463-5467. 16. Jacobs, P. A., and J. A. Strong. 1959. A case of human 38. Schena, M., and K. R. Yamamoto. 1988. Mammalian glucocor- intersexuality having a possible XXY sex-determining mecha- ticoid receptor derivatives enhance transcription in yeast. Sci- nism. Nature (London) 183:302-303. ence 241:965-967. 17. Kozak, M. 1986. Point mutations define a sequence flanking the 39. Schneider-Gadicke, A., P. Beer-Romero, L. G. Brown, R. Nuss- AUG initiator codon that modulates translation by eukaryotic baum, and D. C. Page. 1989. ZFX has a gene structure similar to ribosomes. Cell 44:283-292. ZFY, the putative human sex determinant, and escapes X 18. Krawetz, S. A., W. Connor, and G. H. Dixon. 1987. Cloning of inactivation. Cell 57:1247-1258. bovine P1 protamine cDNA and the evolution of vertebrate P1 40. Welshons, W. J., and L. B. Russell. 1959. The Y chromosome as protamines. DNA 6:47-57. the bearer of male determining factors in the mouse. Proc. Natl. 19. Lanford, R. E., R. G. White, R. G. Dunham, and P. Kanda. Acad. Sci. USA 45:560-566.