© 2001 Nature Publishing Group http://genetics.nature.com news & views

such as MEDLINE, synonymy causes reduced recall (that is, inability to find records that are truly related to the gene being sought) and polysemy causes reduced precision (that is, retrieving records that are not related to the gene, due to word sense ambiguities such as insulin being a gene, a protein, a hormone and a therapeutic agent). A more fundamental limitation is that the full text of the published biomedical literature contains far more information than is contained in the corresponding titles and abstracts of articles.

To overcome the problems of unpredictable word usage of authors, libraries and information scientists developed the notion of controlled vocabularies. Terms selected from 'preferred term lists' by professionally trained indexers (who, in the case of MEDLINE, are taught to read and apply keywords to the full text of the article before reading the abstract) have the potential to improve both recall and precision relative to searching of author-supplied text words. But even trained biomedical indexers reading the same article choose the same preferred keywords only 40–60% of the time (ref. 3), so a central issue in language-based systems is making them sufficiently robust and tolerant of the variety of ways of representing the biological ideas in the literature.

Associating experimental microarray results with the published literature is an example of a 'data mining' tool that uses explicit linkages between two independently constructed sources of information that contain conceptually related records. Successful data mining depends critically upon reliable sets of what computer scientists call "foreign key references", that is, the existence of the same unique words or record identifiers in records of each of the sources to be linked. The PubGene application uses HUGO gene symbols associated with specific loci on microarrays as the common currency for linking to the literature, and displays characterizations of genes using the MeSH keywords from the literature written about those genes. At the University of California, San Diego, we have developed a similar application (http://www.array.ucsd.edu) for interpreting gene clusters that uses GenBank accession numbers as the common currency for linking to the literature and extends the notion of characterizing groups of genes through literature-derived keywords by placing those keywords in concept hierarchies.

Interpretive tools that are currently available for analyzing microarray results provide only a partial view of the relevant literature, as if gazing through a picket fence. As Jenssen et al. have pointed out, PubGene gene pairs identified by co-occurrences within titles and abstracts accounted for only 45% of gene pairs described in the text of the records of Online Mendelian Inheritance in Man, and 51% of the pairs of proteins described in the Database of Interacting Proteins. Although this is far above a chance level of performance, it means that such approaches are useful primarily as tools of intellectual exploration and browsing, and not as comprehensive or definitive tools for global characterization of expression results. Additional limitations include the inescapable fact that expressed sequence tags and genes without associated publications do not participate in the analysis, and the more subtle bias that well-known, better-characterized genes are over-represented in the literature relative to newly discovered genes.

The usefulness of automated linkages to the literature in assisting in the interpretation of array data will improve as the literature expands and becomes increasingly available as electronic full text, and as computational tools for processing language become more powerful and robust. In the meantime, the view through a picket fence is clearly better than no view at all.

1. Jenssen, T.-K., Lægreid, A., Komorowski, J. & Hovig, E. Nature Genet. 28, 21–28 (2001).
2. Johnson, S. Preface to Dictionary of the English Language (London, 1775).
3. Funk, M.E., Reid, C.A. & McGoogan, L.S. Bull. Med. Library Assoc. 71, 176–183 (1983).
4. Masys, D.R. et al. Bioinformatics 17, 319–326 (2001).

Packaging paternal chromosomes with protamine

Robert E. Braun

Department of Genetics, University of Washington, Seattle, Washington 98195-7360, USA. e-mail: [email protected]

The chromosomes of sperm cells are tightly packaged into a complex of DNA and protein. Converting the chromatin from a nucleohistone to a nucleoprotamine structure may serve both biophysical and developmental functions. Several recent genetic studies have yielded unexpected findings about the dosage requirements for the genes involved in sperm chromatin remodeling.

As spermatids undergo the terminal stages of differentiation, compacting the DNA requires the replacement of the histones with a class of arginine- and cysteine-rich proteins called protamines. Although the reason for replacement of the histones with protamines is unknown, one possibility is the generation of a more hydrodynamic sperm head that speeds the transit through the female reproductive tract and across the zona pellucida surrounding the egg. It may also be that the nucleoprotamine structure protects the genetic material in the sperm head from physical and chemical damage. Alternatively, packing of sperm chromatin may serve to reprogram the paternal genome so that the appropriate genes from the father's chromosomes are expressed in the early embryo. Whereas most mammals contain a single protamine, mice and men have two. A study by Chunghee Cho, William Willis and colleagues (ref. 1; see page 82) indicates that each is vital to paternal procreation.

Using gene targeting in mouse embryonic stem cells to investigate the function of the mouse protamine genes (Prm1 and Prm2), Cho et al. show that mutation of either haploid-expressed gene leads to defective sperm. Unexpectedly, removal of a single copy of either Prm1 or Prm2 is detrimental to postmeiotic spermatids, demonstrating haploinsufficiency. Both types of meiotic products (sperm carrying the mutant protamine allele and sperm carrying the wild-type allele) are nonfunctional. Presumably the wild-type sperm are affected as a result of the syncytial nature of spermatogenesis (refs 2,3). Incomplete cytokinesis during mitosis and meiosis generates clusters of haploid spermatids that remain connected through cytoplasmic bridges. The intercellular bridges connecting spermatids effectively equilibrate the 1× dosage of gene product produced by the wild-type spermatid, and the zero dosage produced by the mutant spermatid, to a 0.5× dosage distributed among all the cells. This reduction in protamine causes defects in DNA compaction.

Figure: Packaging chromatin. A model of chromatin packaging in somatic cells (left) and mammalian sperm (right). In somatic cells, the DNA is wound twice around histone octamers to form nucleosomes, which are then coiled into solenoids. The solenoids are attached at intervals to the nuclear matrix at their bases and form DNA loop domains. In the sperm nucleus, protamines replace the histones and the protamine-DNA complex is coiled into a doughnut shape. Inset shows the tight compacting of protamine-DNA strands. Displacement of the histones is facilitated by post-translational modifications of the proteins, in the form of histone H4 acetylation, ubiquitination and phosphorylation. Phosphorylation and dephosphorylation of the transition proteins facilitate their displacement before protamines bind. (Illustration: Bob Crimi.)

Cho et al. are to be commended for their demonstration that protamine genes behave in a haploinsufficient manner. Because most ES cells are XY, and transmission of mutant alleles is accomplished by breeding chimeric males, the authors had to investigate the haploinsufficiency in chimeric males. Analysis of the mice required distinguishing among three genotypes in a mixed population of sperm: sperm derived from the recipient C57BL/6N blastocysts, and sperm derived from the 129 Sv/J ES cells carrying either the wild-type or the mutant protamine allele. The data show convincingly that wild-type and mutant 129 Sv/J sperm are made in equal numbers and that both are nonfunctional (ref. 1).

Prelude to protamines

In mammals, the protamines do not directly replace the histones. Instead, another group of basic proteins, the transition proteins, act as intermediaries in the histone-to-protamine transition. Mice have two major transition proteins, encoded by Tnp1 and Tnp2 (ref. 4). Experiments in which Tnp1 is deleted indicate that the transition proteins have largely redundant functions (ref. 5). Neither Tnp1 nor Tnp2 alone is haploinsufficient; in fact, mutants homozygous for either gene are fertile, although their litter sizes are smaller, and sperm have relatively minor head abnormalities (M. Meistrich, pers. comm.). However, reduction of the total Tnp dosage by 75% in either Tnp1- or Tnp2-null mice lacking one copy of the other Tnp, or 100% elimination of both transition proteins in double-null mutants, results in more severe abnormalities in nuclear condensation and sterility. The similar phenotypes observed in all combinations of equivalent Tnp dosage indicate a common function for the two transition proteins.

Do the two mouse protamines also have redundant functions? The presence of a single protamine in most mammals would suggest that they do. If so, the haploinsufficiency is even more remarkable as it suggests that chromatin remodeling is compromised by as little as a 25% reduction in protamine levels (three functioning alleles in a four-allele system). However, several observations indicate that the protamines have evolved separate functions. First, the stoichiometry of the two protamines in humans and mice is different and, moreover, Cho et al. show that in Prm2+/– chimeras, there is also less Prm1 protein present in sperm, indicating that Prm1 deposition requires normal amounts of Prm2. In addition, in transgenic mice that overexpress Prm1 at its normal time during spermatid differentiation, the ratio of protamine 1 to protamine 2 in mature sperm is similar to that in wild-type sperm (refs 6,7). Somehow the cells can distinguish between the two pools of protamines and assemble a nucleoprotamine structure with wild-type stoichiometry. One wonders whether an extra gene dose of Prm1 can substitute for the missing dose of Prm2 and vice versa.

Nuclear condensation

Displacement of histones by transition proteins and protamines is accompanied by several post-translational modifications (see figure). Biochemical studies in mammals, fish and birds indicate that histone acetylation (especially histone H4 acetylation; ref. 8), ubiquitination and phosphorylation all facilitate the displacement of histones. Chromatin remodeling may also require chaperones that actively displace the post-translationally modified histones. Phosphorylation of the transition proteins and the protamines, presumably important for neutralizing these highly basic proteins, may also be important for chromatin compaction (ref. 9). It was recently shown that targeted mutations in the HR6B ubiquitin-conjugating enzyme, Ube2b, and Camk4 disrupt the terminal stages of spermatid differentiation and cause sterility. Camk4 encodes the Ca2+/calmodulin-dependent protein kinase IV (ref. 10), which phosphorylates Prm2 (ref. 11). Both genes are expressed in other tissues, yet have phenotypes that are restricted to the testis. The temporal and spatial pattern of spermatid differentiation, the increasing power and sophistication of mouse genetics and the opportunity to perform biochemical phenotyping make spermatogenesis a highly attractive system for the study of chromatin repackaging.

Repack, reprogram

Developmental reprogramming of the parental genomes occurs during egg and sperm formation. However, the direct relationship between chromatin repackaging in sperm and developmental reprogramming is unknown. Fluorescence in situ hybridization indicates that chromosomes are not packaged haphazardly into sperm (ref. 12), and in humans, where about 15% of the DNA remains packaged in nucleohistones, it seems that specific sequences remain bound by histones (ref. 13).

Are the transition proteins and protamines directly involved in developmental reprogramming? Despite the sterility observed in the mouse Tnp and Prm mutants, many of the sperm look surprisingly normal. Have the nuclei present in these sperm undergone proper developmental reprogramming? A precedent for this class of mutants exists in flies and worms, where certain paternal-effect mutants generate sperm capable of fertilizing eggs, but defective in post-fertilization processes (ref. 14). One of these paternal-effect mutants in Drosophila, referred to as pal (ref. 15), causes specific loss of the paternal chromosomes during the early embryonic cleavages following fertilization. Interestingly, pal encodes a small basic protein expressed late in spermatogenesis (B. Wakimoto, pers. comm.). The ability to fertilize mammalian eggs by intracytoplasmic sperm injection allows a direct test of the developmental potency of the Tnp and Prm mutant sperm nuclei.

1. Cho, C. et al. Nature Genet. 28, 82–86 (2001).
2. Braun, R.E., Behringer, R.R., Peschon, J.J., Brinster, R.L. & Palmiter, R.D. Nature 337, 373–376 (1989).
3. Caldwell, K.A. & Handel, M.A. Proc. Natl. Acad. Sci. USA 88, 2407–2411 (1991).
4. Meistrich, M.L. in Histones and Other Basic Nuclear Proteins (eds. Hnilica, G., Stein, G. & Stein, J.) 165–182 (CRC Press, Orlando, 1989).
5. Yu, Y.E. et al. Proc. Natl. Acad. Sci. USA 97, 4683–4688 (2000).
6. Peschon, J.J., Behringer, R.R., Brinster, R.L. & Palmiter, R.D. Proc. Natl. Acad. Sci. USA 84, 5316–5319 (1987).
7. Peschon, J.J. Thesis, Univ. Washington (1988).
8. Meistrich, M.L., Trostle Weige, P.K., Lin, R., Bhatnagar, Y.M. & Allis, C.D. Mol. Reprod. Dev. 31, 170–181 (1992).
9. Green, G.R., Balhorn, R., Poccia, D.L. & Hecht, N.B. Mol. Reprod. Dev. 37, 255–263 (1994).
10. Wu, J.Y. et al. Nature Genet. 25, 448–452 (2000).
11. Roest, H.P. et al. Cell 86, 799–810 (1996).
12. Ward, W.S. & Zalensky, A.O. Crit. Rev. Euk. Gene Expr. 6, 139–147 (1996).
13. Gatewood, J.M., Cook, G.R., Balhorn, R., Bradbury, E.M. & Schmid, C.W. Science 236, 962–964 (1987).
14. Fitch, K.R., Yasuda, G.K., Owens, K.N. & Wakimoto, B.T. Curr. Top. Dev. Biol. 38, 1–34 (1998).
15. Baker, B.S. Genetics 80, 267–296 (1975).

12 nature genetics • volume 28 • may 2001