Predicting the Total Number of Human Genes
© 1994 Nature Publishing Group http://www.nature.com/naturegenetics

correspondence

... in 11p15.5 is more centromeric than previously thought. The first evidence for genetic heterogeneity in LQT came from Benhorin et al., who analysed linkage to HRAS in a large family and found no linkage. Given our new results, such a conclusion must be confirmed with other markers⁵,⁶.

Our results show that HRAS is excluded from the region containing the LQT gene and that LQT is more centromeric in 11p15.5 than previously thought. Until the LQT gene is found, several informative markers should be used for linkage analysis and presymptomatic diagnosis.

Noemi Roy
Pascal Kahlem
Eric Dausse
Mohammed Bennaceur
INSERM UR153,
Hopital Pitie-Salpetriere,
Pavilion Rambuteau,
47 boulevard de l'Hopital,
75013 Paris, France

Sabine Faure
Jean Weissenbach
Genethon, Evry, France

Michel Komajda
Service de Cardiologie
Hopital Pitie-Salpetriere
75013 Paris, France

Isabelle Denjoy
Philippe Coumel
Ketty Schwartz
Pascale Guicheney
Service de Cardiologie,
Hopital Lariboisiere,
41 boulevard de la Chapelle
75010, Paris, France

Acknowledgements
We are indebted to the family members for their invaluable participation. This work was supported by the Association Française contre les Myopathies, the "Federation Française de Cardiologie", the "Assistance Publique Hôpitaux de Paris" and the "Fondation de la Recherche Medicale".

Predicting the total number of human genes

Sir - We read with interest the News & Views by Fields and colleagues on human gene number in the July issue of Nature Genetics¹, but were somewhat concerned about the representation of our own work on this subject². In referring to our data in their Table 1, Fields et al.¹ altered the proportion of CpG islands that are associated with genes from 56% to 66%. This has the effect of reducing our estimate for the number of genes from 80,000 to 67,000, bringing it closer to the preferred estimate derived from expressed sequence tag (EST) analysis (64,000).

The justification for changing the figure was that some of the genes included in our 'non-island' category might in fact be associated with islands that were outside the sequenced regions. We deliberately restricted our analysis to complete transcription units (including a known transcription start site and flanking sequences) in order to avoid this possibility. Even so, our result agreed closely with an independent survey of 362 human genes³ showing that 57% of genes were island-associated. If islands were being missed in these studies, they would have to be remote from their associated genes. This would be unprecedented. We therefore wonder what observation provides the basis for believing that the percentages we determined should really be 66% rather than 56%? And why 66%, rather than 77% or 88%?

While we do not pretend that our estimate is the last word on gene number, we feel that it is unhelpful if considered "best estimates" from different sources are altered without good reason. This is particularly so if the alteration gives the impression of a tighter consensus than is justified.

Francisco Antequera
Adrian Bird
Institute of Cell and Molecular Biology, Darwin Building,
University of Edinburgh,
Scotland EH9 3JR, UK

IN REPLY - Antequera and Bird are concerned with our treating their estimate of the number of genes from observed CpG islands (about 80,000) as roughly consistent with our estimates based on other data (about 65,000). Their estimate is based on 152 complete transcription units (about 0.2% of human genes even on our estimate), and is consistent with an independent analysis of 362 genes³ (less than 0.6% of human genes). Our estimates are based on data from a variety of sources, varying from genome-wide analyses of transcription density by reassociation kinetics to observed gene densities in sequenced regions comprising even tinier samples of the human genome. Our best estimate is based on a large sample of ESTs, representing probably half of human protein-coding genes, but even here the uncertainty in the true sampling redundancy is significant.

Our explicitly hypothetical statement that "if two-thirds of human genes had associated CpG islands, the total predicted gene number would be 67,000" should be taken at face value. Very little of the human genome has been examined and the distribution of CpG islands in proximity to genes is not understood in any detail (see Fig. 2, ref. 4). In this context, a little over half (56%) and about two-thirds (66%) are not all that different.

Chris Fields
Mark D. Adams
Owen White
J. Craig Venter
The Institute for Genomic Research,
932 Clopper Road,
Gaithersburg, Maryland 20878, USA

1. Fields, C., Adams, M.D., White, O. & Venter, J.C. Nature Genet. 7, 345-346 (1994).
2. Antequera, F. & Bird, A. Proc. natn. Acad. Sci. U.S.A. 90, 11995-11999 (1993).
3. Larsen, F. et al. Genomics 13, 1095-1107 (1992).
4. Martin-Gallardo, A. et al. Nature Genet. 1, 34-39 (1992).

Nature Genetics volume 8, October 1994, page 114
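The arithmetic behind the disputed figures can be sketched in a few lines. This is a minimal illustration, not the method of either group. It assumes the gene estimate scales inversely with the fraction of genes that have an associated CpG island. The implied island count is back-calculated from the 80,000 and 56% figures quoted in the letters, not taken from the original paper. The function names `genes_from_islands` and `sample_share` are hypothetical.

```python
# Sketch under stated assumptions: estimate = gene-associated islands / fraction
# of genes with an island. All inputs are figures quoted in the correspondence.

def genes_from_islands(gene_islands: float, island_fraction: float) -> float:
    """Scale a count of gene-associated CpG islands up to a total gene number."""
    if not 0 < island_fraction <= 1:
        raise ValueError("island_fraction must be in (0, 1]")
    return gene_islands / island_fraction


def sample_share(sample_size: int, total_genes: int) -> float:
    """Percentage of all genes covered by a sample of the given size."""
    return 100 * sample_size / total_genes


BASE_ESTIMATE = 80_000   # estimate attributed to Antequera and Bird
BASE_FRACTION = 0.56     # fraction of genes they found to be island-associated

# Island count implied by the two quoted figures (an inference, not a source value).
implied_islands = BASE_ESTIMATE * BASE_FRACTION

# How the estimate moves as the assumed fraction changes.
for label, fraction in [("56%", 0.56), ("two-thirds", 2 / 3),
                        ("77%", 0.77), ("88%", 0.88)]:
    estimate = genes_from_islands(implied_islands, fraction)
    print(f"{label:>10}: about {round(estimate, -3):,.0f} genes")

# Sample sizes as a share of the roughly 65,000 genes in the reply's estimate.
for sample in (152, 362):
    print(f"{sample} genes: {sample_share(sample, 65_000):.2f}% of 65,000")
```

Under these assumptions the estimate is quite sensitive to the assumed fraction, which is the point of the "why 66%, rather than 77% or 88%?" question, while the small sample shares illustrate the reply's point about uncertainty.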