WHAT IS a GENE? the Idea of Genes As Beads on a DNA String Is Fast Fading
Total Page:16
File Type:pdf, Size:1020Kb
25.5 gene MH 22/5/06 2:53 PM Page 398 NEWS FEATURE NATURE|Vol 441|25 May 2006 C DARKIN © 2006 Nature Publishing Group 25.5 gene MH 22/5/06 2:53 PM Page 399 NATURE|Vol 441|25 May 2006 NEWS FEATURE C. DARKIN WHAT IS A GENE? The idea of genes as beads on a DNA string is fast fading. Protein-coding sequences have no clear beginning or end and RNA is a key part of the information package, reports Helen Pearson. ene’ is not a typical four-letter Laurence Hurst at the University of Bath, UK. viously unimagined scope of RNA. word. It is not offensive. It is never “All of that information seriously challenges The one gene, one protein idea is coming bleeped out of TV shows. And our conventional definition of a gene,” says under particular assault from researchers who ‘Gwhere the meaning of most four- molecular biologist Bing Ren at the University are comprehensively extracting and analysing letter words is all too clear, that of gene is not. of California, San Diego. And the information the RNA messages, or transcripts, manufac- The more expert scientists become in molecu- challenge is about to get even tougher. Later tured by genomes, including the human and lar genetics, the less easy it is to be sure about this year, a glut of data will be released from mouse genome. Researchers led by Thomas what, if anything, a gene actually is. the international Encyclopedia of DNA Ele- Gingeras at the company Affymetrix in Santa Rick Young, a geneticist at the Whitehead ments (ENCODE) project. The pilot phase of Clara, California, for example, recently studied Institute in Cambridge, Massachusetts, says ENCODE involves scrutinizing roughly 1% of all the transcripts from ten chromosomes that when he first started teaching as a young the human genome in unprecedented detail; across eight human cell lines and worked out professor two decades ago, it took him about the aim is to find all the precisely where on the chro- two hours to teach fresh-faced undergraduates sequences that serve a useful “We’ve come to the mosomes each of the tran- what a gene was and the nuts and bolts of how purpose and explain what realization that the scripts came from3. it worked. Today, he and his colleagues need that purpose is. “When we genome is full of The picture these studies three months of lectures to convey the concept started the ENCODE project paint is one of of the gene, and that’s not because the students I had a different view of overlapping transcripts.” mind-boggling complexity. are any less bright. “It takes a whole semester what a gene was,” says con- — Phillip Kapranov Instead of discrete genes to teach this stuff to talented graduates,” Young tributing researcher Roderic dutifully mass-producing says. “It used to be we could give a one-off def- Guigo at the Center for Genomic Regulation identical RNA transcripts, a teeming mass of inition and now it’s much more complicated.” in Barcelona. “The degree of complexity we’ve transcription converts many segments of the In classical genetics, a gene was an abstract seen was not anticipated.” genome into multiple RNA ribbons of differing concept — a unit of inheritance that ferried a lengths. These ribbons can be generated from characteristic from parent to child. As bio- Under fire both strands of DNA, rather than from just one chemistry came into its own, those character- The first of the complexities to challenge molec- as was conventionally thought. Some of these istics were associated with enzymes or proteins, ular biology’s paradigm of a single DNA transcripts come from regions of DNA previ- one for each gene. And with the advent of mol- sequence encoding a single protein was alterna- ously identified as holding protein-coding ecular biology, genes became real, physical tive splicing, discovered in viruses in 1977 (see genes. But many do not. “It’s somewhat revolu- things — sequences of DNA which when con- ‘Hard to track’, overleaf). Most of the DNA tionary,” says Gingeras’s colleague Phillip verted into strands of so-called messenger sequences describing proteins in humans have a Kapranov. “We’ve come to the realization that RNA could be used as the basis for building modular arrangement in which exons, which the genome is full of overlapping transcripts.” their associated protein piece by piece. The carry the instructions for making proteins, are Other studies, one by Guigo’s team4, and one great coiled DNA molecules of the chromo- interspersed with non-coding introns. In alter- by geneticist Rotem Sorek5, now at Tel Aviv somes were seen as long strings on which gene native splicing, the cell snips out introns and University, Israel, and his colleagues, have sequences sat like discrete beads. sews together the exons in various different hinted at the reasons behind the mass of tran- This picture is still the working model for orders, creating messages that can code for dif- scription. The two teams investigated occa- many scientists. But those at the forefront of ferent proteins. Over the years geneticists have sional reports that transcription can start at a genetic research see it as increasingly old-fash- also documented overlapping genes, genes DNA sequence associated with one protein ioned — a crude approximation that, at best, within genes and countless other weird arrange- and run straight through into the gene for a hides fascinating new complexities and, at ments (see ‘Muddling over genes’, overleaf). completely different protein, producing a worst, blinds its users to useful new paths Alternative splicing, however, did not in itself fused transcript. By delving into databases of of enquiry. require a drastic reappraisal of the notion of a human RNA transcripts, Guigo’s team esti- Information, it seems, is parceled out along gene; it just showed that some DNA sequences mate that 4–5% of the DNA in regions con- chromosomes in a much more complex way could describe more than one protein. Today’s ventionally recognized as genes is transcribed than was originally supposed. RNA molecules assault on the gene concept is more far reach- in this way. Producing fused transcripts could are not just passive conduits through which the ing, fuelled largely by studies that show the pre- be one way for a cell to generate a greater vari- gene’s message flows into the world but active ety of proteins from a limited number of regulators of cellular processes. In some cases, exons, the researchers say. RNA may even pass information across gener- Many scientists are now starting to think ations — normally the sole preserve of DNA. that the descriptions of proteins encoded in PLAILLY/SPL P. An eye-opening study last year raised the DNA know no borders — that each sequence possibility that plants sometimes rewrite their reaches into the next and beyond. This idea DNA on the basis of RNA messages inherited will be one of the central points to emerge from generations past1. A study on page 469 of from the ENCODE project when its results are this issue suggests that a comparable phenom- published later this year. enon might occur in mice, and by implication Kapranov and others say that they have doc- in other mammals 2. If this type of phenome- umented many examples of transcripts in non is indeed widespread, it “would have huge Spools of DNA (above) still harbour surprises, with which protein-coding exons from one part of implications,” says evolutionary geneticist one protein-coding gene often overlapping the next. the genome combine with exons from another 399 © 2006 Nature Publishing Group 25.5 gene MH 22/5/06 2:53 PM Page 400 NEWS FEATURE NATURE|Vol 441|25 May 2006 part that can be hundreds of thousands of when scientists have searched for the genetic Hard to track bases away, with several other ‘genes’ in basis of a disease or other characteristic they 1860s After playing between. This continuum of genes might have overwhelmingly found the underlying with pea plants, even spill over the boundaries of chromo- mutation to be in a protein-coding gene Austrian monk somes: last year, Richard Flavell at Yale Uni- rather than in another region. “The prepon- Gregor Mendel versity School of Medicine in New Haven, derance of evidence suggests that protein- defines the basic Connecticut, documented human immune- coding genes will hold their own when the rules of inheritance. system genes that seem to be controlled by day is over,” Hogenesh says. Traits are regulatory regions from another chromo- Some of the recent discoveries — that the determined by some6. “Discrete genes are starting to van- human genome makes a continuum of tran- discrete units that ish,” Guigo says. “We have a continuum of scripts and that cells produce masses of non- are passed from one transcripts.” coding RNA molecules — have not posed generation to the next. much of a problem to people outside the ABBEY OF ST THOMAS, BRNO, CZECH REPUBLIC CZECH THOMAS, BRNO, ABBEY OF ST Slippery concept world of molecular biology. Population 1909 Danish botanist Wilhelm Johanssen The large transcriptional surveys suggest geneticists can examine how a trait is passed coins the word ‘gene’ for the unit associated that a vast amount of the RNA manufactured down and evolves regardless of the precise with an inherited trait, although the physical basis remains unknown. by the mouse and human genomes do not molecular mechanism that underlies it. For code for proteins. Last year a consortium of example, geneticists can build models show- 1910 Thomas researchers in Japan, for example, estimated ing how a mutation is inherited whether it Morgan’s work on that a whopping 63% of the mouse genome is affects a protein, a non-coding RNA or a reg- 7,8 fruitflies (right), transcribed ; only 1–2% of the genome is ulatory region.