Long Noncoding Rnas: the Search for Function Monya Baker
Total Page:16
File Type:pdf, Size:1020Kb
TECHNOLOGY FEATURE Tool box 380 Getting structure 382 Box 1: Long noncoding RNAs regulating genes 380 Box 2: Possible roles for long noncoding RNA 381 Long noncoding RNAs: the search for function Monya Baker Transcripts are easy to find: sorting out what they do is a challenge. In the late 1990s, before the publication of the human genome, John Rinn, then at Protein X Yale University, was hunting for protein ? genes on chromosome 22 with his gradu- ate student adviser Michael Snyder. The lncRNA only genes they found were ones that had (iii) Protein X Activate or already been discovered, but their arrays repress protein identified a steady stream of transcribed gene ? regions with no apparent purpose. These POL II long noncoding RNAs (lncRNAs) came Gene A Gene B ... from genome regions that were known to lack protein genes. The transcripts also lncRNA lacked open reading frames and other (i) ? Z properties necessary for them to be trans- Protein X lated into proteins. lncRNA X Most scientists at the time dismissed (ii) such transcripts as noise, but Rinn kept doing experiments. “I started cloning Y them,” he recalls, “and I realized that if I X Nature America, Inc. All rights reserved. All rights Inc. America, Nature 1 could clone them, they must be stable.” And if they were stable, he thought, per- © 201 haps they were functional, too. In 2004, Z Rinn took a postdoctoral position with Rinn J. Howard Chang at Stanford University, Long noncoding RNAs (lncRNAs) could be a byproduct of transcription (i), a scaffold linking after the two came up with a scheme to proteins (ii) or a guide bringing proteins to specified parts of the genome (iii). The same lncRNA learn what, if anything, these mysterious can function simultaneously as a scaffold and a guide. POL II, RNA polymerase II. transcripts were doing. This work led eventually to the dis- covery of a noncoding RNA they named Function focus different cells. Other researchers have HOTAIR1. This 2.2-kilobase spliced RNA Everyone has favorite analogies for how suggested an effector component: lncRNA transcript interacts with the protein com- lncRNAs might function. Chang recently binds to a protein, changing its structure plex polycomb to modify chromatin and showed that HOTAIR serves as a ‘modu- and activating it4. Tom Cech, a scientist repress transcription of the human HOX lar scaffold’, assembling a molecular cargo at the University of Colorado at Boulder genes, which regulate development. How of specific combinations of enzymes that who won the Nobel Prize for his work on exactly it does so is still unclear. are equipped to regulate target genes2. RNA, believes that each of these mecha- What is clear is that HOTAIR is just one Rinn likens some of these scaffolds to an nisms and more may be in play. of thousands of lncRNAs. Although less air traffic controller, guiding regulatory The possibilities seem endless (Box 2). than 2% of a mammalian genome codes machinery to the appropriate spots in the Some lncRNAs may even enhance tran- for protein, studies consistently show that genome. His work has shown that hun- scription through chromosome looping half or even more of the genome is tran- dreds of lncRNAs are physically associ- or other means5,6. “I don’t know why peo- scribed. Partly because suitable research ated with polycomb and other chromatin- ple think that lncRNAs are all doing one tools are in their infancy, scientists are modifying complexes3. That, he says, thing,” says Rinn. “They are just new types only beginning to uncover the functions would explain why the same protein of genes, and their repertoire of functions of these transcripts (Box 1). complexes act on different sequences in I think will rival the proteome.” NATURE METHODS | VOL.8 NO.5 | MAY 2011 | 379 TECHNOLOGY FEATURE BOX 1 LONG NONCODING RNAs REGULATING GENES Although researchers are still struggling to uncover the stress, DNA damage and p53, a protein that regulates the cell mechanisms and protein partners of long noncoding RNAs cycle and is implicated in about half of all human cancers12. (lncRNAs), evidence of their importance in basic biology Another set of experiments hinted at extensive regulatory is pouring in. systems featuring lncRNAs. Several lncRNAs were found to be Last year, John Rinn at the Broad Institute and colleagues regulated by p53; one transcript, lncRNA-21, binds a protein used a specially designed microarray to test the expression known as heterogeneous nuclear ribonucleoprotein K (hnRNP-K). of about 900 lncRNAs in human fibroblasts and pluripotent Almost 600 genes are affected in common by all three—p53, stem cells. Just over 100 lncRNAs were induced when lncRNA-21 and hnRNP-K (ref. 13). fibroblasts were reprogrammed to pluripotency; a similar This March, Howard Chang at Stanford University described number of lncRNAs were repressed. The team focused on how a lncRNA called HOTTIP can help to activate the transcription a couple of dozen lncRNAs that were more upregulated in of several HOXA genes in vivo. The molecule is transcribed from induced pluripotent stem cells than in embryonic stem cells, the tip of the HOXA@ locus, where it binds adaptor proteins reasoning that these would be particularly important for and sets nearby chromatin marks to drive transcription. When reprogramming. Previously published data indicated that HOTTIP was knocked down, the proteins MLL1 and WDR5 were Oct4, an essential transcription factor for pluripotency, was not observed on the transcription start sites of HOX5 genes as binding sites in the genome identified as coding for lncRNAs. they normally are. However, these effects could not be rescued by Gain- and loss-of-function experiments showed that at expressing HOTTIP from another region on the genome, indicating least one of these, called lincRNA-RoR, for long intergenic that the lncRNA must work from the chromosome on which noncoding RNA and regulator of reprogramming, was it is transcribed. Indeed, chromosome conformation capture essential for a variety of functions, including reprogramming techniques have indicated that chromatin looping brings the RNA as well as modulating genes known to respond to oxidative into close proximity to the activated genes3. Because their functions are difficult to that are sticking well above the pack.” And really proof,” he says. “Ultimately we have study, long noncoding RNAs are gener- though the work is still in early stages, it to go in and do experiments to demon- ally classified by their origins. They seem seems to be paying off. Knocking down strate that things have function.” to come from everywhere in the genome: or ectopically expressing these transcripts Advances in uncovering the function the noncoding side of genes, alongside and changes cells’ phenotypes for about half of of lncRNAs could be self-reinforcing, between the protein-coding regions and, the cases he has investigated, he says. says Chang. Standard approaches often especially, the long stretches in genomes Still, the debate over just what propor- neglect noncoding genes. Whole-exon where no protein-coding genes are thought tion of all noncoding RNAs are function- capture used in disease association stud- to exist at all. Noncoding transcripts are al is surprisingly fierce. “It is difficult to ies, for example, restricts sequencing to Nature America, Inc. All rights reserved. All rights Inc. America, Nature 1 traditionally classified as long at around discriminate functional transcripts from regions that code for proteins. And when 200 nucleotides, an arbitrary distinction those that may be byproducts of other researchers find that a transcription factor © 201 based on RNA purification technologies. processes,” says Tim Hughes, a genome binds to an intergenic region, their first Most are thousands of nucleotides long. biologist at the University of Toronto, instinct is often to investigate the nearest Nowadays, new putative lncRNAs are “but many transcripts that come from ‘protein gene’. “The knowledge that there generally identified by an RNA sequenc- intergenic regions are starting to look like are long noncoding RNA genes could ing technique called RNA-seq, which uses real signals. They show up relatively con- change someone’s strategy,” says Chang. high-throughput sequencing to profile sistently in different experiments, contain cell transcripts. Chromatin analysis also splice junctions and are present in high Tool box helps. Work by Rinn and others showed numbers,” says Hughes. Still, he advocates As more researchers begin to investi- that histone methylation patterns that are caution. Differential expression could be gate lncRNAs, there is a greater need to characteristic of transcribed protein genes explained by activity in nearby protein- annotate them so that researchers know also apply to lncRNAs, and resulting ‘chro- coding genes, for example. “Higher abun- what to look for, know if they find some- matin-state maps’ have now been used to dance presumably increases the likelihood thing new and know how to name what flag thousands of putative lncRNAs6. that a transcript is functional, but it’s not they find. In general, microarrays cannot So many lncRNAs are being identified distinguish between different forms of a that it is difficult to know what to study transcript, and sequencing often indicates in depth, says John Mattick, a genome multiple ‘isoforms’ of a transcript without biologist at the University of Queensland. indicating which is the most biologically His tactic is to use microarray data to find relevant. “Protein-coding RNA has been transcripts with the greatest change in studied for long enough that most major expression between tissues; differences lab Lee J. Sarma, K. forms of the transcripts are known, but of more than 20-fold are not uncommon, RNA FISH probes (green) that bind the lncRNA lncRNAs are too new for that,” says Rinn. he says. “When you have a mountain of Xist (right) can be displaced using specially Right now, he says, it is not always clear things to look at, you just want the ones designed sequences of locked nucleic acids (left).