Gene Expression: Non-Coding RNA and Personalized Medicine
Breaking the genetic enigma
Narayan Sastri Palla

Outline
• RNA world
• Central Dogma
• Types of RNA transcripts and their functions
• Noncoding RNA
• RNA structure
• Computational methods to identify ncRNA
• Lab exercises: microRNA target prediction

Evolution of Life
From primitive RNA to complex regulatory RNA in a DNA genome

Central Dogma of Biology

RNA is an Active Player
[Figure labels: reverse transcription; long ncRNA]

Overview of the kinds of DNA sequences found in the human genome

Definitions

Genome: the total amount of genetic material, stored as DNA.
• The nuclear genome refers to the DNA in the chromosomes contained in the nucleus; in humans, the DNA in the 46 chromosomes.
• It is the nuclear genome that defines a multicellular organism; it is the same for (almost) all cells of the organism.

Transcriptome:
• The total amount of genetic information that has been transcribed by the cell. This information is stored as RNA.
• This represents some 90% of the total genomic sequences.
• There is ~5X more RNA than DNA in a cell, most of it rRNA (~80%) and tRNA (~15%).
• The transcriptome is unique to a cell type and is a measure of gene expression.
• Different cells within an organism have different transcriptomes. Cell types can be identified by their transcriptome.

Proteome:
• The cell’s complete protein output. This reflects all the mRNA sequences translated by the cell.
• Cell types have different proteomes, and these can be used to identify a particular cell.
• Only 1–2% of the genome codes for the proteome.

If protein-coding portions of the human genome make up only 1.5%, what is the rest doing?
How can the disparity between the number of sequences transcribed and translated be explained?
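As a rough illustration of the genome → transcriptome → proteome chain in the definitions above, here is a minimal Python sketch of transcription and translation. It is not part of the original slides: the function names, the example sequence, and the deliberately partial codon table are all assumptions chosen for illustration, and real transcripts involve far more (splicing, UTRs, regulation).

```python
"""Toy central-dogma walk-through: DNA -> mRNA -> protein (illustrative only)."""

# Deliberately partial codon table: only the codons used in the example below.
# Any other codon raises KeyError; a real table has 64 entries.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*",  # stop codon
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon: nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "CCATGGCTTGGAAATAAGG"  # hypothetical coding-strand fragment
    mrna = transcribe(dna)
    print(mrna)             # CCAUGGCUUGGAAAUAAGG
    print(translate(mrna))  # MAWK
```

Note that only the stretch between the start and stop codons ends up in the protein; the flanking nucleotides are transcribed but never translated, a small-scale version of the transcribed-versus-translated disparity raised above.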
It’s an RNA World
• Protein translation
  • mRNA
  • tRNA
  • rRNA
• RNA function and maturation
  • snRNA
  • snoRNA
  • RNase P
  • Y RNA
  • RNase MRP
• RNA interference
  • miRNA
  • siRNA
  • piRNA
• Regulatory RNAs
  • lncRNA
  • lincRNA
• Telomere synthesis
  • Telomerase RNA

Non-coding RNA
• Only 1–2% of the genome codes for proteins.
• BUT a large amount of it is transcribed; some estimates put it as high as 98%.

Non-coding RNA
The difference is RNA that is an end in itself. This non-coding RNA (ncRNA) consists of:
• the introns of protein-coding genes,
• non-coding genes (what are these?),
• sequences antisense to or overlapping protein-coding genes.

Classes of ncRNAs (Bompfünewerer, et al., 2005)

Class            Size (nt)   Function                          Phylogenetic distribution
tRNA             70-80       Translation                       Ubiquitous
rRNA                         Translation                       Ubiquitous
  16S/18S        1.5K
  28S+5.8S/23S   3K
  5S             130
RNase P          220-440     tRNA maturation                   Ubiquitous
MRP              250-350                                       Eukarya
snoRNA           130         Pseudouridinylation               Eukarya, archaea
telomerase       400-550     Addition of repeats
snRNA            100-600     Spliceosome, mRNA maturation      Eukarya
  U1 ~ U6        130-140
U7               ~65         Histone mRNA maturation           Eukaryotes
7SK              ~300        Translational regulation          Vertebrata
tmRNA            300-400     Tags protein for proteolysis      Bacteria
miRNA            ~22         Post-transcriptional regulation   Multicellular organisms

Roles of ncRNAs
• Known roles for ncRNAs:
  – RNA catalyzes excision/ligation in introns.
  – RNA catalyzes the maturation of tRNA.
  – RNA catalyzes peptide bond formation.
  – RNA is a required subunit in telomerase.
  – RNA plays roles in immunity and development (RNAi).
  – RNA plays a role in dosage compensation.
  – RNA plays a role in carbon storage.
  – RNA is a major subunit in the SRP, which is important in protein trafficking.
  – RNA guides RNA modification.
  – It is thought that in the beginning there was an RNA World, where RNA was both the information carrier and the active molecule.
Functions of RNAs
• Protein translation
  • mRNA
  • tRNA
  • rRNA

Protein translation
Messenger RNA (mRNA)
• Destiny dictated by post-transcriptional modifications
• “Cap and tail exits cell”
• mRNA methylation is widespread and likely functional
  • N6-methyladenosine (m6A)
  • MeRIP-Seq

Protein translation
Transfer RNA (tRNA)
• 15% of cellular RNA
Ribosomal RNA (rRNA)
• 80% of cellular RNA

Functions of RNAs
• RNA function and maturation
  • snRNA
  • snoRNA
  • RNase P
  • Y RNA
  • RNase MRP
RNA-protein complexes support cellular and molecular functions.

RNA function and maturation
Small nuclear RNA (snRNA)
• RNA component of the spliceosome
• snRNP complex made up of 5 snRNAs and over 20 proteins
• Removes non-coding regions (introns) from pre-mRNA

RNA function and maturation
Small nucleolar RNA (snoRNA)
• Guides chemical modifications of other RNAs
  • mainly rRNA, tRNA, and snRNA
• 2 main classes of snoRNA:
  • H/ACA box snoRNAs direct the conversion of uridine to pseudouridine
  • C/D box snoRNAs help add methyl groups to RNAs

RNA function and maturation
Ribonuclease P (RNase P)
• RNA component of an RNA enzyme (ribozyme)
• Cleaves a precursor sequence from tRNA molecules, generating mature tRNA
• Also required for RNA Pol III transcription of various small noncoding RNA genes (e.g., tRNA, 5S rRNA, SRP RNA, and U6 snRNA genes)
• Because they catalyze site-specific cleavage of RNA molecules, ribozymes may have pharmaceutical applications

RNA function and maturation
Y RNA
• Component of the Ro ribonucleoprotein particle (RoRNP) complex
• Chaperone regulating maturation of small ncRNAs
• Transcribed by RNA Pol III
• UV resistance in mammalian cells
• Essential for DNA replication
• Upregulated in human cancer tissue
• Required for increased proliferation of cancer cell lines

RNA function and maturation
Ribonuclease MRP (RNase MRP)
• RNA component of RNase MRP
• Enzymatically active ribonucleoprotein
• Initiation of mitochondrial DNA replication
• In the nucleus, involved in precursor rRNA processing

Functions of RNAs
• RNA interference
  • miRNA
  • siRNA
  • piRNA
RNA inhibiting RNA

RNA interference (RNAi)
• Biological process in which RNA molecules inhibit gene expression, typically by causing the destruction of specific mRNA molecules
• Andrew Fire and Craig C. Mello shared the 2006 Nobel Prize in Physiology or Medicine for their work on RNAi in the nematode worm C. elegans
• RNAi helps defend cells against parasitic nucleotide sequences
  • viruses and transposons
• Plays an integral role in development as well as in the regulation of gene expression in general
• Technological applications
  • Gene knockdown: study the physiological role of individual genes
  • Functional genomics: genome-wide RNAi screens
  • Medicine: attractive, but RNAi delivery to tissues is difficult

Small (short) interfering RNA (siRNA)

MicroRNA (miRNA)
miRNAs must first undergo extensive post-transcriptional modification before they are mature and functional:
primary transcript (pri-miRNA) → pre-miRNA → mature miRNA
1. Pri-miRNAs are processed into 70-nucleotide precursors (pre-miRNAs)
2. The precursor is cleaved to generate 21–25-nucleotide mature miRNAs

Difficulty of discovering ncRNAs in genomes
Unlike protein-coding genes:
• No strong statistical sequence signals (no ORF, no poly(A))
[Figure: tRNA gene → transcribed to tRNA sequence → folded into 3D structure]

ncRNA gene finding strategies
1. Computational predictive methods
2. cDNA cloning to enrich ncRNAs
3. Detecting new transcripts with oligonucleotide microarrays

Computational ncRNA gene finding methods
• Specific (custom-designed) ncRNA search and annotation (e.g., tRNAscan, methylation-guide snoRNA, miRNA, tmRNA)
• Reconfigurable search systems (e.g., Infernal, ERPIN, RNATOPS, FastR)
  • mechanism to profile the target ncRNA (structure); needs training data
• De novo ncRNA gene detection with
  • base composition (e.g., G+C %)
  • structure fold (e.g., RNAz)
• Comparative analysis (e.g., QRNA, EvoFold); consensus structure

Some ncRNA databases
• Rfam (280,000 regions of 379 families)
• NONCODE (109 transitional classes and 9 groups)
• RNAdb (800 mammalian ncRNAs, excluding tRNAs, rRNAs and snRNAs)
• Arabidopsis Small RNA Project (ASRP)
• Etc.

RNA Folds into (Secondary and) 3D Structures
[Figure: an RNA sequence, its secondary structure (helices P4, P5, P5a, P5b, P5c, P6, P6a, P6b; residues numbered 120-260), and its 3D structure. Sequence as shown on the slide:
  AAUUGCGGGAAAGGGGUCAA CAGCCGUUCAGUACCAAGUC UCAGGGGAAACUUUGAGAUG
  GCCUUGCAAAGGGUAUGGUA AUAAGCUGACGGACAUGGUC CUAACCACGCAGCCAAGUCC
  UAAGUCAACAGAUCUUCUGU UGAUAUGGAUGCAGUUCA]
We would like to predict them from sequence.
Waring & Davies (1984) Gene 28: 277.
Cate, et al. (Cech & Doudna) (1996) Science 273: 1678.

RNA structure rules
• Canonical basepairs:
  • Watson-Crick basepairs:
    • G - C
    • A - U
  • Wobble basepair:
    • G - U
• Stacks: continuous nested basepairs (energetically favorable)
• Non-basepaired loops:
  • Hairpin loop
  • Bulge
  • Internal loop
  • Multiloop
• Pseudo-knots

RNA stem-loop (pseudoknot-free) structure example

X chromosome inactivation in mammals
An example of noncoding RNA function
[Figure: XX vs. XY; dosage compensation]
Xist: X inactive-specific transcript
Avner and Heard, Nat. Rev. Genetics 2001 2(1):59-67

Laboratory Exercises
microRNA computational prediction and analysis
Slides from: www.bioalgorithms.info

miRNA Pathway Illustration

Features of miRNAs
• Hundreds of miRNA genes have already been identified in the human genome.
• Most miRNAs start with a U.
• The 7-mer starting at the second nucleotide from the 5' end (positions 2–8) is known as the “seed.”
• When a miRNA binds to its target, the seed sequence has perfect or near-perfect complementarity to some part of the target sequence.
• Example: UGAGCUUAGCAG...

Features of miRNAs
• Many miRNAs are conserved across species:
  • For half of known human miRNAs, >18% of all occurrences of one of these miRNA seeds are conserved among human, dog, rat, and mouse.
• As a rule, the full sequence of a miRNA is almost never completely complementary to the target sequence.
  • It is common to see a loop or bulge after the seed when binding.
  • The loop/bulge is often a hairpin because of stability.
• The site at which miRNAs attack is often in their target's 3' UTR.
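
For the lab exercise, the seed rule above can be turned into a first-pass target scan. The following is a minimal Python sketch, not the method used by any particular prediction tool. It assumes the seed is the 7-mer at positions 2–8 from the 5' end, requires a perfect Watson-Crick match (no G–U wobble, no bulges), and ignores conservation. The function names and the 3' UTR sequence are hypothetical; only the miRNA fragment comes from the slide.

```python
"""Toy miRNA seed-site scan over a 3' UTR (illustrative only)."""

# Watson-Crick complements for RNA; wobble (G-U) pairs are deliberately ignored.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed(mirna):
    """Return the 7-mer seed: nucleotides 2-8 counted from the 5' end."""
    return mirna[1:8]


def find_seed_sites(mirna, utr):
    """Return 0-based start positions in `utr` that pair perfectly with the seed."""
    site = reverse_complement(seed(mirna))
    return [i for i in range(len(utr) - len(site) + 1)
            if utr[i:i + len(site)] == site]


if __name__ == "__main__":
    mirna = "UGAGCUUAGCAG"                # example fragment from the slide
    utr = "AAGGUAAGCUCAUUUGGCUAAGCUCCA"   # hypothetical 3' UTR, made up for this demo
    print(seed(mirna))                      # GAGCUUA
    print(reverse_complement(seed(mirna)))  # UAAGCUC
    print(find_seed_sites(mirna, utr))      # [4, 18]
```

The target site is the reverse complement of the seed because the miRNA and mRNA pair antiparallel. A realistic predictor would also score near-perfect matches, pairing beyond the seed, and cross-species conservation of the site.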