Course in Molecular Biology and Bioinformatics

Prof. Caterina Missero

2nd year of the Master's Degree (Laurea Magistrale) in Biology

Lecture 24. NON-CODING RNA: genes that do not code for proteins

Non-coding RNAs (ncRNAs): more than 50% of genes do not code for proteins:

– Structural RNAs:

• tRNA
• rRNA
• small nuclear RNAs
• small nucleolar RNAs

– Regulatory RNAs:

• miRNA
• siRNA
• Long non-coding RNAs (lncRNAs): >200 nucleotides; intragenic or intergenic

Not all long non-coding RNAs have a poly(A) tail.

Long non-coding RNAs (lncRNAs)

• Intergenic

• Intragenic/intronic

• Antisense

• Act in cis (e.g. Xist) or in trans (e.g. HOTAIR)

• Can be expressed in a tissue-specific manner

The identification of these ncRNAs also relied on the chromatin modifications that mark transcribed genes.

• lncRNAs are generated through pathways similar to those of protein-coding genes, with similar histone-modification profiles, splicing signals, and exon/intron lengths.

• A striking bias toward two-exon transcripts

• Predominantly localized in the chromatin and nucleus, and a fraction appear to be preferentially processed into small RNAs.

• They are under stronger selective pressure than neutrally evolving sequences

• lncRNAs are generally expressed at lower levels than protein-coding genes and display more tissue-specific expression patterns.

• Expression correlation analysis indicates that lncRNAs show a particularly striking positive correlation with the expression of antisense coding genes.

Are lncRNAs functional?

Coding potential
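Coding potential is usually estimated from sequence features, one of the simplest being the length of the longest open reading frame (ORF) relative to the transcript. The sketch below is only a toy illustration of that single feature, not the method of any published predictor; the function names and the example sequences are hypothetical.

```python
# Toy illustration: longest AUG...stop ORF as a crude coding-potential feature.
# Real predictors combine many features; this is only one of them.

STOP_CODONS = {"UAA", "UAG", "UGA"}


def longest_orf_length(rna: str) -> int:
    """Length in nucleotides (stop codon included) of the longest
    AUG...stop ORF found in the three forward reading frames."""
    rna = rna.upper().replace("T", "U")
    best = 0
    for frame in range(3):
        start = None  # position of the first AUG of the currently open ORF
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == "AUG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def orf_coverage(rna: str) -> float:
    """Fraction of the transcript covered by its longest ORF."""
    return longest_orf_length(rna) / len(rna) if rna else 0.0


if __name__ == "__main__":
    for name, seq in [("toy_coding", "AUGAAAUAA"), ("toy_noncoding", "GGGCCC")]:
        print(name, longest_orf_length(seq), round(orf_coverage(seq), 2))
```

A transcript whose longest ORF covers only a small fraction of its length is more likely to be non-coding, although short functional ORFs exist and this feature alone is not decisive.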

CPPred can better distinguish not only between coding RNAs and ncRNAs, but also between small coding RNAs and small ncRNAs, than state-of-the-art methods, owing to the addition of novel RNA features. A recent study proposed 1335 novel human coding RNAs from a large number of RNA-seq datasets. However, only 119 of these transcripts are predicted to be coding by CPPred. In other words, almost all of the proposed novel coding RNAs are predicted to be ncRNAs (91.1%), which is consistent with previous reports.

lncRNAs are expressed at lower levels and less ubiquitously.

Evolution, junk DNA and long non-coding RNAs

• Transcriptome studies reveal pervasive transcription of complex genomes, such as those of mammals.

• Despite popular arguments for the functionality of most, if not all, of these transcripts, genome-wide analysis of selective constraints indicates that most of the RNA produced is junk. However, junk is not garbage. On the contrary, junk transcripts provide the raw material for the evolution of diverse long non-coding (lnc) RNAs by non-adaptive mechanisms, such as constructive neutral evolution.

• The generation of many novel functional entities, such as lncRNAs, that fuels organismal complexity does not seem to be driven by strong positive selection. Rather, the weak selection regime that dominates the evolution of most multicellular eukaryotes provides ample material for functional innovation with relatively little adaptation involved.

Structure and function of lncRNAs

• Simpler and more versatile effectors than proteins (they can associate with various ligands, and with nucleic acids simply by base pairing)

• Production in the cell is simpler: it requires neither the translation apparatus nor the protein-degradation machinery

• More tolerant of mutations, and less conserved

• Nuclear or cytoplasmic localization

A regulatory RNA can transition faster from being transcriptionally inactive to fully functional

Figure: life cycle of long non-coding RNAs compared with the life cycle of coding RNAs

RNA can fold into complex three-dimensional structures that can specifically bind various ligands.

As conformational changes can be triggered by ligand binding, RNA structures themselves can be very dynamic.

RNA is a biochemically versatile polymer

Ability to interact with other nucleic acids: RNA recognizes both RNA and DNA targets through simple one-to-one base-pairing interactions (less complicated than for a protein!)

Long non-coding RNAs as scaffolds

lncRNA–protein interactions can have various functions, including:

1) combining the functions of multiple proteins
2) localizing complexes to genomic DNA
3) modifying the structure of proteins
4) competitively inhibiting protein function (as decoys)
5) providing a multivalent platform, for example, to increase the avidity of protein interactions or to promote RNA–protein complex polymerization.

A) lncRNAs can act as guides to target chromatin-modifying complexes to specific genomic locations for the regulation of gene expression.
B) lncRNAs can act as dynamic scaffolds on which cofactors transiently assemble.
C) lncRNAs can bind to microRNAs or transcription factors as decoys to sequester them away from their targets, affecting transcription and translation.

Functions of lncRNAs

• Regulation of transcription
• Regulation of mRNA processing
• Modulation of post-transcriptional control
• Regulators of protein activity
• Scaffolds for macro-complexes
• Signalling molecules

RNAs are less restricted in terms of their conservation and more tolerant of mutations.

RNA interactome analysis (RIA-seq)

RIA-seq identifies lncRNA-interacting RNAs. Schematic representation of high-throughput assays to analyze the lncRNA–RNA interactome.

Protein interactome analysis

Schematic representation of the protein interactome. Left panel: protein microarray assay. Right panel: RNA-binding protein pulldown.

Regulation of transcription (I)

lncRNAs take part in chromatin-modifying complexes and help target them to specific gene loci.

EXAMPLES:

HOTAIR functions in trans to direct the chromatin modifier Polycomb Repressive Complex 2 (PRC2) to the developmental HOXD locus and, when aberrantly overexpressed, to cancer-related genes, leading to gene repression.

XIST, TSIX

lncRNAs can bind one or more chromatin-modifying complexes and target their activities to specific DNA loci. Depending on the nature of the enzymes bound, lncRNA-mediated chromatin modifications can activate or repress gene expression.

Post-transcriptional regulation (VI)

Decoys that attenuate small RNA regulation

Linear or circular lncRNAs can function as miRNA decoys that sequester miRNAs away from their target mRNAs.

Circular RNAs

• observed for decades in eukaryotic cells but perceived as splicing errors

• discovery of their abundance in high-throughput deep sequencing: very abundant and highly expressed

• circular RNAs are distinct from their linear counterparts because they are devoid of terminal structures (e.g., a 5’ cap or a poly(A) tail)

• because of the lack of free ends, circular RNAs are resistant toward exonucleases, thereby escaping normal RNA turnover

• generated co-transcriptionally at the expense of canonical linear mRNA

Biogenesis of circular RNAs

• generated during splicing through various mechanisms
• splicing with a strong dependence on intronic sequences
• exon circularization depends on flanking intronic complementary sequences
• splice sites along with short (~30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells
• complementary repetitive intronic sequences (e.g., Alu sequences) favor back-splicing during RNA maturation

Exons 2 and 3 are flanked by Alu sequences

Localization of Alu sequences in fibroblast nuclei

Alu elements are primate-specific repeats and comprise 11% of the human genome. They have wide-ranging influences on gene expression. Alu elements are the most abundant transposable elements.

Alu elements are about 300 base pairs long and are therefore classified as short interspersed nuclear elements (SINEs) among the class of repetitive DNA elements.
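As a rough consistency check on the two figures above (about 11% of the genome, about 300 bp per element), the implied number of Alu copies can be estimated with simple arithmetic. The genome size used here (about 3.1 billion base pairs) is an assumption that does not appear in the slides, and the result should be read as an order of magnitude only.

```python
# Back-of-the-envelope estimate of Alu copy number.
GENOME_SIZE_BP = 3.1e9  # ASSUMPTION: approximate haploid human genome size
ALU_FRACTION = 0.11     # from the text: Alu elements comprise 11% of the genome
ALU_LENGTH_BP = 300     # from the text: about 300 base pairs per element


def estimated_copy_number(genome_size_bp: float, fraction: float,
                          element_length_bp: float) -> float:
    """Copies = (bases occupied by the element family) / (bases per copy)."""
    return genome_size_bp * fraction / element_length_bp


alu_copies = estimated_copy_number(GENOME_SIZE_BP, ALU_FRACTION, ALU_LENGTH_BP)
print(f"Estimated Alu copies: {alu_copies:.2e}")
```

Under these assumptions the estimate is on the order of one million copies, which fits the description of Alu elements as the most abundant transposable elements.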

Localization of Alu Sequences in Nuclei of Fibroblasts and Lymphocytes

Alu (green regions): GC-rich repeated sequences, more abundant in transcribed regions

Bolzer A, Kreth G, Solovei I, Koehler D, et al. (2005) Three-Dimensional Maps of All Chromosomes in Human Male Fibroblast Nuclei and Prometaphase Rosettes. PLoS Biol 3(5): e157. doi:10.1371/journal.pbio.0030157

Circular RNAs

• exon circularization and linear splicing compete with each other in a tissue-specific fashion

• several types of circular RNA transcripts can be produced from a single gene

• the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon.

Proactive back-splice event

The biogenesis of circRNAs is instead actively regulated, and is favored by the presence of specific repetitive sequences and of binding sites for RNA-binding proteins (RBPs) within the introns upstream and downstream of the circularizing exons.

The 3’ tail of one exon is joined to the 5’ head of an upstream exon.
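The difference between linear splicing and back-splicing can be made concrete by looking at the junction sequence each one produces. The sketch below is a toy illustration with made-up exon sequences and hypothetical function names. It is not a real circRNA detection pipeline, which would align reads to the genome rather than match substrings.

```python
# Toy illustration of linear versus back-splice junctions.
# Exon sequences are invented for the example (DNA alphabet, 5'->3').

def linear_junction(upstream_exon: str, downstream_exon: str, flank: int) -> str:
    """Linear splicing: 3' end of the upstream exon joined to the
    5' start of the downstream exon."""
    return upstream_exon[-flank:] + downstream_exon[:flank]


def backsplice_junction(upstream_exon: str, downstream_exon: str, flank: int) -> str:
    """Back-splicing: 3' end of the downstream exon joined to the
    5' start of the upstream exon (reversed exon order)."""
    return downstream_exon[-flank:] + upstream_exon[:flank]


def classify_read(read: str, upstream_exon: str, downstream_exon: str,
                  flank: int = 4) -> str:
    """Label a read by the junction it contains, if any."""
    if backsplice_junction(upstream_exon, downstream_exon, flank) in read:
        return "back-splice"
    if linear_junction(upstream_exon, downstream_exon, flank) in read:
        return "linear"
    return "unassigned"


if __name__ == "__main__":
    exon2 = "AAACCCGGG"  # hypothetical upstream exon
    exon3 = "TTTGGGCCC"  # hypothetical downstream exon
    for read in ["CCCGGGTTTGGG", "GGGCCCAAACCC", "AAACCC"]:
        print(read, "->", classify_read(read, exon2, exon3))
```

A read containing the downstream-to-upstream junction cannot come from the canonical linear transcript, which is why such junction-spanning reads are taken as evidence of a circular RNA.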

Pseudogenes

• Were thought to be non-functional copies of genes

• They arise by:

– duplication: non-processed

– retrotransposition: processed, reinserted into the genome via an RNA intermediate (no introns, with a poly(A) tract)

• Can be transcribed, but not translated

Interesting from an evolutionary standpoint

Pier Paolo Pandolfi

• Long ncRNAs can function as miRNA decoys, also referred to as target mimics or sponges, in which the long ncRNA carries a short stretch of sequence sharing homology with miRNA-binding sites in endogenous targets.

• As a consequence, miRNA decoys are able to sequester miRNAs and inactivate their function.

PTEN: TUMOR SUPPRESSOR GENE

The loss and mutation of PTEN in various cancers lead to hyperactive PI3K signaling.

PTEN AND CANCER

• Monoallelic mutation of PTEN in early cancer

• Loss or mutation of the second allele in advanced cancers

• Cells are sensitive to even subtle decreases in PTEN abundance

• PTEN-targeting microRNAs (miRNAs)

In the human genome:

• 1 gene for PTEN
• 1 pseudogene called PTENP1

A coding-independent function of gene and pseudogene mRNAs regulates tumour biology

Nature 2010

PTENP1 is targeted by PTEN-targeting miRNAs

The 3’ UTR of PTENP1 has tumour suppressive activity

Figure panels: proliferation rate of cells; growth of cells in semisolid medium.

The PTENP1 3’ UTR is a more potent growth suppressor than PTEN. This result may be explained by the fact that the miRNAs for which PTENP1 functions as a decoy also bind other targets with tumour suppressive activities.

PTEN expression is dependent on PTENP1

• Is PTENP1 relevant for cancer?

• Is the PTENP1 pseudogene a tumour suppressor?

Expression and losses of PTENP1 in human cancer

• In normal human tissues and prostate tumour samples: the direct correlation between PTEN and PTENP1 expression suggests that they may be co-regulated.

• Copy number losses occurring specifically at the PTENP1 locus in colon cancer

• Direct relationship between PTENP1 copy number and PTEN expression

Pseudogenes can act as sponges for microRNAs

Competing endogenous RNA (ceRNA)

Pier Paolo Pandolfi
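The sponge (ceRNA) idea rests on a gene and its pseudogene sharing binding sites for the same miRNAs. A minimal sketch of how shared sites could be counted is given below, assuming the simplest targeting rule (a perfect match to miRNA nucleotides 2 to 7, the "seed"). The miRNA and the 3’ UTR sequences are invented for illustration and are not the real PTEN or PTENP1 sequences.

```python
# Toy seed-match scan: count sites in a 3' UTR that are complementary to
# the seed (nucleotides 2-7) of a miRNA. All sequences are hypothetical.

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """Target sequence (5'->3' on the mRNA) that pairs with miRNA
    nucleotides 2-7: the reverse complement of the seed."""
    seed = mirna.upper().replace("T", "U")[1:7]
    return seed.translate(COMPLEMENT)[::-1]


def count_sites(utr: str, mirna: str) -> int:
    """Number of (possibly overlapping) perfect seed matches in the UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    return sum(1 for i in range(len(utr) - len(site) + 1)
               if utr[i:i + len(site)] == site)


if __name__ == "__main__":
    mirna = "UGCAUAGGCUAACGUUCGAUC"           # hypothetical miRNA
    gene_utr = "AAACUAUGCGGGAAACUAUGCUU"      # hypothetical gene 3' UTR
    pseudogene_utr = "GGCUAUGCAAAA"           # hypothetical pseudogene 3' UTR
    control_utr = "AAAAGGGGCCCC"              # hypothetical unrelated 3' UTR
    for name, utr in [("gene", gene_utr), ("pseudogene", pseudogene_utr),
                      ("control", control_utr)]:
        print(name, count_sites(utr, mirna))
```

In this toy case the gene and pseudogene UTRs both carry sites for the same miRNA, so in principle they would compete for the same miRNA pool, whereas the control UTR would not.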