View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Biochimica et Biophysica Acta 1842 (2014) 1910–1922

Contents lists available at ScienceDirect

Biochimica et Biophysica Acta

journal homepage: www.elsevier.com/locate/bbadis

Review Genetic variation in the non-coding genome: Involvement of micro-RNAs and long non-coding RNAs in disease☆

Barbara Hrdlickova 1, Rodrigo Coutinho de Almeida 1, Zuzanna Borek, Sebo Withoff ⁎

Department of Genetics, University of Groningen, University Medical Center Groningen, The Netherlands

article info abstract

Article history: It has been found that the majority of disease-associated genetic variants identified by genome-wide association Received 6 November 2013 studies are located outside of -coding regions, where they seem to affect regions that control transcription Received in revised form 5 March 2014 (promoters, enhancers) and non-coding RNAs that also can influence expression. In this review, we focus on Accepted 16 March 2014 two classes of non-coding RNAs that are currently a major focus of interest: micro-RNAs and long non-coding Available online 22 March 2014 RNAs. We describe their biogenesis, suggested mechanism of action, and discuss how these non-coding RNAs

Keywords: might be affected by disease-associated genetic alterations. The discovery of these alterations has already con- SNP tributed to a better understanding of the etiopathology of human diseases and yielded insight into the function ncRNA of these non-coding RNAs. We also provide an overview of available databases, bioinformatics tools, and high- miRNA throughput techniques that can be used to study the mechanism of action of individual non-coding RNAs. lncRNA This article is part of a Special Issue entitled: From Genome to Function. Genetic variant © 2014 Elsevier B.V. All rights reserved. Human disease

1. Introduction allelic chromatin states [8]. A small percentage seems to disrupt or create micro-RNA (miRNA) binding sites in the 3′ untranslated region (3′-UTR) Genome-wide association studies (GWAS) have discovered thou- of . All these events will affect the expression level of the genes sands of single-nucleotide polymorphisms (SNPs) that are associated regulated by these functional elements and can thereby contribute to with multifactorial diseases and quantitative traits. At the time of writing deregulation of pathways that control healthy cell function. of this review (October 2013), the GWAS catalog [1] described 11,680 The recent discovery of approximately 13,500 long non-coding RNAs SNPs associated with diverse phenotypes and quantitative traits. Despite (lncRNAs) has changed the view on the (Fig. 1) [9].It the wealth of information that GWAS provide, it can be difficult to inter- has been estimated that approximately 7% of SNPs associated with auto- pret the results because of the limited resolution of the genome-wide immune diseases seem to annotate to long intergenic non-coding RNAs chips used in the initial genotyping screen. Moreover, even state-of- (lincRNAs), a subclass of lncRNAs [6], and some GWAS SNPs have been the-art technology and approaches, such as using specialized high- demonstrated to have eQTL-effects on these lincRNAs [10]. It is thought resolution genotyping platforms (for example, the Immunochip [2] and that the majority of lncRNAs are somehow involved in regulating the Metabochip [3]), performing genotype imputation [4],oreQTLanalysis expression of protein-coding genes and, therefore, SNPs associated [5], are often not enough to pinpoint the functional SNP and/or causative with these ncRNAs may indirectly influence the expression of gene. What is emerging from these GWAS, however, is that N90% of involved in disease. In this review we will describe how two classes of disease-associated SNPs are located in non-coding regions of the genome regulatory ncRNAs, the miRNAs and the lncRNAs, regulate gene expres- for example in promoter regions, enhancers, or even in non-coding RNA sion. We will also describe how SNPs and other types of genetic varia- genes [1,6]. This indicates that these SNPs might be regulatory. In a paper tion that affect these ncRNAs can contribute to disease phenotypes. by the Encyclopedia of the DNA Elements (ENCODE) Project Consortium it was suggested that biochemical functionality could be assigned 80% of 2. miRNAs the human genome [7]. While fewer than 10% of the GWAS SNPs affect coding sequences, most non-coding variants are concentrated in DNA miRNAs are short regulatory RNAs (approximately 19–24 nucleo- stretches marked by deoxyribonuclease I (DNase I) hypersensitive sites, tides long) involved in post-transcriptional gene regulation. The first where they seem to perturb transcription factor binding sites or alter miRNA, lin-4, was identified in 1993 in a screen for genes required for post-embryonic development in Caenorhabditis elegans [11], but it took another seven years to discover the second one (let-7) [12].Since ☆ This article is part of a Special Issue entitled: From Genome to Function. ⁎ Corresponding author. then, the number of miRNAs has increased steadily. At the time of 1 These authors contributed equally. writing, the number of human mature miRNAs described in miRBase

http://dx.doi.org/10.1016/j.bbadis.2014.03.011 0925-4439/© 2014 Elsevier B.V. All rights reserved. B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 1911

Fig. 1. Abundance of regulatory ncRNA species versus protein coding genes in the human genome. The numbers are based on Gencode V17 (http://www.gencodegenes.org/releases/17.html).

V20 is more than 2500 [13]. miRNAs are estimated to regulate the trans- 3. miRNA biogenesis and mechanism of action lation of up to 60% of protein-coding genes [14]. A single vertebrate miRNA has been described as targeting 200 messenger RNAs (mRNAs) The process of miRNA biogenesis is quite characteristic for this sub- on average, although some miRNAs regulate only a few targets [15]. class of ncRNAs. The primary miRNA transcript (pri-miRNA) is charac- Conversely, some protein-coding genes are regulated by only a single terized by one or many hairpins that encompass the functional mature miRNA, while others are regulated by many miRNAs [16]. The impor- miRNA in their stem (Fig. 2). Upon recognition by two nuclear enzymes, tance of miRNAs as fine-regulators of gene expression has become Drosha and DGCR8, the pri-miRNA is processed into one or several hair- clear since it was discovered that they play important roles in pivotal pins approximately 70 nucleotides long; these are called precursor biological processes, such as development, cell proliferation, cell differ- miRNAs (pre-miRNAs). They are exported into the cytoplasm by the entiation, and cell death [17–20]. nuclear export protein Exportin 5 (XPO5) [21]. In the cytoplasm, the

Fig. 2. miRNA biogenesis and mechanism of action. Stem loop sequences in primary microRNA (pri-miRNA) transcripts are recognized by DROSHA/DGCR8 and processed into 60–80-nucleotide long precursor microRNA (pre-miRNA) hairpins, which are subsequently translocated by exportin 5. In the cytosol, the pre-miRNA is cleaved by Dicer to produce two single-stranded RNA strands, each of which can be loaded into the RISC complex (here represented by Dicer, GW182 and AGO2 proteins), which is guided by the miRNA sequence to the 3′ UTR of the target mRNA. 1912 B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 pre-miRNA can be recognized and are then processed by the RNase III miRNAs by promoter hypermethylation was found in many human tu- enzyme Dicer, which removes the loop of the hairpin, resulting in a mors [41], for example, the miR-200 family is involved in the control of ~20 bp dsRNA molecule. One of the strands will be incorporated into the epithelial–mesenchymal transition (EMT). In EMT, epithelial cells the RNA-induced silencing complex (RISC) containing the Argonaute lose their adherence and polarity, and start to migrate. The miR-200 protein 2 (AGO2) and the GW182 [22]. The RISC complex will target a family downregulates ZEB1 (zinc finger E-box-binding homeobox 1) mRNA transcript, based on sequence complimentarity between the and ZEB2 (zinc finger E-box-binding homeobox 1), two important tran- miRNA sequence and nucleotides in the 3′-UTR of the target [23].Itis scriptional repressors of genes involved in cell adherence (E-cadherin) thought that binding of the RISC complex to this target leads to direct and polarity (CRB3 (crumbsproteinhomolog3)andLGL2 (lethal Ago-mediated cleavage of the target if the homology between miRNA giant larvae)). Thus hypermethylation of miR-200 family members in and its target is extensive or to deadenylation followed by degradation cancer leads to upregulation of ZEB1 and ZEB2, leading to decreased of the target mRNA if the homology between the miRNA and its target adherence and polarity [42].Similarly,histonemodifications might is less extensive [22,24].Efficient targeting requires continuous base- also affect miRNA expression levels. For instance, SIRT1 (sirtuin 1), a pairing of the miRNA seed region (stretches of 6–8 nucleotides between NAD-dependent histone deacetylase involved in control of axon growth positions 1 and 8 of the mature miRNA) with its target [11,24,25]. and degeneration, was recently found to directly suppress the expres- Computational target prediction approaches make use of this pro- sion of miR-138 in response to peripheral nerve injury [43]. posed rule to predict the targets of miRNA-induced silencing. These al- Thirdly, different types of genetic alterations to miRNA genes or gorithms are based on searching for perfect Watson–Crick pairing to their regulatory motifs can have deleterious consequences. In fact, between the miRNAs' seed-sequence and the target mRNA sequence the first example of the involvement of miRNAs in cancer was the (most algorithms focus only on the 3′-UTR of genes) alone, or in combi- description of a deletion of 13q14 in chronic lymphocytic nation with other rules, such as evolutionary conservation criteria to leukemia patients. The deleted area contains the miR-15a and miR-16-1 predict miRNA target sites [14,24]. Evolutionary conservation was im- genes that target the anti-apoptotic/pro-survival gene BCL-2 (B-cell portant in defining the first identified miRNAs [26], but it is now becom- lymphoma 2) [44] and thus deletion of this region contributes to the ing clear that many miRNAs are species-specific. In contrast, more than greater survival characteristics of cancerous cells. For the purpose of 60% of human protein-coding genes have been under selective pressure this review, we will focus on the most common type of genetic variants, to maintain pairing to miRNAs [27]. miRNAs that appear to have a com- SNPs, and show how they affect the function of miRNAs. mon ancestor, and differ only in a few nucleotides, are grouped into the same miRNA family [28]. Until recently, it was assumed that miRNAs 5. Genetic variation and miRNA function mainly target the 3′-UTRs of mRNAs [22], but it has now been shown that miRNA target sites can also be located in the 5′-UTRs of target There are many ways in which disease-associated SNPS can affect mRNAs, or even in the coding region of these RNAs [22,29]. miRNA levels. Firstly, mutations in miRNA biogenesis genes (Fig. 2) Much attention has recently been paid to miRNA as potential bio- can affect miRNA processing, which, in turn, can contribute to disease. markers in circulation. Cell-free miRNAs have been described in multiple For instance, the homozygous presence of SNP rs2073778, located in human body fluids, such as serum [30,31], saliva [32], cerebrospinal fluid DGCR8, was found to be associated with a 4-fold increased risk of non- (CSF) [33], and urine [34].Mostimportantly,thedisease-specificoreven muscle bladder cancer progression [45]. Another example is the pres- disease-stage-specific nature of circulating miRNA profiles [35,36] ence of SNP rs3742330 (ANG) in the Dicer gene, which is associated implies that circulating miRNAs might potentially be used as novel with increased survival of T cell lymphoma patients. Homozygous indi- biomarkers to evaluate health status or disease progression. viduals carrying the GG genotype had a significantly increased overall survival [46]. The functional consequences of these SNPs have not 4. A role for miRNAs in disease been investigated to date. Secondly, SNPs in pri-miRNA and pre-miRNA can affect miRNA miRNAs have been shown to be involved in cancer and in neurode- maturation efficiency (Fig. 3A). For example, SNP rs11671784 in the generative, cardiovascular and autoimmune diseases [16].Changes miR-27a gene reduces gastric cancer risk by impairing the processing in the amount of specific miRNAs will result in downregulation or of pre-miR-27a to mature miR-27a [47]. upregulation of their targets, leading to deregulation of the pathways Thirdly, SNPs affecting the promoters of miRNAs can affect expres- in which those targets are involved. This deregulation of miRNA levels sion of the miRNA in question. SNP rs57095329, which confers risk of in human diseases can occur in different ways. systemic lupus erythematosus (SLE), is located in the miR-146a pro- Firstly, altered functions of the enzymes involved in the miRNA moter (Fig. 3B). Individuals carrying the risk allele in the promoter biogenesis pathway have been implicated in human disease. showed lower expression levels of miR-146a [48]. Haploinsufficiency of DGCR8 accounts for over 90% of cases of Fourthly, SNPs can alter the binding efficiency of miRNAs to mRNA DiGeorge syndrome. This dominantly inherited disorder is caused by targets (Fig. 3C) and SNPs in both miRNAs or in target mRNAs can the presence of hemizygous chromosome 22q11.2 deletions, which affect miRNA–target interaction. An example of such an event is SNP lead to various phenotypic defects including immunodeficiency and rs3853839, which is associated with SLE susceptibility in Eastern autoimmunity [37]. However, DGCR8 haploinsufficiency does not lead Asians [49]. This SNP is located in the 3′ UTR of the TLR7 (Toll-like recep- to an overall decrease in miRNA levels [38]. The presence of the XPO5 tor 7) gene and potentially has a negative effect on the binding of miR- inactive mutant traps miRNAs in the nucleus in a subset of human 3148. Risk allele carriers of this SNP display an increased TLR7 mRNA tumors [39]. Defects in Dicer have also been associated with disease, half-life, resulting in increased TLR7 expression levels [50], which for example, recurrent somatic missense mutations in DICER1 have would increase the production of pro-inflammatory cytokines, suggest- been identified in non-epithelial ovarian cancers [40]. In conclusion, de- ing a role in autoimmunity. SNPs in miRNAs or target mRNAs can also fects in several members of the miRNA processing machinery have been confer the potential binding to different miRNA–target combinations reported. However, these changes never lead to dramatic overall chang- (Fig. 3C). A SNP in the 3′ UTR may create a sequence match to the es of miRNA levels in the cell, which is consistent with the notion that seed of a miRNA that was not previously associated with the given miRNAs are essential for cell survival. mRNA [51]. Gong et al. predicted that 52% of SNPs in the dbSNP database Secondly, as pri-miRNA expression is regulated by RNA polymerase (release 132) would be able to create novel miRNA binding sites [52]. II and by the transcription factors that regulate the expression of The identification of GWAS SNPs in miRNA target sites helps with protein-coding genes, the same epigenetic control mechanisms are prioritizing functional variants. The first GWAS signal that was ex- involved in regulating miRNA expression. Transcriptional repression of plained by polymorphic miRNA targeting was the synonymous SNP B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 1913

AB

C

Fig. 3. Genetic variants and their influence on microRNAs. (A) SNPs occurring in pri-miRNA sequences can affect miRNA processing. (B) SNPs changing the sequence of the mature miRNA can prevent binding to the original target and cause binding to alternative targets. (C) SNPs located within the 3′-UTR of the target can modulate miRNA–mRNA interaction in three ways: (1) a novel miRNA binding site is created, (2) a miRNA target site is disabled, and (3) the strength of binding can be attenuated, the mature miRNA sequence in blue indicates a weakness of binding and in red shows the strength of binding.

(c.313CNT) in the 3′ UTR of IRGM (immunity-related GTPase family M and PolymiRTs described in the table. This in silico analysis identified protein), a GTPase involved in regulating immunity. This SNP confers 32 3′ UTR SNPs that potentially affect miRNA binding [55]. risk to Crohn's disease by decreasing the binding of miR-196 [53].An- Since miRNAs may have many targets, high-throughput methodolo- other example is the presence of SNP rs1625579, which is associated gies are needed to analyze the data. In the miRNA field, promising with schizophrenia and located in the intron of a putative primary tran- approaches involve cross-linking RNA (mRNA and miRNAs) to the script for the mir137 gene. This SNP alters the seed sequence of miR-137, RISC complex, immunoprecipitation of these complexes by capturing which is known to regulate neuronal development. Interestingly, Argonaute proteins (most often antibodies against Argonaute are ap- four other genes associated with schizophrenia (TCF4 (transcription plied or cell lines are used that express tagged Argonaute proteins), factor 4), CACNA1C (calcium channel, voltage-dependent, L type, alpha and sequencing of the mRNA targets in the complex. Examples of such 1C subunit), CSMD1 (CUB and Sushi multiple domains 1) and C10orf26 techniques are AGO2 HITS-CLIP [58], PAR-CLIP [59], and CLIP-seq [60]. (chromosome 10 open reading frame 26)) contain predicted target- The disadvantage of these assays is that they do not yield informa- binding sites for miR-137, suggesting that the expression levels of tion on specific miRNAs binding to specific targets and, therefore, the these four genes might be affected by multiple mechanisms [54]. miRNA of interest is often overexpressed to enrich for complexes with The strongest proof for the functionality of a SNP is an expression this specific miRNA. However, in a recent paper by Helwak et al. quantitative trait locus (eQTL) effect of the SNP on a transcript. In CLASH (cross-linking, ligation, and sequencing of hybrids) was de- 2012 Gamazon et al. showed that 25% of European 3′-UTR SNPs and scribed [61]. In this method, loaded RISC complexes are first cross- 18% of African 3′-UTR SNPs predicted to alter miRNA-binding sites did linked, isolated by Argonaute immunoprecipitation, and then the RNA indeed have a cis-eQTL effect [55]. In another study using publicly avail- in these complexes is ligated forming hybrid miRNA–target mRNA able eQTL datasets, 26 eQTL SNPs were predicted to create, disrupt, sequences that are sequenced. This provides a high-throughput method or change the target site of miRNAs in genes encoding xenobiotic me- to identify the targets to which the miRNAs bind and gives an idea of tabolizing enzymes [56]. which fraction of the RISC complexes in the cell is occupied by specific miRNAs.

6. Bioinformatic and high-throughput methods for studying 7. Long non-coding RNAs miRNA function Long non-coding RNAs (lncRNAs) are a heterogeneous group de- Concurrent with the increase in the amount of data, web-based ap- fined as transcripts more than 200 nucleotides (nt) in length that exhib- plications and online databases have been developed that can be used it no coding potential [9,62,63]. They can be isolated from nuclear as to analyze miRNA data. These are shown in Table 1. The usefulness of well as cytosolic fractions, may or may not be polyadenylated, 98% are the available data is exemplified here by just one example. As spliced and ~25% of them display at least two different alternatively mentioned above, the most powerful indication of a SNP's functionality spliced isoforms [9,62,64]. In comparison with protein-coding genes, is to find an eQTL effect. In a recent study SNPs from the HapMap lncRNAs have longer, but fewer, exons [9,65]. LncRNA promoter regions Consortium [57] located in the 3′ UTRs of genes with cis-eQTL effects are similarly conserved between vertebrates as promoters of protein- were examined using TargetScan, miRBase, Pictar, TarBase, Patrocles coding genes. In contrast lncRNA exons are less well conserved between 1914

Table 1 Publicly available databases and bioinformatics tools for non-coding RNAs.

ncRNAs Database Link Explanation Ref.

ncRNAs in general Rfam 11.0 http://rfam.sanger.ac.uk/ Database of non-coding RNA families, primarily RNAs with a conserved RNA secondary [169] structure, including both RNA genes and mRNA cis-regulatory elements RNAdb 2.0 http://research.imb.uq.edu.au/rnadb/ Database of mammalian noncoding RNAs [170] DIANA tools http://62.217.127.8/DianaTools/index.php?r=site/index ncRNA analysis web server-current emphasis is on the analysis of miRNA and protein coding – genes ncRNA.org http://www.ncrna.org/ Functional RNA Project — bioinformatics tools and databases – http://www.ncrna.org/software List of bioinformatics tools for RNAs – UCSC Genome Browserfor functional RNA http://www.ncrna.org/glocal/cgi-bin/hgGateway UCSC Genome Browser with large inclusion of functional RNA related custom tracks and [153,154] several enhancements for RNA secondary structure support http://www.ncrna.org/custom-tracks UCSC Genome Browser — specific custom tracks for different species (human, mouse, rat, – drosophila) to visualize the data of interest

RegulomeDB http://regulome.stanford.edu/index Database annotates SNPs with known and predicted regulatory elements in the intergenic [171] 1910 (2014) 1842 Acta Biophysica et Biochimica / al. et Hrdlickova B. regions of the human genome NONCODE database http://www.noncode.org/NONCODERv3/guide.htm Database that includes almost all types of ncRNAs (except transfer RNAs and ribosomal [172–174] RNAs), ncRNA sequences and their related information (e.g. function, cellular role, cellular location, and chromosomal information) NPInter http://www.bioinfo.org.cn/NPInter/ Functional interactions between noncoding RNAs (except tRNAs and rRNAs) and [175] biomolecules (proteins, RNAs and DNAs) which are experimentally verified. small ncRNAs MirBase http://www.mirbase.org/ Online repository for microRNA sequences and their annotations [13] MicroRNA.org http://www.microrna.org/microrna/home.do Miranda algorithm-predicting microRNA targets. [176] Target Scan http://www.targetscan.org/ Prediction of microRNA targets (for human, mouse, worm, drosophila, zebrafish) [27] DIANA-microT-ANN DIANA- http://62.217.127.8/DianaTools/index.php?r=microtv4/index ncRNA analysis web server-current-emphasis is on the analysis of miRNA & protein-coding [177] MicroT-CDS http://62.217.127.8/DianaTools/index.php?r=microT_CDS/index genes (incl. miRNA target prediction, analysis of expression data for microRNA function, DIANA-mirExTra http://diana.cslab.ece.ntua.gr/hexamers/ incorporating microRNAs in pathways, db. of experimentally supported microRNA targets) DIANA-miRPath http://www.microrna.gr/miRPathv2 TarBase http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=tarbase/index Available miRNA targets derived from all contemporary experimental techniques [178] Starbase http://starbase.sysu.edu.cn/ miRNA targets and protein RNA interaction data from CLIP seq experiments [179] miRmap http://mirmap.ezlab.org/ Prediction of miRNA targets that combines different known approaches [180] PicTar http://pictar.mdc-berlin.de/ Algorithm for the identification of microRNA targets [15] miRanda http://www.microrna.org/microrna/home.do MiRNA target predictions and expression profiles [176] microSNiPer http://epicenter.ie-freiburg.mpg.de/services/microsniper/ Predicts the impact of a SNP on a putative microRNA binding site [181] mirdSNP http://mirdsnp.ccr.buffalo.edu/ Database of disease-associated SNPs and microRNA target sites in 3′ UTRs of human genes [182] mirwalk http://www.umm.uni-heidelberg.de/apps/zmf/mirwalk/predictedmirnagene.html Database of predicted and validated miRNA targets [183] PolimiRTS http://compbio.uthsc.edu/miRSNP/ Database of naturally occurring DNA variation in predicted and experimentally identified [184] miRNA target sites –

Patrocles http://www.patrocles.org/Patrocles.htm Database of DNA polymorphisms predicted to disturb miRNA target [185] 1922 miRSNP http://cmbi.bjmu.edu.cn/mirsnp Collection of human SNPs in predicted miRNA–mRNA binding sites. [186] PITA http://genie.weizmann.ac.il/pubs/mir07/mir07_prediction.html MicroRNA prediction tool for targets sites [187] Long ncRNAs LNCipedia http://www.lncipedia.org Database for annotated human lncRNA sequences and structure — contains 32,183 human [152] annotated lncRNAs lncRNA Database http://www.lncrnadb.org/ Repository of known lncRNAs in eukaryotic cells with published characteristics — linked to [71] UCSC genome browser for visualization and to NRED database for expression information Human lincRNA catalog http://www.broadinstitute.org/genome_bio/human_lincrnas/ Reference catalog containing over 8000 human lincRNAs [65] Functional lncRNA Database http://www.valadkhanlab.org/database.php Database containing 204 lncRNAs and their splicing variants, analysis of the lncRNAs and [188] their comparison to protein-coding transcripts ncRNA Expression database (NRED) http://nred.matticklab.com/cgi-bin/ncrnadb.pl Provides gene expression information for thousands of lncRNAs in human and mouse, [189] contains both microarray and in situ hybridization data NONCODE v3.0 http://www.noncode.org/NONCODERv3/ Integrative annotation of long noncoding RNAs [174] ncFANs http://ebiomed.org/ncfans/ Functional annotation and function enrichment of lncRNAs (human & mouse), using data [190] from 3 microarray platforms Linc2GO http://www.bioinfo.tsinghua.edu.cn/~liuke/Linc2GO/index.html Web-server providing comprehensive function annotation of human lincRNAs [191] LncRNADisease http://202.38.126.151/hmdd/html/tools/lncrnadisease.html Resource containing experimentally supported lncRNA-disease association data and inte- [192] grates tool(s) for predicting novel lncRNA-disease associations LncBase http://diana.imis.athena-innovation.gr/DianaTools/index.php?r=lncBase/index Part of DIANA tools — predicted and experimentally verified, miRNA–lncRNA interactions [193] B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 1915 these species [62,66]. It is becoming clear that lncRNAs exhibit cell- seq) data to generate the human lincRNA catalog, containing more type-specific expression profiles [62] and that in their specific cellular than 8000 lincRNAs determined across 24 different human cell types background they can be expressed at levels, similar to the levels of and tissues [65]. Although more than 13,500 human lncRNAs have protein-coding RNAs. The origin of lncRNAs is still under debate, but a been annotated by GENCODE, only a few dozen have been studied in recent study [67] has reported that more than two-thirds of mature more detail so far. The challenge is now to elucidate the function of lncRNA transcripts contain transposable elements (TEs), whereas only these lincRNAs. 4% of protein-coding genes contain these ‘jumping genes’ [68].Thisob- servation led to the postulation that the majority of lncRNAs have arisen 8. Classes of lncRNAs and mechanism of action via insertion of TEs. For example, the rodent-specific brain cytoplasmic RNA 1 (BC1), the anthropoid primate-specific brain cytoplasmic RNA 8.1. Classification of lncRNAs based on genomic location 200-nucleotide (BC200), and the strepsirhini primate-specific G22 lncRNAs, form a family of lncRNAs which originate independently The size limit of N200 nucleotides used to define lncRNAs is an arbi- from insertion of TEs, resulting in lncRNAs that locate to dendrites in dif- trary cut-off based on RNA isolation protocols and their size exclusion ferent mammalian species [69–72]. limit in the past, which potentially leads to the capture of a heteroge- The first lncRNAs (H19 and Xist (X-inactive specific transcript)) were neous group of different transcripts with respect to function. Although discovered using traditional gene mapping approaches in the early different nomenclatures are used, in this review we will adhere to the 1990s [73–75] and were considered to be rare exceptions to the then classification based on the lncRNA's location relative to the nearest central dogma of molecular biology. Using tiling arrays HOTAIR (HOX known protein-coding gene as described in GENCODE [9].Thissubclas- antisense intergenic RNA) and HOTTIP (HOXA transcript at the distal sification leads to four broad categories (Fig. 4). The long intergenic tip) were discovered in the homeobox gene regions (HOX clusters) non-coding RNAs (lincRNAs) are the largest group of lncRNAs, account- [76,77]. In 2009, Guttman et al. were the first to describe a genome- ing for approximately 6000 human genes according to the GENCODE wide approach for discovering lncRNAs that yielded 1600 novel dataset V17 [9] or approximately 8000 according to the human lincRNA mouse lncRNAs. In this study gene expression data and the presence catalog [65]. LincRNA genes do not overlap or lie in close proximity to of chromatin marks for promoter regions and gene bodies were inte- protein-coding genes [66,86]. The second most prevalent class of grated with the known annotations of coding transcripts to identify lncRNAs is the antisense lncRNA that is transcribed from the strand op- lncRNAs [66]. Since then, thousands of lncRNAs have been identified posite of the protein-coding genes, which they are overlapping. Based using similar approaches in the mouse and human genomes [62,78]. on their complete or incomplete overlap, antisense lncRNAs can be When used in combination with next-generation RNA sequencing, ad- subdivided into various subclasses, for example, intronic antisense ditional information can be obtained about lncRNA exon–intron struc- lncRNAs when the lncRNA transcript falls completely within the bound- ture as well as about the abundance of these transcripts [79–85]. aries of an opposing coding intron, or natural antisense transcripts Cabili et al. combined chromatin marks and RNA-sequencing (RNA- (NATs) with partial overlap, mainly around the promoter or terminator

A)

B)

C)

D)

Fig. 4. Classification of lncRNAs based on position relative to the nearest protein-coding gene. (A) Long intergenic non-coding RNA (lincRNA) genes do not overlap or neighbor protein- coding genes. (B) Antisense lncRNAs are transcribed from the strand opposite of the protein-coding gene with whom they are overlapping. Antisense lncRNAs can be subdivided into (1) intronic antisense lncRNAs, when the transcript falls completely within the boundaries of an opposing coding intron; or (2) natural antisense transcripts (NATs) which partially overlap the coding gene. (C) Sense lncRNAs are located on the same strand and transcribed in the same direction as a neighboring protein-coding gene. (D) Bidirectional or divergent lncRNAs are located on the antisense strand (opposite of the protein-coding gene) and their transcription start site (TSS) is close to the TSS of the protein-coding gene. These lncRNAs are transcribed in the opposite direction relative to the protein-coding gene. Color legend: protein coding genes (blue) and lncRNA genes (red). 1916 B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 site of the coding gene [87,88]. For antisense transcripts it holds true In 2011, Wang and Chang [93] divided lncRNAs into four archetypes that the sense–antisense pairs are often co-expressed together, that based on their molecular mechanism (Fig. 5). It is important to state they share a similar pattern of evolutionary conservation [89],and here that some lncRNAs are known to exercise more than one of these that the antisense transcript modulates the expression of the sense tran- archetypal molecular mechanisms. LncRNAs that are defined to belong script by the formation of a sense–antisense RNA duplex [88]. The third to display the signaling archetype act as a molecular signal (marker) subclass of lncRNAs comprises the sense lncRNA transcripts, which can for a particular biological state or condition (e.g. intracellular signaling be ‘sense intronic’ or ‘sense overlapping’. Such transcripts are located on or response to a stimulus) rather than on the function/mechanism the same strand and transcribed in the same direction as a protein- that they might exert. The presence or absence of this signal can indicate coding gene. This organization is much less prevalent. In total, fewer what is happening in the given cell in that moment and may activate than 1000 sense lncRNAs have been identified, overlapping completely or silence other genes without the need for translation (Fig. 5A). or partially with protein-coding genes. To date, this subclass is poorly For instance the presence of Xist indicates that one of the X chromo- characterized compared with the other lncRNAs. The fourth subclass somes is actively being epigeneticly silenced. Examples of lncRNAs of lncRNAs is the bi-directional or divergent group. These transcripts displaying the signaling archetype are lncRNAs involved in embryonic are located on the antisense strand (opposite of the protein-coding development (HOTAIR and HOTTIP) [76,77,97], DNA damage response gene); they have their transcription start site (TSS) close to the TSS of (e.g. lincRNA-p21 and PANDA (p21-associated ncRNA DNA damage the protein-coding gene, but are transcribed in the opposite direction. activated lncRNA)) [98,99], stress responses (e.g. COLDAIR (cold- The majority of these bi-directional pairs are co-expressed together assisted intronic non-coding RNA) and COOLAIR (cold-induced long an- and conserved between human and mouse [90,91]. tisense intragenic RNA)) [100,101], and somatic cell reprogramming (e.g. lincRNA-ROR (regulator of reprogramming)) [102,103]. The second mechanism of action is the decoy mechanism (Fig. 5B) 8.2. Classification of lncRNAs based on molecular mechanism (lncRNA exerted by e.g. GAS5 (growth arrest-specific transcript 5), MALAT1 archetypes) (metastasis-associated lung adenocarcinoma transcript 1), TERRA (telomeric repeat-containing RNA) and PANDA. These lncRNAs can act LncRNAs can interact with DNA or RNA as well as proteins. Although as decoys that bind to and interfere with the function of other RNAs or for most of the annotated lncRNAs the detailed mechanism of action is proteins (miRNAs, transcription factors, or RNA-binding proteins). still unknown, the few examples available show the complexity of They act by competing with another sequences or structures for binding lncRNA biology. LncRNAs control multiple mechanisms and have been and are considered to be negative regulators. For example GAS5 acts as a implicated in post-transcriptional gene regulation by controlling pro- decoy glucocorticoid-response element (GRE) and competes with DNA cesses, like protein synthesis, RNA maturation, and RNA transport, and GREs for binding to the glucocorticoid receptor [104]. PANDA binds to have been shown to control transcriptional gene silencing via epigenetic the transcription factor NF-YA and prevents the activation of NF-YA- regulation and chromatin remodeling [82,92–96]. induced pro-apoptotic targets [99,105].

A)

B)

C)

D)

Fig. 5. Four archetypes of lncRNA mechanism of action. (A) Signaling archetype: some lncRNAs act as molecular signals activating (1) or silencing (2) other genes without own translation. (B) Decoy archetype: some lncRNAs compete with another sequence/structure (such as miRNAs, transcription factors, or RNA-binding proteins) for binding. (C) Guide archetype: lncRNAs that bind specific proteins and transport them closer to specific target sequence. This interaction might be (1) direct — when the lncRNA–protein heteroduplex (ribonucleoprotein com- plex) binds directly to DNA, or (2) indirect — when the interaction between lncRNA–protein is mediated via another protein located on the DNA. (D) Scaffold archetype: lncRNAs that bind multiple proteins, bringing them in close proximity to facilitate interactions and, for example, allow the formation of ribonucleoprotein complexes. Adapted from Wang and Chang [93]. B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 1917

The third lncRNA mechanism is the ability to act as a guide imprinting-related, neurogenetic Angelman syndrome and Beckwith– archetype, for example, by binding chromatin modifying proteins and Wiedemann syndrome (BWS) [134]. Angelman syndrome is caused by transporting the created complex to specific targets, for example, chro- a loss-of-function of ubiquitin-protein ligase E3A (UBE3A), also known matin modification enzymes to DNA, where the interaction may as E6AP [135]. Although in the majority of human tissues, both copies be directly between this complex and the DNA, or indirectly with het- of the UBE3A gene are expressed, in neurons one copy is silenced by eroduplex protein–DNA (Fig. 5C). These lncRNAs may interact as activa- UBE3A-AS1 (ubiquitin-protein ligase E3A antisense RNA 1) [136].In tors or repressors with neighboring (cis) or distant (trans)genes. patients suffering from Angelman syndrome, the other (active) allele Examples of lncRNAs employing this mechanism are HOTAIR, lincRNA- has either been deleted or inactivated [136]. p21, Xist, COLDAIR and Jpx (just proximal to XIST). LncRNAs have also been associated with other neurological disor- The fourth archetype is acting as a scaffold (Fig. 5D), for instance by ders, such as BACE1-AS or BC200 in Alzheimer disease, HAR1 (human ac- bringing bound proteins into a complex or in spatial proximity. Exam- celerated region 1 lncRNA) in Huntington disease, and ATXN8OS (Ataxin ples of lncRNAs exploiting this strategy are ANRIL (antisense ncRNA in 8 opposite strand lncRNA) in spinocerebellar ataxia type 8 [134,137, the INK4 locus), which functions as a scaffold for the chromatin remod- 138]. In Alzheimer disease, the protein-coding gene BACE-1 (β-site am- eling complexes PRC1 (polycomb repressive complex 1) and PRC2 yloid precursor protein-cleaving enzyme) cleaves amyloid precursor (polycomb repressive complex 2) [106], HOTAIR (scaffold for PRC2 protein (APP) to β-amyloid peptide (Aβ), the accumulation of which binding it to the LSD1 (lysine-specific demethylase 1A) complex) [76, (amyloid plaques) is associated with disease. LncRNA BACE1-AS,located 97,107],andTERC (telomerase RNA component) that scaffolds the on the antisense strand to BACE1, binds complementarily to BACE1 telomerase complex [108]. mRNA, increases its stability, regulates BACE1 translation, and thereby the production of Aβ [139]. BACE1-AS also prevents miRNA-induced re- 9. LncRNAs in human disease pression of BACE-1 by masking the binding site for miR-485-5p, by com- peting for the same region on exon 6 of BACE-1 [140]. Two- to three-fold In many diseases the expression of protein-coding genes is increased levels of BACE1-AS and a smaller increase (1.5×) of BACE-1 deregulated and evidence is now accumulating that altered lncRNA have been documented post mortem in the brains of patients with function might be one of the causes involved. Below we describe exam- Alzheimer disease [139]. ples of lncRNAs involved in the etiopathology of different disorders. Besides their actions in cancer and neurological diseases, lncRNAs Many lncRNAs have been described that exhibit altered expression also exhibit aberrant expression in other disease states, such as levels in cancer cells compared to healthy tissue of the same origin facioscapulohumeral muscular dystrophy (FSHD). FSHD is a common, [109]. MALAT1, also known as NEAT2 (nuclear-enriched abundant tran- progressive, genetic disease of skeletal muscle caused by deletions script 2), was initially discovered as a predictive biomarker for metasta- that reduce the number of D4Z4 repeats in the FSHD locus at chromo- sis development in lung cancer [110,111]. It took almost ten years to some 4q35 [134,141]. Under physiological conditions, D4Z4 repeats re- unravel its mechanism of action. MALAT1 acts by inducing the expres- cruit Polycomb complexes resulting in chromatin reorganization and sion of metastasis-associated genes [112] and recently it was shown causing repression of 4q35 genes [142]. In contrast, in FSHD patients, a that in vitro metastasis of EBC-1 cells (human lung cancer cells) deletion of D4Z4 repeats results in cis production of the DBE-T lncRNA can be inhibited by antisense oligonucleotides directed to MALAT1 that binds to protein complexes, reorganizes the chromatin state of [112,113]. the FSHD locus, and reactivates the repressed 4q35 genes [142]. LncRNA HOTAIR interacts with PRC2 and alters chromatin to a metastasis-promoting state [114]. In approximately one-quarter of 10. Genetic variants and lncRNA function human breast cancers, HOTAIR is highly induced, while its elevated levels are also predictive of metastasis and disease progression in As yet, how far lncRNAs are affected by genetic alterations related to other cancers, such as colon, colorectal, gastrointestinal, pancreatic the disease phenotype is unknown. The larger alterations, like chromo- and liver cancer [107,115–118]. ANRIL, GAS5 and lincRNA-p21 are somal rearrangements (translocations, amplifications, or deletions) do involved in the escape of growth suppression by regulating tumor sup- affect the expression of lncRNAs that are involved in disease phenotypes. pressor genes (ANRIL) or apoptosis regulators (GAS5, lincRNA-p21). For example, two different balanced translocations (t(8;12)(q13;p11.2) TERRA and TERC (telomerase RNA component) regulate replicative im- and t(4;12)(q13.2-13.3;p11.2)) affecting chromosome 12p have been as- mortality [119,120]. MALAT1 and HOTAIR activate cancer invasion and sociated with the human brachydactyly type E phenotype [143,144].The metastasis by respectively regulating cell motility-related genes lncRNA DA125942 in the affected 12p region was recently described as (MALAT1) or retargeting of PCR2 complex resulting in a metastasis interacting in cis with Parathyroid hormone-like hormone (PTHLH, promoting chromatin state (HOTAIR) [107,112,121]. The lncRNAs αHIF which is a regulator of endochondral bone development) and in trans (antisense to hypoxia inducible factor α (HIFα)) and tie-1AS (tyrosine with SOX9 (Sex determining region Y-box 9, which acts during chon- kinase containing immunoglobulin and epidermal growth factor drocyte differentiation) [103,143]. Another balanced translocation homology domain-1 antisense) induce angiogenesis [86,122]. PCGEM1 (t(1;11)(q42.1;q14.3)) has been associated with schizophrenia and (prostate-specific transcript 1), UCA1 (urothelial cancer associated 1, other psychiatric disorders [145,146]. This particular translocation af- also known as CUDR, cancer upregulated drug resistant), SPRY4-IT1 fects two genes, the protein-coding disrupted in schizophrenia 1 gene (SPRY4 intronic transcript 1), and PANDA are involved in suppressing (DISC1) and the antisense lncRNA disrupted in the schizophrenia 2 apoptosis [99,123–125]. Some lncRNAs have, as yet, only been associat- gene (DISC2) [146,147].ThesameDISC1/DISC2 region was also associated ed with one specifictypeofcancer.PCGEM1, PCA3 (prostate cancer an- witha1q42deletioninanautismspectrumdisorder[148]. tigen 3, known also as DD3, differential display code 3) and PCNCR1 The interactions of lncRNAs with other molecules are probably (prostate cancer ncRNA 1) are involved in prostate cancer, while HULC governed by lncRNA structure, rather than by sequence. As it is difficult (highly up-regulated in liver cancer) is involved with liver cancer to predict the structure of larger RNA molecules, it is a challenge to un- [123,126–130]. In contrast, other lincRNAs (e.g. HOTAIR and MALAT1) derstand how small mutations (small insertions/deletions or SNPs) are appear to be broadly deregulated in carcinogenesis [110,116,117,131, involved in disease etiopathology. Here we describe a few scenarios that 132]. can be envisioned. Firstly, SNPs may affect the expression level of More recently, data has been generated that shows that lncRNAs also lncRNAs. SNPs can be located in promoter sequence and directly alter have roles in other diseases. Some human pathological phenotypes are the expression of lncRNAs (loss-of-function, Fig. 6A) or change the bind- caused by epigenetic changes and imprinted lncRNA gene clusters have ing of inhibitory complexes, thereby allowing expression of lncRNAs been associated with these diseases [133]. Examples of these are the that are not expressed under physiological circumstances (gain-of- 1918 B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 A B

C D

Fig. 6. Functional consequences of mutations on lncRNAs and their function. Mutations located in regions involved in transcriptional control (e.g. promoters, enhancers) might result in: (A) direct alteration of the amount of lncRNA transcripts (loss-of function); or (B) indirect alteration of lncRNA levels when enhancer or repressor sites are affected (gain-of-function). Mutations located in exons of lncRNAs might result in alternative splicing leading to loss-of-function (C). Mutations might also alter lncRNA function by affecting their secondary structure (D). function, Fig. 6B). Secondly, SNPs within lncRNA genes may cause alter- and Functional Annotation of the Mammalian Genome (FANTOM) native splicing of the transcript (Fig. 6C) or affect its secondary structure [154–156]. (Fig. 6D). This can lead to an altered function of the lncRNAs. For in- Co-expression analysis is a promising strategy to predict the mecha- stance, the chromosomal region 9p21 (previously described as a “gene nism of action of lncRNAs of interest. This type of analysis predicts func- desert”)harborstheANRIL lncRNA (also known as CDKN2BAS or tions for molecules of unknown gene products, based on co-expression CDKN2B-AS1). Many SNPs located within or around this lncRNA have data and taking into account the known functions or mechanisms been associated in GWAS with a susceptibility to atherosclerotic vascu- of the co-expressed genes (“guilt by association”). Two examples lar disease, coronary artery disease, stroke, myocardial infarction, aortic of such tools are Gene Network (www.genenetwork.nl/genenetwork) aneurysm, type 2 diabetes, and several types of cancers [149,150].Ex- (Karjalainen and Franke et al., manuscript in preparation) or Gemma actly how these SNPs contribute to disease is not yet known. It is unclear (http://www.chibi.ubc.ca/Gemma/home.html) [157]. However, the data whether they directly regulate the expression level of ANRIL or whether most commonly used for these analyses comes from arrays. Although they disturb the binding site of transcription factor STAT1 (ANRIL's nearly all the arrays were designed to study the expression of protein- repressor) in ANRIL's enhancer [151]. SNPs may also modulate ANRIL coding genes, several of them also carry probes for lncRNAs. Kumar transcripts by inducing exon skipping, resulting in shorter splice vari- et al. took advantage of this and used microarray data to investigate the ants with reduced efficiency or non-functional isoforms [149]. association of SNPs with the expression levels of lincRNAs by using eQTL and a platform containing approximately 2000 lncRNA probes [10].Theydiscovered112cis-regulated lincRNAs, of which 45% were rep- 11. Bioinformatics tools and high-throughput applications for licated in an independent dataset. A remarkable 75% of these SNPs affect- studying lncRNAs ed the expression of the lincRNA but did not affect the expression of neighboring protein-coding genes. Expression microarray platforms As lncRNAs are a relatively novel class of transcripts, there are not probing both protein-coding transcripts and non-coding RNAs will yet many bioinformatics tools to aid the study of lncRNA function. enable co-expression analyses of these RNA classes. The data can then To date, a couple of lncRNA databases and catalogs are available be used to link lncRNAs to pathways, using existing data for the coding (Table 1), such as LNCipedia [152], LncRNAdb [71], and the Human genes with which they are co-expressed. lincRNA catalog [65]. LncRNAdb contains a comprehensive collection In the near future, enough transcriptomic data is expected to be gen- of eukaryotic lncRNAs and relevant information, such as RNA sequence, erated by next-generation sequencing to allow in depth co-expression structure, genomic location, expression profiles, subcellular localization, analysis. The initial studies in the field were mostly focused on lncRNAs conservation and function [71]. The Human lincRNA catalog [65] focuses in the polyadenylated (polyA+) RNA fraction. More recently, a signifi- only on human lincRNAs and contains more than 8000 lincRNAs defined cant proportion of lncRNAs also present in the non-polyadenylated by more than 30 properties. Expression data across 24 tissues and (polyA−) RNA fraction have been described [158]. For this reason, the cell types is available in this database. Data from both databases can latest focus is on both RNA fractions (polyA+, polyA−) and on different be visualized and analyzed using the University of California Santa subcellular compartments (nuclear, cytoplasmic) [62,64].Itisexpected Cruz (UCSC) genome browser [153]. In the same browser, it is also pos- that many more lncRNA transcripts will be identified [159]. sible to annotate ncRNAs with the latest updated genome-wide data Because lncRNA function is based on RNA structure rather than on from large international consortia, such as ENCODE (GENCODE dataset) RNA sequence, it is difficult to predict lncRNA targets or to pinpoint B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 1919 the functional domains in these molecules. However, a new tool has re- miRNAs and lncRNAs exhibit cell type, and even cell developmental cently been described – Triplexator – that can be used to predict regions stage specificexpressionprofiles, it is pivotal that this research will be where chromatin-associated RNAs could interact with the DNA double performed in the relevant (disease associated) cell types. The identifica- helix (thereby forming a triple helix structure) [160]. These predicted tion of disease SNPs associated with ncRNAs is an important step in sites might pinpoint genes that are regulated by lncRNAs. linking ncRNAs to human disease, but to fully understand how these The techniques used for the genome-wide identification of lncRNA SNPs contribute to pathology the functions of these ncRNAs need to targets and binding partners employ methods similar to the ones de- be uncovered experimentally. Identification of the proteins and path- scribed above for finding miRNA targets. Chromatin Isolation by RNA ways regulated by these ncRNAs is likely to provide novel insights into Purification (ChIRP) [161] and Capture Hybridization Analysis of RNA pathology of human diseases, provide novel diagnostic opportunities, Targets (CHART) [162] are based on cross-linking RNA in complex and pinpoint novel therapeutical targets. with DNA and/or protein, followed by capture techniques targeting the lncRNA in the complex. Subsequently, the protein component Acknowledgements or DNA sequences in the complexes can be identified by mass spectro- photometry [161] or next-generation sequencing, respectively. Con- The authors would like to thank Jackie Senior and Kate Mc Intyre for versely, since lncRNAs have been reported to be involved in targeting critically reading the manuscript. The authors of this review are paid by, chromatin-modifying complexes to specific DNA sequences, one can or have obtained grants from the Netherlands Organisation for Scientific cross-link complexes and immunoprecipitate these (e.g. the PRC2 com- Research (NWO-VICI 918.66.620; BH), the Coeliac Disease Consortium 2 plex), and then analyze the RNA component by sequencing (RNA (BSIK grant 03009; BH), the Capes Foundation, Ministry of Education of immunoprecipitation-sequencing (RIP-SEQ)) [163]. Finally, lncRNAs Brazil (BEX0891-10-0 to RA), a grant from the European Research Coun- can be synthesized in vitro and hybridized with protein microarrays to cil (ERC) under the European Union's Seventh Framework Programme identify lncRNA–protein interactions [91]. (FP/2007-2013; ERC Grant Agreement no. 2012-322698 to CW), and the Dutch Multiple Sclerosis Foundation (11-752 to SW). 12. Perspectives

The rapid evolution of next-generation sequencing technologies References and the expected drop in cost of assays mean that we will soon be [1] L.A. Hindorff, P. Sethupathy, H.A. Junkins, E.M. Ramos, J.P. Mehta, F.S. Collins, et al., able to sequence large numbers of genomes and transcriptomes. This Potential etiologic and functional implications of genome-wide association loci for will ultimately provide us with a nearly complete overview of disease- human diseases and traits, Proc. Natl. Acad. Sci. U. S. A. 106 (2009) 9362–93627. associated genetic variation. Transcriptome analysis will discover tens [2] A. Cortes, M.A. Brown, Promise and pitfalls of the Immunochip, Arthritis Res. Ther. 13 (2011) 101. of thousands of new transcripts, many of which will be of a non- [3] B.F. Voight, H.M. Kang, J. Ding, C.D. Palmer, C. Sidore, P.S. Chines, et al., The coding nature [159]. Most of the genetic variation associated with com- metabochip, a custom genotyping array for genetic studies of metabolic, cardiovas- plex diseases is located within genomic regions that control transcrip- cular, and anthropometric traits, PLoS Genet. 8 (2012) e1002793. [4] J. Marchini, B. Howie, Genotype imputation for genome-wide association studies, tion, rather than in protein-coding areas. As the ncRNAs are now Nat. Rev. Genet. 11 (2010) 499–511. emerging as important regulators of expression, it is becoming clear [5] W. Cookson, L. Liang, G. Abecasis, M. Moffatt, M. Lathrop, Mapping complex disease that deregulation of gene expression might be the lynch-pin in many traits with global gene expression, Nat. Rev. Genet. 10 (2009) 184–194. [6] I. Ricaño-Ponce, C. Wijmenga, Mapping of immune-mediated disease genes, Annu. disease mechanisms. Because of the above it is essential to understand Rev. Genomics Hum. Genet. 14 (2013) 325–353. the function of ncRNAs in health and disease. [7] B.E. Bernstein, E. Birney, I. Dunham, E.D. Green, C. Gunter, M. Snyder, An integrated What is already emerging and will complicate the study of the regula- encyclopedia of DNA elements in the human genome, Nature 489 (2012) 57–74. [8] D. Variation, M.T. Maurano, R. Humbert, E. Rynes, R.E. Thurman, E. Haugen, et al., tory involvement of ncRNAs in human diseases even more, is that differ- Systematic localization of common disease-associated variation in regulatory ent classes of ncRNAs might interact. A few lncRNAs have been reported DNA, Science 337 (2012) 1190–1195 (80-.). to be precursors for different types of small ncRNAs, such as miRNAs or [9] J. Harrow, A. Frankish, J.M. Gonzalez, E. Tapanari, M. Diekhans, F. Kokocinski, et al., small nucleolar RNAs [62,164].Inaddition,miRNAsandlncRNAscanreg- GENCODE: the reference human genome annotation for The ENCODE Project, Genome Res. 22 (2012) 1760–1774. ulate each other's expression. For example, Zhang et al. reported miR-21 [10] V. Kumar, H.-J. Westra, J. Karjalainen, D.V. Zhernakova, T. Esko, B. Hrdlickova, et al., which is able to negatively regulate the expression of the GAS5 lncRNA. Human disease-associated genetic variation impacts large intergenic non-coding On the other hand GAS5 was able to repress miR-21 [165].LncRNAscan RNA expression, PLoS Genet. 9 (2012) e1003201. ‘ ’ [11] R.C. Lee, R.L. Feinbaum, V. Ambros, The C. elegans heterochronic gene lin-4 encodes also act as microRNA sponges . The most striking example is the circular small RNAs with antisense complementarity to lin-14, Cell 75 (1993) 843–854. lncRNA CDR1 (cerebellar degeneration-related protein 1), which has [12] A.E. Pasquinelli, B.J. Reinhart, F. Slack, M.Q. Martindale, M.I. Kuroda, B. Maller, et al., been shown to contain approximately 70 binding sites for miR-7. By Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA, Nature 408 (2000) 86–89. binding miR-7 molecules CDR1 prevents binding of this miRNA to other [13] A. Kozomara, S. Griffiths-Jones, miRBase: integrating microRNA annotation and miR-7 targets [166,167]. deep-sequencing data, Nucleic Acids Res. 39 (2011) D152–D157. The availability of genomic and transcriptomic data will also lead to [14] B.P. Lewis, C.B. Burge, D.P. Bartel, Conserved seed pairing, often flanked by adeno- sines, indicates that thousands of human genes are microRNA targets, Cell 120 an increase in the number of ncRNA eQTLs associated with disease [10]. (2005) 15–20. Once the causal mutations have been connected to specific ncRNAs, the [15] A. Krek, D. Grün, M.N. Poy, R. Wolf, L. Rosenberg, E.J. Epstein, et al., Combinatorial next step will be to identify the targets/interaction partners of these microRNA target predictions, Nat. Genet. 37 (2005) 495–500. [16] M. Esteller, Non-coding RNAs in human disease, Nat. Rev. Genet. 12 (2011) transcripts, which is essential to fully understand the mechanism of 861–874. disease. Because both miRNAs and lncRNAs may act on many targets, [17] M. Jovanovic, M.O. Hengartner, miRNAs and apoptosis: RNAs to die for, Oncogene high-throughput methods will need to be designed and applied. Even- 25 (2006) 6176–6187. tually, genome-editing techniques might be applied to pinpoint the ef- [18] I. Büssing, F.J. Slack, H. Grosshans, let-7 microRNAs in development, stem cells and cancer, Trends Mol. Med. 14 (2008) 400–409. fects of a single SNP or of small deletions/insertions by allowing [19] R. Schickel, B. Boyerinas, S.-M. Park, M.E. Peter, MicroRNAs: key players in the alterations of wild-type alleles into risk alleles in eukaryotic cells [168]. immune system, differentiation, tumorigenesis and cell death, Oncogene 27 (2008) – The ‘omics’-revolution started with the description of the sequence 5959 5974. [20] G. Stefani, F.J. Slack, Small non-coding RNAs in animal development, Nat. Rev. Mol. of the human genome. Since then state-of-the-art high-throughput Cell Biol. 9 (2008) 219–230. techniques and bioinformatic approaches have been developed, [21] W. Sun, Y.-S. Julie Li, H.-D. Huang, J.Y.-J. Shyy, S. Chien, MicroRNA: a master that were used to identify mutations that are associated with human regulator of cellular processes for bioengineering systems, Annu. Rev. Biomed. – fi Eng. 12 (2010) 1 27. diseases. It has now become apparent that a signi cant part of these [22] A.E. Pasquinelli, MicroRNAs and their targets: recognition, regulation and an mutations affect the expression or function of ncRNAs. Because both emerging reciprocal relationship, Nat. Rev. Genet. 13 (2012) 271–282. 1920 B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922

[23] J. Krol, I. Loedige, W. Filipowicz, The widespread regulation of microRNA biogene- [54] The Schizophrenia Psychiatric Genome-Wide Association Study (GWAS) Consor- sis, function and decay, Nat. Rev. Genet. 11 (2010) 597–610. tium, Genome-wide association study identifies five new schizophrenia loci, Nat. [24] D.P. Bartel, MicroRNA target recognition and regulatory functions, Cell 136 (2009) Genet. 43 (2011) 969–976. 215–233. [55] E.R. Gamazon, D. Ziliak, H.K. Im, B. LaCroix, D.S. Park, N.J. Cox, et al., Genetic archi- [25] B.J. Reinhart, F.J. Slack, M. Basson, a E. Pasquinelli, J.C. Bettinger, a E. Rougvie, et al., tecture of microRNA expression: implications for the transcriptome and complex The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis traits, Am. J. Hum. Genet. 90 (2012) 1046–1063. elegans, Nature 403 (2000) 901–906. [56] R. Wei, F. Yang, T.J. Urban, L. Li, N. Chalasani, D. a Flockhart, et al., Impact of the [26] E. Berezikov, Evolution of microRNA diversity and regulation in animals, Nat. Rev. interaction between 3′-UTR SNPs and microRNA on the expression of human xe- Genet. 12 (2011) 846–860. nobiotic metabolism enzyme and transporter genes, Front. Genet. 3 (2012) 248. [27] R.C. Friedman, K.K. Farh, C.B. Burge, D.P. Bartel, Most mammalian mRNAs are con- [57] The International HapMap Consortium, International HapMap project, Nature 426 served targets of microRNAs, Genome Res. 19 (2009) 92–105. (2003) 789–796. [28] B. Kaczkowski, E. Torarinsson, K. Reiche, J.H. Havgaard, P.F. Stadler, J. Gorodkin, [58] S.W. Chi, J.B. Zang, A. Mele, R.B. Darnell, Ago HITS-CLIP decodes miRNA–mRNA Structural profiles of human miRNA families from pairwise clustering, Bioinfor- interaction maps, Nature 460 (2009) 479–486. matics 25 (2009) 291–294. [59] M. Hafner, M. Landthaler, L. Burger, M. Khorshid, P. Berninger, A. Rothballer, et al., [29] W. Wang, B.R. Wilfred, K. Xie, M.H. Jennings, Y. Hu, J. Arnold, et al., Individual Transcriptome-wide identification of RNA-binding protein and microRNA target microRNAs (miRNAs) display distinct mRNA targeting “rules”,RNABiol.7 sites by PAR-CLIP, 141 (2010) 129–141. (2010) 373–380. [60] D.G. Zisoulis, M.T. Lovci, M.L. Wilbert, K.R. Hutt, Y. Tiffany, A.E. Pasquinelli, et al., [30] P.S. Mitchell, R.K. Parkin, E.M. Kroh, B.R. Fritz, S.K. Wyman, E.L. Pogosova- Comprehensive discovery of endogenous Argonaute binding sites in Caenorhabditis Agadjanyan, et al., Circulating microRNAs as stable blood-based markers for cancer elegans, Nat. Struct. Mol. Biol. 17 (2010) 173–179. detection, Proc. Natl. Acad. Sci. U. S. A. 105 (2008) 10513–10518. [61] A. Helwak, G. Kudla, T. Dudnakova, D. Tollervey, Mapping the human miRNA inter- [31] K.C. Vickers, B.T. Palmisano, B.M. Shoucri, R.D. Shamburek, A.T. Remaley, actome by CLASH reveals frequent noncanonical binding, Cell 153 (2013) 654–665. MicroRNAs are transported in plasma and delivered to recipient cells by high- [62] T. Derrien, R. Johnson, G. Bussotti, A. Tanzer, S. Djebali, H. Tilgner, et al., The density lipoproteins, Nat. Cell Biol. 13 (2011) 423–433. GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene struc- [32] N.J. Park, H. Zhou, D. Elashoff, B.S. Henson, D.A. Kastratovic, E. Abemayor, et al., ture, evolution, and expression, Genome Res. 22 (2012) 1775–1789. Salivary microRNA: discovery, characterization, and clinical utility for oral cancer [63] M. Guttman, P. Russell, N.T. Ingolia, J.S. Weissman, E.S. Lander, Ribosome profiling detection, Clin. Cancer Res. 15 (2009) 5473–5477. provides evidence that large noncoding RNAs do not encode proteins, Cell 154 [33] P.N. Alexandrov, P. Dua, J.M. Hill, S. Bhattacharjee, Y. Zhao, J. Walter, MicroRNA (2013) 240–251. (miRNA) speciation in Alzheimer's disease (AD) cerebrospinal fluid (CSF) and [64] S. Djebali, C.A. Davis, A. Merkel, A. Dobin, T. Lassmann, A. Mortazavi, et al., Land- extracellular fluid (ECF), Int. J. Biochem. Mol. Biol. 3 (2012) 365–373. scape of transcription in human cells, Nature 489 (2012) 101–108. [34] M. Hanke, K. Hoefig, H. Merz, A.C. Feller, I. Kausch, D. Jocham, et al., A robust meth- [65] M.N. Cabili, C. Trapnell, L. Goff, M. Koziol, B. Tazon-Vega, A. Regev, et al., Integrative odology to study urine microRNA as tumor marker: microRNA-126 and microRNA- annotation of human large intergenic noncoding RNAs reveals global properties 182 are related to urinary bladder cancer, Urol. Oncol. 28 (2010) 655–661. and specific subclasses, Genes Dev. 25 (2011) 1915–1927. [35] A.M. Zahm, M. Thayu, N.J. Hand, A. Horner, M.B. Leonard, J.R. Friedman, Circulating [66] M. Guttman, I. Amit, M. Garber, C. French, M.F. Lin, D. Feldser, et al., Chromatin microRNA is a biomarker of pediatric Crohn disease, J. Pediatr. Gastroenterol. Nutr. signature reveals over a thousand highly conserved large non-coding RNAs in 53 (2011) 26–33. mammals, Nature 458 (2009) 223–237. [36] Q. Zhou, W.W. Souba, C.M. Croce, G.N. Verne, MicroRNA-29a regulates intestinal [67] A. Kapusta, Z. Kronenberg, V.J. Lynch, X. Zhuo, L. Ramsay, G. Bourque, et al., Trans- membrane permeability in patients with irritable bowel syndrome, Gut 59 posable elements are major contributors to the origin, diversification, and regula- (2010) 775–784. tion of vertebrate long noncoding RNAs, PLoS Genet. 9 (2013) e1003470. [37] T.P. Atkinson, Immune deficiency and autoimmunity, Curr. Opin. Rheumatol. 24 [68] A. Nekrutenko, W.H. Li, Transposable elements are found in a large number of (2012) 515–521. human protein-coding genes, Trends Genet. 17 (2001) 619–621. [38] M.T. de la Morena, J.L. Eitson, I.M. Dozmorov, S. Belkaya, A.R. Hoover, E. Anguiano, [69] J.A. Martignetti, J. Brosius, Neural BC1 RNA as an evolutionary marker: guinea pig et al., Signature MicroRNA expression patterns identified in humans with 22q11.2 remains a rodent, Proc. Natl. Acad. Sci. U. S. A. 90 (1993) 9698–9702. deletion/DiGeorge syndrome, Clin. Immunol. 147 (2013) 11–22. [70] B.V. Skryabin, J. Kremerskothen, D. Vassilacopoulou, T.R. Disotell, V.V. Kapitonov, J. [39] S.A. Melo, C. Moutinho, S. Ropero, G.A. Calin, S. Rossi, R. Spizzo, et al., A genetic Jurka, et al., The BC200 RNA gene and its neural expression are conserved in defect in exportin-5 traps precursor microRNAs in the nucleus of cancer cells, Anthropoidea (primates), J. Mol. Evol. 47 (1998) 677–685. Cancer Cell 18 (2010) 303–315. [71] P.P. Amaral, M.B. Clark, D.K. Gascoigne, M.E. Dinger, J.S. Mattick, lncRNAdb: a refer- [40] A. Heravi-Moussavi, M.S. Anglesio, S.-W.G. Cheng, J. Senz, W. Yang, L. Prentice, et al., ence database for long noncoding RNAs, Nucleic Acids Res. 39 (2011) D146–D151. Recurrent somatic DICER1 mutations in nonepithelial ovarian cancers, N. Engl. J. [72] X. Cao, G. Yeo, A.R. Muotri, T. Kuwabara, F.H. Gage, Noncoding RNAs in the mam- Med. 366 (2012) 234–242. malian central nervous system, Annu. Rev. Neurosci. 29 (2006) 77–103. [41] P. Lopez-Serra, M. Esteller, DNA methylation-associated silencing of tumor- [73] C.I. Brannan, E.C. Dees, R.S. Ingram, S.M. Tilghman, The product of the H19 gene suppressor microRNAs in cancer, Oncogene 31 (2012) 1609–1622. may function as an RNA, Mol. Cell. Biol. 10 (1990) 28–36. [42] V. Davalos, C. Moutinho, A. Villanueva, R. Boque, P. Silva, F. Carneiro, et al., Dynamic [74] N. Brockdorff, A. Ashworth, G.F. Kay, V.M. McCabe, D.P. Norris, P.J. Cooper, et al., epigenetic regulation of the microRNA-200 family mediates epithelial and mesen- The product of the mouse Xist gene is a 15 kb inactive X-specific transcript con- chymal transitions in human tumorigenesis, Oncogene 31 (2012) 2062–2074. taining no conserved ORF and located in the nucleus, Cell 71 (1992) 515–526. [43] C.-M. Liu, R.-Y. Wang, Z.-X. Jiao Saijilafu, B.-Y. Zhang, F.-Q. Zhou, MicroRNA-138 and [75] G. Borsani, R. Tonlorenzi, M.C. Simmler, L. Dandolo, D. Arnaud, V. Capra, et al., SIRT1 form a mutual negative feedback loop to regulate mammalian axon regener- Characterization of a murine gene expressed from the inactive X chromosome, ation, Genes Dev. 27 (2013) 1473–1483. Nature 351 (1991) 325–329. [44] G.A. Calin, C.D. Dumitru, M. Shimizu, R. Bichi, S. Zupo, E. Noch, et al., Frequent dele- [76] J.L. Rinn, M. Kertesz, J.K. Wang, S.L. Squazzo, X. Xu, S.A. Brugmann, et al., Functional tions and down-regulation of micro-RNA genes miR15 and miR16 at 13q14 in chronic demarcation of active and silent chromatin domains in human HOX loci by non- lymphocyticleukemia,Proc.Natl.Acad.Sci.U.S.A.99(2002)15524–15529. coding RNAs, Cell 129 (2007) 1311–1323. [45] H.-L. Ke, M. Chen, Y. Ye, M. a T. Hildebrandt, W.-J. Wu, H. Wei, et al., Genetic vari- [77] K.C. Wang, Y.W. Yang, B. Liu, A. Sanyal, R. Corces-Zimmerman, Y. Chen, et al., A long ations in micro-RNA biogenesis genes and clinical outcomes in non-muscle- noncoding RNA maintains active chromatin to coordinate homeotic gene expres- invasive bladder cancer, Carcinogenesis 34 (2013) 1006–1011. sion, Nature 472 (2011) 120–124. [46] X. Li, X. Tian, B. Zhang, Y. Zhang, J. Chen, Variation in dicer gene is associated with [78] A.M. Khalil, M. Guttman, M. Huarte, M. Garber, A. Raj, D. Rivea Morales, et al., Many increased survival in T-cell lymphoma, PLoS One 7 (2012) e51640. human large intergenic noncoding RNAs associate with chromatin-modifying [47] Q. Yang, Z. Jie, S. Ye, Z. Li, Z. Han, J. Wu, et al., Genetic variations in miR-27a gene complexes and affect gene expression, Proc. Natl. Acad. Sci. U. S. A. 106 (2009) decrease mature miR-27a level and reduce gastric cancer susceptibility, Oncogene 11667–11672. (2012) 1–10. [79] M. Lin, E. Pedrosa, A. Shah, A. Hrabovsky, S. Maqbool, H.M. Lachman, RNA-Seq of [48] X. Luo, W. Yang, D.-Q. Ye, H. Cui, Y. Zhang, N. Hirankarn, et al., A functional variant human neurons derived from iPS cells reveals candidate long non-coding RNAs in- in microRNA-146a promoter modulates its expression and confers disease risk for volved in neurogenesis and neuropsychiatric disorders, PLoS One 6 (2011) e23356. systemic lupus erythematosus, PLoS Genet. 7 (2011) e1002128. [80] J.C. Marioni, C.E. Mason, S.M. Mane, M. Stephens, Y. Gilad, RNA-seq: an assessment [49] N. Shen, Q. Fu, Y. Deng, X. Qian, J. Zhao, K.M. Kaufman, et al., Sex-specific associa- of technical reproducibility and comparison with gene expression arrays, Genome tion of X-linked Toll-like receptor 7 (TLR7) with male systemic lupus erythemato- Res. 18 (2008) 1509–1517. sus, Proc. Natl. Acad. Sci. U. S. A. 107 (2010) 15838–15843. [81] A. Mortazavi, B.A. Williams, K. Mccue, L. Schaeffer, B. Wold, Mapping and quantify- [50] Y. Deng, J. Zhao, D. Sakurai, K.M. Kaufman, J.C. Edberg, R.P. Kimberly, et al., ing mammalian transcriptomes by RNA-Seq, Nature 5 (2008) 621–628. MicroRNA-3148 modulates allelic expression of toll-like receptor 7 variant associ- [82] J.L. Rinn, H.Y. Chang, Genome regulation by long noncoding RNAs, Annu. Rev. ated with systemic lupus erythematosus, PLoS Genet. 9 (2013) e1003336. Biochem. 81 (2012) 145–166. [51] M.A. Saunders, H. Liang, W.-H. Li, Human polymorphism at microRNAs and [83] M. Guttman, M. Garber, J.Z. Levin, J. Donaghey, J. Robinson, X. Adiconis, et al., Ab microRNA target sites, Proc. Natl. Acad. Sci. U. S. A. 104 (2007) 3300–3305. initio reconstruction of cell type-specific transcriptomes in mouse reveals the con- [52] J. Gong, Y. Tong, H.-M. Zhang, K. Wang, T. Hu, G. Shan, et al., Genome-wide identi- served multi-exonic structure of lincRNAs, Nat. Biotechnol. 28 (2010) 503–510. fication of SNPs in microRNA genes and the SNP effects on microRNA target bind- [84] C. Trapnell, A. Roberts, L. Goff, G. Pertea, D. Kim, D.R. Kelley, et al., Differential gene ing and biogenesis, Hum. Mutat. 33 (2012) 254–263. and transcript expression analysis of RNA-seq experiments with TopHat and [53] P. Brest, P. Lapaquette, M. Souidi, K. Lebrigand, A. Cesaro, V. Vouret-Craviari, et al., A Cufflinks, Nat. Protoc. 7 (2012) 562–578. synonymous variant in IRGM alters a binding site for miR-196 and causes deregu- [85] M. Garber, M.G. Grabherr, M. Guttman, C. Trapnell, Computational methods for lation of IRGM-dependent xenophagy in Crohn's disease, Nat. Genet. 43 (2011) transcriptome annotation and quantification using RNA-seq, Nat. Methods 8 242–245. (2011) 469–477. B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922 1921

[86] C.A. Thrash-Bingham, K.D. Tartof, aHIF: a natural antisense transcript overexpressed [120] J. Feng, W. Funk, S. Wang, S. Weinrich, A. Avilion, C. Chiu, et al., The RNA compo- in human renal cancer and during hypoxia, J. Natl. Cancer Inst. 91 (1999) 143–151. nent of human telomerase, Science 269 (1995) 1236–1241 (80-.). [87] M. Lapidot, Y. Pilpel, Genome-wide natural antisense transcription: coupling its [121] K. Tano, R. Mizuno, T. Okada, R. Rakwal, J. Shibato, Y. Masuo, et al., MALAT-1 en- regulation to its different regulatory mechanisms, EMBO Rep. 7 (2006) 1216–1222. hances cell motility of lung adenocarcinoma cells by influencing the expression [88] M.A. Faghihi, C. Wahlestedt, Regulatory roles of natural antisense transcripts, Nat. of motility-related genes, FEBS Lett. 584 (2010) 4575–4580. Rev. Mol. Cell Biol. 10 (2009) 637–643. [122] K. Li, Y. Blum, A. Verma, Z. Liu, K. Pramanik, N.R. Leigh, et al., A noncoding antisense [89] J. Chen, M. Sun, L.D. Hurst, G.G. Carmichael, J.D. Rowley, Genome-wide analysis of RNA in tie-1 locus regulates tie-1 function in vivo, Blood 115 (2010) 133–139. coordinate expression and evolution of human cis-encoded sense-antisense tran- [123] X. Fu, L. Ravindranth, N. Tran, G. Petrovics, S. Srivastava, Regulation of apoptosis by scripts, Trends Genet. 21 (2005) 326–329. a prostate-specific and prostate cancer-associated noncoding gene, PCGEM1, DNA [90] A.M. Khalil, M.A. Faghihi, F. Modarresi, S.P. Brothers, C. Wahlestedt, A novel RNA Cell Biol. 25 (2006) 135–141. transcript with antiapoptotic function is silenced in fragile X syndrome, PLoS [124] W.P. Tsang, T.W.L. Wong, A.H.H. Cheung, C.N.N. Co, T.T. Kwok, Induction of drug re- One 3 (2008) e1486. sistance and transformation in human cancer cells by the noncoding RNA CUDR, [91] N.A. Rapicavoli, E.M. Poth, H. Zhu, S. Blackshaw, The long noncoding RNA Six3OS RNA 13 (2007) 890–898. acts in trans to regulate retinal development by modulating Six3 activity, Neural [125] D. Khaitan, M.E. Dinger, J. Mazar, J. Crawford, M.A. Smith, J.S. Mattick, et al., The Dev. 6 (2011) 32. melanoma-upregulated long noncoding RNA SPRY4-IT1 modulates apoptosis and [92] J. Whitehead, G.K. Pandey, C. Kanduri, Regulation of the mammalian epigenome by invasion, Cancer Res. 71 (2011) 3852–3862. long noncoding RNAs, Biochim. Biophys. Acta 1790 (2009) 936–947. [126] V. Srikantan, Z. Zou, G. Petrovics, L. Xu, M. Augustus, L. Davis, et al., PCGEM1, a [93] K.C. Wang, H.Y. Chang, Molecular mechanisms of long noncoding RNAs, Mol. Cell prostate-specific gene, is overexpressed in prostate cancer, Proc. Natl. Acad. Sci. 43 (2011) 904–914. U. S. A. 97 (2000) 12216–12221. [94] N. Schonrock, R.P. Harvey, J.S. Mattick, Long noncoding RNAs in cardiac develop- [127] M.J.G. Bussemakers, A. Van Bokhoven, G.W. Verhaegh, F.P. Smit, H.F.M. Karthaus, J. ment and pathophysiology, Circ. Res. 111 (2012) 1349–1362. A. Schalken, et al., DD3: a new prostate-specific gene, highly overexpressed in [95] J.T.Y. Kung, D. Colognori, J.T. Lee, Long noncoding RNAs: past, present, and future, prostate cancer, Cancer Cell 59 (1999) 5975–5979. Genetics 193 (2013) 651–669. [128] L.B. Ferreira, A. Palumbo, K.D. de Mello, C. Sternberg, M.S. Caetano, F.L. de Oliveira, [96] E. Bernstein, C.D. Allis, RNA meets chromatin, Genes Dev. 19 (2005) 1635–1655. et al., PCA3 noncoding RNA is involved in the control of prostate-cancer cell surviv- [97] M.-C. Tsai, O. Manor, Y. Wan, N. Mosammaparast, J.K. Wang, F. Lan, et al., Long non- al and modulates androgen receptor signaling, BMC Cancer 12 (2012) 507. coding RNA as modular scaffold of histone modification complexes, Science 329 [129] S. Chung, H. Nakagawa, M. Uemura, L. Piao, K. Ashikawa, N. Hosono, et al., Associ- (2010) 689–693. ation of a novel long non-coding RNA in 8q24 with prostate cancer susceptibility, [98] M. Huarte, M. Guttman, D. Feldser, M. Garber, M.J. Koziol, D. Kenzelmann-Broz, Cancer Sci. 102 (2011) 245–252. et al., A large intergenic noncoding RNA induced by p53 mediates global gene [130] K. Panzitt, M.M.O. Tschernatsch, C. Guelly, T. Moustafa, M. Stradner, H.M. repression in the p53 response, Cell 142 (2010) 409–419. Strohmaier, et al., Characterization of HULC, a novel gene with striking up- [99] T. Hung, Y. Wang, M.F. Lin, A.K. Koegel, Y. Kotake, G.D. Grant, et al., Extensive and regulation in hepatocellular carcinoma, as noncoding RNA, Gastroenterology 132 coordinated transcription of noncoding RNAs within cell-cycle promoters, Nat. (2007) 330–342. Genet. 43 (2011) 621–629. [131] F. Guo, Y. Li, Y. Liu, J. Wang, Y. Li, G. Li, Inhibition of metastasis-associated lung ad- [100] J.B. Heo, S. Sung, Vernalization-mediated epigenetic silencing by a long intronic enocarcinoma transcript 1 in CaSki human cervical cancer cells suppresses cell pro- noncoding RNA, Science 331 (2011) 76–79. liferation and invasion, 0 (2010) 224–229. [101] S. Swiezewski, F. Liu, A. Magusin, C. Dean, Cold-induced silencing by long antisense [132] R. Lin, S. Maeda, C. Liu, M. Karin, T.S. Edgington, A large noncoding RNA is a marker transcripts of an Arabidopsis Polycomb target, Nature 462 (2009) 799–802. for murine hepatocellular carcinomas and a spectrum of human carcinomas, [102] S. Loewer, M.N. Cabili, M. Guttman, Y.-H. Loh, K. Thomas, I.H. Park, et al., Large Oncogene 26 (2007) 851–858. intergenic non-coding RNA-RoR modulates reprogramming of human induced [133] J.L. Thorvaldsen, M.S. Bartolomei, SnapShot: imprinted gene clusters, Cell 130 pluripotent stem cells, Nat. Genet. 42 (2010) 1113–1117. (2007) 958. [103] K.D. Pruitt, T. Tatusova, W. Klimke, D.R. Maglott, NCBI Reference Sequences: cur- [134] P.J. Batista, H.Y. Chang, Long noncoding RNAs: cellular address codes in develop- rent status, policy and new initiatives, Nucleic Acids Res. 37 (2009) D32–D36. ment and disease, Cell 152 (2013) 1298–1307. [104] T. Kino, D.E. Hurt, T. Ichijo, N. Nader, G.P. Chrousos, Noncoding RNA Gas5 is a [135] J.T. Lee, M.S. Bartolomei, X-inactivation, imprinting, and long noncoding RNAs in growth arrest and starvation-associated repressor of the glucocorticoid receptor, health and disease, Cell 152 (2013) 1308–1323. Sci. China Life Sci. 3 (2010) ra8. [136] S.J. Chamberlain, M. Lalande, Angelman syndrome, a genomic imprinting disorder [105] E. Sotillo, A. Thomas-Tikhonenko, The long reach of noncoding RNAs, Nat. Genet. of the brain, J. Neurosci. 30 (2010) 9958–9963. 43 (2011) 616–617. [137] R. Johnson, N. Richter, R. Jauch, P.M. Gaughwin, C. Zuccato, E. Cattaneo, et al., [106] K.L. Yap, S. Li, A.M. Muñoz-cabello, S. Raguz, L. Zeng, J. Gil, et al., Molecular interplay Human accelerated region 1 noncoding RNA is repressed by REST in Huntington's of the non-coding RNA ANRIL and methylated histone H3 lysine 27 by polycomb disease, Physiol. Genomics (2010) 269–274. CBX7 in transcriptional silencing of INK4a, Mol. Cell 38 (2010) 662–674. [138] R.S. Daughters, D.L. Tuttle, W. Gao, Y. Ikeda, M.L. Moseley, T.J. Ebner, et al., RNA [107] R.A. Gupta, N. Shah, K.C. Wang, J. Kim, H.M. Horlings, D.J. Wong, et al., Long non- gain-of-function in spinocerebellar ataxia type 8, PLoS Genet. 5 (2009) e1000600. coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis, [139] M.A. Faghihi, F. Modarresi, A.M. Khalil, D.E. Wood, B.G. Sahagan, T.E. Morgan, et al., Nature 464 (2010) 1071–1076. Expression of a noncoding RNA is elevated in Alzheimer's disease and drives rapid [108] D.C. Zappulla, T.R. Cech, RNA as a flexible scaffold for proteins: yeast telomerase feed-forward regulation of b-secretase, Nat. Med. 14 (2008) 723–730. and beyond, Cold Spring Harb. Symp. Quant. Biol. 71 (2006) 217–224. [140] M.A. Faghihi, M. Zhang, J. Huang, F. Modarresi, M.P. Van der Brug, M.A. Nalls, et al., [109] E.A. Gibb, E.A. Vucic, K.S.S. Enfield, G.L. Stewart, K.M. Lonergan, J.Y. Kennett, et al., Evidence for natural antisense transcript-mediated inhibition of microRNA func- Human cancer long non-coding RNA transcriptomes, PLoS One 6 (2011) e25915. tion, Genome Biol. 11 (2010) R56. [110] P. Ji, S. Diederichs, W. Wang, S. Böing, R. Metzger, P.M. Schneider, et al., MALAT-1, a [141] R.J.L.F. Lemmers, M. Wohlgemuth, K.J. van der Gaag, P.J. van der Vliet, C.M.M. van novel noncoding RNA, and thymosin beta4 predict metastasis and survival in Teijlingen, P. de Knijff, et al., Specific sequence variations within the 4q35 region early-stage non-small cell lung cancer, Oncogene 22 (2003) 8031–8041. are associated with facioscapulohumeral muscular dystrophy, Am. J. Hum. Genet. [111] V. Tripathi, J.D. Ellis, Z. Shen, D.Y. Song, Q. Pan, A.T. Watt, et al., The nuclear- 81 (2007) 884–894. retained noncoding RNA MALAT1 regulates alternative splicing by modulating [142] D.S. Cabianca, V. Casa, B. Bodega, A. Xynos, E. Ginelli, Y. Tanaka, et al., A long ncRNA SR splicing factor phosphorylation, Mol. Cell 39 (2010) 925–938. links copy number variation to a polycomb/trithorax epigenetic switch in FSHD [112] T. Gutschner, M. Hämmerle, M. Eissmann, J. Hsu, Y. Kim, G. Hung, et al., The non- muscular dystrophy, Cell 149 (2012) 819–831. coding RNA MALAT1 is a critical regulator of the metastasis phenotype of lung [143] P.G. Maass, A. Rump, H. Schulz, S. Stricker, L. Schulze, K. Platzer, et al., A misplaced cancer cells, Cancer Res. 73 (2013) 1180–1189. lncRNA causes brachydactyly in humans, J. Clin. Invest. 122 (2012) 3990–4002. [113] T. Gutschner, M. Hämmerle, S. Diederichs, MALAT1 — a paradigm for long noncod- [144] P.G. Maass, J. Wirth, A. Aydin, A. Rump, S. Stricker, S. Tinschert, et al., A cis- ing RNA function in cancer, J. Mol. Med. (Berl.) 91 (2013) 791–801. regulatory site downregulates PTHLH in translocation t(8;12)(q13;p11.2) and [114] M. Eißmann, T. Gutschner, M. Hämmerle, S. Günther, M. Caudron-Herger, M. Groß, leads to Brachydactyly Type E, Hum. Mol. Genet. 19 (2010) 848–860. et al., Loss of the abundant nuclear non-coding RNA MALAT1 is compatible with [145] J.K. Millar, J.C. Wilson-Annan, S. Anderson, S. Christie, M.S. Taylor, C.A. Semple, et al., life and development, RNA Biol. 9 (2012) 1076–1087. Disruption of two novel genes by a translocation co-segregating with schizophre- [115] K. Kim, I. Jutooru, G. Chadalapaka, G. Johnson, J. Frank, R. Burghardt, et al., HOTAIR nia, Hum. Mol. Genet. 9 (2000) 1415–1423. is a negative prognostic factor and exhibits pro-oncogenic activity in pancreatic [146] J.K. Millar, S. Christie, C.A. Semple, D.J. Porteous, Chromosomal location and cancer, Oncogene 32 (2013) 1616–1625. genomic structure of the human translin-associated factor X gene (TRAX; [116] R. Kogo, T. Shimamura, K. Mimori, K. Kawahara, S. Imoto, T. Sudo, et al., Long TSNAX) revealed by intergenic splicing to DISC1, a gene disrupted by a transloca- noncoding RNA HOTAIR regulates polycomb-dependent chromatin modification tion segregating with schizophrenia, Genomics 67 (2000) 69–77. and is associated with poor prognosis in colorectal cancers, Cancer Res. 71 [147] O. Wapinski, H.Y. Chang, Long noncoding RNAs and human disease, Trends Cell (2011) 6320–6326. Biol. 21 (2011) 354–361. [117] T. Niinuma, H. Suzuki, M. Nojima, K. Nosho, H. Yamamoto, H. Takamaru, et al., Up- [148] J.M. Williams, T.F. Beck, D.M. Pearson, M.B. Proud, S. Wai, D.A. Scott, A 1q42 regulation of miR-196a and HOTAIR drive malignant character in gastrointestinal deletion involving DISC1, DISC2, and TSNAX in an autism spectrum disorder, Am. stromal tumors, Cancer Res. 72 (2012) 1126–1136. J. Med. Genet. A 149A (2009) 1758–1762. [118] Z. Yang, L. Zhou, L.-M. Wu, M.-C. Lai, H.-Y. Xie, F. Zhang, et al., Overexpression of [149] C.E. Burd, W.R. Jeck, Y. Liu, H.K. Sanoff, Z. Wang, N.E. Sharpless, Expression of linear long non-coding RNA HOTAIR predicts tumor recurrence in hepatocellular carcinoma and novel circular forms of an INK4/ARF-associated non-coding RNA correlates patients following liver transplantation, Ann. Surg. Oncol. 18 (2011) 1243–1250. with atherosclerosis risk, PLoS Genet. 6 (2010) e1001233. [119] S. Redon, P. Reichenbach, J. Lingner, The non-coding RNA TERRA is a natural ligand [150] E. Pasmant, A. Sabbagh, M. Vidaud, I. Bièche, ANRIL, a long, noncoding RNA, is an and direct inhibitor of human telomerase, Nucleic Acids Res. 38 (2010) 5797–5806. unexpected major hotspot in GWAS, FASEB J. 25 (2011) 444–448. 1922 B. Hrdlickova et al. / Biochimica et Biophysica Acta 1842 (2014) 1910–1922

[151] O. Harismendy, D. Notani, X. Song, N.G. Rahim, B. Tanasa, N. Heintzman, et al., 9p21 [173] S. He, C. Liu, G. Skogerbø, H. Zhao, J. Wang, T. Liu, et al., NONCODE v2.0: decoding DNA variants associated with coronary artery disease impair interferon-γ signal- the non-coding, Nucleic Acids Res. 36 (2008) D170–D172. ling response, Nature 470 (2011) 264–268. [174] D. Bu, K. Yu, S. Sun, C. Xie, G. Skogerbø, R. Miao, et al., NONCODE v3.0: integrative [152] P.-J. Volders, K. Helsens, X. Wang, B. Menten, L. Martens, K. Gevaert, et al., annotation of long noncoding RNAs, Nucleic Acids Res. 40 (2012) D210–D215. LNCipedia: a database for annotated human lncRNA transcript sequences and [175] T. Wu, J. Wang, C. Liu, Y. Zhang, B. Shi, X. Zhu, et al., NPInter: the noncoding RNAs structures, Nucleic Acids Res. 41 (2013) D246–D251. and protein related biomacromolecules interaction database, Nucleic Acids Res. 34 [153] W.J. Kent, C.W. Sugnet, T.S. Furey, K.M. Roskin, T.H. Pringle, A.M. Zahler, et al., The (2006) D150–D152. Human Genome Browser at UCSC, Genome Res. 12 (2002) 996–1006. [176] D. Betel, M. Wilson, A. Gabow, D.S. Marks, C. Sander, The microRNA.org resource: [154] T.R. Dreszer, D. Karolchik, A.S. Zweig, A.S. Hinrichs, B.J. Raney, R.M. Kuhn, et al., The targets and expression, Nucleic Acids Res. 36 (2008) D149–D153. UCSC Genome Browser database: extensions and updates 2011, Nucleic Acids Res. [177] M. Maragkakis, M. Reczko, V. a Simossis, P. Alexiou, G.L. Papadopoulos, T. 40 (2012) D918–D923. Dalamagas, et al., DIANA-microT web server: elucidating microRNA functions [155] K.R. Rosenbloom, T.R. Dreszer, J.C. Long, V.S. Malladi, C.A. Sloan, B.J. Raney, et al., through target prediction, Nucleic Acids Res. 37 (2009) W273–W276. ENCODE whole-genome data in the UCSC Genome Browser: update 2012, Nucleic [178] T. Vergoulis, I.S. Vlachos, P. Alexiou, G. Georgakilas, M. Maragkakis, M. Reczko, et al., Acids Res. 40 (2012) D912–D917. TarBase 6.0: capturing the exponential growth of miRNA targets with experimen- [156] T. Ravasi, H. Suzuki, C.V. Cannistraci, S. Katayama, V.B. Bajic, K. Tan, et al., An atlas of tal support, Nucleic Acids Res. 40 (2012) D222–D229. combinatorial transcriptional regulation in mouse and man, Cell 140 (2010) [179] J.-H. Yang, J.-H. Li, P. Shao, H. Zhou, Y.-Q. Chen, L.-H. Qu, starBase: a database 744–752. for exploring microRNA–mRNA interaction maps from Argonaute CLIP-Seq and [157] A. Zoubarev, K.M. Hamer, K.D. Keshav, E.L. McCarthy, J.R.C. Santos, T. Van Rossum, Degradome-Seq data, Nucleic Acids Res. 39 (2011) D202–D209. et al., Gemma: a resource for the reuse, sharing and meta-analysis of expression [180] C.E. Vejnar, E.M. Zdobnov, MiRmap: comprehensive prediction of microRNA target profiling data, Bioinformatics 28 (2012) 2272–2273. repression strength, Nucleic Acids Res. 40 (2012) 11673–11683. [158] P. Kapranov, F. Ozsolak, S.W. Kim, S. Foissac, D. Lipson, C. Hart, et al., New class of [181] M. Barenboim, B.J. Zoltick, Y. Guo, D.R. Weinberger, MicroSNiPer: a web tool gene-termini-associated human RNAs suggests a novel RNA copying mechanism, for prediction of SNP effects on putative microRNA targets, Hum. Mutat. 31 (2010) Nature 466 (2010) 642–646. 1223–1232. [159] T.R. Mercer, D.J. Gerhardt, M.E. Dinger, J. Crawford, C. Trapnell, J.A. Jeddeloh, et al., [182] A.E. Bruno, L. Li, J.L. Kalabus, Y. Pan, A. Yu, Z. Hu, miRdSNP: a database of disease- Targeted RNA sequencing reveals the deep complexity of the human tran- associated SNPs and microRNA target sites on 3′UTRs of human genes, BMC scriptome, Nat. Biotechnol. 30 (2012) 99–104. Genomics 13 (2012) 44. [160] F.A. Buske, D.C. Bauer, J.S. Mattick, T.L. Bailey, Triplexator: detecting nucleic [183] H. Dweep, C. Sticht, P. Pandey, N. Gretz, miRWalk-database: prediction of possible acid triple helices in genomic and transcriptomic data, Genome Res. 22 (2012) miRNA binding sites by “walking” the genes of three genomes, J. Biomed. Inform. 1372–1381. 44 (2011) 839–847. [161] C. Chu, J. Quinn, H.Y. Chang, Chromatin isolation by RNA purification (ChIRP), J Vis [184] J.D. Ziebarth, A. Bhattacharya, A. Chen, Y. Cui, PolymiRTS Database 2.0: linking Exp. 61 (2012) e3912. polymorphisms in microRNA target sites with human diseases and complex traits, [162] M.D. Simon, C.I. Wang, P.V. Kharchenko, J. a West, B. a Chapman, A. a Alekseyenko, Nucleic Acids Res. 40 (2012) D216–D221. et al., The genomic binding sites of a noncoding RNA, Proc. Natl. Acad. Sci. U. S. A. [185] S. Hiard, C. Charlier, W. Coppieters, M. Georges, D. Baurain, Patrocles: a database of 108 (2011) 20497–20502. polymorphic miRNA-mediated gene regulation in vertebrates, Nucleic Acids Res. [163] J. Zhao, T.K. Ohsumi, J.T. Kung, Y. Ogawa, D.J. Grau, K. Sarma, et al., Genome-wide 38 (2010) D640–D651. identification of polycomb-associated RNAs by RIP-seq, Mol. Cell 40 (2010) [186] C. Liu, F. Zhang, T. Li, M. Lu, L. Wang, W. Yue, et al., MirSNP, a database of polymor- 939–953. phisms altering miRNA target sites, identifies miRNA-related SNPs in GWAS SNPs [164] M.E. Askarian-Amiri, J. Crawford, J.D. French, C.E. Smart, M.A. Smith, M.B. Clark, and eQTLs, BMC Genomics 13 (2012) 661. et al., SNORD-host RNA Zfas1 is a regulator of mammary development and a [187] M. Kertesz, N. Iovino, U. Unnerstall, U. Gaul, E. Segal, The role of site accessibility in potential marker for breast cancer, RNA 17 (2011) 878–891. microRNA target recognition, Nat. Genet. 39 (2007) 1278–1284. [165] Z. Zhang, Z. Zhu, K. Watabe, X. Zhang, C. Bai, M. Xu, et al., Negative regulation of [188] F. Niazi, S. Valadkhan, Computational analysis of functional long noncoding lncRNA GAS5 by miR-21, Cell Death Differ. 20 (2013) 1558–1568. RNAs reveals lack of peptide-coding capacity and parallels with 3′ UTRs, RNA 18 [166] S.Memczak,M.Jens,A.Elefsinioti,F.Torti,J.Krueger,A.Rybak,etal.,CircularRNAsare (2012) 825–843. a large class of animal RNAs with regulatory potency, Nature 495 (2013) 333–338. [189] M.E. Dinger, K.C. Pang, T.R. Mercer, M.L. Crowe, S.M. Grimmond, J.S. Mattick, NRED: [167] T.B. Hansen, T.I. Jensen, B.H. Clausen, J.B. Bramsen, B. Finsen, C.K. Damgaard, et al., a database of long noncoding RNA expression, Nucleic Acids Res. 37 (2009) Natural RNA circles function as efficient microRNA sponges, Nature 495 (2013) D122–D126. 384–388. [190] Q. Liao, H. Xiao, D. Bu, C. Xie, R. Miao, H. Luo, et al., ncFANs: a web server for [168] E. Pennisi, The CRISPR craze, Science 341 (2013) 833–836. functional annotation of long non-coding RNAs, Nucleic Acids Res. 39 (2011) [169] S.W. Burge, J. Daub, R. Eberhardt, J. Tate, L. Barquist, E.P. Nawrocki, et al., Rfam 11.0: W118–W124. 10 years of RNA families, Nucleic Acids Res. 41 (2013) D226–D232. [191] K. Liu, Z. Yan, Y. Li, Z. Sun, Linc2GO: a human LincRNA function annotation resource [170] K.C. Pang, S. Stephen, M.E. Dinger, P.G. Engström, B. Lenhard, J.S. Mattick, RNAdb 2. based on ceRNA hypothesis, Bioinformatics 29 (2013) 2221–2222. 0—an expanded database of mammalian non-coding RNAs, Nucleic Acids Res. 35 [192] G. Chen, Z. Wang, D. Wang, C. Qiu, M. Liu, X. Chen, et al., LncRNADisease: a database (2007) D178–D182. for long-non-coding RNA-associated diseases, Nucleic Acids Res. 41 (2013) [171] A.P. Boyle, E.L. Hong, M. Hariharan, Y. Cheng, M.A. Schaub, M. Kasowski, et al., D983–D986. Annotation of functional variation in personal genomes using RegulomeDB, [193] M.D. Paraskevopoulou, G. Georgakilas, N. Kostoulas, M. Reczko, M. Maragkakis, T.M. Genome Res. 22 (2012) 1790–1797. Dalamagas, et al., DIANA-LncBase: experimentally verified and computationally [172] C. Liu, B. Bai, G. Skogerbø, L. Cai, W. Deng, Y. Zhang, et al., NONCODE: an integrated predicted microRNA targets on long non-coding RNAs, Nucleic Acids Res. 41 knowledge database of non-coding RNAs, Nucleic Acids Res. 33 (2005) D112–D115. (2013) D239–D245.