Research Collection

Doctoral Thesis

The Good, the Bad and the Ugly: a three-way duel in microRNA targeting

Author(s): Lucic, Matije

Publication Date: 2019

Permanent Link: https://doi.org/10.3929/ethz-b-000335030

Rights / License: In Copyright - Non-Commercial Use Permitted


ETH Library DISS. ETH NO. 25693

The Good, the Bad and the Ugly: a three-way duel in microRNA targeting

A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH Zurich (Dr. sc. ETH Zurich)

presented by
MATIJE LUCIC
M.Sc. Pharm. Sciences, ETH Zurich
born on 09.01.1986
citizen of Muralto, Switzerland

accepted on the recommendation of
Prof. Dr. Jonathan Hall, examiner
Prof. Dr. Constance Ciaudo, co-examiner

2019

To my loved ones.

For ‘Science & Cigars’, may this be the start.

ETH Zurich Matije Lucic

Table of contents

Acknowledgements
Abstract
Riassunto
1 Introduction
1.1 Introduction to microRNAs (miRNAs)
1.1.1 miRNA families
1.1.2 miRNA nomenclature
1.1.3 miRNA guide and passenger strands
1.1.4 miRBase: the miRNA sequence repository
1.2 miRNA biogenesis
1.3 Strand selection and RISC assembly
1.4 miRNA-mediated gene silencing
1.4.1 Argonaute-catalyzed slicing mechanism
1.4.2 Slicing-independent translational repression and mRNA decay
1.5 miRNA target recognition
1.5.1 5′ end of the miRNA
1.5.2 Central region of the miRNA
1.5.3 3′ end of the miRNA
1.5.4 Non-canonical miRNA targeting
1.5.5 Model for miRNA target recognition
1.5.6 Identifying the miRNA targetome
1.5.7 miRNA target prediction strategies
1.6 miRNA targets fight back on miRNAs
1.7 Complexity and redundancy of miRNA function
1.8 Oncomirs and tumor suppressors
1.8.1 miR-17~92 cluster
1.8.2 let-7 family
1.9 Aim of the project
2 Results
2.1 Sequence homology between miR-17 and let-7 families
2.2 Model for competitive non-canonical binding at let-7 target sites
2.3 Investigation of the let-7 transcriptome in HEK293T cells
2.3.1 Canonical repression by let-7a
2.3.2 Canonical repression by miR-106a and miR-106b
2.3.3 miR-106a seed-mutation abolishes canonical repression
2.3.4 3′ end-mutated miR-106a retains canonical repressive activity
2.3.5 miR-106a and miR-106b are unable to repress let-7 targets
2.3.6 Co-transfection of let-7a with either miR-106a or miR-106b
2.3.7 Co-transfection of let-7a with seed-mutated miR-106a
2.3.8 Co-transfection of let-7a with 3′ end-mutated miR-106a
2.4 Follow-up on putative let-7 targets: HMGA2 and LIN28B
2.5 let-7 competition by miR-106a-5p strand
2.6 Seed-homology sequences in human miRNAs
3 Discussion
4 Outlook
5 Contributions
6 Side projects
6.1 RNAi activity of hybrid duplexes with parallel orientation
6.2 Targeting miR-122 in RISC with conjugated antimiRs
6.3 Antagonizing Lin28-pre-let-7 interaction with ‘looptomirs’
6.4 Mono- and bis-labeling of pre-miRNAs
7 Materials and methods
7.1 Materials
7.1.1 miRNAs and siRNAs
7.1.2 Plasmids
7.1.3 RT-qPCR primers
7.1.4 Antibodies
7.2 Methods
7.2.1 Cultivation and maintenance of mammalian cell lines
7.2.2 Seeding of the cells
7.2.3 Transient transfections of miRNAs and siRNAs
7.2.4 RT-qPCR
7.2.5 Western blot
7.2.6 Cloning and transfections of luciferase reporter plasmids
7.2.7 Luciferase assay
7.2.8 RNA integrity and quantification
7.2.9 Library preparation
7.2.10 Clustering and sequencing
7.2.11 Analysis of sequencing data sets
7.2.12 Library ID
7.2.13 miRNA target predictions
7.2.14 Statistical analysis
References
Supplementary information
Posters
Curriculum vitae

Table of figures and tables

Figure 1 Two human miRNA families: miR-17 and let-7.
Figure 2 Mature miRNA duplex contains a 5p and a 3p strand.
Figure 3 Display of miRNA hsa-let-7a-1 annotated in miRBase.
Figure 4 Display of HMGA2-hsa-let-7a-5p interaction annotated in DIANA-TarBase.
Figure 5 Canonical miRNA biogenesis and RISC assembly.
Figure 6 Mechanisms of miRNA function.
Figure 7 Functional domains of the miRNA guide within AGO.
Figure 8 Canonical miRNA target sites.
Figure 9 3′-compensatory let-7 sites in C. elegans lin-41 mRNA.
Figure 10 Model for miRNA target recognition.
Figure 11 Complexity and redundancy of miRNA function.
Figure 12 miR-17~92 cluster and its paralogs.
Figure 13 Sequence homology between miR-17 and let-7 families.
Figure 14 Model for competitive non-canonical binding at let-7 target sites.
Figure 15 Non-canonical targeting of miR-106a, its mutants, and miR-106b.
Figure 16 Screen for canonical miR-106a/b activity on miR-17 luciferase reporter.
Figure 17 let-7 targets are repressed by let-7a duplex in a dose-dependent manner.
Figure 18 miR-17 targets are repressed by both miR-106a and miR-106b precursors.
Figure 19 Repression of miR-17 targets is abolished by miR-106a seed-mutation.
Figure 20 miR-17 targets are repressed by the 3′ end-mutated miR-106a precursors.
Figure 21 let-7 targets are not repressed by miR-106a and miR-106b precursors.
Figure 22 miR-106a precursor is able to neutralize let-7a’s effect on its targetome.
Figure 23 Seed-mutated miR-106a retains the ability to hinder let-7 target repression.
Figure 24 Mutation #1 in the 3′ end of miR-106a abolishes its let-7 neutralizing activity.
Figure 25 Mutation #3 in the 3′ end of miR-106a abolishes its let-7 neutralizing activity.
Figure 26 miR-106a 3′ end-mutant #2 retains the ability to hinder let-7 target repression.
Figure 27 Effects of co-transfections on validated let-7 and miR-17 targets.
Figure 28 Effects of co-transfections on HMGA2 3′ UTR luciferase reporter.
Figure 29 Mimic-106a-5p prevents let-7 repression of putative let-7 targets.
Figure 30 Mimic-106a-5p prevents let-7 repression of LIN28B.
Figure 31 Mimic-106a-5p prevents repression of HMGA2 3′ UTR luciferase reporter.
Figure 32 Seed-overlap sequences are frequently found in 3′ ends of human miRNAs.
Figure 33 miR-34 seed-homology sequence in the 3′ end of miR-214-3p.

Table S1 658 confidently annotated miRNAs in human (Homo sapiens).
Table S2 614 confidently annotated miRNAs in mouse (Mus musculus).
Table S3 155 confidently annotated miRNAs in fly (Drosophila melanogaster).
Table S4 81 confidently annotated miRNAs in worm (Caenorhabditis elegans).
Table S5 89 broadly conserved miRNA families comprising 200 miRNA genes.
Table S6 Multiple sequence alignments of all miRBase-annotated miR-17 family members.
Table S7 819 TargetScan-predicted let-7 targets in human.
Table S8 194 TargetScan-predicted let-7 and miR-17 co-targets in human.
Table S9 Top 50 most repressed let-7 targets upon transfection of 50 nM let-7a duplex.
Table S10 Mfold-predicted secondary structures of miR-106a hairpin precursors.


Acknowledgements

I would like to thank…

…Prof. Dr. Jonathan Hall for giving me the opportunity to work in his group throughout my academic curriculum, first as an undergraduate student, then for my master's thesis, and finally for my doctoral research. It was a great environment with great people that allowed me to develop my research and communication skills and to grow as a person. I have deeply appreciated the freedom and support Jon gave me to pursue many of my own ideas, and I am very grateful both for his scientific input and for his consideration and caring attitude regarding personal matters.

…Prof. Dr. Constance Ciaudo for co-examining my PhD thesis and for her genuine interest in my research, for our discussions and planning of experiments, and for all the collaborative work she has done in the cell-culture lab.

…Prof. Dr. Mihaela Zavolan for constructive and honest discussions and for all her input and work on our collaborative sequencing experiments.

…Dr. Alexander Kanitz for the raw analysis of the transcriptomic data and for his input and help with the analysis of cumulative plots.

…Philippe Demougin for the preparation of DNA libraries and the supportive work before and after the deep sequencing.

…Prof. Dr. Masad Damha and Dr. Maryam Habibian for the successful collaboration in the investigation of RNAi activity of hybrid duplexes with parallel orientation.

…Dr. Luca Gebert for being a role model and instilling in me the passion for science. I will always remember the time spent together in the lab, in the gym and during our running sessions. He was always helpful and willing to discuss and improve my research, proofread my thesis and perform the AGO binding assay.

…my students Amany El Gedaily, Philipp Keller, Arpad Dunai, Wille Suominen and Gregory Holtzhauer for their interest, help and contributions to my research.

…all the members of the Hall group for the daily fun, pain, coffee-breaks, anger, football-sessions, help, tears, cakes, desperation, dedication, hate, after-work beers, collaborations, competitions, cookie-contests, support, bitcoin-talks, science, cigars, joy, pizza-lunch, awkward silence, Halloween party, ‘cocking-around’… and all the other good and bad things we shared. I would especially like to thank...

…Sylvia Peleg for her daily cheerfulness and all the administrative work.

…Mauro Zimmermann for the daily football chats and replay sessions, synthesis of oligonucleotides and IT-support.

…Yuluan Wang for being my lab sister.


Abstract

Historically, research on the post-transcriptional control of gene expression has focused on RNA-binding proteins as the primary regulators of mRNA stability and function. More recently, however, a new class of regulatory molecules was discovered, the microRNAs (miRNAs). miRNAs are small, ~21-nucleotide-long non-coding RNAs that modulate gene expression by base-pairing to short sequences usually located in the 3′ untranslated regions of their target mRNAs, thereby regulating protein synthesis by translational inhibition and/or mRNA degradation. Our comprehension of miRNA-target interactions has increased in the last decade and has uncovered an ever-growing number of targeting modes, suggesting a much more dynamic and heterogeneous mechanism of post-transcriptional gene regulation than previously proposed. Most miRNA target sites have perfect complementarity to a stretch of approximately 6-8 nucleotides at the 5′ end of the miRNA known as the miRNA seed. The pairing of the miRNA seed to the target mRNA is by far the best understood interaction and accounts for the vast majority of validated targets found in the literature. However, in addition to the canonical miRNA-mRNA targeting dominated by the seed, there is significant evidence for exceptions, as numerous non-canonical miRNA target sites have been described. Herein, we propose a miRNA ‘safe-guarding’ mechanism of target mRNAs based on a non-canonical targeting competition between two major miRNA families, the oncogenic miR-17 and the tumor-suppressive let-7. Considering their sequence homology with the let-7 seed, we hypothesize that miR-17 family miRNAs bind to let-7 target sites in a non-canonical fashion, lacking the standard miR-17 seed-pairing and exploiting instead their 3′ end complementarity. This non-canonical targeting is not able to induce repression of let-7 targets; instead, it ‘guards’ them against let-7-mediated silencing, possibly by competing with the conventional let-7 seed-pairing.


Riassunto (Summary)

Historically, the study of the post-transcriptional control of gene expression has focused on proteins able to bind RNA and thereby regulate the stability and function of mRNAs. More recently, however, a new class of regulatory molecules was discovered, the microRNAs (miRNAs). miRNAs are small non-coding RNAs of ~21 nucleotides that modulate gene expression by base-pairing with short sequences, usually located in untranslated regions of their target mRNAs, thereby regulating protein synthesis through translational inhibition and/or mRNA degradation. Our understanding of the interactions between a miRNA and its target has increased over the last decade. The discovery of an ever-growing number of miRNA targeting modes suggests a much more dynamic and heterogeneous mechanism of post-transcriptional gene regulation than previously proposed. Most miRNA target sites have perfect complementarity to a stretch of approximately 6-8 nucleotides at the 5′ end of the miRNA, known as the miRNA seed. The pairing of the seed with the target mRNA is by far the best-known interaction and accounts for the vast majority of target sites validated so far. However, in addition to this canonical targeting dominated by the miRNA seed, there is significant evidence of exceptions, represented by numerous non-canonical target sites. Here we propose a mechanism that protects target mRNAs, based on a non-canonical targeting competition between two major miRNA families, the oncogenic miR-17 and the tumor-suppressive let-7. Considering their sequence homology with the let-7 seed, we hypothesize that miRNAs of the miR-17 family bind to let-7 target sites in a non-canonical fashion, lacking the typical seed-mediated pairing and instead exploiting their complementarity at the 3′ end. This non-canonical targeting is not able to induce repression of these targets; instead, it protects them against the action of let-7, possibly through direct competition at its target sites.


Main project Introduction

1 Introduction

1.1 Introduction to microRNAs (miRNAs)

In 1993, two groups led by Ambros and Ruvkun discovered lin-4, a small regulatory RNA that later came to be known as the first microRNA (miRNA) ever identified [1, 2]. The lin-4 gene was identified by forward genetic approaches as a regulator of lin-14, a heterochronic gene of the worm Caenorhabditis elegans (C. elegans) that is essential for the early development of the animal. Surprisingly, the small non-protein-coding transcript encoded by lin-4 regulated lin-14 post-transcriptionally through an antisense RNA-RNA interaction with complementary sequence repeats within the lin-14 mRNA [1, 2]. lin-4 did not turn out to be a unique, worm-specific miRNA usurping the regulation of gene expression, a role generally thought to be exclusive to regulatory proteins. On the contrary, since the seminal discovery by Ambros and Ruvkun, miRNAs have emerged as major regulators of gene expression in nearly all biological processes and in many human diseases [3-6].

But after all, why should these regulators not be small RNAs? The elegance of the Watson-Crick base-pairing rules provides a degree of specificity and robustness that is difficult for proteins to match. In fact, since their discovery over two and a half decades ago, the number of researchers involved in small RNA studies has continued to increase. Early genetic studies on phenotypic mutants, cloning experiments and recent advances in next-generation sequencing techniques have led to the identification of thousands of miRNAs across nearly all clades, including viruses, unicellular organisms, plants and metazoans [7].

For a comprehensive and updated review on metazoan miRNAs, see [3]. Other small regulatory RNAs (e.g. small interfering RNAs, siRNAs or PIWI-interacting RNAs, piRNAs) have been extensively reviewed in [4].

This work focuses on the miRNA biology in humans and specifically on two functionally opposing miRNA families: oncogenic miR-17 [8, 9] and tumor-suppressive let-7 [10, 11].


1.1.1 miRNA gene families

miRNA genes represent one of the most abundant and widely distributed gene families. The latest release of the miRNA database ‘miRBase’ (http://www.mirbase.org) lists 1’917 miRNA gene annotations in human [7]. Many of them are highly homologous since they evolved through duplications of common ancestral genes [12, 13]. In fact, all encoded miRNAs that have high sequence homology, specifically identical nucleotides 2-8 from the 5′ end of the mature miRNA sequence (termed ‘extended seed’ [3]), are grouped into the same ‘miRNA family’ [14]. For example, the human miR-17 family contains 6 members, while the let-7 family is the most numerous miRNA family with 12 members [7] (Figure 1). Overall ~60% of human miRNAs belong to a miRNA family [15].

miR-17 family

miRNA      sequence (5′-3′)
           12345678
miR-17     CAAAGUGCUUACAGUGCAGGUAG
miR-20a    UAAAGUGCUUAUAGUGCAGGUAG
miR-20b    CAAAGUGCUCAUAGUGCAGGUAG
miR-93     CAAAGUGCUGUUCGUGCAGGUAG
miR-106a   AAAAGUGCUUACAGUGCAGGUAG
miR-106b   UAAAGUGCUGACAGUGCAGAU

let-7 family

miRNA      sequence (5′-3′)
           12345678
let-7a-1   UGAGGUAGUAGGUUGUAUAGUU
let-7a-2   UGAGGUAGUAGGUUGUAUAGUU
let-7a-3   UGAGGUAGUAGGUUGUAUAGUU
let-7b     UGAGGUAGUAGGUUGUGUGGUU
let-7c     UGAGGUAGUAGGUUGUAUGGUU
let-7d     AGAGGUAGUAGGUUGCAUAGUU
let-7e     UGAGGUAGGAGGUUGUAUAGUU
let-7f-1   UGAGGUAGUAGAUUGUAUAGUU
let-7f-2   UGAGGUAGUAGAUUGUAUAGUU
let-7g     UGAGGUAGUAGUUUGUACAGUU
let-7i     UGAGGUAGUAGUUUGUGCUGUU
miR-98     UGAGGUAGUAAGUUGUAUUGUU

Figure 1 Two human miRNA families: miR-17 and let-7. Members of each family are shown with ClustalW multiple sequence alignments. The extended seed (nucleotides 2-8 from the 5′ end of the miRNA sequence; shown in red) defines the miRNA family. Only sequences of the miRNA guide strands are shown (5′-3′ orientation). The suffix ‘-5p’ was omitted.
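The family definition used in Figure 1 can be illustrated with a short sketch. It assumes only the guide-strand sequences listed in the figure and the rule stated above (identical nucleotides 2-8 from the 5′ end). The function names are hypothetical and chosen for illustration; this is not the procedure used by miRBase or TargetScan to assign families.

```python
from collections import defaultdict

# Guide-strand sequences (5'-3') copied from Figure 1.
MIRNAS = {
    "miR-17": "CAAAGUGCUUACAGUGCAGGUAG",
    "miR-20a": "UAAAGUGCUUAUAGUGCAGGUAG",
    "miR-20b": "CAAAGUGCUCAUAGUGCAGGUAG",
    "miR-93": "CAAAGUGCUGUUCGUGCAGGUAG",
    "miR-106a": "AAAAGUGCUUACAGUGCAGGUAG",
    "miR-106b": "UAAAGUGCUGACAGUGCAGAU",
    "let-7a-1": "UGAGGUAGUAGGUUGUAUAGUU",
    "let-7a-2": "UGAGGUAGUAGGUUGUAUAGUU",
    "let-7a-3": "UGAGGUAGUAGGUUGUAUAGUU",
    "let-7b": "UGAGGUAGUAGGUUGUGUGGUU",
    "let-7c": "UGAGGUAGUAGGUUGUAUGGUU",
    "let-7d": "AGAGGUAGUAGGUUGCAUAGUU",
    "let-7e": "UGAGGUAGGAGGUUGUAUAGUU",
    "let-7f-1": "UGAGGUAGUAGAUUGUAUAGUU",
    "let-7f-2": "UGAGGUAGUAGAUUGUAUAGUU",
    "let-7g": "UGAGGUAGUAGUUUGUACAGUU",
    "let-7i": "UGAGGUAGUAGUUUGUGCUGUU",
    "miR-98": "UGAGGUAGUAAGUUGUAUUGUU",
}


def extended_seed(sequence: str) -> str:
    """Return nucleotides 2-8, counted from the 5' end (1-based)."""
    return sequence[1:8]


def group_by_extended_seed(mirnas: dict) -> dict:
    """Group miRNA names by their extended seed (hypothetical helper)."""
    families = defaultdict(list)
    for name, sequence in mirnas.items():
        families[extended_seed(sequence)].append(name)
    return dict(families)


families = group_by_extended_seed(MIRNAS)
for seed, members in families.items():
    print(seed, len(members), ", ".join(members))
```

With the sequences of Figure 1, the grouping yields two extended seeds, one shared by the six miR-17 family members and one shared by the twelve let-7 family members, including let-7d and miR-106a, whose first nucleotide differs from that of their relatives.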

1.1.2 miRNA nomenclature

miRNAs are named using the ‘miR’ prefix followed by a dash and a unique identifying number (e.g. miR-17, miR-93). Exceptions to the naming convention are miRNAs found in early genetic studies and named after their mutant phenotypes, for example, lin-4 and let-7. Other miRNAs, identified mainly by cloning or sequencing long before a mutant phenotype was known, received numerical names in the chronological order of their discovery [16]. The unified annotation of a human miRNA has the form ‘hsa-miR-106a’. The three-letter prefix preceding the miRNA name defines the organism, in this case ‘hsa’ (Homo sapiens) for human miRNAs. Paralogous miRNAs are indicated with lettered suffixes (a, b, c, etc.) distinguishing miRNAs with nearly identical mature sequences (e.g., miR-106a and miR-106b) (Figure 1). If the same mature miRNA is generated from multiple loci, numeric suffixes (-1, -2, -3, etc.) are added at the end of the miRNA name (e.g., let-7a-1, let-7a-2 and let-7a-3) (Figure 1). The primary miRNA transcripts are designated with the prefix ‘pri-’, while the miRNA precursors have the prefix ‘pre-’. Finally, the mature miRNA strands are named with the suffixes ‘-5p’ and ‘-3p’. Furthermore, an attempt was made to assign the same name to orthologs from different species. For example, 11 of the 12 human orthologs of the C. elegans let-7 family have the let-7 name, whereas the last one was named miR-98 (Figure 1). The annotation guidelines are explained in more detail on the miRBase website (http://www.mirbase.org) and in [16].
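The naming scheme can be made concrete with a small parser. This is a minimal sketch under stated assumptions: the regular expression below is a simplification written for this illustration (it covers only the organism prefix, the ‘miR’/‘let’/‘lin’ stems, the paralog letter, the locus number and the strand suffix) and is not the official miRBase grammar. The ‘pri-’ and ‘pre-’ prefixes are deliberately left out.

```python
import re
from typing import Dict, Optional

# Simplified, illustrative pattern; not the official miRBase naming grammar.
NAME_PATTERN = re.compile(
    r"(?:(?P<species>[a-z]{3})-)?"   # optional three-letter organism code, e.g. 'hsa'
    r"(?P<stem>miR|let|lin)-"        # 'miR' prefix or a historical gene name
    r"(?P<number>\d+)"               # unique identifying number
    r"(?P<paralog>[a-z])?"           # lettered suffix distinguishing paralogs
    r"(?:-(?P<locus>\d+))?"          # numeric suffix for multiple genomic loci
    r"(?:-(?P<arm>[53]p))?"          # strand suffix: '5p' or '3p'
)


def parse_mirna_name(name: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a miRNA name into its parts, or return None if it does not fit."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


for example in ("hsa-miR-106a", "let-7a-3", "hsa-let-7a-5p", "miR-93"):
    print(example, parse_mirna_name(example))
```

Parts that are absent from a name are reported as `None`, so ‘miR-93’ has no paralog letter, locus number or strand suffix, while ‘let-7a-3’ carries both a paralog letter and a locus number.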

1.1.3 miRNA guide and passenger strands

miRNAs are primarily defined by their short length of typically 21-23 nucleotides [17-19]. Each miRNA gene encodes two mature miRNA strands embedded in a structured hairpin-like precursor: the 5p strand from the 5′ arm (named with the suffix ‘-5p’) and the 3p strand from the 3′ arm (named with the suffix ‘-3p’) [4, 20, 21] (Figure 2). Nucleotides 2-7 from the 5′ end of each miRNA strand define the miRNA ‘seed’ that is crucial for target mRNA recognition [14].

Figure 2 Mature miRNA duplex contains a 5p and a 3p strand. Two mature miRNA strands are generated by endonucleolytic cleavage (black arrowheads) from each arm of the same hairpin precursor encoded by a single miRNA gene.

However, only one of the produced miRNA strands (called the ‘guide’ strand) is usually biologically active and much more abundant in the cell than the other strand, which is considered to be inactive and is therefore named the ‘passenger’ strand (also known as ‘miRNA*’, pronounced ‘miRNA star’) [21-23]. Nevertheless, growing evidence suggests that the dominant guide miRNAs can arise from either strand of the precursor and that both strands may be equally functional [4, 20, 24-26].

1.1.4 miRBase: the miRNA sequence repository

miRBase (http://www.mirbase.org) provides an online repository for published miRNA sequences and associated annotation. The latest release of the database (v22, March 2018) contains 38’589 miRNA loci from 271 species, expressing 48’885 mature miRNAs. High-throughput sequencing of small RNAs, which began in the mid-to-late 2000s, has transformed miRNA gene discovery [7, 27, 28]. The great majority of novel miRNAs, particularly those annotated since 2007, have been discovered by small RNA deep sequencing [7].

However, miRNAs are not the only small RNAs that can be sequenced from the cell. Stringent criteria have therefore been introduced to prevent other types of small RNAs, such as piRNAs, endogenous siRNAs or degradation fragments from longer RNAs, from being misannotated as miRNAs. To be annotated with high confidence, a miRNA must meet the following criteria [7]:

• at least 10 reads must map with no mismatches to each of the two possible mature miRNAs derived from the miRNA precursor.
• the most abundant reads from each arm of the precursor must pair in the mature miRNA duplex with 0-4 nucleotides overhang at their 3′ ends.
• at least 50% of reads mapping to each arm of the miRNA precursor must have the same 5′ end.
• the predicted miRNA precursor structure must have a folding free energy of < -0.2 kcal/mol/nt.
• at least 60% of the bases in the mature sequences must be paired in the predicted miRNA precursor structure.
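The five criteria listed above can be read as a simple checklist. The sketch below encodes them as threshold tests on summary statistics of one precursor. The data class, its field names and the example values are hypothetical and introduced only for illustration. How these statistics are derived from sequencing reads and structure predictions is not shown, and this is not the miRBase implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PrecursorEvidence:
    """Hypothetical summary statistics for one miRNA precursor."""
    reads_5p: int                        # reads mapping without mismatch to the 5p miRNA
    reads_3p: int                        # reads mapping without mismatch to the 3p miRNA
    overhangs_3p: Tuple[int, int]        # 3' overhangs (nt) at the two ends of the duplex
    same_5p_end_fraction: Tuple[float, float]  # per arm: fraction of reads sharing a 5' end
    folding_energy_per_nt: float         # predicted folding free energy, kcal/mol/nt
    paired_fraction: float               # fraction of mature-sequence bases that are paired


def failed_criteria(ev: PrecursorEvidence) -> List[str]:
    """Return the names of the high-confidence criteria that are not met."""
    failures = []
    if min(ev.reads_5p, ev.reads_3p) < 10:
        failures.append("read support")
    if not all(0 <= overhang <= 4 for overhang in ev.overhangs_3p):
        failures.append("3' overhang")
    if min(ev.same_5p_end_fraction) < 0.5:
        failures.append("5' end homogeneity")
    if not ev.folding_energy_per_nt < -0.2:
        failures.append("folding energy")
    if ev.paired_fraction < 0.6:
        failures.append("base pairing")
    return failures


# Invented example values, for illustration only.
candidate = PrecursorEvidence(150, 25, (2, 2), (0.9, 0.8), -0.45, 0.85)
print(failed_criteria(candidate) or "all criteria met")
```

Returning the list of failed criteria, rather than a single yes/no value, makes it easier to see why a candidate would not qualify as confidently annotated.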

As a result, 658 confidently annotated miRNAs have been identified in human, 614 in mouse (Mus musculus), 155 in fly (Drosophila melanogaster) and 81 in worm (Caenorhabditis elegans) (Tables S1-S4). The most evolutionarily conserved of these can be grouped into 89 miRNA families comprising 200 miRNA genes [3] (Table S5).

Moreover, miRBase has recently implemented a community-feedback poll to provide additional evidence and help distinguish authentic from misannotated miRNAs (Figure 3).


Figure 3 Display of miRNA hsa-let-7a-1 annotated in miRBase. User-feedback buttons ‘Yes’ and ‘No’ help revisit and reassess specific miRNA annotations (image source: http://www.mirbase.org/cgi-bin/mirna_entry.pl?acc=MI0000060; visited on 22.05.2018).

However, not all the sequences annotated in miRBase have been experimentally validated. In this respect, curated databases like DIANA-TarBase (http://www.microrna.gr/tarbase) index only experimentally supported miRNA-target interactions [29] (Figure 4).

Figure 4 Display of HMGA2-hsa-let-7a-5p interaction annotated in DIANA-TarBase. DIANA-TarBase lists the experimental methodologies that support the interactions of a miRNA with its putative targets (image source: http://www.microrna.gr/tarbase; visited on 30.07.2018).


1.2 miRNA biogenesis

In the canonical biogenesis pathway, miRNAs are encoded by individual loci (monocistronic genes), or in case of clustered genes, mature miRNAs are processed from a common polycistronic transcript. Some miRNAs are located within introns of protein-coding genes and rely on host gene transcription in order to be expressed [20, 21, 30]. The biogenesis of miRNAs is a tightly regulated multistep process and has been reviewed recently in [21] (Figure 5).

Figure 5 Canonical miRNA biogenesis and RISC assembly. The biogenesis of miRNAs is a multistep process. The mature miRNA duplex is generated post-transcriptionally through successive endonucleolytic cleavages (black arrowheads) of the hairpin-like precursor by two RNase III-type enzymes, Drosha and Dicer. Finally, the guide strand is loaded into AGO while the passenger strand is discarded and degraded. The AGO protein loaded with a single-stranded guide miRNA dissociates from Dicer/TRBP and forms the mature RISC. Adapted from [21, 31].


Firstly, miRNA genes are transcribed by RNA polymerase II (Pol II) into long, capped and polyadenylated primary transcripts called pri-miRNAs [32]. These are processed to mature miRNAs through successive endonucleolytic cleavages by two RNase III-type enzymes [33]. The first cleavage is carried out by the nuclear protein complex called microprocessor, comprising the RNase III enzyme Drosha [34] and the double-stranded RNA-binding protein DiGeorge syndrome critical region 8 (DGCR8) [35, 36]. The cleavage of the pri-miRNA releases the pre-miRNA, a structured ‘hairpin-like’ precursor transcript of ~70 nucleotides [17-19] with a 5′ phosphate and a characteristic 2-nucleotide 3′ overhang, typical of cleavage products of RNase III-type enzymes [33]. The export receptor Exportin 5 (XPO5) facilitates the transport of pre-miRNAs from the nucleus to the cytoplasm [37-39]. However, as shown in a recent XPO5 knockout study, its role in the miRNA maturation process is not essential and can be complemented by alternative mechanisms [40]. Once in the cytoplasm, a second RNase III enzyme, Dicer, further processes the precursor hairpin and releases the mature miRNA duplex embedded in the stem of the pre-miRNA [41, 42]. Dicer acts as a ‘molecular ruler’ [43, 44] and specifically yields a 21-23-nucleotide-long mature miRNA duplex with a 5′ phosphate and a 2-nucleotide 3′ overhang on each end [45]. In vertebrates, the ‘dicing’ process is supported and modulated by two Dicer-binding proteins: trans-activation response (TAR) RNA-binding protein (TRBP) and protein activator of the interferon-induced protein kinase (PACT) [46-48]. Dicer has a fundamental role in miRNA biogenesis, although its depletion in a recent knockout study did not completely abolish the expression of many canonical miRNAs [40], suggesting the existence of alternative pathways for miRNA maturation [21, 40].

1.3 Strand selection and RISC assembly

The miRNA-mediated gene silencing is the result of a remarkable interplay between a miRNA and an effector protein of the Argonaute (AGO) family that form the RNA-induced silencing complex (RISC, also called miRISC) [49].

After cleaving the pre-miRNA (Figure 5), Dicer and its helper protein TRBP or PACT transfer the mature miRNA duplex to an AGO protein, forming the RISC loading complex (RLC), which leads to RISC maturation [22, 23]. Interestingly, AGO proteins can be loaded with essentially any mature miRNA sequence, which makes the RISC fully versatile and capable of effectively targeting and silencing any (partially) complementary target [50-53]. However, the mature miRNA duplex generated by Dicer cleavage necessarily contains two miRNA strands [33] (Figure 5). Generally, only one of the miRNA strands (the ‘guide’ strand) is selected and retained in AGO, while the other strand (the ‘passenger’ strand; also known as miRNA*) is discarded [22, 23]. The released passenger strand is rapidly degraded, resulting in a prominent bias in the cellular miRNA pool towards the guide strand [20]. Once loaded into AGO, the guide strands are stably retained, which prevents rapid turnover and extends their lifetime to the order of several days [54, 55]. Therefore, the selection of one or the other strand as guide for RISC is not random and is of crucial importance for the downstream target repression [23]. The fate of the two strands is influenced by the identity of their 5′ terminal nucleotides and the relative thermodynamic stability of the two ends of the miRNA duplex [56-58]. In fact, the duplex asymmetry is recognized by the RLC, and loading preference is given to the strand with the thermodynamically less stable 5′ end [56, 57] and preferably a 5′ terminal adenosine or uridine [58]. The AGO protein loaded with a single-stranded miRNA dissociates from Dicer/TRBP and forms the mature RISC (also called miRISC) [22, 23] (Figure 5). The hallmark of RISC is the use of the sequence information encoded in the guide strand to direct the gene silencing machinery to complementary target transcripts (Figure 6).

1.4 miRNA-mediated gene silencing

The miRNA strand loaded into AGO guides RISC to complementary target sites usually located in the 3′ untranslated region (3′ UTR) of target mRNAs [14]. More than 60% of protein-coding genes contain at least one conserved miRNA-binding site and are predicted to be regulated by miRNAs [59].

In animals, the miRNA-mediated gene silencing can occur through two distinct pathways depending on the degree of complementarity between the AGO-loaded guide miRNA and its target mRNA. Catalytically active AGO proteins can directly cleave target mRNAs or alternatively, additional silencing factors can be recruited to promote translational repression, deadenylation and mRNA degradation [60, 61] (Figure 6).


Figure 6 Mechanisms of miRNA function. RISC uses the sequence information encoded in the guide strand to direct the gene silencing machinery to complementary target sites usually located in the 3′ untranslated region of target mRNAs. In case of extensive complementarity, catalytically active AGO is able to directly cleave target mRNAs (black arrowhead). Mammalian miRNAs act predominantly in a slicing-independent manner and do not require extensive miRNA-target complementarity. The pairing of the miRNA seed (nucleotides 2-7 from the 5′ end of the guide strand) is generally necessary for target site recognition. Adapted from [3, 21, 31].

1.4.1 Argonaute-catalyzed slicing mechanism

In the case of extensive miRNA-target complementarity, the RISC is able to directly cleave target mRNAs [62, 63] (Figure 6). Indispensable for the slicing mode of action is the association of the guide RNA with a specific AGO protein that has endonucleolytic activity [51, 52]. In mammals, the slicing activity is catalyzed by Argonaute2 (AGO2), which cleaves the phosphodiester bond linking the target nucleotides paired to positions 10 and 11 from the 5′ end of the guide miRNA, leaving a 3′ hydroxyl and a 5′ phosphate [50, 52, 62-64]. The other Argonaute paralogs (AGO1, AGO3 and AGO4) have been thought to serve as slicer-independent effector proteins of the RISC, silencing gene expression through translational inhibition and deadenylation but not cleavage [60]. Although AGO3 was recently shown to cleave target RNA, its slicing activity is guide RNA-dependent and possibly different from the conventional mechanism applicable to AGO2 [65].
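The position of the AGO2 cut can be illustrated on an idealized target. The sketch below assumes a hypothetical target site that is perfectly complementary to the entire guide (Watson-Crick pairs only) and has the same length as the guide. It then splits the site at the bond between the target nucleotides paired to guide positions 10 and 11. The function names are illustrative, and the let-7a guide sequence is taken from Figure 1.

```python
from typing import Tuple

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick only)."""
    return rna.translate(COMPLEMENT)[::-1]


def slice_fully_paired_target(guide: str) -> Tuple[str, str]:
    """Split an idealized, fully complementary target site at the slicing position.

    Guide position i (1-based, from the guide 5' end) pairs with target
    position len(guide) - i + 1 (1-based, from the target 5' end), so the
    scissile bond lies between the target nucleotides paired to guide
    positions 11 and 10.
    """
    target = reverse_complement(guide)   # hypothetical perfect target site, 5'-3'
    cut = len(guide) - 10                # index of the nucleotide paired to guide position 10
    return target[:cut], target[cut:]    # 5' fragment, 3' fragment


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"         # let-7a guide strand from Figure 1
five_prime_fragment, three_prime_fragment = slice_fully_paired_target(LET7A)
print("5' fragment:", five_prime_fragment)
print("3' fragment:", three_prime_fragment)
```

In this idealized case the 3′ fragment always contains the ten target nucleotides paired to guide positions 1-10, regardless of the guide length, while the 5′ fragment contains the nucleotides paired to position 11 onwards.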

The AGO-catalyzed slicing mechanism is common in plants [66], whereas the pairing between mammalian miRNAs and their targets only rarely involves complementarity extensive enough to elicit target cleavage. In fact, very few examples of AGO-catalyzed slicing have been reported in mammals [63, 67, 68]. RISC-directed slicing has also been reported for circular RNAs [69], and in special cases it was shown to be exploited by viruses. In the context of an Epstein-Barr virus (EBV) infection, EBV-encoded miRNAs take advantage of the host AGO-catalyzed slicing mechanism to cleave their own mRNAs in order to enhance the probability of a successful infection [70, 71].

AGO-catalyzed slicing is also the basis of the siRNA-mediated mRNA-knockdown process called RNA interference (RNAi). RNAi was first discovered in C. elegans [72], and since then, artificially designed siRNAs have been broadly used in biomedical research as tools to study gene function [73] and are currently showing promising results as gene-specific therapeutics [74].

1.4.2 Slicing-independent translational repression and mRNA decay

The miRNA-mediated gene silencing that dominates in mammals does not require extensive miRNA-target complementarity and acts in a slicing-independent manner (Figure 6). Instead, miRNA targets are repressed upon recruitment of additional silencing factors, which promote translational repression, deadenylation and mRNA degradation [61, 75-77]. The pairing of the miRNA seed sequence is generally necessary for target site recognition [14].

Key components among the recruited silencing factors are the GW182 proteins, present as three paralogs in mammals (TNRC6A/B/C) [78-80]. Guided by the miRNA, AGO associates with the target mRNA and recruits TNRC6, which acts as a structural scaffold responsible for the assembly of the whole silencing machinery [60].

TNRC6 interacts with the poly(A)-binding protein (PABPC) associated with the poly(A) tail of the mRNA and recruits the cytoplasmic deadenylase complexes PAN2-PAN3 [81, 82] and CCR4-NOT [83-85], either of which shortens the poly(A) tail of the mRNA and induces 3′-5′ mRNA decay. A short or absent poly(A) tail leads to mRNA decapping, with the removal of the 7-methylguanylate (m7G) cap by the DCP1-DCP2 decapping enzymes [81]. The unprotected 5′ end is susceptible to 5′-3′ mRNA degradation [86]. Simultaneously, CCR4-NOT recruits DDX6, a helicase that activates the decapping complex and inhibits the translation of the mRNA [87, 88]. Although translational repression occurs rapidly, its effect on the steady-state silencing of endogenous mRNAs is relatively weak because target mRNAs remain stable. Only at a later stage does the shortening of poly(A) tails irreversibly induce their degradation [75, 89, 90]. In fact, in diverse cell types and conditions mRNA decay is by far the dominant miRNA-mediated silencing mechanism and is generally responsible for 66-90% of target repression [61, 91]. The initial translational repression without irreversible mRNA decay can be rescued and allows flexibility in the miRNA-mediated silencing mechanism, which can either switch off or fine-tune protein expression [92].

1.5 miRNA target recognition

The guide strand is tightly bound within AGO (Figure 7). The MID domain binds the miRNA 5′ nucleotide and makes it unavailable for target recognition, whereas the PAZ domain holds the 3′ terminus [93-95]. Functional analysis with human AGO demonstrated its selectivity for guide 5′ uridine or adenosine [56-58].

Figure 7 Functional domains of the miRNA guide within AGO. AGO loading of the guide RNA defines distinct functional domains crucial for efficient miRNA target recognition.

Besides anchoring its termini and making them unlikely to pair with the target mRNA, AGO alters the properties of the guide RNA and divides it into functionally distinct domains (Figure 7). Counting from the 5′ end of the guide strand, nucleotides 2-7 define the miRNA seed, the central region is represented by nucleotides 9-12, and in the 3′ half of the guide, nucleotides 13-16 are part of the 3′-supplementary region. Each miRNA region has a distinctive function in the process of miRNA target recognition [14, 96, 97] (Figure 10).

1.5.1 5′ end of the miRNA

The 5′ end of a miRNA is the most conserved region of animal miRNAs [98] and is of crucial importance for target interactions. All evidence so far indicates the miRNA seed as the core motif for target site recognition. In fact, canonical miRNA target sites perfectly complement the miRNA seed (‘seed match’) with 6-7 contiguous base pairs and are usually located in the 3′ UTR of mRNAs [14]. Five types of conserved canonical target sites have been identified [99] (Figure 8). A target adenosine opposite guide nucleotide 1 (position 1 of the target, termed ‘t1A’) is a conserved feature of many target sites and is preferentially bound by AGO, contributing to a 2-fold increase in affinity over other bases and anchoring AGO to the target site [100].

Figure 8 Canonical miRNA target sites. Canonical miRNA target sites are typically found in 3′ UTRs of target mRNAs and match perfectly to the miRNA seed (nucleotides 2-7). 8mer sites mediate the strongest miRNA-induced target repression, whereas 6mer sites have marginal activity. Relative efficacy for each target site in mammalian cells is shown on a log scale. t1A in green. Adapted from [3].

Following the hierarchy of repression efficacy in mammalian cells, the most effective target sites are 8mer sites that match miRNA nucleotides 2-8 and have a t1A, followed by 7mer sites that match nucleotides 2-8 without a t1A and 7mer sites that match nucleotides 2-7 and include the t1A. The 6mer sites either complement only the miRNA seed or are offset by one nucleotide in either the 5′ or 3′ direction. They are the least effective and only poorly conserved, but they are still classified as canonical [99]. Most miRNA-dependent gene silencing is mediated by 7-8mer sites [14, 101].
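The site hierarchy described above can be illustrated with a minimal Python sketch. It assumes an 8-nucleotide target window written 5′-3′ whose last nucleotide lies opposite guide nucleotide 1. The function names and the toy UTR are hypothetical, and the labels ‘7mer-m8’, ‘7mer-A1’ and ‘offset 6mer’ follow common usage in the literature rather than this text. Of the two offset 6mer variants, only the one shifted towards guide nucleotide 8 (match to nucleotides 3-8) is covered. Only strict Watson-Crick pairing is considered; site context and conservation are ignored.

```python
from typing import List, Optional, Tuple

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Watson-Crick reverse complement of an RNA sequence written 5'-3'."""
    return "".join(COMPLEMENT[nt] for nt in reversed(rna))


def classify_window(guide: str, window: str) -> Optional[str]:
    """Classify an 8-nt target window (5'-3') against a guide strand (5'-3').

    window[0] lies opposite guide nucleotide 8 and window[7] opposite
    guide nucleotide 1 (the 't1' position).
    """
    seed_match = window[1:7] == reverse_complement(guide[1:7])  # g2-g7
    m8 = window[0] == COMPLEMENT[guide[7]]                      # g8
    t1a = window[7] == "A"
    if seed_match and m8 and t1a:
        return "8mer"
    if seed_match and m8:
        return "7mer-m8"
    if seed_match and t1a:
        return "7mer-A1"
    if seed_match:
        return "6mer"
    if window[:6] == reverse_complement(guide[2:8]):            # g3-g8
        return "offset 6mer"
    return None


def scan_utr(guide: str, utr: str) -> List[Tuple[int, str]]:
    """Return (0-based start, site type) for every canonical site in a UTR."""
    guide = guide.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    hits = []
    for start in range(len(utr) - 7):
        label = classify_window(guide, utr[start:start + 8])
        if label is not None:
            hits.append((start, label))
    return hits


# let-7a-5p guide sequence; the UTR below is a made-up toy example.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
print(scan_utr(LET7A, "AAACUACCUCAUUU"))  # [(3, '8mer')]
```

The checks are ordered from the most to the least effective site type, so each window receives only its strongest label.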

1.5.2 Central region of the miRNA

In addition to the seed-match, pairing by the miRNA central region (nucleotides 9-12) is crucial for target cleavage by catalytically active AGO (Chapter 1.4.1, p. 17). Extensive pairing of at least 5-8 nucleotides beyond the seed (‘zippering of the helix’) induces a conformational change in AGO, making it catalytically competent [95, 102]. Unlike siRNAs and plant miRNAs, animal miRNAs rarely induce cleavage of their targets [14]. One such rare case is the repression of mouse HOXB8 by miR-196a [63]. In fact, with the exception of a single G•U wobble base pair in the seed, the pairing between miR-196a and its target site in the HOXB8 3′ UTR is perfect. This near-full-complementary pairing with miR-196a directs the AGO-catalyzed cleavage of HOXB8 mRNA [63].

1.5.3 3′ end of the miRNA

In theory, in addition to the canonical seed match, supplementary pairing to the 3′ end region of the miRNA could increase site affinity and potentially enhance target silencing. Such a canonical target site with additional 3′ end pairing, involving at least 3-4 contiguous base pairs to miRNA nucleotides 13-16, is called a 3′-supplementary site [14]. However, experimental data showed that this extra pairing only slightly increases site affinity and repression efficacy, and thus plays a rather modest role in miRNA target recognition [96, 103]. In fact, overall only ~5% of mammalian seed-matched target sites include this additional pairing at the 3′ end of the miRNA [59].

Extensive pairing to the 3′ end of the miRNA is not only supportive to a canonical target site, but more importantly it can compensate for a weak or imperfect seed match including a bulge, a mismatch or a G•U wobble base pair, rescuing a functional interaction. Such a non-canonical target site is called a 3′-compensatory site and typically has at least nine extra base pairs contiguously centered on miRNA nucleotides 13-16 [14]. It has a marginal role in miRNA targeting and accounts for only ~1% of preferentially conserved target sites in mammals [59]. Nevertheless, prominent and well-studied examples of this category are the let-7 sites in C. elegans lin-41 mRNA (Figure 9), among the first ever validated miRNA target sites in animals [104, 105] and originally thought to be prototypical of miRNA target recognition. They play an essential role in nematode development [106-108].


Figure 9 3′-compensatory let-7 sites in C. elegans lin-41 mRNA. Extensive and contiguous pairing at the 3′ end of let-7 miRNA compensates for an imperfect seed match (site 1 has an adenine bulge and site 2 a G•U wobble base pair). Such atypical target sites are rare but play a crucial role in C. elegans lin-41 regulation by let-7 [104-108].

The strong requirement for extensive 3′ end pairing of complementary sites makes them potentially advantageous to ensure targeting specificity among individual miRNA family members that share the same seed but typically have different 3′ ends. As shown recently, pairing beyond the 5′ end seems to be much more common than anticipated; it offers a mechanism to avoid the regulatory redundancy common to members of the same miRNA family and can have important implications in vivo [107-109]. Sites with an imperfect seed match are more likely to exhibit strong intra-family specificity, as shown in nematodes, where lin-41 responds specifically to let-7 and not to earlier-expressed paralogs with the same seed but unfavorable compensatory pairing [106-108].

1.5.4 Non-canonical miRNA targeting

Compared to canonical sites, non-canonical miRNA target sites have an imperfect or completely missing seed-match (the latter are called ‘seedless’ targets) [14]. Despite lacking canonical seed pairing, non-canonical miRNA targeting can be productive. This is mostly the case for 3′-compensatory sites, which compensate for weak seed-matches (Chapter 1.5.3, p. 21), and centered sites, characterized by extensive pairing to the central region of the miRNA [68]. Additional types of non-canonical target sites have recently been discovered by high-throughput AGO-crosslinking-immunoprecipitation (CLIP) approaches, identifying miRNA complementary sites in close proximity to AGO crosslinks in a context-specific and systematic manner [107, 109-111] (Chapter 1.5.6, p. 24). Indeed, non-canonical miRNA targeting was shown to be much more widespread and heterogeneous than previously thought. However, a comprehensive meta-analysis of small RNA transfection datasets demonstrated their low importance in target silencing. Collectively, these non-canonical sites failed to mediate repression despite pairing authentically to the miRNAs [101]. Their high abundance within the transcriptome and intrinsically fast dissociation rates contribute to their low individual site occupancy and general ineffectiveness in target repression.

1.5.5 Model for miRNA target recognition

The miRNA seed sequence was given its name because it ‘seeds’ the pairing with the target site, which can then ‘grow’ to the rest of the miRNA [112], propagating to the 3′ end as in the case of supplementary paired sites or extensively complementary sites that can additionally be sliced [102]. It has been suggested that AGO presents the seed of the loaded guide strand in a preorganized A-form conformation that behaves similarly to a locked nucleic acid [113], increasing its binding affinity for complementary sequences. This lowers the entropic costs and preferentially enhances the initial pairing to seed-matched target sites [112, 114]. Other RNA-guided proteins, like the CRISPR endonuclease Cas9 [115], adopt a similar search mechanism with the use of a preordered RNA region, indicating its broad effectiveness.

Recent structural and functional studies, reviewed in [94], revealed that AGO uses a more complex and stepwise approach in its search for miRNA target sites (Figure 10). The refined model of miRNA target recognition considers the miRNA seed as two functional domains [116]. Nucleotides 2-5 of the guide strand (g2-g5, also called the ‘sub-seed’) are most critical for target recognition, and their mutation greatly affects target binding rates [96, 117]. In fact, AGO exposes only guide nucleotides g2-g5 in a suitably preorganized A-form conformation, available for initial semi-stable target pairing, while the rest of the seed sequence is sterically occluded by AGO α-helix-7 (in human AGO2) [93]. This allows AGO to diffuse laterally along potential target mRNAs and search with remarkable speed for sub-seed complementary sites [96, 117]. The preliminary and transient pairing to g2-g5 causes AGO to pause on a potential target site, where it can propagate the pairing through the full seed up to nucleotide g8, making a more stable interaction with the target. As pairing extends to g6-g8, AGO undergoes a conformational change and displaces α-helix-7, further stabilizing the seed-paired conformation and ensuring fidelity of the associated target while reducing dwell time on weak or off-target sites [116, 118]. Additionally, the shift in α-helix-7 exposes the previously sequestered 3′ half of the guide, specifically nucleotides g13-g16, allowing for 3′-supplementary interactions. In case of extensive miRNA-target complementarity, the pairing can then propagate to the center of the miRNA, allowing catalytically active AGO to slice the target mRNA [3].

Figure 10 Model for miRNA target recognition. AGO diffuses laterally along potential target mRNAs and searches for complementary sites. Initial semi-stable pairing to the preorganized sub-seed (g2-g5) allows AGO to pause at a specific site and propagate the pairing to match the full seed region (g2-g7/g8), making a more stable interaction with the target. Additional pairing with the 3′ end of the guide (nucleotides g13-g16) can follow. Adapted from [94, 117].

1.5.6 Identifying the miRNA targetome

The interaction between a miRNA and its target mRNA is based upon well-established Watson-Crick base pairing rules [14]. However, determining the action of miRNAs on their targets remains a challenge. In fact, each miRNA is believed to regulate tens or hundreds of mRNAs and vice versa, making the accurate prediction of miRNA-target interactions critical for understanding miRNA biology and its relevance in disease. To address this issue, several in silico algorithms and experimental methodologies have been developed aiming to determine the miRNA targetome [119].

Low-throughput classical techniques (e.g. reporter gene assays, quantitative polymerase chain reactions and western blots) produce low amounts of data and generally only infer a miRNA interaction by considering the reduction of a reporter gene, a target mRNA or a protein. In contrast, recent improvements in the cost, accessibility and throughput of next-generation sequencing (NGS) technologies have radically changed the transcriptome-wide identification of experimentally supported miRNA-target interactions [119, 120]. The immunoprecipitation of RISC in AGO-CLIPs, followed by NGS-based identification of its interacting RNAs, set a new path in miRNA research. The use of these high-throughput approaches has achieved a so-far-unmatched accuracy, thereby providing an exceptionally valuable resource for systematically dissecting the miRNA targetome. In addition, recent advances enabled the generation of covalently linked miRNA-target fragments called ‘chimeras’, which provide context information on individual miRNA-target sites at nucleotide resolution [107, 109-111].

1.5.7 miRNA target prediction strategies

Despite the advances in high-throughput experimental approaches, computational prediction tools provide a rapid method to identify putative miRNA targets prior to their functional characterization and validation in biological models. A common feature among many different algorithms is the search for phylogenetically conserved seed-matched miRNA target sites in annotated 3′ UTRs. In fact, the presence of canonical seed matches has been identified as the most important prediction determinant and is of fundamental importance for almost all prediction tools [14, 101]. However, the mere presence of a perfect seed complement does not always guarantee a functional interaction [77, 121, 122]. Thus, to improve the target site prediction efficacy, additional features can be included in the prediction algorithms. For example, TargetScan (http://www.targetscan.org/vert_72/), in its latest version (v7.2), considers 14 different miRNA and mRNA features to predict the most effectively targeted mRNAs [101]:

• features of the miRNA target site: type and number of sites, supplementary pairing at the miRNA 3′ end and the site context within the 3′ UTR.
• features of the miRNA: thermodynamic stability of the seed-pairing and the number of competing sites in all annotated 3′ UTRs.
• features of the mRNA: UTR length, ORF length, presence of alternative 3′ UTR isoforms, presence of additional marginal sites and sites in the ORFs.


Despite continuous updates and improvements made to increase the confidence of miRNA target predictions, most bioinformatics tools are far from perfect, displaying relatively high false positive and false negative rates and predicting true targets typically in only about 50% of cases [119]. Nevertheless, continuous advances in miRNA biology and in the understanding of miRNA-targeting mechanisms, coupled with technological innovations like machine learning approaches, provide the knowledge to further develop the prediction algorithms and make them as reliable as the best high-throughput crosslinking approaches, or even more so, at a fraction of the cost and with a great advantage in time.

1.6 miRNA targets fight back on miRNAs

Repression of target mRNA is the canonical outcome of miRNA function [14]. However, miRNA pairing to specific non-canonical target sites with extensive complementarity to both the 5′ and 3′ regions of the miRNA can reverse this outcome, and target RNAs themselves can trigger degradation of bound miRNAs. This emerging miRNA destabilization mechanism is known as target RNA-directed miRNA degradation (TDMD) [123] and has been comprehensively reviewed in [124]. The extent and architecture of the pairing between the miRNA and its target mRNA are crucial to evade silencing and promote miRNA turnover, although TDMD requirements are only now starting to be understood and show both tissue- and miRNA-specific effects. Upon favorable pairing, TDMD leads to miRNA 3′ end tailing (addition of non-templated nucleotides, specifically A or U) followed by trimming of the miRNA from the 3′ end, resulting in a highly specific miRNA loss. Indeed, TDMD-inducing target RNAs provide a new layer of miRNA regulation [124]. Alternatively, highly complementary seedless sites promote the unloading of the guide strand from AGO, inducing its rapid degradation [125, 126].

1.7 Complexity and redundancy of miRNA function

Understanding miRNA biology and mechanism of action remains a challenge. In recent years, the principles of miRNA post-transcriptional gene regulation have been extensively studied. However, the reason that miRNAs are so important goes beyond a single miRNA-mRNA interaction and its downstream activity [31]. In fact, because of the imperfect nature of its target recognition mechanism, a single miRNA can concomitantly regulate tens or hundreds of different target mRNAs. Furthermore, individual mRNAs can be targeted by several miRNAs in parallel, ensuring a regulatory redundancy in miRNA function [31, 127, 128] (Figure 11). Additionally, closely spaced, adjacent target sites can function cooperatively to silence common targets [129, 130].

Figure 11 Complexity and redundancy of miRNA function. miRNAs form complex and highly connected regulatory networks ensuring collaborative and redundant repression of target mRNAs. Adapted from [31].

In fact, it appears that miRNAs have been extensively integrated in multiple crucial pathways in a modular, highly connected and redundant fashion, ensuring the correct maintenance of cellular homeostasis even in case of deletions of specific miRNAs [3, 128]. Such a highly connected miRNA regulatory network provides the means for optimal gene regulation in every developmental stage in each cell of each tissue, increasing the overall tolerance to genomic DNA mutations, diseases and environmental perturbations [3].

1.8 Oncomirs and tumor suppressors

Dysregulation of miRNA expression and function has been found in many human diseases [131], especially in the context of cancer [132], where some miRNAs can act as tumor suppressors [10], whereas oncogenic miRNAs (called ‘oncomirs’) are functionally associated with the promotion of cancer [133]. Systematic miRNA profiling across human tumors and normal tissues identified a general downregulation of miRNA expression as a hallmark of cancer [134]. This work focuses on two functionally opposing miRNA families: oncogenic miR-17 [8, 9] and tumor-suppressive let-7 [10, 11].


1.8.1 miR-17~92 cluster

The miR-17 family is part of a large miRNA cluster called miR-17~92. Besides playing an essential role during normal development [133], the miR-17~92 cluster contained the first reported oncogenic miRNA (called ‘oncomir-1’) [135]. It was found to be overexpressed in various types of hematopoietic malignancies and solid cancers, and implicated in a broad range of immune, cardiovascular and neurodegenerative diseases, and aging processes [8, 9]. The overexpression can result from a genomic amplification of the miR-17~92 locus, as originally found in human B-cell lymphomas [135], or from direct cluster transactivation by transcription factors such as c-MYC [136] and E2F [137, 138], each frequently up-regulated in cancer and under the control of miR-17~92 itself, constituting a tightly regulated feed-forward loop [139].

The miR-17~92 cluster consists of six highly conserved miRNAs representing four seed families: the miR-17 family (miR-17 and miR-20a), the miR-18 family (miR-18a), the miR-19 family (miR-19a and miR-19b-1) and the miR-92 family (miR-92a-1) [9] (Figure 12). The entire cluster has two paralogs, the miR-106a~363 cluster (encoding miR-106a, miR-18b, miR-20b, miR-19b-2, miR-92a-2 and miR-363) and the miR-106b~25 cluster (encoding miR-106b, miR-93 and miR-25) (Figure 12). The miR-17~92 cluster is the most studied and has been extensively reviewed in [8, 9]. Unlike the miR-17~92 and miR-106b~25 clusters, which are both abundantly expressed across many tissues and cell types, the miR-106a~363 cluster is undetectable or expressed only at trace levels [133].

Among the four clustered families, an allelic series of miR-17~92 mutant mice identified the miR-19 family as a key downstream effector of MYC and of crucial importance in MYC-driven tumor initiation and progression [140]. Emerging evidence supports the oncogenic activity of the miR-17 family in different cellular phenotypes [8, 141, 142]. Among its targets are key regulators of cell cycle progression (e.g. CDKN1A/p21) and apoptosis (e.g. BCL2L11/BIM) [143]. Ectopic expression of a single member of the miR-17 family, miR-17-5p, leads to dysregulation of normal cell cycle progression and is sufficient to drive a proliferative signal in HEK293T cells [144]. However, its effect seems to be context-dependent, with possibly an opposite outcome in different settings. For example, miR-17-5p is oncogenic in liver, colorectal and prostate cancers [25, 26, 142], whereas it might suppress cell proliferation and induce apoptosis in cervical cancer [145].


a)

b) miR-17~92 cluster and its paralogs

family   miRNA         sequence (5′-3′)
                       12345678
miR-17   miR-17-5p     CAAAGUGCUUACAGUGCAGGUAG
         miR-20a-5p    UAAAGUGCUUAUAGUGCAGGUAG
         miR-20b-5p    CAAAGUGCUCAUAGUGCAGGUAG
         miR-93-5p     CAAAGUGCUGUUCGUGCAGGUAG
         miR-106a-5p   AAAAGUGCUUACAGUGCAGGUAG
         miR-106b-5p   UAAAGUGCUGACAGUGCAGAU
                       12345678
miR-18   miR-18a-5p    UAAGGUGCAUCUAGUGCAGAUAG
         miR-18b-5p    UAAGGUGCAUCUAGUGCAGUUAG
                       12345678
miR-19   miR-19a-3p    UGUGCAAAUCUAUGCAAAACUGA
         miR-19b-3p    UGUGCAAAUCCAUGCAAAACUGA
                       12345678
miR-92   miR-25-3p     CAUUGCACUUGUCUCGGUCUGA
         miR-92a-3p    UAUUGCACUUGUCCCGGCCUGU
         miR-363-3p    AAUUGCACG.GUAUCCAUCUGUA

Figure 12 miR-17~92 cluster and its paralogs. (a) Genomic representation and chromosomal location of the human miR-17~92, miR-106a~363, and miR-106b~25 clusters representing four miRNA families: the miR-17 family (shown in red, contains miR-17, miR-20a/b, miR-93 and miR-106a/b), the miR-18 family (shown in blue, contains miR-18a/b), the miR-19 family (shown in green, contains miR-19a/b) and the miR-92 family (shown in yellow, contains miR-25, miR-92a, and miR-363). (b) Cluster members are shown with ClustalW multiple sequence alignments. The extended seed sequence (nucleotides 2-8 from the 5′ end of the miRNA sequence; shown in red) defines each miRNA family. Only sequences of the miRNA guide strands are shown (5′-3′ orientation). Adapted from [8, 9].


1.8.2 let-7 family

Following the seminal discovery of lin-4 in C. elegans [1, 2], let-7 was the second miRNA to be identified as a heterochronic gene regulating the progression of larval stages during the development of the worm [104]. The well-characterized developmental defects observed in let-7 mutant animals [104], namely lack of terminal differentiation and ongoing cell proliferation leading eventually to a lethal phenotype (earning this mutation also its name: ‘lethal-7’ or ‘let-7’), suggested a conserved function in mammals with possible contributions of let-7 in human diseases, especially in cancer. In fact, abundant in vitro and in vivo data classified let-7 as a bona fide tumor-suppressor [146]. Decreased expression of let-7 miRNAs has been detected in many human cancers and associated with a poor clinical outcome [10, 147]. Several well-known oncogenes such as HMGA2 [148, 149], RAS [150], MYC [151], IGF2BP1 [152], CDK6 [153] and CCND1 [154] are directly repressed by let-7.

The let-7 miRNA family is highly conserved across various animal species [155]. As the largest miRNA family in humans with 12 different genomic loci, located individually or as clusters, it consists of 9 mature let-7 miRNAs (Figure 1). Because of the deleterious developmental consequences and oncogenic potential of their functional dysregulation [146], the biogenesis of let-7 miRNAs is tightly regulated by several transcriptional and post-transcriptional mechanisms, as reviewed in [156]. The most prominent let-7 regulators are LIN28A and LIN28B, two related RNA-binding proteins frequently up-regulated in cancers [157, 158], which specifically bind to let-7 precursors and inhibit their maturation [159, 160]. In turn, let-7 itself targets the LIN28A/B mRNAs, thereby establishing a double-negative feedback loop [146].


1.9 Aim of the project

Our comprehension of miRNA-target interactions has increased in the last decades and has uncovered an ever-growing number of non-canonical targeting modes. As part of a related but independent project in our lab, we developed a new variant of AGO-CLIP, termed miRNA crosslinking and immunoprecipitation (miR-CLIP), able to specifically ‘capture’ miRNA targets in cells, followed by their identification by deep sequencing [161]. The miR-CLIP identified hundreds of functional miR-106a targets in HeLa cells and revealed a significant number of atypical targets with no seed match but instead extensive complementarity to the 3′ end of miR-106a. Additionally, these seedless targets were significantly upregulated upon delivery of miR-106a, particularly when the 3′ end matches were located in target 3′ UTRs. The re-analysis of the miR-CLIP data initiated this project.

The initial observation of a sequence homology between the miR-17 family and the seed of let-7 suggested possible functional implications between these two miRNA families, considering also their opposing nature. In fact, the miR-17 family is oncogenic, whereas let-7 is a prominent tumor-suppressor. This led us to hypothesize a competing miRNA-targeting mechanism based on the sequence overlap between the 3′ end of miR-17 family members and the seed of let-7. Our working model considers a non-canonical targeting by miR-17 miRNAs, possibly co-targeting and competing for conventional and repressive let-7 target sites. We aimed to identify the sequence requirements of this interaction by mutating the seed and the 3′ end of the competing miRNAs. Additionally, we directly compared the let-7 competing ability of two members of the miR-17 family, namely miR-106a and miR-106b.

ETH Zurich Matije Lucic 31 Main project Results

2 Results

2.1 Sequence homology between miR-17 and let-7 families

The 5′ end sequence homology among miRNA family members is a given, whereas the 3′ end can be heterogeneous. This is not the case for members of the human miR-17 family. Besides the shared seed region, they have a high degree of homology at the 3′ end (Figure 13), suggesting its functional conservation. In fact, a closer investigation of this sequence highlighted its similarity with the 5′ region of the human let-7 family. With the exception of let-7d, the only let-7 member starting with an adenine, the overlap with the 3′ end of the miR-17 family covers nucleotides 1-8, including the let-7 seed sequence (Figure 13).

family   miRNA      sequence (5′-3′)
                    12345678        17 20
miR-17   miR-17     CAAAGUGCUUACAGUGCAGGUAG
         miR-20a    UAAAGUGCUUAUAGUGCAGGUAG
         miR-20b    CAAAGUGCUCAUAGUGCAGGUAG
         miR-93     CAAAGUGCUGUUCGUGCAGGUAG
         miR-106a   AAAAGUGCUUACAGUGCAGGUAG
         miR-106b   UAAAGUGCUGACAGUGCAGAU

let-7    let-7a-1                 UG-AGGUAGUAGGUUGUAUAGUU
         let-7a-2                 UG-AGGUAGUAGGUUGUAUAGUU
         let-7a-3                 UG-AGGUAGUAGGUUGUAUAGUU
         let-7b                   UG-AGGUAGUAGGUUGUGUGGUU
         let-7c                   UG-AGGUAGUAGGUUGUAUGGUU
         let-7d                   AG-AGGUAGUAGGUUGCAUAGUU
         let-7e                   UG-AGGUAGGAGGUUGUAUAGUU
         let-7f-1                 UG-AGGUAGUAGAUUGUAUAGUU
         let-7f-2                 UG-AGGUAGUAGAUUGUAUAGUU
         let-7g                   UG-AGGUAGUAGUUUGUACAGUU
         let-7i                   UG-AGGUAGUAGUUUGUGCUGUU
         miR-98                   UG-AGGUAGUAAGUUGUAUUGUU
                                  12 345678

Figure 13 Sequence homology between miR-17 and let-7 families. Members of the human miR-17 and let-7 families are shown with ClustalW multiple sequence alignments. miR-17 family miRNAs have a high degree of homology at the 3′ end that overlaps with the 5′ end of the let-7 family, covering the let-7 seed sequence (yellow box). The overlap is not perfectly contiguous, with an extra cytidine at position 17. miR-106b is shorter and has an adenine instead of a guanosine at position 20, reducing its 3′ end homology with family members and the let-7 seed. Extended seed (nucleotides 2-8) shown in red. Only sequences of the miRNA guide strands are shown. The suffix ‘-5p’ was omitted.


However, the overlap between the two families is not perfectly contiguous because of an extra cytidine at position 17 in the 3′ end of miR-17 members. Interestingly, miR-106b is shorter and has an adenine instead of guanosine at position 20, reducing considerably its sequence homology with other miR-17 family members and the let-7 seed region.
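The overlap described here can be checked position by position with a short Python sketch. It uses the guide-strand sequences listed in Figure 13; the function names and the default parameters (start at position 15, skip position 17, compare up to 8 nucleotides) are illustrative assumptions derived from the description above, not part of the original analysis.

```python
from typing import List, Tuple

# Guide-strand sequences as listed in Figure 13 (5'-3').
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"
LET7D = "AGAGGUAGUAGGUUGCAUAGUU"
MIR106A = "AAAAGUGCUUACAGUGCAGGUAG"
MIR106B = "UAAAGUGCUGACAGUGCAGAU"


def compare_overlap(mir17_member: str, let7: str,
                    start: int = 15, skip: int = 17,
                    n: int = 8) -> List[Tuple[int, int, bool]]:
    """Compare the 3' end of a miR-17 family member with let-7 nucleotides 1-n.

    Positions are 1-based. The comparison starts at `start` and leaves out
    position `skip`, the extra cytidine that interrupts the overlap.
    Returns (miR-17 position, let-7 position, identical?) triples.
    """
    tail = [(pos, nt) for pos, nt in enumerate(mir17_member, start=1)
            if pos >= start and pos != skip]
    head = list(enumerate(let7[:n], start=1))
    return [(pos, let7_pos, nt == let7_nt)
            for (pos, nt), (let7_pos, let7_nt) in zip(tail, head)]


def summarize(mir17_member: str, let7: str) -> Tuple[int, int]:
    """Return (identical positions, compared positions)."""
    pairs = compare_overlap(mir17_member, let7)
    return sum(same for _, _, same in pairs), len(pairs)


print(summarize(MIR106A, LET7A))  # (8, 8)
print(summarize(MIR106B, LET7A))  # (5, 6)
print(summarize(MIR106A, LET7D))  # (7, 8)
```

In this simplified comparison, miR-106a matches let-7a at all eight positions, the shorter miR-106b allows only six positions to be compared and differs at its position 20, and let-7d differs from miR-106a only at let-7 nucleotide 1.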

Considering the absence of the miR-17 family in nematodes (Table S4), in which let-7 was first identified and is well known for its biology and implications in animal development, we first sought to verify the conservation of the miR-17-let-7-seed overlap in other species. We performed a multiple sequence alignment of all miRBase-annotated miR-17 family miRNAs from all species (Table S6). Strikingly, the high degree of 3′ end-homology observed in the human miR-17 family is evolutionarily conserved, suggesting functional implications linked to this region.

This 3′ end-seed homology led us to hypothesize an overlapping and competitive targeting mechanism between these two miRNA families, considering also their opposing oncogenic and tumor-suppressive roles.

2.2 Model for competitive non-canonical binding at let-7 target sites

Considering their sequence homology with the let-7 seed, we hypothesized that miR-17 family miRNAs bind to let-7 target sites in a non-canonical fashion, lacking miR-17 seed-pairing and instead exploiting the complementarity of their 3′ end to the let-7 seed match (Figure 14). Based on current knowledge of effective miRNA target sites [101], we asked whether such non-canonical targeting could mediate functional repression of let-7 targets, with the extensive 3′ end interaction possibly compensating for the absence of conventional 5′ seed-pairing, as reported previously (Chapter 1.5.3, p. 21). We also envisioned a non-repressive occupancy of let-7 target sites by miR-17 family miRNAs that could hinder canonical let-7 seed-based silencing and thereby lead to derepression of let-7 targets.
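The pairing logic of this model can be illustrated with a short sketch. The Python snippet below derives the extended seed, the seed match and the 8mer site (with t1A) of let-7a from the sequence shown in Figure 13, and then reports the longest contiguous Watson-Crick stretch that a miRNA can form with that site. The function names are illustrative, G:U wobble pairs are ignored, and the miR-106a-5p sequence is an assumption taken from its miRBase annotation rather than from this section.

```python
# Illustrative sketch only: Watson-Crick pairing, no G:U wobble, no energy model.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed(mirna: str) -> str:
    """Extended seed: nucleotides 2-8 of the guide strand (1-based)."""
    return mirna[1:8]


def site_8mer(mirna: str) -> str:
    """8mer target site (5'->3'): seed match followed by an adenosine (t1A)."""
    return reverse_complement(seed(mirna)) + "A"


def longest_contiguous_pairing(mirna: str, site: str) -> int:
    """Longest contiguous stretch of the miRNA that is complementary to the site."""
    ideal = reverse_complement(site)  # strand that would pair perfectly with the site
    best = 0
    for i in range(len(mirna)):
        for j in range(len(ideal)):
            k = 0
            while (i + k < len(mirna) and j + k < len(ideal)
                   and mirna[i + k] == ideal[j + k]):
                k += 1
            best = max(best, k)
    return best


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"      # let-7a, from Figure 13 (gap removed)
MIR106A = "AAAAGUGCUUACAGUGCAGGUAG"   # assumed miR-106a-5p sequence (miRBase)

let7_site = site_8mer(LET7A)
print(seed(LET7A), let7_site)
print(longest_contiguous_pairing(LET7A, let7_site))
print(longest_contiguous_pairing(MIR106A, let7_site))
```

Under these assumptions, let-7a pairs contiguously with all eight positions of its site, whereas miR-106a-5p forms a 6-nt contiguous stretch with its 3′ end that is interrupted by the cytidine at position 17.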

Figure 14 Model for competitive non-canonical binding at let-7 target sites. The canonical seed-based repression at the let-7 target site (let-7a-5p in blue, 8mer site in cyan with t1A in green) could be hindered by an overlapping non-canonical targeting based on an extensive 3′ end complementarity of miR-17 family miRNAs (miR-106a-5p shown in red as representative miR-17 family member) with the let-7 seed match (cyan). Such non-canonical pairing is not expected to mediate repression of the let-7 target.

2.3 Investigation of the let-7 transcriptome in HEK293T cells

To gain insights into the hypothesized let-7 competing ability of miR-17 family members, we designed a series of combination treatments and explored how they affect gene expression in HEK293T cells. Although miRNAs ultimately control protein output, we first sought to generate and analyze transcriptomic data by deep sequencing, because miRNA-induced changes in protein expression can be explained by changes in the transcriptome, as previously demonstrated [77, 91, 121]. We directly compared the competing ability of two miR-17 family members, miR-106a and miR-106b, against let-7. Additionally, we mutated the seed and the 3′ end of miR-106a to investigate the sequence requirements for let-7 competition (Figure 15).

Figure 15 Non-canonical targeting of miR-106a, its mutants, and miR-106b. miR-106a (red), its seed-mutant (brown) and 3′ end-mutants #1-3 (purple), as well as the shorter miR-106b (orange) are shown paired non-canonically to the let-7 target site (8mer site in cyan, t1A in green) with their 3′ ends. Only miRNA 5p guide strands are shown.

As a positive control for let-7 activity we used the let-7a duplex, whereas a randomized duplex served as a negative control, as described in [162]. The competitors miR-106a, its mutants and miR-106b were delivered as hairpin precursors. To ensure that they are processed endogenously by Dicer into functionally active mature miRNAs, as shown previously [24], and loaded into AGO, we first investigated their canonical seed-based activity in HEK293T cells on a miR-17 luciferase reporter bearing three TargetScan-predicted miR-17 target sites from the 3′ UTR of the putative miR-17 target ANKRD52 (Figure 16). The reporter is significantly and dose-dependently repressed by both the siRenilla (positive siRNA control duplex, as previously shown [24]) and the commercially available miRNA positive control (mimic miR-106a-5p), which bypasses Dicer cleavage and can be loaded directly into RISC. As expected, neither the negative control nor the let-7a duplex has repressive activity. Similarly, the seed mutation of miR-106a abolishes its repressive activity. All precursors with an intact seed sequence repress the reporter in a dose-dependent manner, although with minor differences, especially at lower concentrations, possibly reflecting family member-dependent differences and/or the importance of 3′-supplementary pairing in canonical miRNA-mediated silencing. However, they all have a comparable effect at the highest dose.

Figure 16 Screen for canonical miR-106a/b activity on miR-17 luciferase reporter. Relative luminescence of the miR-17 luciferase reporter in HEK293T cells upon 0, 2.5, 10 or 40 nM transfections of the positive siRNA control siRenilla (gray), let-7a duplex (blue), randomized negative control duplex (black), positive miRNA control mimic miR-106a-5p (bordeaux) and the precursors: miR-106a (red), miR-106b (orange), seed-mutated miR-106a (brown) or 3′ end-mutated miR-106a #1-3 (purple). Renilla luminescence is normalized to Firefly luminescence and relative to mock transfections (n = 3) set to 100%. Statistical significances of 2way ANOVA test with Dunnett’s multiple comparisons to negative control (black) are shown on the graph, while comparisons to miR-106a (red) transfection are shown for each concentration below the graph. Error bars are SD. ns, not significant, omitted. *: P ≤ 0.05; **: P ≤ 0.01; ***: P ≤ 0.001; ****: P ≤ 0.0001.
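The normalization described in this caption can be sketched as follows. This is a minimal illustration, not the analysis script used for the figure: the function names and the example readings are hypothetical, and the mock reference is assumed to be the mean Renilla/Firefly ratio of the mock-transfected wells.

```python
from statistics import mean


def relative_luminescence(renilla, firefly, mock_renilla, mock_firefly):
    """Renilla/Firefly ratio of each well as a percentage of the mean mock ratio."""
    mock_ratio = mean(r / f for r, f in zip(mock_renilla, mock_firefly))
    return [100.0 * (r / f) / mock_ratio for r, f in zip(renilla, firefly)]


def percent_change(relative):
    """Express relative luminescence as a change from the reference set to 0%."""
    return [value - 100.0 for value in relative]


# Hypothetical raw readings (arbitrary units), three mock wells and one treated well.
mock_renilla, mock_firefly = [1000.0, 1000.0, 1000.0], [500.0, 500.0, 500.0]
treated = relative_luminescence([500.0], [500.0], mock_renilla, mock_firefly)
print(treated, percent_change(treated))
```

Dividing by the Firefly signal corrects each well for transfection efficiency and cell number before the comparison to mock is made.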

Following the initial screen on the luciferase reporter, we investigated individual miRNAs in single transfections for their canonical repressive activity on endogenous TargetScan-predicted targets. Next, we explored the let-7 competing ability of the miR-17 family members by measuring let-7 target repression after co-transfection of let-7a with miR-106a, miR-106b or the miR-106a mutants.

2.3.1 Canonical repression by let-7a

Conserved TargetScan-predicted let-7 targets (Table S7) are significantly and dose-dependently repressed after transfection of the let-7a duplex in HEK293T cells (Figure 17). Compared to the downregulation of all let-7 targets (Figure 17: c), a more stringent selection of the targetome shows an even larger repression for both let-7a doses (Figure 17: d). The negative control duplex does not affect the predicted let-7 targets and induces no change in either subset (Figure 17: c, d).

Figure 17 let-7 targets are repressed by let-7a duplex in a dose-dependent manner. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) and conserved TargetScan-predicted let-7 targets (c and d) upon transfecting HEK293T cells with either 10 nM of let-7a duplex (light blue), 50 nM of let-7a duplex (dark blue) or 50 nM of randomized negative control duplex (black), compared to mock transfections (n = 3). For the selection of the top 100 let-7 targets (d), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph.
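The quantities reported below each graph can be illustrated with a small sketch. The Python code below builds an empirical cumulative distribution and computes the median and the two-sample Kolmogorov-Smirnov D statistic for two sets of expression changes. The values are hypothetical and the function names are illustrative. P values are not computed here; an established routine such as `scipy.stats.ks_2samp` could be used for that purpose.

```python
from bisect import bisect_right
from statistics import median


def ecdf(sample):
    """Return the empirical cumulative distribution function of a sample."""
    ordered = sorted(sample)
    n = len(ordered)

    def cumulative_fraction(x):
        # Fraction of observations less than or equal to x.
        return bisect_right(ordered, x) / n

    return cumulative_fraction


def ks_statistic(sample_a, sample_b):
    """Two-sample KS D statistic: largest vertical distance between both ECDFs."""
    f_a, f_b = ecdf(sample_a), ecdf(sample_b)
    # The maximum distance is always reached at one of the observed values.
    return max(abs(f_a(x) - f_b(x)) for x in set(sample_a) | set(sample_b))


# Hypothetical expression changes of predicted targets relative to mock.
mirna_transfection = [-1.0, -0.5, -0.2, 0.0]
negative_control = [-0.1, 0.0, 0.1, 0.2]

print(median(mirna_transfection), median(negative_control))
print(ks_statistic(mirna_transfection, negative_control))
```

A leftward shift of the miRNA curve relative to the negative control, as in this toy example, corresponds to a negative median change and a large D statistic.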

2.3.2 Canonical repression by miR-106a and miR-106b

Conserved TargetScan-predicted miR-17 targets are significantly repressed after transfection of both the miR-106a and the miR-106b precursor in HEK293T cells, suggesting that both are correctly processed by Dicer and loaded into AGO (Figure 18). Compared to the significant downregulation of all miR-17 targets (Figure 18: c), a more stringent selection of the targetome shows an even larger repression for both miRNAs, with no difference in canonical targeting between the two family members (Figure 18: d). Considering all miR-17 targets, miR-106b induces a smaller change in target expression than miR-106a, possibly reflecting family member-dependent differences and/or biased target prediction (Figure 18: c). The negative control duplex does not significantly affect the predicted miR-17 targetome and shows minimal changes in both subsets (Figure 18: c, d).

Figure 18 miR-17 targets are repressed by both miR-106a and miR-106b precursors. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) and conserved TargetScan-predicted miR-17 targets (c and d) upon transfecting HEK293T cells with 50 nM of either miR-106a precursor (red), miR-106b precursor (orange) or a randomized negative control duplex (black), compared to mock transfections (n = 3). For the selection of the top 100 miR-17 targets (d), conserved TargetScan-predicted miR-17 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

2.3.3 miR-106a seed-mutation abolishes canonical repression

Compared to the wildtype miR-106a precursor, the seed-mutation abolishes the repression of conserved TargetScan-predicted miR-17 targets in HEK293T cells and slightly induces their expression (Figure 19).

Figure 19 Repression of miR-17 targets is abolished by miR-106a seed-mutation. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) and conserved TargetScan-predicted miR-17 targets (c and d) upon transfecting HEK293T cells with 50 nM of either wildtype miR-106a precursor (red), seed-mutated miR-106a precursor (brown) or a randomized negative control duplex (black), compared to mock transfections (n = 3). For the selection of the top 100 miR-17 targets (d), conserved TargetScan-predicted miR-17 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

2.3.4 3′ end-mutated miR-106a retains canonical repressive activity

The mutations in the 3′ end of the miR-106a precursor are generally well tolerated for the regulation of canonical targets. The three mutants #1-3 induce significant repression of canonical TargetScan-predicted miR-17 targets, similarly to the wildtype miR-106a (Figure 20). Among them, mutant #2 (miR-106a 3′ end-mut-2) shows no significant difference in target downregulation compared to the wildtype precursor (Figure 20: d-f). In contrast, mutants #1 (Figure 20: a-c) and #3 (Figure 20: g-i) exhibit a significantly smaller effect than wildtype miR-106a, possibly reflecting the importance of 3′-supplementary pairing in canonical miRNA-mediated silencing.

Figure 20 miR-17 targets are repressed by the 3′ end-mutated miR-106a precursors. Cumulative distributions of changes in gene expression for all genes (a, d and g) or conserved TargetScan-predicted miR-17 targets (b, c, e, f, h and i) upon transfecting HEK293T cells with 50 nM of either wildtype miR-106a precursor (red), 3′ end-mutated miR-106a precursors (purple, a-c: 3′ end-mutant #1, d-f: 3′ end-mutant #2, g-i: 3′ end-mutant #3) or a randomized negative control duplex (black), compared to mock transfections (n = 3). For the selection of the top 100 miR-17 targets (c, f and i), conserved TargetScan-predicted miR-17 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

2.3.5 miR-106a and miR-106b are unable to repress let-7 targets

Neither the miR-106a nor the miR-106b precursor is able to mediate repression of conserved TargetScan-predicted let-7 targets in HEK293T cells (Figure 21). On the contrary, a more stringent selection of the let-7 targetome (Figure 21: d, e) shows a minor upregulatory effect for miR-106a, although it is not significant when compared to miR-106b. Interestingly, roughly one quarter of the let-7 targets is predicted to be co-regulated by the miR-17 family. In fact, only these predicted co-targets are significantly repressed by both miR-106a and miR-106b. The effect of the three miRNAs on this subset of targets is statistically comparable (Figure 21: f).

Figure 21 let-7 targets are not repressed by miR-106a and miR-106b precursors. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) or conserved TargetScan-predicted let-7 targets (c-f) upon transfecting HEK293T cells with 50 nM of either let-7a duplex (blue), miR-106a precursor (red), miR-106b precursor (orange), or a randomized negative control duplex (black), compared to mock transfections (n = 3). Co-targets (f) are conserved TargetScan-predicted targets of both let-7 and miR-17 miRNA families (Table S8), while let-7-only targets (d-e) exclude miR-17 family targets. For the selection of the top 100 let-7 targets (e), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.
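The target subsets used in panels d-f can be expressed as simple set operations followed by a ranking step. The sketch below is illustrative only: the gene identifiers and aggregate PCT values are placeholders, the function names are hypothetical, and the real target lists are those of Tables S7 and S8.

```python
def partition_targets(let7_targets, mir17_targets):
    """Split predicted let-7 targets into co-targets and let-7-only targets."""
    let7, mir17 = set(let7_targets), set(mir17_targets)
    return {
        "co_targets": let7 & mir17,   # predicted targets of both families
        "let7_only": let7 - mir17,    # let-7 targets without a miR-17 family site
    }


def top_n_by_pct(aggregate_pct, genes, n=100):
    """Rank the given genes by decreasing aggregate PCT and keep the first n."""
    ranked = sorted(genes, key=lambda gene: aggregate_pct[gene], reverse=True)
    return ranked[:n]


# Placeholder identifiers and scores, not real TargetScan output.
let7_targets = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
mir17_targets = ["GENE_C", "GENE_E"]
aggregate_pct = {"GENE_A": 0.90, "GENE_B": 0.99, "GENE_C": 0.80, "GENE_D": 0.50}

subsets = partition_targets(let7_targets, mir17_targets)
print(subsets["co_targets"], sorted(subsets["let7_only"]))
print(top_n_by_pct(aggregate_pct, subsets["let7_only"], n=2))
```

Ranking only the let-7-only subset, as done here, mirrors the description of panel e, in which the top 100 targets are drawn from targets that exclude miR-17 family sites.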

2.3.6 Co-transfection of let-7a with either miR-106a or miR-106b

The co-transfection of let-7a with either the miR-106a or the miR-106b precursor reveals a striking difference in the expression of TargetScan-predicted let-7 targets (Figure 22). Compared to the let-7a single transfection, the combination with miR-106a abolishes the effect of let-7a on its own targetome and even results in a slightly positive median fold change (Figure 22: d, e). In contrast, the co-transfection with miR-106b significantly enhances the repression of all let-7 targets (Figure 22: c) and of their subsets (Figure 22: d, e). This ‘let-7 potentiating’ effect is observed only upon co-transfection with let-7a, whereas miR-106b alone has no silencing effect on the let-7 targetome, as shown previously (Figure 21: c-e). The competitive effect of miR-106a and the synergistic effect of miR-106b on let-7a are particularly prominent on the let-7-only targetome (Figure 22: d, e). For the co-targets (Figure 22: f), by contrast, both co-transfections show an additive effect, inducing a significantly stronger repression of the shared targetome than any of the single transfections (Figure 21: f).

Figure 22 miR-106a precursor is able to neutralize let-7a’s effect on its targetome. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) or conserved TargetScan-predicted let-7 targets (c-f) upon transfecting HEK293T cells with 10 nM of let-7a duplex as single transfection (blue) or combined with either 50 nM miR-106a precursor (red) or 50 nM miR-106b precursor (orange), and 50 nM randomized negative control duplex as single transfection (black), compared to mock transfections (n = 3). Co-targets (f) are conserved TargetScan-predicted targets of both let-7 and miR-17 miRNA families (Table S8), while let-7-only targets (d-e) exclude miR-17 family targets. For the selection of the top 100 let-7 targets (e), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

2.3.7 Co-transfection of let-7a with seed-mutated miR-106a

The seed-mutated miR-106a retains the ability to hinder repression of TargetScan-predicted let-7 targets by the co-transfected let-7a duplex (Figure 23), suggesting that the seed sequence is of minor importance for the proposed non-canonical let-7 competing mechanism (Figure 14). However, the effect is weaker than that of the wildtype miR-106a (Figure 23: d), although no significant differences between the wildtype and seed-mutated co-transfections are observed on all (Figure 23: c) and top 100 (Figure 23: e) predicted let-7 targets. As expected, the seed mutant does not show the additive effect on the co-targets (Figure 23: f). Instead, it seems to induce their upregulation, similarly to its single transfection (Figure 19), even though it was co-transfected with let-7a.

Figure 23 Seed-mutated miR-106a retains the ability to hinder let-7 target repression. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) or conserved TargetScan-predicted let-7 targets (c-f) upon transfecting HEK293T cells with 10 nM of let-7a duplex as single transfection (blue) or combined with either 50 nM miR-106a precursor (red) or 50 nM seed-mutated miR-106a precursor (brown), and 50 nM randomized negative control duplex as single transfection (black), compared to mock transfections (n = 3). Co-targets (f) are conserved TargetScan-predicted targets of both let-7 and miR-17 miRNA families (Table S8), while let-7-only targets (d-e) exclude miR-17 family targets. For the selection of the top 100 let-7 targets (e), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

2.3.8 Co-transfection of let-7a with 3′ end-mutated miR-106a

Mutations in the 3′ end of miR-106a, the region that overlaps with the let-7 seed (Figure 13) and is complementary to canonical let-7 target sites (Figure 15), can abolish its let-7 neutralizing activity, leaving the repression of TargetScan-predicted let-7 targets by the co-transfected let-7a duplex unaffected. The combination of either 3′ end mutant #1 (Figure 24) or mutant #3 (Figure 25) with let-7a does not impair the downregulation of the let-7 targetome and shows no significant difference compared to the let-7a single transfection (Figure 24: c-e; Figure 25: c-e), suggesting that the proposed non-canonical let-7 competing mechanism depends strongly on the 3′ end sequence (Figure 15). Importantly, both mutants retain canonical seed-based activity, as previously shown (Figure 20), and synergistically enhance, together with let-7a, the repression of the shared targetome (Figure 24: f; Figure 25: f). In contrast, mutant #2 (Figure 26) retains the ability to prevent let-7a repression, similarly to the wildtype miR-106a precursor. According to our model, the differences observed among these mutants may be explained by differences in their 3′ end complementarity to the let-7 seed match (Figure 15). In fact, the extent of this complementarity seems to correlate directly with the ability of miR-106a to hinder let-7 target repression. The 3′ end mutant #2, which is still capable of contiguous 5mer pairing to the let-7 seed match, retains the competing ability, whereas the more disruptive mutations #1 and #3 allow only minimal 3′ end interaction with the let-7 target site, which possibly renders them unable to compete with let-7a and neutralize its repressive activity on its targetome.

Figure 24 Mutation #1 in the 3′ end of miR-106a abolishes its let-7 neutralizing activity. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) or conserved TargetScan-predicted let-7 targets (c-f) upon transfecting HEK293T cells with 10 nM of let-7a duplex as single transfection (blue) or combined with either 50 nM miR-106a precursor (red) or 50 nM 3′ end-mutated miR-106a precursor #1 (purple), and 50 nM randomized negative control duplex as single transfection (black), compared to mock transfections (n = 3). Co-targets (f) are conserved TargetScan-predicted targets of both let-7 and miR-17 miRNA families (Table S8), while let-7-only targets (d-e) exclude miR-17 family targets. For the selection of the top 100 let-7 targets (e), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

Figure 25 Mutation #3 in the 3′ end of miR-106a abolishes its let-7 neutralizing activity. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) or conserved TargetScan-predicted let-7 targets (c-f) upon transfecting HEK293T cells with 10 nM of let-7a duplex as single transfection (blue) or combined with either 50 nM miR-106a precursor (red) or 50 nM 3′ end-mutated miR-106a precursor #3 (purple), and 50 nM randomized negative control duplex as single transfection (black), compared to mock transfections (n = 3). Co-targets (f) are conserved TargetScan-predicted targets of both let-7 and miR-17 miRNA families (Table S8), while let-7-only targets (d-e) exclude miR-17 family targets. For the selection of the top 100 let-7 targets (e), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

Figure 26 miR-106a 3′ end-mutant #2 retains the ability to hinder let-7 target repression. Cumulative distributions of changes in gene expression for all genes (a), non-predicted targets (b) or conserved TargetScan-predicted let-7 targets (c-f) upon transfecting HEK293T cells with 10 nM of let-7a duplex as single transfection (blue) or combined with either 50 nM miR-106a precursor (red) or 50 nM 3′ end-mutated miR-106a precursor #2 (purple), and 50 nM randomized negative control duplex as single transfection (black), compared to mock transfections (n = 3). Co-targets (f) are conserved TargetScan-predicted targets of both let-7 and miR-17 miRNA families (Table S8), while let-7-only targets (d-e) exclude miR-17 family targets. For the selection of the top 100 let-7 targets (e), conserved TargetScan-predicted let-7 targets were ranked by decreasing aggregate PCT as described in [59]. Median changes in gene expression as well as P values and D statistics of Kolmogorov-Smirnov tests (KS-test) between miRNAs and negative control transfections (each compared to mock transfection) are shown below each graph, whereas P values from direct comparisons are shown on the graph.

2.4 Follow-up on putative let-7 targets: HMGA2 and LIN28B

We next confirmed the observations from our deep sequencing data on validated let-7 targets by quantitative real-time PCR (qRT-PCR) (Figure 27) and luciferase reporter assay (Figure 28). From the TargetScan-predicted let-7 targets, we selected the two targets most strongly repressed upon transfection of the let-7a duplex (Table S9), namely HMGA2 and LIN28B, both validated let-7 targets (Chapter 1.8.2, p. 30) with multiple let-7 target sites in their 3′ UTRs (Table S9). HMGA2 additionally has a predicted target site for miR-17, although this site is not functional, as shown later (Figure 28: b; Figure 31: a). We also included ANKRD52 as a putative miR-106a target, as shown previously [161].

Similar to the effect observed on the let-7 transcriptome (Chapter 2.3, p. 34), the co-transfection of let-7a with either the miR-106a or the miR-106b precursor shows a striking difference in transcript abundance for the validated let-7 targets HMGA2 (Figure 27: a) and LIN28B (Figure 27: b). Compared to the let-7a single transfection, the combination with miR-106a significantly hinders the repression by the co-transfected let-7a. In contrast, the combination with miR-106b has a synergistic effect and enhances the repression of both targets. The seed-mutated miR-106a retains the ability to hinder the downregulation of both let-7 targets, similarly to the wildtype miR-106a precursor, whereas the mutations in the 3′ end of miR-106a can abolish its let-7 neutralizing activity, although with different efficiencies, as observed in Chapter 2.3.8, p. 51. Neither the let-7a single transfection nor the seed-mutated miR-106a represses ANKRD52 (Figure 27: c), a validated miR-106a target that is downregulated significantly and comparably by the wildtype and 3′ end-mutated miR-106a precursors, whereas miR-106b induces a smaller change in its transcript expression, possibly reflecting family member-dependent differences.

Figure 27 Effects of co-transfections on validated let-7 and miR-17 targets. Relative fold changes of endogenous let-7 targets HMGA2 (a) and LIN28B (b) or miR-17 target ANKRD52 (c) upon transfecting HEK293T cells with 50 nM randomized negative control duplex (black), 10 nM of let-7a duplex as single transfection (blue) or co-transfected with 50 nM of each precursor: miR-106a (red), miR-106b (orange), seed-mutated miR-106a (brown) or 3′ end-mutated miR-106a #1-3 (purple). Fold changes are relative to mock transfections (n = 4) and normalized to housekeeping genes ACTB and GAPDH. Statistical significances of 2way ANOVA test with Dunnett’s multiple comparisons to miR-106a (red) are shown on each graph, while comparisons to either single let-7a (blue) or negative control (black) transfections are shown below each graph. Error bars are SD. ns, not significant; *: P ≤ 0.05; **: P ≤ 0.01; ***: P ≤ 0.001; ****: P ≤ 0.0001.
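A relative fold change of this kind is commonly computed from qRT-PCR threshold cycles with the comparative Ct (2^-ΔΔCt) approach. The sketch below assumes that approach, about 100% amplification efficiency, and averaging of the two housekeeping genes. The caption does not state the exact calculation, so the function name, the Ct values and the method itself are assumptions made for illustration.

```python
from statistics import mean


def fold_change(ct_target, ct_references, ct_target_mock, ct_references_mock):
    """Relative expression by the comparative Ct method (assumed, not confirmed).

    ct_references holds the Ct values of the housekeeping genes
    (here ACTB and GAPDH) measured in the same sample.
    """
    delta_ct = ct_target - mean(ct_references)                 # treated sample
    delta_ct_mock = ct_target_mock - mean(ct_references_mock)  # mock sample
    return 2.0 ** -(delta_ct - delta_ct_mock)


# Hypothetical Ct values: the target appears one cycle later after treatment.
print(fold_change(26.0, [18.0, 20.0], 25.0, [18.0, 20.0]))
```

In this toy example the treated sample needs one additional cycle relative to the housekeeping genes, which corresponds to half of the mock transcript level.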

We next cloned the ~3 kb-long HMGA2 3′ UTR into the 3′ UTR of Renilla luciferase to correlate the let-7 competing activity of miR-106a with the output of the luciferase protein (Figure 28: a). Both the siRenilla (positive siRNA control duplex, as previously shown [24]) and the let-7a duplex induce a significant and dose-dependent downregulation of the luciferase reporter (Figure 28: b). Interestingly, the miR-106a precursor has no repressive effect, although it has been predicted to target HMGA2 (Table S9). On the contrary, it has a minor upregulatory effect at the highest doses, suggesting possible competition with endogenous let-7 miRNAs. The let-7a co-transfections (Figure 28: c) with either the miR-106a or the miR-106b precursor show significant differences, with miR-106a inducing a strong derepression of the reporter, confirming our previous observations. On the other hand, the co-transfection of miR-106b slightly enhances the let-7a repression at the highest dose, confirming the synergistic effect seen on endogenous HMGA2 mRNA (Figure 27). Similarly, the hairpin bearing the mutation in the miR-106a seed increases the let-7a repression at the highest dose in this setup. Consistent with the qRT-PCR data (Figure 27) and compared to the wildtype precursor, the 3′ end-mutants #1-3 of miR-106a have significantly reduced activities, with different efficiencies, as previously observed.

Figure 28 Effects of co-transfections on HMGA2 3′ UTR luciferase reporter. (a) Schematic representation of the ~3 kb-long 3′ UTR of HMGA2 (pink with let-7 target sites in cyan) within the 3′ UTR of Renilla luciferase mRNA. Relative percentage changes in luminescence of the HMGA2 3′ UTR luciferase reporter in HEK293T cells upon 2.5, 10 or 40 nM single transfections (b) of the positive siRNA control siRenilla (gray), let-7a duplex (blue), miR-106a precursor (red) or randomized negative control duplex (black). The let-7a duplex is co-transfected at 2.5 nM (c) with increasing concentrations (2.5 nM, 1:1 ratio; 10 nM, 1:4 ratio; or 40 nM, 1:16 ratio) of each precursor: miR-106a (red), miR-106b (orange), seed-mutated miR-106a (brown) or 3′ end-mutated miR-106a #1-3 (purple). Renilla luminescence is normalized to Firefly luminescence and relative to either mock transfections (n = 4) set to 0% (b) or single let-7a transfections (n = 4) set to 0% (c). Statistical significances of 2way ANOVA test with Dunnett’s multiple comparisons to either negative control (b) or miR-106a (c) are shown on each graph. Error bars are SD. ns, not significant; *: P ≤ 0.05; **: P ≤ 0.01; ***: P ≤ 0.001; ****: P ≤ 0.0001.

2.5 let-7 competition by miR-106a-5p strand

The use of a hairpin precursor allows the generation of mature miRNAs inside the cell, mimicking the endogenous miRNA maturation process, and can thus give rise to two functional strands that can program RISC to mediate gene repression. To exclude this possibility and to avoid loading of the passenger strand into AGO, which could create off-target effects, we used the commercially available miR-106a-5p mimic (mimic-106a-5p), which is chemically enhanced to bias strand selection and preferentially load the 5p strand of the miR-106a duplex. We sought to validate the observed let-7 competing effect of the miR-106a precursor with the mimic-106a-5p.

Single transfections in HEK293T cells confirmed the let-7a repression of endogenous let-7 targets (Figure 29: a). Besides the two transcripts most strongly repressed upon transfection of the let-7a duplex (Table S9), namely HMGA2 and LIN28B, we included another two putative let-7 targets, ARID3B (top 3) and DICER1 (top 10). The miR-17 target ANKRD52 is significantly repressed by the mimic-106a-5p only, whereas the let-7 targets seem to be slightly but not significantly upregulated, possibly suggesting competition with endogenous let-7 miRNAs upon transfection of mimic-106a-5p. The same set of transcripts was measured again after co-transfecting let-7a in a 1:1 and 1:4 ratio with either the miR-106a-5p mimic or the negative control duplex (Figure 29: b, c). Compared to the let-7a single transfection, the combination with the mimic-106a-5p significantly prevents the repression of all let-7 targets and even causes a minor upregulation of the transcripts at the highest ratio (Figure 29: c). In contrast, the let-7a repression remains unaffected by the co-transfection of the negative control duplex.

Figure 29 Mimic-106a-5p prevents let-7 repression of putative let-7 targets. Relative fold changes of endogenous let-7 targets HMGA2, LIN28B, ARID3B and DICER1 and the miR-17 target ANKRD52, after transfecting HEK293T cells with 50 nM (a) of the let-7a duplex (blue), mimic-106a-5p (red) and randomized negative control duplex (black). The let-7a duplex is also co-transfected in a 1:1 ratio (b) or a 1:4 ratio (c) with either mimic-106a-5p (red) or negative control duplex (black). Fold changes are relative to mock transfections (n = 3) and normalized to housekeeping genes ACTB and GAPDH. Statistical significances of 2way ANOVA test with Dunnett’s multiple comparisons to gene-specific let-7a single transfections (blue) are shown on each graph. Error bars are SD. ns, not significant; ****: P ≤ 0.0001.


Similar to its effect on the endogenous LIN28B transcript (Figure 29: b, c), the mimic-106a-5p also prevents LIN28B repression at the protein level (Figure 30). Single transfections of both the let-7a duplex and siLIN28B induce a significant LIN28B downregulation. Co-transfection of the mimic-106a-5p relieves this downregulation only in combination with let-7a; it does not influence the repression mediated by siLIN28B.

Figure 30 Mimic-106a-5p prevents let-7 repression of LIN28B protein. Western blot showing levels of the putative let-7 target protein LIN28B and the housekeeping protein GAPDH after transfecting HEK293T cells with 50 nM of randomized negative control duplex (neg) or mimic-106a-5p. The let-7a duplex (blue) and siLIN28B (pink) are transfected at 10 nM, either alone or combined with the negative control or the mimic-106a-5p. The intensity of each band was normalized to the mock transfection (first band) and to the corresponding GAPDH signal, and is shown below the blots. Representative example out of two independent biological replicates.

Additionally, we investigated the effect of the mimic-106a-5p on the HMGA2 3′ UTR luciferase reporter bearing seven putative let-7 target sites, as previously described (Figure 28: a). In contrast to the significant and dose-dependent repression of the reporter by both siRenilla and the let-7a duplex (Figure 31: a), the mimic-106a-5p has no effect and performs similarly to the negative control duplex. The combination of the let-7a duplex with the negative control (Figure 31: b) shows a uniform ~50% repression of the relative luminescence, excluding any additive or suppressive effect mediated by the co-transfected negative control. In contrast, co-transfection with the mimic-106a-5p induces a significant and dose-dependent derepression of the reporter, up to the baseline luminescence level of 100%. The combination of siRenilla with the negative control (Figure 31: c) induces a downregulation of ~50%, similar to the combination of let-7a and the negative control (Figure 31: b). Interestingly, the mimic-106a-5p has no derepressive effect on the co-transfected siRenilla.


Figure 31 Mimic-106a-5p prevents repression of HMGA2 3′ UTR luciferase reporter. Relative luminescence of the HMGA2 3′ UTR luciferase reporter in HEK293T cells upon 0, 2.5, 10 or 40 nM transfections (a) of the positive siRNA control siRenilla (gray), let-7a duplex (blue), randomized negative control duplex (black) or miR-106a precursor (red). The let-7a duplex is co-transfected at 5 nM (b) with increasing concentrations (0, 2.5, 10, 40 nM) of either negative control duplex (black) or mimic-106a-5p (red), whereas siRenilla is co-transfected at 20 nM (c) with increasing concentrations (0, 2.5, 10, 40 nM) of either negative control duplex (black) or mimic-106a-5p (red). Renilla luminescence is normalized to Firefly luminescence and expressed relative to mock transfections (n = 3), set to 100%. Statistical significance from a two-way ANOVA with Dunnett’s multiple comparisons test against either the negative control (a) or the co-transfected negative control (b, c) is shown on each graph. Error bars are SD. Non-significant (ns) comparisons are omitted. **: P ≤ 0.01; ***: P ≤ 0.001; ****: P ≤ 0.0001.

2.6 Seed-homology sequences in human miRNAs

Following our initial observation of the let-7 seed-homology sequence in the 3′ end of the miR-17 family, we sought to investigate other miRNA families for 3′ end overlaps with known seed sequences. We first selected all annotated human mature miRNAs and isolated 1’733 unique seeds. We then screened the same mature miRNAs for full 7mer seed-overlaps in their 3′ ends (starting from nucleotide 9 of the mature miRNA sequence). We generated two additional seed-like data sets as negative controls: the reversed seeds (the same seed sequences in inverted orientation) and the non-seeds (all possible unique 7mer RNA sequences excluding the known seeds). Interestingly, seed sequences are found considerably more frequently in human miRNAs than non-seeds or reversed seeds (Figure 32). In fact, more than 70% of the known seeds have at least one 3′ end overlap, possibly suggesting that these frequent seed-homology sequences are actively used as part of an additional layer of the miRNA regulation mechanism.
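The screen described above can be sketched in a few lines of Python. This is only an illustration, not the analysis performed for Figure 32: the function names are hypothetical, the seed is assumed to be nucleotides 2-8 of the mature sequence, and the input is limited to the four human sequences shown in Figure 33 instead of the full set of annotated human miRNAs.

```python
def seed(mature, start=2, end=8):
    """Return the 7mer seed, assumed here to be nucleotides 2-8 (1-based)."""
    return mature[start - 1:end]


def three_prime_region(mature, first_nt=9):
    """Return the 3' part of the mature miRNA, starting at nucleotide 9."""
    return mature[first_nt - 1:]


def find_seed_overlaps(mirnas, first_nt=9):
    """List (miRNA, seed, seed owners, start position) for each 7mer seed
    found in the 3' region of any miRNA in the input."""
    seeds = {}
    for name, seq in mirnas.items():
        seeds.setdefault(seed(seq), []).append(name)
    hits = []
    for name, seq in mirnas.items():
        tail = three_prime_region(seq, first_nt)
        for seed_seq, owners in seeds.items():
            pos = tail.find(seed_seq)
            if pos != -1:
                # convert the 0-based offset in the tail to a 1-based position
                hits.append((name, seed_seq, owners, pos + first_nt))
    return hits


# Sequences as listed in Figure 33 (alignment gap removed from miR-34a-5p).
mirnas = {
    "hsa-miR-214-3p": "ACAGCAGGCACAGACAGGCAGU",
    "hsa-miR-34a-5p": "UGGCAGUGUCUUAGCUGGUUGU",
    "hsa-miR-34b-5p": "UAGGCAGUGUCAUUAGCUGAUUG",
    "hsa-miR-34c-5p": "AGGCAGUGUAGUUAGCUGAUUGC",
}
hits = find_seed_overlaps(mirnas)
for hit in hits:
    print(hit)
```

Under these assumptions, the only full 7mer overlap in this small input is the miR-34b-5p seed AGGCAGU at nucleotides 16-22 of miR-214-3p. The reversed-seed and non-seed controls could be screened with the same function by swapping the seed dictionary.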


Figure 32 Seed-overlap sequences are frequently found in 3′ ends of human miRNAs. Frequency plot of seed-overlaps in human miRNAs (starting from nucleotide 9 of the mature miRNA sequence) for non-seeds (black), seeds (red) or reversed seeds (blue). P values of the pairwise Mann-Whitney U tests are shown on the graph.

After manually curating these data, we selected a potentially interesting miR-34 family seed-homology sequence in the 3′ end of miR-214-3p (Figure 33: a). Considering the well-documented tumor-suppressive activity of the miR-34 family [163], and the high degree of conservation of this seed-overlap sequence in all miRBase-annotated miR-214-3p miRNAs (Figure 33: b), we were intrigued by possible implications of this homology. In fact, a recent report assigns oncogenic properties to the miR-214-3p miRNA [164], making its overlap with the tumor-suppressive miR-34 potentially meaningful.


a)
miRNA            sequence (5′-3′)
                 12345678
miR-214-3p       ACAGCAGGCACAGACAGGCAGU

miR-34a-5p       UGGCAGUGUC.UUAGCUGGUUGU
miR-34b-5p       UAGGCAGUGUCAUUAGCUGAUUG
miR-34c-5p       AGGCAGUGUAGUUAGCUGAUUGC
                 12345678

b)
miRNA            sequence (5′-3′)          length (nt)
hsa-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
mmu-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
oan-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
tgu-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
aca-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
cli-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
pbv-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
pal-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
rno-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
ssc-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
mml-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
mdo-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
ami-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
gmo-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
xla-miR-214-3p   -ACAGCAGGCACAGACAGGCAG-   21
cpo-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
dno-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
ocu-miR-214-3p   -ACAGCAGGCACAGACAGGCAGU   22
cgr-miR-214-3p   UACAGCAGGCACAGACAGG----   19
ssa-miR-214-3p   UACAGCAGGCACAGACAGGCAGA   23
cpi-miR-214-3p   UACAGCAGGCACAGACAGGCAG-   22
chi-miR-214-3p   UACAGCAGGCACAGACAGGC---   20
                  ******************

Figure 33 miR-34 seed-homology sequence in the 3′ end of miR-214-3p. (a) Human miR-214-3p and miR-34 family members are shown with ClustalW multiple sequence alignments. The 3′ end of miR-214-3p overlaps with the 5′ end of the miR-34 family, covering its seed sequence (yellow box). Extended seed (nucleotides 2-8, shown in red). (b) Multiple sequence alignments of all miRBase-annotated miR-214-3p miRNAs. Human member shown in red.


3 Discussion

Our initial observation of the let-7 seed-homology sequence in the 3′ end of the miR-17 family led us to hypothesize an overlapping and competitive targeting mechanism between the miR-17 and let-7 families, particularly in view of their opposing oncogenic [8, 9] and tumor-suppressive roles [10, 11]. We envisioned miR-17 miRNAs binding to let-7 target sites in a non-canonical fashion, lacking the typical seed-pairing and exploiting instead the complementarity of their 3′ end to the let-7 seed match. According to the general consensus in the field [3], functional miRNA-mediated gene silencing relies largely on seed-pairing to the target mRNA, while additional auxiliary base-pairing can play a supportive role, compensating for an imperfect seed match. Recently, non-canonical miRNA targeting has been extensively reevaluated and found to be much more widespread than previously thought [101, 107, 109-111]. However, according to current knowledge, it generally appears to be of minor importance for miRNA-mediated target repression [101]. Based on this, we asked whether the proposed non-canonical targeting by miR-106a would mediate functional repression of let-7 targets or, on the contrary, impair canonical let-7 seed-based silencing, possibly leading to derepression of let-7 targets.

Support for this hypothesis comes from a recent report on a novel class of miR-17 family target sites within open reading frames (ORFs), characterized by extensive base-pairing at the 3′ side of the miRNA and lacking conventional complementarity at the 5′ seed [165]. Interestingly, these atypical sites recruit the AGO-loaded miR-17 family members miR-17 and miR-20a without inducing canonical miRNA-mediated repression of the target mRNA. Instead, they cause gene silencing at the translational level by transient ribosome stalling. Additionally, the presence of such sites in the 3′ UTR did not repress mRNA transcripts either, supporting the crucial role of miRNA seed-pairing for efficient silencing of targets. This investigation provides mechanistic insight and a plausible explanation for the functional derepression of let-7 targets by miR-106a presented in this work. In fact, similarly to the sites causing ribosome stalling in the ORFs, non-canonical targeting of let-7 targets by miR-106a does not mediate their decay; instead, it possibly competes with conventional let-7 seed-pairing, neutralizing its repression of the targets.

To functionally support this hypothesis, we investigated changes in expression of the predicted let-7 targetome upon single and co-transfections of our competing miRNAs, miR-106a/b and let-7a. Considering the complexity of this approach, we first explored the canonical seed-based repressive activity of each miRNA on its own targetome. For this purpose, we used the conserved TargetScan-predicted targets to categorize and explain the observed changes in gene expression. Although expected, we were pleased to verify the repressive activity of each miRNA on its corresponding TargetScan-predicted targets, confirming the validity of our reagents in inducing canonical gene silencing. Additionally, a more stringent selection and ranking of the predicted targetome generally showed an even larger downregulation, reflecting the advances of modern miRNA target prediction algorithms. This performance was confirmed by the reduced repression observed with the seed-mutated miR-106a precursor. The related miR-106a and miR-106b achieved similar activities, especially on the top 100 predicted targets, which were repressed equally by both miRNAs, suggesting their correct maturation in the cell and loading into AGO. The mutations in the 3′ end of miR-106a generally had a minor impact on its canonical activity. In fact, all the mutants induced a significant repression of the targetome, similar to the wildtype precursor. As reported previously [107], the observed differences among them could be explained by the emerging contribution of miRNA 3′ end interactions to target site specificity.

To facilitate the analysis of miRNA co-transfections, we identified the predicted co-targets of both let-7 and miR-17 families and verified their overlapping repression. Investigation of these co-targets and the let-7-only targetome allowed us to explore the putative let-7-competing activity of miR-106a. In fact, miR-106a displayed two opposing effects when co-transfected with let-7a. In some cases it had a synergistic effect, producing an additive repression of the canonical co-targets greater than that of single let-7a or miR-106a transfections; in others, it almost completely neutralized the repression of let-7 targets by let-7a. This striking effect on the let-7 targetome was abolished upon mutation of the 3′ end of miR-106a, strongly suggesting that the interaction of the 3′ side with complementary let-7 target sites is required. The extent of this complementarity seems to correlate directly with the ability of miR-106a to hinder let-7 target repression. This is additionally supported by the remarkable difference between miR-106a and miR-106b, with the latter completely lacking any let-7-competing activity. While indispensable for the repression of canonical targets, the seed sequence of miR-106a played a minor role in neutralizing the repression of let-7 targets. However, according to [165], in addition to the extensive pairing at the 3′ end of the miRNA, a minimal requirement for the 5′ seed interaction is likely and target-site specific.

Follow-up experiments on two highly repressed let-7 targets confirmed our deep sequencing data. The miR-106a-5p mimic reproduced the let-7-neutralizing effect of the miR-106a precursor on endogenous transcripts, at the protein level and in luciferase reporter assays. Supporting our miRNA competition hypothesis, the reported derepression of let-7 targets appeared to be linked exclusively to the activity of let-7a. In fact, miR-106a was unable to neutralize target repression mediated by siRNAs, which are not expected to bind let-7 target sites in a miRNA fashion and instead induce siRNA-mediated cleavage upon extensive pairing with fully complementary sites.

The impairment of let-7 activity has been reported in the case of RNA-binding proteins, especially IGF2BPs (IGF2 mRNA-binding proteins). These bona fide oncofetal proteins are able to sequester let-7 target mRNAs, particularly HMGA2, LIN28B or their own mRNA transcripts, in RISC-deprived cytoplasmic granules, ‘shielding’ them from let-7-dependent repression [166, 167]. Herein, we propose a miRNA ‘safe-guarding’ mechanism of target mRNAs based on direct competition between miRNAs with opposing effects on overlapping target sites. Considering our data and the functionally opposing nature of the miR-17 and let-7 families, we can appreciate the elegance of the competing interaction between miR-106a and let-7a, which can even be seen as a fierce duel. As a tumor suppressor, let-7 is no doubt the Good, the Clint Eastwood of the miRNA world. It represses prominent oncogenes and fights cancer. By extension, oncogenes are definitely the Bad guys. And to complete this analogy with the famous trio from the epic Spaghetti Western film directed by Sergio Leone, miR-106a is the Ugly of the situation. It does not repress let-7 targets and additionally ‘guards’ them against let-7-mediated silencing, preventing the downregulation of oncogenes and possibly inducing carcinogenic transformation. Furthermore, the resulting derepression of LIN28B would be expected to impair let-7 biogenesis, leading to a further loss of let-7 activity.

The frequent occurrence of seed-homology motifs in human miRNAs possibly suggests their active use as part of a competitive mechanism acting on a shared targetome, similar to miR-106a and let-7a. For example, we identified a potentially meaningful seed-overlap between the tumor-suppressive miR-34 family and the 3′ end of the oncogenic miR-214-3p.


4 Outlook

The hypothesis that non-canonical targeting by miR-106a impairs let-7 repression is supported by our functional data. Further experiments are required to gain mechanistic insight into this interaction. Direct evidence of non-canonical miR-106a targeting overlapping a conventional let-7 target site is still missing. One possible approach would be the re-analysis of publicly available AGO-CLIP data containing miRNA-target chimeras, which can provide binding and context information at bona fide miRNA target sites. This could in principle provide a systematic way to explore overlapping targeting by any cellular miRNA and additionally to predict the binding modes (seed-based or non-canonical) at shared target sites. However, the abundance of chimeric reads in older CLIP data sets is very low. Recently, dedicated methods have increased the yield, allowing the identification of thousands of reproducible hybrid reads. One could attempt to ‘dissect’ such data sets and develop a protocol to identify overlapping chimeras that might support our hypothesis of competing miRNAs. Additionally, we are starting a collaboration with Prof. Ian MacRae and Dr. Luca Gebert at The Scripps Research Institute in La Jolla, California, to investigate AGO-miR-106a targeting on short functional let-7 target sites identified previously in a luciferase reporter screen. Interestingly, such a reconstituted in vitro binding assay would also allow us to explore competitive binding with AGO-let-7 and the possibility that miR-106a is unloaded from AGO upon extensive pairing of its 3′ end with the let-7 seed match, as shown in [125]. This would possibly add a further layer of control, since unloaded miRNAs are rapidly degraded.

Our comprehension of miRNA-target interactions has increased in the last decade. Beyond the predominance of the canonical seed-pairing, an ever-growing number of non-canonical miRNA targeting modes have been identified, suggesting a much more dynamic and heterogeneous mechanism of miRNA-mediated gene regulation than previously proposed. The refinement and further expansion of the spectrum of miRNA-mRNA interactions will likely continue.


5 Contributions

Mauro Zimmermann synthesized, purified and validated all the miRNAs and siRNAs.

Philippe Demougin measured the integrity of total RNA, prepared the DNA libraries and submitted them for sequencing.

Dr. Alexander Kanitz performed the raw analysis and mapping of the transcriptomic data and provided significant input and help with the analysis of the cumulative plots. Additionally, he performed the analysis of seed-overlap sequences within the 3′ ends of human miRNAs.


6 Side projects

6.1 RNAi activity of hybrid duplexes with parallel orientation

Abstract Herein [168], we report a new class of RNAi trigger molecules based on the unconventional parallel hybridization of two oligonucleotide chains. We have prepared and studied several parallel stranded (ps) duplexes, in which the parallel orientation is achieved through incorporation of isoguanine and isocytosine to form reverse Watson-Crick base pairs in ps-DNA:DNA, ps-DNA:RNA, ps-(DNA-2′F-RNA):RNA, and ps-DNA:2′F-RNA duplexes. The formation of these duplexes was confirmed by UV melting experiments, FRET and CD studies. In addition, NMR structural studies were conducted on a ps-DNA:RNA hybrid for the first time. Finally, we provide evidence for the unprecedented finding that ps-DNA:RNA and ps-DNA:2′F-RNA hybrids can engage the RNAi pathway to silence gene expression in vitro.

Individual contribution:

This work was part of a collaborative project with the laboratory of Prof. Masad J. Damha from McGill University in Montreal, Canada. Matije Lucic designed and performed luciferase reporter assays, calculated the EC50 values, and performed the AGO2 knockdown experiment and Western blot.

This work was included in the following publication [168]:


6.2 Targeting miR-122 in RISC with conjugated antimiRs

Abstract Herein [169], we synthesized a miR-122 antimiR library in which drug-like fragments were site-specifically introduced to short 2′-O-methyl-RNAs. At some sites selected fragments elevated cellular antimiR activity to that of an unmodified 23-mer antimiR, whereas at others the same fragments abolished activity. The potency of the antimiRs correlated with uptake into miRISC.

Individual contribution:

This project was designed and initiated by Dr. A. Brunschweiger and Dr. L. F. R. Gebert. Matije Lucic carried out part of the luciferase assays, designed and performed the RISC-uptake assay through AGO2 immunoprecipitation and subsequently quantified the antimiRs via chemical-ligation qPCR.

This work was included in the following publication [169]:


6.3 Antagonizing Lin28-pre-let-7 interaction with ‘looptomirs’

Abstract MicroRNAs (miRNAs) originate from stem-loop-containing precursors (pre-miRNAs, pri-miRNAs) and mature by means of the Drosha and Dicer endonucleases and their associated factors. The let-7 miRNAs have prominent roles in developmental differentiation and in regulating cell proliferation. In cancer, the tumor suppressor function of let-7 is abrogated by overexpression of Lin28, one of several RNA-binding proteins that regulate let-7 biogenesis by interacting with conserved motifs in let-7 precursors close to the Dicer cleavage site. Herein [170], we identified a binding site for short modified oligoribonucleotides (‘looptomirs’) overlapping that of Lin28 in pre-let-7a-2. These looptomirs selectively antagonize the docking of Lin28, but still permit processing of pre-let-7a-2 by Dicer. Looptomirs restored synthesis of mature let-7 and inhibited growth and clonogenic potential in Lin28-overexpressing hepatocarcinoma cells, thereby demonstrating a promising new means to rescue defective miRNA biogenesis in Lin28-dependent cancers.

Individual contribution:

This project was initiated and led by Dr. M. Roos. Matije Lucic performed the in vitro Dicer assay and analyzed the processing of pre-let-7a-2 by LC-MS.

This work was included in the following publication [170]:


6.4 Mono- and bis-labeling of pre-miRNAs

Abstract Herein [171], a chemical method for the post-synthetic labeling of pre-miRNAs on solid support using easily accessible reagents was developed. The procedure was employed to generate a library of 31 pre-microRNAs carrying labels commonly used in chemical biology, including Cy3, trioxalen, biotin, and BHQ-1.

Individual contribution:

This project was designed and led by Dr. U. Pradère. Matije Lucic immobilized biotinylated pre-miRNAs on magnetic streptavidin beads and acquired the Cy3-signal by fluorescence microscopy.

This work was included in the following publication [171]:


7 Materials and methods

7.1 Materials

7.1.1 miRNAs and siRNAs

Unless stated otherwise, Mauro Zimmermann synthesized, purified and validated all the miRNAs and siRNAs. The mismatch mutations were

name                                 sequence (5′-3′)            notes

randomized negative control duplex   GUNUGNGUUNNUUANNCACTT       As described in [162].
                                     GUGNNUAANNAACNCANACTT       N, randomized nucleotide.

hsa-let-7a-1 duplex                  UGAGGUAGUAGGUUGUAUAGUU      miRBase accession:
                                     CUAUACAAUCUACUGUCUUUC       MIMAT0000062, MIMAT0004481

hsa-pre-miR-106a                     AAAAGUGCUUACAGUGCAGGUAGCU   miRBase accession: MI0000113
                                     UUUUGAGAUCUACUGCAAUGUAAGC   Mature sequences underlined.
                                     ACUUCUUAC

hsa-pre-miR-106b                     UAAAGUGCUGACAGUGCAGAUAGUG   miRBase accession: MI0000734
                                     GUCCUCUCCGUGCUACCGCACUGUG   Mature sequences underlined.
                                     GGUACUUGCUGC

hsa-pre-miR-106a seed mutant*        AAACGUUCUUACAGUGCAGGUAGCU   Mutation (red) in the seed region.
                                     UUUUGAGAUCUACUGCAAUGUAAGA   Mature sequences underlined.
                                     ACGUCUUAC

hsa-pre-miR-106a 3′ end mutant #1*   AAAAGUGCUUACAGUGCCGCUAGCU   Mutation (red) in the 3′ end region.
                                     UUUUGAGAUCUACGGCAAUGUAAGC   Mature sequences underlined.
                                     ACUUCUUAC

hsa-pre-miR-106a 3′ end mutant #2*   AAAAGUGCUUACAGUUCCGGUAGCU   Mutation (red) in the 3′ end region.
                                     UUUUGAGAUCUACGGAAAUGUAAGC   Mature sequences underlined.
                                     ACUUCUUAC

hsa-pre-miR-106a 3′ end mutant #3*   AAAAGUGCUUACAGUUACGUGCGCU   Mutation (red) in the 3′ end region.
                                     UUUUGAGAUCGCCGUAAAUGUAAGC   Mature sequences underlined.
                                     ACUUCUUAC

mimic hsa-miR-106a-5p                AAAAGUGCUUACAGUGCAGGUAG     Dharmacon miRIDIAN mimic
                                                                 (#C-300526-07-0005)

siRenilla                            GAGCGAAGAGGGCGAGAAAUU       As described in [24].
                                     AAUUUCUCGCCCUCUUCGCUC

siLIN28B                             AAAUCCUUCCAUGAAUAGUTT       As described in [172].
                                     ACUAUUCAUGGAAGGAUUUTT

* The mutant miR-106a hairpin precursors were carefully designed to maintain the same predicted secondary structure as the wildtype pre-miR-106a and ensure equal Dicer processing (Table S10). The mutations in the 5p strands imply compensatory mutations in the 3p strands to maintain the hairpin structure.


7.1.2 Plasmids

name                                 insert                        notes

psiCHECK-2: empty plasmid            none                          Promega, #C8021

psiCHECK-2: HMGA2 3′ UTR reporter    wildtype HMGA2 3′ UTR         Addgene plasmid #14785
                                     (cloned from [149]).

psiCHECK-2: miR-17 reporter          TTTAGAAAAAGTCTAAACATTTAGGGC   3x miR-17 family
                                     ACTTTAAAGGAGACACTCCTAGCCTGG   TargetScan-predicted target
                                     CCCCTAACCACGCACTTTAACCTTGCC   sites underlined (artificially
                                     TAAAGCACTTGCTTCAAGTAATAAGCA   merged together).
                                     CTTTTGTGAAAA

7.1.3 RT-qPCR primers

The RT-qPCR primers were selected from the Roche Universal Probe Library and synthesized by Microsynth AG (Balgach, CH).

gene symbol, transcript ID   sequence (5′-3′)                Roche Probe Library #

ACTB, NM_001101.3            L: CCAACCGCGAGAAGATGA           #64 (cat. no. 04688635001)
                             R: CCAGAGGCGTACAGGGATAG

ANKRD52, NM_173595.3         L: GGACGAGCCACTGAAGGAG          #4 (cat. no. 04685016001)
                             R: GGTCTGCACCGTTATCCAGT

ARID3B, NM_006465.2          L: GATGGCACCACCTATGCAG          #61 (cat. no. 04688597001)
                             R: GAGCAGACCCCGTGATGA

DICER1, NM_177438.2          L: GTCCGATGGTTCTCGAAGG          #47 (cat. no. 04688074001)
                             R: GCAAAGCAGGGCTTTTCA

GAPDH, NM_002046.3           L: AGCCACATCGCTCAGACAC          #60 (cat. no. 04688589001)
                             R: GCCCAATACGACCAAATCC

HMGA2, NM_003483.4           L: TCCCTCTAAAGCAGCTCAAAA        #34 (cat. no. 04687671001)
                             R: ACTTGTTGTGGCCATTTCCT

LIN28B, NM_001004317.2       L: GAAAAGAAAACCAAAGGGAGATAGA    #49 (cat. no. 04688104001)
                             R: GAGGTAGACTACATTCCTTAGCATGA

7.1.4 Antibodies

name                  notes
anti-GAPDH (OTID9)    Origene, #TA802519
anti-LIN28B (#4196)   CST, #4196S


7.2 Methods

7.2.1 Cultivation and maintenance of mammalian cell lines

HEK293T cells (ATCC, #CRL-3216) were grown as a monolayer and maintained in Dulbecco’s Modified Eagle’s medium (Gibco, #31966021) supplemented with 10% FBS (Gibco, #10270106) and kept at 37°C in a 5% CO2 incubator.

7.2.2 Seeding of the cells

HEK293T cells were seeded 6-8 hours before transfection in either 12-well transparent plates (150’000 cells/well; used for RNA extraction or lysates for Western blot) or 96-well opaque plates (10’000 cells/well; used for luciferase assays).

7.2.3 Transient transfections of miRNAs and siRNAs

All reagents were transfected at the given concentrations (usually up to 50 nM) using Lipofectamine 2000 (Invitrogen, #11668027) according to the manufacturer’s instructions. Dilutions were performed after ~25 min of incubation of the reagents with the transfection reagent. The complexes were then transferred onto the cells and incubated for up to 48 hours.

7.2.4 RT-qPCR

HEK293T cells were washed twice with ice-cold PBS and lysed in 750 µl of Trizol reagent (Invitrogen, #15596-018). The RNA was extracted according to the manufacturer’s instructions and used for cDNA synthesis with the TaqMan microRNA reverse-transcription kit (Thermo Fisher Scientific, #4366596) according to the manufacturer’s protocol, using a 1:1 mix of random hexamer (Promega, #C1181) and oligo(dT)15 primers (Promega, #C1101). qPCR reactions were performed using the KAPA SYBR FAST qPCR master mix (Kapa Biosystems, #KK4618) on a LightCycler 480 (Roche). Each reaction was carried out in 3 technical replicates. The 2^(-ΔΔCt) method was used to analyze the data. The housekeeping genes ACTB and GAPDH were used for normalization.
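As an illustration of the 2^(-ΔΔCt) calculation, the following minimal Python sketch computes a fold change relative to the mock transfection. It is not the evaluation script used in this work: the function name and the Ct values are hypothetical, and averaging the Ct values of the two housekeeping genes is an assumption, since the way ACTB and GAPDH were combined is not specified here.

```python
from statistics import mean


def relative_fold_change(ct_target_treated, ct_refs_treated,
                         ct_target_mock, ct_refs_mock):
    """Fold change of a target transcript relative to mock (2^-ddCt).

    ct_refs_* are lists of Ct values of the housekeeping genes; their
    arithmetic mean is used as the reference Ct (an assumption).
    """
    delta_treated = ct_target_treated - mean(ct_refs_treated)
    delta_mock = ct_target_mock - mean(ct_refs_mock)
    delta_delta = delta_treated - delta_mock
    return 2 ** (-delta_delta)


# Hypothetical Ct values: target, then [ACTB, GAPDH].
fold = relative_fold_change(26.0, [18.0, 20.0], 25.0, [18.0, 20.0])
print(fold)
```

In this made-up example the target Ct increases by one cycle while the reference Ct is unchanged, which corresponds to a fold change of 0.5, i.e. a 50% repression relative to mock.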

7.2.5 Western blot

HEK293T cells were washed twice with ice-cold PBS and lysed in RIPA buffer (Sigma-Aldrich, #R0278). Protein concentrations were determined using the Pierce BCA protein assay kit (Thermo Fisher Scientific, #23227). ~20–30 µg of total protein were separated on an SDS gel and transferred to a PVDF western blotting membrane (Thermo Fisher Scientific, #88518). Non-specific membrane binding was blocked for 60 min at room temperature with 5% milk. Membranes were incubated overnight at 4°C with primary antibodies against LIN28B (1:1’000) or GAPDH (1:10’000). After washing, membranes were incubated with appropriate HRP-conjugated secondary antibodies for 2 hours at room temperature. Signals generated by the Amersham ECL Western blotting detection reagent (GE Healthcare, #RPN2109) were captured by a cooled CCD camera (Bio-Rad, Hercules).

7.2.6 Cloning and transfections of luciferase reporter plasmids

The psiCHECK-2 dual-luciferase reporter plasmid (Promega, #C8021) was digested with NotI (Promega, #R6431) and XhoI (Promega, #R6161) and the DNA insert was cloned in the 3′ UTR of the Renilla gene according to the manufacturer’s instructions. The plasmid was confirmed by sequencing and transfected at 20 ng/well (in a 96-well plate format) using jetPEI (Polyplus, #101-10) according to the manufacturer’s instructions.

7.2.7 Luciferase assay

HEK293T cells were seeded in opaque 96-well plates at 10’000 viable cells/well (in 80 µl/well total medium). miRNAs were transfected in triplicate ca. 12-16 hours post-seeding and the reporter plasmid ca. 24 hours post-seeding. 48 hours later, supernatants were removed and luminescence was measured on a microtiter plate reader using the Dual-Glo Luciferase Assay System (Promega, #E2940) according to the manufacturer’s instructions. Renilla luminescence was normalized to firefly luminescence and to the corresponding mock control, with the relative luciferase activity set to either 0 or 100%.
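The normalization described above can be sketched as follows. This is a simplified illustration with hypothetical function names and made-up readings; it assumes that the mock control is summarized by the mean Renilla/firefly ratio of its replicate wells and that mock is set to 100%.

```python
from statistics import mean


def relative_luminescence(renilla, firefly, mock_renilla, mock_firefly):
    """Renilla/firefly ratio per well, relative to the mean mock ratio (= 100%)."""
    mock_ratio = mean(r / f for r, f in zip(mock_renilla, mock_firefly))
    return [100.0 * (r / f) / mock_ratio for r, f in zip(renilla, firefly)]


# Hypothetical raw readings from triplicate wells.
mock_renilla, mock_firefly = [1000.0, 1100.0, 900.0], [500.0, 550.0, 450.0]
sample_renilla, sample_firefly = [500.0, 520.0, 480.0], [500.0, 520.0, 480.0]

relative = relative_luminescence(sample_renilla, sample_firefly,
                                 mock_renilla, mock_firefly)
print(relative)
```

With these made-up numbers every sample well has half the mock ratio, so each value corresponds to 50% relative luminescence. Dividing by the firefly signal first corrects for well-to-well differences in transfection efficiency and cell number before the comparison to mock.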

7.2.8 RNA integrity and quantification

Total RNA was quantified by fluorometry using the QuantiFluor RNA System (Promega, #E3310) and quality-checked on the Bioanalyzer (Agilent Technologies) using the RNA 6000 Nano Chip (Agilent, #5067-1511). Average RIN (RNA Integrity Number) was 8.8.


7.2.9 Library preparation

Library preparation was performed with 200 ng total RNA using the TruSeq Stranded mRNA Library Prep Kit High Throughput (Illumina, #RS-122-2103). Libraries were quality-checked on the Fragment Analyzer (Advanced Analytical) using the Standard Sensitivity NGS Fragment Analysis Kit (Advanced Analytical, #DNF-473) revealing excellent quality of libraries (average concentration was 111±19 nmol/l and average library size was 379±26 base pairs). Samples were pooled to equal molarity. The pool was quantified by fluorometry using the QuantiFluor ONE dsDNA System (Promega, #E4871) in order to be adjusted to 1.4 pM and used for clustering on the NextSeq 500 instrument (Illumina).

7.2.10 Clustering and sequencing

Samples were sequenced as single reads of 76 bases (in addition: 8 bases for index 1 and 8 bases for index 2) using the NextSeq 500 High Output Kit 75-cycles (Illumina, #FC-404-1005). Primary data analysis was performed with the Illumina RTA version 2.4.11 and base calling with bcl2fastq version 2.20.0.422. Two NextSeq runs were performed to obtain enough reads (on average per sample: 20.1±1.5 million pass-filter reads).

7.2.11 Analysis of sequencing data sets

RNA-Seq libraries were sequenced for 76 cycles on an Illumina NextSeq machine. Reads were processed with Cutadapt [1] (v1.8.3; https://github.com/marcelm/cutadapt) to remove 3′ adapters (GATCGGAAGAGCACA) and poly(A) tail residues. Only reads that were at least 20 nucleotides long after trimming were retained (Cutadapt option --minimum-length=20). Transcript abundances were quantified from the processed reads with kallisto [2] (v0.42.3; https://pachterlab.github.io/kallisto/) based on the sequences of all transcripts annotated by Ensembl [3] (release 89; http://may2017.archive.ensembl.org/Homo_sapiens/Info/Index). In the absence of more precise information, fragment length distribution parameters were set to --fragment-length=300 and --sd=100 for all samples. Furthermore, reads were aligned to the human genome (GRCh38, also from Ensembl) with the STAR aligner [4] (v2.4.1c; https://github.com/alexdobin/STAR). To facilitate spliced mapping of reads spanning multiple exons, human gene annotations (Ensembl release 89) were supplied to STAR together with the following non-default parameters: --sjdbOverhang=75, --twopassMode="Basic", --twopass1readsN=-1. For each library, more than 98% of reads could be mapped, with at least 88% of reads mapping “uniquely” (i.e. only a single alignment was reported, with all other potential alignments having a lower alignment score or a higher edit distance). For differential gene expression analyses, the estimated read counts for each transcript (as produced by kallisto) were summed for each gene and supplied to edgeR [5] (v3.12.1; https://bioconductor.org/packages/release/bioc/html/edgeR.html). Prior to the principal component analysis, gene expression levels, obtained by summing kallisto-estimated transcript abundances (in transcripts per million; TPM) for each gene, were log2-transformed, arranged in a gene-by-sample matrix and zero-centered by columns and rows.
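The gene-level aggregation and the preparation of the matrix for principal component analysis described above can be sketched as follows. This is a minimal illustration rather than the analysis code used for this work: the function names, the toy transcript and gene identifiers, and in particular the pseudocount added before the log2 transformation are assumptions, since the text does not state how zero abundances were handled.

```python
import math
from collections import defaultdict


def sum_by_gene(transcript_values, tx2gene):
    """Sum per-transcript values (estimated counts or TPM) for each gene."""
    gene_values = defaultdict(float)
    for transcript, value in transcript_values.items():
        gene_values[tx2gene[transcript]] += value
    return dict(gene_values)


def log2_matrix(matrix, pseudocount=1.0):
    """log2-transform a gene-by-sample matrix (pseudocount is an assumption)."""
    return [[math.log2(x + pseudocount) for x in row] for row in matrix]


def double_center(matrix):
    """Zero-center a matrix by columns and rows.

    Subtracting the row mean and the column mean and adding back the grand
    mean leaves every row and every column with a mean of zero.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    grand_mean = sum(row_means) / n_rows
    return [[matrix[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(n_cols)]
            for i in range(n_rows)]


if __name__ == "__main__":
    # Toy example with hypothetical identifiers
    tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
    sample = {"tx1": 2.0, "tx2": 3.0, "tx3": 7.0}
    print(sum_by_gene(sample, tx2gene))

    tpm = [[0.0, 1.0, 3.0], [7.0, 15.0, 31.0]]  # genes x samples
    for row in double_center(log2_matrix(tpm)):
        print([round(x, 3) for x in row])
```

The centered matrix would then be passed to a principal component routine of choice; that step is omitted here.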

References:

1. https://journal.embnet.org/index.php/embnetjournal/article/view/200/479

2. https://www.nature.com/articles/nbt.3519

3. https://academic.oup.com/nar/article/45/D1/D635/2605734

4. https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/bts635

5. https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btp616

7.2.12 Library ID

The RNA sequencing files (fastq files) are stored on the group server of Prof. Mihaela Zavolan at the University of Basel. The following table lists the library IDs (the original names of the sequencing data sets) and the figures of the Results section in which they are used. Each data set (treatment) consists of biological triplicates (n = 3) with the suffixes _1, _2 and _3 or the suffixes _4, _5 and _6.

figures      library ID (fastq files)

Figure 17    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 10 nM: let7_1, _2, _3
             let-7a duplex 50 nM: let7_4, _5, _6

Figure 18    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             miR-106a precursor: 106a_1, _2, _3
             miR-106b precursor: 106b_1, _2, _3

Figure 19    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             miR-106a precursor: 106a_1, _2, _3
             seed-mutated miR-106a precursor: Smut_1, _2, _3

Figure 20    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             miR-106a precursor: 106a_1, _2, _3
             3′ end-mutated miR-106a precursor #1: ASmut1_1, _2, _3
             3′ end-mutated miR-106a precursor #2: ASmut2_1, _2, _3
             3′ end-mutated miR-106a precursor #3: ASmut3_1, _2, _3

Figure 21    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 50 nM: let7_4, _5, _6
             miR-106a precursor: 106a_1, _2, _3
             miR-106b precursor: 106b_1, _2, _3

Figure 22    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 10 nM: let7_1, _2, _3
             let-7a duplex + miR-106a precursor: 106a_let7_1, _2, _3
             let-7a duplex + miR-106b precursor: 106b_let7_1, _2, _3

Figure 23    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 10 nM: let7_1, _2, _3
             let-7a duplex + miR-106a precursor: 106a_let7_1, _2, _3
             let-7a duplex + seed-mutated miR-106a precursor: Smut_let7_1, _2, _3

Figure 24    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 10 nM: let7_1, _2, _3
             let-7a duplex + miR-106a precursor: 106a_let7_1, _2, _3
             let-7a duplex + 3′ end-mutated miR-106a precursor #1: ASmut1_let7_1, _2, _3

Figure 25    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 10 nM: let7_1, _2, _3
             let-7a duplex + miR-106a precursor: 106a_let7_1, _2, _3
             let-7a duplex + 3′ end-mutated miR-106a precursor #3: ASmut3_let7_1, _2, _3

Figure 26    mock: mock_1, _2, _3
             randomized negative control duplex: neg_1, _2, _3
             let-7a duplex 10 nM: let7_1, _2, _3
             let-7a duplex + miR-106a precursor: 106a_let7_1, _2, _3
             let-7a duplex + 3′ end-mutated miR-106a precursor #2: ASmut2_let7_1, _2, _3
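The shorthand used in the table (for example "let7_1, _2, _3") can be expanded into full library IDs as sketched below. The helper name is hypothetical, and the sketch assumes only the naming pattern visible in the table: a shared prefix followed by an underscore and the replicate number. It makes no assumption about file extensions or paths on the server.

```python
def expand_library_ids(shorthand):
    """Expand e.g. 'let7_1, _2, _3' into ['let7_1', 'let7_2', 'let7_3']."""
    parts = [p.strip() for p in shorthand.split(",")]
    # The prefix is everything before the last underscore of the first ID.
    prefix = parts[0].rsplit("_", 1)[0]
    return [parts[0]] + [prefix + suffix for suffix in parts[1:]]


if __name__ == "__main__":
    for entry in ("mock_1, _2, _3", "let7_4, _5, _6", "ASmut1_let7_1, _2, _3"):
        print(expand_library_ids(entry))
```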

7.2.13 miRNA target predictions

Predicted targets of miR-17 and let-7 families were obtained from TargetScan (v7.2; http://www.targetscan.org/vert_72/).


7.2.14 Statistical analysis

Statistical analysis was performed with GraphPad Prism and R. An independent Student’s t-test was used to compare two groups, whereas two-way ANOVA with Dunnett’s multiple comparison test was applied to compare more than two groups. For cumulative distributions, the P values and D statistics between individual treatments were calculated with the Kolmogorov-Smirnov test (KS test). The type and description of the statistical analysis are detailed in the caption of each figure.
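As an illustration of the D statistic reported for cumulative distributions, the two-sample Kolmogorov-Smirnov test can be sketched as follows. This is not the code used for the analyses, which were run in GraphPad Prism and R. The function names and the example values are hypothetical, and the P value uses the standard asymptotic series, which can differ from the exact or corrected values that statistical packages may report for small samples.

```python
import math


def ks_statistic(sample_a, sample_b):
    """Largest vertical distance between the two empirical CDFs."""
    xs, ys = sorted(sample_a), sorted(sample_b)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        # Step both CDFs past the current value (this handles ties).
        while i < n and xs[i] == v:
            i += 1
        while j < m and ys[j] == v:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue_asymptotic(d, n, m):
    """Asymptotic two-sided P value: 2 * sum (-1)^(k-1) exp(-2 k^2 lambda^2)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series converges slowly here and the P value is effectively 1.
        return 1.0
    total = 0.0
    for k in range(1, 101):
        term = (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        total += term
        if abs(term) < 1e-12:
            break
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Hypothetical log2 fold changes for two sets of transcripts
    control = [-0.1, 0.0, 0.05, 0.1, 0.2]
    targets = [-0.8, -0.5, -0.4, -0.3, 0.0]
    d = ks_statistic(control, targets)
    p = ks_pvalue_asymptotic(d, len(control), len(targets))
    print(f"D = {d:.3f}, asymptotic P = {p:.4f}")
```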

ETH Zurich Matije Lucic 80 References

References

1. Lee, R.C., R.L. Feinbaum, and V. Ambros, The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell, 1993. 75(5): p. 843-54. 2. Wightman, B., I. Ha, and G. Ruvkun, Posttranscriptional regulation of the heterochronic gene lin-14 by lin-4 mediates temporal pattern formation in C. elegans. Cell, 1993. 75(5): p. 855-62. 3. Bartel, D.P., Metazoan MicroRNAs. Cell, 2018. 173(1): p. 20-51. 4. Ghildiyal, M. and P.D. Zamore, Small silencing RNAs: an expanding universe. Nat Rev Genet, 2009. 10(2): p. 94-108. 5. Kloosterman, W.P. and R.H. Plasterk, The diverse functions of microRNAs in animal development and disease. Dev Cell, 2006. 11(4): p. 441-50. 6. Stefani, G. and F.J. Slack, Small non-coding RNAs in animal development. Nature Reviews Molecular Cell Biology, 2008. 9(3): p. 219-230. 7. Kozomara, A. and S. Griffiths-Jones, miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Research, 2014. 42(D1): p. D68-D73. 8. Mogilyansky, E. and I. Rigoutsos, The miR-17/92 cluster: a comprehensive update on its genomics, genetics, functions and increasingly important and numerous roles in health and disease. Cell Death Differ, 2013. 20(12): p. 1603-14. 9. Mendell, J.T., miRiad roles for the miR-17-92 cluster in development and disease. Cell, 2008. 133(2): p. 217-22. 10. Takamizawa, J., et al., Reduced expression of the let-7 microRNAs in human lung cancers in association with shortened postoperative survival. Cancer Res, 2004. 64(11): p. 3753-6. 11. Balzeau, J., et al., The LIN28/let-7 Pathway in Cancer. Front Genet, 2017. 8: p. 31. 12. Berezikov, E., Evolution of microRNA diversity and regulation in animals. Nat Rev Genet, 2011. 12(12): p. 846-60. 13. Hertel, J., et al., The expansion of the metazoan microRNA repertoire. BMC Genomics, 2006. 7: p. 25. 14. Bartel, D.P., MicroRNAs: target recognition and regulatory functions. Cell, 2009. 136(2): p. 215-33. 15. Kozomara, A. and S. 
Griffiths-Jones, miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res, 2011. 39(Database issue): p. D152-7. 16. Ambros, V., et al., A uniform system for microRNA annotation. RNA, 2003. 9(3): p. 277-9. 17. Lagos-Quintana, M., et al., Identification of novel genes coding for small expressed RNAs. Science, 2001. 294(5543): p. 853-8. 18. Lau, N.C., et al., An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science, 2001. 294(5543): p. 858-62. 19. Lee, R.C. and V. Ambros, An extensive class of small RNAs in Caenorhabditis elegans. Science, 2001. 294(5543): p. 862-4.

ETH Zurich Matije Lucic 81 References

20. Ha, M. and V.N. Kim, Regulation of microRNA biogenesis. Nature Reviews Molecular Cell Biology, 2014. 15(8): p. 509-524. 21. Treiber, T., N. Treiber, and G. Meister, Regulation of microRNA biogenesis and its crosstalk with other cellular pathways. Nature Reviews Molecular Cell Biology, 2018. 22. Kawamata, T. and Y. Tomari, Making RISC. Trends Biochem Sci, 2010. 35(7): p. 368- 76. 23. Kobayashi, H. and Y. Tomari, RISC assembly: Coordination between small RNAs and Argonaute proteins. Biochim Biophys Acta, 2016. 1859(1): p. 71-81. 24. Guennewig, B., et al., Synthetic pre-microRNAs reveal dual-strand activity of miR-34a on TNF-alpha. RNA, 2014. 20(1): p. 61-75. 25. Yang, X., et al., Both mature miR-17-5p and passenger strand miR-17-3p target TIMP3 and induce prostate tumor growth and invasion. Nucleic Acids Res, 2013. 41(21): p. 9688-704. 26. Shan, S.W., et al., Mature miR-17-5p and passenger miR-17-3p induce hepatocellular carcinoma by targeting PTEN, GalNT7 and vimentin in different signal pathways. J Cell Sci, 2013. 126(Pt 6): p. 1517-30. 27. Lu, C., et al., Elucidation of the small RNA component of the transcriptome. Science, 2005. 309(5740): p. 1567-9. 28. Ruby, J.G., et al., Large-scale sequencing reveals 21U-RNAs and additional microRNAs and endogenous siRNAs in C. elegans. Cell, 2006. 127(6): p. 1193-207. 29. Karagkouni, D., et al., DIANA-TarBase v8: a decade-long collection of experimentally supported miRNA-gene interactions. Nucleic Acids Res, 2018. 46(D1): p. D239-D245. 30. Carthew, R.W. and E.J. Sontheimer, Origins and Mechanisms of miRNAs and siRNAs. Cell, 2009. 136(4): p. 642-55. 31. Gebert, L.F.R. and I.J. MacRae, Regulation of microRNA function in animals. Nature Reviews Molecular Cell Biology, 2018. 32. Lee, Y., et al., MicroRNA genes are transcribed by RNA polymerase II. EMBO J, 2004. 23(20): p. 4051-60. 33. Nicholson, A.W., Ribonuclease III mechanisms of double-stranded RNA cleavage. Wiley Interdiscip Rev RNA, 2014. 5(1): p. 31-48. 34. 
Lee, Y., et al., The nuclear RNase III Drosha initiates microRNA processing. Nature, 2003. 425(6956): p. 415-9. 35. Denli, A.M., et al., Processing of primary microRNAs by the Microprocessor complex. Nature, 2004. 432(7014): p. 231-5. 36. Gregory, R.I., et al., The Microprocessor complex mediates the genesis of microRNAs. Nature, 2004. 432(7014): p. 235-40. 37. Bohnsack, M.T., K. Czaplinski, and D. Gorlich, Exportin 5 is a RanGTP-dependent dsRNA-binding protein that mediates nuclear export of pre-miRNAs. RNA, 2004. 10(2): p. 185-91. 38. Lund, E., et al., Nuclear export of microRNA precursors. Science, 2004. 303(5654): p. 95-8. 39. Yi, R., et al., Exportin-5 mediates the nuclear export of pre-microRNAs and short hairpin RNAs. Genes Dev, 2003. 17(24): p. 3011-6.

ETH Zurich Matije Lucic 82 References

40. Kim, Y.K., B. Kim, and V.N. Kim, Re-evaluation of the roles of DROSHA, Export in 5, and DICER in microRNA biogenesis. Proc Natl Acad Sci U S A, 2016. 113(13): p. E1881-9. 41. Grishok, A., et al., Genes and mechanisms related to RNA interference regulate expression of the small temporal RNAs that control C. elegans developmental timing. Cell, 2001. 106(1): p. 23-34. 42. Ketting, R.F., et al., Dicer functions in RNA interference and in synthesis of small RNA involved in developmental timing in C. elegans. Genes Dev, 2001. 15(20): p. 2654-9. 43. MacRae, I.J., K. Zhou, and J.A. Doudna, Structural determinants of RNA recognition and cleavage by Dicer. Nat Struct Mol Biol, 2007. 14(10): p. 934-40. 44. Macrae, I.J., et al., Structural basis for double-stranded RNA processing by Dicer. Science, 2006. 311(5758): p. 195-8. 45. Lau, P.W., et al., The molecular architecture of human Dicer. Nat Struct Mol Biol, 2012. 19(4): p. 436-40. 46. Fukunaga, R., et al., Dicer partner proteins tune the length of mature miRNAs in flies and mammals. Cell, 2012. 151(3): p. 533-46. 47. Lee, H.Y., et al., Differential roles of human Dicer-binding proteins TRBP and PACT in small RNA processing. Nucleic Acids Res, 2013. 41(13): p. 6568-76. 48. Wilson, R.C., et al., Dicer-TRBP complex formation ensures accurate mammalian microRNA biogenesis. Mol Cell, 2015. 57(3): p. 397-407. 49. Lee, Y.S., et al., Distinct roles for Drosophila Dicer-1 and Dicer-2 in the siRNA/miRNA silencing pathways. Cell, 2004. 117(1): p. 69-81. 50. Rivas, F.V., et al., Purified Argonaute2 and an siRNA form recombinant human RISC. Nat Struct Mol Biol, 2005. 12(4): p. 340-9. 51. Liu, J., et al., Argonaute2 is the catalytic engine of mammalian RNAi. Science, 2004. 305(5689): p. 1437-41. 52. Meister, G., et al., Human Argonaute2 mediates RNA cleavage targeted by miRNAs and siRNAs. Mol Cell, 2004. 15(2): p. 185-97. 53. Martinez, J., et al., Single-stranded antisense siRNAs guide target RNA cleavage in RNAi. Cell, 2002. 
110(5): p. 563-74. 54. Baccarini, A., et al., Kinetic analysis reveals the fate of a microRNA following target regulation in mammalian cells. Curr Biol, 2011. 21(5): p. 369-76. 55. van Rooij, E., et al., Control of stress-dependent cardiac growth and gene expression by a microRNA. Science, 2007. 316(5824): p. 575-9. 56. Khvorova, A., A. Reynolds, and S.D. Jayasena, Functional siRNAs and miRNAs exhibit strand bias. Cell, 2003. 115(2): p. 209-16. 57. Schwarz, D.S., et al., Asymmetry in the assembly of the RNAi enzyme complex. Cell, 2003. 115(2): p. 199-208. 58. Frank, F., N. Sonenberg, and B. Nagar, Structural basis for 5'-nucleotide base-specific recognition of guide RNA by human AGO2. Nature, 2010. 465(7299): p. 818-22. 59. Friedman, R.C., et al., Most mammalian mRNAs are conserved targets of microRNAs. Genome Res, 2009. 19(1): p. 92-105.

ETH Zurich Matije Lucic 83 References

60. Jonas, S. and E. Izaurralde, Towards a molecular understanding of microRNA- mediated gene silencing. Nat Rev Genet, 2015. 16(7): p. 421-33. 61. Guo, H., et al., Mammalian microRNAs predominantly act to decrease target mRNA levels. Nature, 2010. 466(7308): p. 835-40. 62. Hutvagner, G. and P.D. Zamore, A microRNA in a multiple-turnover RNAi enzyme complex. Science, 2002. 297(5589): p. 2056-60. 63. Yekta, S., I.H. Shih, and D.P. Bartel, MicroRNA-directed cleavage of HOXB8 mRNA. Science, 2004. 304(5670): p. 594-6. 64. Schwarz, D.S., Y. Tomari, and P.D. Zamore, The RNA-induced silencing complex is a Mg2+-dependent endonuclease. Curr Biol, 2004. 14(9): p. 787-91. 65. Park, M.S., et al., Human Argonaute3 has slicer activity. Nucleic Acids Res, 2017. 45(20): p. 11867-11877. 66. Jones-Rhoades, M.W., D.P. Bartel, and B. Bartel, MicroRNAS and their regulatory roles in plants. Annu Rev Plant Biol, 2006. 57: p. 19-53. 67. Davis, E., et al., RNAi-mediated allelic trans-interaction at the imprinted Rtl1/Peg11 locus. Curr Biol, 2005. 15(8): p. 743-9. 68. Shin, C., et al., Expanding the microRNA targeting code: functional sites with centered pairing. Mol Cell, 2010. 38(6): p. 789-802. 69. Hansen, T.B., et al., miRNA-dependent gene silencing involving Ago2-mediated cleavage of a circular antisense RNA. EMBO J, 2011. 30(21): p. 4414-22. 70. Sullivan, C.S., et al., SV40-encoded microRNAs regulate viral gene expression and reduce susceptibility to cytotoxic T cells. Nature, 2005. 435(7042): p. 682-6. 71. Barth, S., et al., Epstein-Barr virus-encoded microRNA miR-BART2 down-regulates the viral DNA polymerase BALF5. Nucleic Acids Res, 2008. 36(2): p. 666-75. 72. Fire, A., et al., Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature, 1998. 391(6669): p. 806-11. 73. Elbashir, S.M., et al., Duplexes of 21-nucleotide RNAs mediate RNA interference in cultured mammalian cells. Nature, 2001. 411(6836): p. 494-8. 74. Bobbin, M.L. and J.J. 
Rossi, RNA Interference (RNAi)-Based Therapeutics: Delivering on the Promise? Annu Rev Pharmacol Toxicol, 2016. 56: p. 103-22. 75. Djuranovic, S., A. Nahvi, and R. Green, miRNA-mediated gene silencing by translational repression followed by mRNA deadenylation and decay. Science, 2012. 336(6078): p. 237-40. 76. Lim, L.P., et al., Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature, 2005. 433(7027): p. 769-73. 77. Selbach, M., et al., Widespread changes in protein synthesis induced by microRNAs. Nature, 2008. 455(7209): p. 58-63. 78. Meister, G., et al., Identification of novel argonaute-associated proteins. Curr Biol, 2005. 15(23): p. 2149-55. 79. Rehwinkel, J., et al., A crucial role for GW182 and the DCP1:DCP2 decapping complex in miRNA-mediated gene silencing. RNA, 2005. 11(11): p. 1640-7. 80. Liu, J., et al., A role for the P-body component GW182 in microRNA function. Nat Cell Biol, 2005. 7(12): p. 1261-6.

ETH Zurich Matije Lucic 84 References

81. Chen, C.Y., et al., Ago-TNRC6 triggers microRNA-mediated decay by promoting two deadenylation steps. Nat Struct Mol Biol, 2009. 16(11): p. 1160-6. 82. Braun, J.E., et al., GW182 proteins directly recruit cytoplasmic deadenylase complexes to miRNA targets. Mol Cell, 2011. 44(1): p. 120-33. 83. Behm-Ansmant, I., et al., mRNA degradation by miRNAs and GW182 requires both CCR4:NOT deadenylase and DCP1:DCP2 decapping complexes. Genes Dev, 2006. 20(14): p. 1885-98. 84. Chekulaeva, M., et al., miRNA repression involves GW182-mediated recruitment of CCR4-NOT through conserved W-containing motifs. Nat Struct Mol Biol, 2011. 18(11): p. 1218-26. 85. Fabian, M.R., et al., miRNA-mediated deadenylation is orchestrated by GW182 through two conserved motifs that interact with CCR4-NOT. Nat Struct Mol Biol, 2011. 18(11): p. 1211-7. 86. Braun, J.E., et al., A direct interaction between DCP1 and XRN1 couples mRNA decapping to 5' exonucleolytic degradation. Nat Struct Mol Biol, 2012. 19(12): p. 1324- 31. 87. Chen, Y., et al., A DDX6-CNOT1 complex and W-binding pockets in CNOT9 reveal direct links between miRNA target recognition and silencing. Mol Cell, 2014. 54(5): p. 737-50. 88. Mathys, H., et al., Structural and biochemical insights to the role of the CCR4-NOT complex and DDX6 ATPase in microRNA repression. Mol Cell, 2014. 54(5): p. 751-65. 89. Bazzini, A.A., M.T. Lee, and A.J. Giraldez, Ribosome profiling shows that miR-430 reduces translation before causing mRNA decay in zebrafish. Science, 2012. 336(6078): p. 233-7. 90. Bethune, J., C.G. Artus-Revel, and W. Filipowicz, Kinetic analysis reveals successive steps leading to miRNA-mediated silencing in mammalian cells. EMBO Rep, 2012. 13(8): p. 716-23. 91. Eichhorn, S.W., et al., mRNA destabilization is the dominant effect of mammalian microRNAs by the time substantial repression ensues. Mol Cell, 2014. 56(1): p. 104- 15. 92. 
Bhattacharyya, S.N., et al., Relief of microRNA-mediated translational repression in human cells subjected to stress. Cell, 2006. 125(6): p. 1111-24. 93. Schirle, N.T. and I.J. MacRae, The crystal structure of human Argonaute2. Science, 2012. 336(6084): p. 1037-40. 94. Sheu-Gruttadauria, J. and I.J. MacRae, Structural Foundations of RNA Silencing by Argonaute. Journal of Molecular Biology, 2017. 429(17): p. 2619-2639. 95. Wang, Y., et al., Structure of the guide-strand-containing argonaute silencing complex. Nature, 2008. 456(7219): p. 209-13. 96. Salomon, W.E., et al., Single-Molecule Imaging Reveals that Argonaute Reshapes the Binding Properties of Its Nucleic Acid Guides. Cell, 2015. 162(1): p. 84-95. 97. Wee, L.M., et al., Argonaute divides its RNA guide into domains with distinct functions and RNA-binding properties. Cell, 2012. 151(5): p. 1055-67. 98. Lim, L.P., et al., Vertebrate microRNA genes. Science, 2003. 299(5612): p. 1540.

ETH Zurich Matije Lucic 85 References

99. Lewis, B.P., C.B. Burge, and D.P. Bartel, Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell, 2005. 120(1): p. 15-20. 100. Schirle, N.T., et al., Water-mediated recognition of t1-adenosine anchors Argonaute2 to microRNA targets. Elife, 2015. 4. 101. Agarwal, V., et al., Predicting effective microRNA target sites in mammalian mRNAs. Elife, 2015. 4. 102. Wang, Y., et al., Nucleation, propagation and cleavage of target RNAs in Ago silencing complexes. Nature, 2009. 461(7265): p. 754-61. 103. Grimson, A., et al., MicroRNA targeting specificity in mammals: determinants beyond seed pairing. Mol Cell, 2007. 27(1): p. 91-105. 104. Reinhart, B.J., et al., The 21-nucleotide let-7 RNA regulates developmental timing in Caenorhabditis elegans. Nature, 2000. 403(6772): p. 901-6. 105. Vella, M.C., et al., The C. elegans microRNA let-7 binds to imperfect let-7 complementary sites from the lin-41 3'UTR. Genes Dev, 2004. 18(2): p. 132-7. 106. Ecsedi, M., M. Rausch, and H. Grosshans, The let-7 microRNA directs vulval development through a single target. Dev Cell, 2015. 32(3): p. 335-44. 107. Broughton, J.P., et al., Pairing beyond the Seed Supports MicroRNA Targeting Specificity. Mol Cell, 2016. 64(2): p. 320-333. 108. Brancati, G. and H. Grosshans, An interplay of miRNA abundance and target site architecture determines miRNA activity and specificity. Nucleic Acids Res, 2018. 46(7): p. 3259-3269. 109. Moore, M.J., et al., miRNA-target chimeras reveal miRNA 3'-end pairing as a major determinant of Argonaute target specificity. Nat Commun, 2015. 6: p. 8864. 110. Grosswendt, S., et al., Unambiguous identification of miRNA:target site interactions by different types of ligation reactions. Mol Cell, 2014. 54(6): p. 1042-1054. 111. Helwak, A., et al., Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding. Cell, 2013. 153(3): p. 654-65. 112. 
Bartel, D.P., MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 2004. 116(2): p. 281-97. 113. Braasch, D.A. and D.R. Corey, Locked nucleic acid (LNA): fine-tuning the recognition of DNA and RNA. Chem Biol, 2001. 8(1): p. 1-7. 114. Filipowicz, W., RNAi: the nuts and bolts of the RISC machine. Cell, 2005. 122(1): p. 17-20. 115. Sternberg, S.H., et al., DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature, 2014. 507(7490): p. 62-7. 116. Klum, S.M., et al., Helix-7 in Argonaute2 shapes the microRNA seed region for rapid target recognition. EMBO J, 2018. 37(1): p. 75-88. 117. Chandradoss, S.D., et al., A Dynamic Search Process Underlies MicroRNA Targeting. Cell, 2015. 162(1): p. 96-107. 118. Schirle, N.T., J. Sheu-Gruttadauria, and I.J. MacRae, Structural basis for microRNA targeting. Science, 2014. 346(6209): p. 608-13.

ETH Zurich Matije Lucic 86 References

119. Steinkraus, B.R., M. Toegel, and T.A. Fulga, Tiny giants of gene regulation: experimental strategies for microRNA functional studies. Wiley Interdiscip Rev Dev Biol, 2016. 5(3): p. 311-62. 120. Mittal, N. and M. Zavolan, Seq and CLIP through the miRNA world. Genome Biol, 2014. 15(1): p. 202. 121. Baek, D., et al., The impact of microRNAs on protein output. Nature, 2008. 455(7209): p. 64-71. 122. Didiano, D. and O. Hobert, Perfect seed pairing is not a generally reliable predictor for miRNA-target interactions. Nat Struct Mol Biol, 2006. 13(9): p. 849-51. 123. Ameres, S.L., et al., Target RNA-directed trimming and tailing of small silencing RNAs. Science, 2010. 328(5985): p. 1534-9. 124. Fuchs Wightman, F., et al., Target RNAs Strike Back on MicroRNAs. Frontiers in Genetics, 2018. 9. 125. De, N., et al., Highly complementary target RNAs promote release of guide RNAs from human Argonaute2. Mol Cell, 2013. 50(3): p. 344-55. 126. Park, J.H., S.Y. Shin, and C. Shin, Non-canonical targets destabilize microRNAs in human Argonautes. Nucleic Acids Res, 2017. 45(4): p. 1569-1583. 127. Fischer, S., et al., Unveiling the principle of microRNA-mediated redundancy in cellular pathway regulation. RNA Biol, 2015. 12(3): p. 238-47. 128. Miska, E.A., et al., Most Caenorhabditis elegans microRNAs are individually not essential for development or viability. PLoS Genet, 2007. 3(12): p. e215. 129. Broderick, J.A., et al., Argonaute protein identity and pairing geometry determine cooperativity in mammalian RNA silencing. RNA, 2011. 17(10): p. 1858-69. 130. Saetrom, P., et al., Distance constraints between microRNA target sites dictate efficacy and cooperativity. Nucleic Acids Res, 2007. 35(7): p. 2333-42. 131. Esteller, M., Non-coding RNAs in human disease. Nat Rev Genet, 2011. 12(12): p. 861-74. 132. Bracken, C.P., H.S. Scott, and G.J. Goodall, A network-biology perspective of microRNA function and dysfunction in cancer. Nat Rev Genet, 2016. 17(12): p. 719- 732. 133. 
Ventura, A., et al., Targeted deletion reveals essential and overlapping functions of the miR-17 through 92 family of miRNA clusters. Cell, 2008. 132(5): p. 875-86. 134. Lu, J., et al., MicroRNA expression profiles classify human cancers. Nature, 2005. 435(7043): p. 834-8. 135. He, L., et al., A microRNA polycistron as a potential human oncogene. Nature, 2005. 435(7043): p. 828-33. 136. O'Donnell, K.A., et al., c-Myc-regulated microRNAs modulate E2F1 expression. Nature, 2005. 435(7043): p. 839-43. 137. Sylvestre, Y., et al., An E2F/miR-20a autoregulatory feedback loop. Journal of Biological Chemistry, 2007. 282(4): p. 2135-2143. 138. Woods, K., J.M. Thomson, and S.M. Hammond, Direct regulation of an oncogenic micro-RNA cluster by E2F transcription factors. Journal of Biological Chemistry, 2007. 282(4): p. 2130-2134.

ETH Zurich Matije Lucic 87 References

139. Martinez, N.J. and A.J.M. Walhout, The interplay between transcription factors and microRNAs in genome-scale regulatory networks. Bioessays, 2009. 31(4): p. 435-445. 140. Han, Y.C., et al., An allelic series of miR-17 approximately 92-mutant mice uncovers functional specialization and cooperation among members of a microRNA polycistron. Nat Genet, 2015. 47(7): p. 766-75. 141. Bobbili, M.R., et al., OncomiR-17-5p: alarm signal in cancer? Oncotarget, 2017. 8(41): p. 71206-71222. 142. Ma, Y., et al., Elevated oncofoetal miR-17-5p expression regulates colorectal cancer progression by repressing its target gene P130. Nat Commun, 2012. 3: p. 1291. 143. Fontana, L., et al., Antagomir-17-5p Abolishes the Growth of Therapy-Resistant Neuroblastoma through p21 and BIM. Plos One, 2008. 3(5). 144. Cloonan, N., et al., The miR-17-5p microRNA is a key regulator of the G1/S phase cell cycle transition. Genome Biol, 2008. 9(8): p. R127. 145. Wei, Q., et al., MiR-17-5p targets TP53INP1 and regulates cell proliferation and apoptosis of cervical cancer cells. IUBMB Life, 2012. 64(8): p. 697-704. 146. Bussing, I., F.J. Slack, and H. Grosshans, let-7 microRNAs in development, stem cells and cancer. Trends Mol Med, 2008. 14(9): p. 400-9. 147. Nair, V.S., L.S. Maeda, and J.P. Ioannidis, Clinical outcome prediction by microRNAs in human cancer: a systematic review. J Natl Cancer Inst, 2012. 104(7): p. 528-40. 148. Lee, Y.S. and A. Dutta, The tumor suppressor microRNA let-7 represses the HMGA2 oncogene. Genes & Development, 2007. 21(9): p. 1025-1030. 149. Mayr, C., M.T. Hemann, and D.P. Bartel, Disrupting the pairing between let-7 and Hmga2 enhances oncogenic transformation. Science, 2007. 315(5818): p. 1576-1579. 150. Johnson, S.M., et al., RAS is regulated by the let-7 MicroRNA family. Cell, 2005. 120(5): p. 635-647. 151. Sampson, V.B., et al., MicroRNA let-7a down-regulates MYC and reverts MYC- induced growth in Burkitt lymphoma cells. Cancer Research, 2007. 67(20): p. 9762- 9770. 
152. Boyerinas, B., et al., Identification of let-7-regulated oncofetal genes. Cancer Research, 2008. 68(8): p. 2587-2591. 153. Johnson, C.D., et al., The let-7 MicroRNA represses cell proliferation pathways in human cells. Cancer Research, 2007. 67(16): p. 7713-7722. 154. Schultz, J., et al., MicroRNA let-7b targets important cell cycle molecules in malignant melanoma cells and interferes with anchorage-independent growth. Cell Research, 2008. 18(5): p. 549-557. 155. Pasquinelli, A.E., et al., Conservation of the sequence and temporal expression of let- 7 heterochronic regulatory RNA. Nature, 2000. 408(6808): p. 86-89. 156. Lee, H., et al., Biogenesis and regulation of the let-7 miRNAs and their functional implications. Protein Cell, 2016. 7(2): p. 100-13. 157. Wang, T.Z., et al., Aberrant regulation of the LIN28A/LIN28B and let-7 loop in human malignant tumors and its effects on the hallmarks of cancer. Molecular Cancer, 2015. 14.

ETH Zurich Matije Lucic 88 References

158. Zhou, J., S.B. Ng, and W.J. Chng, LIN28/LIN28B: an emerging oncogenic driver in cancer stem cells. Int J Biochem Cell Biol, 2013. 45(5): p. 973-8. 159. Viswanathan, S.R., G.Q. Daley, and R.I. Gregory, Selective blockade of MicroRNA processing by Lin28. Science, 2008. 320(5872): p. 97-100. 160. Newman, M.A., J.M. Thomson, and S.M. Hammond, Lin-28 interaction with the Let-7 precursor loop mediates regulated microRNA processing. Rna, 2008. 14(8): p. 1539- 1549. 161. Imig, J., et al., miR-CLIP capture of a miRNA targetome uncovers a lincRNA H19-miR- 106a interaction. Nat Chem Biol, 2015. 11(2): p. 107-14. 162. Zagalak, J.A., et al., Properties of short double-stranded RNAs carrying randomized base pairs: toward better controls for RNAi experiments. RNA, 2015. 21(12): p. 2132- 42. 163. Hermeking, H., The miR-34 family in cancer and apoptosis. Cell Death Differ, 2010. 17(2): p. 193-9. 164. Cai, H., M. Miao, and Z. Wang, miR-214-3p promotes the proliferation, migration and invasion of osteosarcoma cells by targeting CADM1. Oncol Lett, 2018. 16(2): p. 2620- 2628. 165. Zhang, K., et al., A novel class of microRNA-recognition elements that function only within open reading frames. Nat Struct Mol Biol, 2018. 25(11): p. 1019-1027. 166. Jonson, L., et al., IMP3 RNP safe houses prevent miRNA-directed HMGA2 mRNA decay in cancer and development. Cell Rep, 2014. 7(2): p. 539-51. 167. Busch, B., et al., The oncogenic triangle of HMGA2, LIN28B and IGF2BP1 antagonizes tumor-suppressive actions of the let-7 family. Nucleic Acids Res, 2016. 44(8): p. 3845- 64. 168. Habibian, M., et al., Structural properties and gene-silencing activity of chemically modified DNA-RNA hybrids with parallel orientation. Nucleic Acids Res, 2018. 46(4): p. 1614-1623. 169. Brunschweiger, A., et al., Site-specific conjugation of drug-like fragments to an antimiR scaffold as a strategy to target miRNAs inside RISC. Chem Commun (Camb), 2016. 52(1): p. 156-9. 170. 
Roos, M., et al., Short loop-targeting oligoribonucleotides antagonize Lin28 and enable pre-let-7 processing and suppression of cell growth in let-7-deficient cancer cells. Nucleic Acids Res, 2015. 43(2): p. e9. 171. Pradere, U., et al., Chemical synthesis of mono- and bis-labeled pre-microRNAs. Angew Chem Int Ed Engl, 2013. 52(46): p. 12028-32. 172. Jahns, H., et al., Stereochemical bias introduced during RNA synthesis modulates the activity of phosphorothioate siRNAs. Nature Communications, 2015. 6.

ETH Zurich Matije Lucic 89 Supplementary information

Supplementary information

1            51           101          151          201          251          301          351          401          451          501          551          601          651
let-7a-1     miR-196a-1   miR-124-2    miR-29c      miR-425      miR-518a-2   miR-605      miR-873      miR-3129     miR-3691     miR-4714     miR-548at    miR-6791     miR-486-2
let-7a-2     miR-197      miR-124-3    miR-30c-1    miR-18b      miR-517c     miR-615      miR-374b     miR-3130-1   miR-3692     miR-4728     miR-5582     miR-6793     miR-1273h
let-7a-3     miR-199a-1   miR-125b-1   miR-200a     miR-20b      miR-522      miR-616      miR-301b     miR-3130-2   miR-3912     miR-4733     miR-5583-1   miR-6796     miR-6516
let-7b       miR-208a     miR-128-1    miR-302a     miR-450a-1   miR-519a-1   miR-548c     miR-509-3    miR-3136     miR-3913-1   miR-3064     miR-5583-2   miR-6797     miR-6862-2
let-7c       miR-129-1    miR-130a     miR-101-2    miR-431      miR-516a-1   miR-624      miR-937      miR-3140     miR-3913-2   miR-4742     miR-5585     miR-6799     miR-6770-2
let-7d       miR-148a     miR-132      miR-219a-2   miR-433      miR-516a-2   miR-625      miR-939      miR-548t     miR-3150b    miR-122b     miR-5587     miR-6802     miR-6770-3
let-7e       miR-30c-2    miR-133a-1   miR-34b      miR-329-1    miR-519a-2   miR-627      miR-942      miR-3144     miR-676      miR-4745     miR-4536-2   miR-6803     miR-6859-2
let-7f-1     miR-30d      miR-133a-2   miR-34c      miR-329-2    miR-499a     miR-628      miR-1180     miR-3145     miR-3928     miR-4746     miR-548av    miR-6807     miR-6859-3
let-7f-2     miR-139      miR-135a-1   miR-299      miR-452      miR-500a     miR-629      miR-1226     miR-3150a    miR-3934     miR-4749     miR-5699     miR-6810
miR-15a      miR-7-1      miR-135a-2   miR-301a     miR-409      miR-501      miR-33b      miR-1228     miR-3152     miR-3940     miR-4753     miR-548ay    miR-6812
miR-16-1     miR-7-2      miR-137      miR-99b      miR-376b     miR-502      miR-642a     miR-1229     miR-3074     miR-3942     miR-4755     miR-6500     miR-6816
miR-17       miR-10a      miR-138-2    miR-296      miR-483      miR-450a-2   miR-651      miR-1233-1   miR-3156-1   miR-3944     miR-499b     miR-548az    miR-6818
miR-18a      miR-10b      miR-140      miR-130b     miR-485      miR-503      miR-652      miR-1237     miR-3157     miR-374c     miR-4761     miR-6501     miR-6819
miR-19a      miR-34a      miR-141      miR-30e      miR-487a     miR-504      miR-548d-1   miR-548e     miR-3158-1   miR-642b     miR-4762     miR-6503     miR-6820
miR-19b-1    miR-181a-2   miR-142      miR-26a-2    miR-488      miR-505      miR-548d-2   miR-548j     miR-3158-2   miR-550b-1   miR-4763     miR-6505     miR-6824
miR-19b-2    miR-181b-1   miR-143      miR-361      miR-490      miR-513a-1   miR-449b     miR-1285-1   miR-3160-1   miR-550b-2   miR-4766     miR-6507     miR-6825
miR-20a      miR-181c     miR-144      miR-362      miR-491      miR-513a-2   miR-653      miR-1287     miR-3160-2   miR-548o-2   miR-4772     miR-6509     miR-6826
miR-21       miR-182      miR-145      miR-363      miR-511      miR-506      miR-411      miR-1304     miR-3162     miR-4423     miR-4776-1   miR-6510     miR-6832
miR-22       miR-183      miR-152      miR-365a     miR-146b     miR-508      miR-654      miR-548f-1   miR-3173     miR-548ad    miR-4776-2   miR-6511a-1  miR-6833
miR-23a      miR-187      miR-153-2    miR-365b     miR-202      miR-509-1    miR-549a     miR-1247     miR-3177     miR-548ae-2  miR-4778     miR-6513     miR-6780b
miR-24-1     miR-196a-2   miR-191      miR-302b     miR-493      miR-510      miR-659      miR-1249     miR-3180-1   miR-4446     miR-4436b-1  miR-6514     miR-6836
miR-24-2     miR-199a-2   miR-9-1      miR-376c     miR-432      miR-514a-1   miR-660      miR-548g     miR-3180-2   miR-548ah    miR-2467     miR-6515     miR-6837
miR-25       miR-199b     miR-9-2      miR-369      miR-494      miR-514a-2   miR-542      miR-548h-4   miR-3180-3   miR-548aj-2  miR-4786     miR-6511b-1  miR-6839
miR-26a-1    miR-203a     miR-9-3      miR-370      miR-495      miR-514a-3   miR-758      miR-1277     miR-3184     miR-4474     miR-4787     miR-6720     miR-6840
miR-26b      miR-204      miR-125a     miR-371a     miR-193b     miR-532      miR-671      miR-1292     miR-3065     miR-4485     miR-4793     miR-6726     miR-6842
miR-27a      miR-205      miR-125b-2   miR-374a     miR-497      miR-455      miR-550a-3   miR-1252     miR-3156-2   miR-4524a    miR-4797     miR-6731     miR-6850
miR-28       miR-210      miR-126      miR-375      miR-181d     miR-539      miR-767      miR-1255b-2  miR-3187     miR-548am    miR-3688-2   miR-6733     miR-6852
miR-29a      miR-211      miR-127      miR-376a-1   miR-512-1    miR-545      miR-1224     miR-664a     miR-3190     miR-4536-1   miR-4800     miR-6734     miR-6855
miR-30a      miR-212      miR-129-2    miR-377      miR-512-2    miR-376a-2   miR-1296     miR-1306     miR-3194     miR-4638     miR-4802     miR-6735     miR-6858
miR-31       miR-181a-1   miR-134      miR-378a     miR-515-1    miR-552      miR-1271     miR-1307     miR-548x     miR-4639     miR-4804     miR-6738     miR-6859-1
miR-32       miR-214      miR-136      miR-379      miR-520f     miR-92b      miR-1301     miR-513b     miR-3200     miR-4640     miR-4999     miR-6739     miR-6769b
miR-33a      miR-215      miR-138-1    miR-380      miR-515-2    miR-556      miR-454      miR-513c     miR-514b     miR-4652     miR-5000     miR-6740     miR-6862-1
miR-92a-1    miR-216a     miR-146a     miR-381      miR-519c     miR-561      miR-1185-2   miR-1537     miR-2355
miR-4655 miR-5001 miR-6741 miR-6866 miR-92a-2 miR-217 miR-149 miR-382 miR-520a miR-551b miR-449c miR-1908 miR-500b miR-4659a miR-5006 miR-6742 miR-6869 miR-93 miR-218-1 miR-150 miR-340 miR-526b miR-570 miR-769 miR-1909 miR-1233-2 miR-4661 miR-548ap miR-6746 miR-6871 miR-95 miR-218-2 miR-154 miR-330 miR-519b miR-574 miR-766 miR-1910 miR-3605 miR-4662a miR-5008 miR-6747 miR-6875 miR-96 miR-219a-1 miR-185 miR-328 miR-523 miR-576 miR-1185-1 miR-1912 miR-3613 miR-4659b miR-5009 miR-6753 miR-6879 miR-98 miR-221 miR-186 miR-342 miR-518f miR-579 miR-675 miR-1914 miR-3614 miR-4664 miR-5010 miR-6756 miR-6881 miR-99a miR-222 miR-188 miR-337 miR-520b miR-580 miR-509-2 miR-1915 miR-3616 miR-4667 miR-5088 miR-6758 miR-6882 miR-100 miR-223 miR-190a miR-323a miR-526a-1 miR-582 miR-450b miR-2114 miR-3619 miR-4668 miR-5089 miR-6763 miR-6885 miR-101-1 miR-224 miR-193a miR-135b miR-520c miR-584 miR-874 miR-2115 miR-3620 miR-219b miR-5187 miR-6765 miR-6892 miR-29b-1 miR-200b miR-195 miR-148b miR-518c miR-585 miR-888 miR-2116 miR-3622a miR-4670 miR-5196 miR-6767 miR-6894 miR-29b-2 let-7g miR-320a miR-331 miR-517a miR-548b miR-876 miR-2276 miR-3663 miR-4676 miR-4436b-2 miR-6769a miR-6895 miR-103a-2 let-7i miR-200c miR-324 miR-520d miR-589 miR-708 miR-2277 miR-3664 miR-4677 miR-3680-2 miR-6770-1 miR-7110 miR-103a-1 miR-15b miR-1-1 miR-338 miR-517b miR-550a-1 miR-147b miR-2682 miR-3667 miR-4687 miR-5571 miR-6775 miR-7111 miR-105-1 miR-23b miR-155 miR-335 miR-516b-2 miR-550a-2 miR-190b miR-3120 miR-3677 miR-1343 miR-548aq miR-6777 miR-7112 miR-105-2 miR-27b miR-181b-2 miR-345 miR-518e miR-590 miR-744 miR-3121 miR-3679 miR-4695 miR-548ar miR-6780a miR-6511b-2 miR-106a miR-30b miR-128-2 miR-196b miR-518a-1 miR-597 miR-885 miR-3124 miR-3680-1 miR-4707 miR-548as miR-6781 miR-6511a-2 miR-16-2 miR-122 miR-194-2 miR-423 miR-518d miR-598 miR-877 miR-3126 miR-3688-1 miR-203b miR-664b miR-6783 miR-6511a-3 miR-192 miR-124-1 miR-106b miR-424 miR-516b-1 miR-548a-3 miR-887 miR-3127 miR-3689a miR-4713 
miR-5580 miR-6789 miR-6511a-4
Table S1 658 confidently annotated miRNAs in human (Homo sapiens).

1 51 101 151 201 251 301 351 401 451 501 551 601 let-7g miR-190a miR-15a miR-19a miR-379 miR-302b miR-669c miR-466f-3 miR-1251 miR-669a-12 miR-6921 miR-7035 miR-7672 let-7i miR-191 miR-16-1 miR-25 miR-380 miR-302c miR-297b miR-466h miR-3061 miR-466p miR-6923 miR-7046 miR-3569 miR-1a-1 miR-193a miR-16-2 miR-28a miR-381 miR-302d miR-499 miR-467c miR-3062 miR-466n miR-6924 miR-7047 miR-7674 miR-15b miR-194-1 miR-18a miR-32 miR-382 miR-1224 miR-455 miR-467d miR-3064 miR-3087 miR-6927 miR-7048 miR-7675 miR-23b miR-195a miR-20a miR-100 miR-383 miR-1247 miR-491 miR-493 miR-3066 miR-3089 miR-6929 miR-7051 miR-7676-1 miR-27b miR-199a-1 miR-21a miR-139 miR-133a-2 miR-301b miR-700 miR-504 miR-3068 miR-3091 miR-6931 miR-7052 miR-7676-2 miR-29b-1 miR-200b miR-22 miR-200c miR-133b miR-675 miR-701 miR-574 miR-3069 miR-3093 miR-6932 miR-7060 miR-129b miR-30a miR-201 miR-23a miR-210 miR-181b-2 miR-744 miR-702 miR-92b miR-3070-1 miR-3094 miR-6933 miR-7063 miR-1191b miR-30b miR-202 miR-24-2 miR-212 miR-215 miR-374b miR-708 miR-466d miR-3070-2 miR-3095 miR-6935 miR-6769b miR-7679 miR-99a miR-203 miR-26a-1 miR-181a-1 miR-384 miR-216b miR-712 miR-878 miR-3072 miR-3097 miR-6937 miR-7068 miR-465d miR-99b miR-204 miR-26b miR-214 miR-196b miR-592 miR-500 miR-872 miR-3073a miR-3100 miR-6946 miR-7069 miR-7681 miR-101a miR-205 miR-29a miR-216a miR-409 miR-758 miR-501 miR-873a miR-3074-1 miR-3101 miR-6948 miR-7070 miR-7683 miR-124-3 miR-122 miR-29c miR-218-1 miR-410 miR-1264 miR-450b miR-875 miR-3075 miR-344e miR-6949 miR-7075 miR-126b miR-125a miR-143 miR-27a miR-218-2 miR-376b miR-551b miR-505 miR-208b miR-3076 miR-344b miR-6952 miR-7080 miR-466c-3 miR-125b-2 miR-30e miR-31 miR-223 miR-411 miR-1249 miR-652 miR-877 miR-3077 miR-344c miR-6953 miR-7085 miR-126a miR-290a miR-92a-2 miR-320 miR-412 miR-671 miR-490 miR-511 miR-3078 miR-344g miR-6956 miR-7087 miR-127 miR-291a miR-93 miR-26a-2 miR-370 miR-668 miR-676 miR-544 miR-3079 miR-344f miR-6957 miR-7090 miR-128-1 miR-293 miR-96 miR-33 miR-425 
miR-1843a miR-615 miR-598 miR-3080 miR-3102 miR-6958 miR-7091 miR-130a miR-294 miR-34a miR-211 miR-431 miR-665 miR-741 miR-653 miR-3081 miR-3103 miR-6959 miR-7092 miR-9-2 miR-295 miR-129-2 miR-221 miR-434 miR-667 miR-742 miR-582 miR-3082 miR-3105 miR-6961 miR-7093 miR-132 miR-296 miR-98 miR-222 miR-448 miR-770 miR-743a miR-467e miR-3084-1 miR-3074-2 miR-6962 miR-7094-1 miR-133a-1 miR-298 miR-103-1 miR-224 miR-429 miR-344d-3 miR-181d miR-466l miR-3085 miR-3109 miR-6964 miR-7094-2 miR-134 miR-299a miR-103-2 miR-29b-2 miR-365-2 miR-802 miR-743b miR-669d miR-3086 miR-3110 miR-6966 miR-7117 miR-135a-1 miR-300 miR-322 miR-199a-2 miR-449a miR-672 miR-871 miR-466i miR-466m miR-374c miR-6970 miR-7210 miR-136 miR-301a miR-323 miR-199b miR-450a-1 miR-670 miR-879 miR-1b miR-669d-2 miR-1912 miR-6973a miR-7213 miR-137 miR-302a miR-324 miR-135a-2 miR-452 miR-1298 miR-880 miR-1193 miR-466o miR-466b-8 miR-6976 miR-7214 miR-138-2 miR-34c miR-325 miR-124-1 miR-463 miR-764 miR-881 miR-669e miR-467a-2 miR-1843b miR-6977 miR-7215 miR-140 miR-34b miR-326 miR-124-2 miR-465a miR-3059 miR-883a miR-1197 miR-669a-4 miR-5107 miR-6980 miR-7216 miR-141 let-7d miR-328 miR-19b-1 miR-466a miR-3058 miR-883b miR-1198 miR-669a-5 miR-3572 miR-6981 miR-7217 miR-144 miR-106a miR-329 miR-92a-1 miR-467a-1 miR-3099 miR-190b miR-1929 miR-467a-3 miR-5129 miR-6985 miR-7218 miR-145a miR-106b miR-330 miR-9-1 miR-468 miR-3106 miR-874 miR-1930 miR-466c-2 miR-5132 miR-6987 miR-7222 miR-146a miR-130b miR-331 miR-9-3 miR-470 miR-669a-1 miR-147 miR-1933 miR-669a-6 miR-5134 miR-6988 miR-7223 miR-149 miR-19b-2 miR-337 miR-138-1 miR-471 miR-344d-1 miR-18b miR-1934 miR-467a-4 miR-3544 miR-6989 miR-7230 miR-150 miR-30c-1 miR-148b miR-181b-1 miR-532 miR-666 miR-193b miR-1941 miR-466b-4 miR-299b miR-6990 miR-7232 miR-151 miR-30c-2 miR-338 miR-181c miR-483 miR-496a miR-297a-3 miR-1943 miR-669a-7 miR-5615-1 miR-6994 miR-7234 miR-152 miR-30d miR-339 miR-125b-1 miR-485 miR-673 miR-297a-4 miR-1306 miR-467a-5 miR-5615-2 miR-6995 
miR-7237 miR-153 miR-148a miR-340 miR-128-2 miR-540 miR-760 miR-297c miR-1948 miR-466b-5 miR-5619 miR-6996 miR-7241 miR-154 miR-192 miR-341 miR-7a-1 miR-543 miR-674 miR-344-2 miR-669l miR-669p-1 miR-5620 miR-6997 miR-6546 miR-155 miR-196a-1 miR-342 miR-7a-2 miR-539 miR-488 miR-421 miR-669m-1 miR-467a-6 miR-5623 miR-7001 miR-7646 miR-10b miR-196a-2 miR-344-1 miR-7b miR-541 miR-677 miR-465b-1 miR-669m-2 miR-669a-8 miR-5624 miR-7007 miR-7647 miR-129-1 miR-200a miR-345 miR-217 miR-542 miR-497a miR-465b-2 miR-669o miR-466b-6 miR-3084-2 miR-7008 miR-7649 miR-181a-2 miR-208a miR-346 miR-194-2 miR-547 miR-423 miR-465c-1 miR-1955 miR-669a-9 miR-6418 miR-7012 miR-219b miR-182 let-7a-1 miR-350 miR-219a-2 miR-494 miR-679 miR-465c-2 miR-1964 miR-467a-7 miR-6540 miR-7015 miR-7652 miR-183 let-7a-2 miR-351 miR-361 miR-376c miR-495 miR-466b-1 miR-1966 miR-466b-7 miR-6899 miR-7016 miR-7654 miR-184 let-7b miR-135b miR-362 miR-487b miR-449c miR-466b-2 miR-1968 miR-669p-2 miR-6901 miR-7023 miR-7656 miR-185 let-7c-1 miR-101b miR-363 miR-369 miR-146b miR-466b-3 miR-1839 miR-467a-8 miR-6906 miR-7024 miR-7658 miR-186 let-7c-2 miR-1a-2 miR-365-1 miR-20b miR-669b miR-466c-1 miR-1981 miR-669a-10 miR-6911 miR-7025 miR-7667 miR-187 let-7e miR-107 miR-376a miR-450a-2 miR-669a-2 miR-466e miR-1982 miR-467a-9 miR-6912 miR-7029 miR-7668 miR-188 let-7f-1 miR-10a miR-377 miR-503 miR-669a-3 miR-466f-1 miR-664 miR-669a-11 miR-6913 miR-7030 miR-7670 miR-24-1 let-7f-2 miR-17 miR-378a miR-291b miR-467b miR-466f-2 miR-3057 miR-467a-10 miR-6915 miR-7031 miR-7671 Table S2 614 confidently annotated miRNAs in mouse (Mus musculus).

1           51          101         151
miR-1       miR-31b     miR-976     miR-4951
miR-2a-1    miR-304     miR-977     miR-4968
miR-2a-2    miR-305     miR-978     miR-4969
miR-2b-1    miR-9c      miR-979     miR-9369
miR-2b-2    miR-306     miR-980     miR-9388
miR-3       miR-9b      miR-981
miR-4       let-7       miR-982
miR-5       miR-125     miR-983-1
miR-6-1     miR-307a    miR-983-2
miR-6-2     miR-308     miR-984
miR-6-3     miR-31a     miR-927
miR-7       miR-309     miR-985
miR-8       miR-310     miR-986
miR-9a      miR-311     miR-987
miR-10      miR-312     miR-988
miR-11      miR-313     miR-989
miR-12      miR-314     miR-137
miR-13a     miR-315     miR-990
miR-13b-1   miR-316     miR-991
miR-13b-2   miR-317     miR-992
miR-14      miR-318     miR-929
miR-263a    miR-2c      miR-993
miR-184     miR-iab-4   miR-994
miR-274     miR-iab-8   miR-995
miR-275     miR-954     miR-996
miR-92a     miR-190     miR-252
miR-219     miR-193     miR-997
miR-276a    miR-956     miR-998
miR-277     miR-957     miR-999
miR-278     miR-958     miR-1000
miR-133     miR-375     miR-1001
miR-279     miR-959     miR-1003
miR-33      miR-960     miR-1006
miR-281-1   miR-961     miR-1007
miR-282     miR-962     miR-1008
miR-283     miR-963     miR-1010
miR-284     miR-964     miR-1012
miR-281-2   miR-932     miR-1014
miR-34      miR-965     miR-1016
miR-124     miR-966     miR-2279
miR-79      miR-967     miR-307b
miR-276b    miR-1002    miR-2491
miR-285     miR-968     miR-2493
miR-100     miR-969     miR-2494
miR-92b     miR-970     miR-2497
miR-286     miR-971     miR-2498
miR-87      miR-972     miR-2499
miR-263b    miR-973     miR-2501
bantam      miR-974     miR-3641
miR-303     miR-975     miR-4949

Table S3 155 confidently annotated miRNAs in fly (Drosophila melanogaster).

1           51
miR-1       miR-232
miR-2       miR-233
miR-34      miR-234
miR-35      miR-235
miR-36      miR-236
miR-37      miR-238
miR-38      miR-239a
miR-39      miR-244
miR-40      miR-250
miR-41      miR-252
miR-42      miR-255
miR-43      miR-259
miR-44      miR-788
miR-45      miR-789-2
miR-46      miR-790
miR-47      miR-795
miR-49      miR-1820
miR-50      miR-1822
miR-51      miR-1823
miR-52      miR-1829b
miR-53      miR-1829c
miR-54      miR-1832a
miR-55      miR-4813
miR-56      miR-4816
miR-57      miR-5592-1
miR-61      miR-5592-2
miR-63      miR-5593-1
miR-64      miR-5593-2
miR-65      miR-2217b-2
miR-66      miR-2217b-3
miR-67      miR-2217b-4
miR-70
miR-71
miR-72
miR-73
miR-74
miR-75
miR-79
miR-80
miR-81
miR-82
miR-83
miR-86
miR-87
miR-90
miR-124
miR-228
miR-229
miR-230
miR-231

Table S4 81 confidently annotated miRNAs in worm (Caenorhabditis elegans).

#   miRNA gene family       # of genes  Extended seed (nucleotides 2-8)
1   let-7/miR-98            12          GAGGUAG
2   miR-1/206               3           GGAAUGU
3   miR-10                  2           ACCCUGU
4   miR-101                 2           ACAGUAC, UACAGUA
5   miR-103/107             3           GCAGCAU
6   miR-122                 1           GGAGUGU
7   miR-124                 3           AAGGCAC, UAAGGCA
8   miR-125                 3           CCCUGAG
9   miR-126                 1           CGUACCG, GUACCGU
10  miR-128                 2           CACAGUG
11  miR-129                 2           UUUUUGC, AGCCCUU
12  miR-130/301/454         5           AGUGCAA
13  miR-1306                1           CACCUCC
14  miR-132/212             2           AACAGUC
15  miR-133                 3           UGGUCCC, UUGGUCC
16  miR-135                 3           AUGGCUU
17  miR-137                 1           UAUUGCU
18  miR-138                 2           GCUGGUG
19  miR-139                 1           CUACAGU
20  miR-140                 1           AGUGGUU, CCACAGG, ACCACAG
21  miR-141/200a            2           AACACUG
22  miR-142                 1           AUAAAGU, GUAGUGU, UAGUGUU
23  miR-143                 1           GAGAUGA
24  miR-144                 1           ACAGUAU
25  miR-145                 1           UCCAGUU
26  miR-146                 2           GAGAACU
27  miR-147                 1           UGUGCGG
28  miR-148/152             3           CAGUGCA
29  miR-15/16/195/424/497   7           AGCAGCA
30  miR-150                 1           CUCCCAA
31  miR-153                 2           UGCAUAG
32  miR-155                 1           UAAUGCU
33  miR-17/20/93/106/519d   7           AAAGUGC
34  miR-18                  2           AAGGUGC
35  miR-181                 6           ACAUUCA
36  miR-182                 1           UUGGCAA
37  miR-183                 1           AUGGCAC, UGGCACU
38  miR-184                 1           GGACGGA
39  miR-187                 1           CGUGUCU
40  miR-19                  3           GUGCAAA
41  miR-190                 2           GAUAUGU
42  miR-191                 1           AACGGAA
43  miR-192/215             2           UGACCUA
44  miR-193                 2           GGGUCUU, ACUGGCC
45  miR-194                 2           GUAACAG
46  miR-196                 3           AGGUAGU
47  miR-199                 3           CCAGUGU, CAGUAGU
48  miR-200bc/429           3           AAUACUG
49  miR-202                 1           UCCUAUG
50  miR-203a                1           UGAAAUG, GAAAUGU
51  miR-204/211             2           UCCCUUU
52  miR-205                 1           CCUUCAU
53  miR-208                 2           UAAGACG
54  miR-21/590              2           AGCUUAU
55  miR-210                 1           UGUGCGU
56  miR-214                 1           GCCUGUC
57  miR-216a                1           AAUCUCA
58  miR-216b                1           AAUCUCU
59  miR-217                 1           ACUGCAU
60  miR-218                 2           UGUGCUU
61  miR-219                 2           GAUUGUC
62  miR-22                  1           AGCUGCC
63  miR-221/222             2           GCUACAU
64  miR-223                 1           GUCAGUU
65  miR-23                  2           UCACAUU
66  miR-24                  2           GGCUCAG
67  miR-25/32/92/363/367    7           AUUGCAC
68  miR-26                  3           UCAAGUA
69  miR-27                  2           UCACAGU
70  miR-29                  4           AGCACCA
71  miR-30                  6           GUAAACA
72  miR-302abd              5           AAGUGCU
73  miR-302c                1           AGUGCUU
74  miR-31                  1           GGCAAGA
75  miR-33                  1           UGCAUUG
76  miR-338                 1           CCAGCAU
77  miR-34/449              6           GGCAGUG
78  miR-365                 2           AAUGCCC
79  miR-375                 1           UUGUUCG
80  miR-425                 1           AUGACAC
81  miR-455                 1           AUGUGCC, CAGUCCA, UGCAGUC
82  miR-489                 1           UGACAUC
83  miR-499                 1           UAAGACU
84  miR-551                 2           CGACCCA
85  miR-7                   3           GGAAGAC
86  miR-802                 1           CAGUAAC
87  miR-9                   3           CUUUGGU
88  miR-96/1271             2           UUGGCAC
89  miR-99/100              3           ACCCGUA

Table S5 89 broadly conserved miRNA families comprising 200 miRNA genes. The extended seed sequence (nucleotides 2-8), typical for each miRNA family, is shown.
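
The extended seed in Table S5 corresponds to positions 2-8 of the mature miRNA, counted from the 5′ end. As a minimal sketch (the function name `extended_seed` is hypothetical and not part of the thesis; the sequences are the human miR-17 family members as listed in Table S6), the following Python snippet extracts this window and illustrates that the six human members share the seed AAAGUGC given for the miR-17/20/93/106/519d family:

```python
def extended_seed(mature_seq: str) -> str:
    """Return nucleotides 2-8 (1-based, inclusive) of an ungapped 5'-3' RNA sequence."""
    seq = mature_seq.upper()
    if set(seq) - set("ACGU"):
        raise ValueError("expected an ungapped RNA sequence (A, C, G, U only)")
    if len(seq) < 8:
        raise ValueError("sequence is shorter than 8 nucleotides")
    # Python slices are 0-based and end-exclusive, so [1:8] covers positions 2-8.
    return seq[1:8]


# Human miR-17 family members, copied from Table S6 (alignment gaps removed).
mir17_family = {
    "hsa-miR-17-5p": "CAAAGUGCUUACAGUGCAGGUAG",
    "hsa-miR-20a-5p": "UAAAGUGCUUAUAGUGCAGGUAG",
    "hsa-miR-20b-5p": "CAAAGUGCUCAUAGUGCAGGUAG",
    "hsa-miR-93-5p": "CAAAGUGCUGUUCGUGCAGGUAG",
    "hsa-miR-106a-5p": "AAAAGUGCUUACAGUGCAGGUAG",
    "hsa-miR-106b-5p": "UAAAGUGCUGACAGUGCAGAU",
}

seeds = {name: extended_seed(seq) for name, seq in mir17_family.items()}
for name, seed in seeds.items():
    print(f"{name}\t{seed}")
```

The members differ at position 1 and in the central and 3′ regions, but the seed window is identical, which is the criterion used for grouping them into one family.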

miRNA            Sequence (5′-3′)          Length (nt)
hsa-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
mmu-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
rno-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
gga-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
ggo-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
lca-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
age-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
ppa-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
ppy-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
ptr-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
mml-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
sla-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
lla-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
mne-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
xtr-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
bta-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
mdo-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
oan-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
ssc-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
aca-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
cgr-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
ccr-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
oar-miR-17-5p    CAAAGUGCUUACAGUGCAGGUA--  22
ssa-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
cpi-miR-17-5p    CAAAGUGCUUACAGUGCAGGU---  21
ami-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
cli-miR-17-5p    CAAAGUGCUUACAGUGCAGGUA--  22
pbv-miR-17-5p    CAAAGUGCUUACAGUGCAGGUA--  22
chi-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
tch-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
oha-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
pal-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
gmo-miR-17-5p    CAAAGUGCUUACAGUGCAGGUA--  22
xla-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAGU  24
cpo-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
dno-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
ocu-miR-17-5p    CAAAGUGCUUACAGUGCAGGUAG-  23
                 *********************

hsa-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
mmu-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
pma-miR-20a-5p   CAAAGUGCUUAUAGUGCAGGUAG  23
rno-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
gga-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
dre-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
ssc-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUA-  22
mml-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUA-  22
xtr-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
mdo-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
oan-miR-20a-5p   UAAAGUGCUUAUAGUGCAGG---  20
tgu-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
aca-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUA-  22
ccr-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
ssa-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
cpi-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUA-  22
ami-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
cli-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUA-  22
pbv-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
chi-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
tch-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUA-  22
oha-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
pal-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
gmo-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
xla-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
cpo-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
dno-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
ocu-miR-20a-5p   UAAAGUGCUUAUAGUGCAGGUAG  23
                  *******************

hsa-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
mmu-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
gga-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
rno-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
oan-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
mml-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
tgu-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
aca-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUA--  22
mdo-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
cpi-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUA--  22
cli-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
pbv-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
oha-miR-20b-5p   -AAAGUGCUCAUAGUGCAGGUA--  21
pal-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
cpo-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAGU  24
dno-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAG-  23
ocu-miR-20b-5p   CAAAGUGCUCAUAGUGCAGGUAGU  24
dre-miR-20b-5p   CAAAGUGCUCACAGUGCAGGUAG-  23
ssa-miR-20b-5p   CAAAGUGCUCACAGUGCAGGUA--  22
gmo-miR-20b-5p   CAAAGUGCUCACAGUGCAGAUA--  22
                  ********** ******* **

hsa-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
mmu-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
gmo-miR-93-5p    AAAAGUGCUGUUUGUGCAGGUAG  23
rno-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
mml-miR-93-5p    -AAAGUGCUGUUCGUGCAGGUAG  22
mdo-miR-93-5p    -AAAGUGCUGUUCGUGCAGGUAG  22
cgr-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
ami-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
chi-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
tch-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
pal-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
xla-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
cpo-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
dno-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
ocu-miR-93-5p    CAAAGUGCUGUUCGUGCAGGUAG  23
cpi-miR-93-5p    CAAAGUGCUGUUCGUGCAGGU--  21
                  *********** ********

hsa-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAG-  23
mmu-miR-106a-5p  CAAAGUGCUAACAGUGCAGGUAG-  23
ssa-miR-106a-5p  UAAAGUGCUUACAGUGCAGGUA--  22
mml-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAGC  24
cpi-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAG-  23
chi-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAGC  24
pal-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAG-  23
cpo-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAG-  23
ocu-miR-106a-5p  AAAAGUGCUUACAGUGCAGGUAG-  23
dno-miR-106a-5p  AAAAGUGCUUGCAGUGCAGGUAG-  23
                  ********  ***********

hsa-miR-106b-5p  UAAAGUGCUGACAGUGCAGAU-  21
mmu-miR-106b-5p  UAAAGUGCUGACAGUGCAGAU-  21
ssa-miR-106b-5p  AAAAGUGCUUACAGUGCAGGUA  22
rno-miR-106b-5p  UAAAGUGCUGACAGUGCAGAU-  21
mml-miR-106b-5p  UAAAGUGCUGACAGUGCAGAU-  21
cgr-miR-106b-5p  UAAAGUGCUGACAGUGCAGAUA  22
chi-miR-106b-5p  UAAAGUGCUGACAGUGCAGAU-  21
pal-miR-106b-5p  UAAAGUGCUGACAGUGCAGAUA  22
cpo-miR-106b-5p  UAAAGUGCUGACAGUGCAGAUA  22
dno-miR-106b-5p  UAAAGUGCUGACAGUGCAGAUA  22
ocu-miR-106b-5p  UAAAGUGCUGACAGUGCAGAUA  22
cpi-miR-106b-5p  UAAAGUGCUGACAGUGCAGGU-  21
                  ******** ********* *

Table S6 Multiple sequence alignments of all miRBase-annotated miR-17 family members. Human members shown in red. Numbers are miRNA lengths in nucleotides.
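
The asterisk row beneath each alignment block in Table S6 appears to mark the columns in which every listed sequence carries the same nucleotide. A minimal sketch of how such a row can be derived is given below; the function name `conservation_line` is hypothetical, gap characters (`-`) are assumed to count as non-conserved, and only four of the miR-106b-5p sequences from the table are used for brevity:

```python
def conservation_line(aligned: list[str]) -> str:
    """Mark each alignment column with '*' if all sequences agree, else ' '."""
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    marks = []
    for column in zip(*aligned):
        # A column is conserved only if it has no gap and a single distinct base.
        conserved = "-" not in column and len(set(column)) == 1
        marks.append("*" if conserved else " ")
    return "".join(marks)


# Subset of the miR-106b-5p block of Table S6 (hsa, ssa, cgr, cpi), gaps kept.
mir106b = [
    "UAAAGUGCUGACAGUGCAGAU-",
    "AAAAGUGCUUACAGUGCAGGUA",
    "UAAAGUGCUGACAGUGCAGAUA",
    "UAAAGUGCUGACAGUGCAGGU-",
]

line = conservation_line(mir106b)
for seq in mir106b:
    print(seq)
print(line)
```

For this subset the variable columns are positions 1, 10, 20 and 22, so the seed region (positions 2-8) falls entirely within the first conserved stretch.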

#   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS
1   AAK1         1        2          71  B3GAT3       1        0          141 CHRFAM7A     1        0
2   AASS         1        0          72  B4GAT1       1        0          142 CHST3        2        0
3   ABCB9        1        0          73  BACE2        1        0          143 CHUK         1        0
4   ABCC10       1        0          74  BAHD1        1        1          144 CLASP2       1        0
5   ABCC5        1        0          75  BBX          1        1          145 CLCN5        3        0
6   ABL2         2        2          76  BCAT1        1        0          146 CLOCK        1        2
7   AC004080.3   1        0          77  BCL2L1       1        0          147 CLP1         1        0
8   AC011499.1   2        0          78  BCL7A        1        0          148 CLPB         1        0
9   ACAP3        1        0          79  BDP1         2        0          149 CMTM6        1        0
10  ACER2        1        1          80  BEGAIN       1        0          150 CNNM2        1        1
11  ACER3        1        0          81  BEND4        2        0          151 CNOT6L       1        1
12  ACPP         1        0          82  BIN3         1        0          152 CNTRL        1        0
13  ACSL6        1        0          83  BLOC1S5      1        0          153 COIL         1        0
14  ACVR2B       2        0          84  BMP2         1        1          154 COL11A1      1        0
15  ADAM15       1        0          85  BMPR1A       1        0          155 COL1A1       1        0
16  ADAMTS1      1        0          86  BNC2         1        3          156 COL27A1      2        0
17  ADCY9        1        0          87  BRWD3        1        0          157 COL4A1       1        1
18  ADGRL3       1        0          88  BTBD9        1        0          158 COL4A2       1        0
19  ADIPOR2      1        1          89  BTF3L4       1        0          159 COL4A3BP     1        0
20  AEN          2        1          90  BZW1         2        1          160 COL4A5       1        0
21  AFF2         1        0          91  C11orf57     1        0          161 COL4A6       1        0
22  AGO1         1        2          92  C14orf28     3        1          162 COL5A2       1        0
23  AGO3         3        3          93  C15orf39     1        0          163 COL9A3       1        0
24  AGO4         2        0          94  C15orf41     1        0          164 CPEB1        1        0
25  AHCTF1       1        1          95  C16orf52     1        1          165 CPEB2        2        0
26  AIFM1        2        0          96  C19orf47     1        0          166 CPM          1        0
27  AKAP5        1        0          97  C2orf88      1        0          167 CPSF4        1        0
28  AKAP6        1        0          98  C5orf51      1        0          168 CRBN         1        0
29  AKT2         1        0          99  C6orf203     1        0          169 CRK          1        1
30  AL139011.2   1        1          100 C9orf40      1        0          170 CRTAP        1        0
31  ALDH6A1      1        0          101 CACNB4       1        0          171 CRY2         1        1
32  ALKBH1       1        0          102 CADM2        1        2          172 CSNK2A1      1        0
33  AMMECR1L     1        0          103 CALM1        1        0          173 CSRNP3       1        2
34  AMOT         1        0          104 CALU         1        0          174 CTPS1        1        0
35  AMT          1        0          105 CANT1        1        1          175 CYB561D1     1        1
36  ANKFY1       1        2          106 CARNMT1      1        0          176 CYTH3        1        0
37  ANKRD28      1        1          107 CASK         1        0          177 DAPK1        1        0
38  ANKRD46      1        0          108 CASKIN1      1        0          178 DARS2        1        0
39  ANKRD52      1        3          109 CASP3        1        0          179 DCUN1D2      1        0
40  AP1S1        1        0          110 CBARP        1        0          180 DCUN1D3      1        1
41  APBA1        1        0          111 CBX2         1        0          181 DDI2         1        0
42  APBB3        1        0          112 CBX5         2        0          182 DDN          1        0
43  APPBP2       1        0          113 CCDC144A     1        1          183 DENND6A      1        0
44  ARHGAP20     1        0          114 CCDC93       1        0          184 DHX57        1        0
45  ARHGAP28     1        0          115 CCNF         1        0          185 DIAPH2       2        0
46  ARHGEF38     1        0          116 CCNJL        1        0          186 DICER1       2        0
47  ARHGEF7      1        1          117 CCNY         1        0          187 DKK3         1        0
48  ARID3B       5        0          118 CCSAP        1        0          188 DLC1         2        0
49  ARL4D        1        0          119 CDC25A       1        0          189 DLGAP4       1        0
50  ARL5A        1        1          120 CDC42SE1     1        0          190 DLST         1        0
51  ARL5B        1        0          121 CDCA8        1        0          191 DMD          1        0
52  ARL6IP6      1        0          122 CDK14        1        0          192 DNA2         1        0
53  ARMC8        1        1          123 CDK6         1        1          193 DNAJA2       1        0
54  ARMT1        1        0          124 CDK8         3        0          194 DNAJB14      1        0
55  ARPP19       1        0          125 CDV3         1        0          195 DNAJC1       1        0
56  ASAP1        1        0          126 CDYL         1        0          196 DNAL1        1        1
57  ATAD2B       1        0          127 CECR2        1        0          197 DPF2         1        0
58  ATG16L1      1        1          128 CELF1        1        0          198 DPH3         1        0
59  ATG4B        1        0          129 CEP85L       1        0          199 DPP3         1        1
60  ATP2A2       1        0          130 CERCAM       1        1          200 DPYSL3       1        0
61  ATP2B4       1        0          131 CFL2         1        2          201 DST          1        0
62  ATP6V1C1     1        0          132 CHD4         1        0          202 DTX2         2        0
63  ATPAF1       1        0          133 CHD7         1        0          203 DUSP1        1        0
64  ATXN1L       1        2          134 CHD9         1        2          204 DUSP16       2        0
65  ATXN7L3      1        0          135 CHIC1        2        1          205 DUSP22       1        0
66  B3GAT3       1        0          136 CHRFAM7A     1        0          206 DUSP7        1        0
67  B4GAT1       1        0          137 CHST3        2        0          207 DUSP9        1        0
68  BACE2        1        0          138 CHUK         1        0          208 DVL3         1        0
69  BAHD1        1        1          139 CLASP2       1        0          209 DYRK1A       1        1
70  BBX          1        1          140 CLCN5        3        0          210 DYRK2        2        1

#   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS
211 DUSP7        1        0          281 GOLGA4       1        0          351 KCNC4        1        0
212 DUSP9        1        0          282 GOLGA7       1        0          352 KCNJ11       1        0
213 DVL3         1        0          283 GOLT1B       1        0          353 KCNQ4        1        0
214 DYRK1A       1        1          284 GOPC         1        0          354 KCTD10       1        0
215 DYRK2        2        1          285 GPAT3        1        0          355 KCTD17       1        0
216 DZIP1        1        0          286 GPAT4        1        0          356 KCTD21       1        0
217 E2F2         1        1          287 GPATCH2      2        0          357 KDM3A        1        0
218 E2F5         1        1          288 GPATCH3      1        0          358 KIAA0391     1        0
219 E2F6         1        0          289 GPC4         1        0          359 KIAA0895L    1        0
220 EDA          1        0          290 GPCPD1       2        0          360 KIAA0930     1        0
221 EDEM3        1        0          291 GPR137       1        0          361 KIAA1147     1        1
222 EEA1         1        1          292 GPR137C      1        3          362 KIAA1161     1        0
223 EFHD2        1        0          293 GPR157       1        1          363 KIAA1549     1        0
224 EGLN2        1        0          294 GPR63        1        1          364 KIAA1958     2        0
225 EIF2S2       1        0          295 GRAMD1B      1        0          365 KIF21B       1        0
226 EIF4G2       1        1          296 GREB1        1        0          366 KIF2A        1        0
227 ELF4         1        0          297 GREB1L       1        0          367 KLF8         2        0
228 ELK4         1        4          298 GRK3         1        0          368 KLF9         1        1
229 ELOVL4       1        0          299 GRPEL2       1        0          369 KLHDC8B      1        0
230 EMB          1        0          300 GXYLT1       2        2          370 KLHL13       1        0
231 ENTPD7       1        0          301 GYG2         1        0          371 KLHL23       1        0
232 EOGT         2        0          302 HABP4        1        1          372 KLHL36       1        1
233 EPHA3        1        0          303 HAND1        1        0          373 KLHL6        1        0
234 EPHA4        1        2          304 HAND2        1        0          374 KMT2D        1        0
235 ERCC4        1        0          305 HAS2         1        1          375 KMT2E        1        0
236 ERCC6        1        0          306 HBEGF        1        0          376 KPNA1        1        0
237 ERGIC1       1        0          307 HDLBP        1        0          377 KPNA4        1        1
238 ERGIC2       1        0          308 HDX          1        0          378 KPNA5        1        1
239 ERO1A        2        0          309 HECTD2       1        1          379 KREMEN1      1        0
240 ERP29        1        0          310 HECTD4       1        0          380 L2HGDH       1        0
241 ETNK2        1        0          311 HELZ         1        0          381 LBH          1        0
242 ETV3         1        0          312 HIC2         4        0          382 LBR          1        0
243 EXOC5        1        0          313 HIF3A        2        0          383 LCOR         2        4
244 EZH1         1        2          314 HIP1         1        0          384 LCORL        1        0
245 FADS3        1        0          315 HIPK2        1        0          385 LDB1         1        0
246 FAM104A      1        0          316 HMGA1        1        0          386 LEPROTL1     1        0
247 FAM135A      1        0          317 HMGA2        7        1          387 LGR4         1        0
248 FAM189A1     1        1          318 HNRNPA1      1        0          388 LIMD2        1        0
249 FAM208A      1        0          319 HOOK1        1        0          389 LIMK2        1        0
250 FAM214B      1        0          320 HOXA1        2        0          390 LIN28B       4        0
251 FAM222B      2        1          321 HOXA9        1        0          391 LINGO1       1        0
252 FAM84B       1        1          322 HOXC11       1        0          392 LIPT2        1        0
253 FARP1        1        0          323 HS2ST1       1        1          393 LMLN         1        0
254 FAS          1        0          324 HSPE1-MOB4   1        0          394 LPGAT1       2        1
255 FASTK        1        1          325 ICK          1        0          395 LRFN4        1        0
256 FAXC         1        0          326 ICMT         1        1          396 LRIG2        1        0
257 FBXL19       1        0          327 IDH2         1        0          397 LRIG3        1        0
258 FBXO21       1        1          328 IGDCC3       3        0          398 LRRC20       1        1
259 FBXO30       1        0          329 IGDCC4       1        0          399 LRRC59       1        0
260 FBXO32       1        0          330 IGF2BP2      2        0          400 LRRC8B       1        0
261 FBXO45       1        0          331 IGF2BP3      1        0          401 LTN1         1        0
262 FGD6         1        0          332 IGSF1        1        0          402 LUC7L3       1        0
263 FGF5         1        1          333 IKBKAP       1        0          403 LZIC         1        1
264 FIGN         6        1          334 IKZF2        1        0          404 MAP3K1       1        1
265 FKRP         1        0          335 IL6R         1        0          405 MAP3K13      1        0
266 FNDC3A       2        1          336 INPP5A       1        0          406 MAP3K2       1        4
267 FNDC3B       2        1          337 INSR         1        0          407 MAP3K3       1        1
268 FNIP1        1        0          338 INTS2        1        0          408 MAP3K9       1        1
269 FOPNL        1        0          339 INTS6L       1        0          409 MAP4K3       1        0
270 FOXP1        1        0          340 INTU         1        0          410 MAP4K4       1        0
271 FOXP2        3        0          341 IPO11        1        0          411 MAPK1IP1L    1        1
272 FRMD4B       2        1          342 IPO9         1        1          412 MAPK8        1        0
273 FRMD5        1        0          343 IQCB1        1        0          413 MBD2         2        1
274 FRS2         1        2          344 IRGQ         1        0          414 MBTD1        1        0
275 FSD2         1        0          345 ITSN1        1        0          415 MBTPS2       1        0
276 FSTL4        1        1          346 JOSD1        1        1          416 MDM4         1        0
277 FZD4         1        1          347 KATNAL1      1        1          417 MED6         1        1
278 G3BP1        1        2          348 KATNBL1      1        0          418 MED8         1        0
279 GAB2         1        0          349 KCMF1        1        0          419 MEF2C        2        0
280 GABPA        1        0          350 KCNC3        1        0          420 MEF2D        1        0

#   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS
421 MEGF11       1        0          491 PAWR         1        0          561 RASGRP1      1        0
422 MEIS1        1        0          492 PAX2         1        0          562 RAVER2       1        0
423 MEIS2        1        0          493 PAX3         1        0          563 RBFOX2       1        0
424 MEX3A        1        0          494 PBX1         2        0          564 RBM38        1        0
425 MFAP3L       1        1          495 PBX2         2        0          565 RBMS1        1        1
426 MFSD4A       1        0          496 PBX3         2        1          566 RBPJ         1        0
427 MFSD4B       1        0          497 PCDH19       1        0          567 RCSD1        1        0
428 MGAT4A       1        0          498 PCGF3        1        0          568 RDX          1        0
429 MGLL         2        0          499 PCTP         1        0          569 REEP1        1        0
430 MIEF1        1        0          500 PCYT1B       1        1          570 RGMA         1        1
431 MIOS         1        0          501 PDE12        2        0          571 RGS16        1        0
432 MLLT10       1        0          502 PDHB         1        0          572 RGS17        1        2
433 MLXIP        1        0          503 PDP2         1        0          573 RICTOR       1        0
434 MON2         1        0          504 PDPR         1        1          574 RIMKLA       1        1
435 MPST         1        0          505 PDSS1        1        0          575 RIMS3        1        0
436 MSANTD3      1        0          506 PDZD8        1        0          576 RIOK3        1        0
437 MSN          1        0          507 PEX11B       1        0          577 RNF152       2        0
438 MTDH         1        0          508 PGM2L1       2        1          578 RNF170       1        0
439 MTMR12       1        0          509 PHACTR2      1        0          579 RNF20        1        0
440 MTMR3        1        2          510 PHF8         1        0          580 RNF217       1        1
441 MTUS1        1        0          511 PIGA         1        0          581 RNF38        1        1
442 MVB12B       1        0          512 PIGU         1        0          582 RNF44        1        0
443 MYCBP        1        0          513 PIK3IP1      1        0          583 RNFT1        1        0
444 MYCL         1        1          514 PITPNM3      1        0          584 ROBO1        1        0
445 MYO1F        1        0          515 PKIA         1        1          585 RPS6KA3      1        1
446 MYO5B        1        1          516 PKN2         1        1          586 RPUSD2       1        0
447 NAA20        1        0          517 PLA2G15      1        0          587 RRAGD        1        2
448 NAA30        2        2          518 PLAGL2       1        1          588 RRM2         1        1
449 NAP1L1       2        0          519 PLD3         1        0          589 RRP1B        1        0
450 NAPEPLD      1        1          520 PLEKHA8      1        1          590 RSF1         1        0
451 NAT8L        1        0          521 PLEKHH1      1        0          591 RTCA         1        0
452 NCBP3        1        0          522 PLEKHO1      2        0          592 RUFY3        1        0
453 NCOA1        1        0          523 PLXNA2       1        0          593 RUNX1T1      1        0
454 NCOR1        1        1          524 PLXND1       1        0          594 SALL3        1        1
455 NDST2        1        0          525 PMAIP1       1        0          595 SALL4        1        0
456 NEFM         1        0          526 POC1B-GALNT4 1        1          596 SAMD12       1        1
457 NEK3         1        0          527 POGZ         1        0          597 SBK1         1        0
458 NEK9         1        2          528 POLR2D       1        0          598 SBNO1        1        0
459 NEMP1        1        0          529 POLR3D       1        0          599 SCD          1        0
460 NHLRC2       1        0          530 POU2F1       2        0          600 SCN4B        1        0
461 NHLRC3       1        1          531 POU6F2       1        1          601 SCN5A        1        0
462 NID2         1        0          532 PPARGC1A     1        0          602 SCYL3        1        0
463 NIPA1        1        0          533 PPP1R12B     1        0          603 SDK1         1        0
464 NKAP         1        0          534 PPP1R15B     2        1          604 SEC14L1      1        0
465 NKIRAS2      1        0          535 PPP2R2A      1        1          605 SEC31B       1        0
466 NLK          1        0          536 PPP3CA       1        0          606 SECISBP2L    1        0
467 NLN          1        0          537 PPTC7        1        0          607 SEMA4C       1        0
468 NME6         1        0          538 PQLC2        1        0          608 SEMA4F       1        0
469 NOL4L        1        0          539 PRDM5        1        0          609 SENP5        1        1
470 NOVA1        1        0          540 PRKAA2       1        0          610 SESN3        1        2
471 NPEPL1       1        0          541 PRKAB2       1        0          611 SESTD1       1        0
472 NR6A1        4        0          542 PRKAR2A      1        0          612 SFMBT1       1        2
473 NRAS         2        0          543 PRPF38B      1        0          613 SH2B3        1        0
474 NSD1         1        0          544 PRR14L       1        1          614 SKIDA1       1        1
475 NUMBL        1        0          545 PRRX1        1        1          615 SKIL         3        0
476 NUP155       1        0          546 PSENEN       1        0          616 SLC10A7      1        0
477 NXT2         1        0          547 PTAR1        2        0          617 SLC12A9      1        0
478 NYNRIN       1        0          548 PTPRD        2        2          618 SLC16A10     3        0
479 ONECUT2      3        0          549 PTPRU        1        0          619 SLC16A14     1        0
480 OSBPL3       1        0          550 PUDP         1        0          620 SLC16A9      1        1
481 OSTF1        1        0          551 PXDN         2        0          621 SLC22A23     1        1
482 OTUD3        1        1          552 PXMP4        1        0          622 SLC25A24     1        0
483 P4HA2        1        0          553 PYGO2        1        0          623 SLC25A27     1        1
484 PACS2        1        0          554 QARS         1        0          624 SLC25A32     1        0
485 PAG1         1        0          555 RAB11FIP4    2        1          625 SLC2A12      1        0
486 PAK1         1        0          556 RAB15        1        0          626 SLC30A1      1        1
487 PALM3        1        0          557 RAB30        1        1          627 SLC30A4      1        0
488 PARD6B       1        1          558 RAB3GAP2     1        0          628 SLC30A7      1        1
489 PARM1        1        0          559 RAB8B        1        1          629 SLC31A1      1        0
490 PARP8        1        0          560 RADIL        1        0          630 SLC38A9      1        0

#   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS  #   Gene name    let-7 TS miR-17 TS
631 SLC45A4      1        1          701 TET3         3        2          771 XPOT         1        0
632 SLC4A4       1        1          702 TEX261       1        0          772 XRN1         1        2
633 SLC52A3      1        0          703 TGDS         1        0          773 XYLT1        1        0
634 SLC5A6       1        0          704 TGFBR1       2        0          774 YAF2         1        0
635 SLC6A15      1        0          705 TGFBR3       1        0          775 YBX2         1        0
636 SLC9A9       1        0          706 THOC2        1        0          776 YOD1         2        2
637 SLCO5A1      1        0          707 THRA         2        2          777 YPEL2        1        1
638 SLK          1        1          708 TIA1         1        0          778 YTHDF3       1        2
639 SMAD2        1        0          709 TIMM17B      1        0          779 ZBTB10       2        0
640 SMARCAD1     2        0          710 TMC7         1        0          780 ZBTB16       3        0
641 SMARCC1      1        0          711 TMED5        1        0          781 ZBTB26       1        0
642 SMC1A        2        0          712 TMEM110      1        0          782 ZBTB37       2        2
643 SMCR8        1        0          713 TMEM135      1        0          783 ZBTB39       1        0
644 SMIM13       1        0          714 TMEM143      1        0          784 ZBTB5        1        0
645 SMIM3        1        0          715 TMEM167A     1        1          785 ZC3H3        1        0
646 SMUG1        1        0          716 TMEM178B     1        0          786 ZC3HAV1L     1        0
647 SNN          1        0          717 TMEM2        1        0          787 ZCCHC11      1        0
648 SNX1         1        0          718 TMEM234      1        0          788 ZFYVE26      1        2
649 SNX16        1        1          719 TMEM251      1        0          789 ZKSCAN5      1        0
650 SNX30        1        0          720 TMEM255A     1        0          790 ZNF200       2        0
651 SNX6         1        0          721 TMEM41A      1        0          791 ZNF202       1        1
652 SOCS4        1        0          722 TMEM65       1        0          792 ZNF24        1        0
653 SOX13        1        0          723 TMPPE        1        0          793 ZNF275       2        0
654 SOX6         1        0          724 TMX4         1        0          794 ZNF282       1        0
655 SP8          1        1          725 TNIK         1        0          795 ZNF322       1        0
656 SPATA13      1        0          726 TOB2         1        0          796 ZNF341       1        0
657 SPATA2       1        0          727 TOR1AIP2     1        0          797 ZNF354A      1        0
658 SPEG         1        0          728 TP53         2        0          798 ZNF354B      1        0
659 SPIRE1       1        0          729 TPP1         1        0          799 ZNF362       1        1
660 SPOCD1       1        0          730 TRIB1        1        0          800 ZNF451       1        0
661 SPRYD4       1        0          731 TRIM41       1        0          801 ZNF473       1        0
662 SPTBN4       1        0          732 TRIM71       7        2          802 ZNF512B      3        2
663 SREBF2       1        0          733 TRIOBP       1        1          803 ZNF516       1        0
664 SREK1        1        0          734 TRPM6        1        0          804 ZNF566       1        0
665 SREK1IP1     1        0          735 TSC1         1        0          805 ZNF583       1        0
666 SRGAP1       1        1          736 TSC22D2      1        1          806 ZNF620       1        0
667 SRGAP3       1        2          737 TSPAN18      1        0          807 ZNF641       1        0
668 SSH1         1        1          738 TSPAN2       1        0          808 ZNF644       2        0
669 ST3GAL1      1        1          739 TTC31        1        0          809 ZNF652       3        3
670 ST8SIA1      1        0          740 TTC39C       1        1          810 ZNF689       1        0
671 STAB2        1        0          741 TTLL4        2        0          811 ZNF697       1        1
672 STARD13      2        0          742 TUSC2        1        1          812 ZNF70        1        2
673 STAT3        1        2          743 TXLNA        1        2          813 ZNF74        1        1
674 STEAP3       1        0          744 TXLNG        1        0          814 ZNF740       1        0
675 STK24        1        0          745 UBE2G2       1        0          815 ZNF774       1        0
676 STK40        1        0          746 UBN2         1        1          816 ZNF783       1        0
677 STRBP        1        2          747 UGCG         1        0          817 ZNF784       1        0
678 STRN         1        0          748 ULK2         1        0          818 ZNF879       1        0
679 STX3         1        0          749 USP12        1        0          819 ZSWIM5       1        0
680 STXBP5       1        1          750 USP24        1        1
681 STYX         1        1          751 USP32        1        1
682 SUCLG2       1        0          752 USP44        3        0
683 SULF1        1        1          753 USP47        1        1
684 SULF2        1        0          754 USP49        1        0
685 SURF4        1        0          755 USP6         1        1
686 SUV39H2      1        0          756 UTRN         1        0
687 SWT1         1        0          757 VANGL2       2        0
688 SYNCRIP      1        1          758 VASH2        1        1
689 SYNJ2BP      1        0          759 VAV3         1        0
690 SYT1         1        0          760 VCPIP1       1        0
691 SYT11        1        0          761 VEZT         1        0
692 SYT2         1        0          762 VGLL3        1        0
693 SYT7         1        1          763 VSNL1        1        0
694 TAB2         1        0          764 WAPL         1        0
695 TAF9B        1        0          765 WARS2        1        0
696 TARBP2       2        0          766 WASL         1        0
697 TBKBP1       1        0          767 WDR37        1        2
698 TEAD3        1        0          768 WNK3         1        2
699 TECPR2       1        0          769 XK           1        0
700 TET2         2        1          770 XKR8         1        0

Table S7 819 TargetScan-predicted let-7 targets in human. Numbers of conserved let-7 and miR-17 target sites (TS) in the 3′ UTR are shown.
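
Table S7 lists, for each predicted let-7 target, the number of conserved target sites for both miRNA families. As an illustrative sketch only (the `TargetRow` type and `shared_targets` function are hypothetical names, and the six rows are a small excerpt copied from Table S7), genes carrying at least one conserved site for each family could be selected as follows:

```python
from typing import NamedTuple


class TargetRow(NamedTuple):
    gene: str
    let7_sites: int   # conserved let-7 target sites in the 3' UTR
    mir17_sites: int  # conserved miR-17 target sites in the 3' UTR


# Excerpt of Table S7 (gene name, let-7 TS, miR-17 TS).
rows = [
    TargetRow("AAK1", 1, 2),
    TargetRow("AASS", 1, 0),
    TargetRow("ABL2", 2, 2),
    TargetRow("HMGA2", 7, 1),
    TargetRow("LIN28B", 4, 0),
    TargetRow("TRIM71", 7, 2),
]


def shared_targets(table: list[TargetRow]) -> list[str]:
    """Return genes with at least one conserved site for both families."""
    return [r.gene for r in table if r.let7_sites >= 1 and r.mir17_sites >= 1]


shared = shared_targets(rows)
print(shared)
```

In this excerpt HMGA2 and TRIM71 pass the filter, whereas LIN28B, with four let-7 sites but no conserved miR-17 site, does not.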


Columns in each of the three blocks: #, gene name, let-7 TS, miR-17 TS.

1 AAK1 1 2 | 66 G3BP1 1 2 | 131 RIMKLA 1 1
2 ABL2 2 2 | 67 GAN 3 2 | 132 RNF217 1 1
3 ACER2 1 1 | 68 GAREM1 1 1 | 133 RNF38 1 1
4 ADIPOR2 1 1 | 69 GNS 1 2 | 134 RPS6KA3 1 1
5 AEN 2 1 | 70 GPR137C 1 3 | 135 RRAGD 1 2
6 AGO1 1 2 | 71 GPR157 1 1 | 136 RRM2 1 1
7 AGO3 3 3 | 72 GPR63 1 1 | 137 SALL3 1 1
8 AHCTF1 1 1 | 73 GXYLT1 2 2 | 138 SAMD12 1 1
9 AL139011.2 1 1 | 74 HABP4 1 1 | 139 SENP5 1 1
10 ANKFY1 1 2 | 75 HAS2 1 1 | 140 SESN3 1 2
11 ANKRD28 1 1 | 76 HECTD2 1 1 | 141 SFMBT1 1 2
12 ANKRD52 1 3 | 77 HMGA2 7 1 | 142 SKIDA1 1 1
13 ARHGEF7 1 1 | 78 HS2ST1 1 1 | 143 SLC16A9 1 1
14 ARL5A 1 1 | 79 ICMT 1 1 | 144 SLC22A23 1 1
15 ARMC8 1 1 | 80 IPO9 1 1 | 145 SLC25A27 1 1
16 ATG16L1 1 1 | 81 JOSD1 1 1 | 146 SLC30A1 1 1
17 ATXN1L 1 2 | 82 KATNAL1 1 1 | 147 SLC30A7 1 1
18 BAHD1 1 1 | 83 KIAA1147 1 1 | 148 SLC45A4 1 1
19 BBX 1 1 | 84 KLF9 1 1 | 149 SLC4A4 1 1
20 BMP2 1 1 | 85 KLHL36 1 1 | 150 SLK 1 1
21 BNC2 1 3 | 86 KPNA4 1 1 | 151 SNX16 1 1
22 BZW1 2 1 | 87 KPNA5 1 1 | 152 SP8 1 1
23 C14orf28 3 1 | 88 LCOR 2 4 | 153 SRGAP1 1 1
24 C16orf52 1 1 | 89 LPGAT1 2 1 | 154 SRGAP3 1 2
25 CADM2 1 2 | 90 LRRC20 1 1 | 155 SSH1 1 1
26 CANT1 1 1 | 91 LZIC 1 1 | 156 ST3GAL1 1 1
27 CCDC144A 1 1 | 92 MAP3K1 1 1 | 157 STAT3 1 2
28 CDK6 1 1 | 93 MAP3K2 1 4 | 158 STRBP 1 2
29 CERCAM 1 1 | 94 MAP3K3 1 1 | 159 STXBP5 1 1
30 CFL2 1 2 | 95 MAP3K9 1 1 | 160 STYX 1 1
31 CHD9 1 2 | 96 MAPK1IP1L 1 1 | 161 SULF1 1 1
32 CHIC1 2 1 | 97 MBD2 2 1 | 162 SYNCRIP 1 1
33 CLOCK 1 2 | 98 MED6 1 1 | 163 SYT7 1 1
34 CNNM2 1 1 | 99 MFAP3L 1 1 | 164 TET2 2 1
35 CNOT6L 1 1 | 100 MTMR3 1 2 | 165 TET3 3 2
36 COL4A1 1 1 | 101 MYCL 1 1 | 166 THRA 2 2
37 CRK 1 1 | 102 MYO5B 1 1 | 167 TMEM167A 1 1
38 CRY2 1 1 | 103 NAA30 2 2 | 168 TRIM71 7 2
39 CSRNP3 1 2 | 104 NAPEPLD 1 1 | 169 TRIOBP 1 1
40 CYB561D1 1 1 | 105 NCOR1 1 1 | 170 TSC22D2 1 1
41 DCUN1D3 1 1 | 106 NEK9 1 2 | 171 TTC39C 1 1
42 DNAL1 1 1 | 107 NHLRC3 1 1 | 172 TUSC2 1 1
43 DPP3 1 1 | 108 OTUD3 1 1 | 173 TXLNA 1 2
44 DYRK1A 1 1 | 109 PARD6B 1 1 | 174 UBN2 1 1
45 DYRK2 2 1 | 110 PBX3 2 1 | 175 USP24 1 1
46 E2F2 1 1 | 111 PCYT1B 1 1 | 176 USP32 1 1
47 E2F5 1 1 | 112 PDPR 1 1 | 177 USP47 1 1
48 EEA1 1 1 | 113 PGM2L1 2 1 | 178 USP6 1 1
49 EIF4G2 1 1 | 114 PKIA 1 1 | 179 VASH2 1 1
50 ELK4 1 4 | 115 PKN2 1 1 | 180 WDR37 1 2
51 EPHA4 1 2 | 116 PLAGL2 1 1 | 181 WNK3 1 2
52 EZH1 1 2 | 117 PLEKHA8 1 1 | 182 XRN1 1 2
53 FAM189A1 1 1 | 118 POC1B-GALNT4 1 1 | 183 YOD1 2 2
54 FAM222B 2 1 | 119 POU6F2 1 1 | 184 YPEL2 1 1
55 FAM84B 1 1 | 120 PPP1R15B 2 1 | 185 YTHDF3 1 2
56 FASTK 1 1 | 121 PPP2R2A 1 1 | 186 ZBTB37 2 2
57 FBXO21 1 1 | 122 PRR14L 1 1 | 187 ZFYVE26 1 2
58 FGF5 1 1 | 123 PRRX1 1 1 | 188 ZNF202 1 1
59 FIGN 6 1 | 124 PTPRD 2 2 | 189 ZNF362 1 1
60 FNDC3A 2 1 | 125 RAB11FIP4 2 1 | 190 ZNF512B 3 2
61 FNDC3B 2 1 | 126 RAB30 1 1 | 191 ZNF652 3 3
62 FRMD4B 2 1 | 127 RAB8B 1 1 | 192 ZNF697 1 1
63 FRS2 1 2 | 128 RBMS1 1 1 | 193 ZNF70 1 2
64 FSTL4 1 1 | 129 RGMA 1 1 | 194 ZNF74 1 1
65 FZD4 1 1 | 130 RGS17 1 2

Table S8 194 TargetScan-predicted let-7 and miR-17 co-targets in human. Numbers of conserved let-7 and miR-17 target sites (TS) in the 3′ UTR are shown.
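The relationship between Tables S7 and S8 can be illustrated with a minimal Python sketch. It assumes that a co-target is any let-7 target carrying at least one conserved miR-17 site, which is consistent with the rows shown but is not stated explicitly in the captions. The function name `co_targets` is hypothetical, and the five rows are copied from Table S7 for illustration only.

```python
# Rows copied from Table S7: (gene name, let-7 TS, miR-17 TS).
s7_rows = [
    ("TET3", 3, 2),
    ("TEX261", 1, 0),
    ("TP53", 2, 0),
    ("TRIM71", 7, 2),
    ("ZNF652", 3, 3),
]


def co_targets(rows):
    """Return genes with at least one conserved site for each miRNA family.

    Assumption (not stated in the captions): a co-target is any gene
    with let-7 TS >= 1 and miR-17 TS >= 1.
    """
    return [gene for gene, let7_ts, mir17_ts in rows
            if let7_ts >= 1 and mir17_ts >= 1]


print(co_targets(s7_rows))
```

Under this assumption the three genes returned (TET3, TRIM71, ZNF652) are the ones that also appear in Table S8, while TEX261 and TP53 do not.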


Rank  Gene name  log2 FC over mock  P value  let-7 TS  miR-17 TS
1  HMGA2  -2.50  3.430E-37  7  1
2  LIN28B  -1.74  3.313E-39  4  0
3  ARID3B  -1.45  2.492E-30  5  0
4  IGDCC3  -1.41  5.925E-09  3  0
5  SLC52A3  -1.33  7.766E-02  1  0
6  HAND1  -1.28  6.836E-13  1  0
7  HIC2  -1.23  2.090E-21  4  0
8  ZC3HAV1L  -1.14  3.531E-07  1  0
9  HOXA1  -1.11  2.216E-08  2  0
10  DICER1  -1.09  1.830E-16  2  0
11  PUDP  -1.05  2.383E-07  1  0
12  E2F5  -1.04  2.935E-10  1  1
13  BMP2  -1.03  2.224E-11  1  1
14  NR6A1  -1.01  2.628E-13  4  0
15  TRIM71  -0.98  5.934E-10  7  2
16  KCNJ11  -0.94  1.386E-04  1  0
17  FOXP2  -0.94  4.087E-07  3  0
18  TUSC2  -0.87  1.119E-08  1  1
19  PLAGL2  -0.81  8.190E-12  1  1
20  CHD7  -0.78  8.827E-11  1  0
21  AC004080.3  -0.77  6.482E-01  1  0
22  TTLL4  -0.76  4.989E-10  2  0
23  USP44  -0.75  4.790E-04  3  0
24  C15orf39  -0.75  5.045E-08  1  0
25  SKIL  -0.74  1.942E-06  3  0
26  FNDC3A  -0.73  3.820E-10  2  1
27  NEFM  -0.73  1.558E-04  1  0
28  LEPROTL1  -0.72  3.023E-07  1  0
29  DPH3  -0.71  3.424E-07  1  0
30  PPARGC1A  -0.70  3.614E-04  1  0
31  STX3  -0.69  2.286E-07  1  0
32  ATP6V1C1  -0.69  2.948E-08  1  0
33  SLC5A6  -0.67  5.566E-04  1  0
34  ACVR2B  -0.66  1.236E-07  2  0
35  CARNMT1  -0.65  2.907E-06  1  0
36  PALM3  -0.65  2.464E-01  1  0
37  ACPP  -0.65  1.486E-01  1  0
38  GREB1  -0.64  1.063E-04  1  0
39  HAS2  -0.63  1.355E-02  1  1
40  TMEM2  -0.63  4.427E-07  1  0
41  XK  -0.62  2.906E-02  1  0
42  DUSP9  -0.62  5.872E-06  1  0
43  HOXA9  -0.62  4.426E-06  1  0
44  SALL3  -0.62  6.982E-02  1  1
45  TGFBR3  -0.62  2.793E-07  1  0
46  PBX2  -0.61  2.214E-07  2  0
47  FAM222B  -0.61  2.543E-04  2  1
48  PLA2G15  -0.60  2.236E-04  1  0
49  TIMM17B  -0.59  8.009E-06  1  0
50  GNG5  -0.58  3.003E-06  1  0

Table S9 Top 50 most repressed let-7 targets upon transfection of 50 nM let-7a duplex. TargetScan-predicted let-7 target genes are ranked according to log2 fold changes over mock transfection. Shown are P values and numbers of conserved let-7 and miR-17 target sites (TS) in the 3′ UTR.
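To make the log2 fold changes in Table S9 easier to read, the following Python sketch converts them into the fraction of transcript remaining relative to mock transfection and reproduces the ranking rule from the caption (most negative log2 FC first). The helper name `remaining_fraction` is hypothetical, and only four rows from the table are included for illustration; this is not the analysis pipeline used for the thesis.

```python
# A few rows copied from Table S9: (gene name, log2 fold change over mock).
rows = [
    ("GNG5", -0.58),
    ("LIN28B", -1.74),
    ("HMGA2", -2.50),
    ("ARID3B", -1.45),
]


def remaining_fraction(log2_fc):
    """Convert a log2 fold change into a linear fold change (fraction of mock)."""
    return 2.0 ** log2_fc


# Rank from most to least repressed, i.e. ascending log2 fold change.
ranked = sorted(rows, key=lambda row: row[1])

for rank, (gene, log2_fc) in enumerate(ranked, start=1):
    fraction = remaining_fraction(log2_fc)
    print(f"{rank} {gene}: log2 FC {log2_fc:+.2f}, "
          f"{fraction:.2f} of mock level ({(1 - fraction) * 100:.0f}% lower)")
```

A log2 fold change of -1 corresponds to half the mock level, so the value of -2.50 for HMGA2 corresponds to less than one fifth of the mock level.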


[Structure images not reproduced. Columns: pre-miR-106a; mfold-predicted secondary structure. Rows: wt; seed-mutant; 3′ end mutant #1; 3′ end mutant #2; 3′ end mutant #3.]

Table S10 Mfold-predicted secondary structures of miR-106a hairpin precursors. Predictions were made with the mfold web server (http://unafold.rna.albany.edu) using standard settings.


Posters

2018
M. Lucic, M. Habibian, M. Yahyaee-Anzahaee, J. Hall, MJ. Damha, “Structural properties and gene silencing activity of parallel-stranded duplexes”, Swiss RNA workshop, University of Bern (02.02.2018)

2017
M. Lucic, F. Halloy, P. Cwiek, A. Brunschweiger, L. Gebert, U. Pradère, C. Berk, J. Hall, “Quantification of modified oligonucleotides in vitro and in vivo”, Microsymposium on small RNA biology, IMBA Vienna, Austria (26-28.05.2017)

2016
M. Lucic, A. Brunschweiger, L. Gebert, J. Hunziker, J. Hall, “Quantification of antimiRs in RISC by chemical-ligation qPCR”, Swiss RNA workshop, University of Bern (22.01.2016)



Curriculum vitae

CONTACT Matije Lucic Wehntalerstrasse 227 | 8057 Zurich [email protected]

PERSONAL INFORMATION First name: Matije | Surname: Lucic | Date of birth: 09.01.1986 | Nationality: Swiss

RESEARCH EXPERIENCE

02.2013 - 12.2018 Doctoral research, ETH Zurich, Institute of pharmaceutical sciences
Group of Prof. Jonathan Hall in pharmaceutical chemistry
Main tasks and expertise:
• Modulation of the biological activity of small regulatory RNAs in different cellular and biochemical backgrounds (e.g. targeting of endogenous liver-specific microRNA-122 in the context of Hepatitis C virus infection).
• Investigation of a novel competitive targeting mechanism of oncogenic microRNAs from the miR-17~92 cluster with cell-based luciferase reporter assays, RT-qPCR and transcriptome sequencing.
• Cultivation, maintenance and transfection of mammalian cell lines.
• Development of a PCR-based assay for detection and quantification of chemically modified oligonucleotides.
• Analysis and visualization of biological and transcriptomic data with R packages and GraphPad Prism.

08.2011 - 10.2011 Research assistant, ETH Zurich, Institute of pharmaceutical sciences
Group of Prof. Cornelia Halin Winter in pharmaceutical immunology
Main tasks and expertise:
• Investigation of dendritic cell migration within lymphatic vessels in the ear skin of transgenic mice during inflammation.
• Analysis of cell size and dynamics by intravital microscopy and image processing with Imaris software.

11.2010 - 06.2011 Research assistant, ETH Zurich, Institute of pharmaceutical sciences
Group of Prof. Jonathan Hall in pharmaceutical chemistry
Main tasks and expertise:
• Expression and characterization of human endoribonuclease Dicer in insect cells using the baculovirus expression system.
• Characterization of its catalytic activity by LC-MS and Bioanalyzer.

EDUCATION
10.2013 Swiss federal diploma for pharmacists, BAG, Bern
09.2009 - 09.2012 Master of Science (M.Sc.), Pharmaceutical sciences, ETH Zurich
09.2005 - 09.2009 Bachelor of Science (B.Sc.), Pharmaceutical sciences, ETH Zurich
09.2001 - 07.2005 Matura (High school degree), Liceo cantonale di Locarno


PHARMACY EXPERIENCE
Since 12.2016 Pharmacist (part-time), Amavita Airport Zurich
11.2011 - 08.2012 Internship in public pharmacy, Coop Vitality Sihlcity Zurich

TEACHING EXPERIENCE

11.2013 - 11.2017 Teaching assistant, ETH Zurich, Institute of pharmaceutical sciences
Main tasks: organization and supervision of a yearly recurring practical laboratory course for bachelor students in pharmaceutical sciences (HPLC analysis and kinetic studies of beta-lactamase activity).

Since 02.2013 Student supervisor, ETH Zurich, Institute of pharmaceutical sciences
Main tasks: supervision of 3 bachelor projects and 2 master theses.

PUBLICATIONS

2018
M. Lucic, A. Kanitz, M. Zavolan, J. Hall, “Non-canonical microRNA targeting reveals new layer of microRNA regulation”, (manuscript in preparation)
M. Habibian, M. Yahyaee-Anzahaee, M. Lucic, E. Moroz, N. Martín-Pintado, L. Di Giovanni, JC. Leroux, J. Hall, C. González, MJ. Damha, “Structural properties and gene-silencing activity of chemically modified DNA-RNA hybrids with parallel orientation”, Nucleic Acids Res., 2018

2015
M. Lucic, A. Brunschweiger, L. Gebert, J. Hunziker, J. Hall, “Quantifizierung von Antimirs im RISC mittels chemischer Ligations-qPCR”, pharmaJournal 25, 2015
A. Brunschweiger, L. Gebert, M. Lucic, U. Pradère, H. Jahns, C. Berk, J. Hunziker, J. Hall, “Site-specific conjugation of drug-like fragments to an antimiR scaffold as a strategy to target miRNAs inside RISC”, Chem. Commun., 2015

2014
M. Roos, M. Rebhan, M. Lucic, D. Pavlicek, U. Pradère, H. Towbin, G. Civenni, C. Catapano, J. Hall, “Short loop-targeting oligoribonucleotides antagonize Lin28 and enable pre-let-7 processing and suppression of cell growth in let-7-deficient cancer cells”, Nucleic Acids Res., 2014

2013
U. Pradère, A. Brunschweiger, L. Gebert, M. Lucic, M. Roos, J. Hall, “Chemical synthesis of mono- and bis-labeled pre-microRNAs”, Angew. Chem., 2013

2012
M. Nitschké, D. Aebischer, M. Abadier, S. Haener, M. Lucic, B. Vigl, H. Luche, H. Fehling, O. Biehlmaier, R. Lyck, C. Halin, “Differential requirement for ROCK in dendritic cell migration within lymphatic capillaries in steady-state and inflammation”, Blood, 2012
Master thesis: M. Lucic, L. Gebert, J. Hall, “Expression of human Dicer in insect cells using the Bac-to-Bac® viral expression system”, ETH Zurich, 2012 (unpublished)


TALKS

2017
M. Lucic, “A non-canonical microRNA competition mechanism”, NCCR RNA & disease summer school, Saas-Fee (28.08-01.09.2017)

2016
M. Lucic, “A new design for the targeting of miRNAs”, NCCR RNA & disease seminar series, ETH Zurich (19.04.2016)
M. Lucic, “Targeting miRNAs inside RISC: design of antimiR-conjugates and their detection by chemical-ligation (CL)-qPCR”, Doktorandentag, ETH Zurich (17.02.2016)

2015
M. Lucic, “Quantification of antimiRs in RISC by chemical-ligation (CL)-qPCR”, NCCR RNA & disease summer school, Saas-Fee (24-28.08.2015)

2014
M. Lucic, “miR-106a seedless targeting on miR-CLIP targetome”, Sinergia meeting, ETH Zurich (12.12.2014)
M. Lucic, J. Imig, “Towards a pre-miRNA-RBP interaction screen”, Sinergia meeting, University of Basel (02.05.2014)

POSTERS

2018
M. Lucic, M. Habibian, M. Yahyaee-Anzahaee, J. Hall, MJ. Damha, “Structural properties and gene silencing activity of parallel-stranded duplexes”, Swiss RNA workshop, University of Bern (02.02.2018)

2017
M. Lucic, F. Halloy, P. Cwiek, A. Brunschweiger, L. Gebert, U. Pradère, C. Berk, J. Hall, “Quantification of modified oligonucleotides in vitro and in vivo”, Microsymposium on small RNA biology, IMBA Vienna, Austria (26-28.05.2017)

2016
M. Lucic, A. Brunschweiger, L. Gebert, J. Hunziker, J. Hall, “Quantification of antimiRs in RISC by chemical-ligation qPCR”, Swiss RNA workshop, University of Bern (22.01.2016)

2015
M. Lucic, A. Brunschweiger, L. Gebert, J. Hunziker, J. Hall, “Quantification of antimiRs in RISC by chemical-ligation qPCR”, Swiss pharma science day, University of Bern (19.08.2015)

AWARDS 2015 Poster prize: M. Lucic, A. Brunschweiger, L. Gebert, J. Hunziker, J. Hall, “Quantification of antimiRs in RISC by chemical-ligation qPCR”, Swiss pharma science day, University of Bern (19.08.2015)

ADDITIONAL TRAINING
2018 Career management seminar, ETH career center, Balsthal (10-11.09.2018)
2017 Summer school, NCCR RNA & disease, Saas-Fee (28.08-01.09.2017)
Course: “RNA & RNP architecture: from structure to function to disease”
2015 Summer school, NCCR RNA & disease, Saas-Fee (24-28.08.2015)
Course: “RNA as target and drug”


VOLUNTEER ACTIVITIES
I was involved in the organization and realization of the following events:
2018 Open lab day at D-CHAB, ETH Zurich (23.03.2018)
2017 Scientifica: Zurich science days, ETH Zurich and University of Zurich (1-3.09.2017)
Open lab day at D-CHAB, ETH Zurich (07.04.2017)
2016 Doktorandentag, ETH Zurich (17.02.2016)
2014 Sinergia meeting, ETH Zurich (12.12.2014)

LANGUAGE SKILLS
Italian, Croatian (native speaker)
German, English (highly proficient)
French (basic communication skills)

IT SKILLS
Operating systems: familiar with Microsoft Windows, Apple macOS and Ubuntu Linux
Office applications: proficient with Microsoft Office tools
Graphics software: proficient with Adobe Illustrator and InDesign
Programming skills: good knowledge of Linux/Unix, R and Python
Data analysis skills: practical skills in data analysis, visualization and statistical techniques using R/Bioconductor packages and GraphPad Prism

MEMBERSHIPS
pharmaSuisse (Switzerland)
The RNA society (International)

HOBBIES AND INTERESTS Sports (jogging, football, winter sports), motorcycles, board games, e-games

REFERENCES Prof. Jonathan Hall (Doctoral advisor) ETH Zurich Institute of pharmaceutical sciences Vladimir-Prelog-Weg 1-5/10, 8093 Zurich Phone: +41 44 633 74 35 | E-mail: [email protected]
