Research Collection

Doctoral Thesis

Gene circuit based screening assays for miRNA drug discovery

Author(s): Häfliger, Benjamin

Publication Date: 2016

Permanent Link: https://doi.org/10.3929/ethz-a-010651992

Rights / License: In Copyright - Non-Commercial Use Permitted


DISS. ETH NO. 23325

GENE CIRCUIT BASED SCREENING ASSAYS FOR MIRNA DRUG DISCOVERY

A thesis submitted to attain the degree of DOCTOR OF SCIENCES of ETH ZURICH (Dr. sc. ETH Zurich)

presented by BENJAMIN HÄFLIGER

MSc ETH Biotech, ETH Zurich
born on 25.02.1988
citizen of Fischbach, LU

accepted on the recommendation of
Prof. Dr. Yaakov Benenson
Prof. Dr. Sven Panke
Dr. Helge Grosshans

2016

Abstract

Drug discovery is the process of identifying potential new medicines. A key step in this process is high throughput screening (HTS), whereby active compounds are identified from a large library of up to one million candidate molecules. Through miniaturization and automation, such libraries can nowadays be tested in a single week. Despite this impressive performance, about 50% of screening campaigns fail to provide the desired hits, or the emerging lead compounds do not perform as expected in clinical settings, mostly due to toxicity from off-target interactions. To overcome this issue, novel HTS assays should provide more detailed information on the underlying biology and the relevant off-targets. Multiplexed assays based on liquid chromatography-mass spectrometry or deep sequencing could deliver this information; yet, their throughput is comparatively low. Therefore, technologies that can probe a vast range of parameters in a high throughput fashion are highly desirable.

One approach to addressing this challenge is the use of artificial regulatory networks. These can be implemented using synthetic gene circuits that are programmed to perform specific functions inside living cells. To date, several complex mammalian gene circuits have been described that sense, integrate and compute multiple molecular inputs to conditionally produce a defined output. In the context of high throughput screening, such circuits could be designed to simultaneously report on the effects on the drug target as well as on multiple off-targets. Compared to standard approaches, where a single target is linked to a single reporter, the circuit-based approach compresses multiple inputs into a single output, increasing information density and throughput significantly. The concept can theoretically be applied to many different target families, such as kinases, GPCRs and miRNAs.
However, small molecules targeting miRNAs are especially prone to off-target interactions because of miRNAs’ similarities in nucleotide sequence and phosphoribose backbone, their shared maturation pathway and common pri-miRNA precursors. This makes miRNAs ideal candidates for a proof-of-concept of synthetic gene circuit-based drug discovery assays.

In this thesis, I designed, optimized, tested and validated a family of gene circuit based assays for the discovery of specific small molecules that modulate miRNA activity. These assays consist of three genetic modules providing information on distinct drug effects, including (i) global effects on gene expression, (ii) systemic effects on the RNAi pathway or off-target miRNAs and (iii) the desired effect on the intended miRNA drug target. Due to its involvement in hepatitis C virus replication and liver cancer malignancy, I used miR-122 as the primary drug target miRNA and implemented the underlying gene circuit in the liver cancer cell line HuH-7. For the implementation of a pilot assay, the individual parts of each module were optimized alone and in concert. Due to the pilot circuit’s insufficient performance, three augmented circuit topologies were constructed. They were individually modeled in silico for best performance and subsequently validated experimentally. Ultimately, the best performing one, the complete feed-forward (CFF) assay, was adapted for automated screening. I performed a pilot small molecule screen and investigated the miR-122 hits and some of the excluded compounds by measuring the dose-response of the assay readouts. I confirmed the off-target effects for 14 of 18 tested compounds. Yet, the modest activity of the miR-122 specific hits (<1.3 fold) turned out not to be significant in the follow-up experiments. Next, I exchanged the drug target and off-target miRNA binding sites, generating two new assay circuits. I validated these assays and found excellent screening properties, highlighting that the modular architecture of the circuit allows fast tailoring to new miRNA targets. Finally, I benchmarked the circuit assays against the current gold standard, i.e. dual reporter assays with fully complementary binding sites, and showed that they reliably identify unspecific compounds, while the traditional assays miss these effects.

This thesis describes, to the best of my knowledge, the first multi-input, customizable screening assay that can be directly applied in high throughput screening. It outlines the engineering steps undertaken, identifies the bottlenecks and demonstrates the fast adaptability of the system to other miRNA drug targets and off-targets. Finally, I discuss further directions to improve the assays and assess the potential of the technology for other target families.

Summary (Zusammenfassung)

Drug discovery is the process by which new medicines are sought. A central aspect of this process is high throughput screening (HTS), in which the active compounds of a large compound library are identified. Through miniaturization and automation of this process, libraries of up to one million compounds can nowadays be tested in a single week. Although the throughput of these screening campaigns is exceptionally high, 50% of the attempts fail. Either the desired hits are not found, or the identified lead compounds behave differently than expected in clinical tests; they are often toxic because of off-target effects. To eliminate this problem, new HTS assays should be designed so that they also provide information on off-target effects. Multiplexed assays such as liquid chromatography coupled to mass spectrometry or deep sequencing can provide this information, but the throughput of these techniques is comparatively low. For this reason, techniques that combine a multiplexed character with high throughput are highly desirable.

One approach to meeting this challenge is artificial regulatory networks, which can be implemented with the help of synthetic gene circuits. These circuits are programmed to carry out specific functions in living cells. To date, several complex mammalian gene circuits have been described that can measure, integrate and process multiple molecular inputs in order to conditionally produce corresponding outputs. In the context of high throughput screening, such circuits could be used to simultaneously measure interactions of the test compounds with the target molecule (on-target) and with several off-targets.
Compared to common approaches, in which a single target is linked to a single reporter, the gene circuit compresses several inputs into a single output, which significantly increases the information density while the throughput remains constant. This concept can theoretically be applied to many different target families, but small molecules that act on miRNAs interact especially frequently with off-targets. This can be traced back to two causes. First, different miRNAs differ only in the sequence of their bases while their chemical structure is the same, and second, most miRNAs share the same maturation process. miRNAs are therefore the ideal candidates for confirming the concept of drug discovery assays based on synthetic gene circuits.

For the present work, I developed, optimized, tested and validated different variants of these assays. Each consists of three modules that indicate different effects a test compound could exert, namely (i) global effects on gene expression, (ii) systemic effects on the RNAi pathway and (iii) the desired effects on the target miRNA. Because miR-122 plays an important role both in the replication of hepatitis C viruses and in the malignancy of liver carcinomas, I used it as the target molecule and implemented the corresponding assay in the liver cancer cell line HuH-7. For a first pilot assay, the individual circuit parts were optimized and tested separately and together. Because this assay did not deliver the desired performance, three improved assays were designed and were tested and validated both in silico and experimentally. Finally, the best assay, the “complete feed-forward (CFF) assay”, was selected and adapted for automation. I then tested a compound library and confirmed the effects of 14 of 18 selected compounds in follow-up experiments. For the miR-122 hits, however, the moderate activity (< 1.3 fold) turned out not to be significant. To demonstrate the flexibility of the system, I replaced both the on-target and several off-target miRNA binding sites. These two new assays were validated and showed excellent screening properties. Finally, some of the assays were compared with the current gold standard, and it could be shown that the gene circuit based assays find the off-target effects and are therefore more efficient than the current methods.
This thesis describes, to the best of my knowledge and belief, the first multi-input, customizable assay that can be used directly for high throughput screening. Furthermore, the technical procedures and bottlenecks involved are described, and possible further fields of application of the developed technique are identified and discussed.

Acknowledgements

I would like to express my deepest gratitude to Professor Kobi Benenson for his constant support and help. He gave me the opportunity to work at the forefront of synthetic biology and biocomputing and introduced me to all their concepts with humor, passion and patience. He also gave me the freedom to pursue many good, bad and crazy ideas, and he gave me the directions and inputs to make some of them successful. I always enjoyed the exciting, thoughtful and interesting discussions, and I would like to thank him in particular for his mentoring throughout the five years that I’ve happily spent in his lab.

I would also like to say a huge “Thank you” to Laura Prochazka. She was my first colleague and taught me all the secrets of conducting molecular and cell biology experiments. She was always there when I needed help and advice and was the proud organizer of some great parties. I would also like to thank her for patiently proofreading everything I’ve written during my time in the Benenson lab, including this thesis, and for the thoughtful inputs that considerably improved my texts and works.

I would like to thank Bartolomeo Angelici for all the support and help in the last years. We’ve spent quite some time discussing various subjects, from transactivators to miRNA to food and the latest “news” from audio books such as Chinese dressing in the age of Ming.

I am indebted to all current and past Benenson lab members who supported me during the past years with insightful conversations, help with experiments and countless funny moments at lunch, in bars or at the Rhine. I would like to thank Jonathan Kleinert, Margaux Dastor and Raffaele Altamura for their efforts in proofreading my thesis and eliminating all the typos and flaws that the text had when they first read it.
It is a pleasure for me to thank those people who worked in the background, easing my daily life through their countless efforts: the shop, the admin services and the scientific facilities. I thank Thomas Horn for microscopy support, Verena Jäggin and Telma Lopes for flow cytometry help and especially Urs Senn, without whom the automated screening would not have been possible and who brought fun to commuting from Aarau to Basel and back. I thank M. Fussenegger for providing the plasmids encoding the PIT2 and ET activators and their regulated promoter sequences. I would also like to thank my committee, Prof. Sven Panke and Dr. Helge Grosshans, for their valuable feedback and the effort spent judging my work.

Throughout my life, I have had the priceless luck of having a great family and friends. I would like to thank my parents, Bernadette and Urs, for their constant effort in supporting me in pursuing any goal that crossed my mind and for making my life as easy as possible. I thank my brothers for helping me with any matter and my friends for unforgettable adventures and exciting nights.

And last but not least, I cannot find words to express my gratitude to Andrea. For the last ten years, she has been my partner and best friend, and I could never have achieved any of my accomplishments without her support, advice and unconditional love.

Abbreviations

2A: Self-cleaving peptide
3’-UTR: Three prime untranslated region
AND: Logic function with positive output when all inputs are present
bp: Base pair
CAG: Strong, synthetic promoter encoding the CMV early enhancer element, the first exon and the first intron of the chicken beta-actin gene and the splice acceptor of the rabbit beta-globin gene
CAGop: CAG promoter followed by an intron with two LacO sites
cDNA: Complementary DNA
CFF: Complete feed-forward (assay/circuit)
CMV: Cytomegalovirus
DGCR8: DiGeorge syndrome critical region 8 protein
DMSO: Dimethyl sulfoxide
DNA: Deoxyribonucleic acid
Dox: Doxycycline
Drosha: Class 2 ribonuclease III enzyme
DsRed: DsRed-Express, engineered Discosoma sp. red fluorescent protein
ECFP: Enhanced cyan fluorescent protein
EDTA: Ethylenediaminetetraacetic acid
Ef1a: Promoter of the human elongation factor-1 alpha
ELISA: Enzyme-linked immunosorbent assay
ERE: ET response element
ET: Erythromycin-dependent transactivator (MphR(A) fused to VP16)
Exp-5: Exportin-5 protein
FACS: Fluorescence-activated cell sorting
FANA: 2'-Deoxy-2'-fluoro-arabinonucleic acid
FF4: miRNA targeting a region of firefly luciferase, version #4
FF5: miRNA targeting a region of firefly luciferase, version #5
FFL: Feed-forward loop
gC1: virus type 1 glycoprotein
GFP: Green fluorescent protein
GPCR: G-protein coupled receptor
HCC: Human hepatocellular carcinoma
HCV: Hepatitis C virus
HEK293: Human embryonic kidney 293 cells
HeLa: Human immortal cell line derived from cervical cancer
HSC70: Heat shock 70 kDa protein 8
HSP90: Heat shock protein 90
HTS: High throughput screening
HuH-7: Human immortal cell line derived from liver cancer
IL-2: Interleukin 2
IRES: Internal ribosome entry site
iRFP: Near-infrared fluorescent protein
Kcat: Turnover number in Michaelis-Menten kinetics
kdeg: Degradation rate
KM: Michaelis constant
kON: “On-binding” rate
kRNA: RNA synthesis rate
LacI: Lac repressor
LacO: LacI repressor binding site
LFF: Low inputs feed-forward (circuit/assay)
LNA: Locked nucleic acid
m7Gppp: 7-methyl-guanosine-containing cap
mCerulean: Improved monomeric cyan fluorescent protein derived from ECFP
mCherry: Improved monomeric red fluorescent protein derived from DsRed
mCitrine: Improved monomeric yellow fluorescent protein derived from GFP
miR-x: miRNA number x
miRNA: Micro RNA
MOE: Methyl-O-ethyl
mpc: Molecules per cell
mRNA: Messenger RNA
NAND: NOT AND, logic function with positive output unless all inputs are present
NOR: NOT OR, logic function with positive output when none of the inputs is present
NOT: Logic signal inverter
NSC158959: 2,4-dichloro-N-naphthalen-2-ylbenzamide
NSC308847: 5-amino-2-[2-(dimethylamino)ethyl]-1H-benzo[de]isoquinoline-1,3(2H)-dione or Amonafide
NSC5476: 6-[[(4aR,8aS)-3,4,4a,5,6,7,8,8a-octahydro-2H-quinolin-1-yl]sulfonyl]-1,2,3,4-tetrahydroquinoline
OR: Logic function with positive output when at least one input is present
ORF: Open reading frame
PCR: Polymerase chain reaction
PEST: Proline (P), glutamic acid (E), serine (S) and threonine (T) rich peptide sequence that increases protein degradation
PIT2: Streptogramin-responsive transactivator (Pristinamycin-induced protein (Pip) fused to p65)
PLL: Polylysine
Pol II: RNA polymerase II
PRE: PIT2 response element
pre-miRNA: miRNA intermediate after splicing/cutting by Drosha but before Dicer processing
pre-mRNA: mRNA after transcription but before splicing; all exons and introns are still present
pri-miRNA: miRNA intermediate after transcription; no splicing or Drosha-based processing has occurred yet
pTRE: TetR responsive element promoter
qPCR: Quantitative PCR or real-time polymerase chain reaction
R2: Coefficient of determination
RGA: Reporter gene assay
RISC: RNA-induced silencing complex
RNA: Ribonucleic acid
RNAi: RNA interference
rRNA: Ribosomal ribonucleic acid
rtTA: Reverse Tet transactivator
SD: Standard deviation
shRNA: Small hairpin RNA
siDicer: siRNA against Dicer protein
siDicer0: siRNA against Dicer protein, sequence version 0
siDrosha: siRNA against Drosha
siNegCtrl: siRNA with a sequence that does not target any known mRNA
siRNA: Small interfering RNA
siTRBP2: siRNA against TRBP2
TA: Transactivator
TET: Tetracycline-controlled transactivator
TRBP2: RISC-loading complex subunit
TRE: TetR responsive element
tRNA: Transfer RNA
tTA: Tet transactivator
Tx: DNA element encoding the binding sequence of miR-x
Ubx4: Four repeats of ubiquitin sequence
XOR: Exclusive OR, logic function with positive output when exactly one of two inputs is present
XRN-1: 5’ to 3’ exoribonuclease
Z’: Experimentally determined Z factor
ZsYellow: Optimized version of a yellow fluorescent protein derived from Zoanthus

Table of Contents

Abstract
Acknowledgements
Abbreviations
Table of Contents
Table of figures
Table of tables

1 Introduction
  1.1 High throughput screening in drug discovery
  1.2 RNA as a drug target
    1.2.1 Targeting RNAs with antisense RNAs
    1.2.2 Targeting RNAs with small molecules
  1.3 miRNAs
    1.3.1 miRNA maturation
    1.3.2 miRNA in disease
    1.3.3 Challenges in miRNA drug discovery
    1.3.4 State of the art miRNA screening systems
  1.4 Synthetic biology
    1.4.1 Principles of synthetic biology
    1.4.2 Mammalian synthetic biology
  1.5 Thesis statement

2 Precision multidimensional assay for high-throughput microRNA drug discovery
  2.1 Introduction
  2.2 Materials and methods
    2.2.1 Plasmid construction
    2.2.2 Cell culture and transfection
    2.2.3 Small molecule screening
    2.2.4 Fluorescent microscopy
    2.2.5 Flow cytometry
    2.2.6 Luciferase assays
    2.2.7 Flow cytometry data and image processing
    2.2.8 Statistical analysis
    2.2.9 Modeling
  2.3 Basic concepts
  2.4 Validation strategy
  2.5 Experimental system
    2.5.1 Cell line and miRNA inputs
    2.5.2 Small molecule and RNA based positive controls
  2.6 Assembly and testing of pilot assay
  2.7 Alternative assay design
  2.8 Mechanistic model of pilot circuit and alternative designs
    2.8.1 Initial choice of parameters
    2.8.2 Model calibration
    2.8.3 Assay performance analysis
  2.9 Validation of alternative assays
  2.10 In-depth validation of complete feed forward assay
  2.11 Automated screening of small molecule library
  2.12 Validation of screening hits
  2.13 Assay customization
  2.14 Assay comparison to bidirectional reporters
  2.15 Conclusions

3 Discussion
  3.1 A synthetic biology approach to HTS
  3.2 Major findings
    3.2.1 Testing of established miR-122 modulators
    3.2.2 Validation concept
    3.2.3 Choice of library
    3.2.4 Hit selection algorithms
    3.2.5 Comparison of the assay with current standards
    3.2.6 Engineering a complex artificial network
  3.3 Future directions
    3.3.1 Screening a larger, more focused library
    3.3.2 Different miRNA drug- and off-targets
    3.3.3 More off-target markers - toxicity
    3.3.4 Different target families: kinases and GPCRs
  3.4 Closing statement

Appendix

Table of figures

Figure 1.1 Comparison of different high throughput screening assays.
Figure 1.2 Classification of various assays by throughput and multiplexity.
Figure 1.3 RNA modifications.
Figure 1.4 Cell fate decisions in buffered and unbuffered systems.
Figure 1.5 Common miRNA regulation motifs.
Figure 1.6 Canonical miRNA maturation pathway.
Figure 1.7 Illustration of the engineering principles applied to biology.
Figure 1.8 Synthetic biology devices.
Figure 2.1 Small molecule screening workflow.
Figure 2.2 Flow cytometry gating strategy.
Figure 2.3 Image processing pipeline.
Figure 2.4 Basic concepts and assay designs.
Figure 2.5 Perturbation combinations for circuit validation.
Figure 2.6 Activity profile of miRNAs in HuH-7 cells.
Figure 2.7 miR-122 modulator tests.
Figure 2.8 Test of NSC308847 with destabilized fluorescent protein reporters.
Figure 2.9 Small molecule effects on untargeted Renilla or firefly luciferases.
Figure 2.10 Test of NSC5476 with luciferase assay.
Figure 2.11 Testing of miRNA maturation pathway protein modulators.
Figure 2.12 Mimics and LNA testing with selected fluorescent reporters.
Figure 2.13 High input sensor tests with varying doses of rtTA.
Figure 2.14 Characterization and optimization of PRE/ERE bidirectional reporters.
Figure 2.15 Low sensors optimization.
Figure 2.16 Optimization of assembled circuit and testing of pilot circuit.
Figure 2.17 Schematics of alternative assay designs.
Figure 2.18 Simbiology model diagram view.
Figure 2.19 Parameter scans for changes in high sensor inputs.
Figure 2.20 Parameter scans for changes in low and high sensor inputs.
Figure 2.21 Off-target effect dynamic range analysis for alternative layouts.
Figure 2.22 Experimental comparison of alternative layouts.
Figure 2.23 In-depth validation of CFF circuit.
Figure 2.24 Dose response testing of CFF assay with validation set.
Figure 2.25 Screening of NIH clinical collections 1 & 2.
Figure 2.26 Screening hits validation.
Figure 2.27 Assay customization.
Figure 2.28 Assay benchmarking.
Figure 3.1 Comparison of standard and circuit based screening approaches.

Supplementary Figure 1 Analysis of data distributions for screen 1.
Supplementary Figure 2 Analysis of data distributions for screen 2.
Supplementary Figure 3 Chemical structures of gene expression module hits.
Supplementary Figure 4 Chemical structures of RNAi module hits.
Supplementary Figure 5 Chemical structures of specific module hits.

Table of tables

Table 2.1 Parameter values used for Simbiology model simulations.

Supplementary Table 1 Plasmid construction
Supplementary Table 2 Primers
Supplementary Table 3 gBlocks
Supplementary Table 4 LNAs
Supplementary Table 5 Mimics
Supplementary Table 6 siRNAs
Supplementary Table 7 Arrangement of control plate transfections
Supplementary Table 8 Screening library
Supplementary Table 9 Transfection table experiment of Figure 2.6.
Supplementary Table 10 Transfection table experiment of Figure 2.7a.
Supplementary Table 11 Transfection table experiment of Figure 2.7b.
Supplementary Table 12 Transfection table experiment of Figure 2.8a.
Supplementary Table 13 Transfection table experiment of Figure 2.8b.
Supplementary Table 14 Transfection table experiment of Figure 2.8c.
Supplementary Table 15 Transfection table experiment of Figure 2.9b.
Supplementary Table 16 Transfection table experiment of Figure 2.10a.
Supplementary Table 17 Transfection table experiment of Figure 2.10b.
Supplementary Table 18 Transfection table experiment of Figure 2.11a.
Supplementary Table 19 Transfection table experiment of Figure 2.11b.
Supplementary Table 20 Transfection table experiment of Figure 2.12a.
Supplementary Table 21 Transfection table experiment of Figure 2.12b.
Supplementary Table 22 Transfection table experiment of Figure 2.12c.
Supplementary Table 23 Transfection table experiment of Figure 2.12d.
Supplementary Table 24 Transfection table experiment of Figure 2.13.
Supplementary Table 25 Transfection table experiment of Figure 2.14b.
Supplementary Table 26 Transfection table experiment of Figure 2.14c.
Supplementary Table 27 Transfection table experiment of Figure 2.15.
Supplementary Table 28 Transfection table experiment of Figure 2.16b, c.
Supplementary Table 29 Transfection table experiment of Figure 2.16e, f.
Supplementary Table 30 Transfection table experiment of Figure 2.22.
Supplementary Table 31 Transfection table experiment of Figure 2.23b, d.
Supplementary Table 32 Transfection table experiment of Figure 2.24a, b.
Supplementary Table 33 Transfection table experiment of Figure 2.25 and Supplementary Figure 1 and 2.
Supplementary Table 34 Transfection table experiment of Figure 2.26a.
Supplementary Table 35 Transfection table experiment of Figure 2.26b, c.
Supplementary Table 36 Transfection table experiment of Figure 2.26d.
Supplementary Table 37 Transfection table experiment of Figure 2.26e, f.
Supplementary Table 38 Transfection table experiment of Figure 2.27b.
Supplementary Table 39 Transfection table experiment of Figure 2.27c.
Supplementary Table 40 Transfection table experiment of Figure 2.28a.
Supplementary Table 41 Transfection table experiment of Figure 2.28b.
Supplementary Table 42 Transfection table experiment of Figure 2.28c.

1 Introduction

Understanding living systems and their functions and malfunctions has been a central subject of humanity’s striving for knowledge since the beginning of time. This understanding enables the treatment of disease and the production of desired goods. Baking bread, producing cheese or brewing beer are examples from prehistoric times that illustrate these efforts and can be considered the first successful biotechnological processes [1]. By applying selective breeding, humans have modified the genetic information of organisms such as yeast, lactobacillus, cattle or plants for thousands of years [2], and with an ever-growing body of knowledge about these systems, more and more sophisticated products can be produced [3]. A series of discoveries in the last two centuries, including Mendel’s inheritance laws [4], the structure and function of DNA [5], restriction enzymes [6] and the polymerase chain reaction [7] (to mention just a few), enabled scientists to precisely manipulate the genetic information carried by these organisms. This has led to a wide variety of novel biotechnological products that are produced in different species or with different molecular entities, from small molecules like antibiotics [8] or vitamins [9] to proteins such as isomerases [10] and antibodies [11]. Yet, developing a process to produce these goods is usually an endeavor of considerable scale with many unforeseen pitfalls and detours [12]. In order to understand and ultimately overcome these limitations, researchers started in the early 2000s to think of biology not only as a discovery science, but as an engineering discipline that benefits from engineering approaches [13,14]. In particular, three engineering principles were identified as especially useful in the context of biology, namely standardization of parts, decoupling of modules and abstraction of complexity [15], which will be explored in further detail below. This conceptual merging of biology with engineering can be considered the founding basis of synthetic biology [15].
In this thesis, the goal is to apply the design principles of synthetic biology to high throughput screening (HTS), specifically HTS based discovery of new molecular entities. We chose HTS for two main reasons. First, it is a well-established set of technologies that is essential to modern drug discovery [16], and therefore the potential impact of successful developments is large. Second, even though it is a common approach, about 50% of screening campaigns fail to provide the desired hits [16] or suggest lead compounds that do not perform as predicted in (pre-)clinical settings [17]. This is especially true for “unusual”, non-enzyme targets such as (mi)RNAs, scaffold proteins and chaperones, because enzymatic turnover cannot be used as a potency indicator [18]. But because miRNAs play a particularly important role in physiology and disease [19], it is desirable to establish screening assays that deliver compounds with high potency and specificity. Therefore, we will develop a novel assay platform that identifies specific miRNA modulators. To achieve this, we will engineer synthetic biological networks, also called gene circuits, that classify an assayed compound in the first round of screening as either potent or not potent and as either specific or non-specific. This will allow the identification of better lead compounds at lower cost and in shorter time.
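
The two-way classification described above (potent or not potent, specific or non-specific) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each compound yields three fold-change readouts relative to an untreated control (on-target, off-target and global gene expression) and that one symmetric threshold decides whether a readout has changed. The names `Readout` and `classify` and the threshold of 1.5 are hypothetical and are not taken from the thesis, whose circuits integrate these inputs inside the cell rather than in software.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Readout:
    """Fold changes relative to an untreated control (hypothetical fields)."""
    on_target: float    # reporter linked to the intended miRNA drug target
    off_target: float   # reporter linked to off-target miRNAs / RNAi pathway
    global_expr: float  # reporter for global effects on gene expression


def classify(r: Readout, threshold: float = 1.5) -> Tuple[str, str]:
    """Label a compound as potent/not potent and specific/non-specific.

    The threshold is an illustrative assumption, not a value from the thesis.
    """
    def changed(fold: float) -> bool:
        # A readout counts as changed if it moved up or down by the threshold.
        return fold >= threshold or fold <= 1.0 / threshold

    potent = changed(r.on_target)
    specific = not (changed(r.off_target) or changed(r.global_expr))
    return ("potent" if potent else "not potent",
            "specific" if specific else "non-specific")


if __name__ == "__main__":
    # A compound that changes only the on-target reporter.
    print(classify(Readout(on_target=2.0, off_target=1.0, global_expr=1.0)))
```

In this simplified view, only compounds labeled both potent and specific would be carried forward as hits; compounds that also shift the off-target or global readouts would be excluded in the first screening round.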

1.1 High throughput screening in drug discovery

Drug discovery is the process of finding new candidate compounds that act on a certain biological function in the body to improve a patient’s health. Traditionally, these compounds are found in extracts from plants, fungi, bacteria or other complex sources [20]. The rationale is that these compounds went through evolutionary optimization to preferentially bind to proteins [21], which are the main class of drug targets [22]. In the early nineties, this discovery process shifted from natural products to synthetic chemical libraries that were generated from a central scaffold through random chemistry, with the goal of generating the largest possible untested chemical space [23]. As a consequence, the number of compounds that could be tested for activity grew exponentially and challenged the approach of testing them manually [21]. To overcome this bottleneck, technologies were invented to increase the throughput of testing to about 10,000 compounds per day, leading to the creation of the field of high throughput screening (HTS) [24].

Three aspects are particularly important when it comes to increasing throughput in screening: first, the assay and its specifications; second, the scale of the assay and its format; and third, the degree of manual labor involved as well as the degree of automation [24].

The assay’s properties are the core aspects of any screen. They heavily depend on the target of interest and can be hierarchically classified into different types (Fig. 1.1), each with certain advantages and disadvantages. On the highest level of abstraction, one distinguishes between biochemical and cell based assays [21]. The former usually exploit the nature of the target that is to be disrupted [21]. For example, if one wants to target an enzyme, its core function is the catalysis of a reaction.
There- fore, monitoring the speed of this reaction serves as a good proxy for the inhibition/ac- tivation of this enzyme by a compound of interest25. They can be further classified either based on the target (kinase, protease, etc.), function (phosphorylation, cleavage,

18 etc.) or measurement technique (luminescence, fluorescence, absorbance, etc.) ap- plied. Major advantages of biochemical assays are direct interaction of the compound with the target, ease of readout and miniaturization potential. However, the matrix in which the assay is carried out in is rather different from a real life situation requiring thorough follow-up testing26. Cell based assays overcome this limitation at the cost of more complex handling and noisier data27. They are further classified into target-based and phenotypic assays21. For phenotypic assays no a priori knowledge of the drug target is needed. A high level readout such as cell growth or morphology is used and a wide variety of the host’s molecules could be targeted28. This generality comes at the cost of follow-up experiments, where different pathways need to be perturbed to identify the drug target29. In case of target-based screens the link between the de- scribed effect and the molecule is known from the beginning, usually from genomic or proteomic data and the readout is directly linked to its activity/location/etc30. This makes target-based assays more similar to biochemical assays but they are performed in more native matrices such as cell or tissue culture26. As cells “shield” their target, successful compounds are already selected for a subset of drug-like properties such as stability31, passing of the membrane32 and access to the target33.

Figure 1.1 Comparison of different high throughput screening assays. HTS assays can be classified into different types. The main advantages and disadvantages of the different types are indicated.

The other two factors that determine an assay’s throughput are miniaturization and automation, which were developed hand in hand24. Starting with 96 well plates, which can still be managed manually, the field shifted to the 384 and 1536 well formats34. Since these formats require high precision and imply handling small volumes, manual manipulations are challenging and error prone. Automation is therefore essential for the successful implementation of these formats. Cost reduction was a further driver for implementing these technologies. Ultimately, reducing the assay volume allows decreasing the amount of raw input material and applying massive parallelization35. Both factors drastically cut the cost and time spent per well, enabling the throughput required to test libraries of > 1’000’000 compounds.

At the beginning of the millennium, miniaturization and automation became common practice in the field, and testing a large number of compounds was no longer the limiting factor for fast discovery. This shifted the focus away from technology and back to biology, with the aim of improving the validity of the assay and hence increasing the discovery rate of successful compounds24. As a consequence, novel assays must provide a higher density of information on the system and compounds tested. Multiplexed assays, e.g. liquid chromatography/mass spectrometry36 or deep sequencing37, would fulfill this criterion by measuring up to 10’000 parameters per sample38. However, these assays are not compatible with HTS set-ups, as an increase in the number of parameters measured simultaneously generally leads to a decrease in throughput39.
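The effect of the plate format on the size of a screening campaign can be illustrated with a rough calculation. The following is a minimal sketch under simplifying assumptions: every well holds exactly one compound (no wells are reserved for controls), and the library size and daily throughput are the figures quoted above. The function names are hypothetical and chosen for illustration only.

```python
# Well counts of the plate formats mentioned in the text.
PLATE_FORMATS = (96, 384, 1536)


def plates_needed(n_compounds: int, wells_per_plate: int) -> int:
    """Number of plates if every well holds exactly one compound.

    Assumption: no wells are reserved for controls.
    """
    if n_compounds < 0 or wells_per_plate <= 0:
        raise ValueError("need n_compounds >= 0 and wells_per_plate > 0")
    # Integer ceiling division avoids floating point rounding.
    return -(-n_compounds // wells_per_plate)


def days_needed(n_compounds: int, compounds_per_day: int) -> int:
    """Number of screening days at a fixed daily throughput."""
    if n_compounds < 0 or compounds_per_day <= 0:
        raise ValueError("need n_compounds >= 0 and compounds_per_day > 0")
    return -(-n_compounds // compounds_per_day)


if __name__ == "__main__":
    library_size = 1_000_000  # library size quoted in the text
    for wells in PLATE_FORMATS:
        print(f"{wells} well format: {plates_needed(library_size, wells)} plates")
    print(f"days at 10'000 compounds per day: {days_needed(library_size, 10_000)}")
```

Under these assumptions, moving from the 96 to the 1536 well format reduces the number of plates for the same library sixteenfold, which is the parallelization gain referred to above.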

Figure 1.2 Classification of various assays by throughput and multiplexity. The two parameters are inversely correlated. cDNA, complementary DNA; FACS, fluorescence-activated cell sorting; RGA, reporter gene assay; ELISA, enzyme-linked immunosorbent assay; HTS, high throughput screening. Adapted with permission from Feng et al.39.

Assays combining a medium degree of multiplexing with high throughput set-ups are called high content screening assays. The most common method to collect system parameters is microscopy40, which has led to the synonymous use of “automated microscopy screening” and “high content screening”, even though assays that are not based on microscopy can also provide high content41. However, multiplexing of readouts that directly link to a specific target is desirable, and further developments in this field are necessary (Fig. 1.2).

1.2 RNA as a drug target

The idea of RNAs as drug targets has existed for many years, for three main reasons42. First, RNAs play diverse roles in human physiology: as carriers of information (mRNAs), in the ribosome (rRNAs, tRNAs), as genetic regulators (miRNAs) and many more43. Deregulation of these RNAs is associated with a wide variety of diseases; hence, targeting them could provide a way of treatment. Second, all proteins are translated from mRNA. In cases where targeting a protein is difficult, its mRNA could be targeted to prevent the protein from being produced, providing an alternative treatment route44. Third, since RNAs are essential to all living organisms on earth, they are also essential for human pathogens. By targeting RNAs that are vital to these pathogens and do not have a function in human physiology, powerful antibiotics and antivirals can be found45. Two concepts for targeting RNAs have been suggested, small molecules46 and antisense RNAs47, each with its advantages and disadvantages.

1.2.1 Targeting RNAs with antisense RNAs

Antisense RNAs are conceptually elegant: by providing an RNA that hybridizes with the target RNA, the target RNA is blocked from performing its function, e.g. it is not translated or not properly folded. When targeting mRNA, instead of providing a long stretch of antisense RNA, one can also provide the cells with an shRNA, an siRNA or a mature miRNA, exploiting the natural inhibition mechanism of the cells (see below)48. Since the sequence of the target RNA is known, designing antisense RNAs follows defined rules, making the synthesis of a specific treatment easy and fast49. Therefore, unlike for traditional targets such as proteins, the problems associated with antisense RNAs are not the discovery of an active compound but rather the stability, immunogenicity and delivery of the therapeutic agents50.

Several solutions have been suggested to tackle these challenges, especially for siRNAs/miRNAs, but they may be applied to any antisense nucleotide concept. Chemical modifications of (some of) the bases were suggested as a solution to both stability and immunogenicity51. Nuclease activity is the main reason for rapid RNA degradation in serum and inside cells. Changes in the phosphate backbone of the RNA prevent its recognition by nucleases, considerably increasing stability. Further, it was observed that mammalian tRNAs and rRNAs evade the innate immune response by methylation of the 2’ oxygen of the ribose. This led to the creation of a wide variety of modifications, all of which increase stability against nucleases and decrease immunogenicity (Fig. 1.3). A drawback is a potential decrease in antisense activity, which means that modifications must be tailored to each application51.


Figure 1.3 RNA modifications. The element changed compared to natural RNA is highlighted in red. The name of the RNA analogue is written above or below for ribose or phosphoester modifications, respectively. LNA, locked nucleic acid; FANA, 2'-deoxy-2'-fluoro-arabinonucleic acid; MOE, methyl-O-ethyl. Adapted from Behlke51.

The site of action of antisense RNAs is inside the cell. Delivering these RNAs into the cells is a major challenge because naked RNA has a number of properties that make penetration of cell membranes difficult, namely negative charge, high molecular weight and size52. The entry pathway for naked RNA is therefore endocytosis, which is unfavorable for activity because endosomes ultimately end up in lysosomes, where their content is degraded53.

A possible solution to these issues is local delivery. The eye54 and the lung55 in particular have attracted considerable attention, since these tissues can easily be supplied with considerable amounts of RNA through injection and inhalation, respectively. Systemic administration poses bigger obstacles, and various carriers were developed to address them. They can generally be divided into viral and non-viral delivery methods53. For viral delivery, the DNA (or RNA) coding for the RNA of interest is placed under the control of a constitutively active promoter and introduced into the viral genome. Upon transduction, the host cells’ machinery expresses these RNAs56. This type of delivery is especially suited when long lasting expression is desired, mostly for chronic diseases. On the downside, the safety of viral delivery methods is the subject of active debate. Since viruses integrate their DNA (retrotranscribed RNA) into the host’s genome, they pose the risk of producing unwanted and uncontrolled side effects52.

Non-viral delivery systems, on the other hand, are based on conjugates or on encapsulation of the RNA. Conjugates can be divided into two groups, non-covalently and covalently bound ones. The former are obtained with positively charged particles such as cell-penetrating proteins or polymers, the latter by linking the RNA to bioavailable molecules such as cholesterol or antibodies. Antibody conjugates have the advantage that they can also be used to target the delivery to a certain tissue through the specificity of the antibody52. Lipid-based carriers include liposomes, micelles, microemulsions and solid lipid nanoparticles. The siRNA is wrapped in these vesicles to be shielded from its environment and enters the cell via endocytosis57. Various methods were developed to escape the endosome trap; most of them rely on the “proton sponge” effect. The delivery agent buffers the pH in the endosome, leading to a constant influx of protons and counter ions. This causes the endosome to swell and eventually to rupture, releasing the RNA payload58.

1.2.2 Targeting RNAs with small molecules

Targeting RNAs with small molecules poses challenges more similar to those of classic drug discovery with protein targets. Yet targeting RNAs is associated with a number of difficulties not observed for proteins44. RNAs have highly diverse and very dynamic three-dimensional structures. In contrast to many protein-small molecule interactions, which can be sufficiently modeled with a lock-and-key mechanism, RNA-small molecule binding is very dynamic and can only be described by an induced-fit model. This makes rational design of RNA drugs difficult and finding small molecules that specifically interact with RNAs very challenging. All RNAs share the same negatively charged phospho-ribose backbone. Therefore, positively charged molecules that would interact strongly with RNA are usually not specific. Further, all bases making up an RNA polymer have aromatic rings. These rings can interact via π-stacking with molecules that have similar aromatic systems42. However, since these properties are shared among all RNAs, only small differences in the global electron system can be exploited.

All these properties of RNA drug targets also apply to miRNAs, but due to their short length it is even more difficult to find small molecules that specifically interact with miRNAs. Therefore, understanding miRNA regulation, maturation and function is key to identifying potential alternative targets that modulate the activities of miRNAs59,60.

1.3 miRNAs

Genes are regulated post-transcriptionally by miRNAs. These are short, non-coding RNAs of about 18-22 bp length that regulate a gene by interacting with its mRNA. The target mRNA has nucleotide sequences in its 3’-untranslated region that are (partially) complementary to the sequence of the miRNA. This region is usually 8-15 bp long, with strong conservation of complementarity in the first 7 base pairs of the miRNA, called the seed region61. The mature miRNA is embedded in proteins to form a large ribonucleoprotein complex, called RNA-induced silencing complex or RISC. Initiated by Watson-Crick base pairing of the seed region, the miRNA binds to its target and RISC can exert its function62.

Two mechanisms have been suggested for how miRNAs decrease protein production. RISC either cleaves the target mRNA, thereby inducing degradation of the transcript, or it recruits inhibitory factors of translation63. The net effect of a single miRNA binding to its target is usually weak and modulates the amount of produced protein only slightly. It is therefore believed that miRNAs do not exert an all-or-nothing type of regulation but are responsible for fine tuning of transcript abundance and buffering of transcriptional noise64. Since even slight differences in protein levels can have a significant impact on cell fate, transcript regulation by miRNAs plays an essential role in ensuring robust cellular processes (Fig. 1.4)65.
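The seed-based recognition described above can be sketched as a simple sequence search. The following is a minimal illustration, not a target prediction tool: it assumes perfect Watson-Crick complementarity over the seed only, takes the seed as the first 7 bases of the miRNA as stated in the text (other conventions, such as positions 2-8, can be selected through the hypothetical `seed_start` parameter), and uses made-up toy sequences rather than real miRNAs or transcripts.

```python
from typing import List

# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def seed_sites(mirna: str, utr: str, seed_start: int = 0, seed_len: int = 7) -> List[int]:
    """0-based positions in the 3'-UTR that pair perfectly with the miRNA seed.

    Assumption: only perfect seed complementarity is scored; pairing outside
    the seed and G:U wobble pairs are ignored.
    """
    seed = mirna.upper()[seed_start:seed_start + seed_len]
    site = reverse_complement(seed)  # sequence the mRNA must carry, 5'->3'
    utr = utr.upper()
    return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]


if __name__ == "__main__":
    toy_mirna = "AUGCAUGGCUAACGUAGCUAGC"  # hypothetical sequence
    toy_utr = "GGGCAUGCAUAAACCC"          # hypothetical 3'-UTR fragment
    print(seed_sites(toy_mirna, toy_utr))
```

Because the search uses only a 7 base window, many unrelated transcripts can carry a matching site by chance, which is consistent with the broad target range of miRNAs discussed in this section.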

Figure 1.4 Cell fate decisions in buffered and unbuffered systems. miRNA reduces the expression level of an mRNA and buffers its expression bursts. Unbuffered systems show much larger variability. Adapted with permission from Vidigal et al.64.

Despite their rather small effect on single transcripts, miRNAs were found to play a profound role in physiology and disease19. More than 60% of all mRNAs are regulated by one or multiple miRNAs, which is a surprisingly high fraction66. Yet, this can be rationalized by considering two distinct miRNA properties. First, the recognition site of 8 bp is rather short, increasing the frequency of possible matches. Second, since miRNAs are nucleic acid sequences, novel miRNAs can evolve quickly, leading to a tremendous diversity of miRNAs67. Indeed, more than 1800 human miRNAs are currently annotated in miRBase68. Given the wide range of genes that miRNAs regulate, it is intuitive that miRNAs themselves are tightly regulated69. Based on a broad range of cases, a few key regulation motifs were identified, namely negative feedback between the gene and the miRNA, and incoherent as well as coherent feed-forward loops of two genes with the miRNA as the regulator of the indirect branch (Fig. 1.5)64.


Figure 1.5 Common miRNA regulation motifs. In larger networks, miRNAs mostly function in negative feedback loops or as mediators in indirect branches of (in)coherent feed-forward loops. Adapted with permission from Vidigal et al.64.

These motifs alone can give rise to very complex dynamics. In addition, miRNAs usually interact with several genes, and many mRNAs are themselves regulated by several miRNAs (coregulation), increasing the complexity of these regulation networks even further70. These intertwined layers of regulation and the many cooperative actions also explain why the knockdown by a single miRNA does not need to be very strong to influence cell fate decisions, and why its deregulation can still affect the physiology of a cell considerably.

1.3.1 miRNA maturation

Most miRNAs follow a canonical maturation process71, which I will briefly review here. For a review of non-canonical maturation pathways, please refer to Yang et al.72. miRNAs are expressed from Pol II promoters and are encoded either intronically within coding genes or intergenically with their own promoters. The transcript of a miRNA gene is called pri-miRNA and can be several hundred base pairs long. This pri-miRNA forms a particular stem loop that is recognized by a protein complex consisting of Drosha and DGCR8, which cleaves off the stem loop, resulting in an approximately 60 bp long RNA fragment called pre-miRNA. The pre-miRNA is exported from the nucleus to the cytoplasm with the help of Exportin-5. In the cytoplasm, Dicer and TRBP2 further process the pre-miRNA by cleaving the hairpin off the stem loop, leaving a miRNA-miRNA* duplex of ~22 bp that consists of the passenger strand and the future mature miRNA. Eventually, the duplex is loaded into the RISC complex and the passenger strand, being the one with the thermodynamically more stable 5’ end, is degraded, leaving the mature miRNA in the RISC complex (Fig. 1.6)73.


Figure 1.6 Canonical miRNA maturation pathway. miRNAs originate from a long transcript that is cleaved, exported and loaded into a large complex called RISC, where the miRNA exerts its function. m7Gppp, 7-methyl-guanosine-containing cap; DGCR8, DiGeorge syndrome critical region 8 protein; TRBP, TAR (HIV-1) RNA binding protein; HSC70, heat shock 70kDa protein 8; HSP90, heat shock protein 90; miRNA*, not selected strand of miRNA-miRNA* duplex; RISC, RNA-induced silencing complex; ORF, open reading frame. Adapted with permission from Ameres et al.73.

1.3.2 miRNA in disease

Given miRNAs’ involvement in regulating a large fraction of the human transcriptome, it is not surprising that miRNAs were found to play crucial roles in disease74. The human miRNA disease database currently lists 572 different miRNAs involved in 378 different diseases75. In most of these cases, miRNAs enhance a given diseased phenotype, or their restoration or inhibition helps to treat the disease in combination with other means such as small molecules or antibodies76.

A particularly interesting disease-associated miRNA is miR-122. It is a liver specific miRNA expressed at extremely high levels in healthy liver tissue, accounting for approximately 70% of total liver miRNAs. Further, it is involved in Hepatitis C virus (HCV) replication and liver cancer77. HCV is a single-stranded RNA virus that causes hepatitis C. For its replication it depends on the presence of miR-12278. miR-122-RISC associates with two binding sites located at the 5’ terminus of the HCV RNA and, contrary to the usual function of miRNAs of inhibiting gene expression, facilitates virus replication. Two distinct mechanisms cause this phenomenon. First, miR-122-RISC stabilizes a stem loop at the 5’ end of the viral RNA and prevents its exposure to XRN-1, a 5’ to 3’ exoribonuclease, and other cytosolic nucleases. This higher stability increases the number of translation events per RNA, increasing replication of the virus79. Second, a long range RNA-RNA interaction within the viral RNA masks its internal ribosome entry site (IRES), which is in close proximity to the miR-122 binding site. Binding of miR-122-RISC prevents this interaction and exposes the IRES, thereby enhancing the association of ribosomes with the RNA, which further increases the rate of translation. Therefore, decreasing the level of miR-122 decreases viral load78.

Well-differentiated hepatocytes have an extraordinarily high expression level of miR-12279, and loss of miR-122 was found to be associated with a loss of liver specific functions80. Especially in human hepatocellular carcinoma (HCC), miR-122 loss is strongly correlated with a higher tendency to metastasize and therefore with poor prognosis. Restoration of miR-122 levels in HCC models strongly decreased cancer growth and the metastatic properties of the cancer81. Both cases indicate that miR-122 plays an important role in disease, making it an attractive drug target82.

1.3.3 Challenges in miRNA drug discovery

As outlined above, specific targeting of RNA is challenging. miRNAs are rather short and, especially among members of the same family, show considerable sequence similarity, which complicates finding specific modulators for miRNAs83. Therefore, alternative modulation concepts should be considered. Instead of targeting miRNAs themselves, one could also target their transcription factors or other proteins involved in their maturation. However, achieving specificity for these mechanisms is equally demanding. First, miRNAs are often clustered or intronic and therefore share the same transcription factor with other miRNAs84 or with the gene encoded in the exon, which would subsequently be targeted too85. Second, many miRNAs are part of a larger family of miRNAs with shared seed regions or even identical mature sequences. Since these versions of the same miRNA are encoded in different loci of the genome, multiple transcription factors might need to be targeted specifically, which is unlikely to be achieved with a single molecule86. Third, most miRNAs share the same canonical maturation pathway. It is therefore likely that compounds from a library not focused on RNA target one of these maturation proteins. Changing the activity of a maturation protein affects the expression levels of many miRNAs, which leads to unspecific hits60.

1.3.4 State of the art miRNA screening systems

Two screening systems for the discovery of miRNA modulators are described in the literature. The first one uses qPCR87. The cells are treated with a compound of interest and the expression level of the mature miRNA or the pre-miRNA is measured. Measuring the pre-miRNA rather than the mature miRNA has the advantage that the response to the modulation is fast, since the miRNA does not need to be fully processed. On the other hand, measuring the pre-miRNA does not identify compounds that interact with the mature miRNA, so potent modulators might be missed.

The second screening system builds on the functionality of the miRNA as a repressor. By placing a miRNA’s binding site in the three-prime untranslated region (3’-UTR) of a reporter, the signal of the reporter becomes inversely related to the miRNA’s repressive capacity. Shan et al.88 described a system to screen for general inhibitors of the miRNA processing machinery. An shRNA that targets the mRNA of green fluorescent protein (GFP) is expressed together with GFP itself. Therefore, any increase in GFP expression is due either to an inhibition of the maturation proteins or to a direct interaction with GFP. Gummireddy et al.89, Young et al.90 and Conelly et al.91 published a series of works on the discovery of modulators of endogenous miRNAs. In the first version of their assay, they placed the miRNA’s binding sequence in the 3’-UTR of Renilla luciferase and measured the increase or decrease in its activity. They gradually improved the system by adding a second luciferase reporter to normalize for unspecific effects such as cell density, transfection efficiency, etc. Finally, they stably integrated these two plasmids into the genome of the screening cell line. With these systems, small molecules modulating a range of miRNAs were described, including non-specific miRNA activators (Enoxacin87,88) and inhibitors (Trypaflavine92, Polylysine92), specific miR-21 inhibitors (based on diazobenzene89) and specific miR-122 activators (NSC 30884790) and inhibitors (NSC 15895990, NSC 547690). However, the miR-122 modulators were barely tested for specificity (the compounds were only assayed with a miR-21 screening system), and conclusions about their specificity are therefore difficult to draw.
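The normalization with a second reporter described above can be illustrated with a short calculation. This is a minimal sketch and not the analysis used in the cited works: it assumes that the miRNA-regulated reporter signal is simply divided by the signal of the unregulated control reporter, and that the result is expressed relative to a vehicle-treated well. The function names and the numbers in the example are hypothetical.

```python
from typing import Tuple

# A well is described by (target reporter signal, control reporter signal).
Well = Tuple[float, float]


def normalized_signal(target_reporter: float, control_reporter: float) -> float:
    """Signal of the miRNA-regulated reporter relative to the control reporter."""
    if control_reporter <= 0:
        raise ValueError("control reporter signal must be positive")
    return target_reporter / control_reporter


def fold_change(compound_well: Well, vehicle_well: Well) -> float:
    """Normalized signal of a compound-treated well relative to a vehicle well.

    Under the assumptions above, a value > 1 suggests reduced miRNA activity
    and a value < 1 suggests increased miRNA activity.
    """
    return normalized_signal(*compound_well) / normalized_signal(*vehicle_well)


if __name__ == "__main__":
    vehicle = (200.0, 100.0)        # invented readings
    inhibitor_like = (400.0, 100.0)  # target reporter de-repressed
    fewer_cells = (100.0, 50.0)      # both reporters halved, e.g. lower cell density
    print(fold_change(inhibitor_like, vehicle))
    print(fold_change(fewer_cells, vehicle))
```

The second example shows the purpose of the control reporter: an effect that scales both reporters equally, such as a lower cell number, leaves the normalized value unchanged.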
Common to both of these screening systems, PCR based and reporter gene based, is that the specificity of a compound cannot be assessed. For small library screens this might not be a major obstacle; with discovery scale libraries, however, it becomes a significant problem. We identified mammalian gene circuits as a candidate system that could help overcome this problem, and I devote the next section to introducing the background and functionality of these circuits.

1.4 Synthetic biology

Synthetic biology is a branch of biology that started with the vision of rationally designing novel biological products, systems and organisms for human applications. It considers modifying and building biological systems in part as an engineering problem that could be solved with engineering approaches. In particular, three engineering principles were identified as especially useful in the context of biology, namely standardization of parts, decoupling of modules and abstraction of complexity15.

1.4.1 Principles of synthetic biology

Standards are central elements of modern society, regulating subjects as diverse as railroad track width, Internet addresses and exhaust fume measurements15. They ensure that different parts can work together efficiently by having compatible specifications, such as size, material, etc., and that measurements performed in one laboratory can be reliably reproduced in another one. Standards hence guarantee compatibility and comparability. In molecular biology, these standards are still to be defined93. One example for which researchers designed a set of rules are bacterial promoters94, which can be considered parts of a larger biological system. Depending on the organism into which a promoter is transplanted, the measurement equipment with which it is recorded and the references to which it is compared, its “strength” will be classified differently. To enable re-use in different contexts without repeated characterization, it is desirable to measure a promoter with defined methods in model organisms. This allows finding rules for transferring these entities to other settings. These standards should be defined as universally as possible to enable computer-aided design and modeling, which would further speed up the development of novel systems93.

Decoupling is the concept of breaking down a very complex problem into smaller ones that can be solved independently, with the intention of eventually reassembling the resulting solutions to solve the initial problem. This allows many people to work in parallel on the same grand challenge, which accelerates the innovation process considerably15. The result of such an inventive process can be a device. It is assembled from standard parts and is characterized by a defined input/output transfer function, which allows integrating it into larger systems without dealing with the specifics of each part95. The tetracycline-dependent transactivator (tTA) is a biological example of a single-input-single-output device. The small molecule tetracycline serves as an input that is sensed by the tet-repressor, which inhibits the expression of a gene, i.e. the output, and a defined relationship between the amount of tetracycline and the output can be measured (Fig. 1.7)96.

Abstraction is a very powerful approach to dealing with complex problems. Based on specific examples with detailed descriptions, general rules are identified and abstract concepts are derived. It is effective for essentially every complex problem; yet, its full power is only realized when it is combined with standardized parts of modularized systems, because this allows hierarchical structuring of the problem. The problem can be broken down into systems, devices and parts, enabling the scientist to deal with each of these layers separately15. This is best illustrated by a comparison with software engineering: abstract programming languages that perform well-defined functions enable the programmer to focus on the creative process without having to deal with machine language or circuit board design, which speeds up innovation considerably93. This could also be beneficial for biological problems. For example, when choosing an activator, it is desirable to know beforehand which DNA sequence codes for it, to which DNA it binds and how strong the subsequent induction will be (Fig. 1.7)15.

Figure 1.7 Illustration of the engineering principles applied to biology. Complex systems are represented in abstracted units to enable the understanding of higher order function. The systems are decoupled into smaller devices with defined input-output relations, which are further broken down into (standardized) parts. Standardization allows the fast reconnection of parts to form functional devices. Adapted with permission from Endy15.

Synthetic biology applies engineering principles to biology in order to make the development of novel biological systems fast, reliable and predictable. Over the last fifteen years, researchers from all over the world have applied these rules to create an impressive diversity of functions including switches97, oscillators98-100 and computing networks101-103. These units were further developed and applied to biofuel and high value chemicals production104,105, cancer therapy106,107, infectious diseases108,109, immune disorders110,111 and drug discovery112,113, in in vitro and in vivo systems and in hosts as diverse as bacteria, yeast and mammalian cells.

1.4.2 Mammalian synthetic biology

Gene circuits consist of a set of engineered genes that, upon introduction into their host, are transcribed and/or translated by the cell’s machinery to form a functional biological network103. To build such a network, parts are combined into devices and devices into networks. Common parts are DNA binding domains, promoters, etc., which are combined into devices such as sensors, switches, oscillators and logic gates that, when assembled properly, enable a full-scale network to exhibit a desired biological behavior95.

A sensor measures the concentration of an analyte and produces an output at amounts (inversely) correlated to the input’s level over a certain concentration range. Combined with an easily measurable output, it can be used as a reporter to sense various inputs such as ions, metabolites, polynucleotides or proteins96,101,114. By exchanging the reporter for an actuator, sensors can be adapted to perform a specific biological function. One such system is engineered to control T-cell proliferation. It consists of three components: a sensor (aptazyme), an actuator (Interleukin 2 (IL-2) cytokine) and an inducer (theophylline). An aptazyme is a combination of a ribozyme and an aptamer that results in a functionalized RNA sequence, which cleaves itself in the absence of the inducer, leading to its degradation, and is stable in the presence of the inducer. By combining this aptazyme with the IL-2 gene, cytokine expression and hence stimulation of T-cell proliferation is only achieved when theophylline is present to stabilize the mRNA (Fig. 1.8a)110.

Switches are a specific class of sensors with a defined input/output relation. A sensor can potentially take any correlated input/output relation, while a switch’s transfer function must show a very steep increase or reduction of the output at a certain input concentration, switching from an Off- to an On-state or from an On- to an Off-state, respectively. This clean on/off behavior is beneficial for the construction of larger networks, where it helps to achieve a good signal to noise ratio (see below), but it can also be useful on its own, for example for the production of toxic proteins. Friedman et al. employed this to produce herpes simplex virus type 1 glycoprotein gC (gC1). It is a cytotoxic protein, which would inhibit the cells’ growth and therefore its own production. By inducing it only once a certain cell density is reached, production yields can be remarkably improved (Fig. 1.8b)115.
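The difference between a graded sensor and a switch can be illustrated with a generic transfer function. The sketch below uses a Hill function, which is an assumption made here for illustration and is not taken from the cited works; the parameter names `k` (input at half-maximal output) and `n` (steepness) and the chosen values are hypothetical.

```python
def hill_activation(x: float, k: float, n: float) -> float:
    """Fraction of maximal output for input x.

    k is the input at half-maximal output, n controls the steepness.
    Assumption: an activating Hill function is used as a generic model.
    """
    if x < 0 or k <= 0 or n <= 0:
        raise ValueError("need x >= 0, k > 0 and n > 0")
    return x**n / (k**n + x**n)


if __name__ == "__main__":
    # Compare a graded sensor (n = 1) with a switch-like device (n = 8)
    # at inputs below, at and above the half-maximal input k = 1.
    for x in (0.5, 1.0, 2.0):
        graded = hill_activation(x, k=1.0, n=1.0)
        switch = hill_activation(x, k=1.0, n=8.0)
        print(f"input {x}: graded {graded:.3f}, switch {switch:.3f}")
```

With the low steepness the output changes gradually over the input range, whereas with the high steepness it stays close to zero below the threshold and close to its maximum above it, which corresponds to the Off- and On-states described above.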


Figure 1.8 Synthetic biology devices. (a) – (d) Coarse-grained diagrams of the concept followed by potential input-output function and implemented examples reviewed in the text.

Oscillators have key functions in animal biology from the timing of the circadian rhythm to the regulation of signaling pathways116. A synthetic version of such an oscil- lator can be used to time the expression of certain target genes and give rise to com- plex network dynamics. Tigges et al. implemented a synthetic oscillator in CHO cells by encoding an auto-regulated positive and a time-delayed negative feedback loop (Fig. 1.8c). Specifically, the tetracycline dependent transactivator (tTA) is put under a tTA dependent minimal promoter. Hence, as the default state of the system, none of the involved proteins is expressed. Then the first cycle starts through leakiness in the tTA-dependent minimal promoter. It expresses tTA and initiates a positive feedback loop. A second minimal tTA dependent promoter controls the expression of PIT, a sec- ond, orthogonal transactivator. PIT in turn induces its cognate minimal promoter that encodes an antisense RNA of the tTA mRNA, thereby inhibiting tTA protein production. This leads together with degradation of tTA to its temporal extinction. PIT, which is regulated by tTA gets also degraded, setting the system to its default state, where none of the component is present and self-activation of tTA trough leakiness can occur.

To read out the whole system, the expression of a reporter can be coupled to either of the transactivators100.

Several sensors can be combined to sense and integrate multiple inputs. Proper wiring of these inputs allows the creation of logic gates, e.g. a Boolean AND gate. This gate creates an output only if all inputs are present, performing the logic of “Input 1 AND Input 2 AND … AND Input N”. Mastering the other Boolean functions as well, such as OR, NOR, XOR and NAND, theoretically allows the creation of any desired logical operation. Synthetic biologists have strived to create these gates in mammalian cells, sensing various inputs such as small molecules102, small RNAs101 and transcriptional activators117. Xie et al.107 published an instructive example of this concept for targeted cancer cell killing. Cancer cells differ from healthy cells in some aspects of their behavior, e.g. they grow much faster or thrive in low-oxygen environments. However, as they originate from a healthy ancestor, they also share many traits with healthy cells. This makes distinguishing between healthy and cancerous cells rather difficult, thus rendering targeted therapy challenging. However, by comparing a larger range of markers, defined molecular patterns emerge. Xie et al. used miRNAs as their molecular markers and found that, by using six miRNAs, the cervical cancer cell line HeLa can readily be discriminated from other cells. In particular, they used three miRNAs that are highly expressed in HeLa, miR-21, -17 and -30a, and three that are expressed at low levels, miR-141, -142(3p) and -146a, to molecularly characterize the HeLa cells. These miRNAs were wired in such a way that expression of an output is observed only when the exact expression pattern is matched. This was achieved by placing the binding sequences of the low-expressed miRNAs in the 3’UTR of the output gene. Any expression of a low miRNA would therefore lead to output repression. The highly expressed miRNAs were employed to target repressor molecules that in turn target the output gene. This ensures that high expression of these miRNAs represses the repressor, which then can no longer repress the output. In the absence of either of these miRNAs, expression of the repressors is allowed, which turns off output expression. Combined, this set of miRNAs encodes the logic Output = miR-21 AND miR-17-30a AND NOT(miR-141) AND NOT(miR-142(3p)) AND NOT(miR-146a), meaning that any deviation from the defined high/low expression pattern leads to repression of the output and hence to the precise, selective identification of HeLa cells. By replacing the generic output with a killing gene, HeLa cells can selectively be destroyed, and the technology can potentially serve as a cancer treatment (Fig. 1.8d).
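The classifier logic quoted above can be written down as a small truth function. The following is only an illustrative sketch of the Boolean expression, not a model of the circuit's molecular implementation; the function name and the dictionary encoding (True meaning "highly expressed") are assumptions made for this example.

```python
def hela_classifier(is_high: dict[str, bool]) -> bool:
    """Evaluate Output = miR-21 AND miR-17-30a AND NOT(miR-141)
    AND NOT(miR-142(3p)) AND NOT(miR-146a).

    `is_high` maps each miRNA input to True if it is highly expressed.
    """
    return (
        is_high["miR-21"]
        and is_high["miR-17-30a"]
        and not is_high["miR-141"]
        and not is_high["miR-142(3p)"]
        and not is_high["miR-146a"]
    )


# Expression pattern that matches the HeLa profile described in the text.
hela_like = {
    "miR-21": True,
    "miR-17-30a": True,
    "miR-141": False,
    "miR-142(3p)": False,
    "miR-146a": False,
}

# Hypothetical non-HeLa cell: identical except that miR-141 is high.
other_cell = {**hela_like, "miR-141": True}

print(hela_classifier(hela_like))   # output is produced
print(hela_classifier(other_cell))  # output is repressed
```

A single deviating input is enough to switch the output off, which is what makes the pattern match selective.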

1.5 Thesis statement

As outlined above, the goal of this thesis is to apply the design principles of synthetic biology to high throughput screening. Specifically, the inverse relation between the number of targets measured in multiplex and the throughput of the assay should be addressed. Multi-input gene circuits were identified as a possible solution to this limitation. These circuits sense, compute and integrate a number of defined inputs and generate an output based on a predefined Boolean logic. An example of such a circuit was reviewed above for the case of cancer cell classification107. Based on this and other previous work101,117 we developed an assay to screen for specific miRNA modulators. It consists of a gene circuit that multiplexes five different miRNAs, miR-21, -20a, -141, -146a and -122, the latter being the drug target. The other miRNAs serve as proxies for off-target effects. If any of these miRNAs is affected simultaneously with miR-122, a compound is identified as non-specific. After proving the basic functionality of the assay, we optimized the circuit in silico and experimentally to exhibit “excellent” screening properties and validated it with a customized pipeline. We tested a pilot compound library and found non-specific modulators. Unfortunately, none of the compounds was found to be specific. We demonstrated that the assay can be customized quickly and benchmarked it against the current gold standard (dual reporter assay), showing its advantages in sensing off-target effects. To conclude the thesis, I discuss the different designs and findings in the context of current developments and give an outlook on the possibilities of the technology.

References

1 Bud, R. The uses of life: a history of biotechnology. (Cambridge University Press, 1994).
2 Burbank, L., Whitson, J., John, R., Williams, H. S. & Society, L. B. Luther Burbank, his Methods and Discoveries and their Practical Application. (Luther Burbank, 1914).
3 Zhang, Y.-X. et al. Genome shuffling leads to rapid phenotypic improvement in bacteria. Nature 415, 644-646 (2002).
4 Mendel, G. Versuche über Pflanzen-Hybriden. Verhandlungen des Naturforschenden Vereines in Brünn 4, 3-47 (1866).
5 Watson, J. D., & Crick, F. H. C. A structure for deoxyribose nucleic acid. Nature 171, 737–738 (1953).
6 Linn, S. A., W. Host specificity of DNA produced by Escherichia coli, X. In vitro restriction of phage fd replicative form. Proc. Natl Acad. Sci. USA 59, 1300–1306 (1968).
7 Mullis, K. B. & Faloona, F. A. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. in Methods in Enzymology Vol. 155 335-350 (Academic Press, 1987).
8 Martin, J. F. & Demain, A. L. Control of antibiotic biosynthesis. Microbiol. Rev. 44, 230-251 (1980).
9 Vandamme, E. J. Production of vitamins, coenzymes and related biochemicals by biotechnological processes. J. Chem. Technol. Biot. 53, 313-327 (1992).
10 Chen, W. P., Anderson, A. W. & Han, Y. W. Production of isomerase by Streptomyces flavogriseus. Appl. Environ. Microbiol. 37, 324-331 (1979).
11 Köhler, G. & Milstein, C. Derivation of specific antibody-producing tissue culture and tumor lines by cell fusion. Eur. J. Immunol. 6, 511-519 (1976).
12 Sommerfeld, S. & Strube, J. Challenges in biotechnology production - generic processes and process optimization for monoclonal antibodies. Chem. Eng. Process. 44, 1123-1137 (2005).
13 Rawis, R. L. 'Synthetic Biology' makes its debut. Chem. Eng. News 78, 49-53 (2000).
14 Benner, S. A. & Sismour, A. M. Synthetic biology. Nat. Rev. Genet. 6, 533-543 (2005).
15 Endy, D. Foundations for engineering biology. Nature 438, 449-453 (2005).
16 Macarron, R. et al. Impact of high-throughput screening in biomedical research. Nat. Rev. Drug. Discov. 10, 188-195 (2011).
17 Bowes, J. et al. Reducing safety-related drug attrition: the use of in vitro pharmacological profiling. Nat. Rev. Drug. Discov. 11, 909-922 (2012).
18 Makley, L. N. & Gestwicki, J. E. Expanding the number of ‘Druggable’ targets: Non-enzymes and protein–protein interactions. Chem. Biol. Drug Des. 81, 22-32 (2013).
19 O'Connell, R. M., Rao, D. S., Chaudhuri, A. A. & Baltimore, D. Physiological and pathological roles for microRNAs in the immune system. Nat. Rev. Immunol. 10, 111-122 (2010).
20 Harvey, A. L. Natural products in drug discovery. Drug Discov. Today 13, 894-901 (2008).
21 Hüser, J. et al. High-throughput Screening for Targeted Lead Discovery. in High-Throughput Screening in Drug Discovery 15-36 (Wiley-VCH Verlag GmbH & Co. KGaA, 2006).
22 Overington, J. P., Al-Lazikani, B. & Hopkins, A. L. How many drug targets are there? Nat. Rev. Drug. Discov. 5, 993-996 (2006).
23 Kauffman, S. Random chemistry. Perspect. Drug Discov. 2, 319-326 (1995).
24 Mayr, L. M. & Fuerst, P. The future of High-Throughput Screening. J. Biomol. Screen. 13, 443-448 (2008).

25 Koltermann, A., Kettling, U., Bieschke, J., Winkler, T. & Eigen, M. Rapid assay processing by integration of dual-color fluorescence cross-correlation spectroscopy: High throughput screening for enzyme activity. Proc. Natl Acad. Sci. USA 95, 1421-1426 (1998).
26 Moore, K. & Rees, S. Cell-based versus isolated target screening: how lucky do you feel? J. Biomol. Screen. 6, 69-74 (2001).
27 An, W. F. & Tolliday, N. Cell-based assays for High-Throughput Screening. Mol. Biotechnol. 45, 180-186 (2010).
28 Zimmermann, G. R., Lehár, J. & Keith, C. T. Multi-target therapeutics: when the whole is greater than the sum of the parts. Drug Discov. Today 12, 34-42 (2007).
29 Lee, J. & Bogyo, M. Target deconvolution techniques in modern phenotypic profiling. Curr. Opin. Chem. Biol. 17, 118-126 (2013).
30 An, W. F. Fluorescence-based assays. in Cell-Based Assays for High-Throughput Screening Vol. 486 97-107 (Humana Press, 2009).
31 Kerns, E. H. & Di, L. Solution Stability. in Drug-like properties: Concepts, Structure Design and Methods 178-186 (Academic Press, 2008).
32 Kerns, E. H. & Di, L. Permeability. in Drug-like properties: Concepts, Structure Design and Methods 86-99 (Academic Press, 2008).
33 Kerns, E. H. & Di, L. Barriers to Drug Exposure in Living Systems. in Drug-like properties: Concepts, Structure Design and Methods 17-I (Academic Press, 2008).
34 Pereira, D. A. & Williams, J. A. Origin and evolution of high throughput screening. Brit. J. Pharmacol. 152, 53-61 (2007).
35 Berg, M. et al. Evaluation of liquid handling conditions in microplates. J. Biomol. Screen. 6, 47-56 (2001).
36 Christians, U., Klepacki, J., Shokati, T., Klawitter, J. & Klawitter, J. Mass spectrometry-based multiplexing for the analysis of biomarkers in drug development and clinical diagnostics - How much is too much? Microchem. J. 105, 32-38 (2012).
37 Huang, P., Kehner, G. B., Cowan, A. & Liu-Chen, L.-Y. Comparison of Pharmacological Activities of Buprenorphine and Norbuprenorphine: Norbuprenorphine Is a Potent Opioid Agonist. Journal of Pharmacology and Experimental Therapeutics 297, 688-695 (2001).
38 Gurard-Levin, Z. A., Scholle, M. D., Eisenberg, A. H. & Mrksich, M. High-Throughput Screening of small molecule libraries using SAMDI mass spectrometry. ACS Comb. Sci. 13, 347-350 (2011).
39 Feng, Y., Mitchison, T. J., Bender, A., Young, D. W. & Tallarico, J. A. Multi-parameter phenotypic profiling: using cellular effects to characterize small-molecule compounds. Nat. Rev. Drug Discov. 8, 567-578 (2009).
40 Bickle, M. The beautiful cell: high-content screening in drug discovery. Anal. Bioanal. Chem. 398, 219-226 (2010).
41 Fennell, M., McIlvain, B., Stewart, W. & Dunlop, J. Leveraging HCS in neuroscience drug discovery. in High Content Screening 169-187 (John Wiley & Sons, Inc., 2007).
42 Ferner, J.-P. et al. RNA as a drug target. in NMR of biomolecules 298-313 (Wiley-VCH Verlag GmbH & Co. KGaA, 2012).
43 Mortimer, S. A., Kidwell, M. A. & Doudna, J. A. Insights into RNA structure and function from genome-wide studies. Nat. Rev. Genet. 15, 469-479 (2014).
44 Ecker, D. J. & Griffey, R. H. RNA as a small-molecule drug target: doubling the value of genomics. Drug Discov. Today 4, 420-429 (1999).
45 Hong, W., Zeng, J. & Xie, J. Antibiotic drugs targeting bacterial RNAs. Acta Pharm. Sin. B 4, 258-265 (2014).
46 Thomas, J. R. & Hergenrother, P. J. Targeting RNA with small molecules. Chem. Rev. 108, 1171-1224 (2008).

47 Inouye, M. Antisense RNA: its functions and applications in gene regulation - a review. Gene 72, 25-34 (1988).
48 Wilson, R. C. & Doudna, J. A. Molecular mechanisms of RNA interference. Annu. Rev. Biophys. 42, 217-239 (2013).
49 Bramsen, J. B. et al. A large-scale chemical modification screen identifies design rules to generate siRNAs with high activity, high stability and low toxicity. Nucleic Acids Res. 37, 2867-2881 (2009).
50 Castanotto, D. & Rossi, J. J. The promises and pitfalls of RNA-interference-based therapeutics. Nature 457, 426-433 (2009).
51 Behlke, M. A. Chemical modification of siRNAs for in vivo use. Oligonucleotides 18, 305-320 (2008).
52 Chen, Y., Gao, D.-Y. & Huang, L. In vivo delivery of miRNAs for cancer therapy: Challenges and strategies. Adv. Drug Deliver. Rev. 81, 128-141 (2015).
53 Wang, J., Lu, Z., Wientjes, M. G. & Au, J. S. Delivery of siRNA therapeutics: barriers and carriers. AAPS J. 12, 492-503 (2010).
54 Martinez, T. et al. In vitro and in vivo efficacy of SYL040012, a novel siRNA compound for treatment of glaucoma. Mol. Ther. 22, 81-91 (2014).
55 Lam, J. K.-W., Liang, W. & Chan, H.-K. Pulmonary delivery of therapeutic siRNA. Adv. Drug Deliver. Rev. 64, 1-15 (2012).
56 Tomar, R. S., Matta, H. & Chaudhary, P. M. Use of adeno-associated viral vector for delivery of small interfering RNA. Oncogene 22, 5712-5715 (2003).
57 Liu, X. & Huang, G. Formation strategies, mechanism of intracellular delivery and potential clinical applications of pH-sensitive liposomes. AJPS 8, 319-328 (2013).
58 Ma, D. Enhancing endosomal escape for nanoparticle mediated siRNA delivery. Nanoscale 6, 6415-6425 (2014).
59 Zhang, S., Chen, L., Jung, E. J. & Calin, G. A. Targeting microRNAs with small molecules: Between Dream and Reality. Clinical pharmacology and therapeutics 87, 754-758 (2010).
60 Monroig, P. d. C., Chen, L., Zhang, S. & Calin, G. A. Small molecule compounds targeting miRNAs for cancer therapy. Adv. Drug Deliver. Rev. 81, 104-116 (2015).
61 Ambros, V. The functions of animal microRNAs. Nature 431, 350-355 (2004).
62 Bartel, D. P. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116, 281-297 (2004).
63 Bushati, N. & Cohen, S. M. microRNA functions. Annu. Rev. Cell Dev. Bi. 23, 175-205 (2007).
64 Vidigal, J. A. & Ventura, A. The biological functions of miRNAs: lessons from in vivo studies. Trends Cell Biol. 25, 137-147.
65 Ebert, M. S. & Sharp, P. A. Roles for microRNAs in conferring robustness to biological processes. Cell 149, 515-524.
66 Selbach, M. et al. Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58-63 (2008).
67 Berezikov, E. Evolution of microRNA diversity and regulation in animals. Nat. Rev. Genet. 12, 846-860 (2011).
68 Kozomara, A. & Griffiths-Jones, S. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 42, D68-D73 (2014).
69 Ding, X. C., Weiler, J. & Großhans, H. Regulating the regulators: mechanisms controlling the maturation of microRNAs. Trends Biotechnol. 27, 27-36 (2009).
70 Lim, L. P. et al. Microarray analysis shows that some microRNAs downregulate large numbers of target mRNAs. Nature 433, 769-773 (2005).
71 Winter, J., Jung, S., Keller, S., Gregory, R. I. & Diederichs, S. Many roads to maturity: microRNA biogenesis pathways and their regulation. Nat. Cell Biol. 11, 228-234 (2009).

72 Yang, J.-S. & Lai, E. C. Alternative miRNA biogenesis pathways and the interpretation of core miRNA pathway mutants. Mol. Cell 43, 892-903 (2011).
73 Ameres, S. L. & Zamore, P. D. Diversifying microRNA sequence and function. Nat. Rev. Mol. Cell Biol. 14, 475-488 (2013).
74 Soifer, H. S., Rossi, J. J. & Saetrom, P. MicroRNAs in disease and potential therapeutic applications. Mol. Ther. 15, 2070-2079 (2007).
75 Lu, M. et al. An analysis of human microRNA and disease associations. PLOS ONE 3, e3420 (2008).
76 Ardekani, A. M. & Naeini, M. M. The role of microRNAs in human diseases. AJMB 2, 161-179 (2010).
77 Filipowicz, W. & Großhans, H. The liver-specific microRNA miR-122: biology and therapeutic potential. in Epigenetics and Disease Vol. 67 Progress in Drug Research Ch. 11, 221-238 (Springer Basel, 2011).
78 Goergen, D. & Niepmann, M. Stimulation of hepatitis C virus RNA translation by microRNA-122 occurs under different conditions in vivo and in vitro. Virus Res. 167, 343-352 (2012).
79 Wilson, J. A. & Huys, A. miR-122 Promotion of the hepatitis C virus life cycle: sound in the silence. Wiley Interdiscip. Rev. RNA 4, 665-676 (2013).
80 Gramantieri, L. et al. Cyclin G1 is a target of miR-122a, a microRNA frequently down-regulated in human hepatocellular carcinoma. Cancer Res. 67, 6092-6099 (2007).
81 Coulouarn, C., Factor, V. M., Andersen, J. B., Durkin, M. E. & Thorgeirsson, S. S. Loss of miR-122 expression in liver cancer correlates with suppression of the hepatic phenotype and gain of metastatic properties. Oncogene 28, 3526-3536 (2009).
82 van der Ree, M. H. et al. Miravirsen dosing in chronic hepatitis C patients results in decreased microRNA-122 levels without affecting other microRNAs in plasma. Aliment. Pharm. Ther. 43, 102-113 (2016).
83 Schmidt, M. F. Drug target miRNAs: chances and challenges. Trends Biotech. 32, 578-585 (2014).
84 Altuvia, Y. et al. Clustering and conservation patterns of human microRNAs. Nucleic Acids Res. 33, 2697-2706 (2005).
85 Monteys, A. M. et al. Structure and activity of putative intronic miRNA promoters. RNA 16, 495-505 (2010).
86 Borchert, G. M. et al. Comprehensive analysis of microRNA genomic loci identifies pervasive repetitive-element origins. Mob. Genet. Elements 1, 8-17 (2011).
87 Melo, S. et al. Small molecule enoxacin is a cancer-specific growth inhibitor that acts by enhancing TAR RNA-binding protein 2-mediated microRNA processing. Proc. Natl Acad. Sci. USA 108, 4394–4399 (2011).
88 Shan, G. et al. A small molecule enhances RNA interference and promotes microRNA processing. Nat. Biotech. 26, 933-940 (2008).
89 Gumireddy, K. et al. Small-molecule inhibitors of microRNA miR-21 function. Angew. Chem. Int. Ed. 47, 7482-7484 (2008).
90 Young, D. D., Connelly, C. M., Grohmann, C. & Deiters, A. Small molecule modifiers of microRNA miR-122 function for the treatment of hepatitis C virus infection and hepatocellular carcinoma. J. Am. Chem. Soc. 132, 7976-7981 (2010).
91 Connelly, C. M., Thomas, M. & Deiters, A. High-throughput luciferase reporter assay for small-molecule inhibitors of microRNA function. J. Biomol. Screen. 17, 822-828 (2012).
92 Watashi, K., Yeung, M. L., Starost, M. F., Hosmane, R. S. & Jeang, K.-T. Identification of small molecules that suppress microRNA function and reverse tumorigenesis. J. Biol. Chem. 285, 24707-24716 (2010).

93 Andrianantoandro, E., Basu, S., Karig, D. K. & Weiss, R. Synthetic biology: new engineering rules for an emerging discipline. Mol. Syst. Biol. 2, 1-14 (2006).
94 iGEM-Foundation. Registry of Standard Biological Parts - Promoters, (2015).
95 Wang, Y.-H., Wei, K. Y. & Smolke, C. D. Synthetic biology: advancing the design of diverse genetic systems. Annu. Rev. Chem. Bio. Eng. 4, 69-102 (2013).
96 Gossen, M. & Bujard, H. Tight control of gene expression in mammalian cells by tetracycline-responsive promoters. Proc. Natl Acad. Sci. USA 89, 5547-5551 (1992).
97 Gardner, T. S., Cantor, C. R. & Collins, J. J. Construction of a genetic toggle switch in Escherichia coli. Nature 403, 339-342 (2000).
98 Elowitz, M. B. & Leibler, S. A synthetic oscillatory network of transcriptional regulators. Nature 403, 335-338 (2000).
99 Stricker, J. et al. A fast, robust and tunable synthetic gene oscillator. Nature 456, 516-519 (2008).
100 Tigges, M., Marquez-Lago, T. T., Stelling, J. & Fussenegger, M. A tunable synthetic mammalian oscillator. Nature 457, 309-312 (2009).
101 Rinaudo, K. et al. A universal RNAi-based logic evaluator that operates in mammalian cells. Nat. Biotechnol. 25, 795-801 (2007).
102 Ausländer, S., Ausländer, D., Müller, M., Wieland, M. & Fussenegger, M. Programmable single-cell mammalian biocomputers. Nature 487, 123-127 (2012).
103 Benenson, Y. Biomolecular computing systems: principles, progress and potential. Nat. Rev. Genet. 13, 455-468 (2012).
104 Ro, D.-K. et al. Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature 440, 940-943 (2006).
105 Zhang, F., Rodriguez, S. & Keasling, J. D. Metabolic engineering of microbial pathways for advanced biofuels production. Curr. Opin. Biotech. 22, 775-783 (2011).
106 Anderson, J. C., Clarke, E. J., Arkin, A. P. & Voigt, C. A. Environmentally controlled invasion of cancer cells by engineered bacteria. J. Mol. Biol. 355, 619-627 (2006).
107 Xie, Z., Wroblewska, L., Prochazka, L., Weiss, R. & Benenson, Y. Multi-input RNAi-based logic circuit for identification of specific cancer cells. Science 333, 1307-1311 (2011).
108 Lu, T. K. & Collins, J. J. Dispersing biofilms with engineered enzymatic bacteriophage. Proc. Natl Acad. Sci. USA 104, 11197-11202 (2007).
109 Allison, K. R., Brynildsen, M. P. & Collins, J. J. Metabolite-enabled eradication of bacterial persisters by aminoglycosides. Nature 473, 216-220 (2011).
110 Chen, Y. Y., Jensen, M. C. & Smolke, C. D. Genetic control of mammalian T-cell proliferation with synthetic RNA regulatory systems. Proc. Natl Acad. Sci. USA 107, 8531-8536 (2010).
111 Larman, H. B. et al. Autoantigen discovery with a synthetic human peptidome. Nat. Biotech. 29, 535-541 (2011).
112 Weber, W. et al. A synthetic mammalian gene circuit reveals antituberculosis compounds. Proc. Natl Acad. Sci. USA 105, 9994-9998 (2008).
113 Zhao, W., Bonem, M., McWhite, C., Silberg, J. J. & Segatori, L. Sensitive detection of proteasomal activation using the Deg-On mammalian synthetic gene circuit. Nat. Commun. 5:3612, doi:10.1038/ncomms4612 (2014).
114 Wilde, R. J. et al. Control of gene expression in cells using a bacterial operator-repressor system. EMBO J. 11, 1251-1259 (1992).
115 Friedman, H. M. et al. Use of a glucocorticoid-inducible promoter for expression of herpes simplex virus type 1 glycoprotein gC1, a cytotoxic protein in mammalian cells. Mol. Cell. Biol. 9, 2303-2314 (1989).

116 Novak, B. & Tyson, J. J. Design principles of biochemical oscillators. Nat. Rev. Mol. Cell Bio. 9, 981-991 (2008).
117 Leisner, M., Bleris, L., Lohmueller, J., Xie, Z. & Benenson, Y. Rationally designed logic integration of regulatory signals in mammalian cells. Nat. Nanotechnol. 5, 666-670 (2010).

2 Precision multidimensional assay for high-throughput microRNA drug discovery

Based on

Precision multidimensional assay for high-throughput microRNA drug discovery, Benjamin Haefliger, Laura Prochazka, Bartolomeo Angelici and Yaakov Benenson, Nat. Commun., 7:10709, doi: 10.1038/ncomms10709 (2016).

The following chapter consists of an adapted version of the main text of the paper “Precision multidimensional assay for high-throughput microRNA drug discovery”, published in Nature Communications, with adapted figures and with supplementary information text included and adapted for better readability. It describes the rationale of synthetic gene circuits for the drug discovery of miRNA modulators, their construction, optimization, validation and testing. It further describes how these circuits can be adapted to fit other inputs and how they compare to the current gold standard, dual reporter assays.

The work was designed and planned together with my supervisor Kobi Benenson, who also performed the major part of the Simbiology simulations. I performed most of the experimental work myself; however, I had the kind help of Laura Prochazka and Bartolomeo Angelici for the construction of some (precursor) plasmids as well as for performing the second run of the large small molecule screen and the independent validation of the published small molecule modulators. Urs Senn performed the programming and handling of the robot used for the large screen.

2.1 Introduction

Progress in drug discovery is hampered by under-exploration of chemical space, and by the difficulty in assessing the full range of drug candidates’ effects on living cells. The former challenge is addressed by extending chemical space coverage, in part using synthetic pathways1,2 engineered by synthetic biology3-12 methods. The latter is partially solved with cell-based assays13 that allow evaluating drug action in a complex environment. Yet, these assays still generate candidate compounds that perform inadequately in vivo with respect to efficacy and toxicity14, in large part because many unwanted interactions15 pass undetected in vitro. Multiplex assays and serial testing16 have been proposed as a way to gauge off-target effects, yet increasing the number of measured parameters reduces assay throughput and makes it unsuitable for large library screens.

The weaknesses of cell-based assays are amplified with microRNAs (miRNA) as drug targets. miRNA activity can be enhanced using miRNA mimics17 and inhibited with complementary RNA analogs18 or genetic sponges19. The search for small molecule miRNA modulators20,21 has relied on qPCR22 and genetic reporter23,24 assays, to produce a few candidate compounds20-25. However, multiple miRNA molecules can be easily targeted by the same compound due to similarities in nucleotide sequence and phosphoribose backbone, shared maturation pathway, and common pri-miRNA precursors26; therefore, the likelihood of side-effects is high. Given that miRNAs are promising drug targets27,28 playing an important role in a large number of diseases29, there is a need for miRNA drug discovery tools that adequately address the high risk of non-specific effects with this target family.

We propose to address this challenge via rapid assessment of multiple off-target effects using intracellular genetic information-processing circuits30 to integrate these effects and “compress” them into a small number of fluorescent reporters. Consider a large genetic “AND gate” that responds to multiple intracellular inputs, none of them the intended drug target, which can nevertheless be affected by a drug candidate. The gate generates “true” logical output when all its inputs are in their default (ground) state, corresponding to the lack of interference between the drug and the inputs. The output of an AND gate changes when any one of its inputs changes, reflecting a deviation from the default state. Denoting a default state of input X (X=A, B, …) as Input X0, in the logical notation:

No off-target effects = (Input A0) AND (Input B0) AND... (1)

Off-target effect = NOT (No off-target effects)
= NOT ((Input A0) AND (Input B0) AND ...)
= NOT (Input A0) OR NOT (Input B0) OR ... (2)

Equation (2) means that a change in the AND gate output results from at least one off-target effect, without identifying it. Adding inputs to the gate will expand the range of sampled off-target effects while keeping a single output without the need for multiplexing. Consequently, the validity and information content of the screen increases dramatically without sacrificing throughput.
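Equations (1) and (2) are an instance of De Morgan's law, which can be checked by brute force over all input states. The sketch below is purely illustrative: the function names and the choice of four inputs are assumptions for the example and do not describe the actual circuit.

```python
from itertools import product


def no_off_target(inputs_in_default_state) -> bool:
    """Equation (1): true only if every input is in its default state."""
    return all(inputs_in_default_state)


def off_target(inputs_in_default_state) -> bool:
    """Equation (2), last line: true if at least one input left its default state."""
    return any(not state for state in inputs_in_default_state)


# Enumerate every combination of default/non-default states for N inputs.
N_INPUTS = 4  # arbitrary choice for illustration
all_states = list(product([True, False], repeat=N_INPUTS))

# De Morgan: NOT (A0 AND B0 AND ...) equals (NOT A0) OR (NOT B0) OR ...
assert all(off_target(s) == (not no_off_target(s)) for s in all_states)

# Only a single state, all inputs in default, keeps the gate output "true".
states_without_off_target = sum(no_off_target(s) for s in all_states)
print(states_without_off_target, "of", len(all_states))
```

Every additional input doubles the number of states that the single output distinguishes from the ground state, which is the sense in which the gate compresses many off-target readouts into one reporter.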

Here we describe a novel cell-based assay that utilizes a genetic information-processing circuit to integrate and compress multiple miRNA inputs into a small number of fluorescent reporters to distinguish between off-target and specific effects of candidate compounds. Using an iterative simulation-aided design process, we implement the concept for miR-122, a promising drug target in liver cancer31 and hepatitis C32. We validate the assay in HuH-7 cells using miRNA mimics and inhibitors, then further adapt it for automated screening and test a library of ~700 compounds. Finally, we reprogram and re-validate the assay to address additional miRNA drug targets and off-targets. Importantly, we show that compounds that would have been mistaken as specific hits with traditional methods are correctly identified as non-specific modulators. This study presents a precise yet high throughput approach for miRNA drug discovery. The general concept is applicable to additional target families with appropriate modifications in the sensing and processing components.

2.2 Materials and methods

2.2.1 Plasmid construction

Standard cloning techniques were used to construct plasmids. E. coli DH5α served as the cloning strain, cultured in LB Broth Miller Difco (BD) supplemented with appropriate antibiotics (ampicillin, 100 μg/mL; chloramphenicol, 25 μg/mL; kanamycin, 50 μg/mL). Enzymes were purchased from New England Biolabs (NEB). Phusion High-Fidelity DNA Polymerase (NEB) was used for PCR amplification. Oligonucleotides used as primers or for annealing were purchased from Microsynth, IDT or Sigma-Aldrich. Digestion products or PCR fragments were purified using GenElute Gel Extraction Kit or GenElute PCR Clean Up Kit (both Sigma-Aldrich). Ligations were performed using T4 DNA Ligase (NEB) at 16 °C for 1 h for sticky end overhangs or at 4 °C overnight for blunt end ligation, followed by transformation of chemically competent cells and plating on LB agar plates with appropriate antibiotics. Clones were screened by colony-PCR using Quick-Load Taq 2X Master Mix (NEB) or by restriction digest. Plasmids were sequenced by Microsynth. The detailed cloning procedure for each plasmid can be found in Supplementary Table 1, with primers listed in Supplementary Table 2 and gBlocks in Supplementary Table 3.

2.2.2 Cell culture and transfection

HuH-7 cells were received from the Health Science Research Resources bank of the Japan Health Sciences Foundation (Cat-# JCRB0403, Lot-# 07152011) and cultured at 37 °C, 5% CO2 in DMEM, low glucose, GlutaMAX (Life Technologies, Cat #21885-025), supplemented with 10% FBS (Sigma-Aldrich, Cat #F9665 or Life Technologies, Cat #10270106) and 1x penicillin/streptomycin solution (Sigma-Aldrich, Cat #P4333). Cells were passaged every 3-4 days using 0.25% trypsin-EDTA (Life Technologies, Cat # 25200-072). Transfections were performed using Lipofectamine 2000 transfection reagent (Life Technologies, Cat-# 11668-019) in uncoated 24-well plates (Thermo Scientific, Cat-# 142475), 96-well plates (Thermo Scientific, Cat-# 167008) or black μClear 96-well plates (Greiner bio-one, Cat-# 655090). Two transfection protocols were used, a seeding and a seed-transfection protocol. For the seeding protocol in e.g. 24-well plates, HuH-7 cells were seeded one day before transfection at a density of 65,000 cells/well in 500 μL complete medium. The medium was replaced before transfection with medium supplemented with doxycycline hyclate (Fluka, Cat # 44577) at a final concentration of 1 μg/mL if appropriate. For the seed-transfection protocol in e.g. 96-well plates, HuH-7 cells were placed at a density of 30,000 cells/well (double the number of cells compared to the seeding protocol) in 100 μL complete medium supplemented with doxycycline hyclate (Fluka, Cat # 44577) at a final concentration of 1 μg/mL right before transfection if appropriate (suspension transfection protocol from the manufacturer). All transfections were performed at 80 – 90% cell confluence on the day of transfection.

Plasmids were purified from 100 mL – 400 mL cultures of E. coli DH5α grown overnight at 37 °C, shaken at 200 rpm in LB Broth Miller Difco (BD) supplemented with appropriate antibiotic, using HiPure Plasmid Filter Maxi/Midi Kit (Invitrogen) for low and high copy plasmids or PureYield Plasmid Midiprep Kit (Promega) for high copy plasmids only. After plasmid purification an additional purification step was performed using Endotoxin Removal Kit (Norgen Biotek Corporation). DNA amounts were quantified using Nanodrop (ND-2000) and integrity was verified by agarose gel electrophoresis of the undigested plasmid. The purified plasmids were mixed according to Supplementary Table 9 – 42 and diluted with 50/25 μL Opti-MEM I Reduced Serum (Gibco, Life Technologies, Cat # 31985-962) per sample for 24/96-well plates, respectively. If needed, microRNA mimics, siRNAs and LNA inhibitors were added to the plasmid mix without adjusting the amount of Lipofectamine 2000 used. Mimics were purchased from Thermo Scientific (now GE Healthcare), siRNAs from Microsynth and LNA inhibitors from Exiqon (Supplementary Table 4 – 6). Lipofectamine 2000 was used at a Lipofectamine [μL]:DNA [μg] ratio of 2.5:1 and was mixed with 50/25 μL Opti-MEM for 24/96-well plates, respectively. After 5 minutes of incubation at room temperature, the diluted Lipofectamine was mixed with the diluted DNA sample. The mixture was incubated for 20 min at room temperature and added to the cells.
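The reagent volumes per sample follow directly from the stated 2.5:1 Lipofectamine [μL]:DNA [μg] ratio and the Opti-MEM volumes for the two plate formats. The following is a minimal bookkeeping sketch; the function name and the example DNA amount are hypothetical, since the actual plasmid amounts are listed in the Supplementary Tables and are not reproduced here.

```python
# Ratio and diluent volumes as stated in the text above.
LIPOFECTAMINE_UL_PER_UG_DNA = 2.5
OPTIMEM_UL_PER_TUBE = {24: 50.0, 96: 25.0}  # plate format -> µL Opti-MEM


def transfection_mix(total_dna_ug: float, plate_format: int) -> dict[str, float]:
    """Per-sample volumes for the DNA tube and the Lipofectamine tube."""
    if plate_format not in OPTIMEM_UL_PER_TUBE:
        raise ValueError("only 24- and 96-well formats are described")
    optimem_ul = OPTIMEM_UL_PER_TUBE[plate_format]
    return {
        "lipofectamine_ul": LIPOFECTAMINE_UL_PER_UG_DNA * total_dna_ug,
        "optimem_for_dna_ul": optimem_ul,
        "optimem_for_lipofectamine_ul": optimem_ul,
    }


# Hypothetical example: 0.5 µg plasmid DNA per well of a 24-well plate.
example_mix = transfection_mix(total_dna_ug=0.5, plate_format=24)
print(example_mix)
```

Note that, as stated above, mimics, siRNAs and LNA inhibitors are added without adjusting the Lipofectamine amount, so only the plasmid DNA mass enters the calculation.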

2.2.3 Small molecule screening

Small molecules were received from the NIH Clinical Collection program via Evotec. 727 compounds were shipped in 96-well plates, at 10 mM in 50 µL DMSO (Supplementary Table 8). Purity was guaranteed by the supplier. Robotic dilutions were first performed in DMSO to 1 mM and afterwards in complete medium to 30 µM. 50 µL of this compound-containing medium was added to the screening plates. All steps were performed on a Hamilton Microlab STAR Line robot with custom configuration and protocol. The circuit transfection was performed manually in bulk suspension as described above. In large screens with 30 plates in total, two separate transfected cell batches were prepared for the first and the second 15 plates, respectively, to avoid long waiting times. A batch of 150 mL transfected cells in growth medium was placed in a V-shaped container. Immediately afterwards, 100 µL aliquots containing suspended transfected cells were dispensed into compound-containing wells (96-well plates) by a custom robot program, resulting in a final concentration of 10 µM compound and 1% DMSO. Cells were maintained in uniform suspension by periodic pipetting of the mixture in the large container each time before aspirating for a new plate; shaking the container, on the other hand, resulted in low transfection efficiencies and cell death (Fig. 2.1). Cells were assayed after 48 h using microscopy as described below. Positive and negative controls were placed in rows 1 and 12. For each of these controls the transfection master mixes were prepared separately and added after all the sample wells were filled. The controls were untransfected cells (A1, A12), pure DMSO (A2 – A4, H2 – H4), 5 nM LNA-122 (A5, H5), 5 nM Mim-122 (A6, H6), 5 nM LNA-21 (A7, H7) and 5 nM Mim-146a (A8, H8).
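The final assay concentrations quoted above follow from the dilution steps, as the short check below illustrates. It is a sketch under the assumption that the cell suspension itself contains neither compound nor DMSO; the helper name is made up for this example.

```python
def after_adding(conc: float, vol_ul: float, added_vol_ul: float) -> float:
    """Concentration after adding a volume that contains none of the solute."""
    return conc * vol_ul / (vol_ul + added_vol_ul)


# Step 1 (from the text): 10 mM stock diluted in DMSO to 1 mM, i.e. 1000 µM.
intermediate_um = 1000.0

# Step 2 (from the text): diluted in complete medium to 30 µM.
medium_um = 30.0
# The DMSO fraction of that medium equals the dilution factor of step 2.
dmso_fraction_in_medium = medium_um / intermediate_um

# Step 3 (from the text): 50 µL compound medium plus 100 µL cell suspension.
final_um = after_adding(medium_um, vol_ul=50.0, added_vol_ul=100.0)
final_dmso_fraction = after_adding(dmso_fraction_in_medium, 50.0, 100.0)

print(f"final compound concentration: {final_um:.1f} µM")
print(f"final DMSO content: {final_dmso_fraction:.1%}")
```

Under this assumption the numbers reproduce the stated 10 µM compound and 1% DMSO in the assay well.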

Nine plates were covered exclusively with controls to test for edge, row and column effects and to calculate Z’-factors according to the guidelines by Sittampalam et al.³³. Plates 1-3 tested the screening properties of miR-122 downregulation (addition of LNA-122). DMSO served as the baseline expression (here the lowest mCherry signal), 0.1 nM LNA-122 as the intermediate and 5 nM LNA-122 as the high mCherry signal. Plates 4-6 were used to test miR-122 upregulation. In this case the DMSO baseline signal was the highest signal measured, 0.1 nM Mim-122 the intermediate and 5 nM Mim-122 the lowest one. Plates 7-9 tested the non-specific module output mCerulean for downregulation only, since the assay signal is expected to be close to the maximal expression in the ground state. Here DMSO served as the highest signal, 0.1 nM Mim-146a as the intermediate and 5 nM Mim-146a as the lowest one. The specific arrangement of the transfections for each well can be found in Supplementary Table 7. Data processing is described below.

Figure 2.1 Small molecule screening workflow. Schematic representation of the transfection process for the screening experiments. Steps performed by hand and by robot are shown in separate boxes and the different dilution steps are indicated. Transfection mixes for the control wells are prepared by hand as described for the sample wells.

2.2.4 Fluorescent microscopy

Cells were measured 48 h after transfection with an inverted fluorescence microscope (Nikon Eclipse Ti) using a fiber illuminator (Nikon Intensilight C-HGFI), optical filter sets (Semrock) and a digital camera system (Hamamatsu, ORCA R2). The filters are a combination of excitation and emission band pass filters combined with a dichroic filter for each individual fluorescent protein. We measured mCerulean, mCitrine, mCherry and iRFP with the filter sets CFP HC (HC 438/24, HC 483/32, BS 458), YFP HC (HC 500/24, HC 542/27, BS 520), TxRed HC (HC 624/40, HC 562/40, BS 593) and Cy5.5-A (HC 655/40, HC 716/40, BS 685), respectively. For the high throughput screening experiments a Plan Apo λ 2x objective was used and 2x1 frames were acquired to cover the majority of a well. For all other experiments a Plan Fluor 10x Ph1 DLL objective was used and 2x2 frames were acquired.

2.2.5 Flow cytometry

Samples were analyzed 48 hours after transfection using a BD LSR Fortessa cell analyzer. Medium was removed and cells were incubated with 150/50 μL phenol-red-free trypsin (0.5% Trypsin-EDTA (Gibco, Life Technologies, Cat # 15400-054) diluted 1:2 with PBS (Life Technologies, Cat # 10010-56)) for 5/15 min for 24/96-well plates, respectively. Detached cells were transferred to small FACS tubes (Life Systems Design, Cat # 02-1412-000) and kept on ice. The fluorophores were measured with a combination of excitation lasers and emission filters. For mCherry we used a 561 nm excitation laser, a 600 nm longpass filter and a 610/20 nm emission filter. For mCitrine we used a 488 nm laser, a 505 nm longpass filter and a 542/27 nm emission filter. For mCerulean we used a 445 nm laser and a 473/10 nm emission filter. For iRFP we used a 640 nm laser and a 780/60 nm emission filter. PMT voltages were checked and adjusted if needed using standard fluorescent beads (SPHERO™ Rainbow Calibration Particles, RCP-30-5A (8 peaks) and Alignflow™ Flow Cytometry Alignment Beads, A-16500, Life Technologies) before and after each measurement in order to ensure constant device performance.

2.2.6 Luciferase assays

Cells were harvested 48 h post transfection. Supernatant was removed, cells were washed with PBS, and 100 µL 1x passive lysis solution (Promega) was added (incubation at 37 °C for 15 min). The luciferase reaction was performed using the Promega dual luciferase assay kit. We mixed 20 µL of lysed cells with 50 µL of the respective reagent. Measurements were performed with a lag time of 2 s and a recording time of 10 s using a tube luminometer (Sirius, Berthold Detection Systems). In the case of sensor saturation, measurements were repeated with half the amount of lysis solution until quantifiable.

2.2.7 Flow cytometry data and image processing

All flow cytometry experiments were analyzed using FlowJo software. Compensation of crosstalk (< 1.9 %) of mCerulean into the 488-542/27 nm channel of mCitrine was performed if necessary using single color controls. The values shown in the various figures as absolute units (a.u.) or relative units (rel.u.) are calculated as follows (Fig. 2.2).

Figure 2.2 Flow cytometry gating strategy. Gating strategy applied to calculate absolute units of fluorescence. Here two colors are gated in the negative control (top row) and the gates are subsequently applied to the sample (bottom row). The gate in the negative control is placed such that the frequency of positive cells is <0.1 % but as close to this value as possible. In the sample, the frequency of positive cells is multiplied by the mean of the positive cells to give absolute units (a.u.).

(i) Live cells were gated using forward and side scatter.
(ii) Within this gate, fluorophore-positive gates are constructed using untransfected controls such that 99.9% of cells in this control sample fall outside of the selected gate.
(iii) For each positive cell population in a given channel, the mean value of the fluorescence intensity is calculated and multiplied by the frequency of the positive cells to give the absolute intensity (a.u.):

absolute intensity (a.u.) of fluorophore X = mean(fluorescence in fluorophore+ cells) × frequency(fluorophore+ cells)

For the relative intensities, the absolute intensity of the fluorophore of interest was divided by the absolute intensity of a constitutively expressed fluorescent protein that was co-transfected with the other plasmids:

relative intensity (rel.u.) of fluorophore X = a.u.(fluorophore X) / a.u.(ctrl. fluorophore)
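The two definitions can be illustrated with a minimal Python sketch. It is a simplification, not the FlowJo workflow used in the thesis: each channel is gated with a single one-dimensional threshold, and the thresholds and event values below are invented for illustration. Note that mean × frequency equals the summed intensity of the positive cells divided by the number of all live cells.

```python
from statistics import mean


def absolute_units(intensities, threshold):
    """a.u. = mean intensity of positive cells x frequency of positive cells.

    `intensities` holds one value per live-gated cell; `threshold` stands in for
    the gate derived from the untransfected control (hypothetical 1D gate).
    """
    positive = [x for x in intensities if x > threshold]
    if not positive:
        return 0.0
    frequency = len(positive) / len(intensities)
    return mean(positive) * frequency


def relative_units(reporter, control, reporter_threshold, control_threshold):
    """rel.u. = a.u. of the reporter / a.u. of the co-transfected control fluorophore."""
    return absolute_units(reporter, reporter_threshold) / absolute_units(
        control, control_threshold
    )


# Invented example events for four live cells.
reporter_events = [10, 200, 400, 5]
control_events = [50, 300, 300, 300]

reporter_au = absolute_units(reporter_events, threshold=100)
rel_u = relative_units(reporter_events, control_events, 100, 100)
print(reporter_au, round(rel_u, 4))
```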

The small molecule screening data of Figure 2.25 were processed as follows (Fig. 2.3): All microscopy images were exported as TIFF files using NIS-Elements Viewer’s “Export” function.

Figure 2.3 Image processing pipeline. Raw images were cropped and background corrected. A mask for mCitrine (transfection control) was generated and the intensities for all positive cells calculated. The mask was then applied to the other background corrected colors and their pixels were summed too and further normalized by the value of the corresponding mCitrine image.

All images were cropped by 460 pixels on both sides in order to remove well border artifacts, resulting in 1690 x 1024 px images. The background image for each plate in the mCerulean and mCitrine channels was calculated by averaging, pixel by pixel, the images from the non-transfected wells A1 and A12. We then subtracted (pixel-by-pixel) these averaged background images from all other images of the same plate. For mCherry, after cropping we first applied MATLAB’s msbackadj function (‘WindowSize’, 130) row-wise to all images, including those of the non-transfected wells A1 and A12, thus performing the first round of background correction. In the second step, the background image for each plate was calculated by averaging each pixel of the previously corrected mCherry snapshots from the non-transfected wells A1 and A12. Finally, we subtracted these averaged background images, pixel-by-pixel, from all other previously corrected mCherry images of the same plate, thus performing an additional round of background correction. Next, we created a “positive pixel mask” based on mCitrine: each pixel in the mCitrine images with an intensity >250 was defined as positive, all others as negative. Finally, we calculated the average intensity of the mCitrine-positive pixels in all three colors (mCitrine, mCerulean, mCherry). The MATLAB script used for this processing can be found in the “appendix”, in section “MATLAB scripts”.
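The cropping, background subtraction and masking steps can be sketched as follows. This is a pure-Python illustration on tiny invented images, not the MATLAB script from the appendix. The row-wise msbackadj pre-correction of the mCherry channel is omitted, the same blank images are reused for both channels for brevity, and all function names are hypothetical; only the mask threshold of 250 is taken from the text.

```python
MASK_THRESHOLD = 250  # mCitrine intensity cut-off stated in the text


def crop_columns(image, margin):
    """Drop `margin` pixels from the left and right edge of every row."""
    return [row[margin:len(row) - margin] for row in image]


def average_images(images):
    """Pixel-by-pixel mean of equally sized images."""
    n = len(images)
    return [[sum(pixels) / n for pixels in zip(*rows)] for rows in zip(*images)]


def subtract(image, background):
    """Pixel-by-pixel difference image - background."""
    return [[p - b for p, b in zip(row, bg_row)] for row, bg_row in zip(image, background)]


def positive_mask(mcitrine, threshold=MASK_THRESHOLD):
    """True where the corrected mCitrine pixel exceeds the threshold."""
    return [[p > threshold for p in row] for row in mcitrine]


def masked_mean(image, mask):
    """Average intensity over the pixels flagged in `mask` (0.0 if none)."""
    values = [p for row, m_row in zip(image, mask) for p, keep in zip(row, m_row) if keep]
    return sum(values) / len(values) if values else 0.0


def correct_well(raw, blanks, margin):
    """Crop, then subtract the averaged background of the untransfected wells."""
    background = average_images([crop_columns(b, margin) for b in blanks])
    return subtract(crop_columns(raw, margin), background)


# Invented 2 x 4 px example; the thesis uses a 460 px margin on much larger images.
blanks = [[[10] * 4, [10] * 4], [[20] * 4, [20] * 4]]  # stand-ins for wells A1 and A12
mcitrine = correct_well([[0, 515, 15, 0], [0, 315, 115, 0]], blanks, margin=1)
mcherry = correct_well([[0, 215, 15, 0], [0, 115, 999, 0]], blanks, margin=1)

mask = positive_mask(mcitrine)
mcitrine_mean = masked_mean(mcitrine, mask)
mcherry_mean = masked_mean(mcherry, mask)
print(mcitrine_mean, mcherry_mean, mcherry_mean / mcitrine_mean)
```

Dividing the masked mCherry (or mCerulean) mean by the masked mCitrine mean gives the per-well normalized readout that the statistical analysis works with.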

2.2.8 Statistical analysis

Ten compound storage plates were used in the screen, with about 80 compounds in each plate. Each plate was "replicated" three times for the screening assay, so that each compound was assayed in triplicate on 3 different plates. Triplicate assay plates belonging to separate storage plates were analyzed separately. For mCitrine, absolute readouts were compared, while both mCerulean and mCherry readouts were internally normalized to mCitrine values in each well. The set of measurements made with all compounds in a given storage plate (about 240 values in total for each readout) was used as a reference distribution for individual compounds from this plate for the purpose of exclusion and hit identification³⁴. Specifically, the triplicate measurement of each compound was compared against its respective reference distribution using a two-sided t-test. Compounds that generated non-specific mCitrine and normalized mCerulean readouts that differed from the reference with a p-value of 0.1 or less were considered potential non-specific modulators and were thus excluded from the analysis. Compounds that generated normalized mCherry readouts that differed from the reference distribution with a p-value of 0.01 or less were classified as hits (Fig. 2.25; the MATLAB script used for this analysis can be found in the “appendix”, in section “MATLAB scripts”). In order to test whether the data used in the t-test are normally distributed, we built histograms of the readouts and fitted them to a normal distribution using the MATLAB function histfit (Supplementary Figures 1 and 2). Z’-factors for different perturbations were calculated with the following formula:

Z’ = 1 − 3 × (σ(p) + σ(n)) / |µ(p) − µ(n)|

where σ(p) and µ(p) are the standard deviation and mean of that perturbation, respectively, and σ(n) and µ(n) those of the control. For perturbations including LNAs, the control is measured with scrambled LNAs, and in perturbations involving a miRNA mimic, the control is a scrambled miRNA mimic.
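For reference, the Z’-factor can be computed in a few lines of Python, taking three times the sum of both standard deviations divided by the absolute difference of the two means. The sketch assumes the sample standard deviation (the text does not state whether the sample or population form was used), and the readout values are invented.

```python
from statistics import mean, stdev


def z_prime(perturbed, control):
    """Z' = 1 - 3 * (sd_p + sd_n) / |mean_p - mean_n|, using sample standard deviations."""
    separation = abs(mean(perturbed) - mean(control))
    return 1 - 3 * (stdev(perturbed) + stdev(control)) / separation


# Invented normalized readouts for one perturbation and its scrambled control.
perturbed_wells = [9.0, 10.0, 11.0]
control_wells = [99.0, 100.0, 101.0]

z = z_prime(perturbed_wells, control_wells)
print(round(z, 4))
```

Values close to 1 indicate well-separated, tight distributions; the value falls as the spread of either group grows or as the means approach each other.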

2.2.9 Modeling

The model was built and simulations were performed using MATLAB and the SimBiology toolbox. All simulations were performed with the “sundials” solver, with 10⁻⁶ absolute and 0.001 relative tolerance. The endpoint of the simulation was chosen at 50 hours. Parameter scans were performed with MATLAB code executing the SimBiology model in a loop using the sbiosimulate command with different parameter values. Endpoint values were used for analysis (cf. Section “2.8 Mechanistic model of pilot circuit and alternative designs”).

2.3 Basic concepts

The assay circuit consists of three genetic modules reflecting distinct drug effects: global effects on gene expression (“Gene expression module”), systemic effects on the RNAi pathway or off-target miRNAs (“Non-specific RNAi module”), and the desired effect on the intended miRNA drug target (“Specific module”) (Fig. 2.4a). The gene expression module reports on cell-wide changes in gene expression and cell viability via a constitutively expressed fluorescent protein. The same protein also serves as a transfection normalization control (see below). The non-specific RNAi module implements a genetic AND gate integrating multiple molecular inputs, potentially affected by off-target effects, into a single output reflected in the intensity of a fluorescent readout. To do so, we first place fully complementary, tandem binding sites for a miRNA input in the 3’-UTR of an mRNA coding for this output, to implement a “NOT (Input)” logic³. A second miRNA input knocks down a repressor of that same output via fully complementary, tandem binding sites in the repressor’s 3’-UTR, inverting miRNA inhibitory activity. This results in a logic gate that facilitates high output expression when the second input is highly expressed (high input) and the first input is expressed at low levels (low input): “high input AND NOT (low input)” (Fig. 2.4b, left). Additional high and low miRNA inputs can be added to scale up the gate (Fig. 2.4b, right). This gate structure suits our purpose because off-target miRNA downregulation will reduce high inputs, and off-target miRNA upregulation will elevate low inputs, with either scenario altering the output. Previous work⁸ showed how to improve the inversion of high inputs with additional genetic elements (Fig. 2.4c, left). We note that the module contains multiple gene expression components and might therefore also be sensitive to global changes in gene expression.

The output of the AND gate can be a fluorescent reporter; however, following our previously shown strategy for robust integration of multiple modules using a shared transactivator “knot” component³⁵, we use a transactivator as the immediate AND gate output. This transactivator controls two fluorescent proteins. The first mirrors transactivator expression and serves as the non-specific assay readout; the second protein is additionally furnished with the binding sequence of the intended drug target miRNA in its 3’-UTR. Increased miRNA activity leads to a reduction in this reporter level and vice versa (Fig. 2.4c, bottom), relative to the non-specific readout. This reporter’s expression (normalized to the non-specific readout) constitutes the specific assay readout.
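The Boolean abstraction of the non-specific module can be written out as a short Python sketch. It idealizes miRNA activity as 0 or 1 and ignores all quantitative behavior of the genetic implementation; the function name is hypothetical.

```python
from itertools import product


def and_gate(high_inputs, low_inputs):
    """Output is on only if every high input is active and no low input is active."""
    return all(high_inputs) and not any(low_inputs)


# Truth table of the two-input gate "high input AND NOT (low input)".
for high, low in product([0, 1], repeat=2):
    print(high, low, int(and_gate([high], [low])))

# Scaled-up gate with two high and two low inputs.
ground_state = and_gate([1, 1], [0, 0])       # unperturbed cell: output on
high_input_lost = and_gate([1, 0], [0, 0])    # off-target downregulation of a high input
low_input_gained = and_gate([1, 1], [0, 1])   # off-target upregulation of a low input
print(ground_state, high_input_lost, low_input_gained)
```

In this idealized picture any single off-target change switches the output off, which is why one readout can report on several off-target miRNAs at once.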

Figure 2.4 Basic concepts and assay designs. (a) High-level representation of the screening assay modules. Module names, on- and off-target miRNA molecules and readouts are indicated. (b) Left: Schematic representation of the possible input combinations and outcomes for a two miRNA input AND gate. The values of 1 and 0 correspond to high and low miRNA expression, respectively. Note that the output is highly expressed only when the high input is highly active and the low input is inactive. Squiggly lines are microRNAs, blunt arrows denote repression. Grey circles denote repressors. Pale hue indicates low expression or activity. The logic gate that determines output activity is shown on top. Right: Logic gate and corresponding input combination in the scaled-up four-input AND gate that results in high output expression. (c) Coarse-grained and detailed assay diagrams with four inputs for the non-specific module and one input for the specific module. The genetic implementation for each of the building blocks is shown in the respective zoomed-in frames, with pointed arrows indicating activation, blunt arrows indicating repression and component names shown. TA, transactivator; CMV, cytomegalovirus immediate-early promoter; TRE, TetR responsive element; LacI, Lac repressor; rtTA, reverse Tet transactivator; CAGop, CAG promoter followed by an intron with two LacO sites; FF4, synthetic miRNA.

2.4 Validation strategy

We established a set of positive and negative controls to validate the assay modules. Ideally, controls should be chemical counterparts of candidate compounds³³. We sought small molecule compounds with proven anti-miR-122 activity, as well as those targeting multiple miRNAs or the RNAi pathway. Due to the late emergence of miRNAs as drug targets, controls were difficult to identify (see below), and we sought alternatives as suggested by good practice³³. Based on prior reports²⁰,²³,²⁵, we chose miRNA mimics and LNA-based miRNA inhibitors (referred to as LNAs) to respectively increase and decrease miRNA activity in a predictable manner. Perturbing individual miRNA inputs with mimics and LNAs emulates individual drug-miRNA interactions, while perturbing multiple inputs simultaneously emulates systemic alteration of miRNA processing pathways. We designed 15 different assay perturbations comprising subsets of mimics and LNAs that span a range of possible off-target and on-target effects (Fig. 2.5), and used these perturbations to calculate Z’-factors³⁴ for assay performance evaluation. Each perturbation is associated with its own Z’-factor, providing fine insight into assay behavior under different conditions.

Figure 2.5 Perturbation combinations for circuit validation. Validation perturbations table with 15 combinations. The ground state (column 0) describes the expression levels of miRNAs in the assay cell line furnished with the assay circuit, without any perturbation. 1 stands for high and 0 for low expression, respectively. Drug target miRNA expression is shown as 0.5, indicating the intermediate activity level required to detect both up- and down-regulation with drug candidates. All other columns are different perturbations. -1 corresponds to miRNA inhibition, +1 is miRNA activation/overexpression relative to the ground state, and 0 indicates no change. The anticipated expression of the non-specific and specific readouts following a perturbation is shown in the two bottom rows, with 1, 0 and 2 representing respectively the ground state, reduced, and elevated readout.

2.5 Experimental system

2.5.1 Cell line and miRNA inputs

As stated above, for the proof-of-concept we used miR-122 as the target. The assay was tested in HuH-7 cells, a liver cancer cell line used as a model of liver tumor³⁶, Hepatitis C Virus (HCV) infection³⁷, and miR-122 drug discovery³⁸. Studies showed that miR-122 has an effect on both liver cancer³¹,³⁹ and HCV replication³², making it a promising drug target⁴⁰.

In order to determine non-specific assay inputs, we looked for miRNAs expressed at either high or low levels as required by our approach (Fig. 2.4c). We measured the activities of 21 miRNAs⁴¹ with bidirectional reporters (Fig. 2.6), and retained miR-145, -141, -375 and -146a as non-specific low inputs, and miR-21 and -20a as high inputs.

Figure 2.6 Activity profile of miRNAs in HuH-7 cells. 21 endogenous miRNAs were tested in HuH-7 cells. High and low fluorescence indicate respectively low and high miRNA activity. Inset: Schematics of the bidirectional miRNA reporter assay used for the activity assay. All bars shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Table 9.

2.5.2 Small molecule and RNA based positive controls

Initially we planned to validate the assay with both RNA-based and chemical or genetic agents known to affect miR-122 or the RNAi pathway. We identified a number of relevant small molecules²⁵ and genetic modulators⁴²,⁴³ and tested them as potential positive controls for our assay using bidirectional miRNA reporters. Using the previously reported non-specific modulators Enoxacin²³ and polylysine (PLL)⁴⁴, we could not observe the desired activity changes, which could be due to the idiosyncrasies of the HuH-7 cell line (Fig. 2.7a). For the compound NSC158959, previously reported to inhibit miR-122²⁵, we did not observe any change either (Fig. 2.7a). On the other hand, for the compound NSC308847²⁵, we found 2-fold repression of the miR-122 bidirectional reporter (Fig. 2.7a). In order to validate these findings, we replaced the fluorescent proteins with luciferases to more closely model an earlier experimental set-up²⁵. We retested the compounds with a Renilla luciferase-based miR-122 activity reporter normalized to Firefly luciferase activity (Fig. 2.7b) and reproduced the effect for NSC308847²⁵. The miR-122 sensor was knocked down, while the sensor with the scrambled binding site was only slightly reduced. Quantitatively, the effect was stronger with luciferase reporters. We hypothesized that this was due to the shorter half-lives of the luciferase proteins and constructed mCherry-PEST⁴⁵ and Ubiquitin x4-mCherry-PEST⁴⁶ reporters in order to have fluorescent reporter versions with half-lives comparable to those of luciferases.


Figure 2.7 miR-122 modulator tests. (a) Effects of small molecules (10 µM) on different fluorescent protein based miRNA reporters in HuH-7 cells. Maximal expected expression level is represented by negative control binding sites TFF5. x-axis labels are miRNA names whose fully complementary binding sites are placed as four repeats in the 3’-UTR of the fluorescent reporter (See Fig. 2.6 for reporter schematics). (b) Effects of specific miR-122 modulators on luciferase-based reporters. miR-122 reporters (left) are compared with controls (right) to account for non-specific effects. Different shades of blue represent different small molecule modulators as indicated. All bars shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Tables 10 and 11. PLL, Polylysine.

Figure 2.8 Test of NSC308847 with destabilized fluorescent protein reporters. (a) Flow cytometry scatter plot of the bidirectional miRNA activity reporter (See Fig. 2.6 for general scheme), expressing wild-type mCerulean as an internal control, and Ubiquitin x4-mCherry-PEST-T122x4 as a destabilized miR-122 reporter. (b) The effect of NSC308847 on PEST-tagged mCherry with miR-122 binding sites (right) and scrambled binding sites (left), measured with a bidirectional reporter expressing wild-type mCerulean as an internal control. (c) Flow cytometry scatter plot showing the effect of 10 μM NSC308847 on HuH-7 cells transfected with the PEST-destabilized reporter corresponding to data in panel b (right) to illustrate the effect on expression and cell health. The normalized ratio may increase due to a decrease in the internal control expression and not due to an increase in miR-122 reporter level. All bars shown are mean ± SD of biological triplicates; scatter plots are representative single measurements of biological triplicates. Transfection setup for this figure is given in Supplementary Tables 12 – 14. Ubx4, four repeats of the Ubiquitin fusion tag for fast protein degradation; PEST, proline, glutamic acid, serine and threonine rich tag for faster protein degradation.

For Ubiquitin x4-mCherry-PEST with T122 binding sites we were no longer able to measure mCherry fluorescence (Fig. 2.8a), and we therefore repeated the NSC308847 measurements with the mCherry-PEST reporter. However, we were not able to recreate the effects we observed with luciferase. On the contrary, we found de-repression of the reporter at 10 μM concentration (Fig. 2.8b). Yet, thorough analysis of the scatter plots indicates toxic effects at higher drug concentrations generating this apparent de-repression (Fig. 2.8c). Contrary to these findings, the results of the fluorescent and the luciferase assays for NSC158959 did not show consistent behavior: in the luciferase assay, both the targeted protein with miR-122 binding sites and a scrambled control were elevated (Fig. 2.7b), while the earlier publication²⁵ showed a specific miR-122 effect. We therefore hypothesized that NSC158959 could be targeting one of the luciferases directly or interfering with the luciferase-catalyzed reaction. We tested this hypothesis by measuring firefly and Renilla luciferases devoid of any miRNA binding site separately and adding the compound to the samples at different times: (i) 4 h post transfection, which is a typical time point for adding a drug; (ii) directly to cell culture medium 48 h after cultivation and right before cell lysis; and (iii) directly to the cell lysate immediately prior to photon counting (Fig. 2.9a).

Figure 2.9 Small molecule effects on untargeted Renilla or firefly luciferases. (a) Flow diagram of the experimental design to interrogate the compounds’ effects (10 µM) on luciferase genes without miRNA binding sites. The time points of small molecule addition in the different cases are indicated in red. (b) The chemicals (10 µM) are added at different time points to luciferase expressing HuH-7 cells. Neither luciferase has T122 binding sites. The reference sample was not treated with any chemical. All bars shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Table 15.

The Renilla luciferase showed a stable expression pattern for both compounds; only NSC308847 had a minor effect after 48 h, which could be due to modest cytotoxicity (see above). Firefly luciferase showed strong reduction for NSC158959 and stable expression for NSC308847; in particular, direct addition of NSC158959 reduced the photon count strongly. Because this effect was much weaker when the compound was added to media after 48 h (and thus is not carried over to cell lysate), we concluded that NSC158959 might have been directly interacting with the luciferase (Fig. 2.9b).

Since NSC158959 and NSC308847 did not meet the expectations for assay validation, we tested NSC5476²⁵. In order to use a similar set-up as in published experiments²⁵, we tested it with bidirectional luciferase reporters. We observed the de-repressive effect of NSC5476 on miR-122 reporters but, similar to NSC158959, we also observed this effect on the control reporter with miR-FF5 binding sites (Fig. 2.10a). We therefore repeated the measurements with individual untargeted luciferases and added the compound at different time points. In contrast to the NSC158959 case, this time the Renilla luciferase activity was increased when the chemical was incubated with the cells for 48 h (addition 0 h post transfection) (Fig. 2.10a), suggesting that multiple effects add to its efficacy (Fig. 2.9b).

Figure 2.10 Test of NSC5476 with luciferase assay. (a) Effect of NSC5476 on the luciferase bidirectional reporter. * The luciferase activity of the TFF5 binding sites negative control sample treated with 10 µM could not be quantified because of oversaturation of the photon counter for all pipettable lysate volumes. (b) Effect of NSC5476 (10 µM) on Renilla and Firefly luciferases. NSC5476 is added at different time points post transfection or directly to cell lysate, as indicated. ** Luciferase activity calculated from half the lysate due to saturation of the sensor with standard lysate amounts. The procedure is schematically depicted in the flow diagram in Fig. 2.9a. The reference sample was not treated with any chemical. All bars shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Tables 16 and 17.

In addition to small-molecule effectors we sought genetic or RNAi based modulators of miR-122. Literature reports showed that Exportin-5, the protein responsible for nuclear pre-miRNA export, increases miRNA efficacy for let-7a in HEK293 cells⁴² and that targeting of Drosha⁴⁷, DGCR8⁴³,⁴⁷, Dicer⁴⁷,⁴⁸ and TRBP2⁴⁷ by siRNA or shRNA leads to a reduction of certain mature miRNAs. We assayed the effects of Exportin-5, siDicer0 and anti-DGCR8-shRNA on four different candidate miRNAs using their respective bidirectional reporters (Fig. 2.11a): the highly expressed miRNA miR-122, the modestly expressed miRNA let-7b, miR-146a expressed at very low levels, and the negative control reporter with a scrambled miRNA binding site. We did not observe a clear effect. For the other siRNAs targeting the miRNA pathway we first searched the literature for published deep-sequencing or microarray data and found a dataset for MCF-7 cells⁴⁹. We sorted this dataset for the highest differential effects and compared it with the expression strengths we observed with reporters in HuH-7. We identified a set of five miRNAs that met both criteria, namely miR-18a, miR-7, let-7b, miR-16 and miR-17, and added to this set the miRNA inputs for the non-specific RNAi module as well as miR-122. We found three miRNAs that were responsive to some of the siRNAs, namely miR-18a, let-7b and miR-17, with effects in the range of 2-fold. Unfortunately miR-122 was not among them, ruling out these non-specific modulators for validation (Fig. 2.11b).

Figure 2.11 Testing of miRNA maturation pathway protein modulators. (a) Effects of Exportin-5 overexpression, anti-DGCR8-miRNA and siDicer0 on a subset of candidate reporters. (b) The effects of additional siRNAs⁴⁷ on miRNA reporters (See Figure 2.6 for the experimental set-up). miRNAs selected as pilot circuit inputs were tested together with candidates identified based on deep sequencing data⁴⁹. The reporters were co-transfected with 20 nM siRNA for individual siRNA testing, and with 5 nM of each siRNA for simultaneous siRNA delivery. All bars shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Tables 18 and 19. Exp-5, Exportin-5 protein; DGCR8, DiGeorge syndrome critical region 8 protein; TRBP2, RISC-loading complex subunit; siDicer0, siRNA targeting Dicer protein; siNegCtrl, siRNA with a sequence not known to bind to any endogenous mRNA; siDrosha/DGCR8/Dicer/TRBP2, siRNAs targeting the miRNA maturation pathway proteins Drosha, DGCR8, Dicer and TRBP2, respectively.

Next we tried synthetic miRNA mimics and LNA-based miRNA inhibitors²⁰,²⁵. All mimics showed good performance, with miR-146a and miR-141 mimics slightly outperforming miR-145 and miR-375 (Fig. 2.12a). Therefore we chose miR-141 and miR-146a as the low inputs for the non-specific RNAi module. Likewise, LNAs against miR-21 and miR-20a resulted in almost complete inhibition of their cognate miRNAs (Fig. 2.12b). To confirm that the mimics and the LNAs can be used together to simulate different non-specific effects, we also measured their mutual orthogonality, whereby each mimic or LNA was tested with all the reporters including their cognate ones. Our data show that there is no significant crosstalk within this set (Fig. 2.12c, d).

Figure 2.12 Mimics and LNA testing with selected fluorescent reporters. (a, b) Testing of miRNA mimics/LNAs (5 nM) with bidirectional reporters (See Figure 2.6 for the experimental set-up). The TFF5 reporter serves as control, representing the unrepressed reporter level. miRNA binding sites of the reporters are shown on the x-axis. (c) Orthogonality tests of mimics (5 nM) on bidirectional reporters in cell lines where the respective miRNA is not expressed, for background-free functional characterization. We used HEK293 cells for all mimics except for Mim-20a, which was tested in Drosophila S2 cells. The color code of the heat map represents the expression of the respective reporter. miRNA mimics and reporters are indicated. (d) Orthogonality tests with LNAs (5 nM) on bidirectional reporters in HuH-7 cells. All bars shown are mean ± SD of biological triplicates. Heat map data is based on the mean of a biological triplicate. Transfection setup for this figure is given in Supplementary Tables 20 – 23. Mim, mimics (modified RNA sequences coding for the respective miRNA); LNA, locked nucleic acid.

2.6 Assembly and testing of pilot assay

In order to build a full-scale pilot circuit (cf. Fig. 2.4c, Fig. 2.16d), we optimized individual circuit building blocks for function and dynamic range under a subset of validation perturbations. We first addressed the sensing of highly expressed HuH-7 miRNA markers. To this end, we established an optimization assay focusing on the interaction between the miRNA and the rtTA transactivator component of the high marker sensor. We cloned sequences complementary to the previously identified candidates into the 3’-UTR of the rtTA activator, and used it to drive a fluorescent reporter, mCerulean. As a control we used rtTA with scrambled miRNA binding sites (TFF5). We kept the reporter constant and varied rtTA levels in transient transfections, measuring mCerulean. We found that in HuH-7 cells, miR-21 and miR-20a sensors cause rtTA knockdown as evidenced by substantially lower mCerulean levels, while miR-130a and combined miR-130a-miR-20a sensors cause only minor reductions (Fig. 2.13). The latter observations contradict the predictions from reporter data (Fig. 2.6) and confirm the assertion that not only the sensor sequence but also the entire mRNA transcript context determines knockdown efficiency⁵⁰.

Figure 2.13 High input sensor tests with varying doses of rtTA. Dose-response of pTRE-driven fluorescent reporter to varying amounts of rtTA furnished with different miRNA binding sites, as indicated. FF5 is a scrambled miRNA binding site and it results in the strongest possible expression. Each titration is fitted to an exponential model with three parameters. All points shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Table 24. Ef1a, promoter of the human elongation factor-1 alpha.

We planned to set up an interface between the specific and non-specific RNAi modules using a transcriptional transactivator35. The idea was to place this activator as the immediate output of the non-specific module and use it to control two fluorescent reporters via a bidirectional promoter. One of them would be correlated with the non-specific RNAi module activity and thus represent the non-specific readout, while the second would be additionally furnished with the binding site sequence for miR-122. The ratio between the latter and the former reporter would indicate a specific effect on miR-122 regardless of non-specific interactions, and help isolate this specific effect. Since rtTA is used in the high-level miRNA sensors, we tested other known engineered transactivators that function in mammalian cells, namely PIT251 and ET52. We constructed bidirectional promoters for these activators driving mCerulean and mCherry (Fig. 2.14a) and compared their dose response with that of the well-characterized bidirectional pTRE promoter (Fig. 2.14c). Protein coexpression was found to be comparable to pTRE (Fig. 2.14b), while absolute expression was lower. The PIT2 dose-response was more gradual than that of ET.


Figure 2.14 Characterization and optimization of PRE/ERE bidirectional reporters. (a) Schematic representation of the system used to characterize the bidirectional reporters. (b) Representative flow cytometry scatter plots for the different bidirectional reporters at 50 ng of transactivator as shown in (c). (c) Dose-response of newly constructed bidirectional reporters to varying amounts of their cognate activators, as indicated in panel (a). rtTA dose-response is shown for comparison. Each titration is fitted to an exponential model with three parameters. All points shown are mean ± SD of biological triplicates; scatter plots are representative single measurements of biological triplicates. Transfection setup for this figure is given in Supplementary Tables 25 and 26. PRE, PIT2 response element; ERE, ET response element; PIT2, Streptogramin-responsive transactivator (Pristinamycin-induced protein (Pip) fused to p65); ET, ET-dependent transactivator (MphR(A) fused to VP16).

Next, we cloned these transactivators, linked to an mCitrine reporter via a 2A peptide53, as output genes of the non-specific RNAi module. This involved placing them under the CAGop promoter and furnishing them with the sensor sequences for miRNAs expressed at low levels in HuH-7 cells, miR-141 and miR-146a. We then measured how their regulation via low-level and high-level sensors translates into the levels of the downstream bidirectional reporters mCerulean and mCherry, with mCherry furnished with a miR-122 binding site in its 3’-UTR to reflect specific knockdown. We cotransfected the transactivator and the bidirectional reporter constructs at optimal concentrations and applied miR-141, miR-146a and miR-FF4 mimics to assess the direct knockdown of the activator using mCitrine as a proxy, and the indirect effect on activator-induced reporters using mCerulean. To investigate context dependency for miRNA binding sites, we also tested two sequence arrangements to achieve optimal knockdown. We found once again that the context matters: in cases where the TFF4 binding site is adjacent to the stop codon, its cognate siRNA is very potent, but binding sites following FF4 are either not affected at all by their mimics in the case of mCitrine-2A-PIT2 or affected weakly with ET-2A-Citrine (Fig. 2.15). By swapping the binding site positions and placing TFF4 downstream from the other binding sites, we strongly increase the knockdown by the other two mimics while maintaining relatively strong knockdown with siFF4. When comparing the two transactivators we observe strong direct knockdown of PIT2 as well as a strong reduction in the PIT2-controlled reporter, while with ET the effect does not propagate downstream. We explain this behavior by hypersensitivity of pERE to ET, leading to strong induction of the promoter even with only very low amounts of transactivator (Fig. 2.14c), and thus use only PIT2 in subsequent experiments.

Figure 2.15 Low sensors optimization. Comparison of knockdown efficiency of miRNA mimics (5 nM) on two different transactivators with different positioning of the respective binding sites, as indicated. Yellow bars represent the direct readout of the targeted protein (mCitrine). Blue bars indicate the expression of a fluorescent reporter controlled by its transactivator via a bidirectional promoter. All bars shown are mean ± SD of biological triplicates. Transfection setup for this figure is given in Supplementary Table 27. 2A, Self-cleaving peptide; iRFP, near-infrared red fluorescent protein.

Having done initial optimizations, we assembled the complete sensors for highly expressed markers using optimal amounts of rtTA constructs augmented with appropriate functional miRNA binding sites (Fig. 2.16a). We reintroduced the downstream pTRE-LacI-Tx-^miR-FF4^ repressor constructs and measured sensor performance with varying amounts of these cassettes for the miR-21 and miR-20a sensors individually, in order to maximize sensor dynamic range and therefore sensitivity to changes in highly expressed miRNAs. We found that when not targeted by miRNAs, the repressor construct is very potent, and even small amounts fully repress mCitrine and mCerulean expression, giving a tight Off-state. The On-state obtained with miRNA-targeted sensors also decreases with increasing repressor amount (Fig. 2.16b, c). To maintain a high On-state for the purpose of robust assay readouts, we decided to use a 1:1:1 plasmid ratio even though it does not result in the highest dynamic range.

The optimized components were assembled in a circuit dubbed “Pilot assay” (Fig. 2.16d). The layout follows the structure in Figure 2.4c, with the addition of an auxiliary fluorescent reporter mCitrine coupled to the PIT2 transactivator via a 2A linker for characterization purposes. The circuit senses five different miRNA inputs, of which four feed into the non-specific RNAi module. Initially we tested a subset of perturbations by targeting each non-specific input individually. We found that the assay performance was satisfactory when judged by the changes in mCitrine/PIT2 (Fig. 2.16e), but it deteriorated at the non-specific readout level for both high and low input modulation, leading to Z’-factors <0.5, reflecting merely “acceptable” assay performance (Fig. 2.16f). We concluded that the Pilot assay was not robust enough and considered alternative topologies.
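The Z’-factor thresholds quoted here can be made concrete with a short sketch. It assumes the commonly used definition Z’ = 1 − 3(σ_ref + σ_pert)/|μ_ref − μ_pert| computed from replicate readouts of the ground state and of a perturbed state, with sample standard deviations. The function names and the example triplicates are hypothetical and are not taken from the measured data.

```python
from statistics import mean, stdev

def z_prime(reference, perturbed):
    """Z'-factor from two sets of replicate readouts (sample SD is an assumption)."""
    separation = abs(mean(reference) - mean(perturbed))
    if separation == 0:
        raise ValueError("means coincide; Z' is undefined")
    return 1.0 - 3.0 * (stdev(reference) + stdev(perturbed)) / separation

def rate_assay(z):
    """Map a Z'-factor onto the quality labels used in the text."""
    if z > 0.5:
        return "excellent"
    if z > 0:
        return "acceptable"
    return "not suitable for screening"

# Hypothetical triplicates in arbitrary fluorescence units, for illustration only.
ground_state = [100.0, 104.0, 96.0]
perturbed_state = [20.0, 22.0, 18.0]
z = z_prime(ground_state, perturbed_state)
print(round(z, 3), rate_assay(z))
```

With these made-up numbers the separation between the means is large relative to the replicate scatter, so the perturbation would be rated "excellent"; widening the scatter or shrinking the separation moves the same perturbation into the "acceptable" band.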

Figure 2.16 Optimization of assembled circuit and testing of pilot circuit. (a) Schematic representation of the high input optimization set-up. (b, c) Dose-response of the immediate AND gate reporter mCitrine (b) and non-specific assay readout mCerulean (c) measured separately for miR-21 and miR-20a sensors. Sensor genes furnished with scrambled FF5 binding sites generate baseline "Off" response. Each titration is fitted to an exponential model with three parameters. (d) Schematics of the pilot circuit. miR-21 and miR-20a are the high inputs, miR-146a and miR-141 are the low inputs. mCitrine is a 2A-peptide linked reporter to read the immediate AND gate output. miR-122 is the drug target and iRFP reports on gene expression fluctuations. (e, f) Expression levels of the immediate AND gate reporter mCitrine (e) and non-specific assay readout mCerulean (f) following perturbations of individual non-specific inputs (see Figure 2.5, Perturbations 2, 3, 11, 12, 5 nM of LNA/Mimic used). The bar charts and points show mean ± SD for biological triplicates. Transfection setup for this figure is given in Supplementary Tables 28 and 29.

2.7 Alternative assay design

The first alternative is the “Parallel assay”, where the specific and non-specific modules are not connected by the PIT2 "knot" transactivator. Non-specific effects are reflected in a fluorescent protein, ZsYellow, replacing the PIT2 knot. The specific module is a PIT2-induced bidirectional reporter of miR-122 activity, driving mCherry furnished with miR-122 binding sites and a reference mCerulean. The latter also serves as a global gene expression readout and transfection control, similar to iRFP in the pilot assay. The two other architectures extend the Pilot assay circuit with additional feed-forward loops (FFL)54 that proved their utility in high input sensors8. We implemented the motifs by augmenting the non-specific readout mRNA with binding sites for miRNAs that bind to the PIT2 knot. In an implementation called “low input feed-forward (LFF) assay”, only the low inputs miR-146a and -141 target the readout; in the “complete feed-forward (CFF) assay”, these are miR-146a, -141 and -FF4 (Fig. 2.17).

Figure 2.17 Schematics of alternative assay designs. Schematic representation of different assay topologies. The sensor genes for two high inputs are common to all designs. The components specific to individual layouts are shown in separate panels. mCherry serves as the specific module readout in all layouts. ZsYellow is the non-specific RNAi module readout in the Parallel assay, and mCerulean in the LFF and CFF assays. The gene expression readout in the Parallel assay is mCerulean. In LFF and CFF assays it is mCitrine. The crucial differences to the pilot assay (cf. Fig. 2.16) are highlighted with red circles.

2.8 Mechanistic model of pilot circuit and alternative designs

In order to explore these designs in silico we implemented a complete mechanistic model that could be adjusted to represent each of the four architectures with small changes in its reaction networks. We used the MATLAB SimBiology tool to build it. We modeled all the transcription-translation-degradation reactions, as well as transcription factor binding, explicitly with mass action kinetics. Transcription factor binding was modeled as a non-cooperative process for simplicity, since the factors used in the assay (rtTA, PIT2, LacI) are not known to have high Hill coefficients. Degradation by RNAi was modeled using Michaelis-Menten kinetics, assuming a finite pool of RISC complexes as well as a background pool of microRNAs competing for RISC binding. Transient transfection was modeled explicitly by setting initial gene copy numbers and including a degradation process for transfected genes with the rate constant corresponding to cell doubling every 24 hours. Thus, the network outputs never reach steady state, as is the case in transient transfections. We created the model such that each of the four assay variants can be tested by inactivating certain reactions and activating others. Because the assays differ in only a small number of such reactions, the modifications affect only a small fraction of the model and do not require the introduction of new parameters. Therefore all four topologies can be compared with the exact same parameter set. We used concentration units of molecules per cell (mpc) with 1 nM = 1000 mpc (an approximate conversion for a mammalian cell). The model diagram is shown in Figure 2.18.
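The unit convention can be checked with a few lines of arithmetic. The sketch below compares the rounded convention of 1 nM = 1000 mpc with the exact molecule count for a given volume; the volume of about 1.66 pL is an assumption chosen because it reproduces the rounded value, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def molecules_per_cell(conc_nM, cell_volume_pL):
    """Exact molecule count for a concentration (nM) in a given volume (pL)."""
    return conc_nM * 1e-9 * AVOGADRO * cell_volume_pL * 1e-12

def nM_to_mpc(conc_nM):
    """Rounded convention used in the model: 1 nM = 1000 molecules per cell."""
    return 1000.0 * conc_nM

# Assumed cell volume of 1.66 pL; gives close to 1000 molecules for 1 nM.
print(molecules_per_cell(1.0, 1.66))
# The initial RISC estimate of 3 nM expressed in model units.
print(nM_to_mpc(3.0))
```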

Figure 2.18 SimBiology model diagram view. SimBiology model used to simulate the different assays. Reactions in green are shared by the Pilot, LFF and CFF assays and the ones in purple by LFF and CFF. Orange reactions are specific to the Parallel assay, the blue one to the CFF assay. All other species and reactions are identical for all assays, and the parameters used for comparison are the same.

2.8.1 Initial choice of parameters

Despite the large number of reactions, we used similar parameter values to describe identical processes such as transcription, translation and degradation rates as well as the kinetic parameters for RNA interference. The following numerical values were chosen based on literature.


RNAi parameters and modeling

Downregulation of gene expression by miRNA was modeled exclusively at the mRNA level because the binding sites in the 3’-UTR are fully complementary. The equation used was

$$\frac{d[\mathrm{RNA}]}{dt} = -\frac{k_{cat}\,[\mathrm{RISC}\cdot\mathrm{miRX}]\,[\mathrm{RNA}]}{K_M + [\mathrm{RISC}\cdot\mathrm{miRX}]}$$

The initial parameter values55,56 were chosen as K_M = 1 nM (1000 molecules/cell) and k_cat = 0.017 s⁻¹. The creation of RISC complex associated with a specific miRNA was modeled as a reversible binding reaction with dissociation rate of 0.001 s⁻¹ and dissociation constant of about 0.3 nM, leading to estimated association rate constant of 3×10⁻⁶ s⁻¹ mpc⁻¹. The level of cytoplasmic RISC complex was set initially at 3 nM (3000 mpc)57, although the number refers to the level in HeLa extracts and is thus a probable underestimation of the cytoplasmic protein concentration. For the ratio between the number of mRNA transcripts and the number of spliced microRNA molecules we used the factor 0.4 (ref. 58).
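As an illustration of the RNAi rate law, the following sketch evaluates the degradation term for a given amount of miRNA-loaded RISC, using the initial K_M and k_cat values in units of molecules per cell. The function names are hypothetical and the snippet is a stand-alone illustration, not part of the SimBiology model.

```python
import math

K_M = 1000.0   # mpc, i.e. 1 nM
K_CAT = 0.017  # s^-1

def rnai_first_order_rate(loaded_risc, k_cat=K_CAT, k_m=K_M):
    """Effective first-order mRNA decay constant (s^-1) for a given loaded-RISC level."""
    return k_cat * loaded_risc / (k_m + loaded_risc)

def rnai_degradation_flux(rna, loaded_risc, k_cat=K_CAT, k_m=K_M):
    """d[RNA]/dt contributed by RNAi, in molecules per cell per second."""
    return -rnai_first_order_rate(loaded_risc, k_cat, k_m) * rna

def rnai_half_life_minutes(loaded_risc):
    """Half-life of a targeted mRNA if RNAi were the only decay route."""
    return math.log(2) / rnai_first_order_rate(loaded_risc) / 60.0

# With loaded RISC equal to K_M the rate is half of k_cat; it saturates at k_cat.
for risc in (100.0, 1000.0, 10000.0):
    print(risc, rnai_first_order_rate(risc), rnai_half_life_minutes(risc))
```

In this form the rate saturates with respect to the loaded RISC level rather than with respect to the mRNA, so the targeted transcript always decays with first-order kinetics at a fixed RISC loading.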

Transcriptional regulation parameters and modeling

Initially we chose dissociation constants of 1 nM (1000 mpc) for all transcription factors rtTA, LacI and PIT2; given that dissociation rate from an occupied promoter is on the order of 0.01 s⁻¹, the association rate is 10⁻⁵ s⁻¹ mpc⁻¹. Transcriptional regulation was modeled explicitly as binding and unbinding reactions without cooperativity; to avoid accumulation of protein-DNA bound complex, such complexes were converted to DNA with the rate that equaled free protein degradation rate.
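Both association rate constants quoted above follow from k_on = k_off / K_D. The sketch below reproduces that arithmetic and adds the equilibrium promoter occupancy expected for non-cooperative binding; it neglects the conversion of bound complexes back to free DNA, which is a simplifying assumption, and the function names are hypothetical.

```python
def association_rate(k_off, k_d):
    """k_on in s^-1 mpc^-1 from a dissociation rate (s^-1) and a K_D (mpc)."""
    return k_off / k_d

def promoter_occupancy(tf_level, k_d=1000.0):
    """Equilibrium fraction of bound promoters for non-cooperative binding."""
    return tf_level / (k_d + tf_level)

# miRNA loading onto RISC: k_off = 0.001 s^-1, K_D about 0.3 nM = 300 mpc.
print(association_rate(0.001, 300.0))
# Transcription factor binding: k_off = 0.01 s^-1, K_D = 1 nM = 1000 mpc.
print(association_rate(0.01, 1000.0))
# Occupancy at a transcription factor level equal to K_D.
print(promoter_occupancy(1000.0))
```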

DNA, RNA and protein synthesis and degradation

Genetic components of the assay are given initial copy numbers on the order of 5-15 copies/cell for the low-level components and the rest scaled by the ratios used in a particular transfection, that is, 1x of rtTA and LacI^FF4^ sensor-encoding genes, 2x of CAGop-driven gene and 4x of the bidirectional PIT2-controlled cassette. Gene copy number is diluted with half-life of 24 hours corresponding to cell division time (k_deg = 8.02×10⁻⁶ s⁻¹). RNA synthesis was modeled explicitly with rate of k_RNA = 0.003 s⁻¹ for all species in the model. RNA degradation was modeled with rate constant of k_deg^RNA = 6×10⁻⁵ s⁻¹ corresponding to half-life of 3 hours. Protein translation from mRNA was modeled explicitly with rate constant of 0.008 s⁻¹ for all species in the model. Degradation of transcription factor proteins (rtTA, LacI and PIT2) was modeled with rate constant of 2.4×10⁻⁵ s⁻¹ corresponding to half-life of 8 hours. Degradation of stable fluorescent proteins was determined by cell division with rate constant of k_deg = 8.02×10⁻⁶ s⁻¹.
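The rate constants above follow from the stated half-lives via k = ln 2 / t½, and the same values can drive a much-reduced simulation of a transiently transfected reporter. The sketch below is not the SimBiology model: it assumes a single reporter gene, a constant pool of miRNA-loaded RISC and a simple forward-Euler integrator, and all function names are hypothetical.

```python
import math

def rate_from_half_life(hours):
    """First-order rate constant (s^-1) for a half-life given in hours."""
    return math.log(2) / (hours * 3600.0)

def half_life_hours(rate):
    """Half-life in hours for a first-order rate constant (s^-1)."""
    return math.log(2) / rate / 3600.0

K_DIL = rate_from_half_life(24)  # plasmid and stable-protein dilution, ~8.02e-6 s^-1
K_RNA_SYN = 0.003                # s^-1 per gene copy
K_RNA_DEG = 6e-5                 # s^-1 (rounded value from the text)
K_TL = 0.008                     # s^-1 per mRNA
K_CAT, K_M = 0.017, 1000.0       # RNAi parameters (s^-1, mpc)

def simulate_reporter(copies, loaded_risc, t_end=180_000.0, dt=10.0):
    """Reporter protein level (mpc) at t_end for one transfected gene.

    Assumption: loaded_risc is constant, so RNAi acts as a fixed first-order term.
    """
    dna, rna, protein = float(copies), 0.0, 0.0
    rnai = K_CAT * loaded_risc / (K_M + loaded_risc)
    for _ in range(int(t_end / dt)):
        d_dna = -K_DIL * dna
        d_rna = K_RNA_SYN * dna - (K_RNA_DEG + rnai) * rna
        d_protein = K_TL * rna - K_DIL * protein
        dna += dt * d_dna
        rna += dt * d_rna
        protein += dt * d_protein
    return protein

print(rate_from_half_life(24), rate_from_half_life(8))
print(half_life_hours(K_RNA_DEG))
untargeted = simulate_reporter(copies=5, loaded_risc=0.0)
targeted = simulate_reporter(copies=5, loaded_risc=1000.0)
print(untargeted / targeted)  # fold knockdown in this reduced sketch
```

The conversion reproduces 8.02×10⁻⁶ s⁻¹ for a 24-hour half-life and about 2.4×10⁻⁵ s⁻¹ for 8 hours; the rounded RNA constant of 6×10⁻⁵ s⁻¹ corresponds to roughly 3.2 hours. Because the plasmid is diluted throughout the 50-hour run, the reporter never settles to a steady state, which mirrors the behavior described for the full model.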

2.8.2 Model calibration

In order to calibrate the selected parameters we evaluated the dynamic range of the simulated non-specific RNAi module output. We chose three different conditions and compared them with the fold changes we observed for the pilot circuit experimentally. The first test condition is denoted (1,0): both high inputs are present at 1000 mpc each, denoted by the 1 in (1,0), and the low inputs are absent, denoted by the 0 in (1,0). This state represents the ground state of HuH-7 cells and is expected to give high expression of the non-specific assay marker (Condition 0 in Fig. 2.5). The second condition is (0,0): this is the condition when all the non-specific inputs are absent (Condition 5 in Fig. 2.5). The assay should generate low output in this situation, since the high markers do not match the HuH-7 ground state profile and consequently inhibit non-specific assay marker expression. The third condition is (0,1): the high inputs are absent and two low inputs are present at 100 mpc each. We did not test this condition experimentally; however, we can use Condition 13 (Fig. 2.5) as a proxy, since we observed that low inputs are very strong. For this condition, the output is expected to be very low, since both input types oppose their ground state condition.

We ran the simulation (simulation time 180,000 seconds = 50 hours, the approximate waiting time from transfection until measurement) while randomizing parameters describing the gene copy numbers, all RNAi-related rate constants and transcription factor association rate constants (eight parameters in total) within an approximately 10-fold range. The main criterion to judge different sets was our experimental observation that the (1,0):(0,0) ratio of an assay whose output is a fluorescent protein (as is the non-specific ZsYellow output in the Parallel assay layout) is on the order of 15-20. Therefore, we ran about 1000 simulations with randomized parameter sets and selected those sets that generated a (1,0):(0,0) ratio above 13. The following is the summary of the parameter values that resulted in such a ratio:
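The calibration loop described here (randomize, simulate, keep the sets that pass the ratio criterion, then summarize) can be sketched as follows. Several things are assumptions: the 10-fold range is taken as log-uniform and centered on the initial value, the plasmid copy number starts from 10 as the midpoint of 5-15, and, most importantly, `placeholder_ratio` is an arbitrary stand-in for the simulated (1,0):(0,0) ratio and has nothing to do with the actual circuit model. All names are hypothetical.

```python
import math
import random
import statistics

# Initial values of the eight randomized parameters (mpc-based units).
INITIAL = {
    "plasmid_copies": 10.0,  # assumed midpoint of the 5-15 range
    "risc_total": 3e3,
    "K_M": 1e3,
    "k_cat": 1.7e-2,
    "k_on_risc": 3e-6,
    "k_on_rtTA": 1e-5,
    "k_on_LacI": 1e-5,
    "k_on_PIT": 1e-5,
}

def sample_parameters(rng, fold=10.0):
    """Draw each parameter log-uniformly from a `fold`-wide window around its initial value."""
    half_span = math.sqrt(fold)
    return {k: v * half_span ** rng.uniform(-1.0, 1.0) for k, v in INITIAL.items()}

def placeholder_ratio(p):
    """Arbitrary stand-in score; replace with the simulated (1,0):(0,0) ratio."""
    return 15.0 * (p["risc_total"] / 3e3) * (p["k_cat"] / 1.7e-2) * (1e3 / p["K_M"])

def calibrate(score, n_sets=1000, threshold=13.0, seed=0):
    """Return the randomized parameter sets whose score exceeds the threshold."""
    rng = random.Random(seed)
    candidates = (sample_parameters(rng) for _ in range(n_sets))
    return [p for p in candidates if score(p) > threshold]

def summarize(accepted):
    """Mean and sample standard deviation of each parameter over the accepted sets."""
    summary = {}
    for name in INITIAL:
        values = [p[name] for p in accepted]
        summary[name] = (statistics.mean(values), statistics.stdev(values))
    return summary

accepted = calibrate(placeholder_ratio)
for name, (avg, sd) in summarize(accepted).items():
    print(f"{name}: mean={avg:.3g} stdev={sd:.3g} (n={len(accepted)})")
```

Passing the scoring function as an argument keeps the sampling and filtering logic independent of the model, so the placeholder could be swapped for a call into any simulator that returns the ratio of interest.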

Table 2.1 Parameter values used for SimBiology model simulations.

| Parameter name | Mean | Stdev | Initial value | Set used for simulations |
|---|---|---|---|---|
| Plasmid copy number | 6.88 | 4.7 | 5 – 15 | 6.88 |
| RISC complex, mpc | 5.60×10³ | 2.02×10³ | 3×10³ | 5.60×10³ |
| mRNA degradation by RISC, K_M, mpc | 5.16×10² | 2.25×10² | 10³ | 5.16×10² |
| mRNA degradation by RISC, k_cat, s⁻¹ | 3.37×10⁻² | 0.77×10⁻² | 1.7×10⁻² | 3.37×10⁻² |
| Association of miRNA with RISC, s⁻¹ mpc⁻¹ | 3.78×10⁻⁶ | 1.91×10⁻⁶ | 3×10⁻⁶ | 3.78×10⁻⁶ |
| k_ON, rtTA to DNA, s⁻¹ mpc⁻¹ | 6.33×10⁻⁶ | 3.38×10⁻⁶ | 10⁻⁵ | 10⁻⁵ |
| k_ON, LacI to DNA, s⁻¹ mpc⁻¹ | 1.49×10⁻⁵ | 8.34×10⁻⁶ | 10⁻⁵ | 10⁻⁵ |
| k_ON, PIT to DNA, s⁻¹ mpc⁻¹ | 1.40×10⁻⁵ | 7.89×10⁻⁶ | 10⁻⁵ | 10⁻⁵ |

One observes that the main changes relative to the initial parameter values occurred in the parameters related to the RNAi pathway: the concentration of RISC complex was higher than the initial estimate, while K_M was lower and k_cat was higher, indicating that RNAi activity is stronger than the initial estimates obtained predominantly in cell lysates. We replaced the initial values in our simulation with the values from the table above and scanned individual parameters across two orders of magnitude. We did not change the initial values of the association rate constants for transcription factor binding because the values in the table were very similar to 10⁻⁵ s⁻¹ mpc⁻¹.

2.8.3 Assay performance analysis

The circuit layout of choice is the design that can best detect the unintended effects on off-target miRNAs over a wide range of input parameters. Conceptually there are two related sources of non-specific effects: first, changes in the levels and/or activities of unintended miRNAs for any number of reasons, and second, global changes to the RNAi machinery. To simulate the circuit response to changes in non-specific marker levels, we calculated the dynamic range by switching, in the model, between high and low miRNA input concentrations (cf. section “2.8.2 Model calibration”). We then scanned a range of different parameters to identify the circuit with the best performance and its ideal conditions.


Figure 2.19 Parameter scans for changes in high sensor inputs. Different parameters were scanned with the model shown in Figure 2.18. (1,0) represents the On state. For (0,0) the high input miRNAs are inhibited or absent, generating an intermediate (high inputs only) Off-state. All the plots show the dynamic range calculated by dividing On-state by Off-state output value. The following parameters were scanned, from top left to bottom right: Copy number, association rate of rtTA, PIT and LacI to the DNA, RNA and Protein synthesis rate, RNA and Protein degradation rate (Transcription factors and fluorescent proteins), total RISC number, miRNA RISC association rate, k_cat and K_M of the RISC complex. k_cat, turnover number in Michaelis-Menten kinetics; K_M, Michaelis constant; LFF, low input feed-forward circuit; CFF, complete feed-forward circuit.

We first changed the high input miRNAs alone. Generally, the CFF and Parallel layouts greatly outperform the LFF and Pilot circuits, for which we observe a maximal dynamic range of five-fold and hence rather stable ratios for all the parameters tested (flat lines). The CFF and the Parallel circuits, on the other hand, are more dependent on various parameters, with similar tendencies for all of them. For the non-RNAi-related parameters, it is beneficial to have less mRNA and protein in the cell, or a higher turnover of these. This is achieved with low copy numbers, low RNA synthesis rates, weaker rtTA/PIT-DNA association, stronger LacI-DNA binding and faster protein degradation. Only RNA degradation runs against this rule: here we see better sensitivity at lower rates. The RNAi-related parameters point in the same direction: the best performance is achieved when RNAi works efficiently, which is the case with high total RISC concentration, low K_M and high k_cat.

For the low input miRNAs we generally observe much larger dynamic ranges, and LFF performs best under all conditions.

Figure 2.20 Parameter scans for changes in low and high sensor inputs. Different parameters were scanned with the model shown in Figure 2.18. (1,0) represents the On state. For (0,1) the high input miRNAs are inhibited or absent and the low miRNAs are present, generating the strongest possible Off-state. All the plots show the dynamic range calculated by dividing On-state by Off-state output value. The following parameters were scanned, from top left to bottom right: Copy number, association rate of rtTA, PIT and LacI to the DNA, RNA and Protein synthesis rate, RNA and Protein degradation rate (Transcription factors and fluorescent proteins), total RISC number, miRNA RISC association rate, k_cat and K_M of the RISC complex.

For the non-RNAi parameters we observe fairly stable systems; only the CFF and Parallel circuit layouts show considerable changes for the RNA and protein synthesis rates. As with the high inputs, the low inputs also benefit from an efficiently working RNAi system. This also explains the relative robustness of the system to changes in general parameters: the low inputs with efficient RNAi overrule all other effects, balancing out possible perturbations.

In summary, we achieve good dynamic ranges with any layout for the low sensors, even though the differences between the layouts are significant. For the high sensors, on the other hand, only the CFF and the Parallel circuits perform well under a wide range of parameter values, and the CFF circuit is superior to the Parallel circuit with a 2-3 fold improvement, especially when RNAi-related parameters reflect fast and efficient down-regulation (high levels of RISC complex, low K_M value for loaded RISC toward its target mRNA, and high k_cat of mRNA degradation/sequestration).

Global changes to the RNAi machinery are unlikely to affect K_M and k_cat values, because these represent basic biochemical properties of the pathway; instead, the likely changes will affect the total amount of RISC complexes or the efficiency of miRNA processing. RISC concentration is a parameter in the model; for miRNA processing, we use the fact that a synthetic miR-FF4 is spliced from LacI^FF4^ pre-mRNA (Fig. 2.21a). We reason that the ratio between the amount of processed LacI mRNA and the mature miR-FF4 is a good proxy for global changes in the miRNA processing and maturation pathway. We mapped changes in circuit output as a function of RISC concentration and miR-FF4/LacI mRNA ratio (Fig. 2.21b). Once again, the Parallel and CFF circuits are superior to the other two architectures in terms of sensitivity to change; because the mCerulean output contains a binding site for miR-FF4, CFF is more sensitive than the Parallel circuit. Thus the simulation supported our expectations and pointed to CFF as an optimal architecture. Since the exact parameter values are unknown, we nevertheless decided to test all of them experimentally.

Figure 2.21 Off-target effect dynamic range analysis for alternative layouts. (a) The construct expressing LacI and miR-FF4 from the same mRNA. Ratio of mature miR-FF4 to LacI-mRNA can be used to detect influence of candidate molecules on miRNA processing machinery. (b) Simulated dependency of the non-specific RNAi module readout on changes in total RISC concentration and processing efficiency of miR-FF4, as reflected by the miR-FF4/LacI-mRNA ratio. CAP, five prime cap; AAA, poly adenosine.

2.9 Validation of alternative assays

Because of the uncertainties in simulating complex networks, we quantitatively validated and characterized all three variants using a complete set of input perturbations (see Fig. 2.5). Z’-factors for specific targeting of miR-122 are >0.5 ("excellent") for all assays (Fig. 2.22, lower panel). Z’-factors corresponding to all perturbations involving non-specific inputs form distinct distributions in each of the three assays (Fig. 2.22, upper panel). As predicted by the model, the CFF assay has the smallest number of Z’-factors <0.5 (two out of 13), with the rest being “excellent”. With the other assays, the lack of the miR-FF4 feed-forward loop leads to lower performance when high inputs are perturbed. By applying the “best-worst-case” as an indicator of the system’s weakest link59, we judge the CFF assay to be superior.
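One way to read the “best-worst-case” criterion is as a maximin rule: each assay is scored by its lowest Z’-factor across all perturbations, and the assay with the highest such minimum is preferred. The sketch below implements that reading; it is an interpretation rather than the exact procedure of the cited reference, and the Z’-factor values are invented purely for illustration.

```python
def worst_case(z_factors):
    """The weakest link of an assay: its lowest Z'-factor."""
    return min(z_factors)

def count_below(z_factors, cutoff=0.5):
    """Number of perturbations that miss the 'excellent' threshold."""
    return sum(1 for z in z_factors if z < cutoff)

def rank_assays(z_by_assay):
    """Assay names ordered by their worst-case Z'-factor, best first."""
    return sorted(z_by_assay, key=lambda name: worst_case(z_by_assay[name]), reverse=True)

# Hypothetical Z'-factors for three perturbations per assay (not measured data).
example = {
    "Parallel": [0.8, 0.1, 0.6],
    "LFF": [0.7, -0.2, 0.9],
    "CFF": [0.6, 0.4, 0.7],
}
for name in rank_assays(example):
    print(name, worst_case(example[name]), count_below(example[name]))
```

Ranking by the minimum rather than the mean penalizes an assay that performs very well on most perturbations but fails on one, which is the relevant failure mode when a single undetected off-target effect matters.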

Figure 2.22 Experimental comparison of alternative layouts. Z’-factor histograms for the three different assays tested with the 15 validation perturbations (see Fig. 2.5). Blue bars represent the Z’-factors of perturbations probing the non-specific module, and red bars represent perturbations specifically modulating miR-122. All data points are mean values of biological triplicates. Transfections are described in Supplementary Table 30. Z’-factor, measure of statistical effect size (cf. section “2.2.8 Statistical analysis”).

2.10 In-depth validation of complete feed forward assay

The detailed data of the CFF assay (Fig. 2.23a) response to all perturbations are shown in Figure 2.23. Zooming into Figure 2.23b, we see that specific changes in miR-122 are readily discovered. For the non-specific RNAi module, the induction of miR-20a and miR-21 and the inhibition of miR-FF4 generate the smallest effects, while under other perturbations the assay performs very well. A weak response to increasing high inputs is expected, as they already exert most of their effect in the default state and their main purpose is to serve as detectors of non-specific miRNA inhibition. We then correlated the data obtained from flow cytometry with those from microscopy (Fig. 2.23c) and found an excellent correlation with an R² of 0.996, indicating that both techniques can be used for data analysis.
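The comparison between the two measurement techniques amounts to an ordinary least-squares line and its coefficient of determination. A minimal sketch of that calculation is given below; the paired values are hypothetical placeholders for normalized mCherry readouts and do not reproduce the reported R² of 0.996.

```python
def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept and its R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical normalized readouts from the two techniques (illustration only).
flow_cytometry = [0.2, 0.5, 1.0, 1.6, 2.1]
microscopy = [0.25, 0.48, 1.05, 1.55, 2.2]
slope, intercept, r_squared = linear_fit(flow_cytometry, microscopy)
print(slope, intercept, r_squared)
```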


Figure 2.23 In-depth validation of CFF circuit. (a) Schematic representation of the CFF assay circuit used in these experiments, reproduced here from Figure 2.17 for convenience. (b) Performance of the CFF assay when tested with the 15 validation perturbations. The table describes the miRNA/LNA mixtures. Each column shows which miRNA(s) are changed relative to HuH-7 background (gray columns). "-1" indicates inhibition of a miRNA using LNAs and "+1" indicates induction of a miRNA with mimic(s). Gray columns show miRNA expression pattern (0 - not expressed, 1 - highly expressed, 0.7 - intermediate level of the drug target miR-122) in the HuH-7 ground state. Pink columns emphasize specific modulation of miR-122. Faint blue and red bands indicate the range in which the changes in the non-specific and the specific module readouts, respectively, are insignificant. Z’-factors for the different perturbations are displayed in the bottom row: values >0.5 indicate “excellent” performance, >0 are “acceptable”, while the ones <0 are not suitable for screening. Pink and blue shades highlight Z’-factors calculated for specific and non-specific module readouts, respectively. (c) Correlation between normalized mCherry values generated from flow cytometry data (b) with the values generated by our image-processing pipeline using microscopy (d). Straight line indicates linear regression using least square fit. The slope and coefficient of determination (R²) are displayed. (d) Representative microscopy images of transfections shown in (b) for all perturbations used to characterize and validate the assay. All bars and data points are mean ± SD of biological triplicates. Transfections are described in Supplementary Table 31.

We then quantified the assay’s sensitivity to intermediate perturbation strength by measuring dose-response across a 50-fold intensity variation (Fig. 2.24a). The specific module is sensitive to all perturbations in this range; only miR-122 up-regulation at the lowest concentration of mimic-122 shows a decreased Z’-factor (Z’=0.42) (Fig. 2.24b). The non-specific RNAi module exhibits a shallow dose-response to an increase in high inputs and a decrease in miR-FF4, consistent with the end-point data. In contrast, assay sensitivity to low input activation with the respective mimics is very high, resulting in strong readout reduction already at the lowest tested concentration.

Figure 2.24 Dose response testing of CFF assay with validation set. (a) Dose response of assay readouts to gradual change in perturbation strength. Pale blue and red lines show dose-response behavior with the relevant controls. The dose-response curves were fitted with exponential functions to serve as a visual guide. (b) Z’-factors for all data points of (a). All data points are mean ± SD of biological triplicates. Transfections are described in Supplementary Table 32.

The asymmetry in response to the two types of inputs is advantageous, since it precludes balancing of two opposing unspecific effects of the same strength. This is confirmed upon simultaneous activation of all non-specific inputs with the respective mimics, whereby the effect of perturbing low inputs overrides that of high inputs. Lastly, in all cases where the intended drug target miR-122 is affected simultaneously with other miRNAs, we observe changes in the non-specific RNAi module readout. These perturbations emulate a scenario in which the intended target is modulated together with a number of off-targets. For all such cases the assay correctly reports the side effects (Fig. 2.24b). For further details, see section “2.14 Assay comparison to bidirectional reporters” below.

2.11 Automated screening of small molecule library

Following extensive assay optimization and validation, we performed a pilot small molecule screen. First, we adapted the assay to a high-throughput screening protocol33 by automating compound dilution and transfection and developing an image processing pipeline for 96-well plates (Fig. 2.25a, Fig. 2.1 and Section “2.2.3 Small molecule screening”). The assay was re-validated with the automated protocol using select perturbations and was found to retain its performance (Z’>0.5).

Figure 2.25 Screening of NIH clinical collections 1 & 2. (a) Schematic representation of the small molecule screen and the data processing pipeline. (b) Volcano plots of screening results with each point representing the mean of a biological triplicate. x-values are fold changes of the candidate triplicate compared to the plate average, and y-values represent the p-value of a two-sided t-test of each triplicate compared to the plate averages. Green dots are from screen #1, red ones from screen #2. Horizontal dotted lines are drawn at p=0.1 for the gene expression module (mCitrine) and the non-specific RNAi module (normalized mCerulean), and at p=0.01 for the specific module (normalized mCherry). Transfections are described in Supplementary Table 33.

As the pilot library, we chose the NIH Clinical Collections 1 & 2 with a total of 726 compounds. We tested the library twice using automated liquid handling, transient transfections and triplicate measurements. Assuming that most compounds are inactive34, we used all readouts from an entire 96-well compound plate as the reference distribution that is compared, using a two-sided t-test, to triplicate measurements of individual compounds from the same plate (Supplementary Figures 1 and 2). A cut-off of p<0.1 (i.e., a 10% significance level) was used to exclude a compound from further analysis based on changes in the gene expression readout mCitrine. 18.3% and 20.2% of compounds in screens #1 and #2, respectively, were excluded. The same criterion was applied to the non-specific RNAi module readout (normalized mCerulean), excluding 12.8% and 16.1% of compounds, respectively, as potential non-specific effectors. For hit identification with the specific module readout (normalized mCherry) we used a p<0.01 cut-off, identifying 39 and 31 hits, respectively, corresponding to an apparent hit rate of 5.1% and 3.3% (Fig. 2.25b). However, as the relative magnitude of the effects is <30% in all cases, many hits are potential false positives; therefore, secondary validation is needed.
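The sequential triage described in this paragraph can be summarized in a small decision function. The sketch assumes that the three p-values per compound have already been computed elsewhere (the t-test itself is not reimplemented here); the function name, the category labels and the example compounds are hypothetical.

```python
def triage(p_expression, p_nonspecific, p_specific,
           exclude_cutoff=0.1, hit_cutoff=0.01):
    """Classify one compound from the p-values of its three readouts."""
    if p_expression < exclude_cutoff:
        return "excluded: gene expression"
    if p_nonspecific < exclude_cutoff:
        return "excluded: non-specific RNAi"
    if p_specific < hit_cutoff:
        return "hit"
    return "inactive"

# Hypothetical p-values (mCitrine, normalized mCerulean, normalized mCherry).
compounds = {
    "compound_A": (0.03, 0.40, 0.001),   # changes global expression
    "compound_B": (0.55, 0.08, 0.004),   # perturbs the non-specific module
    "compound_C": (0.62, 0.47, 0.006),   # passes both filters, specific effect
    "compound_D": (0.80, 0.90, 0.300),   # no detectable effect
}
for name, p_values in compounds.items():
    print(name, triage(*p_values))
```

The order of the checks matters: a compound that shifts global gene expression is removed before its effect on the specific readout is considered, so an apparent miR-122 effect caused by general toxicity does not count as a hit.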

2.12 Validation of screening hits

We followed up on the miR-122 hits and some of the excluded compounds by measuring the dose response of the assay readouts. We looked for compounds that significantly affected at least one fluorescent readout in both screen replicas and ranked them according to their mean deviation from the reference distribution. We measured the dose response of the top ten modulators of the gene expression module (four up- and six down-regulators, chemical structures shown in Supplementary Figure 3) and of the top eight compounds affecting the non-specific module (two up- and six down-regulators, chemical structures shown in Supplementary Figure 4). We reproduced the effects on the gene expression module with eight out of ten compounds, whereas Indinavir and Donepezil (1.6- and 1.5-fold change in the initial screen) failed to elicit the expected response (Fig. 2.26a-c). Similarly, compounds that reduced the readout of the non-specific RNAi module in the screen also reduced it in a dose-dependent manner. However, all of them affected the gene expression readout mCitrine at a concentration of 50 µM, five times higher than the one used in the screen, suggesting additional effects on gene expression or cell health. Among the compounds that elevated the non-specific module readout, Ifenprodil did so in a dose-dependent manner, but the effect of Oxytetracycline could not be confirmed (Fig. 2.26d).


Figure 2.26 Screening hits validation. (a) Dose response of assay gene expression module readout mCitrine to select compounds originally excluded based on this readout in the automated screen. Compounds found to down-regulate gene expression readout mCitrine in the automated screen are in shades of blue, and those up-regulating mCitrine are in shades of red. (b, c) Dose response characterization of gene expression module "hits" (mCitrine-based exclusion) with the full CFF screening assay as well as with the simple bidirectional reporter assay (scheme shown in Fig. 2.6). Gene expression readout (mCitrine) measured with the CFF assay (top row) and corresponding mCerulean readout for the simple bidirectional reporter assay (bottom row). mCerulean is without miRNA binding sites and overall protein load is smaller for this assay. Last concentration without toxic effects is indicated. (d) Dose response of gene expression and non-specific readouts for select compounds excluded based on the non-specific RNAi module readout in the automated screen. Black lines show gene expression module readout (mCitrine); blue ones show the non-specific module readout (normalized mCerulean). (e, f) Dose-response data for specific hits (mCherry based identification) measured with the circuit assay (e) as well as with simple bidirectional reporter assay (f). All data points are mean ± SD of biological duplicates. Transfections are described in Supplementary Table 34 – 37.

For the seven miR-122 hits, including three up- and four down-regulators (chemical structures shown in Supplementary Figure 5), we confirmed the lack of effect on gene expression and non-specific module readouts apart from an occasional effect on mCitrine at 50 µM. However, the modest modulation of miR-122 activity observed in the screen (<1.3 fold) could not be reproduced (Fig. 2.26e). Upon repeated testing with a miR-122 bidirectional reporter, we confirmed the lack of anti-miR-122 effect (Fig. 2.26f), indicating that the hits were likely false positives resulting from “multiplicity of testing” problems60.

2.13 Assay customization

The assay can be customized (reprogrammed) to address different miRNA drug targets and different sets of non-specific inputs. To show this experimentally, we modified the specific and non-specific modules, generating two new assay circuits. The first is a miR-23b drug discovery assay circuit obtained by replacing the miR-122 binding sites with those for miR-23b. The second circuit has an augmented non-specific input set obtained by replacing the miR-141 binding sites with sites for miR-375 and miR-145 (Fig. 2.27a). We validated the new assays with perturbations modified to match the new miRNA drug targets and non-specific inputs. In the miR-23b assay, the Z'-factors for individual perturbations were comparable to those measured with the miR-122 assay, with the exception of miR-23b inhibition, which resulted in “acceptable” but not “excellent” performance (Z'=0.38, Fig. 2.27b). This can be explained by the relatively low endogenous activity of miR-23b in HuH-7 cells and may be rectified with a more sensitive sensor for miR-23b activity. For the miR-122 assay with augmented non-specific inputs, most of the Z'-factors were comparable to those of the original miR-122 assay, apart from a lower sensitivity to LNA-FF4 due to a weaker miR-FF4 effect in this circuit (Fig. 2.27c). These data show that specific and non-specific inputs can be swapped and augmented in a modular fashion requiring only minor optimizations, resulting in drug discovery assays for new on- and off-targets.


Figure 2.27 Assay customization. (a) Schematic representation of new customized assay circuits with highlighted modifications – drug target miRNAs and low input miRNAs. (b, c) Validation of customized assays. Response of the miR-23b drug discovery assay (b) and of the miR-122 drug discovery assay with augmented non-specific inputs, miR-375 and miR-145 together with miR-146a (c), to different perturbations of the inputs with appropriately adjusted validation perturbations (see Fig. 2.5). Each column shows which miRNA(s) are changed relative to HuH-7 background (gray columns, 0 - not expressed, 1 - highly expressed, 0.3 or 0.7 - intermediate level). "-1" indicates inhibition of a miRNA using LNAs and "+1" indicates induction of a miRNA with mimic(s). Pink columns emphasize specific modulation of miR-23b or miR-122. Faint blue and red bands indicate the range in which the changes in the non-specific and specific module readouts, respectively, are insignificant. Z’-factors for the different perturbations are displayed in the bottom row: values >0.5 indicate “excellent” performance, values >0 are “acceptable”, while those <0 are not suitable for screening. All data points are mean ± SD of biological triplicates. Transfections are described in Supplementary Table 38 and 39.

2.14 Assay comparison to bidirectional reporters

The key assay feature is the “filtering” of candidate compounds that would otherwise have been considered specific in simple luciferase or fluorescent reporter assays. To illustrate this advantage, we sought compounds or perturbations that affect the RNAi pathway non-specifically, i.e. that affect miR-122 as well as other miRNAs. The simplest example is a validating perturbation that affects miR-122 together with the non-specific inputs, e.g., a mixture of multiple LNAs with LNA-122 (Fig. 2.5, Perturbation 5). When this combination is applied to a bidirectional reporter (Fig. 2.28a, right chart), it can be mistaken for a specific modulator; analysis of the assay readouts suggests otherwise, because the non-specific RNAi readout mCerulean is significantly reduced (Fig. 2.28a, charts on the left). Next, we noticed that in bidirectional reporter assays siRNAs47 against the RNAi pathway proteins Drosha and Dicer affected the activity of miR-20a and let-7b, but not of miR-122 (Fig. 2.11b). To illustrate the distinction between specific and non-specific let-7b targeting, we exchanged the miR-122 binding sites in the original CFF circuit with those for let-7b, resulting in an assay for let-7b. We subjected this new assay, as well as a bidirectional let-7b reporter, to a siDrosha/siDicer mixture. The bidirectional reporter shows a clear effect on let-7b. In the circuit assay, the gene expression readout mCitrine was unchanged, while the non-specific module readout mCerulean changed relative to the control with p<0.1. In our decision tree, this perturbation is therefore flagged as a non-specific hit (Fig. 2.28c). Lastly, we looked at some of the compounds that affected the non-specific RNAi module in the automated screen. Among those, Clobetasol propionate also had an effect on miR-122 in the bidirectional reporter assay. The measurements with Clobetasol propionate done with the full miR-122 assay and with a bidirectional reporter show that this compound would have been identified as a specific modulator in a bidirectional assay, but it is excluded in the complete multi-module circuit assay (Fig. 2.28b). These data show that non-specific compounds affecting RNAi can be identified with our assay, while they would be considered specific with simple bidirectional reporter assays.

Figure 2.28 Assay benchmarking. The effects of non-specific modulators on assay circuit readouts (yellow, blue and red bars) as compared to bidirectional reporter assay readout (black bars, see schematics in Fig. 2.6). (a, b) miR-122 screening assay with miR-21, 20a as high inputs and miR-146a and miR-141 as low inputs (Fig. 2.23a) compared to the standard fluorescent protein bidirectional reporter with T122x4 binding sites (Fig. 2.6). (a) Probing with a combination of LNAs emulating a non-specific modulator situation. (b) Probing of the assays with a chemical (Clobetasol propionate) found in the screen to be a non-specific modulator. (c) let-7b screening assay with miR-21, 20a as high inputs and miR-146a and miR-141 as low inputs (Fig. 2.27a) compared to the standard fluorescent protein bidirectional reporter with Tlet7bx4 binding sites (Fig. 2.27a). siRNAs targeting the mRNA of miRNA maturation pathway proteins Drosha and Dicer are used to emulate a chemical inhibiting these two components. Data in (b) are pooled from the two triplicate runs of the small molecule screens (Fig. 2.25) and normalized to the measured plate means. All bars are mean ± SD of six biological replicates. Transfections are described in Supplementary Table 40 – 42.

2.15 Conclusions

In this report we describe a large-scale mammalian gene circuit serving as an assay for drug discovery against miRNA targets, enabling highly precise identification of specific target modulators with high throughput. Until now, off-target effects have usually been assessed in secondary screens61. Although gene circuits have been suggested for use in small molecule screening before9,11, this is, to the best of our knowledge, the first multi-input, customizable assay. It may also be one of the first large-scale mammalian synthetic circuits that can be directly applied to an unmet technological need. Transient transfection of plasmid sets is sufficient to establish the assay because the readouts are averaged across transfected wells. Thanks to the cell-level computation5 that the circuit performs over a set of potential off-target inputs, the assay readouts carry rich information that is difficult to measure otherwise. We used miRNA as the drug target in this study because off-target effects are expected due to the shared maturation pathway and common mechanisms of action. However, the same general strategy can be applied in just about any cell-based screening scenario, provided that appropriate sensing-computing networks are designed and experimentally implemented. For example, a very recent report uncovered four consensus genes related to toxicity62. Promoters driving these genes can be integrated with an assay such as ours and add to the information content of the non-specific reporter.

Our results validate the basic premise behind rationally designed biological information-processing networks, namely that appropriate design frameworks3, inspired by and built on solid engineering principles, can give rise to multiple systems with diverse properties and intended uses, eliminating the need to construct each new system from scratch. This is not to say that these design frameworks have reached the "plug and play" stage. In this study we describe four different circuits that were tested extensively in silico and in experiments. Eventually, we arrived at a well-performing, customizable architecture and implemented an automated screening protocol, suggesting that these circuits can be used "as is" in exploratory screening campaigns. The circuit engineering effort has also augmented the toolkit of synthetic biology with new concepts such as the nested feed-forward motif from the CFF assay. Thus, encounters of abstract concepts with real-life applications not only address specific needs, but also provide rich data that are applicable in other contexts of circuit engineering.

References

1 Temme, K., Zhao, D. & Voigt, C. A. Refactoring the nitrogen fixation gene cluster from Klebsiella oxytoca. Proc. Natl Acad. Sci. USA 109, 7085-7090 (2012).
2 Cummings, M., Breitling, R. & Takano, E. Steps towards the synthetic biology of polyketide biosynthesis. FEMS Microbiology Letters 351, 116-125 (2014).
3 Rinaudo, K. et al. A universal RNAi-based logic evaluator that operates in mammalian cells. Nat. Biotechnol. 25, 795-801 (2007).
4 Ausländer, S., Ausländer, D., Müller, M., Wieland, M. & Fussenegger, M. Programmable single-cell mammalian biocomputers. Nature 487, 123-126 (2012).
5 Benenson, Y. Biomolecular computing systems: principles, progress and potential. Nat. Rev. Genet. 13, 455-468 (2012).
6 Ro, D.-K. et al. Production of the antimalarial drug precursor artemisinic acid in engineered yeast. Nature 440, 940-943 (2006).
7 Lu, T. K. & Collins, J. J. Dispersing biofilms with engineered enzymatic bacteriophage. Proc. Natl Acad. Sci. USA 104, 11197-11202 (2007).
8 Xie, Z., Wroblewska, L., Prochazka, L., Weiss, R. & Benenson, Y. Multi-input RNAi-based logic circuit for identification of specific cancer cells. Science 333, 1307-1311 (2011).
9 Weber, W. et al. A synthetic mammalian gene circuit reveals antituberculosis compounds. Proc. Natl Acad. Sci. USA 105, 9994-9998 (2008).
10 Weber, W. & Fussenegger, M. The impact of synthetic biology on drug discovery. Drug Discov. Today 14, 956-963 (2009).
11 Zhao, W., Bonem, M., McWhite, C., Silberg, J. J. & Segatori, L. Sensitive detection of proteasomal activation using the Deg-On mammalian synthetic gene circuit. Nat. Commun. 5:3612, doi:10.1038/ncomms4612 (2014).
12 Chen, Y. Y., Jensen, M. C. & Smolke, C. D. Genetic control of mammalian T-cell proliferation with synthetic RNA regulatory systems. Proc. Natl Acad. Sci. USA 107, 8531-8536 (2010).
13 Macarrón, R. & Hertzberg, R. P. Design and Implementation of High-Throughput Screening Assays. in High Throughput Screening Vol. 565 Methods in Molecular Biology (eds William P. Janzen & Paul Bernasconi) Ch. 1, 1-32 (Humana Press, 2009).
14 Kitano, H. Innovation - A robustness-based approach to systems-oriented drug design. Nat. Rev. Drug Discov. 6, 202-210, doi:10.1038/nrd2195 (2007).
15 Muller, P. Y. & Milton, M. N. The determination and interpretation of the therapeutic index in drug development. Nat. Rev. Drug Discov. 11, 751-761 (2012).
16 Feng, Y., Mitchison, T. J., Bender, A., Young, D. W. & Tallarico, J. A. Multi-parameter phenotypic profiling: using cellular effects to characterize small-molecule compounds. Nat. Rev. Drug Discov. 8, 567-578 (2009).
17 Bader, A. G., Brown, D. & Winkler, M. The promise of microRNA replacement therapy. Cancer Res. 70, 7027-7030 (2010).
18 Esau, C. C. Inhibition of microRNA with antisense oligonucleotides. Methods 44, 55-60 (2008).
19 Ebert, M. S., Neilson, J. R. & Sharp, P. A. MicroRNA sponges: competitive inhibitors of small RNAs in mammalian cells. Nat. Meth. 4, 721-726 (2007).
20 Connelly, C. M., Thomas, M. & Deiters, A. High-throughput luciferase reporter assay for small-molecule inhibitors of microRNA function. J. Biomol. Screen. 17, 822-828 (2012).
21 Velagapudi, S. P., Gallo, S. M. & Disney, M. D. Sequence-based design of bioactive small molecules that target precursor microRNAs. Nat. Chem. Biol. 10, 291-297 (2014).
22 Melo, S. et al. Small molecule enoxacin is a cancer-specific growth inhibitor that acts by enhancing TAR RNA-binding protein 2-mediated microRNA processing. Proc. Natl Acad. Sci. USA 108, 4394-4399 (2011).
23 Shan, G. et al. A small molecule enhances RNA interference and promotes microRNA processing. Nat. Biotech. 26, 933-940 (2008).
24 Gumireddy, K. et al. Small-molecule inhibitors of microRNA miR-21 function. Angew. Chem. Int. Ed. 47, 7482-7484 (2008).
25 Young, D. D., Connelly, C. M., Grohmann, C. & Deiters, A. Small molecule modifiers of microRNA miR-122 function for the treatment of hepatitis C virus infection and hepatocellular carcinoma. J. Am. Chem. Soc. 132, 7976-7981 (2010).
26 Schmidt, M. F. Drug target miRNAs: chances and challenges. Trends Biotech. 32, 578-585 (2014).
27 Kasinski, A. L. & Slack, F. J. MicroRNAs en route to the clinic: progress in validating and targeting microRNAs for cancer therapy. Nat. Rev. Cancer 11, 849-864 (2011).
28 Ling, H., Fabbri, M. & Calin, G. A. MicroRNAs and other non-coding RNAs as targets for anticancer drug development. Nat. Rev. Drug Discov. 12, 847-865 (2013).
29 Li, Y. et al. HMDD v2.0: a database for experimentally supported human microRNA and disease associations. Nucleic Acids Res. 42, D1070-D1074 (2014).
30 Benenson, Y. Biomolecular computing systems: principles, progress and potential. Nat. Rev. Genet. 13, 455-468, doi:10.1038/nrg3197 (2012).
31 Coulouarn, C., Factor, V. M., Andersen, J. B., Durkin, M. E. & Thorgeirsson, S. S. Loss of miR-122 expression in liver cancer correlates with suppression of the hepatic phenotype and gain of metastatic properties. Oncogene 28, 3526-3536 (2009).
32 Jopling, C. L., Yi, M., Lancaster, A. M., Lemon, S. M. & Sarnow, P. Modulation of hepatitis C virus RNA abundance by a liver-specific microRNA. Science 309, 1577-1581 (2005).
33 Sittampalam GS, C. N., Nelson H, et al., editors. Assay Guidance Manual [Internet]. (2004-).
34 Zhang, J.-H., Chung, T. D. Y. & Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67-73 (1999).
35 Prochazka, L., Angelici, B., Haefliger, B. & Benenson, Y. Highly modular bow-tie gene circuits with programmable dynamic behaviour. Nat. Commun. 5:4729, doi:10.1038/ncomms5729 (2014).
36 Nakabayashi, H., Taketa, K., Miyano, K., Yamane, T. & Sato, J. Growth of human hepatoma cell lines with differentiated functions in chemically defined medium. Cancer Res. 42, 3858-3863 (1982).
37 Lohmann, V. et al. Replication of subgenomic hepatitis C virus RNAs in a hepatoma cell line. Science 285, 110-113 (1999).
38 Kambara, H. et al. Establishment of a novel permissive cell line for the propagation of hepatitis C virus by expression of microRNA miR-122. J. Virol. 86, 1382-1393 (2012).
39 Kutay, H. et al. Downregulation of miR-122 in the rodent and human hepatocellular carcinomas. J. Cell. Biochem. 99, 671-678 (2006).
40 Deal watch: GSK invests in targeting microRNA for the treatment of hepatitis C. Nat. Rev. Drug Discov. 9, 350-350 (2010).
41 Landgraf, P. et al. A mammalian microRNA expression atlas based on small RNA library sequencing. Cell 129, 1401-1414 (2007).
42 Diederichs, S. et al. Coexpression of Argonaute-2 enhances RNA interference toward perfect match binding sites. Proc. Natl Acad. Sci. USA 105, 9284-9289 (2008).
43 Chien, C.-H. et al. Identifying transcriptional start sites of human microRNAs based on high-throughput sequencing data. Nucleic Acids Res. 39, 9345-9356 (2011).
44 Watashi, K., Yeung, M. L., Starost, M. F., Hosmane, R. S. & Jeang, K.-T. Identification of small molecules that suppress microRNA function and reverse tumorigenesis. J. Biol. Chem. 285, 24707-24716 (2010).
45 Rogers, S., Wells, R. & Rechsteiner, M. Amino acid sequences common to rapidly degraded proteins: the PEST hypothesis. Science 234, 364-368 (1986).
46 Dantuma, N. P., Lindsten, K., Glas, R., Jellne, M. & Masucci, M. G. Short-lived green fluorescent proteins for quantifying ubiquitin/proteasome-dependent proteolysis in living cells. Nat. Biotech. 18, 538-543 (2000).
47 Mori, M. et al. Hippo signaling regulates microprocessor and links cell-density-dependent miRNA biogenesis to cancer. Cell 156, 893-906 (2014).
48 Levy, C. et al. Lineage-specific transcriptional regulation of DICER by MITF in melanocytes. Cell 141, 994-1005 (2010).
49 Friedländer, M. R., Mackowiak, S. D., Li, N., Chen, W. & Rajewsky, N. miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Res. 40, 37-52 (2012).
50 Sun, G., Li, H. & Rossi, J. J. Sequence context outside the target region influences the effectiveness of miR-223 target sites in the RhoB 3′UTR. Nucleic Acids Res. 38, 239-252 (2010).
51 Weber, W., Kramer, B. P., Fux, C., Keller, B. & Fussenegger, M. Novel promoter/transactivator configurations for macrolide- and streptogramin-responsive transgene expression in mammalian cells. J. Gene. Med. 4, 676-686 (2002).
52 Weber, W. et al. Macrolide-based transgene control in mammalian cells and mice. Nat. Biotech. 20, 901-907 (2002).
53 Donnelly, M. L. L. et al. The ‘cleavage’ activities of foot-and-mouth disease virus 2A site-directed mutants and naturally occurring ‘2A-like’ sequences. J. Gen. Virol. 82, 1027-1041 (2001).
54 Mangan, S. & Alon, U. Structure and function of the feed-forward loop network motif. Proc. Natl Acad. Sci. USA 100, 11980-11985 (2003).
55 Haley, B. & Zamore, P. D. Kinetic analysis of the RNAi enzyme complex. Nat. Struct. Mol. Biol. 11, 599-606 (2004).
56 Martinez, J. & Tuschl, T. RISC is a 5' phosphomonoester-producing RNA endonuclease. Gene Dev. 18, 975-980 (2004).
57 Brown, K. M., Chu, C. Y. & Rana, T. M. Target accessibility dictates the potency of human RISC. Nat. Struct. Mol. Biol. 12, 469-470 (2005).
58 Strovas, T. J., Rosenberg, A. B., Kuypers, B. E., Muscat, R. A. & Seelig, G. MicroRNA-based single-gene circuits buffer protein synthesis rates against perturbations. ACS Synth. Biol. (2014).
59 Weibull, W. A statistical distribution function of wide applicability. J. Appl. Mech.-T. ASME 18, 293-297 (1951).
60 Dickhaus, T. Simultaneous statistical inference - with applications in the life sciences. (Springer-Verlag Berlin Heidelberg, 2014).
61 MacDonald, M. L. et al. Identifying off-target effects and hidden phenotypes of drugs in human cells. Nat. Chem. Biol. 2, 329-337 (2006).
62 Zhang, J. D., Berntenis, N., Roth, A. & Ebeling, M. Data mining reveals a network of early-response genes as a consensus signature of drug-induced in vitro and in vivo toxicity. Pharmacogenomics J. 14, 208-216 (2014).

3 Discussion

High throughput screening is a well-established set of technologies that is ubiquitously used in industry and academia and lies at the heart of most current drug development programs. Over many years, the development of these technologies has pushed the limits of throughput and size, so that ever-larger compound libraries could be screened. Yet, half of the discovery programs still fail to deliver promising leads1, or the leads exhibit unwanted off-target effects that mostly manifest themselves as toxicity in clinical trials2. To overcome these issues, Mayr and Fuerst3, among others, suggested that HTS efforts should shift their “focus toward relevance of biological data”.

Increasing the relevance of biological data produced in high throughput screening set-ups can take many shapes. One core strategy is increasing the number of parameters measured in the first round of screening, resulting in multiplexed assays. This has two major advantages: first, compounds with a particular off-target profile can be identified and excluded early, reducing follow-up costs considerably; second, compounds with a weaker on-target effect but an advantageous target-promiscuity profile4 can be identified and further developed through lead optimization chemistry. Yet, research has shown that multiplexing comes at the cost of throughput and of a decreased number of readouts5. This can be illustrated by considering a reporter gene approach. A single reporter links to a phenomenon of interest in the cell, e.g. a fluorescent reporter reflects the activity of a certain miRNA (cf. Fig. 2.6). However, if we want to increase the number of parameters measured per run, we need to scale this reporter system by linking a second miRNA to a second reporter and so forth. As a consequence, each additional parameter requires another reporter, which needs to be designed and measured, increasing both development effort and acquisition time.
In addition, as soon as certain limits are crossed, scaling is no longer possible due to a lack of available reporters. In the case of routine microscopic measurements of fluorescent reporters, this limit is reached with four (or, using some technical tricks, up to ten) reporters6,7 (Fig. 3.1, left).

In order to increase the information on the screening compounds that are tested, different concepts have been suggested. The classic approach is serial testing, i.e. a first screen is performed only with the drug target and the hits are subsequently screened individually with slow but highly multiplexed assays. This method has proven to be very successful in the past, but with average hit rates of ~0.5%8 and libraries of 10^6 compounds, 5’000 compounds would still need to be tested in follow-up assays. By clustering these hits according to structural similarity9 and testing only a single one for each group, the number of follow-up experiments can be reduced significantly. Yet, potentially potent and/or specific compounds might be missed, or compounds with undesired properties advanced to the next step of development. Another strategy is microscopy-based high content screening. It can distinguish several fluorescent labels and, depending on the objective used, detailed attributes of the assayed cells. These include properties of the whole cell (size, mobility, granularity, etc.), of organelles (localization, size, shape, etc.), of fluorescent proteins (motility, localization, etc.) and many more. By transforming these parameters with statistical procedures, e.g. principal component analysis, new, abstract parameters can be generated10. Using positive controls as the “hit phenotype”, these parameters are linked to the desired phenotype. For example, Rämö et al.11 linked the appearance of invasomes, an actin-surrounded membrane structure, to actin-GFP intensity as a measure of the infection of cells by Bartonella henselae, in order to screen for siRNAs that inhibit this process. Nevertheless, these parameters emerge from correlations with the hit phenotype, and the target(s) are identified by the siRNA sequence. When applied to small molecules, this direct treatment-target link does not exist and subsequent target deconvolution efforts are needed.

3.1 A synthetic biology approach to HTS

In this thesis we approached the problem of measuring off-target effects in the first round of screening by using synthetic gene circuits. We apply this strategy specifically to miRNAs as drug targets, since off-target effects are particularly probable with miRNAs (cf. Section “1 Introduction”). Synthetic gene circuits have been shown to sense, integrate and compute basic Boolean logic functions on up to five siRNAs12 or six miRNAs13, as well as on synthetic14 and natural transcription factors15. A key property of these circuits is the defined integration and computational processing of the input signals. This property can be reused in a high throughput setting to measure multiple off-target effects. Compared to the standard approach, where a single target is linked to a single readout, the circuit-based approach senses several targets and integrates them in a single output (Fig. 3.1, right). For example, three off-targets feed into the same output, and these inputs are not affected when only DMSO is applied. Upon exposure to a compound, any of them could be targeted, which in turn induces a change of the output. This output change indicates that at least one of the off-targets was affected, and the respective compound can be excluded from further investigations. In brief, information on the changes in different inputs is compressed into a single output, keeping throughput high.


Figure 3.1 Comparison of standard and circuit-based screening approaches. Left: Standard approach for screening multiple drug targets. With each target a new reporter is introduced. Right: Gene circuit based approach, where additional targets are sensed and integrated, and the readout reports whether any of the inputs is affected.

Here, we implemented the multi-input screening assay for miR-122 as the drug target and a suitable set of off-target miRNAs. miR-122 was chosen for two main reasons: first, miR-122 is an attractive drug target that is involved in HCV replication and hepatocellular carcinoma (cf. Section “1.3.2 miRNA in disease”); second, several specific and non-specific small molecule modulators have been published, which could serve as positive controls for the assay.

3.2 Major findings

3.2.1 Testing of established miR-122 modulators

We first tested the published miR-122 small molecule modulators16-18 in an assay framework that was compatible with our future assays, i.e. using bidirectional fluorescent reporters (cf. Fig. 2.6). Only one out of six tested compounds led to the expected change, albeit not with the magnitude published earlier16-18. Further investigation of these compounds with luciferase reporters, the reporters used in the original publications, indicated that two of the compounds might be interacting with the luciferase assay. For the other compounds, whose activities we could not reproduce, we were unable to identify the reason for the discrepancy between the experiments with the means available. These inconsistent results for miR-122 modulators again highlight the difficulties of small molecule miRNA drug discovery. As outlined before (cf. Section “1.3.3 Challenges in miRNA drug discovery”), these difficulties are associated with properties of miRNAs and of drug discovery: (i) miRNAs are short, (ii) their chemical structures are similar to each other and (iii) standard libraries are designed to target proteins rather than RNA. As LNAs and mimics have proven to be very potent and reliable, they are good alternatives to small molecule based miRNA modulators. However, as long as the drug delivery issues with miRNA are not resolved – currently only local eye19 or lung20 administration is available – small molecules remain a valid alternative and further screens to find potent modulators are desirable.

3.2.2 Validation concept

To the best of my knowledge, this study is the first example of a screening assay in which several clearly defined input analytes are measured and compressed into a single output. This set-up led to challenges in terms of assay validation. If an assay with a single drug target is to be validated, its performance is usually assessed by applying positive controls and calculating Z’-factors. The Z’-factor is a statistical measure that combines the dynamic range with the variability of an assay21. Specifically, for an assay to be classified as “excellent”, the means of the negative and the positive controls should be separated by at least twelve times the average of their standard deviations. Since the gene circuit assay measures five different analytes, whose activities can all potentially be increased or decreased, or even balance each other, we had to introduce a scoring concept that deals with this high-dimensional space. We evaluated different options, including statistical measures of this space as well as individual cut-off values, and decided to adapt the concept of the “weakest link”, mathematically developed by Weibull in the 1950s22. In this concept, positive controls for each input are tested separately and the corresponding Z’-factors are calculated. Then, the assays are scored by comparing their worst Z’-factors (weakest link), and the assay with the best weakest link is selected. This ensures good performance over the widest possible range of inputs. The downside of this concept is that, when comparing only the weakest link, differences in the other parameters are not considered. It could therefore be that a system with a single badly performing parameter would not be chosen, even if the results for all the other parameters were excellent.
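The scoring scheme can be illustrated with a short Python sketch. It assumes the standard Z’-factor definition, Z’ = 1 - 3(SD_pos + SD_neg)/|mean_pos - mean_neg|, computed here with sample standard deviations; the function names, circuit names and example Z’ values are hypothetical and are not data from this thesis.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z'-factor: 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|.

    Assumes at least two replicates per control and distinct control means.
    """
    separation = abs(mean(positive) - mean(negative))
    return 1.0 - 3.0 * (stdev(positive) + stdev(negative)) / separation


def weakest_link(z_by_perturbation):
    """Score one assay by its worst Z'-factor across all tested perturbations."""
    return min(z_by_perturbation.values())


def select_assay(candidates):
    """Return the name of the candidate whose weakest link is best."""
    return max(candidates, key=lambda name: weakest_link(candidates[name]))


if __name__ == "__main__":
    # Hypothetical Z'-factors per validation perturbation for two candidates.
    candidates = {
        "circuit_A": {"input_1": 0.8, "input_2": 0.7, "input_3": -0.1},
        "circuit_B": {"input_1": 0.6, "input_2": 0.5, "input_3": 0.4},
    }
    print(select_assay(candidates))
```

In this made-up example the second candidate is selected although the first performs better on two of the three perturbations, which is exactly the downside of the weakest-link criterion noted above.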

3.2.3 Choice of library

After identifying the optimal screening circuit with the best overall Z’-factor performance, we screened a pilot library to prove the general concept of a gene circuit in a high throughput set-up (cf. Section “2.10 In-depth validation of complete feed forward assay”). The chosen library is the NIH Clinical Collection, which consists of 727 compounds that are either approved for human use or in late stage clinical development. This implies that all of these compounds have a biological function, which would not be the case in a randomized collection. We identified a rather large fraction of compounds that showed an effect on the systemic level of the cell (18.3/20.2 %) or on the RNAi pathway module (12.8/16.1 %). We explain this unusually high hit rate by the fact that the library consists solely of biologically active molecules and expect a “normal” (lower) hit rate with an unbiased collection. We further found ten compounds in the primary screen that showed very modest but specific changes of the drug target. However, we could not reproduce these effects in follow-up dose-response experiments; possible causes are discussed in the next section.

3.2.4 Hit selection algorithms

In order to be classified as a hit, a compound had to pass three selection criteria. First, its gene expression readout (mCitrine) had to be unaffected, i.e. all compounds with p<0.1 were excluded and all others were considered further. The second criterion tests whether a compound affects several miRNAs or the miRNA pathway itself. Again, we excluded compounds that changed the mCerulean readout significantly (p<0.1). Finally, we checked the miR-122 readout (mCherry) and applied a significance level of p<0.01. For each criterion, the well of interest is compared with the average signal of the whole plate. We applied this strategy to account for potential edge effects or transfection/plating artifacts23. For the miR-122 hits, the applied significance level is rather strict for a single experiment: only 1 % of the results should fall in this range in a random experiment. Yet, since we test a “many vs. one” set-up, in which many measurements are compared against the same control, we encounter a multiple comparison problem24, also known as the “look elsewhere effect”. By comparing the same control against many different wells, we repeat a random experiment many times. While the probability of finding a random outlier in each experiment is <0.01 (for a p value <0.01), the probability of finding no outlier in 100 trials is only $(1-0.01)^{100} \approx 0.37$, so the probability of finding at least one random outlier is about 0.63. As a consequence, the probability of finding a random outlier increases with the number of wells tested. This has implications for the hit rate and also for follow-up experiments. Various corrections or workarounds could be applied, which are outlined below in section “3.3 Future directions”.
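As a rough sketch of the three-step selection and of the multiple comparison arithmetic, the following code assumes that a p value per readout is already available for each well; how these p values are derived from the plate reference distributions is not reproduced here. The function and parameter names are hypothetical, and the independence of the individual comparisons is an assumption of the calculation.

```python
def is_specific_hit(p_citrine, p_cerulean, p_cherry,
                    alpha_exclude=0.1, alpha_hit=0.01):
    """Apply the three selection criteria in the order described in the text."""
    if p_citrine < alpha_exclude:
        # Gene expression readout (mCitrine) affected: compound excluded.
        return False
    if p_cerulean < alpha_exclude:
        # Non-specific RNAi readout (mCerulean) affected: compound excluded.
        return False
    # Specific readout (mCherry, miR-122) must change significantly.
    return p_cherry < alpha_hit


def prob_at_least_one_false_positive(alpha, n_tests):
    """Chance of one or more random outliers in n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


print(is_specific_hit(0.50, 0.40, 0.005))
print(round(prob_at_least_one_false_positive(0.01, 100), 2))
```

With a significance level of 0.01 and 100 independent comparisons, the second function returns a value of roughly 0.63, while the complementary probability of seeing no random outlier is roughly 0.37.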

3.2.5 Comparison of the assay with current standards

In order to benchmark our assay against the current gold standard, i.e. a dual reporter system where one of the reporters is equipped with a miRNA binding site18,25, we performed side-by-side experiments with three different types of perturbations, namely a mixture of LNAs, a mixture of siDicer and siDrosha, and Clobetasol, a compound that we identified in the screen as a non-specific modulator of RNAi. We could show that our assay behaves as expected and identifies the false positive hits, whereas they would remain undetected with the current gold standard. This shows that integration of multiple signals has a clear added value and is beneficial for the discovery of specific miRNA modulators.

3.2.6 Engineering a complex artificial network

Using gene circuits to sense and integrate multiple inputs for drug discovery has, as outlined above, many advantages; yet, there are some challenges associated with it. An apparent drawback of this strategy is the rather complex design and its tedious development. In order to adapt circuits to a novel task or cell line, the different components need to be optimized to work efficiently by themselves as well as in concert, and under a wide range of perturbation conditions (small molecule compounds). These small molecules might affect cellular physiology considerably. For example, a cytostatic compound could inhibit transcription or other vital cellular functions, which would also interfere with the functionality of the circuit. This poses significant challenges to the assay development process, requiring optimization of the individual circuit components as well as of the validation procedure. However, we based the approach on the synthetic biology principles of standardization and modularization in order to avoid the need for reengineering for each new application, and showed in section “2.13 Assay customization” that rewiring of the gene circuit within its tested specifications is rather straightforward and needs only limited adaptation. Yet, bigger conceptual changes or the transfer of the basic concept to different target families (see below) need extensive development and characterization efforts until the adjusted concept can serve as a platform for further applications.

Below I summarize the directions the project could take to exploit further opportunities and the aspects that need to be improved for a screen at larger scale.

3.3 Future directions

This study presented a proof of concept of a synthetic gene circuit based drug discovery assay. It laid the ground for different projects to be spun off in order to harvest the full potential of the technology. These include expansion of the drug- or off-target miRNA set, inclusion of toxicity markers, or changes to different target families such as kinases or G-protein coupled receptors (GPCRs).

3.3.1 Screening a larger, more focused library

As no novel specific compound could be identified from the NIH Clinical Collection, the natural next step is to screen a larger compound library. Since RNAs, and miRNAs in particular, are difficult targets for small molecules (cf. Section “1 Introduction”), screens of compound scaffolds that are optimized to interact with RNA could be a starting point. Nevertheless, in the library screening presented in section “2.11 Automated screening of small molecule library”, compounds were identified that showed specific activity in the first round but could not be confirmed in follow-up dose-response experiments. This indicates that further optimization of the protocol is needed before scale-up. Specifically, the multiple-comparison problem should be addressed and the currently applied transfection system of (multiple) bulk transfections that are distributed by the robot should be changed. For the latter, there are two different options that could increase accuracy and therefore decrease the rate of false positive hits. First, single-well transfections: transfecting each well independently would probably reduce the precision between repeats of the same transfection mix. Yet, the time between contact with the transfection agent and subsequent delivery to the cells becomes more uniform, which makes wells that are further separated in time more consistent. Second, by generating a stable cell line, transfection itself becomes obsolete and the main source of error is excluded.

The multiple-comparison problem can be addressed by applying statistical corrections, two of which could be applied to the current screening system. First, Bonferroni correction, where one divides the significance level applied for a single experiment (in our case p=0.01) by the number of comparisons performed. For a 96-well plate the significance threshold would therefore decrease to ~0.0001, leading to the identification of a much smaller fraction of the compounds tested and a lower hit rate. The second strategy includes not only the significance of an effect (p value) but also its magnitude (fold change compared to control). A given compound could change the miR-122 activity very weakly but significantly. Yet, this change might not translate into any effect that is biologically or clinically relevant, e.g. miR-122 is reduced by 10 %, but the remaining 90 % of miR-122 molecules are enough for HCV to replicate readily. Therefore, setting a magnitude threshold on top of the significance criterion could reduce the number of hits and increase the likelihood of discovering a potent lead compound. For miR-122, a compound could be judged as a hit when miR-122 activity is increased or decreased by at least 50 % with a p value <0.01.
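The two proposed corrections can be sketched in a few lines. This is an illustration under stated assumptions, not an implemented part of the screening pipeline: the function names are hypothetical, the fold change is assumed to be expressed relative to the control (1.0 meaning no change), and the sketch combines both corrections in one test although the text also allows the magnitude threshold to be used with the uncorrected p value.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


def passes_hit_criteria(p_value, fold_change, alpha=0.01,
                        n_comparisons=96, min_change=0.5):
    """Require a corrected significance and a minimum effect size.

    fold_change is relative to the control, so 1.5 is a 50 % increase
    and 0.5 is a 50 % decrease (an assumption of this sketch).
    """
    significant = p_value < bonferroni_threshold(alpha, n_comparisons)
    large_enough = abs(fold_change - 1.0) >= min_change
    return significant and large_enough


# For a 96-well plate, 0.01 / 96 is approximately 0.0001.
print(bonferroni_threshold(0.01, 96))
print(passes_hit_criteria(p_value=0.00005, fold_change=0.9))
```

The second call illustrates the case discussed above: a 10 % reduction is rejected even though it is statistically significant after correction.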

3.3.2 Different miRNA drug- and off-targets

In section “2.13 Assay customization” we showed that the drug target miRNA can be exchanged without reengineering the whole screening circuit. In particular, we exchanged miR-122 with miR-23 and let-7b and proved the validity of the assays either by using the full mimic/LNA based control set or by using siRNAs that were identified based on deep-sequencing data and tested with the simple reporter assay. Expanding the range of target miRNAs to other disease-relevant miRNAs provides a good starting point for further screening campaigns. Additionally, different off-target miRNAs can be incorporated, since each miRNA has its own set of relevant off-targets: miRNAs that are co-cistronically expressed with the target miRNA, other family members, or known miRNAs that are up- or downregulated by a given treatment. We proved that the exchange of lowly expressed off-targets is easily accomplished and also that an expansion of the number of off-target inputs is possible by standard procedures. This shows that the present circuit assay layout can be readily applied to various miRNA target sets.

3.3.3 More off-target markers - toxicity

Several studies showed that unexpected toxicity, which is only identified in phase I clinical trials, is a big problem in drug discovery2,26. A consortium of opinion leaders in the pharmaceutical industry hence suggested testing potential toxic effects by screening each lead compound against at least 25 key targets. These targets include various GPCRs, ion channels, enzymes, transporters and nuclear receptors. Screening each target independently against all promising hits is a serious investment in time, labor and money. By integrating these targets in a first-round screening assay, costs can be reduced significantly and the development of the best lead can be promoted from the first screening round. Using a data mining approach on deep sequencing data, Zhang et al.27 identified promoters that are induced upon toxicity. If these promoters were integrated in the screening assay developed in this thesis, they could serve as a proxy for the identified toxicity targets and the assay could therefore serve as an intermediate stage assay.

3.3.4 Different target families – kinases and GPCRs

Targeting miRNAs is very prone to off-target effects due to miRNAs’ chemistry and biology, which is why we used them as the proof-of-concept target. However, other target families also suffer from similar issues or benefit from simultaneous measurements of several parameters. The latter is the case for kinases, a very important target class in oncology drug discovery28. Considering the tyrosine kinase family alone, there are already 90 different members encoded in the human genome29. Some of them are very similar to each other and can have very similar phosphorylation targets. This makes specificity screens for novel kinase drugs an important step30. Yet, Imatinib, a very potent kinase inhibitor (marketed by Novartis as Gleevec/Glivec), was found to be promiscuous in its specificity for kinases, and this feature was found to be the reason for its potency4. An appropriately designed gene circuit could integrate the signals from several kinases and separate off-target effects from desired multiple-targeting ones. This could lead to discovery programs that find those promiscuous candidates that are more potent.

GPCRs, on the other hand, suffer from the same problem as miRNAs. Many GPCRs can be grouped into families that have similar functions, response pathways or ligands that can bind to several receptors. For example, the adrenergic receptors are a family of nine subtypes31. Various ligands for these receptors were discovered and they have a wide range of functions. Norepinephrine, the natural ligand of these receptors, can bind to all of them (with different affinities), while Procaterol specifically binds to the β2 adrenergic receptors32. Depending on the molecular basis of a disease, either a very specific or a very promiscuous ligand can be desirable. Therefore, assays that can report on these features of ligand/GPCR pairs are important novel tools33.

3.4 Closing statement

Finding specific compounds with fewer toxic side effects, or promiscuous compounds that are more potent, is a very difficult task. In this thesis I started with the vision that synthetic biology, and synthetic gene circuits in particular, could contribute to solving this problem. Taking miR-122 as a model system, I showed that this is indeed the case. I summarized the steps that are needed to apply the gene circuit to miRNA small molecule screening and the challenges that are associated with it. Furthermore, I put my findings in a broader scientific context and evaluated the promises of the technology. This work is, to the best of my knowledge, the first that applies mammalian synthetic biology to an actual biological problem. It merges the principles of gene circuit design with HTS approaches and lays the ground for novel promising technologies for small molecule screening, with a broad range of potential future applications.

References

1 Macarron, R. et al. Impact of high-throughput screening in biomedical research. Nat. Rev. Drug. Discov. 10, 188-195 (2011).
2 Kitano, H. A robustness-based approach to systems-oriented drug design. Nat. Rev. Drug Discov. 6, 202-210 (2007).
3 Mayr, L. M. & Fuerst, P. The future of High-Throughput Screening. J. Biomol. Screen. 13, 443-448 (2008).
4 Frantz, S. Drug discovery: Playing dirty. Nature 437, 942-943 (2005).
5 Feng, Y., Mitchison, T. J., Bender, A., Young, D. W. & Tallarico, J. A. Multi-parameter phenotypic profiling: using cellular effects to characterize small-molecule compounds. Nat. Rev. Drug Discov. 8, 567-578 (2009).
6 Piatkevich, K. D. & Verkhusha, V. V. Guide to red fluorescent proteins and biosensors for flow cytometry. in Methods in Cell Biology Vol. 102 431-461 (Academic Press, 2011).
7 Pepperkok, R., Squire, A., Geley, S. & Bastiaens, P. I. H. Simultaneous detection of multiple green fluorescent proteins in live cells by fluorescence lifetime imaging microscopy. Curr. Biol. 9, 269-274.
8 Morissette, S. L. et al. High-throughput crystallization: polymorphs, salts, co-crystals and solvates of pharmaceutical solids. Adv. Drug Deliver. Rev. 56, 275-300 (2004).
9 Noel, T. S., Ajit, J., Ruili, H., Trung, N. & Yuhong, W. Enabling the large-scale analysis of quantitative High-Throughput Screening data. in Handbook of Drug Screening, Second Edition Drugs and the Pharmaceutical Sciences 442-464 (CRC Press, 2009).
10 Stoeger, T., Battich, N., Herrmann, M. D., Yakimovich, Y. & Pelkmans, L. Computer vision for image-based transcriptomics. Methods 85, 44-53 (2015).
11 Rämö, P. et al. Simultaneous analysis of large-scale RNAi screens for pathogen entry. BMC Genomics 15, 1-18 (2014).
12 Rinaudo, K. et al. A universal RNAi-based logic evaluator that operates in mammalian cells. Nat. Biotechnol. 25, 795-801 (2007).
13 Xie, Z., Wroblewska, L., Prochazka, L., Weiss, R. & Benenson, Y. Multi-input RNAi-based logic circuit for identification of specific cancer cells. Science 333, 1307-1311 (2011).
14 Leisner, M., Bleris, L., Lohmueller, J., Xie, Z. & Benenson, Y. Rationally designed logic integration of regulatory signals in mammalian cells. Nat. Nanotechnol. 5, 666-670 (2010).
15 Angelici, B., Mailand, E. & Benenson, Y. Synthetic biology platform for sensing and integrating endogenous transcriptional inputs in mammalian cells. In Revision (2016).
16 Melo, S. et al. Small molecule enoxacin is a cancer-specific growth inhibitor that acts by enhancing TAR RNA-binding protein 2-mediated microRNA processing. Proc. Natl Acad. Sci. USA 108, 4394–4399 (2011).
17 Watashi, K., Yeung, M. L., Starost, M. F., Hosmane, R. S. & Jeang, K.-T. Identification of small molecules that suppress microRNA function and reverse tumorigenesis. J. Biol. Chem. 285, 24707-24716 (2010).
18 Young, D. D., Connelly, C. M., Grohmann, C. & Deiters, A. Small molecule modifiers of microRNA miR-122 function for the treatment of hepatitis C virus infection and hepatocellular carcinoma. J. Am. Chem. Soc. 132, 7976-7981 (2010).
19 Martinez, T. et al. In vitro and in vivo efficacy of SYL040012, a novel siRNA compound for treatment of glaucoma. Mol. Ther. 22, 81-91 (2014).
20 Lam, J. K.-W., Liang, W. & Chan, H.-K. Pulmonary delivery of therapeutic siRNA. Adv. Drug Deliver. Rev. 64, 1-15 (2012).
21 Zhang, J.-H., Chung, T. D. Y. & Oldenburg, K. R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67-73 (1999).
22 Weibull, W. A statistical distribution function of wide applicability. J. Appl. Mech.-T. ASME 18, 293-297 (1951).
23 Buchser, W. et al. Assay development guidelines for image-based high content screening, high content analysis and high content imaging. in Assay Guidance Manual [Internet]. (2004-).
24 Dickhaus, T. Simultaneous statistical inference - with applications in the life sciences. (Springer-Verlag Berlin Heidelberg, 2014).
25 Connelly, C. M., Thomas, M. & Deiters, A. High-throughput luciferase reporter assay for small-molecule inhibitors of microRNA function. J. Biomol. Screen. 17, 822-828 (2012).
26 Bowes, J. et al. Reducing safety-related drug attrition: the use of in vitro pharmacological profiling. Nat. Rev. Drug. Discov. 11, 909-922 (2012).
27 Zhang, J. D., Berntenis, N., Roth, A. & Ebeling, M. Data mining reveals a network of early-response genes as a consensus signature of drug-induced in vitro and in vivo toxicity. Pharmacogenomics J. 14, 208-216 (2014).
28 Zhang, J., Yang, P. L. & Gray, N. S. Targeting cancer with small molecule kinase inhibitors. Nat. Rev. Cancer 9, 28-39 (2009).
29 Robinson, D. R., Wu, Y.-M. & Lin, S.-F. The protein tyrosine kinase family of the human genome. Oncogene 19, 5548-5557 (2000).
30 Anastassiadis, T., Deacon, S. W., Devarajan, K., Ma, H. & Peterson, J. R. Comprehensive assay of kinase catalytic activity reveals features of kinase inhibitor selectivity. Nat. Biotech. 29, 1039-1045 (2011).
31 Strosberg, A. Structure, function, and regulation of adrenergic receptors. Protein Sci. 2, 1198 (1993).
32 Baker, J. G. The selectivity of β-adrenoceptor agonists at human β1-, β2- and β3-adrenoceptors. Brit. J. Pharmacol. 160, 1048-1061 (2010).
33 Kenakin, T. Functional Selectivity and Biased Receptor Signaling. J. Pharmacol. Exp. Ther. 336, 296-302 (2011).


Appendix

Supplementary figures

Supplementary Figure 1 Analysis of data distributions for screen 1. Different assay readouts corresponding to triplicate assay plates of the same compound plate are pooled together to build the histograms, which are further fitted to a normal distribution. The pooled readouts are used as reference distributions for hit identification (see section “2.2 Materials and methods”). Readouts and the compound storage plates are indicated. All the reference distributions are close to normal, except for mCerulean/mCitrine for plate09. Yet, most of these points are excluded based on mCitrine data. Transfections are described in Supplementary Table 33.


Supplementary Figure 2 Analysis of data distributions for screen 2. Different assay readouts corresponding to triplicate assay plates of the same compound plate are pooled together to build the histograms, which are further fitted to a normal distribution. The pooled readouts are used as reference distributions for hit identification (see section “2.2 Materials and methods”). Readouts as well as the compound storage plates are indicated. All of the reference distributions are close to normal, except for mCitrine for plate01. Transfections are described in Supplementary Table 33.


Supplementary Figure 3 Chemical structures of gene expression module hits. (a)-(j) Chemical structures of compounds excluded based on the gene expression module (mCitrine) that were followed up in dose-response experiments. Numbers in brackets represent fold changes compared to plate mean averaged for the two screening runs.


Supplementary Figure 4 Chemical structures of RNAi module hits. (a)-(h) Chemical structures of compounds excluded based on the non-specific RNAi module readout (normalized mCerulean) that were followed up in dose-response experiments. Numbers in brackets represent fold changes compared to plate mean averaged for the two screening runs.


Supplementary Figure 5 Chemical structures of specific module hits. (a)-(g) Chemical structures of compounds classified as specific hits (normalized mCherry). Numbers in brackets represent fold changes compared to plate mean averaged for the two screening runs.

Supplementary Tables

Supplementary Table 1 Plasmid construction

Ubi-Nos (pDT7007, Junk-DNA): Described in Xie et al.1
pcDNA3.1-IFP1.4: Described in Shu et al.2
CMV-PpLuc (pZ003): Described in Xie et al.3
CMV-RrLuc (pZ005): Described in Xie et al.3
CMV-Neomycin-miR-30 backbone-miR-FF3-miR-30 backbone (pZ037): Described in Leisner et al.4
AmCyan-TRE-DsRed-T21x4 (pZ072): Described in Xie et al.1
AmCyan-TRE-DsRed-TFF5x4 (pZ073): Described in Xie et al.1
CMV-rtTA-T21x4 (pZ090): Described in Xie et al.1
CMV-rtTA-TFF5x4 (pZ091): Described in Xie et al.1
TRE-LacI-TFF5x4 (pZ094): Described in Xie et al.1
AmCyan-TRE-DsRed-T141x4 (pZ116): Described in Xie et al.1
AmCyan-TRE-DsRed-T142-3px4 (pZ117): Described in Xie et al.1
AmCyan-TRE-DsRed-T146ax4 (pZ118): Described in Xie et al.1
AmCyan-TRE-DsRed-T17x4 (pZ145): Described in Xie et al.1
AmCyan-TRE-DsRed-T30ax4 (pZ146a): Described in Xie et al.1
TRE-LacI-T21x4-miR-FF4 (pZ224): Described in Xie et al.1
TRE-LacI-TFF5x4-miR-FF4 (pZ225): Described in Xie et al.1
CMV-IFP1.4 (pZ210): Commercial plasmid from Clontech #632441 was digested with NheI and XhoI and ligated with NheI and XhoI digested pcDNA3.1-IFP1.4.
CMV-PIT2 (pMF206): Provided by Fussenegger Lab, described in Weber et al.5
ERE-minCMV-SEAP (pBP013): Provided by Fussenegger Lab, described in Weber et al.6
PRE-minCMV-SEAP (pBP031): Provided by Fussenegger Lab, described in Weber et al.5
CMV-iRFP (pCS0012): Plasmid from Addgene #31857, deposited by Vladislav Verkhusha, described in Filonov et al.7
Ef1α-mCerulean (pKH024): Described in Prochazka et al.8
Ef1α-mCitrine (pKH025): Described in Prochazka et al.8
Ef1α-mCherry (pKH026): Described in Prochazka et al.8
MCS-TRE tight BI-MCS (pIM001): Commercial plasmid from Clontech #631068.

MCS-TRE-mCherry (pIM003): mCherry was PCR amplified using primers PR0534 and PR0535 from pKH026, digested with KpnI and MluI and ligated with KpnI and MluI digested pIM001.
mCerulean-TRE-mCherry (pIM002): mCerulean was PCR amplified using primers PR0522 and PR0563 from pKH024, digested using NdeI and EcoRI and ligated with NdeI and EcoRI digested pIM003.
mCerulean-TRE-MCS (pIM015): mCerulean was extracted from pIM003 by KpnI and MluI and was ligated with KpnI and MluI digested pIM001.
TRE-LacI-TFF4x4 (pBA007): pZ094 was digested using NotI and SalI and was ligated with annealed oligos PR1093 and PR1094, coding for 4 repeats of inverse complement sequence of miR-FF4.
ETR-2x-Hnf1-AmCyan-^miR-FF4^-2A-ET (pBA026): Described in Angelici et al. (In preparation)
PIR-2x-Hnf1-AmCyan-^miR-FF4^-2A-PIT2 (pBA065): Described in Angelici et al. (In preparation)
CMV-tTA (pBA166): Commercial plasmid from Clontech #631069.
CMV-DsRed-Express-PEST (pNL69): Commercial plasmid from Clontech #632430 was digested using BglII and EcoRI and ligated with BglII and EcoRI digested PCR product of primers PR0913 and PR0914 on annealed oligos PR0876 and PR0877.
AmCyan-TRE-DsRed-T145x4 (pBH0008): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0077 and PR0078, coding for 4 repeats of inverse complement sequence of miR-145.
AmCyan-TRE-DsRed-T24x4 (pBH0010): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0081 and PR0082, coding for 4 repeats of inverse complement sequence of miR-24.
AmCyan-TRE-DsRed-T375x4 (pBH0012): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0085 and PR0086, coding for 4 repeats of inverse complement sequence of miR-375.
AmCyan-TRE-DsRed-T196ax4 (pBH0014): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0089 and PR0090, coding for 4 repeats of inverse complement sequence of miR-196a.
CMV-DsRed-Express (pBH0015): Commercial plasmid from Clontech #632430.
CMV-ZsYellow (pBH0016): Commercial plasmid from Clontech #632445.
CMV-AmCyan (pBH0017): Commercial plasmid from Clontech #632441.
AmCyan-TRE-DsRed-T200cx4 (pBH0019): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0107 and PR0108, coding for 4 repeats of inverse complement sequence of miR-200c.

AmCyan-TRE-DsRed-T16x4 (pBH0020): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0109 and PR0110, coding for 4 repeats of inverse complement sequence of miR-16.
AmCyan-TRE-DsRed-Tlet7bx4 (pBH0021): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0111 and PR0112, coding for 4 repeats of inverse complement sequence of let-7b.
AmCyan-TRE-DsRed-T200cx4 (pBH0022): pZ072 was digested using NotI and SalI and was ligated with annealed oligos PR0113 and PR0114, coding for 4 repeats of inverse complement sequence of miR-23b.
CMV-Neo-miR-30 Stem loop-miR-145 coding sequence (pBH0024): pZ037 was digested using XhoI and EcoRI and ligated with annealed oligos PR0101 and PR0102, coding for a fully complementary lower miRNA stem, the miR-145 coding sequence and a miR-30 pPRIME9 based loop sequence.
mCerulean-TRE-mCherry-Spacer (pBH0074): pBH0016 was digested using NotI and SalI to extract an 800 bp long Spacer. pIM002 was digested using NotI and SalI and was ligated with the 800 bp Spacer.
mCerulean-TRE-mCherry-T130ax4 (pBH0075): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0626 and PR0627, coding for 4 repeats of inverse complement sequence of miR-130a.
mCerulean-TRE-mCherry-T27bx4 (pBH0076): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0628 and PR0629, coding for 4 repeats of inverse complement sequence of miR-27b.
mCerulean-TRE-mCherry-T125ax4 (pBH0077): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0630 and PR0631, coding for 4 repeats of inverse complement sequence of miR-125a.
mCerulean-TRE-mCherry-T18ax4 (pBH0078): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0632 and PR0633, coding for 4 repeats of inverse complement sequence of miR-18a.
mCerulean-TRE-mCherry-T7x4 (pBH0079): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0634 and PR0635, coding for 4 repeats of inverse complement sequence of miR-7.
mCerulean-TRE-mCherry-T20ax4 (pBH0080): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0636 and PR0637, coding for 4 repeats of inverse complement sequence of miR-20a.
mCerulean-TRE-mCherry-T21x4 (pBH0081): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pZ072, coding for 4 repeats of inverse complement sequence of miR-21.

mCerulean-TRE-mCherry-T21x4 (pBH0082): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pBH0021, coding for 4 repeats of inverse complement sequence of let-7b.
mCerulean-TRE-mCherry-T16x4 (pBH0083): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pBH0020, coding for 4 repeats of inverse complement sequence of miR-16.
mCerulean-TRE-mCherry-T24x4 (pBH0084): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pBH0010, coding for 4 repeats of inverse complement sequence of miR-24.
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pZ091, coding for 4 repeats of FF5 sequence.
mCerulean-PRE-mCherry-800 bp Spacer (pBH0107): The backbone encoding mCherry-Spacer-BB-mCerulean was amplified from pBH0074 using PR0646 and PR0647. PRE-Spacer-minCMV was amplified from pBP031 using primers PR0648 and PR0649. Spacer-minCMV was amplified from pBP031 using primers PR0650 and PR0651. All fragments were digested using BspQI and batch ligated to result in mCerulean-minCMV-Spacer-PRE-Spacer-minCMV-mCherry-Backbone.
mCerulean-ERE-mCherry-800 bp Spacer (pBH0111): The backbone encoding mCherry-Spacer-BB-mCerulean was amplified from pBH0074 using PR0646 and PR0647. ERE-Spacer-minCMV was amplified from pBP013 using primers PR0648 and PR0649. Spacer-minCMV was created by annealing PR0644 and PR0645. All PCR products were digested using BspQI and batch ligated together with the annealing product to result in mCerulean-minCMV-Spacer-ERE-Spacer-minCMV-mCherry-Backbone.
mCerulean-TRE-mCherry-T122x4 (pBH0112): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0720 and PR0721, coding for 4 repeats of inverse complement sequence of miR-122.
mCerulean-TRE-mCherry-T122x1 (pBH0145): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR0775 and PR0776, coding for 1 repeat of inverse complement sequence of miR-122.
CMV-Exportin-5 (pBH0151): Plasmid from Addgene #12552, deposited by Ian Macara, described in Brownawell et al.10
CAGop-ET-2A-Citrine-MCS (pBH0152): A Gibson assembly was performed with the following fragments: pZ166 digested using AflII and AgeI, mCitrine PCR amplified from pKH025 using primers PR0784 and PR0789, and ET PCR amplified from pBA026 using primers PR0782 and PR0783; the MCS-pA was introduced by gBlock0013.

CAGop-Citrine-2A-PIT2 (pBH0153): A Gibson assembly was performed with the following fragments: pZ166 digested using AflII and AgeI; mCitrine PCR amplified from pKH025 using primers PR0778 and PR0779; PIT2 PCR amplified from pBA065 using primers PR0780 and PR0781; and the MCS-pA, introduced by gBlock0013.

mCerulean-TRE-RrLuc (pBH0156): RrLuc was PCR amplified from pZ005 using primers PR0825 and PR0826 and digested using NotI and NheI. This product was ligated with EagI and NheI digested pIM015.

PpLuc-TRE-RrLuc (pBH0157): PpLuc was PCR amplified from pZ003 using primers PR0827 and PR0828 and digested using BsaI and BglII. This product was ligated with EcoRI and BglII digested pBH0156.

PpLuc-TRE-RrLuc-TFF5x4 (pBH0159): pBH0157 was digested using NotI and SalI and ligated with NotI and SalI digested PCR product of primers PR0673 and PR0674 on pBH0145.

PpLuc-TRE-RrLuc-T122x1 (pBH0160): pBH0157 was digested using NotI and SalI and ligated with NotI and SalI digested PCR product of primers PR0673 and PR0674 on pBH0091.

PpLuc-TRE-RrLuc-T122x4 (pBH0161): T122x4 was PCR amplified from pBH0122 and digested using NotI and SalI. This was ligated with NotI and SalI digested pBH0157.

CMV-Ubiquitin x4-ZsYellow (pBH0173): Four ubiquitin coding fragments were produced as follows: gBlock0014 was PCR amplified using primers PR0932 and PR0933 and digested with XbaI. gBlock0014 was PCR amplified using primers PR0934 and PR0935 and digested with AvrII and EcoRI. gBlock0014 was PCR amplified using primers PR0936 and PR0937 and digested with MfeI and PstI. gBlock0014 was PCR amplified using primers PR0938 and PR0939 and digested with NsiI. These four fragments were ligated and digested simultaneously with XbaI, AvrII, EcoRI, PstI, MfeI and NsiI in NEBuffer2 supplemented with ATP at 25 °C (ref. 11). The largest ligation product was purified from the gel and digested using SacI and BamHI to produce four repeats of ubiquitin. This fragment was then ligated into SacI and BamHI digested pBH0016.

CMV-Neo-miR-30 Stem loop-anti-DGCR8 miRNA (pBH0175): pBH0024 was digested using XhoI and EcoRI and ligated with annealed oligos PR0947 and PR0948 encoding a DGCR8-targeting siRNA described in Chien et al.12

mCerulean-ERE-mCherry-T122x4 (pBH0177): T122x4 was obtained by digesting pBH0112 using NotI and SalI and was ligated into NotI and SalI digested pBH0111.

CAGop-ET-2A-Citrine-MCS, killed XhoI site in ET (pBH0178): To mutate the XhoI site within the coding region of ET, oligos PR0782 and PR0994 were annealed and filled in using Phusion polymerase. This part was blunt-end ligated with the PCR product of PR0675 and PR0993 on pBH0152 and digested with AgeI and KpnI. The 986 bp band was purified from the gel and ligated with the AgeI and KpnI digested pBH0152.

CAGop-Citrine-2A-PIT2-MCS, killed XhoI site in PIT2 (pBH0179): To mutate the XhoI site within the coding region of PIT2, two fragments were PCR amplified from pBH0153, one using primers PR0358 and PR0992 and one using primers PR0793 and PR0091. These two fragments were blunt-end ligated and digested using BglII and BspEI. The 593 bp band was gel extracted and ligated with BglII and BspEI digested pBH0153.

CAGop-ET-2A-Citrine-TFF4x4 (pBH0180): TFF4x4 was extracted from pBA007 by NotI and SalI digestion and ligated with NotI and PspXI digested pBH0178.

CAGop-Citrine-2A-PIT2-TFF4x4 (pBH0181): TFF4x4 was extracted from pBA007 by NotI and SalI digestion and ligated with NotI and PspXI digested pBH0179.

mCerulean-PRE-mCherry-T122x4 (pBH0182): T122x4 was obtained by digesting pBH0112 using NotI and SalI and was ligated into NotI and SalI digested pBH0107.

TRE-Ubiquitinx4-mCherry (pBH0183): pIM003 was digested using KpnI, dephosphorylated and ligated with BstXI digested PCR product of primers PR0940 and PR0941 on pBH0173.

TRE-Ubiquitinx4-mCherry-PEST (pBH0184): pBH0183 was digested using BsrGI and HindIII and ligated with BsrGI and HindIII digested PCR product of PR0995 and PR0996 on pNL_69.

mCerulean-TRE-mCherry-T145x4 (pBH0193): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pBH0008, coding for 4 repeats of the inverse complement sequence of miR-145.

mCerulean-TRE-mCherry-T375x4 (pBH0194): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pBH0012, coding for 4 repeats of the inverse complement sequence of miR-375.

mCerulean-TRE-mCherry-T146ax4 (pBH0195): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pZ118, coding for 4 repeats of the inverse complement sequence of miR-146a.
β-actin-op-ET-2A-Citrine-TFF4x4 (pBH0197): A MCS in the backbone of pBH0180 and its neighboring CMV early enhancer were deleted by SnaBI digestion and subsequent self-ligation.

β-actin-op-Citrine-2A-PIT2-TFF4x4 (pBH0198): A MCS in the backbone of pBH0181 and its neighboring CMV early enhancer were deleted by SnaBI digestion and subsequent self-ligation.

β-actin-op-ET-2A-Citrine-TFF4x4-T145x4 (pBH0199): T145x4 was PCR amplified from pBH0193 using primers PR0673 and PR0674, digested using MluI and SalI and ligated with AscI and BsmBI digested pBH0197.

β-actin-op-Citrine-2A-PIT2-TFF4x4-T145x4 (pBH0200): T145x4 was PCR amplified from pBH0193 using primers PR0673 and PR0674, digested using MluI and SalI and ligated with AscI and BsmBI digested pBH0198.

CMV-rtTA-T20ax4-T130ax4 (pBH0201): NotI-T20ax4-SgrDI-AarI-T130a-AarI-HindIII was obtained by annealing oligo PR1069 with PR1070 and PR1071 with PR1072, followed by ligation and gel purification. This fragment was then ligated with NotI and HindIII digested pZ090.

TRE-LacI-T20ax4-T130ax4-miR-FF4 (pBH0202): The target insert was obtained as for pBH0201 and ligated with NotI and HindIII digested pZ224.

β-actin-op-ET-2A-Citrine-TFF4x4-T375x4-T145x4 (pBH0203): T375x4 was PCR amplified from pBH0194 using primers PR0673 and PR0674, digested using MluI and SalI and ligated with PaeR7I and MluI digested pBH0201.

β-actin-op-Citrine-2A-PIT2-TFF4x4-T375x4-T145x4 (pBH0204): T375x4 was PCR amplified from pBH0194 using primers PR0673 and PR0674, digested using MluI and SalI and ligated with PaeR7I and MluI digested pBH0200.

β-actin-op-ET-2A-Citrine-TFF4x4-T146ax4-T375x4-T145x4 (pBH0206): T146ax4 was PCR amplified from pBH0195 using primers PR0673 and PR0674, digested using NheI and SalI and ligated with SpeI and XhoI digested pBH0203.

β-actin-op-ET-2A-Citrine-TFF4x4-T146ax4-T375x4-T145x4 (pBH0207): T146ax4 was PCR amplified from pBH0195 using primers PR0673 and PR0674, digested using NheI and SalI and ligated with SpeI and XhoI digested pBH0204.

CMV-rtTA-T20ax4 (pBH0211): pBH0201 was digested using AarI, treated with Klenow Large Fragment and self-ligated.

TRE-LacI-T20ax4-miR-FF4 (pBH0212): pBH0202 was digested using AarI, treated with Klenow Large Fragment and self-ligated.
CMV-rtTA-T130ax4 (pBH0213): pBH0201 was digested using SgrDI and NotI, treated with Klenow Large Fragment and self-ligated.

CAGop-ET-2A-Citrine-TFF4x4-T146ax4-T375x4-T145x4 (pBH0219): ET-2A-Citrine-Tbox was extracted from pBH0206 by digesting with XbaI and BsmBI and ligated with XbaI and BsmBI digested pBH0180.

CAGop-Citrine-2A-PIT2-TFF4x4-T146ax4-T375x4-T145x4 (pBH0220): To obtain Citrine-2A-PIT2-Tbox, pBH0207 was digested using XbaI and BsmBI. This fragment was ligated with pBH0180, likewise digested using XbaI and BsmBI.

CMV-rtTA-T20ax1 (pBH0225): pBH0201 was digested using NotI and SalI and ligated with annealed oligos PR1277 and PR1278, coding for 1 repeat of the inverse complement sequence of miR-20a.

CMV-rtTA-T20ax2 (pBH0226): pBH0201 was digested using NotI and SalI and ligated with annealed oligos PR1279 and PR1280, coding for 2 repeats of the inverse complement sequence of miR-20a.

CAGop-ET-TFF4x4-T146ax4-T375x4-T145x4 (pBH0229): ET was extracted from pBH0219 by PCR amplification with primers PR0830 and PR1307. This fragment was digested using AgeI and BsrGI and ligated with pBH0219 digested by AgeI and BsrGI.

CAGop-PIT2-TFF4x4-T146ax4-T375x4-T145x4 (pBH0230): PIT2-Target-box was extracted from pBH0220 by PCR amplification with primers PR0793 and PR1308. This fragment was digested using AgeI and BsmBI and ligated with pBH0220 digested by AgeI and BsmBI.

CAGop-ET-TFF4x4-T146ax4-T141x4 (pBH0231): pBH0219 was digested using MluI and BsmBI and ligated with annealed oligos PR1320 and PR1321, which code for 4 repeats of the inverse complement sequence of miR-141.

CAGop-Citrine-2A-PIT2-TFF4x4-T146ax4-T141x4 (pBH0232): pBH0220 was digested using MluI and BsmBI and ligated with annealed oligos PR1320 and PR1321, which code for 4 repeats of the inverse complement sequence of miR-141.

CAGop-PIT2-TFF4x4-T146ax4-T141x4 (pBH0234): pBH0230 was digested using MluI and BsmBI and ligated with annealed oligos PR1320 and PR1321, which code for 4 repeats of the inverse complement sequence of miR-141.

CAGop-Citrine-TFF4x4-T146ax4-T375x4-T145x4 (pBH0235): To remove PIT2 or ET and obtain a Citrine-only construct, pBH0229 and pBH0232 were digested using XbaI and BsrGI; with the appropriate choice of insert and backbone, this yields the desired product.

T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0246): T141x4-T146ax4 was extracted from pBH0231 by PCR using primers PR1442 and PR1443. This fragment was digested using PspOMI and XbaI and ligated with pBH0182 cut with the same restriction enzymes.
CAGop-ZsYellow-TFF4x4-T146ax4-T375x4-T145x4 (pBH0247): ZsYellow was extracted from pBH0016 by PCR amplification using primers PR1125 and PR1451 and digested using XmaI and BsaI. This fragment was ligated with pBH0235 digested with AgeI and BsrGI.

CAGop-ET-2A-Citrine-T146ax4-T375x4-T145x4-TFF4x4 (pBH0249): T146ax4-T375x4-T145x4 was extracted from pBH0229 by PCR using primers PR1442 and PR1492 and digested with BsaI. The product was then ligated with annealed oligos PR1493 and PR1494, coding for 4 repeats of the inverse complement sequence of miR-FF4, to result in T146ax4-T375x4-T145x4-TFF4x4. This was digested with HindIII and ligated with HindIII and BsmBI digested pBH0219.

CAGop-ET-2A-Citrine-T146ax4-T141x4-TFF4x4 (pBH0250): T141x4-T146ax4 was extracted from pBH0231 by PCR using primers PR1442 and PR1492 and digested with BsaI. The product was then ligated with annealed oligos PR1493 and PR1494, coding for 4 repeats of the inverse complement sequence of miR-FF4, to result in T141x4-T146ax4-TFF4x4. This was digested with HindIII and ligated with HindIII and BsmBI digested pBH0219.

CAGop-ET-T146ax4-T141x4-TFF4x4 (pBH0252): Target box T141x4-T146ax4-TFF4x4 was constructed as for pBH0250, but ligated with HindIII and BsmBI digested pBH0229.

CAGop-Citrine-2A-PIT2-T146ax4-T141x4-TFF4x4 (pBH0254): Target box T141x4-T146ax4-TFF4x4 was constructed as for pBH0250, but ligated with HindIII and BsmBI digested pBH0232.

CAGop-PIT-T146ax4-T375x4-T145x4-TFF4x4 (pBH0255): Target box T146ax4-T375x4-T145x4 was constructed as for pBH0249, but ligated with HindIII and BsmBI digested pBH0234.

CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256): Target box T141x4-T146ax4-TFF4x4 was constructed as for pBH0250, but ligated with HindIII and BsmBI digested pBH0234.

CAGop-ZsYellow-T146ax4-T141x4-TFF4x4 (pBH0260): Target box T141x4-T146ax4-TFF4x4 was constructed as for pBH0250, but ligated with HindIII and BsmBI digested pBH0247.

TFF4x4-T145x4-T375x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0263): TFF4x4-T145x4-T375x4-T146ax4 was extracted from pBH0249 by PCR using primers PR1441 and PR1443. This fragment was digested using PspOMI and XbaI and ligated with pBH0182 cut with the same restriction enzymes.

TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264): TFF4x4-T141x4-T146ax4 was extracted from pBH0250 by PCR using primers PR1441 and PR1443. This fragment was digested using PspOMI and XbaI and ligated with pBH0182 cut with the same restriction enzymes.
Lac-op free Junk-DNA Ubi-Nos (pBH265): MauBI and BspQI flank the lacO sites in pDT7004. These enzymes were used to digest the plasmid before Klenow Large Fragment treatment and self-ligation.

mCerulean-TRE-mCherry-TFF4x4 (pBH0266): pBH0074 was digested using NotI and SalI and was ligated with annealed oligos PR1493 and PR1494, coding for 4 repeats of the inverse complement sequence of miR-FF4.

mCerulean-TRE-mCherry-T141x4 (pBH0267): pBH0074 was digested using NotI and SalI and was ligated with NotI and SalI digested pZ116, coding for 4 repeats of the inverse complement sequence of miR-141.

mCerulean-TRE-mCherry-23bx4 (pBH0273): pBH0074 was digested with NotI and SalI and ligated with NotI and SalI digested PCR product of primers PR0673 and PR0674 on pBH0022.

TRE-Ubiquitinx4-mCherry-PEST-T122x4 (pBH0277): pBH0184 was digested using NotI and EcoRV and ligated with NotI and EcoRV digested PCR product of primers PR0673 and PR0674 on pBH0112.

TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T23bx4 (pBH0278): pBH0264 was digested with NheI and SalI and ligated with NheI and SalI digested pBH0273.

mCerulean-TRE-Ubx4-mCherry-T122x1 (pBH0279): pBH0145 was digested using NotI and XhoI and ligated with NotI and XhoI digested pBH0183.

mCerulean-TRE-Ubx4-mCherry-PEST-T122x1 (pBH0280): pBH0145 was digested using NotI and XhoI and ligated with NotI and XhoI digested pBH0184.

mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): pBH0279 was digested using NotI and AseI and ligated with NotI and AseI digested pBH0277.

mCerulean-TRE-Ubx4-mCherry-PEST-T122x4 (pBH0282): pBH0280 was digested using NotI and AseI and ligated with NotI and AseI digested pBH0277.

mCerulean-TRE-Ubx4-mCherry-PEST-TFF5x4 (pBH0284): pBH0282 was digested using NotI and AseI and ligated with NotI and AseI digested pBH0091.

mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286): pBH0282 was digested using AgeI and NheI, treated with T4 DNA polymerase and self-ligated.

mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287): pBH0284 was digested using AgeI and NheI, treated with T4 DNA polymerase and self-ligated.

TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-Tlet7bx4 (pBH0288): pBH0264 was digested with BsaI and NheI and ligated with BsaI and NheI digested pBH0082.
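Several constructs above receive their miRNA target sites as annealed oligo pairs. The following Python sketch illustrates how such a pair could be derived from a target site. It is not the procedure used in this work: the `GCAAA` spacer, the padding base and the overhangs are assumptions read off the listed oligos PR1277/PR1278 and PR0636/PR0637, and the names `target_oligos`, `THREE_PRIME_END` and `MIR20A_SITE` are hypothetical.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Assumed 3' end designs: (padding base on the top strand,
# 5' overhang of the bottom strand). Inferred from the listed oligos.
THREE_PRIME_END = {
    "SalI": ("G", "TCGA"),
    "HindIII": ("A", "AGCT"),
}


def target_oligos(site, repeats, three_prime="SalI"):
    """Build a top/bottom oligo pair carrying `repeats` copies of `site`.

    The top strand starts with a NotI-compatible GGCC overhang followed by
    an assumed GCAAA spacer; the bottom strand is the reverse complement of
    the double-stranded part plus the chosen 3' overhang.
    """
    pad, overhang = THREE_PRIME_END[three_prime]
    core = "GCAAA" + site * repeats + pad   # double-stranded part
    top = "GGCC" + core
    bottom = overhang + revcomp(core)
    return top, bottom


# Target site (inverse complement of miR-20a) as it appears in PR1277.
MIR20A_SITE = "CTACCTGCACTATAAGCACTTTA"

top_x1, bottom_x1 = target_oligos(MIR20A_SITE, 1, "HindIII")
top_x4, bottom_x4 = target_oligos(MIR20A_SITE, 4, "SalI")
print(top_x1)
print(bottom_x1)
```

Under these assumptions the single-repeat pair reproduces the listed sequences of PR1277 and PR1278, and the four-repeat pair has the same layout as PR0636 and PR0637. Other oligos in the primer table use different spacers, so the sketch should not be read as a general rule.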

Supplementary Table 2 Primers
PR0077 GGCCGCAAAAGGGATTCCTGGGAAAACTGGACAGGGATTCCTGGGAAAACTGGACAGGGATTCCTGGGAAAACTGGACAGGGATTCCTGGGAAAACTGGACG
PR0078 TCGACGTCCAGTTTTCCCAGGAATCCCTGTCCAGTTTTCCCAGGAATCCCTGTCCAGTTTTCCCAGGAATCCCTGTCCAGTTTTCCCAGGAATCCCTTTTGC
PR0081 GGCCGCAAACTGTTCCTGCTGAACTGAGCCACTGTTCCTGCTGAACTGAGCCACTGTTCCTGCTGAACTGAGCCACTGTTCCTGCTGAACTGAGCCAG
PR0082 TCGACTGGCTCAGTTCAGCAGGAACAGTGGCTCAGTTCAGCAGGAACAGTGGCTCAGTTCAGCAGGAACAGTGGCTCAGTTCAGCAGGAACAGTTTGC
PR0085 GGCCGCAAATCACGCGAGCCGAACGAACAAATCACGCGAGCCGAACGAACAAATCACGCGAGCCGAACGAACAAATCACGCGAGCCGAACGAACAAAG
PR0086 TCGACTTTGTTCGTTCGGCTCGCGTGATTTGTTCGTTCGGCTCGCGTGATTTGTTCGTTCGGCTCGCGTGATTTGTTCGTTCGGCTCGCGTGATTTGC
PR0089 GGCCGCAAACCCAACAACATGAAACTACCTACCCAACAACATGAAACTACCTACCCAACAACATGAAACTACCTACCCAACAACATGAAACTACCTAG
PR0090 TCGACTAGGTAGTTTCATGTTGTTGGGTAGGTAGTTTCATGTTGTTGGGTAGGTAGTTTCATGTTGTTGGGTAGGTAGTTTCATGTTGTTGGGTTTGC
PR0091 TGAAGGGCGAGATCCACA
PR0101 TCGAGGAGCATTCCTGGTCCAGTTTTCCCAGGAATCCCTTAGTAAGAGGGCAACCTTAAGGGATTCCTATGAAAACTGAATCAGGAGTGTTTG
PR0102 AATTCAAACACTCCTGATTCAGTTTTCATAGGAATCCCTTAAGGTTGCCCTCTTACTAAGGGATTCCTGGGAAAACTGGACCAGGAATGCTCC
PR0107 GGCCGCAAATCCATCATTACCCGGCAGTATTATCCATCATTACCCGGCAGTATTATCCATCATTACCCGGCAGTATTATCCATCATTACCCGGCAGTATTAG

PR0108 TCGACTAATACTGCCGGGTAATGATGGATAATACTGCCGGGTAATGATGGATAATACTGCCGGGTAATGATGGATAATACTGCCGGGTAATGATGGATTTGC
PR0109 GGCCGCAAACGCCAATATTTACGTGCTGCTACGCCAATATTTACGTGCTGCTACGCCAATATTTACGTGCTGCTACGCCAATATTTACGTGCTGCTAG
PR0110 TCGACTAGCAGCACGTAAATATTGGCGTAGCAGCACGTAAATATTGGCGTAGCAGCACGTAAATATTGGCGTAGCAGCACGTAAATATTGGCGTTTGC
PR0111 GGCCGCAAAAACCACACAACCTACTACCTCAAACCACACAACCTACTACCTCAAACCACACAACCTACTACCTCAAACCACACAACCTACTACCTCAG
PR0112 TCGACTGAGGTAGTAGGTTGTGTGGTTTGAGGTAGTAGGTTGTGTGGTTTGAGGTAGTAGGTTGTGTGGTTTGAGGTAGTAGGTTGTGTGGTTTTTGC
PR0113 GGCCGCAAAGGTAATCCCTGGCAATGTGATGGTAATCCCTGGCAATGTGATGGTAATCCCTGGCAATGTGATGGTAATCCCTGGCAATGTGATG
PR0114 TCGACATCACATTGCCAGGGATTACCATCACATTGCCAGGGATTACCATCACATTGCCAGGGATTACCATCACATTGCCAGGGATTACCTTTGC
PR0358 AAGGAGGACGGCAACATCCTG
PR0522 CGGCCATATGTTACTTGTACAGCTCGTCCATG
PR0534 CAATGCTAGCGCCTCAGACAGTGGTTCAAAG
PR0535 CAGCCACCACCTTCTGATAGG
PR0563 GTACAGAATTCGCCACCATGGTGAGCAAGG

PR0626 GGCCGCAAAATGCCCTTTTAACATTGCACTGATGCCCTTTTAACATTGCACTGATGCCCTTTTAACATTGCACTGATGCCCTTTTAACATTGCACTGG
PR0627 TCGACCAGTGCAATGTTAAAAGGGCATCAGTGCAATGTTAAAAGGGCATCAGTGCAATGTTAAAAGGGCATCAGTGCAATGTTAAAAGGGCATTTTGC
PR0628 GGCCGCAAAGCAGAACTTAGCCACTGTGAAGCAGAACTTAGCCACTGTGAAGCAGAACTTAGCCACTGTGAAGCAGAACTTAGCCACTGTGAAG
PR0629 TCGACTTCACAGTGGCTAAGTTCTGCTTCACAGTGGCTAAGTTCTGCTTCACAGTGGCTAAGTTCTGCTTCACAGTGGCTAAGTTCTGCTTTGC
PR0630 GGCCGCAAATCACAGGTTAAAGGGTCTCAGGGATCACAGGTTAAAGGGTCTCAGGGATCACAGGTTAAAGGGTCTCAGGGATCACAGGTTAAAGGGTCTCAGGGAG
PR0631 TCGACTCCCTGAGACCCTTTAACCTGTGATCCCTGAGACCCTTTAACCTGTGATCCCTGAGACCCTTTAACCTGTGATCCCTGAGACCCTTTAACCTGTGATTTGC
PR0632 GGCCGCAAACTATCTGCACTAGATGCACCTTACTATCTGCACTAGATGCACCTTACTATCTGCACTAGATGCACCTTACTATCTGCACTAGATGCACCTTAG
PR0633 TCGACTAAGGTGCATCTAGTGCAGATAGTAAGGTGCATCTAGTGCAGATAGTAAGGTGCATCTAGTGCAGATAGTAAGGTGCATCTAGTGCAGATAGTTTGC
PR0634 GGCCGCAAAACAACAAAATCACTAGTCTTCCAACAACAAAATCACTAGTCTTCCAACAACAAAATCACTAGTCTTCCAACAACAAAATCACTAGTCTTCCAG
PR0635 TCGACTGGAAGACTAGTGATTTTGTTGTTGGAAGACTAGTGATTTTGTTGTTGGAAGACTAGTGATTTTGTTGTTGGAAGACTAGTGATTTTGTTGTTTTGC
PR0636 GGCCGCAAACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTAG
PR0637 TCGACTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTTTGC

PR0644 GGCCCAGGCGATCTGACGGTTCACTAAACGAGCTCTGCTTATATAGGCCTCCCACCGTACACGCCTACTCGACCCGGGTACCGAGCTCGAATTACGATCCTGCAGG
PR0645 TCGCCTGCAGGATCGTAATTCGAGCTCGGTACCCGGGTCGAGTAGGCGTGTACGGTGGGAGGCCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGG
PR0646 GCACCATCCTGCTCTTCCGAGCGCCACCATGGTGAGCAAGGGCGAGGAGGATAACATG
PR0647 GCACCATCCTGCTCTTCCGCCGCCACCATGGTGAGCAAGGGCGAGGAGCTGTTCAC
PR0648 GCACCATCCTGCTCTTCCGCGCTGTACAGCGTATGGGAATCTCTTG
PR0649 GCACCATCCTGCTCTTCCCTCCAGGCGATCTGACGGTTCACTAAAC
PR0650 GCACCATCCTGCTCTTCCCGCTATTTCCAAGGAGCCTGCAGGATCGTCGAGCTCGGTAC
PR0651 GCACCATCCTGCTCTTCCGGCGATCTGACGGTTCACTAAACGAG
PR0673 TCCCACAACGAGGACTACAC
PR0674 CGAGTCAGTGAGCGAGGAAG
PR0675 AACTTGTGGCCGTTTACGTC
PR0720 GGCCGCAAAACAAACACCATTGTCACACTCCACAAACACCATTGTCACACTCCACAAACACCATTGTCACACTCCACAAACACCATTGTCACACTCCAG

PR0721 TCGACTGGAGTGTGACAATGGTGTTTGTGGAGTGTGACAATGGTGTTTGTGGAGTGTGACAATGGTGTTTGTGGAGTGTGACAATGGTGTTTGTTTTGC
PR0778 ATTACGGCCGCTAGCGCTACCGGACTCAGATCCACCGGTTCGCCACCATGGTGAGCAAGGGCGAGGAG
PR0779 CTTCCCCTGCCCTCGGCTCTGGTACCCTTGTACAGCTCGTCCATGCCGAGAGTGATCC
PR0780 GCATGGACGAGCTGTACAAGGGTACCAGAGCCGAGGGCAGGGGAAGTCTTCTAACATGC
PR0781 TTGCACTAGTCGCGTGACTCGAGTTCGTGTTGGCGGCCGCCCATGTACTTGGGAAGCTTCCTTAGGAGCTGATCTGACTCAGCAGGGCTGAGAAGTCCATGTC
PR0782 ATTACGGCCGCTAGCGCTACCGGACTCAGATCCACCGGTCGCCACCATGCCCCGCCCCAAGCTCAAGTCCGATG
PR0783 TTTTCCTCCACGTCCCCGCATGTTAGAAGACTTCCCCTGCCCTCGGCTCTGGTACCCCCACCGTACTCGTCAATTCCAAGG
PR0784 GCAGGGGAAGTCTTCTAACATGCGGGGACGTGGAGGAAAATCCCGGGCCCAGATCTGTGAGCAAGGGCGAGGAGCTGTTCAC
PR0789 TTGCACTAGTCGCGTGACTCGAGTTCGTGTTGGCGGCCGCAAGCTTCCATGTACTTGGCTACTTGTACAGCTCGTCCATGCCGAGAGTGATC
PR0793 ATTTTAACAAAATATTAACGCTTACAATTTACGCCTTAAGATAC
PR0825 AAGCAGAGCTGGTTTAGTGAACCGTCAGAT
PR0826 GCGGCCGCAGCTTATTGTTCATTTTTGAGAACTCGCTCAACGAACGATTT
PR0827 CGCGGTCTCCAATTGCCACCATGGAAGACGCCAAAAAC
PR0828 TTGTGGAATCGCCGCTTTCG
PR0830 GATCCACCGGTCGCCAC
PR0876 GGGCATGGCTTCCCGCCGGCGGTGGCGGCGCAGGATGATGGCACGCTGCCCATGTCTTGTGCCCAGGAGAGCGGGATGGACCGTCAC
PR0877 ATCGTGACGGTCCATCCCGCTCTCCTGGGCACAAGACATGGGCAGCGTGCCATCATCCTGCGCCGCCACCGCCGGCGGGAAGCCATG

PR0913 GGTTCCAGAGATCTGGGCATGGCTTCCCGCCGGC

PR0914 GGTTCCAGGAATTCCTCGAGCTACACATTGATCCTAGCAGAAGCACAGGCTGCAGGGTGACGGTCCATCCCGCTCTCCTGG

PR0932 GAATGGACTAGAGCTCGCCACCATGCAGATCTTCGTGAAAACCCTTACC

PR0933 GAATGGACTATCTAGACACACCTCTCAGACGCAGGAC

PR0934 GAATGGACTACCTAGGATGCAGATCTTCGTGAAAACCCTTACC

PR0935 GAATGGACTAGAATTCCACACCTCTCAGACGCAGGAC

PR0936 GAATGGACTACAATTGATGCAGATCTTCGTGAAAACCCTTACC

PR0937 GAATGGACTACTGCAGCACACCTCTCAGACGCAGGAC

PR0938 GAATGGACTAATGCATATGCAGATCTTCGTGAAAACCCTTACC

PR0939 GAATGGACTAGGATCCCACACCTCTCAGACGCAGGAC

PR0940 GAATGGACTACCACGTACCTGGGCTCGCCACCATGCAGATCTTC

PR0941 GAATGGACTACCAAGTACATGGACCGGTGGATCCCACACCTCTC

PR0995 GAATGGACTATGTACAAGCATGGCTTCCCGCCGGCGGTG
PR0996 GAATGGACTAAAGCTTAAGATCAACGTCTCGTCGAGAATTCGCGGCCGCCGAGCTACACATTGATCCTAGCAGAAGCACAGGCTGCAGGGTGAC
PR0947 TCGAGGAGCATTCCTGGCTCGATGAGTTAGAAGATTTCTCGAGAAATCTTCTAACTCATCGAGCGGAGTGTTTG
PR0948 AATTCAAACACTCCGCTCGATGAGTTAGAAGATTTCTCGAGAAATCTTCTAACTCATCGAGCCAGGAATGCTCC
PR0992 CAGCAACTGGTCCTTGGTGTCGACGTACCAGTACACGGACATC
PR0993 GAGGCCGCCACCGTAGTGCTGAAG
PR0994 CAGTACCTCGTCATCGGACTTGAG
PR1069 GGCCGCAAACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTACGTCGACGCACCTGC

PR1070 GCATGCAGGTGCGTCGACGTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTTTGC

PR1071 ATGCGGCCAAAATGCCCTTTTAACATTGCACTGATGCCCTTTTAACATTGCACTGATGCCCTTTTAACATTGCACTGATGCCCTTTTAACATTGCACTGTCGAATGCGCAGGTGA

PR1072 AGCTTCACCTGCGCATTCGACAGTGCAATGTTAAAAGGGCATCAGTGCAATGTTAAAAGGGCATCAGTGCAATGTTAAAAGGGCATCAGTGCAATGTTAAAAGGGCATTTTGGCC

PR1093 TTCCTCTAATGGTCGACCCCTGAGGAAAAAAAAGGAAACAATTGAAAAAAGTGATTTAATTTATACCATTTTAATTCAGCTTTGTAAA
PR1094 GAATGGACTAGAATTCGCTCGCCACCATGCAGATCTTC
PR1125 GAATGGACTAACCGGTCCGGTGGATCCCACACCTCTC
PR1277 GGCCGCAAACTACCTGCACTATAAGCACTTTAA
PR1278 AGCTTTAAAGTGCTTATAGTGCAGGTAGTTTGC
PR1279 GGCCGCAAACTACCTGCACTATAAGCACTTTACTACCTGCACTATAAGCACTTTAA
PR1280 AGCTTTAAAGTGCTTATAGTGCAGGTAGTAAAGTGCTTATAGTGCAGGTAGTTTGC
PR1307 CAGAATTGTACATAACCCACCGTACTCGTCAATTCCAAG
PR1308 GATATTGCCACCACCGGTATGAGTCGAGGAGAGGTGCGC
PR1320 CGCGTAAACCATCTTTACCAGACAGTGTTACCATCTTTACCAGACAGTGTTACCATCTTTACCAGACAGTGTTACCATCTTTACCAGACAGTGTTA
PR1321 TCGATAACACTGTCTGGTAAAGATGGTAACACTGTCTGGTAAAGATGGTAACACTGTCTGGTAAAGATGGTAACACTGTCTGGTAAAGATGGTTTA
PR1441 GAATGGACTAGGGCCCAAGTAGCCAAGTACATGGAAG
PR1442 GAATGGACTAGGGCCCGCCGCATCGATAAGCTTAAC
PR1443 GAATGGACTATCTAGAATGGCTGATTATCGTCTC
PR1451 GAATGGACTAGGTCTCTGTACATCAGGCCAGGGCGCT
PR1492 GAATGGACTAGGTCTCCGGCCGCATGGCTGATTATCGTCTC
PR1493 GGCCGCCCGCTTGAAGTCTTTAATTAAACCGCTTGAAGTCTTTAATTAAACCGCTTGAAGTCTTTAATTAAACCGCTTGAAGTCTTTAATTAAAG

PR1494 TCGACTTTAATTAAAGACTTCAAGCGGTTTAATTAAAGACTTCAAGCGGTTTAATTAAAGACTTCAAGCGGTTTAATTAAAGACTTCAAGCGGGC

Supplementary Table 3 gBlocks
gBlock0013 GCGGCCGCCAACACGAACTCGAGTCACGCGACTAGTGCAACGAGCTCTCGAGGTCATCACGCGTTCCGTGATCTCGAGGAATCGGGCGCGCCGGCCAAGATCGAAGAGACGATAATCAGCCATACCACATTTGTAGAGGTTTTACTTGCTTTAAAAAACCTCCCACACCTCCCCCTGAACCTGAAACATAAAATGAATGCAATTGTTGTTGTTAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCCCCGAGCTTCCTCGCTCGTCCAAACTCATCAATGTATCTTAAGGCGTAAATTGTAAGCGTTAATATTTTGTTAAAAT

gBlock0014 ATGCAGATCTTCGTGAAAACCCTTACCGGCAAGACCATCACCCTTGAGGTGGAGCCCAGTGACACCATCGAAAATGTGAAGGCCAAGATCCAGGATAAGGAAGGCATTCCTCCCGACCAGCAGAGGCTCATCTTTGCAGGCAAGCAGCTGGAAGATGGCCGTACTCTTTCTGACTACAACATCCAGAAGGAGTCGACCCTGCACCTGGTCCTGCGTCTGAGAGGTGTG

Supplementary Table 4 LNAs
Neg.Ctrl.   Commercial: Cat # 199020-00      Exiqon
miR-20a     Commercial: Cat # 426943-00      Exiqon
miR-21      Commercial: Cat # 426947-00      Exiqon
miR-23b     Commercial: Cat # 4101534-101    Exiqon
miR-122     Commercial: Cat # 426674-00      Exiqon
miR-FF4     Commercial: Custom order         Exiqon

Supplementary Table 5 Mimics
Neg.Ctrl.   Commercial: Cat # CN-001000-01-05    Thermo Scientific (GE Healthcare)
miR-20a     Commercial: Cat # C-300491-03        Thermo Scientific (GE Healthcare)
miR-21      Commercial: Cat # C-300492-03-0005   Thermo Scientific (GE Healthcare)
miR-23b     Commercial: Cat # C-300588-05-0005   Thermo Scientific (GE Healthcare)
miR-122     Commercial: Cat # C-300591-05        Thermo Scientific (GE Healthcare)
miR-141     Commercial: Cat # C-300608-03        Thermo Scientific (GE Healthcare)
miR-145     Commercial: Cat # C-300613-05        Thermo Scientific (GE Healthcare)
miR-146a    Commercial: Cat # C-300630-03        Thermo Scientific (GE Healthcare)
miR-375     Commercial: Cat # C-300683-05        Thermo Scientific (GE Healthcare)

Supplementary Table 6 siRNAs
siNeg.Ctrl.   Commercial: No Cat #: 5'-AGGUAGUGUAAUCGCCUUGtt-3', 3'-ttUCCAUCACAUUAGCGGAAC-5'   Microsynth
siDicer013    5'-CAGCAUACUUUAUCGCCUUtt-3', 5'-AAGGCGAUAAAGUAUGCUGgg-3'   Microsynth
siDrosha14    5'-CGAGUAGGCUUCGUGACUUau-3', 3'-uuGCUCAUCCGAAGCACUGAA-5'   Microsynth
siDGCR814     5'-GGAUGUAAAGAUUAGCGUGag-3', 3'-uuCCUACAUUUCUAAUCGCAC-5'   Microsynth
siDicer14     5'-UGCUUGAAGCAGCUCUGGAuc-3', 3'-ugACGAACUUCGUCGAGACCU-5'   Microsynth
siTRBP414     5'-GCUGCCUAGUAUAGAGCAAau-3', 3'-ccCGACGGAUCAUAUCUCGUU-5'   Microsynth
siFF415       5'-GCUUGAAGUCUUUAAUUAAAuu-3', 3'-ggCGAACUUCAGAAAUUAAUUU-5'   Microsynth
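The siRNA strands in Supplementary Table 6 are written as sense (5' to 3') and antisense (3' to 5') pairs. The following Python sketch shows one way to check such a pair for base-pairing consistency. It assumes that lower-case letters mark unpaired overhangs, an interpretation of the table notation rather than something stated in the text; the helper names `core` and `is_duplex` are hypothetical.

```python
# Watson-Crick complement for RNA bases (no reversal: the antisense strand
# is expected in 3'->5' orientation, as written in the table).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def core(strand):
    """Keep only upper-case bases; lower-case letters are assumed overhangs."""
    return "".join(base for base in strand if base.isupper())


def is_duplex(sense_5to3, antisense_3to5):
    """True if the paired cores of both strands are fully complementary."""
    return core(sense_5to3).translate(COMPLEMENT) == core(antisense_3to5)


# Strand pairs copied from the siNeg.Ctrl. and miR-FF4 siRNA rows.
neg_ctrl_ok = is_duplex("AGGUAGUGUAAUCGCCUUGtt", "ttUCCAUCACAUUAGCGGAAC")
ff4_ok = is_duplex("GCUUGAAGUCUUUAAUUAAAuu", "ggCGAACUUCAGAAAUUAAUUU")
print(neg_ctrl_ok, ff4_ok)
```

A strand listed 5' to 3' instead, as for the second strand in the siDicer013 row, would need to be reversed before the check.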

Supplementary Table 7 Arrangement of control plate transfections

Plate 1, LNA-122 Ctrl plate 1
    1       2       3       4       5       6       7       8       9       10      11      12
A   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
B   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
C   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
D   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
E   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
F   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
G   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
H   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO

Plate 2, LNA-122 Ctrl plate 2
    1       2       3       4       5       6       7       8       9       10      11      12
A   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
B   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
C   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
D   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
E   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
F   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
G   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
H   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM

Plate 3, LNA-122 Ctrl plate 3
    1       2       3       4       5       6       7       8       9       10      11      12
A   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
B   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
C   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
D   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
E   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
F   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
G   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
H   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM

Plate 4, Mim-122 Ctrl plate 1
    1       2       3       4       5       6       7       8       9       10      11      12
A   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
B   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
C   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
D   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
E   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
F   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
G   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
H   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO

Plate 5, Mim-122 Ctrl plate 2
    1       2       3       4       5       6       7       8       9       10      11      12
A   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
B   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
C   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
D   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
E   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
F   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
G   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
H   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM

Plate 6, Mim-122 Ctrl plate 3
    1       2       3       4       5       6       7       8       9       10      11      12
A   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
B   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
C   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
D   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
E   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
F   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
G   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
H   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM

Plate 7, Mim-146a Ctrl plate 1
    1       2       3       4       5       6       7       8       9       10      11      12
A   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
B   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
C   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
D   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
E   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
F   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
G   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO
H   5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO

Plate 8, Mim-146a Ctrl plate 2
    1       2       3       4       5       6       7       8       9       10      11      12
A   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
B   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
C   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
D   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
E   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
F   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
G   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM
H   DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM

Plate 9, Mim-146a Ctrl plate 3
    1       2       3       4       5       6       7       8       9       10      11      12
A   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
B   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
C   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
D   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
E   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
F   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
G   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
H   0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM    0.1 nM  DMSO    5 nM
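The three control plates of each set differ only in a cyclic shift of the three conditions across the columns. The following Python sketch reproduces that arrangement; it is an illustration and not the software used to set up the transfections. The names `CONDITIONS` and `control_plate` are hypothetical.

```python
# The three conditions in the column order of the first plate of each set.
CONDITIONS = ["5 nM", "0.1 nM", "DMSO"]


def control_plate(shift, rows="ABCDEFGH", n_cols=12):
    """Return a 96-well layout as {row letter: [condition per column]}.

    `shift` = 0, 1 or 2 corresponds to Ctrl plate 1, 2 or 3: every column
    pattern is moved one column to the right per shift step, and all rows
    of a plate carry the same pattern.
    """
    pattern = [CONDITIONS[(col - shift) % len(CONDITIONS)] for col in range(n_cols)]
    return {row: list(pattern) for row in rows}


plates = [control_plate(shift) for shift in range(3)]
for number, plate in enumerate(plates, start=1):
    print(f"Ctrl plate {number}, row A:", plate["A"][:3])
```

With this rotation, every column position receives each of the three conditions exactly once across a set of three plates. That may help to separate condition effects from position effects, although the text does not state this rationale.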

References
1 Xie, Z., Wroblewska, L., Prochazka, L., Weiss, R. & Benenson, Y. Multi-input RNAi-based logic circuit for identification of specific cancer cells. Science 333, 1307-1311 (2011).
2 Shu, X. et al. Mammalian expression of infrared fluorescent proteins engineered from a bacterial phytochrome. Science 324, 804-807 (2009).
3 Xie, Z., Liu, S. J., Bleris, L. & Benenson, Y. Logic integration of mRNA signals by an RNAi-based molecular computer. Nucleic Acids Res. 38, 2692-2701 (2010).
4 Leisner, M., Bleris, L., Lohmueller, J., Xie, Z. & Benenson, Y. Rationally designed logic integration of regulatory signals in mammalian cells. Nat. Nanotechnol. 5, 666-670 (2010).
5 Weber, W., Kramer, B. P., Fux, C., Keller, B. & Fussenegger, M. Novel promoter/transactivator configurations for macrolide- and streptogramin-responsive transgene expression in mammalian cells. J. Gene Med. 4, 676-686 (2002).
6 Weber, W. et al. Macrolide-based transgene control in mammalian cells and mice. Nat. Biotechnol. 20, 901-907 (2002).
7 Filonov, G. S. et al. Bright and stable near-infrared fluorescent protein for in vivo imaging. Nat. Biotechnol. 29, 757-761 (2011).
8 Prochazka, L., Angelici, B., Haefliger, B. & Benenson, Y. Highly modular bow-tie gene circuits with programmable dynamic behaviour. Nat. Commun. 5:4729, doi:10.1038/ncomms5729 (2014).
9 Stegmeier, F., Hu, G., Rickles, R. J., Hannon, G. J. & Elledge, S. J. A lentiviral microRNA-based system for single-copy polymerase II-regulated RNA interference in mammalian cells. Proc. Natl Acad. Sci. USA 102, 13212-13217 (2005).
10 Brownawell, A. M. & Macara, I. G. Exportin-5, a novel karyopherin, mediates nuclear export of double-stranded RNA binding proteins. J. Cell Biol. 156, 53-64 (2002).
11 Cost, G. J. Enzymatic ligation assisted by nucleases: simultaneous ligation and digestion promote the ordered assembly of DNA. Nat. Protoc. 2, 2198-2202 (2007).
12 Chien, C.-H. et al. Identifying transcriptional start sites of human microRNAs based on high-throughput sequencing data. Nucleic Acids Res. 39, 9345-9356 (2011).
13 Levy, C. et al. Lineage-specific transcriptional regulation of DICER by MITF in melanocytes. Cell 141, 994-1005 (2010).
14 Mori, M. et al. Hippo signaling regulates microprocessor and links cell-density-dependent miRNA biogenesis to cancer. Cell 156, 893-906 (2014).
15 Rinaudo, K. et al. A universal RNAi-based logic evaluator that operates in mammalian cells. Nat. Biotechnol. 25, 795-801 (2007).

Supplementary Table 8 Screening library

# Name Trivial name ESCITALOPRAM 34 CPD000469191 BUPROPION HY- OXALATE 1 CPD000058423 DROCHLORIDE 35 CPD000058866 IRSOGLADINE 2 CPD000471621 36 CPD000466354 LATANOPROST MALEATE 37 CPD000058576 3 CPD000466376 ACARBOSE 38 CPD000466298 Sertraline BENPROPERINE 4 CPD000469294 39 CPD000466353 CALCIPOTRIOL PHOSPHATE EPIRUBICIN HY- 5 CPD000466378 40 CPD000466308 DROCHLORIDE 6 CPD000058926 41 CPD000466329 BICALUTAMIDE 7 CPD000449280 Carvedilol 42 CPD000469192 BENIDIPINE HCl 8 CPD000466379 LOMIFYLLINE 43 CPD000466352 AMLEXANOX 9 CPD000466380 PAZUFLOXACIN CERIVASTATIN 44 CPD000469148 10 CPD000466381 MIGLITOL SODIUM 11 CPD000058373 45 CPD000466309 ICARIIN 12 CPD000466345 OLANZAPINE METHYLANDROS 46 CPD000466310 13 CPD000449297 Nefazodone TENEDIOL Moxifloxacin hy- 47 CPD000466307 14 CPD000469185 ROSIGLITAZONE drochloride 48 CPD000469170 NELFINAVIR ME- HCl 15 CPD000469186 SYLATE 49 CPD000059106 PRAVASTATIN 50 CPD000466392 OLIGOMYCIN C 16 CPD000469187 Sodium BENAZEPRIL HY- Topotecan Hydro- 51 CPD000469199 17 CPD000466344 DROCHLORIDE chloride 52 CPD000058877 18 CPD000466303 LEVETIRACETAM 53 CPD000059060 35212-22-7 PRAMIPEXOLE 19 CPD000469142 HCl 54 CPD000058286 OXAPROZIN 20 CPD000466323 RISPERIDONE 55 CPD000058510 pioglitazone hydro- MOSAPRIDE CIT- 21 CPD000469167 56 CPD000469200 chloride RATE 22 CPD000469147 Cilastatin sodium 57 CPD000466391 Isoquercitrin 23 CPD000466348 ARGATROBAN 58 CPD000058450 24 CPD000466327 VALDECOXIB 59 CPD000469164 25 CPD000466346 NAFTOPIDIL 60 CPD000466394 HYPEROSIDE 26 CPD000156231 Nobiletin 61 CPD000466322 RIFABUTIN ESMOLOL HY- 27 CPD000466304 FINASTERIDE 62 CPD000469141 DROCHLORIDE ZOLPIDEM TAR- 28 CPD000469145 TRATE 63 CPD000466321 TADALAFIL 29 CPD000048458 Viramune 64 CPD000058957 30 CPD000466325 TOPIRAMATE DOXORUBICIN 65 CPD000058570 HYDROCHLO- 31 CPD000466350 VORICONAZOLE RIDE FENOLDOPAM 32 CPD000469190 66 CPD000469209 MOXONIDINE HCl MESYLATE 67 CPD000058302 ROSIGLITAZONE 33 CPD000471612 PEFLOXACIN ME- MALEATE 68 CPD000387024 SYLATE

120 Venlafaxine hydro- 69 CPD000469154 106 CPD000058464 chloride N-Ethyl-o-cro- Pantoprazole So- 107 CPD000059145 70 CPD000469592 tonotoluidide dium 108 CPD000472526 Amfebutamone FLUTICASONE 71 CPD000469159 109 CPD000466340 ALFUZOSIN PROPIONATE 72 CPD000469161 Indinavir Sulfate 110 CPD000449309 Amisulpride Midazolam Hydro- 111 CPD000469292 LOFEPRAMINE 73 CPD000469160 PEROSPIRONE chloride 112 CPD000466362 74 CPD000466319 LAMIVUDINE HCl 113 CPD000059010 DOCETAXEL 75 CPD000469151 366-70-1 114 CPD000387107 HONOKIOL ESOMEPRAZOLE 76 CPD000469280 TOLTERODINE Mg 115 CPD000469196 TARTRATE 77 CPD000059146 SULFASALAZINE 116 CPD000466363 CARMOFUR 78 CPD000466313 TORASEMIDE 117 CPD000469181 PAROXETINE tropisetronξhy- 79 CPD000469156 OLMESARTAN drochloride 118 CPD000466337 Ranolazine dihy- MEDOXOMIL 80 CPD000326795 LOSARTAN Potas- drochloride 119 CPD000469593 sium 81 CPD000338536 120 CPD000466338 TEMOZOLOMIDE 82 CPD000466390 PIDOTIMOD 121 CPD000058528 83 CPD000466386 RAMIPRIL tosufloxacin tosi- FENPIVERINIUM 122 CPD000469195 84 CPD000469284 late BROMIDE 123 CPD000466361 MECILLINAM 19-Nortestos- 85 CPD000058610 Atomoxetine hy- terone 124 CPD000469177 drochloride 86 CPD000466384 125 CPD000466336 ARTESUNATE 87 CPD000059047 126 CPD000058959 88 CPD000048684 CEFPODOXIME 127 CPD000469193 89 CPD000466385 TROXIPIDE PROXETIL 90 CPD000466341 ACTARIT 128 CPD000058803 Buflomedil HCl azelastine hydro- 91 CPD000469183 4-Chloro-N-(2-mor- chloride 129 CPD000012114 pholin-4-yl-ethyl)- 92 CPD000466388 TOCAINIDE benzamide 93 CPD000499525 TAXIFOLIN-(+/-) HALOMETASONE 130 CPD000466330 94 CPD000466387 LEVOFLOXACIN MONOHYDRATE TRICLABENDAZO CEFATRIZINE 131 CPD000466357 LE 95 CPD000469182 PROPYLENE GLYCOL 132 CPD000466331 ROFECOXIB BISOPROLOL 96 CPD000466364 IDEBENONE 133 CPD000471619 FUMARATE 97 CPD000466366 LEVOSULPIRIDE 134 CPD000466334 EZETIMIBE 98 CPD000238142 135 CPD000469176 TIAGABINE HCl 99 CPD000466343 LETROZOLE idarubicin hydro- 136 CPD000466355 100 CPD000469184 MEROPENEM chloride 101 
CPD000466339 ORLISTAT 137 CPD000466360 FLUBENDAZOLE 102 CPD000469179 138 CPD000466356 TACROLIMUS 103 CPD000059117 VALACICLOVIR 104 CPD000469197 CETRAXATE HCl 139 CPD000469208 HYDROCHLO- RIDE 105 CPD000149316

121 CLARITHROMY- 140 CPD000466382 3-[3,5-DIBROMO- CIN 4-HYDROXYBEN- 141 CPD000466383 ARIPIPRAZOLE 174 CPD000058310 ZOYL]-2- TRIMEBUTINE ETHYLBENZOFU- 142 CPD000471622 MALEATE RAN 143 CPD000238198 175 CPD000058300 144 CPD000466370 NISOLDIPINE 176 CPD000058701 145 CPD000466371 PICEID 177 CPD000058715 1-(2-Methyl-5-nitro- 178 CPD000058273 146 CPD000149359 imidazol-1-yl)-pro- Reichsteins sub- 179 CPD000466922 pan-2-ol stance S Nifekalant hydro- 147 CPD000466369 3-PYRIDINE- chloride 180 CPD000059086 METHANOL 148 CPD000466372 NATEGLINIDE 149 CPD000058691 181 CPD000449283 Haloperidol 150 CPD000466374 ORMETOPRIM 182 CPD000449279 Stiripentol 151 CPD000466377 ZILEUTON 183 CPD000449303 Fluperlapine 152 CPD000058350 184 CPD000058660 Homoveratryla- 153 CPD000058918 185 CPD000112358 mine OXICONAZOLE 154 CPD000469293 186 CPD000058194 NITRATE XANTHINOL NIC- 155 CPD000469235 KITASAMYCIN 187 CPD000058741 OTINATE 156 CPD000466375 FAMCICLOVIR 188 CPD000059111 SYNEPHRINE 157 CPD000326828 189 CPD000058206 501-36-0 rufloxacin monohy- 158 CPD000466373 190 CPD000059093 118-71-8 drochloride 159 CPD000466389 TAXIFOLIN-(+) 191 CPD000059077 alosetron-monohy- 192 CPD000059011 ENROFLOXACIN 160 CPD000469211 drochloride 193 CPD000058603 161 CPD000059165 BESTATIN 194 CPD000058250 TOREMIFENE 162 CPD000469213 195 CPD000059044 CITRATE duloxetine hydro- GOSERELIN ACE- 196 CPD000469136 163 CPD000469214 chloride TATE VARDENAFIL CIT- SECOISOLAR- 197 CPD000469155 164 CPD000469212 RATE ICIRESINOL Ropivacaine hy- 198 CPD000469137 165 CPD000469217 RALTITREXED drochloride DOXAPRAM HY- 166 CPD000469229 199 CPD000466301 ANASTROZOLE DROCHLORIDE KETOTIFEN 200 CPD000058462 167 CPD000466294 RU 24969 FUMARATE 168 CPD000112281 Brucine 201 CPD000058769 169 CPD000059115 16502-01-5 Pinacidil monohy- 202 CPD000466919 170 CPD000058411 drate Palonosetron hy- 203 CPD000058266 171 CPD000469233 drochloride 204 CPD000112269 SO- 172 CPD000058746 205 CPD000059045 92-84-2 DIUM 206 CPD000058553 173 CPD000058904 Grani- 207 
CPD000469138 setronξHydro- chloride

122 208 CPD000466293 Rimcazole 236 CPD000058318 50-22-6 209 CPD000466292 Nafadotride VECURONIUM 237 CPD000471625 210 CPD000058856 BROMIDE DEXCHLOR- 238 CPD000469219 TIBOLONE 211 CPD000471617 PHENIRAMINE 239 CPD000058212 98-92-0 MALEATE 240 CPD000059131 212 CPD000466288 Guanidine 241 CPD000058612 213 CPD000466290 L-694,247 242 CPD000058726 214 CPD000466284 AM-251 1,1-DIMETHYL-4- 215 CPD000466289 HTMT 243 CPD000058572 PHENYLPIPER- Benzo[a]phenan- AZINIUM IODIDE thridine-10,11-diol, 244 CPD000058507 216 CPD000466286 5,6,6a,7,8,12b- 245 CPD000059128 72-33-3 hexahydro-, trans- [CAS] BENACTYZINE 246 CPD000059142 HYDROCHLO- Methanesulfona- RIDE mide, N-[4-[[1-[2- 247 CPD000059100 (6-methyl-2-pyridi- 248 CPD000059158 79-43-6 217 CPD000466291 nyl)ethyl]-4-piperi- dinyl]carbonyl]phe- 249 CPD000466283 Altanserin nyl]-, dihydrochlo- Acetamide, 2- ride [CAS] amino-N-(1- 250 CPD000466281 methyl-1,2- diphenylethyl)-, 2H-Indol-2-one, (+/-)- [CAS] 1,3-dihydro-1-phe- 218 CPD000466279 nyl-3,3-bis(4-pyridi- 251 CPD000058420 nylmethyl)- [CAS] 252 CPD000466311 253 CPD000466285 Azasetron 219 CPD000466920 Beclomethasone 254 CPD000466287 GR 89696 220 CPD000058847 73590-58-6 DOLASETRON DELTA1-HYDRO- 221 CPD000469228 MESYLATE CORTISONE 21- 255 CPD000058773 222 CPD000449310 Zolmitriptan HEMISUCCINATE SODIUM SALT 223 CPD000469223 TREMULACIN 224 CPD000469227 DACTINOMYCIN 256 CPD000058392 225 CPD000449308 257 CPD000058366 CHLORDIAZE- SAQUINAVIR ME- 226 CPD000469226 258 CPD000469290 POXIDE SYLATE CEFIXIME TRIHY- 259 CPD000058970 60628-96-8 227 CPD000469225 DRATE SUMATRIPTAN 260 CPD000469158 228 CPD000469224 SUCCINATE Lofexidine hydro- 261 CPD000466314 EXEMESTANE 229 CPD000469232 chloride 262 CPD000466367 NITAZOXANIDE 230 CPD000469221 BALSALAZIDE 263 CPD000058398 OLOPATADINE QUETIAPINE 264 CPD000471623 231 CPD000469220 HYDROCHLO- HEMIFUMARATE RIDE 265 CPD000112560 RUTIN 232 CPD000469287 ITAVASTATIN Ca 266 CPD000466317 PENCICLOVIR 233 CPD000058334 267 CPD000466393 CALCITRIOL 234 CPD000058431 
HOMOHARRING- 268 CPD000469140 DIPHENOXYLATE 235 CPD000469230 TONINE 269 CPD000449307 Felbamate

123 270 CPD000058855 299 CPD000058436 562-10-7 271 CPD000035998 300 CPD000449266 Milnacipran 1H-Imidazole-5- 5-fluoro-2-pyrim- 301 CPD000449315 carboxylic acid, 1- idone 272 CPD000466277 (1-phenylethyl)-, 302 CPD000466271 Chlorpheniramine ethyl ester, (R)- [CAS] 303 CPD000466333 DOFETILIDE 273 CPD000466395 RITONAVIR FORMOTEROL 304 CPD000471620 FUMARATE DIHY- vinorelbineξtar- 274 CPD000469210 DRATE trate RIZATRIPTAN 275 CPD000466335 LINEZOLID 305 CPD000525252 BENZOATE LOMERIZINE 276 CPD000469203 DiHCl 306 CPD000466332 RIFAPENTINE LOTEPREDNOL 277 CPD000466351 EFAVIRENZ 307 CPD000469178 ETABONATE 278 CPD000466306 IRBESARTAN 308 CPD000466359 ENALAPRILAT 279 CPD000466305 309 CPD000449292 Donepezil 280 CPD000238204 310 CPD000238177 281 CPD000440694 311 CPD000466365 roxatidine ace- 282 CPD000469144 tateξhydrochlo- 312 CPD000466326 ride 313 CPD000469143 ITOPRIDE HCl DEX- 314 CPD000466324 RIFAXIMIN 283 CPD000471616 BROMPHENIRA- MONTELUKAST MINE MALEATE 315 CPD000469188 SODIUM anagrelide hydro- 284 CPD000469168 2',3'-DIDEOX- chloride 316 CPD000058253 YCYTIDINE TEGASEROD MA- 285 CPD000471618 LEATE 1H-Imidazol-2- amine, N-(2,6-di- 317 CPD000466276 286 CPD000058475 MILRINONE chlorophenyl)-4,5- 287 CPD000466315 LEVOCETIRIZINE dihydro- [CAS] 288 CPD000326936 Citalopram 6H-Pyrido[2,3- b][1,4]benzodiaze- Ticlopidine Hydro- 289 CPD000048468 pin-6-one, 11-[[2- chloride 318 CPD000466280 [(diethylamino)me- sodiumξlox- 290 CPD000469165 thyl]-1-piperidi- oprofen nyl]acetyl]-5,11-di- 291 CPD000466316 ZAFIRLUKAST hydro- [CAS] Terbinafine hydro- 319 CPD000449316 3'-deoxydenosine 292 CPD000469152 chloride 320 CPD000449296 Ifenprodil 293 CPD000466320 ISRADIPINE 5-Amino-2-hy- 321 CPD000145728 294 CPD000466318 VALSARTAN droxy-benzoic acid 295 CPD000449291 322 CPD000466269 Paroxetine LOBELINE HY- 296 CPD000469282 323 CPD000058465 DROCHLORIDE 297 CPD000449286 Physostigmine L-Ornithine, N5- 1H-Indole-2-propa- [imino(methyla- 324 CPD000449329 noic acid, 1-[(4- mino)methyl]- chlorophenyl)me- [CAS] 
thyl]-3-[(1,1-di- 298 CPD000466278 325 CPD000058461 methylethyl)thio]- Alpha,Alpha-dime- Oxiranecarboxylic acid, 2-[6-(4-chlo- thyl-5-(1-meth- 326 CPD000449321 ylethyl)- [CAS] rophenoxy)hexyl]-, ethyl ester- [CAS]

124 Epigallocatechin 327 CPD000449288 349 CPD000449328 gallate 350 CPD000449294 zucapsaicin 328 CPD000449275 Raclopride SALBUTAMOL 351 CPD000058513 329 CPD000449271 Zacopride SULFATE (+/-)-Vesamicol hy- 330 CPD000449276 SKF 83566 352 CPD000057879 331 CPD000449274 AM 404 drochloride Picrotin - Picrotoxi- 353 CPD000469289 332 CPD000449281 Nalbuphine nin PILOCARPINE 354 CPD000449268 Terazosin 333 CPD000059053 HYDROCHLO- diphenylcyclopro- RIDE 355 CPD000449319 penone 334 CPD000058291 335 CPD000042823 4-Thiazolidinecar- 356 CPD000449326 boxylic acid, 2-oxo- 3-HYDROXY-1,2- , (R)- [CAS] 336 CPD000059136 DIMETHYL-4(1H)- PYRIDONE 357 CPD000466274 Mesoridazine 337 CPD000058470 Loxapine 3(2H)-Pyridazi- d-3-Methoxy-N- none, 6-[4-(difluo- 338 CPD000326694 methylmorphinan 358 CPD000449313 romethoxy)-3- hydrobromide methoxyphenyl]- 339 CPD000449282 Duloxetine [CAS] Glycine, N-[2- 10H-Phenothia- [(acetylthio)me- zine, 2-chloro-10- 340 CPD000449320 thyl]-1-oxo-3-phe- 359 CPD000466275 [3-(4-methyl-1-pi- nylpropyl]-,phenyl- perazinyl)propyl- methyl ester [CAS] [CAS] Benzeneacetic 1H-Cyclo- acid, 2-[(2,6-dichlo- penta[b]quinolin-9- 341 CPD000449318 rophenyl)amino]-, amine, 2,3,5,6,7,8- 360 CPD000449322 monosodium salt hexahydro-, mono- [CAS] hydrochloride- 342 CPD000058345 [CAS] 343 CPD000058961 FAMOTIDINE 361 CPD000058306 344 CPD000449299 SR 57227A 362 CPD000058255 79794-75-5 345 CPD000466270 Pancuronium 363 CPD000058500 Phenelzine sulfate 346 CPD000058175 443-48-1 364 CPD000449311 Riluzole Benzeneacetic acid, Alpha-(hy- 365 CPD000449312 Naltrindole droxymethyl)-, 9- 366 CPD000449277 Nornicotine methyl-3-oxa-9- 367 CPD000449269 Bifemelane 347 CPD000449327 azatricy- clo[3.3.1.02,4]non- 368 CPD000449284 CGS 15943 7-yl ester, [7(S)- 369 CPD000449287 Cinanserin (1Alpha,2 ,4 ,5Al- 370 CPD000449272 Cisapride pha,7 )]- [CAS] Benzeneacetoni- 371 CPD000449273 Indatraline trile, Alpha-[3-[[2- 372 CPD000058520 25332-39-2 (3,4-dimethoxy- 373 CPD000449301 Prazosin phenyl)ethyl]me- URAPIDIL HY- 
348 CPD000449323 thylamino]propyl]- 374 CPD000058525 3,4-dimethoxy-Al- DROCHLORIDE pha-(1-meth- 375 CPD000449278 (-)-Cotinine ylethyl)-, (R)- 376 CPD000058313 D-CYCLOSERINE [CAS] 377 CPD000466268 Fluvoxamine

125 378 CPD000449270 Doxepin 404 CPD000058344 379 CPD000059133 405 CPD000238180 (+)-3-HYDROXY- 406 CPD000468734 PD 81723 N-METHYL- 380 CPD000058908 407 CPD000469222 MORPHINAN D- TARTRATE 408 CPD000058445 381 CPD000058555 LY 171883 Thiophene, 5- Maprotiline hydro- bromo-2-(4-fluoro- 382 CPD000148117 chloride 409 CPD000466299 phenyl)-3-[4-(me- thylsulfonyl)phe- 383 CPD000466272 Pizotyline nyl]- [CAS] BETA-ESTRA- 384 CPD000059126 DIOL 410 CPD000466295 Salmeterol R(+)-SCH-23390 N,N'-DIACETYL- 411 CPD000326935 385 CPD000059046 1,6-DIAMINOHEX- hydrochloride ANE DEHYDROEPI- 412 CPD000059075 386 CPD000058353 147-24-0 ANDROSTERONE 387 CPD000449267 Galanthamine 413 CPD000112594 Prostaglandin E1 388 CPD000449290 Indomethacin 414 CPD000058878 TETRAETHYL- 415 CPD000468732 CCPA 389 CPD000059171 THIURAM DISUL- 416 CPD000468733 CGS 12066B FIDE VINDESINE SUL- 417 CPD000469153 390 CPD000449302 Piribedil FATE 391 CPD000058460 VINCRISTINE 418 CPD000058540 392 CPD000058623 SULFATE Pyrazinecarboxa- 419 CPD000466342 LACIDIPINE mide, 3,5-diamino- 420 CPD000466347 393 CPD000449325 N-(aminoimi- nomethyl)-6- 421 CPD000469285 AMPIROXICAM chloro- [CAS] 422 CPD000466368 GLIMEPIRIDE 9-AMINO-1,2,3,4- 423 CPD000469198 Amlodipine TETRAHY- 394 CPD000059105 DROACRIDINE 424 CPD000469174 RABEPRAZOLE HYDROCHLO- 425 CPD000058704 CLOFAZIMINE RIDE Irinotecan hydro- ETHYNYLESTRA- 426 CPD000469166 395 CPD000058319 chloride DIOL 427 CPD000058469 103577-45-3 2(1H)-Pyrimidi- 8-Chloro-11-piperi- none, 4-amino-1-y- 396 CPD000449317 din-4-ylidene-6,11- D-arabinofurano- dihydro-5H- 428 CPD000149358 syl- [CAS] benzo[5,6]cyclo- L-Glutamic acid, N- hepta[1,2-b]pyri- [4-[[(2,4-diamino-6- dine pteridinyl)me- 397 CPD000449324 1,3,5(10)-ESTRA- thyl]methyla- TRIEN-3-OL-17- 429 CPD000058772 mino]benzoyl]- ONE SULPHATE, [CAS] SODIUM SALT 398 CPD000449305 TFMPP 430 CPD000058481 399 CPD000449298 Pramipexole 431 CPD000112002 400 CPD000058189 432 CPD000238156 Sibutramine 401 CPD000466297 SDM25N 433 CPD000469632 5-Nonyloxytrypta- 
402 CPD000466300 mine 434 CPD000469231 Sibutramine hydro- 403 CPD000466296 SB 205607 435 CPD000472527 chloride

126 436 CPD000058410 468 CPD000037139 55268-74-1 8- 469 CPD000059104 1716-12-7 Azaspiro[4.5]dec- PREDNISOLONE 470 CPD000058326 ane-7,9-dione, 8- ACETATE [2-[[(2,3-dihydro- 471 CPD000058379 Phenergan 437 CPD000469633 1,4-benzodioxin-2- yl)me- 472 CPD000058180 thyl]amino]ethyl]-, 473 CPD000718761 Prednisolone monomethanesul- 474 CPD000058506 fonate [CAS] Adenosine, N-(2- 475 CPD001227202 Prednisone hydroxycyclopen- DL-PENICILLA- 438 CPD000469631 476 CPD000059161 tyl)-, (1S-trans)- MINE PIPERACILLIN [CAS] 477 CPD000058579 439 CPD000058296 19774-82-4 SODIUM 440 CPD000336944 Quinidine hydro- 478 CPD000857275 chloride monohy- IMATINIB MESYL- 441 CPD000469175 drate ATE 479 CPD000653467 56131-49-8 442 CPD000468736 Metylperon 480 CPD001906767 RIFAMPICIN 443 CPD000469594 Parecoxib sodium 481 CPD000058245 trans-Retinoic acid 444 CPD000058504 482 CPD000471892 Spironolactone ATRACURIUM 445 CPD000471626 BESYLATE 483 CPD000035999 446 CPD000469218 ARTEMETHER 484 CPD000058219 Tyzine 447 CPD000058230 485 CPD000059176 L-THYROXINE 448 CPD000058382 486 CPD000058515 URSODEOXY- 449 CPD000059151 2078-54-8 487 CPD000058403 CHOLIC ACID 450 CPD000058600 488 CPD000059064 80-08-0 451 CPD000058187 FLUTAMIDE 489 CPD001370746 Symmetrel 452 CPD000058299 49562-28-9 WARFARIN SO- 490 CPD000058849 453 CPD000058202 54-31-9 DIUM 5-FLUOROURA- 454 CPD000038082 491 CPD000058394 59-66-5 CIL 492 CPD000059083 455 CPD000471860 Folic acid HYDROCORTI- 493 CPD001906768 ATROPINE 456 CPD000653523 SONE 494 CPD000058264 389-08-2 457 CPD000653536 Cortell 3,5,3'-TRIIODO- 495 CPD001567029 458 CPD000058184 15687-27-1 THYRONINE 459 CPD000040181 15962-46-6 496 CPD000058284 MINOCYCLINE 460 CPD001906766 HYDROCHLO- 497 CPD000058368 Annoyltin RIDE 498 CPD000058613 Busulfan MICONAZOLE NI- 461 CPD000058733 499 CPD000058269 Chlorzoxazone TRATE 500 CPD000058429 Chlorothiazide 462 CPD000059134 METYRAPONE 501 CPD001370748 Cimetidine 463 CPD001317860 70458-96-7 502 CPD000058433 464 CPD000058975 503 CPD000058364 94-20-2 465 
CPD000058999 Disipal 504 CPD000058440 Bentyl 466 CPD000058192 505 CPD000312779 Chloroxine 467 CPD000059120 PINDOLOL 506 CPD000058723

127 Adenine 9-beta;-D- 507 CPD001370749 Econazole Nitrate 547 CPD000471872 508 CPD001370750 536-33-4 arabinofuranoside 548 CPD000036768 29122-68-7 509 CPD000058719 549 CPD001491671 Tamoxifen 510 CPD000035778 58-93-5 550 CPD000058418 511 CPD001370751 Vistaril Pamoate 551 CPD000058745 512 CPD000058356 70-30-4 552 CPD000058254 69-09-0 513 CPD000059082 Isoniazid 553 CPD001491644 Cefazolin Sodium 514 CPD000058729 Duvadilan ISOPRO- 554 CPD000059061 CAPTOPRIL 515 CPD000058267 TERENOL HY- 555 CPD000058372 305-03-3 DROCHLORIDE 556 CPD000058809 516 CPD000471847 Triclosan 557 CPD000058321 DANAZOL 517 CPD000058188 61-68-7 (+)-CIS-DILTI- 518 CPD000058832 Cantil 558 CPD000058375 AZEM HYDRO- 519 CPD000058471 CHLORIDE 559 CPD001906774 DIGOXIN 520 CPD001370753 Methyldopa NITROFU- 17-BETA-ESTRA- 521 CPD000058271 RANTOIN 560 CPD000058346 DIOL 17-VAL- ERATE 522 CPD000058486 561 CPD000058672 523 CPD000058292 562 CPD000058329 524 CPD000059024 Nicotinic Acid 563 CPD000042823 Flurbiprofen 525 CPD000058817 Norflex 564 CPD000058455 29094-61-9 Oxytetracycline hy- 526 CPD001614498 drochloride 565 CPD000058393 GEMFIBROZIL 527 CPD000718771 566 CPD000058229 Glyburide 528 CPD000058714 58-14-0 HYDROCORTI- 567 CPD000058328 SONE HEMISUC- 529 CPD000058661 Pro-Banthine CINATE 530 CPD000058280 57-66-9 568 CPD000058829 26807-65-8 PYRIDINE-2-ALD- Ipratropium Bro- 569 CPD001906775 531 CPD001906769 OXIME METHO- mide CHLORIDE 570 CPD000058388 113-52-0 532 CPD000058501 125-33-7 571 CPD000058463 32780-64-6 533 CPD000058275 Propylthiouracil 572 CPD000058466 534 CPD000036662 98-96-4 573 CPD000058833 Pro-Amatine 535 CPD000059079 Pronestyl Medroxyprogester- 574 CPD000653524 536 CPD000037657 Sulfisoxazole one 17-acetate 537 CPD000058223 19-NORE- 575 CPD001906776 THINDRONE AC- 538 CPD000058173 Sulfacetamide ETATE 539 CPD000058991 Sulfinpyrazone 576 CPD000499579 19-Norethindrone 540 CPD000326718 577 CPD000059074 541 CPD001906770 TETRACYCLINE 578 CPD001456372 Cardene 542 CPD000058537 Theophylline 579 CPD000058835 
NABUMETONE 543 CPD000058363 64-77-7 580 CPD000058490 544 CPD000059118 Triamterene 581 CPD000058605 Mestinon 545 CPD000059081 Intropin 582 CPD001453705 Rythmol 546 CPD000058416 AMOXAPINE 583 CPD001491654 Pfizerpen

128 584 CPD000499581 99-66-1 619 CPD001496939 Terbutaline Sulfate 585 CPD000058821 620 CPD000471888 Mupirocin 586 CPD000875264 Proxymetacaine 621 CPD000058331 NALOXONE HY- Mefloquine hydro- 587 CPD000058766 622 CPD000875233 DROCHLORIDE chloride SPECTINOMYCIN 623 CPD001496941 Floxuridine DIHYDROCHLO- 588 CPD001906777 624 CPD001563707 MITOXANTRONE RIDE PENTAHY- ENALAPRIL MA- DRATE 625 CPD001906784 589 CPD000058523 LEATE 626 CPD000058337 51333-22-3 590 CPD000058290 1156-19-0 627 CPD000466386 RAMIPRIL 591 CPD000058335 76-25-5 S(-)-Timolol male- 628 CPD000718757 DEPO-MEDROL 592 CPD001456519 ate (+/-)-NOREPI- 593 CPD000058170 THIABENDAZOLE 629 CPD000058383 NEPHRINE HY- DROCHLORIDE 594 CPD000058380 630 CPD001491664 AMCINONIDE 595 CPD000058181 631 CPD001317855 Clomid 596 CPD001491672 Phylloquinone Phentola- 597 CPD001491659 Eryped 632 CPD001819784 mine Mono-hy- 598 CPD000058422 Dibenzyline drochloride 6ALPHA-METHYL- 633 CPD000058874 FLUDARABINE 11BETA-HY- 599 CPD000058693 634 CPD000109709 Testosterone DROXYPROGES- 635 CPD000471891 Isotretinoin TERONE 636 CPD000058376 Methimazole 600 CPD000058524 Thalidomide Aminolevulinic 637 CPD000596519 Zonisamide 601 CPD000857229 Acid 638 CPD000058355 Carbinoxamine 602 CPD001496929 639 CPD000036734 Mebendazole Maleate Meclizine hydro- 640 CPD000058736 603 CPD001496930 Demeclocycline chloride 604 CPD001496932 Westcort 641 CPD000058451 605 CPD000449328 642 CPD000146393 Dilantin 6-[2-ETHOXY-1- 643 CPD000059182 Miochol NAPHTHAMIDO]- Dantrolene So- 606 CPD000058840 644 CPD000326766 PENICILLIN SO- dium DIUM SALT 645 CPD001227192 Dexamethasone Primaquine Di- 607 CPD000875314 phosphate 646 CPD000394012 Cogentin Mesylate 608 CPD001496934 Micropenin 647 CPD000058324 609 CPD001550033 DOXYCYCLINE 648 CPD000059219 Beclomethasone 649 CPD000058785 Meclomen 610 CPD001233361 dipropionate 650 CPD000471882 Fluconazole 611 CPD000058721 Cromolyn Sodium 651 CPD001453712 Metaproterenol 612 CPD000149600 Priscoline 652 CPD000071170 Methoxsalen 613 CPD000544948 
Mercaptopurine 653 CPD000058224 Chloramphenicol 614 CPD000427366 Azathioprine Tizanidine hydro- 654 CPD000499584 615 CPD000036735 Albendazole chloride 616 CPD000718755 Griseofulvin 655 CPD001453706 Paroxetine 617 CPD000059006 656 CPD000550486 mirtazapine 618 CPD001496938 Methazolamide 657 CPD000010931 Etomidate

129 658 CPD000499578 Moban 694 CPD000469282 659 CPD001453708 fluvastatin 695 CPD000046147 Ethambutol 660 CPD000058680 Urecholine 696 CPD001453715 Cetirizine 661 CPD001496804 Cefuroxime DICLOXACILLIN 697 CPD000539527 662 CPD000718805 Cytoxan SODIUM 698 CPD000718800 Meloxicam 663 CPD000550478 Eszopiclone DAUNORUBICIN 664 CPD000058802 Bendrofluazide 699 CPD001906781 HYDROCHLO- 665 CPD000058508 82640-04-8 RIDE 666 CPD000058351 30516-87-1 700 CPD001906779 RIFAPENTINE 667 CPD000058365 701 CPD000274084 Penicillin V 668 CPD001317850 Ampicillin Sodium 702 CPD000043336 Gatifloxacin 669 CPD000058800 703 CPD000550475 clopidogrel AMOXICILLIN CEFOTAXIME SO- 670 CPD000058707 704 CPD001551784 CRYSTALLINE DIUM (+/-)-Epinephrine 705 CPD000466319 LAMIVUDINE 671 CPD000857209 hydrochloride 706 CPD001307702 Ondansetron 672 CPD000857239 5-Azacytidine 707 CPD000339803 Betamethasone 673 CPD000058186 Buspar 708 CPD000550473 Celecoxib 674 CPD000436311 4-(AMINOME- 675 CPD000059121 Podofilox THYL)BENZENE- 709 CPD000058778 676 CPD000058313 D-CYCLOSERINE SULFONAMIDE ACETATE CORTISONE AC- 677 CPD000059124 ETATE 710 CPD001906782 THIOTHIXENE 678 CPD000058295 17321-77-6 711 CPD000465669 Citalopram 679 CPD001227191 298-46-4 712 CPD000471864 Azithromycin Memantine hydro- 713 CPD000673570 Lovastatin 680 CPD000875213 chloride 714 CPD000326785 Aminoglutethimide 681 CPD000036827 715 CPD000058452 682 CPD000326711 716 CPD001233272 FluniSOLIDe 683 CPD000058438 717 CPD000058225 Acyclovir 684 CPD000673569 STAVUDINE 718 CPD000058443 685 CPD000097306 Doxazosin 719 CPD000718785 Simvastatin 686 CPD000058963 Minoxidil 720 CPD001227203 Rifabutin 687 CPD000059167 318-98-9 721 CPD001496951 Felodipine 688 CPD001496943 Ribavirin QUINAPRIL HY- 722 CPD000499582 689 CPD000058309 Terazosin DROCHLORIDE 690 CPD000058635 Chlorthalidone 723 CPD000499573 METHYLPREDNI- 724 CPD000718798 138452-21-8 691 CPD000058330 SOLONE 725 CPD001563899 Fluorometholone 692 CPD001496977 Phenelzine 726 CPD000466298 Sertraline 693 CPD000058767 727 
CPD001566944 CARBIDOPA

Supplementary Table 9 Transfection table experiment of Figure 2.6. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 96-well setup. Experiment can be found in BH5.149.

| Sample | Reporter plasmid | Reporter (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) | CMV-rtTA-TFF5 (pZ091) (ng) |
|---|---|---|---|---|
| 200c | AmCyan-TRE-DsRed-T200cx4 (pBH0019) | 25 | 137.5 | 12.5 |
| 145 | mCerulean-TRE-mCherry-T145x4 (pBH0193) | 25 | 137.5 | 12.5 |
| 141 | mCerulean-TRE-mCherry-141x4 (pBH0267) | 25 | 137.5 | 12.5 |
| 375 | mCerulean-TRE-mCherry-T375x4 (pBH0194) | 25 | 137.5 | 12.5 |
| 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 25 | 137.5 | 12.5 |
| 196a | AmCyan-TRE-DsRed-T196ax4 (pBH0014) | 100 | 137.5 | 12.5 |
| 24 | mCerulean-TRE-mCherry-T24x4 (pBH0084) | 25 | 137.5 | 12.5 |
| 142 | AmCyan-TRE-DsRed-T142-3px4 (pZ117) | 25 | 137.5 | 12.5 |
| let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 25 | 137.5 | 12.5 |
| 125a | mCerulean-TRE-mCherry-T125ax4 (pBH0077) | 25 | 137.5 | 12.5 |
| 18a | mCerulean-TRE-mCherry-T18ax4 (pBH0078) | 25 | 137.5 | 12.5 |
| 7 | mCerulean-TRE-mCherry-T7x4 (pBH0079) | 25 | 137.5 | 12.5 |
| 30a | AmCyan-TRE-DsRed-T30ax4 (pZ146) | 25 | 137.5 | 12.5 |

Supplementary Table 9 Continuation

| Sample | Reporter plasmid | Reporter (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) | CMV-rtTA-TFF5 (pZ091) (ng) |
|---|---|---|---|---|
| 27b | mCerulean-TRE-mCherry-T27bx4 (pBH0076) | 25 | 137.5 | 12.5 |
| 23b | AmCyan-TRE-DsRed-T23bx4 (pBH0022) | 25 | 137.5 | 12.5 |
| 16 | mCerulean-TRE-mCherry-T16x4 (pBH0083) | 25 | 137.5 | 12.5 |
| 20a | mCerulean-TRE-mCherry-T20ax4 (pBH0080) | 25 | 137.5 | 12.5 |
| 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 25 | 137.5 | 12.5 |
| 130a | mCerulean-TRE-mCherry-T130ax4 (pBH0075) | 25 | 137.5 | 12.5 |
| 17 | AmCyan-TRE-DsRed-T17x4 (pZ145) | 25 | 137.5 | 12.5 |
| 21 | mCerulean-TRE-mCherry-T21x4 (pBH0081) | 25 | 137.5 | 12.5 |


Supplementary Table 10 Transfection table experiment of Figure 2.7a. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 96-well setup. In case of compounds, amounts are reported in μM; for DMSO, as percentage of total reaction volume. Experiment can be found in BH2.88.

| Sample | Reporter plasmid | Reporter (ng) | CMV-tTA (pBA166) (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) | DMSO (%) | Enoxacin (μM) | PLL (μM) | NSC158959 (μM) | NSC308847 (μM) |
|---|---|---|---|---|---|---|---|---|---|
| T122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T130a | mCerulean-TRE-mCherry-T130ax4 (pBH0075) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T27b | mCerulean-TRE-mCherry-T27bx4 (pBH0076) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T125 | mCerulean-TRE-mCherry-T125ax4 (pBH0077) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |

Supplementary Table 10 Continuation

| Sample | Reporter plasmid | Reporter (ng) | CMV-tTA (pBA166) (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) | DMSO (%) | Enoxacin (μM) | PLL (μM) | NSC158959 (μM) | NSC308847 (μM) |
|---|---|---|---|---|---|---|---|---|---|
| T18a | mCerulean-TRE-mCherry-T18ax4 (pBH0078) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T7 | mCerulean-TRE-mCherry-T7x4 (pBH0079) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T20a | mCerulean-TRE-mCherry-T20ax4 (pBH0080) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T21 | mCerulean-TRE-mCherry-T21x4 (pBH0081) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |

Supplementary Table 10 Continuation

| Sample | Reporter plasmid | Reporter (ng) | CMV-tTA (pBA166) (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) | DMSO (%) | Enoxacin (μM) | PLL (μM) | NSC158959 (μM) | NSC308847 (μM) |
|---|---|---|---|---|---|---|---|---|---|
| Tlet7 | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T16 | mCerulean-TRE-mCherry-T16x4 (pBH0083) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| T24 | mCerulean-TRE-mCherry-T24x4 (pBH0084) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |
| TFF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 50 | 25 | 100 | 1 | 10 | 10 | 10 | 10 |

Supplementary Table 11 Transfection table experiment of Figure 2.7b. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 24-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH3.17.

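| Sample group | Treatment | Level | Reporter plasmid | Reporter (ng) | CMV-tTA (pBA166) (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) | Treatment amount |
|---|---|---|---|---|---|---|---|
| T122 | DMSO | low | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 0.1 % |
| T122 | DMSO | med | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 0.5 % |
| T122 | DMSO | high | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 1 % |
| T122 | NSC158959 | low | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 1 μM |
| T122 | NSC158959 | med | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 5 μM |
| T122 | NSC158959 | high | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 10 μM |
| T122 | NSC308847 | low | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 1 μM |
| T122 | NSC308847 | med | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 5 μM |
| T122 | NSC308847 | high | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 650 | 10 μM |
| w/o target | DMSO | low | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 0.1 % |
| w/o target | DMSO | med | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 0.5 % |
| w/o target | DMSO | high | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 1 % |
| w/o target | NSC158959 | low | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 1 μM |
| w/o target | NSC158959 | med | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 5 μM |
| w/o target | NSC158959 | high | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 10 μM |
| w/o target | NSC308847 | low | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 1 μM |
| w/o target | NSC308847 | med | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 5 μM |
| w/o target | NSC308847 | high | PpLuc-TRE-RrLuc (pBH0157) | 100 | 50 | 650 | 10 μM |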
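The layout of Supplementary Table 11 is a full factorial design: two reporters, three treatments and three dose levels give 18 samples. The following Python sketch enumerates these samples from the values in the table. The data structures and the names `REPORTERS`, `TREATMENTS` and `build_samples` are illustrative assumptions and do not describe how the experiment was actually recorded.

```python
from itertools import product

# Values copied from Supplementary Table 11. The dictionary layout and all
# names below are hypothetical and only serve to illustrate the design.
REPORTERS = {
    "T122": "PpLuc-TRE-RrLuc-T122x4 (pBH0161)",
    "w/o target": "PpLuc-TRE-RrLuc (pBH0157)",
}
TREATMENTS = {
    "DMSO": ("%", (0.1, 0.5, 1.0)),        # percentage of total reaction volume
    "NSC158959": ("uM", (1.0, 5.0, 10.0)),
    "NSC308847": ("uM", (1.0, 5.0, 10.0)),
}
LEVELS = ("low", "med", "high")


def build_samples():
    """Return one record per sample: reporter x treatment x dose level."""
    samples = []
    for (group, reporter), (treatment, (unit, doses)) in product(
        REPORTERS.items(), TREATMENTS.items()
    ):
        for level, dose in zip(LEVELS, doses):
            samples.append({
                "group": group,
                "treatment": treatment,
                "level": level,
                "dose": dose,
                "unit": unit,
                # Plasmid amounts in ng per sample (24-well setup).
                "plasmids_ng": {
                    reporter: 100.0,
                    "CMV-tTA (pBA166)": 50.0,
                    "Junk-DNA, Ubi-empty-NOS (pDT7004)": 650.0,
                },
            })
    return samples


samples = build_samples()
print(len(samples))  # number of enumerated samples
```

Each record carries the same total plasmid amount, so the samples differ only in reporter and treatment.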

Supplementary Table 12 Transfection table experiment of Figure 2.8a. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 96-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH7.2.

| Component | Ubx4-mcherry-T122 |
|---|---|
| mCerulean-TRE-mCherry-Ubiqutintx4-PEST-T122x4 (pBH0277) (ng) | 25 |
| CMV-rtTA-TFF5x4 (pZ091) (ng) | 12.5 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) (ng) | 137.5 |
| DMSO | 1% |
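As a quick consistency check, the plasmid amounts of a sample can be summed to obtain the total DNA per well. The Python sketch below does this for the single sample of Supplementary Table 12. The helper names `total_dna_ng` and `mass_fractions` are hypothetical and only illustrate the arithmetic.

```python
# Plasmid amounts (ng) for the single sample of Supplementary Table 12.
# The dictionary layout and function names are illustrative assumptions.
TABLE_12_WELL = {
    "mCerulean-TRE-mCherry-Ubiqutintx4-PEST-T122x4 (pBH0277)": 25.0,
    "CMV-rtTA-TFF5x4 (pZ091)": 12.5,
    "Lac-op free Junk-DNA Ubi-Nos (pBH0265)": 137.5,
}


def total_dna_ng(well):
    """Total plasmid DNA co-transfected in one well, in ng."""
    return sum(well.values())


def mass_fractions(well):
    """Fraction of the total DNA mass contributed by each plasmid."""
    total = total_dna_ng(well)
    return {name: amount / total for name, amount in well.items()}


print(total_dna_ng(TABLE_12_WELL))  # 175.0
for name, fraction in mass_fractions(TABLE_12_WELL).items():
    print(f"{name}: {fraction:.3f}")
```

The same helper can be applied to any column of the other transfection tables to confirm that the junk DNA tops up each sample to a constant total.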

Supplementary Table 13 Transfection table experiment of Figure 2.8b. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 96-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH7.18.

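| Sample group | Treatment | Reporter plasmid | Reporter (ng) | CMV-rtTA-TFF5x4 (pZ091) (ng) | Lac-op free Junk-DNA Ubi-Nos (pBH0265) (ng) | Treatment amount |
|---|---|---|---|---|---|---|
| PEST-TFF5x4 | DMSO | mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287) | 25 | 12.5 | 137.5 | 0.1% |
| PEST-TFF5x4 | DMSO | mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287) | 25 | 12.5 | 137.5 | 0.5% |
| PEST-TFF5x4 | DMSO | mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287) | 25 | 12.5 | 137.5 | 1.0% |
| PEST-TFF5x4 | NSC308847 | mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287) | 25 | 12.5 | 137.5 | 1 μM |
| PEST-TFF5x4 | NSC308847 | mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287) | 25 | 12.5 | 137.5 | 5 μM |
| PEST-TFF5x4 | NSC308847 | mCerulean-TRE-mCherry-PEST-TFF5x4 (pBH0287) | 25 | 12.5 | 137.5 | 10 μM |
| PEST-T122x4 | DMSO | mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286) | 25 | 12.5 | 137.5 | 0.1% |
| PEST-T122x4 | DMSO | mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286) | 25 | 12.5 | 137.5 | 0.5% |
| PEST-T122x4 | DMSO | mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286) | 25 | 12.5 | 137.5 | 1.0% |
| PEST-T122x4 | NSC308847 | mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286) | 25 | 12.5 | 137.5 | 1 μM |
| PEST-T122x4 | NSC308847 | mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286) | 25 | 12.5 | 137.5 | 5 μM |
| PEST-T122x4 | NSC308847 | mCerulean-TRE-mCherry-PEST-T122x4 (pBH0286) | 25 | 12.5 | 137.5 | 10 μM |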

Supplementary Table 14 Transfection table experiment of Figure 2.8c. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 96-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH7.18.

| Component | DMSO | NSC308847 |
|---|---|---|
| mCerulean-TRE-mCherry-Ubiqutintx4-PEST-T122x4 (pBH0277) (ng) | 25 | 25 |
| CMV-rtTA-TFF5x4 (pZ091) (ng) | 12.5 | 12.5 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) (ng) | 137.5 | 137.5 |
| DMSO | 1.0% | |
| NSC308847 (μM) | | 10 |

Supplementary Table 15 Transfection table experiment of Figure 2.9b. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 24-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH3.64.

4h post transfection Reference DMSO NSC158959 NSC308847 CMV-PpLuc (pZ003) 100 100 100 100 CMV-RrLuc (pZ005) 100 100 100 100 Junk-DNA, Ubi-empty-NOS (pDT7004) 500 500 500 500 500 500 500 500 DMSO 10 10 NSC158959 10 10 NSC308847 10 10 Addition time post transfection 4h 4h 4h 4h 4h 4h

48h post transfection Direct to lysate DMSO NSC158959 NSC308847 DMSO NSC158959 NSC308847 CMV-PpLuc (pZ003) 100 100 100 100 100 100 CMV-RrLuc (pZ005) 100 100 100 100 100 100 Junk-DNA, Ubi-empty-NOS (pDT7004) 500 500 500 500 500 500 500 500 500 500 500 500 DMSO 10 10 10 10 NSC158959 10 10 10 10 NSC308847 10 10 10 10 Addition time post transfection 48h 48h 48h 48h 48h 48h 48.5h 48.5h 48.5h 48.5h 48.5h 48.5h

Supplementary Table 16 Transfection table experiment of Figure 2.10a. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 24-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH6.69.

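| Sample group | Treatment | Reporter plasmid | Reporter (ng) | CMV-rtTA-TFF5x4 (pZ091) (ng) | Lac-op free Junk-DNA Ubi-Nos (pBH0265) (ng) | Treatment amount |
|---|---|---|---|---|---|---|
| TFF5x4 | DMSO | PpLuc-TRE-RrLuc-TFF5x4 (pBH0159) | 100 | 50 | 550 | 1.0% |
| TFF5x4 | NSC5476 | PpLuc-TRE-RrLuc-TFF5x4 (pBH0159) | 100 | 50 | 550 | 0.1 μM |
| TFF5x4 | NSC5476 | PpLuc-TRE-RrLuc-TFF5x4 (pBH0159) | 100 | 50 | 550 | 1 μM |
| TFF5x4 | NSC5476 | PpLuc-TRE-RrLuc-TFF5x4 (pBH0159) | 100 | 50 | 550 | 10 μM |
| T122x1 | DMSO | PpLuc-TRE-RrLuc-T122x1 (pBH0160) | 100 | 50 | 550 | 1.0% |
| T122x1 | NSC5476 | PpLuc-TRE-RrLuc-T122x1 (pBH0160) | 100 | 50 | 550 | 0.1 μM |
| T122x1 | NSC5476 | PpLuc-TRE-RrLuc-T122x1 (pBH0160) | 100 | 50 | 550 | 1 μM |
| T122x1 | NSC5476 | PpLuc-TRE-RrLuc-T122x1 (pBH0160) | 100 | 50 | 550 | 10 μM |
| T122x4 | DMSO | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 550 | 1.0% |
| T122x4 | NSC5476 | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 550 | 0.1 μM |
| T122x4 | NSC5476 | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 550 | 1 μM |
| T122x4 | NSC5476 | PpLuc-TRE-RrLuc-T122x4 (pBH0161) | 100 | 50 | 550 | 10 μM |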

Supplementary Table 17 Transfection table experiment of Figure 2.10b. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 24-well setup. In case of compounds amounts are reported in μM, for DMSO as percentage of total reaction volume. Experiment can be found in BH6.93.

Reference direct addition to lysate addition 48 h post TF addition 4 h post TF CMV-PpLuc (pZ003) 100 100 100 100 100 100 100 CMV-RrLuc (pZ005) 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 600 600 600 600 600 600 600 600 600 600 600 600 600 600 DMSO 1.0% 1.0% 1.0% 1.0% 1.0% 1.0% NSC5476 10 10 10 10 10 10

Supplementary Table 18 Transfection table experiment of Figure 2.11a. The numbers are the nanogram (ng) plasmid amounts co-transfected per sample in a 24-well setup. In case of LNAs/mimics, the amounts are reported in nM final concentration. Experiment can be found in BH5.106.

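| Condition | Sample | Reporter plasmid | Reporter (ng) | CMV-Exportin-5 (pBH0151) (ng) | CMV-Neo-miR-30 Stem loop-anti-DGCR8 miRNA (pBH0175) (ng) | siNegCtrl (nM) | siDicer (nM) | CMV-rtTA-TFF5 (pZ091) (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) |
|---|---|---|---|---|---|---|---|---|---|
| Reference | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | | | | | 50 | 550 |
| Reference | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | | | | | 50 | 550 |
| Reference | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | | | | | 50 | 550 |
| Reference | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | | | | | 50 | 550 |
| Exp-5, 200 ng | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | 200 | | | | 50 | 350 |
| Exp-5, 200 ng | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | 200 | | | | 50 | 350 |
| Exp-5, 200 ng | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | 200 | | | | 50 | 350 |
| Exp-5, 200 ng | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | 200 | | | | 50 | 350 |
| Exp-5, 500 ng | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | 550 | | | | 50 | |
| Exp-5, 500 ng | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | 550 | | | | 50 | |
| Exp-5, 500 ng | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | 550 | | | | 50 | |
| Exp-5, 500 ng | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | 550 | | | | 50 | |
| miR-DGCR8, 200 ng | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | | 200 | | | 50 | 350 |
| miR-DGCR8, 200 ng | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | | 200 | | | 50 | 350 |
| miR-DGCR8, 200 ng | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | | 200 | | | 50 | 350 |
| miR-DGCR8, 200 ng | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | | 200 | | | 50 | 350 |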

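Supplementary Table 18 Continuation

| Condition | Sample | Reporter plasmid | Reporter (ng) | CMV-Exportin-5 (pBH0151) (ng) | CMV-Neo-miR-30 Stem loop-anti-DGCR8 miRNA (pBH0175) (ng) | siNegCtrl (nM) | siDicer (nM) | CMV-rtTA-TFF5 (pZ091) (ng) | Junk-DNA, Ubi-empty-NOS (pDT7004) (ng) |
|---|---|---|---|---|---|---|---|---|---|
| miR-DGCR8, 500 ng | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | | 550 | | | 50 | |
| miR-DGCR8, 500 ng | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | | 550 | | | 50 | |
| miR-DGCR8, 500 ng | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | | 550 | | | 50 | |
| miR-DGCR8, 500 ng | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | | 550 | | | 50 | |
| siDicer, 0 nM | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | | | 30 | | 50 | 550 |
| siDicer, 0 nM | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | | | 30 | | 50 | 550 |
| siDicer, 0 nM | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | | | 30 | | 50 | 550 |
| siDicer, 0 nM | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | | | 30 | | 50 | 550 |
| siDicer, 10 nM | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | | | 20 | 10 | 50 | 550 |
| siDicer, 10 nM | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | | | 20 | 10 | 50 | 550 |
| siDicer, 10 nM | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | | | 20 | 10 | 50 | 550 |
| siDicer, 10 nM | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | | | 20 | 10 | 50 | 550 |
| siDicer, 30 nM | let-7b | mCerulean-TRE-mCherry-Tlet7bx4 (pBH0082) | 100 | | | | 30 | 50 | 550 |
| siDicer, 30 nM | 122 | mCerulean-TRE-mCherry-T122x4 (pBH0112) | 100 | | | | 30 | 50 | 550 |
| siDicer, 30 nM | 146a | mCerulean-TRE-mCherry-T146ax4 (pBH0195) | 100 | | | | 30 | 50 | 550 |
| siDicer, 30 nM | FF5 | mCerulean-TRE-mCherry-TFF5x4 (pBH0091) | 100 | | | | 30 | 50 | 550 |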

Supplementary Table 19. Transfection table for the experiment of Figure 2.11b. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH6.161.

siNeg.Ctrl
mCerulean-TRE-mCherry-T18ax4 (pBH0078): 25
mCerulean-TRE-mCherry-T7x4 (pBH0079): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T16x4 (pBH0083): 25
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25
mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): 25
AmCyan-TRE-DsRed-T17x4 (pZ145): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Lac-op free Junk-DNA Ubi-Nos (pBH0265): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
siNeg.Ctrl.: 20 20 20 20 20 20 20 20 20 20 20 20
siDrosha: -
siDGCR8: -
siDicer: -
siTRBP4: -

Supplementary Table 19 Continuation

siDrosha
mCerulean-TRE-mCherry-T18ax4 (pBH0078): 25
mCerulean-TRE-mCherry-T7x4 (pBH0079): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T16x4 (pBH0083): 25
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25
mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): 25
AmCyan-TRE-DsRed-T17x4 (pZ145): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Lac-op free Junk-DNA Ubi-Nos (pBH0265): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
siNeg.Ctrl.: -
siDrosha: 20 20 20 20 20 20 20 20 20 20 20 20
siDGCR8: -
siDicer: -
siTRBP4: -

Supplementary Table 19 Continuation

siDGCR8
mCerulean-TRE-mCherry-T18ax4 (pBH0078): 25
mCerulean-TRE-mCherry-T7x4 (pBH0079): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T16x4 (pBH0083): 25
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25
mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): 25
AmCyan-TRE-DsRed-T17x4 (pZ145): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Lac-op free Junk-DNA Ubi-Nos (pBH0265): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
siNeg.Ctrl.: -
siDrosha: -
siDGCR8: 20 20 20 20 20 20 20 20 20 20 20 20
siDicer: -
siTRBP4: -

Supplementary Table 19 Continuation

siDicer
mCerulean-TRE-mCherry-T18ax4 (pBH0078): 25
mCerulean-TRE-mCherry-T7x4 (pBH0079): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T16x4 (pBH0083): 25
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25
mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): 25
AmCyan-TRE-DsRed-T17x4 (pZ145): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Lac-op free Junk-DNA Ubi-Nos (pBH0265): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
siNeg.Ctrl.: -
siDrosha: -
siDGCR8: -
siDicer: 20 20 20 20 20 20 20 20 20 20 20 20
siTRBP4: -

Supplementary Table 19 Continuation

siTRBP4
mCerulean-TRE-mCherry-T18ax4 (pBH0078): 25
mCerulean-TRE-mCherry-T7x4 (pBH0079): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T16x4 (pBH0083): 25
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25
mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): 25
AmCyan-TRE-DsRed-T17x4 (pZ145): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Lac-op free Junk-DNA Ubi-Nos (pBH0265): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
siNeg.Ctrl.: -
siDrosha: -
siDGCR8: -
siDicer: -
siTRBP4: 20 20 20 20 20 20 20 20 20 20 20 20

Supplementary Table 19 Continuation

all
mCerulean-TRE-mCherry-T18ax4 (pBH0078): 25
mCerulean-TRE-mCherry-T7x4 (pBH0079): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T16x4 (pBH0083): 25
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25
mCerulean-TRE-Ubx4-mCherry-T122x4 (pBH0281): 25
AmCyan-TRE-DsRed-T17x4 (pZ145): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Lac-op free Junk-DNA Ubi-Nos (pBH0265): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
siNeg.Ctrl.: -
siDrosha: 5 5 5 5 5 5 5 5 5 5 5 5
siDGCR8: 5 5 5 5 5 5 5 5 5 5 5 5
siDicer: 5 5 5 5 5 5 5 5 5 5 5 5
siTRBP4: 5 5 5 5 5 5 5 5 5 5 5 5

Supplementary Table 20. Transfection table for the experiment of Figure 2.12a. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.149.

Columns: T145; T375; T146a; T141; T122
mCerulean-TRE-mCherry-T145x4 (pBH0193): 25 25
mCerulean-TRE-mCherry-T375x4 (pBH0194): 25 25
mCerulean-TRE-mCherry-T146ax4 (pBH0195): 25 25
mCerulean-TRE-mCherry-T141x4 (pBH0267): 25 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25 25
CMV-rtTA-TFF5 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5
Junk-DNA, Ubi-empty-NOS (pDT7004): 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5 137.5
Mim-Neg.Ctrl.: 5 5 5 5 5
Mim-145: 5
Mim-375: 5
Mim-146a: 5
Mim-141: 5
Mim-122: 5

Supplementary Table 21. Transfection table for the experiment of Figure 2.12b. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.149 and BH5.152.

Columns: TFF5; T21; T20a; T122
mCerulean-TRE-mCherry-TFF5x4 (pBH0091): 25
mCerulean-TRE-mCherry-T21x4 (pBH0081): 25 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25 25
CMV-rtTA-TFF5 (pZ091): 12.5 12.5 12.5 12.5 12.5 12.5
Junk-DNA, Ubi-empty-NOS (pDT7004): 137.5 137.5 137.5 137.5 137.5 137.5
LNA-Neg.Ctrl.: 5 5 5 5
LNA-21: 5
LNA-20a: 5
LNA-122: 5

Supplementary Table 22. Transfection table for the experiment of Figure 2.12c. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH6.89 and BH6.99.

| Component | T21 | TFF4 | T141 | T146a |
|---|---|---|---|---|
| mCerulean-TRE-mCherry-T21x4 (pBH0082) | 25 |  |  |  |
| mCerulean-TRE-mCherry-TFF4x4 (pBH0266) |  | 25 |  |  |
| mCerulean-TRE-mCherry-T141x4 (pBH0267) |  |  | 25 |  |
| mCerulean-TRE-mCherry-T146ax4 (pBH0195) |  |  |  | 25 |
| mCerulean-TRE-mCherry-T122x4 (pBH0112) |  |  |  |  |
| mCerulean-TRE-mCherry-T20ax4 (pBH0080) |  |  |  |  |
| mCerulean-TRE-mCherry-23bx4 (pBH0273) |  |  |  |  |
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 | 12.5 | 12.5 |
| Junk-DNA, Ubi-empty-NOS (pDT7004) | 137.5 | 137.5 | 137.5 | 137.5 |
| Mim-Neg.Ctrl. | 5 | 5 | 5 | 5 |
| Mim-21 | 5 | 5 | 5 | 5 |
| siFF4 | 5 | 5 | 5 | 5 |
| Mim-141 | 5 | 5 | 5 | 5 |
| Mim-146a | 5 | 5 | 5 | 5 |
| Mim-122 | 5 | 5 | 5 | 5 |
| Mim-20a | 5 | 5 | 5 | 5 |
| Mim-23b | 5 | 5 | 5 | 5 |

Supplementary Table 22 Continuation

| Component | T122 | T20a | T23b |
|---|---|---|---|
| mCerulean-TRE-mCherry-T21x4 (pBH0082) |  |  |  |
| mCerulean-TRE-mCherry-TFF4x4 (pBH0266) |  |  |  |
| mCerulean-TRE-mCherry-T141x4 (pBH0267) |  |  |  |
| mCerulean-TRE-mCherry-T146ax4 (pBH0195) |  |  |  |
| mCerulean-TRE-mCherry-T122x4 (pBH0112) | 25 |  |  |
| mCerulean-TRE-mCherry-T20ax4 (pBH0080) |  | 25 |  |
| mCerulean-TRE-mCherry-23bx4 (pBH0273) |  |  | 25 |
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 | 12.5 |
| Junk-DNA, Ubi-empty-NOS (pDT7004) | 137.5 | 137.5 | 137.5 |
| Mim-Neg.Ctrl. | 5 | 5 | 5 |
| Mim-21 | 5 | 5 | 5 |
| siFF4 | 5 | 5 | 5 |
| Mim-141 | 5 | 5 | 5 |
| Mim-146a | 5 | 5 | 5 |
| Mim-122 | 5 | 5 | 5 |
| Mim-20a | 5 | 5 | 5 |
| Mim-23b | 5 | 5 | 5 |

Supplementary Table 23. Transfection table for the experiment of Figure 2.12d. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH6.89 and BH6.99.

Columns: T21; T20a; TFF4
mCerulean-TRE-mCherry-T21x4 (pBH0082): 25
mCerulean-TRE-mCherry-T20ax4 (pBH0080): 25
mCerulean-TRE-mCherry-TFF4x4 (pBH0266): 25
mCerulean-TRE-mCherry-T122x4 (pBH0112): -
mCerulean-TRE-mCherry-23bx4 (pBH0273): -
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5 12.5
TRE-LacI-TFF5x4-miR-FF4 (pZ225): 12.5
Junk-DNA, Ubi-empty-NOS (pDT7004): 137.5 125 137.5
LNA-Neg.Ctrl.: 5 5 5
LNA-21: 5 5 5
LNA-20a: 5 5 5
LNA-FF4: 5 5 5
LNA-122: 5 5 5
LNA-23b: 5 5 5

Supplementary Table 23 Continuation

Columns: T122; T23b
mCerulean-TRE-mCherry-T21x4 (pBH0082): -
mCerulean-TRE-mCherry-T20ax4 (pBH0080): -
mCerulean-TRE-mCherry-TFF4x4 (pBH0266): -
mCerulean-TRE-mCherry-T122x4 (pBH0112): 25
mCerulean-TRE-mCherry-23bx4 (pBH0273): 25
CMV-rtTA-TFF5x4 (pZ091): 12.5 12.5
TRE-LacI-TFF5x4-miR-FF4 (pZ225): -
Junk-DNA, Ubi-empty-NOS (pDT7004): 137.5 137.5
LNA-Neg.Ctrl.: 5 5
LNA-21: 5 5
LNA-20a: 5 5
LNA-FF4: 5 5
LNA-122: 5 5
LNA-23b: 5 5

Supplementary Table 24. Transfection table for the experiment of Figure 2.13. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. The experiment can be found in BH5.146.

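| Component | TFF5 | TFF5 | TFF5 | TFF5 | TFF5 | TFF5 | T130a | T130a | T130a | T130a | T130a | T130a |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 0 | 0.93 | 2.78 | 8.33 | 25 | 75 |  |  |  |  |  |  |
| CMV-rtTA-T130ax4 (pBH0213) |  |  |  |  |  |  | 0 | 0.93 | 2.78 | 8.33 | 25 | 75 |
| CMV-rtTA-T20ax4-T130ax4 (pBH0201) |  |  |  |  |  |  |  |  |  |  |  |  |
| CMV-rtTA-T20ax4 (pBH0211) |  |  |  |  |  |  |  |  |  |  |  |  |
| CMV-rtTA-T21x4 (pZ090) |  |  |  |  |  |  |  |  |  |  |  |  |
| CMV-rtTA-T20ax1 (pBH0225) |  |  |  |  |  |  |  |  |  |  |  |  |
| CMV-rtTA-T20ax2 (pBH0226) |  |  |  |  |  |  |  |  |  |  |  |  |
| mCerulean-TRE-MCS (pIM015) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Ef1α-mCherry (pKH026) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 500 | 499.07 | 497.22 | 491.67 | 475 | 425 | 500 | 499.07 | 497.22 | 491.67 | 475 | 425 |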
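
The rtTA amounts in this table follow a 3-fold dilution series from 75 ng, and the junk DNA drops by the same amount so that each sample stays at 700 ng total. The following is a minimal Python sketch of that arithmetic, not the author's procedure; the function name `titration_with_filler` and its parameters are hypothetical and chosen only for illustration.

```python
def titration_with_filler(top_ng, fold, steps, filler_at_zero_ng):
    """Return (activator_ng, filler_ng) pairs, lowest dose first.

    The series starts with a 0 ng point and then climbs by `fold` up to
    `top_ng`. The filler (junk DNA) shrinks by the activator amount, so the
    sum of both stays constant. Values are rounded to 2 decimals, as in the
    table.
    """
    doses = [0.0] + [top_ng / fold**i for i in range(steps - 1, -1, -1)]
    return [(round(d, 2), round(filler_at_zero_ng - d, 2)) for d in doses]


# Assumed inputs read off the table: 75 ng top dose, 3-fold steps,
# 500 ng junk DNA in the sample without activator.
rows = titration_with_filler(top_ng=75, fold=3, steps=5, filler_at_zero_ng=500)
for activator_ng, filler_ng in rows:
    print(f"activator {activator_ng:6.2f} ng, filler {filler_ng:7.2f} ng")
```

With these inputs the pairs match the rtTA and junk DNA rows of the table (0/500, 0.93/499.07, 2.78/497.22, 8.33/491.67, 25/475, 75/425).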

Supplementary Table 25. Transfection table for the experiment of Figure 2.14b. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. The experiment can be found in BH5.98 and BH5.132.

| Component | rtTA | PIT2 | ET |
|---|---|---|---|
| mCerulean-TRE-mCherry-Spacer (pBH0074) | 100 |  |  |
| CMV-rtTA-TFF5x4 (pZ091) | 25 |  |  |
| mCerulean-PRE-mCherry-800 bp Spacer (pBH0107) |  | 100 |  |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) |  | 25 |  |
| mCerulean-ERE-mCherry-800 bp Spacer (pBH0111) |  |  | 100 |
| CAGop-ET-TFF4x4-T146ax4-T375x4-T145x4 (pBH0229) |  |  | 25 |
| Junk-DNA, Ubi-empty-NOS (pDT7004) | 475 | 475 | 475 |
| Ef1α-mCitrine (pKH025) | 100 | 100 | 100 |

Supplementary Table 26. Transfection table for the experiment of Figure 2.14c. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. The experiment can be found in BH5.98 and BH5.132.

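| Component | rtTA | rtTA | rtTA | rtTA | rtTA | rtTA | rtTA | PIT2 | PIT2 | PIT2 | PIT2 | PIT2 | PIT2 | PIT2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mCerulean-TRE-mCherry-Spacer (pBH0074) | 100 | 100 | 100 | 100 | 100 | 100 | 100 |  |  |  |  |  |  |  |
| CMV-rtTA-TFF5x4 (pZ091) | 50 | 25 | 12.5 | 6.25 | 3.13 | 1.56 | 0 |  |  |  |  |  |  |  |
| mCerulean-PRE-mCherry-800 bp Spacer (pBH0107) |  |  |  |  |  |  |  | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) |  |  |  |  |  |  |  | 50 | 25 | 12.5 | 6.25 | 3.13 | 1.56 | 0 |
| mCerulean-ERE-mCherry-800 bp Spacer (pBH0111) |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| CAGop-ET-TFF4x4-T146ax4-T375x4-T145x4 (pBH0229) |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| Junk-DNA, Ubi-empty-NOS (pDT7004) | 450 | 475 | 488 | 494 | 497 | 498 | 500 | 450 | 475 | 488 | 494 | 497 | 498 | 500 |
| Ef1α-mCitrine (pKH025) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |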

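| Component | ET | ET | ET | ET | ET | ET | ET |
|---|---|---|---|---|---|---|---|
| mCerulean-TRE-mCherry-Spacer (pBH0074) |  |  |  |  |  |  |  |
| CMV-rtTA-TFF5x4 (pZ091) |  |  |  |  |  |  |  |
| mCerulean-PRE-mCherry-800 bp Spacer (pBH0107) |  |  |  |  |  |  |  |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) |  |  |  |  |  |  |  |
| mCerulean-ERE-mCherry-800 bp Spacer (pBH0111) | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| CAGop-ET-TFF4x4-T146ax4-T375x4-T145x4 (pBH0229) | 50 | 25 | 12.5 | 6.25 | 3.13 | 1.56 | 0 |
| Junk-DNA, Ubi-empty-NOS (pDT7004) | 450 | 475 | 488 | 494 | 497 | 498 | 500 |
| Ef1α-mCitrine (pKH025) | 100 | 100 | 100 | 100 | 100 | 100 | 100 |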

Supplementary Table 27. Transfection table for the experiment of Figure 2.15. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.144.

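| Component | PIT, TFF4-146a-141 | PIT, TFF4-146a-141 | PIT, TFF4-146a-141 | PIT, TFF4-146a-141 | PIT, T146a-141-TFF4 | PIT, T146a-141-TFF4 | PIT, T146a-141-TFF4 | PIT, T146a-141-TFF4 | ET, TFF4-146a-141 | ET, TFF4-146a-141 | ET, TFF4-146a-141 | ET, TFF4-146a-141 | ET, T146a-141-TFF4 | ET, T146a-141-TFF4 | ET, T146a-141-TFF4 | ET, T146a-141-TFF4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CAGop-Citrine-2A-PIT2-TFF4x4-T146ax4-T141x4 (pBH0232) | 25 | 25 | 25 | 25 |  |  |  |  |  |  |  |  |  |  |  |  |
| CAGop-Citrine-2A-PIT2-T146ax4-T141x4-TFF4x4 (pBH0254) |  |  |  |  | 25 | 25 | 25 | 25 |  |  |  |  |  |  |  |  |
| mCerulean-PRE-mCherry-T122x4 (pBH0182) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |  |  |  |  |  |  |  |  |
| CAGop-ET-TFF4x4-T146ax4-T141x4 (pBH0231) |  |  |  |  |  |  |  |  | 25 | 25 | 25 | 25 |  |  |  |  |
| CAGop-ET-2A-Citrine-T146ax4-T141x4-TFF4x4 (pBH0250) |  |  |  |  |  |  |  |  |  |  |  |  | 25 | 25 | 25 | 25 |
| mCerulean-ERE-mCherry-T122x4 (pBH0177) |  |  |  |  |  |  |  |  | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| CMV-iRFP (pCS0012) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Junk-DNA | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 | 475 |
| Mim-Neg.Ctrl. | 5 |  |  |  | 5 |  |  |  | 5 |  |  |  | 5 |  |  |  |
| siFF4 |  | 5 |  |  |  | 5 |  |  |  | 5 |  |  |  | 5 |  |  |
| Mim-146a |  |  | 5 |  |  |  | 5 |  |  |  | 5 |  |  |  | 5 |  |
| Mim-141 |  |  |  | 5 |  |  |  | 5 |  |  |  | 5 |  |  |  | 5 |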

Supplementary Table 28. Transfection table for the experiment of Figure 2.16b, c. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. The experiment can be found in BH6.2.

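| Component | T21 | T21 | T21 | T21 | T21 | T20a | T20a | T20a | T20a | T20a | TFF5 | TFF5 | TFF5 | TFF5 | TFF5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 25 | 25 | 25 | 25 | 25 |  |  |  |  |  |  |  |  |  |  |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 0 | 25 | 50 | 100 | 150 |  |  |  |  |  |  |  |  |  |  |
| CMV-rtTA-T20ax4 (pBH0211) |  |  |  |  |  | 25 | 25 | 25 | 25 | 25 |  |  |  |  |  |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) |  |  |  |  |  | 0 | 25 | 50 | 100 | 150 |  |  |  |  |  |
| CMV-rtTA-TFF5x4 (pZ091) |  |  |  |  |  |  |  |  |  |  | 25 | 25 | 25 | 25 | 25 |
| TRE-LacI-TFF5x4-miR-FF4 (pZ225) |  |  |  |  |  |  |  |  |  |  | 0 | 25 | 50 | 100 | 150 |
| CAGop-Citrine-2A-PIT2-T146ax4-T141x4-TFF4x4 (pBH0254) | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 | 25 |
| mCerulean-PRE-mCherry-T122x4 (pBH0182) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| CMV-iRFP (pCS0012) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 450 | 425 | 400 | 350 | 300 | 450 | 425 | 400 | 350 | 300 | 450 | 425 | 400 | 350 | 300 |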
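
A quick consistency check for a titration like this one is to sum every sample and confirm that the junk DNA compensates for the titrated plasmid. The Python sketch below does this for one of the three series in the table; the constant and function names are hypothetical, and the grouping of the four non-titrated plasmids into a single fixed amount is an assumption made for brevity.

```python
# Amounts in ng, read off the table for one titration series.
FIXED_NG = 25 + 25 + 100 + 100  # rtTA + PIT2 plasmid + reporter + iRFP
LACI_NG = [0, 25, 50, 100, 150]      # titrated TRE-LacI-...-miR-FF4 plasmid
JUNK_NG = [450, 425, 400, 350, 300]  # junk DNA filler


def sample_totals(fixed_ng, variable_ng, filler_ng):
    """Return the total DNA per sample for paired variable/filler amounts."""
    return [fixed_ng + v + f for v, f in zip(variable_ng, filler_ng)]


totals = sample_totals(FIXED_NG, LACI_NG, JUNK_NG)
print(totals)
```

Every sample in the series sums to 700 ng, the same total as the other 24-well tables in this appendix.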

Supplementary Table 29. Transfection table for the experiment of Figure 2.16e, f. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.158.

| Component | LNA-Neg.Ctrl. | LNA-21 | LNA-20a | Mim-Neg.Ctrl. | Mim-146a | Mim-141 |
|---|---|---|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| CMV-rtTA-T20ax4 (pBH0211) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| CAGop-Citrine-2A-PIT2-T146ax4-T141x4-TFF4x4 (pBH0254) | 25 | 25 | 25 | 25 | 25 | 25 |
| mCerulean-PRE-mCherry-T122x4 (pBH0182) | 100 | 100 | 100 | 100 | 100 | 100 |
| CMV-iRFP (pCS0012) | 100 | 100 | 100 | 100 | 100 | 100 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 425 | 425 | 425 | 425 | 425 | 425 |
| LNA-Neg.Ctrl. | 5 |  |  |  |  |  |
| LNA-21 |  | 5 |  |  |  |  |
| LNA-20a |  |  | 5 |  |  |  |
| Mim-Neg.Ctrl. |  |  |  | 5 |  |  |
| Mim-146a |  |  |  |  | 5 |  |
| Mim-141 |  |  |  |  |  | 5 |

Supplementary Table 30. Transfection table for the experiment of Figure 2.22. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.111.

Parallel assay CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-ZsYellow-T146ax4- T141x4-TFF4x4 (pBH0260) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 CMV-PIT2 (pMF206) 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 50 mCerulean-PRE-mCherry-T122x4 (pBH0182) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 CMV-IFP1.4 (pZ210) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 375 375 375 375 375 375 375 375 375 375 375 375 375 375 375 375 375 375 LNA-Neg.Ctrl. 5 LNA-122 5 5 5 LNA-21 5 5 5 LNA-20a 5 5 5 LNA-FF4 5 5 Mim-Neg.Ctrl. 5 Mim-122 5 5 5 5 Mim-21 5 5 5 Mim-20a 5 5 5 siFF4 5 5 Mim-141 5 5 5 5 Mim-146a 5 5 5 5

Supplementary Table 30 Continuation

LFF assay CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-PIT2-T146ax4-T141x4- TFF4x4 (pBH0256) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 T141x4-T146ax4-mCerulean- PRE-mCherry-T122x4 (pBH0246) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 LNA-Neg.Ctrl. 5 LNA-122 5 5 5 LNA-21 5 5 5 LNA-20a 5 5 5 LNA-FF4 5 5 Mim-Neg.Ctrl. 5 Mim-122 5 5 5 5 Mim-21 5 5 5 Mim-20a 5 5 5 siFF4 5 5 Mim-141 5 5 5 5 Mim-146a 5 5 5 5

Supplementary Table 30 Continuation

CFF assay CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-PIT2-T146ax4-T141x4- TFF4x4 (pBH0256) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 TFF4x4-T141x4-T146ax4-mCeru- lean-PRE-mCherry-T122x4 (pBH0264) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 LNA-Neg.Ctrl. 5 LNA-122 5 5 5 LNA-21 5 5 5 LNA-20a 5 5 5 LNA-FF4 5 5 Mim-Neg.Ctrl. 5 Mim-122 5 5 5 5 Mim-21 5 5 5 Mim-20a 5 5 5 siFF4 5 5 Mim-141 5 5 5 5 Mim-146a 5 5 5 5

Supplementary Table 31. Transfection table for the experiment of Figure 2.23b, d. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.111.

CFF assay CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-PIT2-T146ax4-T141x4- TFF4x4 (pBH0256) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 TFF4x4-T141x4-T146ax4-mCeru- lean-PRE-mCherry-T122x4 (pBH0264) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 LNA-Neg.Ctrl. 5 LNA-122 5 5 5 LNA-21 5 5 5 LNA-20a 5 5 5 LNA-FF4 5 5 Mim-Neg.Ctrl. 5 Mim-122 5 5 5 5 Mim-21 5 5 5 Mim-20a 5 5 5 siFF4 5 5 Mim-141 5 5 5 5 Mim-146a 5 5 5 5

Supplementary Table 32. Transfection table for the experiment of Figure 2.24a, b. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 24-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH5.111 and BH5.120.

high mimic/LNA CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-PIT2-T146ax4-T141x4- TFF4x4 (pBH0256) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 TFF4x4-T141x4-T146ax4- mCerulean-PRE-mCherry- T122x4 (pBH0264) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 LNA-Neg.Ctrl. 5 LNA-122 5 5 5 LNA-21 5 5 5 LNA-20a 5 5 5 LNA-FF4 5 5 Mim-Neg.Ctrl. 5 Mim-122 5 5 5 5 Mim-21 5 5 5 Mim-20a 5 5 5 siFF4 5 5 Mim-141 5 5 5 5 Mim-146a 5 5 5 5

Supplementary Table 32 Continuation

medium mimic/LNA CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-PIT2-T146ax4-T141x4- TFF4x4 (pBH0256) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 TFF4x4-T141x4-T146ax4- mCerulean-PRE-mCherry- T122x4 (pBH0264) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 LNA-Neg.Ctrl. 1 LNA-122 1 1 1 LNA-21 1 1 1 LNA-20a 1 1 1 LNA-FF4 1 1 Mim-Neg.Ctrl. 1 Mim-122 1 1 1 1 Mim-21 1 1 1 Mim-20a 1 1 1 siFF4 1 1 Mim-141 1 1 1 1 Mim-146a 1 1 1 1

Supplementary Table 32 Continuation

low mimic/LNA CMV-rtTA-T21x4 (pZ090) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T21x4-miR-FF4 (pZ224) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CMV-rtTA-T20ax4 (pBH0211) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 12.5 CAGop-PIT2-T146ax4-T141x4- TFF4x4 (pBH0256) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 TFF4x4-T141x4-T146ax4- mCerulean-PRE-mCherry- T122x4 (pBH0264) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 425 LNA-Neg.Ctrl. 0.1 LNA-122 0.1 0.1 0.1 LNA-21 0.1 0.1 0.1 LNA-20a 0.1 0.1 0.1 LNA-FF4 0.1 0.1 Mim-Neg.Ctrl. 0.1 Mim-122 0.1 0.1 0.1 0.1 Mim-21 0.1 0.1 0.1 Mim-20a 0.1 0.1 0.1 siFF4 0.1 0.1 Mim-141 0.1 0.1 0.1 0.1 Mim-146a 0.1 0.1 0.1 0.1

Supplementary Table 33. Transfection table for the experiments of Figure 2.25 and Supplementary Figures 1 and 2. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM; for compounds, in μM. The experiment can be found in BH6.120 and BH6.126.

| Component | DMSO/Samples | LNA-122 | Mim-122 | LNA-21 | Mim-146a |
|---|---|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 | 6.25 | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 25 | 25 | 25 | 25 | 25 |
| Ef1α-mCitrine (pKH025) | 25 | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 106.25 | 106.25 | 106.25 | 106.25 | 106.25 |
| DMSO/Compound | 1%/10 |  |  |  |  |
| LNA-122 |  | 5 |  |  |  |
| Mim-122 |  |  | 5 |  |  |
| LNA-21 |  |  |  | 5 |  |
| Mim-146a |  |  |  |  | 5 |
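
The plasmid amounts in this 96-well table are exactly one quarter of those listed for the same circuit in the 24-well setup (Supplementary Table 32). This is an observation from the tables, not a stated protocol. A minimal Python sketch of that scaling follows; the helper `scale_mix` and the dictionary layout are hypothetical.

```python
def scale_mix(mix_ng, factor):
    """Scale every plasmid amount (ng) in a transfection mix by `factor`."""
    return {name: amount * factor for name, amount in mix_ng.items()}


# 24-well amounts (ng per sample) as listed in Supplementary Table 32.
mix_24_well = {
    "CMV-rtTA-T21x4 (pZ090)": 12.5,
    "TRE-LacI-T21x4-miR-FF4 (pZ224)": 12.5,
    "CMV-rtTA-T20ax4 (pBH0211)": 12.5,
    "TRE-LacI-T20ax4-miR-FF4 (pBH0212)": 12.5,
    "CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256)": 25,
    "TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264)": 100,
    "Ef1α-mCitrine (pKH025)": 100,
    "Lac-op free Junk-DNA Ubi-Nos (pBH0265)": 425,
}

# Assumed scaling factor of 1/4 between the 24-well and 96-well formats.
mix_96_well = scale_mix(mix_24_well, 0.25)
for name, amount in mix_96_well.items():
    print(f"{name}: {amount} ng")
print("total:", sum(mix_96_well.values()), "ng")
```

The scaled values reproduce the rows of this table (3.125, 6.25, 25 and 106.25 ng), and the total per sample drops from 700 ng to 175 ng.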

Supplementary Table 34. Transfection table for the experiment of Figure 2.26a. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For compounds, amounts are reported in μM; for DMSO, as a percentage of the total reaction volume. The experiment can be found in BH7.19.

| Component | DMSO | Tadalafil | Indinavir | Donepezil |
|---|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 | 34.375 | 34.375 |
| Compound | 1% | 0.4 | 0.4 | 0.4 |
| Compound |  | 2 | 2 | 2 |
| Compound |  | 10 | 10 | 10 |
| Compound |  | 50 | 50 | 50 |

| Component | Indatraline | Hexachlorophene | Chloroxine |
|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 | 34.375 |
| Compound | 0.4 | 0.4 | 0.4 |
| Compound | 2 | 2 | 2 |
| Compound | 10 | 10 | 10 |
| Compound | 50 | 50 | 50 |

Supplementary Table 34 Continuation

| Component | Levothyroxine | Lomerizine |
|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 |
| Compound | 0.4 | 0.4 |
| Compound | 2 | 2 |
| Compound | 10 | 10 |
| Compound | 50 | 50 |

| Component | Digoxin | Mefloquine |
|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 |
| Compound | 0.4 | 0.4 |
| Compound | 2 | 2 |
| Compound | 10 | 10 |
| Compound | 50 | 50 |

Supplementary Table 35. Transfection table for the experiment of Figure 2.26b, c. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For compounds, amounts are reported in μM; for DMSO, as a percentage of the total reaction volume. The experiment can be found in BH7.19.

| Component | DMSO | Tadalafil | Indinavir | Donepezil | Levothyroxine | Lomerizine |
|---|---|---|---|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| mCerulean-TRE-mCherry-T122x4 (pBH0112) | 25 | 25 | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 | 137.5 | 137.5 | 137.5 | 137.5 |
| Compound | 1% | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| Compound |  | 2 | 2 | 2 | 2 | 2 |
| Compound |  | 10 | 10 | 10 | 10 | 10 |
| Compound |  | 50 | 50 | 50 | 50 | 50 |

Supplementary Table 35 Continuation

| Component | Indatraline | Hexachlorophene | Chloroxine | Digoxin | Mefloquine |
|---|---|---|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| mCerulean-TRE-mCherry-T122x4 (pBH0112) | 25 | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 | 137.5 | 137.5 | 137.5 |
| Compound | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| Compound | 2 | 2 | 2 | 2 | 2 |
| Compound | 10 | 10 | 10 | 10 | 10 |
| Compound | 50 | 50 | 50 | 50 | 50 |

Supplementary Table 36. Transfection table for the experiment of Figure 2.26d. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For compounds, amounts are reported in μM; for DMSO, as a percentage of the total reaction volume. The experiment can be found in BH7.19.

| Component | DMSO | Ritonavir | Saquinavir | Ifenprodil |
|---|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 | 34.375 | 34.375 |
| Compound | 1% | 0.4 | 0.4 | 0.4 |
| Compound |  | 2 | 2 | 2 |
| Compound |  | 10 | 10 | 10 |
| Compound |  | 50 | 50 | 50 |

Supplementary Table 36 Continuation

| Component | Amlodipine | Oxytetracycline | Clobetasol propionate | Rifabutin | Rifapentine |
|---|---|---|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 | 3.125 | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 | 6.25 | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 | 100 | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 | 34.375 | 34.375 | 34.375 |
| Compound | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| Compound | 2 | 2 | 2 | 2 | 2 |
| Compound | 10 | 10 | 10 | 10 | 10 |
| Compound | 50 | 50 | 50 | 50 | 50 |

Supplementary Table 37. Transfection table for the experiment of Figure 2.26e, f. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For compounds, amounts are reported in μM; for DMSO, as a percentage of the total reaction volume. The experiment can be found in BH7.15 and BH7.19.

| Component | DMSO | Finasteride | 2-Chloroadenosine | Mestranol | Azasetron | Droperidol |
|---|---|---|---|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 | 12.5 |
| mCerulean-TRE-mCherry-T122x4 (pBH0112) | 25 | 25 | 25 | 25 | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 | 137.5 | 137.5 | 137.5 | 137.5 |
| Compound | 1% | 0.4 | 0.4 | 0.4 | 0.4 | 0.4 |
| Compound |  | 2 | 2 | 2 | 2 | 2 |
| Compound |  | 10 | 10 | 10 | 10 | 10 |
| Compound |  | 50 | 50 | 50 | 50 | 50 |

Supplementary Table 37 Continuation

| Component | Diclofenac | Telithromycin |
|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 |
| mCerulean-TRE-mCherry-T122x4 (pBH0112) | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 |
| Compound | 0.4 | 0.4 |
| Compound | 2 | 2 |
| Compound | 10 | 10 |
| Compound | 50 | 50 |

Supplementary Table 38. Transfection table for the experiment of Figure 2.27b. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs and mimics, the amounts are reported as final concentrations in nM. The experiment can be found in BH6.145.

miR-23b assay CMV-rtTA-T21x4 (pZ090) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 TRE-LacI-T21x4-miR-FF4 (pZ224) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 CMV-rtTA-T20ax4 (pBH0211) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 TRE-LacI-T20ax4-miR-FF4 (pBH0212) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 CAGop-PIT2-T146ax4- T141x4TFF4x4 (pBH0256) 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 TFF4x4-T141x4-T146ax4- mCerulean-PRE-mCherry- T23bx4 (pBH0278) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 Ef1α-mCitrine (pKH025) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 Lac-op free Junk-DNA Ubi-Nos (pBH0265) 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 LNA-Neg.Ctrl. 5 LNA-23b 5 5 5 LNA-21 5 5 5 LNA-20a 5 5 5 LNA-FF4 5 5 Mim-Neg.Ctrl. 5 Mim-23b 5 5 5 5 Mim-21 5 5 5 Mim-20a 5 5 5 siFF4 5 5 Mim-146a 5 5 5 5 Mim-141 5 5 5 5

Supplementary Table 39 Transfection table for the experiment of Figure 2.27c. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs/mimics, the amounts are reported as final concentration in nM. The experiment can be found in BH7.4.

miR-145/375 assay
CMV-rtTA-T21x4 (pZ090) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125
TRE-LacI-T21x4-miR-FF4 (pZ224) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125
CMV-rtTA-T20ax4 (pBH0211) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125
TRE-LacI-T20ax4-miR-FF4 (pBH0212) 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125 3.125
CAGop-PIT2-T146ax4-T375x4-T145x4-TFF4x4 (pBH0255) 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25 6.25
TFF4x4-T145x4-T375x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0263) 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100
Ef1α-mCitrine (pKH025) 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25 25
Lac-op free Junk-DNA Ubi-Nos (pBH0265) 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75 43.75
LNA-Neg.Ctrl. 5
LNA-122 5 5 5
LNA-21 5 5 5
LNA-20a 5 5 5
LNA-FF4 5 5
Mim-Neg.Ctrl. 5
Mim-122 5 5 5 5
Mim-21 5 5 5
Mim-20a 5 5 5
siFF4 5 5
Mim-146a 5 5 5 5
Mim-141
Mim-375 5 5 5 5
Mim-145 5 5 5 5

Supplementary Table 40 Transfection table for the experiment of Figure 2.28a. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs/mimics, the amounts are reported as final concentration in nM. The experiment can be found in BH7.43.

Circuit assay

| Component | LNA-Neg.Ctrl. | LNA-122/21/20a |
|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 |
| LNA-Neg.Ctrl. | 3 | |
| LNA-122 | | 1 |
| LNA-21 | | 1 |
| LNA-20a | | 1 |

Bidirectional reporter assay

| Component | LNA-Neg.Ctrl. | LNA-122/21/20a |
|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 |
| mCerulean-PRE-mCherry-T122x4 (pBH0112) | 25 | 100 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 |
| LNA-Neg.Ctrl. | 3 | |
| LNA-122 | | 1 |
| LNA-21 | | 1 |
| LNA-20a | | 1 |

Supplementary Table 41 Transfection table for the experiment of Figure 2.28b. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. Compound amounts are reported in μM, and DMSO as a percentage of the total reaction volume. The experiments can be found in BH6.120, BH6.126 and BH7.40.

Circuit assay

| Component | DMSO | Clobetasol |
|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-T122x4 (pBH0264) | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 |
| DMSO | 1% | |
| Clobetasol propionate | | 10 |

Bidirectional reporter assay

| Component | DMSO | Clobetasol |
|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 |
| mCerulean-PRE-mCherry-T122x4 (pBH0112) | 25 | 100 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 |
| DMSO | 1% | |
| Clobetasol propionate | | 10 |


Supplementary Table 42 Transfection table for the experiment of Figure 2.28c. The numbers are the plasmid amounts in nanograms (ng) co-transfected per sample in a 96-well setup. For LNAs/mimics, the amounts are reported as final concentration in nM. The experiment can be found in BH7.32.

Circuit assay

| Component | siNeg.Ctrl. | siDrosha/siDicer |
|---|---|---|
| CMV-rtTA-T21x4 (pZ090) | 3.125 | 3.125 |
| TRE-LacI-T21x4-miR-FF4 (pZ224) | 3.125 | 3.125 |
| CMV-rtTA-T20ax4 (pBH0211) | 3.125 | 3.125 |
| TRE-LacI-T20ax4-miR-FF4 (pBH0212) | 3.125 | 3.125 |
| CAGop-PIT2-T146ax4-T141x4-TFF4x4 (pBH0256) | 6.25 | 6.25 |
| TFF4x4-T141x4-T146ax4-mCerulean-PRE-mCherry-Tlet7bx4 (pBH0288) | 100 | 100 |
| Ef1α-mCitrine (pKH025) | 25 | 25 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 34.375 | 34.375 |
| siNeg.Ctrl. | 10 | |
| siDrosha | | 5 |
| siDicer | | 5 |

Bidirectional reporter assay

| Component | siNeg.Ctrl. | siDrosha/siDicer |
|---|---|---|
| CMV-rtTA-TFF5x4 (pZ091) | 12.5 | 12.5 |
| mCerulean-PRE-mCherry-Tlet-7bx4 (pBH0082) | 25 | 100 |
| Lac-op free Junk-DNA Ubi-Nos (pBH0265) | 137.5 | 137.5 |
| siNeg.Ctrl. | 10 | |
| siDrosha | | 5 |
| siDicer | | 5 |

MATLAB scripts

Two MATLAB scripts were used to generate the results for the small molecule library screen. The script “image_processing_all_automated_small.m” was used to process the microscopy images. As output, it generates an Excel file with the intensities of the different colors in a cropped image. The principle of this script was later used for a project of Jörg Schreiber. To increase the readability of the script, I rewrote it using function calls. The rewritten script is called “image_processing_updated_4.m” and should be used for potential future experiments. The script “volcano_feat_venn_modified.m” was used to identify potential hits in this dataset and to generate the corresponding plots. Due to the length of the code, it is not attached here but is available upon request.
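As the original MATLAB code is not reproduced here, the following is only a minimal Python sketch of the general idea behind the image-processing step: crop each image to a region of interest, average the pixel intensities per color channel, and write one row per sample to a tabular file. It is not the author's implementation. The function names, the crop window, the sample label "A1", the pixel values and the use of CSV in place of an Excel file are all assumptions made for illustration.

```python
import csv
import io


def crop(channel, top, left, height, width):
    """Return a rectangular sub-region of a 2D grid of pixel intensities."""
    return [row[left:left + width] for row in channel[top:top + height]]


def mean_channel_intensities(image, top, left, height, width):
    """Mean pixel intensity of each color channel inside the crop window."""
    means = {}
    for name, channel in image.items():
        region = crop(channel, top, left, height, width)
        pixels = [pixel for row in region for pixel in row]
        means[name] = sum(pixels) / len(pixels)
    return means


def to_csv(rows, channel_names):
    """Serialize one row per sample; CSV stands in for the Excel output."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["sample", *channel_names])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical 4x4 two-channel image; the values are made up for illustration.
image = {
    "mCherry": [[0, 0, 0, 0], [0, 10, 20, 0], [0, 30, 40, 0], [0, 0, 0, 0]],
    "mCerulean": [[5, 5, 5, 5] for _ in range(4)],
}

# Average each channel over the central 2x2 window of the image.
intensities = mean_channel_intensities(image, top=1, left=1, height=2, width=2)
table = to_csv([{"sample": "A1", **intensities}], list(image))
print(table)
```

Keeping cropping, averaging and export in separate functions mirrors the function-based restructuring described above, so that each step can be checked or replaced on its own.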
