Research Collection

Doctoral Thesis

Tools and strategies for the unraveling of post-transcriptional regulatory networks

Author(s): Kanitz, Alexander

Publication Date: 2012

Permanent Link: https://doi.org/10.3929/ethz-a-7358945

Rights / License: In Copyright - Non-Commercial Use Permitted


ETH Library DISS. ETH Nr. 20147

TOOLS AND STRATEGIES FOR THE UNRAVELING OF

POST-TRANSCRIPTIONAL GENE REGULATORY NETWORKS

A dissertation submitted to

ETH ZURICH

for the degree of

Doctor of Sciences

presented by

ALEXANDER KANITZ

M.Sc. University of Amsterdam

born November 15, 1980

citizen of Germany

Accepted on the recommendation of

Prof. Dr. Michael Detmar
Prof. Dr. Jonathan Hall
Prof. Dr. André Gerber

2012

1 SUMMARY
1.1 Summary
1.2 Zusammenfassung

2 INTRODUCTION
2.1 Key principles and players of post-transcriptional gene regulatory processes
2.1.1 The fate of eukaryotic messenger RNAs
2.1.2 Cis-regulatory elements
2.1.3 Trans-acting factors
2.1.3.1 RNA-binding proteins
2.1.3.2 MicroRNAs
2.1.4 Ribonucleoprotein complexes and the RNA regulon theory
2.2 Post-transcriptional gene regulatory networks
2.2.1 Basic network motifs
2.2.1.1 Multiple output network motifs
2.2.1.2 Multiple input network motifs
2.2.2 Autoregulatory, two- and multicomponent loops
2.2.3 Composite gene regulatory networks
2.2.3.1 RNA-binding proteins versus transcription factors
2.2.3.2 RNA-binding proteins versus microRNAs
2.3 Ribonomics methodologies for the systematic identification of basic post-transcriptional gene regulatory network motifs
2.3.1 Top-down approaches: RIP-Chip and related methods
2.3.2 Bottom-up: RNA affinity chromatography and related methods
2.3.2.1 Direct RNA affinity chromatography
2.3.2.2 Purification methods based on antisense hybridization
2.3.2.3 Aptamer-based purification methods
2.3.2.4 Indirect - and peptide-based purification methods
2.3.2.5 Bifunctional RNA tag systems
2.4 Combinatorial control of cancer-related messages
2.4.1 Combinatorial control of the angiogenesis factor vascular endothelial growth factor A
2.4.2 Combinatorial control of the tumor suppressor CDKN1B/p27

3 AIMS AND OUTLINE OF THE THESIS

4 IDENTIFICATION OF NEW POST-TRANSCRIPTIONAL REGULATORS OF VASCULAR ENDOTHELIAL GROWTH FACTOR A EXPRESSION
4.1 Introduction
4.2 Results
4.2.1 The VEGFA 3’-untranslated region contains canonical Pum consensus motifs
4.2.2 VEGFA is a putative target of microRNA 361-5p
4.2.3 MicroRNA 361-5p and Pum1/2 may target other angiogenesis-related transcripts
4.2.4 Generation and characterization of stable Pum1/2 overexpression cell lines
4.2.5 Transfection of small RNAs
4.2.6 The putative Pum and microRNA 361-5p recognition elements in the VEGFA 3’-UTR possess regulatory potential
4.2.7 Pum1, Pum2 and microRNA 361-5p repress the expression of VEGFA 3’-UTR reporters
4.2.8 The repressive effects of Pum proteins and microRNA 361-5p on VEGFA 3’-UTR reporter activity are additive
4.2.9 Endogenous VEGFA expression is regulated by microRNA 361-5p
4.2.10 MicroRNA 361-5p is down-regulated in cutaneous squamous cell carcinoma
4.3 Discussion
4.3.1 The putative Pum and microRNA recognition elements exhibit regulatory potential
4.3.2 Combinatorial control of VEGFA expression by microRNA 361-5p and the Pum proteins
4.3.3 The regulation of VEGFA expression by microRNA 361-5p and the Pum proteins may be dependent on each other
4.3.4 The influence of microRNA 361-5p and Pum1 on VEGFA secretion rates
4.3.5 A potential role of microRNA 361-5p and Pum proteins in cancer development and progression
4.3.6 Bioinformatics analyses suggest common functions of microRNA 361-5p and Pum proteins beyond the regulation of VEGFA expression
4.3.7 Conclusion
4.4 Materials and Methods
4.4.1 Ethics statement
4.4.2 Plasmids
4.4.3 Cell culture and tissue samples
4.4.4 MicroRNA target gene prediction and pathway analysis
4.4.5 Immunoblot analysis
4.4.6 Flow cytometry
4.4.7 Quantitative reverse transcription PCR
4.4.8 Immunocytochemistry
4.4.9 Luciferase reporter assays
4.4.10 Enzyme-linked immunosorbent assay
4.5 Contributions

5 A NOVEL RNA TANDEM AFFINITY TAG FOR THE PURIFICATION OF RIBONUCLEOPROTEIN PARTICLES
5.1 Introduction
5.2 Results
5.2.1 Aptamer selection
5.2.2 Oligonucleotide selection
5.2.3 Arrangement of the HAMMER tandem affinity tag system
5.2.4 Purification strategy
5.2.5 Plasmid generation
5.2.6 Secondary structures of HAMMER-tagged RNAs are largely unaffected
5.2.7 HAMMER-tagged RNAs are expressed in transiently transfected cells
5.2.8 Purification of HAMMER-tagged in vitro transcripts via hybridization to antisense oligonucleotides
5.2.9 Purification of HAMMER-tagged in vitro transcripts via the S1 aptamer
5.3 Discussion
5.3.1 Expression of tagged transcripts
5.3.2 Capturing of tagged transcripts by antisense hybridization
5.3.3 Elution of transcripts immobilized by hybridization
5.3.4 Aptamer-mediated purification of tagged transcripts
5.3.5 Reflections on tag folding and insertion
5.3.6 Limitations of RNA secondary structure prediction algorithms
5.3.7 Conclusion
5.4 Materials and Methods
5.4.1 Tag design and bioinformatics
5.4.2 Plasmids
5.4.3 Cell culturing
5.4.4 Quantitative reverse transcription PCR
5.4.5 Fluorescence microscopy
5.4.6 In vitro transcription and labeling
5.4.7 S1 aptamer purification
5.4.8 Oligonucleotide synthesis
5.4.9 Preparation of antisense oligonucleotide matrix
5.4.10 Purification by antisense oligonucleotide hybridization
5.5 Contributions

6 CONCLUDING REMARKS

7 APPENDIX
7.1 Nucleotide sequences
7.1.1 Nucleotides used for cloning
7.1.2 Nucleotides used for mutagenesis
7.1.3 Nucleotides used for quantitative reverse transcription PCR (SYBR Green)
7.1.4 Nucleotides for the HAMMER RNA tandem tag
7.2 MicroRNA mimics and antisense inhibitors
7.3 Commercial quantitative reverse transcription PCR assays
7.4 MicroRNAs predicted to target VEGFA
7.5 Predicted microRNA 361-5p targets
7.6 Gene set enrichment analysis of microRNA 361-5p, Pum1 and Pum2 targets
7.7 pTO-HA-Strep-GW-FRT map and sequence

8 BIBLIOGRAPHY

9 ACKNOWLEDGMENTS

10 CURRICULUM VITAE

11 ABBREVIATIONS

1 Summary

1.1 Summary

In order to guarantee survival in a complex and ever-changing environment, cells possess efficient and highly dynamic regulatory circuits that continuously interpret the genetic code in a context-dependent manner. This ensures, with remarkable accuracy, that the right components are at the right place at the right time. Consequently, messenger RNAs (mRNAs), which play a central role in the flow of genetic information by carrying it from the nucleus to the cytoplasm or, ultimately, from DNA to proteins, are subject to particular scrutiny by a cell’s regulatory machinery. It is believed that at every instant of an mRNA’s life it is decorated by a host of ever-changing ‘trans-acting factors’, namely RNA-binding proteins and non-coding RNAs, which assemble with adaptor, scaffolding and effector proteins into dynamic macromolecular ribonucleoprotein (RNP) complexes. RNPs represent the functional units of post-transcriptional gene regulation, and it is their respective compositions that dictate which gene regulatory program is executed for a particular mRNA at a given moment, i.e. whether it is spliced, edited, exported, silenced, translated, transported, stored, or degraded. Whether a protein or non-coding RNA is part of an RNP is determined by its availability and activity, as well as the nature of the message itself: next to the protein-coding information, each mRNA species contains a unique set of distinct ‘cis-regulatory elements’, sequence and/or structural motifs that are recognized by the trans-acting factors. To a certain degree, the fate of an mRNA is therefore predetermined by its underlying ‘RNA code’.

RNPs are organized into highly intertwined, decentralized post-transcriptional regulatory networks which constantly exchange information with the intra- and extracellular environment. They receive and integrate external stimuli, interpret and eventually relay them in order to coordinate RNP remodeling accordingly. In order to fully understand the logic and logistics of living systems, it is therefore paramount to unravel these intricate networks. Detailed insights into their nature and organizational principles should have broad implications for a large number of scientific and technological disciplines, including but not limited to medicine, computer science and even socioeconomics. Here we present strategies and tools for the identification of combinatorial control motifs, a fundamental component of such networks.

In a first project, we integrated bioinformatics analyses and experimental evidence to predict, with high confidence, cis-regulatory elements for the RNA-binding proteins Pum1 and Pum2, and the non-coding RNA microRNA 361-5p in the message of the angiogenesis factor vascular endothelial growth factor A (VEGFA). Using in vitro reporter assays we then demonstrated that the defined sequence elements possess regulatory potential and that elevated levels of the corresponding trans-acting factors negatively affect reporter activities in a combinatorial, additive manner. RNA levels of microRNA 361-5p, Pum1 and Pum2 were all reduced in human cutaneous squamous cell carcinoma samples that express elevated levels of VEGFA, suggesting a potential role in tumor development and/or progression. The study represents a valid strategy for the prediction of combinatorial control motifs in an mRNA of interest. It further lays the foundation for future studies that could address in more detail the relevance and interplay of these new repressors and their regulatory elements for the regulation of VEGFA expression, for example in vivo and in disease.

In a second project, we present a strategy for studying the composition of RNPs. To this end, we rationally designed a modular tandem tag system that contains an RNA sequence with high affinity for the ligand streptavidin (‘streptavidin-aptamer’), as well as an exposed RNA oligonucleotide without strong secondary structure characteristics. The tag can be attached to an RNA of interest and is intended to allow the purification of the corresponding RNP in a highly specific two-step manner that involves the interaction of the aptamer with immobilized streptavidin on the one hand, and hybridization of the exposed RNA oligonucleotide to an immobilized antisense strand composed of a stable RNA derivative on the other. Owing to gentle elution methods, the procedure should be compatible with downstream applications for the identification of the RNP’s protein and RNA components. So far, we have generated a number of tools and controls for the characterization of the tag system and were able to show in preliminary experiments that a tagged in vitro transcript could be enriched by antisense hybridization, but not by aptamer-based purification. If the method can be successfully established in future experiments, it should find widespread use for the characterization of RNP composition and plasticity.


1.2 Zusammenfassung

To guarantee survival in a complex and constantly changing environment, cells possess highly efficient and dynamic regulatory circuits that interpret the genetic code continuously and in a context-dependent manner, ensuring with remarkable accuracy that the right components are in the right place at the right time. Messenger RNAs (mRNAs) play a key role in the flow of genetic information by carrying it from the nucleus to the cytoplasm or, respectively, from DNA to protein, and are consequently subject to particular scrutiny by a cell’s control mechanisms. It is assumed that at every moment of its life an mRNA is bound by a multitude of constantly changing ‘trans-acting factors’, namely RNA-binding proteins and non-coding RNAs, which assemble with adaptor, scaffolding and effector proteins into macromolecular ribonucleoprotein complexes (RNPs). RNPs represent the functional units of post-transcriptional gene regulation, and their respective composition determines which gene regulatory program is executed for which mRNA at a given time, i.e. whether it is spliced, edited, exported, transported, stored, degraded or ‘translated’ into a protein. Whether a particular protein or non-coding RNA is part of an RNP depends both on its presence and activity and on the messenger RNA itself: in addition to the blueprint for a protein, each mRNA species contains a unique combination of different ‘cis-regulatory elements’, sequence-based or structural motifs that are recognized by the trans-acting factors. To a certain degree, the fate of an mRNA is therefore predetermined by its underlying ‘RNA code’.

RNPs are organized into highly intertwined, decentralized ‘post-transcriptional regulatory networks’ that are in constant exchange with the intra- and extracellular environment. They continuously receive stimuli, integrate and interpret them, and eventually relay them in order to coordinate the remodeling of RNPs accordingly. To grasp the logic and logistics of living systems, it is therefore essential to decipher these highly complex networks. Detailed insights into their nature and organizational principles could be of great importance for a wide range of scientific and technological disciplines, such as medicine and computer science, but also the study of social structures. In this work, strategies and tools were developed and applied to investigate a fundamental motif of such networks, the ‘combinatorial control’ of a particular messenger RNA by different trans-acting factors.

To this end, in a first study, bioinformatics analyses were coupled with experimental, empirical findings in order to predict, with high probability, cis-regulatory elements for the RNA-binding proteins Pum1 and Pum2 as well as for the non-coding RNA microRNA 361-5p in the messenger RNA encoding the angiogenesis factor vascular endothelial growth factor A (VEGFA). Using artificial reporter assays, we were then able to demonstrate that these elements indeed possess regulatory properties. Elevated levels of the corresponding trans-acting factors furthermore led to a reduction of reporter activity in a combinatorial, additive manner. Finally, we were able to show that the levels of microRNA 361-5p, Pum1 and Pum2 are reduced in human squamous cell carcinomas and behave inversely to those of VEGFA. These findings point to a possible role in tumor development or in the disease course of certain cancers. The work presents a strategy for the prediction of particular control elements in messenger RNAs. It further provides the basis for future studies that could examine more closely the relevance and interplay of the identified repressors and their binding sites for the regulation of VEGFA activity, in particular in the development of disease. This could be accomplished by studying patients or suitable model organisms, such as mice.

In a second study, we present a strategy for investigating the composition of RNPs that have formed around a messenger RNA. For this purpose, we designed a nucleic acid-based tag that consists, on the one hand, of a sequence with high affinity for the ligand streptavidin (‘streptavidin aptamer’) and, on the other hand, of an exposed RNA oligonucleotide that has no particular secondary structure features. The tag can be attached to RNAs to enable the purification of the corresponding RNPs in a highly specific two-step process that involves the interaction with immobilized streptavidin and with an antisense strand consisting of an RNA derivative, respectively. Furthermore, the tag is built on a modular principle, so that individual components can be exchanged effortlessly. Owing to gentle elution procedures for the ‘captured’ RNPs, the method should be compatible with the available highly sensitive analytical methods for the identification of proteins and RNAs. So far, we have developed a number of tools and controls for the characterization of the tag and were able to show in initial experiments that an artificially produced messenger RNA carrying the tag could be specifically enriched by antisense hybridization. Aptamer-based purification, in contrast, has not been successful so far. Should this method be successfully established in future studies, it can be expected to find a broad range of applications in the investigation of RNPs.


2 Introduction

The ability to establish a high level of order and suspend the harmful effects of entropy is a unique and fascinating quality of living systems. So how does a cell achieve this spectacular feat in a microscopic space that is thronged with millions of molecules of varying sizes, shapes and chemical properties?

In the year 1961 François Jacob and Jacques Monod described the first gene regulatory system, the Escherichia coli lac operon (Jacob and Monod, 1961). In the very same year, Sydney Brenner and others identified an "unstable intermediate carrying information from genes to ribosomes for protein synthesis" (Brenner et al., 1961; Gros et al., 1961), yet at the time probably few people imagined the myriad forms of regulation that these messenger RNA molecules (mRNAs) undergo during their brief, yet eventful and highly promiscuous, life. Shortly after, the groundwork for the deciphering of the genetic code was laid (Crick et al., 1961; Matthaei et al., 1962) and modern molecular biology was born. In the ensuing half century enormous efforts have been devoted to furthering our understanding of the ‘logic of life’, gradually introducing scientists to an overwhelmingly complex and intricately intertwined, yet remarkably robust and efficient regulatory circuitry.

This work deals with the post-transcriptional aspects of this circuitry, i.e. the regulatory events that an mRNA undergoes after it is generated. In this chapter, we introduce the underlying concepts, processes and key players of post-transcriptional gene regulation (PTGR; see 2.1), followed by a phenomenological description of post-transcriptional gene regulatory networks (GRNs; see 2.2) and an overview of the methodologies devised for the unraveling of such networks (see 2.3). Finally, we touch upon the relevance of PTGR and its medical implications by summarizing in detail the available literature on the combinatorial control exerted on the messages coding for two proteins with major roles in cancers (see 2.4).

2.1 Key principles and players of post-transcriptional gene regulatory processes

2.1.1 The fate of eukaryotic messenger RNAs

The genome stores the information that is needed to build proteins in defined entities, called genes, in the form of a four-letter DNA polymer. But in order to deliver this information to ribosomes, the cytoplasmic protein production ‘factories’, a gene’s intrinsic sequence code first needs to be transcribed into a mobile carrier, the messenger RNA. For decades, transcriptional regulation of gene expression has been regarded as the dominant force in determining the fate of mRNA transcripts. However, cellular mRNA levels have been demonstrated to be poor indicators of protein abundance (Gygi et al., 1999), indicating that the previous understanding of mRNAs as “blind” carriers of protein-coding information is much too linear. Apparently, the presence of an mRNA molecule in the cytoplasm does not automatically result in the synthesis of the corresponding protein; instead, translation seems to be controlled by a complex logical gate, integrating the availability and state of activity of various molecules in deciding whether an individual transcript is translated or not (Halbeisen et al., 2008; Mansfield and Keene, 2009; Kanitz and Gerber, 2010). It is now clear that regulatory cues are exerted on mRNAs throughout their life, starting during their ‘birth’ and ending, eventually, with their ‘death’, when they are broken up into their individual building blocks. In between, there are a large number of individually regulated steps that ensure that the genetic information is carried in the correct form to where it is required and at the precise time when it is required (Figure 2.1).

Once a eukaryotic mRNA precursor is formed in the nucleus, it first undergoes various processing steps, namely 5’-end processing (‘capping’; reviewed in Topisirovic et al., 2011), 3’-end cleavage and 3’-end processing (‘polyadenylation’; reviewed in Tian and Graber, 2011). Usually splicing mechanisms further process the mRNA precursor by cutting out long stretches of non-coding information (‘introns’) that are interspersed between the protein-coding information (‘exons’; reviewed in McManus and Graveley, 2011). However, frequently exons are clipped as well, thereby changing the identity of the resulting mRNA, and thus, eventually, of the corresponding protein. The majority of mature eukaryotic mRNAs further contain non-coding sequences on either end of the molecule, the 5’- and 3’-untranslated regions (UTRs), which serve important regulatory purposes, as discussed in the next section. Furthermore, introns may exhibit regulatory functions of their own (reviewed in Pyle, 2010). Additional changes to an mRNA’s sequence may be made (‘editing’; reviewed in Godfried Sie and Kuchka, 2011), before the mRNA is finally prepared for its export to the cytoplasm. The now ‘mature’ mRNA may be actively localized to a certain position inside the cell (‘RNA localization’; reviewed in Shahbabian and Chartrand, 2011), where it may be stored, translated into protein by ribosomes, or, ultimately, degraded.

Figure 2.1 Eukaryotic gene expression. The different steps are schematically depicted. See text for details. From Halbeisen et al. (2008).

Regulation of all these different ‘programs’ of mRNA regulation requires two basic components: Regulators in trans (i.e. they act upon other molecules) and recognition elements in cis (i.e. they are acted upon by other molecules).

2.1.2 Cis-regulatory elements

RNAs are not stiff linear polymers (as frequently depicted in schemes), but rather fold into dynamic three-dimensional structures reminiscent of the folding of proteins. However, due to the smaller number of building blocks and the chemical and structural properties of nucleotides compared to amino acids, the complexity of RNA folding is considerably reduced.

Cis-regulatory elements are sequence and/or structural features of an mRNA molecule that are recognized by RNA-binding proteins (RBPs), microRNAs (miRNAs) and other trans-acting factors. While mRNAs often contain cis-regulatory elements in their coding sequences, the majority of those features are located in a message’s 5’- or 3’-UTRs (reviewed in Mignone et al., 2002; Figure 2.2), probably because the protein-coding information carried by mRNAs puts constraints on RNA folding and the evolution of recognition elements. The usually lower degree of conservation of untranslated regions supports this idea, suggesting that they evolve faster, thus allowing swift (on evolutionary scales) rewiring of regulatory circuits. Moreover, the placement of cis-regulatory elements in the untranslated regions has the additional advantage that elements may be distributed more freely, since only a structural, but not a continuous sequence context has to be maintained. While the majority of cis-regulatory elements rely on predominantly structural features, some of them can be well represented by degenerate ‘consensus sequence motifs’ (Figure 2.2).


Figure 2.2 Consensus RNA recognition elements of several yeast RNA-binding proteins. The names of the proteins, the experimentally determined consensus motifs, the probabilities that motifs occur in the coding sequence, 5’- or 3’-untranslated regions, and the conservation probability of the motifs are indicated. Adapted from Hogan et al. (2008).
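To make the notion of a degenerate consensus sequence motif more tangible, the following minimal sketch scans an RNA sequence for all occurrences of a motif written in IUPAC ambiguity codes. It is an illustration only and not a method used in this work: the function names, the motif string and the toy sequence are assumptions chosen for the example, and the motif is not taken from Figure 2.2.

```python
import re

# IUPAC nucleotide codes for RNA, mapped to regular-expression character classes.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]", "Y": "[CU]", "S": "[CG]", "W": "[AU]",
    "K": "[GU]", "M": "[AC]", "B": "[CGU]", "D": "[AGU]",
    "H": "[ACU]", "V": "[ACG]", "N": "[ACGU]",
}


def motif_to_regex(motif):
    """Translate a degenerate IUPAC motif into a compiled regular expression."""
    body = "".join(IUPAC_RNA[base] for base in motif.upper())
    # Wrapping the pattern in a lookahead also reports overlapping sites.
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif):
    """Return (0-based start, matched site) for every occurrence of the motif."""
    rna = sequence.upper().replace("T", "U")  # accept DNA-style input as well
    return [(m.start(), m.group(1)) for m in motif_to_regex(motif).finditer(rna)]


# Hypothetical motif and sequence, for illustration only.
TOY_MOTIF = "UGUANAUA"
TOY_UTR = "GGCUGUAAAUACCAUGUACAUAGG"
hits = find_motif(TOY_UTR, TOY_MOTIF)
print(hits)
```

Such a purely sequence-based scan ignores the structural context that, as noted above, most cis-regulatory elements depend on, so hits of this kind are at best candidate sites.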

2.1.3 Trans-acting factors

2.1.3.1 RNA-binding proteins

A considerable fraction of the proteins encoded by eukaryotic genomes possess RNA-binding activity, with estimates ranging from 8 to 15% of the protein-coding genes (Keene, 2001). Approximately 1000-2000 RBPs are encoded in the (Anantharaman et al., 2002). This large number of RNA-binding proteins reflects both the ancient role of RNA-dependent regulatory mechanisms and the increased regulatory requirements in eukaryotes, owing to their higher levels of organization through compartmentalization (Keene, 2001; Anantharaman et al., 2002). This is particularly true for multicellular organisms, where cell-cell communication adds yet another layer of organization. Consequently, the number of RBPs in higher eukaryotes rivals those of other classes of gene regulators, such as transcription factors (TFs) and kinases. In fact, a recent systematic approach aimed at the identification of novel RBPs in yeast found a high number of proteins with hitherto unappreciated RNA-binding activities, unexpectedly including a number of enzymes, thus suggesting that the real number of RNA-binding proteins may be even larger (Scherrer et al., 2010). RBPs usually contain defined RNA-binding domains through which they interact with cis-regulatory elements in the targeted mRNAs. Approximately one hundred distinct RNA-binding domains have been characterized so far, some of which are also able to bind double-stranded RNA (Anantharaman et al., 2002; Lasko, 2003). In contrast to the RBPs with clearly defined roles, such as those required for mRNA processing, there are also a number of ‘regulatory’ RBPs with less well-defined functions that generally stabilize or destabilize bound messages, or inhibit their translation (Shyu et al., 2008).

2.1.3.2 MicroRNAs

Although originally described in 1993 (Lee et al., 1993), the significance of the class of small (~22 nt) non-coding RNAs referred to as microRNAs only began to be appreciated with the back-to-back publication of three articles in 2001 (Lagos-Quintana et al., 2001; Lau et al., 2001; Lee and Ambros, 2001). This marked the beginning of an ongoing phase of growing interest in PTGR and the role of non-coding RNAs. MicroRNAs (miRNAs) are endogenously expressed in metazoans, where they exert their repressive function by imperfect antiparallel hybridization to targeted mRNA transcripts. This is mainly mediated via the miRNA’s 5’-terminal ‘seed’ sequence of 6-8 nucleotides with usually perfect or near-perfect complementarity to the target sequence, the miRNA recognition element (MRE; reviewed in Filipowicz et al., 2008). Targets are generally repressed through mechanisms that lead to either enhanced decay or translational inhibition (reviewed in Djuranovic et al., 2011), although some reports demonstrated an activating role under certain conditions (reviewed in Vasudevan, 2011).

Importantly, the functional unit of miRNA-mediated repression is a cytoplasmic ribonucleoprotein complex, termed miRNP (reviewed in Mourelatos et al., 2002), composed of proteins of the Argonaute family (reviewed in Höck and Meister, 2008), the RNase III endonuclease Dicer, as well as facultative accessory proteins (reviewed in Steitz and Vasudevan, 2009). Based on bioinformatics analyses, it is estimated that miRNAs regulate the majority of all human genes, with each miRNA being able to bind up to hundreds of target mRNAs (Lewis et al., 2005). With more than one thousand miRNAs being encoded in the human genome, they represent a major class of post-transcriptional trans-acting factors that are implicated in virtually all biological processes, as well as a wide variety of pathological conditions, particularly cancers (reviewed in Calin and Croce, 2006; Ventura and Jacks, 2009). In some sense, miRNPs can be regarded as RBPs that can be loaded with different target specificities. This is highly advantageous for cells that rely heavily on post-transcriptional control, since the evolution of short nucleotide stretches is considerably easier to achieve than the generation of RNA-binding domains with different or novel specificities, as is exemplified by the considerably higher number of miRNAs compared to RNA-binding domains (>1000 vs. ~100; see above).
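The seed-based recognition described above can be sketched in a few lines of code. The example below is a deliberately simplified illustration, not a target prediction tool: it assumes that the seed spans miRNA nucleotides 2 to 8 (a common convention that falls within the 6-8 nucleotide range mentioned in the text), requires perfect complementarity, and uses a hypothetical miRNA and target sequence. All function and variable names are chosen for the example.

```python
# Watson-Crick complements for RNA (G:U wobble pairs are ignored in this sketch).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, target, seed_start=2, seed_end=8):
    """Return 0-based start positions of perfect seed matches in the target.

    The seed is taken as miRNA nucleotides seed_start..seed_end (1-based,
    inclusive); positions 2-8 are an assumption, not a value from the text.
    """
    mirna = mirna.upper().replace("T", "U")
    target = target.upper().replace("T", "U")
    seed = mirna[seed_start - 1:seed_end]
    # Antiparallel pairing: the target site is the reverse complement of the seed.
    site = reverse_complement(seed)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Hypothetical sequences, for illustration only.
TOY_MIRNA = "UACGGAUCCAGUUAGCAUGCAA"
TOY_TARGET = "AAGAUCCGUAACCGGGAUCCGUUU"
sites = seed_sites(TOY_MIRNA, TOY_TARGET)
print(sites)
```

Because a seed of this length occurs by chance in many transcripts, a scan like this yields only putative recognition elements; this is consistent with the observation that a single miRNA may bind up to hundreds of target mRNAs.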


2.1.4 Ribonucleoprotein complexes and the RNA regulon theory

Trans-acting factors usually do not act alone. Instead, it takes the concerted action of various RNA-binding proteins, scaffolding factors and effectors, as well as non-coding RNAs, which assemble into dynamic macromolecular complexes. These structures are referred to as ribonucleoprotein complexes or particles (RNPs) and represent the functional units of PTGR. RNPs assemble on mRNAs as soon as they are transcribed and accompany them throughout their lifetime while continuously changing their composition along the way. In fact, the composition may vary so much that different, relatively stable complexes can be identified at the various stages of an mRNA’s life. Moreover, proteins usually do not join or leave an RNP individually, but may be present as pre-formed protein complexes that await an mRNA (or rather the ‘previous’ RNP), at which point they then ‘take over’ by displacing factors that are (currently) not needed. In this way, all of the events outlined in section 2.1.1 are regulated by RNPs of a more or less defined nature (reviewed in Hieronymus and Silver, 2004; Keene, 2007; Halbeisen et al., 2008; Kanitz and Gerber, 2010). For instance, spliceosomes regulate the splicing of mRNAs, ribosomes govern the translation into proteins, stress granules may store mRNAs ‘for further use’, and processing bodies are implicated in their decay (reviewed in Kedersha and Anderson, 2007; Erickson and Lykke-Andersen, 2011; Thomas et al., 2011).

Using ribonomics techniques for the global identification of the RNA targets of RBPs (see 2.3.1), various groups were able to establish that RBPs may bind and regulate subsets of functionally and cytotopically related target mRNAs (Gerber et al., 2004; reviewed in Halbeisen et al., 2008; Morris et al., 2010). This important feature of RNP organization was confirmed in numerous studies conducted in various organisms, suggesting the presence of highly dynamic, transcript- and condition-specific RNPs of a modular architecture, which coordinate post-transcriptional control mechanisms in a combinatorial fashion (Hieronymus and Silver, 2004; Halbeisen et al., 2008). In a series of seminal review articles, Jack Keene first proposed the notion of ‘post-transcriptional operons’ for such structures (Keene, 2001), in analogy to bacterial operons, polycistronic transcription units coding for multiple proteins of similar function, and later proposed the existence of ‘RNA regulons’, higher-level regulatory circuits encompassing multiple post-transcriptional operons for more complex control mechanisms (Keene, 2007). A hypothetical RNA regulon is outlined in Figure 2.3, highlighting the modular architecture and the individual post-transcriptional operons. Figures 2.4 and 2.5 schematically depict the combinatorial and cooperative control concepts of RNA regulons, respectively.

Figure 2.3 Schematic representation of a hypothetical RNA regulon. Five RNA transcripts including different cis-regulatory elements are shown together with the corresponding trans-acting factors. Different classes of trans-acting factors are represented by circles, squares and triangles. (A) An RNA regulon consisting of six different post-transcriptional operons (indicated by different colors on the left). (B) Two exemplary post-transcriptional operons and the regulating trans-acting factors are indicated.


Figure 2.4 Combinatorial control of RNA transcripts. Hypothetical RNA transcripts including different cis-regulatory elements are shown together with the corresponding trans-acting factors. Different classes of trans-acting factors are represented by circles, squares and triangles. (A) In the presence of all trans-acting factors the combined regulatory cues lead to degradation of the targeted RNA. (B) In the absence of one of the destabilizing factors (yellow square), e.g. due to differential expression, stabilizing cues dominate. (C) In the absence of destabilizing cis-regulatory elements (one yellow and purple box), e.g. through alternative splicing, stabilizing cues dominate. Note that all regulatory cues were assumed to be of equal strength, differing only in their directions.


Figure 2.5 Cooperative control of an RNA transcript. A hypothetical RNA transcript including different cis-regulatory elements is shown together with the corresponding trans-acting factors. Different classes of trans-acting factors are represented by circles and squares. Note that two of the cis-regulatory elements overlap (light blue and red), so that the respective trans-acting factors, one stabilizing and one destabilizing factor, compete for binding. (A) When the stabilizing factor outcompetes the destabilizing factor for binding, the combined regulatory cue is stabilizing. (B) When the destabilizing factor outcompetes the stabilizing factor for binding, the stabilizing cue of the distal trans-acting factor is compensated, so that the net effect on the RNA is zero. Note that all regulatory cues were assumed to be of equal strength, differing only in their directions.
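The additive logic assumed in the captions of Figures 2.4 and 2.5 (cues of equal strength, differing only in direction) can be illustrated with a minimal Python sketch. The function name, the factor labels and the +1/-1 encoding are illustrative assumptions and not part of the original figures.

```python
# Minimal sketch of the additive cue model assumed in Figures 2.4 and 2.5.
# Assumption: every bound trans-acting factor contributes +1 (stabilizing)
# or -1 (destabilizing); all factor names are hypothetical labels.

def net_cue(bound_factors):
    """Sum the cues of all bound factors and name the resulting net effect."""
    total = sum(bound_factors.values())
    if total > 0:
        return total, "stabilizing"
    if total < 0:
        return total, "destabilizing"
    return total, "neutral"

# Figure 2.5 A: the stabilizing competitor occupies the overlapping site.
case_a = {"distal_stabilizer": +1, "competing_stabilizer": +1}
# Figure 2.5 B: the destabilizing competitor occupies the overlapping site.
case_b = {"distal_stabilizer": +1, "competing_destabilizer": -1}

print(net_cue(case_a))
print(net_cue(case_b))
```

In this toy encoding, competition for an overlapping site is modeled simply by listing only the factor that wins; binding affinities and concentrations are deliberately left out.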

2.2 Post-transcriptional gene regulatory networks

Note: Parts of section 2.2 are abridged from Kanitz, A., and Gerber, A. P. (2010). Circuitry of mRNA regulation. Wiley Interdiscip Rev Syst Biol Med 2, 245–251.

Strikingly, many features of RBP- and miRNA-mediated gene regulation closely resemble those of transcription factors (TFs): While TFs generally bind DNA motifs upstream of a given gene (reviewed in Barrera and Ren, 2006), RBPs and miRNAs typically bind sequence or structural features of mRNA molecules, primarily located in their untranslated regions (Hogan et al., 2008; Hafner et al., 2010b). Moreover, much like RBPs and miRNAs assemble into highly dynamic transient RNP complexes, transcription factors are organized into transcription initiation complexes or “enhanceosomes”. Finally, TFs, RBPs and miRNAs often bind targets that code for functionally or cytotopically related proteins (Chu et al., 1998; Lee et al., 2002; Harbison et al., 2004; Gerber et al., 2004; Hogan et al., 2008; reviewed in Halbeisen et al., 2008; Morris et al., 2010). The development and application of genome-wide analysis tools like DNA microarrays yielded fundamental insights into the logic of gene regulatory programs. Chromatin immunoprecipitation (ChIP-Chip) assays have been implemented to systematically map the binding sites of DNA-associated proteins, leading to the identification of transcriptional network motifs (Ren et al., 2000; Iyer et al., 2001; Lieb et al., 2001). For instance, Rick Young and colleagues systematically analyzed the transcription factor binding sites for almost all known transcriptional regulators (203 proteins) in Saccharomyces cerevisiae (Lee et al., 2002; Harbison et al., 2004). Likewise, ribonomics approaches (see 2.3) have revealed the (m)RNA targets for dozens of RBPs and thus shed light on the organization of PTGR systems (see 2.3.1). Similarly, several recent studies globally identified targets of individual human miRNAs, either using quantitative proteomics (Baek et al., 2008; Selbach et al., 2008) or RIP-Chip-based approaches, each in the presence or absence of specific miRNAs (see 2.3.1).

In this chapter, we summarize some fundamental motifs of transcriptional and post-transcriptional gene regulatory network (GRN) circuitries based on selected systematic investigations on the targets of TFs, RBPs and/or miRNAs. Networks, in contrast to linked lists and hierarchical data structures, are multidimensional relationships that are neither (exclusively) linear, unidirectional, nor ‘rooted’. In the case of GRNs, regulators and targets are interconnected by particular regulatory cues exerted on targets. For the discussed networks, we focus on TFs, RBPs and miRNAs as trans-acting factors, although the principles extend to other classes of regulators as well, such as kinases/phosphatases or methylases/acetyltransferases. A more detailed discourse on general concepts of GRNs can be found elsewhere (reviewed in Milo et al., 2002; Mesarovic et al., 2004; Alon, 2007).

2.2.1 Basic network motifs

The basic network motifs comprise the unidirectional structures: Regulatory cues are exclusively passed down. In biological networks these are mainly constituted by multiple output motifs, i.e. regulatory relationships between one regulator and multiple ‘regulatees’, as well as multiple input motifs, i.e. the binding of one regulatee by multiple regulators.

2.2.1.1 Multiple output network motifs

As GRNs are non-linear, within each class of regulatory molecules there are instances where one regulator binds to and controls the expression of two or more targets (Figure 2.6). The regulator is usually activated by a signal, which could be either an inducer molecule that binds to the regulator or a protein modification of the regulator mediated by a signal-transduction cascade. The frequency of this network motif in GRNs is apparent from elaborate studies analyzing transcription factor binding sites and RBPs in the yeast Saccharomyces cerevisiae (Lee et al., 2002; Harbison et al., 2004; Hogan et al., 2008). Lee et al. found that each of 106 yeast transcription factors under study bound up to 181 promoter regions (P < 0.001), with an average of 38 bound promoter regions per regulator (Lee et al., 2002). Similarly, Hogan et al. found that 43 of the 46 RBPs studied (which included two “negative” control proteins) bound more than one RNA target; eleven of them bound fewer than ten targets, and six of them bound more than a thousand different RNAs, mainly mRNAs (false discovery rate < 1%; Hogan et al., 2008). Although the difference is perhaps attributable to the different confidence levels applied for target definition as well as a bias from the RBP selection, it is nevertheless striking that RBPs are associated with an average of about 300 mRNA targets, a number almost ten times as high as the average number of targets for TFs. Whether the larger numbers of RBP targets go along with diminished regulatory impact on individual messages or have other functional implications is not known; yet they illustrate that PTGR networks are at least as densely constructed as TF systems.
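How multiple output motifs and the average number of targets per regulator can be read off a list of regulator-target interactions is sketched below. This is a minimal illustration only: the edge list is invented, all names are placeholders, and none of the values are taken from Lee et al. (2002) or Hogan et al. (2008).

```python
from collections import defaultdict

# Hypothetical regulator -> target interactions (placeholder names only).
edges = [
    ("RBP1", "mRNA_a"), ("RBP1", "mRNA_b"), ("RBP1", "mRNA_c"),
    ("RBP2", "mRNA_b"),
    ("TF1", "gene_x"), ("TF1", "gene_y"),
]

def targets_per_regulator(edge_list):
    """Map each regulator to the set of distinct targets it binds."""
    out = defaultdict(set)
    for regulator, target in edge_list:
        out[regulator].add(target)
    return dict(out)

def multiple_output_regulators(edge_list):
    """Regulators with more than one target, i.e. multiple output motifs."""
    tpr = targets_per_regulator(edge_list)
    return sorted(reg for reg, targets in tpr.items() if len(targets) > 1)

def mean_out_degree(edge_list):
    """Average number of distinct targets per regulator."""
    tpr = targets_per_regulator(edge_list)
    return sum(len(targets) for targets in tpr.values()) / len(tpr)

print(multiple_output_regulators(edges))
print(mean_out_degree(edges))
```

In a real analysis the edge list would come from thresholded binding data, so the resulting numbers depend directly on the confidence cutoff chosen, as noted in the text.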

Figure 2.6 Schematic representation of a multiple output network motif. The regulation of three regulatees (i.e. targets) by one regulator is depicted. Adapted from Kanitz and Gerber (2010).


A similar scenario has been observed for miRNAs, where several recent studies have systematically analyzed potential targets for individual human miRNAs, suggesting that each miRNA may bind to and regulate between dozens and thousands of mRNAs (Karginov et al., 2007; Baek et al., 2008; Hendrickson et al., 2008; Selbach et al., 2008; Hafner et al., 2010b). For example, Karginov et al. found that Argonaute 2 (Ago2) proteins associated with 294 unique messages upon overexpression of miR-124 in HEK293 cells (Karginov et al., 2007). Applying the same approach, Hendrickson et al. defined 419 miR-124 target messages, which substantially overlapped with the ones defined by Karginov et al. (Hendrickson et al., 2008). Applying a quantitative proteomics approach, Selbach et al. found that the abundance of 1,544 proteins changed in response to miR-124 overexpression (Selbach et al., 2008). Although this number will certainly include secondary effects, it clearly demonstrates the far-reaching consequences of PTGR.

2.2.1.2 Multiple input network motifs

Another hallmark of non-linear GRNs is combinatorial control, which involves the binding of two or more regulators to a single target (Figure 2.7). Lee et al. mapped almost 4000 individual interactions between transcription factors and promoter regions, providing evidence for the regulation of 2343 of 6270 yeast genes (37%) and an overall connectivity of 1.7 TFs per gene (Lee et al., 2002). Likewise, Hogan et al. mapped 12,000 individual interactions between 46 RBPs and 4,300 mRNAs, indicating a fairly dense overall connectivity (2.8 RBPs per message; Hogan et al., 2008). Extrapolating this to the hundreds of regulatory RBPs present in yeast, each mRNA might interact with a dozen or more different RBPs on average during its lifetime. These data indicate that the potential for combinatorial control is considerably higher for RBPs than for TFs, supporting the speculation that PTGR networks are meshed more densely than transcriptional networks.
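The connectivity values quoted above follow from dividing the number of mapped interactions by the number of bound targets. The short sketch below recomputes them from the rounded figures given in the text; "almost 4000" is taken as 4000, so the TF value is an approximation. Variable and function names are illustrative.

```python
# Rounded figures as quoted in the text.
tf_interactions, tf_regulated_genes, yeast_genes = 4000, 2343, 6270
rbp_interactions, rbp_bound_mrnas = 12000, 4300

def connectivity(interactions, targets):
    """Average number of regulators per bound target."""
    return interactions / targets

tf_conn = connectivity(tf_interactions, tf_regulated_genes)
rbp_conn = connectivity(rbp_interactions, rbp_bound_mrnas)
regulated_fraction = tf_regulated_genes / yeast_genes

print(f"TFs per regulated gene: {tf_conn:.1f}")
print(f"RBPs per bound mRNA: {rbp_conn:.1f}")
print(f"Fraction of yeast genes regulated: {regulated_fraction:.0%}")
```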


Figure 2.7 Schematic representation of a multiple input network motif. The combinatorial regulation of one regulatee (i.e. target) by three regulators is depicted. Adapted from Kanitz and Gerber (2010).

Multiple input network motifs have also been observed for miRNA-mediated PTGR. The expression of VEGFA, for instance, is under the control of at least eight different miRNAs (see 2.4.1). Likewise, synthesis of the tumor suppressors PTEN (Meng et al., 2007; Palomero et al., 2007; Yang et al., 2008) and CDKN1B (see 2.4.2) is controlled by multiple miRNAs.

2.2.2 Autoregulatory, two- and multicomponent loops

The directional regulatory events described above are passed down from one or more regulators to one or more regulatees. A true network, however, is multidimensional, such that each regulator is itself subject to regulation. Consequently, circular motifs are ubiquitously found within GRNs. The simplest form of such a circuit is constituted by an autoregulatory loop, where one regulatory molecule activates (positive feedback loop) or inhibits (negative feedback loop) its own production or activity (Figure 2.8 A). Autoregulatory loops are thought to be important for modulating the response time of gene circuits to a signal and to affect cell-cell variation in protein levels (“noise”). Whereas negative autoregulation generally speeds up the response time of transcriptional gene circuits and reduces cell-cell variation in protein levels, positive autoregulation slows response times and enhances variation (reviewed in Alon, 2007). These circuits may allow rapid adaptation to new environmental conditions (negative loop) or to differentiated states (positive loop).
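The effect of negative autoregulation on response time can be illustrated with a simple rate-equation sketch, assuming production minus first-order decay and a Hill-type repression term. All parameter values (beta, K, n, alpha) are arbitrary assumptions chosen so that both circuits reach the same steady state; positive autoregulation and noise are not modeled here.

```python
import math

def time_to_half_max(production, alpha=1.0, x_steady=1.0, dt=1e-3, t_max=20.0):
    """Euler-integrate dx/dt = production(x) - alpha * x starting from x = 0
    and return the first time at which x reaches half its steady-state level."""
    x, t = 0.0, 0.0
    while t < t_max:
        x += dt * (production(x) - alpha * x)
        t += dt
        if x >= 0.5 * x_steady:
            return t
    raise RuntimeError("half-maximal level not reached within t_max")

def simple(x, beta=1.0):
    """Unregulated gene: constant production rate, steady state beta/alpha = 1."""
    return beta

def negative_autoreg(x, beta=5.0, K=0.5, n=2):
    """Negatively autoregulated gene: the product represses its own production.
    beta is chosen so that the steady state is also 1 (5 / (1 + 2**2) = 1)."""
    return beta / (1.0 + (x / K) ** n)

t_simple = time_to_half_max(simple)
t_nar = time_to_half_max(negative_autoreg)
print(f"simple regulation:       t1/2 = {t_simple:.3f}")
print(f"negative autoregulation: t1/2 = {t_nar:.3f}")
```

With these assumed parameters the autoregulated circuit starts from a stronger promoter that is shut down as the product accumulates, which is why it reaches the half-maximal level earlier than the unregulated circuit with the same steady state.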

Global TF binding site analysis in yeast revealed ten high confidence (P < 0.001) autoregulatory loops among the interrogated 106 yeast transcription factors (9%; Lee et al., 2002). Likewise, 9 out of the 46 RBPs (20%) surveyed by Hogan et al. were reproducibly associated with their own message (Hogan et al., 2008). Interestingly, this number doubles to 18 autoregulatory loops (39% of all studied RBPs) when a less stringent cutoff is applied (FDR < 5%). In contrast, miRNAs have not been reported to be organized in autoregulatory loops because miRNAs are not translated but rather target mRNAs in the cytoplasm (reviewed in Djuranovic et al., 2011). Whether the high incidence of such loops in RBP-mediated PTGR has general, systematic implications, possibly beyond the actual pathways they are found in, remains to be analyzed.
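In an interaction list, an autoregulatory loop corresponds to a regulator that binds its own gene or message. The sketch below counts such self-interactions in an invented toy network and recomputes the percentages quoted above from the reported counts; the node names and the edge list are hypothetical.

```python
def autoregulatory_loops(edge_list):
    """Return regulators that bind their own gene or message (self-edges)."""
    return sorted({reg for reg, target in edge_list if reg == target})

# Hypothetical toy network: an edge (a, b) means that the product of gene a
# binds the promoter or message of gene b.
edges = [
    ("RBP1", "RBP1"),    # autoregulatory loop
    ("RBP1", "RBP2"),
    ("RBP2", "GENE_X"),
    ("RBP3", "RBP3"),    # autoregulatory loop
    ("RBP4", "GENE_Y"),
]
regulators = {reg for reg, _ in edges}
loops = autoregulatory_loops(edges)
fraction = len(loops) / len(regulators)
print(loops, f"{fraction:.0%}")

# Counts reported in the text: (autoregulated, surveyed).
reported = {
    "TFs (P < 0.001)": (10, 106),
    "RBPs (stringent cutoff)": (9, 46),
    "RBPs (FDR < 5%)": (18, 46),
}
percentages = {label: round(100 * k / n) for label, (k, n) in reported.items()}
print(percentages)
```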

Figure 2.8 Schematic representation of autoregulatory, two- and multicomponent loops. (A) Autoregulatory loop, in which a trans-acting factor regulates its own expression. (B) A serial multicomponent loop consisting of three regulators that regulate the expression of each other. (C) A simple feed-forward loop, in which one regulator controls the expression of another regulator, while both regulators control the expression of a common, unrelated target. Adapted from Kanitz and Gerber (2010).

More complex regulatory loops are composed of two or more components of a class of regulators (Figure 2.8 B). Lee et al. identified three distinct multicomponent loops that consist of two, and in one case of three, TFs (Lee et al., 2002). Hogan et al. did not explicitly analyze the data from their survey on protein-RNA interactions for the presence of multicomponent loops (Hogan et al., 2008). We therefore re-analyzed the raw data from this survey, considering all RNA-protein associations with a false discovery rate of less than 5%. We found that 29 RBPs bound to mRNAs coding for at least one of the 46 RBPs under study. At least 21 of these 29 RBPs are arranged in multicomponent loops. In particular, we identified 13 two-component and 16 three-component loops, involving 16 and 17 different RBPs, respectively (Kanitz and Gerber, unpublished data). This high incidence of multicomponent loops among RBP targets is intriguing and suggests great regulatory potential. One example of how multicomponent loops may influence gene regulation is represented by feed-forward loops, a special kind of multicomponent loop which involves the regulation of one or more common targets by two regulators, one of which is under the regulation of the other (Figure 2.8 C). Feed-forward loops can trigger delayed responses to signals, which can be useful to filter out spurious pulses of signals (reviewed in Alon, 2007).
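A search for two-component loops, three-component loops and feed-forward loops in a directed interaction list can be sketched as follows. This is a brute-force illustration on an invented toy network and is not the procedure used for the re-analysis described above; function names and node labels are assumptions.

```python
from itertools import permutations

def build_graph(edge_list):
    """Adjacency sets: graph[a] holds every node regulated by a."""
    graph = {}
    for src, dst in edge_list:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    return graph

def two_component_loops(graph):
    """Unordered pairs (a, b), a != b, with a -> b and b -> a."""
    return sorted({tuple(sorted((a, b))) for a in graph for b in graph[a]
                   if a != b and a in graph[b]})

def three_component_loops(graph):
    """Directed cycles a -> b -> c -> a over three distinct nodes."""
    found = set()
    for a, b, c in permutations(graph, 3):
        if b in graph[a] and c in graph[b] and a in graph[c]:
            # Rotate so the smallest name comes first: one entry per cycle.
            i = (a, b, c).index(min(a, b, c))
            found.add(((a, b, c) * 2)[i:i + 3])
    return sorted(found)

def feed_forward_loops(graph):
    """Triples (x, y, z) with x -> y, x -> z and y -> z."""
    return sorted((x, y, z) for x, y, z in permutations(graph, 3)
                  if y in graph[x] and z in graph[x] and z in graph[y])

# Hypothetical toy network containing one motif of each kind.
toy_edges = [
    ("A", "B"), ("B", "A"),                # two-component loop
    ("C", "D"), ("D", "E"), ("E", "C"),    # three-component loop
    ("X", "Y"), ("X", "Z"), ("Y", "Z"),    # feed-forward loop
]
toy_graph = build_graph(toy_edges)
print(two_component_loops(toy_graph))
print(three_component_loops(toy_graph))
print(feed_forward_loops(toy_graph))
```

Enumerating all ordered triples scales with the cube of the number of nodes, which is acceptable for a few dozen regulators but would call for a dedicated graph library on genome-scale networks.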

2.2.3 Composite gene regulatory networks

Integration of networks regulated by different classes of trans-acting factors leads to an even more complex model of unified ‘composite GRNs’, which may interact on various levels. Based on selected experimental evidence, some of the concepts of such networks are briefly discussed in this section.

2.2.3.1 RNA-binding proteins versus transcription factors

RBPs are selectively regulated by transcriptional activators or repressors. Among 561 known and predicted yeast RBPs (Hogan et al., 2008), 279 (50%) were targeted by at least one of 106 TFs surveyed by Rick Young and colleagues (Lee et al., 2002). This fraction is considerably larger than the total fraction of regulated genes in the genome (2343 regulated genes out of 6270; 37%). Conversely, the messages encoding TFs require the activity of RBPs for their maturation, decay, localization, and translation. Interestingly, 94 out of the 106 TFs (87%) surveyed by Lee and colleagues were bound by at least one of the 46 RBPs analyzed by Hogan et al. (false discovery rate < 5%; Lee et al., 2002; Hogan et al., 2008). Although the study was not exhaustive, this analysis underpins recent observations that RBPs tend to control other gene regulators, such as RBPs and TFs (Figure 2.9 A). Such a “regulator of regulators” concept has recently been established for some human RBPs (Pullmann et al., 2007).


Figure 2.9 Composite gene regulatory network motifs. (A) A two-component loop in which an RNA-binding protein (RBP; blue) regulates the expression of a transcription factor (TF; green), which in turn regulates the expression of the RBP. (B) A two-component loop in which an RBP (blue) regulates the expression of a microRNA (miRNA; yellow), which in turn regulates the expression of the RBP. (C) A unidirectional two-level ‘composite regulon’ consisting of different classes of gene regulators (TFs, RBPs and miRNAs) which regulate the expression of several target genes in a combinatorial manner. Adapted from Kanitz and Gerber (2010).

2.2.3.2 RNA-binding proteins versus microRNAs

Besides the RNA processing factors required for the biogenesis of the majority of miRNAs, such as the nuclear microprocessor complex and TRBP/Dicer (Denli et al., 2004; Gregory et al., 2004; Chendrimada et al., 2005; Gregory et al., 2005), an increasing body of work suggests that RBPs extensively regulate miRNA expression, either generally or selectively (reviewed in Winter et al., 2009; Krol et al., 2010). In one of the earliest examples of such regulation, Lin28, a cytoplasmic mRNA-binding protein, was shown to selectively block the processing of pre-miRNAs of the let-7 family in human cells (Viswanathan et al., 2008). Interestingly, the message encoding the Lin28 protein is itself regulated by its target miRNA let-7b (Wu and Belasco, 2005), thus providing an example of an RBP/miRNA two-component feedback loop (Figure 2.9 B).

Several studies indicate that RBPs may selectively modulate miRNA function, both synergistically and competitively, to alter translational repression (reviewed in van Kouwenhove et al., 2011). In the first study demonstrating such interplay, ELAV (embryonic lethal, abnormal vision, Drosophila)-like 1 (Hu antigen R; ELAVL1/HuR) was shown to relieve the miRNA-mediated repression of CAT-1 mRNA upon stress (Bhattacharyya et al., 2006). For additional examples of crosstalk between miRNAs and RBPs, see 2.4.1 and 2.4.2. Interestingly, bioinformatic evaluation of our own ribonomics analyses of the two human Pumilio/Fem-3-binding factor (PUF) family members Pum1 and Pum2 in cancer cells revealed that conserved Pum recognition elements and miRNA seed sequences were preferentially located in close vicinity among the experimentally identified targets, suggesting extensive crosstalk between the two regulatory systems (Galgano et al., 2008).

The high degree of interplay between transcriptional regulation and PTGR should eventually lead to the characterization of composite GRNs (reviewed in Alon, 2007). For instance, the combination of the multiple-input and multiple-output motifs for functionally related gene classes leads to dense and overlapping ‘composite regulons’ (Figure 2.9 C). Such regulons are widespread phenomena in the control of gene expression at different levels, and can be thought of as gate-arrays, processing multiple inputs from regulators to multiple targets (reviewed in Milo et al., 2002; Alon, 2007). However, in order to understand the functional implications of these composite regulons, not only the connectivity, but also the input function of each regulator (either positive or negative) has to be known, requiring quantitative measurements of the abundance and actions of diverse components of this network. If methodologies can be refined accordingly, such analyses will in the future allow a systems-level understanding of the multilayered gene-expression programs.

2.3 Ribonomics methodologies for the systematic identification of basic post-transcriptional gene regulatory network motifs

The advent of global and quantitative analysis tools for the study of gene expression allows the detection and quantification of network motifs in gene regulatory systems. Generally, it appears that principles and structures of transcriptional regulatory networks are also preserved at the post-transcriptional level. However, systems analysis of PTGR is still in its infancy. The development of novel techniques for PTGR network analysis will hence be crucial to obtain sufficient data for the deciphering of the “RNA code”. In this chapter, we will summarize the methodologies developed for the unveiling of network motifs, starting with the “top-down” approaches, which allow the identification of the RNA targets of RNA-binding proteins and miRNAs. As these cannot directly identify combinatorial control (i.e. “multiple input”) motifs and are thus not in the focus of this work, we will only briefly summarize the available tools, instead focusing on the less well-established “bottom-up” approaches, which allow the identification of proteins and RNAs that bind a specific RNA of interest. Both approaches should allow the analysis of RNA and protein components of purified RNPs by next generation sequencing (reviewed in Wang et al., 2009) and quantitative proteomics methods (Wepf et al., 2009; reviewed in Gstaiger and Aebersold, 2009), respectively.


2.3.1 Top-down approaches: RIP-Chip and related methods

In 1999, the late Robert Cedergren coined the term ‘ribonomics’ for the search for RNA genes, their structures and functions (Bourdeau et al., 1999). However, the term now mainly refers to the application of methods aimed at the systematic identification of the RNA targets of RBPs. In a pioneering approach, Jack Keene and colleagues successfully isolated mRNPs by immunopurification of RBPs, followed by the identification of the associated, co-purified mRNA ‘targets’ using DNA microarrays. Immunopurification relies on the specific immobilization of RNPs on antibody-coupled matrices, and the RBPs are targeted either directly, using RBP-specific antibodies, or via affinity tags that are fused to the RBPs. Targets are defined based on their relative enrichment compared to a suitable control, such as matrices coupled to isotype control antibodies or uncoupled matrices. In analogy to the ChIP-Chip (chromatin immunopurification-microarray) method, which allows the identification of the DNA binding regions of TFs or other DNA-binding proteins (Ren et al., 2000; Iyer et al., 2001; Lieb et al., 2001), the procedure is referred to as RIP-Chip (RBP immunopurification-microarray; Tenenbaum et al., 2000, 2002). The method was a great success and has been employed to determine the RNA targets of more than a hundred RNA-binding proteins in mammalian cells, flies, worms, trypanosomes, and particularly in yeast (reviewed in Morris et al., 2010). It also allowed the indirect determination of miRNA targets through the immunopurification of Argonaute proteins, e.g. in human cell lines (Karginov et al., 2007; Hendrickson et al., 2008; Landthaler et al., 2008).
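The principle of defining targets by their relative enrichment over a control purification can be sketched as a log2 ratio with a cutoff. The signal values, the pseudocount and the twofold threshold below are arbitrary assumptions for illustration; published studies typically rely on replicate experiments and statistical measures such as false discovery rates rather than on a fixed ratio.

```python
import math

def call_targets(ip_signal, control_signal, min_log2_enrichment=1.0, pseudocount=1.0):
    """Return transcripts whose log2(IP / control) ratio meets the threshold.
    The pseudocount avoids division by zero for undetected transcripts."""
    targets = {}
    for transcript, ip in ip_signal.items():
        ctrl = control_signal.get(transcript, 0.0)
        log2_ratio = math.log2((ip + pseudocount) / (ctrl + pseudocount))
        if log2_ratio >= min_log2_enrichment:
            targets[transcript] = log2_ratio
    return targets

# Hypothetical microarray intensities (arbitrary units, placeholder names).
ip = {"mRNA_a": 799.0, "mRNA_b": 120.0, "mRNA_c": 99.0}    # RBP-specific purification
mock = {"mRNA_a": 99.0, "mRNA_b": 110.0, "mRNA_c": 99.0}   # control purification

targets = call_targets(ip, mock)
print(targets)
```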

The main limitations of the technique are its inability to discover previously unknown RNAs (as microarrays have to be spotted with specific hybridization probes against potential targets, requiring previous knowledge of their sequences), and its incapacity to directly define the corresponding cis-regulatory elements. While the latter may sometimes be circumvented by deducing RBP recognition motifs from sequence comparisons of enriched co-purified messages with the help of pattern recognition/motif discovery software such as MEME (Bailey et al., 2009), this strategy is confined to RBPs with a strong affinity to specific sequence rather than structural or mixed motifs (see the description of PUF proteins in 4.1 for an example of RBPs with high specificity for a particular sequence motif). Furthermore, owing to the dynamics of RNPs and the consequently often transient nature of RBP-RNA (or RNA-RNA) interactions, as well as potential artifacts introduced by the purification procedure, the technique is prone to false negatives and positives.
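The idea of deducing a sequence motif by comparing enriched messages with a background set can be illustrated with a naive k-mer comparison. This sketch is a simplified stand-in and does not reproduce the algorithm of MEME or any other motif discovery software; the sequences are invented, with an arbitrary 8-mer planted in the "target" set, and the threshold is an assumption.

```python
from collections import Counter

def kmer_presence(seqs, k):
    """For each k-mer, count the number of sequences containing it at least once."""
    counts = Counter()
    for seq in seqs:
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    return counts

def enriched_kmers(targets, background, k=8, min_fraction_diff=0.5):
    """k-mers present in a clearly larger fraction of targets than of background."""
    in_targets = kmer_presence(targets, k)
    in_background = kmer_presence(background, k)
    result = []
    for kmer, n in in_targets.items():
        diff = n / len(targets) - in_background.get(kmer, 0) / len(background)
        if diff >= min_fraction_diff:
            result.append((kmer, diff))
    return sorted(result, key=lambda item: (-item[1], item[0]))

# Hypothetical 3' UTR fragments; the targets share the planted 8-mer UGUAAAUA.
target_seqs = ["GGCAUGUAAAUACCGA", "ACUGUAAAUAGGCUUC", "CCAGGAUUGUAAAUAU"]
background_seqs = ["GGCACCGAUUAGCCAU", "ACGGCUUCAAGGCUAG"]

hits = enriched_kmers(target_seqs, background_seqs)
print(hits)
```

Exact k-mer matching of this kind only captures strictly conserved linear sequence elements, which mirrors the limitation stated in the text for structural or mixed motifs.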

In a number of recent variations of the technique, the aforementioned obstacles have been addressed, at least in part, by two modifications: (1) the covalent crosslinking of the bait proteins to the targeted RNAs; and (2) the identification of bound RNAs by high-throughput sequencing methods. Crosslinking procedures rely on either UV light (Greenberg, 1979) or photoactivatable ribonucleoside analogues such as 4-thiouridine (Sontheimer, 1994). The respective techniques, termed HITS-CLIP (high-throughput sequencing crosslinking immunopurification; Licatalosi et al., 2008; reviewed in Darnell, 2010), PAR-CLIP (photoactivatable ribonucleoside enhanced crosslinking and immunopurification; Hafner et al., 2010a, 2010b), and iCLIP (individual nucleotide resolution crosslinking and immunopurification; König et al., 2010, 2011), have been used extensively in the last three years to map in high resolution the RNA targets of several RBPs, including Argonaute proteins (Chi et al., 2009; Hafner et al., 2010b), heterogeneous nuclear ribonucleoprotein (hnRNP) particles (König et al., 2010), T-cell intracellular antigen 1 (TIA1) and TIA1-like 1 (TIAL1; Wang et al., 2010), fragile X mental retardation protein (FMRP; Darnell et al., 2011), and ELAVL1/HuR (Lebedeva et al., 2011; Mukherjee et al., 2011). Moreover, analysis methods (Kishore et al., 2011; Zhang and Darnell, 2011) and databases (Corcoran et al., 2011; Khorshid et al., 2011) have been developed that facilitate the analysis and meta-analysis of such data. The application of these and similar techniques in the coming years will surely contribute immensely to the discovery of multiple output motifs of post-transcriptional GRNs and further our understanding of the RNA code.

2.3.2 Bottom-up: RNA affinity chromatography and related methods

Complementary approaches to RIP-Chip and related methods, focusing on the affinity purification of RNA components of RNPs, should allow both the identification of RNAs bound by trans -acting RNAs, as well as trans -acting proteins and RNAs binding to messages and non-coding RNAs. Apart from the insights into principles of gene regulatory circuitry, particularly the latter, ‘gene-centered’ approach might have broad and immediate implications for medical research, as it would allow the convenient and comprehensive identification of post-transcriptional regulators acting on messages of interest, such as oncogenes and tumor suppressors. Although a widely applicable, sensitive, specific and reliable purification strategy for RNAs remains elusive, various promising approaches have been devised and often successfully implemented.

2.3.2.1 Direct RNA affinity chromatography

The most direct approach is represented by RNA affinity chromatography methods that rely on the immobilization of a bait RNA on a matrix and incubation with a substrate solution containing proteins or RNAs of interest, such as whole cell extracts, fractions thereof, or purified proteins (reviewed in Kaminiski et al., 1998). The RNA of interest is either in vitro transcribed or chemically synthesized, covalently or non-covalently attached to the matrix, and may or may not contain chemical modifications (Grabowski and Sharp, 1986; Bindereif and Green, 1987; Roualt et al., 1989; Caputi et al., 1999; Allerson et al., 2003; Gerber et al., 2004). A method for the immobilization of double-stranded RNA for the isolation of double-stranded RNA-binding proteins has also been described (Langland et al., 1995). The main disadvantage of such approaches is that RNP formation does not occur in situ but in vitro, strongly limiting their applicability for the comprehensive characterization of dynamic RNPs. Nevertheless, they remain useful for the isolation of individual proteins (Roualt et al., 1989) and relatively static complexes, such as spliceosomes (Grabowski and Sharp, 1986; Bindereif and Green, 1987; Caputi et al., 1999).

2.3.2.2 Purification methods based on antisense hybridization

The specificity of antisense oligonucleotide probes has also been used to characterize the spliceosome: In a variation of the direct RNA affinity chromatography method, immobilized biotinylated antisense RNAs or 2’-O-alkylated RNAs were extensively used to mask specific regions of snRNAs as a means to identify their functional domains and spliceosome architecture (Ruby and Abelson, 1988; Lamond et al., 1989; reviewed in Lamond and Sproat, 1994; Blencowe and Barabino, 1995). A similar approach was later employed to purify Euplotes aediculatus telomerase from nuclear extracts via hybridization of immobilized biotinylated 2’-O-methylated (2’-O-Me) RNA complementary to telomerase RNA (Lingner and Cech, 1996), thus extending the use of antisense oligonucleotides for the purification of RNPs beyond the study of spliceosomes. Although the use of immobilized antisense oligonucleotides should allow the purification of native, in situ-formed RNPs, the described approaches relied on the incubation of in vitro-transcribed bait RNAs with nuclear extracts for RNP assembly, possibly due to the inefficiency of the method. The efficiency might be improved by using antisense probes with improved stability, affinity and specificity properties, such as locked nucleic acids (LNAs), morpholinos, or peptide nucleic acids (PNAs). Indeed, in a more recent study, a PNA was coupled to a cell-penetrating peptide and a photoactivatable compound and was introduced into cortical neurons, where it hybridized with a complementary region in a dendritically localized mRNA. Irradiation with ultraviolet light enabled crosslinking of the PNA to the nearest proteins, and, following RNase digestion, the PNA probe could be captured by an immobilized sense oligonucleotide (antisense to PNA; Zielinski et al., 2006).

2.3.2.3 Aptamer-based purification methods

Aptamers are DNA or RNA sequences that have been selected from random nucleotide libraries for their ability to bind ligands of choice with high affinities (reviewed in Mayer, 2009), usually by systematic evolution of ligands by exponential enrichment (SELEX) (Oliphant et al., 1989; Tuerk and Gold, 1990; Ellington and Szostak, 1990) or related methods.

Due to their chemical nature, RNA aptamers can be conveniently used to tag bait RNAs by recombinant DNA technology. Upon expression, RNPs assemble on tagged RNA in an almost native fashion (depending on the site of aptamer insertion), and are amenable to purification via ligand-coated matrices. As such, RNA aptamers constitute perhaps the most promising approach for a ‘universal’ RNA affinity tag. While the aptamer database (Lee et al., 2004) contains hundreds of different RNA aptamer sequences, so far only a few have been successfully employed for the affinity purification of RNPs: A group of aptamers against the aminoglycoside antibiotic tobramycin has been selected (Wang and Rando, 1995), out of which one has since been established as an RNA affinity tag (Hamasaki et al., 1998) and used to characterize human prespliceosomes (Hartmuth et al., 2002; Hartmuth et al., 2004) and other RNPs (Vazquez-Piazola et al., 2005; Weinlich et al., 2009). An aptamer against streptomycin (Wallace and Schroeder, 1998), another aminoglycoside antibiotic, has also been established as an RNA tag (Bachler et al., 1999; Windbichler and Schroeder, 2006; Dangerfield et al., 2006) and applied for the study of the 48S ribosomal subunit (Locker et al., 2006; Locker and Lukavsky, 2007) and group II intron binding proteins (Böck-Taferner and Wank, 2004). A precursor in yeast ribonuclease P (RNase P) assembly has been identified with the use of an aptamer against Dextran B512 (Srisawat et al., 2001; Srisawat et al., 2002). Finally, an aptamer against streptavidin, termed S1 (Srisawat and Engelke, 2001), has also been used to study RNase complexes (Li and Altman, 2002; Xiao et al., 2005; Welting et al., 2007), as well as RNPs and individual proteins associated with small nucleolar RNAs (Lemay et al., 2011), rRNAs (Mohammad et al., 2007), telomerase RNA (Shcherbakova et al., 2009), and AU-rich elements (AREs) (Vasudevan and Steitz, 2007; Vasudevan et al., 2007). One group established a sensitive, gel-free mass spectrometry method for the protein analysis of S1 aptamer-purified RNPs: A tRNA and several mRNA motifs were aptamer-tagged and purified. Analysis of the associated proteins identified proteins that were previously shown to interact with the respective RNAs, as well as several potential new ‘interactors’ (Butter et al., 2009). Recently, an interesting modification of the S1 aptamer system was reported that involves coupling of the aptamer to a tRNA (Iioka et al., 2011). This report is the first to indicate that the use of scaffolds may enhance both the stability and binding efficiency of aptamers.

2.3.2.4 Indirect protein- and peptide-based purification methods

A more indirect approach exploits protein affinity tags for the purification of RNAs. By fusing bait RNAs to well-established recognition elements of RBPs, it is possible to ‘pull down’ RNPs via the corresponding, affinity-tagged RBP and a suitable matrix. By co-expression of the RBP, this approach allows the purification of in situ-formed RNPs and, furthermore, the direct study of RNP localization when a fluorescent protein is fused to the RBP. The most widely used RBP for these purposes is a coat protein from the R17/MS2 bacteriophage (Bardwell and Wickens, 1990); others include the coat protein of the PP7 phage (Hogg and Collins, 2007a, 2007b), the N antiterminator protein of the λ phage (Czaplinski et al., 2005), the spliceosomal snRNP protein U1A (Brodsky and Silver, 2000; Takizawa and Vale, 2000) and the eukaryotic initiation factor 4A (Valencia-Burton et al., 2007). As high affinity protein tags are well established and routinely used in many laboratories, and matrices are widely available, this attractive approach has gained some popularity for the purification of RNPs. Affinity tagged MS2 proteins, or fusion proteins thereof (Zhou and Reed, 2003; Beach and Keene, 2008), have been employed, among others, to characterize spliceosomes (Bardwell and Wickens, 1990; Das et al., 2000; Jurica et al., 2002; Jurica and Moore, 2002; Zhou et al., 2002; Deckert et al., 2006), mRNA transport/localization proteins or complexes (Gonsalvez et al., 2004), as well as HIV-1 RNA (Kula et al., 2011). Recently, a fusion of the MS2 coat protein to GFP and a streptavidin binding peptide (Keefe et al., 2001) has been described that allows both the tracking (via GFP) and a highly efficient purification of RNPs (via a streptavidin-coated matrix) (Slobodin and Gerst, 2011). While indirect, RBP-mediated approaches are relatively versatile and efficient, a major downside is the indirect nature of the purification method, making it prone to artifacts resulting from the ectopic expression of the protein, cross-reactivity with untagged RNAs, interference with native RNP formation and unspecific binding of RNAs and proteins to the RBP.

2.3.2.5 Bifunctional RNA tag systems

A common obstacle in the widespread use of all of the described approaches is a low signal-to-noise ratio, mainly due to limited binding specificity and/or low efficiency. Two bifunctional tag systems address these issues by allowing a two-step ‘tandem’ purification of RNPs: The ‘RAT’ (RNA Affinity in Tandem) tag contains binding sites for the PP7 coat protein as well as a tobramycin aptamer (Hogg and Collins, 2007a, 2007b), while the ‘TRAP’ (Tandem RNA Affinity Purification) tag contains an S1 aptamer as well as binding sites for the MS2 coat protein (Nelson et al., 2007). While promising, these tandem tag systems have not gained popularity so far, possibly because their universal use is hindered by the lack of a stabilizing scaffolding structure that could increase the reliability of the systems and ease the selection of integration sites in potential bait RNAs.

2.4 Combinatorial control of cancer-related messages

As diseases often arise as a consequence of dysregulation of one or more genes, it is not surprising that, due to their outlined central roles in the control of eukaryotic gene expression, RBPs and miRNAs are implicated in a large number of human ailments, particularly cancers (reviewed in Calin and Croce, 2006; Lukong et al., 2008; Farazi et al., 2011; van Kouwenhove et al., 2011; Wapinski and Chang, 2011). Here we summarize in detail the extent of combinatorial post-transcriptional control exerted on two cancer-related messages, the angiogenesis factor vascular endothelial growth factor A (VEGFA), and the tumor suppressor cyclin-dependent kinase inhibitor 1B (p27, Kip1)/CDKN1B.

2.4.1 Combinatorial control of the angiogenesis factor vascular endothelial growth factor A

Vascular endothelial growth factor A (VEGFA) is implicated in a wide range of human tumors, as well as several other malignancies, through its function as a potent angiogenic mitogen (see 4.1 for more details). The great variety of mRNA and protein isoforms and the large number of cis-binding elements in its mRNA, particularly in its 3’-UTR, render it a prime example of a cancer-related gene that is under extensive combinatorial post-transcriptional control.

The presence of a “tumor-angiogenesis factor” was first hypothesized in 1971 by Judah Folkman and colleagues based on the description of a mitogen secreted by tumors (Folkman et al., 1971). In 1983, a factor causing microvascular permeability was partially purified from tumor ascites fluid (Senger et al., 1983). It took another six years, however, until this “vascular endothelial growth factor” was fully purified and its gene eventually cloned (Ferrara and Henzel, 1989; Keck et al., 1989). At the same time, three protein isoforms resulting from alternative splicing were identified, representing the first indication that its expression is regulated post-transcriptionally (Leung et al., 1989). Since then, eight protein isoforms, differing in peptide length and resulting from splicing events in exons 6 and 7, have been characterized (Houck et al., 1991; Poltorak et al., 1997; Lei et al., 1998; Whittle et al., 1999; Lange et al., 2003). Moreover, the use of a distal splice site in exon 8 gives rise to further VEGFA isoforms (Bates et al., 2002) with anti-angiogenic properties, demonstrating the fundamental role of alternative splicing for the function of VEGFA. Changes in the VEGFA co-factor binding domains, resulting from the splicing events, are believed to be the main cause of the differential activity of the various isoforms (reviewed in Harper and Bates, 2008). Additional VEGFA mRNA isoforms may also be generated by the use of an alternative initiation codon (CUG) that is driven by an internal ribosomal entry site (IRES; Huez et al., 2001; Meiron et al., 2001).

The first hints about splicing-independent post-transcriptional control of VEGFA expression came from a 1997 study in which the human VEGFA 3’-UTR was sequenced and five hypoxia-inducible binding sites were identified (Levy et al., 1997). A year later, Levy and colleagues identified the RNA-binding protein ELAVL1/HuR as the first trans-acting factor regulating VEGFA expression post-transcriptionally in a hypoxia-dependent manner (Levy et al., 1998) by binding one of the previously identified binding sites (Goldberg-Cohen et al., 2002; Figure 2.10). Similarly, it was shown that the poly(A)-binding protein-interacting protein 2 (PAIP2) stabilizes the VEGFA transcript by binding to its 3’-UTR and interacts with ELAVL1/HuR, thus suggesting a potential cooperative regulation (Onesto et al., 2004). Recently, ELAVL1/HuR-mediated control of VEGFA expression was shown to contribute to the maintenance of an angiogenic phenotype in tumor-derived endothelial cells, thus underpinning the biological relevance of this interaction (Kurosu et al., 2011).

An early study further reported the presence of a ~125 nt hypoxia-inducible AU-rich stability region in the 3’-UTR, which was shown to be bound by a protein complex of unknown identity (Claffey et al., 1998). One of the components was later identified as the splicing factor hnRNP L, which, under hypoxic conditions, translocates to the cytoplasm and specifically associates with a 21 nt CA-rich element (CARE) within the previously identified AU-rich region (Shih and Claffey, 1999; Figure 2.10). An interesting feedback mechanism of VEGFA expression was revealed by Ray and Fox (2007): While interferon gamma (IFN-γ) activates transcription of VEGFA, it also contributes to its translational silencing through the binding of the IFN-gamma-activated inhibitor of translation (GAIT) complex to the VEGFA 3’-UTR. Later it was shown that the VEGFA 3’-UTR region containing the adjacent binding sites for hnRNP L and the GAIT complex represents a conformational switch that allows binding only to one of the two trans-acting factors, depending on the status of IFN-γ and hypoxia signaling (Ray et al., 2009; Figure 2.10). Recently, Jafarifar et al. showed that hypoxia-dependent binding of hnRNP L to the CA-rich element in the VEGFA 3’-UTR led to competitive displacement of miRNAs miR-297, -299, -567 and -605 (Figure 2.10) and consequently derepression of VEGFA expression in tumor-associated macrophages (Jafarifar et al., 2011), providing the first evidence for cooperative regulation of VEGFA expression by a regulatory RBP and miRNAs.

An in-depth analysis of the whole VEGFA transcript revealed the presence of hypoxia-inducible cis-regulatory elements also in its 5’-UTR and coding region, and it was demonstrated that their function is dependent on one another (Dibbens et al., 1999). Subsequently, it was shown that a complex of cold shock domain and polypyrimidine tract binding proteins is able to bind and stabilize both the VEGFA 5’- and 3’-UTRs in a hypoxia-independent manner (Coles et al., 2004). Similarly, the double-stranded RNA-binding protein DRBP76/NF90 was found to bind in the 3’-UTR by Vumbaca and colleagues (2008) and also effected stabilization. Yet another stabilizing factor is the oncoprotein MDM2, which, like ELAVL1/HuR and hnRNP L, translocates to the cytoplasm under hypoxic conditions, where it binds the VEGFA transcript in the AU-rich region (Zhou et al., 2011). In contrast, Ciais and colleagues revealed that the zinc-finger protein TIS11b is able to effect a destabilization of the VEGFA mRNA by binding to a 75 nt region in its 3’-UTR that contains two consensus AU-rich motifs (Ciais et al., 2004).

Apart from the aforementioned study by Jafarifar et al. (2011), other miRNAs have also been found to regulate human VEGFA expression post-transcriptionally. In two studies, the group of Zhang You identified several miRNAs that repressed VEGFA expression in vitro, although the corresponding recognition motifs in the mRNAs have not been unambiguously identified (Hua et al., 2006; Ye et al., 2008). One of the two VEGFA IRES, driving the expression of the VEGF-121 isoform (Huez et al., 2001), has been shown to be susceptible to regulation by miR-16 (Karaa et al., 2009). Lei et al. identified a feedback loop regulating the adaptation of murine tumor cells to different oxygen concentrations in which hypoxia-inducible factor 1 alpha (HIF-1α) suppresses the expression of miR-20b which in turn may regulate both HIF-1α and VEGFA expression (Lei et al., 2009). Two miRNAs, miR-126 and -205, were shown to regulate VEGFA expression and inhibit the growth of lung and breast cancer, respectively, in vitro and in vivo (Liu et al., 2009; Wu et al., 2009). Two other miRNAs, miR-93 and -200b, were demonstrated to regulate VEGFA expression in diabetes and diabetic retinopathy, respectively (Long et al., 2010; McArthur et al., 2011).

Figure 2.10 Post-transcriptional regulation of VEGFA expression. Some of the identified cis-regulatory elements and trans-acting factors that regulate VEGFA expression post-transcriptionally are depicted schematically, with a focus on the cooperative regulation that leads to the stabilization of the VEGFA mRNA during hypoxia. See main text for details. Adapted from van Kouwenhove et al. (2011).

2.4.2 Combinatorial control of the tumor suppressor CDKN1B/p27

The cyclin-dependent kinase (cdk) inhibitor p27Kip1/CDKN1B is a tumor suppressor with a key role in the control of the cell cycle. In quiescent cells, the protein is present at high levels where it binds to and inhibits the activity of the cdk4-cyclin D and cdk2-cyclin A/E complexes. Upon mitogenic stimulation in late G1 phase, p27 is targeted for regulated destruction by the Skp1/Cul1/F-box-containing (SCF) ubiquitin ligase complex (Pagano et al., 1995), resulting in the activation of the cyclin complexes and progression of the cell cycle. The protein may be valuable as a prognostic marker or drug target, as its frequent downregulation in tumors is associated with poor prognosis (reviewed in Slingerland and Pagano, 2000; Hershko, 2010). Several post-transcriptional regulators of its expression have been identified that sometimes act in cooperation.

The first indication that CDKN1B expression is under post-transcriptional control came in the mid-nineties: A peak in p27Kip1 levels in early G1 phase was not consistent with the stable levels of its mRNA, leading one group to suggest that CDKN1B mRNA may be under translational control (Hengst and Reed, 1996). Later the same year, another group observed that p27Kip1 levels drop dramatically upon mitogen-mediated exit from G0 (Agrawal et al., 1996). However, the decrease in protein levels was demonstrated to be independent of its degradation, thus leading them to propose a post-transcriptional regulatory mechanism as well. A role for RNA-binding factors in the regulation of the oscillating p27Kip1 levels was finally confirmed in 2000, when it was shown that a U-rich sequence in the CDKN1B 5’-UTR is bound by a number of cycling trans-acting factors, specifically the RBPs ELAVL1/HuR, hnRNP C1 and C2, leading to translational activation in proliferating and quiescent cells (Millard et al., 2000). Interestingly, an inhibitory role for the ELAV proteins HuR and HuD in proliferating cells has also been proposed, mediated by the binding to and blocking of an internal ribosome entry site (IRES), which is also situated in the CDKN1B 5’-UTR (Kullmann et al., 2002). Recently, the CUG binding protein was also reported to repress IRES-mediated translation (Zheng and Miskimins, 2011). In contrast, PTB has been proposed to promote cap-independent translation of p27Kip1 by interacting with the IRES (Cho et al., 2005). However, other reports call the existence of a functional IRES into question and attribute the observed effects to cryptic promoter activity instead (Liu et al., 2005; Cuesta et al., 2009).

Consistent with the more conventional roles of ELAVL1/HuR, it was recently shown that the protein may also stabilize the CDKN1B message by binding to several U-rich regions in its 5’- and 3’-UTRs (Ziegeler et al., 2010). In the same study it was also established that multiple mRNA isoforms of CDKN1B are produced by the use of alternative transcription start sites and polyadenylation signals, and that a CU repeat region in the resulting extended 5’-UTR is bound by a 41 kDa (kilodalton) stabilizing factor of unknown identity. Another study identified a novel endonuclease that is able to degrade the CDKN1B message by binding to one of its U-rich regions, thus revealing a possible mechanism for the stabilizing role of ELAVL1/HuR, involving the protection of U-rich elements from degradation (Zhao et al., 2000).

While studying oligodendrocyte differentiation in rats, the RBP quaking/QKI was identified as another trans-acting factor capable of stabilizing the CDKN1B transcript (Larocque et al., 2005). Due to this role, QKI was recently implicated in the suppression of colon cancer (Yang et al., 2010). Furthermore, the protein was identified to partake in a negative feedback loop involving its direct upregulation by E2F1, followed by the QKI-mediated stabilization of p27, which in turn leads to stabilization of the retinoblastoma protein and suppression of E2F1 (Yang et al., 2011).

By employing an elegant miRNA library screen, Reuven Agami and colleagues identified the first miRNAs suppressing CDKN1B expression, miR-221 and miR-222, and established their roles as potent oncogenes that are upregulated in several cancer cell lines as well as glioblastomas (le Sage et al., 2007). This finding has since been corroborated and extended to prostate cancer cell lines (Galardi et al., 2007), melanomas (Felicetti et al., 2008) and hepatocellular carcinomas (Fornari et al., 2008; Pineau et al., 2010), among others. Another miRNA, miR-181a, was demonstrated to suppress CDKN1B expression during myeloid cell differentiation (Cuesta et al., 2009). Recently, a third miRNA, miR-148a, was found to regulate CDKN1B and promote proliferation in gastric cancer cell lines, although its levels were frequently downregulated and inversely correlated with CDKN1B levels in gastric cancer tissue samples, indicating potentially antagonistic roles in cancer progression in these particular tumors (Guo et al., 2011).

Two more studies from the Agami lab prominently demonstrated that miR-221/222-mediated suppression of CDKN1B expression is under cooperative control by RBPs: In a 2007 study, it was shown that the RBP Dnd1 is able to bind to two uridine-rich regions located between the two miR-221/222 recognition elements (also see Figure 5.7), thus blocking them from access by the miRNAs in a competitive manner (Kedde et al., 2007). In a more recent study, mitogen-activated phosphorylation of the PUF family RBP Pum1 led to its binding of a cis-regulatory element in the proximity of one of the miR-221/222 recognition elements, thus inducing a structural switch that positively influenced the accessibility of the recognition element (Kedde et al., 2010). These findings explain why p27Kip1 is able to accumulate in quiescent cells, despite high levels of miR-221/222, and present an elegant synergistic regulatory mechanism between two different classes of trans-acting factors (Figure 2.11).

Figure 2.11 Synergistic post-transcriptional control of CDKN1B expression. In the absence of growth factors, Pum1 is in an inactive state and miRNA access to its recognition element is limited. Upon stimulation with growth factors, Pum1 is phosphorylated and binds a recognition element in the CDKN1B 3’-UTR, thus inducing a structural switch that increases the accessibility of the miRNA recognition element. See main text for additional information. Adapted from van Kouwenhove et al. (2011).

3 Aims and Outline of the Thesis

The discovery of combinatorial control motifs is a major challenge in the study of post-transcriptional gene regulatory networks. In this work we address this problem by (a) the combination of experimental evidence and bioinformatics predictions to identify two classes of cis-regulatory elements in a message of interest (see chapter 4), and (b) the development of a widely applicable method for the identification of trans-acting factors that associate with an RNA of interest (see chapter 5), using human cell lines as a model system.

Regarding the first approach, we built upon previous observations which suggest extensive interaction between two classes of post-transcriptional gene regulators, the Pumilio/Fem-3-binding (PUF) family of RNA-binding proteins and the microRNA (miRNA) machinery. To examine the relevance of this finding, we chose to study the 3’-untranslated region of the angiogenesis factor VEGFA, which contains three canonical Pum consensus motifs and is bound by the RNA-binding domains of the two human PUF family members Pum1 and Pum2 in vitro. We used five miRNA target prediction services to identify a high confidence miRNA recognition element in the vicinity of the Pum consensus motifs. By using in vitro reporter assays, we then assessed the regulatory potential of these cis-regulatory elements, as well as the individual and combinatorial regulatory impact of the corresponding trans-acting factors.

For the second approach, we set out to develop a method for the enrichment and subsequent compositional analysis of ribonucleoprotein (RNP) particles. By using rational design principles, we aimed to develop a nucleic acid-based RNA tandem affinity tag system that can be fused to RNAs of interest and expressed in human cell lines without impeding native RNP formation. The tagged complexes should then be amenable to purification via a gentle, efficient and highly specific two-step process that is compatible with downstream analytical methods. By using highly sensitive transcriptomics and proteomics approaches, it should then be possible to characterize, in an unbiased manner, their RNA and protein components.

4 Identification of New Post-Transcriptional Regulators of Vascular Endothelial Growth Factor A Expression

4.1 Introduction

Angiogenesis is a multi-step process leading to the formation of vascular structures derived from preexisting blood vessels, either through remodeling or the formation (“sprouting”) of new vessels. It involves the induction of microvascular hyperpermeability, breakdown of the vascular basement membrane, recruitment and proliferation of endothelial cells (ECs), and the formation of mature blood vessels. Positive and negative regulators of angiogenesis have to be tightly regulated to maintain physiological tissue homeostasis and function. For example, in healthy adult skin, angiogenesis is generally quiescent as angiogenic stimuli are overruled by inhibitory signals. However, environmental insults may tip this balance, leading to initiation of angiogenesis in order to counteract tissue damage. Similarly, excessive angiogenesis in the skin, resulting from dysregulation of one or more of its regulators, is associated with a plethora of pathological conditions, such as psoriasis and other inflammatory dermatoses, autoimmune blistering diseases, and many cancers, most prominently melanoma, basal cell carcinoma and squamous cell carcinoma. Anti-angiogenic therapy therefore holds promise for the treatment of a wide spectrum of human ailments (Detmar, 2000; Carmeliet, 2005).

Cutaneous squamous cell carcinoma (SCC) is the second most common skin cancer in the general population (Lohmann and Solomon, 2001). In contrast to basal cell carcinoma – the most common skin cancer – it is characterized by the risk for metastasis. Incidence of SCC is 60- to 100-times higher among immunosuppressed patients, which makes it the most common cancer following organ transplantation. Invasive SCC develops from atypical keratinocytes, clinically visible as actinic keratosis or Bowen’s disease, both considered intraepithelial non-invasive forms of SCC (Hofbauer et al., 2010). Approximately 1% of these intraepithelial lesions develop into an invasive SCC (Schwartz et al., 2008). However, such tumor development requires intense interactions with stromal cells and profound extracellular remodelling. Angiogenesis is an essential part of the malignant phenotype as most tumors are apparently not able to exceed 1-2 mm in diameter without developing new blood vessels (Folkman, 1990). Therefore they produce angiogenic factors at an early point of development.

Vascular endothelial growth factor A (VEGFA) is a homodimeric heparin-binding glycoprotein that mainly acts as a paracrine mitogen, growth and survival factor for ECs, but it also causes vascular permeability, vasodilatation, and various changes in immune cell properties upon binding to its main receptors VEGF receptor-1 and -2. VEGFA has been identified as the predominant tumor angiogenesis factor in the majority of human cancers, including those of the breast, colon, lung and prostate (Ferrara et al., 2003; Hoeben et al., 2004). Invasive SCC also expresses increased levels of VEGFA, particularly in the leading front of the tumor, which is an intuitive site for the induction of angiogenesis (Bowden et al., 2002). VEGFA expression in SCC has not been studied intensively. However, in some other cancers, such as head and neck squamous cell carcinoma, increased expression of VEGF has been associated with progression to a more aggressive phenotype, both clinically and in experimental systems (Sauter et al., 1999). Similarly, increased VEGFA expression correlates with greater metastatic potential of melanoma, and its expression is high in melanoma metastases themselves (Salven et al., 1997; Tóth-Jakatics et al., 2000). In the skin, VEGFA is mainly secreted by epidermal keratinocytes. Its expression is upregulated in response to hypoxia (Detmar et al., 1997), activation of epidermal growth factor (EGF) receptor signalling via EGF or transforming growth factor (TGF)-α, and to a number of cytokines including TGF-β, fibroblast growth factor-7 and others (Detmar et al., 1994; Frank et al., 1995). Interestingly, it was shown that heterozygous deletion of the VEGFA 3’-untranslated region (3’-UTR) in mice leads to a two- to three-fold increase in VEGFA levels and embryonic lethality following cardiac failure, thus suggesting the presence of important regulatory elements in its downstream untranslated region (Miquerol et al., 2000). Indeed, VEGFA expression appears to be extensively regulated at the post-transcriptional level, both by RNA-binding proteins and miRNAs (see 2.1.3.2).

Pumilio/Fem-3-binding factor (PUF) proteins belong to a family of regulatory RNA-binding proteins that is conserved among all eukaryotes (reviewed in Spassov and Jurecic, 2002; Wickens et al., 2002; Spassov and Jurecic, 2003). PUF proteins play important roles in a large number of processes, including differentiation and stem cell maintenance, as well as embryonic, germ cell and neural development. For instance, the PUF protein Pumilio, together with the RBPs Nanos and Brat, mediates proper segmentation of fly embryos by inhibiting the translation of hunchback mRNA (Wreden et al., 1997; Sonoda and Wharton, 1999). In worms, the PUF protein PUF9 is required for the miRNA let-7-mediated repression of hbl-1 transcripts, thus regulating the differentiation of epidermal stem cells during larval-to-adult transition (Nolde et al., 2007). The murine PUF protein Pum2 has been shown to localize to neuronal dendrites (Vessey et al., 2006), where it regulates their morphogenesis, as well as synaptic function, at least partly by repressing the translation of eIF4E (Vessey et al., 2010).

The defining characteristic of PUF family members is the presence of a C-terminal RNA-binding domain, referred to as the Pumilio homology domain (Pum-HD). In canonical PUF proteins, it consists of eight imperfect tandem repeats, each of which makes contact with a specific nucleotide in its target cis-regulatory element (Wang et al., 2001, 2002), the conserved Pumilio (or Pum) recognition element (PRE). Many PUF proteins further contain a prion-like region that is rich in asparagine and glutamine and may explain their tendency to aggregate and re-localize to stress granules upon their formation (Vessey et al., 2006; Morris et al., 2008; Salazar et al., 2010; Vessey et al., 2010). Due to the unusually high sequence specificity of the Pum-HD, it has been used as a scaffold for designing RNA-binding domains with engineered sequence specificity (Cheong and Hall, 2006; Dong et al., 2011; Filipovska et al., 2011; reviewed in Lu et al., 2009; Filipovska and Rackham, 2011).

Furthermore, the conserved binding motif has been useful as an internal control for ribonomics approaches (see 2.3.1), which is – next to their important biological functions – probably another reason why systematic target analyses have been reported for PUF proteins in humans (Galgano et al., 2008; Hafner et al., 2010b), flies (Gerber et al., 2006), worms (Kershner and Kimble, 2010), trypanosomes (Archer et al., 2009), and yeast (Gerber et al., 2004). The aforementioned characteristics in general and these studies in particular established the PUF family as a prototypical example of regulatory RNA-binding proteins, and they had a considerable impact on the proposition of the RNA regulon theory (see 2.1.4), as it was shown that PUF proteins often bind and presumably regulate functionally and cytotopically related clusters of targets.

In many respects, PUF-mediated post-transcriptional gene regulation (PTGR) strongly resembles the regulation exerted by miRNAs: The conserved AU-rich consensus motif resembles in length and sequence composition the seed sequences of miRNAs (see 2.1.3.2); both PUF proteins and miRNAs preferentially bind in the 3’-UTRs of regulated targets, with a positional bias towards their distal (Pum motifs; Piqué et al., 2008; Kanitz & Gerber, unpublished data), or proximal and distal ends (miRNAs; Gaidatzis et al., 2007); they generally have a repressive effect on target gene expression, with few exceptions (reviewed in Djuranovic et al., 2011; Quenault et al., 2011; Vasudevan, 2011); finally, Caenorhabditis elegans contains an unusually high number of different Argonaute (27; Yigit et al., 2006) and PUF proteins (12; Zhang et al., 2011).

There are four human PUF family members, although only two, Pum1 and Pum2, contain the typical Pum-HD with eight tandem repeats, whereas the other two, KIAA0020/Puf-A and C14orf21, only contain six repeats. Not much is known about the latter two, but Puf-A has previously been identified as a minor histocompatibility antigen (Brickner et al., 2001), and it has recently been shown to re-localize from nucleoli to the nucleoplasm under genotoxic stress, where it then interacts with and modulates the cleavage of PARP-1, a protein involved in DNA damage repair (Chang et al., 2011). No reports exist on C14orf21 function.

The canonical human PUF proteins, Pum1 and Pum2, differ mainly in the presence of an N-terminal stretch of 128 amino acids that is present in Pum1 (molecular weight = 127 kilodalton; kDa), but not Pum2 (molecular weight = 114 kDa; Spassov and Jurecic, 2002), and otherwise share a high degree of similarity, particularly between their homology domains (91% identity, 97% similarity; Spassov and Jurecic, 2002). The consensus sequence for the canonical murine and human PUF proteins is UGUAnAUA (in which n is any nucleotide), as determined by SELEX (White et al., 2001) and ribonomics analyses (Galgano et al., 2008; Hafner et al., 2010b), respectively. Considering the absence of reports demonstrating functional differences between the two proteins and the high overlap between their mRNA targets (Galgano et al., 2008), it is possible that they may act redundantly. Pum1 and Pum2 are ubiquitously expressed, and their levels may oscillate during the cell cycle (Kedde et al., 2010; Kanitz & Gerber, unpublished data). Expression levels of the proteins in cancers have not been studied extensively; Pum1 levels appear to be stable in breast cancers (Szabo et al., 2004; Lyng et al., 2008), but unstable in ovarian cancers (Li et al., 2009). Activity of the Pum proteins may be further modulated by alternative splicing and post-translational modifications, as several isoforms and phosphorylation sites have been found for both proteins. The latter is supported by the recent finding that the Pum1-mediated regulation of CDKN1B/p27 is phosphorylation-dependent (Kedde et al., 2010). However, an in-depth analysis of the role of post-transcriptional and post-translational mechanisms on the regulation of the Pum proteins, or PUF proteins in general, is currently not available.
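
The consensus sequence UGUAnAUA quoted above can be expressed as a simple pattern and used to scan a transcript for candidate PREs. The following Python snippet is a minimal sketch of such a scan, not the procedure used in this work; the function name `find_pres` and the toy sequence are illustrative assumptions, and a match only indicates a putative site, not demonstrated binding.

```python
import re

# Canonical Pum consensus motif from the text: UGUAnAUA (n = any nucleotide).
# The lookahead allows overlapping candidate sites to be reported as well.
PRE_PATTERN = re.compile(r"(?=(UGUA[ACGU]AUA))")

def find_pres(sequence):
    """Return (0-based start, matched motif) for each candidate PRE.

    Accepts RNA or DNA input; T is converted to U before matching.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in PRE_PATTERN.finditer(rna)]

# Toy sequence for illustration only (not taken from the VEGFA 3'-UTR).
toy_utr = "ccaUGUAAAUAggcTGTATATAuu"
print(find_pres(toy_utr))
```

For the toy sequence, the scan reports two candidate sites, at positions 3 and 14. Reported positions are 0-based offsets into the input sequence.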

Here we set out to identify an example of a message that is subject to combinatorial or cooperative control by microRNAs (miRNAs; 2.1.3.2) and the human Pumilio/Fem-3-binding factor (PUF) proteins Pum1 and Pum2, based on our observations that predicted miRNA recognition elements (MREs) are often found in the immediate vicinity of Pum recognition elements (PREs) (Galgano et al., 2008). We chose to focus our studies on human VEGFA, due to the extensive role of post-transcriptional mechanisms in the regulation of its expression and the presence of multiple putative recognition elements for both classes of regulators in its 3’-UTR. We were able to confirm a repressive effect for both Pum proteins and a promising miRNA candidate, microRNA 361-5p (miR-361-5p). While we were not able to study in detail the nature of the potential crosstalk between the regulators, or the significance of VEGFA regulation in vivo, we could show that all regulators are present at reduced levels in SCC samples expressing elevated levels of VEGFA.

4.2 Results

4.2.1 The VEGFA 3’-untranslated region contains canonical Pum consensus motifs

The almost 2 kb long sequence of the human VEGFA 3'-UTR (Figure 4.1), the vast majority (>95%) of which is present in all of its known isoforms, contains two regions that are highly conserved among vertebrates, one at its 5'- and the other at its 3'-end. Both regions are lower in GC content compared to the weakly conserved region separating them (GC% approximately 44, 58 and 28 from 5' to 3'), and they contain all unambiguously identified recognition elements for RBPs and miRNAs (Levy et al., 1998; Shih and Claffey, 1999; Goldberg-Cohen et al., 2002; Hua et al., 2006; Ray and Fox, 2007; Ye et al., 2008; Karaa et al., 2009; Liu et al., 2009; Wu et al., 2009; Ray et al., 2009; Jafarifar et al., 2011; McArthur et al., 2011).

We had previously shown that the RNA-binding domains of Pum1 and Pum2 (‘Pum homology domain’; 91% identity and 97% similarity between Pum1 and Pum2) are able to bind the VEGFA 3’-UTR in vitro, although RIP-Chip experiments (see 2.3.1) revealed no significant enrichment of the VEGFA message in either Pum1 or Pum2 ‘pulldowns’ of HeLa cell lysates compared to a control (Galgano et al., 2008). The latter was corroborated by a similar study in HeLa cells conducted by Morris et al. (2008). However, upon closer inspection of the VEGFA 3’-UTR, we found three canonical Pum consensus motifs (UGUAnAUA; Figure 4.1) in its downstream conserved region. The most upstream of these putative PREs, PRE1, was present in the transcript subjected to the in vitro binding assays, and could thus explain the observed binding. Recently, a PAR-CLIP-based study (see 2.3.1) in HEK293 cells revealed an association of Pum2 with the downstream conserved region of the VEGFA 3’-UTR at two locations surrounding, but not overlapping, PRE2 and PRE3 (Hafner et al., 2010b). While it appears that the elements PRE1 and PRE2 are mainly conserved among primates (Figure 4.2 A) and humans (Figure 4.2 B), respectively, PRE3 is highly conserved across mammals and marsupials (Figure 4.2 C). Based on these data and observations, we hypothesized that VEGFA might be a target of Pum1 and/or Pum2 and that the lack of a strong association of the VEGFA message in the Pum1/2 RIP-Chip experiments may result from low expression levels of VEGFA in HeLa cells, other trans-acting factors inhibiting access to the binding sites, or technical limitations.
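The motif search described above can be illustrated with a short script. This is only a minimal sketch, not the procedure used in this work: the function name `find_pres` and the toy sequence are made up for illustration, and the only element taken from the text is the consensus UGUAnAUA. DNA-style input (T instead of U) is converted before matching, and a lookahead is used so that overlapping sites would also be reported.

```python
import re

# Canonical Pum consensus from the text: UGUAnAUA (n = any nucleotide).
# The lookahead makes the search report overlapping matches as well.
PRE_PATTERN = re.compile(r"(?=(UGUA[ACGU]AUA))")


def find_pres(sequence):
    """Return (1-based start position, matched motif) for each PRE-like site."""
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA input
    return [(m.start() + 1, m.group(1)) for m in PRE_PATTERN.finditer(rna)]


if __name__ == "__main__":
    toy_utr = "ccauguaaauagggTGTATATAcc"  # made-up sequence, not the VEGFA 3'-UTR
    for start, motif in find_pres(toy_utr):
        print(start, motif)
```

Positions are reported 1-based so that they can be compared directly with coordinates given relative to the start of a 3’-UTR.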


Figure 4.1 Overview of the human VEGFA 3’-UTR. Schematic representation of the genomic locus encoding the 3’-UTR of human VEGFA. The terminal exon of the VEGFA gene, the 3’-UTR fragment common to all isoforms, PhyloP and PhastCons conservation scores, GC content, and the 3’-UTR region subcloned behind a luciferase reporter for use in this study are indicated. Recognition elements of RBPs known to regulate VEGFA expression are depicted. Sequence tag densities of PAR-CLIP experiments for AGO1-4, IGF2BP1-3, PUM2 and TNRC6A/C (Hafner et al. , 2010b) are given in gray shades (high density = black, no tags = white). The in vitro transcript used in Pum binding assays (Galgano et al. , 2008) as well as the putative PREs (red) are indicated. Where available, putative or confirmed seeds and recognition elements for human miRNAs and RBPs regulating VEGFA expression (Levy et al. , 1998; Shih and Claffey, 1999; Goldberg-Cohen et al. , 2002; Hua et al. , 2006; Ray and Fox, 2007; Ye et al. , 2008; Karaa et al. , 2009; Liu et al. , 2009; Wu et al. , 2009; Ray et al. , 2009; Jafarifar et al. , 2011; McArthur et al. , 2011) are highlighted, together with the putative miR-361-5p seed (red). The 3’-UTR fragment used for luciferase reporter assays is depicted. Adapted from UCSC genome browser (Fujita et al. , 2011) .


Figure 4.2 Conservation of putative Pum recognition elements. Schematic representation of the three putative PREs in the VEGFA 3’-UTR, PRE1 (A), PRE2 (B), and PRE3 (C). The PhyloP basewise conservation score and sequence alignments for various vertebrates are indicated for each PRE. In the latter, dark blue letters denote nucleotides that are identical with those found in human at the particular positions, while light blue letters indicate those nucleotides that deviate from the human sequence. Insertions in the aligned species are denoted by orange vertical lines; the orange number corresponds to the number of inserted nucleotides in the aligned species with the longest insertion. Light blue double horizontal lines indicate bases that are unalignable. When no alignment is available, spaces are empty. Adapted from UCSC genome browser (Fujita et al., 2011).


4.2.2 VEGFA is a putative target of microRNA 361-5p

Due to the significant enrichment of predicted MREs in close proximity to canonical PREs of experimentally confirmed Pum1/2 targets, we have previously proposed that there may be extensive interplay between Pum- and miRNA-mediated regulation (Galgano et al., 2008). Recently, the finding that Pum1 modulates miR-221/222-mediated regulation of the tumor suppressor CDKN1B/p27 (Kedde et al., 2010) has substantiated this hypothesis. Most miRNA recognition elements (MREs) that have been unambiguously shown to be able to affect human (Figure 4.1) or murine (not shown) VEGFA expression are located in the 5'-conserved region (Jafarifar et al., 2011; Karaa et al., 2009; Lei et al., 2009; Wu et al., 2009; Long et al., 2010); only miR-126 and miR-200b have been demonstrated to bind in the ~730 nucleotide downstream conserved region (Liu et al., 2009; McArthur et al., 2011), albeit relatively far away from the putative PREs (~80-380 nt).

In order to find candidate miRNAs that may ‘crosstalk’ with Pum-mediated regulation of VEGFA expression, we employed five miRNA target prediction services to search for predicted MREs in the VEGFA 3’-UTR: microRNA.org (Betel et al., 2010), TargetScan (Friedman et al., 2009), DIANA-microT (Maragkakis et al., 2009), miRDB (Wang, 2008), and MicroCosm (Griffiths-Jones et al., 2008). The resulting list (see 7.4) contains hundreds of potential binding sites for a large set of miRNAs, covering the entire length of the VEGFA 3’-UTR. Importantly, the density of predicted MREs is considerably higher in the conserved regions of the 3’-UTR (2-3-fold higher compared to the weakly conserved region), particularly in the downstream conserved region (Table 4.1). This is consistent with the observation that the density of predicted MREs increases towards both ends of a transcript's 3'-UTR (Gaidatzis et al., 2007).


Table 4.1 Density of predicted microRNA recognition elements in the VEGFA 3’-untranslated region. Densities of predicted MREs in the whole 3’-UTR as well as the 3’-UTR regions defined in the main text are given. Target predictions were from the following web services: microRNA.org (MR; Betel et al. , 2010), TargetScan (TS; Friedman et al. , 2009), DIANA-microT (µT; Maragkakis et al. , 2009), miRDB (DB; Wang, 2008), and MicroCosm (MC; Griffiths-Jones et al. , 2008). See 7.4 for a full list of miRNAs predicted to target the VEGFA 3’-UTR.

Region                         Length (nt)   MREs   Density (MREs/nt)
Conserved region 1             519           150    0.29
Non-conserved region           673           95     0.14
Conserved region 2             733           297    0.41
Whole 3'-UTR (NM_001025366)    1925          545    0.28
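The densities in Table 4.1 follow directly from the MRE counts and region lengths. As a rough check, the snippet below recomputes them and the fold difference relative to the weakly conserved region; the variable and function names are illustrative only, and the numbers are copied from the table.

```python
# Region name -> (length in nt, number of predicted MREs), copied from Table 4.1.
REGIONS = {
    "Conserved region 1": (519, 150),
    "Non-conserved region": (673, 95),
    "Conserved region 2": (733, 297),
}


def mre_density(length_nt, n_mres):
    """Predicted MREs per nucleotide."""
    return n_mres / length_nt


densities = {name: mre_density(*values) for name, values in REGIONS.items()}

# Fold difference of each region relative to the weakly conserved middle region.
baseline = densities["Non-conserved region"]
fold_vs_nonconserved = {name: d / baseline for name, d in densities.items()}

if __name__ == "__main__":
    for name in REGIONS:
        print(f"{name}: {densities[name]:.2f} MREs/nt, "
              f"{fold_vs_nonconserved[name]:.1f}-fold vs. non-conserved")
```

Both conserved regions come out between two- and three-fold denser than the weakly conserved region, in line with the statement in the main text.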

Consistent with our bioinformatics analysis (Galgano et al., 2008) and in order to narrow down the list of miRNAs that may co-regulate VEGFA expression together with Pum1/2, we have considered only those predicted MREs whose seed sequences fall within 50 nt of any of the putative PREs, and which have been predicted by at least two algorithms. This analysis revealed 13 potential miRNA:MRE pairs (Table 4.2). Out of these, miR-361-5p emerged as the most likely candidate, as its putative MRE was the only one predicted by all five algorithms. Furthermore, the MRE is situated within 50 nt of not one, but two PREs, including the highly conserved PRE3. Finally, transfection of a miR-361-5p mimic in hypoxia-induced CNE cells has already been shown to reduce VEGFA protein levels, as determined by enzyme-linked immunosorbent assay (ELISA; Ye et al., 2008) – supporting the idea that this miRNA may regulate VEGFA expression. RNAhybrid (Rehmsmeier et al., 2004) calculated a minimum free energy of -22.0 kcal/mol for the interaction between miR-361-5p and the MRE located between nucleotides 1604 and 1625 of the VEGFA 3’-UTR in NM_001025366 (Figure 4.3 A), which is in the range of other miRNAs that have been shown to be able to regulate VEGFA expression in vitro (Hua et al., 2006; Ye et al., 2008).
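The two selection criteria (seed within 50 nt of a PRE, predicted by at least two algorithms) can be expressed as a simple filter. The sketch below is an illustration under stated assumptions rather than the script used here: the `Site` class, the function names and all coordinates are hypothetical, and "within 50 nt" is interpreted as at most 50 nucleotides separating the two intervals, which the text does not define precisely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """A named interval on the 3'-UTR (1-based, inclusive coordinates)."""
    name: str
    start: int
    end: int


def gap(a, b):
    """Number of nucleotides separating two sites (0 if they touch or overlap)."""
    return max(0, max(a.start, b.start) - min(a.end, b.end) - 1)


def select_candidates(predictions, pres, max_gap=50, min_algorithms=2):
    """Keep seeds lying near at least one PRE and predicted by enough algorithms.

    predictions maps each seed Site to the set of algorithms predicting it.
    """
    selected = []
    for seed, algorithms in predictions.items():
        nearby = [pre.name for pre in pres if gap(seed, pre) <= max_gap]
        if nearby and len(algorithms) >= min_algorithms:
            selected.append((seed.name, nearby, len(algorithms)))
    return selected


if __name__ == "__main__":
    # Hypothetical coordinates and predictions, for illustration only.
    toy_pres = [Site("PRE-a", 100, 107), Site("PRE-b", 200, 207)]
    toy_predictions = {
        Site("miR-x", 150, 157): {"MR", "TS", "DB"},  # near both PREs, 3 algorithms
        Site("miR-y", 400, 407): {"MR", "TS"},        # too far from any PRE
        Site("miR-z", 120, 127): {"TS"},              # near PRE-a, but 1 algorithm
    }
    print(select_candidates(toy_predictions, toy_pres))
```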

The corresponding MIR361 gene is encoded on the X chromosome, between exons 9 and 10 of CHM/choroideremia (Rab escort protein 1), and gives rise to two mature miRNA species, miR-361-3p and the predominant miR-361-5p (Figure 4.3 B). The locus is highly conserved among placental mammals, particularly the stem region of the putative precursor miRNA. The mature form, miR-361-5p, was first isolated from pancreatic islets by Poy and colleagues (Poy et al., 2004), and subsequently from neuroblastoma cell lines (Afanasyeva et al., 2008). No targets for miR-361-5p have been experimentally confirmed so far.

Table 4.2 Predicted microRNA recognition elements in the vicinity of the putative Pum recognition elements. Predicted miRNA seed sequences that fall within 50 nt of any of the three putative Pum recognition elements in the VEGFA 3’-UTR and that are predicted by at least two algorithms are listed. Seed predictions were from the following web services: microRNA.org (MR; Betel et al., 2010), TargetScan (TS; Friedman et al., 2009), DIANA-microT (µT; Maragkakis et al., 2009), miRDB (DB; Wang, 2008), and MicroCosm (MC; Griffiths-Jones et al., 2008). For each miRNA seed, the start and end positions of the seed relative to the start site of the VEGFA 3’-UTR (based on GenBank RefSeq entry NM_001025366.2), the nearby PRE or PREs, the particular algorithms and the total number of algorithms predicting the MRE, as well as the total number of algorithms for which target predictions for the corresponding miRNA were available in the accessed information, are indicated. See 7.4 for a full list of miRNAs predicted to target the VEGFA 3’-UTR.

MicroRNA          Start seed   End seed   PRE      MRE predicted by (MR TS µT DB MC)   Count
hsa-miR-548p      1345         1352       PRE1     yes yes n/d yes n/d                 3 / 3
hsa-miR-548d-3p   1376         1383       PRE1     yes yes yes                         3 / 5
hsa-miR-300       1380         1387       PRE1     yes yes                             2 / 5
hsa-miR-381       1380         1387       PRE1     yes yes                             2 / 5
hsa-miR-590-3p    1388         1395       PRE1     yes yes                             2 / 5
hsa-miR-494       1396         1403       PRE1     yes yes                             2 / 5
hsa-miR-185       1410         1417       PRE1     yes yes yes                         3 / 5
hsa-miR-300       1568         1575       PRE2/3   yes yes                             2 / 5
hsa-miR-381       1568         1575       PRE2/3   yes yes                             2 / 5
hsa-miR-329       1576         1583       PRE2/3   yes yes                             2 / 5
hsa-miR-362-3p    1576         1583       PRE2/3   yes yes                             2 / 5
hsa-miR-603       1576         1583       PRE2/3   yes yes                             2 / 5
hsa-miR-361-5p    1618         1625       PRE2/3   yes yes yes yes yes                 5 / 5

4.2.3 MicroRNA 361-5p and Pum1/2 may target other angiogenesis-related transcripts

It has been proposed that RBPs and miRNAs often act as master regulators of PTGR (reviewed in Keene, 2007; Mansfield and Keene, 2009; Kanitz and Gerber, 2010). We therefore wondered whether miR-361-5p and Pum1/2 may have additional targets in angiogenesis or related processes. By combining target predictions from the aforementioned five miRNA target prediction services, we compiled a list of potential miR-361-5p targets (see 7.5). Experimentally verified Pum1 and Pum2 targets were previously published (Morris et al., 2008; Galgano et al., 2008; Hafner et al., 2010b). Putative targets were then subjected to Gene Ontology (GO) term annotation using PANTHER (Thomas et al., 2006).

Figure 4.3 MicroRNA 361-5p. (A) Secondary structure of the hybrid between miR-361-5p and the putative MRE within the VEGFA 3’-UTR, as predicted by RNAhybrid (Rehmsmeier et al., 2004). The calculated free energy of the interaction is indicated. Note that the putative PRE3 overlaps with the predicted MRE. (B) The genomic locus encoding hsa-mir-361. The two mature strands, miR-361-5p and -3p, are highlighted, and PhyloP and PhastCons conservation scores and alignments with various mammals are indicated. Note that the locus of hsa-mir-361 lies within an intron of the CHM gene, encoding Rab escort protein 1. Single horizontal lines indicate deletions relative to the human version, while double horizontal lines indicate bases that are unalignable. Adapted from UCSC genome browser (Fujita et al., 2011).

Intriguingly, out of the 69 pathways that were significantly (P < 0.05) enriched among the putative targets of either of the post-transcriptional regulators, 21 were enriched among all three of them (Table 4.3; see 7.6 for the full list). Strikingly, among these are both angiogenesis (P = 1.6 x 10⁻³, 1.0 x 10⁻⁷, and 1.2 x 10⁻⁸, for miR-361-5p, Pum1 and Pum2 targets, respectively) and the VEGF pathway (P = 8.4 x 10⁻³, 1.6 x 10⁻², and 2.8 x 10⁻³, for miR-361-5p, Pum1 and Pum2 targets, respectively). A number of other commonly enriched pathways can be linked to angiogenesis or, more specifically, VEGFA function as well, such as EGF, FGF, PI3K, TGF-β and inflammation-related pathways.
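Identifying the pathways that are enriched for all three regulators amounts to thresholding each list at P < 0.05 and intersecting the results. The following sketch shows this step on a small excerpt; the function names are illustrative, the P values for angiogenesis and VEGF signaling are taken from Table 4.3, and the third pathway with its P values is invented to show an entry that fails the common-enrichment criterion.

```python
ALPHA = 0.05  # significance threshold used in the text

# Pathway -> P value, per regulator. "Hypothetical pathway X" is invented.
P_VALUES = {
    "miR-361-5p": {"Angiogenesis": 1.6e-3, "VEGF signaling pathway": 8.5e-3,
                   "Hypothetical pathway X": 0.20},
    "Pum1": {"Angiogenesis": 1.0e-7, "VEGF signaling pathway": 1.6e-2,
             "Hypothetical pathway X": 0.01},
    "Pum2": {"Angiogenesis": 1.2e-8, "VEGF signaling pathway": 2.8e-3,
             "Hypothetical pathway X": 0.03},
}


def enriched(pathway_p, alpha=ALPHA):
    """Pathways whose P value falls below the threshold."""
    return {name for name, p in pathway_p.items() if p < alpha}


def commonly_enriched(p_by_regulator, alpha=ALPHA):
    """Pathways significantly enriched for every regulator."""
    return set.intersection(*(enriched(p, alpha) for p in p_by_regulator.values()))


if __name__ == "__main__":
    print(sorted(commonly_enriched(P_VALUES)))
```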

Table 4.3 Gene set enrichment analysis of microRNA 361-5p, Pum1 and Pum2 targets. MicroRNA target predictions were from the following web services: microRNA.org (Betel et al. , 2010), TargetScan (Friedman et al. , 2009), DIANA-microT (Maragkakis et al. , 2009), miRDB (Wang, 2008), and MicroCosm (Griffiths-Jones et al. , 2008). Experimentally verified Pum1 and Pum2 targets were previously published (Morris et al. , 2008; Galgano et al. , 2008; Hafner et al. , 2010b). Results were pooled and converted to gene identifiers using the DAVID web service (Huang et al. , 2008). Putative targets were compared to a human reference gene list and analyzed for pathway enrichment using PANTHER (Thomas et al. , 2006). Pathways commonly and significantly enriched among miR-361-5p, Pum1, and Pum2 targets are listed together with their P values. Angiogenesis and the VEGF pathway are highlighted. See 7.6 for a full list of significantly enriched pathways.

Pathway                                                             miR-361-5p    Pum1          Pum2
PDGF signaling pathway                                              7.7 x 10⁻⁷    7.0 x 10⁻⁷    1.6 x 10⁻¹²
T cell activation                                                   2.2 x 10⁻⁶    8.9 x 10⁻⁶    6.1 x 10⁻⁵
EGF receptor signaling pathway                                      7.2 x 10⁻⁶    5.6 x 10⁻⁷    5.7 x 10⁻¹⁰
p53 pathway                                                         4.3 x 10⁻⁵    1.6 x 10⁻⁷    4.9 x 10⁻¹⁴
B cell activation                                                   4.9 x 10⁻⁵    3.6 x 10⁻⁵    4.8 x 10⁻³
Apoptosis signaling pathway                                         4.3 x 10⁻⁴    3.6 x 10⁻⁴    1.4 x 10⁻⁶
Ras Pathway                                                         8.2 x 10⁻⁴    1.3 x 10⁻⁷    1.7 x 10⁻⁷
Angiogenesis                                                        1.6 x 10⁻³    1.0 x 10⁻⁷    1.2 x 10⁻⁸
Wnt signaling pathway                                               3.2 x 10⁻³    4.3 x 10⁻⁴    3.2 x 10⁻⁸
Inflammation mediated by chemokine and cytokine signaling pathway   4.9 x 10⁻³    1.0 x 10⁻³    2.1 x 10⁻²
Insulin/IGF pathway-MAPKK/MAPK cascade                              6.2 x 10⁻³    1.9 x 10⁻³    3.7 x 10⁻⁵
VEGF signaling pathway                                              8.5 x 10⁻³    1.6 x 10⁻²    2.8 x 10⁻³
Alzheimer disease-amyloid secretase pathway                         9.7 x 10⁻³    2.5 x 10⁻²    1.3 x 10⁻³
p53 pathway feedback loops 2                                        1.1 x 10⁻²    2.2 x 10⁻²    1.2 x 10⁻⁶
Oxidative stress response                                           1.7 x 10⁻²    1.0 x 10⁻³    5.3 x 10⁻³
Integrin signalling pathway                                         1.9 x 10⁻²    5.7 x 10⁻⁵    1.0 x 10⁻³
PI3 kinase pathway                                                  2.1 x 10⁻²    3.7 x 10⁻⁴    2.2 x 10⁻⁷
Parkinson disease                                                   2.4 x 10⁻²    4.3 x 10⁻⁴    4.0 x 10⁻⁵
TGF-beta signaling pathway                                          2.5 x 10⁻²    2.6 x 10⁻⁴    9.0 x 10⁻⁷
FGF signaling pathway                                               2.6 x 10⁻⁵    9.6 x 10⁻⁴    5.6 x 10⁻⁹
Interferon-gamma signaling pathway                                  2.9 x 10⁻²    5.4 x 10⁻⁴    6.2 x 10⁻³

We have also analyzed the manually curated VEGF pathway map at KEGG (Kanehisa et al., 2010) for the list of predicted targets. Among them we found a number of key players, namely PKC, Rac, PI3K, NFAT and cPLA2, which are all predicted to be targets of miR-361-5p by more than one algorithm (Figure 4.4 A), and were further found to be associated with at least one of the Pum proteins (Figure 4.4 B). Even though the prediction algorithms, GO term annotations and transcriptome-wide target identification approaches may be imprecise at the level of individual genes, the strong enrichment of related pathways among the compiled lists suggests that miR-361-5p and Pum1/2 may indeed have additional – and possibly common – roles in regulating angiogenesis and other VEGF(A)-related functions, both upstream and downstream of the VEGFA/VEGF receptor axis.

4.2.4 Generation and characterization of stable Pum1/2 overexpression cell lines

To study the influence of Pum1 or Pum2 on VEGFA expression, we generated HEK293-derived cell lines stably expressing HA-StrepIII-tandem tagged versions of either the Pum proteins or a control protein (enhanced green fluorescent protein; eGFP) from a single-copy genomic locus upon induction with tetracycline (Figure 4.5). Immunoblot analysis with an anti-HA antibody confirmed that the resulting cell lines express the introduced coding sequences from the human cytomegalovirus hybrid promoters in a tetracycline-dependent manner (Figure 4.5 A). Analysis of Flp-In-293-eGFP cells by fluorescence microscopy further indicated that the recombinant eGFP is functional (data not shown). By determining fluorescence levels in cells treated with different tetracycline concentrations using flow cytometric analysis, we were thus able to determine an ‘effective range’ of tetracycline concentrations for the Flp-In-293-derived cell lines: continuous increases in fluorescence intensities were recorded for tetracycline concentrations ranging from approximately 0.01 to 1 µg/mL (Figure 4.5 B). As established elsewhere (see 5.2.7), ‘leakiness’ of the promoter (i.e. basal expression in the absence of inducer) amounts to approximately 10% or less of maximum levels. For Pum1 mRNA levels in Flp-In-293-Pum1 cell lines treated with different tetracycline concentrations in the effective range, a similar dose response was observed (Figure 4.5 C). As the qRT-PCR assay used does not differentiate between endogenous and ectopic Pum1 mRNA, it can be concluded that induction with 1.0 µg/mL tetracycline raises total Pum1 mRNA levels considerably (approximately 4-fold). Finally, we wondered whether recombinant Pum1 co-localizes with endogenous Pum1.


Figure 4.4 Pathway analysis of predicted microRNA 361-5p targets. A manually curated representation of the VEGF signaling pathway available at KEGG (Kanehisa et al. , 2010) was color-coded according to (A) the number of algorithms that predict an individual gene to be targeted by miR-361-5p, or (B) whether the corresponding mRNAs were found to be associated with either Pum1 (blue), Pum2 (yellow), or both (green). (A) Target predictions were from the following web services: microRNA.org (Betel et al. , 2010), TargetScan (Friedman et al. , 2009), DIANA-microT (Maragkakis et al. , 2009), miRDB (Wang, 2008), and MicroCosm (Griffiths-Jones et al. , 2008). Results were pooled and converted to uniform gene identifiers using the DAVID web service (Huang et al. , 2008). (B) Data are from Morris et al. (2008; Pum1 only), Galgano et al. (2008), and Hafner et al. (2010b; Pum2 only).


Figure 4.5 Characterization of Flp-In cell lines. (A) Immunoblot analysis of Flp-In-293-eGFP, -Pum1 and -Pum2 cell lines. Lysates of tetracycline- (+) or ethanol-treated (EtOH; -) Flp-In-293-eGFP, -Pum1 or -Pum2 cells were subjected to immunoblot analysis with an anti-HA antibody (clone HA-7). (B) Flow cytometry analysis of Flp-In-293-eGFP cells treated with different concentrations of tetracycline. Geometric mean fluorescence levels are plotted against the tetracycline concentration. (C) Pum1 transcript levels in Flp-In-293-Pum1 cells treated with different concentrations of tetracycline were assayed by qRT-PCR analysis. Fold changes ± S.D. in expression levels with respect to vehicle-treated (0 µg/mL) cells are plotted. ACTB was used as a reference. Experiments were performed in triplicate. (D) Flp-In-293-Pum1 cells were treated with tetracycline (1 µg/mL; 24 h) and sodium arsenite (0.5 mM; 45 min). Subsequently, cells were stained with Hoechst 33342 dye (blue) and Pum1 (A300-201A; green) and HA (clone HA-7; red) antibodies and analyzed by fluorescence microscopy. Data in (B and C) are from a single experiment.


Immunocytochemical analysis with anti-HA and anti-Pum1 antibodies to detect recombinant and endogenous Pum1 revealed that both proteins are present ubiquitously in the cytoplasm under normal conditions, resulting in diffuse cytoplasmic staining (data not shown), thus making it difficult to assess whether ectopic Pum1 is localized correctly. However, it has been reported that Pum1 localizes to stress granules under oxidative stress conditions, which appear as clearly visible cytoplasmic foci when analyzed by microscopy (Morris et al., 2008). In order to induce stress granule formation, we therefore treated Flp-In-293-Pum1 cells with sodium arsenite in addition to tetracycline. Immunocytochemistry confirmed that under oxidative stress conditions both proteins co-localize to distinct foci in the cytoplasm (Figure 4.5 D). Taken together, these data indicate that the generated cell lines produce the recombinant HA-StrepIII-tagged proteins upon treatment with inducer in a dose-responsive manner and that, at least in the case of eGFP and Pum1, the proteins are likely folded properly.

4.2.5 Transfection of small RNAs

In order to study the effects of altered miRNA levels in cellular systems, it is paramount to establish efficacious transfection methods for small RNAs. We therefore assessed the ability of HEK293, A431 and HaCaT cells to take up miRNA mimics and antisense inhibitors by transfecting them with increasing amounts of Cy3-labeled Pre- and Anti-miR control constructs, respectively, followed by flow cytometric analysis (Figure 4.6 A to C). The resulting data revealed that Cy3-labeled constructs were taken up by more than 80% of cells in all cell lines across the whole range of concentrations (Figure 4.6 B), with fluorescence intensities increasing in a dose-dependent manner (Figure 4.6 C). Differences in the fractions of transfected cells were small between cell lines and constructs (< 20%). By transfecting the cell lines used with Pre- and Anti-miR constructs, we should therefore be able to alter the levels of miRNAs of interest as desired.


Figure 4.6 MicroRNA mimics and antisense inhibitors are readily taken up by HEK293, A431 and HaCaT cells. Cells were transfected with either 0 (mock), 10, 30 or 100 nM of Cy3-labeled Pre-miR or Anti-miR constructs and analyzed by flow cytometry. Experiments were performed in triplicate. (A) For each cell line, dot plots of mock-transfected cells (left panel) indicate the populations subjected to fluorescence analysis. The fluorescence distributions of gated cells are plotted for a single replicate, both for Pre-miR (middle) and Anti-miR-transfected cells (right). The fluorescence thresholds for positive cells are indicated (M1). The efficiencies of transfection are represented as the mean fractions of fluorescent cells (“M1 positive cells”) (B) and the mean geometric means (C) within the gated populations ± S.D.

4.2.6 The putative Pum and microRNA 361-5p recognition elements in the VEGFA 3’-UTR possess regulatory potential

Due to the generally high degree of interconnectivity within post-transcriptional regulatory networks (see 2.2), the extensive post-transcriptional regulation that has already been shown to be exerted on the VEGFA transcript (see 2.4.1), and the high occurrence of predicted MREs in the conserved region surrounding the putative MRE (see 7.4), it is possible that other trans-binding factors might affect the binding potential of miR-361-5p, Pum1 and/or Pum2. Thus we reasoned that it may be beneficial to preserve potential RNA recognition elements in our experiments. We therefore cloned the entire downstream conserved region of the VEGFA 3’-UTR behind the coding sequence of Renilla luciferase under the control of a Simian virus 40 (SV40) promoter, on a plasmid (psiCHECK-2) further encoding a firefly luciferase for normalization purposes (Figure 4.7 A and B). As a positive control, we also generated luciferase reporter constructs bearing the CDKN1B 3’-UTR, which is known to be cooperatively regulated by Pum1 and miR-221/222 (Kedde et al., 2010). Additionally, we generated variants of both reporters in which either the PREs or MREs were mutated. In order to avoid competition between the reporter and endogenous VEGFA, we performed the assays in Flp-In-293 cells, as the parental human embryonic kidney (HEK293) cell line expresses low levels of VEGFA (Liang et al., 2002).

In order to assess the regulatory potential of the total VEGFA 3’-UTR fragment as well as that of the PRE and MRE motifs of interest, we transfected Flp-In-293 cells with the wild type and mutated VEGFA and CDKN1B reporters or unmodified psiCHECK-2 (Figure 4.7 C). While relative Renilla activity was significantly decreased by almost half in cells transfected with the wild type CDKN1B reporter when compared to the psiCHECK-2 control (P = 2.82 x 10⁻⁵; unpaired t-test, two-tailed), no such change was observed for the wild type VEGFA reporter. These data indicate that, in the tested cell line, the combined post-transcriptional regulation exerted on the VEGFA 3’-UTR fragment is roughly neutral, whereas in the CDKN1B 3’-UTR negative regulators (i.e. repressors) appear to prevail.

Figure 4.7 Effect of the CDKN1B and VEGFA 3’-untranslated regions and their putative Pum and microRNA recognition elements on the activity of a luciferase reporter. (A) Schematic representation of the luciferase reporter constructs (psiCHECK-2 vector, Promega). 3’-UTR fragments (yellow) are fused to Renilla luciferase, which is under the control of a simian virus 40 (SV40) promoter. Firefly luciferase, under the control of a herpes simplex thymidine kinase (HSV-TK) promoter, is used for normalization. (B) Overview of the wild type (wt) and mutated CDKN1B and VEGFA 3’-UTR fragments used for the generation of luciferase reporter constructs. miR-221 (blue) and putative Pum (brown) and miR-361-5p (green) recognition elements are highlighted, and the first and last three nucleotides of the fragments are given. Numbers denote the positions of the first residues of each motif relative to the start of the respective 3’-UTRs according to GenBank RefSeq mRNA entries NM_004064.2 (CDKN1B) and NM_001025366.2 (VEGFA). Note that while the miR-361-5p MRE was mutated at three consecutive nucleotides (see 7.1.2 for the sequences of oligonucleotides used for the mutagenesis), only two mutations are present in the seed sequence. (C) Flp-In-293 cells were transfected with either of the indicated luciferase reporters. Mean ratios of Renilla and firefly luciferase activities (relative luciferase units; RLU) were normalized to those of psiCHECK-2-transfected cells. Four independent experiments were performed at least in triplicate. Mean values ± S.E.M. from a representative experiment are plotted. Two-tailed, unpaired t-tests were used to calculate P values (one, two and three asterisks denote P values <0.05, <0.01, and <0.001, respectively).

Mutation of either the PREs or the miR-221/222 MREs in the CDKN1B 3’-UTR led to an increase in relative luciferase activity (1.23- and 1.81-fold higher than the wild type 3’-UTR, respectively; P = 0.0103 and 0.0011; unpaired t-test, two-tailed) in cells transfected with the respective reporters. While the mutation of the PREs in the VEGFA 3’-UTR also resulted in elevated relative luciferase activity (1.24-fold higher than the wild type 3’-UTR; P = 0.0053; unpaired t-test, two-tailed), the opposite was observed for the mutation of the putative miR-361-5p MRE (approximately 1.19-fold lower than the wild type 3’-UTR; P = 0.0025; unpaired t-test, two-tailed). These data indicate that the studied ‘motifs’ in the VEGFA 3’-UTR possess intrinsic regulatory potential and thus represent cis-regulatory elements.
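The normalization underlying these comparisons (Renilla over firefly activity per well, then relative to the psiCHECK-2 control) can be written out as follows. This is a minimal sketch with invented luminometer readings and hypothetical function names; it only illustrates the arithmetic described in the text and the legend of Figure 4.7, not the actual analysis pipeline.

```python
from statistics import mean


def relative_activity(renilla, firefly):
    """Per-well ratio of Renilla (reporter) to firefly (normalizer) activity."""
    return [r / f for r, f in zip(renilla, firefly)]


def normalize_to_control(sample_ratios, control_ratios):
    """Mean sample ratio expressed relative to the mean control ratio."""
    return mean(sample_ratios) / mean(control_ratios)


if __name__ == "__main__":
    # Invented triplicate readings (arbitrary light units), for illustration only.
    control = relative_activity([1000, 1100, 900], [500, 550, 450])
    reporter = relative_activity([600, 660, 540], [500, 550, 450])
    print(f"relative luciferase activity: {normalize_to_control(reporter, control):.2f}")
```

Dividing by the firefly signal from the same plasmid corrects for well-to-well differences in transfection efficiency and cell number before reporters are compared.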

4.2.7 Pum1, Pum2 and microRNA 361-5p repress the expression of VEGFA 3’-UTR reporters

To study the effects of elevated Pum1 or Pum2 levels on the expression of VEGFA, we transfected Flp-In-293-derived cell lines overexpressing either eGFP, Pum1 or Pum2 with the luciferase reporters and compared relative luciferase activities (Figure 4.8). For all CDKN1B and VEGFA reporters, overexpression of either Pum1 or Pum2 resulted in significant drops in Renilla activities of approximately 25% compared to eGFP overexpression (e.g. P = 0.0016 and 0.0033 for the wild type CDKN1B reporter, for Pum1 and Pum2, respectively; P = 2.9 x 10⁻⁶ and 0.0253 for the wild type VEGFA reporter, for Pum1 and Pum2, respectively; unpaired t-test, two-tailed). When disregarding the higher base level activities of the PRE mutants compared to wild type reporters (see 4.2.6), differences between luciferase activities in cells transfected with wild type or PRE mutants were moderate (approximately 6% and 13% increase in the PRE mutants for CDKN1B and VEGFA, respectively; P = 0.2065 and 2.3 x 10⁻⁴; unpaired t-test, two-tailed). Mutation of MREs did not result in significant changes compared to the wild type reporters for either gene.


Figure 4.8 Effect of elevated Pum1, Pum2 and microRNA 221 or 361-5p levels on the activity of VEGFA and CDKN1B 3’-UTR luciferase reporters. (A) The indicated Flp-In-293-derived cell lines were transfected with either of the indicated luciferase reporters. (B) Flp-In-293 cells were co-transfected with the indicated luciferase reporters and Pre-miR microRNA mimics. Mean ratios of Renilla and firefly luciferase activities (relative luciferase units; RLU) were normalized to those of Flp-In-293-eGFP (A) or Pre-miR-control-transfected cells (B). Two independent experiments were performed in triplicate. Mean values ± S.E.M. from one experiment are plotted. Two-tailed, unpaired t-tests were used to calculate P values (one, two and three asterisks denote P values <0.05, <0.01, and <0.001, respectively).


While the effects of overexpressing Pum2 were generally lower (between 1% and 11%) when compared to Pum1 overexpression, differences only reached significance in the VEGFA PRE and MRE mutant reporters (P = 2.2 x 10⁻⁴ and 1.9 x 10⁻⁴; unpaired t-test, two-tailed). The results indicate that the VEGFA 3’-UTR is susceptible to repression by Pum1 and Pum2, which appears to be at least partly mediated by one or more of the putative PREs.

Analogously, the effects of elevated levels of miR-361-5p on VEGFA were studied by co-transfecting Flp-In-293 cells with the VEGFA luciferase reporters and either a miR-361-5p mimic or a control, or, for control purposes, with the CDKN1B reporters and either a miR-221 mimic or control. For cells transfected with the wild type reporters, the addition of miRNA mimics led to significant decreases in luciferase activities compared to the addition of control constructs, both for CDKN1B and VEGFA (31% and 19%, respectively; P = 7.3 x 10⁻⁴ and P = 0.0270; unpaired t-test, two-tailed). Effects were even more pronounced when PREs were mutated (43% and 28%, respectively; P = 0.0057 and P = 0.0130; unpaired t-test, two-tailed), whereas mutation of the MREs fully (CDKN1B; 11% increase for the miR-221 mimic) or partly (VEGFA; 8% decrease for the miR-361-5p mimic) abolished this effect. Taken together, the results show that miR-361-5p is able to repress VEGFA reporter expression via the putative MRE in the respective 3’-UTR.

4.2.8 The repressive effects of Pum proteins and microRNA 361-5p on VEGFA 3’-UTR reporter activity are additive

To assess whether miRNA-361-5p and Pum proteins exert combinatorial or even cooperative regulation on the VEGFA 3’-UTR, we co-transfected the Flp-In-293-derived cell lines expressing recombinant eGFP, Pum1 or Pum2 with either miRNA mimics or controls and wild type and mutated VEGFA or CDKN1B 3’-UTR luciferase reporters and compared luciferase activities (Figure 4.9). We found that for both the CDKN1B and VEGFA reporters, most of the data fit with a model in which the effects of simultaneously elevated Pum and miRNA levels add up to the sum of their individual effects (Table 4.4). This suggests that the regulation of the 3’-UTRs of CDKN1B and VEGFA by Pum1/2 and miR-221 or miR-361-5p, respectively, is largely independent of one another. The only condition in which the data deviate significantly from such an additive model is when Flp-In-293-Pum2 cells were co-transfected with the wild type CDKN1B reporter and a miR-221 mimic. Here, an approximately 51% reduction in reporter activity was observed, which is a considerably stronger repression (1.37-fold; P = 0.0060; unpaired t-test, two-tailed) than would have been expected from their individual effects (38% reduction). These data better fit a model that assumes synergism between Pum2 and miR-221. As for the VEGFA 3’-UTR, the observed reductions differed noticeably, yet not significantly, from expected ones only for the reporter in which the PREs were mutated. For this reporter, the effects of a simultaneous increase in miR-361-5p and Pum1 levels are somewhat weaker than would be expected from applying a strictly additive model (0.83-fold; P = 0.0660; unpaired t-test, two-tailed). In contrast, the observed reduction in reporter activity upon increasing miR-361-5p and Pum2 levels appears to be stronger than the expected additive effects (1.18-fold; P = 0.2777; unpaired t-test, two-tailed).

Figure 4.9 Effect of simultaneously elevated levels of microRNA 221 or 361-5p and Pum1 or Pum2 on the activity of VEGFA and CDKN1B 3’-UTR luciferase reporters. The indicated Flp-In-293-derived cell lines were co-transfected with either of the indicated CDKN1B (A) or VEGFA (B) luciferase reporters and Pre-miR microRNA mimics. Mean ratios of Renilla and firefly luciferase activities (relative luciferase units; RLU) were normalized to those of Pre-miR-control-transfected Flp-In-293-eGFP cells. Two independent experiments were performed in triplicate. Mean values ± S.E.M. from one experiment are plotted. Two-tailed, unpaired t-tests were used to calculate P values (one and two asterisks denote P values <0.05 and <0.01, respectively).

Table 4.4 Observed and expected reductions in reporter activities. The effects of simultaneously elevated miRNA and Pum1 or Pum2 levels (observed; obs.), expressed as percent reduction compared to the baseline (transfection of Flp-In-293-eGFP with Pre-miR-control), are compared to the sum of their individual effects (expected; exp.). Fold changes and P values (unpaired t-test, two-tailed) are indicated. Significance was assumed for P < 0.05.

Reporter          % reduction      % reduction      Fold change     P value
                  (observed)       (expected)       obs vs. exp

miRNA and Pum1
wt CDKN1B         40.77 ± 4.01     41.42 ± 1.38     0.98            0.8095
CDKN1B-PRE-DM     47.59 ± 5.65     46.23 ± 2.94     1.03            0.7267
CDKN1B-MRE-DM     33.43 ± 2.74     27.83 ± 3.75     1.20            0.1040
wt VEGFA          37.52 ± 1.69     40.70 ± 1.91     0.92            0.0715
VEGFA-PRE-TM      36.05 ± 4.00     43.69 ± 3.86     0.83            0.0660
VEGFA-MRE-MUT     35.70 ± 0.71     36.37 ± 5.18     0.98            0.7885

miRNA and Pum2
wt CDKN1B         51.48 ± 2.05     37.70 ± 3.85     1.37            0.0060
CDKN1B-PRE-DM     56.22 ± 4.46     49.92 ± 2.94     1.13            0.1220
CDKN1B-MRE-DM     23.43 ± 2.08     27.11 ± 3.12     0.86            0.1396
wt VEGFA          37.97 ± 7.71     40.80 ± 2.87     0.93            0.5961
VEGFA-PRE-TM      49.01 ± 7.05     41.38 ± 10.78    1.18            0.2777
VEGFA-MRE-MUT     36.18 ± 7.80     33.87 ± 6.40     1.07            0.7015

Taken together, these data indicate that both the VEGFA and CDKN1B 3’-UTRs are regulated by Pum1, Pum2 and either miR-361-5p (VEGFA) or miR-221 (CDKN1B) in a combinatorial manner. Most of the observed data fit well with an additive model of co-regulation, thus suggesting that the regulatory effects exerted on the 3’-UTRs are largely independent of one another.
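The comparison against the additive model can be illustrated with a short sketch. The observed and expected percent reductions below are copied from Table 4.4; the helper names and the two single-factor values passed to `expected_additive` are hypothetical and serve only to show the arithmetic (expected = sum of individual reductions, fold change = observed / expected).

```python
def expected_additive(reduction_a, reduction_b):
    """Expected percent reduction if two repressors act independently (simple sum)."""
    return reduction_a + reduction_b

def fold_change(observed, expected):
    """Ratio > 1 hints at synergism, < 1 at antagonism, about 1 at additivity."""
    return observed / expected

# (observed, expected) mean percent reductions taken from Table 4.4.
TABLE_4_4 = {
    ("Pum1", "wt CDKN1B"): (40.77, 41.42),
    ("Pum1", "VEGFA-PRE-TM"): (36.05, 43.69),
    ("Pum2", "wt CDKN1B"): (51.48, 37.70),
    ("Pum2", "VEGFA-PRE-TM"): (49.01, 41.38),
}

for (pum, reporter), (obs, exp) in TABLE_4_4.items():
    print(f"miRNA + {pum}, {reporter}: {fold_change(obs, exp):.2f}-fold")

# Hypothetical single-factor reductions, for illustration only.
print(expected_additive(20.0, 18.0))
```

A fold change alone does not establish a deviation from additivity; in the analysis above, this is judged together with the spread of the replicates via the reported t-tests.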

4.2.9 Endogenous VEGFA expression is regulated by microRNA 361-5p

To check whether endogenous VEGFA levels could be affected by miR-361-5p, we chose two different cell lines derived from human skin that are known to express high levels of VEGFA: the epidermoid squamous cell carcinoma-derived A431 cell line, and HaCaT cells, keratinocytes derived from normal skin that transformed spontaneously in vitro. First, we determined the expression levels for endogenous miR-361-5p and VEGFA mRNA in these cell lines using quantitative reverse transcription PCR (qRT-PCR; Figure 4.10). While miRNA expression did not differ between the two cell lines (fold difference between A431 and HaCaT = 1.04 +0.27/−0.21), VEGFA levels were significantly higher in A431 cells compared to HaCaT cells (3.69 +0.38/−0.35; P = 2.1 × 10⁻⁷; unpaired t-test, two-tailed).
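Fold differences with unequal upper and lower errors, such as those reported above, typically arise when a standard deviation measured on the CT scale is carried through an exponential transformation. The sketch below assumes the common comparative CT calculation (fold = 2^(−ΔΔCT)); this section does not spell out the exact calculation that was used, so the function and its inputs are illustrative assumptions, and the example values are hypothetical.

```python
def fold_difference(ddct, sd):
    """Convert a ddCT value and its S.D. into a fold difference with asymmetric errors.

    Assumes fold = 2 ** (-ddCT); the error is symmetric on the CT (log2) scale
    and therefore asymmetric on the linear fold scale.
    """
    fold = 2.0 ** (-ddct)
    upper_error = 2.0 ** (-(ddct - sd)) - fold
    lower_error = fold - 2.0 ** (-(ddct + sd))
    return fold, upper_error, lower_error

# Hypothetical input: ddCT of -1.0 cycles with an S.D. of 0.5 cycles.
fold, plus, minus = fold_difference(ddct=-1.0, sd=0.5)
print(f"{fold:.2f} +{plus:.2f}/-{minus:.2f}")
```

Under this assumption the upper error is always larger than the lower one, which matches the pattern of the values reported in the text.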

On the protein level, the baseline VEGFA secretion rates of mock-transfected cells were approximately 3262 ± 585 and 1314 ± 152 pg/mL after 24 hours for A431 and HaCaT cells, respectively (ratio ~ 2.5), as determined by ELISA (Figure 4.11).

Figure 4.10 VEGFA and miR-361-5p are expressed in A431 and HaCaT cells. qRT-PCR analysis of VEGFA and miR-361-5p expression in A431 and HaCaT cells. Fold differences ± S.D. in expression levels with regard to the references (ACTB and RNU6B, for VEGFA and miRNA-361-5p, respectively) are plotted. Fold differences ± S.D. between A431 and HaCaT expression levels are given for miR-361-5p and VEGFA. Mean CT values ± S.D. are indicated above each column. Data represent at least three independent experiments performed in triplicate. Two-tailed, unpaired t-tests were used to calculate P values (the triple asterisk denotes a P value <0.001).

We then determined VEGFA levels in the culture supernatants of both cell lines transfected with different amounts of miRNA-361-5p mimic or control using ELISA (Figure 4.11 A and B). While we only observed a slight decrease in VEGFA levels in HaCaT cells (up to ~11% when transfecting 30 nM; P = 0.0502; unpaired t-test, two-tailed), VEGFA levels were significantly decreased in A431 cells (up to ~30% when transfecting 10 nM; P = 0.0063; unpaired t-test, two-tailed) when comparing transfection of miR-361-5p mimic and control.

Conversely, VEGFA levels were not affected in A431 cells transfected with increasing amounts of miRNA-361-5p antisense inhibitor when compared to cells transfected with a control (Figure 4.11), while in HaCaT cells we detected elevated VEGFA levels for all tested antisense inhibitor concentrations (up to ~39% when transfecting 10 nM; P = 0.0150; unpaired t-test, two-tailed). Taken together, these findings demonstrate that altered levels of miR-361-5p may affect the rates at which VEGFA is secreted, suggesting that the miRNA is able to repress endogenous VEGFA expression in vitro.


Figure 4.11 Impact of altered miRNA-361-5p levels on secretion of endogenous VEGFA. A431 (A) and HaCaT (B) cells were transfected with the indicated concentrations of miR-361-5p mimic, antisense inhibitor, or controls. Culture supernatants were analyzed for VEGFA protein content using a human VEGFA ELISA. Three independent experiments were performed in triplicate. Mean values ± S.D. from a representative experiment are plotted. Two-tailed, unpaired t-tests were used to calculate P values (one and two asterisks denote P values <0.05 and <0.001, respectively).

4.2.10 MicroRNA 361-5p is down-regulated in cutaneous squamous cell carcinoma

Having established that miR-361-5p is expressed and able to regulate VEGFA expression in skin-derived cell lines, we hypothesized that its expression may be downregulated in SCC and that it may thus contribute to the initiation or maintenance of high VEGFA expression. We therefore measured the expression of miR-361-5p and several other VEGFA-regulating miRNAs in five samples of SCC obtained from patients and in five healthy skin samples using qRT-PCR.

First, we investigated whether VEGFA expression was indeed increased in the SCC samples by assessing VEGFA mRNA levels with two different assays, one for exon 3 and the other for the downstream conserved region in the 3'-UTR. As expected, we found that VEGFA mRNA levels were around two-fold higher in the SCC samples compared to healthy control samples (fold difference between SCC and healthy skin for the exon 3 assay: 2.27 +2.61/−1.22; P = 0.0472; unpaired t-test with Welch’s correction, two-tailed; P = 0.0556; Mann-Whitney U = 3, n1 = n2 = 5, two-tailed; fold difference between SCC and healthy skin for the 3’-terminal assay: 2.12 +4.50/−1.44; P = 0.1846; unpaired t-test with Welch’s correction, two-tailed; P = 0.2222; Mann-Whitney U = 6, n1 = n2 = 5, two-tailed; Figure 4.12 A). Moreover, data correlated very well for both assays (r = 0.83, P = 0.0015; Spearman’s rank correlation, one-tailed; Table 4.5). Interestingly, we found that the VEGFA 3'-terminus was expressed at significantly lower levels than the coding region (fold difference between VEGFA exon 3 and VEGFA 3’-terminus assays of 2.72 +1.34/−0.90; P = 0.0098; unpaired t-test with Welch’s correction, two-tailed; P = 0.0115; Mann-Whitney U = 17, n1 = n2 = 10, two-tailed; Figure 4.12 B and C).

We then measured the expression of Pum1, Pum2, miR-361-5p, its 'host gene' CHM and the known VEGFA-regulating miRNAs miR-20b, miR-34a, miR-93, miR-126 and miR-205. In healthy skin samples, the average expression levels of miR-20b (~62-fold down) and miR-205 (~51-fold up) strongly deviated from that of the reference RNA (RNU6B), while for all other miRNAs differences stayed within an order of magnitude (Figure 4.12 B). Of note, miR-361-5p levels (~3.6-fold lower than RNU6B) were very consistent between samples and correlated well with CHM mRNA levels (r = 0.53, P = 0.0587; Spearman’s rank correlation, one-tailed; data not shown). The protein coding genes were all present at levels comparable to those of VEGFA, with Pum2 being expressed at ~2.2-fold higher levels than Pum1. In the SCC samples, Pum1, Pum2, CHM and miR-361-5p levels were significantly decreased compared to healthy skin samples (fold difference between SCC and healthy skin for the PUM1 assay: 0.58 +0.18/−0.14; P = 0.0175; unpaired t-test with Welch’s correction, two-tailed; P = 0.0079; Mann-Whitney U = 0, n1 = n2 = 5, two-tailed; fold difference between SCC and healthy skin for the PUM2 assay: 0.36 +0.08/−0.07; P = 0.0001; unpaired t-test with Welch’s correction, two-tailed; P = 0.0079; Mann-Whitney U = 0, n1 = n2 = 5, two-tailed; fold difference between SCC and healthy skin for the CHM assay: 0.40 +0.53/−0.23; P = 0.0456; unpaired t-test with Welch’s correction, two-tailed; P = 0.0952; Mann-Whitney U = 4, n1 = n2 = 5, two-tailed; fold difference between SCC and healthy skin for the miR-361-5p assay: 0.44 +0.33/−0.19; P = 0.0220; unpaired t-test with Welch’s correction, two-tailed; P = 0.0159; Mann-Whitney U = 1, n1 = n2 = 5, two-tailed), whereas the other tested miRNAs either did not exhibit considerably reduced expression levels or, as seen for miR-126, even appeared to be upregulated (fold difference between SCC and healthy skin of 3.72 +10.99/−2.78; P = 0.0699; unpaired t-test with Welch’s correction, two-tailed; P = 0.0952; Mann-Whitney U = 4, n1 = n2 = 5, two-tailed; Figure 4.12 A and C).
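With only five samples per group, the Mann-Whitney P values quoted above can only take a limited set of values. The following sketch enumerates all possible rank assignments to obtain an exact two-tailed P value, assuming no tied ranks; that an exact test was used is an assumption, although the results agree with the reported U/P pairs. The function name is hypothetical.

```python
from itertools import combinations
from math import comb

def exact_mann_whitney_p(u_observed, n1, n2):
    """Exact two-tailed P value for a Mann-Whitney U statistic (no ties assumed).

    u_observed must be the smaller of U and n1 * n2 - U. Because the null
    distribution of U is symmetric, the two-tailed P value is twice the
    lower-tail probability, capped at 1.
    """
    total_ranks = n1 + n2
    min_rank_sum = n1 * (n1 + 1) // 2
    at_least_as_extreme = 0
    # Every choice of n1 ranks for group 1 is equally likely under the null.
    for ranks in combinations(range(1, total_ranks + 1), n1):
        if sum(ranks) - min_rank_sum <= u_observed:
            at_least_as_extreme += 1
    return min(1.0, 2 * at_least_as_extreme / comb(total_ranks, n1))

for u in (0, 1, 3, 4, 6):
    print(f"U = {u}, n1 = n2 = 5: P = {exact_mann_whitney_p(u, 5, 5):.4f}")
```

One consequence is that the smallest attainable two-tailed P value for n1 = n2 = 5 is 2/252 (about 0.0079), reached only when the two groups do not overlap at all (U = 0).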

Importantly, miR-361-5p levels exhibited the strongest inverse correlation with VEGFA levels across all samples (r = -0.58 and -0.60, P = 0.0408 and 0.0333; Spearman’s rank correlation, one-tailed; for the exon 3 and 3’-terminal assays, respectively; Table 4.5). No other miRNA or protein coding gene passed an r ≤ -0.5 threshold. Out of the Pum proteins, Pum1 levels correlated considerably more strongly with VEGFA levels (r = -0.45 and -0.41, P = 0.0935 and 0.1221; Spearman’s rank correlation, one-tailed; for the exon 3 and 3’-terminal assays, respectively). Notably, CHM did not correlate with VEGFA expression (r = 0.05 and 0.09, P = 0.4405 and 0.4014; Spearman’s rank correlation, one-tailed; for the exon 3 and 3’-terminal assays, respectively).

Figure 4.12 Relative changes in expression of selected mRNAs and mature miRNAs in healthy skin and cutaneous squamous cell carcinoma samples. For each group, five samples were analyzed by qRT-PCR, representing ten individuals. Experiments were performed in quadruplicate. Data are based on CT values normalized to ACTB and RNU6B for mRNAs and miRNAs, respectively. (A) Fold differences in expression relative to the averages of expression in healthy skin samples ± S.D. are indicated for each assay. The Mann-Whitney U test was used to calculate P values (two-tailed; single and double asterisks denote P values <0.05 and <0.01, respectively). (B) For each of the indicated assays, the fold difference in expression with regard to the reference is plotted for each of the healthy skin samples. Horizontal bars represent means of all healthy skin samples. (C) For each assay and for each of the cutaneous squamous cell carcinoma samples, the fold difference in expression to the average of the healthy skin samples is indicated. Horizontal bars represent means of all squamous cell carcinoma samples. Note that data in (A to C) are plotted logarithmically.

Table 4.5 Correlation of microRNA 361-5p, Pum1 and Pum2 with VEGFA expression in healthy skin and squamous cell carcinoma samples. For each group, five samples were analyzed by qRT-PCR, representing ten individuals. Experiments were performed in quadruplicate. Data are based on CT values normalized to ACTB and RNU6B for mRNAs and miRNAs, respectively. Based on the normalized expression levels, Spearman rank correlation coefficients (r) across all samples were calculated between VEGFA and the indicated messenger and miRNAs. P values (one-tailed) are indicated.

                      VEGFA exon 3         VEGFA 3'-terminus
Assay                 r        P           r        P

VEGFA exon 3           1.00    n/a          0.83    0.0015
VEGFA 3'-terminus      0.83    0.0015       1.00    n/a
CHM                    0.05    0.4405       0.09    0.4014
miR-361-5p            -0.58    0.0408      -0.60    0.0333
miR-20b               -0.05    0.4405      -0.22    0.2667
miR-34a               -0.08    0.4144      -0.43    0.1072
miR-93                -0.05    0.4405      -0.33    0.1733
miR-126                0.16    0.3257      -0.03    0.4669
miR-205               -0.26    0.2338      -0.44    0.1029
PUM1                  -0.45    0.0935      -0.41    0.1221
PUM2                  -0.30    0.2024      -0.16    0.3257
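The rank correlation coefficients in Table 4.5 follow the usual definition of Spearman's r as the Pearson correlation of the ranked values. The sketch below shows that calculation with standard-library Python only; the expression values are hypothetical, and the one-tailed P values of the table are not reproduced here.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical normalized expression values for five samples.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
target = [9.0, 7.5, 8.0, 4.0, 2.0]
print(f"r = {spearman(mirna, target):.2f}")
```

Because only the ranks enter the calculation, the coefficient is insensitive to the logarithmic scale of CT-based expression data, which makes it a natural choice for small qRT-PCR sample sets.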

In summary, Pum1, Pum2 and miR-361-5p were significantly reduced in SCC expressing high levels of VEGFA, indicating that their dysregulation could contribute to the observed elevated VEGFA levels in SCC. Furthermore, we found that out of a panel of six miRNAs targeting VEGFA, only miR-361-5p levels were inversely correlated with VEGFA levels in the patient samples. Although we did not observe a similar inverse correlation between CHM and VEGFA mRNA levels, the correlation between CHM and miR-361-5p levels may be an indication that transcription of the miRNA precursor is dependent on CHM transcription.

4.3 Discussion

In the present study, we have used bioinformatics analysis tools to define potential recognition elements of the post-transcriptional repressors Pum1, Pum2 and miRNA-361-5p in a downstream conserved region of the human VEGFA 3’-UTR. By using luciferase reporter assays, we demonstrate that at least some of these elements possess regulatory potential, that elevated levels of the repressors negatively affect reporter activities, and that the repressive effects of the Pum proteins and miRNA-361-5p are consistent with an independent, additive model of combinatorial control. We have confirmed the repressive effect of miR-361-5p on VEGFA expression in vitro using ELISA. Importantly, we also found that the RNA levels of all three regulators are reduced in SCC samples compared to healthy skin. In the case of miR-361-5p, we could further show that its levels are inversely correlated with VEGFA expression in the patient samples. Taken together, we have identified three novel post-transcriptional regulators of VEGFA expression in vitro, and our results indicate that they may affect cancer development or progression by modulating VEGFA expression in particular tumor types.

4.3.1 The putative Pum and microRNA recognition elements exhibit regulatory potential

The mutation of three critical nucleotides (Wang et al., 2001, 2002) in each of the three putative PREs in the VEGFA 3’-UTR led to an increase in luciferase reporter activity, similar in extent to that of a 3’-UTR reporter of the known Pum target CDKN1B/p27 (Galgano et al., 2008; Morris et al., 2008; Kedde et al., 2010) in which the two canonical PREs were deleted. This de-repression is consistent with the usual role of PUF proteins as repressors of gene expression (see 4.1). While the effects may appear to be relatively moderate, it has to be considered that these are caused by the mutation or deletion of only 9 or 16 residues (VEGFA and CDKN1B reporter, respectively) out of more than a thousand each. While we did not discern in detail the contribution of the individual PREs, we have observed that the mutation of the most downstream PRE, PRE3, in the VEGFA 3’-UTR alone caused a similar response (all in the range of a 20-25% increase in relative Renilla activity) compared to the mutation of two (PRE2 and PRE3) or all three PREs (data not shown), suggesting that PRE3 is likely responsible for the majority of the changes. Likewise, a deletion of three residues in the ‘seed’ sequence confirmed a regulatory potential for the putative miR-361-5p MRE. Interestingly, and in contrast to the observations for the CDKN1B reporter with deletions of the two miR-221 seed sequences, this mutation caused a moderate decrease in relative luciferase activity. This is surprising, as we had expected that the mutation would negatively affect binding of the miRNA and thus, consistent with the well-established role of miRNAs as repressors of gene expression (see 2.2.2.2) and similar to the effects observed for the mutation of the putative PREs, lead to de-repression. There are, however, a number of reports in which miRNAs appear to upregulate the expression of targets, either directly or indirectly, depending on RNP context (reviewed in Vasudevan, 2011). This observation may thus hint that the putative miR-361-5p MRE is under the cooperative control of two or more trans-acting factors. In this regard it may be important to note that PRE3 overlaps with the putative MRE of miR-361-5p (see Figure 4.2 A).

Our data indicate a regulatory potential of the putative PREs, at least for PRE3, as well as for the predicted miR-361-5p MRE. The study of the individual contribution of each putative PRE to the total regulatory impact, as well as the observations regarding the apparent activating potential of the putative miR-361-5p MRE, may constitute entry points for future studies, which are further outlined in the following paragraphs.

4.3.2 Combinatorial control of VEGFA expression by microRNA 361-5p and the Pum proteins

Co-transfection of cells with CDKN1B and VEGFA 3’-UTR reporters and Pre-miR constructs of miR-221 and miR-361-5p, respectively, led to a reduction in relative Renilla activities compared to the co-transfection with Pre-miR-control, thus suggesting that the putative miR-361-5p MRE in the VEGFA 3’-UTR is targeted by the respective miRNA, just like the miR-221 MRE in the CDKN1B 3’-UTR (Galardi et al., 2007; le Sage et al., 2007). This is supported by the findings that mutation of the respective MREs abolished the repressive effects of elevated miR-221 and miR-361-5p levels. The absence of a complete de-repression using the VEGFA MRE mutant reporter (in contrast to the CDKN1B MRE double mutant reporter) could be explained by the relatively mild mutation (three/two nucleotides deleted in the MRE/seed sequence) of the former compared to the latter (complete deletion of both seeds), which consequently might allow residual binding of miR-361-5p. In contrast to the observations regarding the MRE-mutated VEGFA 3’-UTR reporter alone, these data suggest a repressive role of miR-361-5p on VEGFA expression. Interestingly, mutation of the PREs for both the CDKN1B and VEGFA reporters enhanced the repressive effect, adding further circumstantial evidence to the possibility of cooperative regulation between miR-361-5p and Pum, such as has been reported for CDKN1B (Kedde et al., 2010).

Overexpression of both Pum1 and Pum2 in cells transfected with the VEGFA luciferase reporters led to a considerable decrease in relative Renilla activities compared to cells overexpressing a control protein, consistent with the widely established roles of PUF proteins as post-transcriptional repressors of gene expression (see 4.1). Together with the observed increases in luciferase activities of cells expressing a luciferase reporter with mutated canonical PREs, these data indicate that both Pum1 and Pum2 likely contribute to the regulation of VEGFA expression. However, mutation of all three putative PREs did not fully abolish the observed repressive effect of Pum1 and Pum2 overexpression. While PUF proteins exhibit a certain amount of flexibility in the recognition of their target sequences (Lu and Hall, 2011; reviewed in Miller and Olivas, 2011), and the identified PREs of human Pum2 do not always fully match the consensus UGUAnAUA sequence (Hafner et al., 2010b), the possibility that the mutated PREs allow residual binding of the Pum proteins seems unlikely, as the mutations cover the whole UGU triplet that is regarded critical for the target recognition of PUF proteins (Wang et al., 2001, 2002) and thus probably does not allow such profound mutations. The failure of PRE mutant reporters to effect a complete de-repression is thus more likely to stem from the presence of additional PREs that mediate binding to Pum1 and Pum2 even in the absence of the canonical PREs. This is consistent with the observations that the complete deletion of both consensus PREs in the CDKN1B 3’-UTR is also unable to effect a de-repression, although the functionality of at least the upstream PRE has been proven (Kedde et al., 2010). Furthermore, the global identification of Pum2 PREs in Flp-In-293-derived cells using PAR-CLIP (Hafner et al., 2010b) supports the idea of non-canonical PREs: While two likely binding sites for Pum2 were indeed found in the VEGFA 3’-UTR, the corresponding sequence tags are located approximately 60 nt upstream of PRE2 and downstream of PRE3, respectively (see Figure 4.1), and they contain no PRE consensus motifs or UGU triplets. Instead, both contain UCUACAUA sequences, which differ from the canonical PRE motif in only one nucleotide and could thus be likely candidates for such ‘cryptic’ PRE motifs. The absence of sequence tags covering the putative consensus PREs, and particularly PRE3, might be explained by the masking of functional PREs by other trans-acting factors, for example miR-361-5p, as well as technical limitations, and thus does not necessarily mean that these sites are not amenable to regulation by Pum1 and/or Pum2. In vitro binding assays conducted in our lab support this explanation, as a transcript not including PRE2, PRE3 and either of the sequence tags was nevertheless bound by the Pum1 and Pum2 homology domains (Galgano et al., 2008).
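The notion of a 'cryptic' PRE that differs from the UGUAnAUA consensus in a single nucleotide can be made concrete with a small scanning sketch. The function names, the mismatch threshold and the example sequence are hypothetical; this is not the bioinformatics procedure used in this study, merely an illustration of counting mismatches against a consensus with one wildcard position.

```python
PRE_CONSENSUS = "UGUANAUA"  # N marks the position where any nucleotide is allowed

def mismatches(site, consensus=PRE_CONSENSUS):
    """Number of positions at which a site deviates from the consensus (N matches all)."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for s, c in zip(site.upper(), consensus) if c != "N" and s != c)

def find_near_matches(sequence, max_mismatches=1):
    """Scan an RNA (or DNA) sequence for windows within max_mismatches of the consensus."""
    rna = sequence.upper().replace("T", "U")
    width = len(PRE_CONSENSUS)
    hits = []
    for start in range(len(rna) - width + 1):
        window = rna[start:start + width]
        count = mismatches(window)
        if count <= max_mismatches:
            hits.append((start, window, count))
    return hits

# The motif found in the PAR-CLIP sequence tags deviates at one position (G -> C).
print(mismatches("UCUACAUA"))

# Hypothetical sequence containing one canonical and one near-consensus site.
for start, window, count in find_near_matches("AAUGUAAAUACCUCUACAUAGG"):
    print(start, window, count)
```

Relaxing the threshold beyond one mismatch quickly increases the number of candidate sites, so any such hits would need experimental confirmation before being treated as functional PREs.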

Taken together, these data indicate that VEGFA expression may be under the combinatorial control of Pum1, Pum2 and miR-361-5p, which all act as repressors on its 3’-UTR in our in vitro reporter assays. While the cis-regulatory element for miR-361-5p could be identified by mutational studies, the PRE identity is less clear: While PRE3 appears to possess at least some regulatory potential, the VEGFA 3’-UTR likely contains additional, non-canonical PREs. Moreover, other potential Pum binding sites, such as the consensus motifs PRE1 and PRE2, may gain functionality only when other PREs are blocked or mutated, thus making the study of their individual roles a complex task.

4.3.3 The regulation of VEGFA expression by microRNA 361-5p and the Pum proteins may be dependent on each other

The data from experiments in which we simultaneously increased the levels of miR-361-5p and either of the Pum proteins largely indicate additive effects of the two classes of repressors. This would suggest the absence of cooperativity, which is in contrast to some of the previously discussed results. Moreover, due to the overlap of the miR-361-5p MRE and PRE3, it appears unlikely that no competition exists between these motifs, if PRE3 indeed represents a functional Pum recognition element, as is supported by the increased relative Renilla activity when comparing the PRE triple mutant with the wild type VEGFA reporter.

There are several possible reasons why we failed to pick up cooperativity in our experimental setup. First of all, the repression exerted on the VEGFA 3’-UTR by both the Pum proteins and miR-361-5p is relatively moderate, and it is reasonable to assume that the modulation of the effects by possible cooperativity may be even smaller in extent. But the sensitivity and robustness of our assay system are limited, so that it is possible that we would not be able to reliably detect such differences. This scenario is supported by the data obtained from the PRE mutant reporter, in which deviations from an additive model of up to 20% in either direction were observed. However, due to the sample variation, these did not reach significance. On the other hand, increasing the levels of both types of regulators may not be the ideal setup to discover competitive effects between two repressors, as their relative ratios may not be affected. The synergy between Pum1- and miR-221-mediated regulation of the CDKN1B 3’-UTR, which has previously been established (Kedde et al., 2010), could also not be confirmed using our experimental strategy. In accordance with the mentioned study, the effect of depleting either or even both of the repressors should thus be tested in future experiments. The aforementioned possibility of the existence of alternative PREs, whether active in themselves or requiring the blockage of other PREs to gain functionality, may be another explanation for the inability to detect cooperativity, as the effects of potential competition between the regulators for a particular PRE, e.g. PRE3, may be diluted by the use of distant alternative, perhaps non-canonical PREs. Finally, it is conceivable that crosstalk between the regulators requires the presence or absence of additional factors to become visible. The large set of known post-transcriptional regulators of VEGFA expression (see 2.4.1), as well as the extremely high density of predicted MREs in the downstream conserved region of the VEGFA 3’-UTR (0.41 MREs/nucleotide, compared to 0.29 and 0.14 for the upstream conserved and non-conserved regions, respectively; see Table 4.1 and 7.4) indeed favor a complex picture of combinatorial control, with many players being involved. This implies the potential for a high degree of competition or other cooperative effects between miRNAs, or between miRNAs and RNA-binding proteins. Indeed, cooperativity has already been observed between other miRNAs regulating VEGF expression in vitro (Hua et al., 2006), as well as between miRNAs and RNA-binding proteins (Jafarifar et al., 2011).


Considering that the decrease of reporter activity in response to increased repressor levels appeared to be additive, the results suggest the absence of cooperativity under the tested conditions and instead imply a model of independent control mechanisms for miR-361-5p on the one hand and the Pum proteins on the other. But considering all of the results from our in vitro reporter assays, a definite conclusion regarding the nature of the combinatorial control between the two repressors cannot be drawn. The use of reporters with a limited 3’-UTR context and number of potential PREs (e.g. the region between PRE2 and the first potential downstream UCUACAUA Pum binding site; Hafner et al., 2010b) might greatly facilitate the dissection of the cis-regulatory elements, because it would (a) restrict the number of potential alternative Pum binding sites and other cis-regulatory elements, and (b) likely lead to stronger effects and thus a better assay sensitivity. While the biological significance of results obtained from such short, “out-of-context” reporters may not be immediately clear, and the results may even be misleading, they could still serve as a starting point for a subsequent in-depth functional analysis.

4.3.4 The influence of microRNA 361-5p and Pum1 on VEGFA secretion rates

Consistent with findings indicating that the regulatory impact of altered miRNA levels on endogenous protein levels of targets is often weak (Baek et al., 2008; Selbach et al., 2008), we have also observed mild inhibitory effects of miRNA 361-5p on VEGFA secretion rates in HaCaT and A431 cells. A difference in VEGFA miR-361-5p MRE occupancy by endogenous miR-361-5p between the two cell lines could potentially explain the differential effects of miR-361-5p mimic or antisense inhibitor on their VEGFA secretion rates: A high occupancy (‘saturation’) of the MRE in HaCaT cells, which express similar miR-361-5p but lower VEGFA levels compared to A431, could account for the absence of a significant decrease in VEGFA secretion upon the addition of miR-361-5p mimic. Conversely, a low occupancy of the MRE in A431 cells could render the cells unresponsive to miR-361-5p inhibition.


However, in this simple scenario the impact of increased miR-361-5p levels in A431 and miR-361-5p inhibition in HaCaT cells should be proportional to the amount of added miRNA mimic or antisense inhibitor. The absence of such a miRNA dose-response could be a further indication that one or more additional factors differentially modulate miR-361-5p activity in the two cell lines, systemically or specifically, by influencing the availability, accessibility or functionality of the miRNA or its recognition element.

The influence of Pum1 overexpression and knockdown on VEGFA expression in A431 cells was previously assessed in our laboratory (Galgano, 2010). Using ELISA, it was shown that knockdown of Pum1 led to a significant decrease in VEGFA secretion rates and intracellular VEGFA protein levels. Conversely, Pum1 overexpression resulted in a significant increase in intracellular VEGFA, while a similar effect on secreted levels did not reach significance. These data add further evidence for a regulatory role of Pum proteins on VEGFA expression and indicate that the nature, impact, and direction of the regulation are strongly dependent on the cellular context and thus likely on the crosstalk with other post-transcriptional regulators.

4.3.5 A potential role of microRNA 361-5p and Pum proteins in cancer development and progression

We have shown that levels of miR-361-5p, but not those of the known VEGFA-regulating miRNAs miR-20b, -34a, -93, -126, and -205, inversely correlate with VEGFA expression in SCC compared to healthy skin samples, corresponding to previously reported findings (Dziunycz et al., 2010). Moreover, Pum1 and Pum2 mRNA as well as miR-361-5p levels were significantly reduced in the SCC samples, suggesting that any or all of these repressors may contribute to elevated VEGFA levels at least in this type of cancer. Studies with larger sample numbers and follow-up data should establish the value of these regulators as diagnostic and prognostic markers or potential drug targets.

These findings further underline the presence of cancer-specific miRNA profiles (reviewed in Calin and Croce, 2006), the emerging and possibly widespread role of RNA-binding proteins in disease in general (reviewed in Lukong et al., 2008), and the interaction of the two classes of post-transcriptional gene regulators in cancers specifically (reviewed in van Kouwenhove et al., 2011). The application of methods for the unraveling of post-transcriptional regulatory networks, as well as expression profiling of cancers with respect to miRNAs, RBPs and other post-transcriptional regulators, should allow us to gain further insights into these intricate and exciting new mechanisms. Eventually, the integration of the available knowledge on transcriptional, post-transcriptional, post-translational, epigenetic, and other types of gene regulation should greatly broaden our understanding of the dysregulation of tumor suppressors, oncogenes and critical players in other diseases.

Interestingly, the data from SCC and healthy skin samples add another possibility for potential crosstalk between post-transcriptional regulatory mechanisms. The localization of the putative recognition elements for Pum1/2 and miR-361-5p close to the mRNA’s 3’-terminus may render them prone to inaccessibility due to degradation or alternative polyadenylation. Regarding the latter, it has recently been proposed that proliferating cells may employ alternative polyadenylation to shorten the 3’-UTRs of transcripts and thus escape miRNA- and RNA-binding protein-mediated post-transcriptional repression (Sandberg et al., 2008). Indeed, it was found that the VEGFA transcript uses two different polyadenylation sites in mice, although no differential usage of the signals was observed between normoxic and hypoxic conditions (Dibbens et al., 2001). Similarly, our data from SCC and healthy skin samples indicate an apparent difference in expression levels between the coding region and the downstream region of the 3’-UTR within, but not between, the two sample groups, suggesting that alternative polyadenylation may indeed limit the availability of the miR-361-5p MRE (Figure 4.12 B). Further studies into the differential regulation of VEGFA 3’-UTR length, the interplay between miRNAs, as well as the crosstalk with other post-transcriptional regulators may shed light on these issues.

4.3.6 Bioinformatics analyses suggest common functions of microRNA 361-5p and Pum proteins beyond the regulation of VEGFA expression

It has been proposed that miRNAs often act as master regulators of PTGR (reviewed in Kanitz and Gerber, 2010) and thus could well regulate multiple targets within the same pathway. Indeed, besides regulating the expression of VEGFA, miR-361-5p, Pum1 and Pum2 are predicted or were found to target thousands of different mRNAs, and the proteins encoded by these often act in similar pathways. Strikingly, out of all pathways significantly enriched among the genes (putatively) targeted by either regulator, around 45% are enriched among genes predicted to be targeted by miR-361-5p and found to be associated with either Pum1 or Pum2. In addition to many others, the VEGF pathway, as well as angiogenesis in general, were among the enriched pathways, consistent with the important roles that have been proposed for RNA-binding proteins and miRNAs in the regulation of angiogenesis (reviewed in Chang and Hla, 2011). Even though the reliability of prediction software and GO annotation is often questionable at the level of individual targets, the large number of commonly enriched pathways suggests a large potential for common functions, possibly extending those suggested by their identification as repressors of VEGFA expression. Future studies should establish the biological relevance of these predictions using functional assays or in vivo experiments.
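The share of commonly enriched pathways mentioned above can be illustrated with a short sketch. The pathway names and set contents below are hypothetical placeholders and not the PANTHER output of this study; the sketch merely assumes that a pathway counts as commonly enriched if it is enriched among predicted miR-361-5p targets and among the targets of at least one Pum protein.

```python
def shared_pathway_fraction(mir_pathways, pum1_pathways, pum2_pathways):
    """Fraction of all enriched pathways shared by the miRNA and either Pum protein."""
    pum_any = pum1_pathways | pum2_pathways      # enriched for Pum1 or Pum2
    all_enriched = mir_pathways | pum_any        # enriched for any regulator
    if not all_enriched:
        return 0.0
    shared = mir_pathways & pum_any              # enriched for miRNA and a Pum protein
    return len(shared) / len(all_enriched)


# Hypothetical pathway sets, for illustration only
mir = {"VEGF signaling", "Angiogenesis", "Wnt signaling", "p53 pathway"}
pum1 = {"VEGF signaling", "Angiogenesis", "EGF receptor signaling"}
pum2 = {"Angiogenesis", "Wnt signaling", "Cell cycle"}

fraction = shared_pathway_fraction(mir, pum1, pum2)
print(f"{fraction:.0%} of enriched pathways are shared")
```

Using set operations keeps the definition of the denominator explicit, which matters because the reported percentage depends on whether it refers to all enriched pathways or only to those of one regulator.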


4.3.7 Conclusion

We were able to demonstrate the validity of our combined evidence- and prediction-based approach for the identification of regulators of the expression of VEGFA, whose dysregulation is implicated in several human malignancies. However, the downsides of such a biased approach lie in the difficulties associated with the unambiguous identification of cis-regulatory elements. Furthermore, the study of combinatorial control mechanisms may be severely hampered by the complexity of PTGR programs and the high interconnectivity of the underlying networks. This is particularly problematic for messages like VEGFA whose expression is extensively controlled at the post-transcriptional level, both by RNA-binding proteins and miRNAs. Nevertheless, we were able to identify three novel trans-acting factors, the miRNA miR-361-5p and the RNA-binding proteins Pum1 and Pum2, which are able to repress VEGFA expression in vitro. We could further establish that all regulators exhibited reduced expression in cutaneous squamous cell carcinoma samples expressing high levels of VEGFA, indicating that the regulation demonstrated in vitro may have medical implications.

4.4 Materials and Methods

4.4.1 Ethics statement

The collection of specimens from clinically indicated excisions for this study was explicitly approved by the institutional review board (Kantonale Ethikkommission Zürich). Informed consent (both written and verbal) was obtained from patients for the use of their skin samples in this research project.

4.4.2 Plasmids

The wild type and mutated versions of the CDKN1B/p27 3’-UTR (nt 1070-2403 of RefSeq mRNA NM_004064.3 at GenBank) were amplified from pGL3-CDKN1B-3’UTR, pGL3-CDKN1B-3’UTR-PRE-DM, and pGL3-CDKN1B-3’UTR-MRE-DM, respectively (kindly provided by Martijn Kedde, NKI Amsterdam), using the primers CDKN1B-3’-UTR-fwd and CDKN1B-3’-UTR-rev (containing XhoI and NotI restriction sites, respectively; see 7.1.1). A fragment comprising nucleotides 43,753,225 to 43,754,253 of human chromosome 6 (Build GRCh37/hg19, February 2009), containing nucleotides 926 to 1925 of the 3’-UTR of human VEGFA (isoform a, NM_001025366.2), was amplified from HeLa S3 genomic DNA using KOD Hot Start DNA Polymerase (Novagen) and the primers VEGFA-3’-UTR-fwd and VEGFA-3’-UTR-rev (containing XhoI and NotI restriction sites, respectively; see 7.1.1).

Amplicons were purified with the QIAquick PCR Purification kit (QIAGEN, Cat. No. 28104) and inserted into pCR-Blunt II-TOPO (Invitrogen), using the Zero Blunt TOPO PCR Cloning Kit (Invitrogen, Cat. No. K2830-20) according to the manufacturer’s recommendations. The resulting plasmids, pCR-Blunt II-TOPO-CDKN1B-3’-UTR, pCR-Blunt II-TOPO-CDKN1B-3’-UTR-PRE-DM, pCR-Blunt II-TOPO-CDKN1B-3’-UTR-MRE-DM, and pCR-Blunt II-TOPO-VEGFA-3’-UTR, were verified by sequencing of the inserted regions. Mutations in the putative Pum and miR-361-5p recognition elements of the VEGFA 3’-UTR were introduced using the QuikChange I site-directed mutagenesis kit (Stratagene) according to the manufacturer’s recommendations. PREs were sequentially mutated using the following oligonucleotide pairs in the indicated order: VEGFA-PRE3-MUT-fwd/-rev (PRE3 mutation), VEGFA-PRE2/3-MUT-fwd/-rev (PRE3/2 double mutation), VEGFA-PRE1-MUT-fwd/-rev (PRE3/2/1 triple mutation). The VEGFA-MRE-MUT-fwd/-rev oligonucleotide pair was used to mutate the MRE in pCR-Blunt II-TOPO-VEGFA-3’-UTR. See 7.1.2 for oligonucleotide sequences. All mutations were verified by sequencing of the appropriate regions. To generate the luciferase reporter constructs psiCHECK-2-CDKN1B-3’-UTR (‘wt CDKN1B’), psiCHECK-2-CDKN1B-3’-UTR-PRE-DM (‘CDKN1B-PRE-DM’), psiCHECK-2-CDKN1B-3’-UTR-MRE-DM (‘CDKN1B-MRE-DM’), psiCHECK-2-VEGFA-3’-UTR (‘wt VEGFA’), psiCHECK-2-VEGFA-3’-UTR-PRE-TM (‘VEGFA-PRE-TM’), and psiCHECK-2-VEGFA-3’-UTR-MRE-MUT (‘VEGFA-MRE-MUT’), wild type and mutated CDKN1B and VEGFA 3’-UTR fragments were subcloned into psiCHECK-2 (Promega) via XhoI and NotI restriction sites.

The coding sequence for Pum1 (corresponding to GenBank entry CV027786.1) was obtained from pDONR223-Pum1 (Thermo Scientific Inc., Cat. No. OHS1770-93822323; kindly provided by Alexander Wepf, Institute of Molecular and Systems Biology, ETH Zurich). The coding sequence for Pum2 (corresponding to GenBank entry BC143550.1) was amplified from pCMV6-XL5-Pum2 (OriGene, Cat. No. SC112640) with primers Pum2-CDS-attB-fwd and Pum2-CDS-attB-rev (both containing attB recombination sites; see 7.1.1). The amplicon was purified with the QIAquick PCR Purification kit (QIAGEN, Cat. No. 28104) and inserted into pDONR221 (Invitrogen, Cat. No. 12536-017) by recombination, using BP Clonase II Enzyme Mix (Invitrogen, Cat. No. 11789-020) according to the manufacturer’s recommendations. The Pum2 coding sequence of the resulting plasmid, pDONR221-Pum2, was verified by sequencing. To generate pTO-HA-Strep-GW-FRT-Pum1 and pTO-HA-Strep-GW-FRT-Pum2, pTO-HA-Strep-GW-FRT (kindly provided by Alexander Wepf, Institute of Molecular and Systems Biology, ETH Zurich; see 7.7) was recombined with pDONR223-Pum1 and pDONR221-Pum2, respectively, using LR Clonase Enzyme Mix (Invitrogen, Cat. No. 11791-020) according to the manufacturer’s recommendations.

4.4.3 Cell culture and tissue samples

All cell lines were cultured in Dulbecco’s modified Eagle’s medium (DMEM; Invitrogen, Cat. No. 41966) supplemented with 10% fetal bovine serum (Invitrogen, Cat. No. 10270-106) and 1x antibiotic-antimycotic (Invitrogen, Cat. No. 15240-062) in 5% CO2 at 37°C. Flp-In-293 cells were cultured in the presence of 1 mg/mL zeocin (Invitrogen, Cat. No. R250-01), Flp-In-293-Pum1/2 and -eGFP cells in the presence of 200 µg/mL hygromycin B (Invitrogen, Cat. No. 10687-010).

Flp-In-293 (Invitrogen, Cat. No. R750-07) and Flp-In-293-eGFP cell lines were kindly provided by Alexander Wepf (Institute of Molecular and Systems Biology, ETH Zurich). HEK293 (Graham et al., 1977) and A431 (Giard et al., 1973) cells were purchased from ATCC (CRL-1573 and CRL-1555, respectively). HaCaT (Boukamp et al., 1988) cells were obtained from Cell Lines Service (Cat. No. 300493). Flp-In-Pum1 and Flp-In-Pum2 cells were generated by co-transfection of 500,000 Flp-In-293 cells with 720 ng pOG44 (Invitrogen, Cat. No. V6005-20; kindly provided by Alexander Wepf, Institute of Molecular and Systems Biology, ETH Zurich) and 80 ng of either pTO-HA-Strep-GW-FRT-Pum1 or -Pum2, respectively, using the FuGENE HD Transfection Reagent (Roche, Cat. No. 04883560001) according to the manufacturer’s recommendations. Stably transformed cells were selected by supplementing the medium with hygromycin B (200 µg/mL; Invitrogen, Cat. No. 10687-010).

Squamous cell carcinoma (SCC) samples were obtained at the time of surgery. Normal skin was obtained from abdominoplastic reductive surgery. All specimens’ diagnoses were confirmed by a board-certified dermatohistopathologist. Four mm punch biopsies from SCC or normal skin were placed in preheated phosphate-buffered saline (PBS; Invitrogen, Cat. No. 14190) at 60°C for 45 seconds, and then chilled on ice in 0.1% PBS for one minute, followed by mechanical separation of epidermis and dermis by scratching. The epidermis was homogenized in TRIzol reagent (Invitrogen) and stored at -80°C. RNA was extracted according to the manufacturer’s recommendations. Quantity and quality of extracted RNA were assessed by spectrophotometry with a NanoDrop 1000 (Thermo Fisher Scientific Inc.) and with a 2100 Bioanalyzer (Agilent Technologies), respectively. All RNA samples had an RNA Integrity Number (RIN) higher than 7.0.

4.4.4 MicroRNA target gene prediction and pathway analysis

Predictions of human miRNA targets were downloaded from microRNA.org (Betel et al., 2010) (August 2010 release, http://www.microrna.org/, accessed: August 12th, 2011; all predictions with a mirSVR score of less than -0.1 were considered), TargetScan (Friedman et al., 2009) (Release 5.2; http://www.targetscan.org/, accessed: August 12th, 2011; all predicted miRNA recognition elements were considered, regardless of conservation), DIANA-microT v3.0 (Maragkakis et al., 2009) (Release 3.0, http://diana.cslab.ece.ntua.gr/microT/, accessed: August 12th, 2011), miRDB (Wang, 2008) (Release 3.0, http://mirdb.org/miRDB/, accessed: August 12th, 2011), and MicroCosm (Griffiths-Jones et al., 2008) (Release 5, www.ebi.ac.uk/enright-srv/microcosm/, accessed: August 12th, 2011). The miR-361-5p sequence was obtained from miRBase (Kozomara and Griffiths-Jones, 2011) (Release 17, http://www.mirbase.org/, accessed: May 5th, 2011). RNAhybrid (Rehmsmeier et al., 2004) (http://bibiserv.techfak.uni-bielefeld.de/rnahybrid/) was used online, with default settings and the following sequences: UUAUCAGAAUCUCCAGGGGUAC (miR-361-5p, miRNA) and UGUAUAUAUGTGAUUCUGAUAAA (VEGFA 3’-UTR fragment containing the putative miR-361-5p MRE, target RNA). For the pathway analysis, predicted targets for miR-361-5p and experimentally verified Pum1 and Pum2 targets (Morris et al., 2008; Galgano et al., 2008; Hafner et al., 2010b) were converted to Entrez identifiers using DAVID (Huang et al., 2008) (http://david.abcc.ncifcrf.gov/), if not present in the respective outputs. Results for each group were pooled, filtered for unique records, and subjected to gene set enrichment analysis with PANTHER (Thomas et al., 2006) (http://www.pantherdb.org/tools/compareToRefListForm.jsp). For VEGFA pathway analysis, the KEGG PATHWAY database (Kanehisa et al., 2010) (http://www.genome.jp/kegg/pathway.html) was consulted. All services were used with default settings.
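As a simple plausibility check of the two sequences submitted to RNAhybrid, the following sketch tests whether the target fragment contains a site complementary to the miRNA seed. It is not a reimplementation of RNAhybrid, which computes hybridization energies; the definition of the seed as miRNA nucleotides 2 to 8 is a commonly used convention assumed here, and the function names are illustrative. The target fragment is used exactly as printed above.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (A/U/G/C only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match_site(mirna, start=2, end=8):
    """Return the target site complementary to miRNA nucleotides start..end (1-based)."""
    seed = mirna[start - 1:end]
    return reverse_complement(seed)


MIR_361_5P = "UUAUCAGAAUCUCCAGGGGUAC"
VEGFA_FRAGMENT = "UGUAUAUAUGTGAUUCUGAUAAA"  # as printed in the text

site = seed_match_site(MIR_361_5P)
has_seed_match = site in VEGFA_FRAGMENT
print(site, has_seed_match)
```

A plain substring search is sufficient here because only perfect Watson-Crick pairing of the seed is tested; G:U wobble pairs and pairing outside the seed are deliberately ignored.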

4.4.5 Immunoblot analysis

Cells were treated with tetracycline (2 µg/mL; 24 h) and lysed with RIPA buffer (Cell Signaling Technology, Cat. No. 9806S) supplemented with 1 mM phenylmethanesulfonyl fluoride (Sigma-Aldrich, Cat. No. P7626) according to the manufacturer’s recommendations. Protein concentrations were determined using the Bio-Rad Protein Assay according to the manufacturer’s recommendations and spectrophotometric analysis with a NanoDrop 1000 (Thermo Fisher Scientific Inc.) at a wavelength of 595 nm. Protein extracts (20 µg) were supplemented with NuPAGE LDS Sample Buffer (Invitrogen, Cat. No. NP0008) and NuPAGE Sample Reducing Agent (Invitrogen, Cat. No. NP0009) and loaded on NuPAGE Novex 4-12% Bis-Tris gels (1 mm, 10 wells; Invitrogen, Cat. No. NP0321BOX). Proteins were separated by polyacrylamide gel electrophoresis in NuPAGE MOPS SDS Running Buffer (Invitrogen, Cat. No. NP0001) supplemented with NuPAGE Antioxidant (Invitrogen, Cat. No. NP0005), according to the manufacturer’s recommendations. Proteins were transferred onto nitrocellulose membranes (Bio-Rad, Cat. No. 162-0115) in a Mini-PROTEAN II Electrophoresis Cell (Bio-Rad) in the presence of transfer buffer (25 mM Tris, 192 mM glycine, 20% methanol; 200 mA per membrane, 1 h, 4°C). Membranes were blocked with 5% low-fat milk powder in PBS (1 h, RT, 300 rpm). Anti-HA antibody (clone HA-7; Sigma-Aldrich, Cat. No. H3663-200UL) was added at 0.5 µg/mL and the incubation was continued for an additional hour. Membranes were washed twice with PBS containing 0.05% Tween-20 (Sigma-Aldrich, Cat. No. P2287-100ML) for 5 min each (300 rpm) and incubated (1 h, RT, 300 rpm) in PBS containing 5% low-fat milk powder and a 1:5,000 dilution of a horseradish peroxidase-coupled anti-mouse antibody (Amersham, Cat. No. NA931V). Membranes were washed four times as described above, briefly dried and developed with the ECL Plus Western Blotting Detection System (Amersham, Cat. No. RPN2132) according to the manufacturer’s recommendations. Bands were visualized with a Bio-Rad Universal Hood II.

4.4.6 Flow cytometry

To assess the efficiency of small RNA transfections, 125,000 A431, HaCaT or HEK293 cells were reverse transfected with 10, 30 or 100 nM Cy3 dye-labeled Pre- or Anti-miR Negative Control #1 (Applied Biosystems; see 7.2), or mock-transfected, using siPORT NeoFX (Applied Biosystems) according to the manufacturer's recommendations. Transfections were performed in triplicate in 24-well plates. For the analysis of eGFP expression, Flp-In-293-eGFP cells were treated with the indicated concentrations of tetracycline. In both cases, cells were detached with 0.05% trypsin-EDTA (Invitrogen, Cat. No. 25300-054) after 24 h and washed with PBS. Dye-labeled small RNAs or eGFP were excited with a blue laser (excitation wavelength = 488 nm) and analyzed with a FACScan flow cytometer (Becton Dickinson). At least 5,000 or 50,000 events were recorded for each sample for the analysis of small RNA transfections and eGFP levels, respectively. Data were analyzed with WinMDI 2.8.

4.4.7 Quantitative reverse transcription PCR

For each reaction, cDNA was prepared from 10 ng total RNA using the TaqMan MicroRNA Reverse Transcription Kit (Applied Biosystems) for miRNA detection, or the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems) for mRNA detection. miRNA and gene expression assays were purchased from QIAGEN (SYBR; ACTB, PUM1, PUM2) or Applied Biosystems (TaqMan; all others). See 7.3 for an overview of commercial qRT-PCR assays. Quantitative PCR reactions were performed in quadruplicate, using FastStart TaqMan Probe Master (Rox; Roche) or FastStart Universal SYBR Green Master (Rox; Roche, Cat. No. 04913914001) in an AB 7900 HT Fast Real-Time PCR System (Applied Biosystems). Quantification was performed using the 2^(-ΔΔCT) method (Schmittgen and Livak, 2008), with RNU6B and ACTB serving as references for the normalization of miRNA and mRNA expression levels, respectively. Equal amplification efficiencies of near 100% were assumed for all assays, based on the manufacturers’ assertions.
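The quantification described above can be sketched as follows. The CT values are hypothetical and serve only to illustrate the calculation; averaging replicate CT values before subtraction is an assumption of this sketch, and the base of 2 reflects the assumed amplification efficiency of 100%.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target relative to a control condition, 2^(-ddCT).

    Each argument is a list of replicate CT values; the reference is
    RNU6B for miRNAs or ACTB for mRNAs in the setup described in the text.
    """
    delta_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical quadruplicate CT values
fold_change = relative_expression(
    ct_target_sample=[25.5, 26.5, 26.0, 26.0],
    ct_ref_sample=[20.0, 20.0, 20.0, 20.0],
    ct_target_control=[24.0, 24.0, 24.0, 24.0],
    ct_ref_control=[20.0, 20.0, 20.0, 20.0],
)
print(fold_change)
```

In this example the target is detected two cycles later in the sample than in the control after normalization, which corresponds to a four-fold lower expression level.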

4.4.8 Immunocytochemistry

The chambers of an 8-chamber slide (BD Falcon, Cat. No. 354108) were coated with 5 µg/mL fibronectin (Sigma-Aldrich, Cat. No. F0895) diluted in PBS for 20 min at RT. 250 µL of an Flp-In-293-Pum1 cell suspension (8 x 10^4 cells/mL in DMEM supplemented with 10% fetal bovine serum and 1x antibiotic-antimycotic) were seeded into each of the chambers. 16 h later, medium was replaced with fresh medium containing 1 µg/mL tetracycline (2 mg/mL stock solution in EtOH). 23 hours later, sodium arsenite (0.05 M; Sigma, Cat. No. 35000-1L-R) was added to the medium at 0.5 mM. After 45 min, cells were fixed (4% paraformaldehyde in PBS; 15 min, RT, 300 rpm) and permeabilized (0.1% Triton X-100, 0.5% BSA, 1 µg/mL Hoechst 33342 in PBS; 3 min, RT). After pre-incubation in blocking buffer (0.5% BSA, 5% donkey serum in PBS) for 30 min at RT, cells were co-stained with an anti-Pum1 (25 µg/mL; Bethyl Laboratories, Inc., Cat. No. A300-201A) and an anti-HA antibody (clone HA-7; 2 µg/mL; Sigma, Cat. No. H3663) in blocking buffer (1 h 30 min, 300 rpm). Cells were washed twice in PBS with 0.02% Tween-20 (Sigma-Aldrich, Cat. No. P2287-100ML) for 5 min each (300 rpm), and then incubated with 1:200 dilutions of Alexa488-coupled anti-goat and Alexa594-coupled anti-mouse antibodies (both from Molecular Probes, Cat. Nos. A-11055 and A-21203, respectively) in blocking buffer (1 h, RT, 300 rpm) in the dark. Cells were washed three times as described above and mounted in Mowiol 4-88 (Calbiochem, Cat. No. 475904). Images of the cells were acquired with a Leica TCS SP2 confocal microscope (Light Microscopy Center, ETH Zurich). Overlay images were created with Adobe Photoshop CS5.

4.4.9 Luciferase reporter assays

20,000 of the indicated cells were reverse transfected with 20 ng of the indicated luciferase reporter constructs using polyethylenimine (Polysciences, Inc., Cat. No. 23966). For complex formation, DNA and polyethylenimine stock solution (1 mg/mL in water) were diluted in Opti-MEM I (Invitrogen, Cat. No. 51985-026), to 20 µg/mL and 60 µg/mL, respectively. Both solutions were incubated for 10 min at room temperature, and then mixed at equal volumes (mass ratio polyethylenimine to DNA = 3:1; final polyethylenimine concentration = 30 µg/mL). After incubation for 20 min at room temperature, solutions were diluted 1:10 in DMEM (Invitrogen, Cat. No. 41966) to achieve a final DNA concentration of 1 µg/mL. 20 µL of the transfection mixes were added to each well of a 96-well plate, followed by the addition of 80 µL of cell suspension in DMEM (2.5 x 10^5 cells/mL). Where applicable, 16 hours after transfection of the luciferase reporter plasmids, the cells were further transfected with 50 nM of Pre-miR-361-5p or Pre-miR Negative Control #1 (Pre-miR-control; Applied Biosystems; see 7.2) by using siPORT NeoFX (Applied Biosystems) according to the manufacturer's recommendations. All transfections were performed in triplicate. Where applicable, the medium was replaced with fresh medium containing 1 µg/mL tetracycline (2 mg/mL stock in EtOH) or EtOH only after 24 hours. In all cases, the medium was aspirated 64 hours after the initial transfection. Cells were lysed with a mixture of 15 µL Luciferase Assay Reagent II (Promega) and 15 µL nuclease-free water (Invitrogen, Cat. No. 10977). Firefly luciferase activity was measured after 10 min. Subsequently, 15 µL Stop & Glo Reagent (Promega) were added and Renilla luciferase activity was measured after 10 min. Luciferase activity measurements were performed in an LMAX II 384 luminometer (Molecular Devices) with 5 seconds integration time. For each triplicate, the mean Renilla/firefly ratio was calculated.
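The dilution scheme and the normalization described above can be retraced with a few lines of arithmetic. The sketch below uses the concentrations and volumes stated in the text; the luminescence readings are hypothetical, and computing the mean of the per-well ratios (rather than the ratio of the means) is an assumption about how the triplicates were combined.

```python
from statistics import mean

# Complex formation: equal volumes of DNA (20 µg/mL) and PEI (60 µg/mL)
dna_after_mixing = 20.0 / 2        # µg/mL
pei_after_mixing = 60.0 / 2        # µg/mL
pei_to_dna_mass_ratio = pei_after_mixing / dna_after_mixing

# 1:10 dilution in DMEM, then 20 µL of transfection mix per well
dna_final = dna_after_mixing / 10  # µg/mL, equivalent to ng/µL
dna_per_well_ng = dna_final * 20   # ng of reporter plasmid per well

# 80 µL of cell suspension at 2.5 x 10^5 cells/mL per well
cells_per_well = 2.5e5 * 80 / 1000


def mean_renilla_firefly_ratio(renilla, firefly):
    """Mean of the per-well Renilla/firefly ratios of one triplicate."""
    return mean([r / f for r, f in zip(renilla, firefly)])


# Hypothetical luminescence readings of one triplicate
ratio = mean_renilla_firefly_ratio([100.0, 200.0, 300.0], [50.0, 100.0, 100.0])
print(dna_per_well_ng, cells_per_well, ratio)
```

The computed 20 ng of DNA and 20,000 cells per well agree with the amounts stated at the beginning of the protocol.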

4.4.10 Enzyme-linked immunosorbent assay

20,000 A431 or HaCaT cells were reverse transfected with 10, 30 or 100 nM Pre- or Anti-miR-361-5p or Pre-/Anti-miR Negative Control #1 (Applied Biosystems; see 7.2) with siPORT NeoFX (Applied Biosystems) according to the manufacturer's recommendations. Transfections were performed in triplicate in 96-well plates. 24 hours after transfection, supernatants were collected and centrifuged to remove cell debris (1000 g for 3 min at room temperature). VEGFA protein levels were determined using the Human VEGF-A Platinum ELISA kit (eBioscience, Cat. No. BMS277) according to the manufacturer's recommendations. After subtraction of blank values, triplicates were averaged and quantified using a standard curve prepared from serial dilutions of purified VEGFA.
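The quantification steps (blank subtraction, averaging, read-out from a standard curve) can be sketched as follows. All numbers are hypothetical, and linear interpolation between neighboring standards is an assumption made for simplicity; the curve-fitting procedure recommended by the kit manufacturer may differ.

```python
from statistics import mean


def interpolate_concentration(signal, standards):
    """Linearly interpolate a concentration from (concentration, signal) standards."""
    points = sorted(standards, key=lambda p: p[1])
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s1 == s0:
            continue  # skip degenerate segments
        if s0 <= signal <= s1:
            return c0 + (signal - s0) * (c1 - c0) / (s1 - s0)
    raise ValueError("signal outside the range of the standard curve")


def quantify(replicates, blank, standards):
    """Subtract the blank, average the replicates, read off the standard curve."""
    corrected = [value - blank for value in replicates]
    return interpolate_concentration(mean(corrected), standards)


# Hypothetical blank-corrected standards: (VEGFA in pg/mL, absorbance)
STANDARDS = [(0.0, 0.0), (100.0, 0.5), (200.0, 1.0)]

vegfa = quantify(replicates=[0.80, 0.85, 0.90], blank=0.10, standards=STANDARDS)
print(round(vegfa, 1))
```

Raising an error for signals outside the standard range avoids silent extrapolation, which would otherwise produce concentrations that the curve does not support.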

4.5 Contributions

André P. Gerber, Michael Detmar, Jochen Imig, Alessia Galgano, Jonathan Hall (Institute of Pharmaceutical Sciences, ETH Zurich), Günther F. L. Hofbauer, and Piotr J. Dziunycz (Department of Dermatology, University Hospital Zurich, Zurich) have contributed to the conception or design of experiments or provided helpful discussions. Jochen Imig, under the supervision of André P. Gerber and Jonathan Hall, performed the flow cytometry analysis of small RNA-transfected A431 and HaCaT cells. Piotr J. Dziunycz, under the supervision of Günther F. L. Hofbauer, provided the healthy skin and cutaneous squamous cell carcinoma RNA samples and performed the RNA quality control experiments. Michael Forrer (Institute of Pharmaceutical Sciences, ETH Zurich), under the supervision of Alessia Galgano and André P. Gerber, has generated the wild type CDKN1B luciferase reporter. Alexander Kanitz, under the supervision of André P. Gerber and Michael Detmar, performed all other experiments. Fabienne Bereiter and Alexander Svensson (Institute of Pharmaceutical Sciences), under the supervision of Alexander Kanitz and André P. Gerber, have helped with the generation of mutant CDKN1B and VEGFA luciferase reporters. Sinem Karaman (Institute of Pharmaceutical Sciences), under the supervision of Michael Detmar, has helped with statistical analyses. Martijn Kedde, under the supervision of Reuven Agami (NKI, Amsterdam, Netherlands), and Alexander Wepf, under the supervision of Matthias Gstaiger (Institute of Molecular and Systems Biology), have contributed plasmids and cell lines as indicated in the main text.


5 A Novel RNA Tandem Affinity Tag for the Purification of Ribonucleoprotein Particles

5.1 Introduction

In eukaryotes, a large number of regulatory programs govern all major events in the entire life span of messenger and non-coding RNAs (see 2.1.1). The underlying ‘code’, in the form of cis-acting sequence and structural elements, is stored in the RNA itself (see 2.1.2), while interpretation and execution of the programs require the help of additional protein and RNA factors acting in trans (see 2.1.3). Together, these proteins and RNAs form RNPs, highly dynamic macromolecular complexes that represent the functional units of PTGR (see 2.1.4). RNPs and their components are organized in intricate, densely meshed post-transcriptional gene regulatory networks (GRNs) that orchestrate and coordinate the execution of regulatory programs simultaneously, for the whole transcriptome, in the correct spatiotemporal context and in response to continuously changing environmental stimuli (see 2.2). In addition to complex loop circuits and cooperation, GRN architecture heavily relies on combinatorial control principles to accomplish this feat: A single trans-acting factor generally binds multiple RNAs (2.2.1.1); in turn, a single RNA is generally bound by multiple trans-acting factors (2.2.1.2). Thanks to a range of elegant affinity purification-based methodologies for RNP analysis that have been developed and continuously improved over the last decade or so, several streamlined protocols now exist for the identification of RNA species that are bound by a particular trans-acting (RNA-binding) protein (see 2.3.1).

Unfortunately, a widely applicable approach for the systematic identification of the trans-acting factors binding a particular RNA is currently unavailable. Such a methodology, i.e. one that allows the purification of RNPs via their RNAs, would enable us to directly and comprehensively examine the post-transcriptional combinatorial control exerted on a given RNA. For the study of cancers, which arise from the dysregulation of a relatively small number of genes, such a ‘gene-centered’ ribonomics approach would therefore be highly desirable (see 2.4). Several attempts at establishing such a procedure have been made, but they generally suffer from lack of sensitivity, low efficiency or their artificial or complicated setup (see 2.3.2).

Based on the shortcomings of current approaches and the complexity of RNP dynamics, we believe that a powerful, yet versatile approach towards the affinity purification of RNA molecules and the identification of associated proteins and RNAs has to guarantee that RNP formation occurs unimpeded, in its native conformation and location inside the cell (i.e. in situ), and with as little disturbance to the cells as possible. It therefore has to rely on structural determinants that can be expressed and purified directly from lysed cells. Apart from escaping the additional, potential interference with normal RNP formation caused by the binding of a bait RBP, an RNA-based biochemical purification approach has the further advantages of eliminating additional background generated by proteins associating with the bait protein, as well as the perturbations to the cell introduced by its (ectopic) expression.

In this study, we set out to design a widely applicable, convenient RNA affinity tag system for the purification of in situ-formed RNPs that is sensitive, yet highly specific. To this end, we devised a tandem tag system consisting of an aptamer and an exposed oligonucleotide. In the application of the system, tagged RNPs would first be pre-purified from crude cell lysate with high efficiency via the aptamer and then subjected to an additional purification step based on the discriminative power of antisense hybridization. Finally, to maximize robustness, applicability and efficiency of this bifunctional tag system, we rationally designed a scaffold that secures exposure of the selected structural determinants.


5.2 Results

5.2.1 Aptamer selection

For an initial pre-purification step, we surveyed the literature for specific RNA aptamers that had been repeatedly shown to be amenable to direct affinity purification through a ligand-coated matrix. The resulting shortlist of aptamers includes the dextran B512/Sephadex aptamer D8, the tobramycin aptamer J6f1, the streptavidin aptamer S1, and the streptomycin aptamer stII (Table 5.1 and Figure 5.1 A to D), and was further evaluated according to the following selection criteria: (a) In order to prevent the final tag from becoming excessively long, the length of the aptamer sequence should preferably be below 50 nt. (b) To guarantee high efficiency of the aptamer-based pre-purification step, the dissociation constant (Kd) of aptamer and ligand should be 10^-7 M or lower. (c) The elution should be gentle, so as not to interfere with the second purification step or downstream applications, and, in order to minimize the co-elution of unspecifically bound proteins and RNAs, ideally occur in a competitive manner.

Table 5.1 Selected RNA aptamers. Names, lengths, ligands, dissociation constants (Kd), elution methods and references to the original description of aptamers that have previously been used for the purification of RNPs are indicated. For the D8 and S1 aptamers, the indicated lengths refer to those of the full-length aptamers and the minimal motifs (in brackets), respectively. See 7.1.4 in the appendix for the corresponding sequences.

| Aptamer | Length (nt) | Ligand | Kd (nM) | Elution | Original description |
|---|---|---|---|---|---|
| D8 | 84 (40) | Dextran B512 | 250 | Dextran B512 | Srisawat et al., 2001; Srisawat and Engelke, 2002 |
| J6f1 | 40 | Tobramycin | 5 | Tobramycin | Hamasaki et al., 1998 |
| S1 | 84 (44) | Streptavidin | 70 | D-Biotin | Srisawat and Engelke, 2001 |
| stII | 46 | Streptomycin | 1000 | Streptomycin | Bachler et al., 1999 |

While the D8 (Figure 5.1 A) and S1 aptamers both have a length of 84 nt in their original form, functional “minimal motifs” of less than 50 nt have been described for each (Srisawat and Engelke, 2001; Srisawat and Engelke, 2002), so that all aptamers fulfill the length requirements. With a Kd in the low nanomolar range (5 nM), the J6f1 aptamer (Figure 5.1 B) promises the highest affinity for its ligand, followed by the S1 aptamer (70 nM; Figure 5.1 C). The other two aptamers, D8 (250 nM) and stII (1000 nM; Figure 5.1 D), do not fulfill our binding affinity requirements. While all aptamers are amenable to native elution methods, only the S1 aptamer is not released from the matrix by competition with free (aptamer) ligand; instead, it is displaced upon the addition of streptavidin’s natural ligand D-biotin.
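The evaluation of the shortlist against criteria (a) and (b) can be summarized in a short sketch. The lengths (minimal motifs where available) and Kd values are taken from Table 5.1; the data structure and function names are illustrative only, and criterion (c), the elution mode, is not encoded because it is a qualitative judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aptamer:
    name: str
    length_nt: int   # minimal motif where one has been described
    kd_nm: float     # dissociation constant of aptamer and ligand, in nM


SHORTLIST = [
    Aptamer("D8", 40, 250),
    Aptamer("J6f1", 40, 5),
    Aptamer("S1", 44, 70),
    Aptamer("stII", 46, 1000),
]

MAX_LENGTH_NT = 50   # criterion (a): length below 50 nt
MAX_KD_NM = 100      # criterion (b): Kd of 10^-7 M (100 nM) or lower


def passes_criteria(aptamer):
    """Check the length and affinity criteria for one aptamer."""
    return aptamer.length_nt < MAX_LENGTH_NT and aptamer.kd_nm <= MAX_KD_NM


candidates = [a.name for a in SHORTLIST if passes_criteria(a)]
print(candidates)
```

Only J6f1 and S1 remain after filtering, so the final choice between them rests on considerations beyond these two criteria.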

Figure 5.1 Secondary structures of RNA aptamers. The predicted centroid secondary structures of the following RNA aptamers are schematically depicted: (A) D8 (minimal motif; Srisawat and Engelke, 2002), (B) J6f1 (Hamasaki et al., 1998), (C) S1 (minimal motif; Srisawat and Engelke, 2001), and (D) stII (Bachler et al., 1999). Predictions are based on the reported aptamer sequences (see 7.1.4) and were made with RNAfold (Gruber et al., 2008). Calculated free energies for the depicted structures are indicated. Dots indicate the 5’-termini.

Despite the unfavorable Kd between aptamer and ligand compared to the J6f1 aptamer, we chose to include the S1 aptamer for our tandem tag system, because with -14.83 kcal/mol, the predicted centroid secondary structure (the structure that is closest to the whole thermodynamic ensemble) of the S1 aptamer exhibits the lowest calculated free energy out of all shortlisted aptamers (Figure 5.1), suggesting a strong characteristic fold. This should improve the universal applicability of the tag system as it is expected that its fold, and thus its function, will not be impeded by the presence of the bait RNA. Moreover, the affinity of streptavidin to its natural ligand D-biotin (Kd ~10^-14 M; Green, 1990) is among the strongest non-covalent interactions known, and thus should allow captured RNPs to be eluted with utmost efficiency and specificity, minimal amounts of competitor, and without a chance to rebind the matrix. Finally, a variety of streptavidin-coated matrices are readily available from various commercial suppliers.

5.2.2 Oligonucleotide selection

Short oligonucleotides are able to bind RNA molecules with great discriminating power in a sequence-specific manner. For a second, high specificity purification step, we therefore decided to exploit the antisense base-pairing characteristics of oligonucleotides. Biotinylated 2’-O-methylated (2’-O-Me) RNA oligonucleotides have already been successfully employed for the affinity purification of RNPs (see 2.3.2.2), but the general applicability of such approaches is hampered by the immense amounts of cellular material required for the analysis of RNP composition. We reasoned, however, that the reduction in sample complexity achieved by the aptamer-mediated pre-purification step might greatly enhance the efficacy of such an approach. In order to maximize the efficiency and specificity of a hybridization-based purification strategy, the targeted sequence in the bait RNA should fulfill the following conditions: (a) The sequence should be free of strong internal secondary structures. (b) Extensive similarity or complementarity to other nucleotide sequences present in human cells should be avoided.

For our tandem tag system to be of universal use, we set out to design such an RNA sequence and incorporate it into the tag itself, instead of scanning potential bait RNAs for regions that might fulfill these criteria. For this purpose we used RNA Designer, a program that designs RNA sequences that fold into specific secondary structures based on user-submitted parameters (Andronescu et al., 2004). Using only unpaired bases and no sequence constraints as input, we obtained five short (15 nt) and five medium-length (25 nt) oligonucleotides with calculated minimum free energies (MFE) of 0 kcal/mol (Table 5.2). We then performed BLAST (Altschul et al., 1990) searches against the set of human reference RNAs for each of the sequences and disregarded the sequences with the lowest E values, and thus the highest chances of cross-hybridization (15v1, 15v2, 25v1, and 25v5). For each of the remaining sequences, we then considered the calculated probability of the MFE structure in the thermodynamic ensemble (RNA Designer), which should give a rough indication of the degrees of (structural) freedom of the set of likely secondary structures. We reasoned that sequences with higher structural dynamics (i.e. lower probabilities of the MFE structure) might be more inclined to commit to binding in order to stabilize their configuration. Out of the sequences with the lowest probabilities of the MFE structure in the thermodynamic ensemble, 15v3 for the 15-mers (35.5%) and 25v2 for the 25-mers (13.4%), we decided to use the latter, longer sequence for initial “proof-of-principle” experiments. To specifically capture the 25v2 oligonucleotide by hybridization, we synthesized an antisense 5’-aminohexyl-3’-Cy3-linked DNA/2’-O-Me RNA hybrid oligonucleotide (25v2-as; Figure 5.2 A; see 7.1.4). The amino linker is supposed to mediate crosslinking to carboxylated microbeads via the formation of peptide bonds (Figure 5.2 B). The DNA spacer was designed to allow specific elution of captured, tagged RNAs via cleavage with EcoRI or an unspecific DNase. The 2’-O-Me-substituted RNA residues were included to increase the stability of the oligonucleotide and are fully complementary to the 25v2 oligonucleotide in the affinity tag. Finally, incorporation of the Cy3 fluorophore should allow monitoring of bead coupling and DNase-mediated 25v2-as cleavage.

Table 5.2 Unstructured RNA sequences. As predicted by RNA Designer (Andronescu et al., 2004). Name, length (nt), GC content (%), MFE, the probability of the MFE structure in the ensemble, and the lowest E value of a MegaBLAST search against human reference RNA sequences (Altschul et al., 1990) are indicated for each of five 15- and 25-mer RNA sequences. Refer to 7.1.4 in the appendix for the corresponding sequences. # MegaBLAST search against human reference RNA sequences (accessed: October 13th, 2011).

Name    Length (nt)    GC content (%)    MFE (kcal/mol)    Probability of MFE structure in ensemble (%)    Lowest E value #
15v1    15             40.0              0.00              86.3                                            0.76
15v2    15             46.7              0.00              97.2                                            0.76
15v3    15             53.3              0.00              35.5                                            12
15v4    15             66.7              0.00              66.3                                            12
15v5    15             53.3              0.00              71.4                                            12
25v1    25             44.0              0.00              43.0                                            0.24
25v2    25             52.0              0.00              13.4                                            3.8
25v3    25             60.0              0.00              20.5                                            3.8
25v4    25             52.0              0.00              53.2                                            3.8
25v5    25             48.0              0.00              51.3                                            0.96
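The two-stage selection of the tag oligonucleotide (exclusion by BLAST E value, then ranking by the probability of the MFE structure) can be retraced with a short Python sketch. The values are transcribed from Table 5.2. The numerical cut-off `E_VALUE_CUTOFF` is an assumption made here for illustration only: the text names the excluded sequences but does not state a threshold, and a value of 1 merely happens to reproduce those four exclusions. The function names are likewise hypothetical.

```python
# Values transcribed from Table 5.2:
# (name, length in nt, probability of MFE structure in %, lowest E value)
CANDIDATES = [
    ("15v1", 15, 86.3, 0.76),
    ("15v2", 15, 97.2, 0.76),
    ("15v3", 15, 35.5, 12.0),
    ("15v4", 15, 66.3, 12.0),
    ("15v5", 15, 71.4, 12.0),
    ("25v1", 25, 43.0, 0.24),
    ("25v2", 25, 13.4, 3.8),
    ("25v3", 25, 20.5, 3.8),
    ("25v4", 25, 53.2, 3.8),
    ("25v5", 25, 51.3, 0.96),
]

# Assumption: no explicit cut-off is given in the text. E < 1 is used here
# only because it reproduces the exclusions that the text lists.
E_VALUE_CUTOFF = 1.0


def excluded_names(candidates, e_cutoff=E_VALUE_CUTOFF):
    """Names of sequences dropped for a low E value (risk of cross-hybridization)."""
    return [name for name, _len, _p, e_value in candidates if e_value < e_cutoff]


def select_candidates(candidates, e_cutoff=E_VALUE_CUTOFF):
    """Per length class, pick the retained sequence with the lowest P(MFE)."""
    best = {}
    for name, length, p_mfe, e_value in candidates:
        if e_value < e_cutoff:
            continue  # excluded in the BLAST step
        if length not in best or p_mfe < best[length][1]:
            best[length] = (name, p_mfe)
    return {length: name for length, (name, _p) in best.items()}


print(excluded_names(CANDIDATES))
print(select_candidates(CANDIDATES))
```

Under this assumed cut-off, the sketch excludes 15v1, 15v2, 25v1, and 25v5 and retains 15v3 and 25v2 as the most structurally dynamic 15-mer and 25-mer, in line with the choice described in the text.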

Figure 5.2 Antisense oligonucleotide 25v2-as and coupling to microbeads. (A) Schematic representation of the 25v2-as oligonucleotide used for the antisense-based purification of tagged transcripts. 25v2-as contains a 5’-terminal aminohexyl group (NH2-C6), a single-stranded DNA linker region (yellow box) with an EcoRI recognition site (red, italicized), a 25-mer 2’-O-Me RNA sequence (light red box) complementary to the 25v2 region in the tag, and a 3’-terminal Cy3 fluorophore (green circle). The 25v2 region of the tag that is complementary to 25v2-as is indicated (green box). (B) Coupling of 25v2-as antisense oligonucleotides to carboxyl microbeads (grey). Dehydration (blue box) leads to the formation of a peptide bond between the amino group (NH2) and the carboxyl group (COOH), covalently linking the oligonucleotide to the beads.

5.2.3 Arrangement of the HAMMER tandem affinity tag system

In order for the tandem tag system to be widely and universally applicable, both the aptamer and the oligonucleotide should always fold into the same secondary structure, regardless of the nature of the bait RNA and the relative insertion position of the tag. Furthermore, it should be ensured that the structural determinants of the tag are exposed in order for them to interact with their matrix-coupled counterparts. We therefore aimed to design a scaffold for the presentation of the structural determinants that is (a) flexible enough not to interfere with their folding dynamics, and (b) strong and rigid enough to protect them from misfolding due to interaction with the bait RNA (Iioka et al., 2011). For the final arrangement of aptamer and oligonucleotide, we therefore introduced a stem structure consisting of a stretch of 25 guanine residues opposing two stretches of cytosine residues (10 and 15 nt). The S1 minimal motif is situated at the end of this stem, while the 25v2 oligonucleotide extends from a side-stem 15 nt into the main stem structure. Figure 5.3 A depicts the predicted centroid secondary structure. We chose to rely on G-C base pairs for the stem structure, owing to their favorable stacking interaction and hydrogen bonding characteristics compared to A-T base pairs in DNA (Sponer et al., 2002; Yakovchuk et al., 2006). Furthermore, we reasoned that the use of a continuous arrangement of guanines on one strand and cytosines on the other, instead of alternating base pairs, should prevent misalignment by allowing the stem structure to fold even if structural constraints resulting from the folding of the bait RNA require a shift of the base pairing in the stem. The final tag layout, containing the aptamer and oligonucleotide determinants as well as the stem structure, is referred to as the ‘HAMMER tandem affinity tag system’ or ‘HAMMER tag’. Predicted base-pairing probabilities of nucleotides are high across the whole structure, particularly within the stem (>0.5), suggesting that the tandem tag will likely fold into a strong characteristic structure (Figure 5.3 B). By embedding the S1 aptamer and 25v2 oligonucleotide between restriction sites, we have designed the tag system in a modular way, thus allowing the convenient exchange of the structural determinants while keeping their general position and orientation with respect to each other and the stem (Figure 5.3 C).


Figure 5.3 The HAMMER tandem affinity tag. (A and B) Centroid secondary structure of the HAMMER tandem affinity tag as predicted by RNAfold (Gruber et al., 2008). Dots indicate the 5’-terminus. The stem region (orange), the S1 aptamer cassette (blue), and the oligonucleotide cassette (oligonucleotide: 25v2; green) are highlighted (A). Nucleotides are color-coded according to base-pair probabilities (RNAfold; B). (C) Schematic representation of the HAMMER tandem affinity tag, showing the GC-rich stem region (orange), the S1 aptamer cassette (blue), and the oligonucleotide cassette (oligonucleotide: 25v2; green), together with the corresponding DNA sequence in the sense direction. Restriction sites that can be used to exchange the S1 and oligonucleotide cassettes (ClaI and NheI, respectively) are indicated.

5.2.4 Purification strategy

The HAMMER tag is designed for use in a highly versatile, specific and universal tandem purification procedure which should allow the identification of protein and RNA constituents of RNPs forming on tagged RNAs by mass spectrometry and sequencing methods, respectively. A generalized schematic representation of the tandem purification process is outlined in Figure 5.4. For the chosen aptamer and oligonucleotide, the envisioned procedure includes four steps that independently introduce specificity:
1. Pre-purification of crude cell extracts through the binding of the S1 aptamer to matrix-coupled streptavidin.
2. Competitive elution by displacement of the S1 aptamer by D-biotin.
3. Hybridization between 25v2 and matrix-coupled 25v2-as oligonucleotides.
4. Elution by DNase-mediated cleavage of 25v2-as.
Through the generation of carefully designed control reagents, it should be possible to establish each of these steps independently.

Figure 5.4 Purification of HAMMER-tagged transcripts. Schematic representation of the generalized, envisioned two-step process of RNP purification via the HAMMER tandem affinity tag. Lysates of cells expressing transcripts encoding enhanced green fluorescent protein (eGFP; green) and HAMMER (yellow), either with (right) or without (control; left) an RNA species of interest (red), are first purified via binding of the aptamer to microbeads coated with ligand (brown). HAMMER-tagged RNPs, including the RNAs and proteins that specifically associate with the RNA of interest (blue), are captured, together with all molecules that bind to eGFP, HAMMER, ligand or microbeads (grey). The majority of proteins and RNPs (grey) do not bind the ligand and are washed away. In a next step, the eluate is further purified via microbeads coupled to oligonucleotides (magenta) that are fully complementary to an exposed region in the HAMMER tag. This second step reduces the amount of unspecifically bound molecules and eliminates those proteins and RNPs that were specifically bound by the ligand in the previous step. The final eluate should be strongly enriched in HAMMER-tagged RNPs, the components of which can then be identified by (quantitative) mass spectrometry and high-throughput sequencing methods. Comparison with the control should allow the distinction of RNAs and proteins that specifically associate with the RNA species of interest from those that interact specifically or unspecifically with eGFP, the HAMMER tag, or the microbeads.

5.2.5 Plasmid generation

In order to establish the purification procedure, we first generated a set of plasmid constructs for the expression of HAMMER-tagged RNAs (Figure 5.5 A to D). The sequence encoding the HAMMER tag was cloned into the pcDNA-5-based mammalian expression vector pTO-HA-Strep-GW-FRT (see 7.7) under the control of a tetracycline-inducible CMV promoter (pTO-HAMMER; Figure 5.5 A). To facilitate the introduction of bait RNA sequences upstream of, downstream of, or encompassing HAMMER, we incorporated multiple cloning sites flanking the tandem tag. For the generation of cell lines stably expressing HAMMER-tagged RNAs, the plasmid further contains a Flippase recognition target (FRT) site, allowing convenient recombinase-mediated single-site genomic integration into a range of commercially available cell lines or, after the introduction of an FRT site into a transcriptionally active locus, a cell line of choice. An inducible promoter was chosen to allow the precise timing of the expression of HAMMER-tagged RNAs, thus minimizing potential adverse effects on transfected cells. Furthermore, it should ensure that the tag system can be used even when studying RNAs that give rise to gene products that may be toxic to cells. This is especially useful for cell lines stably expressing HAMMER-tagged RNAs.

To enable us to conveniently check for the expression of the HAMMER tag construct, we generated a variant of pTO-HAMMER in which we introduced the coding sequence of eGFP upstream of the HAMMER tag (pTO-eGFP-HAMMER; Figure 5.5 B). Several proteins and RNAs have been shown to regulate the expression of CDKN1B post-transcriptionally by binding regions in its 3’-UTR (see 2.4.2). For a first “proof of principle”, we therefore subcloned the CDKN1B 3’-UTR into the multiple cloning site downstream of the HAMMER tag (pTO-eGFP-HAMMER-CDKN1B-3’-UTR; Figure 5.5 C). Finally, we subcloned the eGFP-HAMMER cassette from pTO-eGFP-HAMMER into pBlueScript SK+, allowing us to generate in vitro transcripts of eGFP or eGFP-HAMMER from its T7 promoter (pBS-SK+-eGFP-HAMMER; Figure 5.5 D). With the help of these constructs, it should be possible to establish several parameters of the purification protocol without having to cope with the problems arising from the use of complex crude cell lysates for initial tests.

Figure 5.5 HAMMER plasmid constructs. The following plasmids containing the HAMMER tandem affinity tag have been generated: (A) pTO-HAMMER, (B) pTO-eGFP-HAMMER, (C) pTO-eGFP-HAMMER-CDKN1B-3’-UTR, and (D) pBS-SK+-eGFP-HAMMER. The positions and orientations of relevant DNA elements with respect to the HAMMER tandem affinity tag (blue) are indicated. The plasmids based on pTO-HA-Strep-GW-FRT (A to C; see 7.7) contain a chimeric tet operator/CMV minimal promoter (CMV/tetO; orange), allowing the tetracycline-inducible expression of HAMMER-containing transcripts in mammalian cells (arrows). Transcription is terminated by the bovine growth hormone (bGH) terminator (A to C; orange). The T7 promoter in the plasmid based on pBlueScript-SK+ (D; Stratagene, discontinued) allows generation of eGFP and eGFP-HAMMER in vitro transcripts after digestion with EcoRV and NotI, respectively. The eGFP expression cassette including the Kozak sequence (B to D; dark and light green, respectively), the CDKN1B 3’-UTR (C; dark red), and multiple cloning sites (MCS; A to D) are indicated. DNA sequences in the sense direction are given for the restriction sites (yellow) and the eGFP Kozak sequence. Restriction sites that have been used to insert the HAMMER tandem affinity tag (A), eGFP (B), the CDKN1B 3’-UTR (C), and eGFP-HAMMER (D; from B) are highlighted in light red. The XhoI restriction site is not unique in (D; yellow and red stripes). To improve readability, overlapping restriction sites were omitted, and only one restriction enzyme was indicated for restriction sites that are recognized by more than one enzyme.

5.2.6 Secondary structures of HAMMER-tagged RNAs are largely unaffected

To estimate the impact of fused RNAs on the folding of the HAMMER tag, and vice versa, we predicted the centroid secondary structures of the HAMMER-containing transcripts encoded on pTO-eGFP-HAMMER and pTO-eGFP-HAMMER-CDKN1B-3’-UTR, and compared them to the corresponding sequences without the HAMMER tag (Figure 5.6 A to D). More than 80% of the predicted eGFP secondary structure (Figure 5.6 A) is not affected by the introduction of the HAMMER tag (Figure 5.6 B); only a small region in the immediate vicinity of the tag, lacking longer stretches of nucleotides with high base-pairing probabilities, exhibits a changed fold. As expected, the HAMMER tag folds into its characteristic shape. Furthermore, it considerably reduces the free energy of the overall structure (-263.8 kcal/mol and -196.3 kcal/mol for the tagged and untagged variants, respectively).

While the predicted structural changes resulting from introducing the HAMMER tag between the coding region and the 3’-UTR of an eGFP-CDKN1B-3’-UTR transcript are stronger than for the eGFP-only transcript, they are also found mainly in the vicinity of the tag insertion site and predominantly affect secondary structure elements without a high density of nucleotides with high base-pairing probabilities (Figure 5.6 C and D). A large fraction of characteristic secondary structure motifs remains unaffected. As with the eGFP transcript, introduction of the HAMMER tag decreases the free energy of the centroid structure (-416.7 kcal/mol and -391.4 kcal/mol for the tagged and untagged variants, respectively), while the structure of the tag itself is not altered.


Figure 5.6 Impact of HAMMER insertion on predicted secondary structures. Centroid secondary structures of transcripts eGFP (A), eGFP-HAMMER (B), eGFP-CDKN1B-3’-UTR (C), and eGFP-HAMMER-CDKN1B-3’-UTR (D), as predicted by RNAfold (Gruber et al., 2008). Calculated free energies for the depicted structures are indicated. Nucleotides are color-coded according to base-pair probabilities (RNAfold). Sequences encoding the HAMMER tandem affinity tag (yellow), the eGFP coding sequence (green) and the CDKN1B 3’-UTR (magenta) are highlighted. (B and D) Diagonal stripes denote regions in which the predicted secondary structures (from A and C, respectively) are not affected by incorporation of HAMMER.


When considering the predicted MFE structures instead of the centroid structures, the HAMMER-induced changes are even less pronounced. This is exemplified by a fragment of the CDKN1B 3’-UTR that contains several experimentally verified recognition elements for miRNAs and RNA-binding proteins (le Sage et al., 2007; Galardi et al., 2007; Kedde et al., 2007; Kedde et al., 2010). Magnification of the corresponding regions from Figure 5.6 C and D reveals that the structural context of two out of five recognition elements in the centroid secondary structures is affected by the presence of HAMMER (Figure 5.7 A and B, respectively), probably due to their close proximity to the site of the tag insertion (approximately 200-300 nt). In contrast, introduction of the tag has virtually no influence on the predicted MFE structures (Figure 5.7 C and D for the untagged and tagged variants, respectively).

Taken together, these results suggest that the HAMMER tag folds into a stable, characteristic secondary structure, irrespective of the presence of long sequence stretches adjacent to it. While parts of the tagged RNAs exhibit changes in the folding of structural motifs, these are mainly confined to regions exhibiting low frequencies of nucleotides with a high probability of base pairing and thus, presumably, unfavorable energetic properties. This is supported by the observation that structural changes are mostly found in the centroid structures, which are representative structures for the whole thermodynamic ensembles, while MFE structures are not affected. Additionally, the majority of these changes are confined to regions in close vicinity to the inserted tag. Overall, it appears that the majority of characteristic secondary structure motifs in either of the tested transcripts are not affected by tag insertion, suggesting that the potential binding of proteins and RNAs to these elements is likely not impeded. Additionally, introduction of the HAMMER tag lowered the calculated free energies of predicted centroid structures by 6% and 34% (eGFP-CDKN1B-3’-UTR and eGFP-only transcripts, respectively), and might thus even help to stabilize such motifs.
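The quoted percentages follow from the centroid free energies given in 5.2.6. The short Python sketch below retraces the arithmetic; it assumes that the change is expressed relative to the untagged transcript, which is not stated explicitly but reproduces the reported values. The function name is hypothetical.

```python
# Centroid free energies (kcal/mol) as reported in section 5.2.6.
FREE_ENERGIES = {
    "eGFP": {"tagged": -263.8, "untagged": -196.3},
    "eGFP-CDKN1B-3'-UTR": {"tagged": -416.7, "untagged": -391.4},
}


def relative_stabilization(dg_tagged, dg_untagged):
    """Percent by which the tagged free energy is lower than the untagged one.

    Assumption: the untagged transcript serves as the reference (denominator).
    """
    return (dg_tagged - dg_untagged) / dg_untagged * 100.0


for transcript, dg in FREE_ENERGIES.items():
    percent = relative_stabilization(dg["tagged"], dg["untagged"])
    print(f"{transcript}: {percent:.1f}% lower free energy with HAMMER")
```

With these inputs the sketch yields roughly 34% for the eGFP-only transcript and roughly 6% for the eGFP-CDKN1B-3’-UTR transcript.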


Figure 5.7 Comparison between predicted centroid and minimum free energy secondary structures. Centroid and MFE secondary structures of transcripts eGFP-CDKN1B-3’-UTR and eGFP-HAMMER-CDKN1B-3’-UTR were predicted using RNAfold (Gruber et al., 2008). A fragment of the CDKN1B 3’-UTR containing recognition elements for miR-221/222 (green; le Sage et al., 2007), Dnd1 (red; Kedde et al., 2007), and Pum1 (blue; Kedde et al., 2010) is depicted. (A) Centroid structure of eGFP-CDKN1B-3’-UTR fragment. (B) Centroid structure of eGFP-HAMMER-CDKN1B-3’-UTR fragment. (C) MFE structure of eGFP-CDKN1B-3’-UTR fragment. (D) MFE structure of eGFP-HAMMER-CDKN1B-3’-UTR fragment.

5.2.7 HAMMER-tagged RNAs are expressed in transiently transfected cells

In order to test whether the incorporation of HAMMER interferes with the expression of tagged RNAs, we transiently transfected Flp-In-293 cells with pTO-eGFP-HAMMER and analyzed eGFP mRNA levels by qRT-PCR, using two different primer pairs (eGFP-v2 and eGFP-v3). In both cases, transfection of pTO-eGFP-HAMMER leads to a strong and significant increase in eGFP expression compared to mock-transfected cells (Figure 5.8 A), even without induction of the CMV/tetO promoter (fold changes of 136.9 (+11.7/-10.8) and 100.9 (+2.3/-2.2) for eGFP-v2 and eGFP-v3, respectively; P = 5.1 x 10^-5 and 2.2 x 10^-7; unpaired t-test, two-tailed). Induction with tetracycline (2 µg/mL, 24 h) leads to a further increase in detected eGFP levels (fold changes between eGFP-HAMMER- and mock-transfected cells of 1295.1 (+171.1/-151.2) and 1037.3 (+119.0/-106.8) for eGFP-v2 and eGFP-v3, respectively; P = 4.4 x 10^-9 and 1.6 x 10^-5; unpaired t-test, two-tailed). This corresponds to fold changes between tetracycline- and vehicle-treated cells of 9.5 (+1.3/-1.1) and 10.3 (+1.2/-1.1) for eGFP-v2 and eGFP-v3, respectively (P = 2.2 x 10^-4 and 2.8 x 10^-4; unpaired t-test, two-tailed). The CMV/tetO promoter appears to allow residual expression of the downstream expression cassette even in the absence of tetracycline, as the detected transcript levels account for ~10% of those measured in tetracycline-treated samples. Furthermore, apparent residual expression of eGFP in mock-transfected cells suggests that neither eGFP primer pair is entirely specific.
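The asymmetric error bounds on the fold changes are what one would expect if the standard deviation is symmetric on the log2 (CT) scale and is then converted to the linear scale. The Python sketch below illustrates this using the reported eGFP-v2 values. The 2^-ΔΔCT conversion and the back-calculated log2 standard deviation are assumptions made here for illustration, since the text does not spell out the calculation, and the function names are hypothetical.

```python
import math


def fold_change_bounds(ddct, sd):
    """Convert a ddCT value with a symmetric SD into a fold change with
    asymmetric linear-scale bounds: returns (fold, plus, minus)."""
    fold = 2.0 ** (-ddct)
    upper = 2.0 ** (-(ddct - sd))
    lower = 2.0 ** (-(ddct + sd))
    return fold, upper - fold, fold - lower


def implied_log2_sd(fold, plus):
    """Back-calculate the log2-scale SD implied by a reported upper bound."""
    return math.log2((fold + plus) / fold)


# Reported for eGFP-v2 without induction: fold change 136.9, +11.7 / -10.8.
reported_fold, reported_plus = 136.9, 11.7
sd = implied_log2_sd(reported_fold, reported_plus)   # assumption, see lead-in
ddct = -math.log2(reported_fold)
fold, plus, minus = fold_change_bounds(ddct, sd)
print(f"fold {fold:.1f} (+{plus:.1f}/-{minus:.1f})")

# Induction ratio: tetracycline-treated versus vehicle-treated fold changes.
for primer, induced, uninduced in [("eGFP-v2", 1295.1, 136.9),
                                   ("eGFP-v3", 1037.3, 100.9)]:
    print(primer, round(induced / uninduced, 1))
```

The lower bound recovered this way agrees with the reported value to within rounding, and the ratios of induced to uninduced fold changes match the reported 9.5 and 10.3, i.e. uninduced levels of roughly 10% of induced levels.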

Consistently, fluorescence microscopy (Figure 5.8 B) revealed that considerable fractions of Flp-In-293 cells transiently transfected with pTO-eGFP-HAMMER or pTO-eGFP-HAMMER-CDKN1B-3’-UTR, but not those that did not receive plasmid (mock), were brightly green upon tetracycline treatment (2 µg/mL; 24 h; approximately 40%, 25%, and 0%, respectively; estimated). Fractions of green fluorescent pTO-eGFP-HAMMER- and pTO-eGFP-HAMMER-CDKN1B-3’-UTR-transfected cells were considerably reduced upon treatment with vehicle (3% and 1.5%, respectively; estimated). Interestingly, cells that do express eGFP in the absence of tetracycline exhibit ostensibly similar fluorescence intensities to tetracycline-treated cells, suggesting that “promoter leakiness” may be restricted to a small subset of cells.


Figure 5.8 Expression of HAMMER transcripts in Flp-In-293 cells. Flp-In-293 cells were transfected with pTO-eGFP-HAMMER, pTO-eGFP-HAMMER-CDKN1B-3’-UTR (B only), or mock-transfected. The medium was supplemented with ethanol (EtOH; –) or tetracycline (2 µg/mL; +) and cells were analyzed for eGFP expression 24 hours after transfection. (A) qRT-PCR experiments were performed with two different eGFP primer pairs (eGFP-v2 and eGFP-v3) and the resulting CT values were normalized to those of ACTB. EtOH- or tetracycline-treated pTO-eGFP-HAMMER-transfected cells were further normalized to the corresponding mock-transfected cells. Fold changes in eGFP expression ± S.D. are indicated. Experiments were performed in triplicate. Two-tailed, unpaired t-tests were used to calculate P values (triple asterisks denote P values <0.001). (B) Fluorescence microscopy images showing eGFP-expressing cells (green). Nuclei (blue) were stained with Hoechst 33342 dye.


Taken together, the results indicate that HAMMER-tagged transcripts are readily expressed in Flp-In-293 cells. However, since the eGFP coding region is located upstream of HAMMER in pTO-eGFP-HAMMER and pTO-eGFP-HAMMER-CDKN1B-3’-UTR (Figure 5.5 B and C), it cannot be concluded that expression extends across the HAMMER RNA tandem tag itself.

5.2.8 Purification of HAMMER-tagged in vitro transcripts via hybridization to antisense oligonucleotides

To test whether HAMMER-tagged RNA can be purified by hybridization to an antisense oligonucleotide probe, we first coupled 25v2-as oligonucleotides to carboxylated polystyrene microbeads. We then incubated this matrix with eGFP-HAMMER in vitro transcript with covalently bound Cy5 in binding buffer containing either 100 mM, 250 mM or no NaCl. To assess the binding of the in vitro transcript to the microbeads, we recorded the absorption of supernatants at λ = 650 nm (A650; Cy5) and 260 nm (A260; RNA concentration) at 0 min (“input”; t0), after incubation for 10 min at 30°C (t10), and after an additional 2 h at 4°C (t130; Figure 5.9 A and B).

The measured A650 values were reduced for both time points and all salt concentrations when compared to the input (approximately 18%, 36%, and 40% decrease at t10, and 35%, 76%, and 80% at t130, for 0 mM, 100 mM and 250 mM NaCl, respectively). Consistently, similar decreases were recorded for the A260 measurements (approximately 15%, 42%, and 53% decrease at t10, and 59%, 85%, and 95% at t130, for 0 mM, 100 mM and 250 mM NaCl, respectively). These data indicate that 25v2-as-coupled beads are able to bind eGFP-HAMMER in vitro transcript. The variation in binding efficiency when using buffers with different NaCl concentrations suggests that binding occurs in a salt-dependent manner.


Figure 5.9 Antisense oligonucleotide purification of HAMMER in vitro transcripts. (A and B) In vitro transcripts of eGFP-HAMMER were enzymatically labeled with Cy5 and incubated with microbeads coupled to Cy3-labeled oligonucleotide 25v2-as in binding buffer containing different concentrations of sodium chloride (NaCl). Binding of transcripts to microbeads was assessed by determining free RNA levels in the supernatants (relative to input; in %) after different time points, by measuring either residual fluorescence (absorption at λ = 650 nm; A) or RNA concentration directly (absorption at λ = 260 nm; B). (C) Microbeads with bound eGFP-HAMMER in vitro transcripts from (A, B; 250 mM NaCl) were incubated at increasing temperatures. Release of Cy5 and Cy3 was assessed by measuring the absorption at λ = 650 nm and 550 nm, respectively. For each data set, a fitted sigmoidal curve is plotted. The corresponding melting temperature (TM) and coefficient of determination (R^2) are indicated. (D and E) As in (A) and (B), respectively, but data are from non-enzymatically labeled eGFP and eGFP-HAMMER in vitro transcripts. Two-tailed, unpaired t-tests were used to calculate P values (single, double and triple asterisks denote P values <0.05, <0.01, and <0.001, respectively). For clarity, asterisks were omitted for significantly reduced eGFP levels between time points t130 and t10 (*), and between time points t130 and t0 (**) in (E; refer to main text).

Next, we tested whether the bound RNA could be released from the beads. Previously, we had established that the 25v2-as oligonucleotide could not be cleaved from the beads by incubation with either EcoRI or DNase I (Felix Schnarwiler, Institute of Pharmaceutical Sciences, ETH Zurich; data not shown). Instead, we tested whether we could dissociate RNA:2’-O-Me RNA hybrids by heat. Aliquots of beads bound to eGFP-HAMMER transcripts were incubated for 5 min at different temperatures (T = 4, 35, 50, 58, 65, 72, and 95°C) in buffer containing 250 mM NaCl. Release of eGFP-HAMMER transcript and 25v2-as oligonucleotide was assessed spectrophotometrically by measuring the absorption of supernatants at λ = 650 nm (A650; Cy5) and 550 nm (A550; Cy3), respectively (Figure 5.9 C). Both the A650 and A550 values increased with temperature, indicating that the fluorophore-containing molecules are released into the supernatant. Recorded values closely followed sigmoidal curves with increasing temperature (R^2 = 0.97 and 0.99 for A650 and A550, respectively). The temperature at which 50% of the maximum release was achieved (melting temperature; TM) was lower for Cy5 (47.8°C) than for Cy3 (55.4°C). The results suggest that bound RNA can be efficiently released from antisense oligonucleotide-coated beads by heat, even when using relatively moderate temperatures and short treatment times.
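To make the definition of TM as the half-maximal release point concrete, the following Python sketch evaluates a Boltzmann-type sigmoid at the incubation temperatures used above. This is only an illustration: the exact functional form fitted to the data is not specified in the text, the slope factor `SLOPE` is a hypothetical placeholder, and the printed numbers are model outputs rather than measurements. Only the two TM values and the temperature series are taken from the text.

```python
import math

# Melting temperatures reported in the text (degrees Celsius).
T_M_CY5 = 47.8  # release of the Cy5-labeled eGFP-HAMMER transcript
T_M_CY3 = 55.4  # release of the Cy3-labeled 25v2-as oligonucleotide

# Hypothetical slope factor (degrees Celsius); NOT taken from the source.
SLOPE = 5.0

# Incubation temperatures used in the heat-release experiment.
TEMPERATURES = [4, 35, 50, 58, 65, 72, 95]


def release_fraction(temp_c, t_m, slope=SLOPE):
    """Boltzmann-type sigmoid: fraction of maximal release at temp_c.

    By construction the function equals 0.5 at temp_c == t_m, which is the
    definition of the melting temperature used in the text.
    """
    return 1.0 / (1.0 + math.exp((t_m - temp_c) / slope))


for temp in TEMPERATURES:
    cy5 = release_fraction(temp, T_M_CY5)
    cy3 = release_fraction(temp, T_M_CY3)
    print(f"{temp:>3} C  Cy5 model: {cy5:.3f}  Cy3 model: {cy3:.3f}")
```

In such a model the curve with the lower TM lies above the other at every temperature, which mirrors the observation that Cy5-labeled transcript is released at lower temperatures than the Cy3-labeled oligonucleotide.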

By comparing the binding efficiencies of eGFP transcripts with and without the HAMMER tag, we then tested whether binding of HAMMER-tagged RNA is specific. The corresponding in vitro transcripts were produced (Figure 5.5 D), labeled with non-covalently bound Cy5, and incubated with 25v2-as-coupled microbeads (Figure 5.2 B) in binding buffer containing 100 mM NaCl. Binding was assessed by measuring A650 and A260 after incubation for 10 min at 30°C (t10), and an additional 2 h at 4°C (t130; Figure 5.9 D and E). For the eGFP-HAMMER transcript, Cy5 levels in the supernatant (A650) dropped by approximately 42% between t0 and t10 and 13% between t10 and t130, adding up to a total decrease of 55% between t0 and t130 (P = 0.0079, 0.0379, and 2.9 x 10^-5, respectively; unpaired t-test, two-tailed). Consistently, the concentration of RNA in the supernatant (A260) decreased by approximately 35% between t0 and t10 and 25% between t10 and t130, accounting for a total drop of 60% between t0 and t130 (P = 0.0297, 0.0156, and 0.0257, respectively; unpaired t-test, two-tailed). A650 (first value) and A260 (second value) levels for the eGFP transcript also decreased, albeit to a lesser extent: 18% and 7% between t0 and t10, 18% and 22% between t10 and t130, and a total of 36% and 29% between t0 and t130 (P = 0.1373, 0.0915, 0.1499, 0.0379, 0.0882, and 0.0021, respectively; unpaired t-test, two-tailed). The decrease in Cy5 levels was around 2.3-fold (t10) and 1.5-fold (t130) higher for the transcript with the HAMMER tag (P = 0.0539 and 0.0098, respectively; unpaired t-test, two-tailed). With fold changes of approximately 5.3 (t10) and 2.0 (t130; P = 0.0256 and 0.0101, respectively; unpaired t-test, two-tailed), this difference was even more pronounced when comparing A260 measurements between eGFP-HAMMER and eGFP. The data indicate that the 25v2-as-coupled matrix is indeed able to discriminate between transcripts that contain the HAMMER sequence (eGFP-HAMMER) and those that do not (eGFP), suggesting that binding is specific. The discriminative power appears to be higher for the incubation at 30°C.
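The fold differences between tagged and untagged transcripts can be approximated directly from the rounded percentage decreases quoted above, as in the following Python sketch. The dictionary layout and function name are hypothetical. Because the inputs are the rounded percentages from the text, small deviations from the reported ratios are expected (for example, 35%/7% gives 5.0 rather than the reported 5.3, presumably because the reported ratios were derived from unrounded measurements).

```python
# Cumulative decrease of signal in the supernatant relative to input (%),
# as quoted in the text (rounded values).
DECREASE = {
    "A650": {
        "eGFP-HAMMER": {"t10": 42, "t130": 55},
        "eGFP": {"t10": 18, "t130": 36},
    },
    "A260": {
        "eGFP-HAMMER": {"t10": 35, "t130": 60},
        "eGFP": {"t10": 7, "t130": 29},
    },
}


def enrichment(readout, timepoint):
    """Ratio of the decrease seen with the tagged transcript to that of the
    untagged control; values above 1 indicate preferential binding."""
    tagged = DECREASE[readout]["eGFP-HAMMER"][timepoint]
    control = DECREASE[readout]["eGFP"][timepoint]
    return tagged / control


for readout in ("A650", "A260"):
    for timepoint in ("t10", "t130"):
        print(readout, timepoint, round(enrichment(readout, timepoint), 2))
```

For both readouts the ratio is larger at t10 than at t130, in line with the statement that discrimination is more pronounced after the initial incubation at 30°C.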

5.2.9 Purification of HAMMER-tagged in vitro transcripts via the S1 aptamer

To check whether HAMMER-tagged RNAs are amenable to purification with streptavidin-coated matrix via the S1 aptamer, we incubated eGFP-HAMMER or eGFP (control) in vitro transcripts covalently linked with Cy5 fluorophores with paramagnetic streptavidin-coated microbeads. Binding was assessed spectrophotometrically, by measuring the absorption of supernatants at λ = 650 nm (A650; Cy5) and 260 nm (A260; RNA concentration) at different time points (Figure 5.10 A and B). No decrease in the unbound RNA fractions could be detected for either transcript, neither by A650 nor by A260 measurements, indicating that the tagged RNA was not bound by streptavidin-coated beads.


Figure 5.10 Streptavidin-S1 aptamer purification of HAMMER in vitro transcripts. (A to C) Cy5-labeled (A and B) and unlabeled (C) eGFP and eGFP-HAMMER in vitro transcripts were incubated with streptavidin-coated microbeads. Binding of transcripts to microbeads was assessed by determining free RNA levels in the supernatants (relative to input; in %) after different time points, by measuring either residual fluorescence (absorption at λ = 650 nm; A) or RNA concentration directly (absorption at λ = 260 nm; B, C).

To rule out that absence of binding is due to steric interference by aminoallyl-modified deoxyuridine triphosphates or Cy5 fluorophores, the experiment was repeated with unlabeled in vitro transcripts and A260 measurements only (Figure 5.10 C). Although a moderate decrease in the unbound RNA fractions in the supernatants could be observed after 5 min of incubating streptavidin-coated microbeads with transcripts (approximately 10% with respect to the input), there was no difference in A260 decrease between the eGFP-HAMMER and eGFP in vitro transcripts. For both transcripts, levels of unbound RNA were stable after prolonged incubation. These results indicate that HAMMER-tagged transcripts cannot be purified via the S1 aptamer under the tested reaction conditions.


5.3 Discussion

Here we present a strategy for the tandem purification of RNPs via the HAMMER tag, a novel RNA tag system consisting of a previously described RNA streptavidin aptamer and an unstructured oligonucleotide designed for antisense-based purification. The tag system was set up in a modular way, conveniently allowing the exchange of aptamer or oligonucleotide while keeping the general layout, and includes a scaffold that was rationally designed to expose and stabilize the tag components, as well as to allow them to fold reliably into a characteristic secondary structure. By flanking the tag with multiple cloning sites, bait RNA sequences can be introduced on either side of it, or even around it. As all components of the tag system are based on unmodified RNA nucleotides, the tag can be expressed in cells, thus allowing RNP formation on tagged RNAs in situ. The envisioned two-step procedure involves pre-purification via a streptavidin-coated matrix, competitive elution with D-biotin, a second purification step relying on hybridization of the exposed oligonucleotide with a matrix-coupled DNA/2’-O-Me RNA antisense oligonucleotide, and DNase-mediated elution, and should thus be highly specific. The gentle elution methods should enable the identification of RNP components by a variety of downstream applications, such as mass spectrometry and RNA sequencing.

5.3.1 Expression of tagged transcripts

We have verified the expression of HAMMER-tagged eGFP in a HEK293-derived cell line by qRT-PCR and fluorescence microscopy. As the tag insertion site was located downstream of the eGFP coding sequence in the tested plasmids, however, it was not conclusively shown that the tag itself was properly expressed. The presence of long stretches of guanine and cytosine residues in DNA has been reported to negatively affect transcription both in vitro (Belotserkovskii et al., 2010) and in yeast (Kim and Jinks-Robertson, 2011), and thus could pose a potential problem for the expression of the HAMMER tag with its GC-rich stem structure. However, the fluorescence microscopy and qRT-PCR results suggest normal amounts of eGFP at both the mRNA and the protein level. Furthermore, both in vitro transcription of HAMMER-tagged RNAs and propagation of HAMMER-containing plasmids yielded the usual amounts of products of the expected sizes, suggesting that both RNA and DNA polymerases can process the sequences encoding the tag system. Nevertheless, the ability of cells to express intact copies of HAMMER-tagged transcripts and the HAMMER tag itself should be carefully studied in future experiments.

5.3.2 Capturing of tagged transcripts by antisense hybridization

Proper T7 RNA polymerase-mediated transcription of the tag is further supported by the observation that antisense oligonucleotide-coupled microbeads are able to specifically and efficiently bind HAMMER-tagged in vitro transcripts when compared to an untagged control.

Under the tested incubation conditions, transcripts with Cy5-labeled aminoallyl-modified UTPs were bound efficiently in the presence of approximately physiological ionic strength, while binding of transcripts labeled with intercalated Cy5 was less efficient. Low salt concentrations considerably reduced binding whereas elevated salt levels slightly increased the efficiency of capturing tagged RNAs. Incubation at 30°C appears to have favorable binding kinetics compared to incubation in the cold. Similarly, the difference between binding efficiencies of tagged and untagged in vitro transcripts was higher after the 30°C incubation step (2- to 5-fold lower for the latter) compared to end point measurements (1.5- to 2-fold less for untagged transcript), suggesting that specificity is higher at the elevated temperature.

These results are consistent with a report stating that annealing rates between RNA-RNA heterodimers decrease considerably at temperatures around 30 degrees below the melting temperature (Patzel and Sczakiel, 1999). Considering the determined melting temperature of the RNA:2’-O-Me RNA hybrid of approximately 50°C, incubation temperatures above or around room temperature should be preferable for the oligonucleotide hybridization procedure.

Given the high efficiency of the binding and an approximately two-fold enrichment of signal over noise in the pilot experiments, the oligonucleotide purification strategy appears to be promising. While its efficacy in purifying cellular transcripts remains to be proven, the envisioned aptamer-based pre-purification step should be able to reduce sample complexity considerably so that it can be hoped that efficiency and specificity will not dramatically decrease compared with the purification of the in vitro transcript. Furthermore, by testing different antisense oligonucleotide pairs, binding buffer compositions and incubation/reaction conditions, it should be possible to further optimize specificity and efficiency of the procedure.

5.3.3 Elution of transcripts immobilized by hybridization

As the cleavage of bead-coupled 25v2-as oligonucleotide by either EcoRI or DNase I was not successful, elution of captured HAMMER-tagged Cy5-labeled in vitro transcript was achieved by “melting” RNA:2’-O-Me RNA hybrids at elevated temperatures. Based on the measurements of released Cy5, a melting curve was recorded which, as expected, exhibits a sigmoidal shape, strongly resembling the thermal denaturation of DNA. Unexpectedly, labeled 25v2-as was also released upon heating the beads, possibly due to residual, uncoupled 25v2-as oligonucleotide binding to the bead surface. While thermal elution was efficient even at moderate temperatures and short incubation times, a gentler and more specific elution method would be preferable to exclude the possibility of alkaline hydrolysis of the RNA, particularly in the presence of divalent cations and a relatively high pH in the buffer. One explanation for the failure of endonuclease-mediated cleavage could be the inaccessibility of the DNA linker due to its limited length (12 bp) and proximity to the beads. Moreover, while it has been reported that both enzymes, as well as type II restriction enzymes in general, are able to process single-stranded DNA, cleavage usually occurs at reduced rates (Nishikagi et al., 1985; Bischofberger et al., 1987; Sutton et al., 1997; Latham, Ambion, unpublished).

Possible strategies for establishing an elution method relying on endonuclease-mediated cleavage of the DNA linker include the use of endonucleases with increased specificity for single-stranded DNA, extending the incubation time, optimization of reaction/buffer conditions, the use of a longer DNA linker, and the addition of short DNA fragments complementary to the DNA linker sequence. The use of excessive amounts of endonuclease is not recommended if mass spectrometric analysis of the eluate is intended, as signals resulting from fragments of highly abundant proteins may mask those of less abundant proteins, as has been reported, for example, for the mass spectrometric analysis of the human blood plasma proteome (Anderson and Anderson, 2002; Atkins et al., 2002). Similarly, buffer components should be chosen carefully so that they do not interfere with downstream applications.

5.3.4 Aptamer-mediated purification of tagged transcripts

In initial experiments aimed at establishing a protocol for S1 aptamer-mediated purification, no binding of HAMMER-tagged Cy5-labeled in vitro transcripts to streptavidin-coated microbeads could be detected. It is conceivable that the presence of fluorophore complexes in the labeled transcripts prevents proper folding of the S1 aptamer and, consequently, binding to the streptavidin-coated matrix. Indeed, binding occurred when unlabeled HAMMER-tagged and untagged in vitro transcripts were used instead, albeit indiscriminately and with low efficiency. One possible explanation is that the aptamer, although properly folded, is inaccessible. However, this is not likely to be the culprit in this case, since we tested the procedure with in vitro transcripts, largely in the absence of factors that might potentially interfere with the folding of the tag. Moreover, the same in vitro transcripts appeared to be well accessible in the antisense oligonucleotide purification experiments. While we cannot rule out that the observed effects are caused by unfavorable reaction conditions (binding buffer composition or incubation times/temperature), the very low binding efficiency (<15%), the binding kinetics (plateau reached after 5 min) and the complete absence of specificity suggest a more fundamental problem: as was reported in the original description of the S1 aptamer (Srisawat and Engelke, 2001), integration into bait RNAs sometimes led to dysfunctional aptamers, even when the predicted secondary structures suggested proper folding. To rule out positional effects, the tag should first be inserted into different locations. If the problem persists, it may be necessary to include a more flexible linker between the stem structure and the aptamer. Alternatively, either the full-length S1 aptamer (Srisawat and Engelke, 2001) or a different aptamer could be employed. The tobramycin aptamer J6f1 might constitute a reasonable choice as its binding affinity to the ligand is reported to be in the nanomolar range (Hamasaki et al., 1998), although both the affinity and the specificity of the aptamer have been called into question (Verhelst et al., 2004).

5.3.5 Reflections on tag folding and insertion

So far we have generated plasmids that allow in vitro transcription of HAMMER-tagged transcripts, as well as their expression in mammalian cells. Incorporation of the tag downstream of an eGFP coding sequence as well as between an eGFP coding sequence and a CDKN1B 3’-UTR did cause minor changes in the predicted secondary structures of the corresponding transcripts, while the tag structure itself was not affected by the presence of the bait RNAs. Changes in the fold of bait RNAs were predominantly located in the vicinity of the location of tag insertion and mostly affected regions with a low density of nucleotides with high base-pairing probability. In order to avoid or minimize such adverse effects, it might be advisable to introduce the tag in those regions of potential bait RNAs that exhibit exposed structural motifs with low free energies. Better yet, generating a set of constructs in which one and the same bait RNA is tagged at two or more different positions should increase the chances of proper aptamer and bait RNA folding and should also reduce the chance of steric interference between the tag structure and potential binding factors of the bait RNA.

Moreover, results obtained from the parallel purification of bait RNAs using tags introduced at different locations should increase both the robustness and reliability of the results. In general, it is recommended to consult secondary structure prediction software such as RNAfold (Gruber et al., 2008) or MC-Fold/MC-Sym (Parisien and Major, 2008) for choosing appropriate insertion positions.

5.3.6 Limitations of RNA secondary structure prediction algorithms

While changes in the folding of an RNA in different sequence contexts can be estimated with some confidence by employing secondary structure prediction algorithms, the accuracy and relevance of a specific predicted structure cannot be assessed without empirical structural data. Regarding the absence of binding to the ligand-coated matrix, it might be possible that the aptamer requires interconversion between two or more stable folds to allow binding to the ligand in an “induced fit”-like manner (Williamson, 2000; Leulliot and Varani, 2001). It is conceivable that the presence of the stem structure or steric interference of the bait RNA traps a non-functional intermediate structure in a local minimum of the ‘folding landscape’ (Chen and Dill, 2000; Solomatin et al., 2010), thus restricting structural dynamics.

Furthermore, it should be kept in mind that classical nucleic acid secondary structure prediction algorithms are mostly based on base pairing and stacking interactions of nucleobases and may, on their own, be of very limited use for the estimation of RNA conformations in RNA-protein complexes. Therefore, a more in-depth structural analysis should be performed, making use of advanced prediction software that may take into account the tertiary structure of the aptamer (e.g. MC-Fold/MC-Sym; Parisien and Major, 2008), folding dynamics (e.g. BarMap; Hofacker et al., 2010), and its interaction with protein ligands (e.g. catRAPID; Bellucci et al., 2011). Ideally, it should include the experimental determination of the solution structure of the functional aptamer, which can then be used to fine-tune prediction parameters. In this regard, low (Jiang et al., 1997) and high (Jiang and Patel, 1998) resolution structures of tobramycin bound to a J6f1-related aptamer (Wang and Rando, 1995; Wang et al., 1996), as well as a crystal structure of the stII aptamer in complex with streptomycin (Tereshko et al., 2003) already exist.

5.3.7 Conclusion

If it is possible to overcome the addressed technical difficulties and establish a working tandem purification protocol, the HAMMER tandem affinity tag system holds promise to be a versatile and widely applicable tool for the analysis of in situ -formed RNPs in a wide range of cell-based or even whole organism systems. In a next step, optimization and streamlining of the purification and analysis procedures may lead to improved reliability and reproducibility, as well as reduced costs and hands-on time. To this end, the use of paramagnetic beads as the matrix in both purification steps may help to achieve a certain level of automation which, together with, for example, the use of the Gateway cloning system (Invitrogen) and the construction of “UTRome libraries” may allow medium- to high-throughput analysis of RNPs.

Furthermore, additional information may be obtained, e.g. by the incorporation of crosslinking methods (Niranjanakumari et al., 2002), which should enable the simultaneous identification of cis-regulatory elements in tagged bait RNAs. Finally, the use of the tag system is not limited to RNP purification: by targeting regions of the tag with fluorescently labeled antisense oligonucleotides, it should be possible to study RNP localization by fluorescence in situ hybridization methods (Walker et al., 2011).

5.4 Materials and Methods

5.4.1 Tag design and bioinformatics

RNA secondary structure predictions were made with RNAfold (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi; Gruber et al., 2008), using default settings and the energy parameter model established by Andronescu et al. (2007). Unstructured RNA sequences (15-mers and 25-mers) were designed using RNA Designer (http://www.rnasoft.ca/cgi-bin/RNAsoft/RNAdesigner/rnadesign.pl; Andronescu et al., 2004; structure: string of 15 or 25 times “.”, sequence constraints: string of 15 or 25 times “N”, temperature: 37°C, number of sequences: 5, GC content for paired/unpaired regions: 50%, random number seed: 1).
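The RNA Designer inputs listed above can be written out programmatically, which may help when regenerating the 15-mer and 25-mer design requests. The following is only a minimal sketch: the function name `rna_designer_inputs` and the dictionary keys are illustrative assumptions and do not correspond to an actual RNA Designer interface; only the values are taken from the text.

```python
def rna_designer_inputs(length):
    """Collect the design settings described in the text for an unstructured RNA.

    The key names are hypothetical labels chosen for this sketch.
    """
    if length not in (15, 25):
        raise ValueError("the text describes 15-mers and 25-mers only")
    return {
        "structure": "." * length,             # fully unpaired target structure
        "sequence_constraints": "N" * length,  # any nucleotide at every position
        "temperature_celsius": 37,
        "number_of_sequences": 5,
        "gc_content_percent": 50,              # paired and unpaired regions
        "random_seed": 1,
    }


for n in (15, 25):
    settings = rna_designer_inputs(n)
    print(n, settings["structure"], settings["sequence_constraints"])
```

A dot in the structure string denotes an unpaired position, so a string consisting only of dots requests a sequence without predicted secondary structure.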

5.4.2 Plasmids

The designed HAMMER tandem affinity tag sequence (HAMMER-synth; see 7.1.4) was ordered from Mr. Gene (http://mrgene.com/). To generate pTO-HAMMER, the insert sequence of the obtained pMK-based vector was verified by sequencing and subcloned into pTO-HA-Strep-GW-FRT (kindly provided by Alexander Wepf, Institute of Molecular and Systems Biology, ETH Zurich; see 7.7) via BamHI and ApaI, replacing its HA and StrepIII protein tags as well as the Gateway cassette with the HAMMER tandem affinity tag. The coding sequence of eGFP was amplified from pTO-HA-Strep-GW-FRT-eGFP (kindly provided by Alexander Wepf, Institute of Molecular and Systems Biology, ETH Zurich) using the primers HindIII-Kozak-eGFP-fwd (containing a restriction site for HindIII as well as a Kozak site; see 7.1.1) and eGFP-rev-MCS-BamHI (containing several restriction sites, including one for BamHI; see 7.1.1). The amplicon was purified with the QIAquick PCR Purification kit (QIAGEN, Cat. No. 28104) and inserted into pCR-Blunt II-TOPO (Invitrogen), using the Zero Blunt TOPO PCR Cloning Kit (Invitrogen, Cat. No. K2830-20) according to the manufacturer’s recommendations. After verifying correct insertion by sequencing of the corresponding region, the eGFP cassette was subcloned into pTO-HAMMER via HindIII and BamHI. The region containing eGFP and the HAMMER sequence was subcloned into pBlueScript SK+ (Stratagene, discontinued) via HindIII and NotI to generate pBS-SK+-eGFP-HAMMER. The CDKN1B/p27 3’-UTR (nt 1070-2403 of RefSeq mRNA NM_004064.3) was subcloned from pCR-Blunt II-TOPO-CDKN1B-3’-UTR (see 4.4.2) via XhoI and NotI to generate pTO-eGFP-HAMMER-CDKN1B-3’-UTR.

5.4.3 Cell culturing

Flp-In-293 cells (Invitrogen, Cat. No. R750-07; kindly provided by Alexander Wepf, Institute of Molecular and Systems Biology, ETH Zurich) were cultured in Dulbecco’s modified Eagle’s medium (DMEM; Invitrogen, Cat. No. 41966) supplemented with 10% fetal bovine serum (Invitrogen, Cat. No. 10270-106), 1x antibiotic-antimycotic (Invitrogen, Cat. No. 15240-062), and 1 mg/mL zeocin (Invitrogen, Cat. No. R250-01) in 5% CO2 at 37°C.

5.4.4 Quantitative reverse transcription PCR

2.5 mL of a Flp-In-293 (Invitrogen, Cat. No. R750-07) cell suspension (2 × 10^5 cells/mL in DMEM supplemented with 10% fetal bovine serum and 1x antibiotic-antimycotic) were seeded into the wells of a 6-well plate (TPP, Switzerland, Cat. No. 92006). 16 hours later, cells were transfected with 2 µg of peGFP-HAMMER, or mock-transfected, using polyethylenimine (Polysciences, Inc., Cat. No. 23966): DNA and polyethylenimine stock solution (1 mg/mL in water) were diluted to 15 µg/mL and 60 µg/mL, respectively, in DMEM. Both solutions were incubated for 10 min at room temperature, and then mixed at equal volumes (mass ratio polyethylenimine to DNA = 4:1; final polyethylenimine concentration = 30 µg/mL). After incubation for 20 min at room temperature, transfection solutions were added to the cells. 24 hours after transfection, medium was replaced with DMEM supplemented with 10% fetal bovine serum, 1x antibiotic-antimycotic, and either 2 µg/mL tetracycline (2 mg/mL stock solution in EtOH) or a corresponding volume of EtOH only. 24 hours later, cells were washed with phosphate-buffered saline (PBS; Invitrogen, Cat. No. 14190) and RNA was isolated by using the RNeasy Mini kit (QIAGEN, Cat. No. 74106) with RNase-free DNase I (QIAGEN, Cat. No. 79254) digestion according to the manufacturer’s recommendations (elution in nuclease-free water). RNA concentrations were determined spectrophotometrically with a NanoDrop 1000 (Thermo Fisher Scientific Inc.). cDNA was prepared from total RNA (50 ng per reaction) using the High Capacity cDNA Reverse Transcription Kit (Applied Biosystems, Cat. No. 4368814) according to the manufacturer’s recommendations. Quantitative PCR reactions were performed in triplicate, using FastStart Universal SYBR Green Master (Rox; Roche, Cat. No. 04913914001) in an AB 7900 HT Fast Real-Time PCR System (Applied Biosystems). The following primer pairs were used at 0.1 nM: ACTB (ACTB-SYBR-fwd, ACTB-SYBR-rev), eGFP-v1 (eGFP-SYBR-v1-fwd, eGFP-SYBR-v1-rev), and eGFP-v2 (eGFP-SYBR-v2-fwd, eGFP-SYBR-v2-rev). Quantification was performed using the 2^-ΔΔCT method (Schmittgen and Livak, 2008), with ACTB serving as a reference for the normalization of expression levels.
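The relative quantification described above can be illustrated with a short calculation. This is a sketch of the general 2^-ΔΔCT arithmetic with a single reference gene, not the analysis script used in this work; the function names are assumptions, and the CT values in the usage example are hypothetical placeholders rather than measured data.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Mean target CT minus mean reference CT for one sample (e.g. triplicates)."""
    return mean(ct_target) - mean(ct_reference)


def fold_change(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression of the target in a sample versus a control condition."""
    ddct = delta_ct(ct_target_sample, ct_ref_sample) - delta_ct(
        ct_target_control, ct_ref_control
    )
    return 2.0 ** -ddct


# Hypothetical triplicate CT values, for illustration only.
egfp_induced = [20.0, 20.0, 20.0]
actb_induced = [15.0, 15.0, 15.0]
egfp_uninduced = [25.0, 25.0, 25.0]
actb_uninduced = [15.0, 15.0, 15.0]

print(fold_change(egfp_induced, actb_induced, egfp_uninduced, actb_uninduced))
```

With these placeholder values, the target CT drops by five cycles relative to the reference, which corresponds to a 2^5-fold higher normalized expression level.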

5.4.5 Fluorescence microscopy

250 µL of a Flp-In-293 (Invitrogen, Cat. No. R750-07) cell suspension (4 × 10^4 cells/mL in DMEM supplemented with 10% fetal bovine serum and 1x antibiotic-antimycotic) were seeded into the chambers of an 8-chamber slide (BD Falcon, Cat. No. 354108). 16 hours later, cells were transfected with 100 ng of peGFP-HAMMER or peGFP-HAMMER-CDKN1B-3’-UTR, or mock-transfected, using polyethylenimine (Polysciences, Inc., Cat. No. 23966): DNA and polyethylenimine stock solution (1 mg/mL in water) were diluted to 15 µg/mL and 60 µg/mL, respectively, in DMEM. Both solutions were incubated for 10 min at room temperature, and then mixed at equal volumes (mass ratio polyethylenimine to DNA = 4:1; final polyethylenimine concentration = 30 µg/mL). After incubation for 20 min at room temperature, transfection solutions were added to the cells. 24 hours after transfection, medium was replaced with DMEM supplemented with 10% fetal bovine serum, 1x antibiotic-antimycotic, and either 2 µg/mL tetracycline (2 mg/mL stock solution in EtOH) or a corresponding volume of EtOH only. 24 hours later, Hoechst dye 33342 (Invitrogen, Cat. No. H3570) was added to the medium, and after 10 min cells were analyzed with an Axiovert 200 M (Zeiss) fluorescence microscope. Images were acquired with the AxioVision software (Zeiss; release 4.7) and overlay images were created with Adobe Photoshop CS5.
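The polyethylenimine-to-DNA mass ratio and the final polyethylenimine concentration stated for the transfection mix follow directly from mixing the two dilutions at equal volumes. The sketch below reproduces that arithmetic; the function name is an assumption made for illustration, and only the two working concentrations are taken from the text.

```python
def mix_equal_volumes(dna_ug_per_ml, pei_ug_per_ml):
    """Return final DNA and PEI concentrations and the PEI:DNA mass ratio
    after combining equal volumes of the two dilutions."""
    final_dna = dna_ug_per_ml / 2.0   # each component is diluted two-fold
    final_pei = pei_ug_per_ml / 2.0
    mass_ratio = pei_ug_per_ml / dna_ug_per_ml
    return final_dna, final_pei, mass_ratio


final_dna, final_pei, ratio = mix_equal_volumes(15.0, 60.0)
print(f"final DNA: {final_dna} µg/mL")
print(f"final PEI: {final_pei} µg/mL")
print(f"PEI:DNA mass ratio: {ratio}:1")
```

Because both components are diluted by the same factor, the mass ratio is independent of the volume that is mixed and depends only on the two working concentrations.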

5.4.6 In vitro transcription and labeling

pBS-SK+-eGFP-HAMMER was digested with either EcoRV or NotI for the generation of eGFP and eGFP-HAMMER in vitro transcripts, respectively. Linearized plasmid was run on a 1% TAE agarose gel and purified using the MinElute Gel Extraction Kit (QIAGEN, Cat. No. 28604) according to the manufacturer’s recommendations (elution in nuclease-free water). In vitro transcripts with non-covalently bound Cy5 (eGFP and eGFP-HAMMER) were produced with the MEGAscript T7 Kit (Ambion, Cat. No. AM1333) according to the manufacturer’s recommendations, using 3.4 µg linearized plasmid in 100 µL volume as template. In vitro transcribed RNA was isolated by using the RNeasy Mini kit with DNase I digestion according to the manufacturer’s recommendations (elution in nuclease-free water). 20 µg purified RNA were Cy5-labeled with the Turbo Labeling Kit Cy5 (Arcturus, Cat. No. KIT0610) according to the manufacturer’s recommendations. In vitro transcript with covalently bound Cy5 (eGFP-HAMMER only) was produced from 750 ng linearized plasmid in 14 µL volume using the Amino Allyl MessageAmp II aRNA Amplification Kit with Cy5 (Ambion, AM1796) according to the manufacturer’s recommendations, starting from the in vitro transcription step. RNA quality and the correct lengths of products were assessed by agarose gel electrophoresis. Yield and efficiency of Cy5 incorporation were determined with a NanoDrop 1000.

5.4.7 S1 aptamer purification

120 µL of streptavidin-coupled microbeads (Streptavidin MagneSphere Paramagnetic Particles, Cat. No. Z5481) were incubated three times (20 min, 850 rpm; followed by collection of beads using a magnet) in 200 µL blocking buffer (20 mM Tris-HCl, pH 8.0; 100 mM NaCl; 5% (v/v) glycerol; 5 mM MgCl2, pH 8.0; 0.1% Triton X-100; 0.1% BSA; 0.1 mg/mL heparin; 0.1 mg/mL E. coli tRNA; 40 U/mL RNasin Plus, Promega, Cat. No. N2615), and resuspended in 50 µL of the same buffer. 10 pmol of eGFP-HAMMER in vitro transcript (calculated molecular weight without label ≈ 302 kilodalton; kDa) were dissolved in 50 µL of binding buffer (20 mM Tris-HCl, pH 8.0; 100 mM NaCl; 5% (v/v) glycerol; 0.1% Triton X-100; 5 mM MgCl2, pH 8.0), incubated at 65°C for 5 min, and allowed to cool down slowly to RT. The RNA solution was combined with the blocked microbeads (total volume = 100 µL) and incubated for 30 or 60 min at 4°C. 5 µL aliquots were removed after 0, 5, 10, 20, 30, and 60 (where applicable) min and analyzed by spectrophotometry with a NanoDrop 1000. Absorption at λ = 260 nm and, in the case of Cy5-labeled transcripts, 650 nm was recorded in order to determine the unbound RNA fraction.
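The quantities used in the binding assay can be related to each other with a few unit conversions. The following sketch converts the stated amount and molecular weight of the transcript into mass and molar concentration, and expresses a supernatant reading as the unbound fraction relative to input. The function names are assumptions, and the absorbance readings in the last line are hypothetical placeholders rather than measured values.

```python
def rna_mass_ug(amount_pmol, mol_weight_kda):
    """Mass of RNA in µg; 1 kDa corresponds to 1 ng per pmol."""
    return amount_pmol * mol_weight_kda / 1000.0


def molar_concentration_nm(amount_pmol, volume_ul):
    """Concentration in nM; pmol per µL equals µM, hence the factor 1000."""
    return amount_pmol / volume_ul * 1000.0


def unbound_percent(absorbance_at_time, absorbance_input):
    """Free RNA in the supernatant relative to the input reading, in percent."""
    return 100.0 * absorbance_at_time / absorbance_input


# Values from the text: 10 pmol of a ~302 kDa transcript in 100 µL total volume.
print(f"{rna_mass_ug(10, 302):.2f} µg transcript per reaction")
print(f"{molar_concentration_nm(10, 100):.0f} nM transcript")

# Hypothetical readings (input and 5 min time point), for illustration only.
print(f"{unbound_percent(0.90, 1.00):.0f}% unbound")
```

The same `unbound_percent` calculation applies to A260 and A650 readings alike, provided each time point is divided by the input reading taken at the same wavelength.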

5.4.8 Oligonucleotide synthesis

5’-aminohexyl-3’-Cy3-linked DNA/2’-O-Me RNA hybrid 25v2-as oligonucleotides (see 7.1.4) were synthesized from the 3’- to the 5’-terminus with a MerMade 12 (Bioautomation Corporation; MerMade 12 v2.3.7 software) on controlled-pore glass solid support (Glen UnySupport, 500 Å, 50 µmol/g; Glen Research) under standard conditions. Each synthesis cycle involved deblocking of the dimethoxytrityl group with 5% dichloroacetic acid in dichloroethane, coupling of 0.1 M phosphoramidites in acetonitrile (SAFC Proligo) upon activation with 0.25 M 5-ethylthiotetrazole (360 s for 2’-O-Me RNA amidites, 180 s for all other phosphoramidites), capping of uncoupled amidites with a mixture of 10% acetic anhydride in tetrahydrofuran (“Capping A”; SAFC Proligo) and 10% N-methylimidazole and 10% pyridine in water (“Capping B”; SAFC Proligo) in order to avoid synthesis of incomplete product, and stabilization of the phosphoramidite linkage by oxidation with 0.02 M iodine in a mixture (7:2:1) of tetrahydrofuran, pyridine and water (RFLC Proligo). 5’-amino-modifier C6 and Cy3 phosphoramidites were from Glen Research, and DNA and 2’-O-Me RNA phosphoramidites were from Thermo Fisher Scientific, Inc. The product was cleaved and deprotected in a screw cap vial with a mixture (1:1) of aqueous ammonia (33%; Fluka) and ethanolic methylamine (33%; Fluka) for 30 min at 65°C and washed once with a mixture (1:1) of water and EtOH. The combined supernatants were dried in a SpeedVac (Savant SPD 2010; Thermo Electronic), dissolved in 0.1 M triethylammonium acetate and purified by semi-preparative reversed phase high-performance liquid chromatography (RP-HPLC; Agilent 1200 HPLC system, Agilent Technologies) on a C18 column (XBridge OST C18 2.5 µm; 4.6 x 50 mm; Waters), based on the presence of the monomethoxytrityl group in the 5’-amino-C6, which was present only in the full-length product. Relevant fractions were dried in a SpeedVac, and the residue was solubilized in 40% acetic acid and incubated at RT for 60 min to remove monomethoxytrityl groups. After liquid evaporation in a SpeedVac, the residue was dissolved in 0.1 M hexafluoroisopropanol with 8.6 mM triethylamine. Purity (>99%) and identity (12721.2 Da found; 12723.8 Da calculated; deviation = 0.020%) were verified by RP-HPLC on a C18 column (XBridge OST C18 2.5 µm; 2.1 x 50 mm; Waters), followed by liquid chromatography-mass spectrometry (LC-MS; Agilent 6130 Single Quad, Agilent Technologies) analysis.
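The reported deviation between found and calculated mass can be checked with a one-line calculation. The sketch below assumes that the deviation is expressed relative to the calculated mass; the function name is chosen for illustration only.

```python
def relative_deviation_percent(found_da, calculated_da):
    """Absolute mass deviation relative to the calculated mass, in percent."""
    return abs(found_da - calculated_da) / calculated_da * 100.0


deviation = relative_deviation_percent(12721.2, 12723.8)
print(f"{deviation:.3f}%")  # prints 0.020%
```

Under this assumption the mass difference of 2.6 Da corresponds to the 0.020% deviation stated in the text.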

5.4.9 Preparation of antisense oligonucleotide matrix

5.6 nmol of Cy3-labeled 25v2-as oligonucleotide (in 800 µL water) were coupled to 29.2 mg (1167 µL of suspension) of carboxylated 0.75-1 µm polystyrene microbeads (Polysciences, Inc., Cat. No. 07759) by using the PolyLink Protein Coupling Kit (Polysciences, Inc., Cat. No. 24350-1) according to the manufacturer’s recommendations (reagent volumes were scaled accordingly). Incubation was performed for two hours at RT with gentle shaking (850 rpm). The suspension was centrifuged (10 min, 1000 g, RT) at the beginning (0 min; resuspended in same buffer) and the end of the incubation time (2 h, RT, 850 rpm), and supernatants were subjected to spectrophotometric analysis with a NanoDrop 1000 (λ = 550 nm) in order to determine the fraction of unbound oligonucleotide. Coupled beads were resuspended in 1.2 mL of PolyLink Wash/Storage Buffer (2.8 nmol bound oligonucleotides per mL suspension) and stored at 4°C.
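The coupling yield implied by the numbers above can be derived as follows. This is a sketch: the resulting coupling efficiency and bead loading are values derived here from the stated input amount, bead mass and final oligonucleotide density, and are not reported as such in the text; all function names are illustrative assumptions.

```python
def bound_oligo_nmol(density_nmol_per_ml, suspension_volume_ml):
    """Total bound oligonucleotide in the final bead suspension."""
    return density_nmol_per_ml * suspension_volume_ml


def coupling_efficiency_percent(bound_nmol, input_nmol):
    """Share of the input oligonucleotide that ended up on the beads."""
    return 100.0 * bound_nmol / input_nmol


def bound_fraction_from_absorbance(a550_start, a550_end):
    """Bound fraction estimated from the supernatant absorbance at the start
    and at the end of the coupling reaction."""
    return 1.0 - a550_end / a550_start


# Values from the text: 2.8 nmol/mL in 1.2 mL, 5.6 nmol input, 29.2 mg beads.
bound = bound_oligo_nmol(2.8, 1.2)
efficiency = coupling_efficiency_percent(bound, 5.6)
loading = bound / 29.2  # nmol oligonucleotide per mg beads

print(f"bound oligonucleotide: {bound:.2f} nmol")
print(f"coupling efficiency: {efficiency:.0f}%")
print(f"bead loading: {loading:.3f} nmol/mg")
```

The function `bound_fraction_from_absorbance` mirrors the spectrophotometric readout described above; it is included without example readings because the underlying absorbance values are not given in the text.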

5.4.10 Purification by antisense oligonucleotide hybridization

100 µL of 25v2-as-coupled microbead suspension were equilibrated for 5 min in binding buffer (20 mM Tris-HCl, pH 8.0; 5% (v/v) glycerol; 5 mM MgCl2, pH 8.0; 0.1% Triton X-100) including 100 mM NaCl (unless mentioned otherwise). Beads were collected by centrifugation (5 min, 3,000 g, RT), resuspended in 100 µL of the same buffer, and mixed with 10 pmol of the indicated in vitro transcript. Suspensions were incubated for 10 min at 30°C, and then for 2 h at 4°C. 15 µL aliquots were removed after 0, 10, and 130 min, centrifuged (2 min, 13,000 g, RT), and analyzed by spectrophotometry with a NanoDrop 1000. Absorption at λ = 260 nm and 650 nm was recorded in order to determine the fractions of unbound RNA. To assess the release of bound RNA by increasing temperature, microbeads with bound Cy5-labeled eGFP-HAMMER in vitro transcript in buffer with 250 mM NaCl were topped up with additional buffer to a volume of 128 µL and divided into eight aliquots of 15 µL each. Aliquots were incubated for 5 min at either 4, 35, 50, 58, 65, 72, or 95°C, centrifuged (2 min, 13,000 g, RT), and analyzed by spectrophotometry with a NanoDrop 1000. Absorption at λ = 550 nm (Cy3) and 650 nm (Cy5) was recorded in order to determine the fractions of unbound 25v2-as oligonucleotide and Cy5-labeled eGFP-HAMMER in vitro transcript, respectively.
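The stoichiometry of the hybridization reaction can be estimated from the bead loading stated in 5.4.9. The sketch below is a derived estimate, not a reported value: it assumes that no beads are lost during equilibration and that the full 2.8 nmol/mL of coupled oligonucleotide is available for hybridization. Variable and function names are illustrative.

```python
def oligo_in_suspension_pmol(density_nmol_per_ml, volume_ul):
    """Amount of bead-coupled oligonucleotide in a given suspension volume."""
    return density_nmol_per_ml * (volume_ul / 1000.0) * 1000.0  # nmol -> pmol


def molar_excess(capture_pmol, transcript_pmol):
    """Molar ratio of capture oligonucleotide to transcript."""
    return capture_pmol / transcript_pmol


capture = oligo_in_suspension_pmol(2.8, 100)  # 100 µL of bead suspension
excess = molar_excess(capture, 10)            # 10 pmol in vitro transcript

print(f"capture oligonucleotide: {capture:.0f} pmol")
print(f"molar excess over transcript: {excess:.0f}-fold")

# Volume accounting for the thermal elution series described in the text.
aliquots, aliquot_volume_ul, total_volume_ul = 8, 15, 128
print(f"volume left after aliquoting: {total_volume_ul - aliquots * aliquot_volume_ul} µL")
```

Under these assumptions the capture oligonucleotide is present in large molar excess over the transcript, so the amount of available antisense sequence is unlikely to limit binding.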

5.5 Contributions

André P. Gerber, Alexander Kanitz, Luca Schenk and Jonathan Hall (Institute of Pharmaceutical Sciences, ETH Zurich) contributed to the design of the HAMMER RNA tandem tag. Mauro Zimmermann (Institute of Pharmaceutical Sciences, ETH Zurich), under the supervision of Jonathan Hall, prepared the 25v2-as oligonucleotides, and Felix Schnarwiler (Institute of Pharmaceutical Sciences, ETH Zurich), under the supervision of André P. Gerber and Alexander Kanitz, performed the coupling of 25v2-as oligonucleotides to microbeads. Alexander Kanitz, under the supervision of André P. Gerber and Michael Detmar, performed all other experiments. Katarzyna Hunt (Institute of Pharmaceutical Sciences, ETH Zurich), under the supervision of Alexander Kanitz, André P. Gerber and Luca Schenk, helped with the generation of plasmids, the expression analysis, as well as aptamer and antisense oligonucleotide purification experiments. Michael Detmar (Institute of Pharmaceutical Sciences, ETH Zurich) provided general support and, together with André P. Gerber, supervised Alexander Kanitz and Luca Schenk. Alexander Wepf, supervised by Matthias Gstaiger (Institute of Molecular and Systems Biology, ETH Zurich), contributed plasmids and cell lines as indicated in the main text.


6 Concluding Remarks

Since the dawn of molecular biology half a century ago, the exploration of transcriptional and post-translational control of gene expression has absorbed the vast majority of the capacities available for the study of gene regulation. It is therefore perhaps not surprising that the discovery of the first miRNA in 1993 (Lee et al., 1993) was dismissed as a worm idiosyncrasy by the scientific community. It required the advent of genomic techniques and the insights gained from the sequencing of the first higher organisms to alert scientists to the previously disregarded, almost completely undiscovered ‘parallel universe’ of post-transcriptional gene regulation. Even today, despite the increased interest in post-transcriptional gene regulation that has been largely caused by the ‘re-discovery’ and subsequent appreciation of the far-reaching implications of microRNA (miRNA) regulation, the full significance of post-transcriptional gene regulatory networks (GRNs) may still be underestimated. Nevertheless, lessons learned from studying transcriptional and post-translational networks have accelerated the rate of progress in the field so that now, for example, several elegant and sophisticated methods exist that allow targets and recognition elements of RNA-binding proteins and miRNAs to be identified on a transcriptome-wide scale. However, such “top down” approaches are limited to spotting ‘multiple output’ network motifs only. In contrast, gene-centered “bottom up” strategies aimed at unveiling ‘multiple input’ motifs, i.e. the identification of multiple trans-acting factors regulating a single (m)RNA, are less straightforward and often rely on chance findings or predictive tools. Yet considering that thousands of laboratories around the world largely rely on the study of individual genes to identify and develop leads for potential drug targets, the demand for detailed insights into the post-transcriptional regulatory impact exerted on specific messages is high.


In this work, we used a biased, integrative approach relying on bioinformatics predictions and empirical data to identify cis-regulatory elements and trans-acting factors that repress the expression of a well-established human angiogenesis factor with high medical relevance. Importantly, the levels of all repressors were significantly reduced in a particular tumor type that expresses high levels of the targeted angiogenesis factor, suggesting a potential role in disease. Our findings thus validate the feasibility of the outlined procedure for the identification of combinatorial control motifs. Furthermore, as the data and prediction algorithms used are readily available, the strategy could be useful for many laboratories studying physiological or pathological aspects of a limited number of genes. However, the biased, time-intensive and error-prone nature of the procedure severely hampers its scalability and limits its use to ‘hypothesis-driven’ research.

In a second study, we therefore laid the foundation for a widely applicable method for the unbiased discovery of proteins and RNAs associated with a transcript of interest. The method relies on a modular RNA tandem affinity tag that we rationally designed to facilitate its appendage to RNAs of interest while maintaining high stability, its folding characteristics and the exposure of the affinity determinants. The tag system is nucleic acid-based and can thus be expressed inside cell lines of interest, ensuring the unimpeded assembly of near-native ribonucleoprotein complexes. These complexes can then be purified by immobilization on ligand-coated matrices in a two-step process, guaranteeing a high level of specificity. Gentle elution methods should permit the analysis of co-purified proteins and RNAs by transcriptomics and proteomics approaches. While we were not able to provide a ‘proof of principle’, we believe that our tag design holds great promise for the unbiased identification of ‘multiple input’ motifs of post-transcriptional GRNs, as the availability of ‘discovery-driven’ complementary top-down and bottom-up approaches is indispensable for the comprehensive characterization of the basic motifs of such networks.


Integrating the data generated by the application of these and other methods into the available network models will allow these models to be gradually expanded to include other, more complex motifs, such as feedback or feed-forward loops. The ultimate goal, however, would be the reconciliation of all different types of gene regulatory networks into a single, robust composite network of gene regulation. Such a model would have broad implications for medical and pharmaceutical sciences, synthetic biology, computer sciences and technology, and even the study of social networks in economics. Moreover, it could help us to approach an answer to perhaps the oldest question in the biological sciences: How can chemical systems sustain the remarkable organizational complexity that we call life?


7 Appendix

7.1 Nucleotide sequences

7.1.1 Nucleotides used for cloning

PCR primer sequences used for the cloning of the indicated (final) plasmids are listed. Used restriction or attB Gateway recombination sites are underlined. Nucleobases that are part of a coding sequence or untranslated region are italicized.

psiCHECK-2-CDKN1B-3’-UTR (and mutated version), peGFP-HAMMER-CDKN1B-3’UTR
CDKN1B-3’-UTR-fwd 5'-CTCGAG ACAG CTCGAATTAA GAATATGTTT CC-3'
CDKN1B-3’-UTR-rev 5'-GCGGCCGC GA AGTTTTCTTT ATTGATTACT TAATGTG-3'

psiCHECK-2-VEGFA-3’-UTR (and mutated versions)
VEGFA-3’-UTR-fwd 5'-TCACTCGAG G TCCCGGCGAA GAGAAGAG-3'
VEGFA-3’-UTR-rev 5'-CATGCGGCCG CTCAATGGAG AAGGAGAAACCA-3'

peGFP-HAMMER
HindIII-Kozak-eGFP-fwd 5'-GATTAAGCTT GCCACC ATGG TGAGCAAGGG CGAGG-3'
eGFP-rev-MCS-BamHI 5'-GGATCCGGCG CCCCCGGGGA TATC TTACTT GTACAGCTCG TCCATGC-3'

pTO-HA-Strep-GW-FRT-Pum2
Pum2-CDS-attB-fwd 5'-GGGGACAAGT TTGTACAAAA AAGCAGGCTT A ATGAATCAT GATTTTCAAG C-3'
Pum2-CDS-attB-rev 5'-GGGGACCACT TTGTACAAGA AAGCTGGG TT ACAGCATTCC ATTTGG-3'
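The restriction sites carried by the cloning primers can be located programmatically. The following is a minimal Python sketch, not part of the original work. It assumes the standard recognition sequences of XhoI (CTCGAG), NotI (GCGGCCGC), HindIII (AAGCTT) and BamHI (GGATCC); only HindIII and BamHI are named in the primer names above, so the assignment of XhoI and NotI to the 3’-UTR primers is an assumption. The names SITES, PRIMERS and find_sites are hypothetical.

```python
# Recognition sequences (standard values). Which enzymes were actually used
# for the 3'-UTR primers is an assumption, not stated in the primer list.
SITES = {
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
}

# Primer sequences copied from the list above (spacing is removed on use).
PRIMERS = {
    "CDKN1B-3'-UTR-fwd": "CTCGAG ACAG CTCGAATTAA GAATATGTTT CC",
    "CDKN1B-3'-UTR-rev": "GCGGCCGC GA AGTTTTCTTT ATTGATTACT TAATGTG",
    "VEGFA-3'-UTR-fwd": "TCACTCGAG G TCCCGGCGAA GAGAAGAG",
    "VEGFA-3'-UTR-rev": "CATGCGGCCG CTCAATGGAG AAGGAGAAACCA",
    "HindIII-Kozak-eGFP-fwd": "GATTAAGCTT GCCACC ATGG TGAGCAAGGG CGAGG",
    "eGFP-rev-MCS-BamHI": "GGATCCGGCG CCCCCGGGGA TATC TTACTT GTACAGCTCG TCCATGC",
}


def find_sites(seq):
    """Return {enzyme: 0-based offset of the first match} for sites in seq."""
    seq = seq.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for primer_name, primer_seq in PRIMERS.items():
    print(primer_name, find_sites(primer_seq))
```

Offsets are counted from the 5' end of each primer; a site that does not start at offset 0 is preceded by a short 5' overhang.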

7.1.2 Nucleotides used for mutagenesis

Oligonucleotide sequences used for the mutagenesis of the VEGFA 3’-UTR are listed. Mutated residues are underlined. Deleted residues are marked by hyphens. Note that the PRE mutations were introduced sequentially, starting with PRE3, and followed by PRE2 and PRE1.

VEGFA-PRE-TM
VEGFA-PRE3-MUT-fwd 5'-CAACTTGTAT TTGTGTGTAT ATATATATAT ATATGTTTAA CAATATATGT GATTCTGATA AAATAGACAT TGCTATTCTG-3'
VEGFA-PRE3-MUT-rev 5'-CAGAATAGCA ATGTCTATTT TATCAGAATC ACATATATTG TTAAACATAT ATATATATAT ATACACACAA ATACAAGTTG-3'
VEGFA-PRE2/3-MUT-fwd 5'-TCTACATACT ATATATATAT TTGGCAACTT GTATTTGTGA CAATATATAT ATATATATGT TTAACAATAT ATGTGATTCT G-3'
VEGFA-PRE2/3-MUT-rev 5'-CAGAATCACA TATATTGTTA AACATATATA TATATATATT GTCACAAATA CAAGTTGCCA AATATATATA TAGTATGTAG A-3'
VEGFA-PRE1-MUT-fwd 5'-CTCTTGCTCT CTTATTTGTA CCGGTTTTAC AATATAAAAT TCATGTTTCC AATCTCTCT-3'
VEGFA-PRE1-MUT-rev 5'-AGAGAGATTG GAAACATGAA TTTTATATTG TAAAACCGGT ACAAATAAGA GAGCAAGAG-3'

VEGFA-MRE-MUT
VEGFA-MRE-MUT-fwd 5'-GTGTGTATAT ATATATATAT ATGTTTATGT ATATATGTGA T---GATAAA ATAGACATTG CTATTCTGTT TTTTATATGT AAAAACAAA-3'
VEGFA-MRE-MUT-rev 5'-TTTGTTTTTA CATATAAAAA ACAGAATAGC AATGTCTATTT TATC---ATC ACATATATAC ATAAACATAT ATATATATAT ATACACAC-3'
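Transcription errors in paired mutagenesis oligonucleotides can be caught with a simple consistency check. The sketch below assumes, as is common for site-directed mutagenesis but not stated explicitly in this appendix, that the forward and reverse oligonucleotides of a pair are exact reverse complements of each other. It is illustrated with the VEGFA-PRE1-MUT pair; the function name reverse_complement is hypothetical.

```python
# DNA complement table; the hyphen (deleted residue marker) maps to itself.
DNA_COMPLEMENT = str.maketrans("ACGT-", "TGCA-")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, ignoring grouping spaces."""
    seq = seq.replace(" ", "").upper()
    return seq.translate(DNA_COMPLEMENT)[::-1]


# VEGFA-PRE1-MUT pair, copied from the list above.
PRE1_FWD = "CTCTTGCTCT CTTATTTGTA CCGGTTTTAC AATATAAAAT TCATGTTTCC AATCTCTCT"
PRE1_REV = "AGAGAGATTG GAAACATGAA TTTTATATTG TAAAACCGGT ACAAATAAGA GAGCAAGAG"

# Assumption being tested: the pair is perfectly complementary.
is_pair = reverse_complement(PRE1_FWD) == PRE1_REV.replace(" ", "")
print("PRE1 pair is reverse-complementary:", is_pair)
```

The same check can be applied to the other pairs; a mismatch would point either to a transcription error or to a pair that was deliberately designed with an offset.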

7.1.3 Nucleotides used for quantitative reverse transcription PCR (SYBR Green)

Oligonucleotides used for SYBR Green-based qRT-PCR are listed.

ACTB
ACTB-SYBR-fwd 5'-TCACCGAGCG CGGCT-3'
ACTB-SYBR-rev 5'-TAATGTCACG CACGATTTCC-3'

eGFP-v1
eGFP-SYBR-v1-fwd 5'-CCTGAAGTTC ATCTGCACCA-3'
eGFP-SYBR-v1-rev 5'-GAAGAAGTCG TGCTGCTTCA-3'

eGFP-v2
eGFP-SYBR-v2-fwd 5'-CGACGGCAAC TACAAGAC-3'
eGFP-SYBR-v2-rev 5'-TAGTTGTACT CCAGCTTGTG C-3'

7.1.4 Nucleotides for the HAMMER RNA tandem tag

RNA aptamer sequences. The used aptamer (S1 minimal motif) is marked with an asterisk.

D8 (minimal motif) 5'-GUCCGAGUAA UUUACGUUUU GAUACGGUUG CGGAACUUGC-3'
J6f1 5'-ACCGACCAGA AUCAUGCAAG UGCGUAAGAU AGUCGCGGGC CGGG-3'
S1 (minimal motif)* 5'-GGCUUAGUAU AGCGAGGUUU AGCUACACUC GUGCUGAGCC-3'
stII 5'-GGAUCGCAUU UGGACUUCUG CCCAGGGUGG CACCACGGUC GGAUCC-3'

Unstructured RNA oligonucleotides. Generated using RNA Designer (http://www.rnasoft.ca/cgi-bin/RNAsoft/RNAdesigner/rnadesign.pl; Andronescu et al., 2004). The used sequence (25v2) is marked with an asterisk.

15v1 5'-UUUAUCUUCA GCUGG-3'
15v2 5'-UAUUGUGUCC CUCUC-3'
15v3 5'-UUGCCCGUAG GAUCA-3'
15v4 5'-AUGCGCCGCU CGAGA-3'
15v5 5'-UGAGUCACGU CCGUA-3'
25v1 5'-UUUAUCUUCA CUUGACUAGC CGGCU-3'
25v2* 5'-UGUUGUUUCA CGCUGUUGAC CGAGG-3'
25v3 5'-CAGACCCUAG GAUUACGUGC ACCGG-3'
25v4 5'-AUGCGCCGCU CGAGAAACAC AAUUG-3'
25v5 5'-UGAGUCACGU CCGUAAACCU AAUGC-3'

HAMMER sequence. Obtained from Mr. Gene (http://mrgene.com/).

HAMMER-synth 5'-GGATCCGGGG GGGGGGGGGG GGGGGGGGGG GATCGATACC GACCAGAATC ATGCAAGTGC GTAAGATAGT CGCGGGCCGG GATCGATCCC CCCCCCCTTG CTAGCTATGT TGTTTCACGC TGTTGACCGA GGTCGCTAGC TTCCCCCCCC CCCCCCCCTC GAGTTAATTA AGTTAACGCG GCCGCGGGCC C-3'

Sequence of the 5’-amino-3’-Cy3 DNA/2’-O-methylated RNA hybrid for antisense oligonucleotide purification. 2’-O-methylated RNA nucleotides are complementary to 25v2 and are underlined. Synthesized by Mauro Zimmermann (Institute of Pharmaceutical Sciences, ETH Zurich).

25v2-as NH2-(C6)-5'-ACAGAATTCA TACCUCGGUC AACAGCGUGA AACAACA-3'-Cy3
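The stated complementarity between the antisense oligonucleotide and the 25v2 element can be checked directly from the two sequences. The following minimal Python sketch computes the RNA reverse complement of 25v2 and tests whether it forms the 3' end of 25v2-as. The resulting split into a 12-nt 5' DNA part and a 25-nt 2’-O-methylated RNA part is inferred here from complementarity alone; the names used in the code are hypothetical.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def rna_reverse_complement(seq):
    """Reverse complement in the RNA alphabet, ignoring grouping spaces."""
    seq = seq.replace(" ", "").upper().replace("T", "U")
    return seq.translate(RNA_COMPLEMENT)[::-1]


# Sequences copied from the lists above.
TARGET_25V2 = "UGUUGUUUCA CGCUGUUGAC CGAGG"
ANTISENSE_25V2 = "ACAGAATTCA TACCUCGGUC AACAGCGUGA AACAACA"

probe = ANTISENSE_25V2.replace(" ", "")
binding_arm = rna_reverse_complement(TARGET_25V2)

# Inferred layout: 5' DNA part followed by the 25v2-complementary arm.
dna_part_length = len(probe) - len(binding_arm)
print("binding arm:", binding_arm)
print("arm is 3' end of probe:", probe.endswith(binding_arm))
print("length of 5' DNA part:", dna_part_length)
```

Because the arm is compared in the RNA alphabet, the check only succeeds if the complementary stretch of the oligonucleotide is written with U rather than T, as it is in the sequence above.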

7.2 MicroRNA mimics and antisense inhibitors

Ordering information and, where applicable, identifiers and sequences of the mature miRNAs are indicated.

Name                                          MicroRNA ID     Sequence                Supplier  Product ID  Pre-/Anti-miR ID
Cy3 dye-labeled Pre-miR Negative Control #1   n/a             n/a                     Ambion    AM17120     n/a
Cy3 dye-labeled Anti-miR Negative Control #1  n/a             n/a                     Ambion    AM17011     n/a
Pre-miR Negative Control #1                   n/a             n/a                     Ambion    AM17110     n/a
Anti-miR Negative Control #1                  n/a             n/a                     Ambion    AM17010     n/a
Pre-miR-361-5p                                hsa-miR-361-5p  UUAUCAGAAUCUCCAGGGGUAC  Ambion    AM17100     PM10085
Anti-miR-361-5p                               hsa-miR-361-5p  UUAUCAGAAUCUCCAGGGGUAC  Ambion    AM17000     AM10085

7.3 Commercial quantitative reverse transcription PCR assays

In addition to the ordering information, an identifier of the targeted gene (Entrez), and, where available, the binding regions of the primers and probes as well as the amplicon lengths (in nucleotides) are indicated.

Name               System  Entrez  Supplier            Product ID  Assay ID       Binding     Length
ACTB               TaqMan  60      Applied Biosystems  4326315E    n/a            exon 1      171
CHM                TaqMan  1121    Applied Biosystems  4351372     Hs01114163_m1  exons 9-10  102
miR-126            TaqMan  406913  Applied Biosystems  4427975     002228         n/a         n/a
miR-205            TaqMan  40698   Applied Biosystems  4427975     000509         n/a         n/a
miR-20b            TaqMan  574032  Applied Biosystems  4427975     001014         n/a         n/a
miR-34a            TaqMan  407040  Applied Biosystems  4427975     000426         n/a         n/a
miR-361-5p         TaqMan  494323  Applied Biosystems  4427975     000554         n/a         n/a
miR-93             TaqMan  407050  Applied Biosystems  4427975     001090         n/a         n/a
ACTB               SYBR    60      QIAGEN              QT01680476  n/a            n/a         104
PUM1               SYBR    9698    QIAGEN              QT0002941   n/a            exons 5-6   73
PUM2               SYBR    23369   QIAGEN              QT00067760  n/a            exons 3/4   89
RNU6B              TaqMan  26826   Applied Biosystems  4427975     001093         n/a         n/a
VEGFA 3'-terminus  TaqMan  7422    Applied Biosystems  4331182     Hs03929054_s1  exon 8      134
VEGFA exon 3       TaqMan  7422    Applied Biosystems  4331182     Hs99999070_m1  exon 3      63

7.4 MicroRNAs predicted to target VEGFA

MicroRNAs predicted to have recognition elements in the VEGFA 3’-UTR are listed. Target predictions were obtained from the following web services: microRNA.org (MR; Betel et al., 2010), TargetScan (TS; Friedman et al., 2009), DIANA-microT (µT; Maragkakis et al., 2009), miRDB (DB; Wang, 2008), and MicroCosm (MC; Griffiths-Jones et al., 2008). For each MRE, the following are indicated: the genome coordinates (chromosome 6, + strand); the 3’-UTR region as defined in the main text (Reg.; 1 = conserved region 1, NC = non-conserved region, 2 = conserved region 2); the position relative to the start site of the 3’-UTR (Pos.; based on GenBank RefSeq entry NM_001025366.2); the particular algorithms predicting the MRE; the total number of algorithms predicting the MRE (#; first number); and the total number of algorithms for which target predictions for the corresponding miRNA were available in the accessed information (#; second number). For MREs previously subjected to experimental analysis, the corresponding references are indicated.
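The relationship between the genome coordinates and the Pos. column, as well as the two-part # column, can be expressed in a few lines of Python. The sketch below is illustrative only: the offset is inferred from the table rows (for example, genomic start 43752300 corresponds to Pos. 1) rather than taken from the RefSeq entry, and the names UTR_START_CHR6, utr_position and parse_support are hypothetical.

```python
# Genomic coordinate (chromosome 6, + strand) of 3'-UTR position 1.
# Inferred from the table rows, not read from the RefSeq record.
UTR_START_CHR6 = 43752300


def utr_position(genomic_start):
    """Convert a genomic MRE start coordinate to a 1-based 3'-UTR position."""
    return genomic_start - UTR_START_CHR6 + 1


def parse_support(field):
    """Split a '#' entry such as '3/5' into (predicting, available) counts."""
    predicting, available = field.split("/")
    return int(predicting), int(available)


# Example rows: hsa-miR-205 (region 1) and hsa-miR-361-5p (region 2).
for name, start, support in [
    ("hsa-miR-205", 43752437, "5/5"),
    ("hsa-miR-361-5p", 43753903, "5/5"),
]:
    print(name, utr_position(start), parse_support(support))
```

Since the + strand is used, positions increase with the genomic coordinate; for a gene on the - strand the subtraction would have to be reversed.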

MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-1976 43752300-43752312 1 1 yes n/d n/d n/d 1/2 hsa-miR-125a-5p 43752303-43752331 1 4 yes 1/5 hsa-miR-1321 43752309-43752326 1 10 yes n/d n/d 1/3 hsa-miR-125b 43752310-43752331 1 11 yes 1/5 hsa-miR-4319 43752315-43752331 1 16 yes n/d n/d n/d n/d 1/1 hsa-miR-218-1* 43752323-43752343 1 24 yes n/d n/d n/d 1/2 hsa-miR-4299 43752340-43752356 1 41 yes n/d n/d n/d n/d 1/1 hsa-miR-136* 43752359-43752381 1 60 yes n/d n/d n/d 1/2 hsa-miR-29b-2* 43752368-43752391 1 69 yes n/d n/d n/d 1/2 hsa-miR-34a 43752371-43752398‡ 1 72 0/5 Ye et al. , 2008 hsa-miR-140-5p 43752373-43752392‡ 1 74 0/5 Ye et al. , 2008 hsa-miR-885-3p 43752377-43752398 1 78 yes yes 2/5 hsa-miR-34b 43752377-43752399‡ 1 78 0/5 Ye et al., 2008 hsa-miR-26b* 43752407-43752428 1 108 yes n/d n/d n/d 1/2 hsa-miR-3169 43752411-43752432 1 112 yes n/d n/d n/d n/d 1/1 hsa-miR-499-5p 43752411-43752434 1 112 yes 1/5 hsa-miR-1299 43752420-43752441 1 121 yes yes n/d n/d 2/3 hsa-miR-1246 43752422-43752440 1 123 yes n/d n/d 1/3 hsa-miR-205* 43752426-43752451 1 127 yes n/d n/d n/d n/d 1/1 hsa-miR-516b 43752435-43752441* 1 136 yes 1/5 hsa-miR-205 43752437-43752456 1 138 yes yes† yes yes yes 5/5 Ye et al. , 2008 Wu et al. , 2009 hsa-miR-1236 43752441-43752461 1 142 yes yes n/d yes n/d 3/3 hsa-miR-877* 43752442-43752462 1 143 yes n/d n/d n/d 1/2 hsa-miR-579 43752447-43752453* 1 148 yes† 1/5 hsa-let-7i* 43752449-43752474 1 150 yes n/d n/d n/d 1/2 hsa-miR-4279 43752450-43752465 1 151 yes n/d n/d n/d n/d 1/1 hsa-miR-520b 43752453-43752481$ 1 154 yes† yes 2/5 hsa-miR-520c-3p 43752453-43752481$ 1 154 yes† yes 2/5 hsa-miR-520d-3p 43752453-43752481$ 1 154 yes† yes 2/5 hsa-miR-302c 43752456-43752482 1 157 yes yes† 2/5 hsa-miR-372 43752457-43752482 1 158 yes yes† yes 3/5 Ye et al. 
, 2008 hsa-miR-33b* 43752459-43752481 1 160 yes n/d n/d n/d 1/2 hsa-miR-520a-3p 43752460-43752482 1 161 yes yes† yes 3/5 hsa-miR-302a 43752460-43752482 1 161 yes yes† 2/5 hsa-miR-302b 43752460-43752482 1 161 yes yes† 2/5 hsa-miR-593 43752460-43752480 1 161 yes 1/5 hsa-miR-93 43752461-43752483 1 162 yes yes† 2/5 Ye et al. , 2008 hsa-miR-520e 43752462-43752482 1 163 yes yes† yes 3/5 hsa-miR-373 43752462-43752482 1 163 yes yes† 2/5 Ye et al. , 2008 hsa-miR-520g 43752462-43752484 1 163 yes yes† 2/5 Ye et al. , 2008 hsa-miR-519d 43752463-43752483 1 164 yes yes† yes yes 4/5 hsa-miR-106a 43752463-43752483 1 164 yes yes† yes 3/5 Hua et al. , 2006 Ye et al. , 2008 hsa-miR-20a 43752463-43752483 1 164 yes yes† yes 3/5 Hua et al. , 2006 Ye et al. , 2008


MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-17 43752463-43752483 1 164 yes yes† 2/5 Hua et al. , 2006 Ye et al. , 2008 Lei et al. , 2009 hsa-miR-20b 43752463-43752483 1 164 yes yes† 2/5 Hua et al. , 2006 Ye et al. , 2008 hsa-miR-526b* 43752463-43752483 1 164 yes n/d n/d n/d 1/2 hsa-miR-520h 43752464-43752484 1 165 yes yes† 2/5 Ye et al. , 2008 hsa-miR-106b 43752465-43752483 1 166 yes yes† yes 3/5 Hua et al. , 2006 Ye et al. , 2008 hsa-miR-302e 43752466-43752482 1 167 yes yes† n/d n/d 2/3 hsa-miR-711 43752468-43752489 1 169 yes n/d n/d n/d n/d 1/1 hsa-miR-186* 43752468-43752486 1 169 yes n/d n/d n/d 1/2 hsa-miR-302d 43752475-43752481* 1 176 yes† 1/5 Ye et al. , 2008 hsa-miR-613 43752489-43752517$ 1 190 yes 1/5 hsa-miR-1249 43752491-43752497* 1 192 yes n/d n/d 1/3 hsa-miR-516a-3p 43752507-43752513* 1 208 yes 1/5 hsa-miR-492 43752518-43752542 1 219 yes 1/5 hsa-miR-3126-5p 43752525-43752545 1 226 yes n/d n/d n/d n/d 1/1 hsa-miR-520a-5p 43752528-43752550 1 229 yes 1/5 hsa-miR-525-5p 43752529-43752550 1 230 yes 1/5 hsa-miR-652 43752544-43752564 1 245 yes 1/5 hsa-miR-3163 43752548-43752569 1 249 yes n/d n/d n/d n/d 1/1 hsa-miR-922 43752556-43752580 1 257 yes 1/5 hsa-miR-31 43752558-43752579 1 259 yes 1/5 hsa-miR-15a 43752559-43752582 1 260 yes yes† yes 3/5 hsa-miR-103 43752559-43752581 1 260 yes yes 2/5 hsa-miR-107 43752559-43752581 1 260 yes yes 2/5 hsa-miR-545 43752559-43752580 1 260 yes 1/5 hsa-miR-16 43752560-43752582 1 261 yes yes† yes 3/5 Karaa et al. 
, 2009 hsa-miR-503 43752560-43752582 1 261 yes yes† 2/5 hsa-miR-15b 43752561-43752582 1 262 yes yes† yes 3/5 hsa-miR-424 43752561-43752582 1 262 yes yes† yes 3/5 hsa-miR-497 43752562-43752582 1 263 yes yes† 2/5 hsa-miR-195 43752563-43752582 1 264 yes yes† yes yes 4/5 hsa-miR-374b* 43752563-43752584 1 264 yes n/d n/d n/d 1/2 hsa-miR-646 43752564-43752582 1 265 yes yes† 2/5 hsa-miR-423-3p 43752572-43752594 1 273 yes yes 2/5 hsa-miR-1180 43752579-43752600 1 280 yes n/d n/d 1/3 hsa-miR-141* 43752581-43752603 1 282 yes n/d n/d n/d 1/2 hsa-miR-548e 43752593-43752614 1 294 yes yes n/d n/d 2/3 hsa-miR-548a-3p 43752594-43752614 1 295 yes yes 2/5 hsa-miR-548f 43752596-43752614 1 297 yes yes n/d n/d 2/3 hsa-miR-323b-5p 43752618-43752640 1 319 yes n/d n/d n/d n/d 1/1 hsa-miR-1293 43752619-43752641 1 320 yes yes n/d n/d 2/3 hsa-miR-363* 43752624-43752645 1 325 yes n/d n/d n/d 1/2 hsa-miR-139-5p 43752627-43752633* 1 328 yes 1/5 hsa-miR-299-3p 43752628-43752649 1 329 yes yes† yes 3/5 Jafarifar et al. , 2011 hsa-miR-567 43752629-43752651 1 330 yes yes 2/5 Jafarifar et al. , 2011 hsa-miR-934 43752629-43752635* 1 330 yes 1/5 hsa-miR-3149 43752635-43752657 1 336 yes n/d n/d n/d n/d 1/1 hsa-miR-609 43752635-43752641* 1 336 yes 1/5 Jafarifar et al. , 2011 hsa-miR-297 43752636-43752656 1 337 yes yes yes 3/5 Jafarifar et al. 
, 2011 hsa-miR-3171 43752636-43752658 1 337 yes n/d n/d n/d n/d 1/1 hsa-miR-340 43752641-43752662 1 342 yes 1/5 hsa-miR-410 43752643-43752663 1 344 yes yes 2/5 hsa-miR-374a 43752654-43752677 1 355 yes yes yes 3/5 hsa-miR-889 43752655-43752675 1 356 yes 1/5 hsa-miR-374b 43752656-43752677 1 357 yes yes yes 3/5 hsa-miR-369-3p 43752657-43752676 1 358 yes yes 2/5 hsa-miR-410 43752658-43752678 1 359 yes yes 2/5 hsa-miR-3145 43752669-43752697 1 370 yes n/d n/d n/d n/d 1/1 hsa-miR-590-3p 43752670-43752690 1 371 yes 1/5 hsa-miR-539 43752676-43752699 1 377 yes 1/5 hsa-miR-1270 43752679-43752701 1 380 yes n/d n/d 1/3 hsa-miR-202 43752680-43752699 1 381 yes 1/5 hsa-miR-135a* 43752681-43752702 1 382 yes n/d n/d n/d 1/2 hsa-miR-1303 43752681-43752702 1 382 yes n/d n/d 1/3 hsa-miR-3121 43752682-43752704 1 383 yes n/d n/d n/d n/d 1/1 hsa-miR-620 43752682-43752701 1 383 yes 1/5 hsa-miR-376c 43752684-43752704 1 385 yes 1/5 hsa-miR-3163 43752686-43752707 1 387 yes n/d n/d n/d n/d 1/1 hsa-miR-142-5p 43752687-43752707 1 388 yes 1/5 hsa-miR-340 43752687-43752708 1 388 yes 1/5 hsa-miR-410 43752689-43752709 1 390 yes yes 2/5 hsa-miR-577 43752689-43752709 1 390 yes 1/5 hsa-miR-1259 43752690-43752710 1 391 yes n/d n/d 1/3 hsa-miR-568 43752692-43752710 1 393 yes 1/5 hsa-miR-3145 43752698-43752721 1 399 yes n/d n/d n/d n/d 1/1 hsa-miR-590-3p 43752699-43752719 1 400 yes 1/5 hsa-miR-4307 43752703-43752720 1 404 yes n/d n/d n/d n/d 1/1 hsa-miR-186 43752709-43752731 1 410 yes yes† yes 3/5 hsa-miR-548u 43752709-43752731 1 410 yes n/d n/d n/d n/d 1/1 hsa-miR-3133 43752710-43752731 1 411 yes n/d n/d n/d n/d 1/1


MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-548l 43752710-43752731 1 411 yes n/d n/d 1/3 hsa-miR-548n 43752711-43752732 1 412 yes n/d n/d 1/3 hsa-miR-3121 43752712-43752733 1 413 yes n/d n/d n/d n/d 1/1 hsa-miR-3123 43752714-43752731 1 415 yes n/d n/d n/d n/d 1/1 hsa-miR-4311 43752717-43752733 1 418 yes n/d n/d n/d n/d 1/1 hsa-miR-32* 43752720-43752740 1 421 yes n/d n/d n/d 1/2 hsa-miR-2053 43752722-43752744 1 423 yes n/d yes n/d 2/3 hsa-miR-569 43752723-43752743 1 424 yes yes 2/5 hsa-miR-452 43752725-43752746 1 426 yes 1/5 hsa-miR-141 43752727-43752749 1 428 yes 1/5 hsa-miR-1208 43752728-43752747 1 429 yes yes n/d yes n/d 3/3 hsa-miR-106b* 43752728-43752749 1 429 yes n/d n/d n/d 1/2 hsa-miR-1278 43752728-43752749 1 429 yes n/d n/d 1/3 hsa-miR-374b* 43752729-43752751 1 430 yes n/d n/d n/d 1/2 hsa-miR-323-3p 43752733-43752754 1 434 yes 1/5 hsa-miR-543 43752734-43752755 1 435 yes yes† 2/5 hsa-miR-3143 43752736-43752756 1 437 yes n/d n/d n/d n/d 1/1 hsa-miR-21* 43752743-43752764 1 444 yes n/d n/d n/d 1/2 hsa-miR-199b-5p 43752750-43752772 1 451 yes yes† 2/5 hsa-miR-508-5p 43752751-43752776 1 452 yes 1/5 hsa-miR-199a-5p 43752753-43752772 1 454 yes yes† 2/5 hsa-miR-1825 43752755-43752772 1 456 yes yes† n/d n/d 2/3 hsa-miR-4276 43752756-43752772 1 457 yes n/d n/d n/d n/d 1/1 hsa-miR-545 43752761-43752785 1 462 yes 1/5 hsa-miR-4274 43752770-43752787 1 471 yes n/d n/d n/d n/d 1/1 hsa-miR-150 43752783-43752805 1 484 yes yes 2/5 hsa-miR-543 43752807-43752813* 1 508 yes 1/5 hsa-miR-1292 43752811-43752817* 1 512 yes n/d n/d 1/3 hsa-miR-1205 43752813-43752833 1/NC 514 yes n/d n/d 1/3 hsa-miR-339-5p 43752817-43752835 1/NC 518 yes yes yes 3/5 hsa-miR-1274a 43752820-43752836 NC 521 yes yes n/d n/d 2/3 hsa-miR-877* 43752821-43752841 NC 522 yes n/d n/d n/d 1/2 hsa-miR-483-3p 43752826-43752849 NC 527 yes 1/5 hsa-miR-1274b 43752829-43752835* NC 530 yes n/d n/d 1/3 hsa-miR-4279 43752832-43752847 NC 533 yes n/d n/d n/d n/d 1/1 hsa-miR-1236 43752833-43752839+ 
NC 534 yes n/d yes n/d 2/3 hsa-miR-641 43752847-43752869 NC 548 yes 1/5 hsa-miR-3148 43752857-43752879 NC 558 yes n/d n/d n/d n/d 1/1 hsa-miR-4298 43752860-43752881 NC 561 yes n/d n/d n/d n/d 1/1 hsa-miR-1302 43752861-43752881 NC 562 yes yes n/d n/d 2/3 hsa-miR-504 43752892-43752898* NC 593 yes 1/5 hsa-miR-1207-5p 43752904-43752910* NC 605 yes n/d n/d 1/3 hsa-miR-1285 43752907-43752913* NC 608 yes n/d n/d 1/3 hsa-miR-612 43752907-43752913* NC 608 yes 1/5 hsa-miR-874 43752927-43752933+ NC 628 yes yes 2/5 hsa-miR-146b-3p 43752927-43752933* NC 628 yes 1/5 hsa-miR-548a-3p 43752929-43752957$ NC 630 yes yes 2/5 hsa-miR-1279 43752942-43752948* NC 643 yes n/d n/d 1/3 hsa-miR-548e 43752951-43752957* NC 652 yes n/d n/d 1/3 hsa-miR-548f 43752951-43752957* NC 652 yes n/d n/d 1/3 hsa-miR-1323 43752952-43752958* NC 653 yes n/d n/d 1/3 hsa-miR-548o 43752952-43752958* NC 653 yes n/d n/d 1/3 hsa-miR-762 43752957-43752981 NC 658 yes n/d n/d n/d n/d 1/1 hsa-miR-629 43752970-43752976* NC 671 yes 1/5 hsa-miR-24 43752986-43752992* NC 687 yes 1/5 hsa-miR-631 43752988-43753008 NC 689 yes yes 2/5 hsa-miR-511 43753005-43753025 NC 706 yes yes 2/5 hsa-miR-619 43753009-43753035 NC 710 yes 1/5 hsa-miR-486-5p 43753016-43753040 NC 717 yes yes 2/5 hsa-miR-205 43753023-43753048‡ NC 724 0/5 Ye et al. , 2008 hsa-miR-1274a 43753025-43753042 NC 726 yes yes n/d n/d 2/3 hsa-miR-1274b 43753035-43753041* NC 736 yes n/d n/d 1/3 hsa-miR-339-5p 43753035-43753041* NC 736 yes 1/5 hsa-miR-15b 43753069-43753094% NC 770 0/5 Hua et al. , 2006 hsa-miR-107 43753070-43753093% NC 771 0/5 Hua et al. , 2006 hsa-miR-3120 43753073-43753093 NC 774 yes n/d n/d n/d n/d 1/1 hsa-miR-520g 43753073-43753099‡ NC 774 0/5 Ye et al. , 2008 hsa-miR-127-5p 43753074-43753080* NC 775 yes 1/5 hsa-miR-17 43753075-43753098‡ NC 776 0/5 Hua et al. , 2006 Ye et al. , 2008 hsa-miR-520h 43753075-43753099‡ NC 776 0/5 Ye et al. , 2008 hsa-miR-20b 43753076-43753098‡ NC 777 0/5 Hua et al. , 2006 Ye et al. 
, 2008 hsa-miR-186* 43753077-43753101 NC 778 yes n/d n/d n/d 1/2 hsa-miR-15a 43753077-43753094‡ NC 778 0/5 Ye et al. , 2008 hsa-miR-330-3p 43753092-43753098* NC 793 yes 1/5 Ye et al. , 2008 hsa-miR-16 43753092-43753121‡ NC 793 0/5 Hua et al. , 2006 Ye et al. , 2008 hsa-miR-147 43753093-43753116‡ NC 794 0/5 Ye et al. , 2008 hsa-miR-103 43753099-43753120 NC 800 yes yes yes 3/5 hsa-miR-107 43753099-43753120 NC 800 yes yes yes 3/5 hsa-miR-372 43753120-43753147‡ NC 821 0/5 Ye et al. , 2008 hsa-miR-373 43753124-43753147‡ NC 825 0/5 Ye et al. , 2008


MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-637 43753132-43753138* NC 833 yes 1/5 hsa-miR-331-3p 43753135-43753141* NC 836 yes 1/5 hsa-miR-141* 43753137-43753157 NC 838 yes n/d n/d n/d 1/2 hsa-miR-378 43753141-43753163‡ NC 842 0/5 Ye et al. , 2008 hsa-miR-34a 43753142-43753148* NC 843 yes 1/5 hsa-miR-34c-5p 43753142-43753148* NC 843 yes 1/5 hsa-miR-449a 43753142-43753148* NC 843 yes 1/5 hsa-miR-449b 43753142-43753148* NC 843 yes 1/5 hsa-miR-484 43753161-43753167* NC 862 yes 1/5 hsa-miR-4270 43753171-43753193 NC 872 yes n/d n/d n/d n/d 1/1 hsa-miR-125a-3p 43753172-43753193 NC 873 yes yes 2/5 hsa-miR-612 43753187-43753209 NC 888 yes yes 2/5 hsa-miR-1285 43753189-43753209 NC 890 yes yes n/d n/d 2/3 hsa-miR-140-5p 43753189-43753217$ NC 890 yes 1/5 Ye et al. , 2008 hsa-miR-942 43753220-43753242 NC 921 yes yes 2/5 hsa-miR-1236 43753233-43753239* NC 934 yes n/d n/d 1/3 hsa-miR-1236 43753238-43753244* NC 939 yes n/d n/d 1/3 hsa-miR-103-2* 43753240-43753263 NC 941 yes n/d n/d n/d n/d 1/1 hsa-miR-593 43753240-43753246* NC 941 yes 1/5 hsa-miR-920 43753251-43753279$ NC 952 yes yes 2/5 hsa-miR-920 43753251-43753279$ NC 952 yes 1/5 hsa-miR-920 43753251-43753279$ NC 952 yes 1/5 hsa-miR-939 43753256-43753281 NC 957 yes yes yes 3/5 hsa-miR-1308 43753265-43753271* NC 966 yes n/d n/d 1/3 hsa-miR-939 43753292-43753317 NC 993 yes yes yes 3/5 hsa-miR-1255a 43753296-43753302* NC 997 yes n/d n/d 1/3 hsa-miR-1255b 43753296-43753302* NC 997 yes n/d n/d 1/3 hsa-miR-3179 43753299-43753320 NC 1000 yes n/d n/d n/d n/d 1/1 hsa-miR-1224-5p 43753335-43753353 NC 1036 yes yes n/d yes n/d 3/3 hsa-miR-29b-2* 43753347-43753367 NC 1048 yes n/d n/d n/d 1/2 hsa-miR-1294 43753348-43753354* NC 1049 yes n/d n/d 1/3 hsa-miR-140-5p 43753352-43753369 NC 1053 yes yes† yes 3/5 hsa-miR-637 43753362-43753385 NC 1063 yes yes 2/5 hsa-miR-4271 43753362-43753381 NC 1063 yes n/d n/d n/d n/d 1/1 hsa-miR-342-3p 43753387-43753409 NC 1088 yes yes 2/5 hsa-miR-505 43753395-43753417 NC 1096 yes 1/5 
hsa-miR-493 43753399-43753421 NC 1100 yes yes 2/5 hsa-miR-377 43753401-43753407* NC 1102 yes† 1/5 hsa-miR-765 43753419-43753425* NC 1120 yes 1/5 hsa-miR-339-5p 43753449-43753471 NC 1150 yes yes yes 3/5 hsa-miR-146b-3p 43753452-43753472 NC 1153 yes yes 2/5 hsa-miR-550 43753453-43753459* NC 1154 yes 1/5 hsa-miR-593 43753461-43753467* NC 1162 yes 1/5 hsa-miR-449a 43753464-43753486 NC 1165 yes 1/5 hsa-miR-449b 43753464-43753486 NC 1165 yes 1/5 hsa-miR-874 43753466-43753472+ NC 1167 yes yes 2/5 hsa-miR-768-3p 43753486-43753492* NC/2 1187 n/d yes n/d 1/3 hsa-miR-141 43753496-43753517 2 1197 yes 1/5 hsa-miR-200a 43753496-43753517 2 1197 yes 1/5 hsa-miR-3163 43753500-43753521 2 1201 yes n/d n/d n/d n/d 1/1 hsa-miR-371-3p 43753509-43753531 2 1210 yes 1/5 hsa-miR-624* 43753510-43753531 2 1211 yes n/d n/d n/d 1/2 hsa-miR-126 43753511-43753531 2 1212 yes yes 2/5 Liu et al. , 2009 hsa-miR-1911 43753511-43753532 2 1212 yes n/d n/d n/d 1/2 hsa-miR-548k 43753513-43753533 2 1214 yes n/d n/d 1/3 hsa-miR-3145 43753517-43753542 2 1218 yes n/d n/d n/d n/d 1/1 hsa-miR-410 43753517-43753523* 2 1218 yes† 1/5 hsa-miR-656 43753521-43753541 2 1222 yes 1/5 hsa-miR-3154 43753527-43753548 2 1228 yes n/d n/d n/d n/d 1/1 hsa-miR-3201 43753527-43753543 2 1228 yes n/d n/d n/d n/d 1/1 hsa-miR-548t 43753528-43753548 2 1229 yes n/d n/d n/d n/d 1/1 hsa-miR-583 43753529-43753549 2 1230 yes 1/5 hsa-miR-576-5p 43753536-43753557 2 1237 yes yes† yes 3/5 hsa-miR-513a-3p 43753539-43753561 2 1240 yes n/d n/d 1/3 hsa-miR-4263 43753540-43753557 2 1241 yes n/d n/d n/d n/d 1/1 hsa-miR-29b-1* 43753545-43753567 2 1246 yes n/d n/d n/d 1/2 hsa-miR-452 43753547-43753568 2 1248 yes yes† yes 3/5 hsa-miR-4307 43753547-43753567 2 1248 yes n/d n/d n/d n/d 1/1 hsa-miR-2052 43753548-43753565 2 1249 yes n/d n/d n/d 1/2 hsa-miR-451 43753548-43753569 2 1249 yes 1/5 hsa-miR-582-3p 43753548-43753569 2 1249 yes 1/5 hsa-miR-1208 43753550-43753569 2 1251 yes n/d n/d 1/3 hsa-miR-548g 43753550-43753570 2 1251 yes n/d n/d 1/3 
hsa-miR-943 43753550-43753570 2 1251 yes 1/5 hsa-miR-183* 43753554-43753575 2 1255 yes n/d n/d n/d 1/2 hsa-miR-302b* 43753561-43753582 2 1262 yes n/d n/d n/d 1/2 hsa-miR-302d* 43753561-43753582 2 1262 yes n/d n/d n/d 1/2 hsa-miR-130b* 43753564-43753584 2 1265 yes n/d n/d n/d 1/2 hsa-miR-24-1* 43753568-43753589 2 1269 yes n/d n/d n/d 1/2 hsa-miR-24-2* 43753568-43753589 2 1269 yes n/d n/d n/d 1/2 hsa-miR-548d-3p 43753575-43753594 2 1276 yes yes† yes 3/5 hsa-miR-548x 43753576-43753594 2 1277 yes n/d n/d n/d n/d 1/1


MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-875-3p 43753580-43753600 2 1281 yes 1/5 hsa-miR-200c 43753582-43753604 2 1283 yes yes† 2/5 hsa-miR-200b 43753583-43753604 2 1284 yes yes† 2/5 McArthur et al. , 2011 hsa-miR-1278 43753583-43753604 2 1284 yes n/d n/d 1/3 hsa-miR-4251 43753584-43753600 2 1285 yes n/d n/d n/d n/d 1/1 hsa-miR-429 43753585-43753604 2 1286 yes yes† 2/5 hsa-miR-186 43753587-43753608 2 1288 yes 1/5 hsa-miR-16-2* 43753595-43753618 2 1296 yes n/d n/d n/d 1/2 hsa-miR-656 43753596-43753616 2 1297 yes yes 2/5 hsa-miR-195* 43753596-43753618 2 1297 yes n/d n/d n/d 1/2 hsa-miR-559 43753603-43753623 2 1304 yes 1/5 hsa-miR-203 43753604-43753625 2 1305 yes yes† 2/5 hsa-miR-653 43753606-43753628 2 1307 yes 1/5 hsa-miR-219-1-3p 43753608-43753629 2 1309 yes 1/5 hsa-miR-3121 43753611-43753632 2 1312 yes n/d n/d n/d n/d 1/1 hsa-miR-633 43753611-43753632 2 1312 yes 1/5 hsa-miR-216a 43753619-43753640 2 1320 yes 1/5 hsa-miR-3133 43753624-43753647 2 1325 yes n/d n/d n/d n/d 1/1 hsa-miR-548p 43753630-43753651 2 1331 yes yes n/d yes n/d 3/3 hsa-miR-3121 43753650-43753670 2 1351 yes n/d n/d n/d n/d 1/1 hsa-miR-150* 43753657-43753676 2 1358 yes n/d n/d n/d 1/2 hsa-miR-548d-3p 43753658-43753682 2 1359 yes yes yes 3/5 hsa-miR-1537 43753659-43753680 2 1360 yes n/d n/d n/d 1/2 hsa-miR-300 43753665-43753686 2 1366 yes yes 2/5 hsa-miR-381 43753665-43753686 2 1366 yes yes 2/5 hsa-miR-1283 43753665-43753686 2 1366 yes n/d n/d 1/3 hsa-let-7f-1* 43753666-43753687 2 1367 yes n/d n/d n/d 1/2 hsa-let-7b* 43753667-43753687 2 1368 yes n/d n/d n/d 1/2 hsa-miR-1284 43753667-43753688 2 1368 yes n/d n/d 1/3 hsa-let-7a* 43753668-43753687 2 1369 yes n/d n/d n/d 1/2 hsa-miR-590-3p 43753674-43753694 2 1375 yes yes 2/5 hsa-miR-494 43753681-43753702 2 1382 yes yes 2/5 hsa-miR-4261 43753688-43753703 2 1389 yes n/d n/d n/d n/d 1/1 hsa-miR-571 43753688-43753709 2 1389 yes 1/5 hsa-miR-765 43753691-43753716 2 1392 yes 1/5 hsa-miR-185 43753695-43753716 2 1396 yes yes yes 
3/5 hsa-miR-4270 43753700-43753719 2 1401 yes n/d n/d n/d n/d 1/1 hsa-miR-4306 43753700-43753716 2 1401 yes n/d n/d n/d n/d 1/1 hsa-miR-3118 43753710-43753734 2 1411 yes n/d n/d n/d n/d 1/1 hsa-miR-134 43753711-43753734 2 1412 yes yes yes 3/5 hsa-miR-943 43753712-43753733 2 1413 yes yes yes 3/5 hsa-miR-1278 43753712-43753735 2 1413 yes n/d n/d 1/3 hsa-miR-512-3p 43753713-43753735 2 1414 yes 1/5 hsa-miR-26a 43753724-43753747 2 1425 yes 1/5 hsa-miR-26b* 43753730-43753751 2 1431 yes n/d n/d n/d 1/2 hsa-miR-124* 43753731-43753751 2 1432 yes n/d n/d n/d 1/2 hsa-miR-429 43753734-43753756 2 1435 yes 1/5 hsa-miR-889 43753736-43753757 2 1437 yes 1/5 hsa-miR-548p 43753747-43753766 2 1448 yes yes† n/d yes n/d 3/3 hsa-miR-3152 43753751-43753771 2 1452 yes n/d n/d n/d n/d 1/1 hsa-miR-138 43753752-43753776 2 1453 yes 1/5 hsa-miR-3160 43753756-43753778 2 1457 yes n/d n/d n/d n/d 1/1 hsa-miR-650 43753765-43753786 2 1466 yes 1/5 hsa-miR-574-5p 43753767-43753773* 2 1468 yes 1/5 hsa-miR-149* 43753769-43753787 2 1470 yes n/d n/d n/d 1/2 hsa-miR-638 43753771-43753794 2 1472 yes yes 2/5 hsa-miR-939 43753780-43753804 2 1481 yes yes yes 3/5 hsa-miR-505* 43753783-43753802 2 1484 yes n/d n/d n/d 1/2 hsa-miR-542-5p 43753783-43753789* 2 1484 yes 1/5 hsa-miR-3150 43753786-43753805 2 1487 yes n/d n/d n/d n/d 1/1 hsa-miR-1231 43753790-43753810 2 1491 yes n/d n/d 1/3 hsa-miR-409-3p 43753793-43753814 2 1494 yes 1/5 hsa-miR-219-2-3p 43753794-43753815 2 1495 yes 1/5 hsa-miR-206 43753795-43753816 2 1496 yes yes† yes 3/5 hsa-miR-1 43753795-43753816 2 1496 yes yes† 2/5 hsa-miR-613 43753796-43753816 2 1497 yes yes† yes 3/5 hsa-miR-607 43753802-43753822 2 1503 yes yes 2/5 hsa-miR-3171 43753817-43753842 2 1518 yes n/d n/d n/d n/d 1/1 hsa-miR-568 43753821-43753840 2 1522 yes 1/5 hsa-miR-1279 43753822-43753838 2 1523 yes n/d n/d 1/3 hsa-miR-576-3p 43753826-43753843 2 1527 yes yes 2/5 hsa-miR-1277 43753826-43753847 2 1527 yes n/d n/d 1/3 hsa-miR-567 43753827-43753849 2 1528 yes 1/5 hsa-miR-297 
43753828-43753850 2 1529 yes 1/5 hsa-miR-3149 43753829-43753851 2 1530 yes n/d n/d n/d n/d 1/1 hsa-miR-1265 43753829-43753850 2 1530 yes n/d n/d 1/3 hsa-miR-633 43753832-43753855 2 1533 yes 1/5 hsa-miR-144 43753834-43753852 2 1535 yes 1/5 hsa-miR-656 43753839-43753860 2 1540 yes 1/5 hsa-miR-548s 43753845-43753867 2 1546 yes n/d n/d n/d n/d 1/1 hsa-miR-192* 43753845-43753866 2 1546 yes n/d n/d n/d 1/2 hsa-miR-744* 43753848-43753869 2 1549 yes n/d n/d n/d 1/2


MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-382 43753849-43753870 2 1550 yes 1/5 hsa-miR-300 43753852-43753874 2 1553 yes yes 2/5 hsa-miR-381 43753852-43753874 2 1553 yes yes 2/5 hsa-miR-655 43753856-43753876 2 1557 yes 1/5 hsa-miR-7-1* 43753859-43753879 2 1560 yes n/d n/d n/d 1/2 hsa-miR-7-2* 43753859-43753879 2 1560 yes n/d n/d n/d 1/2 hsa-miR-329 43753861-43753882 2 1562 yes yes 2/5 hsa-miR-362-3p 43753861-43753882 2 1562 yes yes 2/5 hsa-miR-603 43753861-43753882 2 1562 yes yes 2/5 hsa-miR-466 43753861-43753883 2 1562 yes n/d n/d n/d n/d 1/1 hsa-miR-377 43753861-43753882 2 1562 yes 1/5 hsa-let-7b* 43753864-43753885 2 1565 yes n/d n/d n/d 1/2 hsa-let-7f-1* 43753864-43753885 2 1565 yes n/d n/d n/d 1/2 hsa-let-7f-2* 43753864-43753885 2 1565 yes n/d n/d n/d 1/2 hsa-let-7a* 43753865-43753885 2 1566 yes n/d n/d n/d 1/2 hsa-miR-1284 43753865-43753886 2 1566 yes n/d n/d 1/3 hsa-miR-410 43753865-43753886 2 1566 yes 1/5 hsa-miR-374a 43753867-43753889 2 1568 yes 1/5 hsa-miR-656 43753869-43753890 2 1570 yes 1/5 hsa-miR-30c 43753882-43753904 2 1583 yes 1/5 hsa-miR-494 43753882-43753903 2 1583 yes 1/5 hsa-miR-302a* 43753883-43753905 2 1584 yes n/d n/d n/d 1/2 hsa-miR-30a 43753883-43753904 2 1584 yes 1/5 hsa-miR-30b 43753883-43753904 2 1584 yes 1/5 hsa-miR-30d 43753883-43753904 2 1584 yes 1/5 hsa-miR-30e 43753883-43753904 2 1584 yes 1/5 hsa-let-7b* 43753888-43753909 2 1589 yes n/d n/d n/d 1/2 hsa-let-7f-1* 43753888-43753909 2 1589 yes n/d n/d n/d 1/2 hsa-let-7f-2* 43753888-43753909 2 1589 yes n/d n/d n/d 1/2 hsa-miR-1284 43753888-43753910 2 1589 yes n/d n/d 1/3 hsa-let-7a* 43753889-43753909 2 1590 yes n/d n/d n/d 1/2 hsa-miR-130a* 43753895-43753916 2 1596 yes n/d n/d n/d 1/2 hsa-miR-23a 43753896-43753916 2 1597 yes 1/5 hsa-miR-23b 43753896-43753916 2 1597 yes 1/5 hsa-miR-15b* 43753897-43753919 2 1598 yes n/d n/d n/d 1/2 hsa-miR-34b 43753897-43753918 2 1598 yes 1/5 hsa-miR-2115* 43753901-43753923 2 1602 yes n/d n/d n/d n/d 1/1 hsa-miR-3074 
43753902-43753925 2 1603 yes n/d n/d n/d n/d 1/1 hsa-miR-374a* 43753902-43753925 2 1603 yes n/d n/d n/d 1/2 hsa-miR-383 43753902-43753924 2 1603 yes 1/5 hsa-miR-361-5p 43753903-43753924 2 1604 yes yes† yes yes yes 5/5 hsa-miR-138-2* 43753909-43753930 2 1610 yes n/d n/d n/d 1/2 hsa-miR-654-3p 43753909-43753934 2 1610 yes 1/5 hsa-miR-934 43753909-43753932 2 1610 yes 1/5 hsa-miR-590-3p 43753910-43753929 2 1611 yes 1/5 hsa-miR-106a* 43753917-43753938 2 1618 yes n/d n/d n/d 1/2 hsa-miR-7-2* 43753922-43753946 2 1623 yes n/d n/d n/d 1/2 hsa-miR-3133 43753929-43753950 2 1630 yes n/d n/d n/d n/d 1/1 hsa-miR-548x 43753929-43753949 2 1630 yes n/d n/d n/d n/d 1/1 hsa-miR-548d-3p 43753929-43753950 2 1630 yes 1/5 hsa-miR-570 43753929-43753949 2 1630 yes 1/5 hsa-miR-3163 43753930-43753952 2 1631 yes n/d n/d n/d n/d 1/1 hsa-miR-410 43753931-43753954 2 1632 yes 1/5 hsa-miR-577 43753933-43753954 2 1634 yes 1/5 hsa-miR-340 43753934-43753953 2 1635 yes 1/5 hsa-miR-376c 43753936-43753956 2 1637 yes 1/5 hsa-miR-448 43753936-43753957 2 1637 yes 1/5 hsa-miR-466 43753937-43753957 2 1638 yes n/d n/d n/d n/d 1/1 hsa-miR-2052 43753942-43753962 2 1643 yes n/d n/d n/d 1/2 hsa-miR-129-5p 43753942-43753961 2 1643 yes 1/5 hsa-miR-2116 43753943-43753962 2 1644 yes n/d n/d n/d n/d 1/1 hsa-miR-4307 43753946-43753964 2 1647 yes n/d n/d n/d n/d 1/1 hsa-miR-19a* 43753946-43753967 2 1647 yes n/d n/d n/d 1/2 hsa-miR-19b-1* 43753947-43753967 2 1648 yes n/d n/d n/d 1/2 hsa-miR-19b-2* 43753948-43753967 2 1649 yes n/d n/d n/d 1/2 hsa-miR-578 43753952-43753972 2 1653 yes yes† yes 3/5 hsa-miR-138-2* 43753958-43753979 2 1659 yes n/d n/d n/d 1/2 hsa-miR-136 43753959-43753981 2 1660 yes 1/5 hsa-miR-942 43753963-43753984 2 1664 yes 1/5 hsa-miR-2117 43753964-43753984 2 1665 yes n/d n/d n/d n/d 1/1 hsa-miR-1277 43753970-43753991 2 1671 yes n/d n/d 1/3 hsa-miR-567 43753971-43753993 2 1672 yes 1/5 hsa-miR-3149 43753973-43753995 2 1674 yes n/d n/d n/d n/d 1/1 hsa-miR-297 43753975-43753994 2 1676 yes 1/5 hsa-miR-32* 
43753978-43753999 2 1679 yes n/d n/d n/d 1/2 hsa-miR-185 43753985-43754006 2 1686 yes yes† yes 3/5 hsa-miR-3173 43753986-43754008 2 1687 yes n/d n/d n/d n/d 1/1 hsa-miR-4306 43753990-43754006 2 1691 yes n/d n/d n/d n/d 1/1 hsa-miR-583 43753991-43754011 2 1692 yes 1/5 hsa-miR-3143 43753995-43754019 2 1696 yes n/d n/d n/d n/d 1/1 hsa-miR-551b* 43753995-43754017 2 1696 yes n/d n/d n/d 1/2 hsa-miR-548n 43753997-43754018 2 1698 yes n/d n/d 1/3

MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-3163 43753998-43754020 2 1699 yes n/d n/d n/d n/d 1/1 hsa-miR-4282 43754001-43754018 2 1702 yes n/d n/d n/d n/d 1/1 hsa-miR-656 43754003-43754023 2 1704 yes yes 2/5 hsa-miR-16-2* 43754004-43754025 2 1705 yes n/d n/d n/d 1/2 hsa-miR-195* 43754004-43754025 2 1705 yes n/d n/d n/d 1/2 hsa-miR-3065-5p 43754006-43754029 2 1707 yes n/d n/d n/d n/d 1/1 hsa-miR-495 43754007-43754028 2 1708 yes yes 2/5 hsa-miR-7-1* 43754007-43754028 2 1708 yes n/d n/d n/d 1/2 hsa-miR-7-2* 43754007-43754028 2 1708 yes n/d n/d n/d 1/2 hsa-miR-3161 43754010-43754032 2 1711 yes n/d n/d n/d n/d 1/1 hsa-miR-570 43754010-43754031 2 1711 yes 1/5 hsa-miR-224* 43754011-43754035 2 1712 yes n/d n/d n/d n/d 1/1 hsa-miR-522 43754011-43754035 2 1712 yes 1/5 hsa-miR-577 43754019-43754039 2 1720 yes 1/5 hsa-miR-323b-3p 43754023-43754044 2 1724 yes n/d n/d n/d n/d 1/1 hsa-miR-593* 43754024-43754048 2 1725 yes n/d n/d n/d 1/2 hsa-miR-767-5p 43754025-43754047 2 1726 yes 1/5 hsa-miR-3065-3p 43754027-43754049 2 1728 yes n/d n/d n/d n/d 1/1 hsa-miR-424 43754027-43754048 2 1728 yes 1/5 hsa-miR-29b 43754028-43754048 2 1729 yes yes† yes yes yes 5/5 hsa-miR-21* 43754028-43754048 2 1729 yes n/d n/d n/d 1/2 hsa-miR-222* 43754028-43754053 2 1729 yes n/d n/d n/d 1/2 hsa-miR-29a 43754029-43754048 2 1730 yes yes† yes yes 4/5 hsa-miR-29c 43754029-43754048 2 1730 yes yes† yes yes 4/5 hsa-miR-3129 43754029-43754052 2 1730 yes n/d n/d n/d n/d 1/1 hsa-miR-148a 43754030-43754052 2 1731 yes 1/5 hsa-miR-148b 43754030-43754052 2 1731 yes 1/5 hsa-miR-936 43754031-43754052 2 1732 yes 1/5 hsa-miR-101 43754032-43754053 2 1733 yes 1/5 hsa-miR-199b-3p 43754033-43754052 2 1734 yes n/d 1/4 hsa-miR-199a-3p 43754033-43754052 2 1734 yes 1/5 hsa-miR-144 43754034-43754053 2 1735 yes 1/5 hsa-miR-562 43754036-43754052 2 1737 yes 1/5 hsa-miR-651 43754041-43754063 2 1742 yes 1/5 hsa-miR-1261 43754043-43754060 2 1744 yes n/d n/d 1/3 hsa-miR-556-3p 43754044-43754065 2 1745 yes 
yes† 2/5 hsa-miR-944 43754047-43754068 2 1748 yes yes† yes 3/5 hsa-miR-1975 43754047-43754074 2 1748 yes n/d yes n/d 2/3 hsa-miR-126* 43754047-43754067 2 1748 yes n/d n/d n/d 1/2 hsa-miR-590-3p 43754047-43754068 2 1748 yes 1/5 hsa-miR-2355 43754053-43754075 2 1754 yes n/d n/d n/d n/d 1/1 hsa-miR-188-3p 43754054-43754074 2 1755 yes 1/5 hsa-miR-4286 43754058-43754074 2 1759 yes n/d n/d n/d n/d 1/1 hsa-miR-889 43754065-43754085 2 1766 yes yes† yes 3/5 hsa-miR-2053 43754066-43754088 2 1767 yes n/d yes n/d 2/3 hsa-miR-569 43754067-43754087 2 1768 yes yes† 2/5 hsa-miR-153 43754086-43754110 2 1787 yes 1/5 hsa-miR-548a-3p 43754095-43754115 2 1796 yes yes† yes 3/5 hsa-miR-548e 43754095-43754115 2 1796 yes yes† n/d n/d 2/3 hsa-miR-548x 43754096-43754116 2 1797 yes n/d n/d n/d n/d 1/1 hsa-miR-548f 43754097-43754115 2 1798 yes yes† n/d n/d 2/3 hsa-miR-516a-5p 43754101-43754121 2 1802 yes 1/5 hsa-miR-191 43754105-43754130 2 1806 yes 1/5 hsa-miR-190 43754118-43754139 2 1819 yes 1/5 hsa-miR-190b 43754119-43754139 2 1820 yes 1/5 hsa-miR-548c-3p 43754125-43754146 2 1826 yes yes† 2/5 hsa-miR-548n 43754125-43754146 2 1826 yes n/d n/d 1/3 hsa-miR-29a* 43754144-43754168 2 1845 yes n/d n/d n/d 1/2 hsa-miR-138-2* 43754145-43754166 2 1846 yes n/d n/d n/d 1/2 hsa-miR-876-5p 43754145-43754166 2 1846 yes 1/5 hsa-miR-223* 43754147-43754168 2 1848 yes n/d n/d n/d 1/2 hsa-miR-654-3p 43754151-43754172 2 1852 yes 1/5 hsa-miR-651 43754158-43754179 2 1859 yes 1/5 hsa-miR-3140 43754170-43754191 2 1871 yes n/d n/d n/d n/d 1/1 hsa-miR-586 43754170-43754193 2 1871 yes 1/5 hsa-miR-3119 43754173-43754192 2 1874 yes n/d n/d n/d n/d 1/1 hsa-miR-155 43754173-43754195 2 1874 yes 1/5 hsa-miR-105 43754174-43754196 2 1875 yes 1/5 hsa-miR-106a 43754174-43754196 2 1875 yes 1/5 hsa-miR-17 43754174-43754196 2 1875 yes 1/5 hsa-miR-20a 43754174-43754196 2 1875 yes 1/5 hsa-miR-20b 43754174-43754196 2 1875 yes 1/5 hsa-miR-93 43754174-43754196 2 1875 yes 1/5 hsa-miR-4307 43754175-43754193 2 1876 yes n/d n/d n/d n/d 1/1 
hsa-miR-526b* 43754175-43754196 2 1876 yes n/d n/d n/d 1/2 hsa-miR-519d 43754175-43754196 2 1876 yes 1/5 hsa-miR-106b 43754176-43754196 2 1877 yes 1/5 hsa-miR-3163 43754177-43754198 2 1878 yes n/d n/d n/d n/d 1/1 hsa-miR-300 43754178-43754200 2 1879 yes 1/5 hsa-miR-524-5p 43754178-43754199 2 1879 yes 1/5 hsa-let-7b* 43754180-43754201 2 1881 yes n/d n/d n/d 1/2 hsa-miR-520d-5p 43754180-43754199 2 1881 yes 1/5

MicroRNA Genome coordinates Reg. Pos. MR TS µT DB MC # Reference hsa-miR-655 43754181-43754202 2 1882 yes yes† yes 3/5 hsa-let-7a* 43754181-43754201 2 1882 yes n/d n/d n/d 1/2 hsa-miR-1283 43754182-43754200 2 1883 yes n/d n/d 1/3 hsa-miR-381 43754182-43754200 2 1883 yes 1/5 hsa-let-7f-1* 43754183-43754201 2 1884 yes n/d n/d n/d 1/2 hsa-miR-302d* 43754183-43754205 2 1884 yes n/d n/d n/d 1/2 hsa-miR-369-3p 43754183-43754203 2 1884 yes 1/5 hsa-miR-302b* 43754185-43754205 2 1886 yes n/d n/d n/d 1/2 hsa-miR-10a* 43754189-43754210 2 1890 yes n/d n/d n/d 1/2 hsa-miR-148a* 43754189-43754210 2 1890 yes n/d n/d n/d 1/2 hsa-miR-16-1* 43754193-43754217 2 1894 yes n/d n/d n/d 1/2 hsa-miR-183* 43754195-43754216 2 1896 yes n/d n/d n/d 1/2 hsa-miR-2115* 43754198-43754219 2 1899 yes n/d n/d n/d n/d 1/1 hsa-miR-383 43754199-43754220 2 1900 yes yes† 2/5 hsa-miR-361-5p 43754199-43754220 2 1900 yes 1/5

7.5 Predicted microRNA 361-5p targets

Target predictions were obtained from the following web services: microRNA.org (MR; Betel et al., 2010), TargetScan (TS; Friedman et al., 2009), DIANA-microT (µT; Maragkakis et al., 2009), miRDB (DB; Wang, 2008), and MicroCosm (MC; Griffiths-Jones et al., 2008). Results were pooled and converted to uniform gene identifiers (Entrez) using the DAVID web service (Huang et al., 2008). For each gene, the algorithms predicting it and the total number of such algorithms (#) are indicated. Genes acting in the VEGF pathway according to KEGG (Kanehisa et al., 2010) are marked in the KEGG column.
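The pooling and counting step described above can be illustrated with a minimal sketch. It is not the procedure used to produce the table; it assumes that each algorithm's predictions are already available as a set of Entrez identifiers. The function name `pool_predictions`, the `symbols` lookup, and the toy input (including the placeholder genes GENE_A and GENE_B) are hypothetical and serve illustration only.

```python
from collections import defaultdict

# Column order of the table; abbreviations follow the legend above.
ALGORITHMS = ["MR", "TS", "µT", "DB", "MC"]


def pool_predictions(predictions, symbols):
    """Pool per-algorithm target sets into one row per gene.

    predictions: dict mapping an algorithm abbreviation to a set of Entrez IDs
    symbols:     dict mapping an Entrez ID to a gene symbol (hypothetical
                 lookup standing in for the identifier conversion step)
    """
    hits = defaultdict(set)
    for algorithm, genes in predictions.items():
        for entrez_id in genes:
            hits[entrez_id].add(algorithm)

    rows = []
    for entrez_id, algorithms in hits.items():
        rows.append({
            "symbol": symbols.get(entrez_id, str(entrez_id)),
            "entrez": entrez_id,
            # Keep the algorithms in table column order.
            "algorithms": [a for a in ALGORITHMS if a in algorithms],
            "count": len(algorithms),  # the "#" column
        })
    # Genes predicted by the most algorithms first, then alphabetically.
    rows.sort(key=lambda row: (-row["count"], row["symbol"]))
    return rows


# Toy input: apart from VEGFA (Entrez 7422), genes and assignments are made up.
toy_predictions = {
    "MR": {7422, 1},
    "TS": {7422},
    "µT": {7422, 2},
    "DB": {7422, 1},
    "MC": {7422},
}
toy_symbols = {7422: "VEGFA", 1: "GENE_A", 2: "GENE_B"}
rows = pool_predictions(toy_predictions, toy_symbols)
```

Sorting by descending count and then by symbol mirrors the ordering of the table below, in which genes predicted by all five algorithms come first.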

Symbol Gene name Entrez MR TS µT DB MC # KEGG BCORL1 BCL6 co-repressor-like 1 63035 yes yes yes yes yes 5 POLR3G polymerase (RNA) III (DNA directed) polypeptide G 10622 yes yes yes yes yes 5 RANBP17 RAN(32kD) binding protein 17 64901 yes yes yes yes yes 5 RRAGB Ras-related GTP binding B 10325 yes yes yes yes yes 5 VEGFA vascular endothelial growth factor A 7422 yes yes yes yes yes 5 yes ARGLU1 arginine and glutamate rich 1 55082 yes yes yes yes 4 ARMC8 armadillo repeat containing 8 25852 yes yes yes yes 4 ATPAF1 ATP synthase mitochondrial F1 complex assembly 64756 yes yes yes yes 4 C18ORF34 chromosomefactor 1 18 open reading frame 34 374864 yes yes yes yes 4 C9ORF150 open reading frame 150 286343 yes yes yes yes 4 CAMK2D calcium/calmodulin-dependent protein kinase II delta 817 yes yes yes yes 4 CPOX coproporphyrinogen oxidase 1371 yes yes yes yes 4 CSMD1 CUB and Sushi multiple domains 1 64478 yes yes yes yes 4 CSTF1 cleavage stimulation factor, 3' pre-RNA, subunit 1, 1477 yes yes yes yes 4 DCTN6 dynactin50kDa 6 10671 yes yes yes yes 4 ELOVL7 ELOVL family member 7, elongation of long chain 79993 yes yes yes yes 4 FAM122C familyfatty acids with (yeast)sequence similarity 122C 159091 yes yes yes yes 4 GLRB glycine receptor, beta 2743 yes yes yes yes 4 GNB4 guanine nucleotide binding protein (G protein), beta 59345 yes yes yes yes 4 GOLGA4 golgipolypeptide autoantigen, 4 golgin subfamily a, 4 2803 yes yes yes yes 4 MAP2 microtubule-associated protein 2 4133 yes yes yes yes 4 MTRR 5-methyltetrahydrofolate-homocysteine 4552 yes yes yes yes 4 NFE2 nuclearmethyltransferase factor (erythroid-derived reductase 2), 45kDa 4778 yes yes yes yes 4 OGN osteoglycin 4969 yes yes yes yes 4 PRICKLE2 prickle homolog 2 (Drosophila) 166336 yes yes yes yes 4 RBM16 RNA binding motif protein 16 22828 yes yes yes yes 4 SEC31A SEC31 homolog A (S. 
cerevisiae) 22872 yes yes yes yes 4 SH3BGRL2 SH3 domain binding glutamic acid-rich protein like 2 83699 yes yes yes yes 4 TFAP2B transcription factor AP-2 beta (activating enhancer 7021 yes yes yes yes 4 TSR1 TSR1,binding 20S protein rRNA 2 beta)accumulation, homolog (S. 55720 yes yes yes yes 4 WNT7A wingless-typecerevisiae) MMTV integration site family, member 7476 yes yes yes yes 4 ABCA10 ATP-binding7A cassette, sub-family A (ABC1), member 10349 yes yes yes 3 ABHD3 abhydrolase10 domain containing 3 171586 yes yes yes 3 ACADSB acyl-Coenzyme A dehydrogenase, short/branched 36 yes yes yes 3 ADAMTS5 chain ADAM metallopeptidase with thrombospondin type 1 11096 yes yes yes 3 ADCY2 adenylatemotif, 5 cyclase 2 (brain) 108 yes yes yes 3 ANKRD49 ankyrin repeat domain 49 54851 yes yes yes 3 ARCN1 archain 1 372 yes yes yes 3 BCAS1 breast carcinoma amplified sequence 1 8537 yes yes yes 3 BMPR2 bone morphogenetic protein receptor, type II 659 yes yes yes 3 C17ORF39 chromosome(serine/threonine 17 open kinase) reading frame 39 79018 yes yes yes 3 C1ORF77 chromosome 1 open reading frame 77 26097 yes yes yes 3 C2ORF78 chromosome 2 open reading frame 78 388960 yes yes yes 3 C3ORF17 chromosome 3 open reading frame 17 25871 yes yes yes 3 C5ORF51 chromosome 5 open reading frame 51 285636 yes yes yes 3 C6ORF58 chromosome 6 open reading frame 58 352999 yes yes yes 3 C7ORF23 chromosome 7 open reading frame 23 79161 yes yes yes 3 CAPN6 calpain 6 827 yes yes yes 3 CBLN1 cerebellin 1 precursor 869 yes yes yes 3

Symbol Gene name Entrez MR TS µT DB MC # KEGG CCDC58 coiled-coil domain containing 58 131076 yes yes yes 3 CCNYL1 cyclin Y-like 1 151195 yes yes yes 3 CCT4 chaperonin containing TCP1, subunit 4 (delta) 10575 yes yes yes 3 CD177 CD177 molecule 57126 yes yes yes 3 CKS1B CDC28 protein kinase regulatory subunit 1B 1163 yes yes yes 3 CLDN22 claudin 22 53842 yes yes yes 3 CMTM4 CKLF-like MARVEL transmembrane domain 146223 yes yes yes 3 COQ2 containing coenzyme Q24 homolog, prenyltransferase (yeast) 27235 yes yes yes 3 COX7A2 cytochrome c oxidase subunit VIIa polypeptide 2 1347 yes yes yes 3 CRY1 cryptochrome(liver) 1 (photolyase-like) 1407 yes yes yes 3 CYHR1 cysteine/histidine-rich 1 50626 yes yes yes 3 DDX3X DEAD (Asp-Glu-Ala-Asp) box polypeptide 3, X-linked 1654 yes yes yes 3 DEXI dexamethasone-induced transcript 28955 yes yes yes 3 DPP10 dipeptidyl-peptidase 10 57628 yes yes yes 3 DYNC1LI2 dynein, cytoplasmic 1, light intermediate chain 2 1783 yes yes yes 3 EHF ets homologous factor 26298 yes yes yes 3 EIF3A eukaryotic translation initiation factor 3, subunit A 8661 yes yes yes 3 ELL3 elongation factor RNA polymerase II-like 3 80237 yes yes yes 3 ELMOD1 ELMO/CED-12 domain containing 1 55531 yes yes yes 3 ERG v-ets erythroblastosis virus E26 oncogene homolog 2078 yes yes yes 3 FYTTD1 forty-two-three(avian) domain containing 1 84248 yes yes yes 3 G6PC2 glucose-6-phosphatase, catalytic, 2 57818 yes yes yes 3 GOT1 glutamic-oxaloacetic transaminase 1, soluble 2805 yes yes yes 3 GPR155 G(aspartate protein-coupled aminotransferase receptor 155 1) 151556 yes yes yes 3 GPRIN3 GPRIN family member 3 285513 yes yes yes 3 GTPBP10 GTP-binding protein 10 (putative) 85865 yes yes yes 3 HACE1 HECT domain and ankyrin repeat containing, E3 57531 yes yes yes 3 HBS1L HBS1-likeubiquitin protein (S. 
cerevisiae) ligase 1 10767 yes yes yes 3 HMMR hyaluronan-mediated motility receptor (RHAMM) 3161 yes yes yes 3 INTS8 integrator complex subunit 8 55656 yes yes yes 3 IQCH IQ motif containing H 64799 yes yes yes 3 KIAA1524 KIAA1524 57650 yes yes yes 3 LAD1 ladinin 1 3898 yes yes yes 3 MCPH1 microcephalin 1 79648 yes yes yes 3 MECP2 methyl CpG binding protein 2 (Rett syndrome) 4204 yes yes yes 3 MET met proto-oncogene (hepatocyte growth factor 4233 yes yes yes 3 MRPL19 mitochondrialreceptor) ribosomal protein L19 9801 yes yes yes 3 MTUS1 mitochondrial tumor suppressor 1 57509 yes yes yes 3 MTX2 metaxin 2 10651 yes yes yes 3 NAALAD2 N-acetylated alpha-linked acidic dipeptidase 2 10003 yes yes yes 3 NEBL nebulette 10529 yes yes yes 3 NPVF neuropeptide VF precursor 64111 yes yes yes 3 PARP11 poly (ADP-ribose) polymerase family, member 11 57097 yes yes yes 3 PCDH11X protocadherin 11 X-linked 27328 yes yes yes 3 PDE4B phosphodiesterase 4B, cAMP-specific 5142 yes yes yes 3 PF4 platelet(phosphodiesterase factor 4 E4 dunce homolog, Drosophila) 5196 yes yes yes 3 PGBD1 piggyBac transposable element derived 1 84547 yes yes yes 3 PIGA phosphatidylinositol glycan anchor biosynthesis, class 5277 yes yes yes 3 PRKCB proteinA kinase C, beta 5579 yes yes yes 3 yes PRPF38A PRP38 pre-mRNA processing factor 38 (yeast) 84950 yes yes yes 3 RAB28 RAB28,domain containingmember RAS A oncogene family 9364 yes yes yes 3 RAB3GAP1 RAB3 GTPase activating protein subunit 1 (catalytic) 22930 yes yes yes 3 RAC1 ras-related C3 botulinum toxin substrate 1 (rho family, 5879 yes yes yes 3 yes RCHY1 ringsmall finger GTP andbinding CHY protein zinc finger Rac1) domain containing 1 25898 yes yes yes 3 RDH11 retinol dehydrogenase 11 (all-trans/9-cis/11-cis) 51109 yes yes yes 3 RPGR retinitis pigmentosa GTPase regulator 6103 yes yes yes 3 SERP1 stress-associated endoplasmic reticulum protein 1 27230 yes yes yes 3 SLC10A1 solute carrier family 10 (sodium/bile acid 6554 yes yes yes 3 SLC1A3 
solutecotransporter carrier familyfamily), 1 member(glial high 1 affinity glutamate 6507 yes yes yes 3 SLC22A10 solutetransporter), carrier member family 22, 3 member 10 387775 yes yes yes 3 SMG1 SMG1 homolog, phosphatidylinositol 3-kinase-related 23049 yes yes yes 3 SNCAIP synuclein,kinase (C. alphaelegans) interacting protein 9627 yes yes yes 3 SP1 Sp1 transcription factor 6667 yes yes yes 3 STEAP2 six transmembrane epithelial antigen of the prostate 2 261729 yes yes yes 3 STX7 syntaxin 7 8417 yes yes yes 3 TAF5L TAF5-like RNA polymerase II, p300/CBP-associated 27097 yes yes yes 3 TC2N tandemfactor (PCAF) C2 domains,-associated nuclear factor, 65kDa 123036 yes yes yes 3 TFPI tissue factor pathway inhibitor (lipoprotein-associated 7035 yes yes yes 3 TPD52L3 tumorcoagulation protein inhibitor) D52-like 3 89882 yes yes yes 3 TRPC5 transient receptor potential cation channel, subfamily 7224 yes yes yes 3 TSC1 tuberousC, member sclerosis 5 1 7248 yes yes yes 3 TYW3 tRNA-yW synthesizing protein 3 homolog (S. 127253 yes yes yes 3 UBE2K cerevisiae) ubiquitin-conjugating enzyme E2K (UBC1 homolog, 3093 yes yes yes 3 VBP1 vonyeast) Hippel-Lindau binding protein 1 7411 yes yes yes 3 WNT3 wingless-type MMTV integration site family, member 7473 yes yes yes 3 XRCC4 X-ray3 repair complementing defective repair in 7518 yes yes yes 3 YTHDF1 YTHChinese domain hamster family, cells member 4 1 54915 yes yes yes 3 ZBTB40 zinc finger and BTB domain containing 40 9923 yes yes yes 3 ZDHHC13 zinc finger, DHHC-type containing 13 54503 yes yes yes 3 ZFAND1 zinc finger, AN1-type domain 1 79752 yes yes yes 3 ZNF148 zinc finger protein 148 7707 yes yes yes 3 ZNF238 zinc finger protein 238 10472 yes yes yes 3

Symbol Gene name Entrez MR TS µT DB MC # KEGG ZNF804A zinc finger protein 804A 91752 yes yes yes 3 AAK1 AP2 associated kinase 1 22848 yes yes 2 ABHD2 abhydrolase domain containing 2 11057 yes yes 2 ABL1 c-abl oncogene 1, receptor tyrosine kinase 25 yes yes 2 ACTG1 actin, gamma 1 71 yes yes 2 ADC arginine decarboxylase 113451 yes yes 2 ADD3 adducin 3 (gamma) 120 yes yes 2 ADHFE1 alcohol dehydrogenase, iron containing, 1 137872 yes yes 2 AEBP2 AE binding protein 2 121536 yes yes 2 AFF4 AF4/FMR2 family, member 4 27125 yes yes 2 AGBL5 ATP/GTP binding protein-like 5 60509 yes yes 2 AGPS alkylglycerone phosphate synthase 8540 yes yes 2 AHCYL2 adenosylhomocysteinase-like 2 23382 yes yes 2 AICDA activation-induced cytidine deaminase 57379 yes yes 2 AK3 adenylate kinase 3 50808 yes yes 2 AK5 adenylate kinase 5 26289 yes yes 2 AKAP5 A kinase (PRKA) anchor protein 5 9495 yes yes 2 AKNAD1 chromosome 1 open reading frame 62 254268 yes yes 2 ALDH6A1 aldehyde dehydrogenase 6 family, member A1 4329 yes yes 2 ALS2CR8 amyotrophic lateral sclerosis 2 (juvenile) chromosome 79800 yes yes 2 ANKLE2 ankyrinregion, candidaterepeat and 8 LEM domain containing 2 23141 yes yes 2 ANKRD33 ankyrin repeat domain 33 341405 yes yes 2 ANKRD46 ankyrin repeat domain 46 157567 yes yes 2 ANKRD57 ankyrin repeat domain 57 65124 yes yes 2 ANO6 anoctamin 6 196527 yes yes 2 AP1S3 adaptor-related protein complex 1, sigma 3 subunit 130340 yes yes 2 APOL6 apolipoprotein L, 6 80830 yes yes 2 ARF4 ADP-ribosylation factor 4 378 yes yes 2 ARPC5L actin related protein 2/3 complex, subunit 5-like 81873 yes yes 2 ARRDC3 arrestin domain containing 3 57561 yes yes 2 ART1 ADP-ribosyltransferase 1 417 yes yes 2 ASCC3 activating signal cointegrator 1 complex subunit 3 10973 yes yes 2 ASTN1 astrotactin 1 460 yes yes 2 ATAD1 ATPase family, AAA domain containing 1 84896 yes yes 2 ATP13A3 ATPase type 13A3 79572 yes yes 2 ATP1B4 ATPase, (Na+)/K+ transporting, beta 4 polypeptide 23439 yes yes 2 ATP6V0A2 ATPase, H+ transporting, 
lysosomal V0 subunit a2 23545 yes yes 2 ATP6V1E1 ATPase, H+ transporting, lysosomal 31kDa, V1 529 yes yes 2 ATP6V1E2 ATPase,subunit E1 H+ transporting, lysosomal 31kDa, V1 90423 yes yes 2 B9D1 B9subunit protein E2 domain 1 27077 yes yes 2 BCCIP BRCA2 and CDKN1A interacting protein 56647 yes yes 2 BCDIN3D BCDIN3 domain containing 144233 yes yes 2 BDNF brain-derived neurotrophic factor 627 yes yes 2 BHMT betaine-homocysteine methyltransferase 635 yes yes 2 BLMH bleomycin hydrolase 642 yes yes 2 BMP2K BMP2 inducible kinase 55589 yes yes 2 BOLL bol, boule-like (Drosophila) 66037 yes yes 2 BRCC3 BRCA1/BRCA2-containing complex, subunit 3 79184 yes yes 2 BRIP1 BRCA1 interacting protein C-terminal helicase 1 83990 yes yes 2 C10ORF122 chromosome 10 open reading frame 122 387718 yes yes 2 C10ORF72 chromosome 10 open reading frame 72 196740 yes yes 2 C10ORF84 chromosome 10 open reading frame 84 63877 yes yes 2 C11ORF41 chromosome 11 open reading frame 41 25758 yes yes 2 C13ORF23 chromosome 13 open reading frame 23 80209 yes yes 2 C13ORF31 chromosome 13 open reading frame 31 144811 yes yes 2 C16ORF46 chromosome 16 open reading frame 46 123775 yes yes 2 C17ORF78 open reading frame 78 284099 yes yes 2 C19ORF55 open reading frame 55 148137 yes yes 2 C1ORF65 chromosome 1 open reading frame 65 164127 yes yes 2 C1ORF94 chromosome 1 open reading frame 94 84970 yes yes 2 C20ORF112 chromosome 20 open reading frame 112 140688 yes yes 2 C20ORF194 chromosome 20 open reading frame 194 25943 yes yes 2 C22ORF25 chromosome 22 open reading frame 25 128989 yes yes 2 C2ORF3 chromosome 2 open reading frame 3 6936 yes yes 2 C2ORF67 chromosome 2 open reading frame 67 151050 yes yes 2 C3ORF18 chromosome 3 open reading frame 18 51161 yes yes 2 C3ORF24 chromosome 3 open reading frame 24 115795 yes yes 2 C3ORF37 chromosome 3 open reading frame 37 56941 yes yes 2 C3ORF55 chromosome 3 open reading frame 55 152078 yes yes 2 C3ORF59 chromosome 3 open reading frame 59 151963 yes yes 2 C5ORF22 chromosome 5 
open reading frame 22 55322 yes yes 2 C5ORF33 chromosome 5 open reading frame 33 133686 yes yes 2 C6ORF153 chromosome 6 open reading frame 153 88745 yes yes 2 C7ORF53 chromosome 7 open reading frame 53 286006 yes yes 2 C8ORF76 chromosome 8 open reading frame 76 84933 yes yes 2 CAB39L calcium binding protein 39-like 81617 yes yes 2 CACNG1 calcium channel, voltage-dependent, gamma subunit 786 yes yes 2 CALCRL calcitonin1 receptor-like 10203 yes yes 2 CALHM2 calcium homeostasis modulator 2 51063 yes yes 2 CARD8 caspase recruitment domain family, member 8 22900 yes yes 2 CBLB Cas-Br-M (murine) ecotropic retroviral transforming 868 yes yes 2 CC2D2B coiled-coilsequence band C2 domain containing 2B 387707 yes yes 2

Symbol Gene name Entrez MR TS µT DB MC # KEGG CCDC152 coiled-coil domain containing 152 100129 yes yes 2 CCDC59 coiled-coil domain containing 59 29080792 yes yes 2 CCDC90B coiled-coil domain containing 90B 60492 yes yes 2 CCDC91 coiled-coil domain containing 91 55297 yes yes 2 CCNDBP1 cyclin D-type binding-protein 1 23582 yes yes 2 CDADC1 cytidine and dCMP deaminase domain containing 1 81602 yes yes 2 CDC123 cell division cycle 123 homolog (S. cerevisiae) 8872 yes yes 2 CDC14A CDC14 cell division cycle 14 homolog A (S. 8556 yes yes 2 CDHR3 hypotheticalcerevisiae) protein FLJ23834 222256 yes yes 2 CDK7 cyclin-dependent kinase 7 1022 yes yes 2 CENPC1 centromere protein C 1 1060 yes yes 2 CEP78 centrosomal protein 78kDa 84131 yes yes 2 CFHR3 complement factor H-related 3 10878 yes yes 2 CFHR4 complement factor H-related 4 10877 yes yes 2 CHIC2 cysteine-rich hydrophobic domain 2 26511 yes yes 2 CLDN8 claudin 8 9073 yes yes 2 CLEC1A C-type lectin domain family 1, member A 51267 yes yes 2 CLEC2D C-type lectin domain family 2, member D 29121 yes yes 2 CLEC3A C-type lectin domain family 3, member A 10143 yes yes 2 CLIC2 chloride intracellular channel 2 1193 yes yes 2 CLPX ClpX caseinolytic peptidase X homolog (E. 
coli) 10845 yes yes 2 CMTM8 CKLF-like MARVEL transmembrane domain 152189 yes yes 2 CNGB3 containing cyclic nucleotide 8 gated channel beta 3 54714 yes yes 2 CNOT8 CCR4-NOT transcription complex, subunit 8 9337 yes yes 2 CNTN3 contactin 3 (plasmacytoma associated) 5067 yes yes 2 COCH coagulation factor C homolog, cochlin (Limulus 1690 yes yes 2 COL14A1 collagen,polyphemus) type XIV, alpha 1 7373 yes yes 2 COL4A4 collagen, type IV, alpha 4 1286 yes yes 2 COMMD8 COMM domain containing 8 54951 yes yes 2 CPT1A carnitine palmitoyltransferase 1A (liver) 1374 yes yes 2 CR1 complement component (3b/4b) receptor 1 (Knops 1378 yes yes 2 CRB1 crumbsblood group) homolog 1 (Drosophila) 23418 yes yes 2 CRBN cereblon 51185 yes yes 2 CREBBP CREB binding protein 1387 yes yes 2 CREG1 cellular repressor of E1A-stimulated genes 1 8804 yes yes 2 CRISPLD1 cysteine-rich secretory protein LCCL domain 83690 yes yes 2 CSTA cystatincontaining A (stefin1 A) 1475 yes yes 2 CTNND1 catenin (cadherin-associated protein), delta 1 1500 yes yes 2 CTNND2 catenin (cadherin-associated protein), delta 2 (neural 1501 yes yes 2 CTPS2 CTPplakophilin synthase-related II arm -repeat protein) 56474 yes yes 2 CYB5D1 cytochrome b5 domain containing 1 124637 yes yes 2 CYP2U1 cytochrome P450, family 2, subfamily U, polypeptide 113612 yes yes 2 CYP3A4 1 cytochrome P450, family 3, subfamily A, polypeptide 1576 yes yes 2 CYP3A7 cytochrome4 P450, family 3, subfamily A, polypeptide 1551 yes yes 2 CYP7B1 cytochrome7 P450, family 7, subfamily B, polypeptide 9420 yes yes 2 CYYR1 cysteine/tyrosine-rich1 1 116159 yes yes 2 DDX20 DEAD (Asp-Glu-Ala-Asp) box polypeptide 20 11218 yes yes 2 DDX52 DEAD (Asp-Glu-Ala-Asp) box polypeptide 52 11056 yes yes 2 DEM1 defects in morphology 1 homolog (S. cerevisiae) 64789 yes yes 2 DPY19L4 dpy-19-like 4 (C. elegans) 286148 yes yes 2 DSEL dermatan sulfate epimerase-like 92126 yes yes 2 DUS4L dihydrouridine synthase 4-like (S. 
cerevisiae) 11062 yes yes 2 DUSP6 dual specificity phosphatase 6 1848 yes yes 2 EDIL3 EGF-like repeats and discoidin I-like domains 3 10085 yes yes 2 EFHA2 EF-hand domain family, member A2 286097 yes yes 2 EFHC2 EF-hand domain (C-terminal) containing 2 80258 yes yes 2 EIF1AX eukaryotic translation initiation factor 1A, X-linked 1964 yes yes 2 EIF5B eukaryotic translation initiation factor 5B 9669 yes yes 2 ENDOD1 endonuclease domain containing 1 23052 yes yes 2 ENPP4 ectonucleotide pyrophosphatase/phosphodiesterase 4 22875 yes yes 2 EPB41 erythrocyte(putative function) membrane protein band 4.1 (elliptocytosis 2035 yes yes 2 EPS8L3 EPS8-like1, RH -linked) 3 79574 yes yes 2 ERC2 ELKS/RAB6-interacting/CAST family member 2 26059 yes yes 2 ERI2 exoribonuclease 2 112479 yes yes 2 ERMN ermin, ERM-like protein 57471 yes yes 2 EXO1 exonuclease 1 9156 yes yes 2 EYS eyes shut homolog (Drosophila) 346007 yes yes 2 FADS2 fatty acid desaturase 2 9415 yes yes 2 FAM13C family with sequence similarity 13, member C 220965 yes yes 2 FAM162B family with sequence similarity 162, member B 221303 yes yes 2 FAM188A chromosome 10 open reading frame 97 80013 yes yes 2 FAM26D family with sequence similarity 26, member D 221301 yes yes 2 FAM36A family with sequence similarity 36, member A 116228 yes yes 2 FAM63B family with sequence similarity 63, member B 54629 yes yes 2 FAM96A family with sequence similarity 96, member A 84191 yes yes 2 FAS Fas (TNF receptor superfamily, member 6) 355 yes yes 2 FGF1 fibroblast growth factor 1 (acidic) 2246 yes yes 2 FGF5 fibroblast growth factor 5 2250 yes yes 2 FKBP14 FK506 binding protein 14, 22 kDa 55033 yes yes 2 FKBP15 FK506 binding protein 15, 133kDa 23307 yes yes 2 FKBP4 FK506 binding protein 4, 59kDa 2288 yes yes 2 FKBP9L FK506 binding protein 9-like 360132 yes yes 2

Symbol Gene name Entrez MR TS µT DB MC # KEGG FLRT2 fibronectin leucine rich transmembrane protein 2 23768 yes yes 2 FMO5 flavin containing monooxygenase 5 2330 yes yes 2 FNBP1 formin binding protein 1 23048 yes yes 2 FOLR1 folate receptor 1 (adult) 2348 yes yes 2 FOXM1 forkhead box M1 2305 yes yes 2 FOXN2 forkhead box N2 3344 yes yes 2 FRYL FRY-like 285527 yes yes 2 FUCA1 fucosidase, alpha-L- 1, tissue 2517 yes yes 2 FUNDC2 FUN14 domain containing 2 65991 yes yes 2 FUT5 fucosyltransferase 5 (alpha (1,3) fucosyltransferase) 2527 yes yes 2 FUT9 fucosyltransferase 9 (alpha (1,3) fucosyltransferase) 10690 yes yes 2 FZD3 frizzled homolog 3 (Drosophila) 7976 yes yes 2 GAN gigaxonin 8139 yes yes 2 GAPT GRB2-binding adaptor protein, transmembrane 202309 yes yes 2 GARNL3 GTPase activating Rap/RanGAP domain-like 3 84253 yes yes 2 GBE1 glucan (1,4-alpha-), branching enzyme 1 2632 yes yes 2 GCET2 germinal center expressed transcript 2 257144 yes yes 2 GCNT1 glucosaminyl (N-acetyl) transferase 1, core 2 (beta- 2650 yes yes 2 GEN1 Gen1,6 -N homolog-acetylglucosaminyltransferase) 1, endonuclease (Drosophila) 348654 yes yes 2 GIGYF2 GRB10 interacting GYF protein 2 26058 yes yes 2 GLI1 GLI family zinc finger 1 2735 yes yes 2 GLIPR1 GLI pathogenesis-related 1 11010 yes yes 2 GLOD4 glyoxalase domain containing 4 51031 yes yes 2 GLRX glutaredoxin (thioltransferase) 2745 yes yes 2 GNPAT glyceronephosphate O-acyltransferase 8443 yes yes 2 GOLGA8E golgi autoantigen, golgin subfamily a, 8E 390535 yes yes 2 GPLD1 glycosylphosphatidylinositol specific phospholipase 2822 yes yes 2 GPR85 GD1 protein-coupled receptor 85 54329 yes yes 2 GRHPR glyoxylate reductase/hydroxypyruvate reductase 9380 yes yes 2 GRIP1 glutamate receptor interacting protein 1 23426 yes yes 2 GSG1 germ cell associated 1 83445 yes yes 2 GSG2 germ cell associated 2 (haspin) 83903 yes yes 2 GSR glutathione reductase 2936 yes yes 2 GSTA2 glutathione S-transferase alpha 2 2939 yes yes 2 GTF2E1 general transcription factor 
IIE, polypeptide 1, alpha 2960 yes yes 2 GUF1 GUF156kDa GTPase homolog (S. cerevisiae) 60558 yes yes 2 GYG1 glycogenin 1 2992 yes yes 2 HAT1 histone acetyltransferase 1 8520 yes yes 2 HDAC9 histone deacetylase 9 9734 yes yes 2 HECTD2 HECT domain containing 2 143279 yes yes 2 HERC4 hect domain and RLD 4 26091 yes yes 2 HFE hemochromatosis 3077 yes yes 2 HMGB2 high-mobility group box 2 3148 yes yes 2 HMGN4 high mobility group nucleosomal binding domain 4 10473 yes yes 2 HNRNPA2B heterogeneous nuclear ribonucleoprotein A2/B1 3181 yes yes 2 HOMER11 homer homolog 1 (Drosophila) 9456 yes yes 2 HOOK1 hook homolog 1 (Drosophila) 51361 yes yes 2 HS2ST1 heparan sulfate 2-O-sulfotransferase 1 9653 yes yes 2 ICA1L islet cell autoantigen 1,69kDa-like 130026 yes yes 2 IFT80 intraflagellar transport 80 homolog (Chlamydomonas) 57560 yes yes 2 IL13RA1 interleukin 13 receptor, alpha 1 3597 yes yes 2 INO80C INO80 complex subunit C 125476 yes yes 2 ITGAV integrin, alpha V (vitronectin receptor, alpha 3685 yes yes 2 JKAMP chromosomepolypeptide, antigen 14 open CD51) reading frame 100 51528 yes yes 2 KCMF1 potassium channel modulatory factor 1 56888 yes yes 2 KCNA2 potassium voltage-gated channel, shaker-related 3737 yes yes 2 KCNA5 potassiumsubfamily, membervoltage-gated 2 channel, shaker-related 3741 yes yes 2 KCNJ3 potassiumsubfamily, memberinwardly-rectifying 5 channel, subfamily J, 3760 yes yes 2 KCTD6 potassiummember 3 channel tetramerisation domain containing 200845 yes yes 2 KDELC2 6 KDEL (Lys-Asp-Glu-Leu) containing 2 143888 yes yes 2 KDELR1 KDEL (Lys-Asp-Glu-Leu) endoplasmic reticulum 10945 yes yes 2 KIAA1383 KIAA1383protein retention receptor 1 54627 yes yes 2 KIAA1409 KIAA1409 57578 yes yes 2 KLHDC5 kelch domain containing 5 57542 yes yes 2 KLHL7 kelch-like 7 (Drosophila) 55975 yes yes 2 KLRG1 killer cell lectin-like receptor subfamily G, member 1 10219 yes yes 2 KPNB1 karyopherin (importin) beta 1 3837 yes yes 2 KRT32 keratin 32 3882 yes yes 2 KRTAP1-5 keratin associated 
protein 1-5 83895 yes yes 2 LACTB lactamase, beta 114294 yes yes 2 LCOR ligand dependent nuclear receptor corepressor 84458 yes yes 2 LGR4 leucine-rich repeat-containing G protein-coupled 55366 yes yes 2 LGSN lengsin,receptor lens4 protein with glutamine synthetase 51557 yes yes 2 LIX1 Lix1domain homolog (chicken) 167410 yes yes 2 LMOD3 leiomodin 3 (fetal) 56203 yes yes 2 LMX1A LIM homeobox transcription factor 1, alpha 4009 yes yes 2 LNX2 ligand of numb-protein X 2 222484 yes yes 2 LPCAT1 lysophosphatidylcholine acyltransferase 1 79888 yes yes 2 LPPR5 phosphatidic acid phosphatase type 2 163404 yes yes 2 LRRC27 leucine rich repeat containing 27 80313 yes yes 2 LRRIQ3 leucine-rich repeats and IQ motif containing 3 127255 yes yes 2 LUC7L LUC7-like (S. cerevisiae) 55692 yes yes 2

Symbol Gene name Entrez MR TS µT DB MC # KEGG MAGEE1 melanoma antigen family E, 1 57692 yes yes 2 MAK16 MAK16 homolog (S. cerevisiae) 84549 yes yes 2 MCHR2 melanin-concentrating hormone receptor 2 84539 yes yes 2 MCL1 myeloid cell leukemia sequence 1 (BCL2-related) 4170 yes yes 2 MEAF6 chromosome 1 open reading frame 149 64769 yes yes 2 MEOX2 mesenchyme homeobox 2 4223 yes yes 2 MERTK c-mer proto-oncogene tyrosine kinase 10461 yes yes 2 MEST mesoderm specific transcript homolog (mouse) 4232 yes yes 2 MEX3C mex-3 homolog C (C. elegans) 51320 yes yes 2 MFAP3 microfibrillar-associated protein 3 4238 yes yes 2 MGAT4A mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isozyme A 11320 yes yes 2 MID1 midline 1 (Opitz/BBB syndrome) 4281 yes yes 2 MIF4GD MIF4G domain containing 57409 yes yes 2 MLF1 myeloid leukemia factor 1 4291 yes yes 2 MLH1 mutL homolog 1, colon cancer, nonpolyposis type 2 (E. coli) 4292 yes yes 2 MLLT6 myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 6 4302 yes yes 2 MMD monocyte to macrophage differentiation-associated 23531 yes yes 2 MMP16 matrix metallopeptidase 16 (membrane-inserted) 4325 yes yes 2 MMP21 matrix metallopeptidase 21 118856 yes yes 2 MRPL13 mitochondrial ribosomal protein L13 28998 yes yes 2 MRS2 MRS2 magnesium homeostasis factor homolog (S. cerevisiae) 57380 yes yes 2
MTMR10 myotubularin related protein 10 54893 yes yes 2 MTMR4 myotubularin related protein 4 9110 yes yes 2 MTX3 metaxin 3 345778 yes yes 2 MYCBP c-myc binding protein 26292 yes yes 2 MYH9 myosin, heavy chain 9, non-muscle 4627 yes yes 2 MYO16 myosin XVI 23026 yes yes 2 MYT1L myelin transcription factor 1-like 23040 yes yes 2 NAALADL2 N-acetylated alpha-linked acidic dipeptidase-like 2 254827 yes yes 2 NANP N-acetylneuraminic acid phosphatase 140838 yes yes 2 NDRG3 NDRG family member 3 57446 yes yes 2 NEDD4 neural precursor cell expressed, developmentally down-regulated 4 4734 yes yes 2 NFATC3 nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 3 4775 yes yes 2 yes NHS Nance-Horan syndrome (congenital cataracts and dental anomalies) 4810 yes yes 2 NOL8 nucleolar protein 8 55035 yes yes 2 NR4A2 nuclear receptor subfamily 4, group A, member 2 4929 yes yes 2 NUP153 nucleoporin 153kDa 9972 yes yes 2 NUP62CL nucleoporin 62kDa C-terminal like 54830 yes yes 2 NXPH2 neurexophilin 2 11249 yes yes 2 NXT1 NTF2-like export factor 1 29107 yes yes 2 ODF2L outer dense fiber of sperm tails 2-like 57489 yes yes 2 OGFR opioid growth factor receptor 11054 yes yes 2 ONECUT2 one cut homeobox 2 9480 yes yes 2 OPTN optineurin 10133 yes yes 2 OR51E1 olfactory receptor, family 51, subfamily E, member 1 143503 yes yes 2 OSBPL11 oxysterol binding protein-like 11 114885 yes yes 2 P4HA1 prolyl 4-hydroxylase, alpha polypeptide I 5033 yes yes 2 P4HA2 prolyl 4-hydroxylase, alpha polypeptide II 8974 yes yes 2 PANX2 pannexin 2 56666 yes yes 2 PCDH15 protocadherin 15 65217 yes yes 2 PCDH9 protocadherin 9 5101 yes yes 2 PCDHB4 protocadherin beta 4 56131 yes yes 2 PCGF5 polycomb group ring finger 5 84333 yes yes 2 PCLO piccolo (presynaptic cytomatrix protein) 27445 yes yes 2 PCMT1 protein-L-isoaspartate (D-aspartate) O-methyltransferase 5110 yes yes 2 PCYOX1 prenylcysteine oxidase 1 51449 yes yes 2 PDE10A phosphodiesterase 10A 10846 yes yes 2 PDE3A 
phosphodiesterase 3A, cGMP-inhibited 5139 yes yes 2 PDE4D phosphodiesterase 4D, cAMP-specific (phosphodiesterase E3 dunce homolog, Drosophila) 5144 yes yes 2 PDIK1L PDLIM1 interacting kinase 1 like 149420 yes yes 2 PDK4 pyruvate dehydrogenase kinase, isozyme 4 5166 yes yes 2 PEPD peptidase D 5184 yes yes 2 PER3 period homolog 3 (Drosophila) 8863 yes yes 2 PERP PERP, TP53 apoptosis effector 64065 yes yes 2 PGAP1 post-GPI attachment to proteins 1 80055 yes yes 2 PGBD2 piggyBac transposable element derived 2 267002 yes yes 2 PHLDA1 pleckstrin homology-like domain, family A, member 1 22822 yes yes 2 PHYH phytanoyl-CoA 2-hydroxylase 5264 yes yes 2 PIK3R1 phosphoinositide-3-kinase, regulatory subunit 1 (alpha) 5295 yes yes 2 yes PITPNC1 phosphatidylinositol transfer protein, cytoplasmic 1 26207 yes yes 2 PLA2G4A phospholipase A2, group IVA (cytosolic, calcium-dependent) 5321 yes yes 2 yes PLAU plasminogen activator, urokinase 5328 yes yes 2 PLD1 phospholipase D1, phosphatidylcholine-specific 5337 yes yes 2 PLEKHA2 pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 2 59339 yes yes 2 PLP1 proteolipid protein 1 5354 yes yes 2 PMP2 peripheral myelin protein 2 5375 yes yes 2 POF1B premature ovarian failure, 1B 79983 yes yes 2 POLE2 polymerase (DNA directed), epsilon 2 (p59 subunit) 5427 yes yes 2 POLH polymerase (DNA directed), eta 5429 yes yes 2 POLR2K polymerase (RNA) II (DNA directed) polypeptide K, 7.0kDa 5440 yes yes 2 PROCR protein C receptor, endothelial (EPCR) 10544 yes yes 2 PROL1 proline rich, lacrimal 1 58503 yes yes 2

PRSS12 protease, serine, 12 (neurotrypsin, motopsin) 8492 yes yes 2 PRTG protogenin homolog (Gallus gallus) 283659 yes yes 2 PTH parathyroid hormone 5741 yes yes 2 RAB11FIP2 RAB11 family interacting protein 2 (class I) 22841 yes yes 2 RAB12 RAB12, member RAS oncogene family 201475 yes yes 2 RAB4A RAB4A, member RAS oncogene family 5867 yes yes 2 RAD18 RAD18 homolog (S. cerevisiae) 56852 yes yes 2 RAD23A RAD23 homolog A (S. cerevisiae) 5886 yes yes 2 RAD23B RAD23 homolog B (S. cerevisiae) 5887 yes yes 2 RAG1 recombination activating gene 1 5896 yes yes 2 RAP1GDS1 RAP1, GTP-GDP dissociation stimulator 1 5910 yes yes 2 RB1 retinoblastoma 1 5925 yes yes 2 RBBP6 retinoblastoma binding protein 6 5930 yes yes 2 RBM27 RNA binding motif protein 27 54439 yes yes 2 RCN2 reticulocalbin 2, EF-hand calcium binding domain 5955 yes yes 2 REV1 REV1 homolog (S. cerevisiae) 51455 yes yes 2 RHOA ras homolog gene family, member A 387 yes yes 2 RIF1 RAP1 interacting factor homolog (yeast) 55183 yes yes 2 RIN1 Ras and Rab interactor 1 9610 yes yes 2 RNF11 ring finger protein 11 26994 yes yes 2 RNF144A ring finger protein 144A 9781 yes yes 2 RNF8 ring finger protein 8 9025 yes yes 2 RNGTT RNA guanylyltransferase and 5'-phosphatase 8732 yes yes 2 ROM1 retinal outer segment membrane protein 1 6094 yes yes 2 RORB RAR-related orphan receptor B 6096 yes yes 2 RPA1 replication protein A1, 70kDa 6117 yes yes 2 RPGRIP1L RPGRIP1-like 23322 yes yes 2 RPL34 ribosomal protein L34 6164 yes yes 2 RPN2 ribophorin II 6185 yes yes 2 RQCD1 RCD1 required for cell differentiation1 homolog (S. pombe) 9125 yes yes 2
RTN3 reticulon 3 10313 yes yes 2 SAMD3 sterile alpha motif domain containing 3 154075 yes yes 2 SC5DL sterol-C5-desaturase (ERG3 delta-5-desaturase homolog, S. cerevisiae)-like 6309 yes yes 2 SDCBP syndecan binding protein (syntenin) 6386 yes yes 2 SFRS12 splicing factor, arginine/serine-rich 12 140890 yes yes 2 SFRS5 splicing factor, arginine/serine-rich 5 6430 yes yes 2 SFXN3 sideroflexin 3 81855 yes yes 2 SHISA2 shisa homolog 2 (Xenopus laevis) 387914 yes yes 2 SIRT5 sirtuin (silent mating type information regulation 2 homolog) 5 (S. cerevisiae) 23408 yes yes 2 SLAMF1 signaling lymphocytic activation molecule family member 1 6504 yes yes 2 SLC12A2 solute carrier family 12 (sodium/potassium/chloride transporters), member 2 6558 yes yes 2 SLC16A9 solute carrier family 16, member 9 (monocarboxylic acid transporter 9) 220963 yes yes 2 SLC22A7 solute carrier family 22 (organic anion transporter), member 7 10864 yes yes 2 SLC23A2 solute carrier family 23 (nucleobase transporters), member 2 9962 yes yes 2 SLC26A7 solute carrier family 26, member 7 115111 yes yes 2 SLC36A3 solute carrier family 36 (proton/amino acid symporter), member 3 285641 yes yes 2 SLC41A1 solute carrier family 41, member 1 254428 yes yes 2 SLC5A3 solute carrier family 5 (sodium/myo-inositol cotransporter), member 3 6526 yes yes 2 SLC6A4 solute carrier family 6 (neurotransmitter transporter, serotonin), member 4 6532 yes yes 2 SLC7A6OS solute carrier family 7, member 6 opposite strand 84138 yes yes 2 SMAD2 SMAD family member 2 4087 yes yes 2 SMPD3 sphingomyelin phosphodiesterase 3, neutral membrane (neutral sphingomyelinase II) 55512 yes yes 2 SND1 staphylococcal nuclease and tudor domain containing 1 27044 yes yes 2 SNRNP27 small nuclear ribonucleoprotein 27kDa (U4/U6.U5) 11017 yes yes 2 SOHLH2 spermatogenesis and oogenesis specific basic helix-loop-helix 2 54937 yes yes 2 SON SON DNA binding protein 6651 yes yes 2 SPAG16 sperm associated antigen 16 79582 yes yes 2 SPP2 secreted 
phosphoprotein 2, 24kDa 6694 yes yes 2 SPRR2F small proline-rich protein 2F 6705 yes yes 2 ST8SIA1 ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1 6489 yes yes 2 STAU1 staufen, RNA binding protein, homolog 1 (Drosophila) 6780 yes yes 2 STK32A serine/threonine kinase 32A 202374 yes yes 2 SUV420H1 suppressor of variegation 4-20 homolog 1 (Drosophila) 51111 yes yes 2 SV2A synaptic vesicle glycoprotein 2A 9900 yes yes 2 SYF2 SYF2 homolog, RNA splicing factor (S. cerevisiae) 25949 yes yes 2 SYT10 synaptotagmin X 341359 yes yes 2 SYT15 synaptotagmin XV 83849 yes yes 2 TAF11 TAF11 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 28kDa 6882 yes yes 2 TAF13 TAF13 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 18kDa 6884 yes yes 2 TALDO1 transaldolase 1 6888 yes yes 2 TBC1D24 TBC1 domain family, member 24 57465 yes yes 2 TCEAL5 transcription elongation factor A (SII)-like 5 340543 yes yes 2 TCEAL6 transcription elongation factor A (SII)-like 6 158931 yes yes 2 TET1 tet oncogene 1 80312 yes yes 2 TET3 tet oncogene family member 3 200424 yes yes 2 TEX15 testis expressed 15 56154 yes yes 2 TFDP3 transcription factor Dp family, member 3 51270 yes yes 2 TGDS TDP-glucose 4,6-dehydratase 23483 yes yes 2 TGFBR1 transforming growth factor, beta receptor 1 7046 yes yes 2 TIPIN TIMELESS interacting protein 54962 yes yes 2 TIPRL TIP41, TOR signaling pathway regulator-like (S. cerevisiae) 261726 yes yes 2 TKTL1 transketolase-like 1 8277 yes yes 2

TKTL2 transketolase-like 2 84076 yes yes 2 TM2D1 TM2 domain containing 1 83941 yes yes 2 TM4SF20 transmembrane 4 L six family member 20 79853 yes yes 2 TMEM14E transmembrane protein 14E 645843 yes yes 2 TMEM174 transmembrane protein 174 134288 yes yes 2 TMEM185B transmembrane protein 185B (pseudogene) 79134 yes yes 2 TMEM215 transmembrane protein 215 401498 yes yes 2 TMEM232 hypothetical protein LOC642987 642987 yes yes 2 TNPO1 transportin 1 3842 yes yes 2 TRAF3 TNF receptor-associated factor 3 7187 yes yes 2 TRIM58 tripartite motif-containing 58 25893 yes yes 2 TRIM7 tripartite motif-containing 7 81786 yes yes 2 TSGA10 testis specific, 10 80705 yes yes 2 TTC30A tetratricopeptide repeat domain 30A 92104 yes yes 2 TWIST1 twist homolog 1 (Drosophila) 7291 yes yes 2 TWSG1 twisted gastrulation homolog 1 (Drosophila) 57045 yes yes 2 UBE2H ubiquitin-conjugating enzyme E2H (UBC8 homolog, yeast) 7328 yes yes 2 UCHL3 ubiquitin carboxyl-terminal esterase L3 (ubiquitin thiolesterase) 7347 yes yes 2 UGT2A3 UDP glucuronosyltransferase 2 family, polypeptide A3 79799 yes yes 2 UGT8 UDP glycosyltransferase 8 7368 yes yes 2 UHRF2 ubiquitin-like with PHD and ring finger domains 2 115426 yes yes 2 USP42 ubiquitin specific peptidase 42 84132 yes yes 2 USP47 ubiquitin specific peptidase 47 55031 yes yes 2 USP6 ubiquitin specific peptidase 6 (Tre-2 oncogene) 9098 yes yes 2 UTP14A UTP14, U3 small nucleolar ribonucleoprotein, homolog A (yeast) 10813 yes yes 2 VAMP3 vesicle-associated membrane protein 3 (cellubrevin) 9341 yes yes 2 VEZF1 vascular endothelial zinc finger 1 7716 yes yes 2 VPS13A vacuolar protein sorting 13 homolog A (S. 
cerevisiae) 23230 yes yes 2 VPS13B vacuolar protein sorting 13 homolog B (yeast) 157680 yes yes 2 VSX1 visual system homeobox 1 30813 yes yes 2 VWDE von Willebrand factor D and EGF domains 221806 yes yes 2 WARS2 tryptophanyl tRNA synthetase 2, mitochondrial 10352 yes yes 2 WDR26 WD repeat domain 26 80232 yes yes 2 WDR67 WD repeat domain 67 93594 yes yes 2 WT1 Wilms tumor 1 7490 yes yes 2 XAF1 XIAP associated factor 1 54739 yes yes 2 YAP1 Yes-associated protein 1, 65kDa 10413 yes yes 2 YIPF4 Yip1 domain family, member 4 84272 yes yes 2 YWHAB tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, beta polypeptide 7529 yes yes 2 YWHAH tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide 7533 yes yes 2 ZBTB10 zinc finger and BTB domain containing 10 65986 yes yes 2 ZBTB26 zinc finger and BTB domain containing 26 57684 yes yes 2 ZBTB38 zinc finger and BTB domain containing 38 253461 yes yes 2 ZCCHC11 zinc finger, CCHC domain containing 11 23318 yes yes 2 ZFPM2 zinc finger protein, multitype 2 23414 yes yes 2 ZFR zinc finger RNA binding protein 51663 yes yes 2 ZHX3 zinc fingers and homeoboxes 3 23051 yes yes 2 ZNF160 zinc finger protein 160 90338 yes yes 2 ZNF254 zinc finger protein 254 9534 yes yes 2 ZNF271 zinc finger protein 271 10778 yes yes 2 ZNF396 zinc finger protein 396 252884 yes yes 2 ZNF434 zinc finger protein 434 54925 yes yes 2 ZNF449 zinc finger protein 449 203523 yes yes 2 ZNF460 zinc finger protein 460 10794 yes yes 2 ZNF470 zinc finger protein 470 388566 yes yes 2 ZNF490 zinc finger protein 490 57474 yes yes 2 ZNF510 zinc finger protein 510 22869 yes yes 2 ZNF516 zinc finger protein 516 9658 yes yes 2 ZNF521 zinc finger protein 521 25925 yes yes 2 ZNF599 zinc finger protein 599 148103 yes yes 2 ZRANB1 zinc finger, RAN-binding domain containing 1 54764 yes yes 2 ZXDB zinc finger, X-linked, duplicated B 158586 yes yes 2
ABCA1 ATP-binding cassette, sub-family A (ABC1), member 1 19 yes 1 ABCA17P ATP-binding cassette, sub-family A (ABC1), member 17 (pseudogene) 650655 yes 1 ABCA4 ATP-binding cassette, sub-family A (ABC1), member 4 24 yes 1 ABCA8 ATP-binding cassette, sub-family A (ABC1), member 8 10351 yes 1 ABCB10 ATP-binding cassette, sub-family B (MDR/TAP), member 10 23456 yes 1 ABCC10 ATP-binding cassette, sub-family C (CFTR/MRP), member 10 89845 yes 1 ABCD2 ATP-binding cassette, sub-family D (ALD), member 2 225 yes 1 ABCF2 ATP-binding cassette, sub-family F (GCN20), member 2 10061 yes 1 ABI2 abl interactor 2 10152 yes 1 ACACB acetyl-Coenzyme A carboxylase beta 32 yes 1 ACAD8 acyl-Coenzyme A dehydrogenase family, member 8 27034 yes 1 ACADS acyl-Coenzyme A dehydrogenase, C-2 to C-3 short chain 35 yes 1 ACAP2 ArfGAP with coiled-coil, ankyrin repeat and PH domains 2 23527 yes 1 ACBD5 acyl-Coenzyme A binding domain containing 5 91452 yes 1 ACBD7 acyl-Coenzyme A binding domain containing 7 414149 yes 1 ACLY ATP citrate lyase 47 yes 1 ACOX3 acyl-Coenzyme A oxidase 3, pristanoyl 8310 yes 1 ACPL2 acid phosphatase-like 2 92370 yes 1 ACSBG2 acyl-CoA synthetase bubblegum family member 2 81616 yes 1 ACSL6 acyl-CoA synthetase long-chain family member 6 23305 yes 1

ACSM2A acyl-CoA synthetase medium-chain family member 2A 123876 yes 1 ACSM3 acyl-CoA synthetase medium-chain family member 3 6296 yes 1 ACTC1 actin, alpha, cardiac muscle 1 70 yes 1 ACTN4 actinin, alpha 4 81 yes 1 ACTR3B ARP3 actin-related protein 3 homolog B (yeast) 57180 yes 1 ACVR2A activin A receptor, type IIA 92 yes 1 ACY1 aminoacylase 1 95 yes 1 ADAL adenosine deaminase-like 161823 yes 1 ADAM10 ADAM metallopeptidase domain 10 102 yes 1 ADAM12 ADAM metallopeptidase domain 12 8038 yes 1 ADAM18 ADAM metallopeptidase domain 18 8749 yes 1 ADAM19 ADAM metallopeptidase domain 19 (meltrin beta) 8728 yes 1 ADAM30 ADAM metallopeptidase domain 30 11085 yes 1 ADAT2 adenosine deaminase, tRNA-specific 2, TAD2 homolog (S. cerevisiae) 134637 yes 1 ADCY10 adenylate cyclase 10 (soluble) 55811 yes 1 ADCY9 adenylate cyclase 9 115 yes 1 ADCYAP1 adenylate cyclase activating polypeptide 1 (pituitary) 116 yes 1 ADNP activity-dependent neuroprotector homeobox 23394 yes 1 ADORA1 adenosine A1 receptor 134 yes 1 ADRA2A adrenergic, alpha-2A-, receptor 150 yes 1 ADSS adenylosuccinate synthase 159 yes 1 AGL amylo-1, 6-glucosidase, 4-alpha-glucanotransferase 178 yes 1 AGR2 anterior gradient homolog 2 (Xenopus laevis) 10551 yes 1 AGTR2 angiotensin II receptor, type 2 186 yes 1 AGXT2 alanine-glyoxylate aminotransferase 2 64902 yes 1 AHI1 Abelson helper integration site 1 54806 yes 1 AHNAK AHNAK nucleoprotein 79026 yes 1 AIFM1 apoptosis-inducing factor, mitochondrion-associated, 1 9131 yes 1 AIG1 androgen-induced 1 51390 yes 1 AIMP1 aminoacyl tRNA synthetase complex-interacting multifunctional protein 1 9255 yes 1 AJAP1 adherens junctions associated protein 1 55966 yes 1 AKAP12 A kinase (PRKA) anchor protein 12 9590 yes 1 AKAP6 A kinase (PRKA) anchor protein 6 9472 yes 1 AKAP7 A kinase (PRKA) anchor protein 7 9465 yes 1 AKR7A2 aldo-keto reductase family 7, member A2 (aflatoxin aldehyde reductase) 8574 yes 1 ALDH1L2 aldehyde dehydrogenase 1 family, member L2 160428 yes 1
ALG10B asparagine-linked glycosylation 10, alpha-1,2-glucosyltransferase homolog B (yeast) 144245 yes 1 ALG9 asparagine-linked glycosylation 9, alpha-1,2-mannosyltransferase homolog (S. cerevisiae) 79796 yes 1 ALKBH8 alkB, alkylation repair homolog 8 (E. coli) 91801 yes 1 ALS2CR11 amyotrophic lateral sclerosis 2 (juvenile) chromosome region, candidate 11 151254 yes 1 ALS2CR4 amyotrophic lateral sclerosis 2 (juvenile) chromosome region, candidate 4 65062 yes 1 AMBRA1 autophagy/beclin-1 regulator 1 55626 yes 1 AMELY amelogenin, Y-linked 266 yes 1 AMFR autocrine motility factor receptor 267 yes 1 AMIGO2 adhesion molecule with Ig-like domain 2 347902 yes 1 ANAPC11 anaphase promoting complex subunit 11 51529 yes 1 ANAPC5 anaphase promoting complex subunit 5 51433 yes 1 ANAPC7 anaphase promoting complex subunit 7 51434 yes 1 ANGPT1 angiopoietin 1 284 yes 1 ANGPTL2 angiopoietin-like 2 23452 yes 1 ANKAR ankyrin and armadillo repeat containing 150709 yes 1 ANKFY1 ankyrin repeat and FYVE domain containing 1 51479 yes 1 ANKRD12 ankyrin repeat domain 12 23253 yes 1 ANKRD16 ankyrin repeat domain 16 54522 yes 1 ANKRD17 ankyrin repeat domain 17 26057 yes 1 ANKRD28 ankyrin repeat domain 28 23243 yes 1 ANKRD31 ankyrin repeat domain 31 256006 yes 1 ANKRD32 ankyrin repeat domain 32 84250 yes 1 ANKRD34B ankyrin repeat domain 34B 340120 yes 1 ANKRD42 ankyrin repeat domain 42 338699 yes 1 ANKRD44 ankyrin repeat domain 44 91526 yes 1 ANKS1B ankyrin repeat and sterile alpha motif domain containing 1B 56899 yes 1 ANKS6 ankyrin repeat and sterile alpha motif domain containing 6 203286 yes 1 ANO1 anoctamin 1, calcium activated chloride channel 55107 yes 1 ANO3 anoctamin 3 63982 yes 1 AOC3 amine oxidase, copper containing 3 (vascular adhesion protein 1) 8639 yes 1 AP3B1 adaptor-related protein complex 3, beta 1 subunit 8546 yes 1 AP3M2 adaptor-related protein complex 3, mu 2 subunit 10947 yes 1 AP3S1 adaptor-related protein complex 3, sigma 1 subunit 1176 yes 1 AP4S1 adaptor-related 
protein complex 4, sigma 1 subunit 11154 yes 1 APBB2 amyloid beta (A4) precursor protein-binding, family B, member 2 323 yes 1 APH1B anterior pharynx defective 1 homolog B (C. elegans) 83464 yes 1 APLP2 amyloid beta (A4) precursor-like protein 2 334 yes 1 APOA4 apolipoprotein A-IV 337 yes 1 APOA5 apolipoprotein A-V 116519 yes 1 APOLD1 apolipoprotein L domain containing 1 81575 yes 1 APP amyloid beta (A4) precursor protein 351 yes 1 APPBP2 amyloid beta precursor protein (cytoplasmic tail) binding protein 2 10513 yes 1 AQP7P3 aquaporin 7 pseudogene 3 441432 yes 1 AQP9 aquaporin 9 366 yes 1 ARAF v-raf murine sarcoma 3611 viral oncogene homolog 369 yes 1 ARAP1 ArfGAP with RhoGAP domain, ankyrin repeat and PH domain 1 116985 yes 1

ARFGAP3 ADP-ribosylation factor GTPase activating protein 3 26286 yes 1 ARFGEF2 ADP-ribosylation factor guanine nucleotide-exchange factor 2 (brefeldin A-inhibited) 10564 yes 1 ARHGAP15 Rho GTPase activating protein 15 55843 yes 1 ARHGAP19 Rho GTPase activating protein 19 84986 yes 1 ARHGAP20 Rho GTPase activating protein 20 57569 yes 1 ARHGAP21 Rho GTPase activating protein 21 57584 yes 1 ARHGAP24 Rho GTPase activating protein 24 83478 yes 1 ARHGAP36 hypothetical protein FLJ30058 158763 yes 1 ARHGAP6 Rho GTPase activating protein 6 395 yes 1 ARHGAP9 Rho GTPase activating protein 9 64333 yes 1 ARHGEF12 Rho guanine nucleotide exchange factor (GEF) 12 23365 yes 1 ARHGEF6 Rac/Cdc42 guanine nucleotide exchange factor (GEF) 6 9459 yes 1 ARHGEF7 Rho guanine nucleotide exchange factor (GEF) 7 8874 yes 1 ARID1A AT rich interactive domain 1A (SWI-like) 8289 yes 1 ARID2 AT rich interactive domain 2 (ARID, RFX-like) 196528 yes 1 ARID4A AT rich interactive domain 4A (RBP1-like) 5926 yes 1 ARID4B AT rich interactive domain 4B (RBP1-like) 51742 yes 1 ARIH1 ariadne homolog, ubiquitin-conjugating enzyme E2 binding protein, 1 (Drosophila) 25820 yes 1 ARIH2 ariadne homolog 2 (Drosophila) 10425 yes 1 ARL1 ADP-ribosylation factor-like 1 400 yes 1 ARL13B ADP-ribosylation factor-like 13B 200894 yes 1 ARL4C ADP-ribosylation factor-like 4C 10123 yes 1 ARMC4 armadillo repeat containing 4 55130 yes 1 ARMCX3 armadillo repeat containing, X-linked 3 51566 yes 1 ARMCX5 armadillo repeat containing, X-linked 5 64860 yes 1 ARNT aryl hydrocarbon receptor nuclear translocator 405 yes 1 ARNTL aryl hydrocarbon receptor nuclear translocator-like 406 yes 1 ARRB1 arrestin, beta 1 408 yes 1 ARRDC2 arrestin domain containing 2 27106 yes 1 ARSB arylsulfatase B 411 yes 1 ART3 ADP-ribosyltransferase 3 419 yes 1 ASAP2 ArfGAP with SH3 domain, ankyrin repeat and PH domain 2 8853 yes 1 ASB10 ankyrin repeat and SOCS box-containing 10 136371 yes 1 ASB11 ankyrin repeat and 
SOCS box-containing 11 140456 yes 1 ASB7 ankyrin repeat and SOCS box-containing 7 140460 yes 1 ASF1A ASF1 anti-silencing function 1 homolog A (S. cerevisiae) 25842 yes 1 ASPH aspartate beta-hydroxylase 444 yes 1 ASRGL1 asparaginase like 1 80150 yes 1 ASTN2 astrotactin 2 23245 yes 1 ASXL1 additional sex combs like 1 (Drosophila) 171023 yes 1 ATAD2 ATPase family, AAA domain containing 2 29028 yes 1 ATAD2B ATPase family, AAA domain containing 2B 54454 yes 1 ATF2 activating transcription factor 2 1386 yes 1 ATF3 activating transcription factor 3 467 yes 1 ATG2B ATG2 autophagy related 2 homolog B (S. cerevisiae) 55102 yes 1 ATG3 ATG3 autophagy related 3 homolog (S. cerevisiae) 64422 yes 1 ATG4A ATG4 autophagy related 4 homolog A (S. cerevisiae) 115201 yes 1 ATG5 ATG5 autophagy related 5 homolog (S. cerevisiae) 9474 yes 1 ATL2 atlastin GTPase 2 64225 yes 1 ATP10B ATPase, class V, type 10B 23120 yes 1 ATP2C1 ATPase, Ca++ transporting, type 2C, member 1 27032 yes 1 ATP5F1 ATP synthase, H+ transporting, mitochondrial F0 complex, subunit B1 515 yes 1 ATP5G3 ATP synthase, H+ transporting, mitochondrial F0 complex, subunit C3 (subunit 9) 518 yes 1 ATP5L ATP synthase, H+ transporting, mitochondrial F0 complex, subunit G 10632 yes 1 ATP6V0A4 ATPase, H+ transporting, lysosomal V0 subunit a4 50617 yes 1 ATP6V1F ATPase, H+ transporting, lysosomal 14kDa, V1 subunit F 9296 yes 1 ATP6V1G1 ATPase, H+ transporting, lysosomal 13kDa, V1 subunit G1 9550 yes 1 ATP7A ATPase, Cu++ transporting, alpha polypeptide 538 yes 1 ATP8B4 ATPase, class I, type 8B, member 4 79895 yes 1 ATP9A ATPase, class II, type 9A 10079 yes 1 ATRNL1 attractin-like 1 26033 yes 1 ATXN1L ataxin 1-like 342371 yes 1 ATXN7 ataxin 7 6314 yes 1 ATXN7L1 ataxin 7-like 1 222255 yes 1 AVL9 AVL9 homolog (S. 
cerevisiae) 23080 yes 1 AWAT2 acyl-CoA wax alcohol acyltransferase 2 158835 yes 1 AXL AXL receptor tyrosine kinase 558 yes 1 AZIN1 antizyme inhibitor 1 51582 yes 1 B2M beta-2-microglobulin 567 yes 1 B3GALNT1 beta-1,3-N-acetylgalactosaminyltransferase 1 (globoside blood group) 8706 yes 1 B4GALT3 UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 3 8703 yes 1 B4GALT6 UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 6 9331 yes 1 B4GALT7 xylosylprotein beta 1,4-galactosyltransferase, polypeptide 7 (galactosyltransferase I) 11285 yes 1 BAAT bile acid Coenzyme A: amino acid N-acyltransferase (glycine N-choloyltransferase) 570 yes 1 BACH1 BTB and CNC homology 1, basic leucine zipper transcription factor 1 571 yes 1 BASE breast cancer and salivary gland expression gene 317716 yes 1 BASP1 brain abundant, membrane attached signal protein 1 10409 yes 1 BAZ2B bromodomain adjacent to zinc finger domain, 2B 29994 yes 1 BBS10 Bardet-Biedl syndrome 10 79738 yes 1 BBX bobby sox homolog (Drosophila) 56987 yes 1 BCAM basal cell adhesion molecule (Lutheran blood group) 4059 yes 1 BCAR3 breast cancer anti-estrogen resistance 3 8412 yes 1

BCAS3 breast carcinoma amplified sequence 3 54828 yes 1 BCL2L15 BCL2-like 15 440603 yes 1 BCO2 beta-carotene oxygenase 2 83875 yes 1 BCOR BCL6 co-repressor 54880 yes 1 BDH1 3-hydroxybutyrate dehydrogenase, type 1 622 yes 1 BDNFOS BDNF opposite strand (non-protein coding) 497258 yes 1 BDP1 B double prime 1, subunit of RNA polymerase III transcription initiation factor IIIB 55814 yes 1 BEND2 BEN domain containing 2 139105 yes 1 BEND6 BEN domain containing 6 221336 yes 1 BEND7 BEN domain containing 7 222389 yes 1 BET3L BET3 like (S. cerevisiae) 100128327 yes 1 BEX5 brain expressed, X-linked 5 340542 yes 1 BIRC7 baculoviral IAP repeat-containing 7 79444 yes 1 BIVM basic, immunoglobulin-like variable motif containing 54841 yes 1 BLZF1 basic leucine zipper nuclear factor 1 8548 yes 1 BMF Bcl2 modifying factor 90427 yes 1 BMP1 bone morphogenetic protein 1 649 yes 1 BMP3 bone morphogenetic protein 3 651 yes 1 BMP7 bone morphogenetic protein 7 655 yes 1 BNIP3L BCL2/adenovirus E1B 19kDa interacting protein 3-like 665 yes 1 BPGM 2,3-bisphosphoglycerate mutase 669 yes 1 BPIL3 bactericidal/permeability-increasing protein-like 3 128859 yes 1 BRD1 bromodomain containing 1 23774 yes 1 BRMS1L breast cancer metastasis-suppressor 1-like 84312 yes 1 BRP44L brain protein 44-like 51660 yes 1 BRPF3 bromodomain and PHD finger containing, 3 27154 yes 1 BRSK2 BR serine/threonine kinase 2 9024 yes 1 BRWD1 bromodomain and WD repeat domain containing 1 54014 yes 1 BSPH1 binder of sperm protein homolog 1 100131137 yes 1 BTAF1 BTAF1 RNA polymerase II, B-TFIID transcription factor-associated, 170kDa (Mot1 homolog, S. cerevisiae) 9044 yes 1 BTBD11 BTB (POZ) domain containing 11 121551 yes 1
BTF3L3 basic transcription factor 3, like 3 132556 yes 1 BTG1 B-cell translocation gene 1, anti-proliferative 694 yes 1 BTK Bruton agammaglobulinemia tyrosine kinase 695 yes 1 BTLA B and T lymphocyte associated 151888 yes 1 BTN2A1 butyrophilin, subfamily 2, member A1 11120 yes 1 BTNL9 butyrophilin-like 9 153579 yes 1 BYSL bystin-like 705 yes 1 BZW2 basic leucine zipper and W2 domains 2 28969 yes 1 C10ORF110 chromosome 10 open reading frame 110 55853 yes 1 C10ORF128 chromosome 10 open reading frame 128 170371 yes 1 C10ORF25 chromosome 10 open reading frame 25 220979 yes 1 C10ORF26 chromosome 10 open reading frame 26 54838 yes 1 C10ORF4 chromosome 10 open reading frame 4 118924 yes 1 C10ORF78 chromosome 10 open reading frame 78 119392 yes 1 C10ORF88 chromosome 10 open reading frame 88 80007 yes 1 C11ORF21 chromosome 11 open reading frame 21 29125 yes 1 C11ORF54 chromosome 11 open reading frame 54 28970 yes 1 C11ORF65 chromosome 11 open reading frame 65 160140 yes 1 C11ORF73 chromosome 11 open reading frame 73 51501 yes 1 C11ORF87 chromosome 11 open reading frame 87 399947 yes 1 C11ORF92 chromosome 11 open reading frame 92 399948 yes 1 C11ORF93 Uncharacterized protein LOC120376 120376 yes 1 C12ORF24 chromosome 12 open reading frame 24 29902 yes 1 C12ORF41 chromosome 12 open reading frame 41 54934 yes 1 C12ORF48 chromosome 12 open reading frame 48 55010 yes 1 C12ORF5 chromosome 12 open reading frame 5 57103 yes 1 C12ORF50 chromosome 12 open reading frame 50 160419 yes 1 C12ORF54 chromosome 12 open reading frame 54 121273 yes 1 C12ORF61 chromosome 12 open reading frame 61 283416 yes 1 C12ORF62 chromosome 12 open reading frame 62 84987 yes 1 C12ORF75 chromosome 12 open reading frame 75 387882 yes 1 C13ORF27 chromosome 13 open reading frame 27 93081 yes 1 C13ORF36 chromosome 13 open reading frame 36 400120 yes 1 C13ORF37 chromosome 13 open reading frame 37 440145 yes 1 C13ORF38 chromosome 13 open reading frame 38 728591 yes 1 C14ORF101 chromosome 14 open 
reading frame 101 54916 yes 1 C14ORF105 chromosome 14 open reading frame 105 55195 yes 1 C14ORF129 chromosome 14 open reading frame 129 51527 yes 1 C14ORF135 chromosome 14 open reading frame 135 64430 yes 1 C14ORF145 chromosome 14 open reading frame 145 145508 yes 1 C14ORF184 hypothetical protein LOC650662 650662 yes 1 C14ORF45 chromosome 14 open reading frame 45 80127 yes 1 C14orf82 chromosome 14 open reading frame 82 145438 yes 1 C15ORF29 chromosome 15 open reading frame 29 79768 yes 1 C15ORF44 chromosome 15 open reading frame 44 81556 yes 1 C16ORF61 chromosome 16 open reading frame 61 56942 yes 1 C16ORF75 chromosome 16 open reading frame 75 116028 yes 1 C16ORF87 chromosome 16 open reading frame 87 388272 yes 1 C17ORF101 chromosome 17 open reading frame 101 79701 yes 1 C17ORF104 hypothetical protein FLJ35848 284071 yes 1 C17ORF108 hypothetical protein LOC201229 201229 yes 1

C17orf49 chromosome 17 open reading frame 49 124944 yes 1 C17ORF50 chromosome 17 open reading frame 50 146853 yes 1 C18ORF16 chromosome 18 open reading frame 16 147429 yes 1 C18ORF32 chromosome 18 open reading frame 32 497661 yes 1 C18ORF62 chromosome 18 open reading frame 62 284274 yes 1 C18ORF8 chromosome 18 open reading frame 8 29919 yes 1 C1ORF112 chromosome 1 open reading frame 112 55732 yes 1 C1ORF124 chromosome 1 open reading frame 124 83932 yes 1 C1ORF125 chromosome 1 open reading frame 125 126859 yes 1 C1ORF131 chromosome 1 open reading frame 131 128061 yes 1 C1ORF144 chromosome 1 open reading frame 144 26099 yes 1 C1ORF150 chromosome 1 open reading frame 150 148823 yes 1 C1ORF186 chromosome 1 open reading frame 186 440712 yes 1 C1ORF195 chromosome 1 open reading frame 195 727684 yes 1 C1ORF198 chromosome 1 open reading frame 198 84886 yes 1 C1ORF26 chromosome 1 open reading frame 26 54823 yes 1 C1ORF51 chromosome 1 open reading frame 51 148523 yes 1 C1ORF88 chromosome 1 open reading frame 88 128344 yes 1 C20ORF11 chromosome 20 open reading frame 11 54994 yes 1 C20ORF152 chromosome 20 open reading frame 152 140894 yes 1 C20ORF197 chromosome 20 open reading frame 197 284756 yes 1 C20ORF202 hypothetical LOC400831 400831 yes 1 C20ORF24 chromosome 20 open reading frame 24 55969 yes 1 C20ORF4 chromosome 20 open reading frame 4 25980 yes 1 C20orf7 chromosome 20 open reading frame 7 79133 yes 1 C20ORF85 chromosome 20 open reading frame 85 128602 yes 1 C20ORF96 chromosome 20 open reading frame 96 140680 yes 1 C21ORF63 chromosome 21 open reading frame 63 59271 yes 1 C21orf82 chromosome 21 open reading frame 82 114036 yes 1 C21ORF84 chromosome 21 open reading frame 84 114038 yes 1 C21ORF91 chromosome 21 open reading frame 91 54149 yes 1 C21ORF99 chromosome 21 open reading frame 99 149992 yes 1 C2CD3 C2 calcium-dependent domain containing 3 26005 yes 1 C2CD4A family with sequence similarity 148, member A 145741 yes 1 
C2ORF56 chromosome 2 open reading frame 56 55471 yes 1 C2ORF68 chromosome 2 open reading frame 68 388969 yes 1 C2ORF89 hypothetical protein LOC129293 129293 yes 1 C2ORF90 similar to CG2839-PA 391343 yes 1 C3ORF15 chromosome 3 open reading frame 15 89876 yes 1 C3ORF23 chromosome 3 open reading frame 23 285343 yes 1 C3ORF26 chromosome 3 open reading frame 26 84319 yes 1 C3ORF34 chromosome 3 open reading frame 34 84984 yes 1 C3ORF58 chromosome 3 open reading frame 58 205428 yes 1 C4BPB complement component 4 binding protein, beta 725 yes 1 C4ORF10 chromosome 4 open reading frame 10 317648 yes 1 C4ORF31 chromosome 4 open reading frame 31 79625 yes 1 C4ORF33 chromosome 4 open reading frame 33 132321 yes 1 C4ORF49 chromosome 4 open reading frame 49 84709 yes 1 C4ORF52 hypothetical protein LOC389203 389203 yes 1 C5ORF25 chromosome 5 open reading frame 25 375484 yes 1 C5ORF34 chromosome 5 open reading frame 34 375444 yes 1 C5ORF44 chromosome 5 open reading frame 44 80006 yes 1 C5ORF46 chromosome 5 open reading frame 46 389336 yes 1 C5ORF48 chromosome 5 open reading frame 48 389320 yes 1 C5ORF58 hypothetical protein LOC133874 133874 yes 1 C6ORF115 chromosome 6 open reading frame 115 58527 yes 1 C6ORF138 chromosome 6 open reading frame 138 442213 yes 1 C6ORF142 chromosome 6 open reading frame 142 90523 yes 1 C6ORF167 chromosome 6 open reading frame 167 253714 yes 1 C6ORF204 chromosome 6 open reading frame 204 387119 yes 1 C6ORF225 chromosome 6 open reading frame 225 619208 yes 1 C6ORF52 chromosome 6 open reading frame 52 347744 yes 1 C6ORF57 chromosome 6 open reading frame 57 135154 yes 1 C6ORF64 chromosome 6 open reading frame 64 55776 yes 1 C7ORF10 chromosome 7 open reading frame 10 79783 yes 1 C7ORF46 chromosome 7 open reading frame 46 340277 yes 1 C7ORF57 chromosome 7 open reading frame 57 136288 yes 1 C7ORF63 chromosome 7 open reading frame 63 79846 yes 1 C8ORF33 chromosome 8 open reading frame 33 65265 yes 1 C8ORF34 chromosome 8 open reading frame 34 116328 yes 1 
C8ORF37 chromosome 8 open reading frame 37 157657 yes 1 C8ORF44 chromosome 8 open reading frame 44 56260 yes 1 C8ORF46 chromosome 8 open reading frame 46 254778 yes 1 C8ORF85 chromosome 8 open reading frame 85 441376 yes 1 C9ORF102 chromosome 9 open reading frame 102 375748 yes 1 C9ORF123 chromosome 9 open reading frame 123 90871 yes 1 C9ORF125 chromosome 9 open reading frame 125 84302 yes 1 C9ORF131 chromosome 9 open reading frame 131 138724 yes 1 C9ORF152 chromosome 9 open reading frame 152 401546 yes 1 C9ORF3 chromosome 9 open reading frame 3 84909 yes 1 C9ORF41 chromosome 9 open reading frame 41 138199 yes 1 C9ORF46 chromosome 9 open reading frame 46 55848 yes 1

C9ORF6 chromosome 9 open reading frame 6 54942 yes 1 C9ORF86 chromosome 9 open reading frame 86 55684 yes 1 C9ORF91 chromosome 9 open reading frame 91 203197 yes 1 CA6 carbonic anhydrase VI 765 yes 1 CA8 carbonic anhydrase VIII 767 yes 1 CABLES1 Cdk5 and Abl enzyme substrate 1 91768 yes 1 CABYR calcium binding tyrosine-(Y)-phosphorylation regulated 26256 yes 1 CACNA1E calcium channel, voltage-dependent, R type, alpha 1E subunit 777 yes 1 CACNA2D3 calcium channel, voltage-dependent, alpha 2/delta subunit 3 55799 yes 1 CACNB2 calcium channel, voltage-dependent, beta 2 subunit 783 yes 1 CACNB4 calcium channel, voltage-dependent, beta 4 subunit 785 yes 1 CALB1 calbindin 1, 28kDa 793 yes 1 CALCA calcitonin-related polypeptide alpha 796 yes 1 CALCOCO1 calcium binding and coiled-coil domain 1 57658 yes 1 CALCOCO2 calcium binding and coiled-coil domain 2 10241 yes 1 CALD1 caldesmon 1 800 yes 1 CAPZA3 capping protein (actin filament) muscle Z-line, alpha 3 93661 yes 1 CASC5 cancer susceptibility candidate 5 57082 yes 1 CASP1 caspase 1, apoptosis-related cysteine peptidase (interleukin 1, beta, convertase) 834 yes 1 CASP10 caspase 10, apoptosis-related cysteine peptidase 843 yes 1 CASP2 caspase 2, apoptosis-related cysteine peptidase 835 yes 1 CASP4 caspase 4, apoptosis-related cysteine peptidase 837 yes 1 CASP7 caspase 7, apoptosis-related cysteine peptidase 840 yes 1 CAT catalase 847 yes 1 CATSPERG chromosome 19 open reading frame 15 57828 yes 1 CAV1 caveolin 1, caveolae protein, 22kDa 857 yes 1 CAV2 caveolin 2 858 yes 1 CCDC108 coiled-coil domain containing 108 255101 yes 1 CCDC111 coiled-coil domain containing 111 201973 yes 1 CCDC120 coiled-coil domain containing 120 90060 yes 1 CCDC121 coiled-coil domain containing 121 79635 yes 1 CCDC125 coiled-coil domain containing 125 202243 yes 1 CCDC132 coiled-coil domain containing 132 55610 yes 1 CCDC149 coiled-coil domain containing 149 91050 yes 1 CCDC18 coiled-coil domain containing
18 343099 yes 1 CCDC22 coiled-coil domain containing 22 28952 yes 1 CCDC30 hypothetical protein LOC728621 728621 yes 1 CCDC33 coiled-coil domain containing 33 80125 yes 1 CCDC52 coiled-coil domain containing 52 152185 yes 1 CCDC53 coiled-coil domain containing 53 51019 yes 1 CCDC6 coiled-coil domain containing 6 8030 yes 1 CCDC68 coiled-coil domain containing 68 80323 yes 1 CCDC7 coiled-coil domain containing 7 221016 yes 1 CCDC82 coiled-coil domain containing 82 79780 yes 1 CCDC88A coiled-coil domain containing 88A 55704 yes 1 CCDC88B coiled-coil domain containing 88B 283234 yes 1 CCNB3 cyclin B3 85417 yes 1 CCNG1 cyclin G1 900 yes 1 CCNO cyclin O 10309 yes 1 CCNY cyclin Y 219771 yes 1 CCRL1 chemokine (C-C motif) receptor-like 1 51554 yes 1 CD109 CD109 molecule 135228 yes 1 CD160 CD160 molecule 11126 yes 1 CD1A CD1a molecule 909 yes 1 CD36 CD36 molecule (thrombospondin receptor) 948 yes 1 CD40 CD40 molecule, TNF receptor superfamily member 5 958 yes 1 CD48 CD48 molecule 962 yes 1 CD55 CD55 molecule, decay accelerating factor for complement (Cromer blood group) 1604 yes 1 CD9 CD9 molecule 928 yes 1 CDC14C CDC14 cell division cycle 14 homolog C (S. cerevisiae) 168448 yes 1 CDC27 cell division cycle 27 homolog (S. cerevisiae) 996 yes 1 CDC40 cell division cycle 40 homolog (S.
cerevisiae) 51362 yes 1 CDC42BPA CDC42 binding protein kinase alpha (DMPK-like) 8476 yes 1 CDH2 cadherin 2, type 1, N-cadherin (neuronal) 1000 yes 1 CDH26 cadherin-like 26 60437 yes 1 CDH9 cadherin 9, type 2 (T1-cadherin) 1007 yes 1 CDK17 PCTAIRE protein kinase 2 5128 yes 1 CDK19 cell division cycle 2-like 6 (CDK8-like) 23097 yes 1 CDK5RAP1 CDK5 regulatory subunit associated protein 1 51654 yes 1 CDRT15 CMT1A duplicated region transcript 15 146822 yes 1 CDS1 CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1 1040 yes 1 CEACAM1 carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein) 634 yes 1 CEACAM21 carcinoembryonic antigen-related cell adhesion molecule 21 90273 yes 1 CEACAM3 carcinoembryonic antigen-related cell adhesion molecule 3 1084 yes 1 CEACAM5 carcinoembryonic antigen-related cell adhesion molecule 5 1048 yes 1 CELF6 bruno-like 6, RNA binding protein (Drosophila) 60677 yes 1 CENPBD1 hypothetical protein MGC16385 92806 yes 1 CENPV centromere protein V 201161 yes 1 CEP192 centrosomal protein 192kDa 55125 yes 1 CEP70 centrosomal protein 70kDa 80321 yes 1 CEP76 centrosomal protein 76kDa 79959 yes 1 CEP97 centrosomal protein 97kDa 79598 yes 1

CES1 carboxylesterase 1 (monocyte/macrophage serine esterase 1) 1066 yes 1 CFL2 cofilin 2 (muscle) 1073 yes 1 CFLP1 cofilin pseudogene 1 142913 yes 1 CGRRF1 cell growth regulator with ring finger domain 1 10668 yes 1 CH25H cholesterol 25-hydroxylase 9023 yes 1 CHCHD3 coiled-coil-helix-coiled-coil-helix domain containing 3 54927 yes 1 CHCHD4 coiled-coil-helix-coiled-coil-helix domain containing 4 131474 yes 1 CHD2 chromodomain helicase DNA binding protein 2 1106 yes 1 CHD4 chromodomain helicase DNA binding protein 4 1108 yes 1 CHD6 chromodomain helicase DNA binding protein 6 84181 yes 1 CHD7 chromodomain helicase DNA binding protein 7 55636 yes 1 CHD9 chromodomain helicase DNA binding protein 9 80205 yes 1 CHFR checkpoint with forkhead and ring finger domains 55743 yes 1 CHMP4B chromatin modifying protein 4B 128866 yes 1 CHMP5 chromatin modifying protein 5 51510 yes 1 CHP calcium binding protein P22 11261 yes 1 yes CHST1 carbohydrate (keratan sulfate Gal-6) sulfotransferase 1 8534 yes 1 CHST14 carbohydrate (N-acetylgalactosamine 4-0) sulfotransferase 14 113189 yes 1 CIRH1A cirrhosis, autosomal recessive 1A (cirhin) 84916 yes 1 CISD1 CDGSH iron sulfur domain 1 55847 yes 1 CISH cytokine inducible SH2-containing protein 1154 yes 1 CKAP5 cytoskeleton associated protein 5 9793 yes 1 CKS1BP6 similar to CDC28 protein kinase 1B 652904 yes 1 CLCN1 chloride channel 1, skeletal muscle 1180 yes 1 CLCN6 chloride channel 6 1185 yes 1 CLDN11 claudin 11 5010 yes 1 CLDN12 claudin 12 9069 yes 1 CLDN14 claudin 14 23562 yes 1 CLDND1 claudin domain containing 1 56650 yes 1 CLEC2A C-type lectin domain family 2, member A 387836 yes 1 CLEC4E C-type lectin domain family 4, member E 26253 yes 1 CLEC4GP1 C-type lectin domain family 4, member G pseudogene 1 440508 yes 1 CLEC7A C-type lectin domain family 7, member A 64581 yes 1 CLIC4 chloride intracellular channel 4 25932 yes 1 CLIP4 CAP-GLY domain containing linker protein family, member 4 79745 yes 1 CLK4 CDC-like kinase 4 57396 yes 1
CLN5 ceroid-lipofuscinosis, neuronal 5 1203 yes 1 CLOCK clock homolog (mouse) 9575 yes 1 CLTA clathrin, light chain (Lca) 1211 yes 1 CLU clusterin 1191 yes 1 CMAH cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMP-N-acetylneuraminate 8418 yes 1 CMPK2 cytidine monophosphate (UMP-CMP) kinase 2, mitochondrial 129607 yes 1 CNBP CCHC-type zinc finger, nucleic acid binding protein 7555 yes 1 CNDP2 CNDP dipeptidase 2 (metallopeptidase M20 family) 55748 yes 1 CNGA1 cyclic nucleotide gated channel alpha 1 1259 yes 1 CNIH cornichon homolog (Drosophila) 10175 yes 1 CNOT10 CCR4-NOT transcription complex, subunit 10 25904 yes 1 CNOT6 CCR4-NOT transcription complex, subunit 6 57472 yes 1 CNOT6L CCR4-NOT transcription complex, subunit 6-like 246175 yes 1 CNOT7 CCR4-NOT transcription complex, subunit 7 29883 yes 1 CNPY1 canopy 1 homolog (zebrafish) 285888 yes 1 CNPY4 canopy 4 homolog (zebrafish) 245812 yes 1 CNRIP1 cannabinoid receptor interacting protein 1 25927 yes 1 CNTLN centlein, centrosomal protein 54875 yes 1 CNTN5 contactin 5 53942 yes 1 COBL cordon-bleu homolog (mouse) 23242 yes 1 COG6 component of oligomeric golgi complex 6 57511 yes 1 COL11A1 collagen, type XI, alpha 1 1301 yes 1 COL15A1 collagen, type XV, alpha 1 1306 yes 1 COL9A1 collagen, type IX, alpha 1 1297 yes 1 COMMD9 COMM domain containing 9 29099 yes 1 COPS4 COP9 constitutive photomorphogenic homolog subunit 4 (Arabidopsis) 51138 yes 1 COPS5 COP9 constitutive photomorphogenic homolog subunit 5 (Arabidopsis) 10987 yes 1 COPS7B COP9 constitutive photomorphogenic homolog subunit 7B (Arabidopsis) 64708 yes 1 COQ10B coenzyme Q10 homolog B (S. cerevisiae) 80219 yes 1 COQ3 coenzyme Q3 homolog, methyltransferase (S. cerevisiae) 51805 yes 1
CORIN corin, serine peptidase 10699 yes 1 CORO1A coronin, actin binding protein, 1A 11151 yes 1 CORO1C coronin, actin binding protein, 1C 23603 yes 1 COX15 COX15 homolog, cytochrome c oxidase assembly protein (yeast) 1355 yes 1 COX5A cytochrome c oxidase subunit Va 9377 yes 1 CP110 CP110 protein 9738 yes 1 CPAMD8 C3 and PZP-like, alpha-2-macroglobulin domain containing 8 27151 yes 1 CPNE3 copine III 8895 yes 1 CPO carboxypeptidase O 130749 yes 1 CPSF2 cleavage and polyadenylation specific factor 2, 100kDa 53981 yes 1 CPSF3L cleavage and polyadenylation specific factor 3-like 54973 yes 1 CPSF7 cleavage and polyadenylation specific factor 7, 59kDa 79869 yes 1 CPXCR1 CPX chromosome region, candidate 1 53336 yes 1 CREB3L2 cAMP responsive element binding protein 3-like 2 64764 yes 1 CRELD2 cysteine-rich with EGF-like domains 2 79174 yes 1 CRK v-crk sarcoma virus CT10 oncogene homolog (avian) 1398 yes 1

CRLS1 cardiolipin synthase 1 54675 yes 1 CRTAC1 cartilage acidic protein 1 55118 yes 1 CRYBB2 crystallin, beta B2 1415 yes 1 CRYGB crystallin, gamma B 1419 yes 1 CRYM crystallin, mu 1428 yes 1 CS citrate synthase 1431 yes 1 CSAD cysteine sulfinic acid decarboxylase 51380 yes 1 CSF2RA colony stimulating factor 2 receptor, alpha, low-affinity (granulocyte-macrophage) 1438 yes 1 CSF3R colony stimulating factor 3 receptor (granulocyte) 1441 yes 1 CSMD3 CUB and Sushi multiple domains 3 114788 yes 1 CSNK1G1 casein kinase 1, gamma 1 53944 yes 1 CSPG5 chondroitin sulfate proteoglycan 5 (neuroglycan C) 10675 yes 1 CSPP1 centrosome and spindle pole associated protein 1 79848 yes 1 CSRNP2 cysteine-serine-rich nuclear protein 2 81566 yes 1 CST3 cystatin C 1471 yes 1 CTNNAL1 catenin (cadherin-associated protein), alpha-like 1 8727 yes 1 CTSO cathepsin O 1519 yes 1 CTU1 ATP binding domain 3 90353 yes 1 CUL5 cullin 5 8065 yes 1 CWC15 CWC15 spliceosome-associated protein homolog (S. cerevisiae) 51503 yes 1
CX3CR1 chemokine (C-X3-C motif) receptor 1 1524 yes 1 CXCL16 chemokine (C-X-C motif) ligand 16 58191 yes 1 CXCL6 chemokine (C-X-C motif) ligand 6 (granulocyte chemotactic protein 2) 6372 yes 1 CXORF23 chromosome X open reading frame 23 256643 yes 1 CXORF41 chromosome X open reading frame 41 139212 yes 1 CYB5R4 cytochrome b5 reductase 4 51167 yes 1 CYBB cytochrome b-245, beta polypeptide 1536 yes 1 CYP19A1 cytochrome P450, family 19, subfamily A, polypeptide 1 1588 yes 1 CYP2A7 cytochrome P450, family 2, subfamily A, polypeptide 7 1549 yes 1 CYP2C18 cytochrome P450, family 2, subfamily C, polypeptide 18 1562 yes 1 CYP2C8 cytochrome P450, family 2, subfamily C, polypeptide 8 1558 yes 1 CYP4Z2P cytochrome P450, family 4, subfamily Z, polypeptide 2 pseudogene 163720 yes 1 CYP51A1 cytochrome P450, family 51, subfamily A, polypeptide 1 1595 yes 1 D2HGDH D-2-hydroxyglutarate dehydrogenase 728294 yes 1 D4S234E DNA segment on chromosome 4 (unique) 234 expressed sequence 27065 yes 1 DACH2 dachshund homolog 2 (Drosophila) 117154 yes 1 DAPK1 death-associated protein kinase 1 1612 yes 1 DAPP1 dual adaptor of phosphotyrosine and 3-phosphoinositides 27071 yes 1 DAZL deleted in azoospermia-like 1618 yes 1 DCAF16 chromosome 4 open reading frame 30 54876 yes 1 DCAF4L1 WD repeat domain 21B 285429 yes 1 DCAF5 WD repeat domain 22 8816 yes 1 DCBLD1 discoidin, CUB and LCCL domain containing 1 285761 yes 1 DCBLD2 discoidin, CUB and LCCL domain containing 2 131566 yes 1 DCC deleted in colorectal carcinoma 1630 yes 1 DCDC1 doublecortin domain containing 1 341019 yes 1 DCDC2 doublecortin domain containing 2 51473 yes 1 DCK deoxycytidine kinase 1633 yes 1 DCTN2 dynactin 2 (p50) 10540 yes 1 DDAH1 dimethylarginine dimethylaminohydrolase 1 23576 yes 1 DDI2 DDI1, DNA-damage inducible 1, homolog 2 (S. cerevisiae) 84301 yes 1
DDO D-aspartate oxidase 8528 yes 1 DDX17 DEAD (Asp-Glu-Ala-Asp) box polypeptide 17 10521 yes 1 DDX19A DEAD (Asp-Glu-Ala-As) box polypeptide 19A 55308 yes 1 DDX21 DEAD (Asp-Glu-Ala-Asp) box polypeptide 21 9188 yes 1 DDX23 DEAD (Asp-Glu-Ala-Asp) box polypeptide 23 9416 yes 1 DDX4 DEAD (Asp-Glu-Ala-Asp) box polypeptide 4 54514 yes 1 DDX46 DEAD (Asp-Glu-Ala-Asp) box polypeptide 46 9879 yes 1 DDX5 DEAD (Asp-Glu-Ala-Asp) box polypeptide 5 1655 yes 1 DDX59 DEAD (Asp-Glu-Ala-Asp) box polypeptide 59 83479 yes 1 DDX60L DEAD (Asp-Glu-Ala-Asp) box polypeptide 60-like 91351 yes 1 DEC1 deleted in esophageal cancer 1 50514 yes 1 DENND1A DENN/MADD domain containing 1A 57706 yes 1 DENND1B DENN/MADD domain containing 1B 163486 yes 1 DENND4A DENN/MADD domain containing 4A 10260 yes 1 DENND4C DENN/MADD domain containing 4C 55667 yes 1 DEPDC6 DEP domain containing 6 64798 yes 1 DERL1 Der1-like domain family, member 1 79139 yes 1 DGCR6 DiGeorge syndrome critical region gene 6 8214 yes 1 DGCR6L DiGeorge syndrome critical region gene 6-like 85359 yes 1 DGKB diacylglycerol kinase, beta 90kDa 1607 yes 1 DGKE diacylglycerol kinase, epsilon 64kDa 8526 yes 1 DGKH diacylglycerol kinase, eta 160851 yes 1 DHDDS dehydrodolichyl diphosphate synthase 79947 yes 1 DHRS13 dehydrogenase/reductase (SDR family) member 13 147015 yes 1 DHRSX dehydrogenase/reductase (SDR family) X-linked 207063 yes 1 DHTKD1 dehydrogenase E1 and transketolase domain containing 1 55526 yes 1 DHX15 DEAH (Asp-Glu-Ala-His) box polypeptide 15 1665 yes 1 DHX33 DEAH (Asp-Glu-Ala-His) box polypeptide 33 56919 yes 1 DHX35 DEAH (Asp-Glu-Ala-His) box polypeptide 35 60625 yes 1 DHX57 DEAH (Asp-Glu-Ala-Asp/His) box polypeptide 57 90957 yes 1 DIAPH2 diaphanous homolog 2 (Drosophila) 1730 yes 1

DIDO1 death inducer-obliterator 1 11083 yes 1 DIO1 deiodinase, iodothyronine, type I 1733 yes 1 DIO2 deiodinase, iodothyronine, type II 1734 yes 1 DIP2A DIP2 disco-interacting protein 2 homolog A (Drosophila) 23181 yes 1 DIP2B DIP2 disco-interacting protein 2 homolog B (Drosophila) 57609 yes 1 DIP2C DIP2 disco-interacting protein 2 homolog C (Drosophila) 22982 yes 1 DIRAS3 DIRAS family, GTP-binding RAS-like 3 9077 yes 1 DIRC1 disrupted in renal carcinoma 1 116093 yes 1 DISC1 disrupted in schizophrenia 1 27185 yes 1 DIXDC1 DIX domain containing 1 85458 yes 1 DKFZP434L187 hypothetical LOC26082 26082 yes 1 DKK2 dickkopf homolog 2 (Xenopus laevis) 27123 yes 1 DLC1 deleted in liver cancer 1 10395 yes 1 DLD dihydrolipoamide dehydrogenase 1738 yes 1 DLG2 discs, large homolog 2 (Drosophila) 1740 yes 1 DLGAP1 discs, large (Drosophila) homolog-associated protein 1 9229 yes 1 DLX5 distal-less homeobox 5 1749 yes 1 DMD dystrophin 1756 yes 1 DMRT2 doublesex and mab-3 related transcription factor 2 10655 yes 1 DMRTC2 DMRT-like family C2 63946 yes 1 DMXL2 Dmx-like 2 23312 yes 1 DNA2 DNA replication helicase 2 homolog (yeast) 1763 yes 1 DNAH14 dynein, axonemal, heavy chain 14 127602 yes 1 DNAH5 dynein, axonemal, heavy chain 5 1767 yes 1 DNAJA1 DnaJ (Hsp40) homolog, subfamily A, member 1 3301 yes 1 DNAJB14 DnaJ (Hsp40) homolog, subfamily B, member 14 79982 yes 1 DNAJB2 DnaJ (Hsp40) homolog, subfamily B, member 2 3300 yes 1 DNAJB4 DnaJ (Hsp40) homolog, subfamily B, member 4 11080 yes 1 DNAJC10 DnaJ (Hsp40) homolog, subfamily C, member 10 54431 yes 1 DNAJC14 DnaJ (Hsp40) homolog, subfamily C, member 14 85406 yes 1 DNAJC2 DnaJ (Hsp40) homolog, subfamily C, member 2 27000 yes 1 DNAJC21 DnaJ (Hsp40) homolog, subfamily C, member 21 134218 yes 1 DNAL1 dynein, axonemal, light chain 1 83544 yes 1 DNER delta/notch-like EGF repeat containing 92737 yes 1 DNHD1 dynein heavy chain domain 1 144132 yes 1 DNM1L dynamin 1-like 10059 yes 1 DNM3 dynamin 3
26052 yes 1 DOCK5 dedicator of cytokinesis 5 80005 yes 1 DOCK7 dedicator of cytokinesis 7 85440 yes 1 DOCK8 dedicator of cytokinesis 8 81704 yes 1 DOCK9 dedicator of cytokinesis 9 23348 yes 1 DOK6 docking protein 6 220164 yes 1 DOK7 docking protein 7 285489 yes 1 DOLPP1 dolichyl pyrophosphate phosphatase 1 57171 yes 1 DOPEY1 dopey family member 1 23033 yes 1 DOPEY2 dopey family member 2 9980 yes 1 DPAGT1 dolichyl-phosphate (UDP-N-acetylglucosamine) N-acetylglucosaminephosphotransferase 1 (GlcNAc-1-P 1798 yes 1 DPF3 D4, zinc and double PHD fingers, family 3 8110 yes 1 DPH5 DPH5 homolog (S. cerevisiae) 51611 yes 1 DPP6 dipeptidyl-peptidase 6 1804 yes 1 DPP8 dipeptidyl-peptidase 8 54878 yes 1 DPPA3 developmental pluripotency associated 3 359787 yes 1 DPT dermatopontin 1805 yes 1 DPY19L2 dpy-19-like 2 (C. elegans) 283417 yes 1 DPY19L3 dpy-19-like 3 (C. elegans) 147991 yes 1 DPYS dihydropyrimidinase 1807 yes 1 DPYSL5 dihydropyrimidinase-like 5 56896 yes 1 DR1 down-regulator of transcription 1, TBP-binding (negative cofactor 2) 1810 yes 1 DRAM2 DNA-damage regulated autophagy modulator 2 128338 yes 1 DRD5 dopamine receptor D5 1816 yes 1 DSC1 desmocollin 1 1823 yes 1 DSC2 desmocollin 2 1824 yes 1 DSCR6 Down syndrome critical region gene 6 53820 yes 1 DSCR8 Down syndrome critical region gene 8 84677 yes 1 DSE dermatan sulfate epimerase 29940 yes 1 DSG4 desmoglein 4 147409 yes 1 DST dystonin 667 yes 1 DSTYK dual serine/threonine and tyrosine protein kinase 25778 yes 1 DTHD1 FLJ16686 protein 401124 yes 1 DTWD1 DTW domain containing 1 56986 yes 1 DTX3 deltex homolog 3 (Drosophila) 196403 yes 1 DUSP10 dual specificity phosphatase 10 11221 yes 1 DUSP11 dual specificity phosphatase 11 (RNA/RNP complex 1-interacting) 8446 yes 1 DUSP16 dual specificity phosphatase 16 80824 yes 1 DUSP19 dual specificity phosphatase 19 142679 yes 1 DUSP27 dual specificity phosphatase 27 (putative) 92235 yes 1 DUSP3 dual specificity phosphatase 3 1845 yes 1 DYNLL1 dynein, light chain, LC8-type 1 8655
yes 1 DYRK1A dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 1A 1859 yes 1 E2F6 E2F transcription factor 6 1876 yes 1 E2F7 E2F transcription factor 7 144455 yes 1 EAF2 ELL associated factor 2 55840 yes 1

EBF1 early B-cell factor 1 1879 yes 1 EBF3 early B-cell factor 3 253738 yes 1 EDDM3A family with sequence similarity 12, member A 10876 yes 1 EDDM3B family with sequence similarity 12, member B (epididymal) 64184 yes 1 EDEM3 ER degradation enhancer, mannosidase alpha-like 3 80267 yes 1 EDNRA endothelin receptor type A 1909 yes 1 EDNRB endothelin receptor type B 1910 yes 1 EEF1E1 eukaryotic translation elongation factor 1 epsilon 1 9521 yes 1 EFCAB2 EF-hand calcium binding domain 2 84288 yes 1 EFCAB6 EF-hand calcium binding domain 6 64800 yes 1 EFHC1 EF-hand domain (C-terminal) containing 1 114327 yes 1 EFR3A EFR3 homolog A (S. cerevisiae) 23167 yes 1 EGFL6 EGF-like-domain, multiple 6 25975 yes 1 EHMT1 euchromatic histone-lysine N-methyltransferase 1 79813 yes 1 EHMT2 euchromatic histone-lysine N-methyltransferase 2 10919 yes 1 EIF1AY eukaryotic translation initiation factor 1A, Y-linked 9086 yes 1 EIF2AK1 eukaryotic translation initiation factor 2-alpha kinase 1 27102 yes 1 EIF2AK2 eukaryotic translation initiation factor 2-alpha kinase 2 5610 yes 1 EIF2C2 eukaryotic translation initiation factor 2C, 2 27161 yes 1 EIF4ENIF1 eukaryotic translation initiation factor 4E nuclear import factor 1 56478 yes 1 ELAVL2 ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2 (Hu antigen B) 1993 yes 1 ELAVL3 ELAV (embryonic lethal, abnormal vision, Drosophila)-like 3 (Hu antigen C) 1995 yes 1 ELF1 E74-like factor 1 (ets domain transcription factor) 1997 yes 1 ELF2 E74-like factor 2 (ets domain transcription factor) 1998 yes 1 ELF5 E74-like factor 5 (ets domain transcription factor) 2001 yes 1 ELK3 ELK3, ETS-domain protein (SRF accessory protein 2) 2004 yes 1 ELK4 ELK4, ETS-domain protein (SRF accessory protein 1) 2005 yes 1 ELL elongation factor RNA polymerase II 8178 yes 1 ELMOD2 ELMO/CED-12 domain containing 2 255520 yes 1 ELOVL6 ELOVL family member 6, elongation of long chain fatty acids (FEN1/Elo2, SUR4/Elo3-like, yeast) 79071 yes 1
ELP2 elongation protein 2 homolog (S. cerevisiae) 55250 yes 1 EMB embigin homolog (mouse) 133418 yes 1 EMCN endomucin 51705 yes 1 EMG1 EMG1 nucleolar protein homolog (S. cerevisiae) 10436 yes 1 EML6 echinoderm microtubule associated protein like 6 400954 yes 1 ENGASE endo-beta-N-acetylglucosaminidase 64772 yes 1 ENKUR enkurin, TRPC channel interacting protein 219670 yes 1 ENPEP glutamyl aminopeptidase (aminopeptidase A) 2028 yes 1 ENPP2 ectonucleotide pyrophosphatase/phosphodiesterase 2 5168 yes 1 ENPP3 ectonucleotide pyrophosphatase/phosphodiesterase 3 5169 yes 1 ENSA endosulfine alpha 2029 yes 1 ENTPD2 ectonucleoside triphosphate diphosphohydrolase 2 954 yes 1 ENTPD5 ectonucleoside triphosphate diphosphohydrolase 5 957 yes 1 EP400 E1A binding protein p400 57634 yes 1 EPAS1 endothelial PAS domain protein 1 2034 yes 1 EPB41L4A erythrocyte membrane protein band 4.1 like 4A 64097 yes 1 EPB41L4B erythrocyte membrane protein band 4.1 like 4B 54566 yes 1 EPC2 enhancer of polycomb homolog 2 (Drosophila) 26122 yes 1 EPHA5 EPH receptor A5 2044 yes 1 EPHA6 EPH receptor A6 285220 yes 1 EPHA7 EPH receptor A7 2045 yes 1 EPHB1 EPH receptor B1 2047 yes 1 EPM2A epilepsy, progressive myoclonus type 2A, Lafora disease (laforin) 7957 yes 1 EPT1 selenoprotein I 85465 yes 1 ERAP1 endoplasmic reticulum aminopeptidase 1 51752 yes 1 ERBB4 v-erb-a erythroblastic leukemia viral oncogene homolog 4 (avian) 2066 yes 1 ERCC6 excision repair cross-complementing rodent repair deficiency, complementation group 6 2074 yes 1 ERCC6L excision repair cross-complementing rodent repair deficiency, complementation group 6-like 54821 yes 1 ERCC8 excision repair cross-complementing rodent repair deficiency, complementation group 8 1161 yes 1 EREG epiregulin 2069 yes 1 ERGIC2 ERGIC and golgi 2 51290 yes 1 ERLEC1 chromosome 2 open reading frame 30 27248 yes 1 ERLIN1 ER lipid raft associated 1 10613 yes 1 ERMP1 endoplasmic reticulum metallopeptidase 1 79956 yes 1 ESCO2 establishment of cohesion 1
homolog 2 (S. cerevisiae) 157570 yes 1 ESRP2 epithelial splicing regulatory protein 2 80004 yes 1 ESRRG estrogen-related receptor gamma 2104 yes 1 ESYT3 family with sequence similarity 62 (C2 domain containing), member C 83850 yes 1 ETAA1 Ewing tumor-associated antigen 1 54465 yes 1 ETFB electron-transfer-flavoprotein, beta polypeptide 2109 yes 1 ETNK2 ethanolamine kinase 2 55224 yes 1 EVI2A ecotropic viral integration site 2A 2123 yes 1 EVI5 ecotropic viral integration site 5 7813 yes 1 EXD1 exonuclease 3'-5' domain containing 1 161829 yes 1 EXOC4 exocyst complex component 4 60412 yes 1 EXOC5 exocyst complex component 5 10640 yes 1 EXOC6 exocyst complex component 6 54536 yes 1 EXOG endo/exonuclease (5'-3'), endonuclease G-like 9941 yes 1 EXOSC10 exosome component 10 5394 yes 1 EXOSC9 exosome component 9 5393 yes 1 EXPH5 exophilin 5 23086 yes 1 EXT1 exostoses (multiple) 1 2131 yes 1

EXT2 exostoses (multiple) 2 2132 yes 1 F11 coagulation factor XI 2160 yes 1 F13A1 coagulation factor XIII, A1 polypeptide 2162 yes 1 F2R coagulation factor II (thrombin) receptor 2149 yes 1 F2RL1 coagulation factor II (thrombin) receptor-like 1 2150 yes 1 F9 coagulation factor IX 2158 yes 1 FABP2 fatty acid binding protein 2, intestinal 2169 yes 1 FABP4 fatty acid binding protein 4, adipocyte 2167 yes 1 FADD Fas (TNFRSF6)-associated via death domain 8772 yes 1 FAIM Fas apoptotic inhibitory molecule 55179 yes 1 FAM102B family with sequence similarity 102, member B 284611 yes 1 FAM105B family with sequence similarity 105, member B 90268 yes 1 FAM118A family with sequence similarity 118, member A 55007 yes 1 FAM118B family with sequence similarity 118, member B 79607 yes 1 FAM119A family with sequence similarity 119, member A 151194 yes 1 FAM120A family with sequence similarity 120A 23196 yes 1 FAM120C family with sequence similarity 120C 54954 yes 1 FAM122A family with sequence similarity 122A 116224 yes 1 FAM122B family with sequence similarity 122B 159090 yes 1 FAM133A family with sequence similarity 133, member A 286499 yes 1 FAM134B family with sequence similarity 134, member B 54463 yes 1 FAM134C family with sequence similarity 134, member C 162427 yes 1 FAM135A family with sequence similarity 135, member A 57579 yes 1 FAM135B family with sequence similarity 135, member B 51059 yes 1 FAM13A family with sequence similarity 13, member A 10144 yes 1 FAM13AOS FAM13A opposite strand (non-protein coding) 285512 yes 1 FAM159A family with sequence similarity 159, member A 348378 yes 1 FAM161A family with sequence similarity 161, member A 84140 yes 1 FAM164A family with sequence similarity 164, member A 51101 yes 1 FAM170B family with sequence similarity 170, member B 170370 yes 1 FAM173A family with sequence similarity 173, member A 65990 yes 1 FAM174A family with sequence similarity 174, member A 345757 yes 1 FAM179B family
with sequence similarity 179, member B 23116 yes 1 FAM180A family with sequence similarity 180, member A 389558 yes 1 FAM182A family with sequence similarity 182, member A 284800 yes 1 FAM19A2 family with sequence similarity 19 (chemokine (C-C motif)-like), member A2 338811 yes 1 FAM22D family with sequence similarity 22, member D 728130 yes 1 FAM35B family with sequence similarity 35, member B 414241 yes 1 FAM35B2 family with sequence similarity 35, member B2 439965 yes 1 FAM38B family with sequence similarity 38, member B 63895 yes 1 FAM40B family with sequence similarity 40, member B 57464 yes 1 FAM46A family with sequence similarity 46, member A 55603 yes 1 FAM47E family with sequence similarity 47, member E 100129583 yes 1 FAM53C family with sequence similarity 53, member C 51307 yes 1 FAM55A family with sequence similarity 55, member A 120400 yes 1 FAM55C family with sequence similarity 55, member C 91775 yes 1 FAM70B family with sequence similarity 70, member B 348013 yes 1 FAM71D family with sequence similarity 71, member D 161142 yes 1 FAM71E1 family with sequence similarity 71, member E1 112703 yes 1 FAM76A family with sequence similarity 76, member A 199870 yes 1 FAM84B family with sequence similarity 84, member B 157638 yes 1 FAM92A3 family with sequence similarity 92, member A3 403315 yes 1 FAM92B family with sequence similarity 92, member B 339145 yes 1 FAM95B1 family with sequence similarity 95, member B1 100133036 yes 1 FAM9A family with sequence similarity 9, member A 171482 yes 1 FAM9B family with sequence similarity 9, member B 171483 yes 1 FANCC Fanconi anemia, complementation group C 2176 yes 1 FANCL Fanconi anemia, complementation group L 55120 yes 1 FANCM Fanconi anemia, complementation group M 57697 yes 1 FAP fibroblast activation protein, alpha 2191 yes 1 FAR1 fatty acyl CoA reductase 1 84188 yes 1 FASTKD2 FAST kinase domains 2 22868 yes 1 FBXL13 F-box and leucine-rich repeat protein 13 222235 yes 1 FBXL18 F-box and leucine-rich repeat
protein 18 80028 yes 1 FBXL2 F-box and leucine-rich repeat protein 2 25827 yes 1 FBXL22 F-box and leucine-rich repeat protein 22 283807 yes 1 FBXL3 F-box and leucine-rich repeat protein 3 26224 yes 1 FBXL5 F-box and leucine-rich repeat protein 5 26234 yes 1 FBXO18 F-box protein, helicase, 18 84893 yes 1 FBXO28 F-box protein 28 23219 yes 1 FBXO33 F-box protein 33 254170 yes 1 FBXO36 F-box protein 36 130888 yes 1 FBXO4 F-box protein 4 26272 yes 1 FBXO8 F-box protein 8 26269 yes 1 FBXO9 F-box protein 9 26268 yes 1 FBXW7 F-box and WD repeat domain containing 7 55294 yes 1 FCHO1 FCH domain only 1 23149 yes 1 FCRL4 Fc receptor-like 4 83417 yes 1 FCRLA Fc receptor-like A 84824 yes 1 FER fer (fps/fes related) tyrosine kinase 2241 yes 1 FERMT2 fermitin family homolog 2 (Drosophila) 10979 yes 1 FFAR2 free fatty acid receptor 2 2867 yes 1

FGB fibrinogen beta chain 2244 yes 1 FGF23 fibroblast growth factor 23 8074 yes 1 FGFR1 fibroblast growth factor receptor 1 2260 yes 1 FGFR1OP FGFR1 oncogene partner 11116 yes 1 FGL1 fibrinogen-like 1 2267 yes 1 FHAD1 forkhead-associated (FHA) phosphopeptide binding domain 1 114827 yes 1 FHL3 four and a half LIM domains 3 2275 yes 1 FIG4 FIG4 homolog (S. cerevisiae) 9896 yes 1 FIGF c-fos induced growth factor (vascular endothelial growth factor D) 2277 yes 1 FILIP1 filamin A interacting protein 1 27145 yes 1 FILIP1L filamin A interacting protein 1-like 11259 yes 1 FKBP7 FK506 binding protein 7 51661 yes 1 FKBP9 FK506 binding protein 9, 63 kDa 11328 yes 1 FKSG83 FKSG83 83954 yes 1 FKTN fukutin 2218 yes 1 FLJ32810 Rho-type GTPase-activating protein FLJ32810 143872 yes 1 FLJ40330 hypothetical LOC645784 645784 yes 1 FLJ40504 hypothetical protein FLJ40504 284085 yes 1 FLJ41603 FLJ41603 protein 389337 yes 1 FLJ43390 hypothetical LOC646113 646113 yes 1 FLJ43859 FAM75-like protein FLJ43859 389761 yes 1 FLJ43950 FAM75-like protein FLJ46321 pseudogene 347127 yes 1 FLJ44082 FAM75-like protein FLJ44082 389762 yes 1 FLOT1 flotillin 1 10211 yes 1 FLT1 fms-related tyrosine kinase 1 (vascular endothelial growth factor/vascular permeability factor receptor) 2321 yes 1 FMN2 formin 2 56776 yes 1 FNBP1L formin binding protein 1-like 54874 yes 1 FNIP1 folliculin interacting protein 1 96459 yes 1 FOS v-fos FBJ murine osteosarcoma viral oncogene homolog 2353 yes 1 FOSL2 FOS-like antigen 2 2355 yes 1 FOXA1 forkhead box A1 3169 yes 1 FOXA2 forkhead box A2 3170 yes 1 FOXD4 forkhead box D4 2298 yes 1 FOXD4L1 forkhead box D4-like 1 200350 yes 1 FOXD4L3 forkhead box D4-like 3 286380 yes 1 FOXD4L6 forkhead box D4-like 6 653404 yes 1 FREM2 FRAS1 related extracellular matrix protein 2 341640 yes 1 FRMD4A FERM domain containing 4A 55691 yes 1 FRMD8 FERM domain containing 8 83786 yes 1 FSD1L fibronectin type III and SPRY domain containing 1-like 83856 yes 1
FTSJD1 FtsJ methyltransferase domain containing 1 55783 yes 1 FUBP1 far upstream element (FUSE) binding protein 1 8880 yes 1 FUBP3 far upstream element (FUSE) binding protein 3 8939 yes 1 FUCA2 fucosidase, alpha-L- 2, plasma 2519 yes 1 FUT1 fucosyltransferase 1 (galactoside 2-alpha-L-fucosyltransferase, H blood group) 2523 yes 1 FUT10 fucosyltransferase 10 (alpha (1,3) fucosyltransferase) 84750 yes 1 FUT8 fucosyltransferase 8 (alpha (1,6) fucosyltransferase) 2530 yes 1 FYCO1 FYVE and coiled-coil domain containing 1 79443 yes 1 G3BP2 GTPase activating protein (SH3 domain) binding protein 2 9908 yes 1 G6PC glucose-6-phosphatase, catalytic subunit 2538 yes 1 GAB4 GRB2-associated binding protein family, member 4 128954 yes 1 GABPA GA binding protein transcription factor, alpha subunit 60kDa 2551 yes 1 GABRA1 gamma-aminobutyric acid (GABA) A receptor, alpha 1 2554 yes 1 GABRB1 gamma-aminobutyric acid (GABA) A receptor, beta 1 2560 yes 1 GABRE gamma-aminobutyric acid (GABA) A receptor, epsilon 2564 yes 1 GABRP gamma-aminobutyric acid (GABA) A receptor, pi 2568 yes 1 GALC galactosylceramidase 2581 yes 1 GALK2 galactokinase 2 2585 yes 1 GALNT10 UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 10 (GalNAc-T10) 55568 yes 1 GALNT11 UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 11 (GalNAc-T11) 63917 yes 1 GALNT3 UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 3 (GalNAc-T3) 2591 yes 1 GAPVD1 GTPase activating protein and VPS9 domains 1 26130 yes 1 GAS2 growth arrest-specific 2 2620 yes 1 GATA3 GATA binding protein 3 2625 yes 1 GATA4 GATA binding protein 4 2626 yes 1 GATA6 GATA binding protein 6 2627 yes 1 GATM glycine amidinotransferase (L-arginine:glycine amidinotransferase) 2628 yes 1 GC group-specific component (vitamin D binding protein) 2638 yes 1 GCC2 GRIP and coiled-coil domain containing 2 9648 yes 1 GCFC1 chromosome 21 open reading frame 66 94104 yes 1 GCM1 glial cells
missing homolog 1 (Drosophila) 8521 yes 1 GCNT2 glucosaminyl (N-acetyl) transferase 2, I-branching enzyme (I blood group) 2651 yes 1 GDA guanine deaminase 9615 yes 1 GDAP2 ganglioside induced differentiation associated protein 2 54834 yes 1 GDEP gene differentially expressed in prostate 118425 yes 1 GDF11 growth differentiation factor 11 10220 yes 1 GDF15 growth differentiation factor 15 9518 yes 1 GDPD5 glycerophosphodiester phosphodiesterase domain containing 5 81544 yes 1 GEM GTP binding protein overexpressed in skeletal muscle 2669 yes 1 GEMIN6 gem (nuclear organelle) associated protein 6 79833 yes 1 GFI1 growth factor independent 1 transcription repressor 2672 yes 1 GFM2 G elongation factor, mitochondrial 2 84340 yes 1


GGA2 golgi associated, gamma adaptin ear containing, ARF binding protein 2 23062 yes 1
GHRLOS ghrelin opposite strand (non-protein coding) 100126793 yes 1
GIF gastric intrinsic factor (vitamin B synthesis) 2694 yes 1
GIMAP6 GTPase, IMAP family member 6 474344 yes 1
GIN1 gypsy retrotransposon integrase 1 54826 yes 1
GJA3 gap junction protein, alpha 3, 46kDa 2700 yes 1
GLB1L3 galactosidase, beta 1-like 3 112937 yes 1
GLDC glycine dehydrogenase (decarboxylating) 2731 yes 1
GLIPR1L1 GLI pathogenesis-related 1 like 1 256710 yes 1
GLO1 glyoxalase I 2739 yes 1
GLRX3 glutaredoxin 3 10539 yes 1
GLS glutaminase 2744 yes 1
GLT1D1 glycosyltransferase 1 domain containing 1 144423 yes 1
GM2A GM2 ganglioside activator 2760 yes 1
GMFB glia maturation factor, beta 2764 yes 1
GNA13 guanine nucleotide binding protein (G protein), alpha 13 10672 yes 1
GNA14 guanine nucleotide binding protein (G protein), alpha 14 9630 yes 1
GNAI3 guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 3 2773 yes 1
GNAS GNAS complex locus 2778 yes 1
GNAZ guanine nucleotide binding protein (G protein), alpha z polypeptide 2781 yes 1
GNB5 guanine nucleotide binding protein (G protein), beta 5 10681 yes 1
GNL3L guanine nucleotide binding protein-like 3 (nucleolar)-like 54552 yes 1
GNPDA2 glucosamine-6-phosphate deaminase 2 132789 yes 1
GNPNAT1 glucosamine-phosphate N-acetyltransferase 1 64841 yes 1
GNPTG N-acetylglucosamine-1-phosphate transferase, gamma subunit 84572 yes 1
GOLGA7B golgi autoantigen, golgin subfamily a, 7B 401647 yes 1
GOLGB1 golgin B1, golgi integral membrane protein 2804 yes 1
GOLIM4 golgi integral membrane protein 4 27333 yes 1
GORAB golgin, RAB6-interacting 92344 yes 1
GPATCH2 G patch domain containing 2 55105 yes 1
GPBP1 GC-rich promoter binding protein 1 65056 yes 1
GPBP1L1 GC-rich promoter binding protein 1-like 1 60313 yes 1
GPC4 glypican 4 2239 yes 1
GPCPD1 hypothetical protein KIAA1434 56261 yes 1
GPHN gephyrin 10243 yes 1
GPM6A glycoprotein M6A 2823 yes 1
GPR107 G protein-coupled receptor 107 57720 yes 1
GPR112 G protein-coupled receptor 112 139378 yes 1
GPR113 G protein-coupled receptor 113 165082 yes 1
GPR126 G protein-coupled receptor 126 57211 yes 1
GPR137B G protein-coupled receptor 137B 7107 yes 1
GPR171 G protein-coupled receptor 171 29909 yes 1
GPR18 G protein-coupled receptor 18 2841 yes 1
GPR180 G protein-coupled receptor 180 160897 yes 1
GPR22 G protein-coupled receptor 22 2845 yes 1
GPR64 G protein-coupled receptor 64 10149 yes 1
GPR75 G protein-coupled receptor 75 10936 yes 1
GPR98 G protein-coupled receptor 98 84059 yes 1
GPSM2 G-protein signaling modulator 2 (AGS3-like, C. elegans) 29899 yes 1
GPX8 glutathione peroxidase 8 (putative) 493869 yes 1
GRAMD1C GRAM domain containing 1C 54762 yes 1
GRB7 growth factor receptor-bound protein 7 2886 yes 1
GREB1 GREB1 protein 9687 yes 1
GRHL1 grainyhead-like 1 (Drosophila) 29841 yes 1
GRIA3 glutamate receptor, ionotrophic, AMPA 3 2892 yes 1
GRIA4 glutamate receptor, ionotrophic, AMPA 4 2893 yes 1
GRID1 glutamate receptor, ionotropic, delta 1 2894 yes 1
GRIK1 glutamate receptor, ionotropic, kainate 1 2897 yes 1
GRIN2A glutamate receptor, ionotropic, N-methyl D-aspartate 2A 2903 yes 1
GRIN2B glutamate receptor, ionotropic, N-methyl D-aspartate 2B 2904 yes 1
GRIN3A glutamate receptor, ionotropic, N-methyl-D-aspartate 3A 116443 yes 1
GRM1 glutamate receptor, metabotropic 1 2911 yes 1
GRM5 glutamate receptor, metabotropic 5 2915 yes 1
GRM6 glutamate receptor, metabotropic 6 2916 yes 1
GRM8 glutamate receptor, metabotropic 8 2918 yes 1
GRTP1 growth hormone regulated TBC protein 1 79774 yes 1
GSTA1 glutathione S-transferase alpha 1 2938 yes 1
GSTK1 glutathione S-transferase kappa 1 373156 yes 1
GSTM3 glutathione S-transferase mu 3 (brain) 2947 yes 1
GSTO1 glutathione S-transferase omega 1 9446 yes 1
GTF2B general transcription factor IIB 2959 yes 1
GTF2H3 general transcription factor IIH, polypeptide 3, 34kDa 2967 yes 1
GTF2H4 general transcription factor IIH, polypeptide 4, 52kDa 2968 yes 1
GTF2IRD1 GTF2I repeat domain containing 1 9569 yes 1
GTF3C3 general transcription factor IIIC, polypeptide 3, 102kDa 9330 yes 1
GTF3C6 general transcription factor IIIC, polypeptide 6, alpha 35kDa 112495 yes 1
GTSE1 G-2 and S-phase expressed 1 51512 yes 1
GUCY1B3 guanylate cyclase 1, soluble, beta 3 2983 yes 1
GULP1 GULP, engulfment adaptor PTB domain containing 1 51454 yes 1
GXYLT1 glycosyltransferase 8 domain containing 3 283464 yes 1
H2AFY H2A histone family, member Y 9555 yes 1
H2BFWT H2B histone family, member W, testis-specific 158983 yes 1


HAND2 heart and neural crest derivatives expressed 2 9464 yes 1
HAO1 hydroxyacid oxidase (glycolate oxidase) 1 54363 yes 1
HAP1 huntingtin-associated protein 1 9001 yes 1
HAS2 hyaluronan synthase 2 3037 yes 1
HAUS6 HAUS augmin-like complex, subunit 6 54801 yes 1
HBD hemoglobin, delta 3045 yes 1
HCFC2 host cell factor C2 29915 yes 1
HCG26 HLA complex group 26 352961 yes 1
HCG2P7 HLA complex group 2 pseudogene 7 80867 yes 1
HCRTR1 hypocretin (orexin) receptor 1 3061 yes 1
HDAC6 histone deacetylase 6 10013 yes 1
HEBP1 heme binding protein 1 50865 yes 1
HELB helicase (DNA) B 92797 yes 1
HELQ helicase, POLQ-like 113510 yes 1
HELZ helicase with zinc finger 9931 yes 1
HEMK1 HemK methyltransferase family member 1 51409 yes 1
HERC1 hect (homologous to the E6-AP (UBE3A) carboxyl terminus) domain and RCC1 (CHC1)-like domain 8925 yes 1
HERC2 hect domain and RLD 2 8924 yes 1
HHEX hematopoietically expressed homeobox 3087 yes 1
HIAT1 hippocampus abundant transcript 1 64645 yes 1
HIATL1 hippocampus abundant transcript-like 1 84641 yes 1
HIBADH 3-hydroxyisobutyrate dehydrogenase 11112 yes 1
HIF1A hypoxia inducible factor 1, alpha subunit (basic helix-loop-helix transcription factor) 3091 yes 1
HIST1H2BD histone cluster 1, H2bd 3017 yes 1
HIST1H2BK histone cluster 1, H2bk 85236 yes 1
HIST2H2BF histone cluster 2, H2bf 440689 yes 1
HIVEP2 human immunodeficiency virus type I enhancer binding protein 2 3097 yes 1
HLA-DPA1 major histocompatibility complex, class II, DP alpha 1 3113 yes 1
HLA-DQA2 major histocompatibility complex, class II, DQ alpha 2 3118 yes 1
HMBOX1 homeobox containing 1 79618 yes 1
HMGA2 high mobility group AT-hook 2 8091 yes 1
HMGCLL1 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase-like 1 54511 yes 1
HMGXB4 HMG box domain containing 4 10042 yes 1
HNF4A hepatocyte nuclear factor 4, alpha 3172 yes 1
HNF4G hepatocyte nuclear factor 4, gamma 3174 yes 1
HNMT histamine N-methyltransferase 3176 yes 1
HNRNPC heterogeneous nuclear ribonucleoprotein C (C1/C2) 3183 yes 1
HNRNPM heterogeneous nuclear ribonucleoprotein M 4670 yes 1
HOOK2 hook homolog 2 (Drosophila) 29911 yes 1
HOXA1 homeobox A1 3198 yes 1
HOXC5 homeobox C5 3222 yes 1
HOXC6 homeobox C6 3223 yes 1
HOXD13 homeobox D13 3239 yes 1
HPGD hydroxyprostaglandin dehydrogenase 15-(NAD) 3248 yes 1
HPS3 Hermansky-Pudlak syndrome 3 84343 yes 1
HPS4 Hermansky-Pudlak syndrome 4 89781 yes 1
HRASLS HRAS-like suppressor 57110 yes 1
HS3ST1 heparan sulfate (glucosamine) 3-O-sulfotransferase 1 9957 yes 1
HSD17B13 hydroxysteroid (17-beta) dehydrogenase 13 345275 yes 1
HSD17B4 hydroxysteroid (17-beta) dehydrogenase 4 3295 yes 1
HSD3B1 hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta-isomerase 1 3283 yes 1
HSF2 heat shock transcription factor 2 3298 yes 1
HSF5 heat shock transcription factor family member 5 124535 yes 1
HSP90AB2P heat shock protein 90kDa alpha (cytosolic), class B member 2 (pseudogene) 391634 yes 1
HSPA2 heat shock 70kDa protein 2 3306 yes 1
HSPB11 heat shock protein family B (small), member 11 51668 yes 1
HSPE1 heat shock 10kDa protein 1 (chaperonin 10) 3336 yes 1
HTATIP2 HIV-1 Tat interactive protein 2, 30kDa 10553 yes 1
HTN1 histatin 1 3346 yes 1
HTN3 histatin 3 3347 yes 1
HTR2A 5-hydroxytryptamine (serotonin) receptor 2A 3356 yes 1
HTR2C 5-hydroxytryptamine (serotonin) receptor 2C 3358 yes 1
HVCN1 hydrogen voltage-gated channel 1 84329 yes 1
HYAL4 hyaluronoglucosaminidase 4 23553 yes 1
IARS isoleucyl-tRNA synthetase 3376 yes 1
IBTK inhibitor of Bruton agammaglobulinemia tyrosine kinase 25998 yes 1
IDE insulin-degrading enzyme 3416 yes 1
IER5 immediate early response 5 51278 yes 1
IFIT3 interferon-induced protein with tetratricopeptide repeats 3 3437 yes 1
IFLTD1 intermediate filament tail domain containing 1 160492 yes 1
IFNA1 interferon, alpha 1 3439 yes 1
IFNA10 interferon, alpha 10 3446 yes 1
IFNAR1 interferon (alpha, beta and omega) receptor 1 3454 yes 1
IFNGR2 interferon gamma receptor 2 (interferon gamma transducer 1) 3460 yes 1
IFNWP18 interferon, omega 18 (pseudogene) 360001 yes 1
IFRD1 interferon-related developmental regulator 1 3475 yes 1
IFT52 intraflagellar transport 52 homolog (Chlamydomonas) 51098 yes 1
IGF1 insulin-like growth factor 1 (somatomedin C) 3479 yes 1
IGF2BP1 insulin-like growth factor 2 mRNA binding protein 1 10642 yes 1
IGF2BP2 insulin-like growth factor 2 mRNA binding protein 2 10644 yes 1
IGF2BP3 insulin-like growth factor 2 mRNA binding protein 3 10643 yes 1
IGHV3-21 immunoglobulin heavy variable 3-21 28444 yes 1


IGLV7-43 immunoglobulin lambda variable 7-43 28776 yes 1
IGSF1 immunoglobulin superfamily, member 1 3547 yes 1
IGSF3 immunoglobulin superfamily, member 3 3321 yes 1
IKBIP IKK interacting protein 121457 yes 1
IKZF1 IKAROS family zinc finger 1 (Ikaros) 10320 yes 1
IKZF5 IKAROS family zinc finger 5 (Pegasus) 64376 yes 1
IL10 interleukin 10 3586 yes 1
IL12B interleukin 12B (natural killer cell stimulatory factor 2, cytotoxic lymphocyte maturation factor 2, p40) 3593 yes 1
IL1B interleukin 1, beta 3553 yes 1
IL1F9 interleukin 1 family, member 9 56300 yes 1
IL1R1 interleukin 1 receptor, type I 3554 yes 1
IL1RAP interleukin 1 receptor accessory protein 3556 yes 1
IL2 interleukin 2 3558 yes 1
IL22RA1 interleukin 22 receptor, alpha 1 58985 yes 1
IL24 interleukin 24 11009 yes 1
IL28RA interleukin 28 receptor, alpha (interferon, lambda receptor) 163702 yes 1
IL33 interleukin 33 90865 yes 1
IL6 interleukin 6 (interferon, beta 2) 3569 yes 1
IL8 interleukin 8 3576 yes 1
ILF2 interleukin enhancer binding factor 2, 45kDa 3608 yes 1
IMMP2L IMP2 inner mitochondrial membrane peptidase-like (S. cerevisiae) 83943 yes 1
IMP4 IMP4, U3 small nucleolar ribonucleoprotein, homolog (yeast) 92856 yes 1
IMPA1 inositol(myo)-1(or 4)-monophosphatase 1 3612 yes 1
IMPG1 interphotoreceptor matrix proteoglycan 1 3617 yes 1
INA internexin neuronal intermediate filament protein, alpha 9118 yes 1
INSR insulin receptor 3643 yes 1
INTU inturned planar cell polarity effector homolog (Drosophila) 27152 yes 1
IPO7 importin 7 10527 yes 1
IPO8 importin 8 10526 yes 1
IPO9 importin 9 55705 yes 1
IPPK inositol 1,3,4,5,6-pentakisphosphate 2-kinase 64768 yes 1
IQCF3 IQ motif containing F3 401067 yes 1
IQGAP1 IQ motif containing GTPase activating protein 1 8826 yes 1
IQUB IQ motif and ubiquitin domain containing 154865 yes 1
IRAK1BP1 interleukin-1 receptor-associated kinase 1 binding protein 1 134728 yes 1
IRAK3 interleukin-1 receptor-associated kinase 3 11213 yes 1
IRS1 insulin receptor substrate 1 3667 yes 1
ISCU iron-sulfur cluster scaffold homolog (E. coli) 23479 yes 1
ISPD notch1-induced protein 729920 yes 1
ITGA1 integrin, alpha 1 3672 yes 1
ITGB5 integrin, beta 5 3693 yes 1
ITGB8 integrin, beta 8 3696 yes 1
ITIH4 inter-alpha (globulin) inhibitor H4 (plasma Kallikrein-sensitive glycoprotein) 3700 yes 1
ITPR2 inositol 1,4,5-triphosphate receptor, type 2 3709 yes 1
ITPRIP inositol 1,4,5-triphosphate receptor interacting protein 85450 yes 1
JAKMIP2 Janus kinase and microtubule interacting protein 2 9832 yes 1
JMJD1C jumonji domain containing 1C 221037 yes 1
KAL1 Kallmann syndrome 1 sequence 3730 yes 1
KALRN kalirin, RhoGEF kinase 8997 yes 1
KANK2 KN motif and ankyrin repeat domains 2 25959 yes 1
KAT2B K(lysine) acetyltransferase 2B 8850 yes 1
KATNAL1 katanin p60 subunit A-like 1 84056 yes 1
KBTBD11 kelch repeat and BTB (POZ) domain containing 11 9920 yes 1
KBTBD2 kelch repeat and BTB (POZ) domain containing 2 25948 yes 1
KBTBD4 kelch repeat and BTB (POZ) domain containing 4 55709 yes 1
KBTBD8 kelch repeat and BTB (POZ) domain containing 8 84541 yes 1
KCNA1 potassium voltage-gated channel, shaker-related subfamily, member 1 (episodic ataxia with myokymia) 3736 yes 1
KCNC2 potassium voltage-gated channel, Shaw-related subfamily, member 2 3747 yes 1
KCNJ10 potassium inwardly-rectifying channel, subfamily J, member 10 3766 yes 1
KCNJ14 potassium inwardly-rectifying channel, subfamily J, member 14 3770 yes 1
KCNJ16 potassium inwardly-rectifying channel, subfamily J, member 16 3773 yes 1
KCNJ2 potassium inwardly-rectifying channel, subfamily J, member 2 3759 yes 1
KCNMB2 potassium large conductance calcium-activated channel, subfamily M, beta member 2 10242 yes 1
KCNN1 potassium intermediate/small conductance calcium-activated channel, subfamily N, member 1 3780 yes 1
KCNN2 potassium intermediate/small conductance calcium-activated channel, subfamily N, member 2 3781 yes 1
KCNQ5 potassium voltage-gated channel, KQT-like subfamily, member 5 56479 yes 1
KCNV1 potassium channel, subfamily V, member 1 27012 yes 1
KCTD15 potassium channel tetramerisation domain containing 15 79047 yes 1
KCTD18 potassium channel tetramerisation domain containing 18 130535 yes 1
KCTD8 potassium channel tetramerisation domain containing 8 386617 yes 1
KDELR2 KDEL (Lys-Asp-Glu-Leu) endoplasmic reticulum protein retention receptor 2 11014 yes 1
KDM1B amine oxidase (flavin containing) domain 1 221656 yes 1
KDM3B lysine (K)-specific demethylase 3B 51780 yes 1
KDM6A lysine (K)-specific demethylase 6A 7403 yes 1
KGFLP1 keratinocyte growth factor-like protein 1 387628 yes 1
KHDRBS2 KH domain containing, RNA binding, signal transduction associated 2 202559 yes 1
KHDRBS3 KH domain containing, RNA binding, signal transduction associated 3 10656 yes 1
KIAA0141 KIAA0141 9812 yes 1
KIAA0146 KIAA0146 23514 yes 1
KIAA0196 KIAA0196 9897 yes 1
KIAA0240 KIAA0240 23506 yes 1
KIAA0319 KIAA0319 9856 yes 1


KIAA0494 KIAA0494 9813 yes 1
KIAA0562 KIAA0562 9731 yes 1
KIAA0564 KIAA0564 23078 yes 1
KIAA0586 KIAA0586 9786 yes 1
KIAA0831 KIAA0831 22863 yes 1
KIAA0907 KIAA0907 22889 yes 1
KIAA1012 KIAA1012 22878 yes 1
KIAA1024 KIAA1024 23251 yes 1
KIAA1107 KIAA1107 23285 yes 1
KIAA1191 KIAA1191 57179 yes 1
KIAA1211 KIAA1211 57482 yes 1
KIAA1267 KIAA1267 284058 yes 1
KIAA1407 KIAA1407 57577 yes 1
KIAA1614 KIAA1614 57710 yes 1
KIAA1715 KIAA1715 80856 yes 1
KIAA1826 KIAA1826 84437 yes 1
KIF13A kinesin family member 13A 63971 yes 1
KIF13B kinesin family member 13B 23303 yes 1
KIF14 kinesin family member 14 9928 yes 1
KIF15 kinesin family member 15 56992 yes 1
KIF18A kinesin family member 18A 81930 yes 1
KIF2C kinesin family member 2C 11004 yes 1
KIF5A kinesin family member 5A 3798 yes 1
KIF5B kinesin family member 5B 3799 yes 1
KIF5C kinesin family member 5C 3800 yes 1
KIN KIN, antigenic determinant of recA protein homolog (mouse) 22944 yes 1
KL klotho 9365 yes 1
KLF11 Kruppel-like factor 11 8462 yes 1
KLF17 Kruppel-like factor 17 128209 yes 1
KLF6 Kruppel-like factor 6 1316 yes 1
KLF9 Kruppel-like factor 9 687 yes 1
KLHDC10 kelch domain containing 10 23008 yes 1
KLHL14 kelch-like 14 (Drosophila) 57565 yes 1
KLHL15 kelch-like 15 (Drosophila) 80311 yes 1
KLHL18 kelch-like 18 (Drosophila) 23276 yes 1
KLHL28 kelch-like 28 (Drosophila) 54813 yes 1
KLHL6 kelch-like 6 (Drosophila) 89857 yes 1
KLHL8 kelch-like 8 (Drosophila) 57563 yes 1
KLK10 kallikrein-related peptidase 10 5655 yes 1
KLK9 kallikrein-related peptidase 9 284366 yes 1
KLKBL4 plasma kallikrein-like protein 4 221191 yes 1
KLRC2 killer cell lectin-like receptor subfamily C, member 2 3822 yes 1
KNG1 kininogen 1 3827 yes 1
KPNA1 karyopherin alpha 1 (importin alpha 5) 3836 yes 1
KPNA3 karyopherin alpha 3 (importin alpha 4) 3839 yes 1
KRBA1 KRAB-A domain containing 1 84626 yes 1
KRCC1 lysine-rich coiled-coil 1 51315 yes 1
KRIT1 KRIT1, ankyrin repeat containing 889 yes 1
KRR1 KRR1, small subunit (SSU) processome component, homolog (yeast) 11103 yes 1
KRT31 keratin 31 3881 yes 1
KRT35 keratin 35 3886 yes 1
KRT80 keratin 80 144501 yes 1
KRT82 keratin 82 3888 yes 1
KRTAP20-1 keratin associated protein 20-1 337975 yes 1
KY kyphoscoliosis peptidase 339855 yes 1
L3MBTL3 l(3)mbt-like 3 (Drosophila) 84456 yes 1
LACE1 lactation elevated 1 246269 yes 1
LAMA1 laminin, alpha 1 284217 yes 1
LAMC1 laminin, gamma 1 (formerly LAMB2) 3915 yes 1
LARP1B La ribonucleoprotein domain family, member 1B 55132 yes 1
LARP4 La ribonucleoprotein domain family, member 4 113251 yes 1
LARP4B La ribonucleoprotein domain family, member 4B 23185 yes 1
LASS3 LAG1 homolog, ceramide synthase 3 204219 yes 1
LATS2 LATS, large tumor suppressor, homolog 2 (Drosophila) 26524 yes 1
LBR lamin B receptor 3930 yes 1
LCA5 Leber congenital amaurosis 5 167691 yes 1
LCA5L Leber congenital amaurosis 5-like 150082 yes 1
LCP1 lymphocyte cytosolic protein 1 (L-plastin) 3936 yes 1
LDB2 LIM domain binding 2 9079 yes 1
LDHAL6B lactate dehydrogenase A-like 6B 92483 yes 1
LDHD lactate dehydrogenase D 197257 yes 1
LECT2 leukocyte cell-derived chemotaxin 2 3950 yes 1
LEMD3 LEM domain containing 3 23592 yes 1
LEPREL1 leprecan-like 1 55214 yes 1
LEPROT leptin receptor overlapping transcript 54741 yes 1
LETMD1 LETM1 domain containing 1 25875 yes 1
LHX6 LIM homeobox 6 26468 yes 1
LILRA1 leukocyte immunoglobulin-like receptor, subfamily A (with TM domain), member 1 11024 yes 1
LILRA2 leukocyte immunoglobulin-like receptor, subfamily A (with TM domain), member 2 11027 yes 1
LILRA5 leukocyte immunoglobulin-like receptor, subfamily A (with TM domain), member 5 353514 yes 1
LILRA6 leukocyte immunoglobulin-like receptor, subfamily A (with TM domain), member 6 79168 yes 1
LILRB1 leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 1 10859 yes 1


LILRB4 leukocyte immunoglobulin-like receptor, subfamily B (with TM and ITIM domains), member 4 11006 yes 1
LIMD2 LIM domain containing 2 80774 yes 1
LIN54 lin-54 homolog (C. elegans) 132660 yes 1
LIN7A lin-7 homolog A (C. elegans) 8825 yes 1
LIPA lipase A, lysosomal acid, cholesterol esterase 3988 yes 1
LMBR1 limb region 1 homolog (mouse) 64327 yes 1
LMBRD1 LMBR1 domain containing 1 55788 yes 1
LMO7 LIM domain 7 4008 yes 1
LOC100128554 hypothetical protein LOC100128554 100128554 yes 1
LOC100129361 similar to mCG115122 100129361 yes 1
LOC100130238 hypothetical LOC100130238 100130238 yes 1
LOC100130522 similar to hCG1996578 100130522 yes 1
LOC100144603 hypothetical transcript 100144603 yes 1
LOC121838 hypothetical LOC121838 121838 yes 1
LOC121952 hypothetical LOC121952 121952 yes 1
LOC145474 hypothetical protein LOC145474 145474 yes 1
LOC145820 hypothetical protein LOC145820 145820 yes 1
LOC149620 CHIA-like pseudogene 149620 yes 1
LOC158376 hypothetical LOC158376 158376 yes 1
LOC158572 hypothetical LOC158572 158572 yes 1
LOC162632 TL132 pseudogene 162632 yes 1
LOC253039 hypothetical LOC253039 253039 yes 1
LOC284551 hypothetical LOC284551 284551 yes 1
LOC285375 hypothetical LOC285375 285375 yes 1
LOC285456 hypothetical LOC285456 285456 yes 1
LOC285954 hypothetical LOC285954 285954 yes 1
LOC286094 hypothetical LOC286094 286094 yes 1
LOC338667 hypothetical protein LOC338667 338667 yes 1
LOC390956 similar to TRIM5/cyclophilin A fusion protein 390956 yes 1
LOC399815 chromosome 10 open reading frame 88 pseudogene 399815 yes 1
LOC401127 WD repeat domain 5 pseudogene 401127 yes 1
LOC415056 hypothetical LOC415056 415056 yes 1
LOC440925 hypothetical LOC440925 440925 yes 1
LOC641367 cyclin Y-like pseudogene 641367 yes 1
LOC643763 hypothetical LOC643763 643763 yes 1
LOC647121 embigin homolog (mouse) pseudogene 647121 yes 1
LOC647288 CTAGE family, member 5 pseudogene 647288 yes 1
LOC648740 ACTB pseudogene 648740 yes 1
LOC728715 similar to hCG38149 728715 yes 1
LOC728819 hCG1645220 728819 yes 1
LOC728832 similar to Uncharacterized protein FLJ76381 728832 yes 1
LOC729603 calcium binding protein P22 pseudogene 729603 yes 1
LOC90246 hypothetical protein LOC90246 90246 yes 1
LOC96610 BMS1 homolog, ribosome assembly protein (yeast) pseudogene 96610 yes 1
LONP2 lon peptidase 2, peroxisomal 83752 yes 1
LOX lysyl oxidase 4015 yes 1
LPA lipoprotein, Lp(a) 4018 yes 1
LPAR1 lysophosphatidic acid receptor 1 1902 yes 1
LPL lipoprotein lipase 4023 yes 1
LPPR4 plasticity related gene 1 9890 yes 1
LPXN leupaxin 9404 yes 1
LRCH2 leucine-rich repeats and calponin homology (CH) domain containing 2 57631 yes 1
LRFN3 leucine rich repeat and fibronectin type III domain containing 3 79414 yes 1
LRFN5 leucine rich repeat and fibronectin type III domain containing 5 145581 yes 1
LRMP lymphoid-restricted membrane protein 4033 yes 1
LRP12 low density lipoprotein-related protein 12 29967 yes 1
LRP1B low density lipoprotein-related protein 1B (deleted in tumors) 53353 yes 1
LRP2 low density lipoprotein-related protein 2 4036 yes 1
LRP4 low density lipoprotein receptor-related protein 4 4038 yes 1
LRPAP1 low density lipoprotein receptor-related protein associated protein 1 4043 yes 1
LRRC17 leucine rich repeat containing 17 10234 yes 1
LRRC28 leucine rich repeat containing 28 123355 yes 1
LRRC36 leucine rich repeat containing 36 55282 yes 1
LRRC3B leucine rich repeat containing 3B 116135 yes 1
LRRC48 leucine rich repeat containing 48 83450 yes 1
LRRC8C leucine rich repeat containing 8 family, member C 84230 yes 1
LRRC8D leucine rich repeat containing 8 family, member D 55144 yes 1
LRRIQ1 leucine-rich repeats and IQ motif containing 1 84125 yes 1
LRRN1 leucine rich repeat neuronal 1 57633 yes 1
LRRTM1 leucine rich repeat transmembrane neuronal 1 347730 yes 1
LRRTM4 leucine rich repeat transmembrane neuronal 4 80059 yes 1
LRTOMT leucine rich transmembrane and O-methyltransferase domain containing 220074 yes 1
LSR lipolysis stimulated lipoprotein receptor 51599 yes 1
LUC7L3 cisplatin resistance-associated overexpressed protein 51747 yes 1
LYPD6 LY6/PLAUR domain containing 6 130574 yes 1
LYSMD3 LysM, putative peptidoglycan-binding, domain containing 3 116068 yes 1
MACROD2 MACRO domain containing 2 140733 yes 1
MAEL maelstrom homolog (Drosophila) 84944 yes 1
MAGEB16 melanoma antigen family B, 16 139604 yes 1
MAGI2 membrane associated guanylate kinase, WW and PDZ domain containing 2 9863 yes 1
MAGOHB mago-nashi homolog B (Drosophila) 55110 yes 1
MAK male germ cell-associated kinase 4117 yes 1


MALL mal, T-cell differentiation protein-like 7851 yes 1
MAML3 mastermind-like 3 (Drosophila) 55534 yes 1
MAN1A1 mannosidase, alpha, class 1A, member 1 4121 yes 1
MAN1A2 mannosidase, alpha, class 1A, member 2 10905 yes 1
MANEA mannosidase, endo-alpha 79694 yes 1
MAP1B microtubule-associated protein 1B 4131 yes 1
MAP2K1 mitogen-activated protein kinase kinase 1 5604 yes 1 yes
MAP2K6 mitogen-activated protein kinase kinase 6 5608 yes 1
MAP3K9 mitogen-activated protein kinase kinase kinase 9 4293 yes 1
MAP4K4 mitogen-activated protein kinase kinase kinase kinase 4 9448 yes 1
MAP9 microtubule-associated protein 9 79884 yes 1
MAPK1 mitogen-activated protein kinase 1 5594 yes 1 yes
MAPK14 mitogen-activated protein kinase 14 1432 yes 1 yes
MAPK8 mitogen-activated protein kinase 8 5599 yes 1
MAPK9 mitogen-activated protein kinase 9 5601 yes 1
MAPKAPK5 mitogen-activated protein kinase-activated protein kinase 5 8550 yes 1
MAPRE2 microtubule-associated protein, RP/EB family, member 2 10982 yes 1
MAPT microtubule-associated protein tau 4137 yes 1
MARCH7 membrane-associated ring finger (C3HC4) 7 64844 yes 1
MARVELD2 MARVEL domain containing 2 153562 yes 1
MASTL microtubule associated serine/threonine kinase-like 84930 yes 1
MAT1A methionine adenosyltransferase I, alpha 4143 yes 1
MATR3 matrin 3 9782 yes 1
MAX MYC associated factor X 4149 yes 1
MBNL1 muscleblind-like (Drosophila) 4154 yes 1
MBNL2 muscleblind-like 2 (Drosophila) 10150 yes 1
MBP myelin basic protein 4155 yes 1
MBTD1 mbt domain containing 1 54799 yes 1
MCART6 mitochondrial carrier triple repeat 6 401612 yes 1
MCC mutated in colorectal cancers 4163 yes 1
MCCC1 methylcrotonoyl-Coenzyme A carboxylase 1 (alpha) 56922 yes 1
MCFD2 multiple coagulation factor deficiency 2 90411 yes 1
MCM2 minichromosome maintenance complex component 2 4171 yes 1
MCM9 minichromosome maintenance complex component 9 254394 yes 1
MCOLN3 mucolipin 3 55283 yes 1
MCTS1 malignant T cell amplified sequence 1 28985 yes 1
MDM1 Mdm1 nuclear protein homolog (mouse) 56890 yes 1
MDM4 Mdm4 p53 binding protein homolog (mouse) 4194 yes 1
MDN1 MDN1, midasin homolog (yeast) 23195 yes 1
ME2 malic enzyme 2, NAD(+)-dependent, mitochondrial 4200 yes 1
MECOM ecotropic viral integration site 1 2122 yes 1
MED13L mediator complex subunit 13-like 23389 yes 1
MED14 mediator complex subunit 14 9282 yes 1
MED28 mediator complex subunit 28 80306 yes 1
MED30 mediator complex subunit 30 90390 yes 1
MEIS1 Meis homeobox 1 4211 yes 1
MEIS3P1 Meis homeobox 3 pseudogene 1 4213 yes 1
METTL14 methyltransferase like 14 57721 yes 1
MEX3A mex-3 homolog A (C. elegans) 92312 yes 1
MFAP5 microfibrillar associated protein 5 8076 yes 1
MFGE8 milk fat globule-EGF factor 8 protein 4240 yes 1
MFN1 mitofusin 1 55669 yes 1
MFN2 mitofusin 2 9927 yes 1
MFSD10 major facilitator superfamily domain containing 10 10227 yes 1
MFSD6 major facilitator superfamily domain containing 6 54842 yes 1
MGAT3 mannosyl (beta-1,4-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase 4248 yes 1
MGC2752 hypothetical LOC65996 65996 yes 1
MGC4473 hypothetical LOC79100 79100 yes 1
MGST1 microsomal glutathione S-transferase 1 4257 yes 1
MICALCL MICAL C-terminal like 84953 yes 1
MIER3 mesoderm induction early response 1, family member 3 166968 yes 1
MINPP1 multiple inositol polyphosphate histidine phosphatase, 1 9562 yes 1
MIPEP mitochondrial intermediate peptidase 4285 yes 1
MIPOL1 mirror-image polydactyly 1 145282 yes 1
MKL2 MKL/myocardin-like 2 57496 yes 1
MKRN3 makorin ring finger protein 3 7681 yes 1
MKX mohawk homeobox 283078 yes 1
MLKL mixed lineage kinase domain-like 197259 yes 1
MLL3 myeloid/lymphoid or mixed-lineage leukemia 3 58508 yes 1
MLLT10 myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 10 8028 yes 1
MLLT3 myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 3 4300 yes 1
MMAA methylmalonic aciduria (cobalamin deficiency) cblA type 166785 yes 1
MME membrane metallo-endopeptidase 4311 yes 1
MMP1 matrix metallopeptidase 1 (interstitial collagenase) 4312 yes 1
MMP19 matrix metallopeptidase 19 4327 yes 1
MMRN2 multimerin 2 79812 yes 1
MNS1 meiosis-specific nuclear structural 1 55329 yes 1
MOBKL1A MOB1, Mps One Binder kinase activator-like 1A (yeast) 92597 yes 1
MOBP myelin-associated oligodendrocyte basic protein 4336 yes 1
MORC3 MORC family CW-type zinc finger 3 23515 yes 1
MOSC1 MOCO sulphurase C-terminal domain containing 1 64757 yes 1
MOXD1 monooxygenase, DBH-like 1 26002 yes 1


MPI mannose phosphate isomerase 4351 yes 1
MPPED2 metallophosphoesterase domain containing 2 744 yes 1
MPZ myelin protein zero 4359 yes 1
MRAP melanocortin 2 receptor accessory protein 56246 yes 1
MRFAP1L1 Morf4 family associated protein 1-like 1 114932 yes 1
MRP63 mitochondrial ribosomal protein 63 78988 yes 1
MRPL17 mitochondrial ribosomal protein L17 63875 yes 1
MRPL22 mitochondrial ribosomal protein L22 29093 yes 1
MRPL42 mitochondrial ribosomal protein L42 28977 yes 1
MRPL46 mitochondrial ribosomal protein L46 26589 yes 1
MRPL9 mitochondrial ribosomal protein L9 65005 yes 1
MRPS11 mitochondrial ribosomal protein S11 64963 yes 1
MRPS22 mitochondrial ribosomal protein S22 56945 yes 1
MRPS30 mitochondrial ribosomal protein S30 10884 yes 1
MRPS36 mitochondrial ribosomal protein S36 92259 yes 1
MRRF mitochondrial ribosome recycling factor 92399 yes 1
MS4A1 membrane-spanning 4-domains, subfamily A, member 1 931 yes 1
MS4A14 membrane-spanning 4-domains, subfamily A, member 14 84689 yes 1
MSL1 male-specific lethal 1 homolog (Drosophila) 339287 yes 1
MSL2 male-specific lethal 2 homolog (Drosophila) 55167 yes 1
MSL3 male-specific lethal 3 homolog (Drosophila) 10943 yes 1
MSR1 macrophage scavenger receptor 1 4481 yes 1
MSTN myostatin 2660 yes 1
MT1IP metallothionein 1I (pseudogene) 644314 yes 1
MTAP methylthioadenosine phosphorylase 4507 yes 1
MTCH1 mitochondrial carrier homolog 1 (C. elegans) 23787 yes 1
MTFR1 mitochondrial fission regulator 1 9650 yes 1
MTM1 myotubularin 1 4534 yes 1
MTMR1 myotubularin related protein 1 8776 yes 1
MTMR15 myotubularin related protein 15 22909 yes 1
MTMR7 myotubularin related protein 7 9108 yes 1
MTMR9 myotubularin related protein 9 66036 yes 1
MTPAP mitochondrial poly(A) polymerase 55149 yes 1
MTUS2 KIAA0774 23281 yes 1
MUC15 mucin 15, cell surface associated 143662 yes 1
MUC17 mucin 17, cell surface associated 140453 yes 1
MUC7 mucin 7, secreted 4589 yes 1
MUDENG MU-2/AP1M2 domain containing, death-inducing 55745 yes 1
MUM1 melanoma associated antigen (mutated) 1 84939 yes 1
MXI1 MAX interactor 1 4601 yes 1
MYCN v-myc myelocytomatosis viral related oncogene, neuroblastoma derived (avian) 4613 yes 1
MYEF2 myelin expression factor 2 50804 yes 1
MYEOV myeloma overexpressed (in a subset of t(11;14) positive multiple myelomas) 26579 yes 1
MYF5 myogenic factor 5 4617 yes 1
MYH15 myosin, heavy chain 15 22989 yes 1
MYL12A myosin, light chain 12A, regulatory, non-sarcomeric 10627 yes 1
MYLIP myosin regulatory light chain interacting protein 29116 yes 1
MYO3A myosin IIIA 53904 yes 1
MYO9A myosin IXA 4649 yes 1
MYOM3 myomesin family, member 3 127294 yes 1
MYSM1 Myb-like, SWIRM and MPN domains 1 114803 yes 1
NAA16 NMDA receptor regulated 1-like 79612 yes 1
NAA25 chromosome 12 open reading frame 30 80018 yes 1
NAE1 NEDD8 activating enzyme E1 subunit 1 8883 yes 1
NAIP NLR family, apoptosis inhibitory protein 4671 yes 1
NALCN sodium leak channel, non-selective 259232 yes 1
NAMPT nicotinamide phosphoribosyltransferase 10135 yes 1
NAPB N-ethylmaleimide-sensitive factor attachment protein, beta 63908 yes 1
NAPG N-ethylmaleimide-sensitive factor attachment protein, gamma 8774 yes 1
NARG2 NMDA receptor regulated 2 79664 yes 1
NAV2 neuron navigator 2 89797 yes 1
NBEAL1 neurobeachin-like 1 65065 yes 1
NBLA00301 Nbla00301 79804 yes 1
NBR1 neighbor of BRCA1 gene 1 4077 yes 1
NCAPH non-SMC condensin I complex, subunit H 23397 yes 1
NCEH1 arylacetamide deacetylase-like 1 57552 yes 1
NCF4 neutrophil cytosolic factor 4, 40kDa 4689 yes 1
NCKAP1 NCK-associated protein 1 10787 yes 1
NCOA1 nuclear receptor coactivator 1 8648 yes 1
NCOA7 nuclear receptor coactivator 7 135112 yes 1
NCOR1 nuclear receptor co-repressor 1 9611 yes 1
NCR1 natural cytotoxicity triggering receptor 1 9437 yes 1
NCRNA00119 non-protein coding RNA 119 348808 yes 1
NCRNA00167 non-protein coding RNA 167 440072 yes 1
NDE1 nudE nuclear distribution gene E homolog 1 (A. nidulans) 54820 yes 1
NDFIP1 Nedd4 family interacting protein 1 80762 yes 1
NDP Norrie disease (pseudoglioma) 4693 yes 1
NDRG2 NDRG family member 2 57447 yes 1
NDST1 N-deacetylase/N-sulfotransferase (heparan glucosaminyl) 1 3340 yes 1
NDST3 N-deacetylase/N-sulfotransferase (heparan glucosaminyl) 3 9348 yes 1
NDUFAB1 NADH dehydrogenase (ubiquinone) 1, alpha/beta subcomplex, 1, 8kDa 4706 yes 1
NEB nebulin 4703 yes 1


NEFL neurofilament, light polypeptide 4747 yes 1
NEIL1 nei endonuclease VIII-like 1 (E. coli) 79661 yes 1
NETO1 neuropilin (NRP) and tolloid (TLL)-like 1 81832 yes 1
NEUROD4 neurogenic differentiation 4 58158 yes 1
NF1 neurofibromin 1 4763 yes 1
NF2 neurofibromin 2 (merlin) 4771 yes 1
NFAT5 nuclear factor of activated T-cells 5, tonicity-responsive 10725 yes 1 yes
NFE2L2 nuclear factor (erythroid-derived 2)-like 2 4780 yes 1
NFIB nuclear factor I/B 4781 yes 1
NFIX nuclear factor I/X (CCAAT-binding transcription factor) 4784 yes 1
NFKBIZ nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, zeta 64332 yes 1
NGRN neugrin, neurite outgrowth associated 51335 yes 1
NHLH1 nescient helix loop helix 1 4807 yes 1
NHLH2 nescient helix loop helix 2 4808 yes 1
NHLRC2 NHL repeat containing 2 374354 yes 1
NID2 nidogen 2 (osteonidogen) 22795 yes 1
NIPAL1 NIPA-like domain containing 1 152519 yes 1
NIPBL Nipped-B homolog (Drosophila) 25836 yes 1
NIPSNAP3B nipsnap homolog 3B (C. elegans) 55335 yes 1
NKD1 naked cuticle homolog 1 (Drosophila) 85407 yes 1
NKX2-1 NK2 homeobox 1 7080 yes 1
NLGN1 neuroligin 1 22871 yes 1
NLN neurolysin (metallopeptidase M3 family) 57486 yes 1
NME7 non-metastatic cells 7, protein expressed in (nucleoside-diphosphate kinase) 29922 yes 1
NMT1 N-myristoyltransferase 1 4836 yes 1
NOLC1 nucleolar and coiled-body phosphoprotein 1 9221 yes 1
NOM1 nucleolar protein with MIF4G domain 1 64434 yes 1
NOTCH2NL Notch homolog 2 (Drosophila) N-terminal like 388677 yes 1
NOV nephroblastoma overexpressed gene 4856 yes 1
NOVA1 neuro-oncological ventral antigen 1 4857 yes 1
NOVA2 neuro-oncological ventral antigen 2 4858 yes 1
NOX5 NADPH oxidase, EF-hand calcium binding domain 5 79400 yes 1
NPAT nuclear protein, ataxia-telangiectasia locus 4863 yes 1
NPHP1 nephronophthisis 1 (juvenile) 4867 yes 1
NPL N-acetylneuraminate pyruvate lyase (dihydrodipicolinate synthase) 80896 yes 1
NPR2 natriuretic peptide receptor B/guanylate cyclase B (atrionatriuretic peptide receptor B) 4882 yes 1
NPY6R neuropeptide Y receptor Y6 (pseudogene) 4888 yes 1
NR1D1 nuclear receptor subfamily 1, group D, member 1 9572 yes 1
NR1D2 nuclear receptor subfamily 1, group D, member 2 9975 yes 1
NR1H4 nuclear receptor subfamily 1, group H, member 4 9971 yes 1
NR2C1 nuclear receptor subfamily 2, group C, member 1 7181 yes 1
NR3C1 nuclear receptor subfamily 3, group C, member 1 (glucocorticoid receptor) 2908 yes 1
NR4A1 nuclear receptor subfamily 4, group A, member 1 3164 yes 1
NRAP nebulin-related anchoring protein 4892 yes 1
NRAS neuroblastoma RAS viral (v-ras) oncogene homolog 4893 yes 1 yes
NRBP2 nuclear receptor binding protein 2 340371 yes 1
NRIP1 nuclear receptor interacting protein 1 8204 yes 1
NRIP3 nuclear receptor interacting protein 3 56675 yes 1
NRP1 neuropilin 1 8829 yes 1
NRSN1 neurensin 1 140767 yes 1
NRSN2 neurensin 2 80023 yes 1
NSMCE4A non-SMC element 4 homolog A (S. cerevisiae) 54780 yes 1
NT5C3 5'-nucleotidase, cytosolic III 51251 yes 1
NTM neurotrimin 50863 yes 1
NTNG1 netrin G1 22854 yes 1
NTRK2 neurotrophic tyrosine kinase, receptor, type 2 4915 yes 1
NTRK3 neurotrophic tyrosine kinase, receptor, type 3 4916 yes 1
NUAK1 NUAK family, SNF1-like kinase, 1 9891 yes 1
NUB1 negative regulator of ubiquitin-like proteins 1 51667 yes 1
NUBPL nucleotide binding protein-like 80224 yes 1
NUCB2 nucleobindin 2 4925 yes 1
NUCKS1 nuclear casein kinase and cyclin-dependent kinase substrate 1 64710 yes 1
NUF2 NUF2, NDC80 kinetochore complex component, homolog (S. cerevisiae) 83540 yes 1
NUFIP1 nuclear fragile X mental retardation protein interacting protein 1 26747 yes 1
NUP160 nucleoporin 160kDa 23279 yes 1
NUP188 nucleoporin 188kDa 23511 yes 1
NUP205 nucleoporin 205kDa 23165 yes 1
NUP35 nucleoporin 35kDa 129401 yes 1
NXPH1 neurexophilin 1 30010 yes 1
NXT2 nuclear transport factor 2-like export factor 2 55916 yes 1
OAS2 2'-5'-oligoadenylate synthetase 2, 69/71kDa 4939 yes 1
OAS3 2'-5'-oligoadenylate synthetase 3, 100kDa 4940 yes 1
OBP2B odorant binding protein 2B 29989 yes 1
OCRL oculocerebrorenal syndrome of Lowe 4952 yes 1
ODZ3 odz, odd Oz/ten-m homolog 3 (Drosophila) 55714 yes 1
ODZ4 odz, odd Oz/ten-m homolog 4 (Drosophila) 26011 yes 1
OFCC1 orofacial cleft 1 candidate 1 266553 yes 1
OFD1 oral-facial-digital syndrome 1 8481 yes 1
OGG1 8-oxoguanine DNA glycosylase 4968 yes 1
OGT O-linked N-acetylglucosamine (GlcNAc) transferase (UDP-N-acetylglucosamine:polypeptide-N- 8473 yes 1
OIT3 oncoprotein induced transcript 3 170392 yes 1
OLFML1 olfactomedin-like 1 283298 yes 1


Symbol Gene name Entrez MR TS µT DB MC # KEGG OMG oligodendrocyte myelin glycoprotein 4974 yes 1 OPHN1 oligophrenin 1 4983 yes 1 OR10H2 olfactory receptor, family 10, subfamily H, member 2 26538 yes 1 OR10X1 olfactory receptor, family 10, subfamily X, member 1 128367 yes 1 OR13A1 olfactory receptor, family 13, subfamily A, member 1 79290 yes 1 OR1M1 olfactory receptor, family 1, subfamily M, member 1 125963 yes 1 OR2K2 olfactory receptor, family 2, subfamily K, member 2 26248 yes 1 OR6A2 olfactory receptor, family 6, subfamily A, member 2 8590 yes 1 OR6P1 olfactory receptor, family 6, subfamily P, member 1 128366 yes 1 OR7A5 olfactory receptor, family 7, subfamily A, member 5 26659 yes 1 OR9Q1 olfactory receptor, family 9, subfamily Q, member 1 219956 yes 1 ORC3L origin recognition complex, subunit 3-like (yeast) 23595 yes 1 ORC5L origin recognition complex, subunit 5-like (yeast) 5001 yes 1 OSBP oxysterol binding protein 5007 yes 1 OSBPL8 oxysterol binding protein-like 8 114882 yes 1 OSGIN2 oxidative stress induced growth inhibitor family 734 yes 1 OTUD1 OTUmember domain 2 containing 1 220213 yes 1 OTUD4 OTU domain containing 4 54726 yes 1 OXA1L oxidase (cytochrome c) assembly 1-like 5018 yes 1 OXGR1 oxoglutarate (alpha-ketoglutarate) receptor 1 27199 yes 1 OXR1 oxidation resistance 1 55074 yes 1 P2RX7 purinergic receptor P2X, ligand-gated ion channel, 7 5027 yes 1 P2RY14 purinergic receptor P2Y, G-protein coupled, 14 9934 yes 1 PABPC1L poly(A) binding protein, cytoplasmic 1-like 80336 yes 1 PABPC4L poly(A) binding protein, cytoplasmic 4-like 132430 yes 1 PACRGL PARK2 co-regulated-like 133015 yes 1 PAEP progestagen-associated endometrial protein 5047 yes 1 PAFAH1B1 platelet-activating factor acetylhydrolase, isoform Ib, 5048 yes 1 PAGE2 Psubunit antigen 1 (45kDa)family, member 2 (prostate associated) 203569 yes 1 PAGE2B P antigen family, member 2B 389860 yes 1 PAK3 p21 protein (Cdc42/Rac)-activated kinase 3 5063 yes 1 PALMD palmdelphin 54873 yes 1 PAMR1 peptidase 
domain containing associated with muscle 25891 yes 1 PAN3 PAN3regeneration poly(A) 1 specific ribonuclease subunit homolog 255967 yes 1 PAOX (S. polyamine cerevisiae) oxidase (exo-N4-amino) 196743 yes 1 PAPOLB poly(A) polymerase beta (testis specific) 56903 yes 1 PAPPA2 pappalysin 2 60676 yes 1 PAPSS2 3'-phosphoadenosine 5'-phosphosulfate synthase 2 9060 yes 1 PAR1 Prader-Willi/Angelman region-1 145624 yes 1 PARP1 poly (ADP-ribose) polymerase 1 142 yes 1 PARP12 poly (ADP-ribose) polymerase family, member 12 64761 yes 1 PARP16 poly (ADP-ribose) polymerase family, member 16 54956 yes 1 PARP2 poly (ADP-ribose) polymerase 2 10038 yes 1 PARP3 poly (ADP-ribose) polymerase family, member 3 10039 yes 1 PARP9 poly (ADP-ribose) polymerase family, member 9 83666 yes 1 PAR-SN paternally expressed transcript PAR-SN 347746 yes 1 PASK PAS domain containing serine/threonine kinase 23178 yes 1 PATE2 prostate and testis expressed 2 399967 yes 1 PATZ1 POZ (BTB) and AT hook containing zinc finger 1 23598 yes 1 PBX1 pre-B-cell leukemia homeobox 1 5087 yes 1 PCBD1 pterin-4 alpha-carbinolamine 5092 yes 1 PCCA propionyldehydratase/dimerization Coenzyme A carboxylase, cofactor of alphahepatocyte 5095 yes 1 PCDH11Y protocadherinpolypeptide 11 Y-linked 83259 yes 1 PCDH17 protocadherin 17 27253 yes 1 PCDH18 protocadherin 18 54510 yes 1 PCDH19 protocadherin 19 57526 yes 1 PCDHA5 protocadherin alpha 5 56143 yes 1 PCDHA7 protocadherin alpha 7 56141 yes 1 PCDHB18 protocadherin beta 18 pseudogene 54660 yes 1 PCDHB6 protocadherin beta 6 56130 yes 1 PCDHGB7 protocadherin gamma subfamily B, 7 56099 yes 1 PCM1 pericentriolar material 1 5108 yes 1 PCNP PEST proteolytic signal containing nuclear protein 57092 yes 1 PCSK6 proprotein convertase subtilisin/kexin type 6 5046 yes 1 PDCD2 programmed cell death 2 5134 yes 1 PDCD5 programmed cell death 5 9141 yes 1 PDCD6IP programmed cell death 6 interacting protein 10015 yes 1 PDE11A phosphodiesterase 11A 50940 yes 1 PDE1B phosphodiesterase 1B, 
calmodulin-dependent 5153 yes 1
PDE3B phosphodiesterase 3B, cGMP-inhibited 5140 yes 1
PDE6D phosphodiesterase 6D, cGMP-specific, rod, delta 5147 yes 1
PDE9A phosphodiesterase 9A 5152 yes 1
PDGFC platelet derived growth factor C 56034 yes 1
PDHB pyruvate dehydrogenase (lipoamide) beta 5162 yes 1
PDK3 pyruvate dehydrogenase kinase, isozyme 3 5165 yes 1
PDLIM5 PDZ and LIM domain 5 10611 yes 1
PDPK1 3-phosphoinositide dependent protein kinase-1 5170 yes 1
PDS5B PDS5, regulator of cohesion maintenance, homolog B (S. cerevisiae) 23047 yes 1
PDX1 pancreatic and duodenal homeobox 1 3651 yes 1
PDZD2 PDZ domain containing 2 23037 yes 1
PECI peroxisomal D3,D2-enoyl-CoA isomerase 10455 yes 1
PEG10 paternally expressed 10 23089 yes 1


Symbol Gene name Entrez MR TS µT DB MC # KEGG PELI1 pellino homolog 1 (Drosophila) 57162 yes 1 PELI2 pellino homolog 2 (Drosophila) 57161 yes 1 PER1 period homolog 1 (Drosophila) 5187 yes 1 PEX11A peroxisomal biogenesis factor 11 alpha 8800 yes 1 PEX3 peroxisomal biogenesis factor 3 8504 yes 1 PEX5 peroxisomal biogenesis factor 5 5830 yes 1 PEX7 peroxisomal biogenesis factor 7 5191 yes 1 PF4V1 platelet factor 4 variant 1 5197 yes 1 PFAS phosphoribosylformylglycinamidine synthase 5198 yes 1 PFN2 profilin 2 5217 yes 1 PGM2 phosphoglucomutase 2 55276 yes 1 PHACTR4 phosphatase and actin regulator 4 65979 yes 1 PHF12 PHD finger protein 12 57649 yes 1 PHF14 PHD finger protein 14 9678 yes 1 PHF17 PHD finger protein 17 79960 yes 1 PHF20L1 PHD finger protein 20-like 1 51105 yes 1 PHF8 PHD finger protein 8 23133 yes 1 PHIP pleckstrin homology domain interacting protein 55023 yes 1 PHPT1 phosphohistidine phosphatase 1 29085 yes 1 PHTF2 putative homeodomain transcription factor 2 57157 yes 1 PHYHIPL phytanoyl-CoA 2-hydroxylase interacting protein-like 84457 yes 1 PIBF1 progesterone immunomodulatory binding factor 1 10464 yes 1 PIGC phosphatidylinositol glycan anchor biosynthesis, class 5279 yes 1 PIGG phosphatidylinositolC glycan anchor biosynthesis, class 54872 yes 1 PIGZ phosphatidylinositolG glycan anchor biosynthesis, class 80235 yes 1 PIP4K2C phosphatidylinositol-5-phosphateZ 4-kinase, type II, 79837 yes 1 PIP5K1P1 PIP5K1Agamma pseudogene 206426 yes 1 PITPNB phosphatidylinositol transfer protein, beta 23760 yes 1 PJA1 praja ring finger 1 64219 yes 1 PJA2 praja ring finger 2 9867 yes 1 PKIA protein kinase (cAMP-dependent, catalytic) inhibitor 5569 yes 1 PKNOX2 PBX/knottedalpha 1 homeobox 2 63876 yes 1 PKP4 plakophilin 4 8502 yes 1 PLA2G2A phospholipase A2, group IIA (platelets, synovial fluid) 5320 yes 1 yes PLA2G4C phospholipase A2, group IVC (cytosolic, calcium- 8605 yes 1 PLA2G5 phospholipaseindependent) A2, group V 5322 yes 1 yes PLAC1 placenta-specific 1 10761 yes 1 
PLAC8 placenta-specific 8 51316 yes 1
PLB1 phospholipase B1 151056 yes 1
PLCB1 phospholipase C, beta 1 (phosphoinositide-specific) 23236 yes 1
PLCB4 phospholipase C, beta 4 5332 yes 1
PLCH1 phospholipase C, eta 1 23007 yes 1
PLCH2 phospholipase C, eta 2 9651 yes 1
PLCXD3 phosphatidylinositol-specific phospholipase C, X domain containing 3 345557 yes 1
PLD6 phospholipase D family, member 6 201164 yes 1
PLEKHA3 pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 3 65977 yes 1
PLEKHA5 pleckstrin homology domain containing, family A member 5 54477 yes 1
PLEKHA7 pleckstrin homology domain containing, family A member 7 144100 yes 1
PLEKHA8 pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 8 84725 yes 1
PLEKHB2 pleckstrin homology domain containing, family B (evectins) member 2 55041 yes 1
PLEKHG2 pleckstrin homology domain containing, family G (with RhoGef domain) member 2 64857 yes 1
PLEKHO2 pleckstrin homology domain containing, family O member 2 80301 yes 1
PLK1S1 non-protein coding RNA 153 55857 yes 1
PLK4 polo-like kinase 4 (Drosophila) 10733 yes 1
PLS1 plastin 1 (I isoform) 5357 yes 1
PLSCR4 phospholipid scramblase 4 57088 yes 1
PMCHL1 pro-melanin-concentrating hormone-like 1 5369 yes 1
PMCHL2 pro-melanin-concentrating hormone-like 2 5370 yes 1
PMP22 peripheral myelin protein 22 5376 yes 1
PMPCB peptidase (mitochondrial processing) beta 9512 yes 1
PMS1 PMS1 postmeiotic segregation increased 1 (S. cerevisiae) 5378 yes 1
PMS2CL PMS2 C-terminal like pseudogene 441194 yes 1
PMS2L11 postmeiotic segregation increased 2-like 11 pseudogene 441263 yes 1
PNLIPRP3 pancreatic lipase-related protein 3 119548 yes 1
PNMA1 paraneoplastic antigen MA1 9240 yes 1
PNPLA7 patatin-like phospholipase domain containing 7 375775 yes 1
PNPT1 polyribonucleotide nucleotidyltransferase 1 87178 yes 1
PODXL podocalyxin-like 5420 yes 1
POFUT1 protein O-fucosyltransferase 1 23509 yes 1
POLR2A polymerase (RNA) II (DNA directed) polypeptide A, 220kDa 5430 yes 1
POLR2F polymerase (RNA) II (DNA directed) polypeptide F 5435 yes 1
PON2 paraoxonase 2 5445 yes 1
PON3 paraoxonase 3 5446 yes 1
POTEA POTE ankyrin domain family, member A 340441 yes 1
POU2F3 POU class 2 homeobox 3 25833 yes 1
POU3F2 POU class 3 homeobox 2 5454 yes 1
POU4F2 POU class 4 homeobox 2 5458 yes 1
PPARG peroxisome proliferator-activated receptor gamma 5468 yes 1
PPIE peptidylprolyl isomerase E (cyclophilin E) 10450 yes 1
PPIL1 peptidylprolyl isomerase (cyclophilin)-like 1 51645 yes 1
PPIL5 peptidylprolyl isomerase (cyclophilin)-like 5 122769 yes 1
PPIP5K2 histidine acid phosphatase domain containing 1 23262 yes 1


Symbol Gene name Entrez MR TS µT DB MC # KEGG
PPM1A protein phosphatase 1A (formerly 2C), magnesium-dependent, alpha isoform 5494 yes 1
PPM1B protein phosphatase 1B (formerly 2C), magnesium-dependent, beta isoform 5495 yes 1
PPM1D protein phosphatase 1D magnesium-dependent, delta isoform 8493 yes 1
PPM1E protein phosphatase 1E (PP2C domain containing) 22843 yes 1
PPP1R16A protein phosphatase 1, regulatory (inhibitor) subunit 16A 84988 yes 1
PPP1R1A protein phosphatase 1, regulatory (inhibitor) subunit 1A 5502 yes 1
PPP1R1C protein phosphatase 1, regulatory (inhibitor) subunit 1C 151242 yes 1
PPP1R8 protein phosphatase 1, regulatory (inhibitor) subunit 8 5511 yes 1
PPP1R9A protein phosphatase 1, regulatory (inhibitor) subunit 9A 55607 yes 1
PPP2CB protein phosphatase 2 (formerly 2A), catalytic subunit, beta isoform 5516 yes 1
PPP2R2B protein phosphatase 2 (formerly 2A), regulatory subunit B, beta isoform 5521 yes 1
PPP2R3C protein phosphatase 2 (formerly 2A), regulatory subunit B'', gamma 55012 yes 1
PPP2R5E protein phosphatase 2, regulatory subunit B', epsilon isoform 5529 yes 1
PPP3CA protein phosphatase 3 (formerly 2B), catalytic subunit, alpha isoform 5530 yes 1 yes
PPP3CB protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform 5532 yes 1 yes
PPP3CC protein phosphatase 3 (formerly 2B), catalytic subunit, gamma isoform 5533 yes 1 yes
PPP4C protein phosphatase 4 (formerly X), catalytic subunit 5531 yes 1
PPP4R4 protein phosphatase 4, regulatory subunit 4 57718 yes 1
PPTC7 PTC7 protein phosphatase homolog (S.
cerevisiae) 160760 yes 1 PRDM1 PR domain containing 1, with ZNF domain 639 yes 1 PRDM10 PR domain containing 10 56980 yes 1 PRDM2 PR domain containing 2, with ZNF domain 7799 yes 1 PRDM4 PR domain containing 4 11108 yes 1 PREX1 phosphatidylinositol-3,4,5-trisphosphate-dependent 57580 yes 1 PREX2 phosphatidylinositol-3,4,5-trisphosphate-depeRac exchange factor 1 ndent 80243 yes 1 PRG4 proteoglycanRac exchange 4 factor 2 10216 yes 1 PRICKLE1 prickle homolog 1 (Drosophila) 144165 yes 1 PRIM2 primase, DNA, polypeptide 2 (58kDa) 5558 yes 1 PRKAA1 protein kinase, AMP-activated, alpha 1 catalytic 5562 yes 1 PRKACB proteinsubunit kinase, cAMP-dependent, catalytic, beta 5567 yes 1 PRKAR2A protein kinase, cAMP-dependent, regulatory, type II, 5576 yes 1 PRKAR2B proteinalpha kinase, cAMP-dependent, regulatory, type II, 5577 yes 1 PRKCI proteinbeta kinase C, iota 5584 yes 1 PRKCZ protein kinase C, zeta 5590 yes 1 PRKD3 protein kinase D3 23683 yes 1 PRKG1 protein kinase, cGMP-dependent, type I 5592 yes 1 PRND prion protein 2 (dublet) 23627 yes 1 PRO0611 PRO0611 protein 28997 yes 1 PRO0628 PRO0628 protein 29053 yes 1 PROK1 prokineticin 1 84432 yes 1 PRPF19 PRP19/PSO4 pre-mRNA processing factor 19 27339 yes 1 PRPF38B PRP38homolog pre-mRNA ( S. cerevisiae) processing factor 38 (yeast) 55119 yes 1 PRPF39 PRP39domain pre-mRNAcontaining Bprocessing factor 39 homolog (S. 
55015 yes 1 PRPS2 phosphoribosylcerevisiae) pyrophosphate synthetase 2 5634 yes 1 PRSS23 protease, serine, 23 11098 yes 1 PRSS38 marapsin 2 339501 yes 1 PRSS7 protease, serine, 7 (enterokinase) 5651 yes 1 PRUNE2 prune homolog 2 (Drosophila) 158471 yes 1 PSEN1 presenilin 1 5663 yes 1 PSG5 pregnancy specific beta-1-glycoprotein 5 5673 yes 1 PSMB4 proteasome (prosome, macropain) subunit, beta type, 5692 yes 1 PSMB9 proteasome4 (prosome, macropain) subunit, beta type, 5698 yes 1 PSMC6 proteasome9 (large multifunctional (prosome, macropain)peptidase 2) 26S subunit, 5706 yes 1 PSMG2 proteasomeATPase, 6 (prosome, macropain) assembly 56984 yes 1 PTCH1 patchedchaperone homolog 2 1 (Drosophila) 5727 yes 1 PTF1A pancreas specific transcription factor, 1a 256297 yes 1 PTGR2 prostaglandin reductase 2 145482 yes 1 PTP4A1 protein tyrosine phosphatase type IVA, member 1 7803 yes 1 PTPLAD1 protein tyrosine phosphatase-like A domain 51495 yes 1 PTPLB proteincontaining tyrosine 1 phosphatase-like (proline instead of 201562 yes 1 PTPN12 catalytic protein tyrosine arginine), phosphatase, member b non-receptor type 12 5782 yes 1 PTPN22 protein tyrosine phosphatase, non-receptor type 22 26191 yes 1 PTPRB protein(lymphoid) tyrosine phosphatase, receptor type, B 5787 yes 1 PTPRE protein tyrosine phosphatase, receptor type, E 5791 yes 1 PTPRG protein tyrosine phosphatase, receptor type, G 5793 yes 1 PTPRO protein tyrosine phosphatase, receptor type, O 5800 yes 1 PTPRQ protein tyrosine phosphatase, receptor type, Q 374462 yes 1 PTPRT protein tyrosine phosphatase, receptor type, T 11122 yes 1 PTTG1IP pituitary tumor-transforming 1 interacting protein 754 yes 1 PUM1 pumilio homolog 1 (Drosophila) 9698 yes 1 PURG purine-rich element binding protein G 29942 yes 1 PUS10 pseudouridylate synthase 10 150962 yes 1 PWWP2A PWWP domain containing 2A 114825 yes 1 PYCARD PYD and CARD domain containing 29108 yes 1 QKI quaking homolog, KH domain RNA binding (mouse) 9444 yes 1 QTRTD1 queuine 
tRNA-ribosyltransferase domain containing 1 79691 yes 1
RAB11FIP1 RAB11 family interacting protein 1 (class I) 80223 yes 1
RAB17 RAB17, member RAS oncogene family 64284 yes 1
RAB1A RAB1A, member RAS oncogene family 5861 yes 1
RAB23 RAB23, member RAS oncogene family 51715 yes 1
RAB27B RAB27B, member RAS oncogene family 5874 yes 1
RAB30 RAB30, member RAS oncogene family 27314 yes 1


Symbol Gene name Entrez MR TS µT DB MC # KEGG RAB32 RAB32, member RAS oncogene family 10981 yes 1 RAB38 RAB38, member RAS oncogene family 23682 yes 1 RAB3GAP2 RAB3 GTPase activating protein subunit 2 (non- 25782 yes 1 RAB7A RAB7A,catalytic) member RAS oncogene family 7879 yes 1 RABGGTB Rab geranylgeranyltransferase, beta subunit 5876 yes 1 RABL2A RAB, member of RAS oncogene family-like 2A 11159 yes 1 RAD1 RAD1 homolog (S. pombe) 5810 yes 1 RAD21 RAD21 homolog (S. pombe) 5885 yes 1 RAD51 RAD51 homolog (RecA homolog, E. coli) (S. 5888 yes 1 RAD51AP1 RAD51cerevisiae) associated protein 1 10635 yes 1 RAET1E retinoic acid early transcript 1E 135250 yes 1 RALGAPA1 GTPase activating Rap/RanGAP domain-like 1 253959 yes 1 RALGPS2 Ral GEF with PH domain and SH3 binding motif 2 55103 yes 1 RAMP1 receptor (G protein-coupled) activity modifying protein 10267 yes 1 RASA1 RAS1 p21 protein activator (GTPase activating protein) 5921 yes 1 RASSF6 Ras1 association (RalGDS/AF-6) domain family 166824 yes 1 RASSF8 member Ras association 6 (RalGDS/AF-6) domain family (N- 11228 yes 1 RB1CC1 RB1-inducibleterminal) member coiled-coil 8 1 9821 yes 1 RBBP9 retinoblastoma binding protein 9 10741 yes 1 RBM15 RNA binding motif protein 15 64783 yes 1 RBM17 RNA binding motif protein 17 84991 yes 1 RBM23 RNA binding motif protein 23 55147 yes 1 RBM24 RNA binding motif protein 24 221662 yes 1 RBM25 RNA binding motif protein 25 58517 yes 1 RBM26 RNA binding motif protein 26 64062 yes 1 RBM38 RNA binding motif protein 38 55544 yes 1 RBM44 RNA binding motif protein 44 375316 yes 1 RBM45 RNA binding motif protein 45 129831 yes 1 RBM46 RNA binding motif protein 46 166863 yes 1 RBM6 RNA binding motif protein 6 10180 yes 1 RBM9 RNA binding motif protein 9 23543 yes 1 RBMS1 RNA binding motif, single stranded interacting protein 5937 yes 1 RBMXL2 RNA1 binding motif protein, X-linked-like 2 27288 yes 1 RBMY1A3P RNA binding motif protein, Y-linked, family 1, member 286557 yes 1 RBP4 A3 retinol pseudogene 
binding protein 4, plasma 5950 yes 1 RCAN2 regulator of calcineurin 2 10231 yes 1 RDH10 retinol dehydrogenase 10 (all-trans) 157506 yes 1 REEP1 receptor accessory protein 1 65055 yes 1 REEP4 receptor accessory protein 4 80346 yes 1 RELN reelin 5649 yes 1 RET ret proto-oncogene 5979 yes 1 REV3L REV3-like, catalytic subunit of DNA polymerase zeta 5980 yes 1 RFX3 regulatory(yeast) factor X, 3 (influences HLA class II 5991 yes 1 RFX8 hypotheticalexpression) protein LOC731220 731220 yes 1 RG9MTD2 RNA (guanine-9-) methyltransferase domain 93587 yes 1 RGL1 ralcontaining guanine 2 nucleotide dissociation stimulator-like 1 23179 yes 1 RGMB RGM domain family, member B 285704 yes 1 RGN regucalcin (senescence marker protein-30) 9104 yes 1 RGS13 regulator of G-protein signaling 13 6003 yes 1 RGS2 regulator of G-protein signaling 2, 24kDa 5997 yes 1 RGS6 regulator of G-protein signaling 6 9628 yes 1 RGSL1 regulator of G-protein signaling like 1 353299 yes 1 RHAG Rh-associated glycoprotein 6005 yes 1 RHBDL2 rhomboid, veinlet-like 2 (Drosophila) 54933 yes 1 RHBDL3 rhomboid, veinlet-like 3 (Drosophila) 162494 yes 1 RIC3 resistance to inhibitors of cholinesterase 3 homolog 79608 yes 1 RIMKLB ribosomal(C. elegans) modification protein rimK-like family 57494 yes 1 RIOK2 RIOmember kinase B 2 (yeast) 55781 yes 1 RMND5A required for meiotic nuclear division 5 homolog A (S. 
cerevisiae) 64795 yes 1
RMST rhabdomyosarcoma 2 associated transcript (non-protein coding) 196475 yes 1
RNASEL ribonuclease L (2',5'-oligoisoadenylate synthetase-dependent) 6041 yes 1
RND3 Rho family GTPase 3 390 yes 1
RNF135 ring finger protein 135 84282 yes 1
RNF146 ring finger protein 146 81847 yes 1
RNF149 ring finger protein 149 284996 yes 1
RNF160 ring finger protein 160 26046 yes 1
RNF19B ring finger protein 19B 127544 yes 1
RNF212 ring finger protein 212 285498 yes 1
RNF213 ring finger protein 213 57674 yes 1
RNF41 ring finger protein 41 10193 yes 1
RNFT2 ring finger protein, transmembrane 2 84900 yes 1
RNLS renalase, FAD-dependent amine oxidase 55328 yes 1
RNMT RNA (guanine-7-) methyltransferase 8731 yes 1
ROBO2 roundabout, axon guidance receptor, homolog 2 (Drosophila) 6092 yes 1
ROBO3 roundabout, axon guidance receptor, homolog 3 (Drosophila) 64221 yes 1
ROPN1 ropporin, rhophilin associated protein 1 54763 yes 1
RPAP3 RNA polymerase II associated protein 3 79657 yes 1
RPE65 retinal pigment epithelium-specific protein 65kDa 6121 yes 1
RPIA ribose 5-phosphate isomerase A 22934 yes 1
RPL22L1 ribosomal protein L22-like 1 200916 yes 1
RPL37A ribosomal protein L37a 6168 yes 1
RPN1 ribophorin I 6184 yes 1


Symbol Gene name Entrez MR TS µT DB MC # KEGG RPRD1A regulation of nuclear pre-mRNA domain containing 1A 55197 yes 1 RPRD2 regulation of nuclear pre-mRNA domain containing 2 23248 yes 1 RPS27L ribosomal protein S27-like 51065 yes 1 RPS6KB1 ribosomal protein S6 kinase, 70kDa, polypeptide 1 6198 yes 1 RPS6KC1 ribosomal protein S6 kinase, 52kDa, polypeptide 1 26750 yes 1 RPTN repetin 126638 yes 1 RPUSD3 RNA pseudouridylate synthase domain containing 3 285367 yes 1 RRAD Ras-related associated with diabetes 6236 yes 1 RRM2B ribonucleotide reductase M2 B (TP53 inducible) 50484 yes 1 RRP15 ribosomal RNA processing 15 homolog (S. 51018 yes 1 RSPO2 R-spondincerevisiae) 2 homolog (Xenopus laevis) 340419 yes 1 RSPO3 R-spondin 3 homolog (Xenopus laevis) 84870 yes 1 RTKN2 rhotekin 2 219790 yes 1 RUNDC3B RUN domain containing 3B 154661 yes 1 RUNX1 runt-related transcription factor 1 861 yes 1 RUNX1T1 runt-related transcription factor 1; translocated to, 1 862 yes 1 RUNX2 runt-related(cyclin D -related) transcription factor 2 860 yes 1 RUSC1 RUN and SH3 domain containing 1 23623 yes 1 RWDD3 RWD domain containing 3 25950 yes 1 RYBP RING1 and YY1 binding protein 23429 yes 1 RYR2 ryanodine receptor 2 (cardiac) 6262 yes 1 S100A7A S100 calcium binding protein A7A 338324 yes 1 S100PBP S100P binding protein 64766 yes 1 S1PR5 sphingosine-1-phosphate receptor 5 53637 yes 1 SAE1 SUMO1 activating enzyme subunit 1 10055 yes 1 SAFB2 scaffold attachment factor B2 9667 yes 1 SAMD4A sterile alpha motif domain containing 4A 23034 yes 1 SAMD4B sterile alpha motif domain containing 4B 55095 yes 1 SAP30 Sin3A-associated protein, 30kDa 8819 yes 1 SAP30L SAP30-like 79685 yes 1 SASH1 SAM and SH3 domain containing 1 23328 yes 1 SASS6 spindle assembly 6 homolog (C. 
elegans) 163786 yes 1 SBNO2 strawberry notch homolog 2 (Drosophila) 22904 yes 1 SCAND3 SCAN domain containing 3 114821 yes 1 SCAPER S-phase cyclin A-associated protein in the ER 49855 yes 1 SCARA5 scavenger receptor class A, member 5 (putative) 286133 yes 1 SCARB1 scavenger receptor class B, member 1 949 yes 1 SCFD2 sec1 family domain containing 2 152579 yes 1 SCGBL secretoglobin-like 284402 yes 1 SCLT1 sodium channel and clathrin linker 1 132320 yes 1 SCRN1 secernin 1 9805 yes 1 SDF2 stromal cell-derived factor 2 6388 yes 1 SDR16C5 short chain dehydrogenase/reductase family 16C, 195814 yes 1 SDR42E2 member short chain 5 dehydrogenase/reductase family 42E, 100294 yes 1 SEC14L3 member SEC14-like 2 3 (S. cerevisiae) 266629512 yes 1 SEC16B SEC16 homolog B (S. cerevisiae) 89866 yes 1 SEC24D SEC24 family, member D (S. cerevisiae) 9871 yes 1 SEC61A2 Sec61 alpha 2 subunit (S. cerevisiae) 55176 yes 1 SEC63 SEC63 homolog (S. cerevisiae) 11231 yes 1 SEL1L sel-1 suppressor of lin-12-like (C. elegans) 6400 yes 1 SENP3 SUMO1/sentrin/SMT3 specific peptidase 3 26168 yes 1 SENP6 SUMO1/sentrin specific peptidase 6 26054 yes 1 SENP8 SUMO/sentrin specific peptidase family member 8 123228 yes 1 SEP15 15 kDa selenoprotein 9403 yes 1 SEPT2 septin 2 4735 yes 1 SEPT8 septin 8 23176 yes 1 SERHL serine hydrolase-like 94009 yes 1 SERHL2 serine hydrolase-like 2 253190 yes 1 SERPINB3 serpin peptidase inhibitor, clade B (ovalbumin), 6317 yes 1 SERPINB4 serpinmember peptidase 3 inhibitor, clade B (ovalbumin), 6318 yes 1 SERPINE2 serpinmember peptidase 4 inhibitor, clade E (nexin, 5270 yes 1 SERTAD1 SERTAplasminogen domain activator containing inhibi 1tor type 1), member 2 29950 yes 1 SETD3 SET domain containing 3 84193 yes 1 SETD4 SET domain containing 4 54093 yes 1 SETD5 SET domain containing 5 55209 yes 1 SETX senataxin 23064 yes 1 SEZ6L2 seizure related 6 homolog (mouse)-like 2 26470 yes 1 SF3A1 splicing factor 3a, subunit 1, 120kDa 10291 yes 1 SFN stratifin 2810 yes 1 SFRS1 splicing factor, 
arginine/serine-rich 1 6426 yes 1
SFRS11 splicing factor, arginine/serine-rich 11 9295 yes 1
SFRS18 splicing factor, arginine/serine-rich 18 25957 yes 1
SFRS2 splicing factor, arginine/serine-rich 2 6427 yes 1
SFRS2IP splicing factor, arginine/serine-rich 2, interacting protein 9169 yes 1
SFRS4 splicing factor, arginine/serine-rich 4 6429 yes 1
SGK1 serum/glucocorticoid regulated kinase 1 6446 yes 1
SGMS1 sphingomyelin synthase 1 259230 yes 1
SGOL2 shugoshin-like 2 (S. pombe) 151246 yes 1
SGPP1 sphingosine-1-phosphate phosphatase 1 81537 yes 1
SH2D1A SH2 domain protein 1A 4068 yes 1
SH3BP2 SH3-domain binding protein 2 6452 yes 1
SH3GLB1 SH3-domain GRB2-like endophilin B1 51100 yes 1


Symbol Gene name Entrez MR TS µT DB MC # KEGG SH3RF3 SH3 domain containing ring finger 3 344558 yes 1 SH3TC2 SH3 domain and tetratricopeptide repeats 2 79628 yes 1 SHC1 SHC (Src homology 2 domain containing) 6464 yes 1 SHC4 SHCtransforming (Src homology protein 21 domain containing) family, 399694 yes 1 SHCBP1 member SHC SH2-domain 4 binding protein 1 79801 yes 1 SHFM1 split hand/foot malformation (ectrodactyly) type 1 7979 yes 1 SHOC2 soc-2 suppressor of clear homolog (C. elegans) 8036 yes 1 SHOX short stature homeobox 6473 yes 1 SIGLEC1 sialic acid binding Ig-like lectin 1, sialoadhesin 6614 yes 1 SIGLEC10 sialic acid binding Ig-like lectin 10 89790 yes 1 SIGLEC8 sialic acid binding Ig-like lectin 8 27181 yes 1 SIGLEC9 sialic acid binding Ig-like lectin 9 27180 yes 1 SIM2 single-minded homolog 2 (Drosophila) 6493 yes 1 SIP1 survival of motor neuron protein interacting protein 1 8487 yes 1 SIPA1L2 signal-induced proliferation-associated 1 like 2 57568 yes 1 SIRPB2 signal-regulatory protein beta 2 284759 yes 1 SLC10A4 solute carrier family 10 (sodium/bile acid 201780 yes 1 SLC12A1 cotransporter solute carrier familyfamily), 12 member (sodium/potassium/ 4 chloride 6557 yes 1 SLC13A3 solutetransporters), carrier familymember 13 1 (sodium-dependent 64849 yes 1 SLC15A4 solutedicarbox carrierylate familytransporter), 15, member member 4 3 121260 yes 1 SLC16A10 solute carrier family 16, member 10 (aromatic amino 117247 yes 1 SLC16A12 acid solute transporter) carrier family 16, member 12 (monocarboxylic 387700 yes 1 SLC16A4 acid solute transporter carrier family 12) 16, member 4 (monocarboxylic 9122 yes 1 SLC17A2 soluteacid transporter carrier family 5) 17 (sodium phosphate), member 10246 yes 1 SLC17A4 solute2 carrier family 17 (sodium phosphate), member 10050 yes 1 SLC22A11 solute4 carrier family 22 (organic anion/urate 55867 yes 1 SLC25A12 solutetransporter), carrier member family 25 11 (mitochondrial carrier, Aralar), 8604 yes 1 SLC25A29 solutemember carrier 12 family 
25, member 29  123096  yes  1
SLC25A3  solute carrier family 25 (mitochondrial carrier; phosphate carrier), member 3  5250  yes  1
SLC25A30  solute carrier family 25, member 30  253512  yes  1
SLC25A31  solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 31  83447  yes  1
SLC28A1  solute carrier family 28 (sodium-coupled nucleoside transporter), member 1  9154  yes  1
SLC2A1  solute carrier family 2 (facilitated glucose transporter), member 1  6513  yes  1
SLC2A14  solute carrier family 2 (facilitated glucose transporter), member 14  144195  yes  1
SLC2A3  solute carrier family 2 (facilitated glucose transporter), member 3  6515  yes  1
SLC30A4  solute carrier family 30 (zinc transporter), member 4  7782  yes  1
SLC31A2  solute carrier family 31 (copper transporters), member 2  1318  yes  1
SLC32A1  solute carrier family 32 (GABA vesicular transporter), member 1  140679  yes  1
SLC35A2  solute carrier family 35 (UDP-galactose transporter), member A2  7355  yes  1
SLC35A3  solute carrier family 35 (UDP-N-acetylglucosamine (UDP-GlcNAc) transporter), member A3  23443  yes  1
SLC35B3  solute carrier family 35, member B3  51000  yes  1
SLC35D2  solute carrier family 35, member D2  11046  yes  1
SLC35D3  solute carrier family 35, member D3  340146  yes  1
SLC35F1  solute carrier family 35, member F1  222553  yes  1
SLC35F2  solute carrier family 35, member F2  54733  yes  1
SLC36A4  solute carrier family 36 (proton/amino acid symporter), member 4  120103  yes  1
SLC37A3  solute carrier family 37 (glycerol-3-phosphate transporter), member 3  84255  yes  1
SLC39A1  solute carrier family 39 (zinc transporter), member 1  27173  yes  1
SLC39A2  solute carrier family 39 (zinc transporter), member 2  29986  yes  1
SLC39A8  solute carrier family 39 (zinc transporter), member 8  64116  yes  1
SLC39A9  solute carrier family 39 (zinc transporter), member 9  55334  yes  1
SLC3A1  solute carrier family 3 (cystine, dibasic and neutral amino acid transporters, activator of cystine, dibasic and neutral amino acid transport), member 1  6519  yes  1
SLC3A2  solute carrier family 3 (activators of dibasic and neutral amino acid transport), member 2  6520  yes  1
SLC44A1  solute carrier family 44, member 1  23446  yes  1
SLC44A5  solute carrier family 44, member 5  204962  yes  1
SLC45A2  solute carrier family 45, member 2  51151  yes  1
SLC4A11  solute carrier family 4, sodium borate transporter, member 11  83959  yes  1
SLC4A1AP  solute carrier family 4 (anion exchanger), member 1, adaptor protein  22950  yes  1
SLC4A4  solute carrier family 4, sodium bicarbonate cotransporter, member 4  8671  yes  1
SLC4A5  solute carrier family 4, sodium bicarbonate cotransporter, member 5  57835  yes  1
SLC4A7  solute carrier family 4, sodium bicarbonate cotransporter, member 7  9497  yes  1
SLC5A1  solute carrier family 5 (sodium/glucose cotransporter), member 1  6523  yes  1
SLC5A8  solute carrier family 5 (iodide transporter), member 8  160728  yes  1
SLC5A9  solute carrier family 5 (sodium/glucose cotransporter), member 9  200010  yes  1
SLC6A2  solute carrier family 6 (neurotransmitter transporter, noradrenalin), member 2  6530  yes  1
SLC6A20  solute carrier family 6 (proline IMINO transporter), member 20  54716  yes  1
SLC7A6  solute carrier family 7 (cationic amino acid transporter, y+ system), member 6  9057  yes  1
SLC7A8  solute carrier family 7 (cationic amino acid transporter, y+ system), member 8  23428  yes  1
SLC8A1  solute carrier family 8 (sodium/calcium exchanger), member 1  6546  yes  1
SLC9A5  solute carrier family 9 (sodium/hydrogen exchanger), member 5  6553  yes  1
SLC9A6  solute carrier family 9 (sodium/hydrogen exchanger), member 6  10479  yes  1
SLCO1C1  solute carrier organic anion transporter family, member 1C1  53919  yes  1
SLCO3A1  solute carrier organic anion transporter family, member 3A1  28232  yes  1
SLFN11  schlafen family member 11  91607  yes  1
SLFN5  schlafen family member 5  162394  yes  1
SLITRK4  SLIT and NTRK-like family, member 4  139065  yes  1
SLMAP  sarcolemma associated protein  7871  yes  1
SLTM  SAFB-like, transcription modulator  79811  yes  1
SMAGP  small trans-membrane and glycosylated protein  57228  yes  1
SMARCE1  SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e, member 1  6605  yes  1
SMC2  structural maintenance of chromosomes 2  10592  yes  1
SMC3  structural maintenance of chromosomes 3  9126  yes  1
SMC4  structural maintenance of chromosomes 4  10051  yes  1
SMCR7L  Smith-Magenis syndrome chromosome region, candidate 7-like  54471  yes  1
SMPX  small muscle protein, X-linked  23676  yes  1
SMTN  smoothelin  6525  yes  1
SMYD1  SET and MYND domain containing 1  150572  yes  1
SNAP25  synaptosomal-associated protein, 25kDa  6616  yes  1
SNCA  synuclein, alpha (non A4 component of amyloid precursor)  6622  yes  1
SNORA53  small nucleolar RNA, H/ACA box 53  677832  yes  1
SNORD11B  small nucleolar RNA, C/D box 11B  100113392  yes  1
SNORD19  small nucleolar RNA, C/D box 19  692089  yes  1
SNORD51  small nucleolar RNA, C/D box 51  26798  yes  1
SNORD73A  small nucleolar RNA, C/D box 73A  8944  yes  1
SNRNP48  small nuclear ribonucleoprotein 48kDa (U11/U12)  154007  yes  1
SNRPA1  small nuclear ribonucleoprotein polypeptide A'  6627  yes  1
SNTB1  syntrophin, beta 1 (dystrophin-associated protein A1, 59kDa, basic component 1)  6641  yes  1
SNTG2  syntrophin, gamma 2  54221  yes  1
SNX1  sorting nexin 1  6642  yes  1
SNX13  sorting nexin 13  23161  yes  1
SNX16  sorting nexin 16  64089  yes  1
SNX18  sorting nexin 18  112574  yes  1
SNX19  sorting nexin 19  399979  yes  1
SNX27  sorting nexin family member 27  81609  yes  1
SNX3  sorting nexin 3  8724  yes  1
SOCS6  suppressor of cytokine signaling 6  9306  yes  1
SOD1  superoxide dismutase 1, soluble  6647  yes  1
SOD2  superoxide dismutase 2, mitochondrial  6648  yes  1
SORBS1  sorbin and SH3 domain containing 1  10580  yes  1
SORBS2  sorbin and SH3 domain containing 2  8470  yes  1
SORCS1  sortilin-related VPS10 domain containing receptor 1  114815  yes  1
SOS1  son of sevenless homolog 1 (Drosophila)  6654  yes  1
SOX11  SRY (sex determining region Y)-box 11  6664  yes  1
SOX13  SRY (sex determining region Y)-box 13  9580  yes  1
SOX14  SRY (sex determining region Y)-box 14  8403  yes  1
SOX2OT  SOX2 overlapping transcript (non-protein coding)  347689  yes  1
SOX4  SRY (sex determining region Y)-box 4  6659  yes  1
SOX5  SRY (sex determining region Y)-box 5  6660  yes  1
SPACA1  sperm acrosome associated 1  81833  yes  1
SPAG17  sperm associated antigen 17  200162  yes  1
SPAM1  sperm adhesion molecule 1 (PH-20 hyaluronidase, zona pellucida binding)  6677  yes  1
SPANXN1  SPANX family, member N1  494118  yes  1
SPAST  spastin  6683  yes  1
SPATA1  spermatogenesis associated 1  64173  yes  1
SPATA6  spermatogenesis associated 6  54558  yes  1
SPATS2  spermatogenesis associated, serine-rich 2  65244  yes  1
SPATS2L  spermatogenesis associated, serine-rich 2-like  26010  yes  1
SPG11  spastic paraplegia 11 (autosomal recessive)  80208  yes  1
SPIN2A  spindlin family, member 2A  54466  yes  1
SPIN2B  spindlin family, member 2B  474343  yes  1
SPIN3  spindlin family, member 3  169981  yes  1
SPIN4  spindlin family, member 4  139886  yes  1
SPOCK1  sparc/osteonectin, cwcv and kazal-like domains proteoglycan (testican) 1  6695  yes  1
SPOPL  speckle-type POZ protein-like  339745  yes  1
SPR  sepiapterin reductase (7,8-dihydrobiopterin:NADP+ oxidoreductase)  6697  yes  1
SPRED1  sprouty-related, EVH1 domain containing 1  161742  yes  1
SPTBN1  spectrin, beta, non-erythrocytic 1  6711  yes  1
SPTLC1  serine palmitoyltransferase, long chain base subunit 1  10558  yes  1
SRI  sorcin  6717  yes  1
SRP68  signal recognition particle 68kDa  6730  yes  1
SRRT  serrate RNA effector molecule homolog (Arabidopsis)  51593  yes  1
SS18L1  synovial sarcoma translocation gene on chromosome 18-like 1  26039  yes  1
ST6GAL2  ST6 beta-galactosamide alpha-2,6-sialyltranferase 2  84620  yes  1
ST6GALNAC1  ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 1  55808  yes  1
ST7L  suppression of tumorigenicity 7 like  54879  yes  1
STAG2  stromal antigen 2  10735  yes  1
STAM  signal transducing adaptor molecule (SH3 domain and ITAM motif) 1  8027  yes  1
STAM2  signal transducing adaptor molecule (SH3 domain and ITAM motif) 2  10254  yes  1
STAMBP  STAM binding protein  10617  yes  1
STAMBPL1  STAM binding protein-like 1  57559  yes  1
STAP1  signal transducing adaptor family member 1  26228  yes  1
STARD4  StAR-related lipid transfer (START) domain containing 4  134429  yes  1
STARD5  StAR-related lipid transfer (START) domain containing 5  80765  yes  1
STAT1  signal transducer and activator of transcription 1, 91kDa  6772  yes  1
STAT6  signal transducer and activator of transcription 6, interleukin-4 induced  6778  yes  1
STAU2  staufen, RNA binding protein, homolog 2 (Drosophila)  27067  yes  1
STELLAR  germ and embryonic stem cell enriched protein STELLA  400206  yes  1
STK10  serine/threonine kinase 10  6793  yes  1
STK17B  serine/threonine kinase 17b  9262  yes  1
STK24  serine/threonine kinase 24 (STE20 homolog, yeast)  8428  yes  1
STK32B  serine/threonine kinase 32B  55351  yes  1
STK38L  serine/threonine kinase 38 like  23012  yes  1
STRA13  stimulated by retinoic acid 13 homolog (mouse)  201254  yes  1
STRAP  serine/threonine kinase receptor associated protein  11171  yes  1
STRN3  striatin, calmodulin binding protein 3  29966  yes  1
STRN4  striatin, calmodulin binding protein 4  29888  yes  1
STT3A  STT3, subunit of the oligosaccharyltransferase complex, homolog A (S. cerevisiae)  3703  yes  1
STX2  syntaxin 2  2054  yes  1
STX6  syntaxin 6  10228  yes  1
STXBP5  syntaxin binding protein 5 (tomosyn)  134957  yes  1
STXBP6  syntaxin binding protein 6 (amisyn)  29091  yes  1
STYK1  serine/threonine/tyrosine kinase 1  55359  yes  1
SUB1  SUB1 homolog (S. cerevisiae)  10923  yes  1
SUCLA2  succinate-CoA ligase, ADP-forming, beta subunit  8803  yes  1
SULF1  sulfatase 1  23213  yes  1
SULT1C4  sulfotransferase family, cytosolic, 1C, member 4  27233  yes  1
SULT1E1  sulfotransferase family 1E, estrogen-preferring, member 1  6783  yes  1
SUMF1  sulfatase modifying factor 1  285362  yes  1
SUMO4  SMT3 suppressor of mif two 3 homolog 4 (S. cerevisiae)  387082  yes  1
SUV39H2  suppressor of variegation 3-9 homolog 2 (Drosophila)  79723  yes  1
SVEP1  sushi, von Willebrand factor type A, EGF and pentraxin domain containing 1  79987  yes  1
SVOP  SV2 related protein homolog (rat)  55530  yes  1
SWAP70  SWAP switching B-cell complex 70kDa subunit  23075  yes  1
SYDE2  synapse defective 1, Rho GTPase, homolog 2 (C. elegans)  84144  yes  1
SYNCRIP  synaptotagmin binding, cytoplasmic RNA interacting protein  10492  yes  1
SYNE1  spectrin repeat containing, nuclear envelope 1  23345  yes  1
SYNJ2BP  synaptojanin 2 binding protein  55333  yes  1
SYNPO2  synaptopodin 2  171024  yes  1
SYT1  synaptotagmin I  6857  yes  1
SYTL2  synaptotagmin-like 2  54843  yes  1
TACO1  coiled-coil domain containing 44  51204  yes  1
TACSTD2  tumor-associated calcium signal transducer 2  4070  yes  1
TADA1  transcriptional adaptor 1 (HFI1 homolog, yeast)-like  117143  yes  1
TAF15  TAF15 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 68kDa  8148  yes  1
TAF2  TAF2 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 150kDa  6873  yes  1
TAF7  TAF7 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 55kDa  6879  yes  1
TAF9  TAF9 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 32kDa  6880  yes  1
TAGAP  T-cell activation RhoGTPase activating protein  117289  yes  1
TAL2  T-cell acute lymphocytic leukemia 2  6887  yes  1
TAOK3  TAO kinase 3  51347  yes  1
TAP1  transporter 1, ATP-binding cassette, sub-family B (MDR/TAP)  6890  yes  1
TARBP2  TAR (HIV-1) RNA binding protein 2  6895  yes  1
TARS  threonyl-tRNA synthetase  6897  yes  1
TARSL2  threonyl-tRNA synthetase-like 2  123283  yes  1
TAS2R38  taste receptor, type 2, member 38  5726  yes  1
TASP1  taspase, threonine aspartase, 1  55617  yes  1
TAT  tyrosine aminotransferase  6898  yes  1
TATDN1  TatD DNase domain containing 1  83940  yes  1
TBC1D19  TBC1 domain family, member 19  55296  yes  1
TBC1D26  TBC1 domain family, member 26  353149  yes  1
TBC1D9  TBC1 domain family, member 9 (with GRAM domain)  23158  yes  1
TBCD  tubulin folding cofactor D  6904  yes  1
TBCE  tubulin folding cofactor E  6905  yes  1
TBCK  TBC domain-containing protein kinase-like  93627  yes  1
TCEAL3  transcription elongation factor A (SII)-like 3  85012  yes  1
TCEAL7  transcription elongation factor A (SII)-like 7  56849  yes  1
TCEAL8  transcription elongation factor A (SII)-like 8  90843  yes  1
TCEB3B  transcription elongation factor B polypeptide 3B (elongin A2)  51224  yes  1
TCF15  transcription factor 15 (basic helix-loop-helix)  6939  yes  1
TCF19  transcription factor 19  6941  yes  1
TCF3  transcription factor 3 (E2A immunoglobulin enhancer binding factors E12/E47)  6929  yes  1
TCF4  transcription factor 4  6925  yes  1
TCFL5  transcription factor-like 5 (basic helix-loop-helix)  10732  yes  1
TCHH  trichohyalin  7062  yes  1
TCP11L1  t-complex 11 (mouse)-like 1  55346  yes  1
TCP11L2  t-complex 11 (mouse)-like 2  255394  yes  1
TDH  L-threonine dehydrogenase  157739  yes  1
TDRD10  tudor domain containing 10  126668  yes  1
TDRD5  tudor domain containing 5  163589  yes  1
TDRD9  tudor domain containing 9  122402  yes  1
TECRL  steroid 5 alpha-reductase 2-like 2  253017  yes  1
TEK  TEK tyrosine kinase, endothelial  7010  yes  1
TEKT3  tektin 3  64518  yes  1
TET2  tet oncogene family member 2  54790  yes  1
TEX14  testis expressed 14  56155  yes  1
TFAP2A  transcription factor AP-2 alpha (activating enhancer binding protein 2 alpha)  7020  yes  1
TFAP2D  transcription factor AP-2 delta (activating enhancer binding protein 2 delta)  83741  yes  1
TFB1M  transcription factor B1, mitochondrial  51106  yes  1
TFDP1  transcription factor Dp-1  7027  yes  1
TFE3  transcription factor binding to IGHM enhancer 3  7030  yes  1
TFF3  trefoil factor 3 (intestinal)  7033  yes  1
TFPT  TCF3 (E2A) fusion partner (in childhood Leukemia)  29844  yes  1
TFR2  transferrin receptor 2  7036  yes  1
TGFB1  transforming growth factor, beta 1  7040  yes  1
TGFBR3  transforming growth factor, beta receptor III  7049  yes  1
TGOLN2  trans-golgi network protein 2  10618  yes  1
THAP11  THAP domain containing 11  57215  yes  1
THAP2  THAP domain containing, apoptosis associated protein 2  83591  yes  1
THAP9  THAP domain containing 9  79725  yes  1
THOC1  THO complex 1  9984  yes  1
THUMPD1  THUMP domain containing 1  55623  yes  1
TIAM1  T-cell lymphoma invasion and metastasis 1  7074  yes  1
TIAM2  T-cell lymphoma invasion and metastasis 2  26230  yes  1
TIGD4  tigger transposable element derived 4  201798  yes  1
TIMM8A  translocase of inner mitochondrial membrane 8 homolog A (yeast)  1678  yes  1
TIMM9  translocase of inner mitochondrial membrane 9 homolog (yeast)  26520  yes  1
TIMP1  TIMP metallopeptidase inhibitor 1  7076  yes  1
TINAG  tubulointerstitial nephritis antigen  27283  yes  1
TIRAP  toll-interleukin 1 receptor (TIR) domain containing adaptor protein  114609  yes  1
TKT  transketolase  7086  yes  1
TLE4  transducin-like enhancer of split 4 (E(sp1) homolog, Drosophila)  7091  yes  1
TM2D3  TM2 domain containing 3  80213  yes  1
TM4SF18  transmembrane 4 L six family member 18  116441  yes  1
TMC7  transmembrane channel-like 7  79905  yes  1
TMC8  transmembrane channel-like 8  147138  yes  1
TMCO1  transmembrane and coiled-coil domains 1  54499  yes  1
TMCO6  transmembrane and coiled-coil domains 6  55374  yes  1
TMEM106B  transmembrane protein 106B  54664  yes  1
TMEM123  transmembrane protein 123  114908  yes  1
TMEM126B  transmembrane protein 126B  55863  yes  1
TMEM139  transmembrane protein 139  135932  yes  1
TMEM144  transmembrane protein 144  55314  yes  1
TMEM168  transmembrane protein 168  64418  yes  1
TMEM169  transmembrane protein 169  92691  yes  1
TMEM178  transmembrane protein 178  130733  yes  1
TMEM18  transmembrane protein 18  129787  yes  1
TMEM182  transmembrane protein 182  130827  yes  1
TMEM194A  transmembrane protein 194A  23306  yes  1
TMEM203  transmembrane protein 203  94107  yes  1
TMEM218  transmembrane protein 218  219854  yes  1
TMEM220  transmembrane protein 220  388335  yes  1
TMEM229A  hypothetical protein LOC730130  730130  yes  1
TMEM33  transmembrane protein 33  55161  yes  1
TMEM41A  transmembrane protein 41A  90407  yes  1
TMEM41B  transmembrane protein 41B  440026  yes  1
TMEM48  transmembrane protein 48  55706  yes  1
TMEM50A  transmembrane protein 50A  23585  yes  1
TMEM55A  transmembrane protein 55A  55529  yes  1
TMEM56  transmembrane protein 56  148534  yes  1
TMEM64  transmembrane protein 64  169200  yes  1
TMEM67  transmembrane protein 67  91147  yes  1
TMEM68  transmembrane protein 68  137695  yes  1
TMEM74  transmembrane protein 74  157753  yes  1
TMEM90A  transmembrane protein 90A  646658  yes  1
TMEM90B  chromosome 20 open reading frame 39  79953  yes  1
TMLHE  trimethyllysine hydroxylase, epsilon  55217  yes  1
TMOD3  tropomodulin 3 (ubiquitous)  29766  yes  1
TMPRSS11A  transmembrane protease, serine 11A  339967  yes  1
TMPRSS11F  transmembrane protease, serine 11F  389208  yes  1
TMPRSS4  transmembrane protease, serine 4  56649  yes  1
TMSL3  thymosin-like 3  7117  yes  1
TMX1  thioredoxin-related transmembrane protein 1  81542  yes  1
TMX3  thioredoxin-related transmembrane protein 3  54495  yes  1
TNF  tumor necrosis factor (TNF superfamily, member 2)  7124  yes  1
TNFAIP8  tumor necrosis factor, alpha-induced protein 8  25816  yes  1
TNFRSF10C  tumor necrosis factor receptor superfamily, member 10c, decoy without an intracellular domain  8794  yes  1
TNFRSF18  tumor necrosis factor receptor superfamily, member 18  8784  yes  1
TNFRSF19  tumor necrosis factor receptor superfamily, member 19  55504  yes  1
TNFRSF21  tumor necrosis factor receptor superfamily, member 21  27242  yes  1
TNFSF10  tumor necrosis factor (ligand) superfamily, member 10  8743  yes  1
TNNI1  troponin I type 1 (skeletal, slow)  7135  yes  1
TNRC6C  trinucleotide repeat containing 6C  57690  yes  1
TNS3  tensin 3  64759  yes  1
TOM1L1  target of myb1 (chicken)-like 1  10040  yes  1
TOMM22  translocase of outer mitochondrial membrane 22 homolog (yeast)  56993  yes  1
TOMM40  translocase of outer mitochondrial membrane 40 homolog (yeast)  10452  yes  1
TOMM70A  translocase of outer mitochondrial membrane 70 homolog A (S. cerevisiae)  9868  yes  1
TOP1P1  topoisomerase (DNA) I pseudogene 1  7151  yes  1
TOR3A  torsin family 3, member A  64222  yes  1
TOX  thymocyte selection-associated high mobility group box  9760  yes  1
TP73  tumor protein p73  7161  yes  1
TPD52  tumor protein D52  7163  yes  1
TPK1  thiamin pyrophosphokinase 1  27010  yes  1
TPP1  tripeptidyl peptidase I  1200  yes  1
TPRG1  tumor protein p63 regulated 1  285386  yes  1
TPRX1  tetra-peptide repeat homeobox 1  284355  yes  1
TPRXL  tetra-peptide repeat homeobox-like  348825  yes  1
TPTE  transmembrane phosphatase with tensin homology  7179  yes  1
TPTE2P1  transmembrane phosphoinositide 3-phosphatase and tensin homolog 2 pseudogene  646405  yes  1
TPTE2P3  TPTE and PTEN homologous inositol lipid phosphatase pseudogene  220115  yes  1
TRA2B  transformer 2 beta homolog (Drosophila)  6434  yes  1
TRADD  TNFRSF1A-associated via death domain  8717  yes  1
TRAF4  TNF receptor-associated factor 4  9618  yes  1
TRAF5  TNF receptor-associated factor 5  7188  yes  1
TRAPPC6B  trafficking protein particle complex 6B  122553  yes  1
TRAT1  T cell receptor associated transmembrane adaptor 1  50852  yes  1
TRDN  triadin  10345  yes  1
TREM2  triggering receptor expressed on myeloid cells 2  54209  yes  1
TREML4  triggering receptor expressed on myeloid cells-like 4  285852  yes  1
TRIB2  tribbles homolog 2 (Drosophila)  28951  yes  1
TRIM22  tripartite motif-containing 22  10346  yes  1
TRIM23  tripartite motif-containing 23  373  yes  1
TRIM24  tripartite motif-containing 24  8805  yes  1
TRIM33  tripartite motif-containing 33  51592  yes  1
TRIM37  tripartite motif-containing 37  4591  yes  1
TRIM45  tripartite motif-containing 45  80263  yes  1
TRIO  triple functional domain (PTPRF interacting)  7204  yes  1
TRIP13  thyroid hormone receptor interactor 13  9319  yes  1
TRMT6  tRNA methyltransferase 6 homolog (S. cerevisiae)  51605  yes  1
TRNP1  TMF1-regulated nuclear protein 1  388610  yes  1
TRPC6  transient receptor potential cation channel, subfamily C, member 6  7225  yes  1
TRPV2  transient receptor potential cation channel, subfamily V, member 2  51393  yes  1
TSC22D1  TSC22 domain family, member 1  8848  yes  1
TSC22D2  TSC22 domain family, member 2  9819  yes  1
TSEN54  tRNA splicing endonuclease 54 homolog (S. cerevisiae)  283989  yes  1
TSG1  tumor suppressor TSG1  643432  yes  1
TSG101  tumor susceptibility gene 101  7251  yes  1
TSNAX  translin-associated factor X  7257  yes  1
TSPAN12  tetraspanin 12  23554  yes  1
TSPAN15  tetraspanin 15  23555  yes  1
TSPYL4  TSPY-like 4  23270  yes  1
TSPYL5  TSPY-like 5  85453  yes  1
TSTA3  tissue specific transplantation antigen P35B  7264  yes  1
TTC13  tetratricopeptide repeat domain 13  79573  yes  1
TTC21A  tetratricopeptide repeat domain 21A  199223  yes  1
TTC23  tetratricopeptide repeat domain 23  64927  yes  1
TTC23L  tetratricopeptide repeat domain 23-like  153657  yes  1
TTLL5  tubulin tyrosine ligase-like family, member 5  23093  yes  1
TTLL7  tubulin tyrosine ligase-like family, member 7  79739  yes  1
TTN  titin  7273  yes  1
TTPA  tocopherol (alpha) transfer protein  7274  yes  1
TTTY4C  testis-specific transcript, Y-linked 4C  474150  yes  1
TTTY7  testis-specific transcript, Y-linked 7  246122  yes  1
TUBA1A  tubulin, alpha 1a  7846  yes  1
TUBA3E  tubulin, alpha 3e  112714  yes  1
TUBGCP3  tubulin, gamma complex associated protein 3  10426  yes  1
TUBGCP5  tubulin, gamma complex associated protein 5  114791  yes  1
TUFT1  tuftelin 1  7286  yes  1
TULP3  tubby like protein 3  7289  yes  1
TWISTNB  TWIST neighbor  221830  yes  1
TXK  TXK tyrosine kinase  7294  yes  1
TXLNB  taxilin beta  167838  yes  1
TXNDC2  thioredoxin domain containing 2 (spermatozoa)  84203  yes  1
TXNL1  thioredoxin-like 1  9352  yes  1
TYSND1  trypsin domain containing 1  219743  yes  1
TYW1  tRNA-yW synthesizing protein 1 homolog (S. cerevisiae)  55253  yes  1
U2AF1  U2 small nuclear RNA auxiliary factor 1  7307  yes  1
UBA5  ubiquitin-like modifier activating enzyme 5  79876  yes  1
UBA6  ubiquitin-like modifier activating enzyme 6  55236  yes  1
UBD  ubiquitin D  10537  yes  1
UBE2C  ubiquitin-conjugating enzyme E2C  11065  yes  1
UBE2CBP  ubiquitin-conjugating enzyme E2C binding protein  90025  yes  1
UBE2F  ubiquitin-conjugating enzyme E2F (putative)  140739  yes  1
UBE2G1  ubiquitin-conjugating enzyme E2G 1 (UBC7 homolog, yeast)  7326  yes  1
UBE2Q2  ubiquitin-conjugating enzyme E2Q family member 2  92912  yes  1
UBE4A  ubiquitination factor E4A (UFD2 homolog, yeast)  9354  yes  1
UBE4B  ubiquitination factor E4B (UFD2 homolog, yeast)  10277  yes  1
UBL4A  ubiquitin-like 4A  8266  yes  1
UBP1  upstream binding protein 1 (LBP-1a)  7342  yes  1
UBQLN1  ubiquilin 1  29979  yes  1
UBR1  ubiquitin protein ligase E3 component n-recognin 1  197131  yes  1
UBR3  ubiquitin protein ligase E3 component n-recognin 3 (putative)  130507  yes  1
UBR4  ubiquitin protein ligase E3 component n-recognin 4  23352  yes  1
UBR7  ubiquitin protein ligase E3 component n-recognin 7 (putative)  55148  yes  1
UBXN1  UBX domain protein 1  51035  yes  1
UBXN2A  UBX domain protein 2A  165324  yes  1
UFM1  ubiquitin-fold modifier 1  51569  yes  1
UFSP1  UFM1-specific peptidase 1 (non-functional)  402682  yes  1
UGGT1  UDP-glucose ceramide glucosyltransferase-like 1  56886  yes  1
UGGT2  UDP-glucose ceramide glucosyltransferase-like 2  55757  yes  1
UGT3A1  UDP glycosyltransferase 3 family, polypeptide A1  133688  yes  1
UHMK1  U2AF homology motif (UHM) kinase 1  127933  yes  1
ULBP2  UL16 binding protein 2  80328  yes  1
ULK2  unc-51-like kinase 2 (C. elegans)  9706  yes  1
ULK3  unc-51-like kinase 3 (C. elegans)  25989  yes  1
UMPS  uridine monophosphate synthetase  7372  yes  1
UNC119B  unc-119 homolog B (C. elegans)  84747  yes  1
UNC13B  unc-13 homolog B (C. elegans)  10497  yes  1
UNC80  chromosome 2 open reading frame 21  285175  yes  1
UNCX  UNC homeobox  340260  yes  1
UPK3A  uroplakin 3A  7380  yes  1
UPRT  uracil phosphoribosyltransferase (FUR1) homolog (S. cerevisiae)  139596  yes  1
UQCR11  ubiquinol-cytochrome c reductase, 6.4kDa subunit  10975  yes  1
USP10  ubiquitin specific peptidase 10  9100  yes  1
USP13  ubiquitin specific peptidase 13 (isopeptidase T-3)  8975  yes  1
USP24  ubiquitin specific peptidase 24  23358  yes  1
USP30  ubiquitin specific peptidase 30  84749  yes  1
USP38  ubiquitin specific peptidase 38  84640  yes  1
USP4  ubiquitin specific peptidase 4 (proto-oncogene)  7375  yes  1
USP54  ubiquitin specific peptidase 54  159195  yes  1
USP8  ubiquitin specific peptidase 8  9101  yes  1
USP9X  ubiquitin specific peptidase 9, X-linked  8239  yes  1
UST  uronyl-2-sulfotransferase  10090  yes  1
VANGL1  vang-like 1 (van gogh, Drosophila)  81839  yes  1
VAV3  vav 3 guanine nucleotide exchange factor  10451  yes  1
VCPIP1  valosin containing protein (p97)/p47 complex interacting protein 1  80124  yes  1
VCX3B  variable charge, X-linked 3B  425054  yes  1
VEZT  vezatin, adherens junctions transmembrane protein  55591  yes  1
VIPR1  vasoactive intestinal peptide receptor 1  7433  yes  1
VIT  vitrin  5212  yes  1
VPS13C  vacuolar protein sorting 13 homolog C (S. cerevisiae)  54832  yes  1
VPS13D  vacuolar protein sorting 13 homolog D (S. cerevisiae)  55187  yes  1
VPS26A  vacuolar protein sorting 26 homolog A (S. pombe)  9559  yes  1
VPS37A  vacuolar protein sorting 37 homolog A (S. cerevisiae)  137492  yes  1
VPS41  vacuolar protein sorting 41 homolog (S. cerevisiae)  27072  yes  1
VPS45  vacuolar protein sorting 45 homolog (S. cerevisiae)  11311  yes  1
VPS52  vacuolar protein sorting 52 homolog (S. cerevisiae)  6293  yes  1
VPS8  vacuolar protein sorting 8 homolog (S. cerevisiae)  23355  yes  1
VRK2  vaccinia related kinase 2  7444  yes  1
VRK3  vaccinia related kinase 3  51231  yes  1
VSIG4  V-set and immunoglobulin domain containing 4  11326  yes  1
VSTM2A  V-set and transmembrane domain containing 2A  222008  yes  1
VTI1A  vesicle transport through interaction with t-SNAREs homolog 1A (yeast)  143187  yes  1
VWA5B2  von Willebrand factor A domain containing 5B2  90113  yes  1
WAC  WW domain containing adaptor with coiled-coil  51322  yes  1
WASF1  WAS protein family, member 1  8936  yes  1
WBP5  WW domain binding protein 5  51186  yes  1
WDFY4  WDFY family member 4  57705  yes  1
WDR20  WD repeat domain 20  91833  yes  1
WDR27  WD repeat domain 27  253769  yes  1
WDR6  WD repeat domain 6  11180  yes  1
WDR60  WD repeat domain 60  55112  yes  1
WDR62  WD repeat domain 62  284403  yes  1
WDR7  WD repeat domain 7  23335  yes  1
WDR72  WD repeat domain 72  256764  yes  1
WDR74  WD repeat domain 74  54663  yes  1
WDR92  WD repeat domain 92  116143  yes  1
WDR93  WD repeat domain 93  56964  yes  1
WEE1  WEE1 homolog (S. pombe)  7465  yes  1
WEE2  WEE1 homolog 2 (S. pombe)  494551  yes  1
WFDC13  WAP four-disulfide core domain 13  164237  yes  1
WFDC3  WAP four-disulfide core domain 3  140686  yes  1
WFDC5  WAP four-disulfide core domain 5  149708  yes  1
WFS1  Wolfram syndrome 1 (wolframin)  7466  yes  1
WHAMML2  WAS protein homolog associated with actin, golgi membranes and microtubules-like 2 (pseudogene)  440253  yes  1
WHSC1  Wolf-Hirschhorn syndrome candidate 1  7468  yes  1
WIF1  WNT inhibitory factor 1  11197  yes  1
WIPF1  WAS/WASL interacting protein family, member 1  7456  yes  1
WIPI2  WD repeat domain, phosphoinositide interacting 2  26100  yes  1
WNK4  WNK lysine deficient protein kinase 4  65266  yes  1
WRB  tryptophan rich basic protein  7485  yes  1
WTIP  Wilms tumor 1 interacting protein  126374  yes  1
WWC3  WWC family member 3  55841  yes  1
XG  Xg blood group  7499  yes  1
XIAP  X-linked inhibitor of apoptosis  331  yes  1
XKR6  XK, Kell blood group complex subunit-related family, member 6  286046  yes  1
XPNPEP3  X-prolyl aminopeptidase (aminopeptidase P) 3, putative  63929  yes  1
XPR1  xenotropic and polytropic retrovirus receptor  9213  yes  1
XRCC1  X-ray repair complementing defective repair in Chinese hamster cells 1  7515  yes  1
XRN1  5'-3' exoribonuclease 1  54464  yes  1
XRN2  5'-3' exoribonuclease 2  22803  yes  1
XRRA1  X-ray radiation resistance associated 1  143570  yes  1
YIPF1  Yip1 domain family, member 1  54432  yes  1
YIPF6  Yip1 domain family, member 6  286451  yes  1
YIPF7  Yip1 domain family, member 7  285525  yes  1
YPEL1  yippee-like 1 (Drosophila)  29799  yes  1
YPEL2  yippee-like 2 (Drosophila)  388403  yes  1
YPEL5  yippee-like 5 (Drosophila)  51646  yes  1
YWHAQ  tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, theta polypeptide  10971  yes  1
YWHAZ  tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, zeta polypeptide  7534  yes  1
YY1  YY1 transcription factor  7528  yes  1
ZACN  zinc activated ligand-gated ion channel  353174  yes  1
ZBBX  zinc finger, B-box domain containing  79740  yes  1
ZBP1  Z-DNA binding protein 1  81030  yes  1
ZBTB1  zinc finger and BTB domain containing 1  22890  yes  1
ZBTB24  zinc finger and BTB domain containing 24  9841  yes  1
ZBTB33  zinc finger and BTB domain containing 33  10009  yes  1
ZBTB37  zinc finger and BTB domain containing 37  84614  yes  1
ZBTB41  zinc finger and BTB domain containing 41  360023  yes  1
ZBTB43  zinc finger and BTB domain containing 43  23099  yes  1
ZBTB5  zinc finger and BTB domain containing 5  9925  yes  1
ZBTB6  zinc finger and BTB domain containing 6  10773  yes  1
ZBTB8A  zinc finger and BTB domain containing 8A  653121  yes  1
ZBTB8B  zinc finger and BTB domain containing 8B  728116  yes  1
ZC3HAV1  zinc finger CCCH-type, antiviral 1  56829  yes  1
ZC4H2  zinc finger, C4H2 domain containing  55906  yes  1
ZCCHC3  zinc finger, CCHC domain containing 3  85364  yes  1
ZCCHC5  zinc finger, CCHC domain containing 5  203430  yes  1
ZCWPW2  zinc finger, CW type with PWWP domain 2  152098  yes  1
ZDHHC15  zinc finger, DHHC-type containing 15  158866  yes  1
ZDHHC17  zinc finger, DHHC-type containing 17  23390  yes  1
ZDHHC2  zinc finger, DHHC-type containing 2  51201  yes  1
ZDHHC5  zinc finger, DHHC-type containing 5  25921  yes  1
ZDHHC6  zinc finger, DHHC-type containing 6  64429  yes  1
ZDHHC9  zinc finger, DHHC-type containing 9  51114  yes  1
ZEB1  zinc finger E-box binding homeobox 1  6935  yes  1
ZEB2  zinc finger E-box binding homeobox 2  9839  yes  1
ZFP28  zinc finger protein 28 homolog (mouse)  140612  yes  1
ZFP57  zinc finger protein 57 homolog (mouse)  346171  yes  1
ZFP64  zinc finger protein 64 homolog (mouse)  55734  yes  1
ZFP90  zinc finger protein 90 homolog (mouse)  146198  yes  1
ZFYVE16  zinc finger, FYVE domain containing 16  9765  yes  1
ZFYVE26  zinc finger, FYVE domain containing 26  23503  yes  1
ZIC1  Zic family member 1 (odd-paired homolog, Drosophila)  7545  yes  1
ZKSCAN2  zinc finger with KRAB and SCAN domains 2  342357  yes  1
ZMAT2  zinc finger, matrin type 2  153527  yes  1
ZMAT4  zinc finger, matrin type 4  79698  yes  1
ZMPSTE24  zinc metallopeptidase (STE24 homolog, S. cerevisiae)  10269  yes  1
ZMYM2  zinc finger, MYM-type 2  7750  yes  1
ZMYM4  zinc finger, MYM-type 4  9202  yes  1
ZMYND17  zinc finger, MYND-type containing 17  118490  yes  1
ZNF135  zinc finger protein 135  7694  yes  1
ZNF141  zinc finger protein 141  7700  yes  1
ZNF185  zinc finger protein 185 (LIM domain)  7739  yes  1
ZNF197  zinc finger protein 197  10168  yes  1
ZNF200  zinc finger protein 200  7752  yes  1
ZNF207  zinc finger protein 207  7756  yes  1
ZNF208  zinc finger protein 208  7757  yes  1
ZNF215  zinc finger protein 215  7762  yes  1
ZNF229  zinc finger protein 229  7772  yes  1
ZNF232  zinc finger protein 232  7775  yes  1
ZNF24  zinc finger protein 24  7572  yes  1
ZNF248  zinc finger protein 248  57209  yes  1
ZNF251  zinc finger protein 251  90987  yes  1
ZNF263  zinc finger protein 263  10127  yes  1
ZNF264  zinc finger protein 264  9422  yes  1
ZNF280D  zinc finger protein 280D  54816  yes  1
ZNF286A  zinc finger protein 286A  57335  yes  1
ZNF286B  zinc finger protein 286B  729288  yes  1
ZNF292  zinc finger protein 292  23036  yes  1
ZNF295  zinc finger protein 295  49854  yes  1
ZNF304  zinc finger protein 304  57343  yes  1
ZNF323  zinc finger protein 323  64288  yes  1
ZNF330  zinc finger protein 330  27309  yes  1
ZNF333  zinc finger protein 333  84449  yes  1
ZNF35  zinc finger protein 35  7584  yes  1
ZNF362  zinc finger protein 362  149076  yes  1
ZNF398  zinc finger protein 398  57541  yes  1
ZNF407  zinc finger protein 407  55628  yes  1
ZNF414  zinc finger protein 414  84330  yes  1
ZNF415  zinc finger protein 415  55786  yes  1
ZNF418  zinc finger protein 418  147686  yes  1
ZNF441  zinc finger protein 441  126068  yes  1
ZNF443  zinc finger protein 443  10224  yes  1
ZNF451  zinc finger protein 451  26036  yes  1
ZNF474  zinc finger protein 474  133923  yes  1
ZNF480  zinc finger protein 480  147657  yes  1
ZNF493  zinc finger protein 493  284443  yes  1
ZNF498  zinc finger protein 498  221785  yes  1
ZNF503  zinc finger protein 503  84858  yes  1
ZNF506  zinc finger protein 506  440515  yes  1
ZNF507  zinc finger protein 507  22847  yes  1
ZNF512  zinc finger protein 512  84450  yes  1
ZNF518B  zinc finger protein 518B  85460  yes  1
ZNF525  zinc finger protein 525  170958  yes  1
ZNF527  zinc finger protein 527  84503  yes  1
ZNF529  zinc finger protein 529  57711  yes  1
ZNF541  zinc finger protein 541  84215  yes  1
ZNF542  zinc finger protein 542  147947  yes  1
ZNF543  zinc finger protein 543  125919  yes  1
ZNF548  zinc finger protein 548  147694  yes  1
ZNF549  zinc finger protein 549  256051  yes  1
ZNF562  zinc finger protein 562  54811  yes  1
ZNF572  zinc finger protein 572  137209  yes  1
ZNF577  zinc finger protein 577  84765  yes  1
ZNF584  zinc finger protein 584  201514  yes  1
ZNF586  zinc finger protein 586  54807  yes  1
ZNF589  zinc finger protein 589  51385  yes  1
ZNF595  zinc finger protein 595  152687  yes  1
ZNF596  zinc finger protein 596  169270  yes  1
ZNF607  zinc finger protein 607  84775  yes  1
ZNF638  zinc finger protein 638  27332  yes  1
ZNF654  zinc finger protein 654  55279  yes  1
ZNF662  zinc finger protein 662  389114  yes  1
ZNF664  zinc finger protein 664  144348  yes  1
ZNF665  zinc finger protein 665  79788  yes  1
ZNF676  zinc finger protein 676  163223  yes  1
ZNF695  zinc finger protein 695  57116  yes  1
ZNF7  zinc finger protein 7  7553  yes  1
ZNF713  zinc finger protein 713  349075  yes  1
ZNF720  zinc finger protein 720  124411  yes  1
ZNF746  zinc finger protein 746  155061  yes  1
ZNF772  zinc finger protein 772  400720  yes  1
ZNF777  zinc finger protein 777  27153  yes  1
ZNF788  zinc finger family member 788  388507  yes  1
ZNF799  zinc finger protein 799  90576  yes  1
ZNF800  zinc finger protein 800  168850  yes  1
ZNF81  zinc finger protein 81  347344  yes  1
ZNF814  zinc finger protein 814  730051  yes  1
ZNF815  zinc finger protein 815  401303  yes  1
ZNF828  zinc finger protein 828  283489  yes  1
ZNF84  zinc finger protein 84  7637  yes  1
ZNF860  zinc finger protein 860  344787  yes  1
ZNRF2  zinc and ring finger 2  223082  yes  1
ZNRF3  zinc and ring finger 3  84133  yes  1
ZRANB2  zinc finger, RAN-binding domain containing 2  9406  yes  1
ZSCAN22  zinc finger and SCAN domain containing 22  342945  yes  1
ZSCAN23  zinc finger and SCAN domain containing 23  222696  yes  1
ZSCAN4  zinc finger and SCAN domain containing 4  201516  yes  1
ZSWIM2  zinc finger, SWIM-type containing 2  151112  yes  1
ZSWIM5  zinc finger, SWIM-type containing 5  57643  yes  1
ZSWIM6  zinc finger, SWIM-type containing 6  57688  yes  1
ZYG11B  zyg-11 homolog B (C. elegans)  79699  yes  1

7.6 Gene set enrichment analysis of microRNA 361-5p, Pum1 and Pum2 targets

MicroRNA target predictions were obtained from the following web services: microRNA.org (Betel et al., 2010), TargetScan (Friedman et al., 2009), DIANA-microT (Maragkakis et al., 2009), miRDB (Wang, 2008), and MicroCosm (Griffiths-Jones et al., 2008). Experimentally verified Pum1 and Pum2 targets were taken from previously published studies (Galgano et al., 2008; Morris et al., 2008; Hafner et al., 2010b). Results were pooled and converted to Entrez gene identifiers using the DAVID web service (Huang et al., 2008). Putative targets were compared to a human reference gene list and analyzed for pathway enrichment using PANTHER (Thomas et al., 2006). Pathways significantly enriched among predicted microRNA 361-5p (miR-361-5p) targets or among experimentally verified Pum1 or Pum2 targets are listed together with their P values (n.s., not significant).

Pathway | miR-361-5p | Pum1 | Pum2
5HT2 type receptor mediated signaling pathway | 3.54 x 10^-3 | n.s. | n.s.
5HT4 type receptor mediated signaling pathway | n.s. | n.s. | 4.8 x 10^-2
Alpha adrenergic receptor signaling pathway | 8.40 x 10^-3 | n.s. | n.s.
Alzheimer disease-amyloid secretase pathway | 9.73 x 10^-3 | 2.5 x 10^-2 | 1.3 x 10^-3
Alzheimer disease-presenilin pathway | n.s. | 1.8 x 10^-3 | 5.4 x 10^-6
Angiogenesis | 1.55 x 10^-3 | 1.0 x 10^-7 | 1.2 x 10^-8
Apoptosis signaling pathway | 4.28 x 10^-4 | 3.6 x 10^-4 | 1.4 x 10^-6
Axon guidance mediated by semaphorins | n.s. | 2.0 x 10^-2 | 2.7 x 10^-2
Axon guidance mediated by Slit/Robo | n.s. | 5.2 x 10^-3 | 2.6 x 10^-3
B cell activation | 4.85 x 10^-5 | 3.6 x 10^-5 | 4.8 x 10^-3
Beta1/2 adrenergic receptor signaling pathway | 1.42 x 10^-2 | n.s. | 3.8 x 10^-3
Beta3 adrenergic receptor signaling pathway | n.s. | n.s. | 1.6 x 10^-2
Blood coagulation | n.s. | 3.6 x 10^-2 | n.s.
Cadherin signaling pathway | n.s. | 4.2 x 10^-2 | 5.7 x 10^-3
Cell cycle | n.s. | 3.0 x 10^-2 | 5.9 x 10^-4
Cholesterol biosynthesis | n.s. | n.s. | 4.7 x 10^-2
Circadian clock system | 1.10 x 10^-2 | 3.3 x 10^-2 | n.s.
Coenzyme A biosynthesis | n.s. | n.s. | 3.2 x 10^-2
Cortocotropin releasing factor receptor signaling pathway | n.s. | n.s. | 1.8 x 10^-2
Cytoskeletal regulation by Rho GTPase | n.s. | n.s. | 3.8 x 10^-2
De novo purine biosynthesis | n.s. | 1.8 x 10^-2 | n.s.
DNA replication | n.s. | 1.4 x 10^-3 | 1.1 x 10^-4
EGF receptor signaling pathway | 7.20 x 10^-6 | 5.6 x 10^-7 | 5.7 x 10^-10
Endothelin signaling pathway | 2.62 x 10^-4 | n.s. | 9.4 x 10^-4
FAS signaling pathway | 4.12 x 10^-2 | n.s. | 1.4 x 10^-2
FGF signaling pathway | 2.61 x 10^-5 | 9.6 x 10^-4 | 5.6 x 10^-9
Glutamine glutamate conversion | 2.14 x 10^-2 | n.s. | n.s.
Glycolysis | n.s. | n.s. | 5.0 x 10^-3
Hedgehog signaling pathway | n.s. | 1.8 x 10^-4 | 6.0 x 10^-5
Heterotrimeric G-protein signaling pathway-Gq α and Go α mediated pathway | 4.03 x 10^-2 | n.s. | n.s.
Histamine H1 receptor mediated signaling pathway | 1.06 x 10^-2 | n.s. | n.s.
Histamine H2 receptor mediated signaling pathway | n.s. | n.s. | 2.9 x 10^-2
Huntington disease | n.s. | n.s. | 4.0 x 10^-6
Hypoxia response via HIF activation | n.s. | n.s. | 2.8 x 10^-2
Inflammation mediated by chemokine and cytokine signaling pathway | 4.91 x 10^-3 | 1.0 x 10^-3 | 2.1 x 10^-2
Insulin/IGF pathway-MAPKK/MAPK cascade | 6.21 x 10^-3 | 1.9 x 10^-3 | 3.7 x 10^-5
Insulin/IGF pathway-protein kinase B signaling cascade | n.s. | 4.9 x 10^-3 | 1.6 x 10^-4
Integrin signaling pathway | 1.85 x 10^-2 | 5.7 x 10^-5 | 1.0 x 10^-3
Interferon-gamma signaling pathway | 2.85 x 10^-2 | 5.4 x 10^-4 | 6.2 x 10^-3
Interleukin signaling pathway | n.s. | 1.6 x 10^-5 | 6.1 x 10^-4
Metabotropic glutamate receptor group I pathway | 1.14 x 10^-4 | n.s. | n.s.
Metabotropic glutamate receptor group II pathway | 3.96 x 10^-2 | n.s. | 1.5 x 10^-2
Metabotropic glutamate receptor group III pathway | 1.14 x 10^-4 | n.s. | 2.2 x 10^-2
Muscarinic acetylcholine receptor 1 and 3 signaling pathway | 1.95 x 10^-2 | n.s. | 2.2 x 10^-2
Muscarinic acetylcholine receptor 2 and 4 signaling pathway | n.s. | n.s. | 4.4 x 10^-2
Nicotinic acetylcholine receptor signaling pathway | n.s. | n.s. | 4.5 x 10^-3
Notch signaling pathway | n.s. | n.s. | 1.4 x 10^-2
O-antigen biosynthesis | n.s. | n.s. | 4.6 x 10^-2
Oxidative stress response | 1.71 x 10^-2 | 1.0 x 10^-3 | 5.3 x 10^-3
Oxytocin receptor mediated signaling pathway | 8.35 x 10^-3 | n.s. | 3.4 x 10^-2
p53 pathway | 4.32 x 10^-5 | 1.6 x 10^-7 | 4.9 x 10^-14
p53 pathway by glucose deprivation | n.s. | n.s. | 2.0 x 10^-4
P53 pathway feedback loops 1 | 2.01 x 10^-2 | n.s. | n.s.
p53 pathway feedback loops 2 | 1.09 x 10^-2 | 2.2 x 10^-2 | 1.2 x 10^-6
Parkinson disease | 2.42 x 10^-2 | 4.3 x 10^-4 | 4.0 x 10^-5
PDGF signaling pathway | 7.68 x 10^-7 | 7.0 x 10^-7 | 1.6 x 10^-12
PI3 kinase pathway | 2.11 x 10^-2 | 3.7 x 10^-4 | 2.2 x 10^-7
Ras Pathway | 8.18 x 10^-4 | 1.3 x 10^-7 | 1.7 x 10^-7
Salvage pyrimidine ribonucleotides | n.s. | 2.4 x 10^-2 | n.s.
T cell activation | 2.19 x 10^-6 | 8.9 x 10^-6 | 6.1 x 10^-5
TGF-beta signaling pathway | 2.50 x 10^-2 | 2.6 x 10^-4 | 9.0 x 10^-7
Thyrotropin-releasing hormone receptor signaling pathway | 5.37 x 10^-3 | n.s. | 1.4 x 10^-2
Toll receptor signaling pathway | n.s. | 1.4 x 10^-3 | n.s.
Transcription regulation by bZIP transcription factor | n.s. | n.s. | 1.1 x 10^-2
Ubiquitin proteasome pathway | n.s. | n.s. | 1.2 x 10^-6
Vasopressin synthesis | n.s. | n.s. | 1.1 x 10^-2
VEGF signaling pathway | 8.46 x 10^-3 | 1.6 x 10^-2 | 2.8 x 10^-3
Vitamin B6 metabolism | n.s. | n.s. | 2.2 x 10^-2
Wnt signaling pathway | 3.18 x 10^-3 | 4.3 x 10^-4 | 3.2 x 10^-8
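
The entries of the pathway table above can be compared programmatically. The following is a minimal Python sketch, not part of the original analysis. It assumes that the numeric entries are enrichment p-values and that "n.s." denotes "not significant". The names `parse_p` and `shared_pathways` are hypothetical helpers introduced for illustration; the pattern tolerates the exponent being written with or without a caret.

```python
import re

# Matches entries such as "3.54 x 10^-3" or "3.54 x 10 -3".
P_VALUE = re.compile(r"(\d+(?:\.\d+)?)\s*x\s*10\s*\^?\s*(-?\d+)")


def parse_p(entry):
    """Return the entry as a float, or None for 'n.s.' (assumed: not significant)."""
    entry = entry.strip()
    if entry == "n.s.":
        return None
    match = P_VALUE.fullmatch(entry)
    if match is None:
        raise ValueError(f"unrecognized table entry: {entry!r}")
    mantissa, exponent = match.groups()
    return float(mantissa) * 10 ** int(exponent)


# Three rows taken from the table: (miR-361-5p, Pum1, Pum2).
rows = {
    "Angiogenesis": ("1.55 x 10^-3", "1.0 x 10^-7", "1.2 x 10^-8"),
    "Cell cycle": ("n.s.", "3.0 x 10^-2", "5.9 x 10^-4"),
    "VEGF signaling pathway": ("8.46 x 10^-3", "1.6 x 10^-2", "2.8 x 10^-3"),
}


def shared_pathways(table):
    """Names of pathways that carry a numeric entry in all three columns."""
    return sorted(
        name
        for name, cells in table.items()
        if all(parse_p(cell) is not None for cell in cells)
    )


print(shared_pathways(rows))
```

Applied to the full table, the same filter would list the pathways reported for miR-361-5p, Pum1 and Pum2 alike.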

7.7 pTO-HA-Strep-GW-FRT map and sequence

Map and sequence of the plasmid pTO-HA-Strep-GW-FRT. The following features are indicated (starting from position 1): a hybrid human cytomegalovirus promoter with tet operator sequences (CMV/tetO promoter) that allows tetracycline-regulated expression of a gene of interest (tetracycline displaces the tetracycline repressor protein, which binds to the tetO sequences and represses gene expression in the absence of inducer); an HA/StrepIII tandem tag after a Kozak sequence/start codon (not shown); a Gateway cassette consisting of two attachment sites (attR1 and attR2), a chloramphenicol acetyltransferase gene (CAT) and a ‘killer gene’ (ccdB) for the convenient recombinase-based cloning of genes of interest using Gateway technology (Invitrogen); a bovine growth hormone (bGH) terminator for the termination of transcription and polyadenylation of transcripts; a Flippase recognition target (FRT) site for genomic integration by Flippase (FLP)-mediated recombination; a hygromycin resistance gene (hygroB) for selection of cells in which recombination occurred (note that the corresponding promoter is present upstream of the FRT site in the target cell lines); and an origin of replication (pBR322), an ampicillin resistance marker gene (AmpR) and a corresponding promoter (AmpR promoter) for propagation of the plasmid in E. coli. The restriction sites used in this work are indicated (blue). The map and formatted sequence were created with PlasMapper (Dong et al., 2004).
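
The feature coordinates annotated in the sequence listing of this section can be collected in a small lookup table, for example to check feature lengths. The following Python sketch is illustrative only. It assumes that the annotated positions are 1-based and inclusive, and that ">>>" marks a feature on the forward strand and "<<<" one on the reverse strand. `FEATURES`, `feature_length` and `features_in_order` are hypothetical names.

```python
# Feature coordinates as annotated in the sequence listing: (start, end, strand).
# Assumptions: positions are 1-based and inclusive; ">>>" is read as the
# forward strand ("+") and "<<<" as the reverse strand ("-").
FEATURES = {
    "CMV/tetO promoter": (236, 955, "+"),
    "tetO reg": (820, 859, "+"),
    "HA/StrepIII tag": (1112, 1204, "+"),
    "attR1": (1217, 1336, "+"),
    "CAT": (1450, 2109, "+"),
    "ccdB": (2451, 2756, "+"),
    "attR2": (2797, 2897, "+"),
    "bGH terminator": (2962, 3189, "+"),
    "FRT": (3473, 3520, "+"),
    "hygroB": (3530, 4547, "+"),
    "pBR322 origin": (5237, 5856, "-"),
    "AmpR": (6011, 6871, "-"),
    "AmpR promoter": (6913, 6941, "-"),
}
PLASMID_LENGTH = 7007  # last position shown in the listing


def feature_length(name):
    """Length in nucleotides of a named feature (inclusive coordinates)."""
    start, end, _strand = FEATURES[name]
    return end - start + 1


def features_in_order():
    """Feature names sorted by their start position."""
    return sorted(FEATURES, key=lambda name: FEATURES[name][0])


for name in features_in_order():
    start, end, strand = FEATURES[name]
    print(f"{name:18s} {start:5d}-{end:<5d} {strand} {feature_length(name):5d} nt")
```

Under these assumptions, the CAT, ccdB and AmpR intervals span 660, 306 and 861 nucleotides, respectively, each a multiple of three.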


1 GACGGATCGGGAGATCTCCCGATCCCCTATGGTGCACTCTCAGTACAATCTGCTCTGATG 60 CTGCCTAGCCCTCTAGAGGGCTAGGGGATACCACGTGAGAGTCATGTTAGACGAGACTAC

61 CCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCG 120 GGCGTATCAATTCGGTCATAGACGAGGGACGAACACACAACCTCCAGCGACTCATCACGC

121 CGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTGC 180 GCTCGTTTTAAATTCGATGTTGTTCCGTTCCGAACTGGCTGTTAACGTACTTCTTAGACG

CMV/tetO promoter(236,955)>>> | 181 TTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATT 240 AATCCCAATCCGCAAAACGCGACGAAGCGCTACATGCCCGGTCTATATGCGCAACTGTAA

241 GATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATA 300 CTAATAACTGATCAATAATTATCATTAGTTAATGCCCCAGTAATCAAGTATCGGGTATAT

301 TGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACC 360 ACCTCAAGGCGCAATGTATTGAATGCCATTTACCGGGCGGACCGACTGGCGGGTTGCTGG

361 CCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCC 420 GGGCGGGTAACTGCAGTTATTACTGCATACAAGGGTATCATTGCGGTTATCCCTGAAAGG

421 ATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGT 480 TAACTGCAGTTACCCACCTCATAAATGCCATTTGACGGGTGAACCGTCATGTAGTTCACA

481 ATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATT 540 TAGTATACGGTTCATGCGGGGGATAACTGCAGTTACTGCCATTTACCGGGCGGACCGTAA

541 ATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCA 600 TACGGGTCATGTACTGGAATACCCTGAAAGGATGAACCGTCATGTAGATGCATAATCAGT

601 TCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTG 660 AGCGATAATGGTACCACTACGCCAAAACCGTCATGTAGTTACCCGCACCTATCGCCAAAC

661 ACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACC 720 TGAGTGCCCCTAAAGGTTCAGAGGTGGGGTAACTGCAGTTACCCTCAAACAAAACCGTGG

721 AAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCG 780 TTTTAGTTGCCCTGAAAGGTTTTACAGCATTGTTGAGGCGGGGTAACTGCGTTTACCCGC

tetO reg(820,859)>>> | 781 GTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCTCCCTATCAGTGATAGAGATC 840 CATCCGCACATGCCACCCTCCAGATATATTCGTCTCGAGAGGGATAGTCACTATCTCTAG

841 TCCCTATCAGTGATAGAGATCGTCGACGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGA 900 AGGGATAGTCACTATCTCTAGCAGCTGCTCGAGCAAATCACTTGGCAGTCTAGCGGACCT

901 GACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACCGATCCAGCCTCCGGA 960 CTGCGGTAGGTGCGACAAAACTGGAGGTATCTTCTGTGGCCCTGGCTAGGTCGGAGGCCT

BamHI | 961 CTCTAGCGTTTAAACTTAAGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGTGGTGGA 1020 GAGATCGCAAATTTGAATTCGAACCATGGCTCGAGCCTAGGTGATCAGGTCACACCACCT

1021 ATTCTGCAGATATCCAGCACAGTGGCGGCCGCTCGAGACCATGTACCCATACGATGTTCC 1080 TAAGACGTCTATAGGTCGTGTCACCGCCGGCGAGCTCTGGTACATGGGTATGCTACAAGG

BamHI HA/StrepIII tag(1112,1204)>>> | | 1081 TGACTATGCCGGTACCGAGCTCGGATCCACCATGGCTAGCTGGAGCCACCCGCAGTTCGA 1140 ACTGATACGGCCATGGCTCGAGCCTAGGTGGTACCGATCGACCTCGGTGGGCGTCAAGCT

1141 GAAAGGTGGAGGTTCCGGAGGTGGATCGGGAGGTGGATCGTGGAGCCACCCGCAGTTCGA 1200 CTTTCCACCTCCAAGGCCTCCACCTAGCCCTCCACCTAGCACCTCGGTGGGCGTCAAGCT

attR1 (1217,1336)>>> | 1201 AAAAGCGGCCGATATCACAAGTTTGTACAAAAAAGCTGAACGAGAAACGTAAAATGATAT 1260 TTTTCGCCGGCTATAGTGTTCAAACATGTTTTTTCGACTTGCTCTTTGCATTTTACTATA

1261 AAATATCAATATATTAAATTAGATTTTGCATAAAAAACAGACTACATAATACTGTAAAAC 1320 TTTATAGTTATATAATTTAATCTAAAACGTATTTTTTGTCTGATGTATTATGACATTTTG

1321 ACAACATATCCAGTCATATTGGCGGCCGCATTAGGCACCCCAGGCTTTACACTTTATGCT 1380 TGTTGTATAGGTCAGTATAACCGCCGGCGTAATCCGTGGGGTCCGAAATGTGAAATACGA

BamHI | 1381 TCCGGCTCGTATAATGTGTGGATTTTGAGTTAGGATCCGTCGAGATTTTCAGGAGCTAAG 1440 AGGCCGAGCATATTACACACCTAAAACTCAATCCTAGGCAGCTCTAAAAGTCCTCGATTC

CAT (1450,2109)>>> | 1441 GAAGCTAAAATGGAGAAAAAAATCACTGGATATACCACCGTTGATATATCCCAATGGCAT 1500 CTTCGATTTTACCTCTTTTTTTAGTGACCTATATGGTGGCAACTATATAGGGTTACCGTA

1501 CGTAAAGAACATTTTGAGGCATTTCAGTCAGTTGCTCAATGTACCTATAACCAGACCGTT 1560 GCATTTCTTGTAAAACTCCGTAAAGTCAGTCAACGAGTTACATGGATATTGGTCTGGCAA

1561 CAGCTGGATATTACGGCCTTTTTAAAGACCGTAAAGAAAAATAAGCACAAGTTTTATCCG 1620 GTCGACCTATAATGCCGGAAAAATTTCTGGCATTTCTTTTTATTCGTGTTCAAAATAGGC

1621 GCCTTTATTCACATTCTTGCCCGCCTGATGAATGCTCATCCGGAATTCCGTATGGCAATG 1680 CGGAAATAAGTGTAAGAACGGGCGGACTACTTACGAGTAGGCCTTAAGGCATACCGTTAC

1681 AAAGACGGTGAGCTGGTGATATGGGATAGTGTTCACCCTTGTTACACCGTTTTCCATGAG 1740 TTTCTGCCACTCGACCACTATACCCTATCACAAGTGGGAACAATGTGGCAAAAGGTACTC

1741 CAAACTGAAACGTTTTCATCGCTCTGGAGTGAATACCACGACGATTTCCGGCAGTTTCTA 1800 GTTTGACTTTGCAAAAGTAGCGAGACCTCACTTATGGTGCTGCTAAAGGCCGTCAAAGAT

1801 CACATATATTCGCAAGATGTGGCGTGTTACGGTGAAAACCTGGCCTATTTCCCTAAAGGG 1860 GTGTATATAAGCGTTCTACACCGCACAATGCCACTTTTGGACCGGATAAAGGGATTTCCC

1861 TTTATTGAGAATATGTTTTTCGTCTCAGCCAATCCCTGGGTGAGTTTCACCAGTTTTGAT 1920 AAATAACTCTTATACAAAAAGCAGAGTCGGTTAGGGACCCACTCAAAGTGGTCAAAACTA

1921 TTAAACGTGGCCAATATGGACAACTTCTTCGCCCCCGTTTTCACCATGGGCAAATATTAT 1980 AATTTGCACCGGTTATACCTGTTGAAGAAGCGGGGGCAAAAGTGGTACCCGTTTATAATA

1981 ACGCAAGGCGACAAGGTGCTGATGCCGCTGGCGATTCAGGTTCATCATGCCGTTTGTGAT 2040 TGCGTTCCGCTGTTCCACGACTACGGCGACCGCTAAGTCCAAGTAGTACGGCAAACACTA

2041 GGCTTCCATGTCGGCAGAATGCTTAATGAATTACAACAGTACTGCGATGAGTGGCAGGGC 2100 CCGAAGGTACAGCCGTCTTACGAATTACTTAATGTTGTCATGACGCTACTCACCGTCCCG

BamHI | 2101 GGGGCGTAAACGCGTGGATCCGGCTTACTAAAAGCCAGATAACAGTATGCGTATTTGCGC 2160 CCCCGCATTTGCGCACCTAGGCCGAATGATTTTCGGTCTATTGTCATACGCATAAACGCG

2161 GCTGATTTTTGCGGTATAAGAATATATACTGATATGTATACCCGAAGTATGTCAAAAAGA 2220 CGACTAAAAACGCCATATTCTTATATATGACTATACATATGGGCTTCATACAGTTTTTCT

2221 GGTATGCTATGAAGCAGCGTATTACAGTGACAGTTGACAGCGACAGCTATCAGTTGCTCA 2280 CCATACGATACTTCGTCGCATAATGTCACTGTCAACTGTCGCTGTCGATAGTCAACGAGT

2281 AGGCATATATGATGTCAATATCTCCGGTCTGGTAAGCACAACCATGCAGAATGAAGCCCG 2340 TCCGTATATACTACAGTTATAGAGGCCAGACCATTCGTGTTGGTACGTCTTACTTCGGGC

2341 TCGTCTGCGTGCCGAACGCTGGAAAGCGGAAAATCAGGAAGGGATGGCTGAGGTCGCCCG 2400 AGCAGACGCACGGCTTGCGACCTTTCGCCTTTTAGTCCTTCCCTACCGACTCCAGCGGGC

ccdB(2451,2756)>>> | 2401 GTTTATTGAAATGAACGGCTCTTTTGCTGACGAGAACAGGGGCTGGTGAAATGCAGTTTA 2460 CAAATAACTTTACTTGCCGAGAAAACGACTGCTCTTGTCCCCGACCACTTTACGTCAAAT

2461 AGGTTTACACCTATAAAAGAGAGAGCCGTTATCGTCTGTTTGTGGATGTACAGAGTGATA 2520 TCCAAATGTGGATATTTTCTCTCTCGGCAATAGCAGACAAACACCTACATGTCTCACTAT

2521 TTATTGACACGCCCGGGCGACGGATGGTGATCCCCCTGGCCAGTGCACGTCTGCTGTCAG 2580 AATAACTGTGCGGGCCCGCTGCCTACCACTAGGGGGACCGGTCACGTGCAGACGACAGTC

2581 ATAAAGTCTCCCGTGAACTTTACCCGGTGGTGCATATCGGGGATGAAAGCTGGCGCATGA 2640 TATTTCAGAGGGCACTTGAAATGGGCCACCACGTATAGCCCCTACTTTCGACCGCGTACT

2641 TGACCACCGATATGGCCAGTGTGCCGGTCTCCGTTATCGGGGAAGAAGTGGCTGATCTCA 2700 ACTGGTGGCTATACCGGTCACACGGCCAGAGGCAATAGCCCCTTCTTCACCGACTAGAGT

2701 GCCACCGCGAAAATGACATCAAAAACGCCATTAACCTGATGTTCTGGGGAATATAAATGT 2760 CGGTGGCGCTTTTACTGTAGTTTTTGCGGTAATTGGACTACAAGACCCCTTATATTTACA

attR2(2797,2897)>>> | 2761 CAGGCTCCCTTATACACAGCCAGTCTGCAGGTCGACCATAGTGACTGGATATGTTGTGTT 2820 GTCCGAGGGAATATGTGTCGGTCAGACGTCCAGCTGGTATCACTGACCTATACAACACAA

2821 TTACAGCATTATGTAGTCTGTTTTTTATGCAAAATCTAATTTAATATATTGATATTTATA 2880 AATGTCGTAATACATCAGACAAAAAATACGTTTTAGATTAAATTATATAACTATAAATAT

ApaI | 2881 TCATTTTACGTTTCTCGTTCAGCTTTCTTGTACAAAGTGGTGACGTAAGCTAGGGGCCCG 2940 AGTAAAATGCAAAGAGCAAGTCGAAAGAACATGTTTCACCACTGCATTCGATCCCCGGGC

bGH terminator(2962,3189)>>> | 2941 TTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCC 3000 AAATTTGGGCGACTAGTCGGAGCTGACACGGAAGATCAACGGTCGGTAGACAACAAACGG

3001 CCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAA 3060 GGAGGGGGCACGGAAGGAACTGGGACCTTCCACGGTGAGGGTGACAGGAAAGGATTATTT

3061 ATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGG 3120 TACTCCTTTAACGTAGCGTAACAGACTCATCCACAGTAAGATAAGACCCCCCACCCCACC

3121 GGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGG 3180 CCGTCCTGTCGTTCCCCCTCCTAACCCTTCTGTTATCGTCCGTACGACCCCTACGCCACC

3181 GCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCTAGGGGGTATCCCCACGCGC 3240 CGAGATACCGAAGACTCCGCCTTTCTTGGTCGACCCCGAGATCCCCCATAGGGGTGCGCG

3241 CCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACAC 3300 GGACATCGCCGCGTAATTCGCGCCGCCCACACCACCAATGCGCGTCGCACTGGCGATGTG

3301 TTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCG 3360 AACGGTCGCGGGATCGCGGGCGAGGAAAGCGAAAGAAGGGAAGGAAAGAGCGGTGCAAGC

3361 CCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTT 3420 GGCCGAAAGGGGCAGTTCGAGATTTAGCCCCCGAGGGAAATCCCAAGGCTAAATCACGAA

FRT(3473,3520)>>> | 3421 TACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTACCTAGAAGTTCC 3480 ATGCCGTGGAGCTGGGGTTTTTTGAACTAATCCCACTACCAAGTGCATGGATCTTCAAGG

hygroB(3530,4547)>>> | 3481 TATTCCGAAGTTCCTATTCTCTAGAAAGTATAGGAACTTCCTTGGCCAAAAAGCCTGAAC 3540 ATAAGGCTTCAAGGATAAGAGATCTTTCATATCCTTGAAGGAACCGGTTTTTCGGACTTG

3541 TCACCGCGACGTCTGTCGAGAAGTTTCTGATCGAAAAGTTCGACAGCGTCTCCGACCTGA 3600 AGTGGCGCTGCAGACAGCTCTTCAAAGACTAGCTTTTCAAGCTGTCGCAGAGGCTGGACT

3601 TGCAGCTCTCGGAGGGCGAAGAATCTCGTGCTTTCAGCTTCGATGTAGGAGGGCGTGGAT 3660 ACGTCGAGAGCCTCCCGCTTCTTAGAGCACGAAAGTCGAAGCTACATCCTCCCGCACCTA

3661 ATGTCCTGCGGGTAAATAGCTGCGCCGATGGTTTCTACAAAGATCGTTATGTTTATCGGC 3720 TACAGGACGCCCATTTATCGACGCGGCTACCAAAGATGTTTCTAGCAATACAAATAGCCG

3721 ACTTTGCATCGGCCGCGCTCCCGATTCCGGAAGTGCTTGACATTGGGGAATTCAGCGAGA 3780 TGAAACGTAGCCGGCGCGAGGGCTAAGGCCTTCACGAACTGTAACCCCTTAAGTCGCTCT

3781 GCCTGACCTATTGCATCTCCCGCCGTGCACAGGGTGTCACGTTGCAAGACCTGCCTGAAA 3840 CGGACTGGATAACGTAGAGGGCGGCACGTGTCCCACAGTGCAACGTTCTGGACGGACTTT

3841 CCGAACTGCCCGCTGTTCTGCAGCCGGTCGCGGAGGCCATGGATGCGATCGCTGCGGCCG 3900 GGCTTGACGGGCGACAAGACGTCGGCCAGCGCCTCCGGTACCTACGCTAGCGACGCCGGC

3901 ATCTTAGCCAGACGAGCGGGTTCGGCCCATTCGGACCGCAAGGAATCGGTCAATACACTA 3960 TAGAATCGGTCTGCTCGCCCAAGCCGGGTAAGCCTGGCGTTCCTTAGCCAGTTATGTGAT

3961 CATGGCGTGATTTCATATGCGCGATTGCTGATCCCCATGTGTATCACTGGCAAACTGTGA 4020 GTACCGCACTAAAGTATACGCGCTAACGACTAGGGGTACACATAGTGACCGTTTGACACT

4021 TGGACGACACCGTCAGTGCGTCCGTCGCGCAGGCTCTCGATGAGCTGATGCTTTGGGCCG 4080 ACCTGCTGTGGCAGTCACGCAGGCAGCGCGTCCGAGAGCTACTCGACTACGAAACCCGGC

4081 AGGACTGCCCCGAAGTCCGGCACCTCGTGCACGCGGATTTCGGCTCCAACAATGTCCTGA 4140 TCCTGACGGGGCTTCAGGCCGTGGAGCACGTGCGCCTAAAGCCGAGGTTGTTACAGGACT

4141 CGGACAATGGCCGCATAACAGCGGTCATTGACTGGAGCGAGGCGATGTTCGGGGATTCCC 4200 GCCTGTTACCGGCGTATTGTCGCCAGTAACTGACCTCGCTCCGCTACAAGCCCCTAAGGG

4201 AATACGAGGTCGCCAACATCTTCTTCTGGAGGCCGTGGTTGGCTTGTATGGAGCAGCAGA 4260 TTATGCTCCAGCGGTTGTAGAAGAAGACCTCCGGCACCAACCGAACATACCTCGTCGTCT

4261 CGCGCTACTTCGAGCGGAGGCATCCGGAGCTTGCAGGATCGCCGCGGCTCCGGGCGTATA 4320 GCGCGATGAAGCTCGCCTCCGTAGGCCTCGAACGTCCTAGCGGCGCCGAGGCCCGCATAT

4321 TGCTCCGCATTGGTCTTGACCAACTCTATCAGAGCTTGGTTGACGGCAATTTCGATGATG 4380 ACGAGGCGTAACCAGAACTGGTTGAGATAGTCTCGAACCAACTGCCGTTAAAGCTACTAC

4381 CAGCTTGGGCGCAGGGTCGATGCGACGCAATCGTCCGATCCGGAGCCGGGACTGTCGGGC 4440 GTCGAACCCGCGTCCCAGCTACGCTGCGTTAGCAGGCTAGGCCTCGGCCCTGACAGCCCG

4441 GTACACAAATCGCCCGCAGAAGCGCGGCCGTCTGGACCGATGGCTGTGTAGAAGTACTCG 4500 CATGTGTTTAGCGGGCGTCTTCGCGCCGGCAGACCTGGCTACCGACACATCTTCATGAGC

4501 CCGATAGTGGAAACCGACGCCCCAGCACTCGTCCGAGGGCAAAGGAATAGCACGTACTAC 4560 GGCTATCACCTTTGGCTGCGGGGTCGTGAGCAGGCTCCCGTTTCCTTATCGTGCATGATG

4561 GAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGGAATCGTTTTCCGGG 4620 CTCTAAAGCTAAGGTGGCGGCGGAAGATACTTTCCAACCCGAAGCCTTAGCAAAAGGCCC

4621 ACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATGCTGGAGTTCTTCGCCCACCCCA 4680 TGCGGCCGACCTACTAGGAGGTCGCGCCCCTAGAGTACGACCTCAAGAAGCGGGTGGGGT

4681 ACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAAATTTCACAA 4740 TGAACAAATAACGTCGAATATTACCAATGTTTATTTCGTTATCGTAGTGTTTAAAGTGTT

4741 ATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTT 4800 TATTTCGTAAAAAAAGTGACGTAAGATCAACACCAAACAGGTTTGAGTAGTTACATAGAA

4801 ATCATGTCTGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGT 4860 TAGTACAGACATATGGCAGCTGGAGATCGATCTCGAACCGCATTAGTACCAGTATCGACA

4861 TTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAA 4920 AAGGACACACTTTAACAATAGGCGAGTGTTAAGGTGTGTTGTATGCTCGGCCTTCGTATT

4921 AGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCAC 4980 TCACATTTCGGACCCCACGGATTACTCACTCGATTGAGTGTAATTAACGCAACGCGAGTG

4981 TGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCG 5040 ACGGGCGAAAGGTCAGCCCTTTGGACAGCACGGTCGACGTAATTACTTAGCCGGTTGCGC

5041 CGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGC 5100 GCCCCTCTCCGCCAAACGCATAACCCGCGAGAAGGCGAAGGAGCGAGTGACTGAGCGACG

5101 GCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTAT 5160 CGAGCCAGCAAGCCGACGCCGCTCGCCATAGTCGAGTGAGTTTCCGCCATTATGCCAATA

5161 CCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCA 5220 GGTGTCTTAGTCCCCTATTGCGTCCTTTCTTGTACACTCGTTTTCCGGTCGTTTTCCGGT

pBR322 origin(5237,5856)<<< | 5221 GGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGC 5280 CCTTGGCATTTTTCCGGCGCAACGACCGCAAAAAGGTATCCGAGGCGGGGGGACTGCTCG

5281 ATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACC 5340 TAGTGTTTTTAGCTGCGAGTTCAGTCTCCACCGCTTTGGGCTGTCCTGATATTTCTATGG

5341 AGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCG 5400 TCCGCAAAGGGGGACCTTCGAGGGAGCACGCGAGAGGACAAGGCTGGGACGGCGAATGGC

5401 GATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTA 5460 CTATGGACAGGCGGAAAGAGGGAAGCCCTTCGCACCGCGAAAGAGTATCGAGTGCGACAT

5461 GGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCG 5520 CCATAGAGTCAAGCCACATCCAGCAAGCGAGGTTCGACCCGACACACGTGCTTGGGGGGC

5521 TTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGAC 5580 AAGTCGGGCTGGCGACGCGGAATAGGCCATTGATAGCAGAACTCAGGTTGGGCCATTCTG

5581 ACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAG 5640 TGCTGAATAGCGGTGACCGTCGTCGGTGACCATTGTCCTAATCGTCTCGCTCCATACATC

5641 GCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTAT 5700 CGCCACGATGTCTCAAGAACTTCACCACCGGATTGATGCCGATGTGATCTTCTTGTCATA

5701 TTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGAT 5760 AACCATAGACGCGAGACGACTTCGGTCAATGGAAGCCTTTTTCTCAACCATCGAGAACTA

5761 CCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGC 5820 GGCCGTTTGTTTGGTGGCGACCATCGCCACCAAAAAAACAAACGTTCGTCGTCTAATGCG

5821 GCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGT 5880 CGTCTTTTTTTCCTAGAGTTCTTCTAGGAAACTAGAAAAGATGCCCCAGACTGCGAGTCA

5881 GGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCT 5940 CCTTGCTTTTGAGTGCAATTCCCTAAAACCAGTACTCTAATAGTTTTTCCTAGAAGTGGA

5941 AGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTT 6000 TCTAGGAAAATTTAATTTTTACTTCAAAATTTAGTTAGATTTCATATATACTCATTTGAA

AmpR(6011,6871)<<< | 6001 GGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTC 6060 CCAGACTGTCAATGGTTACGAATTAGTCACTCCGTGGATAGAGTCGCTAGACAGATAAAG

6061 GTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTAC 6120 CAAGTAGGTATCAACGGACTGAGGGGCAGCACATCTATTGATGCTATGCCCTCCCGAATG


6121 CATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTAT 6180 GTAGACCGGGGTCACGACGTTACTATGGCGCTCTGGGTGCGAGTGGCCGAGGTCTAAATA

6181 CAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCG 6240 GTCGTTATTTGGTCGGTCGGCCTTCCCGGCTCGCGTCTTCACCAGGACGTTGAAATAGGC

6241 CCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATA 6300 GGAGGTAGGTCAGATAATTAACAACGGCCCTTCGATCTCATTCATCAAGCGGTCAATTAT

6301 GTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTA 6360 CAAACGCGTTGCAACAACGGTAACGATGTCCGTAGCACCACAGTGCGAGCAGCAAACCAT

6361 TGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGT 6420 ACCGAAGTAAGTCGAGGCCAAGGGTTGCTAGTTCCGCTCAATGTACTAGGGGGTACAACA

6421 GCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAG 6480 CGTTTTTTCGCCAATCGAGGAAGCCAGGAGGCTAGCAACAGTCTTCATTCAACCGGCGTC

6481 TGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAA 6540 ACAATAGTGAGTACCAATACCGTCGTGACGTATTAAGAGAATGACAGTACGGTAGGCATT

6541 GATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGC 6600 CTACGAAAAGACACTGACCACTCATGAGTTGGTTCAGTAAGACTCTTATCACATACGCCG

6601 GACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTT 6660 CTGGCTCAACGAGAACGGGCCGCAGTTATGCCCTATTATGGCGCGGTGTATCGTCTTGAA

6661 TAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGC 6720 ATTTTCACGAGTAGTAACCTTTTGCAAGAAGCCCCGCTTTTGAGAGTTCCTAGAATGGCG

6721 TGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTA 6780 ACAACTCTAGGTCAAGCTACATTGGGTGAGCACGTGGGTTGACTAGAAGTCGTAGAAAAT

6781 CTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAA 6840 GAAAGTGGTCGCAAAGACCCACTCGTTTTTGTCCTTCCGTTTTACGGCGTTTTTTCCCTT

6841 TAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCA 6900 ATTCCCGCTGTGCCTTTACAACTTATGAGTATGAGAAGGAAAAAGTTATAATAACTTCGT

AmpR promoter(6913,6941)<<< | 6901 TTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAAC 6960 AAATAGTCCCAATAACAGAGTACTCGCCTATGTATAAACTTACATAAATCTTTTTATTTG

6961 AAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTC 7007 TTTATCCCCAAGGCGCGTGTAAAGGGGCTTTTCACGGTGGACTGCAG
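
Each block of the listing above gives a start position, the top strand, an end position and the complementary strand written in the same left-to-right order. Transcription errors in such a listing can be caught with a simple consistency check. The following Python sketch shows one way to do this for a single block. `parse_block` and `check_block` are hypothetical helper names, and the block format is inferred from the listing rather than taken from the PlasMapper documentation.

```python
# Watson-Crick complement for an unambiguous DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_block(line):
    """Split 'start TOP end BOTTOM' into its four fields."""
    start, top, end, bottom = line.split()
    return int(start), top, int(end), bottom


def check_block(line):
    """True if the coordinates match the strand length and the bottom
    strand is the base-by-base complement of the top strand."""
    start, top, end, bottom = parse_block(line)
    length_ok = end - start + 1 == len(top) == len(bottom)
    complement_ok = top.translate(COMPLEMENT) == bottom
    return length_ok and complement_ok


# Last block of the listing (positions 6961 to 7007).
last_block = (
    "6961 AAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTC 7007 "
    "TTTATCCCCAAGGCGCGTGTAAAGGGGCTTTTCACGGTGGACTGCAG"
)
print(check_block(last_block))
```

Blocks that carry a feature label before the "|" separator would need the label stripped before being passed to `check_block`.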


8 Bibliography

Adkins, J. N., Varnum, S. M., Auberry, K. J., Moore, R. J., Angell, N. H., Smith, R. D., Springer, D. L., and Pounds, J. G. (2002). Toward a human blood serum proteome: analysis by multidimensional separation coupled with mass spectrometry. Mol Cell Proteomics 1, 947–955.

Afanasyeva, E. A., Hotz-Wagenblatt, A., Glatting, K.-H., and Westermann, F. (2008). New miRNAs cloned from neuroblastoma. BMC Genomics 9, 52.

Agrawal, D., Hauser, P., McPherson, F., Dong, F., Garcia, A., and Pledger, W. J. (1996). Repression of p27kip1 synthesis by platelet-derived growth factor in BALB/c 3T3 cells. Mol Cell Biol 16, 4327–4336.

Allerson, C. R., Martinez, A., Yikilmaz, E., and Rouault, T. A. (2003). A high-capacity RNA affinity column for the purification of human IRP1 and IRP2 overexpressed in Pichia pastoris. RNA 9, 364–374.

Alon, U. (2007). Network motifs: theory and experimental approaches. Nat Rev Genet 8, 450–461.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215, 403–410.

Anantharaman, V., Koonin, E. V., and Aravind, L. (2002). Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res 30, 1427–1464.

Anderson, N. L., and Anderson, N. G. (2002). The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics 1, 845–867.

Andronescu, M., Condon, A., Hoos, H. H., Mathews, D. H., and Murphy, K. P. (2007). Efficient parameter estimation for RNA secondary structure prediction. Bioinformatics 23, i19–i28.

Andronescu, M., Fejes, A. P., Hutter, F., Hoos, H. H., and Condon, A. (2004). A new algorithm for RNA secondary structure design. J Mol Biol 336, 607–624.

Archer, S. K., Luu, V.-D., de Queiroz, R. A., Brems, S., and Clayton, C. (2009). Trypanosoma brucei PUF9 regulates mRNAs for proteins involved in replicative processes over the cell cycle. PLoS Pathog 5, e1000565.

Bachler, M., Schroeder, R., and von Ahsen, U. (1999). StreptoTag: a novel method for the isolation of RNA-binding proteins. RNA 5, 1509–1516.

Baek, D., Villén, J., Shin, C., Camargo, F. D., Gygi, S. P., and Bartel, D. P. (2008). The impact of microRNAs on protein output. Nature 455, 64–71.

Bailey, T. L., Boden, M., Buske, F. A., Frith, M., Grant, C. E., Clementi, L., Ren, J., Li, W. W., and Noble, W. S. (2009). MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res 37, W202–W208.


Bardwell, V. J., and Wickens, M. (1990). Purification of RNA and RNA-protein complexes by an R17 coat protein affinity method. Nucleic Acids Res 18, 6587–6594.

Barrera, L. O., and Ren, B. (2006). The transcriptional regulatory code of eukaryotic cells – insights from genome-wide analysis of chromatin organization and transcription factor binding. Curr Opin Cell Biol 18, 291–298.

Bates, D. O., Cui, T.-G., Doughty, J. M., Winkler, M., Sugiono, M., Shields, J. D., Peat, D., Gillatt, D., and Harper, S. J. (2002). VEGF165b, an inhibitory splice variant of vascular endothelial growth factor, is down-regulated in renal cell carcinoma. Cancer Res 62, 4123–4131.

Beach, D. L., and Keene, J. D. (2008). Ribotrap: targeted purification of RNA-specific RNPs from cell lysates through immunoaffinity precipitation to identify regulatory proteins and RNAs. Methods Mol Biol 419, 69–91.

Bellucci, M., Agostini, F., Masin, M., and Tartaglia, G. G. (2011). Predicting protein associations with long noncoding RNAs. Nat Methods 8, 444–445.

Belotserkovskii, B. P., Liu, R., Tornaletti, S., Krasilnikova, M. M., Mirkin, S. M., and Hanawalt, P. C. (2010). Mechanisms and implications of transcription blockage by guanine-rich DNA sequences. Proc Natl Acad Sci U S A 107, 12816–12821.

Betel, D., Koppal, A., Agius, P., Sander, C., and Leslie, C. (2010). Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome Biol 11, R90.

Bhattacharyya, S. N., Habermacher, R., Martine, U., Closs, E. I., and Filipowicz, W. (2006). Relief of microRNA-mediated translational repression in human cells subjected to stress. Cell 125, 1111–1124.

Bindereif, A., and Green, M. R. (1987). An ordered pathway of snRNP binding during mammalian pre-mRNA splicing complex assembly. EMBO J 6, 2415–2424.

Bischofberger, N., Ng, P. G., Webb, T. R., and Matteucci, M. D. (1987). Cleavage of single stranded oligonucleotides by EcoRI restriction endonuclease. Nucleic Acids Res 15, 709–716.

Blencowe, B. J., and Barabino, S. M. (1995). Antisense affinity depletion of RNP particles. Application to spliceosomal snRNPs. Methods Mol Biol 37, 67–76.

Bourdeau, V., Ferbeyre, G., Pageau, M., Paquin, B., and Cedergren, R. (1999). The distribution of RNA motifs in natural sequences. Nucleic Acids Res 27, 4457–4467.

Bowden, J., Brennan, P. A., Umar, T., and Cronin, A. (2002). Expression of vascular endothelial growth factor in basal cell carcinoma and cutaneous squamous cell carcinoma of the head and neck. J Cutan Pathol 29, 585–589.

Brenner, S., Jacob, F., and Meselson, M. (1961). An unstable intermediate carrying information from genes to ribosomes for protein synthesis. Nature 190, 576–581.


Brickner, A. G., Warren, E. H., Caldwell, J. A., Akatsuka, Y., Golovina, T. N., Zarling, A. L., Shabanowitz, J., Eisenlohr, L. C., Hunt, D. F., Engelhard, V. H., et al. (2001). The immunogenicity of a new human minor histocompatibility antigen results from differential antigen processing. J Exp Med 193, 195–206.

Brodsky, A. S., and Silver, P. A. (2000). Pre-mRNA processing factors are required for nuclear export. RNA 6, 1737–1749.

Butter, F., Scheibe, M., Mörl, M., and Mann, M. (2009). Unbiased RNA-protein interaction screen by quantitative proteomics. Proc Natl Acad Sci U S A 106, 10626–10631.

Böck-Taferner, P., and Wank, H. (2004). GAPDH enhances group II intron splicing in vitro. Biol Chem 385, 615–621.

Calin, G. A., and Croce, C. M. (2006). MicroRNA signatures in human cancers. Nat Rev Cancer 6, 857–866.

Caputi, M., Mayeda, A., Krainer, A. R., and Zahler, A. M. (1999). hnRNP A/B proteins are required for inhibition of HIV-1 pre-mRNA splicing. EMBO J 18, 4060–4067.

Carmeliet, P. (2005). Angiogenesis in life, disease and medicine. Nature 438, 932–936.

Chang, H.-Y., Fan, C.-C., Chu, P.-C., Hong, B.-E., Lee, H. J., and Chang, M.-S. (2011). hPuf-A/KIAA0020 modulates PARP-1 cleavage upon genotoxic stress. Cancer Res 71, 1126–1134.

Chang, S.-H., and Hla, T. (2011). Gene regulation by RNA binding proteins and microRNAs in angiogenesis. Trends Mol Med 17, 650–658.

Chen, S. J., and Dill, K. A. (2000). RNA folding energy landscapes. Proc Natl Acad Sci U S A 97, 646–651.

Chendrimada, T. P., Gregory, R. I., Kumaraswamy, E., Norman, J., Cooch, N., Nishikura, K., and Shiekhattar, R. (2005). TRBP recruits the Dicer complex to Ago2 for microRNA processing and gene silencing. Nature 436, 740–744.

Cheong, C.-G., and Hall, T. M. T. (2006). Engineering RNA sequence specificity of Pumilio repeats. Proc Natl Acad Sci U S A 103, 13635–13639.

Chi, S. W., Zang, J. B., Mele, A., and Darnell, R. B. (2009). Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature 460, 479–486.

Cho, S., Kim, J. H., Back, S. H., and Jang, S. K. (2005). Polypyrimidine tract-binding protein enhances the internal ribosomal entry site-dependent translation of p27Kip1 mRNA and modulates transition from G1 to S phase. Mol Cell Biol 25, 1283–1297.

Chu, S., DeRisi, J., Eisen, M., Mulholland, J., Botstein, D., Brown, P. O., and Herskowitz, I. (1998). The transcriptional program of sporulation in budding yeast. Science 282, 699–705.

Ciais, D., Cherradi, N., Bailly, S., Grenier, E., Berra, E., Pouyssegur, J., Lamarre, J., and Feige, J.-J. (2004). Destabilization of vascular endothelial growth factor mRNA by the zinc-finger protein TIS11b. Oncogene 23, 8673–8680.

Claffey, K. P., Shih, S. C., Mullen, A., Dziennis, S., Cusick, J. L., Abrams, K. R., Lee, S. W., and Detmar, M. (1998). Identification of a human VPF/VEGF 3’ untranslated region mediating hypoxia-induced mRNA stability. Mol Biol Cell 9, 469–481.

Coles, L. S., Bartley, M. A., Bert, A., Hunter, J., Polyak, S., Diamond, P., Vadas, M. A., and Goodall, G. J. (2004). A multi-protein complex containing cold shock domain (Y-box) and polypyrimidine tract binding proteins forms on the vascular endothelial growth factor mRNA. Potential role in mRNA stabilization. Eur J Biochem 271, 648–660.

Corcoran, D. L., Georgiev, S., Mukherjee, N., Gottwein, E., Skalsky, R. L., Keene, J. D., and Ohler, U. (2011). PARalyzer: definition of RNA binding sites from PAR-CLIP short-read sequence data. Genome Biol 12, R79.

Crick, F. H., Barnett, L., Brenner, S., and Watts-Tobin, R. J. (1961). General nature of the genetic code for proteins. Nature 192, 1227–1232.

Cuesta, R., Martínez-Sánchez, A., and Gebauer, F. (2009). miR-181a regulates cap-dependent translation of p27(kip1) mRNA in myeloid cells. Mol Cell Biol 29, 2841–2851.

Czaplinski, K., Köcher, T., Schelder, M., Segref, A., Wilm, M., and Mattaj, I. W. (2005). Identification of 40LoVe, a Xenopus hnRNP D family protein involved in localizing a TGF-beta-related mRNA during oogenesis. Dev Cell 8, 505–515.

Dangerfield, J. A., Windbichler, N., Salmons, B., Günzburg, W. H., and Schröder, R. (2006). Enhancement of the StreptoTag method for isolation of endogenously expressed proteins with complex RNA binding targets. Electrophoresis 27, 1874–1877.

Darnell, J. C., Van Driesche, S. J., Zhang, C., Hung, K. Y. S., Mele, A., Fraser, C. E., Stone, E. F., Chen, C., Fak, J. J., Chi, S. W., et al. (2011). FMRP stalls ribosomal translocation on mRNAs linked to synaptic function and autism. Cell 146, 247–261.

Darnell, R. B. (2010). HITS-CLIP: panoramic views of protein-RNA regulation in living cells. Wiley Interdiscip Rev RNA 1, 266–286.

Das, R., Zhou, Z., and Reed, R. (2000). Functional association of U2 snRNP with the ATP-independent spliceosomal complex E. Mol Cell 5, 779–787.

Deckert, J., Hartmuth, K., Boehringer, D., Behzadnia, N., Will, C. L., Kastner, B., Stark, H., Urlaub, H., and Lührmann, R. (2006). Protein composition and electron microscopy structure of affinity-purified human spliceosomal B complexes isolated under physiological conditions. Mol Cell Biol 26, 5528–5543.

Denli, A. M., Tops, B. B. J., Plasterk, R. H. A., Ketting, R. F., and Hannon, G. J. (2004). Processing of primary microRNAs by the Microprocessor complex. Nature 432, 231–235.

Detmar, M. (2000). The role of VEGF and thrombospondins in skin angiogenesis. J Dermatol Sci 24 Suppl 1, S78–S84.


Detmar, M., Brown, L. F., Berse, B., Jackman, R. W., Elicker, B. M., Dvorak, H. F., and Claffey, K. P. (1997). Hypoxia regulates the expression of vascular permeability factor/vascular endothelial growth factor (VPF/VEGF) and its receptors in human skin. J Invest Dermatol 108, 263–268.

Detmar, M., Brown, L. F., Claffey, K. P., Yeo, K. T., Kocher, O., Jackman, R. W., Berse, B., and Dvorak, H. F. (1994). Overexpression of vascular permeability factor/vascular endothelial growth factor and its receptors in psoriasis. J Exp Med 180, 1141–1146.

Dibbens, J. A., Miller, D. L., Damert, A., Risau, W., Vadas, M. A., and Goodall, G. J. (1999). Hypoxic regulation of vascular endothelial growth factor mRNA stability requires the cooperation of multiple RNA elements. Mol Biol Cell 10, 907–919.

Dibbens, J. A., Polyak, S. W., Damert, A., Risau, W., Vadas, M. A., and Goodall, G. J. (2001). Nucleotide sequence of the mouse VEGF 3’UTR and quantitative analysis of sites of polyadenylation. Biochim Biophys Acta 1518, 57–62.

Djuranovic, S., Nahvi, A., and Green, R. (2011). A parsimonious model for gene regulation by miRNAs. Science 331, 550–553.

Dong, S., Wang, Y., Cassidy-Amstutz, C., Lu, G., Bigler, R., Jezyk, M. R., Li, C., Hall, T. M. T., and Wang, Z. (2011). Specific and modular binding code for cytosine recognition in Pumilio/FBF (PUF) RNA-binding domains. J Biol Chem 286, 26732–26742.

Dong, X., Stothard, P., Forsythe, I. J., and Wishart, D. S. (2004). PlasMapper: a web server for drawing and auto-annotating plasmid maps. Nucleic Acids Res 32, W660–W664.

Dziunycz, P., Iotzova-Weiss, G., Eloranta, J. J., Läuchli, S., Hafner, J., French, L. E., and Hofbauer, G. F. L. (2010). Squamous cell carcinoma of the skin shows a distinct microRNA profile modulated by UV radiation. J Invest Dermatol 130, 2686–2689.

Ellington, A. D., and Szostak, J. W. (1990). In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818–822.

Erickson, S. L., and Lykke-Andersen, J. (2011). Cytoplasmic mRNP granules at a glance. J Cell Sci 124, 293–297.

Farazi, T. A., Spitzer, J. I., Morozov, P., and Tuschl, T. (2011). miRNAs in human cancer. J Pathol 223, 102–115.

Felicetti, F., Errico, M. C., Bottero, L., Segnalini, P., Stoppacciaro, A., Biffoni, M., Felli, N., Mattia, G., Petrini, M., Colombo, M. P., et al. (2008). The promyelocytic leukemia zinc finger-microRNA-221/-222 pathway controls melanoma progression through multiple oncogenic mechanisms. Cancer Res 68, 2745–2754.

Ferrara, N., Gerber, H.-P., and LeCouter, J. (2003). The biology of VEGF and its receptors. Nat Med 9, 669–676.

Ferrara, N., and Henzel, W. J. (1989). Pituitary follicular cells secrete a novel heparin-binding growth factor specific for vascular endothelial cells. Biochem Biophys Res Commun 161, 851–858.

Filipovska, A., Razif, M. F. M., Nygård, K. K. A., and Rackham, O. (2011). A universal code for RNA recognition by PUF proteins. Nat Chem Biol 7, 425–427.

Filipovska, A., and Rackham, O. (2011). Designer RNA-binding proteins: New tools for manipulating the transcriptome. RNA Biol 8.

Filipowicz, W., Bhattacharyya, S. N., and Sonenberg, N. (2008). Mechanisms of post-transcriptional regulation by microRNAs: are the answers in sight? Nat Rev Genet 9, 102–114.

Folkman, J. (1990). What is the evidence that tumors are angiogenesis dependent? J Natl Cancer Inst 82, 4–6.

Folkman, J., Merler, E., Abernathy, C., and Williams, G. (1971). Isolation of a tumor factor responsible for angiogenesis. J Exp Med 133, 275–288.

Fornari, F., Gramantieri, L., Ferracin, M., Veronese, A., Sabbioni, S., Calin, G. A., Grazi, G. L., Giovannini, C., Croce, C. M., Bolondi, L., et al. (2008). MiR-221 controls CDKN1C/p57 and CDKN1B/p27 expression in human hepatocellular carcinoma. Oncogene 27, 5651–5661.

Frank, S., Hübner, G., Breier, G., Longaker, M. T., Greenhalgh, D. G., and Werner, S. (1995). Regulation of vascular endothelial growth factor expression in cultured keratinocytes. Implications for normal and impaired wound healing. J Biol Chem 270, 12607–12613.

Friedman, R. C., Farh, K. K.-H., Burge, C. B., and Bartel, D. P. (2009). Most mammalian mRNAs are conserved targets of microRNAs. Genome Res 19, 92–105.

Fujita, P. A., Rhead, B., Zweig, A. S., Hinrichs, A. S., Karolchik, D., Cline, M. S., Goldman, M., Barber, G. P., Clawson, H., Coelho, A., et al. (2011). The UCSC Genome Browser database: update 2011. Nucleic Acids Res 39, D876–D882.

Gaidatzis, D., van Nimwegen, E., Hausser, J., and Zavolan, M. (2007). Inference of miRNA targets using evolutionary conservation and pathway analysis. BMC Bioinformatics 8, 69.

Galardi, S., Mercatelli, N., Giorda, E., Massalini, S., Frajese, G. V., Ciafrè, S. A., and Farace, M. G. (2007). miR-221 and miR-222 expression affects the proliferation potential of human prostate carcinoma cell lines by targeting p27Kip1. J Biol Chem 282, 23716–23724.

Galgano, A. (2010). Pum1 affects VEGF-A expression. In “Comparative analysis of mRNA targets for human PUF-family-RNA binding proteins” (ETH Zurich, Diss. No. 18803), pp. 139–169.

Galgano, A., Forrer, M., Jaskiewicz, L., Kanitz, A., Zavolan, M., and Gerber, A. P. (2008). Comparative analysis of mRNA targets for human PUF-family proteins suggests extensive interaction with the miRNA regulatory system. PLoS One 3, e3164.

Galperin, M. Y., and Cochrane, G. R. (2009). Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009. Nucleic Acids Res 37, D1–D4.

Gerber, A. P., Herschlag, D., and Brown, P. O. (2004a). Extensive association of functionally and cytotopically related mRNAs with Puf family RNA-binding proteins in yeast. PLoS Biol 2, E79.

Gerber, A. P., Luschnig, S., Krasnow, M. A., Brown, P. O., and Herschlag, D. (2006). Genome-wide identification of mRNAs associated with the translational regulator PUMILIO in Drosophila melanogaster. Proc Natl Acad Sci U S A 103, 4487–4492.

Gerber, C. A., Relich, A., and Driscoll, D. M. (2004b). Isolation of an mRNA-binding protein involved in C-to-U editing. Methods Mol Biol 265, 239–249.

Godfried Sie, C. P., and Kuchka, M. (2011). RNA editing adds flavor to complexity. Biochemistry (Mosc) 76, 869–881.

Goldberg-Cohen, I., Furneaux, H., and Levy, A. P. (2002). A 40-bp RNA element that mediates stabilization of vascular endothelial growth factor mRNA by HuR. J Biol Chem 277, 13635–13640.

Gonsalvez, G. B., Little, J. L., and Long, R. M. (2004). ASH1 mRNA anchoring requires reorganization of the Myo4p-She3p-She2p transport complex. J Biol Chem 279, 46286–46294.

Grabowski, P. J., and Sharp, P. A. (1986). Affinity chromatography of splicing complexes: U2, U5, and U4 + U6 small nuclear ribonucleoprotein particles in the spliceosome. Science 233, 1294–1299.

Graham, F. L., Smiley, J., Russell, W. C., and Nairn, R. (1977). Characteristics of a human cell line transformed by DNA from human adenovirus type 5. J Gen Virol 36, 59–74.

Green, N. M. (1990). Avidin and streptavidin. Methods Enzymol 184, 51–67.

Greenberg, J. R. (1979). Ultraviolet light-induced crosslinking of mRNA to proteins. Nucleic Acids Res 6, 715–732.

Gregory, R. I., Chendrimada, T. P., Cooch, N., and Shiekhattar, R. (2005). Human RISC couples microRNA biogenesis and posttranscriptional gene silencing. Cell 123, 631–640.

Gregory, R. I., Yan, K.-P., Amuthan, G., Chendrimada, T., Doratotaj, B., Cooch, N., and Shiekhattar, R. (2004). The Microprocessor complex mediates the genesis of microRNAs. Nature 432, 235–240.

Griffiths-Jones, S., Saini, H. K., van Dongen, S., and Enright, A. J. (2008). miRBase: tools for microRNA genomics. Nucleic Acids Res 36, D154–D158.

Gros, F., Hiatt, H., Gilbert, W., Kurland, C. G., Risebrough, R. W., and Watson, J. D. (1961). Unstable ribonucleic acid revealed by pulse labelling of Escherichia coli. Nature 190, 581–585.

Gruber, A. R., Lorenz, R., Bernhart, S. H., Neuböck, R., and Hofacker, I. L. (2008). The Vienna RNA websuite. Nucleic Acids Res 36, W70–W74.

Gstaiger, M., and Aebersold, R. (2009). Applying mass spectrometry-based proteomics to genetics, genomics and network biology. Nat Rev Genet 10, 617–627.

Guo, S.-L., Peng, Z., Yang, X., Fan, K.-J., Ye, H., Li, Z.-H., Wang, Y., Xu, X.-L., Li, J., Wang, Y.-L., et al. (2011). miR-148a promoted cell proliferation by targeting p27 in gastric cancer cells. Int J Biol Sci 7, 567–574.

Gygi, S. P., Rochon, Y., Franza, B. R., and Aebersold, R. (1999). Correlation between protein and mRNA abundance in yeast. Mol Cell Biol 19, 1720–1730.

Hafner, M., Landthaler, M., Burger, L., Khorshid, M., Hausser, J., Berninger, P., Rothballer, A., Ascano, M., Jungkamp, A.-C., Munschauer, M., et al. (2010a). PAR-CliP – a method to identify transcriptome-wide the binding sites of RNA binding proteins. J Vis Exp.

Hafner, M., Landthaler, M., Burger, L., Khorshid, M., Hausser, J., Berninger, P., Rothballer, A., Ascano, M., Jungkamp, A.-C., Munschauer, M., et al. (2010b). Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141, 129–141.

Halbeisen, R. E., Galgano, A., Scherrer, T., and Gerber, A. P. (2008). Post-transcriptional gene regulation: from genome-wide studies to principles. Cell Mol Life Sci 65, 798–813.

Hamasaki, K., Killian, J., Cho, J., and Rando, R. R. (1998). Minimal RNA constructs that specifically bind aminoglycoside antibiotics with high affinities. Biochemistry 37, 656–663.

Harbison, C. T., Gordon, D. B., Lee, T. I., Rinaldi, N. J., Macisaac, K. D., Danford, T. W., Hannett, N. M., Tagne, J.-B., Reynolds, D. B., Yoo, J., et al. (2004). Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104.

Harper, S. J., and Bates, D. O. (2008). VEGF-A splicing: the key to anti-angiogenic therapeutics? Nat Rev Cancer 8, 880–887.

Hartmuth, K., Urlaub, H., Vornlocher, H.-P., Will, C. L., Gentzel, M., Wilm, M., and Lührmann, R. (2002). Protein composition of human prespliceosomes isolated by a tobramycin affinity-selection method. Proc Natl Acad Sci U S A 99, 16719–16724.

Hartmuth, K., Vornlocher, H.-P., and Lührmann, R. (2004). Tobramycin affinity tag purification of spliceosomes. Methods Mol Biol 257, 47–64.

Hendrickson, D. G., Hogan, D. J., Herschlag, D., Ferrell, J. E., and Brown, P. O. (2008). Systematic identification of mRNAs recruited to argonaute 2 by specific microRNAs and corresponding changes in transcript abundance. PLoS One 3, e2126.

Hengst, L., and Reed, S. I. (1996). Translational control of p27Kip1 accumulation during the cell cycle. Science 271, 1861–1864.

Hershko, D. D. (2010). Cyclin-dependent kinase inhibitor p27 as a prognostic biomarker and potential cancer therapeutic target. Future Oncol 6, 1837–1847.

Hieronymus, H., and Silver, P. A. (2004). A systems view of mRNP biology. Genes Dev 18, 2845–2860.

Hoeben, A., Landuyt, B., Highley, M. S., Wildiers, H., Van Oosterom, A. T., and De Bruijn, E. A. (2004). Vascular endothelial growth factor and angiogenesis. Pharmacol Rev 56, 549–580.

Hofacker, I. L., Flamm, C., Heine, C., Wolfinger, M. T., Scheuermann, G., and Stadler, P. F. (2010). BarMap: RNA folding on dynamic energy landscapes. RNA 16, 1308–1316.

Hofbauer, G. F. L., Bouwes Bavinck, J. N., and Euvrard, S. (2010). Organ transplantation and skin cancer: basic problems and new perspectives. Exp Dermatol 19, 473–482.

Hogan, D. J., Riordan, D. P., Gerber, A. P., Herschlag, D., and Brown, P. O. (2008). Diverse RNA-binding proteins interact with functionally related sets of RNAs, suggesting an extensive regulatory system. PLoS Biol 6, e255.

Hogg, J. R., and Collins, K. (2007a). Human Y5 RNA specializes a Ro ribonucleoprotein for 5S ribosomal RNA quality control. Genes Dev 21, 3067–3072.

Hogg, J. R., and Collins, K. (2007b). RNA-based affinity purification reveals 7SK RNPs with distinct composition and regulation. RNA 13, 868–880.

Houck, K. A., Ferrara, N., Winer, J., Cachianes, G., Li, B., and Leung, D. W. (1991). The vascular endothelial growth factor family: identification of a fourth molecular species and characterization of alternative splicing of RNA. Mol Endocrinol 5, 1806–1814.

Hua, Z., Lv, Q., Ye, W., Wong, C.-K. A., Cai, G., Gu, D., Ji, Y., Zhao, C., Wang, J., Yang, B. B., et al. (2006). MiRNA-directed regulation of VEGF and other angiogenic factors under hypoxia. PLoS One 1, e116.

Huang, D. W., Sherman, B. T., Stephens, R., Baseler, M. W., Lane, H. C., and Lempicki, R. A. (2008). DAVID gene ID conversion tool. Bioinformation 2, 428–430.

Huez, I., Bornes, S., Bresson, D., Créancier, L., and Prats, H. (2001). New vascular endothelial growth factor isoform generated by internal ribosome entry site-driven CUG translation initiation. Mol Endocrinol 15, 2197–2210.

Höck, J., and Meister, G. (2008). The Argonaute protein family. Genome Biol 9, 210.

Iioka, H., Loiselle, D., Haystead, T. A., and Macara, I. G. (2011). Efficient detection of RNA-protein interactions using tethered RNAs. Nucleic Acids Res 39, e53.

Iyer, V. R., Horak, C. E., Scafe, C. S., Botstein, D., Snyder, M., and Brown, P. O. (2001). Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature 409, 533–538.

Jacob, F., and Monod, J. (1961). Genetic regulatory mechanisms in the synthesis of proteins. J Mol Biol 3, 318–356.

Jafarifar, F., Yao, P., Eswarappa, S. M., and Fox, P. L. (2011). Repression of VEGFA by CA-rich element-binding microRNAs is modulated by hnRNP L. EMBO J 30, 1324–1334.

Jiang, L., Suri, A. K., Fiala, R., and Patel, D. J. (1997). Saccharide-RNA recognition in an aminoglycoside antibiotic-RNA aptamer complex. Chem Biol 4, 35–50.

Jiang, L., and Patel, D. J. (1998). Solution structure of the tobramycin-RNA aptamer complex. Nat Struct Biol 5, 769–774.

Jurica, M. S., Licklider, L. J., Gygi, S. R., Grigorieff, N., and Moore, M. J. (2002). Purification and characterization of native spliceosomes suitable for three-dimensional structural analysis. RNA 8, 426–439.

Jurica, M. S., and Moore, M. J. (2002). Capturing splicing complexes to study structure and mechanism. Methods 28, 336–345.

Kaminski, A., Ostareck, D. H., Standart, N. M., and Jackson, R. J. (1998). Affinity methods for isolating RNA binding proteins. In RNA:Protein Interactions: A Practical Approach, C. W. J. Smith, ed. (New York: Oxford University Press), pp. 137–160.

Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., and Hirakawa, M. (2010). KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res 38, D355–D360.

Kanitz, A., and Gerber, A. P. (2010). Circuitry of mRNA regulation. Wiley Interdiscip Rev Syst Biol Med 2, 245–251.

Karaa, Z. S., Iacovoni, J. S., Bastide, A., Lacazette, E., Touriol, C., and Prats, H. (2009). The VEGF IRESes are differentially susceptible to translation inhibition by miR-16. RNA 15, 249–254.

Karginov, F. V., Conaco, C., Xuan, Z., Schmidt, B. H., Parker, J. S., Mandel, G., and Hannon, G. J. (2007). A biochemical approach to identifying microRNA targets. Proc Natl Acad Sci U S A 104, 19291–19296.

Keck, P. J., Hauser, S. D., Krivi, G., Sanzo, K., Warren, T., Feder, J., and Connolly, D. T. (1989). Vascular permeability factor, an endothelial cell mitogen related to PDGF. Science 246, 1309–1312.

Kedde, M., van Kouwenhove, M., Zwart, W., Oude Vrielink, J. A. F., Elkon, R., and Agami, R. (2010). A Pumilio-induced RNA structure switch in p27-3’ UTR controls miR-221 and miR-222 accessibility. Nat Cell Biol 12, 1014–1020.

Kedde, M., Strasser, M. J., Boldajipour, B., Oude Vrielink, J. A. F., Slanchev, K., le Sage, C., Nagel, R., Voorhoeve, P. M., van Duijse, J., Ørom, U. A., et al. (2007). RNA-binding protein Dnd1 inhibits microRNA access to target mRNA. Cell 131, 1273–1286.

Kedersha, N., and Anderson, P. (2007). Mammalian stress granules and processing bodies. Methods Enzymol 431, 61–81.

Keefe, A. D., Wilson, D. S., Seelig, B., and Szostak, J. W. (2001). One-step purification of recombinant proteins using a nanomolar-affinity streptavidin-binding peptide, the SBP-Tag. Protein Expr Purif 23, 440–446.

Keene, J. D. (2007). RNA regulons: coordination of post-transcriptional events. Nat Rev Genet 8, 533–543.

Keene, J. D. (2001). Ribonucleoprotein infrastructure regulating the flow of genetic information between the genome and the proteome. Proc Natl Acad Sci U S A 98, 7018–7024.

Kershner, A. M., and Kimble, J. (2010). Genome-wide analysis of mRNA targets for Caenorhabditis elegans FBF, a conserved stem cell regulator. Proc Natl Acad Sci U S A 107, 3936–3941.

Khorshid, M., Rodak, C., and Zavolan, M. (2011). CLIPZ: a database and analysis environment for experimentally determined binding sites of RNA-binding proteins. Nucleic Acids Res 39, D245–D252.

Kim, N., and Jinks-Robertson, S. (2011). Guanine repeat-containing sequences confer transcription-dependent instability in an orientation-specific manner in yeast. DNA Repair (Amst) 10, 953–960.

Kishore, S., Jaskiewicz, L., Burger, L., Hausser, J., Khorshid, M., and Zavolan, M. (2011). A quantitative analysis of CLIP methods for identifying binding sites of RNA-binding proteins. Nat Methods.

van Kouwenhove, M., Kedde, M., and Agami, R. (2011). MicroRNA regulation by RNA-binding proteins and its implications for cancer. Nat Rev Cancer 11, 644–656.

Kozomara, A., and Griffiths-Jones, S. (2011). miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39, D152–D157.

Krol, J., Loedige, I., and Filipowicz, W. (2010). The widespread regulation of microRNA biogenesis, function and decay. Nat Rev Genet 11, 597–610.

Kula, A., Guerra, J., Knezevich, A., Kleva, D., Myers, M. P., and Marcello, A. (2011). Characterization of the HIV-1 RNA associated proteome identifies Matrin 3 as a nuclear cofactor of Rev function. Retrovirology 8, 60.

Kullmann, M., Göpfert, U., Siewe, B., and Hengst, L. (2002). ELAV/Hu proteins inhibit p27 translation via an IRES element in the p27 5’UTR. Genes Dev 16, 3087–3099.

Kurosu, T., Ohga, N., Hida, Y., Maishi, N., Akiyama, K., Kakuguchi, W., Kuroshima, T., Kondo, M., Akino, T., Totsuka, Y., et al. (2011). HuR keeps an angiogenic switch on by stabilising mRNA of VEGF and COX-2 in tumour endothelium. Br J Cancer 104, 819–829.

König, J., Zarnack, K., Rot, G., Curk, T., Kayikci, M., Zupan, B., Turner, D. J., Luscombe, N. M., and Ule, J. (2010). iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol 17, 909–915.

König, J., Zarnack, K., Rot, G., Curk, T., Kayikci, M., Zupan, B., Turner, D. J., Luscombe, N. M., and Ule, J. (2011). iCLIP – transcriptome-wide mapping of protein-RNA interactions with individual nucleotide resolution. J Vis Exp.

Lagos-Quintana, M., Rauhut, R., Lendeckel, W., and Tuschl, T. (2001). Identification of novel genes coding for small expressed RNAs. Science 294, 853–858.

Lamond, A. I., Sproat, B., Ryder, U., and Hamm, J. (1989). Probing the structure and function of U2 snRNP with antisense oligonucleotides made of 2’-OMe RNA. Cell 58, 383–390.

Lamond, A. I., and Sproat, B. S. (1994). Isolation and characterisation of ribonucleoprotein complexes. In RNA Processing: A Practical Approach, Volume 1, D. Hames and S. Higgins, eds. (New York: Oxford University Press), pp. 103–140.

Landthaler, M., Gaidatzis, D., Rothballer, A., Chen, P. Y., Soll, S. J., Dinic, L., Ojo, T., Hafner, M., Zavolan, M., and Tuschl, T. (2008). Molecular characterization of human Argonaute-containing ribonucleoprotein complexes and their bound target mRNAs. RNA 14, 2580–2596.

Lange, T., Guttmann-Raviv, N., Baruch, L., Machluf, M., and Neufeld, G. (2003). VEGF162, a new heparin-binding vascular endothelial growth factor splice form that is expressed in transformed human cells. J Biol Chem 278, 17164–17169.

Langland, J. O., Pettiford, S. M., and Jacobs, B. L. (1995). Nucleic acid affinity chromatography: preparation and characterization of double-stranded RNA agarose. Protein Expr Purif 6, 25–32.

Larocque, D., Galarneau, A., Liu, H.-N., Scott, M., Almazan, G., and Richard, S. (2005). Protection of p27(Kip1) mRNA by quaking RNA binding proteins promotes oligodendrocyte differentiation. Nat Neurosci 8, 27–33.

Lasko, P. (2003). Gene regulation at the RNA layer: RNA binding proteins in intercellular signaling networks. Sci STKE 2003, RE6.

Lau, N. C., Lim, L. P., Weinstein, E. G., and Bartel, D. P. (2001). An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science 294, 858–862.

Lebedeva, S., Jens, M., Theil, K., Schwanhäusser, B., Selbach, M., Landthaler, M., and Rajewsky, N. (2011). Transcriptome-wide Analysis of Regulatory Interactions of the RNA-Binding Protein HuR. Mol Cell 43, 1–13.

Lee, J. F., Hesselberth, J. R., Meyers, L. A., and Ellington, A. D. (2004). Aptamer database. Nucleic Acids Res 32, D95–D100.

Lee, R. C., and Ambros, V. (2001). An extensive class of small RNAs in Caenorhabditis elegans. Science 294, 862–864.

Lee, R. C., Feinbaum, R. L., and Ambros, V. (1993). The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14. Cell 75, 843–854.

Lee, T. I., Rinaldi, N. J., Robert, F., Odom, D. T., Bar-Joseph, Z., Gerber, G. K., Hannett, N. M., Harbison, C. T., Thompson, C. M., Simon, I., et al. (2002). Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, 799–804.

Lei, J., Jiang, A., and Pei, D. (1998). Identification and characterization of a new splicing variant of vascular endothelial growth factor: VEGF183. Biochim Biophys Acta 1443, 400–406.

Lei, Z., Li, B., Yang, Z., Fang, H., Zhang, G.-M., Feng, Z.-H., and Huang, B. (2009). Regulation of HIF-1alpha and VEGF by miR-20b tunes tumor cells to adapt to the alteration of oxygen concentration. PLoS One 4, e7629.

Lemay, V., Hossain, A., Osheim, Y. N., Beyer, A. L., and Dragon, F. (2011). Identification of novel proteins associated with yeast snR30 small nucleolar RNA. Nucleic Acids Res 3000, 1–12.

Leulliot, N., and Varani, G. (2001). Current topics in RNA-protein recognition: control of specificity and biological function through induced fit and conformational capture. Biochemistry 40, 7947–7956.

Leung, D. W., Cachianes, G., Kuang, W. J., Goeddel, D. V., and Ferrara, N. (1989). Vascular endothelial growth factor is a secreted angiogenic mitogen. Science 246, 1306–1309.

Levy, N. S., Chung, S., Furneaux, H., and Levy, A. P. (1998). Hypoxic stabilization of vascular endothelial growth factor mRNA by the RNA-binding protein HuR. J Biol Chem 273, 6417–6423.

Levy, N. S., Goldberg, M. A., and Levy, A. P. (1997). Sequencing of the human vascular endothelial growth factor (VEGF) 3’ untranslated region (UTR): conservation of five hypoxia-inducible RNA-protein binding sites. Biochim Biophys Acta 1352, 167–173.

Lewis, B. P., Burge, C. B., and Bartel, D. P. (2005). Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20.

Li, Y., and Altman, S. (2002). Partial reconstitution of human RNase P in HeLa cells between its RNA subunit with an affinity tag and the intact protein components. Nucleic Acids Res 30, 3706–3711.

Li, Y.-L., Ye, F., Hu, Y., Lu, W.-G., and Xie, X. (2009). Identification of suitable reference genes for gene expression studies of human serous ovarian cancer by real-time polymerase chain reaction. Anal Biochem 394, 110–116.

Liang, Y., Li, X.-Y., Rebar, E. J., Li, P., Zhou, Y., Chen, B., Wolffe, A. P., and Case, C. C. (2002). Activation of vascular endothelial growth factor A transcription in tumorigenic glioblastoma cell lines by an enhancer with cell type-specific DNase I accessibility. J Biol Chem 277, 20087–20094.

Licatalosi, D. D., Mele, A., Fak, J. J., Ule, J., Kayikci, M., Chi, S. W., Clark, T. A., Schweitzer, A. C., Blume, J. E., Wang, X., et al. (2008). HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464–469.

Lieb, J. D., Liu, X., Botstein, D., and Brown, P. O. (2001). Promoter-specific binding of Rap1 revealed by genome-wide maps of protein-DNA association. Nat Genet 28, 327–334.

Lingner, J., and Cech, T. R. (1996). Purification of telomerase from Euplotes aediculatus: requirement of a primer 3’ overhang. Proc Natl Acad Sci U S A 93, 10712–10717.

Liu, B., Peng, X.-C., Zheng, X.-L., Wang, J., and Qin, Y.-W. (2009). MiR-126 restoration down-regulate VEGF and inhibit the growth of lung cancer cell lines in vitro and in vivo. Lung Cancer 66, 169–175.

Liu, Z., Dong, Z., Han, B., Yang, Y., Liu, Y., and Zhang, J.-T. (2005). Regulation of expression by promoters versus internal ribosome entry site in the 5’-untranslated sequence of the human cyclin-dependent kinase inhibitor p27kip1. Nucleic Acids Res 33, 3763–3771.

Locker, N., Easton, L. E., and Lukavsky, P. J. (2006). Affinity purification of eukaryotic 48S initiation complexes. RNA 12, 683–690.

Locker, N., and Lukavsky, P. J. (2007). A practical approach to isolate 48S complexes: affinity purification and analyses. Methods Enzymol 429, 83–104.

Lohmann, C. M., and Solomon, A. R. (2001). Clinicopathologic variants of cutaneous squamous cell carcinoma. Adv Anat Pathol 8, 27–36.

Long, J., Wang, Y., Wang, W., Chang, B. H. J., and Danesh, F. R. (2010). Identification of microRNA-93 as a novel regulator of vascular endothelial growth factor in hyperglycemic conditions. J Biol Chem 285, 23457–23465.

Lu, G., Dolgner, S. J., and Hall, T. M. T. (2009). Understanding and engineering RNA sequence specificity of PUF proteins. Curr Opin Struct Biol 19, 110–115.

Lu, G., and Hall, T. M. T. (2011). Alternate modes of cognate RNA recognition by human PUMILIO proteins. Structure 19, 361–367.

Lukong, K. E., Chang, K.-W., Khandjian, E. W., and Richard, S. (2008). RNA-binding proteins in human genetic disease. Trends Genet 24, 416–425.

Lyng, M. B., Laenkholm, A.-V., Pallisgaard, N., and Ditzel, H. J. (2008). Identification of genes for normalization of real-time RT-PCR data in breast carcinomas. BMC Cancer 8, 20.

Mansfield, K. D., and Keene, J. D. (2009). The ribonome: a dominant force in co-ordinating gene expression. Biol Cell 101, 169–181.

Maragkakis, M., Reczko, M., Simossis, V. A., Alexiou, P., Papadopoulos, G. L., Dalamagas, T., Giannopoulos, G., Goumas, G., Koukis, E., Kourtis, K., et al. (2009). DIANA-microT web server: elucidating microRNA functions through target prediction. Nucleic Acids Res 37, W273–W276.

Matthaei, J. H., Jones, O. W., Martin, R. G., and Nirenberg, M. W. (1962). Characteristics and composition of RNA coding units. Proc Natl Acad Sci U S A 48, 666–677.

Mayer, G. (2009). The chemical biology of aptamers. Angew Chem Int Ed Engl 48, 2672–2689.

McArthur, K., Feng, B., Wu, Y., Chen, S., and Chakrabarti, S. (2011). MicroRNA-200b regulates vascular endothelial growth factor-mediated alterations in diabetic retinopathy. Diabetes 60, 1314–1323.

McManus, C. J., and Graveley, B. R. (2011). RNA structure and the mechanisms of alternative splicing. Curr Opin Genet Dev 21, 373–379.

Meiron, M., Anunu, R., Scheinman, E. J., Hashmueli, S., and Levi, B. Z. (2001). New isoforms of VEGF are translated from alternative initiation CUG codons located in its 5’UTR. Biochem Biophys Res Commun 282, 1053–1060.

Meng, F., Henson, R., Wehbe-Janek, H., Ghoshal, K., Jacob, S. T., and Patel, T. (2007). MicroRNA-21 regulates expression of the PTEN tumor suppressor gene in human hepatocellular cancer. Gastroenterology 133, 647–658.

Mesarovic, M. D., Sreenath, S. N., and Keene, J. D. (2004). Search for organising principles: understanding in systems biology. Syst Biol (Stevenage) 1, 19–27.

Millard, S. S., Vidal, A., Markus, M., and Koff, A. (2000). A U-rich element in the 5’ untranslated region is necessary for the translation of p27 mRNA. Mol Cell Biol 20, 5947–5959.

Miller, M. A., and Olivas, W. M. (2011). Roles of Puf proteins in mRNA degradation and translation. Wiley Interdiscip Rev RNA 2, 471–492.

Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D., and Alon, U. (2002). Network motifs: simple building blocks of complex networks. Science 298, 824–827.

Miquerol, L., Langille, B. L., and Nagy, A. (2000). Embryonic development is disrupted by modest increases in vascular endothelial growth factor gene expression. Development 127, 3941–3946.

Mohammad, M. M., Donti, T. R., Sebastian Yakisich, J., Smith, A. G., and Kapler, G. M. (2007). Tetrahymena ORC contains a ribosomal RNA fragment that participates in rDNA origin recognition. EMBO J 26, 5048–5060.

Morris, A. R., Mukherjee, N., and Keene, J. D. (2008). Ribonomic analysis of human Pum1 reveals cis-trans conservation across species despite evolution of diverse mRNA target sets. Mol Cell Biol 28, 4093–4103.

Morris, A. R., Mukherjee, N., and Keene, J. D. (2010). Systematic analysis of posttranscriptional gene expression. Wiley Interdiscip Rev Syst Biol Med 2, 162–180.

Mourelatos, Z., Dostie, J., Paushkin, S., Sharma, A., Charroux, B., Abel, L., Rappsilber, J., Mann, M., and Dreyfuss, G. (2002). miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev 16, 720–728.

Mukherjee, N., Corcoran, D. L., Nusbaum, J. D., Reid, D. W., Georgiev, S., Hafner, M., Ascano, M., Tuschl, T., Ohler, U., and Keene, J. D. (2011). Integrative Regulatory Mapping Indicates that the RNA-Binding Protein HuR Couples Pre-mRNA Processing and mRNA Stability. Mol Cell 43, 1–13.

Nelson, M. R., Luo, H., Vari, H. K., Cox, B. J., Simmonds, A. J., Krause, H. M., Lipshitz, H. D., and Smibert, C. A. (2007). A multiprotein complex that mediates translational enhancement in Drosophila. J Biol Chem 282, 34031–34038.

Niranjanakumari, S., Lasda, E., Brazas, R., and Garcia-Blanco, M. A. (2002). Reversible cross-linking combined with immunoprecipitation to study RNA-protein interactions in vivo. Methods 26, 182–190.

Nolde, M. J., Saka, N., Reinert, K. L., and Slack, F. J. (2007). The Caenorhabditis elegans pumilio homolog, puf-9, is required for the 3’UTR-mediated repression of the let-7 microRNA target gene, hbl-1. Dev Biol 305, 551–563.

Oliphant, A. R., Brandl, C. J., and Struhl, K. (1989). Defining the sequence specificity of DNA-binding proteins by selecting binding sites from random-sequence oligonucleotides: analysis of yeast GCN4 protein. Mol Cell Biol 9, 2944–2949.

Onesto, C., Berra, E., Grépin, R., and Pagès, G. (2004). Poly(A)-binding protein-interacting protein 2, a strong regulator of vascular endothelial growth factor mRNA. J Biol Chem 279, 34217–34226.

Pagano, M., Tam, S. W., Theodoras, A. M., Beer-Romero, P., Del Sal, G., Chau, V., Yew, P. R., Draetta, G. F., and Rolfe, M. (1995). Role of the ubiquitin-proteasome pathway in regulating abundance of the cyclin-dependent kinase inhibitor p27. Science 269, 682–685.

Palomero, T., Sulis, M. L., Cortina, M., Real, P. J., Barnes, K., Ciofani, M., Caparros, E., Buteau, J., Brown, K., Perkins, S. L., et al. (2007). Mutational loss of PTEN induces resistance to NOTCH1 inhibition in T-cell leukemia. Nat Med 13, 1203–1210.

Parisien, M., and Major, F. (2008). The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data. Nature 452, 51–55.

Patzel, V., and Sczakiel, G. (1999). Length dependence of RNA-RNA annealing. J Mol Biol 294, 1127–1134.

Pineau, P., Volinia, S., McJunkin, K., Marchio, A., Battiston, C., Terris, B., Mazzaferro, V., Lowe, S. W., Croce, C. M., and Dejean, A. (2010). miR-221 overexpression contributes to liver tumorigenesis. Proc Natl Acad Sci U S A 107, 264–269.

Piqué, M., López, J. M., Foissac, S., Guigó, R., and Méndez, R. (2008). A combinatorial code for CPE-mediated translational control. Cell 132, 434–448.

Poltorak, Z., Cohen, T., Sivan, R., Kandelis, Y., Spira, G., Vlodavsky, I., Keshet, E., and Neufeld, G. (1997). VEGF145, a secreted vascular endothelial growth factor isoform that binds to extracellular matrix. J Biol Chem 272, 7151–7158.

Poy, M. N., Eliasson, L., Krutzfeldt, J., Kuwajima, S., Ma, X., Macdonald, P. E., Pfeffer, S., Tuschl, T., Rajewsky, N., Rorsman, P., et al. (2004). A pancreatic islet-specific microRNA regulates insulin secretion. Nature 432, 226–230.

Pullmann, R., Kim, H. H., Abdelmohsen, K., Lal, A., Martindale, J. L., Yang, X., and Gorospe, M. (2007). Analysis of turnover and translation regulatory RNA-binding protein expression through binding to cognate mRNAs. Mol Cell Biol 27, 6265–6278.

Pyle, A. M. (2010). The tertiary structure of group II introns: implications for biological function and evolution. Crit Rev Biochem Mol Biol 45, 215–232.

Quenault, T., Lithgow, T., and Traven, A. (2011). PUF proteins: repression, activation and mRNA localization. Trends Cell Biol 21, 104–112.

Ray, P. S., Jia, J., Yao, P., Majumder, M., Hatzoglou, M., and Fox, P. L. (2009). A stress-responsive RNA switch regulates VEGFA expression. Nature 457, 915–919.

Ray, P. S., and Fox, P. L. (2007). A post-transcriptional pathway represses monocyte VEGF-A expression and angiogenic activity. EMBO J 26, 3360–3372.

Rehmsmeier, M., Steffen, P., Hochsmann, M., and Giegerich, R. (2004). Fast and effective prediction of microRNA/target duplexes. RNA 10, 1507–1517.

Ren, B., Robert, F., Wyrick, J. J., Aparicio, O., Jennings, E. G., Simon, I., Zeitlinger, J., Schreiber, J., Hannett, N., Kanin, E., et al. (2000). Genome-wide location and function of DNA binding proteins. Science 290, 2306–2309.

Rouault, T. A., Hentze, M. W., Haile, D. J., Harford, J. B., and Klausner, R. D. (1989). The iron-responsive element binding protein: a method for the affinity purification of a regulatory RNA-binding protein. Proc Natl Acad Sci U S A 86, 5768–5772.

Ruby, S. W., and Abelson, J. (1988). An early hierarchic role of U1 small nuclear ribonucleoprotein in spliceosome assembly. Science 242, 1028–1035.

le Sage, C., Nagel, R., Egan, D. A., Schrier, M., Mesman, E., Mangiola, A., Anile, C., Maira, G., Mercatelli, N., Ciafrè, S. A., et al. (2007). Regulation of the p27(Kip1) tumor suppressor by miR-221 and miR-222 promotes cancer cell proliferation. EMBO J 26, 3699–3708.

Salazar, A. M., Silverman, E. J., Menon, K. P., and Zinn, K. (2010). Regulation of synaptic Pumilio function by an aggregation-prone domain. J Neurosci 30, 515–522.

Salven, P., Heikkilä, P., and Joensuu, H. (1997). Enhanced expression of vascular endothelial growth factor in metastatic melanoma. Br J Cancer 76, 930–934.

Sandberg, R., Neilson, J. R., Sarma, A., Sharp, P. A., and Burge, C. B. (2008). Proliferating cells express mRNAs with shortened 3’ untranslated regions and fewer microRNA target sites. Science 320, 1643–1647.

Sauter, E. R., Nesbit, M., Watson, J. C., Klein-Szanto, A., Litwin, S., and Herlyn, M. (1999). Vascular endothelial growth factor is a marker of tumor invasion and metastasis in squamous cell carcinomas of the head and neck. Clin Cancer Res 5, 775–782.

Schmittgen, T. D., and Livak, K. J. (2008). Analyzing real-time PCR data by the comparative C(T) method. Nat Protoc 3, 1101–1108.

Schwartz, R. A., Bridges, T. M., Butani, A. K., and Ehrlich, A. (2008). Actinic keratosis: an occupational and environmental disorder. J Eur Acad Dermatol Venereol 22, 606–615.

Selbach, M., Schwanhäusser, B., Thierfelder, N., Fang, Z., Khanin, R., and Rajewsky, N. (2008). Widespread changes in protein synthesis induced by microRNAs. Nature 455, 58–63.

Senger, D. R., Galli, S. J., Dvorak, A. M., Perruzzi, C. A., Harvey, V. S., and Dvorak, H. F. (1983). Tumor cells secrete a vascular permeability factor that promotes accumulation of ascites fluid. Science 219, 983–985.

Shahbabian, K., and Chartrand, P. (2011). Control of cytoplasmic mRNA localization. Cell Mol Life Sci.

Shcherbakova, D. M., Sokolov, K. A., Zvereva, M. I., and Dontsova, O. A. (2009). Telomerase from yeast Saccharomyces cerevisiae is active in vitro as a monomer. Biochemistry (Mosc) 74, 749–755.

Shih, S. C., and Claffey, K. P. (1999). Regulation of human vascular endothelial growth factor mRNA stability in hypoxia by heterogeneous nuclear ribonucleoprotein L. J Biol Chem 274, 1359–1365.

Shyu, A.-B., Wilkinson, M. F., and van Hoof, A. (2008). Messenger RNA regulation: to translate or to degrade. EMBO J 27, 471–481.

Silva, I. J., Saramago, M., Dressaire, C., Domingues, S., Viegas, S. C., and Arraiano, C. M. (2011). Importance and key events of prokaryotic RNA decay: the ultimate fate of an RNA molecule. Wiley Interdiscip Rev RNA 2, 818–836.

Slingerland, J., and Pagano, M. (2000). Regulation of the cdk inhibitor p27 and its deregulation in cancer. J Cell Physiol 183, 10–17.

Slobodin, B., and Gerst, J. E. (2011). RaPID: an aptamer-based mRNA affinity purification technique for the identification of RNA and protein factors present in ribonucleoprotein complexes. Methods Mol Biol 714, 387–406.

Solomatin, S. V., Greenfeld, M., Chu, S., and Herschlag, D. (2010). Multiple native states reveal persistent ruggedness of an RNA folding landscape. Nature 463, 681–684.

Sonoda, J., and Wharton, R. P. (1999). Recruitment of Nanos to hunchback mRNA by Pumilio. Genes Dev 13, 2704–2712.

Sontheimer, E. J. (1994). Site-specific RNA crosslinking with 4-thiouridine. Mol Biol Rep 20, 35–44.

Spassov, D. S., and Jurecic, R. (2002). Cloning and comparative sequence analysis of PUM1 and PUM2 genes, human members of the Pumilio family of RNA-binding proteins. Gene 299, 195–204.

Spassov, D. S., and Jurecic, R. (2003). The PUF family of RNA-binding proteins: does evolutionarily conserved structure equal conserved function? IUBMB Life 55, 359–366.

Sponer, J., Leszczynski, J., and Hobza, P. (2002). Electronic properties, hydrogen bonding, stacking, and cation binding of DNA and RNA bases. Biopolymers 61, 3–31.

Srisawat, C., Goldstein, I. J., and Engelke, D. R. (2001). Sephadex-binding RNA ligands: rapid affinity purification of RNA from complex RNA mixtures. Nucleic Acids Res 29, E4.

Srisawat, C., Houser-Scott, F., Bertrand, E., Xiao, S., Singer, R. H., and Engelke, D. R. (2002). An active precursor in assembly of yeast nuclear ribonuclease P. RNA 8, 1348–1360.

Srisawat, C., and Engelke, D. R. (2002). RNA affinity tags for purification of RNAs and ribonucleoprotein complexes. Methods 26, 156–161.

Srisawat, C., and Engelke, D. R. (2001). Streptavidin aptamers: affinity tags for the study of RNAs and ribonucleoproteins. RNA 7, 632–641.

Steitz, J. A., and Vasudevan, S. (2009). miRNPs: versatile regulators of gene expression in vertebrate cells. Biochem Soc Trans 37, 931–935.

Sutton, D. H., Conn, G. L., Brown, T., and Lane, A. N. (1997). The dependence of DNase I activity on the conformation of oligodeoxynucleotides. Biochem J 321 (Pt 2), 481–486.

Szabo, A., Perou, C. M., Karaca, M., Perreard, L., Palais, R., Quackenbush, J. F., and Bernard, P. S. (2004). Statistical modeling for selecting housekeeper genes. Genome Biol 5, R59.

Takizawa, P. A., and Vale, R. D. (2000). The myosin motor, Myo4p, binds Ash1 mRNA via the adapter protein, She3p. Proc Natl Acad Sci U S A 97, 5273–5278.

Tenenbaum, S. A., Lager, P. J., Carson, C. C., and Keene, J. D. (2002). Ribonomics: identifying mRNA subsets in mRNP complexes using antibodies to RNA-binding proteins and genomic arrays. Methods 26, 191–198.

Tenenbaum, S. A., Carson, C. C., Lager, P. J., and Keene, J. D. (2000). Identifying mRNA subsets in messenger ribonucleoprotein complexes by using cDNA arrays. Proc Natl Acad Sci U S A 97, 14085–14090.

Tereshko, V., Skripkin, E., and Patel, D. J. (2003). Encapsulating streptomycin within a small 40-mer RNA. Chem Biol 10, 175–187.

Thomas, M. G., Loschi, M., Desbats, M. A., and Boccaccio, G. L. (2011). RNA granules: the good, the bad and the ugly. Cell Signal 23, 324–334.

Thomas, P. D., Kejariwal, A., Guo, N., Mi, H., Campbell, M. J., Muruganujan, A., and Lazareva-Ulitsky, B. (2006). Applications for protein sequence-function evolution data: mRNA/protein expression analysis and coding SNP scoring tools. Nucleic Acids Res 34, W645–W650.

Tian, B., and Graber, J. H. (2011). Signals for pre-mRNA cleavage and polyadenylation. Wiley Interdiscip Rev RNA.

Topisirovic, I., Svitkin, Y. V., Sonenberg, N., and Shatkin, A. J. (2011). Cap and cap-binding proteins in the control of gene expression. Wiley Interdiscip Rev RNA 2, 277–298.

Tuerk, C., and Gold, L. (1990). Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505–510.

Tóth-Jakatics, R., Jimi, S., Takebayashi, S., and Kawamoto, N. (2000). Cutaneous malignant melanoma: correlation between neovascularization and peritumor accumulation of mast cells overexpressing vascular endothelial growth factor. Hum Pathol 31, 955–960.

Valencia-Burton, M., McCullough, R. M., Cantor, C. R., and Broude, N. E. (2007). RNA visualization in live bacterial cells using fluorescent protein complementation. Nat Methods 4, 421–427.

Vasudevan, S. (2011). Posttranscriptional Upregulation by MicroRNAs. Wiley Interdiscip Rev RNA.

Vasudevan, S., Tong, Y., and Steitz, J. A. (2007). Switching from repression to activation: microRNAs can up-regulate translation. Science 318, 1931–1934.

Vasudevan, S., and Steitz, J. A. (2007). AU-rich-element-mediated upregulation of translation by FXR1 and Argonaute 2. Cell 128, 1105–1118.

Vazquez-Pianzola, P., Urlaub, H., and Rivera-Pomar, R. (2005). Proteomic analysis of reaper 5’ untranslated region-interacting factors isolated by tobramycin affinity-selection reveals a role for La antigen in reaper mRNA translation. Proteomics 5, 1645–1655.

Ventura, A., and Jacks, T. (2009). MicroRNAs and cancer: short RNAs go a long way. Cell 136, 586–591.

Verhelst, S. H. L., Michiels, P. J. A., van der Marel, G. A., van Boeckel, C. A. A., and van Boom, J. H. (2004). Surface plasmon resonance evaluation of various aminoglycoside-RNA hairpin interactions reveals low degree of selectivity. Chembiochem 5, 937–942.

Vessey, J. P., Schoderboeck, L., Gingl, E., Luzi, E., Riefler, J., Di Leva, F., Karra, D., Thomas, S., Kiebler, M. A., and Macchi, P. (2010). Mammalian Pumilio 2 regulates dendrite morphogenesis and synaptic function. Proc Natl Acad Sci U S A 107, 3222–3227.

Vessey, J. P., Vaccani, A., Xie, Y., Dahm, R., Karra, D., Kiebler, M. A., and Macchi, P. (2006). Dendritic localization of the translational repressor Pumilio 2 and its contribution to dendritic stress granules. J Neurosci 26, 6496–6508.

Viswanathan, S. R., Daley, G. Q., and Gregory, R. I. (2008). Selective blockade of microRNA processing by Lin28. Science 320, 97–100.

Vumbaca, F., Phoenix, K. N., Rodriguez-Pinto, D., Han, D. K., and Claffey, K. P. (2008). Double-stranded RNA-binding protein regulates vascular endothelial growth factor mRNA stability, translation, and breast cancer angiogenesis. Mol Cell Biol 28, 772–783.

Walker, S. C., Good, P. D., Gipson, T. A., and Engelke, D. R. (2011). The dual use of RNA aptamer sequences for affinity purification and localization studies of RNAs and RNA-protein complexes. Methods Mol Biol 714, 423–444.

Wallace, S. T., and Schroeder, R. (1998). In vitro selection and characterization of streptomycin-binding RNAs: recognition discrimination between antibiotics. RNA 4, 112–123.

Wang, X. (2008). miRDB: a microRNA target prediction and functional annotation database with a wiki interface. RNA 14, 1012–1017.

Wang, X., McLachlan, J., Zamore, P. D., and Hall, T. M. T. (2002). Modular recognition of RNA by a human pumilio-homology domain. Cell 110, 501–512.

Wang, X., Zamore, P. D., and Hall, T. M. (2001). Crystal structure of a Pumilio homology domain. Mol Cell 7, 855–865.

Wang, Y., Killian, J., Hamasaki, K., and Rando, R. R. (1996). RNA molecules that specifically and stoichiometrically bind aminoglycoside antibiotics with high affinities. Biochemistry 35, 12338–12346.

Wang, Y., and Rando, R. R. (1995). Specific binding of aminoglycoside antibiotics to RNA. Chem Biol 2, 281–290.

Wang, Z., Gerstein, M., and Snyder, M. (2009). RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10, 57–63.

Wang, Z., Kayikci, M., Briese, M., Zarnack, K., Luscombe, N. M., Rot, G., Zupan, B., Curk, T., and Ule, J. (2010). iCLIP predicts the dual splicing effects of TIA-RNA interactions. PLoS Biol 8, e1000530.

Wapinski, O., and Chang, H. Y. (2011). Long noncoding RNAs and human disease. Trends Cell Biol 21, 354–361.

Weinlich, S., Hüttelmaier, S., Schierhorn, A., Behrens, S.-E., Ostareck-Lederer, A., and Ostareck, D. H. (2009). IGF2BP1 enhances HCV IRES-mediated translation initiation via the 3’UTR. RNA 15, 1528–1542.

Welting, T. J. M., Mattijssen, S., Peters, F. M. A., van Doorn, N. L., Dekkers, L., van Venrooij, W. J., Heus, H. A., Bonafé, L., and Pruijn, G. J. M. (2008). Cartilage-hair hypoplasia-associated mutations in the RNase MRP P3 domain affect RNA folding and ribonucleoprotein assembly. Biochim Biophys Acta 1783, 455–466.

Wepf, A., Glatter, T., Schmidt, A., Aebersold, R., and Gstaiger, M. (2009). Quantitative interaction proteomics using mass spectrometry. Nat Methods 6, 203–205.

White, E. K., Moore-Jarrett, T., and Ruley, H. E. (2001). PUM2, a novel murine puf protein, and its consensus RNA-binding site. RNA 7, 1855–1866.

Whittle, C., Gillespie, K., Harrison, R., Mathieson, P. W., and Harper, S. J. (1999). Heterogeneous vascular endothelial growth factor (VEGF) isoform mRNA and receptor mRNA expression in human glomeruli, and the identification of VEGF148 mRNA, a novel truncated splice variant. Clin Sci (Lond) 97, 303–312.

Wickens, M., Bernstein, D. S., Kimble, J., and Parker, R. (2002). A PUF family portrait: 3’UTR regulation as a way of life. Trends Genet 18, 150–157.

Williamson, J. R. (2000). Induced fit in RNA-protein recognition. Nat Struct Biol 7, 834–837.

Windbichler, N., and Schroeder, R. (2006). Isolation of specific RNA-binding proteins using the streptomycin-binding RNA aptamer. Nat Protoc 1, 637–640.

Winter, J., Jung, S., Keller, S., Gregory, R. I., and Diederichs, S. (2009). Many roads to maturity: microRNA biogenesis pathways and their regulation. Nat Cell Biol 11, 228–234.

Wreden, C., Verrotti, A. C., Schisa, J. A., Lieberfarb, M. E., and Strickland, S. (1997). Nanos and pumilio establish embryonic polarity in Drosophila by promoting posterior deadenylation of hunchback mRNA. Development 124, 3015–3023.

Wu, H., Zhu, S., and Mo, Y.-Y. (2009). Suppression of cell growth and invasion by miR-205 in breast cancer. Cell Res 19, 439–448.

Wu, L., and Belasco, J. G. (2005). Micro-RNA regulation of the mammalian lin-28 gene during neuronal differentiation of embryonal carcinoma cells. Mol Cell Biol 25, 9198–9208.

Xiao, S., Day-Storms, J. J., Srisawat, C., Fierke, C. A., and Engelke, D. R. (2005). Characterization of conserved sequence elements in eukaryotic RNase P RNA reveals roles in holoenzyme assembly and tRNA processing. RNA 11, 885–896.

Yakovchuk, P., Protozanova, E., and Frank-Kamenetskii, M. D. (2006). Base-stacking and base-pairing contributions into thermal stability of the DNA double helix. Nucleic Acids Res 34, 564–574.

Yang, G., Fu, H., Zhang, J., Lu, X., Yu, F., Jin, L., Bai, L., Huang, B., Shen, L., Feng, Y., et al. (2010). RNA-binding protein quaking, a critical regulator of colon epithelial differentiation and a suppressor of colon cancer. Gastroenterology 138, 231–240.e1–e5.

Yang, G., Lu, X., Wang, L., Bian, Y., Fu, H., Wei, M., Pu, J., Jin, L., Yao, L., and Lu, Z. (2011). E2F1 and RNA binding protein QKI comprise a negative feedback in the cell cycle regulation. Cell Cycle 10, 2703–2713.

Yang, H., Kong, W., He, L., Zhao, J.-J., O’Donnell, J. D., Wang, J., Wenham, R. M., Coppola, D., Kruk, P. A., Nicosia, S. V., et al. (2008). MicroRNA expression profiling in human ovarian cancer: miR-214 induces cell survival and cisplatin resistance by targeting PTEN. Cancer Res 68, 425–433.

Ye, W., Lv, Q., Wong, C.-K. A., Hu, S., Fu, C., Hua, Z., Cai, G., Li, G., Yang, B. B., and Zhang, Y. (2008). The effect of central loops in miRNA:MRE duplexes on the efficiency of miRNA-mediated gene regulation. PLoS One 3, e1719.

Yigit, E., Batista, P. J., Bei, Y., Pang, K. M., Chen, C.-C. G., Tolia, N. H., Joshua-Tor, L., Mitani, S., Simard, M. J., and Mello, C. C. (2006). Analysis of the C. elegans Argonaute family reveals that distinct Argonautes act sequentially during RNAi. Cell 127, 747–757.

Zhang, C., and Darnell, R. B. (2011). Mapping in vivo protein-RNA interactions at single-nucleotide resolution from HITS-CLIP data. Nat Biotechnol 29, 607–614.

Zhang, C.-D., Pan, M.-H., Tan, J., Li, F.-F., Zhang, J., Wang, T.-T., and Lu, C. (2011). Characteristics and evolution of the PUF gene family in Bombyx mori and 27 other species. Mol Biol Rep.

Zhao, Z., Chang, F. C., and Furneaux, H. M. (2000). The identification of an endonuclease that cleaves within an HuR binding site in mRNA. Nucleic Acids Res 28, 2695–2701.

Zheng, Y., and Miskimins, W. K. (2011). CUG-binding protein represses translation of p27Kip1 mRNA through its internal ribosomal entry site. RNA Biol 8, 365–371.

Zhou, S., Gu, L., He, J., Zhang, H., and Zhou, M. (2011). MDM2 regulates VEGF mRNA stabilization in hypoxia. Mol Cell Biol, [Epub ahead of print].

Zhou, Z., Licklider, L. J., Gygi, S. P., and Reed, R. (2002). Comprehensive proteomic analysis of the human spliceosome. Nature 419, 182–185.

Zhou, Z., and Reed, R. (2003). Purification of functional RNA-protein complexes using MS2-MBP. Curr Protoc Mol Biol Chapter 27, Unit 27.3.

Ziegeler, G., Ming, J., Koseki, J. C., Sevinc, S., Chen, T., Ergun, S., Qin, X., and Aktas, B. H. (2010). Embryonic lethal abnormal vision-like HuR-dependent mRNA stability regulates post-transcriptional expression of cyclin-dependent kinase inhibitor p27Kip1. J Biol Chem 285, 15408–15419.

Zielinski, J., Kilk, K., Peritz, T., Kannanayakal, T., Miyashiro, K. Y., Eiríksdóttir, E., Jochems, J., Langel, U., and Eberwine, J. (2006). In vivo identification of ribonucleoprotein-RNA interactions. Proc Natl Acad Sci U S A 103, 1557–1562.

9 Acknowledgments

This dissertation would not have been possible without the creativity and support of my supervisor André Gerber. His enthusiasm for this fascinating topic has been truly inspirational. I would like to thank him very much and wish him all the best for his future career!

I am deeply indebted and thankful to Michael Detmar for his supervision and continuous support, both personally and scientifically. I have greatly enjoyed my time in his lab and appreciate the scientific freedom and motivating environment that he provided for everyone through his hard and diligent work.

I am very grateful to Jonathan Hall for his input and support as a collaborator, member of my thesis committee and co-referee for this thesis. His dedication and his sharp and inquisitive mind have been a great motivation, and his occasional ‘words of wisdom’ will not be forgotten.

I would like to thank all the past and present members of the Gerber, Detmar, Halin and Hall labs for the outstanding atmosphere, scientific discussions, and valuable diversion in the form of (some very memorable!) extracurricular events and excursions. I am especially thankful for the “behind-the-scenes” work of Cornelius Fischer and Susanne Holliger. To Lucy: May he become the first rock star with a serious chocolate milk habit! To Acy: Thanks for the Turkish lessons. Merhaba akülü mämälär [sic!] onbir matkap ebediyen.

My heartfelt thanks and best wishes also go out to my students Fabi, Alex, Kasia, and particularly the “geezer” Felix, who became a dear friend.

Importantly, I would like to express my gratitude to all the other people who have helped me with experiments, reagents or advice: Piotr Dziunycz, Günther Hofbauer, Michael Hengartner, Boris Günnewig, Julian Zagalak, Andreas Frei, Bernd Wollscheid, Alexander Wepf, Matthias Gstaiger, Frank Wippich, Christoph Rösli, Gunther Meister, Martijn Kedde, Reuven Agami, and Susanna Bachmann. It has been a pleasure working with each and every one of them, and their contributions and insights are truly appreciated.

I would also like to thank Nina, Asli and Messer for always being around and making me feel at home during the last ten years or so. Who knows what the future holds, but I would be more than happy if it were not too far away from them.

Dear Fakopács: Thank you so much for all your love and support. Without you, I would not have been able to do this. Two boys, one girl, and fifty+ years… I love you.

It is my greatest pleasure to thank my parents Marita and Helmut, as well as my grandparents Pik, Elfriede, Willi and Gottfried, for their dedication, their cheerfulness, their trust and their unconditional support. Without them I would never have made it. You are the best!

10 Curriculum Vitae

Alexander Kanitz
Winterthurerstrasse 358, 8057 Zurich, Switzerland
Email: [email protected]

PERSONAL INFORMATION

Date of birth: November 15, 1980
Place of birth: Düren, Germany
Nationality: German

EDUCATION

11/2007 – 12/2011 Graduate studies in Pharmaceutical Sciences
Institute of Pharmaceutical Sciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland

11/2007 – 12/2011 Graduate studies in Molecular Life Sciences
Life Science Zurich Graduate School, Zurich, Switzerland

03/2007 Master of Science (M.Sc.)
University of Amsterdam, The Netherlands

10/2004 – 03/2007 Undergraduate studies in Medical Biochemistry
Faculty of Science, University of Amsterdam, The Netherlands

07/2004 Bachelor of Science (B.Sc.)
Technical University of Munich, Germany

09/2001 – 07/2004 Undergraduate studies in Molecular Biotechnology
Institute of Biological Chemistry, Technical University of Munich, Germany

07/2000 Abitur
Stiftisches Gymnasium, Düren, Germany

RESEARCH EXPERIENCE

11/2007 – 12/2011 Graduate studies
Institute of Pharmaceutical Sciences, Pharmacogenomics Unit, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
Supervision: Prof. Dr. Michael Detmar and PD Dr. André Gerber
Thesis: “Tools and strategies for the unraveling of post-transcriptional gene regulatory networks”

12/2006 – 02/2007 Undergraduate studies (for M.Sc.)
Swammerdam Institute for Life Sciences, University of Amsterdam, The Netherlands
Supervision: Prof. Dr. Stanley Brul and Dr. Hans van der Spek
Thesis title: “Status, complications and future prospects of highly active antiretroviral therapy”

02/2006 – 07/2006 Undergraduate studies (for M.Sc.)
Institute of Molecular and Cell Biology, A*STAR, Singapore
Supervision: Prof. Dr. Vinay Tergaonkar
Thesis title: “Characterization of a Putative IKK2 in Danio rerio”

10/2004 – 09/2005 Undergraduate studies (for M.Sc.)
Department of Hepatology, Academic Medical Center, University of Amsterdam, The Netherlands
Supervision: Prof. Dr. Ronald Oude-Elferink and Dr. Jurgen Seppen
Thesis title: “Lentiviral gene transfer of chemically inducible hepatocyte growth factor receptor and liver re-targeting of lentiviral particles”

03/2004 – 07/2004 Undergraduate studies (for B.Sc.)
Institute of Biological Chemistry, Technical University of Munich, Germany
Supervision: Prof. Dr. Arne Skerra and Dr. Ingo P. Korndörfer
Thesis title: “Production and characterization of the enzymatically active domain of the endolysine Ply500 isolated from Listeria monocytogenes bacteriophage A 500”

ADDITIONAL PROFESSIONAL EXPERIENCE

11/2007 – 12/2011 Teaching experience during Ph.D. studies:
- Annual practical course “Medicinal Chemistry” for Bachelor students in Pharmaceutical Sciences
- Supervision of semester, summer and Master project students
Institute of Pharmaceutical Sciences, Pharmacogenomics Unit, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland

07/2004 – 08/2004 and 08/2003 – 10/2003 Traineeships
R&D Division, Micromet, Inc., Munich, Germany
Supervision: Dr. Andreas Wolf and Dr. Stefan Buziol

1994 – 2004 Private tutoring
University, high school and elementary school students
Subjects: Mathematics, English, German, Physics, Chemistry, Biology

SELECTED PRESENTATIONS

08/2011 Poster presentation: “MicroRNA 361-5p Regulates Human VEGFA Expression in vitro and Its Expression Is Decreased in VEGFA-Overexpressing Squamous Cell Carcinoma”
8th Annual Retreat of the Zurich Ph.D. Program in Molecular Life Sciences, Bergün, Switzerland

09/2010 Poster presentation: “Regulatory Impact of Human PUF RNA-Binding Proteins and MicroRNAs on Cancer-Related Messages”
7th Annual Meeting of the NCCR “Frontiers in Genetics”, Saas-Fee, Switzerland

02/2010 Oral presentation: “Elucidating the Potential Interplay between MicroRNAs and Human PUF Family RNA-Binding Proteins”
Doktorandentag of the Institute of Pharmaceutical Sciences, ETH Zurich, Switzerland

07/2008 Poster presentation: “Systematic Analysis of the Crosstalk between Regulatory RNA-Binding Proteins and the MicroRNA Machinery”
13th Annual Meeting of the RNA Society, Berlin, Germany

PUBLICATIONS

Kanitz A, Imig J, Dziunycz PJ, Galgano A, Hofbauer GFL, Gerber AP, Detmar M. “The expression levels of microRNA-361-5p and its target VEGFA are inversely correlated in human cutaneous squamous cell carcinoma”. Under review (PLoS One).

Kanitz A, Gerber AP. (2010) “Circuitry of mRNA regulation”. Wiley Interdiscip Rev Syst Biol Med. 2(2):245-51

Galgano A, Forrer M, Jaskiewicz L, Kanitz A, Zavolan M, Gerber AP. (2008) “Comparative analysis of mRNA targets for human PUF-family proteins suggests extensive interaction with the miRNA regulatory system”. PLoS One. 3(9):e3164

Korndörfer IP, Kanitz A, Danzer J, Zimmer M, Loessner MJ, Skerra A. (2008) “Structural analysis of the L-alanoyl-D-glutamate endopeptidase domain of Listeria bacteriophage endolysin Ply500 reveals a member of the LAS peptidase family”. Acta Crystallogr D Biol Crystallogr. 64(Pt 6):644-50

Markusic DM, Kanitz A, Oude-Elferink RP, Seppen J. (2007) “Preferential gene transfer of lentiviral vectors to liver-derived cells, using a hepatitis B peptide displayed on GP64”. Hum Gene Ther. 18(7):673-9

11 Abbreviations

2’-O-Me: 2’-O-methylated (or 2’-methoxy)
ARE: AU-rich element
DMEM: Dulbecco’s modified Eagle’s medium
DNA: deoxyribonucleic acid
eGFP: enhanced green fluorescent protein
ELAVL1/HuR: ELAV (embryonic lethal, abnormal vision, Drosophila)-like 1 (Hu antigen R)
ELISA: enzyme-linked immunosorbent assay
EtOH: ethanol
FRT: flippase recognition target
G: guanine
GO: gene ontology
GRN: gene regulatory network
HITS-CLIP: high-throughput sequencing crosslinking and immunopurification
iCLIP: individual nucleotide resolution crosslinking and immunopurification
hnRNP: heterogeneous nuclear ribonucleoprotein
HPLC: high-performance liquid chromatography
IRES: internal ribosome entry site
Kd: dissociation constant
LC-MS: liquid chromatography-mass spectrometry
LNA: locked nucleic acid
MFE: minimum free energy
miRNA: microRNA
MW: molecular weight
nt: nucleotide
PAR-CLIP: photoactivatable ribonucleoside enhanced crosslinking and immunopurification
PBS: phosphate-buffered saline
PNA: peptide nucleic acids
PTGR: post-transcriptional gene regulation
PUF: pumilio/Fem-3-binding
RBP: RNA-binding protein
RIP-Chip: RNA-binding protein immunopurification-microarray
RLU: relative luciferase units
RNA: ribonucleic acid
RNase P: ribonuclease P
RNP: ribonucleoprotein
RP-HPLC: reversed phase HPLC
rRNA: ribosomal RNA
RT: room temperature (22°C)
S.D.: standard deviation
S.E.M.: standard error of the mean
SELEX: systematic evolution of ligands by exponential enrichment
snoRNA: small nucleolar RNA
snRNA: small nuclear RNA
snRNP: small nuclear ribonucleoprotein
SV40: simian virus 40
UTR: untranslated region
UV: ultraviolet
wt: wild type