Functional and Molecular Characterization of Hedgehog Signalling Regulation

by

Celine Lacroix

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Pharmaceutical Sciences
University of Toronto

© Copyright by Celine Lacroix 2016

Abstract

Hedgehog (Hh) signalling plays a key role in controlling cell fate and tissue patterning during development, and in maintaining tissue homeostasis throughout adulthood. The physiological importance of this pathway is highlighted by the developmental disorders and diseases that arise from mutations in Hh pathway components: defects of the limbs, cranial structure or nervous system, and cancer in adults. Although the pathway is highly conserved between flies and humans, important divergences exist between them, including the requirement for Sufu to negatively regulate the Gli transcription factors, which are thought to mediate all Hedgehog transcriptional outputs. The first study presented in this thesis describes Zfp629, a new transcription factor first identified as a novel interactor of Sufu through a proteomics study. ChIP-seq analysis and validation confirmed that Zfp629 controls the expression of Foxp2. Also a member of this pathway is Smoothened (Smo), a class F G protein-coupled receptor (GPCR). Several inhibitors have already been introduced to treat Hh-driven cancers such as medulloblastoma and basal cell carcinoma. However, resistance to these inhibitors develops rapidly, limiting their efficacy. The determination of SMO crystal structures enables structure-based discovery of new ligands with new chemotypes that will be critical to combat resistance. In the second study of this thesis, we docked 3.2 million available, lead-like molecules against SMO, looking for those with high physical complementarity to its structure; this represents the first such campaign against a class F GPCR. Twenty-one high-ranking compounds were selected for experimental testing, and four, representing three different chemotypes, were found to antagonize SMO with IC50 values better than 50 µM. A second screen for analogs revealed another six molecules, with IC50 values in the 2.3 µM to 22.4 µM range. Importantly, one of the most active of the new Smo antagonists remained efficacious against the D473H mutant of SMO, which confers clinical resistance to the antagonist vismodegib in cancer treatment.

Acknowledgments

The completion of this thesis was made possible by the support, guidance and contributions of several people.

I would like to thank my committee members, Dr. Carolyn Cummins, Dr. Chi-Chung Hui and Dr. Jason Matthews: your insightful questions, discussions and assistance allowed this thesis to come to completion.

To my supervisor, Dr. Stephane Angers, thank you for the opportunity, your guidance and for constantly challenging me. Thank you to the Angers lab, past and current members, for your suggestions, discussions and ideas.

To Dr. Brian Shoichet and Inbar Fish, thank you for your patience with my relentless questioning and for your kindness.

I would like to express my gratitude to the Faculty of Pharmacy and Graduate Department of Pharmaceutical Sciences for their support.

Lastly, to my colleagues and friends: I could not have done this without your advice, friendship and encouragement. I have found in you several smart scientists, many more kind hearts and an amazing second family. I would like to thank my family and my partner for always being there for me; mille mercis for helping me put things in perspective.

Table of Contents

List of Tables ...... viii

List of Figures ...... ix

List of Appendices ...... xii

List of Abbreviations ...... xiii

Chapter 1 Introduction ...... 1

1 Introduction ...... 1

1.1 Hedgehog Signalling ...... 1

1.2 Sufu ...... 5

1.2.1 Gli proteins ...... 6

1.2.2 Glis proteins ...... 7

1.2.3 Zic proteins ...... 9

1.3 Smoothened (Smo) ...... 9

1.3.1 Smo as a 7-transmembrane protein ...... 10

1.3.2 Smo in disease and as a drug target ...... 14

1.3.3 GPCR and in silico docking ...... 19

1.4 Research objectives ...... 19

1.4.1 Characterize a novel interactor of Sufu ...... 19

1.4.2 Perform a structure-assisted docking screen to identify novel Smo ligands...... 20

Chapter 2 Material and Methods ...... 21

2 Material and Methods ...... 21

2.1 ZFP629 results chapter ...... 21

2.2 Smo results chapter ...... 28

3 Results ...... 33

3.1 Results chapter 1: ZFP629 is a novel zinc finger transcription factor interacting with Sufu ...... 33

3.1.1 ZFP629 is a novel interactor of Sufu ...... 33

3.1.2 ZFP629 interaction with SUFU ...... 35

3.1.3 Zfp629 expression in the mouse cerebellum ...... 38

3.1.4 ZFP629 can repress expression ...... 42

3.1.5 ZFP629 interacts with proteins involved in transcription repression ...... 43

3.1.6 ZFP629 DNA-binding consensus sequence ...... 46

3.1.7 Effect of knockdown of Zfp629 on gene expression in NIH3T3 ...... 49

3.1.8 Chromatin Immunoprecipitation of ZFP629-3xFLAG ...... 55

3.1.9 Foxp2 expression is regulated by ZFP629 ...... 60

3.2 Results chapter 2: Identification of novel Smoothened ligands using structure-based docking ...... 63

3.2.1 Targeting the ligand binding site within the heptahelical domain of Smoothened ...... 63

3.2.2 Control docking screens for enrichment of ligand vs decoys...... 63

3.2.3 Prospective full library docking screen – selection of 21 compounds ...... 64

3.2.4 Secondary screen identifies analogs...... 69

3.2.5 The new antagonists exhibit efficacy at the chemoresistant Smo-D473H mutant...... 80

3.2.6 Library Bias ...... 85

4 Discussion and Conclusions ...... 86

4.1 ZFP629 Study ...... 86

4.1.1 ZFP629 is part of the SUFU protein complex ...... 86

4.1.2 ZFP629 consensus sequence ...... 87

4.1.3 ZFP629 is a novel transcription repressor ...... 87

4.1.4 Regulation of ZFP629 ...... 88

4.1.5 Regulation of Zfp629 and understanding its role in the developing cerebellum .. 89

4.1.6 ZFP629 and chemokines ...... 91

4.2 Smo Study ...... 96

4.2.1 Docking ...... 96

4.2.2 Aggregation ...... 97

4.2.3 Library Bias ...... 98

4.2.4 Functional activity of docking hits ...... 98

4.2.5 Functional selectivity ...... 99

5 Conclusions and Perspectives ...... 100

References ...... 105

Appendices ...... 122

List of Tables

Table 2-1 Human qPCR primers ...... 23

Table 2-2 Mouse qPCR primers ...... 24

Table 2-3 CAST primers ...... 26

Table 2-4 Mouse qPCR primers ...... 31

Table 3-1 Interactors of SUFU ...... 34

Table 3-2 BioID Assay with BirA*-ZFP629 in HEK293 ...... 43

Table 3-3 ZFP629 linkers ...... 47

Table 3-4 Microarray results ...... 51

Table 3-5 Candidate with expression confirmed by qPCR and ZFP629 motif in promoters ...... 53

Table 3-6 Results of first screen ...... 66

Table 3-7 Hits from initial screen ...... 68

Table 3-8 Results of analog screen ...... 72

Table 3-9 Antagonists discovered by secondary analog screen ...... 74

Table 3-10 Compounds found to be aggregators ...... 77

Table 3-11 Library Bias ...... 85

List of Figures

Figure 1-1 Simplified schematic of the Hh pathway ...... 4

Figure 1-2 Partial sequence alignment of GLIS3 with GLI and Ci ...... 8

Figure 1-3 Schematic representation of a class F GPCR such as Smoothened...... 11

Figure 1-4 Conserved cysteines in Smo CRD ...... 12

Figure 1-5 Simplified schematic of the developing cerebellum ...... 16

Figure 1-6 Medulloblastoma Molecular Subgroups ...... 17

Figure 3-1 Proteins associating with FLAG-SUFU in C3H10T1/2 ...... 35

Figure 3-2 ZFP629 has 19 C2H2 zinc finger motifs ...... 35

Figure 3-3 Interaction of ZFP629 and SUFU ...... 37

Figure 3-4 ZFP629 alignment with Ci/GLI SUFU-interacting motif ...... 37

Figure 3-5 Zfp629 expression in the mouse cerebellum ...... 39

Figure 3-6 Zfp629 expression in wildtype and Ptch1+/- P7 cerebellum ...... 40

Figure 3-7 Zfp629 expression in two medulloblastoma cell lines ...... 41

Figure 3-8 Effect of purmorphamine on Zfp629 mRNA levels ...... 41

Figure 3-9 ZFP629 can repress transcription ...... 42

Figure 3-10 CAST methodology ...... 48

Figure 3-11 Identification of ZFP629 consensus sequence ...... 49

Figure 3-12 Zfp629-targeting shRNA validation for microarray ...... 50

Figure 3-13 Ccl5 and Cxcl1 are both repressed when Zfp629 is knocked down...... 53

Figure 3-14 Zfp629 overexpression has no effect on Ccl5 or Cxcl1 ...... 54

Figure 3-15 Purmorphamine has only a modest effect on Ccl5 and Cxcl1 ...... 54

Figure 3-16 Expression of Ccl5 and Cxcl1 in Sufu-/- MEFs and Sufu-/- Venus-Sufu cells ...... 55

Figure 3-17 ChIP peaks distribution ...... 56

Figure 3-18 Best motif from MEME-ChIP analysis ...... 57

Figure 3-19 Representation of ZFP629 ChIP-seq data relative to promoters ...... 58

Figure 3-20 de novo motif discovery ...... 59

Figure 3-21 Results of GREAT query of MGI Expression and Phenotype for the cerebellum ... 60

Figure 3-22 Foxp2 is a target gene of ZFP629 and of Hedgehog ...... 62

Figure 3-23 Identification of four novel Smoothened antagonists ...... 67

Figure 3-24 Smo antagonist structures ...... 70

Figure 3-25 Screening of analogs and dose-response analysis of best hit ...... 71

Figure 3-26 Effect of centrifugation of ligand on Smo antagonist activity in Gli-Luciferase reporter assay...... 76

Figure 3-27 Particle formation by itraconazole measured by dynamic light scattering (DLS) .... 78

Figure 3-28 Itraconazole inhibits Smo via an aggregation-based mechanism ...... 79

Figure 3-29 New SMO antagonists compete for BODIPY-cyclopamine binding...... 82

Figure 3-30 Binding poses and inhibition of vismodegib-resistant Smo by 45b...... 84

Figure 4-1 ZNF629 expression profile in human tissue ...... 93

Figure 4-2 ZNF629 expression profile in cell lines ...... 94

Figure 4-3 ZNF629 protein levels in tissues reported by Proteomics DB ...... 95

Figure 4-4 ZNF629 protein levels in tissues reported by Human Proteome Map ...... 96

Figure 5-1 Summary of findings ...... 101

Figure 5-2 Schematic of the hypothesis of the role of Zfp629 and Foxp2 in the cerebellum .... 101

Figure 5-3 Drugs targeting the Wnt and Hh pathways ...... 104

List of Appendices

Appendix 1 ...... 122

List of Abbreviations

APC       Adenomatous polyposis coli
B-C       Bodipy-cyclopamine
BCC       Basal cell carcinoma
CAST      Cyclic amplification and selection of targets
Ci        Cubitus interruptus
ChIP      Chromatin immunoprecipitation
ChIP-seq  Chromatin immunoprecipitation coupled with high-throughput DNA sequencing
CRD       Cysteine-rich domain
DBD       DNA-binding domain
DMEM      Dulbecco’s modified Eagle’s medium
FACS      Fluorescence-activated cell sorting
FBS       Fetal bovine serum
FDR       False discovery rate
Fzd       Frizzled
GAL4      Galactose-responsive transcription factor GAL4, Saccharomyces cerevisiae
GAPDH     Glyceraldehyde-3-phosphate dehydrogenase
Gli       Glioma-associated oncogene
Gli3R     Gli3, repressor form
Glis      Gli-similar
GPCR      G-protein coupled receptor
Gsk3β     Glycogen synthase kinase 3 beta
Hh        Hedgehog
KPi       Potassium phosphate
LC-MS/MS  Liquid chromatography coupled to tandem mass spectrometry
MB        Medulloblastoma
MEFs      Mouse embryonic fibroblasts
mSmo      Mouse Smoothened
Ptch1     Patched 1
qPCR      Quantitative polymerase chain reaction
Shh       Sonic hedgehog
shRNA     Short hairpin RNA
Smo       Smoothened
Sufu      Suppressor of Fused
TAP       Tandem affinity purification
TBS       Tris-buffered saline
Tc        Tanimoto coefficient
TSS       Transcription start site
Wnt       Wingless-related MMTV integration site
WT        Wild type
Zfp629    Zinc finger protein 629 (mouse)
Zic       Zinc finger protein of the cerebellum
Znf629    Zinc finger protein 629 (human)

Chapter 1 Introduction

1 Introduction

1.1 Hedgehog Signalling

The Hedgehog morphogen (Hh) was first identified in Drosophila as a secreted protein regulating patterning and development of the fly embryo (Lee et al., 1992; Mohler and Vani, 1992; Tabata et al., 1992). Since the discovery of its vertebrate counterparts, significant progress has been made in uncovering not only the role but also the molecular mechanisms underlying the Hh pathway during development and in disease. The importance of the pathway is reflected in the conservation, from flies to humans, of most of its components and main regulatory mechanisms. The physiological importance of this pathway is further highlighted by the diseases and defects in humans that result from mutations in Hedgehog pathway components, including defects of limb development, cranial structure or the nervous system, and cancer.

In Drosophila, Hh regulates the stability and activity of the transcription factor Cubitus interruptus (Ci) (Méthot and Basler, 1999). Under resting conditions, Ci is processed into a transcriptional repressor that inhibits transcription of Hh target genes. This occurs through the concerted actions of Fused, Costal2, Suppressor of Fused and three kinases acting on Ci: PKA, CK1 and GSK3 (Méthot and Basler, 2001). When Hh binds its receptor Patched (Ptc), the receptor's inhibitory action on another transmembrane protein, Smoothened (Smo), is relieved. Smo is then activated and recruits Fused and Costal2 to its cytoplasmic tail, disrupting the proteolysis complex and the processing of Ci. Full-length Ci can then activate the expression of Hedgehog target genes (Figure 1-1 A).

Whereas this basic framework of the pathway is conserved among species, evolutionary divergences occurred between the fruit fly and mammals, several of them through gene duplications. For instance, vertebrates have three hedgehog ligands, namely desert hedgehog (Dhh), Indian hedgehog (Ihh) and sonic hedgehog (Shh), and also three dedicated transcription factors, the Gli proteins Gli1, Gli2 and Gli3. Shh is the most widely expressed ligand and regulates the developmental processes of several tissues, including the limbs, nervous system and skin. Dhh and Ihh have more restricted expression domains and their function is mostly confined to reproductive organs and bone, respectively (Ingham and McMahon, 2001). The three Gli transcription factors and Ci bind a similar DNA motif, but only Gli2 and Gli3 have a repressor domain analogous to Ci (Sasaki et al., 1999). Gli1 is a Hh target gene and functions in a positive loop amplifying signalling. There are two vertebrate Ptc homologues, Patched1 (Ptch1) and Patched2 (Ptch2), which are target genes of the pathway like their fly homologue. However, Ptch1 is the main Hh receptor and is essential for embryonic development, unlike Ptch2, which is dispensable (Nieuwenhuis et al., 2006) (Figure 1-1 B).

Figure 1-1 Simplified schematic of the Hh pathway
Simplified schematic of the Hh pathway in Drosophila (A) and in vertebrates (B). (A) Left side: In the absence of the ligand, Ptc (Drosophila Patched) represses Smo. Cos2, Fused (Fu) and SuFu (Drosophila Suppressor of Fused, homolog of vertebrate Sufu) associate with Ci. Through the activity of the kinases CK1α, PKA and GSK3β, this complex promotes the formation of CiR, the repressor form of Ci. Right side: Binding of Hh to Ptc relieves the repression of Smo, leading to the recruitment of kinases and phosphorylation of the cytoplasmic tail of Smo. It also leads to the activation of Ci (CiA: Ci activator). (B) Left side: In the absence of the ligand, Ptch1 represses Smo and prevents its ciliary accumulation. This leads to retention of Gli proteins in the cytoplasm by Sufu. Through the recruitment and activity of the kinases CK1α, PKA and GSK3β, this complex promotes the formation of Gli3 repressor (Gli3R). Gli3R translocates to the nucleus to repress Hh target genes. Right side: Binding of Hh to Ptch1 relieves the repression of Smo, allowing it to translocate to the primary cilium. Ligand binding to Ptch1 also leads to the accumulation of the Gli proteins, Sufu and Kif7 at the tip of the cilium. Pathway activation also induces the dissociation of the Gli proteins from Sufu, leading to the activation of Gli (GliA) and Hh target gene expression. IFT: intraflagellar transport.

More recently, another significant evolutionary divergence was uncovered: the requirement for the primary cilium in Hh signalling in vertebrate cells. The primary cilium is a non-motile microtubule-based structure that projects from the surface of post-mitotic vertebrate cells. Once considered a vestigial organelle, the primary cilium is now appreciated as an important signalling hub for various growth factor pathways such as PDGF (platelet derived growth factor) and Hh (Huangfu et al., 2003). Mutations affecting the proper function of the cilia are found in several disorders called ciliopathies. Phenotypes observed in those disorders are often analogous to phenotypes arising from mutations in components of the Hh pathway that cause the pathway to be activated (Goetz and Anderson, 2010). Levels of Gli activator and repressor proteins are affected by several genes involved in cilia and basal body functions (Haycraft et al., 2005; Huangfu and Anderson, 2005; Liu et al., 2005). Many components of the Hh pathway are found within the cilia under either basal or activated conditions (Corbit et al., 2005; Haycraft et al., 2005; Rohatgi et al., 2007). For example, in unstimulated conditions, Ptch1 is found in the cilia, with Smo, Sufu and Gli constantly trafficking through it (Haycraft et al., 2005; Rohatgi et al., 2007; Tukachinsky et al., 2010a; Wen et al., 2010). On stimulation, Ptch1 exits the cilia (Rohatgi et al., 2007) while Smo, Sufu and Gli accumulate at its tip (Chen et al., 2009; Kim et al., 2009; Tukachinsky et al., 2010b; Wen et al., 2010). The movement within the cilia from the base to the tip and back requires normal intraflagellar transport and a normal cilium structure. Mutations affecting Kif7, a kinesin and conserved regulator of Hh signalling, disrupt the integrity of the cilium structure and the proper localization of Gli2 in the organelle (He et al., 2014). Kif7 mutations in humans are found in hydrolethalus, acrocallosal and Joubert syndromes, ciliopathies and multiple malformation syndromes characterized by phenotypes reminiscent of Hh mutations, including polydactyly, cranio-facial malformations and various neurological defects (Dafinger et al., 2011; Putoux et al., 2011). Together, this evidence illustrates that the normal function of the primary cilium is crucial for proper Hh signalling and proper control of Gli activity.

1.2 Sufu

Another important divergence between Drosophila and vertebrate Hh signalling lies in their respective requirement for Suppressor of Fused (Drosophila: SuFu, vertebrates: Sufu), a protein that negatively regulates the Hedgehog pathway by inhibiting the Gli/Ci family of transcription factors. SuFu mutant flies survive and develop normally, whereas Sufu-/- mouse embryos die from neural tube defects in early stages of development, indicating an especially critical role in vertebrates (Cooper et al., 2005; Svärd et al., 2006).

In the absence of Hh signalling, Sufu regulates Ci/Gli by controlling its subcellular distribution and its transcriptional activity. Sufu binds Ci and Gli through both their N- and C-termini (Ding et al., 1999; Han et al., 2015). The N-terminal domains of Ci/Gli also contain a conserved Sufu-binding motif, SYGH, required for the interaction of Sufu with these transcription factors (Dunaeva, 2002). This interaction results in the sequestration of the DNA-binding proteins in the cytoplasm, thereby reducing their nuclear levels (Barnfield et al., 2005; Ding et al., 1999; Kogerman et al., 1999). Their interaction with Sufu also results in the formation of the truncated repressor forms of Ci, Gli2 and Gli3 (Gli3R), and in the degradation of Gli1 (discussed in the next section). However, Sufu is not solely a negative regulator of the pathway; it also stabilizes Gli2 and Gli3. Indeed, in Sufu-/- mouse embryonic fibroblasts (MEFs), total protein levels of all three Gli proteins are reduced and their respective protein levels are restored by the exogenous expression of Sufu (Chen et al., 2009).

1.2.1 Gli proteins

The Gli family of transcription factors is characterized by five C2H2-Krüppel zinc finger motifs highly conserved amongst all Gli members. Three Gli proteins, Gli1, Gli2 and Gli3, mediate the expression of the Hh target genes in vertebrates, including the expression of Gli1 and Ptch1. Also conserved is the DNA sequence bound by Gli, the Gli Binding Sequence (GBS) 5’-GACCACCCA-3’ (Kinzler and Vogelstein, 1990). In the absence of Hh signalling, Gli3, and to a lesser extent Gli2, undergo partial proteolysis to their respective repressor forms, which inhibit the expression of Hh target genes (Pan and Wang, 2007; Pan et al., 2006). For example, Gli3R represses the expression of Gli1, a classic Hh target gene (Hu, 2006). Deletion analysis revealed that, in addition to their characteristic five zinc fingers, Gli2 and Gli3 harbour a transcriptional repression domain at their N-terminus and all three Gli proteins contain a C-terminal transcription activation domain (Sasaki et al., 1999). Studies of Gli knockout mice demonstrated that GLI2 and GLI3 are the main transcriptional activator (GliA) and repressor (GliR), respectively, and are both essential for development, while GLI1 is dispensable for development (Ding et al., 1998; Matise et al., 1998; Park et al., 2000). Reinforcing the view that GLI2 and GLI3 are the main regulators of Hh target gene expression, the response to SHH is lost in mouse embryos lacking both Gli2 and Gli3 (Buttitta et al., 2003).
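As an illustration of what a fixed binding sequence such as the GBS means in practice, the following minimal Python sketch scans a DNA string for exact matches to 5’-GACCACCCA-3’ on either strand. It is only a sketch under stated assumptions: the function names are hypothetical, real binding sites tolerate mismatches that an exact-match scan ignores, and this is not the analysis used in this thesis.

```python
GBS = "GACCACCCA"  # Gli Binding Sequence quoted in the text (5'->3')

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_gbs_sites(dna: str, motif: str = GBS) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact motif match.

    Hypothetical helper: exact matching only, no mismatches allowed.
    """
    dna = dna.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(dna) - len(motif) + 1):
        window = dna[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))   # motif read on the given strand
        elif window == rc:
            hits.append((i, "-"))   # motif read on the opposite strand
    return hits
```

Reporting the strand alongside the position matters because a transcription factor site can lie on either strand relative to the sequence being scanned.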

Gli repressor formation is regulated by the phosphorylation of Gli by PKA, GSK3 and CK1 followed by the recruitment of and ubiquitination by the E3 ubiquitin ligase SCFβ-TrCP (Tempé et al., 2006) before limited processing by the proteasome. This partial degradation relies on a processing determinant domain (PDD), found between the zinc finger domain and the β-TrCP binding site found in Gli2 and Gli3, but lacking in Gli1. Gli activity is also controlled by the regulation of its stability, through different E3 ligase complexes. For example, the E3-ligase Spop interacts with Gli2 and Gli3 (Chen et al., 2009; Zhang et al., 2009), an interaction that promotes their ubiquitin-mediated proteasome degradation. In contrast, Gli1 is not a strong Spop substrate; instead Gli1 is targeted for ubiquitin-dependent proteasome degradation by the Itch E3 ligase (Marcotullio et al., 2006).

All three Gli proteins interact with Sufu through their N-terminus, which contains a conserved SYGH motif. More recently, Han et al. demonstrated that Sufu could also interact with Ci/Gli mutants lacking the SYGH motif and that the interaction was mediated by a conserved site in their C-terminus. They also demonstrated that for both Ci and Gli2, the interaction between Sufu and the transcription factor is stronger when both N- and C-terminal binding sites are present. Both Sufu-binding sites are also required for a more efficient repression of the Ci/Gli-mediated activation of the pathway. Beyond the physical interaction of Sufu with Gli, Sufu also regulates the transcriptional activity of Gli by recruiting nuclear proteins involved in transcription regulation. Yeast two-hybrid and co-purification experiments identified Sap18 and Lgals3 as interactors of Sufu (Cheng and Bishop, 2002; Paces-Fessy et al., 2004). Sap18, or Sin3-associated polypeptide 18, is a member of the Sin3 transcription repression complex. In a Gli-luciferase assay, exogenous expression of Sap18 increased the inhibition by Sufu of the Gli1-mediated activation of a luciferase reporter, and co-expression of Sap18 and its partner mSin3 led to a further inhibition of Gli1 activity (Cheng and Bishop, 2002). Lgals3 (lectin, galactoside-binding, soluble 3 or Galectin3), on the other hand, associates with transcription factors and stimulates transcriptional activity (Shimura et al., 2004).

1.2.2 Glis proteins

Glis2 and Glis3 are members of the Gli-similar family of transcription factors, related to the Gli and Zic families by homology since all three protein families contain five C2H2 zinc fingers. Three aspects of their regulation link them to Hh: 1) a DNA-binding consensus sequence similar to that of Gli suggests the possibility of crosstalk; 2) both Glis2 and Glis3 translocate to the primary cilium and mutations affect the development of organs requiring proper Hh signalling; and 3) Glis2/3 interact with Sufu. Glis2 and Glis3 have a consensus sequence (GlisBS) that is similar to the Gli consensus sequence, GlisBS: 5’-(G/C)TGGGGGGT(A/C)-3’. EMSA and luciferase reporter assays show that Gli proteins can also recognize this sequence and regulate gene expression by binding to it. Reciprocal assays with the GBS and Glis2/3 show that Glis can recognize the Gli consensus sequence, although with a lower affinity, and use it to regulate gene expression. While Glis3 expression induces the GlisBS-luciferase reporter, Glis2 expression has no effect on transcription of the reporter. Instead, Glis2 inhibits Gli1-mediated activation of the GlisBS-luciferase reporter (Beak et al., 2008; Vasanth et al., 2011).
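The GlisBS is written with alternatives at its first and last positions, so it stands for four concrete 10-mers rather than one. The short Python sketch below shows one way to expand that notation into a regular expression and test candidate sites against it. The helper names are hypothetical and the exact-match logic is an assumption made for illustration; it says nothing about binding affinity.

```python
import re

GLIS_BS = "(G/C)TGGGGGGT(A/C)"  # GlisBS as written in the text (5'->3')


def consensus_to_regex(consensus: str) -> str:
    """Turn '(G/C)'-style alternatives into regex character classes."""
    return re.sub(
        r"\(([ACGT](?:/[ACGT])+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus.upper(),
    )


def matches_consensus(site: str, consensus: str = GLIS_BS) -> bool:
    """True if the whole site is one of the sequences the consensus allows."""
    pattern = consensus_to_regex(consensus)
    return re.fullmatch(pattern, site.upper()) is not None
```

For example, `matches_consensus("GTGGGGGGTA")` accepts one of the four allowed sequences, while a site starting with A is rejected because only G or C is permitted at the first position.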

Sequence alignment of Glis3 and Gli shows that Glis3 contains a highly homologous sequence to the SYGH motif found in the Gli/Ci proteins (Figure 1-2), a sequence required for their interaction with Sufu (Dunaeva, 2002). Co-immunoprecipitation experiments of Sufu and Glis3 deletion mutants show that this homologous sequence is also required for Glis3 interaction with Sufu. However, Glis2 does not contain a similar motif and yet interacts with Sufu. Co-immunoprecipitation experiments indicate that the C-terminus of Glis2 mediates the interaction with Sufu (ZeRuth et al., 2011).

Ci     | SGSYGHI SATALNPMSH
hGli1  | GGSYGHL SIGTMSP---
hGli2  | SGSYGHL SAGALSP---
hGli3  | SGSYGHL SASAISP---
hGlis3 | PEV YGHF -LGVRGS---

Figure 1-2 Partial sequence alignment of GLIS3 with GLI and Ci
Partial alignment of Ci, GLI and GLIS3 showing conservation of the SUFU-binding motif in GLIS3. Red: conserved amino acids, green: conserved amino acids with strongly similar properties, blue and black: non-conserved amino acids.
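The comparison shown in Figure 1-2 can be sketched in a few lines of Python that check each aligned fragment for the SYGH motif. Several things here are assumptions made for illustration: the fragments are copied from the figure as printed, the space and dashes are treated as alignment gaps, and the labels "full" and "core" (for YGH without the leading serine) are hypothetical names, not terminology from the cited studies.

```python
# Motif-containing fragments as printed in Figure 1-2.
FRAGMENTS = {
    "Ci": "SGSYGHI",
    "hGli1": "GGSYGHL",
    "hGli2": "SGSYGHL",
    "hGli3": "SGSYGHL",
    "hGlis3": "PEV YGHF",
}


def motif_status(fragment: str, full: str = "SYGH", core: str = "YGH") -> str:
    """Classify a fragment as containing the full motif, only its core, or neither."""
    # Assumption: spaces and dashes are alignment gaps, not residues.
    residues = fragment.upper().replace(" ", "").replace("-", "")
    if full in residues:
        return "full"
    if core in residues:
        return "core"
    return "absent"


def summarize(fragments: dict[str, str]) -> dict[str, str]:
    """Map each protein name to its motif status."""
    return {name: motif_status(seq) for name, seq in fragments.items()}
```

Under these assumptions the Ci and Gli fragments carry the full SYGH motif, whereas the Glis3 fragment as printed retains only YGH, which is in keeping with the text describing the Glis3 sequence as highly homologous rather than identical.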

Co-expression of Sufu inhibits Glis3-mediated activation of an Ins2-luciferase reporter. Sufu interaction with Glis3 also protects it from degradation; thus, increasing amounts of Sufu correlate with increasing protein levels of Glis3, while reducing the amount of Glis3 co-precipitating with the E3 ubiquitin ligase Cul3 responsible for Glis3 ubiquitination and degradation. Co-expression of Sufu and Glis3 in the presence of cycloheximide increased the half-life of Glis3 protein, without any significant changes in Glis3 mRNA levels during the experiment. In summary, this study demonstrates that Glis3 interacts with Sufu in a similar fashion to the Gli proteins, and that this interaction modulates Glis3 protein stability and transcriptional activity (ZeRuth et al., 2011).

Genetic aberrations in GLIS3 have been linked to neonatal diabetes and developmental defects of the pancreas and kidneys, while loss-of-function mutations of GLIS2 lead to nephronophthisis, a ciliopathy and the most common genetic cause of end-stage renal disease in youth (Senée et al., 2006; Zhang et al., 2002). Normal development of the kidneys and pancreas also requires the Hh pathway. For example, GLI3 frameshift and splicing mutations are at the root of Pallister-Hall syndrome, a disorder affecting multiple organs. Most patients have extra digits and several are afflicted with severe kidney abnormalities (Cain and Rosenblum, 2010; Lau et al., 2006). Elevated Shh levels prevent the proper development of the pancreas and the spleen, while mice with mutations in Shh and Ihh have enlarged pancreata (Hebrok et al., 2000).

1.2.3 Zic proteins

Members of the Zic family of transcription factors (Zic1-3) have been shown to cooperate with the Gli transcription factors in neural development. They also control each other's transcriptional activity and subcellular localization (Aruga, 2004; Koyabu et al., 2001). More specifically, Zic2 interacts with Gli1, sequestering it in the nucleus and leading to increased Gli-dependent luciferase activity. In subcellular fractionation assays, a Zic2 truncation mutant that localizes to the nucleus but cannot co-immunoprecipitate Gli fails to retain GLI1 in the nucleus, unlike full-length Zic2 (Chan et al., 2011). The preferred Zic DNA-binding sequence, 5’-GACCACCC-3’, is identical to the first 8 bases of the GBS (Mizugishi, 2000).

1.3 Smoothened (Smo)

Smoothened (Smo) and the Frizzled (Fzd) membrane proteins form the class F or Frizzled family of G-protein coupled receptors (Foord et al., 2005). This family of seven-transmembrane receptors has low sequence identity to other classes, with identity of 24 to 30% and similarity ranging from 43% to 50% within the family. However, proteins of this family are highly conserved from flies through vertebrates. Both the Wnt and Hh pathways, signalling through Fzd and Smo respectively, play a role in the complex regulation of cell growth and proliferation of stem and progenitor cells during development and adult tissue maintenance. In this regard, mutations activating these pathways can lead to uncontrolled cell proliferation and tumour growth.
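For readers unfamiliar with how a percent identity figure is obtained, the following Python sketch computes it for two already-aligned sequences. It rests on assumptions: identity is counted over columns where neither sequence has a gap, which is only one of several conventions used by alignment tools, and the function name is hypothetical. Percent similarity would additionally require a substitution matrix and is not attempted here.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns with identical residues.

    Both inputs must come from the same alignment, so they have equal
    length and use `gap` wherever a residue is missing.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared
```

Because the choice of denominator changes the result, any identity value should be read together with the convention used to compute it.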

1.3.1 Smo as a 7-transmembrane protein

Vertebrate genomes encode ten Frizzled receptors, Fzd1-10, and 19 different Wnt proteins. Smo is the eleventh member of the class F/Frizzled family of GPCRs. The Frizzled family is distinguished from the other GPCR classes by its characteristic extracellular cysteine-rich domain (CRD) (Figure 1-3). Ten cysteines forming five disulphide bonds in the CRD are highly conserved throughout Fzd isoforms, and eight of those cysteines are found in Smo. These eight cysteines are also conserved throughout evolution (Rana et al., 2013) (Figure 1-4). The CRD of Fzd is essential for Wnt binding (Bhanot et al., 1996) and the Smo CRD is essential for Hh signalling (Aanstad et al., 2009; Rana et al., 2013).

Figure 1-3 Schematic representation of a class F GPCR such as Smoothened.

hFzd8  ------AASAKELACQEIT--VPLCKGIG--Y 48
mFzd8  ------AASAKELACQEIT--VPLCKGIG--Y 48
dSmo   LNYRLYAKKGRDDKPWFDGLDSRHIQCVRRARCYPTSNATNTCFGSKLPY 107
hSmo   ------ARRSAAVTGPPPP-LSHCGRAAPCEPLR--YNVCLGSVLPY 85
mSmo   ------SRRDVPVTSPPPPLLSHCGRAAHCEPLR--YNVCLGSALPY 89
       . : * * * *

hFzd8  NYTYMPNQFNHDTQDEAGLEVHQFWPLVE-IQCSPDLKFFLCSMYTPICL 97
mFzd8  NYTYMPNQFNHDTQDEAGLEVHQFWPLVE-IQCSPDLKFFLCSMYTPICL 97
dSmo   ELSSL-DLTDFHTEKELNDKLNDYYALKHVPKCWAAIQPFLCAVFKPKCE 156
hSmo   GATSTLLAGDSDSQEEAHGKLVLWSGLRNAPRCWAVIQPLLCAVYMPKCE 135
mSmo   GATTTLLAGDSDSQEEAHGKLVLWSGLRNAPRCWAVIQPLLCAVYMPKCE 139
       : : .::.* :: : * . :* . :: :**::: * *

hFzd8  E----DYKKPLPPCRSVCERAKAGCAPLMRQYGFAWPDRMRC--DRLPEQ 141
mFzd8  E----DYKKPLPPCRSVCERAKAGCAPLMRQYGFAWPDRMRC--DRLPEQ 141
dSmo   KINGEDMV--YLPSYEMCRITMEPCRILYNTTF--FPKFLRCNETLFPTK 202
hSmo   N----DRV--ELPSRTLCQATRGPCAIVERERG--WPDFLRCTPDRFPEG 177
mSmo   N----DRV--ELPSRTLCQATRGPCAIVERERG--WPDFLRCTPDHFPEG 181
       : * *. :*. : * : . :*. :** :*

Figure 1-4 Conserved cysteines in Smo CRD
Conserved cysteines are in red. Alignment performed with (*) conserved amino acids, (:) conserved amino acids with strongly similar properties, (.) conserved amino acids with weakly similar properties. h: human, m: mouse, d: Drosophila. Blue lines link cysteines forming disulphide bonds.

Recent reports raised the possibility that a sterol-like molecule may serve as a ligand for Smo. It has been hypothesized that Patched may be involved in the transport of such a molecule. Several lines of evidence supporting this model have been presented in recent years (Eaton, 2008; Hausmann et al., 2009; Taipale et al., 2002). First, Patched is related to the resistance-nodulation division (RND) family of bacterial proton-driven pumps. These bacterial proteins transport small lipophilic molecules across the membrane bilayer. Except for Patched, characterized mammalian members are involved in cholesterol sensing and sterol metabolism (Eaton, 2008). Second, Patched is needed in substoichiometric levels compared to Smo and the two transmembrane proteins cannot be copurified (Taipale et al., 2002). Taken together, these two points suggest a model where Patched inhibits Smo by transporting a small molecule, possibly sterol-related.


Cyclopamine, the first Smo antagonist discovered, is a plant sterol derivative (Chen, 2002). It binds Smo in the upper section of a long and narrow cavity formed by the extracellular linker domain, extracellular loops and the 7TM bundle (Weierstall et al., 2014). Synthetic sterol derivatives can modulate the activity of Smo by binding to its CRD (Myers et al., 2013; Nachtergaele et al., 2012; Nedelcu et al., 2013).

Fzd and Smo as GPCRs are poorly understood. Emerging evidence for both receptors indicates that they modulate multiple signalling pathways, including heterotrimeric G proteins, instead of a single linear downstream pathway to regulate signalling intracellularly (Arensdorf et al., 2016; Dijksterhuis and Petersen, 2014). In the case of Smo, examples include signalling via Gαi for fibroblast migration (Polizio et al., 2011) and signalling through Gαi and Ca2+ to modulate metabolism (Teperino et al., 2012).

In the absence of Hh ligands, the Hh receptor Ptch1 prevents activation and trafficking of the 7-transmembrane protein Smo to the cilia (Corbit et al., 2005). This inhibition enables both the degradation and the processing of the Gli transcription factors into their repressor forms, which leads to the repression of the Hh target genes (Huntzicker et al., 2006). Binding of Hh to Ptch1 inhibits the receptor, leading to the activation of Smo and its translocation to the primary cilium. More specifically, it was found that binding of Hh to its receptor allows the recruitment of CK1α and GRK2 to the cytoplasmic tail of Smo, and its subsequent phosphorylation-driven conformational change (Chen et al., 2011b). This phosphorylation also promotes Smo accumulation at the primary cilium, facilitated by β-arrestin and the kinesin motor protein Kif3A (Kovacs et al., 2008; Meloni et al., 2006). In addition to the accumulation of Smo at the cilia, activation of the pathway also increases the presence of Gli, Sufu and another kinesin motor protein, Kif7, at the cilium. This leads to the dissociation of the Gli-Sufu complex, followed by the activation of Gli, its translocation to the nucleus and the transcription of target genes (Humke et al., 2010; Tukachinsky et al., 2010a).

Supporting Smo’s function as a GPCR, several lines of evidence in Drosophila and in mammalian cells suggest a coupling of Smo to Gαi to regulate both Gli-dependent signalling and Gli-independent signalling (Chinchilla et al., 2010; Ogden et al., 2008; Polizio et al., 2011; Riobo et al., 2006). Smo-Gαi coupling is required in Shh-induced fibroblast migration, a process that is Gli-independent. Indeed, pertussis toxin, a Gαi inhibitor, did not affect the expression of Hh-target genes stimulated by Shh (Polizio et al., 2011). In developing spinal neurons, Shh activates Gαi and causes an increase in Ca2+ spike activity, that is, Ca2+ influx and release from intracellular stores (Belgacem and Borodinsky, 2011). However, inactivation of Gαi affected neither the Shh-driven specification of neuronal cell types in the chick neural tube nor the development of mouse limbs (Low et al., 2008; Regard et al., 2013). A link between Gαs (Gnas) and Hedgehog in the context of osteoblast differentiation was demonstrated during the characterization of a progressive osseous heteroplasia (POH) mouse model, Prrx1-Cre; Gnasfl/fl mice. Patients affected by POH and Albright hereditary osteodystrophy (AHO) harbour loss-of-function mutations of GNAS and present dermal, skeletal muscle and deep connective tissue ossification. Loss of Gnas in mice promoted ectopic osteoblast differentiation. At the molecular level, the loss of Gnas resulted in lower PKA activity and consequently reduced processing of Gli, leading to an increase in Gli-dependent gene expression. This increase in Hh target gene expression in the forelimbs promoted osteoblast differentiation and ossification (Regard et al., 2013).

1.3.2 Smo in disease and as a drug target

Mutations activating Wnt signalling in cancer are often found downstream of the Frizzled receptors, such as mutations found in APC and β-catenin (Morin et al., 1997). In contrast, in the case of Smo and the Hh pathway, inactivating mutations in the upstream inhibitor of Smoothened, Patched1 (PTCH1), and activating mutations in SMO are found in several forms of cancers including medulloblastoma (MB) and basal cell carcinoma (BCC), where Hh signalling is known to play a role in oncogenesis and progression (Hahn et al., 1999; Johnson et al., 1996; Pietsch et al., 1997; Raffel et al., 1997).

Hh regulates the survival and proliferation of several tissue progenitor and stem cell populations. In particular, Shh is a major mitogenic regulator of cerebellar development. Growth of the cerebellum requires the rapid proliferation of cerebellar granule neuron progenitors (CGNP) after their migration to the external germinal layer (EGL), the outer layer of the cerebellum. CGNPs respond to the mitogenic effects of Shh secreted by Purkinje cells, via the increased expression of Hh-target gene Mycn and the consequent upregulation of cyclin D1 and cyclin D2 (Kenney and Rowitch, 2000; Kenney et al., 2003; Vaillant and Monard, 2009). Most of this proliferation and expansion of the EGL occurs in the early days after birth, between P5 and P8 in the mouse. As proliferation proceeds, older CGNPs exit the cell cycle, differentiate into granule neurons and migrate inwards to form the internal granule layer (IGL), with all CGNPs matured by P21 (Figure 1-5). Hh activity is subsequently downregulated to homeostatic levels. In this context, mutational activation of the Shh pathway leads to uncontrolled cell proliferation and results in medulloblastoma, the most common malignant pediatric brain cancer and a major cause of morbidity and mortality in pediatric oncology. Molecular and transcriptional profiling studies uncovered that medulloblastoma could be divided into four subgroups, based on their gene expression signatures: Shh, Wnt, group 3 and group 4 (Figure 1-6) (Taylor et al., 2012). The Shh subgroup represents approximately 30% of all MB and is characterized by aberrant Shh signalling. Individuals with inactivating germline mutations in PTCH1 or SUFU are predisposed to medulloblastoma. Somatic mutations in PTCH1, SMO or SUFU, or amplification of GLI1 or GLI2, characterize the Shh medulloblastoma molecular subgroup.

[Figure 1-5: schematic of the developing cerebellum. Labels: mitotic CGNP, postmitotic CGNP, EGL, granule neurons, Purkinje neurons, SHH, Bergmann glia, IGL.]

Figure 1-5 Simplified schematic of the developing cerebellum

Inspired by (Ruiz i Altaba et al., 2002). SHH is secreted by the Purkinje neurons and stimulates the proliferation of the cerebellar granule neuron progenitors (CGNP) in the outer EGL (external granule layer). Postmitotic CGNPs migrate downwards to form the internal granule layer (IGL).


Figure 1-6 Medulloblastoma Molecular Subgroups

Comparison of the various subgroups of medulloblastoma including their affiliations with previously published papers on medulloblastoma subgrouping. CDK6: cyclin-dependent kinase 6, CTNNB1: β-catenin 1, LCA: large cell/anaplastic, M+: positive for metastasis, MYC: c-myc, MYCN: n-myc. Source: (Taylor et al., 2012)

Inactivating PTCH1 mutations as found in medulloblastoma (MB) and basal cell carcinoma (BCC) lead to constant pathway activity because of deregulated activation of SMO and of the signalling downstream. Efforts to develop Hh inhibitors and SMO antagonists have proven successful with the clinical development of GDC-0449 (vismodegib) (Hoff et al., 2009; Rudin et al., 2009). While preclinical results of vismodegib for the treatment of metastatic medulloblastoma were promising, patients rapidly developed resistance to the drug due to mutations of SMO or downstream effectors (Buonamici et al., 2010; Dijkgraaf et al., 2011; Yauch et al., 2009). Gorlin syndrome patients with BCC (caused by inherited PTCH1 mutations) all responded to vismodegib (Tang et al., 2012). Resistance and a drop in response rate arose in patients with advanced or metastatic BCC, with reported overall response rates of 43-58% and 30.3%, respectively (Axelson et al., 2013; Chang and Oro, 2012; Sekulic et al., 2012). The first reported SMO mutation conferring resistance to GDC-0449 was D473H, which causes a loss of affinity of the receptor for the small molecule (Yauch et al., 2009). Follow-up studies in mice with either GDC-0449 or LDE225 (Novartis) identified, as mechanisms of resistance, acquired mutations in SMO, including mutations at D473, and activating mutations downstream of SMO, including amplification of GLI2, the main transcriptional activator of the pathway. With such partial success, there is interest in finding novel chemical scaffolds able to inhibit SMO using alternative molecular determinants that would prevent the emergence of resistance or provide secondary lines of treatment.

While oncogenic mutations in Smo have been reported and characterized, no such mutations have been reported so far for Fzd. However, recent studies have demonstrated cancer dependence on Wnt ligands and enhanced response to chemotherapy when combined with Wnt inhibition (Gurney et al., 2012), highlighting the opportunity for drug development targeting the signalling events at the level of the receptor. Current clinical trials are focused on LGK974, an inhibitor of Porcupine, a palmitoyltransferase needed for the post-translational acylation of Wnt ligands required for their secretion and activity (Liu et al., 2013); OMP-54F28, a recombinant fusion protein of the FZD8 cysteine-rich domain with the Fc domain of immunoglobulin that acts as a ligand scavenger (Smith et al., 2014); and Vantictumab (OMP-18R5), a monoclonal antibody binding and inhibiting FZD1, -2, -5, -7 and -8 (Gurney et al., 2012). The first Fzd antagonist, FzM1, was recently discovered in a screen for a pharmacological chaperone for a misfolded Fzd4 mutant implicated in familial exudative vitreoretinopathy (Generoso et al., 2015). The small molecule identified in this screen rescued the plasma membrane localization of this mutant. It also inhibited wildtype Fzd4 and the activity of the pathway in LEF/TCF luciferase reporter assays (EC50 1.8 µM). Treatment of glioblastoma U87MG and colon cancer Caco-2 cells with FzM1 led to more differentiated cells. Hydrogen-to-deuterium exchange and mass spectrometry experiments revealed binding of FzM1 to the intracellular loop 3 (ICL3) of Fzd4, a region involved in the binding of Dishevelled, a downstream effector of the Fzd receptors (Generoso et al., 2015).


1.3.3 GPCR and in silico docking

Traditional drug development relies on high-throughput screening (HTS) using different methods for hit identification (Johnson et al., 1996), and is usually time-consuming and costly. Computational screening or docking considerably increases the probability of identifying novel lead compounds as it can sample a relatively large chemical space and is fairly economical compared to HTS (Sela et al., 2010; Sousa et al., 2013). Protein-ligand docking consists of searching combinations of ligand and target protein conformations and scoring them. Multiple orientations and low energy conformations are sampled for each molecule. Then, molecules are ranked based on different properties such as polar, steric, apolar, solvation and predicted affinity (Ferreira et al., 2015; Mysinger and Shoichet, 2010). Focusing on commercially available molecules ensures that the hits may be tested rapidly. Whereas in silico docking has had important successes over the last decade (Carlsson et al., 2011; Dahlgren et al., 2012; de Graaf et al., 2011; Langmead et al., 2012; Ramsden et al., 2009; Repasky et al., 2012; Roughley et al., 2012; Sager et al., 2012; Tosh et al., 2012), it has several limitations, as it neither predicts binding affinities, nor rank-orders the affinities of diverse molecules. The successes of molecular docking have been more evident against the GPCR structures, in part due to a substantial bias of commercial libraries towards GPCR-like ligands resulting from the intensive medicinal chemistry in the field. While hit rates for soluble enzymes can vary between 2 and 10%, docking hit rates against GPCRs ranged from 17 to 35%, with affinities in the 100 pM to 3 µM range for compounds directly from the screens (Carlsson et al., 2010; de Graaf et al., 2011; Katritch et al., 2010; Kolb et al., 2009a; Mysinger et al., 2012b; Wang et al., 2014).

1.4 Research objectives

1.4.1 Characterize a novel interactor of Sufu

Sufu is known for interacting with the Gli proteins and regulating their protein levels and transcriptional activity. Using a functional proteomic approach, we identified Zfp629 as a new transcription factor interacting with Sufu. The aim of the first study in this thesis is to characterize the function of this protein. More specifically, ZFP629 is a zinc finger protein. Because it has no other domains indicating that it could be a transcription factor, we had to determine whether or not it could modulate gene transcription, which we did using a GAL4 reporter assay. Once that was confirmed, we used gene expression analysis and chromatin immunoprecipitation to identify genes modulated by ZFP629.

1.4.2 Perform a structure-assisted docking screen to identify novel Smo ligands.

Smo mediates signal transduction in the Hh pathway, a signal transduction cascade essential for development and implicated in carcinogenesis. While the clinically approved Smo antagonist vismodegib is able to suppress tumour growth, mutations in Smo that preclude vismodegib binding lead to recurrence of tumours. With the report of the first crystal structure for Smo, a collaboration with Dr. Brian Shoichet was established for in silico docking and screening of small molecules to identify new Smo antagonists and test candidates against vismodegib-resistant Smo. We used Ptch1-/- MEFs and a Gli-Luciferase assay, with the loss of Ptch1 leading to constitutive activation of the pathway, allowing us to screen for antagonists. qPCR for Gli1 was performed to ensure that the inhibition observed with the luciferase assay translated into inhibition of Hh target genes. Bodipy-cyclopamine displacement assays and a Wnt/Fzd reporter assay were used to confirm that the new antagonists bound in the site predicted by our model and were specific for Smo. Finally, testing against the vismodegib-resistant Smo mutant was done by comparing Gli1 mRNA levels in C3H10T1/2 cells overexpressing mouse wt Smo or D477H Smo.


Chapter 2 Material and Methods

2 Material and Methods

2.1 ZFP629 results chapter

Cell culture

Cell lines used: HEK293 Flp-In T-REx, HEK293T, C3H10T1/2 (ATCC, CCL-226), NIH3T3 (ATCC, CRL-1658), Plat-E (obtained from Dr. James Ellis with necessary permissions and MTA from Dr. Toshio Kitamura). Cells were grown in DMEM/10% FBS, except NIH3T3 cells, which were grown in DMEM/10% calf serum. When needed, serum starvation was performed as follows: once cells were confluent, media was changed to 1% serum for 24 h before adding drug. Cyclopamine and purmorphamine were purchased from Santa Cruz Biotechnology (sc-200929 and sc-202785).

Plasmids

Zfp629 cDNA was obtained from TCAG Genome Resource Facility (IMAGE:30543652 – IRAV135A2). The coding sequence was amplified by PCR and subcloned into the plasmids of interest. UAS-Luc and CMX-Gal4-DBD were kind gifts of Dr. Carolyn Cummins. Gal4-DBD was amplified by PCR and subcloned into pIRES-puro. All PCR-amplified regions were verified by sequencing. Sufu plasmids used the human sequence. Gli plasmids were gifts from Dr. C.C. Hui (Hospital for Sick Children, Toronto, Canada) and used mouse sequences.

Retrovirus preparation and transduction

pMiG and derivatives: 5 x 10⁶ Plat-E cells in 10 cm plates were transfected with 30 µg DNA with a standard calcium phosphate procedure (Kingston et al., 2003). Medium was changed 16 h post-transfection. Virus-containing medium was collected 48 h post-transfection. Target cells were resuspended in virus-containing medium supplemented with 8 µg/mL polybrene (1,5-dimethyl-1,5-diazaundecamethylene polymethobromide, hexadimethrine bromide, Sigma). Medium of target cells was changed 24 h after infection.


Lentivirus preparation and transduction

All lentiviral particles were produced in HEK293T cells by calcium phosphate co-transfection of VSV-G, psPAX2 and lentiviral plasmids in 30% confluent monolayer cell culture grown in 10 cm plates. Media was changed 16 h post-transfection and virus subsequently collected after 24 h and 48 h. Target cells were transduced in the presence of 10 µg/mL polybrene (Sigma-Aldrich). The viral media was replaced with fresh DMEM/10% FBS media 24 h after infection. Where appropriate, cells were selected with 2 µg/mL puromycin for 48 h after viral infection.

Affinity purification, immunoprecipitation and mass spectrometry

FLAG-SUFU: 5 x 15 cm plates of C3H10T1/2 and NIH3T3 cells stably expressing FLAG-SUFU were collected for each experiment. Cells were resuspended in 10 mL TAP lysis buffer (50 mM Hepes-KOH pH 8.0, 10% glycerol, 0.1% Igepal CA 630, 100 mM KCl, 2 mM EDTA, 2 mM DTT, 10 mM NaF, 0.25 mM NaOVO3, 10 mM β-glycerophosphate, protease inhibitors (Sigma)) and then subjected to two freeze-thaw cycles before being incubated for 30 min on a rotator at 4°C. Debris were pelleted by centrifugation (18 000 x g, 20 min at 4°C). FLAG-SUFU protein complexes were then purified with FLAG-M2 agarose (Sigma). The resin was washed with TAP lysis buffer followed by washes with 50 mM ammonium bicarbonate pH 7.8. Protein complexes were eluted with 500 mM ammonium hydroxide pH 11.0. Reduction and alkylation of the proteins were performed sequentially with 25 mM DTT and 100 mM iodoacetamide before digestion with 1 µg of sequencing-grade trypsin (Promega). Tryptic peptides were analyzed by reverse-phase LC-MS/MS using an LTQ-XL mass spectrometer (Thermo Scientific). The acquired tandem mass spectra were searched using SEQUEST against the RefSeq mouse database, running on the Sorcerer platform (Sage-N Research). Identified proteins were filtered for a minimum of two unique peptides and then against background experiments to identify unique interactors.
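The final filtering step described above can be illustrated with a short sketch. It is only a minimal example under stated assumptions: the function name, the data layout and the peptide counts are hypothetical, and "filtered against background" is interpreted here as removing any protein detected in the background experiments, which may be stricter than the comparison actually applied.

```python
def unique_interactors(bait_counts, background_counts, min_unique_peptides=2):
    """Keep proteins with enough unique peptides that are absent from background.

    Both arguments map a protein name to its number of unique peptides.
    Assumption: any protein seen in the background run is discarded.
    """
    return sorted(
        protein
        for protein, n_peptides in bait_counts.items()
        if n_peptides >= min_unique_peptides and protein not in background_counts
    )


# Hypothetical unique-peptide counts (illustrative numbers, not thesis data).
flag_sufu = {"Sufu": 25, "Gli2": 6, "Zfp629": 4, "Actb": 12, "Krt1": 1}
background = {"Actb": 15, "Krt1": 3}

print(unique_interactors(flag_sufu, background))
```

In this toy example the bait itself and two candidate interactors are retained, while the abundant contaminant and the single-peptide identification are dropped.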

Antibody Production and Purification

A rabbit polyclonal antibody was raised against a peptide (aa 591-711) of ZFP629, selected for its low similarity to other mouse proteins. The coding sequence for this peptide was cloned into the pGEX vector (Promega) downstream of GST. The peptide was expressed in BL-21 E. coli, induced with 0.5 mM IPTG (Sigma) in a shaker at 30°C overnight. GST-fusion proteins were purified using Glutathione Sepharose 4B (GE Healthcare Life Sciences). The purity of the purified peptide was assessed by SDS-PAGE and Coomassie stain, and by mass spectrometry analysis. The produced peptide was then provided to Cocalico Biologicals Inc. for antibody production in rabbit. The crude antiserum was first tested for antigen specificity by western blotting against lysates of cells expressing control or Zfp629-targeting shRNA. To purify the antibody, the antiserum was first passed through a column of GST proteins immobilized on Glutathione Sepharose 4B, followed by affinity purification on a GST-peptide column. The antibody was then eluted and tested for antigen specificity.

RNA extraction, cDNA preparation and qPCR

RNA samples from medulloblastoma cell lines were a gift from Dr. Peter Dirks. RNA samples from tissues were a gift from Dr. C.C. Hui (The Hospital for Sick Children, Toronto, Canada). RNA was extracted using Trizol (Life Technologies, 15596-018). 1 µg of RNA was DNase-treated (Life Technologies, 18047-019) before being reverse transcribed into cDNA (High-Capacity Reverse Transcription kit, Life Technologies, 4368813). Real-time quantitative PCR reactions were performed on an ABI 7900HT in 384-well plates containing 10 ng cDNA, using Power SYBR Green PCR Master Mix (Life Technologies, 4367660). Relative mRNA levels were calculated using the comparative Ct method, normalized to Gapdh mRNA or Ppib (Cyclophilin B) mRNA. Primers were designed using Primer Express 3.0 (Applied Biosystems) and validated as previously described (Bookout et al., 2006).

Table 2-1 Human qPCR primers

Gene                     Sequence 5’-3’               RefSeq ID
PPIB-F (cyclophilin B)   GGAGATGGCACAGGAGGAA          NM_000942.4
PPIB-R                   GCCCGTAGTGCTTCAGTTT
FOXP2-F                  TCAGTTCTAAGTGCAAGACGAGACA    NM_148898
FOXP2-R                  TCCATGGCCATAGAGAGTGTGA


Table 2-2 Mouse qPCR primers

Gene       Sequence 5’-3’                 RefSeq ID
Ccl5-F     TCCAATCTTGCAGTCGTGTTTG         NM_013653.3
Ccl5-R     TCTGGGTTGGCACACACTTG
Cxcl1-F    CACCCAAACCGAAGTCATAGC          NM_008176.3
Cxcl1-R    CAAGGGAGCTTCAGGGTCAA
Foxp2-F    GCTGCCTCAAGCTGGCTTAA           NM_053242.4
Foxp2-R    ACTCCAGTCACTTCTTTCCATAGTTG
Gapdh-F    AGAAACCTGCCAAGTATGATG          NM_001289726.1
Gapdh-R    CAACCTGGTCCTCAGTGTAG
Gli1-F     GCCACACAAGTGCACGTTTG           NM_010296.2
Gli1-R     AAGGTGCGTCTTGAGGTTTTCA
Ptch1-F    CTTCGCTCTGGAGCAGATTT           NM_008957.2
Ptch1-R    TGTAACAACCCAGTTTAAATAAGAGTC
Zfp629-F   GGCTCATAGAGGTGCTGGAAGT         NM_177226.5
Zfp629-R   GGATGATCTCCTCCCCAGAACT
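For reference, the comparative Ct calculation used for the qPCR data can be sketched as follows. This is a minimal illustration that assumes approximately 100% amplification efficiency for both primer pairs (the usual assumption behind the 2^-ΔΔCt formula); the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative mRNA level by the comparative Ct (2^-ddCt) method.

    ct_target / ct_reference: Ct of the gene of interest and of the
    reference gene (e.g. Gapdh or Ppib) in the sample of interest.
    ct_target_ctrl / ct_reference_ctrl: the same values in the control sample.
    Assumes a doubling of product at every cycle for both amplicons.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target gene appears two cycles later in the
# treated sample while the reference gene is unchanged.
level = relative_expression(ct_target=24.0, ct_reference=18.0,
                            ct_target_ctrl=22.0, ct_reference_ctrl=18.0)
print(level)
```

A two-cycle delay of the target relative to the control condition corresponds to a four-fold lower relative mRNA level.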

Genome-wide expression array

RNA from NIH3T3 cells stably expressing a control or a targeting shRNA was isolated as described in the previous section. Samples were submitted to the Princess Margaret Genomics Centre, Toronto, Canada for the microarray experiment and data analysis. The Illumina Mouse Whole Genome (Version 2, release 6) array was used, with 45 281 probes. Data was first checked for overall quality using R (v2.15.1) with the Bioconductor framework and the LUMI package. For data analysis, raw expression data was first imported in GeneSpring v12.0, then normalized using a standard quantile normalization and a ‘per probe’ median centered normalization. Data analysis and visualization was performed on log2 transformed data. Probes without signal were removed at this point of the analysis. Then only probes in the 80th percentile of intensity distribution in both groups were kept. Unsupervised clustering was performed using a Pearson centered correlation as a distance metric with average linkage rules. This was followed by a moderated t-test with a Benjamini-Hochberg FDR correction for multiple testing and a q value cut-off of 0.2, which yielded a list of 67 significantly different probes. Functional enrichment was performed with g:GOSt (Gene Group Functional Profiling, g:profiler suite) (Reimand et al., 2011). Motif search in promoters was performed using tools from RSAT (Thomas-Chollier et al., 2011).
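The multiple testing correction step can be illustrated as follows. This sketch only shows the Benjamini-Hochberg adjustment and the 0.2 q value cut-off; it does not reproduce the moderated t-test or the GeneSpring implementation, and the p values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values are forced to be monotonic in the rank.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical per-probe p values (not thesis data).
p = [0.01, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value <= 0.2]
print(q, significant)
```

With these toy values the first three probes pass the cut-off and the last one does not.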

Gal4 Reporter Assay

HEK293T cells were plated in 24-well plates such that they would be 30% confluent the following day. A total of 800 ng DNA was transfected by calcium phosphate precipitation (250 ng of GAL4-Luc, 100 ng of RL-TK, varying amounts of the transcription factor plasmids, completed with pcDNA3 DNA).

Cyclic amplification and selection of targets (CAST)

CAST was performed as described previously with the following modifications (Atcha et al., 2007; Voz et al., 2000). First, 30 bp random oligomers flanked with adapters were synthesized (5’-GCGTCGACTCTAGACTGCAG-N30-GAATTCGGATCCCTCGAGCG-3’, Integrated DNA Technologies) and then made into double-stranded DNA as follows: 400 pmol of random oligomers, 1200 pmol cast-rev primer, 200 µM each dNTP and 10 units of Taq (Vivantis Technologies). Conditions were as follows: 5 min at 94°C, 20 min at 65°C and 20 min at 72°C. For CASTing, the first selection was done by incubating 15 µg lysate with 150 pmol dsDNA in reaction buffer (15 mM Tris-HCl pH 7.5, 50 mM KCl, 5% glycerol, 5 µM EDTA, 5.75 mM MgCl2, protease inhibitors) for 30 min at room temperature. Streptavidin-sepharose (GE Healthcare Life Sciences) was added to the mixture for 15 min at room temperature. Resin was then washed once with reaction buffer and 3 times with wash buffer (PBS supplemented with 0.1% NP-40 and 0.1% BSA). Resin was combined with 100 µL PCR mix and purified DNA amplified as follows: 94°C 1 min (initial cycle 5 min), 62°C 1 min, 72°C 1 min, repeat 20 times. 10 µL of the PCR was used for the next CAST incubation, supplemented with 1 µg poly-(dI-dC) (Sigma Aldrich). For primer sequences, refer to Table 2-3. After each PCR cycle, amplicons were cloned into pBlueScript KS(+) and sequenced. Lysates from HEK293T cells expressing pGlue (Streptavidin, Calmodulin and HA tags, (Angers et al., 2006)) and pGlue-Gli2 were used as background and positive controls, respectively. Motif-discovery software MEME was used to generate potential consensus DNA motifs (Bailey et al., 2009). Sequences enriched in the pGlue CAST were removed from the obtained sequences for pGlue-Gli2 and pGlue-Zfp629 prior to motif discovery.

Table 2-3 CAST primers

Name       Sequence 5’-3’
cast-fwd   GCGTCGACTCTAGACTGCAG
cast-rev   CGCTCGAGGGATCCGAATTC
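The relationship between the random oligomer design and the two CAST primers can be checked with a short sketch: cast-fwd is identical to the 5’ constant flank, and cast-rev is the reverse complement of the 3’ constant flank, which is why cast-rev alone can prime second-strand synthesis. The helper names below are hypothetical and the random 30-mer is generated for illustration only.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Constant flanks of the synthesized oligomer and primers from Table 2-3.
FWD_FLANK = "GCGTCGACTCTAGACTGCAG"
REV_FLANK = "GAATTCGGATCCCTCGAGCG"
CAST_FWD = "GCGTCGACTCTAGACTGCAG"
CAST_REV = "CGCTCGAGGGATCCGAATTC"


def random_library_member(rng, core_length=30):
    """Build one illustrative library member: flank + random core + flank."""
    core = "".join(rng.choice("ACGT") for _ in range(core_length))
    return FWD_FLANK + core + REV_FLANK


oligo = random_library_member(random.Random(0))
print(len(oligo))
print(CAST_FWD == FWD_FLANK, reverse_complement(REV_FLANK) == CAST_REV)
```

Each library member is 70 nt long (20 + 30 + 20), and both primer checks evaluate to true for the sequences listed above.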

ChIP-Seq

ChIP-seq: 3 x 10⁷ HEK293T cells (3 x 10 cm plates) transfected with pMiG-Zfp629-3xFLAG were collected 48 h post-transfection. Protein-chromatin complexes were cross-linked with 1% formaldehyde with rocking for 10 min at room temperature. Cross-linking was quenched with 125 mM glycine for 5 min at room temperature. Medium was aspirated and cells were washed twice with PBS. Cells were collected using PBS supplemented with 0.1% Tween-20. Cells were then washed with PBS and pelleted by centrifugation for 3 min at 1000 x g at 4°C. The supernatant was removed and the cell pellet was resuspended in 500 µL TSE I buffer (50 mM Tris pH 8.0, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100) supplemented with 1X protease inhibitor cocktail. Cell extracts were sonicated in polystyrene 15 mL conical tubes (Falcon) using the Bioruptor to obtain chromatin fragments of approximately 300 bp in length: two cycles of 15 min (30 s on, 30 s off) at high intensity with the water bath at 4°C, with cell extracts placed on ice for 10 min between the two cycles. Cellular debris were then removed by centrifugation at 21 000 x g for 10 min at 4°C. The supernatant containing the solubilized chromatin was then transferred to a clean microcentrifuge tube. To pre-clear the chromatin samples, 30 µL Protein A agarose slurry (50%, Sigma) was added to the chromatin and incubated at 4°C for 2 h. Protein A agarose was pelleted by centrifugation and the cleared chromatin transferred to clean tubes. 2 µg FLAG-M2 antibody (Sigma) was added to the chromatin and the samples were incubated at 4°C overnight on a rotator. The following day, 25 µL Protein A agarose slurry was added to each sample and left to rotate for 2 more hours at 4°C to allow for binding of Protein A to the antibody complexes. The Protein A agarose-bound complexes were then pelleted by centrifugation at 5000 rpm for 1 min at room temperature and washed with buffers for 5 min each as follows: three washes with 1 mL TSE I, two washes with 1 mL of TSE II (20 mM Tris pH 8.0, 500 mM NaCl, 2 mM EDTA, 1% Triton X-100, 0.1% SDS), one wash with 1 mL LiCl buffer (20 mM Tris pH 8.0, 250 mM LiCl, 1 mM EDTA, 1% Igepal CA 630, 1% sodium deoxycholate) and three washes with 1 mL TE (10 mM Tris pH 8.0, 1 mM EDTA). After the final wash, protein-DNA complexes were eluted in 110 µL of TE + 1% SDS for 1 h, and the cross-links were reversed by overnight incubation at 65°C. DNA was then purified using the QIAquick PCR purification kit. The eluted DNA was separated using a 1% agarose gel and fragments in the 200-400 bp range were extracted using the QIAquick Gel Extraction Kit. Amplification was performed using SeqPlex according to the manufacturer’s protocol (Sigma-Aldrich). Library preparation and sequencing was performed by the Princess Margaret Genomics Centre (Toronto, Canada).

ChIP-qPCR

Cells were resuspended in at least 10x pellet volume in swelling buffer (25 mM Hepes pH 7.8, 1.5 mM MgCl2, 10 mM KCl, 0.1% NP-40, 1 mM DTT, protease inhibitors (EDTA-free)) and incubated on ice for 10 min. Cells were homogenized using a Dounce homogenizer, 20-40 strokes to release nuclei. Nuclei were pelleted by centrifugation for 5 min at 4°C, 500 x g. Pellets were resuspended in 10 pellet volumes of sucrose buffer A (0.32 M sucrose, 15 mM Hepes pH 7.9, 60 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 0.5% BSA, 0.5 mM spermidine, 0.15 mM spermine, 0.5 mM DTT) and homogenized with 20 strokes. This suspension was then layered on top of an equal volume of sucrose buffer B and centrifuged for 15 min, 4°C, 4000 x g. The nuclei pellet was resuspended in 500 µL TSE I buffer for the following steps. The sonication, immunoprecipitation and DNA purification were performed as described in the ChIP-seq section.

ChIP-seq Data Analysis

ChIP-seq data analysis was performed using MACS 1.0. Terminal (Mac OS X) was used for text file manipulations. The Cistrome Analysis Pipeline was used to analyze genomic intervals. The Nebula platform was used to draw the peak annotation graphs. MEME-ChIP (Ma et al., 2014) was used to perform the de novo motif enrichment analysis and STAMP (Mahony et al., 2007) was used to assess the similarity between the motifs.


Subcellular Fractionation

Subcellular fractionation was done with the NE-PER kit (Thermo Scientific) following the manufacturer’s recommendations.

Western Blotting

Lysates and eluates were resolved by SDS-PAGE and transferred onto PVDF. Western blotting was performed with the antibodies indicated in the figure legends. HA antibody: clone 16B2 (Covance); FLAG antibody: monoclonal mouse, clone M2 (F3165, Sigma); β-tubulin antibody: monoclonal mouse (E7, Developmental Studies Hybridoma Bank).

2.2 Smo results chapter

Docking

A training dataset of 308 ligands was extracted from ChEMBL 12 (Gaulton et al., 2012) with a cut-off of 10 µM binding affinity. We used DOCK 3.6 (Mysinger and Shoichet, 2010) to screen the Lead Now subset of the ZINC database (http://zinc.docking.org), with properties of xlogp ≤ 3.5, molecular weight between 250 and 350 Da and rotatable bonds ≤ 7 (Irwin and Shoichet, 2005; Irwin et al., 2012), against the x-ray crystal structure of human Smo bound to the antagonist LY2940680 [PDB ID 4JKV] (Wang et al., 2013). About 2.4 million molecules were screened against the Smo orthosteric site. The complementarity of each ligand pose was scored as the sum of the receptor-ligand electrostatic interaction energy (using ligand probe charges in an electrostatic potential calculated by QNIFFT (Fischer et al., 2014; Gallagher and Sharp, 1998), a version of DelPhi (Gilson and Honig, 1987; Shoichet and Kuntz, 1993)) and the van der Waals interaction energy (using the AMBER potential (Meng et al., 1992)), corrected for ligand desolvation. Partial charges from the united-atom AMBER force field were used for all receptor atoms except for Asn219, Asp384 and Arg400, for which the dipole moment was increased, as previously described (Carlsson et al., 2010), to boost electrostatic scores for poses in polar contact with these important residues. Forty-five matching spheres were used. The degree of ligand sampling was determined by the values of the bin size, bin size overlap and distance tolerance, set at 0.3 Å, 0.1 Å and 1.2 Å, respectively, for both the matching spheres and the docked molecules. Ligand internal degrees of freedom were pre-calculated using OpenEye’s Omega program. Ligand charges and initial solvation energies were calculated using AMSOL (Chambers et al., 1996; Jiabo Li et al., 1998).
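The library property filter and the additive form of the pose score described above can be sketched as follows. This is not DOCK 3.6 code: the class, the function names and all numerical values are hypothetical, and the sign convention (favourable interaction energies negative, desolvation entered as a positive penalty) is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Molecule:
    name: str
    xlogp: float
    mol_weight: float  # in Da
    rotatable_bonds: int


def passes_library_filter(mol):
    """Property window quoted in the text for the screened ZINC subset."""
    return (
        mol.xlogp <= 3.5
        and 250 <= mol.mol_weight <= 350
        and mol.rotatable_bonds <= 7
    )


def pose_score(electrostatic, van_der_waals, ligand_desolvation):
    """Sum of the three energy terms; lower means better complementarity.

    Assumption: interaction energies are negative when favourable and the
    desolvation correction is a positive penalty.
    """
    return electrostatic + van_der_waals + ligand_desolvation


# Hypothetical molecules and energy terms (illustrative values only).
library = [
    Molecule("mol_a", 2.1, 310.4, 4),
    Molecule("mol_b", 4.2, 330.0, 5),   # fails on xlogp
    Molecule("mol_c", 1.0, 420.5, 3),   # fails on molecular weight
]
kept = [mol.name for mol in library if passes_library_filter(mol)]
score = pose_score(electrostatic=-12.5, van_der_waals=-30.0, ligand_desolvation=6.5)
print(kept, score)
```

Ranking a screen then amounts to sorting the best-scoring pose of each retained molecule in ascending order of this sum.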

Tanimoto coefficient (Tc) calculation

An updated dataset of 452 Smo ligands was extracted from ChEMBL 19 (Bento et al., 2014; Gaulton et al., 2012) with a cut-off of 10 µM binding affinity. Using ECFP4 fingerprints, we calculated the Tc (Rogers and Tanimoto, 1960) between our hits and all of the 452 ligands.
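For binary fingerprints, the Tc between two molecules is the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below illustrates this on small hand-written bit sets; the helper names and bit indices are hypothetical, and real ECFP4 fingerprints would be generated with a cheminformatics toolkit rather than written by hand.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def max_tc(hit_bits, known_ligand_bits):
    """Highest Tc between one hit and any ligand of a reference set."""
    return max(tanimoto(hit_bits, reference) for reference in known_ligand_bits)


# Hypothetical on-bit indices (not real ECFP4 output).
hit = {1, 5, 9, 12}
known_ligands = [{1, 5, 7, 20}, {9, 12, 30}]
print(max_tc(hit, known_ligands))
```

Here the hit shares two of six bits with the first reference ligand and two of five with the second, so its maximum Tc against the set is 0.4. A low maximum Tc against all known ligands is what indicates a new chemotype.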

Minimization with PLOP

Minimization of the complex of compound 45b with the D473H mutant model was done using PLOP (version 25.1). Only residues within 5 Å of the ligand were allowed to move.

Luciferase Assay

Ptch1-/- MEFs stably expressing a Gli-Luciferase reporter and constitutive Renilla Luciferase were used. The Gli-Luciferase reporter is a Firefly Luciferase reporter driven by 8xGli consensus binding sites in its promoter, cloned in a lentiviral plasmid carrying a puromycin resistance cassette for selection. For the assay, 5 x 10⁴ cells/well were plated in 48-well plates. The next day, the confluent cells were serum-starved with plain DMEM for 24 hours. Drugs and compounds were added to the indicated final concentration in duplicate and incubated for 24 hours. Promega Dual-Glo reagents were used for the readout. Media was removed and cells were lysed in 50 µL Passive Lysis Buffer for 10 min. 10 µL of lysate was assayed in black plates with 10 µL of each substrate, in duplicate. Luminescence was measured with an EnVision 2100 (Perkin Elmer). Firefly Luciferase luminescence was divided by the Renilla Luciferase luminescence, and this ratio was then normalized to the vehicle condition to obtain the fold change in reporter activity. Cyclopamine and GDC-0449 were purchased from Santa Cruz Biotechnology (sc-200929 and sc-396759). Itraconazole was purchased from SelleckChem (S2476).
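The normalization of the dual luciferase readout can be sketched as follows. The function name and the luminescence counts are hypothetical, and averaging the per-well ratios of the duplicates before normalizing to vehicle is an assumption about how replicates were combined.

```python
from statistics import mean


def fold_change(firefly, renilla, vehicle_firefly, vehicle_renilla):
    """Fold change in reporter activity relative to the vehicle condition.

    Each argument is a list of luminescence readings from replicate wells.
    Firefly is divided by Renilla well by well, replicate ratios are
    averaged, and the result is divided by the mean vehicle ratio.
    """
    ratios = [f / r for f, r in zip(firefly, renilla)]
    vehicle_ratios = [f / r for f, r in zip(vehicle_firefly, vehicle_renilla)]
    return mean(ratios) / mean(vehicle_ratios)


# Hypothetical duplicate readings (arbitrary luminescence units).
treated = fold_change(
    firefly=[2000.0, 2200.0], renilla=[10000.0, 10000.0],
    vehicle_firefly=[10000.0, 11000.0], vehicle_renilla=[10000.0, 10000.0],
)
print(round(treated, 3))
```

In this toy example the compound reduces normalized reporter activity to about one fifth of the vehicle level.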

Aggregation counter-screens

To assess whether or not our hits were sensitive to aggregation, we prepared ligands in media and centrifuged half of each preparation at 21,000 g for 20 min before adding it to the cells. The assays were then performed as described for the Luciferase Assay and for Gli1 mRNA levels (RNA extraction and qPCR).


Top-Flash assay

The assay was carried out in HEK293T cells, as described (Lui et al., 2011). Briefly, a stable HEK293T cell line carrying the TopFlash β-catenin-dependent luciferase reporter and Renilla Luciferase was seeded in 24-well plates. For Wnt stimulation, the medium was replaced the following day with a 1:1 mix of DMEM-Wnt3a or DMEM-control conditioned medium, and compounds were added to the cells. Assays were performed the next day.

RNA isolation, cDNA synthesis and qPCR analysis

Ptch1-/- MEFs or C3H10T1/2 cells overexpressing wildtype or mutant Smo were plated at a density of 2 × 10⁵ cells/well in 12-well plates. The next day, the confluent cells were serum-starved with plain DMEM for 24 hours before drugs and compounds were added to the indicated final concentration, with 0.5% DMSO. RNA was extracted using Trizol (Life Technologies, 15596-018) after 24 hours. 1 µg of RNA was DNase-treated (Life Technologies, 18047-019) before being reverse transcribed into cDNA (High-Capacity Reverse Transcription kit, Life Technologies, 4368813). Real-time quantitative PCR reactions were performed on an ABI 7900HT in 384-well plates containing 10 ng cDNA, using Power SYBR Green PCR Master Mix (Life Technologies, 4367660). Relative Gli1 mRNA levels were calculated using the comparative Ct method, normalized to Gapdh mRNA. Primers were validated as previously described (Bookout et al., 2006); their sequences are listed in Table 2-4.
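The comparative Ct calculation can be illustrated with a short sketch. It assumes ideal amplification efficiency (a doubling of product per cycle), which the text does not state, and the function and argument names are hypothetical.

```python
def relative_expression(
    ct_target: float,
    ct_reference: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Relative mRNA level by the comparative Ct (2^-ddCt) method.

    Assumes 100% PCR efficiency, i.e. one cycle equals a two-fold change.
    """
    # Normalize the target gene (e.g. Gli1) to the reference gene (e.g. Gapdh)
    d_ct_sample = ct_target - ct_reference
    d_ct_control = ct_target_control - ct_reference_control
    # Compare the treated sample with the control (vehicle) condition
    dd_ct = d_ct_sample - d_ct_control
    return 2.0 ** (-dd_ct)
```

For example, a target gene that crosses threshold two cycles later in the treated sample than in the control, at equal reference-gene Ct, corresponds to a relative level of 0.25.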


Table 2-4 Mouse qPCR primers

Gene | Sequence 5’-3’ | RefSeq ID
Gapdh-F | AGAAACCTGCCAAGTATGATG | NM_001289726.1
Gapdh-R | CAACCTGGTCCTCAGTGTAG |
Gli1-F | GCCACACAAGTGCACGTTTG | NM_010296.2
Gli1-R | AAGGTGCGTCTTGAGGTTTTCA |

BODIPY-Cyclopamine Binding Assay, Microscopy and FACS

A HEK293 stable cell line expressing tetracycline-inducible mouse Smoothened with mCherry fused to its C-terminus was used for these experiments (Nedelcu et al., 2013). Cells were grown to confluence in the presence of 1 µg/mL tetracycline for 24 hours. Cells were then incubated with 10 nM BODIPY-Cyclopamine (TRC Canada, B674800) and compounds for 2 hours at 37°C. For FACS, cells were first trypsinized, fixed with 4% paraformaldehyde (pH 7.4, in PBS) for 20 min, washed with TBS + 0.1% Triton X-100 and then sorted (FACS). BODIPY fluorescence was measured on the cell sorter LSR Fortessa and FACS data were analyzed with the software FlowJo v.10. BODIPY fluorescence in control HEK293 cells was used to set the background threshold.

Mean fluorescence was plotted against compound concentration to calculate the IC50 of each compound. For microscopy, fixed cells were imaged.

Dynamic Light Scattering (DLS)

Concentrated DMSO stocks of itraconazole and vismodegib were diluted with filtered DMEM to a final concentration of 1% DMSO. Compounds 13b, 19b, 20b, 25b, 27b, 32b, 37b, 40b and 45b were diluted with both filtered DMEM and 50 mM potassium phosphate buffer, pH 7.0 (also referred to as KPi; 21.1 mM KH2PO4, 28.9 mM K2HPO4), to a final concentration of 1% DMSO. Measurements were made using a DynaPro Plate Reader II system (Wyatt Technology) with a 60 mW laser at ~830 nm in either 96-well or 384-well plates; this particular instrument had been modified by Wyatt Technology to have a larger laser beam width, which is appropriate for detecting large colloidal particles (Doak et al., 2010; Duan et al., 2015).


Critical aggregation concentration (CAC) determination

Normalized scattering intensities (counts/sec, cnt/s) were plotted against decreasing concentrations of itraconazole. Data for the colloidal and non-colloidal states were fitted by linear and non-linear regression, respectively. The intersection of the two fitted curves was taken as the critical aggregation concentration. Concentrations are reported as the mean and standard deviation of three repetitions.
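The idea of locating the CAC at the intersection of two fitted regimes can be sketched as below. Note the simplification: both regimes are fitted with straight lines by ordinary least squares, whereas the analysis described above used a non-linear fit for the non-colloidal state. Function names and the way the data are split between regimes are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Line = Tuple[float, float]  # (slope, intercept)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Line:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct concentrations")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def intersection(line_a: Line, line_b: Line) -> Tuple[float, float]:
    """Point where two non-parallel lines cross."""
    (m1, b1), (m2, b2) = line_a, line_b
    if m1 == m2:
        raise ValueError("parallel lines do not intersect")
    x = (b2 - b1) / (m1 - m2)
    return x, m1 * x + b1
```

In use, the points below and above the apparent transition would be fitted separately and the x-coordinate of the intersection read off as the estimated CAC.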

Enzyme Inhibition Assays

Inhibition of AmpC β-lactamase and MDH in counter-screening assays was measured as described (Coan and Shoichet, 2008; Doak et al., 2010; Duan et al., 2015; Seidler et al., 2003). The final concentration of DMSO was 1% for all samples. Values reported are the average of duplicate samples run in two independent experiments. Both DMEM and potassium phosphate buffer (also referred to as KPi; 50 mM, pH 7.0, 21.1 mM KH2PO4, 28.9 mM K2HPO4) were used as buffers.


3 Results

3.1 Results chapter 1: ZFP629 is a novel zinc finger transcription factor interacting with Sufu

Shaimaa Ahmed: ChIP for ChIP-seq (J. Matthews, University of Toronto, ON, Canada)
Swneke Bailey: ChIP-seq MACS analysis (M. Lupien, OICR, Toronto, ON, Canada)
Thevagi Satkunendran: ISH experiments of Zfp629 expression in wildtype and SmoM2 cerebellum (C.C. Hui, The Hospital for Sick Children, Toronto, ON, Canada)
Celine Lacroix: performed all other experiments and analysis presented.

3.1.1 ZFP629 is a novel interactor of Sufu

To identify novel interactors of Sufu, Hedgehog-responsive C3H10T1/2 and NIH 3T3 cells stably expressing FLAG-SUFU were lysed, and anti-FLAG immunoprecipitates were digested with trypsin and then analyzed using LC-MS/MS. Validating the approach, GLI2, GLI3, GSK3β and CNBP, which have been previously described to interact with SUFU, were identified (Figure 3-1, Table 3-1) (D'Amico et al., 2015; Kise et al., 2009; Stone et al., 1999). Among the proteins identified in both cell lines was ZFP629, a zinc finger protein of unknown function. ZFP629 is predicted to have 19 C2H2 zinc finger domains, as represented in Figure 3-2, with 14 zinc fingers in tandem and 5 additional downstream. As Sufu is known to interact with the zinc finger proteins Gli1-3, Glis2 and Glis3, the identification of an additional zinc finger protein as a SUFU-interacting protein was of interest. To validate the mass spectrometry data, co-immunoprecipitation and western blotting experiments were performed. Co-expressed FLAG-ZFP629 and HA-Strep-SUFU were found to selectively interact (Figure 3-3 A and B).


Table 3-1 Interactors of SUFU

A) α-FLAG purification of FLAG-SUFU in C3H10T1/2

Gene ID | Gene Symbol | Protein Name | Unique Peptides | Total Peptides | % Coverage
51684 | Sufu | suppressor of fused homolog (Drosophila) | 19 | 541 | 45.2
12785 | Cnbp | cellular nucleic acid binding protein | 5 | 61 | 34.7
14634 | Gli3 | GLI-Kruppel family member GLI3 | 16 | 28 | 17.8
68988 | Prpf31 | PRP31 pre-mRNA processing factor 31 homolog (yeast) | 8 | 10 | 32.7
244895 | Peak1 | pseudopodium-enriched atypical kinase 1 | 2 | 2 | 1.2
56637 | Gsk3b | glycogen synthase kinase 3 beta | 3 | 3 | 9.8
14633 | Gli2 | GLI-Kruppel family member GLI2 | 10 | 14 | 11.1
320683 | Zfp629 | zinc finger protein 629 | 5 | 6 | 7
208177 | Phldb2 | pleckstrin homology-like domain, family B, member 2 | 3 | 3 | 3
14632 | Gli1 | GLI-Kruppel family member GLI1 | 2 | 2 | 2.4
13601 | Ecm1 | extracellular matrix protein 1 | 6 | 7 | 15.6
12331 | Cap1 | CAP, adenylate cyclase-associated protein 1 | 4 | 4 | 12.7
21762 | Psmd2 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 2 | 2 | 2 | 2.8
18674 | Slc25a3 | solute carrier family 25 (mitochondrial carrier, phosphate carrier), member 3 | 2 | 3 | 10.9
18933 | Prrx1 | paired related homeobox 1 | 3 | 4 | 14.3
26443 | Psma6 | proteasome (prosome, macropain) subunit, alpha type 6 | 2 | 3 | 9.8
17463 | Psmd7 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 7 | 2 | 2 | 12.5
26440 | Psma1 | proteasome (prosome, macropain) subunit, alpha type 1 | 2 | 2 | 9.1

B) α-FLAG purification of FLAG-SUFU in NIH3T3

Gene ID | Gene Symbol | Protein Name | Unique Peptides | Total Peptides | % Coverage
51684 | Sufu | suppressor of fused homolog (Drosophila) | 21 | 303 | 45.9
12785 | Cnbp | cellular nucleic acid binding protein | 5 | 25 | 34.7
14634 | Gli3 | GLI-Kruppel family member GLI3 | 10 | 16 | 11
68988 | Prpf31 | PRP31 pre-mRNA processing factor 31 homolog (yeast) | 8 | 9 | 27.3
244895 | Peak1 | pseudopodium-enriched atypical kinase 1 | 2 | 2 | 1.4
56637 | Gsk3b | glycogen synthase kinase 3 beta | 4 | 4 | 14.5
14633 | Gli2 | GLI-Kruppel family member GLI2 | 4 | 4 | 5.2
320683 | Zfp629 | zinc finger protein 629 | 6 | 6 | 8.5
319178 | Hist1h2bb | histone cluster 1, H2bb | 3 | 8 | 27
15381 | Hnrnpc | heterogeneous nuclear ribonucleoprotein C | 2 | 2 | 7
67300 | Cltc | clathrin, heavy polypeptide (Hc) | 2 | 2 | 2.7
55989 | Nop58 | NOP58 ribonucleoprotein | 4 | 6 | 10.3
13211 | Dhx9 | DEAH (Asp-Glu-Ala-His) box polypeptide 9 | 7 | 8 | 6.1
76936 | Hnrnpm | heterogeneous nuclear ribonucleoprotein M | 10 | 11 | 16
101706 | Numa1 | nuclear mitotic apparatus protein 1 | 3 | 3 | 2
245474 | Dkc1 | dyskeratosis congenita 1, dyskerin | 2 | 2 | 6.9
15388 | Hnrnpl | heterogeneous nuclear ribonucleoprotein L | 6 | 6 | 15.9
19045 | Ppp1ca | protein phosphatase 1, catalytic subunit, alpha isoform | 3 | 3 | 14.8


Figure 3-1 Proteins associating with FLAG-SUFU in C3H10T1/2 In orange, proteins known to interact with SUFU. In grey, novel interactors. In green, a novel interactor of SUFU and zinc finger protein, ZFP629.

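[Figure 3-2 schematic: ZFP629 from N- to C-terminus, with residue positions 1, 149, 535 and 867 marked; boxes represent C2H2 zinc fingers.]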

Figure 3-2 ZFP629 has 19 C2H2 zinc finger motifs

3.1.2 ZFP629 interaction with SUFU

To further define the determinants of this interaction, truncation constructs of ZFP629 tagged with a FLAG epitope were produced, as depicted in Figure 3-3 C. Reciprocal co-immunoprecipitation experiments with HA-Strep-SUFU indicate that the N-terminal section of ZFP629, the portion of the protein without any zinc fingers, is required for HA-Strep-SUFU binding. This is shown by the loss of interaction of SUFU with FLAG-ZFP629 143-867, which lacks the first 142 amino acids, and by the sufficiency of the isolated N-terminal portion of ZFP629 to bind SUFU (Figure 3-3).


[Figure 3-3 blot panels. A) Strep AP from cells expressing FLAG-GLI2, FLAG-ZFP629 or FLAG-MAZ together with Strep-HA-SUFU; WB: FLAG (m) and WB: Sufu (Rb). B) FLAG IP from the same transfections; WB: Sufu (Rb) and WB: FLAG (m). C) Schematic of FLAG-ZFP629 (full length), FLAG-ZFP629 1-154, FLAG-ZFP629 1-538 and FLAG-ZFP629 143-867; FLAG IP, Strep AP and lysates from cells expressing these constructs or FLAG-MAZ together with Strep-HA-SUFU; WB: HA (Sufu) and WB: FLAG.]


Figure 3-3 Interaction of ZFP629 and SUFU HEK293T cells were transfected with HA-Strep-SUFU and various FLAG-tagged zinc finger proteins. GLI2 was used as a positive control for the interaction with SUFU and MAZ was used as a negative control. A) Cell lysates were subjected to Streptavidin-affinity purification and probed for the presence of FLAG-tagged proteins. Only FLAG-GLI2 and FLAG-ZFP629 were detected in (HA-Strep)-SUFU purified proteins. B) Cell lysates were subjected to FLAG immunoprecipitation and probed for the presence of (HA-Strep)-SUFU. SUFU was detected only in FLAG-GLI2 and FLAG-ZFP629 purifications. C) Top: schematic of the different ZFP629 constructs. Bottom left: HEK293T cells were co-transfected with each of the FLAG-tagged ZFP629 constructs and HA-Strep-SUFU. Cell lysates were subjected to FLAG immunoprecipitation and probed for the presence of HA-Strep-SUFU. Only full length FLAG-ZFP629 and the truncations with the intact N-terminus retained the interaction with SUFU. Bottom right: Cell lysates were subjected to Streptavidin-affinity purification and probed for the presence of FLAG-ZFP629 proteins. Only full length FLAG-ZFP629 and FLAG-ZFP629 1-154 were copurified with HA-Strep-SUFU. Representative experiment, n=3.

A multiple sequence alignment of this N-terminal region with the sequences of Ci and the three Gli proteins did not reveal a motif similar to the SUFU-binding SYGH motif identified in Ci/Gli. An alignment with the full sequence of ZFP629 shows a possible homologous sequence within a zinc finger motif (Figure 3-4). A sequence alignment searching for the SUFU-interacting site found in the C-terminus of Ci/Gli (Han et al., 2015) also failed to reveal a similar region in ZFP629.

ZFP629   299 QNHNLLKHQKIHAGEKPYRCTECGKSFIQSSELTQHQRTHTGEKPYECLE-CGKSFGHSSTL 360
GLI1(m)  114 ------SRCTS-PGGSYGHLSIG 130
Ci       245 ------SRGSSAASGSYGHISAT 262
GLI2(m)  258 ------SRSSSAASGSYGHLSAG 275
GLI3(m)  323 ------SRSSSSASGSYGHLSAS 340
Conservation: . . . *:** *

Figure 3-4 ZFP629 alignment with Ci/GLI SUFU-interacting motif Alignment of SUFU interacting region, N-terminal, partial. Green: cysteines and histidines in zinc finger motifs; red: conserved amino acids of the SYGH motif; blue or (:): conservation of strongly similar amino acids; (.): conservation of weakly similar amino acids.


3.1.3 Zfp629 expression in the mouse cerebellum

Activating mutations, germline or somatic, affecting members of the Hedgehog pathway define a subgroup of medulloblastoma, the SHH subgroup (Taylor et al., 2012). A recent report combining seven independent studies and 550 medulloblastomas found that 28% of medulloblastomas belonged to that group (Kool et al., 2012). SmoM2 mice are a model for Hh-driven cancer; they carry the W535L Smo mutation found in BCC, which is constitutively active and cyclopamine-insensitive. They are also a model of Hh-driven medulloblastoma, with 40% of the animals developing the cerebellar tumour (Mao et al., 2006). As seen in Figure 3-5, in situ hybridization of cerebellum sections shows that Zfp629 is expressed in the cerebellum in a pattern similar to Gli2. In the SmoM2 mouse, medulloblastoma predictably developed and Zfp629 expression was dramatically increased within the tumour (right). These results are consistent with a potential role of ZFP629 in the regulation of cerebellar progenitor proliferation and in medulloblastoma development. Given that SUFU interacts with ZFP629, one possibility is that ZFP629 is a downstream mediator of Hh signalling during normal and pathological cerebellar progenitor proliferation.


[Figure 3-5 image panels. A) Zfp629 in situ hybridization: P7-P10, P21 and GFAPCre;SmoM2+/- cerebellum. B) Gli2 in situ hybridization: P7 and P21.]

Figure 3-5 Zfp629 expression in the mouse cerebellum A) In situ hybridization in P7-P10 and P21 wildtype cerebella and in SmoM2 cerebella shows Zfp629 expression in the EGL at the earlier stages and in the IGL at P21. Representative images, n=3. B) In situ hybridization at P7 and P21 in wildtype cerebella shows the Gli2 expression pattern (from the Brain Transcriptome Database, Riken-BSI, Japan).

Given the tissue distribution of Zfp629 in the cerebellum and its increased expression in SmoM2+/- tissues, we wanted to confirm these observations and determine whether or not Zfp629 expression was modulated by treatment with the Smoothened antagonist cyclopamine. Expression of Zfp629 in samples of medulloblastoma derived from Ptch1F/F;GFAP Cre/Cre and Ptch1+/- mice was tested by qPCR. Ptch1+/- mice may spontaneously develop medulloblastoma (14%; Goodrich et al., 1997). By 4 weeks, all Ptch1F/F;GFAP Cre/Cre mice developed cerebellar tumours (Yang et al., 2008). Our results indicate that the expression of Zfp629 in these tissues differs from the increased Zfp629 expression observed in the ISH experiment (Figure 3-6).

Experiments have previously shown that on treatment with cyclopamine, the proliferation of Ptch1+/-; p53+/- MB-derived cell lines is significantly inhibited and that Gli1 expression is significantly reduced (Ward et al., 2009). We wanted to analyze Zfp629 expression in cyclopamine-treated Ptch1+/-; p53+/- MB-derived cell lines to determine if it would be affected in a similar manner. qPCR analysis of Zfp629 mRNA levels showed no significant difference between non-treated and cyclopamine-treated samples (Figure 3-7). This result suggests that Zfp629 is not a direct Hh target gene in the context of medulloblastoma. We also looked at the levels of Zfp629 expression in NIH3T3 cells treated with purmorphamine, a Smoothened agonist. Zfp629 mRNA levels were not significantly different compared to untreated cells (Figure 3-8). We conclude that Zfp629 is not a target gene of the Hh pathway.

[Figure 3-6 bar graphs: relative mRNA levels of Gli1 (left) and Zfp629 (right) in WT P7, Ptch1F/F; GFAPCre/Cre P7 and Ptch1+/- samples.]

Figure 3-6 Zfp629 expression in wildtype and Ptch1+/- P7 cerebellum Zfp629 transcript levels are not significantly increased in the Ptch1 mutant cerebellum. Only one sample for each genotype was available and tested.


[Figure 3-7 bar graphs: relative mRNA levels of Gli1 (left) and Zfp629 (right) in cell lines #156 (Ptch1+/-; p53+/-) and #302 (Ptch1+/-; p53-/-), non-treated or treated with cyclopamine for 7 days.]

Figure 3-7 Zfp629 expression in two medulloblastoma cell lines Zfp629 transcript levels in two different medulloblastoma cell lines are not significantly decreased by treatment with the Smo antagonist cyclopamine (5 µM). Only one sample for each cell line was available and tested. Error bars: s.d. of technical replicates.

[Figure 3-8 bar graphs: relative mRNA expression of Zfp629 (N.S.) and Gli1 (*) in cells treated with DMSO or purmorphamine.]

Figure 3-8 Effect of purmorphamine on Zfp629 mRNA levels Zfp629 transcript levels in NIH3T3 cell lines are not significantly increased by treatment with the Smo agonist purmorphamine (2 µM). n=3, combined replicates. *p<0.05, ANOVA, N.S. not significant.


3.1.4 ZFP629 can repress gene expression

To assess the effect of ZFP629 on transcription activity, dual luciferase reporter assays using fusions of the Gal4 DNA-binding domain (Gal4-DBD) to ZFP629 constructs were performed in HEK293T cells. Transcription activity elicited by the Gal4-DBD was determined by expressing the DBD alone and was used as the reference. Gal4-ZFP629 caused a significant and dose-dependent reduction of luciferase activity, similar to the known repressor GLI3. Fusions of the ZFP629 truncation constructs indicate that the ability of ZFP629 to repress transcription is mediated by the core 14 zinc fingers, as Gal4-ZFP629 1-538 was still able to repress transcription induced by Gal4-DBD (Figure 3-9).

[Figure 3-9 bar graph: relative fold activation for Gal4-DBD, Gal4-GLI3R, Gal4-ZFP629, Gal4-ZFP629 1-154, Gal4-ZFP629 143-867 and Gal4-ZFP629 1-538, with a schematic of the Gal4-ZFP629 constructs.]

Figure 3-9 ZFP629 can repress transcription Gal4-Luciferase reporter assay performed in HEK293T cells. Data represent the average ± s.d. of 3 independent experiments, *p<0.05 ANOVA vs Gal4-DBD alone.


3.1.5 ZFP629 interacts with proteins involved in transcription repression

Recently, a new method called BioID was developed for the characterization of protein-protein interactions using proximity-based biotinylation (Roux et al., 2012). This method uses a modified biotin-conjugating enzyme, BirA*, that can activate biotin but has reduced affinity for the activated form. The freed activated biotin diffuses away and reacts with nearby amine groups. The biotinylated proteins are then purified using streptavidin beads, which allows the use of stringent lysis and wash conditions. Using this approach, BirA*-ZFP629 was overexpressed in HEK293 cells to identify ZFP629-interacting proteins that could impact its transcriptional regulation activity. Several ZFP629 interactors are involved in transcription repression through complexes such as NuRD (Nucleosome Remodelling Deacetylase) and SWI/SNF (SWItch/Sucrose Non-Fermentable nucleosome remodelling complex) (Table 3-2).

Table 3-2 BioID Assay with BirA*-ZFP629 in HEK293 Representative experiment, n=3. Italicized genes/proteins: known to associate with repressor complexes

Gene ID | Gene Symbol | Protein name | Unique Peptide | Total Peptide | % Coverage | Complex (Ref)
23361 | ZNF629 | zinc finger protein 629 | 19 | 59 | 24.5 |
23394 | ADNP | activity-dependent neuroprotector homeobox | 23 | 37 | 32.3 |
81611 | ANP32E | acidic (leucine-rich) nuclear phosphoprotein 32 family, member E | 2 | 2 | 9.7 |
1820 | ARID3A | AT rich interactive domain 3A (BRIGHT-like) | 4 | 4 | 8.3 | SWI/SNF (Ding et al., 2012)
10620 | ARID3B | AT rich interactive domain 3B (BRIGHT-like) | 6 | 10 | 20.2 | SWI/SNF (Ding et al., 2012)
400 | ARL1 | ADP-ribosylation factor-like 1 | 2 | 2 | 11.6 |
11177 | BAZ1A | bromodomain adjacent to zinc finger domain, 1A | 2 | 2 | 3.1 |
9031 | BAZ1B | bromodomain adjacent to zinc finger domain, 1B | 7 | 8 | 4.5 |
9790 | BMS1 | BMS1 homolog, ribosome assembly protein (yeast) pseudogene; | 3 | 3 | 3 |
10951 | CBX1 | chromobox homolog 1 (HP1 beta homolog Drosophila) | 2 | 2 | 28.6 | PRC1 (Ding et al., 2012)
904 | CCNT1 | cyclin T1 | 2 | 2 | 5 |
7203 | CCT3 | chaperonin containing TCP1, subunit 3 | 7 | 8 | 24.2 |
10575 | CCT4 | chaperonin containing TCP1, subunit 4 | 4 | 5 | 16.9 |
22948 | CCT5 | chaperonin containing TCP1, subunit 5 | 5 | 6 | 19 |
983 | CDK1 | cell division cycle 2, G1 to S and G2 to M | 4 | 4 | 18.9 |
1108 | CHD4 | chromodomain helicase DNA binding protein 4 | 20 | 41 | 17.2 | NuRD (Ding et al., 2012)
1662 | DDX10 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 10 | 8 | 11 | 14.1 |
57062 | DDX24 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 24 | 2 | 2 | 4 |
79009 | DDX50 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 50 | 5 | 5 | 9.8 |
51575 | ESF1 | similar to ABT1-associated protein; ESF1, | 3 | 3 | 3.9 |
5394 | EXOSC10 | exosome component 10 | 3 | 3 | 6.3 |
117246 | FTSJ3 | FtsJ homolog 3 (E. coli) | 6 | 8 | 13.7 |
2618 | GART | phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, phosphoribosylaminoimidazole synthetase | 4 | 5 | 5.3 |
29997 | GLTSCR2 | glioma tumour suppressor candidate region gene 2; | 2 | 2 | 4.6 |
26354 | GNL3 | guanine nucleotide binding protein-like 3 (nucleolar) | 2 | 2 | 7.7 |
2962 | GTF2F1 | general transcription factor IIF, polypeptide 1, 74kDa | 3 | 3 | 9.5 |
2969 | GTF2I | general transcription factor II, i; general transcription factor II, i, pseudogene | 23 | 33 | 27.1 |
2975 | GTF3C1 | general transcription factor IIIC, polypeptide 1, alpha 220kDa | 5 | 5 | 5.1 |
3015 | H2AFZ | H2A histone family, member Z | 2 | 3 | 32 |
8334 | HIST1H2AC | histone cluster 1, H2ac | 4 | 6 | 46.2 |
126961 | HIST2H3C | histone cluster 1, H3j; | 2 | 3 | 28.7 |
3329 | HSPD1 | heat shock 60kDa protein 1 | 2 | 2 | 6.5 |
10527 | IPO7 | importin 7 | 4 | 5 | 5.3 |
51520 | LARS | leucyl-tRNA synthetase | 2 | 2 | 3.5 |
4176 | MCM7 | minichromosome maintenance complex component 7 | 2 | 2 | 5.2 |
2956 | MSH6 | mutS homolog 6 (E. coli) | 2 | 3 | 3 |
10514 | MYBBP1A | MYB binding protein (P160) 1a | 2 | 4 | 2.2 |
55226 | NAT10 | N-acetyltransferase 10 (GCN5-related) | 3 | 3 | 6 |
64318 | NOC3L | nucleolar complex associated 3 homolog (S. cerevisiae) | 2 | 2 | 2.2 |
9221 | NOLC1 | nucleolar and coiled-body phosphoprotein 1 | 3 | 3 | 7.3 |
10360 | NPM3 | nucleophosmin/nucleoplasmin, 3 | 2 | 2 | 20.2 |
5093 | PCBP1 | poly(rC) binding protein 1 | 3 | 3 | 15.4 |
26227 | PHGDH | phosphoglycerate dehydrogenase | 2 | 2 | 4.9 |
56342 | PPAN | peter pan homolog (Drosophila) | 2 | 2 | 11.2 |
5496 | PPM1G | protein phosphatase 1G (formerly 2C), magnesium-dependent, gamma isoform | 2 | 2 | 4.9 |
5499 | PPP1CA | protein phosphatase 1, catalytic subunit, alpha isoform | 3 | 3 | 15.5 | PRC1 (Ding et al., 2012)
7001 | PRDX2 | peroxiredoxin 2 | 2 | 2 | 11.6 |
26121 | PRPF31 | PRP31 pre-mRNA processing factor 31 homolog (S. cerevisiae) | 2 | 2 | 4.6 |
10213 | PSMD14 | proteasome (prosome, macropain) 26S subunit, non-ATPase, 14 | 2 | 2 | 18.1 |
10197 | PSME3 | proteasome (prosome, macropain) activator subunit 3 (PA28 gamma; Ki) | 2 | 2 | 13.8 |
4736 | RPL10A | ribosomal protein L10a | 2 | 3 | 13.4 |
6136 | RPL12 | ribosomal protein L12 | 2 | 2 | 14.5 |
23521 | RPL13A | ribosomal protein L13a | 2 | 2 | 9.9 |
6138 | RPL15 | ribosomal protein L15 | 3 | 6 | 17.6 |
6139 | RPL17 | ribosomal protein L17 | 2 | 3 | 10.9 |
6142 | RPL18A | ribosomal protein L18a | 3 | 3 | 23.9 |
6159 | RPL29 | ribosomal protein L29 | 2 | 3 | 14.5 |
6125 | RPL5 | ribosomal protein L5 | 3 | 4 | 14.1 |
6185 | RPN2 | ribophorin II | 2 | 2 | 4.3 |
23076 | RRP1B | ribosomal RNA processing 1 homolog B | 7 | 7 | 14.4 |
26156 | RSL1D1 | ribosomal L1 domain containing 1 | 2 | 2 | 6.3 |
29115 | SAP30BP | SAP30 binding protein | 2 | 2 | 7.5 | SIN3 (Kuzmichev et al., 2002)
26168 | SENP3 | SUMO1/sentrin/SMT3 specific peptidase 3 | 2 | 2 | 5.9 |
871 | SERPINH1 | serpin peptidase inhibitor, clade H (heat shock protein 47), member 1, (collagen binding protein 1) | 2 | 2 | 7.9 |
94081 | SFXN1 | sideroflexin 1 | 2 | 2 | 14.6 |
81855 | SFXN3 | sideroflexin 3 | 2 | 2 | 6.8 |
23309 | SIN3B | SIN3 homolog B, transcription regulator (yeast) | 5 | 5 | 6.2 | SUFU/SAP18 (Cheng and Bishop, 2002)
6597 | SMARCA4 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 | 2 | 2 | 2.1 | SWI/SNF (Ding et al., 2012)
8467 | SMARCA5 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 5 | 15 | 15 | 16.3 | SWI/SNF (Ding et al., 2012)
8243 | SMC1A | structural maintenance of chromosomes 1A | 2 | 2 | 2.4 |
10147 | SUGP2 | splicing factor, arginine/serine-rich 14 | 3 | 3 | 6.3 |
6829 | SUPT5H | SPT5 homolog, DSIF elongation factor subunit | 2 | 2 | 2.9 |
6838 | SURF6 | surfeit 6 | 2 | 2 | 13.3 |
6924 | TCEB3 | transcription elongation factor B (SIII), polypeptide 3 (110kDa, elongin A) | 7 | 7 | 15.8 |
6949 | TCOF1 | Treacher Collins-Franceschetti syndrome 1 | 15 | 20 | 17.5 |
54881 | TEX10 | testis expressed 10 | 2 | 3 | 3.4 |
92609 | TIMM50 | translocase of inner mitochondrial membrane 50 homolog (S. cerevisiae) | 5 | 5 | 12.3 |
347733 | TUBB2B | tubulin, beta 2B | 3 | 41 | 64 |
7385 | UQCRC2 | ubiquinol-cytochrome c reductase core protein II | 2 | 2 | 7.3 |
57050 | UTP3 | UTP3, small subunit (SSU) processome component, homolog (S. cerevisiae) | 3 | 4 | 10.9 |
10009 | ZBTB33 | zinc finger and BTB domain containing 33 | 12 | 27 | 28 |
7750 | ZMYM2 | zinc finger, MYM-type 2 | 8 | 9 | 13.3 | LSD1/CoREST (Gocke and Yu, 2008)
9203 | ZMYM3 | zinc finger, MYM-type 3 | 6 | 8 | 8.1 | LSD1/CoREST (Gocke and Yu, 2008)
7695 | ZNF136 | zinc finger protein 136 | 2 | 2 | 4.4 |
55609 | ZNF280C | zinc finger protein 280C | 2 | 2 | 3.8 |
23528 | ZNF281 | zinc finger protein 281 | 3 | 3 | 6.8 |
84450 | ZNF512 | zinc finger protein 512 | 2 | 2 | 6.3 |
57473 | ZNF512B | zinc finger protein 512B | 3 | 4 | 5.3 |

3.1.6 ZFP629 DNA-binding consensus sequence

C2H2 zinc finger proteins achieve DNA binding and specificity through the coordination of consecutive zinc finger motifs joined by linkers. These linkers are highly conserved, with TGEKP as the consensus sequence. ZFP629 has 7 TGEKP linkers and 2 TGERP linkers in the 14 zinc finger domain (Table 3-3). The presence of these conserved linkers is a strong indication that ZFP629 is able to interact directly with DNA. To identify the ZFP629 consensus sequence, the cycling amplification and selection of targets (CAST) methodology was used with lysates of HEK293T cells expressing SBP (streptavidin binding peptide)-HA-CBP (calmodulin binding peptide)-tagged ZFP629; SBP-HA-CBP-GLI2 and SBP-HA-CBP only lysates served as positive and negative controls, respectively (Figure 3-10). Aliquots of each lysate were incubated with a fraction of the random oligos. The transcription factor-DNA complexes were then affinity-purified, followed by elution of the isolated DNA and amplification by PCR. An aliquot of the amplified fragments was then used for another round of enrichment as described. The amplified fragments were also cloned and sequenced. Sequences were then compiled, and the sequences of the negative control were removed from the set of sequences obtained for either ZFP629 or GLI2. The remaining sequences were analyzed with MEME to obtain position-weight matrices. The enriched motifs obtained after 5 rounds of selection are presented in Figure 3-11. The motif obtained for GLI2 corresponds to the known GLI binding sequence, validating our approach (Kinzler and Vogelstein, 1990).

Table 3-3 ZFP629 linkers

Linker position | Linker sequence
172-176 | TGERP
200-204 | TGEKP
228-232 | TGEKP
256-260 | TGEKP
284-288 | TGEKP
312-316 | AGEKP
340-344 | TGEKP
368-372 | LREDP
396-400 | TGERP
424-428 | RGERP
452-456 | TGEKP
480-484 | TGEKP
508-512 | MDENL
Consensus linker | TGEK/RP


[Figure 3-10 schematics. A) CAST oligo: random region (N...)30 flanked by Primer F and Primer R. B) CAST experiment: incubation of lysate and oligomers; binding of Strep beads and precipitation of bound material; washes; elution; PCR amplification; cloning + sequencing. SBP: streptavidin binding peptide tag; TF: transcription factor; Strep: streptavidin beads.]

Figure 3-10 CAST methodology A) Representation of the oligo used for the experiment. N represents any base, at random. Random oligos flanked by specific sequences to allow for amplification and cloning were synthesized. B) Representation of the methodology. Lysates are prepared and oligomers are added to the lysates to allow the transcription factors to bind. Complexes are isolated by affinity purification and washed. The bound oligos are eluted and a small fraction is used for cloning or amplified for another round of selection. TF: transcription factor of interest.


[Figure 3-11 motif logos. GLI2: cycle 5, 9 sequences. ZFP629: cycle 5, 23 sequences. Gli motif: GACCACCCA (reverse: ACCCACCAG; complement: CTGGTGGGT; reverse complement: TGGGTGGTC).]

Figure 3-11 Identification of ZFP629 consensus sequence Top: obtained motif for GLI2, bottom: obtained motif for ZFP629 after 5 rounds of selection of one experiment. n=3
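Because a site recovered by CAST may be read from either DNA strand, a motif and its reverse complement describe the same binding site. The sketch below, with hypothetical function names, computes the reverse complement of a DNA sequence and checks a cloned sequence for a motif in either orientation, using the Gli motif of Figure 3-11 as the example. It handles only unambiguous A/C/G/T bases and is not the MEME analysis used in this work.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence made of A, C, G and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def contains_motif(motif: str, seq: str) -> bool:
    """True if the motif occurs in the sequence on either strand."""
    seq = seq.upper()
    motif = motif.upper()
    return motif in seq or reverse_complement(motif) in seq


gli_motif = "GACCACCCA"
print(reverse_complement(gli_motif))  # TGGGTGGTC
```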

3.1.7 Effect of knockdown of Zfp629 on gene expression in NIH3T3

To identify genes repressed by ZFP629, a genome-wide expression profiling experiment was performed in the Hh-responsive cell line NIH3T3, comparing cells expressing control and Zfp629-targeting shRNA (Figure 3-12). An unpaired, uncorrected t-test at p<0.01 yielded 530 significantly different probes. Of these, 67 probes were significantly different based on a moderated t-test with a q-value cutoff of 0.2, using the Benjamini-Hochberg FDR for multiple testing correction (Table 3-4). Analysis of enriched functional annotations for these 67 probes showed enrichment for chemokines in the ZFP629 knockdown cells, suggesting that ZFP629 may participate in the regulation of these pathways.
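The Benjamini-Hochberg adjustment mentioned above converts a list of p-values into q-values that can be compared against an FDR cutoff such as 0.2. The following is a minimal sketch in plain Python covering only the multiple-testing step, not the moderated t-test itself; the function name is an assumption and this is not the software used for the microarray analysis.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    n = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic q-values
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        q_values[idx] = running_min
    return q_values
```

Probes whose q-value falls at or below the chosen cutoff would then be retained, for example `[q <= 0.2 for q in benjamini_hochberg(p_values)]`.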

To further identify target genes, promoters of the genes associated with the 67 probes were searched for the presence of the ZFP629 DNA-binding sequence obtained with the CAST assay described above. qPCR analysis of mRNA levels for 20 genes confirmed the microarray results (Table 3-5).

A second round of validation was performed with a second independent shRNA to confirm that the changes observed were due to knocking down Zfp629. Ccl5 and Cxcl1 both had their expression reduced with either Zfp629 shRNA (Figure 3-13). These results indicate that ZFP629 is required for the expression of both these genes. Overexpression of Zfp629 however was not sufficient to increase their gene expression (Figure 3-14).

To determine if these genes are Hh target genes, NIH3T3 were treated with the Smo agonist purmorphamine and the expression of the candidate genes was assessed by qPCR. Purmorphamine had a very modest effect on the expression of these genes in comparison to the magnitude of the activation of the pathway target genes Ptch1 and Gli1 (Figure 3-15).

The expression of these Hh target genes increases when Sufu expression is lost, and expression levels can be rescued by Sufu ectopic expression. To address the role of Sufu on the expression of these chemokines, their mRNA levels in Sufu-/- MEFs were compared to Sufu-/- MEFs overexpressing Venus-Sufu. Ectopic expression of Sufu reduced the transcript levels of Gli1 and Ccl5 without significantly affecting Cxcl1 levels.

[Figure 3-12 western blot: ZFP629 and β-tubulin in samples 1-4 of scrambled, sh #2 and sh #3 cells.]

Figure 3-12 Zfp629-targeting shRNA validation for microarray Validation of knockdown in lysates of samples submitted for microarray analysis, β-tubulin: loading control.


Table 3-4 Microarray results

Symbol | Gene ID | Definition | Fold Change (sh/scr), log2
Mcfd2 | 193813 | multiple coagulation factor deficiency 2 | 2.14
Prl2c2 | 18811 | prolactin family 2, subfamily c, member 2 | 1.79
Gch1 | 14528 | GTP cyclohydrolase 1 | 1.72
H19 | 14955 | H19 fetal liver mRNA (H19), non-coding RNA. | 1.71
Dusp4 | 319520 | dual specificity phosphatase 4 | 1.65
Prl2c2 | 18811 | prolactin family 2, subfamily c, member 2 | 1.60
Stk4 | 58231 | serine/threonine kinase 4 | 1.59
Ebp | 13595 | phenylalkylamine Ca2+ antagonist (emopamil) binding protein | 1.56
Tmed10 | 68581 | transmembrane emp24-like trafficking protein 10 (yeast) | 1.55
Sphk1 | 20698 | sphingosine kinase 1 transcript variant 1 | 1.54
Il1rl1 | 17082 | interleukin 1 receptor-like 1 transcript variant 2 | 1.51
Tmed10 | 68581 | transmembrane emp24-like trafficking protein 10 (yeast) | 1.50
Fam129b | 227737 | family with sequence similarity 129, member B | 1.49
Map2k4 | 26398 | mitogen-activated protein kinase kinase 4 | 1.48
Xdh | 22436 | xanthine dehydrogenase | 1.47
Spon2 | 100689 | spondin 2, extracellular matrix protein | 1.44
Rap2a | 76108 | RAS related protein 2a | 1.44
Prkx | 19108 | protein kinase, X-linked | 1.43
Leprotl1 | 68192 | leptin receptor overlapping transcript-like 1 | 1.42
Slc16a13 | 69309 | solute carrier family 16 (monocarboxylic acid transporters), member 13 | 1.42
Ctsk | 13038 | cathepsin K | 1.41
Cx3cl1 | 20312 | chemokine (C-X3-C motif) ligand 1 | 1.41
Dkk3 | 50781 | dickkopf homolog 3 (Xenopus laevis) | 1.39
Rce1 | 19671 | RCE1 homolog, prenyl protein peptidase (S. cerevisiae) | 1.39
Megf10 | 70417 | multiple EGF-like-domains 10 | 1.38
LOC269515 | | withdrawn | 1.38
Nat11 | 70999 | PREDICTED: N-acetyltransferase 11, transcript variant 9 | 1.38
Mmd | 67468 | monocyte to macrophage differentiation-associated | 1.36
Prcp | 72461 | prolylcarboxypeptidase (angiotensinase C) | 1.36
Spryd4 | 66701 | SPRY domain containing 4 | 1.32
Pdlim1 | 54132 | PDZ and LIM domain 1 (elfin) | 1.31
Trak2 | 70827 | trafficking protein, kinesin binding 2 | 1.31
Copa | 12847 | coatomer protein complex subunit alpha | 1.31
Snn | 20621 | stannin | 1.30
6330578E17Rik | 76178 | RIKEN cDNA 6330578E17 gene | 1.30
Ivns1abp | 117198 | influenza virus NS1A binding protein transcript variant 2 | 1.28
BC003266 | 80284 | cDNA sequence BC003266 | 1.28
Arhgap29 | 214137 | Rho GTPase activating protein 29 | -1.26
Mapk6 | 50772 | mitogen-activated protein kinase 6 transcript variant 2 | -1.28
9030418K01Rik | 71532 | | -1.29
Xpa | 22590 | xeroderma pigmentosum, complementation group A | -1.31
Serpine2 | 20720 | serine (or cysteine) peptidase inhibitor, clade E, member 2 | -1.31
Fam13c | 71721 | family with sequence similarity 13, member C | -1.31
Ptp4a1 | 19243 | protein tyrosine phosphatase 4a1 | -1.33
Calml4 | 75600 | calmodulin-like 4 | -1.35
Mir16 | 56209 | membrane interacting protein of RGS16 | -1.36
Zfp36l1 | 12192 | zinc finger protein 36, C3H type-like 1 | -1.37
Ebf3 | 13593 | early B-cell factor 3 | -1.38
Zfp36l1 | 12192 | zinc finger protein 36, C3H type-like 1 | -1.38
Btg1 | 12226 | B-cell translocation gene 1, anti-proliferative | -1.39
Rangrf | 57785 | RAN guanine nucleotide release factor | -1.41
Yeats4 | 64050 | YEATS domain containing 4 | -1.41
Hist1h1c | 50708 | histone cluster 1 | -1.43
Ppap2a | 19012 | phosphatidic acid phosphatase 2a transcript variant 1 | -1.43
Mgmt | 17314 | O-6-methylguanine-DNA methyltransferase | -1.44
Abhd8 | 64296 | abhydrolase domain containing 8 | -1.46
Ccl5 | 20304 | chemokine (C-C motif) ligand 5 | -1.48
Ccl9 | 20308 | chemokine (C-C motif) ligand 9 | -1.54
Dguok | 27369 | deoxyguanosine kinase, nuclear gene encoding mitochondrial protein | -1.55
Rasl11b | 68939 | RAS-like, family 11, member B | -1.62
Rasl11b | 68939 | RAS-like, family 11, member B | -1.64
Ogn | 18295 | Osteoglycin | -1.65
Hist2h2aa2 | 319192 | histone cluster 2, H2aa2 | -1.65
LOC100043822 | 100043822 | PREDICTED: hypothetical protein LOC100043822 | -1.70
Kpna1 | 16646 | karyopherin (importin) alpha 1 | -1.76
LOC100044702 | | withdrawn | -1.79
Cxcl1 | 14825 | chemokine (C-X-C motif) ligand 1 | -1.81


Table 3-5 Candidate genes with expression confirmed by qPCR and ZFP629 motif in promoters

Gene | Fold Change qPCR, log2 | ZFP629 Binding site | p-value
Arhgap29 | -1.3 | GTGGGGAAAGTGAGTT | 4.30E-05
Btg1 | -1.4 | |
Ccl5 | -3.2 | AGGAGAAAGTGAGGC | 2.30E-05
Ccl9 | -1.7 | ATGGGAAAGTTGACTT | 3.20E-05
Ctsk | 1.5 | ATGTGTAAAAAGAGGA | 6.60E-05
Cx3cl1 | 1.3 | AGCTGTAAAATGAGGC | 7.40E-06
Cxcl1 | -1.8 | ATTGTAAAACTGAGTC | 3.50E-06
Dkk3 | 1.3 | ATGAAAAGAGTGAGGT | 1.20E-05
Dusp4 | 1.8 | ATGGGAACAGTGGGTG | 9.40E-05
Il1rl1 | 1.5 | ATGGGAAAATAGAGGA | 1.60E-06
Kpna1 | -1.9 | ATGAGTAGACCGAGGT | 8.40E-06
Leprotl1 | 1.2 | |
Map2k4 | 1.4 | ATTTGGAATTGGGGT | 7.10E-05
Mcfd2 | 2.0 | ATGAGAAGACTGAGGC | 5.10E-07
Ogn | -1.8 | TTCTGTAAAGTGAGTT | 7.10E-05
Ppap2a | -1.6 | ATGTGTAAAAAGAGGA | 6.60E-06
Rap2a | 1.1 | GTGTGACAGTTGAGGC | 6.40E-05
Sphk1 | 1.5 | GAGGAGAAACTGAGGC | 7.10E-05
Spon2 | 1.5 | ATTGTTACACTGTGGC | 4.80E-05
Stk4 | 1.5 | ATCTATAAAGTGGGGA | 7.40E-06

[Figure 3-13 image: bar graphs of relative expression (mRNA) of Ccl5, Cxcl1 and Zfp629 in scr, sh2 and sh3 cells]

Figure 3-13 Ccl5 and Cxcl1 are both repressed when Zfp629 is knocked down. NIH3T3 cells were transduced with control or Zfp629-targeting shRNA. Data represent the average ± s.d. of 3 independent experiments. * p<0.05, t-test.


[Figure 3-14 image: bar graphs of relative expression (mRNA) of Ccl5, Cxcl1 and Zfp629 in ctrl and FLAG-ZFP629 cells]

Figure 3-14 Zfp629 overexpression has no effect on Ccl5 or Cxcl1 NIH3T3 cells were transduced with control or FLAG-ZFP629 virus and mRNA levels were assessed by qPCR. Data represent the average ± s.d. of 3 independent experiments. * p<0.05, t-test. N.S.: not significant.

[Figure 3-15 image: bar graphs of relative expression (mRNA) of Ccl5, Cxcl1, Ptch1 and Gli1 in DMSO- and purmorphamine-treated cells]

Figure 3-15 Purmorphamine has only a modest effect on Ccl5 and Cxcl1 NIH3T3 cells were treated with Smo agonist purmorphamine for 24h and mRNA levels were assessed by qPCR. Data represent the average ± s.d. of 3 independent experiments. * p<0.05, t-test.

[Figure 3-16 image: bar graphs of relative expression (mRNA) of Ccl5, Cxcl1 and Gli1 in Sufu-/- and Sufu-/- Venus-Sufu cells]

Figure 3-16 Expression of Ccl5 and Cxcl1 in Sufu-/- MEFs and Sufu-/- Venus-Sufu cells Data represent the average ± s.d. of 3 independent experiments. * p<0.05, t-test. N.S.: not significant.

3.1.8 Chromatin Immunoprecipitation of ZFP629-3xFLAG

In order to identify genes directly regulated by ZFP629, chromatin immunoprecipitation followed by sequencing (ChIP-seq) was performed in HEK293T cells overexpressing ZFP629-3xFLAG. Aligned sequences were analyzed using MACS (Zhang et al., 2008), resulting in 9781 peaks (False Discovery Rate: 1%). With almost 10,000 peaks, data mining was simplified with several assumptions. Only coding genes in RefSeq were used to define promoters, removing genes associated with non-coding RNA (NR_). The canonical transcripts list from UCSC Genome Bioinformatics was used to further focus the search (a canonical transcript is the longest transcript of a given gene). Promoters were defined as the sequence from 3kb upstream to 1kb downstream of the transcription start site (TSS). Most peaks were found within 10kb of a TSS, with 25.4% of the peaks within the first 3kb upstream of a TSS. Most of the peaks were proximal to genes, with only 17.5% of the peaks found in intergenic regions (Figure 3-17).
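The promoter definition above (3kb upstream to 1kb downstream of the TSS) can be illustrated with a minimal Python sketch. This is not the pipeline used in this work: the `Transcript` record, the 0-based coordinates, the use of peak summits and the example gene names are all assumptions made for illustration. The only values taken from the text are the two window sizes.

```python
from dataclasses import dataclass

UPSTREAM = 3000    # bp upstream of the TSS (from the text)
DOWNSTREAM = 1000  # bp downstream of the TSS (from the text)


@dataclass(frozen=True)
class Transcript:
    """Hypothetical canonical-transcript record (illustrative input format)."""
    gene: str
    chrom: str
    tss: int      # 0-based TSS coordinate
    strand: str   # "+" or "-"


def promoter_window(t: Transcript) -> tuple[int, int]:
    """Return the half-open [start, end) promoter interval of a transcript."""
    if t.strand == "+":
        start, end = t.tss - UPSTREAM, t.tss + DOWNSTREAM
    else:
        # On the minus strand, "upstream" lies at higher coordinates.
        start, end = t.tss - DOWNSTREAM, t.tss + UPSTREAM
    return max(start, 0), end


def peaks_in_promoters(summits, transcripts):
    """Map each gene to the peak summits located inside its promoter.

    summits: iterable of (chrom, position) tuples.
    """
    hits = {}
    for t in transcripts:
        start, end = promoter_window(t)
        inside = [pos for chrom, pos in summits
                  if chrom == t.chrom and start <= pos < end]
        if inside:
            hits[t.gene] = inside
    return hits


# Made-up example data, for illustration only.
transcripts = [
    Transcript("GENE_A", "chr1", 10_000, "+"),
    Transcript("GENE_B", "chr1", 50_000, "-"),
]
summits = [("chr1", 8_500), ("chr1", 52_500), ("chr1", 30_000), ("chr2", 10_100)]
hits = peaks_in_promoters(summits, transcripts)
```

The strand check matters: for a minus-strand gene the 3kb upstream region sits at higher genomic coordinates than the TSS, so a symmetric treatment of both strands would misassign peaks.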


[Figure 3-17 image: A) distribution of the genome and of ChIP peaks across categories (promoter, downstream, UTR, coding exon, intron, distal intergenic); B) proportion of genes with a Zfp629 ChIP peak at a given distance (density) plotted against distance from TSS (Kb)]

Figure 3-17 ChIP peaks distribution A) Distribution of genome (left) and ChIP-seq peaks (right) in each category. B) Distribution of Zfp629 ChIP-seq peaks with respect to the transcription start site.


De novo motif discovery was performed on these 9781 regions using the MEME-ChIP suite (Ma et al., 2014). Individual peaks were first defined by PeakSplitter and selected for an arbitrary minimum height of 100 units, and 500bp sequences centered at the peak summits were extracted, for a total of 6253 sequences. The top enriched motif was also enriched at the summit of the peaks and bore a significant similarity to the motif found with our CAST assay (Figure 3-18), validating the motif found in that assay. In an attempt to identify potential target genes, the peaks associated with this motif were intersected with the list of promoters obtained above: 611 peaks contained the motif in total, of which 183 peaks lay within 205 promoters. However, functional enrichment analysis using g:Profiler did not reveal significantly associated pathways or functions.
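The selection of summit-centered sequences described above can be sketched as follows. This is a hedged illustration only: the `(chrom, summit, height)` tuple format is an assumption and does not reproduce the actual PeakSplitter output, and the example values are made up. The height threshold (100) and window size (500bp) are the values stated in the text.

```python
MIN_HEIGHT = 100  # minimum peak height (from the text)
WINDOW = 500      # bp, centered on the summit (from the text)


def summit_windows(subpeaks, min_height=MIN_HEIGHT, window=WINDOW):
    """Return (chrom, start, end) windows centered on sufficiently tall summits.

    subpeaks: iterable of (chrom, summit_position, height) tuples
              (hypothetical format, for illustration).
    """
    half = window // 2
    windows = []
    for chrom, summit, height in subpeaks:
        if height < min_height:
            continue  # discard low sub-peaks
        start = max(summit - half, 0)  # clamp at the chromosome start
        windows.append((chrom, start, start + window))
    return windows


# Made-up example sub-peaks.
example = [("chr1", 1000, 150), ("chr1", 5000, 80), ("chr2", 100, 300)]
selected = summit_windows(example)
```

The resulting intervals would then be converted to sequences and passed to a motif discovery tool; that step is not shown here.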

[Figure 3-18 image: sequence logos (bits) of the motif from ChIP-seq (E: 3.4e-227) and of the motif from the CAST assay; similarity to CAST motif, E: 1.3e-12]

Figure 3-18 Best motif from MEME-ChIP analysis Best motif obtained with MEME-ChIP (default settings except for: minimum motif width of 4bp and site distribution: any number of repetitions). Similarity to motif obtained with the CAST assay was determined with STAMP (default settings; Mahony et al., 2007).

Publicly available and curated datasets were mined for ChIP data for this cell line. Very few datasets were available for HEK293 cells, with only H3K4me3 and PolII data being relevant for this project. H3K4me3 is a histone mark found at active or poised promoters. RNA polymerase II is also found at those promoters. Combining these datasets with the canonical transcript list and an HEK293 expression dataset, a list of promoters associated with expressed transcripts was derived. This list of 3843 promoters and genes contained candidate promoters and genes that could be bound and repressed by the overexpressed ZFP629 (Figure 3-19).

Figure 3-19 Representation of ZFP629 ChIP-seq data relative to promoters +ve: positive

De novo motif discovery of the peak sequences within those promoters was performed with the MEME-ChIP suite. Among the different motifs identified, two carried some similarity to the ZFP629 motif identified by CAST (Figure 3-20). We then performed a functional enrichment analysis of the genes closest to the sequences containing those motifs using g:Profiler, in an attempt to find whether any of these genes had a strong connection to the Hedgehog pathway, but it proved unsuccessful. Instead, because Hh signalling plays a significant role in cerebellar development, we used GREAT (Genomic Regions Enrichment of Annotations Tool; McLean et al., 2010) to query Mouse Genome Informatics (MGI) Expression for positive cerebellar expression and MGI Phenotype for mutations affecting mouse cerebellar development. These data were used to refine the list of candidate peaks and associated genes to 66 genes (Figure 3-21). In an initial validation step, we then selected some of these genes to confirm the ChIP-seq data by ChIP-qPCR and also determined whether their expression would be derepressed by the knockdown of Zfp629, with significant positive results for Foxp2 (discussed in the next section).

[Figure 3-20 image: MEME motif logos (left) and CentriMo plots of probability against position of best site in sequence, -250 to 250 (right). Motif 1 (MEME 2): E-value 1.1e-102, CentriMo p=5.3e-22. Motif 2 (MEME 5): E-value 3.0e-61, CentriMo p=7.9e-7]

Figure 3-20 de novo motif discovery On the left: MEME motifs of interest with TGAGG segments also found in ZFP629 motif found by CAST. On the right: Centrimo results for these motifs showing that motif 1 (top) is mostly found at the center of the positive sequences and that motif 2 (bottom) is proximal to the center of positive sequences.


MGI Expression: MGI Expression: MGI Expression: MGI Expression: MGI Expression: MGI Expression: Detected Detected Detected Detected Detected Detected ID: 6443 ID: 12476 ID: 12702 ID: 7012 ID: 14698 14699 TS23_rest of TS28_cerebellar TS28_cerebellum TS22_cerebellum TS23_cerebellum TS28_cerebellum cerebellum cortex granule cell layer APP AGTPBP1 CADPS2 AGTPBP1 AGTPBP1 AGTPBP1 ASS1 AKAP12 CXCR4 AKAP12 CADPS2 CADPS2 CTNNB1 ATG5 EN2 ALS2 DISC1 DISC1 GAS1 CADPS2 EXT1 ATG5 DNER DNER GLRB CCND1 FOXP2 CADPS2 EN1 EN1 JUN CDKN2C MIB1 CCND1 EN2 MECP2 MECP2 CRK PSMG1 CDKN2C EPM2A NFIA NFIA CTNNB1 SEMA6A CRK FOXP2 NFIB NFIB CXCR4 SERPINE2 CTNNB1 MECP2 SLC1A3 OCLN DCLK1 SLC1A3 CXCR4 NFIA SUZ12 PAFAH1B1 EN1 UNC5C DAG1 NFIB TPP1 PSAP EN2 ZIC3 DCLK1 SLC1A3 TRIM2 PSMG1 EPB41L2 DISC1 SUZ12 SLC1A3 EXT1 DNAJC3 TPP1 UBE4B FGFR1 DNER TRIM2 VIM FMR1 EN1 TRIO WLS FOXP2 EN2 GAS1 EPB41L2 GLI3 EPM2A Mouse Phenotype ITGB1 FAS MP:0000849 MAP1B FGFR1 abnormal cerebellum MECP2 FMR1 MERTK FOXP2 Mouse Phenotype MET GAS1 MP:0004097 MIB1 GLI3 abnormal cerebellar cortex MTHFR ITGB1 MYO5A JUN Mouse Phenotype NUMB MAP1B MP:0009956 PRNP MDM2 abnormal cerebellar layer PAFAH1B1 MECP2 PSMG1 MERTK Mouse Phenotype PTEN MTHFR MP:0000875 QKI MYO5A abnormal cerebellar Purkinje SEMA6A NFIA SERPINE2 NFIB Mouse Phenotype SHH NUMB MP:0000877 SLC1A3 OCLN abnormal Purkinje cell TPP1 PAFAH1B1 TRIM2 POU3F2 UNC5C PRNP ZIC3 PSAP PSMG1 PTCH1 PTEN SHH SLC1A3 SQSTM1 SUZ12 TPP1 TRIM2 TRIO UNC5C XRCC1 ZIC3 Figure 3-21 Results of GREAT query of MGI Expression and Phenotype for the cerebellum

3.1.9 Foxp2 expression is regulated by ZFP629

Foxp2 is a forkhead box transcription factor required for development and associated with neuronal processes important for speech and language (Lai et al., 2001; 2003). It is expressed in several tissues and different areas of the brain, including the cerebellum. Foxp2+/- and Foxp2-/- mice fail to gain weight normally. Foxp2-/- mice die 3 weeks after birth and display significant motor abnormalities. Cerebellar histology shows that at P17, the stage by which the EGL has normally resorbed, the Foxp2-/- EGL is still thick. This was due to a defect in the Purkinje cells, which failed to align in a continuous row and to develop sufficient dendritic arbors. Foxp2 deletion also disrupted the formation and organization of Bergmann glial fibers, thereby preventing the proper migration of the granule cells from the EGL to the IGL (Shu et al., 2005).

ZFP629-3xFLAG bound the FOXP2 promoter in the ChIP-seq experiment. This result was confirmed by further ChIP-qPCR experiments in HEK293T cells (Figure 3-22 A). Overexpression of ZFP629 in HEK293T cells led to a decrease in FOXP2 mRNA levels (Figure 3-22 B). Further supporting the hypothesis that ZFP629 represses the expression of Foxp2, knockdown of Zfp629 in Hh-responsive mouse cell lines led to an increase in Foxp2 expression (Figure 3-22 C).

Comparing the expression of Foxp2 in Sufu-/- MEFs and Sufu-/- MEFs overexpressing Venus-Sufu showed a significant reduction of Foxp2 expression when Sufu expression is restored. Gli1 mRNA levels were also reduced, consistent with pathway activity being reduced by the overexpression of Sufu (Figure 3-22 D). Intriguingly, Foxp2 has not yet been identified as a Hh/Gli target.


[Figure 3-22 image: A) ChIP-qPCR (% input) at a control region and at the Foxp2 promoter, empty vector vs Zfp629-3xFLAG; B) relative mRNA levels of Foxp2 and Zfp629, ctrl vs Zfp629-3xFLAG; C) relative mRNA levels of Foxp2 and Zfp629 in scr, sh2 and sh3 cells; D) relative mRNA levels of Gli1 and Foxp2 in Sufu-/- and Sufu-/-; Venus-Sufu cells]

Figure 3-22 Foxp2 is a target gene of ZFP629 and of Hedgehog A) ChIP validation confirming the presence of ZFP629-3xFLAG at this locus in HEK293T cells. Foxp2 mRNA levels in MEFs B) overexpressing ZFP629, C) expressing Zfp629-targeting shRNA, or in D) Sufu rescue: Sufu-/- MEFs compared to Sufu-/- MEFs expressing Venus-Sufu. Data represent the average ± s.d. of 3 independent experiments. *: compared to control condition, p<0.05, t-test.


3.2 Results chapter 2: Identification of novel Smoothened ligands using structure-based docking

Inbar Fish: performed docking and library analysis (B. Shoichet, UCSF, San Francisco, CA, USA)
Hayarpi Torosyan: performed the DLS analysis (B. Shoichet, UCSF, San Francisco, CA, USA)
Pranavan Paranthaman: helped with some of the RNA extractions (S. Angers, University of Toronto, Toronto, ON, Canada)
Celine Lacroix: performed all luciferase assays, qPCR experiments and binding assays

3.2.1 Targeting the ligand binding site within the heptahelical domain of Smoothened

The naturally occurring teratogen cyclopamine antagonizes Smo by binding in a long, narrow cavity in the heptahelical site of the protein. This cavity broadly overlaps with that of orthosteric sites of family A GPCRs, and can accommodate at least two pharmacologically separate sites for antagonists: one at the top of the transmembrane domain and involving the extracellular loops, such as for LY2940680, and one deeper in the bundle, such as for SANT-1 (Wang et al., 2014). When we began this study, the only available structure was the complex with LY2940680 (PDB ID 4JKV; Wang et al., 2013); subsequently, four other ligand structures have been published (Wang et al., 2013; 2014; Weierstall et al., 2014). We targeted the upper 7TM site of 4JKV for docking, including aspects of the second, deeper site.

3.2.2 Control docking screens for enrichment of ligand vs decoys.

As a positive control, we docked a library of 308 known Smo ligands, drawn from the ChEMBL 12 library (Gaulton et al., 2012), combined with 21,250 property-matched decoy molecules, which had the same physical properties as the ligand set but were topologically unrelated to these 308 ligands (Mysinger et al., 2012a). We looked for sampling and scoring parameters that enriched the ligands over the decoys among the top-ranked molecules from this screen, using an adjusted Log(AUC) (Mysinger and Shoichet, 2010). We found that increasing the magnitude of the local partial atomic charges of Asn219, Asp384, and Arg400, at their terminal atoms, without changing the overall charge of the residues, improved ligand enrichment; this is a technique that has been used previously to up-weight the electrostatic component of the docking score relative to non-polar terms (Carlsson et al., 2010), hoping to improve specific recognition. The resulting adjusted Log(AUC) was 16.6%. To put this in perspective, among the top 500 docked compounds from the close to 22,000 docked, 116 were known ligands. We suspect the enrichment would have been higher still, but many of the ligands were too large to fit the particular conformation of the site represented by 4JKV.
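To put the recovery of 116 known ligands among the top 500 docked compounds in perspective, a simple enrichment factor can be computed from the numbers given above. The short Python sketch below is an illustration only: the enrichment factor is a different, simpler metric than the adjusted Log(AUC) reported in the text, and the function name is an assumption.

```python
def enrichment_factor(hits_in_top, top_n, total_ligands, total_compounds):
    """Ratio of the ligand rate in the top-ranked subset to the ligand rate
    expected from random selection across the whole docked set."""
    top_rate = hits_in_top / top_n
    base_rate = total_ligands / total_compounds
    return top_rate / base_rate


# Values from the text: 308 known ligands, 21,250 decoys, 116 ligands in the top 500.
n_ligands = 308
n_decoys = 21_250
ef_top500 = enrichment_factor(116, 500, n_ligands, n_ligands + n_decoys)
fraction_recovered = 116 / n_ligands
```

With these numbers, known ligands are roughly 16-fold more frequent among the top 500 compounds than in the full set, and a little over a third of all known ligands are recovered in that subset.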

3.2.3 Prospective full library docking screen – selection of 21 compounds

We used DOCK3.6 to screen the lead-like subset of ZINC compounds (Irwin and Shoichet, 2005; Sterling and Irwin, 2015), containing 3.2 million commercially available compounds, with molecular weight < 350 amu, xlogP < 3.5, and < 7 rotatable bonds. Each library molecule was screened in an average of 213.3 orientations in the site, and in each orientation an average of 745.4 conformations was sampled. Overall, over 1.4 trillion molecular complexes were evaluated. Configurations were ranked according to their electrostatic (using a point charge model of the Poisson-Boltzmann equation, as implemented in QNIFFT (Gallagher and Sharp, 1998; Sharp, 1995), a version of DelPhi) (Shoichet and Kuntz, 1993) and van der Waals complementarity (using the AMBER potential (Meng et al., 1992)) to Smo, corrected for ligand desolvation (using GB/SA electrostatics as implemented in AMSOL (Chambers et al., 1996; Jiabo Li et al., 1998)), and the top scoring configuration of each molecule was retained. The screen took 183 core hours on our lab cluster.
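The lead-like property cut-offs quoted above can be expressed as a simple filter. The following Python sketch is only illustrative: the `Molecule` record and the example entries are hypothetical, and in practice the lead-like subset is provided pre-computed by ZINC rather than filtered in this way. Only the three thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical record of pre-computed properties (illustrative only)."""
    name: str
    mol_weight: float      # amu
    xlogp: float
    rotatable_bonds: int


def is_lead_like(m: Molecule) -> bool:
    """Apply the cut-offs stated in the text: MW < 350, xlogP < 3.5, < 7 rotatable bonds."""
    return m.mol_weight < 350 and m.xlogp < 3.5 and m.rotatable_bonds < 7


# Made-up example molecules.
library = [
    Molecule("example_1", 312.4, 2.1, 4),   # passes all three cut-offs
    Molecule("example_2", 421.5, 2.9, 5),   # too heavy
    Molecule("example_3", 298.3, 4.2, 3),   # too lipophilic
    Molecule("example_4", 340.0, 1.0, 7),   # too flexible
]
lead_like = [m.name for m in library if is_lead_like(m)]
```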


The result of the calculation was a list of library molecules ranked from most to least complementary to the targeted Smo ligand-binding pocket. As the differences in docking scores among the top-ranked molecules were substantially smaller than the expected errors of the calculation, we winnowed to a final candidate list for testing by visual inspection, as is commonly done in both high-throughput and virtual screening (Mysinger et al., 2012b). We inspected the top 0.2% of the docking-ranked library, seeking compounds predicted to form hydrogen bonds with at least two of the residues known to be important for the binding of known antagonists (Asn219, Asp473, Arg400, Lys394, Glu518 and Asp384). To bias toward novel scaffolds, we selected not only compounds that overlapped with the LY2940680 binding site, but also some that bound higher in the site and only partially overlapped with this ligand in the structure. We deprioritized molecules that were conformationally strained, something not always well captured by the docking scoring function, and selected for molecules with diverse chemotypes. Ultimately, 21 compounds were selected for experimental testing (Table 3-6). All showed specific and satisfactory electrostatic interactions and reasonable poses, and represented different chemotypes compared to known ligands and typically to each other.
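The first two triage criteria above (top 0.2% of the ranked list, hydrogen bonds to at least two key residues) lend themselves to a small sketch. The selection in this work was made by visual inspection, so the Python code below is not the procedure that was used; the input format, the function name and the example poses are assumptions. The residue list and the 0.2% and two-residue thresholds come from the text.

```python
# Residues listed in the text as important for binding of known antagonists.
KEY_RESIDUES = frozenset({"Asn219", "Asp473", "Arg400", "Lys394", "Glu518", "Asp384"})


def shortlist(ranked_poses, top_fraction=0.002, min_contacts=2):
    """Keep top-ranked molecules predicted to hydrogen bond with enough key residues.

    ranked_poses: list of (molecule_id, set_of_hbonded_residues), best score first
                  (hypothetical format, for illustration).
    """
    n_top = max(1, round(len(ranked_poses) * top_fraction))
    return [mol_id for mol_id, residues in ranked_poses[:n_top]
            if len(KEY_RESIDUES & set(residues)) >= min_contacts]


# Made-up ranked list of 1000 poses; only the first two fall in the top 0.2%.
poses = [("mol_1", {"Asn219", "Arg400"}),
         ("mol_2", {"Asp384", "Tyr394"}),
         ("mol_3", {"Asn219", "Asp473", "Glu518"})]
poses += [(f"mol_{i}", set()) for i in range(4, 1001)]
candidates = shortlist(poses)

# For a library of 3.2 million molecules, the top 0.2% corresponds to 6400 molecules.
n_inspected = round(3_200_000 * 0.002)
```

The remaining criteria (partial overlap with the LY2940680 site, conformational strain, chemotype diversity) depend on expert judgment and are not captured by this sketch.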

Antagonist candidates were tested using Ptch1-/- reporter MEFs. In Ptch1-/- cells, the deletion of Patched, the Hedgehog ligand receptor and functional inhibitor of Smo, renders the downstream signalling pathway constitutively active. The reporter cells were engineered to express the 8XGli-Luciferase reporter, which faithfully monitors levels of Gli-mediated transcriptional activity as a readout of Hedgehog signalling. Thus, Firefly Luciferase is constitutively expressed in these cells and its expression is inhibited by Smo antagonists such as cyclopamine (Chen, 2002). In the initial test for activity, we screened the 21 docking hits at a dose of 30 µM. Of these, four molecules, 3, 6, 44 and 244 (numbers indicate rank in the docking screen), exhibited greater than 50% inhibition of the reporter (Figure 3-23 A and Table 3-7). These compounds repressed the reporter in a dose-dependent manner, with compounds 44 and 244 displaying IC50 values of 34.4 µM and 5.3 µM, respectively (Figure 3-23 B). Their ability to repress the pathway was further confirmed by quantifying transcript levels of Gli1, a target gene of the Hedgehog pathway, using qPCR (Figure 3-23 C and Table 3-6).
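For readers less familiar with dose-response analysis, the relationship between concentration, IC50 and normalized reporter activity can be sketched with a Hill-type inhibition model. The Python code below is an illustration under stated assumptions and is not the fitting procedure used for the values reported here: the Hill slope of 1, the normalization to a top of 1 and a bottom of 0, the log-linear interpolation and the synthetic data are all assumptions.

```python
import math


def hill_response(conc, ic50, hill_slope=1.0, top=1.0, bottom=0.0):
    """Normalized reporter activity at a given antagonist concentration."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill_slope)


def interpolate_ic50(concs, responses, midpoint=0.5):
    """Rough IC50 estimate by log-linear interpolation between the two
    doses that bracket the half-maximal response."""
    points = sorted(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= midpoint >= r_hi and r_lo != r_hi:
            frac = (r_lo - midpoint) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None  # the midpoint was not bracketed by the tested doses


# Synthetic data generated from the model itself (not experimental data).
true_ic50 = 5e-6  # 5 µM, expressed in M
concs = [1e-7, 1e-6, 1e-5, 1e-4, 1e-3]
responses = [hill_response(c, true_ic50) for c in concs]
estimate = interpolate_ic50(concs, responses)
```

With doses spaced a full log unit apart, the interpolated estimate is only approximate, which is one reason IC50 values are usually obtained by non-linear regression over the whole curve and reported with a range.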

Table 3-6 Results of first screen

Columns: Rank; Compound; Inhibition of Gli-Luciferase; IC50 µM (range, n); Inhibition - mGli1 qPCR; IC50 µM (range, n); Binding; IC50 µM (range, n)

3  C55270268  yes  yes  yes  43.0 (28.3-63.0, 4)  94.8 (72.0-124.9, 3)  166 (130-212, 2)
6  C48326431  yes  yes  20.8 (12.7-34.2, 3)  109 (50.3-237.8, 3)
15  C25216461  no
22  C67090834  no
42  C66303859  no
44  C72143438  yes  yes  yes  34.4 (7.6-155, 3)  14.0 (11.4-17.3, 2)  15.6 (8.2-3.0, 3)
45  C08112633  no
62  C71853530  no
77  C69810396  no
79  C67472686  no
96  C44950101  no
109  C69457212  no
120  C31712840  no
164  C58316861  no
197  C66148997  no
230  C79031196  no
244  C72431875  yes  yes  yes  5.3 (3.0-9.5, 3)  11.3 (2.2-58.1, 1)  58.3 (26.2-130, 1)
265  C82230248  no
278  C12802040  no
377  C11697764  no
427  C77929373  no


[Figure 3-23 image: A) relative luciferase activity for DMSO, Cyc (1 µM) and compounds 3, 6, 15, 22, 42, 44, 45, 62, 77, 79, 96, 109, 120, 164, 197, 230, 244, 265, 278, 377 and 427; B) relative luciferase activity against Log [Drug] (M) for compounds 3, 6, 44 and 244; C) relative mGli1 expression against Log [Drug] (M) for compound 44]

Figure 3-23 Identification of four novel Smoothened antagonists A) First screen of 21 compounds at 30 µM using the 8xGli-Luciferase reporter in Ptch1-/- MEFs. Four compounds showed significant (>50%) inhibition of the reporter compared to DMSO, n=3, p<0.05. Cyc: Cyclopamine (1 µM). B) Dose-response analysis of initial hits using 8xGli-Luciferase reporter in Ptch1-/- MEFs. C) Dose-response analysis of compound 44 by qPCR of Gli1 expression in Ptch1-/- MEFs. Data represent the average ± s.d. of 3 independent experiments.


Table 3-7 Hits from initial screen

Rank in initial screen | ZINC ID | IC50 µM (range, n) | Tc * | ZINC ID related known structure
3 | C55270268 | 43.0 (28.3-63.0, 4) | 0.33 | C84758591
6 | C48326431 | 20.8 (12.7-34.2, 3) | 0.26 | C40976467
44 | C72143438 | 34.4 (7.6-155, 3) | 0.28 | C13471218
244 | C72431875 | 5.3 (3.0-9.5, 3) | 0.27 | C40403377

[Structure drawings of each hit and of its related known structure not reproduced]

* Tc: Tanimoto coefficient


3.2.4 Secondary screen identifies analogs.

In an effort to improve affinity, we searched for commercially available analogs of the first four hits. Any compound in the ZINC database (Irwin et al., 2012) with an ECFP4-based Tanimoto coefficient of at least 0.7 to any of the four hits was considered (representing high topological similarity (Muchmore et al., 2008)). Many such compounds were available for compounds 44 and 244, and we selected 231 that either fit within the similarity cut-off for compound 244 or bore the chemical scaffold common to both 44 and 244 (Figure 3-24). Only one analog was available for compound 3, and none were available for compound 6. Since most were larger than the initial lead-like molecules docked, they had not been sampled in the original docking screen. Thus, the entire set of analogs was docked against the Smo structure. Many scored well, and 190 would have ranked among the top 0.5% of compounds from the original screen. Of these, 46 were purchased and tested (compounds 1b-46b). Thirty of these antagonized the reporter at a single dose (Figure 3-25A, Table 3-8), and several had IC50 values in the low micromolar range, including compounds 13b, 25b, 32b and 45b at 10.9 µM, 2.3 µM, 9.4 µM and 3.1 µM respectively, as determined using the Gli-Luciferase assay and/or by measurement of Gli1 levels using qPCR (Figure 3-25B-C, Table 3-9).
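The Tanimoto coefficient used for the analog search is the number of fingerprint features shared by two molecules divided by the number of features present in either. A minimal Python sketch follows; representing a fingerprint as a set of "on" bit indices is an assumption for illustration, and the bit sets shown are made up rather than real ECFP4 fingerprints, which would be generated by a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


SIMILARITY_CUTOFF = 0.7  # threshold quoted in the text

# Made-up fingerprints (not real ECFP4 bits).
hit = {1, 2, 3, 4, 5, 6, 7, 8}
close_analog = {1, 2, 3, 4, 5, 6, 7, 9}   # 7 shared bits out of 9 in the union
distant_compound = {1, 2, 10, 11, 12}     # 2 shared bits out of 11 in the union

is_analog = tanimoto(hit, close_analog) >= SIMILARITY_CUTOFF
is_distant = tanimoto(hit, distant_compound) < SIMILARITY_CUTOFF
```

The same measure underlies the Tc values reported in Tables 3-7 and 3-9, where low values (0.22 to 0.38) indicate that the new antagonists are topologically dissimilar from the nearest known ligands.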


Figure 3-24 Smo antagonist structures A) 2D structures of the four novel antagonists from the initial screen (left to right: compounds 3, 6, 44 and 244). B) 2D structures of known antagonists (left to right: vismodegib, SANT-1, cyclopamine, and taladegib (LY2940680)).


[Figure 3-25 image: A) relative luciferase activity for DMSO, vismo, compound 44 and compounds 1b-46b; B) relative luciferase activity against Log [Drug] (M) for compound 45b; C) relative mGli1 expression against Log [Drug] (M) for compound 45b]

Figure 3-25 Screening of analogs and dose-response analysis of best hit A) Second screen at 30 µM using the 8xGli-Luciferase reporter in Ptch1-/- MEFs. vismo: vismodegib at 100 nM. Data represent the average ± s.d. of 3 independent experiments. Bold IDs: compound inhibited reporter significantly, p<0.05 ANOVA. B) Dose-response analysis of compound 45b using the 8xGli-Luciferase reporter in Ptch1-/- MEFs. Data represent the average ± s.d. of 3 independent experiments. C) Dose-response analysis of compound 45b by qPCR of Gli1 expression in Ptch1-/- MEFs. Data represent the average ± s.d. of 3 independent experiments.


Table 3-8 Results of analog screen

Columns: ID; Rank; Compound ID; Inhibition Gli-Luciferase; IC50 µM (range, n); Inhibition mGli1 qPCR; IC50 µM (range, n); Binding; IC50 µM (range, n)

1b  61  C72129543  aggregator*
2b  43  C72146912  no
3b  48  C72158134  yes  yes
4b  28  C72167017  yes  yes
5b  53  C72168261  yes  yes
6b  41  C72172828  no
7b  52  C72419993  yes  yes
8b  4  C72426960  yes
9b  178  C72481543  yes
10b  33  C72431634  no
11b  11  C72473625  yes
12b  68  C72429353  yes  yes
13b  N/A  C55271488  yes  10.9 (8.2-14.5, 3)
14b  2  C72129787  yes
15b  18  C72149026  no  yes
16b  12  C72149186  yes
17b  20  C72149758  yes
18b  229  C72150023  no  no
19b  5  C72150480  yes  22.4 (5.7-87.2, 3)
20b  9  C72153124  aggregator*
21b  6  C72158119  yes  bell-shape (3)
22b  13  C72163209  yes
23b  36  C72163710  no  yes
24b  15  C72168023  yes  bell-shape (3)
25b  34  C72168579  yes  yes  2.3 (1.5-3.6, 3)  7.8 (4.0-15.1, 3)
26b  3  C72408285  yes
27b  1  C72420973  aggregator*
28b  228  C72435718  no  yes
29b  107  C72447326  yes
30b  10  C72448241  yes
31b  227  C72457741  yes  no
32b  224  C72477710  yes  yes  yes  9.4 (5.8-15.0, 3)  12.7 (3.8-42.3, 3)
33b  19  C72476169  yes
34b  35  C72480679  yes
35b  159  C72428267  no
36b  96  C72433192  no
37b  54  C72146027  yes  yes  5.4 (3.8-7.6, 3)  12.1 (9.1-16.2, 2)
38b  231  C72170378  no
39b  204  C72152697  no
40b  114  C72167102  aggregator*
41b  126  C72479818  no
42b  186  C72162059  no
43b  31  C72447879  yes  yes  13.3 (5.7-31.2, 3)
44b  185  C72163442  yes
45b  76  C72475536  yes  yes  yes  3.1 (1.9-3.4, 3)  5.0 (3.7-6.7, 2)  12.7 (9.4-17.3, 3)
46b  230  C72447458  yes  yes  15.8 (9.0-27.8, 3)
vismodegib (nM)  yes  yes  yes  30.6 (12.3-76.3, 3)  1.8 (1.4-2.2, 3)  21.0 (13.3-33.3)

* aggregators: results for aggregation counter-assay in Table 3-10.


Table 3-9 Antagonists discovered by secondary analog screen

ID | Rank in analogs screen | ZINC ID | IC50 µM (range, n) Gli-Luciferase | Tc * | ZINC ID related known structure
13b | NA | C55271488 | 10.9 (8.2-14.5, 3) | 0.38 | C84758591
19b | 5 | C72150480 | 22.4 (5.7-87.2, 3) | 0.28 | C40898421
25b | 34 | C72168579 | 2.3 (1.5-3.6, 3) | 0.28 | C40403377
32b | 224 | C72477710 | 9.4 (5.8-15.0, 3) | 0.25 | C40898663
37b | 54 | C72146027 | 5.4 (3.8-7.6, 3) | 0.22 | C95576171
45b | 76 | C72475536 | 3.1 (1.9-5.0, 3) | 0.23 | C40898663

[Structure drawings of each antagonist and of its related known structure not reproduced]

* Tc: Tanimoto coefficient

All of the antagonists with low micromolar IC50 values were counter-screened for colloidal aggregation, a common mechanism of artifactual activity in early ligand discovery (McGovern et al., 2002; Sassano et al., 2013). Dynamic Light Scattering (DLS), centrifugation of putative colloidal aggregates in media, and counter-screening assays against unrelated enzymes were used to confirm that compounds 3, 6, 44, 244, 25b, 32b, 37b and 45b are well-behaved antagonists (Figure 3-26). For the centrifugation, the compounds were diluted to the final concentration in media. This media was then centrifuged and the supernatant was added to the cells 24h prior to the assay. In the case of an aggregating compound, the centrifugation would remove the aggregates from the media and the pathway would not be inhibited at that concentration. We also used two enzymatic assays with enzymes sensitive to colloidal aggregates: AmpC β-lactamase (AmpC) and malate dehydrogenase (MDH) (Babaoglu et al., 2008; McGovern et al., 2002; Seidler et al., 2003). Four compounds were found to be aggregators in one or more assays (Table 3-10). Intriguingly, the same behaviour was observed for the anti-fungal drug itraconazole, which has been promoted into Phase II clinical trials after it was discovered to act as a Smo antagonist in a drug repurposing screen (Kim et al., 2014a; 2013; 2010). Itraconazole was previously shown to be a potent aggregator, active against several GPCRs in the 200 nM to 2 µM range via this artifactual mechanism (Sassano et al., 2013). Consistent with this behaviour, we found that itraconazole formed colloidal particles with a radius of 180 nm, with a critical aggregation concentration just below 1 µM (Figure 3-27), and that its observed antagonism of Smoothened could be disrupted by prior centrifugation, a harbinger of this mechanism (Figure 3-28). These observations highlight the importance of counter-screening for this artifactual mechanism of action when evaluating new Smoothened antagonists.

[Figure 3-26 image: relative luciferase activity against Log ([Drug] (M)) for compounds A) 3, B) 6, C) 44, D) 13b, E) 25b, F) 37b and G) 45b]

Figure 3-26 Effect of centrifugation of ligand on Smo antagonist activity in Gli-Luciferase reporter assay. Gli-luciferase reporter activity in Ptch1-/- MEFs. Centrifugation had no significant effect on the activity of these novel antagonists. Data represent the average ± s.d. of 3 independent experiments.


Table 3-10 Compounds found to be aggregators

Compound     Compound ID   DLS, conc. tested (µM)   AmpC, % enzyme activity**   MDH, % enzyme activity**
                           KPi      DMEM            KPi      DMEM               KPi      DMEM
C72129543    1b            30       NA              11       NA                 11       NA
C72153124    20b           0.8*     5*              82       86                 10       17
C72420973    27b           0.8*     5*              15       2.5                2        4.5
C72167102    40b           NA 5 60 89

*Compounds were not tested at lower concentrations.
**Compounds were tested at 100 µM.
NA: not available, not tested.
AmpC: AmpC β-lactamase
MDH: malate dehydrogenase
KPi: potassium phosphate buffer, 50 mM, pH 7.0.


[Figure 3-27, panels A-C. A: Light scattering intensity (%) versus Log[Radius] (nm). B: DLS autocorrelation coefficient versus time (µs) for buffer, vismodegib, itraconazole, and itraconazole + spindown. C: Normalized scattering intensity (cnt/s) versus Log [Itraconazole] (µM), with colloid and non-colloid regimes marked.]

Figure 3-27 Particle formation by itraconazole measured by dynamic light scattering (DLS). A) 1 µM itraconazole forms strongly scattering particles dominated by those at 180 nm radius by DLS. B) The strong DLS decay curve of 1 µM itraconazole (red) is eliminated by centrifugation in a benchtop microfuge. Vismodegib (green) does not form particles by DLS at 1 µM. C) Itraconazole particles transit through a critical aggregation concentration (CAC) of 0.9 ± 0.2 µM, moving from a soluble to a particulate form over a small concentration interval. n=3, combined replicates.


[Figure 3-28, panels A-F. x-axis: Log ([Drug] (M)); y-axes: Relative Luciferase Activity (A, D), Relative Gli1 Expression (B, E), Relative Fluorescence Intensity (C, F).]

Figure 3-28 Itraconazole inhibits Smo via an aggregation-based mechanism. Gli-luciferase reporter activity in Ptch1-/- MEFs (left), qPCR of Gli1 transcript in Ptch1-/- MEFs (middle), and direct displacement of BODIPY-cyclopamine (right). A-C) Effect of centrifugation on vismodegib: vismodegib antagonism of Smo is unaffected by a 20 min centrifugation of the antagonist (red) compared to control (black). A,B) combined replicates, n=3, error bars: s.d. C) n=3 representative curve shown. D-F) Effect of centrifugation on itraconazole: itraconazole activity is largely or entirely eliminated by a 20 min centrifugation of the antagonist (red) compared to control (black). Error bars: s.d., combined replicates n=3.

Displacement of a BODIPY derivative of the canonical Smoothened ligand cyclopamine has previously been used to determine the binding affinity of Smo modulators (Chen et al., 2002). Using a stable inducible HEK293 cell line expressing Smo-mCherry, we tested by flow cytometry whether well-behaved, non-aggregating antagonists can specifically displace BODIPY-cyclopamine bound to Smoothened. Compounds 44 and 45b had IC50 values of 15.6 µM and 12.7 µM, respectively, in this ligand-displacement assay (Figure 3-29 A and B, Table 3-6 and Table 3-8), suggesting that the binding site for these compounds overlaps with the one occupied by BODIPY-cyclopamine. To further validate the specificity of this new Smo antagonist chemotype and rule out off-target activity, we investigated the activity of compound 45b, the most potent antagonist discovered here, against Frizzled receptors. Vertebrate genomes encode ten Frizzled proteins, which function as receptors for Wnt growth factors, and together with Smo they constitute the class F family of GPCRs. We used HEK293T TopFlash cells, which express a luciferase reporter under the control of the Wnt/Frizzled transcription factors Lef/Tcf. Wnt3a-conditioned media was used to activate the pathway, and the potential activity of compound 45b was measured after 24 hours of co-treatment (Figure 3-29 C). Compound 45b had no detectable activity in this assay, suggesting that it does not interact with Frizzled receptors. We conclude that compound 45b and the other analogs represent a new chemotype for Smo antagonists.
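To make the meaning of the displacement IC50 values concrete, the following Python sketch evaluates a simple one-site inhibition (Hill) curve at a few antagonist concentrations. It assumes a Hill slope of 1 and complete displacement at saturating antagonist; the curve model actually fitted to the flow cytometry data is not stated here, so the model, the function name and the chosen test concentrations are illustrative assumptions. Only the two IC50 values are taken from the text.

```python
def remaining_signal(conc_um, ic50_um, hill_slope=1.0):
    """Fraction of normalized BODIPY-cyclopamine signal remaining.

    Assumes a one-site inhibition model in which the signal falls from
    1 (no antagonist) to 0 (saturating antagonist). Concentrations in µM.
    """
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return 1.0 / (1.0 + (conc_um / ic50_um) ** hill_slope)


# IC50 values reported for the ligand-displacement assay (µM).
IC50_UM = {"44": 15.6, "45b": 12.7}

# Illustrative antagonist concentrations (µM), chosen for this sketch.
for compound, ic50 in IC50_UM.items():
    for conc in (0.3, 3.0, 30.0):
        frac = remaining_signal(conc, ic50)
        print(f"compound {compound}, {conc:>4} µM: {frac:.2f} of signal remains")
```

By construction, half of the signal remains when the antagonist concentration equals the IC50, which is the sense in which the two values above are compared.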

3.2.5 The new antagonists exhibit efficacy at the chemoresistant Smo-D473H mutant.

Compounds 3, 6, 44, 244, 25b, 32b, 37b and 45b were all docked within the heptahelical bundle of Smo, where other Smo ligands like taladegib (LY2940680) also bind (Figure 3-30). However, these new antagonists are broadly unrelated to previously known Smo antagonists (Figure 3-24 B), and none has an ECFP4-based Tanimoto coefficient (Tc) (Hert et al., 2004; Rogers and Tanimoto, 1960) greater than 0.38 when compared to any Smo antagonist in the ChEMBL19 database (Bento et al., 2014; Gaulton et al., 2012) (Table 3-7 and Table 3-9). This is particularly true of compounds like 45b, which has a Tc of only 0.23 to the nearest known Smo antagonist, indicating that these molecules are unrelated (Table 3-9). Consistent with the new scaffold it represents, 45b in its docked pose makes interactions different from those of taladegib reported in the crystal structure, such as hydrogen bonds with Glu518, Asp384 and Tyr394, and stacking with Tyr394 (Figure 3-30 E). Encouraged by the unique docked pose of 45b, which does not interact with Asp473, we tested compound 45b against the D473H mutant of Smo, which is the first reported Smo mutation conferring resistance to vismodegib in the clinic. This mutation reduces vismodegib binding to Smo by 100-fold, whereas binding of compound 45b is only decreased 2.7-fold relative to wildtype (1.1 µM to 3.1 µM, Figure 3-30 G). Docking of compound 45b to a model of Smo-D473H, after a minimization with Amber, suggested a docking pose with hydrogen bonds with Glu518 and Asp384 (Figure 3-30 H), while His473 moves slightly and does not interfere with the binding of 45b. Although the resilience of 45b to this mutant was not a feature that was selected for at the time of docking, and is in this sense fortuitous, it highlights the uniqueness of its chemotype. Such novelty was revealed in the docking results, and in general the ability to discover novel scaffolds and chemotypes is an advantage one can reasonably hope for in a docking screen.
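The Tanimoto coefficient used above as the novelty measure is the number of fingerprint bits shared by two molecules divided by the number of bits set in either. The Python sketch below shows the calculation on sets of "on" bit indices. The bit sets are invented toy values, not real ECFP4 fingerprints, and the function names are hypothetical; in practice the fingerprints would be generated by a cheminformatics toolkit.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = bits_a | bits_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / len(union)


def nearest_known_tc(query_bits, known_fingerprints):
    """Highest Tc between a query and any fingerprint in a reference set."""
    return max(tanimoto(query_bits, known) for known in known_fingerprints)


# Toy fingerprints (made-up bit indices, for illustration only).
query = {3, 17, 42, 58, 91, 120}
known_antagonists = [
    {3, 17, 200, 305, 411, 512, 640},
    {42, 77, 88, 99, 101},
    {5, 6, 7, 8},
]

print(f"Nearest-neighbour Tc: {nearest_known_tc(query, known_antagonists):.2f}")
```

A low nearest-neighbour value, such as the 0.23 reported for 45b, means that even the most similar known antagonist shares few substructural features with the query.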


[Figure 3-29, panels A-C. A, B: Normalized BODIPY signal versus Log [Drug] (M) for compound 44 (A) and compound 45b (B), with images of the BODIPY-cyclopamine and mSmo-mCherry channels for vehicle, 5 µM cyclopamine, and 30, 3 and 0.3 µM compound. C: Fold activation (over control media) for control media, Wnt3a, Wnt3a + DMSO, Wnt3a + 45b and Wnt3a + vismodegib.]

Figure 3-29 New SMO antagonists compete for BODIPY-cyclopamine binding. A and B) Dose-response analysis of BODIPY-cyclopamine displacement in HEK293 cells overexpressing Smo-mCherry for compound 44 (A) and compound 45b (B). Left: average cell fluorescence was measured by flow cytometry and plotted against antagonist concentration. Representative curves are shown. Right: representative images of BODIPY-cyclopamine displacement by antagonists in HEK293 cells overexpressing Smo-mCherry. Cells were incubated with 5 nM BODIPY-cyclopamine and compounds 44 or 45b for 2 hours. C) Specificity of 45b towards Smo is demonstrated by the lack of inhibition of the TOPFLASH Wnt-β-catenin reporter by compound 45b (10 µM) and vismodegib (100 nM). n=3, combined experiments, error bars: s.d.


[Figure 3-30, panels A-H. B-F, H: docked poses with labelled residues Asn219, Asp384, Tyr394, Lys395, Asp473 (His473 in H) and Glu518. G: Relative mGli1 Expression versus Log [Drug] (M) for vismodegib and 45b against WT Smo and D477H Smo.]

Figure 3-30 Binding poses and inhibition of vismodegib-resistant Smo by 45b. A) Side view of the complex of Smoothened structure (represented as purple surface, TM6 was cut) with LY2940680 in the orthosteric site (represented as light pink sticks). B-F) Predicted binding modes of compounds 3, 6, 44, 244 and 45b against Smo wt, respectively. LY2940680 is represented as light pink wires, the compounds as green sticks, hydrogen bonds as black dashed lines and important residues as sticks. G) Compound 45b inhibits Smo wildtype and mouse D477H mutant (equivalent to human D473H) in a dose-dependent manner. n=3, combined experiments, error bars: s.d. H) Predicted binding modes of compound 45b against a model of D473H-Smo.

3.2.6 Library Bias

We wished to investigate how biased our docking library was toward Smo ligands in comparison to well-studied GPCRs, such as the β2-adrenergic, serotonin 2A and dopamine D2 receptors. We therefore extracted from ChEMBL19 the number of known ligands for each of these receptors and the purchasable fraction (Bento et al., 2014; Gaulton et al., 2012). We used these numbers to calculate the percentage of purchasable known ligands. Finally, we queried the library for purchasable analogs of known ligands for each receptor (Table 3-11). The smaller number of known ligands for Smo compared to these receptors reflects the more modest drug discovery effort directed at Smo.

Table 3-11 Library Bias

Receptor (Max IC50)   Number of known ligands in CHEMBL19   Number of purchasable known ligands   Purchasable ligands/all ligands   Purchasable analogs of the known ligands*

SMO (1 µM)     363    25    6.88%    2835

ADRB2 (1 µM)   1432   206   14.38%   21581

HTR2A (1 µM)   3570   397   11.12%   25390

DRD2 (1 µM)    4939   469   9.50%    26137

*ECFP4 cutoff: 0.8, CHEMBL 19 (Bento et al., 2014; Gaulton et al., 2012)
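The "purchasable ligands/all ligands" column of Table 3-11 is the ratio of the two count columns. A minimal Python sketch of that arithmetic follows, using the counts from the table; the dictionary layout and function name are hypothetical. The recomputed values agree with the tabulated percentages to within rounding of the last digit.

```python
# (known ligands in ChEMBL19, purchasable known ligands), from Table 3-11.
LIBRARY_COUNTS = {
    "SMO": (363, 25),
    "ADRB2": (1432, 206),
    "HTR2A": (3570, 397),
    "DRD2": (4939, 469),
}


def purchasable_percent(known, purchasable):
    """Percentage of known ligands that are commercially available."""
    if known <= 0:
        raise ValueError("number of known ligands must be positive")
    return 100.0 * purchasable / known


for receptor, (known, purchasable) in LIBRARY_COUNTS.items():
    pct = purchasable_percent(known, purchasable)
    print(f"{receptor}: {purchasable} of {known} known ligands purchasable ({pct:.2f}%)")
```

On this measure Smo has both the fewest known ligands and the smallest purchasable fraction of the four receptors compared.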

4 Discussion and Conclusions

4.1 ZFP629 Study

Genetic studies in animal models have established the central role of Gli proteins as the main transcriptional effectors of the Hh signalling pathway. Genetic and cell biology studies have subsequently demonstrated a key role of Sufu in regulating their protein levels and transcriptional activity. Chromatin immunoprecipitation of epitope-tagged Gli1 and Gli3 in neural and limb cells, respectively, revealed a vast array of Gli-enriched sites throughout the genome. Combining these studies revealed two surprising findings: 1) not all enriched sites display the canonical GBS, and 2) genes specifically expressed in a given tissue were also bound by Gli in other tissues (Lee et al., 2010; Peterson et al., 2012; Vokes et al., 2008; 2007). Thus, Gli binding is not the only transcription factor-mediated event regulating Hh target gene expression.

4.1.1 ZFP629 is part of the SUFU protein complex

Affinity purification of proteins coupled to mass spectrometry has allowed scientists to identify novel protein-protein interactions and, from these, to advance their understanding of cellular and molecular regulatory mechanisms. In the first study of this thesis, we identified a novel transcriptional repressor interacting with Sufu. Since Sufu is mostly known for associating with and regulating the transcriptional activity of the Gli proteins, the identification of additional transcription factors interacting with Sufu was of great interest. Thus, finding ZFP629 in SUFU-associated complexes in two different cell lines raised the possibility that ZFP629 could be a novel transcription factor involved in Hh signalling, or alternatively that SUFU regulates ZFP629 activity independently of Hh signals. Virtually no data about this zinc finger protein were available at the onset of this project. Publicly available expression data from mouse cerebellum, which we confirmed by in situ hybridization, indicated that it was expressed in this tissue and followed a pattern reminiscent of Hh targets and components, as shown in Figure 3-5.


Using truncations of ZFP629, we were able to determine that the N-terminus of ZFP629, which does not contain any known protein domain, was sufficient to mediate the interaction with SUFU. However, when this N-terminal construct of ZFP629 was co-expressed with Strep-HA-SUFU, it did not co-precipitate with Strep-HA-SUFU. That the interaction is detected in only one direction could be due to interference from the tags and beads affecting the binding of the two proteins. This shortcoming could potentially be resolved in the future by using different tags or C-terminally tagged versions of the proteins.

4.1.2 ZFP629 consensus sequence

As part of this study, we identified a consensus DNA binding sequence for ZFP629 using the CAST approach. The experiment was performed with whole-cell lysates, raising the possibility that purified and sequenced DNA was also pulled down by DNA-binding proteins associating with ZFP629. Another consideration is the possibility that the assay was limited by the length of the oligonucleotides used. If the 14 adjacent zinc fingers were binding the DNA at the same time, the sequence covered by ZFP629 would be 43 bases. Some of the linkers within this group of zinc finger motifs diverge from the consensus sequence for optimal DNA-binding (Table 3-3, at the end of this chapter). Exactly how this affects the binding and specificity of ZFP629 binding to DNA is unknown. Nonetheless, the motif found by the CAST assay was validated by the finding of a similar motif in 10% (611/6253) of the sequences submitted for de novo motif enrichment analysis. The biological significance of ZFP629 binding DNA at these locations remains to be determined.

4.1.3 ZFP629 is a novel transcription repressor

Using a luciferase assay and fusion proteins of ZFP629 to the Gal4-DBD, we determined that ZFP629 can repress transcription. Consistent with these results, BioID experiments determined that ZFP629 comes in close proximity to members of transcription repressor complexes such as SWI/SNF and NURD. The N-terminus by itself had no effect in the Gal4-luciferase assay, but ZFP629 truncations that preserve the core 14 zinc fingers were all able to repress the expression of firefly luciferase, suggesting that the zinc finger motifs are involved in the recruitment of the repressor complexes. However, whether or not the zinc finger-free N-terminus of ZFP629 localizes to the nucleus has not been determined; thus it is impossible to completely rule out that this domain mediates interactions with other transcriptional repressors. Which domains or motifs specifically are required for recruiting transcriptional repressor complexes still has to be addressed. Mass spectrometry analysis and further functional Gal4 assays using the truncated versions of ZFP629 could provide significant information to address this question.

4.1.4 Regulation of ZFP629

ZFP629 interacts with SUFU, a key regulator of GLI. SUFU interacts directly with GLI and is considered a negative regulator of the pathway. Loss of Sufu leads to constitutive activation of the pathway, in a similar fashion to a loss of Ptch1. In Ptch1-/- MEFs, total Gli protein levels are similar to wildtype Gli levels, except for Gli3R levels that are reduced. However, the loss of Sufu leads to an overall loss of Gli protein levels, implying that Sufu is also involved in controlling the full-length Gli protein levels. Gli protein levels are restored in Sufu-/- MEFs by the overexpression of Sufu.

In preliminary experiments performed in wildtype, Ptch1-/-, Sufu-/- and Sufu-/--Venus-Sufu MEFs, ZFP629 protein levels decreased with the loss of Sufu (Appendix 1). These experiments indicate that ZFP629 could also be stabilized in Ptch1-/- MEFs when compared with wildtype MEFs. However, comparisons between the wildtype, Ptch1-/- and Sufu-/- MEFs are less than ideal as the wildtype MEFs were obtained from a different lab. Additional experiments with MEFs derived identically and from the same mouse strains are needed to draw valid conclusions as to the role of Sufu and the activity of the pathway in ZFP629 protein level regulation. It thus remains to be determined if ZFP629 is controlled at the protein level in the context of the Hh pathway, and if so, whether or not the mechanisms involved are the same as for Gli regulation.

The conserved zinc finger linker found in ZFP629 is phosphorylated in other zinc finger proteins to regulate their transcriptional activity in the context of the cell cycle (Dovat et al., 2002; Rizkallah et al., 2014). Phosphorylation at this site affects the affinity of the adjacent zinc finger for DNA. In addition to the zinc finger linkers, there are also several predicted phosphorylation sites within the ZFP629 sequence that could be involved in its regulation, including PKA, GSK3 and CK1 motifs. Publicly available phosphoproteomic data indicate ZFP629 is ubiquitylated and phosphorylated at sites with conserved motifs for these kinases (Uniprot, PhosphositePlus, ). This information can guide the selection of candidate post-translational modifications involved in the regulation of ZFP629 protein levels or its transcriptional activity. However, currently available human protein data present conflicting information. The Human Protein Atlas (www.proteinatlas.org, Figure 4-1, Figure 4-2 at the end of this chapter) reports high protein levels of ZNF629, the ZFP629 human homologue, in the liver, more specifically in hepatocytes, while the first two mass spectrometry-based human proteomes reported high protein levels in their adult retina samples, but neither reported ZNF629 in liver analyses (www.proteomicsdb.org and www.humanproteomemap.org, Figure 4-3, Figure 4-4 at the end of this chapter) (Kim et al., 2014b; Uhlén et al., 2015; Wilhelm et al., 2015). The Human Protein Atlas reports no detection of ZFP629 protein in their adult brain samples, whereas the two other proteomics databases named above report detection of ZFP629 in their fetal brain samples.

ZFP629 does not appear to have a standard β-TrCP degron motif. Such a motif would mark it for ubiquitin-mediated proteolysis. However, the lack of a conserved degron does not rule out an involvement of β-TrCP in the regulation of ZFP629 protein levels (Frescas and Pagano, 2008). Depletion or overexpression of β-TrCP combined with biochemical assays would help determine if ubiquitylation and β-TrCP are involved in the control of ZFP629 protein levels.

4.1.5 Regulation of Zfp629 and understanding its role in the developing cerebellum

Foxp2 is not a canonical Hh/Gli target gene and little is known about the upstream signals regulating its gene expression. However, evidence presented here indicates that its transcription is regulated by the activity of the Hh pathway. Both ZFP629 and SUFU are required to repress Foxp2 levels.


While we have produced a rabbit polyclonal antibody against ZFP629, and a few commercial antibodies are available, antibodies suitable for ChIP are difficult to produce and extensive validation is required to obtain reliable data. Ectopic expression of a tagged transcription factor, while easier, can lead to artefactual data from the transcription factor binding to low affinity sites or sites that would not be available in a normal context (Fernandez, 2003). To reduce this risk, the levels of the epitope-tagged protein should ideally be similar to the endogenous levels. Another issue is the possibility that the epitope interferes with DNA binding or association with cofactors. In recent years, several new methodologies enabling genome editing have been established, such as designer zinc finger nucleases, TALEN and the CRISPR/Cas9 system. The CRISPR/Cas9 system allows for direct manipulation of endogenous sequences in a relatively simple and efficient manner. Genes can be mutated or disrupted. It also allows for epitope tagging of endogenous proteins, keeping their expression under the control of their endogenous promoter (Savic et al., 2015). Although CRISPR/Cas9-mediated epitope tagging does not prevent the possible detrimental effects of the tag, it provides a solution to the ectopic expression problems and could be used in future ChIP experiments.

How ZFP629 regulates gene expression is not fully understood. Our proteomic data indicating that ZFP629 interacts with repressor complexes such as SWI/SNF and NURD suggest that this may be one mechanism.

Foxp2 is required for proper development of the cerebellum. More specifically, Foxp2 is required for the proper growth and organization of Purkinje cells in the cerebellum. Deletion of Foxp2 results in a disorganized molecular layer deficient in Purkinje cell dendritic arbors and Bergmann radial glial fibers, the latter preventing migration of the granule cells to the IGL (Shu et al., 2005).

Is ZFP629 repressing the expression of Foxp2 in the GCNPs? How is Zfp629 modulated in the Purkinje cells to allow for Foxp2 to be expressed? Foxp2 is expressed in inferior olive neurons and in Hh-secreting Purkinje cells, but not in the Hh-responsive GCNPs. Proteomic and genomic approaches will be needed to better understand the interplay between the repressor’s expression and its function with respect to Foxp2.


4.1.6 ZFP629 and chemokines

Using an expression microarray and Zfp629-targeting shRNAs, a set of chemokine-related genes was identified as being upregulated when the levels of ZFP629 were reduced. Chemokines are well known for their involvement in inflammatory responses. They also play a key role in malignant diseases including cancer, where they mediate tumour-stroma interactions. Solid tumours are composed of cancer cells and non-malignant stromal cells: macrophages, lymphocytes, fibroblasts and endothelial cells. The tumour cell composition and the presence of immune cells are dictated by the repertoire of chemokines released by the cancer and stromal cells. The stroma also includes the capillaries and extracellular matrix surrounding the malignant cells. Tumour-associated fibroblasts and macrophages have gained considerable attention from the cancer research community as they mediate several tumour-stroma interactions affecting tumour progression and metastasis. Both fibroblasts and macrophages secrete chemokines in response to injuries or infections for the recruitment of immune cells. They also support angiogenesis by secreting matrix metalloproteases that weaken the surrounding extracellular matrix, also causing the release of VEGF, an angiogenic growth factor.

The concerted actions of tumour-associated fibroblasts and macrophages support proliferative signalling, angiogenesis and invasion/metastasis, three of the six hallmarks of cancer as proposed by Hanahan and Weinberg (Hanahan and Weinberg, 2011).

Hh signalling is involved in several forms of cancer. We discussed previously cancers linked to aberrant Hh pathway activity due to mutations of its components. These cancers are considered to be ligand independent and to arise from tumour cell autonomous signalling. However, Hh signalling is also implicated in ligand-dependent tumour cell signalling with the stroma. In pancreatic cancer, the tumour secretes Hh ligands and stromal cells respond by stimulating tumour cell growth while inhibiting stromal vascularization (Tian et al., 2009). In colorectal cancer, the tumour secretes Hh ligands to stimulate VEGF expression and promote tumour vascularization (Chen et al., 2011a). Finally, in glioma, Shh is expressed in tumour-associated endothelial cells, in surrounding and infiltrating astrocytes, and proximal to proliferating endothelial cells supporting the formation of new vasculature (Becher et al., 2008). Thus, the initial finding that ZFP629 could regulate cytokine-related genes was of significance given this context. However, even though we confirmed the microarray results, experiments showed only marginal effects on their expression in the presence of the pathway agonist purmorphamine. It is possible that ZFP629 fulfills Sufu-dependent, but Hh ligand-independent, roles, and that it is regulated by other upstream inputs. In fact, Gli proteins have been demonstrated to be regulated by stimulatory inputs other than those mediated by Hh ligands (Ji et al., 2007; Nolan-Stevaux et al., 2009; Seto et al., 2009).


[Figure 4-1: Human Protein Atlas tissue atlas entry for ZNF629 (zinc finger protein 629; antibody HPA019550), showing RNA expression (FPKM) and protein expression (score) by organ system. Protein summary: detected at high or medium expression levels in 3 of 81 analyzed normal tissue cell types; most normal tissues were negative; strong cytoplasmic positivity in hepatocytes of liver and moderate cytoplasmic staining in a subset of renal tubuli. Protein reliability: uncertain, based on 1 antibody.]

Figure 4-1 ZNF629 expression profile in human tissue. mRNA (left) and protein (right) levels in different human tissues according to the Human Protein Atlas. (www.proteinatlas.org/ENSG00000102870-ZNF629/tissue (Uhlén et al., 2015))


[Figure 4-2: Human Protein Atlas cell line atlas entry for ZNF629 (antibody HPA019550), showing RNA expression (FPKM) and antibody staining (score) across cell lines. RNA expression: transcript detected at only low levels. Protein expression: protein detected at only low levels. Reliability: uncertain.]

Figure 4-2 ZNF629 expression profile in cell lines

mRNA (left) and protein (right) expression levels in various cell lines, according to the Human Protein Atlas (http://www.proteinatlas.org/ENSG00000102870-ZNF629/cell/HPA019550 (Uhlén et al., 2015))


Figure 4-3 ZNF629 protein levels in tissues reported by Proteomics DB (https://www.proteomicsdb.org/proteomicsdb/#human/proteinDetails/84146/summary, (Wilhelm et al., 2015))


Figure 4-4 ZNF629 protein levels in tissues reported by Human Proteome Map (http://www.humanproteomemap.org/, (Kim et al., 2014b))

4.2 Smo Study

For the second results section, three results merit emphasis: 1) a structure-based approach discovered several novel scaffolds unrelated to previously described Smo inhibitors; 2) these new antagonists, though docked in one of the canonical Smo intra-helical binding sites, made interactions distinct from previous ligands, and one of the most potent compounds, 45b, was little affected by the D473H mutation in Smo, the mutation that limits vismodegib clinical efficacy; 3) we found it important to confirm the mechanism of binding of these compounds, investigating them not only by functional assay, but by fluorescent-ligand displacement, and controlling for colloidal aggregation. This artefactual mechanism indeed affected four of the 14 antagonists discovered herein, as well as the highly studied Smo antagonist itraconazole, which may therefore also behave as a colloidal aggregator against this target.

4.2.1 Docking

The screen was performed using the only structure of Smoothened available at the time, in an inactive state bound to an antagonist. While such screens using one solved crystal structure have proved successful for other receptors and yielded novel ligands with nanomolar affinities, class F receptors pose an added challenge due to their divergence from other GPCRs.

The 2013 GPCR Dock Assessment report (Kufareva et al., 2014) included a homology modeling contest for Smo, with comparison to the recently published crystal structures of Smo with the LY2940680 and SANT-1 antagonists for scoring. One of the biggest challenges for the 20 teams assigned to Smo was modeling extracellular loops (ECL) 1 and 3 as well as the extracellular domain (ECD) linker of Smo, all of which are longer than any other characterized ECL and ECD linkers. The published crystal structures revealed an unprecedented involvement of the ECLs in forming the binding pockets. Thus, any inaccuracy in the modeling of the ECLs would impact the accuracy of the binding site prediction. None of the 88 submitted models accurately predicted the binding sites or the amino acids interacting with either antagonist. The best contact ratios achieved were 8.8% and 12.3%, for one model per ligand (medians < 1%), while teams tasked with modeling the serotonin receptor with ergotamine obtained up to 72.6% contact accuracy (median 23.9%) for that measure (Kufareva et al., 2014). There are now five crystal structures published for Smo, four for the receptor co-crystallized with an antagonist and one with an agonist (Wang et al., 2013; 2014; Weierstall et al., 2014). There are also reports for the CRD of Smo and Fzd (Janda et al., 2012; Nachtergaele et al., 2013). Docking to the open and shallow surface of their unique CRDs might prove to be the ultimate challenge.

4.2.2 Aggregation

As in so many other early discovery campaigns, some of our initial hits turned out to be colloidal aggregators — this is an artefact to which Smo is clearly prone. Four out of the 14 analogs that we discovered were aggregators in one or more assays. This emphasizes the importance of controlling for this mechanism in early discovery campaigns against this and related targets. This same aggregation mechanism may also affect a heavily studied Smo antagonist, indeed one advanced into the clinic, the popular repurposing drug itraconazole.


4.2.3 Library Bias

Molecular docking screens have proven effective for GPCRs, partly because of the relatively high bias among commercially available compounds towards relevant chemotypes, but also because of the well-formed ligand sites within the trans-membrane helical domains. This is a feature that Smo shares: its site is largely closed off to solvent and tightly defined within the bundle. Both aspects contribute to good ligand complementarity and consequently docking success. As with other GPCR docking campaigns, the hit rate against Smo was substantially higher than we typically observe against soluble proteins. However, our 19% hit rate (4 active out of 21 tested) falls at the lower end of the success rates we have observed against other GPCRs (17% to 58%) (Carlsson et al., 2010; 2011; Katritch et al., 2010; Kolb et al., 2009b; Weiss et al., 2013). The activities of the hits were in the 5 to 50 µM range, at least one log weaker than observed in most other GPCR docking campaigns. One possible explanation for these weaker results is the great divergence between Smo and well-drugged GPCRs. While library bias towards GPCR ligands is favorable for GPCR docking campaigns, this bias might not support a campaign against such a divergent binding site. Nevertheless, our main goal in this structure-based virtual screen was to find novel ligands, and in this we succeeded: the antagonists differ from known ligands, as indicated by their Tc scores (Table 3-6 and Table 3-8).
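The novelty criterion in the preceding paragraph rests on a Tc score between each hit and the known ligands. The following is a minimal sketch of that arithmetic only, assuming that Tc denotes the Tanimoto coefficient and that each molecule is represented as a set of "on" fingerprint bits. The fingerprint type, the bit sets and the function names are hypothetical and are not taken from this work.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def max_tc_to_known(hit_fp, known_fps):
    """Highest Tc between one hit and any known ligand.

    A low maximum suggests that the hit does not resemble any known ligand.
    """
    return max(tanimoto(hit_fp, fp) for fp in known_fps)


# Hypothetical bit sets, for illustration only (not real molecules).
hit = {1, 4, 9, 16, 25}
known = [{1, 4, 8, 15}, {2, 3, 5, 7, 11}]
print(round(max_tc_to_known(hit, known), 3))
```

Taking the maximum over all known ligands, rather than the mean, is the conservative choice here: a hit counts as novel only if it is dissimilar to every reference compound.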

4.2.4 Functional activity of docking hits

Initial virtual screening campaigns against the inactive inverse-agonist-bound structures of the A2a adenosine, β2-adrenergic and D3 dopamine receptors yielded only inverse agonists and antagonists (Carlsson et al., 2010; 2011; Katritch et al., 2010; Kolb et al., 2009a). Using the active structure of the β2-adrenergic receptor bound to the potent agonist BI167107 and a G-protein mimetic nanobody, Weiss et al. (2013) succeeded in identifying a novel agonist of the β2-adrenergic receptor. Altogether, this supports the concept that the state of the crystallized receptor used for screening strongly influences the intrinsic activity of the hits. This was also reflected in our screen, with 3 weak partial agonists (data not shown) and 4 antagonists out of the 21 molecules initially screened.


4.2.5 Functional selectivity

Historically, GPCRs have been represented as molecular switches regulating a given signalling pathway. In the classical two-state model, a receptor adopts one of two conformations: an active state, stabilized by an agonist, and an inactive state, in which an antagonist prevents binding of the agonist. However, current models describe a more complex reality in which a receptor can adopt a multitude of conformations and is capable of engendering more than simple binary responses. Receptors can elicit varied responses through the use of different downstream signalling mechanisms such as G protein coupling or receptor trafficking. In cases where conformations linked to different signals are distinct, ligands can selectively modulate a signal by stabilizing a specific conformation (Wacker et al., 2013).
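For orientation, the classical two-state picture can be written down numerically. This is a minimal sketch of the standard textbook formulation, not a model fitted or used in this work. The function name, parameter names and all values are illustrative assumptions. It does not capture the multi-conformation behaviour described by current models.

```python
def fraction_active(ligand_conc, k_d_inactive, alpha, l_const):
    """Fraction of receptor in the active state R* under a two-state model.

    ligand_conc  : free ligand concentration [A]
    k_d_inactive : dissociation constant of A for the inactive state R
                   (same units as ligand_conc)
    alpha        : affinity ratio; alpha > 1 means A prefers R* (agonist),
                   alpha < 1 means A prefers R (inverse agonist),
                   alpha == 1 is a neutral ligand
    l_const      : isomerization constant L = [R*]/[R] without ligand
    """
    x = ligand_conc / k_d_inactive
    active = l_const * (1.0 + alpha * x)  # R* plus AR*, relative to free R
    inactive = 1.0 + x                    # R plus AR, relative to free R
    return active / (active + inactive)


# Illustrative values only.
basal = fraction_active(0.0, 1.0, 1.0, 0.1)
with_agonist = fraction_active(10.0, 1.0, 100.0, 0.1)
with_inverse_agonist = fraction_active(10.0, 1.0, 0.01, 0.1)
print(basal < with_agonist, with_inverse_agonist < basal)
```

In this formulation a neutral ligand (alpha equal to 1) leaves the active fraction at its basal value, which is why such a compound appears as an antagonist only in the presence of an agonist.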

Recent work provided evidence that some canonical Hh/Gli antagonists such as cyclopamine and GDC-0449 also act as agonists of glucose uptake and Ca2+ influx, all through Smo (Teperino et al., 2012). These effects can be inhibited by pertussis toxin, implicating Gαi downstream of Smo. This functional selectivity could explain some of the severe muscle cramping and weight loss experienced by patients treated with GDC-0449. Understanding the signals downstream of Smo will help to screen for and develop better drugs selective for the Hh/Gli axis. It also opens new lines of research for drug development targeting the control of metabolism.

5 Conclusions and Perspectives

Study 1: ZFP629 is a novel zinc finger transcription factor interacting with SUFU

The aim of the first study was to initiate the characterization of a novel transcription factor and novel interactor of SUFU, a central protein of the Hedgehog pathway. This transcription factor, ZFP629, was first identified through a proteomic analysis of SUFU-associated proteins. We have determined by in situ hybridization that Zfp629 is expressed in the developing cerebellum and in Hh-driven tumours; however, its mRNA levels were not significantly increased in tumour samples assessed by qPCR. Its transcript levels are also not modulated by strong pathway agonists or antagonists, altogether indicating that Zfp629 is not a Hedgehog target gene.

A Gal4-Luciferase assay indicated that ZFP629 can act as a transcriptional repressor. Using the CAST methodology, a DNA binding sequence was identified, and this motif was also found at the summit of 10% of the peaks analyzed for de novo motif enrichment. A first attempt at identifying ZFP629 target genes using a Zfp629-targeting shRNA and microarrays led to a subset of chemokines, including Ccl5 and Cxcl1, as candidate target genes. However, their expression was only marginally affected by a potent pathway agonist. ChIP-seq was used successfully to identify Foxp2, a transcription factor involved in cerebellar development and required for Purkinje cell dendritic arborisation, as a direct target gene of ZFP629. Knockdown of Zfp629 expression led to an increase in Foxp2 mRNA levels, supporting the hypothesis that ZFP629 represses transcription. Foxp2 expression was significantly repressed when Sufu expression was restored in Sufu-/- MEFs. However, Foxp2 has never been identified as a Hedgehog target gene. Foxp2 and Zfp629 are both expressed in the cerebellum during development, with Foxp2 expressed in the Hh-secreting Purkinje cells but not in the Hh-responsive granule cell progenitors. Together these data suggest that ZFP629 may be repressing the expression of Foxp2 in GCNPs.


Figure 5-1 Summary of findings

[Figure labels: EGL; IGL; mitotic CGNP; postmitotic CGNP; granule neurons; Bergmann glia; Purkinje neurons; SHH; ZFP629; Foxp2 (in Purkinje cells; in CGNPs?)]

Figure 5-2 Schematic of the hypothesis of the role of Zfp629 and Foxp2 in the cerebellum

The recently developed CRISPR/Cas9 system holds tremendous promise for biomedical research and is transforming basic life sciences research as it allows for fast, inexpensive and specific gene editing. An endonuclease-deficient Cas9, dCas9, allows for specific gene expression activation or repression through fusion to an activator or repressor domain. The fusion protein is then guided to a specific locus of interest to activate or repress the expression of a specific gene. Another variation of the Cas9 system allows the identification of DNA-bound proteins at a given locus, a methodology called enChIP. For enChIP, an epitope-tagged dCas9 is guided to a specific locus in a promoter (Waldrip et al., 2014). The DNA-protein complexes surrounding dCas9 are isolated using a chromatin affinity-purification protocol and analyzed by mass spectrometry. Such an approach allows the identification of DNA-bound proteins and potential co-regulators at a specific locus. The enChIP methodology could be used to identify proteins bound to the Zfp629 promoter and help elucidate the mechanisms regulating its expression. Understanding how Zfp629 gene expression is regulated and which signalling pathways are involved would shed light on the intricate mechanisms involved in the development of the cerebellum. The enChIP approach could also be used to define how ZFP629 regulates Foxp2 expression and identify the determinants involved. Another approach would be to use biotinylated oligos containing the ZFP629 binding site, or the region we found it to occupy within the Foxp2 promoter, to isolate DNA-bound ZFP629 and associated proteins. The isolated proteins would then be identified by mass spectrometry.

Elucidating the mechanisms regulating the activity of ZFP629 and confirming the cell-type specific expression of both transcription factors in granule cell progenitors and Purkinje cells would provide the basis to ultimately understand how Zfp629 is regulated to modulate the expression of Foxp2 in both cell types, but also to further understand the interplay between the Foxp2-expressing and Hh-secreting Purkinje cells, and the Hh-responsive granule cell progenitors during cerebellar development.

Whether or not SUFU modulates the protein levels of ZFP629 remains to be addressed. Based on our preliminary results of Zfp629 expression in tumours and with purmorphamine, the answer carries significant ramifications, potentially implicating SUFU in the regulation of transcription factors in other pathways. In studies separate from the enChIP experiments, the next step would be to determine whether SUFU affects the protein stability of ZFP629 and, if so, whether it does so through the same mechanisms as for Gli proteins, such as phosphorylation by CK1α and GSK3 and degradation involving SPOP. If these same mechanisms are indeed involved, we would determine whether Hh signalling influences the effect of SUFU on ZFP629, or whether another signalling pathway is involved. At either step, the results would be informative about the role of SUFU, an important yet less well characterized protein.

Study 2: Identification of novel Smoothened ligands using structure-based docking

Molecular docking screens have proven effective for GPCRs, partly because of the relatively high bias among commercially available compounds towards relevant chemotypes, but also because of the ideal ligand binding cavity within the trans-membrane helical domains (Carlsson et al., 2011; Kolb et al., 2009b). This is a feature that SMO shares: its transmembrane site is largely closed off from bulk solvent and, though larger than the orthosteric sites of aminergic GPCRs like the β2 receptor, it is substantially smaller than that of peptide GPCRs like the µ-opioid receptor, and is tightly defined. Both burial from solvent and a well-formed site contribute to good ligand complementarity, which is important for docking success. As with other GPCR docking campaigns, the initial hit rate against SMO was higher than we typically observe against soluble proteins, though at 19% (4 active out of 21 tested) it is at the lower end of the range observed against other GPCRs (17% to 58%). So too was the potency of the hits, which were in the 2 to 25 µM range, one to two logs weaker than observed in most other GPCR campaigns. However, our main goal here was to find novel chemotypes, while in earlier campaigns many of the hits resembled known ligands and recapitulated canonical interactions (Dijkgraaf et al., 2011; Manetti et al., 2010). Insisting on novelty likely reduces the likelihood of finding higher affinity hits that exploit already developed chemotypes, but has the advantage of finding antagonists with new properties.
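The hit-rate and potency comparisons in the preceding paragraph come down to two small calculations, sketched below. The counts (4 of 21), the 17% to 58% range and the 25 µM upper potency bound are taken from the text. The 0.25 µM reference potency and all function names are hypothetical and serve only to illustrate how a difference in "logs" is computed.

```python
import math


def hit_rate(n_active, n_tested):
    """Fraction of tested compounds that were confirmed as active."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_active / n_tested


def within_range(value, low, high):
    """True if value lies in the closed interval [low, high]."""
    return low <= value <= high


def log_units_weaker(potency_um, reference_um):
    """Number of log10 units by which potency_um is weaker than reference_um.

    Both values are concentrations in µM; a larger concentration is a weaker hit.
    """
    return math.log10(potency_um / reference_um)


rate = hit_rate(4, 21)
print(f"hit rate: {rate:.0%}")
print("within 17% to 58%:", within_range(rate, 0.17, 0.58))
# Hypothetical reference potency of 0.25 µM, for illustration only.
print("log units weaker:", log_units_weaker(25.0, 0.25))
```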

The new antagonists, though docked in the canonical LY2940680/cyclopamine site, were predicted to make interactions that differ from those of these compounds, as defined in the SMO crystal structures. Crucially, one of the most potent antagonists, 45b, interacted with Glu518 and Asp384 in its docked WT complex. Docking and minimization in the modeled D473H mutant structure resulted in a very similar pose. This predicted behaviour is consistent with the experimental results showing that compound 45b is resilient to the resistance-conferring mutation D473H. Although we did not predict this from the start, or even select for it, we did aim for novelty and selected compounds forming interactions with residues other than those engaged in the crystal structure.

This structure-based screen found ten new antagonists in three new scaffolds for SMO. One of the most potent, compound 45b, retained its activity against the D473H mutant of SMO that confers clinical resistance to vismodegib. Our study, therefore, leveraged the strength of structure-based docking to identify ligands with new chemotypes for SMO, a class F GPCR. As more structures become available, this approach may also enable the identification of Frizzled ligands, for which no small-molecule modulators are currently clinically available and which are highly sought, considering the importance of Wnt and Fzd in cancer (Figure 5-3).

Figure 5-3 Drugs targeting the Wnt and Hh pathways

References

Aanstad, P., Santos, N., Corbit, K.C., Scherz, P.J., Trinh, L.A., Salvenmoser, W., Huisken, J., Reiter, J.F., and Stainier, D.Y.R. (2009). The extracellular domain of Smoothened regulates ciliary localization and is required for high-level Hh signaling. Curr. Biol. 19, 1034–1039.

Angers, S., Thorpe, C.J., Biechele, T.L., Goldenberg, S.J., Zheng, N., MacCoss, M.J., and Moon, R.T. (2006). The KLHL12-Cullin-3 ubiquitin ligase negatively regulates the Wnt-beta-catenin pathway by targeting Dishevelled for degradation. Nature Cell Biology 8, 348–357.

Arensdorf, A.M., Marada, S., and Ogden, S.K. (2016). Smoothened Regulation: A Tale of Two Signals. Trends in Pharmacological Sciences 37, 62–72.

Aruga, J. (2004). The role of Zic genes in neural development. Mol. Cell. Neurosci. 26, 205–221.

Atcha, F.A., Syed, A., Wu, B., Hoverter, N.P., Yokoyama, N.N., Ting, J.H.T., Munguia, J.E., Mangalam, H.J., Marsh, J.L., and Waterman, M.L. (2007). A Unique DNA Binding Domain Converts T-Cell Factors into Strong Wnt Effectors. Mol. Cell. Biol. 27, 8352–8363.

Axelson, M., Liu, K., Jiang, X., He, K., Wang, J., Zhao, H., Kufrin, D., Palmby, T., Dong, Z., Russell, A.M., et al. (2013). U.S. Food and Drug Administration approval: vismodegib for recurrent, locally advanced, or metastatic basal cell carcinoma. Clin. Cancer Res. 19, 2289–2293.

Babaoglu, K., Simeonov, A., Irwin, J.J., Nelson, M.E., Feng, B., Thomas, C.J., Cancian, L., Costi, M.P., Maltby, D.A., Jadhav, A., et al. (2008). Comprehensive Mechanistic Analysis of Hits from High-Throughput and Docking Screens against β-Lactamase. Journal of Medicinal … 51, 2502–2511.

Bailey, T.L., Boden, M., Buske, F.A., Frith, M., Grant, C.E., Clementi, L., Ren, J., Li, W.W., and Noble, W.S. (2009). MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208.

Barnfield, P.C., Zhang, X., Thanabalasingham, V., Yoshida, M., and Hui, C.-C. (2005). Negative regulation of Gli1 and Gli2 activator function by Suppressor of fused through multiple mechanisms. Differentiation 73, 397–405.

Beak, J.Y., Kang, H.S., Kim, Y.S., and Jetten, A.M. (2008). Functional analysis of the zinc finger and activation domains of Glis3 and mutant Glis3(NDH1). Nucleic Acids Res. 36, 1690–1702.

Becher, O.J., Hambardzumyan, D., Fomchenko, E.I., Momota, H., Mainwaring, L., Bleau, A.-M., Katz, A.M., Edgar, M., Kenney, A.M., Cordon-Cardo, C., et al. (2008). Gli activity correlates with tumor grade in platelet-derived growth factor-induced gliomas. Cancer Research 68, 2241–2249.

Belgacem, Y.H., and Borodinsky, L.N. (2011). Sonic hedgehog signaling is decoded by calcium spike activity in the developing spinal cord. Proceedings of the National Academy of Sciences 108, 4482–4487.

Bento, A.P., Gaulton, A., Hersey, A., Bellis, L.J., Chambers, J., Davies, M., Krüger, F.A., Light, Y., Mak, L., McGlinchey, S., et al. (2014). The ChEMBL bioactivity database: an update. Nucleic Acids Res. 42, D1083–D1090.

Bhanot, P., Brink, M., Samos, C.H., Hsieh, J.C., Wang, Y., Macke, J.P., Andrew, D., Nathans, J., and Nusse, R. (1996). A new member of the frizzled family from Drosophila functions as a Wingless receptor. Nature 382, 225–230.

Bookout, A.L., Cummins, C.L., Mangelsdorf, D.J., Pesola, J.M., and Kramer, M.F. (2006). High-throughput real-time quantitative reverse transcription PCR. Curr Protoc Mol Biol Chapter 15, Unit15.8.

Buonamici, S., Williams, J., Morrissey, M., Wang, A., Guo, R., Vattay, A., Hsiao, K., Yuan, J., Green, J., Ospina, B., et al. (2010). Interfering with Resistance to Smoothened Antagonists by Inhibition of the PI3K Pathway in Medulloblastoma. Science Translational Medicine 2, 51ra70.

Buttitta, L., Mo, R., Hui, C.-C., and Fan, C.-M. (2003). Interplays of Gli2 and Gli3 and their requirement in mediating Shh-dependent sclerotome induction. Development 130, 6233–6243.

Cain, J.E., and Rosenblum, N.D. (2010). Control of mammalian kidney development by the Hedgehog signaling pathway. Pediatr Nephrol 26, 1365–1371.

Carlsson, J., Coleman, R.G., Setola, V., Irwin, J.J., Fan, H., Schlessinger, A., Sali, A., Roth, B.L., and Shoichet, B.K. (2011). Ligand discovery from a dopamine D3 receptor homology model and crystal structure. Nat. Chem. Biol. 7, 769–778.

Carlsson, J., Yoo, L., Gao, Z.-G., Irwin, J.J., Shoichet, B.K., and Jacobson, K.A. (2010). Structure-based discovery of A2A adenosine receptor ligands. J. Med. Chem. 53, 3748–3755.

Chambers, C.C., Hawkins, G.D., Cramer, C.J., and Truhlar, D.G. (1996). Model for Aqueous Solvation Based on Class IV Atomic Charges and First Solvation Shell Effects (American Chemical Society).

Chan, D.W., Liu, V.W., Leung, L.Y., Yao, K.M., Chan, K.K., Cheung, A.N., and Ngan, H.Y. (2011). Zic2 synergistically enhances Hedgehog signalling through nuclear retention of Gli1 in cervical cancer cells. J. Pathol. 225, 525–534.

Chang, A.L.S., and Oro, A.E. (2012). Initial Assessment of Tumor Regrowth After Vismodegib in Advanced Basal Cell Carcinoma. Arch Dermatol 148, 1324–1325.

Chen, J.K. (2002). Inhibition of Hedgehog signaling by direct binding of cyclopamine to Smoothened. Genes & Development 16, 2743–2748.

Chen, J.K., Taipale, J., Young, K.E., Maiti, T., and Beachy, P.A. (2002). Small molecule modulation of Smoothened activity. Proc. Natl. Acad. Sci. U.S.A. 99, 14071–14076.

Chen, M.H., Wilson, C.W., Li, Y.J., Law, K.K.L., Lu, C.S., Gacayan, R., Zhang, X., Hui, C.C., and Chuang, P.T. (2009). Cilium-independent regulation of Gli protein function by Sufu in Hedgehog signaling is evolutionarily conserved. Genes & Development 23, 1910–1928.

Chen, W., Tang, T., Eastham-Anderson, J., Dunlap, D., Alicke, B., Nannini, M., Gould, S., Yauch, R., Modrusan, Z., DuPree, K.J., et al. (2011a). Canonical hedgehog signaling augments tumor angiogenesis by induction of VEGF-A in stromal perivascular cells. Proceedings of the National Academy of Sciences 108, 9589–9594.

Chen, Y., Sasai, N., Ma, G., Yue, T., Jia, J., Briscoe, J., and Jiang, J. (2011b). Sonic Hedgehog Dependent Phosphorylation by CK1α and GRK2 Is Required for Ciliary Accumulation and Activation of Smoothened. PLoS Biology 9, e1001083.

Cheng, S.Y., and Bishop, J.M. (2002). Suppressor of Fused represses Gli-mediated transcription by recruiting the SAP18-mSin3 corepressor complex. Proc. Natl. Acad. Sci. U.S.A. 99, 5442–5447.

Chinchilla, P., Xiao, L., Kazanietz, M.G., and Riobo, N.A. (2010). Hedgehog proteins activate pro-angiogenic responses in endothelial cells through non-canonical signaling pathways. 9, 570–579.

Coan, K.E.D., and Shoichet, B.K. (2008). Stoichiometry and physical chemistry of promiscuous aggregate-based inhibitors. J. Am. Chem. Soc. 130, 9606–9612.

Cooper, A.F., Yu, K.P., Brueckner, M., Brailey, L.L., Johnson, L., McGrath, J.M., and Bale, A.E. (2005). Cardiac and CNS defects in a mouse with targeted disruption of suppressor of fused. Development 132, 4407–4417.

Corbit, K.C., Aanstad, P., Singla, V., Norman, A.R., Stainier, D.Y.R., and Reiter, J.F. (2005). Vertebrate Smoothened functions at the primary cilium. Nature 437, 1018–1021.

D'Amico, D., Antonucci, L., Di Magno, L., Coni, S., Sdruscia, G., Macone, A., Miele, E., Infante, P., Di Marcotullio, L., De Smaele, E., et al. (2015). Non-canonical Hedgehog/AMPK-Mediated Control of Polyamine Metabolism Supports Neuronal and Medulloblastoma Cell Growth. Developmental Cell 35, 21–35.

Dafinger, C., Liebau, M.C., Elsayed, S.M., Hellenbroich, Y., Boltshauser, E., Korenke, G.C., Fabretti, F., Janecke, A.R., Ebermann, I., Nürnberg, G., et al. (2011). Mutations in KIF7 link Joubert syndrome with Sonic Hedgehog signaling and microtubule dynamics. J. Clin. Invest. 121, 2662–2667.

Dahlgren, M.K., Garcia, A.B., Hare, A.A., Tirado-Rives, J., Leng, L., Bucala, R., and Jorgensen, W.L. (2012). Virtual screening and optimization yield low-nanomolar inhibitors of the tautomerase activity of Plasmodium falciparum macrophage migration inhibitory factor. J. Med. Chem. 55, 10148–10159.

de Graaf, C., Kooistra, A.J., Vischer, H.F., Katritch, V., Kuijer, M., Shiroishi, M., Iwata, S., Shimamura, T., Stevens, R.C., de Esch, I.J.P., et al. (2011). Crystal structure-based virtual screening for fragment-like ligands of the human histamine H(1) receptor. J. Med. Chem. 54, 8195–8206.

Dijkgraaf, G.J.P., Alicke, B., Weinmann, L., Januario, T., West, K., Modrusan, Z., Burdick, D., Goldsmith, R., Robarge, K., Sutherlin, D., et al. (2011). Small Molecule Inhibition of GDC-0449 Refractory Smoothened Mutants and Downstream Mechanisms of Drug Resistance. Cancer Research 71, 435–444.

Dijksterhuis, J.P., and Petersen, J. (2014). WNT/Frizzled signalling: receptor–ligand selectivity with focus on FZD‐G protein signalling and its physiological relevance: IUPHAR Review 3. British Journal of ….

Ding, J., Xu, H., Faiola, F., Ma'ayan, A., and Wang, J. (2012). Oct4 links multiple epigenetic pathways to the pluripotency network. Cell Res. 22, 155–167.

Ding, Q., Fukami, S.-I., Meng, X., Nishizaki, Y., Zhang, X., Sasaki, H., Dlugosz, A., Nakafuku, M., and Hui, C.C. (1999). Mouse Suppressor of fused is a negative regulator of Sonic hedgehog signaling and alters the subcellular distribution of Gli1. Current Biology 9, 1119–S1.

Ding, Q., Motoyama, J., Gasca, S., Mo, R., Sasaki, H., Rossant, J., and Hui, C.C. (1998). Diminished Sonic hedgehog signaling and lack of floor plate differentiation in Gli2 mutant mice. Development 125, 2533–2543.

Doak, A.K., Wille, H., Prusiner, S.B., and Shoichet, B.K. (2010). Colloid formation by drugs in simulated intestinal fluid. J. Med. Chem. 53, 4259–4265.

Dovat, S., Ronni, T., Russell, D., Ferrini, R., Cobb, B.S., and Smale, S.T. (2002). A common mechanism for mitotic inactivation of C2H2 zinc finger DNA-binding domains. Genes & Development 16, 2985–2990.

Duan, D., Doak, A.K., Nedyalkova, L., and Shoichet, B.K. (2015). Colloidal Aggregation and the in Vitro Activity of Traditional Chinese Medicines. ACS Chem. Biol. 10, 978–988.

Dunaeva, M. (2002). Characterization of the Physical Interaction of Gli Proteins with SUFU Proteins. Journal of Biological Chemistry 278, 5116–5122.

Eaton, S. (2008). Multiple roles for lipids in the Hedgehog signalling pathway. Nat Rev Mol Cell Biol 9, 437–445.

Fernandez, P.C. (2003). Genomic targets of the human c-Myc protein. Genes & Development 17, 1115–1129.

Ferreira, L.G., dos Santos, R.N., Oliva, G., and Andricopulo, A.D. (2015). Molecular docking and structure-based drug design strategies. Molecules 20, 13384–13421.

Fischer, M., Coleman, R.G., Fraser, J.S., and Shoichet, B.K. (2014). Incorporation of protein flexibility and conformational energy penalties in docking screens to improve ligand discovery. Nature Chemistry 6, 575–583.

Foord, S.M., Bonner, T.I., Neubig, R.R., Rosser, E.M., Pin, J.-P., Davenport, A.P., Spedding, M., and Harmar, A.J. (2005). International Union of Pharmacology. XLVI. G protein-coupled receptor list. Pharmacological Reviews 57, 279–288.

Frescas, D., and Pagano, M. (2008). Deregulated proteolysis by the F-box proteins SKP2 and beta-TrCP: tipping the scales of cancer. Nat Rev Cancer 8, 438–449.

Gallagher, K., and Sharp, K. (1998). Electrostatic Contributions to Heat Capacity Changes of DNA-Ligand Binding. Biophysical Journal 75, 769–776.

Gaulton, A., Bellis, L.J., Bento, A.P., Chambers, J., Davies, M., Hersey, A., Light, Y., McGlinchey, S., Michalovich, D., Al-Lazikani, B., et al. (2012). ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 40, D1100–D1107.

Generoso, S.F., Giustiniano, M., La Regina, G., Bottone, S., Passacantilli, S., Di Maro, S., Cassese, H., Bruno, A., Mallardo, M., Dentice, M., et al. (2015). Pharmacological folding chaperones act as allosteric ligands of Frizzled4. Nat. Chem. Biol. 11, 280–286.

Gilson, M.K., and Honig, B.H. (1987). Calculation of electrostatic potentials in an enzyme active site. Nature 330, 84–86.

Gocke, C.B., and Yu, H. (2008). ZNF198 stabilizes the LSD1-CoREST-HDAC1 complex on chromatin through its MYM-type zinc fingers. PLoS ONE 3, e3255.

Goetz, S.C., and Anderson, K.V. (2010). The primary cilium: a signalling centre during vertebrate development. Nat. Rev. Genet. 11, 331–344.

Goodrich, L.V., Milenkovic, L., Higgins, K.M., and Scott, M.P. (1997). Altered neural cell fates and medulloblastoma in mouse patched mutants. Science 277, 1109–1113.

Gurney, A., Axelrod, F., Bond, C.J., Cain, J., Chartier, C., Donigan, L., Fischer, M., Chaudhari, A., Ji, M., Kapoun, A.M., et al. (2012). Wnt pathway inhibition via the targeting of Frizzled receptors results in decreased growth and tumorigenicity of human tumors. Proceedings of the National Academy of Sciences 109, 11717–11722.

Hahn, H., Wojnowski, L., Miller, G., and Zimmer, A. (1999). The patched signaling pathway in tumorigenesis and development: lessons from animal models. Journal of Molecular Medicine 77, 459–468.

Han, Y., Shi, Q., and Jiang, J. (2015). Multisite interaction with Sufu regulates Ci/Gli activity through distinct mechanisms in Hh signal transduction. Proceedings of the National Academy of Sciences 112, 6383–6388.

Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of cancer: the next generation. Cell 144, 646–674.

Hausmann, G., von Mering, C., and Basler, K. (2009). The Hedgehog Signaling Pathway: Where Did It Come From? PLoS Biology 7, e1000146.

Haycraft, C.J., Banizs, B., Aydin-Son, Y., Zhang, Q., Michaud, E.J., and Yoder, B.K. (2005). Gli2 and Gli3 localize to cilia and require the intraflagellar transport protein polaris for processing and function. PLoS Genet 1, e53.

He, M., Subramanian, R., Bangs, F., Omelchenko, T., Liem, K.F., Kapoor, T.M., and Anderson, K.V. (2014). The kinesin-4 protein Kif7 regulates mammalian Hedgehog signalling by organizing the cilium tip compartment. Nature Cell Biology 16, 663–672.

Hebrok, M., Kim, S.K., Jacques, B.S., McMahon, A.P., and Melton, D.A. (2000). Regulation of pancreas development by hedgehog signaling. Development 127, 4905–4913.

Hert, J., Willett, P., Wilton, D.J., Acklin, P., Azzaoui, K., Jacoby, E., and Schuffenhauer, A. (2004). Comparison of topological descriptors for similarity-based virtual screening using multiple bioactive reference structures. Org. Biomol. Chem. 2, 3256–3266.

Von Hoff, D.D., LoRusso, P.M., Rudin, C.M., Reddy, J.C., Yauch, R.L., Tibes, R., Weiss, G.J., Borad, M.J., Hann, C.L., Brahmer, J.R., et al. (2009). Inhibition of the Hedgehog Pathway in Advanced Basal-Cell Carcinoma. N. Engl. J. Med. 361, 1164–1172.

Hu, M.C. (2006). GLI3-dependent transcriptional repression of Gli1, Gli2 and kidney patterning genes disrupts renal morphogenesis. Development 133, 569–578.

Huangfu, D., and Anderson, K.V. (2005). Cilia and Hedgehog responsiveness in the mouse. Proc. Natl. Acad. Sci. U.S.A. 102, 11325–11330.

Huangfu, D., Liu, A., Rakeman, A.S., Murcia, N.S., Niswander, L., and Anderson, K.V. (2003). Hedgehog signalling in the mouse requires intraflagellar transport proteins. Nature 426, 83–87.

Humke, E.W., Dorn, K.V., Milenkovic, L., Scott, M.P., and Rohatgi, R. (2010). The output of Hedgehog signaling is controlled by the dynamic association between Suppressor of Fused and the Gli proteins. Genes & Development 24, 670–682.

Huntzicker, E.G., Estay, I.S., Zhen, H., Lokteva, L.A., Jackson, P.K., and Oro, A.E. (2006). Dual degradation signals control Gli protein stability and tumor formation. Genes & Development 20, 276–281.

Ingham, P.W., and McMahon, A.P. (2001). Hedgehog signaling in animal development: paradigms and principles. Genes & Development 15, 3059–3087.

Irwin, J.J., and Shoichet, B.K. (2005). ZINC--a free database of commercially available compounds for virtual screening. J Chem Inf Model 45, 177–182.

Irwin, J.J., Sterling, T., Mysinger, M.M., Bolstad, E.S., and Coleman, R.G. (2012). ZINC: a free tool to discover chemistry for biology. J Chem Inf Model 52, 1757–1768.

Janda, C.Y., Waghray, D., Levin, A.M., Thomas, C., and Garcia, K.C. (2012). Structural basis of Wnt recognition by Frizzled. Science 337, 59–64.

Li, J., Zhu, T., Cramer, C.J., and Truhlar, D.G. (1998). New Class IV Charge Model for Extracting Accurate Partial Charges from Wave Functions (American Chemical Society).

Johnson, R.L., Rothman, A.L., Xie, J., Goodrich, L.V., Bare, J.W., Bonifas, J.M., Quinn, A.G., Myers, R.M., Cox, D.R., Epstein, E.H., et al. (1996). Human homolog of patched, a candidate gene for the basal cell nevus syndrome. Science 272, 1668–1671.

Katritch, V., Jaakola, V.-P., Lane, J.R., Lin, J., Ijzerman, A.P., Yeager, M., Kufareva, I., Stevens, R.C., and Abagyan, R. (2010). Structure-based discovery of novel chemotypes for adenosine A(2A) receptor antagonists. J. Med. Chem. 53, 1799–1809.

Kenney, A.M., and Rowitch, D.H. (2000). Sonic hedgehog promotes G(1) cyclin expression and sustained cell cycle progression in mammalian neuronal precursors. Mol. Cell. Biol. 20, 9055–9067.

Kenney, A.M., Cole, M.D., and Rowitch, D.H. (2003). Nmyc upregulation by sonic hedgehog signaling promotes proliferation in developing cerebellar granule neuron precursors. Development 130, 15–28.

Kim, D.J., Kim, J., Spaunhurst, K., Montoya, J., Khodosh, R., Chandra, K., Fu, T., Gilliam, A., Molgo, M., Beachy, P.A., et al. (2014a). Open-Label, Exploratory Phase II Trial of Oral Itraconazole for the Treatment of Basal Cell Carcinoma. Journal of Clinical Oncology 32, 745–751.

Kim, J., Aftab, B.T., Tang, J.Y., Kim, D., Lee, A.H., Rezaee, M., Kim, J., Chen, B., King, E.M., Borodovsky, A., et al. (2013). Itraconazole and Arsenic Trioxide Inhibit Hedgehog Pathway Activation and Tumor Growth Associated with Acquired Resistance to Smoothened Antagonists. Cancer Cell 23, 23–34.

Kim, J., Tang, J.Y., Gong, R., Kim, J., Lee, J.J., Clemons, K.V., Chong, C.R., Chang, K.S., Fereshteh, M., Gardner, D., et al. (2010). Itraconazole, a Commonly Used Antifungal that Inhibits Hedgehog Pathway Activity and Cancer Growth. Cancer Cell 17, 388–399.

Kim, J., Kato, M., and Beachy, P.A. (2009). Gli2 trafficking links Hedgehog-dependent activation of Smoothened in the primary cilium to transcriptional activation in the nucleus. Proceedings of the National Academy of Sciences 106, 21666–21671.

Kim, M.-S., Pinto, S.M., Getnet, D., Nirujogi, R.S., Manda, S.S., Chaerkady, R., Madugundu, A.K., Kelkar, D.S., Isserlin, R., Jain, S., et al. (2014b). A draft map of the human proteome. Nature 509, 575–581.

Kingston, R.E., Chen, C.A., and Rose, J.K. (2003). Calcium Phosphate Transfection. Current Protocols in ….

Kinzler, K.W., and Vogelstein, B. (1990). The GLI gene encodes a nuclear protein which binds specific sequences in the human genome. Mol. Cell. Biol. 10, 634–642.

Kise, Y., Morinaka, A., Teglund, S., and Miki, H. (2009). Sufu recruits GSK3β for efficient processing of Gli3. Biochemical and Biophysical Research Communications 387, 569–574.

Kogerman, P., Grimm, T., Kogerman, L., Krause, D., Undén, A.B., Sandstedt, B., Toftgård, R., and Zaphiropoulos, P.G. (1999). Mammalian suppressor-of-fused modulates nuclear-cytoplasmic shuttling of Gli-1. Nature Cell Biology 1, 312–319.

Kolb, P., Ferreira, R.S., Irwin, J.J., and Shoichet, B.K. (2009a). Docking and chemoinformatic screens for new ligands and targets. Current Opinion in Biotechnology 20, 429–436.

Kolb, P., Rosenbaum, D.M., Irwin, J.J., Fung, J.J., Kobilka, B.K., and Shoichet, B.K. (2009b). Structure-based discovery of beta2-adrenergic receptor ligands. Proceedings of the National Academy of Sciences 106, 6843–6848.

Kool, M., Korshunov, A., Remke, M., Jones, D.T.W., Schlanstein, M., Northcott, P.A., Cho, Y.-J., Koster, J., Schouten-van Meeteren, A., van Vuurden, D., et al. (2012). Molecular subgroups of medulloblastoma: an international meta-analysis of transcriptome, genetic aberrations, and clinical data of WNT, SHH, Group 3, and Group 4 medulloblastomas. Acta Neuropathol 123, 473–484.

Kovacs, J.J., Whalen, E.J., Liu, R., Xiao, K., Kim, J., and Chen, M. (2008). β-Arrestin–mediated localization of Smoothened to the primary cilium. Science.

Koyabu, Y., Nakata, K., Mizugishi, K., Aruga, J., and Mikoshiba, K. (2001). Physical and functional interactions between Zic and Gli proteins. J. Biol. Chem. 276, 6889–6892.

Kufareva, I., Katritch, V., Participants of GPCR Dock 2013, Stevens, R.C., and Abagyan, R. (2014). Advances in GPCR modeling evaluated by the GPCR Dock 2013 assessment: meeting new challenges. Structure 22, 1120–1139.

Kuzmichev, A., Zhang, Y., Erdjument-Bromage, H., Tempst, P., and Reinberg, D. (2002). Role of the Sin3-histone deacetylase complex in growth regulation by the candidate tumor suppressor p33(ING1). Mol. Cell. Biol. 22, 835–848.

Lai, C.S.L., Fisher, S.E., Hurst, J.A., Vargha-Khadem, F., and Monaco, A.P. (2001). A forkhead-domain gene is mutated in a severe speech and language disorder. Nature 413, 519–523.

Lai, C.S.L., Gerrelli, D., Monaco, A.P., Fisher, S.E., and Copp, A.J. (2003). FOXP2 expression during brain development coincides with adult sites of pathology in a severe speech and language disorder. Brain 126, 2455–2462.

Langmead, C.J., Andrews, S.P., Congreve, M., Errey, J.C., Hurrell, E., Marshall, F.H., Mason, J.S., Richardson, C.M., Robertson, N., Zhukov, A., et al. (2012). Identification of novel adenosine A(2A) receptor antagonists by virtual screening. J. Med. Chem. 55, 1904–1909.

Lau, J., Kawahira, H., and Hebrok, M. (2006). Hedgehog signaling in pancreas development and disease. Cell. Mol. Life Sci. 63, 642–652.

Lee, E.Y., Ji, H., Ouyang, Z., Zhou, B., Ma, W., Vokes, S.A., McMahon, A.P., Wong, W.H., and Scott, M.P. (2010). Hedgehog pathway-regulated gene networks in cerebellum development and tumorigenesis. Proc. Natl. Acad. Sci. U.S.A. 107, 9736–9741.

Lee, J.J., von Kessler, D.P., Parks, S., and Beachy, P.A. (1992). Secretion and localized transcription suggest a role in positional signaling for products of the segmentation gene hedgehog. Cell 71, 33–50.

Liu, A., Wang, B., and Niswander, L.A. (2005). Mouse intraflagellar transport proteins regulate both the activator and repressor functions of Gli transcription factors. Development 132, 3103–3111.

Liu, J., Pan, S., Hsieh, M.H., Ng, N., Sun, F., Wang, T., Kasibhatla, S., Schuller, A.G., Li, A.G., Cheng, D., et al. (2013). Targeting Wnt-driven cancer through the inhibition of Porcupine by LGK974. Proceedings of the National Academy of Sciences 110, 20224–20229.

Low, W.-C., Wang, C., Pan, Y., Huang, X.-Y., Chen, J.K., and Wang, B. (2008). The decoupling of Smoothened from Gαi proteins has little effect on Gli3 protein processing and Hedgehog-regulated chick neural tube patterning. Developmental Biology 321, 188–196.

Lui, T.T.H., Lacroix, C., Ahmed, S.M., Goldenberg, S.J., Leach, C.A., Daulat, A.M., and Angers, S. (2011). The ubiquitin-specific protease USP34 regulates axin stability and Wnt/β-catenin signaling. Mol. Cell. Biol. 31, 2053–2065.

Ma, W., Noble, W.S., and Bailey, T.L. (2014). Motif-based analysis of large nucleotide data sets using MEME-ChIP. Nature Protocols 9, 1428–1450.

Mahony, S., Auron, P.E., and Benos, P.V. (2007). DNA familial binding profiles made easy: comparison of various motif alignment and clustering strategies. PLoS Comput. Biol. 3, e61.

Manetti, F., Faure, H., Roudaut, H., Gorojankina, T., Traiffort, E., Schoenfelder, A., Mann, A., Solinas, A., Taddei, M., and Ruat, M. (2010). Virtual Screening-Based Discovery and Mechanistic Characterization of the Acylthiourea MRT-10 Family as Smoothened Antagonists. Molecular Pharmacology 78, 658–665.

Mao, J., Ligon, K.L., Rakhlin, E.Y., Thayer, S.P., Bronson, R.T., Rowitch, D., and McMahon, A.P. (2006). A novel somatic mouse model to survey tumorigenic potential applied to the Hedgehog pathway. Cancer Research 66, 10171–10178.

Marcotullio, L.D., Ferretti, E., Greco, A., De Smaele, E., Po, A., Sico, M.A., Alimandi, M., Giannini, G., Maroder, M., Screpanti, I., et al. (2006). Numb is a suppressor of Hedgehog signalling and targets Gli1 for Itch-dependent ubiquitination. Nature Cell Biology 8, 1415–1423.

Matise, M.P., Epstein, D.J., Park, H.L., Platt, K.A., and Joyner, A.L. (1998). Gli2 is required for induction of floor plate and adjacent cells, but not most ventral neurons in the mouse central nervous system. Development 125, 2759–2770.

McGovern, S.L., Caselli, E., Grigorieff, N., and Shoichet, B.K. (2002). A common mechanism underlying promiscuous inhibitors from virtual and high-throughput screening. J. Med. Chem. 45, 1712–1722.

McLean, C.Y., Bristor, D., Hiller, M., Clarke, S.L., Schaar, B.T., Lowe, C.B., Wenger, A.M., and Bejerano, G. (2010). GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28, 495–501.

Meloni, A.R., Fralish, G.B., Kelly, P., Salahpour, A., Chen, J.K., Wechsler-Reya, R.J., Lefkowitz, R.J., and Caron, M.G. (2006). Smoothened Signal Transduction Is Promoted by G Protein-Coupled Receptor Kinase 2. Mol. Cell. Biol. 26, 7550–7560.

Meng, E.C., Shoichet, B.K., and Kuntz, I.D. (1992). Automated docking with grid-based energy evaluation. Journal of Computational Chemistry 13, 505–524.

Méthot, N., and Basler, K. (2001). An absolute requirement for Cubitus interruptus in Hedgehog signaling. Development 128, 733–742.

Méthot, N., and Basler, K. (1999). Hedgehog Controls Limb Development by Regulating the Activities of Distinct Transcriptional Activator and Repressor Forms of Cubitus interruptus. Cell 96, 819–831.

Mizugishi, K. (2000). Molecular Properties of Zic Proteins as Transcriptional Regulators and Their Relationship to GLI Proteins. Journal of Biological Chemistry 276, 2180–2188.

Mohler, J., and Vani, K. (1992). Molecular organization and embryonic expression of the hedgehog gene involved in cell-cell communication in segmental patterning of Drosophila. Development 115, 957–971.

Morin, P.J., Sparks, A.B., Korinek, V., Barker, N., Clevers, H., Vogelstein, B., and Kinzler, K.W. (1997). Activation of β-catenin-Tcf signaling in colon cancer by mutations in β-catenin or APC. Science 275, 1787–1790.

Muchmore, S.W., Debe, D.A., Metz, J.T., Brown, S.P., Martin, Y.C., and Hajduk, P.J. (2008). Application of Belief Theory to Similarity Data Fusion for Use in Analog Searching and Lead Hopping. Journal of Chemical … 48, 941–948.

Myers, B.R., Sever, N., Chong, Y.C., Kim, J., Belani, J.D., Rychnovsky, S., Bazan, J.F., and Beachy, P.A. (2013). Hedgehog pathway modulation by multiple lipid binding sites on the smoothened effector of signal response. Developmental Cell 26, 346–357.

Mysinger, M.M., and Shoichet, B.K. (2010). Rapid context-dependent ligand desolvation in molecular docking. J Chem Inf Model 50, 1561–1573.

Mysinger, M.M., Carchia, M., Irwin, J.J., and Shoichet, B.K. (2012a). Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J. Med. Chem. 55, 6582–6594.

Mysinger, M.M., Weiss, D.R., Ziarek, J.J., Gravel, S., Doak, A.K., Karpiak, J., Heveker, N., Shoichet, B.K., and Volkman, B.F. (2012b). Structure-based ligand discovery for the protein-protein interface of chemokine receptor CXCR4. Proceedings of the National Academy of Sciences 109, 5517–5522.

Nachtergaele, S., Whalen, D.M., Mydock, L.K., Zhao, Z., Malinauskas, T., Krishnan, K., Ingham, P.W., Covey, D.F., Siebold, C., and Rohatgi, R. (2013). Structure and function of the Smoothened extracellular domain in vertebrate Hedgehog signaling. eLife 2, e01340–e01340.

Nachtergaele, S., Mydock, L.K., Krishnan, K., Rammohan, J., Schlesinger, P.H., Covey, D.F., and Rohatgi, R. (2012). Oxysterols are allosteric activators of the oncoprotein Smoothened. Nat. Chem. Biol. 8, 211–220.

Nedelcu, D., Liu, J., Xu, Y., Jao, C., and Salic, A. (2013). Oxysterol binding to the extracellular domain of smoothened in Hedgehog signaling. Nat. Chem. Biol. 9, 557–564.

Nieuwenhuis, E., Motoyama, J., Barnfield, P.C., Yoshikawa, Y., Zhang, X., Mo, R., Crackower, M.A., and Hui, C.-C. (2006). Mice with a targeted mutation of patched2 are viable but develop alopecia and epidermal hyperplasia. Mol. Cell. Biol. 26, 6609–6622.

Ogden, S.K., Fei, D.L., Schilling, N.S., Ahmed, Y.F., Hwa, J., and Robbins, D.J. (2008). G protein Gαi functions immediately downstream of Smoothened in Hedgehog signalling. Nature 456, 967–970.

Paces-Fessy, M., Boucher, D., Petit, E., Paute-Briand, S., and Blanchet-Tournier, M.-F. (2004). The negative regulator of Gli, Suppressor of fused (Sufu), interacts with SAP18, Galectin3 and other nuclear proteins. Biochem. J. 378, 353–362.

Pan, Y., and Wang, B. (2007). A novel protein-processing domain in Gli2 and Gli3 differentially blocks complete protein degradation by the proteasome. J. Biol. Chem. 282, 10846–10852.

Pan, Y., Bai, C.B., Joyner, A.L., and Wang, B. (2006). Sonic hedgehog signaling regulates Gli2 transcriptional activity by suppressing its processing and degradation. Mol. Cell. Biol. 26, 3365–3377.

Park, H.L., Bai, C., Platt, K.A., Matise, M.P., Beeghly, A., Hui, C.C., Nakashima, M., and Joyner, A.L. (2000). Mouse Gli1 mutants are viable but have defects in SHH signaling in combination with a Gli2 mutation. Development 127, 1593–1605.

Peterson, K.A., Nishi, Y., Ma, W., Vedenko, A., Shokri, L., Zhang, X., McFarlane, M., Baizabal, J.-M., Junker, J.P., van Oudenaarden, A., et al. (2012). Neural-specific Sox2 input and differential Gli-binding affinity provide context and positional information in Shh-directed neural patterning. Genes & Development 26, 2802–2816.

Pietsch, T., Waha, A., Koch, A., Kraus, J., Albrecht, S., Tonn, J., Sörensen, N., Berthold, F., Henk, B., Schmandt, N., et al. (1997). Medulloblastomas of the desmoplastic variant carry mutations of the human homologue of Drosophila patched. Cancer Research 57, 2085–2088.

Polizio, A.H., Chinchilla, P., Chen, X., Kim, S., Manning, D.R., and Riobo, N.A. (2011). Heterotrimeric Gi Proteins Link Hedgehog Signaling to Activation of Rho Small GTPases to Promote Fibroblast Migration. Journal of Biological Chemistry 286, 19589–19596.

Putoux, A., Thomas, S., Coene, K.L.M., Davis, E.E., Alanay, Y., Ogur, G., Uz, E., Buzas, D., Gomes, C., Patrier, S., et al. (2011). KIF7 mutations cause fetal hydrolethalus and acrocallosal syndromes. Nature Genetics 43, 601–606.

Raffel, C., Jenkins, R.B., Frederick, L., Hebrink, D., Alderete, B., Fults, D.W., and James, C.D. (1997). Sporadic medulloblastomas contain PTCH mutations. Cancer Research 57, 842–845.

Ramsden, N.L., Buetow, L., Dawson, A., Kemp, L.A., Ulaganathan, V., Brenk, R., Klebe, G., and Hunter, W.N. (2009). A Structure-Based Approach to Ligand Discovery for 2C-Methyl-D-erythritol-2,4-cyclodiphosphate Synthase: A Target for Antimicrobial Therapy. J. Med. Chem. 52, 2531–2542.

Rana, R., Carroll, C.E., Lee, H.-J., Bao, J., Marada, S., Grace, C.R.R., Guibao, C.D., Ogden, S.K., and Zheng, J.J. (2013). Structural insights into the role of the Smoothened cysteine-rich domain in Hedgehog signalling. Nature Communications 4, 2965.

Regard, J.B., Malhotra, D., Gvozdenovic-Jeremic, J., Josey, M., Chen, M., Weinstein, L.S., Lu, J., Shore, E.M., Kaplan, F.S., and Yang, Y. (2013). Activation of Hedgehog signaling by loss of GNAS causes heterotopic ossification. Nature Medicine 19, 1505–1512.

Reimand, J., Arak, T., and Vilo, J. (2011). g:Profiler--a web server for functional interpretation of gene lists (2011 update). Nucleic Acids Res. 39, W307–W315.

Repasky, M.P., Murphy, R.B., Banks, J.L., Greenwood, J.R., Tubert-Brohman, I., Bhat, S., and Friesner, R.A. (2012). Docking performance of the glide program as evaluated on the Astex and DUD datasets: a complete set of glide SP results and selected results for a new scoring function integrating WaterMap and glide. J. Comput. Aided Mol. Des. 26, 787–799.

Riobo, N.A., Saucy, B., DiLizio, C., and Manning, D.R. (2006). Activation of heterotrimeric G proteins by Smoothened. Proceedings of the National Academy of Sciences 103, 12607–12612.

Rizkallah, R., Alexander, K.E., and Hurt, M.M. (2014). Global mitotic phosphorylation of C2H2 zinc finger protein linker peptides. Cell Cycle 10, 3327–3336.

Rogers, D.J., and Tanimoto, T.T. (1960). A Computer Program for Classifying Plants. Science 132, 1115–1118.

Rohatgi, R., Milenkovic, L., and Scott, M.P. (2007). Patched1 Regulates Hedgehog Signaling at the Primary Cilium. Science 317, 372–376.

Roughley, S., Wright, L., Brough, P., Massey, A., and Hubbard, R.E. (2012). Hsp90 inhibitors and drugs from fragment and virtual screening. Top Curr Chem 317, 61–82.

Roux, K.J., Kim, D.I., Raida, M., and Burke, B. (2012). A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of Cell Biology 196, 801–810.

Rudin, C.M., Hann, C.L., Laterra, J., Yauch, R.L., Callahan, C.A., Fu, L., Holcomb, T., Stinson, J., Gould, S.E., Coleman, B., et al. (2009). Treatment of Medulloblastoma with Hedgehog Pathway Inhibitor GDC-0449. N. Engl. J. Med. 361, 1173–1178.

Ruiz i Altaba, A., Palma, V., and Dahmane, N. (2002). Hedgehog-Gli signalling and the growth of the brain. Nature Reviews Neuroscience 3, 24–33.

Sager, G., Ørvoll, E.Ø., Lysaa, R.A., Kufareva, I., Abagyan, R., and Ravna, A.W. (2012). Novel cGMP efflux inhibitors identified by virtual ligand screening (VLS) and confirmed by experimental studies. J. Med. Chem. 55, 3049–3057.

Sasaki, H., Nishizaki, Y., Hui, C., Nakafuku, M., and Kondoh, H. (1999). Regulation of Gli2 and Gli3 activities by an amino-terminal repression domain: implication of Gli2 and Gli3 as primary mediators of Shh signaling. Development 126, 3915–3924.

Sassano, M.F., Doak, A.K., Roth, B.L., and Shoichet, B.K. (2013). Colloidal aggregation causes inhibition of G protein-coupled receptors. J. Med. Chem. 56, 2406–2414.

Savic, D., Partridge, E.C., Newberry, K.M., Smith, S.B., Meadows, S.K., Roberts, B.S., Mackiewicz, M., Mendenhall, E.M., and Myers, R.M. (2015). CETCh-seq: CRISPR epitope tagging ChIP-seq of DNA-binding proteins. Genome Research 25, 1581–1589.

Seidler, J., McGovern, S.L., Doman, T.N., and Shoichet, B.K. (2003). Identification and prediction of promiscuous aggregating inhibitors among known drugs. J. Med. Chem. 46, 4477–4486.

Sekulic, A., Migden, M.R., Oro, A.E., Dirix, L., Lewis, K.D., Hainsworth, J.D., Solomon, J.A., Yoo, S., Arron, S.T., Friedlander, P.A., et al. (2012). Efficacy and safety of vismodegib in advanced basal-cell carcinoma. N. Engl. J. Med. 366, 2171–2179.

Sela, I., Golan, G., Strajbl, M., Rivenzon-Segal, D., Bar-Haim, S., Bloch, I., Inbal, B., Shitrit, A., Ben-Zeev, E., Fichman, M., et al. (2010). G protein coupled receptors - in silico drug discovery and design. Curr Top Med Chem 10, 638–656.

Senée, V., Chelala, C., Duchatelet, S., Feng, D., Blanc, H., Cossec, J.-C., Charon, C., Nicolino, M., Boileau, P., Cavener, D.R., et al. (2006). Mutations in GLIS3 are responsible for a rare syndrome with neonatal diabetes mellitus and congenital hypothyroidism. Nature Genetics 38, 682–687.

Sharp, K.A. (1995). Polyelectrolyte electrostatics: Salt dependence, entropic, and enthalpic contributions to free energy in the nonlinear Poisson–Boltzmann model. Biopolymers 36, 227–243.

Shimura, T., Takenaka, Y., Tsutsumi, S., Hogan, V., Kikuchi, A., and Raz, A. (2004). Galectin-3, a novel binding partner of beta-catenin. Cancer Research 64, 6363–6367.

Shoichet, B.K., and Kuntz, I.D. (1993). Matching chemistry and shape in molecular docking. Protein Eng. 6, 723–732.

Shu, W., Cho, J.Y., Jiang, Y., Zhang, M., Weisz, D., Elder, G.A., Schmeidler, J., De Gasperi, R., Sosa, M.A.G., Rabidou, D., et al. (2005). Altered ultrasonic vocalization in mice with a disruption in the Foxp2 gene. Proc. Natl. Acad. Sci. U.S.A. 102, 9643–9648.

Smith, D.C., Gordon, M., Messersmith, W., Chugh, R., Mendelson, D., Dupont, J., Stagg, R., Kapoun, A.M., Xu, L., Brachmann, R.K., et al. (2014). Abstract B79: A first-in-human Phase 1 study of anti-cancer stem cell (CSC) agent OMP-54F28 (FZD8-Fc) targeting the WNT pathway in patients with advanced solid tumors. Molecular Cancer Therapeutics 12, B79–B79.

Sousa, S.F., Ribeiro, A.J.M., Coimbra, J.T.S., Neves, R.P.P., Martins, S.A., Moorthy, N.S.H.N., Fernandes, P.A., and Ramos, M.J. (2013). Protein-Ligand Docking in the New Millennium – A Retrospective of 10 Years in the Field. Curr. Med. Chem. 20, 2296–2314.

Sterling, T., and Irwin, J.J. (2015). ZINC 15 - Ligand Discovery for Everyone. J Chem Inf Model 55, 2324–2337.

Stone, D.M., Murone, M., Luoh, S., Ye, W., Armanini, M.P., Gurney, A., Phillips, H., Brush, J., Goddard, A., de Sauvage, F.J., et al. (1999). Characterization of the human suppressor of fused, a negative regulator of the zinc-finger transcription factor Gli. Journal of Cell Science 112 (Pt 23), 4437–4448.

Svärd, J., Henricson, K.H., Persson-Lek, M., Rozell, B., Lauth, M., Bergström, A., Ericson, J., Toftgård, R., and Teglund, S. (2006). Genetic Elimination of Suppressor of Fused Reveals an Essential Repressor Function in the Mammalian Hedgehog Signaling Pathway. Developmental Cell 10, 187–197.

Tabata, T., Eaton, S., and Kornberg, T.B. (1992). The Drosophila hedgehog gene is expressed specifically in posterior compartment cells and is a target of engrailed regulation. Genes & Development 6, 2635–2645.

Taipale, J., Cooper, M.K., Maiti, T., and Beachy, P.A. (2002). Patched acts catalytically to suppress the activity of Smoothened. Nature 418, 892–897.

Tang, J.Y., Mackay-Wiggan, J.M., Aszterbaum, M., Yauch, R.L., Lindgren, J., Chang, K., Coppola, C., Chanana, A.M., Marji, J., Bickers, D.R., et al. (2012). Inhibiting the hedgehog pathway in patients with the basal-cell nevus syndrome. N. Engl. J. Med. 366, 2180–2188.

Taylor, M.D., Northcott, P.A., Korshunov, A., Remke, M., Cho, Y.-J., Clifford, S.C., Eberhart, C.G., Parsons, D.W., Rutkowski, S., Gajjar, A., et al. (2012). Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol 123, 465–472.

Tempé, D., Casas, M., Karaz, S., Blanchet-Tournier, M.-F., and Concordet, J.-P. (2006). Multisite protein kinase A and glycogen synthase kinase 3beta phosphorylation leads to Gli3 ubiquitination by SCFbetaTrCP. Mol. Cell. Biol. 26, 4316–4326.

Teperino, R., Amann, S., Bayer, M., McGee, S.L., Loipetzberger, A., Connor, T., Jaeger, C., Kammerer, B., Winter, L., Wiche, G., et al. (2012). Hedgehog Partial Agonism Drives Warburg-like Metabolism in Muscle and Brown Fat. Cell 151, 414–426.

Thomas-Chollier, M., Defrance, M., Medina-Rivera, A., Sand, O., Herrmann, C., Thieffry, D., and van Helden, J. (2011). RSAT 2011: regulatory sequence analysis tools. Nucleic Acids Res. 39, W86–W91.

Tian, H., Callahan, C.A., DuPree, K.J., Darbonne, W.C., Ahn, C.P., Scales, S.J., and de Sauvage, F.J. (2009). Hedgehog signaling is restricted to the stromal compartment during pancreatic carcinogenesis. Proceedings of the National Academy of Sciences 106, 4254–4259.

Tosh, D.K., Phan, K., Gao, Z.-G., Gakh, A.A., Xu, F., Deflorian, F., Abagyan, R., Stevens, R.C., Jacobson, K.A., and Katritch, V. (2012). Optimization of adenosine 5'-carboxamide derivatives as adenosine receptor agonists using structure-based ligand design and fragment screening. J. Med. Chem. 55, 4297–4308.

Tukachinsky, H., Lopez, L.V., and Salic, A. (2010). A mechanism for vertebrate Hedgehog signaling: recruitment to cilia and dissociation of SuFu-Gli protein complexes. The Journal of Cell Biology 191, 415–428.

Uhlén, M., Fagerberg, L., Hallström, B.M., Lindskog, C., Oksvold, P., Mardinoglu, A., Sivertsson, Å., Kampf, C., Sjöstedt, E., Asplund, A., et al. (2015). Proteomics. Tissue-based map of the human proteome. Science 347, 1260419.

Vaillant, C., and Monard, D. (2009). SHH Pathway and Cerebellar Development. Cerebellum 8, 291–301.

Vasanth, S., ZeRuth, G., Kang, H.S., and Jetten, A.M. (2011). Identification of Nuclear Localization, DNA Binding, and Transactivating Mechanisms of Kruppel-like Zinc Finger Protein Gli-Similar 2 (Glis2). Journal of Biological Chemistry 286, 4749–4759.

Vokes, S.A., Ji, H., Wong, W.H., and McMahon, A.P. (2008). A genome-scale analysis of the cis-regulatory circuitry underlying sonic hedgehog-mediated patterning of the mammalian limb. Genes & Development 22, 2651–2663.

Vokes, S.A., Ji, H., McCuine, S., Tenzen, T., Giles, S., Zhong, S., Longabaugh, W.J.R., Davidson, E.H., Wong, W.H., and McMahon, A.P. (2007). Genomic characterization of Gli-activator targets in sonic hedgehog-mediated neural patterning. Development 134, 1977–1989.

Voz, M.L., Agten, N.S., Van de Ven, W.J., and Kas, K. (2000). PLAG1, the main translocation target in pleomorphic adenoma of the salivary glands, is a positive regulator of IGF-II. Cancer Research 60, 106–113.

Wacker, D., Wang, C., Katritch, V., Han, G.W., Huang, X.-P., Vardy, E., McCorvy, J.D., Jiang, Y., Chu, M., Siu, F.Y., et al. (2013). Structural features for functional selectivity at serotonin receptors. Science 340, 615–619.

Waldrip, Z.J., Byrum, S.D., Storey, A.J., Gao, J., Byrd, A.K., Mackintosh, S.G., Wahls, W.P., Taverna, S.D., Raney, K.D., and Tackett, A.J. (2014). A CRISPR-based approach for proteomic analysis of a single genomic locus. Epigenetics 9, 1207–1211.

Wang, C., Wu, H., Evron, T., Vardy, E., Han, G.W., Huang, X.-P., Hufeisen, S.J., Mangano, T.J., Urban, D.J., Katritch, V., et al. (2014). Structural basis for Smoothened receptor modulation and chemoresistance to anticancer drugs. Nature Communications 5, 4355.

Wang, C., Wu, H., Katritch, V., Han, G.W., Huang, X.-P., Liu, W., Siu, F.Y., Roth, B.L., Cherezov, V., and Stevens, R.C. (2013). Structure of the human smoothened receptor bound to an antitumour agent. Nature 497, 338–343.

Ward, R.J., Lee, L., Graham, K., Satkunendran, T., Yoshikawa, K., Ling, E., Harper, L., Austin, R., Nieuwenhuis, E., Clarke, I.D., et al. (2009). Multipotent CD15+ Cancer Stem Cells in Patched-1-Deficient Mouse Medulloblastoma. Cancer Research 69, 4682–4690.

Weierstall, U., James, D., Wang, C., White, T.A., Wang, D., Liu, W., Spence, J.C.H., Bruce Doak, R., Nelson, G., Fromme, P., et al. (2014). Lipidic cubic phase injector facilitates membrane protein serial femtosecond crystallography. Nature Communications 5, 3309.

Weiss, D.R., Ahn, S., Sassano, M.F., Kleist, A., Zhu, X., Strachan, R., Roth, B.L., Lefkowitz, R.J., and Shoichet, B.K. (2013). Conformation guides molecular efficacy in docking screens of activated β-2 adrenergic G protein coupled receptor. ACS Chem. Biol. 8, 1018–1026.

Wen, X., Lai, C.K., Evangelista, M., Hongo, J.A., de Sauvage, F.J., and Scales, S.J. (2010). Kinetics of Hedgehog-Dependent Full-Length Gli3 Accumulation in Primary Cilia and Subsequent Degradation. Mol. Cell. Biol. 30, 1910–1922.

Wilhelm, M., Schlegl, J., Hahne, H., Gholami, A.M., Lieberenz, M., Savitski, M.M., Ziegler, E., Butzmann, L., Gessulat, S., Marx, H., et al. (2015). Mass-spectrometry-based draft of the human proteome. Nature 509, 582–587.

Yang, Z.-J., Ellis, T., Markant, S.L., Read, T.-A., Kessler, J.D., Bourboulas, M., Schüller, U., Machold, R., Fishell, G., Rowitch, D.H., et al. (2008). Medulloblastoma Can Be Initiated by Deletion of Patched in Lineage-Restricted Progenitors or Stem Cells. Cancer Cell 14, 135–145.

Yauch, R.L., Dijkgraaf, G.J.P., Alicke, B., Januario, T., Ahn, C.P., Holcomb, T., Pujara, K., Stinson, J., Callahan, C.A., Tang, T., et al. (2009). Smoothened Mutation Confers Resistance to a Hedgehog Pathway Inhibitor in Medulloblastoma. Science 326, 572–574.

ZeRuth, G.T., Yang, X.P., and Jetten, A.M. (2011). Modulation of the Transactivation Function and Stability of Kruppel-like Zinc Finger Protein Gli-similar 3 (Glis3) by Suppressor of Fused. Journal of Biological Chemistry 286, 22077–22089.

Zhang, F., Nakanishi, G., Kurebayashi, S., Yoshino, K., Perantoni, A., Kim, Y.-S., and Jetten, A.M. (2002). Characterization of Glis2, a novel gene encoding a Gli-related, Krüppel-like transcription factor with transactivation and repressor functions. Roles in kidney development and neurogenesis. J. Biol. Chem. 277, 10139–10149.

Zhang, Q., Shi, Q., Chen, Y., Yue, T., Li, S., Wang, B., and Jiang, J. (2009). Multiple Ser/Thr-rich degrons mediate the degradation of Ci/Gli by the Cul3-HIB/SPOP E3 ubiquitin ligase. Proc. Natl. Acad. Sci. U.S.A. 106, 21191–21196.

Zhang, Y., Liu, T., Meyer, C.A., Eeckhoute, J., Johnson, D.S., Bernstein, B.E., Nussbaum, C., Myers, R.M., Brown, M., Li, W., et al. (2008). Model-based Analysis of ChIP-Seq (MACS). Genome Biol 9, R137.

Appendices

Appendix 1

[Figure, panels A and B: Western blots of total, cytoplasmic and nuclear fractions from MEF, Ptch1-/-, Sufu-/- and Sufu-/-; Venus-SUFU cells. Panel A is probed for GLI2, ZFP629, Lamin B and β-tubulin; panel B is probed for ZFP629, β-tubulin and Lamin B.]

Independent cellular fractionation of MEFs (A-D). Lamin B: nuclear loading control; β-tubulin: cytoplasmic loading control. Equal protein amounts were loaded for each cell line. Membranes were incubated with the indicated antibodies simultaneously. Exposures on film were also done simultaneously for each membrane/cell line.

[Figure, panels C and D: Western blots of total nuclear extracts from wt, Sufu-/-, Ptch1-/- and Sufu-/-; Venus-Sufu MEFs, probed for ZFP629 and Lamin B. Panel D also shows a shorter exposure.]

Total nuclear fractions for the indicated MEFs. Total nuclear extracts were prepared for each cell line. Equal protein amounts were loaded for each cell line. Membranes were incubated with the indicated antibodies simultaneously. Exposures on film were also done simultaneously for each membrane/cell line.