Hum Genet DOI 10.1007/s00439-015-1591-0

ORIGINAL INVESTIGATION

The nuclear localization pattern and interaction partners of GTF2IRD1 demonstrate a role in chromatin regulation

Paulina Carmona‑Mora1 · Jocelyn Widagdo2 · Florence Tomasetig1 · Cesar P. Canales1 · Yeojoon Cha1 · Wei Lee1 · Abdullah Alshawaf3 · Mirella Dottori3 · Renee M. Whan4 · Edna C. Hardeman1 · Stephen J. Palmer1

Received: 11 February 2015 / Accepted: 4 August 2015 © Springer-Verlag Berlin Heidelberg 2015

Abstract

GTF2IRD1 is one of the three members of the GTF2I family, clustered on chromosome 7 within a 1.8 Mb region that is prone to duplications and deletions in humans. Hemizygous deletions cause Williams–Beuren syndrome (WBS) and duplications cause WBS duplication syndrome. These copy number variations disturb a variety of developmental systems and neurological functions. Human mapping data and analyses of knockout mice show that GTF2IRD1 and GTF2I underpin the craniofacial abnormalities, mental retardation, visuospatial deficits and hypersociability of WBS. However, the cellular role of the GTF2IRD1 protein is poorly understood due to its very low abundance and a paucity of reagents. Here, for the first time, we show that endogenous GTF2IRD1 has a punctate pattern in the nuclei of cultured human cell lines and neurons. To probe the functional relationships of GTF2IRD1 in an unbiased manner, yeast two-hybrid libraries were screened, isolating 38 novel interaction partners, which were validated in mammalian cell lines. These relationships illustrate GTF2IRD1 function, as the isolated partners are mostly involved in chromatin modification and transcriptional regulation, whilst others indicate an unexpected role in connection with the primary cilium. Mapping of the sites of protein interaction also indicates key features regarding the evolution of the GTF2IRD1 protein. These data provide a visual and molecular basis for GTF2IRD1 nuclear function that will lead to an understanding of its role in brain, behaviour and human disease.

Abbreviations
hESC Human embryonic stem cells
PLA Proximity ligation assay
STED Stimulated emission depletion
WBS Williams–Beuren syndrome
Y2H Yeast two-hybrid

Electronic supplementary material The online version of this article (doi:10.1007/s00439-015-1591-0) contains supplementary material, which is available to authorized users.

* Stephen J. Palmer

1 Cellular and Genetic Medicine Unit, School of Medical Sciences, UNSW Australia, Sydney, NSW 2052, Australia
2 Queensland Brain Institute, The University of Queensland, Brisbane, QLD 4072, Australia
3 Centre for Neural Engineering, The University of Melbourne, Melbourne, VIC 3010, Australia
4 Biomedical Imaging Facility, Mark Wainwright Analytical Centre, UNSW Australia, Sydney, NSW 2052, Australia

Introduction

GTF2IRD1 (GTF2I repeat domain containing protein I) was initially identified in three independent yeast one-hybrid screens as a protein that interacted with DNA baits composed of triplicated versions of the upstream regions of TNNI1 (O’Mahoney et al. 1998), Hoxc8 (Bayarsaihan and Ruddle 2000), and goosecoid (Ring et al. 2002). In the human, the gene encoding GTF2IRD1 is located in a cluster within chromosome 7q11.23 with two closely related genes: GTF2I encoding TFII-I and GTF2IRD2 encoding GTF2IRD2. The 7q11.23 region contains three blocks of low-copy repeats (LCRs) that cause susceptibility to non-allelic homologous recombination during meiosis that results in offspring carrying hemizygous deletions resulting in Williams–Beuren syndrome (WBS; OMIM#194050) (Franke et al. 1999; Osborne et al. 1999; Pérez Jurado et al. 1998; Tassabehji et al. 1999; Tipney et al. 2004) or duplications that cause WBS duplication syndrome (OMIM#609757) (Depienne et al. 2007; Merla et al. 2010; Sanders et al. 2011; Somerville et al. 2005; Torniero et al. 2007; Van der Aa et al. 2009).

Patients with WBS duplication syndrome have only recently been described and little mapping data exists that can discriminate the individual genetic contribution to the phenotypes but in WBS, a series of atypical deletion patients indicate that loss of GTF2IRD1 and/or GTF2I is necessary to manifest the craniofacial abnormalities, mental retardation, visuospatial construction deficits and hypersociability of WBS (Antonell et al. 2010). Given the profound importance of copy number variations (CNVs) of these genes to the neurological abnormalities of WBS, it is not unreasonable to suppose that CNVs which increase GTF2IRD1 and GTF2I gene dosage may also play an important role in the consistent speech delay and increased rates of autism and schizophrenia found in WBS duplication syndrome and evidence from mouse models supports this idea (Osborne 2010). Therefore, it is very important to understand the function of these related genes to comprehend the consequences of their altered dosage.

Cultured cell transfection studies and transgenic experiments both indicate that GTF2IRD1 has strong gene repression capabilities (Issa et al. 2006; Tay et al. 2003) and binding studies demonstrated that GTF2IRD1 has sequence-specific DNA recognition properties for GGATTA-containing sequences that are conferred by a subset of the five I-repeat domains (RDs) that it contains (Polly et al. 2003; Thompson et al. 2007; Vullhorst and Buonanno 2003, 2005). The GTF2IRD1 upstream region (GUR) contains three GGATTA binding sites and EMSA studies have shown that all three are required to achieve high-affinity GTF2IRD1 binding (Palmer et al. 2010). This may explain why GTF2IRD1 was readily isolated from the artificial triplicated bait constructs of the original yeast one-hybrid assays. However, it is unclear what evolutionary advantage was bestowed by the multiple duplication of this DNA binding domain and how the RDs work in DNA binding site selection of target genes.

Apart from the RDs, the human GTF2IRD1 protein contains a short leucine zipper near the N-terminus implicated in dimerization (Vullhorst and Buonanno 2003), a nuclear localization signal (NLS) near the C-terminus, two SUMOylation motifs of which one is clearly highly conserved and functional (Widagdo et al. 2012), a highly conserved C-terminal domain that may be important for SUMOylation due to the binding of the E3 SUMO-ligase, PIASxβ (Widagdo et al. 2012) and a polyserine tract near the C-terminus that is missing in all fish species but present in amphibians and may, therefore, be a more recent evolutionary refinement.

Much of the thinking regarding GTF2IRD1 function is based on homology with TFII-I, which has been studied more intensively. It is undisputed that GTF2I and GTF2IRD1 evolved from a common ancestor but the level of functional overlap and redundancy between TFII-I and GTF2IRD1 is currently unclear. Direct protein interaction between them is possible (Palmer et al. 2012) and some of the data indicate similar target gene sets and mechanisms of regulation (Jackson et al. 2005; Palmer et al. 2012; Tantin et al. 2004). However, unlike GTF2IRD1, much of the data on TFII-I indicate a very broad role in both the cytoplasm and the nucleus and individual isoforms show different subcellular localization patterns and functional properties (Roy 2012). Some isoforms of TFII-I are thought to reside in the cytoplasm, where they are tethered by interactions with Bruton’s tyrosine kinase (Yang and Desiderio 1997) or p190RhoGAP (Jiang et al. 2005) and shuttle into the nucleus in response to signalling events and can also interact with PLC-γ in a way that competitively inhibits its binding to TRPC3, thus altering agonist-induced calcium entry into the cell (Caraveo et al. 2006). At the same time, TFII-I isoforms have a series of nuclear roles that include direct DNA binding to the regulatory regions of various genes, including c-fos (Roy 2012).

The majority of studies indicate that Gtf2ird1−/− null mice survive but have craniofacial and neurological abnormalities (Howard et al. 2012; Schneider et al. 2012; Tassabehji et al. 2005; Young et al. 2008), whereas Gtf2i−/− mice are embryonic lethal (Enkhmandakh et al. 2009). These data, in combination with the data on the cytoplasmic roles of TFII-I, might be taken to suggest that TFII-I plays a broader and more critical cellular role than GTF2IRD1. However, evolutionary conservation studies indicate that the common ancestor of these two genes bore a stronger sequence similarity to the current GTF2IRD1 (Gunbin and Ruvinsky 2013). Thus, during the initial period following duplication, proto-GTF2I was presumably liberated from functional constraints, whereas the proto-GTF2IRD1 retained most of the ancestral gene’s role. The GTF2I/GTF2IRD1 duplication pre-dates the formation of cartilaginous fish but the GTF2I gene has been lost in the two bony fish infraclasses (Teleostei and Holostei), whereas GTF2IRD1 has been retained in all vertebrates since its formation (Gunbin and Ruvinsky 2013). These data support the likelihood of functional overlaps between these two proteins and suggest that, at least in bony fish, GTF2IRD1 is sufficient to support all of the functions they provide in other species.

Based on the well-established DNA binding properties of GTF2IRD1 and the clear impact on transcriptional regulation when over-expressed in vitro (Polly et al. 2003; Vullhorst and Buonanno 2003) and in vivo (Issa et al. 2006), it has been assumed that GTF2IRD1 is a conventional transcription factor that has a consistent set of gene targets that will be dysregulated in its absence. But, despite demonstrated alterations in behaviour and motor function in Gtf2ird1 knockout mice (Howard et al. 2012; Young et al. 2008) and electrophysiological changes in CNS neurons (Proulx et al. 2010), evidence for such a gene set from transcriptional analysis of knockout brain tissue has so far proved elusive (O’Leary and Osborne 2011). This has led to speculation on the possibility of missing roles for GTF2IRD1, such as a cytoplasmic function similar to TFII-I (O’Leary and Osborne 2011).

Therefore, there is a strong need to supply some fundamental information regarding GTF2IRD1, which has been hampered by the lack of good quality antibodies and a lack of understanding about its protein–protein interactions. In this paper, we examine the subcellular localization of endogenous GTF2IRD1 and demonstrate that it is distributed in a speckled pattern in the nucleus that brings it into close proximity with several markers of chromatin/transcriptional regulation. To identify the range of protein interactions engaged in by GTF2IRD1, we used yeast two-hybrid library screenings to generate an unbiased comprehensive list of protein partners. Most of the proteins isolated support a role in the regulation of chromatin. Interactions with other DNA binding proteins and transcriptional co-factors suggest that GTF2IRD1 binds to chromatin targets using cooperative mechanisms. In addition, interactions with several components of the primary cilium and ARM repeat proteins offer an intriguing new direction in GTF2IRD1 research.

Results

Endogenous GTF2IRD1 exists in a punctate pattern in the nucleus

Previous analyses of endogenous GTF2IRD1 have been hampered by a lack of good quality antibodies and the low abundance of the protein. However, two commercial antibodies were identified, the first of which (333A) is usable in immunofluorescence and immunoprecipitation but is specific for the human protein and is a very poor detection reagent on western blots. The second (M19) is highly sensitive as a detection reagent on western blots but has relatively poor specificity and does not work for immunofluorescence (Fig. 1).

In whole cell extracts of HeLa cells, M19 detects two major bands in the 110–130 kDa range (Fig. 1a). After immunoprecipitation using the 333A antibody, only the upper band (running at approximately 130 kDa) is detected by M19, indicating that this corresponds to endogenous GTF2IRD1 (predicted molecular weight: 106 kDa). This result was confirmed using a pool of 4 anti-GTF2IRD1 siRNAs. While no change in the band pattern from whole cell extracts was observed when HeLa cells were treated with control siRNA, the upper band at 130 kDa was lost when cells were treated with anti-GTF2IRD1 siRNA (Fig. 1a).

The lower molecular weight band detected by M19 could not be identified. We considered the possibility that it constituted TFII-I or GTF2IRD2 and was detected as a result of cross-reactivity with the M19 antibody. However, this possibility was dismissed by immunoblotting using anti-TFII-I and anti-GTF2IRD2 antibodies. These antibodies both identified bands that run at a very similar molecular weight to GTF2IRD1 and not at the lower level of approximately 110 kDa. The possibility that the anti-GTF2IRD1 siRNAs affected the levels of TFII-I and GTF2IRD2 or that M19 was cross-reacting with these related proteins was also excluded by probing extracts treated using the siRNA knockdown oligonucleotides (Fig. 1a).

Endogenous GTF2IRD1 has never been convincingly localized within the cell by immunofluorescence. Using the 333A antibody we were able to detect a punctate signal in the nuclei of HeLa cells and this signal was completely abrogated in the majority of cells treated with the anti-GTF2IRD1 siRNAs (Fig. 1b). Cell counting indicated that approximately 98 % of cells had no signal while the remaining 2 % showed the normal pattern, indicating that they had failed to be transfected by the siRNA oligonucleotides. Use of super-resolution confocal STED (stimulated emission depletion) microscopy demonstrated that the speckles were distributed evenly throughout the nuclei of HeLa, HEK-293 (Fig. 1c) and SH-SY5Y cells (data not shown) in large numbers.

We considered the possibility that the pattern of GTF2IRD1 localization was specific to immortalized cell lines. Since GTF2IRD1 function has been associated with neurobehavioural abnormalities in mouse studies (Howard et al. 2012; Young et al. 2008) and we had previously shown expression in the mouse brain (Palmer et al. 2007), an understanding of localization in neurons was sought. Human cerebellum samples were obtained for immunofluorescence analysis but no convincing evidence of GTF2IRD1 localization could be found in frozen or fixed tissue (data not shown). Therefore, neurons differentiated from human embryonic stem (ES) cells were analysed using the 333A antibody. Co-immunofluorescence with anti-β-tubulin III and anti-MAP2ab antibodies showed punctate nuclear expression of GTF2IRD1 in neuronal cells (Fig. 2), consistent with its expression pattern in the immortalized cell lines (Fig. 1). Of note, GTF2IRD1 nuclear expression was also observed in β-tubulin III negative and MAP2ab negative cells, which correspond to subpopulations of early neural progenitors (Dottori, personal communication).


Fig. 1 Detection of endogenous human GTF2IRD1. a Western blot analysis of endogenous GTF2IRD1, GTF2IRD2 and TFII-I in HeLa cell extracts. A single western blot was cut into strips (indicated by the boxes), probed using anti-GTF2IRD1 (M19), anti-GTF2IRD2 (IRD2) and anti-TFII-I antibodies, and the resulting film exposures realigned. The anti-GTF2IRD1 M19 antibody detects two bands above 100 kDa in the whole cell extract (No siRNA) but after immunoprecipitation using the anti-GTF2IRD1 333A antibody (IP 333A), only the upper band is detected. In extracts from cells transfected with a negative control siRNA (CON siRNA), both bands are detected but in cells transfected with the anti-GTF2IRD1 siRNA pool (IRD1 siRNA), the upper band disappears. Immunoblotting (IB) for GTF2IRD2 and TFII-I shows that the lower band does not correspond to these proteins and the anti-GTF2IRD1 siRNA lane is unaffected in both blots, showing that there is no evidence of compensatory protein level change or antibody cross-reactivity. b Immunofluorescence analysis of endogenous GTF2IRD1 protein using the 333A antibody on HeLa cells treated with control or targeting siRNA. c Immunofluorescence analysis of endogenous GTF2IRD1 distribution in the nucleus of HeLa and HEK-293 cells using stimulated emission depletion (STED) super resolution confocal microscopy

Endogenous GTF2IRD1 is found in close proximity to elements of chromatin regulation

These results prompted an exploration of co-localization with other nuclear speckling bodies as a potential means to understand GTF2IRD1 function. Comparison of the distribution patterns using co-immunofluorescence revealed that none of the markers tested showed consistent one-to-one overlap with GTF2IRD1. However, qualitative assessment indicated that some GTF2IRD1 protein showed fractional overlap with markers of chromatin and transcriptional regulation, including histone H3 methylation marks, members of the heterochromatin protein 1 family and SP1 (Fig. 3). Co-localization of GTF2IRD1 and the nuclear bodies detected by antibodies against coilin, LAP2, the nuclear pore complex (NPC) and PML was very limited, although the sparse PML bodies typically showed at least one overlapping signal per nucleus (Supplementary Material, Fig. S1).

Proximity ligation assays (PLA) were used as a quantitative means to assess the incidence of close proximity between endogenous GTF2IRD1 protein and the markers of chromatin/transcriptional regulation in HeLa cells (Fig. 3). These analyses indicated that GTF2IRD1 is found in close proximity with the heterochromatin proteins HP1β and HP1γ at the highest frequency, followed by SP1, HP1α, H3K27Me2/3 and H3K4Me3 in descending order (Fig. 4). The PLA signal detected for GTF2IRD1 and H3K9Me3 was equivalent to the combined background control levels, suggesting that the incidence of close proximity between these proteins is equal to or approaching zero.

These observations suggested an association with elements of the chromatin regulation machinery, but the absence of stronger evidence of such interactions prevented firmer functional conclusions. While some GTF2IRD1 protein binding partners have been reported (Tussie-Luna et al. 2002; Widagdo et al. 2012), this area is largely unexplored and we therefore set out to address this need using a comprehensive screening approach.

Yeast two-hybrid library screening for novel GTF2IRD1 interacting partners

Yeast two-hybrid (Y2H) screening was chosen because it is unbiased and can identify both transient and stable interactions. To provide as complete a list of binding partners as possible, two Y2H screens were performed using a universal normalized mouse cDNA library (derived from a collection of different mouse tissues) and a human brain normalized cDNA library. These screens led to the initial isolation of 191 positive yeast colonies.

Clones with prey sequences duplicating other clones or out of frame with the GAL4 DNA binding domain were discarded. Most of the remaining clones were retransformed into haploid AH109 yeast using the original rescued prey plasmid or a reconstructed prey plasmid containing the full-length prey cDNA (Supplementary Material, Table S1), together with the bait construct or the empty pGBKT7 control plasmid (Supplementary Material, Fig. S2). Prey clones that were resistant to the quadruple dropout (QDO) media in the presence of the empty control plasmid were presumed to encode proteins that bind directly to the GAL4 DNA binding domain and were also discarded.

This refinement process led to the identification of 40 individual GTF2IRD1 binding protein candidates (Table 1). Five of these clones were not pursued beyond sequence identification as they were known to be solely cytoplasmic, extracellular or cell membrane localized and were less likely to be of biological relevance. Two of the proteins have been described previously as interacting nuclear partners (Tussie-Luna et al. 2002; Widagdo et al. 2012). Of the remaining 33 proteins, 26 either shuttle into the nucleus or are primarily located in the nucleus according to known functions or predictions summarized in the subcellular localization database, COMPARTMENTS (Binder et al. 2014). The KPNA proteins were predicted to have been isolated due to their binding to the GTF2IRD1 nuclear localization signal (NLS). Preliminary Y2H studies mapped this interaction to the C-terminal domain of GTF2IRD1, where the NLS is located, suggesting that this was indeed the case and these proteins were not pursued beyond this point (data not shown). It was striking that six of the remaining non-nuclear proteins are associated with or have links with centrosome and primary cilia function (Table 1), an association that has never been previously noted in connection with GTF2IRD1.

Domain characterization for the novel protein interactions of GTF2IRD1

To map the binding domains of these proteins in GTF2IRD1, a range of Y2H bait plasmids was constructed containing 8 separate 88 amino acid regions of the GTF2IRD1 protein (Fig. 5a), which cover known functional units or sequences that are strongly conserved between species, as described previously (Widagdo et al. 2012). A selected set of prey proteins was co-transformed with each of the 8 domain-specific plasmids and plated on QDO/x-α-GAL media (Fig. 5b). Protein interactions were mapped to several of the domains, sometimes multiple domains. The highest number of interactions mapped to the regions containing the SUMOylation motifs (Fig. 5c).

Subcellular localization of GTF2IRD1 and its novel protein partners

To gather further evidence for interactions in a mammalian cell context and to check the subcellular localization characteristics of the putative protein partners, plasmids encoding epitope-tagged versions or EGFP fusion proteins were either obtained or constructed (see Supplementary Material, Table S1). These plasmids were co-transfected into HeLa cells with plasmids encoding either GTF2IRD1-EGFP or Myc-tagged GTF2IRD1 and co-localization was analysed by fluorescence microscopy (Fig. 6).

Fig. 2 Endogenous GTF2IRD1 adopts a speckled nuclear pattern in hESC-derived neuronal cell cultures. Immunofluorescence analysis of hESC-derived cells, driven into the neuronal pathway of development, shows that GTF2IRD1 has the same nuclear pattern found in immortalized cell lines. GTF2IRD1 is expressed in all cells including differentiating neurons, as marked by β-tubulin III and MAP2ab antibodies. Scale bars represent 20 µm


Fig. 3 Co-localization of endogenous GTF2IRD1 with markers of chromatin/transcriptional regulation using confocal immunofluorescence analysis and PLA. The markers include HP1 α, β, γ, H3K9Me3 (K9), H3K27Me2/3 (K27), H3K4Me3 (K4) and SP1. The PLA images show representative Z-stack confocal reconstructions of DAPI-stained nuclei overlaid with the PLA dots generated by the same antibody pairings as the adjacent immunofluorescence images

Fig. 4 PLA quantification of the incidence of mean close proximity per nucleus between GTF2IRD1 and markers of chromatin/transcriptional regulation. The markers include HP1 α, β, γ, H3K9Me3 (K9), H3K27Me2/3 (K27), H3K4Me3 (K4) and SP1. a Histogram representing the total mean PLA dots per nucleus, the contribution of the total caused by GTF2IRD1 (IRD1) background signal, the marker background signal and the resulting estimate of the corrected PLA mean. b Table of the same data shown in a (rounded to integers) with associated estimates of standard deviation (SD) for each mean

The majority of the candidate proteins localized to the nucleus, as expected, with some degree of nuclear speckling in many cases but most showed little direct overlap with the tagged GTF2IRD1 protein (Fig. 6a). Candidate proteins that localized predominantly outside of the nucleus in these assays (Fig. 6b) may shuttle into the nucleus under normal circumstances to interact with GTF2IRD1 and thus, these findings should not prejudice the likelihood of a genuine biological interaction. However, these proteins were not analysed in the next stage of verification, involving co-immunoprecipitation (co-IP), since they were not occupying the same cellular compartment as GTF2IRD1.

GTF2IRD1 interacts with chromatin modifiers and transcriptional regulators in mammalian cells

The majority of the candidate proteins that showed significant distribution in the nuclear compartment were selected for a further level of validation using co-IP analysis of the recombinant proteins. Plasmids encoding tagged fusion proteins of GTF2IRD1 and each candidate protein were co-transfected into HeLa cells and protein complexes were immunoprecipitated using the anti-GFP antibody. Co-IP proteins were analysed on western blots using the anti-Myc antibody. Negative controls were performed by co-transfection with the empty pEGFP vector.

PKP2 is a member of the plakophilin family that plays dual roles in the nucleus and in desmosomes (Chen et al. 2002). In HeLa cells, the protein predominantly localized to the cell surface and, therefore, interaction with GTF2IRD1 was not tested by co-IP. However, the family member PKP1 is known to localize more readily to the nucleus (Hatzfeld et al. 2000) and we considered the possibility that the interaction of GTF2IRD1 with PKP family members may be conserved. To address this question, the prey vector pGADT7 containing the PKP1 open reading frame was first co-transformed with GTF2IRD1 and an interaction in yeast was verified (Supplementary Material, Fig. S2). Second, the localization of PKP1-EGFP to the nuclei of HeLa cells was confirmed (Fig. 6b) and PKP1 was included in the co-IP experiments.

All of the candidate proteins tested were found to co-immunoprecipitate with GTF2IRD1 from the HeLa cell extracts with varying levels of recovery and all the control interactions with GFP were negative as expected (Fig. 7). The majority of experiments were performed using EGFP-tagged versions of the candidate proteins and GTF2IRD1-Myc (Fig. 7a) but SETD6 and ZMYM2 co-IPs were performed using the reverse configuration (Fig. 7b) because only Myc-tagged versions of these constructs were available.

Co-immunofluorescence analysis of endogenous ZMYMs in mammalian cells

Of the previously unreported candidate proteins isolated in these screens, ZMYM2 and ZMYM3 were the most notable to us because they have been previously isolated using immunoaffinity purification from endogenous HeLa cell extracts as part of a complex that contained TFII-I, BHC110, BHC80, CoREST, HDAC1 and HDAC2 (Hakimi et al. 2003). Therefore, while direct binding of TFII-I to ZMYM2 and ZMYM3 had not been demonstrated, it seemed plausible that direct interactions between ZMYM proteins and members of the TFII-I family, including GTF2IRD1, are a conserved feature that confers the ability to integrate into HDAC-containing silencing complexes.

To examine this hypothesis, anti-ZMYM2 and anti-ZMYM3 antibodies were obtained and co-immunofluorescence analysis of their endogenous co-localization with endogenous GTF2IRD1 and TFII-I was conducted in HeLa cells. All 4 of these proteins were distributed in punctate patterns, and some co-localization was apparent, but the overlap was not one-to-one as a proportion of the red and green signal was still very obvious (Fig. 8a). However, it was noteworthy that the co-localization of GTF2IRD1 and the ZMYMs was not dissimilar from the co-localization of TFII-I and the ZMYMs, which were previously associated by co-immunoprecipitation in HeLa cells (Hakimi et al. 2003).


Table 1 Summary of Y2H screens

Gene symbol | Name | Location | Validation
AKIRIN2 | Akirin 2 | Nuclear | Yeast
ALMS1 | Alstrom syndrome 1 | Primary cilia/centrosome | Yeast
ARMCX5 | Armadillo repeat containing, X-linked 5 | Nuclear/cytoplasmic | Yeast
ATF7IP | Activating transcription factor 7 interacting protein | Nuclear | co-IP
ATP2C1 | ATPase, Ca++ transporting, type 2C, member 1 | Cytoplasmic | n.d.
BBS4 | Bardet–Biedl syndrome 4 | Primary cilia/centrosome | Yeast
CSPP1 | Centrosome and spindle pole associated protein 1 | Cilia/centrosome | Yeast
DCAF6 (a) | DDB1 and CUL4 associated factor 6 | Nucleus | co-IP
ELF2 | E74-like factor 2 (ets domain transcription factor) | Nucleus | Yeast
FAM47E | Family with sequence similarity 47, member E | Nucleus | Yeast
FBXW10 | F-box and WD repeat domain containing 10 | Nuclear/cytoplasmic | Yeast
FHAD1 | Forkhead-associated (FHA) phosphopeptide binding domain 1 | Unknown | Yeast
HOMEZ | Homeobox and leucine zipper encoding | Nucleus | co-IP
HSF2BP | Heat shock transcription factor 2 binding protein | Cytoplasm | Yeast
HTRA4 | HtrA serine peptidase 4 | Extracellular | n.d.
INTS12 | Integrator complex subunit 12 | Nucleus | co-IP
KPNA1 | Karyopherin alpha 1 (importin alpha 5) | Nucleus | Yeast
KPNA2 (a) | Karyopherin alpha 2 (RAG cohort 1, importin alpha 1) | Nucleus | Yeast
KPNA3 | Karyopherin alpha 3 (importin alpha 4) | Nucleus | Yeast
KPNA4 | Karyopherin alpha 4 (importin alpha 3) | Nucleus | Yeast
MBD3L1 | Methyl-CpG-binding domain protein 3-like 1 | Nucleus | co-IP
NAP1L2 | Nucleosome assembly protein 1-like 2 | Nucleus | co-IP
OPHN1 | Oligophrenin 1 | Cytoplasm | Yeast
PARPBP | PARP1 binding protein | Nucleus | Yeast
PIAS1 (a) | Protein inhibitor of activated STAT-1 | Nucleus | Yeast
PIAS2 (b) | Protein inhibitor of activated STAT, 2 | Nucleus | Reported
PKP2 | Plakophilin 2 | Desmosome/nucleus | Yeast
SCNM1 | Sodium channel modifier 1 | Nucleus | Yeast
SETD6 | SET domain containing 6 | Nucleus | co-IP
SPTLC1 | Serine palmitoyltransferase, long chain base subunit 1 | Endoplasmic reticulum | n.d.
TAF1B | TATA box binding protein (TBP)-associated factor, RNA polymerase I, B, 63kD | Nucleus | Yeast
TMEM55A | Transmembrane protein 55A | Membrane | n.d.
TRIP11 | Thyroid hormone receptor interacting protein 11 | Golgi/primary cilia | Yeast
USP20 | Ubiquitin specific peptidase 20 | Cytoplasm/centrosome | Yeast
USP33 | Ubiquitin specific peptidase 33 | Cytoplasm/centrosome | Yeast
VIMP | VCP-interacting membrane protein | Endoplasmic reticulum | n.d.
ZC4H2 (a) | Zinc finger, C4H2 domain containing | Nucleus | co-IP
ZMYM2 | Zinc finger, MYM-type 2 | Nucleus | co-IP
ZMYM3 | Zinc finger, MYM-type 3 | Nucleus | co-IP
ZMYM5 (a, b) | Zinc finger, MYM-type 5 | Nucleus | Reported

Combined results of GTF2IRD1 interacting partners identified in two independent screens (mouse universal and human brain): gene symbols are sorted alphabetically. Location data are based on reported subcellular localizations or predicted/inferred information using COMPARTMENTS (Binder et al. 2014). Several clones occurred in both screens (a) and 2 genes (b) have been described previously (Tussie-Luna et al. 2002; Widagdo et al. 2012). Some interactions were not pursued beyond sequence analysis (not done, n.d.). Other clones were either validated solely by retransformation in yeast (Yeast), or by yeast retransformation and subsequent transfection into HeLa cell lines and co-immunoprecipitation (co-IP)
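The candidate counts quoted in the text (40 candidates, 5 not pursued, 2 previously reported, 33 remaining) can be cross-checked against the Validation column of Table 1. The following is a minimal sketch for that purpose only: the dictionary is transcribed by hand from Table 1, and the names TABLE1_VALIDATION and tally are hypothetical helpers introduced for illustration, not part of the original analysis.

```python
from collections import Counter

# Validation status per gene symbol, transcribed by hand from Table 1.
# "Yeast" = yeast retransformation only, "co-IP" = also co-immunoprecipitated,
# "n.d." = not pursued beyond sequencing, "Reported" = previously described.
TABLE1_VALIDATION = {
    "AKIRIN2": "Yeast", "ALMS1": "Yeast", "ARMCX5": "Yeast", "ATF7IP": "co-IP",
    "ATP2C1": "n.d.", "BBS4": "Yeast", "CSPP1": "Yeast", "DCAF6": "co-IP",
    "ELF2": "Yeast", "FAM47E": "Yeast", "FBXW10": "Yeast", "FHAD1": "Yeast",
    "HOMEZ": "co-IP", "HSF2BP": "Yeast", "HTRA4": "n.d.", "INTS12": "co-IP",
    "KPNA1": "Yeast", "KPNA2": "Yeast", "KPNA3": "Yeast", "KPNA4": "Yeast",
    "MBD3L1": "co-IP", "NAP1L2": "co-IP", "OPHN1": "Yeast", "PARPBP": "Yeast",
    "PIAS1": "Yeast", "PIAS2": "Reported", "PKP2": "Yeast", "SCNM1": "Yeast",
    "SETD6": "co-IP", "SPTLC1": "n.d.", "TAF1B": "Yeast", "TMEM55A": "n.d.",
    "TRIP11": "Yeast", "USP20": "Yeast", "USP33": "Yeast", "VIMP": "n.d.",
    "ZC4H2": "co-IP", "ZMYM2": "co-IP", "ZMYM3": "co-IP", "ZMYM5": "Reported",
}


def tally(validation):
    """Count candidates per validation status and derive the summary totals."""
    counts = Counter(validation.values())
    total = len(validation)
    novel = total - counts["Reported"]   # partners not described previously
    pursued = novel - counts["n.d."]     # novel partners taken past sequencing
    return {"total": total, "novel": novel, "pursued": pursued, **counts}


summary = tally(TABLE1_VALIDATION)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under this transcription, the tally gives 40 candidates in total, 38 of them novel and 33 pursued beyond sequence identification, in line with the figures given in the text and the Abstract.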

To explore these associations quantitatively, the frequency of close proximity per nucleus between endogenous GTF2IRD1 or TFII-I and ZMYM2 and ZMYM3 was estimated using PLA (Fig. 8b). These data indicated that the incidence of close proximity per nucleus was similar and at relatively high levels for both GTF2IRD1 and TFII-I with the ZMYM proteins.
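The background correction applied to the PLA counts (Fig. 4a) can be written out as simple arithmetic. The sketch below assumes that the corrected mean is the total mean PLA dots per nucleus minus the sum of the two single-antibody background means, with negative results reported as zero; the source describes the components but does not state the formula, so this is an assumption. The function name and the example numbers are hypothetical and are not values from this study.

```python
def corrected_pla_mean(total_mean, ird1_background, marker_background):
    """Estimate specific PLA dots per nucleus after background subtraction.

    Assumption (not stated explicitly in the source): the GTF2IRD1-only and
    marker-only backgrounds are additive, and a negative estimate is
    reported as zero.
    """
    combined_background = ird1_background + marker_background
    return max(0.0, total_mean - combined_background)


# Hypothetical illustration values, not data from the paper.
specific = corrected_pla_mean(total_mean=50.0,
                              ird1_background=6.0,
                              marker_background=4.0)
print(specific)  # 40.0

# A pairing whose total signal equals the combined background corrects to zero,
# which is how the GTF2IRD1/H3K9Me3 result is described in the text.
print(corrected_pla_mean(10.0, 6.0, 4.0))  # 0.0
```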


Fig. 5 Mapping of interaction domains in GTF2IRD1 with the proteins identified in the Y2H screens. a Diagram of human GTF2IRD1 and the corresponding subdomains used for mapping (black bars above). The domains include the leucine zipper region (LZ), five repeat domains (RDs), two regions containing SUMO motifs and a nuclear localization signal (NLS). b Representative example images of yeast colonies plated on double dropout agar (DDO) as a control, or quadruple dropout (QDO) agar containing x-α-gal. Each colony represents yeast co-transformed with the empty vector control (V), domain-specific or full-length (FL) bait plasmids together with the prey plasmid identified in the Y2H screen. The example shown is the NAP1L2 interaction. Slight background activity in some surviving yeast is typical of Y2H assays and is ignored. c Summary of the domain mapping results using yeast co-transformation

Fig. 6 Subcellular localization of constitutively expressed GTF2IRD1 and the novel putative protein partners. a Confocal immunofluorescence analysis of HeLa cells transfected with plasmids encoding human GTF2IRD1-Myc (red) or GTF2IRD1-EGFP (green) together with plasmids encoding the partner, also tagged with Myc, EGFP or FLAG (PIAS1 only). All of the proteins in a were found to localize to the nucleus. Over-expression of some proteins typically causes abnormal appearance of the nucleus, which is apparent in the DAPI images. This is assumed to be a consequence of the impact of the partner on nuclear behaviour. b An identical analysis with partner proteins that were found to show cytoplasmic and nuclear localization. PKP1 was not identified in the Y2H screens but was selected due to homology to PKP2, which localizes to the cell periphery. Scale bars represent 20 µm (color figure online)

Fig. 7 Novel GTF2IRD1 interactions with nuclear proteins revealed by co-immunoprecipitation in mammalian cells. Panels show western blot analyses of HeLa cells transiently transfected with the indicated constructs. a Protein partners were immunoprecipitated (IP) with anti-GFP antibody and immunoblotted (IB) with anti-GFP to show successful immunoprecipitation. In one case (INTS12-GFP), the loading of the input was too low to be detected but sufficient protein was recovered in the IP. Immunoblotting with anti-Myc to reveal co-immunoprecipitation (Co-IP) of GTF2IRD1-Myc showed that GTF2IRD1 was recovered in all experiments, except for the pEGFP vector control (CON GFP). b Due to limited plasmid clone availability, some partners were assayed in the reverse configuration. HeLa cells were transfected with plasmids encoding GTF2IRD1-EGFP or EGFP alone and SETD6-Myc or ZMYM2-Myc. Proteins were immunoprecipitated using anti-GFP antibody and the interactions detected by immunoblotting with anti-Myc antibody. Numbers below the construct names represent the molecular weight in kDa, which was assessed as approximately correct in all cases against molecular weight markers

Fig. 8 Endogenous co-localization of GTF2IRD1 and TFII-I with ZMYM2 and ZMYM3 using confocal immunofluorescence analysis and PLA. a Confocal immunofluorescence analysis of HeLa cells using antibodies against the proteins indicated. Despite the previously reported association of TFII-I with ZMYM2 and ZMYM3 in immunoaffinity analysis of HeLa cell extracts, only partial co-localization is observed. b PLA quantification of the incidence of mean close proximity per nucleus between GTF2IRD1 and TFII-I with ZMYM2 and ZMYM3. The table indicates the total combined mean PLA dots per nucleus, the contribution of the total caused by GTF2IRD1 (IRD1) background signal, the ZMYM background signal and the resulting estimate of the corrected PLA mean. The adjacent columns show the associated estimates of standard deviation (SD) for each mean

Discussion

In this paper, we have shown for the first time that the endogenous human GTF2IRD1 protein is localized primarily to the nucleus in immortalized human cell lines and in neurons and progenitors differentiated from human embryonic stem cells. The expression of GTF2IRD1 in progenitor and neuronal populations derived from human ES cells suggests a function of this protein from early stages of human neuronal development.

The localization of GTF2IRD1 within the nucleus assumes a speckled pattern that is similar to the TFII-I pattern in HeLa cells (Tanikawa et al. 2011). GTF2IRD1 is clearly excluded from the nucleoli but the distribution does not directly match any of the standard markers of nuclear subcompartments (Sleeman and Trinkle-Mulcahy 2014). However, PLA quantification demonstrated that the strongest potential association of the markers chosen was the HP1 proteins. HP1 proteins have a chromodomain that recognizes the H3K9me2/3 mark and were originally associated with heterochromatin but are now recognized as having multiple roles in transcriptional activation, sister chromatid cohesion, chromosome segregation, telomere maintenance, DNA repair and RNA splicing (Canzio et al. 2014). While HP1α and HP1β are generally localized to heterochromatin, HP1γ is often found in euchromatin at transcription start sites (Sridharan et al. 2013). GTF2IRD1 was also found in close proximity with the transcription factor SP1, which is positively regulated by direct binding of ATF7IP (Fujita et al. 2003); identified as a novel interaction partner in this study. Close proximities were observed at a lower frequency with the chromatin mark H3K4Me3, found at the transcription start site of active genes and with H3K27Me2/3, which is a mark mediated via PRC2, a key repressive factor for the regulation of developmental genes (Golbabapour et al. 2013).

Based on these associations alone, one might speculate that GTF2IRD1 plays a role in transcriptional regulation and developmental gene silencing. These ideas fit very well with previous observations regarding the repression of multiple tissue-specific genes in a transgenic system (Issa et al. 2006), the direct negative autoregulation of the GTF2IRD1 promoter/enhancer by its own protein product (Palmer et al. 2010) and the fact that mouse Gtf2ird1 is widely and robustly expressed during development but restricted to specific cell types such as neurons and brown adipose tissue during adulthood (Palmer et al. 2007).


A large group of novel GTF2IRD1 protein–protein interactions was identified by Y2H screening. Many of these interactions were tested in mammalian cells via co-localization and co-IP; the latter approach forming the basis for a broader interactional network summary (Supplementary Material, Fig. S3). As anticipated, a large number of GTF2IRD1 novel partners are nuclear-localized or have the capability to shuttle into the nucleus, while some are generally cytoplasmic or extracellular and are therefore more likely to be artefacts of the screening system, although there is no additional evidence to support that conclusion. Putting the proteins of primary interest into functional groups, several broad categories emerge; such as nuclear import functions (KPNA1-4), post-translational modifications of ubiquitination (e.g. USP20, USP33 and FBXW10) and SUMOylation (PIAS1 and PIAS2), DNA binding proteins and transcriptional co-regulators (e.g. ELF2, HOMEZ, TRIP11, ZC4H2) and the largest group, which is primarily associated with chromatin regulation (e.g. SETD6, ATF7IP, DCAF6, ZMYM2, ZMYM3, ZMYM5, MBD3L1 and NAP1L2).

The association of GTF2IRD1 with gene silencing functions is consistent with the identification of multiple binding partners that play a role in transcriptional regulation through chromatin modification (Supplementary Material, Table S2). One might predict on this basis that a major functional role for GTF2IRD1 is to nucleate complexes of proteins that are capable of changing histone marks and direct them to specific locations in the genome, either through the direct DNA binding properties of GTF2IRD1 or by association with other transcription factors. The identification of 3 members of the ZMYM family in the screens is consistent with the isolation of ZMYM2 and ZMYM3 in association with TFII-I in the same HDAC-containing complex using immunoaffinity purification from endogenous HeLa cell extracts (Hakimi et al. 2003). This would suggest that the interaction between ZMYM proteins and members of the GTF2I family is an evolutionarily conserved feature. Endogenous co-immunofluorescence analysis of the ZMYM proteins suggested some overlap with GTF2IRD1 and TFII-I and the PLA quantification demonstrated frequent close proximity of GTF2IRD1 and TFII-I with the ZMYMs. However, it was also clear that only a fraction of the GTF2IRD1 and TFII-I protein population was in association with the ZMYMs, suggesting that while endogenous TFII-I has been co-purified in complexes containing ZMYM2 and ZMYM3 (Hakimi et al. 2003), these data do not necessarily provide a picture of how TFII-I is distributed in various complexes and such interactions may be transient and fractional.

Several proteins fall into a category that could indicate a signalling role. The largest grouping of these includes 3 proteins that localize to the primary cilium/centrosome complex (ALMS1, BBS4 and CSPP1) as well as 3 other proteins that are linked with primary cilium function (TRIP11, USP20 and USP33). No previous reports have indicated a role for GTF2IRD1 in this structure and there is no evidence as yet to suggest that GTF2IRD1 shuttles to this site but the isolation of 6 proteins belonging to this grouping in an unbiased screen seems beyond the likelihood of coincidence and could initiate a valuable new line of future investigation. All 3 of the main proteins isolated are associated with ciliopathies: mutations in BBS4 cause Bardet–Biedl syndrome 4 (OMIM #615982); ALMS1, Alstrom syndrome (OMIM #203800) and CSPP1, Joubert syndrome 21 (OMIM #615636). Primary cilia in specialized sensory cells are well known but it is now clear that these structures are virtually universal in all cell types, playing critical roles in the sonic hedgehog and Wnt signalling pathways and are particularly important in the developing brain (Guemez-Gamboa et al. 2014; Han et al. 2009).

GTF2IRD1 was shown to interact with 3 members of the ARM repeat-containing family; PKP1, PKP2 and ARMCX5. The plakophilins localize to the cytoplasmic surface of desmosomes but also localize to the nucleoplasm in a wide range of cells. They are widely viewed as signalling proteins that shuttle between these locations playing roles of structural scaffold at the desmosome and transcriptional regulation in the nucleus (Bass-Zubek et al. 2009), being capable of potentiating β-catenin/TCF-mediated transcriptional regulation (Chen et al. 2002). ARMCX5 function is poorly understood but evidence suggests that the Armcx genes arose as a cluster on the X chromosome as a result of retrotransposition from Armc10. These genes encode proteins that are highly expressed in the developing and adult nervous system, localize both to the nucleus and to mitochondria and play a role in the distribution and dynamics of the mitochondria (Lopez-Domenech et al. 2012).

Eighteen of the novel partner proteins were mapped to interaction domains in GTF2IRD1, highlighting two important points. First, the majority of the interactions localized to the SUMO domains, indicating that some of these interactions may be regulated by post-translational modification of this domain, as was previously shown for ZMYM5 (Widagdo et al. 2012). The second point is that the other major site of interaction is RD1, although binding to the other repeat domains was also common. The repeat domains are known to be the site of DNA binding activity and RDs 2–5 all show varying levels of DNA binding and sequence specificity (Vullhorst and Buonanno 2005), whereas RD1 does not bind DNA. It is difficult to understand what evolutionary advantage was bestowed by the internal duplication of the repeat domains, leading to their expansion to 5 copies in humans and 6 copies in mice. It seems unlikely that this was driven by a need to refine direct DNA binding properties, although this does restrict high affinity binding to sites that contain at least 2 GGATTA recognition sequences, as shown for the autoregulation of the GTF2IRD1 promoter/enhancer (Palmer et al. 2010). If, however, it is assumed that the RDs form an important protein interaction surface, an evolutionary expansion of this domain would initially amplify the number of proteins with which GTF2IRD1 could interact and subsequent divergence of the repeat domain sequence could diversify the range of simultaneous protein–protein interactions. Alternatively, adding repeat domains could provide a secondary interaction surface for the same partner protein, thus allowing greater control over the binding reaction. The multiple binding sites of several of the partner proteins in the Y2H mapping experiments indicate that the latter scenario is possible.

In conclusion, this paper provides visual and biochemical insights into the functions of GTF2IRD1 using unbiased systems. These data form the basis for a set of testable hypotheses that places GTF2IRD1 as a nuclear protein capable of engaging partner proteins in transcriptional regulation through chromatin modification. In addition, GTF2IRD1 may be located to sites on the genome through interactions with other DNA binding proteins identified in the Y2H screens and may integrate cytoplasmic signals via ARM repeat proteins or members of the centrosome/primary cilium complex.

Materials and methods

Plasmids

For Y2H experiments, bait (pGBKT7) and prey (pGADT7) plasmids were obtained from the Matchmaker Gold Yeast Two-Hybrid System (Clontech Laboratories). pGBKT7 constructs containing full-length human GTF2IRD1 cDNA or the 88 amino acid domain regions (Fig. 5) have been described previously (Widagdo et al. 2012). Mammalian expression constructs for GTF2IRD1 (pMyc-GTF2IRD1 and pEGFP-GTF2IRD1) were described previously (Widagdo et al. 2012). A detailed list of all other constructs used in this study is shown in Supplementary Material, Table S1.

Yeast two‑hybrid assays and library screening

Saccharomyces cerevisiae strain AH109 was transformed with both prey and bait plasmids using the standard lithium acetate/polyethylene glycol protocol (Gietz and Woods 2002). Diploids were grown on double dropout (DDO) selective medium which lacks tryptophan and leucine at 30 °C for 4–5 days. To identify protein interactions, cells were grown on a quadruple dropout (QDO) medium with x-α-gal (40 µg/mL), deficient in tryptophan, leucine, histidine and adenine. For library screening, pGBKT7-GTF2IRD1 was transformed into the AH109 yeast strain and mated with the Mate and Plate (Clontech Laboratories) Universal Mouse (Normalized) library (#630482) or the Human Brain (Normalized) cDNA library (#630486), according to the manufacturer’s protocol.

Clones appearing on QDO medium after 4–7 days were further analysed by re-streaking onto QDO/x-α-gal plates. Plasmids were extracted from the yeast, grown in E. coli and retransformed into haploid AH109 yeast together with pGBKT7-GTF2IRD1 to confirm the interactions. Identification of inserts in the prey plasmids was performed by Sanger sequencing and BLASTn (NCBI) searches. Protein information was retrieved from the UniProt Consortium database.

Cell lines and transfections

HeLa and HEK-293 cells were grown in Dulbecco’s modified Eagle’s medium, supplemented with 10 % foetal bovine serum, and 1× penicillin (100 U/ml)/streptomycin (100 µg/ml) at 37 °C in 5 % CO2. For siRNA transfection, the ON-TARGETplus GTF2IRD1 siRNA SMART pool (L-013262-00-0005, sequences: GUGUGCAGAUCCUGUUUAA, UCACGGGUCUGCCUGAUGA, AGUAUCCACUUCAUCAUUA, UCCCGGGACCUCUUAAUUA; Dharmacon) was transfected into HeLa cells (100 pmol/well of a 6-well plate) using Lipofectamine 2000 (Life Technologies) following the product protocol. Transfected cells were incubated for 48 h before analysis. Transient transfections of mammalian expression vectors were performed in HeLa cells using Lipofectamine LTX (Life Technologies) according to the manufacturer’s protocol.

The H9 (WA-09, WiCell) human ES cell line was cultured feeder-free on vitronectin-coated plates using MTeSR-1 defined media according to the manufacturer’s instructions (Stem Cell Technologies) and maintained at 37 °C with 5 % CO2. Colonies were mechanically dissected every 7 days and transferred to freshly prepared coated plates. Cell culture media was changed every day. Neural inductions of human ES cells were set up as described (Denham et al. 2012) with some slight modifications. Briefly, human ES cells were mechanically dissected into pieces approximately 0.5 mm in diameter and transferred to laminin-coated organ culture plates in N2B27 medium containing 1:1 mix of Neurobasal medium (NBM) with DMEM/F12 medium. Neurobasal media contained Neurobasal A medium supplemented with 1 % N2, 2 % B27, 2 mM l-glutamine and 0.5 % Penicillin/Streptomycin (all sourced from Gibco). Cells were cultured in N2B27 media for 14 days without passaging. SB431542 (10 μM, Tocris) and noggin (500 ng/ml, Peprotech) were supplemented in the N2B27 media for the first 7 days, followed by basic fibroblast growth factor (bFGF; 20 ng/ml, Peprotech) supplementation only for the remaining 7 days. Fresh supplemented media were replaced every second day. Following 14 days, colonies were dissected into pieces and cultured in suspension in NBM supplemented with epidermal growth factor (EGF) and bFGF at 20 ng/mL each (Peprotech) for 1 week to generate neurospheres. Neuronal differentiation was performed by mechanically disaggregating neurospheres and plating the cells onto poly-d-/laminin dishes in unsupplemented NBM for 1–2 weeks, as previously described (Denham and Dottori 2011).

Antibodies

Anti-GTF2IRD1 antibodies included the 333A rabbit polyclonal (A301-333A-1 Bethyl Laboratories Inc.), the epitope for which maps to between amino acids 909–959 of the human protein, and the M19 rabbit polyclonal (sc-14714 Santa Cruz), for which the epitope maps to ‘within an internal region of WBSCR11 of mouse origin’, according to the manufacturer’s description. For TFII-I, the antibodies used were #4562 rabbit polyclonal (Cell Signaling Technology, proprietary epitope information) for western blot or the #sc-9943 goat polyclonal for immunofluorescence (Santa Cruz) for which the epitope is described as ‘mapping near the C-terminus of human TFII-I’. The antibody obtained for GTF2IRD2, #H00084163-B01P mouse polyclonal (Abnova) was produced using the full-length human protein as an immunogen. The monoclonal mouse antibodies against MAP2ab (#MA5-12823, Thermo Scientific) and β-tubulin III (#MAB1637 Merck Millipore) were used as neuronal markers. Antibodies against nuclear subcompartments included mouse monoclonal antibodies; (Abcam), anti-coilin (ab11822-50), anti-Histone H3 tri-methyl K4 (ab12209), anti LAP2 (ab11823), anti-nuclear pore complex (ab24609) and anti-SC-35 (ab11826); (Active Motif), anti-histone H3 tri-methyl K9 (#39286), anti-histone H3 di/tri-methyl K27 (#39538), anti-HP1α (#39978), anti-HP1β (#39980) and anti-HP1γ (#39982); (Abnova), SP1 (H00006667-M02) and (Santa Cruz) anti-PML (N-19) sc-9862 goat polyclonal IgG.

Antibodies against epitope tags and GFP included; mouse monoclonal anti-Myc antibody clone 9E10 (Sigma), anti-FLAG rabbit polyclonal F7425 (Sigma), anti-HA mouse monoclonal #MMS-101R-500 (Covance) and the rabbit polyclonal anti-GFP #ab290 (Abcam). To detect the endogenous ZMYM proteins, we used the rabbit polyclonal anti-ZNF198 (ZMYM2) A301-710A and anti-ZNF261 (ZMYM3) A300-200A (Bethyl Laboratories).

Immunoprecipitation and immunoblotting

Cells were lysed 24 h after expression vector transfection or 48 h after siRNA transfection in lysis buffer (20 mM Tris–HCl, pH 7.4; 420 mM NaCl; 10 mM MgCl2; 2 mM EDTA; 10 % Glycerol; 1 % Triton X-100; 2.5 mM β-Glycerophosphate; 1 mM NaF) supplemented with protease inhibitor cocktail (Sigma P8340) and incubated for 30 min on ice before being sonicated twice for 7 s on ice. Cell lysates were centrifuged at 20,000g for 10 min at 4 °C to remove cell debris and pre-cleared by incubation with Pure Proteome Protein A/G magnetic beads (Millipore, #LSKMAGAG02) for 30 min at 4 °C. The anti-GFP antibody (ab290, Abcam) was coupled to the protein A/G Magnetic beads for 30 min at room temperature, and then washed three times in PBS/Tween 20 (0.2 %). Pre-cleared lysates were incubated with the antibody-bound beads at 4 °C overnight. Beads were washed in PBS/Tween 20 four times and proteins were eluted by boiling in 1× Laemmli sample buffer containing 0.1 M DTT. Proteins were resolved by 8 or 10 % SDS-PAGE and transferred to a PVDF membrane for western blot analysis using standard methods. Briefly, membranes were blocked for 1 h in blocking solution (TBS/Tween 20 5 % non-fat milk powder), incubated with the primary antibody for 2 h in the same solution and washed for 3 × 10 min in TBS/Tween 20. The secondary antibody incubation was conducted for 45 min in blocking solution, washed as before and signal was detected using the ECL substrates, Clarity (Bio-Rad) or Luminata Forte (Merck Millipore), and exposure to X-ray film.

Immunofluorescence

For immunofluorescence analysis of endogenous proteins, HeLa cells were washed with PBS and then fixed and permeabilized for 15 min in 4 % PFA/0.25 % Triton-X100. For analysis of transfected proteins, 24 h after transfection, HeLa cells were washed with PBS and fixed in ice-cold methanol for 10 min. After fixation, all cells were incubated with blocking buffer (10 % BSA in PBS) for 1 h at room temperature, followed by the primary antibody incubation in 1 % BSA in PBS. Detection was carried out using secondary antibodies conjugated to Alexa Fluor Dyes (Molecular Probes). ProLong Gold Antifade reagent with DAPI (Molecular Probes) was used as mounting media in all preparations, except for stimulated emission depletion (STED) imaging, where DAPI was excluded.

For immunofluorescence analysis where both antibodies were derived from rabbit serum (endogenous GTF2IRD1 and ZMYM2 or ZMYM3), incubations were performed in a sequential manner. HeLa cells were washed with PBS followed by fixation/permeabilization for 15 min in 4 % PFA/0.25 % Triton-X100. After blocking with 10 % foetal calf serum (FCS) for 30 min, cells were incubated overnight at 4 °C with rabbit anti-GTF2IRD1 (333A) and then blocked for 1.5 h at room temperature with a Fab fragment goat anti-rabbit (1:45, Jackson Immunoresearch Laboratories, Inc), which converts the presentation of the rabbit IgG (H + L) of the primary antibody into a goat antigen. This was followed by incubation with a secondary anti-goat antibody conjugated to Alexa Fluor 488. The primary rabbit antibody of the protein partners was added and detected with a secondary anti-rabbit antibody conjugated to Alexa Fluor 594 in the second phase. Negative controls were performed to ensure the complete blocking of rabbit IgG from the first primary antibody.

Cells were visualized by confocal microscopy using a Leica TCS SP5 microscope under 63× or 100× magnification. For stimulated emission depletion (STED) experiments, the imaging system was connected to a 592 nm continuous wave depletion laser.

Proximity ligation assay (PLA)

PLA was performed using HeLa cells grown on 12 × 12 mm coverslips and fixed for endogenous protein detection as described earlier (Immunofluorescence). The assay was carried out using the Duolink kit (Olink AB), following the manufacturer’s protocol, using red detection reagents and the secondary probes provided when the primary antibody pairs were from mouse, rabbit or goat origin. Background controls were produced by performing two parallel experiments in which the two primary antibodies are left out of the procedure.

In the cases where a PLA antibody pair consisted of two primary antibodies raised in rabbit, the protocol was modified for sequential primary antibody incubation. After fixation and blocking, samples were incubated overnight with rabbit anti-GTF2IRD1 333A, followed by blocking with goat anti-rabbit Fab fragment (1:45 for 1.5 h) followed by an incubation for 1 h at 37 °C with the PLA probe anti-Goat PLUS Duolink. Then the second rabbit primary antibody was added at room temperature for 2 h and blocked for 1 h at 37 °C with PLA probe anti-Rabbit MINUS. The procedure then continued with the standard PLA protocol. Background controls, lacking one or both primary antibodies with Fab fragment incubation were performed to ensure complete blocking of the first primary antibody rabbit IgG.

For each sample, three-dimensional acquisitions (z-stacks) of the nucleus were obtained and the maximum intensity projection pictures were created and analysed for the number of PLA positive puncta. Positive PLA signal was counted (Image J, manual cell counter tool) as nuclear dots per cell, in 30 cells per experiment (n = 2 experiments). Background control levels were estimated by counting the dots per nucleus per cell in the background control slides using the same procedure.

Online resources

The interactional network of GTF2IRD1 was generated using Cytoscape 3.1.1 (The Cytoscape Consortium) (Shannon et al. 2003), retrieving protein–protein interactions from IntAct database: Cytoscape: http://www.cytoscape.org/ and IntAct, EMBL-EBI: http://www.ebi.ac.uk/intact/.

Acknowledgments We thank Kylie M. Taylor for her technical assistance. We are grateful for the plasmid constructs provided by the researchers detailed in the Supplementary Table 1. We extend thanks to the Biomedical Imaging Facility, from the Mark Wainwright Analytical Centre at UNSW Australia for their training and support for the microscopy techniques. PC-M and CPC are recipients of a CONICYT-Becas Chile scholarship from the Government of Chile.

Compliance with ethical standards

Funding This work was supported by the National Health and Medical Research Council of Australia (Project Grant 1049639).

Conflict of interest The authors declare no conflict of interest.

References

Antonell A, Del Campo M, Magano LF, Kaufmann L, de la Iglesia JM, Gallastegui F, Flores R, Schweigmann U, Fauth C, Kotzot D, Perez-Jurado LA (2010) Partial 7q11.23 deletions further implicate GTF2I and GTF2IRD1 as the main genes responsible for the Williams–Beuren syndrome neurocognitive profile. J Med Genet 47:312–320. doi:10.1136/jmg.2009.071712
Bass-Zubek AE, Godsel LM, Delmar M, Green KJ (2009) Plakophilins: multifunctional scaffolds for adhesion and signaling. Curr Opin Cell Biol 21:708–716. doi:10.1016/j.ceb.2009.07.002
Bayarsaihan D, Ruddle FH (2000) Isolation and characterization of BEN, a member of the TFII-I family of DNA-binding proteins containing distinct helix–loop–helix domains. Proc Natl Acad Sci USA 97:7342–7347
Binder JX, Pletscher-Frankild S, Tsafou K, Stolte C, O’Donoghue SI, Schneider R, Jensen LJ (2014) COMPARTMENTS: unification and visualization of protein subcellular localization evidence. Database (Oxford) 2014:bau012. doi:10.1093/database/bau012
Canzio D, Larson A, Narlikar GJ (2014) Mechanisms of functional promiscuity by HP1 proteins. Trends Cell Biol 24:377–386. doi:10.1016/j.tcb.2014.01.002
Caraveo G, van Rossum DB, Patterson RL, Snyder SH, Desiderio S (2006) Action of TFII-I outside the nucleus as an inhibitor of agonist-induced calcium entry. Science 314:122–125. doi:10.1126/science.1127815
Chen X, Bonne S, Hatzfeld M, van Roy F, Green KJ (2002) Protein binding and functional characterization of plakophilin 2. Evidence for its diverse roles in desmosomes and beta-catenin signaling. J Biol Chem 277:10512–10522. doi:10.1074/jbc.M108765200
Denham M, Dottori M (2011) Neural differentiation of induced pluripotent stem cells. Methods Mol Biol 793:99–110. doi:10.1007/978-1-61779-328-8_7

Denham M, Parish CL, Leaw B, Wright J, Reid CA, Petrou S, Dottori M, Thompson LH (2012) Neurons derived from human embryonic stem cells extend long-distance axonal projections through growth along host white matter tracts after intra-cerebral transplantation. Front Cell Neurosci 6:11. doi:10.3389/fncel.2012.00011
Depienne C, Heron D, Betancur C, Benyahia B, Trouillard O, Bouteiller D, Verloes A, LeGuern E, Leboyer M, Brice A (2007) Autism, language delay and mental retardation in a patient with 7q11 duplication. J Med Genet 44:452–458. doi:10.1136/jmg.2006.047092
Enkhmandakh B, Makeyev AV, Erdenechimeg L, Ruddle FH, Chimge NO, Tussie-Luna MI, Roy AL, Bayarsaihan D (2009) Essential functions of the Williams–Beuren syndrome-associated TFII-I genes in embryonic development. Proc Natl Acad Sci USA 106:181–186. doi:10.1073/pnas.0811531106
Franke Y, Peoples RJ, Francke U (1999) Identification of GTF2IRD1, a putative transcription factor within the Williams–Beuren syndrome deletion at 7q11.23. Cytogenet Genome Res 86:296–304
Fujita N, Watanabe S, Ichimura T, Ohkuma Y, Chiba T, Saya H, Nakao M (2003) MCAF mediates MBD1-dependent transcriptional repression. Mol Cell Biol 23:2834–2843
Gietz RD, Woods RA (2002) Transformation of yeast by LiAc/SS carrier DNA/PEG Method. Methods Enzymol 35:87–96
Golbabapour S, Majid NA, Hassandarvish P, Hajrezaie M, Abdulla MA, Hadi AH (2013) Gene silencing and Polycomb group proteins: an overview of their structure, mechanisms and phylogenetics. OMICS 17:283–296. doi:10.1089/omi.2012.0105
Guemez-Gamboa A, Coufal NG, Gleeson JG (2014) Primary cilia in the developing and mature brain. Neuron 82:511–521. doi:10.1016/j.neuron.2014.04.024
Gunbin KV, Ruvinsky A (2013) Evolution of general transcription factors. J Mol Evol 76:28–47. doi:10.1007/s00239-012-9535-y
Hakimi MA, Dong Y, Lane WS, Speicher DW, Shiekhattar R (2003) A candidate X-linked mental retardation gene is a component of a new family of histone deacetylase-containing complexes. J Biol Chem 278:7234–7239. doi:10.1074/jbc.M208992200
Han YG, Kim HJ, Dlugosz AA, Ellison DW, Gilbertson RJ, Alvarez-Buylla A (2009) Dual and opposing roles of primary cilia in medulloblastoma development. Nat Med 15:1062–1065. doi:10.1038/nm.2020
Hatzfeld M, Haffner C, Schulze K, Vinzens U (2000) The function of plakophilin 1 in desmosome assembly and actin filament organization. J Cell Biol 149:209–222
Howard ML, Palmer SJ, Taylor KM, Arthurson GJ, Spitzer MW, Du X, Pang TY, Renoir T, Hardeman EC, Hannan AJ (2012) Mutation of Gtf2ird1 from the Williams–Beuren syndrome critical region results in facial dysplasia, motor dysfunction, and altered vocalisations. Neurobiol Dis 45:913–922. doi:10.1016/j.nbd.2011.12.010
Issa LL, Palmer SJ, Guven KL, Santucci N, Hodgson VR, Popovic K, Joya JE, Hardeman EC (2006) MusTRD can regulate postnatal fiber-specific expression. Dev Biol 293:104–115. doi:10.1016/j.ydbio.2006.01.019
Jackson TA, Taylor HE, Sharma D, Desiderio S, Danoff SK (2005) Vascular endothelial growth factor receptor-2: counter-regulation by the transcription factors, TFII-I and TFII-IRD1. J Biol Chem 280:29856–29863. doi:10.1074/jbc.M500335200
Jiang W, Sordella R, Chen GC, Hakre S, Roy AL, Settleman J (2005) An FF domain-dependent protein interaction mediates a signaling pathway for growth factor-induced gene expression. Mol Cell 17:23–35. doi:10.1016/j.molcel.2004.11.024
Lopez-Domenech G, Serrat R, Mirra S, D’Aniello S, Somorjai I, Abad A, Vitureira N, Garcia-Arumi E, Alonso MT, Rodriguez-Prados M, Burgaya F, Andreu AL, Garcia-Sancho J, Trullas R, Garcia-Fernandez J, Soriano E (2012) The Eutherian Armcx genes regulate mitochondrial trafficking in neurons and interact with Miro and Trak2. Nat Commun 3:814. doi:10.1038/ncomms1829
Merla G, Brunetti-Pierri N, Micale L, Fusco C (2010) Copy number variants at Williams–Beuren syndrome 7q11.23 region. Hum Genet 128:3–26. doi:10.1007/s00439-010-0827-2
O’Leary J, Osborne LR (2011) Global analysis of gene expression in the developing brain of Gtf2ird1 knockout mice. PLoS One 6:e23868. doi:10.1371/journal.pone.0023868
O’Mahoney J, Guven KL, Joya JE, Robinson S, Wade RP, Hardeman EC (1998) Identification of a novel slow-muscle-fiber enhancer binding protein, MusTRD1. Mol Cell Biol 18:6641–6652
Osborne LR (2010) Animal models of Williams syndrome. Am J Med Genet C Semin Med Genet 154C:209–219. doi:10.1002/ajmg.c.30257
Osborne LR, Campbell T, Daradich A, Scherer SW, Tsui LC (1999) Identification of a putative transcription factor gene (WBSCR11) that is commonly deleted in Williams–Beuren syndrome. Genomics 57:279–284
Palmer SJ, Tay ES, Santucci N, Cuc Bach TT, Hook J, Lemckert FA, Jamieson RV, Gunning PW, Hardeman EC (2007) Expression of Gtf2ird1, the Williams syndrome-associated gene, during mouse development. Gene Expr Patterns 7:396–404. doi:10.1016/j.modgep.2006.11.008
Palmer SJ, Santucci N, Widagdo J, Bontempo SJ, Taylor KM, Tay ES, Hook J, Lemckert F, Gunning PW, Hardeman EC (2010) Negative autoregulation of GTF2IRD1 in Williams–Beuren syndrome via a novel DNA binding mechanism. J Biol Chem 285:4715–4724. doi:10.1074/jbc.M109.086660
Palmer SJ, Taylor KM, Santucci N, Widagdo J, Chan YK, Yeo JL, Adams M, Gunning PW, Hardeman EC (2012) GTF2IRD2 from the Williams–Beuren critical region encodes a mobile-element-derived fusion protein that antagonizes the action of its related family members. J Cell Sci 125:5040–5050. doi:10.1242/jcs.102798
Pérez Jurado LA, Wang Y-K, Peoples R, Coloma A, Cruces J, Francke U (1998) A duplicated gene in the breakpoint regions of the 7q11.23 Williams-Beuren syndrome deletion encodes the initiator binding protein TFII-I and BAP-135, a target of BTK. Hum Mol Genet 7:325–334
Polly P, Haddadi LM, Issa LL, Subramaniam N, Palmer SJ, Tay ES, Hardeman EC (2003) hMusTRD1a1 represses activation of the troponin I slow enhancer. J Biol Chem 278:36603–36610
Proulx E, Young EJ, Osborne LR, Lambe EK (2010) Enhanced prefrontal serotonin 5-HT(1A) currents in a mouse model of Williams–Beuren syndrome with low innate anxiety. J Neurodev Disord 2:99–108. doi:10.1007/s11689-010-9044-5
Ring C, Ogata S, Meek L, Song J, Ohta T, Miyazono K, Cho KW (2002) The role of a Williams–Beuren syndrome-associated helix–loop–helix domain-containing transcription factor in activin/nodal signaling. Genes Dev 16:820–835. doi:10.1101/gad.963802
Roy AL (2012) Biochemistry and biology of the inducible multifunctional transcription factor TFII-I: 10 years later. Gene 492:32–41. doi:10.1016/j.gene.2011.10.030
Sanders SJ, Ercan-Sencicek AG, Hus V, Luo R, Murtha MT, Moreno-De-Luca D, Chu SH, Moreau MP, Gupta AR, Thomson SA, Mason CE, Bilguvar K, Celestino-Soper PB, Choi M, Crawford EL, Davis L, Wright NR, Dhodapkar RM, DiCola M, DiLullo NM, Fernandez TV, Fielding-Singh V, Fishman DO, Frahm S, Garagaloyan R, Goh GS, Kammela S, Klei L, Lowe JK, Lund SC, McGrew AD, Meyer KA, Moffat WJ, Murdoch JD, O’Roak BJ, Ober GT, Pottenger RS, Raubeson MJ, Song Y, Wang Q, Yaspan BL, Yu TW, Yurkiewicz IR, Beaudet AL, Cantor RM, Curland M, Grice DE, Gunel M, Lifton RP, Mane SM, Martin DM, Shaw CA, Sheldon M, Tischfield JA, Walsh CA, Morrow EM, Ledbetter DH, Fombonne E, Lord C, Martin CL, Brooks AI, Sutcliffe JS, Cook EH Jr, Geschwind D, Roeder K, Devlin B, State MW (2011) Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 70:863–885. doi:10.1016/j.neuron.2011.05.002
Schneider T, Skitt Z, Liu Y, Deacon RM, Flint J, Karmiloff-Smith A, Rawlins JN, Tassabehji M (2012) Anxious, hypoactive phenotype combined with motor deficits in Gtf2ird1 null mouse model relevant to Williams syndrome. Behav Brain Res 233:458–473. doi:10.1016/j.bbr.2012.05.014
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T (2003) Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13:2498–2504. doi:10.1101/gr.1239303
Sleeman JE, Trinkle-Mulcahy L (2014) Nuclear bodies: new insights into assembly/dynamics and disease relevance. Curr Opin Cell Biol 28:76–83. doi:10.1016/j.ceb.2014.03.004
Somerville MJ, Mervis CB, Young EJ, Seo EJ, del Campo M, Bamforth S, Peregrine E, Loo W, Lilley M, Perez-Jurado LA, Mor-
Thompson PD, Webb M, Beckett W, Hinsley T, Jowitt T, Sharrocks AD, Tassabehji M (2007) GTF2IRD1 regulates transcription by binding an evolutionarily conserved DNA motif ‘GUCE’. FEBS Lett 581:1233–1242. doi:10.1016/j.febslet.2007.02.040
Tipney HJ, Hinsley TA, Brass A, Metcalfe K, Donnai D, Tassabehji M (2004) Isolation and characterisation of GTF2IRD2, a novel fusion gene mapping to the Williams–Beuren syndrome critical region. Eur J Hum Genet 12:551–560
Torniero C, dalla Bernardina B, Novara F, Vetro A, Ricca I, Darra F, Pramparo T, Guerrini R, Zuffardi O (2007) Cortical dysplasia of the left temporal lobe might explain severe expressive-language delay in patients with duplication of the Williams–Beuren locus. Eur J Hum Genet 15:62–67. doi:10.1038/sj.ejhg.5201730
Tussie-Luna MI, Michel B, Hakre S, Roy AL (2002) The SUMO ubiquitin-protein isopeptide ligase family member Miz1/PIASx-beta/Siz2 is a transcriptional cofactor for TFII-I. J Biol Chem 277:43185–43193.
doi:10.1074/jbc.M207635200 ris CA, Scherer SW, Osborne LR (2005) Severe expressive-lan- Van der Aa N, Rooms L, Vandeweyer G, van den Ende J, Reyniers E, guage delay related to duplication of the Williams–Beuren locus. Fichera M, Romano C, Delle Chiaie B, Mortier G, Menten B, N Engl J Med 353:1694–1701. doi:10.1056/NEJMoa051962 Destree A, Maystadt I, Mannik K, Kurg A, Reimand T, McMul- Sridharan R, Gonzales-Cope M, Chronis C, Bonora G, McKee R, lan D, Oley C, Brueton L, Bongers EM, van Bon BW, Pfund Huang C, Patel S, Lopez D, Mishra N, Pellegrini M, Carey M, R, Jacquemont S, Ferrarini A, Martinet D, Schrander-Stumpel Garcia BA, Plath K (2013) Proteomic and genomic approaches C, Stegmann AP, Frints SG, de Vries BB, Ceulemans B, Kooy reveal critical functions of H3K9 methylation and heterochroma- RF (2009) Fourteen new cases contribute to the characterization tin protein-1gamma in reprogramming to pluripotency. Nat Cell of the 7q11.23 microduplication syndrome. Eur J Med Genet Biol 15:872–882. doi:10.1038/ncb2768 52:94–100. doi:10.1016/j.ejmg.2009.02.006 Tanikawa M, Wada-Hiraike O, Nakagawa S, Shirane A, Hiraike H, Vullhorst D, Buonanno A (2003) Characterisation of general tran- Koyama S, Miyamoto Y, Sone K, Tsuruga T, Nagasaka K, Mat- scription factor 3, a transcription factor involved in slow mus- sumoto Y, Ikeda Y, Shoji K, Oda K, Fukuhara H, Nakagawa K, cle-specific gene expression. J Biol Chem 278:8370–8379. Kato S, Yano T, Taketani Y (2011) Multifunctional transcription doi:10.1074/jbc.M209361200 factor TFII-I is an activator of BRCA1 function. Br J Cancer Vullhorst D, Buonanno A (2005) Multiple GTF2I-like repeats of general 104:1349–1355. doi:10.1038/bjc.2011.75 transcription factor 3 exhibit DNA binding properties. Evidence for Tantin D, Tussie-Luna MI, Roy AL, Sharp PA (2004) Regulation of a common origin as a sequence-specific DNA interaction module. J immunoglobulin promoter activity by TFII-I class transcrip- Biol Chem 280:31722–31731. 
doi:10.1074/jbc.M500593200 tion factors. J Biol Chem 279:5460–5469. doi:10.1074/jbc. Widagdo J, Taylor KM, Gunning PW, Hardeman EC, Palmer SJ M311177200 (2012) SUMOylation of GTF2IRD1 regulates protein partner Tassabehji M, Carette M, Wilmot C, Donnai D, Read AP, Metcalfe interactions and ubiquitin-mediated degradation. PLoS One K (1999) A transcription factor involved in skeletal muscle gene 7:e49283. doi:10.1371/journal.pone.0049283 expression is deleted in patients with Williams syndrome. Eur J Yang W, Desiderio S (1997) BAP-135, a target for Bruton’s tyros- Hum Genet 7:737–747. doi:10.1038/sj.ejhg.5200396 ine kinase in response to B cell receptor engagement. Proc Natl Tassabehji M, Hammond P, Karmiloff-Smith A, Thompson P, Thor- Acad Sci USA 94:604–609 geirsson SS, Durkin ME, Popescu NC, Hutton T, Metcalfe K, Young EJ, Lipina T, Tam E, Mandel A, Clapcote SJ, Bechard AR, Rucka A, Stewart H, Read AP, Maconochie M, Donnai D (2005) Chambers J, Mount HT, Fletcher PJ, Roder JC, Osborne LR GTF2IRD1 in craniofacial development of humans and mice. (2008) Reduced fear and aggression and altered serotonin metab- Science 310:1184–1187. doi:10.1126/science.1116142 olism in Gtf2ird1-targeted mice. Genes Brain Behav 7:224–234. Tay ES, Guven KL, Subramaniam N, Polly P, Issa LL, Gunning doi:10.1111/j.1601-183X.2007.00343.x PW, Hardeman EC (2003) Regulation of alternative splicing of Gtf2ird1 and its impact on slow muscle promoter activity. Bio- chem J 374:359–367. doi:10.1042/BJ20030189
