Structural consequences of disease-causing mutations in the ATRX-DNMT3-DNMT3L (ADD) domain of the chromatin-associated protein ATRX

Anthony Argentaro*, Ji-Chun Yang†, Lynda Chapman†, Monika S. Kowalczyk*, Richard J. Gibbons*, Douglas R. Higgs*, David Neuhaus†, and Daniela Rhodes†‡

†Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom; and *Medical Research Council Molecular Haematology Unit, Weatherall Institute of Molecular Medicine, John Radcliffe Hospital, Headington, Oxford OX3 9DS, United Kingdom

Communicated by David Weatherall, University of Oxford, United Kingdom, May 18, 2007 (received for review February 26, 2007)

The chromatin-associated protein ATRX was originally identified because mutations in the ATRX gene cause a severe form of syndromal X-linked mental retardation associated with α-thalassemia. Half of all of the disease-associated missense mutations cluster in a cysteine-rich region in the N terminus of ATRX. This region was named the ATRX-DNMT3-DNMT3L (ADD) domain, based on sequence homology with a family of DNA methyltransferases. Here, we report the solution structure of the ADD domain of ATRX, which consists of an N-terminal GATA-like zinc finger, a plant homeodomain finger, and a long C-terminal α-helix that pack together to form a single globular domain. Interestingly, the α-helix of the GATA-like finger is exposed and highly basic, suggesting a DNA-binding function for ATRX. The disease-causing mutations fall into two groups: the majority affect buried residues and hence affect the structural integrity of the ADD domain; another group affects a cluster of surface residues, and these are likely to perturb a potential protein interaction site. The effects of individual point mutations on the folding state and stability of the ADD domain correlate well with the levels of mutant ATRX protein in patients, providing insights into the molecular pathophysiology of ATR-X syndrome.

ATR-X syndrome | NMR structure | zinc finger

Author contributions: A.A., J.-C.Y., and L.C. contributed equally to this work; R.J.G., D.R.H., D.N., and D.R. designed research; A.A., J.-C.Y., L.C., and M.S.K. performed research; R.J.G., D.R.H., and D.N. contributed new reagents/analytic tools; A.A., J.-C.Y., L.C., M.S.K., R.J.G., D.R.H., D.N. and D.R. analyzed data; and R.J.G., D.R.H., D.N., and D.R. wrote the paper.

The authors declare no conflict of interest.

Abbreviations: ADD, ATRX-DNMT3-DNMT3L; PHD, plant homeodomain.

Data deposition: The NMR chemical shifts reported in this paper have been deposited in the BioMagResBank, www.bmrb.wisc.edu [accession no. 15001 (1H, 13C, and 15N NMR resonance assignments for ADD domain 159–296)]. The atomic coordinates of the 32 accepted structures have been deposited in the Protein Data Bank, www.pdb.org (PDB ID code 2jm1).

‡To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/cgi/content/full/0704057104/DC1.

© 2007 by The National Academy of Sciences of the USA

ATRX was identified when the gene encoding this protein was shown to be mutated in a form of X-linked mental retardation (ATR-X syndrome) in young males (1, 2). Furthermore, null mutations in mice are lethal at the embryonic stage of development (3). Because ATRX mutations reduce α-globin synthesis, causing α-thalassemia, it seems likely that ATRX normally plays a role in the regulation of globin gene expression (2, 4). The complexity of the disease also suggests that ATRX could be involved in the regulation of other as yet unidentified genes.

ATRX is a large (2,492 residue; ≈280 kDa) nuclear protein predominantly localized to heterochromatin and nuclear PML bodies (5, 6). It contains two highly conserved domains, and missense mutations that give rise to ATR-X syndrome fall within these. At the C terminus is a helicase/ATPase domain, which characterizes ATRX as a member of the SNF2 (SWI/SNF) family of chromatin-associated proteins. Experimental evidence shows that ATRX acts as a DNA-dependent ATPase and as a DNA translocase, and it confers modest chromatin-remodeling activity in vitro (6). Thus, it seems likely that ATRX exerts its function by targeting chromatin.

Of the missense mutations identified in the ATRX gene, 50% are located in the N terminus of the ATRX protein, which represents just 4% of the coding sequence (Fig. 1a) (7). This region is highly cysteine-rich and contains two different types of zinc-finger motif. It was first noticed that a region of the sequence shares homology with the plant homeodomain (PHD)-type zinc fingers (8). Mutations in other PHD-containing proteins (WSTF and AIRE) are also associated with human disease (9, 10). PHD fingers are found in nuclear proteins, and accumulating evidence suggests that the role of PHD fingers is to tether proteins (directly or indirectly) to chromatin (11).

Significantly, ATRX is unusual in that its PHD finger is not isolated but is flanked N-terminally by an additional C2C2 motif. Sequence database searches revealed that the only proteins that share this feature are DNMT3A, DNMT3B, and DNMT3L, three proteins involved in DNA methylation (12) [see also supporting information (SI) Fig. 5]. This unique arrangement has been named the ATRX-DNMT3-DNMT3L (ADD) domain (13). Additionally, we note that the sequence homology extends C-terminally beyond the cysteine-rich region by ≈25 aa. The ADD domain is consistently found in orthologues of ATRX in vertebrates but is absent from the corresponding ATRX-like proteins in Caenorhabditis elegans and Drosophila, where DNA methylation is absent or rare. Consistent with this observation, the pattern of methylation is perturbed in the DNA of patients with ATR-X syndrome (14) and in mice in which ATRX has been inactivated (3). Together, these observations suggest that the ADD domain is present in chromatin-associated proteins that play a role in establishing and/or maintaining a normal pattern of DNA methylation. Thus, although the precise roles of ATRX in vivo are not known, mutations in the ADD domain clearly affect a number of biological pathways.

To understand the pathophysiology of mutations found in the ATRX-ADD domain, we determined the three-dimensional structure of this domain by NMR. Mapping 40 mutations to the structure reveals key functional regions and thereby provides an understanding of how mutations give rise to ATR-X syndrome.

Results

NMR Structure Determination of the ADD Domain of ATRX. We report here the structure of the polypeptide K159–K296 (the ADD domain) from human ATRX protein, determined by heteronuclear multidimensional NMR. The N terminus of the domain was established by proteolysis experiments using a longer polypeptide. Expression trials of constructs with different C termini showed that the 25 residues C-terminal to the PHD motif were required for solubility. The first 10 (K159–I168) and final

www.pnas.org/cgi/doi/10.1073/pnas.0704057104 PNAS | July 17, 2007 | vol. 104 | no. 29 | 11939–11944

[Figure 1, panels a–d. Panel a: schematic of ATRX (residues 1–2492) showing the positions of the ADD and helicase domains. Panel b: sequence of the ADD domain (residues 159–296) with secondary-structure elements: strands 1 and 2 and helix 1 in the GATA-like finger; strands 3 and 4 and helices 2 and 3 in the PHD finger; helix 4 forming the C-terminal helix.

159 KRGEDGLHGIVSCTACGQQVNHFQKDSIYRHPSLQVLICKNCFKYYMSDDISRDSDGM 216
217 DEQCRWCAEGGNLICCDFCHNAFCKKCILRNLGERKLSTIMDENNQWYCYICHPEPLLDLVTACNSVFENLEQLLQQNKK 296

Panel c: zinc-binding topology (Zn1–Zn3, ligands 1–12, strands s1–s4, helices h1–h4, loops 1 and 2, linker). Panel d: ribbon representation showing the GATA-like finger, the PHD finger, and the C-terminal helix.]

Fig. 1. ATRX protein sequence, structure, and disease-associated mutations. (a) The locations of the highly conserved N-terminal cysteine-rich domain and the C-terminal helicase-like domain are shown. The positions of missense mutations are indicated with circles and the number of times (>1) the mutation has been identified in unrelated individuals is indicated within relevant circles. All of the circles drawn between the oblique lines above the bar refer to mutations within the ADD domain. (b) Locations of mutations and secondary structural elements in the ADD domain. The N-terminal GATA-like zinc finger is indicated by a light green bar, the PHD finger by a mauve bar, and the C-terminal extension by a light blue bar. The conserved cysteine residues are marked as orange vertical bars. Missense mutations are highlighted in green (surface), blue (buried), and orange (cysteines); the insertion mutation is highlighted by an upward green arrow and the deletion by a downward blue arrow. Residues where there is homology across the whole family of ADD domain sequences (ATRX, DNMT3A, DNMT3B, and DNMT3L) are marked with filled circles (absolute conservation), gray circles (strong conservation), and open circles (weak conservation); for the full alignment, see SI Fig. 5. (c) Schematic showing the zinc-binding topology and secondary structure elements of the ADD domain, color scheme as for b. β-Strands are labeled s1–s4 and helices h1–h4. The zinc binding within the PHD finger has the “cross-braced” topology characteristic of such domains, with each zinc coordinated by a noncontiguous set of ligands. (d) Ribbon representation of the NMR structure of the ADD domain (lowest energy structure from the accepted ensemble of 32) of ATRX. The GATA-like finger is shown in green, the PHD finger in mauve, and the C-terminal helix in blue. Linker and unstructured regions are shown in gray, zinc atoms in pink, and side chains of the zinc coordinating cysteines in orange.
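The residue numbering used in the text can be checked against the sequence shown in b. The following is a minimal Python sketch and not part of the original analysis: it assumes the sequence exactly as printed in b (with the layout spaces of the figure removed), and the helper names `residue_at` and `matches` are hypothetical.

```python
# ADD domain sequence as printed in Fig. 1b (residues 159-296),
# joined into one string. Transcription from the figure is an assumption.
ADD_SEQUENCE = (
    "KRGEDGLHGIVSCTACGQQVNHFQKDSIYRHPSLQVLICKNCFKYYMSDDISRDSDGM"  # 159-216
    "DEQCRWCAEGGNLICCDFCHNAFCKKCILRNLGERKLSTIMDENNQWYCYICHPEPLL"  # 217-274
    "DLVTACNSVFENLEQLLQQNKK"                                      # 275-296
)
FIRST_RESIDUE = 159  # K159 is the first residue of the construct


def residue_at(position: int) -> str:
    """Return the one-letter code at an ATRX residue number (159-296)."""
    index = position - FIRST_RESIDUE
    if not 0 <= index < len(ADD_SEQUENCE):
        raise ValueError(f"position {position} is outside the ADD domain")
    return ADD_SEQUENCE[index]


def matches(label: str) -> bool:
    """Check a label such as 'W263' (residue type + number) against the sequence."""
    return residue_at(int(label[1:])) == label[0]


# A few residues named in the text and in this caption.
named = ["K159", "I168", "T172", "I196", "W222", "W263", "Q293", "K296"]
mismatches = [label for label in named if not matches(label)]
print(mismatches)  # an empty list means every label agrees with the sequence
```

The same lookup can be applied to any mutation label before it is mapped onto the structure, which guards against off-by-one numbering errors.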

4 (Q293–K296) residues of the ADD domain are unstructured in solution, whereas residues S210–D217 form a somewhat disordered linker between subdomains (see below). Unexpectedly, this linker showed 15N relaxation properties essentially identical to those of the ordered part of the protein, suggesting that any internal motions within it take place on a time scale slower than overall tumbling of the molecule (data not shown). For the ensemble of accepted structures, the backbone rmsd over residues 168–209 and 218–293 is 0.48 ± 0.12 Å.

Structure of the ADD Domain. The ADD domain is composed of three clearly distinguishable modules that pack together through extensive hydrophobic interactions to form a single globular domain (Fig. 1 c and d). Starting at the N terminus, there is first a subdomain (residues 168–209) that binds a single zinc ion through four cysteines and is structurally very similar to the zinc fingers of the erythroid transcription factor GATA-1 (15). Packed against this GATA-like finger is a second subdomain (residues 218–272), which binds two zinc ions and closely resembles the structure reported for several PHD fingers (11, 16). Finally, there is a long C-terminal α-helix (residues 273–293) that runs out from the PHD finger and makes extensive hydrophobic contacts with the N-terminal GATA finger, bringing the N and C termini of the ADD domain close together. This combination of fused GATA-like and PHD fingers within a single domain is thus far unique.

As for both of the previously reported zinc fingers of GATA-1 (15, 17), the GATA-like finger of the ADD domain comprises an irregular hairpin loop carrying the first two zinc ligands, followed by a short antiparallel β-sheet (residues 186–189, s1, paired with residues 194–197, s2) and a short α-helix (residues 198–206, h1), the latter starting between the second pair of zinc ligands. The correspondence between the GATA-like finger of the ADD domain and the fingers from GATA-1 itself is very close (Fig. 2

Fig. 2. Sequence alignment and structural comparisons of the ADD domain with other GATA and PHD fingers. (a) Structure-based sequence alignment of the GATA-like zinc finger of ATRX with the N- and C-terminal zinc fingers of GATA-1. Residues considered as structurally equivalent are shown in uppercase, structurally dissimilar residues are shown in lowercase, and background colors follow the ClustalX scheme. Numbering is based on the ATRX sequence, and the positions of the metal-binding cysteines are indicated with triangles below the alignment. (b) Structural superposition of the GATA-like finger of the ADD domain of ATRX (color scheme as in Fig. 1 b–d) with the C-terminal zinc finger of GATA-1 (shown in gray, except for the zinc, which is red). The position in the structure of the single-residue insertion in the ADD domain relative to GATA-1 is indicated as “s” and that of the four-residue insertion is indicated as “fqkd.” The helix used for DNA binding by the C-terminal finger of GATA-1 and its analogue in the ADD domain are indicated, and the basic residues that the ADD domain might use for DNA binding (see text) are shown as side chains in blue and labeled with their sequence positions. The superposition was made by fitting the N, Cα, and C′ atoms of residues 167–180, 185–190, and 192–207 of ATRX to the corresponding residues of the C-terminal finger of GATA-1, based on the alignment in a. (c) PHD finger from the ADD domain of ATRX (color scheme as in Fig. 1 b–d). (d) Structure of the PHD finger from V(D)J recombination activating protein 2 (RAG2, Protein Data Bank ID code 2a23) (28), shown in the same orientation as in c. (e) Structure of the PHD finger from death-inducer obliterator-1 (Dio1, Protein Data Bank ID code 1wem), shown in the same orientation as in c.
These PHD fingers were chosen for comparison, because they have the closest structural similarity to the PHD finger of ATRX in terms of the position and orientation of helical elements. The correct orientation in c–e was achieved by superposing the positions of the corresponding zinc-binding atoms of the eight zinc-binding residues in each protein.
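The residue ranges quoted for the superposition in b define the atom set over which a backbone rmsd is evaluated. As an illustrative Python sketch only (the actual structure calculations are described in SI Methods), the ranges can be expanded into a residue list and an rmsd computed for two coordinate sets that are assumed to be already superposed. The optimal fitting step itself is deliberately omitted, and the function names are hypothetical.

```python
import math

# Residue ranges of ATRX used for the fit, as listed in the caption (inclusive).
FIT_RANGES = [(167, 180), (185, 190), (192, 207)]
BACKBONE_ATOMS = ("N", "CA", "C")  # N, C-alpha and carbonyl carbon (C')


def fit_residues(ranges):
    """Expand inclusive (start, end) ranges into a flat list of residue numbers."""
    return [number for start, end in ranges for number in range(start, end + 1)]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the two coordinate sets have already been optimally superposed;
    no rotation or translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


residues = fit_residues(FIT_RANGES)
n_atoms = len(residues) * len(BACKBONE_ATOMS)
print(len(residues), n_atoms)  # 36 residues, 108 backbone atoms in the fit
```

Excluding positions 181–184 and 191 from the ranges corresponds to leaving the two insertions relative to GATA-1 out of the fit.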

a and b). The main differences are a four-residue insertion at positions 181–185 and a single-residue insertion at position 191 in the ADD domain structure. Discounting these, the backbone rmsd between this region of the ADD domain and the C-terminal finger of GATA-1 is 2.2 Å. Residue I196 of the ADD domain (analogous to V184 in GATA-1) forms the basis of a small hydrophobic core (contacting T172, A173, H189, P190, L192, F201, and Y204). For the ADD and both the GATA-1 fingers, the arrangement of the ligands around the zinc has S absolute chirality (following Berg’s convention in ref. 18).

The PHD finger in the ADD domain structure shares the cross-braced topology of zinc-binding interactions (Fig. 1c), but it differs from others in that all eight zinc-binding residues are cysteines (SI Fig. 5b). A more significant difference is the presence of two well formed α-helices in the ADD domain PHD finger spanning residues 241–248 (h2) and residues 250–258 (h3); this region of the sequence, which corresponds to “loop 2” of other PHD domains (11), is consequently expanded relative to other PHD fingers (SI Fig. 5b). As in other PHD fingers, the PHD finger of the ADD domain also contains a short antiparallel β-sheet (residues 230–232, s3, paired with residues 237–239, s4), and both the zinc sites have the S absolute chirality (18). There is considerable divergence among published structures of PHD fingers; some contain helices or approximately helical single turns at similar locations to the helices in the ADD domain (particularly h2), but none contains these two helices simultaneously and positioned as they are in ATRX (Fig. 2). The distinction between PHD and RING fingers can be difficult, but the presence of a largely buried tryptophan (W263) two residues N-terminal to the seventh zinc-binding ligand indicates that the structure can be classified as a PHD finger (19).

The GATA-like and PHD fingers pack closely together through an extensive network of hydrophobic interactions. The key residues in this network are T172 on the GATA-like finger, which makes many hydrophobic contacts to side chains of residues on the PHD finger, and W222 on the PHD finger, which similarly makes many hydrophobic contacts to the GATA-like finger. Interestingly, the N-terminal finger of GATA-1 itself forms a complex with a partner protein, “friend of GATA” (FOG) (20), and the interface in this complex has a similar location to the “internal interface” between the GATA-like and PHD fingers of the ADD domain (SI Fig. 6).

The 25 residues C-terminal to the PHD motif form a long α-helix (residues 273–293, h4) that completes the structure by extending away from the PHD finger and across one surface of the GATA finger (Fig. 1d). Here, it makes a further network of hydrophobic interactions, extending significantly the structural core between the GATA-like and PHD fingers. Residues on the C-terminal helix that make hydrophobic contacts to the GATA-like finger include L276, V277, V283, F284, L287, and L290.

Surface Properties of the ADD Domain. The discovery through the structural analysis that the ADD domain contains a GATA-like finger is intriguing and might shed light on the function of ATRX. GATA-1 itself is a transcription factor, and binding to its cognate DNA response element is mediated through the α-helix of its C-terminal finger (15). Interestingly, the corresponding helix (h1) is exposed in the ATRX-ADD domain structure, and basic residues on its surface (K198 and K202) combine with K183 on a neighboring loop to form a basic patch (Figs. 2 and 3), perhaps suggesting a DNA-binding function for the ATRX-ADD domain also.

Apart from the basic patch on the GATA-like finger, the only other significantly basic patch on the surface of the ATRX-ADD

Fig. 3. Electrostatic potential and location of mutations in the structure of the ADD domain. (a) Surface electrostatic potential of the ADD domain, shown in the same orientation as the ribbon view in b. The helix in the GATA-like finger (h1) is solvent-exposed and basic, and the two helices within loop 2 of the PHD finger (h2 and h3) form another basic patch. The linker between the GATA-like and PHD fingers is highly acidic. (b) Ribbon structure of the ADD domain showing the locations of mutations found in patients with ATR-X syndrome. Mutations are classified as surface (green), buried (blue), or cysteine (orange) and are represented by using their side chains, except for the glycine mutation G249C/D and the glutamine insertion, which are represented by thickening the backbone. The surface mutations are individually labeled.
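A rough feel for the charge distribution described in a can be obtained by counting formally charged side chains in the sequence of Fig. 1b. The Python sketch below is a crude proxy under stated assumptions (Lys and Arg counted as +1, Asp and Glu as −1, His and chain termini ignored); it is not the surface electrostatic calculation used for the figure. The segment boundaries are the residue ranges quoted in the text, and the function name `segment_charge` is hypothetical.

```python
# ADD domain sequence as printed in Fig. 1b (residues 159-296).
ADD_SEQUENCE = (
    "KRGEDGLHGIVSCTACGQQVNHFQKDSIYRHPSLQVLICKNCFKYYMSDDISRDSDGM"  # 159-216
    "DEQCRWCAEGGNLICCDFCHNAFCKKCILRNLGERKLSTIMDENNQWYCYICHPEPLL"  # 217-274
    "DLVTACNSVFENLEQLLQQNKK"                                      # 275-296
)
FIRST_RESIDUE = 159

# Assumed formal side-chain charges; histidine is treated as neutral.
FORMAL_CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def segment_charge(start: int, end: int) -> int:
    """Sum of formal side-chain charges over residues start..end (inclusive)."""
    segment = ADD_SEQUENCE[start - FIRST_RESIDUE : end - FIRST_RESIDUE + 1]
    return sum(FORMAL_CHARGE.get(residue, 0) for residue in segment)


regions = {
    "h1 (198-206)": segment_charge(198, 206),
    "linker region (207-217)": segment_charge(207, 217),
    "h2 and h3 (241-258)": segment_charge(241, 258),
}
for name, charge in regions.items():
    print(f"{name}: {charge:+d}")
# h1 (198-206): +2
# linker region (207-217): -4
# h2 and h3 (241-258): +3
```

Even this simple count reproduces the qualitative picture of two basic helical regions separated by an acidic linker, although it says nothing about which side chains are actually solvent-exposed.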

domain occurs on the PHD finger in the region involving helices 2 and 3, where residues K241, K242, R246, R251, and K252 combine to form the most prominent feature of the electrostatic surface (Fig. 3). The two basic patches lie on the same face of the ADD domain surface and are separated by ≈20 Å. Much of the remainder of the ADD domain surface is acidic, in particular the area immediately between the two basic patches, where several acidic side chains are contributed by the linker between the GATA-like and PHD fingers (D207, D208, D212, D214, and D217) and the PHD finger itself (E218 and E225).

Structure/Function Analysis of Amino Acid Point Mutations. To relate the structural information to function, we mapped the 40 previously reported ATRX mutations of known functional relevance to the three-dimensional structure of the ADD domain (Fig. 3, see also SI Table 1). These occur as single point mutations in patients and change the identity of single amino acids at 28 positions in the structure, located in both the GATA-like and PHD fingers. Multiple mutations change the identity of 16 deeply buried amino acid residues, in the hydrophobic core of the structure, referred to as buried mutations. Additional mutations affect six zinc-coordinating cysteines (Fig. 3, C200, C220, C223, C240, C243, and C265). Interestingly, another set of point mutants map to six solvent-exposed amino acid residues located along a strip on one face of the ADD domain structure (Fig. 3), involving residues from both the GATA-like and PHD fingers. Within the GATA-like finger, two mutations (N179S and the glutamine insertion between Q176 and Q177) occur in a loop region, marginally increasing its polarity. Adjacent to these is a region that appears to be a mutational hotspot (L245–E252). These mutations fall within the loop 2 region that in other PHD fingers has been identified as a protein interaction site (11). In the PHD finger of ATRX, this loop 2 region is longer and more highly structured, containing two α-helices. Residues on the surface of the two α-helices give rise to the larger of the two basic patches mentioned (Fig. 3), and some of the mutations result in a reduction in basicity.

To obtain a measure of the effect of the mutations on the ADD domain structure, we expressed many of the point mutants in Escherichia coli and analyzed their ability to fold by using one-dimensional NMR (SI Table 1 and SI Fig. 7). ADD domain proteins containing a mutation of buried amino acid residues expressed at low levels and those affecting zinc-coordinating cysteines expressed only in trace amounts. Surprisingly, when sufficient protein was available for one-dimensional NMR analysis, most of these mutant proteins appeared to be correctly folded. By contrast, protein constructs containing mutations of surface residues were expressed at relatively high levels and were correctly folded as judged by one-dimensional NMR. Therefore, these data reinforce the conclusions drawn from the structural analysis and suggest that the destabilization of the structural core results in lower recovery of soluble protein from E. coli.

Relating Structure to Function in Vivo. Potentially, the most interesting and significant insight might come from investigating the effects of known mutations in the ADD structure on ATRX expression in vivo. Because ATRX is ubiquitously expressed throughout differentiation and development, we analyzed endogenously expressed ATRX by using EBV-transformed cell lines from normal individuals and from patients with ATR-X syndrome (SI Table 1). We first analyzed the ATRX mRNA expression levels and found, as expected, that the missense mutations had no significant effect on mRNA expression (Fig. 4a). We next analyzed the levels of ATRX protein by quantitative immunoblotting (Fig. 4b and SI Table 1). Patients with point mutations of the zinc-coordinating cysteines had very low, but readily detectable, levels of full-length protein (7–12% of normal levels). Mutations in buried residues also gave rise to lower levels of full-length ATRX protein (6–29% of normal levels). Mutations of surface residues (as defined in SI Table 1) had less effect on the amounts of ATRX (32–55% of normal levels). It is striking that the levels of ATRX protein found in patients correlate with the amount of ADD domain recovered from E. coli (SI Table 1). Given that the ATRX mRNA levels are not affected by missense mutations, the most likely explanation for the low levels of ATRX found in patients is that destabilization of the structural fold of the ADD domain causes loss of full-length ATRX protein in the cellular environment.

Thus, these structure/function studies demonstrate that there are at least two molecular mechanisms underlying ATR-X syndrome. Mutations that affect the structural core of the ADD domain destabilize the protein fold, which in turn leads to lower ATRX protein levels in vivo. The lower levels of protein as well as the instability of the structure are likely to affect function, including interactions with ATRX partners. The second mechanism involves the class of mutations that fall on the surface of the ADD domain (Fig. 4). Although these mutations also affect protein stability to some extent, they are most likely to affect interactions with specific ATRX ligands directly.

Discussion

The structure of the ADD domain of ATRX presented here represents a unique combination of zinc-binding modules, comprising a GATA-like zinc finger, a PHD zinc finger, and C-terminal α-helix that pack together to fold as a single globular domain. Given the sequence similarity across the members of the ADD domain family (12) (SI Fig. 5), it is likely that the essential features of this structure are also shared by the DNMT3 proteins.

The ADD domain of ATRX harbors 50% of all naturally occurring missense mutations of the ATRX gene. Mapping of the mutations to the structure provides insights into key functional regions of the ADD domain and hence an understanding of how these mutations lead to the genetic disease ATR-X syndrome. In general, ATR-X syndrome is thought to be caused by a loss of function arising from mutations in the ATRX protein and, where studied, is most commonly associated with low levels of ATRX protein in patients (Fig. 4 and unpublished data). With few exceptions (7), the ATRX phenotype is uniformly severe regardless of the type of mutation. The structure/function analysis presented here shows that most of the ADD point mutations associated with ATR-X syndrome affect buried amino acid residues that are important for structural integrity. In general, such mutations are likely to destabilize the structure, providing an explanation for the lower levels of ATRX protein found in patients. There is a close relationship between the location and function of mutated residues in the core of the ADD domain and the varying amount of ATRX protein found in patients: the more essential the residue is for the structural integrity, the more severe is its effect on ATRX protein levels.

The mutations that result in the lowest ATRX protein levels in patients affect the structurally crucial zinc-coordinating cysteines. It is surprising that patients with such mutations have any ATRX protein at all (7–10% of normal levels), suggesting that the extensive hydrophobic core of such mutant proteins partly rescues their fold. This observation would also explain why mutations of other buried or partly buried residues have intermediate effects (6–29% of normal levels). These observations of the effects of mutations are reminiscent of the well documented case of the tumor suppressor p53 (21).

It was therefore of particular interest that there is a small subset of six mutated locations in the ADD domain of ATRX involving surface amino acid residues. Patients with these mutations are clinically indistinguishable from those with mutations in the structural core of the ADD domain. Because such patients have relatively normal levels of the protein, these surface mutations must cause the disease by affecting an essential ATRX function. Four of the mutated locations on the surface lie within or adjacent to a large basic patch (Fig. 3) that corresponds to the loop 2 region in other PHD fingers. In some cases, loop 2 has been shown to act as a protein interaction surface. For example, interaction of the PHD finger of KAP1 with Mi-2a depends on residues within loop 2 (22). Similarly, this region of the PHD domain is found in the Drosophila protein Pygopus (a nuclear component of the Wnt signaling pathway), which binds Legless/BCL9. These observations suggest that the exposed loop 2 region in the PHD finger of ATRX may similarly be involved in interacting with specific protein ligands and that point mutations in this region result in disease because they disrupt such interactions.

Although the specific ligand(s) for the ADD domain of ATRX is unknown, studies on other PHD-finger-containing proteins suggest that the function of such domains is to interact (directly or indirectly) with DNA or chromatin. Recent studies have shown that the PHD fingers of at least two chromatin-associated proteins (BPTF and ING2) specifically recognize trimethylated H3 histone tails (H3K4me3). In these cases, discrimination of the methylation state of the side chain is achieved by accommodating the trimethylated group of H3K4 into a hydrophobic aromatic cage on the surface of these PHD domains. No such aromatic cage is present in the ADD domain structure, suggesting that if the PHD finger of ATRX participates in histone tail recognition, it does so by recognizing unmodified histone tails or tails with other posttranslational modifications.

The presence of a basic GATA-1-like zinc finger in the ADD domain of ATRX suggests a function in DNA or chromatin binding. GATA-1 itself is a transcription factor; its C-terminal zinc finger binds DNA sequence specifically, and a structure of the complex has been determined (15). Such a binding site has not been identified for ATRX, but modeling the α-helix of the GATA-like finger of the ADD domain onto the recognition helix of GATA-1 itself in its complex with DNA suggests that an analogous interaction would probably be sterically accessible for the ADD domain (SI Fig. 8). Consistent with this view, previous studies demonstrated that in vitro, the ADD domain of ATRX is able to bind DNA homopolymers (23) and genomic DNA fragments (unpublished data). Notably, DNA binding was substantially decreased by mutations of the two basic lysine residues (K198A and K202A) on the DNA-recognition helix (unpublished data). Although the structural homology to GATA-1 and DNA-binding activity are tantalizing, it remains to be shown whether the interaction is sequence-specific or is relevant to ATRX function.

In conclusion, the mapping of the point mutations on the three-dimensional structure of the ADD domain of ATRX, together with in vivo analysis of the effects of mutations on ATRX protein levels, provides a molecular understanding of ATR-X syndrome. This structure/function analysis has led to the discovery of both putative DNA and protein-interaction regions (the GATA-like finger and the loops 1 and 2 regions of the PHD finger), which may act independently or in concert.

Fig. 4. ATRX in vivo expression in EBV-transformed patient lymphocytes. (a) ATRX mRNA levels of patient mutations and normal controls as determined by quantitative RT-PCR. Patients are grouped according to the nature of their underlying mutation: cysteine mutations are orange, buried mutations are blue, and surface mutations are green. Values for normal individuals are represented by black circles. For each case, the ATRX mRNA level is expressed as the percentage of the average for 18 normal control individuals. (b) ATRX protein levels of patients and normal controls. Cases are grouped as in a. ATRX protein levels are expressed as a percentage of the average value for seven normal control individuals. (c) Representative Western blots showing ATRX protein levels (including loading control). Lane 1 represents the ATRX protein level for a cysteine mutation, lanes 2–4 are buried mutations, lanes 5–7 are surface mutations, and lane 8 is a normal control.
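The relationship between mutation class and ATRX protein level reported in the Results can be summarized in a small lookup. The Python sketch below is illustrative only: the percentage ranges are those quoted in the Results text for each class, whereas the function names and the example measurements are hypothetical and do not come from SI Table 1.

```python
# Ranges of full-length ATRX protein (% of normal) quoted in the Results.
LEVEL_RANGES = {
    "cysteine": (7, 12),
    "buried": (6, 29),
    "surface": (32, 55),
}


def percent_of_control(sample: float, controls: list) -> float:
    """Express a measurement as a percentage of the mean of the control values."""
    return 100.0 * sample / (sum(controls) / len(controls))


def consistent_classes(percent_of_normal: float) -> list:
    """Return the mutation classes whose reported range contains the given level."""
    return sorted(
        name
        for name, (low, high) in LEVEL_RANGES.items()
        if low <= percent_of_normal <= high
    )


# Hypothetical band intensities: one patient sample and two normal controls.
level = percent_of_control(4.0, [9.0, 11.0])  # 40.0
print(consistent_classes(level))   # ['surface']
print(consistent_classes(10.0))    # ['buried', 'cysteine']
```

Because the cysteine and buried ranges overlap, a protein level alone does not separate those two classes; only the surface class occupies a distinct range.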

Materials and Methods

Preparation of the ADD Domain of ATRX. The polypeptide spanning residues 159–296 of ATRX (Fig. 1b) was chosen for NMR studies based on experiments to test different fragments for their expression, solubility, and monodisperse behavior. Proteins were expressed and purified as described in ref. 24 and set out in SI Methods.

Site-Directed Mutagenesis. Site-directed mutagenesis was performed by using the QuikChange Site-Directed Mutagenesis kit (Stratagene, La Jolla, CA) according to the manufacturer’s instructions. The oligonucleotides designed to generate mutant proteins are set out in SI Table 2. All constructs were checked by DNA sequence analysis of both strands.

NMR Structure Determination. Details of NMR sample preparation, data acquisition, assignment strategy, and structure calculations appear in SI Methods. For the PHD finger domain, metal–ligand connectivities were unambiguously established by analyzing preliminary structures calculated without metal-binding constraints in conjunction with sequence alignment against other PHD fingers; this analysis left no ambiguities for the GATA-like finger. Of 100 structures calculated in the final round, 32 correspond to a well defined plateau region in the energy and energy-ordered rmsd profiles (SI Fig. 9), indicating that they form a suitable set for reporting structural statistics (SI Table 3) (25).

Preparation and Quantitation of ATRX RNA. RNA was prepared from EBV-transformed lymphoblastoid cell lines. Complementary DNA (cDNA) was prepared from 2 μg of RNA and analyzed by using quantitative real-time PCR (26). Details of primers, appropriate controls, and cycling conditions are given in SI Methods.

Western Blot Analysis and Quantification of ATRX Protein. Nuclei from EBV-transformed cell lines from normal control individuals and those with ATR-X syndrome were prepared in duplicate. Proteins were isolated and analyzed by immunoblotting (27). Intensities of bands in the Western blots were quantified by using the Quantity One 1D ChemiDoc analysis software (Bio-Rad, Hemel Hempstead, U.K.).

We thank Alexey Murzin and Antonina Andreeva for assistance and helpful discussions concerning sequence alignment and structural homologies.

1. Weatherall DJ, Higgs DR, Bunch C, Old JM, Hunt DM, Pressley L, Clegg JB, Bethlenfalvay NC, Sjolin S, Koler RD, et al. (1981) N Engl J Med 305:607–612.
2. Gibbons RJ, Picketts DJ, Villard L, Higgs DR (1995) Cell 80:837–845.
3. Garrick D, Sharpe JA, Arkell R, Dobbie L, Smith AJH, Wood WG, Higgs DR, Gibbons RJ (2006) PLoS Genet 2:438–450.
4. Gibbons RJ, Pellagatti A, Garrick D, Wood WG, Malik N, Ayyub H, Langford C, Boultwood J, Wainscoat JS, Higgs DR (2003) Nat Genet 34:446–449.
5. McDowell TL, Gibbons RJ, Sutherland H, O’Rourke DM, Bickmore WA, Pombo A, Turley H, Gatter K, Picketts DJ, Buckle VJ, et al. (1999) Proc Natl Acad Sci USA 96:13983–13988.
6. Xue YT, Gibbons R, Yan ZJ, Yang DF, McDowell TL, Sechi S, Qin J, Zhou SL, Higgs D, Wang WD (2003) Proc Natl Acad Sci USA 100:10635–10640.
7. Gibbons RJ, Wada T (2004) in Molecular Basis of Inborn Errors of Development, eds Epstein C, Erickson R, Wynshaw-Boris A (Oxford Univ Press, London), pp 747–757.
8. Gibbons RJ, Bachoo S, Picketts DJ, Aftimos S, Asenbauer B, Bergoffen J, Berry SA, Dahl N, Fryer A, Keppler K, et al. (1997) Nat Genet 17:146–148.
9. Lu X, Meng X, Morris CA, Keating MT (1998) Genomics 54:241–249.
10. Nagamine K, Peterson P, Scott HS, Kudoh J, Minoshima S, Heino M, Krohn KJ, Lalioti MD, Mullis PE, Antonarakis SE, et al. (1997) Nat Genet 17:393–398.
11. Bienz M (2006) Trends Biochem Sci 31:35–40.
12. Xie S, Wang Z, Okano M, Nogami M, Li Y, He WW, Okumura K, Li E (1999) Gene 236:87–95.
13. Aapola U, Shibuya K, Scott HS, Ollila J, Vihinen M, Heino M, Shintani A, Kawasaki K, Minoshima S, Krohn K, et al. (2000) Genomics 65:293–298.
14. Gibbons RJ, McDowell TL, Raman S, O’Rourke DM, Garrick D, Ayyub H, Higgs DR (2000) Nat Genet 24:368–371.
15. Omichinski JG, Clore GM, Schaad O, Felsenfeld G, Trainor C, Appella E, Stahl SJ, Gronenborn AM (1993) Science 261:438–446.
16. Pascual J, Martinez-Yamout M, Dyson HJ, Wright PE (2000) J Mol Biol 304:723–729.
17. Kowalski K, Czolij R, King GF, Crossley M, Mackay JP (1999) J Biomol NMR 13:249–262.
18. Berg J (1988) Proc Natl Acad Sci USA 85:99–102.
19. Dodd RB, Allen MD, Brown SE, Sanderson CM, Duncan LM, Lehner PJ, Bycroft M, Read RJ (2004) J Biol Chem 279:53840–53847.
20. Liew CK, Simpson RJY, Kwan AHY, Crofts LA, Loughlin FE, Matthews JM, Crossley M, Mackay JP (2005) Proc Natl Acad Sci USA 102:583–588.
21. Joerger AC, Ang HC, Fersht AR (2006) Proc Natl Acad Sci USA 103:15056–15061.
22. Capili AD, Schultz DC, Rauscher FJI, Borden KL (2001) EMBO J 20:165–177.
23. Cardoso C, Lutz Y, Mignon C, Compe E, Depetris D, Mattei MG, Fontes M, Colleaux L (2000) J Med Genet 37:746–751.
24. Court R, Chapman L, Fairall L, Rhodes D (2005) EMBO Rep 6:39–45.
25. Fletcher CM, Jones DNM, Diamond R, Neuhaus D (1996) J Biomol NMR 8:292–310.
26. Heid CA, Stevens J, Livak KJ, Williams PM (1996) Genome Res 6:986–994.
27. Harlow E, Lane D (1988) Antibodies: A Laboratory Manual (Cold Spring Harbor Lab Press, Cold Spring Harbor, NY).
28. Elkin SK, Ivanov D, Ewalt M, Ferguson CG, Hyberts SG, Sun Z-YJ, Prestwich GD, Yuan J, Wagner G, Oettinger MA, et al. (2005) J Biol Chem 280:28701–28710.

11944 | www.pnas.org/cgi/doi/10.1073/pnas.0704057104 Argentaro et al.