Identification of fragments binding to CHD1 double chromodomain, an epigenetic reader of KDM1A K114me2 and H3 K4me3

Inaugural Dissertation

for the attainment of the doctoral degree of the Faculty of Chemistry and Pharmacy of the Albert-Ludwigs-Universität Freiburg im Breisgau

submitted by

Joel Patrick McMillan from Opotiki, New Zealand

2018

Chair of the Doctoral Committee: Prof. Dr. Stefan Weber
Dean: Prof. Dr. Manfred Jung
Referee: Prof. Dr. Oliver Einsle
Co-referee: Prof. Dr. Roland Schüle
Date of oral examination: 12/10/2018

Abstract

Prostate cancer is the most common malignancy in men worldwide and the third most common cause of cancer-related deaths [1-3]. Chromosomal rearrangements, such as TMPRSS2-ERG, occur frequently in prostate cancer. Chromosomal rearrangements in prostate cells depend on the presence of androgens, such as dihydrotestosterone (DHT). In this study, KDM1A K114 was found to be dimethylated by EHMT2 in prostate cells, and CHD1 was shown to bind specifically to KDM1A K114me2. Another potential interaction partner of KDM1A methylated at K114 is the tandem Tudor domain of KDM4A. Androgen-dependent methylation of KDM1A K114 by EHMT2 and its recognition by CHD1 control recruitment of the androgen receptor (AR) to chromatin and therefore AR-dependent expression. The EHMT2-KDM1A K114me2-CHD1-AR circuit was also shown to control chromosomal rearrangements, such as TMPRSS2-ERG. A 1.6 Å resolution X-ray co-crystal structure of the CHD1 double chromodomain (DCD) and a KDM1A108-119 K114me2 peptide revealed a unique binding mode that presents an opportunity to develop specific inhibitors of the interaction between CHD1 and KDM1A K114me2. Fragment screening using X-ray crystallography identified a total of 53 fragments binding to eight unique sites on CHD1 DCD. Of these, 41 fragments bound at a fragment 'hotspot' near the peptide-binding interface. Fragments bound at this hotspot could represent good starting points for the development of chemical probes that specifically block the interaction between CHD1 and KDM1A K114me2 but not the interaction with the other identified methylated interaction partner, H3 K4me3. One fragment was identified that binds to the methyllysine-binding aromatic cage of CHD1 DCD. This is the first reported small molecule bound to the aromatic cage of a DCD. An unexpected fragment binding site was also identified at the methylated peptide-binding interface. Fragments bound at this site, or at the aromatic cage, could represent starting points for the development of the first chemical probes or inhibitors for a DCD.


Acknowledgements

I would first like to thank my supervisors Prof. Dr Oliver Einsle (Albert-Ludwigs-Universität Freiburg), Prof. Dr Roland Schüle (Universitätsklinikum Freiburg), Dr Ralf Flaig (Diamond Light Source Ltd.), and Dr Holger Greschik (Universitätsklinikum Freiburg) for their guidance during my doctoral studies. I would like to acknowledge funding from Diamond Light Source Ltd (Diamond Light Source studentship [STU0075]). I would like to thank Swiss Light Source (initial data collection for the His6-CHD1270-443 and KDM1A108-119 K114me2 co-crystal structure) and Diamond Light Source for beam time, and their staff for assistance with X-ray diffraction data collection for all structures presented in this work. I would especially like to acknowledge the I04 and I04-1 beamline staff from Diamond Light Source for their assistance with X-ray diffraction data collection, and the XChem Facility (Diamond Light Source) for the use of their facility and the assistance of their staff in performing and analysing fragment screening using X-ray crystallography. For their assistance in the generation and analysis of ITC data and X-ray crystal structures I am indebted to Dr Anne Kathrin-Schott, Dr Stefan Gerhardt, and Dr Daniel Wohlwend. I would also like to thank all of the co-authors from my scientific publications for their invaluable expertise and contributions. Special thanks also go to Dr Eric Metzger for all of his scientific contributions, which were of significant help, especially at the beginning of this research project. High-throughput cloning was performed using the Oxford Production Facility. Finally, I would like to thank all my colleagues from the Schüle lab (Universitätsklinikum Freiburg) and the MX lab at Diamond Light Source.


Table of Contents

1 Introduction
1.1 Prostate cancer
1.2 Epigenetics
1.2.1 Post-translational modifications
1.2.2 Erasers - lysine demethylases
1.2.2.1 KDM1A
1.2.2.2 KDM4A
1.2.3 Reader
1.2.4 Methyllysine reader domains
1.2.4.1 PHD family
1.2.4.2 Tandem Tudor domain (TTD)
1.2.4.3 Chromodomain
1.2.4.4 CHD family
1.2.5 CHD1
1.2.5.1 CHD1 and prostate cancer
1.2.6 Epigenetic proteins as drug targets
1.2.7 Targeting reader domains
1.2.7.1 Acetyllysine reader domain inhibitors
1.2.7.2 Methyllysine reader domain inhibitors
2 Materials and Methods
2.1 General methods
2.1.1 Reagents
2.1.2 Primers and constructs
2.1.3 Peptide synthesis
2.1.4 Site-directed mutagenesis
2.1.5 Protein-domain microarray
2.2 Methods relating to chapter 3: Assembly of methylated KDM1A and CHD1 in prostate cells
2.2.1 Cloning and protein expression
2.2.2 Protein purification
2.2.2.1 Cell lysis
2.2.2.2 Immobilised metal affinity chromatography (IMAC)
2.2.2.3 Gel filtration chromatography
2.2.3 Isothermal titration calorimetry (ITC)
2.2.4 His6-CHD1270-443-KDM1A108-119 K114me2 co-crystallisation


2.2.5 X-ray diffraction data collection and analysis
2.2.6 Peptide pull-down
2.2.6.1 Protein expression and purification of GST-tagged proteins
2.2.6.2 Peptide binding
2.2.6.3 Peptide pull-down assay
2.3 Methods relating to chapter 4: Fragment screening using X-ray crystallography
2.3.1 Crystallisation
2.3.2 Imaging and ranking crystal drops
2.3.3 Fragment library
2.3.4 DMSO tolerance test
2.3.5 Fragment soaking
2.3.6 X-ray diffraction data collection
2.3.7 X-ray diffraction data analysis
2.3.8 Model building, ligand identification and modelling, and refinement
2.3.8.1 PanDDA
2.3.8.2 Modelling and refinement
2.3.9 Fragment validation
2.3.10 High-throughput cloning and expression of DCDs from other CHD proteins
2.3.11 Large-scale protein expression and purification trials
2.4 Methods relating to chapter 5: Other potential readers of KDM1A methylated at K114
2.4.1 Cloning and protein expression
2.4.1.1 pET15b-KDM4A895-1010
2.4.1.2 pOPINF-KDM4A895-1010
2.4.2 Protein purification
2.4.2.1 Cell lysis
2.4.2.2 IMAC
2.4.2.3 3C cleavage and dialysis
2.4.2.4 Reverse IMAC
2.4.2.5 Gel filtration chromatography
2.4.3 SEC-MALS
2.4.4 Isothermal titration calorimetry
2.4.5 Crystallisation
2.4.5.1 I4132
2.4.5.2 P63
2.4.5.3 P21


2.4.5.4 P1
2.4.6 X-ray diffraction data collection and analysis
3 Assembly of methylated KDM1A and CHD1 in prostate cells
3.1 EHMT2 methylates KDM1A
3.2 CHD1 is a KDM1A K114me2 reader protein
3.2.1 Isothermal titration calorimetry
3.2.2 His6-CHD1270-443 and KDM1A108-119 K114me2 co-crystallisation
3.2.2.1 Phosphorylation of KDM1A S111 disrupts binding by CHD1 DCD
3.2.2.2 KDM1A R113 is important for binding to CHD1
3.2.3 CHD1 interacts with KDM1A K114me2 in vivo
3.3 Co-occupancy of CHD1, KDM1A K114me2 at AR-binding sites controls AR-dependent transcription
3.4 KDM1A K114me2 controls TMPRSS2-ERG gene fusion
3.5 Discussion and outlook
4 Fragment screening using X-ray crystallography
4.1 Crystallisation
4.2 DMSO tolerance test
4.3 Fragment soaking experiments
4.3.1 P212121 structure
4.4 Fragment identification
4.5 Fragment modelling and validation
4.5.1 Site 7
4.5.1.1 Fragment x0210
4.5.2 Site 2
4.5.3 Site 1
4.5.3.1 Orientation 1
4.5.3.2 Orientation 2
4.5.4 Site 3
4.5.5 Site 4
4.5.6 Site 5
4.5.7 Site 8
4.5.8 Site 9
4.6 Outlook


5 KDM4A: Another potential KDM1A K114me reader
5.1 KDM4A is a potential KDM1A K114me reader protein
5.2 KDM4A-TTD exists as a dimer
5.3 KDM4A-TTD binds KDM1A K114me2/3
5.4 KDM1A K114me2/3 peptides bind at the aromatic cage of KDM4A HTD-2
5.5 Other KDM4 family members do not interact specifically with methylated KDM1A K114
6 Discussion
7 Appendix A: Fragment screening
7.1 X-ray diffraction data collection, refinement, and fragment validation statistics
7.2 H3 K4me3 binding and fragment details
7.3 Fragment binding validation
7.3.1 Site 1 (and site 6)
7.3.1.1 Orientation 1
7.3.1.2 Orientation 2
7.3.2 Site 2
7.3.3 Site 3
7.3.4 Site 4
7.3.5 Site 5
7.3.6 Site 7
7.3.7 Site 8
7.3.8 Site 9
7.4 High resolution limits of datasets used for PanDDA
8 Appendix B: Additional materials and methods
8.1 Competent cells, plasmids, peptides, and primers
8.2 CHD protein expression and purification protocols
8.2.1 CHD2260-452 expression and purification
8.2.1.1 Transformation and protein expression
8.2.1.2 Cell lysis
8.2.1.3 IMAC
8.2.1.4 3C cleavage and dialysis


8.2.1.5 Reverse IMAC
8.2.1.6 Gel filtration chromatography
8.2.2 CHD4364-506 expression and purification
8.2.2.1 Transformation and protein expression
8.2.2.2 Protein purification
8.2.3 CHD4499-677 expression and purification
8.2.3.1 Transformation and protein expression
8.2.3.2 Protein purification
8.2.4 CHD6285-435 protein expression and purification
8.2.4.1 Transformation, protein expression and purification
8.2.5 CHD8637-790 protein expression and purification
8.2.5.1 Transformation and protein expression
8.2.5.2 Protein purification
8.2.6 CHD9680-840 protein expression and purification
8.2.6.1 Transformation and protein expression
8.2.6.2 Cell lysis
8.2.6.3 IMAC
8.2.6.4 Gel filtration chromatography
References


List of Figures

Figure 1.1 Reaction mechanism of lysine demethylation by KDM1A.
Figure 1.2 Crystal structure of KDM1A171-836 in complex with CoREST308-440.
Figure 1.3 Schematic of KDM4A.
Figure 1.4 Cavity insertion and surface groove recognition pockets.
Figure 1.5 General structure of PHD domain.
Figure 1.6 Bi-lobal KDM4A TTD structure.
Figure 1.7 Structures of a chromodomain, double chromodomain, and chromobarrel domain.
Figure 1.8 Domain architecture of the CHD family.
Figure 1.9 X-ray crystal structure of CHD1 DCD bound to H3 K4me3 peptide.
Figure 1.10 Structure of Chd1 bound to a
Figure 1.11 UNC1215 binds in a unique 2:2 polyvalent mode to L3MBTL3.
Figure 1.12 Compounds bind to CBX7 CD.
Figure 3.1 EHMT2 methylates KDM1A K114.
Figure 3.2 EHMT2 methylates KDM1A K114 in vivo.
Figure 3.3 Protein-domain microarray.
Figure 3.4 CHD6 and CHD9 peptide pull-down.
Figure 3.5 ITC binding assays with His6-CHD1270-443 and KDM1A104-127 K114 and H31-23 K4 peptides with differing methylation states.
Figure 3.6 Electrostatic model of a 1.6 Å co-crystal structure of His6-CHD1270-443 and KDM1A108-119 K114me2.
Figure 3.7 Co-crystal structure of His6-CHD1270-443 and KDM1A108-119 K114me2 (1.6 Å).
Figure 3.8 KDM1A K114me2, H3 K4me3, and NS1 K229me2 peptides bind at the same site of CHD1 double chromodomain.
Figure 3.9 KDM1A K114me2 forms cation-π interactions with W322 and W325.
Figure 3.10 Anionic residues from CHD1 chromodomain 2 form interactions with KDM1A K114me2, H3 K4me3, and NS1 K229me2.
Figure 3.11 Electrostatic models of Rag2 and ING4 PHD domains bound to H3 K4me3 peptides.
Figure 3.12 Phosphorylation of KDM1A S111 disrupts binding between His6-CHD1270-443 and KDM1A104-127 K114me2.


Figure 3.13 ITC experiments demonstrate the importance of the CHD1 D425-KDM1A R113 interaction.
Figure 3.14 CHD1 interacts with KDM1A K114me2 in vivo.
Figure 3.15 Schematic demonstrating how an inhibitor targeting the interaction between CHD1 and KDM1A might function.
Figure 3.16 Potential novel site for drug development.
Figure 4.1 Diamond-SGC Poised Library (DSPL).
Figure 4.2 Model of a 1.74 Å X-ray crystal structure of CHD1 DCD in complex with H3 K4me3 peptide crystallised in space group P212121.
Figure 4.3 Crystal packing of CHD1 crystals.
Figure 4.4 Fragment binding sites in P21212 CHD1 DCD structures.
Figure 4.5 Fragment binding sites in P212121 CHD1 DCD structures.
Figure 4.6 Fragment x0210 binds in the aromatic cage of CHD1 DCD.
Figure 4.7 Evidence of co-occupancy of fragment x0210 and H3 K4me3 in the aromatic cage.
Figure 4.8 Fragment x0210 binds at site 7 in the aromatic cage.
Figure 4.9 Fragment x0210 forms a cation-π interaction with CHD1 W325.
Figure 4.10 Crystal packing near the peptide interface may limit solvent access to the aromatic cage of chain B.
Figure 4.11 Fragment x0381 binds to CHD1 DCD at site 2 located at the peptide-binding interface.
Figure 4.12 Fragments bind at site 1 in two distinct orientations.
Figure 4.13 Fragment x0628 binds to site 1 in orientation 1.
Figure 4.14 Fragment x0628 binds at site 1 in orientation 1 and extends toward the CHD1 D425-KDM1A R113 interface.
Figure 4.15 Fragment x0572 binds at site 1 in orientation 1 in a similar manner to fragment x0628.
Figure 4.16 Nine fluorophenyl or chlorophenyl-containing fragments bind at site 1 in orientation 1.
Figure 4.17 Fragment x0040 binds at site 1 in orientation 1 and extends out of the pocket.
Figure 4.18 Fragment x0040 binds at site 1 in orientation 1 with high confidence.


Figure 4.19 Residues from the peptide-binding interface adopt different positions in the fragment x0271 bound state.
Figure 4.20 Modelling of fragment x0271 at site 1 is well supported.
Figure 4.21 Fragment x0271 bound at site 1 in orientation 1.
Figure 4.22 Fragment x0335 bound at site 1 in orientation 1.
Figure 4.23 Modelling of fragment x0335 at site 1 in orientation 1 is well supported.
Figure 4.24 Fragment x0125 is a high confidence hit at site 1 in orientation 2.
Figure 4.25 Fragment x0125 binds to CHD1 DCD at site 1 in orientation 2.
Figure 4.26 Binding site and validation of fragment x0240.
Figure 4.27 Complex X-ray crystal structure of His6-CHD1270-443, H31-12 K4me3, and fragment x0240 (2.23 Å).
Figure 4.28 Fragment x0459 binds in a unique manner to site 1.
Figure 4.29 Evidence for placement of fragment x0459 at sites 1 and 6.
Figure 4.30 Fragment x0459 is a unique fragment bound at site 1.
Figure 4.31 Fragments bound at site 1 in orientation 2 occupy different areas of the pocket.
Figure 4.32 Fragments bound at site 3.
Figure 4.33 Fragment x0432-1 bound at site 4.
Figure 4.34 High-confidence modelling of fragment x0622 at site 5.
Figure 4.35 Fragment x0507 bound at site 8 of CHD1 DCD.
Figure 4.36 Fragment x0157 bound at site 9 at crystal packing interface.
Figure 5.1 Protein-domain microarray.
Figure 5.2 SEC-MALS results indicate KDM4A895-1010 exists as dimer in solution.
Figure 5.3 Head-to-tail dimer occludes the aromatic cage.
Figure 5.4 ITC binding assays between His6-KDM4A895-1010 and KDM1A104-127 K114 peptides with differing methylation states.
Figure 5.5 KDM1A108-119 K114me2/3 peptides are bound at a likely crystal packing interface.
Figure 5.6 KDM1A108-119 K114me2/3 peptides form extensive interactions with two KDM4A-TTD molecules.
Figure 5.7 X-ray crystal structure at 1.63 Å resolution of KDM4A895-1010 and KDM1A108-119 K114me2 peptide.


Figure 5.8 KDM1A R113 and K114me2 side chains form interactions with KDM4A895-1010.
Figure 5.9 An imidazole molecule occupies the aromatic cage of a KDM4A895-1010 symmetry molecule.
Figure 5.10 KDM1A K114me3 forms cation-π interactions with the aromatic cage of KDM4A895-1010.
Figure 5.11 KDM1A108-119 K114me2/3 peptides bind in an orientation similar to H4 K20me3.
Figure 5.12 KDM4A is the only member of the KDM4 subfamily that interacts specifically with KDM1A104-127 K114me3.
Figure 7.1 Models of fragments x0003 (1.76 Å), x0021 (1.85 Å), and x0040 (1.81 Å) binding sites.
Figure 7.2 Models of fragments x0051 (1.67 Å), x0076 (1.75 Å), and x0156 (1.46 Å) binding sites.
Figure 7.3 Models of fragments x0245 (1.74 Å), x0257 (1.77 Å), and x0271 (1.92 Å) binding sites.
Figure 7.4 Models of fragments x0277 (1.65 Å), x0317 (2.27 Å), and x0332 (1.45 Å) binding sites.
Figure 7.5 Models of fragments x0335 (1.64 Å), x0357 (1.93 Å), and x0368-1 (1.51 Å) binding sites.
Figure 7.6 Models of fragments x0368-2 (1.51 Å), x0385 (1.74 Å), and x0400 (1.54 Å) binding sites.
Figure 7.7 Models of fragments x0407 (1.53 Å), x0432 (1.58 Å), and x0454 (1.82 Å) binding sites.
Figure 7.8 Models of fragments x0478 (1.63 Å), x0481 (1.68 Å), and x0566-1 (1.50 Å) binding sites.
Figure 7.9 Models of fragments x0566-2 (1.50 Å), x0572 (1.77 Å), and x0592 (2.07 Å) binding sites.
Figure 7.10 Models of fragments x0601 (1.71 Å), x0628 (1.63 Å), and x0666 (1.72 Å) binding sites.
Figure 7.11 Models of fragments x0110 (1.76 Å), x0125 (1.62 Å), and x0134 (1.72 Å) binding sites.
Figure 7.12 Models of fragments x0157-2 (1.38 Å), x0232 (1.56 Å), and x0240 (2.23 Å) binding sites.


Figure 7.13 Models of fragments x0459-1 (1.40 Å), x0459-2 (1.40 Å), and x0484-1 (1.60 Å) binding sites.
Figure 7.14 Models of fragments x0484-2 (1.60 Å) and x0608 (1.67 Å) binding sites.
Figure 7.15 Model of fragment x0381 binding site (1.49 Å).
Figure 7.16 Model of fragment x0137 binding site (1.39 Å).
Figure 7.17 Model of fragment x0321 binding site (1.45 Å).
Figure 7.18 Model of fragment x0432-1 binding site (1.58 Å).
Figure 7.19 Model of fragment x0622 binding site (1.28 Å).
Figure 7.20 Model of fragment x0210-2 binding site (1.42 Å).
Figure 7.21 Model of fragment x0507 binding site (1.67 Å).
Figure 7.22 Models of fragment x0056 binding site (1.32 Å).
Figure 7.23 Models of fragments x0157-1 (1.38 Å), x0210-1 (1.42 Å), and x0453 (1.57 Å) binding sites.
Figure 7.24 Model of fragment x0542 binding site (1.55 Å).
Figure 7.25 High resolution limits of datasets used in PanDDA analyses with His6-CHD1270-443.


List of Tables

Table 1.1 List of methyllysine reader domains and marks they bind.
Table 1.2 Structures of CHD1 protein domains in the PDB.
Table 3.1 Data collection and refinement statistics
Table 4.1 Metrics used for fragment validation
Table 4.2 Summary of datasets suitable for PanDDA.
Table 4.3 Number of fragments at different sites.
Table 4.4 Summary of cloning and expression tests with CHD family members.
Table 5.1 X-ray diffraction data collection and refinement statistics
Table 7.1 X-ray diffraction data collection, refinement, and fragment validation statistics for x0003, x0021, x0040, x0051, and x0056.
Table 7.2 X-ray diffraction data collection, refinement, and fragment validation statistics for x0076, x0110, x0125, x0134, and x0137.
Table 7.3 X-ray diffraction data collection, refinement, and fragment validation statistics for x0156, x0157, x0210, x0232, x0240, and x0245.
Table 7.4 X-ray diffraction data collection, refinement, and fragment validation statistics for x0257, x0271, x0277, x0317, and x0321.
Table 7.5 X-ray diffraction data collection, refinement, and fragment validation statistics for x0332, x0335, x0357, x0368, and x0381.
Table 7.6 X-ray diffraction data collection, refinement, and fragment validation statistics for x0385, x0400, x0407, x0432, and x0453.
Table 7.7 X-ray diffraction data collection, refinement, and fragment validation statistics for x0454, x0459, x0478, x0481, and x0484.
Table 7.8 X-ray diffraction data collection, refinement, and fragment validation statistics for x0507, x0542, x0566, x0572, and x0592.
Table 7.9 X-ray diffraction data collection, refinement, and fragment validation statistics for x0601, x0608, x0622, x0628, and x0666.
Table 7.10 H3 K4me3 binding and fragment details
Table 8.1 Competent cells used in this study
Table 8.2 Plasmids used in this study.
Table 8.3 Peptides used in this study.
Table 8.4 Primers used in this study.
Table 8.5 Expression constructs generated in this study.


Table 8.6 Commercial crystallisation screens used in this study.


List of Abbreviations

A280 absorbance at 280 nm
ADD ATRX-DNMT3-DNMT3L
AGS phosphothiophosphoric acid adenylate ester
AI auto-induction
AML acute myeloid leukaemia
AR androgen receptor
ATP adenosine triphosphate
BAH bromo adjacent homology
BDC background density correction
bp base pairs
BTP Bis-Tris propane
CBX chromobox
CD chromodomain
CHCT CHD1 helical C-terminal domain
CHD1 chromodomain-helicase-DNA-binding protein 1
CHD2 chromodomain-helicase-DNA-binding protein 2
CHD3 chromodomain-helicase-DNA-binding protein 3
CHD4 chromodomain-helicase-DNA-binding protein 4
CHD5 chromodomain-helicase-DNA-binding protein 5
CHD6 chromodomain-helicase-DNA-binding protein 6
CHD7 chromodomain-helicase-DNA-binding protein 7
CHD8 chromodomain-helicase-DNA-binding protein 8
CHD9 chromodomain-helicase-DNA-binding protein 9
ChIP chromatin immunoprecipitation
ChIP-seq ChIP-sequencing
CRPC castration resistant prostate cancer
CtIP CtBP-interaction protein
Da Dalton
DBD DNA-binding domain
DCD double chromodomain
DHT dihydrotestosterone
DMSO dimethyl sulfoxide
DNA deoxyribonucleic acid


DSPL Diamond-SGC poised library
DTT dithiothreitol
EDO 1,2-ethane-diol
EHMT2 euchromatic histone-lysine N-methyltransferase 2
ETS E twenty-six
FAD flavin adenine dinucleotide
FBDD fragment-based drug design
GOI gene of interest
GST glutathione S-transferase
H3 K4me3 histone 3 trimethylated at lysine 4
H3 K9me3 histone 3 trimethylated at lysine 9
His6 hexahistidine
HRV human rhinovirus
HTD hybrid Tudor domain
HTS high throughput screening
IMAC immobilised metal affinity chromatography
IPTG isopropyl β-D-1-thiogalactopyranoside
ITC isothermal titration calorimetry
JmjC JumonjiC
JmjN JumonjiN
kDa kilo Daltons
KDM lysine-specific demethylase
KDM1A lysine-specific histone demethylase 1A
KDM4A lysine-specific demethylase 4A
KDM4B lysine-specific demethylase 4B
KDM4C lysine-specific demethylase 4C
KDM4D lysine-specific demethylase 4D
KDM4E lysine-specific demethylase 4E
LC-MS liquid chromatography-mass spectrometry
MBT malignant brain tumour
MME monomethyl ether
MMS matrix microseeding
MPD 2-methyl-2,4-pentanediol
NGS next-generation sequencing


NLS nuclear localisation signal
NMR nuclear magnetic resonance
NS1 non-structural protein 1
OCC occupancy
PanDDA pan-dataset density analysis
PCP-SILAC protein correlation profiling-stable isotope labelling by amino acids in cell culture
PDB Protein Data Bank
PEG polyethylene glycol
PHD plant homeodomain
PHF21A PHD finger protein 21A
POI protein of interest
PTM post-translational modification
rIgG rabbit IgG
RMSD root mean square deviation
RNA ribonucleic acid
RNA-seq RNA-sequencing
rr RNAi resistant
RSCC real-space correlation coefficient
RSZD real-space Z-difference score
RSZO real-space observed density Z-score
RT-PCR reverse transcription polymerase chain reaction
SAM S-adenosyl-methionine
SDS-PAGE sodium dodecyl sulfate polyacrylamide gel electrophoresis
SGC structural genomics consortium
siRNA small interfering RNA
SPR surface plasmon resonance
SWIRM Swi3, Rsc8, and Moira
TMPRSS2 transmembrane protease, serine 2
tNCS translational non-crystallographic symmetry
TTD tandem Tudor domain
v/v volume per volume
w/v weight per volume
XCE XChemExplorer


Introduction

1 Introduction

1.1 Prostate cancer

Prostate cancer is the most common malignancy in men [1]. It is a major cause of cancer deaths, especially in men from Western countries [2], where it is the third highest cause of cancer-specific death [3]. Prostate cancer genomes exhibit low mutation rates and possess few focal chromosomal gains or losses [4]. The advent of next-generation sequencing (NGS) has allowed the identification of genomic rearrangements that result in aberrant androgen-driven expression of ETS (E twenty-six) transcription factor oncogenes in prostate cells. Chromosomal rearrangements are frequently observed in prostate cancers, with ~50-60% of prostate cancers containing such rearrangements [5]. These rearrangements typically involve the fusion of the 5' untranslated region of an androgen-regulated gene, such as transmembrane protease, serine 2 (TMPRSS2), with most of the coding sequence of an ETS transcription factor, such as ERG [6]. Prostate tumours can be classified according to the presence of an ETS fusion: tumours with an ETS fusion exhibit genetic, epigenetic, and expression profiles distinct from those of tumours without one [6]. Some driver genetic alterations are mutually exclusive with ETS fusion. Overexpression of the serine protease inhibitor SPINK1 occurs in ~5-10% of prostate cancers and is observed exclusively in tumours without ETS fusion [7]. Mutation or homozygous deletion of CHD1 is present in ~10-15% of prostate cancers and is also mutually exclusive with ETS fusion [7]. Formation of ETS gene fusions is androgen receptor (AR)-dependent [8, 9]. Androgen signalling is the main oncogenic pathway in prostate cancer and is a major target of therapy for patients with locally advanced or metastatic disease [10]. Therapies targeting AR signalling are typically effective for a short period before the prostate cancer becomes desensitised and develops into castration-resistant prostate cancer (CRPC). Uncovering more mechanistic detail regarding the AR-dependent formation of genomic rearrangements and developing additional therapies that target the AR signalling pathways are therefore of significant interest. Epigenetic proteins have been shown to be important for AR-dependent transcription, and epigenetic alterations have been suggested to play a role in prostate cancer [10, 11].

1.2 Epigenetics

In eukaryotic cells, DNA is packaged into chromatin through the wrapping of 147 bp of DNA around a histone octamer composed of two copies each of H2A, H2B, H3, and H4 to form a nucleosome. The term epigenetics can be used broadly to describe chromatin-based events that regulate DNA-templated processes, such as gene transcription [12]. Post-translational modifications (PTMs) to DNA and histone proteins can be added and removed dynamically by different classes of enzymes. These modifications affect processes such as transcription and can also be transferred to daughter cells. PTMs can alter chromatin directly by physically changing interactions between the histone proteins and the DNA within nucleosomes. Certain PTMs can also affect DNA-templated processes by recruiting chromatin-acting proteins and protein complexes to specific chromatin regions through recognition of those PTMs by specialised domains.

1.2.1 Post-translational modifications

Histone tails, particularly those of H3 and H4, are subject to a range of PTMs. In total, histone proteins have been demonstrated to be modified by at least 32 different types of PTM, with lysine subject to the greatest number (14) of different modifications [13]. One of the most prevalent and widely studied PTMs is lysine methylation. Lysine residues can be mono-, di-, or tri-methylated. The methylation state of particular histone lysine residues can affect gene expression differently. For instance, di- and tri-methylation of histone 3 lysine 4 (H3 K4me2/3) is associated with active gene expression [14, 15], whereas di- and tri-methylation of histone 3 lysine 9 (H3 K9me2/3) is associated with heterochromatin and gene repression [16, 17]. Under physiological conditions, methylation of lysine, unlike acetylation, does not affect the positive charge of the lysine side chain. Therefore, lysine methylation does not directly alter chromatin structure but rather acts through recruitment of specific proteins or protein complexes to particular chromosomal locations. Methyllysine-interacting proteins can be divided into three classes: writers, erasers, and readers. In the case of methyllysine, writers are lysine methyltransferases that catalyse the transfer of a methyl group from an S-adenosyl-L-methionine (SAM) cofactor to a lysine residue. In the vast majority of methyltransferases this transfer is catalysed by a SET domain [18]. Erasers are lysine demethylases that remove methyl groups from mono-, di-, or tri-methyllysine residues. Reader proteins contain a domain or domains that bind to one or more particular lysine methylation states. Reader proteins can then alter gene expression by recruiting transcription factors or protein complexes that modify chromatin. Reader domains are frequently part of multi-domain proteins with other chromatin-acting functions, e.g. the lysine demethylase 4A (KDM4A) contains tandem Tudor domains (TTDs), which bind trimethylated histone residues, in addition to its catalytic lysine demethylase domain. Binding of trimethylated histone lysine residues by the TTD may recruit KDM4A to specific regions of chromatin so that the lysine demethylase activity of KDM4A is targeted to specific regions of the genome.

1.2.2 Erasers - lysine demethylases

There are two classes of lysine demethylases: (i) the amine oxidase lysine-specific demethylase (KDM) family, consisting of KDM1A and KDM1B, with KDM1A being much more widely studied; and (ii) the JumonjiC (JmjC)-domain containing demethylases, of which there are at least 28 members from seven subfamilies. These subfamilies are classified according to the identity of their ancillary domains [19].

1.2.2.1 KDM1A

Lysine-specific demethylase 1 (KDM1A/LSD1) was the first reported lysine demethylase [20]. KDM1A demethylates mono- and di-methyllysine residues in a flavin adenine dinucleotide (FAD)-dependent manner (Figure 1.1). Several mechanisms have been proposed for the first and rate-determining step of KDM1A-catalysed lysine demethylation [21]. The most favourable involves cleavage of an α-CH bond and hydride transfer from di- or mono-methyllysine to FAD. The resulting iminium intermediate is then hydrolysed to a hemiaminal intermediate, which rearranges spontaneously to a demethylated amine and formaldehyde. This mechanism does not allow KDM1A to demethylate trimethylated lysine, as the Nε of trimethyllysine does not possess a lone pair of electrons. KDM1A was first shown to demethylate H3 K4me1/2, a mark of open chromatin and active gene expression [20]. Demethylation of H3 K4me1/2 in nucleosomes requires the bound corepressor CoREST [22, 23]. Additionally, KDM1A has been demonstrated to demethylate the repressive H3 K9me1/2 mark when co-localised with AR at AR target genes, resulting in their de-repression [17]. KDM1A has also been shown to demethylate a number of non-histone proteins, including p53 and DNMT1 [24, 25].

KDM1A is a multi-domain enzyme (Figure 1.2) composed of a flexible N-terminal region, which contains a nuclear localisation signal (NLS; [26]) and several phosphorylation sites [27-30]; a Swi3, Rsc8, and Moira (SWIRM) domain important for protein stability, histone tail interaction, and protein-protein interactions; and a C-terminal amine oxidase-like domain responsible for FAD-dependent lysine demethylation. The oxidase domain contains a unique two-helix insert that has been termed the ‘tower’ domain. The tower domain is important for the binding of KDM1A interaction partners such as CoREST (Figure 1.2; [23, 31]).

Figure 1.1 Reaction mechanism of lysine demethylation by KDM1A. A description of the mechanism is given in section 1.2.2.1. R = ribitol adenosine diphosphate.

KDM1A has been shown to have a major role in cell-fate determination and has been implicated in maintenance of stem and progenitor cells as well as differentiation pathways to hematopoietic, sperm, adipose, and neuronal cells among others [32-36]. Overexpression of KDM1A has been linked with a range of cancers such as rhabdomyosarcoma, prostate, lung, and breast cancers [37-40]. In prostate cancer, KDM1A is correlated with cell proliferation, migration, invasion, and angiogenesis [41-43]. KDM1A inhibition is of therapeutic interest: KDM1A has been suggested as an appropriate therapeutic target in acute myeloid leukaemia (AML) and small cell lung cancer [44, 45], and its inhibition was suggested to have therapeutic potential in prostate cancer [46]. At least four phase I clinical trials of KDM1A inhibition, typically in conjunction with all-trans retinoic acid for treatment of AML, are in the recruitment stage. Combined treatment of rhabdomyosarcoma cells with KDM1A and histone deacetylase inhibitors has also been identified in cell-based assays as a promising new therapeutic strategy [47].

Figure 1.2 Crystal structure of KDM1A(171-836) in complex with CoREST(308-440). The flexible N-terminal region is not present in the crystal structure of the complex of KDM1A, CoREST, H3 peptide, and flavin adenine dinucleotide (FAD; 3.1 Å resolution; PDB 2V1D [31]). KDM1A is modelled as a cartoon and coloured according to the domain: SWIRM (wheat); amine oxidase (blue); and tower domain (green). CoREST is displayed as an orange surface model. FAD (carbon, magenta) and H3 peptide (carbon, yellow) are modelled as sticks and coloured by atom type (oxygen, red; and nitrogen, blue).

1.2.2.2 KDM4A

There are seven subfamilies of JmjC domain-containing lysine demethylases. One such subfamily is the KDM4 subfamily, often referred to as the JMJD2 subfamily. There are four recognised members of the KDM4 subfamily (KDM4A, KDM4B, KDM4C, and KDM4D) and possibly a fifth member, KDM4E. Members of the KDM4 subfamily demethylate the repressive mark, H3 K9me2/3 [48]. Additionally, KDM4 demethylases can demethylate H3 K36me3, a mark associated with transcriptional elongation [48-50]. All four members contain JmjN and JmjC domains at their amino terminus. The JmjC domain is responsible for catalysis and the JmjN domain forms extensive interactions with JmjC, providing structural integrity to the catalytic core (Figure 1.3). Additionally, KDM4A, KDM4B, and KDM4C contain two PHD (plant homeodomain) domains, whose function is as yet unclear, and two Tudor domains, which form a TTD that has been demonstrated to bind a number of different trimethyllysine histone marks (Figure 1.3; [51-53]).

Figure 1.3 Schematic of KDM4A. Domains for which there is structural information available are displayed as cartoon models and coloured according to their domain. JmjN, wheat; JmjC, blue; the rest of the catalytic core, grey (PDB 2OQ6; [54]). The tandem Tudor domain (PDB 2GFA; [51]) is coloured green. Ni2+ shown as sphere (grey); oxalylglycine (carbon, magenta); H3 K9me3 peptide (carbon, yellow); and H3 K4me3 peptide (carbon, orange). Oxygen and nitrogen coloured red and blue, respectively.

Unlike KDM1A, JmjC domain-containing demethylases such as KDM4A are not mechanistically restricted to certain methylation states, so they can demethylate trimethyllysine residues as well as the lower methylation states. The consensus mechanism for oxidative lysine demethylation by these enzymes is an ordered sequential mechanism in which the coordination of the co-substrate α-ketoglutarate to Fe2+ is followed by binding of methylated lysine and molecular oxygen [55, 56]. Molecular oxygen coordination enables decarboxylation of bound α-ketoglutarate to produce succinate, carbon dioxide, and a reactive Fe4+=O species, which then oxidises an Nε methyl group of the methyllysine residue to produce a hemiaminal that then fragments into the demethylated lysine residue and formaldehyde.
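The mechanistic difference between the two eraser classes described in sections 1.2.2.1 and 1.2.2.2 can be summarised as a small lookup. The following is only an illustrative sketch: the dictionary keys and the function name are hypothetical, and the table encodes which methylation states are mechanistically accessible to each class, not the substrate specificity of any individual enzyme.

```python
# Illustrative sketch only; the names below are hypothetical and not from any library.
# Number of methyl groups on N-epsilon that each demethylase class can remove
# for mechanistic reasons, as described in the text.
DEMETHYLATABLE_STATES = {
    # FAD-dependent hydride transfer requires a free electron pair on N-epsilon,
    # so trimethyllysine (3 methyl groups) is excluded.
    "FAD-dependent (KDM1A)": {1, 2},
    # Fe2+/alpha-ketoglutarate-dependent oxidation of a methyl group has no such
    # requirement, so all three methylation states are accessible.
    "JmjC (e.g. KDM4A)": {1, 2, 3},
}


def can_demethylate(enzyme_class: str, methyl_groups: int) -> bool:
    """Return True if the class can, in principle, act on this methylation state."""
    if methyl_groups not in (1, 2, 3):
        raise ValueError("methyl_groups must be 1, 2 or 3")
    return methyl_groups in DEMETHYLATABLE_STATES[enzyme_class]


if __name__ == "__main__":
    for enzyme_class in DEMETHYLATABLE_STATES:
        for n in (1, 2, 3):
            print(enzyme_class, f"Kme{n}", can_demethylate(enzyme_class, n))
```

The only entry that evaluates to False is trimethyllysine with the FAD-dependent class, which mirrors the restriction stated for KDM1A.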

KDM4A and the other KDM4 subfamily members are associated with a number of different diseases and have been the subject of a number of chemical probe and drug discovery efforts. KDM4A is silenced or down-regulated in bladder cancer [57] and is overexpressed in squamous cell carcinoma and ovarian, prostate, colorectal, lung and breast cancers [58-65]. Overexpression of KDM4A induces localised chromosome instability and site-specific copy gain [63], and heart-specific knockout mice have an altered response to cardiac stress [66].

Drug discovery efforts targeting KDM4A are currently still at the preclinical phase. Many of the compounds that have been observed to inhibit KDM4A or other members of the KDM4 subfamily are non-specific and affect other demethylases or other classes of enzymes. Different strategies have been adopted to inhibit the demethylase activity of KDM4A. Inhibitors have been designed that chelate the Fe2+ ion bound at the active site, compete with the co-substrate α-ketoglutarate, disrupt the binding of structural Zn2+ ions, or have been derived from peptides and inhibit methylated peptide binding [67-70]. Recently, QC6352, an orally available, potent and selective inhibitor of KDM4 subfamily demethylases, was shown to be effective at reducing colony formation and cell viability of breast and colon cancer cells [71]. Proliferation, sphere formation, and xenograft formation of breast cancer stem-like cells from triple-negative tumours have also been shown to be blocked by QC6352 [72].

1.2.3 Reader proteins

Reader proteins that bind to specific PTMs have been reported for most types of modifications. In contrast to writer and eraser enzymes, reader proteins do not directly modify PTMs. Instead, they often act indirectly to alter gene expression by facilitating recruitment of other transcription factors, enzymes, or complexes to a specific region of the genome. Two widely studied classes of reader domains are acetyllysine and methyllysine reader domains. A major difference between these two lysine modifications is that acetylation of lysine abolishes the positive charge, whereas methylated lysine remains positively charged. Four classes of reader domains have been identified that bind acetylated lysine residues: bromodomain; double bromodomain; double PHD finger; and double pleckstrin homology [13, 73-75]. A large number of different classes of reader domains have been shown to recognise methylated lysine residues (Table 1.1).


1.2.4 Methyllysine reader domains

At least 15 different types of methyllysine reader domains have been identified (Table 1.1). The majority of lysine methylation marks that these domains recognise are from H3 and H4, although many domains are also known to bind methylated lysine residues of non-histone proteins [76-78].

Table 1.1 List of methyllysine reader domains and marks they bind.

Reader domain   Methylation marks
ADD             H3 K9me3
Ankyrin         H3 K9me1/2
BAH             H3 K27me3, H4 K20me2
CW              H3 K4me3
Chromobarrel    H3 K4me1, H3 K36me2/3, H4 K20me1
CD              H3 K9me2/3, H3 K27me2/3
DCD             H3 K4me1/2/3
MBT             H3 Kme1/2, H4 Kme1/2
PHD             H3 K4me2/3, H3 K9me3
PWWP            H3 K36me3, H3 K79me3, H4 K20me1/3
SAWADEE         H3 K9me1/2/3
Spindlin        H3 Rme2a K4me3
TTD             H3 K4me3, H3 K9me3, H3 K23me3, H4 K20me2/3
Tudor           H3 K36me3
WD40            H3 K9me3, H3 K27me3

Table adapted from Andrews et al. 2016 [13]. ADD (ATRX-DNMT3-DNMT3L), BAH (bromo adjacent homology), CD (chromodomain), DCD (double chromodomain), MBT (malignant brain tumour), PHD (plant homeodomain), TTD (tandem Tudor domain).
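As a reading aid, the content of Table 1.1 can be held in a dictionary and queried by mark. This is a minimal sketch rather than part of the original work: the names `READER_MARKS`, `expand_marks` and `readers_of` are hypothetical, and shorthand such as "me2/3" is assumed to mean that each listed methylation state is recognised.

```python
# Table 1.1 transcribed as a dictionary. All names here are hypothetical and
# used for illustration only.
READER_MARKS = {
    "ADD": ["H3 K9me3"],
    "Ankyrin": ["H3 K9me1/2"],
    "BAH": ["H3 K27me3", "H4 K20me2"],
    "CW": ["H3 K4me3"],
    "Chromobarrel": ["H3 K4me1", "H3 K36me2/3", "H4 K20me1"],
    "CD": ["H3 K9me2/3", "H3 K27me2/3"],
    "DCD": ["H3 K4me1/2/3"],
    "MBT": ["H3 Kme1/2", "H4 Kme1/2"],
    "PHD": ["H3 K4me2/3", "H3 K9me3"],
    "PWWP": ["H3 K36me3", "H3 K79me3", "H4 K20me1/3"],
    "SAWADEE": ["H3 K9me1/2/3"],
    "Spindlin": ["H3 Rme2a K4me3"],
    "TTD": ["H3 K4me3", "H3 K9me3", "H3 K23me3", "H4 K20me2/3"],
    "Tudor": ["H3 K36me3"],
    "WD40": ["H3 K9me3", "H3 K27me3"],
}


def expand_marks(entry: str) -> list[str]:
    """Expand shorthand such as 'H3 K9me2/3' into ['H3 K9me2', 'H3 K9me3']."""
    # Split at the last 'me' so that combined marks keep their full prefix.
    prefix, _, states = entry.rpartition("me")
    return [f"{prefix}me{state}" for state in states.split("/")]


def readers_of(mark: str) -> list[str]:
    """Return, sorted by name, the reader domains listed as binding exactly this mark."""
    return sorted(
        domain
        for domain, entries in READER_MARKS.items()
        if any(mark in expand_marks(entry) for entry in entries)
    )


if __name__ == "__main__":
    print(readers_of("H3 K4me3"))
```

Matching is on the exact mark string, so the combined Spindlin entry (H3 Rme2a K4me3) is deliberately not returned for a query of H3 K4me3 alone.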

A feature common to almost all methyllysine reader domains is an aromatic pocket consisting of between two and four aromatic amino acid residues, often referred to as an ‘aromatic cage’. The methylammonium group of the methyllysine residues can form cation-π interactions with the slightly negatively charged surfaces of the aromatic side chains of Phe, Tyr, and Trp residues. There are two general types of methyllysine recognition: cavity insertion and surface groove (Figure 1.4). In the cavity insertion recognition mode, mono- or dimethyllysine side chains insert into a deep enclosed pocket. Binding of trimethyllysine is likely occluded in this binding mode by steric hindrance. A common feature of these pockets is the presence of a nearby acidic residue that can interact electrostatically with the methylammonium proton of mono- and dimethyllysine, which is not possible with trimethyllysine. Pockets that bind methyllysine using a surface groove recognition mode are wider and more exposed to solvent. Surface groove recognition pockets tend to display lower selectivity for a particular methylation state than cavity insertion pockets.


Figure 1.4 Cavity insertion and surface groove recognition pockets. (a) The cavity insertion recognition mode of lower methylation states (Kme1/2) is displayed. H4 K20me2 (sticks; carbon, yellow) inserts deep into a small enclosed cavity of L3MBTL1 (PDB 2RJE; [79]) which allows size selection by steric hindrance. Additionally, a common feature of such pockets is an acidic residue that forms an electrostatic interaction with the methylammonium proton. (b) H3 K9me3 (sticks; carbon, yellow) is bound by HP1 (PDB 1KNE; [80]) in a surface groove recognition mode. Surface groove recognition pockets are wider and often demonstrate less stringency regarding methylation state. Aromatic cage residues shown as stick models and coloured by atom type: carbon, grey; oxygen, red; and nitrogen, blue. Electrostatic surface models (isocontour, ±8 kT/e) were generated with the APBS [81] plugin within PyMOL (Schrödinger, LLC).

1.2.4.1 PHD family

The PHD protein domain family members are composed of 50-80 amino acid residues that can differ substantially in sequence [82]. The PHD fold typically consists of a β-sheet composed of two anti-parallel β-strands and can include a C-terminal α-helix (Figure 1.5). The fold is stabilised by a Cys4-His-Cys3 motif that binds to Zn2+ ions [82]. PHD domains have been shown to bind unmethylated, methylated and acetylated lysine residues [83-85].

Figure 1.5 General structure of PHD domain. X-ray co-crystal structure of Rag2 PHD domain in complex with H3 K4me3 peptide (1.1 Å; PDB 2V89; [86]). Rag2 coloured by secondary structure: helix (red); strand (yellow); and loop (green). Zn2+ ions (grey) shown as spheres and the H3 K4me3 peptide modelled as sticks and coloured by atom type: carbon, grey; oxygen, red; and nitrogen, blue.


1.2.4.2 Tandem Tudor domain (TTD)

One well-studied TTD is that of the lysine demethylase KDM4A. The two Tudor domains of KDM4A interdigitate to form a bi-lobal saddle-shaped structure. The two lobes each resemble the canonical Tudor domain and are referred to as hybrid Tudor domains 1 and 2 (HTD-1, 2; Figure 1.6). Methylated peptides have been observed to bind only to HTD-2 [51-53]. Trimethylated lysine residues of H3 K4me3, H3 K23me3, and H4 K20me3 are bound in an aromatic cage composed of residues from both Tudor domains (Figure 1.6). Interestingly, the H4 K20me3 peptide binds at the same site but in a different orientation to the H3 K4me3 and H3 K23me3 peptides (Figure 1.6). Recognition of these different methyllysine marks by the KDM4A TTD likely targets KDM4A to specific regions of chromatin. It has been suggested that binding of H4 K20me2/3 by KDM4A plays a role in regulation of DNA repair by controlling recruitment at DNA damage sites [87].

Figure 1.6 Bi-lobal KDM4A TTD structure. Tudor domains 1 (wheat) and 2 (blue) interdigitate to form a bi-lobal TTD. H4 K20me3 peptide (green) interacts with HTD-2 in a different orientation to H3 K4me3 (magenta) and H3 K23me3 (yellow). Trimethyllysine side chains (modelled as sticks) bind in an aromatic cage formed by both Tudor domains. KDM4A shown as cartoon model, aromatic cage and methyllysine residues modelled as sticks, and peptides modelled as ribbons.

1.2.4.3 Chromodomain

Chromodomains can be divided into three separate classes: chromodomain (CD; Figure 1.7a), chromobarrel domain (Figure 1.7b), and double chromodomain (DCD; Figure 1.7c). The CD class contains the HP1 chromodomain (Figure 1.7a) and the chromobox (CBX) proteins. The canonical CD is a three-stranded, anti-parallel β-sheet with a C-terminal α-helix. CDs have been shown to bind the repressive H3 K9me2/3 and H3 K27me2/3 marks [80, 88, 89]. Chromobarrels contain an additional β0 strand that mimics the β-strand of methylated peptides when bound to CD proteins. Chromobarrel domains have been shown to bind H3 K4me1, H3 K36me2/3, and H4 K20me1 [90, 91]. The DCD has been shown to bind methylated H3 K4 and NS1 (non-structural protein 1) K229 peptides [76, 92]. The DCD class is composed of domains from the CHD (Chromodomain-Helicase-DNA-binding) family, each consisting of two CDs. Only one CHD DCD (CHD1) has been structurally characterised and a unique peptide binding interface was observed between the two CDs (Figure 1.7).

Figure 1.7 Structures of a chromodomain, chromobarrel domain, and double chromodomain. (a) X-ray co-crystal structure (2.4 Å) of HP1 chromodomain in complex with H3 K9me3 peptide (PDB 1KNE; [80]). (b) X-ray co-crystal structure (2.4 Å) of MSL3 chromobarrel domain in complex with H4 K20me1 peptide (PDB 3OA6; [90]). (c) X-ray co-crystal structure (2.4 Å) of CHD1 double chromodomain in complex with H3 K4me3 peptide (PDB 2B2W; [92]). Labels are black for chromodomain 1, grey for linker region, and pink for chromodomain 2. Protein models are coloured according to secondary structure: helices, red; strands, yellow; and loops, green. Methylated histone peptides are displayed as stick models and coloured by atom type: carbon, grey; oxygen, red; and nitrogen, blue.


1.2.4.4 CHD family

CHD proteins are highly conserved in eukaryotes from yeast to humans. In humans, there are nine members of the CHD family (CHD1-9). Members of this family can be classified into three subfamilies: subfamily I consists of CHD1 and CHD2; subfamily II consists of CHD3, CHD4, and CHD5; and subfamily III consists of CHD6, CHD7, CHD8, and CHD9. All CHD family members contain a DCD and an ATP-dependent helicase domain but differ in the identity of their ancillary domains (Figure 1.8). In addition to the DCD and ATPase domains, CHD1 and CHD2 contain a SANT-SLIDE DNA binding domain [93, 94] and a C-terminal CHCT domain, the precise function of which remains unclear [95]. The DCD of CHD1 has been shown to bind methylated H3 K4 [92] and NS1 K229 [76]; however, the DCD of CHD2 has not been conclusively demonstrated to bind to a methylated target. Members of subfamily II do not contain known DNA-binding domains. They do, however, contain two PHD domains, which have been shown to bind unmethylated H3 and H3 K9me3 [84, 96]. The function of the DCDs from these family members is not yet entirely clear. The CHD4 DCD has been proposed to act as a DNA binding domain rather than a methylated histone binding domain [97, 98]. Recently, it has been reported that the DCD of CHD5 interacts in a methylation-specific manner with H3 K27me3 [99]. Subfamily III members contain SANT non-specific DNA binding domains and BRK domains of as yet unknown function. Reports on the ability of the DCDs of this subfamily to bind methylated histones are contradictory: some suggest that there is evidence for binding to methylated histones, whereas others do not find any evidence of methyllysine binding [76, 100-102].

Figure 1.8 Domain architecture of the CHD family. Chromodomains (purple), ATPase lobe 1 (dark red), ATPase lobe 2 (blue), DNA binding domain (dark green), CHCT (grey), plant homeodomain (yellow), SANT (light green), BRK (pink).


CHD dysfunction has been implicated in a number of disease pathologies, including cancer [9, 103, 104], developmental disorders [105] and neurological disorders [106].

1.2.5 CHD1

CHD1 is the best-studied member of the CHD protein family. CHD1 is an ATP-dependent helicase that can induce nucleosome sliding and regular spacing of nucleosomes [107-109]. CHD1 has been shown to have a role in HIV-1 gene expression [110], embryonic stem cell differentiation [111], maintaining pluripotency of stem cells in mice [112], and reprogramming of epithelium into hematopoietic stem cells [113]. CHD1 dysfunction has been implicated in prostate, gastric, and lung cancers [103, 114, 115].

CHD1 has been structurally and functionally well-characterised, with 17 different three-dimensional structural models of yeast and human Chd1 present in the PDB (Table 1.2).

Table 1.2 Structures of CHD1 protein domains in the PDB.

PDB code  Domain       Organism  Technique  Resolution  Binding partner  Reference
2DY7      CD1          Yeast     NMR        -           -                [116]
2DY8      CD2          Yeast     NMR        -           -                [116]
2H1E      DCD          Yeast     X-ray      2.2 Å       -                [117]
2B2V      DCD          Human     X-ray      2.7 Å       H3 K4me1         [92]
2B2T      DCD          Human     X-ray      2.5 Å       H3 T3ph K4me3    [92]
2B2W      DCD          Human     X-ray      2.4 Å       H3 K4me3         [92]
2B2U      DCD          Human     X-ray      3.0 Å       H3 Rme2 K4me3    [92]
2B2Y      DCD          Human     X-ray      2.4 Å       -                [92]
4O42      DCD          Human     X-ray      1.9 Å       NS1 K229me2      [76]
4NW2      DCD          Human     X-ray      1.9 Å       NS1 K229me3      [76]
3MWY      DCD-ATPase   Yeast     X-ray      3.7 Å       AGS              [118]
2XB0      DBD          Yeast     X-ray      2.0 Å       -                [94]
3TED      DBD          Yeast     X-ray      2.0 Å       12mer DNA        [119]
5J70      DBD          Yeast     X-ray      3.0 Å       17mer DNA        -
4B4C      DBD          Human     X-ray      1.6 Å       -                -
2N39      CHCT         Human     NMR        -           -                [95]
5O9G      Full-length  Yeast     CryoEM     4.8 Å       Nucleosome       [120]

PDB, protein data bank; CD1, chromodomain 1; CD2, chromodomain 2; DCD, double chromodomain; DBD, DNA-binding domain; CHCT, CHD1 helical C-terminal domain; AGS, phosphothiophosphoric acid adenylate ester.

These studies have begun to uncover the function of the different domains and regions of the protein and also how these domains cooperate to interact with nucleosomes and induce nucleosome sliding and positioning. The first structures of CHD1 were of the CDs from humans and yeast. From these initial structural and biochemical experiments, it was ascertained that the DCD of human CHD1, but not that of yeast Chd1, binds methylated H3 K4 tails [92, 121]. The yeast and human DCDs differ greatly in their first insertion regions. In humans this insertion forms part of the peptide binding surface and contributes directly to histone peptide binding, whereas in yeast it packs against the α2 helix [117]. CHD1 DCD binds H3 K4me3 at a unique site formed between CDs 1 and 2, with residues from both CDs interacting with H3 K4me3 (Figure 1.9). W322 and W325 from CD1 form an aromatic cage that interacts with the methyllysine and R2 of H3 K4me3 through cation-π interactions. Two acidic residues from CD2, D408 and D425, interact with the N-terminal amine of the H3 K4me3 peptide. There are two insertions in the DCD. The insertion in CD1 allows the formation of the unique peptide binding site and Y295 from this insert directly interacts with the backbone amide of K4me3. The second insert is between the two CDs and is a helix-turn-helix motif, which in the CHD1 structure is often referred to as a linker region or chromo-wedge.

Figure 1.9 X-ray crystal structure of CHD1 DCD bound to H3 K4me3 peptide. Co-crystal structure of CHD1 DCD and H3 K4me3 demonstrating the important regions of the domain (2.4 Å; PDB 2B2W; [92]). CD1 (wheat), insert 1 (green), insert 2/linker region/chromo-wedge (grey), CD2 (blue). H3 K4me3 peptide modelled as sticks and coloured according to atom type: carbon, yellow; oxygen, red; and nitrogen, blue. Residues which form important interactions with H3 K4me3 peptide modelled as sticks. DB, DNA binding.

In addition to recognising H3 K4me3, CHD1 has been demonstrated to be hijacked by the viral protein NS1, which can be methylated at its C-terminal tail (K229). NS1 K229me2 is bound by CHD1 DCD in a similar manner to H3 K4me3. The precise role of the recognition of H3 K4me3 by CHD1 is unclear. It has been suggested to modulate the efficiency of pre-mRNA splicing in part by mediating the interaction between H3 K4me3 and components of the spliceosome [122]. A study by Morettini et al. (2010) [123] suggested that the DCDs were important for the activity of Chd1 from Drosophila; however, the chromosomal location of chromodomain-mutant Chd1 was unaffected.

The DCD has been proposed to regulate ATPase remodelling activity of yeast Chd1 and prevent recognition of naked DNA [118]. The DCD was suggested to maintain Chd1 in a so-called ‘gated’ conformation in which the DCD maintains the ATPase in an open conformation that cannot perform ATP hydrolysis [118]. The highly conserved, acidic, second helix of the linker region (α4) packs against a basic surface of ATPase lobe 2 and CD2 is in contact with ATPase lobe 1. Chd1 protein in which several of the acidic residues from α4 helix were mutated was shown to lose its selectivity for nucleosomal DNA over naked DNA indicating an important role for this region, and the DCD as a whole, in the specificity of the nucleosome remodelling activity of Chd1.

The recently published structure of Chd1 from S. cerevisiae in complex with a nucleosome (Figure 1.10) revealed how Chd1 contacts the nucleosome and how remodelling might be achieved [120]. In the structure, two turns of nucleosomal DNA have been detached and are bound by the DBD of Chd1. Chd1 is bound between two DNA gyres in a catalytically active state. The ATPase directly binds to the second DNA gyre and is anchored through an interaction between the N-terminal tail of H4 and an acidic surface of ATPase lobe 2 [120]. To allow the ATPase to adopt an active, or ‘ungated’, conformation, the DCD swings out 15° and binds to nucleosomal DNA through a highly conserved loop from CD1, which contains several basic residues [120, 124]. The ATPase lobe 2 can then swing in 40° to a closed state capable of ATP-dependent nucleosome remodelling. Based on their structural observations and available biochemical data, Farnung et al. (2017) [120] have proposed a model for nucleosome remodelling by Chd1. A ratcheting cycle by the ATPase motor moves Chd1 in a 3’ to 5’ direction away from the nucleosome dyad. The continued association of Chd1 with histones causes directional DNA translocation toward the dyad. This could result in the dissociation of a short region of DNA from the histone octamer, and the propagation of this region repositions the histone octamer. This would explain how Chd1 induces nucleosome sliding but not how it induces regular nucleosome spacing; more work is therefore required to uncover how this is achieved.

Figure 1.10 Structure of Chd1 bound to a nucleosome. Cryo-electron microscopy structure (4.8 Å) of Chd1 from Saccharomyces cerevisiae in complex with nucleosomal DNA and histones from Xenopus laevis (PDB 5O9G; [120]). The structure is displayed as a cartoon model. The two strands of DNA are coloured differently: guide strand (dark teal) and tracking strand (light green). The histone proteins of the histone octamer are coloured as follows: H2A (pink and slate); H2B (yellow and orange); H3.2 (green and wheat); and H4 (cyan and grey). (a) Chd1 is coloured magenta. (b) Chd1 is coloured according to the different domains: DCD (violet); ATPase lobe 1 (dark red); ATPase lobe 2 (blue); and DNA-binding region (dark green).


A role for the recognition of methylated H3 K4 tails by the DCD of CHD1 in nucleosome remodelling remains unclear. Elucidating a functional role for H3 K4me3 recognition is complicated slightly by the fact that all multidomain structures and the majority of the biochemical assays have been performed with Chd1 from S. cerevisiae, which unlike human CHD1 does not bind methylated H3 K4 [121].

1.2.5.1 CHD1 and prostate cancer

CHD1 is a tumour suppressor gene that is one of the most frequently homozygously deleted genes in prostate cancer [114, 115], and its deletion correlates with high numbers of copy number variants [7]. Rearrangements of CHD1 have also been observed in NGS studies [8, 125]. CHD1 is required for recruitment of CtIP (CtBP-interaction protein) to chromatin and promotes the opening of chromatin at DNA strand breakpoints to facilitate DNA repair in prostate cancer cells [126]. Cells with depleted levels of CHD1 are sensitive to PARP inhibitors [126]. CHD1 disruption or deletion is negatively correlated with ERG rearrangements such as TMPRSS2-ERG [114, 115, 127, 128]. CHD1 is required for recruitment of AR at AR responsive promoters and for AR-dependent gene expression [114]. This explains the mutual exclusivity of CHD1 disruption or deletion and ETS gene family fusions, as AR-dependent expression is required for translocations of ERG [114].

1.2.6 Epigenetic proteins as drug targets

Aberrant expression and/or function of epigenetic writer, eraser, or reader proteins can induce cellular transformation and/or the development and progression of disease, with cancer being particularly well-studied. These proteins are therefore of significant interest to academic and pharmaceutical research. To facilitate research into the functional roles of epigenetic proteins, significant effort is going into the development of chemical probes and inhibitors for these proteins [129-132].

Chemical probes allow the function of a specific protein domain to be separated from scaffolding functions and from the function of any other domains of the often multi-domain epigenetic proteins. Currently, for a compound to be classified as a chemical probe it must have an in vitro affinity of <100 nM, exhibit >30-fold selectivity for the target compared to related sub-families, and possess a cellular potency of <1 µM [130]. This is distinct from inhibitors, which do not have such requirements, especially for selectivity. A number of broad-selectivity inhibitors targeting DNA methylases or histone deacetylases have been approved for the treatment of haematological malignancies [129, 133]. In addition, many inhibitory compounds targeting epigenetic proteins are undergoing clinical trials [129]. Of the three classes of epigenetic protein domains, reader domains have received the least attention in terms of chemical probe and drug development.
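The three chemical probe criteria quoted above from [130] can be expressed as a simple check. The sketch below is illustrative only: the class, its field names and the example compounds are hypothetical, and fold selectivity is assumed here to be the ratio of the tightest affinity measured against a related family member to the affinity for the target.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record of the measurements needed to apply the criteria."""
    name: str
    affinity_nM: float             # in vitro affinity for the intended target
    off_target_affinity_nM: float  # tightest affinity for a related family member
    cellular_potency_uM: float     # potency measured in a cellular assay


def fold_selectivity(compound: Compound) -> float:
    """Assumed definition: off-target affinity divided by on-target affinity."""
    return compound.off_target_affinity_nM / compound.affinity_nM


def is_chemical_probe(compound: Compound) -> bool:
    """Apply the thresholds: <100 nM affinity, >30-fold selectivity, <1 uM in cells."""
    return (
        compound.affinity_nM < 100
        and fold_selectivity(compound) > 30
        and compound.cellular_potency_uM < 1
    )


if __name__ == "__main__":
    # Both compounds and all values are invented for illustration.
    potent_selective = Compound("example-1", 40, 2000, 0.5)
    potent_unselective = Compound("example-2", 90, 100, 30)
    for c in (potent_selective, potent_unselective):
        print(c.name, f"{fold_selectivity(c):.0f}-fold", is_chemical_probe(c))
```

A compound that fails any one threshold, for example a potent but unselective binder, would still count as an inhibitor but not as a chemical probe under these criteria.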

1.2.7 Targeting reader domains

Computational studies analysing the ‘druggability’ of reader domains suggest that some types of reader domains might be more amenable to development of potent inhibitors than others [134, 135]. In general, acetyllysine reader domains were predicted to be more druggable than methyllysine reader domains as acetyllysine-binding pockets tend to be more enclosed, narrow, and hydrophobic than methyllysine-binding pockets [134]. There appears to be some variation in druggability within methyllysine reader domain families, e.g. PHF8 and ING4, two PHD-containing proteins that bind methyllysine, have very different druggability scores, with PHF8 predicted to be druggable and ING4 not [134]. Indeed, a cell-penetrating peptide inhibitor derived from H3 K4me3 has recently been reported for PHF8 [136]. The inhibitor is not yet selective for PHF8, so further development is required.

1.2.7.1 Acetyllysine reader domain inhibitors

The first inhibitors targeting acetyllysine reader domains were reported in 2010 [137, 138] and established acetyllysine readers as suitable targets for chemical probe and inhibitor development. Chemical probe and inhibitor development for BET (bromodomain and extra-terminal domain) family members has assisted in establishing the association of the BET family with a variety of diseases, including cancer [139, 140], auto-immune disease [141], and inflammation [138]. Development of inhibitors of non-BET acetyllysine reader domains has proven to be difficult due to their more open acetyllysine binding pockets; however, a number of inhibitors have been successfully developed [129, 142]. A number of acetyllysine reader-targeting compounds have undergone, or are currently undergoing, clinical trials for treatment of a range of disease pathologies [129].


1.2.7.2 Methyllysine reader domain inhibitors

UNC1215, which binds to L3MBTL3, was the first chemical probe identified for a methyllysine reader domain [143]. The methyllysine binding pocket of L3MBTL3 is amenable to compound development and the methyllysine is bound in an enclosed cavity in the cavity insertion mode (Figure 1.4). UNC1215 binds to L3MBTL3 in a unique 2:2 polyvalent mode (Figure 1.11), with an in vitro affinity of 40 nM and 50-fold selectivity for L3MBTL3 over other family members [143]. Further development produced compounds with increased selectivity but reduced affinity [144]. The unique 2:2 polyvalent binding mode and narrow methyllysine binding pocket might have allowed the development of high potency compounds.

Figure 1.11 UNC1215 binds in a unique 2:2 polyvalent mode to L3MBTL3. X-ray crystal structure (PDB 4FL6; [143]) of L3MBTL3 in complex with the chemical probe, UNC1215 (sticks; carbon, yellow; oxygen, red; and nitrogen, blue). One copy of L3MBTL3 is coloured wheat and the other copy is coloured blue. L3MBTL3 is modelled as a cartoon and surface model. The chemical structure of UNC1215 is displayed on the right.

Inhibitors for the CD-containing, H3 K9me3- and H3 K27me3-binding protein CBX7 have been developed [145, 146]. The only potent compounds that bind to CBX7 are large peptide-derived inhibitors, the most potent being UNC3866 (Kd ~100 nM; Figure 1.12). UNC3866 displays similar potency against CBX4, so is not selective and therefore cannot be considered a chemical probe. Large peptide-derived inhibitors typically exhibit low cell permeability, which is reflected in the high concentration (30 µM) of UNC3866 used in cellular assays. A small molecule inhibitor (MS37452), which binds to the aromatic cage of CBX7 and competitively inhibits H3 K27me3 binding, has been reported (Figure 1.12); however, MS37452 only has a potency of 28 µM [145].

The compound WAG-003 has been shown to bind to the PHD domains of KDM5A [147] and compound CF16 binds to the PHD domains of PYGO1 and PYGO2 [148]. Further development of CF16 is required as its affinity for the PHD fingers of PYGO1 and PYGO2 is very weak and its selectivity for these two domains must also be established. WAG-003 binds to KDM5A with an affinity of only 30 µM and is not selective as it binds KDM4A and ING2 PHD domains with similar affinity [147]. Recently, we reported the compound A366, which binds to the Spindlin reader domain of the Spindlin1 protein [149]. IC50 values of 200 nM were determined; however, selectivity remains an issue as this inhibitor was previously reported to be a selective inhibitor of the methyltransferases EHMT1 and EHMT2 [150].

Figure 1.12 Compounds bind to CBX7 CD. CBX7 CD displayed as a surface model (grey). UNC3866 (carbon, yellow; PDB 5EPJ; [146]), MS37452 (carbon, green; PDB 4X3T; [145]), and H3 K27me3 (carbon, cyan; PDB 2L1B; [151]) modelled as sticks. Chemical structures of UNC3866 and MS37452 are displayed.

Most structures of reader domain-containing proteins consist of only the reader domain. However, many reader domains form part of much larger multi-domain proteins or exist in larger protein complexes. If other binding partners or domains of the protein interact with the methyllysine- and acetyllysine-binding pockets, the chemical environment of these pockets could be altered, allowing more potent compounds to be developed. This could perhaps allow a binding mode similar to that of UNC1215 with L3MBTL3, potentially enabling more potent and selective inhibitors to be designed through exploitation of pockets from the reader and other domains or from alteration of the surface characteristics. It is therefore important, even in the early stages of compound development, to determine affinities in the context of the full-length protein, not just with the isolated reader domain.

Most studies targeting reader domains have focused on the pocket that binds the modified residue. However, these pockets can carry an electronegative charge and are often very shallow, making them difficult target sites for the development of potent and selective small molecule probes. An alternative approach could be to target nearby pockets, or other regions of the binding surface of these reader domains, that are more suitable for compound development.

A technique that could be exploited to identify suitable alternative sites is fragment-based drug discovery (FBDD). Because fragments are small and of low complexity, screening only hundreds of them explores a much wider range of chemical space than is possible with traditional high-throughput screening (HTS) approaches. Fragment screening could be appropriate to identify compounds binding to the PTM binding site but could also significantly aid the identification of suitable alternative sites for drug development through the identification of ‘hotspots’ for compound binding. The affinity of fragments for such hotspots is often due to the displacement of ‘unhappy’ water molecules [152]. This technique also has the potential to identify novel allosteric inhibition sites, which might be missed by other approaches. Fragments are low MW compounds (<300 Da) and typically bind with low affinity, so significant chemical elaboration of the initial hits is required to increase potency, which presents its own significant challenges. Nevertheless, FBDD remains a very exciting area for the development of compounds for difficult targets.

Fragment screening has already proven useful for the development of compounds targeting the bromodomain of KAT2B/PCAF [153]. Inhibitory compounds have been developed from fragment screening for BAZ2A and BAZ2B [154, 155], targets for which development of potent and selective compounds was predicted to be difficult [135, 156]. Additionally, fragment screening has been used to identify new chemical scaffolds binding to KAT2B, BRD4, CREBBP, and PHIP(2) [157, 158]. Fragment screening or development of peptide-derived inhibitors may prove to be the most appropriate methods for identifying and developing inhibitory compounds for the more challenging methyllysine reader domains.


Materials and Methods

2 Materials and Methods

2.1 General methods

2.1.1 Reagents

Unless explicitly stated, all reagents were obtained from Sigma-Aldrich (USA).

2.1.2 Primers and constructs

Lists of primers and constructs used in this study are supplied in Appendix 8.1.

2.1.3 Peptide synthesis

All peptides used in this study were synthesised by Peptide Specialty Laboratories GmbH (Heidelberg, Germany) and supplied as a powder. A full list of the peptides used in this study is provided in Appendix Table 8.3.

2.1.4 Site-directed mutagenesis

Site-directed mutagenesis was performed using the QuikChange Multi Site-Directed Mutagenesis Kit (Agilent Technologies, USA) according to the manufacturer’s instructions.

2.1.5 Protein-domain microarray

Performed as described in Espejo et al. (2002) [159].

2.2 Methods relating to chapter 3: Assembly of methylated KDM1A and CHD1 in prostate cells

Methods that are not explicitly described in this section were performed as described in Metzger et al. (2016) [11].

2.2.1 Cloning and protein expression

DNA encoding CHD1270-443 was cloned into pET15b by ligation-dependent cloning using EcoRI and BamHI restriction sites. The pET15b-CHD1270-443 vector was transformed into Escherichia coli BL21(DE3) pRIPL using the heat shock method. E. coli BL21(DE3) pRIPL pET15b-CHD1270-443 cells were grown in terrific broth at 37 °C while shaking at 200 rpm until an OD600 of 0.6-0.8 was reached. Cells were immediately cooled to 16 °C and expression was induced by addition of IPTG to a final concentration of 0.5 mM. After 18 h growth at 16 °C while shaking at 200 rpm, cells were pelleted by centrifugation at 5,000 rpm for 20 min. Cell pellets were flash-frozen in liquid nitrogen and stored at -80 °C.

2.2.2 Protein purification

2.2.2.1 Cell lysis

Cells were resuspended in 20 mM BTP pH 8.0, 250 mM NaCl, 20 mM imidazole and lysed by sonication. Cell debris was pelleted by centrifugation at 20,000 rpm for 1 h. Supernatant was then passed through a 0.22 µm filter.

2.2.2.2 Immobilised metal affinity chromatography (IMAC)

TALON® Superflow™ (GE Healthcare, USA) resin was added to filtered supernatant in a 50 mL Corning™ Falcon™ tube (Fisher Scientific, UK). Protein was left to bind to resin at 4 °C, while shaking, for >1 h. Resin was washed three times with 20 mM BTP pH 8.0, 250 mM NaCl, 20 mM imidazole. Protein was eluted from TALON® resin by addition of 20 mM BTP pH 8.0, 250 mM NaCl, 150 mM imidazole.

2.2.2.3 Gel filtration chromatography

Protein was further purified by gel filtration chromatography using an S75 16/60 column (GE Healthcare, USA) pre-equilibrated in 20 mM BTP pH 8.0, 250 mM NaCl.

2.2.3 Isothermal titration calorimetry (ITC)

ITC experiments were performed at 25 °C with a MicroCal VP-ITC microcalorimeter (GE Healthcare, USA). A sample cell filled with 100 µM CHD1 in 20 mM BTP pH 8.0, 75 mM NaCl was injected with 12 µL samples of peptide (1.5 mM). In total, 24 injections were performed with 240 s spacing between injections and a reference power of 15 μcal s−1. Automated baseline assignment and peak integration were performed with NITPIC version 1.0.1 [160]. Plotting of isotherms and global analysis of multiple titrations with the Simplex algorithm were performed using SEDPHAT version 10.58d [161, 162]. Isotherms were validated with 2D surface error projections at P levels of 0.68 and 0.95 in SEDFIT [161, 162]. Figures were generated by GUSSI [162].
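As a rough check on the titration design described above, the following sketch computes the total injected volume and the nominal peptide:protein molar ratio reached after the last injection. The cell volume of 1.4 mL is an assumption (a value typical for VP-ITC instruments, not stated in the text), the function name is illustrative, and displacement of cell contents during the injections is ignored.

```python
def final_molar_ratio(n_injections, injection_ul, syringe_mm, cell_um, cell_volume_ml):
    """Nominal ligand:protein molar ratio after the last injection.

    No correction is made for cell contents displaced by each injection.
    """
    injected_nmol = n_injections * injection_ul * syringe_mm  # µL x mM = nmol
    protein_nmol = cell_um * cell_volume_ml                   # µM x mL = nmol
    return injected_nmol / protein_nmol


ASSUMED_CELL_VOLUME_ML = 1.4  # assumption, not taken from the text

total_injected_ul = 24 * 12.0  # 24 injections of 12 µL
ratio = final_molar_ratio(24, 12.0, 1.5, 100.0, ASSUMED_CELL_VOLUME_ML)
print(f"total injected volume: {total_injected_ul:.0f} µL")
print(f"nominal final peptide:protein ratio: {ratio:.2f}")
```

Under these assumptions the titration ends at roughly a three-fold molar excess of peptide over protein in the cell.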

2.2.4 His6-CHD1270-443-KDM1A108-119 K114me2 co-crystallisation

The His6-CHD1270-443–KDM1A108-119 K114me2 complex was formed by mixing His6-CHD1270-443 (14 mg mL−1) with a six-fold molar excess of KDM1A108-119 K114me2 peptide. Crystals of the complex (space group P21212) were grown by sitting-drop vapour diffusion at 4 °C in buffer containing 100 mM HEPES, pH 7.5, 0.2 M L-proline, and 10% w/v PEG 3,350. Crystals formed within four days. Prior to flash cooling in liquid nitrogen, crystals were cryoprotected in reservoir solution with 20% v/v ethylene glycol.

2.2.5 X-ray diffraction data collection and analysis

Initial X-ray diffraction data were collected at Swiss Light Source, beamline PXI. The dataset for the final refined model was collected with a Pilatus 6M detector (Dectris, Switzerland) at Diamond Light Source, at 100 K, at beamline I04 with a wavelength of 0.9795 Å. X-ray diffraction data were processed and analysed with XDS [163] and Aimless [164]. The structure was solved by molecular replacement with Phaser [165] using PDB 2B2W [92] chain A as a search model. Manual building and refinement of the model were performed with Coot [166] and Refmac5 [167] in the CCP4 program suite [168] and BUSTER version 2.11.5 [169]. Model validation was performed with MolProbity [170] and the RCSB Validation Server. All residues are in favoured regions as assessed by Ramachandran plot analysis. Statistics for data collection and refinement are provided in Table 3.1. The structure has been deposited in the PDB (5AFW). Structural figures were generated with PyMOL (The PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC) or CCP4mg [171]. Electrostatic surface potential was determined using APBS [81].

2.2.6 Peptide pull-down

2.2.6.1 Protein expression and purification of GST-tagged proteins

pGEX-4T-1 vectors encoding GST-tagged CHD6276-436 and CHD9685-840 were provided by Prof. Dr Mark T. Bedford (University of Texas). The pGEX-4T-1 vectors were transformed into E. coli BL21 (DE3) pRIL cells and expressed as described in section 2.2.1. Cells were lysed as described in section 2.2.2.1. Lysis buffer used: 50 mM Tris pH 7.5, 250 mM NaCl. Supernatant was incubated with glutathione sepharose 4B beads (GE Healthcare, USA) for 90-120 min at 4 °C with shaking. Glutathione sepharose 4B beads were washed four times with lysis buffer. Bound protein was eluted with 50 mM Tris pH 7.5, 250 mM NaCl, 20 mM reduced L-glutathione.


2.2.6.2 Peptide binding

C-terminally biotinylated peptides were bound to SoftLink™ soft release avidin resin (Promega, USA) by addition of 100 µg of peptide to 50 µL of resin. Binding buffer (50 mM Tris pH 8.0, 200 mM NaCl, 0.3% v/v NP-40) was added to a total reaction volume of 100 µL. Reaction mixtures were incubated on a rotatory wheel at 4 °C for 2 h. Resin was washed three times with 1 mL of 50 mM Tris pH 8.0, 200 mM NaCl, 0.3% v/v NP-40.

2.2.6.3 Peptide pull-down assay

Protein (10 µg) was added to 15 µL of resin, prepared as described in section 2.2.6.2, and 50 mM Tris pH 8.0, 200 mM NaCl, 0.3% v/v NP-40 buffer was added to a final volume of 1 mL. Protein-peptide-resin mixture was incubated on a rotatory wheel at 4 °C for 45 min. Resin was then washed three times with 1.5 mL of buffer. Protein bound to the peptide-linked resin was eluted by addition of 30 µL 1x SDS-PAGE loading dye followed by incubation at 95 °C for 5 min. Eluted samples were loaded onto an SDS-PAGE gel and visualised using Coomassie blue staining.

2.3 Methods relating to chapter 4: Fragment screening using X-ray crystallography

Protein expression and purification were performed as described in section 2.2.

2.3.1 Crystallisation

The His6-CHD1270-443 and H31-12 K4me3 complex was formed by mixing His6-CHD1270-443 (14 mg mL−1) with a six-fold molar excess of H31-12 K4me3 peptide. Seed stocks of crushed His6-CHD1270-443 and KDM1A108-119 K114me2 co-crystals at different dilutions were prepared in 100 mM HEPES, pH 7.5, 0.2 M L-proline, 10% w/v PEG 3,350 using the contemporary method as described in the Seed Bead™ Kit (Hampton Research, USA). Matrix microseeding was performed with these seed stocks. The following matrix screens were explored: PACT premier™, JCSG-plus™, and Wizard™ III+IV from Molecular Dimensions (UK); and PEG/Ion HT™ and Crystal Screen HT™ from Hampton Research (USA). Crystal conditions identified from initial screens were then optimised and X-ray diffraction data from these crystals were collected, without additional cryoprotectant, at Diamond Light Source beamline I04. His6-CHD1270-443 and H31-12 K4me3 co-crystals (P21212) used for fragment screening were grown by sitting-drop vapour diffusion at 4 °C in buffer containing 0.01 M sodium/potassium phosphate buffer, 0.1 M BTP pH 5.5, and 25% w/v PEG 3,350.

Drops composed of 150 nL His6-CHD1270-443-H31-12 K4me3 complex, 45 nL seed stock, and 105 nL mother liquor were set up in Swissci 3-drop crystallisation plates (Molecular Dimensions Ltd, UK) using a Mosquito® Crystal robot (TTP Labtech Ltd, UK).

2.3.2 Imaging and ranking crystal drops

Crystallisation drops were imaged by a Formulatrix® Rock Imager 1000 (Formulatrix, UK). Drops were ranked according to the likelihood that a crystal is present using TeXRank [172]. The ranked crystallisation drops were then examined manually, first to confirm the presence of suitable crystal(s) for fragment soaking, and second to assign a location for fragment dispensing near the edge of the drop away from the crystal(s). To avoid damaging the crystal(s), the assigned coordinates were then used for dispensing DMSO and fragments to the different drops, as described in sections 2.3.4 and 2.3.5.

2.3.3 Fragment library

The Diamond-SGC Poised Library (DSPL) was used for fragment screening. Fragments are dissolved at 500 mM concentration in DMSO, with a small subset in ethylene glycol. The fragment library was supplied by the XChem Facility at Diamond Light Source (UK). Details of the full library can be found at http://www.diamond.ac.uk/Beamlines/Mx/Fragment-Screening/Fragment-Libraries.html and information regarding the design of the DSPL fragment library is provided in [158].

2.3.4 DMSO tolerance test

Dispensing of DMSO was performed using an Echo® 550 liquid handler (Labcyte Inc., USA). DMSO was added to crystal-containing drops to final concentrations of 0, 5, 10, 20, and 30% (percentages were calculated based on the original drop volume prior to vapour diffusion). Crystals were soaked in DMSO for 1 or 3 h at 4 °C before flash cooling in liquid nitrogen without additional cryoprotection. X-ray diffraction data were collected with a Pilatus 6M detector (Dectris AG, Switzerland) at 100 K at Diamond Light Source beamline I04 with a wavelength of 0.9795 Å. Optimal DMSO concentration and soaking time were determined by examination of X-ray diffraction data.

2.3.5 Fragment soaking

Fragments from DSPL were added to crystal-containing drops to a final concentration of 100 mM (20% DMSO) using an Echo® 550 liquid handler (Labcyte Inc., USA) and soaked for 1 h at 4 °C. One crystal per drop was flash-cooled.
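The soaking arithmetic can be illustrated with a short sketch. It assumes that the percentage convention of section 2.3.4 (percentages relative to the original drop volume) also applies to the fragment soak, that the original drop volume is the 300 nL set up in section 2.3.1, and that the acoustic dispenser transfers liquid in 2.5 nL droplets; the droplet size and all function names are assumptions made for illustration only.

```python
ASSUMED_DROPLET_NL = 2.5  # assumption: droplet increment of the acoustic dispenser


def soak_volume_nl(drop_volume_nl, percent_of_drop):
    """Volume to dispense, rounded to a whole number of droplets."""
    raw_volume = drop_volume_nl * percent_of_drop / 100.0
    n_droplets = round(raw_volume / ASSUMED_DROPLET_NL)
    return n_droplets * ASSUMED_DROPLET_NL


def nominal_concentration_mm(stock_mm, added_nl, drop_volume_nl):
    """Nominal concentration relative to the original drop volume."""
    return stock_mm * added_nl / drop_volume_nl


drop_nl = 150 + 45 + 105             # complex + seed stock + mother liquor (nL)
added_nl = soak_volume_nl(drop_nl, 20)
fragment_mm = nominal_concentration_mm(500, added_nl, drop_nl)
print(f"dispense {added_nl} nL of 500 mM stock -> nominal {fragment_mm} mM")
```

With these assumptions, 60 nL of a 500 mM stock corresponds to the nominal 100 mM (20% DMSO) quoted above. The true concentration in the drop depends on how far the drop volume has changed during vapour diffusion, which this sketch does not model.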

2.3.6 X-ray diffraction data collection

X-ray diffraction data for the 714 crystals were collected with a Pilatus 6M-F detector (Dectris AG, Switzerland) at 100 K at Diamond Light Source beamline I04-1 with a wavelength of 0.920 Å.

2.3.7 X-ray diffraction data analysis

X-ray diffraction data were analysed using the Diamond Light Source xia2 [173] or autoPROC [174] autoprocessing pipelines. The xia2 pipeline makes use of the following programs: DIALS [175]; XDS [163]; autoPROC [174]; POINTLESS [176]; and CCP4 [168]. The autoPROC pipeline makes use of the following programs: XDS [163]; CCP4 [168]; AIMLESS [164]; POINTLESS [176]; and STARANISO [177].

2.3.8 Model building, ligand identification and modelling, and refinement

All steps were performed within the data-management software package XChemExplorer (XCE; [178]). XCE utilises the suite of programs from Phenix [179] and CCP4 [168]. The ‘best’ autoprocessed data produced by the pipelines were selected through a combination of automated and manual analysis. Data were reprocessed if required. XCE generated electron density maps using the difference map pipeline, DIMPLE, in CCP4 [168]. AceDRG [180] was used for generating ligand restraints. Fragment binding was detected by PanDDA (Pan-Dataset Density Analysis; [181]; section 2.3.8.1). Manual building and refinement of the models were performed with Coot [166] and REFMAC [167], respectively. Validation of protein and fragment models was performed with MolProbity [170] and EDSTATS [182].


2.3.8.1 PanDDA

The PanDDA method allows extraction of obscured weak signals (e.g. from weakly bound, low occupancy fragments) from diffraction data. A detailed description of this recently developed technique can be found in Pearce et al. (2017) [181]; the method is now available as part of CCP4 [168].

All input models, which include solvent molecules, must be completely identical in atomic composition and numbering for PanDDA. All of the models from the fragment screen were generated by DIMPLE using a refined reference model, so models from crystals in the same space group are identical; any that were not were rejected from the analysis. Two different crystal forms were identified, in space groups P21212 and P212121, which differ in their atomic composition and number of CHD1 molecules in the asymmetric unit (see Appendix 7.1). PanDDA was performed separately with the two groups.

For comparison of maps from different crystals, which are not perfectly isomorphous, the maps are first aligned using the refined models as reference points. Additionally, the maps must have been calculated at the same level of detail. Higher resolution datasets must therefore be truncated to the target dataset resolution limit, while datasets of lower resolution than the target dataset are excluded from analysis. Maps are recalculated at the truncated diffraction data resolution limit. Following alignment of all the maps for a certain resolution, the electron density from ground-state datasets is used to parameterise a statistical model. Each aligned map is placed on a grid allowing sampling of the electron density to be performed at every grid point in each dataset. The observed value of the electron density at each grid point m, in dataset i, is treated as being sampled from a distribution:

$$\rho^{\mathrm{observed}}_{i,m} = \rho^{\mathrm{true}}_{m} + \varepsilon_i$$

where $\rho^{\mathrm{true}}_{m}$ is the dataset-independent, natural variation at grid point m and $\varepsilon_i$ is the experimental uncertainty of the electron density for dataset i. Areas of a dataset that deviate from the average map, termed ‘events’, can then be determined. The use of multiple datasets in the analysis allows the calculation of Z-scores. A Z-score for dataset i, at grid point m, can be calculated from:

$$Z_{i,m} = \frac{\rho^{\mathrm{observed}}_{i,m} - \mu_m}{\sqrt{\sigma_i^2 + s_m^2}}$$

where $\mu_m$ is the mean value of the electron density at grid point m, $\sigma_i$ is the uncertainty of dataset i, and $s_m^2$ is the variance of the ‘true’ electron density at point m. A description of how all of these parameters and equations are derived is provided by Pearce et al. (2017) [181].
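The Z-score expression can be written as a few lines of code. This is a minimal sketch rather than the PanDDA implementation: it takes the mean density, the dataset uncertainty, and the per-point spread as given inputs (their estimation is described in Pearce et al. [181]), treats the spread as a standard deviation because it enters the expression squared, and uses an illustrative function name.

```python
import math


def z_scores(observed, mean, sigma_i, s_m):
    """Z value at each grid point m of a single dataset i.

    observed : observed density values of dataset i, one per grid point
    mean     : mean density at each grid point (mu_m)
    sigma_i  : uncertainty of dataset i (a single number)
    s_m      : spread of the 'true' density at each grid point
    """
    return [
        (rho - mu) / math.sqrt(sigma_i ** 2 + s ** 2)
        for rho, mu, s in zip(observed, mean, s_m)
    ]


# Toy values for two grid points
print(z_scores([5.0, 10.0], [0.0, 0.0], 3.0, [4.0, 4.0]))
```

A grid point is therefore flagged only when its deviation from the mean is large relative to both the noise of that dataset and the natural variation at that point.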

A large Z-score is indicative of a significant deviation in the map from the average map. Z-maps are contoured at Z=2.5. Areas of significance are then identified by filtering the remaining patches of density, retaining those that reach Z=3.0 and have a volume of ≥10 Å³. Nearby patches of density are grouped together if the minimum distance between the two patches is <5 Å. For every event identified by analysis of Z-maps, a background density correction (BDC) factor is estimated. In order to estimate the BDC, different fractions of the average map are subtracted from the individual map. Both the local (location of the event as defined by the Z-blob) and global correlations between the average map and the subtracted map are determined. As the fraction of the average map being subtracted from the map increases, the local correlation between the two maps should decrease faster than the global correlation. This is because the local density is a mixture of the average map and something else. The BDC is the fraction with the largest difference between the local and global correlation. The event map, which is then used for modelling, can be determined as shown:

[event map] = [dataset map] – BDC × [ground state map]
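The BDC search described above can be sketched as follows. This is an illustration under stated assumptions and not the PanDDA code: maps are flat lists of density values, the ground state map is taken to be the average map, the event region is passed in as a list of grid indices, and the candidate fractions are supplied by the caller. All function and parameter names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def event_map(dataset_map, ground_map, bdc):
    """[event map] = [dataset map] - BDC x [ground state map], point by point."""
    return [d - bdc * g for d, g in zip(dataset_map, ground_map)]


def estimate_bdc(dataset_map, ground_map, event_points, fractions):
    """Return the fraction that maximises (global - local) correlation."""
    best_fraction, best_gap = None, -math.inf
    for fraction in fractions:
        subtracted = event_map(dataset_map, ground_map, fraction)
        global_cc = pearson(subtracted, ground_map)
        local_cc = pearson(
            [subtracted[i] for i in event_points],
            [ground_map[i] for i in event_points],
        )
        gap = global_cc - local_cc
        if gap > best_gap:
            best_fraction, best_gap = fraction, gap
    return best_fraction


# Toy example: points 0-3 hold a mixture of ground-state and 'bound' density
ground = [1.0, -1.0, 1.0, -1.0] + [2.0, -2.0] * 4
bound = [1.0, 1.0, -1.0, -1.0]
dataset = [0.7 * g + 0.3 * b for g, b in zip(ground[:4], bound)] + ground[4:]
bdc = estimate_bdc(dataset, ground, [0, 1, 2, 3], [i / 20 for i in range(20)])
print(bdc)
```

The candidate fractions stop short of 1.0 so that the subtracted map never becomes constant outside the event region, which would make the global correlation undefined.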

2.3.8.2 Modelling and refinement

Fragments were modelled in Coot [166] within the XCE [178] framework in the reference frame using the event and Z-maps. Changes to nearby solvent molecules and CHD1 protein are also modelled based on these two maps. An ensemble model of the changed (or bound) state plus the ground state (model without fragment) is generated. Any differences between the two states are modelled as unique alternate conformations (e.g. bound = D conformation, ground = C conformation). Initial occupancy of the bound conformation is determined by:

occupancy = 2 × [1 – BDC]

Maps are then back-transformed to the native crystallographic frame and refinement of the ensemble models is then performed against the native dataset by REFMAC [167], as part of the program suite CCP4 [168] employed within the XCE framework [178]. Constrained group occupancy refinement is performed with the bound and ground state conformers, which ensures that their occupancies sum to one.
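A minimal sketch of the starting occupancies implied by these two rules is given below. The cap at 1.0 is an assumption added here so that the bound and ground states can always sum to one; it is not stated in the text, and the function name is illustrative.

```python
def initial_occupancies(bdc):
    """Starting (bound, ground) occupancies from a BDC estimate."""
    bound = 2.0 * (1.0 - bdc)   # occupancy = 2 x [1 - BDC]
    bound = min(bound, 1.0)     # assumption: cap so that the pair can sum to one
    ground = 1.0 - bound        # constrained pair sums to unit occupancy
    return bound, ground


bound_occ, ground_occ = initial_occupancies(0.85)
print(f"bound: {bound_occ:.2f}, ground: {ground_occ:.2f}")
```

These values serve only as a starting point; the constrained group occupancy refinement then adjusts them against the native data.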

2.3.9 Fragment validation

A number of different validation criteria were used to assess the quality of the data supporting the modelling of each fragment. The validation program EDSTATS [182] was used within CCP4 [168] for the generation of fragment validation statistics. Five ligand validation metrics were employed to assess fragment modelling: RSCC, RSZD, RSZO/OCC, B-factor ratio, and RMSD. Real-space correlation coefficient (RSCC) provides a measure of the overall agreement of the model with the data. Real-space Z-difference score (RSZD) provides a measure of model accuracy through analysis of the difference density surrounding the atoms. Real-space observed density Z-score (RSZO; [182]) is a model precision metric and is calculated by dividing the average of the density by the map noise. A normalised parameter, RSZO/OCC, in which the RSZO is divided by the occupancy, is used in place of RSZO. Because the density is directly proportional to the occupancy of a molecule, the RSZO/OCC metric enables direct comparisons between fragments with different occupancies. B-factor ratio provides an assessment of the consistency of the fragment with the local environment and is determined by dividing the B-factor of the fragment by those of amino acid residues within 4 Å of the fragment. The final metric is fragment RMSD, which is determined by comparing the atomic coordinates of the fragment after refinement with the coordinates in the original ensemble model and provides an assessment of the fragment stability during refinement. Ideal values for each of the validation metrics are provided in Table 4.1.
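Two of the derived metrics reduce to simple ratios, sketched below. The sketch assumes that the B-factor ratio compares the mean B-factor of the fragment atoms with the mean B-factor of the atoms in residues within 4 Å; how the averages are taken is not specified in the text, the selection of neighbouring atoms is left to the caller, and the function names are illustrative.

```python
def rszo_per_occupancy(rszo, occupancy):
    """RSZO normalised by occupancy (RSZO/OCC)."""
    if occupancy <= 0.0:
        raise ValueError("occupancy must be positive")
    return rszo / occupancy


def b_factor_ratio(fragment_b_factors, neighbour_b_factors):
    """Mean fragment B-factor divided by mean B-factor of nearby residue atoms.

    Averaging over atoms is an assumption made for this sketch.
    """
    mean_fragment = sum(fragment_b_factors) / len(fragment_b_factors)
    mean_neighbours = sum(neighbour_b_factors) / len(neighbour_b_factors)
    return mean_fragment / mean_neighbours


# Toy values: a half-occupied fragment with B-factors above its surroundings
print(rszo_per_occupancy(1.5, 0.5))
print(b_factor_ratio([30.0, 50.0], [20.0, 20.0]))
```

Dividing RSZO by the occupancy puts a half-occupied fragment on the same scale as a fully occupied one, which is the comparison the normalised metric is intended to allow.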


2.3.10 High-throughput cloning and expression of DCDs from other CHD proteins

Synthetic DNA encoding CHD2, CHD4, CHD6, CHD8, and CHD9 was ordered from Integrated DNA Technologies (USA). The DCDs from each of these proteins were amplified by PCR with PfuUltraII hot start polymerase (Agilent Technologies, USA), using a standard protocol. Details of the primers and constructs generated are supplied in Appendix 8.1. Construct lengths for CHD2, CHD4, CHD6, and CHD9 were based on previous studies reporting structural, biophysical, or biochemical data with the DCDs, or the CHD4 PHD domains [76, 84]. Secondary structure prediction using the Phyre2 web portal [183] was used to assist the design of a number of different CHD8 DCD constructs. Cloning and small-scale expression tests were performed using the Oxford Protein Production Facility protocol: http://www.oppf.rc-harwell.ac.uk/OPPF/protocols/.

2.3.11 Large-scale protein expression and purification trials

Constructs and expression conditions were selected for scale-up based on results from small-scale expression tests. IPTG or auto-induction methods were used to induce expression of protein constructs in 1 L of E. coli Lemo21(DE3) at 16, 20, 25, or 37 °C in the presence or absence of varying concentrations of L-rhamnose for 18 or 24 h after induction (3 h for cultures grown at 37 °C). Cells were pelleted and stored as described in section 2.2.1. Purification protocols were developed for constructs with high expression levels. These protocols are summarised in Appendix 8.2.

2.4 Methods relating to chapter 5: Other potential readers of KDM1A methylated at K114

2.4.1 Cloning and protein expression

2.4.1.1 pET15b-KDM4A895-1010

DNA encoding KDM4A895-1010 was cloned into pET15b by ligation-dependent cloning using EcoRI and BamHI restriction sites. pET15b encodes a non-cleavable, N-terminal His6 tag. pET15b-KDM4A895-1010 was transformed into E. coli BL21(DE3) pRIL using the heat shock method. Protein expression was performed as described in section 2.2.1.


2.4.1.2 pOPINF-KDM4A895-1010

DNA encoding KDM4A895-1010 was cloned into pOPINF using In-Fusion™ (Clontech, USA) enzyme according to standard protocols. pOPINF encodes an N-terminal His6 tag with a Human Rhinovirus (HRV) 3C protease cleavage site between the His6 tag and the cloned gene. pOPINF-KDM4A895-1010 was transformed into E. coli Lemo21(DE3) using the heat shock method. Protein expression was performed as described in section 2.2.1.

2.4.2 Protein purification

2.4.2.1 Cell lysis

Cells were resuspended in 50 mM Tris pH 7.5, 500 mM NaCl, 20 mM imidazole and lysed using a Constant Cell Disruption System (Constant Systems Ltd., UK). Cell debris was pelleted by centrifugation at 20,000 rpm for 1 h. Supernatant was passed through a 0.22 µm filter.

2.4.2.2 IMAC

Supernatant was passed through a 5 mL HisTrap™ HP column (GE Healthcare, USA). The column was attached to an ÄKTApurifier (GE Healthcare, USA). Weakly bound proteins were washed off the column by flowing through 50 mM Tris pH 7.5, 500 mM NaCl, 20 mM imidazole until the A280 stabilised. Protein was then eluted using a linear gradient of increasing imidazole concentration.

His6-KDM4A895-1010 was further purified by gel filtration (section 2.4.2.5) following assessment of sample identity and purity by SDS-PAGE.

His6-3C-KDM4A895-1010 underwent further purification steps (sections 2.4.2.3 and 2.4.2.4) prior to gel filtration chromatography.

2.4.2.3 3C cleavage and dialysis

His6-HRV-3C enzyme was added to His6-3C-KDM4A895-1010 at a ratio of 50 µg of His6-HRV-3C per mg of His6-3C-KDM4A895-1010. The reaction mixture was transferred to moistened dialysis tubing (MWCO 3000 Da) and dialysed overnight at 4 °C into 2 L of 50 mM Tris pH 7.5, 300 mM NaCl, 1 mM DTT.

2.4.2.4 Reverse IMAC

The dialysed reaction mixture from section 2.4.2.3 was passed through a 1 mL HisTrap™ HP column (GE Healthcare, USA) at 4 °C. Unbound, cleaved protein was collected. Any protein that remained bound to the column was eluted with 50 mM Tris pH 7.5, 300 mM NaCl, 500 mM imidazole. Samples were run on an SDS-PAGE gel to assess the efficiency of 3C cleavage and His6 tag removal from the protein, and to assess the purity of the sample.

2.4.2.5 Gel filtration chromatography

Protein from sections 2.4.2.2 and 2.4.2.4 was further purified by gel filtration chromatography using an S75 16/60 column (GE Healthcare, USA) pre-equilibrated in 50 mM Tris pH 7.5, 300 mM NaCl (200 mM NaCl for crystallisation experiments).

2.4.3 SEC-MALS

Size exclusion chromatography and multiangle light scattering (SEC-MALS) was performed with His6-KDM4A895-1010 and KDM4A895-1010. Protein samples (100 µL) at 2 mg mL−1 were loaded onto a Superdex 75 Increase 10/300 GL column (GE Healthcare, USA), pre-equilibrated in 50 mM Tris pH 7.5, 200 mM NaCl, in-line with a DAWN HELEOS II light scattering detector (Wyatt Technology Ltd, UK). Data acquisition and analysis were performed with ASTRA® software (Wyatt Technology, UK).

2.4.4 Isothermal titration calorimetry

ITC experiments were performed at 10 °C with a MicroCal VP-ITC microcalorimeter (GE Healthcare, USA). Experiments were performed by injection of 10 µL of peptide (0.9-1.2 mM) into the sample cell containing 40-60 µM of KDM4A895-1010 in 50 mM Tris pH 7.5, 30 mM NaCl. A total of 27 injections were performed with a spacing of 300 s and a reference power of 8 µcal s−1. Data analysis and isotherm validation were performed as described in section 2.2.3.

2.4.5 Crystallisation

X-ray diffraction datasets were collected from four different types of His6-KDM4A895-1010 crystals: apo His6-KDM4A895-1010 crystals in space groups I4132 and P63; His6-KDM4A895-1010 in complex with KDM1A108-119 K114me3 (P1); and KDM4A895-1010 in complex with KDM1A108-119 K114me2 (P21).

2.4.5.1 I4132

His6-KDM4A895-1010 apo crystals were formed by mixing His6-KDM4A895-1010 (24 mg mL−1) with a two-fold molar excess of KDM1A108-119 K114me3. Crystals (space group I4132) were grown by sitting-drop vapour diffusion at 20 °C in buffer containing 1 M ammonium sulfate, 0.1 M Bis-Tris pH 5.5, 1% w/v PEG 3,350. Crystals formed within three days. Crystals were cryoprotected by addition of 10% v/v glycerol to crystallisation solution.

2.4.5.2 P63

His6-KDM4A895-1010 apo crystals (space group P63) were formed initially by mixing His6-KDM4A895-1010 (24 mg mL−1) with a two-fold molar excess of KDM1A108-119 K114me3 and preparing matrix microseeding screens with Index™ (Hampton Research, USA) using microseeds from I4132 crystals grown as described in section 2.4.5.1. Crystals were grown by sitting-drop vapour diffusion at 20 °C in buffer containing 1 M succinic acid pH 7.0, 0.1 M HEPES pH 7.0, 1% w/v PEG MME 2,000. Crystals formed within seven days.

2.4.5.3 P21

Crystals of the KDM4A895-1010 – KDM1A108-119 K114me2 complex were formed by mixing KDM4A895-1010 (48 mg mL−1) with a 1.25-fold molar excess of KDM1A108-119 K114me2. Crystals (space group P21) were grown by sitting-drop vapour diffusion at 20 °C in buffer containing 0.03 M magnesium chloride hexahydrate, 0.03 M calcium chloride dihydrate, 0.1 M imidazole/MES monohydrate, pH 6.5, 12.5% v/v MPD, 12.5% w/v PEG 1,000, 12.5% w/v PEG 3,350. Crystals formed within four weeks.

2.4.5.4 P1

Crystals of the His6-KDM4A895-1010 and KDM1A108-119 K114me3 complex were formed by mixing His6-KDM4A895-1010 (13 mg mL−1) with a two-fold molar excess of KDM1A108-119 K114me3. Microcrystals of His6-KDM4A895-1010 (space group I4132, section 2.4.5.1) were generated using the contemporary method of seed stock preparation as described in the Seed Bead™ Kit (Hampton Research, USA). Initial crystals were generated by streaking these diluted microseeds across crystallisation drops using the seeding tool from Hampton Research, USA. Crystals (space group P1) were grown by sitting-drop vapour diffusion at 20 °C in buffer containing 0.2 M ammonium sulfate, 0.1 M Bis-Tris pH 6.0, 27.5% w/v PEG 3,350 and were optimised by further seeding experiments. Crystals formed within five days.


2.4.6 X-ray diffraction data collection and analysis

X-ray diffraction data from I4132 and P63 apo-crystals were processed using DIALS [175] as part of xia2 [173]. Data from multiple P1 His6-KDM4A895-1010 and KDM1A108-119 K114me3 co-crystals were integrated with XDS [163], then combined and merged using BLEND [184]. The data from the P21 KDM4A895-1010 and KDM1A108-119 K114me2 complex crystal were processed with XDS [163] as part of xia2 [173].

Structures were solved by molecular replacement with PDB 2QQR [52] as a search model using Phaser [165]. PDB 2QQR [52] was split into two search models, HTD-2 and HTD-1. Molecular replacement was performed first with HTD-2 and then with HTD-1. This was necessary for each different space group. Manual building was performed as described in section 2.5.1. Refinement was performed with Refmac5 [167] in the CCP4 program suite [168], phenix.refine [185] in the Phenix program suite [179], and BUSTER [169]. Model validation and generation of structural figures were performed as described in section 2.5.1.


Assembly of methylated KDM1A and CHD1 in prostate cells

3 Assembly of methylated KDM1A and CHD1 in prostate cells

The work presented in this chapter is the result of a collaboration between myself and the following individuals: Eric Metzger, Dominica Willmann, Ignasi Forne, Philipp Metzger, Stefan Gerhardt, Kerstin Petroll, Anne von Maessenhausen, Sylvia Urban, Anne-Kathrin Schott, Alexsandra Espejo, Adrien Eberlin, Daniel Wohlwend, Katrin Schüle, Michael Schleicher, Sven Perner, Mark Bedford, Manfred Jung, Jörn Dengjel, Ralf Flaig, Axel Imhof, Oliver Einsle, and Roland Schüle. This chapter provides a summary of the work presented in Metzger et al. (2016) [11], but places greater emphasis on the biochemical, biophysical, and structural data. I was directly involved in the cloning, expression, purification, crystallisation, and ITC experiments with CHD1 DCD and the peptide pull-down experiments with CHD6 and CHD9 DCDs. Other experimental work presented in this chapter and in Metzger et al. (2016) [11] was performed by one or more of the individuals listed above.

KDM1A has been suggested to play a role in AR-dependent prostate cancer [17]. However, it had not been established whether KDM1A plays a role in the formation of genomic translocations, such as TMPRSS2-ERG, in prostate cells. Given the prevalence of genomic translocations in prostate cancer and the therapeutic interest in KDM1A, the aim of the work presented in this chapter was to determine what role, if any, KDM1A has in the formation of androgen-dependent translocations in prostate cells.

3.1 EHMT2 methylates KDM1A

Euchromatic histone-lysine N-methyltransferase 2 (EHMT2) was identified from protein correlation profiling of human LNCaP cells subjected to stable isotope labelling by amino acids in cell culture as having a similar enrichment profile to KDM1A (Figure 3.1a). The sequence of KDM1A was examined for lysine residues that might be methylated by EHMT2. One potential site of lysine methylation was KDM1A K114, which is part of the flexible N-terminal region of KDM1A. This region had recently been reported to be post-translationally modified through phosphorylation of S111 by PKCα [28], hinting that this region might be amenable to other PTMs, such as methylation. To determine if KDM1A K114 might be a site of methylation for EHMT2, the methyltransferase domain of EHMT2 expressed with an N-terminal GST tag (GST-EHMT2786-1210) was incubated in the presence of ³H-labelled S-adenosyl-methionine (SAM) with either GST-KDM1A or GST-tagged KDM1A in which K114 had been mutated to alanine (GST-KDM1A K114A; Figure 3.1b). GST-KDM1A was methylated by GST-EHMT2786-1210, whereas the GST-KDM1A K114A mutant displayed no evidence of methylation, indicating that KDM1A K114 might be a new site of EHMT2 methylation. Methylation of KDM1A K114 by EHMT2 in vitro was confirmed by western blot analysis of GST-KDM1A or GST-KDM1A K114A incubated with GST-EHMT2786-1210 with or without the addition of SAM. Visualisation using a specific anti-KDM1A K114me2 antibody revealed methylation of GST-KDM1A but not the GST-KDM1A K114A mutant by GST-EHMT2786-1210 (Figure 3.1c).

Figure 3.1 EHMT2 methylates KDM1A K114. (a) Protein enrichment profiles from a cluster of 166 proteins from PCP-SILAC with LNCaP cells treated with the androgen, DHT. KDM1A (red) and EHMT2 (orange) profiles are displayed together with the other 164 proteins (grey). (b) Autoradiograph and Coomassie stained gel. Sample composition indicated by the labels above the gels. (c) GST-KDM1A or GST-KDM1A K114A were incubated with or without addition of GST-EHMT2786-1210 or SAM as indicated. Western blot analyses performed with anti-EHMT2, anti-KDM1A, and anti-KDM1A K114me2. PCP-SILAC, protein correlation profiling-stable isotope labelling by amino acids in cell culture. Figures are from Metzger et al. (2016) [11].

To determine if KDM1A is methylated at K114 by EHMT2 in vivo, Flag-KDM1A or Flag-KDM1A K114A were expressed in the presence or absence of Flag-EHMT2 in 293T cells. Immunoprecipitation of cell extracts with an anti-Flag antibody followed by western blot analysis with anti-KDM1A K114me2 antibody revealed that only Flag-KDM1A was methylated in the presence of Flag-EHMT2 (Figure 3.2a). In LNCaP cells, methylation of KDM1A K114 can be increased by overexpression of EHMT2 (Figure 3.2b), or decreased either through knockdown of EHMT2 with siRNA (Figure 3.2c) or through inhibition of EHMT2 with the EHMT2 inhibitor BIX-01294 ([186]; Figure 3.2d). These experiments confirm that KDM1A is methylated at K114 by EHMT2.

Figure 3.2 EHMT2 methylates KDM1A K114 in vivo. (a) Extracts from 293T cells transfected with either Flag-KDM1A or Flag-KDM1A K114A with or without Flag-EHMT2 addition. Western blot performed with anti-Flag, anti-KDM1A K114me2, and anti-KDM1A. (b-d) Western blots performed with anti-EHMT2, anti-KDM1A K114me2, anti-KDM1A, and anti-α-tubulin (control). (b) Control or EHMT2-overexpressing LNCaP cells. (c) Control siRNA or EHMT2 siRNA-treated LNCaP cells. (d) LNCaP cells cultured with or without the addition of the EHMT2 inhibitor, BIX-01294. Figures are from Metzger et al. (2016) [11].

3.2 CHD1 is a KDM1A K114me2 reader protein

To determine what functional role methylation of KDM1A K114 might have, protein-domain microarrays were probed with unmethylated and dimethylated KDM1A104-127 K114 peptides to identify potential KDM1A K114me2 reader proteins. The microarray contained protein domains that were known to recognise methylated lysine residues or were putative reader domains. The KDM1A104-127 K114me2 peptide, but not the unmethylated KDM1A104-127 peptide, interacted with the GST-tagged CHD1 DCD, indicating that CHD1 might specifically recognise KDM1A K114me2 (Figure 3.3). Interestingly, several other CHD family members included in the protein-domain microarray were not observed to bind to the KDM1A104-127 K114me2 peptide (Figure 3.3). Of the nine CHD family members, the DCDs of CHD1, CHD2, CHD4, CHD5, and CHD7 were included in the protein-domain microarray. Peptide pull-down assays with GST-tagged CHD6 and CHD9 DCDs indicate that these domains are unlikely to bind KDM1A K114me2 (Figure 3.4). GST-CHD6276-436 appears to bind H31-23 K4me3, in agreement with binding data from Qin et al. (2014) [76]. There is also some non-specific binding to the SoftLink™ avidin beads and to the other peptides. Phosphorylation of T6 may also weaken binding of CHD6 to H3 K4me3. GST-CHD9685-840 does not appear to bind KDM1A K114me2 or H3 K4me3.

[Figure 3.3, panels (a) to (c): microarray schematic, list of GST-fused reader domains spotted in blocks A to U (PHD, bromo, SANT/TSN, chromo, BRK/MRG, PWWP, CW, ANK, FHA/KH/BRCT, Tudor, WD40, and other domains), and images of the probed arrays.]

Figure 3.3 Protein-domain microarray. (a) Schematic of protein-domain microarray. M represents GST. Each GST-tagged potential reader domain is in duplicate. (b) List of GST-fused potential reader domains bound to nitrocellulose. (c) Microarrays probed with either Cy3-labelled KDM1A104-127, KDM1A104-127 K114me2 or anti-GST antibody, followed by visualisation with a fluorescein isothiocyanate-conjugated secondary antibody. Magnification of the H panel is displayed in the images on the right. Position of CHD1 DCD (H9) is emphasised by the red circle. Figure adapted from [11].


Figure 3.4 CHD6 and CHD9 peptide pull-down. Coomassie staining of peptide pull-down assay with the GST-tagged double chromodomains of CHD6 and CHD9. MW, molecular weight; kDa, kilodaltons.

3.2.1 Isothermal titration calorimetry

To probe the specificity of CHD1 DCD for KDM1A methylated at K114 over unmethylated KDM1A, isothermal titration calorimetry (ITC) analysis was performed. ITC experiments were performed with hexahistidine (His6)-tagged CHD1 DCD (His6-CHD1270-443) and KDM1A104-127 K114me1, KDM1A104-127 K114me2, and KDM1A104-127 K114me3 peptides. Kd values of 124 µM, 74 µM, and 82 µM were determined for the mono-, di-, and tri-methylated KDM1A104-127 peptides, respectively (Figure 3.5a). No detectable binding was observed between His6-CHD1270-443 and unmethylated KDM1A104-127 peptide (Figure 3.5a). In addition, ITC experiments with His6-CHD1270-443 and H31-23 K4me3 or unmethylated H31-23 peptides revealed a Kd value of 47 µM for H31-23 K4me3 and no detectable binding with unmethylated H31-23 (Figure 3.5b), which is consistent with previous ITC experiments [92]. For all experiments, the binding stoichiometry was 1:1, suggesting that methylated KDM1A104-127 peptides are bound at one site by CHD1 DCD. To investigate how the binding of KDM1A K114me2 by the DCD of CHD1 is achieved, a co-crystal structure of the CHD1 DCD in complex with the KDM1A K114me2 peptide was determined.
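The dissociation constants reported above can be placed on an energy scale with the standard relation ΔG° = RT ln(Kd). The following Python sketch is illustrative only: the temperature of 298.15 K is an assumption (the measurement temperature is not stated in this section), and the function name binding_free_energy is hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Return the standard binding free energy in kJ/mol for a Kd given in mol/L.

    Uses dG = R * T * ln(Kd / 1 M). The default temperature is an assumption.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


# Kd values reported in this section, converted from µM to M
kd_values = {
    "KDM1A K114me1": 124e-6,
    "KDM1A K114me2": 74e-6,
    "KDM1A K114me3": 82e-6,
    "H3 K4me3": 47e-6,
}

for peptide, kd in kd_values.items():
    print(f"{peptide}: dG = {binding_free_energy(kd):.1f} kJ/mol")
```

Under these assumptions the four values lie within about 2.5 kJ/mol of each other, which illustrates how small the energetic difference between the methylation states is.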


Figure 3.5 ITC binding assays with His6-CHD1270-443 and KDM1A104-127 K114 and H31-23 K4 peptides with differing methylation states. Raw titration data and integrated heat measurements of representative ITC experiments are displayed in the upper and lower plots, respectively. (a) Titration data and integrated heat measurements of His6-CHD1270-443 with KDM1A104-127 K114me0 (black), KDM1A104-127 K114me1 (blue), KDM1A104-127 K114me2 (orange), and KDM1A104-127 K114me3 (red). (b) Titration data and integrated heat measurements of His6-CHD1270-443 with H31-23 (black) and H31-23 K4me3 (blue). n.d., no detectable binding.

3.2.2 His6-CHD1270-443 and KDM1A108-119 K114me2 co-crystallisation

A co-crystal structure of His6-CHD1270-443 in complex with KDM1A108-119 K114me2 peptide was determined at 1.60 Å resolution (Table 3.1; Figure 3.6). There is no electron density to support the positioning of the N-terminal His6 tag, so these residues have not been modelled. There was also insufficient evidence to support the modelling of part of the helix-turn-helix linker region between CDs 1 and 2 (residues 363-373). The higher B factors for amino acid residues in the linker region compared to the two CDs suggest that this region is highly mobile. This is consistent with previously determined X-ray crystal structures of CHD1 DCD (PDB 4O42, [76]; PDB 2B2Y and 2B2W, [92]) and also with the proposed regulatory and DNA binding function of part of this linker region [120, 124]. The KDM1A K114me2 peptide binds at an acidic surface groove of His6-CHD1270-443 formed between CDs 1 and 2 (Figures 3.6 and 3.7). KDM1A108-119 K114me2 is bound by CHD1 DCD in a similar fashion to H3 K4me3 [92] and NS1 K229me2 [76] peptides (Figure 3.8). Only a small number of residues from the peptide appear to form interactions with the DCD of CHD1, and of the twelve residues in the peptide, only residues 110-116 (TSRRKme2RA) could be modelled confidently (Figure 3.6). Residues Y295, W322, W325, D408, E424, and D425 form interactions with the KDM1A108-119 K114me2 peptide (Figure 3.7). Residues T110, R115, and A116 do not appear to interact with the DCD of CHD1. The backbone carbonyl of R112 interacts via a water molecule with the carbonyl group of E424 and the backbone amide of D425. The Y295 hydroxyl group interacts with the K114me2 backbone amide.

Table 3.1 Data collection and refinement statistics

                                        CHD1-KDM1A K114me2
X-ray source                            Diamond Light Source, I04
X-ray detector                          Dectris Pilatus-6M
Wavelength [Å]                          0.9795
Space group                             P21212
Cell constants
  a, b, c [Å]                           110.30, 44.44, 46.19
  α, β, γ [°]                           90.00, 90.00, 90.00
Molecules in asymmetric unit            1
Resolution limits [Å]                   55.15-1.60 (1.62-1.60)a
Completeness [%]                        98.9 (83.9)
Unique reflections                      30,650 (1,282)
Multiplicity                            4.4 (4.0)
Overall B factor, Wilson plot [Å²]      41
Rmerge                                  0.029 (1.41)
Rmeas                                   0.037 (1.79)
Rp.i.m.                                 0.023 (0.84)
Mean I/σ(I)                             16.4 (0.8)
CC1/2                                   0.999 (0.554)
Refinement statistics
Rwork/Rfree                             0.209/0.227
No. of atoms
  Protein                               1,410
  Peptide/ion/EDOb                      12/3/36
  Water                                 119
B factors [Å²]
  Protein                               56.4
  Peptide/ion/EDO                       67.7/65.3/57.3
  Water                                 57.4
Cruickshank’s DPI                       0.083
R.M.S.D
  Bonds [Å]                             0.010
  Angles [°]                            0.98

a Values in parentheses are for the highest resolution shell.
b EDO, 1,2-ethanediol.


Figure 3.6 Electrostatic model of a 1.6 Å co-crystal structure of His6-CHD1270-443 and KDM1A108-119 K114me2. Electrostatic surface (isocontour, ±8 kT/e) representation of His6-CHD1270-443 reveals binding of KDM1A108-119 K114me2 peptide (sticks; carbon, yellow; oxygen, red; nitrogen, blue) at a largely acidic surface groove. A 2mFo-DFc omit map (blue, 1 σ) for KDM1A K114me2 is shown as a mesh. Electrostatics determined using the APBS [81] plugin within PyMOL (Schrödinger, LLC).

Figure 3.7 Co-crystal structure of His6-CHD1270-443 and KDM1A108-119 K114me2 (1.6 Å). His6-CHD1270-443 shown as a cartoon model (teal). Residues interacting with KDM1A108-119 K114me2 peptide (carbon, yellow; oxygen, red; nitrogen, blue) are shown as stick models and coloured according to atom type (carbon, teal; oxygen, red; nitrogen, blue). Electrostatic interactions between CHD1 and the KDM1A108-119 K114me2 peptide are indicated by black dashed lines.


Figure 3.8 KDM1A K114me2, H3 K4me3, and NS1 K229me2 peptides bind at the same site of CHD1 double chromodomain. Overlay of CHD1-KDM1A K114me2, CHD1-H3 K4me3 (PDB 2B2W; [92]), and CHD1-NS1 K229me2 (PDB 4O42; [76]) structures. CHD1 double chromodomain (grey) is shown as a cartoon and transparent surface model. KDM1A K114me2 (carbon, yellow), H3 K4me3 (carbon, green), and NS1 K229me2 (carbon, magenta) peptides are modelled as sticks (oxygen, red; nitrogen, blue).

There is little movement of the CHD1 DCD upon binding of the KDM1A K114me2 peptide compared to the structure with no peptide bound (PDB 2B2Y, [92]). The RMSD between the two structures is 0.855 Å. The absence of large conformational changes upon methyllysine binding is a common feature of methyllysine reader domains and is energetically favourable, as it minimises the loss of protein conformational entropy upon binding of the peptide. The methylammonium group of KDM1A K114me2 interacts with CHD1 DCD primarily through cation-π interactions with the partial negative charge of the π-electrons on the faces of the W322 and W325 indole ring systems. These aromatic residues, along with the anionic side chain of E272, form the so-called ‘aromatic cage’ of CHD1 (Figure 3.9). When the positions of the two tryptophan residues of the aromatic cage in the unbound CHD1 structure are compared with those in the peptide-bound structures, a slight shift in the position of one of the aromatic cage residues (W325) is observed (Figure 3.9b). This slight shift positions the aromatic side chain of W325 in closer proximity to the methyllysine residues and allows the guanidino group of the n-2 peptide arginine to form cation-π interactions with the other side of the W325 side chain (Figure 3.10).
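For readers unfamiliar with how an RMSD such as the one quoted above is obtained, the calculation can be sketched as follows. This is a minimal illustration, assuming the two structures have already been superposed and that atoms are matched one-to-one; the atom selection used for the reported value is not stated here, and the coordinates below are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Hypothetical, already superposed coordinates in Å (not taken from any PDB entry)
apo = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
bound = [(0.0, 1.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 1.0)]

print(f"RMSD = {rmsd(apo, bound):.3f} Å")
```

In practice the superposition itself (for example a least-squares fit on Cα atoms) is carried out by the structure analysis software before this sum is evaluated.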

CHD1 exhibits a preference for methylated KDM1A K114 over unmethylated KDM1A K114 through its semi-rigid aromatic cage. The more diffuse charge of the larger cationic methylammonium enables the formation of multiple strong interactions with both W322 and W325. The distance of 4.5 Å between the methylammonium nitrogen and the ring centroids of the indoles is close to the average distance of 4.6 Å between methyllysine or acetylcholine and the aromatic side chains of Phe, Tyr, or Trp in protein structures deposited in the PDB [187]. The H3 K4me3 and NS1 K229me2 methylammonium groups bind in almost identical positions to KDM1A K114me2 and with almost no difference in the position of the W322 and W325 residues. The lack of selectivity for a particular methylation state of KDM1A K114 can be explained by examining the methyllysine binding pocket. The methyllysine binding pocket is very open and the K114me2 side chain lies along the surface and is bound in the surface groove recognition mode rather than the cavity insertion mode. Due to the openness of the pocket, higher methylation states are not occluded by steric hindrance. There is no preference for trimethyllysine over dimethyllysine due to the absence of a third aromatic residue. The typical mechanism for dimethyllysine selectivity is achieved through an electrostatic interaction between the nitrogen of the dimethylammonium group and a nearby acidic amino acid residue. In the complex structure of CHD1 and KDM1A K114me2 there are no acidic residues close enough to form this interaction. The lone pair of electrons instead is exposed to the solvent and forms an interaction with a nearby water molecule (Figure 3.9a). The lack of selectivity for a particular methylation state is consistent with the reported affinities for the different methylation states of H3 and NS1 peptides [76], where there is a strong preference for methylated lysine but not for a particular methylation state.
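The cation-to-centroid distance discussed above is a simple geometric quantity. The sketch below shows one way to compute it, assuming the centroid is the unweighted mean of the ring atom positions; the ring used here is an idealised six-membered ring with hypothetical coordinates and stands in for whichever indole ring atoms were used in the thesis.

```python
import math


def centroid(points):
    """Unweighted mean position of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def cation_ring_distance(cation, ring_atoms):
    """Distance in Å between a cation position and the centroid of a ring."""
    return math.dist(cation, centroid(ring_atoms))


# Hypothetical planar ring (radius 1.4 Å) in the xy plane, centred on the origin
ring = [
    (1.4 * math.cos(math.radians(60 * k)), 1.4 * math.sin(math.radians(60 * k)), 0.0)
    for k in range(6)
]
# Hypothetical methylammonium nitrogen placed 4.5 Å above the ring plane
nitrogen = (0.0, 0.0, 4.5)

print(f"N to centroid: {cation_ring_distance(nitrogen, ring):.2f} Å")
```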


Figure 3.9 KDM1A K114me2 forms cation-π interactions with W322 and W325. (a) Position of the KDM1A K114me2 residue in the aromatic cage of CHD1 near residues W322 and W325. Residues W322 and W325 are displayed as stick models. Cation-π interactions between the centre of the positively-charged dimethylammonium group and the W322 and W325 indoles are indicated with black dashed lines. (b) Overlay of the methylated lysine, W322, and W325 from the crystal structures of CHD1 double chromodomain with KDM1A K114me2 (yellow), H3 K4me3 (green), NS1 K229me2 (magenta), or without methylated peptide (grey). Oxygen and nitrogen atoms are coloured red and blue, respectively.

Unlike the majority of methyllysine binding proteins, which have aromatic cages composed of three or four aromatic amino acid residues, the aromatic cage of CHD1 DCD consists of only two tryptophan residues. Flanagan et al. (2005) [92] proposed that in order to compensate for the presence of only two aromatic amino acid residues, an arginine in position n-2 relative to the methyllysine residue, which can form a cation-π interaction with W325, is necessary for binding. It has since been demonstrated that substitution of H3 R2 with any other residue almost completely abolished binding of H3 K4me3 peptides by the CHD1 DCD [76]. Like the two previously characterised NS1 and H3 K4 peptides, KDM1A K114me2 possesses an arginine residue (R112) at position n-2 (Figure 3.10a,d). The guanidino group of R112 lies alongside W325 and forms a cation-π interaction with the partial negative charge of the W325 indole side chain (Figure 3.10a).

In the co-crystal structure of the CHD1 DCD bound to H3 K4me3 peptide (PDB 2B2W, [92]), the N-terminal amine of the peptide forms several interactions with two acidic residues (D408 and D425) from CD2 (Figure 3.10b). The N-terminal amine forms a salt bridge with an oxygen atom from the D408 carboxyl group. The N-terminal amine also forms a hydrogen bonding interaction with a nearby water molecule, which also forms hydrogen bond interactions with the carboxyl groups of D408 and D425.

Figure 3.10 Anionic residues from CHD1 chromodomain 2 form interactions with KDM1A K114me2, H3 K4me3, and NS1 K229me2. KDM1A K114me2 (a; carbon, yellow), H3 K4me3 (b; carbon, green), and NS1 K229me2 (c; carbon, magenta) peptides interact with acidic amino acid residues from CHD1 chromodomain 2. CHD1 DCD is displayed as a cartoon model and coloured teal. CHD1 residues that interact with the peptides are modelled as sticks. Oxygen and nitrogen atoms are coloured red and blue, respectively. Intermolecular interactions are displayed as black dashed lines. (d) Sequence alignment of the recognition sequences of the peptides highlights that the only conserved residue, excluding the methylated lysine (red), is the arginine at position n-2. A significant difference in the sequence motif exists at n-1 for KDM1A.

The KDM1A and NS1 peptides differ from the H3 peptide in that they extend beyond the n-3 position. The N-terminus of H3 imparts selectivity for H3 K4me3 for many reader domains, such as the PHD domains of ING4 and Rag2. In these domains the N-terminus of the histone peptide sits in a closed-off pocket that does not allow binding of a peptide that extends beyond this point (Figure 3.11). Consistent with this, neither the ING4 nor the Rag2 PHD domain was observed to bind KDM1A K114me2 in the protein-domain microarray (Figure 3.3). CHD1 DCD is different from these reader domains as it possesses a more open pocket near the n-3 binding site, which allows peptides that extend beyond n-3 to bind. In the case of the NS1 K229me2 peptide, this allows an arginine in position n-5 (R224) to compensate for the absence of any interaction between the alanine residue at n-3 and CHD1 DCD through formation of electrostatic interactions between NS1 R224 and the D408 and D425 carboxyl groups (Figure 3.10c). In the case of KDM1A, S111 in position n-3 interacts solely with D408: its hydroxyl group forms hydrogen bonds with both oxygen atoms of the D408 carboxyl group, and its main chain amide interacts with an oxygen atom of the D408 carboxyl group (Figure 3.10a).

Figure 3.11 Electrostatic models of Rag2 and ING4 PHD domains bound to H3 K4me3 peptides. (a, b) The H3 A1 binding pockets are important for site selectivity as additional N-terminal residues cannot be accommodated in either structure resulting in the selectivity of these domains for the H3 N-terminus. (a) Electrostatic model (isocontour, ±8 kT/e) of a 1.1 Å X-ray crystal structure of Rag2 PHD domain bound to H3 K4me3 (PDB 2V89; [86]). (b) Electrostatic model (isocontour, ±8 kT/e) of a 1.8 Å X-ray crystal structure of ING4 PHD domain bound to H3 K4me3 (PDB 2PNX; [188]). In both (a) and (b) H3 K4me3 peptides are modelled as sticks and coloured according to atom type: carbon, yellow; oxygen, red; and nitrogen, blue. Electrostatic models were generated with the APBS software plugin [81] within PyMOL (Schrödinger, LLC).

3.2.2.1 Phosphorylation of KDM1A S111 disrupts binding by CHD1 DCD

It has been demonstrated that KDM1A S111 can be phosphorylated by PKCα [28]. Dynamic interplay between phosphorylated and methylated residues has been reported to affect binding of methyllysine readers [102, 189]. Phosphorylation of H3 T3 has also been shown to abrogate binding of H3 K4me3 by the CHD1 DCD [92]. We therefore decided to explore the possibility that phosphorylation of S111 might disrupt binding of KDM1A K114me2 by CHD1 DCD. ITC experiments with His6-CHD1270-443 and KDM1A104-127 S111ph K114me2 revealed that binding is significantly disrupted by S111 phosphorylation, as no measurable binding was observed (Figure 3.12a). The effect of S111 phosphorylation can be mimicked by the KDM1A S111E mutation (Figure 3.12b). Given the negative surface of the peptide binding groove of CHD1 DCD, it is unsurprising that introduction of negatively charged phosphate groups to the H3 K4me3 and KDM1A K114me2 peptides would disrupt binding. This is supported by binding data reported by Qin et al. (2014) [76], which demonstrated that mutation of any of the first five residues of the H3 peptide to aspartate or glutamate almost completely abolished binding by CHD1 DCD. It is currently not known whether KDM1A S111ph and K114me2 can co-exist in cells. Therefore, further study of these two marks in vivo is required to reveal whether there is any cross-talk between these PTMs and what the functional consequences of any cross-talk might be.

Figure 3.12 Phosphorylation of KDM1A S111 disrupts binding of His6-CHD1270-443 to KDM1A104-127 K114me2. Representative ITC binding assays between His6-CHD1270-443 and KDM1A104-127 S111ph K114me2 (a) or KDM1A104-127 S111E K114me2 (b). Raw titration data and integrated heat measurements are displayed in the upper and lower plots, respectively. No measurable binding is observed.


3.2.2.2 KDM1A R113 is important for binding to CHD1

The presence of R113 at position n-1 in the KDM1A peptide is one of the key differences between the KDM1A peptide and the NS1 and H3 K4me3 peptides (Figure 3.10). The Nε atom of the R113 guanidino group forms a water-mediated electrostatic interaction with one of the carboxyl oxygen atoms of CHD1 D425 (Figure 3.10a). To investigate the importance of this interaction for the binding of KDM1A K114me2 by CHD1 DCD, binding between a His6-CHD1270-443 D425A mutant and KDM1A104-127 K114me2 was assessed by ITC (Figure 3.13a). No measurable binding was observed, suggesting that this interaction might be important for binding of KDM1A K114me2 by CHD1. The importance of this interaction was confirmed by the complete loss of detectable binding between His6-CHD1270-443 and KDM1A104-127 R113A K114me2 in ITC experiments (Figure 3.13b).

Figure 3.13 ITC experiments demonstrate the importance of the CHD1 D425-KDM1A R113 interaction. Raw titration data and integrated heat measurements of representative ITC binding assays are displayed in the upper and lower plots, respectively. (a) ITC binding assay between His6-CHD1270-443 D425A mutant and KDM1A104-127 K114me2. No measurable binding is observed. (b) ITC binding assay between His6-CHD1270-443 and KDM1A104-127 R113A K114me2. No detectable binding is observed.


The interaction between the CHD1 DCD and KDM1A K114me2 peptides in vitro had been thoroughly characterised. We then sought to establish if full-length CHD1 interacts with full-length KDM1A K114me2 in vivo.

3.2.3 CHD1 interacts with KDM1A K114me2 in vivo

Coimmunoprecipitation of endogenous CHD1 with Flag-KDM1A occurred only in the presence of EHMT2 (Figure 3.14a). CHD1 did not coimmunoprecipitate with the Flag-KDM1A K114A mutant even in the presence of EHMT2. In LNCaP cells, coimmunoprecipitation of endogenously-expressed CHD1 and KDM1A K114me2 with EHMT2 was dependent on the presence of the androgen, DHT (Figure 3.14b). The observation that addition of DHT was necessary for the interaction of endogenous CHD1 and KDM1A with EHMT2 raised the possibility that this interaction might be important for AR-dependent gene transcription in prostate cells.

Figure 3.14 CHD1 interacts with KDM1A K114me2 in vivo. Anti-CHD1, anti-EHMT2, anti-KDM1A, and anti-KDM1A K114me2 western blot analyses. (a) Lysates from 293T cells transfected with Flag-KDM1A or Flag-KDM1A K114A in the presence or absence of EHMT2. Immunoprecipitation performed with anti-Flag antibody. (b) Lysates from LNCaP cells cultured in the presence or absence of DHT. Immunoprecipitation performed as indicated with rabbit IgG (rIgG) or anti-EHMT2 antibodies. Figures are from Metzger et al. (2016) [11].

3.3 Co-occupancy of CHD1, KDM1A K114me2 at AR-binding sites controls AR-dependent transcription

Chromatin immunoprecipitation (ChIP) with massively parallel sequencing (ChIP-seq) with anti-CHD1 and anti-KDM1A K114me2 antibodies revealed co-occupation of 9,660 chromatin sites by CHD1 and KDM1A K114me2 in LNCaP cells treated with DHT. Comparison of these sites with published genome-wide AR binding data [190] showed that a total of 2,941 sites, and 2,397 gene locations, are co-occupied by AR, CHD1, and KDM1A K114me2. Global transcriptome analysis of LNCaP cells showed that of the 2,397 genes that were co-occupied by CHD1, KDM1A K114me2, and AR, a total of 861 genes are differentially regulated by the presence or absence of DHT.
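The idea of co-occupied sites can be illustrated with a simple interval-overlap test. This is a sketch under stated assumptions, not the analysis pipeline used in Metzger et al. (2016) [11]: the overlap criterion (at least one shared base between half-open intervals), the function names, and the peak coordinates are all hypothetical.

```python
def overlaps(site_a, site_b):
    """True if two (chromosome, start, end) half-open intervals share a base."""
    chrom_a, start_a, end_a = site_a
    chrom_b, start_b, end_b = site_b
    return chrom_a == chrom_b and start_a < end_b and start_b < end_a


def co_occupied(sites_a, sites_b):
    """Return the sites in sites_a that overlap at least one site in sites_b."""
    return [a for a in sites_a if any(overlaps(a, b) for b in sites_b)]


# Hypothetical peak coordinates, for illustration only
chd1_peaks = [("chr21", 100, 200), ("chr21", 500, 600), ("chr1", 50, 80)]
k114me2_peaks = [("chr21", 150, 250), ("chr1", 80, 120)]
ar_peaks = [("chr21", 190, 300)]

# Sites shared by CHD1 and KDM1A K114me2, then filtered against AR sites
chd1_k114me2 = co_occupied(chd1_peaks, k114me2_peaks)
all_three = co_occupied(chd1_k114me2, ar_peaks)

print(len(chd1_k114me2), "sites shared by CHD1 and KDM1A K114me2")
print(len(all_three), "of these also overlap an AR site")
```

For genome-scale peak lists a sorted sweep or an interval tree would be preferable to this quadratic comparison, which is kept here only for readability.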

Further investigation with TMPRSS2, a representative of the 2,397 identified genes, provided evidence that androgen-induced dimethylation of KDM1A K114 by EHMT2 is necessary for chromatin occupancy of CHD1 and AR at these chromatin locations. ChIP-quantitative (q)PCR was performed with LNCaP cells cultured in the presence or absence of DHT and treated with either BIX-01294, siRNA against EHMT2, control siRNA, or without additional treatment. Treatment with only DHT increases the levels of KDM1A K114me2, CHD1, and AR at TMPRSS2 compared to untreated cells. This increase is blocked in cells treated with BIX-01294. A similar reduction is observed in LNCaP cells treated with siRNA against EHMT2 compared to treatment with control siRNA.

3.4 KDM1A K114me2 controls TMPRSS2-ERG gene fusion

Androgen-induced binding of AR to androgen-response elements (AREs) is known to induce alterations in chromosome conformation and the formation of looped chromatin necessary for coordinated transcription to occur [191]. At the TMPRSS2 locus, the enhancer region of TMPRSS2 loops with a region in intron 1 where breaks in the genome occur during TMPRSS2-ERG fusion [192]. We investigated whether dimethylation of KDM1A K114 controls androgen-dependent TMPRSS2 enhancer-breakpoint loop formation. In LNCaP cells cultured in the presence of DHT, BIX-01294 prevented enhancer-breakpoint loop formation as assessed by chromosome conformation capture analysis. Enhancer-breakpoint loop formation could also be reduced by knockdown of EHMT2, KDM1A, or CHD1.

To provide further support for the role of KDM1A K114me2 in formation of the enhancer-breakpoint loop, LNCaP cells expressing RNAi-resistant (rr) Flag-NLS-KDM1A-rr or Flag-NLS-KDM1A K114A-rr constructs were assessed by chromosome conformation capture analysis. After knockdown of endogenous KDM1A, TMPRSS2 enhancer-breakpoint looping was only observed in cells expressing Flag-NLS-KDM1A-rr. To investigate whether recognition of KDM1A K114me2 by CHD1 also plays a role in TMPRSS2 enhancer-breakpoint loop formation, a similar approach was used. LNCaP cells expressing CHD1-rr or CHD1 D425A-rr were examined for evidence of TMPRSS2 enhancer-breakpoint loop formation. Upon knockdown of endogenous CHD1, loop formation was only observed in LNCaP cells expressing CHD1-rr. This highlights the importance of CHD1 and the role of CHD1 D425 in the recognition of KDM1A K114me2 and subsequent loop formation at the TMPRSS2 locus.

It has been reported that TMPRSS2-ERG gene fusion is induced in a small proportion of prostate cells by treatment with DHT [192]. Given the importance of KDM1A K114me2 in DHT-induced TMPRSS2 enhancer-breakpoint formation, it seemed plausible that TMPRSS2-ERG gene fusion might be dependent on KDM1A K114me2. TMPRSS2-ERG gene fusion in LNCaP cells was assessed by fluorescence in situ hybridisation (FISH). Consistent with previous results, it was observed that addition of DHT was sufficient to induce TMPRSS2-ERG fusion in a small number of LNCaP cells. DHT-induced formation of TMPRSS2-ERG fusion was observed to be significantly reduced by knockdown of KDM1A, EHMT2, or CHD1 in LNCaP cells. Additionally, we observed that androgen-regulated expression of the TMPRSS2-ERG fusion transcript was reduced upon knockdown of KDM1A, EHMT2, or CHD1.

3.5 Discussion and outlook

Together, the data presented in this chapter and in Metzger et al. (2016) [11] demonstrate that the androgen-induced dimethylation of KDM1A K114 by EHMT2 and subsequent binding of KDM1A K114me2 by the DCD of CHD1 are necessary for recruitment of AR to AREs. Androgen-induced dimethylation of KDM1A K114 and recognition of this methylation mark by CHD1 is a critical step that controls AR-dependent gene expression. A crucial role for this androgen-dependent mechanism in TMPRSS2-ERG fusion was also uncovered, providing additional insight into this highly prevalent genomic translocation in prostate tumour cells. Given the important roles of AR, KDM1A, CHD1, and TMPRSS2-ERG gene fusion in prostate tumour development, it is of significant interest to understand this mechanism further and to probe the interaction between KDM1A K114me2 and CHD1 in order to learn more about its functional importance in normal prostate cells and prostate tumour evolution. Preventing this interaction between KDM1A K114me2 and CHD1 from occurring might be of therapeutic interest because of the potential to block AR-dependent expression of genes, such as TMPRSS2-ERG, in prostate tumour cells (Figure 3.15).

Figure 3.15 Schematic demonstrating how an inhibitor targeting the interaction between CHD1 and KDM1A might function. (a) In the absence of DHT, KDM1A K114 is not dimethylated by EHMT2, and neither CHD1 nor AR is recruited. AR-dependent genes are not expressed. (b) In the presence of DHT, KDM1A is dimethylated by EHMT2 at K114. CHD1 binds KDM1A K114me2 and AR is recruited, resulting in AR-dependent genes being expressed. (c) Addition of an inhibitor that disrupts KDM1A K114me2 binding by CHD1 would prevent recruitment of AR, thereby disrupting AR-dependent gene expression.

It would be of significant benefit to have a chemical probe that specifically targets the interaction between CHD1 and KDM1A K114me2, as this would allow the mechanism to be explored in greater detail. It would also enable us to determine which CHD1 functions are dependent on interaction with KDM1A K114me2 and which are independent of H3 K4me recognition or of methylated peptide recognition as a whole. This is of interest as it is currently not clear how methylated peptide binding by CHD1 DCD affects recruitment to specific sites on chromatin. The co-crystal structure of His6-CHD1270-443 and KDM1A108-119 K114me2 was analysed to determine whether specifically targeting this interaction might be possible and which sites would be most suitable. CHD1 DCD represents the most suitable target as there is no structural information for the flexible N-terminus of KDM1A independent of CHD1. Another possibility would be to target EHMT2 to prevent methylation of KDM1A K114, as this would prevent CHD1 and KDM1A interaction. It would, however, likely be more difficult to design a compound specifically targeting this methylation interaction, as there is no structural information for how EHMT2 interacts with KDM1A and how this interaction might be selectively targeted without affecting its other functions.

There are a number of aspects of the interaction between CHD1 and KDM1A K114me2 that make it a challenging but interesting target for identification and development of inhibiting compounds. Targeting protein-protein interaction surfaces is difficult as these surfaces are often shallow grooves, which are less amenable to development of compounds that are potent and specific for that site. The peptide binding surface of CHD1 is a shallow, negatively charged surface groove, which makes development of potent, specific compounds that also have suitable pharmacokinetics challenging. The most obvious target site for the development of inhibitors of this interaction is the aromatic cage of CHD1 DCD, as compounds have already been developed that target the aromatic cages of other reader proteins [143, 144, 146, 149, 193]. However, compound development has been most successful with methyllysine readers such as L3MBTL3, which possess a much narrower and more enclosed pocket than methyllysine readers such as CHD1 and KDM4A. Development of compounds targeting the open pocket of reader domains that recognise higher methylation states has proven to be more difficult. An approach used with the chromodomain protein CBX7 was to target the domain with a peptide-based inhibitor, which is potent but has limitations in terms of its in vivo application [146]. The recent identification of small molecules targeting the aromatic cage of Spindlin1 by us and others indicates that development of potent and specific small-molecule chemical probes or inhibitors could be possible for readers of higher lysine methylation states [149, 193, 194]. A computational study examining the ‘druggability’ of the aromatic cages of a large number of methyl- and acetyl-lysine reader domains found that in general the deep, hydrophobic pockets had higher druggability scores than the shallow, open pockets of readers of higher methylation states [134].
Sites were classified as ‘druggable’ or ‘undruggable’ based on a range of criteria, and according to these criteria the aromatic cage of CHD1 was classified as undruggable [134]. Targeting the methyllysine binding pocket of CHD1 would also not allow the development of compounds that specifically target the interaction with KDM1A K114me2, as the interaction with H3 K4me3 would also be disrupted. An alternative approach would be to target another area of the protein, which might be more amenable to compound binding and development of selective and potent compounds. Most of the peptide binding interface suffers from the same issues as the aromatic cage; there is, however, a nearby pocket that might be more amenable to targeting with compounds (Figure 3.16).

Figure 3.16 Potential novel site for drug development. (a) His6-CHD1270-443 is displayed as a grey surface model. The pocket of interest for development of potent and CHD1-KDM1A K114me2 interaction-specific inhibitors is coloured red and displayed in the bottom image, which is a 90° rotation of the top image. The eventual target site, the CHD1 D425-KDM1A R113 interface, is also indicated in the top image by red surface colouration. KDM1A K114me2 (carbon, yellow) and H3 K4me3 (carbon, green) peptides are displayed as stick models. Oxygen and nitrogen atoms are coloured red and blue, respectively. (b) Carved electrostatic surface model (isocontour, ±8 kT/e) of the pocket indicated in (a). The perspective shown in (b) is a further 90° rotation from the lower figure in (a). Select water molecules in the pocket are modelled as spheres (red). Position of KDM1A R113 relative to the pocket is indicated. CHD1 DCD is shown as an electrostatic surface model (isocontour, ±8 kT/e) and as sticks coloured by atom type: carbon, grey; oxygen, red; and nitrogen, blue. Electrostatic surface model was generated using the APBS [81] plugin within PyMOL (Schrödinger, LLC).


This pocket is less open than the aromatic cage and not as negatively charged, potentially increasing the chances of developing a potent small-molecule probe targeting this area of the protein. If a potent lead compound is discovered that binds to this pocket, the compound could then be elaborated to extend out into the CHD1 D425-KDM1A R113 interface to potentially inhibit the interaction of CHD1 with KDM1A K114me2 without significantly altering its interaction with H3 K4me3. The precise role of H3 K4me3 recognition by CHD1 DCD in chromatin localisation, spliceosome function, and nucleosome remodelling has not yet been established. However, it could yet prove to play an important functional role in different cell types, particularly prostate cells, given that a sub-class of prostate cancer cell types is characterised by the deletion of CHD1. A compound targeting the interaction between CHD1 and KDM1A K114me2 would need to have much higher potency against this interaction than against that with H3 K4me3 to be classified as a chemical probe. The identification and development of a compound selectively inhibiting the interaction with KDM1A K114me2 would greatly improve our understanding of the physiological roles and consequences of the interaction between CHD1 and KDM1A K114me2 and also allow these to be separated from the role of CHD1 interaction with H3 K4me3. There is also the potential for any such compound to be of therapeutic interest given the important role of the interaction between CHD1 and KDM1A K114me2 for AR binding and androgen-dependent gene expression. Selectivity for the KDM1A K114me2 interaction may also be important for development of inhibitors, as it might allow for the inhibition of AR-dependent CHD1 functions without affecting its AR-independent functions, which could be important for decreasing deleterious off-target effects. The following chapter describes our efforts to identify compounds that could be developed into chemical probes for CHD1 DCD.


4 Fragment screening using X-ray crystallography

FBDD is a technique that has been in use for over 20 years and is gaining popularity for identification of quality lead compounds for subsequent drug development. FBDD involves the screening of compounds of lower molecular weight than those typically used in HTS approaches. The definition of a fragment is somewhat loose; however, fragments are typically <300 Da and have a cLogP ≤3 [195]. The major advantage of FBDD over HTS is that a well-chosen fragment library, containing only hundreds of fragments, can cover a much wider range of chemical space than HTS can ever realistically achieve. Due to their lower molecular weight and reduced complexity, fragment hits typically bind with much weaker affinities (100 µM – 10 mM) than hits from HTS. As a consequence, suitably sensitive biophysical or biochemical techniques need to be used to detect binding of fragment hits. A number of techniques have been used in FBDD: nuclear magnetic resonance (NMR), surface plasmon resonance (SPR), X-ray crystallography, thermal shift, functional screening, computational methods, ITC, mass spectrometry, affinity chromatography, biolayer interferometry, and microscale thermophoresis. Of these techniques, NMR, SPR, crystallography, and thermal shift are currently the most popular [195]. FBDD efforts often employ a number of these orthogonal techniques, each of which has advantages and disadvantages.
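The loose fragment definition given above (<300 Da, cLogP ≤3) can be expressed as a simple filter. The following is a minimal sketch only; the `Compound` record, the compound names, and the property values are hypothetical and are not taken from any library used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str          # hypothetical identifier
    mol_weight: float  # molecular weight in Da
    clogp: float       # calculated logP


def is_fragment_like(compound, max_mw=300.0, max_clogp=3.0):
    """Return True if the compound has MW < max_mw and cLogP <= max_clogp."""
    return compound.mol_weight < max_mw and compound.clogp <= max_clogp


# Illustrative values only (assumed, not measured).
library = [
    Compound("example_A", 182.2, 1.4),
    Compound("example_B", 412.5, 2.1),  # exceeds the molecular weight limit
    Compound("example_C", 250.3, 3.8),  # exceeds the cLogP limit
]
fragments = [c.name for c in library if is_fragment_like(c)]
```

Because the definition is acknowledged to be loose, the two thresholds are kept as parameters rather than fixed constants.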

A technique that is becoming increasingly popular for FBDD is X-ray crystallography. X-ray crystallography is a suitable fragment screening technique as it is highly sensitive and able to detect weakly bound fragments. The major advantage that distinguishes X-ray crystallography from other techniques is that detailed information about ligand binding is obtained at the same time as fragment identification. One disadvantage of X-ray crystallography is that it does not provide any affinity data. Therefore, after initial hits are determined, an orthogonal method such as SPR or microscale thermophoresis would need to be employed to determine affinities. Until recently, a major limitation of fragment screening using X-ray crystallography has been the length of time the experiments take compared to other techniques, with crystallisation, fragment soaking, X-ray diffraction data collection, data processing, and structure determination all being very time intensive. Recent advances in all of these areas have greatly reduced the time and effort required to perform each of these steps. Advances in crystallisation robotics have greatly reduced the time and resources required to produce very large numbers of protein crystals. The development of rapid crystal soaking platforms for fragment screening [196] and the development of specialist user facilities such as the XChem Facility at Diamond Light Source have simplified and greatly reduced the time required to soak very large numbers of crystals. Continued advances at modern synchrotrons in robotics, detectors and automated data collection allow high-quality X-ray diffraction data to be collected increasingly rapidly. In the case of the specialised beamline I04-1 at Diamond Light Source, full X-ray diffraction datasets for many hundreds of crystals can be collected in a matter of hours. Traditionally, the analysis of many hundreds of datasets would be very time intensive and even just managing this amount of data can be difficult. To assist in the handling of large amounts of data, XChemExplorer (XCE) was developed by Krojer et al. (2017) [178] to support the simultaneous analysis of the very large numbers of protein-ligand complexes produced in structure-based drug design studies.

The use of X-ray crystallography for fragment screening has many challenges and bottlenecks. One of these is obtaining high-quality protein crystals that consistently diffract to high resolution (at least better than 2.7 Å or ideally better than 2.0 Å resolution) and are not damaged by soaking with fragments dissolved in DMSO. The second major challenge is identifying small fragments, which are often bound at low occupancy, in the many hundreds of datasets. Traditionally, every single structure must be assessed to search for weak electron density indicative of fragment binding and the modelling of such fragments is subjective and often suffers from bias due to overeager modelling. This is especially a problem when identifying low occupancy fragments as density for a fragment is often obscured by data from solvent molecules that are present at the same site in protein molecules within the crystal that do not have fragment bound.

The recent development of Pan-Dataset Density Analysis (PanDDA) by Pearce et al. (2017) [181] greatly assists fragment identification and modelling, and allows data from weakly bound fragments to be extracted from the rest of the data. The PanDDA algorithm allows the non-ground (potentially fragment-bound) state of a crystal to be extracted from the ground state. An accurate estimate of the ground state map is obtained by averaging near-convergent density from independently measured, near-isomorphous crystals (>30).

The use of multiple maps allows the calculation of Z-scores at each point in every map. Z-scores indicate the significance of any deviation at each point in an individual dataset from the average dataset. Generation of Z-maps then allows the potential sites of fragment binding (‘events’) to be identified objectively. Manual examination of events allows modelling of fragments where appropriate using the event and Z-maps. Following modelling of the fragments and adjustment of nearby solvent and protein atoms, an ensemble model of the bound and ground state is generated and refined. It has been demonstrated that modelling of partially occupied ligands with a superposed ground state model can provide superior accuracy as assessed by the fragment validation metrics listed in Table 4.1 [197]. The PanDDA method, when employed within the XCE framework, has the potential to greatly decrease the time required to identify fragment binding in X-ray crystal structures and allow more accurate modelling of sub-unitary occupancy fragments.
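The idea of a per-point Z-score described above can be illustrated with a toy calculation. This is a simplified sketch and not the PanDDA implementation: maps are represented as flat lists of density values on a common grid, the spread is taken as the plain sample standard deviation across reference datasets, and the threshold of 3 is an assumed illustrative value. The function names and the data are hypothetical.

```python
from statistics import mean, stdev


def z_map(dataset_map, reference_maps):
    """Return per-grid-point Z-scores of one map against a set of reference maps.

    Each map is a flat list of density values sampled on the same aligned grid.
    """
    if len(reference_maps) < 2:
        raise ValueError("at least two reference maps are needed to estimate a spread")
    z_scores = []
    for i, value in enumerate(dataset_map):
        column = [ref[i] for ref in reference_maps]
        mu = mean(column)
        sigma = stdev(column)  # sample standard deviation across datasets
        # A grid point with no variation carries no information; report Z = 0.
        z_scores.append((value - mu) / sigma if sigma > 0 else 0.0)
    return z_scores


def candidate_events(z_scores, threshold=3.0):
    """Indices of grid points whose positive Z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores) if z > threshold]


# Toy data: four reference maps and one soaked dataset on a four-point grid.
reference_maps = [
    [0.10, 0.50, 0.20, 0.90],
    [0.12, 0.48, 0.22, 0.88],
    [0.08, 0.52, 0.18, 0.92],
    [0.10, 0.50, 0.20, 0.90],
]
soaked_map = [0.11, 0.51, 0.60, 0.90]
events = candidate_events(z_map(soaked_map, reference_maps))
```

In this toy example only the third grid point deviates strongly from the reference distribution, so it is the only point flagged as a candidate event.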

Table 4.1 Metrics used for fragment validation

Metric            Preferred values
RSCC              >0.7
RSZD              <3
RSZO/OCC          >2
B-factor ratio    ~1
RMSD              <1

RSCC = real space correlation coefficient; RSZD = real space difference density Z-score; RSZO/OCC = real space observed density/fragment occupancy; B-factor ratio = ratio of the fragment B-factor compared to amino acid residues within 4 Å; RMSD = root mean square deviation of the fragment compared to original model.
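The preferred values in Table 4.1 can be applied as a simple checklist. The sketch below is illustrative only: the function name is hypothetical, and because the table gives the B-factor ratio only as "~1", the tolerance of ±0.5 used here is an assumption rather than a value from this work.

```python
def check_fragment_metrics(rscc, rszd, rszo_per_occ, b_factor_ratio, rmsd,
                           b_ratio_tolerance=0.5):
    """Compare fragment validation metrics against the preferred values of Table 4.1.

    Returns a dict mapping each metric name to True (preferred value met) or False.
    b_ratio_tolerance is an assumed interpretation of "~1".
    """
    return {
        "RSCC": rscc > 0.7,
        "RSZD": rszd < 3.0,
        "RSZO/OCC": rszo_per_occ > 2.0,
        "B-factor ratio": abs(b_factor_ratio - 1.0) <= b_ratio_tolerance,
        "RMSD": rmsd < 1.0,
    }


# Hypothetical metric values for two modelled fragments.
well_supported = check_fragment_metrics(0.85, 1.2, 2.6, 1.3, 0.4)
poorly_supported = check_fragment_metrics(0.62, 3.4, 1.1, 2.2, 0.4)
failed = [name for name, ok in poorly_supported.items() if not ok]
```

Returning a per-metric result rather than a single pass/fail flag keeps it visible which criterion a borderline fragment fails.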

Given these recent advances in fragment screening using X-ray crystallography, it was decided that this would be an appropriate technique for identification of fragments that bind to CHD1 DCD. The main advantage of this approach is that structural information regarding the binding of the fragments would be immediately available and could inform subsequent decisions.

4.1 Crystallisation

The major bottleneck when screening for fragments using X-ray crystallography is the generation of suitable protein crystals. A range of minimum criteria must be met before proceeding with a fragment screening campaign. Crystals must: (i) consistently diffract to high resolution; (ii) be highly reproducible; (iii) tolerate addition of DMSO; (iv) be of a consistent size; and (v) be easy to handle. Criteria (iv) and (v) are not absolute requirements; however, fragment screening is much simpler if these are also met.

The DCD of CHD1 was identified as potentially being a suitable target for fragment screening with X-ray crystallography as there are no major changes to the overall structure of CHD1 DCD upon binding of either KDM1A K114me2 or H3 K4me3 peptide (PDB 2B2W; [92]) compared to the apo structure (PDB 2B2Y) published by Flanagan et al. (2005) (RMSD of 0.855 and 0.281, respectively). This is important, as by far the simplest method to screen for fragments by X-ray crystallography is soaking into pre-formed crystals. If there are significant shifts in protein conformation upon binding of native ligands this would potentially limit the usefulness of this technique as fragments identified might only bind to a specific conformation of the domain. CHD1 DCD in the apo form and with a number of different peptides had been successfully crystallised previously by others [76, 92] and we had determined a high-resolution (1.6 Å) structure of CHD1 DCD in complex with KDM1A108-119 K114me2 peptide. Therefore, it seemed likely that crystals suitable for fragment screening could be developed.

The only report of the apo-CHD1 DCD being successfully crystallised previously was by Flanagan et al. (2005) (PDB 2B2Y; [92]). The structure was only determined at 2.35 Å resolution and, like all of the CHD1 DCD structures reported by Flanagan et al. (2005) [92], has an unusual asymmetric unit consisting of two full-length DCDs and one copy of just CD1, suggesting some proteolytic digestion occurred prior to crystallisation. It is therefore unlikely that the reported crystallisation conditions for generation of apo-crystals would be suitable for production of crystals for fragment screening. Indeed, it was not possible to produce apo-CHD1 DCD crystals in similar conditions to those reported. Crystal screens were performed with His6-CHD1270-443 with every available crystallisation screen (listed in Appendix Table 8.6), at 4 and 20 °C. Many different protein concentrations were trialled and the buffer was optimised using thermal shift; however, no protein crystals were observed. Matrix microseeding (MMS) with His6-CHD1270-443 and KDM1A108-119 K114me2 co-crystals was also unsuccessful in producing apo crystals. Different length constructs with and without a His6-tag were tested but also did not yield crystals. Given this impasse, a different approach was adopted. Instead of apo-CHD1, co-crystals of the DCD of CHD1 and H31-12 K4me3 peptide were to be used for fragment screening. Co-crystals of His6-CHD1270-443 and H31-12 K4me3 had been obtained previously but did not yet meet the criteria required for fragment screening. The use of His6-CHD1270-443 and H31-12 K4me3 co-crystals would likely restrict the accessibility of the aromatic cage and the rest of the peptide binding site; however, the D425-R113 interface, the nearby pocket identified in the previous chapter, and any allosteric sites would still likely be accessible. After much optimisation, highly reproducible crystals diffracting to consistently high resolution (1.6-2.0 Å) were obtained in one condition (0.01 M sodium/potassium phosphate, 0.1 M BTP pH 5.5, 21% w/v PEG 3,350).

4.2 DMSO tolerance test

Optimal DMSO concentration and soaking time for fragment screening were assessed by X-ray diffraction. It was found that a 20% v/v DMSO concentration soak for 1 h did not significantly affect X-ray diffraction. This concentration was selected for fragment screening with the Diamond-SGC Poised Library (DSPL), a fragment library developed through a collaboration between Diamond Light Source and the SGC in Oxford. Details about the library design can be found in [158]. The rationale behind the development of this library was to provide fragments that, once identified as binding to the protein target, could be rapidly elaborated with robust and general synthetic reactions from constituents that are commercially available. So-called poised bonds within the fragments can be separated into at least two synthons, allowing a large number of similar fragments to be rapidly generated by parallel synthesis.

There are 776 fragments in total in the DSPL fragment library. Fragments are dissolved in DMSO (a small subset in ethylene glycol) at 500 mM. Composition of the DSPL is provided in Figure 4.1.
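As a worked example of how the stock concentration and the DMSO tolerance relate, the nominal fragment concentration in a soaked drop can be estimated by simple dilution. This sketch assumes that all of the DMSO in the drop originates from the 500 mM fragment stock and that the 20% v/v figure from section 4.2 is the final DMSO fraction; both are assumptions made for illustration, and the function name is hypothetical.

```python
def final_fragment_concentration_mM(stock_mM, dmso_fraction):
    """Nominal fragment concentration after diluting a DMSO stock into a drop.

    Assumes the stock is the only source of DMSO, so the dilution factor of
    the fragment equals the final DMSO volume fraction.
    """
    if not 0.0 <= dmso_fraction <= 1.0:
        raise ValueError("dmso_fraction must be between 0 and 1")
    return stock_mM * dmso_fraction


# 500 mM stock diluted to 20% v/v DMSO gives a nominal 100 mM in the drop.
nominal_mM = final_fragment_concentration_mM(500.0, 0.20)
```

The value is nominal only, since it ignores solubility limits and incomplete transfer of fragment to the drop.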


Figure 4.1 Diamond-SGC Poised Library (DSPL). (a) Poised reaction core scaffolds and heterocycle formations in the library. (b) Composition of poised reaction core scaffolds in the DSPL fragment library. Figure adapted from Cox et al. (2016) [158].

4.3 Fragment soaking experiments

Prior to performing a full fragment screen using X-ray crystallography, a pre-screen with 100 fragments from the DSPL established that the fragment soaking system was valid for FBDD using X-ray crystallography. The remaining fragments from the library were then soaked into the CHD1 crystals. A total of 714 crystals from individual crystallisation drops were soaked with different fragments from the DSPL fragment library and X-ray diffraction data were collected. Not all 776 compounds were successfully soaked into protein crystals. This attrition is due mostly to unsuccessful transfer of fragments from the screen to the drop, but also occasionally to crystal damage upon fragment soaking. X-ray diffraction data from 668 crystals were of sufficient quality for the xia2 [173] and/or autoPROC [174] autoprocessing pipelines used at Diamond Light Source to process the X-ray diffraction data. Interestingly, two different space groups were observed: P21212 and P212121. The presence of two different space groups was unexpected, as all previous CHD1 DCD crystals produced in this study were in the same space group (P21212). The two space groups have different cell parameters (Table 4.2), and P212121 crystals contain two CHD1 DCD molecules in the asymmetric unit as opposed to the single CHD1 DCD in the asymmetric unit of P21212 crystals. In order for the DIMPLE pipeline and PanDDA analysis to be performed correctly, datasets were clustered into two groups according to their space group. A new structural model for the His6-CHD1270-443 and H31-12 K4me3 complex (space group, P212121) was also determined so that DIMPLE and PanDDA could be performed with these datasets (see section 4.3.1). Models and electron density could be generated for 630 of the 668 datasets using DIMPLE. Of these 630 datasets, 585 were suitable for PanDDA analysis. The majority of the datasets that were suitable for PanDDA analysis (492/585) had been processed in the space group P21212; however, a small but significant number (93/585) were processed in the new P212121 space group. The distribution of the high-resolution limits for the datasets used for PanDDA analysis is displayed in Figure 7.25.
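The clustering of datasets by space group prior to DIMPLE and PanDDA can be sketched as a simple grouping step. This is an illustrative sketch only and not the procedure used within XCE; the dataset names are hypothetical, and each dataset is assumed to be represented as a (name, space group) pair taken from autoprocessing output.

```python
from collections import defaultdict


def cluster_by_space_group(datasets):
    """Group dataset names by their space group label.

    datasets: iterable of (name, space_group) pairs.
    Returns a dict mapping each space group to the list of dataset names,
    so that each group can be analysed against its own reference model.
    """
    clusters = defaultdict(list)
    for name, space_group in datasets:
        clusters[space_group].append(name)
    return dict(clusters)


# Hypothetical dataset names for illustration.
datasets = [
    ("CHD1-x0001", "P21212"),
    ("CHD1-x0002", "P212121"),
    ("CHD1-x0003", "P21212"),
]
clusters = cluster_by_space_group(datasets)

# The counts reported above are internally consistent: 492 + 93 = 585.
total_for_pandda = 492 + 93
```

Each cluster would then be processed against the reference model determined in the matching space group.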

Table 4.2 Summary of datasets suitable for PanDDA.

                             P21212                                      P212121
Total no. of datasets        492                                         93
Mol. in asym. unit           1                                           2
Cell parameter ranges
  a, b, c (Å)                109.40-112.70^a, 43.80-45.90, 44.2-47.30    45.15-45.92, 93.45-94.70, 110.10-112.4
  α, β, γ (°)                90.00, 90.00, 90.00                         90.00, 90.00, 90.00
RMSD to reference model      0.350-0.725 Å                               0.130-0.880 Å
High resolution limit        1.28-2.62 (1.90)^b Å                        1.32-2.26 (1.73) Å

RMSD, root mean square deviation. PanDDA, Pan-dataset density analysis. ^aThese parameter ranges represent extreme values. ^bValues in parentheses represent average resolution (see Figure 7.25).

4.3.1 P212121 structure

As mentioned in section 4.3, the asymmetric unit of P212121 crystals contains two CHD1 DCD molecules (Figure 4.2). H3 K4me3 peptides are bound to both CHD1 DCD molecules in the asymmetric unit. Unlike in previous models of CHD1 DCD determined from crystals in the P21212 space group, all of the linker region could be modelled confidently. This is likely due to the tighter packing of the linker region in the P212121 crystals, limiting the flexibility of this region, compared to the P21212 crystals (Figure 4.3). This linker region has been shown to be important for the function of the yeast Chd1 enzyme and for interaction with nucleosomal DNA [118, 120, 124]. The ability to model this region using well-defined electron density may increase the chances of detecting and accurately modelling fragments bound at the linker region in P212121 crystals, which could be of interest to future studies given the important role proposed for the linker region in the proper functioning of Chd1.

Figure 4.2 Model of a 1.74 Å X-ray crystal structure of CHD1 DCD in complex with H3 K4me3 peptide crystallised in space group P212121. CHD1 DCD molecules displayed as a cartoon model. H3 K4me3 peptides modelled as sticks and coloured according to atom type. CHD1 chain A monomer is coloured teal and CHD1 chain B is coloured blue. The linker region of chain B is shown by grey colouration of the cartoon model. The linker regions of both chain A and B could be modelled confidently.

4.4 Fragment identification

PanDDA [181] was performed separately with the two different space groups. A large number of events were identified by PanDDA. Analysis of the P21212 datasets identified 205 events at eight different sites from 179 datasets. Manual inspection of these events revealed that 51 of the 205 events from the P21212 datasets were potentially the result of fragment binding (‘hits’) at five of the eight sites. Analysis of the P212121 datasets identified 41 events at eight different sites from 27 datasets. From the 41 events identified in the P212121 datasets, 18 potential hits were identified at five out of the eight sites. A large number of the events (75) from the P21212 datasets that were deemed unlikely to be hits were from one particular site located at the flexible and difficult to model linker region of CHD1 DCD (Figure 4.3).

Figure 4.3 Crystal packing of CHD1 crystals. Crystal packing near the linker region is more compact in P212121 crystals (b) than P21212 crystals (a). There is one molecule in the asymmetric unit of P21212 crystals and two in the P212121 crystals. Molecules in the asymmetric unit are coloured according to B-factor as indicated in (c). Nearby symmetry molecules are coloured grey.

In models from the P21212 space group this region is in general very poorly resolved, with insufficient density in most datasets to confidently place many of the residues in this region, which is potentially the reason for the very high number of false positives at this site. A number of sites identified as containing fragment hits are shared between the P21212 and P212121 datasets (Figures 4.4 and 4.5). There are, however, some sites that are unique to a particular space group, some of which are the result of the different crystal packing interactions or of the presence of a second copy of CHD1 in the asymmetric unit. In total, nine different sites from the P21212 and P212121 datasets were identified as having fragments bound (Figures 4.4 and 4.5). Site 1 is present in structures from both space groups and has by far the most fragment hits. Five fragment binding sites are observed in P21212 datasets (sites 1-5), with three of these (sites 2, 4, and 5) not observed in P212121 datasets. P212121 datasets were observed to have fragments bound to sites 1 and 3, and also four additional sites (sites 6-9) unique to this space group (Figures 4.4 and 4.5). Site 1 is the main pocket of interest near the peptide binding interface (Figure 3.16), which was postulated as being more suitable for compound binding than the aromatic cage or the rest of the shallow peptide binding surface groove. None of the fragments bound in this pocket are large enough to extend out to the D425-R113 interaction interface, so significant chemical elaboration of selected fragments will need to be performed to develop an inhibitor that targets the D425-R113 interaction site. Site 6 from the P212121 structures represents this same pocket but in the second copy of CHD1 in the asymmetric unit of this crystal form. Examination of the two pockets revealed no significant differences between the site 1 and site 6 pockets; therefore, site 1 and site 6 were treated as being equivalent.
Interestingly, and somewhat unexpectedly, two different sites were identified at the peptide binding interface. Site 2 is located near CHD1 D408, a residue with which H3 K4me3, KDM1A K114me2, and NS1 K229me2 all form important interactions. Site 7 is located in the aromatic cage, and the fragment bound there represents the first known compound to bind in the aromatic cage of a DCD. These two sites could be interesting for the development of compounds that inhibit the interaction of CHD1 with all identified methylated binding partners. The remaining sites are located at the surface of CHD1 DCD in regions too far away for the development of competitive inhibitors of methylated peptide binding. From the X-ray crystal structures there is no clear evidence that any of these might represent an allosteric inhibition site for the interaction with methylated binding partners, so they are currently of less interest. However, these sites could be of interest to future investigations as some of them appear to be near, or could be near, potential protein-protein or protein-DNA interaction surfaces.

Figure 4.4 Fragment binding sites in P21212 CHD1 DCD structures. CHD1 DCD displayed as a cartoon model (teal). H3 K4me3 peptide displayed as a stick model (carbon, yellow; oxygen, red; nitrogen, blue). Fragments bound at the different sites are displayed as stick and surface models. Site 1 (red) indicates fragments bound at the main site of interest for further development of compounds. Site 2 (magenta) binds at a location overlapping with the binding site of H3 K4me3 peptide. Remaining binding sites (3-5) are coloured grey. The bottom image is a 90° rotation of the top image. Images generated by aligning and overlaying all fragment bound structures in PyMOL (Schrödinger, LLC).

An interesting feature of the P212121 datasets was that several of the sites were observed at the surface of chain A but not chain B (Figure 4.5). The exceptions are site 6 from chain B, which is equivalent to site 1 in chain A, and site 9, which is located between the two chains. The most likely cause of this is crystal packing either altering the chemical environment at the different binding sites or making some sites inaccessible to fragment soaking. This highlights the important role that crystal packing can have in fragment screening using X-ray crystallography.

Figure 4.5 Fragment binding sites in P212121 CHD1 DCD structures. Both CHD1 DCD monomers are displayed as cartoon models. Chain A is coloured teal and chain B is coloured blue. H3 K4me3 peptides are modelled as sticks (carbon, yellow; oxygen, red; nitrogen, blue). Fragments are displayed as stick and surface models. Site 1 (red) is the site of primary interest for further compound development. Site 6 from chain B is also coloured red as it corresponds to site 1 from chain A. Site 7 (magenta) is located in the aromatic cage of chain A. Other binding sites 3, 8, and 9 are coloured grey. The bottom image is a 180° rotation of the top image. Images generated in PyMOL (Schrödinger, LLC) by aligning and overlaying all structures with fragments bound.

4.5 Fragment modelling and validation

Fragments were modelled manually at potential hit sites determined by PanDDA if there was sufficient evidence to do so. Fragments were modelled based on several different aligned maps: the 2mFo-DFc map (contoured to 1 σ); the event map (1-BDC, contoured to 2 σ); and the Z-map (contoured to ±3 σ). Both the event and Z-maps were examined to determine whether there was sufficient evidence for the placement of the soaked fragment at the event site identified by PanDDA, and fragments were then modelled mostly using these two maps. Nearby solvent molecules and amino acid residues were remodelled if there was strong support for these shifts in the maps. Solvent molecules or the H3 K4me3 peptide present in the ground-state model were deleted from the bound-state model if they conflicted with the placement of the fragment. As fragments generally bind with low affinity and have sub-unitary occupancy, the fragment and any other changes induced by fragment binding will only be present in a subset of the CHD1 DCD molecules within the crystals. The X-ray diffraction data is therefore a mixture of CHD1 DCD with fragment bound and CHD1 DCD without fragment bound. To accurately represent the data both states should be modelled, so ensemble models combining the ground and bound states were generated. Features present only in the bound state were modelled as a unique alternate conformer (assigned ‘D’). Features present in the ground state but not the bound state were modelled as alternate conformers (assigned ‘C’). Ensemble models and aligned maps were then shifted back to their native (or crystallographic) state. Refinement and further manual modelling of the ensemble model then improved the structures and confirmed the presence of the fragments. In total, 53 fragments were modelled successfully. Seven of the fragments bound at two different sites, so in total 46 different fragments were identified. The number of fragments binding to the different sites is shown in Table 4.3. The presence or absence of H3 K4me3 peptide in the ground and/or bound states of the different ensemble models is indicated in Appendix Table 7.10.
X-ray diffraction data collection and refinement statistics, fragment validation statistics, and a full list of the fragments bound to different sites are provided in Appendices 7.1, 7.2, and 7.3. Select examples of interesting fragments are discussed in detail.
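
The role of the background density correction (BDC) in the event maps described above can be illustrated with a minimal numerical sketch. It assumes the commonly described PanDDA formulation, in which a BDC-weighted copy of the ground-state map is subtracted from the dataset map, so that 1-BDC approximates the bound fraction. The function names, the one-dimensional "maps", and the density values are hypothetical; they are not taken from the PanDDA code or from the datasets in this work.

```python
def event_map(dataset_map, ground_map, bdc):
    """Subtract a BDC-weighted ground-state map from a dataset map."""
    if not 0.0 <= bdc < 1.0:
        raise ValueError("BDC must lie in [0, 1)")
    if len(dataset_map) != len(ground_map):
        raise ValueError("maps must be sampled on the same grid")
    return [d - bdc * g for d, g in zip(dataset_map, ground_map)]


def rescale_event_map(event, bdc):
    """Optionally rescale by 1/(1 - BDC) to the scale of the bound state."""
    return [e / (1.0 - bdc) for e in event]


# Hypothetical density values on three grid points (not real data).
ground = [1.0, 2.0, 0.5]  # ground-state density
bound = [1.0, 2.0, 3.0]   # bound-state density; the last point holds the fragment
bdc = 0.67                # assumed bound fraction is therefore 1 - BDC = 0.33

# The crystal is treated as a mixture of ground and bound states.
dataset = [bdc * g + (1.0 - bdc) * b for g, b in zip(ground, bound)]

event = event_map(dataset, ground, bdc)
recovered = rescale_event_map(event, bdc)
```

In this idealised mixture the rescaled event map reproduces the bound-state density exactly. With real data the ground-state map is an average over many datasets and the BDC is itself an estimate, so the recovery is only approximate.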


Table 4.3 Number of fragments at different sites.

Site      Number of fragments
Site 1    35 (31)
Site 2    1 (1)
Site 3    2 (2)
Site 4    1 (0)
Site 5    1 (1)
Site 6    6 (1)
Site 7    1 (0)
Site 8    1 (1)
Site 9    5 (3)

Values in parentheses represent the number of fragments bound only to that particular site.
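
As a quick consistency check, the totals quoted in the text can be recomputed from Table 4.3. The sketch below only restates the tabulated values; the variable names are illustrative.

```python
# Table 4.3 as site -> (fragments bound, fragments bound only at that site).
site_counts = {
    1: (35, 31),
    2: (1, 1),
    3: (2, 2),
    4: (1, 0),
    5: (1, 1),
    6: (6, 1),
    7: (1, 0),
    8: (1, 1),
    9: (5, 3),
}

# Total number of modelled fragment binding events across all sites.
total_modelled = sum(total for total, _ in site_counts.values())

# Sites 1 and 6 are the same pocket in chains A and B, so they are pooled.
hotspot_total = site_counts[1][0] + site_counts[6][0]

# Seven fragments bind at two sites each (stated in the text), so each of
# them is counted twice in the total above.
fragments_at_two_sites = 7
distinct_fragments = total_modelled - fragments_at_two_sites
```

The three derived values correspond to the 53 modelled fragments, the 41 fragments at the site 1/site 6 hotspot, and the 46 distinct fragments reported in the text.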

4.5.1 Site 7

Site 7 is located in the aromatic cage of CHD1 DCD chain A in P212121 crystals (Figure 4.5). The identification of a fragment binding site in the aromatic cage of CHD1 DCD was slightly unexpected as the crystals of CHD1 DCD used for fragment screening were co-crystallised with H31-12 K4me3 peptide. Addition of H31-12 K4me3 was essential for crystallisation and strong electron density was observed for H3 K4me3 (residues 1-5) in previous datasets from P21212 crystals. One might therefore have expected that the aromatic cage would largely be occluded as a site for fragment binding, as it is unlikely that a fragment could outcompete the large H3 K4me3 peptide, especially within a crystal. However, in protein-ligand crystals, the ligand will never achieve unitary occupancy even in co-crystallisation experiments, meaning that X-ray diffraction data collected from protein-ligand crystals are a mixture of protein with and without ligand bound. It is therefore still possible for a fragment to bind to the small fraction of molecules in crystals without H31-12 K4me3 peptide bound. The PanDDA algorithm can then extract the weak signal from the subpopulation with fragment bound from the major fraction with H31-12 K4me3 peptide bound, allowing fragments to be modelled.

4.5.1.1 Fragment x0210

Only one fragment could be confidently modelled at this site, fragment x0210. Fragment x0210, 4-(3,4-difluorobenzoyl)-1-methylpiperazin-1-ium, contains an amide coupling poised core scaffold and is bound in the aromatic cage of CHD1 DCD (Figure 4.6).


Figure 4.6 Fragment x0210 binds in the aromatic cage of CHD1 DCD. (a) Chemical structure of fragment x0210. (b) Electrostatic surface model of the bound state of a 1.42 Å X-ray crystal structure of CHD1 DCD (isocontour, ± 8 kT/e), with fragment x0210 bound in the aromatic cage (only CHD1 DCD chain A is shown). Fragment x0210 is shown as a ball-and-stick model and coloured according to atom type (carbon, green; oxygen, red; nitrogen, blue; and fluorine, cyan).

Figure 4.7 Evidence of co-occupancy of fragment x0210 and H3 K4me3 in the aromatic cage. (a) PanDDA input model showing the ground state (gold) model with H3 K4me3 and nearby CHD1 DCD residues as sticks. The 2mFo-DFc (blue; 1 σ) and mFo-DFc (green/red; ±3 σ) maps indicate the presence of another species in the aromatic pocket. (b) Ground state model displayed. (c) Bound state model displayed. (b,c) Event (BDC=0.67; magenta; 2 σ) and Z-maps (green; 5 σ) from PanDDA analysis clearly reveal the presence of fragment x0210 in the aromatic cage, the absence of H3 K4me3, the movement of a number of side chains including W325, and the presence of a number of new water molecules. (d) Refined ensemble model of x0210 aromatic cage. The 2mFo-DFc (blue; 1 σ) and mFo-DFc (green/red; ±3 σ) maps support the generated ensemble model, the placement of fragment x0210, and the positioning of bound- and ground-state molecules. Fragment x0210 shown as ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue; fluorine, grey). Ground state-only water molecules coloured cyan.

Support for the placement and modelling of fragment x0210 in the aromatic cage in the bound state is displayed in Figure 4.7. In the PanDDA input files for x0210, examination of the 2mFo-DFc and mFo-DFc maps shows that there are areas of electron density not explained by the ground state model alone (Figure 4.7a). The event and Z-maps generated by PanDDA provided no evidence to support the presence of H3 K4me3 in the bound state. This absence of event map density is specific to the peptide, as there is event map density for the surrounding protein, indicating that the H3 K4me3 peptide is absent in the bound state. Upon deletion of H3 K4me3 and several nearby water molecules, fragment x0210 could be modelled into the event map (Figure 4.7b,c). Additionally, it is clear from the event and Z-maps that several of the nearby side chains of CHD1 DCD occupy different positions and that there is evidence for the presence of a number of additional water molecules along one side of the aromatic pocket in the bound state (Figure 4.7b,c). Refinement of the ensemble model (bound state and ground state models combined) resulted in a structural model accounting for all of the electron density in the aromatic cage (Figure 4.7d). Fragment x0210-2 is modelled at this site with a high degree of confidence (Figure 4.8).

Figure 4.8 Fragment x0210 binds at site 7 in the aromatic cage. Event (a; BDC=0.67; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) indicate high confidence for the placement of fragment x0210-2. (a,b) The bound state of CHD1 DCD is shown as a cartoon model (pale blue) and fragment x0210-2 is shown as a ball-and-stick model (carbon, green). Residues within 7 Å of fragment x0210-2 in the bound state (carbon, pale blue) and the ground state (carbon, magenta) are modelled as sticks and coloured by atom type. Oxygen, nitrogen, and fluorine are coloured red, blue, and grey, respectively.

The aromatic cage of CHD1 DCD is composed of two tryptophan residues and surrounding acidic and polar residues that create a negatively charged pocket (Figures 4.6 and 4.8). The position of several of these residues is altered in the fragment-bound state compared to the ground state (Figure 4.9). The most significant change is the movement of W325 away from the methyllysine binding site to occupy a position almost identical to that observed in the apo CHD1 DCD structure (PDB 2B2Y; [92]; Figure 4.9b). This shift enables the positioning of the methylpiperazine moiety near the W325 and W322 residues. The positively charged methylamine group can then form a cation-π interaction with W325 (Figure 4.9b). To accommodate the difluorophenyl moiety, the Q341 side chain flips almost 180° out of the pocket. There are also smaller shifts in the E333, T337, and W322 residues outward from the pocket centre, further enlarging the pocket. In contrast, the E272 side chain, which can no longer interact with W325, has shifted inwards slightly. The difluorophenyl ring does not appear to form any electrostatic interactions with CHD1 DCD. This highly electronegative moiety may facilitate binding through favourable van der Waals interactions and/or by increasing the positive charge of the methylpiperazine group, resulting in the formation of stronger cation-π interactions with the tryptophan indoles.

Figure 4.9 Fragment x0210 forms a cation-π interaction with CHD1 W325. (a) Comparison of the positions of the aromatic cage residues in the bound state (carbon, pale blue), ground state (carbon, magenta), and apo CHD1 DCD (carbon, cyan; PDB 2B2Y, [92]). (b) Fragment x0210 forms a cation-π interaction with the W325 indole. CHD1 DCD modelled as a cartoon (pale blue). Aromatic cage residues modelled as sticks and coloured by atom type. Fragment x0210 shown as a ball-and-stick model (carbon, green). Oxygen, nitrogen, and fluorine atoms coloured red, blue, and light blue, respectively. Cation-π interaction is indicated with a black dashed line.

Fragment x0210 is only observed in the aromatic cage of chain A despite the presence of another aromatic cage in the asymmetric unit of P212121 crystals. This might be the result of crystal packing. In the P212121 space group the N-terminal region of helix α4 from a symmetry molecule packs closely to the peptide binding interface of chain B (Figure 4.10a). In contrast, the packing around the chain A site is not as compact, allowing fragments to soak into this site (Figure 4.10b).


Figure 4.10 Crystal packing near the peptide interface may limit solvent access to the aromatic cage of chain B. Crystal packing in P212121 CHD1 DCD crystals positions helix α4 of the linker region from a symmetry molecule in much closer proximity to the H3 K4me3 peptide in chain B (a) than in chain A (b). This tighter packing might explain why fragment x0210 does not bind in the aromatic cage of chain B. CHD1 DCD shown as a cartoon model. Chain A, chain B, and symmetry molecules are coloured teal, blue, and grey, respectively. H3 K4me3 peptides are modelled as sticks (carbon, yellow). Fragment x0210 is displayed as a ball-and-stick model (carbon, green). Oxygen, nitrogen, and fluorine are coloured red, blue, and light blue, respectively.

4.5.2 Site 2

Site 2 is another interesting site located at the methylated peptide interface of CHD1 DCD that could be targeted for development of compounds inhibiting the binding of KDM1A K114me2, H3 K4me3, and NS1 K229me2. This site overlaps with the binding site of the N-terminal regions of all three peptides, so would likely not be suitable for development of a KDM1A K114me2-specific inhibitory compound (Figures 4.4 and 4.11). Only one high confidence fragment, x0381 (N-(2-ethyl-2H-tetrazol-5-yl)butanamide), was observed to bind to CHD1 DCD at site 2 (Figure 4.11).

One other fragment was identified by PanDDA analysis at this site and could be modelled into the event map; however, the fragment was unstable during refinement and shifted significantly from the modelled position, so was not included as a hit. The event (BDC=0.70; magenta; 2 σ; Figure 4.11d), 2mFo-DFc (blue; 1 σ; Figure 4.11e), and mFo-DFc (green/red; ±3 σ; Figure 4.11e) maps indicate strong support for the placement of fragment x0381. The majority of the fragment validation metrics indicate strong confidence in the ligand placement; however, the B-factor ratio indicates some inconsistency with the surrounding protein molecule. This might in part be the result of the ground state model that was used. Modelling of fragments at the peptide interface is complicated slightly by the inconsistent occupancy of the H31-12 K4me3 peptide in different fragment-bound models. The ground state model used in the PanDDA analysis contains an H3 K4me3 peptide modelled at the peptide binding site of CHD1 DCD. Refinement of the x0381 ensemble model revealed no evidence of H3 K4me3 bound in the ground state of x0381. The H3 K4me3 peptide was therefore removed from the x0381 ground state model. There is insufficient evidence to model anything else in place of H3 K4me3, so no solvent molecules are modelled at this position in the ground state model of x0381, which is unlikely to be an accurate representation of the ground state. This may have resulted in the occupancy of fragment x0381 being slightly overestimated, in turn inflating the B-factor of the ligand compared to the nearby amino acid residues.
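
The B-factor ratio referred to above can be illustrated with a short sketch. It assumes the ratio is defined as the mean B-factor of the fragment atoms divided by the mean B-factor of the surrounding protein atoms; the function name and all values are hypothetical and are not taken from the x0381 model.

```python
from statistics import mean


def b_factor_ratio(ligand_b, surrounding_b):
    """Mean ligand B-factor divided by mean B-factor of nearby protein atoms."""
    if not ligand_b or not surrounding_b:
        raise ValueError("both atom selections must be non-empty")
    return mean(ligand_b) / mean(surrounding_b)


# Hypothetical B-factors in Å^2 (illustrative only).
ligand_b = [42.0, 45.0, 48.0]   # fragment atoms
protein_b = [20.0, 22.0, 18.0]  # protein atoms near the fragment

ratio = b_factor_ratio(ligand_b, protein_b)
```

A ratio well above 1, as in this made-up example, is the kind of mismatch that refinement can produce when the ligand occupancy is set too high and the B-factors rise to compensate.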

Figure 4.11 Fragment x0381 binds to CHD1 DCD at site 2 located at the peptide-binding interface. (a,b,c) Electrostatic potential (isocontour, ±8 kT/e) surface models of the bound state X-ray crystal structure, x0381 (1.49 Å). (a) Bound state only, showing the position of fragment x0381 and two bound state-only DMSO molecules (carbon, yellow; oxygen, red; and sulfur, orange). (b,c) H3 K4me3 (carbon, yellow) and KDM1A K114me2 (carbon, cyan) peptides have been overlaid and modelled as sticks. Event (d; BDC=0.7; magenta; 2 σ), 2mFo-DFc (e; blue; 1 σ), and mFo-DFc (e; green/red; ±3 σ) maps and ligand (f) validation metrics indicate good support for the modelling of fragment x0381 (g). (d,e) CHD1 DCD displayed as a cartoon model (pale blue). Residues within 7 Å of fragment x0381 (ball-and-stick; carbon, green) are displayed as sticks and coloured by atom type. Oxygen and nitrogen coloured red and blue, respectively.

The amide group of fragment x0381 forms a hydrogen bond with a carboxyl oxygen of D408, an important residue for H3 K4me3, KDM1A K114me2, and NS1 K229me2 binding. The tetrazole of fragment x0381 and the W423 indole may also form π-stacking interactions (Figure 4.11). Developing potent compounds at site 2 is likely to be difficult, as the binding surface is very open and exposed to solvent. As fragment x0381 is located near the N-terminal regions of the identified methylated peptide binding partners of CHD1 DCD, it might be possible to develop a peptide-derived inhibitor incorporating a fragment similar to x0381.

4.5.3 Site 1

Site 1 represents a hotspot for fragment binding, with the vast majority of the fragments binding to this site (41 fragments). Site 1 is a pocket near the CHD1 D425-KDM1A R113 interaction surface that was highlighted in chapter 3 and postulated to be more amenable to compound binding than the aromatic cage or the rest of the peptide binding site. Fragments were identified at site 1 in both P21212 and P212121 crystals. In addition, fragments were identified at site 6 in the P212121 crystals, which is the equivalent of site 1. Site 6 from chain B is essentially identical to site 1 from chain A, so the two are discussed together as one site. Due to the large number of fragments bound at this site, not all will be discussed in detail. Representative examples of interesting compounds with different binding modes, interactions, and poised core scaffolds will be discussed. There are four water molecules in the site 1 pocket and an additional three near the pocket opening. Displacement of these water molecules may in part drive binding to this fragment hotspot, as the affinity of fragments for such hotspots is often due to the displacement of ‘unhappy’ water molecules [152].

One of the most notable aspects of fragment binding at site 1 is a structural change in the pocket upon compound binding, resulting in an opening of one side of the pocket. This open pocket is only observed in CHD1 DCD structures with fragments bound at site 1; in the ground states of these crystals this side of the pocket is closed. The opening of this side of the pocket allows fragments to bind in two distinct orientations at site 1 (Figure 4.12). In one orientation, fragments extend out of the main pocket into the newly created opening (orientation 2). In the case of fragments bound in orientation 1, the formation of this pocket is still observed; however, the fragments do not extend out into this new pocket but rather toward the CHD1 DCD D425-KDM1A R113 interaction interface. The unanticipated opening of this side of the binding pocket potentially creates an opportunity to increase potency and/or selectivity of the fragments. For example, fragments in orientation 1 could be elaborated to also extend into this pocket, or the opposite approach could be adopted, where fragments bound in orientation 2 are used as the basis for elaboration of compounds to extend toward the CHD1-KDM1A interface.

Of the 41 fragments bound to site 1, 30 are bound in orientation 1 and 11 in orientation 2 (Appendix 7.3.1). Interestingly, in the bound states of a number of datasets there is no evidence to support placement of the H3 K4me3 peptide, and in some cases there is clear evidence supporting the absence of the peptide. Ten of the structures with fragments bound at site 1 in orientation 1 and five in orientation 2 have no density to support peptide binding in the bound state. Apart from fragments bound at site 1 or at the peptide interface, the only fragment that displayed an absence of H3 K4me3 peptide binding was fragment x0137, which is bound to site 3 (Figure 4.32 and Table 7.10). It is difficult to gauge the significance of the absence of histone peptide binding in the presence of these fragments, as in the majority of cases there are no clear changes in protein conformation that would explain the lack of binding. Fragments present in structures without clear evidence for H3 K4me3 binding do not appear to share a common scaffold or binding mode. The possibility that the absence of H3 K4me3 density in the bound states of some crystals is simply due to very low occupancy of the peptide in these particular crystals prior to fragment soaking cannot be excluded. However, this remains unlikely given that such low occupancy has not been observed previously in crystals without fragment addition. It appears more likely to be a result, either directly or indirectly, of the addition of particular fragments, although more data is required to support this hypothesis. The presence or absence of the H3 K4me3 peptide in the different fragment co-crystal structures is indicated in Appendix Table 7.10.


Figure 4.12 Fragments bind at site 1 in two distinct orientations. CHD1 DCD shown as cartoon model (teal). H31-6 K4me3 is displayed as a stick model (carbon, yellow; oxygen, red; nitrogen, blue). (a) Overlay of all X-ray crystal structures (P21212 space group) with fragments bound at site 1 in orientation 1. Stick and surface models for all fragments are displayed (grey). The lower figure has been rotated 90°. (b) Overlay of all X-ray crystal structures (P21212 space group) with fragments bound at site 1 in orientation 2. Stick and surface models displayed for all fragments (grey).

4.5.3.1 Orientation 1

There are 30 fragments bound to site 1 in orientation 1. These fragments differ in how they sit in the pocket, the identity of their functional groups, the interactions they form with protein and solvent, and in their poised reaction core scaffolds. The most prevalent core scaffold in this orientation, and indeed in all of the hits, is amide coupling. This is not unexpected given the prevalence of this scaffold in the DSPL library (Figure 4.1). There are also examples of N-arylation, ester and ether biosynthesis, and urea poised reaction scaffolds. Several examples will be discussed in the following section.


Figure 4.13 Fragment x0628 binds to site 1 in orientation 1. Event (a; BDC=0.60; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (d) indicate high confidence for the placement of fragment x0628 (c). (a,b) The bound-state of CHD1 DCD is shown as a cartoon model (pale blue). Residues within 7 Å of fragment x0628 (ball-and-stick; carbon, green) in the bound state (carbon, pale blue) and the ground state (carbon, magenta) are modelled as sticks and coloured by atom type. Oxygen and nitrogen are coloured red and blue, respectively.

Fragment x0628 contains a core N-arylation poised scaffold and is bound to site 1 in orientation 1 (Figures 4.13 and 4.14). Fragment x0628, or 3-[(4,6-dimethyl-2-pyrimidinyl)amino]phenol, forms two hydrogen bonds with CHD1 DCD: one between the hydroxyl group from the phenol moiety of fragment x0628 and the carbonyl group of L428; and the other between the amine group of fragment x0628 and the carboxyl group of E298. Large shifts in the orientation of nearby amino acid residues are not observed. There are, however, some small changes in the bound state compared to the ground state (Figure 4.13b), the most significant being the movement of E298 and L428 to form hydrogen bonds with fragment x0628. Fragment x0628 extends out of the site 1 pocket toward the CHD1 D425-KDM1A R113 interface (Figure 4.14a,b) to within 4.4 Å of KDM1A R113. Fragment x0572, which only differs from fragment x0628 by the absence of a hydroxyl group, binds to site 1 in an almost identical orientation (Figure 4.14c), with the major difference being the absence of an interaction with L428 (Figure 4.15).

Figure 4.14 Fragment x0628 binds at site 1 in orientation 1 and extends toward the CHD1 D425-KDM1A R113 interface. (a,b) Bound-state surface models of a 1.63 Å X-ray crystal structure of CHD1 DCD (grey) in complex with fragment x0628 (ball-and-stick; carbon, green) and H31-12 K4me3 peptide (sticks; carbon, yellow). An overlay of the KDM1A K114me2 peptide (sticks; carbon, cyan) is displayed. Fragment x0628 extends toward the CHD1 DCD D425-KDM1A R113 interface with one of the methyl groups from fragment x0628 being only 4.4 Å away from KDM1A R113. Dots indicate van der Waals spheres. (c) Electrostatic potential (isocontour, ±8 kT/e) surface model of the site 1 pocket demonstrates the similar positioning of fragments x0628 (sticks; carbon, green) and x0572 (sticks; carbon, magenta) within the moderately electronegative pocket. Oxygen and nitrogen coloured red and blue, respectively.

Nine of the fragments bound at site 1 in orientation 1 contain fluorophenyl or chlorophenyl groups that insert deep into the pocket (Figure 4.16). Interestingly, one of the fluorophenyl-containing fragments, x0040, binds in an orientation in which the fluorine atom is positioned quite differently from those of the other fluorophenyl-containing fragments and from the chlorine atoms of the chlorophenyl-containing fragments (Figure 4.16). In the other fragments the fluorine or chlorine atom occupies an area near one side of the pocket, whereas the fluorine of fragment x0040, or 4-(4-fluorophenyl)-1-piperazinecarboxamide, points directly toward the rear of the pocket (Figures 4.16 and 4.17). The piperazinecarboxamide moiety extends out of the pocket toward the CHD1 D425-KDM1A R113 interface with a minimum distance of 5.6 Å between the fragment and KDM1A R113 (Figure 4.17b).
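
The minimum distances quoted between fragments and KDM1A R113 are taken here to be the shortest atom-to-atom distance between the two groups. A minimal sketch of that calculation follows; the function name is illustrative and the coordinates are hypothetical, chosen only so that the example yields 5.6 Å, and are not taken from the deposited models.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Shortest Euclidean distance (Å) between any atom in A and any atom in B."""
    if not atoms_a or not atoms_b:
        raise ValueError("both atom selections must be non-empty")
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


# Hypothetical Cartesian coordinates in Å (illustrative only).
fragment_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
arginine_atoms = [(7.1, 0.0, 0.0), (8.0, 1.0, 0.0)]

closest = min_distance(fragment_atoms, arginine_atoms)
```

Because the KDM1A K114me2 peptide is an overlay from a separate structure rather than part of the fragment-bound model, distances obtained this way are best read as approximate guides for fragment elaboration.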

Figure 4.15 Fragment x0572 binds at site 1 in orientation 1 in a similar manner to fragment x0628. Event (a; BDC=0.66; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) indicate strong support for placement of fragment x0572. Chemical structure of fragment x0572 displayed above panel (a). Bound state of CHD1 DCD shown as a cartoon representation (pale blue). Residues within 7 Å of fragment x0572 (ball-and-stick; carbon, green) in the bound (carbon, pale blue) and ground (b; carbon, magenta) states are modelled as sticks and coloured by atom type. Oxygen and nitrogen are coloured red and blue, respectively.

Figure 4.16 Nine fluorophenyl or chlorophenyl-containing fragments bind at site 1 in orientation 1. A carved surface model of the site 1 binding pocket of CHD1 DCD is displayed as a grey surface model (1.76 Å resolution). Fragments are modelled as sticks and coloured by atom type with carbon atoms coloured as indicated, and oxygen, nitrogen, fluorine and chlorine atoms coloured red, blue, cyan, and orange, respectively. (a) Fluorophenyl-containing fragments are coloured as follows: x0003 (green); x0040 (magenta); x0076 (yellow); x0156 (tan); and x0454 (dark green). (b) Chlorophenyl-containing fragments are coloured as follows: x0257 (marine); x0271 (light grey); x0481 (pink); and x0601 (pale green).


Figure 4.17 Fragment x0040 binds at site 1 in orientation 1 and extends out of the pocket. Bound-state surface models of a 1.81 Å X-ray crystal structure of CHD1 DCD in complex with fragment x0040 (ball-and-stick; carbon, green) and H31-12 K4me3 peptide (stick; carbon, yellow). An overlay of the KDM1A K114me2 peptide (carbon, cyan) displayed as a stick model. (b) Fragment x0040 extends toward the CHD1 DCD D425-KDM1A R113 interface with the closest modelled atom 5.6 Å away from KDM1A R113. Dots indicate van der Waals spheres. (c) Electrostatic potential (isocontour, ±8 kT/e) surface model of the site 1 pocket showing the unique position of fragment x0040 within the pocket. Oxygen, nitrogen, and fluorine coloured red, blue, and light blue, respectively.

Residues E298 and K432, which are in close proximity to fragment x0040, occupy different positions in the bound and ground states; however, CHD1 does not appear to form any electrostatic interactions with fragment x0040 (Figure 4.18). If fragment x0040 is compared to the other fluoro- and chloro-phenyl-containing fragments bound at site 1 in orientation 1 (Figure 4.16), the absence of an amide or similar group between the two ring systems might explain why the fragment occupies a different position to all of the other fragments. Fragment x0040 may interact with the site 1 pocket through a combination of hydrophobic interactions, favourable van der Waals interactions, and displacement of high energy water molecules.

Figure 4.18 Fragment x0040 binds at site 1 in orientation 1 with high confidence. Event (a; BDC=0.72; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (d) for fragment x0040 indicate high confidence in the placement of fragment x0040 (c). Bound state of CHD1 DCD is shown as a cartoon model (pale blue). Residues within 7 Å of fragment x0040 (ball-and-stick; carbon, green) in the bound state (carbon, pale blue) and ground state (carbon, magenta) are shown as stick models. Oxygen, nitrogen, and fluorine coloured as red, blue, and grey, respectively.

One of the chlorophenyl-containing fragments, x0271, or ethyl (3-chlorophenyl)carbamate, has several interesting effects on the bound state of the protein compared to the ground state. The first major difference between the bound state and the ground state is the absence of H31-12 K4me3 peptide in the bound state. This was observed with fragments bound at the peptide binding interface in sites 2 and 7; however, it is interesting that even when a fragment does not bind directly to the peptide binding site, there is no evidence for the peptide being present in the bound state. Whether this is a direct consequence of fragment binding cannot be ascertained from the crystal structure alone. There are some positional changes of amino acid residues at the peptide binding surface. The majority of the altered residues in the peptide binding interface are from CD2 and are in close proximity to the site 1 binding pocket, suggesting the positional changes might be a consequence of fragment x0271 binding (Figure 4.19). Interestingly, the positions of the residues at the peptide-binding interface differ from the positions adopted in the H3 K4me3 peptide-bound or apo conformations, and there is no shift in the position of the aromatic cage residues. In addition, there are a number of residues near fragment x0271 that have altered conformations in the bound state. L428, Q389, I328, E298, and K432 all occupy different positions in the bound state compared to the ground state (Figures 4.19 and 4.20). The shift in the conformation of E298 allows a hydrogen bond interaction with the amide of fragment x0271. Figure 4.20 demonstrates that fragment x0271 has been modelled with high confidence at site 1 in orientation 1.

Figure 4.19 Residues from the peptide-binding interface adopt different positions in the fragment x0271 bound state. H3 K4me3 peptide (sticks; carbon, yellow) is only present in the ground state of x0271. Stick models of aromatic cage tryptophan residues are displayed. Residues that have a different orientation in the ground state (carbon, magenta) compared to the bound state (carbon, pale blue) have been modelled as sticks. CHD1 DCD bound state shown as a cartoon model. Fragment x0271 shown as a ball-and-stick model (carbon, green). Oxygen, nitrogen, and chlorine atoms are coloured red, blue, and grey, respectively.

The chlorophenyl moiety of fragment x0271 inserts deep into the pocket with the large chlorine atom occupying a small sub-pocket (Figure 4.21a,b). The carbamate moiety extends out of the pocket in the direction of the CHD1 D425-KDM1A R113 interface (Figure 4.21a,c). Fragment x0271 does not extend as far toward the CHD1 D425-KDM1A R113 interface as some of the other fragments in this orientation, with the minimum distance being 6.8 Å.
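A minimum fragment-to-residue distance of this kind can be obtained from atomic coordinates by taking the smallest pairwise interatomic distance between the two atom sets. The following is a minimal sketch, assuming the coordinates have already been extracted from the superposed models; the function name `min_distance` and the coordinates shown are hypothetical and are not taken from the structures described in this study.

```python
import math
from itertools import product


def min_distance(coords_a, coords_b):
    """Return the smallest Euclidean distance (in Å) between any atom of
    set A and any atom of set B. Each atom is an (x, y, z) tuple."""
    if not coords_a or not coords_b:
        raise ValueError("both atom sets must contain at least one atom")
    return min(math.dist(a, b) for a, b in product(coords_a, coords_b))


# Hypothetical coordinates (Å) chosen purely for illustration.
fragment_atoms = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (2.1, 1.2, 0.0)]
arginine_atoms = [(8.9, 1.2, 0.0), (9.5, 2.4, 0.5)]

print(f"minimum distance: {min_distance(fragment_atoms, arginine_atoms):.1f} Å")
```

The brute-force pairwise search is adequate for a fragment and a single side chain; for whole-protein comparisons a spatial index would be a more efficient choice.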

Figure 4.20 Modelling of fragment x0271 at site 1 is well supported. The event (a; BDC=0.70; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ) and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (d) for fragment x0271 (c) indicate very high confidence in fragment modelling. (a,b) Bound state of CHD1 DCD is shown as a cartoon model (pale blue). Residues within 7 Å of fragment x0271 in the bound state (carbon, pale blue) and the ground state (carbon, magenta) are displayed as stick models. Oxygen, nitrogen, and chlorine are coloured red, blue, and grey, respectively. Electrostatic interactions indicated by black dashed lines.

A common feature of many of the fragments bound at site 1 in orientation 1 is the presence of a highly polar functional group that sits in a small sub-pocket within site 1. Thus far only the fluoro and chloro groups have been highlighted; however, two fragments (x0332 and x0335) bound at site 1 in orientation 1 have a polar nitrile group that sits in this same sub-pocket. The short length of the nitrile C≡N triple bond (1.16 Å) allows two covalently bonded atoms to extend from the phenyl ring, rather than the single atom seen with the other fragments. As a result, the highly polar nitrile group sits very deep in one of the sub-pockets of site 1, opening up a narrow, highly electronegative channel through to the peptide binding interface (Figure 4.22b,c). Interestingly, there is no evidence to support the placement of the H3 K4me3 peptide in the bound state of this crystal.

Figure 4.21 Fragment x0271 bound at site 1 in orientation 1. (a) Surface model of a 1.94 Å X-ray crystal structure of a CHD1 DCD (grey) in complex with H3 K4me3 peptide (stick; carbon, yellow) and fragment x0271 (ball-and-stick; carbon, green). KDM1A K114me2 (stick; carbon, cyan) has been superposed onto the structure. (b) The carbamate moiety extends out of the pocket to within 6.8 Å of KDM1A R113. (c) Electrostatic surface potential (isocontour, ±8 kT/e) model of site 1. The chlorine of the chlorophenyl moiety occupies a sub-pocket of site 1. Grey dots represent the van der Waals sphere of Cl.

There is strong support for placement of fragment x0335 in the modelled orientation (Figure 4.23). Clear density for the fragment is observed in the event map and the 2mFo-DFc map (Figure 4.23a,b). All of the fragment validation metrics indicate high confidence in fragment modelling (Figure 4.23d). Fragment x0335 does not appear to form any electrostatic interactions with CHD1 DCD. Binding is likely driven by a mixture of van der Waals forces, hydrophobic interactions, and displacement of water molecules.

Figure 4.22 Fragment x0335 bound at site 1 in orientation 1. (a,b) Bound state surface model (grey) of the 1.63 Å X-ray crystal structure of CHD1 DCD in complex with fragment x0335 (ball-and-stick; carbon, green). H3 K4me3 peptide from the ground state of x0335 (stick; carbon, yellow) and KDM1A K114me2 peptide (stick; carbon, cyan) have been superposed. (b) The minimum distance between fragment x0335 and KDM1A R113 is 6 Å. (c,d) Electrostatic potential surface models (isocontour, ±8 kT/e) of CHD1 DCD. (c) Carved surface model of the site 1 pocket demonstrates the positioning of the nitrile group into a sub-pocket of site 1 and the opening of an electronegative channel extending toward the peptide binding interface (d). Oxygen and nitrogen atoms are coloured red and blue, respectively.

There are many different avenues that could be pursued to develop more potent compounds, as there are many different core scaffolds, functional groups, and interactions that could be optimised. A common feature of many of the fragments is the presence of a polar group that sits in a sub-pocket of site 1. In the case of fragment x0628, the hydroxyl group forms a hydrogen bond with the L428 carbonyl group. These polar groups represent one region that could be optimised. Another common feature is the presence of hydrogen bond donor groups, which can form hydrogen bonds with the oxygen atoms of the E298 carboxyl group. There are also possibilities to introduce several water-mediated hydrogen bonds with CHD1 DCD.

Figure 4.23 Modelling of fragment x0335 at site 1 in orientation 1 is well supported. Event (a; BDC=0.73; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (d) for fragment x0335 (c) indicate high confidence in modelling of fragment x0335. (a,b) Bound state of CHD1 DCD shown as a cartoon model (pale blue). Residues within 7 Å of fragment x0335 (ball-and-stick; carbon, green) in the bound state (carbon, pale blue) and the ground state (carbon, magenta) are modelled as sticks. Oxygen and nitrogen atoms are coloured red and blue, respectively.

Fragments such as x0666 and x0432-2 might form hydrogen bonds with a nearby water molecule, which in turn interacts with the carbonyl group of Q389 (Appendix Figures 7.7 and 7.10). In fragment x0478, one of the carbon atoms of the phenyl ring is replaced with a nitrogen atom. A hydrogen bond with a nearby water molecule is then possible, which in turn forms hydrogen bond interactions with the K432 amine and the carbonyl of Q389 (Appendix Figure 7.8). Optimisation of charge distribution across the phenyl group, a common moiety in fragments bound at site 1 in orientation 1, may be another optimisable feature of the fragments. Extending fragments out into the CHD1 D425-KDM1A R113 binding interface will be quite challenging and, unfortunately, none of the fragments bound at site 1 in orientation 1 is large enough to indicate exactly how this might be achieved. The CHD1 D425-KDM1A R113 interaction surface is quite charged, so addition of charged or polar groups to the end of some of the fragments bound in orientation 1 that extend out toward this surface might improve binding affinity by facilitating interactions with the K431 and/or D425 side chains. This might then enable inhibition of the CHD1-KDM1A K114me2 interaction by disrupting the important CHD1 D425-KDM1A R113 interaction.

4.5.3.2 Orientation 2

There are fewer fragments bound in orientation 2 compared to orientation 1. Fragments were classified as binding in orientation 2 if a significant proportion of the fragment extended out of the pocket into the fragment-induced opening on one side of the pocket. In total, there are nine different fragments bound at site 1 in orientation 2, with one of these, x0459, being a hybrid of orientations 1 and 2. Six of these fragments (x0110, x0125, x0134, x0232, x0240, and x0608) are bound to P2₁2₁2 crystals. Three fragments (x0157, x0459, and x0484) are bound to P2₁2₁2₁ crystals. Fragments x0459 and x0484 are observed at both sites 1 and 6. Fragment x0157 is bound at site 6 and also site 9. The accessibility of this area of the site 1 pocket for fragment binding was initially unexpected, as prior to fragment binding this side of the pocket is completely closed, but rearrangement of the K432 and E298 residues opens up this side of the pocket. Often these residues also participate directly in the binding of the fragments through formation of hydrogen bonding interactions. Similar to fragments that bind to site 1 in orientation 1, fragments in orientation 2 differ quite widely in the identity of their core poised reaction scaffolds, their functional groups, which part of the pocket they occupy, and how they interact with CHD1.

One fragment that was observed to bind in orientation 2 and could be modelled with very high confidence is fragment x0125, or 2,4-difluoro-6-(1H-pyrazol-5-yl)phenol (Figure 4.24). Similar to some of the fragments bound in orientation 1, fragment x0125 contains a phenyl ring with the 4-fluoro group occupying the site 1 sub-pocket (Figure 4.25b). Unlike the orientation 1 fragments, fragment x0125 contains multiple highly electronegative groups: two fluorine atoms and a hydroxyl group, altering the charge distribution of the fragment. The hydroxyl group points toward E298 but is not within hydrogen bonding distance. The other fluorine group points in the direction of the CHD1 D425-KDM1A R113 interface. The pyrazole moiety extends out into the fragment-induced opening of the pocket. The pyrazole may then interact with CHD1 E298 and K432 side chains via water-mediated hydrogen bond interactions (Figure 4.24).

Figure 4.24 Fragment x0125 is a high confidence hit at site 1 in orientation 2. Event (a; BDC=0.79; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) show strong evidence for the modelling of fragment x0125. (a,b) CHD1 DCD is shown as a cartoon model (pale blue). Residues within 7 Å of fragment x0125 (ball-and-stick; carbon, green) are modelled as sticks and coloured according to atom type (carbon, pale blue). Ground state orientations of E298 and K432 are modelled as sticks (carbon, magenta). Oxygen, nitrogen, and fluorine atoms are coloured red, blue, and grey, respectively. Intermolecular interactions are displayed as black dashed lines.

Figure 4.25 Fragment x0125 binds to CHD1 DCD at site 1 in orientation 2. (a) Bound state surface model of a 1.62 Å X-ray crystal structure of CHD1 DCD (grey) in complex with fragment x0125 (sticks; carbon, green) and H3 K4me3 peptide (sticks; carbon, yellow). Overlaid KDM1A K114me2 peptide is shown as a stick model (carbon, cyan). (b) Electrostatic surface model (isocontour, ±8 kT/e) of the site 1 pocket with fragment x0125 (ball-and-stick; carbon, green) bound. Oxygen, nitrogen, and fluorine atoms coloured red, blue, and light blue, respectively. Electrostatic potential determined using the APBS [81] plugin within PyMOL (Schrödinger, LLC).

Fragment x0240 (N-(2-furylmethyl)-1H-benzimidazol-2-amine) inserts deep into the site 1 pocket in orientation 2 (Figures 4.26 and 4.27). The benzimidazole moiety sits in the pocket and might form electrostatic interactions with the carboxyl group of E298 and the Q389 carbonyl. The furan moiety extends out of the pocket but does not form any obvious interactions with CHD1 DCD (Figures 4.26 and 4.27).

Figure 4.26 Binding site and validation of fragment x0240. Event (a; BDC=0.65; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) for fragment x0240 indicate good confidence in fragment modelling. Bound state CHD1 DCD modelled as a cartoon (pale blue). Residues in the bound state within 7 Å of fragment x0240 (ball-and-stick; carbon, green) are modelled as sticks. Ground state positions of E298 and K432 are modelled as sticks (carbon, magenta). Oxygen and nitrogen coloured red and blue, respectively. Intermolecular interactions indicated by black dashed lines.

An interesting and unique fragment bound at site 1 is x0459, dimethyl-1,3-phenylenebiscarbamate. This fragment is unusual as it not only extends out of the site 1 pocket into the fragment-induced opening in the pocket but also toward the CHD1 D425-KDM1A R113 interface (Figure 4.28). This is made possible by the unique chemical structure of this fragment (Figure 4.29). Fragment x0459 consists of a phenyl group, which sits deep in the pocket and has two aliphatic moieties that extend from this phenyl group in opposite directions: one in the direction of the bound-state induced opening of the pocket; and the other in the direction of orientation 1 fragments toward the CHD1 D425-KDM1A R113 interface.

Fragment x0459 contains two amide coupling poised core scaffolds and two ether synthesis poised core scaffolds, meaning at least two different synthetic reactions could be used to elaborate this fragment. Fragment x0459 is bound at both site 1 and site 6 in chains A and B, respectively. The position of these fragments in the binding pocket is almost identical (Figure 4.30). Validation metrics for both of the fragments (Figure 4.29e,f) indicate a high level of confidence in the modelling of both these fragments. Fragment x0459-2 has a high RMSD, which is the result of a manual positional adjustment of part of the fragment after PanDDA modelling. Hydrogen bonds appear to form between the amide groups of the two carbamate moieties and both the Q389 main chain carbonyl group and an oxygen atom of the E298 carboxyl moiety (Figure 4.29). These interactions are observed with both fragments. Fragment x0459-1 may also form a water-mediated hydrogen bond interaction with the backbone carbonyl group of Q389.

Figure 4.27 Complex X-ray crystal structure of His6-CHD1(270-443), H3(1-12) K4me3, and fragment x0240 (2.23 Å). His6-CHD1(270-443) is shown as a grey surface model. Binding of fragment x0240 (ball-and-stick; carbon, green) at site 1 in orientation 2 is demonstrated in the two images, the bottom being a 90° rotation of the top image. H3 K4me3 peptide (carbon, yellow) and a superposed KDM1A K114me2 peptide (carbon, cyan) are modelled as sticks. Oxygen and nitrogen atoms are coloured red and blue, respectively.

Figure 4.28 Fragment x0459 binds in a unique manner to site 1. Bound state of a 1.40 Å X-ray crystal structure of His6-CHD1(270-443) in complex with H3 K4me3 peptide and fragment x0459. Only one CHD1 DCD from the asymmetric unit is shown. The bottom image is a 90° rotation of the top image. The CHD1 DCD is represented as a grey surface model. Fragment x0459 (carbon, green), H3 K4me3 peptide (carbon, yellow) and a superposed KDM1A K114me2 peptide (carbon, cyan) are modelled as sticks. Oxygen and nitrogen atoms are coloured red and blue, respectively.

When the positions of all of the fragments bound in orientation 2 are compared, the fragments are seen to occupy different areas of the pocket. There is a sub-pocket of site 1 which could be targeted during initial optimisation of fragments bound in orientation 2 (Figure 4.31b; red circle). In fragments bound in orientation 1, there is an apparent preference for compounds with polar functional groups that occupy this region of the pocket. Therefore, the addition of polar functional groups, such as fluorine, chlorine, or nitrile groups, to fragments bound in orientation 2 might be one avenue that could be pursued for development of potency before extending the fragments out toward the CHD1 DCD D425-KDM1A R113 interface. In the case of fragment x0459, the functional group(s) could be added to the phenyl ring.

Figure 4.29 Evidence for placement of fragment x0459 at sites 1 and 6. Panels a, c, and e represent the fragment bound to chain A at site 1 (x0459-2). Panels b, d, and f represent the fragment bound to chain B at site 6 (x0459-1). Binding at the two sites is almost identical. Event maps for x0459-2 (a; BDC=0.77; magenta; 2 σ) and x0459-1 (b; BDC=0.79; magenta; 2 σ) are displayed. The 2mFo-DFc (blue; 1 σ) and mFo-DFc (green/red; ±3 σ) maps are shown for fragments x0459-2 (c) and x0459-1 (d). (a, b, c, d) CHD1 DCD shown as a cartoon model (pale blue); fragments are displayed as ball-and-stick models (carbon, green); and residues from the bound state (carbon, pale blue) and the ground state (carbon, magenta) within 7 Å of the fragment are modelled as sticks. Oxygen and nitrogen are coloured red and blue, respectively. (e, f) Validation metrics showing high confidence in fragment placement. (g) Chemical structure of fragment x0459. *High RMSD is the result of manual adjustment of fragment position after PanDDA modelling.

Figure 4.30 Fragment x0459 is a unique fragment bound at site 1. Fragment x0459 extends out toward the peptide interaction surface and in the direction of typical orientation 2 fragments. (a) CHD1 DCD is shown as a surface model (grey). Superposition of KDM1A K114me2 peptide (sticks; carbon, cyan) onto the CHD1 DCD, H3 K4me3 peptide (sticks, carbon, yellow), and fragment x0459 (sticks, carbon, green) complex structure reveals the fragment is 6.8 Å away from R113. (b) Site 1 pocket is shown as a carved electrostatic potential (isocontour, ±8 kT/e) surface model. The position of fragment x0459 bound to site 6 of chain B (carbon, green) is almost identical to that of the fragment bound to site 1 from chain A (carbon, magenta). Fragment x0459 shown as a stick model. Oxygen and nitrogen atoms are coloured red and blue, respectively.

Figure 4.31 Fragments bound at site 1 in orientation 2 occupy different areas of the pocket. (a) The nine different fragments bound at site 1 in orientation 2 are displayed as stick models, demonstrating the wide range of space occupied by the different fragments. Fragments are coloured by atom type. Carbon atoms from each fragment are coloured differently. (b) Fragments x0125 (carbon, green), x0240 (carbon, marine), and x0459 (carbon, pink) are displayed as stick models. Oxygen, nitrogen, and fluorine atoms are coloured red, blue, and cyan, respectively. The red dashed oval indicates a sub-pocket which, based on the observations from orientation 1 fragments, might be an area to target for initial fragment optimisation.

The presence of fragments bound at site 1 in two different orientations, accessing both overlapping and distinct areas of the same pocket, opens up exciting possibilities for elaboration of these fragments. This might be achieved in several different ways. The simplest optimisation would involve extending fragments bound in orientation 1 further toward and into the CHD1 D425-KDM1A R113 interface. To increase potency, these fragments could also be extended out into the fragment-induced opening of the site 1 pocket. Another approach would be to use a fragment bound in orientation 2 as a starting point for elaboration of the compound in the direction of the CHD1 D425-KDM1A R113 interface. A final approach might also be possible, in which synthons from fragments bound in orientation 1 are combined with synthons from fragments bound in orientation 2 using robust synthetic chemical reactions. If successful, this approach may hasten the development of potent compounds targeting this site.

In addition to fragments bound at site 1 and in the peptide binding interface, there are a number of fragments bound to five different sites at the surface of CHD1 DCD. For the purpose of inhibiting the interaction between CHD1 DCD and methylated proteins, the fragments bound at these sites do not immediately have a clear application. In the context of the full-length CHD1 protein, however, some of these sites may have an impact on the binding of methylated proteins or the general function of the enzyme. Some of the fragments bound to these different sites will be discussed.

4.5.4 Site 3

Site 3 is located near the C-terminus of the CHD1 DCD (Figures 4.4 and 4.5). Fragment x0137 is bound to site 3 in a P2₁2₁2 crystal and fragment x0321 is bound to site 3 in a P2₁2₁2₁ crystal. In both space groups the site is located near a CHD1 DCD symmetry molecule but neither fragment forms direct interactions with the symmetry molecule. Fragment x0137 forms a hydrogen bond interaction between its amide group and the Y440 hydroxyl group and a further hydrogen bond with the amide group of H397 through one of the oxygen atoms from the sulfonamide moiety (Figure 4.32a,b). Fragment x0321 forms several water-mediated interactions with K401, Q434, D438, and Y440 residues (Figure 4.32d,e). Additionally, the dimethylphenyl group might form π-stacking interactions with the phenyl ring of F441.

Figure 4.32 Fragments bound at site 3. Bound states of x0137 (a, b; 1.39 Å) and x0321 (d, e; 1.45 Å) displayed as cartoon models (pale blue). Fragments are displayed as ball-and-stick models (carbon, green). Residues within 7 Å of the fragments in the bound (carbon, pale blue) and ground (carbon, magenta) states are displayed as sticks and coloured by atom type. Intermolecular interactions indicated by black dashed lines. The event (magenta; 2 σ), 2mFo-DFc (blue; 1 σ), and mFo-DFc (green/red; ±3 σ) maps and validation metrics for x0137 (a-c) and x0321 (d-f) indicate high confidence in the modelling of these fragments. Oxygen, nitrogen, and sulfur atoms are coloured red, blue, and yellow, respectively.

Site 3 is of less interest than sites 1, 2, and 7 as it is not located near the methylated peptide binding interface and there is no evidence of this being a site for potential allosteric inhibition. However, this site could potentially be of some future interest as it might be located at an intramolecular domain-domain interaction surface. Alignment of the human CHD1 DCD with the recently published structure of yeast Chd1 bound to a nucleosome (PDB 5O9G; [120]) suggests that site 3 is located at the contact interface between CHD1 DCD and the DNA binding domain. The local environment of site 3 is likely to be very different in the context of the full-length CHD1 protein, which could limit the usefulness of the two fragment hits at site 3.

4.5.5 Site 4

Site 4 is a site specific to the P2₁2₁2 crystals and only one fragment, x0432, was modelled at this site. Fragment x0432 also binds at site 1 in orientation 1 (Appendix Figure 7.7). Fragment x0432-1 forms two electrostatic interactions with D356 and a water-mediated interaction with K355 (Figure 4.33). Fragment validation metrics indicate good confidence in fragment modelling (Figure 4.33). Site 4 is located at the first α-helix of the linker region (α3) between the two CDs. This region is generally poorly defined in P2₁2₁2 crystals, with weak electron density and high B-factor values for any modelled residues. Site 4 is not present at a known protein or DNA interaction surface and is not located near the peptide binding interface. It is therefore currently of less interest than other fragment binding sites. As more information is uncovered this site might become of interest as it is present at the important linker region.

Figure 4.33 Fragment x0432-1 bound at site 4. Event (a; BDC=0.61; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) indicate good confidence in modelling of fragment x0432-1. The bound state model of x0432-1 CHD1 DCD (1.58 Å) is displayed as a cartoon model (pale blue). Residues within 7 Å of fragment x0432-1 (ball-and-stick; carbon, green) in the bound (carbon, pale blue) and ground (carbon, magenta) states are displayed as sticks and coloured by atom type. Intermolecular interactions indicated with black dashed lines. Oxygen and nitrogen atoms coloured red and blue, respectively.

4.5.6 Site 5

Site 5 is located in a solvent-rich surface groove far away from the peptide binding interface (Figure 4.4). There is strong support for modelling of fragment x0622, 2-ethyl-N-(2-methylphenyl)butanamide, which is the only fragment bound at site 5 (Figure 4.34). The event (BDC=0.51; magenta; 2 σ; Figure 4.34a) and 2mFo-DFc (blue; 1 σ; Figure 4.34b) maps strongly support the presence of the fragment. There is, however, some positive difference density for part of the 2-ethylbutanamide moiety. This has resulted in a higher than expected RSZD value for the ligand. The likely reason for this is some variation in ground state models at this site. In some of the models this site is occupied by DMSO, whereas in other models there is only density to support the placement of water molecules. Fragment x0622 forms a large number of direct and water-mediated electrostatic interactions with CHD1 DCD. Currently, there is no information indicating this region directly participates in interdomain or nucleosomal interactions. However, given the quality of the fragment modelling and the number of interactions, fragment x0622 represents a very good starting point for development of inhibitory compounds if this region is found to be important for the biological function of CHD1.

Figure 4.34 High-confidence modelling of fragment x0622 at site 5. Event (a; BDC=0.51; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) for fragment x0622 indicate strong evidence for modelling of fragment x0622. (a,b) CHD1 DCD is shown as a cartoon model (pale blue). Residues from the bound (carbon, pale blue) and ground (carbon, magenta) states within 7 Å of fragment x0622 (ball-and-stick; carbon, green) are modelled as sticks and coloured by atom type. Oxygen and nitrogen are coloured red and blue, respectively. Intermolecular interactions are indicated with black dashed lines.

4.5.7 Site 8

Site 8 is located between the β1 and β2 strands from CD1 and α3 from the linker region. Site 8 is only observed in P2₁2₁2₁ crystals and only one fragment, x0507, could be modelled at site 8 with a close to acceptable level of confidence (Figure 4.5; Figure 4.35). The X-ray diffraction data from this crystal are not of high quality (Rmeas=0.123 over the full resolution range) and also exhibit translational non-crystallographic symmetry (tNCS). The fragment hit should therefore be interpreted with some caution and binding should be confirmed either through an orthogonal binding assay or through further soaking experiments with fragment x0507 to obtain a soaked crystal of higher quality. There is much less support for the positioning of the 1,3-benzodioxole moiety compared to the ethanesulfonamide moiety (Figure 4.35). The positions of the side chains of D356, Q357, and I316 are shifted in the bound state. This allows for a water-mediated hydrogen bonding interaction between the amide of the fragment and an oxygen atom from the carboxyl moiety of D356 and a hydrogen bond interaction between an oxygen from the ethanesulfonamide moiety and the Q357 amine group.
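For readers less familiar with the Rmeas statistic quoted for this dataset, the following is a minimal sketch of how the redundancy-independent merging R-factor is conventionally calculated from repeated intensity measurements: each reflection's summed absolute deviations from its mean intensity are weighted by sqrt(n/(n-1)), where n is its multiplicity, and the total is divided by the sum of all intensities. The function name `r_meas` and the intensity values are hypothetical illustrations only; the value reported in this study was taken from the data processing output and not from this code.

```python
import math


def r_meas(observations):
    """Redundancy-independent merging R-factor.

    observations: dict mapping a reflection index (h, k, l) to the list of
    individual intensity measurements of that reflection (including
    symmetry equivalents).
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            # A singly measured reflection gives no agreement information.
            continue
        mean_intensity = sum(intensities) / n
        weight = math.sqrt(n / (n - 1))  # multiplicity correction
        numerator += weight * sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no multiply measured reflections with non-zero intensity")
    return numerator / denominator


# Hypothetical intensities, for illustration only.
example_data = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 50.0, 56.0],
    (0, 0, 1): [75.0],  # ignored: measured only once
}

print(f"Rmeas = {r_meas(example_data):.3f}")
```

The multiplicity weighting is what distinguishes Rmeas from the older Rmerge, which tends to rise with multiplicity even when the merged data improve.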

Figure 4.35 Fragment x0507 bound at site 8 of CHD1 DCD. Event (a; BDC=0.64; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and fragment validation metrics (c) indicate weak evidence for the modelling of fragment x0507. (a,b) CHD1 DCD is shown as a cartoon model (pale blue). Residues from the bound (carbon, pale blue) and ground (carbon, magenta) states within 7 Å of fragment x0507 (ball-and-stick; carbon, green) are modelled as sticks and coloured by atom type. Oxygen, nitrogen, and sulfur are coloured red, blue, and yellow, respectively. Intermolecular interactions indicated with black dashed lines. (c) Most validation metrics are still acceptable despite the weak density; however, the B-factor ratio and the poor density fit of the 1,3-benzodioxole moiety indicate there are issues with modelling the fragment.

4.5.8 Site 9

Site 9 is located at the intermolecular interface of CHD1 chain A and chain B in the asymmetric unit of P2₁2₁2₁ crystals (Figure 4.5). The site is formed by β1 from CD1 of chain A, the α3 and α4 helices from the linker region of chain B, and the η2 helix from chain A. As site 9 is located at the interface between the two CHD1 DCD molecules, fragment binding at this site could be due to packing interactions. Site 9 is located far away from the peptide binding interface. Alignment of fragment-bound structures with the structure of yeast Chd1 bound to a nucleosome (PDB 5O9G; [120]) suggests that site 9 might be located near an interface between the linker region of CHD1 DCD and nucleosomal DNA. In total, five different fragments were observed to bind at site 9: x0157, x0210, x0453, x0542, and x0056 (Appendix 7.3.8). None of these compounds appear to form extensive interactions with CHD1. Confidence in the placement of fragment x0157 is high (Figure 4.36c) with almost no difference density observed at the selected contour level (3 σ; Figure 4.36b). There is a very large positional shift of R276 (β1, chain A) to form an interaction with the ethenone oxygen. There is also a shift in the position of Y373 (Figure 4.36b) in the bound state to potentially form π-stacking interactions with the dimethylphenyl ring. The oxygen atom from the morpholine moiety interacts via a water molecule with the hydroxyl group of T359. Fragment x0157 is also observed to bind at site 1 in orientation 2 (Appendix Figure 7.12).

Figure 4.36 Fragment x0157 bound at site 9 at crystal packing interface. Event (a; BDC=0.60; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and the fragment validation metrics (c) show strong evidence for the modelling of fragment x0157. (a,b) CHD1 DCDs are shown as cartoon models (chain A, pale blue; chain B, gold). Residues from the bound (carbon, pale blue) and ground (carbon, magenta) states within 7 Å of fragment x0157 (ball-and-stick; carbon, green) are modelled as sticks and coloured by atom type. Oxygen and nitrogen are coloured red and blue, respectively. Intermolecular interactions indicated with black dashed lines.

4.6 Outlook

In the process of performing fragment screening using X-ray crystallography, a robust crystallisation procedure for generating large numbers of soakable, high-resolution complex crystals of His6-CHD1(270-443) and H3(1-12) K4me3 was developed. As a result, any elaborated compounds developed from these fragments could be soaked readily into this crystal system, allowing rapid acquisition of structural information at all stages of elaboration. The crystals form within 6-8 days and do not decrease in quality within three months (longer incubation times were not tested). The protein can be expressed and purified very easily using a simple two-step purification with a very high yield (70-100 mg of protein per litre of media) and can be flash-cooled with liquid nitrogen and stored at -80 °C for at least a year (longer periods were not tested). Therefore, using the procedure outlined in chapter 2 it is possible to produce high-quality crystals, with a relatively small time investment, for soaking of tens or hundreds of fragments into different crystals. Once the protein has been purified and appropriate seed stocks have been developed and stored, structure-based optimisation could proceed very rapidly with minimal effort from the crystallographer. One day every two or three months could be devoted to preparation of tens or hundreds of crystals. The moment new compounds are synthesised, they can be soaked into any number of the crystals from the maintained stock and quickly and easily flash-cooled in liquid nitrogen for X-ray diffraction data collection. Analysis of diffraction data should also be rapid and require minimal effort, as the refined models and average statistical maps generated in this study can be used to assist compound identification and modelling. PanDDA [181] is available in the latest versions of CCP4 [168] and can utilise the pre-calculated average maps from this study to search only the new datasets for fragment-binding events.

X-ray crystallography does not provide a measure of the binding affinity of the compounds for CHD1 DCD; therefore, any further development of fragments identified in this study will require the establishment of a sensitive biophysical or biochemical assay to assess binding affinity. One difficulty faced when trying to develop a suitable assay system is the instability of His6-CHD1(270-443) at higher temperatures and low salt concentration. Sensitivity to high concentrations of DMSO, which will be present when measuring affinities of weakly binding fragments, will also likely present an obstacle for development of a suitable assay system. One interesting observation from the attempts to crystallise different constructs of CHD1 DCD was the identification of a more stable construct, less prone to precipitation than His6-CHD1(270-443). This construct, His6-3C-CHD1(267-449), has slightly extended N- and C-termini and a cleavable His6 tag. When cleaved to give CHD1(267-449), this construct could potentially be more suitable for development of binding assays, and initial experiments using it have been promising.


Another aspect that is important for development of chemical probes or inhibitors is an understanding of the potential off-target effects and selectivity of the compounds. In examining potential off-target effects and selectivity, a first step is to examine binding to closely-related family members or domains that share a similar binding profile. There are eight other CHD proteins for which far less is known about the DCDs compared to CHD1. The majority of studies with the DCDs from other CHD family members have used constructs with large GST-tags attached. The advantage of a large tag such as GST is that it may help to solubilise the DCDs; however, a major disadvantage is that such a large tag, particularly in relation to the small DCDs, may interfere with binding to interaction partners. A potentially more suitable purification tag is the much smaller His6 tag. I sought to generate suitable His6-tagged constructs and establish expression and purification protocols for several of the CHD family members: CHD2, CHD4, CHD6, CHD8, and CHD9 (summarised in Table 4.4).

Large amounts of pure CHD6 and CHD9 DCDs could be obtained and are suitable for binding assays. One construct of CHD8 DCD could be expressed and purified in low quantities. This construct is of particular interest as CHD8 DCD has not, to my knowledge, been expressed before without the addition of a large solubility tag and there is no robust binding data reported with this domain. The CHD2 DCD and the CHD4 DCD and PHD domains are expressed with high yields and are easily purified by a simple three-step procedure; however, an issue with stability at high protein concentrations means some further optimisation might be required. The expression and purification procedures established for these protein domains have the potential to assist future experiments, first for analysis of off-target effects of any compounds developed for CHD1, and second for structural and functional studies with these DCDs. It should now be possible to perform binding and structural studies with CHD6, CHD8, and CHD9 DCDs and, with some optimisation, potentially with CHD2 and CHD4. Such studies should help to reveal what, if any, methylation mark they recognise, and what role the domain may play in the function of the different CHD proteins. If potent inhibitory compounds are identified for CHD1 DCD, the constructs designed in this study could be suitable for establishing their selectivity for CHD1 DCD over other CHD family members in vitro.


Table 4.4 Summary of cloning and expression tests with CHD family members.

All constructs are DCDs from the respective proteins, except CHD4364-506, which are constructs of CHD4 PHD domains. pOPINF, expression vector encoding an N-terminal His6 tag with an HRV 3C protease cleavage site between the His6 tag and gene of interest (GOI). pOPINE, expression vector encoding a non-cleavable, C-terminal His6 tag. Expression transformation (exp trans) represents transformation into the E. coli expression strain Lemo21(DE3). PCR amp; amplification of GOI from synthetic DNA by PCR. Cloning; incorporation of GOI into chosen vector. IPTG (isopropyl β-D-1-thiogalactopyranoside) and AI (auto-induction); small scale expression tests in which expression of the protein of interest was induced using IPTG or AI methods. Up-scaled; scale-up of expression test to 1 L, followed by protein purification. Ticks indicate a positive result, and crosses indicate a negative result. The number of ticks indicates the relative level of protein expression, with one being low and three being high. n/a, not assessed.

There is still relatively little known about KDM1A K114 methylation, so it is difficult to predict what effect targeting the CHD1-KDM1A K114me2 interaction may have on different cell types and what off-target effects any inhibitory compounds may have. LC-MS experiments [11] detected monomethylated KDM1A K114 in addition to the dimethylated form. Trimethylated KDM1A K114 was not observed; however, it is possible that another methyltransferase could trimethylate K114. To understand what other effects methylation of K114 could have, we sought first to establish whether there are any protein domains that might interact with the different KDM1A K114 methylation states.


5 KDM4A: Another potential KDM1A K114me reader

5.1 KDM4A is a potential KDM1A K114me reader protein

In addition to the CHD1 DCD, another potential reader of KDM1A methylated at K114 was identified. This potential reader domain is the tandem Tudor domain (TTD) of KDM4A, which has previously been observed to be a reader of H3 K4me3 [51], H3 K23me3 [53], and H4 K20me3 [52] marks. Protein-domain microarray data provided by Mark Bedford’s lab (University of Texas) revealed a potential interaction between KDM4A-TTD and KDM1A K114me2/3 but not KDM1A K114me1 or unmethylated KDM1A peptide (Figure 5.1). Interestingly, no interaction was observed between KDM4C-TTD and methylated KDM1A peptides despite KDM4C-TTD being reported as also binding H3 K4me3 by us and others [53, 149]. Structural and biophysical experiments were performed to explore the interaction between KDM4A and KDM1A K114me2/3 in vitro.

5.2 KDM4A-TTD exists as a dimer

KDM4A895-1010 was observed to exist solely as a dimer in solution (Figure 5.2). X-ray crystal structures of apo-KDM4A895-1010, in two different space groups (I4132 and P63; Table 5.1), indicate that the dimer adopts a ‘head-to-tail’ conformation. In the I4132 crystal (1.95 Å resolution) the HTD-1 from both KDM4A molecules interact with the HTD-2 from the symmetry-related KDM4A molecules (Figure 5.3). R113 from HTD-1 occupies the aromatic cage of HTD-2 from the symmetry-related molecule, occluding the known binding site of other methylated peptides. Therefore, binding of KDM1A K114me2/3 will likely require disruption of this dimer.

5.3 KDM4A-TTD binds KDM1A K114me2/3

To investigate the specificity of KDM4A-TTD for KDM1A methylated K114 and the selectivity for a particular methylation state, ITC was performed with the KDM4A-TTD and KDM1A104-127, KDM1A104-127 K114me1, KDM1A104-127 K114me2, or KDM1A104-127 K114me3. Binding affinities of 63 µM, 68 µM, 2.2 µM, and 2.3 µM at 10 °C were determined for KDM1A104-127, KDM1A104-127 K114me1, KDM1A104-127 K114me2, and KDM1A104-127 K114me3, respectively.
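To put the ITC values on an energy scale, the short sketch below converts them to standard binding free energies and to a fold preference for the di- and trimethylated peptides. It assumes that the reported affinities are dissociation constants (Kd) referred to a 1 M standard state and uses the relation ΔG° = RT ln(Kd); the function and variable names are illustrative and are not taken from the analysis software used in this work.

```python
import math

R = 8.314462618          # gas constant, J/(mol*K)
T_ITC = 273.15 + 10.0    # temperature of the ITC experiments, K

# Affinities reported in the text, treated here as Kd values in mol/L (assumption).
KD = {
    "unmethylated": 63e-6,
    "K114me1": 68e-6,
    "K114me2": 2.2e-6,
    "K114me3": 2.3e-6,
}


def delta_g_kj(kd: float, temperature: float = T_ITC) -> float:
    """Standard binding free energy in kJ/mol, dG = RT ln(Kd / 1 M)."""
    return R * temperature * math.log(kd) / 1000.0


def fold_preference(kd_weak: float, kd_tight: float) -> float:
    """How many times tighter the second ligand binds than the first."""
    return kd_weak / kd_tight


if __name__ == "__main__":
    for name, kd in KD.items():
        print(f"{name:>12}: Kd = {kd * 1e6:5.1f} uM, dG = {delta_g_kj(kd):6.2f} kJ/mol")
    fold = fold_preference(KD["unmethylated"], KD["K114me2"])
    ddg = delta_g_kj(KD["K114me2"]) - delta_g_kj(KD["unmethylated"])
    print(f"K114me2 vs unmethylated: {fold:.1f}-fold, ddG = {ddg:.2f} kJ/mol")
```

Under these assumptions, dimethylation of K114 corresponds to a roughly 29-fold gain in affinity, or close to 8 kJ/mol, whereas the mono-methylated and unmethylated peptides are practically indistinguishable.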


Figure 5.1 Protein-domain microarray. (a) Schematic of the protein-domain microarray. M is GST. (b) List of GST fused protein domains bound to nitrocellulose. (c) Microarrays probed with either Cy3-labelled KDM1A104-127, KDM1A104-127 K114me1, KDM1A104-127 K114me2, KDM1A104-127 K114me3, or anti-GST antibody, followed by visualisation with a fluorescein isothiocyanate-conjugated secondary antibody. (d) Magnified image of data from box C showing interaction of KDM4A-TTD with KDM1A K114me2 and KDM1A K114me3. Data was provided by the lab of Mark Bedford (University of Texas).


Figure 5.2 SEC-MALS results indicate KDM4A895-1010 exists as a dimer in solution. Absorption at 280 nm (blue) and molecular weight (red) are displayed.

Figure 5.3 Head-to-tail dimer occludes the aromatic cage. (a) Model of a 1.95 Å resolution structure of KDM4A895-1010 (space group I4132) showing head-to-tail binding between HTD-2 (black oval) and HTD-1 (green oval) of a symmetry-related molecule. (b) R113 from a symmetry-related molecule (blue) occupies the aromatic cage.

All of the KDM1A104-127 peptides interact in a 1:1 stoichiometry with His6-KDM4A895-1010, indicating that the peptides interact at a single site from each KDM4A895-1010 molecule.


Figure 5.4 ITC binding assays between His6-KDM4A895-1010 and KDM1A104-127 K114 peptides with differing methylation states. Raw titration data and integrated heat measurements of representative ITC binding experiments are displayed in the upper and lower plots, respectively.


5.4 KDM1A K114me2/3 peptides bind at the aromatic cage of KDM4A HTD-2

To uncover the binding mode of KDM4A-TTD and KDM1A108-119 K114me2/3, X-ray co-crystal structures with either the di- or tri-methylated peptides were determined (Table 5.1). A 1.32 Å resolution co-crystal structure of His6-KDM4A895-1010 bound to KDM1A K114me3 was determined in space group P1. In these crystals there are two KDM4A molecules in the asymmetric unit and one KDM1A K114me3 peptide. The KDM1A K114me3 peptide is bound between the HTD-2 from the two different KDM4A-TTD in the asymmetric unit (Figure 5.5a).

A 1.63 Å resolution X-ray co-crystal structure of KDM4A895-1010 bound to KDM1A108-119 K114me2 was determined in space group P21. As was the case with the P1 structure, there are two KDM4A895-1010 molecules and one KDM1A108-119 K114me2 peptide in the asymmetric unit (Figure 5.5b). In this crystal the KDM1A108-119 K114me2 peptide does not sit between the two KDM4A895-1010 molecules in the asymmetric unit but is instead bound between one of the copies and a symmetry-related KDM4A895-1010 molecule (Figure 5.5c).

Figure 5.5 KDM1A108-119 K114me2/3 peptides are bound at a likely crystal packing interface. KDM4A-TTD are shown in cartoon representation and coloured according to chain: chain A is coloured blue; chain B contains the aromatic cage that binds the methyllysine side chain and is wheat coloured; and symmetry-related molecules are coloured grey. KDM1A108-119 K114me3 (a) and K114me2 (b,c) peptides are modelled as sticks and coloured by atom type: carbon, green; oxygen, red; and nitrogen, blue. (a) Asymmetric unit of His6-KDM4A895-1010 and KDM1A108-119 K114me3 complex (space group, P1). (b) Asymmetric unit of KDM4A895-1010 and KDM1A108-119 K114me2 complex (space group, P21). (c) A chain A symmetry-related molecule packs against the KDM1A108-119 K114me2 peptide and the HTD-2 of chain B.

Five residues (112-116) of the KDM1A108-119 K114me3 peptide and ten residues (108-117) of the KDM1A108-119 K114me2 peptide could be modelled confidently (Figure 5.6). In both structures extensive interactions are formed between the KDM1A peptides and both KDM4A subunits (Figure 5.6).


Table 5.1 X-ray diffraction data collection and refinement statistics

                                    apo                        apo                        KDM1A108-119 K114me2       KDM1A108-119 K114me3
X-ray source                        Diamond Light Source, I04  Diamond Light Source, I04  Diamond Light Source, I04  Diamond Light Source, I04
X-ray detector                      Dectris Pilatus-6M         Dectris Pilatus-6M         Dectris Pilatus-6M         Dectris Pilatus-6M
Wavelength [Å]                      0.9795                     0.9795                     0.9795                     0.9795
Space group                         I4132                      P63                        P21                        P1
Cell constants a, b, c [Å]          134.84, 134.84, 134.84     78.65, 78.65, 86.06        47.64, 52.93, 51.27        31.96, 42.95, 48.82
Cell constants α, β, γ [°]          90.00, 90.00, 90.00        90.00, 90.00, 120.00       90.00, 113.12, 90.00       73.41, 75.08, 78.84
KDM4A mol. in asym. unit            1                          2                          2                          2
Resolution limits                   55.05-1.95 (1.98-1.95)a    36.39-1.64 (1.70-1.64)     41.15-1.63 (1.66-1.63)     45.70-1.32 (1.34-1.32)
Completeness [%]                    100.0 (100.0)              99.8 (99.9)                99.1 (93.3)                98.5 (88.1)
Unique reflections                  15,588 (774)               36,940 (3,669)             29,184 (1,412)             55,177 (2,483)
Multiplicity                        18.3 (16.8)                6.6 (6.1)                  4.2 (3.0)                  4.2 (3.2)
Overall B factor, Wilson plot [Å2]  44.4                       37.5                       25.21                      26.26
Rmerge                              0.058 (2.426)              0.035 (0.630)              0.052 (0.624)              0.038 (0.557)
Rmeas                               0.061 (2.563)              0.038 (0.688)              0.060 (0.747)              0.051 (0.750)
Rp.i.m.                             0.014 (0.621)              0.015 (0.276)              0.028 (0.441)              0.033 (0.499)
Mean I/σ(I)                         20.0 (1.3)                 22.5 (2.8)                 14.6 (1.6)                 13.5 (1.5)
CC1/2                               0.999 (0.445)              0.999 (0.826)              0.999 (0.581)              0.999 (0.763)

Refinement statistics
Rwork/Rfree                         0.216/0.232                0.163/0.198                0.176/0.208                0.147/0.183
No. of atoms
  Protein                           919                        1,834                      1,793                      1,933
  Peptide                           N/A                        N/A                        93                         50
  Ligands                           24                         N/A                        43                         15
  Water                             33                         103                        241                        266
B factors [Å2]
  Protein                           51.83                      38.41                      29.89                      26.43
  Peptide                           N/A                        N/A                        47.26                      45.45
  Ligands                           96.05                      N/A                        46.89                      35.37
  Water                             55.83                      46.81                      40.50                      32.87
R.M.S.D
  Bonds [Å]                         0.003                      0.017                      0.010                      0.011
  Angles [°]                        0.545                      1.76                       1.02                       1.42

aValues in parentheses are from highest resolution shell.
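For orientation, the merging statistics listed in Table 5.1 are usually defined as written below. These are the standard textbook definitions, stated here as an assumption because the thesis does not restate them and individual data-processing programs can differ in detail. Here I_i(hkl) is the i-th of n independent observations of reflection hkl and ⟨I(hkl)⟩ is their mean.

```latex
\begin{align}
R_{\mathrm{merge}} &= \frac{\sum_{hkl}\sum_{i=1}^{n}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}
                           {\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)} \\[4pt]
R_{\mathrm{meas}}  &= \frac{\sum_{hkl}\sqrt{\frac{n}{n-1}}\sum_{i=1}^{n}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}
                           {\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)} \\[4pt]
R_{\mathrm{p.i.m.}} &= \frac{\sum_{hkl}\sqrt{\frac{1}{n-1}}\sum_{i=1}^{n}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}
                            {\sum_{hkl}\sum_{i=1}^{n} I_i(hkl)}
\end{align}
```

Because the three quantities differ only in the multiplicity-dependent prefactor, Rmeas is always at least as large as Rmerge, and Rp.i.m. falls as the multiplicity rises, which is consistent with the low Rp.i.m. of the highly redundant I4132 dataset.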


Figure 5.6 KDM1A108-119 K114me2/3 peptides form extensive interactions with two KDM4A-TTD molecules. (a) KDM1A K114me2 peptide (carbon, green) forms extensive interactions with one KDM4A895-1010 subunit in the asymmetric unit (carbon, wheat) and a symmetry molecule (carbon, grey). (b) KDM1A K114me3 peptide (carbon, green) forms extensive interactions with both His6-KDM4A895-1010 subunits in the asymmetric unit (subunit A carbon, wheat; and subunit B carbon, marine). Oxygen and nitrogen atoms coloured red and blue, respectively. IMD, imidazole.

Figure 5.7 X-ray crystal structure at 1.63 Å resolution of KDM4A895-1010 and KDM1A108-119 K114me2 peptide. Electrostatic potential (isocontour, 8 kT/e) surface model of KDM4A895-1010 bound to KDM1A108-119 K114me2 peptide, displayed in stick representation and coloured by atom type (carbon, green; oxygen, red; and nitrogen, blue). A 2mFo-DFc omit map for KDM1A108-119 K114me2 peptide (1 σ) is displayed as blue mesh. Electrostatic model was generated using the APBS [81] plugin within PyMOL (Schrödinger, LLC).

KDM1A108-119 K114me2 is bound to the acidic surface of HTD-2 of KDM4A895-1010, which is more negatively charged than HTD-1 (Figure 5.7). The K114me2 side chain is bound in a very negatively charged aromatic pocket (Figure 5.7). The dimethylammonium group of K114 forms cation-π interactions with the aromatic side chains of F932, W967, and Y973 (Figure 5.8a). Interestingly, the K114me2 side chain is positioned so that the lone pair of electrons of the Nε atom is oriented toward D934 enabling an electrostatic interaction with the carboxyl group of D934. The majority of the other interactions between KDM1A108-119 K114me2 and this KDM4A895-1010 subunit occur through interactions between the main chain atoms of the peptide chains. The guanidino moiety of KDM1A R113 appears to form two salt bridging interactions with the oxygen atoms from the carboxyl group of D939 (Figure 5.8b). However, something that remains unclear is to what degree the position of this residue in the crystal structure might be the result of crystal packing interactions with a symmetry-related molecule (Figure 5.9). The guanidino moiety of R113 forms interactions with the hydroxyl group of S936 and the carboxyl group of D934 from the symmetry molecule. R113 may also form a weak cation-π interaction with an imidazole molecule that occupies the aromatic cage of the symmetry molecule.

Figure 5.8 KDM1A R113 and K114me2 side chains form interactions with KDM4A895-1010. (a) The dimethylammonium group of K114me2 (carbon, green) forms cation-π interactions with F932, W967, and Y973 and interacts electrostatically with D934. (b) KDM1A R113 forms salt bridging interactions with D939. KDM4A895-1010 modelled as a cartoon (wheat). Interacting residues modelled as sticks and coloured by atom type. Oxygen and nitrogen coloured red and blue, respectively.

The KDM1A108-119 K114me3 peptide also interacts with one of the His6-KDM4A895-1010 subunits through cation-π interactions between the trimethylammonium moiety and the aromatic side chains of F932, W967, and Y973 (Figure 5.10). KDM4A-TTD was observed in ITC experiments to interact in solution with methylated KDM1A K114 and histone peptides with a stoichiometry of 1:1 (Figure 5.4; [51-53]). Therefore, the 2:1 stoichiometry in the P1 and P21 crystal structures (Figure 5.6) is unlikely to occur in solution, and the interactions between the KDM1A K114me2/3 peptides and the second KDM4A-TTD (Figure 5.6; subunit B) are likely to be crystallisation artefacts. They cannot, however, be completely discounted.

Figure 5.9 An imidazole molecule occupies the aromatic cage of a KDM4A895-1010 symmetry molecule. R113 from KDM1A108-119 K114me2 forms several direct interactions with a KDM4A895-1010 symmetry molecule and also may interact weakly via a cation-π interaction with an imidazole (IMD) molecule occupying the aromatic cage of the KDM4A symmetry molecule. KDM1A108-119 K114me2 peptide (carbon, green) and interacting residues from a symmetry molecule (carbon, grey) modelled as sticks and coloured by atom type. Oxygen and nitrogen coloured red and blue, respectively.

Figure 5.10 KDM1A K114me3 forms cation-π interactions with the aromatic cage of KDM4A895-1010. The trimethylammonium group of KDM1A108-119 K114me3 (carbon, green) interacts with F932, W967, and Y973 (carbon, wheat). (a) Interactions between the centre of the trimethylammonium and the aromatic rings are indicated by black dashed lines. (b) Van der Waals spheres of the methyllysine carbon atoms are indicated with green dots. His6-KDM4A895-1010 displayed as a cartoon model. Sticks are coloured by atom type. Oxygen and nitrogen are coloured red and blue, respectively.


Crystal packing forces do seem to affect the positioning of the peptides. Although both the di- and tri-methylammonium groups adopt positions within the aromatic cage that are consistent with previous studies, and form interactions that make sense in light of the binding affinities measured by ITC, the two peptides are positioned differently (Figure 5.11). The KDM1A K114me2/3 peptides are bound in orientations that are more similar to H4 K20me3 than to either H3 K4me3 or H3 K23me3 (Figure 5.11b). Further experiments would be required to conclude whether both, one, or neither of the binding modes observed for the KDM1A K114me2/3 peptides accurately depicts the interaction occurring in solution. Mutation of D939 to abolish the salt bridging interactions with R113 could be used to determine the importance of these interactions for binding. One could also try to crystallise the complex in a different crystal form in which such packing does not occur; however, extensive efforts failed to yield such crystals. Given the small size of the KDM4A-TTD, the most definitive approach would be to determine a model of the complex in solution using NMR.

Figure 5.11 KDM1A108-119 K114me2/3 peptides bind in an orientation similar to H4 K20me3. (a) KDM1A108-119 K114me2/3 peptides bind in the same orientation but differ substantially in position. KDM1A K114me2 (carbon, green) and KDM1A K114me3 (carbon, wheat) peptides displayed as stick models. (b) Binding orientation of KDM1A108-119 K114me2/3 peptides is similar to H4 K20me3. Ribbon models of the peptides are displayed: KDM1A K114me2 (green); KDM1A K114me3 (blue); H3 K4me3 (pink); H3 K23me3 (yellow); and H4 K20me3 (grey). HTD-2 of KDM4A-TTD modelled as cartoon (wheat).

5.5 Other KDM4 family members do not interact specifically with methylated KDM1A K114

There are four other members of the KDM4 demethylase subfamily (Figure 5.12a). Two, KDM4D and KDM4E, do not possess a TTD, whereas KDM4B and KDM4C do possess TTDs (Figure 5.12a,b). Prompted by the observation that KDM4C-TTD did not appear to bind KDM1A K114 methylated peptides, we tested whether cell lysate from HEK293 cells transfected with Flag-KDM4A, Flag-KDM4B, Flag-KDM4C, Flag-KDM4D, or myc-KDM4E would bind specifically to KDM1A peptides methylated at K114 (experiment performed by Dr Henriette Franz, Schüle lab). Western blot analysis with anti-Flag or anti-myc antibodies revealed that only KDM4A appears to bind specifically to KDM1A trimethylated at K114 (Figure 5.12c).

Figure 5.12 KDM4A is the only member of the KDM4 subfamily that interacts specifically with KDM1A104-127 K114me3. (a) Domain architecture of KDM4 family members. JmjN (wheat); JmjC (blue); PHD (red); and Tudor domains (green). (b) Sequence alignment of the TTDs of KDM4A, KDM4B, and KDM4C. Black diamonds indicate aromatic residues that interact with di- and tri-methylammonium groups. Black circle indicates an interacting residue shared between the KDM1A108-119 K114me2 and K114me3 peptides. Red triangles indicate residues only interacting with KDM1A K114me2 peptide. Blue triangles indicate residues only interacting with KDM1A K114me3 peptide. (c) Peptide pull-down assay with Flag-KDM4A, Flag-KDM4B, Flag-KDM4C, Flag-KDM4D, and myc-KDM4E overexpressed in HEK293 cells. Visualised by western blot with anti-Flag and anti-myc specific antibodies (Dr. Henriette Franz). (d) Structural alignment of HTD-2 from KDM4A (wheat), KDM4B (grey; PDB 4UC4; [53]), and KDM4C (marine; PDB 2XDP). Presence of R919 in place of an acidic residue might explain lack of binding by KDM4C.

Sequence and structural comparison of the TTDs of KDM4B and KDM4C with the KDM4A-KDM1A108-119 K114me2/3 complexes does not explain the lack of methylation-specific binding by KDM4B. However, the substitution of R919 in KDM4C in place of an aspartate residue present in the KDM4A and KDM4B TTDs may provide an explanation for the absence of KDM1A K114me2/3 binding by KDM4C (Figure 5.12b,c,d).

Further structural, biochemical, and biophysical experiments are needed to fully explore the binding mechanism of KDM4A-TTD and KDM1A K114me2/3, and to explain the absence of specific binding to KDM1A K114me3 by KDM4B and KDM4C. Extensive efforts to obtain KDM4A895-1010-KDM1A108-119 K114me2/3 co-crystals in which the peptide does not form interactions with two different KDM4A895-1010 molecules were unsuccessful. A possible reason for this might be the charge of the two HTDs: HTD-2, which binds the K114me2 side chain, is negatively charged, whereas HTD-1 is much more positively charged (Figure 5.7). This possibly explains the head-to-tail interaction observed between the HTDs in the apo crystals (Figure 5.3). The addition of the positively charged KDM1A108-119 K114me2/3 peptides prevents this association. Formation of co-crystals then proceeds through interaction of the negatively charged HTD-2 from one KDM4A895-1010 with the positively charged KDM1A108-119 K114me2 peptide bound to the HTD-2 of another KDM4A895-1010 molecule. Given the extensive effort and difficulty involved in obtaining suitable KDM4A895-1010-KDM1A108-119 K114me2/3 co-crystals, the best approach to uncover new mechanistic details about this interaction could be the use of NMR, complemented by mutagenesis studies. Additionally, the in vivo relevance of this interaction needs to be established.


6 Discussion

In this study we identified a novel methylation site in the flexible N-terminal region of KDM1A at K114. KDM1A K114 is mono- and dimethylated by EHMT2 in an androgen-dependent manner. Dimethylated KDM1A K114 was shown to be specifically recognised by CHD1 both in vitro and in vivo. The EHMT2-KDM1A K114me2-CHD1 circuit controls the recruitment of AR to chromatin, thereby regulating AR-dependent gene expression. Androgen-dependent chromosomal rearrangements such as TMPRSS2-ERG gene fusions were also demonstrated to be regulated by this circuit. A 1.6 Å resolution X-ray co-crystal structure of CHD1 DCD and KDM1A108-119 K114me2 peptide revealed the recognition mode of KDM1A K114me2 by CHD1 DCD. The binding of KDM1A K114me2 peptide was observed to be similar to that of H3 K4me3 and NS1 K229me2/3 by CHD1 DCD. One major difference between the binding of KDM1A K114me2 and these two peptides was observed. A unique electrostatic interaction between CHD1 D425 and R113 of KDM1A was demonstrated to be important for the binding of KDM1A K114me2. We postulated that a pocket close to this binding interface might be targeted by small molecule compounds, which might then be elaborated to develop an inhibitor or chemical probe specifically targeting the interaction between CHD1 and KDM1A K114me2. These might then be developed into first-in-class inhibitors of AR functions for the treatment of prostate cancer. Development of compounds specifically targeting the CHD1 interaction with KDM1A K114me2 would also enable the function of this interaction to be separated from the interaction between CHD1 and H3 K4me3, and possibly any other as yet unidentified interaction partners.

Fragment screening using X-ray crystallography was used to identify compounds that bind to different areas of CHD1 DCD. In total, 46 different fragments were identified as binding to nine different sites at the surface of CHD1 DCD. The pocket proposed, prior to fragment screening, as a suitable site for the development of CHD1-KDM1A K114me2 specific inhibitors was identified as a fragment hotspot, with 33 unique fragments shown to bind to this pocket. Fragments binding at this site cause a conformational change which opens up one side of the pocket, allowing fragments at this site to bind in two distinct orientations. This creates the opportunity to combine overlapping fragments bound in different orientations to possibly increase affinity and selectivity of the compounds. Many of the fragments bound to this site extend out from this pocket toward the CHD1 D425-KDM1A R113 interface; however, significant chemical elaboration will be required, first, to increase the potency of the fragment hits and, second, to extend the compounds further into the CHD1 D425-KDM1A R113 interface to specifically inhibit this interaction. Despite the extensive further work required to develop potent and specific compounds targeting the interaction between CHD1 and KDM1A K114me2, the identification of a fragment hotspot near an important binding interface is a very promising result, and indicates that development of such compounds might be feasible.

One fragment was identified that binds to the aromatic cage of CHD1 and overlaps with the site of methyllysine binding. This fragment is the first reported small molecule bound to the aromatic cage of a DCD. Another fragment bound to the methylated peptide binding interface of CHD1 DCD was identified. The binding site of this fragment overlaps with those observed for KDM1A K114me2, H3 K4me3, and NS1 K229me2. The peptide binding interface of CHD1 DCD is a shallow, electronegative surface groove, potentially limiting the potency of small molecule inhibitors targeting this site. One possible way to overcome this could be the development of peptide-derived inhibitors by utilising the structural information available for fragment x0210-2 bound to the aromatic cage, the N-terminal peptide overlapping fragment (x0381), and the KDM1A K114me2, H3 K4me3, and NS1 K229me2/3 peptide-bound structures of CHD1 DCD. The peptide-derived character of such inhibitors may decrease their utility in vivo because of limited cell permeability and compound stability, but might allow a potent inhibitory compound to be developed. Such a compound might then be used to study how the binding of methylated peptides by CHD1 DCD affects chromosomal localisation and the nucleosome remodelling and spacing functions in mammalian cells.

There is still relatively little known about the function of KDM1A K114 methylation and the methylation-specific binding of CHD1. It remains to be elucidated whether KDM1A K114 is methylated in other cell types in the presence or absence of androgens. Can other hormones stimulate methylation of KDM1A K114 (e.g. oestrogen)? Other questions also remain. For instance, is there a demethylase that can act on this methylation mark; what, if any, cross-talk exists between KDM1A K114me2 and S111 phosphorylation; can different methyltransferases methylate K114 in other cell types; and do other methylation states exist? The X-ray structure obtained was only of the CHD1 DCD and a short KDM1A K114me2 peptide. The rapid advances being made in cryo-electron microscopy open up the possibility of determining a structure of full-length CHD1 with methylated KDM1A K114 to reveal more information about how these two large proteins might interact and also whether the flexible N-terminal region of KDM1A forms any other interactions with CHD1.

Finally, are there any other reader proteins of methylated or unmethylated KDM1A K114? In this study the TTD of KDM4A was identified as a potential reader domain of KDM1A K114me2/3 in vitro. X-ray co-crystal structures of KDM4A TTD with KDM1A K114me2/3 peptides suggest a possible binding mechanism; however, further binding studies with mutant proteins and structural studies are required to support this binding mode. The in vivo relevance of this interaction is also yet to be elucidated and binding between full-length KDM4A and KDM1A K114me2/3 has not yet been demonstrated. Experiments investigating KDM1A K114 methylation in different cell types may help to uncover whether these two proteins interact in vivo and what function, if any, this interaction has.


7 Appendix A: Fragment screening

7.1 X-ray diffraction data collection, refinement, and fragment validation statistics

Table 7.1 X-ray diffraction data collection, refinement, and fragment validation statistics for x0003, x0021, x0040, x0051, and x0056.

| | x0003 | x0021 | x0040 | x0051 | x0056 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | XDS-3d | XDS-3d | DIALS | XDS-3d | DIALS |
| Space group | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2₁ |
| a, b, c (Å) | 110.73, 44.60, 46.79 | 110.43, 45.46, 47.01 | 110.53, 45.26, 47.14 | 111.05, 45.28, 47.06 | 45.49, 93.66, 110.56 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 28.98-1.76 (1.81-1.76)† | 31.34-1.85 (1.90-1.85) | 47.14-1.81 (1.86-1.81) | 55.53-1.67 (1.71-1.67) | 47.61-1.32 (1.35-1.32) |
| Rmerge | 0.075 (1.675) | 0.077 (1.643) | 0.062 (1.827) | 0.027 (1.175) | 0.042 (0.812) |
| Rmeas | 0.089 (2.001) | 0.091 (1.927) | 0.074 (2.157) | 0.032 (1.387) | 0.050 (1.079) |
| Rpim | 0.047 (1.083) | 0.048 (0.996) | 0.039 (1.137) | 0.017 (0.731) | 0.027 (0.705) |
| I/σI | 11.3 (1.0) | 13 (1.2) | 12 (1.2) | 24.7 (1.4) | 13.7 (1.3) |
| CC1/2 | 0.998 (0.604) | 0.998 (0.539) | 0.999 (0.664) | 0.999 (0.704) | 0.999 (0.582) |
| Wilson B factor (Å²) | 34.22 | 29.93 | 31.22 | 33.44 | 15.42 |
| Completeness (%) | 99.7 (99.9) | 99.9 (100) | 100 (99.8) | 99.8 (99.9) | 99.6 (97.0) |
| Unique reflections | 23,684 (1,738) | 20,398 (1,548) | 22,288 (1,571) | 28,288 (2,062) | 111,238 (7,901) |
| Multiplicity | 6.4 (6.3) | 6.5 (6.8) | 6.4 (6.5) | 6.5 (6.8) | 5.7 (3.9) |
| Refinement | | | | | |
| Rwork/Rfree | 0.204/0.244 | 0.203/0.255 | 0.217/0.250 | 0.221/0.255 | 0.175/0.198 |
| Molprobity score | 1.30 | 1.26 | 1.08 | 1.53 | 0.93 |
| R.M.S.D. bonds (Å) | 0.017 | 0.017 | 0.015 | 0.018 | 0.012 |
| R.M.S.D. angles (°) | 1.662 | 1.643 | 1.555 | 1.771 | 1.374 |
| Fragment | Site 1 | Site 1 | Site 1 | Site 1 | Site 9 |
| Occupancy | 0.61 | 0.63 | 0.63 | 0.54 | 0.59 |
| RSCC | 0.91 | 0.94 | 0.94 | 0.86 | 0.73 |
| RMSD | 0.3 | 0.16 | 0.31 | 0.44 | 1.88* |
| RSZD | 2.0 | 0.7 | 1.9 | 0.8 | 0.1 |
| B-factor ratio | 1.04 | 0.87 | 1.13 | 1.18 | 1.38 |
| RSZO/OCC | 3.93 | 4.44 | 3.65 | 1.85 | 1.86 |

†Values in parentheses are from highest resolution shell.
*High RMSD of fragment x0056 is a result of manual adjustment of the fragment after generation of the ensemble model.
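For orientation, the merging statistics reported in these tables are conventionally defined as below. This is a reference sketch only: it assumes that the autoprocessing pipelines listed in the tables report the standard definitions, which this appendix does not state explicitly. The symbols are introduced here for illustration: I_i(hkl) is the i-th observation of reflection hkl, ⟨I(hkl)⟩ is the mean over its n(hkl) observations, and sums run over all reflections and observations.

```latex
\begin{align}
R_{\mathrm{merge}} &= \frac{\sum_{hkl}\sum_{i}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}
                          {\sum_{hkl}\sum_{i} I_i(hkl)} \\[4pt]
R_{\mathrm{meas}}  &= \frac{\sum_{hkl}\sqrt{\dfrac{n(hkl)}{n(hkl)-1}}\,\sum_{i}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}
                          {\sum_{hkl}\sum_{i} I_i(hkl)} \\[4pt]
R_{\mathrm{pim}}   &= \frac{\sum_{hkl}\sqrt{\dfrac{1}{n(hkl)-1}}\,\sum_{i}\left|I_i(hkl)-\langle I(hkl)\rangle\right|}
                          {\sum_{hkl}\sum_{i} I_i(hkl)}
\end{align}
```

Under these definitions Rmeas corrects Rmerge for its dependence on multiplicity, while Rpim estimates the precision of the merged intensities, so Rpim ≤ Rmerge ≤ Rmeas is expected in each column whenever the multiplicity of every reflection is at least 2.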

Table 7.2 X-ray diffraction data collection, refinement, and fragment validation statistics for x0076, x0110, x0125, x0134, and x0137.

| | x0076 | x0110 | x0125 | x0134 | x0137 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | XDS-3dii | XDS-3dii | DIALS | DIALS | DIALS |
| Space group | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 |
| a, b, c (Å) | 110.30, 45.58, 47.22 | 110.90, 45.30, 46.97 | 111.04, 44.87, 46.97 | 110.95, 45.13, 47.03 | 111.96, 44.75, 47.01 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 47.22-1.75 (1.80-1.75)† | 55.45-1.76 (1.81-1.76) | 46.97-1.62 (1.66-1.62) | 47.04-1.72 (1.76-1.72) | 44.74-1.39 (1.43-1.39) |
| Rmerge | 0.042 (1.583) | 0.040 (1.441) | 0.044 (1.271) | 0.047 (1.258) | 0.042 (1.141) |
| Rmeas | 0.049 (1.750) | 0.047 (1.710) | 0.052 (1.506) | 0.056 (1.503) | 0.051 (1.446) |
| Rpim | 0.026 (0.983) | 0.025 (0.912) | 0.028 (0.801) | 0.030 (0.814) | 0.027 (0.876) |
| I/σI | 17.7 (1.2) | 19.4 (1.2) | 14.6 (1.2) | 14.1 (1.1) | 13.2 (1.0) |
| CC1/2 | 0.999 (0.607) | 1.000 (0.539) | 0.999 (0.719) | 0.999 (0.730) | 0.998 (0.553) |
| Wilson B factor (Å²) | 35.43 | 34.64 | 25.69 | 28.46 | 20.54 |
| Completeness (%) | 99.8 (99.6) | 99.9 (99.9) | 99.6 (98.6) | 100.0 (100.0) | 99.9 (98.5) |
| Unique reflections | 24,732 (1,743) | 24,203 (1,754) | 30,506 (2,203) | 25,914 (1,890) | 48,384 (3,479) |
| Multiplicity | 6.5 (6.7) | 6.4 (6.6) | 6.4 (6.5) | 6.2 (6.4) | 5.9 (4.2) |
| Refinement | | | | | |
| Rwork/Rfree | 0.220/0.251 | 0.212/0.233 | 0.216/0.240 | 0.216/0.242 | 0.194/0.218 |
| Molprobity score | 1.74 | 1.65 | 1.05 | 0.88 | 1.15 |
| R.M.S.D. bonds (Å) | 0.020 | 0.019 | 0.017 | 0.016 | 0.013 |
| R.M.S.D. angles (°) | 1.900 | 1.749 | 1.679 | 1.631 | 1.544 |
| Fragment | Site 1 | Site 1 | Site 1 | Site 1 | Site 3 |
| Occupancy | 0.58 | 0.46 | 0.58 | 0.57 | 0.64 |
| RSCC | 0.92 | 0.91 | 0.92 | 0.90 | 0.97 |
| RMSD | 0.54 | 0.31 | 0.21 | 1.41* | 0.62 |
| RSZD | 0.9 | 0.1 | 0.5 | 0.8 | 1.3 |
| B-factor ratio | 1.18 | 1.03 | 0.84 | 1.15 | 0.86 |
| RSZO/OCC | 2.07 | 3.04 | 3.62 | 2.63 | 5.47 |

†Values in parentheses are from highest resolution shell.
*High RMSD of fragment x0134 is a result of manual adjustment of the fragment after generation of the ensemble model.
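The fragment validation rows of these tables (RSCC, RMSD, RSZD, B-factor ratio, RSZO/OCC) can be screened programmatically. The following is a minimal sketch rather than the procedure used in this work: the class `FragmentMetrics`, the function `flag_issues`, and all numerical cut-offs are assumptions introduced for illustration (the cut-offs are commonly quoted guideline values for ligand validation, not values taken from this appendix). The only rule drawn from the appendix itself is that an RSZD of exactly 0.0 is treated as a sign of a modelling problem, as stated in the table footnotes. The two example entries use the values reported for x0076 and x0134 in Table 7.2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FragmentMetrics:
    """Validation metrics for one modelled fragment (hypothetical container)."""
    crystal: str
    rscc: float       # real-space correlation coefficient
    rmsd: float       # RMSD between refined and ensemble-model pose (Å)
    rszd: float       # real-space Z-difference score
    b_ratio: float    # fragment B-factor / surrounding residue B-factor
    rszo_occ: float   # RSZO divided by occupancy


def flag_issues(m: FragmentMetrics) -> List[str]:
    """Return human-readable flags; thresholds are assumed guideline values."""
    issues = []
    if m.rscc < 0.7:
        issues.append("RSCC < 0.7")
    if m.rmsd > 1.0:
        issues.append("RMSD > 1.0 Å")
    if m.rszd > 3.0:
        issues.append("RSZD > 3.0")
    if m.rszd == 0.0:
        # Rule taken from the appendix footnotes: RSZD of 0.0 is suspicious.
        issues.append("RSZD of 0.0")
    if m.b_ratio > 3.0:
        issues.append("B-factor ratio > 3.0")
    if m.rszo_occ < 2.0:
        issues.append("RSZO/OCC < 2.0")
    return issues


# Values copied from Table 7.2.
x0076 = FragmentMetrics("x0076", rscc=0.92, rmsd=0.54, rszd=0.9, b_ratio=1.18, rszo_occ=2.07)
x0134 = FragmentMetrics("x0134", rscc=0.90, rmsd=1.41, rszd=0.8, b_ratio=1.15, rszo_occ=2.63)

for frag in (x0076, x0134):
    print(frag.crystal, flag_issues(frag) or "no flags")
```

With these assumed cut-offs, x0134 is flagged only for its RMSD, which the table footnote attributes to manual adjustment of the fragment after generation of the ensemble model; a flag is therefore a prompt for inspection rather than a verdict on the model.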

Table 7.3 X-ray diffraction data collection, refinement, and fragment validation statistics for x0156, x0157, x0210, x0232, x0240, and x0245. Where two fragment sites were modelled in one crystal, values are given as site / site.

| | x0156 | x0157 | x0210 | x0232 | x0240 | x0245 |
|---|---|---|---|---|---|---|
| Data collection | | | | | | |
| Autoprocessing | DIALS | XDS-3dii | XDS-3dii | XDS-3d | XDS-3dii | DIALS |
| Space group | P2₁2₁2 | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 |
| a, b, c (Å) | 110.74, 44.82, 46.87 | 45.66, 93.66, 110.05 | 45.66, 94.15, 110.79 | 111.26, 45.00, 47.04 | 110.99, 45.11, 47.04 | 110.65, 45.36, 47.12 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 46.85-1.46 (1.50-1.46)† | 47.44-1.38 (1.42-1.38) | 55.40-1.42 (1.46-1.42) | 34.99-1.56 (1.60-1.56) | 47.04-2.23 (2.29-2.23) | 45.36-1.74 (1.79-1.74) |
| Rmerge | 0.047 (1.138) | 0.044 (1.152) | 0.041 (1.038) | 0.040 (1.180) | 0.094 (1.293) | 0.071 (1.512) |
| Rmeas | 0.056 (1.435) | 0.052 (1.463) | 0.049 (1.311) | 0.047 (1.418) | 0.112 (1.544) | 0.085 (1.801) |
| Rpim | 0.030 (0.858) | 0.028 (0.887) | 0.026 (0.787) | 0.025 (0.777) | 0.062 (0.835) | 0.045 (0.966) |
| I/σI | 13.5 (1.2) | 15.9 (1.1) | 18.4 (1.3) | 17.8 (1.2) | 11.4 (1.4) | 10.7 (1.1) |
| CC1/2 | 0.999 (0.592) | 0.999 (0.454) | 0.999 (0.497) | 0.999 (99.7) | 0.997 (0.629) | 0.999 (0.553) |
| Wilson B factor (Å²) | 19.02 | 18.34 | 17.98 | 36.83 | 42.41 | 22.18 |
| Completeness (%) | 98.3 (96.9) | 100.0 (99.9) | 100.0 (99.9) | 99.8 (99.7) | 99.4 (99.6) | 100.0 (100.0) |
| Unique reflections | 40,511 (2,900) | 97,781 (7,139) | 90,935 (6,646) | 34,386 (2,489) | 12,016 (851) | 25,143 (1,812) |
| Multiplicity | 6.1 (5.1) | 6.1 (4.3) | 6.5 (5.0) | 6.5 (6.0) | 5.9 (6.2) | 6.3 (6.4) |
| Refinement | | | | | | |
| Rwork/Rfree | 0.198/0.249 | 0.175/0.193 | 0.177/0.203 | 0.218/0.252 | 0.222/0.277 | 0.226/0.273 |
| Molprobity score | 1.28 | 0.93 | 1.08 | 1.43 | 1.44 | 1.47 |
| R.M.S.D. bonds (Å) | 0.016 | 0.014 | 0.013 | 0.016 | 0.011 | 0.017 |
| R.M.S.D. angles (°) | 1.578 | 1.517 | 1.379 | 1.707 | 1.358 | 1.618 |
| Fragment | Site 1 | Site 6 / Site 9 | Site 7 / Site 9 | Site 1 | Site 1 | Site 1 |
| Occupancy | 0.55 | 0.53 / 0.66 | 0.49 / 0.53 | 0.62 | 0.52 | 0.65 |
| RSCC | 0.96 | 0.86 / 0.91 | 0.90 / 0.86 | 0.93 | 0.94 | 0.92 |
| RMSD | 0.22 | 0.35 / 0.33 | 0.27 / 0.34 | 0.24 | 0.34 | 0.28 |
| RSZD | 1.1 | 0.4 / 0.1 | 0.1 / 0.0* | 0.5 | 0.3 | 0.4 |
| B-factor ratio | 1.21 | 1.15 / 1.04 | 1.21 / 0.94 | 1.14 | 0.72 | 1.1 |
| RSZO/OCC | 3.82 | 2.45 / 4.09 | 3.06 / 3.21 | 3.06 | 3.46 | 3.54 |

†Values in parentheses are from highest resolution shell.
*Fragment RSZD indicates an issue with fragment modelling.

Table 7.4 X-ray diffraction data collection, refinement, and fragment validation statistics for x0257, x0271, x0277, x0317, and x0321.

| | x0257 | x0271 | x0277 | x0317 | x0321 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | DIALS | XDS-3d | DIALS | autoPROC | DIALS |
| Space group | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2₁ |
| a, b, c (Å) | 110.79, 45.18, 46.92 | 110.31, 44.81, 46.80 | 110.77, 45.11, 46.96 | 110.97, 45.43, 47.05 | 45.31, 93.70, 110.64 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 55.37-1.77 (1.82-1.77)† | 28.43-1.92 (1.97-1.92) | 45.10-1.65 (1.69-1.65) | 110.97-2.27 (2.39-2.27) | 47.64-1.45 (1.49-1.45) |
| Rmerge | 0.056 (1.737) | 0.075 (1.235) | 0.038 (1.263) | 0.112 (0.891) | 0.044 (1.170) |
| Rmeas | 0.066 (2.069) | 0.089 (1.469) | 0.045 (1.498) | 0.133 (1.039) | 0.053 (1.400) |
| Rpim | 0.035 (1.112) | 0.048 (0.786) | 0.024 (0.797) | 0.071 (0.559) | 0.028 (0.844) |
| I/σI | 12.9 (1.1) | 13.0 (1.5) | 15.5 (1.1) | 11.1 (2.1) | 14.3 (1.2) |
| CC1/2 | 0.999 (0.549) | 0.999 (0.626) | 1.000 (0.691) | 0.996 (0.725) | 0.999 (0.549) |
| Wilson B factor (Å²) | 29.92 | 32.87 | 28.18 | 46.56 | 20.49 |
| Completeness (%) | 100.0 (100.0) | 99.8 (99.3) | 100.0 (100.0) | 100.0 (100.0) | 100.0 (100.0) |
| Unique reflections | 23,744 (1,733) | 18,390 (1,332) | 29,150 (2,120) | 11,580 (1,641) | 84,341 (6,153) |
| Multiplicity | 6.3 (6.3) | 6.4 (6.3) | 6.2 (6.4) | 6.4 (6.3) | 6.1 (5.0) |
| Refinement | | | | | |
| Rwork/Rfree | 0.213/0.240 | 0.209/0.249 | 0.222/0.264 | 0.222/0.289 | 0.195/0.235 |
| Molprobity score | 1.54 | 1.69 | 1.05 | 1.68 | 0.87 |
| R.M.S.D. bonds (Å) | 0.017 | 0.015 | 0.018 | 0.016 | 0.014 |
| R.M.S.D. angles (°) | 1.736 | 1.603 | 1.646 | 1.684 | 1.480 |
| Fragment | Site 1 | Site 1 | Site 1 | Site 1 | Site 3 |
| Occupancy | 0.40 | 0.55 | 0.51 | 0.5 | 0.81 |
| RSCC | 0.88 | 0.93 | 0.90 | 0.87 | 0.87 |
| RMSD | 1.23* | 0.33 | 0.29 | 1.38* | 0.22 |
| RSZD | 1.8 | 0.5 | 0.9 | 0.5 | 0.1 |
| B-factor ratio | 1.23 | 0.91 | 1.18 | 1.13 | 1.27 |
| RSZO/OCC | 1.78 | 3.82 | 3.14 | 2.2 | 2.22 |

†Values in parentheses are from highest resolution shell.
*High RMSD of fragments x0257 and x0317 is due to manual adjustment of fragments after generation of the ensemble model.

Table 7.5 X-ray diffraction data collection, refinement, and fragment validation statistics for x0332, x0335, x0357, x0368, and x0381. Where two fragment sites were modelled in one crystal, values are given as site / site.

| | x0332 | x0335 | x0357 | x0368ᵃ | x0381 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | XDS-3dii | XDS-3d | XDS-3d | XDS-3dii | DIALS |
| Space group | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2₁ | P2₁2₁2 |
| a, b, c (Å) | 111.44, 45.05, 46.88 | 111.32, 45.20, 46.98 | 111.14, 44.91, 47.09 | 45.67, 94.06, 110.57 | 111.36, 44.89, 47.04 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 46.88-1.45 (1.49-1.45)† | 29.12-1.64 (1.68-1.64) | 23.93-1.93 (1.98-1.93) | 71.64-1.51 (1.55-1.51) | 47.04-1.49 (1.53-1.49) |
| Rmerge | 0.034 (1.131) | 0.048 (1.169) | 0.078 (1.056) | 0.067 (1.137) | 0.059 (1.117) |
| Rmeas | 0.041 (1.404) | 0.057 (1.374) | 0.093 (1.259) | 0.079 (1.399) | 0.070 (1.366) |
| Rpim | 0.022 (0.820) | 0.030 (0.715) | 0.049 (0.679) | 0.042 (0.803) | 0.037 (0.773) |
| I/σI | 18.9 (1.2) | 16.8 (1.5) | 12.4 (1.6) | 11.5 (1.3) | 10.5 (1.1) |
| CC1/2 | 0.999 (0.613) | 0.999 (0.638) | 0.999 (0.636) | 0.998 (0.537) | 0.995 (0.646) |
| Wilson B factor (Å²) | 23.08 | 25.76 | 33.71 | 21.65 | 27.08 |
| Completeness (%) | 99.9 (99.9) | 99.8 (99.6) | 99.7 (99.1) | 100.0 (100.0) | 100.0 (99.9) |
| Unique reflections | 42,688 (3,071) | 29,818 (2,169) | 18,386 (1,337) | 75,654 (5,518) | 39,428 (2,860) |
| Multiplicity | 6.2 (5.2) | 6.4 (6.7) | 6.4 (6.3) | 6.5 (5.6) | 6.2 (5.3) |
| Refinement | | | | | |
| Rwork/Rfree | 0.231/0.264 | 0.219/0.254 | 0.207/0.246 | 0.235/0.255 | 0.234/0.266 |
| Molprobity score | 1.60 | 0.70 | 1.42 | 1.71 | 1.05 |
| R.M.S.D. bonds (Å) | 0.014 | 0.017 | 0.017 | 0.016 | 0.014 |
| R.M.S.D. angles (°) | 1.650 | 1.698 | 1.723 | 1.590 | 1.601 |
| Fragment | Site 1 | Site 1 | Site 1 | Site 1 / Site 6 | Site 2 |
| Occupancy | 0.51 | 0.59 | 0.50 | 0.48 / 0.48 | 0.67 |
| RSCC | 0.80 | 0.89 | 0.85 | 0.84 / 0.87 | 0.86 |
| RMSD | 0.15 | 0.2 | 0.37 | 0.35 / 0.56 | 0.45 |
| RSZD | 0.4 | 1.0 | 1.3 | 0.9 / 0.3 | 0.2 |
| B-factor ratio | 1.0 | 1.0 | 1.23 | 1.24 / 0.79 | 1.49 |
| RSZO/OCC | 2.16 | 3.22 | 3.20 | 2.77 / 3.54 | 1.79 |

†Values in parentheses are from highest resolution shell.
ᵃCrystal displays evidence of tNCS.

Table 7.6 X-ray diffraction data collection, refinement, and fragment validation statistics for x0385, x0400, x0407, x0432, and x0453. Where two fragment sites were modelled in one crystal, values are given as site / site.

| | x0385 | x0400 | x0407ᵃ | x0432 | x0453 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | XDS-3dii | DIALS | DIALS | DIALS | XDS-2dii |
| Space group | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2₁ |
| a, b, c (Å) | 110.99, 44.92, 46.95 | 110.90, 44.72, 46.99 | 110.61, 45.48, 47.17 | 111.29, 45.01, 47.14 | 45.28, 93.73, 111.25 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 44.92-1.74 (1.79-1.74)† | 46.98-1.54 (1.58-1.54) | 47.17-1.53 (1.57-1.53) | 45.01-1.58 (1.62-1.58) | 47.84-1.57 (1.61-1.57) |
| Rmerge | 0.055 (1.600) | 0.061 (1.367) | 0.054 (1.196) | 0.045 (1.235) | 0.068 (1.370) |
| Rmeas | 0.065 (1.895) | 0.073 (1.677) | 0.064 (1.464) | 0.054 (1.486) | 0.081 (1.638) |
| Rpim | 0.035 (1.004) | 0.040 (0.956) | 0.034 (0.830) | 0.028 (0.817) | 0.043 (0.889) |
| I/σI | 14.7 (1.2) | 10.3 (1.1) | 11.5 (1.1) | 13.1 (1.1) | 12.8 (1.2) |
| CC1/2 | 0.999 (0.572) | 0.997 (0.577) | 0.996 (0.576) | 0.997 (0.787) | 0.999 (0.484) |
| Wilson B factor (Å²) | 31.31 | 20.92 | 29.55 | 29.25 | 21.27 |
| Completeness (%) | 99.9 (100.0) | 99.5 (98.9) | 100.0 (99.8) | 100.0 (100.0) | 99.2 (98.5) |
| Unique reflections | 24,828 (1,770) | 35,167 (2,547) | 36,769 (2,693) | 33,513 (2,429) | 66,341 (4,799) |
| Multiplicity | 6.4 (6.6) | 6.2 (5.6) | 6.3 (5.6) | 6.4 (6.0) | 6.5 (6.3) |
| Refinement | | | | | |
| Rwork/Rfree | 0.218/0.256 | 0.222/0.255 | 0.264/0.301 | 0.217/0.257 | 0.255/0.277 |
| Molprobity score | 1.18 | 0.89 | 1.45 | 1.83 | 0.81 |
| R.M.S.D. bonds (Å) | 0.018 | 0.016 | 0.020 | 0.017 | 0.017 |
| R.M.S.D. angles (°) | 1.701 | 1.679 | 1.691 | 1.669 | 1.682 |
| Fragment | Site 1 | Site 1 | Site 1 | Site 1 / Site 4 | Site 9 |
| Occupancy | 0.49 | 0.59 | 0.49 | 0.51 / 0.86 | 0.60 |
| RSCC | 0.85 | 0.90 | 0.92 | 0.93 / 0.84 | 0.80 |
| RMSD | 0.2 | 0.22 | 0.29 | 0.22 / 0.34 | 0.32 |
| RSZD | 0.6 | 1.1 | 0.0* | 0.0* / 0.1 | 0.8 |
| B-factor ratio | 1.11 | 1.25 | 0.97 | 1.09 / 0.98 | 1.21 |
| RSZO/OCC | 1.84 | 2.88 | 2.24 | 2.75 / 1.4 | 2.17 |

†Values in parentheses are from highest resolution shell.
ᵃCrystal displays evidence of tNCS.
*Fragment RSZD of 0.0 indicates an issue with fragment modelling.

Table 7.7 X-ray diffraction data collection, refinement, and fragment validation statistics for x0454, x0459, x0478, x0481, and x0484. Where two fragment sites were modelled in one crystal, values are given as site / site.

| | x0454 | x0459 | x0478 | x0481 | x0484ᵃ |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | DIALS | XDS-3dii | DIALS | XDS-3dii | DIALS |
| Space group | P2₁2₁2 | P2₁2₁2₁ | P2₁2₁2 | P2₁2₁2 | P2₁2₁2₁ |
| a, b, c (Å) | 110.57, 45.54, 47.13 | 45.56, 94.09, 110.44 | 110.11, 45.77, 47.01 | 110.85, 45.57, 47.14 | 45.57, 94.01, 110.73 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 55.29-1.82 (1.87-1.82)† | 47.62-1.40 (1.44-1.40) | 43.23-1.63 (1.67-1.63) | 55.42-1.68 (1.72-1.68) | 55.37-1.60 (1.64-1.60) |
| Rmerge | 0.084 (1.720) | 0.050 (1.158) | 0.044 (1.232) | 0.041 (1.387) | 0.085 (1.181) |
| Rmeas | 0.100 (2.073) | 0.060 (1.468) | 0.052 (1.456) | 0.049 (1.631) | 0.102 (1.410) |
| Rpim | 0.054 (1.143) | 0.032 (0.887) | 0.028 (0.767) | 0.026 (0.851) | 0.054 (0.760) |
| I/σI | 9.2 (1.1) | 14.4 (1.1) | 15.5 (1.2) | 18.0 (1.2) | 7.4 (1.1) |
| CC1/2 | 0.998 (0.503) | 0.999 (0.433) | 0.999 (0.671) | 0.999 (0.579) | 0.998 (0.582) |
| Wilson B factor (Å²) | 25.38 | 18.29 | 23.03 | 32.08 | 17.06 |
| Completeness (%) | 100.0 (100.0) | 99.9 (99.3) | 99.7 (99.3) | 98.6 (99.6) | 100.0 (100.0) |
| Unique reflections | 22,089 (1,583) | 94,196 (6,868) | 30,331 (2,154) | 27,607 (1,989) | 63,708 (4,637) |
| Multiplicity | 6.3 (6.1) | 6.2 (4.6) | 6.3 (6.5) | 6.5 (6.8) | 6.4 (6.2) |
| Refinement | | | | | |
| Rwork/Rfree | 0.232/0.272 | 0.229/0.245 | 0.225/0.267 | 0.249/0.276 | 0.257/0.291 |
| Molprobity score | 1.36 | 1.12 | 0.95 | 1.40 | 0.95 |
| R.M.S.D. bonds (Å) | 0.018 | 0.016 | 0.017 | 0.021 | 0.012 |
| R.M.S.D. angles (°) | 1.756 | 1.585 | 1.683 | 1.867 | 1.248 |
| Fragment | Site 1 | Site 1 / Site 6 | Site 1 | Site 1 | Site 1 / Site 6 |
| Occupancy | 0.65 | 0.62 / 0.52 | 0.65 | 0.41 | 0.53 / 0.48 |
| RSCC | 0.91 | 0.88 / 0.85 | 0.96 | 0.82 | 0.71 / 0.74 |
| RMSD | 0.44 | 1.16* / 0.32 | 1.22 | 0.85 | 0.29 / 0.49 |
| RSZD | 0.8 | 1.0 / 0.4 | 1.5 | 1.0 | 1.1 / 1.3 |
| B-factor ratio | 1.08 | 1.11 / 1.24 | 0.91 | 1.26 | 1.49 / 1.51 |
| RSZO/OCC | 3.08 | 3.06 / 3.08 | 6.77 | 1.46 | 1.32 / 1.88 |

†Values in parentheses are from highest resolution shell.
ᵃCrystal displays evidence of tNCS.
*High fragment RMSD is a result of manual adjustment of the fragment after generation of the ensemble model.

Table 7.8 X-ray diffraction data collection, refinement, and fragment validation statistics for x0507, x0542, x0566, x0572, and x0592. Where two fragment sites were modelled in one crystal, values are given as site / site.

| | x0507ᵃ | x0542 | x0566ᵃ | x0572 | x0592 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | XDS-3dii | XDS-3d | XDS-3d | autoPROC | XDS-3dii |
| Space group | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2₁ | P2₁2₁2 | P2₁2₁2 |
| a, b, c (Å) | 45.63, 94.06, 110.80 | 45.57, 93.75, 110.68 | 45.58, 93.90, 110.52 | 110.51, 45.54, 47.01 | 110.02, 44.59, 46.80 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 55.40-1.67 (1.71-1.67)† | 32.94-1.55 (1.59-1.55) | 43.21-1.50 (1.54-1.50) | 110.51-1.77 (1.87-1.77) | 46.80-2.07 (2.12-2.07) |
| Rmerge | 0.103 (1.311) | 0.050 (1.197) | 0.050 (1.134) | 0.083 (0.629) | 0.081 (1.315) |
| Rmeas | 0.123 (1.553) | 0.059 (1.453) | 0.060 (1.395) | 0.097 (0.782) | 0.097 (1.568) |
| Rpim | 0.067 (0.824) | 0.031 (0.809) | 0.032 (0.798) | 0.051 (0.457) | 0.052 (0.843) |
| I/σI | 11.0 (1.3) | 16.6 (1.3) | 15.0 (1.4) | 14.6 (2.3) | 11.6 (1.6) |
| CC1/2 | 0.996 (0.529) | 1.000 (0.490) | 0.999 (0.538) | 0.996 (0.751) | 0.997 (0.755) |
| Wilson B factor (Å²) | 16.67 | 21.43 | 21.36 | 41.72 | 34.48 |
| Completeness (%) | 98.0 (100.0) | 100.0 (99.9) | 100.0 (100.0) | 99.8 (99.5) | 99.4 (98.2) |
| Unique reflections | 55,098 (4,135) | 69,717 (5,083) | 76,832 (5,628) | 23,725 (3,381) | 14,535 (1,065) |
| Multiplicity | 6.3 (6.8) | 6.5 (5.9) | 6.4 (5.5) | 5.7 (5.0) | 6.2 (6.1) |
| Refinement | | | | | |
| Rwork/Rfree | 0.268/0.305 | 0.212/0.236 | 0.259/0.277 | 0.240/0.296 | 0.216/0.238 |
| Molprobity score | 1.15 | 0.56 | 0.97 | 1.48 | 1.59 |
| R.M.S.D. bonds (Å) | 0.024 | 0.014 | 0.018 | 0.021 | 0.015 |
| R.M.S.D. angles (°) | 2.049 | 1.414 | 1.703 | 1.899 | 1.595 |
| Fragment | Site 8 | Site 9 | Site 1 / Site 6 | Site 1 | Site 1 |
| Occupancy | 0.65 | 0.72 | 0.57 / 0.51 | 0.57 | 0.57 |
| RSCC | 0.78 | 0.89 | 0.81 / 0.69 | 0.91 | 0.91 |
| RMSD | 0.33 | 0.33 | 0.55 / 0.34 | 0.23 | 1.41* |
| RSZD | 0.7 | 1.5 | 0.1 / 1.2 | 0.1 | 0.6 |
| B-factor ratio | 1.51 | 1.18 | 1.52 / 1.54 | 1.0 | 0.76 |
| RSZO/OCC | 1.69 | 2.92 | 1.23 / 1.57 | 2.63 | 2.98 |

†Values in parentheses are from highest resolution shell.
ᵃCrystal displays evidence of tNCS.
*High fragment RMSD is a result of manual adjustment of the fragment after generation of the ensemble model.

Table 7.9 X-ray diffraction data collection, refinement, and fragment validation statistics for x0601, x0608, x0622, x0628, and x0666.

| | x0601 | x0608 | x0622 | x0628 | x0666 |
|---|---|---|---|---|---|
| Data collection | | | | | |
| Autoprocessing | XDS-3dii | DIALS | DIALS | DIALS | XDS-3d |
| Space group | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 | P2₁2₁2 |
| a, b, c (Å) | 110.63, 45.37, 46.98 | 110.52, 45.05, 46.62 | 109.64, 45.94, 46.76 | 110.45, 45.64, 47.07 | 110.78, 44.95, 46.88 |
| α, β, γ (°) | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 | 90.00, 90.00, 90.00 |
| Resolution (Å) | 55.31-1.71 (1.75-1.71)† | 55.26-1.67 (1.71-1.67) | 45.95-1.28 (1.31-1.28) | 45.64-1.63 (1.67-1.63) | 21.59-1.72 (1.76-1.72) |
| Rmerge | 0.059 (1.513) | 0.041 (1.266) | 0.039 (0.800) | 0.040 (1.200) | 0.045 (1.128) |
| Rmeas | 0.071 (1.785) | 0.049 (1.499) | 0.046 (1.061) | 0.048 (1.419) | 0.053 (1.328) |
| Rpim | 0.038 (0.939) | 0.026 (0.793) | 0.025 (0.691) | 0.026 (0.749) | 0.028 (0.697) |
| I/σI | 13.4 (1.0) | 15.1 (1.1) | 14 (1.2) | 14.9 (1.3) | 17.3 (1.3) |
| CC1/2 | 0.999 (0.639) | 0.999 (0.719) | 0.999 (0.580) | 0.999 (0.670) | 0.999 (0.813) |
| Wilson B factor (Å²) | 29.31 | 28.51 | 16.13 | 26.97 | 30.53 |
| Completeness (%) | 99.7 (99.7) | 100.0 (99.9) | 99.9 (99.5) | 100.0 (100.0) | 99.1 (99.1) |
| Unique reflections | 26,288 (1,894) | 27,818 (1,985) | 61,712 (4,487) | 30,496 (2,192) | 25,375 (1,840) |
| Multiplicity | 6.4 (6.7) | 6.3 (6.5) | 5.5 (3.4) | 6.3 (6.5) | 6.4 (6.7) |
| Refinement | | | | | |
| Rwork/Rfree | 0.220/0.249 | 0.211/0.231 | 0.184/0.221 | 0.248/0.291 | 0.209/0.240 |
| Molprobity score | 1.10 | 1.32 | 1.42 | 1.88 | 1.48 |
| R.M.S.D. bonds (Å) | 0.016 | 0.018 | 0.014 | 0.019 | 0.019 |
| R.M.S.D. angles (°) | 1.562 | 1.731 | 1.530 | 1.859 | 1.738 |
| Fragment | Site 1 | Site 1 | Site 5 | Site 1 | Site 1 |
| Occupancy | 0.62 | 0.44 | 0.57 | 0.66 | 0.51 |
| RSCC | 0.93 | 0.93 | 0.92 | 0.94 | 0.93 |
| RMSD | 0.22 | 0.19 | 0.25 | 0.21 | 0.4 |
| RSZD | 0.0* | 0.3 | 2.2 | 0.7 | 0.9 |
| B-factor ratio | 1.15 | 1.03 | 1.05 | 0.81 | 1.06 |
| RSZO/OCC | 2.26 | 4.09 | 3.51 | 4.39 | 4.31 |

†Values in parentheses are from highest resolution shell.
*Fragment RSZD of 0.0 indicates there is an issue with fragment modelling.

7.2 H3 K4me3 binding and fragment details

Table 7.10 H3 K4me3 binding and fragment details

| Binding site | Crystal | Occupancy | H3 K4me3 present | SMILES | Compound name | Manufacturer | MW (Da) |
|---|---|---|---|---|---|---|---|
| Site 1 (1)* | x0003 | 0.61 | Yes | c1cc(cc(c1)Cl)NC(=O)C(F)(F)F | N-(3-chlorophenyl)-2,2,2-trifluoroacetamide | Specs | 240.3 |
| Site 1 (1) | x0021 | 0.63 | Yes | Cc1cccc(c1)Nc2c3cnn(c3ncn2)C | 1-Methyl-N-(3-methylphenyl)-1H-pyrazolo[3,4-d]pyrimidin-4-amine | Pharmeks | 239.3 |
| Site 1 (1) | x0040 | 0.63 | Yes | c1cc(ccc1N2CCN(CC2)C(=O)N)F | 4-(4-Fluorophenyl)-1-piperazinecarboxamide | Specs | 223.3 |
| Site 1 (1) | x0051 | 0.54 | Yes | CCNC(=O)Oc1ccc(cc1)c2csnn2 | 4-(1,2,3-thiadiazol-4-yl)phenyl ethylcarbamate | KeyOrganics | 249.3 |
| Site 1 (1) | x0076 | 0.58 | Yes | c1cc(ccc1NC(=O)C[NH+]2CCCC2)F | 1-{2-[(4-fluorophenyl)amino]-2-oxoethyl}pyrrolidinium | Specs | 223.3 |
| Site 1 (2) | x0110 | 0.46 | Yes | Cn1c(cc(n1)c2cccs2)N | 1-methyl-3-(2-thienyl)-1H-pyrazol-5-amine | MayBridge | 179.2 |
| Site 1 (2) | x0125 | 0.58 | Yes | c1cn[nH]c1c2cc(cc(c2O)F)F | 2,4-difluoro-6-(1H-pyrazol-5-yl)phenol | MayBridge | 196.2 |
| Site 1 (2) | x0134 | 0.57 | Ground state only | Cc1ccc(o1)c2cccc(c2)N | 3-(5-methyl-2-furyl)aniline | Apollo Scientific | 173.2 |
| Site 1 (1) | x0156 | 0.55 | Ground state only | Cc1c(c(n[nH]1)C)CCOc2ccc(cc2)F | 4-[2-(4-fluorophenoxy)ethyl]-3,5-dimethyl-1H-pyrazole | Enamine | 243.3 |
| Site 1 (2) | x0232 | 0.62 | No | CC(=O)Nc1ccc(cc1)c2csc(n2)N | N-(4-(2-aminothiazol-4-yl)phenyl)acetamide | IBScreen | 233.2 |
| Site 1 (2) | x0240 | 0.52 | No | c1ccc2c(c1)[nH]c(n2)NCc3ccco3 | N-(2-furylmethyl)-1H-benzimidazol-2-amine | IBScreen | 213.3 |
| Site 1 (1) | x0245 | 0.65 | Ground state only | C[C@@H](CNC(=O)Nc1ccccc1)O | 3-[(2R)-2-hydroxypropyl]-1-phenylurea | Enamine | 194.2 |
| Site 1 (1) | x0257 | 0.40 | Yes | CC[C@H](CO)NC(=O)Nc1ccc(cc1)Cl | 1-(4-Chlorophenyl)-3-(1-hydroxy-2-butanyl)urea | Enamine | 242.7 |
| Site 1 (1) | x0271 | 0.55 | Ground state only | CCOC(=O)Nc1cccc(c1)Cl | Ethyl (3-chlorophenyl)carbamate | Aldrich-CPR | 199.6 |
| Site 1 (1) | x0277 | 0.51 | Ground state only | C[NH+]1CCN(CC1)C(=O)Nc2ccccc2 | 4-Methyl-N-phenyl-1-piperazinecarboxamide | Crea-Chim | 219.3 |
| Site 1 (1) | x0317 | 0.50 | Ground state only | Cc1ccccc1OCC(=O)Nc2cccnc2 | 2-(2-methylphenoxy)-N-(pyridin-3-yl)acetamide | Asinex | 242.3 |
| Site 1 (1) | x0332 | 0.51 | Ground state only | c1cc(cc(c1)c2ccc(cc2)O)C#N | 4'-hydroxy-3-biphenylcarbonitrile | Aldrich-CPR | 195.2 |
| Site 1 (1) | x0335 | 0.59 | Ground state only | c1cc(cc(c1)c2cccc(c2)O)C#N | 3'-hydroxy-3-biphenylcarbonitrile | Aldrich-CPR | 195.2 |
| Site 1 (1) | x0357 | 0.50 | No | c1ccc(cc1)COC(=O)N2CCC[C@H]2CO | Benzyl (2S)-2-(hydroxymethyl)-1-pyrrolidinecarboxylate | Sigma-Aldrich (BuildingBlocks) | 235.3 |
| Site 1 (1) | x0368 | 0.48 | Ground state only | COCC(=O)Nc1ccccc1C(=O)N | 2-[(methoxyacetyl)amino]benzamide | ChemBridge | 208.0 |
| Site 1 (1) | x0385 | 0.49 | Ground state only | CC1CCN(CC1)C(=O)c2cc3ccccc3o2 | 1-(benzofuran-2-ylcarbonyl)-4-methylpiperidine | ChemBridge | 243.0 |
| Site 1 (1) | x0400 | 0.59 | No | c1ccc(cc1)CONC(=O)N | 1-(Benzyloxy)urea | Otava | 166.2 |
| Site 1 (1) | x0407 | 0.49 | Yes | c1ccc(c(c1)N)Nc2ccccn2 | N-(2-Pyridinyl)-1,2-benzenediamine | Otava | 185.2 |
| Site 1 (1) | x0432 | 0.51 | Yes | Cc1cc(n2c(n1)ncn2)Nc3ccc(cc3)O | 4-((5-methyl-[1,2,4]triazolo[1,5-a]pyrimidin-7-yl)amino)phenol | IBScreen | 241.3 |
| Site 1 (1) | x0454 | 0.65 | Yes | c1cc(ccc1NNC(=O)C2CCC2)F | N'-(4-fluorophenyl)cyclobutanecarbohydrazide | MayBridge | 208.2 |
| Site 1 (2) | x0459 | 0.62 | Yes | COC(=O)Nc1cccc(c1)NC(=O)OC | Dimethyl 1,3-phenylenebiscarbamate | Crea-Chim | 224.2 |
| Site 1 (1) | x0481 | 0.41 | Yes | CC[C@H](C)NC(=O)Nc1cccc(c1)Cl | 1-[(2S)-2-butanyl]-3-(3-chlorophenyl)urea | ChemBridge | 226.7 |
| Site 1 (2) | x0484 | 0.53 | Yes | c1ccc(c(c1)CNc2[nH]c3ccccc3n2)O | 2-[(1H-benzimidazol-2-ylamino)methyl]phenol | ChemBridge BuildingBlocks | 239.3 |
| Site 1 (1) | x0566 | 0.57 | Yes | c1ccc(cc1)COC(=O)N2CCC(CC2)O | Benzyl 4-hydroxy-1-piperidinecarboxylate | Fluorochem | 235.3 |
| Site 1 (1) | x0572 | 0.57 | Yes | Cc1cc(nc(n1)Nc2ccccc2)C | 4,6-Dimethyl-N-phenyl-2-pyrimidinamine | Fluorochem | 199.3 |
| Site 1 (1) | x0592 | 0.57 | Ground state only | C[C@H](C(=O)[O-])Oc1ccc(cc1)c2ccccc2 | (2R)-2-(4-biphenylyloxy)propanoate | Crea-Chim | 242.5 |
| Site 1 (1) | x0601 | 0.62 | Yes | COC(=O)NCc1cc2cc(ccc2[nH]1)Cl | Methyl [(5-chloro-1H-indol-2-yl)methyl]carbamate | Crea-Chim | 238.7 |
| Site 1 (2) | x0608 | 0.44 | Yes | Cc1cccc2c1nc(s2)NC(=O)OC | Methyl (4-methyl-1,3-benzothiazol-2-yl)carbamate | Crea-Chim | 222.3 |
| Site 1 (1) | x0628 | 0.66 | Yes | Cc1cc(nc(n1)Nc2cccc(c2)O)C | 3-[(4,6-dimethyl-2-pyrimidinyl)amino]phenol | ChemBridge | 215.0 |
| Site 1 (1) | x0666 | 0.51 | Yes | c1cc(ccc1CCC(=O)[O-])c2ccc(cc2)O | 3-(4'-Hydroxy-4-biphenylyl)propanoic acid | Specs | 242.3 |
| Site 2 | x0381 | 0.67 | No | CCCC(=O)Nc1nnn(n1)CC | N-(2-ethyl-2H-tetrazol-5-yl)butanamide | ChemBridge | 183.0 |
| Site 3 | x0137 | 0.64 | Ground state only | CS(=O)(=O)Nc1ccc(cc1)C[NH3+] | {4-[(methylsulfonyl)amino]phenyl}methanaminium | Apollo Scientific | 201.3 |
| Site 3 | x0321 | 0.81 | Yes | Cc1cccc(c1C)NC(=O)C[NH+]2CCOCC2 | 4-{2-[(2,3-dimethylphenyl)amino]-2-oxoethyl}morpholin-4-ium | Asinex | 249.3 |
| Site 4 | x0432 | 0.86 | Yes | Cc1cc(n2c(n1)ncn2)Nc3ccc(cc3)O | 4-((5-methyl-[1,2,4]triazolo[1,5-a]pyrimidin-7-yl)amino)phenol | IBScreen | 241.3 |
| Site 5 | x0622 | 0.57 | Yes | CCC(CC)C(=O)Nc1ccccc1O | 2-ethyl-N-(2-hydroxyphenyl)butanamide | Specs | 207.2 |
| Site 6 (2) | x0157 | 0.53 | Yes | Cc1ccc(c(c1)C)OCC(=O)N2CCOCC2 | 2-(2,4-dimethylphenoxy)-1-(4-morpholinyl)ethanone | Asinex | 249.3 |
| Site 6 (1) | x0368 | 0.48 | Ground state only | COCC(=O)Nc1ccccc1C(=O)N | 2-[(methoxyacetyl)amino]benzamide | ChemBridge | 208.0 |
| Site 6 (2) | x0459 | 0.52 | Yes | COC(=O)Nc1cccc(c1)NC(=O)OC | Dimethyl 1,3-phenylenebiscarbamate | Crea-Chim | 224.2 |
| Site 6 (1) | x0478 | 0.65 | Yes | c1cc(cnc1)NC(=O)CC2CCCCC2 | 2-cyclohexyl-N-3-pyridinylacetamide | ChemBridge | 218.0 |
| Site 6 (2) | x0484 | 0.48 | Yes | c1ccc(c(c1)CNc2[nH]c3ccccc3n2)O | 2-[(1H-benzimidazol-2-ylamino)methyl]phenol | ChemBridge BuildingBlocks | 239.3 |
| Site 6 (1) | x0566 | 0.51 | Yes | c1ccc(cc1)COC(=O)N2CCC(CC2)O | Benzyl 4-hydroxy-1-piperidinecarboxylate | Fluorochem | 235.3 |
| Site 7 | x0210 | 0.49 | Ground state only | C[NH+]1CCN(CC1)C(=O)c2ccc(c(c2)F)F | 1-(3,4-difluorobenzoyl)-4-methylpiperazine | Asinex | 240.3 |
| Site 8 | x0507 | 0.65 | Yes | CCS(=O)(=O)NCc1ccc2c(c1)OCO2 | N-(1,3-benzodioxol-5-ylmethyl)ethanesulfonamide | Crea-Chim UAB | 243.3 |
| Site 9 | x0056 | 0.59 | Yes | C[NH+]1CCN(CC1)c2nc3ccccc3s2 | 2-(4-methyl-1-piperazinyl)-1,3-benzothiazole | Enamine | 233.3 |
| Site 9 | x0157 | 0.66 | Yes | Cc1ccc(c(c1)C)OCC(=O)N2CCOCC2 | 2-(2,4-dimethylphenoxy)-1-(4-morpholinyl)ethanone | Asinex | 249.3 |
| Site 9 | x0210 | 0.53 | Ground state only | C[NH+]1CCN(CC1)C(=O)c2ccc(c(c2)F)F | 1-(3,4-difluorobenzoyl)-4-methylpiperazine | Asinex | 240.3 |
| Site 9 | x0453 | 0.66 | Yes | C[NH+]1CCC(CC1)OC(=O)c2cccc(c2)F | 4-[(3-fluorobenzoyl)oxy]-1-methylpiperidinium | Specs | 238.3 |
| Site 9 | x0542 | 0.72 | Yes | c1cc2c(cc1C(=O)N3CCOCC3)nns2 | 1,2,3-benzothiadiazol-5-yl(4-morpholinyl)methanone | MayBridge | 249.4 |

*Number in parentheses indicates the orientation of fragments bound to site 1.
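The site assignments in Table 7.10 can be tallied to cross-check the counts quoted in the abstract (53 fragment binding events in total, 41 of them at the hotspot). The sketch below is illustrative only: the dictionary `site_to_crystals` is transcribed by hand from the binding site and crystal columns of Table 7.10, and treating sites 1 and 6 together as the hotspot is an assumption based on the section heading "Site 1 (and site 6)" in section 7.3.1.

```python
# Crystal identifiers per binding site, transcribed from Table 7.10.
site_to_crystals = {
    1: ["x0003", "x0021", "x0040", "x0051", "x0076", "x0110", "x0125",
        "x0134", "x0156", "x0232", "x0240", "x0245", "x0257", "x0271",
        "x0277", "x0317", "x0332", "x0335", "x0357", "x0368", "x0385",
        "x0400", "x0407", "x0432", "x0454", "x0459", "x0481", "x0484",
        "x0566", "x0572", "x0592", "x0601", "x0608", "x0628", "x0666"],
    2: ["x0381"],
    3: ["x0137", "x0321"],
    4: ["x0432"],
    5: ["x0622"],
    6: ["x0157", "x0368", "x0459", "x0478", "x0484", "x0566"],
    7: ["x0210"],
    8: ["x0507"],
    9: ["x0056", "x0157", "x0210", "x0453", "x0542"],
}

# Assumption: sites 1 and 6 together form the fragment hotspot.
HOTSPOT_SITES = (1, 6)

total_events = sum(len(crystals) for crystals in site_to_crystals.values())
hotspot_events = sum(len(site_to_crystals[s]) for s in HOTSPOT_SITES)
unique_crystals = {c for crystals in site_to_crystals.values() for c in crystals}

print(f"binding events: {total_events}")
print(f"hotspot events (sites 1 and 6): {hotspot_events}")
print(f"unique crystals: {len(unique_crystals)}")
```

The number of unique crystals is lower than the number of binding events because several fragments (for example x0157, x0210, and x0432) were modelled at two sites in the same crystal.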

7.3 Fragment binding validation

7.3.1 Site 1 (and site 6)

7.3.1.1 Orientation 1

Figure 7.1 Models of fragments x0003 (1.76 Å), x0021 (1.85 Å), and x0040 (1.81 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; fluorine, grey; chlorine, grey). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and fragment validation metrics (c,f,i) indicate a high level of confidence in fragment modelling.

Figure 7.2 Models of fragments x0051 (1.67 Å), x0076 (1.75 Å), and x0156 (1.46 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow; fluorine, grey). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) indicate a high level of confidence in modelling of fragment x0156 and good confidence in fragment x0076. For fragment x0051 they indicate good confidence in the 4-(1,2,3-thiadiazol-4-yl)phenyl moiety but low confidence in the ethyl carbamate moiety.


Figure 7.3 Models of fragments x0245 (1.74 Å), x0257 (1.77 Å), and x0271 (1.92 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; chlorine, grey). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) indicate a high level of confidence in fragment modelling.


Figure 7.4 Models of fragments x0277 (1.65 Å), x0317 (2.27 Å), and x0332 (1.45 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) indicate a high level of confidence in modelling of fragments x0277 and x0332, and the N-(3-pyridinyl)acetamide moiety of fragment x0317 but low confidence in the 2-methylphenol ring of fragment x0317. *High RMSD of fragment x0317 is a result of manual adjustment of the fragment after generation of the ensemble model.


Figure 7.5 Models of fragments x0335 (1.64 Å), x0357 (1.93 Å), and x0368-1 (1.51 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the respective fragments indicate a high level of confidence in fragment modelling.


Figure 7.6 Models of fragments x0368-2 (1.51 Å), x0385 (1.74 Å), and x0400 (1.54 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the respective fragments indicate a high level of confidence in fragment modelling.


Figure 7.7 Models of fragments x0407 (1.53 Å), x0432 (1.58 Å), and x0454 (1.82 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; fluorine, grey). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the respective fragments indicate high confidence in the modelling of fragment x0454, but the RSZD values of fragments x0407 and x0432-2 indicate a problem with the modelling of these fragments.

Figure 7.8 Models of fragments x0478 (1.63 Å), x0481 (1.68 Å), and x0566-1 (1.50 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; chlorine, grey). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the fragments indicate high confidence for fragment x0478, reasonable confidence for fragment x0481, and low confidence in fragment x0566-1 modelling. *High RMSD of fragment x0478 is a result of manual adjustment of the fragment after generation of the ensemble model.

Figure 7.9 Models of fragments x0566-2 (1.50 Å), x0572 (1.77 Å), and x0592 (2.07 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the fragments indicate low confidence in fragment x0566-2 modelling and high confidence in the modelling of fragments x0572 and x0592. *High RMSD of fragment x0592 is a result of manual adjustment of the fragment after generation of the ensemble model.

Figure 7.10 Models of fragments x0601 (1.71 Å), x0628 (1.63 Å), and x0666 (1.72 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; chlorine, grey). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the respective fragments indicate a high level of confidence in the modelling of fragments x0628 and x0666; however, the RSZD of fragment x0601 indicates an issue with the modelling of this fragment.

7.3.1.2 Orientation 2

Figure 7.11 Models of fragments x0110 (1.76 Å), x0125 (1.62 Å), and x0134 (1.72 Å) binding sites. CHD1 DCD bound states displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; fluorine, grey; sulfur, yellow). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the respective fragments indicate a high level of confidence in fragment modelling. *High RMSD of fragment x0134 is a result of manual adjustment of the fragment after generation of the ensemble model.

Figure 7.12 Models of fragments x0157-2 (1.38 Å), x0232 (1.56 Å), and x0240 (2.23 Å) binding sites. CHD1 DCD bound states displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the respective fragments indicate a high level of confidence in fragment modelling.

Figure 7.13 Models of fragments x0459-1 (1.40 Å), x0459-2 (1.40 Å), and x0484-1 (1.60 Å) binding sites. CHD1 DCD bound states displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b,d,e,g,h; carbon, pale blue) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) indicate high confidence in the modelling of fragments x0459-1 and x0459-2, and low confidence in modelling of fragment x0484-1, with the phenol group being very low confidence. *High RMSD of fragment x0459-2 is a result of manual adjustment of the fragment after generation of the ensemble model.

Figure 7.14 Models of fragments x0484-2 (1.60 Å) and x0608 (1.67 Å) binding sites. CHD1 DCD bound states displayed as a cartoon model (pale blue). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow). Residues from the bound (a,b,d,e; carbon, pale blue) and ground (b,e; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d; magenta; 2 σ), 2mFo-DFc (b,e; blue; 1 σ), and mFo-DFc (b,e; green/red; ±3 σ) maps and validation metrics (c,f) for the fragments indicate low confidence in modelling of fragment x0484-2 (the modelling of the phenol group being very low confidence) and high confidence in the modelling of fragment x0608.

7.3.2 Site 2

Figure 7.15 Model of fragment x0381 binding site (1.49 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0381 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0381 indicate good confidence in modelling.

7.3.3 Site 3

Figure 7.16 Model of fragment x0137 binding site (1.39 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0137 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0137 indicate high confidence in fragment modelling.

Figure 7.17 Model of fragment x0321 binding site (1.45 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0321 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for the fragment indicate high confidence in fragment modelling.

7.3.4 Site 4

Figure 7.18 Model of fragment x0432-1 binding site (1.58 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0432-1 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0432-1 indicate good confidence in modelling.

7.3.5 Site 5

Figure 7.19 Model of fragment x0622 binding site (1.28 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0622 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0622 indicate high confidence in modelling.

7.3.6 Site 7

Figure 7.20 Model of fragment x0210-2 binding site (1.42 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue; fluorine, grey). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0210-2 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0210-2 indicate high confidence in modelling.

7.3.7 Site 8

Figure 7.21 Model of fragment x0507 binding site (1.67 Å). CHD1 DCD bound state displayed as a cartoon model (pale blue). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue). Residues from the bound (a,b; carbon, pale blue) and ground (b; carbon, magenta) states within 7 Å of fragment x0507 are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0507 indicate low confidence in modelling.

7.3.8 Site 9

Figure 7.22 Model of fragment x0056 binding site (1.32 Å). CHD1 DCD bound state displayed as a cartoon model (chain A, pale blue; chain B, gold). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow). Residues from the bound (a,b; chain A carbon, pale blue; chain B carbon, gold) and ground (b; carbon, magenta) states within 7 Å of the fragment are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for fragment x0056 indicate low confidence in fragment modelling. The position of the 1,3-benzothiazole group is slightly ambiguous. *High RMSD of fragment x0056 is a result of manual adjustment of the fragment after generation of the ensemble model.

Figure 7.23 Models of fragments x0157-1 (1.38 Å), x0210-1 (1.42 Å), and x0453 (1.57 Å) binding sites. CHD1 DCD bound state displayed as a cartoon model (chain A, pale blue; chain B, gold). Fragments displayed as ball-and-stick models (carbon, green; oxygen, red; nitrogen, blue; fluorine, grey). Residues from the bound (a,b,d,e,g,h; chain A carbon, pale blue; chain B carbon, gold) and ground (b,e,h; carbon, magenta) states within 7 Å of the fragments are displayed as sticks and coloured by atom type. Event (a,d,g; magenta; 2 σ), 2mFo-DFc (b,e,h; blue; 1 σ), and mFo-DFc (b,e,h; green/red; ±3 σ) maps and validation metrics (c,f,i) for the fragments indicate high confidence in the modelling of fragments x0157-1 and x0453; however, the RSZD of x0210-1 indicates an issue with the modelling of this fragment at site 9.

Figure 7.24 Model of fragment x0542 binding site (1.55 Å). CHD1 DCD bound state displayed as a cartoon model (chain A, pale blue; chain B, gold). Fragment displayed as a ball-and-stick model (carbon, green; oxygen, red; nitrogen, blue; sulfur, yellow). Residues from the bound (a,b; chain A carbon, pale blue; chain B carbon, gold) and ground (b; carbon, magenta) states within 7 Å of the fragment are displayed as sticks and coloured by atom type. Event (a; magenta; 2 σ), 2mFo-DFc (b; blue; 1 σ), and mFo-DFc (b; green/red; ±3 σ) maps and validation metrics (c) for the fragment indicate good confidence in fragment modelling.

7.4 High resolution limits of datasets used for PanDDA

Figure 7.25 High resolution limits of datasets used in PanDDA analyses with His6-CHD1270-443. (a) P21212 datasets. (b) P212121 datasets.

8 Appendix B: Additional materials and methods

8.1 Competent cells, plasmids, peptides, and primers

Table 8.1 Competent cells used in this study.

Strain | Description
One Shot® TOP10 | F- mcrA Δ(mrr-hsdRMS-mcrBC) Φ80lacZΔM15 ΔlacX74 recA1 araD139 Δ(ara-leu)7697 galU galK rpsL (StrR) endA1 nupG
Lemo21(DE3) | fhuA2 [lon] ompT gal (λ DE3) [dcm] ΔhsdS/ pLemo(CamR); λ DE3 = λ sBamHIo ΔEcoRI-B int::(lacI::PlacUV5::T7 gene1) i21 Δnin5; pLemo = pACYC184-PrhaBAD-lysY
Rosetta™(DE3) | F- ompT hsdSB(rB- mB-) gal dcm (DE3) pRARE (CamR)
BL21-CodonPlus(DE3)-pRIL | F- ompT hsdS(rB- mB-) dcm+ Tetr gal (DE3) endA Hte [argU ileY leuW Camr]
XL10-Gold | Tetr Δ(mcrA)183 Δ(mcrCB-hsdSMR-mrr)173 endA1 supE44 thi-1 recA1 gyrA96 relA1 lac Hte [F' proAB lacIqZΔM15 Tn10 (Tetr) Amy Camr]
One Shot® OmniMAX™ 2 T1R | F' (proAB lacIq lacZΔM15 Tn10(TetR) Δ(ccdAB)) mcrA Δ(mrr-hsdRMS-mcrBC) Φ80(lacZ)ΔM15 Δ(lacZYA-argF)U169 endA1 recA1 supE44 thi1 gyrA96 relA1 tonA panD

Table 8.2 Plasmids used in this study.

Plasmid | Description
pOPINF | E. coli, baculovirus, and mammalian expression vector encoding an N-terminal hexa-histidine tag and 3C protease cleavage site; AmpR
pOPINE | E. coli, baculovirus, and mammalian expression vector encoding a C-terminal hexa-histidine tag; AmpR
pET15b | E. coli expression vector encoding an N-terminal hexa-histidine tag; AmpR
pGEX-4T-1 | E. coli expression vector encoding an N-terminal GST tag and thrombin cleavage site; AmpR

Table 8.3 Peptides used in this study.

Peptide | Amino acid sequence
H31-12 K4me3 | ARTKme3QTARKSTG
H31-23 | ARTKQTARKSTGGKAPRKQLATK
H31-23 K4me3 | ARTKme3QTARKSTGGKAPRKQLATK
H31-23 T6ph | ARTKQTphARKSTGGKAPRKQLATK
H31-23 K4me3 T6ph | ARTKme3QTphARKSTGGKAPRKQLATK
KDM1A108-119 K114me2 | RRTSRRKme2RAKVE
KDM1A108-119 K114me3 | RRTSRRKme3RAKVE
KDM1A104-127 | TPEGRRTSRRKRAKVEYREMDESL
KDM1A104-127 S111ph | TPEGRRTSphRRKRAKVEYREMDESL
KDM1A104-127 S111ph K114me2 | TPEGRRTSphRRKme2RAKVEYREMDESL
KDM1A104-127 K114me1 | TPEGRRTSRRKme1RAKVEYREMDESL
KDM1A104-127 K114me2 | TPEGRRTSRRKme2RAKVEYREMDESL
KDM1A104-127 K114me3 | TPEGRRTSRRKme3RAKVEYREMDESL
KDM1A104-127 S111E K114me2 | TPEGRRTERRKme2RAKVEYREMDESL
KDM1A104-127 R113A K114me2 | TPEGRRTSRAKme2RAKVEYREMDESL

Table 8.4 Primers used in this study.

Primer name | Nucleotide sequence of primer (5' to 3')
CHD1270-449fwd-pOPINE | AGGAGATATACCATGGAATTTGAAACGATCGAGCGTTTCATG
CHD1270-449rev-pOPINE | GTGATGGTGATGTTTCGTGGTTTTGCTCTGATTACGAGAAAAGT
CHD1267-449fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGGAAGAGGAGGAATTTGAAACGATCGAGCG
CHD1267-449rev-pOPINF | ATGGTCTAGAAAGCTTTACGTGGTTTTGCTCTGATTACGAGAAAAGTATTCATC
CHD1267-449fwd-pOPINE | AGGAGATATACCATGGAAGAGGAGGAATTTGAAACGATCGAGCG
CHD1267-449rev-pOPINE | GTGATGGTGATGTTTCGTGGTTTTGCTCTGATTACGAGAAAAGTATTCATC
CHD1222-449fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGGAGGAAGACTACGACAACGACAAACGTAGC
CHD1222-449rev-pOPINF | ATGGTCTAGAAAGCTTTACGTGGTTTTGCTCTGATTACGAGAAAAGTATTC
CHD1222-449fwd-pOPINE | AGGAGATATACCATGGAGGAAGACTACGACAACGACAAACGTAGC
CHD1222-449rev-pOPINE | GTGATGGTGATGTTTCGTGGTTTTGCTCTGATTACGAGAAAAGTATTC
CHD2260-452fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGAGCGAAACCATTGAAAAGGTGTTAGATTCTC
CHD2260-452rev-pOPINF | ATGGTCTAGAAAGCTTTATGTTTTAGAGTTATTACGTGAGTGGAAAGAATCAATG
CHD2260-452fwd-pOPINE | AGGAGATATACCATGAGCGAAACCATTGAAAAGGTGTTAGATTCTC
CHD2260-452rev-pOPINE | GTGATGGTGATGTTTTGTTTTAGAGTTATTACGTGAGTGGAAAGAATCAATG
CHD4364-506fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGGGGTACGAGACAGACCATCAGGATTATTGTG
CHD4364-506rev-pOPINF | ATGGTCTAGAAAGCTTTACAGAATCTTCTGCACTTTACCCTTCAGCG
CHD4364-506fwd-pOPINE | AGGAGATATACCATGGGGTACGAGACAGACCATCAGGATTATTGTG
CHD4364-506rev-pOPINE | GTGATGGTGATGTTTCAGAATCTTCTGCACTTTACCCTTCAGCG
CHD4499-677fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGAAGGGTAAAGTGCAGAAGATTCTGATTTGGA
CHD4499-677rev-pOPINF | ATGGTCTAGAAAGCTTTACAGTTCACGATGATTCCAATAGGATTGTTTG
CHD4499-677fwd-pOPINE | AGGAGATATACCATGAAGGGTAAAGTGCAGAAGATTCTGATTTGGA
CHD4499-677rev-pOPINE | GTGATGGTGATGTTTCAGTTCACGATGATTCCAATAGGATTGTTTG
CHD6285-435fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGGAACCGCCGGAAGACGATGC
CHD6285-435rev-pOPINF | ATGGTCTAGAAAGCTTTAAATTTCCGGCAGAACCTGCAGG
CHD6285-435fwd-pOPINE | AGGAGATATACCATGGAACCGCCGGAAGACGATGC
CHD6285-435rev-pOPINE | GTGATGGTGATGTTTAATTTCCGGCAGAACCTGCAGG
CHD8624-831fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGACCCTGCCAAGTATGCAGTTCTTCG
CHD8624-831rev-pOPINF | ATGGTCTAGAAAGCTTTAATTCTGGCGGTTGTACCAATTAAACAGAAGCC
CHD8624-831fwd-pOPINE | AGGAGATATACCATGACCCTGCCAAGTATGCAGTTCTTCG
CHD8624-831rev-pOPINE | GTGATGGTGATGTTTATTCTGGCGGTTGTACCAATTAAACAGAAGCC
CHD8624-790fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGACCCTGCCAAGTATGCAGTTCTTCG
CHD8624-790rev-pOPINF | ATGGTCTAGAAAGCTTTAGCGATTCACACGTTTCAATTCCGGATGGC
CHD8624-790fwd-pOPINE | AGGAGATATACCATGACCCTGCCAAGTATGCAGTTCTTCG
CHD8624-790rev-pOPINE | GTGATGGTGATGTTTGCGATTCACACGTTTCAATTCCGGATGGC
CHD8637-831fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGTCCGAAGAAGATGCTGCGATTGTTGATAAGG
CHD8637-831rev-pOPINF | ATGGTCTAGAAAGCTTTAATTCTGGCGGTTGTACCAATTAAACAGAAGCC
CHD8637-831fwd-pOPINE | AGGAGATATACCATGTCCGAAGAAGATGCTGCGATTGTTGATAAGG
CHD8637-831rev-pOPINE | GTGATGGTGATGTTTATTCTGGCGGTTGTACCAATTAAACAGAAGCC
CHD8637-790fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGTCCGAAGAAGATGCTGCGATTGTTGATAAGG
CHD8637-790rev-pOPINF | ATGGTCTAGAAAGCTTTAGCGATTCACACGTTTCAATTCCGGATGGC
CHD8637-790fwd-pOPINE | AGGAGATATACCATGTCCGAAGAAGATGCTGCGATTGTTGATAAGG
CHD8637-790rev-pOPINE | GTGATGGTGATGTTTGCGATTCACACGTTTCAATTCCGGATGGC
CHD8663-831fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGACAGAAGCAGAAGAATTCTTCGTAAAATATAAAAACTATTCG
CHD8663-831rev-pOPINF | ATGGTCTAGAAAGCTTTAATTCTGGCGGTTGTACCAATTAAACAGAAGCC
CHD8663-831fwd-pOPINE | AGGAGATATACCATGACAGAAGCAGAAGAATTCTTCGTAAAATATAAAAACTATTCG
CHD8663-831rev-pOPINE | GTGATGGTGATGTTTATTCTGGCGGTTGTACCAATTAAACAGAAGCC
CHD8663-790fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGACAGAAGCAGAAGAATTCTTCGTAAAATATAAAAACTATTCG
CHD8663-790rev-pOPINF | ATGGTCTAGAAAGCTTTAGCGATTCACACGTTTCAATTCCGGATG
CHD8663-790fwd-pOPINE | AGGAGATATACCATGACAGAAGCAGAAGAATTCTTCGTAAAATATAAAAACTATTCG
CHD8663-790rev-pOPINE | GTGATGGTGATGTTTGCGATTCACACGTTTCAATTCCGGATG
CHD9680-840fwd-pOPINF | AAGTTCTGTTTCAGGGCCCGTTTGTTGAGAATCCTTCGGAAGAAGACGCGG
CHD9680-840rev-pOPINF | ATGGTCTAGAAAGCTTTAAGGGCGATCCAGACGACGGGTATCAGG
CHD9680-840fwd-pOPINE | AGGAGATATACCATGTTTGTTGAGAATCCTTCGGAAGAAGACGCGG
CHD9680-840rev-pOPINE | GTGATGGTGATGTTTAGGGCGATCCAGACGACGGGTATCAGG
CHD1_D425A_fwd | GAGTGCAGCTGGGAAGCTGGAGCTCTCATTTCC
CHD1_D425A_rev | GGAAATGAGAGCTCCAGCTTCCCAGCTGCACTC
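Each pOPINF and pOPINE primer in Table 8.4 begins with a vector-specific 5' extension followed by the gene-specific sequence, and the two CHD1_D425A mutagenesis primers are reverse complements of one another. The following Python sketch shows one way such a primer list could be sanity-checked. It was not part of the original workflow: the extension strings were simply read off the table, and the function names are hypothetical.

```python
# Vector-specific 5' extensions as they appear at the start of the primers
# in Table 8.4 (read off the table; not taken from a vendor manual).
EXTENSIONS = {
    ("pOPINF", "fwd"): "AAGTTCTGTTTCAGGGCCCG",
    ("pOPINF", "rev"): "ATGGTCTAGAAAGCTTTA",
    ("pOPINE", "fwd"): "AGGAGATATACCATG",
    ("pOPINE", "rev"): "GTGATGGTGATGTTT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def split_primer(vector: str, direction: str, primer: str) -> tuple:
    """Split a primer into (vector extension, gene-specific part).

    Raises ValueError if the primer does not start with the extension
    expected for this vector and direction.
    """
    extension = EXTENSIONS[(vector, direction)]
    if not primer.upper().startswith(extension):
        raise ValueError(f"primer lacks the {vector} {direction} extension")
    return extension, primer.upper()[len(extension):]


if __name__ == "__main__":
    # The gene-specific part of the CHD6 forward primers should be the
    # same in both vectors.
    _, in_popinf = split_primer(
        "pOPINF", "fwd", "AAGTTCTGTTTCAGGGCCCGGAACCGCCGGAAGACGATGC")
    _, in_popine = split_primer(
        "pOPINE", "fwd", "AGGAGATATACCATGGAACCGCCGGAAGACGATGC")
    print(in_popinf == in_popine)

    # The mutagenesis primers should be reverse complements of each other.
    fwd = "GAGTGCAGCTGGGAAGCTGGAGCTCTCATTTCC"
    rev = "GGAAATGAGAGCTCCAGCTTCCCAGCTGCACTC"
    print(reverse_complement(fwd) == rev)
```

A check of this kind catches transcription errors such as a truncated extension or a swapped forward/reverse label before primers are ordered.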

Table 8.5 Expression constructs generated in this study.

pET15b-CHD1270-443
pET15b-CHD1270-443 D425A
pOPINE-CHD1270-449
pOPINF-CHD1267-449
pOPINE-CHD1267-449
pOPINF-CHD1222-449
pOPINE-CHD1222-449
pOPINF-CHD2260-452
pOPINE-CHD2260-452
pOPINF-CHD4364-506
pOPINF-CHD4499-677
pOPINE-CHD4499-677
pOPINF-CHD6285-435
pOPINE-CHD6285-435
pOPINF-CHD8624-831
pOPINE-CHD8624-831
pOPINF-CHD8624-790
pOPINE-CHD8624-790
pOPINF-CHD8637-831
pOPINE-CHD8637-831
pOPINF-CHD8637-790
pOPINE-CHD8637-790
pOPINF-CHD8663-831
pOPINE-CHD8663-831
pOPINF-CHD8663-790
pOPINE-CHD8663-790
pOPINF-CHD9680-840
pOPINE-CHD9680-840
pOPINF-KDM4A895-1010
pET15b-KDM4A895-1010
pGEX-4T-1-CHD6276-436
pGEX-4T-1-CHD9685-840

Table 8.6 Commercial crystallisation screens used in this study.

Crystal screen | Provider
Index™ | Hampton Research (USA)
PEG/Ion™ | Hampton Research (USA)
Crystal Screen™ | Hampton Research (USA)
ProPlex™ | Molecular Dimensions (UK)
Wizard I & II™ | Molecular Dimensions (UK)
Wizard III & IV™ | Molecular Dimensions (UK)
Morpheus® | Molecular Dimensions (UK)
PACT premier™ | Molecular Dimensions (UK)

8.2 CHD protein expression and purification protocols

8.2.1 CHD2260-452 expression and purification

8.2.1.1 Transformation and protein expression

The pOPINF-CHD2260-452 vector was transformed into E. coli Lemo21(DE3) using the heat shock method. A 200 µL aliquot of the transformation mixture was spread onto a 20 mL agar plate supplemented with carbenicillin and chloramphenicol at final concentrations of 100 µg/mL and 34 µg/mL, respectively. Colonies formed following overnight incubation at 37 °C. A single colony was used to inoculate 10 mL of LB media supplemented with carbenicillin and chloramphenicol at final concentrations of 50 µg/mL and 34 µg/mL, respectively. The culture was incubated overnight in a 37 °C incubator shaking at 225 rpm. The overnight culture (10 mL) was added to 1 L of terrific broth supplemented with carbenicillin and chloramphenicol at final concentrations of 50 µg/mL and 34 µg/mL, respectively. Cells were then incubated at 37 °C with shaking at 220 rpm until an OD600 of 0.6-0.8 was reached. The culture was cooled to 16 °C and expression was induced by addition of IPTG to a final concentration of 0.5 mM. The culture was then incubated for 20 h at 16 °C with shaking at 220 rpm. Cells were pelleted by centrifugation, frozen in liquid nitrogen, and stored at -80 °C.
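The final concentrations given above can be converted into volumes of stock solution using C1V1 = C2V2. The sketch below is illustrative only: the stock concentrations (1 M IPTG, 100 mg/mL carbenicillin, 34 mg/mL chloramphenicol) are assumptions and are not stated in this protocol, and the small volume added by the stock itself is neglected.

```python
def stock_volume_ml(final_conc: float, final_volume_ml: float,
                    stock_conc: float) -> float:
    """Volume of stock (mL) needed to reach final_conc in final_volume_ml.

    final_conc and stock_conc must be given in the same units.
    Uses C1 * V1 = C2 * V2 and ignores the volume added by the stock.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * final_volume_ml / stock_conc


if __name__ == "__main__":
    culture_ml = 1000.0  # 1 L expression culture, as in the protocol

    # Assumed stock concentrations (hypothetical, not from the protocol).
    iptg_stock_mm = 1000.0        # 1 M IPTG
    carb_stock_ug_ml = 100_000.0  # 100 mg/mL carbenicillin
    cam_stock_ug_ml = 34_000.0    # 34 mg/mL chloramphenicol

    print("IPTG (mL):", stock_volume_ml(0.5, culture_ml, iptg_stock_mm))
    print("Carbenicillin (mL):",
          stock_volume_ml(50.0, culture_ml, carb_stock_ug_ml))
    print("Chloramphenicol (mL):",
          stock_volume_ml(34.0, culture_ml, cam_stock_ug_ml))
```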

8.2.1.2 Cell lysis

Cells were resuspended in lysis buffer (50 mM Tris pH 8, 500 mM NaCl, 20 mM imidazole, cOmplete™ EDTA-free protease inhibitor cocktail (Sigma-Aldrich, USA)) and lysed using a Constant Cell Disruption System (Constant Systems Ltd., UK). Cell debris was pelleted by centrifugation at 20,000 rpm for 1 h. The supernatant was passed through a 0.22 µm filter.
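The centrifugation speed is reported in rpm, which corresponds to a different force in different rotors. For reproduction in another rotor, speed can be converted to relative centrifugal force with RCF = 1.118 × 10^-5 × r × rpm², where r is the rotor radius in cm. The sketch below assumes a placeholder radius of 10 cm, because the rotor used is not specified here.

```python
import math


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Speed (rpm) needed to reach a given RCF in a rotor of this radius."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


if __name__ == "__main__":
    assumed_radius_cm = 10.0  # hypothetical rotor radius, not from the protocol
    force = rcf_from_rpm(20_000, assumed_radius_cm)
    print(f"20,000 rpm at r = {assumed_radius_cm} cm is about {force:.0f} x g")
```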

8.2.1.3 IMAC

The supernatant was passed through a 5 mL HisTrap™ HP column (GE Healthcare, USA). The column was attached to an ÄKTApurifier (GE Healthcare, USA). Weakly bound proteins were washed off the column by flowing through 50 mM Tris pH 8, 500 mM NaCl, 20 mM imidazole until the A280 stabilised. Protein was then eluted using a linear gradient of increasing imidazole concentration.
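For a linear gradient, the imidazole concentration at any point follows from linear interpolation between the start and end buffers. The sketch below is a hedged illustration: the 20 mM starting concentration is that of the wash buffer above, whereas the 500 mM end point and the 20 column volume gradient length are assumptions, since neither is given for this step.

```python
def imidazole_mm(column_volumes: float, start_mm: float = 20.0,
                 end_mm: float = 500.0, gradient_cv: float = 20.0) -> float:
    """Imidazole concentration (mM) after a given number of column volumes.

    end_mm and gradient_cv are assumed defaults, not protocol values.
    """
    if gradient_cv <= 0:
        raise ValueError("gradient length must be positive")
    # Clamp to the gradient so values outside it return the end points.
    fraction = min(max(column_volumes / gradient_cv, 0.0), 1.0)
    return start_mm + (end_mm - start_mm) * fraction


if __name__ == "__main__":
    for cv in (0, 5, 10, 20):
        print(f"{cv:>2} CV: {imidazole_mm(cv):.0f} mM imidazole")
```

Knowing the concentration at which the protein elutes helps when choosing a step elution for later, faster purifications.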

8.2.1.4 3C cleavage and dialysis

His6-HRV-3C enzyme was added to His6-3C-CHD2260-452 at a ratio of 50 µg of His6-HRV-3C per mg of His6-3C-CHD2260-452. The reaction mixture was transferred to moistened dialysis tubing (MWCO 3000 Da) and dialysed overnight at 4 °C into 2 L of 50 mM Tris pH 7.5, 500 mM NaCl, 1 mM DTT.
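The protease amount scales linearly with the amount of tagged protein. As a small worked sketch of the 50 µg per mg ratio stated above, the function below returns the mass of His6-HRV-3C required and, given a protease stock concentration, the volume to add. The example protein amount and the 2 mg/mL stock concentration are assumed values for illustration only.

```python
PROTEASE_UG_PER_MG = 50.0  # ratio stated in the protocol


def protease_needed(protein_mg: float, protease_stock_mg_ml: float):
    """Return (protease mass in ug, stock volume in uL) for a cleavage reaction."""
    if protein_mg < 0 or protease_stock_mg_ml <= 0:
        raise ValueError("amounts must be non-negative and stock positive")
    protease_ug = PROTEASE_UG_PER_MG * protein_mg
    # A stock in mg/mL is numerically equal to ug/uL, so ug / (mg/mL) gives uL.
    volume_ul = protease_ug / protease_stock_mg_ml
    return protease_ug, volume_ul


if __name__ == "__main__":
    # Hypothetical example: 12 mg of tagged protein, 2 mg/mL protease stock.
    mass_ug, vol_ul = protease_needed(12.0, 2.0)
    print(f"{mass_ug:.0f} ug protease, {vol_ul:.0f} uL of stock")
```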

8.2.1.5 Reverse IMAC

The dialysed reaction mixture was passed through a 5 mL HisTrap™ HP column (GE Healthcare, USA) at 4 °C. Unbound, cleaved protein was collected. Any protein that remained bound to the column was eluted with 50 mM Tris pH 8, 500 mM NaCl, 500 mM imidazole. Samples were run on an SDS-PAGE gel to assess the efficiency of 3C cleavage and His6 tag removal from the protein, and the purity of the sample.

8.2.1.6 Gel filtration chromatography

Protein was further purified by gel filtration chromatography using an S75 16/60 column (GE Healthcare, USA) pre-equilibrated in 50 mM Tris pH 8, 200 mM NaCl.

8.2.2 CHD4364-506 expression and purification

8.2.2.1 Transformation and protein expression

Transformation of pOPINF-CHD4364-506 (double PHD domains) into E. coli Lemo21(DE3) and expression of His6-3C-CHD4364-506 were performed as described in section 8.2.1.1.

8.2.2.2 Protein purification

Cell lysis and IMAC purification of His6-3C-CHD4364-506 were performed as described in sections 8.2.1.2 and 8.2.1.3. His6-3C-CHD4364-506 was further purified by gel filtration chromatography as described in section 8.2.1.6. Protein yield is very high; however, protein stability is low. Addition of Zn2+ ions to the growth media may improve protein stability. Cleavage of the His6 tag by His6-HRV-3C requires further optimisation.

8.2.3 CHD4499-677 expression and purification

8.2.3.1 Transformation and protein expression

Transformation of pOPINF-CHD4499-677 into E. coli Lemo21(DE3) and expression of His6-3C-CHD4499-677 were performed as described in section 8.2.1.1.

8.2.3.2 Protein purification

Cell lysis, IMAC, and gel filtration chromatography of His6-3C-CHD4499-677 were performed as described in sections 8.2.1.2, 8.2.1.3, and 8.2.1.6, respectively. The protein is unstable and the yield of soluble protein is low. Buffer optimisation is required to improve stability.

8.2.4 CHD6285-435 protein expression and purification

8.2.4.1 Transformation, protein expression and purification

Transformation of pOPINF-CHD6285-435 into E. coli Lemo21(DE3), protein expression, and purification of His6-3C-CHD6285-435 were performed as described in section 8.2.1.

8.2.5 CHD8637-790 protein expression and purification

8.2.5.1 Transformation and protein expression

The pOPINF-CHD8637-790 vector was transformed into E. coli Lemo21(DE3) using the heat shock method. A 200 µL aliquot of the transformation mixture was spread onto 20 mL LB agar plates supplemented with carbenicillin, chloramphenicol, and L-rhamnose to final concentrations of 100 µg/mL, 34 µg/mL, and 0.625 mM, respectively. Colonies formed after overnight incubation at 37 °C.

A single colony was used to inoculate 20 mL of Power Broth™ (Molecular Dimensions, UK) supplemented with carbenicillin, chloramphenicol, and L-rhamnose at final concentrations of 100 µg/mL, 34 µg/mL, and 0.625 mM, respectively. The culture was incubated overnight in a 37 °C incubator shaking at 225 rpm. The overnight culture (10 mL) was added to 1 L of auto-induction media supplemented with carbenicillin, chloramphenicol, and L-rhamnose at final concentrations of 50 µg/mL, 34 µg/mL, and 0.625 mM, respectively. Cells were then incubated at 37 °C with shaking at 220 rpm for 4 h. Following this, the culture was incubated for 24 h at 25 °C with shaking at 220 rpm. Cells were pelleted by centrifugation, frozen in liquid nitrogen, and stored at -80 °C.

8.2.5.2 Protein purification

Cells were resuspended in lysis buffer (50 mM HEPES pH 7.5, 500 mM NaCl, 20 mM imidazole, 5% v/v glycerol, cOmplete™ EDTA-free protease inhibitor cocktail (Sigma-Aldrich, USA)) and lysed using a Constant Cell Disruption System (Constant Systems Ltd., UK). Cell debris was pelleted by centrifugation at 20,000 rpm for 1 h. The supernatant was passed through a 0.22 µm filter.

The supernatant was passed through a 1 mL HisTrap™ HP column (GE Healthcare, USA) pre-equilibrated in lysis buffer. Weakly bound protein was washed off the column with lysis buffer. Remaining bound protein was eluted from the column with elution buffer (50 mM HEPES pH 7.5, 500 mM NaCl, 300 mM imidazole, 5% v/v glycerol). Sample identity and purity were assessed by SDS-PAGE. Protein was further purified by gel filtration chromatography with an S75 16/60 column (GE Healthcare, USA). Protein yield is low (1 mg of protein per litre of culture).

8.2.6 CHD9680-840 protein expression and purification

8.2.6.1 Transformation and protein expression

Transformation of pOPINE-CHD9680-840 into E. coli Lemo21(DE3) and expression of CHD9680-840-His6 were performed as described in section 8.2.1.1.

8.2.6.2 Cell lysis

Cells were resuspended in lysis buffer (50 mM HEPES pH 7.5, 500 mM NaCl, 20 mM imidazole, cOmplete™ EDTA-free protease inhibitor cocktail (Sigma-Aldrich, USA)) and lysed using a Constant Cell Disruption System (Constant Systems Ltd., UK). Cell debris was pelleted by centrifugation at 20,000 rpm for 1 h. The supernatant was passed through a 0.22 µm filter.

8.2.6.3 IMAC

The supernatant was passed through a 5 mL HisTrap™ HP column (GE Healthcare, USA). The column was attached to an ÄKTApurifier (GE Healthcare, USA). Weakly bound proteins were washed off the column by flowing through 50 mM HEPES pH 7.5, 500 mM NaCl, 20 mM imidazole until the A280 stabilised. Protein was then eluted using a linear gradient of increasing imidazole concentration.

8.2.6.4 Gel filtration chromatography

Protein was further purified by gel filtration chromatography using an S75 16/60 column (GE Healthcare, USA) pre-equilibrated in 50 mM HEPES pH 7.5, 200 mM NaCl.

References 1. Siegel, R., D. Naishadham, and A. Jemal, Cancer statistics, 2013. CA Cancer J Clin, 2013. 63(1): p. 11-30. 2. Marugame, T. and K. Katanoda, International comparisons of cumulative risk of breast and prostate cancer, from cancer incidence in five continents Vol. VIII. Jpn J Clin Oncol, 2006. 36(6): p. 399-400. 3. Torre, L.A., et al., Global cancer statistics, 2012. CA Cancer J Clin, 2015. 65(2): p. 87-108. 4. Lin, C., et al., Nuclear receptor-induced chromosomal proximity and DNA breaks underlie specific translocations in cancer. Cell, 2009. 139(6): p. 1069- 83. 5. Tomlins, S.A., et al., Recurrent fusion of TMPRSS2 and ETS transcription factor genes in prostate cancer. Science, 2005. 310(5748): p. 644-8. 6. Attard, G., et al., Prostate cancer. The Lancet, 2016. 387(10013): p. 70-82. 7. Rubin, M.A., C.A. Maher, and A.M. Chinnaiyan, Common gene rearrangements in prostate cancer. J Clin Oncol, 2011. 29(27): p. 3659-68. 8. Barbieri, C.E., et al., Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat Genet, 2012. 44(6): p. 685-9. 9. Huang, S., et al., Recurrent deletion of CHD1 in prostate cancer with relevance to cell invasiveness. Oncogene, 2012. 31(37): p. 4164-70. 10. Cucchiara, V., et al., Epigenomic regulation of androgen receptor signaling: Potential role in prostate cancer therapy. Cancers (Basel), 2017. 9(1). 11. Metzger, E., et al., Assembly of methylated KDM1A and CHD1 drives androgen receptor-dependent transcription and translocation. Nat Struct Mol Biol, 2016. 23(2): p. 132-9. 12. Dawson, M.A. and T. Kouzarides, Cancer epigenetics: from mechanism to therapy. Cell, 2012. 150(1): p. 12-27. 13. Andrews, F.H., B.D. Strahl, and T.G. Kutateladze, Insights into newly discovered marks and readers of epigenetic information. Nat Chem Biol, 2016. 12(9): p. 662-8. 14. Bernstein, B.E., et al., Genomic maps and comparative analysis of histone modifications in human and mouse. Cell, 2005. 120(2): p. 169-81.

168

References

15. Kim, T.H., et al., A high-resolution map of active promoters in the . Nature, 2005. 436(7052): p. 876-80. 16. Kimura, H., Histone modifications for human epigenome analysis. J Hum Genet, 2013. 58(7): p. 439-45. 17. Metzger, E., et al., LSD1 demethylates repressive histone marks to promote androgen-receptor-dependent transcription. Nature, 2005. 437(7057): p. 436- 9. 18. Qian, C. and M.M. Zhou, SET domain protein lysine methyltransferases: Structure, specificity and catalysis. Cell Mol Life Sci, 2006. 63(23): p. 2755- 63. 19. Lohse, B., et al., Inhibitors of histone demethylases. Bioorg. Med. Chem., 2011. 19(12): p. 3625-3636. 20. Shi, Y., et al., Histone demethylation mediated by the nuclear amine oxidase homolog LSD1. Cell, 2004. 119(7): p. 941-53. 21. Kong, X., et al., Catalytic mechanism investigation of lysine-specific demethylase 1 (LSD1): a computational study. PLoS One, 2011. 6(9): p. e25444. 22. Shi, Y.J., et al., Regulation of LSD1 histone demethylase activity by its associated factors. Mol Cell, 2005. 19(6): p. 857-64. 23. Lee, M.G., et al., An essential role for CoREST in nucleosomal histone 3 lysine 4 demethylation. Nature, 2005. 437(7057): p. 432-5. 24. Huang, J., et al., p53 is regulated by the lysine demethylase LSD1. Nature, 2007. 449(7158): p. 105-8. 25. Wang, J., et al., The lysine demethylase LSD1 (KDM1) is required for maintenance of global DNA methylation. Nat. Genet., 2009. 41(1): p. 125-129. 26. Jin, Y., et al., Nuclear import of human histone lysine-specific demethylase LSD1. The Journal of Biochemistry, 2014. 156(6): p. 305-313. 27. Costa, R., et al., The lysine-specific demethylase 1 is a novel substrate of protein kinase CK2. Biochim Biophys Acta, 2014. 1844(4): p. 722-9. 28. Nam, H.J., et al., Phosphorylation of LSD1 by PKCalpha is crucial for circadian rhythmicity and phase resetting. Mol Cell, 2014. 53(5): p. 791-805. 29. Peng, B., et al., Phosphorylation of LSD1 by PLK1 promotes its chromatin release during mitosis. 
Cell Biosci, 2017. 7: p. 15.

169

References

30. Peng, B., et al., Modulation of LSD1 phosphorylation by CK2/WIP1 regulates RNF168-dependent 53BP1 recruitment in response to DNA damage. Nucleic Acids Res, 2015. 43(12): p. 5936-47.
31. Forneris, F., et al., Structural basis of LSD1-CoREST selectivity in histone H3 recognition. J Biol Chem, 2007. 282(28): p. 20070-4.
32. Siklenka, K., et al., Disruption of histone methylation in developing sperm impairs offspring health transgenerationally. Science, 2015. 350(6261): p. aab2006.
33. Whyte, W.A., et al., Enhancer decommissioning by LSD1 during embryonic stem cell differentiation. Nature, 2012. 482(7384): p. 221-5.
34. Thambyrajah, R., et al., GFI1 proteins orchestrate the emergence of haematopoietic stem cells through recruitment of LSD1. Nat Cell Biol, 2016. 18(1): p. 21-32.
35. Hino, S., et al., FAD-dependent lysine-specific demethylase-1 regulates cellular energy expenditure. Nat Commun, 2012. 3: p. 758.
36. Fuentes, P., et al., CoREST/LSD1 control the development of pyramidal cortical neurons. Cereb Cortex, 2012. 22(6): p. 1431-41.
37. Kahl, P., et al., Androgen receptor coactivators lysine-specific histone demethylase 1 and four and a half LIM domain protein 2 predict risk of prostate cancer recurrence. Cancer Res, 2006. 66(23): p. 11341-7.
38. Schildhaus, H.U., et al., Lysine-specific demethylase 1 is highly expressed in solitary fibrous tumors, synovial sarcomas, rhabdomyosarcomas, desmoplastic small round cell tumors, and malignant peripheral nerve sheath tumors. Hum Pathol, 2011. 42(11): p. 1667-75.
39. Lim, S., et al., Lysine-specific demethylase 1 (LSD1) is highly expressed in ER-negative breast cancers and a biomarker predicting aggressive biology. Carcinogenesis, 2010. 31(3): p. 512-20.
40. Lv, T., et al., Over-expression of LSD1 promotes proliferation, migration and invasion in non-small cell lung cancer. PLoS One, 2012. 7(4): p. e35065.
41. Ketscher, A., et al., LSD1 controls metastasis of androgen-independent prostate cancer cells through PXN and LPAR6. Oncogenesis, 2014. 3: p. e120.

42. Willmann, D., et al., Impairment of prostate cancer cell growth by a selective and reversible lysine-specific demethylase 1 inhibitor. Int J Cancer, 2012. 131(11): p. 2704-9.
43. Kashyap, V., et al., The lysine specific demethylase-1 (LSD1/KDM1A) regulates VEGF-A expression in prostate cancer. Mol Oncol, 2013. 7(3): p. 555-66.
44. Mohammad, H.P. and R.G. Kruger, Antitumor activity of LSD1 inhibitors in lung cancer. Mol Cell Oncol, 2016. 3(2): p. e1117700.
45. Fiskus, W., et al., Highly effective combination of LSD1 (KDM1A) antagonist and pan-histone deacetylase inhibitor against human AML cells. Leukemia, 2017.
46. Gupta, S., et al., Reversible lysine-specific demethylase 1 antagonist HCI-2509 inhibits growth and decreases c-MYC in castration- and docetaxel-resistant prostate cancer cells. Prostate Cancer Prostatic Dis, 2016.
47. Haydn, T., et al., Concomitant epigenetic targeting of LSD1 and HDAC synergistically induces mitochondrial apoptosis in rhabdomyosarcoma cells. Cell Death Dis, 2017. 8(6): p. e2879.
48. Klose, R.J., E.M. Kallin, and Y. Zhang, JmjC-domain-containing proteins and histone demethylation. Nat Rev Genet, 2006. 7(9): p. 715-27.
49. Bannister, A.J., et al., Spatial distribution of di- and tri-methyl lysine 36 of histone H3 at active genes. J Biol Chem, 2005. 280(18): p. 17732-6.
50. Guenther, M.G., et al., A chromatin landmark and transcription initiation at most promoters in human cells. Cell, 2007. 130(1): p. 77-88.
51. Huang, Y., et al., Recognition of histone H3 lysine-4 methylation by the double tudor domain of JMJD2A. Science, 2006. 312(5774): p. 748-51.
52. Lee, J., et al., Distinct binding modes specify the recognition of methylated histones H3K4 and H4K20 by JMJD2A-tudor. Nat Struct Mol Biol, 2008. 15(1): p. 109-11.
53. Su, Z., et al., Reader domain specificity and lysine demethylase-4 family function. Nat Commun, 2016. 7: p. 13387.
54. Ng, S.S., et al., Crystal structures of histone demethylase JMJD2A reveal basis for substrate specificity. Nature, 2007. 448(7149): p. 87-91.

55. Hancock, R.L., et al., The activity of JmjC histone lysine demethylase KDM4A is highly sensitive to oxygen concentrations. ACS Chem Biol, 2017. 12(4): p. 1011-1019.
56. Markolovic, S., et al., Structure-function relationships of human JmjC oxygenases-demethylases versus hydroxylases. Curr Opin Struct Biol, 2016. 41: p. 62-72.
57. Kauffman, E.C., et al., Role of androgen receptor and associated lysine-demethylase coregulators, LSD1 and JMJD2A, in localized and advanced human bladder cancer. Mol Carcinog, 2011. 50(12): p. 931-44.
58. Shin, S. and R. Janknecht, Activation of androgen receptor by histone demethylases JMJD2A and JMJD2D. Biochem Biophys Res Commun, 2007. 359(3): p. 742-6.
59. Kim, T.D., et al., The JMJD2A demethylase regulates apoptosis and proliferation in colon cancer cells. J Cell Biochem, 2012. 113(4): p. 1368-76.
60. Kogure, M., et al., Deregulation of the histone demethylase JMJD2A is involved in human carcinogenesis through regulation of the G(1)/S transition. Cancer Lett, 2013. 336(1): p. 76-84.
61. Mallette, F.A. and S. Richard, JMJD2A promotes cellular transformation by blocking cellular senescence through transcriptional repression of the tumor suppressor CHD5. Cell Rep, 2012. 2(5): p. 1233-43.
62. Ding, X., et al., Epigenetic activation of AP1 promotes squamous cell carcinoma metastasis. Sci Signal, 2013. 6(273): p. ra28.1-13, S0-15.
63. Black, J.C., et al., KDM4A lysine demethylase induces site-specific copy gain and rereplication of regions amplified in tumors. Cell, 2013. 154(3): p. 541-55.
64. Patani, N., et al., Histone-modifier gene expression profiles are associated with pathological and clinical outcomes in human breast cancer. Anticancer Res, 2011. 31(12): p. 4115-25.
65. Berry, W.L., et al., Oncogenic features of the JMJD2A histone demethylase in breast cancer. Int J Oncol, 2012. 41(5): p. 1701-6.
66. Zhang, Q.J., et al., The histone trimethyllysine demethylase JMJD2A promotes cardiac hypertrophy in response to hypertrophic stimuli in mice. J Clin Invest, 2011. 121(6): p. 2447-56.

67. Roatsch, M., et al., Substituted 2-(2-aminopyrimidin-4-yl)pyridine-4-carboxylates as potent inhibitors of JumonjiC domain-containing histone demethylases. Future Med. Chem., 2016.
68. Sekirnik, R., et al., Inhibition of the histone lysine demethylase JMJD2A by ejection of structural Zn(II). Chem Commun (Camb), 2009(42): p. 6376-8.
69. Kawamura, A., et al., Highly selective inhibition of histone demethylases by de novo macrocyclic peptides. Nat Commun, 2017. 8: p. 14773.
70. Hamada, S., et al., Synthesis and activity of N-oxalylglycine and its derivatives as Jumonji C-domain-containing histone lysine demethylase inhibitors. Bioorg Med Chem Lett, 2009. 19(10): p. 2852-5.
71. Chen, Y.K., et al., Design of KDM4 inhibitors with antiproliferative effects in cancer models. ACS Med Chem Lett, 2017. 8(8): p. 869-874.
72. Metzger, E., et al., KDM4 inhibition targets breast cancer stem-like cells. Cancer Res, 2017.
73. Dhalluin, C., et al., Structure and ligand of a histone acetyltransferase bromodomain. Nature, 1999. 399(6735): p. 491-6.
74. Lange, M., et al., Regulation of muscle development by DPF3, a novel histone acetylation and methylation reader of the BAF complex. Genes Dev, 2008. 22(17): p. 2370-84.
75. Su, D., et al., Structural basis for recognition of H3K56-acetylated histone H3-H4 by the chaperone Rtt106. Nature, 2012. 483(7387): p. 104-7.
76. Qin, S., et al., Structural basis for histone mimicry and hijacking of host proteins by influenza virus protein NS1. Nat Commun, 2014. 5: p. 3952.
77. Cui, G., et al., PHF20 is an effector protein of p53 double lysine methylation that stabilizes and activates p53. Nat Struct Mol Biol, 2012. 19(9): p. 916-24.
78. Esteve, P.O., et al., Methyllysine reader plant homeodomain (PHD) finger protein 20-like 1 (PHF20L1) antagonizes DNA (cytosine-5) methyltransferase 1 (DNMT1) proteasomal degradation. J Biol Chem, 2014. 289(12): p. 8277-87.
79. Min, J., et al., L3MBTL1 recognition of mono- and dimethylated histones. Nat Struct Mol Biol, 2007. 14(12): p. 1229-30.
80. Jacobs, S.A. and S. Khorasanizadeh, Structure of HP1 chromodomain bound to a lysine 9-methylated histone H3 tail. Science, 2002. 295(5562): p. 2080-2083.

81. Baker, N.A., et al., Electrostatics of nanosystems: application to microtubules and the ribosome. Proc Natl Acad Sci U S A, 2001. 98(18): p. 10037-41.
82. Sanchez, R. and M.M. Zhou, The PHD finger: a versatile epigenome reader. Trends Biochem Sci, 2011. 36(7): p. 364-72.
83. Zeng, L., et al., Mechanism and regulation of acetylated histone binding by the tandem PHD finger of DPF3b. Nature, 2010. 466(7303): p. 258-62.
84. Mansfield, R.E., et al., Plant homeodomain (PHD) fingers of CHD4 are histone H3-binding modules with preference for unmodified H3K4 and methylated H3K9. J Biol Chem, 2011. 286(13): p. 11779-91.
85. Pena, P.V., et al., Molecular mechanism of histone recognition by plant homeodomain of ING2. Nature, 2006. 442(7098): p. 100-3.
86. Matthews, A.G., et al., RAG2 PHD finger couples histone H3 lysine 4 trimethylation with V(D)J recombination. Nature, 2007. 450(7172): p. 1106-10.
87. Mallette, F.A., et al., RNF8- and RNF168-dependent degradation of KDM4A/JMJD2A triggers 53BP1 recruitment to DNA damage sites. EMBO J, 2012. 31(8): p. 1865-78.
88. Bannister, A.J., et al., Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature, 2001. 410(6824): p. 120-4.
89. Min, J., Y. Zhang, and R.M. Xu, Structural basis for specific binding of Polycomb chromodomain to histone H3 methylated at Lys 27. Genes Dev, 2003. 17(15): p. 1823-8.
90. Kim, D., et al., Corecognition of DNA and a methylated histone tail by the MSL3 chromodomain. Nat Struct Mol Biol, 2010. 17(8): p. 1027-9.
91. Xu, C., et al., Structural basis for the recognition of methylated histone H3K36 by the Eaf3 subunit of histone deacetylase complex Rpd3S. Structure, 2008. 16(11): p. 1740-50.
92. Flanagan, J.F., et al., Double chromodomains cooperate to recognize the methylated histone H3 tail. Nature, 2005. 438(7071): p. 1181-5.
93. Delmas, V., D.G. Stokes, and R.P. Perry, A mammalian DNA-binding protein that contains a chromodomain and an SNF2/SWI2-like helicase domain. Proc Natl Acad Sci U S A, 1993. 90(6): p. 2414-8.

94. Ryan, D.P., et al., The DNA-binding domain of the Chd1 chromatin-remodelling enzyme contains SANT and SLIDE domains. EMBO J, 2011. 30(13): p. 2596-609.
95. Mohanty, B., et al., The chromatin remodelling protein CHD1 contains a previously unrecognised C-terminal helical domain. J Mol Biol, 2016.
96. Oliver, S.S., et al., Multivalent recognition of histone tails by the PHD fingers of CHD5. Biochemistry, 2012. 51(33): p. 6534-44.
97. Bouazoune, K., et al., The dMi-2 chromodomains are DNA binding modules important for ATP-dependent nucleosome mobilization. EMBO J, 2002. 21(10): p. 2430-40.
98. Watson, A.A., et al., The PHD and chromo domains regulate the ATPase activity of the human chromatin remodeler CHD4. J Mol Biol, 2012. 422(1): p. 3-17.
99. Egan, C.M., et al., CHD5 is required for neurogenesis and has a dual role in facilitating gene expression and polycomb gene repression. Dev Cell, 2013. 26(3): p. 223-36.
100. Rodriguez-Paredes, M., et al., The chromatin remodeling factor CHD8 interacts with elongating RNA polymerase II and controls expression of the cyclin E2 gene. Nucleic Acids Res, 2009. 37(8): p. 2449-60.
101. Schnetz, M.P., et al., Genomic distribution of CHD7 on chromatin tracks H3K4 methylation patterns. Genome Res, 2009. 19(4): p. 590-601.
102. Shanle, E.K., et al., Histone peptide microarray screen of chromo and Tudor domains defines new histone lysine methylation interactions. Epigenetics Chromatin, 2017. 10: p. 12.
103. Kim, M.S., et al., Genetic and expressional alterations of CHD genes in gastric and colorectal cancers. Histopathology, 2011. 58(5): p. 660-8.
104. Bagchi, A., et al., CHD5 is a tumor suppressor at human 1p36. Cell, 2007. 128(3): p. 459-75.
105. Vissers, L.E., et al., Mutations in a new member of the chromodomain gene family cause CHARGE syndrome. Nat Genet, 2004. 36(9): p. 955-7.
106. Barnard, R.A., M.B. Pomaville, and B.J. O'Roak, Mutations and modeling of the chromatin remodeler CHD8 define an emerging autism etiology. Front Neurosci, 2015. 9: p. 477.

107. Lieleg, C., et al., Nucleosome spacing generated by ISWI and CHD1 remodelers is constant regardless of nucleosome density. Mol Cell Biol, 2015. 35(9): p. 1588-605.
108. Hughes, A.L. and O.J. Rando, Comparative genomics reveals Chd1 as a determinant of nucleosome spacing in vivo. G3 (Bethesda), 2015. 5(9): p. 1889-97.
109. Lusser, A., D.L. Urwin, and J.T. Kadonaga, Distinct activities of CHD1 and ACF in ATP-dependent chromatin assembly. Nat Struct Mol Biol, 2005. 12(2): p. 160-6.
110. Rodgers, M.J., et al., CHD1 and CHD2 are positive regulators of HIV-1 gene expression. Virol J, 2014. 11: p. 180.
111. Piatti, P., et al., Embryonic stem cell differentiation requires full length Chd1. Sci Rep, 2015. 5: p. 8007.
112. Gaspar-Maia, A., et al., Chd1 regulates open chromatin and pluripotency of embryonic stem cells. Nature, 2009. 460(7257): p. 863-8.
113. Koh, F.M., et al., Emergence of hematopoietic stem and progenitor cells involves a Chd1-dependent increase in total nascent transcription. Proc Natl Acad Sci U S A, 2015. 112(14): p. E1734-43.
114. Burkhardt, L., et al., CHD1 is a 5q21 tumor suppressor required for ERG rearrangement in prostate cancer. Cancer Res, 2013. 73(9): p. 2795-805.
115. Liu, W., et al., Identification of novel CHD1-associated collaborative alterations of genomic structure and functional assessment of CHD1 in prostate cancer. Oncogene, 2012. 31(35): p. 3939-48.
116. Okuda, M., M. Horikoshi, and Y. Nishimura, Structural polymorphism of chromodomains in Chd1. J Mol Biol, 2007. 365(4): p. 1047-62.
117. Flanagan, J.F., et al., Molecular implications of evolutionary differences in CHD double chromodomains. J Mol Biol, 2007. 369(2): p. 334-42.
118. Hauk, G., et al., The chromodomains of the Chd1 chromatin remodeler regulate DNA access to the ATPase motor. Mol Cell, 2010. 39(5): p. 711-23.
119. Sharma, A., et al., Crystal structure of the chromodomain helicase DNA-binding protein 1 (Chd1) DNA-binding domain in complex with DNA. J Biol Chem, 2011. 286(49): p. 42099-104.
120. Farnung, L., et al., Nucleosome-Chd1 structure and implications for chromatin remodelling. Nature, 2017.

121. Sims, R.J., et al., Human but not yeast CHD1 binds directly and selectively to histone H3 methylated at lysine 4 via its tandem chromodomains. Journal of Biological Chemistry, 2005. 280(51): p. 41789-41792.
122. Sims, R.J., 3rd, et al., Recognition of trimethylated histone H3 lysine 4 facilitates the recruitment of transcription postinitiation factors and pre-mRNA splicing. Mol Cell, 2007. 28(4): p. 665-76.
123. Morettini, S., et al., The chromodomains of CHD1 are critical for enzymatic activity but less important for chromatin localization. Nucleic Acids Res, 2011. 39(8): p. 3103-15.
124. Nodelman, I.M., et al., Interdomain communication of the Chd1 chromatin remodeler across the DNA gyres of the nucleosome. Mol Cell, 2017.
125. Baca, S.C., et al., Punctuated evolution of prostate cancer genomes. Cell, 2013. 153(3): p. 666-77.
126. Kari, V., et al., Loss of CHD1 causes DNA repair defects and enhances prostate cancer therapeutic responsiveness. EMBO Rep, 2016. 17(11): p. 1609-1623.
127. Yadav, S.S., et al., Next-generation sequencing technology in prostate cancer diagnosis, prognosis, and personalized treatment. Urol Oncol, 2015. 33(6): p. 267.e1-13.
128. Grasso, C.S., et al., The mutational landscape of lethal castration-resistant prostate cancer. Nature, 2012. 487(7406): p. 239-43.
129. Greschik, H., R. Schüle, and T. Gunther, Selective targeting of epigenetic reader domains. Expert Opin Drug Discov, 2017. 12(5): p. 449-463.
130. Brown, P.J. and S. Muller, Open access chemical probes for epigenetic targets. Future Med Chem, 2015. 7(14): p. 1901-17.
131. Pachaiyappan, B. and P.M. Woster, Design of small molecule epigenetic modulators. Bioorg Med Chem Lett, 2014. 24(1): p. 21-32.
132. Ackloo, S., P.J. Brown, and S. Muller, Chemical probes targeting epigenetic proteins: Applications beyond oncology. Epigenetics, 2017. 12(5): p. 378-400.
133. Jones, P.A., J.P. Issa, and S. Baylin, Targeting the cancer epigenome for therapy. Nat Rev Genet, 2016. 17(10): p. 630-41.

134. Santiago, C., K. Nguyen, and M. Schapira, Druggability of methyl-lysine binding sites. J Comput Aided Mol Des, 2011. 25(12): p. 1171-8.
135. Vidler, L.R., et al., Druggability analysis and structural classification of bromodomain acetyl-lysine binding sites. J Med Chem, 2012. 55(17): p. 7346-59.
136. Dorosz, J., et al., Structure-based design of a new scaffold for cell-penetrating peptidic inhibitors of the histone demethylase PHF8. Chembiochem, 2017. 18(14): p. 1369-1375.
137. Filippakopoulos, P., et al., Selective inhibition of BET bromodomains. Nature, 2010. 468(7327): p. 1067-73.
138. Nicodeme, E., et al., Suppression of inflammation by a synthetic histone mimic. Nature, 2010. 468(7327): p. 1119-23.
139. Dawson, M.A., et al., Inhibition of BET recruitment to chromatin as an effective treatment for MLL-fusion leukaemia. Nature, 2011. 478(7370): p. 529-33.
140. Zuber, J., et al., RNAi screen identifies Brd4 as a therapeutic target in acute myeloid leukaemia. Nature, 2011. 478(7370): p. 524-8.
141. Bandukwala, H.S., et al., Selective inhibition of CD4+ T-cell cytokine production and autoimmunity by BET protein and c-Myc inhibitors. Proc Natl Acad Sci U S A, 2012. 109(36): p. 14532-7.
142. Ferri, E., C. Petosa, and C.E. McKenna, Bromodomains: Structure, function and pharmacology of inhibition. Biochem Pharmacol, 2016. 106: p. 1-18.
143. James, L.I., et al., Discovery of a chemical probe for the L3MBTL3 methyllysine reader domain. Nat Chem Biol, 2013. 9(3): p. 184-91.
144. James, L.I., et al., Small-molecule ligands of methyl-lysine binding proteins: optimization of selectivity for L3MBTL3. J Med Chem, 2013. 56(18): p. 7358-71.
145. Ren, C., et al., Small-molecule modulators of methyl-lysine binding for the CBX7 chromodomain. Chem Biol, 2015. 22(2): p. 161-8.
146. Stuckey, J.I., et al., A cellular chemical probe targeting the chromodomains of Polycomb repressive complex 1. Nat Chem Biol, 2016. 12(3): p. 180-7.
147. Wagner, E.K., et al., Identification and characterization of small molecule inhibitors of a plant homeodomain finger. Biochemistry, 2012. 51(41): p. 8293-306.

148. Miller, T.C.R., et al., Competitive binding of a benzimidazole to the histone-binding pocket of the Pygo PHD finger. ACS Chemical Biology, 2014. 9(12): p. 2864-2874.
149. Wagner, T., et al., Identification of a small-molecule ligand of the epigenetic reader protein Spindlin1 via a versatile screening platform. Nucleic Acids Res, 2016.
150. Sweis, R.F., et al., Discovery and development of potent and selective inhibitors of histone methyltransferase g9a. ACS Med Chem Lett, 2014. 5(2): p. 205-9.
151. Kaustov, L., et al., Recognition and specificity determinants of the human cbx chromodomains. J Biol Chem, 2011. 286(1): p. 521-9.
152. Blundell, T.L., Protein crystallography and drug discovery: recollections of knowledge exchange between academia and industry. IUCrJ, 2017. 4(Pt 4): p. 308-321.
153. Zeng, L., et al., Selective small molecules blocking HIV-1 Tat and coactivator PCAF association. J Am Chem Soc, 2005. 127(8): p. 2376-7.
154. Drouin, L., et al., Structure enabled design of BAZ2-ICR, a chemical probe targeting the bromodomains of BAZ2A and BAZ2B. J Med Chem, 2015. 58(5): p. 2553-9.
155. Chen, P., et al., Discovery and characterization of GSK2801, a selective chemical probe for the bromodomains BAZ2A and BAZ2B. J Med Chem, 2016. 59(4): p. 1410-24.
156. Ferguson, F.M., et al., Targeting low-druggability bromodomains: fragment based screening and inhibitor design against the BAZ2B bromodomain. J Med Chem, 2013. 56(24): p. 10183-7.
157. Navratilova, I., et al., Discovery of new bromodomain scaffolds by biosensor fragment screening. ACS Med Chem Lett, 2016. 7(12): p. 1213-1218.
158. Cox, O.B., et al., A poised fragment library enables rapid synthetic expansion yielding the first reported inhibitors of PHIP(2), an atypical bromodomain. Chem. Sci., 2016. 7(3): p. 2322-2330.
159. Espejo, A., et al., A protein-domain microarray identifies novel protein-protein interactions. Biochem J., 2002. 367: p. 697-702.
160. Keller, S., et al., High-precision isothermal titration calorimetry with automated peak-shape analysis. Anal Chem, 2012. 84(11): p. 5066-73.

161. Houtman, J.C., et al., Studying multisite binary and ternary protein interactions by global analysis of isothermal titration calorimetry data in SEDPHAT: application to adaptor protein complexes in cell signaling. Protein Sci, 2007. 16(1): p. 30-42.
162. Brautigam, C.A., et al., Integration and global analysis of isothermal titration calorimetry data for studying macromolecular interactions. Nat Protoc, 2016. 11(5): p. 882-94.
163. Kabsch, W., Xds. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 2): p. 125-32.
164. Evans, P.R. and G.N. Murshudov, How good are my data and what is the resolution? Acta Crystallogr D Biol Crystallogr, 2013. 69(Pt 7): p. 1204-14.
165. McCoy, A.J., et al., Phaser crystallographic software. J Appl Crystallogr, 2007. 40(Pt 4): p. 658-674.
166. Emsley, P., et al., Features and development of Coot. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 4): p. 486-501.
167. Murshudov, G.N., A.A. Vagin, and E.J. Dodson, Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D Biol Crystallogr, 1997. 53: p. 240-255.
168. Winn, M.D., et al., Overview of the CCP4 suite and current developments. Acta Crystallogr D Biol Crystallogr, 2011. 67(Pt 4): p. 235-42.
169. Bricogne, G., et al., BUSTER version 2.11.5. 2017, Cambridge, United Kingdom: Global Phasing Ltd.
170. Chen, V.B., et al., MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 1): p. 12-21.
171. McNicholas, S., et al., Presenting your structures: the CCP4mg molecular-graphics software. Acta Crystallogr D Biol Crystallogr, 2011. 67(Pt 4): p. 386-94.
172. Ng, J.T., et al., Using textons to rank crystallization droplets by the likely presence of crystals. Acta Crystallogr D Biol Crystallogr, 2014. 70(Pt 10): p. 2702-18.
173. Winter, G., xia2: an expert system for macromolecular crystallography data reduction. Journal of Applied Crystallography, 2009. 43(1): p. 186-190.

174. Vonrhein, C., et al., Data processing and analysis with the autoPROC toolbox. Acta Crystallogr D Biol Crystallogr, 2011. 67(Pt 4): p. 293-302.
175. Waterman, D.G., et al., Diffraction-geometry refinement in the DIALS framework. Acta Crystallogr D Struct Biol, 2016. 72(Pt 4): p. 558-75.
176. Evans, P., Scaling and assessment of data quality. Acta Crystallogr D Biol Crystallogr, 2006. 62(Pt 1): p. 72-82.
177. Tickle, I.J., et al., STARANISO. 2018, Cambridge, United Kingdom: Global Phasing Ltd.
178. Krojer, T., et al., The XChemExplorer graphical workflow tool for routine or large-scale protein-ligand structure determination. Acta Crystallogr D Struct Biol, 2017. 73(Pt 3): p. 267-278.
179. Adams, P.D., et al., PHENIX: a comprehensive Python-based system for macromolecular structure solution. Acta Crystallogr D Biol Crystallogr, 2010. 66(Pt 2): p. 213-21.
180. Long, F., et al., AceDRG: a stereochemical description generator for ligands. Acta Crystallogr D Struct Biol, 2017. 73(Pt 2): p. 112-122.
181. Pearce, N.M., et al., A multi-crystal method for extracting obscured crystallographic states from conventionally uninterpretable electron density. Nat Commun, 2017. 8: p. 15123.
182. Tickle, I.J., Statistical quality indicators for electron-density maps. Acta Crystallogr D Biol Crystallogr, 2012. 68(Pt 4): p. 454-67.
183. Kelley, L.A., et al., The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc, 2015. 10(6): p. 845-58.
184. Foadi, J., et al., Clustering procedures for the optimal selection of data sets from multiple crystals in macromolecular crystallography. Acta Crystallogr D Biol Crystallogr, 2013. 69(Pt 8): p. 1617-32.
185. Afonine, P.V., et al., Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr D Biol Crystallogr, 2012. 68(Pt 4): p. 352-67.
186. Kubicek, S., et al., Reversal of H3K9me2 by a small-molecule inhibitor for the G9a histone methyltransferase. Mol Cell, 2007. 25(3): p. 473-81.
187. Rapp, C., et al., Cation-pi interactions of methylated ammonium ions: a quantum mechanical study. Proteins, 2014. 82(7): p. 1494-502.

188. Hung, T., et al., ING4 mediates crosstalk between histone H3 K4 trimethylation and H3 acetylation to attenuate cellular transformation. Mol Cell, 2009. 33(2): p. 248-56.
189. Fischle, W., et al., Regulation of HP1-chromatin binding by histone H3 methylation and phosphorylation. Nature, 2005. 438(7071): p. 1116-22.
190. Tan, P.Y., et al., Integration of regulatory networks by NKX3-1 promotes androgen-dependent prostate cancer survival. Mol Cell Biol, 2012. 32(2): p. 399-414.
191. Wang, Q., J.S. Carroll, and M. Brown, Spatial and temporal recruitment of androgen receptor and its coactivators involves chromosomal looping and polymerase tracking. Mol Cell, 2005. 19(5): p. 631-42.
192. Haffner, M.C., et al., Androgen-induced TOP2B-mediated double-strand breaks and prostate cancer gene rearrangements. Nat Genet, 2010. 42(8): p. 668-75.
193. Robaa, D., et al., Identification and structure-activity relationship studies of small-molecule inhibitors of the methyllysine reader protein Spindlin1. ChemMedChem, 2016. 11(20): p. 2327-2338.
194. Bae, N., et al., Developing Spindlin1 small-molecule inhibitors by using protein microarrays. Nat Chem Biol, 2017. 13(7): p. 750-756.
195. Keseru, G.M., et al., Design principles for fragment libraries: Maximizing the value of learnings from pharma fragment-based drug discovery (FBDD) programs for use in academia. J Med Chem, 2016.
196. Collins, P.M., et al., Gentle, fast and effective crystal soaking by acoustic dispensing. Acta Crystallogr D Struct Biol, 2017. 73(Pt 3): p. 246-255.
197. Pearce, N.M., T. Krojer, and F. von Delft, Proper modelling of ligand binding requires an ensemble of bound and unbound states. Acta Crystallogr D Struct Biol, 2017. 73(Pt 3): p. 256-266.
