An Exploration of Some Aspects of Molecular Replacement in Macromolecular Crystallography
Total Page:16
File Type:pdf, Size:1020Kb
An exploration of some aspects of molecular replacement in macromolecular crystallography Richard Mifsud Christ’s College Submitted: June 2018 This dissertation is submitted for the degree of Doctor of Philosophy Declarations This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except as declared in the Preface and specified in the text. It is not substantially the same as any that I have submitted, or, is being concurrently submitted for a degree or diploma or other qualification at the University of Cambridge or any other University or similar institution except as declared in the Preface and specified in the text. I further state that no substantial part of my dissertation has already been submitted, or, is being concurrently submitted for any such degree, diploma or other qualification at the University of Cambridge or any other University or similar institution except as declared in the Preface and specified in the text. It does not exceed the prescribed word limit for the relevant Degree Committee. 1 Acknowledgements I would like to thank all those who helped fund my research that greatly increased my understanding of this field, allowed me to contribute to the wealth of knowledge, and enabled me to generate this thesis. I have been on the Cambridge Institute for Medical Research 4 year PhD Programme, funded by the Wellcome Trust. I would like to thank INSTRUCT for granting a travel fellowship to allow me to attend the Gordon Research Conference on structural biology in 2014. I would like to give thanks to Professor Randy Read for allowing me to do a Doctor of Philosophy project in his group. He has provided invaluable advice throughout, and has helped guide and develop my project. He has also supported me throughout the project, providing funding to attend the informative phenix conferences held every six months. I would like to thank all the scientists in the Read Group who have helped increase my understanding in several areas. In particular, Gabor Bunkozi for his assistance in explaining the various Python libraries in phenix, and the intricacies of some of the existing code. I would also like to thank him for his advice with the concepts of the NaCelleS project. The initiative shown by Cavan Bennett and his group leader Cedric Ghevaert in introducing me to the issues that had arisen with crystallising the CRLF3 protein is gratefully acknowledged, as is their subsequent support and collaboration in determining the structure of major elements of this protein, as reported in this thesis. The CRLF3 structural determination was collaborative, as it required a range of skills. I designed the constructs that were taken forward for experimentation. The cloning and expression of the protein was undertaken by Cavan, who also purified and prepared the crystals for data collection, working with Yahui Yan. The later preparation of heavy atom derivatives was also done by Yahui. The remote data collection was led by Yahui, with my direct involvement. I led the immediate and subsequent analysis of the collected data, and the solving of the phase problem. Model building and refinement was my work; and the data will shortly be deposited. 2 Airlie McCoy is another person I would like to thank. She has always cheerfully fixed any Phaser bugs I identified and has helped in a number of other ways. Rob Oeffner has been helpful in providing test data, helping to run programs in Windows and on the cluster. Tristan Croll has been a great support in the laboratory and the discussions with WeeLee Chan early in the project were helpful. More broadly within the CIMR, I am very grateful to Jim Huntington and Andreas Floto for the laboratory rotations that I undertook. The CIMR programme has many two hour core seminars, which have helped me to gain a broader understanding, as well as various Wellcome Trust PhD days, which have allowed a broad overview into what other PhD students have been doing. I would like to thank the following people for proof-reading my thesis. Beth Humphreys, John Mifsud, John Aldridge and Lucy Oswald, along with Gordon Frazer for showing me how to bind the thesis. Finally, I would like to thank the CIMR Level 7 Canteen for coping with my innumerable allergies, Christ’s College for all the support they have shown throughout the years, and my long suffering girlfriend Lucy Oswald, for astronomical levels of patience. 3 Abstract This thesis reports work in three areas of X-ray crystallography. An initial chapter describes the structure of a protein, the methods based on the use of X-rays and computer analysis of diffraction patterns to determine crystal structure, and the subsequent derivation of the structure of part or all of a protein molecule. Work to determine the structure of the protein cytokine receptor-like factor 3 (CRLF3) leading to the successful generation of a structural model of a significant part of this molecule is then described in Chapter 2. A variety of techniques had to be deployed to complete this work, and the steps undertaken are described. Analysis was performed principally using phaser, using maximum likelihood methods. Areas for improvement in generating non-crystallographic symmetry (NCS) operators in existing programmes were identified and new and modified algorithms implemented and tested. Searches based on improved single sphere algorithms, and a new two-sphere approach, are reported. These methods showed improvements in many cases and are available for future use. In Chapter 4, work on determining the relative importance of low resolution and high intensity data in molecular replacement solutions is described. This work has shown that high intensity data are more important than the low resolution data, dispelling a common perception and helping in experimental design. 4 Abbreviations 2D two-Dimensional 3D three-Dimensional ASA Accessible Surface Area ATP Adenosine TriPhosphate BCA BiCinchoninic Acid BIS-TRIS Bis-tris methane BRIDGE The BRIDGE Consortium is an organisation umbrella for Next Generation Sequencing at Cambridge CASP Critical Assessment of protein Structure Prediction CCD Charge-Coupled Device CCP-EM Collaborative Computational Project for Electron cryo-Microscopy CCP4 Collaborative Computational Project Number 4 in protein crystallography cDNA Complementary DNA CREME9 Cis-Regulatory Module Explorer for the human genome 9 (aka CRLF3) CRLF3 Cytokine Receptor-Like Factor 3 CRLM9 Cytokine Receptor-Like Molecule 9 (aka CRLF3) CYTOR4 Cytoskeleton Regulator 4 (aka CRLF3) DNA DeoxyriboNucleic Acid DSSP Database of Secondary Structures Program EDTA EthyleneDiamineTetraacetic Acid EG Ethylene Glycol EGA European Genome-phenome archive at the European Bioinformatics Institute eLLG Expected Log Likelihood Gain EM Electron Microscopy EMPIAR Electron Microscopy Public Image Archive ExAC Exome Aggregation Consortium FN3 Fibronectin type III domain FN3con Fibronectin type III domain consensus protein GCSF Granulocyte colony-stimulating factor GST Glutathione S-Transferase 5 GTP Guanosine-5’-TriPhosphate Hg-SAD Mercury atom single-wavelength anomalous diffraction HMM Hidden Markov model HySS Hybrid Substructure Search IhhN Indian hedgehog gene Ihog Interference hedgehog IKMC International Knock-out Mouse Consortium IPTG IsoPropyl β-D-1-ThioGalactopyranoside IQ Intelligence Quotient iRNA interfering RNA IUCr International Union of Crystallography JAK-STAT JAK (Janus kinases)-STAT (Signal Transducer and Activator of Transcription proteins) signalling pathway LLG Log-Likelihood Gain MAD Multiple-wavelength Anomalous Dispersion MES 2-(N-Morpholino)EthaneSulfonic acid MIR Multiple Isomorphous Replacement MIRAS Multiple Isomorphous Replacement with Anomalous Scattering mmCIF Macromolecular Crystallographic Information File format MR Molecular Replacement NCAM1 Neural Cell Adhesion Molecule 1 NCS Non-Crystallographic Symmetry NF1 NeuroFibromatosis type 1 NGS Next Generation Sequencing NHS National Health Service NIHR National Institute for Health Research NMR Nuclear Magnetic Resonance NOE Nuclear Overhauser Effect NRF No Reported Factors PCR Polymerase Chain Reaction PDB Protein Data Bank PEG PolyEthylene Glycol PFPE PerFluoroPolyEther oil 6 phenix Python-based Hierarchical ENvironment for Integrated Xtallography PPI Poly-Proline type I helix PPII Poly-Proline type II helix PSF Point Spread Function REFMAC REFinement of MACromolecular structures program RF Random Forest (algorithm) RIP Radiation-damage-Induced Phasing RIPAS Radiation-damage-Induced Phasing with Anomalous Scattering RMS Root-Mean-Square RMSD Root-Mean-Square Deviation RNA RiboNucleic Acid ROBO1 ROundaBOut guidance receptor 1 SAD Single-wavelength Anomalous Dispersion S-SAD Sulphur SAD siRNA Small interfering RNA SIR Single Isomorphous Replacement SIRAS Single Isomorphous Replacement with Anomalous Scattering SPRY domain named from SPla and the RYanodine Receptor STRIDE STRuctural IDEntification algorithm SVN SubVersioN (version control software) TAE 40mM Tris, 20mM Acetic acid, 1mM EDTA TFZ Translation Function Z-score TFZ0 Initial Z-score for the translation search Tris Tris(hydroxymethyl)aminomethane XDS X-ray Detector Software XFEL X-ray Free Electron Laser 7 Table of Contents Declarations……………………………………………………………………………………………….…....1 Acknowledgments……………………………………………………………………………………….……2 Abstract……………………………………………………………………………………………….………....4 Abbreviations……………………………………………………………………………………………….….5 Chapter 1 Introduction .......................................................................................................