Deciphering Pre-Leukaemic Dysregulation Caused by Mutant C/Ebpα at the Level of the Single Cell
Total Page:16
File Type:pdf, Size:1020Kb
Deciphering pre-leukaemic dysregulation caused by mutant C/EBPa at the level of the single cell Moosa Rashid Qureshi Department of Haematology University of Cambridge This dissertation is submitted for the degree of Doctor of Philosophy Queens’ College August 2019 DeClaration This thesis is the result of my own work and inCludes nothing whiCh is the outCome of work done in Collaboration exCept as deClared in the Acknowledgments and speCified in the text. It is not substantially the same as any that I have submitted, or, is being ConCurrently submitted for a degree or diploma or other qualifiCation at the University of Cambridge or any other University or similar institution exCept as deClared in the Acknowledgments and speCified in the text. I further state that no substantial part of my thesis has already been submitted, or, is being ConCurrently submitted for any suCh degree, diploma or other qualification at the University of Cambridge or any other University or similar institution exCept as deClared in the Acknowledgments and speCified in the text. It does not exCeed the prescribed word limit for the relevant Degree Committee Moosa Qureshi August 2019 I AbstraCt DeCiphering pre-leukaemic dysregulation Caused by mutant C/EBPa at the level of the single cell Moosa Rashid Qureshi C/EBPa is an ideal Candidate to explore pre-leukaemiC dysregulation of transCriptional networks beCause it is a signifiCant player in normal myelopoiesis, it interaCts with other transCription faCtors (TFs) in lineage speCification, and the CEBPA gene has been identified as an early mutation in acute myeloid leukaemia (AML). N321D is a mutation affeCting the leuCine zipper domain of C/EBPa whiCh has previously generated an aggressive AML phenotype in a mouse model. The first objeCtive of my thesis was to ConstruCt a novel in vitro cellular model to reCapitulate pre-leukaemic Changes assoCiated with the CEBPA N321D mutation. The conditionally immortalised murine Hoxb8-FL Cell line can differentiate into relatively homogenous populations of functional myeloid, lymphoid and dendritiC Cells in appropriate Cell Culture conditions. These Cells were transduCed with retroviral veCtors whiCh expressed either no transgene or “Empty VeCtor” (EV), the CEBPA wildtype (WT) or N321D mutation. CEBPA N321D-transduCed Cells had immature morphologiCal appearanCe in Flt3L differentiation media, proliferated indefinitely, and were CharaCterised by a plasmacytoid dendritiC Cell (pDC)-like immunophenotype. These Cells were Capable of long-term engraftment in a mouse model but did not reCapitulate leukaemia for several months after transplantation. The seCond objeCtive was to CharaCterise the transCriptional deregulation associated with the phenotypes demonstrated by EV, CEBPA WT and N321D-expressing Cells in the presenCe of Flt3L. Single cell RNA-seq (scRNA-seq) confirmed phenotypic observations that CEBPA N321D- transduCed Cells had a pDC preCursor-like gene expression signature when plaCed in Flt3L conditions of differentiation. Single Cell analysis also identified transCriptional heterogeneity in the CEBPA N321D Compartment at single Cell resolution. II The third objeCtive was to use ChIP-seq to investigate the effeCt of the N321D mutation on C/EBPa’s TF binding profile, and Correlate this with sCRNA-seq data to identify a set of candidate genes (NOTCH2, JAK2, SIRPA, FOS, and others) where gene expression Changes were assoCiated with direCt binding events by the mutant TF. In addition to characterising the transCriptional effeCts of the CEBPA mutation, a fourth objeCtive was to update and expand the Göttgens group’s CODEX database of TF-binding and histone modifiCation data for human blood Cells, increasing annotation of the genome to 14.83%. This online database now provides signifiCant Coverage of the non-coding portion of the genome, and is therefore potentially of interest to the wider haematology researCh community. III Acknowledgements Firstly, I would like to thank my supervisor Bertie Göttgens for his invaluable mentorship and support during my PhD, partiCularly for providing vision at those definitive points when I could not myself see a Clear path forwards. It has been a real privilege to work under his supervision. I am indebted to everyone in the Göttgens lab for providing a fantastiC environment of academiC endeavour, but in partiCular I would like to thank Fernando Calero-Nieto for providing me with praCtiCal help and adviCe throughout my sCientifiC apprenticeship, and for teaChing me the importanCe of sCientifiC rigour. In addition, I would like to thank Sarah Kinston and Iwo KuCinski for their praCtiCal help with Cloning and single Cell work, and the Flow Cytometry Team at the CIMR for helping me to develop a workable flow panel and for Cell sorting. From the Göttgens bioinformatiCs team, I would like to thank Evangelia Diamanti, Fiona Hamey, Xiaonan Wang, and Wajid Jawaid for helping me with R-code, and espeCially Rebecca Hannah who helped enormously with proCessing my ChIP-seq data and my work on the CODEX database. It would be greatly remiss of me not to aCknowledge the outstanding administrative support I reCeived from Martin Dawes. Regarding Collaboration, I would like to thank the Huntly group and in partiCular George Giotopoulos for his help and expertise with in vivo work. Thirdly, I thank the Cambridge CanCer Centre for seleCting me for this PhD, and CanCer Research UK and the Addenbrooke’s Charitable Trust for supporting my researCh. In addition, I would like to thank the Charles Slater Fund and Queens’ College for funding me to present my work at international ConferenCes. I would also like to thank my father who Continues to inspire me, my mother whose memory sustains me, and all my extended family. Finally, I express my gratitude to Sara Khalil Ebrahim Ali for providing love, intelleCtual stimulation, asking me questions whiCh I Can never answer, and for simply putting up with me. IV Abbreviations AML ACute Myeloid Leukaemia APC Antigen Presenting Cell APML ACute PromyeloCytiC Leukaemia ASAP Automated Single-cell Analysis Pipeline BMDC Bone Marrow derived DendritiC Cell BMS Bone Marrow Stroma BR BasiC Region bZIP basiC-region leuCine Zipper C/EBP CCAAT/EnhanCer Binding Protein cDC Conventional DendritiC Cell cDNA Complementary DNA CDP Common DendritiC Cell Progenitor CHIP Clonal Haematopoiesis of Indeterminate Potential ChIP Chromatin ImmunopreCipitation ChIP-seq Chromatin ImmunopreCipitation SequenCing CLP Common Lymphoid Progenitor CMGA Complex MoleCular GenetiC Abnormalities CML ChroniC Myeloid Leukaemia CMP Common Myeloid Progenitor CRISPR Clustered Regularly InterspaCed Short PalindromiC Repeats CyTOF Cytometry by Time Of Flight DC DendritiC Cell DGE Differential Gene Expression dm double mutated DMEM DulbeCCo's Modified Eagle's Medium DMSO Dimethyl Sulfoxide DNA DeoxyribonuCleiC ACid dNTP DeoxynuCleotide Triphosphates dT Deoxythymine EDTA EthylenediaminetetraaCetiC ACid ENCODE ENCyClopaedia Of DNA Elements ERCC External RNA Controls Consortium V EV Empty VeCtor FAB FrenCh-AmeriCan-British (AML ClassifiCation System) FACS FluoresCenCe-Activated Cell Sorting FBS Foetal Bovine Serum G-CSF GranuloCyte-Colony Stimulating FaCtor GEO Gene Expression Omnibus GFP Green Fluorescent Protein GM-CSF GranuloCyte MaCrophage-Colony Stimulating Factor GMP GranuloCyte MaCrophage Progenitor GO Gene Ontology GRN Gene Regulatory Network GWAS Genome-wide AssoCiation Study HEPES 4-(2-Hydroxyethyl)-1-PiperazineethanesulfoniC ACid HPC-7 Haematopoietic precursor cell-7 HSC HaematopoietiC Stem Cell HSPC Haematopoietic Stem and Progenitor Cells IFN Interferon IL-3 Interleukin 3 ImmGen ImmunologiCal Genome IRES Internal Ribosome Entry Site LIC Leukaemia Initiating Cell LMPP Lymphoid-Primed Multipotent Progenitor LSK Lin- Sca-1+ C-Kit+ LT-HSC Long-Term HSC LZ LeuCine Zipper MACS Model-based Analysis of ChIP-seq mChrom1 mouse Chromosome 1 M-CSF MaCrophage-Colony Stimulating FaCtor MDP MonoCyte/MaCrophage DendritiC Cell Progenitor MDS MyelodysplastiC Syndrome MEP MegakaryoCyte ErythroCyte Progenitor MHC Major HistoCompatibility Complex MLL Mixed Lineage Leukaemia MLP Multi-Lymphoid Progenitor MPP Multipotent Progenitor VI mRNA messenger RNA NOD SCID Non-DiabetiC with Severe Combined ImmunodefiCienCy Disease NGS Next Generation SequenCing NSG NOD SCID Gamma PBS Phosphate Buffered Saline PCA PrinCipal Component Analysis PCR Polymerase Chain ReaCtion pDC PlasmaCytoid DendritiC Cell PBS Phosphate Buffered Saline qPCR Quantitative PCR RNA RibonuCleic aCid RPMI Roswell Park Memorial Institute Medium sc single Cell SCF Stem Cell Factor Scl Stem Cell leukaemia SD SequenCing Depth shRNA short hairpin RNA smFISH single-moleCule FluoresCenCe In Situ Hybridisation SNP Single NuCleotide Polymorphism SNR Signal-to-Noise Ratio ST-HSC Short-term HSC TAD TransaCtivation Domain TAE Tris/ACetiC ACid/EDTA Buffer TBE Tris/Borate/EDTA Buffer TBS Tris Buffered Saline TE Tris/EDTA TF TransCription FaCtor t-SNE t-Distributed StoChastiC Neighbour Embedding TSS TransCription Start Site UCSC University of California Santa Cruz (Genome Browser) UMI Unique MoleCular Identifier URE Upstream Regulatory Element WGS Whole Genome SCreening WT Wildtype VII VIII Contents Declaration ........................................................................................................................I Abstract ........................................................................................................................... II Acknowledgements .......................................................................................................