CALIFORNIA STATE UNIVERSITY, NORTHRIDGE

GENE EXPRESSION PROFILING IN A FAMILY WITH A NOVEL FORM OF BETA-THALASSEMIA

A thesis submitted in partial fulfillment of the requirements
For the degree of Master of Science in Biology

By
Forough Taghavifar

May 2016

The thesis of Forough Taghavifar is approved:

Dr. Stan Metzenberg, Ph.D.    Date
Dr. Steven B Oppenheimer, Ph.D.    Date
Dr. Aida Metzenberg, Ph.D., Chair    Date

California State University, Northridge

ACKNOWLEDGEMENTS

I would like to extend my utmost gratitude to my father, mother, and husband, whose emotional support and guidance have enabled me to succeed in my academic journey. My heartfelt appreciation also goes to Dr. Aida Metzenberg, my wise and trusting advisor. Dr. Aida Metzenberg, not only have you been my thesis advisor and mentor, you have also served as a mother figure for me ever since I emigrated to the US and started my master's program at California State University, Northridge; for that I will always thank you! Furthermore, I would like to thank Dr. Stan Metzenberg for believing in me before I believed in my own abilities. Dr. Stan Metzenberg, had it not been for your computer modeling class, my interest in making my project more sophisticated would never have been sparked. Lastly, I would like to acknowledge the Bruins in Genomics (BIG) summer program, which I had the pleasure of attending in the summer of 2015. A very special thanks also goes to my mentor at the BIG program, David Casero, who willingly assisted me throughout all the RNA sequencing data analysis steps.

TABLE OF CONTENTS

SIGNATURE PAGE
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
CHAPTER I: INTRODUCTION
    Globin genes and hemoglobinopathies
    Culmination of literature review
    The aims of this study and hypothesis
    Preliminary data
CHAPTER II: MATERIALS AND METHODS
    Splice site prediction using Alternative Splice Site Predictor (ASSP)
    RNA isolation and RT-qPCR
    Library construction and RNA Sequencing
    Mapping and RNA-seq data analysis
    Analyzing effects of globin gene masking
    Gene ontology (GO) Analysis using the DAVID annotation database
CHAPTER III: RESULTS
    The novel mutation was predicted to generate a cryptic donor site
    The novel mutation down-regulates β-globin expression
    RNA-Seq data visualization in the HBB locus
    The novel mutation introduces a splice donor site
    More than 300 genes are differentially expressed in β-thalassemic blood
    β-Thalassemia shows important similarities and differences with sickle cell disease at the transcriptome level
CHAPTER IV: DISCUSSION
REFERENCES
APPENDIX A: Supplemental Matrix 1
APPENDIX B: Supplemental Table 1
APPENDIX C: RNA-Seq data analysis commands

LIST OF FIGURES

Figure 1: Globin gene clusters and chromosome location
Figure 2: Proposed model of alternative splicing
Figure 3: Sanger DNA sequencing chromatograms
Figure 4: Subject family structure
Figure 5: Results of bioinformatics sequence analysis (panels A, B)
Figure 6: qPCR amplification curves and bar chart
Figure 7: IGV visualization of reads mapped to the HBB (panels A, B)
Figure 8: UCSC genome browser snapshots of the HBB locus (panels A, B, C)
Figure 9: Comparison of the transcript read counts (FPKMs)
Figure 10: Hierarchical clustering of the differentially expressed genes
Figure 11: Heat-map showing the expression fold change of the DEGs
Figure 12: Diagonal (log-log) plots of expression levels of individual genes (FPKM units) (panels A, B, C)
Figure 13: Venn diagram comparing the number of the DEGs

LIST OF TABLES

Table 1: Cryptic splice site mutations
Table 2: Hematological parameters
Table 3: qPCR plate display
Table 4: qPCR quantification data
Table 5: Gene ontology analysis table
Table 6: Comparing the list of the DEGs in anemias

ABSTRACT

GENE EXPRESSION PROFILING IN A FAMILY WITH β-THALASSEMIA

By
Forough Taghavifar
Master of Science in Biology

The genetic causes of β-thalassemia are largely well described. However, the disease is very heterogeneous at both the molecular and clinical levels.
Studying the transcriptome profiles of β-thalassemia patients, especially individuals who carry novel mutations in the β-globin gene (HBB), may improve our understanding of the heterogeneity and molecular mechanisms of the disease and its possible treatment. Here, I characterized members of a family with β-thalassemia using whole-genome expression analysis. I report a novel mutation in exon 1 of HBB (HBB:c.51C>T) that was associated with an unexpected β-thalassemia phenotype in a heterozygote who also carries a typical β-thalassemia allele. I analyzed the effects of the novel mutation at the transcriptome level by RT-qPCR and by high-throughput RNA sequencing on an Illumina HiSeq 2500 system. The results revealed that the novel mutation creates a cryptic donor splice site in HBB, which causes alternative splicing from that site and down-regulates expression of β-globin (~0.7). Gene expression profiling showed that more than 300 genes were differentially expressed in β-thalassemic blood. These differentially expressed genes (DEGs) were enriched in pathways that are directly or indirectly related to β-thalassemia, such as hemopoiesis, heme biosynthesis, response to oxidative stress, inflammatory responses, immune responses, control of circadian rhythms, apoptosis, and other cellular activities. It was possible to compare these findings with published results of RNA-seq analysis of sickle cell disease and KLF1-null anemia, and to recognize similarities and differences in their transcriptional expression patterns. While many DEGs involved in the response to hemolysis, iron homeostasis, and anemia were common to these three types of anemia, over 200 DEGs were unique to β-thalassemia.
Although this study was limited by the small number of patients, it provides a wealth of data on β-thalassemia because it is the first broad investigation of blood cell gene expression in this disease, and it offers novel insight that can be used in drug discovery to identify new therapeutic approaches for the disease.

CHAPTER I: INTRODUCTION

Globin genes and hemoglobinopathies

Approximately 250,000 individuals are born each year with a disorder caused by a defect in hemoglobin. These disorders, called hemoglobinopathies, account for more deaths than any other group of genetically inherited disorders [1]. For this reason, scientists have extensively studied the hemoglobin molecule and the diseases caused by mutations in the genes that encode it. Hemoglobin is a protein in red blood cells that carries oxygen from the lungs to the cells of every tissue in the body and transports carbon dioxide from the tissues back to the lungs, where it is excreted. The main hemoglobin molecule in humans is called adult hemoglobin, or HbA. This molecule is a tetramer made up of four subunits: two alpha-globin (α-globin) chains and two beta-globin (β-globin) chains (α2β2). Over the course of evolution, the gene that codes for α-globin and the one that codes for β-globin have become separated. In humans, the α-globin gene cluster is located on chromosome 16, whereas the β-globin gene cluster is located on chromosome 11 [2]. Human hemoglobin is heterogeneous: a succession of globin chains is differentially expressed during embryonic, fetal, and adult life, giving α2ε2 (embryonic hemoglobin, Hb Gower 2), α2γ2 (fetal hemoglobin, HbF), α2δ2 (hemoglobin A2, HbA2), and α2β2 (adult hemoglobin, HbA). The globin gene clusters and their chromosomal locations are shown in Figure 1.
In a normal adult, the overall hemoglobin composition is about 96%-98% HbA, 2%-3% HbA2, and <1% HbF. This high percentage of HbA reflects the effectiveness of this form of hemoglobin in carrying oxygen to adult tissues [1]. The official symbol for the hemoglobin beta, or β-globin, gene is HBB, and the gene is located at position 15.5 on the short arm of chromosome 11 (11p15.5) [2].

Figure 1. Globin gene clusters and chromosome location. (Source: http://www.fastbleep.com/biology6notes/32/156/846?)

While the hemoglobinopathies are endemic in many world populations, their genetic carrier