Detecting Gene Modules Differentially Expressed in Multiple Human Brain Regions

THESIS

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By Zhiwei Ma
Graduate Program in Biophysics
The Ohio State University
2012

Master's Examination Committee: Dr. Kun Huang, Advisor; Dr. Raghu Machiraju

Copyright by Zhiwei Ma 2012

Abstract

Molecular screening methods such as microarrays have been used to identify molecular signatures and biological processes important for particular neuronal functions. This thesis applied a weighted gene co-expression network analysis algorithm, the edge-covering Quasi-Clique Merger algorithm (eQCM), to human brain microarray data from the Allen Institute for Brain Science. One thousand and sixty-six (1066) gene modules were identified. Using the eigengene as the representative of each gene module, 46 of these 1066 gene modules with significant p-values were selected by comparing gene expression profiles among the hippocampus, parahippocampal gyrus, and basal ganglia of the human brain. Gene ontology enrichment analysis showed that 10 of these 46 gene modules are significantly engaged in several biological processes related to neuronal function. The results show that the correlation between molecular similarity and spatial proximity also exists in some human brain regions other than the neocortex.

Dedication

This document is dedicated to my parents.

Acknowledgments

I would like to express my deep gratitude to my advisor, Dr. Kun Huang, for his excellent overall guidance during my stay at OSU. I would like to thank Dr. Raghu Machiraju for serving on my committee and providing me with many valuable revision suggestions for my thesis. I would like to thank Dr. Yang Xiang for instructing me in the use of the eQCM program. I am grateful to Ms. Kim Leonard for editing my thesis.
Vita

2010 .......... B.S. Physics, Jilin University, China
2010 to present .......... Graduate Student, Biophysics Graduate Program, The Ohio State University, USA

Fields of Study

Major Field: Biophysics

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures
Chapter 1: Introduction
1.1 Background
1.2 Problem Statement
1.3 Thesis Statement
1.4 Roadmap
Chapter 2: Allen Brain Data
2.1 The Allen Human Brain Atlas
2.2 The Allen Developing Mouse Brain Atlas
2.3 Human Brain Gene Expression Dataset Used in This Thesis
Chapter 3: Workflow and Algorithm
3.1 Preliminary Data Processing
3.2 Gene Co-expression Network Analysis Using the eQCM Algorithm
Chapter 4: Differentially Expressed Gene Modules in Specific Brain Regions
Chapter 5: Discussion & Conclusion
References
Appendix A: Gene Symbols in Each of the Resulted 46 Gene Modules

List of Tables

Table 1. The donors’ information of these three datasets
Table 2. Twenty-six gene modules significantly engaged in several important biological processes

List of Figures

Figure 1. Enter gene name/symbol/NCBI Accession Number/Entrez Gene ID
Figure 2. Gene search result
Figure 3. Planar view
Figure 4. Correlation search
Figure 5. The result of a correlation search
Figure 6. Differential search
Figure 7. The result of a differential search
Figure 8. ISH data
Figure 9. Enter gene symbol/name/Entrez Gene ID
Figure 10. Results of showing relevant search topics
Figure 11. The relevant information of the gene symbol “Gabra1”
Figure 12. Image series of “Gabra1”
Figure 13. Neuroblast search
Figure 14. Neuroblast search
Figure 15. Results of a neuroblast search
Figure 16. Anatomic search
Figure 17. Temporal search
Figure 18. Advanced search
Figure 19. The summary flowchart of the procedures performed in Chapter 3 & 4
Figure 20. The pseudocode of the eQCM algorithm
Figure 21. The histogram showing the number of elements of each gene module
Figure 22. The spatial distribution of the samples
Figure 23. The ANOVA boxplot for gene module #14 in three different brain regions
Figure 24. The ANOVA boxplot for gene module #30, #19, #20 and #2 in three different brain regions
Figure 25. The ANOVA boxplot for gene module #29, #34, #42 and #32 in three different brain regions

Chapter 1: Introduction

1.1 Background

The human brain has a very complex structure. Particular neuronal functions are localized to different parts of the brain (Flourens, 1824; Broca, 1861). Brodmann’s (1909) cytoarchitectural map of the neocortex showed distinct cellular organization across different brain regions. Many previous studies have sought to identify functional specializations and related neuropathology in the brain. With the development of molecular biology, molecular screening methods such as microarrays have been used to identify molecular signatures and pathways important for particular neurobiological processes. Previously, large-scale screening of gene expression profiles across all human brain regions was prohibitively costly and impractical. Since the establishment of the Allen Institute for Brain Science, genome-wide atlases of gene expression in the brains of different species have been under construction (Jones, Overly and Sunkin, 2009). These atlases give neuroscientists great opportunities to investigate gene expression patterns in multiple brain regions of different species and have led to new knowledge about neurological disorders. Using