Coexpression Networks Based on Natural Variation in Human Gene Expression at Baseline and Under Stress
University of Pennsylvania
ScholarlyCommons

Publicly Accessible Penn Dissertations

Fall 2010

Coexpression Networks Based on Natural Variation in Human Gene Expression at Baseline and Under Stress

Renuka Nayak
University of Pennsylvania

Part of the Computational Biology Commons, and the Genomics Commons

Recommended Citation
Nayak, Renuka, "Coexpression Networks Based on Natural Variation in Human Gene Expression at Baseline and Under Stress" (2010). Publicly Accessible Penn Dissertations. 1559.
https://repository.upenn.edu/edissertations/1559

This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/1559

Degree Type: Dissertation
Degree Name: Doctor of Philosophy (PhD)
Graduate Group: Cell & Molecular Biology
First Advisor: Vivian G. Cheung
Second Advisor: Michael J. Kearns
Keywords: genomics, networks, genetics, computational biology
Subject Categories: Computational Biology | Genetics and Genomics | Genomics

COEXPRESSION NETWORKS BASED ON NATURAL VARIATION IN HUMAN GENE EXPRESSION AT BASELINE AND UNDER STRESS

Renuka R.
Nayak

A DISSERTATION
in
Cell and Molecular Biology

Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

2010

Supervisor of Dissertation: Vivian G. Cheung, Professor, Pediatrics
Co-supervisor: Michael J. Kearns, Professor, Computer and Information Science
Graduate Group Chairperson: Daniel S. Kessler, Associate Professor, Cell and Developmental Biology

Dissertation Committee
Marisa S. Bartolomei, Professor, Cell and Developmental Biology
Morris J. Birnbaum, Professor, Medicine
Shane T. Jensen, Associate Professor, Statistics
Carlo C. Maley, Associate Professor, Cell and Developmental Biology

Dedicated to the memory of Rich

ACKNOWLEDGEMENTS

I am deeply indebted to many people for the project undertaken here.

I thank Vivian G. Cheung. She encouraged me to be fearless, to find joy in nature and science always, and to push myself to the limit of my abilities. Her tireless efforts to provide feedback and advice have shaped me.

I thank Michael J. Kearns. He introduced me to new ways of thinking and learning about biology. It is clearly manifested in this dissertation. He has been instrumental to my development as a young researcher.

I thank my thesis committee members for their excellent suggestions, their encouragement and for sharing their stories with me.

Many thanks to members of the Cheung and Spielman labs. I have enjoyed contemplating profound scientific and non-scientific ideas with you over the years. Thanks to colleagues, friends, and the MD-PhD Office for their guidance and friendship.

And, most importantly, I thank my parents and my brother. Your unending love and support have made this possible.

ABSTRACT

COEXPRESSION NETWORKS BASED ON NATURAL VARIATION IN HUMAN GENE EXPRESSION AT BASELINE AND UNDER STRESS

Renuka R. Nayak

Dissertation Advisors: Vivian G. Cheung and Michael J.
Kearns

Genes interact in networks to orchestrate cellular processes. Here, we used coexpression networks based on natural variation in gene expression to study the functions and interactions of human genes. We asked how these networks change in response to stress. First, we studied human coexpression networks at baseline. We constructed networks by identifying correlations in expression levels of 8.9 million gene pairs in immortalized B cells from 295 individuals comprising three independent samples. The resulting networks allowed us to infer interactions between biological processes. We used the network to predict the functions of poorly characterized human genes, and provided some experimental support. Examining genes implicated in disease, we found that IFIH1, a diabetes susceptibility gene, interacts with YES1, which affects glucose transport. Genes predisposing to the same diseases are clustered non-randomly in the network, suggesting that the network may be used to identify candidate genes that influence disease susceptibility. These analyses showed that human coexpression networks based on natural variation may offer information on gene functions and interactions.

We then examined the extent to which networks change upon stress. We studied changes in expression levels and gene relationships induced by two stresses: endoplasmic reticulum (ER) stress and exposure to ionizing radiation (IR). Using large datasets, we found that between 30% and 70% of genes change expression upon stress. In contrast, the majority (between 65% and 95%) of gene relationships are maintained, as assessed using statistical, network and machine learning methods. However, a subset of genes altered relationships upon stress. These genes tended to be critical for the cellular response to the specific stress examined. For example, BIP and CHOP altered relationships in ER stress; p21, GADD45A and CCNB1 altered relationships in IR stress.
Some genes with altered relationships have not been implicated in ER or IR stress or do not change expression; these are genes that may be critical but remain unexplored. We provide evidence implicating two such genes, INHBE and SLC3A2, in the response to ionizing radiation. Our results suggest that the majority of gene relationships are maintained upon stress, but those genes with altered relationships tend to be critical to the stress response.

TABLE OF CONTENTS

Acknowledgements
Abstract
Table of Contents
List of Tables
List of Figures
Introduction

Chapter 1: Coexpression network based on natural variation in human gene expression reveals gene interactions and functions
  Abstract
  Introduction
  Results
    Gene coexpression network
    Properties of human gene coexpression networks in B cells
    Biological processes in B cells
    Predicting gene functions using coexpression networks
    Predicting the functions of genes implicated in GWAS
    Human gene coexpression website
  Discussion
  Methods
    Samples and gene expression measurements
    Gene correlation and construction of the coexpression network
    Random gene pairs and networks
    Enrichment analysis
    Co-occurrence of biological processes
    Tunicamycin treatment
    Databases

Chapter 2: Genome-wide study of human gene relationships at baseline and upon stress
  Abstract
  Introduction
  Results
    • Human cells exhibit extensive changes in gene expression levels upon stress
    • Correlation in expression between genes at baseline and under stress
    • Coexpression between genes does not differ significantly upon ER stress
    • Coexpression between genes does not differ significantly upon IR stress
    • Examining whether subnetworks of genes change expression as expected upon stress
    • Upon ER stress, the majority of neighbors in a subnetwork change