Epistatic and Environmental Control of Genome-Wide Gene Expression
Total Page:16
File Type:pdf, Size:1020Kb
Epistatic and Environmental Control of Genome-Wide Gene Expression Timothy P. York,1,4 Michael F. Miles,2 Kenneth S. Kendler,3 Colleen Jackson-Cook,4 Melissa L. Bowman,2 and Lindon J. Eaves1,3,4 1 Massey Cancer Center,Virginia Commonwealth University, Richmond,Virginia, United States of America 2 Departments of Pharmacology/Toxicology, Neurology and the Center for Study of Biological Complexity,Virginia Commonwealth University, Richmond,Virginia, United States of America 3 Department of Psychiatry,Virginia Institute for Psychiatric and Behavioral Genetics,Virginia Commonwealth University, Richmond,Virginia, United States of America 4 Department of Human Genetics,Virginia Institute for Psychiatric and Behavioral Genetics,Virginia Commonwealth University, Richmond,Virginia, United States of America ll etiological studies of complex human traits to influences of genetic or environmental variation. Afocus on analyzing the causes of variation. Given The study of MZ and DZ twins is a well-tested this complexity, there is a premium on studying approach to partition the genetic, shared and non- those processes that mediate between gene prod- shared environmental factors in multifactorial traits ucts and cellular or organismal phenotypes. Studies (Jinks & Fulker, 1970). Modern refinements of statis- of levels of gene expression could offer insight into tical methods for the analysis of multivariate twin these processes and are likely to be especially data (Neale & Cardon, 1992) have made it possible useful to the extent that the major sources of their for the twin study to resolve genetic (or environmen- variation are known in normal tissues. The classical tal) effects that contribute to the clustering of multiple study of monozygotic (MZ) and dizygotic (DZ) twins phenotypes and to the sequence of developmental was employed to partition the genetic and environ- events that lead ultimately to clinically significant mental influences in gene expression for over 6500 complex outcomes. human genes measured using microarrays from Numerous studies of human disease have utilized lymphoblastoid cell lines. Our results indicate that microarray technology to simultaneously measure mean expression levels are correlated about .3 in thousands of gene expression profiles between groups monozygotic (MZ) and .0 in dizygotic (DZ) twins sug- of individuals or cell types with the goal of identifying gesting an overall epistatic regulation of gene expression. Furthermore, the functions of several of candidate disease genes or classifying tissue into the genes whose expression was most affected by disease subgroups. It is not clear whether or not the environmental effects, after correction for measure- interindividual variation associated with a particular ment error, were consistent with their known role in outcome is due to genetic or random environmental mediating sensitivity to environmental influences. causes (Oleksiak et al., 2002). Genetic sources of gene expression variation are likely due to DNA sequence polymorphisms, whereas all nongenetic components The measurement of gene expression levels for a very can be generally classified as random environmental large number of genes through microarray technology sources. Recently, investigators have studied the extent offers the hope of a better understanding of some of to which gene expression levels are under genetic the primary mechanisms through which both genes control for a limited number of genes (Cheung et al., and environment affect susceptibility to complex mul- 2003; Yan et al., 2002). In another report, a signifi- tifactorial disorders. If microarray technology is to cant heritable component for many genes was provide a means for the assessment of the endopheno- suggested by mapping both cis- and trans-acting loci type in studies of disease etiology, it will be necessary that effect gene expression (Morely et al., 2004). to characterize the differential roles of genetic and The examination of interindividual variation in environmental influences on levels of gene expression. the level of gene expression in normal tissue is a nec- The study of gene expression levels in samples of monozygotic (MZ) and dizygotic (DZ) twin pairs Received 23 November, 2004; accepted 2 December, 2004. may offer a valuable overview of the contribution of genetic variation and environmental exposures to Address for correspondence: Timothy P. York, Virginia Institute for Psychiatric & Behavioral Genetics, Virginia Commonwealth University, gene expression, and identifies clusters of genes whose PO Box 980003, Richmond, VA 23298-0003, USA. E-mail: expression in specific tissues is more or less sensitive [email protected] Twin Research and Human Genetics Volume 8 Number 1 pp. 5–15 5 Downloaded from https://www.cambridge.org/core. IP address: 170.106.35.76, on 28 Sep 2021 at 01:09:45, subject to the Cambridge Core terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1375/twin.8.1.5 Timothy P. York, Michael F. Miles, Kenneth S. Kendler, Colleen Jackson-Cook, Melissa L. Bowman, and Lindon J. Eaves essary prerequisite for interpreting alterations in gene the correlation in gene expression between transformed expression profiles that are causally associated with cells and peripheral blood leukocytes. disease tissue. Peripheral blood is a readily accessible Microarrays source of cells for the study of this relationship and TM has the potential for acting as a surrogate tissue for Total RNA was isolated according to the STAT-60 diagnostic purposes since it is exposed to many of the protocol. RNA concentration was determined by same environmental substances as target tissues absorbance at 260 nm, and RNA quality was ana- (Whitney et al., 2003). Recent studies have demon- lyzed by agarose gel electrophoresis and 260/280 strated that stable gene expression profiles can be absorbance ratios. Total RNA (7 µg) derived from monitored despite temporal changes in blood chem- each sample was reverse transcribed into double- istry (Radich et al., 2004) and signatures of gene stranded cDNA using the Invitrogen Superscript II expression due to environmental exposures can be System (Invitrogen, Carlsbad, CA). Biotin-labeled reliably identified (Lampe et al., 2004). cRNA was synthesized from this cDNA using Ultimately, the same methodology may be BioArray High Yield RNA Transcript Labeling Kit extended to quantify the genetic and environmental (ENZO Diagnostics, Farmingdale, NY) according to sources of variation at the intermediary stage of gene the manufacturer’s instructions, purified using the expression and present a clearer picture of the etiology RNAeasy Mini Kit (Qiagen, Mountain View, CA), and molecular pathology of multifactorial disease. and quantified by absorbance at 260 nm. Labeled To provide proof of principle for the use of twins cRNA samples were hybridized to oligonucleotide in the study of gene expression, we have examined the microarrays (Human GeneChip™ HG-133A, patterns of correlation in expression levels of over Affymetrix, Santa Clara, CA) containing probes for 22,000 genes in cultured lymphoblastoid cell lines over 22,000 genes and ESTs. Array hybridization, derived from peripheral blood lymphocytes from 10 scanning and quality control assessment were per- pairs of MZ and 5 pairs of DZ female twins that were formed according to the manufacturer’s protocol and obtained from Coriell Cell Repositories (Camden, as described previously (Hassan et al., 2003; Thibault NJ). The mean twin pair age was 31 (± 12) years old. et al., 2000). Background correction, normalization, Individuals had no reported history of major illness and expression summary were conducted with the (Appendix Table 1). Gene expression was measured robust multiarray average (RMA; Irizarry et al., on oligonucleotide microarrays (Human GeneChip™ 2003) implemented in the Bioconductor R packages HG133-A, Affymetrix) which contain over 22,000 (Gentleman et al., 2004). named genes and expressed sequence tags (ESTs). Labeled cRNA from each individual was hybridized Statistical Methodology to a single microarray. On average, 41% of genes We derived the variance components separately for present on the microarrays were expressed in the both zygosities to test the hypothesis that no differ- sample cell lines. Genes without a MAS 5 ‘present’ or ences exist within and between pairs across all genes ‘marginally present’ call in all samples were removed (probesets) measured. The model was yijk = m + Pi + from analysis (Affymetrix, 1999). A total of 6578 P:Ii(j) + Gk + (P*G)ik + (P:I*G)i(j)k. The response, yi(j)k, is genes met this very conservative filtering criteria and the level of intensity for the jth individual in the ith pair were included in this study. and kth gene. The variable m is the mean gene expres- sion level across all individuals. The variation Materials and Methods between pairs is listed as Pi and the within pair vari- Sample ance is represented by the nested term P:Ii(j).Gk is the Cultured lymphoblastoid cell lines were obtained from average gene expression intensity across pairs and 10 pairs of MZ and 5 pairs of DZ twins. These cells individuals within pairs. The interaction terms, were obtained from Coriell Cell Repositories (Camden, (P*G)ik and (P:I*G)i(j)k, account for the gene depen- NJ). Zygosity was confirmed by genotyping a panel of dent variation between and within pairs respectively. 20 microsatellite markers. There were no discordant These terms are used to gauge the overall importance markers within