Gene-Environment Interactions in Cardiovascular Disease

by

Cavin Keith Ward-Caviness

Graduate Program in Computational Biology and Bioinformatics
Duke University

Approved:
Elizabeth R. Hauser, Supervisor
William E. Kraus
Sayan Mukherjee
H. Frederik Nijhout

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate Program in Computational Biology and Bioinformatics in the Graduate School of Duke University

2014

Copyright by Cavin Keith Ward-Caviness 2014

Abstract

In this manuscript I seek to demonstrate the importance of gene-environment interactions in cardiovascular disease.
This manuscript contains five studies, each of which contributes to our understanding of the joint impact of genetic variation and environmental exposures on cardiovascular disease: a candidate gene study of gene-smoking interactions associated with early-onset coronary artery disease, an epidemiology study of the association between traffic-related air pollution and cardiovascular disease, a Genome-Wide Interaction Study for gene-by-traffic related air pollution interactions associated with peripheral arterial disease, a Genome-Wide Interaction Study for gene-by-traffic related air pollution interactions on coronary atherosclerosis burden, and a method for analyzing associations between high-dimensional genomics datasets.

Smoking is a strong risk factor for coronary artery disease and may play a causative role in its incidence. Smoking had been implicated as a reason for the heterogeneity observed in associations between genetic variants on chromosome 3 and coronary artery disease. I used a family-based early-onset coronary artery disease cohort (GENECARD) to study gene-smoking interactions. I also used data from three independent cohorts to perform a meta-analysis of gene-smoking interactions focusing on the KALRN gene and the Rho-GTPase pathway. I found significant evidence for gene-smoking interactions involving variants in KALRN and other Rho-GTPase pathway genes on chromosome 3.

Though the estimated increase in incident cardiovascular disease or cardiovascular events due to air pollution exposure is modest at 3-5%, the ubiquitous nature of air pollution exposure means it has a substantial population-level impact on cardiovascular disease.
Historically, genome-wide interaction studies with air pollution have not yielded genome-wide significant interactions; however, by implementing statistical tools novel to this field, I have discovered significant interactions between genetic variants and traffic-related air pollution that are associated with cardiovascular diseases. I studied interactions associated with peripheral arterial disease and with the number of diseased coronary vessels (an indicator of coronary artery disease burden) using race-stratified cohort study designs. For peripheral arterial disease, I observed that variants in both BMP8A and BMP2 showed evidence for interactions in both European-American and African-American cohorts. In BMP8A I uncovered the first genome-wide significant interaction with air pollution associated with cardiovascular disease. BMP2 gene expression is upregulated after exposure to black carbon, a major component of diesel exhaust, and coding variants within this gene showed evidence for interaction. For the number of diseased coronary vessels, I observed that variants in PIGR showed significant evidence for involvement in gene-traffic related air pollution interactions. I observed that coding variation within PIGR was associated with coronary artery disease burden in a gene-by-traffic related air pollution interaction model. As PIGR is involved in the immune response, it represents a strong candidate gene discovered via an unbiased genome-wide scan.

The use of high-dimensional data to study chronic disease is becoming commonplace. To analyze high-dimensional data properly without suffering from high false-discovery rate penalties, the data is often summarized in a way that takes advantage of the correlation structure. Two common approaches for this are principal components analysis and canonical correlation analysis. However, neither of these approaches is appropriate when one preferentially desires to preserve structure within the data.
To address this shortcoming, I developed constrained canonical correlation analysis (cCCA). With cCCA one can evaluate the correlation between two high-dimensional datasets while preferentially preserving structure in one of the datasets. This has uses when studying multi-variate outcomes such as cardiovascular disease using multi-variate predictors such as air pollution. Additionally, cCCA can be used to create endophenotype factors that specifically explain the variation within a high-dimensional set of predictors (such as gene expression or metabolomics data) with respect to potential endophenotypes for cardiovascular disease, such as cholesterol measures.

For the family.

Contents

Abstract
List of Tables
List of Figures
List of Symbols and Abbreviations
Acknowledgements
1. Introduction
1.1 Cardiovascular Disease
1.2 Genetics of Cardiovascular Disease
1.3 Air Pollution, Smoking and Cardiovascular Disease
1.3.1 Smoking Background
1.3.2 Air Pollution Background
1.4 Gene-Environment Interaction Studies
2. Gene-smoking interactions in multiple Rho-GTPase pathway genes in an early-onset coronary artery disease cohort
2.1 Introduction
2.2 Methods
2.2.1 Subjects
2.2.2 Statistical Methods
2.2.2.1 Meta-Analysis
2.2.2.2 Pathway-based Analysis
2.3 Results
2.3.1 APL Results
2.3.2 APL-OSA Results
2.3.3 Pathway-based WebGestalt Results
2.4 Discussion
2.4.1 Chromosome 3q13 associations and LD
2.4.2 Evidence for validation
2.4.3 Evidence for allelic heterogeneity in KALRN
2.4.4 Functional heterogeneity in KALRN
2.4.5 Smoking and KALRN