Replication and Functional Validation of SNPs Previously Associated with Coronary Artery Disease
Replication and Functional Validation of SNPs Previously Associated with Coronary Artery Disease

Matthew B. Sellers, MD

A Thesis Submitted to the Graduate Faculty of WAKE FOREST UNIVERSITY GRADUATE SCHOOL OF ARTS AND SCIENCES in Partial Fulfillment of the Requirements for the Degree of MASTER OF SCIENCE in Clinical and Population Translational Science

Winston-Salem, North Carolina
August 2013

Approved by:
David M. Herrington, M.D., M.H.S., Advisor

Examining Committee:
Beverly M. Snively, Ph.D., Examining Chair
Donald W. Bowden, Ph.D., Chairman
Daniel Beavers, Ph.D.
Timothy D. Howard, Ph.D.

ACKNOWLEDGEMENTS

I would like to sincerely thank the following people for their guidance, support and mentorship, without whom this thesis submission would not have been possible.

To Dr. Herrington, thank you for your mentorship and guidance throughout the thesis process, as well as your continued mentorship during my cardiology fellowship. Your commitment to the cardiology research fellows, as well as the general clinical fellows, in our pursuit of knowledge has been unparalleled.

To Allie Richardson and Karen Blinson, thank you for keeping me in line with all my abstract and poster deadlines, appointment times, and professional obligations. You were instrumental in getting me settled into fellowship as well as providing general advice and support for settling into life in Winston-Salem.

To Georgia Saylor, thank you for your computer and statistical guidance. You were always available on an immediate basis for help, and have spent countless hours running long SAS programs for these genetic analyses.

To Drs. Howard and Liu, thank you for your guidance with several abstracts and with writing the Translational Science Institute grant. You have been very approachable and patient when discussing genetic concepts and basic laboratory and analysis techniques.
Your mentorship was not only instrumental in helping me develop the tools to complete this research project but also forged a friendship through the process.

I would also like to take this opportunity to thank my wife, Holly, for her continued understanding as I have moved her from city to city during my training and decided to undertake an additional year of fellowship, and for giving me my newborn son, for whom I am eternally grateful.

TABLE OF CONTENTS

List of Abbreviations
List of Illustrations
Abstract
Chapter 1. Introduction
1. Specific Aims
2. Replication of the Association of PSRC1 and MTHFD1L with Coronary Artery Disease: The Multi-Ethnic Study of Atherosclerosis
3. Functional Validation of Previously Identified SNPs Associated with Coronary Artery Disease
4. Curriculum Vitae

LIST OF ABBREVIATIONS

CAD = Coronary Artery Disease
CHD = Coronary Heart Disease
GWAS = Genome-Wide Association Studies
Health ABC = Health, Aging and Body Composition Study
LDL = Low Density Lipoprotein
mRNA = Messenger RNA (ribonucleic acid)
SNP = Single Nucleotide Polymorphism

LIST OF ILLUSTRATIONS

Chapter 1
I. Previously published SNPs associated with CAD with genome-wide significance

Chapter 2
I. Manhattan Plot representing the MESA GWAS for all ethnic groups
II. Replication of previously published SNPs with genome-wide significance within the 4 ethnic groups MESA GWAS
III. MESA SNP associations replicated within Health ABC and meta-analysis
IV. Hazard ratios with 95% CI of SNP associated with MTHFD1L in African Americans (top) and SNP associated with PSRC1 in Caucasians (bottom)

Chapter 3
I. Previously published SNPs associated with CAD in the literature that were available for gene expression analysis
II. SNP associations with mRNA expression levels
III. Statistically significant SNP associations stratified by race

Abstract

Introduction

Genome-wide association studies have identified ~30 SNPs associated with CHD in several large cohorts of predominantly European descent. We sought additional replication of these associations in more diverse cohorts, including the Multi-Ethnic Study of Atherosclerosis (MESA) cohort (N = 6,425) and the Health, Aging and Body Composition (Health ABC) cohort (N = 2,800). To provide additional functional validation, we also examined the associations of a subset of 16 of these SNPs with the expression of ~25,000 gene transcripts in purified monocytes from a subset (N = 1,264) of MESA participants.

Methods

Cox proportional hazards models were used to measure the association between SNPs and time to first CHD event after adjustment for age, gender, study site, and genetic ancestral principal components in both the MESA and Health ABC cohorts. Random-effects meta-analysis was used to evaluate the association across both cohorts. Genome-wide mRNA expression profiles were generated with the Illumina HumanHT-12 BeadChip platform in purified monocytes from 1,264 MESA subjects sampled to achieve balance across age, race/ethnicity and sex. Associations of 16 of these SNPs with individual gene transcripts were assessed using generalized linear models adjusting for age, race, sex, study site and technical factors, including the percent of cell contamination with neutrophils, natural killer cells, B cells and T cells.
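
The random-effects pooling step described in the Methods can be illustrated with a minimal sketch. The thesis does not name the estimator it used, so the DerSimonian-Laird method below is an assumption, chosen only because it is a common random-effects approach. The function name `random_effects_meta` and the input values are hypothetical and are not taken from the MESA or Health ABC analyses.

```python
import math
from statistics import NormalDist


def random_effects_meta(log_hrs, ses):
    """Pool per-cohort log hazard ratios with the DerSimonian-Laird estimator.

    Assumption: the thesis does not state which random-effects estimator was
    used; DerSimonian-Laird is shown here purely as an illustration.
    """
    k = len(log_hrs)
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, log_hrs)) / sum_w
    # Cochran's Q measures between-cohort heterogeneity.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, log_hrs))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-cohort variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each cohort's variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * b for wi, b in zip(w_re, log_hrs)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    z = pooled / se_pooled
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))  # two-sided Wald p value
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "p": p,
        "tau2": tau2,
    }


# Hypothetical inputs: log hazard ratios and standard errors from two cohorts.
result = random_effects_meta([math.log(1.40), math.log(1.20)], [0.15, 0.08])
print(f"pooled HR = {result['hr']:.2f} "
      f"(95% CI {result['ci_low']:.2f} to {result['ci_high']:.2f}), "
      f"p = {result['p']:.3g}, tau^2 = {result['tau2']:.3g}")
```

With only two cohorts the between-cohort variance is estimated very imprecisely, so a sketch like this is best read as a description of the arithmetic rather than as a reproduction of the reported results.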

Results

Two of the 30 previously published SNP associations with coronary artery disease, rs599839, in close proximity to the PSRC1 gene on chromosome 1 (p value 0.01, HR 1.45), and rs6922269, within the MTHFD1L gene on chromosome 6 (p value 0.03, HR 1.52), reached nominal significance in the MESA cohort, and these findings were replicated within the Health ABC cohort (PSRC1 p value 0.009, HR 1.25 and MTHFD1L p value 0.016, HR 1.19). Of the 16 previously reported SNPs associated with CAD, 5 were statistically significantly associated with differing mRNA expression levels after Bonferroni adjustment (p values < 3.17 × 10⁻⁶), and rs599839 was associated with differing mRNA expression of the PSRC1 gene (p value 3 × 10⁻¹⁴). An increasing number of copies of the rs599839 minor allele was associated with decreased expression of the PSRC1 transcripts.

Conclusion

Two SNPs in proximity to PSRC1 (rs599839) and MTHFD1L (rs6922269) were validated in 2 large independent cohorts, strengthening the association of these SNPs with coronary artery disease. Five SNPs with GWAS evidence of associations with coronary artery disease are associated with differing mRNA expression, suggesting that the mechanism of some of these associations may involve modulation of gene expression. More research is warranted to determine the full mechanisms of these associations.

Chapter 1

Introduction

Beyond the influence of environmental and lifestyle factors [1], coronary disease tends to cluster in families, suggesting that genetic factors play a crucial role in the development of cardiovascular disease, especially premature coronary heart disease. In an early case-control study in the Honolulu Heart Study cohort, the relative risk of coronary heart disease death was 11.3 for fathers of CHD cases with early onset CHD, and the relative risk of developing CHD for siblings of early onset CHD cases was 2.5 [2].
In a prospective study of coronary heart disease in men aged 39-59, a reported parental history of coronary heart disease was statistically significantly (p = 0.01) associated with an increased risk of the combined incidence of symptomatic myocardial infarction and angina pectoris in subjects under 50 years of age after adjustment for known CHD risk factors [3]. Further support for genetic influences on the development of coronary heart disease, as well as on the risk of CHD mortality, came from a landmark twin study published in the New England Journal of Medicine that compared monozygotic and dizygotic twins. The relative hazard of death from CHD when a male twin died of CHD prior to age 55 was 8.1 (95% CI 2.7 to 24.5) for monozygotic twins and 3.8 (95% CI 1.4 to 10.5) for dizygotic twins. Similar risks were seen when female twins died prior to the age of 65 years, and these risks were not significantly decreased when adjusting for known CHD risk factors [4]. In a subsequent twin study, the heritability of death from CHD was 0.57 (95% CI 0.45 to 0.69) amongst male twins and 0.38 (95% CI 0.26 to 0.50) amongst female twins [5].

The genetic influence on coronary heart disease has been well established for several decades. Since the sequencing of the human genome, researchers have focused on identifying genes and specific regions within the genome, with the hope of uncovering new pathophysiologic mechanisms and risk factors for the development of cardiovascular disease, as well as other complex diseases.

Genome-Wide Association Studies

Cardiovascular epidemiology and clinical research have been unable to fully explain an individual’s risk for developing disease. The portion of cardiovascular disease risk not explained by traditional risk factors has been termed the “missing risk” or “missing heritability” [6].
Recent efforts to explain this missing risk have focused on genetic research, and new technological advancements, such as genome-wide association studies (GWAS), have produced an explosion of new genetic information. GWAS aim to characterize the association between single nucleotide polymorphisms (SNPs), which serve as markers for specific regions within the genome,