Jóhanna Jakobsdóttir
Total Page:16
File Type:pdf, Size:1020Kb
GENETICS OF AGE-RELATED MACULOPATHY & SCORE STATISTICS FOR X-LINKED QUANTITATIVE TRAIT LOCI by J´ohannaJakobsd´ottir BS, University of Iceland, Reykjav´ık,Iceland, 2004 Submitted to the Graduate Faculty of the Department of Biostatistics Graduate School of Public Health in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Pittsburgh 2009 UNIVERSITY OF PITTSBURGH Graduate School of Public Health This dissertation was presented by J´ohannaJakobsd´ottir It was defended on April 1, 2009 and approved by Dissertation Advisor: Daniel E. Weeks, Ph.D., Professor, Depts. of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh Committee Member: Eleanor Feingold, Ph.D., Associate Professor, Depts. of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh Committee Member: Sati Mazumdar, Ph.D., Professor, Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh Committee Member: Lisa Weissfeld, Ph.D., Professor, Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh ii Copyright c by J´ohannaJakobsd´ottir 2009 iii GENETICS OF AGE-RELATED MACULOPATHY & SCORE STATISTICS FOR X-LINKED QUANTITATIVE TRAIT LOCI J´ohannaJakobsd´ottir,PhD University of Pittsburgh, 2009 Age-related maculopathy (ARM) is a common cause of irreparable vision loss in industrialized countries. The disease is characterized by progressive loss of central vision making everyday tasks challenging. The etiology is complex and has both an environmental and a strong genetic compo- nents. The public health relevance of the work is to improve the understanding genetic causes in the disease etiology and ultimately to lead to better disease management and prevention. From my ARM work, I present four papers covering range of statistical approaches. The first paper presents fine-mapping efforts, using both linkage and association methods, under previously identified link- age peaks on chromosomes 1q31 and 10q26. We replicate the discovery of the complement factor H (CFH ) gene on 1q31 and identify a novel locus, harboring three closely linked genes (PLEKHA1, LOC387715, and HTRA1 ), on 10q26. Both discoveries have been widely replicated. In the next paper I present meta-analysis of 11 CFH and 5 LOC387715 data sets. We also replicate these findings in two independent case-control cohorts, including one cohort, where ARM status was not a factor in the ascertainment. In the third paper we replicate discoveries of new complement related loci (C2 and CFB) on chromosome 19p13 as well as developing classification models based on SNPs from CFH, LOC387715, and C2. The last paper focuses on applying statistical techniques from the diagnostic medicine literature to ARM. We comment on the importance of understand- ing the difference and similarities between different goals of genetic studies: improving etiological understanding or finding variants that discriminate well between cases and controls. This work is particularly relevant today when there has been explosion in the availability of direct-to-consumer DNA tests. In addition to carrying out linkage and association analysis, I also have extended the statistical theory behind score-based linkage analyses for X chromosomal markers. This work has public health relevance because many complex common diseases have sex-specific differences, such as prevalence iv and age of onset. Modeling those appropriately with powerful and robust methods will bring an improved understanding of their genetic basis. v TABLE OF CONTENTS PREFACE .............................................. xvii 1.0 BASICS IN GENETICS AND GENE MAPPING .................1 2.0 BACKGROUND ON THE AGE-RELATED MACULOPATHY PROJECT .5 2.1 Epidemiology and genetics of Age-related maculopathy................5 2.2 Statistical methods used in the project.........................8 2.2.1 Linkage analysis.................................8 2.2.1.1 Parametric LOD scores and HLOD scores..............8 2.2.1.2 Model-free LOD scores........................9 2.2.2 Association analysis............................... 10 2.2.2.1 χ2 and Fisher's exact tests and logistic regression......... 10 2.2.2.2 CCREL and MQLS.......................... 11 2.2.3 Joint linkage and association.......................... 12 2.2.3.1 GIST.................................. 12 2.2.4 Meta-analysis................................... 12 2.2.4.1 Fixed effects modeling......................... 13 2.2.4.2 Random effects modeling....................... 13 2.2.5 Model building.................................. 13 2.2.5.1 Logistic regression........................... 13 2.2.5.2 GMDR................................. 14 2.2.5.3 ROC curves.............................. 14 2.2.6 Practical issues.................................. 14 2.2.6.1 Samples vs. individuals........................ 14 2.2.6.2 Check your files, scripts, and results................. 15 vi 3.0 SUSCEPTIBILITY GENES FOR AGE-RELATED MACULOPATHY ON CHROMOSOME 10Q26 .................................. 17 3.1 Abstract.......................................... 17 3.2 Introduction....................................... 18 3.3 Material and Methods.................................. 19 3.3.1 Families and Case-Control Cohort....................... 19 3.3.2 Affection-Status Models............................. 19 3.3.3 Pedigree and Genotyping Errors and Data Handling............. 21 3.3.4 Allele Frequencies and Hardy-Weinberg Equilibrium............. 21 3.3.5 Genetic Map................................... 21 3.3.6 LD Structure................................... 21 3.3.7 Linkage Analysis................................. 22 3.3.7.1 Two-point analysis........................... 22 3.3.7.2 Multipoint analysis ignoring LD................... 22 3.3.7.3 Multipoint analysis using htSNPs.................. 22 3.3.8 Association Analysis............................... 23 3.3.9 GIST Analysis.................................. 23 3.3.10 Tripartite Analyses............................... 24 3.3.11 Part I: Analysis of CIDR SNPs......................... 24 3.3.12 Part II: Analysis of Locally Genotyped SNPs................. 26 3.3.13 Part III: Interaction and OR Analysis..................... 26 3.3.13.1 Unrelated cases............................. 26 3.3.13.2 Analysis of interaction with CFH .................. 29 3.3.13.3 Magnitude of association....................... 29 3.3.13.4 Multiple-testing issues......................... 30 3.4 Results.......................................... 30 3.4.1 Part I: Analysis of CIDR SNPs......................... 30 3.4.1.1 CIDR linkage results.......................... 30 3.4.1.2 CIDR association results....................... 31 3.4.1.3 CIDR GIST results.......................... 35 3.4.2 PART II: Analysis of Locally Genotyped SNPs................ 35 3.4.2.1 Local association results........................ 35 vii 3.4.2.2 Local GIST results........................... 35 3.4.3 Part III: Interaction and OR Analyses..................... 37 3.4.3.1 GIST results.............................. 37 3.4.3.2 Logistic regression results....................... 37 3.4.3.3 OR and AR.............................. 37 3.4.3.4 Subphenotype analyses........................ 38 3.5 Discussion......................................... 38 4.0 CFH, ELOVL4, PLEKHA1 AND LOC387715 GENES AND SUSCEPTI- BILITY TO AGE-RELATED MACULOPATHY: AREDS AND CHS CO- HORTS AND META-ANALYSES ........................... 45 4.1 Abstract.......................................... 45 4.2 Introduction....................................... 46 4.3 Results.......................................... 48 4.3.1 Association analyses............................... 48 4.3.1.1 CFH .................................. 49 4.3.1.2 ELOVL4 ................................ 52 4.3.1.3 PLEKHA1 and LOC387715 ..................... 53 4.3.2 Interaction analyses............................... 54 4.3.3 APOE results.................................. 55 4.3.4 Meta-analyses.................................. 55 4.3.4.1 Meta-analysis of CFH ......................... 55 4.3.4.2 Meta-analysis of LOC387715 ..................... 57 4.4 Discussion......................................... 57 4.5 Materials and Methods................................. 62 4.5.1 Cardiovascular health study (CHS) participantssampling and phenotyping. 62 4.5.2 Age-related eye disease study (AREDS) participantssampling and pheno- typing....................................... 63 4.5.3 Genotyping.................................... 64 4.5.4 Association analyses............................... 64 4.5.5 Distinguishing between PLEKHA1 and LOC387715 ............. 65 4.5.6 Interaction analyses............................... 65 4.5.7 APOE analyses................................. 66 viii 4.5.8 Meta-analyses.................................. 67 5.0 C2 AND CFB GENES IN AGE-RELATED MACULOPATHY AND JOINT ACTION WITH CFH AND LOC387715 GENES .................. 71 5.1 Abstract.......................................... 71 5.2 Introduction....................................... 72 5.3 Materials and methods.................................. 74 5.3.1 Phenotyping, study participants and quality control............. 74 5.3.2 Genotyping.................................... 75 5.3.3 Association analyses and LD estimation.................... 77 5.3.3.1 Case-Control data........................... 77 5.3.3.2 Family data............................... 77 5.3.4 Multifactor and gene-gene interaction analyses................ 78 5.3.4.1