Searching for Rare Variants Associated with OSAHS-Related Phenotypes
SEARCHING FOR RARE VARIANTS ASSOCIATED WITH OSAHS-RELATED PHENOTYPES THROUGH PEDIGREES

by

JINGJING LIANG

Dissertation Advisor: Dr. Xiaofeng Zhu

Department of Population and Quantitative Health Sciences

CASE WESTERN RESERVE UNIVERSITY

May 29, 2019

CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of Jingjing Liang, candidate for the degree of Ph.D.

Committee Chair: Scott M. Williams
Committee Member: Jonathan L. Haines
Committee Member: Xiaofeng Zhu
Committee Member: Rong Xu
Committee Member: Curtis M. Tatsuoka

Date of Defense: January 29, 2019

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of Contents

CHAPTER 1: LITERATURE REVIEW AND SPECIFIC AIMS
  1.1 Obstructive sleep apnea-hypopnea syndrome
  1.2 AHI and SpO2
  1.3 Rare variants and missing heritability
  1.4 Rare variant association analysis
  1.5 Rare variant test using pedigree
  1.6 Annotating variants in genetic regions
  1.7 Mendelian randomization
  1.8 Specific aims

CHAPTER 2: IDENTIFYING LOW FREQUENCY AND RARE VARIANTS ASSOCIATED WITH AVSPO2S USING PEDIGREES
  2.1 Introduction
  2.2 Material and methods
    2.2.1 Description of study samples
    2.2.2 Overview of the method
    2.2.3 Primary phenotype
    2.2.4 Whole-genome sequencing
    2.2.5 Imputation of replication cohorts
    2.2.6 Linkage analysis of AvSpO2S
    2.2.7 Simulated family data
    2.2.8 Analysis of CFS stage I families
    2.2.9 Gene-based association test
    2.2.10 Identifying variants contributing to linkage evidence in CFS-EAs
    2.2.11 Estimating the proportion of AvSpO2S variability explained by the identified variants in DLC1
    2.2.12 Test the aggregation of variant effect size directions in DLC1
    2.2.13 Gene expression analysis
    2.2.14 Cell type specific regulatory annotation enrichment analysis
    2.2.15 Mendelian randomization analysis
    2.2.16 AvSpO2S and methylation in the DLC1 gene in MESA
    2.2.17 AvSpO2S and DLC1 gene expression in the Sleep Heart Health Study from Framingham Heart Study
  2.3 Results
    2.3.1 Persistent linkage evidence of AvSpO2S on chromosome 8p23
    2.3.2 Choice of family specific LOD score threshold 0.1
    2.3.3 Multiple low frequency and rare variants in DLC1 are associated with AvSpO2S
    2.3.4 Conditioning on the effects of identified DLC1 variants reduces the linkage evidence in CFS-EAs
    2.3.5 Identified DLC1 variants are enriched in regulatory regions
    2.3.6 DLC1 non-coding variants are associated with DLC1 expression level in human skin cells-transformed fibroblasts
    2.3.7 DLC1 DNA methylation is weakly associated with AvSpO2S and AHI
    2.3.8 DLC1 expression is weakly associated with AvSpO2S and AHI
    2.3.9 Association of DLC1 with AvSpO2S is not mediated by AHI
    2.3.10 Potential pleiotropic effect between DLC1, AvSpO2S and lung function
  2.4 Discussion

CHAPTER 3: ASSESSING THE INDEPENDENCE OF ASSOCIATION TESTS AND LINKAGE EVIDENCE OBTAINED IN THE SAME DATA
  3.1 Introduction
  3.2 Method
  3.3 Results
    3.3.1 Assess the empirical P-values for burden and SKAT tests using genome-wide variants
    3.3.2 Enrichment of 5% significant gene-based tests on chromosome 8 target region in CFS-EAs
  3.4 Discussion

CHAPTER 4: DEVELOPING AN ANALYSIS PIPELINE TO SEARCH RARE VARIANTS THROUGH LINKAGE ANALYSIS USING WGS DATA
  4.1 Introduction
  4.2 Method
    4.2.1 Overview of the analysis pipeline
    4.2.2 Variance component linkage analysis
    4.2.3 Identify candidate variants in WGS data
    4.2.4 Gene-based association tests using WGS data
  4.3 Results
    4.3.1 Multiple rare variants in CAV1 are associated with AHI
    4.3.2 CAV1 variants regulatory annotation and gene expression analysis
  4.4 Discussion

CHAPTER 5: DISCUSSION AND FUTURE WORK

List of Tables

Table 1.1 Summary of the features, the pros and cons of the different type of methods for rare variants association test
Table 2.1 Sample characteristics of TOPMed WGS and imputed genotype studies
Table 2.2 Selected coding and non-coding variants on 105 genes in target region
Table 2.3 Type I error of burden and SKAT test
Table 2.4 Power of burden and SKAT tests when simulated rare variants account for 1% of variability of phenotype
Table 2.5 Power of burden and SKAT tests when simulated rare variants account for 0.5% of variability of phenotype
Table 2.6 Stage I and II gene-based association tests with AvSpO2S
Table 2.7 Gene-based association test for DLC1 with AvSpO2S in TOPMed sequencing and independent replication data with imputed genotypes
Table 2.8 Mendelian randomization analysis to assess causal effects of DLC1 expression in skin cell-transformed fibroblasts to AvSpO2S
Table 2.9 Sample characteristics and results of DLC1 methylation association test with AvSpO2S
Table 2.10 Sample characteristics and results of DLC1 expression level association with AHI and AvSpO2S
Table 2.11 Stage I and II gene-based association tests with AvSpO2S by adjusting AHI as covariate
Table 2.12 Stage I and II gene-based association tests for DLC1 with AvSpO2S or FEV1/FVC by adjusting FEV1/FVC or AvSpO2S as covariates
Table 3.1 Summary of the genome-wide gene based test using CFS-EA stage I family filtered variants for AvSpO2S
Table 3.2 Enrichment of 5% significant AvSpO2S gene-based tests on chromosome 8 target region in CFS-EA
Table 4.1 Stage I and II gene-based association tests with AHI

List of Figures

Figure 1.1 Normal breathing, snoring and OSAHS
Figure 1.2 Example of polysomnography trace during repetitive apnoeas
Figure 1.3 Distribution of AHI and AvSpO2S raw measurement in CFS-EAs
Figure 1.4 Boxplots to compare AHI and AvSpO2S in healthy people and OSAHS patients in CFS-EAs
Figure 1.5 Variance component linkage analysis of AHI
Figure 1.6 Genetic variant frequencies and effect sizes
Figure 1.7 Non-coding regions affect proximal and distal regulators
Figure 2.1 Variance component linkage analysis of AvSpO2S in CFS-EAs on chromosome 8
Figure 2.2 Mixture distribution of family specific LOD scores in CFS families
Figure 2.3 Distribution of family specific LOD score (All 40 causal variants have the same effect directions.)
Figure 2.4 Distribution of family specific LOD score (Half of 40 causal variants have the same effect directions.)
Figure 2.5 The analysis flow chart for searching low frequency and rare variants associated with AvSpO2S using the TOPMed WGS data
Figure 2.6 Linkage evidence of AvSpO2S on chromosome 8 in CFS-EAs
Figure 2.7 Conditional linkage analysis for AvSpO2S on chromosome 8 in CFS-EAs TOPMed individuals
Figure 2.8 Effect sizes of the 57 variants in DLC1 estimated using the stage II samples conditional on MAF
Figure 2.9 Cell type specific regulatory annotation enrichment tests for the identified non-coding variants in DLC1
Figure 2.10 51 non-coding variants and the corresponding effect sizes in DLC1 genes plotted against physical locations
Figure 2.11 Gene based test for DLC1, MYOM2 and CSMD1 selected variants with corresponding expression level in GTEx tissues
Figure 2.12 Mendelian randomization analysis using 24 DLC1 variants as instrumental variables
Figure 2.13 Comparison of the effect sizes of single variant associations with and without adjusting for FEV1, FVC and AHI as covariate for DLC1 selected variants
Figure 3.1 Distribution of variants number in each gene in genome-wide gene-based test using CFS-EA Stage I families filtered for coding and non-coding variants
Figure 3.2 Quantile-quantile plots for genome-wide gene-based test using CFS-EA stage I family filtered variants for AvSpO2S
Figure 4.1 Overview of the analysis pipeline processes
Figure 4.2 Variance component linkage analysis of AHI in