Genome Wide Association of Chronic Kidney Disease Progression: the CRIC Study (Author List and Affiliations Listed at End of Document)
SUPPLEMENTARY MATERIALS

Genotyping information
Molecular pathway analysis information
Replication cohort acknowledgments
Supplementary Table 1. AA top hit region gene function
Supplementary Table 2. EA top hit region gene function
Supplementary Table 3. GSA pathway results
Supplementary Table 4. Number of molecular interaction based on top candidate gene molecular networks
Supplementary Table 5. Results of top gene marker association in AA, based on EA derived candidate gene regions
Supplementary Table 6. Results of top gene marker association in EA, based on AA derived candidate gene regions
Supplementary Table 7. EA Candidate SNP look up
Supplementary Table 8. AA Candidate SNP look up
Supplementary Table 9. Replication cohorts
Supplementary Table 10. Replication cohort study characteristics
Supplementary Figure 1a-b. Boxplot of eGFR decline in AA and EA
Supplementary Figure 2a-l. Regional association plot of candidate SNPs identified in AA groups
Supplementary Figure 3a-f. Regional association plot of candidate SNPs identified in EA groups
Supplementary Figure 4. Molecular Interaction network of candidate genes for renal, cardiovascular and immunological diseases
Supplementary Figure 5. Molecular Interaction network of candidate genes for renal diseases
Supplementary Figure 6. ARRDC4 LD map
Author list and affiliations

Genotyping

Genotyping was performed on a total of 3,635 CRIC participants who provided specific consent for investigations of inherited genetics (of a total of 3,939 CRIC participants).
Genotyping was conducted at the Genetic Analysis Platform, Broad Institute of MIT and Harvard, using the Illumina HumanOmni1-Quad v1.0 microarray, which comprises 1,140,419 SNPs. SNP genotypes were called using Illumina’s BeadStudio Genotyping Module (Illumina Inc, San Diego, CA, USA). Among the 3,635 participant samples, 81 were excluded based on quality control metrics including sex discordance (n=7), reduced or excess genotypic heterozygosity (heterozygosity rate more than 3 standard deviations from the mean; n=25), cryptic relatedness (π > 0.185; n=64) or sample ID mismatches (n=28); 16 individuals were excluded because of more than one of these four metrics. All samples had a genotype call rate of at least 97%. This resulted in 3,527 samples that passed our quality control metrics.

We then undertook principal component (PC) analyses to infer genetic ancestry on these remaining 3,527 samples. Using the cut points of PC 1 ≥ 0.0298 and PC 2 ≥ 0.0651, we identified a cluster of 1,581 participants of European ancestry (whites) and a cluster of 1,493 participants of African ancestry (blacks).

Of the potential 1,140,419 markers on the genotype array, 184,789 were removed based on poor clustering or replication failures (i.e., assay performance). Then, separately within the white and black subgroup populations, we completed SNP marker-level quality control on the remaining 955,630 SNP markers. In whites, we excluded a total of 214,437 markers (22.4%) because of minor allele frequency (MAF) < 0.03, 531 (0.06%) because of deviation from Hardy-Weinberg equilibrium (P < 1 × 10⁻⁷), and 1,053 (0.11%) because of genotype call rate < 0.95; 739,978 SNP markers remained in the discovery phase for whites.
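As an illustration of the three marker-level filters quoted here (MAF < 0.03, Hardy-Weinberg P < 1 × 10⁻⁷, genotype call rate < 0.95), the Python sketch below flags a single biallelic SNP from its genotype counts. The `MarkerCounts` container and the function names are hypothetical. The Hardy-Weinberg test shown is a 1-degree-of-freedom chi-square approximation chosen for brevity; the text does not state which test or software was used for this step, so this should not be read as the study's implementation.

```python
import math
from dataclasses import dataclass

# Thresholds as quoted in the text.
MAF_MIN = 0.03
HWE_P_MIN = 1e-7
CALL_RATE_MIN = 0.95


@dataclass
class MarkerCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    n_aa: int       # homozygous for allele A
    n_ab: int       # heterozygous
    n_bb: int       # homozygous for allele B
    n_missing: int  # samples with no genotype call


def call_rate(m: MarkerCounts) -> float:
    called = m.n_aa + m.n_ab + m.n_bb
    total = called + m.n_missing
    return called / total if total else 0.0


def minor_allele_frequency(m: MarkerCounts) -> float:
    called = m.n_aa + m.n_ab + m.n_bb
    if called == 0:
        return 0.0
    freq_a = (2 * m.n_aa + m.n_ab) / (2 * called)
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_pvalue(m: MarkerCounts) -> float:
    """Chi-square (1 df) approximation to a Hardy-Weinberg test.

    Assumption: a chi-square test stands in for whatever test the
    study actually used.
    """
    n = m.n_aa + m.n_ab + m.n_bb
    if n == 0:
        return 1.0
    p = (2 * m.n_aa + m.n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no departure can be tested
    observed = (m.n_aa, m.n_ab, m.n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def qc_failures(m: MarkerCounts) -> list:
    """Return the list of filters this marker fails (empty if it passes)."""
    reasons = []
    if minor_allele_frequency(m) < MAF_MIN:
        reasons.append("maf")
    if hwe_chi2_pvalue(m) < HWE_P_MIN:
        reasons.append("hwe")
    if call_rate(m) < CALL_RATE_MIN:
        reasons.append("call_rate")
    return reasons
```

Because a marker can fail more than one filter, `qc_failures` returns every reason rather than stopping at the first. The filters are applied within one ancestry subgroup at a time, so the counts passed in would be tallied per subgroup.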
Among blacks, we excluded a total of 111,478 markers (12.0%) because of MAF < 0.03, 1,047 (0.11%) because of deviation from Hardy-Weinberg equilibrium (P < 1 × 10⁻⁷), and 1,109 (0.12%) because of genotype call rate < 0.95; 839,205 SNP markers remained in the discovery phase for blacks.

We completed genome-wide imputation based on the 1000 Genomes mixed race/ethnicity (ALL) genetic backbone (NCBI build 37, release date March 2012; n=1093) using IMPUTE2. Only those imputed SNP markers passing imputation quality control (info [a measure of r²] ≥ 0.80) were retained for analysis, resulting in a total of 8,191,067 and 11,726,769 variants with MAF > 0.03 in our EA and AA cohorts, respectively.

Molecular pathway analysis information

Of our 18 identified candidate SNPs, we were able to assign 16 SNPs to specific genes based on physical proximity. Interaction networks centered on the molecular products of these 16 genes were constructed using Integrative pathway analysis (Ingenuity® Systems, www.ingenuity.com). An interaction network is a graphical representation of the molecular relationships between genes and gene products. Gene products are represented as nodes, and the biological relationship between two nodes is represented as a line that is supported by at least one reference from the literature. In the graphic network, the type categories of the molecules are represented by various shapes, and the label along each line indicates the kind of interaction between the connected molecules. Supplementary Figures 4 and 5 list the legends for the type categories of the molecules and the nature of the interactions between molecules. Only experimentally verified interaction relationships, either direct physical contact or indirect regulation through intermediate molecules, were considered. Exogenous chemicals were excluded from the interaction network analysis.
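The inclusion rules for network edges described in this paragraph can be pictured with a small data-structure sketch. The Python below is an illustration under stated assumptions, not the Ingenuity implementation: the `Interaction` record, the `build_network` helper, the molecule type labels, and the placeholder molecule names (`GENE_A`, `CHEM_X`, and so on) are all hypothetical and do not correspond to any interaction reported in this study.

```python
from dataclasses import dataclass

EXCLUDED_TYPE = "exogenous chemical"
ALLOWED_KINDS = {"direct", "indirect"}


@dataclass(frozen=True)
class Interaction:
    """One candidate edge between two molecules (hypothetical record)."""
    source: str
    target: str
    kind: str                      # "direct" contact or "indirect" regulation
    references: tuple              # supporting literature identifiers
    experimentally_verified: bool


def build_network(interactions, molecule_types):
    """Build an undirected adjacency map, keeping only edges that are
    experimentally verified, backed by at least one reference, of an
    allowed kind, and not touching an exogenous chemical."""
    network = {}
    for edge in interactions:
        endpoint_types = {
            molecule_types.get(edge.source),
            molecule_types.get(edge.target),
        }
        keep = (
            edge.experimentally_verified
            and len(edge.references) >= 1
            and edge.kind in ALLOWED_KINDS
            and EXCLUDED_TYPE not in endpoint_types
        )
        if not keep:
            continue
        network.setdefault(edge.source, set()).add(edge.target)
        network.setdefault(edge.target, set()).add(edge.source)
    return network


# Placeholder data for illustration only.
molecule_types = {
    "GENE_A": "enzyme",
    "GENE_B": "transporter",
    "GENE_C": "kinase",
    "CHEM_X": "exogenous chemical",
}
interactions = [
    Interaction("GENE_A", "GENE_B", "direct", ("ref1",), True),
    Interaction("GENE_B", "GENE_C", "indirect", ("ref2", "ref3"), True),
    Interaction("GENE_A", "GENE_C", "direct", (), True),          # no reference
    Interaction("GENE_A", "CHEM_X", "direct", ("ref4",), True),   # exogenous chemical
    Interaction("GENE_C", "GENE_A", "indirect", ("ref5",), False),  # not verified
]
network = build_network(interactions, molecule_types)
```

In this toy input only the first two edges survive, so `GENE_A` and `GENE_C` end up connected solely through `GENE_B`.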
We found eight of our a priori assigned gene products to be associated with renal or urological function (Supplementary Figure 5). Five of the SNPs were located within the assigned genes, and the three other SNPs, or SNPs in LD with them, were in close proximity (within 30 kb) to the assigned genes (Supplementary Figures 2c, 2g, 2j, 2k and 3b, 3c, 3d, 3e). To test the statistical significance of our identified renal-related molecular networks, the p-value was calculated with Fisher's Exact Test, which assesses the degree to which a pathway is over-represented relative to what is expected by chance by accounting for: 1) the number of Functions/Pathways/Lists Eligible molecules that participate in that annotation as defined by the molecules in the selected Reference set; 2) the total number of molecules in the selected Reference set known to be associated with that function; 3) the total number of Functions/Pathways/Lists Eligible molecules in the selected Reference set; and 4) the total number of molecules in the Reference Set.

Replication Cohorts acknowledgments

FIND study (AA)
This study was supported by grants U01DK57292, U01DK57329, U01DK057300, U01DK057298, U01DK057249, U01DK57295, U01DK070657, U01DK057303, U01DK070657, U01DK57304 and DK57292-05 from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) and, in part, by the Intramural Research Program of the NIDDK. This project has been funded in whole or in part with federal funds from the National Cancer Institute, National Institutes of Health (NIH), under contract N01-CO-12400 and the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.
This work was also supported by the National Center for Research Resources for the General Clinical Research Center grants: Case Western Reserve University, M01-RR-000080; Wake Forest University, M01-RR-07122; Harbor-University of California, Los Angeles Medical Center, M01-RR-00425; College of Medicine, University of California, Irvine, M01-RR-00827–29; University of New Mexico, HSC M01-RR-00997; and Frederic C. Bartter, M01-RR-01346. Genotyping was performed by the Center for Inherited Disease Research, which is fully funded through a federal contract from the NIH to Johns Hopkins University (N01-HG-65403). The results of this analysis were obtained using the S.A.G.E. package of genetic epidemiology software, which is supported by a U.S. Public Health Service Resource Grant (RR03655) from the National Center for Research Resources.

Wake Forest T2D-ESKD Study
NIH R01 DK066358 (DWB), R01 DK053591 (DWB), R01 HL56266 (BIF), R01 DK070941 (BIF), General Clinical Research Center of Wake Forest University School of Medicine M01 RR07122.

Wake Forest non-diabetic ESKD Study
NIH R01 DK 070941 (BIF) and R01 DK53591 (DWB), and by the NIDDK and NCI Intramural Research Programs. MAB was supported by F32 DK080617 from the NIDDK.

CKDGen
See funding sources of the CKDGen Cohorts in the manuscript with PMID 25493955. CAB's work was supported by the Else Kröner-Fresenius-Stiftung.

FinnDiane
We thank all the FinnDiane researchers, as well as the physicians and nurses at each center participating in the recruitment of the patients and in the collection of samples and data (Sandholm et al., PLoS Genet 2012; Thorn LM et al., Diabetes Care 2005).
The FinnDiane study was supported by grants from the Folkhälsan Research Foundation, Liv och Hälsa Foundation, the Wilhelm and Else Stockmann Foundation, Helsinki University Central Hospital Research Funds (EVO), the Finnish Cultural Foundation, the Signe and Ane Gyllenberg Foundation, Finnish Medical Society (Finska läkaresällskapet), Academy of Finland, Novo Nordisk Foundation and Tekes.

AASK
U01 DK048689, M01 RR-00071

Supplementary Table 1. African American top hit region candidate gene function

SNP to Gene Assignment (location or distance to gene) | Gene name, function and associations
rs12057968; C1orf100; Intronic (ADSS-23kb) | Chromosome 1 Open Reading Frame 100: Protein coding with unknown function. Expressed in kidney. Adenylosuccinate synthetase: nearby gene with overlapping SNPs in LD with C1orf100. ADSS catalyzes the first committed step in the conversion of inosine monophosphate to adenosine monophosphate. May affect energy metabolism. Associated with schizophrenia. Expressed in kidney.
rs73690944; BBS9; Intronic | Bardet-Biedl Syndrome 9: protein coding gene associated with Bardet-Biedl syndrome, which includes renal malformations and CKD. The exact function of this gene has not yet been determined. Thought to function as a coat complex required for sorting of specific membrane proteins to the primary cilia. Also involved in parathyroid hormone action in bones.