Functional Genomic Annotation of Genetic Risk Loci Highlights Inflammation and Epithelial Biology Networks in CKD
BASIC RESEARCH www.jasn.org

Nora Ledo, Yi-An Ko, Ae-Seo Deok Park, Hyun-Mi Kang, Sang-Youb Han, Peter Choi, and Katalin Susztak

Renal Electrolyte and Hypertension Division, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania

ABSTRACT

Genome-wide association studies (GWASs) have identified multiple loci associated with the risk of CKD. Almost all risk variants are localized to the noncoding region of the genome; therefore, the role of these variants in CKD development is largely unknown. We hypothesized that polymorphisms alter transcription factor binding, thereby influencing the expression of nearby genes. Here, we examined the regulation of transcripts in the vicinity of CKD-associated polymorphisms in control and diseased human kidney samples and used systems biology approaches to identify potentially causal genes for prioritization. We interrogated the expression and regulation of 226 transcripts in the vicinity of 44 single nucleotide polymorphisms using RNA sequencing and gene expression arrays from 95 microdissected control and diseased tubule samples and 51 glomerular samples. Gene expression analysis from 41 tubule samples served for external validation. 92 transcripts in the tubule compartment and 34 transcripts in glomeruli showed statistically significant correlation with eGFR. Many novel genes, including ACSM2A/2B, FAM47E, and PLXDC1, were identified. We observed that the expression of multiple genes in the vicinity of any single CKD risk allele correlated with renal function, potentially indicating that genetic variants influence multiple transcripts. Network analysis of GFR-correlating transcripts highlighted two major clusters: a positive correlation with epithelial and vascular functions and an inverse correlation with an inflammatory gene cluster.
In summary, our functional genomics analysis highlighted novel genes and critical pathways associated with kidney function for future analysis.

J Am Soc Nephrol 26: 692–714, 2015. doi: 10.1681/ASN.2014010028

Received January 8, 2014. Accepted July 8, 2014.
Published online ahead of print. Publication date available at www.jasn.org.
Present address: Dr. Sang-Youb Han, Department of Internal Medicine, Inje University, Ilsan-Paik Hospital, Goyang, Gyeonggi, South Korea.
Correspondence: Dr. Katalin Susztak, Perelman School of Medicine, University of Pennsylvania, 415 Curie Boulevard, 415 Clinical Research Building, Philadelphia, PA 19104. Email: [email protected]
Copyright © 2015 by the American Society of Nephrology
ISSN : 1046-6673/2603-692

Twenty million people suffer from CKD and ESRD in the United States. The risk of death significantly increases as kidney function (GFR) declines and it can be as high as 20% for patients with diabetes on hemodialysis.1 Diseases of the kidney and urinary tract are ranked 12th in the mortality charts (www.cdc.org), indicating their importance in public health.

CKD is a typical gene environmental disease. Several environmental factors play important roles in CKD development; diabetes and hypertension are the two most important causes, accounting for close to 75% of ESRD cases. In addition, CKD has a clear genetic component, because <20% of patients with diabetes or hypertension will actually develop kidney disease.

At present, one of the most powerful experiments to understand the genetics of a complex trait such as CKD is the genome-wide association study (GWAS).2 These studies compare genetic variants in two groups of participants: people with the disease (patients) and similar people without the disease (controls). If a variant (single nucleotide polymorphism [SNP]) is more frequent in people with the disease, the SNP is said to be associated with the disease. GWASs, however, have several limitations. First, GWASs became possible, because the genetic information is inherited in fairly large blocks. Therefore, we do not have to test the association with each of the close to 20 million genetic variations but can use fewer (about 1 million) SNPs representing the genetic variation of larger genetic regions (called haplotype or linkage disequilibrium block). Although haplotype blocks made GWAS convenient and financially feasible, they also mean that we do not know which of the many variants within a single haplotype block is functionally relevant.

Furthermore, >83% of the disease-associated SNPs are localized to the noncoding region of the genome3; therefore, it is unclear how they induce illness. Recent reports from the Encyclopedia of DNA Elements (ENCODE) project indicate that most complex trait polymorphisms are localized to gene regulatory regions in target cell types.4 Disease-associated genetic variants can alter binding sites for important transcription factors and influence the expression of nearby genes.3,5–7 Genetic variants can potentially alter steady-state expression of genes, in which case they interfere with basal transcription factor binding or can alter the amplitude of transcript changes after signal-dependent transcription factor binding.

Here, we hypothesized that polymorphisms associated with renal disease will influence the expression of nearby transcript levels in the kidney. We used genomics and systems biology approaches to investigate tissue-specific expression of transcripts and their correlation with kidney function.

RESULTS

CKD Risk-Associated Transcripts

By manual literature search, we identified all GWASs reporting genetic association for CKD-related traits (Supplemental Table 1). Many of these studies, however, used different parameters as kidney disease indicators. We included SNPs associated with eGFR (on the basis of serum creatinine or cystatin C calculations) or the presence of ESRD. Our analysis identified 10 publications meeting these criteria.8–25 Most publications did not differentiate cases on the basis of disease etiology and included cases with hypertensive and diabetic kidney disease. Coding polymorphisms and SNPs that did not reach genome-wide significance (P>5×10⁻⁸) were excluded.26 Finally, 44 leading SNPs meeting all of these criteria were used for further analysis (Supplemental Table 2). Three SNPs associated only with diabetic CKD development were also analyzed separately; all other SNPs were from studies including both diabetic and nondiabetic cases. There were only two SNPs that reached genome-wide significance in multiple

Reports from the ENCODE project indicate that the majority (70%–80%) of the gene regulatory elements (promoters, enhancers, and insulators) are within 250 kb of the gene.3 Using these criteria, we identified 306 genes within 500 kb of 44 CKD SNPs. There was no gene within the 500-kb window around the rs12437854 SNP; therefore, 43 loci were followed. We called these transcripts CKD risk-associated transcripts (CRATs).

CRATs Are Enriched for Kidney-Specific Expression

We hypothesized that cells that express CRATs play an important role in controlling kidney function. Therefore, we determined expression levels of all CRATs in control (normal) human kidney samples (n=2) using comprehensive RNA sequencing analysis. We found that 41% of the CKD risk loci-associated transcripts showed high expression (upper quartile) and that only 6% of CRAT transcripts were not detectable in human kidney tubule samples (Supplemental Figure 1). Overall, we found that a large percentage of the CKD SNP neighboring transcripts (94%; 287 of 306) were expressed in the human kidney, indicating statistically significant kidney-specific enrichment compared with 44 randomly selected loci, where only 13% of the transcripts showed high expression and 16% of the nearby transcripts were not expressed in the kidney (P=1.25×10⁻⁹).

Gene ontology analysis (david.abcc.ncifcrf.gov) to understand the tissue specificity of CRATs indicated specific and significant enrichment in the kidney and peripheral leukocytes (P value=0.0082 and P value=0.0014, respectively). Next, we compared absolute expression levels of CRATs by RNA sequencing in 16 different human organs using the Illumina Body Map database (www.ebi.ac.uk). The atlas confirmed the statistically significant kidney-specific expression enrichment of CRATs (Supplemental Figure 2). For example, the atlas highlighted the high and kidney-specific expression of Uromodulin (UMOD). In summary, expression of CRATs was enriched in the kidney and peripheral lymphocytes, potentially indicating the role of these cells in kidney disease development.

CRAT Expression in Normal and Diseased Human Kidney Glomerular Samples

We hypothesized that functionally important CRATs are not only expressed in relevant cell types (kidney and leukocytes) but that their expression level will change in CKD. To test this hypothesis, we analyzed gene expression levels in a large collection of microdissected human glomerular (n=51) and tubule (n=95) samples. Kidney samples were obtained from a diverse population (Supplemental Tables 3 and 4). Statistical analysis failed to detect ethnicity-driven gene expression differences (data not shown).

Transcript profiling was performed for each individual sample using Affymetrix U133v2 arrays. The data were processed using established pipelines, and they contained probe set identifications for 226 transcripts from 306 original CRATs. We analyzed the expression levels of 226 CRATs in 51 microdissected