BASIC RESEARCH www.jasn.org

Integrated Functional Genomic Analysis Enables Annotation of Kidney Genome-Wide Association Study Loci

Karsten B. Sieber,1 Anna Batorsky,2 Kyle Siebenthall,2 Kelly L. Hudkins,3 Jeff D. Vierstra,2 Shawn Sullivan,4 Aakash Sur,4,5 Michelle McNulty,6 Richard Sandstrom,2 Alex Reynolds,2 Daniel Bates,2 Morgan Diegel,2 Douglass Dunn,2 Jemma Nelson,2 Michael Buckley,2 Rajinder Kaul,2 Matthew G. Sampson,6 Jonathan Himmelfarb,7,8 Charles E. Alpers,3,8 Dawn Waterworth,1 and Shreeram Akilesh3,8

Due to the number of contributing authors, the affiliations are listed at the end of this article.

ABSTRACT

Background Linking genetic risk loci identified by genome-wide association studies (GWAS) to their causal genes remains a major challenge. Disease-associated genetic variants are concentrated in regions containing regulatory DNA elements, such as promoters and enhancers. Although researchers have previously published DNA maps of these regulatory regions for kidney tubule cells and glomerular endothelial cells, maps for podocytes and mesangial cells have not been available.

Methods We generated regulatory DNA maps (DNase-seq) and paired gene expression profiles (RNA-seq) from primary outgrowth cultures of human glomeruli that were composed mainly of podocytes and mesangial cells. We generated similar datasets from renal cortex cultures, to compare with those of the glomerular cultures. Because regulatory DNA elements can act on target genes across large genomic distances, we also generated a chromatin conformation map from freshly isolated human glomeruli.

Results We identified thousands of unique regulatory DNA elements, many located close to transcription factor genes, which the glomerular and cortex samples expressed at different levels. We found that genetic variants associated with kidney diseases (GWAS) and kidney expression quantitative trait loci were enriched in regulatory DNA regions. By combining GWAS, epigenomic, and chromatin conformation data, we functionally annotated 46 kidney disease genes.

Conclusions We demonstrate a powerful approach to functionally connect kidney disease-/trait–associated loci to their target genes by leveraging unique regulatory DNA maps and integrated epigenomic and genetic analysis. This process can be applied to other kidney cell types and will enhance our understanding of genome regulation and its effects on gene expression in kidney disease.

J Am Soc Nephrol 30: ccc–ccc, 2019. doi: https://doi.org/10.1681/ASN.2018030309

Received March 23, 2018. Accepted December 26, 2018.
Published online ahead of print. Publication date available at www.jasn.org.
Correspondence: Dr. Shreeram Akilesh, Department of Anatomic Pathology, University of Washington, Box 356100, 1959 NE Pacific Street, Seattle, WA 98195. Email: [email protected]
Copyright © 2019 by the American Society of Nephrology
ISSN: 1046-6673/3003-ccc

Significance Statement
The absence of high-resolution epigenomic maps of key kidney cell types has hampered understanding of kidney-specific genome regulation in health and disease. Kidney-associated genetic variants, identified in genome-wide association studies, are concentrated in accessible chromatin regions containing regulatory DNA elements. The authors describe the generation and initial characterization of paired DNA maps of these regulatory regions and gene expression profiles of cells from primary human glomerular and cortex cultures. By integrating analyses of genetic and epigenomic data with genome-wide chromatin conformation data generated from freshly isolated human glomeruli, they physically and functionally connected 42 kidney genetic loci to 46 potential target genes. Applying this approach to other kidney cell types is expected to enhance understanding of genome regulation and its effects on gene expression in kidney disease.

In the past 15 years, large-scale genome-wide association studies (GWAS) have successfully identified genetic variants associated with a wide variety of measurable traits and human diseases, including those related to kidney function.1–3 A minority of GWAS variants localize to protein-coding sequences (exemplified by APOL1; ref. 4); most lie in nonprotein-coding genomic sequences, which compose >98% of the genome.5 Regulatory DNA elements, which encompass promoters, enhancers, and insulators, are small segments of the genome where DNA-binding proteins recognize specific DNA sequence features, leading to the recruitment of histone-modifying complexes, displacement of nucleosomes, and opening of nuclear chromatin.6 Active regulatory DNA elements are often located in nonprotein-coding sequences7,8 and appear to functionally and physically associate with their target gene promoters over large genomic distances, often skipping intervening genes.8 Furthermore, GWAS variants frequently localize to regulatory DNA elements.5,9–14 Because the chromatin accessibility of some regulatory DNA elements can be very cell type–specific,7,9,15–17 one mechanism by which genetic variants can contribute to disease risk is by altering the regulation of gene expression in a cell type–specific manner (Supplemental Figure 1).18 Delineating the cell type–specific gene regulatory networks for multiple important kidney cell types will therefore be paramount to dissecting kidney disease mechanisms.

Enzymatic reporters, such as deoxyribonuclease I (DNase I)19,20 and the Tn5 transposase,21 when combined with next-generation sequencing methods, efficiently identify regions of open chromatin that are associated with regulatory DNA elements. However, these methodologies have been applied to only a few kidney cell types, and chromatin accessibility maps are lacking for many important kidney cell types such as podocytes, mesangial cells, distal tubule cells, peritubular microvascular endothelial cells, pericytes, and resident immune cells (e.g., macrophages and dendritic cells). To begin to approach this problem, we report the generation of high-resolution chromatin accessibility maps (DNase-sequencing [DNase-seq]) and paired gene expression profiles (RNA-sequencing [RNA-seq]) of primary cultures of human glomerular outgrowth cells (mixed cultures of podocytes and mesangial cells). To enable comparative analysis of cell type–specific features, we also generated datasets from primary human renal cortical cultures treated in similar fashion. To understand the basis of long-range interactions between regulatory DNA elements and their target genes, we generated a chromatin conformation (Hi-C) map from freshly isolated human glomeruli. We then integrated these diverse measures of genome function to connect kidney GWAS loci to their potential target genes, thereby gaining new insights into the genome regulatory mechanisms that underlie kidney phenotypes and disease.

METHODS

Generation and Characterization of Primary Glomerular and Cortex Cultures
Kidney tissues were obtained from patients undergoing radical nephrectomy for renal tumors with informed consent for DNA sequencing obtained before surgery. Patient characteristics and resulting dataset features are listed in Supplemental Table 1. The study and consent forms were approved by the University of Washington’s Institutional Review Board (protocol #1297). Approximately 1 cm³ portions of uninvolved kidney cortex (from the pole furthest from the tumor mass) were harvested and transported in RPMI medium on ice. These tissues were then minced with a sterilized razor blade and the fragments were placed in 20 ml of prewarmed RPMI medium (without serum) supplemented with Accutase (diluted 1:10; Sigma), collagenase P (100 mg/ml; Roche), and trypsin/EDTA (0.25% solution diluted 1:10; Gibco). The tissue fragments were digested at 37°C for 20 minutes with vigorous agitation. For glomerular core isolation, the softened tissue fragments were mashed through a #60-gauge sterilized steel mesh using the bottom of a sterilized glass beaker. This treatment stripped the glomerular cores of their Bowman’s capsules. The isolated glomerular cores passed through the mesh and were collected on a #140-gauge steel mesh placed below. Tubules were disrupted sufficiently that they did not collect on the #140-gauge steel mesh. The isolated glomerular cores were washed extensively with sterile room temperature PBS and were then transferred into a tissue culture flask with prewarmed culture medium (RPMI supplemented with 10% FBS and insulin-transferrin-selenite+ [ITS+] supplement; Corning). For the primary culture of cortical cells, after digestion of a separate cortical tissue fragment (as described above), the pieces were spun down and macerated in a petri dish using a sterile plunger from a 5 ml syringe. These softened tissue fragments were then transferred into tissue culture flasks with prewarmed medium containing 10% FBS and ITS+ as described above. After 2–3 days (for cortex cultures) and 10–14 days (for glomerular outgrowth cultures), the tissue fragments were decanted and the adherent cells were fed with fresh medium. At this stage, primary cortex cells grew rapidly and had an epithelioid morphology, whereas primary glomerular outgrowth colonies grew more variably with a mixture of epithelioid and spindled cells. Glomerular outgrowth cells were used at first passage for all experiments (typically approximately 2–3 weeks after initial isolation). Cortex outgrowth cells were subcultured at 1:4 when they reached 80% confluence, and used within two passages for all experiments.


Immunofluorescence Staining of Primary Cultures
Disaggregated single-cell suspensions in growth medium were applied to sterilized glass coverslips placed in a six-well plate and incubated overnight to allow cells to fully adhere. The following day, the cells were washed three times with cold PBS and then fixed with 2% paraformaldehyde with 4% sucrose for 10 minutes at room temperature. The fixative was then removed and cells were washed once with PBS. Cells were permeabilized with 0.3% Triton X-100 in PBS for 10 minutes at room temperature and then washed three times with PBS. Primary antibodies were incubated for 1 hour at room temperature followed by three washes with PBS. Primary antibodies reported in this study recognized podocin (P0372 rabbit polyclonal; Sigma-Aldrich), WT1 (sc-192 rabbit polyclonal; Santa Cruz Biotechnology), synaptopodin (10r-s125a mouse monoclonal; Fitzgerald Industries International), PAX2 (716000 rabbit polyclonal; Invitrogen), claudin-1 (ab15098 rabbit polyclonal; Abcam), and smooth muscle actin (A5228 mouse monoclonal; Sigma-Aldrich). Species-specific fluorescence-labeled secondary antibodies were then applied for 1 hour at room temperature, followed by three washes with PBS (Alexa Fluor 488 anti-rabbit [A11070] or Alexa Fluor 594 anti-mouse [A11020]; both from Invitrogen). After counterstaining with DAPI (10 mg/ml) and mounting, cells were imaged with a widefield Nikon TI-E inverted epifluorescence microscope.

Processing of Cell Cultures for DNase-Seq
For each primary culture, cells were subjected to DNase I treatment, small DNA fragment isolation, and library construction per published Encyclopedia of DNA Elements (ENCODE) protocols or a modified protocol adapted for low DNA input.22,23 Libraries were subjected to paired-end (2×36 bp) sequencing. Most datasets (Supplemental Table 1) used in this study were deemed of high quality (signal portion of tags >0.4).7

DNase-Seq Data Processing
Sequence reads from our DNase-seq libraries were subjected to a uniform data-processing pipeline, which was used previously for ENCODE DNase-seq datasets.7 Briefly, read pairs passing quality filters were trimmed of adapter sequences and aligned to the reference human genome (GRCh38/hg38) using BWA.24 Genomic regions with a significant enrichment of DNase I cleavages were identified using the hotspot algorithm,7 and were further refined to fixed-width, 150-bp regions (“peaks”) containing the highest cleavage density (referred to as DNase I–hypersensitive sites; DHS). Hotspot and peak calling were performed using full-depth data.

Processing of Cell Cultures for RNA-Seq
We washed 500,000–1,000,000 cells of primary cortex or glomerular cultures once in PBS and stabilized them at 4°C in RNALater (Ambion). Total RNA was extracted using a mirVana RNA isolation kit (Ambion). Illumina sequencer-compatible libraries were constructed using a TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero Gold (Illumina) and subjected to paired-end (2×76 bp) sequencing.

RNA-Seq Data
RNA-seq libraries were aligned to the reference human genome (GRCh38/hg38) using TopHat 2.0.13,25 and assigned to transcript models using RNA-STAR 2.3.1.26

Processing of Glomeruli for Hi-C
Glomeruli were mechanically isolated as described before from a portion of uninvolved kidney cortex obtained from a 74-year-old woman undergoing nephrectomy for renal cell carcinoma. The glomeruli were fixed for 20 minutes in 10 ml of a 1:10 dilution of 10% neutral buffered formalin (Fisher) with end-over-end tumbling. At the end of this period, 0.1 g glycine (Sigma) was added directly to the tube to quench the fixation reaction. The cell pellet was delivered to Phase Genomics, Inc. (Seattle, WA) for Hi-C library preparation using a Phase Genomics Human Hi-C Kit. A total of 481,795,427 2×150 bp read pairs were sequenced from the resulting library on an Illumina HiSeq 4000. Topology-associated domain (TAD) calling was performed using the DomainCaller algorithm at 50 kbp resolution with default parameters.27 Chromosomal contacts (binned to 10 kb resolution) were generated using the Phase Genomics Matlock tool (https://github.com/phasegenomics/matlock).

Availability of Data
All datasets produced for this study are available on the Gene Expression Omnibus as a SuperSeries GSE115961 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE115961). Also see Supplemental Table 1 for individual dataset accession numbers and sequencing depth.

Data Exploration and Visualization
For many of the computational analyses, R (v3.3.3) (https://www.R-project.org/) was used to format, analyze, and visualize the data. To annotate genes and transcripts, BioMart (v2.30.0)28,29 and Bioconductor were used. Genomic coordinates were manipulated using GenomicRanges (v1.26.4).30 For visualization, gplots (v3.0.1) (https://CRAN.R-project.org/package=gplots) and ggplot2 (v2.2.1) (https://ggplot2.tidyverse.org/) were used. For data manipulation, packages within the tidyverse (v1.1.1) (https://CRAN.R-project.org/package=tidyverse) were used, including tidyr (v0.6.3), plyr (v1.8.4), dplyr (v0.7.1), purrr (v0.2.2.2), and tibble (v1.3.3).

Differential Gene Expression Analysis
Differential gene expression was calculated using DESeq2 (v1.14.1).31 All reported significant differential gene expression results are on the basis of the DESeq2 multiple testing–corrected P values ≤0.01 and >1.5 log2 fold change in each respective culture. To filter out nonexpressed genes and ensure that the genes were expressed in at least one sample type, the differential genes were filtered to require that the median fragments per kilobase of transcript per million mapped reads (FPKM) expression in either the glomerular or cortex cultures was >1 FPKM. Using the glomerular and tubular differentially expressed genes, enrichment analysis for biologic processes was performed using clusterProfiler (v3.2.14)32 with a P value cutoff of 0.05 and a Bonferroni-corrected q value cutoff of 0.01. Differential gene expression was visualized in heatmaps generated using heatmap.2 from gplots (for row normalized data) or pheatmap (for raw expression and row normalized data).

DHS Master List Generation and Differential DHS Analysis
Master lists were generated with a bedops -u command as detailed on the BEDOPS website (https://bedops.readthedocs.io/en/latest/).33,34 We created DHS master lists for our glomerular and cortex DHS data, for ENCODE tubule datasets (RPTEC, HRE, and HRCE), and for ENCODE fetal kidney data. In most analyses, to improve the stringency of peak calling, we filtered the glomerular and cortex culture master list by only including peaks with a minimum cut count of 50 and those that were present in at least three out of the seven total sample datasets. We used a similar approach to generate the master list of 72 fetal kidney DNase-seq datasets by including peaks with a minimum cut count of 50 in at least five samples. This resulted in master DHS lists of comparable size between our datasets (335,161 DHS) and the fetal kidney datasets (353,676 DHS). We utilized the DESeq2 software package31 in R to identify DHS with significant differences in accessibility between replicate glomerular and cortex culture samples, analyzing each patient separately. Sites that met a false discovery rate (FDR) threshold of 1% were considered differential DHS. The distance of differential DHS to the nearest gene and the associated biologic ontologies were computed using the Stanford GREAT analysis tool.35

Transcription Factor Motif Enrichment
Transcription factor motif models were curated from TRANSFAC v.11,36 JASPAR,37 and a SELEX-derived collection.38 Instances of transcription factor recognition sequences in the human genome were identified by scanning the genome with these motif models using the FIMO tool39 from the MEME Suite v.4.6,40 with a fifth order Markov model generated from the 36 bp mappable genome used as the background model. Instances with a FIMO P<10⁻⁴ were retained and used for subsequent analyses. Families of transcription factors with highly similar recognition motifs were grouped into clusters.41 Motif enrichments were then calculated by using a custom Python script to count the number of DHS that contained an instance of any motif model within the cluster of highly similar motif models. For a given analysis, these counts were compared between a query set of DHS (e.g., those with increased accessibility in glomerular cultures) and a background set (e.g., all other DHS), and significance was determined using the hypergeometric distribution with Bonferroni correction of P values.

Correlation of Differentially Accessible DHS with Differentially Expressed Genes
We calculated the number of differential DHS within a given window from the transcription start site (TSS) of differentially expressed genes with valid Entrez and HGNC identifiers in each respective culture. Next, we calculated the number of differential DHS near genes that were constitutively expressed (i.e., not significantly changing) in both glomerular and cortex cultures. The constitutively expressed genes were defined as those with >1 median FPKM in both glomerular and tubular cultures, adjusted P values >0.05, and expression in the top four quintiles of DESeq2 variance stabilized gene expression values. To test for the enrichment of differential DHS near differentially expressed genes, the differentially accessible DHS in cortex and glomerular cultures within ±20 kb of the TSS of differentially expressed genes were counted and compared with the distribution for constitutively expressed genes. The Spearman correlation was calculated on the values of the log2FoldChange and number of differential DHS.

Testing for Enrichment of Kidney GWAS Variants in Glomerular and Tubular DHS
The GWAS catalog42 was downloaded on August 24, 2017 (Supplemental Table 2), and filtered to retain all kidney traits and to remove sex chromosome SNPs; this returned 636 entries. SNPs with the same position were then removed, keeping the SNP with the most significant P value, reducing the list to 477 entries (Supplemental Table 3). To collapse SNPs in linkage disequilibrium, we generated probabilistic identification of causal SNPs credible sets43 for each SNP and estimated colocalization with the credible sets using PICCOLO.44 We removed GWAS catalog SNPs with the higher P value if the two credible sets had an H4 posterior probability of colocalization ≥0.80 (see Supplemental Table 4). This resulted in 430 kidney GWAS catalog loci remaining, of which 200 achieved a genome-wide significance P value ≤5×10⁻⁸ (Supplemental Table 5). The proportions of the GWAS SNPs that overlapped kidney master list DHS within the specified windows (i.e., padded intervals around the SNP) were then calculated. In addition, to determine that single SNPs were not outliers, we built a distribution for the kidney GWAS SNPs using a leave-one-out analysis. The kidney GWAS SNPs were split into 200 groups, with each group leaving out one kidney GWAS SNP. The proportions of the GWAS SNPs that overlapped kidney master list DHS within the specified windows were then calculated for the 200 groups. The kidney proportions were compared with a randomized distribution of 10,000 sets of 200 SNPs matched with the SNPsnap45 tool that has been previously used by CKDGen.1 SNPsnap matched the kidney GWAS SNPs on minor allele frequency, linkage disequilibrium, distance to nearest gene, and local gene density. FDR P values were calculated on the basis of this bootstrapped background distribution.
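The comparison of GWAS SNP overlap against matched background sets can be illustrated with a minimal sketch. This is not the pipeline used in this study: the function names, the toy coordinates, and the add-one empirical P value are assumptions made for illustration, and in a real analysis the null proportions would come from SNPsnap-matched SNP sets rather than a hand-written list. DHS are assumed to be non-overlapping, half-open intervals [start, end), and each SNP is padded by a fixed window on both sides.

```python
import bisect


def build_index(dhs_by_chrom):
    """Sort DHS per chromosome; keep the start coordinates for bisect lookups.

    Assumes the DHS on each chromosome do not overlap one another.
    """
    index = {}
    for chrom, intervals in dhs_by_chrom.items():
        ordered = sorted(intervals)
        index[chrom] = (ordered, [start for start, _ in ordered])
    return index


def snp_in_dhs(index, chrom, pos, pad=0):
    """True if the window [pos - pad, pos + pad] overlaps any DHS [start, end)."""
    if chrom not in index:
        return False
    ordered, starts = index[chrom]
    # Last DHS whose start lies at or before the right edge of the window.
    i = bisect.bisect_right(starts, pos + pad) - 1
    return i >= 0 and ordered[i][1] > pos - pad


def overlap_proportion(index, snps, pad=0):
    """Fraction of (chrom, pos) SNPs whose padded window overlaps a DHS."""
    hits = sum(snp_in_dhs(index, chrom, pos, pad) for chrom, pos in snps)
    return hits / len(snps)


def empirical_p(observed, null_values):
    """One-sided empirical P value with an add-one correction (an assumption)."""
    at_least = sum(value >= observed for value in null_values)
    return (at_least + 1) / (len(null_values) + 1)


# Hypothetical toy data, for illustration only.
toy_dhs = build_index({"chr1": [(100, 250), (1000, 1150)], "chr2": [(500, 650)]})
toy_snps = [("chr1", 120), ("chr1", 990), ("chr2", 700), ("chr3", 5)]
observed = overlap_proportion(toy_dhs, toy_snps, pad=50)
# Proportions from matched random SNP sets would replace this placeholder list.
p_value = empirical_p(observed, [0.1, 0.2, 0.5, 0.6])
```

The bisect lookup keeps each query logarithmic in the number of DHS per chromosome, which matters when 10,000 background sets are scored against a master list of several hundred thousand elements.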


Enrichment of Kidney Expression Quantitative Trait Loci SNPs in Kidney DHS
We selected all expression quantitative trait loci (eQTL) SNPs (P value <1×10⁻⁴) associated with eGenes (i.e., genes whose expression is associated with an eQTL SNP with a q value <0.05), from Genotype-Tissue Expression project (GTEx v7; ref. 46) and NephQTL47 tissues. These SNPs were then tested for overlap with the kidney master list DHS. For each tissue, the proportion of all eGene eQTL SNPs (eSNPs) in kidney DHS relative to the total number of eSNPs within the tissue was calculated. With the number of tissue-specific eSNPs ranging from 43,751 to 2,144,827, we had increased power to detect small changes in the proportion of eSNPs found in kidney DHS, resulting in a significant difference between tissue proportions using a test of equal proportions (chi-squared test=5861.5, degrees of freedom=47, P value <2.2×10⁻¹⁶). For the GTEx samples, these proportions were normally distributed with a mean proportion of 0.0174 with SD 0.0015 (Shapiro test P=0.2207). To test if there was enrichment of the proportion of eSNPs within kidney DHS for a given tissue, we compared its proportion with the distribution of proportions across the GTEx samples and tested for significance with a one-sided Z test. In addition to the kidney master list DHS, we evaluated for enrichment in the subset of DHS with greater accessibility in either glomeruli or cortex samples.

Integrated Functional Genomic Mapping of GWAS SNPs to Putative Target Genes
For these analyses, we utilized all unique kidney disease/trait SNPs (Supplemental Table 5) because even SNPs below the genome-wide significance threshold (P<5×10⁻⁸) can contain genetic signals that contribute to cell type–specific regulation,1 especially when combined with epigenomic data.5,48 We linked unique autosomal GWAS SNP positions to DHS with differential accessibility within a ±20 kb window centered on the SNP. Each differential DHS was then linked to all potential differentially expressed genes within ±500 kb. For this analysis, differentially expressed genes lacking gene identifiers were removed. Using the 10 kb windows encompassing the SNP position and the putative target gene TSS as “baits,” we tallied long-range chromatin contacts identified by Hi-C in human glomeruli. We visually confirmed interaction or lack thereof between all GWAS SNP–target gene pairs. Lastly, we tested whether the genomic range between each GWAS SNP and its target gene TSS was wholly contained within TADs identified in human glomeruli using the BEDOPS33 command bedops -e 100%.

LocusZoom Plots
Genetic association and regional linkage disequilibrium plots were rendered using the LocusZoom49 webtool (locuszoom.org), using the default 1000 Genomes EUR database (November 2014) for linkage data. Genome-wide association summary statistics for serum creatinine were obtained from the CKDGen website (https://fox.nhlbi.nih.gov/CKDGen/)1 and summary statistics for circulating parathyroid hormone levels were obtained directly from the authors of a previous study.50

RESULTS

Generation and Characterization of Primary Glomerular and Cortex Cultures
Glomeruli were isolated by mechanical sieving from patients undergoing nephrectomy for renal tumors (Figure 1A). The isolated glomerular cores were grown in tissue culture dishes until attached and primary outgrowth cultures were established. Genome-wide chromatin accessibility profiling with DNase-seq and gene expression profiling with RNA-seq were performed in parallel on these primary glomerular outgrowth cultures. Similar datasets were also generated from primary cultures of normal human renal cortex51–53 (three patients) grown under similar conditions in the presence of 10% FBS (Supplemental Table 1). In primary culture, the cells growing out of the attached glomerular cores showed a mixed epithelioid and spindled morphology characteristic of podocytes and mesangial cells, respectively (Figure 1B). These cultures exhibited patchy staining for the podocyte marker WT1, compared with moderate and more uniform staining for podocin (Figure 1C). The parietal epithelial cell markers PAX2 and claudin-1 were consistently identified by immunofluorescence staining in the glomerular cultures (Figure 1C). Although the expression of these markers may indicate the presence of parietal epithelial cells within our glomerular cultures, mechanical isolation almost uniformly stripped the Bowman’s capsule away from the isolated glomeruli, and therefore these markers were most likely being expressed on cultured and proliferating podocytes, as has been reported by others.54–58 Very few cells (<1%) demonstrated staining for smooth muscle actin (data not shown) despite this gene (ACTA2) being strongly expressed in the primary glomerular cultures (see below). Consistent with the idea that high exogenous VEGFA supplementation is required to support the growth of primary endothelial cells, staining for the endothelial marker CD31 was not detected in any of our cultures (data not shown). These immunofluorescence findings demonstrated that the primary glomerular outgrowths consisted of mixed cultures of podocytes and mesangial cells.

Gene Expression Signatures of Glomerular Outgrowth and Cortex Cultures
Next, we statistically compared the global gene expression profiles of the glomerular and cortex cultures using DESeq2 (ref. 31) (Supplemental Table 6). This identified 507 differentially expressed genes with higher levels in the glomerular cultures and 257 genes with higher expression in the cortex cultures. Unsupervised clustering of these differentially expressed genes clearly demarcated the two types of primary cultures (Figure 1D). Examination of the ontologies linked to genes with higher expression in the glomerular cultures versus the cortex cultures revealed marked enrichment for genes associated with extracellular matrix organization and for renal/urogenital system development (Supplemental Figure 2).
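As a rough illustration of the filtering behind these gene counts, the thresholds stated in the Methods (multiple testing–corrected P value of at most 0.01, more than 1.5 log2 fold change, and median FPKM above 1 in at least one culture type) can be sketched as follows. The function name, its arguments, and the sign convention (positive log2 fold change meaning higher expression in glomerular cultures) are assumptions for illustration; the study itself used DESeq2 in R, and this sketch only mimics the post hoc filter, not the statistical model.

```python
from statistics import median


def classify_gene(padj, log2fc, glom_fpkm, cortex_fpkm,
                  padj_max=0.01, lfc_min=1.5, fpkm_min=1.0):
    """Label a gene as higher in 'glomerular' or 'cortex' cultures, or None.

    padj        -- multiple testing-corrected P value (None if not available)
    log2fc      -- log2 fold change; positive = higher in glomerular (assumed)
    glom_fpkm   -- FPKM values across glomerular culture samples
    cortex_fpkm -- FPKM values across cortex culture samples
    """
    # Require statistical significance.
    if padj is None or padj > padj_max:
        return None
    # Require expression (median FPKM > 1) in at least one culture type.
    if max(median(glom_fpkm), median(cortex_fpkm)) <= fpkm_min:
        return None
    # Require a sufficiently large effect in either direction.
    if log2fc > lfc_min:
        return "glomerular"
    if log2fc < -lfc_min:
        return "cortex"
    return None


# Hypothetical values for a single gene, for illustration only.
label = classify_gene(0.001, 2.3, [12.0, 9.0, 15.0, 11.0], [0.2, 0.1, 0.4])
```

Applying the expression filter after the significance test, as sketched here, mirrors the order described in the Methods; filtering before testing would change the multiple testing burden and is a different design choice.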


[Figure 1 image not reproduced. Recoverable panel labels: (A) Workflow schematic: nephrectomy (for RCC), sieving to isolate glomeruli, and genome-scale datasets (RNA-seq and DNase-seq) from primary glomerular cultures (4 patients) and primary cortex cultures (3 patients). (B) Micrograph of glomerular cores; scale bar, 200 µm. (C) Immunofluorescence panels for WT1, podocin, PAX2, and claudin-1; scale bar, 20 µm. (D) Heatmap (z-score scale) of differentially expressed genes across patient samples: n=507 higher in glomerular cultures, n=257 higher in cortex cultures. (E) Heatmap (FPKM scale, 0 to ≥10) of marker genes labeled by cell type (podocyte, mesangial cell, endothelial cell, tubule): WT1, BMP7, SYNPO, ARHGAP24, RAP1GAP, TLR4, PODXL, PTPRO, TRPC6, NPHS1, NPHS2, ACTA2, PDGFRB, MEIS1, TBX18, DES, CD34, VWF, KDR, FLT1, FLT4, ANPEP, GGT1, MME, GDA, KL, BHMT, AQP1, AQP7, SLC5A2, SLC34A1, UMOD.]

Figure 1. Primary human glomerular cell outgrowth cultures and primary human cortex cultures express podocyte/mesangial and tubular markers respectively. (A) Schematic of samples and datasets generated for this study. (B) Phase contrast micrograph of adherent human glomeruli with outgrowth of epithelioid (podocytes) and spindled cells (mesangial cells) after 14 days in culture.


After this global analysis of gene expression, we queried our data for known cell type–specific marker genes. First, we examined the expression of lineage-defining genes in the gene expression data. The podocyte-selective genes WT1,59 BMP7,60 SYNPO,61 ARHGAP24,62 and RAP1GAP63 were highly expressed in all four primary glomerular outgrowth cultures, but were not expressed in primary cortex cultures (Figure 1E). Interestingly, although podocin gene (NPHS2) transcripts were undetectable by RNA-seq, podocin protein expression was seen by immunofluorescence microscopy (Figure 1C), which is consistent with the known long t1/2 of podocin protein.64 Genes encoding the mesangial cell markers ACTA2 (smooth muscle actin), PDGFRB, and MEIS1 were also expressed in the primary glomerular cultures; TBX18 (ref. 65) and DES were not. TLR4 is normally expressed in both podocytes and mesangial cells,66,67 and this gene showed strong expression in the primary glomerular cultures. Endothelial cell–associated marker genes (CD34, VWF, KDR, FLT1, and FLT4)68 were not detected in any of the cell types examined. Tubule-associated genes (ANPEP, GGT1, MME, GDA, KL, and BHMT)69,70 showed stronger and more consistent expression in the primary cortex culture datasets, although other known tubule markers (AQP1, AQP7, SLC5A2, and SLC34A1) were not detected. Undetectable uromodulin gene expression is consistent with minimal contamination by distal nephron components in the primary cortex cultures (Figure 1D). Minimal expression of some podocyte (e.g., SYNPO) and mesangial cell genes (e.g., ACTA2) in the cortical tubule cultures may reflect low-level contamination by glomerular outgrowths in these primary cultures, although care was taken to reduce glomerular attachment by early and frequent media exchanges.

At the time of the study, the cellular input for DNase-seq required millions of cells, necessitating culture and expansion of the primary cell populations described in this study. Culture of glomerular outgrowth cells is a long-established method,71 typically performed in the presence of serum, and has been used to generate conditionally immortalized human podocyte,72 mesangial,73 and glomerular endothelial cell lines.74 However, this two-dimensional culture method leads to downregulation of key podocyte markers,75 and consistent with this, the primary glomerular outgrowth cultures did not express PODXL, NPHS1, NPHS2, or TRPC6.76 The expression of mesangial markers and a mixed epithelioid and spindled morphology is consistent with the presence of mesangial cells in the glomerular outgrowth cultures. The absence of glomerular endothelial cells is not surprising given that these cells require the presence of VEGF in the growth medium for ex vivo survival and expansion.74 Taken together, these analyses of phenotype and the gene expression data confirmed that the primary glomerular outgrowth cultures comprise a mixed population of podocytes and mesangial cells with few, if any, endothelial cells. Furthermore, the expression signatures of podocytes (embedded within the whole glomerular cell gene expression data) are consistent with an injured/proliferative podocyte phenotype.

In contrast to glomerular outgrowths, primary culture of cortical tubular epithelial cells is usually performed under serum-free growth conditions.77–79 Exposure of primary cortical tubular epithelial cells to serum stimulates rapid growth,80 but results in the downregulation of functional markers and may promote fibroblast outgrowth.81 In this study, we cultured the cortex samples under identical growth conditions to the glomerular outgrowth cultures primarily to remove the effect of serum as an analytic confounder. Cognizant of this limitation of cell preparation, hereafter we refer to these as “primary cortex cultures” to reflect their injured tubular epithelial cell phenotype and potentially complex cellular composition.

Examination of Chromatin Accessibility and Gene Expression Profiles at WT1 and PDGFRB Gene Loci
The RNA-seq data showed significantly higher expression of WT1 and PDGFRB in the glomerular cultures than the primary cortex cultures (Figure 2, A and B, top panels). Examination of the chromatin accessibility patterns around these two genes revealed numerous regulatory elements, which appear as DHS (Figure 2, A and B, bottom panels, shaded bars). The patterns of accessibility of these DHS are consistent both across the independently derived samples and within cell types. Although many of these DHS cluster around the TSSs, consistent with a role in promoter function, many others are located tens of kilobases away (Figure 2, red arrows) and represent putative distal regulatory elements.

Global Characterization of Chromatin Accessibility Profiles
Next, we analyzed the chromatin accessibility landscapes of the glomerular and cortex cultures genome-wide to compare them with published DHS data already available in ENCODE, to characterize their positioning with respect to chromatin landmarks, and to identify DHS that were significantly differentially accessible in the glomerular and cortex cultures. Before this study, the only kidney-focused DHS maps available through ENCODE were for fetal kidney (primarily from the second trimester) and from cultured proximal tubules (RPTEC), whole renal epithelial cells (HRE), or renal cortical epithelial cells (HRCE). Because we report the first glomerular DHS maps, we compared their profiles to ENCODE fetal kidney DHS data as the closest approximation to previously published data. Combining all of the unique DHS detected in our primary glomerular cultures produced 402,684 elements that showed substantial overlap (51.6%) with fetal kidney DHS maps

(C) Immunofluorescence staining for indicated proteins (green) in human glomerular cell outgrowth cultures. Nuclei are counterstained with DAPI (blue). (D) Unsupervised clustering and row-normalized expression heatmap of all differentially expressed genes in glomerular and cortex samples. (E) Heatmap of raw FPKM expression values of selected lineage-defining genes in RNA-seq datasets.


[Figure 2 panels: (A) WT1, chromosome 11; (B) PDGFRB, chromosome 5. Tracks: gene expression (RNA-seq) and chromatin accessibility (DNase-seq) for primary glomerular cultures (4 patients) and primary cortex cultures (3 patients); promoter position marked; scale bar, 20 kb. FPKM per sample (WT1/PDGFRB): glomerular cultures 9.13/15.98, 16.51/12.99, 12.70/7.81, 10.46/12.52; cortex cultures 0.22/0.06, 1.28/0.41, 1.50/0.72.]

Figure 2. Correlating gene expression (RNA-seq) with chromatin accessibility (DNase-seq) reveals patterns of genome regulation for two key glomerular genes, (A) WT1 and (B) PDGFRB. Each horizontal “track” represents either RNA-seq (top panel) or DNase-seq (bottom panel) data for each of four primary glomerular outgrowth cultures and three primary cortex cultures. The RNA-seq tracks and DNase-seq tracks are shown for the same genomic region. For reference, the position of the TSS of the WT1 and PDGFRB genes is indicated by the vertical black dashed line. For RNA-seq tracks, increased gene expression is visualized by greater transcript tag density over the exons of the predicted gene model (exons are taller vertical bars, direction of transcription is indicated by arrowheads). The quantified RNA expression level (FPKM) for the gene in that sample is shown to the left of each track. For DNase-seq tracks, increased chromatin accessibility is visualized by greater sequence tag density, resulting in a “peak.” Numerous DHS show significantly differential accessibility between glomerular cells and tubular cells; these are highlighted by the vertical shaded bars. There is dense differential DHS activity around the TSS of both genes, consistent with regulation of gene promoters. This correlates with elevated expression of both genes in the glomerular cells. However, several DHS located several kilobases from the promoter region may also be contributing to gene regulation (red arrows), consistent with their role as distal regulatory DNA elements. RNA-seq and DNase-seq tracks are shown at the same vertical scale for both panels and for all samples.
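To make the expression contrast in Figure 2 concrete, the following is a minimal sketch that turns the per-sample FPKM values printed beside the RNA-seq tracks into a log2 ratio of group means. It assumes that the left value of each pair belongs to WT1 and the right to PDGFRB, and the helper name `log2_ratio_of_means` is hypothetical. It is an illustration only and not the differential expression procedure used in the study.

```python
import math
from statistics import mean

# FPKM values read from the Figure 2 track labels.
# Assumption: in each printed pair, the first value is WT1 and the second is PDGFRB.
fpkm = {
    "WT1": {
        "glomerular": [9.13, 16.51, 12.70, 10.46],
        "cortex": [0.22, 1.28, 1.50],
    },
    "PDGFRB": {
        "glomerular": [15.98, 12.99, 7.81, 12.52],
        "cortex": [0.06, 0.41, 0.72],
    },
}


def log2_ratio_of_means(glomerular, cortex):
    """Return log2(mean glomerular FPKM / mean cortex FPKM)."""
    return math.log2(mean(glomerular) / mean(cortex))


for gene, groups in fpkm.items():
    ratio = log2_ratio_of_means(groups["glomerular"], groups["cortex"])
    print(f"{gene}: log2(glomerular/cortex) = {ratio:.2f}")
```

A positive value indicates higher mean expression in the glomerular cultures; a formal analysis would use a count-based statistical model rather than a ratio of FPKM means.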

(Supplemental Figure 3A). However, many DHS identified in our glomerular DNase-seq data were not present in the fetal kidney dataset. This may be because of adult–fetal differences and a more glomerular cell–enriched composition in our samples. Similarly, comparison of our cortex culture DHS profiles to ENCODE tubule datasets (RPTEC, HRE, and HRCE) revealed a large overlap (57.7%), suggesting that a large portion of the tubule regulatory landscape is captured by our data (Supplemental Figure 3B) despite differences in culture conditions (as described above). Overall, the glomerular and cortex culture DHS maps generated in this study contained 176,462 distinct DHS that were not present in the previously published ENCODE kidney datasets.

Next, to further increase the stringency of our analysis, we filtered the glomerular and cortex culture DHS list by only including high-confidence DHS with a minimum cut count of 50 and those that were present in at least three out of the seven total datasets. Using these criteria, 335,161 DHS were present in either the glomerular or cortex cultures (i.e., a master list of all DHS), a far greater number than the number of regulatory elements identified by chromatin immunoprecipitation sequencing for histone modifications (H3K4Me3, H3K27Me3) or CTCF binding (Figure 3, A and B). In contrast to H3K4Me3, which is associated with promoters,82 the majority of DHS are located >5 kb from known TSS of genes, consistent with their role as distal regulatory elements.7

Of the master list DHS, 296,041 (88.3%) represented regulatory elements with stable chromatin accessibility between the glomerular and tubular samples. 21,059 (6.3%) of the master list DHS showed significantly greater accessibility in the glomerular cultures versus the cortex cultures (FDR<0.01). Conversely, 18,061 (5.4%) of the master list DHS showed significantly greater accessibility in the cortex cultures than the glomerular cultures. We calculated the distance of these differentially accessible DHS to the closest gene TSS. Similar to the global DHS distribution (Figure 3, A and B), very few (<10%) of these differentially accessible DHS were located within 5 kb of gene TSS (Figure 3C). Examination of the genes located closest to the DHS with greater accessibility in the glomerular cultures uncovered enrichment for ontologies associated with glomerular function (e.g., kidney mesenchyme development, glomerulus development, etc.) (Figure 3D). Conversely, examination of the genes located closest to the differential DHS with greater accessibility in the cortex cultures uncovered enrichment for ontologies associated with tubular function (e.g., nephron epithelium development, renal tubule development, etc.). Taken together, these analyses support the idea that the differentially accessible DHS drive the differences in gene expression programs between the primary glomerular and cortex cell cultures.

Linking Differentially Accessible DHS to Differentially Expressed Genes: A Role for Transcription Factors

Because the regulation of gene expression may be influenced by multiple distal regulatory elements, we enumerated differentially accessible DHS in windows around the TSS of differentially expressed genes. We found that the number of differential DHS located near differentially expressed genes was significantly correlated with the magnitude of the differential gene expression (Spearman correlation=0.45, P=4.2×10⁻⁵³) (Figure 3E). Examination of the top ten glomerular-expressed genes with the greatest number of nearby differentially accessible DHS uncovered genes with known roles in glomerular cell biology (e.g., WT1,83 MEIS2,84,85 BMP7,60,86 and FOXC287,88) and other genes with unknown or poorly characterized roles (e.g., TENM3 and FOXL1) (Figure 3E). At least five of these genes are transcription factors that can exert broad influence on gene expression programs by regulating the expression of numerous target genes.89 Similarly, of the top ten cortex-expressed genes with nearby differentially accessible DHS, four (POU3F3,90–92 SIM2,93 SIM1,93,94 and TFCP2L195) are transcription factor genes. DHS represent sites where transcription factors bind to sequence-specific DNA recognition motifs.89 Therefore, we asked if the DNA recognition motifs of particular families of transcription factors were enriched in DHS showing different patterns of accessibility in the glomerular and cortex cultures (Figure 3F). DHS with greater accessibility in glomerular cultures were enriched for motifs of forkhead, GATA, TEAD, and the SOX family of transcription factors. Conversely, DHS with greater accessibility in cortex cultures were enriched for motifs of CTF, E-box, NF-I, and HNF transcription factor families. These studies showed that transcription factor genes with cell type–specific function are at the apex of regulatory control in our DHS maps. Furthermore, the differentially accessible DHS encode distinct patterns of transcription factor binding motif enrichment between the glomerular and cortex cultures.

Kidney Disease Risk Loci Are Enriched in Glomerular and Tubular DHS and eQTL

GWAS have revolutionized the identification of genetic loci that associate with disease and other complex traits. However, most SNPs associated with traits identified by these studies are in nonprotein-coding DNA sequence (i.e., introns and intergenic regions). Given that these loci are nonprotein-coding, it has been difficult to assign a target gene to the genetic signal that is responsible for the risk allele’s phenotype. It is now becoming clear that many of these genetic risk variants are enriched in genomic regions containing regulatory elements that control gene expression programs which can influence disease.5 This has also been recently demonstrated for kidney disease traits by the CKDGen consortium.1 However, those analyses did not test for enrichment in fetal kidney DHS data and did not include chromatin accessibility data from primary glomerular outgrowth cells, which we report here. Therefore, we revisited the question to ask if loci associated with kidney disease and related traits were enriched in our newly generated regulatory DNA maps and in the fetal kidney DHS data available through ENCODE. We examined expanding windows surrounding genome-wide significant loci (Supplemental Table 5) for glomerular or tubular DHS from our master DHS list or from ENCODE fetal kidney DHS. We compared these to bootstrapped distributions (10,000 iterations) of randomized SNPs using the SNPsnap tool45 to match SNPs on the basis of the allele frequency, number of SNPs in linkage disequilibrium, distance to the nearest gene, and gene density of SNPs associated with kidney disease/traits. Compared with the background distribution, the kidney disease/trait SNPs were significantly more likely than the matched random SNPs to have a master list DHS or fetal kidney DHS in their vicinity, regardless of the window surrounding the SNP (Figure 4A; FDR≤0.001 for all padded window sizes tested). For example, 37.5% (75 out of 200) and 32% (64 out of 200) of genome-wide significant kidney disease-/trait-associated SNPs were located within ±2 kb of a master list and fetal kidney DHS, respectively. Next, we asked if our

Figure 3. Differentially accessible DHS are enriched for distal location and cell type–specific gene ontologies and are located near transcription factors with changing gene expression. (A) The number and (B) the percentage of DHS or ChIP-seq peaks from ENCODE human renal epithelial (HRE) cells (HRE H3K4Me3, 51,755; HRE H3K27Me3, 69,185; HRE CTCF, 64,563; DHS master list, 335,161) are shown as a function of distance from gene TSSs (0 to 5, 5 to 50, 50 to 500, 500 to 1000, and >1000 kb). (C) DHS with greater accessibility in glomerular cells (maroon) or cortex cells (blue) are shown as a function of distance to the nearest gene TSS. The majority of differentially accessible DHS in either cell type are located >5 kb from the nearest gene’s TSS. (D) The nearest genes to the differentially accessible DHS were annotated for gene ontology enrichment using the Stanford GREAT tool. Results are shown for the renal ontologies with the highest enrichment, ranked by −log10 binomial P value.

>1000 -log10 binomialp-value Cortex Glomeruli mScNephrol Soc Am J –log10(P value) ’ S.Temjrt fdif- of majority The TSS. s 10 0 20 40 60 80 100 binomial 30: ccc P au.(E) value. – ccc ,2019 www.jasn.org BASIC RESEARCH datasets were identifying novel loci compared with the fetal 20 kb of DHS with differential accessibility between glomer- kidney DHS data. For this, we pooled all kidney disease/trait ular and cortex cells. Then we asked if any protein-coding GWAS SNPs identified at various window sizes in Figure 4A genes 6500 kb from these differential DHS also exhibited and asked how many of these were unique to either our data significant changes in gene expression between glomerular versus the ENCODE fetal kidney data, or jointly identified by and cortex cells. This identified 57 unique SNP positions as- both (i.e., shared). Although some kidney disease/trait GWAS sociating with 64 unique potential target genes resulting in SNPs are jointly identified, many are uniquely identified by 73 unique SNP–target gene interaction pairs (Supplemental DHS seen in our data (Figure 4B). This analysis again demon- Table 7). This analysis also underscored the complexity of strated that kidney disease-/trait-associated SNPs are enriched correlating differentially accessible regulatory elements to dif- in regulatory DNA elements, but also showed that the DHS ferentially expressed genes. Some GWAS loci (e.g., rs12826808) maps reported in this study identify distinct genetic loci. could potentially affect multiple target genes (e.g., MGP, Next, we sought to compare our regulatory DNA maps with ERP27, and RERG). Conversely, some genes (e.g., SFTA2 and the complementary method of eQTL analysis. In thisapproach, VARS2) may be regulated by DHS near more than one GWAS genetic variants are correlated with tissue-specific gene expres- locus (rs3130544 and rs9263871). 
sion profiles from hundreds to thousands of individuals.96–99 Although our target gene search interval (6500 kb) ap- Most variants marked by eQTL (eSNPs) act in cis with their pears large, the majority of differentially accessible DHS in potential target genes (eGenes), consistent with their role as our data were located between 5 and 500 kb from gene TSSs tissue-specific regulatory DNA elements.100 The GTEx con- (Figure 3C), and regulatory elements such as enhancers are sortium has generated dense eQTL maps for a wide range of well known to physically associate and act over large distances human tissues,46 but maps of kidney tissues and cells had not on their target genes (Supplemental Figure 1).103 However, to been generated until recently.47,101,102 When we compared the validate physical associations between GWAS SNPs and their proportion of eSNPs in kidney eQTL and GTEx data that over- potential target genes, we generated genome-wide chromatin lapped with our master DHS list, we found that the proportion conformation (Hi-C) data from freshly isolated human glo- of overlap was highest for the glomerular and tubulointersti- meruli. In this method, long-range genomic interactions in tial eQTL samples (Figure 4C). Both the glomerular and tu- cells can be mapped by stabilizing physical contacts with gentle bulointerstitial eSNP overlap proportions were significantly fixation.104 In the glomerular Hi-C data, we visualized long- higher than the mean overlap proportion of GTEx tissues range chromatin interactions originating from either the 28 (Ptubulointerstitial=1.61310 , Pglomeruli=0.001), indicating GWAS SNP or putative target gene TSS. All 73 SNP–target that glomerular and tubulointerstitial eSNPs were significantly gene pairs had such interaction data available (list of chromatin enriched in kidney DHS master list. 
We also found that glomer- contacts provided as a University of California, Santa Cruz ular eSNPs were significantly enriched in DHS with increased Genome Browser interact track file in Supplemental Table 8). 2 accessibility in glomeruli (P=6.59310 5), and tubulointerstitial Physical association between the GWAS SNP and the target eSNPs were significantly enriched in DHS with increased acces- gene TSS was visually confirmed for 52 SNP–target gene 2 sibility in tubules (P=2.27310 16). These findings demonstrate interaction pairs (Supplemental Figure 4, Supplemental that our chromatin accessibility maps for glomerular and Table 7, and Table 1). Using this integrated approach, in cortex cultures both corroborate and complement recently total, we assigned 42 unique GWAS loci to 46 putative target published kidney eQTL data. genes. Rather than being randomly oriented, the linear ge- nome is organized into compact, three-dimensional units Genome Conformation Links GWAS SNPs to Their termed TADs.27 Although gene regulatory interactions Putative Target Genes may occur across TAD boundaries, they are much less fre- Thus far, we have demonstrated that the differentially acces- quent, thus defining TADs as a compartmentalized func- sible DHS cluster in proximity to differentially expressed genes tional unit of chromatin.105 We computed TADs from our (Figure 3E), and that kidney GWAS loci and eQTL are glomerular Hi-C data (Supplemental Table 9) and found enriched in the accessible chromatin regions of primary glo- that the majority (90%, 47 out of 52) of the physically inter- merular and cortical cultures (Figure 4). Next, we sought to acting GWAS SNP–target gene pairs were contained within combine these orthogonal views of genome function to link TADs. This provides further evidence for the localization of GWAS loci to their target genes. 
To do this, we first identified SNP–target gene interactions within functional chromatin kidney disease-/trait-associated SNPs that were located within compartments.
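The two-step search described in this section (a SNP within 20 kb of a differentially accessible DHS, then differentially expressed genes within ±500 kb of that DHS) and the TAD containment check can be sketched as follows. This is a minimal illustration under stated assumptions: the record types, function names, and all coordinates are hypothetical, and distances to genes are measured from the DHS boundaries to the gene TSS, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    rsid: str
    chrom: str
    pos: int


@dataclass(frozen=True)
class Dhs:
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int
    differentially_expressed: bool


def distance_to_interval(pos, start, end):
    """Distance in bp from a position to an interval (0 if inside it)."""
    if pos < start:
        return start - pos
    if pos > end:
        return pos - end
    return 0


def link_snps_to_genes(snps, differential_dhs, genes,
                       snp_window=20_000, gene_window=500_000):
    """Pair each SNP with differentially expressed genes reachable via a nearby differential DHS."""
    pairs = set()
    for snp in snps:
        for dhs in differential_dhs:
            if dhs.chrom != snp.chrom:
                continue
            if distance_to_interval(snp.pos, dhs.start, dhs.end) > snp_window:
                continue
            for gene in genes:
                if (gene.chrom == dhs.chrom
                        and gene.differentially_expressed
                        and distance_to_interval(gene.tss, dhs.start, dhs.end) <= gene_window):
                    pairs.add((snp.rsid, gene.name))
    return sorted(pairs)


def in_same_tad(chrom, pos_a, pos_b, tads):
    """True if both positions fall inside one TAD, given (chrom, start, end) tuples."""
    return any(c == chrom and s <= pos_a < e and s <= pos_b < e for c, s, e in tads)


# Toy data with made-up identifiers and coordinates.
snps = [Snp("rsA", "chr1", 100_000), Snp("rsB", "chr1", 50_000)]
dhs_list = [Dhs("chr1", 110_000, 110_500)]
genes = [
    Gene("GENE1", "chr1", 400_000, True),   # differentially expressed, in range
    Gene("GENE2", "chr1", 700_000, True),   # differentially expressed, too far
    Gene("GENE3", "chr1", 150_000, False),  # in range, not differentially expressed
]
tads = [("chr1", 0, 500_000), ("chr1", 500_000, 900_000)]

pairs = link_snps_to_genes(snps, dhs_list, genes)
print(pairs)
print(in_same_tad("chr1", 100_000, 400_000, tads))
```

In the study itself, candidate pairs from this kind of search were then checked against the glomerular Hi-C contacts; that validation step is not modeled here.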

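(E) Correlation of the magnitude of differential gene expression versus the number of differentially accessible DHS. For each gene that is expressed significantly higher in glomerular cells or cortex cells, the number of DHS with greater accessibility in glomerular cells (maroon dot) and those with greater accessibility in cortex cells (blue dot) is shown. Therefore, each gene is represented by two dots on the horizontal axis. Genes with greater expression in glomeruli tend to have greater numbers of DHS with higher accessibility in glomeruli (and vice versa for genes expressed at a higher level in cortex cultures). This enrichment is significant: Spearman correlation=0.45, P=4.2×10⁻⁵³. (F) Enrichment of transcription factor DNA recognition motifs in constitutive (left column) and differentially accessible DHS (middle and right columns) between the two sample types. The Bonferroni-corrected P values of the enrichments using a hypergeometric distribution are shown. ChIP-seq, chromatin immunoprecipitation sequencing.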
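The two statistics named in this caption, a Spearman rank correlation and a Bonferroni-corrected hypergeometric enrichment P value, can be sketched with the Python standard library alone. The function names and the toy numbers are hypothetical, and the exact background sets and test counts used in the study are not given in this section, so this is an illustration of the techniques rather than a reproduction of the analysis.

```python
from math import comb, sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items from `population` containing `successes` hits."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


def bonferroni(p_value, n_tests):
    """Bonferroni correction, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy inputs: differential DHS counts near genes and their log2 fold changes.
dhs_counts = [1, 3, 4, 8, 12]
log2_fold_changes = [0.5, 1.1, 0.9, 2.4, 3.0]
print(round(spearman(dhs_counts, log2_fold_changes), 2))

# Toy motif test: 40 of 200 differential DHS carry a motif found in 100 of 1,000 DHS overall.
p_raw = hypergeom_upper_tail(40, 1000, 100, 200)
print(bonferroni(p_raw, n_tests=12))
```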


[Figure 4 panels: (A) Proportion of padded SNPs that overlap with a master list DHS (this study, left; ENCODE fetal kidney, right) versus genomic padding applied to the SNP (±0, 2, 5, 10, 20, and 50 kb), for kidney genome-wide significant SNPs (n=200) and SNPSnap-matched background SNPs; FDR<0.001. (B) Number of GWAS SNPs uniquely overlapping ENCODE or this study’s DHS at each padding (unique to fetal kidney DHS, unique to DHS from this study, or shared). (C) Proportion of eSNPs that overlap with a master list DHS for kidney eQTL (Gillies et al.; glomeruli and tubulointerstitium) and GTEx tissues, with the mean proportion of GTEx tissues (±2 SD, shaded).]

Figure 4. Comparison of this study’s DHS data to ENCODE and eQTL datasets reveals enrichment in kidney GWAS and eQTL. (A) Kidney GWAS SNPs achieving genome-wide significance (P<5×10⁻⁸) are enriched in the chromatin accessibility sites of glomerular and cortex cells. The proportion of kidney GWAS SNPs that overlap an element in the master list comprising DHS from both glomerular and cortex cells (left panel) or ENCODE fetal kidney datasets (right panel) is shown as a function of increasing padding around that SNP. This proportion is compared with a background distribution of SNPs matched for genomic context using the SNPSnap45 tool (10,000 iterations). The bounds of the boxes represent the 25th–75th percentile range, with the median as the central hatch within the box; the whiskers represent 1.5× the interquartile range. Regardless of the padding interval, kidney GWAS SNPs (red line) are more likely to be localized near a kidney DHS both in this study’s and the ENCODE fetal kidney datasets. For each comparison, FDR<0.001.


Integrated Functional Genomic Snapshots of Exemplar GWAS Loci

Finally, we applied our integrated functional genomic approach to examine our annotation of two GWAS loci to their respective target genes in greater detail. In this analysis, we found that in some instances, the GWAS SNP was predicted to regulate a distant gene, skipping the nearest genes. The SNPs rs219779 and rs219780 have recently been linked to circulating parathyroid hormone levels, kidney stones, and bone mineral density, presumably via their control of calcium metabolism within kidney tubules (Figure 5A).50,106 rs219779/rs219780 are in the last coding exon of the CLDN14 gene, and are approximately 500 bp from an intronic regulatory element with greater chromatin accessibility in tubular cultures compared with glomerular cultures (Figure 5B). The CLDN14 gene was not strongly expressed in either the glomerular or cortex cultures (FPKM<1). However, the nearest gene with strongly differential gene expression is SIM2 (8.0-fold higher expression in cortex cultures), which suggested that it may in fact be the regulatory target of the SNP-associated, differentially accessible DHS near CLDN14. A long-range physical interaction between rs219779/rs219780 and SIM2 was also supported by Hi-C data in human glomeruli, and this entire region was contained within a TAD (Figure 5C).

In another instance, an SNP associated with a complex and composite phenotype such as increased eGFR (rs84178)107 was found within 20 kb of a DHS with differentially higher accessibility in glomerular cultures compared with cortex cultures (Figure 6A). This DHS was located within an intron of the KCNQ1 gene that was not expressed in either the glomerular or cortex cultures (Figure 6B). Instead, CDKN1C was the only protein-coding gene that was differentially expressed within a 500 kb interval around rs84178. The CDKN1C gene product p57 is an important marker of podocyte maturation and function.108–112 Chromatin conformation data again supported a long-range interaction between the GWAS locus and the CDKN1C gene, both of which were localized within a TAD found in human glomeruli (Figure 6C).

DISCUSSION

Our data provide reference-quality genome-wide regulatory DNA maps for primary human glomerular cultures comprising podocytes and mesangial cells. Analysis of these data together with similar maps generated for primary cortex cultures revealed thousands of differentially accessible DHS genome-wide. Most of these (>90%) were located >5 kb from the TSSs of known genes and were associated with gene ontologies corresponding to the cell type from which they were derived. Many of these DHS were distinct from those previously available through the ENCODE consortium. Furthermore, the subset of DHS that were differentially accessible were enriched in the vicinity of transcription factors with critical lineage-specific functionality. We also found that kidney disease-/trait-associated genetic variants were enriched in our newly described regulatory maps, even though they comprised only a few kidney cell types. Finally, we show a powerful approach integrating genetic data (GWAS) with functional genomic readouts (chromatin immunoprecipitation sequencing, DNase-seq, and RNA-seq) and chromatin conformation (Hi-C) to assign genetic signals to putative target genes with greater confidence.

One limitation of our study is artifacts induced by expanding primary cells in two-dimensional culture and in the presence of serum. At the time of data production in this study, DNase-seq protocols113 typically required inputs of approximately 1,000,000 cells, which necessitated the use of mixed primary glomerular outgrowth cultures to achieve the required cell numbers. Advances in the protocol and library construction22,23 enabled reduction in cell input numbers even during the course of this study. Recently, an alternative chromatin accessibility profiling technique using Tn5-transposase “tagmentation” and sequencing (ATAC-seq) has been described.21 Although all chromatin accessibility profiling methods have their unique advantages and limitations,114 the ATAC-seq protocol does involve fewer steps than the DNase-seq protocol and can generate data from as few as 500–50,000 cells. Because the problem of high mitochondrial read contamination in ATAC-seq appears to have been recently solved,115 this method could now be combined with gentler psychrophilic dissociation strategies116 and affinity-based approaches to generate epigenomic profiles of rare cell populations derived from freshly dissociated single cell suspensions of kidney tissue. More facile epigenomic profiling of directly isolated cells would enable generation of datasets from many more individuals and would overcome the artifacts introduced by in vitro culture and expansion.

A second potential limitation is that our functional annotation strategy only examines genetic risk variants that lie

See Supplemental Table 5 for the list of SNPs used for this analysis. (B) Kidney GWAS SNPs overlap shared and unique DHS in this study’s and ENCODE fetal kidney DHS data. The distribution of kidney GWAS SNPs that are identified by either this study or ENCODE fetal kidney DHS is shown at each padding interval depicted in (A). Although many kidney GWAS SNPs overlap both this study’s DHS and the ENCODE fetal kidney DHS at various padding sizes, many are unique to each dataset. (C) Kidney DHS are enriched in kidney eQTL. The proportion of SNPs/variants (i.e., eSNPs) associated with eQTL from GTEx samples or a recently published dataset47 from isolated human glomeruli and tubulointerstitium is depicted. The red line indicates the mean proportion of overlapping eSNPs in the GTEx samples and the gray area indicates ±2 SD from the mean. Overlap with this study’s kidney DHS master list is significant for both glomerular eSNPs (P=0.01) and tubulointerstitial eSNPs (P=1.61×10⁻⁸).
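The padded-overlap comparison summarized in panel (A) can be sketched as below: each SNP is widened by a padding on both sides, the proportion of SNPs touching any DHS is computed, and that proportion is compared with proportions from matched background SNP sets. All names, coordinates, and background values here are hypothetical, and the `(exceed + 1) / (n + 1)` empirical P value is a common convention adopted as an assumption; the study reports FDR values and its exact computation is not described in this section.

```python
def snp_overlaps_dhs(chrom, pos, padding, dhs_by_chrom):
    """True if the window [pos - padding, pos + padding] touches any DHS on the chromosome."""
    low, high = pos - padding, pos + padding
    return any(start <= high and end >= low
               for start, end in dhs_by_chrom.get(chrom, []))


def overlap_proportion(snps, padding, dhs_by_chrom):
    """Fraction of (chrom, pos) SNPs whose padded window overlaps a DHS."""
    hits = sum(snp_overlaps_dhs(chrom, pos, padding, dhs_by_chrom)
               for chrom, pos in snps)
    return hits / len(snps)


def empirical_p_value(observed, background_proportions):
    """Share of background sets at least as extreme as the observed value (with +1 smoothing)."""
    exceed = sum(b >= observed for b in background_proportions)
    return (exceed + 1) / (len(background_proportions) + 1)


# Toy DHS intervals and SNP positions (made-up coordinates).
dhs_by_chrom = {"chr1": [(1_000, 1_200), (50_000, 50_300)]}
gwas_snps = [("chr1", 1_100), ("chr1", 3_000), ("chr1", 90_000), ("chr2", 500)]

for padding in (0, 2_000, 50_000):
    print(padding, overlap_proportion(gwas_snps, padding, dhs_by_chrom))

# Toy background: overlap proportions from four matched random SNP sets.
background = [0.0, 0.25, 0.25, 0.5]
print(empirical_p_value(0.5, background))
```

With 10,000 matched background sets, as in the study, the smallest attainable empirical P value under this convention would be about 1×10⁻⁴.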


Table 1. Kidney GWAS SNPs and their putative target genes inferred by integrated analysis of genetic, chromatin accessibility, gene expression, and chromatin conformation data

GWAS rs ID   SNP Chromosome  SNP Position  Target Gene  GLOM/CORTEX log2 Fold Change  Within Glomerular TAD?
rs2786111    Chr1            197336969     CFHR1         4.28                         Yes
rs2049805    Chr1            155225189     EFNA1        -2.09                         Yes
rs12568771   Chr1            17287314      MFAP2         2.58                         Yes
rs267738     Chr1            150968149     MLLT11        2.29                         Yes
rs2049805    Chr1            155225189     MUC1         -3.08                         Yes
rs2352039    Chr1            78354261      PTGFR        -2.73                         Yes
rs1800615    Chr1            15505786      TMEM51-AS1   -2.81                         No
rs84178      Chr11           2753144       CDKN1C        3.04                         Yes
rs11604462   Chr11           65784177      GAL3ST3       2.40                         Yes
rs3741414    Chr12           57450266      ARHGEF25      2.36                         Yes
rs3741414    Chr12           57450266      DTX3          1.74                         Yes
rs12826808   Chr12           15170446      ERP27         3.61                         Yes
rs7956634    Chr12           15168260      ERP27         3.61                         Yes
rs12826808   Chr12           15170446      MGP           4.55                         Yes
rs7956634    Chr12           15168260      MGP           4.55                         Yes
rs12826808   Chr12           15170446      RERG          3.78                         Yes
rs7956634    Chr12           15168260      RERG          3.78                         Yes
rs490049     Chr13           32990725      KL           -1.89                         Yes
rs12589674   Chr14           53779635      BMP4          3.23                         No
rs2071047    Chr14           53951693      BMP4          3.23                         No
rs8056893    Chr16           68270489      ESRP2        -2.07                         Yes
rs889472     Chr16           79612092      MAF           3.27                         Yes
rs344364     Chr16           1923515       SLC9A3R2      1.67                         Yes
rs9895232    Chr17           4615844       SMTNL2        4.55                         Yes
rs12975033   Chr19           48746186      CA11          1.97                         Yes
rs12975033   Chr19           48746186      FUT1         -1.52                         Yes
rs7562121    Chr2            99767892      AFF3          2.88                         Yes
rs7583877    Chr2            99844192      AFF3          2.88                         Yes
rs815815     Chr2            47171925      EPCAM        -2.67                         Yes
rs187355703  Chr2            176128855     HOXD1        -1.74                         Yes
rs13421350   Chr2            172453843     ITGA6        -1.74                         Yes
rs1260326    Chr2            27508073      KRTCAP3      -2.12                         Yes
rs35612822   Chr2            216505032     MREG         -1.98                         Yes
rs12105918   Chr2            144450626     ZEB2          1.94                         Yes
rs6127099    Chr20           54114863      CYP24A1      -2.95                         Yes
rs9977499    Chr21           27362678      ADAMTS5       2.49                         Yes
rs219780     Chr21           36461009      SIM2         -2.89                         Yes
rs9839909    Chr3            171790673     TNIK          1.51                         Yes
rs6816344    Chr4            72917687      ADAMTS3       2.09                         Yes
rs7674118    Chr4            165929313     CPE           2.01                         No
rs700236     Chr5            39367637      DAB2          2.29                         Yes
rs6910061    Chr6            11101685      GCNT2        -1.99                         No
rs9263871    Chr6            31202751      PSORS1C3     -2.17                         Yes
rs3130544    Chr6            31090563      SFTA2        -3.18                         Yes
rs9263871    Chr6            31202751      SFTA2        -3.18                         Yes
rs955333     Chr6            154626274     TIAM2         1.80                         Yes
rs9263871    Chr6            31202751      VARS2        -2.03                         Yes
rs1799884    Chr7            44189469      AEBP1         3.03                         Yes
rs10954650   Chr7            139641806     KLRG2         3.11                         Yes
rs6977660    Chr7            19765857      MACC1        -3.23                         Yes
rs10954650   Chr7            139641806     PARP12       -1.51                         Yes
rs7805747    Chr7            151710715     SMARCD3       2.04                         Yes

GLOM/CORTEX, ratio of RNA-seq gene expression in glomerular cultures versus cortex cultures; Chr, chromosome.
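Because the GLOM/CORTEX column is reported on a log2 scale, it can help to convert a few entries back to linear expression ratios. The following is a minimal sketch, not part of the authors' analysis; the helper names `log2fc_to_ratio` and `higher_in` are hypothetical, and the three values are positive entries copied from the table.

```python
def log2fc_to_ratio(log2_fold_change: float) -> float:
    """Convert a log2 fold change into a linear expression ratio."""
    return 2.0 ** log2_fold_change


def higher_in(log2_fold_change: float) -> str:
    """Name the culture with higher expression for a GLOM/CORTEX log2 ratio."""
    if log2_fold_change > 0:
        return "glomerular"
    if log2_fold_change < 0:
        return "cortex"
    return "neither"


# Positive GLOM/CORTEX log2 fold changes copied from the table.
examples = {"CFHR1": 4.28, "CDKN1C": 3.04, "TNIK": 1.51}

for gene, lfc in examples.items():
    ratio = log2fc_to_ratio(lfc)
    print(f"{gene}: log2FC={lfc:+.2f}, about {ratio:.1f}-fold, higher in {higher_in(lfc)} cultures")
```

A log2 fold change of 1 corresponds to a twofold difference, so the CFHR1 entry (4.28) reflects a roughly 19-fold higher expression in glomerular cultures.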

14 Journal of the American Society of Nephrology J Am Soc Nephrol 30: ccc–ccc,2019 www.jasn.org BASIC RESEARCH

[Figure 5 image. Panel A: regional association plot around rs219779 (−log10 GWAS p-value, r2, recombination rate in cM/Mb; 100 kb scale bar). Panel B: gene track (MORC3, CHAF1B, HLCS, CLDN14, SIM2) with CTCF and H3K4me3 ChIP-seq (HRE, input), RNA-seq and DNase-seq (Glom., Tubule), and significantly different DHSs. Panel C: glomerular Hi-C observed long-range interactions (~300 kb, ~280 kb) from the rs219779/rs219780 "bait" and the SIM2 promoter "bait", with the TAD.]

Figure 5. Integration of genetics and functional genomic data identifies SIM2 as a potential target gene of the rs219779/rs219780 locus linked to circulating parathyroid hormone and calcium levels. (A) A LocusZoom style plot of genetic association and regional linkage is shown for chr21: 36,341,453–36,871,451 (hg38). Variants with no linkage data are shown in gray. (B) Chromatin immunoprecipitation sequencing, RNA-seq, and DNase-seq tracks of the genomic region surrounding the rs219779/rs219780 locus are shown. The inset shows the location of a DHS located very close to rs219779 with greater accessibility in cortex cultures compared with the glomerular cultures. Previous studies have suggested that the nearest gene, CLDN14, plays a functional role in determining circulating parathyroid hormone levels and calcium levels; however, this gene is not expressed in either the primary cortex or glomerular cultures. In fact, the only gene with a significant change in gene expression in a 500 kb window around this locus is SIM2. Other DHS in this interval with higher accessibility in cortex cultures may also be contributing to SIM2 regulation. (C) Examination of long-range genomic contacts in human glomerular Hi-C data using either rs219779/rs219780 (orange arcs) or SIM2 gene TSS (purple arcs) as “baits.” This entire region lies within a TAD identified in human glomeruli (maroon bar).

within/near differentially accessible DHS and ignores variants that lie in nonchanging DHS. We acknowledge that this correlative association method also does not reveal whether the identified SNP causally influences DHS regulation of target gene function, whether it pleiotropically influences the measured phenotype by affecting gene expression outside our examination window, or if it is simply linked to a causal SNP that is not within a differentially accessible DHS.117 Even with these caveats, we have annotated 46 target genes that may be contributing to kidney disease/traits. Because, by definition, these genes exhibit a kidney cell type–specific pattern of expression (either in glomerular cells or cortex cells), it will be interesting to study their contribution to kidney disease in relevant model systems. To this end, we also use Hi-C104,118,119 as another layer of evidence to reveal the physical arrangement of GWAS loci and target genes in the three-dimensional environment of nuclear chromatin. A related method has been recently applied to study cultured glomerular endothelial cells (HRGEC) and kidney tubule cells (RPTEC).120 In the future, true functional validation of DHS-target gene interactions may be definitively assessed with CRISPR-modulation approaches,121,122 or perhaps with genome-editing of the SNP and/or DHS to test for the effects on target gene regulation, as has been done only for a handful of genetic variants identified for other human phenotypes.13,14,123
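The layered evidence described above (a GWAS SNP in a differentially accessible DHS, a differentially expressed gene nearby, and both lying in the same TAD) can be sketched as a simple interval filter. This is an illustrative sketch under stated assumptions, not the authors' pipeline: the function `annotate_snp`, the DHS, TAD, and TSS coordinates, the placeholder `GENE_X`, the ±250 kb reading of the "500 kb window", and the 1.5 log2 fold change cutoff are all hypothetical. Only the rs84178 position and the CDKN1C fold change come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # inclusive
    end: int    # exclusive

    def contains(self, chrom: str, pos: int) -> bool:
        return chrom == self.chrom and self.start <= pos < self.end


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int
    log2fc: float  # GLOM/CORTEX log2 fold change


def shared_tad(chrom: str, pos_a: int, pos_b: int, tads: List[Interval]) -> Optional[Interval]:
    """Return the TAD that contains both positions, if there is one."""
    for tad in tads:
        if tad.contains(chrom, pos_a) and tad.contains(chrom, pos_b):
            return tad
    return None


def annotate_snp(chrom: str, pos: int, diff_dhs: List[Interval], genes: List[Gene],
                 tads: List[Interval], half_window: int = 250_000,
                 min_abs_log2fc: float = 1.5) -> List[Tuple[str, float, bool]]:
    """List (gene, log2FC, same_TAD) for candidate targets of one SNP."""
    # Step 1: the SNP must fall in a differentially accessible DHS.
    if not any(dhs.contains(chrom, pos) for dhs in diff_dhs):
        return []
    hits = []
    for gene in genes:
        # Step 2: the gene TSS must lie inside the window around the SNP.
        if gene.chrom != chrom or abs(gene.tss - pos) > half_window:
            continue
        # Step 3: the gene must be differentially expressed (assumed cutoff).
        if abs(gene.log2fc) < min_abs_log2fc:
            continue
        # Step 4: record whether SNP and TSS share a TAD.
        hits.append((gene.name, gene.log2fc, shared_tad(chrom, pos, gene.tss, tads) is not None))
    return hits


# rs84178 position and CDKN1C log2FC are from the table; everything else is illustrative.
diff_dhs = [Interval("chr11", 2_753_000, 2_753_300)]   # hypothetical DHS
tads = [Interval("chr11", 2_700_000, 2_900_000)]       # hypothetical TAD
genes = [
    Gene("CDKN1C", "chr11", 2_885_000, 3.04),          # TSS is a placeholder
    Gene("GENE_X", "chr11", 2_700_000, 0.2),           # hypothetical unchanged gene
]

hits = annotate_snp("chr11", 2_753_144, diff_dhs, genes, tads)
print(hits)
```

Returning the TAD check as a flag rather than a filter mirrors the table, which keeps SNP-gene pairs that do not share a glomerular TAD and marks them "No".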


[Figure 6 image. Panel A: regional association plot around rs84178 (−log10 GWAS p-value, r2, recombination rate in cM/Mb; 100 kb scale bar). Panel B: gene track (KCNQ1, SLC22A18, KCNQ1-OT1, PHLDA2, CDKN1C) with CTCF and H3K4me3 ChIP-seq (HRE, input), RNA-seq and DNase-seq (Glom., Cortex), significantly different DHSs, and the TAD boundary. Panel C: glomerular Hi-C observed long-range interactions (~110 kb, ~170 kb) from the rs84178 "bait" and the CDKN1C promoter "bait", with TADs.]

Figure 6. Integration of genetics and functional genomic data identifies CDKN1C as a potential target gene of the rs84178 locus linked to elevated eGFR. (A) A LocusZoom style plot of genetic association and regional linkage is shown for chr11: 2,668,770–2,938,770 (hg38). Variants with no linkage data are shown in gray. (B) Chromatin immunoprecipitation sequencing, RNA-seq, and DNase-seq tracks of the genomic region surrounding the rs84178 locus are shown. The inset shows the location of a DHS overlapping rs84178 with greater accessibility in glomerular cultures compared with the cortex cultures. This DHS and rs84178 lie within an intron of the KCNQ1 gene, which is not expressed in either the primary tubular or glomerular cultures. The only protein-coding gene with a significant change in gene expression in a 500 kb window around this locus is CDKN1C, the protein product of which, p57, plays an important role in podocyte biology. Together with the other DHS in this interval showing higher chromatin accessibility in glomerular cells, this suggests that the mechanism by which rs84178 contributes to elevated eGFR is by regulation of CDKN1C in glomerular cells (likely in podocytes). (C) Examination of long-range genomic contacts in human glomerular Hi-C data using either rs84178 (orange arcs) or CDKN1C gene TSS (purple arcs) as “baits.” This span between rs84178 and the CDKN1C gene lies within a TAD identified in human glomeruli (maroon bar). Some long-range contacts from both rs84178 and the CDKN1C gene TSS appear to cross this TAD’s boundary (gray dashed line) into the neighboring TAD. Also note the sharp drop-off in genetic linkage across the TAD boundary in (A).
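The "bait" view in panel C amounts to reading one row of a binned Hi-C contact matrix and keeping the distant bins with enough contacts. The sketch below illustrates that idea only. The 50 kb bin size, the toy contact counts, the thresholds, and the helper names `bin_index` and `bait_contacts` are hypothetical; the region start and the rs84178 position are taken from the caption and the table. A real analysis would also normalize the counts, which this sketch omits.

```python
from typing import List, Tuple


def bin_index(pos: int, region_start: int, bin_size: int) -> int:
    """Map a genomic position to its bin within the plotted region."""
    return (pos - region_start) // bin_size


def bait_contacts(matrix: List[List[int]], bait_bin: int, min_count: int = 1,
                  min_bin_separation: int = 2) -> List[Tuple[int, int]]:
    """Return (partner_bin, count) pairs for long-range contacts of the bait bin.

    Bins closer than min_bin_separation are skipped, because contacts near
    the diagonal mostly reflect linear proximity.
    """
    row = matrix[bait_bin]
    return [
        (j, count)
        for j, count in enumerate(row)
        if abs(j - bait_bin) >= min_bin_separation and count >= min_count
    ]


REGION_START = 2_668_770   # start of the chr11 window in the caption (hg38)
BIN_SIZE = 50_000          # hypothetical resolution

# Toy symmetric contact counts for six bins (illustrative values only).
matrix = [
    [0, 9, 3, 1, 0, 0],
    [9, 0, 8, 2, 7, 1],
    [3, 8, 0, 6, 2, 0],
    [1, 2, 6, 0, 9, 2],
    [0, 7, 2, 9, 0, 8],
    [0, 1, 0, 2, 8, 0],
]

bait = bin_index(2_753_144, REGION_START, BIN_SIZE)   # rs84178
for partner, count in bait_contacts(matrix, bait, min_count=5):
    span_kb = abs(partner - bait) * BIN_SIZE // 1000
    print(f"bait bin {bait} contacts bin {partner}: {count} reads, ~{span_kb} kb apart")
```

Running the same extraction with the target gene TSS as the bait gives the reciprocal view, which is how the two sets of arcs in the figure relate to each other.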

Many kidney diseases encompass an interplay of kidney-intrinsic and kidney-extrinsic (e.g., cardiovascular) pathobiologic mechanisms. GWAS do not discriminate between these mechanisms and inherently cannot identify the relevant cell types that are contributing to the disease process. However, because GWAS-identified loci (even those that do not achieve genome-wide significance thresholds) are enriched for regulatory DNA elements that show a highly cell type–specific pattern of activity,7,9,15–17 obtaining high-resolution chromatin accessibility maps for unique kidney cell types will be essential to assess their contribution to complex kidney traits and disease processes.124 Our findings also support the idea that functional mapping of kidney disease-/trait-associated loci can generate hypotheses regarding the specific kidney cell types (e.g., glomerular cells, Figure 6) that play a key role in the disease process as has been suggested recently for other diseases.9,125,126 Because many studies of CKD use imperfect and nonspecific biomarkers of kidney function (e.g., serum creatinine), deployment of cell type–specific biomarkers together with chromatin accessibility maps of important kidney cell types will improve the assignment of genetic risk to specific cells. Also, our finding that cell type–specific regulatory activity is concentrated around transcription factor genes (which, in turn, can regulate other genes with pleiotropic effects), provides an explanation for how small GWAS effect sizes may result in an amplified phenotypic response. In this context, complex traits and kidney disease may not simply be explained by stepwise accumulation of risk-predisposing genetic variants that only marginally affect the expression of their target genes. Our finding instead lends some support to the recently proposed omnigenic model of inherited risk by which genetic variants perturb cell type–specific gene regulatory networks.124

In summary, we describe the generation and initial characterization of paired chromatin accessibility (DNase-seq) and gene expression profiles (RNA-seq) from primary human glomerular and cortical cultures from multiple individuals. We show that combined analysis of genetic, functional genomic, and chromatin conformation data can shed new light on the complex regulation of gene expression that predisposes to kidney traits and disease. These data will be useful to annotate and compare with future kidney GWAS and emerging eQTL studies.47,101,102 In the future, generation of truly cell type–specific chromatin accessibility maps for a wide range of kidney cell types would enable high-resolution functional annotation of genetic risk variants and generation of kidney gene regulatory networks that have only been previously achieved in limited fashion and in model organisms.127,128

ACKNOWLEDGMENTS

We would like to thank John A. Stamatoyannopoulos whose Encyclopedia of DNA Elements (ENCODE) grant from the National Human Genome Research Institute (U54HG007010) supported sequencing and data processing for this project. We also thank the investigators (Drs. Cassianne Robinson-Cohen and Bryan Kestenbaum in particular), staff, and participants of the individual participating studies for their valuable contributions; and Drs. Stuart Shankland, Andrey Shaw, Laura Yerges-Armstrong, Matt Nelson, Stephanie Chissoe, and Magdalena Skipper for helpful discussions on these data and the manuscript.

S.A. conceived of the project, procured and processed samples, performed and supervised experiments, analyzed data, and wrote the manuscript. K.B.S. analyzed data and wrote the manuscript. J.V. provided improvements to the DNase-sequencing protocol and motif enrichment analysis that are included in this manuscript. A.B. and K.J.H. performed immunofluorescence microscopy experiments. K.S. and A.R. performed analyses and visualizations. S.S. and A.S. processed and analyzed Hi-C data and created topology-associated domain calls. D.B., M.D., and D.D. performed sequencing experiments. R.S. and J.N. processed and curated sequencing data. R.S., M.B., and R.K. annotated datasets and submitted to public repositories with required metadata. M.M. and M.G.S. analyzed kidney expression quantitative trait loci data together with K.B.S. and S.A. J.H., C.E.A., and D.W. wrote and edited the manuscript together with S.A. and K.B.S.

M.G.S. is supported by the Charles Woodson Clinical Research Fund, the Ravitz Foundation, and National Institutes of Health RO1DK108805. S.A. was supported, in part, by a Damon Runyon Cancer Research Foundation Fellowship (DRG 114-13). This project was also supported by National Center for Advancing Translational Sciences grants to J.H. (5UH3TR000504 and 1UG3TR002158), a National Institute of Diabetes and Digestive and Kidney Diseases grant (P30DK017047) to the University of Washington Diabetes Research Center, and by an unrestricted gift from the Northwest Kidney Centers to the Kidney Research Institute. Infrastructure for the Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium is supported, in part, by the National Heart, Lung, and Blood Institute grant R01HL105756.

Neither GlaxoSmithKline nor Phase Genomics provided funding or otherwise influenced the reporting of the data and results presented in this study.

DISCLOSURES

K.B.S. and D.W. are full-time employees of GlaxoSmithKline, LLC. S.S. is a full-time employee of Phase Genomics, Inc.

SUPPLEMENTAL MATERIAL

This article contains the following supplemental material online at http://cjasn.asnjournals.org/lookup/suppl/doi:10.1681/ASN.2018030277/-/DCSupplemental.

Supplemental Figure 1. Regulatory DNA elements stimulate the transcription of target gene(s). While regulatory elements may be located at great distances (10-100’s of kilobases) from their target gene(s) when measured by linear genomic distance, activated elements are frequently in close physical proximity with their actively transcribed target genes.8 Activated regulatory elements are characterized by the displacement of nucleosomes and an open nuclear chromatin state, which is particularly susceptible to DNase I-mediated cleavage.6 The resulting released DNA fragments can be sequenced and mapped to identify accessible chromatin regions genomewide (DNase-seq). In parallel, expression of the genes can be measured by RNA-seq, which enables correlation of regulatory DNA activity with gene transcription. Techniques such as Hi-C104,118 reveal genome topology and can be used to test if regulatory elements and their putative target genes are in physical proximity within nuclear chromatin. Interestingly, many single nucleotide polymorphisms (SNPs) linked to human disease that have been identified through large population-based studies (GWAS) are concentrated within or near regulatory elements. Given the above model of distal regulation of gene transcription by cell-type specific regulatory elements, this leads to our current understanding that GWAS SNPs can identify/modify the function of these elements and thereby control the expression of critical cell-type specific genes. Therefore, we predict that integrated analysis of cell-type specific chromatin accessibility, gene expression and genome topology maps will help to localize kidney disease-associated genetic variants to cell-type specific regulatory elements and to functionally annotate their regulated target genes.

Supplemental Figure 2. Enriched gene ontologies in genes that are expressed at a higher level in glomerular cell cultures vs. cortex cell cultures (all p < 1 × 10^-7).


Supplemental Figure 3. Comparison of DHS profiles from this study to fetal kidney and tubule DHS data previously available from ENCODE. (A) Overlap of all unique DHS from glomerular cultures from this study and the fetal kidney DHS data. (B) Overlap of all unique DHS from cortex cultures from this study and the 6 kidney tubule data sets available through ENCODE (proximal tubule cells, RPTEC; human renal epithelium, HRE; human cortical renal epithelium, HRCE).

Supplemental Figure 4. Physical interactions between kidney GWAS loci associated with differentially accessible DHS and their putative target gene identified by differential gene expression. Long-range genomic contacts deduced from glomerular Hi-C data using the GWAS SNP (red hash mark and orange arcs) and target gene TSS (blue hash mark and purple arcs) as “baits”. Topologically associated domains (TAD) in human glomeruli are indicated when present by the horizontal maroon bars; breaks in the bars represent the boundary between adjacent TADs. Also compare to Table 1 and Supplemental Table 7.

Supplemental Table 1. Patient characteristics, sample identifications and sequencing depth.

Supplemental Table 2. GWAS catalog download on August 24, 2017 selecting for kidney traits.

Supplemental Table 3. Unique SNP positions from GWAS catalog download.

Supplemental Table 4. H3/H4 coloc data from PICCOLO used to prune SNP positions (if prune=TRUE).

Supplemental Table 5. Kidney GWAS SNPs used in this analysis (n=430) and the subset (n=200) achieving the genome-wide significance threshold (p < 5 × 10^-8).

Supplemental Table 6. DESeq2 output for differential gene expression between primary glomerular and cortex cultures.

Supplemental Table 7. Expanded functional annotation data of kidney GWAS-putative target gene pairs.

Supplemental Table 8. Hi-C contacts between kidney GWAS SNPs and putative target gene transcription start sites (UCSC Genome Browser Custom Track Interact file).

Supplemental Table 9. DomainCaller topologically associated domains (TADs) from freshly isolated human glomeruli.

REFERENCES

1. Pattaro C, Teumer A, Gorski M, Chu AY, Li M, Mijatovic V, et al.: ICBP Consortium; AGEN Consortium; CARDIOGRAM; CHARGe-Heart Failure Group; ECHOGen Consortium: Genetic associations at 53 loci highlight cell types and biological pathways relevant for kidney function. Nat Commun 7: 10023, 2016
2. Teumer A, Tin A, Sorice R, Gorski M, Yeo NC, Chu AY, et al.: DCCT/EDIC: Genome-wide association studies identify genetic loci associated with albuminuria in diabetes. Diabetes 65: 803–817, 2016
3. Iyengar SK, Sedor JR, Freedman BI, Kao WHL, Kretzler M, Keller BJ, et al.: Family Investigation of Nephropathy and Diabetes (FIND): Genome-wide association and trans-ethnic meta-analysis for advanced diabetic kidney disease: Family investigation of nephropathy and diabetes (FIND). PLoS Genet 11: e1005352, 2015
4. Genovese G, Friedman DJ, Ross MD, Lecordier L, Uzureau P, Freedman BI, et al.: Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329: 841–845, 2010
5. Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, et al.: Systematic localization of common disease-associated variation in regulatory DNA. Science 337: 1190–1195, 2012
6. Gross DS, Garrard WT: Nuclease hypersensitive sites in chromatin. Annu Rev Biochem 57: 159–197, 1988
7. Thurman RE, Rynes E, Humbert R, Vierstra J, Maurano MT, Haugen E, et al.: The accessible chromatin landscape of the human genome. Nature 489: 75–82, 2012
8. Zhang Y, Wong C-H, Birnbaum RY, Li G, Favaro R, Ngan CY, et al.: Chromatin connectivity maps reveal dynamic promoter-enhancer long-range associations. Nature 504: 306–310, 2013
9. Gerasimova A, Chavez L, Li B, Seumois G, Greenbaum J, Rao A, et al.: Predicting cell types and genetic variations contributing to disease by combining GWAS and epigenetic data. PLoS One 8: e54359, 2013
10. He H, Li W, Liyanarachchi S, Srinivas M, Wang Y, Akagi K, et al.: Multiple functional variants in long-range enhancer elements contribute to the risk of SNP rs965513 in thyroid cancer. Proc Natl Acad Sci U S A 112: 6128–6133, 2015
11. Gaulton KJ, Ferreira T, Lee Y, Raimondo A, Mägi R, Reschen ME, et al.: DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium: Genetic fine mapping and genomic annotation defines causal mechanisms at type 2 diabetes susceptibility loci. Nat Genet 47: 1415–1425, 2015
12. Oldridge DA, Wood AC, Weichert-Leahey N, Crimmins I, Sussman R, Winter C, et al.: Genetic predisposition to neuroblastoma mediated by a LMO1 super-enhancer polymorphism. Nature 528: 418–421, 2015
13. Claussnitzer M, Dankel SN, Kim K-H, Quon G, Meuleman W, Haugen C, et al.: FTO obesity variant circuitry and adipocyte browning in humans. N Engl J Med 373: 895–907, 2015
14. Soldner F, Stelzer Y, Shivalila CS, Abraham BJ, Latourelle JC, Barrasa MI, et al.: Parkinson-associated risk variant in distal enhancer of α-synuclein modulates target gene expression. Nature 533: 95–99, 2016
15. Forrest AR, Kawaji H, Rehli M, Baillie JK, de Hoon MJ, Haberle V, et al.: FANTOM Consortium and the RIKEN PMI and CLST (DGT): A promoter-level mammalian expression atlas. Nature 507: 462–470, 2014
16. Heinz S, Romanoski CE, Benner C, Glass CK: The selection and function of cell type-specific enhancers. Nat Rev Mol Cell Biol 16: 144–154, 2015
17. Pellacani D, Bilenky M, Kannan N, Heravi-Moussavi A, Knapp DJHF, Gakkhar S, et al.: Analysis of normal human mammary epigenomes reveals cell-specific active enhancer states and associated transcription factor networks. Cell Reports 17: 2060–2074, 2016
18. Heinz S, Romanoski CE, Benner C, Allison KA, Kaikkonen MU, Orozco LD, et al.: Effect of natural genetic variation on enhancer selection and function. Nature 503: 487–492, 2013
19. Wu C: The 5′ ends of Drosophila heat shock genes in chromatin are hypersensitive to DNase I. Nature 286: 854–860, 1980
20. Boyle AP, Davis S, Shulha HP, Meltzer P, Margulies EH, Weng Z, et al.: High-resolution mapping and characterization of open chromatin across the genome. Cell 132: 311–322, 2008
21. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ: Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10: 1213–1218, 2013
22. Gansauge M-T, Meyer M: Single-stranded DNA library preparation for the sequencing of ancient or damaged DNA. Nat Protoc 8: 737–748, 2013
23. Snyder MW, Kircher M, Hill AJ, Daza RM, Shendure J: Cell-free DNA comprises an in vivo nucleosome footprint that informs its tissues-of-origin. Cell 164: 57–68, 2016
24. Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25: 1754–1760, 2009


25. Trapnell C, Pachter L, Salzberg SL: TopHat: Discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105–1111, 2009
26. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al.: STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29: 15–21, 2013
27. Dixon JR, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, et al.: Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485: 376–380, 2012
28. Durinck S, Moreau Y, Kasprzyk A, Davis S, De Moor B, Brazma A, et al.: BioMart and Bioconductor: A powerful link between biological databases and microarray data analysis. Bioinformatics 21: 3439–3440, 2005
29. Durinck S, Spellman PT, Birney E, Huber W: Mapping identifiers for the integration of genomic datasets with the R/Bioconductor package biomaRt. Nat Protoc 4: 1184–1191, 2009
30. Lawrence M, Huber W, Pagès H, Aboyoun P, Carlson M, Gentleman R, et al.: Software for computing and annotating genomic ranges. PLOS Comput Biol 9: e1003118, 2013
31. Love MI, Huber W, Anders S: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 15: 550, 2014
32. Yu G, Wang L-G, Han Y, He Q-Y: ClusterProfiler: An R package for comparing biological themes among gene clusters. OMICS 16: 284–287, 2012
33. Neph S, Kuehn MS, Reynolds AP, Haugen E, Thurman RE, Johnson AK, et al.: BEDOPS: High-performance genomic feature operations. Bioinformatics 28: 1919–1920, 2012
34. Neph S, Reynolds AP, Kuehn MS, Stamatoyannopoulos JA: Operating on genomic ranges using BEDOPS. Methods Mol Biol 1418: 267–281, 2016
35. McLean CY, Bristor D, Hiller M, Clarke SL, Schaar BT, Lowe CB, et al.: GREAT improves functional interpretation of cis-regulatory regions. Nat Biotechnol 28: 495–501, 2010
36. Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, et al.: TRANSFAC and its module TRANSCompel: Transcriptional gene regulation in eukaryotes. Nucleic Acids Res 34: D108–D110, 2006
37. Bryne JC, Valen E, Tang M-HE, Marstrand T, Winther O, da Piedade I, et al.: JASPAR, the open access database of transcription factor-binding profiles: New content and tools in the 2008 update. Nucleic Acids Res 36: D102–D106, 2008
38. Jolma A, Yan J, Whitington T, Toivonen J, Nitta KR, Rastas P, et al.: DNA-binding specificities of human transcription factors. Cell 152: 327–339, 2013
39. Grant CE, Bailey TL, Noble WS: FIMO: Scanning for occurrences of a given motif. Bioinformatics 27: 1017–1018, 2011
40. Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, et al.: MEME SUITE: Tools for motif discovery and searching. Nucleic Acids Res 37: W202–W208, 2009
41. Maurano MT, Haugen E, Sandstrom R, Vierstra J, Shafer A, Kaul R, et al.: Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo. Nat Genet 47: 1393–1401, 2015
42. MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, Hastings E, et al.: The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res 45[D1]: D896–D901, 2017
43. Farh KK-H, Marson A, Zhu J, Kleinewietfeld M, Housley WJ, Beik S, et al.: Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 518: 337–343, 2015
44. Guo C, Nelson MR, Esparza-Gordillo J, Hurle MR, Johnson T, Sieber KB: A little data goes a long way: Finding target genes across the GWAS Catalog by colocalizing GWAS and eQTL top hits. Presented at the 2018 Annual Meeting of American Society of Human Genetics - Program Number 220. San Diego, CA, October 16–20, 2018
45. Pers TH, Timshel P, Hirschhorn JN: SNPsnap: A Web-based tool for identification and annotation of matched SNPs. Bioinformatics 31: 418–420, 2015
46. Battle A, Brown CD, Engelhardt BE, Montgomery SB; GTEx Consortium; Laboratory, Data Analysis & Coordinating Center (LDACC) - Analysis Working Group; Statistical Methods groups - Analysis Working Group; Enhancing GTEx (eGTEx) groups; NIH Common Fund; NIH/NCI; NIH/NHGRI; NIH/NIMH; NIH/NIDA; Biospecimen Collection Source Site - NDRI; Biospecimen Collection Source Site - RPCI; Biospecimen Core Resource - VARI; Brain Bank Repository - University of Miami Brain Endowment Bank; Leidos Biomedical - Project Management; ELSI Study; Genome Browser Data Integration & Visualization - EBI; Genome Browser Data Integration & Visualization - UCSC Genomics Institute, University of California Santa Cruz; Lead analysts; Laboratory, Data Analysis & Coordinating Center (LDACC); NIH program management; Biospecimen collection; Pathology; eQTL manuscript working group: Genetic effects on gene expression across human tissues. Nature 550: 204–213, 2017
47. Gillies CE, Putler R, Menon R, Otto E, Yasutake K, Nair V, et al.: Nephrotic Syndrome Study Network (NEPTUNE): An eQTL landscape of kidney tissue in human nephrotic syndrome. Am J Hum Genet 103: 232–244, 2018
48. Wang X, Tucker NR, Rizki G, Mills R, Krijger PH, de Wit E, et al.: Discovery and validation of sub-threshold genome-wide association study loci using epigenomic signatures. eLife 5: 5, 2016
49. Pruim RJ, Welch RP, Sanna S, Teslovich TM, Chines PS, Gliedt TP, et al.: LocusZoom: Regional visualization of genome-wide association scan results. Bioinformatics 26: 2336–2337, 2010
50. Robinson-Cohen C, Lutsey PL, Kleber ME, Nielson CM, Mitchell BD, Bis JC, et al.: Genetic variants associated with circulating parathyroid hormone. J Am Soc Nephrol 28: 1553–1565, 2017
51. Vesey DA, Qi W, Chen X, Pollock CA, Johnson DW: Isolation and primary culture of human proximal tubule cells. Methods Mol Biol 466: 19–24, 2009
52. Glynne PA: Primary culture of human proximal renal tubular epithelial cells. Methods Mol Med 36: 197–205, 2000
53. Trifillis AL, Regec AL, Trump BF: Isolation, culture and characterization of human renal tubular cells. J Urol 133: 324–329, 1985
54. Bariety J, Mandet C, Hill GS, Bruneval P: Parietal podocytes in normal human glomeruli. J Am Soc Nephrol 17: 2770–2780, 2006
55. Zhang J, Hansen KM, Pippin JW, Chang AM, Taniguchi Y, Krofft RD, et al.: De novo expression of podocyte proteins in parietal epithelial cells in experimental aging nephropathy. Am J Physiol Renal Physiol 302: F571–F580, 2012
56. Appel D, Kershaw DB, Smeets B, Yuan G, Fuss A, Frye B, et al.: Recruitment of podocytes from glomerular parietal epithelial cells. J Am Soc Nephrol 20: 333–343, 2009
57. Andeen NK, Nguyen TQ, Steegh F, Hudkins KL, Najafian B, Alpers CE: The phenotypes of podocytes and parietal epithelial cells may overlap in diabetic nephropathy. Kidney Int 88: 1099–1107, 2015
58. Weinstein T, Cameron R, Katz A, Silverman M: Rat glomerular epithelial cells in culture express characteristics of parietal, not visceral, epithelium. J Am Soc Nephrol 3: 1279–1287, 1992
59. Mundlos S, Pelletier J, Darveau A, Bachmann M, Winterpacht A, Zabel B: Nuclear localization of the protein encoded by the Wilms’ tumor gene WT1 in embryonic and adult tissues. Development 119: 1329–1341, 1993
60. Kazama I, Mahoney Z, Miner JH, Graf D, Economides AN, Kreidberg JA: Podocyte-derived BMP7 is critical for nephron development. J Am Soc Nephrol 19: 2181–2191, 2008
61. Mundel P, Heid HW, Mundel TM, Krüger M, Reiser J, Kriz W: Synaptopodin: An actin-associated protein in telencephalic dendrites and renal podocytes. J Cell Biol 139: 193–204, 1997
62. Akilesh S, Suleiman H, Yu H, Stander MC, Lavin P, Gbadegesin R, et al.: Arhgap24 inactivates Rac1 in mouse podocytes, and a mutant form is associated with familial focal segmental glomerulosclerosis. J Clin Invest 121: 4127–4137, 2011
63. Potla U, Ni J, Vadaparampil J, Yang G, Leventhal JS, Campbell KN, et al.: Podocyte-specific RAP1GAP expression contributes to focal segmental glomerulosclerosis-associated glomerular injury. J Clin Invest 124: 1757–1769, 2014
64. Gödel M, Ostendorf BN, Baumer J, Weber K, Huber TB: A novel domain regulating degradation of the glomerular slit diaphragm protein podocin in cell culture systems. PLoS One 8: e57078, 2013
65. Xu J, Nie X, Cai X, Cai C-L, Xu P-X: Tbx18 is essential for normal development of vasculature network and glomerular mesangium in the mammalian kidney. Dev Biol 391: 17–31, 2014
66. Bondeva T, Roger T, Wolf G: Differential regulation of Toll-like receptor 4 gene expression in renal cells by angiotensin II: Dependency on AP1 and PU.1 transcriptional sites. Am J Nephrol 27: 308–314, 2007
67. Banas MC, Banas B, Hudkins KL, Wietecha TA, Iyoda M, Bock E, et al.: TLR4 links podocytes with the innate immune system to mediate glomerular injury. J Am Soc Nephrol 19: 704–713, 2008
68. Lindenmeyer MT, Eichinger F, Sen K, Anders H-J, Edenhofer I, Mattinzoli D, et al.: Systematic analysis of a novel human renal glomerulus-enriched gene expression dataset. PLoS One 5: e11545, 2010
69. Chabardès-Garonne D, Mejéan A, Aude J-C, Cheval L, Di Stefano A, Gaillard M-C, et al.: A panoramic view of gene expression in the human kidney. Proc Natl Acad Sci U S A 100: 13710–13715, 2003
70. Rudnicki M, Eder S, Perco P, Enrich J, Scheiber K, Koppelstätter C, et al.: Gene expression profiles of human proximal tubular epithelial cells in proteinuric nephropathies. Kidney Int 71: 325–335, 2007
71. Oberley TD, Burkholder PM, Mills MD: Culture of human glomerular cells. Am J Pathol 96: 101–119, 1979
72. Saleem MA, O’Hare MJ, Reiser J, Coward RJ, Inward CD, Farren T, et al.: A conditionally immortalized human podocyte cell line demonstrating nephrin and podocin expression. J Am Soc Nephrol 13: 630–638, 2002
73. Sarrab RM, Lennon R, Ni L, Wherlock MD, Welsh GI, Saleem MA: Establishment of conditionally immortalized human glomerular mesangial cells in culture, with unique migratory properties. Am J Physiol Renal Physiol 301: F1131–F1138, 2011
74. Satchell SC, Tasman CH, Singh A, Ni L, Geelen J, von Ruhland CJ, et al.: Conditionally immortalized human glomerular endothelial cells expressing fenestrations in response to VEGF. Kidney Int 69: 1633–1640, 2006
75. Shankland SJ, Pippin JW, Reiser J, Mundel P: Podocytes in culture: Past, present, and future. Kidney Int 72: 26–36, 2007
76. Winn MP, Conlon PJ, Lynn KL, Farrington MK, Creazzo T, Hawkins AF, et al.: A mutation in the TRPC6 cation channel causes familial focal segmental glomerulosclerosis. Science 308: 1801–1804, 2005
77. Detrisac CJ, Sens MA, Garvin AJ, Spicer SS, Sens DA: Tissue culture of human kidney epithelial cells of proximal tubule origin. Kidney Int 25: 383–390, 1984
78. Chuman L, Fine LG, Cohen AH, Saier MH Jr.: Continuous growth of proximal tubular kidney epithelial cells in hormone-supplemented serum-free medium. J Cell Biol 94: 506–510, 1982
79. Taub M, Sato G: Growth of functional primary cultures of kidney epithelial cells in defined medium. J Cell Physiol 105: 369–378, 1980
80. Elliget KA, Trump BF: Primary cultures of normal rat kidney proximal tubule epithelial cells for studies of renal cell injury. In Vitro Cell Dev Biol 27A: 739–748, 1991
81. Miller JH: Restricted growth of rat kidney proximal tubule cells cultured in serum-supplemented and defined media. J Cell Physiol 129: 264–272, 1986
82. Liang G, Lin JCY, Wei V, Yoo C, Cheng JC, Nguyen CT, et al.: Distinct localization of histone H3 acetylation and H3-K4 methylation to the transcription start sites in the human genome. Proc Natl Acad Sci U S A 101: 7357–7362, 2004
83. Guo J-K, Menke AL, Gubler M-C, Clarke AR, Harrison D, Hammes A, et al.: WT1 is a key regulator of podocyte function: Reduced expression levels cause crescentic glomerulonephritis and mesangial sclerosis. Hum Mol Genet 11: 651–659, 2002
84. Takemoto M, He L, Norlin J, Patrakka J, Xiao Z, Petrova T, et al.: Large-scale identification of genes implicated in kidney glomerulus development and function. EMBO J 25: 1160–1174, 2006
85. Brunskill EW, Aronow BJ, Georgas K, Rumballe B, Valerius MT, Aronow J, et al.: Atlas of gene expression in the developing kidney at microanatomic resolution. Dev Cell 15: 781–791, 2008
86. Mitu GM, Wang S, Hirschberg R: BMP7 is a podocyte survival factor and rescues podocytes from diabetic injury. Am J Physiol Renal Physiol 293: F1641–F1648, 2007
87. Motojima M, Ogiwara S, Matsusaka T, Kim SY, Sagawa N, Abe K, et al.: Conditional knockout of Foxc2 gene in kidney: Efficient generation of conditional alleles of single-exon gene by double-selection system. Mamm Genome 27: 62–69, 2016
88. Motojima M, Tanimoto S, Ohtsuka M, Matsusaka T, Kume T, Abe K: Characterization of kidney and skeleton phenotypes of mice double heterozygous for Foxc1 and Foxc2. Cells Tissues Organs 201: 380–389, 2016
89. Lambert SA, Jolma A, Campitelli LF, Das PK, Yin Y, Albu M, et al.: The human transcription factors. Cell 172: 650–665, 2018
90. Nakai S, Sugitani Y, Sato H, Ito S, Miura Y, Ogawa M, et al.: Crucial roles of Brn1 in distal tubule formation and function in mouse kidney. Development 130: 4751–4759, 2003
91. Kumar S, Rathkolb B, Kemter E, Sabrautzki S, Michel D, Adler T, et al.: Generation and standardized, systemic phenotypic analysis of Pou3f3L423P mutant mice. PLoS One 11: e0150472, 2016
92. Rieger A, Kemter E, Kumar S, Popper B, Aigner B, Wolf E, et al.: Missense mutation of POU domain class 3 transcription factor 3 in Pou3f3L423P mice causes reduced nephron number and impaired development of the thick ascending limb of the loop of henle. PLoS One 11: e0158977, 2016
93. Ema M, Morita M, Ikawa S, Tanaka M, Matsuda Y, Gotoh O, et al.: Two new members of the murine Sim gene family are transcriptional repressors and show different expression patterns during mouse embryogenesis. Mol Cell Biol 16: 5865–5875, 1996
94. Serluca FC, Fishman MC: Pre-pattern in the pronephric kidney field of zebrafish. Development 128: 2233–2241, 2001
95. Werth M, Schmidt-Ott KM, Leete T, Qiu A, Hinze C, Viltard M, et al.: Transcription factor TFCP2L1 patterns cells in the mouse kidney collecting ducts. eLife 6: 531, 2017
96. Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen K-Y, Morley M, et al.: Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet 33: 422–425, 2003
97. Stranger BE, Nica AC, Forrest MS, Dimas A, Bird CP, Beazley C, et al.: Population genomics of human gene expression. Nat Genet 39: 1217–1224, 2007
98. Stranger BE, Forrest MS, Dunning M, Ingle CE, Beazley C, Thorne N, et al.: Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315: 848–853, 2007
99. Emilsson V, Thorleifsson G, Zhang B, Leonardson AS, Zink F, Zhu J, et al.: Genetics of gene expression and its effect on disease. Nature 452: 423–428, 2008
100. Gilad Y, Rifkin SA, Pritchard JK: Revealing the architecture of gene regulation: The promise of eQTL studies. Trends Genet 24: 408–415, 2008
101. Ko Y-A, Yi H, Qiu C, Huang S, Park J, Ledo N, et al.: Genetic-variation-driven gene-expression changes highlight genes with important functions for kidney disease. Am J Hum Genet 100: 940–953, 2017
102. Qiu C, Huang S, Park J, Park Y, Ko Y-A, Seasock MJ, et al.: Renal compartment-specific genetic variation analyses identify new pathways in chronic kidney disease. Nat Med 24: 1721–1731, 2018

20 Journal of the American Society of Nephrology J Am Soc Nephrol 30: ccc–ccc,2019 www.jasn.org BASIC RESEARCH

103. Williamson I, Hill RE, Bickmore WA: Enhancers: From developmental 117. Zhu Z, Zhang F, Hu H, Bakshi A, Robinson MR, Powell JE, et al.: In- genetics to the genetics of common human disease. Dev Cell 21: 17– tegration of summary data from GWAS and eQTL studies predicts 19, 2011 complex trait gene targets. Nat Genet 48: 481–487, 2016 104. Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy 118. Rao SSP, Huntley MH, Durand NC, Stamenova EK, Bochkov ID, T, Telling A, et al.: Comprehensive mapping of long-range interac- Robinson JT, et al.: A 3D map of the human genome at kilobase res- tions reveals folding principles of the human genome. Science 326: olution reveals principles of chromatin looping. Cell 159: 1665–1680, 289–293, 2009 2014 105. Li G, Ruan X, Auerbach RK, Sandhu KS, Zheng M, Wang P, et al.: Ex- 119. Javierre BM, Burren OS, Wilder SP, Kreuzhuber R, Hill SM, Sewitz S, tensive promoter-centered chromatin interactions provide a topo- et al.: BLUEPRINT Consortium: Lineage-specific genome architecture logical basis for transcription regulation. Cell 148: 84–98, 2012 links enhancers and non-coding disease variants to target gene pro- 106. Thorleifsson G, Holm H, Edvardsson V, Walters GB, Styrkarsdottir U, moters. Cell 167: 1369–1384.e19, 2016 Gudbjartsson DF, et al.: Sequence variants in the CLDN14 gene as- 120. Brandt MM, Meddens CA, Louzao-Martinez L, van den Dungen NAM, sociate with kidney stones and bone mineral density. Nat Genet 41: Lansu NR, et al.: Chromatin conformation links distal target genes to 926–930, 2009 CKD loci. J Am Soc Nephrol 29: 462–476, 2017 107. Gorski M, van der Most PJ, Teumer A, Chu AY, Li M, Mijatovic V, et al.: 121. Fulco CP, Munschauer M, Anyoha R, Munson G, Grossman SR, 1000 Genomes-based meta-analysis identifies 10 novel loci for kidney Perez EM, et al.: Systematic mapping of functional enhancer- function. Sci Rep 7: 45040, 2017 promoter connections with CRISPR interference. Science 354: 769– 108. 
Nagata M, Nakayama K, Terada Y, Hoshi S, Watanabe T: Cell cycle 773, 2016 regulation and differentiation in the human podocyte lineage. 122. Simeonov DR, Gowen BG, Boontanrart M, Roth TL, Gagnon JD, Am J Pathol 153: 1511–1520, 1998 Mumbach MR, et al.: Discovery of stimulation-responsive immune 109. Nagata M, Shibata S, Shigeta M, Yu-Ming S, Watanabe T: Cyclin-de- enhancers with CRISPR activation. Nature 549: 111–115, 2017 pendent kinase inhibitors: p27kip1 and p57kip2 expression during 123. Canver MC, Smith EC, Sher F, Pinello L, Sanjana NE, Shalem O, et al.: human podocyte differentiation. Nephrol Dial Transplant 14[Suppl 1]: BCL11A enhancer dissection by Cas9-mediated in situ saturating 48–51, 1999 mutagenesis. Nature 527: 192–197, 2015 110. Barisoni L, Mokrzycki M, Sablay L, Nagata M, Yamase H, Mundel P: 124. Boyle EA, Li YI, Pritchard JK: An expanded view of complex traits: Podocyte cell cycle regulation and proliferation in collapsing glo- From polygenic to omnigenic. Cell 169: 1177–1186, 2017 merulopathies. Kidney Int 58: 137–143, 2000 125. Trynka G, Sandor C, Han B, Xu H, Stranger BE, Liu XS, et al.: Chromatin 111. Shankland SJ, Eitner F, Hudkins KL, Goodpaster T, D’Agati V, Alpers marks identify critical cell types for fine mapping complex trait vari- CE: Differential expression of cyclin-dependent kinase inhibitors in ants. Nat Genet 45: 124–130, 2013 human glomerular disease: Role in podocyte proliferation and matu- 126. Coetzee SG, Pierce S, Brundin P, Brundin L, Hazelett DJ, Coetzee GA: ration. Kidney Int 58: 674–683, 2000 Enrichment of risk SNPs in regulatory regions implicate diverse tissues 112. Hiromura K, Haseley LA, Zhang P, Monkawa T, Durvasula R, Petermann in Parkinson’s disease etiology. Sci Rep 6: 30509, 2016 AT, et al.: Podocyte expression of the CDK-inhibitor p57 during de- 127. Thiagarajan RD, Georgas KM, Rumballe BA, Lesieur E, Chiu HS, velopment and disease. 
Kidney Int 60: 2235–2246, 2001 Taylor D, et al.: Identification of anchor genes during kidney de- 113. John S, Sabo PJ, CanfieldTK,LeeK,VongS,WeaverM,etal.:Ge- velopment defines ontological relationships, molecular sub- nome-scale mapping of DNase I hypersensitivity. Curr Protoc Mol Biol compartments and regulatory pathways. PLoS One 6: e17286, Chapter 27: Unit 21.27, 2013 2011 114. Meyer CA, Liu XS: Identifying and mitigating bias in next-generation 128. White JT, Zhang B, Cerqueira DM, Tran U, Wessely O: Notch sig- sequencing methods for chromatin biology. Nat Rev Genet 15: 709– naling, wt1 and foxc2 are key regulators of the podocyte gene 721, 2014 regulatory network in Xenopus. Development 137: 1863–1873, 115. Montefiori L, Hernandez L, Zhang Z, Gilad Y, Ober C, Crawford G, 2010 et al.: Reducing mitochondrial reads in ATAC-seq using CRISPR/Cas9. Sci Rep 7: 2451, 2017 116. Adam M, Potter AS, Potter SS: Psychrophilic proteases dramatically reduce single-cell RNA-seq artifacts: A molecular atlas of kidney de- See related editorial, “Long-Range Chromatin Interactions in the Kidney,” on velopment. Development 144: 3625–3632, 2017 pages XXX–XXX.

AFFILIATIONS

¹GlaxoSmithKline, LLC, Collegeville, Pennsylvania; ²Altius Institute for Biomedical Sciences, Seattle, Washington; ³Department of Anatomic Pathology, ⁵Department of Biomedical and Health Informatics, and ⁷Division of Nephrology, Department of Medicine, University of Washington, Seattle, Washington; ⁴Phase Genomics Inc., Seattle, Washington; ⁶Division of Pediatric Nephrology, Department of Pediatrics, University of Michigan School of Medicine, Ann Arbor, Michigan; and ⁸Kidney Research Institute, Seattle, Washington
