Gene Network Analysis in a Pediatric Cohort Identifies Novel Lung Function Genes
Total Page:16
File Type:pdf, Size:1020Kb
Gene Network Analysis in a Pediatric Cohort Identifies Novel Lung Function Genes Bruce A. Ong1.¤,JinLi2., Joseph M. McDonough1, Zhi Wei3, Cecilia Kim2, Rosetta Chiavacci2, Frank Mentch2, Jason B. Caboot4, Jonathan Spergel5,7,8, Julian L. Allen1,7, Patrick M. A. Sleiman2,6,7*, Hakon Hakonarson1,2,6,7* 1 Division of Pulmonary Medicine and Cystic Fibrosis Center, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America, 2 Center for Applied Genomics, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America, 3 Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey, United States of America, 4 Division of Pediatric Pulmonology, Madigan Army Medical Center, Tacoma, Washington, United States of America, 5 Center for Pediatric Eosinophilic Disorders, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America, 6 Division of Human Genetics, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America, 7 Department of Pediatrics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, United States of America, 8 Division of Allergy and Immunology, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania, United States of America Abstract Lung function is a heritable trait and serves as an important clinical predictor of morbidity and mortality for pulmonary conditions in adults, however, despite its importance, no studies have focused on uncovering pediatric-specific loci influencing lung function. To identify novel genetic determinants of pediatric lung function, we conducted a genome-wide association study (GWAS) of four pulmonary function traits, including FVC, FEV1, FEV1/FVC and FEF25–75% in 1556 children. Further, we carried out gene network analyses for each trait including all SNPs with a P-value of ,1.061023 from the individual GWAS. The GWAS identified SNPs with notable trends towards association with the pulmonary function 27 measures, including the previously described INTS12 locus association with FEV1 (pmeta = 1.41610 ). The gene network analyses identified 34 networks of genes associated with pulmonary function variables in Caucasians. Of those, the 24 glycoprotein gene network reached genome-wide significance for all four variables. P-value range pmeta = 6.29610 - 2.8061028 on meta-analysis. In this study, we report on specific pathways that are significantly associated with pediatric lung function at genome-wide significance. In addition, we report the first loci associated with lung function in both pediatric Caucasian and African American populations. Citation: Ong BA, Li J, McDonough JM, Wei Z, Kim C, et al. (2013) Gene Network Analysis in a Pediatric Cohort Identifies Novel Lung Function Genes. PLoS ONE 8(9): e72899. doi:10.1371/journal.pone.0072899 Editor: Wenge Guo, New Jersey Institute of Technology, United States of America Received April 8, 2013; Accepted July 15, 2013; Published September 2, 2013 Copyright: ß 2013 Ong et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Funding: This study was funded by an Institutional Development Fund to The Center for Applied Genomics, and by a Pennsylvania State Grant (SAP Number: 4100042728). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist. * E-mail: [email protected] (HH); [email protected] (PMAS) ¤ Current address: Division of Pediatric Pulmonology, Tripler Army Medical Center, Honolulu, Hawaii, United States of America . These authors contributed equally to this work. Introduction gene with large effect underlying normal variation in lung function has been identified, indicating that a polygenic model of The respiratory system plays a central role in overall health and inheritance is most likely to determine lung function [5,7]. In lung-related disease. Spirometry is a commonly used technique to addition to a genetic determinants, environmental variables are measure lung function as well as identify respiratory illness and known to affect lung function. Smoking is a well established follow lung disease progression. Commonly used spirometric environmental detriment to lung function, and children living near indices include the forced expiratory volume in 1 second (FEV1), highways have also been shown to have significantly reduced lung the forced vital capacity (FVC), and the FEV1/FVC ratio. Prior function [8]. studies have demonstrated that pulmonary function can predict The first genome-wide association study (GWAS) to look at both pulmonary morbidity and mortality [1,2]. adult pulmonary function, specifically FEV1/FVC was reported in Both genetic and environmental factors influence lung function. 2009 [9]. Recently, three large-cohort-based studies have reported Population-based studies have described significant correlations a total of 28 additional loci significantly associated with pulmonary between pulmonary function measures among monozygotic twins function measures in adults [10,11,12]. However, these loci are [3], and heritability estimates of lung function in excess of 40% estimated to explain only a small portion of the variation in FEV1 have been reported [4,5]. Longitudinal studies have also and FEV1/FVC and do not address the unique aspects of determined that the decline in lung function is also influenced pulmonary function in developing children or African-Americans. by genetic factors [6]. While the heritability estimates of lung It is likely that a large number of loci influencing the heritable function are high, no definitive evidence of a single Mendelian variation in lung function remain unidentified [10,11,12]. PLOS ONE | www.plosone.org 1 September 2013 | Volume 8 | Issue 9 | e72899 Network Analysis and Pediatric Lung Function Genes Two factors make children a unique population to study lung genotype clustering files were used to process the genotyping function. First, lung function rapidly increases from childhood to signals and generate genotype calls. adolescence and by age 21 reaches a plateau phase followed by a slow but steady decline [13]. Thus, studying pulmonary function Quality Control in children could yield associations with loci related to lung After genotyping, only samples with a call rate .95% and development which may not manifest phenotypic expression in genotyping markers with a missing rate ,5%, minor allele adults. Second, it is possible that effects such as cigarette smoke frequency .0.01 and HWE-p-value .0.0001 were retained for exposure or other environmental factors over time may weaken analysis. We further extracted markers shared by 550k and 610k phenotype expression and mask associations with genetic loci. It is chips for the association study. The population structure of all the plausible that the pediatric population may exhibit a stronger samples was confirmed by Multi-Dimensional Scaling (MDS) association between pulmonary function measures and genetic analysis using PLINK software version 1.06 [15]. We detected determinants. cryptic related or duplicated samples in the cohort by determining Using pulmonary function data from children undergoing the whole genome identity-by-descent (IBD) scores with PLINK spirometry testing at the Children’s Hospital of Philadelphia, we software and excluded one sample from each pair with IBD scores conducted a GWAS to identify loci influencing pediatric lung .0.25. function. To enhance study power, we additionally applied a pathway-based approach to search for gene networks influencing Association Analysis lung function. We and others have previously shown that GWAS Linear regression was performed at each SNP for the four lung derived gene network analyses can effectively uncover significant function traits using PLINK software version 1.06 [15]. In the associations between gene pathways and complex diseases that are Caucasian cohort, a subset of the subjects had cystic fibrosis, not discovered by individual SNP analyses [14]. asthma or cancer diagnoses, all of which were treated as covariates in the regression model. In the African American cohort, Materials and Methods diagnoses of sickle cell disease, asthma and cancer were similarly treated as covariates. All four lung function phenotypes were Subjects approximately normally distributed (Figure S1), a linear regression This study was approved by the institutional review board of the model was utilized to evaluate the association between each of the Children’s Hospital of Philadelphia. In accordance with IRB phenotypes and the SNP genotype within Caucasian and African protocol, all subjects eligible for participation in this study were American cohorts separately. We also conducted meta-analyses enrolled after obtaining informed written consent either from their using the PLINK and METAL packages [16]. parents, if under the age of 18 with the child’s assent, or directly from the subject if aged 18 or over. The study cohort consisted of Genotype Imputation children and young adults between 5–21 years of age that had completed pulmonary function testing and genotype analysis at the We performed pre-phasing imputation of the two loci on Children’s Hospital of Philadelphia.