Whole Exome Sequencing Analyses Reveal Gene–Microbiota Interactions
Inflammatory bowel disease
Original research

Whole exome sequencing analyses reveal gene–microbiota interactions in the context of IBD

Shixian Hu,1,2 Arnau Vich Vila,1,2 Ranko Gacesa,1,2 Valerie Collij,1,2 Christine Stevens,3 Jack M Fu,4,5,6 Isaac Wong,4,5 Michael E Talkowski,4,5,6,7,8 Manuel A Rivas,9 Floris Imhann,1,2 Laura Bolte,1,2 Hendrik van Dullemen,1 Gerard Dijkstra,1 Marijn C Visschedijk,1 Eleonora A Festen,1 Ramnik J Xavier,10,11 Jingyuan Fu,2,12 Mark J Daly,3 Cisca Wijmenga,2 Alexandra Zhernakova,2 Alexander Kurilshikov,2 Rinse K Weersma1

► Additional material is published online only. To view please visit the journal online (http://dx.doi.org/10.1136/gutjnl-2019-319706).
For numbered affiliations see end of article.
Correspondence to Professor Rinse K Weersma; r.k.weersma@mdl.umcg.nl
SH, AVV, RG and VC are joint first authors. AZ, AK and RKW are joint senior authors.
Received 23 August 2019. Revised 8 April 2020. Accepted 20 April 2020. Published Online First 10 July 2020.
© Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY. Published by BMJ.
To cite: Hu S, Vich Vila A, Gacesa R, et al. Gut 2021;70:285–296.
Gut: first published as 10.1136/gutjnl-2019-319706 on 10 July 2020.

ABSTRACT

Objective Both the gut microbiome and host genetics are known to play significant roles in the pathogenesis of IBD. However, the interaction between these two factors and its implications in the aetiology of IBD remain underexplored. Here, we report on the influence of host genetics on the gut microbiome in IBD.

Design To evaluate the impact of host genetics on the gut microbiota of patients with IBD, we combined whole exome sequencing of the host genome and whole genome shotgun sequencing of 1464 faecal samples from 525 patients with IBD and 939 population-based controls. We followed a four-step analysis: (1) exome-wide microbial quantitative trait loci (mbQTL) analyses, (2) a targeted approach focusing on IBD-associated genomic regions and protein truncating variants (PTVs, minor allele frequency (MAF) >5%), (3) gene-based burden tests on PTVs with MAF <5% and exome copy number variations (CNVs) with site frequency <1%, (4) joint analysis of both cohorts to identify the interactions between disease and host genetics.

Results We identified 12 mbQTLs, including variants in the IBD-associated genes IL17REL, MYRF, SEC16A and WDR78. For example, the decrease of the pathway acetyl-coenzyme A biosynthesis, which is involved in short chain fatty acids production, was associated with variants in the gene MYRF (false discovery rate <0.05). Changes in functional pathways involved in the metabolic potential were also observed in participants carrying rare PTVs or CNVs in CYP2D6, GPR151 and CD160 genes. These genes are known for their function in the immune system. Moreover, interaction analyses confirmed previously known IBD disease-specific mbQTLs in TNFSF15.

Conclusion This study highlights that both common and rare genetic variants affecting the immune system are key factors in shaping the gut microbiota in the context of IBD and points towards potential mechanisms for disease treatment.

Significance of this study

What is already known about this subject?
► Gene–microbiome interactions are important in the pathogenesis of IBD.
► Multiple genetic and epidemiological factors have been identified to be associated with changes in gut microbiome homeostasis in both IBD and the general population.
► The identified gene–microbiome interactions in IBD contain mostly common genetic variants.

What are the new findings?
► Novel associations between common genomic variants located in IBD implicated genes (MYRF, IL17REL, SEC16A and WDR78) or immune-related genes (CABIN1) and gut microbial features have been identified in both IBD and the general population cohort.
► By using high-resolution sequencing data, we were also able to identify rare and deleterious variants in five genes (GPR151, CYP2D6, TPTE2, LEKR1 and CD160) that may also be involved in the regulation of the gut microbiota.
► Disease-specific host microbiota interactions were assessed by taking into account potential confounding factors such as medication use.

How might it impact on clinical practice in the foreseeable future?
► Our research revealed the host–microbiota interactions in the context of IBD, which helps us to understand the pathology of IBD and potentially move towards new therapeutic targets for IBD.

INTRODUCTION

IBD, comprising Crohn’s disease (CD) and UC, is a chronic inflammatory condition of the gut with an increasing incidence in westernised countries.1 Large-scale genome-wide association studies (GWAS) have identified more than 200 genetic loci associated with IBD, including genes implicated in the immune pathways involved in responses to gut microbes.2

Extensive changes in the composition of the gut microbiota have been reported in patients with IBD. Several studies have described similar alterations in the faecal microbiota of patients with IBD, mainly a decreased microbial richness, the depletion of strictly anaerobic commensal species and the expansion of pathobionts.3–5 Despite these observations, the gut microbiota composition of patients with IBD is heterogeneous and mainly influenced by disease behaviour together with the impact of clinical and environmental factors.6 7 As neither genetics nor microbiome studies have revealed the triggering factors for IBD, there is an increasing need to study host–microbial interactions in order to understand the aetiology and progression of the disease.8 9

To date, both mouse models and human studies have shown that IBD-associated genes interact with the intestinal microbiome via regulation of the mucosal physical barrier as well as immune responses. For example, the nucleotide-binding oligomerisation domain (NOD)-like receptor 2 (NOD2) is involved in the bacterial peptidoglycan recognition.10 It has been shown that NOD2 knock-out mice show ineffective recognition and clearance of bacterial pathogens. As a consequence, these mice present increased abundances of pathogenic bacteria from the Bacteroides and Escherichia genera.11–13 Another host–microbiome interaction involves ATG16L1, a gene implicated in autophagy. In patients with CD, ATG16L1-T300A mutation carriers have more pathosymbionts in their gut mucosa.14 Recently, genome-wide host–microbiota association analyses have reported correlations between variants in immune-related genes and microbial features. For example, IL10 has been associated with the abundance of Enterobacteriaceae15 and IL1R2 has been associated with the overall community composition (beta diversity).16

Host genetics–microbiome association studies have been described in cohorts based on the general population.15 16 These studies tend to miss the genetic signals that are more pronounced in a disease context like IBD. On the other hand, the microbial quantitative trait loci (mbQTL) studies in IBD cohorts available to date have been limited in either sample size or in genomic and microbiome resolution. Also, details in phenotypes capturing the heterogeneity present within IBD have been lacking in previous studies.17 18 The discovery of host–microbiota interactions, moreover, has been hampered by the large influence of intrinsic and environmental factors on the gut microbiome and relatively low microbial heritability.19

The aim of this study was to expand current knowledge of

WES and data processing

WES was performed on blood samples. Library preparation and sequencing were done at the Broad Institute of MIT and Harvard. On average, 86.06 million high-quality reads were generated per sample and 98.85% of reads were aligned to a human reference genome (hg19). Moreover, 81% of the exonic regions were covered with a read depth >30×. Next, the Genome Analysis Toolkit21 of the Broad Institute was used for variant calling. Variants with a call rate <0.99 or Hardy-Weinberg equilibrium test with p<0.0001 were excluded using the PLINK tool (V.1.9). To remove genetic outliers, we combined WES data with genomes of Europeans from publicly available 1000 Genomes Project (phase 3) data (http://www.internationalgenome.org/), and performed principal component analysis (PCA) based on single nucleotide polymorphisms (SNPs) shared between datasets. Outliers were defined as samples which fall outside of a mean±3 SD interval in both of the first two PCs, and these samples were removed. We also removed sex-mismatching samples based on the inbreeding coefficient (lower than 0.4 for females and higher than 0.7 for males) and related samples with identity-by-descent >0.185.22

GATK germline copy number variant (gCNV)23 was used for copy number variant (CNV) detection. GATK-gCNV uses a Bayesian model to adjust for known bias factors of exome capture and sequencing, such as GC content and mappability, while also controlling for other technical and systematic differences. Raw sequencing files are compressed into read counts over the set of exons defined under Gencode Annotation (V.33). After processing, variant quality and frequency filters (<1% site frequency) are applied to produce the final CNV callset (https://gatkforums.broadinstitute.org/gatk).

In summary, 73 164 common variants (minor allele frequency (MAF) >5%), 98 878 rare variants (MAF <5%) and 1046 CNVs (site frequency <1%) from 920 LifeLines-DEEP and 435 individuals with IBD were considered for further analyses.

Metagenomic sequencing and data processing

Metagenomic sequencing was performed for faecal samples, using the Illumina MiSeq platform. Reads belonging to the human genome were removed by mapping the data to the human reference genome (version NCBI37) with kneaddata (V.0.5.1,