Supplementary material Gut

Supplementary information

Title: Meta-analysis of genome-wide association studies and functional assays decipher susceptibility for gastric cancer in Chinese

Caiwang Yan1,2,3†, Meng Zhu1,2†, Yanbing Ding4†, Ming Yang5†, Mengyun Wang6,7†, Gang Li8†, Chuanli Ren9†, Tongtong Huang1, Wenjun Yang10, Bangshun He11, Meilin Wang2,12, Fei Yu1, Jinchen Wang1, Ruoxin Zhang6,7, Tianpei Wang1, Jing Ni1, Jiaping Chen1,2, Yue Jiang1,2, Juncheng Dai1,2, Erbao Zhang1,2, Hongxia Ma1,2, Yanong Wang13, Dazhi Xu13, Shukui Wang2,11, Yun Chen2,14, Zekuan Xu2,15, Jianwei Zhou2,16, Guozhong Ji2,17, Zhaoming Wang18, Zhengdong Zhang2,12, Zhibin Hu1,2,3, Qingyi Wei6,19,20, Hongbing Shen1,2, Guangfu Jin1,2,3*

*Correspondence to: Guangfu Jin, Department of Epidemiology, School of Public Health, Nanjing Medical University, Nanjing, 211166, China. Tel: +86-25-8686-8397, Fax: +86-25-8686-8499. E-mail: [email protected].

Supplementary information includes:

Supplementary Text

Supplementary Figures 1-24

Supplementary Tables 1-17

Yan C, et al. Gut 2019; 0:1–11. doi: 10.1136/gutjnl-2019-318760 Supplementary material Gut

Supplementary Text

Subjects of GWASs

Onco-GWAS: We recruited histopathologically confirmed gastric cancer cases from hospitals in Jiangsu province, China. Cancer-free control subjects were selected from individuals receiving routine physical examinations at hospitals or participating in community screening for non-communicable diseases in Jiangsu province. A total of 1,140 cases and 345 controls were genotyped using the Illumina OncoArray [1], and 708 controls were genotyped using Illumina OmniZhongHua chips.

NJ-GWAS and BJ-GWAS: These two studies recruited participants from Nanjing (565 cases and 1,162 controls) and Beijing (468 cases and 1,123 controls), respectively. All participants were genotyped using the Affymetrix Genome-Wide Human SNP Array 6.0. Details of the study design and relevant data were reported previously [2].

SX-GWAS: A total of 1,625 gastric cancer cases and 2,100 controls were drawn from the Shanxi Upper Gastrointestinal Cancer Genetics Project and the Linxian Nutrition Intervention Trial. All participants were genotyped using the Illumina 660W Quad chip. The study was reported elsewhere [3]; access to the publicly available data was approved, and the data were downloaded from dbGaP (https://www.ncbi.nlm.nih.gov/projects/gap/, study accession number: phs000361.v1.p1).

Quality control procedures for GWASs

We applied the same quality control protocol for genotypes and subjects to all four GWAS datasets. First, genotyped variants were excluded if they had a call rate of <95%, a P value for Hardy–Weinberg equilibrium (HWE) in controls of ≤1.0×10⁻⁶, or a minor allele frequency (MAF) of <1% in controls. Variants were flipped to the forward strand by comparing alleles or checking allele frequencies, and strand-ambiguous variants (A/T or C/G with MAF >0.45) were also excluded. Second, subjects were removed before imputation if they had a call rate of <95%, were outliers (>6 s.d. from the mean) in the population stratification or heterozygosity analyses, or were duplicated or related individuals (PI_HAT >0.25).
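The variant-level thresholds above can be expressed as a small filter. The following is a minimal sketch only: the `Variant` record and its field names are hypothetical, and in practice such filters are applied with dedicated genotype QC software rather than hand-written code.

```python
from dataclasses import dataclass

# Complementary allele pairs cannot be strand-resolved from the alleles alone.
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}


@dataclass
class Variant:
    """Hypothetical per-variant summary; field names are illustrative."""
    allele1: str
    allele2: str
    call_rate: float       # fraction of samples with a genotype call
    hwe_p_controls: float  # HWE P value computed in controls
    maf_controls: float    # minor allele frequency in controls


def is_strand_ambiguous(v: Variant, maf_cutoff: float = 0.45) -> bool:
    """A/T or C/G variant whose MAF is too close to 0.5 to infer the strand."""
    pair = frozenset((v.allele1.upper(), v.allele2.upper()))
    return pair in AMBIGUOUS_PAIRS and v.maf_controls > maf_cutoff


def passes_genotype_qc(v: Variant) -> bool:
    """Apply the pre-imputation variant thresholds stated in the text."""
    if v.call_rate < 0.95:         # call rate <95%
        return False
    if v.hwe_p_controls <= 1.0e-6:  # HWE P in controls <= 1.0e-6
        return False
    if v.maf_controls < 0.01:      # MAF in controls <1%
        return False
    return not is_strand_ambiguous(v)
```

An A/T variant with a MAF of 0.30 is retained under this rule because its strand can still be inferred from the allele frequency, whereas the same variant with a MAF of 0.47 is dropped.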

Imputation

We used SHAPEIT (v. 2.12) for phasing and IMPUTE2 (v. 2.3.1) for imputation, with a merged reference panel (n=2,504) from the 1000 Genomes Project Phase III (October 2014 release). In addition, we re-imputed the HLA region (Chr6: 28 Mb-34 Mb) using SNP2HLA with a Chinese reference panel of 10,689 samples [4]. Imputed variants were excluded from subsequent association analysis if they were poorly imputed (info <0.5), had a low frequency (MAF <0.01) or deviated from HWE (P <1×10⁻⁴). As a result, 6.23 to 7.34 million qualified variants were retained in each GWAS dataset, and a total of 6,192,596 shared variants were available for genetic association analysis. Of note, 661 subjects had also been genotyped previously using the ExomeArray [5], and the overall concordance rate was 97.80% for the genotypes of 20,792 variants.
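A concordance rate of this kind is the share of genotype pairs that agree between two platforms. The sketch below is illustrative only: it assumes both genotype sets have already been converted to hard calls coded as allele dosages (0, 1 or 2) and aligned by subject and variant, details the text does not specify.

```python
from typing import Optional, Sequence


def genotype_concordance(
    calls_a: Sequence[Optional[int]],
    calls_b: Sequence[Optional[int]],
) -> float:
    """Fraction of genotype pairs that agree, skipping pairs with a missing call.

    Genotypes are coded as allele dosages 0/1/2; None marks a missing call.
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("genotype vectors must have equal length")
    compared = 0
    matched = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # a pair is only informative when both platforms made a call
        compared += 1
        if a == b:
            matched += 1
    if compared == 0:
        raise ValueError("no overlapping non-missing genotypes")
    return matched / compared
```

For example, `genotype_concordance([0, 1, 2, None, 1], [0, 1, 1, 2, 1])` compares four pairs, of which three agree.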

Target sequencing

We performed targeted region resequencing for 1q22 (170 kb), 5p13.1 (216 kb) and 10q23.33 (162 kb) in 200 GC cases and 300 controls. The three regions, defined by linkage disequilibrium (LD) blocks, were captured using custom-designed probes based on the SureSelect Target Enrichment Platform (Agilent Technologies, Santa Clara, CA). The captured libraries were sequenced on a Genome Analyzer IIx (Illumina, San Diego, CA). Low-quality reads and adapters were filtered with the FASTQ Quality Filter Tool, and the qualified reads were aligned to the reference genome (hg19) using Burrows-Wheeler Aligner (BWA) V.0.6. Duplicates were marked with Picard Tools, and base qualities were recalibrated using the Genome Analysis Toolkit (GATK, V.3.7). We called variants using both GATK and FreeBayes and considered only the variants identified by both tools. For quality control, we removed 20 cases and 17 controls because of low depth (<10× across samples) or a low genotype concordance rate (<90%) compared with the chips in NJ-GWAS. A total of 3,066 variants (2,860 SNVs and 206 INDELs) were identified through the target sequencing, of which 805 were filtered out because of a low call rate (<90%) or deviation from HWE (P <1.0×10⁻⁴ in controls). As a result, a total of 881 variants in 1q22, 722 variants in 5p13.1 and 658 variants in 10q23.33 were available for further genetic association analysis in 180 cases and 283 controls.
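Keeping only variants reported by both callers amounts to a set intersection on a variant key. The sketch below assumes, hypothetically, that each call is keyed by chromosome, position, reference allele and alternate allele, and that indels have already been normalized to the same representation in both call sets. It also checks that the counts reported above are internally consistent.

```python
from typing import Iterable, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele)
VariantKey = Tuple[str, int, str, str]


def consensus_calls(
    caller_a: Iterable[VariantKey], caller_b: Iterable[VariantKey]
) -> Set[VariantKey]:
    """Return the variants reported by both callers."""
    return set(caller_a) & set(caller_b)


# Consistency check on the counts stated in the text.
identified = 2_860 + 206           # SNVs + INDELs
retained = identified - 805        # after call-rate and HWE filtering
per_region = 881 + 722 + 658       # 1q22 + 5p13.1 + 10q23.33
cases_after_qc = 200 - 20
controls_after_qc = 300 - 17
```

Here `identified` equals 3,066 and `retained` equals `per_region` (2,261), which matches the totals given in the paragraph; the sample counts after QC are 180 cases and 283 controls.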

Functional annotations

We performed functional annotation for variants in the flanking regions (1 Mb upstream and downstream) of the lead variants rs6897169 at 5p13.1 and rs10509671 at 10q23.33. Genetic variants highly correlated with the lead variants (r² >0.6, based on the CHB population from the 1000 Genomes Project Phase 3 data) were first annotated with ANNOVAR [6]. Sorting Intolerant from Tolerant (SIFT) [7] and Polymorphism Phenotyping v.2 (PolyPhen-2) [8] were applied to predict the effects of coding variants on protein structure and function. ChIP-seq data from stomach mucosa for histone modification markers (H3K4me1 and H3K4me3) were obtained from the Roadmap Epigenomics database [9], while ChIP-seq data for H3K4me1, H3K4me3 and H3K27ac in fetal stomach tissues were obtained from the Gene Expression Omnibus (GEO: GSM1102794, GSM1102800 and GSM1102783). DNaseI hypersensitive site (DHS) peaks from DNase-seq data for the fetal stomach were taken from four individuals (DS17963, H-23887, H-24342 and H-24510) in the Encyclopedia of DNA Elements (ENCODE) database. These data were visualized using the University of California Santa Cruz (UCSC) Genome Browser [10]. We performed expression quantitative trait loci (eQTL) analysis using data from 237 normal stomach tissue samples in the Genotype-Tissue Expression (GTEx) project [11].

Tissue samples

A total of 114 gastric biopsy tissues were obtained from subjects with superficial gastritis (n=30), atrophic gastritis (n=32), intestinal metaplasia (n=23) or dysplasia (n=29) at the Affiliated Hospital of Yangzhou University, Yangzhou, China. The precancerous lesion samples were obtained during endoscopic examination and frozen immediately at −80 °C. None of the individuals had used antibiotics within 2 months before the collection of biopsy samples, and none had taken proton pump inhibitors for at least two weeks before sample collection. All subjects provided informed consent for the collection of study specimens, and the study was approved by the institutional review board of Nanjing Medical University. These samples were used to determine PRKAA1 mRNA expression levels.

Differential expression analysis

We also obtained mRNA expression and genotype data for gastric cancer samples from The Cancer Genome Atlas (TCGA) as of April 8, 2015. Normalized expectation-maximization read counts were available for 415 samples, including 32 paired samples (tumors with adjacent normal tissues). The independent-samples t-test (415 tumor samples vs 32 normal tissues) and the paired-samples t-test (32 paired samples) were used to examine differences in gene expression between tumors and adjacent normal tissues.
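The two test statistics can be written out directly. This is a sketch under stated assumptions: the text does not say whether equal variances were assumed, so the independent-samples version below uses the pooled-variance (Student) form as an assumption, and the function names are illustrative. P values would be obtained from the t distribution with the stated degrees of freedom, which is left to a statistics package.

```python
import math
from statistics import mean, stdev, variance
from typing import Sequence


def paired_t(tumor: Sequence[float], normal: Sequence[float]) -> float:
    """t statistic for matched tumor/normal pairs (df = n - 1)."""
    if len(tumor) != len(normal) or len(tumor) < 2:
        raise ValueError("need at least two matched pairs")
    diffs = [t - n for t, n in zip(tumor, normal)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def independent_t(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Pooled-variance (Student) t statistic (df = n_a + n_b - 2)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    pooled = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (
        n_a + n_b - 2
    )
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / n_a + 1 / n_b))
```

The paired form uses only within-pair differences, so between-subject variation in baseline expression does not inflate its standard error.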

Luciferase reporter assays

Luciferase constructs were generated to include the potential regulatory elements encompassing the corresponding variants (chr5: 40755192-40756091 for rs3805495, chr5: 40798425-40799224 for rs59133000, chr5: 40835445-40836094 for rs10065570, chr10: 96052401-96053300 for region 1, chr10: 96074801-96075600 for region 2; hg19). Sequences were cloned into the pGL3-Basic or pGL3-Promoter Vector (Promega, Madison, WI, USA). The constructed plasmids were sequenced to confirm their accuracy. The immortalized human gastric epithelial cell line GES1 and the gastric cancer cell lines BGC823 and SGC7901 were plated into 24-well plates (7.5 × 10⁴ cells) and co-transfected the next day with the reporter vectors and the pRL-SV40 Renilla Luciferase Control Vector (Promega) using Lipofectamine 2000 reagent (Invitrogen, Carlsbad, CA, USA). After 48 hours of culture, the cells were lysed, and luciferase activity was measured using the Dual-Luciferase Reporter Assay System (Promega). Relative luminescent signals were calculated by normalizing luciferase signals to Renilla signals. Each cell line was used in 3 independent transfection experiments, and each experiment was performed in triplicate.
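The normalization step divides each well's firefly luciferase signal by its Renilla signal. The sketch below illustrates this; expressing the result as a fold change over a control vector is an added assumption for illustration, not something stated in the text, and the function names are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

Well = Tuple[float, float]  # (firefly signal, Renilla signal) for one well


def relative_luciferase(firefly: float, renilla: float) -> float:
    """Firefly signal normalized to the co-transfected Renilla control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_over_control(construct_wells: List[Well], control_wells: List[Well]) -> float:
    """Mean normalized signal of a construct relative to a control vector.

    Assumption: replicate wells are averaged after per-well normalization.
    """
    construct = mean([relative_luciferase(f, r) for f, r in construct_wells])
    control = mean([relative_luciferase(f, r) for f, r in control_wells])
    return construct / control
```

Normalizing per well before averaging keeps differences in transfection efficiency between wells from being mistaken for differences in reporter activity.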

TNF-α treatment

BGC823 cells were seeded into 60 mm cell culture dishes at a density of 5 × 10⁵ cells per dish and treated with TNF-α (an NF-κB inducer; H8916, Sigma) at various concentrations (0, 20, 100 or 200 ng/mL) for 1 or 3 h.

Motif analysis

Transcription factor binding site analysis was performed with JASPAR 2018 [12] and the enhancer element locator (EEL) algorithm [13]. To report only the most likely sites, stringent thresholds were applied: a relative profile score threshold of 85% for JASPAR, restricted to “Homo sapiens”, and an absolute cutoff of 9 for EEL, using matrices downloaded from the JASPAR database. The variants rs59133000, rs3781266 and rs3740365 and their flanking sequences overlap with NFKB1, POU2F1 and PAX3 motifs, respectively, in the prediction results of both JASPAR and EEL.

Electrophoretic mobility shift assay (EMSA)

Nuclear extracts were prepared from the gastric cancer cell line BGC823 using the NE-PER Nuclear and Cytoplasmic Extraction kit (Thermo Scientific, Waltham, MA, USA). DNA oligonucleotides for each variant were synthesized with 5’-biotin labeling and HPLC purified by Sangon Biotech (Shanghai, China; probe sequences are listed in Supplementary Table 14). Double-stranded DNA probes were prepared by combining sense and antisense oligonucleotides, heat annealing and slow cooling. Probes and BGC823 nuclear extracts were then incubated at 4 °C for 20 min using the LightShift EMSA Optimization & Control Kit (Thermo Scientific). For competition assays, unlabeled competitor oligonucleotides at 10-fold or 100-fold excess were added to the reaction mixture 10 min before the addition of labeled probes. For supershift assays, the supershift antibody (NFKB1, #3035, Cell Signaling Technology, Danvers, MA, USA) or normal rabbit IgG (Millipore, Billerica, MA, USA) was incubated with the nuclear extract before incubation with poly(dI:dC) at 4 °C for 1 h. After incubation, binding reactions were separated on a 6% polyacrylamide gel and transferred to membranes; the blots were developed using the Chemiluminescent Nucleic Acid Detection Module (Thermo Scientific), and signals were visualized with the ChemiDoc XRS+ scanner (Bio-Rad, Louisville, KY, USA).

Chromatin immunoprecipitation (ChIP) assay

ChIP assays were performed using the EZ-Magna ChIP Kit (Millipore) according to the manufacturer’s instructions. Briefly, after stimulation with TNF-α (100 ng/mL) for 3 h, cells were cross-linked with 1% formaldehyde at room temperature for 10 min. Nuclear lysates were subsequently sonicated to generate 200-1,000 bp chromatin fragments using the Bioruptor sonicator VCX150 (SONICS & MATERIALS, Newtown, CT, USA), with cycles of 20 s ON and 40 s OFF for a total of 9 min. The supernatant was collected and incubated at 4 °C overnight with ChIP antibodies bound to magnetic beads: H3K4me1 (ab8895, Abcam, Cambridge, UK), H3K4me3 (ab8580, Abcam), H3K27ac (ab4729, Abcam), NFKB1 (#3035, Cell Signaling Technology), POU2F1 (#8157, Cell Signaling Technology) and PAX3 (University of Iowa Hybridoma Bank, Iowa City, IA, USA). Purified immunoprecipitated DNA was assayed by TB Green (Takara, Japan) qPCR for enrichment of target sites using the primers listed in Supplementary Table 15.

RNAi, plasmid and transient transfection

Specific siRNAs targeting PRKAA1 and NOC3L were custom-designed and provided by Invitrogen, while specific siRNAs targeting NFKB1 were provided by Ribobio (Guangzhou, China) (Supplementary Table 16). The pENTER-PRKAA1 plasmid was constructed by Vigene Biosciences (Shandong, China), while the pEGFP-C1-NOC3L plasmid was constructed by Xinkeyuan (Nanjing, China). A total of 1.0 × 10⁵ cells were seeded on 60-mm culture plates and transfected with oligonucleotides or plasmids using Lipofectamine 2000 (Invitrogen) according to the manufacturer’s instructions. The medium was changed after 24 h, and the transfected cells were incubated at 37 °C with 5% CO2 and collected after 48 h.

CRISPR/Cas9-mediated knockout of PRKAA1

Deletion of PRKAA1 with the CRISPR/Cas9 system was described previously [5]. Briefly, guide RNAs were designed to recognize chr5:40,771,905-40,771,927 (PRKAA1) and cloned into PGL3. Constructs were introduced into the gastric cancer cell line BGC823 using Lipofectamine 2000 reagent (Invitrogen) along with a plasmid encoding Cas9 (1.0 μg of single-guide RNA [sgRNA] and 2.0 μg of Cas9) according to the manufacturer’s protocol (Invitrogen). After 24 hours, puromycin (1.0 μg/mL, Gibco, Carlsbad, CA, USA) and blasticidin (10.0 μg/mL, InvivoGen, San Diego, CA, USA) were added to the medium for 48 hours, and single clones were subsequently selected through serial dilution. Knockout of the PRKAA1 gene was confirmed by sequencing and western blotting (Supplementary Figure 6). The sgRNA sequences and the primers used for amplifying and sequencing the sgRNA target site are listed in Supplementary Table 17.

Lentiviral constructs, lentivirus production and infection

Lentiviral shRNA constructs targeting NOC3L were purchased from GENECHEM (Shanghai, China). An shRNA with a non-targeting sequence was used as a negative control. For viral transduction, BGC823 cells were seeded in 6-well plates at a density of 60%–70%. After 16-20 h, the cell culture medium was replaced with lentivirus-containing medium. For lentivirus-mediated knockdown, the virus was removed 24 h later and replaced with normal medium containing puromycin (Gibco) at a final concentration of 1.5 μg/mL. Once the uninfected control cells had died completely, the target cells were cultured in normal growth medium with 1.0 μg/mL puromycin. Knockdown of the NOC3L gene was confirmed by real-time PCR and western blotting (Supplementary Figure 7).

RNA extraction, reverse transcription and quantitative real-time PCR

Cells were lysed with TRIzol LS Reagent (Invitrogen), and total RNA was extracted using the RNeasy Mini Kit (QIAGEN, Dusseldorf, Germany) according to the manufacturer’s instructions. All samples were of good quality, and 500 ng of RNA from each sample was reverse transcribed to generate complementary DNA (cDNA) using PrimeScript™ RT Master Mix (Takara) according to the manufacturer’s instructions. RT-qPCR was performed using TB Green™ Premix Ex Taq™ (Tli RNaseH Plus) (Takara) on a QuantStudio 7 Flex Real-Time PCR System (Applied Biosystems, Carlsbad, CA, USA). Results were analyzed using the comparative Ct method, normalizing to a control sample and housekeeping primers (Supplementary Table 16).
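The comparative Ct method is commonly computed as 2 raised to the power of −ΔΔCt. The sketch below shows that calculation under the usual assumption of approximately 100% amplification efficiency for both the target and the housekeeping gene; the function and argument names are illustrative.

```python
def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Relative expression by the comparative Ct (2^-ddCt) method.

    Assumes roughly 100% amplification efficiency, so that one Ct cycle
    corresponds to a two-fold difference in template amount.
    """
    # Normalize the target gene to the housekeeping gene within each sample.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the normalized sample against the normalized control sample.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

For instance, a target gene that crosses the threshold two cycles earlier in the treated sample than in the control, with identical housekeeping Ct values, gives a four-fold relative expression.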

Western blotting

Denatured proteins were separated by 10% SDS-PAGE and transferred to a 0.45-μm PVDF membrane (Millipore). Membranes were blocked in 5% non-fat milk at room temperature and then incubated with the appropriate primary antibodies at 4 °C overnight and with secondary antibodies at room temperature for 1 hour. The primary antibodies used for western blotting were rabbit anti-PRKAA1 (1:1000, ab32047, Abcam), rabbit anti-NOC3L (1:2000, ab85653, Abcam) and mouse anti-GAPDH (1:1000, AG019, Beyotime). The secondary antibodies used were anti-mouse (A0216, Beyotime) and anti-rabbit (A0208, Beyotime). Antibody-bound proteins were visualized with the ChemiDoc XRS+ scanner (Bio-Rad).

Cell viability and proliferation assays

Cell viability was measured using a Cell Counting Kit-8 (CCK8, Dojindo, Japan) according to the manufacturer’s instructions. Briefly, cells from each treatment group were collected, reseeded into 96-well plates and incubated at 37 °C overnight. Next, 10 μL of CCK8 was added to each well containing 100 μL of RPMI 1640 at the indicated time points, and the absorbance was measured at 450 nm two hours later using a microplate reader. The values were obtained from three replicate wells for each treatment and time point. Each sample was tested with five replicates. For the colony formation assay, cells from each group were seeded on a 6-well plate in complete medium and incubated at 37 °C and 5% CO2 until visible colonies appeared. Colonies were washed with pre-cooled PBS, fixed in methanol for 20 min, stained with crystal violet solution (Beyotime, Shanghai, China) for 15 min and quantified. The experiment was repeated three times.

EdU incorporation assay

EdU incorporation was determined using 5-ethynyl-2’-deoxyuridine (EdU) with the Cell-Light™ EdU Apollo 567 In Vitro Kit (Ribobio, Guangzhou, China) according to the manufacturer’s protocol. Briefly, cells were seeded in triplicate at 5×10⁴ cells per well in 24-well plates and incubated at 37 °C overnight. The cells were then exposed to 50 mM of EdU for an additional 2 h at 37 °C under 5% CO2. Afterwards, the cultured cells were fixed with 4% paraformaldehyde for 30 min and permeabilized with 0.5% Triton X-100 for 20 min at room temperature. Subsequently, the cells were incubated with DAPI at room temperature for 30 min and observed under a fluorescence microscope (ECLIPSE-Ti, Nikon, Japan). The EdU incorporation rate was expressed as the ratio of EdU-positive cells (red cells) to total DAPI-positive cells (blue cells), which were counted using Image-Pro Plus (IPP) 6.0 software (Media Cybernetics).

Mouse xenograft tumor model

Athymic nude mice were purchased from the Vital River Laboratory Animal Technology Co. (Beijing, China) and maintained in laminar flow cabinets under specific pathogen-free conditions. Cells (3.0×10⁶) were injected subcutaneously into the bilateral armpits of 5-6-week-old male BALB/c nude mice. Tumor growth was monitored by measuring tumor diameters twice a week. Tumor volumes were calculated as length × width² × 0.5. Twenty-one days after injection, the mice were killed, and tumor weights were measured and analyzed. Animal care and handling procedures were performed in accordance with the National Institutes of Health’s Guide for the Care and Use of Laboratory Animals and were approved by the Committee on the Ethics of Animal Experiments of Nanjing Medical University (Nanjing, China).
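The volume formula above can be applied to each caliper measurement as follows. This is a minimal sketch; the millimetre units and the function name are assumptions, since the text does not state the measurement units.

```python
def tumor_volume(length: float, width: float) -> float:
    """Tumor volume as length x width^2 x 0.5.

    With diameters in mm (an assumption), the result is in mm^3.
    """
    if length < 0 or width < 0:
        raise ValueError("diameters must be non-negative")
    return length * width ** 2 * 0.5


# Hypothetical twice-weekly (length, width) measurements for one tumor.
measurements = [(4.0, 3.0), (6.0, 5.0), (10.0, 8.0)]
growth_curve = [tumor_volume(length, width) for length, width in measurements]
```

Because the width enters squared, measurement error in the shorter diameter has a larger effect on the estimated volume than error in the longer one.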

Methods for replication studies

In the first stage (Replication I), 33 variants were included because they: 1) had an association P <1.0×10⁻³ in the meta-analysis; 2) showed no heterogeneity of results among studies (I² ≤0.75 and P >1×10⁻⁴); 3) were located outside known regions (at least 1 Mb away); and 4) were proxies for their respective regions after LD pruning. In the second stage (Replication II), 5 variants were genotyped using the TaqMan allelic discrimination assay with an ABI 7900 system. Information on primers and probes is available on request. Two blank controls were included in each 384-well plate for quality control, and genotyping was performed by technicians who were blinded to case-control status. For the replication stages, data analysis and management were performed with PLINK (v.1.9). In the Replication I stage, age, gender, and smoking and drinking status were adjusted for as covariates, while in the Replication II stage, age and gender were available for multivariate adjustment in all four cohorts, with smoking and drinking status available specifically for the Shanghai cohort. Results from the different studies were then combined with a fixed-effect meta-analysis.
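A fixed-effect meta-analysis of this kind is typically an inverse-variance weighted average of per-study log odds ratios, with Cochran's Q and I² summarizing heterogeneity. The sketch below illustrates that standard calculation; the text does not name the software or exact estimator used for this step, so the implementation and all function names are assumptions.

```python
import math
from typing import Dict, List, Tuple


def log_or_and_se(or_: float, lower: float, upper: float, z: float = 1.96) -> Tuple[float, float]:
    """Recover the log odds ratio and its standard error from an OR and 95% CI."""
    return math.log(or_), (math.log(upper) - math.log(lower)) / (2 * z)


def fixed_effect_meta(estimates: List[Tuple[float, float]]) -> Dict[str, float]:
    """Inverse-variance fixed-effect meta-analysis of (beta, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    # Cochran's Q and I^2 (as a fraction between 0 and 1) for heterogeneity.
    q = sum(w * (b - beta) ** 2 for w, (b, _) in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"beta": beta, "se": se, "or": math.exp(beta), "p": p, "q": q, "i2": i2}
```

Each study contributes in proportion to the inverse of its variance, so larger studies with narrower confidence intervals dominate the pooled estimate.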

Gene-based analyses

We performed gene-based analyses on the basis of single-variant P values from the meta-analysis, as implemented in MAGMA (Multi-marker Analysis of GenoMic Annotation) [14]. As defined in NCBI build 37.3, a total of 18,672 genes covered by at least one variant from the meta-analysis were included in the association analysis. In the gene-based analysis, LD between markers was estimated from the reference genomes of East Asian populations sequenced as part of the 1000 Genomes Project (Phase 1, release 3). We applied the Benjamini-Hochberg method for false discovery rate (FDR) estimation to correct for multiple testing, setting the significance threshold at an FDR of 0.05.
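The Benjamini-Hochberg adjustment ranks the P values and scales each by the number of tests divided by its rank, then enforces monotonicity from the largest P value downward. The sketch below is a generic implementation of that procedure, not the code used in the study; the function name is illustrative.

```python
from typing import List, Sequence


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest, keeping a running minimum
    # so that adjusted values never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_at_fdr(pvalues: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Flag tests whose adjusted P value is at or below the FDR threshold."""
    return [q <= fdr for q in benjamini_hochberg(pvalues)]
```

A gene is then declared significant when its adjusted P value does not exceed 0.05, which corresponds to the threshold stated above.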

Gene set pathway analyses

We conducted gene set pathway analyses in MAGMA based on the gene-based analysis results to evaluate gene enrichment. A total of 4,622 curated gene sets representing known biological and metabolic pathways were derived from the KEGG (Kyoto Encyclopedia of Genes and Genomes) [15] and GO (Gene Ontology) [16] databases, as catalogued by and obtained from MSigDB version 6.20 [17]. To correct for multiple testing, the default value of 10,000 permutations was applied. The estimate of the effect size (beta), obtained by fitting a regression model to the data, reflects the difference in association between genes in the gene set and genes outside the gene set.

References

1. Amos CI, Dennis J, Wang Z, et al. The OncoArray Consortium: A Network for Understanding the Genetic Architecture of Common Cancers. Cancer Epidemiol Biomarkers Prev 2017;26:126-35.
2. Shi Y, Hu Z, Wu C, et al. A genome-wide association study identifies new susceptibility loci for non-cardia gastric cancer at 3q13.31 and 5p13.1. Nat Genet 2011;43:1215-8.
3. Abnet CC, Freedman ND, Hu N, et al. A shared susceptibility locus in PLCE1 at 10q23 for gastric adenocarcinoma and esophageal squamous cell carcinoma. Nat Genet 2010;42:764-7.
4. Zhou F, Cao H, Zuo X, et al. Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease. Nat Genet 2016;48:740-6.
5. Zhu M, Yan C, Ren C, et al. Exome Array Analysis Identifies Variants in SPOCD1 and BTN3A2 That Affect Risk for Gastric Cancer. Gastroenterology 2017;152:2011-21.
6. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res 2010;38:e164.
7. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 2009;4:1073-81.
8. Adzhubei I, Jordan DM, Sunyaev SR. Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet 2013;Chapter 7:Unit7.20.
9. Roadmap Epigenomics Consortium, Kundaje A, Meuleman W, Ernst J, et al. Integrative analysis of 111 reference human epigenomes. Nature 2015;518:317-30.
10. Casper J, Zweig AS, Villarreal C, et al. The UCSC Genome Browser database: 2018 update. Nucleic Acids Res 2018;46:D762-D769.
11. GTEx Consortium. The Genotype-Tissue Expression (GTEx) project. Nat Genet 2013;45:580-5.
12. Khan A, Fornes O, Stigliani A, et al. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res 2018;46:D260-D266.
13. Hallikas O, Palin K, Sinjushina N, et al. Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity. Cell 2006;124:47-59.
14. de Leeuw CA, Mooij JM, Heskes T, et al. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput Biol 2015;11:e1004219.
15. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27-30.
16. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000;25:25-9.
17. Liberzon A, Subramanian A, Pinchback R, et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 2011;27:1739-40.


Supplementary Figure 2. Ethnicity and population structure determined by the top two principal components for each study (panels A: Onco-GWAS; B: NJ-GWAS; C: BJ-GWAS; D: SX-GWAS). Left: the relatedness between the studied subjects, together with European (CEU), African (YRI), Chinese (CHB) and Japanese (JPT) data from the HapMap project. Right: the population structures of the cases and the controls.


Supplementary Figure 3. Imputation quality assessment for each study. The proportion of high-quality SNPs (info > 0.8) is plotted by minor allele frequency (MAF).


Supplementary Figure 5. Manhattan plots from the individual GWAS datasets of gastric cancer (panels A: Onco-GWAS; B: NJ-GWAS; C: BJ-GWAS; D: SX-GWAS). The associations (-log10 (P) values, Y-axis) are plotted against genomic position (X-axis by chromosome and the chromosomal position of NCBI build 37). The purple horizontal line corresponds to a P-value of 1.0×10⁻⁵.


Supplementary Figure 9. Regional association plots of 1q22 (panel A), 5p13.1 (panel B) and 10q23.33 (panel C) based on target sequencing. In each panel, variants (dots) are shown by chromosomal position (GRCh37/hg19 human genome build) on the X axis and -log10 association P-value on the Y axis, with the gray line marking P = 0.05. For each locus, the association results of all captured variants are shown in the upper panel. P values are presented before (upper panel) and after (bottom panel) adjusting for the most significant variant in the corresponding region.


Supplementary Figure 24. Manhattan plots of the gene-based associations with gastric cancer risk. The associations (-log10 (P) values, Y-axis) are plotted against genomic position (X-axis by chromosome and the chromosomal position of NCBI build 37). The green horizontal line corresponds to the P-value of 0.05 after FDR correction.


Supplementary Table 1. The characteristics of subjects included in the four GWASs.

Variables | Onco-GWAS a: Cases (n=1,140) | Onco-GWAS a: Controls (n=1,053) | NJ-GWAS: Cases (n=550) | NJ-GWAS: Controls (n=1,155) | BJ-GWAS: Cases (n=456) | BJ-GWAS: Controls (n=1,118) | SX-GWAS a: Cases (n=1,625) | SX-GWAS a: Controls (n=2,100)
Age (Mean ± S.D.) | 61.42 ± 10.76 | 61.04 ± 8.68 | 58.24 ± 11.91 | 59.02 ± 9.74 | 56.79 ± 12.42 | 62.44 ± 9.17 | - | -
Age (%)
<60 | 442 (38.77) | 541 (51.37) | 292 (53.09) | 583 (50.48) | 256 (56.14) | 437 (39.09) | 731 (44.98) | 1,000 (47.62)
≥60 | 698 (61.23) | 512 (48.62) | 258 (46.91) | 572 (49.52) | 200 (43.86) | 681 (60.91) | 894 (55.02) | 1,100 (52.38)
Gender (%)
Male | 844 (74.04) | 751 (71.32) | 392 (71.27) | 823 (71.26) | 322 (70.61) | 873 (78.09) | 1,260 (77.54) | 1,430 (68.10)
Female | 296 (25.96) | 302 (28.68) | 158 (28.73) | 332 (28.74) | 134 (29.39) | 245 (21.91) | 365 (22.46) | 670 (31.90)
Smoking status (%) b
Smokers | 441 (40.35) | 524 (49.76) | 250 (45.45) | 565 (48.92) | 129 (28.29) | 590 (52.77) | - | -
Nonsmokers | 652 (59.65) | 528 (50.14) | 300 (54.55) | 590 (51.08) | 327 (71.71) | 528 (47.23) | - | -
Drinking status (%) c
Drinkers | 392 (35.73) | 397 (37.70) | 212 (38.55) | 621 (53.77) | 86 (18.86) | 415 (37.12) | - | -
Nondrinkers | 705 (64.27) | 656 (62.30) | 338 (61.45) | 534 (46.23) | 370 (81.14) | 703 (62.88) | - | -

a Smoking and drinking status were available for a subset of subjects in the Onco-GWAS but not for SX-GWAS.
b Smokers were defined as individuals who smoked at least one cigarette per day for more than one year during their lifetime; otherwise, they were considered nonsmokers.
c Drinkers were defined as individuals who drank an average of twice or more per week for at least one year in their lifetime; otherwise, they were considered nondrinkers.


Supplementary Table 2. The characteristics of subjects included in the replication studies.

Replication II a cohorts: Shandong (SD), Ningxia (NX), Jiangsu (JS) and Shanghai (SH).

Variables | Replication I: Cases (n=1,710) | Replication I: Controls (n=1,802) | SD: Cases (n=1,610) | SD: Controls (n=2,048) | NX: Cases (n=801) | NX: Controls (n=737) | JS: Cases (n=1,789) | JS: Controls (n=2,587) | SH: Cases (n=1,125) | SH: Controls (n=1,149)
Age (Mean ± S.D.) | 61.66 ± 9.65 | 60.55 ± 10.38 | 59.58 ± 11.80 | 60.38 ± 17.37 | 58.58 ± 10.32 | 64.59 ± 7.71 | 62.74 ± 10.66 | 62.08 ± 10.57 | 58.60 ± 11.36 | 59.04 ± 11.86
Age (%)
<60 | 674 (39.42) | 723 (40.12) | 739 (45.90) | 924 (45.12) | 349 (43.57) | 151 (20.49) | 621 (34.71) | 1,022 (39.51) | 578 (51.38) | 579 (50.39)
≥60 | 1,036 (60.58) | 1,079 (59.88) | 869 (53.98) | 1,124 (54.88) | 348 (43.45) | 586 (79.51) | 1,158 (64.73) | 1,565 (60.49) | 547 (48.62) | 570 (49.61)
Gender (%)
Male | 1,263 (73.86) | 1,395 (77.41) | 1,253 (77.83) | 1,660 (81.05) | 624 (77.90) | 299 (40.57) | 1,255 (70.35) | 1,679 (64.90) | 800 (71.11) | 800 (69.63)
Female | 447 (26.14) | 407 (22.59) | 357 (22.17) | 388 (18.95) | 177 (22.10) | 438 (59.43) | 529 (29.65) | 908 (35.10) | 325 (28.89) | 349 (30.37)
Smoking status (%) b
Smokers | 813 (47.54) | 904 (50.17) | 203 (38.89) | 596 (59.13) | 338 (46.49) | - | 604 (37.89) | 514 (44.62) | 439 (39.02) | 430 (37.42)
Nonsmokers | 897 (52.46) | 898 (49.83) | 319 (61.11) | 412 (40.87) | 389 (53.51) | - | 990 (62.11) | 638 (55.38) | 686 (60.98) | 719 (62.58)
Drinking status (%) c
Drinkers | 652 (38.13) | 832 (46.17) | 193 (37.12) | 448 (44.44) | 190 (26.17) | - | 458 (28.75) | 353 (30.64) | 270 (24.00) | 267 (23.24)
Nondrinkers | 1,058 (61.87) | 970 (53.83) | 327 (62.88) | 560 (55.56) | 536 (73.83) | - | 1,135 (71.25) | 799 (69.36) | 855 (76.00) | 882 (76.76)

a Smoking and drinking status were available for a subset of subjects in the replication II studies.
b Smokers were defined as individuals who smoked at least one cigarette per day for more than one year during their lifetime; otherwise, they were considered nonsmokers.
c Drinkers were defined as individuals who drank an average of twice or more in a week for at least one year in their lifetime; otherwise, they were considered nondrinkers.


Supplementary Table 3. The results of 33 variants selected for the two-stage replication.

No. Cytoband (position) Variant (referent, effect allele) Nearby gene Study Effect allele frequency (case, control) OR(95% CI) a P a

1 1p36.31 (6104029) rs3789562 (A,G) KCNAB2 Onco-GWAS 0.136,0.162 0.74(0.61-0.92) 5.12E-03

NJ-GWAS 0.117,0.133 0.79(0.59-1.04) 8.97E-02

BJ-GWAS 0.114,0.137 0.74(0.54-1.03) 7.48E-02

SX-GWAS 0.151,0.164 0.83(0.71-0.98) 2.51E-02

Replication I 0.130,0.127 1.03(0.89-1.19) 6.83E-01

2 2p25.3 (2775739) rs67723639 (G,A) AC018685.2 Onco-GWAS 0.316,0.358 0.78(0.68-0.90) 9.49E-04

NJ-GWAS 0.343,0.368 0.83(0.68-1.00) 5.44E-02

BJ-GWAS 0.337,0.362 0.85(0.68-1.06) 1.59E-01

SX-GWAS 0.346,0.365 0.92(0.83-1.02) 9.71E-02

Replication I 0.359,0.341 1.09(0.98-1.21) 1.09E-01

3 2p21 (43358910) rs12613605 (G,T) HAAO, ZFP36L2 Onco-GWAS 0.243,0.244 1.00(0.87-1.15) 9.74E-01

NJ-GWAS 0.247,0.204 1.32(1.10-1.59) 2.55E-03

BJ-GWAS 0.241,0.212 1.18(0.96-1.46) 1.22E-01

SX-GWAS 0.272,0.233 1.25(1.12-1.39) 4.74E-05

Replication I 0.226,0.229 0.98(0.88-1.10) 7.41E-01

4 2q34 (212851151) rs35850242 (G,A) ERBB4 Onco-GWAS 0.369,0.320 1.27(1.12-1.44) 2.56E-04

NJ-GWAS 0.381,0.335 1.25(1.07-1.46) 4.76E-03

BJ-GWAS 0.360,0.330 1.18(0.99-1.41) 6.15E-02

SX-GWAS 0.365,0.349 1.07(0.97-1.17) 1.84E-01

Replication I 0.352,0.344 1.04(0.94-1.15) 4.15E-01

5 2q36.1 (222520241) rs78206742 (C,T) EPHA4, PAX3 Onco-GWAS 0.090,0.108 0.56(0.38-0.81) 2.33E-03

NJ-GWAS 0.065,0.085 0.69(0.51-0.95) 2.26E-02

BJ-GWAS 0.067,0.090 0.69(0.49-0.99) 4.58E-02

SX-GWAS 0.082,0.099 0.78(0.66-0.93) 5.74E-03

Replication I 0.100,0.101 1.00(0.85-1.16) 9.68E-01

6 3p22.3 (32876688) rs77799149 (C,A) TRIM71 Onco-GWAS 0.106,0.133 0.77(0.64-0.93) 6.06E-03

NJ-GWAS 0.092,0.135 0.66(0.52-0.83) 3.74E-04

BJ-GWAS 0.082,0.121 0.70(0.53-0.92) 1.15E-02

SX-GWAS 0.105,0.115 0.91(0.78-1.06) 2.23E-01

Replication I 0.118,0.140 0.82(0.71-0.95) 7.68E-03

Replication II-SD 0.121,0.134 0.88(0.70-1.10) 2.67E-01

Replication II-NX 0.116,0.107 1.23(0.94-1.61) 1.29E-01

Replication II-JS 0.130,0.134 0.97(0.85-1.10) 6.22E-01

7 3q11.2 (94108663) rs7624041 (G,A) NSUN3 Onco-GWAS 0.098,0.077 1.40(1.12-1.75) 3.04E-03

NJ-GWAS 0.089,0.077 1.20(0.91-1.57) 2.00E-01

BJ-GWAS 0.115,0.081 1.34(1.00-1.80) 5.23E-02


SX-GWAS 0.116,0.109 1.10(0.95-1.28) 1.95E-01

Replication I 0.096,0.075 1.31(1.10-1.55) 2.05E-03

Replication II-SD 0.114,0.089 1.33(1.14-1.55) 2.91E-04

Replication II-NX 0.105,0.089 1.30(0.98-1.72) 6.60E-02

Replication II-JS 0.079,0.079 1.02(0.87-1.19) 8.39E-01

Replication II-SH 0.082,0.072 1.15(0.92-1.43) 2.29E-01

8 4q13.3 (70513710) rs1432329 (T,C) UGT2A1 Onco-GWAS 0.220,0.247 0.85(0.73-0.98) 3.02E-02

NJ-GWAS 0.192,0.227 0.81(0.68-0.98) 2.67E-02

BJ-GWAS 0.187,0.229 0.78(0.63-0.96) 1.98E-02

SX-GWAS 0.200,0.220 0.87(0.78-0.98) 1.65E-02

Replication I 0.234,0.230 1.02(0.91-1.15) 7.02E-01

9 4q28.1 (125451364) rs10029005 (G,A) LOC285419, ANKRD50 Onco-GWAS 0.354,0.326 1.13(1.00-1.29) 5.04E-02

NJ-GWAS 0.344,0.306 1.21(1.04-1.42) 1.65E-02

BJ-GWAS 0.365,0.339 1.14(0.95-1.37) 1.58E-01

SX-GWAS 0.372,0.336 1.17(1.06-1.28) 1.59E-03

Replication I 0.357,0.328 1.15(1.04-1.27) 8.34E-03

Replication II-SD 0.365,0.341 1.11(1.00-1.22) 4.18E-02

Replication II-NX 0.373,0.334 1.16(0.97-1.38) 1.08E-01

Replication II-JS 0.356,0.327 1.15(1.05-1.26) 3.28E-03

Replication II-SH 0.355,0.334 1.10(0.97-1.25) 1.26E-01

10 5q23.2 (121708588) rs6595364 (A,G) SNCAIP Onco-GWAS 0.043,0.057 0.66(0.48-0.91) 1.07E-02

NJ-GWAS 0.041,0.062 0.64(0.47-0.89) 7.16E-03

BJ-GWAS 0.043,0.066 0.68(0.47-0.97) 3.34E-02

SX-GWAS 0.033,0.035 0.89(0.67-1.94) 4.45E-01

Replication I 0.049,0.059 0.82(0.66-1.01) 6.40E-02

11 5q33.3 (156624087) rs31225 (A,T) ITK Onco-GWAS 0.347,0.384 0.85(0.75-0.97) 1.37E-02

NJ-GWAS 0.312,0.344 0.86(0.73-1.02) 8.35E-02

BJ-GWAS 0.331,0.362 0.90(0.75-1.09) 2.76E-01

SX-GWAS 0.326,0.358 0.86(0.79-0.95) 2.89E-03

Replication I 0.371,0.376 0.98(0.88-1.08) 6.54E-01

12 6p12.3 (47745580) rs6921311 (T,C) OPN5 Onco-GWAS 0.096,0.134 0.70(0.58-0.85) 2.32E-04

NJ-GWAS 0.115,0.113 1.03(0.82-1.31) 7.74E-01

BJ-GWAS 0.100,0.137 0.67(0.52-0.87) 2.14E-03

SX-GWAS 0.099,0.111 0.86(0.74-1.00) 4.94E-02

Replication I 0.108,0.106 1.02(0.88-1.20) 7.80E-01

13 6q21 (109519922) rs7767017 (A,G) C6orf183 Onco-GWAS 0.173,0.232 0.70(0.60-0.81) 1.63E-06

NJ-GWAS 0.200,0.219 0.89(0.75-1.07) 2.12E-01

BJ-GWAS 0.219,0.238 0.88(0.72-1.08) 2.17E-01

SX-GWAS 0.204,0.224 0.88(0.79-0.99) 2.94E-02


Replication I 0.204,0.218 0.92(0.82-1.04) 2.02E-01

14 6q24.1 (142336861) rs11155229 (C,G) AK097143 Onco-GWAS 0.047,0.034 1.40(1.02-1.91) 3.55E-02

NJ-GWAS 0.058,0.042 1.56(1.04-2.33) 3.21E-02

BJ-GWAS 0.063,0.042 1.75(1.08-2.82) 2.23E-02

SX-GWAS 0.042,0.035 1.19(0.92-1.53) 1.78E-01

Replication I 0.032,0.033 0.97(0.74-1.27) 8.08E-01

15 7p22.1 (6488097) rs7807755 (C,T) DAGLB Onco-GWAS 0.258,0.233 1.15(1.00-1.32) 4.83E-02

NJ-GWAS 0.246,0.232 1.09(0.91-1.31) 3.50E-01

BJ-GWAS 0.254,0.241 1.15(0.93-1.42) 1.97E-01

SX-GWAS 0.241,0.203 1.28(1.14-1.44) 3.65E-05

Replication I 0.241,0.252 0.95(0.85-1.07) 3.88E-01

16 7q36.1 (152372924) rs2106776 (G,A) XRCC2 Onco-GWAS 0.270,0.308 0.82(0.72-0.94) 4.36E-03

NJ-GWAS 0.274,0.281 0.94(0.80-1.12) 5.02E-01

BJ-GWAS 0.253,0.310 0.78(0.65-0.94) 9.38E-03

SX-GWAS 0.261,0.284 0.88(0.79-0.98) 2.15E-02

Replication I 0.285,0.286 1.00(0.89-1.11) 9.53E-01

17 7q36.1 (152408207) rs12667127 (A,G) XRCC2, ACTR3B Onco-GWAS 0.489,0.428 1.36(1.19-1.55) 4.57E-06

NJ-GWAS 0.499,0.478 1.11(0.94-1.32) 2.29E-01

BJ-GWAS 0.511,0.461 1.34(1.10-1.64) 4.01E-03

SX-GWAS 0.488,0.478 1.06(0.96-1.18) 2.69E-01

Replication I 0.584,0.567 1.08(0.98-1.20) 1.21E-01

18 8p23.2 (2277027) rs315232 (C,A) AC133633.2 Onco-GWAS 0.354,0.328 1.14(0.99-1.30) 6.79E-02

NJ-GWAS 0.404,0.365 1.25(1.05-1.48) 1.25E-02

BJ-GWAS 0.419,0.374 1.27(1.04-1.54) 2.01E-02

SX-GWAS 0.349,0.332 1.10(0.99-1.23) 6.99E-02

Replication I 0.341,0.364 0.90(0.80-1.00) 4.64E-02

19 8p23.2 (4148574) rs76340783 (T,C) CSMD1 Onco-GWAS 0.187,0.235 0.67(0.56-0.80) 8.16E-06

NJ-GWAS 0.209,0.242 0.77(0.63-0.95) 1.39E-02

BJ-GWAS 0.218,0.225 0.88(0.69-1.13) 3.27E-01

SX-GWAS 0.227,0.242 0.88(0.78-1.00) 4.39E-02

Replication I 0.269,0.271 0.99(0.88-1.11) 8.88E-01

20 9p13.3 (34584105) rs3763615 (T,C) CNTFR Onco-GWAS 0.091,0.080 1.15(0.93-1.43) 1.83E-01

NJ-GWAS 0.135,0.096 1.39(1.12-1.72) 2.41E-03

BJ-GWAS 0.140,0.093 1.41(1.10-1.80) 6.32E-03

SX-GWAS 0.115,0.098 1.23(1.05-1.43) 8.72E-03

Replication I 0.083,0.083 1.00(0.84-1.19) 9.80E-01

21 9p13.2 (38232043) rs4271056 (T,C) snoU13, ALDH1B1 Onco-GWAS 0.065,0.046 1.45(1.11-1.87) 5.45E-03

NJ-GWAS 0.068,0.044 1.80(1.26-2.58) 1.20E-03

BJ-GWAS 0.067,0.054 1.51(1.02-2.26) 4.19E-02


SX-GWAS 0.053,0.044 1.18(0.95-1.47) 1.26E-01

Replication I 0.048,0.046 1.06(0.85-1.33) 5.81E-01

22 10q22.1 (72881510) rs57312792 (T,G) UNC5B, RP11-432J9.6 Onco-GWAS 0.042,0.024 1.90(1.34-2.70) 3.29E-04

NJ-GWAS 0.055,0.033 1.90(1.26-2.85) 1.97E-03

BJ-GWAS 0.045,0.033 1.53(0.93-2.53) 9.64E-02

SX-GWAS 0.028,0.026 1.12(0.80-1.57) 5.17E-01

Replication I 0.038,0.031 1.24(0.96-1.60) 1.07E-01

23 11q13.4 (74031688) rs55660813 (C,A) RP11-632K5.3 Onco-GWAS 0.461,0.489 0.89(0.79-1.00) 5.69E-02

NJ-GWAS 0.471,0.523 0.77(0.67-0.90) 9.46E-04

BJ-GWAS 0.449,0.516 0.77(0.66-0.92) 2.67E-03

SX-GWAS 0.435,0.444 0.95(0.87-1.05) 3.28E-01

Replication I 0.466,0.483 0.94(0.85-1.03) 1.74E-01

24 12p11.22 (27959201) rs10466811 (G,A) 7SK Onco-GWAS 0.310,0.329 0.89(0.77-1.02) 9.65E-02

NJ-GWAS 0.331,0.383 0.80(0.69-0.94) 5.33E-03

BJ-GWAS 0.325,0.376 0.88(0.73-1.07) 1.97E-01

SX-GWAS 0.301,0.330 0.86(0.77-0.95) 4.65E-03

Replication I 0.326,0.351 0.89(0.80-0.98) 2.00E-02

Replication II-SD 0.318,0.298 1.10(0.94-1.29) 2.51E-01

Replication II-NX 0.344,0.303 1.21(1.00-1.46) 4.74E-02

25 13q33.1 (101762201) rs61973994 (C,T) NALCN Onco-GWAS 0.177,0.207 0.81(0.68-0.95) 9.93E-03

NJ-GWAS 0.126,0.189 0.60(0.49-0.74) 8.33E-07

BJ-GWAS 0.138,0.186 0.77(0.60-0.98) 3.42E-02

SX-GWAS 0.185,0.194 0.92(0.81-1.04) 1.81E-01

Replication I 0.177,0.162 1.10(0.97-1.25) 1.45E-01

26 16q21 (58296843) rs144028663 (C,T) CCDC113 Onco-GWAS 0.053,0.032 1.64(1.22-2.21) 9.95E-04

NJ-GWAS 0.047,0.042 1.22(0.84-1.77) 2.93E-01

BJ-GWAS 0.053,0.043 1.57(1.03-2.40) 3.45E-02

SX-GWAS 0.043,0.033 1.36(1.07-1.75) 1.40E-02

Replication I 0.049,0.044 1.08(0.86-1.36) 4.96E-01

27 16q21 (58321433) rs140068619 (C,T) PRSS54 Onco-GWAS 0.052,0.030 1.74(1.29-2.35) 2.95E-04

NJ-GWAS 0.045,0.039 1.25(0.86-1.83) 2.41E-01

BJ-GWAS 0.052,0.039 1.63(1.07-2.49) 2.27E-02

SX-GWAS 0.041,0.030 1.40(1.08-1.80) 9.70E-03

Replication I 0.040,0.038 1.04(0.81-1.34) 7.46E-01

28 16q23.1 (74751532) rs73614660 (A,T) FA2H Onco-GWAS 0.027,0.051 0.49(0.35-0.68) 1.86E-05

NJ-GWAS 0.032,0.036 0.81(0.47-1.41) 4.62E-01

BJ-GWAS 0.032,0.036 0.67(0.37-1.20) 1.75E-01

SX-GWAS 0.031,0.039 0.75(0.57-0.99) 4.49E-02

Replication I 0.052,0.043 1.18(0.94-1.47) 1.52E-01


29 17p13.1 (9870143) rs9909013 (G,A) GAS7 Onco-GWAS 0.437,0.488 0.79(0.70-0.91) 5.45E-04

NJ-GWAS 0.402,0.448 0.82(0.71-0.95) 7.45E-03

BJ-GWAS 0.406,0.439 0.85(0.72-1.02) 7.35E-02

SX-GWAS 0.438,0.459 0.92(0.83-1.01) 6.82E-02

Replication I 0.449,0.480 0.87(0.79-0.96) 7.38E-03

Replication II-SD 0.476,0.472 1.01(0.92-1.11) 8.00E-01

Replication II-NX 0.459,0.460 0.97(0.82-1.15) 7.27E-01

Replication II-JS 0.464,0.483 0.92(0.79-1.08) 3.21E-01

30 17q24.2 (66375875) rs12601974 (T,A) ARSG Onco-GWAS 0.196,0.204 0.90(0.75-1.09) 2.79E-01

NJ-GWAS 0.181,0.226 0.75(0.63-0.90) 2.27E-03

BJ-GWAS 0.171,0.212 0.79(0.65-0.97) 2.71E-02

SX-GWAS 0.204,0.217 0.89(0.78-1.03) 1.30E-01

Replication I 0.226,0.213 1.09(0.97-1.22) 1.71E-01

31 18p11.32 (2889021) rs9965710 (T,C) EMILIN2 Onco-GWAS 0.124,0.143 0.77(0.62-0.96) 2.06E-02

NJ-GWAS 0.147,0.177 0.78(0.63-0.98) 2.90E-02

BJ-GWAS 0.134,0.172 0.70(0.54-0.91) 7.58E-03

SX-GWAS 0.118,0.118 0.98(0.81-1.19) 8.38E-01

Replication I 0.179,0.159 1.15(1.01-1.31) 3.90E-02

32 19q13.31 (44223113) rs11555891 (G,A) IRGC Onco-GWAS 0.131,0.137 0.93(0.77-1.14) 4.96E-01

NJ-GWAS 0.112,0.146 0.69(0.54-0.89) 3.86E-03

BJ-GWAS 0.104,0.131 0.64(0.48-0.85) 2.48E-03

SX-GWAS 0.128,0.145 0.83(0.72-0.97) 1.58E-02

Replication I 0.157,0.143 1.14(0.99-1.30) 6.05E-02

33 20p12.3 (8610270) rs2223836 (A,G) PLCB1 Onco-GWAS 0.294,0.340 0.79(0.70-0.91) 5.80E-04

NJ-GWAS 0.198,0.217 0.85(0.68-1.05) 1.21E-01

BJ-GWAS 0.215,0.216 0.98(0.77-1.25) 8.58E-01

SX-GWAS 0.305,0.334 0.87(0.78-0.96) 5.54E-03

Replication I 0.310,0.332 0.91(0.82-1.01) 6.37E-02

a OR (95%CI), odds ratio and 95% confidence interval were estimated for the effect allele.


Supplementary Table 4. Summary of association results for the gastric cancer risk-related loci that were reported in previous studies.

No. Chr. Variant Associated genes Alleles (Ref/Eff) a OR(95%CI) b P b EAF in 1000 Genomes c (AFR AMR ASN EUR) Most significant associations in previous reports d (OR P Risk allele Study)

1 1p35.2 rs112754928 SPOCD1 G/A - - 0.00 0.00 0.02 0.00 0.54(0.44-0.67) 2.33E-08 G 6
2 1q22 rs760077 MUC1 T/A 0.72(0.66-0.79) 2.09E-12 0.36 0.30 0.17 0.37 0.79(0.73-0.85) 1.10E-09 T 4
rs140081212 MUC1 G/A 0.73(0.67-0.80) 4.11E-12 0.35 0.30 0.18 0.37 0.79(0.73-0.85) 7.90E-10 G 4
rs4072037 MUC1 A/G 0.75(0.69-0.82) 1.15E-10 0.34 0.34 0.16 0.41 0.74(0.69-0.79) 6.28E-17 A 2,5,7
rs80142782 ASH1L T/C 0.63(0.55-0.72) 2.54E-11 0.00 0.00 0.09 0.00 0.62(0.56-0.69) 1.71E-19 T 7
3 2p11.2 chr2:86020821 intergenic G/T - - - - - - 12.7(5.03-32.08) 7.60E-08 T 4
4 3q13.31 rs9841504 ZBTB20 C/G 0.81(0.74-0.89) 4.34E-06 0.34 0.17 0.16 0.08 0.58(0.47-0.72) 6.10E-07 C 3
5 5p13.1 rs10074991 PRKAA1 A/G 1.23(1.15-1.31) 1.22E-10 0.67 0.70 0.46 0.71 1.25(1.20-1.30) 4.83E-26 G 5
rs10036575 PRKAA1 C/T 1.24(1.16-1.31) 1.90E-11 0.71 0.74 0.46 0.77 1.23 4.80E-06 T 4
rs13361707 PRKAA1 T/C 1.23(1.15-1.31) 1.23E-10 0.67 0.70 0.46 0.71 1.32(1.13-1.43) 9.70E-11 C 3,7
6 5q14.3 rs7712641 lnc-POLR3G-4 T/C 1.09(1.02-1.16) 6.12E-03 0.32 0.59 0.58 0.65 0.84(0.80-0.88) 1.21E-11 C 7
7 6p21.1 rs2294693 UNC5CL, TSPO2 T/C 1.12(1.04-1.20) 1.69E-03 0.29 0.26 0.26 0.18 1.18(1.12-1.26) 2.50E-08 C 5
8 6p22.1 rs1679709 BTN3A2 G/A 0.85(0.78-0.93) 1.63E-04 0.23 0.12 0.14 0.12 0.80(0.76-0.85) 1.04E-12 G 6
9 8q24.3 rs2294008 PSCA C/T 1.13(1.05-1.21) 5.08E-04 0.38 0.57 0.38 0.44 1.20(1.15-1.28) 5.95E-11 T 1,4,7
10 9q34.2 rs7849280 ABO A/G 1.15(1.07-1.24) 3.31E-04 0.20 0.05 0.16 0.06 1.15 2.64E-13 G 8
11 10q23.33 rs2274223 PLCE1, NOC3L A/G 1.29(1.20-1.39) 3.85E-12 0.35 0.22 0.19 0.32 1.31(1.19-1.43) 8.40E-09 G 2
rs3765524 PLCE1, NOC3L C/T 1.29(1.20-1.39) 8.98E-12 0.42 0.22 0.19 0.29 1.31(1.20-1.44) 5.32E-09 G 2
rs3781264 PLCE1, NOC3L T/C 1.33(1.23-1.45) 3.04E-12 0.17 0.20 0.14 0.30 1.36(1.23-1.50) 3.76E-09 C 2
rs11187842 PLCE1, NOC3L C/T 1.31(1.21-1.42) 1.05E-10 0.03 0.07 0.13 0.09 1.34(1.21-1.49) 2.53E-09 T 2
12 11q22.3 chr11:108137985 ATM C/T - - - - - - 4.84(2.67-8.77) 2.00E-07 T 4
chr11:108124573 ATM C/A - - - - - - 7.78(3.38-17.91) 1.40E-06 A 4
13 11q22.3 chr11:102612948 intergenic !A/A - - - - - - 4.9(2.79-8.62) 3.40E-08 A 4
14 12q24.11-12 rs6490061 CUX2 C/T 1.02(0.95-1.09) 5.86E-01 0.53 0.63 0.42 0.79 0.91 3.20E-08 C 8
15 20q11.21 rs2376549 DEFB families T/C 1.09(1.01-1.17) 1.82E-02 0.91 0.47 0.28 0.47 1.11 8.11E-10 C 8
16 20q12 rs55864139 CHD6 T/A - - 0.00 0.01 0.00 0.01 3.12(1.90-5.11) 6.90E-06 A 4

a Ref, reference allele; Eff, effect allele.
b Derived from the meta-analysis results of four GWAS datasets using the inverse variance-weighted fixed effect model.
c EAF, effect allele frequency; AFR, African; AMR, Admixed American; ASN, Asian; EUR, European.
d The associations were derived from published GWASs of gastric cancer, and the most significant associations were extracted. The studies are listed as follows:
1. Genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer. Nat Genet. 2008 Jun;40(6):730-40.
2. A shared susceptibility locus in PLCE1 at 10q23 for gastric adenocarcinoma and esophageal squamous cell carcinoma. Nat Genet. 2010 Sep;42(9):764-7.
3. A genome-wide association study identifies new susceptibility loci for non-cardia gastric cancer at 3q13.31 and 5p13.1. Nat Genet. 2011 Oct 30;43(12):1215-8.
4. Loss-of-function variants in ATM confer risk of gastric cancer. Nat Genet. 2015 Aug;47(8):906-10.
5. Genome-wide association study of gastric adenocarcinoma in Asia: a comparison of associations between cardia and non-cardia tumours. Gut. 2016 Oct;65(10):1611-8.
6. Exome Array Analysis Identifies Variants in SPOCD1 and BTN3A2 That Affect Risk for Gastric Cancer. Gastroenterology. 2017 Jun;152(8):2011-2021.
7. Identification of new susceptibility loci for gastric non-cardia adenocarcinoma: pooled results from two Chinese genome-wide association studies. Gut. 2017 Apr;66(4):581-587.
8. Genome-wide association study identifies gastric cancer susceptibility loci at 12q24.11-12 and 20q11.21. Cancer Sci. 2018 Dec;109(12):4015-4024.
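Footnote b of Supplementary Table 4 refers to an inverse variance-weighted fixed-effect meta-analysis. As a minimal sketch of that calculation (not the authors' actual pipeline), the Python below pools per-study odds ratios on the log scale. The inputs are the four GWAS rows reported for rs10029005 in Supplementary Table 3; the standard errors are back-calculated from the rounded 95% confidence intervals, which is an assumption of this illustration, and the function names are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_or_and_se(or_, lower, upper):
    """Return (log OR, SE); the SE is back-calculated from a 95% CI (assumption)."""
    return math.log(or_), (math.log(upper) - math.log(lower)) / (2 * Z95)


def fixed_effect_meta(studies):
    """Inverse variance-weighted fixed-effect pooling of (OR, lower, upper) tuples."""
    weights, weighted_betas = [], []
    for or_, lower, upper in studies:
        beta, se = log_or_and_se(or_, lower, upper)
        w = 1.0 / se ** 2  # inverse-variance weight
        weights.append(w)
        weighted_betas.append(w * beta)
    beta_pooled = sum(weighted_betas) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    z = beta_pooled / se_pooled
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return {
        "or": math.exp(beta_pooled),
        "lower": math.exp(beta_pooled - Z95 * se_pooled),
        "upper": math.exp(beta_pooled + Z95 * se_pooled),
        "p": p,
    }


# rs10029005 in Onco-, NJ-, BJ- and SX-GWAS (Supplementary Table 3)
rs10029005 = [
    (1.13, 1.00, 1.29),
    (1.21, 1.04, 1.42),
    (1.14, 0.95, 1.37),
    (1.17, 1.06, 1.28),
]

if __name__ == "__main__":
    res = fixed_effect_meta(rs10029005)
    print(f"OR={res['or']:.2f} ({res['lower']:.2f}-{res['upper']:.2f}), P={res['p']:.2e}")
```

Because the inputs are rounded to two decimals, the pooled estimate from this sketch should only be expected to approximate a value computed from the original summary statistics.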


Supplementary Table 5. Association results and functional annotations of 59 variants in strong LD with the lead variant rs6897169 at 5p13.1.

No. ID Position Reference allele Effect allele r2 a OR (95%CI) b P-value b GENCODE genes dbSNP function annotation RegulomeDB score CADD score

1 rs10036575 chr5:40685795 C T 0.96 1.24(1.16-1.32) 1.01E-11 PTGER4 intronic 4 3.01
2 rs28540420 chr5:40687463 T C 0.96 1.24(1.16-1.32) 1.16E-11 PTGER4 intronic 5 7.23
3 rs10055925 chr5:40688059 G A 0.96 1.24(1.16-1.32) 1.21E-11 PTGER4 intronic 5 4.81
4 rs4957342 chr5:40694154 C T 0.98 1.24(1.16-1.32) 1.19E-11 PTGER4 downstream 5 0.02
5 rs7730368 chr5:40694966 C A 0.98 1.24(1.16-1.32) 1.21E-11 PTGER4(dist=1129),TTC33(dist=16712) intergenic 5 2.59
6 rs10672218 chr5:40695196 AGAT A 0.98 1.23(1.15-1.31) 1.65E-10 PTGER4(dist=1359),TTC33(dist=16482) intergenic 6 0.30
7 rs57806386 chr5:40699186 C A 0.98 1.24(1.16-1.32) 1.22E-11 PTGER4(dist=5349),TTC33(dist=12492) intergenic 6 1.93
8 rs4509070 chr5:40699684 T A 0.98 1.24(1.16-1.32) 1.22E-11 PTGER4(dist=5847),TTC33(dist=11994) intergenic 6 5.49
9 rs146005788 chr5:40701169 ATCCACG A 0.83 1.23(1.15-1.31) 1.32E-09 PTGER4(dist=7332),TTC33(dist=10509) intergenic 6 10.06
10 rs10078575 chr5:40704074 G A 0.98 1.24(1.17-1.32) 9.61E-12 PTGER4(dist=10237),TTC33(dist=7604) intergenic 7 4.31
11 rs6872282 chr5:40704090 T C 0.98 1.24(1.16-1.32) 1.20E-11 PTGER4(dist=10253),TTC33(dist=7588) intergenic 7 9.31
12 rs60770691 chr5:40705419 T A 0.98 1.24(1.16-1.32) 1.22E-11 PTGER4(dist=11582),TTC33(dist=6259) intergenic 7 10.63
13 rs7716285 chr5:40711361 G A 0.98 1.24(1.16-1.32) 1.20E-11 TTC33 downstream 6 3.32
14 rs1345778 chr5:40712797 C A 0.98 1.24(1.16-1.32) 1.22E-11 TTC33(NM_012382:c.*3450T>G) 3'-UTR 5 1.11
15 rs980093 chr5:40717105 T C 0.98 1.24(1.16-1.32) 1.24E-11 TTC33 intronic 6 4.36
16 rs7726270 chr5:40718788 T C 0.98 1.24(1.16-1.32) 1.31E-11 TTC33 intronic 7 0.41
17 rs6873054 chr5:40719543 C T 0.98 1.24(1.16-1.32) 1.35E-11 TTC33 intronic 6 0.40
18 rs10038769 chr5:40720128 C A 0.98 1.24(1.16-1.32) 1.40E-11 TTC33 intronic 5 0.61
19 rs1692252 chr5:40722822 A G 0.98 1.24(1.16-1.31) 1.55E-11 TTC33 intronic 5 1.30
20 rs6897169 chr5:40726138 T C 1.00 1.25(1.17-1.33) 5.48E-12 TTC33 intronic 6 7.65
21 rs145884228 chr5:40726316 TTA T 0.79 1.26(1.18-1.35) 1.05E-11 TTC33 intronic 6 0.42
22 rs6860328 chr5:40729974 C T 0.98 1.23(1.16-1.31) 1.90E-11 TTC33 intronic 6 7.51
23 rs7705504 chr5:40739261 T C 0.98 1.23(1.16-1.31) 2.15E-11 TTC33 intronic 6 1.84
24 rs10071679 chr5:40743061 A G 0.98 1.23(1.16-1.31) 2.24E-11 TTC33 intronic 7 2.81
25 rs59585832 chr5:40744038 T C 0.98 1.24(1.16-1.31) 1.66E-11 TTC33 intronic 5 0.70
26 rs3805497 chr5:40746885 T A 0.98 1.23(1.16-1.31) 2.21E-11 TTC33 intronic 3a 1.53
27 rs2329353 chr5:40748268 A G 0.98 1.23(1.16-1.31) 2.11E-11 TTC33 intronic 6 2.12
28 rs77439436 chr5:40748968 A G 0.98 1.23(1.16-1.31) 2.14E-11 TTC33 intronic 7 4.33
29 rs6868517 chr5:40752536 G A 0.96 1.23(1.16-1.31) 2.15E-11 TTC33 intronic 7 0.93
30 rs3805495 chr5:40755568 T C 0.96 1.23(1.16-1.31) 2.34E-11 TTC33 intronic 2a 9.14
31 rs7702883 chr5:40756681 A G 0.96 1.22(1.15-1.30) 1.29E-10 TTC33 upstream 5 4.35
32 rs1002424 chr5:40767397 G A 0.96 1.24(1.16-1.31) 1.71E-11 PRKAA1 intronic 5 1.77
33 rs1002423 chr5:40767458 T C 0.96 1.24(1.16-1.31) 1.53E-11 PRKAA1 intronic 5 16.19
34 rs58692207 chr5:40780569 C T 0.96 1.24(1.16-1.32) 1.28E-11 PRKAA1 intronic 3a 1.72
35 rs59338019 chr5:40780786 G A 0.96 1.24(1.16-1.32) 1.28E-11 PRKAA1 intronic 6 0.74
36 rs28544962 chr5:40781187 C T 0.96 1.24(1.16-1.32) 1.28E-11 PRKAA1 intronic 3a 1.24
37 rs35972942 chr5:40786356 CA C 0.76 1.24(1.16-1.32) 1.11E-10 PRKAA1 intronic 5 6.65
38 rs4957352 chr5:40787523 T C 0.96 1.24(1.16-1.32) 1.24E-11 PRKAA1 intronic 6 1.70
39 rs10074991 chr5:40790551 A G 0.96 1.24(1.16-1.32) 9.10E-12 PRKAA1 intronic 7 3.74
40 rs546238399 chr5:40790627 C CTCT 0.89 1.22(1.15-1.30) 1.30E-09 PRKAA1 intronic NA 4.69
41 rs373477888 chr5:40790640 T TG 0.89 1.22(1.15-1.30) 1.30E-09 PRKAA1 intronic 6 0.08
42 rs58751240 chr5:40791501 T C 0.96 1.24(1.16-1.32) 1.22E-11 PRKAA1 intronic 2b 4.68
43 rs13361707 chr5:40791884 T C 0.96 1.24(1.16-1.32) 9.39E-12 PRKAA1 intronic 4 7.70
44 rs3805487 chr5:40796033 T C 0.96 1.24(1.16-1.32) 1.11E-11 PRKAA1 intronic 4 8.00
45 rs59133000 chr5:40798974 T C 0.96 1.24(1.16-1.32) 1.17E-11 PRKAA1 upstream 2b 14.16
46 rs166073 chr5:40805936 T C 0.70 1.24(1.16-1.32) 5.49E-10 PRKAA1(dist=7639),LOC100506548(dist=19429) intergenic 7 0.01
47 rs1122655 chr5:40807927 T C 0.71 1.19(1.12-1.26) 4.55E-08 PRKAA1(dist=9630),LOC100506548(dist=17438) intergenic 6 2.59
48 rs1001684 chr5:40810426 A C 0.71 1.19(1.12-1.26) 5.63E-08 PRKAA1(dist=12129),LOC100506548(dist=14939) intergenic 7 8.73
49 rs11956019 chr5:40812231 A G 0.69 1.19(1.12-1.26) 5.41E-08 PRKAA1(dist=13934),LOC100506548(dist=13134) intergenic 5 10.02
50 rs11956047 chr5:40812377 A G 0.69 1.19(1.12-1.26) 5.35E-08 PRKAA1(dist=14080),LOC100506548(dist=12988) intergenic 4 4.75
51 rs10043245 chr5:40819088 T C 0.69 1.19(1.12-1.26) 3.51E-08 PRKAA1(dist=20791),LOC100506548(dist=6277) intergenic 6 7.63
52 rs73084490 chr5:40819547 A G 0.71 1.19(1.12-1.26) 3.59E-08 PRKAA1(dist=21250),LOC100506548(dist=5818) intergenic 7 0.54
53 rs11957736 chr5:40819753 T C 0.69 1.19(1.12-1.26) 3.51E-08 PRKAA1(dist=21456),LOC100506548(dist=5612) intergenic 7 3.93
54 rs6876367 chr5:40829311 T C 0.69 1.19(1.12-1.26) 4.77E-08 LOC100506548 upstream 4 5.52
55 rs7717357 chr5:40829915 T C 0.69 1.19(1.12-1.26) 3.56E-08 LOC100506548 upstream 5 1.44
56 rs201454585 chr5:40830602 A AT 0.60 1.18(1.10-1.26) 7.56E-07 RPL37 downstream 3a 0.12
57 rs10065570 chr5:40835627 T C 0.69 1.19(1.12-1.26) 4.14E-08 RPL37 upstream 4 9.43
58 rs3763074 chr5:40838022 C A 0.69 1.18(1.11-1.26) 1.01E-07 RPL37(dist=2635),CARD6(dist=3388) intergenic 7 1.16
59 rs6899032 chr5:40839206 G C 0.69 1.18(1.11-1.26) 1.08E-07 RPL37(dist=3819),CARD6(dist=2204) intergenic 6 0.64

a LD value of r2 with the lead variant rs6897169 in CHB of the 1000 Genomes.
b Derived from the meta-analysis results of four GWASs using the inverse variance-weighted fixed effect model.


Supplementary Table 6. The results of eQTL-analysis between rs6897169 genotypes and flanking genes in a one mega-base pair window based on 237 normal stomach tissue samples.

No. Gene Gencode id P-value Effect size (effect allele, T) Standard error

1 LINC00603 ENSG00000250048.1 - -
2 U1 ENSG00000199361.1 0.69 0.028 0.070
3 PTGER4 ENSG00000171522.5 0.013 -0.22 0.088
4 TTC33 ENSG00000113638.8 0.56 -0.040 0.069
5 PRKAA1 ENSG00000132356.7 7.2E-04 0.19 0.056
6 RPL37 ENSG00000145592.9 0.95 3.0E-03 0.047
7 SNORD72 ENSG00000212296.1 - -
8 CARD6 ENSG00000132357.9 0.46 -0.034 0.045
9 C7 ENSG00000112936.14 0.11 0.083 0.052
10 U7 ENSG00000253098.1 - -
11 MROH2B ENSG00000171495.12 - -
12 C6 ENSG00000039537.9 0.45 -0.073 0.097
13 PLCXD3 ENSG00000182836.5 0.70 -0.028 0.074
14 OXCT1 ENSG00000083720.8 0.70 -0.031 0.079
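Supplementary Table 6 reports an effect size, a standard error and a P value for each gene but does not state the test that was used. Assuming a Wald-type test on a regression coefficient with a normal approximation (the original analysis may instead have used a t distribution, so small differences are expected), the reported P values can be roughly recomputed as sketched below; `wald_p` is a hypothetical helper name.

```python
import math


def wald_p(effect, se):
    """Two-sided P value for effect/se under a standard normal approximation."""
    z = effect / se
    return math.erfc(abs(z) / math.sqrt(2))


# (gene, effect size, standard error) copied from Supplementary Table 6
rows = [("PTGER4", -0.22, 0.088), ("PRKAA1", 0.19, 0.056)]

if __name__ == "__main__":
    for gene, effect, se in rows:
        print(f"{gene}: z={effect / se:.2f}, approximate P={wald_p(effect, se):.1e}")
```

For these two genes the approximation lands close to the tabulated P values (0.013 and 7.2E-04), which is consistent with, but does not prove, a test of this general form.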


Supplementary Table 7. In silico prediction of transcription factors with allele specific binding to rs59133000 by EEL algorithm (A) and prediction tool JASPAR (B).

Matrix ID Risk allele Score Predicted sequence Non-risk allele Score Predicted sequence

(A) EEL (Score > 9.0)

NFKB1 C 9.747 TGAAATTTCCT T - -

(B) JASPAR (Relative profile score > 85%)

NFKB1 C 0.892 TGAAATTTCCT T - -


Supplementary Table 8. Association results and functional annotations of 78 variants in strong LD with the lead variant rs10509671 at 10q23.33.

No. ID Position Reference allele Effect allele r2 a OR (95%CI) b P-value b GENCODE genes dbSNP function annotation RegulomeDB score CADD score

1 rs11187826 chr10:95988042 A G 0.75 1.20(1.10-1.31) 3.22E-05 PLCE1 intronic 5 1.46
2 rs3740360 chr10:96025491 A C 0.92 1.26(1.16-1.37) 1.25E-07 PLCE1 intronic 7 0.00
3 rs17109875 chr10:96026575 T C 0.92 1.27(1.17-1.39) 4.14E-08 PLCE1 intronic 5 5.70
4 rs11187836 chr10:96032119 C T 0.75 1.27(1.15-1.40) 1.91E-06 PLCE1 intronic 6 1.76
5 rs11187837 chr10:96035980 T C 0.87 1.28(1.17-1.40) 5.10E-08 PLCE1 intronic 5 1.62
6 rs7084339 chr10:96043732 A G 0.63 1.29(1.20-1.38) 1.23E-11 PLCE1-AS1 ncRNA_intronic 7 8.97
7 rs12263737 chr10:96044913 G A 0.65 1.28(1.19-1.38) 2.33E-11 PLCE1-AS1 ncRNA_intronic 6 1.88
8 rs10882416 chr10:96045054 C T 0.65 1.28(1.19-1.38) 1.78E-11 PLCE1-AS1 ncRNA_intronic 7 1.09
9 rs58783042 chr10:96049708 AC A 0.96 1.32(1.21-1.43) 6.26E-11 PLCE1 intronic 6 0.36
10 rs11187840 chr10:96050351 A G 0.96 1.31(1.21-1.42) 1.25E-10 PLCE1 intronic 7 0.66
11 rs11187842 chr10:96052511 C T 0.96 1.31(1.21-1.42) 1.05E-10 PLCE1 intronic 5 5.01
12 rs3781266 chr10:96052747 T C 0.96 1.31(1.21-1.42) 1.03E-10 PLCE1 intronic 4 7.90
13 rs3740365 chr10:96053239 A T 0.96 1.31(1.21-1.42) 1.00E-10 PLCE1 intronic 5 13.45
14 rs12220091 chr10:96053689 C T 0.96 1.31(1.21-1.42) 9.72E-11 PLCE1 intronic 7 0.09
15 rs75017201 chr10:96055152 C T 0.96 1.31(1.21-1.42) 8.54E-11 PLCE1 intronic 5 0.24
16 rs200197176 chr10:96055963 C CA 0.96 1.32(1.22-1.44) 3.75E-11 PLCE1 intronic 6 1.20
17 rs3765524 chr10:96058298 C T 0.65 1.29(1.20-1.39) 8.98E-12 PLCE1 exonic 7 6.58
18 rs11187845 chr10:96060198 C A 0.96 1.31(1.21-1.42) 7.55E-11 PLCE1 intronic 7 12.60
19 rs7897678 chr10:96060610 C G 0.65 1.29(1.20-1.39) 9.21E-12 PLCE1 intronic 6 0.69
20 rs7914672 chr10:96060847 T A 0.65 1.29(1.20-1.39) 8.69E-12 PLCE1 intronic 6 1.68
21 rs7897963 chr10:96060875 A G 0.65 1.29(1.20-1.39) 8.60E-12 PLCE1 intronic 6 0.99
22 rs140311370 chr10:96061619 ACTT A 0.96 1.33(1.22-1.44) 2.73E-11 PLCE1 intronic 6 3.14
23 rs12217792 chr10:96062386 T C 0.96 1.32(1.21-1.43) 5.31E-11 PLCE1 intronic 6 1.97
24 rs3781265 chr10:96063279 T A 0.65 1.29(1.20-1.39) 6.00E-12 PLCE1 intronic 6 0.04
25 rs11187847 chr10:96063440 C G 0.96 1.32(1.21-1.43) 5.19E-11 PLCE1 intronic 7 7.04
26 rs3818432 chr10:96064168 C A 0.71 1.30(1.21-1.40) 3.69E-12 PLCE1 intronic 6 4.51
27 rs7099485 chr10:96065694 T C 0.65 1.29(1.20-1.39) 4.12E-12 PLCE1 intronic 7 4.86
28 rs2274223 chr10:96066341 A G 0.65 1.29(1.20-1.39) 3.85E-12 PLCE1 exonic 3a 3.31
29 rs10509670 chr10:96067947 A G 0.65 1.31(1.22-1.41) 3.60E-12 PLCE1 intronic 6 0.64
30 rs11187850 chr10:96068480 A G 0.71 1.30(1.21-1.40) 4.04E-12 PLCE1 intronic 7 12.02
31 rs10509671 chr10:96069054 A C 1.00 1.34(1.23-1.45) 2.51E-12 PLCE1 intronic 5 4.06
32 rs7096883 chr10:96069149 G A 0.96 1.32(1.21-1.43) 5.14E-11 PLCE1 intronic 5 0.84
33 rs7096678 chr10:96069208 C T 1.00 1.34(1.23-1.45) 2.98E-12 PLCE1 intronic 7 2.39
34 rs6583934 chr10:96069405 T G 0.65 1.29(1.20-1.39) 5.64E-12 PLCE1 intronic 7 0.00
35 rs7100626 chr10:96069674 C A 0.96 1.32(1.21-1.43) 4.35E-11 PLCE1 intronic 6 4.53
36 rs11187851 chr10:96069875 G A 1.00 1.34(1.23-1.45) 2.52E-12 PLCE1 intronic 7 0.19
37 rs11187852 chr10:96070132 G A 0.96 1.32(1.21-1.43) 4.38E-11 PLCE1 intronic 4 3.05
38 rs3781264 chr10:96070375 A G 1.00 1.33(1.23-1.45) 3.04E-12 PLCE1 intronic 4 5.77
39 rs752140 chr10:96071396 C T 0.96 1.32(1.21-1.43) 4.39E-11 PLCE1 intronic 6 4.19
40 rs11187853 chr10:96072228 G A 0.65 1.29(1.20-1.39) 4.17E-12 PLCE1 intronic 6 1.37
41 rs75409190 chr10:96072425 C T 0.96 1.32(1.21-1.43) 4.34E-11 PLCE1 intronic 7 7.40
42 rs3215794 chr10:96072873 CTA C 0.96 1.33(1.22-1.45) 1.91E-11 PLCE1 intronic 7 2.91
43 rs6583935 chr10:96073325 C T 0.65 1.29(1.20-1.39) 4.30E-12 PLCE1 intronic 6 2.47
44 rs201053600 chr10:96073508 AGAT A 0.96 1.33(1.22-1.45) 1.94E-11 PLCE1 intronic 4.74
45 rs10882422 chr10:96073563 T C 0.65 1.29(1.20-1.39) 3.88E-12 PLCE1 intronic 7 4.09
46 rs7903902 chr10:96074157 T C 0.65 1.29(1.20-1.39) 4.37E-12 PLCE1 intronic 6 1.05
47 rs12220125 chr10:96074939 T G 0.71 1.30(1.21-1.40) 3.74E-12 PLCE1 intronic 2b 1.30
48 rs7908638 chr10:96075433 T C 0.65 1.29(1.20-1.39) 4.68E-12 PLCE1 intronic 6 0.80
49 rs11187856 chr10:96076869 G A 0.96 1.32(1.21-1.43) 4.80E-11 PLCE1 intronic 7 0.26
50 rs12219592 chr10:96077222 C T 0.67 1.32(1.20-1.46) 1.58E-08 PLCE1 intronic 7 2.79
51 rs113406892 chr10:96079609 CTCAAAGG C 0.85 1.32(1.22-1.43) 3.23E-11 PLCE1 intronic 6 14.25
52 rs11187863 chr10:96081457 T G 0.81 1.29(1.19-1.40) 4.57E-10 PLCE1 intronic 7 4.30
53 rs11187864 chr10:96082506 C T 0.81 1.29(1.19-1.40) 6.88E-10 PLCE1 intronic 7 0.58
54 rs12781451 chr10:96083920 G A 0.85 1.31(1.21-1.42) 5.48E-11 PLCE1 intronic 6 4.05
55 rs3831084 chr10:96084372 A AT 0.85 1.32(1.22-1.43) 3.39E-11 PLCE1 intronic 5 10.05
56 rs11187866 chr10:96085991 C G 0.81 1.29(1.19-1.40) 5.68E-10 PLCE1 intronic 6 3.28
57 rs11187869 chr10:96087497 C T 0.85 1.32(1.22-1.44) 5.22E-11 PLCE1 intronic 5 9.53
58 rs11187870 chr10:96087866 G C 0.81 1.20(1.11-1.31) 6.26E-06 PLCE1(NM_001288989:c.*166_*167delins0) UTR3 6 13.59
59 rs11187877 chr10:96092121 G A 0.77 1.29(1.19-1.40) 2.20E-09 NOC3L(dist=861) downstream 7 14.51
60 rs145707916 chr10:96092992 G GA 0.77 1.29(1.19-1.40) 2.15E-09 NOC3L(NM_022451:c.*942_*941delins0) UTR3 7 5.82
61 rs11558740 chr10:96093375 C T 0.85 1.31(1.21-1.42) 1.51E-10 NOC3L(NM_022451:c.*559_*558delins0) UTR3 6 4.46
62 rs138437339 chr10:96095271 CATTTT C 0.77 1.29(1.19-1.40) 2.30E-09 NOC3L intronic 7 8.59
63 rs11187882 chr10:96095861 A C 0.77 1.29(1.19-1.40) 2.43E-09 NOC3L intronic 6 0.04
64 rs11187883 chr10:96096866 T C 0.77 1.29(1.19-1.40) 2.38E-09 NOC3L intronic 7 1.11
65 rs551061011 chr10:96097114 CAAAAAAA C 0.81 1.31(1.21-1.43) 1.78E-10 NOC3L intronic
66 rs3740359 chr10:96100119 C T 0.77 1.29(1.19-1.40) 1.92E-09 NOC3L intronic 6 1.30
67 rs36027832 chr10:96100752 A AG 0.60 1.27(1.18-1.38) 2.17E-09 NOC3L intronic 6
68 rs12241122 chr10:96100756 A C 0.60 1.27(1.18-1.38) 2.07E-09 NOC3L intronic 6 0.23
69 rs11187890 chr10:96102758 T C 0.73 1.29(1.19-1.41) 1.28E-09 NOC3L intronic 5 0.53
70 rs11187893 chr10:96105979 C G 0.81 1.21(1.12-1.32) 2.17E-06 NOC3L intronic 6 5.31
71 rs11187894 chr10:96106025 C T 0.81 1.31(1.20-1.42) 2.47E-10 NOC3L intronic 6 2.43
72 rs11187895 chr10:96106240 G C 0.77 1.29(1.19-1.40) 2.79E-09 NOC3L exonic 6 6.56
73 rs11187897 chr10:96106603 A G 0.77 1.29(1.19-1.40) 2.40E-09 NOC3L intronic 5 7.61
74 rs10882435 chr10:96107867 A T 0.81 1.32(1.21-1.43) 1.16E-10 NOC3L intronic 6 0.95
75 rs11187899 chr10:96108102 G C 0.81 1.32(1.21-1.44) 1.02E-10 NOC3L intronic 7 0.85
76 rs11187900 chr10:96108364 T G 0.81 1.33(1.22-1.44) 5.40E-11 NOC3L intronic 6 1.40
77 rs10882436 chr10:96108800 T C 0.81 1.33(1.22-1.45) 5.29E-11 NOC3L intronic 7 2.16
78 rs12217597 chr10:96110812 T C 0.81 1.33(1.22-1.45) 5.39E-11 NOC3L intronic 6 0.99

a LD value of r2 with the lead variant rs10509671 in CHB of the 1000 Genomes.
b Derived from the meta-analysis results of four GWASs using the inverse variance-weighted fixed effect model.
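Footnote a of Supplementary Table 8 gives LD as r2 with the lead variant. For readers unfamiliar with the measure, the following is a minimal sketch of how r2 is computed from phased haplotypes coded 0/1. The toy haplotypes are invented for illustration and are not 1000 Genomes data, and `ld_r2` is a hypothetical function name.

```python
def ld_r2(haps_a, haps_b):
    """Squared correlation (r2) between two biallelic sites from phased 0/1 haplotypes."""
    if len(haps_a) != len(haps_b) or not haps_a:
        raise ValueError("haplotype lists must be non-empty and of equal length")
    n = len(haps_a)
    p_a = sum(haps_a) / n  # effect allele frequency at site A
    p_b = sum(haps_b) / n  # effect allele frequency at site B
    # frequency of haplotypes carrying the effect allele at both sites
    p_ab = sum(1 for a, b in zip(haps_a, haps_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either site is monomorphic")
    return d * d / denom


# Toy phased haplotypes (invented): 1 = effect allele, 0 = reference allele
lead = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
proxy = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print(f"r2 = {ld_r2(lead, proxy):.2f}")
```

In practice r2 is usually taken from a reference panel with dedicated tools; the sketch only shows the definition behind the tabulated values.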

Supplementary Table 9. In silico analyses of missense variants in PLCE1 and NOC3L

ID | Gene | cDNA mutation (a) | Protein alteration | Mutation type | MAF (b) | SIFT | PolyPhen-2 | ClinVar
rs3765524 | PLCE1 | c.5964C>T | p.Thr1777Ile | missense | 0.189 | tolerated | benign | With Benign allele
rs2274223 | PLCE1 | c.6414A>G | p.His1927Arg | missense | 0.189 | tolerated | benign | With Benign allele
rs11187895 | NOC3L | c.1331G>C | p.Pro444Arg | missense | 0.141 | tolerated | benign | With Benign allele
a The accession numbers in GenBank are NM_016341.3 and NM_022451.10 for PLCE1 and NOC3L, respectively.
b Minor allele frequency (MAF) from CHB of 1000 Genomes.

Supplementary Table 10. The results of eQTL-analysis between rs10509671 genotypes and flanking genes in a one mega-base pair window based on 237 normal stomach tissue samples.

No. | Gene | Gencode id | P-value | Effect size (effect allele, C) | Standard error
1 | MYOF | ENSG00000138119.12 | 0.30 | -0.046 | 0.046
2 | CEP55 | ENSG00000138180.11 | 0.52 | -0.031 | 0.048
3 | FFAR4 | ENSG00000186188.6 | 0.33 | 0.082 | 0.084
4 | RBP4 | ENSG00000138207.8 | 0.05 | 0.180 | 0.090
5 | PDE6C | ENSG00000095464.9 | 0.86 | 0.014 | 0.078
6 | FRA10AC1 | ENSG00000148690.10 | 0.47 | 0.064 | 0.088
7 | LGI1 | ENSG00000108231.7 | 0.74 | 0.020 | 0.061
8 | SLC35G1 | ENSG00000176273.10 | 0.56 | -0.042 | 0.072
9 | PIPSL | ENSG00000180764.11 | 0.68 | 0.041 | 0.100
10 | PLCE1 | ENSG00000138193.10 | 0.27 | -0.067 | 0.061
11 | PLCE1-AS1 | ENSG00000268894.2 | 0.088 | -0.15 | 0.088
12 | NOC3L | ENSG00000173145.7 | 8.4E-08 | 0.280 | 0.050
13 | TBC1D12 | ENSG00000108239.8 | 0.76 | 0.018 | 0.058
14 | HELLS | ENSG00000119969.10 | 0.19 | 0.048 | 0.037
15 | CYP2C18 | ENSG00000108242.8 | 0.27 | 0.036 | 0.033
16 | CYP2C19 | ENSG00000165841.5 | 0.59 | 0.020 | 0.037
17 | CYP2C9 | ENSG00000138109.9 | 0.09 | 0.067 | 0.039
18 | CYP2C8 | ENSG00000138115.9 | 0.02 | 0.160 | 0.067
19 | C10orf129 | ENSG00000173124.10 | 0.07 | 0.025 | 0.064
20 | PDLIM1 | ENSG00000107438.4 | 0.51 | 0.042 | 0.064
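
The relation between the effect size, standard error and P-value columns can be illustrated with a simple Wald statistic. This is only a sketch under a normal approximation, which is an assumption here: the table does not state which test produced the reported P-values, so the values computed below are approximate and are not expected to reproduce them exactly. The function name `wald_test` is illustrative.

```python
import math


def wald_test(effect, se):
    """Return the Wald z statistic and a two-sided normal P value."""
    z = effect / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Effect sizes and standard errors taken from three rows of the table.
rows = [
    ("MYOF", -0.046, 0.046),
    ("RBP4", 0.180, 0.090),
    ("NOC3L", 0.280, 0.050),
]

for gene, effect, se in rows:
    z, p = wald_test(effect, se)
    print(f"{gene}: z = {z:.2f}, approximate P = {p:.2g}")
```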

Supplementary Table 11. In silico prediction of transcription factors with allele specific binding to rs3781266 and rs3740365 by EEL algorithm (A) and prediction tool JASPAR (B).

Matrix ID | Non-risk allele | Score | Predicted sequence | Risk allele | Score | Predicted sequence

rs3781266
(A) EEL (Score > 9.0)
POU3F3 | T | 10.035 | AATATGGTGATAA | C | - | -
RUNX1 | T | 9.374 | AATAGTGGTAT | C | - | -
POU2F1 | T | 9.270 | AATATGGTGATA | C | - | -
(B) JASPAR (Relative profile score > 85%)
POU3F3 | T | 0.851 | AATATGGTGATAA | C | - | -
RUNX1 | T | 0.863 | AATAGTGGTAT | C | - | -
POU2F1 | T | 0.859 | AATATGGTGATA | C | - | -

rs3740365
(A) EEL (Score > 9.0)
PROP1 | A | - | - | T | 11.575 | AAAATAAATTA
OTX1 | A | - | - | T | 10.954 | TTAATCGG
PAX3 | A | - | - | T | 9.718 | AAACCGATTA
(B) JASPAR (Relative profile score > 85%)
PROP1 | A | - | - | T | 0.880 | AAAATAAATTA
OTX1 | A | - | - | T | 0.972 | TTAATCGG
PAX3 | A | - | - | T | 0.887 | AAACCGATTA

Supplementary Table 12. Summary of 34 genes significantly associated with gastric cancer risk after multiple testing correction in gene-based analysis as implemented in MAGMA.

No. | Gene | Chr. | Position start | Position stop | Variants (a) | Z-Stat (b) | P | P FDR (c)
1 | EFNA1 | 1 | 155095349 | 155112386 | 10 | 5.0171 | 2.62E-07 | 0.0002577
2 | SLC50A1 | 1 | 155102817 | 155116334 | 10 | 5.4406 | 2.66E-08 | 4.51E-05
3 | DPM3 | 1 | 155107367 | 155117996 | 4 | 5.5266 | 1.63E-08 | 3.05E-05
4 | KRTCAP2 | 1 | 155136884 | 155156331 | 22 | 5.9886 | 1.06E-09 | 2.82E-06
5 | TRIM46 | 1 | 155140702 | 155162447 | 21 | 6.107 | 5.08E-10 | 2.23E-06
6 | MUC1 | 1 | 155153300 | 155167768 | 9 | 5.8873 | 1.96E-09 | 4.58E-06
7 | THBS3 | 1 | 155160379 | 155182772 | 16 | 6.0518 | 7.16E-10 | 2.23E-06
8 | MTX1 | 1 | 155173490 | 155188625 | 11 | 6.0557 | 6.99E-10 | 2.23E-06
9 | GBA | 1 | 155199239 | 155219653 | 35 | 5.1251 | 1.49E-07 | 0.0001633
10 | FAM189B | 1 | 155211996 | 155230274 | 33 | 5.0855 | 1.83E-07 | 0.0001902
11 | SCAMP3 | 1 | 155220770 | 155237176 | 19 | 5.1936 | 1.03E-07 | 0.0001284
12 | CLK2 | 1 | 155227659 | 155248305 | 20 | 5.334 | 4.80E-08 | 6.90E-05
13 | HCN3 | 1 | 155242218 | 155264639 | 26 | 5.3642 | 4.07E-08 | 6.33E-05
14 | PKLR | 1 | 155254084 | 155283531 | 27 | 5.2066 | 9.62E-08 | 0.0001283
15 | FDPS | 1 | 155273539 | 155295457 | 12 | 4.8822 | 5.25E-07 | 0.0004452
16 | RUSC1-AS1 | 1 | 155285251 | 155298938 | 3 | 4.564 | 2.51E-06 | 0.0018024
17 | RUSC1 | 1 | 155285640 | 155305909 | 6 | 4.7185 | 1.19E-06 | 0.0009644
18 | ASH1L | 1 | 155300052 | 155537324 | 82 | 4.6542 | 1.63E-06 | 0.0012653
19 | MSTO1 | 1 | 155574961 | 155589758 | 8 | 5.1769 | 1.13E-07 | 0.0001317
20 | YY1AP1 | 1 | 155624233 | 155663823 | 20 | 4.4294 | 4.73E-06 | 0.0032676
21 | DAP3 | 1 | 155652693 | 155713801 | 29 | 4.9221 | 4.28E-07 | 0.0003806
22 | GON4L | 1 | 155714450 | 155834185 | 40 | 4.6043 | 2.07E-06 | 0.0015457
23 | PTGER4 | 5 | 40675032 | 40701962 | 56 | 6.385 | 8.57E-11 | 5.34E-07
24 | TTC33 | 5 | 40706678 | 40761072 | 107 | 6.4451 | 5.78E-11 | 5.34E-07
25 | PRKAA1 | 5 | 40754481 | 40803297 | 86 | 6.4035 | 7.59E-11 | 5.34E-07
26 | RPL37 | 5 | 40826430 | 40840387 | 47 | 3.7886 | 7.57E-05 | 0.0423981
27 | DAGLB | 7 | 6443747 | 6492837 | 140 | 4.022 | 2.89E-05 | 0.01796
28 | PLCE1 | 10 | 95748746 | 96093149 | 586 | 4.9657 | 3.42E-07 | 0.0003195
29 | NOC3L | 10 | 96087989 | 96127715 | 76 | 5.8518 | 2.43E-09 | 5.05E-06
30 | GFRA1 | 10 | 117811436 | 118038126 | 488 | 3.9308 | 4.23E-05 | 0.0254933
31 | OR10G2 | 14 | 22097066 | 22107998 | 58 | 3.9152 | 4.52E-05 | 0.0263561
32 | IRGC | 19 | 44215214 | 44229173 | 10 | 4.3059 | 8.31E-06 | 0.0055449
33 | YDJC | 22 | 21977378 | 22004017 | 31 | 4.0944 | 2.12E-05 | 0.0136286
34 | CCDC116 | 22 | 21982086 | 21996616 | 10 | 3.7839 | 7.72E-05 | 0.0423981
a Number of variants annotated to the gene; b Z-value for each gene; c Corrected P value based on FDR method.
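
The Z-Stat, P and P FDR columns can be related with a short sketch. The P values appear consistent with the upper-tail normal probability of the gene Z-value, and footnote c refers to an FDR correction. The code below assumes the Benjamini-Hochberg procedure, which the table does not name explicitly. The function names are illustrative, and the P values passed to `benjamini_hochberg` are toy inputs, because the full genome-wide list of gene P values needed to reproduce the FDR column is not given here.

```python
import math


def one_sided_p(z):
    """Upper-tail probability of a standard normal Z statistic."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Z-Stat of EFNA1 from the table; compare with the reported P column.
print(f"EFNA1: P = {one_sided_p(5.0171):.2e}")

# Toy P values (illustration only) to show the adjustment.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The running minimum in the adjustment step is why neighbouring ranks can share one corrected value, a pattern also visible in the table (for example, three genes with P FDR = 5.34E-07).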

Supplementary Table 13. Summary of 222 pathways with P < 0.05 in pathway analysis as implemented in MAGMA

No. | Source | Pathway name | Ngenes (a) | Beta | SE | P | P_corr (b)
1 | GO | GO_CELLULAR_RESPONSE_TO_INTERLEUKIN_1 | 83 | 0.42 | 0.09 | 4.34E-06 | 0.02
2 | GO | GO_RESPONSE_TO_PLATELET_DERIVED_GROWTH_FACTOR | 18 | 0.75 | 0.22 | 3.88E-04 | 0.73
3 | GO | GO_IMMUNE_RESPONSE | 978 | 0.09 | 0.03 | 4.57E-04 | 0.78
4 | GO | GO_POSITIVE_REGULATION_OF_T_CELL_MEDIATED_IMMUNITY | 28 | 0.51 | 0.16 | 7.64E-04 | 0.91
5 | GO | GO_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY | 85 | 0.28 | 0.09 | 8.28E-04 | 0.92
6 | GO | GO_RESPONSE_TO_VIRUS | 240 | 0.16 | 0.06 | 1.88E-03 | 0.99
7 | GO | GO_REGULATION_OF_LYMPHOCYTE_MIGRATION | 35 | 0.40 | 0.14 | 2.55E-03 | 1.00
8 | GO | GO_POSITIVE_REGULATION_OF_T_CELL_MEDIATED_CYTOTOXICITY | 12 | 0.73 | 0.26 | 2.63E-03 | 1.00
9 | GO | GO_RESPONSE_TO_INTERLEUKIN_1 | 110 | 0.22 | 0.08 | 3.06E-03 | 1.00
10 | GO | GO_CELLULAR_SODIUM_ION_HOMEOSTASIS | 19 | 0.53 | 0.19 | 3.17E-03 | 1.00
11 | GO | GO_CELLULAR_RESPONSE_TO_DSRNA | 37 | 0.36 | 0.13 | 3.31E-03 | 1.00
12 | GO | GO_AMINO_ACID_ACTIVATION | 49 | 0.34 | 0.13 | 3.71E-03 | 1.00
13 | GO | GO_NEGATIVE_REGULATION_OF_VIRAL_PROCESS | 81 | 0.23 | 0.09 | 4.34E-03 | 1.00
14 | GO | GO_MYD88_INDEPENDENT_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY | 30 | 0.40 | 0.15 | 4.43E-03 | 1.00
15 | GO | GO_MEMBRANE_DISASSEMBLY | 47 | 0.34 | 0.13 | 4.67E-03 | 1.00
16 | GO | GO_CELLULAR_PROTEIN_COMPLEX_ASSEMBLY | 332 | 0.12 | 0.04 | 4.86E-03 | 1.00
17 | GO | GO_DEFENSE_RESPONSE_TO_VIRUS | 160 | 0.18 | 0.07 | 4.95E-03 | 1.00
18 | GO | GO_ALPHA_BETA_T_CELL_ACTIVATION | 53 | 0.28 | 0.11 | 5.27E-03 | 1.00
19 | GO | GO_ATP_HYDROLYSIS_COUPLED_TRANSMEMBRANE_TRANSPORT | 35 | 0.34 | 0.13 | 5.43E-03 | 1.00
20 | GO | GO_NEGATIVE_REGULATION_OF_LEUKOCYTE_MIGRATION | 32 | 0.38 | 0.15 | 5.52E-03 | 1.00
21 | GO | GO_T_CELL_DIFFERENTIATION_INVOLVED_IN_IMMUNE_RESPONSE | 29 | 0.37 | 0.15 | 5.66E-03 | 1.00
22 | GO | GO_INNATE_IMMUNE_RESPONSE | 536 | 0.09 | 0.04 | 6.16E-03 | 1.00
23 | GO | GO_GOLGI_VESICLE_TRANSPORT | 308 | 0.12 | 0.05 | 6.25E-03 | 1.00
24 | GO | GO_MUSCLE_CONTRACTION | 230 | 0.14 | 0.06 | 6.63E-03 | 1.00
25 | GO | GO_REGULATION_OF_CELLULAR_EXTRAVASATION | 22 | 0.41 | 0.17 | 7.00E-03 | 1.00
26 | GO | GO_REGULATION_OF_GLUTAMATE_SECRETION | 13 | 0.56 | 0.23 | 7.02E-03 | 1.00
27 | GO | GO_POSITIVE_REGULATION_OF_INTERLEUKIN_12_PRODUCTION | 30 | 0.37 | 0.15 | 7.02E-03 | 1.00
28 | GO | GO_MITOCHONDRIAL_CALCIUM_ION_HOMEOSTASIS | 16 | 0.57 | 0.23 | 7.15E-03 | 1.00
29 | GO | GO_EPIDERMAL_GROWTH_FACTOR_RECEPTOR_SIGNALING_PATHWAY | 55 | 0.27 | 0.11 | 7.24E-03 | 1.00
30 | GO | GO_EMBRYONIC_VISCEROCRANIUM_MORPHOGENESIS | 11 | 0.64 | 0.26 | 7.53E-03 | 1.00
31 | GO | GO_REGULATION_OF_MICROTUBULE_POLYMERIZATION | 31 | 0.34 | 0.14 | 7.92E-03 | 1.00
32 | GO | GO_POSITIVE_REGULATION_OF_GLYCOPROTEIN_METABOLIC_PROCESS | 20 | 0.46 | 0.19 | 7.98E-03 | 1.00
33 | GO | GO_RETROGRADE_VESICLE_MEDIATED_TRANSPORT_GOLGI_TO_ER | 75 | 0.22 | 0.09 | 8.25E-03 | 1.00
34 | GO | GO_NEURON_MIGRATION | 107 | 0.21 | 0.09 | 8.33E-03 | 1.00
35 | GO | GO_REGULATION_OF_VIRAL_ENTRY_INTO_HOST_CELL | 24 | 0.36 | 0.15 | 8.38E-03 | 1.00
36 | GO | GO_DSRNA_FRAGMENTATION | 22 | 0.42 | 0.18 | 8.42E-03 | 1.00
37 | GO | GO_CELL_COMMUNICATION_BY_ELECTRICAL_COUPLING | 15 | 0.57 | 0.24 | 8.49E-03 | 1.00

38 | GO | GO_PROTEIN_LOCALIZATION_TO_CELL_SURFACE | 22 | 0.40 | 0.17 | 8.67E-03 | 1.00
39 | GO | GO_POSITIVE_REGULATION_OF_NEUTROPHIL_MIGRATION | 27 | 0.43 | 0.18 | 8.86E-03 | 1.00
40 | GO | GO_POSITIVE_REGULATION_OF_AMINO_ACID_TRANSPORT | 12 | 0.65 | 0.28 | 9.37E-03 | 1.00
41 | GO | GO_ALPHA_BETA_T_CELL_DIFFERENTIATION | 45 | 0.29 | 0.12 | 9.40E-03 | 1.00
42 | GO | GO_MYD88_DEPENDENT_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY | 32 | 0.34 | 0.15 | 1.01E-02 | 1.00
43 | GO | GO_REGULATION_OF_TRANSMISSION_OF_NERVE_IMPULSE | 12 | 0.57 | 0.24 | 1.02E-02 | 1.00
44 | GO | GO_RESPONSE_TO_TRANSFORMING_GROWTH_FACTOR_BETA | 138 | 0.16 | 0.07 | 1.02E-02 | 1.00
45 | GO | GO_PROTEIN_COMPLEX_LOCALIZATION | 53 | 0.27 | 0.12 | 1.03E-02 | 1.00
46 | GO | GO_CD4_POSITIVE_ALPHA_BETA_T_CELL_ACTIVATION | 34 | 0.31 | 0.14 | 1.08E-02 | 1.00
47 | GO | GO_POTASSIUM_ION_IMPORT | 28 | 0.37 | 0.16 | 1.11E-02 | 1.00
48 | GO | GO_PROTEIN_LOCALIZATION_TO_CHROMATIN | 13 | 0.66 | 0.29 | 1.14E-02 | 1.00
49 | GO | GO_CELLULAR_RESPONSE_TO_ARSENIC_CONTAINING_SUBSTANCE | 14 | 0.60 | 0.26 | 1.15E-02 | 1.00
50 | GO | GO_SODIUM_ION_EXPORT | 12 | 0.57 | 0.25 | 1.20E-02 | 1.00
51 | GO | GO_CELLULAR_RESPONSE_TO_NITROGEN_LEVELS | 9 | 0.67 | 0.30 | 1.22E-02 | 1.00
52 | GO | GO_METANEPHRIC_MESENCHYME_DEVELOPMENT | 14 | 0.45 | 0.20 | 1.22E-02 | 1.00
53 | GO | GO_RNA_DEPENDENT_DNA_BIOSYNTHETIC_PROCESS | 21 | 0.32 | 0.14 | 1.24E-02 | 1.00
54 | GO | GO_REGULATION_OF_SIGNAL_TRANSDUCTION_BY_P53_CLASS_MEDIATOR | 158 | 0.14 | 0.06 | 1.26E-02 | 1.00
55 | GO | GO_HEMIDESMOSOME_ASSEMBLY | 12 | 0.52 | 0.23 | 1.28E-02 | 1.00
56 | GO | GO_CHEMOKINE_MEDIATED_SIGNALING_PATHWAY | 68 | 0.27 | 0.12 | 1.32E-02 | 1.00
57 | GO | GO_ESTABLISHMENT_OR_MAINTENANCE_OF_TRANSMEMBRANE_ELECTROCHEMICAL_GRADIENT | 12 | 0.57 | 0.26 | 1.33E-02 | 1.00
58 | GO | GO_REGULATION_OF_STEROID_METABOLIC_PROCESS | 73 | 0.20 | 0.09 | 1.35E-02 | 1.00
59 | GO | GO_REGULATION_OF_GTPASE_ACTIVITY | 650 | 0.07 | 0.03 | 1.36E-02 | 1.00
60 | GO | GO_POSITIVE_REGULATION_OF_INNATE_IMMUNE_RESPONSE | 241 | 0.12 | 0.05 | 1.39E-02 | 1.00
61 | GO | GO_POSITIVE_REGULATION_OF_T_CELL_CYTOKINE_PRODUCTION | 13 | 0.52 | 0.24 | 1.40E-02 | 1.00
62 | GO | GO_SULFUR_COMPOUND_METABOLIC_PROCESS | 345 | 0.10 | 0.05 | 1.41E-02 | 1.00
63 | GO | GO_TELOMERE_MAINTENANCE_VIA_TELOMERASE | 17 | 0.33 | 0.15 | 1.52E-02 | 1.00
64 | GO | GO_ESTABLISHMENT_OF_PROTEIN_LOCALIZATION_TO_CHROMOSOME | 13 | 0.54 | 0.25 | 1.52E-02 | 1.00
65 | GO | GO_POSITIVE_REGULATION_OF_REACTIVE_OXYGEN_SPECIES_BIOSYNTHETIC_PROCESS | 45 | 0.26 | 0.12 | 1.52E-02 | 1.00
66 | GO | GO_REGULATION_OF_ESTABLISHMENT_OF_PROTEIN_LOCALIZATION_TO_CHROMOSOME | 11 | 0.57 | 0.26 | 1.53E-02 | 1.00
67 | GO | GO_REGULATION_OF_TRANSLATIONAL_FIDELITY | 13 | 0.49 | 0.23 | 1.54E-02 | 1.00
68 | GO | GO_LACTATION | 40 | 0.27 | 0.13 | 1.56E-02 | 1.00
69 | GO | GO_REGULATION_OF_MITOCHONDRIAL_TRANSLATION | 10 | 0.48 | 0.22 | 1.60E-02 | 1.00
70 | GO | GO_EMBRYONIC_FORELIMB_MORPHOGENESIS | 32 | 0.35 | 0.16 | 1.60E-02 | 1.00
71 | GO | GO_TRYPTOPHAN_METABOLIC_PROCESS | 12 | 0.49 | 0.23 | 1.60E-02 | 1.00
72 | GO | GO_REGULATION_OF_NEUTROPHIL_MIGRATION | 32 | 0.33 | 0.16 | 1.60E-02 | 1.00
73 | GO | GO_MITOTIC_SPINDLE_ASSEMBLY | 40 | 0.28 | 0.13 | 1.63E-02 | 1.00
74 | GO | GO_NEGATIVE_REGULATION_OF_VIRAL_GENOME_REPLICATION | 46 | 0.27 | 0.13 | 1.69E-02 | 1.00
75 | GO | GO_BODY_FLUID_SECRETION | 70 | 0.20 | 0.09 | 1.70E-02 | 1.00
76 | GO | GO_POSITIVE_REGULATION_OF_ACTION_POTENTIAL | 11 | 0.58 | 0.28 | 1.73E-02 | 1.00
77 | GO | GO_REGULATION_OF_ARF_PROTEIN_SIGNAL_TRANSDUCTION | 16 | 0.40 | 0.19 | 1.73E-02 | 1.00

78 | GO | GO_DNA_TEMPLATED_TRANSCRIPTION_ELONGATION | 93 | 0.18 | 0.08 | 1.74E-02 | 1.00
79 | GO | GO_MEMBRANE_FUSION | 152 | 0.13 | 0.06 | 1.78E-02 | 1.00
80 | GO | GO_REGULATION_OF_CELL_PROLIFERATION_INVOLVED_IN_KIDNEY_DEVELOPMENT | 12 | 0.55 | 0.26 | 1.80E-02 | 1.00
81 | GO | GO_ETHER_METABOLIC_PROCESS | 12 | 0.57 | 0.27 | 1.81E-02 | 1.00
82 | GO | GO_RESPONSE_TO_ISOQUINOLINE_ALKALOID | 29 | 0.32 | 0.15 | 1.83E-02 | 1.00
83 | GO | GO_KIDNEY_MESENCHYME_DEVELOPMENT | 17 | 0.39 | 0.19 | 1.84E-02 | 1.00
84 | GO | GO_MICROTUBULE_ORGANIZING_CENTER_ORGANIZATION | 84 | 0.19 | 0.09 | 1.87E-02 | 1.00
85 | GO | GO_EPITHELIAL_CELL_MORPHOGENESIS | 40 | 0.27 | 0.13 | 1.92E-02 | 1.00
86 | GO | GO_TELOMERE_MAINTENANCE_VIA_TELOMERE_LENGTHENING | 26 | 0.28 | 0.13 | 1.94E-02 | 1.00
87 | GO | GO_INDOLALKYLAMINE_METABOLIC_PROCESS | 16 | 0.43 | 0.21 | 1.94E-02 | 1.00
88 | GO | GO_MITOTIC_CELL_CYCLE_ARREST | 13 | 0.41 | 0.20 | 1.97E-02 | 1.00
89 | GO | GO_REGULATION_OF_NITRIC_OXIDE_BIOSYNTHETIC_PROCESS | 50 | 0.24 | 0.12 | 1.98E-02 | 1.00
90 | GO | GO_MUSCLE_SYSTEM_PROCESS | 277 | 0.10 | 0.05 | 2.00E-02 | 1.00
91 | GO | GO_REGULATION_OF_CELLULAR_RESPONSE_TO_HEAT | 73 | 0.21 | 0.10 | 2.00E-02 | 1.00
92 | GO | GO_REGULATION_OF_CYTOPLASMIC_TRANSLATION | 13 | 0.42 | 0.20 | 2.00E-02 | 1.00
93 | GO | GO_NEGATIVE_REGULATION_OF_MUSCLE_CONTRACTION | 22 | 0.36 | 0.18 | 2.08E-02 | 1.00
94 | GO | GO_EMBRYONIC_SKELETAL_SYSTEM_MORPHOGENESIS | 93 | 0.18 | 0.09 | 2.15E-02 | 1.00
95 | GO | GO_AROMATIC_AMINO_ACID_FAMILY_METABOLIC_PROCESS | 28 | 0.30 | 0.15 | 2.19E-02 | 1.00
96 | GO | GO_POSITIVE_REGULATION_OF_STEROID_METABOLIC_PROCESS | 22 | 0.35 | 0.17 | 2.28E-02 | 1.00
97 | GO | GO_VESICLE_ORGANIZATION | 274 | 0.09 | 0.05 | 2.32E-02 | 1.00
98 | GO | GO_AMINOGLYCAN_BIOSYNTHETIC_PROCESS | 107 | 0.16 | 0.08 | 2.32E-02 | 1.00
99 | GO | GO_PRE_MIRNA_PROCESSING | 13 | 0.47 | 0.24 | 2.33E-02 | 1.00
100 | GO | GO_REGULATION_OF_ACTION_POTENTIAL | 38 | 0.28 | 0.14 | 2.40E-02 | 1.00
101 | GO | GO_CELLULAR_MODIFIED_AMINO_ACID_BIOSYNTHETIC_PROCESS | 49 | 0.23 | 0.12 | 2.41E-02 | 1.00
102 | GO | GO_POSITIVE_REGULATION_OF_I_KAPPAB_KINASE_NF_KAPPAB_SIGNALING | 174 | 0.12 | 0.06 | 2.44E-02 | 1.00
103 | GO | GO_REGULATION_OF_TRANSPORTER_ACTIVITY | 197 | 0.12 | 0.06 | 2.45E-02 | 1.00
104 | GO | GO_GLOBAL_GENOME_NUCLEOTIDE_EXCISION_REPAIR | 30 | 0.26 | 0.13 | 2.50E-02 | 1.00
105 | GO | GO_SMALL_MOLECULE_CATABOLIC_PROCESS | 324 | 0.09 | 0.05 | 2.56E-02 | 1.00
106 | GO | GO_IMMUNE_EFFECTOR_PROCESS | 433 | 0.08 | 0.04 | 2.57E-02 | 1.00
107 | GO | GO_CELL_COMMUNICATION_INVOLVED_IN_CARDIAC_CONDUCTION | 37 | 0.28 | 0.14 | 2.57E-02 | 1.00
108 | GO | GO_MITOCHONDRIAL_CALCIUM_ION_TRANSPORT | 11 | 0.50 | 0.26 | 2.58E-02 | 1.00
109 | GO | GO_REGULATION_OF_N_METHYL_D_ASPARTATE_SELECTIVE_GLUTAMATE_RECEPTOR_ACTIVITY | 15 | 0.41 | 0.21 | 2.60E-02 | 1.00
110 | GO | GO_CHROMOSOME_LOCALIZATION | 60 | 0.20 | 0.10 | 2.61E-02 | 1.00
111 | GO | GO_PEPTIDYL_TYROSINE_AUTOPHOSPHORYLATION | 38 | 0.25 | 0.13 | 2.63E-02 | 1.00
112 | GO | GO_REGULATION_OF_REACTIVE_OXYGEN_SPECIES_BIOSYNTHETIC_PROCESS | 62 | 0.20 | 0.10 | 2.66E-02 | 1.00
113 | GO | GO_NEGATIVE_REGULATION_OF_FIBROBLAST_GROWTH_FACTOR_RECEPTOR_SIGNALING_PATHWAY | 11 | 0.46 | 0.24 | 2.66E-02 | 1.00
114 | GO | GO_METANEPHRIC_NEPHRON_MORPHOGENESIS | 20 | 0.36 | 0.19 | 2.72E-02 | 1.00
115 | GO | GO_REGULATION_OF_DNA_BIOSYNTHETIC_PROCESS | 90 | 0.16 | 0.08 | 2.73E-02 | 1.00
116 | GO | GO_GLUTATHIONE_METABOLIC_PROCESS | 52 | 0.24 | 0.13 | 2.78E-02 | 1.00
117 | GO | GO_CHONDROITIN_SULFATE_PROTEOGLYCAN_BIOSYNTHETIC_PROCESS | 30 | 0.27 | 0.14 | 2.79E-02 | 1.00

118 | GO | GO_TRANSCRIPTION_ELONGATION_FROM_RNA_POLYMERASE_II_PROMOTER | 79 | 0.18 | 0.09 | 2.79E-02 | 1.00
119 | GO | GO_REGULATION_OF_I_KAPPAB_KINASE_NF_KAPPAB_SIGNALING | 228 | 0.10 | 0.05 | 2.80E-02 | 1.00
120 | GO | GO_DNA_CATABOLIC_PROCESS | 27 | 0.29 | 0.15 | 2.81E-02 | 1.00
121 | GO | GO_REGULATION_OF_MICROVILLUS_ORGANIZATION | 14 | 0.44 | 0.23 | 2.85E-02 | 1.00
122 | GO | GO_POSITIVE_REGULATION_OF_RENAL_SODIUM_EXCRETION | 13 | 0.48 | 0.25 | 2.97E-02 | 1.00
123 | GO | GO_POSITIVE_REGULATION_OF_NF_KAPPAB_TRANSCRIPTION_FACTOR_ACTIVITY | 125 | 0.13 | 0.07 | 2.98E-02 | 1.00
124 | GO | GO_REGULATION_OF_MICROTUBULE_POLYMERIZATION_OR_DEPOLYMERIZATION | 174 | 0.11 | 0.06 | 3.02E-02 | 1.00
125 | GO | GO_APOPTOTIC_DNA_FRAGMENTATION | 15 | 0.43 | 0.23 | 3.04E-02 | 1.00
126 | GO | GO_SPINDLE_ASSEMBLY | 67 | 0.18 | 0.10 | 3.05E-02 | 1.00
127 | GO | GO_CELLULAR_RESPONSE_TO_CYTOKINE_STIMULUS | 563 | 0.07 | 0.04 | 3.08E-02 | 1.00
128 | GO | GO_CARDIAC_CONDUCTION | 82 | 0.18 | 0.10 | 3.09E-02 | 1.00
129 | GO | GO_PHAGOSOME_MATURATION | 36 | 0.27 | 0.14 | 3.10E-02 | 1.00
130 | GO | GO_INOSITOL_PHOSPHATE_CATABOLIC_PROCESS | 10 | 0.45 | 0.24 | 3.12E-02 | 1.00
131 | GO | GO_MAINTENANCE_OF_CELL_NUMBER | 130 | 0.13 | 0.07 | 3.16E-02 | 1.00
132 | GO | GO_POSITIVE_REGULATION_OF_LYMPHOCYTE_MIGRATION | 24 | 0.35 | 0.19 | 3.19E-02 | 1.00
133 | GO | GO_LYSOSOMAL_TRANSPORT | 67 | 0.19 | 0.10 | 3.21E-02 | 1.00
134 | GO | GO_PHOSPHOLIPID_SCRAMBLING | 13 | 0.38 | 0.21 | 3.22E-02 | 1.00
135 | GO | GO_BONE_REMODELING | 34 | 0.25 | 0.14 | 3.24E-02 | 1.00
136 | GO | GO_PURINE_CONTAINING_COMPOUND_BIOSYNTHETIC_PROCESS | 137 | 0.12 | 0.07 | 3.27E-02 | 1.00
137 | GO | GO_REGULATION_OF_SYMBIOSIS_ENCOMPASSING_MUTUALISM_THROUGH_PARASITISM | 193 | 0.11 | 0.06 | 3.28E-02 | 1.00
138 | GO | GO_CELL_MIGRATION_INVOLVED_IN_GASTRULATION | 14 | 0.47 | 0.26 | 3.32E-02 | 1.00
139 | GO | GO_SINGLE_ORGANISM_MEMBRANE_FUSION | 123 | 0.13 | 0.07 | 3.32E-02 | 1.00
140 | GO | GO_CELLULAR_RESPONSE_TO_INTERFERON_GAMMA | 94 | 0.16 | 0.09 | 3.36E-02 | 1.00
141 | GO | GO_RRNA_CONTAINING_RIBONUCLEOPROTEIN_COMPLEX_EXPORT_FROM_NUCLEUS | 11 | 0.48 | 0.26 | 3.38E-02 | 1.00
142 | GO | GO_NEGATIVE_REGULATION_OF_CELLULAR_AMIDE_METABOLIC_PROCESS | 130 | 0.13 | 0.07 | 3.39E-02 | 1.00
143 | GO | GO_REGULATION_OF_B_CELL_APOPTOTIC_PROCESS | 17 | 0.33 | 0.18 | 3.42E-02 | 1.00
144 | GO | GO_RNA_CAPPING | 35 | 0.27 | 0.15 | 3.47E-02 | 1.00
145 | GO | GO_RESPONSE_TO_TUMOR_NECROSIS_FACTOR | 220 | 0.11 | 0.06 | 3.49E-02 | 1.00
146 | GO | GO_REGULATION_OF_DELAYED_RECTIFIER_POTASSIUM_CHANNEL_ACTIVITY | 18 | 0.36 | 0.20 | 3.51E-02 | 1.00
147 | GO | GO_EPOXYGENASE_P450_PATHWAY | 18 | 0.37 | 0.20 | 3.52E-02 | 1.00
148 | GO | GO_REGULATION_OF_RENAL_SODIUM_EXCRETION | 22 | 0.32 | 0.18 | 3.53E-02 | 1.00
149 | GO | GO_NEGATIVE_REGULATION_OF_MULTI_ORGANISM_PROCESS | 140 | 0.12 | 0.07 | 3.53E-02 | 1.00
150 | GO | GO_SODIUM_ION_HOMEOSTASIS | 30 | 0.27 | 0.15 | 3.56E-02 | 1.00
151 | GO | GO_NEGATIVE_REGULATION_OF_TELOMERE_MAINTENANCE_VIA_TELOMERASE | 11 | 0.40 | 0.22 | 3.57E-02 | 1.00
152 | GO | GO_REGULATION_OF_NUCLEOSIDE_METABOLIC_PROCESS | 48 | 0.22 | 0.12 | 3.61E-02 | 1.00
153 | GO | GO_DNA_TEMPLATED_TRANSCRIPTIONAL_PREINITIATION_COMPLEX_ASSEMBLY | 11 | 0.38 | 0.21 | 3.65E-02 | 1.00
154 | GO | GO_CELLULAR_RESPONSE_TO_FLUID_SHEAR_STRESS | 19 | 0.36 | 0.20 | 3.72E-02 | 1.00
155 | GO | GO_REGULATION_OF_CHOLESTEROL_HOMEOSTASIS | 11 | 0.42 | 0.24 | 3.77E-02 | 1.00
156 | GO | GO_POSITIVE_REGULATION_OF_TYPE_I_INTERFERON_PRODUCTION | 68 | 0.17 | 0.10 | 3.79E-02 | 1.00
157 | GO | GO_REGULATION_OF_GONADOTROPIN_SECRETION | 13 | 0.35 | 0.20 | 3.81E-02 | 1.00

158 | GO | GO_SENSORY_PERCEPTION_OF_MECHANICAL_STIMULUS | 153 | 0.12 | 0.07 | 3.82E-02 | 1.00
159 | GO | GO_ENDOSOME_TO_LYSOSOME_TRANSPORT | 41 | 0.23 | 0.13 | 3.82E-02 | 1.00
160 | GO | GO_SOMATIC_STEM_CELL_POPULATION_MAINTENANCE | 64 | 0.18 | 0.10 | 3.84E-02 | 1.00
161 | GO | GO_CHONDROITIN_SULFATE_BIOSYNTHETIC_PROCESS | 25 | 0.27 | 0.15 | 3.85E-02 | 1.00
162 | GO | GO_MYELIN_ASSEMBLY | 17 | 0.37 | 0.21 | 3.86E-02 | 1.00
163 | GO | GO_ER_TO_GOLGI_VESICLE_MEDIATED_TRANSPORT | 162 | 0.11 | 0.06 | 3.86E-02 | 1.00
164 | GO | GO_OSTEOBLAST_DEVELOPMENT | 18 | 0.35 | 0.20 | 3.88E-02 | 1.00
165 | GO | GO_BASEMENT_MEMBRANE_ORGANIZATION | 11 | 0.47 | 0.26 | 3.89E-02 | 1.00
166 | GO | GO_POSITIVE_REGULATION_OF_CALCIUM_ION_TRANSMEMBRANE_TRANSPORTER_ACTIVITY | 31 | 0.29 | 0.16 | 3.90E-02 | 1.00
167 | GO | GO_ORGANELLE_FUSION | 126 | 0.12 | 0.07 | 3.95E-02 | 1.00
168 | GO | GO_REGULATION_OF_INTERFERON_ALPHA_PRODUCTION | 19 | 0.30 | 0.17 | 3.97E-02 | 1.00
169 | GO | GO_LIPID_CATABOLIC_PROCESS | 245 | 0.09 | 0.05 | 4.00E-02 | 1.00
170 | GO | GO_REGULATION_OF_ORGANELLE_ASSEMBLY | 144 | 0.12 | 0.07 | 4.05E-02 | 1.00
171 | GO | GO_MICROTUBULE_POLYMERIZATION | 26 | 0.28 | 0.16 | 4.06E-02 | 1.00
172 | GO | GO_PROTEIN_LOCALIZATION_TO_ENDOPLASMIC_RETICULUM | 118 | 0.13 | 0.08 | 4.11E-02 | 1.00
173 | GO | GO_PTERIDINE_CONTAINING_COMPOUND_BIOSYNTHETIC_PROCESS | 17 | 0.33 | 0.19 | 4.12E-02 | 1.00
174 | GO | GO_REGULATION_OF_POTASSIUM_ION_TRANSMEMBRANE_TRANSPORTER_ACTIVITY | 40 | 0.23 | 0.13 | 4.16E-02 | 1.00
175 | GO | GO_T_CELL_MEDIATED_IMMUNITY | 24 | 0.29 | 0.17 | 4.17E-02 | 1.00
176 | GO | GO_REGULATION_OF_VASCULAR_PERMEABILITY | 30 | 0.27 | 0.15 | 4.18E-02 | 1.00
177 | GO | GO_REPLICATIVE_SENESCENCE | 12 | 0.34 | 0.20 | 4.20E-02 | 1.00
178 | GO | GO_TRANSCRIPTION_FROM_RNA_POLYMERASE_I_PROMOTER | 32 | 0.23 | 0.14 | 4.23E-02 | 1.00
179 | GO | GO_REGULATION_OF_VENTRICULAR_CARDIAC_MUSCLE_CELL_ACTION_POTENTIAL | 11 | 0.45 | 0.26 | 4.25E-02 | 1.00
180 | GO | GO_REGULATION_OF_PROTEIN_LOCALIZATION_TO_CHROMOSOME_TELOMERIC_REGION | 14 | 0.36 | 0.21 | 4.29E-02 | 1.00
181 | GO | GO_EPITHELIAL_CELL_FATE_COMMITMENT | 15 | 0.37 | 0.22 | 4.32E-02 | 1.00
182 | GO | GO_REGULATION_OF_INTERLEUKIN_1_SECRETION | 32 | 0.24 | 0.14 | 4.39E-02 | 1.00
183 | GO | GO_CELLULAR_POTASSIUM_ION_HOMEOSTASIS | 12 | 0.43 | 0.25 | 4.41E-02 | 1.00
184 | GO | GO_REGULATION_OF_NEUTROPHIL_CHEMOTAXIS | 27 | 0.28 | 0.17 | 4.44E-02 | 1.00
185 | GO | GO_REGULATION_OF_AMINO_ACID_TRANSPORT | 24 | 0.29 | 0.17 | 4.49E-02 | 1.00
186 | GO | GO_PROTEIN_LOCALIZATION_TO_CHROMOSOME | 45 | 0.20 | 0.12 | 4.50E-02 | 1.00
187 | GO | GO_VIRION_ASSEMBLY | 36 | 0.21 | 0.13 | 4.52E-02 | 1.00
188 | GO | GO_CELL_MIGRATION_INVOLVED_IN_HEART_DEVELOPMENT | 14 | 0.34 | 0.20 | 4.57E-02 | 1.00
189 | GO | GO_FATTY_ACID_CATABOLIC_PROCESS | 73 | 0.14 | 0.08 | 4.59E-02 | 1.00
190 | GO | GO_GOLGI_TO_ENDOSOME_TRANSPORT | 19 | 0.29 | 0.17 | 4.65E-02 | 1.00
191 | GO | GO_MULTI_ORGANISM_LOCALIZATION | 62 | 0.19 | 0.11 | 4.66E-02 | 1.00
192 | GO | GO_NUCLEOSIDE_PHOSPHATE_BIOSYNTHETIC_PROCESS | 185 | 0.10 | 0.06 | 4.67E-02 | 1.00
193 | GO | GO_MEMBRANE_DEPOLARIZATION_DURING_CARDIAC_MUSCLE_CELL_ACTION_POTENTIAL | 14 | 0.41 | 0.24 | 4.68E-02 | 1.00
194 | GO | GO_RESPONSE_TO_POTASSIUM_ION | 14 | 0.35 | 0.21 | 4.69E-02 | 1.00
195 | GO | GO_REGULATION_OF_VIRAL_GENOME_REPLICATION | 72 | 0.17 | 0.10 | 4.73E-02 | 1.00
196 | GO | GO_RNA_INTERFERENCE | 12 | 0.43 | 0.26 | 4.74E-02 | 1.00
197 | GO | GO_NEGATIVE_REGULATION_OF_HUMORAL_IMMUNE_RESPONSE | 13 | 0.47 | 0.28 | 4.78E-02 | 1.00

198 | GO | GO_AROMATIC_AMINO_ACID_FAMILY_CATABOLIC_PROCESS | 20 | 0.29 | 0.18 | 4.81E-02 | 1.00
199 | GO | GO_INTERACTION_WITH_HOST | 125 | 0.12 | 0.07 | 4.82E-02 | 1.00
200 | GO | GO_METAPHASE_PLATE_CONGRESSION | 41 | 0.21 | 0.13 | 4.83E-02 | 1.00
201 | GO | GO_RESPONSE_TO_ANGIOTENSIN | 17 | 0.37 | 0.23 | 4.84E-02 | 1.00
202 | GO | GO_PHAGOLYSOSOME_ASSEMBLY | 11 | 0.40 | 0.24 | 4.84E-02 | 1.00
203 | GO | GO_REGULATION_OF_RNA_POLYMERASE_II_TRANSCRIPTIONAL_PREINITIATION_COMPLEX_ASSEMBLY | 13 | 0.42 | 0.25 | 4.86E-02 | 1.00
204 | GO | GO_REGULATION_OF_MITOPHAGY | 42 | 0.22 | 0.13 | 4.86E-02 | 1.00
205 | GO | GO_RESPONSE_TO_XENOBIOTIC_STIMULUS | 101 | 0.14 | 0.08 | 4.87E-02 | 1.00
206 | GO | GO_CARDIAC_MUSCLE_CELL_ACTION_POTENTIAL | 37 | 0.24 | 0.15 | 4.87E-02 | 1.00
207 | GO | GO_RESPONSE_TO_DSRNA | 71 | 0.17 | 0.11 | 4.90E-02 | 1.00
208 | GO | GO_POSITIVE_REGULATION_OF_RESPONSE_TO_TUMOR_CELL | 10 | 0.39 | 0.24 | 4.92E-02 | 1.00
209 | GO | GO_REGULATION_OF_VITAMIN_METABOLIC_PROCESS | 11 | 0.43 | 0.26 | 4.96E-02 | 1.00
210 | KEGG | KEGG_LINOLEIC_ACID_METABOLISM | 29 | 0.43 | 0.15 | 1.91E-03 | 0.28
211 | KEGG | KEGG_GLYCOSAMINOGLYCAN_BIOSYNTHESIS_KERATAN_SULFATE | 15 | 0.55 | 0.21 | 4.69E-03 | 0.53
212 | KEGG | KEGG_PURINE_METABOLISM | 155 | 0.15 | 0.06 | 9.06E-03 | 0.75
213 | KEGG | KEGG_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY | 101 | 0.20 | 0.08 | 9.60E-03 | 0.76
214 | KEGG | KEGG_DRUG_METABOLISM_CYTOCHROME_P450 | 69 | 0.26 | 0.12 | 1.93E-02 | 0.94
215 | KEGG | KEGG_VASCULAR_SMOOTH_MUSCLE_CONTRACTION | 111 | 0.15 | 0.08 | 2.05E-02 | 0.95
216 | KEGG | KEGG_AMINOACYL_TRNA_BIOSYNTHESIS | 39 | 0.27 | 0.13 | 2.14E-02 | 0.95
217 | KEGG | KEGG_CHEMOKINE_SIGNALING_PATHWAY | 183 | 0.13 | 0.07 | 2.44E-02 | 0.97
218 | KEGG | KEGG_ALDOSTERONE_REGULATED_SODIUM_REABSORPTION | 42 | 0.24 | 0.13 | 2.54E-02 | 0.97
219 | KEGG | KEGG_LONG_TERM_DEPRESSION | 68 | 0.19 | 0.10 | 3.55E-02 | 0.99
220 | KEGG | KEGG_ETHER_LIPID_METABOLISM | 33 | 0.22 | 0.13 | 4.19E-02 | 1.00
221 | KEGG | KEGG_GAP_JUNCTION | 83 | 0.16 | 0.10 | 4.43E-02 | 1.00
222 | KEGG | KEGG_TRYPTOPHAN_METABOLISM | 39 | 0.21 | 0.13 | 4.65E-02 | 1.00
a Number of genes included in the pathway.
b Corrected P value based on MAGMA’s empirical multiple testing correction method.

Supplementary Table 14. Oligonucleotides used in EMSAs.

Name | Sequence (5’-3’) (a)
rs59133000_T | BIO-ATCCTGAAATTTTCTCCCCAAAGC
rs59133000_C | BIO-ATCCTGAAATTTCCTCCCCAAAGC
rs11187842_G | BIO-TGTACCTTTGGAGAGTAATTCCCTA
rs11187842_A | BIO-TGTACCTTTGGAAAGTAATTCCCTA
rs3781266_T | BIO-AGATGGGAAATATGGTGATAATGAT
rs3781266_C | BIO-AGATGGGAAATACGGTGATAATGAT
rs3740365_A | BIO-ATGAAAATAAATAAGCCAAAAACTA
rs3740365_T | BIO-ATGAAAATAAATTAGCCAAAAACTA
POU3F3 | ATGCCATAATAAATTCCTGA
RUNX1 | GCTTGGTGTGGTCAGTGT
POU2F1 | AGGCTGATTTGCATAGCCCA
POU2F1_Mutant | AGGCTGGCCTGCATAGCCCA
PROP1 | TAATTGAATTA
OTX1 | CAGTAAGCCTTTAATCCTGTCT
PAX3 | CTGGGCGTTATTAGCATATCCCACC
PAX3_Mutant | CTGGGCGTTATGAGGATCTACCACC
a BIO: 5’ biotinylation on both the sense and antisense strands of the duplex
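
As a quick consistency check on the allele-specific probes, each pair of biotinylated oligonucleotides should differ only at the variant base. The sketch below compares the two sequences of each pair position by position. The sequences are copied from the table with the BIO label removed, and the helper name `allele_differences` is illustrative.

```python
def allele_differences(first_oligo, second_oligo):
    """List (0-based index, base1, base2) where two equal-length oligos differ."""
    if len(first_oligo) != len(second_oligo):
        raise ValueError("oligonucleotides must have the same length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(first_oligo, second_oligo))
        if a != b
    ]


# Probe pairs from the table (5'-3'), without the biotin label.
probe_pairs = {
    "rs59133000": ("ATCCTGAAATTTTCTCCCCAAAGC", "ATCCTGAAATTTCCTCCCCAAAGC"),
    "rs11187842": ("TGTACCTTTGGAGAGTAATTCCCTA", "TGTACCTTTGGAAAGTAATTCCCTA"),
    "rs3781266": ("AGATGGGAAATATGGTGATAATGAT", "AGATGGGAAATACGGTGATAATGAT"),
    "rs3740365": ("ATGAAAATAAATAAGCCAAAAACTA", "ATGAAAATAAATTAGCCAAAAACTA"),
}

for snp, (first, second) in probe_pairs.items():
    print(snp, allele_differences(first, second))
```

The same helper can be applied to the competitor pairs (for example POU2F1 and POU2F1_Mutant) to list the mutated positions.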

Supplementary Table 15. Oligonucleotides used for ChIP-qPCR analyses.

Name | Primer sequence (5’-3’)
ChIP-rs59133000_F | AACAGAGCTTGCTTTGGGGA
ChIP-rs59133000_R | CCGTAAGTTGCTAGAGGGCA
ChIP-rs11187842_F | CTCCTCTGATGATGTTCTTGGA
ChIP-rs11187842_R | GCCTCCAAATTGTTCCCA
ChIP-rs3781266_F | AGGGAGTTGTTAGACCAGGAGT
ChIP-rs3781266_R | CCTGCTGATATTAGCCTTGACA
ChIP-rs3740365_F | ATGTTGAGTTGCCTTGCTTGT
ChIP-rs3740365_R | TGCCAAAAATGGACTTCCTGC

Supplementary Table 16. Sequences of siRNA and primers for quantitative real-time PCR (qPCR) assay.

Name | Sequence (5’-3’)
PRKAA1 siRNA1 | CCCAUCCUGAAAGAGUACCAUUCUU
PRKAA1 siRNA2 | CCCUCAAUAUUUAAAUCCUUCUGUG
NFKB1 siRNA | AUAUUUGAAGGUAUGGGCCAUCUGC
NOC3L siRNA1 | ACUGCACACAGAGACUCUGAAUAUU
NOC3L siRNA2 | GCAGCGAGCUCUUGCCUUCAUCAAA
PRKAA1-qPCR_F | CAGCCGAGAAGCAGAAACAC
PRKAA1-qPCR_R | ACCACATCAAGGCTCCGAAT
NFKB1-qPCR_F | ACAGATGGCACTGCCAACAG
NFKB1-qPCR_R | GGGATGGGCCTTCACATACATA
NOC3L-qPCR_F | CAAGTTTCTCAGCAGCGAGC
NOC3L-qPCR_R | TGGGATGATAATGCCTCCGC
GAPDH-qPCR_F | AGCCACATCGCTCAGACAC
GAPDH-qPCR_R | GCCCAATACGACCAAATCC

Supplementary Table 17. Sequences of sgRNA and primers for amplifying sgRNA target site and sequencing.

Name | Sequence (5’-3’)
PRKAA1-sgRNA | CCTTTCTGGTGTGGATTATTGTC (forward)
PRKAA1-sgRNA | GACAATAATCCACACCAGAAAGG (reverse)
PRKAA1-sgRNA_F | CTTGAGTTACTGATTTGGGTTC
PRKAA1-sgRNA_R | AAGCGAAACCCTGCCTCTAT
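
The forward and reverse PRKAA1-sgRNA sequences listed above are reverse complements of each other, which can be confirmed with a few lines of code. This is a minimal sketch and the helper name `reverse_complement` is illustrative.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


forward = "CCTTTCTGGTGTGGATTATTGTC"
reverse = "GACAATAATCCACACCAGAAAGG"

print(reverse_complement(forward) == reverse)
```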
