
Supplementary material Gut

Supplemental laboratory methods

The samples were shipped on dry ice to Metabolon, Inc. (Durham, NC, USA); the ATBC samples were sent before the PLCO samples. The serum samples were assayed using untargeted ultrahigh performance liquid chromatography-tandem mass spectrometry and/or gas chromatography-mass spectrometry. Metabolites were measured on either the Orbitrap Elite or the Q-Exactive platform. Peaks were identified by matching against Metabolon’s reference library of known chemicals.

Ultrahigh performance liquid chromatography/tandem mass spectrometry (UPLC/MS/MS). The LC/MS portion of the platform was based on a Waters ACQUITY ultra-performance liquid chromatography (UPLC) system and a ThermoFisher Scientific Orbitrap Elite or Q-Exactive high resolution/accurate mass spectrometer, which consisted of a heated electrospray ionization (HESI) source and an Orbitrap mass analyzer operated at 30,000 or 35,000 mass resolution, respectively. The sample extract was dried and then reconstituted in acidic or basic LC-compatible solvents, each of which contained 8 or more injection standards at fixed concentrations to ensure injection and chromatographic consistency. Data were acquired over the course of 4 years, during which Metabolon’s methodology evolved. For the first ATBC studies, one sample aliquot was analyzed using acidic positive ion-optimized conditions and another using basic negative ion-optimized conditions, in two independent injections over separate dedicated columns. Subsequent studies (i.e., PLCO) used more chromatographic columns as the methodology advanced. Details of the methodology can be found in Evans et al, 2009; Evans et al, 2014; and Long et al, 2017 (references 1-3).

Gas chromatography/mass spectrometry (GC/MS). GC/MS analysis was performed on the ATBC samples but not on the PLCO samples. The samples were re-dried under vacuum desiccation for a minimum of 18 hours before being derivatized under dried nitrogen using bistrimethyl-silyl-trifluoroacetamide (BSTFA). The GC column was 5% phenyl, and the temperature was ramped from 40 °C to 300 °C over a 16-minute period. Samples were analyzed on a Thermo-Finnigan Trace DSQ fast-scanning single-quadrupole mass spectrometer using electron impact ionization.

Data Extraction and Compound Identification: Raw data were extracted, peaks were identified, and quality control (QC) processing was performed using Metabolon’s hardware and software. Compounds were identified by comparison to library entries of purified standards or recurrent unknown entities. Metabolon maintains a library based on authenticated standards that contains the retention time/index (RI), mass to charge ratio (m/z), and chromatographic data (including MS/MS spectral data) for all molecules present in the library. Biochemical identifications are based on three criteria: a retention index within a narrow RI window of the proposed identification, an accurate mass match to the library within ±0.005 amu, and the MS/MS forward and reverse scores between the experimental data and authentic standards. The MS/MS scores are based on a comparison of the ions present in the experimental spectrum to the ions present in the library spectrum. Although molecules may be similar on any one of these factors, using all three together distinguishes and differentiates biochemicals. More than 3300 commercially available purified standard compounds have been acquired and registered in the laboratory information management system (LIMS) for distribution to both the LC-MS and GC-MS platforms for determination of their analytical characteristics. Additional mass spectral entries have been created for structurally unnamed biochemicals, which have been identified by virtue of their recurrent nature (both chromatographic and mass spectral). These compounds have the potential to be identified by future acquisition of a matching purified standard or by classical structural analysis.

Stolzenberg-Solomon R, et al. Gut 2020;0:1–8. doi: 10.1136/gutjnl-2019-319811

Curation: A variety of curation procedures were carried out to ensure that a high-quality data set was made available for statistical analysis and data interpretation. The QC and curation processes were designed to ensure accurate and consistent identification of true chemical entities, and to remove those representing system artifacts, mis-assignments, and background noise. Metabolon data analysts use proprietary visualization and interpretation software to confirm the consistency of peak identification among the various samples. Library matches for each compound were checked for each sample and corrected if necessary.
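The three identification criteria described under Data Extraction and Compound Identification can be sketched as a simple check. This is an illustration only, not Metabolon's software: the ±0.005 amu mass tolerance comes from the text, while the RI window width, the MS/MS score cutoff, the `LibraryEntry` structure, and the example values are hypothetical placeholders.

```python
from collections import namedtuple

MASS_TOLERANCE_AMU = 0.005  # stated in the text
RI_WINDOW = 25.0            # hypothetical: the text only says "narrow RI window"
MIN_MSMS_SCORE = 0.8        # hypothetical: the text gives no score cutoff

# Hypothetical library record for one authenticated standard.
LibraryEntry = namedtuple("LibraryEntry", ["name", "retention_index", "mz"])


def matches_entry(entry, observed_ri, observed_mz, forward_score, reverse_score):
    """Return True only if all three identification criteria are satisfied."""
    ri_ok = abs(observed_ri - entry.retention_index) <= RI_WINDOW
    mass_ok = abs(observed_mz - entry.mz) <= MASS_TOLERANCE_AMU
    msms_ok = min(forward_score, reverse_score) >= MIN_MSMS_SCORE
    return ri_ok and mass_ok and msms_ok


# Made-up values for illustration.
standard = LibraryEntry("example standard", retention_index=1500.0, mz=166.0863)
print(matches_entry(standard, 1510.0, 166.0880, 0.93, 0.90))  # mass differs by 0.0017 amu
print(matches_entry(standard, 1510.0, 166.0980, 0.93, 0.90))  # mass differs by 0.0117 amu
```

Requiring all three criteria at once mirrors the text's point that similarity on any single factor is not enough to assign an identity.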

1 Evans AM, DeHaven CD, Barrett T, et al. Integrated, nontargeted ultrahigh performance liquid chromatography/electrospray ionization tandem mass spectrometry platform for the identification and relative quantification of the small-molecule complement of biological systems. Anal Chem 2009;81:6656-67.
2 Evans AM, Bridgewater BR, Liu Q, et al. High resolution mass spectrometry improves data quantity and quality as compared to unit mass resolution mass spectrometry in high-throughput profiling metabolomics. Metabolomics 2014;4:7.
3 Long T, Hicks M, Yu HC, et al. Whole-genome sequencing identifies common-to-rare variants associated with human blood metabolites. Nat Genet 2017;49:568-78.


Supplementary methods: pathway groups for pathway analysis

1. Alanine and aspartate metabolism group
    Alanine and aspartate metabolism
    Alanine and aspartate metabolism; pyrimidine metabolism, uracil containing
2. Benzoate metabolism
    Benzoate metabolism; phenylalanine & tyrosine metabolism
    Benzoate metabolism
    Benzoate metabolism; bacterial
3. Butanoate metabolism; cysteine, methionine, SAM, taurine metabolism group
    Butanoate metabolism; cysteine, methionine, SAM, taurine metabolism
    Cysteine, methionine, SAM, taurine metabolism
    Cysteine, methionine, SAM, taurine metabolism; endocannabinoid
4. Bile acids group
    Secondary bile acid metabolism
    Primary bile acid metabolism
5. Carnitine metabolism group
    Carnitine metabolism
    Carnitine metabolism; fatty acid synthesis
    Acetylated peptides
6. Food component/plant group
    Chemical; food component/plant
    Food component/plant
    Food component/plant; krebs cycle, dicarboxylic acid, gluconeogenesis, pyruvate metabolism; food component/plant
    Pentose metabolism; food component/plant
    Sugar, sugar substitute, starch; food component/plant
    Tryptophan metabolism; food component/plant
    Valine, leucine and isoleucine metabolism; food component/plant
7. Chemical group
    Chemical
    Chemical; food component/plant
    Chemical; ketone bodies
    Detoxification metabolism; chemical
    Valine, leucine and isoleucine metabolism; chemical
8. Dipeptide group/polypeptide
    Dipeptide
    Dipeptide derivative
    Dipeptide derivative; glutathione metabolism
    Dipeptide; urea cycle; arginine and proline metabolism
    Polypeptide
9. Drug
    Drug
10. Polyunsaturated (n3 and n6)
    Essential fatty acid; polyunsaturated fatty acid (n3 and n6)
    Long chain fatty acid; polyunsaturated fatty acid (n3 and n6)
    Polyunsaturated fatty acid (n3 and n6)
11. Fatty acid metabolism
    Fatty acid, dicarboxylate; lysine metabolism
    Fatty acid metabolism (also BCAA metabolism)
    Fatty acid metabolism; carnitine metabolism
    Fatty acid metabolism; valine, leucine and isoleucine metabolism
    Fatty acid, amino
    Fatty acid, branched
    Fatty acid, methyl ester
    Fatty acid, monohydroxy
    Fatty acid metabolism (acyl choline)
    Fatty acid metabolism (acyl glutamine)
    Fatty acid, dicarboxylate
12. Fibrinogen cleavage peptide
    Fibrinogen cleavage peptide
13. Gamma-glutamyl amino acid metabolism group
    Gamma-glutamyl amino acid
14. Glutamate metabolism
    Glutamate metabolism
15. Glutathione metabolism
    Glutathione metabolism
    Dipeptide derivative; glutathione metabolism
16. Glycerolipid metabolism group
    Glycerolipid metabolism
    Glycerolipid metabolism; phospholipid metabolism
17. Glycine, serine and threonine metabolism group
    Glycine, serine and threonine metabolism
18. Glycolysis, gluconeogenesis, pyruvate metabolism group
    Glycolysis, gluconeogenesis, pyruvate metabolism
    Glycolysis, gluconeogenesis, pyruvate metabolism; food component/plant
19. Hemoglobin and porphyrin metabolism
    Hemoglobin and porphyrin metabolism
20. Histidine metabolism
    Histidine metabolism
21. Krebs cycle / TCA cycle
    Krebs cycle / TCA cycle
22. Long chain fatty acid group
    Long chain fatty acid
23. Lysine metabolism
    Lysine metabolism
24. Lysolipid group
    Lysolipid; lysoplasmalogen
    Lysophospholipid
    Lysolipid
    Glycerophosphodiester/lysolipid
25. Medium and short chain fatty acid group
    Medium chain fatty acid
    Medium chain fatty acid; fatty acid, amino
    Short chain fatty acid
26. Mono and diacylglycerol group
    Monoacylglycerol
    Diacylglycerol
27. Pentose metabolism group
    Pentose metabolism
    Pentose metabolism; food component/plant
28. Phenylalanine & tyrosine metabolism
    Tyrosine metabolism
    Phenylalanine & tyrosine metabolism
    Phenylalanine metabolism
    Tyrosine metabolism; phenylalanine & tyrosine metabolism
    Phenylalanine & tyrosine metabolism; acetylated
    Phenylalanine & tyrosine metabolism; acetylated peptides
    Phenylalanine metabolism; phenylalanine & tyrosine metabolism
29. Polyamine metabolism group
    Polyamine metabolism
    Guanidino and acetamido metabolism
    Guanidino and acetamido metabolism; polyamine metabolism
30. Purine metabolism group
    Purine and pyrimidine metabolism
    Purine metabolism, (hypo)xanthine/inosine containing
    Purine metabolism, adenine containing
    Purine metabolism, guanine containing
    Purine metabolism, guanine containing; purine metabolism, adenine containing
    Purine metabolism, urate metabolism; purine metabolism, (hypo)xanthine/inosine containing
31. Pyrimidine metabolism group
    Pyrimidine metabolism, cytidine containing
    Pyrimidine metabolism, orotate containing
    Pyrimidine metabolism, thymine containing
    Pyrimidine metabolism, thymine containing; valine, leucine and isoleucine metabolism
    Pyrimidine metabolism, uracil containing
    Purine and pyrimidine metabolism
    Alanine and aspartate metabolism; pyrimidine metabolism, uracil containing
32. Sphingolipid metabolism
    Sphingolipid metabolism
33. Sterol/steroid
    Progestin steroids
    Corticosteroids
    Pregnenolone steroids
    Sterol/steroid
    Androgenic steroids
34. Sugar metabolism
    Fructose, mannose, galactose, starch, and sucrose metabolism
    Fructose, mannose, galactose, starch, and sucrose metabolism; glycogen metabolism
    Amino sugar metabolism
    Amino sugar metabolism; pentose metabolism
35. Tocopherol metabolism
    Tocopherol metabolism
36. Tryptophan metabolism group
    Tryptophan metabolism
    Tryptophan metabolism; food component/plant
37. Tobacco metabolism group
    Tobacco
38. Urea cycle; arginine and proline metabolism group
    Urea cycle; arginine and proline metabolism
    Dipeptide; urea cycle; arginine and proline metabolism
39. Valine, leucine and isoleucine metabolism group
    Valine, leucine and isoleucine metabolism
    Valine, leucine and isoleucine metabolism; chemical
    Valine, leucine and isoleucine metabolism; food component/plant
    Fatty acid metabolism; valine, leucine and isoleucine metabolism
    Pyrimidine metabolism, thymine containing; valine, leucine and isoleucine metabolism
40. Vitamins and cofactors not included in other categories
    Ascorbate and aldarate metabolism
    Glyoxylate and dicarboxylate metabolism; ascorbate and aldarate metabolism
    Nicotinate and nicotinamide metabolism
    Pantothenate and CoA metabolism
    Riboflavin metabolism
    Vitamin B6 metabolism; pyridoxal metabolism
41. Xanthine metabolism
    Xanthine metabolism
42. Others
    Eicosanoid
    Endocannabinoid
    Inositol metabolism
    Ketone bodies
    Chemical; ketone bodies
    Oxidative phosphorylation
    Creatine metabolism

Supplemental statistical methods for the pathway analysis:

We identified the null distribution of Fisher’s statistic by permutation. Specifically, we permuted case/control status 10,000 times and recalculated the statistic each time. The null distribution is the distribution of those statistics and the pathway-level P-value is the proportion of those permutation-generated statistics that exceed the observed value. We then calculated an overall pathway level P-value by combining the ATBC and PLCO values using Fisher’s method.
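The permutation procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: `toy_pvalue` is a hypothetical stand-in (a two-sided z-test on group means) for the per-metabolite conditional logistic regression used in the study, labels are shuffled across all subjects rather than within matched sets, and the data are simulated. A permutation null is useful here because metabolites within a pathway are likely correlated, in which case the usual chi-square reference for Fisher's statistic would not hold.

```python
import math
import random


def fisher_statistic(pvalues):
    """Fisher's combined statistic: -2 * sum(ln p)."""
    # Guard against log(0) if a p-value underflows to zero.
    return -2.0 * sum(math.log(max(p, 1e-300)) for p in pvalues)


def toy_pvalue(levels, is_case):
    """Hypothetical stand-in test: two-sided z-test on the case/control mean difference."""
    cases = [x for x, c in zip(levels, is_case) if c]
    controls = [x for x, c in zip(levels, is_case) if not c]

    def mean_and_var(xs):
        m = sum(xs) / len(xs)
        return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    m1, v1 = mean_and_var(cases)
    m0, v0 = mean_and_var(controls)
    z = (m1 - m0) / math.sqrt(v1 / len(cases) + v0 / len(controls))
    return math.erfc(abs(z) / math.sqrt(2.0))


def pathway_permutation_pvalue(metabolite_levels, is_case, n_perm=10_000, seed=0):
    """Return the observed Fisher statistic and its permutation P-value."""
    rng = random.Random(seed)
    observed = fisher_statistic([toy_pvalue(m, is_case) for m in metabolite_levels])
    labels = list(is_case)
    n_exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute case/control status
        permuted = fisher_statistic([toy_pvalue(m, labels) for m in metabolite_levels])
        if permuted > observed:  # proportion of permuted statistics exceeding the observed value
            n_exceed += 1
    return observed, n_exceed / n_perm


# Simulated pathway: 3 metabolites measured in 30 cases and 30 controls.
sim = random.Random(1)
is_case = [True] * 30 + [False] * 30
levels = [[sim.gauss(1.0 if c else 0.0, 1.0) for c in is_case] for _ in range(3)]
stat, p = pathway_permutation_pvalue(levels, is_case, n_perm=1000)
print(f"Fisher statistic = {stat:.1f}, permutation P = {p:.3f}")
```

With 10,000 permutations the smallest non-zero P-value this procedure can return is 0.0001, which is consistent with the "<0.0001" entry reported for the top pathway.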


Supplemental table 1: Number of cases and proportions by follow-up time in ATBC and PLCO

Time (years)   ATBC, n   PLCO, n   ATBC, proportion   PLCO, proportion
<5             63        52        0.17               0.49
5-10           84        26        0.23               0.24
10-15          96        28        0.26               0.26
15-20          72        1         0.19               0.01
>20            57        0         0.15               0.00
Total          372       107       1.00               1.00


Supplemental Table 2. Odds ratios (OR) and 95% confidence intervals (CI) by quartile for the metabolites associated at FDR < 0.05.

Metabolite and quartile
      ATBC                  PLCO                  Both
      (n=372 matched sets)  (n=107 matched sets)  (n=479 matched sets)
      OR (95% CI)           OR (95% CI)           OR (95% CI)

Glycylvaline
  Q1  referent              referent              referent
  Q2  2.09 (1.29,3.41)      3.05 (1.04,8.98)      2.23 (1.43,3.48)
  Q3  1.42 (0.88,2.29)      3.71 (1.39,9.86)      1.71 (1.11,2.63)
  Q4  2.87 (1.82,4.52)      3.35 (1.27,8.79)      2.95 (1.96,4.45)
Aspartylphenylalanine
  Q1  referent              referent              referent
  Q2  1.14 (0.7,1.83)       0.74 (0.28,1.91)      1.04 (0.68,1.59)
  Q3  1.86 (1.2,2.86)       1.67 (0.69,4.04)      1.82 (1.23,2.68)
  Q4  2.10 (1.36,3.25)      2.46 (1.02,5.92)      2.17 (1.47,3.2)
Tyrosylglutamine
  Q1  referent              referent              referent
  Q2  0.74 (0.49,1.13)      0.70 (0.34,1.44)      0.73 (0.51,1.05)
  Q3  0.68 (0.44,1.05)      0.66 (0.30,1.49)      0.68 (0.46,0.99)
  Q4  0.43 (0.27,0.68)      0.76 (0.32,1.8)       0.49 (0.33,0.73)
Alpha-glutamyltyrosine
  Q1  referent              referent              referent
  Q2  0.64 (0.42,0.95)      0.64 (0.26,1.62)      0.64 (0.44,0.92)
  Q3  0.53 (0.34,0.82)      0.36 (0.13,0.98)      0.49 (0.33,0.74)
  Q4  0.46 (0.29,0.71)      0.45 (0.19,1.09)      0.46 (0.31,0.68)
Pyroglutamylglycine
  Q1  referent              referent              referent
  Q2  1.31 (0.82,2.07)      0.51 (0.2,1.29)       1.09 (0.72,1.64)
  Q3  1.96 (1.26,3.03)      1.49 (0.66,3.36)      1.84 (1.25,2.71)
  Q4  2.30 (1.47,3.58)      1.91 (0.86,4.26)      2.20 (1.49,3.24)
DSGEGDFXAEGGGVR
  Q1  referent              referent              referent
  Q2  0.62 (0.41,0.94)      1.32 (0.58,2.98)      0.73 (0.50,1.05)
  Q3  0.46 (0.30,0.72)      0.85 (0.38,1.91)      0.53 (0.36,0.78)
  Q4  0.49 (0.32,0.76)      0.63 (0.26,1.53)      0.51 (0.35,0.76)
Cysteine-glutathione disulfide
  Q1  referent              referent              referent
  Q2  0.73 (0.48,1.09)      0.56 (0.26,1.25)      0.69 (0.48,0.99)
  Q3  0.52 (0.33,0.80)      0.70 (0.32,1.53)      0.55 (0.38,0.81)
  Q4  0.45 (0.29,0.71)      0.51 (0.22,1.18)      0.47 (0.31,0.69)
Phenylalanylphenylalanine
  Q1  referent              referent              referent
  Q2  1.26 (0.80,1.97)      0.91 (0.39,2.13)      1.17 (0.79,1.74)
  Q3  2.24 (1.46,3.44)      0.84 (0.36,1.97)      1.84 (1.25,2.69)
  Q4  2.17 (1.35,3.49)      2.58 (1.08,6.17)      2.26 (1.49,3.43)
Aspartate
  Q1  referent              referent              referent
  Q2  1.05 (0.66,1.68)      2.15 (0.82,5.64)      1.20 (0.79,1.83)
  Q3  1.39 (0.91,2.15)      2.49 (0.93,6.70)      1.53 (1.03,2.27)
  Q4  1.74 (1.16,2.6)       4.13 (1.51,11.27)     1.96 (1.35,2.85)
Phenylalanylleucine
  Q1  referent              referent              referent
  Q2  1.68 (1.04,2.73)      1.22 (0.44,3.40)      1.59 (1.03,2.46)
  Q3  2.48 (1.57,3.92)      2.78 (1.12,6.89)      2.54 (1.68,3.82)
  Q4  2.23 (1.39,3.57)      2.39 (1.00,5.72)      2.27 (1.50,3.43)
Tryptophylglutamate
  Q1  referent              referent              referent
  Q2  1.26 (0.78,2.01)      0.56 (0.19,1.61)      1.10 (0.71,1.69)
  Q3  1.09 (0.67,1.75)      1.68 (0.71,3.98)      1.21 (0.79,1.83)
  Q4  1.99 (1.30,3.06)      1.19 (0.51,2.77)      1.79 (1.22,2.63)
Glutamate
  Q1  referent              referent              referent
  Q2  2.08 (1.30,3.33)      0.49 (0.18,1.31)      1.59 (1.04,2.43)
  Q3  1.74 (1.09,2.78)      1.72 (0.76,3.90)      1.73 (1.15,2.60)
  Q4  2.40 (1.47,3.93)      2.66 (1.06,6.70)      2.46 (1.59,3.80)
Mannose
  Q1  referent              referent              referent
  Q2  1.45 (0.92,2.30)      1.17 (0.50,2.72)      1.38 (0.92,2.07)
  Q3  1.77 (1.12,2.77)      2.07 (0.82,5.20)      1.82 (1.21,2.73)
  Q4  1.75 (1.08,2.83)      3.39 (1.18,9.74)      1.96 (1.26,3.04)
Gamma-glutamylglutamate
  Q1  referent              referent              referent
  Q2  1.24 (0.8,1.93)       0.63 (0.20,1.94)      1.13 (0.75,1.71)
  Q3  1.52 (0.99,2.33)      3.30 (1.28,8.53)      1.73 (1.17,2.56)
  Q4  1.52 (0.98,2.37)      4.83 (1.58,14.77)     1.78 (1.18,2.69)
Sphingosine
  Q1  referent              referent              referent
  Q2  1.38 (0.79,2.43)      1.38 (0.58,3.26)      1.38 (0.86,2.21)
  Q3  2.11 (1.26,3.54)      2.48 (1.06,5.82)      2.21 (1.42,3.43)
  Q4  1.56 (0.90,2.72)      2.61 (1.13,6.02)      1.83 (1.15,2.90)
Cotinine
  Q1  referent              referent              referent
  Q2  0.94 (0.59,1.52)      1.96 (0.61,6.31)      1.05 (0.67,1.63)
  Q3  2.01 (1.30,3.11)      1.79 (0.37,8.65)      2.00 (1.31,3.03)
  Q4  1.41 (0.88,2.24)      2.41 (0.60,9.71)      1.49 (0.96,2.31)
Phenylalanine
  Q1  referent              referent              referent
  Q2  1.37 (0.86,2.16)      0.78 (0.34,1.76)      1.20 (0.80,1.78)
  Q3  1.75 (1.11,2.78)      1.47 (0.67,3.25)      1.68 (1.13,2.50)
  Q4  1.89 (1.16,3.06)      1.28 (0.53,3.09)      1.72 (1.13,2.63)
3-ureidopropionate
  Q1  referent              referent              referent
  Q2  1.29 (0.83,1.99)      1.31 (0.52,3.33)      1.29 (0.87,1.92)
  Q3  1.34 (0.87,2.06)      1.04 (0.41,2.64)      1.28 (0.87,1.89)
  Q4  1.56 (1.01,2.43)      2.34 (0.96,5.67)      1.69 (1.14,2.51)
Gamma-glutamylphenylalanine
  Q1  referent              referent              referent
  Q2  1.64 (1.03,2.60)      1.75 (0.72,4.30)      1.66 (1.10,2.51)
  Q3  1.12 (0.70,1.78)      2.03 (0.74,5.56)      1.24 (0.81,1.90)
  Q4  2.06 (1.29,3.28)      2.74 (0.97,7.75)      2.16 (1.41,3.30)
7-methylguanine
  Q1  referent              referent              referent
  Q2  1.65 (1.08,2.53)      0.97 (0.41,2.32)      1.49 (1.02,2.19)
  Q3  1.21 (0.76,1.92)      1.11 (0.45,2.73)      1.19 (0.79,1.79)
  Q4  2.00 (1.25,3.20)      1.65 (0.65,4.22)      1.92 (1.26,2.93)
3-methoxytyrosine
  Q1  referent              referent              referent
  Q2  1.46 (0.93,2.30)      3.62 (1.37,9.53)      1.72 (1.14,2.59)
  Q3  1.62 (1.02,2.55)      2.25 (0.83,6.12)      1.71 (1.13,2.59)
  Q4  2.02 (1.25,3.25)      3.58 (1.28,10.07)     2.23 (1.45,3.44)
Gamma-glutamylisoleucine
  Q1  referent              referent              referent
  Q2  1.21 (0.76,1.92)      1.41 (0.56,3.55)      1.25 (0.82,1.88)
  Q3  1.37 (0.87,2.18)      1.47 (0.56,3.83)      1.39 (0.92,2.11)
  Q4  1.82 (1.14,2.90)      1.90 (0.77,4.66)      1.83 (1.21,2.78)
O-cresol sulfate
  Q1  referent              referent              referent
  Q2  0.96 (0.61,1.51)      1.15 (0.49,2.65)      1.00 (0.67,1.49)
  Q3  1.54 (1.01,2.34)      1.14 (0.43,3.05)      1.47 (1.00,2.16)
  Q4  1.20 (0.77,1.87)      1.69 (0.67,4.25)      1.28 (0.86,1.91)
C-glycosyltryptophan
  Q1  referent              referent              referent
  Q2  1.17 (0.71,1.94)      0.92 (0.37,2.31)      1.11 (0.71,1.72)
  Q3  2.02 (1.26,3.23)      0.65 (0.27,1.60)      1.58 (1.04,2.40)
  Q4  1.97 (1.19,3.26)      1.62 (0.7,3.77)       1.87 (1.22,2.89)
Hydroxycotinine
  Q1  referent              referent              referent
  Q2  1.28 (0.83,1.98)      1.73 (0.40,7.54)      1.31 (0.86,1.99)
  Q3  1.34 (0.86,2.07)      0.29 (0.04,2.06)      1.25 (0.81,1.91)
  Q4  1.39 (0.88,2.20)      2.00 (0.43,9.18)      1.43 (0.92,2.22)
Alpha-tocopherol
  Q1  referent              referent              referent
  Q2  0.77 (0.51,1.16)      0.19 (0.06,0.59)      0.65 (0.44,0.96)
  Q3  0.66 (0.43,1.01)      0.41 (0.14,1.25)      0.62 (0.41,0.92)
  Q4  0.71 (0.46,1.11)      0.25 (0.08,0.80)      0.62 (0.41,0.94)
Tryptophan
  Q1  referent              referent              referent
  Q2  1.04 (0.66,1.64)      1.13 (0.43,2.96)      1.06 (0.7,1.60)
  Q3  1.75 (1.14,2.69)      0.74 (0.27,1.98)      1.53 (1.03,2.26)
  Q4  1.60 (1.01,2.52)      1.82 (0.76,4.35)      1.64 (1.10,2.46)
7-Hoca
  Q1  referent              referent              referent
  Q2  1.53 (0.98,2.37)      1.16 (0.43,3.10)      1.46 (0.98,2.18)
  Q3  1.49 (0.95,2.33)      0.88 (0.31,2.48)      1.37 (0.91,2.07)
  Q4  1.65 (1.04,2.62)      1.50 (0.61,3.67)      1.61 (1.07,2.44)
Guanine
  Q1  referent              referent              referent
  Q2  0.64 (0.36,1.14)      NA                    0.64 (0.36,1.14)
  Q3  0.61 (0.34,1.12)      NA                    0.61 (0.34,1.12)
  Q4  0.43 (0.23,0.84)      NA                    0.43 (0.23,0.84)
N2,N2-dimethylguanosine
  Q1  referent              referent              referent
  Q2  2.00 (1.26,3.17)      3.58 (0.61,20.98)     1.87 (1.25,2.81)
  Q3  2.57 (1.60,4.13)      0.28 (0.03,2.62)      1.94 (1.27,2.97)
  Q4  2.28 (1.34,3.89)      1.10 (0.28,4.27)      2.20 (1.37,3.53)
Cotinine N-oxide
  Q1  referent              referent              referent
  Q2  2.11 (1.31,3.42)      NA                    2.19 (1.38,3.49)
  Q3  1.91 (1.17,3.13)      NA                    1.75 (1.08,2.83)
  Q4  2.04 (1.26,3.31)      NA                    1.91 (1.21,3.00)

¹ Odds ratios (OR) and 95% confidence intervals (CI) for each quartile, with quartiles based on the cohort-specific distribution of log-metabolite levels among controls, calculated using conditional logistic regression conditioned on the matching variables (age, date of blood draw, sex, and race). An overall estimate was calculated by combining the two ORs using a fixed-effects meta-analysis. The analysis included 372 case-control sets from the ATBC study and 107 case-control sets from the PLCO study.
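The fixed-effects combination of the two cohort-specific ORs can be illustrated with a short sketch. It assumes standard inverse-variance weighting on the log(OR) scale, with each standard error back-calculated from the reported 95% CI; the supplement does not state the exact implementation, so the function names and this choice of weighting are assumptions. The input values are the Glycylvaline Q4 estimates from Supplemental Table 2.

```python
import math

Z_975 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def log_or_and_se(or_value, ci_low, ci_high):
    """Recover log(OR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_975)
    return math.log(or_value), se


def fixed_effects_pool(estimates):
    """Inverse-variance fixed-effects pooling of (OR, CI low, CI high) tuples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for or_value, ci_low, ci_high in estimates:
        log_or, se = log_or_and_se(or_value, ci_low, ci_high)
        weight = 1.0 / se ** 2  # more precise estimates get more weight
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled),
            math.exp(pooled - Z_975 * pooled_se),
            math.exp(pooled + Z_975 * pooled_se))


# Glycylvaline, Q4 versus Q1: ATBC 2.87 (1.82,4.52) and PLCO 3.35 (1.27,8.79).
pooled_or, low, high = fixed_effects_pool([(2.87, 1.82, 4.52), (3.35, 1.27, 8.79)])
print(f"{pooled_or:.2f} ({low:.2f},{high:.2f})")
```

Under these assumptions the result lands close to the pooled estimate reported in the table for this row, 2.95 (1.96,4.45); small differences can arise from rounding of the published ORs and CIs. Because ATBC contributes far more matched sets, its narrower interval dominates the weights.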


Supplemental table 3. Metabolites associated with pancreatic ductal adenocarcinoma in the Alpha-Tocopherol, Beta-Carotene (ATBC) Cancer Prevention Study and the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial cohort (PLCO) combined

Each metabolite occupies two lines: the first gives the metabolite, sub-pathway, and class; the second gives the minimally adjusted and fully adjusted estimates.

Metabolite | Sub-pathway | Class
    Minimally adjusted meta-analysis¹       Fully adjusted meta-analysis¹,²
    (n=479 case-control sets)               (n=479 case-control sets)
    OR (95% CI)       P-value   Q-value     OR (95% CI)       P-value   Q-value

Glycylvaline | Dipeptide | Peptide
    1.46 (1.28,1.67)  4.33E-08  1.33E-05    1.48 (1.28,1.70)  5.70E-08  2.33E-05
Aspartylphenylalanine | Dipeptide | Peptide
    1.38 (1.21,1.59)  2.60E-06  3.28E-04    1.41 (1.22,1.62)  2.50E-06  4.92E-04
Tyrosylglutamine | Dipeptide | Peptide
    0.72 (0.63,0.83)  3.21E-06  3.28E-04    0.73 (0.63,0.84)  1.08E-05  7.34E-04
Pyroglutamylglycine | Dipeptide | Peptide
    1.35 (1.18,1.53)  5.16E-06  3.46E-04    1.38 (1.20,1.58)  3.61E-06  4.92E-04
α-glutamyltyrosine | Dipeptide | Peptide
    0.74 (0.65,0.84)  5.63E-06  3.46E-04    0.74 (0.65,0.85)  1.48E-05  8.63E-04
DSGEGDFXAEGGGVR | Fibrinogen cleavage peptide | Peptide
    0.74 (0.65,0.85)  1.05E-05  5.37E-04    0.73 (0.63,0.83)  5.80E-06  5.93E-04
Cysteine-glutathione disulfide | Glutathione metabolism | Amino acid
    0.75 (0.65,0.85)  1.49E-05  6.54E-04    0.75 (0.65,0.86)  4.85E-05  2.44E-03
Phenylalanylphenylalanine | Dipeptide | Peptide
    1.33 (1.17,1.52)  2.15E-05  8.23E-04    1.38 (1.20,1.59)  8.05E-06  6.59E-04
Aspartate | Alanine and aspartate metabolism | Amino acid
    1.31 (1.15,1.49)  3.93E-05  1.23E-03    1.31 (1.15,1.49)  7.24E-05  2.96E-03
Phenylalanylleucine | Dipeptide | Peptide
    1.33 (1.16,1.53)  4.02E-05  1.23E-03    1.35 (1.17,1.56)  5.37E-05  2.44E-03
Tryptophylglutamate | Dipeptide | Peptide
    1.30 (1.15,1.48)  4.84E-05  1.35E-03    1.30 (1.14,1.48)  1.27E-04  3.99E-03
Glutamate | Glutamate metabolism | Amino acid
    1.31 (1.14,1.49)  8.91E-05  2.28E-03    1.35 (1.16,1.57)  9.31E-05  3.46E-03
Mannose | Fructose, mannose, galactose, starch, and sucrose metabolism | Carbohydrate
    1.37 (1.16,1.62)  2.34E-04  5.53E-03    1.35 (1.12,1.62)  1.54E-03  3.00E-02
γ-glutamylglutamate | γ-glutamyl amino acid | Peptide
    1.27 (1.11,1.46)  6.99E-04  1.45E-02    1.30 (1.12,1.51)  5.40E-04  1.47E-02
Sphingosine | Sphingolipid metabolism | Lipid
    1.34 (1.13,1.59)  7.07E-04  1.45E-02    1.34 (1.12,1.60)  1.18E-03  2.46E-02
Cotinine | Tobacco metabolite | Xenobiotics
    1.26 (1.10,1.44)  8.15E-04  1.56E-02    1.24 (1.05,1.46)  1.03E-02  1.26E-01
Phenylalanine | Phenylalanine metabolism; phenylalanine & tyrosine metabolism | Amino acid
    1.26 (1.10,1.45)  8.91E-04  1.61E-02    1.30 (1.12,1.51)  6.61E-04  1.69E-02
3-ureidopropionate | Alanine and aspartate metabolism; pyrimidine metabolism, uracil containing | Amino acid; nucleotide
    1.26 (1.10,1.44)  9.82E-04  1.68E-02    1.23 (1.07,1.42)  3.75E-03  5.91E-02
γ-glutamylphenylalanine | γ-glutamyl amino acid | Peptide
    1.26 (1.10,1.45)  1.11E-03  1.79E-02    1.32 (1.13,1.54)  3.64E-04  1.06E-02
7-methylguanine | Purine metabolism, guanine containing | Nucleotide
    1.27 (1.10,1.47)  1.25E-03  1.92E-02    1.30 (1.11,1.51)  8.38E-04  2.02E-02
O-cresol sulfate | Benzoate metabolism; phenylalanine & tyrosine metabolism | Xenobiotics; amino acid
    1.24 (1.09,1.42)  1.50E-03  2.19E-02    1.19 (1.03,1.38)  2.21E-02  1.97E-01
γ-glutamylisoleucine | γ-glutamyl amino acid | Peptide
    1.25 (1.09,1.43)  1.60E-03  2.19E-02    1.25 (1.08,1.45)  2.37E-03  4.22E-02
3-methoxytyrosine | Tyrosine metabolism; phenylalanine & tyrosine metabolism | Amino acid
    1.26 (1.09,1.46)  1.64E-03  2.19E-02    1.36 (1.16,1.59)  1.10E-04  3.75E-03
C-glycosyltryptophan | Tryptophan metabolism | Amino acid
    1.26 (1.09,1.46)  2.17E-03  2.72E-02    1.30 (1.11,1.52)  9.93E-04  2.26E-02
Hydroxycotinine | Tobacco metabolite | Xenobiotics
    1.23 (1.08,1.41)  2.22E-03  2.72E-02    1.18 (1.00,1.39)  4.75E-02  2.69E-01
Tryptophan | Tryptophan metabolism | Amino acid
    1.24 (1.08,1.42)  2.38E-03  2.74E-02    1.26 (1.09,1.45)  1.95E-03  3.62E-02
Alpha-tocopherol | Tocopherol metabolism | Cofactors and vitamins
    0.80 (0.69,0.92)  2.41E-03  2.74E-02    0.81 (0.70,0.94)  5.72E-03  8.08E-02
7-α-hydroxy-3-oxo-4-cholestenoate (7-Hoca) | Sterol | Lipid
    1.22 (1.07,1.40)  3.53E-03  3.87E-02    1.21 (1.05,1.39)  7.94E-03  1.05E-01
Guanine | Purine metabolism, guanine containing | Nucleotide
    0.74 (0.61,0.91)  4.10E-03  4.33E-02    0.72 (0.59,0.90)  2.91E-03  4.96E-02
N2,N2-dimethylguanosine | Purine metabolism, guanine containing | Nucleotide
    1.24 (1.07,1.44)  4.23E-03  4.33E-02    1.31 (1.11,1.54)  1.20E-03  2.46E-02
Cotinine N-oxide | Tobacco metabolite | Xenobiotics
    1.22 (1.06,1.39)  4.60E-03  4.56E-02    1.18 (1.00,1.38)  4.49E-02  2.69E-01

¹ Odds ratios (OR) and 95% confidence intervals (CI) for a 1 standard deviation (SD) increase in log-metabolite level, calculated using conditional logistic regression conditioned on the matching variables (age, date of blood draw, sex, and race). An overall estimate was calculated by combining the two ORs using a fixed-effects meta-analysis. The analysis included 372 case-control sets from the ATBC study and 107 case-control sets from the PLCO study.

² Additionally adjusted for age, smoking (ATBC: years smoked and smoking intensity; PLCO: never, former quit >15 years ago, former quit <15 years ago, or current smoking), BMI (kg/m2, continuous), and diabetes (yes, no).


Supplemental figure 1: A. Spearman correlation coefficients between the top metabolites associated with pancreatic cancer (n=31, Q-value < 0.05) and known pancreatic cancer risk factors among the ATBC study controls (n=372)


B. Spearman correlation coefficients between the top metabolites associated with pancreatic cancer (n=30, Q-value < 0.05) and known pancreatic cancer risk factors among the PLCO study controls (n=107)


Supplemental Table 4. Odds ratios and 95% confidence intervals for the top selected metabolites using stepwise logistic regression in ATBC and PLCO combined

Odds ratios (95% confidence intervals); "-" indicates no estimate at that step.

Metabolites        Step 1            Step 2            Step 3            Step 4            Step 5            Step 6            Step 7
Glycylvaline       1.48 (1.29,1.71)  1.55 (1.34,1.80)  1.52 (1.31,1.76)  1.51 (1.30,1.76)  1.51 (1.30,1.76)  1.50 (1.29,1.75)  1.32 (1.10,1.60)
α-tocopherol       -                 0.74 (0.64,0.86)  0.72 (0.62,0.84)  0.71 (0.60,0.83)  0.70 (0.59,0.82)  0.69 (0.59,0.82)  0.70 (0.60,0.83)
Mannose            -                 -                 1.36 (1.14,1.61)  1.37 (1.15,1.63)  1.32 (1.11,1.58)  1.31 (1.10,1.56)  1.32 (1.11,1.58)
3-methoxytyrosine  -                 -                 -                 1.28 (1.09,1.49)  1.26 (1.08,1.47)  1.24 (1.06,1.45)  1.26 (1.08,1.47)
Tryptophan         -                 -                 -                 -                 1.20 (1.04,1.39)  1.23 (1.06,1.42)  1.25 (1.08,1.46)
Hydroxycotinine    -                 -                 -                 -                 -                 1.20 (1.05,1.37)  1.20 (1.05,1.38)
Tyrosylglutamine   -                 -                 -                 -                 -                 -                 0.81 (0.67,0.98)

¹ Odds ratios (OR) and 95% confidence intervals (CI) for a 1 standard deviation (SD) increase in log-metabolite level, calculated using stepwise conditional logistic regression among the FDR < 0.05 metabolites (table 2). The overall estimate was calculated by combining the two ORs in ATBC and PLCO using a fixed-effects meta-analysis. The analysis included 372 case-control sets from the ATBC study and 107 case-control sets from the PLCO study.


Supplemental Figure 2. Time-varying metabolite associations from the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial cohort (PLCO) nested case-control study (n=107 matched sets) for the five significant time-varying metabolites observed in the ATBC. The X-axis is time from baseline blood draw (year 0) to date of pancreatic adenocarcinoma diagnosis, up to 16 years. The Y-axis shows the strength of the association, measured as either the OR (right axis) or log(OR) (left axis).


Supplemental Table 5: Metabolic pathways associated with pancreatic cancer¹,²

Sub-pathway                                                           Metabolites, n  P-value
Dipeptide group/polypeptide                                           28              <0.0001
Fibrinogen cleavage peptide                                           2               0.0002
Alanine and aspartate metabolism group                                8               0.0005
Glutathione metabolism                                                3               0.0005
Purine metabolism group                                               17              0.0009
Tobacco metabolism group                                              4               0.001
γ-glutamyl amino acid metabolism group                                13              0.0015
Glutamate metabolism                                                  4               0.0017
Glycolysis, gluconeogenesis, pyruvate metabolism group                7               0.004
Bile acids                                                            19              0.006
Sphingolipid metabolism                                               5               0.007
Benzoate metabolism                                                   20              0.009
Tocopherol metabolism                                                 6               0.01
Tryptophan metabolism group                                           17              0.01
Phenylalanine & tyrosine metabolism                                   18              0.02
Sugar metabolism                                                      10              0.02
Pyrimidine metabolism group                                           13              0.03
Sterol/steroid                                                        28              0.03
Butanoate metabolism; cysteine, methionine, SAM, taurine metabolism   15              0.04
Urea cycle; arginine and proline metabolism group                     17              0.04
Pentose metabolism group                                              11              0.05
Xanthine metabolism                                                   16              0.11
Chemical group                                                        13              0.15
Glycine, serine and threonine metabolism                              10              0.16
Krebs cycle / TCA cycle                                               6               0.19
Food component/plant group                                            23              0.22
Hemoglobin and porphyrin metabolism                                   7               0.26
Valine, leucine and isoleucine metabolism                             27              0.29
Lysolipid group                                                       41              0.31
Vitamins and cofactors not included in other categories               11              0.34
Mono and diacylglycerol group                                         11              0.40
Polyamine metabolism                                                  5               0.44
Drug                                                                  5               0.47
Carnitine metabolism                                                  14              0.51
Polyunsaturated (n3 and n6)                                           14              0.54
Glycerolipid metabolism                                               6               0.57
Medium and short chain fatty acid                                     10              0.68
Others³                                                               16              0.69
Fatty acid metabolism                                                 36              0.72
Long chain fatty acid group                                           14              0.85
Histidine metabolism                                                  10              0.86
Lysine metabolism                                                     7               0.99

¹ Pathways are based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) and described in the Supplementary methods (pathway groups for pathway analysis). For each of the two studies, we combined the P-values of the metabolites included in a given pathway by Fisher’s method. We then calculated an overall pathway-level P-value by combining the ATBC and PLCO values using Fisher’s method. The analysis included 372 case-control sets from the ATBC study and 107 case-control sets from the PLCO study.
² The P-values are not adjusted for multiple comparisons. The Bonferroni-corrected significance threshold for the 42 pathways is 0.05/42=0.0012.
³ The “Others” group included categories with fewer than 5 metabolites that could not otherwise be categorized.
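The last two steps in the footnotes, combining the two study-level pathway P-values by Fisher's method and comparing against the Bonferroni threshold, can be sketched as below. The sketch assumes the two study-level P-values are independent, so that Fisher's statistic follows a chi-square distribution with 2k degrees of freedom for k P-values; that distribution has a closed-form tail for even degrees of freedom. The input P-values are made-up numbers, not values from the study.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Fisher's method for k independent P-values.

    X = -2 * sum(ln p) is chi-square with 2k degrees of freedom, and
    P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # x / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


N_PATHWAYS = 42
bonferroni_threshold = 0.05 / N_PATHWAYS

# Hypothetical study-level pathway P-values (for illustration only).
p_atbc, p_plco = 0.004, 0.03
p_overall = fisher_combined_pvalue([p_atbc, p_plco])
print(f"overall P = {p_overall:.5f}, threshold = {bonferroni_threshold:.4f}")
print("passes Bonferroni" if p_overall < bonferroni_threshold else "does not pass Bonferroni")
```

For two P-values the formula reduces to p1*p2*(1 - ln(p1*p2)), which makes it easy to check a table entry by hand.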
