<<

ARTICLE

https://doi.org/10.1038/s41467-019-13498-3 OPEN Metabolomics analysis of human acute graft- versus-host disease reveals changes in host and microbiota-derived metabolites

David Michonneau 1,2,9, Eleonora Latis3,9, Emmanuel Curis 4,5,6, Laetitia Dubouchet 2, Sivapriya Ramamoorthy7, Brian Ingram7, Régis Peffault de Latour1,2, Marie Robin1, Flore Sicre de Fontbrune1, Sylvie Chevret6,8, Lars Rogge 3,10 & Gérard Socié 1,2,10* 1234567890():,; Despite improvement in clinical management, allogeneic hematopoietic stem cell trans- plantation (HSCT) is still hampered by high morbidity and mortality rates, mainly due to graft versus host disease (GvHD). Recently, it has been demonstrated that the allogeneic immune response might be influenced by external factors such as tissues microenvironment or host microbiota. Here we used high throughput metabolomics to analyze two cohorts of geno- typically HLA-identical related recipient and donor pairs. Metabolomic profiles markedly differ between recipients and donors. At the onset of acute GvHD, in addition to host-derived metabolites, we identify significant variation in microbiota-derived metabolites, especially in aryl hydrocarbon receptor (AhR) ligands, bile acids and plasmalogens. Altogether, our findings support that the allogeneic immune response during acute GvHD might be influ- enced by bile acids and by the decreased production of AhR ligands by microbiota that could limit indoleamine 2,3-dioxygenase induction and influence allogeneic T cell reactivity.

1 Hematology Transplantation, Saint Louis Hospital, 1 avenue Claude Vellefaux, 75010 Paris, France. 2 Université de Paris, INSERM U976, 75010 Paris, France. 3 Institut Pasteur, Immunoregulation Unit, Department of Immunology, 25 rue du Docteur Roux, 75015 Paris, France. 4 Université de Paris, INSERM UMR-S1144, 75013 Paris, France. 5 Université de Paris, Laboratoire de biomathématiques plateau iB2EA 7537–BioSTM, Faculté de pharmacie, 75006 Paris, France. 6 Service de Biostatistique et Information Médicale, Hôpital Saint-Louis, AP-HP, 1 avenue Claude Vellefaux, 75010 Paris, France. 7 Metabolon, Inc., Morrisville, NC, USA. 8 Université de Paris, INSERM U1153, Epidemiology and Biostatistics Sorbonne Paris Cité Research Center (CRESS), ECSTRA Team, 75010 Paris, France. 9These authors contributed equally: David Michonneau, Eleonora Latis. 10These authors jointly supervised this work: Lars Rogge, Gérard Socié. *email: [email protected]

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

llogeneic hematopoietic stem cell transplantation (HSCT) (46.18%), (24.16%), xenobiotics (13%), nucleotides Ais a major treatment for hematologic malignancies and for (4.43%), carbohydrate (3.82%), cofactors and (3.82%), inherited or acquired hematopoiesis disorders. However, peptide (3.21%) and energy (1.38%) pathways (Fig. 1c). it is still hampered by high morbidity and mortality rates1 mainly due to graft-versus-host disease (GvHD). Although insight to GvHD pathophysiology has been gained from the development of Alteration of recipients’ metabolome after transplantation.To animal models2, most of these experimental studies have focused identify whether metabolome of allogeneic-HSCT recipients dif- on the role of different T cell subsets3,4. However, it has been fers from those of healthy subjects, recipients without GvHD were recently demonstrated that the allogeneic immune response compared to their related sibling donors. For each metabolite, might be influenced by external factors such as tissues UPLC-MS/MS generates many empty values when the corre- microenvironment5,6 or host microbiota7,8. Recently, metabo- sponding sample’s value is under the lower limit of quantifica- lomics has emerged as a major new field in system biology that tion21. To identify metabolites that were more frequently detected can reflect how genetics, environmental factors, or microbiota in recipients or donors, irrespective of the amounts of com- affect host biochemical processes9. During the transplantation pounds, the distribution of non-detectable values was assessed process, patients are subjected to chemotherapy, antibiotics, and (Fig. 2a, Supplementary Fig. 1a). Using this approach, we iden- antiviral therapy that can alter tissue biology and microbiota, and tified five and three metabolites in cohorts 1 and 2 that were more could impact GvHD severity or relapse risk10,11. Many metabo- frequently detected in donors or in their related recipients after lites can contribute to immune response regulation through their correction for multiple testing, respectively (Supplementary direct influence on immune cell activation, proliferation, or sur- Data 2 and 3). After filtering metabolites with more than 50% of vival12–14. Hence, it has been suggested that pre-transplant non-detectable values and imputation with half of the minimal metabolomics profiles could impact transplant outcomes15. detected values (Fig. 2b, Supplementary Fig. 1b), metabolites with Microbiota-derived metabolites such as short-chain fatty acids putative biological relevance were identified by comparison of the could regulate tissue reparation and immune cell activation16–18. amount of each metabolite between recipients without GvHD and In mice, butyrate-producing bacteria can mitigate experimental their paired related donors (Fig. 2c, d, Supplementary Fig. 1c, d, acute GvHD16 and a higher abundance of these bacteria in Supplementary Data 4 and 5). As compared with healthy subjects, humans is associated with a reduced rate of respiratory viral allogeneic-HSCT recipients without GvHD were mainly char- infection after allogeneic-HSCT19. If many metabolites could acterized by a significant increase in the amounts of complex lipid regulate immunity, there is no clear broad and unbiased overview metabolites, especially fatty acid, mono and diacylglycerol and of how metabolomics pathways are influenced by the transplan- primary BA. Amino acid metabolism was also altered by trans- tation process and which changes are more specifically associated plantation. Especially, -derived metabolites and taur- with acute GvHD at disease onset20. ine production were strongly decreased after transplantation, Using high throughput metabolomics to analyze two cohorts of whereas metabolites were increased in recipients genotypically HLA-identical-related recipients and donors, we without GvHD. Global analysis of metabolites behavior by show that HSCT is followed by major changes in metabolomics comparison of their relative amount confirmed that BA and profiles of recipients. At acute GVHD onset, significant variation indolepropionate were the most affected after transplantation of host- and microbiota-derived metabolites are identified, mainly (Fig. 2e, f, Supplementary Fig. 1e, f). Among the polyamine affecting compounds of the tryptophan metabolism, a group, we observed a significantly increased level of group of metabolites that acts as ligands for the aryl hydrocarbon 5-methylthioadenosine (MTA) and N-acetylputrescine that might receptor (AhR), together with bile acids (BAs) and plasmalogens. play a role in protection of gut integrity22, could inhibit macro- This suggests that dysbiosis, together with transplantation-related phage activation23, and reduce T cell activation24,25. Interestingly, alteration of host metabolism, induce major change in circulating it was also recently demonstrated that intestinal inflammation is metabolites in recipients that might influence allogeneic immune regulated by microbial-derived metabolites through the regula- cell reactivity. tion of NLRP6 inflammasome signaling. Whereas taurine may activate inflammasome signaling and lead to production of the pro-inflammatory cytokine, polyamine may inhibit this pathway Results and protect gut from auto-inflammation26. Bile acids have Metabolomics profiling of patients after allogeneic-HSCT. recently been described as a potent regulator of the immune Herein we studied two cohorts of patients, one single center system through the inhibition of the NLRP3-dependent inflam- cohort from Saint Louis hospital (cohort 1, n = 43) and one masome pathway27, the recruitment of NKT cells in the liver, multicentric from 13 French transplantation centers (cohort 2, through the regulation of CXCL16 production by liver sinusoid n = 56), who received an allogeneic-HSCT from an HLA- endothelial cells28,29, and to reduce macrophages activation, identical sibling donor. Plasma samples from sibling donors migration, and cytokines production30–32. Altogether, these were obtained before stem cell collection and thus considered as a results suggest that the metabolomics changes observed in healthy group of reference. Recipients’ samples were collected at patients without GvHD could contribute to decrease inflamma- the onset of acute GvHD before any treatment with corticoster- some activation and could thus protect epithelial integrity after oids or at day 90 after transplantation for those who did not transplantation. Using the same approach, we compared meta- develop GvHD (Fig. 1a). Patient and transplant characteristics are bolomic profiling of recipients with GvHD to their related described in Supplementary Data 1. Importantly, almost all donors. Few metabolites were more frequently detected in donors patients had natural feeding at the time of sampling, excluding or in recipients with GvHD (n = 0 in cohort 1 and n = 6in the possibility that metabolomics changes might be attributed to cohort 2) (Fig. 3a, Supplementary Fig. 2a, Supplementary data 6). artificial parenteral or enteral nutrition. Using ultrahigh perfor- Comparison of the amounts of each metabolite identified 150 and mance liquid chromatography-tandem mass spectrometry 182 metabolites that were significantly changed after transplan- (UPLC-MS/MS) we detected 801 and 927 circulating metabolites tation, in cohorts 1 and 2 respectively, with 110 metabolites in cohorts 1 and 2, respectively, with 653 metabolites being shared by both cohorts (Fig. 3c, d, Supplementary Fig. 2c, d, shared by both cohorts (Fig. 1b). The average distribution of these Supplementary Data 7 and 8). Most of metabolic pathways shared metabolites into each metabolic pathway was: lipid involved in recipients with GvHD were similar to those identified

2 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE

ab Cohorts Outcomes Data acquisition & statistical analysis

Cryopreseved plasma

Sibling- Recipients No GvHD Ultrahigh performance liquid chromatography after identical No GvHD at day + 90 with and mass spectrometry donors transplant immunosuppressive Mass spectrometry 148 653 274 prophylaxis peak identification Compounds Compounds Compounds Cohort 1 : n = 32 Cohort 2 : n = 30 Data analysis Monocentric cohort 1 Acute GvHD Donors (n = 43) Data pre-processing : filtration, missing data imputation, log-transformation Recipients (n = 43) Acute GvHD before corticosteroids Statistical analysis : t-test, fold change, global comparison of metabolites behaviour Multicentric cohort 2 and LASSO logistic regression Donors (n = 56) Cohort 1 : n = 12 Recipients (n = 56) Cohort 2 : n = 26 Confirmation analysis after matching on clinical data (age, gender,BMI) Cohort 1 Cohort 2 c 55 60 33 49 11 14 13 63 64 21

20 16

7 5 1 8 68 65

18 15 17 9 61 59 42 56 58 66 67 12 2 19 10 23 24 72 3 4 6 48 53 50 62 34 38 76 75 74 27 22

51 45 40 43 46 57 69 71 78 29 25 26 44 37 32 28 70 35 47 52 41 39 77 73 31 30 36 54

Super pathway Amino acid Cofactors and vitamins Lipid Peptide Carbohydrate Energy Nucleotide Xenobiotics

Fig. 1 Study design and general overview of detected metabolites. a Metabolomic profile was studied in two cohorts of patients, a single center cohort from Saint Louis hospital, Paris (cohort 1), and a multicentric French cohort (cohort 2). Plasma samples were collected in sibling donors before stem cell collection and in recipients after transplantation, at day 90 for those who did not developed GvHD or before corticosteroid treatment at disease onset for those who developed GvHD. Samples were frozen and then analyzed by ultrahigh-performance liquid chromatography-tandem mass spectrometry (UPLC- MS/MS) for circulating metabolites identification and quantification. *One donor did not give his consent for further analysis. b In all, 801 and 923 metabolites were identified in cohorts 1 and 2, respectively, with 653 metabolites shared by both cohorts. c Treemap showing the distribution of metabolites among super pathway (in color) and their related sub-pathways (square) for the 653 metabolites used for statistical analysis. Each number corresponds to one of the sub-pathway listed in Source Data file. in recipients without GvHD, with the exception of primary BA metabolism, was the only metabolite significantly less frequently and mono/diacylglycerol that were increased in recipients by detected at GvHD onset after Bonferroni correction in cohort 2 comparison with their donors. A common feature of all recipients (Fig. 4a, Supplementary Fig. 3a, Supplementary Data 9 and 10). was a strong decrease in xenobiotics detection, especially in Comparison of metabolites quantities revealed substantial changes xanthine, tobacco, and food metabolites. This suggests that all in amino acids (tryptophan and arginine pathways) and lipids recipients modified their behaviors regarding food or tobacco (lysoplasmalogens, plasmalogens, and phospholipids) that were all intake, as strongly recommended after HSC transplantation. decreased in patient with GvHD. In addition to the above described pathways, we also identified a strong increase of multiple complex Acute GvHD is characterized by specific metabolomics chan- lipid products in patients with acute GvHD, such as medium- and ges. As recently demonstrated in mice models, metabolites could long-chain fatty acid, polyunsaturated fatty acid (n3 and n6) and in regulate GvHD severity17,33. In patients, metabolome mainly primary and secondary BA (Fig. 4b, c, Supplementary Fig. 3b, c, reflects the metabolism of host cells and of its microbiota. To Supplementary Data 11 and 12). Global comparison of metabolites fi determine which metabolomics changes are associated with acute ratio between patients also con rmed that BA, plasmalogens, GvHD, we compared the metabolome of recipients with GvHD at tryptophan, and arginine metabolites were the main contributors of disease onset with that of recipients without GvHD. Assessment of the changes observed at GvHD onset (Fig. 4e, f, Supplementary fi non-detectable value distribution among patients revealed that Fig. 3e, f). These results were con rmed after adjustment for age, indolepropionate, a microbial-derived compound from tryptophan gender, and BMI that were considered as the main confounding

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

a b Distribution of non detectable value not associated with a clinical outcome (n = 614) 150 Recipients without GvHD Donors

Distribution of non detectable value associated with a clinical outcome (P<.05) (n = 32)

Distribution of non detectable value associated 100 with a clinical outcome (P<.0001) (n = 5)

Metabolites significantly associated with a group of patients (n = 37)

50 Discarded 83 metabolites undetected in more than 50% of samples

Imputation of remaining undetected values

Non detectable values distribution profile 0 Detection of significant metabolites with t-test and 100 50 0 50 100 fold change analysis (n = 64) Percentage of non detectable values c d

9 Mannose 4 Glucose Ribonate 5-methylthioadenosine (MTA) 2 8 Adenine 2-aminooctanoate 0 2-hydroxyoctanoate 7 Glycochenodeoxycholate Glycochenodeoxycholate sulfate −2 Cystine 1-palmitoyl-2-steroyl-GPC(16:0/18:0) 6 1–2-dipalmitoyl-GPC (16:0/16:0) −4 Oleoyl-oleoyl-glycerol (18:1/18:1) [1] Oleoyl-oleoyl-glycerol (18:1/18:1) [2] 5 Oleoyl-arachidonoyl-glycerol (18:1/20:4) [2] Linoleoyl-arachidonoyl-glycerol (18:2/20:4) N-methylpipecolate value) 4 4-cholesten-3-one P 1-palmitoleoylglycerol (16:1) N6-carbamoylthreonyladenosine N4-acetylcytidine 3 C-glycosyltryptophan Pseudouridine –Log( N1-methyladenosine N2,N2-dimethylguanosine 2 N1-methylinosine Erythritol Methionine sulfone 1 N-acetylputrescine 4-acetamidobutanoate 2-piperidinone Glucuronate 0 Pyridoxate Gamma-glutamylglycine 1 N-acetylarginine Lanthionine –4 –3 –2 –1 0 1 2 3 4 Sulfate Pro-hydroxy-pro 5-hydroxylysine Fold change (recipient/donor) 5alpha-androstan-3alpha,17beta-diol monosulfate Epiandrosterone sulfate Androsterone sulfate Cofactors and vitamins Nucleotide 5alpha-pregnan-3beta,20alpha-diol monosulfate Dehydroisoandrosterone sulfate (DHEA-S) Lipid Xenobiotics Glycolithocholate sulfate Amino acid Carbohydrate Deoxycholate Glycodeoxycholate Energy Indolepropionate Peptide Cinnamoylglycine Hippurate Pipecolate Homoarginine Guanidinoacetate Lactosyl-N-palmitoyl-sphingosine (d18:1/16:0) Lactosyl-N-nervonoyl-sphingosine (d18:1/24:1) Phosphoethanolamine Succinate Taurine Adenosine 5′-monophosphate (AMP) 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0) 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2) 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1) Bilirubin (Z,Z) Biliverdin Gamma-CEHC Uridine Cysteine- disulfide EDTA 3-(4-hydroxyphenyl)lactate 1,5-anhydroglucitol (1,5-AG) Alpha-hydroxyisovalerate 7-alpha-hydroxy-3-oxo-4-cholestenoate (7-Hoca) Alpha.hydroxyisocaproate Donors Recipients w/o aGvHD

1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)

ef1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2) 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)

1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0) indole-3-carboxylic acid

indole-3-carboxylic acid 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* urea 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1) taurocholenate sulfate urea dimethylarginine (SDMA + ADMA) taurocholenate sulfate 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* N-methylproline dimethylarginine (SDMA + ADMA) N-methylproline 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* proline kynurenate hyocholate ornithine indolelactate spermidine kynurenate 2-oxoarginine* ornithine glycocholenate sulfate* Hyocholate indolelactate 2-oxoarginine* glycocholenate sulfate* chenodeoxycholate picolinate homoarginine chenodeoxycholate picolinate taurodeoxycholate indoleacetate homoarginine indoleacetate Deoxycholate tartronate (hydroxymalonate) taurodeoxycholate kynurenine 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* tartronate (hydroxymalonate) 3-indoxyl sulfate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* Deoxycholate 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* tryptophan betaine 1-(1-enyl-palmitoyl)-GPC (P-16:0)* trans-4-hydroxyproline 3-indoxyl sulfate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* N-delta-acetylornithine tryptophan betaine 1-(1-enyl-palmitoyl)-GPC (P-16:0)* N-delta-acetylornithine trans-4-hydroxyproline arginine homocitrulline arginine 1-(1-enyl-oleoyl)-GPE (P-18:1)* homocitrulline N-acetylproline 1-(1-enyl-oleoyl)-GPE (P-18:1)* N-acetylkynurenine (2) N-acetylproline glycohyocholate 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* N-acetylkynurenine (2) Cholate 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) Cholate 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) C-glycosyltryptophan Glycohyocholate 1-(1-enyl-palmitoyl)-GPE (P-16:0)* 1-(1-enyl-stearoyl)-GPE (P-18:0)* taurocholate glycocholate 1-(1-enyl-palmitoyl)-GPE (P-16:0)* 1-(1-enyl-stearoyl)-GPE (P-18:0)*taurocholate glycocholate C-glycosyltryptophan N-acetyltryptophan tryptophan 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* N-acetyltryptophan 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* N-acetylcitrulline tryptophan N-acetylcitrulline

taurolithocholate 3-sulfate 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)* 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2) 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* citrulline citrulline Taurolithocholate 3-sulfate 4-acetamidobutanoate 4-acetamidobutanoate indoleacetylglutamine indoleacetylglutamine taurochenodeoxycholate taurochenodeoxycholate N2,N5-diacetylornithine N2,N5-diacetylornithine N-methylpipecolate N-methylpipecolate

Glycodeoxycholate glycodeoxycholate N-acetylarginine Secondary bile acid N-acetylarginine Acisoga acisoga 5-methylthioadenosine (MTA) 5-methylthioadenosine (MTA) Glycolithocholate sulfate N -acetylputrescine Pro-hydroxy-pro glycolithocholate sulfate N-acetylputrescine pro-hydroxy-pro Ursodeoxycholate Ursodeoxycholate Polyamine Glycochenodeoxycholate Glycochenodeoxycholate Indolepropionate Indolepropionate Glycoursodeoxycholate Glycoursodeoxycholate AhR ligands Vlycochenodeoxycholate glucuronide Glycochenodeoxycholate glucuronide (tryptophan)

Vlycochenodeoxycholate sulfate Glycochenodeoxycholate sulfate Primary bile acid factors. Thus, we were able to identify a group of metabolites that (Fig. 6c, d, Supplementary Data 14) and used to build an over- were still significantly associated with GvHD onset in both cohorts representation analysis (ORA) of the main pathways that contribute (Supplementary Data 13, Fig. 5). To confirm the implication of to GvHD (Fig. 6e, f). This approach confirmed that most of these metabolites at acute GvHD onset, PCA was used to confirm metabolites identified in univariate analysis were also selected by that recipient with or without GvHD could be discriminated in sPLS-DA and belong to the previously identified metabolic path- multivariate analysis (Fig. 6a, b). Metabolites that mostly contribute ways, especially plasmalogens, arginine, and tryptophan metabo- to acute GvHD profile were then identified with sparse PLS-DA lism. Considering the fact that metabolites variation might result

4 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE

Fig. 2 Metabolomics changes after transplantation in recipients without GvHD. a Distribution of non-detectable values for each metabolite was assessed in recipient without GvHD and in their related paired donors. Each line represents a profile of distribution for non-detectable value between groups, observed for at least one metabolite. The frequency of non-detectable values in both groups was compared for each distribution profile using a McNemar exact test and colored as follows: green: frequency of non-detectable values non-significantly different between groups; orange: frequency of non- detectable values significantly different between groups with p < 0.05 but not significant after Bonferroni correction for multiple testing; red: frequency of non-detectable values significantly different between groups after Bonferroni correction for multiple testing (p < 7.69 × 10−5). Red area represents the threshold used for filtration of metabolites with more than 50% of non-detectable value. b A total number of 651 metabolites were analyzed for comparison of paired donor and recipients without GvHD. Among them, 37 metabolites were more frequently detected in one group of patients, including 5 metabolites that were still significant after Bonferroni correction (Supplementary Data 2). Eighty-three metabolites with more than 50% of non- detectable values were discarded for further analysis, and remaining non-detectable values were replaced after imputation with half of the minimal detected value for each corresponding metabolite. c The remaining 568 metabolites were compared with a paired Student test followed by a Bonferroni correction. The volcano plot represents the variation of metabolites amount between recipients without GvHD and their related paired donors according to the −log(p value). List of the 64 significant metabolites is available in Supplementary Data 4. d Heatmap representation of the most significant metabolites after Student test, after hierarchical clustering of samples. e, f Significant variation of metabolites was confirmed after global comparison of the relative amounts between compounds. Main pathways identified in the previous analysis (i.e. polyamine metabolism, tryptophan metabolism, urea cycle, arginine & proline metabolism, bacterial or fungal, plasmalogen or lysoplasmalogen, and primary or secondary bile acid metabolism) were used to build an undirected graph where each node is a metabolite and two nodes are connected if their ratio is unchanged between the two groups (see Methods). The same network was first colored according to the considered sub-pathway (e) and then according to microbial-derived metabolites (red nodes) (f).

from their interdependency, lasso logistic regression was performed At the onset of acute GvHD, we observed a significant decrease to identify the set of metabolites that seems to be predictive of acute in tryptophan metabolites, including microbiota-produced com- GvHD (Supplementary Data 14). Using lasso regression analysis, pounds, such as 3-indoxyl sulfate, indoleacetate, indoleacetylgluta- we were able to confirm that both tryptophan and arginine path- mine, and indolepropionate, and host-derived compounds ways were involved, especially the AhR ligand 3-indoxyl sulfate that produced by the IDO (indoleamine 2,3-dioxygenase) tolerogenic was detected in both cohorts. We finally compared compounds pathway37, N-acetylkynurenine and picolinic acid. Indole com- rates for these pathways and observed similar profile in both pounds are AhR ligands that regulate IDO induction in immune cohorts at GvHD onset (Fig. 7a). In addition to host-derived cells38, whereas kynurenine was recently demonstrated to be a metabolites, we also identified significant variation in microbiota- strong inhibitor of T cell activation39. However, it should be derived indole compounds in both cohorts, especially in AhR emphasized that the functional consequence of AhR ligands ligands (Fig. 7b). All these microbiota-derived metabolites changes decrease should be explored, best in animal models. It has been are consistent with a major dysbiosis associated with acute demonstrated experimentally that the biological effect of some AhR GvHD7,34,35. Interestingly, it was recently demonstrated that fecal ligands, such as indolepropionate, is not dose-responsive40,41,sug- microbiota transplantation in HSCT patients was followed by gesting that the absence of some AhR ligands could be more rele- increased microbiome diversity and a concomitant increase in the vant than their decrease in acute GvHD. This is also why it appears level of urinary 3-indoxyl sulfate36. that the comparison of metabolites that are detected or not detected in the recipients (Fig. 4a, Supplementary Fig. 3a, Tables 9 and 11) is, Discussion in our view, as important as the comparison of metabolites amount. GvHD is an immune reaction from donor immune cells targeting Using this approach, we identify N-acetyl-kynurenine in both allogeneic antigens in recipients, whose pathophysiology is still cohorts, as well as indole-3-carboxylic acid, 3-indoxyl sulfate, and only partially understood in humans. Recent experimental indolepropionate in cohort 2, which were not only decreased in researches have highlighted the complex network of interactions recipients with GvHD but also more frequently undetectable in and regulations between immune cells, microbiota, and host these patients. Our findings thus support the hypothesis that acute environment4. Here we report that patients who underwent GvHD onset is associated with a decreased production of AhR ligands by microbiota that could limit IDO induction. IDO activity allogeneic-HSCT have major metabolomics changes compared to – healthy subjects, and that acute GvHD onset seems to be asso- has been associated with GvHD severity in mice and humans42 44, ciated with specific differences in metabolomics profile by com- and positively regulates AhR signaling through the production of parison with patients who did not developed GvHD. Comparing the AhR ligand kynurenine. AhR can also modulate Th17 response patients without GvHD with those who developed an unpre- and promote tolerance through the differentiation and the activa- dictable time onset GvHD leads to intrinsic differences between tion of Treg and Tr1 cells45. Otherwise, we observed significant groups of patients that may have contributed to the observed variation in citrulline, a precursor of arginine synthetized by the variations in metabolites. These differences include slight differ- intestine which has been considered as a surrogate biomarker of gut ences in the day of sampling after transplantation, feeding, or epithelial mass46. It was recently identified as a risk factor of acute treatments received at the time of sampling. To minimize the GvHD, as a consequence of epithelial damage and gut putative impact of these confounding factors, we used two permeability47,48. Many lipids pathways seem to be altered at the independent cohorts of patients to confirm our results and onset of acute GvHD. Polyunsaturated fatty acids were increased at explored groups of as much as possible similar patient char- the onset of GvHD and are the precursors of the eicosanoid family, acteristics in terms of HLA matching, immunosuppression, or such as leukotriene or prostaglandin (PG)49. Inhibition of 5- diet mode. Our results suggest that transplantation is followed by lipoxygenase (5-LO) reduces leukotriene B4 production from ara- modification in the recipients’ eating behaviors, mainly affecting chidonic acid and protects mice from acute GvHD in an experi- xanthine metabolism (suggesting different consumption of cho- mental model50.5-LOdeficiency was associated with a decreased colate, tea, and coffee) or condiments-derived metabolites production of pro-inflammatory cytokines such as interferon-γ, (piperine, alliin, cinnamoilglycine). Importantly, most patients TNF-α, and IL-17, and an increased level of circulating IL-10 had oral feeding at time of sampling and did not receive artificial (ref. 50). In addition, PGE2 is critical for gut repair after inflam- feeding that could have affected metabolism. mation and was recently found to be regulated by bacterial-

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

a b Distribution of non detectable value not associated with a clinical out come (n = 637) 70 Recipients with GvHD Donors Distribution of non detectable value associated 60 with a clinical out come (P<.05) (n = 16)

50 Metabolites significantly associated with a group of patients (n = 16)

40

30 Discarded 33 metabolites undetected in more than 50% of samples

20

Imputation of remaining undetected values 10

Non detectable values distribution profile 0 Detection of significant metabolites with t-test and fold change analysis (n = 150) 100 50 050100 Percentageof non detectable values c d 10 1-docosahexaenoylglycerol (22:6) 3 1-oleoylglycerol (18:1) N1−methyladenosine 9 1,2-dipalmitoyl-GPC (16:0/16:0) 2 Tauroursodeoxycholate 1-palmitoyl-2-oleoyl-GPE (16:0/18:1) N-stearoyl-sphingosine (d18:1/18:0) 1 8 N-palmitoyl-sphingosine (d18:1/16:0) N−acetylneuraminate 0 Hexadecanedioate 7 Glycochenodeoxycholate N−acetylputrescine −1 N1−methylinosine 6 Glucuronate −2 N2,N2-dimethylguanosine 2−aminooctanoate 5 2−hydroxyoctanoate −3 value) N4−acetylcytidine Methionine sulfone P pro−hydroxy−pro 4 Quinolinate Pseudouridine C−glycosyltryptophan Xanthosine 3 Imidazole propionate

–Log( 2−aminoadipate Methionine 2 6−hydroxyindole sulfate 3−indoxyl sulfate Piperine 1 Allantoin Behenoyl sphingomyelin (d18:1/22:0) Cinnamoylglycine 1-palmitoyl-2-linoleoyl-GPC (16:0/18:2) 0 Indolepropionate 4−methylcatechol sulfate p−cresol sulfate –1 4−vinylphenol sulfate 3−methoxycatechol sulfate –8–6–4–202468 Sphingomyelin (d18:1/15:0, d16:1/17:0) Sphingomyelin (d18:2/23:0, d18:1/23:1, d17:1/24:1) Tricosanoyl sphingomyelin (d18:1/23:0) Sphingomyelin (d18:1/22:1, d18:2/22:0, d16:1/24:1) Fold change (recipient/donor) Citrulline uridine Guanidinoacetate 4−ethylphenylsulfate 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2) Cofactors and vitamins Nucleotide 5alpha-pregnan-3beta,20alpha-diol monosulfate Serotonin Lipid Xenobiotics 4-androsten-3beta,17beta-diol monosulfate 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2) 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4) Amino acid Carbohydrate 5-acetylamino-6-formylamino-3-methyluracil 1.3.7−trimethylurate Energy Peptide Paraxanthine 7-alpha-hydroxy-3-oxo-4-cholestenoate (7-Hoca) 4−hydroxychlorothalonil 2−hydroxylaurate 1-linoleoyl-2-linolenoyl-GPC (18:2/18:3) 1-linoleoyl-GPG (18:2) N−acetyl−cadaverine Tryptophan Sphingomyelin (d18:1/21:0, d17:1/22:0, d16:1/23:0) Hippurate Sphingomyelin (d18:1/14:0, d16:1/16:0) Xanthurenate N−methylproline 2−oxoarginine Deoxycholate Myristoyl dihydrosphingomyelin (d18:0/14:0) Homoarginine 2−hydroxydecanoate 3.7−dimethylurate Phenylacetylcarnitine 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4) 1-(1-enyl-oleoyl)-2-linoleoyl-GPE (P-18:1/18:2) 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1) 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2) 1-linoleoyl-2-arachidonoyl-GPC (18:2/20:4n6) 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1) 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4) 3−acetylphenol sulfate 3-(3-hydroxyphenyl)propionate 1-linoleoyl-GPC (18:2) 1,2-dilinoleoyl-GPC (18:2/18:2) Pyrraline Citraconate/glutaconate Trigonelline (N'-methylnicotinate) N−2−furoylglycine 4-vinylguaiacol sulfate 3−methyl catechol sulfate 3−hydroxypyridine 1−methylxanthine Catechol sulfate O−methylcatechol sulfate 3−hydroxyhippurate Glycodeoxycholate Gentisate Donors Recipients with aGvHD

1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* efcitrulline Glycodeoxycholate 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* citrulline 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* homoarginine glycodeoxycholate 2-oxoarginine* 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* homoarginine 1-(1-enyl-palmitoyl)-GPC (P-16:0)* 1-(1-enyl-stearoyl)-GPE (P-18:0)* 2-oxoarginine* 1-(1-enyl-palmitoyl)-GPE (P-16:0)* picolinate 1-(1-enyl-palmitoyl)-GPC (P-16:0)* 1-(1-enyl-stearoyl)-GPE (P-18:0)* 1-(1-enyl-oleoyl)-GPE (P-18:1)* picolinate N2,N5-diacetylornithine 1-(1-enyl-palmitoyl)-GPE (P-16:0)* 1-(1-enyl-oleoyl)-GPE (P-18:1)* arginine N-acetylcitrulline 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)* N2,N5-diacetylornithine 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* arginine N-acetylcitrulline 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)* glycolithocholate sulfate* 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:*1) 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* chenodeoxycholate glycolithocholate sulfate* 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* chenodeoxycholate indoleacetate tartronate (hydroxymalonate) indolelactate taurolithocholate 3-sulfate indoleacetate tartronate (hydroxymalonate) 3-indoxyl sulfate indolelactate ornithine spermidine indole-3-carboxylic acid 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* taurolithocholate 3-sulfate tryptophan betaine 3-indoxyl sulfate 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* proline N-acetylproline ornithine spermidine indole-3-carboxylic acid 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* N-acetylproline tryptophan betaine cholate N-acetylkynurenine (2) proline 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* N-acetylkynurenine (2) 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* taurodeoxycholate N-methylproline cholate N-delta-acetylornithine 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* taurodeoxycholate N-methylproline urea hyocholate glycohyocholate hyocholate N-delta-acetylornithine urea glycohyocholate 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) N-acetylarginine 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* deoxycholate N-acetylarginine tryptophan 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* tryptophan kynurenate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* kynurenate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* Deoxycholate indoleacetylglutamine AhR ligands indoleacetylglutamine N-acetyltryptophan N-acetyltryptophan

4-acetamidobutanoate (tryptophan metabolism) 4-acetamidobutanoate trans-4-hydroxyproline trans-4-hydroxyproline

homocitrulline Indolepropionate homocitrulline N-methylpipecolate kynurenine Indolepropionate N-methylpipecolate kynurenine C-glycosyltryptophan dimethylarginine (SDMA + ADMA) C-glycosyltryptophan dimethylarginine (SDMA + ADMA)

glycochenodeoxycholate taurochenodeoxycholate glycochenodeoxycholate taurochenodeoxycholate

pro-hydroxy-pro 5-methylthioadenosine (MTA) pro-hydroxy-pro 5-methylthioadenosine (MTA) N-acetylputrescine taurocholenate sulfate N-acetylputrescine taurocholenate sulfate glycocholenate sulfate* glycocholenate sulfate*

Primary bile acids acisoga acisoga

Glycocholate Glycocholate

Taurocholate Secondary bile acids Taurocholate

Ursodeoxycholate Ursodeoxycholate

Glycochenodeoxycholate sulfate Glycoursodeoxycholate Glycochenodeoxycholate sulfate Glycoursodeoxycholate

Glycochenodeoxycholate glucuronide Glycochenodeoxycholate) glucuronide metabolized BA51. Bile acids effects on immune response and associated with a dramatically low level of plasmalogens and lyso- inflammation are not fully understood, with data suggesting a role plasmalogens at disease onset. The biological role of plasmalogens in pro-inflammatory cytokines production, T cell activation and in inflammatory processes remains controversial, with a putative neutrophil recruitment52,53, while other recent data suggest that protective effect against ROS products54. Interestingly, it has been they could inhibit inflammasome activation27. Finally, aGvHD was suggested that plasmalogens could play a role as reservoir for

6 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE

Fig. 3 Metabolomics changes after transplantation in recipients with GvHD. a Distribution of non-detectable values for each metabolite was assessed in recipient with GvHD and in their related paired donors. The frequency of non-detectable values in both groups was compared and colored as follows: green: frequency of non-detectable values non-significantly different between groups; orange: frequency of non-detectable values significantly different between groups with p < 0.05 but not significant after Bonferroni correction for multiple testing. Red area represents the threshold used for filtration of metabolites with more than 50% of non-detectable value. b A total number of 653 metabolites were analyzed for comparison of paired donor and recipients with GvHD. Among them, 16 metabolites were more frequently detected in one group of patients, including no metabolite that were still significant after Bonferroni correction. To compare the amount of each metabolite between recipients and their donors, 33 metabolites with more than 50% of non- detectable values were discarded for further analysis. c The remaining 620 metabolites were compared with a paired Student test followed by a Bonferroni correction. The volcano plot represents the variation of metabolites amount between recipients with GvHD and their related paired donors according to the −log(p value). List of the 150 significant metabolites is available in Supplementary Data 7. d Heatmap representation of the most significant metabolites after Student test, after hierarchical clustering of samples. e, f Significant variation of metabolites was confirmed after global comparison of the relative amounts between compounds. Main pathways identified in the previous analysis (i.e. polyamine metabolism, tryptophan metabolism, urea cycle, arginine & proline metabolism, bacterial or fungal, plasmalogen or lysoplasmalogen, and primary or secondary bile acid metabolism) were used to build an undirected graph where each node is a metabolite and two nodes are connected if their ratio is unchanged between the two groups (see Methods). The same network was first colored according to the considered sub-pathway (e) and then according to microbial-derived metabolites (red nodes) (f).

second intracellular messengers such as arachidonic acid and sub- objectives were to include at least 40 patients per cohort with an expected acute sequent production of eicosanoid, as it was recently demonstrated GvHD incidence of 40%. Clinical data were extracted from medical records and that plasmalogens hydrolysis in LPS-primed macrophages increases included gender, age, CMV status, underlying hematological diagnosis, HLA 55 matching between donor and recipient, stem cell source, conditioning regimen, T arachidonic release for eicosanoid synthesis . cell depletion, GvHD prophylaxis, presence or absence of GVHD and nutrition An unsolved question raised by these results is the relation type (oral vs. parenteral vs. enteral) at sampling day. between acute GvHD and metabolome alterations at disease Donors’ samples were collected during medical visit before any stem cell ’ onset. It is tempting to consider that at least a part of metabolites collection procedure. Recipients samples were collected at day 90 ± 5 after transplantation for patients who did not develop acute GvHD at any time after changes could be involved in GvHD pathogenesis, as suggested by transplantation, and at the onset of symptoms, before starting any corticosteroid 16,17 recent results in animal model . Nevertheless, one cannot treatment, for acute GvHD. Patients in the non-GVHD group never developed exclude that GvHD by itself could induce alteration of tissues neither classical nor late onset acute GvHD. All samples were collected on EDTA metabolism. Indeed, it has been previously demonstrated, mostly tubes (BD Vacutainer, K3E 7.2 mg, Plus blood Collection Tubes), centrifuged within 4 h to collect plasma and immediately frozen at −80 °C until processing. In in experimental models, that dysbiosis observed in GvHD could patient who had GvHD, organ involvement and stage, the date, the treatment, the be both partly causative and consequence of the allogeneic response to treatment and the date of response were recorded. GvHD grade was immune response7,16. Our results suggest that many variations evaluated according to Glucksberg score and Consensus criteria57,58. observed in metabolites could be due to microbiota changes after All patients gave their written consent for clinical research. This non- interventional research study with no additional clinical procedure was carried out transplantation especially if GvHD occurs. Recently, it was in accordance with the Declaration of Helsinki. Data analyses were carried out demonstrated that plasma metabolome can predict gut micro- using a database with all patient identifiers removed. This study was declared to the biome α-diversity56. In this study, 40 metabolites were identified CNIL (Commission National Informatique et Liberté, number KoT1175225K) and as being associated with human disease and microbiome, many of was approved by the local ethic committee and Institutional Review Board (CPP Ile them being also identified after transplantation, our study. In our de France IV, IRB number 00003835). cohorts, all patients received antibiotics at the time of sampling. Antibiotics are involved in dysbiosis of microbiota and most Samples collection and preparation. Concerning CRYOSTEM plasma, samples likely dysbiosis participates to metabolomics changes after annotated have been provided by the CRYOSTEM Consortium (https://doi.org/ 10.25718/cryostem-collection/2018) and the SFGM-TC (Société francophone de transplantation. However, whether this phenomenon could be a greffe de moelle et de thérapie cellulaire). Aliquots of 1 mL were divided in four factor increasing acute GvHD severity remains to be explored. aliquots of 250 μL for further study and sent to Metabolon company (Morrisville, Interestingly, recent clinical studies suggested that antibiotics may US) for further process. Samples were prepared using the automated MicroLab ® have a major impact on GvHD-risk in humans8,10. STAR system from Hamilton Company. Several recovery standards were added prior to the first step in the extraction process for QC purposes. For the meta- Altogether, our results revealed that allogeneic-HSCT is asso- bolomic analysis, a total of 100 μL of sample was extracted under vigorous shaking ciated with major metabolomics variations. The onset of acute for 2 min (Glen Mills GenoGrinder 2000) with methanol 80% containing the fol- GvHD was characterized by reduced production of tryptophan- lowing recovery standards: DL-2-fluorophenylglycine, tridecanoic acid, d6-choles- derived metabolites, especially host- or microbial-derived AhR terol, and DL-4-chlorophenylalanine. The resulting extract was divided into five fractions: two for analysis by two separate reverse phase (RP)/UPLC-MS/MS ligands that might contribute to increase allogeneic immune methods with positive ion mode electrospray ionization (ESI), one for analysis by response in the recipient. Reduced production of plasmalogens RP/UPLC-MS/MS with negative ion mode ESI and one for analysis by HILIC/ together with the increased level of BAs and polyunsaturated UPLC-MS/MS with negative ion mode ESI. The remaining aliquot was reserved for acids are potential metabolomics pathways that could be involved backup. Samples were placed briefly on a TurboVap® (Zymark) to remove the fl organic solvent. The sample extracts were stored overnight under nitrogen before in the early pro-in ammatory response during GvHD. These preparation for analysis. findings highlight major biological processes involved at GvHD onset that may represent new therapeutic targets for GvHD Mass spectrometry. All methods utilized a Waters ACQUITY UPLC and a prophylaxis or treatment. Thermo Scientific Q-Exactive high-resolution/accurate mass spectrometer inter- faced with a heated electrospray ionization (HESI-II) source and Orbitrap mass analyzer operated at R = 35,000 mass resolution. The sample extract was dried then Methods reconstituted in solvents compatible to each of the four methods. For each sample, Patients. Patients and their related donors analyzed in this study underwent an two aliquots of each sample were reconstituted in 50 μL of 6.5 mM ammonium allogeneic HSCT in Saint Louis hospital, Paris, France (cohort 1, n = 43) or in one bicarbonate in water (pH 8) for the negative ion analysis and another two aliquots of the 33 French national transplant centers involved in CRYOSTEM Consortium, of each were reconstituted using 50 μL 0.1% in water (pH ~3.5) for the funded under the French Government’s National Investment Program (Inves- positive ion method. Each reconstitution solvent contained a series of standards at tissement d’avenir) (cohort 2, n = 56). Inclusion criteria were adult patients (more fixed concentrations to ensure injection and chromatographic consistency. The than 18-year-old), with an HLA-identical sibling donor who underwent an internal standards consist of a variety of deuterium labeled or halogenated bio- allogeneic-HSCT. Patients with HIV or HTLV co-infection were excluded. Our chemicals specifically designed both to cover the entire chromatographic run and

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

abDistribution of non detectable value not Recipients with GvHD Recipients without GvHD associated with a clinical out come (n = 510) 80 Distribution of non detectable value associated with a clinical out come (P<.05) (n = 14)

60 Metabolites significantly associated with a group of patients (n = 14)

40 Discarded 35 metabolites undetectedin more than 50% of samples

20 Imputation of remaining undetected values

Detection of significant metabolites with t-test and 0 fold change analysis (n = 41)

Non detectable values distribution profile 100 50 0 50 100 Percentage of non detectable values cd 6 Citrate Aconitate [cis or trans] 5 Cortisone 3-hydroxystearate Gamma-glutamylmethionine 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2) 4 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4) Citrulline 1-palmitoyl-2-gamma-linolenoyl-GPC (16:0/18:3n6) Betaine 3 3-indoxyl sulfate value) Trigonelline (N'-methylnicotinate) P Guanidinoacetate 2 1-linoleoyl-GPC (18:2) Allantoin Urate –Log( Retinol ( A) 1 Indoleacetate 1,2-dilinoleoyl-GPC (18:2/18:2) 1-linoleoyl-2-arachidonoyl-GPC (18:2/20:4n6) 1-linoleoyl-2-linolenoyl-GPC (18:2/18:3) 0 1-linolenoyl-GPC (18:3) N-acetylarginine phosphate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4) –1 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2) Picolinate –4–3–2–101234 Isovalerylglycine 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1) Fold change (acuteGvHD/no GvHD) 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4) 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2) 4-guanidinobutanoate Cofactors and vitamins Nucleotides 1-stearoyl-2-linoleoyl-GPC (18:0/18:2) Homoarginine Lipid Xenobiotice 2-oxoarginine Amino acid Carbohydrate 10-nonadecenoate. Margarate (17:0) Energy Peptide Stearate (18:0) Nonadecanoate (19:0) 10-heptadecenoate Linoleate (18:2n6) Docosadienoate (22:0) Dihomo-linoleate (20:2n6) Eicosenoate (20:1) 2-hydroxyoctanoate 2-hydroxyglutarate Glucuronate Tauroursodeoxycholate N-stearoyl-sphingosine (d18:1/18:0) N-palmitoyl-sphingosine (d18:1/16:0) 1-docosahexaenoylglycerol (22:6) 1-oleoylglycerol (18:1) 4 N2,N2-dimethylguanosine C-glycosyltryptophan 2 Cytidine Linoleoyl ethanolamide 0 Adrenate (22:4n6) N-acetylneuraminate Glycerol −2 Docosapentaenoate (n3 DPA; 22:5n3) Eicosapentaenoate (EPA; 20:5n3) −4 Hexanoylglutamine Alpha ketobutyrate aGvHD No aGvHD

efAhR ligands 3-indoxyl sulfate (tryptophan metabolism) 3-indoxyl sulfate Plasmalogens 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2) 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* Indolepropionate 2-oxoarginine 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4) 2-oxoarginine* 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4) Indolepropionate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* citrulline 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* citrulline 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* N-acetylcitrulline homoarginine N-acetylcitrulline 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-stearoyl)-GPE (P-18:0)* Homoarginine 1-(1-enyl-stearoyl)-GPE (P-18:0)* 1-(1-enyl-oleoyl)-GPE (P-18:1)* 1-(1-enyl-palmitoyl)-GPC (P-16:0)* 1-(1-enyl-oleoyl)-GPE (P-18:1)* 1-(1-enyl-palmitoyl)-GPC (P-16:0)* N-methylproline 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* N-methylproline 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)* 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* 1-(1-enyl-palmitoyl)-GPE (P-16:0)* 1-(1-enyl-palmitoyl)-GPE (P-16:0)* N2,N5-diacetylornithine indoleacetylglutamine arginine N2,N5-diacetylornithine indoleacetylglutamine arginine 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2) N-delta-acetylornithine deoxycholate N-delta-acetylornithine deoxycholate urea urea indoleacetate picolinate Indoleacetate picolinate tryptophan betaine proline tryptophan betaine proline N-acetylarginine chenodeoxycholate N-acetylarginine chenodeoxycholate Arginine glycodeoxycholate ornithine glycodeoxycholate ornithine 5-methylthioadenosine (MTA) 5-methylthioadenosine (MTA) 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* taurodeoxycholate taurodeoxycholate indole-3-carboxylic acid 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* indole-3-carboxylic acid N-acetylproline taurolithocholate 3-sulfate N-acetylproline indolelactate taurolithocholate 3-sulfate indolelactate Tauroursodeoxycholate glycochenodeoxycholate Tauroursodeoxycholate glycochenodeoxycholate tryptophan tryptophan glycohyocholate glycohyocholate 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* spermidine N-acetyltryptophan spermidine N-acetyltryptophan dimethylarginine (SDMA + ADMA) dimethylarginine (SDMA + ADMA) trans-4-hydroxyproline trans-4-hydroxyproline N-acetylputrescine cholate 4-acetamidobutanoate N-acetylputrescine cholate 4-acetamidobutanoate pro-hydroxy-pro pro-hydroxy-pro kynurenate 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* kynurenate 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* C-glycosyltryptophan kynurenine C-glycosyltryptophan kynurenine 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* acisoga Hyocholate acisoga Glycocholenate sulfate* Taurochenodeoxycholate glycocholenate sulfate* hyocholate taurochenodeoxycholate Glycolithocholate sulfate* glycolithocholate sulfate* Secondary bile acids Glycocholate Glycocholate Glycochenodeoxycholate) glucuronide Glycochenodeoxycholate glucuronide Glycochenodeoxycholate sulfate Glycochenodeoxycholate sulfate Glycoursodeoxycholate homocitrulline Glycoursodeoxycholate

Homocitrulline Ursodeoxycholate taurocholenate sulfate Taurocholenate sulfate Ursodeoxycholate Primary bile acids Taurocholate Taurocholate to not interfere with the detection of any endogenous biochemicals. Authentic Tridecanoic acid was purchased from Sigma-Aldrich (St. Louis, MO) and d6- standards of d7-glucose, d3-leucine, d8-phenylalanine, and d5-tryptophan were cholesterol was from Cambridge Isotope Laboratories (Andover, MA). Standards purchased from Cambridge Isotope Laboratories (Andover, MA). D5-hippuric for the HILIC dilution series of alpha-ketoglutarate, ATP, malic acid, NADH, and acid, d5-indole acetic acid, and d9-progesterone were procured from C/D/N Iso- oxaloacetic acid were purchased from Sigma-Aldrich Co. LLC. (St. Louis, MO) topes, Inc. (Pointe-Claire, Quebec). Bromophenylalanine was provided by Sigma- while succinic acid, pyruvic acid and NAD+ were purchased from MP Biomedicals, Aldrich Co. LLC. (St. Louis, MO) and amitriptyline was from MP Biomedicals, LLC. (Santa Ana, CA). LLC. (Aurora, OH). Recovery standards of DL-2-fluorophenylglycine and DL-4- Limit of detection (LOD) for standards analyzed in a dilution series using chlorophenylalanine were from Aldrich Chemical Co. (Milwaukee, WI). reverse phase chromatography is available in Supplementary Data 15.

8 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE

Fig. 4 Metabolomics pathways changes associated with aGvHD onset. a Distribution of non-detectable values for each metabolite was assessed in recipients with or without GvHD. The frequency of non-detectable values in both groups was compared for each distribution profile using a Fisher exact test for a 2 × 2 contingency table and colored as follows: green: frequency of non-detectable values non-significantly different between groups; orange: frequency of non-detectable values significantly different between groups with p < 0.05 but not significant after Bonferroni correction for multiple testing. Red area represents the threshold used for filtration of metabolites with more than 50% of non-detectable value. b A total number of 524 metabolites were analyzed for comparison of recipients with or without GvHD. Among them, 14 metabolites were more frequently detected in one group of patients, but none was significant after Bonferroni correction (Supplementary Data 9). To compare the amount of each metabolite between recipients with or without GvHD, 35 metabolites with more than 50% of non-detectable values were discarded for further analysis. c The remaining 489 metabolites were compared with a Student test with Satterwhaite’s correction for unequal variance, followed by a Bonferroni correction. The volcano plot represents the variation of metabolites amount between recipients with GvHD and those without GvHD according to the −log(p value). List of the 41 significant metabolites is available in Supplementary Data 11. d Heatmap representation of the more significant metabolites after Student test after hierarchical clustering of sample. e, f Significant variation of metabolites was confirmed after global comparison of the relative amounts between compounds. Main pathways identified in the previous analysis (i.e. polyamine metabolism, tryptophan metabolism, urea cycle, arginine & proline metabolism, bacterial or fungal, plasmalogen or lysoplasmalogen, and primary or secondary bile acid metabolism) were used to build an undirected graph where each node is a metabolite and two nodes are connected if their ratio is unchanged between the two groups (see Online Methods). The same network was first colored according to the considered sub-pathway (e) and then according to microbial-derived metabolites (red nodes) (f).

One aliquot was analyzed using acidic positive ion conditions (LC pos), Overall process variability was determined by calculating the median RSD for chromatographically optimized for more hydrophilic compounds. In this method, the all endogenous metabolites (i.e., non-instrument standards) present in 100% of the extract was gradient eluted from a C18 column (Waters UPLC BEH C18-2.1 × 100 MTRX samples, which are technical replicates created from a large pool of mm, 1.7 µm) using water and methanol, containing 0.05% perfluoropentanoic acid extensively characterized human plasma. The median RSD for the MTRX samples (PFPA) and 0.1% formic acid (FA) at pH = 2.5. Elution was performed at was equal to 9–10%. Five MTRX samples and three process blank samples were 0.35 mL min−1 in a linear gradient from 5% to 80% of methanol containing 0.1% FA processed per every batch of 30 samples. Experimental samples were randomized and 0.05% PFPA over 3.35 min. A second aliquot was also analyzed using acidic across the platform run with QC samples spaced evenly among the injections. positive ion conditions; however, it was chromatographically optimized for more hydrophobic compounds. In this method, the extract was gradient eluted from the Compounds identification and quantification. Raw data were extracted, peak- same afore mentioned C18 column using methanol 50%, acetonitrile 50%, water, 0.05 identified, and QC processed using Metabolon’s hardware and software. Com- % PFPA, and 0.01 % FA at pH = 2.5 and was operated at an overall higher organic pounds were identified by comparison to library entries of purified standards or content. Elution was performed at 0.60 mL/min in a linear gradient from 40% to recurrent unknown entities59,60. Briefly, Metabolon maintains a library based on 99.5% over 1 min, hold 2.4 min at 99.5% of methanol 50%, acetonitrile 50%, 0.05% authenticated standards that contains the retention time/index (RI), mass to charge PFPA, and 0.01% FA. A third aliquot was analyzed using basic negative ion-optimized ratio (m/z), and chromatographic data (including MS/MS spectral data) on all conditions with a separate dedicated C18 column (LC neg). The basic extracts were molecules present in the library. Furthermore, biochemical identifications are based gradient eluted from the column using methanol 95% and water 5%, with 6.5 mM on three criteria: retention index within a narrow RI window of the proposed ammonium bicarbonate at pH 8. Elution was performed at 0.35 mL min−1 with a identification, accurate mass match to the library ±10 ppm, and the MS/MS for- linear gradient from 0.5% to 70% of methanol 95%, water 5% with 6.5 mM ward and reverse scores between the experimental data and authentic standards. ammonium bicarbonate over 4 min, followed by a rapid gradient to 99% in 0.5 min. The MS/MS scores are based on a comparison of the ions present in the experi- The sample injection volume was 5 μL and a 2× needle loop overfill was used. mental spectrum to the ions present in the library spectrum. While there may be Separations utilized separate acid and base-dedicated 2.1 mm × 100 mm Waters BEH similarities between these molecules based on one of these factors, the use of all C18 1.7 μm columns held at 40 °C. The fourth aliquot was analyzed via negative three data points can be utilized to distinguish and differentiate biochemicals. More ionization following elution from an HILIC column (LC HILIC) (Waters UPLC BEH than 3300 commercially available purified standard compounds have been acquired Amide 2.1 × 150 mm, 1.7 µm, held at 40 °C) using a gradient consisting of water and registered for analysis on all platforms for determination of their analytical (15%), methanol (5%), and acetonitrile (80%) with 10 mM ammonium formate, pH characteristics. The full list of identified metabolites in both cohorts is available in 10.16. Elution flow rate was 0.5 mL/min with a linear gradient from 5% to 50% in 3.5 Supplementary Data 16. Microbiota-derived metabolites identification was based min, followed by a linear gradient from 50% to 95% in 2 min, of water (50%), on the Human Metabolome Database (www.hmdb.ca). The QC and curation acetonitrile (50%) with 10 mM ammonium formate, pH 10.6. The MS analysis processes were designed to ensure accurate and consistent identification of true alternated between MS and data-dependent MSn scans using dynamic exclusion. The chemical entities, and to remove those representing system artifacts, mis-assign- scan range varied slightly between methods but covered 70–1000m/z. ments, and background noise. Metabolon data analysts use proprietary visualiza- tion and interpretation software to confirm the consistency of peak identification among the various samples. Library matches for each compound were checked for Quality assurance and quality control (QA/QC). Several types of controls were each sample and corrected if necessary. Peaks were quantified using area-under- analyzed in concert with the experimental samples: a pooled matrix sample gen- the-curve. A data normalization step was performed to correct variation resulting erated by taking a small volume of each experimental sample (or alternatively, use from instrument inter-day tuning differences. Essentially, each compound was of a pool of well-characterized human plasma, named MTRX for sample matrix) corrected in run-day blocks by registering the medians to equal one (1.00) and served as a technical replicate throughout the dataset; extracted water samples normalizing each data point proportionately served as process blanks; and a cocktail of QC standards listed below, which were carefully chosen not to interfere with the measurement of endogenous compounds Data analysis. Metabolite amounts were analyzed using a three-step procedure, were spiked into every analyzed sample, allowed instrument performance mon- both for the comparison between recipients according to the GVHD status [reci- itoring, and aided chromatographic alignment. In LC neg conditions, internal pient with acute GVHD, Ra vs those without GVH, Rs], and for the comparison standards were D7-glucose, d3-methionine, d3-leucine, d8-phenylalanine, d5- between donors and their recipients without GVHD [D vs Rs] or with GVHD [D tryptophan, bromophenylalanine, d15-octanoic acid, d19-decanoic acid, d27- vs Ra]. The first two steps dealt with each metabolite separately, whereas the third tetradecanoic acid, d35-octadecanoic acid, d2-eicosanoic acid. In LC HILIC con- and fourth step analyzed the whole dataset at once. These steps are detailed below; ditions, internal standards were D35-octadecanoic acid, d5-indole acetic acid, briefly, the two first steps differed in terms of outcomes that were compared across bromophenylalanine, d5-tryptophan, d4-tyrosine, d3-, d3-, d7- groups, either the proportion of non-detectable metabolite quantities (first step) or ornithine, d4-lysine. In LC pos conditions, internal standards were d7-glucose, d3- the average metabolite quantity (second step); the third step specifically aimed to methionine, d3-leucine, d8-phenylalanine, d5-tryptophan, bromophenylalanine, detect groups of metabolites that experience similar pattern of changes across d4-tyrosine, d5-indole acetic acid, d5-hippuric acid, amitriptyline, d9-progesterone, groups. For the Rs vs Ra comparison, a fourth step specifically aimed to detect d4-dioctylphthalate. metabolites that best predict the subject’s group. Instrument variability was determined by calculating the median relative For D vs Rs and for D vs Ra, analyses comparisons were performed considering standard deviation (RSD) for the internal standards that were added to each sample the donor–recipient pairs; consequently, only donors with a matched recipient who = – prior to injection into the mass spectrometers (median RSD 3 4%). Instruments did not develop GVHD were considered. All metabolites were kept for the main are calibrated at least weekly in the utilized polarity using thermo and mass analysis. As a sensitivity analysis, all donors were included, ignoring matching with accuracy is monitored at the batch level for the internal standards. A batch fails QC their recipients, and led to similar conclusions. if any of the internal standards are more than 5 ppm away from the Analyses for Ra vs Rs comparison were performed considering the two theoretical mass. groups of patients independently. Only natural metabolites, including bacteria and

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

abAhR ligands (tryptophan metabolism) Indolepropionate Indolepropionate Arginine Plasmalogens 3-indoxyl sulfate 3-indoxyl sulfate N-acetylcitrulline 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4) N-acetylcitrulline 2-oxoarginine 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* 2-oxoarginine* N2,N5-diacetylornithine N2,N5-diacetylornithine 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)*

1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2) 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)*

N-acetylarginine N-acetylarginine N-delta-acetylornithine N-delta-acetylornithine citrulline 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* citrulline 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2) picolinate N-methylproline picolinate N-methylproline 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* 1-(1-enyl-stearoyl)-GPE (P-18:0)* 1-(1-enyl-stearoyl)-GPE (P-18:0)* indoleacetate homoarginine indoleacetate homoarginine 1-(1-enyl-oleoyl)-GPE (P-18:1)* 1-(1-enyl-oleoyl)-GPE (P-18:1)* arginine arginine 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2) 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-GPE (P-16:0)* 1-(1-enyl-palmitoyl)-GPE (P-16:0)* 5-methylthioadenosine (MTA) 5-methylthioadenosine (MTA) glycodeoxycholate 1-(1-enyl-palmitoyl)-GPC (P-16:0)* glycodeoxycholate deoxycholate 1-(1-enyl-palmitoyl)-GPC (P-16:0)* deoxycholate taurolithocholate 3-sulfate taurolithocholate 3-sulfate urea urea indoleacetylglutamine indoleacetylglutamine glycohyocholate glycohyocholate indole-3-carboxylic acid indole-3-carboxylic acid glycolithocholate sulfate* glycolithocholate sulfate* taurodeoxycholate taurodeoxycholate chenodeoxycholate chenodeoxycholate glycochenodeoxycholate glycochenodeoxycholate N-acetylputrescine pro-hydroxy-pro N-acetylputrescine pro-hydroxy-pro tryptophan betaine C-glycosyltryptophan tryptophan betaine C-glycosyltryptophan

1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* N-acetylproline N-acetylproline ornithine ornithine glycocholenate sulfate* glycocholenate sulfate* hyocholate hyocholate 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* N-acetyltryptophan trans-4-hydroxyproline N-acetyltryptophan trans-4-hydroxyproline proline cholate proline kynurenine cholate kynurenine glycocholate spermidine Glycocholate spermidine indolelactate 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* acisoga indolelactate 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* acisoga kynurenate kynurenate homocitrulline homocitrulline tryptophan tryptophan taurochenodeoxycholate taurochenodeoxycholate 4-acetamidobutanoate Glycochenodeoxycholate sulfate 4-acetamidobutanoate glycochenodeoxycholate sulfate dimethylarginine (SDMA + ADMA) dimethylarginine (SDMA + ADMA) Taurocholate Taurocholate Glycoursodeoxycholate Primary bile acids Glycoursodeoxycholate Glycochenodeoxycholate glucuronide Glycochenodeoxycholate glucuronide taurocholenate sulfate Taurocholenate sulfate

Ursodeoxycholate Ursodeoxycholate Secondary bile acids

Tauroursodeoxycholate Tauroursodeoxycholate

cd3-indoxyl sulfate AhR ligands 3-indoxyl sulfate (tryptophan metabolism)

Indolepropionate Indolepropionate

Arginine

N-methylproline N-methylproline

Taurocholate Taurocholate

N2,N5-diacetylornithine N2,N5-diacetylornithine Plasmalogens

Glycodeoxycholate glycodeoxycholate 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* 1-(1-enyl-palmitoyl)-2-linoleoyl-GPE (P-16:0/18:2)* Glycolithocholate sulfate 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) glycolithocholate sulfate* 1-(1-enyl-stearoyl)-2-oleoyl-GPE (P-18:0/18:1) 1-(1-enyl-stearoyl)-GPE (P-18:0)* 1-(1-enyl-stearoyl)-GPE (P-18:0)* trans-4-hydroxyproline trans-4-hydroxyproline Arginine arginine indoleacetylglutamine 1-(1-enyl-oleoyl)-GPE (P-18:1)* indoleacetylglutamine 1-(1-enyl-oleoyl)-GPE (P-18:1)* taurodeoxycholate 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4)* taurodeoxycholate 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-2-oleoyl-GPE (P-16:0/18:1)* N-acetylcitrulline N-acetylcitrulline 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* 1-(1-enyl-palmitoyl)-2-linoleoyl-GPC (P-16:0/18:2)* 1-(1-enyl-stearoyl)-2-arachidonoyl-GPE (P-18:0/20:4) urea urea 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPC (P-16:0/20:4)* N-delta-acetylornithine N-delta-acetylornithine taurolithocholate 3-sulfate taurolithocholate 3-sulfate 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* 1-(1-enyl-palmitoyl)-2-arachidonoyl-GPE (P-16:0/20:4)* N-acetylarginine indole-3-carboxylic acid N-acetylarginine indole-3-carboxylic acid chenodeoxycholate 1-(1-enyl-palmitoyl)-GPE (P-16:0)* chenodeoxycholate 1-(1-enyl-palmitoyl)-GPE (P-16:0)* homocitrulline homocitrulline 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)*homoarginine 1-(1-enyl-stearoyl)-2-linoleoyl-GPE (P-18:0/18:2)*homoarginine glycoursodeoxycholate glycoursodeoxycholate

2-oxoarginine* deoxycholate5-methylthioadenosine (MTA) 2-oxoarginine* deoxycholate5-methylthioadenosine (MTA) citrulline ursodeoxycholate citrulline ursodeoxycholate indolelactate indolelactate indoleacetate hyocholate 1-(1-enyl-palmitoyl)-GPC (P-16:0)* hyocholate 1-(1-enyl-palmitoyl)-GPC (P-16:0)* pro-hydroxy-pro pro-hydroxy-pro glycohyocholate cholate Indoleacetate glycohyocholate cholate

tryptophan betaine tryptophan betaine glycocholenate sulfate* glycocholenate sulfate* glycochenodeoxycholate glucuronide (1) glycochenodeoxycholate glucuronide (1) proline proline 4-acetamidobutanoate 4-acetamidobutanoate spermidine spermidine N-acetylputrescine tryptophan N-acetylputrescine tryptophan glycochenodeoxycholate glycochenodeoxycholate N-acetyltryptophan acisoga N-acetyltryptophan acisoga glycochenodeoxycholate sulfate glycochenodeoxycholate sulfate ornithine ornithine N-acetylproline N-acetylproline C-glycosyltryptophan C-glycosyltryptophan picolinate picolinate 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-2-oleoyl-GPC (P-16:0/18:1)* 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* 1-(1-enyl-palmitoyl)-2-palmitoleoyl-GPC (P-16:0/16:1)* kynurenine 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* kynurenine 1-(1-enyl-palmitoyl)-2-palmitoyl-GPC (P-16:0/16:0)* kynurenate kynurenate dimethylarginine (SDMA + ADMA) Tauroursodeoxycholate Dimethylarginine (SDMA + ADMA) Tauroursodeoxycholate Taurocholenate sulfate Taurocholenate sulfate

Secondary bile acids Glycocholate Glycocholate Primary bile acids Taurochenodeoxycholate Taurochenodeoxycholate

Fig. 5 Metabolomics variation at aGvHD onset is not affected by age, gender, nor BMI. Recipients with or without GvHD were compared after adjustment for age, gender, and body mass index (BMI) and re-analyzed to avoid any metabolite variation due to these parameters. The list of 57 metabolites significantly changing with acute GvHD onset in both cohorts is available in Supplementary Data 13. The main pathways involved at GvHD onset were used to build a network of compositional data where nodes represent metabolites and are linked if the ratio between corresponding metabolites was not significantly different between groups of patients. The same network was first colored according to the considered sub-pathway for cohorts 1 and 2 (a and c, respectively) and then according to microbial-derived metabolites (red nodes) (b and d, respectively). fungi-derived metabolites, were considered, excluding drugs or assimilated in the Saint Louis’ cohort and nine in the Cryostem’s cohort. The age difference in compounds. A sensitivity analysis was done based on a 1:1 matching on sex and a given pair was between −3 and +1 year in the Saint Louis’ cohort and −11 and age between receivers with (Ra, cases) or without (Rs, controls) GVHD, which led +6 years in the Cryostem one. to similar results. This pairing was done by matching each Ra patient with the Rs All analyses were performed first on the Saint Louis hospital cohort patient with the closest age and sex. After pairing, there was four gender mismatch (exploratory analysis), then on the Cryostem cohort (confirmatory analysis). The

10 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE

abPrincipal component analysis (cohort 1) Principal component analysis (cohort 2)

15

10 10

5 0 0

–5 –10 Dim 2 (9.93%) Dim 2 (12.45%) –10

–20 No GvHD –15 No GvHD GvHD GvHD –20 –20 –10 0 10 20 30 40 –30 –20 –10 0 10 20 Dim 1 (13.16%) Dim 1 (12.63%)

cdPLS-DA (cohort 1) PLS-DA (cohort 2) 5 10

2.5

5 0

0 –2.5 X-variate 2: 6% expl. var. X-variate 2: 9% expl. var. –5 –5 No GvHD No GvHD GvHD GvHD –7.5 –10 –5 0 5 –12 –8 –4 0 4 X-variate 1: 10% expl. var X-variate 1: 11% expl. var

efPathway Pathway P value Plasmalogen Plasmalogen Urea cycle; arginine and proline Urea cycle; arginine and proline 0.0000 0.3840 Purine,(Hypo)xanthine/inosine containing Glutamate FA, acyl Tryptophan Vitamine A Vitamine A Guanidino and acetamido Glycine, serine and threonine Phospholipid Dipeptide Long chain FA Guanidino and acetamido Polyunsatyrated FA (n3 and n6) Acetylated peptide Creatinine Creatinine Pyrimidine, cytidine containing Endocannabinoid Tryptophan Sphingolipid Aminosugar Lysoplasmalogen Nicotinate and nicotinamide Phospholipid Sphingolipid and aspartate Monoacylglycerol Glycolysis, gluconeogenesis, and pyruvate Glycine, serine and threonine Purine, (Hypo)xanthine/inosine containing Leucine, iso leucine and valine Pyrimidine, uracil containing Steroid TCA cycle FA, dicarboxylate Phenylalanine and tyrosine FA, monohydroxy FA, monohydroxy 02 4 6 810 12 14 16 02 4 6 810 12 14 16 Fold enrichment Fold enrichment

Fig. 6 Identification of main pathways involved in aGVHD using multivariate analysis. Principal component analysis was able to discriminate recipient with (red) or without (green) GvHD in dimension 1 for cohort 1 (a) and 2 (b). Sparse partial least square discriminant analysis (sPLS-DA) was then used to identify metabolites that mostly contribute to the discrimination of patients with (orange) or without GvHD (blue) for cohort 1 (c) and 2 (d). Complete list of metabolites is available in Supplementary Data 14. These metabolites were then used to identify main pathways involved in acute GvHD using overrepresentation analysis for cohort 1 (e) and 2 (f). confirmatory analysis used the same filters and procedures than the exploratory analysis of Ra vs paired Rs, given matching was performed on age and sex, estimates analysis to assess whether the results obtained in the first cohort of patients were were only adjusted for difference in BMI between the two subjects of a pair. consistent in an independent cohort. To ensure the comparability of the results, All analyses were performed using R and appropriate additional packages, as both analyses used only the 653 metabolites that were shared by the two datasets described below. (excluding 150 metabolites detected only in the Saint Louis cohort and 274 The first step of analysis consisted in selection of metabolites and imputation of metabolites detected only in the Cryostem cohort). undetectable amounts. Since some metabolites could not be detected in several Unless specified, all analyses were done using a 5% type I error rate (p < 0.05, patients, indicating low amounts of metabolites in these patients, we first wondered after multiplicity correction when required). To handle obvious confounders, all whether these non-detectable amounts occurred randomly or preferentially in one analyses of steps two and three (detailed below) were secondly adjusted. More of the two groups. This was done, for each metabolite, by replacing the values by 1 specifically, unpaired analyses were adjusted for age, sex, and BMI. Analyses on if an amount could be detected (that is, a numerical value was present in the paired cases (either D vs Rs, or D vs Ra) were adjusted for differences of age and of dataset) and 0 otherwise (that is, the value was missing in the dataset). The BMI, as well as for sex match between the two subjects of the pair. For paired proportion of one was then compared between the two groups. For paired data, the

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

a Tryptophan Arginine, proline and urea

C-glycosyltryptophan arginine 3-indoxyl sulfate 1.38 indole-3-carboxylic acid 2-oxoarginine 2.24 citrulline 1.24 2.02 1.1 urea 1.79 dimethylarginine tryptophan betaine 0.96 indoleacetate 1.57 0.83 1.34 (SMDA + AMDA) 0.69 1.12 0.55 trans-4-hydroxyproline 0.9 homoarginine tryptophan 0.41 indoleacetylglutamine 0.67 0.28 0.45 0.14 proline 0.22 homocitrulline 0 0 serotonin indolelactate pro-hydroxy-pro N-acetylarginine

picolinate indolepropionate ornithine N-acetylcitrulline

N-acetyltryptophan kynurenate N2, N5-diacetylornithine N-acetylproline N-acetylkynurenine kynurenine *** N-methylproline N-delta-acetylornithine

Secondary bile acid Plasmalogen and lysoplasmalogen deoxycholate ursodeoxycholate 1-(1-enyl-oleoyl)-GPE (P-18:1) sulfate 1-(1-enyl-palmitoyl)-2-arachidonoyl- 9.55 glycocholenate sulfate 1-(1-enyl-stearoyl)-GPE (P-18:0) 1.03 8.6 0.93 GPC (P-16:0/20:4) 7.64 0.83 ursodeoxycholate 6.69 glycodeoxycholate 1-(1-enyl-stearoyl)-2-oleoyl-GPE 0.72 1-(1-enyl-palmitoyl)-2-arachidonoyl- 5.73 0.62 4.78 (P-18:0/18:1) 0.52 GPE (P-16:0/20:4) 3.82 glycodeoxycholate 1-(1-enyl-stearoyl)-2-linoleoyl- 0.41 2.87 0.31 1-(1-enyl-palmitoyl)-2-linoleoyl- tauroursodeoxycholate sulfate 6 1.91 GPE (P-18:0/18:2) 0.21 GPC (P-16:0/18:2) 0.96 0.1 0 0 1-(1-enyl-stearoyl)-2- 1-(1-enyl-palmitoyl)-2-linoleoyl- taurolithocholate 3- glycohyocholate arachidonoyl-GPE (P-18:0/20:4) GPE (P-16:0/18:2) sulfate 1-(1-enyl-palmitoyl)-GPE (P-16:0) 1-(1-enyl-palmitoyl)-2-oleoyl- taurodeoxycholate glycolithocholate GPC (P-16:0/18:1) glycolithocholate 1-(1-enyl-palmitoyl)-GPC (P-16:0) 1-(1-enyl-palmitoyl)-2-oleoyl- taurocholenate sulfate sulfate GPE (P-16:0/18:1) 1-(1-enyl-palmitoyl)-2-palmitoyl- 1-(1-enyl-palmitoyl)-2-palmitoleoyl- hyocholate glycoursodeoxycholate GPC (P-16:0/16:0) GPC (P-16:0/16:1)

Polyunsaturated fatty acid Long chain fatty acid

adrenate (22:4n6) arachidate (20:0) 10-nonadecenoate stearidonate (18:4n3) arachidonate (20:4n6) eicosenoate (20:1) 2.46 (19:1n9) 2,48 2.19 dihomo-linoleate 2,23 1.91 10-heptadecenoate 1,98 mead acid (20:3n9) 1.64 (20:2n6) 1,74 erucate (22:1n9) 1.37 1,49 1.09 (17:1n7) 1,24 linolenate [alpha or 0.82 dihomo-linolenate 0.55 0,99 0,74 gamma; (18:3n3 or 6)] 0.27 (20:3n3 or n6) stearate (18:0) margarate (17:0) 0 0,5 0,25 docosadienoate 0 linoleate (18:2n6) (22:2n6) pentadecanoate (15:0) myristate (14:0) docosahexaenoate eicosapentaenoate (DHA; 22:6n3) (EPA; 20:5n3) myristoleate (14:1n5) docosapentaenoate (n3 palmitoleate (16:1n7) docosatrienoate DPA; 22:5n3) (22:3n3) docosapentaenoate (n6 palmitate (16:0) nonadecanoate (19:0) DPA; 22:5n6) oleate/vaccenate (18:1)

b 3- Indoxyl sulfate Indoleacetate Indoleacetylglutamine Indolepropionate *** ****** **** N.S.* **** ****

2.0 1.0 2.0 2.0 1.5 0.5 1.5 1.0 1.5 0.0 1.0 0.5 1.0 –0.5 0.5 0.0 0.5 –1.0 0.0 05 0.0 –1.5 Indoleacetate 05 3-indoxyl sulfate –1.0 Indolepropionate

Indoleacetylglutamine –05 –2.0 –1.5 –1.0 –1.0 –2.5 –2.0 –1.5 aGvHD –+– + –+–+ –+–+ –+–+ cohort 1 2 1 2 1 2 1 2 Aryl hydrocarbon receptor ligands

Fig. 7 Pathways associated with aGVHD involve microbiota-derived aryl hydrocarbon receptor ligands. Metabolites from the six most frequently involved sub-pathways at GvHD onset were compared between recipient with and without GvHD in both cohorts (blue: monocentric cohort 1; orange: multicentric cohort 2; red line: ratio = 1). Radar plots represent the ratio of the mean value of each metabolite in recipients with GvHD compared to those without GvHD, with bars corresponding to the standard deviation of the mean value in both cohorts. The red line corresponds to a ratio of 1. b Within the tryptophan pathway, the four metabolites that were decreased at GvHD onset were microbial-derived indole compounds that may act as aryl hydrocarbon receptor (AhR) ligands. Logarithm of the individual values are represented for both cohorts in patients with or without GvHD and values were compared with a Student test. Each box represents the median value with interquartile range, whiskers described minimal and maximal values. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. comparison was done using the exact version of the McNemar’s test (that is, A metabolite was declared to be significantly more frequently undetectable binomial test to compare the proportion of discordant pairs in one direction to in one group (hence, present at significantly lower amounts in patients of that 0.5). For unpaired data, the comparison was done using the exact Fisher’s test for group) when the test was significant after Bonferroni correction for multiple 2 × 2 contingency tables. testing. For the D vs Rs analysis, all the 653 metabolites were analyzed, leading

12 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE to significant differences for p<7.66 × 10−5. For the Rs vs Ra analysis, 524 Basically, the idea is to build an undirected graph (network) where each node is metabolites were analyzed, leading to significant results when p<9.54 × 10−5. a chemical and two nodes are connected if, and only if, their ratio is unchanged After this first step, chemicals that could not be detected in at least one half of between the two groups. The theoretical graph is a set of disjoint subgraphs, each the patients either overall or in either group of the Saint Louis hospital cohort were subgraph being a clique (that is, in this subgraph, each node is connected to all excluded for the following two steps. As stated above, the resulting list of excluded other nodes). In a given clique, all chemicals experience no change in their relative chemicals also applied to the Cryostem cohort dataset. amount, hence their absolute amount changes identically (eventually, does no Based on the experimental protocol, missing data can be considered mainly to change) between the two groups. Conversely, chemicals belonging to different rely on the limit of quantification (LOQ) of the analytical method: any value below cliques experience a change in their relative amount, hence different changes in this limit could not be measured and reported (BLQ data). To consider the their absolute amount, with at least one of them being modified between the two information of such BLQ data—known as left-censoring—the most usual method groups. is to impute half the LoQ to missing data. Since LoQ is unknown for each chemical, To build the observed graph, the ratio is tested and if the p value of the test is imputations used half the minimal observed value. It allows unbiased estimates below a predefined threshold, pmax, the connection between the two corresponding unless a large amount of data is missing (but such cases were removed by filtering is removed (the ratio has changed between the two groups). This is done for all at the end of step 1 of the analysis). These findings were notably reported in Keizer possible ratios, and the resulting graph is then analyzed. If it contains more than et al.61, who showed that except when the percentage of missing data is high, one subgraph, then at most one of these subgraphs correspond to “no difference imputing values by LOQ/2 achieves less bias than simply discarding the data, and between the groups”, and all other subgraphs contain chemicals that do change in fact is similar to having complete data or using more complex methods than the between the two groups. The largest subgraph will be taken as the “no change” ones used in this paper. Its main drawback is to generate ties in the sample, thus subgraph. resulting in impaired estimates of variances. Therefore, a small random noise The key step is the choice of the threshold, pmax. Obviously, the larger the generated from a Gaussian distribution was added to imputed values, and then threshold is, the more likely is the discovery of unconnected subgraphs. Hence, rounded to the nearest integer to maintain the characteristics of the original data. A pmax was selected to minimize the probability of observing at least two disjoint 100 variance value of the Gaussian distribution was used to ensure a lower subgraphs, under the null hypothesis that no chemical changes (in graph terms, variability than that observed in detected amounts. To check that this additional that the theoretical graph contains a single clique). This was achieved by step did not introduced bias, all analyses were also performed without this simulation: under the null hypothesis and assuming independent chemicals, additional noise. Similarly, to check the assumption that using LOQ/2 does not 10,000 simulations were done with the same sample size and the same number of introduces a bias, analyses were also performed using other fractions of the nodes than in our datasets, using a log-normal distribution; pmax was estimated as minimal value (25%, 75%, 90%, 95%, 99%) or the minimal value itself, another the threshold to use so that the proportion of simulations that give two or more common approach62. Because of the limited sample size, and of the lack of disconnected subgraphs is at most 5%. Practically, to account for simulation sensitivity of the results to these choices, more complex imputation methods like uncertainties and deviations to the simulation model, we used the lowest bound of fi fi tting a truncated distribution to the data were not tested. the 95% con dence interval of pmax obtained by simulation, truncated to the At the end of this step, a principal component analysis (PCA) and a sparse second decimal, ensuring a conservative test. The Supplementary Data 17 gives the partial least square discriminant analysis (sPLS-DA) were performed to check that threshold obtained and used for our different datasets in main analyses. the different groups were indeed separated, allowing identifying metabolites whose This third step was performed twice: first on all metabolites (after exclusion of level differ between the groups (R package FactoMineR63, function PCA and R metabolites with too large non-detectable amounts, as reported above), and package mixOmics64, function plsda and splsda). Due to the small sample size and secondly only on metabolites from a preselected set of candidate metabolic the very low (sample number)/(predictor number) ratio, PLS-DA is prone to pathways of interest: polyamine metabolism, tryptophan metabolism, urea cycle, overfitting and cross-validation cannot be used here. Hence, PLS-DA results should arginine & proline metabolism, bacterial or fungal, plasmalogen or be taken as descriptive. lysoplasmalogen, and primary or secondary BA metabolism. Metabolites that were identified in sPLS-DA were then used to build an ORA. All the analyses for this third step, including simulations, were done using the Enrichment (E) was calculated by considering the number of metabolites identified SARP.compo package for R65, available on the CRAN repository. with sPLS-DA in each pathways (k), the total number of metabolites identified in Last, the fourth step selected metabolites using group membership modeling. sPLSD-DA (n), the number of metabolites in each pathway (m), and the total One may wonder whether some of the metabolites selected as associated with either number of metabolites used for analysis (M) as follows: E = (k/m)/((n−k)/(N−m)). group, may have been selected only due to their relationships with the others. For each pathway, p value was determined by calculation of the hypergeometric Therefore, multivariable analyses were performed, considering (i) the large number distribution. of metabolites (491 for the Rs vs Ra comparison) against the small number of The second step of analysis aimed to compare average amount for each observations (overall, 43 in the Saint Louis sample and 56 in the Cryostem sample) compound. After filtering and imputation, each chemical was analyzed separately. and (ii) the low prevalence of GvHD, notably in the Saint Louis sample (with only For a given metabolite, the amounts between the two groups were compared, after 12 GvHD). The latter issue avoids any multivariable regression in the Saint Louis log transformation, using two-sided Student’s T test, with Aspin–Welch correction sample (indeed, it is commonly reported that at least 10 events should be observed for unequal variances in the case of unpaired data; the paired version of the test was when including one variable in the model). Then, to handle the first issue, lasso used for paired data. A chemical was declared to be present in significantly logistic regression—following its original use in linear models66—was used. It tends different amounts in the two groups if the test was significant after Bonferroni’s to produce some model coefficients that are exactly 0 and hence gives interpretable correction for multiple testing. models. Thus, this technic is robust in the context of our analysis, due to its For the D vs Rs analysis, 83 metabolites were excluded by filtering on non- tendency to prefer solutions with fewer non-zero coefficients, effectively reducing detectable values, leading to 570 metabolites for further analyses. Consequently, the number of features upon which the given solution is dependent p<8.77 × 10−5 (that is, p* < 0.05 where p*isp after Bonferroni correction) was used as the threshold to detect significantly differentially present metabolites. fi Reporting summary. Further information on research design is available in For the Rs vs Ra analysis, 35 metabolites were excluded by ltering on non- the Nature Research Reporting Summary linked to this article. detectable values, leading to 491 metabolites to analyze. Consequently, p<1.02 × 10−4 (that is, p* < 0.05 where p*isp after Bonferroni correction) was used as the threshold to detect significantly differentially present metabolites. Data availability Heatmap was constructed on metabolites identified with the two-sided The source data underlying Fig. 1c are provided as a Source Data file. Raw data Student’s T-test in the previous step. Hierarchical clustering analysis of data was underlying Figs. 2–7 are available on MetaboLights repository for both cohorts: performed using Euclidian distance measurement of similarity followed by a MTBLS204 (cohort 1) and MTBLS205 (cohort 2). All other data are available from the ’ clustering algorithm based on Ward s linkage, that cluster data to minimize the corresponding author on reasonable request. sum of squares of any two clusters. Analyses were performed using the hclust function from R v3.5.1. The third step of analysis aimed to identify similarly behaving chemicals. The Received: 23 January 2019; Accepted: 13 November 2019; global comparison was based on the idea that, rather than absolute amounts, the relative amounts between chemicals are informative—first, because biologically they better represent the equilibria between different chemicals, hence the relative importance of different pathways and secondly, because using ratios lowers the necessity of a perfect normalization of the data. Consequently, the change in the ratio of two metabolites between the two groups was the parameter of interest. References Since the number of ratios increases as the square of the number of tested 1. Gooley, T. A. et al. Reduced mortality after allogeneic hematopoietic-cell – metabolites, and since these ratios are obviously correlated, analyzing all ratios transplantation. N. Engl. J. Med. 363, 2091 2101 (2010). individually with multiple testing corrections would be very inefficient (that is, 2. Schroeder, M. A. & DiPersio, J. F. Mouse models of graft-versus-host disease: achieving a very low power). We used instead a global comparison of all pairwise advances and limitations. Dis. Model. Mech. 4, 318–333 (2011). ratios, according to the method described in the paper by Curis et al.65 and briefly 3. Zeiser, R. & Blazar, B. R. Pathophysiology of chronic graft-versus-host disease described thereafter. and therapeutic targets. N. Engl. J. Med. 377, 2565–2579 (2017).

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 13 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3

4. Zeiser, R. & Blazar, B. R. Acute graft-versus-host disease. N. Engl. J. Med. 378, 35. Legoff, J. et al. The eukaryotic gut virome in hematopoietic stem cell 586 (2018). transplantation: new clues in enteric graft-versus-host disease. Nat. Med. 5. Santos E. & Sousa, P. et al. Peripheral tissues reprogram CD8+ T cells for https://doi.org/10.1038/nm.4380 (2017). pathogenicity during graft-versus-host disease. JCI Insight 3, https://doi.org/ 36. DeFilipp, Z. et al. Third-party fecal microbiota transplantation following allo- 10.1172/jci.insight.97011 (2018). HCT reconstitutes microbiome diversity. Blood Adv. 2, 745–753 (2018). 6. Michonneau, D. et al. The PD-1 axis enforces an anatomical segregation of 37. Mellor, A. L. & Munn, D. H. Ido expression by dendritic cells: tolerance and CTL activity that creates tumor niches after allogeneic hematopoietic stem cell tryptophan catabolism. Nat. Rev. Immunol. 4, 762–774 (2004). transplantation. Immunity 44, 143–154 (2016). 38. Nguyen, N. T. et al. Aryl hydrocarbon receptor and kynurenine: recent 7. Jenq, R. R. et al. Regulation of intestinal inflammation by microbiota following advances in autoimmune disease research. Front. Immunol. 5, https://doi.org/ allogeneic bone marrow transplantation. J. Exp. Med. 209, 903–911 (2012). 10.3389/fimmu.2014.00551 (2014). 8. Jenq, R. R. et al. Intestinal Blautia is associated with reduced death from graft- 39. Cronin, S. J. F. et al. The metabolite BH4 controls T cell proliferation in versus-host disease. Biol. Blood Marrow Transplant. https://doi.org/10.1016/j. autoimmunity and cancer. Nature 563, 564–568 (2018). bbmt.2015.04.016 (2015). 40. Hubbard, T. D. et al. Adaptation of the human aryl hydrocarbon receptor to 9. Zmora, N., Bashiardes, S., Levy, M. & Elinav, E. The role of the immune sense microbiota-derived . Sci. Rep. 5, 12689 (2015). system in metabolic health and disease. Cell Metab. 25, 506–521 (2017). 41. Hubbard, T. D., Murray, I. A. & Perdew, G. H. Indole and tryptophan 10. Shono, Y. et al. Increased GVHD-related mortality with broad-spectrum metabolism: endogenous and dietary routes to ah receptor activation. Drug antibiotic use after allogeneic hematopoietic stem cell transplantation in Metab. Dispos. 43, 1522–1535 (2015). human patients and mice. Sci. Transl. Med. 8, 339ra71 (2016). 42. Jasperson, L. K. et al. Inducing the tryptophan catabolic pathway, indoleamine 11. Peled, J. U. et al. Intestinal microbiota and relapse after hematopoietic-cell 2,3-dioxygenase (IDO), for suppression of graft-versus-host disease (GVHD) transplantation. J. Clin. Oncol. 35, 1650–1659 (2017). lethality. Blood 114, 5062–5070 (2009). 12. Veldhoen, M. & Ferreira, C. Influence of nutrient-derived metabolites on 43. Landfried, K. et al. Tryptophan catabolism is associated with acute GVHD lymphocyte immunity. Nat. Med. 21, 709–718 (2015). after human allogeneic stem cell transplantation and indicates activation of 13. Guma, M., Tiziani, S. & Firestein, G. S. Metabolomics in rheumatic diseases: indoleamine 2,3-dioxygenase. Blood 118, 6971–6974 (2011). desperately seeking biomarkers. Nat. Rev. Rheumatol. 12, 269–281 (2016). 44. Ratajczak, P. et al. IDO in human gut graft-versus-host disease. Biol. Blood 14. Levy, M., Thaiss, C. A. & Elinav, E. Metabolites: messengers between the Marrow Transplant. 18, 150–155 (2012). microbiota and the immune system. Genes Dev. 30, 1589–1597 (2016). 45. Gutiérrez-Vázquez, C. & Quintana, F. J. Regulation of the immune response 15. Reikvam, H., Hatfield, K. & Bruserud, Ø. The pretransplant systemic by the aryl hydrocarbon receptor. Immunity 48,19–33 (2018). metabolic profile reflects a risk of acute graft versus host disease after 46. Bahri, S. et al. Citrulline: from metabolism to therapeutic use. Nutrition 29, allogeneic stem cell transplantation. Metabolomics 12, 12 (2016). 479–484 (2013). 16. Mathewson, N. D. et al. Gut microbiome-derived metabolites modulate 47. Hueso, T. et al. Association between low plasma level of citrulline before intestinal epithelial cell damage and mitigate graft-versus-host disease. Nat. allogeneic hematopoietic cell transplantation and severe gastrointestinal graft Immunol. 17, 505–513 (2016). vs host disease. Clin. Gastroenterol. Hepatol. 16, 908–917.e2 (2018). 17. Fujiwara, H. et al. Microbial metabolite sensor GPR43 controls severity of 48. Rashidi, A. et al. Pretransplant serum citrulline predicts acute graft-versus- experimental GVHD. Nat. Commun. 9, 3674 (2018). host disease. Biol. Blood Marrow Transplant. https://doi.org/10.1016/j. 18. Scott, N. A. et al. Antibiotics induce sustained dysregulation of intestinal T cell bbmt.2018.06.036 (2018). immunity by perturbing macrophage homeostasis. Sci. Transl. Med. 10, 49. Calder, P. C. & Grimble, R. F. Polyunsaturated fatty acids, inflammation and eaao4755 (2018). immunity. Eur. J. Clin. Nutr. 56(Suppl 3), S14–19 (2002). 19. Haak, B. W. et al. Impact of gut colonization with butyrate-producing 50. Rezende, B. M. et al. Inhibition of 5-lipoxygenase alleviates graft-versus-host microbiota on respiratory viral infection following allo-HCT. Blood 131, disease. J. Exp. Med. 214, 3399–3415 (2017). 2978–2986 (2018). 51. Jain, U. et al. Temporal regulation of the bacterial metabolite deoxycholate 20. Riwes, M. & Reddy, P. Microbial metabolites and graft versus host disease. during colonic repair is critical for crypt regeneration. Cell Host Microbe 24, Am. J. Transplant. 18,23–29 (2018). 353–363.e5 (2018). 21. Wei, R. et al. Missing value imputation approach for mass spectrometry-based 52. Cai, S.-Y. et al. Bile acids initiate cholestatic liver injury by triggering a metabolomics data. Sci. Rep. 8, 663 (2018). hepatocyte-specificinflammatory response. JCI Insight 2, https://doi.org/ 22. Chen, J. et al. are required for expression of Toll-like receptor 2 10.1172/jci.insight.90780 (2017). modulating intestinal epithelial barrier integrity. Am. J. Physiol. Gastrointest. 53. Li, M., Cai, S.-Y. & Boyer, J. L. Mechanisms of bile acid mediated Liver Physiol. 293, G568–576 (2007). inflammation in the liver. Mol. Asp. Med. 56,45–53 (2017). 23. Zhang, M., Wang, H. & Tracey, K. J. Regulation of macrophage activation and 54. Wallner, S. & Schmitz, G. Plasmalogens the neglected regulatory and inflammation by : a new chapter in an old story. Crit. Care Med. 28, scavenging lipid species. Chem. Phys. Lipids 164, 573–589 (2011). N60–66 (2000). 55. Gil-de-Gómez, L., Astudillo, A. M., Lebrero, P., Balboa, M. A. & Balsinde, J. 24. Hayes, C. S. et al. Polyamine-blocking therapy reverses immunosuppression in Essential role for ethanolamine plasmalogen hydrolysis in bacterial the tumor microenvironment. Cancer Immunol. Res. 2, 274–285 (2014). lipopolysaccharide priming of macrophages for enhanced arachidonic acid 25. Henrich, F. C. et al. Suppressive effects of tumor cell-derived 5’-deoxy-5’- release. Front. Immunol. 8, 1251 (2017). methylthioadenosine on human T cells. Oncoimmunology 5, e1184802 (2016). 56. Wilmanski, T. et al. Blood metabolome predicts gut microbiome α-diversity 26. Levy, M. et al. Microbiota-modulated metabolites shape the intestinal in humans. Nat. Biotechnol. https://doi.org/10.1038/s41587-019-0233-9 microenvironment by regulating NLRP6 inflammasome signaling. Cell 163, (2019). 1428–1443 (2015). 57. Glucksberg, H. et al. Clinical manifestations of graft-versus-host disease in 27. Guo, C. et al. Bile acids control inflammation and metabolic disorder through human recipients of marrow from HL-A-matched sibling donors. inhibition of NLRP3 inflammasome. Immunity 45, 802–816 (2016). Transplantation 18, 295–304 (1974). 28. Ma, C. et al. Gut microbiome-mediated bile acid metabolism regulates liver 58. Przepiorka, D. et al. 1994 consensus conference on acute GVHD grading. cancer via NKT cells. Science 360, https://doi.org/10.1126/science.aan5931 Bone Marrow Transpl. 15, 825–828 (1995). (2018). 59. Dehaven, C. D., Evans, A. M., Dai, H. & Lawton, K. A. Organization of GC/ 29. Schramm, C. Bile acids, the microbiome, immunity, and liver tumors. N. Engl. MS and LC/MS metabolomics data into chemical libraries. J. Cheminformatics J. Med. 379, 888–890 (2018). 2, 9 (2010). 30. Pols, T. W. H. et al. TGR5 activation inhibits atherosclerosis by reducing 60. DeHaven, C. D., Evans, A. M., Dai, H. & Lawton, K. A. Software techniques macrophage inflammation and lipid loading. Cell Metab. 14, 747–757 (2011). for enabling high-throughput analysis of metabolomic datasets. Metabolomics. 31. Haselow, K. et al. Bile acids PKA-dependently induce a switch of the IL-10/IL- https://doi.org/10.5772/31277 (2012). 12 ratio and reduce proinflammatory capability of human macrophages. J. 61. Keizer, R. J. et al. Incorporation of concentration data below the limit of Leukoc. Biol. 94, 1253–1264 (2013). quantification in population pharmacokinetic analyses. Pharmacol. Res. 32. Perino, A. et al. TGR5 reduces macrophage migration through mTOR- Perspect. 3, e00131 (2015). induced C/EBPβ differential translation. J. Clin. Invest. 124, 5424–5436 62. Do, K. T. et al. Characterization of missing values in untargeted MS-based (2014). metabolomics data and evaluation of missing data handling strategies. 33. Swimm, A. et al. Indoles derived from intestinal microbiota act via type I Metabolomics 14, 128 (2018). interferon signaling to limit graft-versus-host disease. Blood 132, 2506–2519 63. Lê, S., Josse, J. & Husson, F. FactoMineR: an R package for multivariate (2018). analysis. J. Stat. Softw. 25,1–18 (2008). 34. Wikoff, W. R. et al. Metabolomics analysis reveals large effects of gut 64. Rohart, F., Gautier, B., Singh, A. & Lê Cao, K.-A. mixOmics: an R package microflora on mammalian blood metabolites. Proc. Natl Acad. Sci. USA 106, for’omics feature selection and multiple data integration. PLoS Comput. Biol. 3698–3703 (2009). 13, e1005752 (2017).

14 NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-13498-3 ARTICLE

65. Curis, E. et al. Determination of sets of covariating gene expression using script. All authors provided feedback on the manuscript. D.M., S.C., L.R., and G.S. graph analysis on pairwise expression ratios. Bioinformatics 35, 258–265 supervised and conceived the project. R.P.L. and G.S. secured funding for the study. (2019). 66. Tibshirani, R. Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Competing interests Ser. B Methodol. 58, 267–288 (1996). The authors declare no competing interests.

Acknowledgements Additional information The authors thank all members of the CRYOSTEM Consortium and of the Franco- Supplementary information is available for this paper at https://doi.org/10.1038/s41467- phone Society of Marrow Transplantations and Cellular Therapy (SFGM-TC) for 019-13498-3. providing patients samples used in this study and for their support: University Hos- pital of Angers, University Hospital of Dijon Bourgogne, University Hospital of Correspondence and requests for materials should be addressed to G.Sé. Besançon, University Hospital of Grenoble, University Hospital of Lille, University Hospital of Lyon, University Hospital of Bordeaux, Paoli-Calmettes Institute, AP-HM, Peer review information Nature Communications thanks Jyoti Shankar, Jeff Xia, and the University Hospital of Nantes, AP-HP (Saint Louis Hospital, La Pitié-Salpêtrière other, anonymous, reviewer/s for their contribution to the peer review of this work. Peer Hospital, Saint-Antoine Hospital, Robert Debré Hospital, Necker Hospital, Groupe reviewer reports are available. Hospitalier Henri Mondor), INSERM, University Hospital of Toulouse, University Hospital of Tours, University Hospital of Rennes, University Hospital of Clermont- Reprints and permission information is available at http://www.nature.com/reprints Ferrand, University Hospital of Saint-Etienne, Lucien Neuwirth Cancer Institute, University Hospital of Poitiers, University Hospital of Nice, University Hospital of Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in Brest, Military Hospital of Percy, University Hospital of Montpellier, Gustave Roussy published maps and institutional affiliations. Institute, University Hospital of Limoges, University Hospital of Caen, Etablissement Français du Sang. We thank Alexion Company for their financial support. This project was funded by a grant from the Direction Générale de l’OffredeSoins(DGOS)and Open Access the Agence National de la Recherche (ANR) within a Projet de Recherche Transla- This article is licensed under a Creative Commons tionnelle en Santé (ANR-PRTS 13002). G.S. received a research grant from Alexion Attribution 4.0 International License, which permits use, sharing, Pharmaceutical company (APALEX) and from the Institut National du Cancer adaptation, distribution and reproduction in any medium or format, as long as you give (project number INCa 2014-1-PL BIO-07-IP-1). CRYOSTEM project is supported by appropriate credit to the original author(s) and the source, provide a link to the Creative a grant from the Institut National du Cancer (INCa) and the Agence Nationale de la Commons license, and indicate if changes were made. The images or other third party ’ Recherche (ANR) and under the auspices of the Société Francophone de Greffe de material in this article are included in the article s Creative Commons license, unless Moelle et de Thérapie Cellulaire (SFGM-TC). indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from Author contributions the copyright holder. To view a copy of this license, visit http://creativecommons.org/ D.M. and E.L. performed the experiments, collected and analyzed the data, designed licenses/by/4.0/. figures. E.C and S.C. performed statistical analysis. L.D., S.R., and B.I. performed the experiments. R.P.L. coordinated Cryostem consortium and provided access to samples from cohort 2. M.R. and F.S.F. collected clinical data. D.M. and G.S wrote the manu- © The Author(s) 2019

NATURE COMMUNICATIONS | (2019) 10:5695 | https://doi.org/10.1038/s41467-019-13498-3 | www.nature.com/naturecommunications 15