
This is a repository copy of Host transcriptomic signature as alternative test-of-cure in visceral leishmaniasis patients coinfected with HIV.

White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/158875/

Version: Accepted Version

Article: Kaye, Paul orcid.org/0000-0002-8796-4755, Adriaensen, Wim, Cuypers, Bart et al. (7 more authors) (2020) Host transcriptomic signature as alternative test-of-cure in visceral leishmaniasis patients coinfected with HIV. EBioMedicine. ISSN 2352-3964 (In Press)


Host transcriptomic signature as alternative test-of-cure in visceral leishmaniasis patients coinfected with HIV

Wim Adriaensen1*, Bart Cuypers2,3, Carlota F. Cordero1, Bewketu Mengasha4, Séverine Blesson5, Lieselotte Cnops1, Paul M. Kaye6, Fabiana Alves5, Ermias Diro4, Johan van Griensven1

*corresponding author ([email protected], +3232470807)

1 Unit of Neglected Tropical Diseases, Department of Clinical Sciences, Institute of Tropical Medicine, Nationalestraat 122, 2000 Antwerp, Belgium

2 Unit of Molecular Parasitology, Department of Biomedical Sciences, Institute of Tropical Medicine, Nationalestraat 122, 2000 Antwerp, Belgium

3 Adrem Data Lab, Department of Mathematics and Computer Science, University of Antwerp, Middelheim 1, 2020 Antwerp, Belgium

4 Leishmaniasis Research and Treatment Centre, University of Gondar, Gondar, Ethiopia

5 Drugs for Neglected Diseases initiative, 15 Chemin Louis-Dunant, 1202 Geneva, Switzerland

6 York Biomedical Research Institute, University of York, UK

Keywords: Visceral leishmaniasis, HIV, RNA signature, treatment efficacy, blood signature

Abstract:

Background:

Visceral leishmaniasis (VL) treatment in HIV patients very often fails and is followed by high relapse and case-fatality rates. Hence, treatment efficacy assessment is imperative, but it relies on invasive organ aspiration for parasite detection. In search of a less-invasive alternative, and because the host immune response is pivotal for treatment outcome in immunocompromised VL patients, we studied changes in the whole blood transcriptional profile of VL-HIV patients during treatment.

Methods:

Embedded in a clinical trial in Northwest Ethiopia, RNA-Seq was performed on whole blood samples of 28 VL-HIV patients before and after completion of a 29-day treatment regimen of AmBisome or AmBisome/miltefosine. Pathway analyses were combined with a machine learning approach to establish a clinically useful 4-gene set.

Findings:

Distinct signatures of genes differentially expressed between D0 and D29 were identified for patients who failed treatment and for those who were successfully treated. Pathway analyses in the latter highlighted a downregulation of genes associated with host cellular activity and immunity, and an upregulation of antimicrobial peptide activity in phagolysosomes. Neither signs of disease remission nor pathway enrichment were observed in treatment failure patients. Next, we identified a 4-gene pre-post signature (PRSS33, IL10, SLFN14, HRH4) that could accurately discriminate treatment outcome at end of treatment (D29), displaying an average area-under-the-ROC-curve of 0.95 (CI: 0.75–1.00).

Interpretation:

A simple blood-based signature thus holds significant promise to facilitate treatment efficacy monitoring and provide an alternative test-of-cure to guide patient management in VL-HIV patients.

Funding:

Project funding was provided by the AfricoLeish project, supported by the European Union Seventh Framework Programme (EU FP7).

Research in context

Evidence before this study

Human immunodeficiency virus (HIV) infection has been identified as a significant challenge facing visceral leishmaniasis (VL) control. VL treatment in HIV patients very often fails and results in extended treatment time followed by high relapse and case-fatality rates. Given the limited and toxic treatment options, treatment outcome assessment thus becomes imperative in VL-HIV patients to guide decisions on treatment extension, treatment adjustment or secondary prophylaxis initiation. However, with no alternatives to assess treatment efficacy, repeated invasive and painful aspiration from infected organs for microscopic detection of the parasite remains the only approach. Chronic patients thus undergo repeated tissue aspirates or empirical optimization of treatment regimens. Hence, the development of a less-invasive alternative to assess treatment efficacy represents an urgent and important unmet clinical need to achieve VL control.

Added value of this study

Our study is the first to investigate whole transcriptome changes in the severely neglected group of visceral leishmaniasis patients with a concurrent HIV infection, where the host immune response is largely uncharted but believed to be pivotal for treatment outcome. In a translational effort, the findings were biologically interpreted and systematically reduced to a 4-gene biomarker that could accurately discriminate treatment outcome.

Implications of all the available evidence

This signature holds significant promise to facilitate treatment efficacy monitoring in R&D and could provide an alternative test-of-cure to guide patient management in VL-HIV patients.

Introduction

Visceral leishmaniasis (VL) is a potentially fatal, yet neglected, vector-borne disseminated infection caused by protozoans of the Leishmania donovani spp. complex (1). Typical symptoms include fever spikes, substantial weight loss, splenomegaly and alterations of haematopoiesis. An estimated 90,000 cases occur globally each year, and Ethiopia, together with Brazil, India, Kenya, Somalia, South Sudan and Sudan, hosts more than 90% of all VL cases (2).

Compared to cure rates of around 90-95% in VL patients, treatment of patients with a concurrent Human Immunodeficiency Virus-1 (HIV) infection (referred to herein as ‘VL-HIV patients’) in endemic regions of East Africa, Brazil and India frequently fails. Treatment failure results in extended treatments and case-fatality rates up to 25% (3,4). This is particularly true for East Africa, where antileishmanial drugs show lower efficacy rates and HIV prevalence rates of 10-20% are reported among VL patients (3). Even if apparent parasitological clearance at end of treatment and viral suppression with antiretroviral therapy (ART) are achieved, up to 60% of VL-HIV patients will develop recurrent relapses, typically within 3-6 months after initial cure (compared to 1-5% in immunocompetent VL patients) (3,5). Hence, treatment outcome assessment is imperative in VL-HIV patients to guide decisions on treatment extension, treatment adjustment or secondary prophylaxis initiation.

To date, repeated invasive and painful aspiration from infected organs (spleen, bone marrow or lymph nodes) for microscopic detection of the parasite remains the only approach for test-of-cure. Although spleen aspiration shows the highest sensitivity (6), it carries a life-threatening risk of splenic haemorrhage that renders it unsuitable in patients with severe thrombocytopenia. In addition, these techniques require a high level of expertise, trained personnel and appropriate facilities where blood transfusion and management of intraabdominal bleeding are possible. For these reasons, chronic patients often undergo repeated tissue aspirates or empirical optimization of treatment regimens. Hence, the development of a less-invasive alternative to assess treatment efficacy represents an urgent and important unmet clinical need.

Molecular techniques for parasite detection seem promising, but could be less suitable because the parasite load in blood decreases steeply after two days of treatment and gives no information on the host’s immunological recovery (7). In immunocompromised individuals in particular, host immune response restoration has been shown to be pivotal in the efficacy of VL treatment (8). Therefore, transcriptomic signatures in peripheral blood may reflect immunological responses underpinning clearance or persistence of parasites. In recent years, gene signatures derived from blood transcriptomic profiling have shown great promise in treatment monitoring for a number of infectious diseases (9–11). With a 5-gene signature, robust prediction of treatment failure in tuberculosis patients could be achieved after 1 or 4 weeks of therapy (11). Similarly, Liu et al. identified and validated a 10-gene signature that predicted Ebola treatment outcome with an accuracy of 85% to 92% (9). Yet, previous studies were often confined to single timepoint measurements and purely statistical approaches, which can be complicated by patient-to-patient variation and by little biological relevance of the selected genes, all jeopardizing their generalizability.

To date, two small-scale studies in VL patients described distinct expression profiles in the blood and the lymph nodes, respectively, before and after treatment with amphotericin B (12) or sodium stibogluconate (13). Likewise, Gardinassi et al. defined distinct immunological signatures in the blood for active and cured L. infantum infected patients (14). Although performed in immunocompetent VL patients, these findings support the pursuit of a blood-based test-of-cure. None of the previous studies, however, sought to translate such knowledge into a clinically useful signature to guide the challenging clinical management of VL-HIV patients and facilitate the evaluation in clinical trials of urgently needed novel therapeutics.

Here, to minimize the impact of heterogeneity in gene expression levels among individual VL-HIV patients, we studied relative changes in the gene expression profile during VL treatment. By combining biological insight with a stringent machine learning approach, we have identified a relevant and robust 4-gene blood signature that accurately determined treatment outcome in VL-HIV patients.

Materials and Methods

Study population and design

In 2014-16, a total of 30 patients recruited at the Leishmaniasis Research and Treatment Centre (Gondar, Ethiopia) in a randomized clinical trial (RCT) on combination treatment (NCT02011958, sponsored by the Drugs for Neglected Diseases Initiative) were included in this study for additional whole blood sampling (15). In short, AmBisome monotherapy and a combination treatment of AmBisome and miltefosine were administered. The AmBisome monotherapy dosage was 40 mg/kg total dose, given as IV infusions of 5 mg/kg on days 1-5, 10, 17 and 24. In the combination treatment, the AmBisome dosage was 30 mg/kg total dose, given as IV infusions of 5 mg/kg on days 1, 3, 5, 7, 9 and 11, with miltefosine every day for 28 days (50 mg if patient weight < 25 kg, 100 mg if patient weight > 25 kg). Treatment success was defined as absence of parasites in spleen aspirate at D29. Treatment failure at D29 was defined as presence of parasites at the D29 assessment, or no clinical response to treatment requiring rescue medication on or before D29. Treatment was extended or rescue medication was provided after D29 based on parasitological examination and clinical improvement. Once the patients had a negative parasitology result, they started a follow-up period of one year (up to D390) to assess long-term relapse-free survival and safety. Rescue therapy was also given to all patients who had a confirmed VL relapse during follow-up (for more details, see (16)). After exclusion of 2 patients due to a technical error in the RNA isolation process, 28 VL-HIV patients were included in the herein described analyses (Figure 1).

Ethics approval and consent to participate

The study protocol was approved by the University of Gondar Institutional Review Board, the Ethiopian National Research Ethics Review Committee, the Médecins Sans Frontières Ethics Review Board, the London School of Hygiene and Tropical Medicine Research Ethics Committee, the Antwerp University Hospital Ethics Committee, the Prince Leopold Institute of Tropical Medicine Institutional Review Board and, for this specific study, by the University of York Biology Ethics Committee. All patients provided informed consent and the study was carried out in accordance with international guidelines (Declaration of Helsinki, Good Clinical Practice and local regulations).

RNA isolation and sequencing

Whole blood was collected and stabilized in 2.5 mL PAXgene Blood RNA tubes (PreAnalytiX GmbH, Hombrechtikon, Switzerland) and stored at -80 °C. Isolation and purification of total RNA was performed using the PAXgene Blood RNA kit (PreAnalytiX GmbH, Hombrechtikon, Switzerland) according to the manufacturer’s instructions. Next, messenger RNA was enriched by depleting ribosomal and globin RNA (Globin-Zero Gold removal kit, Illumina Inc, San Diego, USA). RNA concentrations were measured with a NanoDrop 1000 Spectrophotometer (NanoDrop Technologies, USA) and RNA integrity with an Agilent 2100 Bioanalyzer (Nano kit, Agilent, CA, US). Subsequently, cDNA amplification, adapter ligation and indexing were carried out on 1 µg of total RNA using TruSeq stranded mRNA library preparation (Illumina Inc, San Diego, USA). Libraries were sequenced on an Illumina NextSeq500 instrument (single-end, 75 bp) using 1.2 pM and 1.89% PhiX, with a total of 4 runs and an average coverage of 19.3 million reads per sample. Of the trimmed passed-filter reads, 98% mapped against the human reference genome.

Mechanistic approach

Longitudinal and inclusive differential gene expression (DGE) analyses (FDR-corrected p-value ≤ 0.05 and a ≥ 1.5 absolute fold difference) between the timepoints D0 and D29 were performed for both the treatment success and failure group, based on a paired generalized linear model with CLC Genomics Workbench software v12 (Qiagen Bioinformatics). Next, the pre-ranked feature within Gene Set Enrichment Analysis (GSEA) software v.3 (Broad Institute, Cambridge, MA, USA) was used to determine enriched gene sets in the treatment groups. The well-defined hallmark, curated canonical pathway and gene ontology datasets from the Molecular Signatures Database (MSigDB) were used for the enrichment analyses. In addition, previously described whole blood transcription modules (BTMs) were evaluated (17,18). Gene sets were considered significant if their false discovery rate (FDR) was less than 25%. Default parameters were used with 1000 permutations, and the conservative classic enrichment statistic was used for calculation of the enrichment score (ES). Gene sets that did not have a minimum of 10 genes shared with the pre-ranked gene list were excluded from the analyses, and normalized enrichment scores were depicted. Leading edge analyses were run on the GSEA results to identify core group genes.
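The DGE thresholds described above (FDR-corrected p-value ≤ 0.05 and ≥ 1.5 absolute fold difference) can be illustrated with a minimal sketch. It assumes a Benjamini-Hochberg correction, which the text does not specify, and takes per-gene log2 fold changes and raw p-values as given. The function names and the toy values are hypothetical, and this is not the CLC Genomics Workbench implementation.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, log2_fold_changes, p_values,
                fdr_cutoff=0.05, min_abs_fold=1.5):
    """Keep genes passing both the FDR and the absolute fold-change threshold."""
    fdr = benjamini_hochberg(p_values)
    min_abs_log2 = math.log2(min_abs_fold)  # 1.5-fold is about 0.585 in log2 units
    return [g for g, lfc, q in zip(genes, log2_fold_changes, fdr)
            if q <= fdr_cutoff and abs(lfc) >= min_abs_log2]

# Toy input (hypothetical values, not study data).
genes = ["A", "B", "C", "D"]
lfc = [1.0, -0.9, 0.2, 2.0]
pvals = [0.001, 0.01, 0.02, 0.5]
degs = select_degs(genes, lfc, pvals)
```

In this toy input, gene C passes the FDR cut-off but not the fold-change threshold, and gene D has a large fold change but fails the FDR cut-off, so only A and B are retained.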

Non-redundant biological terms were visualized in a functionally grouped network by means of the ClueGO plug-in v2.5.4 for Cytoscape v.3.7.1 (19,20). Enrichment/depletion analyses were carried out with a two-sided hypergeometric test for datasets from the EBI-UniProt GO, KEGG, REACTOME and WikiPathways databases. All genes were recognised and 45 (11.3%) were not functionally annotated. The applied kappa statistic for connectivity was kept at the default value of 0.4. GO term fusion was selected to reduce redundancy. In addition, only terms with a p-value of <0.100 were shown, within a GO tree interval of levels 3 to 8 (medium to detailed network specificity). The AutoAnnotate plug-in was used for cluster delineation and naming.
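For readers unfamiliar with the enrichment/depletion test named above, the following sketch shows one common definition of a two-sided hypergeometric p-value: the summed probability of all outcomes that are no more likely than the observed overlap. Whether ClueGO uses exactly this definition is an assumption; the function names and numbers are illustrative only.

```python
from math import comb

def hypergeom_pmf(k, population, annotated, sample):
    """P(exactly k annotated genes in a sample drawn without replacement)."""
    return (comb(annotated, k) * comb(population - annotated, sample - k)
            / comb(population, sample))

def hypergeom_two_sided(k, population, annotated, sample):
    """Two-sided p-value: total probability of outcomes as or less likely than k."""
    lowest = max(0, sample - (population - annotated))
    highest = min(sample, annotated)
    p_observed = hypergeom_pmf(k, population, annotated, sample)
    total = 0.0
    for i in range(lowest, highest + 1):
        p_i = hypergeom_pmf(i, population, annotated, sample)
        # Small relative tolerance guards against floating-point ties.
        if p_i <= p_observed * (1 + 1e-9):
            total += p_i
    return min(1.0, total)

# Toy example: 20 genes in the background, 10 annotated to a term,
# 10 genes in the list of interest, all 10 of them annotated.
p_extreme = hypergeom_two_sided(10, population=20, annotated=10, sample=10)
```

Because the test is two-sided, a term can reach significance either by being over-represented (enrichment) or under-represented (depletion) in the gene list.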

Discriminatory approach

A cross-sectional and stringent machine learning approach was designed. The count table was normalized for the ‘per sample’ sequencing depth using the DESeq2 package in R. Genes that had a read count of zero in 80% of the samples were removed. The D29/D0 expression values were used as input for the machine learning approach to correct for patient-specific expression patterns. We adopted a random forest (RF) classification with leave-one-out cross-validation. Within each cross-validation loop, a feature selection was first applied on the remaining samples, selecting only the genes differentially expressed between treatment cure and failure based on a more stringent DGE analysis (FDR-corrected p-value ≤ 0.01 and a ≥ 2 absolute fold difference) to target a limited set of highly discriminatory genes. These DEGs were then used to construct the RF classifier. The forest was limited to 100 trees, with a maximum depth of 4 and an equal minimum number of samples in each leaf. The relative importances of the genes in the forest decision making were stored if the left-out sample was correctly predicted. After summing the classifier importances for each gene from every successful loop, the top 20 genes were depicted. The final 4-gene RF classifier was constructed with a similar approach but, to further reduce potential overfitting, a 4-fold cross-validation was applied with a maximum tree depth of 3 and a minimum of 3 samples in each leaf. All these analyses were carried out using the sklearn package in Python v3.7.0 (21).
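The key design point above is that feature selection is repeated inside every cross-validation loop, so the left-out sample never influences which genes are chosen. The sketch below illustrates only that nesting. To stay dependency-free it substitutes a simple mean-difference filter for the stringent DGE analysis and a nearest-centroid rule for the random forest; both substitutions, all function names and the toy data are assumptions made for illustration, not the study's pipeline. Inputs are taken to be per-patient log2(D29/D0) values, with at least two patients per outcome group.

```python
from statistics import mean

def select_features(X, y, min_abs_diff=1.0):
    """Stand-in for the DGE filter: keep features whose class means differ enough."""
    keep = []
    for j in range(len(X[0])):
        cure = mean(x[j] for x, label in zip(X, y) if label == 1)
        fail = mean(x[j] for x, label in zip(X, y) if label == 0)
        if abs(cure - fail) >= min_abs_diff:
            keep.append(j)
    return keep

def nearest_centroid_predict(X_train, y_train, x_test, features):
    """Stand-in classifier: assign the class whose centroid is closest."""
    def centroid(label):
        rows = [x for x, l in zip(X_train, y_train) if l == label]
        return [mean(r[j] for r in rows) for j in features]
    def squared_distance(c):
        return sum((x_test[j] - cj) ** 2 for j, cj in zip(features, c))
    return 1 if squared_distance(centroid(1)) < squared_distance(centroid(0)) else 0

def loocv_accuracy(X, y):
    """Leave-one-out CV with feature selection redone on each training fold."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = select_features(X_train, y_train)  # never sees sample i
        if not features:
            features = list(range(len(X[0])))  # fall back to all features
        correct += nearest_centroid_predict(X_train, y_train, X[i], features) == y[i]
    return correct / len(X)

# Toy log2(D29/D0) values for two genes in six patients (hypothetical).
X = [[2.0, 0.1], [2.2, -0.1], [1.8, 0.0], [-2.0, 0.1], [-1.9, 0.0], [-2.1, -0.1]]
y = [1, 1, 1, 0, 0, 0]  # 1 = treatment success, 0 = treatment failure
accuracy = loocv_accuracy(X, y)
```

Selecting features once on all samples before cross-validation would leak outcome information into each fold and inflate the estimated accuracy, which is the overfitting risk the nested design is meant to limit.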

Reverse transcription quantitative polymerase chain reaction (RT-qPCR)

Predesigned PrimeTime® qPCR assays (Integrated DNA Technologies (IDT), Iowa, USA) were used for human gene expression analysis of HRH4 (Hs.PT.58.4403324), IL10 (Hs.PT.58.2807216), PRSS33 (Hs.PT.58.3087623), SLFN14 (Hs.PT.58.993231) and the reference gene TBP (Hs.PT.58v.39858774) (File S2). The recommended assay with the highest algorithm score was selected for each gene. After read counts were normalized via DESeq2, TBP was selected as reference gene based on lowest maximal deviation between samples, lowest coefficient of variation, lowest log2 fold change (LOG2FC), commercial availability and literature review. Previous studies have validated TBP as a reliable reference gene for qPCR (22,23) and it was detected in all study samples.

Samples with sufficient left-over volume to reach an equal amount of 0.5 µg in every RNA extract were included (Figure S2). Both cDNA synthesis, with the iScript cDNA Synthesis Kit (BioRad Laboratories, California, USA), and qPCR were performed on the QuantStudio 5 real-time PCR system (Applied Biosystems, ThermoFisher, Waltham, MA, USA). The PrimeTime gene expression master mix with low passive reference dye (IDT) was used. To reduce technical variation, all samples from 1 patient were run in duplicate for the four genes and the reference gene on one plate simultaneously. As almost all efficiencies ranged between 90% and 110%, we used the comparative Ct-value method (ΔΔCt) for relative quantification between the timepoints. Individual LOG2FC between D29 and D0 were calculated based on counts per million (CPM) values or Ct-values for the RNA sequencing or RT-qPCR results, respectively.
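As a worked illustration of the comparative Ct method mentioned above, the sketch below computes a per-patient LOG2FC between D29 and D0 from duplicate Ct values. It assumes an amplification efficiency of about 100%, so that one cycle corresponds to a two-fold change, and uses hypothetical Ct values and function names.

```python
from statistics import mean

def delta_ct(target_cts, reference_cts):
    """Mean target Ct minus mean reference Ct (technical duplicates averaged)."""
    return mean(target_cts) - mean(reference_cts)

def log2_fold_change(target_d0, reference_d0, target_d29, reference_d29):
    """LOG2FC of D29 versus D0 by the comparative Ct (delta-delta Ct) method."""
    ddct = delta_ct(target_d29, reference_d29) - delta_ct(target_d0, reference_d0)
    # Fold change is 2 ** (-ddCt), so its log2 is simply -ddCt.
    return -ddct

# Hypothetical duplicate Ct values for one target gene and the reference gene.
lfc = log2_fold_change(
    target_d0=[30.1, 29.9], reference_d0=[25.0, 25.0],
    target_d29=[28.0, 28.0], reference_d29=[25.0, 25.0],
)
fold = 2 ** lfc
```

Here the target gene's Ct falls by two cycles relative to the reference gene between D0 and D29, which corresponds to a LOG2FC of about 2, or roughly a four-fold increase in expression.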

Statistical and visual analyses

For patient characteristics (Table 1), continuous data were presented as medians with interquartile ranges and categorical data as numbers and frequencies. Statistical significance between treatment success and failure patients was determined with the non-parametric Mann-Whitney U test for continuous variables and the two-sided Fisher’s exact test for categorical variables. P-values < 0.05 were considered significant. The ROC curve of the final 4-gene classifier was generated using the sklearn package in Python. The AUC, sensitivity and specificity were calculated over 1000 bootstrap iterations. For each iteration, a 4-fold cross-validation was applied. Principal component analysis (PCA) scatter plots, heatmaps (Euclidean distance, complete linkage), Venn diagrams and volcano plots were created with the CLC Genomics Workbench software v.12 (Qiagen Bioinformatics). Violin plots were created with GraphPad Prism v8.0.1 (GraphPad Software, San Diego, USA).
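The bootstrap estimate of the AUC can be sketched as follows. This is a simplified illustration: it resamples a fixed set of classifier scores and reports a percentile interval, whereas the analysis described above re-ran a 4-fold cross-validation within each bootstrap iteration. The percentile interval, the function names and the example values are assumptions.

```python
import random

def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def bootstrap_auc(labels, scores, n_iterations=1000, alpha=0.05, seed=0):
    """Mean AUC and percentile confidence interval over bootstrap resamples."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_iterations:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue  # AUC is undefined when a resample holds a single class
        aucs.append(roc_auc(resampled_labels, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_iterations)]
    upper = aucs[min(n_iterations - 1, int((1 - alpha / 2) * n_iterations))]
    return sum(aucs) / n_iterations, lower, upper

# Hypothetical scores: 1 = treatment success, 0 = treatment failure.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
mean_auc, ci_low, ci_high = bootstrap_auc(labels, scores)
```

With a small cohort, the lower bound of such an interval can sit far below the mean AUC, which is why an interval is reported alongside the point estimate.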

Results

Nested in a clinical trial, 30 VL-HIV patients underwent additional whole blood sampling. After exclusion of two patients due to technical error, a total of 28 VL-HIV patients were included in the analyses (Figure 1a). Twelve (43%) patients reached parasitological cure after 1 cycle of treatment and 16 required extended or rescue treatment. At admission, half of the cases had a history of VL and 82% were on antiretroviral therapy (ART) (Table 1). In both treatment outcome groups, a similar CD4 reconstitution of around 110 cells/µL was observed after 29 days of VL treatment compared to baseline. Except for a higher percentage of viral controllers in the treatment failure group (possibly linked to the higher number of people on ART), no significant differences in patient characteristics or treatment regimens were observed between the treatment success and failure groups. These findings indicate that HIV-related markers were not associated with VL treatment outcome: VL treatment failure still occurred despite successful suppression of the viral load or CD4 T cell recovery and, vice versa, patients with non-controlled HIV were able to clear the infection.

Table 1: Patient characteristics before and after treatment, stratified by treatment outcome

| Characteristic | TOTAL (n=28) | SUCCESS (n=12, 43%) | FAILURE (n=16, 57%) | p-value |
|---|---|---|---|---|
| Demographic characteristics | | | | |
| Age, median (1-99%) | 34 (27-45) | 33 (28-45) | 34 (27-44) | 0.815 |
| Men, n (%) | 28 (100) | 12 (100) | 16 (100) | 1.000 |
| Clinical characteristics at D0 | | | | |
| Treatment regime, n (%) | | | | 0.491 |
| AmBisome | 9 (32.1) | 3 (25) | 6 (37.5) | |
| AmBisome + Miltefosine | 19 (67.9) | 9 (75) | 10 (62.5) | |
| Primary VL, n (%) | 14 (50) | 5 (41.7) | 9 (56.3) | 0.704 |
| Site of parasite detection, n (%) | | | | 0.141 |
| Spleen (+1 to +2) | 1 (4) | 1 (8) | 0 (0) | |
| Spleen (+3 to +6) | 26 (92) | 11 (92) | 15 (94) | |
| Bone marrow | 1 (4) | 0 (0) | 1 (6) | |
| On ART, n (%) | 23 (82.1) | 8 (66.7) | 15 (93.8) | 0.133 |
| HIV viral load, n (%) | | | | 0.049 |
| Undetectable | 8 (29) | 1 (8) | 7 (44) | |
| >20-10000 copies/mL | 7 (25) | 3 (25) | 4 (25) | |
| >10000 copies/mL | 13 (46) | 8 (67) | 5 (31) | |
| CD4 count, median (1-99%) | 47 (11-159) | 44.5 (11-159) | 61.5 (15-152) | 0.516 |
| Concomitant diseases, n (%) | 23 (82.1) | 10 (83.3) | 13 (81.3) | 1.000 |
| Platelet count (×10³/µL), median (1-99%) | 100 (47-195) | 116 (47-181) | 98 (55-195) | 0.693 |
| Clinical characteristics at D29 | | | | |
| Site of parasite detection, n (%) | | | | NA |
| Spleen (+1 to +2) | 6 (21) | NA | 6 (37.5) | |
| Spleen (+3 to +6) | 6 (21) | NA | 6 (37.5) | |
| Bone marrow | 4 (14) | NA | 4 (25) | |
| CD4 count, median (1-99%) | 155 (20-341) | 153 (20-285) | 171 (42-341) | 0.807 |
| Platelet count (×10³/µL), median (1-99%) | 221 (87-701) | 206 (115-464) | 235 (87-701) | 0.710 |

VL: visceral leishmaniasis, ART: antiretroviral therapy, HIV: human immunodeficiency virus, D: day of treatment, NA: not applicable

Figure 1: Flow charts of study participants and study design. (a) Flow chart of inclusion and stratification by treatment outcomes of study participants. (b) Lay-out of the study design and generation of the 4-gene signature. LTFU: lost to follow-up, D: Day of treatment, VL-HIV: visceral leishmaniasis-HIV co-infection, DGE: differential gene expression.

Distinct gene expression changes in treatment success and failure patients

We first used principal component analysis (PCA) to reduce the dimensionality of the data and determine whether patients could be clustered. However, after pooling all the samples, the first principal component explained only 11.9% of the total variance in the dataset and this was associated with a lower read count in one of the samples (Figure S1).

To account for this high variability in transcript abundance between individual VL-HIV patients, possibly due to diverse HIV or VL histories, genetic backgrounds, nutritional status and a high presence of comorbidities (>80%, Table 1), we performed longitudinal paired analyses of intra-individual changes in gene expression levels between Day 0 (D0) and D29 (using the mechanistic approach described in Methods, Figure 1b). Differential gene expression (DGE) analyses between D0 and D29 resulted in distinct transcriptomic patterns by treatment outcome, with 397 and 194 unique differentially expressed genes (DEGs) for the treatment success and failure group, respectively (Figure 2a, File S1 for complete lists). Only 142 DEGs were common to both groups, of which only one (CTD-2545M3.6), encoding an uncharacterised protein, was regulated in opposite directions in the two groups. An almost equal distribution of up- (n=218/397, 55%) and down- (n=179/397, 45%) regulated genes was observed during treatment success, while in the treatment failure group fewer genes changed significantly in expression level and their expression was mainly downregulated (n=166/194, 86%) (Figure 2a). This observation was also reflected in the corresponding unsupervised hierarchical clustering of the expression data (Figure 2b&c). In the treatment success group, all samples clustered distinctly by timepoint and showed a reversed pattern at D29 (Figure 2b). In the treatment failure group, no primary clustering by timepoint was observed and the pattern was more scattered (Figure 2c), indicating no shared, clearly delineated impact of VL treatment on the host transcriptomic profile. Altogether, these results suggest that significant changes take place in the abundance of blood transcripts during successful VL treatment and that the patterns of gene expression depend on the treatment efficacy.

Figure 2: Distinct patterns of differentially expressed genes (DEGs) in treatment success and failure patients. (a) Overlap of DEGs between D0 and D29 for treatment success and failure patients. Venn diagram showing the number, uniqueness and directionality of the DEGs (>1.5 absolute fold change with FDR p-value <0.05; see mechanistic approach in Materials and Methods) between D0 and D29 for treatment success (orange, n=12) and failure cases (red, n=16). (b) Unsupervised clustering of DEGs in treatment success and (c) failure patients. Heat maps showing the z-scores (bottom scale) or extent of up- (red) or downregulation (blue) of the DEGs (Y-axis) at D0 (green samples) and D29 (purple samples) for all individual treatment success patients (n=12) and all individual treatment failure patients (n=16). Unsupervised hierarchical clustering of the samples was applied, based on Euclidean distance and complete linkage. D: Day of treatment

Functional annotation of unique gene transcripts underlying successful VL treatment

To better understand the ontology and function of the 397 unique genes that showed significant alterations in their expression levels during successful VL treatment, we assessed a total of 6161 published gene sets across 4 different databases for significant enrichment. In total, 77 biological GO terms, 12 canonical pathways, 8 hallmark gene sets and 5 BTMs were enriched (Figure 3). Despite more than 200 upregulated genes in the treatment success cases, enriched gene sets across the four databases showed a robust downregulation of pathways by D29, in particular those linked to cell cycle, apoptosis, proteolysis and adaptive immune response processes. The top 8 most influential genes, which were part of the core enrichment group in more than 35 gene sets, included four alpha subunits of the proteasome (PSMA6, PSMA2, PSMA3, PSMA4), one subunit of the 26S proteasome (PSMD14), cyclin-dependent kinase 1 (CDK1), interferon-γ (IFNG) and interleukin 6 (IL6). In contrast, no significant enrichments were observed within the 194 unique DEGs from the failure group, consolidating the observation that VL treatment had no clearly delineated impact on host pathways among patients with treatment failure after one month of treatment.

To gain insight into the relationships between enriched terms and define overarching pathways or processes, a functionally grouped network of enriched categories was generated that showed 2 distinct clusters (Figure 4). The main cluster was downregulated and consisted of cell cycle features (green), including (stem) cell differentiation, cell division, antigen presentation, cell proliferation and related catabolic or proteolytic processes. The second cluster was upregulated and covered features of phagolysosomal activity to kill or stop the growth of fungal and microbial organisms (purple), including G-protein coupled receptor signalling and vacuolar activity with primary (azurophilic) granules loaded with defensins and serine proteases.

These findings were indicative of a clear remission of active disease in successfully treated patients. The data indicated a distinct and profound downregulation of host cellular activity and immunity and suggested clearance of the parasite by antimicrobial peptides in phagolysosomes.

| GO terms: biological process | NES | Size |
|---|---|---|
| CELL CYCLE | -3.79 | 44 |
| CELL CYCLE PROCESS | -3.72 | 39 |
| MITOTIC CELL CYCLE | -3.57 | 33 |
| ORGANELLE FISSION | -2.89 | 22 |
| MITOTIC NUCLEAR DIVISION | -2.86 | 19 |
| NUCLEAR SEGREGATION | -2.73 | 10 |
| SISTER CHROMATID SEGREGATION | -2.71 | 10 |
| CHROMOSOME SEGREGATION | -2.71 | 13 |
| REGULATION OF CELL CYCLE | -2.63 | 31 |
| CHROMOSOME ORGANIZATION | -2.60 | 24 |
| CELL DIVISION | -2.41 | 20 |
| DNA CONFORMATION CHANGE | -2.37 | 12 |
| REGULATION OF DNA METABOLIC PROCESS | -2.34 | 11 |
| POSITIVE REGULATION OF CELL DEATH | -2.34 | 16 |
| NEGATIVE REGULATION OF TRANSPORT | -2.24 | 11 |
| REGULATION OF CELL CYCLE PROCESS | -2.23 | 19 |
| REGULATION OF PROTEIN MODIFICATION BY SMALL PROTEIN ... | -2.20 | 13 |
| POSITIVE REGULATION OF PROTEOLYSIS | -2.18 | 15 |
| POSITIVE REGULATION OF CELLULAR PROTEIN CATABOLIC … | -2.18 | 11 |
| REGULATION OF PROTEOLYSIS | -2.16 | 26 |
| POSITIVE REGULATION OF CATABOLIC PROCESS | -2.16 | 15 |
| RESPONSE TO ALCOHOL | -2.15 | 12 |
| NEGATIVE REGULATION OF CELL CYCLE | -2.11 | 13 |
| REGULATION OF CELL CYCLE PHASE TRANSITION | -2.11 | 12 |
| POSITIVE REGULATION OF PROTEIN CATABOLIC PROCESS | -2.08 | 14 |
| NEGATIVE REGULATION OF PROTEIN MODIFICATION PROCESS | -2.08 | 20 |
| NEGATIVE REGULATION OF PROTEIN METABOLIC PROCESS | -2.07 | 30 |
| REGULATION OF CELLULAR PROTEIN CATABOLIC PROCESS | -2.04 | 14 |
| REGULATION OF MITOTIC CELL CYCLE | -2.04 | 17 |
| CELL CYCLE PHASE TRANSITION | -2.03 | 12 |
| REGULATION OF CATABOLIC PROCESS | -2.02 | 24 |
| REGULATION OF PROTEIN LOCALIZATION | -1.99 | 22 |
| REGULATION OF PROTEIN CATABOLIC PROCESS | -1.98 | 19 |
| REGULATION OF INNATE IMMUNE RESPONSE | -1.97 | 14 |
| REGULATION OF CELL DEATH | -1.96 | 40 |
| ESTABLISHMENT OF PROTEIN LOCALIZATION | -1.91 | 15 |
| REGULATION OF CELLULAR PROTEIN LOCALIZATION | -1.89 | 16 |
| NEGATIVE REGULATION OF TRANSFERASE ACTIVITY | -1.88 | 16 |
| REGULATION OF PROTEIN SECRETION | -1.88 | 11 |
| ACTIVATION OF IMMUNE RESPONSE | -1.86 | 18 |
| ANTIGEN PROCESSING AND PRESENTATION OF PEPTIDE ANTIGEN | -1.86 | 10 |
| INNATE IMMUNE RESPONSE | -1.84 | 23 |
| REGULATION OF CELLULAR LOCALIZATION | -1.80 | 31 |
| RESPONSE TO CYTOKINE | -1.80 | 34 |
| POSITIVE REGULATION OF DEFENSE RESPONSE | -1.77 | 15 |
| ESTABLISHMENT OF LOCALIZATION IN CELL | -1.77 | 34 |
| PROTEIN LOCALIZATION | -1.77 | 23 |
| NEGATIVE REGULATION OF CELL COMMUNICATION | -1.76 | 26 |
| REGULATION OF TRANSFERASE ACTIVITY | -1.76 | 40 |
| PROTEIN CATABOLIC PROCESS | -1.75 | 14 |
| ANTIGEN PROCESSING AND PRESENTATION | -1.73 | 12 |
| POSITIVE REGULATION OF IMMUNE RESPONSE | -1.72 | 24 |
| CELLULAR MACROMOLECULE LOCALIZATION | -1.69 | 22 |
| CELLULAR MACROMOLECULAR COMPLEX ASSEMBLY | -1.66 | 12 |
| PROTEIN LOCALIZATION TO ORGANELLE | -1.65 | 10 |
| IMMUNE RESPONSE | -1.63 | 53 |
| REGULATION OF PROTEIN MODIFICATION PROCESS | -1.63 | 57 |
| NEGATIVE REGULATION OF NITROGEN COMPOUND METABOLIC… | -1.63 | 29 |
| REGULATION OF GROWTH | -1.62 | 22 |
| MUSCLE ORGAN DEVELOPMENT | -1.61 | 10 |
| REGULATION OF T CELL PROLIFERATION | -1.61 | 10 |
| REGULATION OF ORGANELLE ORGANIZATION | -1.61 | 21 |
| PROTEOLYSIS | -1.60 | 28 |
| REGULATION OF CHROMOSOME ORGANIZATION | -1.59 | 11 |
| NEGATIVE REGULATION OF CYTOKINE PRODUCTION | -1.58 | 12 |
| DNA METABOLIC PROCESS | -1.58 | 16 |
| RESPONSE TO DRUG | -1.54 | 17 |
| MACROMOLECULE CATABOLIC PROCESS | -1.54 | 22 |
| CELLULAR RESPONSE TO STRESS | -1.50 | 30 |
| CELLULAR RESPONSE TO CYTOKINE STIMULUS | -1.50 | 31 |
| POSITIVE REGULATION OF BIOSYNTHETIC PROCESS | -1.49 | 47 |

| Hallmarks | NES | Size |
|---|---|---|
| E2F TARGETS | -2.92 | 17 |
| G2M CHECKPOINT | -2.76 | 17 |
| MYC TARGETS V1 | -2.70 | 17 |
| INTERFERON GAMMA RESPONSE | -2.51 | 12 |
| TNFA SIGNALING VIA NFKB | -1.74 | 11 |
| MITOTIC SPINDLE | -1.69 | 10 |
| MTORC1 SIGNALING | -1.63 | 12 |
| HYPOXIA | 0.78 | 10 |

| Canonical pathways | NES | Size |
|---|---|---|
| CELL CYCLE MITOTIC | -2.60 | 18 |
| CELL CYCLE | -2.52 | 20 |
| DNA REPLICATION | -2.45 | 12 |
| MITOTIC M M G1 PHASES | -2.37 | 11 |
| G1 S TRANSITION | -2.14 | 10 |
| MITOTIC G1 G1 S PHASES | -2.13 | 10 |
| IMMUNE SYSTEM | -2.02 | 38 |
| CYTOKINE SIGNALING IN IMMUNE SYSTEM | -1.94 | 14 |
| METABOLISM OF AMINO ACIDS AND DERIVATIVES | -1.57 | 11 |
| CLASS I MHC MEDIATED ANTIGEN PROCESSING… | -1.47 | 10 |
| ADAPTIVE IMMUNE SYSTEM | -1.40 | 17 |
| GPCR LIGAND BINDING | 1.36 | 10 |

| Blood transcriptional modules | NES | Size |
|---|---|---|
| M4.0 - CELL CYCLE AND TRANSCRIPTION | -3.53 | 37 |
| M4.1 - CELL CYCLE (I) | -3.51 | 25 |
| M4.5 - MITOTIC CELL CYCLE IN STIMULATED CD4 T … | -2.07 | 10 |
| M11 - ENRICHED IN MONOCYTES (II) | -1.23 | 10 |
| M37 - IMMUNE ACTIVATION - GENERIC CLUSTER | 1.43 | 18 |

Figure 3: Enriched gene sets in the treatment success group (n=12). Normalized enrichment scores for hallmarks, GO terms and canonical pathways from the MSigDB database and previously described blood transcriptional modules are depicted in blue (downregulated) or red (upregulated) scale. The size column represents the number of genes from the respective gene set found in the expression dataset. NES: normalized enrichment score.

[Figure 4 image: network of functionally grouped enriched gene sets. Labelled clusters: Cell cycle (downregulated) and Phagolysosomal activity (upregulated). Further node labels include purinergic receptor activity, cyclin-dependent protein kinase activity, embryonic placenta morphogenesis, alpha-defensins, sarcolemma, hematopoietic cell lineage, T-helper 1 type immune response, regulation of adaptive immune response, negative regulation of interferon-gamma production, IL-17 signaling pathway and positive regulation of peptidyl-tyrosine phosphorylation. Bottom panels: bar charts of % genes per term for the downregulated and the upregulated terms.]

Figure 4: Functionally grouped network analyses of enriched gene sets in the treatment success group (n=12). Functionally related clusters of enriched gene sets are represented by different colours and the label of the most significant term per cluster is shown. The node size represents the term enrichment significance, and connectivity is shown by grey lines (kappa statistic 0.4). Similar terms were fused to reduce redundancy. The bottom tables show the directionality of significant terms and the frequency of genes from the respective terms detected in the expression dataset. The exact number of genes detected is indicated behind each bar.

Gene set to accurately distinguish treatment outcome at D29

After having established that the whole blood transcriptome markedly changed during the course of treatment and reflected relevant biological processes, we investigated whether this transcriptomic change could be used to accurately determine the treatment outcome at D29. To answer this question, a machine learning approach was adopted that compared D0/D29 transcript abundance between treatment success and failure patients (using the discriminatory approach described in the Methods, Figure 1b). To avoid overfitting, the random forest (RF) classifiers were evaluated by leave-one-out cross-validation. In total, 70% of the constructed classifiers correctly predicted the outcome of the left-out patient. The top 20 genes with the highest summed feature importance over all correctly classifying classifiers were kept; they included several pseudogenes and novel transcripts (Table 2, first column). To increase the biological relevance and robustness of the selected genes, these 20 genes were subsequently compared with the list of DEGs from the mechanistic approach (D0 vs D29) for overlap (Table 2). Four genes were shared with the treatment success group, but none with the failure group.
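
The leave-one-out evaluation described above can be illustrated with a minimal Python sketch. It is not the study's pipeline: a nearest-centroid rule stands in for the random forest so that the example needs no external libraries, and the function names, expression values and patient labels are hypothetical. Each patient is held out once, the stand-in classifier is fitted on the remaining patients, and the fraction of correctly predicted held-out patients is reported.

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier: return the label whose class centroid is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(features, labels):
    """Hold out each patient once, fit on the rest, score the held-out patient."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical log2 fold changes (D0 to D29) of two genes in six patients.
log2fc = [[-2.1, 1.8], [-1.7, 2.2], [-2.4, 1.5], [0.2, -0.1], [-0.3, 0.4], [0.1, 0.0]]
outcome = ["success", "success", "success", "failure", "failure", "failure"]
accuracy = leave_one_out_accuracy(log2fc, outcome)
print(f"Leave-one-out accuracy: {accuracy:.2f}")
```

The summing of feature importances over correctly classifying random forests, used in the study to rank genes, has no counterpart in this simplified stand-in.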

Next, a new RF classifier with 4-fold cross-validation was built to assess whether the combination of these 4 genes alone was able to accurately determine the treatment outcome. Bootstrapping this classifier 1000 times showed a mean area-under-the-ROC-curve (AUC) of 0.95 (95% CI: 0.75-1.00), with a mean sensitivity of 84% (95% CI: 83-85) and specificity of 85% (95% CI: 83-86), to correctly distinguish treatment failure from success cases (Figure 5). We validated the findings by reverse transcription quantitative polymerase chain reaction (RT-qPCR) and showed similar distribution patterns in log2(fold change) (LOG2FC) for all 4 genes (Figure S2 and File S2). Due to the lack of follow-up samples in the 16 treatment failure patients, we could not investigate whether these patients also expressed the 4-gene signature when reaching parasitological cure after extended treatment (Figure 1a). We did, however, observe that three out of the four genes were also differentially expressed in the 13 VL-HIV patients with long-term cure (i.e. no relapse) at 6 months after treatment end (Figure S3a), of which eight were treatment failure cases at D29. In agreement, the four genes could not be detected in treatment failure patients at D29 with low parasite burden (1+,2+) in splenic aspirates, indicating a true cure signature that reflected disease remission after an apparent complete resolution of the parasite.
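
To make the reported performance metrics concrete, the following Python sketch computes an AUC, sensitivity and specificity from classifier scores and derives a percentile bootstrap interval for the AUC. It is a simplified illustration under stated assumptions: the scores and labels are invented, the classifier is not refitted within each bootstrap replicate, and the 0.5 decision threshold and the percentile interval are choices made for this example, not details taken from the study.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return tp / labels.count(1), tn / labels.count(0)


def bootstrap_auc(scores, labels, n_boot=1000, seed=0):
    """Mean AUC and 95% percentile interval over patient-level resamples."""
    rng = random.Random(seed)
    idx = range(len(scores))
    values = []
    while len(values) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < len(ys):  # both classes are needed to compute an AUC
            values.append(auc([scores[i] for i in sample], ys))
    values.sort()
    low = values[int(0.025 * n_boot)]
    high = values[int(0.975 * n_boot) - 1]
    return sum(values) / n_boot, low, high


# Hypothetical predicted probabilities of treatment failure (label 1 = failure).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
mean_auc, ci_low, ci_high = bootstrap_auc(scores, labels)
print(auc(scores, labels), sensitivity_specificity(scores, labels))
print(f"bootstrap mean AUC {mean_auc:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With few patients the bootstrap interval is wide, which is consistent with the broad confidence interval reported for the AUC.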

With regard to long-term cure prediction, distinct patterns at D29 could be observed in gene expression levels of successfully treated patients who relapsed in the following year (n=5, 42%) and those who did not (Figure 1). Yet, our 4-gene set was mostly found in the common DEGs between long-term cured and relapsed patients, except for SLFN14, indicating little value in relapse prediction (Figure S3b). The DEGs of the relapsed patients overlapped for 61% with those of the failure group at D29, compared to only 30% with those of the long-term cured patients. These findings argue for the investigation of a predictive signature at D29 to predict relapse development in the next 6-12 months, but patient numbers were too small to perform robust analyses.

Figure 5: Receiver operating characteristic (ROC) curves for the final random forest classifier based on the 4-gene set. (a) Average AUC value of the 4-gene random forest classifier, calculated over 1000 bootstrap replicates. (b) Bar chart showing the sum of relative feature importances of IL10, HRH4, PRSS33 and SLFN14 in the classifier over 1000 bootstrap replicates.

Discussion

In this study, we investigated the whole blood transcriptome of 28 well-characterized Ethiopian VL-HIV patients before and after 1 month of VL treatment. We observed a distinctive pattern of disease remission in successfully cured patients and the complete lack of it in treatment failure cases. Subsequently, we identified a 4-gene signature able to discriminate treatment success at D29 with a sensitivity of 84% and specificity of 85%. Application of this signature as a minimally invasive tool to assess treatment efficacy in VL-HIV patients could have significant value in guiding patient management and future R&D.

Elevated levels of the regulatory cytokine IL-10 in active VL, which decrease significantly following successful treatment, have been reported in numerous animal and human studies (24,25) and possibly reflect a homeostatic mechanism to control persistent infection-induced inflammation. This marked downregulation of IL10 transcription in successfully treated patients was strongly represented in our 4-gene signature. This is, however, in contrast with the two previous transcriptomic studies in VL patients that did not find a significant regulation of IL10 (12,13). While we cannot rule out that our signature is HIV-specific, our longitudinal approach, which accounts for natural variation in gene transcription between patients, and the use of RNAseq instead of microarray could have contributed to detecting these associations. Likewise, the other three genes (HRH4, PRSS33 and SLFN14) were not reported in the previous two studies.

In addition, we could not confirm the IFNG gene as the major regulator gene, as reported in previous transcriptome studies of VL patients (12). Although the IFNG gene and pathways were also strongly downregulated in treated VL cases compared to active cases, we rather observed a prominent upregulation of vacuolar activity with primary (azurophilic) granules loaded with defensins and serine proteases. These genes and terms suggested enhanced phagocytosis, neutrophil involvement and complement activation, and may reflect the proteolytic degradation responsible for parasite clearance. This process was reflected in the 4-gene signature by the marked upregulation of serine protease 33 (PRSS33), an enzyme with amidolytic activity. Its predominant expression in peripheral macrophages and highest discriminatory importance in the classifier support future studies into the role of PRSS33 in phagolysosomal degradation of Leishmania parasites.

It has been postulated that the interactions of Leishmania spp. with eosinophils and mast cells influence the macrophage response to infection and the development of an adaptive immune response (26). An important mediator in this process is the histamine receptor H4, also designated the ‘immune system histamine receptor’ due to its chemotactic properties (eosinophil migration and mast cell chemotaxis) and involvement in dendritic cell activation and T-cell differentiation. Because histamine has been described to have an assisting role in in vitro clearance of Leishmania infection (26), the observed upregulation of HRH4 in successfully treated patients could indicate clearance of parasites. Given its importance in the 4-gene signature, the role of histamine, mast cells and histamine receptors in treatment efficacy in VL and VL-HIV patients merits further research, in particular because the H4 receptor is also being explored as a promising drug target in many chronic inflammatory disorders (27).

Defects in the SLFN14 gene of the signature are associated with thrombocytopenia and excessive bleeding (28). Likewise, the platelet count is known to be altered during VL infection due to splenic sequestration, which often leads to a bleeding tendency. Hence, the upregulation in SLFN14 expression may reflect the described return of platelet counts to normal levels after successful cure. Yet, no clear correlation could be observed between SLFN14 expression and recovery of platelet count after therapy in our study (Figure S4). It is of note that the gene was absent from the DEG list at D210 in long-term cured patients (Figure S4). Therefore, its importance should be validated, as it could have been influenced by a small group of patients with severe thrombocytopenia. Our top 20 list from the discriminatory approach consisted of many pseudogenes and novel transcripts that also warrant further research, as they are not yet fully understood but are increasingly acknowledged as key contributors to immune responses (29).

Unfortunately, we could not study differences between treatment regimens due to sample size restrictions. Therefore, it remains unknown whether the broad antiproliferative activities of miltefosine contributed to the strong predominance of the downregulated cell cycle cluster in the network analyses. This predominance is, however, believed to reflect a cessation of the massive proliferation and differentiation of T and B cell clonotypes during the acute adaptive immune response, as similar results were observed in studies of immunocompetent VL patients in India and Brazil treated with amphotericin B (inciting cell membrane disruption) and pentavalent antimonials (broad macromolecule inhibitor), respectively (12,14). This would indicate that it is neither HIV-specific nor affected by the administered drugs. Nevertheless, a true mechanistic evaluation of affected organs with single-cell analyses is needed to assess the causal and temporal relationship with drug modulation and parasite reduction. It is also to be noted that we could not observe enriched gene sets in the treatment failure group but noticed a significant decline in expression of many IgG-related genes (almost 40% of all DEGs). Many of these genes were also shared at D29 with the seemingly cured patients who relapsed within the next year (35% of common DEGs). This finding could suggest the initial presence of hypergammaglobulinemia at diagnosis as a risk factor for treatment failure or relapse, and merits further research.

Despite the high inter-patient variation in gene expression levels, we were able to construct a good classifier. We believe our methodology applied some good general practices that could be incorporated in future studies. Previous studies targeting limited transcriptomic gene sets for diagnosis or treatment outcome prediction in VL, or even in infectious diseases in general, were mostly confined to cross-sectional comparisons at a single timepoint (9,10,30). Such studies can be complicated by patient-to-patient variation in transcript abundance and are highly dependent on the stability of the housekeeping gene used for quantification in future validation studies with RT-qPCR. Therefore, we targeted intra-individual, relative changes in transcript abundance, which are likely to enhance the generalizability of the signature but require both pre- and post-treatment sampling. Additionally, purely statistical approaches were often adopted to select the best discriminatory genes. Such approaches carry the inherent danger of identifying genes with little biological relevance or interpretation, jeopardizing their validation in other patient cohorts (31). Correspondingly, our top 20 list from the discriminatory approach consisted of many pseudogenes and functionally uncharacterized transcripts. By overlaying the machine learning results with mechanistic insight analyses, we believe the 4 genes represent relevant but distinct pathways that reflect underlying biological changes during treatment. When monitoring of treatment outcome is envisioned, the specificity of these pathways or genes to Leishmania infection is also less critical, compared to signature development for diagnosis or disease development prediction.
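
The intra-individual, relative measure referred to above can be sketched in a few lines of Python. This is an illustrative example only: the patient identifiers, the normalized counts and the pseudocount of 1 are assumptions made for the sketch and are not taken from the study's pipeline.

```python
import math


def paired_log2_fold_change(before, after, pseudocount=1.0):
    """Per-patient log2(after / before) for one gene.

    Only patients sampled at both timepoints are kept; the pseudocount
    avoids taking the logarithm of zero for unexpressed genes.
    """
    return {
        patient: math.log2((after[patient] + pseudocount) / (before[patient] + pseudocount))
        for patient in before
        if patient in after
    }


# Hypothetical normalized counts of one gene at D0 and D29.
d0 = {"P01": 127.0, "P02": 63.0, "P03": 15.0}
d29 = {"P01": 31.0, "P02": 15.0, "P03": 15.0}
log2fc = paired_log2_fold_change(d0, d29)
print(log2fc)
```

Because each patient serves as their own baseline, P01 and P02 yield the same fold change although their absolute expression levels differ twofold, which is the property that makes the relative measure less sensitive to inter-patient variation.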

A limitation of this study is that the 4-gene set identified here is derived from data collected from a single cohort of male VL-HIV patients in a single geographic area in East Africa. Although it will be necessary to further validate this biomarker in VL-HIV patients, and to include patients from other regions/countries, this should not detract from the value of such a biomarker. Even if restricted to use in East Africa, it would still be of high interest as the burden of VL-HIV is significant in this area. It will be of future interest to evaluate the performance of this biomarker as the chemotherapeutic landscape changes, though the combination of AmBisome/miltefosine is the most effective treatment of VL-HIV patients in East Africa (15) and our data suggested that the identification of the 4-gene signature was not merely a reflection of drug choice.

In the near future, a composite endpoint consisting of clinical signs, antigen detection methods and a host 4-gene biomarker could be a powerful, less-invasive tool for research and drug development (R&D) in leishmaniasis, both for immunocompetent and VL-HIV patients. In particular, future studies should assess the predictive value of the signature at earlier timepoints and evaluate the methodology required to develop a more point-of-care detection system. This may guide early case management and treatment recommendations, and could facilitate early futility analyses and dose-finding studies of novel compounds. It is also to be noted that its value in similar parasitic infections could be studied, as rather general disease remission signatures were detected. Our results also encourage future investigations in VL relapse prediction by blood-based transcriptomic signatures. Yet other processes are likely to be predictive of parasite recrudescence and to be reflected in such a signature. However, the specificity of the signature to Leishmania will also become more relevant. For example, an extensively validated blood-based 3-gene signature recently satisfied the WHO criteria for a non-sputum-based triage test for tuberculosis across heterogeneous culture-confirmed datasets (30).

In conclusion, we have identified an easily accessible 4-gene blood signature with high discriminatory value to assess treatment efficacy at the end of treatment in VL-HIV patients. Application of this signature as a minimally invasive tool to assess treatment efficacy in VL-HIV patients could have significant value in guiding patient management and future R&D.

References

1. Burza S, Croft SL, Boelaert M. Leishmaniasis. Lancet. 2018 Sep;392(10151):951–70.
2. World Health Organization. Surveillance of leishmaniasis in the WHO European Region. Wkly Epidemiol Rec. 2018 Oct;93(40):521–40.
3. Diro E, Lynen L, Ritmeijer K, Boelaert M, Hailu A, van Griensven J. Visceral Leishmaniasis and HIV coinfection in East Africa. PLoS Negl Trop Dis. 2014;8(6):e2869.
4. Lindoso JAL, Cunha MA, Queiroz IT, Moreira CHV. Leishmaniasis-HIV coinfection: current challenges. HIV AIDS (Auckl). 2016;8:147–56.
5. Alvar J, Aparicio P, Aseffa A, Den Boer M, Canavate C, Dedet JP, et al. The relationship between leishmaniasis and AIDS: the second 10 years. Clin Microbiol Rev. 2008;21(2):334–59.
6. Srivastava P, Dayama A, Mehrotra S, Sundar S. Diagnosis of visceral leishmaniasis. Trans R Soc Trop Med Hyg. 2011 Jan;105(1):1–6.
7. Pourabbas B, Moghadam AG, Pouladfar G, Rezaee Z, Alborzi A. Quantification of Leishmania infantum kinetoplast DNA for monitoring the response to meglumine antimoniate therapy in visceral leishmaniasis. Am J Trop Med Hyg. 2013 May;88(5):868–71.
8. Croft SL, Sundar S, Fairlamb AH. Drug resistance in leishmaniasis. Clin Microbiol Rev. 2006;19(1):111–26.
9. Liu X, Speranza E, Muñoz-Fontela C, Haldenby S, Rickett NY, Garcia-Dorival I, et al. Transcriptomic signatures differentiate survival from fatal outcomes in humans infected with Ebola virus. Genome Biol. 2017 Dec 19;18(1):4.
10. Robinson M, Sweeney TE, Barouch-Bentov R, Sahoo MK, Kalesinskas L, Vallania F, et al. A 20-Gene Set Predictive of Progression to Severe Dengue. Cell Rep. 2019;26(5):1104–1111.e4.
11. Thompson EG, Du Y, Malherbe ST, Shankar S, Braun J, Valvo J, et al. Host blood RNA signatures predict the outcome of tuberculosis treatment. Tuberculosis. 2017 Dec;107:48–58.
12. Fakiola M, Singh OP, Syn G, Singh T, Singh B, Chakravarty J, et al. Transcriptional blood signatures for active and amphotericin B treated visceral leishmaniasis in India. PLoS Negl Trop Dis. 2019;13(8):e0007673.
13. Salih MAM, Fakiola M, Lyons PA, Younis BM, Musa AM, Elhassan AM, et al. Expression profiling of Sudanese visceral leishmaniasis patients pre- and post-treatment with sodium stibogluconate. Parasite Immunol. 2017 Jun;39(6):e12431.
14. Gardinassi LG, Garcia GR, Costa CHN, Costa Silva V, de Miranda Santos IKF. Blood Transcriptional Profiling Reveals Immunological Signatures of Distinct States of Infection of Humans with Leishmania infantum. PLoS Negl Trop Dis. 2016;10(11).
15. Diro E, Blesson S, Edwards T, Ritmeijer K, Fikre H, Admassu H, et al. A randomized trial of AmBisome monotherapy and AmBisome and miltefosine combination to treat visceral leishmaniasis in HIV co-infected patients in Ethiopia. PLoS Negl Trop Dis. 2019 Jan 17;13(1):e0006988.
16. Diro E, Edwards T, Ritmeijer K, Fikre H, Abongomera C, Kibret A, et al. Long term outcomes and prognostics of visceral leishmaniasis in HIV infected patients with use of pentamidine as secondary prophylaxis based on CD4 level: a prospective cohort study in Ethiopia. PLoS Negl Trop Dis. 2019 Feb 21;13(2):e0007132.
17. Chaussabel D, Quinn C, Shen J, Patel P, Glaser C, Baldwin N, et al. A Modular Analysis Framework for Blood Genomics Studies: Application to Systemic Lupus Erythematosus. Immunity. 2008 Jul 18;29(1):150–64.
18. Li S, Rouphael N, Duraisingham S, Romero-Steiner S, Presnell S, Davis C, et al. Molecular signatures of antibody responses derived from a systems biology study of five human vaccines. Nat Immunol. 2014 Feb;15(2):195–204.
19. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks. Genome Res. 2003 Nov 1;13(11):2498–504.
20. Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009 Apr 15;25(8):1091–3.
21. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12(Oct):2825–30.
22. Nakayama T, Okada N, Yoshikawa M, Asaka D, Kuboki A, Kojima H, et al. Assessment of suitable reference genes for RT-qPCR studies in chronic rhinosinusitis. Sci Rep. 2018 Dec 1;8(1).
23. Watson S, Mercier S, Bye C, Wilkinson J, Cunningham AL, Harman AN. Determination of suitable housekeeping genes for normalisation of quantitative real time PCR analysis of cells infected with human immunodeficiency virus and herpes viruses. Virol J. 2007;4.
24. Rodrigues V, Cordeiro-da-Silva A, Laforge M, Silvestre R, Estaquier J. Regulation of immunity during visceral Leishmania infection. Parasit Vectors. 2016;9:118.
25. Nylén S, Sacks D. Interleukin-10 and the pathogenesis of human visceral leishmaniasis. Trends Immunol. 2007 Sep;28(9):378–84.
26. Rodríguez NE, Wilson ME. Eosinophils and mast cells in leishmaniasis. Immunol Res. 2014 Aug;59(1–3):129–41.
27. Thangam EB, Jemima EA, Singh H, Baig MS, Khan M, Mathias CB, et al. The Role of Histamine and Histamine Receptors in Mast Cell-Mediated Allergy and Inflammation: The Hunt for New Therapeutic Targets. Front Immunol. 2018 Aug 13;9:1873.
28. Fletcher SJ, Johnson B, Lowe GC, Bem D, Drake S, Lordkipanidzé M, et al. SLFN14 mutations underlie thrombocytopenia with excessive bleeding and platelet secretion defects. J Clin Invest. 2015 Sep 1;125(9):3600–5.
29. Atianand MK, Caffrey DR, Fitzgerald KA. Immunobiology of Long Noncoding RNAs. Annu Rev Immunol. 2017 Apr 26;35(1):177–98.
30. Warsinske H, Vashisht R, Khatri P. Host-response-based gene signatures for tuberculosis diagnosis: A systematic comparison of 16 signatures. PLoS Med. 2019;16(4).
31. Burel JG, Babor M, Pomaznoy M, Lindestam Arlehamn CS, Khan N, Sette A, et al. Host Transcriptomics as a Tool to Identify Diagnostic and Mechanistic Immune Signatures of Tuberculosis. Front Immunol. 2019 Feb 19;10:221.

DECLARATIONS

Acknowledgements
First, we would like to express our gratitude to all study participants for their trust and to the study team at the Leishmaniasis Research and Treatment Centre and the Médecins Sans Frontières Abdurafi Health Centre for their unwavering efforts. Second, we also want to acknowledge all involved staff at the Institute of Tropical Medicine (ITM) and University of York for their administrative and scientific support. In particular, we would like to thank the Drugs for Neglected Diseases initiative (DNDi) for supporting this research and Gilead for donation of the trial medication.

Funding statement
Funding was provided by the research and development project, AfricoLeish, supported by the European Union Seventh Framework Programme (EU FP7), under grant agreement n°305178. WA and BC are personally supported by a Research Foundation-Flanders postdoctoral fellowship. PMK is funded by a Wellcome Trust Senior Investigator Award (WT104726). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests
The authors declare that there are no competing interests.

Author contributions
All authors read and approved the final version of the manuscript.
Wim Adriaensen: Conceptualization, data curation, formal analysis, investigation, visualization, writing & editing
Bart Cuypers: Formal analysis, methodology, writing, review & editing
Carlota F. Cordero: Formal analysis
Bewketu Mengasha: Data collection and curation
Séverine Blesson: Data curation, project coordination
Lieselotte Cnops: Formal analysis, writing, review & editing
Paul M. Kaye: Methodology, supervision, review & editing
Fabiana Alves: Data curation, funding acquisition, project administration, review & editing
Ermias Diro: Data curation, project coordination, funding acquisition, review & editing
Johan van Griensven: Conceptualization, methodology, funding acquisition, project administration, supervision, review & editing

Availability of data and materials
The raw dataset supporting the conclusions of this article is available in the Sequence Read Archive repository [PRJNA595895, http://www.ncbi.nlm.nih.gov/bioproject/595895]. Additional metadata are also available upon request to [email protected] for researchers who meet the criteria for access to confidential data.

Supplementary Materials

Figure S1: Patient heterogeneity across all samples. Scatter plot representing the first 2 principal components on the axes. Each dot represents a patient sample at D0 or D29, as indicated by the node shape. Colors represent the sequencing depth or read count categories expressed per million reads.

Figure S2: RT-qPCR validation. (a) Due to the lack of left-over volume in many samples and five undetermined Ct values, relative quantification could only be performed for a subset of 12 patients. (b) Violin plots of LOG2FC between D0 and D29 measured with RNA sequencing or RT-qPCR for the 4-gene set. Diamonds represent data from 12 individual patients. Red diamonds indicate the subset of successfully cured patients. In particular, the directionality and distribution of the individual fold changes between D0 and D29 were compared, as differences in fold change values are likely to occur due to mapping, normalization and read counting strategies in a whole genome or single region assay. In cured patients (red diamonds), a similar upregulation for HRH4, PRSS33 and SLFN14 and downregulation for IL10 could be observed when measured with RNA sequencing or with RT-qPCR (non-significant differences), validating the possibility to detect relative changes in our 4 genes also with RT-qPCR. P-values were calculated with the non-parametric Wilcoxon test. D: day; RT-qPCR: reverse transcription quantitative polymerase chain reaction; Ct: cycle threshold; ns: non-significant.
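
As an illustration of how a LOG2FC can be obtained from Ct values, the sketch below uses the widely used delta-delta-Ct calculation. It is an assumption that this formula matches the relative quantification performed here: the function name and Ct values are hypothetical, 100% amplification efficiency is assumed, and TBP (listed among the assays in Data File S3) is assumed to serve as the reference gene.

```python
def log2_fold_change_ddct(ct_target_d0, ct_ref_d0, ct_target_d29, ct_ref_d29):
    """log2 fold change (D29 vs D0) from Ct values, assuming 100% PCR efficiency."""
    delta_d0 = ct_target_d0 - ct_ref_d0      # target normalized to reference at D0
    delta_d29 = ct_target_d29 - ct_ref_d29   # target normalized to reference at D29
    return -(delta_d29 - delta_d0)           # equals log2 of 2^(-ddCt)


# Hypothetical Ct values: the target Ct drops by 2 cycles, the reference is stable.
lfc = log2_fold_change_ddct(30.0, 25.0, 28.0, 25.0)
print(lfc)
```

A lower Ct means more template, so a drop of 2 cycles relative to the reference corresponds to a fourfold increase, i.e. a LOG2FC of 2.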

[Figure S3, panels (a) and (b): Venn diagrams annotated with HRH4, IL10, PRSS33 and SLFN14; group sizes shown in the figure as N=5, N=6, N=12, N=13 and N=16.]

Figure S3: Overlap of differentially expressed genes (DEGs) by long-term treatment outcome. (a) Venn diagram showing the number of DEGs (>1.5 absolute fold change with FDR-corrected p-value <0.05) between D0 and D29 for successfully treated patients at D29 and non-relapsed patients at D210 (cf. long-term cured). The sample was missing for 1 non-relapsed patient at D210. (b) Venn diagram showing the number of DEGs (>1.5 absolute fold change with FDR-corrected p-value <0.05) between D0 and D29 for failure (red, n=16) and successfully cured patients that did not experience a relapse in the next year (yellow, n=6) or did relapse (pink, n=5). D: day of treatment.
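The DEG threshold used in both panels (>1.5 absolute fold change, FDR-corrected p-value <0.05) can be illustrated with a short sketch. It assumes that the FDR correction is Benjamini-Hochberg and that fold changes are supplied as log2 values; neither is stated in this caption. All function names and numbers are hypothetical.

```python
import math
from typing import List, Set


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes: List[str], log2fc: List[float], pvalues: List[float],
                fc_cutoff: float = 1.5, fdr_cutoff: float = 0.05) -> List[str]:
    """Keep genes with |fold change| > fc_cutoff and adjusted p < fdr_cutoff."""
    fdr = benjamini_hochberg(pvalues)
    min_abs_log2fc = math.log2(fc_cutoff)
    return [g for g, lfc, q in zip(genes, log2fc, fdr)
            if abs(lfc) > min_abs_log2fc and q < fdr_cutoff]


def overlap(degs_a: List[str], degs_b: List[str]) -> Set[str]:
    """Genes shared by two DEG lists, i.e. the Venn diagram intersection."""
    return set(degs_a) & set(degs_b)


# Hypothetical toy input, not study data.
toy_degs = select_degs(["A", "B", "C", "D"],
                       [1.0, -1.2, 0.2, 2.0],
                       [0.001, 0.01, 0.02, 0.8])
print(toy_degs)
```

In this toy input, gene C passes the FDR cutoff but not the fold-change cutoff, and gene D passes the fold-change cutoff but not the FDR cutoff, so only A and B are retained.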

[Figure S4: scatter plot. x-axis: percent difference in platelet count (x10³/µl) between D0 and D29; y-axis: log2 fold change of SLFN14; groups: treatment failure and treatment success.]

Figure S4: Correlation between changes in SLFN14 gene expression and individual recovery in platelet count (n=28). Platelet counts are expressed in x10³/µl. Linear regression lines by treatment group were added to show the directionality of the correlations. LOG2FC: log2(fold change); D: day of treatment.
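The per-group regression lines can be sketched with an ordinary least-squares fit. This is an illustrative assumption about how such lines are typically obtained, not a description of the authors' plotting software; the function names and data points below are hypothetical.

```python
from typing import Dict, List, Tuple


def ols_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(data: Dict[str, Tuple[List[float], List[float]]]
                 ) -> Dict[str, Tuple[float, float]]:
    """Fit one regression line per treatment group."""
    return {group: ols_line(x, y) for group, (x, y) in data.items()}


# Hypothetical points: x = percent difference in platelet count,
# y = SLFN14 LOG2FC. These are not study data.
toy = {
    "Treatment success": ([0.0, 100.0, 200.0], [1.0, 2.0, 3.0]),
    "Treatment failure": ([0.0, 100.0, 200.0], [1.0, 1.0, 1.0]),
}
toy_fits = fit_by_group(toy)
print(toy_fits)
```

The sign of each slope gives the directionality referred to in the caption: positive when SLFN14 expression rises with platelet recovery, near zero when the two are unrelated.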

Data File S1: List of differentially expressed genes for the treatment success group (.xlsx).

Data File S2: List of differentially expressed genes for the treatment failure group (.xlsx).

Data File S3: Details on the RT-qPCR validation.

Gene     Assay ID             Ref Seq #       Exon location   Detects all variants   Alternative available
HRH4     Hs.PT.58.4403324     NM_021624(1)    2-3a            No                     No
IL10     Hs.PT.58.2807216     NM_000572       3-5             Yes
PRSS33   Hs.PT.58.3087623     NM_152891       1-3             Yes
SLFN14   Hs.PT.58.993231      NM_001129820    1-2             Yes
TBP      Hs.PT.58v.39858774   NM_003194(1)    1-2             No                     Yes

Hs.PT.58.4403324 (HRH4)
Primer 1: AAGACCCCAGTATGTTGAGTTC
Primer 2: TGGCTCACTACTGACTATCTGT
Probe: /5Cy5/CGATACCTGTCAGTCTCAAATGCTGTGT/3IAbRQSp/

Hs.PT.58v.39858774 (TBP)
Primer 1: CAGCAACTTCCTCAATTCCTTG
Primer 2: GCTGTTTAACTTCGCTTCCG
Probe: /56-FAM/TGATCTTTG/ZEN/CAGTGACCCAGCATCA/3IABkFQ/

Hs.PT.58.2807216 (IL10)
Primer 1: TCACTCATGGCTTTGTAGATGC
Primer 2: GCGCTGTCATCGATTTCTTC
Probe: /56-FAM/AGGCATTCT/ZEN/TCACCTGCTCCACG/3IABkFQ/

Hs.PT.58.3087623 (PRSS33)
Primer 1: CAACGATCCGACTGGACAT
Primer 2: GCCTTGAAGACAGAATGAGAGG
Probe: /5HEX/AGCAGCTCC/ZEN/CAGCACCAGAA/3IABkFQ/

Hs.PT.58.993231 (SLFN14)
Primer 1: TTCGTGCAGATGAAGCTGAT
Primer 2: GTGGTCATGATGCTGGATACTC
Probe: /56-FAM/TTGGTCACAGACTACAACTCTTGCCTG/36-TAMSp/
