bioRxiv preprint doi: https://doi.org/10.1101/641423; this version posted May 17, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

mitoXplorer, a visual data mining platform to systematically analyze and visualize mitochondrial expression dynamics and mutations

Annie Yim1*, Prasanna Koti1*, Adrien Bonnard2, Milena Duerrbaum1, Cecilia Mueller3, Jose Villaveces1, Salma Gamal1, Giovanni Cardone1, Fabiana Perocchi3, Zuzana Storchova1,4, Bianca H. Habermann1,2,5

1 Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
2 Aix-Marseille University, INSERM, TAGC U1090, 13009 Marseille, France
3 Functional Genomics of Mitochondrial Signaling, Center, Ludwig Maximilian University (LMU) Munich, Germany
4 Department of Molecular Genetics, TU Kaiserslautern, Paul Ehrlich Strasse 24, 67663, Kaiserslautern, Germany
5 Aix-Marseille University, CNRS, IBDM UMR 7288, 13009 Marseille, France

* these authors contributed equally

Corresponding author:
Bianca H. Habermann
Aix-Marseille University, CNRS, IBDM UMR 7288, Case 907
Parc Scientifique de Luminy
163, Avenue de Luminy,
13009 Marseille
France
e-mail: [email protected]

Keywords:
Mitochondrial expression dynamics, mitochondrial mutations, mitochondrial functions, visual data mining, Trisomy 21, oxidative phosphorylation, mitochondrial morphology, image analysis


Abstract

Background

Mitochondria produce cellular energy in the form of ATP and are involved in various metabolic and signaling processes. However, the cellular requirements for mitochondria differ depending on cell type, cell state or organism. Information on the expression dynamics of genes with mitochondrial functions (mito-genes) is embedded in publicly available transcriptomic or proteomic studies, and the variety of available datasets enables us to study the expression dynamics of mito-genes in many different cell types, conditions and organisms. Yet, we lack an easy way of extracting these data for gene groups such as mito-genes.

Results

Here, we introduce the web-based visual data mining platform mitoXplorer, which systematically integrates expression and mutation data of mito-genes. The central part of mitoXplorer is a manually curated mitochondrial interactome containing ~1200 genes, which we have annotated in 35 different mitochondrial processes. This mitochondrial interactome can be integrated with publicly available transcriptomic, proteomic or mutation data in a user-centric manner. A set of analysis and visualization tools allows mining and exploring mitochondrial expression dynamics and mutations across various datasets from different organisms, and quantifying the adaptation of mitochondrial dynamics to different conditions. We apply mitoXplorer to quantify expression changes of mito-genes in a set of aneuploid cell lines that carry an extra copy of chromosome 21. mitoXplorer uncovers remarkable differences in the regulation of the mitochondrial transcriptome and proteome due to the dysregulation of the mitochondrial in retinal pigment epithelial trisomy 21 cells which results in severe defects in oxidative phosphorylation.

Conclusions

We demonstrate the power of the visual data mining platform mitoXplorer to explore expression data in a focused and detailed way and to uncover potential underlying mechanisms for further experimental studies. We validate the hypothesis-creating power of mitoXplorer by testing predicted phenotypes in trisomy 21 model systems. MitoXplorer is freely available at http://mitoxplorer.ibdm.univ-mrs.fr. MitoXplorer is web-based and requires neither installation nor programming knowledge. Therefore, mitoXplorer is accessible to a wide audience of experimental experts studying mitochondrial dynamics.

Background

Enormous amounts of transcriptomic data are publicly available for exploration. This richness of data gives us the unique opportunity to explore the behavior of individual genes or groups of genes within a vast variety of different cell types, developmental or disease conditions or in different species. By integrating these data in a sophisticated way, we may be able to discover new dependencies between genes or processes. Specific databases are available for mining and exploring disease-associated data, such as The Cancer Genome Atlas (TCGA [1]), or the International Cancer Consortium Data Portal (ICGC [2]). Cancer data portals in particular provide users with the opportunity to perform deeper exploration of expression changes of individual genes or gene groups in different tumor types ([1-3]; for a review on available cancer data portals, see [4]). Expression Atlas, on the other hand, provides pre-processed data from a large variety of different studies in numerous species [5]. Indeed, the majority of transcriptomic datasets are not related to cancer and are stored in public repositories such as Gene Expression Omnibus (GEO [6]), DDBJ Omics Archive [7] or ArrayExpress [8]. Currently, it is not straightforward to integrate data from these repositories without at least basic programming knowledge.

Next to extracting reliable information from -omics datasets, it is equally important to support interactive data visualization. This is a key element for a user-guided exploration and interpretation of complex data, facilitating the generation of biologically relevant hypotheses, a process referred to as visual data mining (VDM, reviewed e.g. in [9]). Therefore, essentially all online data portals provide graphical tools for data exploration.

What is fundamentally lacking is a user-centric, web-based and interactive platform for data integration of a set of selected genes or proteins sharing the same cellular function(s). The benefits of such a tool are evident: first, it would give us the possibility to explore the expression dynamics of or mutations in this set of selected genes across many different conditions, tissues, as well as across different species. Second, by integrating data using enrichment techniques, for instance with epigenetic or ChIP-seq data or by network analysis using the cellular interactome(s), we can recognize the mechanisms that regulate the expression dynamics of the selected gene set.

One interesting set of genes comprises the mitochondria-associated genes (mito-genes): in other words, all genes whose encoded proteins localize to mitochondria and fulfill their cellular function within this organelle. Mito-genes are well-suited for such a systematic analysis, because we have a relatively complete knowledge of their identity and can categorize them according to their mitochondrial functions [10]. This a priori knowledge can help us in mining and exploring the expression dynamics of mito-genes and functions in various conditions and species.

Mitochondria are essential organelles in eukaryotic cells that are required for producing cellular energy in the form of ATP and for numerous other metabolic and signaling functions [10]. Owing to their central cellular role, mitochondrial dysfunctions have been associated with a number of human diseases such as obesity, diabetes, neurodegenerative diseases and cancer [11-15]. However, mitochondria are not uniform organelles. Their structural and metabolic diversity, and how the two influence each other, has been well described in the literature [16-20]. This mitochondrial heterogeneity in different tissues is reflected in their molecular composition [21]. The total number of proteins that contribute to mitochondrial functions and localize to mitochondria is currently not precisely known and might differ between tissues and species [22,23]. Yet, based on proteomics data from several organisms, it is likely that mitochondria contain more than 1000 proteins [23-30]. Mitochondria have their own genome, whose size in animals is between 11 and 28 kilo-bases [31]. Most metazoan mitochondria encode 13 essential proteins of the respiratory chain required for oxidative phosphorylation (OXPHOS), all rRNAs of the small and large mitochondrial ribosomal subunits, as well as most mitochondrial tRNAs [32]. All other proteins found in mitochondria (mito-proteins) are encoded by genes in the nucleus; the products of these nuclear-encoded mitochondrial genes (NEMGs) are transported to and imported into mitochondria.

Based on data from mitochondrial proteomics studies or genome-scale prediction of mito-proteins, several electronic repositories of the mitochondrial interactome have been created [24,33-36], though they often lack proper functional assignments of mito-proteins. Moreover, proteomics studies describing the mitochondrial proteome can suffer from a high false-positive rate [23]. Computational predictions or machine learning approaches [37], on the other hand, lack experimental confirmation. As a consequence, none of the published mitochondrial interactomes available to date can be used without further manual curation. Moreover, these lists are not integrated with any available data analysis tool to explore mitochondrial expression dynamics under varying conditions or in different tissues or species.

In this study, we present mitoXplorer, a web-based, highly interactive visual data mining platform to integrate transcriptome, proteome, as well as mutation-based data with a manually curated, function-based mitochondrial interactome. With mitoXplorer, we can explore the expression dynamics, as well as mutations of mito-genes and their associated mitochondrial processes (mito-processes) across a large variety of different -omics datasets without the need for programming knowledge. MitoXplorer provides users with dynamic and interactive figures, which instantly display information on mitochondrial gene functions and protein-protein interactions. To achieve this, mitoXplorer integrates publicly available -omics data with our hand-curated mitochondrial interactomes for different model species. Additionally, users can upload their own data for integration with our hand-curated mitochondrial interactome, as well as the publicly available -omics data stored in the mitoXplorer database. In order to test the analytical and predictive power of mitoXplorer, we generated transcriptome and proteome data from aneuploid cell lines carrying trisomy 21 (T21). We used mitoXplorer to analyze and integrate our data with publicly available trisomy 21 data. MitoXplorer enabled us to predict respiratory failure in one of our T21 cell lines, which we experimentally confirmed, demonstrating the predictive power of mitoXplorer.

Results

The outline of the mitoXplorer web-platform is illustrated in Figure 1: at the back-end, manually curated mitochondrial interactomes from human, mouse and Drosophila, as well as expression and mutation data from these three species are stored in a MySQL database (details on the implementation of the back-end are available in Methods, as well as Additional File 1, Supplementary Figure S1).

The user interacts with the mitoXplorer web-platform via the front-end, which offers different visualization and analysis methods. Users can either browse stored public data or upload their own data.

The mitochondrial interactomes

The main component of mitoXplorer is the mitochondrial interactome. Its accurate annotation and completeness are essential for performing a meaningful mitoXplorer-based analysis. To establish mitochondrial interactomes, we have assembled and manually curated lists of genes with annotated mitochondrial processes (mito-processes) for human, mouse and Drosophila. Starting from published mitochondrial proteomics data [27], we removed false-positives and supplemented likely missing genes using information from Mitocarta [24], as well as orthologs across species. We relied mainly on literature sources and information from the respective gene entry at NCBI [38] for establishing whether a protein in question is primarily localized to mitochondria. This resulted in 1166 human, 1161 mouse and 1099 Drosophila mito-genes. We annotated the genes with, and grouped them according to, 35 mito-processes using controlled vocabulary (Table 1). In addition to purely mitochondrial processes, we added cytosolic processes coupled to mitochondrial functions, including glycolysis, the pentose phosphate pathway or apoptosis. According to our annotation strategy, one gene is part of only a single mito-process. We used all mito-genes to create mitochondrial interactomes by adding protein-protein interaction information from STRING [39].
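The construction described above can be pictured with a minimal sketch: a mapping from each mito-gene to exactly one mito-process, plus the interaction pairs in which both partners are mito-genes. The function name, the placeholder gene names and the input layout are assumptions made for illustration only; they do not reproduce the mitoXplorer back-end or the STRING file format.

```python
def build_interactome(gene_to_process, interactions):
    """Return (nodes, edges) for a toy mitochondrial interactome.

    gene_to_process: dict mapping each mito-gene to its single mito-process.
    interactions: iterable of (gene_a, gene_b) pairs from an interaction resource.
    Only pairs whose two partners are both annotated mito-genes are kept.
    """
    nodes = dict(gene_to_process)  # one process per gene, as in the annotation strategy
    edges = sorted({
        tuple(sorted((a, b)))      # undirected edge, stored once
        for a, b in interactions
        if a in nodes and b in nodes and a != b
    })
    return nodes, edges


# Hypothetical gene names; the process names are taken from the text above.
gene_to_process = {
    "GENE_A": "Glycolysis",
    "GENE_B": "Glycolysis",
    "GENE_C": "Apoptosis",
}
interactions = [
    ("GENE_A", "GENE_B"),
    ("GENE_B", "GENE_A"),   # duplicate in reverse order
    ("GENE_A", "GENE_X"),   # GENE_X is not a mito-gene and is dropped
    ("GENE_C", "GENE_A"),
]
nodes, edges = build_interactome(gene_to_process, interactions)
```

Storing the process as a plain dictionary value mirrors the rule that one gene belongs to only a single mito-process.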

Currently, the interactomes of three organisms are available on mitoXplorer: Homo sapiens (human), Mus musculus (mouse) and Drosophila melanogaster (fruit fly). Mito-genes of human, mouse and Drosophila annotated with mito-processes are available in Additional File 2, Supplementary Table S1 a-c. These manually curated and annotated interactomes enable the meaningful analysis and visualization of mitochondrial expression dynamics of mito-processes by comparing differential expression of two or more conditions in mitoXplorer.

The mitoXplorer expression and mutation database

To foster the analysis of mitochondrial expression dynamics and mutations, mitoXplorer hosts expression and mutation data from public repositories in a MySQL database.

Expression data encompass analyzed data of differentially expressed genes from RNA-seq studies and are available in the form of log2 fold change (log2FC) and p-value. One differential dataset thus includes two experimental conditions with all replicates. Mutation data include analyzed data of identified SNPs of one sample against a publicly available reference genome or transcriptome. Pre-analyzed public data are taken as provided by the authors of the respective study. Thus, the algorithms and their settings might differ between data from different studies or sources.
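To make the log2FC value concrete, the following minimal sketch computes it from the averaged expression of two conditions. The pseudocount of 1, the function name and the example values are assumptions for illustration; the datasets in mitoXplorer are taken as provided by the original authors, and p-values come from dedicated differential expression tools rather than from this calculation.

```python
import math


def log2_fold_change(mean_treated, mean_control, pseudocount=1.0):
    """Return log2((treated + c) / (control + c)).

    The pseudocount c (an assumption of this sketch) avoids division by zero
    for genes without reads in one condition.
    """
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


# Hypothetical averaged read counts per gene: (control, treated).
mean_counts = {
    "GENE_A": (100.0, 403.0),
    "GENE_B": (50.0, 50.0),
    "GENE_C": (255.0, 63.0),
}

log2fc = {
    gene: log2_fold_change(treated, control)
    for gene, (control, treated) in mean_counts.items()
}
```

A positive value indicates higher expression in the treated condition, a negative value lower expression, and zero no change.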

The largest public resource imported into mitoXplorer covers publicly available expression data of human cancers from The Cancer Genome Atlas (TCGA, [1]). We have included all paired samples. This resulted in a total of 523 differential datasets from 6 different cancer types: kidney cancer (KIRC), breast cancer (BRCA), liver cancer (LIHC), thyroid cancer (THCA), lung cancer (LUAD) and prostate cancer (PRAD). Changes in mitochondrial metabolism have been described in many cancer types (for a review, see [40]). As mitoXplorer is so far the only resource that allows a focused analysis of mito-genes across different cancer types or patient groups, this resource should be especially useful to shed light on the expression dynamics or mutational data of mito-genes in cancer and to classify the mitochondrial metabolic profiles of tumor types and sub-types. Users can moreover integrate proprietary data with differential expression or with mutation data from different tumor types and subtypes.

We provide data from human trisomy 21 patients (GEO accession numbers: GSE55426; GSE79842; [41,42]), from trisomy 21 studies in mouse (GSE5542 [41]; GSE79842 [42]; GSE48555 [43]), as well as differential datasets from this study from human trisomic cell lines (11 datasets), which have been partially published elsewhere [44,45] (GEO accessions: GSE39768; GSE47830; GSE102855). These transcriptomic, as well as proteomic datasets should help understand the role of mitochondria and the mitochondrial metabolism in trisomy 21.

We also uploaded differential transcriptomic and proteomic data of five different mouse conditional heart knock-out strains of genes involved in mitochondrial replication, transcription and translation [46] (Lrpprc, Mterf4, Tfam, Polrmt, Twnk (Twinkle); GEO accession: GSE96518). These data are especially helpful in unraveling the transcriptional and post-transcriptional effects on mito-genes upon disruption of gene expression at different levels in mitochondria.

To extend mitoXplorer to other model organisms, we added data from D. melanogaster, namely expression data from 185 wild-derived, inbred strains (males and females) from the Drosophila Genetics Reference Panel [47] (DGRP2). The wild-derived fly strains come from different environmental and social situations and display a substantial quantitative genetic variation in gene expression. The availability of these data on mitoXplorer allows a focused analysis of mito-genes to elucidate whether mitochondrial expression dynamics is equally impacted by the environment in these strains. Finally, we have uploaded data from a recently published systematic study of flight muscle development in D. melanogaster [48] (GEO accession: GSE107247). This enables the analysis of mitochondrial expression dynamics during the development and differentiation of a tissue that is highly dependent on an efficient mitochondrial metabolism and especially ATP production for proper functioning.

All publicly available data can be viewed and accessed from the mitoXplorer DATABASE web-site.

User-provided expression and/or mutation data

Researchers can upload and explore their own data in mitoXplorer, provided that they originate from one of the species contained in the mitoXplorer platform. Data must be pre-analyzed. Differential expression data must contain the dataset ID (describing the experimental condition), the gene name and the log2FC. Optional values include the p-value, as well as the averaged read counts (or intensities) of the replicates of the compared conditions. Mutation data must contain the dataset ID, gene name, the chromosome, the position, as well as reference and alternative allele. Optional values include the effect, as well as the consequence of the mutation. The entire list of genes from a study should be uploaded to the platform for several reasons: first, a restriction to only differentially expressed or mutated genes will suppress links between proteins in the interactome; second, an integration of user data with publicly provided data is difficult with incomplete datasets; third, mitoXplorer will automatically select the mito-genes from the user data. Users can either generate their own data in the format described on our website, or use the RNA-seq pipeline that we provide at https://gitlab.com/habermannlab/mitox_rnaseq_pipeline/. Uploaded data will be checked for correct formatting and integrated with the interactome of the chosen species. User data are only visible to the owner and are stored in the mitoXplorer MySQL database for 7 days. Users can integrate their own data with available public data on mitoXplorer to perform various analyses and visualizations as described below (Figure 1).
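The required and optional fields for differential expression data can be illustrated with a small validation sketch. The column names (`dataset_id`, `gene_name`, `log2fc`, `p_value`), the comma-separated layout and both function names are assumptions of this example; the authoritative upload format is the one described on the mitoXplorer website. The example rows carry invented values.

```python
import csv
import io

# Assumed column names for the three mandatory fields named in the text.
REQUIRED = ("dataset_id", "gene_name", "log2fc")


def parse_expression_table(text):
    """Parse comma-separated differential expression rows.

    Raises ValueError if a required column is missing or a number is malformed.
    """
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        p = row.get("p_value")  # optional column
        rows.append({
            "dataset_id": row["dataset_id"],
            "gene_name": row["gene_name"],
            "log2fc": float(row["log2fc"]),
            "p_value": float(p) if p not in (None, "") else None,
        })
    return rows


def select_mito_genes(rows, mito_genes):
    """Keep only rows whose gene is part of the given mito-gene list."""
    return [r for r in rows if r["gene_name"] in mito_genes]


# Hypothetical upload: the full gene list is provided, mito-genes are selected afterwards.
example = """dataset_id,gene_name,log2fc,p_value
T21_vs_WT,TFAM,-1.2,0.003
T21_vs_WT,ACTB,0.1,
T21_vs_WT,POLRMT,0.8,0.04
"""
rows = parse_expression_table(example)
mito_rows = select_mito_genes(rows, {"TFAM", "POLRMT"})
```

Uploading the complete gene list and filtering afterwards, as in `select_mito_genes`, corresponds to the third reason given above for not pre-filtering the data.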

Analysis and visualization tools in mitoXplorer

The mitoXplorer web-platform provides a set of powerful, easy-to-read and highly interactive visualization tools to analyze and visualize public, as well as user-provided data by VDM (Figure 1): an Interactome View to analyze the overall expression and mutation dynamics of all mito-processes of a single dataset containing differentially expressed genes between two conditions and potential mutations in mito-genes; the Comparative Plot, consisting of an interactive scatter plot, as well as an interactive heatmap for comparing up to 6 datasets; the Hierarchical Clustering, as well as the Principal Component Analysis for comparative analysis of many datasets.

Interactome View

The Interactome View can be used to get an at-a-glance view of the overall expression dynamics of all mito-processes of a single dataset of differentially expressed mito-genes and potential mutations (Figure 2 a). It allows users to identify the most prominently changed mito-processes or -genes in a dataset. The genes are grouped according to mito-processes and displayed in the process they are assigned to. The Interactome View is highly dynamic and can be adjusted by users to their needs.

When the Interactome View is launched, each mito-process is initially shown as a grey circle with elements colored in grey, blue and/or red, with blue and red indicating up- and down-regulated genes within the process, respectively (Figure 2 a). Thus, mito-processes with the most up- or down-regulated genes can be quickly identified.

When clicking on a process name, its circle opens and displays all its member genes as bubbles, whereby the size of the bubble indicates the strength of the differential regulation and the color indicates up- (blue) or down- (red) regulation of the gene (Figure 2 b). Both the log2FC and the p-value are color-coded in the Interactome View. Only genes with a p-value below 0.05 will be colored. If information about mutations is included in the dataset, these are indicated by a thicker, black border of the gene bubble.
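The display rules just described can be summarized as a small mapping from a gene's values to its bubble attributes. This is a sketch under stated assumptions, not the front-end code: the function name, the attribute names, the use of |log2FC| as bubble size and the grey fallback for genes without a p-value are all illustrative choices.

```python
def bubble_style(log2fc, p_value, has_mutation=False, alpha=0.05):
    """Map one gene's values to display attributes following the rules in the text.

    Genes are colored only if their p-value is below alpha (0.05 in the text);
    blue marks up-regulation, red marks down-regulation.
    """
    if p_value is not None and p_value < alpha:
        if log2fc > 0:
            color = "blue"
        elif log2fc < 0:
            color = "red"
        else:
            color = "grey"
    else:
        color = "grey"  # assumption: non-significant or missing p-value stays grey
    return {
        "color": color,
        "size": abs(log2fc),  # assumption: size scales with |log2FC|
        "border": "thick-black" if has_mutation else "default",
    }


# Hypothetical values for three genes.
up = bubble_style(1.5, 0.01)
down = bubble_style(-2.0, 0.001, has_mutation=True)
not_significant = bubble_style(3.0, 0.2)
```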

Hovering over a gene will display the gene name, its function, its mito-process, the log2FC and the p-value of the differential expression analysis, as well as potential mutations in the information panel (Figure 2 c). If a gene physically interacts with other mito-genes, hovering over it or over the process circle will in addition display these connections (Figure 2 c). Thus, the user is immediately informed about the location and connectivity of the protein of interest within the mitochondrial interactome. Users can also search for specific genes using the ‘FIND A GENE’ box at the top of the page.

The Interactome View can be launched by clicking on the ‘eye’ symbol next to dataset names from the ANALYSIS page of mitoXplorer, after having chosen the organism, the project and the dataset. Alternatively, users can access single datasets from the DATABASE page of the platform, by clicking on the eye symbol of a listed dataset after having chosen a species, as well as a project. A new page will be opened for the Interactome View, which allows opening and comparing multiple datasets at the same time. This is especially useful for comparing the overall expression change of mito-processes of multiple datasets.

Comparative Plot

The Comparative Plot visualization combines several interactive graphs to analyze one mito-process, allowing the comparison of up to 6 datasets. It includes a scatter plot with a dynamic y-axis, as well as an interactive heatmap at the bottom of the page. The mito-process to be visualized can be selected in the process panel (Figure 3 a). Red and blue coloring of the dots and the bar chart indicates the directionality of differential expression (blue: upregulated; red: downregulated); bright blue, larger gene bubbles in the scatter plot indicate mutations, if available from the dataset. This view offers an overview of the expression dynamics of all members of one mito-process for up to 6 individual datasets and thus can be helpful in identifying co-regulated genes e.g. in time-course data, patients or multiple mutant datasets.

Hovering over a gene bubble, or over a bar in the heatmap will again display the respective associated information of the gene in the information panel (gene name, function, mito-process, log2FC, p-value, potential mutations) (Figure 3 b). The heatmap can be sorted according to the dataset, as well as the differential expression values within one dataset (Figure 3 c). The Comparative Plot is especially useful for performing a detailed, comparative, mito-process based analysis of differential expression dynamics between different datasets.

We applied this analysis method to visualize differential expression data from a time-series study of flight muscle development during pupal stages in Drosophila [48] (Figure 3). While enrichment analysis has revealed a general positive enrichment of processes like ‘TCA Cycle’ in the course of flight muscle development, mitoXplorer identifies 12 genes of the TCA cycle that are co-regulated. This group of genes is strongly upregulated between 0 and 16 hours of development, when myoblasts divide and fuse to myotubes. The same group of genes is subsequently downregulated in two phases, at time-points 30 to 48 hours and 72 to 90 hours APF, when myotubes differentiate to mature muscle fibers. This is surprising as in mature muscle fibers, the TCA cycle should be important for proper functioning. Their strong induction between the first two time-points could be responsible for downregulation at later stages.

The Heatmap: Hierarchical Clustering

Hierarchical Clustering visualization allows the analysis of up to 100 datasets, analyzing one process at a time. This creates a heatmap with mito-genes, as well as -datasets, which are clustered according to the log2FC using hierarchical clustering (Figure 4 a). The results are displayed as a clustered heatmap, with a dendrogram indicating the distance between datasets or between genes.
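As a rough illustration of how datasets can be grouped by their log2FC profiles, the sketch below performs agglomerative clustering with Euclidean distance and average linkage. Both choices, as well as the function names and the example values, are assumptions of this example; the text does not state which distance measure or linkage method mitoXplorer uses.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two log2FC profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering of named profiles.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram is drawn from.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                total = sum(euclidean(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical log2FC values of three genes of one mito-process in three datasets.
profiles = {
    "dataset_1": [2.0, 1.0, 0.0],
    "dataset_2": [2.0, 1.0, 1.0],
    "dataset_3": [-2.0, -1.0, 0.0],
}
merges = average_linkage(profiles)
```

Clustering the genes instead of the datasets works the same way after transposing the table, so that each gene carries one value per dataset.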

Hovering over a gene will display its associated information, as well as dataset information in the information panel (Figure 4 b). The user can furthermore zoom into parts of the heatmap to get a more detailed view of the data. The heatmap is particularly useful for discovering groups of similarly regulated mito-genes or datasets within one mito-process.

We applied this visualization tool to display transcriptome and proteome data from a recent, systematic study of mouse conditional knock-out strains for five genes involved in mitochondrial replication (Twinkle (Twnk)), mtDNA maintenance (Tfam), mito-transcription (Polrmt), mito-mRNA maturation (Lrpprc) and mito-translation (Mterf4) [46]. Interestingly, the expression dynamics of the mitochondrial transcriptomes and proteomes in heart tissue did not cluster together for the mutants upon the loss of any of these genes. In accordance with this, the expression of some mito-genes in the process pyruvate metabolism, which is shown here, differs between transcriptome and proteome level. This demonstrates the usefulness of hierarchical clustering and the heatmap display in identifying the correlation or divergence between genes as well as datasets.

Principal Component Analysis

392 A larger number of datasets can be compared using Principal Component Analysis

393 (PCA), either for an individual mito-process, or considering all mito-genes together

394 (Figure 5 a). In PCA, the expression value (e.g. log2FC) of each gene is considered

395 as one dimension, and each dataset represents one data point. In the resulting 3D

396 PCA plot, the three axes represent the first three principal components and each

397 bubble represents one dataset. The PCA is again interactive. The mito-process to be

398 viewed can be selected via a drop-down menu on the top of the page. The plot can be

399 turned and moved in 3D and has a zooming function.
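
The PCA setup described above (one dimension per gene, one data point per dataset, the first three components as plot axes) can be sketched as follows. This is a minimal illustration, not the mitoXplorer implementation: the dataset names and log2FC values are invented, and power iteration with deflation is used only to keep the example free of external libraries, since the text does not state which PCA routine the platform uses.

```python
import math
import random

# Hypothetical log2FC matrix: one row per dataset, one column per mito-gene.
DATASETS = ["tumorA_1", "tumorA_2", "tumorB_1", "tumorB_2", "tumorC_1"]
LOG2FC = [
    [2.0, 1.8, -0.1, 0.2, 0.0],
    [2.2, 1.6, 0.1, -0.2, 0.3],
    [-1.9, -2.1, 0.2, 0.1, -0.2],
    [-2.1, -1.7, -0.2, 0.3, 0.1],
    [0.1, -0.1, 1.5, -1.4, 0.2],
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def center(rows):
    """Subtract the per-gene mean so every dimension is centered on zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def principal_axes(centered, k=3, iters=500, seed=0):
    """Leading k eigenvectors of X^T X, found by power iteration with deflation."""
    rng = random.Random(seed)
    n_genes = len(centered[0])
    axes = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for _ in range(iters):
            scores = [dot(row, v) for row in centered]        # X v
            w = [dot(scores, col) for col in zip(*centered)]  # X^T (X v)
            for a in axes:  # remove directions already found
                proj = dot(w, a)
                w = [wi - proj * ai for wi, ai in zip(w, a)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:  # no variance left in any remaining direction
                v = [0.0] * n_genes
                break
            v = [wi / norm for wi in w]
        axes.append(v)
    return axes


centered = center(LOG2FC)
axes = principal_axes(centered)
# Each dataset becomes one point (PC1, PC2, PC3), i.e. one bubble in the 3D plot.
coords = {name: [dot(row, a) for a in axes] for name, row in zip(DATASETS, centered)}
for name, xyz in coords.items():
    print(name, [round(c, 3) for c in xyz])
```

In this toy example the two "tumorA" datasets fall on one side of the first component and the two "tumorB" datasets on the other, which is the kind of group separation the 3D plot is meant to expose.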

400

401 Hovering over a bubble will give all information associated with the individual dataset

402 in the information panel, including the values of the first three principal components

403 (Figure 5 b). The information differs for each project chosen.

404

405 Individual datasets can be selected and colored via the dataset panel next to the plot

406 (Figure 5 c). For data from TCGA, the filter and coloring can for instance be used to

407 highlight or to limit the plot to data from different tumors, different tumor stages or

408 according to any other additional information provided. The PCA is especially useful

409 for analyzing a large number of datasets and displaying specific trends in sub-groups.

410

411 We used the PCA plot to visualize data from the TCGA for four cancer types stored in

412 mitoXplorer in Figure 5 a, whereby the colors of the bubbles represent the different

413 tumor types. The PCA mode clearly highlights the distinctness of the different tumor

414 types. In particular, kidney and liver cancer are highly distinct with respect to the first

415 three components of all mito-genes (Figure 5 a).

416

417 Both views intended for large datasets, the Heatmap and the PCA, can identify groups

418 of correlated datasets. In order to allow a more detailed, gene-centered analysis of

419 correlated datasets, we added the possibility to select and group datasets in the

420 Heatmap and the PCA view. Groups of datasets can be compared against each other

421 with the Comparative Plot, whereby the log2FC is averaged over the data within a

422 group. This functionality is useful, for instance, for comparing groups of patients that

423 share a similar expression pattern, or for comparing expression changes across

424 different tumor stages during tumor development.
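
The group averaging used for the Comparative Plot can be illustrated with a short sketch. It assumes that each dataset is a mapping from gene to log2FC and that genes missing from a dataset are simply skipped; the patient names, group labels and values are hypothetical, and the platform's actual handling of missing values is not described here.

```python
from statistics import mean

# Hypothetical log2FC values per dataset and gene.
LOG2FC = {
    "patient_1": {"NDUFA1": -1.2, "COX5A": -0.8, "MRPS21": -3.0},
    "patient_2": {"NDUFA1": -1.0, "COX5A": -0.6, "MRPS21": -3.4},
    "patient_3": {"NDUFA1": 0.2, "COX5A": 0.4, "MRPS21": 0.1},
    "patient_4": {"NDUFA1": 0.0, "COX5A": 0.2, "MRPS21": -0.1},
}
GROUPS = {
    "group_A": ["patient_1", "patient_2"],
    "group_B": ["patient_3", "patient_4"],
}


def average_group(log2fc, members):
    """Average the log2FC of every gene over the datasets of one group."""
    genes = sorted({gene for m in members for gene in log2fc[m]})
    return {
        gene: mean(log2fc[m][gene] for m in members if gene in log2fc[m])
        for gene in genes
    }


group_means = {name: average_group(LOG2FC, members) for name, members in GROUPS.items()}
for name, values in group_means.items():
    print(name, {gene: round(v, 2) for gene, v in values.items()})
```

Because the values are logarithms, averaging log2FC within a group corresponds to taking the geometric mean of the underlying fold changes.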

425

426 Taken together, mitoXplorer provides a versatile, interactive and integrative set of tools

427 to visualize and analyze the expression dynamics as well as mutations of mito-genes

428 and mito-processes, providing a detailed understanding of observed changes at a

429 molecular level.

430

431 Analyzing cell lines carrying trisomy 21 using the mitoXplorer platform

432 To demonstrate the analytical and predictive power of mitoXplorer, we analyzed the

433 transcriptome and proteome of a set of aneuploid cell lines carrying an extra copy of

434 chromosome 21 (trisomy 21, T21). Mitochondrial dysfunction has been repeatedly

435 found in T21 patients, whereby mostly oxidative stress, as well as – potentially

436 resulting – mitochondrial respiratory deficiency have been shown to contribute to some

437 of the observed clinical features (see for instance [49-62]). Transcriptome studies of

438 different T21 tissues using microarrays [63-73] and more recently RNA sequencing

439 [41,42,74] and proteomics [75-79] have revealed a complex picture of gene expression

440 changes, with a marked dissimilarity in differential expression of mito-genes on mRNA

441 and protein levels, indicating a potential post-transcriptional regulatory effect on some

442 mito-genes in T21 [78]. Yet, mito-gene and protein expression data in different tissues

443 or under varying conditions in T21 remain sparse, and a coherent hypothesis of the

444 underlying mechanisms leading to the mitochondrial deficiencies in T21 patients is still

445 missing.

446

447 We used trisomy 21 cell lines derived from either the euploid human colon cancer cell

448 line HCT116 or from the retinal pigmented epithelial cell line RPE1, to which an extra

449 copy of chromosome 21 [45] was added. We used two RPE1-derived and two

450 HCT116-derived clones trisomic for chromosome 21 (Additional File 3, Supplementary

451 Table S2 a), which were validated by fluorescent in situ hybridization and by whole

452 genome sequencing. We used transcriptomic data of the original euploid RPE1 line

453 and its two trisomic derivatives (RPE_T21 clone 1 and 2 (c1, c2) [45]), as well as for

454 HCT116, and its trisomic derivatives (HCT_T21 (c1, c3)). We included proteomics data

455 for RPE1 and one of its T21 derivatives (RPE_T21 c1). We performed bioinformatic

456 analysis to determine differential expression of the above conditions (Additional File

457 3, Supplementary Table S2 b-e) and uploaded the differential expression data of the

458 transcriptome and proteome on the mitoXplorer platform for further in-depth,

459 mitochondrial analysis.

460

461 Differences between trisomy 21 cell lines

462 MitoXplorer analysis of data comparing HCT116- and RPE1-derived T21 cell lines

463 using the Interactome View revealed that T21 induced strong effects with respect to

464 the overall expression changes in mito-genes (Figure 6). HCT_T21 showed a subtle,

465 but consistent up-regulation of mito-genes (Figure 6 a). In contrast, RPE_T21 cells

466 showed a strong down-regulation of a few genes involved in several mito-processes,

467 such as fatty acid metabolism, glycolysis or mitochondrial dynamics (Figure 6 b).

468 Remarkably, quantitative proteome data from RPE_T21 c1 cells suggested that all

469 mitochondria-encoded genes involved in OXPHOS, as well as the majority of nuclear-

470 encoded OXPHOS-genes are downregulated (Figure 6 c). In conclusion, mitoXplorer

471 analysis revealed significant differences in mito-gene expression between the different

472 cell lines. Importantly in RPE_T21 cells, proteome data show a remarkable difference

473 to transcriptome data.

474

475 mitoXplorer reveals mitochondrial ribosomal assembly defects in RPE_T21 cell

476 lines

477 To investigate the differences further, we next performed a more detailed analysis of

478 expression changes in these T21 cell lines using Comparative Plots in mitoXplorer.

479 Transcriptome and proteome data from RPE_T21, but not from HCT_T21 cell lines

480 revealed that several proteins of the small subunit of the mitochondrial ribosome (mitoribosome)

481 were significantly downregulated on either RNA or protein level, or both (Figure 7 a).

482 MRPS21 was strongly reduced on RNA- and protein-level. The genes MRPS33,

483 MRPS14 and MRPS15 were largely normal on RNA level, while their protein levels

484 decreased more than 2-fold (log2FC: MRPS33: -2.147; MRPS14: -1.827; MRPS15: -

485 1.057). The mitoribosome subunits are encoded in the nuclear genome and their

486 protein products are imported into the mitochondria, where they assemble with

487 mitochondrial ribosomal RNAs to form the large and small subunits of the

488 mitoribosome. The mitoribosome is responsible for translating the 13 mt-mRNAs

489 encoded in the mitochondrial genome, all of which code for key subunits of the

490 respiratory chain required for OXPHOS [80,81]. In accordance with a disrupted

491 mitochondrial translation machinery, all quantifiable mitochondria-encoded OXPHOS

492 proteins (Complex I: MT-ND1 and MT-ND5; Complex IV: MT-CO2) were severely

493 diminished on protein-, but not on RNA-level in RPE_T21 cells (Figure 7 b,c).
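
The protein-level log2FC values quoted above for MRPS33, MRPS14 and MRPS15 can be converted into fold changes to check the "more than 2-fold" statement. The sketch below uses only the three values given in the text; the helper name `fold_reduction` is ours.

```python
# Protein-level log2FC values as quoted in the text.
PROTEIN_LOG2FC = {"MRPS33": -2.147, "MRPS14": -1.827, "MRPS15": -1.057}


def fold_reduction(log2fc):
    """Fold reduction corresponding to a negative log2FC: 2 ** (-log2FC)."""
    return 2.0 ** (-log2fc)


for gene, value in PROTEIN_LOG2FC.items():
    print(f"{gene}: log2FC {value} -> {fold_reduction(value):.2f}-fold lower")
```

A log2FC of -1 corresponds to exactly a 2-fold reduction, so every value below -1 represents a decrease of more than 2-fold.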

494

495 Interestingly, 36 of the quantifiable OXPHOS proteins encoded in the nuclear genome

496 were also found to be down-regulated at the proteome, but not at transcriptome level

497 in RPE_T21 cells (Figure 7 c). These include subunits of the NADH dehydrogenase

498 (complex I), ubiquinol-cytochrome c reductase (complex III) and cytochrome c oxidase

499 (complex IV). It is important to note that there is no general down-regulation of

500 mitochondrial proteins in these cells and only a few, specific proteins are strongly

501 downregulated (Figure 6 c). Together, these data demonstrate the power of

502 mitoXplorer to help identify the cause of important changes in mito-gene expression,

503 here the downregulation of mitoribosomal subunits at the transcript level, and the

504 resulting effect, in this case the downregulation of the majority of OXPHOS proteins.

505

506 RPE_T21 cells are defective in oxidative phosphorylation

507 The massive downregulation of OXPHOS proteins in RPE_T21 cells suggests that

508 these cells should suffer from a severe OXPHOS deficiency. To test this hypothesis

509 experimentally, we analyzed cellular respiration and glycolysis in T21 cell lines using

510 a Seahorse XF96 analyzer to quantify oxygen consumption rate (OCR) as an indicator

511 of mitochondrial respiration (Figure 8 a-d, f), as well as the proton production rate

512 (PPR) as an indicator of glycolysis (Figure 8 e, g). In intact RPE_T21 cells, we indeed

513 observed dramatically reduced levels of cellular respiration in comparison to the

514 diploid control (Figure 8 a).

515

516 As a complex I deficiency has been reported in trisomy 21 patients [58], we next asked

517 whether RPE_T21 cells selectively suffer from a complex I deficiency, or whether the

518 entire respiratory chain is affected, as suggested by our proteomics data. We used

519 permeabilized cells to test each individual complex with the Seahorse analyzer,

520 supplementing with pyruvate/malate, succinate and TMPD/ascorbate for assessing

521 complex I, II or IV functionality, respectively. As expected from our proteomics

522 analysis, RPE_T21 cells displayed a severe deficiency of the entire respiratory chain

523 (Figure 8 b-d). The glycolytic rate of RPE_T21 cells in the presence of glucose was

524 similar to the diploid control cells. Inhibition of ATP-production was not able to

525 stimulate the cells to a higher glycolytic rate (Figure 8 e), which agrees with the already

526 low OXPHOS levels observed in these cells. HCT_T21 cells, on the other hand,

527 displayed normal respiration, as well as glycolysis (Figure 8 f, g). This suggests that

528 the respiratory chain, as well as the mitochondrial translational machinery is not

529 generally affected in all T21 cells. Taken together, mitoXplorer uncovered OXPHOS

530 deficiencies in RPE_T21 cells, which we verified experimentally, demonstrating the

531 power of an in-depth analysis of mitochondrial expression dynamics to identify the

532 potential molecular cause of the observed phenotype.

533

534 Quantification of mitochondrial network morphology using mitoMorph

535 We further wanted to investigate whether T21 and the defective OXPHOS affected

536 mitochondrial morphology, that is, whether the mitochondrial network structure

537 was changed in T21 cell lines. To quantify mitochondrial morphology in RPE_T21

538 cells, we stained mitochondria using the MitoTracker Deep Red FM dye. In order to

539 quantify the characteristics of mitochondrial morphology, we developed a new Fiji

540 plug-in for quantification of mitochondrial network features, which we called mitoMorph

541 (Figure 9 a,b). MitoMorph is based on the scripts provided by Leonard, et al. [82] for

542 quantifying mitochondrial network features such as filaments, rods, puncta and

543 swollen mitochondrial structures (see Materials and Methods for implementation

544 details). MitoMorph reports the percentages of filaments, rods, puncta and swollen structures for

545 each individual cell, as well as for all selected cells in a batch analysis. Moreover, it

546 provides the lengths and areas of filaments and rods. Figure 9 (c-f) shows the

547 distribution of mitochondrial network features for the two wild-type and T21 cell lines.

548 MitoMorph analysis revealed that in both backgrounds, T21 cells had fewer

549 mitochondrial filaments than their wild-type counterparts, but instead possessed a

550 slightly higher number of rods, which was significant in HCT_T21 cells. Both T21 cell

551 lines had significantly more swollen structures than their wild-type counterparts.

552 Length and area distribution of filaments and rods were not significantly different

553 between the wild-type and the trisomy 21 cells (Additional File 1, Supplementary

554 Figure S2 a-d). In conclusion, mitochondrial morphology based on light-microscopy is

555 mildly affected in trisomy 21.

556

557 Data integration with publicly available trisomy 21 datasets

558 After discovering this differential OXPHOS defect in our RPE_T21 cell lines, we were

559 interested in the overlap of the mito-transcriptome and -proteome of RPE_T21 cells

560 with data from trisomy 21 patients. We used proteomic and transcriptomic data from a

561 monozygotic twin study discordant for chromosome 21 [41,78]. In agreement with our

562 RPE_T21 data, systematic proteome and proteostasis profiling of fibroblasts from

563 monozygotic twins discordant for T21 revealed a significant, although milder down-

564 regulation of the mitochondrial proteome, including proteins involved in OXPHOS,

565 which is not apparent from transcriptomic analysis of the same cells (see Additional

566 File 1, Supplementary Figure S3 a, b). Taken together, the analysis of both datasets

567 with mitoXplorer suggests a strong post-transcriptional effect leading to reduced

568 expression levels of proteins involved in OXPHOS in trisomy 21.

569

570 Discussion

571

572 The web-based mitoXplorer platform for mito-centric data exploration

573 MitoXplorer is a practical web tool with an intuitive interface for users who wish to gain

574 insight from -omics data in mitochondrial functions. It is the first tool that takes

575 advantage of the breadth of -omics data available to date to explore expression

576 variability of mito-genes and -processes. It does so by integrating a hand-curated,

577 annotated mitochondrial interactome with -omics data available in public databases or

578 provided by the user.

579

580 MitoXplorer has been conceived and implemented as a visual data mining (VDM)

581 platform: by iteratively interacting with, visualizing and manipulating the

582 graphical display of the data, the user can effectively explore complex data to extract

583 knowledge and gain a deeper understanding of the data. MitoXplorer provides a set of

584 particularly interactive and flexible visualization tools, with a fine-grained, function- as

585 well as gene-based resolution of the data. In addition, clustering and PCA help

586 to mine a larger number of -omics datasets effectively by grouping datasets with

587 similar expression patterns.

588

589 VDM-based knowledge discovery is offered by a large number of resources and

590 platforms. However, to the best of our knowledge, no other currently available tool allows users to

591 explore expression variation of a specific subset of genes in a large number of -omics

592 datasets. MitoXplorer permits users to exploit publicly available transcriptome, proteome or

593 mutation data to study the variation and thus, the adaptability of a defined gene set in

594 different conditions or species. While mitoXplorer offers the exploration of mito-genes,

595 we have designed the platform such that users who wish to download a local version

596 of mitoXplorer can also upload their own interactome, which can be any gene group

597 of interest. Thus, mitoXplorer can be flexibly adjusted to any user-defined interactome

598 set.

599

600 Cell type-specific de-regulation of mito-genes in trisomy 21

601 We tested mitoXplorer by expression profiling of mito-genes in T21 cell lines. Mito-

602 genes were strongly deregulated in both trisomic cell types tested, the non-cancerous

603 retinal pigment epithelial cell line RPE1 and the cancer cell line HCT116. Yet, the

604 changes in expression were quite different in the two cell lines. It is not unexpected

605 that mito-genes are differentially expressed in different cell types, reflecting the

606 divergent cellular energy- and metabolic demands [20]. Gene expression is moreover

607 tightly regulated in a cell-type specific manner by regulating transcription, translation

608 and the epigenetic state of the cell. Thus, divergent and cell-type specific

609 expression changes of mito-genes upon introduction of an extra chromosome are also not

610 surprising.

611

612 mitoXplorer revealed divergent de-regulation of mitochondrial transcriptome

613 and proteome in trisomy 21

614 We found a remarkable difference between transcriptome and proteome levels of mito-

615 genes in RPE_T21 cells. In particular the OXPHOS proteins were strongly down-

616 regulated at protein, but not mRNA level. This can be explained by essential

617 components of the respiratory chain being encoded in the mitochondrial genome and

618 thus requiring a functioning mitochondrial replication system, as well as intact

619 mitochondrial transcription and translation. Thus, there is a strong post-transcriptional

620 regulation of the mitochondrial proteome. In case of the RPE_T21 cell line, the

621 disintegration of the mitoribosome and thus a failure of mitochondrial translation is

622 likely causative for the down-regulation of OXPHOS components on protein-level,

623 possibly by proteolysis, as the essential mitochondrial subunits are not produced and

624 thus complexes cannot assemble. This conclusion is further supported by the fact that

625 we could not observe a significant difference in mitochondrial transcript levels, with

626 some mt-mRNAs even being upregulated; thus, mtDNA -maintenance, -replication as

627 well as mito-transcription seem to be unaffected.

628

629 MitoXplorer analysis of previously published data of the mito-proteome of fibroblasts

630 isolated from monozygotic twins discordant for T21 [78] revealed a similar post-

631 transcriptional effect as our T21 model cell lines. Taken together, our data uncovered

632 a significant post-transcriptional regulation of the mitochondrial process OXPHOS in

633 trisomy 21 that could bring new insight into the mechanisms of mitochondrial defects

634 in trisomy 21 patients.

635

636 mitoXplorer identified mitochondrial ribosomal protein S21 (MRPS21) as

637 potentially causative for OXPHOS failure

638 The most notable difference in RPE_T21 cells compared to wild-type is the 10- to 12-fold

639 downregulation of mitochondrial ribosomal protein S21 (MRPS21) on transcript

640 level, as well as the downregulation of Mrps21 protein and other proteins of the small

641 and, to a lesser extent, large mitoribosome subunit. Thus, our data suggest that the

642 integrity of the mitoribosome is compromised, leading to its disintegration and

643 subsequently, the downregulation of mitochondrial proteins of the respiratory chain.

644 Mrps21 is a late-assembly component and lies at the outer rim of the body (or bottom)

645 of the small subunit (SSU) of the mitoribosome. Nevertheless, it interacts with a

646 number of other proteins of the SSU and also directly contacts bases of the 12S rRNA

647 [83,84]. Thus, its absence could destabilize the SSU of the mitoribosome. The two

648 most down-regulated proteins are Mrps33 and Mrps14, both of which directly interact

649 with each other and several other proteins in the SSU and are localized to the head of

650 the SSU. Furthermore, together with another down-regulated component, Mrps15,

651 they are proteins that are incorporated late in the mitoribosome assembly process [84].

652 This raises the possibility that late-assembly proteins dissociate more readily from

653 the mitoribosome, leading to their enhanced degradation and thus ribosome

654 malfunction.

655

656 Based on promoter analysis using MotifMap [85], potential binding motifs of two

657 transcription factors located on chromosome 21, GABPA and ETS2, can be found in

658 the promoter region of the MRPS21 gene. Gabpa, which is also known as nuclear

659 respiratory factor 2, has already been implicated in mitochondrial biogenesis by

660 regulating Tfb1m expression [86]: its depletion in mouse embryonic fibroblasts led to

661 reduced mitochondrial mass, ATP production, oxygen consumption and mito-protein

662 synthesis, but had no effect on mitochondrial morphology, membrane potential or

663 apoptosis. Direct or indirect regulation of mitoribosomal proteins could be another

664 regulatory function of this transcription factor. GABPA is not affected on transcriptome

665 level, but is downregulated on protein-level in RPE_T21 cells. ETS2 on the other hand

666 has so far not been implicated in mitochondrial biogenesis or functional regulation.

667

668 Outlook

669 MitoXplorer integrates, clusters and visualizes numerical data resulting from

670 expression studies (transcriptome, proteome), as well as mutation data. Thus, it is

671 currently limited to analyzing mito-genes without offering the ability to explore their

672 embedding in a broader, cellular context and thus to learn about potential regulatory

673 mechanisms of observed expression changes of mito-genes. Therefore, in the next

674 release of mitoXplorer, we plan to fully embed mito-genes within the cellular gene

675 regulatory, as well as signaling network by adding information from epigenetic studies

676 (ChIP-seq, methylation data), as well as from the cellular interactome. We will provide

677 the tools to perform enrichment analysis of observed transcription factors binding in

678 the promoter regions of co-regulated mito-genes; and to explore the regulatory

679 network of mito-genes by offering network analysis methods, such as viPEr [87]. Other

680 analysis methods we will provide include correlation analysis, as well as cross-species

681 data mining. Upon user request, we will also add the mitochondrial interactomes of

682 other species. As mitoXplorer stores the mitochondrial interactomes, as well as

683 associated -omics data in a MySQL database, all technical requirements for extending

684 the functionalities of mitoXplorer are already implemented.

685

686 Conclusions

687 mitoXplorer is a powerful, web-based visual data mining platform that allows users to

688 analyze and visualize in depth the mutations and expression dynamics of mito-genes and

689 mito-processes by integrating a manually curated mitochondrial interactome with -

690 omics data in various tissues, conditions or species. We used transcriptome and

691 proteome data from cell lines with trisomy 21 to demonstrate the value of mitoXplorer

692 in analyzing in detail the expression dynamics of mito-genes and -processes. We have

693 used mitoXplorer to integrate these data with publicly available datasets of patients

694 with trisomy 21. Using mitoXplorer for data mining, we predicted failure of

695 mitochondrial respiration in one of the trisomy 21 cell lines, which we could verify

696 experimentally. Our results demonstrate the power of a visual data mining platform

697 such as mitoXplorer to explore expression dynamics of a specified mito-gene set in a

698 detailed and focused manner, leading to discovery of underlying molecular

699 mechanisms and providing testable hypotheses for further experimental studies.

700

701 Methods

702

703 Implementation of mitoXplorer

704

705 Web interface of mitoXplorer (Front-end)

706 The web interface of mitoXplorer at the front-end allows users to access, interact with and

707 visualize data from its database, including the interactome and expression/mutation

708 data. The interactive elements and visualizations on mitoXplorer are all built with

709 JavaScript, a dynamic programming language that enables interactivity on web pages

710 by manipulating elements through the DOM (Document Object Model). The DOM is a

711 representation of a document, such as an HTML page, in a tree structure, with each element as

712 a node or an object. Through JavaScript and its libraries, visualizations in mitoXplorer

713 can react to users’ actions and dynamically change the properties (size, color,

714 coordinates) of web elements. All the visualization

715 components in mitoXplorer described below are modular by design and can be

716 deployed individually or incorporated into web platforms easily.

717

718 Mitochondrial Interactome (D3 - Data binding and selection)

719 The visualization of the interactome is created with a JavaScript

720 library, D3 (d3.js) [88]. D3 (Data-Driven Documents) is capable of binding data, usually

721 in the form of JSON (JavaScript Object Notation), to the elements of the DOM so that

722 their properties are entirely based on the given data. In the interactome, D3 creates an

723 SVG (Scalable Vector Graphics) element for each gene within the DOM in the form of

724 a bubble, with size and color dependent on its log2FC value. The coordinates

725 of bubbles are also calculated according to the data (e.g. the largest one being at the

726 center) so that the layout of the whole interactome is visually appealing. Upon hovering

727 over any bubble (gene), D3 selects the element and passes additional data bound to

728 that element to the corresponding web element (sidebar) for display.

729

730 Comparative plot (D3 - Transition and sorting)

731 The comparative plot combines three interdependent visualizations (scatterplot, bar

732 chart and heatmap) built upon D3. Apart from data-binding and selection, these

733 visualizations exploit the transition and sorting functionality of D3 through its API. In

734 the scatterplot, genes are displayed as nodes, whose color and position again

735 depend on the data (log2FC). When another mito-process is selected in the bar chart,

736 D3 updates the data bound to the node and the properties of the nodes are changed.

737 The transition (changes in color and position) is smooth and gives users the

738 impression that the visualization is truly dynamic and interactive. D3 can manipulate

739 not only the elements but also the data bound to them. Upon clicking the

740 dataset or gene names on the heatmap, the data can be sorted accordingly and an

741 index is assigned to each element (tile on the heatmap) to indicate its position.

742

743 Hierarchical clustering (mpld3 - Visualization in Python implemented in D3)

744 The heatmap displaying the results of hierarchical clustering is built with mpld3, a

745 Python library that exports graphics made with Python’s Matplotlib-based libraries to

746 JSON objects that can be displayed in web browsers. Mpld3 benefits from D3’s

747 data-binding property and allows users to create a plugin that interacts with the data

748 on the visualization. The advantage of using mpld3 is that the analysis and

749 visualizations made in Python can be directly translated to JSON and deployed in

750 JavaScript on webpages without re-programming. In the case of hierarchical

751 clustering, since libraries for both clustering analysis and visualization of results in a

752 heatmap with a dendrogram are available in Python (described below), the result is exported

753 to JSON with mpld3, together with a JavaScript tooltip plugin that allows users to select data or

754 display information with D3.
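
The general flow described above (cluster in Python, then hand a JSON object to the browser) can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the pipeline used by mitoXplorer: average linkage on Euclidean distances is our choice because the clustering settings are not given in this section, the log2FC profiles are invented, and `json.dumps` merely stands in for the mpld3 export step.

```python
import json
import math
from itertools import combinations

# Hypothetical log2FC profiles: one list of gene values per dataset.
PROFILES = {
    "Twnk_RNA": [1.0, 0.8, -0.2],
    "Tfam_RNA": [0.9, 0.7, -0.1],
    "Twnk_protein": [-1.1, -0.9, 0.3],
    "Tfam_protein": [-1.0, -1.2, 0.2],
}


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_distance(profiles, a, b):
    """Average pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[x], profiles[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def average_linkage(profiles):
    """Repeatedly merge the two closest clusters; return leaf order and merge log."""
    clusters = [[name] for name in profiles]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(profiles, *pair),
        )
        merges.append(
            {"left": a, "right": b, "distance": round(cluster_distance(profiles, a, b), 4)}
        )
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return clusters[0], merges


order, merges = average_linkage(PROFILES)
# The JSON payload is what a front-end heatmap could consume to order its columns.
payload = json.dumps({"order": order, "merges": merges})
print(payload)
```

In this toy example the two transcriptome profiles merge first and the two proteome profiles second, so the leaf order keeps each pair adjacent in the heatmap.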

755

756 Principal Component Analysis (three.js - 3D visualization)

757 The visualization of the result of Principal Component Analysis (PCA) is 3-

758 dimensional, with each dimension representing one of the first three Principal

759 Components (PCs). This is achieved through the implementation of three.js, a

760 JavaScript library that enables animated 3D graphics to be created and displayed in a

761 web browser. It starts with building a “scene”, or a canvas, on which 3D objects will be

762 created. Then a “camera” is set up that controls the view of objects on the scene from

763 the users’ perspective, such as the field of view (width, height, depth) and its ratio; and

764 a “renderer” that renders the scene at short time intervals so objects are displayed as

765 animated objects (either animated by themselves or moved around in the

766 scene by users). Objects of different texture, geometry and color can now be added

767 to and rendered on the scene. Finally, the scene with objects is attached to the DOM

768 of a webpage to become visible. In the PCA visualization, each dataset is represented

769 and rendered as a small sphere, with coordinates (x, y, z) depending on the values of

770 its first three PCs, and colors on the grouping of that dataset. When users drag around

771 on the canvas or zoom in or out, all objects are re-rendered in such a way that the

772 scene appears to be a 3-dimensional space.
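The mapping from PCA results to rendered spheres can be sketched on the Python side as follows. This is a minimal illustration and not the mitoXplorer implementation: the function name `pca_points_to_spheres`, the field names, the sphere radius and the color palette are all assumptions. Only the idea that each dataset becomes one sphere, positioned by its first three PCs and colored by its group, is taken from the description above.

```python
import json

# Hypothetical palette; the colors actually used by mitoXplorer are not specified.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728"]


def pca_points_to_spheres(points, radius=0.05):
    """Turn PCA results into sphere descriptors for a 3D front-end.

    `points` is a list of (dataset_name, group, (pc1, pc2, pc3)) tuples.
    Each group is assigned one palette color, in order of first appearance.
    """
    group_colors = {}
    spheres = []
    for name, group, (pc1, pc2, pc3) in points:
        if group not in group_colors:
            group_colors[group] = PALETTE[len(group_colors) % len(PALETTE)]
        spheres.append({
            "name": name,
            "x": pc1, "y": pc2, "z": pc3,  # one axis per principal component
            "radius": radius,
            "color": group_colors[group],
        })
    return spheres


# Invented example values, for illustration only.
example = [
    ("sample_A", "control", (0.2, -1.1, 0.4)),
    ("sample_B", "trisomy", (1.5, 0.3, -0.7)),
    ("sample_C", "control", (0.1, -0.9, 0.5)),
]
payload = json.dumps(pca_points_to_spheres(example))
```

A front-end could iterate over such a payload and create one sphere mesh per entry; how mitoXplorer actually transfers these values is not described here.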

MitoXplorer Database (back-end)

A MySQL database hosted at the back-end of mitoXplorer contains the interactomes of mito-genes, including the mito-process and the interactions between gene products, as well as the expression and mutation data from public databases. Each entry of the expression and mutation data has a foreign key link to the interactome and to the file directory (dataset table). This ensures that the expression and mutation data are updated together with the interactome, or when a dataset is updated or deleted. Users can upload their own differential expression and/or mutation data, which are processed and integrated with the interactome by extracting mito-genes, and stored in the mitoXplorer database for up to 7 days.
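The effect of such foreign key links can be illustrated with a small sketch. It uses SQLite from the Python standard library as a stand-in for MySQL, and the table names, column names and the cascading delete rule are assumptions chosen for illustration; the actual mitoXplorer schema is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE interactome (gene_id INTEGER PRIMARY KEY, symbol TEXT, mito_process TEXT);
CREATE TABLE dataset (dataset_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE expression (
    gene_id INTEGER NOT NULL REFERENCES interactome(gene_id) ON DELETE CASCADE,
    dataset_id INTEGER NOT NULL REFERENCES dataset(dataset_id) ON DELETE CASCADE,
    log2fc REAL
);
""")

# Hypothetical rows: one gene, one uploaded dataset, one expression value.
conn.execute("INSERT INTO interactome VALUES (1, 'GENE_A', 'process_1')")
conn.execute("INSERT INTO dataset VALUES (10, 'user_upload')")
conn.execute("INSERT INTO expression VALUES (1, 10, -0.8)")

# Deleting the dataset removes its dependent expression rows as well.
conn.execute("DELETE FROM dataset WHERE dataset_id = 10")
remaining = conn.execute("SELECT COUNT(*) FROM expression").fetchone()[0]
```

With this design, an expression row can never reference a gene or dataset that does not exist, which is one way to keep uploaded data consistent with the interactome.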

Data analysis and communication between front- and back-end

A Python application serves as a bridge between the front- and back-end of mitoXplorer. When a user requests database access or an analysis at the web interface, an asynchronous AJAX call is made to the Python application, so that the request is processed in the background and the webpage is updated without reloading. The Python application then processes the request by connecting to the MySQL database and analyzing the data retrieved from it. The application also handles user uploads (e.g. data cleaning) before saving them to the MySQL database. The main libraries used by the Python application for analysis include: 1) Scikit-learn: a machine learning library that provides tools for PCA, used to perform dimensionality reduction on the expression of all mito-genes and of each mito-process. The first three principal components are extracted for each dataset. 2) SciPy: a mathematical library that provides modules for hierarchical clustering, used to calculate 2-dimensional distance matrices between genes and between datasets based on expression values, for each mito-process. 3) Seaborn: a statistical visualization library built on top of Matplotlib, used to create heatmaps from the results. All results are produced in JSON format and sent via HTTP back to the front-end, where they are visualized with JavaScript.
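The analysis steps listed above can be sketched as follows. This is a minimal example under stated assumptions, not the code of the mitoXplorer application: the expression matrix is random toy data, the Euclidean metric and average linkage are assumed choices, and the JSON field names are hypothetical. It requires NumPy, SciPy and scikit-learn.

```python
import json

import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist, squareform
from sklearn.decomposition import PCA

# Toy matrix: rows are datasets, columns are mito-genes (values invented).
datasets = ["d1", "d2", "d3", "d4"]
genes = ["g1", "g2", "g3", "g4", "g5"]
rng = np.random.default_rng(0)
expression = rng.normal(size=(len(datasets), len(genes)))

# 1) PCA: keep the first three principal components of each dataset.
pcs = PCA(n_components=3).fit_transform(expression)

# 2) Hierarchical clustering: distance matrices between datasets and
#    between genes, plus the leaf order of each dendrogram.
dataset_dist = squareform(pdist(expression, metric="euclidean"))
gene_dist = squareform(pdist(expression.T, metric="euclidean"))
dataset_order = leaves_list(linkage(expression, method="average"))
gene_order = leaves_list(linkage(expression.T, method="average"))

# 3) Serialize everything to JSON for the front-end.
result = json.dumps({
    "pca": {name: pc.tolist() for name, pc in zip(datasets, pcs)},
    "dataset_order": [datasets[i] for i in dataset_order],
    "gene_order": [genes[i] for i in gene_order],
    "dataset_distances": dataset_dist.tolist(),
    "gene_distances": gene_dist.tolist(),
})
```

The leaf orders give the row and column arrangement a clustered heatmap would use; the distance matrices are included so that the front-end does not need to recompute them.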

The usage of mitoXplorer does not require installation or programming knowledge. Documentation and tutorials are available online and on GitLab (https://gitlab.com/habermannlab/mitox). MitoXplorer is also available for download and installation on a local server, if users wish to build their own gene list and apply the interactive features and database of mitoXplorer, which stores the available expression and mutation data for all genes. Setup instructions are also available on GitLab (https://gitlab.com/habermannlab/mitox). We also provide a Docker version of mitoXplorer (https://gitlab.com/habermannlab/mitox, branch docker-version).

Transcriptomics and proteomics of aneuploid cell lines

The proteome analysis of the trisomic cell lines was previously described [44,45].

The raw reads from RNA-sequencing were processed to remove low-quality reads and adapter sequences, and aligned to the reference genome (hg19) with TopHat2 [89]. The Cufflinks package [90] was used to calculate the expression difference between two samples (aneuploid vs diploid) with multiple replicates and to test its statistical significance. Transcriptome and proteome information is available in public repositories: NGS data have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession numbers GSE102855 and GSE131249. Data from Kühl et al. [46], as well as Liu et al. [78], Letourneau et al. [41] and Sullivan et al. [42] were uploaded as provided by the authors.


Data processing of public data and correlational analysis

The public NGS datasets on mitoXplorer were downloaded from GEO; only RNA-seq data, not microarray data, are currently uploaded on mitoXplorer. The pre-analyzed data were downloaded and transformed to transcripts per million (TPM). Log2 fold changes (log2FC) were calculated for each disease sample, using the corresponding diploid sample as control (or the mean of the normal samples if there were no paired samples). Metadata of the samples (e.g. cell types) were also downloaded and stored in the mitoXplorer database. The links to the experiments for each dataset are available on the DATABASE summary page of mitoXplorer. TCGA differential expression data were downloaded from the NCI GDC Data Portal (https://portal.gdc.cancer.gov/). For calculating differential expression, the log2FC was calculated from TPM for each paired sample.
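The log2FC calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the pseudocount of 1 is an assumption added to avoid taking the logarithm of zero (the value used for mitoXplorer, if any, is not stated), the function names are hypothetical, and `counts_to_tpm` shows the standard TPM definition starting from read counts, which may differ from the form of the downloaded pre-analyzed data.

```python
import math

# Assumption: added to numerator and denominator to avoid log2(0).
PSEUDOCOUNT = 1.0


def counts_to_tpm(counts, lengths_kb):
    """Standard TPM: length-normalized rates rescaled to sum to one million."""
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def log2fc(sample_tpm, control_tpm, pseudocount=PSEUDOCOUNT):
    """Log2 fold change of one gene between a sample and its paired control."""
    return math.log2((sample_tpm + pseudocount) / (control_tpm + pseudocount))


def log2fc_vs_controls(sample_tpm, control_tpms, pseudocount=PSEUDOCOUNT):
    """Use the mean of the normal samples when no paired control exists."""
    mean_control = sum(control_tpms) / len(control_tpms)
    return log2fc(sample_tpm, mean_control, pseudocount)
```

For example, with the assumed pseudocount, a gene at 7 TPM in the disease sample and 1 TPM in the control yields log2(8/2) = 2.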

Cell culture and treatment

The human cell line RPE-1 hTERT (referred to as RPE) was a kind gift from Stephen Taylor (University of Manchester, UK). Human HCT116 cells (referred to as HCT) were obtained from ATCC (No. CCL-247). Trisomic cell lines were generated by microcell-mediated chromosome transfer as described previously [45]. The A9 donor mouse cell lines were purchased from the Health Science Research Resources Bank (HSRRB), Osaka 590-0535, Japan. All cell lines were maintained at 37°C in a 5% CO2 atmosphere in Dulbecco’s Modified Eagle Medium (DMEM) containing 10% fetal bovine serum (FBS), 100 U penicillin and 100 U streptomycin.

MitoTracker staining and imaging

Mitochondria were stained in 96-well plates. The cells were incubated for 30 min at 37°C with 100 nM MitoTracker Deep Red FM (M22426, Invitrogen) dye prior to fixation. Cells were fixed with 3% PFA in DMEM for 5 min at room temperature. After washing twice with 1x PBST, 1x PBS containing 0.01% sodium azide was added. Plates were stored at 4°C in the dark. Imaging was carried out on an inverted Zeiss Observer.Z1 microscope with a spinning disc and 473 nm, 561 nm and 660 nm argon laser lines. Imaging devices were controlled, and images were captured, stored and processed, using the SlideBook software and Fiji [91]. The images were captured automatically on multiple focal planes (step size 700 nm) with a 40x magnification air objective.

Metabolic profiling of wild-type and T21 cell lines

RPE and HCT cells and their T21 derivatives were seeded at 25,000 or 36,000 cells/well, respectively, on XF96 cell plates (Seahorse Bioscience, Agilent Technologies) 30 hours before being assayed. Optimization of reagents as well as CCCP and digitonin titrations were performed as described in the manufacturer’s protocols (Seahorse Bioscience). The experiments were performed using the mitochondrial and glycolytic stress test assay protocols as suggested by the manufacturer (Seahorse Bioscience, Agilent Technologies). Using the Seahorse Bioscience XF Extracellular Flux Analyzer, the rates of cellular oxidative phosphorylation (oxygen consumption rate, OCR) and glycolysis (cellular proton production rate, PPR) were measured simultaneously.

For OCR measurement, DMEM medium was supplemented with 25 mM glucose, 1 mM pyruvate and 2 mM glutamine. The basal rate was recorded and additions for the mito stress test were as follows: 1.5 µM oligomycin, CCCP, 2 µM rotenone + 4 µM antimycin A. For PPR measurement, DMEM medium was supplemented with 2 mM glutamine. The basal rate was recorded and additions for the glycolysis stress test were as follows: 10 mM glucose, 1.5 µM oligomycin and 100 mM 2-deoxyglucose.

For intact cells, the CCCP concentrations were 7 and 1.5 µM for RPE1 and HCT116 cells, respectively. The assays of intact cells were performed in 96-well plates with at least 10 replicates per cell line. For the permeabilized RPE1 cell lines, the CCCP and digitonin concentrations were 10 µM and 40 µM, respectively. In the case of permeabilized HCT116 cell lines, the CCCP and digitonin concentrations were 12 and 50 µM, respectively. For OCR measurement, mannitol-sucrose buffer (MAS) was prepared according to Seahorse Bioscience. For permeabilization, digitonin was added to MAS buffer together with the respective respiratory substrates: 10 mM pyruvate / 2 mM malate, 10 mM succinate / 2 µM rotenone or 0.5 mM TMPD / 2 mM ascorbate / 2 µM antimycin A. Basal respiration was recorded, as were additions of 4 mM ADP, 1.5 µM oligomycin, CCCP and 2 µM rotenone ± 4 µM antimycin A or 20 mM Na-azide. The assays in permeabilized cells were performed in poly-D-lysine-coated 96-well plates with at least 5 replicates per cell line.

Normalization was performed with the CyQuant cell proliferation assay kit (Life Technologies) in the same plate used for the assay of intact cells, and in a parallel plate for the permeabilized cells. Data analysis was done according to [92].

The mitoMorph plug-in for morphological characterization of mitochondria by image analysis

Classification and measurement of mitochondria were performed using the software ImageJ [93], complemented with all the default plugins provided by Fiji [91] and with the additional plugin FeatureJ. A set of functions was developed to assist the user in the preparation and analysis of the data, in either interactive or batch processing mode.

Using this toolset, after all the cells of interest were manually outlined in each image, the mitochondria were segmented and characterized. For each processing step, the algorithms used are reported as described in ImageJ, and their parameters are specified in physical units.

The images were pre-processed by first suppressing the background signal (rolling ball background subtraction, kernel radius: 2.5 µm) and then enhancing the mitochondria signal (Laplacian of Gaussian, smoothing scale: 1 µm, followed by contrast limited adaptive histogram equalization, CLAHE, kernel size: 2.5 µm). Mitochondria candidates were obtained by segmentation, using the Yen thresholding algorithm [94], and subjected to classification based on a set of determined features.

Objects that were too small were excluded from the analysis, and the remaining ones were assigned to one of four categories: networked, puncta, rods and swollen [82]. Objects that were quasi-round, compact in intensity, and larger than the puncta were classified as swollen. All objects with an intermediate phenotype between fragmented puncta and network of filaments were classified as rods.
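The assignment of objects to categories can be sketched as a sequential cascade of checks. This is a hedged illustration only: every threshold below is a placeholder and not a value from Supplementary Table S3, the order of the checks is assumed, the class and function names are hypothetical, and the "compact in intensity" criterion is not modeled. Circularity and solidity follow the usual ImageJ shape descriptor definitions.

```python
import math
from dataclasses import dataclass


@dataclass
class MitoObject:
    area: float         # in µm^2
    perimeter: float    # in µm
    convex_area: float  # area of the convex hull, in µm^2
    extension: float    # longest shortest-path of the skeleton, in µm

    @property
    def circularity(self):
        # ImageJ definition: 4*pi*area / perimeter^2 (1.0 for a perfect circle)
        return 4 * math.pi * self.area / self.perimeter ** 2

    @property
    def solidity(self):
        # ImageJ definition: area divided by the area of the convex hull
        return self.area / self.convex_area


# All thresholds are placeholders, NOT the values from Supplementary Table S3.
MIN_AREA = 0.05
PUNCTA_MAX_AREA = 0.5
ROUND_MIN_CIRCULARITY = 0.8
SWOLLEN_MIN_SOLIDITY = 0.9
NETWORK_MIN_EXTENSION = 5.0


def classify(obj):
    """Check one set of criteria per class, in a fixed (assumed) order."""
    if obj.area < MIN_AREA:
        return None  # too small: excluded from the analysis
    is_round = obj.circularity >= ROUND_MIN_CIRCULARITY
    if is_round and obj.area <= PUNCTA_MAX_AREA:
        return "puncta"
    if is_round and obj.solidity >= SWOLLEN_MIN_SOLIDITY:
        return "swollen"  # quasi-round and larger than puncta
    if obj.extension >= NETWORK_MIN_EXTENSION:
        return "networked"
    return "rods"  # intermediate between puncta and networks
```

Because the checks run in order, an object is assigned to the first class whose criteria it satisfies, and "rods" acts as the fallback for intermediate phenotypes.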

Classification was performed by sequentially verifying different selection criteria, one set for each class, based on the following measured features: area (A), aspect ratio (AR), circularity (C), solidity (S), minimum Feret diameter (here indicated as minimum linear extension, MLE) and longest shortest-path (here indicated as extension, E). While all the other measures are derived directly from the segmentation, the extension is measured as the longest shortest-path between any two end points in the skeleton derived from the segmentation. The selection criteria are evaluated sequentially as reported in Additional File 4, Supplementary Table S3.
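The extension measure can be sketched as follows. This is a minimal Python illustration and not the mitoMorph code (which is written in Groovy and ImageJ Macro): the skeleton is assumed to be given as a set of (row, column) pixel coordinates, 8-connectivity with Euclidean step lengths is an assumption, and end points are taken to be skeleton pixels with exactly one neighbor. All function names are hypothetical.

```python
import heapq
import math


def neighbors(pixel, skeleton):
    """Yield the 8-connected skeleton neighbors of a pixel with step lengths."""
    r, c = pixel
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) != (0, 0) and (r + dr, c + dc) in skeleton:
                yield (r + dr, c + dc), math.hypot(dr, dc)


def shortest_paths(start, skeleton):
    """Dijkstra over the skeleton pixels; distances are in pixels."""
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, pixel = heapq.heappop(queue)
        if d > dist[pixel]:
            continue  # stale queue entry
        for nxt, step in neighbors(pixel, skeleton):
            if d + step < dist.get(nxt, math.inf):
                dist[nxt] = d + step
                heapq.heappush(queue, (d + step, nxt))
    return dist


def extension(skeleton, pixel_size=1.0):
    """Longest shortest-path between any two end points of a skeleton.

    Returns 0.0 if the skeleton has no end points (e.g. a closed loop).
    """
    skeleton = set(skeleton)
    end_points = [p for p in skeleton
                  if sum(1 for _ in neighbors(p, skeleton)) == 1]
    longest = 0.0
    for start in end_points:
        dist = shortest_paths(start, skeleton)
        longest = max(longest, max(dist.get(e, 0.0) for e in end_points))
    return longest * pixel_size
```

Passing the physical pixel size as `pixel_size` returns the extension in physical units, in line with the convention of specifying parameters in physical units.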

We would like to note that the analysis of mitochondrial morphology on projected images is limited, as mitochondrial structures might not be resolved properly.

Image analysis using mitoMorph and data processing

Image processing and analysis were done in Fiji. Image stacks were Z-projected, cells were manually selected and the resulting images were saved for further batch processing using mitoMorph. The resulting network statistics of mitochondrial features for each individual cell were used for further processing (Additional File 4, Supplementary Table S3). All statistical processing and data visualization of mitoMorph results were done using R.

Declarations

Acknowledgements

We thank Michael Volkmer for help and advice in web-server management and development, and Stephen Taylor (University of Manchester, UK) for providing cell lines. We thank Alice Carrier, Friedhelm Pfeiffer and Frank Schnorrer for critical reading of the manuscript. We thank the Max Planck Society, the Max Planck Institute of Biochemistry, Aix-Marseille University, the CNRS and the Institute of Developmental Biology Marseille (IBDM) for their support.

Funding

This work was supported by DFG grant HA 6905/2–1 from the German Research Foundation, the A*MIDEX grant 2HABERRE/RHRE/ID17HRU288 from Aix-Marseille University and ANR grant ANR-18-CE45-0016-01 (to BHH); the ERASMUS+ Traineeship program (to AY); the Munich Center for Systems Neurology (SyNergy EXC 1010) and the Bert L & N Kuggie Vallee Foundation (to FP); and the Bavarian Molecular Biosystems Research Network D2-F5121.2-10c/4822 (to CM).

Availability of data and materials

The mitoXplorer web-server is freely available at http://mitoxplorer.ibdm.univ-mrs.fr/. The source code of mitoXplorer is available at https://gitlab.com/habermannlab/mitox. The pipeline for differential expression analysis and mutation calling of RNA-seq data is available at https://gitlab.com/habermannlab/mitox_rnaseq_pipeline. MitoMorph is freely available at https://github.com/giocard/mitoMorph. RNA-seq data published with this study are available via the Gene Expression Omnibus (GEO) database (accession number: GSE131249).

Details on software provided in this manuscript:

Project name: mitoXplorer
Project home page: http://mitoxplorer.ibdm.univ-mrs.fr/
Archived version: https://gitlab.com/habermannlab/mitox
Operating system(s): Platform independent
Programming language: JavaScript, PHP, MySQL, Python
Other requirements: none
License: GNU Public License
Any restrictions to use by non-academics: none

Project name: mitoMorph
Project home page: https://github.com/giocard/mitoMorph
Archived version: https://github.com/giocard/mitoMorph
Operating system(s): Platform independent
Programming language: Groovy, ImageJ Macro
Other requirements: ImageJ/Fiji software
License: GNU Public License
Any restrictions to use by non-academics: none

Authors’ contributions

AY and PK were the main developers of the mitoXplorer web-server, with the help of JV, SG and AB. JV developed the interactome view of the web-server. Data analysis was done by AY, PK, MD and BHH. MitoTracker staining and imaging were carried out by MD; CM and FP carried out metabolic measurements. Handling of cells and cell culture was done by MD and ZS. GC conceived and developed the mitoMorph Fiji plugin; image analysis with mitoMorph was done by BHH. The project was conceived by BHH; the manuscript was written by AY and BHH with contributions from SZ, CG, MD and CM. All authors read and approved the final version of the manuscript.

Ethics approval and consent to participate

Not applicable.

Competing interests

We declare that no competing interests exist.

Additional files

Additional file 1: Figures S1-S4. Supplementary figures. (PDF format).
Additional file 2: Supplementary table S1 a-c. (Excel format).
Additional file 3: Supplementary table S2 a-e. (Excel format).
Additional file 4: Supplementary table S3 a-d. (Excel format).


References

1. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. Nature Publishing Group; 2013;45:1113–20.

2. Zhang J, Baran J, Cros A, Guberman JM, Haider S, Hsu J, et al. International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data. Database (Oxford). 2011;2011:bar026–6.

3. Krempel R, Kulkarni P, Yim A, Lang U, Habermann B, Frommolt P. Integrative analysis and machine learning on cancer genomics data using the Cancer Systems Biology Database (CancerSysDB). BMC Bioinformatics. BioMed Central; 2018;19:156.

4. Klonowska K, Czubak K, Wojciechowska M, Handschuh L, Zmienko A, Figlerowicz M, et al. Oncogenomic portals for the visualization and analysis of genome-wide cancer data. Oncotarget. 2016;7:176–92.

5. Papatheodorou I, Fonseca NA, Keays M, Tang YA, Barrera E, Bazant W, et al. Expression Atlas: gene and protein expression across multiple studies and organisms. Nucleic Acids Res. 2018;46:D246–51.

6. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. Oxford University Press; 2002;30:207–10.

7. Kodama Y, Mashima J, Kaminuma E, Gojobori T, Ogasawara O, Takagi T, et al. The DNA Data Bank of Japan launches a new resource, the DDBJ Omics Archive of functional genomics experiments. Nucleic Acids Res. 2012;40:D38–42.

8. Parkinson H, Sarkans U, Kolesnikov N, Abeygunawardena N, Burdett T, Dylag M, et al. ArrayExpress update--an archive of microarray and high-throughput sequencing-based functional genomics experiments. Nucleic Acids Res. 2011;39:D1002–4.

9. Simoff SJ, Böhlen MH, Mazeika A. Visual Data Mining: An Introduction and Overview. Visual Data Mining. Berlin, Heidelberg: Springer Berlin Heidelberg; 2008. pp. 1–12.

10. Scheffler IE. Mitochondria. Hoboken, NJ, USA: John Wiley & Sons, Inc; 2007.

11. Nunnari J, Suomalainen A. Mitochondria: in sickness and in health. Cell. 2012;148:1145–59.

12. Suomalainen A, Battersby BJ. Mitochondrial diseases: the contribution of organelle stress responses to pathology. Nat. Rev. Mol. Cell Biol. 2018;19:77–92.

13. Zong W-X, Rabinowitz JD, White E. Mitochondria and Cancer. Mol. Cell. 2016;61:667–76.

14. Wallace DC. Mitochondria and cancer. Nat. Rev. Cancer. Nature Publishing Group; 2012;12:685–98.


15. Schapira AHV. Mitochondrial diseases. Lancet. 2012;379:1825–34.

16. Mannella CA. Structural diversity of mitochondria: functional implications. Ann. N. Y. Acad. Sci. Wiley/Blackwell; 2008;1147:171–9.

17. Vafai SB, Mootha VK. Mitochondrial disorders as windows into an ancient organelle. Nature. Nature Publishing Group; 2012;491:374–83.

18. Wai T, Langer T. Mitochondrial Dynamics and Metabolic Regulation. Trends Endocrinol. Metab. 2016;27:105–17.

19. Benard G, Bellance N, James D, Parrone P, Fernandez H, Letellier T, et al. Mitochondrial bioenergetics and structural network organization. J. Cell. Sci. The Company of Biologists Ltd; 2007;120:838–48.

20. Woods DC. Mitochondrial Heterogeneity: Evaluating Mitochondrial Subpopulation Dynamics in Stem Cells. Stem cells international. Hindawi; 2017;2017:7068567–7.

21. Mootha VK, Bunkenborg J, Olsen JV, Hjerrild M, Wisniewski JR, Stahl E, et al. Integrated analysis of protein composition, tissue diversity, and gene regulation in mouse mitochondria. Cell. 2003;115:629–40.

22. Jensen RE, Dunn CD, Youngman MJ, Sesaki H. Mitochondrial building blocks. Trends Cell Biol. 2004;14:215–8.

23. Pagliarini DJ, Calvo SE, Chang B, Sheth SA, Vafai SB, Ong S-E, et al. A mitochondrial protein compendium elucidates complex I disease biology. Cell. 2008;134:112–23.

24. Calvo SE, Clauser KR, Mootha VK. MitoCarta2.0: an updated inventory of mammalian mitochondrial proteins. Nucleic Acids Res. 2016;44:D1251–7.

25. Gray MW. Mosaic nature of the mitochondrial proteome: Implications for the origin and evolution of mitochondria. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10133–8.

26. Meisinger C, Sickmann A, Pfanner N. The mitochondrial proteome: from inventory to function. Cell. 2008;134:22–4.

27. Lotz C, Lin AJ, Black CM, Zhang J, Lau E, Deng N, et al. Characterization, design, and function of the mitochondrial proteome: from organs to organisms. J. Proteome Res. American Chemical Society; 2014;13:433–46.

28. Gaucher SP, Taylor SW, Fahy E, Zhang B, Warnock DE, Ghosh SS, et al. Expanded coverage of the human heart mitochondrial proteome using multidimensional liquid chromatography coupled with tandem mass spectrometry. J. Proteome Res. 2004;3:495–505.

29. Taylor SW, Fahy E, Zhang B, Glenn GM, Warnock DE, Wiley S, et al. Characterization of the human heart mitochondrial proteome. Nat. Biotechnol. Nature Publishing Group; 2003;21:281–6.


30. Gonczarowska-Jorge H, Zahedi RP, Sickmann A. The proteome of baker's yeast mitochondria. . 2017;33:15–21.

31. Kolesnikov AA, Gerasimov ES. Diversity of mitochondrial genome organization. Biochemistry Mosc. SP MAIK Nauka/Interperiodica; 2012;77:1424–35.

32. Hällberg BM, Larsson N-G. Making proteins in the powerhouse. Cell Metab. 2014;20:226–40.

33. Catalano D, Licciulli F, Turi A, Grillo G, Saccone C, D'Elia D. MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa. BMC Bioinformatics. BioMed Central; 2006;7:36.

34. Smith AC, Robinson AJ. MitoMiner v3.1, an update on the mitochondrial proteomics database. Nucleic Acids Res. 2016;44:D1258–61.

35. Godin N, Eichler J. The Mitochondrial Protein Atlas: A Database of Experimentally Verified Information on the Human Mitochondrial Proteome. J. Comput. Biol. Mary Ann Liebert, Inc.; 2017;24:906–16.

36. Cotter D, Guda P, Fahy E, Subramaniam S. MitoProteome: mitochondrial protein sequence database and annotation system. Nucleic Acids Res. 2004;32:D463–7.

37. Guda C, Fahy E, Subramaniam S. MITOPRED: a genome-scale method for prediction of nucleus-encoded mitochondrial proteins. Bioinformatics. 2004;20:1785–94.

38. NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2018;46:D8–D13.

39. Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45:D362–8.

40. DeBerardinis RJ, Chandel NS. Fundamentals of cancer metabolism. Sci Adv. American Association for the Advancement of Science; 2016;2:e1600200.

41. Letourneau A, Santoni FA, Bonilla X, Sailani MR, Gonzalez D, Kind J, et al. Domains of genome-wide gene expression dysregulation in Down's syndrome. Nature. Nature Publishing Group; 2014;508:345–50.

42. Sullivan KD, Lewis HC, Hill AA, Pandey A, Jackson LP, Cabral JM, et al. Trisomy 21 consistently activates the interferon response. Elife. eLife Sciences Publications Limited; 2016;5:1709.

43. Lane AA, Chapuy B, Lin CY, Tivey T, Li H, Townsend EC, et al. Triplication of a 21q22 region contributes to B cell transformation through HMGN1 overexpression and loss of histone H3 Lys27 trimethylation. Nat. Genet. Nature Publishing Group; 2014;46:618–23.


44. Dürrbaum M, Kuznetsova AY, Passerini V, Stingele S, Stoehr G, Storchova Z. Unique features of the transcriptional response to model aneuploidy in human cells. BMC Genomics. BioMed Central; 2014;15:139.

45. Stingele S, Stoehr G, Peplowska K, Cox J, Mann M, Storchova Z. Global analysis of genome, transcriptome and proteome reveals the response to aneuploidy in human cells. Mol. Syst. Biol. EMBO Press; 2012;8:608.

46. Kühl I, Miranda M, Atanassov I, Kuznetsova I, Hinze Y, Mourier A, et al. Transcriptomic and proteomic landscape of mitochondrial dysfunction reveals secondary coenzyme Q deficiency in mammals. Elife. eLife Sciences Publications Limited; 2017;6:1494.

47. Huang W, Carbone MA, Magwire MM, Peiffer JA, Lyman RF, Stone EA, et al. Genetic basis of transcriptome diversity in Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. National Academy of Sciences; 2015;112:E6010–9.

48. Spletter ML, Barz C, Yeroslaviz A, Zhang X, Lemke SB, Bonnard A, et al. A transcriptomics resource reveals a transcriptional transition during ordered sarcomere morphogenesis in flight muscle. Elife. eLife Sciences Publications Limited; 2018;7:1361.

49. Valenti D, de Bari L, De Filippis B, Henrion-Caude A, Vacca RA. Mitochondrial dysfunction as a central actor in intellectual disability-related diseases: an overview of Down syndrome, autism, Fragile X and Rett syndrome. Neurosci Biobehav Rev. 2014;46 Pt 2:202–17.

50. Tiano L, Busciglio J. Mitochondrial dysfunction and Down's syndrome: is there a role for coenzyme Q(10)? Littarru GP, editor. Biofactors. Wiley-Blackwell; 2011;37:386–92.

51. Pagano G, Castello G. Oxidative stress and mitochondrial dysfunction in Down syndrome. Adv. Exp. Med. Biol. New York, NY: Springer US; 2012;724:291–9.

52. Ogawa O, Perry G, Smith MA. The “Down's” side of mitochondria. Dev. Cell. 2002;2:255–6.

53. Prince J, Jia S, Båve U, Annerén G, Oreland L. Mitochondrial enzyme deficiencies in Down's syndrome. J Neural Transm Park Dis Dement Sect. 1994;8:171–81.

54. Roat E, Prada N, Ferraresi R, Giovenzana C, Nasi M, Troiano L, et al. Mitochondrial alterations and tendency to apoptosis in peripheral blood cells from children with Down syndrome. FEBS Lett. 2007;581:521–5.

55. Piccoli C, Izzo A, Scrima R, Bonfiglio F, Manco R, Negri R, et al. Chronic pro-oxidative state and mitochondrial dysfunctions are more pronounced in fibroblasts from Down syndrome foeti with congenital heart defects. Hum. Mol. Genet. 2013;22:1218–32.

56. Phillips AC, Sleigh A, McAllister CJ, Brage S, Carpenter TA, Kemp GJ, et al. Defective mitochondrial function in vivo in skeletal muscle in adults with Down's syndrome: a 31P-MRS study. Dzeja P, editor. PLoS ONE. Public Library of Science; 2013;8:e84031.

57. Aburawi EH, Souid A-K. Lymphocyte respiration in children with Trisomy 21. BMC Pediatr. BioMed Central; 2012;12:193.

58. Valenti D, Manente GA, Moro L, Marra E, Vacca RA. Deficit of complex I activity in human skin fibroblasts with chromosome 21 trisomy and overproduction of reactive oxygen species by mitochondria: involvement of the cAMP/PKA signalling pathway. Biochem. J. Portland Press Limited; 2011;435:679–88.

59. Valenti D, Tullo A, Caratozzolo MF, Merafina RS, Scartezzini P, Marra E, et al. Impairment of F1F0-ATPase, adenine nucleotide translocator and adenylate kinase causes mitochondrial energy deficit in human skin fibroblasts with chromosome 21 trisomy. Biochem. J. electronic edn. Portland Press Limited; 2010;431:299–310.

60. Abu Faddan N, Sayed D, Ghaleb F. T lymphocytes apoptosis and mitochondrial membrane potential in Down's syndrome. Fetal Pediatr Pathol. 2011;30:45–52.

61. Izzo A, Nitti M, Mollo N, Paladino S, Procaccini C, Faicchia D, et al. Metformin restores the mitochondrial network and reverses mitochondrial dysfunction in Down syndrome cells. Hum. Mol. Genet. 2017;26:1056–69.

62. Busciglio J, Pelsman A, Wong C, Pigino G, Yuan M, Mori H, et al. Altered metabolism of the amyloid beta precursor protein is associated with mitochondrial dysfunction in Down's syndrome. Neuron. 2002;33:677–88.

63. Lockstone HE, Harris LW, Swatton JE, Wayland MT, Holland AJ, Bahn S. Gene expression profiling in the adult Down syndrome brain. Genomics. 2007;90:647–60.

64. Halevy T, Biancotti J-C, Yanuka O, Golan-Lev T, Benvenisty N. Molecular Characterization of Down Syndrome Embryonic Stem Cells Reveals a Role for RUNX1 in Neural Differentiation. Stem Cell Reports. 2016;7:777–86.

65. Olmos-Serrano JL, Kang HJ, Tyler WA, Silbereis JC, Cheng F, Zhu Y, et al. Down Syndrome Developmental Brain Transcriptome Reveals Defective Oligodendrocyte Differentiation and Myelination. Neuron. 2016;89:1208–22.

66. Jiang J, Jing Y, Cost GJ, Chiang J-C, Kolpa HJ, Cotton AM, et al. Translating dosage compensation to trisomy 21. Nature. Nature Publishing Group; 2013;500:296–300.

67. Helguera P, Seiglie J, Rodriguez J, Hanna M, Helguera G, Busciglio J. Adaptive downregulation of mitochondrial function in down syndrome. Cell Metab. 2013;17:132–40.

68. Ripoll C, Rivals I, Ait Yahya-Graison E, Dauphinot L, Paly E, Mircher C, et al. Molecular signatures of cardiac defects in Down syndrome lymphoblastoid cell lines suggest altered ciliome and Hedgehog pathways. Veitia RA, editor. PLoS ONE. Public Library of Science; 2012;7:e41616.


69. Li C, Jin L, Bai Y, Chen Q, Fu L, Yang M, et al. Genome-wide expression analysis in Down syndrome: insight into immunodeficiency. Rogers LK, editor. PLoS ONE. 2012;7:e49130.

70. Chou CY, Liu LY, Chen CY, Tsai CH, Hwa HL, Chang LY, et al. Gene expression variation increase in trisomy 21 tissues. Mamm. Genome. 2008;19:398–405.

71. Altug-Teber O, Bonin M, Walter M, Mau-Holzmann UA, Dufke A, Stappert H, et al. Specific transcriptional changes in human fetuses with autosomal trisomies. Cytogenet. Genome Res. Karger Publishers; 2007;119:171–84.

72. Conti A, Fabbrini F, D'Agostino P, Negri R, Greco D, Genesio R, et al. Altered expression of mitochondrial and extracellular matrix genes in the heart of human fetuses with chromosome 21 trisomy. BMC Genomics. BioMed Central; 2007;8:268.

73. Mao R, Wang X, Spitznagel EL, Frelin LP, Ting JC, Ding H, et al. Primary and secondary transcriptional effects in the developing human Down syndrome brain and heart. Genome Biol. BioMed Central; 2005;6:R107.

74. Hibaoui Y, Grad I, Letourneau A, Sailani MR, Dahoun S, Santoni FA, et al. Modelling and rescuing neurodevelopmental defect of Down syndrome using induced pluripotent stem cells from monozygotic twins discordant for trisomy 21. EMBO Mol Med. EMBO Press; 2014;6:259–77.

75. Engidawork E, Gulesserian T, Fountoulakis M, Lubec G. Aberrant protein expression in cerebral cortex of fetus with Down syndrome. Neuroscience. 2003;122:145–54.

76. Cheon MS, Fountoulakis M, Dierssen M, Ferreres JC, Lubec G. Expression profiles of proteins in fetal brain with Down syndrome. J. Neural Transm. Suppl. 2001;:311–9.

77. Cabras T, Pisano E, Montaldo C, Giuca MR, Iavarone F, Zampino G, et al. Significant modifications of the salivary proteome potentially associated with complications of Down syndrome revealed by top-down proteomics. Mol. Cell Proteomics. 2013;12:1844–52.

78. Liu Y, Borel C, Li L, Müller T, Williams EG, Germain P-L, et al. Systematic proteome and proteostasis profiling in human Trisomy 21 fibroblast cells. Nat Commun. Nature Publishing Group; 2017;8:1212.

79. Sullivan KD, Evans D, Pandey A, Hraha TH, Smith KP, Markham N, et al. Trisomy 21 causes changes in the circulating proteome indicative of chronic autoinflammation. Sci Rep. Nature Publishing Group; 2017;7:14818.

80. Chacinska A, Koehler CM, Milenkovic D, Lithgow T, Pfanner N. Importing mitochondrial proteins: machineries and mechanisms. Cell. 2009;138:628–44.

81. Sylvester JE, Fischel-Ghodsian N, Mougey EB, O'Brien TW. Mitochondrial ribosomal proteins: candidate genes for mitochondrial disease. Genet. Med. 2004;6:73–80.


82. Leonard AP, Cameron RB, Speiser JL, Wolf BJ, Peterson YK, Schnellmann RG, et al. Quantitative analysis of mitochondrial morphology and membrane potential in living cells using high-content imaging, machine learning, and morphological binning. Biochim. Biophys. Acta. 2015;1853:348–60.

83. Amunts A, Brown A, Toots J, Scheres SHW, Ramakrishnan V. Ribosome. The structure of the human mitochondrial ribosome. Science. American Association for the Advancement of Science; 2015;348:95–8.

84. Bogenhagen DF, Ostermeyer-Fay AG, Haley JD, Garcia-Diaz M. Kinetics and Mechanism of Mammalian Mitochondrial Ribosome Assembly. Cell Reports. 2018;22:1935–44.

85. Daily K, Patel VR, Rigor P, Xie X, Baldi P. MotifMap: integrative genome-wide maps of regulatory motif sites for model species. BMC Bioinformatics. BioMed Central; 2011;12:495.

86. Yang Z-F, Drumea K, Mott S, Wang J, Rosmarin AG. GABP transcription factor (nuclear respiratory factor 2) is required for mitochondrial biogenesis. Mol. Cell. Biol. 2014;34:3194–201.

87. Garmhausen M, Hofmann F, Senderov V, Thomas M, Kandel BA, Habermann BH. Virtual pathway explorer (viPEr) and pathway enrichment analysis tool (PEANuT): creating and analyzing focus networks to identify cross-talk between molecules and pathways. BMC Genomics. BioMed Central; 2015;16:790.

88. Bostock M, Ogievetsky V, Heer J. D³ Data-Driven Documents. IEEE Transactions on Visualization and Computer Graphics. 17:2301–9.

89. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. BioMed Central; 2013;14:R36.

90. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. Nature Publishing Group; 2012;7:562–78.

91. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods. Nature Publishing Group; 2012;9:676–82.

92. Divakaruni AS, Paradyse A, Ferrick DA, Murphy AN, Jastroch M. Analysis and interpretation of microplate-based oxygen consumption and pH data. Meth. Enzymol. Elsevier; 2014;547:309–54.

93. Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. NIH Public Access; 2012;9:671–5.

94. Yen JC, Chang FJ, Chang S. A new criterion for automatic multilevel thresholding. IEEE Trans Image Process. 1995;4:370–8.


Figure legends

Figure 1: Setup of the mitoXplorer web-based visual data mining platform. A manually curated, annotated mitochondrial interactome represents the central part of the mitoXplorer software, for which we have assembled 1166 mito-genes in human, 1161 mito-genes in mouse and 1099 mito-genes in fruit fly in 35 mitochondrial processes (mito-processes). We have connected gene products using protein-protein interactions from STRING [39]. Publicly available expression and mutation data from repositories such as TCGA or GEO are provided for data integration, analysis and visualization and are stored together with the species interactomes in a MySQL database. Users can provide their own data, which are temporarily stored and only accessible to the user. A set of Python-based scripts at the back-end of the platform handles data formatting, integration and analysis (Additional File 1, Supplementary Figure S1). The user interacts with mitoXplorer via several visual interfaces, with which private as well as public data can be analyzed, integrated and visualized. Four interactive visualization interfaces are offered: 1) the Interactome View allows at-a-glance visualization of the entire mitochondrial interactome of a single dataset (see Figure 2); 2) Comparative Plots, consisting of a scatter plot and a sortable heatmap, allow comparison of up to six datasets, whereby a single mito-process is analyzed at a time (see Figure 3); 3) Hierarchical Clustering allows comparison of a large number of datasets, whereby datasets are clustered according to their expression values. Hierarchical Clustering plots are zoomable and interactive (see Figure 4); 4) Principal Component Analysis displays PCA-analyzed datasets in 3D, providing filtering and grouping functions. There is in principle no limit to the number of datasets that can be analyzed using PCA (see Figure 5).


Figure 2: Interactome View of the mitoXplorer platform. (a) Overview of all mito-processes of one dataset. A process is shown either as one circle with colored segments according to the number of dysregulated genes or, upon clicking on the process, as all individual genes that are part of this process. (b) By clicking on the process Translation or the adjacent ‘+’, the circle is replaced by individual bubbles representing the genes of this process. Clicking on the process again, or on the adjacent ‘-’, reverts to the circular display. (c) Hovering over a gene bubble displays the name of the gene and associated information (description, chromosomal location, mitochondrial process, accession numbers, as well as log2 fold change, p-value and observed mutations), as well as all connections to mito-genes in other processes. The dataset shown compares the retinal epithelial cell line RPE1 with Trisomy 21 (RPE_T21) to wild-type RPE1 (RPE).

Figure 3: Comparative Plot of the mitoXplorer platform. (a) The Comparative Plot display is composed of a scatterplot, a sortable heatmap and a bar chart for the selection of mito-processes. The scatterplot shows the log2 fold change (y-axis) for each dataset (x-axis). Each bubble represents one gene, whereby red dots indicate downregulated and blue dots upregulated genes. The process to be shown can be selected by clicking on the process name in the bar chart next to the scatterplot; the chosen process is indicated at the top of the scatterplot. In this case, TCA cycle was chosen. The heatmap at the bottom shows the individual genes and the datasets, whereby the genes are colored according to their log2 fold change (indicated at the bottom of the plot). (b) Hovering over a gene bubble (or a gene square in the heatmap) displays available information (in the case of fly: gene name, mitochondrial process, gene description, chromosomal location, gene symbol, as well as log2 fold change, p-value and observed mutations). (c) The heatmap is sortable by log2 fold change (as indicated by the pointer in c), as well as by dataset. Clicking on one of the datasets sorts the heatmap according to the log2 fold change of all genes in this dataset, as is illustrated here. Clicking on one of the genes sorts the heatmap according to its log2 fold changes across different datasets. The time-series study of developing flight muscle was used to demonstrate the functionality of this visualization method.
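The two sorting modes described for the heatmap can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names, dataset names and values are hypothetical, the ascending sort order is assumed, and the function names `sort_genes_by_dataset` and `sort_datasets_by_gene` are not part of mitoXplorer.

```python
# Hypothetical toy data: log2 fold change per gene (row) and dataset (column).
# Names and values are illustrative and not taken from the platform.
heatmap = {
    "GeneA": {"t1": -1.5, "t2": 0.2, "t3": 2.0},
    "GeneB": {"t1": 0.8, "t2": -0.4, "t3": 0.1},
    "GeneC": {"t1": 0.0, "t2": 1.1, "t3": -2.3},
}


def sort_genes_by_dataset(data, dataset):
    """Order genes by their log2 fold change in one chosen dataset."""
    return sorted(data, key=lambda gene: data[gene][dataset])


def sort_datasets_by_gene(data, gene):
    """Order datasets by the log2 fold change of one chosen gene."""
    return sorted(data[gene], key=lambda ds: data[gene][ds])


# "Clicking on a dataset": reorder the genes of the heatmap.
rows = sort_genes_by_dataset(heatmap, "t1")
# "Clicking on a gene": reorder the datasets of the heatmap.
cols = sort_datasets_by_gene(heatmap, "GeneC")
print(rows, cols)
```

Keeping the data as a gene-to-dataset mapping makes both sort directions a single `sorted` call; a real implementation would also need a rule for genes with missing values.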


Figure 4: Hierarchical Clustering and heatmap plot of the mitoXplorer platform. The result of Hierarchical Clustering of expression data is displayed as a clustered heatmap. (a) Heatmap of transcriptome and proteome data of mouse knock-out strains of genes involved in mitochondrial replication, DNA maintenance, transcription and RNA processing (taken from [46]). Data are clustered according to genes, as well as datasets. Gene boxes are colored according to their log2 fold change. At the top of the heatmap, the user can choose the mito-process to be displayed. (b) Hovering over one of the gene boxes displays information on the gene, such as the gene name, mito-process, log2 fold change, p-value and, if available, observed mutations. The heatmap is also zoomable by clicking on the magnifying glass at the bottom of the plot, so that large datasets can be visualized and analyzed efficiently. Datasets can be selected within the heatmap by grouping. To do this, first a group name has to be defined; second, the datasets belonging to this group have to be selected by clicking on one of the gene boxes of the dataset. This process can be repeated and the resulting groups can then be analyzed using Comparative Plots.
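As a rough illustration of how datasets can be clustered by their expression values, the following Python sketch performs agglomerative clustering on log2 fold change profiles. The legend does not state which distance measure or linkage mitoXplorer uses; Euclidean distance and average linkage are assumptions made here, and the dataset names and values are hypothetical.

```python
from itertools import combinations
from math import dist

# Hypothetical log2 fold change profiles, one vector per dataset.
profiles = {
    "ko1_rna": [1.0, -0.5, 2.0],
    "ko2_rna": [1.1, -0.4, 1.8],
    "ko1_prot": [-2.0, 0.3, -1.0],
    "ko2_prot": [-1.8, 0.5, -1.2],
}


def cluster_distance(a, b, data):
    """Average linkage: mean Euclidean distance over all cross-cluster pairs."""
    total = sum(dist(data[x], data[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def average_linkage(data):
    """Merge the two closest clusters until one remains; return the merges."""
    clusters = [(name,) for name in data]
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(pair[0], pair[1], data),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


merges = average_linkage(profiles)
for left, right in merges:
    print(left, "+", right)
```

The merge order is what a dendrogram next to the heatmap encodes. For realistic numbers of datasets, an optimized library routine would be preferable to this quadratic-per-step sketch.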


Figure 5: Principal component analysis and PCA plot of the mitoXplorer platform. (a) PCA and plot of transcriptome data of The Cancer Genome Atlas (TCGA) database [1], showing four different cancer types: breast cancer (BRCA), kidney cancer (KIRC), liver cancer (LIHC) and lung cancer (LUAD). Each bubble represents one dataset, in this case one cancer patient. At the top right of the plot, the mito-process to be shown can be chosen. In this case, ‘All Processes’ is chosen, containing data from all mito-genes. To the right of the plot, different colors as well as filters can be chosen. In this case, the Cancer Type was chosen for coloring, showing the four different cancer types in four different colors. (b) Hovering over a bubble displays associated information on the dataset, including the dataset name and, in the case of TCGA, information on the cancer type, the stage, the gender, the vital status, as well as skin color. In addition, the values of the three principal components are shown. (c) Selecting color schemes on the right-hand side changes the coloring of the bubbles. In this case, only lung cancer is shown, and coloring is done according to Stage, Gender, Vital, and Skin color. This panel can also be used for selecting specific datasets. For instance, clicking on one of the stages will only display the chosen stage and omit datasets from other stages. As in the heatmap, datasets can be selected from the PCA for grouping. To do this, first a group name has to be defined; second, the datasets belonging to this group have to be selected by clicking on one of the dataset bubbles. This process can be repeated and the resulting groups can then be analyzed using Comparative Plots.
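The 3D coordinates shown for each dataset can be obtained by projecting the expression matrix onto its first three principal components. The sketch below is an assumption-laden illustration using NumPy, not the platform's implementation: the input is random, the function name `pca_3d` is hypothetical, and centering without scaling is one of several possible preprocessing choices.

```python
import numpy as np


def pca_3d(matrix):
    """Project rows (datasets) onto the first three principal components."""
    x = np.asarray(matrix, dtype=float)
    centered = x - x.mean(axis=0)  # center each gene (column)
    # SVD of the centered matrix: rows of vt are the principal axes.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:3].T  # coordinates for the 3D plot
    explained = s**2 / np.sum(s**2)  # fraction of variance per component
    return scores, explained[:3]


# Hypothetical input: 20 datasets (rows) x 50 mito-genes (columns).
rng = np.random.default_rng(0)
expression = rng.normal(size=(20, 50))
scores, explained = pca_3d(expression)
print(scores.shape, explained)
```

Each row of `scores` corresponds to one bubble in the plot; metadata such as cancer type or stage would be joined to the rows afterwards for coloring and filtering.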


Figure 6: Interactome View of the transcriptome and proteome of cell lines carrying trisomy 21. Trisomy 21 samples were compared against their wild-type counterpart. Transcriptomic analysis of (a) HCT116_T21 (trisomy 21 against wild-type) and (b) RPE_T21 (trisomy 21 against wild-type); (c) proteomic analysis of RPE_T21 cells (trisomy 21 against wild-type). Transcriptome changes differ between the two trisomy 21 cell lines HCT116 and RPE1. Expression changes in HCT_T21 cells are mild and genes tend to be upregulated (a), while some genes are strongly downregulated in RPE_T21 cells (b). The transcriptome (b) and the proteome (c) of RPE_T21 cells respond quite differently, with a strong downregulation of components of the process oxidative phosphorylation (OXPHOS) at the proteome level, which is not observed at the transcriptome level. Most genes differentially expressed at the transcript level, on the other hand, show no significant changes at the proteome level. Red bubbles indicate downregulated genes, blue ones upregulated genes. The size of the bubble corresponds to the log2 fold change.

Figure 7: Scatterplots of translation and of mitochondrial-encoded as well as nuclear-encoded components of oxidative phosphorylation in trisomy 21 cells. (a) In the mitochondrial process translation, MRPS21 is strongly downregulated at the transcriptome level in RPE_T21 cells as compared to RPE1 wild-type (wt) cells. No change is observed in HCT_T21 cells. At the proteome level, several components of the mitoribosome small subunit (SSU) are downregulated in RPE_T21 cells. (b) Transcript levels of mitochondrial-encoded genes of oxidative phosphorylation (OXPHOS) are not affected. (c) A significant number of components of OXPHOS are downregulated at the protein level in RPE_T21 cells, while no significant or only mild reduction can be observed at the transcriptome level in trisomy 21 cell lines. Scatterplots are taken from the mitoXplorer Comparative Plot interface. Each bubble represents one gene; dots highlighted in pink are selected, and light blue dots indicate mutated genes. The log2 fold change is plotted on the y-axis; the cell lines are plotted on the x-axis (transcriptome of HCT116 T21 (HCT_T21) clone 1 (c1) and clone 3 (c3) vs wild-type, transcriptome of RPE1 T21 (RPE_T21) clone 1 (c1) and clone 2 (c2) vs wild-type, and proteome of RPE1 T21 clone 1 (RPE_T21 c1) vs wild-type). The gene highlighted in pink has been selected on the web-server: MRPS21 for the process translation and MT-CO2 for the process mt oxidative phosphorylation; no gene has been selected in the process oxidative phosphorylation.

Figure 8: Mitochondrial respiration and glycolysis are strongly affected in RPE_T21 cells and not affected in HCT_T21 cells. (a) Respiration in intact RPE_T21 cells is greatly decreased compared to wild-type. (b–d) Permeabilized RPE_T21 cells supplemented with substrates of complex I, II and IV, as indicated in the header of each plot, showed equally dysfunctional OXPHOS, suggesting a general breakdown of the respiratory chain. (e) RPE_T21 cells do not have any spare glycolytic reserve. Respiration (f), as well as glycolysis (g), is virtually unchanged in HCT_T21 cells compared to their wild-type counterparts. Bright red: RPE_T21 clone 1; light red: RPE_T21 clone 2; dark red: RPE wild-type; dark blue: HCT wild-type; light blue: HCT_T21 clone 1. Measurements of cellular respiration in intact and permeabilized cells, as well as glycolytic potential, were done using the Seahorse Bioscience XF Extracellular Flux Analyzer (Seahorse Biosciences). The experiments were performed using the mitochondrial and glycolytic stress test assay protocol as suggested by the manufacturer; the rate of cellular oxidative phosphorylation (oxygen consumption rate (OCR)) and glycolysis (cellular proton production rate (PPR)) were measured simultaneously.

Figure 9: Mitochondrial morphology is slightly changed in trisomy 21 cells. We have stained the mitochondrial network and analyzed the network morphology, measuring the percentage of filaments, rods, puncta and swollen mitochondria using the Fiji plug-in mitoMorph. (a) Sample of mitoMorph analysis of an RPE1 wild-type cell and (b) of an HCT116 wild-type cell. Filament networks are highlighted in lilac, rods in green, puncta (referring to fragmented mitochondria) in orange and swollen mitochondria in blue. The percentages of filaments, rods, puncta and swollen mitochondria are automatically scored and reported to the user. (c) The percentage of filaments is slightly, but significantly, reduced in RPE_T21 as well as HCT_T21 cells compared to wild-type. (d) There is no significant change in the percentage of rods in RPE_T21 cells and a slightly higher percentage in HCT_T21 cells compared to wild-type. (e) The percentage of puncta is unchanged in both T21 cell lines. (f) There are significantly more swollen mitochondria in both RPE_T21 and HCT_T21 cells compared to wild-type. Underlying numerical values are provided in Additional File 4, Supplementary Table S3; sample images are shown in Additional File 1, Supplementary Figure S4.
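The percentage scoring of the four morphology classes amounts to simple arithmetic once each mitochondrial object has a class label. The sketch below assumes that percentages are computed from object counts per cell; whether mitoMorph scores by count or by area is not stated in this legend, and the class labels and the function name `class_percentages` are hypothetical.

```python
from collections import Counter

# Hypothetical class labels for the four morphology categories.
CLASSES = ("filament", "rod", "punctum", "swollen")


def class_percentages(labels):
    """Return the percentage of objects in each morphology class."""
    counts = Counter(labels)
    total = sum(counts[c] for c in CLASSES)
    if total == 0:
        raise ValueError("no classified objects")
    return {c: 100.0 * counts[c] / total for c in CLASSES}


# Toy example: ten classified objects in one cell.
cell = ["filament"] * 5 + ["rod"] * 3 + ["punctum"] + ["swollen"]
pct = class_percentages(cell)
print(pct)
```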



Tables

Table 1: Mito-processes and number of genes in Human, Mouse and Drosophila.

Mito-process                              Human  Mouse  Drosophila
Amino Acid Metabolism                        81     80          67
Apoptosis                                    55     54          44
Bile Acid Synthesis                           2      2           7
Calcium Signaling and Transport              24     23          12
Cardiolipin Biosynthesis                      5      5           5
Fatty Acid Biosynthesis & Elongation         22     22          15
Fatty Acid Degradation & Beta-oxidation      30     31          26
Fatty Acid Metabolism                        14     11          19
Fe-S cluster biosynthesis                    24     25          18
Folate & Pterine Metabolism                  12     12           9
Fructose Metabolism                           7      7           3
Glycolysis                                   37     39          35
Heme Biosynthesis                             9      9           9
Import & Sorting                             51     52          62
Lipoic Acid Metabolism                        3      3           4
Metabolism of Lipids & Lipoproteins          34     36          17
Metabolism of Vitamins & Co-Factors          16     15          18
Mitochondrial Carrier                        46     45          46
Mitochondrial Dynamics                       60     60          47
Mitochondrial Signaling                      18     18          10
Nitrogen Metabolism                           8      8          21
Nucleotide Metabolism                        14     14          13
Oxidative Phosphorylation                   167    164         174
Oxidative Phosphorylation (MT)               13     13          13
Pentose Phosphate Pathway                     7      7           6
Protein Stability & Degradation              26     26          20
Pyruvate Metabolism                          26     25          24
Replication & Transcription                  53     54          33
ROS Defense                                  32     32          30
Translation                                 184    183         191
Translation (MT)                             24     24          24
Transmembrane Transport                      20     19          21
Tricarboxylic Acid Cycle                     21     22          29
Ubiquinone Biosynthesis                       9      9           9
Unknown                                      13     13          21

Figures

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Additional file 1 – Supplementary Figures

Supplementary Figure S1

Supplementary Figure S1: Programmatic skeleton of the mitoXplorer web-platform. In the back-end, a MySQL database stores the mito-interactomes, as well as expression and mutation data that are publicly available. User-uploaded data are stored temporarily and are only available to the user. A set of Python scripts connects to the MySQL database to retrieve both the mito-interactomes and the expression and mutation data. The mitomodel script connects to the MySQL database directly for the visualization of the Interactome View. A set of scripts performs the comparative analysis for generating the Comparative Plot, Heatmap and PCA visualizations. In the front-end, a set of JavaScript scripts handles the visualization of the plots: the ‘interactome’ and ‘database’ scripts handle the data presentation of the mito-interactome and the available public data for the web-site; mitomodel visualizes the Interactome View; and the scripts in the compare box are responsible for visualizing Comparative Plot, Heatmap and PCA. The CSS layer handles the CSS styles of the page and, finally, the HTML/PHP layer creates the actual interface for the user.


Supplementary Figure S2

Supplementary Figure S2: Length and area distribution of filaments and rods in wild-type and T21 derived RPE1 and HCT116 cells. (a) Stacked bar-plots of filament length distribution of RPE1 wild-type (labeled RPE_wt), RPE1 21/3 (labeled RPE_T21), HCT116 wild-type (labeled HCT_wt) and HCT116 21/3 (labeled HCT_T21) cells. Overall, shorter filaments are more frequent in HCT116 than in RPE1 cells. In T21, filaments tend to be slightly shorter. (b) Stacked bar-plots of filament area distribution of RPE_wt, RPE_T21, HCT_wt and HCT_T21 cells. Overall, less area is occupied by filaments in HCT116 than in RPE1 cells. In HCT_T21 cells, a notably smaller area is assigned to filaments, while in RPE_T21 cells, this change is much less pronounced. (c) Stacked bar-plots of rod length distribution of RPE_wt, RPE_T21, HCT_wt and HCT_T21 cells. Overall, in the range between 4 and 10 microns, more rods are found in RPE1 cells. Between wild-type and T21 cells, no real length difference is observable. (d) Stacked bar-plots of rod area distribution of RPE_wt, RPE_T21, HCT_wt and HCT_T21 cells. Overall, there is a tendency towards slightly larger rod areas in HCT116 cells. In HCT116 cells, rods seem to occupy slightly smaller areas when carrying the extra copy of chromosome 21.


Supplementary Figure S3

Supplementary Figure S3: mitoXplorer scatterplot of Translation and nuclear-encoded Oxidative Phosphorylation of fibroblasts of monozygotic twins discordant for trisomy 21 (T21_MZ) and RPE1 T21 cells. (a) The mRNA of the mitoribosome small subunit component MRPS21 is strongly downregulated only in RPE_T21 cells and is mostly unaffected in monozygotic twins discordant for T21 (T21_MZ fibroblasts: T21_Letour_MZ_fib, T21_Liu_MZ). Mitoribosome proteins are significantly downregulated in RPE_T21 cells and mildly affected in T21_MZ fibroblasts. (b) Oxidative Phosphorylation components encoded in the nucleus are downregulated at the protein level in both RPE_T21 cells and T21_MZ fibroblasts, whereby deregulation is milder in T21_MZ. In both conditions, the Oxidative Phosphorylation transcriptome is mostly unaffected.


Supplementary Figure S4

Supplementary Figure S4: Mitochondrial network and mitoMorph-based image analysis of wild-type and T21 cells. MitoTracker stainings (a-d) of RPE_wt (a) and RPE_T21 (b), as well as HCT_wt (c) and HCT_T21 (d). (a, b) The mitochondrial network is largely intact in RPE_T21 cells, with only a slightly lower percentage of filaments and an increased number of swollen mitochondria. (c, d) In HCT116 cells, the mitochondrial network is overall less abundant, with more rod-like and fragmented mitochondria (puncta). With trisomy 21, cells show an even more pronounced presence of rods at the cost of longer filaments, as well as more puncta and swollen mitochondria. The scale bar is 50 µm. Mitochondria were stained with MitoTracker Deep Red FM from Invitrogen. Staining was done in 96-well plates. The cells were incubated for 30 min at 30°C with 100 nM MitoTracker dye prior to fixation. Cells were fixed with 3% PFA in DMEM for 5 min at room temperature. After washing with 1xPBS, 1xPBS with 0.02% sodium azide was added. Plates were stored at 4°C in the dark. Imaging was carried out on an inverted Zeiss Observer.Z1 microscope with a spinning disc and 473 nm, 561 nm and 660 nm argon laser lines. The images were captured automatically on multiple focal planes (step size 700 nm) with a 40x magnification air objective. Image stacks were Z-projected using Fiji for further analysis.
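For readers unfamiliar with Z-projection, the operation collapses a stack of focal planes into a single 2D image. The legend does not state which projection type was used in Fiji; the NumPy sketch below assumes a maximum-intensity projection and offers a mean projection as an alternative. The function name `z_project` and the toy stack are hypothetical.

```python
import numpy as np


def z_project(stack, method="max"):
    """Collapse a (z, y, x) image stack along the z axis."""
    stack = np.asarray(stack)
    if stack.ndim != 3:
        raise ValueError("expected a 3D stack shaped (z, y, x)")
    if method == "max":
        return stack.max(axis=0)  # brightest value per pixel across planes
    if method == "mean":
        return stack.mean(axis=0)  # average value per pixel across planes
    raise ValueError(f"unknown method: {method}")


# Toy stack: three focal planes of 2 x 2 pixels.
stack = np.zeros((3, 2, 2), dtype=np.uint16)
stack[0, 0, 0] = 10
stack[2, 0, 0] = 40
stack[1, 1, 1] = 7
projection = z_project(stack)
print(projection)
```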
