bioRxiv preprint doi: https://doi.org/10.1101/641423; this version posted May 17, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
mitoXplorer, a visual data mining platform to systematically analyze and
visualize mitochondrial expression dynamics and mutations

Annie Yim1*, Prasanna Koti1*, Adrien Bonnard2, Milena Duerrbaum1, Cecilia Mueller3,
Jose Villaveces1, Salma Gamal1, Giovanni Cardone1, Fabiana Perocchi3, Zuzana
Storchova1,4, Bianca H. Habermann1,2,5

1 Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152, Martinsried, Germany
2 Aix-Marseille University, INSERM, TAGC U1090, 13009 Marseille, France
3 Functional Genomics of Mitochondrial Signaling, Gene Center, Ludwig Maximilian
  University (LMU) Munich, Germany
4 Department of Molecular Genetics, TU Kaiserslautern, Paul Ehrlich Strasse 24,
  67663, Kaiserslautern, Germany
5 Aix-Marseille University, CNRS, IBDM UMR 7288, 13009 Marseille, France

* these authors contributed equally

Corresponding author:
Bianca H. Habermann
Aix-Marseille University, CNRS, IBDM UMR 7288, Case 907
Parc Scientifique de Luminy
163, Avenue de Luminy,
13009 Marseille
France
e-mail: [email protected]

Keywords:
Mitochondrial expression dynamics, mitochondrial mutations, mitochondrial functions,
visual data mining, Trisomy 21, oxidative phosphorylation, mitochondrial morphology,
image analysis

Abstract

Background
Mitochondria produce cellular energy in the form of ATP and are involved in various
metabolic and signaling processes. However, the cellular requirements for
mitochondria are different depending on cell type, cell state or organism. Information
on the expression dynamics of genes with mitochondrial functions (mito-genes) is
embedded in publicly available transcriptomic or proteomic studies and the variety of
available datasets enables us to study the expression dynamics of mito-genes in many
different cell types, conditions and organisms. Yet, we lack an easy way of extracting
these data for gene groups such as mito-genes.

Results
Here, we introduce the web-based visual data mining platform mitoXplorer, which
systematically integrates expression and mutation data of mito-genes. The central part
of mitoXplorer is a manually curated mitochondrial interactome containing ~1200
genes, which we have annotated with 35 different mitochondrial processes. This
mitochondrial interactome can be integrated with publicly available transcriptomic,
proteomic or mutation data in a user-centric manner. A set of analysis and visualization
tools allows users to mine and explore mitochondrial expression dynamics and
mutations across various datasets from different organisms and to quantify the
adaptation of mitochondrial dynamics to different conditions. We apply mitoXplorer to
quantify expression changes of mito-genes in a set of aneuploid cell lines that carry
an extra copy of chromosome 21. mitoXplorer uncovers remarkable differences in the
regulation of the mitochondrial transcriptome and proteome due to the dysregulation
of the mitochondrial ribosome in retinal pigment epithelial trisomy 21 cells, which
results in severe defects in oxidative phosphorylation.

Conclusions
We demonstrate the power of the visual data mining platform mitoXplorer to explore
expression data in a focused and detailed way to uncover potential underlying
mechanisms for further experimental studies. We validate the hypothesis-generating
power of mitoXplorer by testing predicted phenotypes in trisomy 21 model systems.
MitoXplorer is freely available at http://mitoxplorer.ibdm.univ-mrs.fr. MitoXplorer is
web-based and requires neither installation nor programming knowledge. Therefore,
mitoXplorer is accessible to a wide audience of experimental experts studying
mitochondrial dynamics.

Background
Enormous amounts of transcriptomic data are publicly available for exploration. This
richness of data gives us the unique opportunity to explore the behavior of individual
genes or groups of genes within a vast variety of different cell types, developmental or
disease conditions or in different species. By integrating these data in a sophisticated
way, we may be able to discover new dependencies between genes or processes.
Specific databases are available for mining and exploring disease-associated data,
such as The Cancer Genome Atlas (TCGA [1]), or the International Cancer Genome
Consortium Data Portal (ICGC [2]). Cancer data portals in particular provide users with
the opportunity to perform deeper exploration of expression changes of individual
genes or gene groups in different tumor types ([1-3]; for a review on available cancer
data portals, see [4]). Expression Atlas, on the other hand, provides pre-processed data
from a large variety of different studies in numerous species [5]. Indeed, the majority
of transcriptomic datasets are not related to cancer and are stored in public
repositories such as Gene Expression Omnibus (GEO [6]), DDBJ Omics Archive [7]
or ArrayExpress [8]. Currently, it is not straightforward to integrate data from these
repositories without at least basic programming knowledge.

Next to extracting reliable information from -omics datasets, it is equally important to
support interactive data visualization. This is a key element for a user-guided
exploration and interpretation of complex data, facilitating the generation of biologically
relevant hypotheses, a process referred to as visual data mining (VDM, reviewed e.g.
in [9]). Therefore, essentially all online data portals provide graphical tools for data
exploration.

What is fundamentally lacking is a user-centric, web-based and interactive platform for
data integration of a set of selected genes or proteins sharing the same cellular
function(s). The benefits of such a tool are evident: first, it would give us the possibility
to explore the expression dynamics of or mutations in this set of selected genes across
many different conditions, tissues, as well as across different species. Second, by
integrating data using enrichment techniques, for instance with epigenetic or ChIP-seq
data or by network analysis using the cellular interactome(s), we can recognize the
mechanisms that regulate the expression dynamics of the selected gene set.

One interesting set of genes is the mitochondria-associated genes (mito-genes): in other
words, all genes whose encoded proteins localize to mitochondria and fulfill their
cellular function within this organelle. Mito-genes are well-suited for such a systematic
analysis, because we have a relatively complete knowledge of their identity and can
categorize them according to their mitochondrial functions [10]. This a priori knowledge
can help us in mining and exploring the expression dynamics of mito-genes and
functions in various conditions and species.

Mitochondria are essential organelles in eukaryotic cells that are required for
producing cellular energy in the form of ATP and for numerous other metabolic and
signaling functions [10]. Owing to their central cellular role, mitochondrial
dysfunctions were found to be associated with a number of human diseases such as
obesity, diabetes, neurodegenerative diseases and cancer [11-15]. However,
mitochondria are not uniform organelles. Their structural and metabolic diversity and
how they influence each other have been well described in the literature [16-20]. This
mitochondrial heterogeneity in different tissues is reflected in their molecular
composition [21]. The total number of proteins that contribute to mitochondrial
functions and localize to mitochondria is currently not precisely known and might differ
between tissues and species [22,23]. Yet, based on proteomics data from several
organisms, it is likely that mitochondria contain more than 1000 proteins [23-30].
Mitochondria have their own genome, whose size in animals is between 11 and 28
kilo-bases [31]. Most metazoan mitochondria encode 13 essential proteins of the
respiratory chain required for oxidative phosphorylation (OXPHOS), all rRNAs of the
small and large mitochondrial ribosomal subunits, as well as most mitochondrial
tRNAs [32]. All other proteins found in mitochondria (mito-proteins) are encoded by
genes in the nucleus; the protein products of these nuclear-encoded mitochondrial
genes (NEMGs) are transported to and imported into mitochondria.

Based on data from mitochondrial proteomics studies or genome-scale prediction of
mito-proteins, several electronic repositories of the mitochondrial interactome have
been created [24,33-36], though they often lack proper functional assignments of
mito-proteins. Moreover, proteomics studies describing the mitochondrial proteome can
suffer from a high false-positive rate [23]. Computational prediction or machine
learning [37], on the other hand, lacks experimental confirmation. As a consequence,
none of the published mitochondrial interactomes available to date can be used
without further manual curation. Moreover, these lists are not integrated with any
available data analysis tool to explore mitochondrial expression dynamics under
varying conditions or in different tissues or species.

In this study, we present mitoXplorer, a web-based, highly interactive visual data
mining platform to integrate transcriptome, proteome, as well as mutation-based data
with a manually curated, function-based mitochondrial interactome. With mitoXplorer,
we can explore the expression dynamics, as well as mutations of mito-genes and their
associated mitochondrial processes (mito-processes) across a large variety of
different -omics datasets without the need for programming knowledge. MitoXplorer
provides users with dynamic and interactive figures, which instantly display information
on mitochondrial gene functions and protein-protein interactions. To achieve this,
mitoXplorer integrates publicly available -omics data with our hand-curated
mitochondrial interactomes for different model species. Additionally, users can upload
their own data for integration with our hand-curated mitochondrial interactome, as well
as the publicly available -omics data stored in the mitoXplorer database. In order to
test the analytical and predictive power of mitoXplorer, we generated transcriptome
and proteome data from aneuploid cell lines carrying trisomy 21 (T21). We used
mitoXplorer to analyze and integrate our data with publicly available trisomy 21 data.
MitoXplorer enabled us to predict respiratory failure in one of our T21 cell lines, which
we experimentally confirmed, demonstrating the predictive power of mitoXplorer.

Results
The outline of the mitoXplorer web-platform is illustrated in Figure 1: at the back-end,
manually curated mitochondrial interactomes from human, mouse and Drosophila, as
well as expression and mutation data from these three species are stored in a MySQL
database (details on the implementation of the back-end are available in Methods, as
well as Additional File 1, Supplementary Figure S1).

The user interacts with the mitoXplorer web-platform via the front-end, which offers
different visualization and analysis methods. Users can either browse stored public
data or upload their own data.

The mitochondrial interactomes
The main component of mitoXplorer is the mitochondrial interactome. Its accurate
annotation and completeness are essential for performing a meaningful
mitoXplorer-based analysis. To establish mitochondrial interactomes, we have assembled
and manually curated lists of genes with annotated mitochondrial processes
(mito-processes) for human, mouse and Drosophila. Starting from published mitochondrial
proteomics data [27], we removed false-positives and supplemented likely missing
genes using information from MitoCarta [24], as well as orthologs across species. We
relied mainly on literature sources and information from the respective gene entry at
NCBI [38] for establishing whether a protein in question is primarily localized to
mitochondria. This resulted in 1166 human, 1161 mouse and 1099 Drosophila
mito-genes. We annotated the genes with, and grouped them according to, 35
mito-processes using controlled vocabulary (Table 1). In addition to purely mitochondrial
processes, we added cytosolic processes coupled to mitochondrial functions,
including glycolysis, the pentose phosphate pathway or apoptosis. According to our
annotation strategy, one gene is part of only a single mito-process. We used all
mito-genes to create mitochondrial interactomes by adding protein-protein interaction
information from STRING [39].
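
The assembly step described above can be sketched as follows. This is a minimal illustration, not the mitoXplorer implementation: the function name build_interactome, the process labels and the example interaction pairs are assumptions made for this sketch. It only shows the two constraints stated in the text, namely that each gene belongs to exactly one mito-process and that interactions are restricted to pairs of mito-genes.

```python
from collections import defaultdict

def build_interactome(gene_to_process, edges):
    """Group mito-genes by their single mito-process and keep only
    interactions in which both partners are mito-genes.

    gene_to_process: dict mapping gene name -> one mito-process (hypothetical labels)
    edges: iterable of (gene_a, gene_b) pairs, e.g. parsed from an interaction table
    """
    mito_genes = set(gene_to_process)
    # Discard any interaction that involves a non-mitochondrial partner.
    kept_edges = [(a, b) for a, b in edges if a in mito_genes and b in mito_genes]
    # One gene -> one process, so every gene lands in exactly one group.
    members = defaultdict(list)
    for gene, process in gene_to_process.items():
        members[process].append(gene)
    return dict(members), kept_edges

# Illustrative input only; the process names and pairs are not taken from Table 1 or STRING.
annotation = {
    "NDUFS1": "Oxidative phosphorylation",
    "NDUFV1": "Oxidative phosphorylation",
    "CS": "TCA cycle",
}
interactions = [("NDUFS1", "NDUFV1"), ("NDUFS1", "ACTB")]
processes, mito_edges = build_interactome(annotation, interactions)
print(processes)
print(mito_edges)
```

Storing the annotation as a gene-to-process dictionary makes the one-process-per-gene rule hold by construction.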

Currently, the interactomes of three organisms are available on mitoXplorer: Homo
sapiens (human), Mus musculus (mouse) and Drosophila melanogaster (fruit fly).
Mito-genes of human, mouse and Drosophila annotated with mito-processes are
available in Additional File 2, Supplementary Table S1 a-c. These manually curated
and annotated interactomes enable the meaningful analysis and visualization of
mitochondrial expression dynamics of mito-processes by comparing differential
expression of two or more conditions in mitoXplorer.

The mitoXplorer expression and mutation database
To foster the analysis of mitochondrial expression dynamics and mutations,
mitoXplorer hosts expression and mutation data from public repositories in a MySQL
database.

Expression data encompass analyzed data of differentially expressed genes from
RNA-seq studies and are available in the form of log2 fold change (log2FC) and
p-value. One differential dataset thus includes two experimental conditions with all
replicates. Mutation data include analyzed data of identified SNPs of one sample
against a publicly available reference genome or transcriptome. Pre-analyzed public
data are taken as provided by the authors of the respective study. Thus, the algorithms
and their settings might differ between data from different studies or sources.
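
For readers unfamiliar with the log2FC value, the following sketch shows how it relates two conditions. It is a simplified illustration under stated assumptions: the function name and the pseudocount of 1 are choices made here, and dedicated differential expression tools estimate fold changes and p-values with statistical models rather than from a plain ratio of means.

```python
import math

def log2_fold_change(mean_condition, mean_reference, pseudocount=1.0):
    """log2 ratio of the averaged expression in a condition over a reference.

    The pseudocount (an assumption of this sketch) avoids taking the
    logarithm of zero for genes without reads.
    """
    return math.log2((mean_condition + pseudocount) / (mean_reference + pseudocount))

# A gene with a four-fold higher level in the condition has a log2FC of 2;
# positive values mean up-regulation, negative values down-regulation.
print(log2_fold_change(399.0, 99.0))  # (399 + 1) / (99 + 1) = 4 -> 2.0
print(log2_fold_change(99.0, 399.0))  # inverse ratio -> -2.0
```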

The largest public resource imported into mitoXplorer covers publicly available
expression data of human cancers from The Cancer Genome Atlas (TCGA, [1]). We
have included all paired samples. This resulted in a total of 523 differential datasets
from 6 different cancer types: kidney cancer (KIRC), breast cancer (BRCA), liver
cancer (LIHC), thyroid cancer (THCA), lung cancer (LUAD) and prostate cancer
(PRAD). Changes in mitochondrial metabolism have been described in many cancer
types (for a review, see [40]). As mitoXplorer is so far the only resource that allows
a focused analysis of mito-genes across different cancer types or patient groups, this
resource should be especially useful to shed light on the expression dynamics or
mutational data of mito-genes in cancer and to classify the mitochondrial metabolic
profiles of tumor types and sub-types. Users can moreover integrate proprietary data
with differential expression or with mutation data from different tumor types and
subtypes.

We provide data from human trisomy 21 patients (GEO accession numbers:
GSE55426; GSE79842; [41,42]), from trisomy 21 studies in mouse (GSE5542 [41];
GSE79842 [42]; GSE48555 [43]), as well as differential datasets from this study from
human trisomic cell lines (11 datasets), which have been partially published elsewhere
[44,45] (GEO accessions: GSE39768; GSE47830; GSE102855). These
transcriptomic, as well as proteomic datasets should help understand the role of
mitochondria and the mitochondrial metabolism in trisomy 21.

We also uploaded differential transcriptomic and proteomic data of five different mouse
conditional heart knock-out strains of genes involved in mitochondrial replication,
transcription and translation [46] (Lrpprc, Mterf4, Tfam, Polrmt and Twnk (Twinkle);
GEO accession: GSE96518). These data are especially helpful in unraveling the
transcriptional and post-transcriptional effects on mito-genes upon disruption of gene
expression at different levels in mitochondria.

To extend mitoXplorer to other model organisms, we added data from D. melanogaster,
namely expression data from 185 wild-derived, inbred strains (males and females)
from the Drosophila Genetics Reference Panel [47] (DGRP2). The wild-derived fly
strains come from different environmental and social situations and display a
substantial quantitative genetic variation in gene expression. The availability of these
data on mitoXplorer allows a focused analysis of mito-genes to elucidate whether
mitochondrial expression dynamics are equally impacted by the environment in these
strains. Finally, we have uploaded data from a recently published systematic study of
flight muscle development in D. melanogaster [48] (GEO accession: GSE107247).
This enables the analysis of mitochondrial expression dynamics during the
development and differentiation of a tissue that is highly dependent on an efficient
mitochondrial metabolism and especially ATP production for proper functioning.

All publicly available data can be viewed and accessed from the mitoXplorer
DATABASE web-site.

User-provided expression and/or mutation data
Researchers can upload and explore their own data in mitoXplorer, provided that they
originate from one of the species contained in the mitoXplorer platform. Data must be
pre-analyzed. Differential expression data must contain the dataset ID (describing the
experimental condition), the gene name and the log2FC. Optional values include the
p-value, as well as the averaged read counts (or intensities) of the replicates of the
compared conditions. Mutation data must contain the dataset ID, gene name, the
chromosome, the position, as well as the reference and alternative alleles. Optional
values include the effect, as well as the consequence of the mutation. The entire list of
genes from a study should be uploaded to the platform for several reasons: first, a
restriction to only differentially expressed or mutated genes will suppress links between
proteins in the interactome; second, an integration of user data with publicly provided
data is difficult with incomplete datasets; third, mitoXplorer will automatically select the
mito-genes from the user data. Users can either generate their own data in the format
described on our website or use the RNA-seq pipeline that we provide at
https://gitlab.com/habermannlab/mitox_rnaseq_pipeline/. Uploaded data will be
checked for correct formatting and integrated with the interactome of the chosen
species. User data are only visible to the owner and are stored in the mitoXplorer
MySQL database for 7 days. Users can integrate their own data with available public
data on mitoXplorer to perform various analyses and visualizations as described below
(Figure 1).
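
The format check and the automatic selection of mito-genes described above can be sketched as follows. The column names (dataset_id, gene, log2fc, pvalue), the comma-separated layout and the function names are assumptions made for this illustration; the authoritative upload format is the one described on the mitoXplorer website.

```python
import csv
import io

# Hypothetical column names for the three required fields of an expression upload.
REQUIRED_COLUMNS = ("dataset_id", "gene", "log2fc")

def read_expression_table(text):
    """Parse a comma-separated differential expression table and verify
    that the required columns are present. The p-value is optional."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        pvalue = row.get("pvalue")
        rows.append({
            "dataset_id": row["dataset_id"],
            "gene": row["gene"],
            "log2fc": float(row["log2fc"]),
            "pvalue": float(pvalue) if pvalue not in (None, "") else None,
        })
    return rows

def select_mito_genes(rows, mito_genes):
    """Keep only the rows whose gene is part of the mitochondrial gene list."""
    return [r for r in rows if r["gene"] in mito_genes]

# The full gene list is uploaded; the mito-genes are selected afterwards.
upload = (
    "dataset_id,gene,log2fc,pvalue\n"
    "T21_vs_control,MRPL12,-1.5,0.001\n"
    "T21_vs_control,ACTB,0.1,\n"
)
rows = read_expression_table(upload)
print(select_mito_genes(rows, {"MRPL12"}))
```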

Analysis and visualization tools in mitoXplorer
The mitoXplorer web-platform provides a set of powerful, easy-to-read and highly
interactive visualization tools to analyze and visualize public, as well as user-provided
data by VDM (Figure 1): an Interactome View to analyze the overall expression and
mutation dynamics of all mito-processes of a single dataset containing differentially
expressed genes between two conditions and potential mutations in mito-genes; the
Comparative Plot, consisting of an interactive scatter plot, as well as an interactive
heatmap for comparing up to 6 datasets; the Hierarchical Clustering, as well as the
Principal Component Analysis for comparative analysis of many datasets.

Interactome View
The Interactome View can be used to get an at-a-glance view of the overall expression
dynamics of all mito-processes of a single dataset of differentially expressed
mito-genes and potential mutations (Figure 2 a). It allows users to identify the most
prominently changed mito-processes or -genes in a dataset. The genes are grouped
according to mito-processes and displayed in the process they are assigned to. The
Interactome View is highly dynamic and can be adjusted by users to their needs.

When the Interactome View is launched, each mito-process is initially shown as a
grey circle with elements colored in blue and/or red, indicating up- or down-regulated
genes within the process, respectively, and in grey otherwise (Figure 2 a). Thus,
mito-processes with the most up- or down-regulated genes can be quickly identified.

When clicking on a process name, its circle opens and displays all its member genes
as bubbles, whereby the size of the bubble indicates the strength of the differential
regulation and the color indicates up- (blue) or down- (red) regulation of the gene
(Figure 2 b). Both the log2FC and the p-value are color-coded in the Interactome
View. Only genes with a p-value below 0.05 will be colored. If information about
mutations is included in the dataset, mutated genes are indicated by a thicker, black
border of the gene bubble.
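
The display rules of the gene bubbles can be summarized in a few lines. This sketch only restates the rules given above; the function name, the returned dictionary and the use of the absolute log2FC as bubble size are assumptions, and the actual scaling used by mitoXplorer is not specified here.

```python
def bubble_style(log2fc, pvalue, mutated=False, alpha=0.05):
    """Map one gene of a differential dataset to display attributes.

    Rules taken from the text: blue = up-regulated, red = down-regulated,
    color only if the p-value is below 0.05, thicker black border if mutated.
    """
    if pvalue is None or pvalue >= alpha or log2fc == 0:
        color = "grey"          # not significantly changed
    elif log2fc > 0:
        color = "blue"          # up-regulated
    else:
        color = "red"           # down-regulated
    return {
        "color": color,
        "size": abs(log2fc),    # assumed proxy for "strength of regulation"
        "border": "thick black" if mutated else "default",
    }

print(bubble_style(1.8, 0.001))                  # significant up-regulation
print(bubble_style(-2.3, 0.2))                   # strong change, but not significant
print(bubble_style(-0.7, 0.01, mutated=True))    # down-regulated and mutated
```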

Hovering over a gene will display the gene name, its function, its mito-process, the
log2FC and the p-value of the differential expression analysis, as well as potential
mutations in the information panel (Figure 2 c). If a gene physically interacts with other
mito-genes, hovering over it or over the process circle will in addition display these
connections (Figure 2 c). Thus, the user is immediately informed about the location
and connectivity of the protein of interest within the mitochondrial interactome. Users
can also search for specific genes using the ‘FIND A GENE’ box at the top of the page.

The Interactome View can be launched by clicking on the ‘eye’ symbol next to dataset
names from the ANALYSIS page of mitoXplorer, after having chosen the organism,
the project and the dataset. Alternatively, users can access single datasets from the
DATABASE page of the platform, by clicking on the eye symbol of a listed dataset
after having chosen a species, as well as a project. A new page will be opened for the
Interactome View, which allows opening and comparing multiple datasets at the same
time. This is especially useful for comparing the overall expression change of
mito-processes of multiple datasets.

Comparative Plot
The Comparative Plot visualization combines several interactive graphs to analyze
one mito-process, allowing the comparison of up to 6 datasets. It includes a scatter
plot with a dynamic y-axis, as well as an interactive heatmap at the bottom of the page.
The mito-process to be visualized can be selected in the process panel (Figure 3 a).
Red and blue coloring of the dots and the bar chart indicates the directionality of
differential expression (blue: upregulated; red: downregulated); bright blue, larger
gene bubbles in the scatter plot indicate mutations, if available from the dataset. This
view offers an overview of the expression dynamics of all members of one
mito-process for up to 6 individual datasets and thus can be helpful in identifying
co-regulated genes e.g. in time-course data, patients or multiple mutant datasets.

Hovering over a gene bubble, or over a bar in the heatmap will again display the
respective associated information of the gene in the information panel (gene name,
function, mito-process, log2FC, p-value, potential mutations) (Figure 3 b). The
heatmap can be sorted according to the dataset, as well as the differential expression
values within one dataset (Figure 3 c). The Comparative Plot is especially useful for
performing a detailed, comparative, mito-process based analysis of differential
expression dynamics between different datasets.

We applied this analysis method to visualize differential expression data from a
time-series study of flight muscle development during pupal stages in Drosophila [48]
(Figure 3). While enrichment analysis has revealed a general positive enrichment of
processes like ‘TCA Cycle’ in the course of flight muscle development, mitoXplorer
identifies 12 genes of the TCA cycle that are co-regulated. This group of genes is strongly
upregulated between 0 and 16 hours of development, when myoblasts divide and fuse
to myotubes. The same group of genes is subsequently downregulated in two phases
at time-points 30 to 48 hours and 72 to 90 hours after puparium formation (APF), when
myotubes differentiate to mature muscle fibers. This is surprising as in mature muscle
fibers, the TCA cycle should be important for proper functioning. Their strong induction
between the first two time-points could be responsible for downregulation at later stages.

The Heatmap: Hierarchical Clustering
Hierarchical Clustering visualization allows the analysis of up to 100 datasets,
analyzing one process at a time. Mito-genes, as well as datasets, are clustered
according to the log2FC using hierarchical clustering (Figure 4 a). The results are
displayed as a clustered heatmap, with a dendrogram
indicating the distance between datasets or between genes.
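
To make the clustering step concrete, the following sketch clusters datasets by their log2FC profiles over the genes of one mito-process. The distance measure (Euclidean) and the linkage (average) are assumptions chosen for this illustration, since the settings used by mitoXplorer are not stated here; the dataset names and values are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two log2FC profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    profiles: dict mapping dataset name -> list of log2FC values (same gene order).
    Returns the merges in order as (cluster_a, cluster_b, distance); read from
    first to last, they describe the dendrogram from the leaves to the root.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(euclidean(profiles[a], profiles[b])
                            for a in clusters[i] for b in clusters[j])
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical log2FC values of two genes in three datasets.
example = {
    "strainA_rna": [2.0, 1.0],
    "strainB_rna": [2.0, 1.5],
    "strainA_protein": [-1.0, 0.0],
}
for left, right, dist in average_linkage(example):
    print(left, right, round(dist, 3))
```

Applying the same function to the transposed table (genes as keys, datasets as values) clusters the genes instead of the datasets.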

Hovering over a gene will display its associated information, as well as dataset
information in the information panel (Figure 4 b). The user can furthermore zoom into
parts of the heatmap to get a more detailed view of the data. The heatmap is
particularly useful for discovering groups of similarly regulated mito-genes or datasets
within one mito-process.

We applied this visualization tool to display transcriptome and proteome data from a
recent, systematic study of mouse conditional knock-out strains for five genes involved
in mitochondrial replication (Twinkle (Twnk)), mtDNA maintenance (Tfam),
mito-transcription (Polrmt), mito-mRNA maturation (Lrpprc) and mito-translation (Mterf4)
[46]. Interestingly, the expression dynamics of the mitochondrial transcriptomes and
proteomes in heart tissue did not cluster together for the mutants upon the loss of any
of these genes. In accordance with this, the expression of some mito-genes in the
process pyruvate metabolism, which is shown here, differs between the transcriptome
and proteome levels. This demonstrates the usefulness of hierarchical clustering and the
heatmap display in identifying the correlation or divergence between genes as well as
datasets.

391 Principal Component Analysis
392 A larger number of datasets can be compared using Principal Component Analysis
393 (PCA), either for an individual mito-process, or considering all mito-genes together
394 (Figure 5 a). In PCA, the expression value (e.g. log2FC) of each gene is considered
395 as one dimension, and each dataset represents one data point. In the resulting 3D
396 PCA plot, the three axes represent the first three principal components and each
16 bioRxiv preprint doi: https://doi.org/10.1101/641423; this version posted May 17, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
397 bubble represents one dataset. The PCA is again interactive. The mito-process to be
398 viewed can be selected via a drop-down menu on the top of the page. The plot can be
399 turned and moved in 3D and has a zooming function.
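
The PCA step described above can be sketched in a few lines of Python. This is a minimal, dependency-free illustration and not the platform's code: according to the Methods, mitoXplorer uses scikit-learn for PCA, whereas this sketch swaps in power iteration with deflation so that it runs with the standard library only. The dataset names and log2FC values are made up for illustration.

```python
import math
import random

def center(rows):
    """Subtract the per-gene mean so that every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Gene-by-gene sample covariance matrix of already centered rows."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def principal_components(rows, k=3, iterations=500, seed=0):
    """Return (components, scores) for the first k principal components.

    rows: one list of log2FC values per dataset (datasets x genes).
    """
    rng = random.Random(seed)
    x = center(rows)
    c = covariance(x)
    p = len(c)
    components = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(p)]
        start_norm = math.sqrt(sum(t * t for t in v))
        v = [t / start_norm for t in v]
        for _step in range(iterations):
            w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(t * t for t in w))
            if norm < 1e-12:  # no variance left in any direction
                break
            v = [t / norm for t in w]
        eigenvalue = sum(v[i] * c[i][j] * v[j] for i in range(p) for j in range(p))
        components.append(v)
        # Deflation: remove the variance explained by this component.
        c = [[c[i][j] - eigenvalue * v[i] * v[j] for j in range(p)] for i in range(p)]
    scores = [[sum(a * b for a, b in zip(row, comp)) for comp in components]
              for row in x]
    return components, scores

# Hypothetical log2FC values: 4 datasets (rows) x 5 mito-genes (columns).
datasets = {
    "tumor_A_1": [2.0, 1.5, -1.0, 0.2, 0.1],
    "tumor_A_2": [1.8, 1.7, -0.8, -0.1, 0.3],
    "tumor_B_1": [-1.9, -1.4, 1.1, 0.3, -0.2],
    "tumor_B_2": [-2.1, -1.6, 0.9, -0.4, 0.0],
}
components, scores = principal_components(list(datasets.values()))
for name, pcs in zip(datasets, scores):
    print(name, [round(value, 3) for value in pcs])
```

Each row of `scores` holds the coordinates of one dataset along the first three principal components, which is what one bubble in the 3D plot represents.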
400
401 Hovering over a bubble will give all information associated with the individual dataset
402 in the information panel, including the values of the first three principal components
403 (Figure 5 b). The information differs for each project chosen.
404
405 Individual datasets can be selected and colored via the dataset panel next to the plot
406 (Figure 5 c). For data from TCGA, the filter and coloring can for instance be used to
407 highlight or to limit the plot to data from different tumors, different tumor stages or
408 according to any other additional information provided. The PCA is especially useful
409 for analyzing a large number of datasets and displaying specific trends in sub-groups.
410
411 We used the PCA plot to visualize data from the TCGA for four cancer types stored in
412 mitoXplorer in Figure 5 a, whereby the colors of the bubbles represent the different
413 tumor types. The PCA mode clearly highlights the distinctness of the different tumor
414 types. In particular, kidney and liver cancer are highly distinct with respect to the first
415 three components of all mito-genes (Figure 5 a).
416
417 Both views intended for large datasets, the Heatmap and the PCA, can identify groups
418 of correlated datasets. In order to allow a more detailed, gene-centered analysis of
419 correlated datasets, we added the possibility to select and group datasets in the
Heatmap and the PCA view. Groups of datasets can be compared against each other
with the Comparative Plot, whereby the log2FC is averaged over the datasets within a
group. This functionality is useful, for instance, to compare groups of patients with
similar expression patterns, or to compare expression changes across tumor stages
during tumor development.
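
The group comparison can be illustrated with a short Python sketch. It assumes that "averaged" means the arithmetic mean of log2FC per gene over all datasets in a group; the function name, data layout, dataset names and values are hypothetical and not taken from mitoXplorer.

```python
from statistics import mean

def average_log2fc(datasets, groups):
    """Average log2FC per gene over the datasets that belong to each group.

    datasets: {dataset_name: {gene: log2FC}}
    groups:   {group_name: [dataset_name, ...]}
    A gene missing from a dataset is skipped for that dataset.
    """
    averaged = {}
    for group, members in groups.items():
        genes = sorted({gene for name in members for gene in datasets[name]})
        averaged[group] = {
            gene: mean(datasets[name][gene]
                       for name in members if gene in datasets[name])
            for gene in genes
        }
    return averaged

# Hypothetical example values.
datasets = {
    "stage_I_a": {"GENE_A": -1.0, "GENE_B": 0.5},
    "stage_I_b": {"GENE_A": -2.0, "GENE_B": 1.5},
    "stage_III_a": {"GENE_A": 0.5, "GENE_B": -0.5},
}
groups = {"stage_I": ["stage_I_a", "stage_I_b"], "stage_III": ["stage_III_a"]}
group_means = average_log2fc(datasets, groups)
print(group_means)
```

The per-group averages can then be compared gene by gene, as in a Comparative Plot of two groups.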
425
426 Taken together, mitoXplorer provides a versatile, interactive and integrative set of tools
427 to visualize and analyze the expression dynamics as well as mutations of mito-genes
428 and mito-processes, providing a detailed understanding of observed changes at a
429 molecular level.
430
431 Analyzing cell lines carrying trisomy 21 using the mitoXplorer platform
432 To demonstrate the analytical and predictive power of mitoXplorer, we analyzed the
433 transcriptome and proteome of a set of aneuploid cell lines carrying an extra copy of
434 chromosome 21 (trisomy 21, T21). Mitochondrial dysfunction has been repeatedly
435 found in T21 patients, whereby mostly oxidative stress, as well as – potentially
436 resulting – mitochondrial respiratory deficiency have been shown to contribute to some
437 of the observed clinical features (see for instance [49-62]). Transcriptome studies of
438 different T21 tissues using microarrays [63-73] and more recently RNA sequencing
439 [41,42,74] and proteomics [75-79] have revealed a complex picture of gene expression
440 changes, with a marked dissimilarity in differential expression of mito-genes on mRNA
441 and protein levels, indicating a potential post-transcriptional regulatory effect of some
mito-genes in T21 [78]. Yet, mito-gene and protein expression data from different
tissues or under varying conditions in T21 remain sparse, and a coherent hypothesis of
the underlying mechanisms leading to the mitochondrial deficiencies in T21 patients is
still missing.
446
447 We used trisomy 21 cell lines derived from either the euploid human colon cancer cell
448 line HCT116 or from the retinal pigmented epithelial cell line RPE1, to which an extra
449 copy of chromosome 21 [45] was added. We used two RPE1-derived and two
450 HCT116-derived clones trisomic for chromosome 21 (Additional File 3, Supplementary
451 Table S2 a), which were validated by fluorescent in situ hybridization and by whole
452 genome sequencing. We used transcriptomic data of the original euploid RPE1 line
453 and its two trisomic derivatives (RPE_T21 clone 1 and 2 (c1, c2) [45]), as well as for
454 HCT116, and its trisomic derivatives (HCT_T21 (c1, c3)). We included proteomics data
455 for RPE1 and one of its T21 derivatives (RPE_T21 c1). We performed bioinformatic
456 analysis to determine differential expression of the above conditions (Additional File
457 3, Supplementary Table S2 b-e) and uploaded the differential expression data of the
458 transcriptome and proteome on the mitoXplorer platform for further in-depth,
459 mitochondrial analysis.
460
461 Differences between trisomy 21 cell lines
462 MitoXplorer analysis of data comparing HCT116- and RPE1-derived T21 cell lines
463 using the Interactome View revealed that T21 induced strong effects with respect to
464 the overall expression changes in mito-genes (Figure 6). HCT_T21 showed a subtle,
465 but consistent up-regulation of mito-genes (Figure 6 a). In contrast, RPE_T21 cells
466 showed a strong down-regulation of a few genes involved in several mito-processes,
467 such as fatty acid metabolism, glycolysis or mitochondrial dynamics (Figure 6 b).
468 Remarkably, quantitative proteome data from RPE_T21 c1 cells suggested that all
469 mitochondria-encoded genes involved in OXPHOS, as well as the majority of nuclear-
470 encoded OXPHOS-genes are downregulated (Figure 6 c). In conclusion, mitoXplorer
471 analysis revealed significant differences in mito-gene expression between the different
472 cell lines. Importantly in RPE_T21 cells, proteome data show a remarkable difference
473 to transcriptome data.

mitoXplorer reveals mitochondrial ribosomal assembly defects in RPE_T21 cell lines
477 To investigate the differences further, we next performed a more detailed analysis of
478 expression changes in these T21 cell lines using Comparative Plots in mitoXplorer.
479 Transcriptome and proteome data from RPE_T21, but not from HCT_T21 cell lines
480 revealed that several subunits of the small mitochondrial ribosome (mitoribosome)
481 were significantly downregulated on either RNA or protein level, or both (Figure 7 a).
482 MRPS21 was strongly reduced on RNA- and protein-level. The genes MRPS33,
MRPS14 and MRPS15 were largely normal on RNA level, while their protein levels
decreased more than 2-fold (log2FC: MRPS33: -2.147; MRPS14: -1.827; MRPS15: -1.057).
The mitoribosome subunits are encoded in the nuclear genome and their
486 protein products are imported into the mitochondria, where they assemble with
487 mitochondrial ribosomal RNAs to form the large and small subunits of the
488 mitoribosome. The mitoribosome is responsible for translating the 13 mt-mRNAs
489 encoded in the mitochondrial genome, all of which code for key subunits of the
490 respiratory chain required for OXPHOS [80,81]. In accordance with a disrupted
491 mitochondrial translation machinery, all quantifiable mitochondria-encoded OXPHOS
492 proteins (Complex I: MT-ND1 and MT-ND5; Complex IV: MT-CO2) were severely
diminished on protein-, but not on RNA-level in RPE_T21 cells (Figure 7 b,c).
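
As a quick check of the "more than 2-fold" statement, the reported protein-level log2FC values can be converted to linear fold changes with 2^(-log2FC). The short Python snippet below does this for the three values quoted above; they correspond to roughly 4.4-, 3.5- and 2.1-fold reductions. The function name is only illustrative.

```python
# Protein-level log2FC values as quoted in the text above.
protein_log2fc = {"MRPS33": -2.147, "MRPS14": -1.827, "MRPS15": -1.057}

def fold_reduction(log2fc):
    """Convert a negative log2 fold change into a linear fold reduction."""
    return 2 ** (-log2fc)

fold = {gene: fold_reduction(value) for gene, value in protein_log2fc.items()}
for gene, value in fold.items():
    print(f"{gene}: {value:.2f}-fold lower on protein level")
```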
494
495 Interestingly, 36 of the quantifiable OXPHOS proteins encoded in the nuclear genome
496 were also found to be down-regulated at the proteome, but not at transcriptome level
497 in RPE_T21 cells (Figure 7 c). These include subunits of the NADH dehydrogenase
498 (complex I), ubiquinol-cytochrome c reductase (complex III) and cytochrome c oxidase
499 (complex IV). It is important to note that there is no general down-regulation of
500 mitochondrial proteins in these cells and only a few, specific proteins are strongly
downregulated (Figure 6 c). Together, these data demonstrate the power of
mitoXplorer to help identify the cause of important changes in mito-gene expression,
here the downregulation of mitoribosomal subunits at the transcription level, and its
consequence, in this case the downregulation of the majority of OXPHOS proteins.
505
506 RPE_T21 cells are defective in oxidative phosphorylation
507 The massive downregulation of OXPHOS proteins in RPE_T21 cells suggests that
508 these cells should suffer from a severe OXPHOS deficiency. To test this hypothesis
509 experimentally, we analyzed cellular respiration and glycolysis in T21 cell lines using
510 a Seahorse XF96 analyzer to quantify oxygen consumption rate (OCR) as an indicator
511 of mitochondrial respiration (Figure 8 a-d, f), as well as the proton production rate
512 (PPR) as an indicator of glycolysis (Figure 8 e, g). In intact RPE_T21 cells, we indeed
513 observed dramatically reduced levels of cellular respiration in comparison to the
514 diploid control (Figure 8 a).
515
516 As a complex I deficiency has been reported in trisomy 21 patients [58], we next asked
517 whether RPE_T21 cells selectively suffer from a complex I deficiency, or whether the
518 entire respiratory chain is affected, as suggested by our proteomics data. We used
519 permeabilized cells to test each individual complex with the Seahorse analyzer,
520 supplementing with pyruvate/malate, succinate and TMPD/ascorbate for assessing
521 complex I, II or IV functionality, respectively. As expected from our proteomics
522 analysis, RPE_T21 cells displayed a severe deficiency of the entire respiratory chain
523 (Figure 8 b-d). The glycolytic rate of RPE_T21 cells in the presence of glucose was
524 similar to the diploid control cells. Inhibition of ATP-production was not able to
525 stimulate the cells to a higher glycolytic rate (Figure 8 e), which agrees with the already
526 low OXPHOS levels observed in these cells. HCT_T21 cells, on the other hand,
527 displayed normal respiration, as well as glycolysis (Figure 8 f, g). This suggests that
528 the respiratory chain, as well as the mitochondrial translational machinery is not
529 generally affected in all T21 cells. Taken together, mitoXplorer uncovered OXPHOS
530 deficiencies in RPE_T21 cells, which we verified experimentally, demonstrating the
531 power of an in-depth analysis of mitochondrial expression dynamics to identify the
532 potential molecular cause of the observed phenotype.
533
534 Quantification of mitochondrial network morphology using mitoMorph
We further wanted to investigate whether T21 and the defective OXPHOS affected
mitochondrial morphology and whether the mitochondrial network structure
was changed in T21 cell lines. To quantify mitochondrial morphology in RPE_T21
538 cells, we stained mitochondria using the MitoTracker Deep Red FM dye. In order to
539 quantify the characteristics of mitochondrial morphology, we developed a new Fiji
540 plug-in for quantification of mitochondrial network features, which we called mitoMorph
541 (Figure 9 a,b). MitoMorph is based on the scripts provided by Leonard, et al. [82] for
542 quantifying mitochondrial network features such as filaments, rods, puncta and
543 swollen mitochondrial structures (see Materials and Methods for implementation
details). MitoMorph reports the percentages of filaments, rods, puncta and swollen structures for
545 each individual cell, as well as for all selected cells in a batch analysis. Moreover, it
546 provides the lengths and areas of filaments and rods. Figure 9 (c-f) shows the
547 distribution of mitochondrial network features for the two wild-type and T21 cell lines.
548 MitoMorph analysis revealed that in both backgrounds, T21 cells had fewer
549 mitochondrial filaments than their wild-type counterparts, but instead possessed a
550 slightly higher number of rods, which was significant in HCT_T21 cells. Both T21 cell
551 lines had significantly more swollen structures than their wild-type counterparts.
552 Length and area distribution of filaments and rods were not significantly different
553 between the wild-type and the trisomy 21 cells (Additional File 1, Supplementary
554 Figure S2 a-d). In conclusion, mitochondrial morphology based on light-microscopy is
555 mildly affected in trisomy 21.
556
557 Data integration with publicly available trisomy 21 datasets
558 After discovering this differential OXPHOS defect in our RPE_T21 cell lines, we were
559 interested in the overlap of the mito-transcriptome and -proteome of RPE_T21 cells
560 with data from trisomy 21 patients. We used proteomic and transcriptomic data from a
561 monozygotic twin study discordant for chromosome 21 [41,78]. In agreement with our
562 RPE_T21 data, systematic proteome and proteostasis profiling of fibroblasts from
563 monozygotic twins discordant for T21 revealed a significant, although milder down-
564 regulation of the mitochondrial proteome, including proteins involved in OXPHOS,
565 which is not apparent from transcriptomic analysis of the same cells (see Additional
566 File 1, Supplementary Figure S3 a, b). Taken together, the analysis of both datasets
567 with mitoXplorer suggests a strong post-transcriptional effect leading to reduced
568 expression levels of proteins involved in OXPHOS in trisomy 21.
569
Discussion

The web-based mitoXplorer platform for mito-centric data exploration
MitoXplorer is a practical web tool with an intuitive interface for users who wish to gain
insight into mitochondrial functions from -omics data. It is the first tool that takes
575 advantage of the breadth of -omics data available to date to explore expression
576 variability of mito-genes and -processes. It does so by integrating a hand-curated,
577 annotated mitochondrial interactome with -omics data available in public databases or
578 provided by the user.
579
MitoXplorer has been conceived and implemented as a visual data mining (VDM)
platform: by iteratively interacting with, visualizing and manipulating the
graphical display of data, the user can effectively explore complex data to extract
knowledge and gain a deeper understanding of it. MitoXplorer provides a set of
particularly interactive and flexible visualization tools, with a fine-grained, function- as
well as gene-based resolution of the data. In addition, clustering and PCA help
to mine a larger number of -omics datasets effectively by grouping datasets with
similar expression patterns.
588
VDM-based knowledge discovery is offered by a large number of resources and
platforms. However, to the best of our knowledge, no other currently available tool
allows users to explore expression variation of a specific subset of genes in a large
number of -omics datasets. MitoXplorer permits users to exploit publicly available
transcriptome, proteome or mutation data to study the variation, and thus the
adaptability, of a defined gene set in different conditions or species. While mitoXplorer
offers the exploration of mito-genes, we have designed the platform such that users
who download a local version of mitoXplorer can also upload their own interactome,
which can be any gene group of interest. Thus, mitoXplorer can be flexibly adjusted to
any user-defined interactome set.
599
600 Cell type-specific de-regulation of mito-genes in trisomy 21
601 We tested mitoXplorer by expression profiling of mito-genes in T21 cell lines. Mito-
602 genes were strongly deregulated in both trisomic cell types tested, the non-cancerous
603 retinal pigment epithelial cell line RPE1 and the cancer cell line HCT116. Yet, the
604 changes in expression were quite different in the two cell lines. It is not unexpected
605 that mito-genes are differentially expressed in different cell types, reflecting the
606 divergent cellular energy- and metabolic demands [20]. Gene expression is moreover
607 tightly regulated in a cell-type specific manner by regulating transcription, translation
and the epigenetic state of the cell. Thus, divergent, cell-type specific
expression changes of mito-genes upon introduction of an extra chromosome are not
surprising.
611
612 mitoXplorer revealed divergent de-regulation of mitochondrial transcriptome
613 and proteome in trisomy 21
614 We found a remarkable difference between transcriptome and proteome levels of mito-
615 genes in RPE_T21 cells. In particular the OXPHOS proteins were strongly down-
616 regulated at protein, but not mRNA level. This can be explained by essential
617 components of the respiratory chain being encoded in the mitochondrial genome and
618 thus requiring a functioning mitochondrial replication system, as well as intact
mitochondrial transcription and translation. Thus, there is a strong post-transcriptional
regulation of the mitochondrial proteome. In case of the RPE_T21 cell line, the
621 disintegration of the mitoribosome and thus a failure of mitochondrial translation is
622 likely causative for the down-regulation of OXPHOS components on protein-level,
623 possibly by proteolysis, as the essential mitochondrial subunits are not produced and
624 thus complexes cannot assemble. This conclusion is further supported by the fact that
625 we could not observe a significant difference in mitochondrial transcript levels, with
some mt-mRNAs even being upregulated; thus, mtDNA maintenance and replication, as
well as mito-transcription, seem to be unaffected.
628
629 MitoXplorer analysis of previously published data of the mito-proteome of fibroblasts
630 isolated from monozygotic twins discordant for T21 [78] revealed a similar post-
631 transcriptional effect as our T21 model cell lines. Taken together, our data uncovered
632 a significant post-transcriptional regulation of the mitochondrial process OXPHOS in
633 trisomy 21 that could bring new insight into the mechanisms of mitochondrial defects
634 in trisomy 21 patients.
635
636 mitoXplorer identified mitochondrial ribosomal protein S21 (MRPS21) as
637 potentially causative for OXPHOS failure
The most notable difference in RPE_T21 cells compared to wild-type is the 10- to 12-fold
downregulation of mitochondrial ribosomal protein S21 (MRPS21) on transcript
level, as well as the downregulation of Mrps21 protein and other proteins of the small
and, to a lesser extent, large mitoribosome subunit. Thus, our data suggest that the
642 integrity of the mitoribosome is compromised, leading to its disintegration and
643 subsequently, the downregulation of mitochondrial proteins of the respiratory chain.
644 Mrps21 is a late-assembly component and lies at the outer rim of the body (or bottom)
645 of the small subunit (SSU) of the mitoribosome. Nevertheless, it interacts with a
646 number of other proteins of the SSU and also directly contacts bases of the 12S rRNA
647 [83,84]. Thus, its absence could destabilize the SSU of the mitoribosome. The two
648 most down-regulated proteins are Mrps33 and Mrps14, both of which directly interact
649 with each other and several other proteins in the SSU and are localized to the head of
650 the SSU. Furthermore, together with another down-regulated component, Mrps15,
651 they are proteins that are incorporated late in the mitoribosome assembly process [84].
652 This raises the possibility that late-assembly proteins disintegrate more readily from
653 the mitoribosome, leading to their enhanced degradation and thus ribosome
654 malfunction.
655
656 Based on promoter analysis using MotifMap [85], potential binding motifs of two
657 transcription factors located on chromosome 21, GABPA and ETS2, can be found in
658 the promoter region of the MRPS21 gene. Gabpa, which is also known as nuclear
659 respiratory factor 2, has already been implicated in mitochondrial biogenesis by
regulating Tfb1m expression [86]: its depletion in mouse embryonic fibroblasts resulted in
reduced mitochondrial mass, ATP production, oxygen consumption and mito-protein
662 synthesis, but had no effect on mitochondrial morphology, membrane potential or
663 apoptosis. Direct or indirect regulation of mitoribosomal proteins could be another
664 regulatory function of this transcription factor. GABPA is not affected on transcriptome
665 level, but is downregulated on protein-level in RPE_T21 cells. ETS2 on the other hand
666 has so far not been implicated in mitochondrial biogenesis or functional regulation.
667
668 Outlook
MitoXplorer integrates, clusters and visualizes numerical data resulting from
expression studies (transcriptome, proteome), as well as mutation data. Thus, it is
671 currently limited to analyzing mito-genes without offering the ability to explore their
672 embedding in a broader, cellular context and thus to learn about potential regulatory
673 mechanisms of observed expression changes of mito-genes. Therefore, in the next
674 release of mitoXplorer, we plan to fully embed mito-genes within the cellular gene
675 regulatory, as well as signaling network by adding information from epigenetic studies
676 (ChIP-seq, methylation data), as well as from the cellular interactome. We will provide
677 the tools to perform enrichment analysis of observed transcription factors binding in
678 the promoter regions of co-regulated mito-genes; and to explore the regulatory
679 network of mito-genes by offering network analysis methods, such as viPEr [87]. Other
680 analysis methods we will provide include correlation analysis, as well as cross-species
681 data mining. Upon user request, we will also add the mitochondrial interactomes of
682 other species. As mitoXplorer stores the mitochondrial interactomes, as well as
683 associated -omics data in a MySQL database, all technical requirements for extending
684 the functionalities of mitoXplorer are already implemented.
685
686 Conclusions
mitoXplorer is a powerful, web-based visual data mining platform that allows users to
analyze and visualize mutations and expression dynamics of mito-genes and
mito-processes in depth by integrating a manually curated mitochondrial interactome
with -omics data in various tissues, conditions or species. We used transcriptome and
691 proteome data from cell lines with trisomy 21 to demonstrate the value of mitoXplorer
692 in analyzing in detail the expression dynamics of mito-genes and -processes. We have
693 used mitoXplorer to integrate these data with publicly available datasets of patients
694 with trisomy 21. Using mitoXplorer for data mining, we predicted failure of
695 mitochondrial respiration in one of the trisomy 21 cell lines, which we could verify
696 experimentally. Our results demonstrate the power of a visual data mining platform
697 such as mitoXplorer to explore expression dynamics of a specified mito-gene set in a
698 detailed and focused manner, leading to discovery of underlying molecular
699 mechanisms and providing testable hypotheses for further experimental studies.
700
701 Methods
702
703 Implementation of mitoXplorer
704
705 Web interface of mitoXplorer (Front-end)
The web interface of mitoXplorer at the front-end allows users to access, interact with
and visualize data from its database, including the interactome and expression/mutation
data. The interactive elements and visualizations on mitoXplorer are all built with
JavaScript, a dynamic programming language that enables interactivity on web pages
by manipulating elements through the DOM (Document Object Model). The DOM is a
representation of a document, such as an HTML page, in a tree structure, with each
element as a node or an object. Through JavaScript and its libraries, visualizations in
mitoXplorer can react to users’ actions and dynamically change the properties (size,
color, coordinates) of web elements. All the visualization
components in mitoXplorer described below are modular by design and can be
deployed individually or incorporated into web platforms easily.
717
718 Mitochondrial Interactome (D3 - Data binding and selection)
The visualization of the interactome is implemented with the JavaScript library D3
(d3.js) [88]. D3 (Data-Driven Documents) is capable of binding data, usually
in the form of JSON (JavaScript Object Notation), to the elements of the DOM so that
their properties are entirely based on the given data. In the interactome, D3 creates an
SVG (Scalable Vector Graphics) element for each gene within the DOM in the form of
a bubble, with sizes and colors dependent on their log2FC values. The coordinates
of the bubbles are also calculated according to the data (e.g. the largest one being at
the center) so that the layout of the whole interactome is visually appealing. Upon
hovering over any bubble (gene), D3 selects the element and passes additional data
bound to that element to the corresponding web element (sidebar) for display.
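
The data-to-attribute mapping behind the bubbles can be sketched independently of D3. The Python snippet below is only an illustration of the idea (the platform itself does this in JavaScript with D3): the radius scale, the cap on |log2FC| and the red/blue/grey color encoding are assumptions made for this example and are not taken from mitoXplorer.

```python
from xml.sax.saxutils import escape

def bubble_attributes(log2fc, min_radius=4.0, scale=6.0, cap=3.0):
    """Map a log2FC value to an SVG radius and fill color (assumed encoding)."""
    magnitude = min(abs(log2fc), cap)  # cap extreme values so bubbles stay readable
    radius = min_radius + scale * magnitude
    if log2fc > 0:
        fill = "red"    # assumed: up-regulated
    elif log2fc < 0:
        fill = "blue"   # assumed: down-regulated
    else:
        fill = "grey"   # assumed: unchanged
    return {"r": radius, "fill": fill}

def to_svg_circle(gene, log2fc, cx, cy):
    """Render one gene as an SVG <circle> with the gene name as a tooltip."""
    attrs = bubble_attributes(log2fc)
    return (f'<circle cx="{cx}" cy="{cy}" r="{attrs["r"]:.1f}" '
            f'fill="{attrs["fill"]}"><title>{escape(gene)}</title></circle>')

# Hypothetical gene names and values.
for gene, value in {"GENE_A": -2.0, "GENE_B": 0.5}.items():
    print(to_svg_circle(gene, value, cx=50, cy=50))
```

In D3, the same mapping would be expressed as attribute functions evaluated on the datum bound to each element.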
729
730 Comparative plot (D3 - Transition and sorting)
The comparative plot combines three interdependent visualizations (scatterplot, bar
chart and heatmap) built upon D3. Apart from data-binding and selection, these
visualizations exploit D3's transition and sorting functionality through its API. In
the scatterplot, genes are displayed as nodes, whose colors and positions again
depend on the data (log2FC). When another mito-process is selected in the bar chart,
D3 updates the data bound to the nodes and the properties of the nodes are changed.
The transition (changes in color and position) is smooth and gives users the
impression that the visualization is truly dynamic and interactive. D3 can manipulate
not only the elements, but also the data bound to the elements. Upon clicking the
dataset or gene names on the heatmap, the data are sorted accordingly and an
index is assigned to each element (tile on the heatmap) to indicate its position.
742
743 Hierarchical clustering (mpld3 - Visualization in Python implemented in D3)
The heatmap displaying the results of hierarchical clustering is built with mpld3, a
Python library that exports graphics made with Python’s Matplotlib-based libraries to
JSON objects that can be displayed in web browsers. Mpld3 benefits from D3’s
data-binding property and allows users to create plugins that interact with the data
in the visualization. The advantage of using mpld3 is that the analysis and
visualizations made in Python can be directly translated to JSON and deployed in
JavaScript on webpages without re-programming. In the case of hierarchical
clustering, libraries for both the clustering analysis and the visualization of results in a
heatmap with a dendrogram are available in Python (described below); the resulting
figure is exported to JSON with mpld3, together with a JavaScript tooltip plugin that
allows users to select data or display information with D3.
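
To make the clustering step behind the heatmap concrete, the following dependency-free Python sketch performs agglomerative clustering on a small matrix. It is not the platform's implementation (the Methods name SciPy for this task); Euclidean distance and average linkage are assumptions chosen for the example, and the log2FC values are invented.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(vectors):
    """Pairwise distance matrix between all vectors."""
    n = len(vectors)
    return [[euclidean(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]

def linkage_distance(a, b, dist):
    """Average linkage: mean distance over all cross-cluster pairs."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

def average_linkage(vectors):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples,
    where clusters are tuples of row indices."""
    dist = distance_matrix(vectors)
    clusters = [(i,) for i in range(len(vectors))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage_distance(pair[0], pair[1], dist))
        merges.append((a, b, linkage_distance(a, b, dist)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical log2FC values: 3 datasets (rows) x 2 genes (columns).
rows = [[1.0, 2.0], [1.0, 2.5], [-3.0, 0.0]]
dataset_merges = average_linkage(rows)                            # cluster datasets
gene_merges = average_linkage([list(col) for col in zip(*rows)])  # cluster genes
print(dataset_merges)
```

Clustering the rows groups similar datasets, and clustering the transposed matrix groups similarly regulated genes, which together give the two dendrograms of a clustered heatmap.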
755
Principal Component Analysis (three.js - 3D visualization)
The visualization of the result of Principal Component Analysis (PCA) is 3-dimensional,
with each dimension representing one of the first three Principal
Components (PCs). This is achieved through the implementation of three.js, a
760 Javascript library that enables animated 3D graphics to be created and displayed in a
761 web browser. It starts with building a “scene”, or a canvas, on which 3D objects will be
762 created. Then a “camera” is set up that controls the view of objects on the scene from
763 the users’ perspective, such as the field of view (width, height, depth) and its ratio; and
764 a “renderer” that renders the scene at short time intervals so objects are displayed as
animated objects (either they are animated by themselves or moved around on the
766 scene by users). Objects of different texture, geometry and color, can now be added
767 to and rendered on the scene. Finally, the scene with objects is attached to the DOM
768 of a webpage to become visible. In the PCA visualization, each dataset is represented
and rendered as a small sphere, with coordinates (x, y, z) depending on the values of
its first three PCs, and a color depending on the grouping of that dataset. When users drag around
771 on the canvas or zoom in or out, all objects are re-rendered in such a way that the
772 scene appears to be a 3-dimensional space.
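
One way to picture the data handed to the 3D view is as one record per dataset, carrying its three PC coordinates and a color derived from its group. The Python sketch below builds such records and serializes them to JSON. The record layout, the field names and the color palette are hypothetical; the actual format exchanged between the mitoXplorer back-end and three.js is not described in this section.

```python
import json

# Hypothetical palette; the colors actually used by mitoXplorer are not given here.
PALETTE = ["#1f77b4", "#d62728", "#2ca02c", "#ff7f0e"]

def sphere_payload(scores, groups):
    """Build one record per dataset: position from PC1-3, color from its group."""
    group_names = sorted(set(groups.values()))
    color_of = {g: PALETTE[i % len(PALETTE)] for i, g in enumerate(group_names)}
    return [
        {"name": name, "x": pcs[0], "y": pcs[1], "z": pcs[2],
         "group": groups[name], "color": color_of[groups[name]]}
        for name, pcs in scores.items()
    ]

# Hypothetical PC scores and group labels.
scores = {"sample_1": (1.2, -0.3, 0.1), "sample_2": (-0.8, 0.4, 0.0)}
groups = {"sample_1": "kidney", "sample_2": "liver"}
payload = json.dumps(sphere_payload(scores, groups))
print(payload)
```

On the front-end, each record would correspond to one sphere placed at (x, y, z) in the scene.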
773
774 MitoXplorer Database (back-end)
775 A MySQL database hosted at the back-end of mitoXplorer contains the interactomes
776 of mito-genes, including the mito-process, gene ontology and the interactions between
777 gene products; and the expression and mutation data from public databases. Each
778 entry of the expression and mutation data has a foreign link to the interactome and file
779 directory (dataset table). This ensures that the expression and mutation data will be
780 updated together with the interactome, or when a dataset is updated or deleted. Users
781 can upload their own differential expression and/or mutation data, which will be
782 processed and integrated with the interactome by extracting mito-genes, and stored
783 in the mitoXplorer database for up to 7 days.
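The effect of such foreign keys can be sketched in a few lines. The example below uses SQLite from the Python standard library as a stand-in for MySQL so that it runs without a server. All table and column names, the example rows and the way expired uploads are removed are assumptions; the actual mitoXplorer schema is not described in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE interactome (gene_id TEXT PRIMARY KEY, process TEXT NOT NULL);
CREATE TABLE dataset (
    dataset_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    uploaded_at TEXT NOT NULL
);
CREATE TABLE expression (
    gene_id    TEXT NOT NULL REFERENCES interactome(gene_id)
               ON UPDATE CASCADE ON DELETE CASCADE,
    dataset_id INTEGER NOT NULL REFERENCES dataset(dataset_id)
               ON UPDATE CASCADE ON DELETE CASCADE,
    log2fc     REAL
);
""")

# Hypothetical content: one gene, one expired upload and one recent upload.
conn.execute("INSERT INTO interactome VALUES ('GENE_A', 'process_1')")
conn.execute("INSERT INTO dataset VALUES (1, 'old_upload', datetime('now', '-8 days'))")
conn.execute("INSERT INTO dataset VALUES (2, 'new_upload', datetime('now'))")
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?)",
    [("GENE_A", 1, -0.8), ("GENE_A", 2, 1.2)],
)

# Renaming a gene in the interactome propagates to every expression row.
conn.execute("UPDATE interactome SET gene_id = 'GENE_A2' WHERE gene_id = 'GENE_A'")

# Removing uploads older than 7 days also removes their expression rows.
conn.execute("DELETE FROM dataset WHERE uploaded_at < datetime('now', '-7 days')")

rows = conn.execute("SELECT gene_id, dataset_id, log2fc FROM expression").fetchall()
```

After both statements, `rows` holds only the expression entry of the recent upload, under the renamed gene identifier. The cascade rules are what keep the three tables consistent without any extra application code.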

Data analysis and communication between front- and back-end
A Python application serves as a bridge between the front- and back-end of mitoXplorer.
When users request database access or an analysis at the web interface, an asynchronous
AJAX call is made to the Python application, so that the request is processed in the
background and the webpage is updated without reloading. The Python application then
processes the request by connecting to the MySQL database and analyzing the data retrieved
from it. The application also handles user uploads (e.g. data cleaning) before saving them
to the MySQL database. The main libraries used by the Python application for analysis are:
1) Scikit-learn: a machine learning library that provides tools for PCA, used to perform
dimensionality reduction on the expression of all mito-genes and of each mito-process. The
first three principal components are extracted for each dataset. 2) SciPy: a mathematical
library that provides modules for hierarchical clustering, used to calculate 2-dimensional
distance matrices between genes and between datasets based on expression values, for each
mito-process. 3) Seaborn: a statistical visualization library built on top of Matplotlib,
used to create heatmaps from the results. All results are produced in JSON format and sent
via HTTP back to the front-end, where they are visualized with JavaScript.
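The clustering step can be sketched as follows. This is a minimal example under stated assumptions, not the mitoXplorer implementation: the text does not specify the distance metric or linkage method, so Euclidean distance and average linkage are placeholders, and the gene names, dataset names and random values are invented.

```python
import json

import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
genes = [f"gene_{i}" for i in range(8)]        # hypothetical gene names
datasets = [f"dataset_{j}" for j in range(5)]  # hypothetical dataset names
log2fc = rng.normal(size=(len(genes), len(datasets)))


def cluster_order(matrix):
    """Return the square distance matrix and the dendrogram leaf order of the rows."""
    condensed = pdist(matrix, metric="euclidean")  # assumed metric
    tree = linkage(condensed, method="average")    # assumed linkage method
    return squareform(condensed), [int(i) for i in leaves_list(tree)]


gene_dist, gene_order = cluster_order(log2fc)    # genes are the rows
data_dist, data_order = cluster_order(log2fc.T)  # datasets become rows when transposed

# Reorder the matrix by both dendrograms and serialize it for the front-end.
result = json.dumps({
    "genes": [genes[i] for i in gene_order],
    "datasets": [datasets[j] for j in data_order],
    "values": log2fc[np.ix_(gene_order, data_order)].tolist(),
})
```

Clustering rows and columns separately gives the two dendrograms of a heatmap. The JSON carries the reordered matrix, so the browser does not have to repeat the computation.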

The usage of mitoXplorer does not require installation or programming knowledge.
Documentation and tutorials are available online and on GitLab
(https://gitlab.com/habermannlab/mitox). MitoXplorer is also available for download and
installation on a local server, if users wish to build their own gene list and apply the
interactive features and database of mitoXplorer, which stores the available expression
and mutation data for all genes. Setup instructions are also available on GitLab
(https://gitlab.com/habermannlab/mitox). We also provide a Docker version of mitoXplorer
(https://gitlab.com/habermannlab/mitox, branch docker-version).

Transcriptomics and proteomics of aneuploid cell lines
The proteome analysis of the trisomic cell lines was previously described [44,45].

The raw reads from RNA-sequencing were processed to remove low-quality reads and adapter
sequences, and aligned to the reference genome (hg19) with TopHat2 [89]. The Cufflinks
package [90] was used to calculate the expression difference between two samples
(aneuploid vs diploid) with multiple replicates and to test its statistical significance.
Transcriptome and proteome information is available in public repositories: NGS data have
been deposited in NCBI's Gene Expression Omnibus and are accessible through GEO Series
accession numbers GSE102855 and GSE131249. Data from Kühl et al. [46], as well as Liu et
al. [78], Letourneau et al. [41] and Sullivan et al. [42] were uploaded as provided by the
authors.

Data processing of public data and correlational analysis
The public NGS datasets on mitoXplorer were downloaded from GEO; only RNA-seq data, not
microarray data, are currently available on mitoXplorer. The pre-analyzed data were
downloaded and transformed to transcripts per million (TPM). Log2 fold changes (log2FC)
were calculated for each disease sample, using the corresponding diploid samples as
control (or the mean of the normal samples if there were no paired samples). Metadata of
the samples (e.g. cell types) were also downloaded and stored in the mitoXplorer database.
The links to the experiments for each dataset are available at the DATABASE summary page
of mitoXplorer. TCGA data were downloaded from the NCI GDC Data Portal
(https://portal.gdc.cancer.gov/). To obtain differential expression, the log2FC was
calculated from TPM values for each paired sample.
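The log2FC calculation for paired and unpaired designs can be written out as a short Python sketch. The pseudocount is an assumption, since the text does not state how zero TPM values were handled. The function name and the example values are likewise illustrative.

```python
import math
from statistics import mean

PSEUDOCOUNT = 1.0  # assumption: avoids log2(0); not specified in the text


def log2_fold_change(tpm_case, tpm_controls, pseudocount=PSEUDOCOUNT):
    """log2 ratio of a case TPM to the mean TPM of one or more control samples."""
    control = mean(tpm_controls)
    return math.log2((tpm_case + pseudocount) / (control + pseudocount))


# Paired design: a single matching control sample.
paired = log2_fold_change(31.0, [7.0])  # log2(32 / 8) = 2.0

# Unpaired design: the mean of all normal samples serves as the control.
unpaired = log2_fold_change(3.0, [6.0, 8.0, 10.0])  # log2(4 / 9), a negative value
```

Passing the controls as a list lets the same function cover both cases: a one-element list for a paired sample, and all normal samples otherwise.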

Cell culture and treatment
The human cell line RPE-1 hTERT (referred to as RPE) was a kind gift from Stephen Taylor
(University of Manchester, UK). Human HCT116 cells (referred to as HCT) were obtained from
ATCC (No. CCL-247). Trisomic cell lines were generated by microcell-mediated chromosome
transfer as described previously [45]. The A9 donor mouse cell lines were purchased from
the Health Science Research Resources Bank (HSRRB), Osaka 590-0535, Japan. All cell lines
were maintained at 37°C in a 5% CO2 atmosphere in Dulbecco's Modified Eagle Medium (DMEM)
containing 10% fetal bovine serum (FBS), 100 U penicillin and 100 U streptomycin.

MitoTracker staining and imaging
Mitochondria were stained in 96-well plates. The cells were incubated for 30 min at 37°C
with 100 nM MitoTracker Deep Red FM (M22426, Invitrogen) dye prior to fixation. Cells were
fixed with 3% PFA in DMEM for 5 min at room temperature. After washing twice with 1x PBST,
1x PBS containing 0.01% sodium azide was added and plates were stored at 4°C in the dark.
Imaging was carried out on an inverted Zeiss Observer.Z1 microscope with a spinning disc
and 473 nm, 561 nm and 660 nm argon laser lines. Imaging devices were controlled, and
images were captured, stored and processed, with the SlideBook software and Fiji [91]. The
images were captured automatically on multiple focal planes (step size 700 nm) with a 40x
magnification air objective.

Metabolic profiling of wild-type and T21 cell lines
RPE and HCT cells and their T21 derivatives were seeded at 25,000 or 36,000 cells/well,
respectively, on XF96 cell plates (Seahorse Bioscience, Agilent Technologies) 30 hours
before being assayed. Optimization of reagents as well as CCCP and digitonin titrations
were performed as described in the manufacturer's protocols (Seahorse Bioscience). The
experiments were performed using the mitochondrial and glycolytic stress test assay
protocols as suggested by the manufacturer (Seahorse Bioscience, Agilent Technologies).
Using the Seahorse Bioscience XF Extracellular Flux Analyzer, the rates of cellular
oxidative phosphorylation (oxygen consumption rate, OCR) and glycolysis (cellular proton
production rate, PPR) were measured simultaneously.

For OCR measurement, DMEM medium was supplemented with 25 mM glucose, 1 mM pyruvate and
2 mM glutamine. The basal rate was recorded and additions for the mito stress test were as
follows: 1.5 µM oligomycin, CCCP, 2 µM rotenone + 4 µM antimycin A. For PPR measurement,
DMEM medium was supplemented with 2 mM glutamine. The basal rate was recorded and additions
for the glycolysis stress test were as follows: 10 mM glucose, 1.5 µM oligomycin and
100 mM 2-deoxyglucose.

For intact cells, the CCCP concentrations were 7 and 1.5 µM for RPE1 and HCT116 cells,
respectively. The assays of intact cells were performed in 96-well plates with at least 10
replicates per cell line. For the permeabilized RPE1 cell lines, the CCCP and digitonin
concentrations were 10 µM and 40 µM, respectively. For the permeabilized HCT116 cell
lines, the CCCP and digitonin concentrations were 12 µM and 50 µM, respectively. For OCR
measurement, mannitol-sucrose buffer (MAS) was prepared according to Seahorse Bioscience.
For permeabilization, digitonin was added to MAS buffer together with the respective
respiratory substrates: 10 mM pyruvate / 2 mM malate, 10 mM succinate / 2 µM rotenone or
0.5 mM TMPD / 2 mM ascorbate / 2 µM antimycin A. Basal respiration was recorded, as were
additions of 4 mM ADP, 1.5 µM oligomycin, CCCP and 2 µM rotenone ± 4 µM antimycin A or
20 mM Na-azide. The assays in permeabilized cells were performed in poly-D-lysine-coated
96-well plates with at least 5 replicates per cell line.

Normalization was performed with the CyQuant cell proliferation assay kit (Life
Technologies) in the same plate used for the assay of intact cells, and in a parallel
plate for the permeabilized cells. Data analysis was done according to [92].

The mitoMorph plug-in for morphological characterization of mitochondria by image analysis
Classification and measurement of mitochondria were performed using the software ImageJ
[93], complemented with all the default plugins provided by Fiji [91] and with the
additional plugin FeatureJ. A set of functions was developed to assist the user in the
preparation and analysis of the data, in either interactive or batch processing mode.

Using this toolset, after all the cells of interest were manually outlined in each image,
the mitochondria were segmented and characterized. For each processing step, the
algorithms used are reported as described in ImageJ, and their parameters are specified in
physical units.

The images were pre-processed by first suppressing the background signal (rolling ball
background subtraction, kernel radius: 2.5 µm) and then enhancing the mitochondria signal
(Laplacian of Gaussian, smoothing scale: 1 µm, followed by contrast limited adaptive
histogram equalization, CLAHE, kernel size: 2.5 µm). Mitochondria candidates were obtained
by segmentation, using the Yen thresholding algorithm [94], and subjected to
classification based on a set of determined features.
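Because the parameters above are given in physical units, applying them to a particular image requires the pixel size of that image. The following Python sketch shows the conversion. The pixel size of 0.16 µm and the parameter names are hypothetical, as the text does not report them. Only the three µm values are taken from the paragraph above.

```python
PIXEL_SIZE_UM = 0.16  # hypothetical pixel size in µm per pixel; not stated in the text

# Pre-processing parameters as given in the text, in µm.
PARAMS_UM = {
    "rolling_ball_radius": 2.5,
    "log_smoothing_scale": 1.0,
    "clahe_kernel_size": 2.5,
}


def to_pixels(length_um, pixel_size_um=PIXEL_SIZE_UM):
    """Convert a physical length to the nearest whole number of pixels (at least 1)."""
    if pixel_size_um <= 0:
        raise ValueError("pixel size must be positive")
    return max(1, round(length_um / pixel_size_um))


params_px = {name: to_pixels(value) for name, value in PARAMS_UM.items()}
```

Keeping the parameters in µm and converting at run time lets the same settings be reused for images acquired at a different magnification.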

Objects that were too small were excluded from the analysis, and the remaining ones were
assigned to one of four categories: networked, puncta, rods and swollen [82]. Objects that
were quasi-round, compact in intensity, and larger than the puncta were classified as
swollen. All objects with an intermediate phenotype between fragmented puncta and a
network of filaments were classified as rods.

Classification was performed by sequentially verifying different selection criteria, one
set for each class, based on the following measured features: area (A), aspect ratio (AR),
circularity (C), solidity (S), minimum Feret diameter (here indicated as minimum linear
extension, MLE) and longest shortest-path (here indicated as extension, E). While all the
other measures are directly derived from the segmentation, the extension is measured as
the longest shortest-path between any two end points in the skeleton derived from the
segmentation. The selection criteria are evaluated sequentially as reported in Additional
File 4, Supplementary Table S3.
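The idea of sequential classification can be illustrated with a small Python sketch. It is not the mitoMorph rule set. All thresholds and the order of the tests are placeholders, only three of the listed features (area, circularity, extension) are used, and intensity compactness is not modeled. The circularity formula is the standard ImageJ shape descriptor.

```python
import math

# Placeholder thresholds for illustration only. The criteria actually used by
# mitoMorph are listed in Supplementary Table S3 and are NOT reproduced here.
MIN_AREA_UM2 = 0.05
PUNCTA_MAX_AREA_UM2 = 0.5
ROUND_MIN_CIRCULARITY = 0.8
NETWORK_MIN_EXTENSION_UM = 5.0


def circularity(area, perimeter):
    """ImageJ shape descriptor: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


def classify(area, perimeter, extension):
    """Test the class criteria one after another; the first match decides the class."""
    if area < MIN_AREA_UM2:
        return "excluded"
    if extension >= NETWORK_MIN_EXTENSION_UM:
        return "networked"
    if circularity(area, perimeter) >= ROUND_MIN_CIRCULARITY:
        return "puncta" if area <= PUNCTA_MAX_AREA_UM2 else "swollen"
    return "rods"


def disc(radius):
    """Area and perimeter of an ideal disc, used here to build example objects."""
    return math.pi * radius ** 2, 2.0 * math.pi * radius


small_area, small_perimeter = disc(0.3)
large_area, large_perimeter = disc(0.8)
labels = [
    classify(small_area, small_perimeter, extension=0.6),   # small round object
    classify(large_area, large_perimeter, extension=1.6),   # large round object
    classify(1.0, 6.0, extension=2.5),                      # elongated object
    classify(4.0, 30.0, extension=12.0),                    # long branched object
    classify(0.01, 0.4, extension=0.1),                     # below the size cut-off
]
```

Because the tests run in a fixed order, an object that satisfies several criteria is always assigned to the first matching class, which keeps the four categories mutually exclusive.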

We would like to note that analysis of mitochondrial morphology on projected images is
limited, as mitochondrial structures might not be resolved properly.

Image analysis using mitoMorph and data processing
Image processing and analysis were done in Fiji. Image stacks were Z-projected, cells were
manually selected and the resulting images were saved for further batch processing using
mitoMorph. The resulting network statistics of mitochondrial features for each individual
cell were used for further processing (Additional File 4, Supplementary Table S3). All
statistical processing and data visualization of mitoMorph results were done using R.

Declarations
Acknowledgements
We want to thank Michael Volkmer for help and advice in web-server management and
development. We want to thank Stephen Taylor (University of Manchester, UK) for providing
cell lines. We thank Alice Carrier, Friedhelm Pfeiffer and Frank Schnorrer for critical
reading of the manuscript. We thank the Max Planck Society, the Max Planck Institute of
Biochemistry, the Aix-Marseille University, the CNRS and the Institute of Developmental
Biology Marseille (IBDM) for their support.

Funding
This work was supported by DFG grant HA 6905/2–1 from the German Research Foundation, the
A*MIDEX grant 2HABERRE/RHRE/ID17HRU288 from Aix-Marseille University and ANR grant
ANR-18-CE45-0016-01 (to BHH); the ERASMUS+ Traineeship program (to AY); the Munich Center
for Systems Neurology (SyNergy EXC 1010) and the Bert L & N Kuggie Vallee Foundation (to
FP); and the Bavarian Molecular Biosystems Research Network D2-F5121.2-10c/4822 (to CM).

Availability of data and materials
The mitoXplorer web-server is freely available at http://mitoxplorer.ibdm.univ-mrs.fr/.
The source code of mitoXplorer is available at https://gitlab.com/habermannlab/mitox.
The pipeline for differential expression analysis and mutation calling of RNA-seq data
is available at https://gitlab.com/habermannlab/mitox_rnaseq_pipeline. MitoMorph is
freely available at https://github.com/giocard/mitoMorph. RNA-seq data published with
this study are available via the Gene Expression Omnibus (GEO) database (accession
numbers: GSE131249).

Details on software provided in this manuscript:
Project name: mitoXplorer
Project home page: http://mitoxplorer.ibdm.univ-mrs.fr/
Archived version: https://gitlab.com/habermannlab/mitox
Operating system(s): Platform independent
Programming language: JavaScript, PHP, MySQL, Python
Other requirements: none
License: GNU Public License
Any restrictions to use by non-academics: none

Project name: mitoMorph
Project home page: https://github.com/giocard/mitoMorph
Archived version: https://github.com/giocard/mitoMorph
Operating system(s): Platform independent
Programming language: Groovy, ImageJ Macro
Other requirements: ImageJ/Fiji software
License: GNU Public License
Any restrictions to use by non-academics: none

Authors' contributions
AY and PK were the main developers of the mitoXplorer web-server with the help of JV, SG
and AB. JV developed the interactome view of the web-server. Data analysis was done by AY,
PK, MD and BHH. MitoTracker staining and imaging were carried out by MD; CM and FP carried
out metabolic measurements. Handling of cells and cell culture was done by MD and ZS. GC
conceived and developed the mitoMorph Fiji plugin; image analysis with mitoMorph was done
by BHH. The project was conceived by BHH; the manuscript was written by AY and BHH with
contributions from ZS, GC, MD and CM. All authors read and approved the final version of
the manuscript.

Ethics approval and consent to participate
Not applicable.

Competing interests
We declare that no competing interests exist.

Additional files
Additional file 1: Figures S1-S4. Supplementary figures. (PDF format).
Additional file 2: Supplementary table S1 a-c. (Excel format).
Additional file 3: Supplementary table S2 a-e. (Excel format).
Additional file 4: Supplementary table S3 a-d. (Excel format).

References
1. Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. Nature Publishing Group; 2013;45:1113–20.
2. Zhang J, Baran J, Cros A, Guberman JM, Haider S, Hsu J, et al. International Cancer Genome Consortium Data Portal--a one-stop shop for cancer genomics data. Database (Oxford). 2011;2011:bar026–6.
3. Krempel R, Kulkarni P, Yim A, Lang U, Habermann B, Frommolt P. Integrative analysis and machine learning on cancer genomics data using the Cancer Systems Biology Database (CancerSysDB). BMC Bioinformatics. BioMed Central; 2018;19:156.
4. Klonowska K, Czubak K, Wojciechowska M, Handschuh L, Zmienko A, Figlerowicz M, et al. Oncogenomic portals for the visualization and analysis of genome-wide cancer data. Oncotarget. 2016;7:176–92.
5. Papatheodorou I, Fonseca NA, Keays M, Tang YA, Barrera E, Bazant W, et al. Expression Atlas: gene and protein expression across multiple studies and organisms. Nucleic Acids Res. 2018;46:D246–51.
6. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. Oxford University Press; 2002;30:207–10.
7. Kodama Y, Mashima J, Kaminuma E, Gojobori T, Ogasawara O, Takagi T, et al. The DNA Data Bank of Japan launches a new resource, the DDBJ Omics Archive of functional genomics experiments. Nucleic Acids Res. 2012;40:D38–42.
8. Parkinson H, Sarkans U, Kolesnikov N, Abeygunawardena N, Burdett T, Dylag M, et al. ArrayExpress update--an archive of microarray and high-throughput sequencing-based functional genomics experiments. Nucleic Acids Res. 2011;39:D1002–4.
9. Simoff SJ, Böhlen MH, Mazeika A. Visual Data Mining: An Introduction and Overview. Visual Data Mining. Berlin, Heidelberg: Springer Berlin Heidelberg; 2008. pp. 1–12.
10. Scheffler IE. Mitochondria. Hoboken, NJ, USA: John Wiley & Sons, Inc; 2007.
11. Nunnari J, Suomalainen A. Mitochondria: in sickness and in health. Cell. 2012;148:1145–59.
12. Suomalainen A, Battersby BJ. Mitochondrial diseases: the contribution of organelle stress responses to pathology. Nat. Rev. Mol. Cell Biol. 2018;19:77–92.
13. Zong W-X, Rabinowitz JD, White E. Mitochondria and Cancer. Mol. Cell. 2016;61:667–76.
14. Wallace DC. Mitochondria and cancer. Nat. Rev. Cancer. Nature Publishing Group; 2012;12:685–98.
15. Schapira AHV. Mitochondrial diseases. Lancet. 2012;379:1825–34.
16. Mannella CA. Structural diversity of mitochondria: functional implications. Ann. N. Y. Acad. Sci. Wiley/Blackwell (10.1111); 2008;1147:171–9.
17. Vafai SB, Mootha VK. Mitochondrial disorders as windows into an ancient organelle. Nature. Nature Publishing Group; 2012;491:374–83.
18. Wai T, Langer T. Mitochondrial Dynamics and Metabolic Regulation. Trends Endocrinol. Metab. 2016;27:105–17.
19. Benard G, Bellance N, James D, Parrone P, Fernandez H, Letellier T, et al. Mitochondrial bioenergetics and structural network organization. J. Cell. Sci. The Company of Biologists Ltd; 2007;120:838–48.
20. Woods DC. Mitochondrial Heterogeneity: Evaluating Mitochondrial Subpopulation Dynamics in Stem Cells. Stem cells international. Hindawi; 2017;2017:7068567–7.
21. Mootha VK, Bunkenborg J, Olsen JV, Hjerrild M, Wisniewski JR, Stahl E, et al. Integrated analysis of protein composition, tissue diversity, and gene regulation in mouse mitochondria. Cell. 2003;115:629–40.
22. Jensen RE, Dunn CD, Youngman MJ, Sesaki H. Mitochondrial building blocks. Trends Cell Biol. 2004;14:215–8.
23. Pagliarini DJ, Calvo SE, Chang B, Sheth SA, Vafai SB, Ong S-E, et al. A mitochondrial protein compendium elucidates complex I disease biology. Cell. 2008;134:112–23.
24. Calvo SE, Clauser KR, Mootha VK. MitoCarta2.0: an updated inventory of mammalian mitochondrial proteins. Nucleic Acids Res. 2016;44:D1251–7.
25. Gray MW. Mosaic nature of the mitochondrial proteome: Implications for the origin and evolution of mitochondria. Proc. Natl. Acad. Sci. U.S.A. 2015;112:10133–8.
26. Meisinger C, Sickmann A, Pfanner N. The mitochondrial proteome: from inventory to function. Cell. 2008;134:22–4.
27. Lotz C, Lin AJ, Black CM, Zhang J, Lau E, Deng N, et al. Characterization, design, and function of the mitochondrial proteome: from organs to organisms. J. Proteome Res. American Chemical Society; 2014;13:433–46.
28. Gaucher SP, Taylor SW, Fahy E, Zhang B, Warnock DE, Ghosh SS, et al. Expanded coverage of the human heart mitochondrial proteome using multidimensional liquid chromatography coupled with tandem mass spectrometry. J. Proteome Res. 2004;3:495–505.
29. Taylor SW, Fahy E, Zhang B, Glenn GM, Warnock DE, Wiley S, et al. Characterization of the human heart mitochondrial proteome. Nat. Biotechnol. Nature Publishing Group; 2003;21:281–6.
30. Gonczarowska-Jorge H, Zahedi RP, Sickmann A. The proteome of baker's yeast mitochondria. Mitochondrion. 2017;33:15–21.
31. Kolesnikov AA, Gerasimov ES. Diversity of mitochondrial genome organization. Biochemistry Mosc. SP MAIK Nauka/Interperiodica; 2012;77:1424–35.
32. Hällberg BM, Larsson N-G. Making proteins in the powerhouse. Cell Metab. 2014;20:226–40.
33. Catalano D, Licciulli F, Turi A, Grillo G, Saccone C, D'Elia D. MitoRes: a resource of nuclear-encoded mitochondrial genes and their products in Metazoa. BMC Bioinformatics. BioMed Central; 2006;7:36.
34. Smith AC, Robinson AJ. MitoMiner v3.1, an update on the mitochondrial proteomics database. Nucleic Acids Res. 2016;44:D1258–61.
35. Godin N, Eichler J. The Mitochondrial Protein Atlas: A Database of Experimentally Verified Information on the Human Mitochondrial Proteome. J. Comput. Biol. Mary Ann Liebert, Inc.; 2017;24:906–16.
36. Cotter D, Guda P, Fahy E, Subramaniam S. MitoProteome: mitochondrial protein sequence database and annotation system. Nucleic Acids Res. 2004;32:D463–7.
37. Guda C, Fahy E, Subramaniam S. MITOPRED: a genome-scale method for prediction of nucleus-encoded mitochondrial proteins. Bioinformatics. 2004;20:1785–94.
38. NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2018;46:D8–D13.
39. Szklarczyk D, Morris JH, Cook H, Kuhn M, Wyder S, Simonovic M, et al. The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res. 2017;45:D362–8.
40. DeBerardinis RJ, Chandel NS. Fundamentals of cancer metabolism. Sci Adv. American Association for the Advancement of Science; 2016;2:e1600200.
41. Letourneau A, Santoni FA, Bonilla X, Sailani MR, Gonzalez D, Kind J, et al. Domains of genome-wide gene expression dysregulation in Down's syndrome. Nature. Nature Publishing Group; 2014;508:345–50.
42. Sullivan KD, Lewis HC, Hill AA, Pandey A, Jackson LP, Cabral JM, et al. Trisomy 21 consistently activates the interferon response. Elife. eLife Sciences Publications Limited; 2016;5:1709.
43. Lane AA, Chapuy B, Lin CY, Tivey T, Li H, Townsend EC, et al. Triplication of a 21q22 region contributes to B cell transformation through HMGN1 overexpression and loss of histone H3 Lys27 trimethylation. Nat. Genet. Nature Publishing Group; 2014;46:618–23.
44. Dürrbaum M, Kuznetsova AY, Passerini V, Stingele S, Stoehr G, Storchova Z. Unique features of the transcriptional response to model aneuploidy in human cells. BMC Genomics. BioMed Central; 2014;15:139.
45. Stingele S, Stoehr G, Peplowska K, Cox J, Mann M, Storchova Z. Global analysis of genome, transcriptome and proteome reveals the response to aneuploidy in human cells. Mol. Syst. Biol. EMBO Press; 2012;8:608.
46. Kühl I, Miranda M, Atanassov I, Kuznetsova I, Hinze Y, Mourier A, et al. Transcriptomic and proteomic landscape of mitochondrial dysfunction reveals secondary coenzyme Q deficiency in mammals. Elife. eLife Sciences Publications Limited; 2017;6:1494.
47. Huang W, Carbone MA, Magwire MM, Peiffer JA, Lyman RF, Stone EA, et al. Genetic basis of transcriptome diversity in Drosophila melanogaster. Proc. Natl. Acad. Sci. U.S.A. National Academy of Sciences; 2015;112:E6010–9.
48. Spletter ML, Barz C, Yeroslaviz A, Zhang X, Lemke SB, Bonnard A, et al. A transcriptomics resource reveals a transcriptional transition during ordered sarcomere morphogenesis in flight muscle. Elife. eLife Sciences Publications Limited; 2018;7:1361.
49. Valenti D, de Bari L, De Filippis B, Henrion-Caude A, Vacca RA. Mitochondrial dysfunction as a central actor in intellectual disability-related diseases: an overview of Down syndrome, autism, Fragile X and Rett syndrome. Neurosci Biobehav Rev. 2014;46 Pt 2:202–17.
50. Tiano L, Busciglio J. Mitochondrial dysfunction and Down's syndrome: is there a role for coenzyme Q(10)? Littarru GP, editor. Biofactors. Wiley-Blackwell; 2011;37:386–92.
51. Pagano G, Castello G. Oxidative stress and mitochondrial dysfunction in Down syndrome. Adv. Exp. Med. Biol. New York, NY: Springer US; 2012;724:291–9.
52. Ogawa O, Perry G, Smith MA. The “Down's” side of mitochondria. Dev. Cell. 2002;2:255–6.
53. Prince J, Jia S, Båve U, Annerén G, Oreland L. Mitochondrial enzyme deficiencies in Down's syndrome. J Neural Transm Park Dis Dement Sect. 1994;8:171–81.
54. Roat E, Prada N, Ferraresi R, Giovenzana C, Nasi M, Troiano L, et al. Mitochondrial alterations and tendency to apoptosis in peripheral blood cells from children with Down syndrome. FEBS Lett. 2007;581:521–5.
55. Piccoli C, Izzo A, Scrima R, Bonfiglio F, Manco R, Negri R, et al. Chronic pro-oxidative state and mitochondrial dysfunctions are more pronounced in fibroblasts from Down syndrome foeti with congenital heart defects. Hum. Mol. Genet. 2013;22:1218–32.
56. Phillips AC, Sleigh A, McAllister CJ, Brage S, Carpenter TA, Kemp GJ, et al. Defective mitochondrial function in vivo in skeletal muscle in adults with Down's syndrome: a 31P-MRS study. Dzeja P, editor. PLoS ONE. Public Library of Science; 2013;8:e84031.
57. Aburawi EH, Souid A-K. Lymphocyte respiration in children with Trisomy 21. BMC Pediatr. BioMed Central; 2012;12:193.
58. Valenti D, Manente GA, Moro L, Marra E, Vacca RA. Deficit of complex I activity in human skin fibroblasts with chromosome 21 trisomy and overproduction of reactive oxygen species by mitochondria: involvement of the cAMP/PKA signalling pathway. Biochem. J. Portland Press Limited; 2011;435:679–88.
59. Valenti D, Tullo A, Caratozzolo MF, Merafina RS, Scartezzini P, Marra E, et al. Impairment of F1F0-ATPase, adenine nucleotide translocator and adenylate kinase causes mitochondrial energy deficit in human skin fibroblasts with chromosome 21 trisomy. Biochem. J. electronic edn. Portland Press Limited; 2010;431:299–310.
60. Abu Faddan N, Sayed D, Ghaleb F. T lymphocytes apoptosis and mitochondrial membrane potential in Down's syndrome. Fetal Pediatr Pathol. 2011;30:45–52.
61. Izzo A, Nitti M, Mollo N, Paladino S, Procaccini C, Faicchia D, et al. Metformin restores the mitochondrial network and reverses mitochondrial dysfunction in Down syndrome cells. Hum. Mol. Genet. 2017;26:1056–69.
62. Busciglio J, Pelsman A, Wong C, Pigino G, Yuan M, Mori H, et al. Altered metabolism of the amyloid beta precursor protein is associated with mitochondrial dysfunction in Down's syndrome. Neuron. 2002;33:677–88.
63. Lockstone HE, Harris LW, Swatton JE, Wayland MT, Holland AJ, Bahn S. Gene expression profiling in the adult Down syndrome brain. Genomics. 2007;90:647–60.
64. Halevy T, Biancotti J-C, Yanuka O, Golan-Lev T, Benvenisty N. Molecular Characterization of Down Syndrome Embryonic Stem Cells Reveals a Role for RUNX1 in Neural Differentiation. Stem Cell Reports. 2016;7:777–86.
65. Olmos-Serrano JL, Kang HJ, Tyler WA, Silbereis JC, Cheng F, Zhu Y, et al. Down Syndrome Developmental Brain Transcriptome Reveals Defective Oligodendrocyte Differentiation and Myelination. Neuron. 2016;89:1208–22.
66. Jiang J, Jing Y, Cost GJ, Chiang J-C, Kolpa HJ, Cotton AM, et al. Translating dosage compensation to trisomy 21. Nature. Nature Publishing Group; 2013;500:296–300.
67. Helguera P, Seiglie J, Rodriguez J, Hanna M, Helguera G, Busciglio J. Adaptive downregulation of mitochondrial function in down syndrome. Cell Metab. 2013;17:132–40.
68. Ripoll C, Rivals I, Ait Yahya-Graison E, Dauphinot L, Paly E, Mircher C, et al. Molecular signatures of cardiac defects in Down syndrome lymphoblastoid cell lines suggest altered ciliome and Hedgehog pathways. Veitia RA, editor. PLoS ONE. Public Library of Science; 2012;7:e41616.
47 bioRxiv preprint doi: https://doi.org/10.1101/641423; this version posted May 17, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
69. Li C, Jin L, Bai Y, Chen Q, Fu L, Yang M, et al. Genome-wide expression analysis in Down syndrome: insight into immunodeficiency. Rogers LK, editor. PLoS ONE. 2012;7:e49130.
70. Chou CY, Liu LY, Chen CY, Tsai CH, Hwa HL, Chang LY, et al. Gene expression variation increase in trisomy 21 tissues. Mamm. Genome. 2008;19:398–405.
71. Altug-Teber O, Bonin M, Walter M, Mau-Holzmann UA, Dufke A, Stappert H, et al. Specific transcriptional changes in human fetuses with autosomal trisomies. Cytogenet. Genome Res. Karger Publishers; 2007;119:171–84.
72. Conti A, Fabbrini F, D'Agostino P, Negri R, Greco D, Genesio R, et al. Altered expression of mitochondrial and extracellular matrix genes in the heart of human fetuses with chromosome 21 trisomy. BMC Genomics. BioMed Central; 2007;8:268.
73. Mao R, Wang X, Spitznagel EL, Frelin LP, Ting JC, Ding H, et al. Primary and secondary transcriptional effects in the developing human Down syndrome brain and heart. Genome Biol. BioMed Central; 2005;6:R107.
74. Hibaoui Y, Grad I, Letourneau A, Sailani MR, Dahoun S, Santoni FA, et al. Modelling and rescuing neurodevelopmental defect of Down syndrome using induced pluripotent stem cells from monozygotic twins discordant for trisomy 21. EMBO Mol Med. EMBO Press; 2014;6:259–77.
75. Engidawork E, Gulesserian T, Fountoulakis M, Lubec G. Aberrant protein expression in cerebral cortex of fetus with Down syndrome. Neuroscience. 2003;122:145–54.
76. Cheon MS, Fountoulakis M, Dierssen M, Ferreres JC, Lubec G. Expression profiles of proteins in fetal brain with Down syndrome. J. Neural Transm. Suppl. 2001;:311–9.
77. Cabras T, Pisano E, Montaldo C, Giuca MR, Iavarone F, Zampino G, et al. Significant modifications of the salivary proteome potentially associated with complications of Down syndrome revealed by top-down proteomics. Mol. Cell Proteomics. 2013;12:1844–52.
78. Liu Y, Borel C, Li L, Müller T, Williams EG, Germain P-L, et al. Systematic proteome and proteostasis profiling in human Trisomy 21 fibroblast cells. Nat Commun. Nature Publishing Group; 2017;8:1212.
79. Sullivan KD, Evans D, Pandey A, Hraha TH, Smith KP, Markham N, et al. Trisomy 21 causes changes in the circulating proteome indicative of chronic autoinflammation. Sci Rep. Nature Publishing Group; 2017;7:14818.
80. Chacinska A, Koehler CM, Milenkovic D, Lithgow T, Pfanner N. Importing mitochondrial proteins: machineries and mechanisms. Cell. 2009;138:628–44.
81. Sylvester JE, Fischel-Ghodsian N, Mougey EB, O'Brien TW. Mitochondrial ribosomal proteins: candidate genes for mitochondrial disease. Genet. Med. 2004;6:73–80.
82. Leonard AP, Cameron RB, Speiser JL, Wolf BJ, Peterson YK, Schnellmann RG, et al. Quantitative analysis of mitochondrial morphology and membrane potential in living cells using high-content imaging, machine learning, and morphological binning. Biochim. Biophys. Acta. 2015;1853:348–60.
83. Amunts A, Brown A, Toots J, Scheres SHW, Ramakrishnan V. Ribosome. The structure of the human mitochondrial ribosome. Science. American Association for the Advancement of Science; 2015;348:95–8.
84. Bogenhagen DF, Ostermeyer-Fay AG, Haley JD, Garcia-Diaz M. Kinetics and Mechanism of Mammalian Mitochondrial Ribosome Assembly. Cell Reports. 2018;22:1935–44.
85. Daily K, Patel VR, Rigor P, Xie X, Baldi P. MotifMap: integrative genome-wide maps of regulatory motif sites for model species. BMC Bioinformatics. BioMed Central; 2011;12:495.
86. Yang Z-F, Drumea K, Mott S, Wang J, Rosmarin AG. GABP transcription factor (nuclear respiratory factor 2) is required for mitochondrial biogenesis. Mol. Cell. Biol. 2014;34:3194–201.
87. Garmhausen M, Hofmann F, Senderov V, Thomas M, Kandel BA, Habermann BH. Virtual pathway explorer (viPEr) and pathway enrichment analysis tool (PEANuT): creating and analyzing focus networks to identify cross-talk between molecules and pathways. BMC Genomics. BioMed Central; 2015;16:790.
88. Bostock M, Ogievetsky V, Heer J. D³ Data-Driven Documents. IEEE Transactions on Visualization and Computer Graphics. 17:2301–9.
89. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. BioMed Central; 2013;14:R36.
90. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. Nature Publishing Group; 2012;7:562–78.
91. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, et al. Fiji: an open-source platform for biological-image analysis. Nat. Methods. Nature Publishing Group; 2012;9:676–82.
92. Divakaruni AS, Paradyse A, Ferrick DA, Murphy AN, Jastroch M. Analysis and interpretation of microplate-based oxygen consumption and pH data. Meth. Enzymol. Elsevier; 2014;547:309–54.
93. Schneider CA, Rasband WS, Eliceiri KW. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods. NIH Public Access; 2012;9:671–5.
94. Yen JC, Chang FJ, Chang S. A new criterion for automatic multilevel thresholding. IEEE Trans Image Process. 1995;4:370–8.
Figure legends

Figure 1: Setup of the mitoXplorer web-based visual data mining platform. A
manually curated, annotated mitochondrial interactome forms the core of the
mitoXplorer software, for which we have assembled 1166 mito-genes in human,
1161 mito-genes in mouse and 1099 mito-genes in fruit fly, grouped into 35 mitochondrial
processes (mito-processes). Gene products are connected using protein-protein
interactions from STRING [39]. Publicly available expression and mutation data from
repositories such as TCGA or GEO are provided for data integration, analysis and
visualization and are stored together with the species interactomes in a MySQL database.
Users can upload their own data, which are stored temporarily and are accessible only to
the uploading user. A set of Python-based scripts at the back-end of the platform handles
data formatting, integration and analysis (Additional File 1, Supplementary Figure S1). The
user interacts with mitoXplorer via several visual interfaces to analyze, integrate and
visualize private as well as public data. Four interactive visualization interfaces are
offered: 1) the Interactome View allows at-a-glance visualization of the entire
mitochondrial interactome of a single dataset (see Figure 2); 2) Comparative Plots,
consisting of a scatterplot and a sortable heatmap, allow comparison of up to six
datasets, whereby a single mito-process is analyzed at a time (see Figure 3);
3) Hierarchical Clustering allows comparison of a large number of datasets, whereby
datasets are clustered according to their expression values. Hierarchical Clustering
plots are zoomable and interactive (see Figure 4); 4) Principal Component Analysis
displays PCA-analyzed datasets in 3D, providing filtering and grouping functions. There
is in principle no limit to the number of datasets that can be analyzed using PCA
(see Figure 5).

Figure 2: Interactome View of the mitoXplorer platform. (a) Overview of all
mito-processes of one dataset. A process is shown either as one circle with segments
colored according to the number of dysregulated genes or, after clicking on the
process, as the individual genes belonging to it. (b) Clicking on the process
Translation or the adjacent ‘+’ replaces the circle with individual bubbles
representing the genes of this process. Clicking on the process again, or on the
adjacent ‘-’, reverts to the circular display. (c) Hovering over a gene bubble
displays the name of the gene and associated information (gene name, description,
chromosomal location, mitochondrial process, accession numbers, as well as log2 fold
change, p-value and observed mutations), as well as all connections to mito-genes in
other processes. The wild-type retinal epithelial cell line RPE1 (RPE) was compared
to RPE1 with Trisomy 21 (RPE_T21).

Figure 3: Comparative Plot of the mitoXplorer platform. (a) The Comparative Plot
display is composed of a scatterplot, a sortable heatmap and a bar chart for the
selection of mito-processes. The scatterplot shows the log2 fold change (y-axis) for
each dataset (x-axis). Each bubble represents one gene, whereby red dots indicate
downregulated and blue dots upregulated genes. The process to be shown is selected by
clicking on the process name in the bar chart next to the scatterplot; the chosen
process is indicated at the top. In this case, TCA cycle was chosen. The heatmap at
the bottom shows the individual genes and the datasets, whereby the genes are colored
according to their log2 fold change (indicated at the bottom of the plot).
(b) Hovering over a gene bubble (or a gene square in the heatmap) displays the
available information (in the case of fly: gene name, mitochondrial process, gene
description, chromosomal location, gene symbol, as well as log2 fold change, p-value
and observed mutations). (c) The heatmap is sortable by log2 fold change (as
indicated by the pointer in c), as well as by dataset. Clicking on one of the datasets
sorts the heatmap according to the log2 fold change of all genes in this dataset, as
illustrated here. Clicking on one of the genes sorts the heatmap according to its
log2 fold changes across the different datasets. The time-series study of developing
flight muscle was used to demonstrate the functionality of this visualization method.

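The two sorting modes of the heatmap described in panel (c) can be sketched in a few lines of Python. This is only an illustration of the idea, not the platform's front-end code: the function names, the gene and dataset labels, the example values and the default descending order are all assumptions made for this sketch.

```python
# Hypothetical log2 fold changes: gene -> {dataset -> value}.
# Gene names, dataset names and values are placeholders, not study data.
LOG2FC = {
    "gene_a": {"t1": -1.2, "t2": 0.4, "t3": 2.0},
    "gene_b": {"t1": 0.8, "t2": -0.3, "t3": -1.5},
    "gene_c": {"t1": 0.1, "t2": 1.9, "t3": 0.2},
}


def sort_genes_by_dataset(log2fc, dataset, descending=True):
    """Order genes by their log2 fold change in one chosen dataset."""
    return sorted(log2fc, key=lambda gene: log2fc[gene][dataset], reverse=descending)


def sort_datasets_by_gene(log2fc, gene, descending=True):
    """Order datasets by the log2 fold change of one chosen gene."""
    values = log2fc[gene]
    return sorted(values, key=values.get, reverse=descending)


if __name__ == "__main__":
    print(sort_genes_by_dataset(LOG2FC, "t1"))
    print(sort_datasets_by_gene(LOG2FC, "gene_a"))
```

Clicking on a dataset corresponds to the first function (the row order of the heatmap changes), clicking on a gene to the second (the column order changes).
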
Figure 4: Hierarchical Clustering and heatmap plot of the mitoXplorer platform.
Hierarchical Clustering of expression data is displayed as a heatmap. (a) Heatmap
of transcriptome and proteome data of mouse knock-out strains of genes involved in
mitochondrial replication, DNA maintenance, transcription and RNA processing (taken
from [46]). Data are clustered by gene as well as by dataset. Gene boxes are colored
according to their log2 fold change. At the top of the heatmap, the user can choose
the mito-process to be displayed. (b) Hovering over one of the gene boxes displays
information on the gene, such as the gene name, mito-process, log2 fold change,
p-value and, if available, observed mutations. The heatmap is also zoomable by
clicking on the magnifying glass at the bottom of the plot, so that large datasets
can be visualized and analyzed efficiently. Datasets can be selected within the
heatmap for grouping. To do this, a group name is defined first; then the datasets
belonging to this group are selected by clicking on one of the gene boxes of each
dataset. This process can be repeated and the resulting groups can then be analyzed
using Comparative Plots.

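As a rough illustration of how datasets can be grouped by their expression values, the following Python sketch implements a naive agglomerative clustering. The legend does not state which distance measure or linkage mitoXplorer uses; Euclidean distance and average linkage are assumptions here, and the function name and example profiles are hypothetical.

```python
from itertools import combinations
from math import dist


def average_linkage_clusters(profiles, n_clusters):
    """Greedy agglomerative clustering with average linkage.

    profiles maps a dataset name to its vector of log2 fold changes
    (one value per gene, same gene order in every vector).
    """
    if not 1 <= n_clusters <= len(profiles):
        raise ValueError("n_clusters must be between 1 and the number of datasets")

    # Start with every dataset in its own cluster.
    clusters = [[name] for name in profiles]

    def cluster_distance(a, b):
        # Average Euclidean distance over all cross-cluster pairs.
        total = sum(dist(profiles[x], profiles[y]) for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (i < j) and merge j into i.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    # Placeholder profiles, not data from the study.
    example = {
        "ko_a": [2.0, 1.9, -1.0],
        "ko_b": [2.1, 2.0, -1.1],
        "wt_a": [0.0, 0.1, 0.0],
        "wt_b": [0.1, 0.0, -0.1],
    }
    print(average_linkage_clusters(example, 2))
```

The same routine can be applied to the transposed matrix to cluster genes instead of datasets, which is what a heatmap clustered in both dimensions requires.
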
Figure 5: Principal component analysis and PCA plot of the mitoXplorer
platform. (a) PCA and plot of transcriptome data from The Cancer Genome Atlas
(TCGA) database [1], showing four different cancer types: breast cancer (BRCA),
kidney cancer (KIRC), liver cancer (LIHC) and lung cancer (LUAD). Each bubble
represents one dataset, in this case one cancer patient. The mito-process to be shown
can be chosen at the top right of the plot. In this case, ‘All Processes’ is chosen,
containing data from all mito-genes. To the right of the plot, different colorings as
well as filters can be chosen. In this case, the Cancer Type was chosen for coloring,
showing the four cancer types in four different colors. (b) Hovering over a bubble
displays associated information on the dataset, including the dataset name and, in the
case of TCGA, information on the cancer type, the stage, the gender, the vital status,
as well as skin color. In addition, the values of the three principal components are
shown. (c) Selecting color schemes on the right-hand side changes the coloring of the
bubbles. In this case, only lung cancer is shown, and coloring is done according to
Stage, Gender, Vital, and Skin color. This panel can also be used for selecting
specific datasets. For instance, clicking on one of the stages displays only the
chosen stage and omits datasets from other stages. As in the heatmap, datasets can be
selected from the PCA for grouping. To do this, a group name is defined first; then
the datasets belonging to this group are selected by clicking on one of the dataset
bubbles. This process can be repeated and the resulting groups can then be analyzed
using Comparative Plots.

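The projection of each dataset onto three principal components, as plotted in the 3D view, can be sketched with NumPy as follows. This is a minimal SVD-based PCA under stated assumptions: rows are datasets (for example patients) and columns are mito-genes. The function name and the random example matrix are hypothetical, and nothing here is taken from the platform's own PCA code.

```python
import numpy as np


def pca_project(matrix, n_components=3):
    """Project datasets (rows) onto the first principal components.

    Returns the per-dataset coordinates and the fraction of variance
    explained by each retained component.
    """
    data = np.asarray(matrix, dtype=float)
    # Center each gene (column) so the components describe variance only.
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return scores, explained[:n_components]


if __name__ == "__main__":
    # Placeholder expression matrix: 10 datasets x 6 genes of random values.
    rng = np.random.default_rng(0)
    scores, explained = pca_project(rng.normal(size=(10, 6)))
    print(scores.shape, explained)
```

Each row of `scores` gives the three coordinates of one bubble; metadata such as cancer type or stage would then only determine the color or filtering of that bubble, not its position.
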
Figure 6: Interactome View of the transcriptome and proteome of cell lines
carrying trisomy 21. Trisomy 21 samples were compared against their wild-type
counterparts. Transcriptomic analysis of (a) HCT116_T21 (trisomy 21 against wild-type)
and (b) RPE_T21 (trisomy 21 against wild-type); (c) proteomic analysis of RPE_T21
cells (trisomy 21 against wild-type). Transcriptome changes differ between the two
trisomy 21 cell lines HCT116 and RPE1. Expression changes in HCT_T21 cells are mild
and genes tend to be upregulated (a), while some genes are strongly downregulated in
RPE_T21 cells (b). The transcriptome (b) and the proteome (c) of RPE_T21 cells respond
quite differently, with a strong downregulation of components of the process oxidative
phosphorylation (OXPHOS) at the proteome level that is not observed at the
transcriptome level. Most genes differentially expressed at the transcript level, on
the other hand, show no significant changes at the proteome level. Red bubbles
indicate downregulated genes, blue bubbles upregulated genes. The size of a bubble
corresponds to the log2 fold change.

Figure 7: Scatterplots of translation and of mitochondrial- and nuclear-encoded
components of oxidative phosphorylation in trisomy 21 cells. (a) In the
mitochondrial process translation, MRPS21 is strongly downregulated at the
transcriptome level in RPE_T21 cells compared to RPE1 wild-type (wt) cells. No change
is observed in HCT_T21 cells. At the proteome level, several components of the
mitoribosome small subunit (SSU) are downregulated in RPE_T21 cells. (b) Transcript
levels of mitochondrial-encoded genes of oxidative phosphorylation (OXPHOS) are not
affected. (c) A significant number of OXPHOS components are downregulated at the
protein level in RPE_T21 cells, while no significant or only mild reduction is
observed at the transcriptome level in trisomy 21 cell lines. Scatterplots are taken
from the mitoXplorer Comparative Plot interface. Each bubble represents one gene; dots
highlighted in pink are selected, light blue dots indicate mutated genes. The log2
fold change is plotted on the y-axis; the cell lines (transcriptome of HCT116 T21
(HCT_T21) clone 1 (c1) and clone 3 (c3) vs wild-type, transcriptome of RPE1 T21
(RPE_T21) clone 1 (c1) and clone 2 (c2) vs wild-type, and proteome of RPE1 T21 clone 1
(RPE_T21 c1) vs wild-type) are plotted on the x-axis. The gene highlighted in pink has
been selected on the web server: MRPS21 for the process translation and MT-CO2 for the
process mt oxidative phosphorylation; no gene has been selected in the process
oxidative phosphorylation.

Figure 8: Mitochondrial respiration and glycolysis are strongly affected in
RPE_T21 cells and not affected in HCT_T21 cells. (a) Respiration in intact
RPE_T21 cells is greatly decreased compared to wild-type. (b–d) Permeabilized
RPE_T21 cells supplemented with substrates of complex I, II and IV, as indicated in
the header of each plot, showed equally dysfunctional OXPHOS, suggesting a general
breakdown of the respiratory chain. (e) RPE_T21 cells do not have any spare glycolytic
reserve. Respiration (f) as well as glycolysis (g) are virtually unchanged in HCT_T21
cells compared to their wild-type counterparts. Bright red: RPE_T21 clone 1; light
red: RPE_T21 clone 2; dark red: RPE wild-type; dark blue: HCT wild-type; light blue:
HCT_T21 clone 1. Measurements of cellular respiration in intact and permeabilized
cells, as well as of glycolytic potential, were performed using the Seahorse
Bioscience XF Extracellular Flux Analyzer (Seahorse Biosciences). The experiments were
performed using the mitochondrial and glycolytic stress test assay protocols as
suggested by the manufacturer; the rates of cellular oxidative phosphorylation (oxygen
consumption rate (OCR)) and glycolysis (cellular proton production rate (PPR)) were
measured simultaneously.

Figure 9: Mitochondrial morphology is slightly changed in trisomy 21 cells. We
stained the mitochondrial network and analyzed its morphology, measuring the
percentage of filaments, rods, puncta and swollen mitochondria using the Fiji plug-in
mitoMorph. (a) Sample mitoMorph analysis of an RPE1 wild-type cell and (b) of an
HCT116 wild-type cell. Filament networks are highlighted in lilac, rods in green,
puncta (referring to fragmented mitochondria) in orange and swollen mitochondria in
blue. The percentages of filaments, rods, puncta and swollen mitochondria are scored
automatically and reported to the user. (c) The percentage of filaments is slightly
but significantly reduced in RPE_T21 as well as HCT_T21 cells compared to wild-type.
(d) There is no significant change in the percentage of rods in RPE_T21 cells and a
slightly higher percentage in HCT_T21 cells compared to wild-type. (e) The percentage
of puncta is unchanged in both T21 cell lines. (f) There are significantly more
swollen mitochondria in both RPE_T21 and HCT_T21 cells compared to wild-type. The
underlying numerical values are provided in Additional File 4, Supplementary Table S3;
sample images are shown in Additional File 1, Supplementary Figure S4.

Tables

Table 1: Mito-processes and number of genes in Human, Mouse and Drosophila.

Mito-process                             Human  Mouse  Drosophila
Amino Acid Metabolism                       81     80          67
Apoptosis                                   55     54          44
Bile Acid Synthesis                          2      2           7
Calcium Signaling and Transport             24     23          12
Cardiolipin Biosynthesis                     5      5           5
Fatty Acid Biosynthesis & Elongation        22     22          15
Fatty Acid Degradation & Beta-oxidation     30     31          26
Fatty Acid Metabolism                       14     11          19
Fe-S cluster biosynthesis                   24     25          18
Folate & Pterine Metabolism                 12     12           9
Fructose Metabolism                          7      7           3
Glycolysis                                  37     39          35
Heme Biosynthesis                            9      9           9
Import & Sorting                            51     52          62
Lipoic Acid Metabolism                       3      3           4
Metabolism of Lipids & Lipoproteins         34     36          17
Metabolism of Vitamins & Co-Factors         16     15          18
Mitochondrial Carrier                       46     45          46
Mitochondrial Dynamics                      60     60          47
Mitochondrial Signaling                     18     18          10
Nitrogen Metabolism                          8      8          21
Nucleotide Metabolism                       14     14          13
Oxidative Phosphorylation                  167    164         174
Oxidative Phosphorylation (MT)              13     13          13
Pentose Phosphate Pathway                    7      7           6
Protein Stability & Degradation             26     26          20
Pyruvate Metabolism                         26     25          24
Replication & Transcription                 53     54          33
ROS Defense                                 32     32          30
Translation                                184    183         191
Translation (MT)                            24     24          24
Transmembrane Transport                     20     19          21
Tricarboxylic Acid Cycle                    21     22          29
Ubiquinone Biosynthesis                      9      9           9
Unknown                                     13     13          21

Figures

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9

Additional file 1 – Supplementary Figures

Supplementary Figure S1: Programmatic skeleton of the mitoXplorer web-platform. In the back-end, a MySQL database stores the mito-interactomes, as well as publicly available expression and mutation data. User-uploaded data are stored temporarily and are available only to the user. A set of Python scripts connects to the MySQL database to retrieve both mito-interactomes and expression and mutation data. The mitomodel script connects to the MySQL database directly for the visualization of the Interactome View. A set of scripts performs the comparative analysis that generates the Comparative Plots, Heatmap and PCA visualizations. In the front-end, a set of JavaScript scripts handles the visualization of the plots: the ‘interactome’ and ‘database’ scripts handle the presentation of the mito-interactome and the available public data for the website; mitomodel visualizes the Interactome View, and the scripts in the compare box are responsible for visualizing the Comparative Plot, Heatmap and PCA. The CSS layer handles the styles of the page and, finally, the HTML/PHP layer creates the actual interface for the user.

Supplementary Figure S2: Length and area distribution of filaments and rods in wild-type and T21-derived RPE1 and HCT116 cells. (a) Stacked bar plots of the filament length distribution of RPE1 wild-type (labeled RPE_wt), RPE1 21/3 (labeled RPE_T21), HCT116 wild-type (labeled HCT_wt) and HCT116 21/3 (labeled HCT_T21) cells. Overall, shorter filaments are more frequent in HCT116 than in RPE1 cells. In T21 cells, filaments tend to be slightly shorter. (b) Stacked bar plots of the filament area distribution of RPE_wt, RPE_T21, HCT_wt and HCT_T21 cells. Overall, less area is occupied by filaments in HCT116 than in RPE1 cells. In HCT_T21 cells, a notably smaller area is assigned to filaments, while in RPE_T21 cells this change is much less pronounced. (c) Stacked bar plots of the rod length distribution of RPE_wt, RPE_T21, HCT_wt and HCT_T21 cells. Overall, in the range between 4 and 10 microns, more rods are found in RPE1 cells. Between wild-type and T21 cells, no clear length difference is observable. (d) Stacked bar plots of the rod area distribution of RPE_wt, RPE_T21, HCT_wt and HCT_T21 cells. Overall, there is a tendency towards slightly larger rod areas in HCT116 cells. In HCT116 cells, rods seem to occupy slightly smaller areas when the cells carry the extra copy of chromosome 21.

Supplementary Figure S3: mitoXplorer scatterplot of Translation and nuclear-encoded Oxidative Phosphorylation of fibroblasts of monozygotic twins discordant for trisomy 21 (T21_MZ) and RPE1 T21 cells. (a) The mRNA of the mitoribosome small subunit component MRPS21 is strongly downregulated only in RPE_T21 cells and is mostly unaffected in monozygotic twins discordant for T21 (T21_MZ fibroblasts: T21_Letour_MZ_fib, T21_Liu_MZ). Mitoribosome proteins are significantly downregulated in RPE_T21 cells and mildly affected in T21_MZ fibroblasts. (b) Oxidative Phosphorylation components encoded in the nucleus are downregulated at the protein level in both RPE_T21 cells and T21_MZ fibroblasts, whereby deregulation is milder in T21_MZ. In both conditions, the Oxidative Phosphorylation transcriptome is mostly unaffected.

Supplementary Figure S4: Mitochondrial network and mitoMorph-based image analysis of wild-type and T21 cells. MitoTracker stainings (a-d) of RPE_wt (a) and RPE_T21 (b), as well as HCT_wt (c) and HCT_T21 (d). (a, b) The mitochondrial network is largely intact in RPE_T21 cells, with only a slightly lower percentage of filaments and an increased number of swollen mitochondria. (c, d) In HCT116 cells, the mitochondrial network is overall less abundant, with more rod-like and fragmented mitochondria (puncta). With trisomy 21, cells show an even more pronounced presence of rods at the cost of longer filaments, as well as more puncta and swollen mitochondria. The scale bar is 50 µm. Mitochondria were stained with MitoTracker Deep Red FM from Invitrogen. Staining was done in 96-well plates. The cells were incubated for 30 min at 30°C with 100 nM MitoTracker dye prior to fixation. Cells were fixed with 3% PFA in DMEM for 5 min at room temperature. After washing with 1xPBS, 1xPBS with 0.02% sodium azide was added. Plates were stored at 4°C in the dark. Imaging was carried out on an inverted Zeiss Observer.Z1 microscope with a spinning disc and 473 nm, 561 nm and 660 nm argon laser lines. The images were captured automatically on multiple focal planes (step size 700 nm) with a 40x magnification air objective. Image stacks were Z-projected using Fiji for further analysis.
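
The Z-projection step mentioned at the end of this legend collapses a stack of focal planes into a single 2D image. The legend does not state which projection type was used in Fiji; the sketch below assumes a maximum-intensity projection and uses NumPy in place of Fiji, so the function name and the (z, y, x) axis order are assumptions made for illustration only.

```python
import numpy as np


def max_intensity_projection(stack):
    """Collapse a (z, y, x) image stack to a single (y, x) image.

    Each output pixel takes the brightest value found at that position
    across all focal planes (assumed projection type).
    """
    stack = np.asarray(stack)
    if stack.ndim != 3:
        raise ValueError("expected a 3D stack ordered as (z, y, x)")
    return stack.max(axis=0)


if __name__ == "__main__":
    # Placeholder stack: 3 focal planes of 2 x 2 pixels.
    demo = np.zeros((3, 2, 2))
    demo[1, 0, 0] = 5
    demo[2, 1, 1] = 7
    print(max_intensity_projection(demo))
```

Other projection types (for example the mean or sum over planes) would only change the reduction applied along the first axis.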