To everyone who supported me

List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Herman S, Åkerfeldt T, Spjuth O, Burman J, Kultima K. Biochemical differences in cerebrospinal fluid between secondary progressive and relapsing–remitting multiple sclerosis. Cells 8(2):84 (2019).

II Herman S, Khoonsari PE, Tolf A, Steinmetz J, Zetterberg H, Åkerfeldt T, Jakobsson PJ, Larsson A, Spjuth O, Burman J, Kultima K. Integration of magnetic resonance imaging and protein and metabolite CSF measurements to enable early diagnosis of secondary progressive multiple sclerosis. Theranostics 8(16):4477–4490 (2018).

III Herman S, Arvidsson McShane S, Zhukovshy C, Khoonsari PE, Burman J, Spjuth O, Kultima K. Disease phenotype prediction in multiple sclerosis. Manuscript (2020).

IV Herman S, Khoonsari PE, Spjuth O, Burman J, Kultima K. A biochemical signature of progressive multiple sclerosis. Manuscript (2020).

Reprints were made with permission from the publishers.

List of related papers

The following papers were not included in the thesis.

I Carlsson H, Abujrais S, Herman S, Khoonsari PE, Åkerfeldt T, Svenningsson A, Burman J, Kultima K. Targeted metabolomics of CSF in healthy individuals and patients with secondary progressive multiple sclerosis using high-resolution mass spectrometry. Metabolomics 16(2):26 (2020).

II Wiberg A, Olsson–Strömberg U, Herman S, Kultima K, Burman J. Profound but transient changes in the inflammatory milieu of the blood during autologous hematopoietic stem cell transplantation. Biology of Blood and Marrow Transplantation 26(1):50–57 (2020).

III Peters K, Bradbury J, Bergmann S, Capuccini M, Cascante M, de Atauri P, Ebbels TMD, Foguet C, Glen R, Gonzalez-Beltran A, Günther UL, Handakas E, Hankemeier T, Haug K, Herman S, Holub P, Izzo M, Jacob D, Johnson D, Jourdan F, Kale N, Karaman I, Khalili B, Emami Khonsari P, Kultima K, Lampa S, Larsson A, Ludwig C, Moreno P, Neumann S, Novella JA, O’Donovan C, Pearce JTM, Peluso A, Piras ME, Pireddu L, Reed MAC, Rocca–Serra P, Roger P, Rosato A, Rueedi R, Ruttkies C, Sadawi N, Salek RM, Sansone SA, Selivanov V, Spjuth O, Schober D, Thévenot EA, Tomasoni M, van Rijswijk M, van Vliet M, Viant MR, Weber RJM, Zanetti G, Steinbeck C. PhenoMeNal: processing and analysis of metabolomics data in the cloud. Gigascience 8(2) (2019).

IV Khoonsari PE, Moreno P, Bergmann S, Burman J, Capuccini M, Carone M, Cascante M, de Atauri P, Foguet C, Gonzalez–Beltran A, Hankemeier T, Haug K, He S, Herman S, Johnson D, Kale N, Larsson A, Neumann S, Peters K, Pireddu L, Rocca–Serra P, Roger P, Rueedi R, Ruttkies C, Sadawi N, Salek RM, Sansone SA, Schober D, Selivanov V, Thévenot EA, van Vliet M, Zanetti G, Steinbeck C, Kultima K, Spjuth O. Interoperable and scalable metabolomics data analysis with microservices. Bioinformatics 35(19):3752–3760 (2019).

V Novella JA, Khoonsari PE, Herman S, Whitenack D, Capuccini M, Burman J, Kultima K, Spjuth O. Container-based bioinformatics with Pachyderm. Bioinformatics 35(5):839–846 (2019).

VI Herman S, Niemelä V, Khoonsari PE, Sundblom J, Burman J, Landtblom AM, Spjuth O, Nyholm D, Kultima K. Alterations in the tyrosine and phenylalanine pathways revealed by biochemical profiling in cerebrospinal fluid of Huntington’s disease subjects. Scientific Reports 9(1):4129 (2019).

VII Herman S, Khoonsari PE, Aftab O, Krishnan S, Strömbom E, Larsson R, Hammerling U, Spjuth O, Kultima K, Gustafsson M. Mass spectrometry based metabolomics for in vitro systems pharmacology: pitfalls, challenges, and computational solutions. Metabolomics 13(7):79 (2017).

Contents

1 Introduction
    1.1 The central nervous system
    1.2 Neurons
    1.3 Neurotransmitters
    1.4 Cerebrospinal fluid
    1.5 Neuroimmunology
    1.6 Multiple sclerosis
    1.7 Metabolomics
    1.8 Biomarkers and multianalyte algorithmic assays

2 Aims

3 Methodologies
    3.1 Mass spectrometry
    3.2 Tandem mass spectrometry
    3.3 Liquid chromatography
    3.4 Experimental design
    3.5 Metabolite identification
    3.6 Metabolite quantification
    3.7 Normalization
    3.8 Covariate correction
    3.9 Dimensionality reduction and latent variable models
    3.10 Regularization techniques
    3.11 Multilevel modelling
    3.12 Achieving robust results
    3.13 Model performance estimation
    3.14 Conformal prediction

4 Study summaries
    4.1 Paper I
    4.2 Paper II
    4.3 Paper III
    4.4 Paper IV
    4.5 Principal findings

5 Reflections
    5.1 Experimental aspects
    5.2 Computational thoughts
    5.3 Biological contemplations
    5.4 Future work

6 Concluding remarks

7 Acknowledgements

References

Abbreviations

AUC/AUROC   Area under the ROC curve
BER         Balanced error rate
CID         Collision-induced dissociation
CNS         Central nervous system
CSF         Cerebrospinal fluid
EDSS        Expanded disability status score
ESI         Electrospray ionization
HCD         Higher-energy collisional dissociation
HMDB        The Human Metabolome Database
HPLC        High-performance liquid chromatography
LASSO       Least absolute shrinkage and selection operator
LC-MS       Liquid chromatography-mass spectrometry
m/z         Mass-to-charge ratio
MAAA        Multianalyte assay with algorithmic analyses
MRI         Magnetic resonance imaging
MS/MS       Tandem mass spectrometry
OLS         Ordinary least squares
OPLS-DA     Orthogonal partial least squares discriminant analysis
PC          Principal component
PCA         Principal component analysis
PLS-DA      Partial least squares discriminant analysis
PMS         Progressive multiple sclerosis
PNS         Peripheral nervous system
PPMS        Primary progressive multiple sclerosis
ROC         Receiver operating characteristic
RP          Reversed-phase
RRMS        Relapsing-remitting multiple sclerosis
SPMS        Secondary progressive multiple sclerosis
VIP         Variable importance in projection
QC          Quality control

1. Introduction

Clinical tests are traditionally based on single measurements that are considered together, evaluated and acted upon by clinical practitioners. In other domains, such as stock trading, inventory replenishment and delivery, and electricity supply and demand, decision-making is increasingly driven or supported by computational algorithms and advanced analytics. Implementing computationally driven healthcare solutions is challenging, since medical diagnoses and the practice of medicine depend on many factors that are not directly related to the analysis of data, e.g. the wishes and needs of patients, the availability and affordability of healthcare, the interpretation of signs and symptoms by clinical practitioners, the conceptualization of diseases and the understanding of their pathophysiology. Besides, human health is too valuable to entrust completely to an autonomous algorithm with no human supervision [1, 2]. Instead, augmenting clinical practitioners with advanced analytical insights, in combination with comprehensive information to support decision-making in an educated and data-driven way, would be a more feasible strategy that can benefit the current healthcare system.

Multiple sclerosis is a neurological disease that can involve multiple intertwined substages that are difficult to discern because of their transitional occurrence (Figure 1.1). As such, multiple sclerosis is a good example of a disease where computationally supported diagnostics could be of value for subtype distinction. This thesis addresses the need for earlier detection of patients developing the progressive and neurodegenerative subtype of multiple sclerosis. The results of this work reveal that several biochemical pathways are affected once a transition to the progressive subtype occurs. These alterations enable distinction between the progressive phase and its preceding subtype.

Figure 1.1. The subtypes of multiple sclerosis can develop gradually, with the phenotypes appearing mixed during a transitional period.

Considering the infeasibility of utilizing a complete biochemical profile as a diagnostic test, work towards a limited subset of highly informative markers was carried out. The identified subsets, comprising solely low-molecular candidate markers or in combination with radiological and protein measures, indicated success in detecting the progressive phenotype.

1.1 The central nervous system

The nervous system can be divided into the central nervous system (CNS) and the peripheral nervous system (PNS). The CNS is the primary processing unit that consists of the brain and spinal cord, and essentially determines who we are and how we interact with the environment [3, 4]. External nerves and ganglia, which comprise the PNS, gather external and internal sensations that the CNS organizes into conscious experiences. The brain is a highly sophisticated organ that is composed of many specialized compartments, including the cortex, cerebellum, basal ganglia and brain stem. Anatomically, the brain can also be divided into the right and left hemispheres and multiple lobes related to specific functions: the frontal lobes to judgment and motor function; the occipital lobes to visual processing; the parietal lobes to somatic sensation and movement; the temporal lobes to learning, memory and emotion [3]. Extending from the brain stem is the spinal cord, whose main function is to allow the brain to efficiently communicate with the rest of the body. The spinal cord consists of a column of neurons encased in a segmented bone structure that makes up the spine. Each segment of this structure contains spinal neurons responsible for a particular body region.

1.2 Neurons

The basic working units in the CNS are neurons (nerve cells), which react to stimuli and deliver information in the form of electrochemical signals. The human brain consists of approximately 10^11 neurons interconnected in a highly complex structure [3]. The human brain reaches its mature stage at around the age of 25, and the complexity of the neural network increases with age as more connections are formed [5]. Neurons are composed of a cell body carrying all the genetic information, dendrites, which extend from the cell body and collect information from the surrounding areas, and the axon, which conducts information to other targeted neurons, muscles or glands. To increase the efficiency with which signals are conducted, some axons are wrapped in an insulating sheath composed of several layers of a lipid structure called myelin. In the CNS, myelin contains two main proteins, myelin basic protein (MBP) and the hydrophobic proteolipid protein (PLP) [3].

1.3 Neurotransmitters

Neurotransmitters are brain-derived chemical messengers that transmit information within and between the CNS and the rest of the body [3]. The total number of neurotransmitters used in the human body is unknown, but is well over 100. One example of an important neurotransmitter is the catecholamine dopamine, which modulates our reward and pleasure system. Many drugs of abuse, such as alcohol, cocaine and amphetamine, work by enhancing the effect of dopamine. Dopamine is present in many brain regions but is mostly concentrated in the striatum, a critical component of the reward and motor systems. In Parkinson’s disease, a key cause is considered to be a deficiency in the dopaminergic pathway, caused by the degeneration of the dopaminergic neurons that supply the striatum with dopamine [4, 6].

Another example of a well-known neurotransmitter is serotonin. Serotonin is a multifaceted neurotransmitter that is best known for its ability to give us the feeling of well-being. Serotonin is involved in virtually all human behaviour, including anger, appetite, attention, memory and sexuality. In many of these behaviours, there is at least one critical brain region where specific serotonin receptors are involved in generating the behavioural output. However, just as each behaviour is regulated by a multitude of serotonin receptors, each serotonin receptor is expressed in multiple brain regions and is most likely linked to multiple behaviours. This may explain why many drugs targeting a specific serotonin receptor (e.g. serotonin-based antidepressants) affect more than one behaviour [3, 7].

1.4 Cerebrospinal fluid

The CNS is protected by three layers of tissue, collectively called the meninges. In between the meninges flows the cerebrospinal fluid (CSF), which supports the blood vessels, protects the neural tissue and acts as a shock absorber. Additionally, the CSF allows the brain to float, reducing its effective weight in situ from an average of 1400 g to less than 50 g [3]. Previously, CSF was seen as a clear body fluid with minimal content whose primary purpose was to protect the brain. Now we know that CSF plays an essential role in brain metabolism, as it exports waste products from neural tissue and supplies the CNS with nutrients from the blood [8]. When damage occurs in the CNS, the molecular composition of the CSF changes dramatically. It is commonly accepted that these changes mirror the alterations in the CNS, making CSF an appealing source for monitoring pathological events in the CNS. Cerebrospinal fluid is collected under local anaesthesia through a lumbar puncture in the lower back. The needle is inserted into the spinal canal between two lumbar vertebrae and drops of CSF are collected [9].

1.5 Neuroimmunology

The immune system is the body’s maintenance system, protecting us from external and internal threats. It was long believed that the immune system exclusively maintained the rest of the body and was not engaged in the CNS. However, pioneering work has shown that the immune system plays a vital role in neurogenesis, the process by which neurons are produced by neural stem cells, and that under the right circumstances it protects injured neurons from degradation [10, 11]. Mice with a deficient immune system were shown to have impaired cognitive performance compared with wild-type mice, and restoring their immune functions by immune cell transplantation resulted in a cognitive improvement [12]. The CNS resident immune cells are the microglia, which actively patrol the CNS for modifications that might perturb homeostasis (i.e. the steady state of internal conditions). Once microglia encounter foreign or potentially harmful substances, they enter an active state in which they upregulate CNS innate immunity and trigger appropriate responses such as inflammation [13]. Neuroinflammation comprises the complex and fundamental mechanisms that protect the neural tissue from damage or infection. However, under aberrant circumstances, neuroinflammation can be harmful, as microglial activation can contribute to and enhance neurodestructive effects [14].

The adaptive immune system comprises the second level of defence, which triggers highly specific responses based on immunological memory. The adaptive immune response relies on B and T lymphocytes, or in short B and T cells, with antigen-specific receptors. An initial encounter with a pathogen triggers the proliferation of immune cells with pathogen-specific receptors, establishing immunological memory. Once these cells have multiplied, a second encounter with the same pathogen is met by rapid and tailor-made counteractions. The underlying principle of vaccines is to exploit this process by introducing fragments of pathogens to train the immunological memory. Under abnormal conditions that are not yet fully understood, self-recognition can arise, where the normally protective immune cells start to react to harmless internal structures. These abnormal immune responses are characteristic of autoimmune diseases [15].

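[Figure 1.2: expanded disability status score (EDSS) plotted against time, with the RRMS and SPMS phases, SPMS initiation and SPMS diagnosis marked and a delay of up to 3 years between the latter two. EDSS levels shown: 0; 2, minimal disability; 4, increased limitation in walking; 6, walking assistance; 7, restricted to wheelchair; 9, restricted to bed.]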

Figure 1.2. The typical progression of multiple sclerosis. The degenerative phenotype, secondary progressive multiple sclerosis (SPMS), is typically diagnosed retrospectively; in the meantime, the patient retains the diagnosis of the preceding phenotype, relapsing-remitting multiple sclerosis (RRMS).

1.6 Multiple sclerosis

Multiple sclerosis is a chronic autoimmune, inflammatory, relapsing or progressive disease of the CNS, in which the immune system attacks the protective myelin sheaths on the neurons. The exposed axons cause the nerve impulses travelling to and from the brain and spinal cord to be distorted or interrupted, giving rise to a variety of symptoms depending on the afflicted region. When the inflammation subsides, scar tissue (sclerosis) is formed in the damaged area, giving the disease its name. Multiple sclerosis exists in two prominent phenotypes. The most common initial phenotype is called relapsing-remitting multiple sclerosis (RRMS), in which focal inflammation appears over time, causing clinical relapses or lesions that are followed by complete or partial recovery, with no disease progression between relapses [16]. The RRMS stage is illustrated by the blue line in Figure 1.2. In many of these patients, the neurological damage accumulates over time, causing the disease to evolve into a progressive and degenerative phenotype called secondary progressive multiple sclerosis (SPMS) [16]. At this stage, new lesions are rare. Instead, old lesions gradually expand and neurodegeneration is abundant, with increasing disability (red line in Figure 1.2). With time, an increasing number of patients will be diagnosed with SPMS, but the interindividual variation in the time taken to reach this point is considerable.

Secondary progressive multiple sclerosis is typically diagnosed retrospectively, up to three years after the transition has occurred, which is partly due to its vague functional definition and diagnostic strategy [17–19]. Today the diagnosis is typically given at clinical manifestation, after a careful examination of the patient’s historical disease course, the rate of permanent disability accumulation and neuronal loss on repeated magnetic resonance imaging (MRI) scans. However, as the brain is able to compensate for neuronal loss by e.g. large extra-region neuronal recruitment, increased activation of homologous brain regions and neuronal plasticity [20, 21], the signs do not manifest until these mechanisms have been exhausted. As a consequence, the diagnosis is given years after the pathophysiological and biochemical changes have developed. Once the neurodegenerative process has started, treatment is limited to patients that display superimposed relapses [22]. Finally, a third, less common phenotype is primary progressive multiple sclerosis (PPMS), in which patients develop neurodegeneration from the onset, without going through the RRMS phase. However, opinions are currently divided on whether this phenotype should be treated as an independent phenotype or whether PPMS and SPMS should collectively be denoted progressive multiple sclerosis (PMS) [19, 23]. Herein, the progressive subtypes will be referred to as PMS.

1.7 Metabolomics

To aid the development of disease-modifying therapies as well as disease-specific diagnostic markers for PMS, we need to gain a deeper understanding of its underlying molecular profile and its unique molecular attributes. Omics informally refers to the biological studies of the molecular dynamics in our bodies, including genomics (the study of the genome), transcriptomics (the study of RNA transcripts), proteomics (the study of proteins) and metabolomics (the study of low-weight molecules called metabolites), to mention a few (Figure 1.3). More specifically, metabolomics is the study of the metabolites comprising the global metabolic network (metabolome) in a biological sample. As the metabolome includes all intermediate and end products of all biochemical pathways, changes caused by various pathophysiological processes will immediately leave traces in the metabolome [24, 25]. The prominence of metabolomics has been fuelled by significant technological advances, especially the increased sensitivity of the mass spectrometers commonly used for metabolomic studies. At present, we can measure up to 10,000 independent spectral features from a single biological specimen. Even under favourable conditions, however, we are only able to decipher the identity of a third of these features [26]. Identification and validation are the primary challenges faced when performing non-targeted metabolomics [27, 28]. Unlike genes, transcripts and proteins, which are unique sequences of individual components (nucleotides and amino acids), metabolites are small molecules characterised solely by their unique composition and orientation of atoms and bonds [29].

[Figure 1.3: the omics cascade along a time axis: genome, transcriptome, proteome and metabolome, subject to environmental and physiological influence.]

Figure 1.3. A generalisation of the omics cascade, showing the genome, transcriptome, proteome and metabolome relative to each other. Of these four, the metabolome is the most dynamic layer, with the fastest turnover.

In a non-targeted approach, the metabolites are measured in relation to each other, providing relative abundances [30]. A non-targeted approach has the advantage of measuring the metabolome as thoroughly as the technologies allow. In the discovery phase, this is preferable, as blind exploration will generate unbiased hypotheses and an overview of molecular events [25, 31]. Many non-targeted metabolomics studies are being performed, comparing various disease states [32–34], various food choices [35–37] or other conditions such as impaired glucose tolerance [38], inflammation [39] and professional exercise [40]. Once a hypothesis has been generated, one can proceed to deploy a targeted approach to verify the proposition. In a targeted approach, a single metabolite or a few metabolites are selectively measured, enabling more accurate estimations [30, 31, 41]. By adding chemically labelled internal standards, the measured relative abundances can be converted into close to absolute quantities with the help of a calibration curve [42]. The standards are typically labelled with one or a few deuterium atoms (heavy hydrogen), whose nucleus contains one proton and one neutron, whereas the nucleus of the far more common hydrogen isotope contains only one proton. Hence, the mass of a heavy-labelled standard increases by approximately 1 Da for each hydrogen replaced by deuterium. Targeted metabolomics studies are typically driven by a specific question or hypothesis that motivates the selection of metabolites [25].
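As an illustration of how a calibration curve based on a labelled internal standard can turn relative abundances into approximate concentrations, the sketch below fits a straight line to the analyte/standard response ratio. The concentrations, peak areas and function names are hypothetical and are not taken from the studies in this thesis; a linear response is assumed.

```python
import numpy as np

# Hypothetical calibration data: known analyte concentrations (in µM), each
# spiked with the same amount of deuterium-labelled internal standard.
CONC = np.array([0.5, 1.0, 2.0, 5.0, 10.0])
ANALYTE_AREA = np.array([1.0e4, 2.1e4, 3.9e4, 1.02e5, 1.98e5])
STANDARD_AREA = np.array([5.0e4, 5.1e4, 4.9e4, 5.0e4, 5.0e4])

# Mass difference between deuterium (2.014102 Da) and protium (1.007825 Da).
D_MINUS_H = 2.014102 - 1.007825


def fit_calibration(conc, analyte_area, standard_area):
    """Fit response ratio = slope * concentration + intercept (assumed linear)."""
    ratio = analyte_area / standard_area
    slope, intercept = np.polyfit(conc, ratio, deg=1)
    return slope, intercept


def quantify(analyte_area, standard_area, slope, intercept):
    """Convert a measured response ratio into an estimated concentration."""
    return (analyte_area / standard_area - intercept) / slope


def labelled_mz(mz, n_deuterium, charge=1):
    """Expected m/z of a standard in which n hydrogens are replaced by deuterium."""
    return mz + n_deuterium * D_MINUS_H / charge


slope, intercept = fit_calibration(CONC, ANALYTE_AREA, STANDARD_AREA)
estimate = quantify(1.0e5, 5.0e4, slope, intercept)
print(f"Estimated concentration: {estimate:.2f} µM")
```

Dividing by the internal standard area corrects for injection-to-injection variation in signal, which is why the ratio, rather than the raw analyte area, is regressed on concentration.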

1.8 Biomarkers and multianalyte algorithmic assays

Although metabolomics is the primary focus of this work, all omics levels are important to enable a complete pathological characterisation of PMS. Once this is accomplished, a more thorough understanding of disease progression can be achieved and reliable and specific biomarkers can be evaluated and established for diagnostic and prognostic purposes. Biomarkers are characteristics that can be objectively measured (e.g. a biological substance) and could aid interpretation or prediction of the incidence or outcome of disease [43, 44]. However, the molecular pathology of neurological diseases is often highly complex and the information that a single biological source or biomarker can provide is limited. Therefore, multi-omics, as well as the combination of omics-related findings with other valuable information such as clinical measures and/or historical patient data, are currently areas of intense effort. This concept of integrating information from several sources is sometimes referred to as a multianalyte assay with algorithmic analyses (MAAA) and has the aim of improving diagnostics where single biomarkers have limited success [45]. A successful example of a MAAA is the improved test for prostate cancer, referred to as the Stockholm3 blood test, which combines information from protein markers, genetic polymorphisms and clinical measures [46, 47]. We applied this concept in Paper II, where we integrated information from metabolite and protein measures in CSF with radiological information from MRI scans of the brain and spinal cord and with clinical information. Also, in Paper III, we combined information from a limited set of carefully selected metabolites to distinguish PMS patients, to monitor the transitional event and to assess the biochemical signature on a per-patient level.

2. Aims

The aim of this thesis is to aid early detection and prediction of disease progression in PMS using primarily metabolomics and machine learning approaches. With this goal in mind, the following targeted objectives were specified:

1. Determine the biochemical differences between RRMS and PMS patients, and evaluate if these can distinguish the two phenotypes.

2. Extract a limited set of highly informative markers (metabolites, proteins, MRI derived and/or clinical measures) that efficiently and consistently can distinguish PMS.

3. Evaluate if the biochemical signature of PMS can predict clinical outcome of a therapeutic intervention.

3. Methodologies

3.1 Mass spectrometry

Mass spectrometry is an analytical technique that enables quantification and identification of molecules present in a sample. By measuring the mass-to-charge ratio (m/z) of molecules and their relative abundance, it generates a mass spectrum that reveals the molecular fingerprint of the sample, also called the elemental or isotopic signature. A typical mass spectrometer has three components: an ion source, converting the sample into gas-phase ions; a mass analyser, measuring the m/z of the ions; and a detector, recording the relative abundance. A key feature of, and a necessity for, the analytical strategy in mass spectrometry is the ionization of molecules by the addition of energy.

Herein, a Thermo Q Exactive Orbitrap mass spectrometer was used for analysis (Figure 3.1). The Q Exactive Orbitrap uses electrospray ionization (ESI), which transforms molecules into ions and the solution into gas by applying a high voltage to the eluting liquid. ESI is a soft ionization technique in the sense that very little energy is retained by the gaseous molecule, which prevents unwanted fragmentation of the compound. Before ESI was developed, fragmentation of analytes was a problematic issue in the field [48, 49]. The gaseous ions are transferred to the mass analyser through several ion optics or ion transfer tubes, which stabilize and maintain a steady stream of ions. On its way to the mass analyser, the stream of ions passes through the mass filter, which in a Q Exactive Orbitrap is a quadrupole composed of four symmetrically arranged rods. The quadrupole filters the gaseous ions based on their m/z: only ions within a selected m/z range follow stable trajectories between the rods and are transmitted. The filtered ions are then passed to the detector system, which measures their abundance in relation to each other [49, 50].

In the Q Exactive Orbitrap, the gaseous ions pass through the C-trap compartment and finally arrive at the Orbitrap, where they are forced to circulate around an axial spindle. As they orbit, the ions also oscillate harmonically along the spindle with a frequency that depends on their m/z (it is inversely proportional to the square root of the m/z). The detector records the signal induced by these oscillations, which is amplified, digitized and converted into a mass spectrum by Fourier transformation [50, 51].
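The Fourier-transform step can be illustrated with a small simulation. In the standard description of the Orbitrap analyser, the axial oscillation frequency scales with the inverse square root of the m/z, so a recorded transient can be converted into a mass spectrum by locating peaks in its frequency spectrum. The sketch below is a toy model under stated assumptions: the instrument constant A, the sampling rate and the ion list are hypothetical values chosen for illustration, not calibration data from the instrument used here.

```python
import numpy as np

# Hypothetical instrument constant so that f = A / sqrt(m/z); a real
# instrument determines this relation through mass calibration.
A = 1.0e7


def axial_frequency(mz):
    """Axial oscillation frequency (Hz) for a given m/z in this toy model."""
    return A / np.sqrt(mz)


def mz_from_frequency(freq):
    """Invert the frequency relation to recover the m/z."""
    return (A / freq) ** 2


def simulate_transient(mzs, abundances, duration=0.1, sample_rate=2.0e6):
    """Sum of cosines, one per ion species, sampled at a fixed rate."""
    n = int(duration * sample_rate)
    t = np.arange(n) / sample_rate
    signal = np.zeros(n)
    for mz, abundance in zip(mzs, abundances):
        signal += abundance * np.cos(2 * np.pi * axial_frequency(mz) * t)
    return signal


def spectrum_from_transient(signal, sample_rate=2.0e6):
    """Return (frequencies, amplitudes) from the real-input FFT of the transient."""
    amplitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs, amplitudes


# Two ion species; the one at m/z 200 is twice as abundant.
transient = simulate_transient(mzs=[200.0, 400.0], abundances=[1.0, 0.5])
freqs, amplitudes = spectrum_from_transient(transient)
base_peak_mz = mz_from_frequency(freqs[np.argmax(amplitudes)])
print(f"Base peak recovered at m/z {base_peak_mz:.3f}")
```

A longer transient gives finer frequency bins and therefore higher mass resolution, which is the trade-off between scan speed and resolving power in Fourier-transform instruments.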

Figure 3.1. Schematic of the Q Exactive mass spectrometer. Printed with permission from Thermo Fisher Scientific.

3.2 Tandem mass spectrometry

Tandem mass spectrometry, commonly denoted MS/MS, refers to the fragmentation of ions to acquire structural information [52]. This information is important in the identification process and is typically collected in every non-targeted metabolomics study [29]. When performing tandem mass spectrometry using a Q Exactive Orbitrap mass spectrometer, the quadrupole selectively isolates the ions of interest (usually called the precursor ions). The homogeneous ion stream is then passed on, through the C-trap, into a multipole collision cell where it collides with a collision gas, causing the ions to dissociate into smaller pieces or fragments (daughter ions) (Figure 3.2). The generated daughter ions are related to the molecular structure of their parent ion and can, therefore, aid in its identification [48]. This process is called higher-energy collisional dissociation (HCD) and is a collision-induced dissociation (CID) technique specific to the Orbitrap instruments [50]. The ion fragments are then transferred back to the C-trap and into the Orbitrap, where they are analysed as previously described, generating a fragmentation mass spectrum (Figure 3.2). In the work presented herein, tandem mass spectrometry analyses were performed on pooled subgroups of study samples, in which the most dominant ions were repeatedly selected for fragmentation. This strategy was aimed at generating fragmentation patterns of as many unique ions as possible to aid the identification process.
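The repeated selection of the most dominant ions can be sketched as follows. This is a minimal illustration rather than the acquisition method used on the instrument: the exclusion list, the m/z tolerance and the function name are assumptions introduced here to show how repeated rounds can reach progressively less abundant, not yet fragmented ions.

```python
def select_precursors(survey_scan, top_n, excluded=None, tolerance=0.01):
    """Pick the top_n most intense ions that have not been fragmented before.

    survey_scan: list of (mz, intensity) tuples from a full scan.
    excluded:    m/z values already selected in earlier rounds (assumed feature).
    Returns the selected m/z values and the updated exclusion list.
    """
    excluded = list(excluded or [])
    ranked = sorted(survey_scan, key=lambda peak: peak[1], reverse=True)
    selected = []
    for mz, _intensity in ranked:
        # Skip ions within the tolerance of an already fragmented precursor.
        if any(abs(mz - seen) <= tolerance for seen in excluded):
            continue
        selected.append(mz)
        excluded.append(mz)
        if len(selected) == top_n:
            break
    return selected, excluded


# Hypothetical survey scan: (m/z, intensity)
scan = [(100.0, 5.0), (150.0, 50.0), (200.0, 20.0), (250.0, 10.0)]
first_round, seen = select_precursors(scan, top_n=2)
second_round, seen = select_precursors(scan, top_n=2, excluded=seen)
print(first_round, second_round)
```

Without some form of exclusion, every round would pick the same dominant ions, so carrying the list of already fragmented precursors forward is what lets the number of unique fragmentation spectra grow.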

[Figure 3.2: fragmentation spectrum; axes: intensity versus mass-to-charge (m/z).]

Figure 3.2. A fictional fragmentation spectrum of a parent ion (top right).

3.3 Liquid chromatography

Mass spectrometers are limited in terms of the dynamic range of molecules that they can measure. When a sample is complex, which is often the case, direct injection of the sample will result in a less thorough fingerprint, caused by internal competition between the molecules. Coupling the mass spectrometer to a chromatograph (gas or liquid) tackles this limitation by pre-separating the molecules based on their physicochemical properties (e.g. hydrophobicity) and eluting them in real time into the mass spectrometer. Hence, the pre-separation introduces a time dimension to the generated data, resulting in multiple mass spectra recorded over a time interval rather than a single mass spectrum at a single time point. This generates three-dimensional data consisting of intensity, m/z and retention time information. Using such a setup enables discrimination of molecules with the same mass but different physicochemical properties, which are otherwise impossible to distinguish using current mass spectrometers alone [24].

Within this work, reversed-phase (RP) liquid chromatography with a C-18 column was used, which is commonly applied in non-targeted metabolomics [24]. The samples were injected onto a Thermo Accucore aQ RP C-18 column with a particle size of 2.6 μm. The high-performance liquid chromatography (HPLC) system used was a Thermo Ultimate 3000 HPLC, which was coupled to the Thermo Q Exactive Orbitrap mass spectrometer. The HPLC system consists of the column connected to one or several mobile phases (solvents). The mobile phases, as well as the sample, are transported through the column by pressure generated by a pump. The samples interact with the C-18 chains in the column, commonly referred to as the stationary phase, causing the sample content to elute at different time points. The sequential elution of molecules can be fine-tuned by running a gradient of an aqueous and an organic mobile phase, adjusting the aqueous/organic ratio to favour different physicochemical properties. The aqueous phase we used consisted of acidified water (99.9% MilliQ water with 0.1% formic acid) and the organic phase consisted of 89.9% acetonitrile, 10% isopropanol and 0.1% formic acid.

3.4 Experimental design

Liquid chromatography-mass spectrometry (LC-MS) is a highly sensitive technology, which carries an increased risk of experimental variability and biases in the generated data. Such variability may arise from contaminants in the solvents or tubes used in the sample preparation. Batching of samples, enforced by the infeasibility of preparing all samples at once, may give rise to variation unique to the samples of a given batch, i.e. batch effects. Samples may further have been collected at various time points and seasons of the year, and may have been stored and transported for different amounts of time, which can also give rise to unwanted variation [31, 53–56]. When comparing two or more groups (e.g. healthy vs. diseased, cats vs. dogs), one needs to consider the within-group distributions of attributes like age, gender or country of origin. Any of these sources of diversity can give rise to unwanted variation and biases in the final data, which could be devastating if confounded with the study groups.

[Figure 3.3 image omitted. Legend: Equilibration, Blank, QC, Sample. Sequence annotations: n = 8, n = 8. QC dilution series: 1:32, 1:16, 1:8, 1:4, 1:2, 1:1.]

Figure 3.3. Analysis injection order design. Quality control (QC) pools have been coloured in grey, blank samples for contaminant collection in blue and study samples in orange.

Statistical considerations should optimally be taken into account as early as the planning stage of the study. A key point is that the interpretation of data and the conversion of raw measurements into biologically relevant information will only be as good as the quality of the experiments. The experimental design is the most critical step for obtaining valuable and accurate knowledge from a study. A fundamental rule for obtaining high-quality data is to randomise the samples throughout the experiment(s), so that biases and confounding effects from various sources, including instrumental and experimental ones, are avoided or at least minimized [57]. The targets for randomisation include the order in which samples are handled, prepared and analysed. An effective strategy is to blind the experimentalist, i.e. to recode the samples to remove any prior knowledge of sample identity or properties.

“Statistical methods require that the observations (or errors) be independently distributed random variables. Randomization usually makes this assumption valid.” - Design and Analysis of Experiments, Douglas C. Montgomery.

In LC-MS analyses, the ion signals are to a large extent proportional to the analyte concentration. However, a critical issue is the potential ion suppression caused by e.g. matrix interference [58]. Other issues that can affect the amplitude of the ion signals are intensity/sensitivity decay throughout the analysis (potentially caused by compounds aggregating in the column), carry-over effects between samples, and saturation effects when the instrument is unable to concurrently process all molecules present [31, 59]. Hence, an analyte in a complex biological sample would typically give a significantly lower signal than in pure solution [48]. Using labelled or unlabelled internal standards of a known quantity, together with a calibration curve, enables estimation of absolute quantities of precursor ions [41, 42]. However, in non-targeted metabolomics, where all molecules are of equal interest, this is not a feasible approach. Instead, a cocktail of labelled internal standards covering a large part of the mass range and retention time interval can be included to aid between-sample comparison (e.g. normalization based on internal standards) and real-time quality assessment.

To assess data quality and performance during and after the analysis, dedicated quality samples are typically included, i.e. blank samples for contaminant detection and quality control (QC) samples for estimating instrumental variability and metabolite stability. The QC samples typically correspond to an identical sample (commonly a pool of all study samples), which is repeatedly injected to ensure reproducibility throughout the analysis and enable post-analysis quality assessment [31], Figure 3.3. If internal standards have been included, the signals of these compounds can be used to monitor the sensitivity and performance of the instrumental setup [30]. To pick up signals that genuinely originate from the biological samples, a QC dilution series is advantageous. This can be generated simply by repeatedly injecting the QC sample with an increasing injection volume. Signals that originate from the samples will correlate with the dilution and can thereby be extracted. This analysis design has been employed in all studies herein and is illustrated in Figure 3.3.

The first repeatedly injected QC samples that initialize the analysis are included to equilibrate the system. These should not be included in the data processing or metabolite stability estimation. The number of QCs needed to stabilize the system may vary. If the system has been equipped with a completely new column, the number of QCs needed for equilibration is typically larger.
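The dilution-series filter described above can be sketched as follows. This is a minimal illustration, not the implementation used in the studies: the Pearson correlation as the measure, the cut-off `min_r = 0.9`, the feature names and the intensities are all assumptions made for the example.

```python
import math

# Relative amount of QC material injected for the series 1:32 ... 1:1 (Figure 3.3).
DILUTION = [1 / 32, 1 / 16, 1 / 8, 1 / 4, 1 / 2, 1.0]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant signal carries no dilution information
    return sxy / math.sqrt(sxx * syy)


def sample_derived(features, min_r=0.9):
    """Keep features whose intensity follows the injected amount (hypothetical cut-off)."""
    return [name for name, intensities in features.items()
            if pearson(DILUTION, intensities) >= min_r]


# Hypothetical intensities: feature_A scales with the injected amount,
# feature_B stays flat, as a contaminant from the solvent might.
features = {
    "feature_A": [3.1, 6.4, 12.2, 25.0, 49.5, 101.0],
    "feature_B": [50.2, 49.1, 51.0, 50.5, 49.8, 50.3],
}
kept = sample_derived(features)
print(kept)
```

A signal that does not change with the injected amount is unlikely to come from the sample itself, which is why only features tracking the series are retained.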

3.5 Metabolite identification

Metabolite identification is, as previously mentioned, one of the major challenges in the field of non-targeted metabolomics [27–29]. As a result, there is currently no gold standard for how to perform this essential step. In our work, we have mainly used two approaches.

The first is a computational approach, in which experimental MS/MS spectra were compared to in silico generated fragmentation patterns of metabolites present in the Human Metabolome Database (HMDB) [60]. The matches were scored and ranked and, where possible, the hits were further validated against experimental fragmentation patterns deposited in public databases. However, not all metabolites in HMDB have an experimentally characterised fragmentation pattern publicly available.

The second approach utilizes a library of more than 400 known substances that have been characterised in the previously described LC-MS system. The collection includes key metabolites from well-known biochemical networks in the neurodegenerative field, such as the dopaminergic pathway, the kynurenine pathway and the phenylalanine, tyrosine and tryptophan metabolisms. The library-based identification matches the measured m/z and retention times of the molecules against those of substances in the library to find candidate identities. The MS/MS spectra of the tentatively identified molecules are then extracted, compared to those from the library and scored using an overall peak coverage, taking into account the intensity and mass accuracy of the daughter ions.

Identities were reported on two confirmation levels. Identities confirmed only by the m/z and retention time of the pure standards, where an MS/MS spectrum was not available from the samples, were denoted as verified on level 1. Identities that met the criteria of level 1 and were also confirmed by MS/MS fragmentation were denoted as verified on level 2. Identities of metabolites with an available MS/MS spectrum that did not match the fragmentation patterns of the pure standards were rejected.

3.6 Metabolite quantification

Transforming the generated raw data into biologically valuable information requires extensive data pre-processing. Ion signals or peaks are typically reduced by a process called peak-picking, quantified and charge-estimated by feature detection, aligned and linked between samples using feature alignment or linkage and, finally, exported in a convenient format. If needed, potential drifts in retention time can be corrected using map alignment and pose clustering, where the generated features are clustered between samples before linkage. Many tools and algorithms have been developed to quantify signals in mass spectrometry data. In the following studies, all mass spectrometry data have been pre-processed using the OpenMS tools [61] in the KNIME workflow engine [62], which together provide a wide range of customizable tools for non-targeted metabolomics data pre-processing.

Mass spectrometry data can be recorded in two distinct modes: the centroid mode, which collects a discrete signal, and the profile mode, which generates a collection of signals. Peak-picking, or centroiding, is the process in which clusters of related signals generated in profile mode are aggregated into a single peak (equivalent to peaks recorded in centroid mode) [63]. The profile mode is richer, as it contains information about the shape of the peaks. The generated files are, however, much larger and heavier to process. As such, most quantification algorithms have been developed for centroided data.

Feature finding, or feature detection, is the subsequent step, which attempts to group peaks originating from the same ion. This is usually done by first connecting peaks from consecutive scans into a mass trace and then grouping co-eluting mass traces into a feature, based on a plausible isotopic pattern [64]. The features generated within each sample must then be linked across samples to enable abundance comparison. This is done by matching the features based on mass and retention time, allowing a user-defined deviation [65].
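The linking step can be illustrated with a small sketch that pairs features from two samples when they agree in m/z and retention time within user-defined tolerances. This is a simplified greedy matcher written for illustration and is not the algorithm implemented in OpenMS; the function name, the tolerances (10 ppm, 15 s) and the feature values are assumptions.

```python
def link_features(sample_a, sample_b, ppm_tol=10.0, rt_tol=15.0):
    """Pair features (mz, rt_seconds, intensity) across two samples.

    A pair is accepted when the m/z deviation (in ppm) and the retention
    time deviation (in seconds) are both within tolerance; among several
    candidates the one with the smallest combined relative deviation wins.
    """
    links, used = [], set()
    for i, (mz_a, rt_a, _) in enumerate(sample_a):
        best, best_dev = None, None
        for j, (mz_b, rt_b, _) in enumerate(sample_b):
            if j in used:
                continue  # each feature in sample_b is linked at most once
            ppm = abs(mz_a - mz_b) / mz_a * 1e6
            d_rt = abs(rt_a - rt_b)
            if ppm <= ppm_tol and d_rt <= rt_tol:
                dev = ppm / ppm_tol + d_rt / rt_tol
                if best is None or dev < best_dev:
                    best, best_dev = j, dev
        if best is not None:
            used.add(best)
            links.append((i, best))
    return links


# Hypothetical features: (m/z, retention time in seconds, intensity).
sample_a = [(180.0634, 62.0, 1.2e6), (132.1019, 305.0, 4.0e5)]
sample_b = [(132.1021, 309.0, 3.6e5), (180.0640, 64.5, 1.1e6), (255.2320, 700.0, 9.0e4)]

links = link_features(sample_a, sample_b)
print(links)  # index pairs (position in sample_a, position in sample_b)
```

Once features are linked, their intensities can be placed in the same row of a feature table, which is what makes abundance comparison between samples possible.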

3.7 Normalization

A main objective of normalization is to remove as much noise and systematic variation as possible while retaining as much information as possible. Normalization is most often needed to correct for unwanted experimental or instrumental variability [66, 67]. In a targeted study, where only one or a few molecules are of interest, internally spiked heavy versions of these molecules are usually added to enable normalization and absolute quantification using a calibration curve [41, 42, 66]. In non-targeted studies, where this is not an option, alternative methods that optimally benefit all measured molecules need to be considered [67].

It is important to consider why a particular normalization method is used and what it is trying to correct or account for. To account for variation in sample injection or instrumental sensitivity decay, normalization using a set of internal standards distributed in mass and chemical properties might be a good approach. If less volume was injected, or less sample volume was added in the sample preparation, the abundance of endogenous and internally spiked molecules should be affected while the ratios of these should be unaffected.

For analyses spanning several days or weeks, intensity/sensitivity decay will occur [31, 68]. However, this typically does not affect all molecules in the same way. Some molecules might decrease over time, some might increase and some might be completely unaffected, which complicates the correction of such effects. An efficient and generic way of handling such variation is to normalize each molecule independently, by fitting a curve for each molecule to find a molecule-specific trend that can then be used for correction. In Papers I, III and IV, a locally weighted regression was fitted for each metabolite with respect to the injection order. The resulting residuals were added to the global average of the metabolite and used as corrected values [68, 69].
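The per-metabolite drift correction can be sketched as below: a locally weighted linear regression of intensity on injection order, followed by adding the residuals to the global mean. This is a minimal pure-Python illustration under stated assumptions, not the code used in the papers: the tricube kernel, the span `frac = 0.75`, fitting on all injections and the example intensities are assumptions.

```python
def loess_trend(x, y, frac=0.75):
    """Locally weighted linear regression (tricube kernel), evaluated at each x."""
    n = len(x)
    k = min(n, max(3, round(frac * n)))  # number of neighbours in each local fit
    trend = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (guard against zero).
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1.0
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        trend.append(my + slope * (x0 - mx))
    return trend


def drift_correct(order, intensities, frac=0.75):
    """Residuals from the fitted trend plus the global mean of the metabolite."""
    trend = loess_trend(order, intensities, frac)
    mean = sum(intensities) / len(intensities)
    return [yi - ti + mean for yi, ti in zip(intensities, trend)]


# Hypothetical metabolite whose signal decays with injection order.
order = list(range(1, 13))
intensities = [100.5, 97.2, 96.4, 93.1, 92.0, 89.6, 88.3, 85.1, 84.2, 81.7, 80.1, 77.9]
corrected = drift_correct(order, intensities)
print([round(v, 2) for v in corrected])
```

Because the trend is fitted separately for each metabolite, molecules that decay, increase or remain stable over the run are each corrected according to their own behaviour.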

3.8 Covariate correction

Studies in blood and urine have shown that various compounds (e.g. proteins and metabolites) are altered as we age [70–74]. Additionally, sex-specific differences have been found in the molecular composition of blood [71–76]. Therefore, if a study contains study groups with unequal age and sex distributions, these attributes must be taken into account and either be corrected for or evaluated. Otherwise, there is a risk that the results are explained by the difference in age or sex rather than by the studied conditions. This risk is, of course, present for many other attributes as well, such as country of origin and storage time.

When correcting for a continuous attribute such as age, one could either correct all compounds or solely those that demonstrate a statistically significant age dependence. In the work presented herein, age-dependent metabolites were corrected using linear detrending, under the assumption of linearity [70, 77]. Essentially, a linear model is trained using the metabolite levels as the response variable and age as the explanatory or predictor variable. The age coefficient is extracted and used to estimate and remove the age effect on the metabolite levels [77]. The age effect should preferably be estimated on healthy control subjects, to make sure that one estimates the effect of healthy normal ageing. This is particularly important when studying neurodegenerative diseases, as neurodegenerative effects commonly overlap with the effects of ageing. This is also true for multiple sclerosis, where the progressive phase typically (but not always) follows the relapsing-remitting phase, introducing a systematic age difference. In the cohorts studied herein, there was an age difference between the RRMS and PMS patients. As such, age was confounded with the disease stages and had to be corrected for. This was done in both cohorts using linear detrending.
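Linear detrending for age can be sketched as follows: the age slope is estimated on control subjects and then removed from every subject's metabolite level. This is a minimal sketch, assuming a single metabolite and a reference age of 45 years to which all values are adjusted; the reference age, the function names and the numbers are illustrative assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def detrend_age(levels, ages, slope, ref_age):
    """Remove the estimated age effect, expressing each level at ref_age."""
    return [v - slope * (a - ref_age) for v, a in zip(levels, ages)]


# Hypothetical controls: the metabolite rises by 0.1 units per year of age.
control_age = [25, 35, 45, 55, 65]
control_level = [10.0, 11.0, 12.0, 13.0, 14.0]
slope, intercept = fit_line(control_age, control_level)

# Hypothetical patients of different ages; after correction their levels
# are comparable because the age contribution has been removed.
patient_age = [30, 60]
patient_level = [12.5, 15.5]
corrected = detrend_age(patient_level, patient_age, slope, ref_age=45)
print(round(slope, 3), [round(v, 2) for v in corrected])
```

Estimating the slope on controls only, as recommended above, keeps disease-related changes from being absorbed into the age coefficient.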

3.9 Dimensionality reduction and latent variable models

Metabolomics suffers from the “curse of dimensionality”, i.e. the high-dimensional property of the measurements impedes statistical significance and power in the presence of sparse data [56]. Additionally, multidimensional measurements from modern instrumentation tend to suffer from severe multicollinearity (i.e. the variables can be explained by a linear combination of others) [78]. Dimensionality reduction techniques are therefore especially attractive when analysing such data, as they are specifically designed to overcome dimensionality problems and take advantage of the collinearity. The key principle of such models (also called latent variable models) is the construction of pseudo-variables, typically called factors or latent variables, from combinations of the original variables. Latent variables are formed to best capture the underlying structure or pattern of the variables corresponding to their multicollinearity, either based on the dominant patterns (unsupervised) or based on group differences (supervised).

Principal component analysis (PCA) is a commonly used unsupervised dimensionality reduction technique. Unsupervised refers to the fact that no prior knowledge or response variable is provided. PCA decomposes the data into orthogonal latent variables, called principal components, that comprise linear combinations of the original variables [79–82]. PCA aims at summarizing the information present in a lower-dimensional space, storing the most dominant variation in the first component and decreasing amounts of variation in the subsequent components. The combinations of variables, including the corresponding weights, are stored in vectors of so-called loadings. These can be illustrated as spectra or bar charts to visualise the contribution of the original variables to a principal component. A typical use of PCA is to assess the underlying structure and quality of newly generated data, as well as to conduct anomaly detection, where abnormal or outlier samples are identified. If an experiment was not done in a randomised setting, the introduced biases will often be evident in a PCA score plot, as such variation tends to be very dominant [31, 83]. PCA can also be used for unbiased evaluation and visualisation of variables that have been selected using a supervised model.

Partial least squares discriminant analysis (PLS-DA) is a supervised latent variable model that decomposes the data into latent variables while using an informative response variable as a target (e.g. diagnostic label, disease state or other property of interest). PLS-DA aims at maximizing the inter-group covariance, i.e. at maximizing the separation between the groups of interest [78, 84, 85]. Unlike PCA, PLS-DA components are not required to be orthogonal. However, such a requirement can be applied, then referred to as orthogonal PLS-DA (OPLS-DA), in which case the solution is simply a rotation of the PLS-DA solution. Similar to PCA, loadings are acquired when training a PLS-DA model. These are, however, rarely used on their own, as they do not represent the variance of interest, i.e. they do not explain the group-separating variance. Instead, the loadings are used to compute the “Variable Importance in Projection” (VIP), which are the weighted loadings on the targeted response variable. The VIP for a given original variable reveals how much that variable contributes to a latent variable that separates the groups of interest. This information is often used to extract features that distinguish the groups. In the work herein, PLS-DA was used to find differential metabolites in PMS compared with RRMS patients and control subjects in Paper I, and to rank variables according to their differential ability in Paper II. It was also used in Paper IV to evaluate if a subset of metabolites could predict a poor or beneficial treatment response.
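To make the PCA terminology from this section concrete (loadings, scores and explained variation), the sketch below extracts the first principal component of a small data matrix by power iteration on the covariance matrix. It is a minimal pure-Python illustration on made-up data; production analyses would normally rely on an established numerical library, and the function name and data are assumptions.

```python
import math


def first_component(rows, n_iter=200):
    """Return loadings, scores and explained-variance ratio of the first PC."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the mean-centred variables.
    cov = [[sum(x[a] * x[b] for x in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0 / math.sqrt(p)] * p  # starting vector for the power iteration
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    total_variance = sum(cov[a][a] for a in range(p))
    scores = [sum(x[j] * v[j] for j in range(p)) for x in centred]
    return v, scores, eigenvalue / total_variance


# Hypothetical data: variables 1 and 2 are perfectly collinear,
# variable 3 is small unrelated noise.
data = [
    [1.0, 2.0, 0.1],
    [2.0, 4.0, -0.1],
    [3.0, 6.0, 0.1],
    [4.0, 8.0, -0.1],
    [5.0, 10.0, 0.1],
]
loadings, scores, explained = first_component(data)
print([round(l, 3) for l in loadings], round(explained, 4))
```

The two collinear variables are summarised by a single latent variable that carries almost all of the variation, while the noise variable receives a loading close to zero.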

3.10 Regularization techniques

In multivariate ordinary least squares (OLS) regression, several variables (e.g. metabolic features) are included to optimally model a response variable (Equation 3.1). To find the optimal set of predictor coefficients, the sum of squared residuals is typically minimized (Equation 3.2), which is inherently difficult in a high-dimensional space [81, 86]. Regularization tackles this problem by restricting the values of the coefficients using a penalty term that pulls them towards zero (Equation 3.3). This prevents very large coefficient values and hence guards against overfitting, i.e. data-specific results with no generalisation power. In regularized regression, the strength of the penalty is controlled by a hyperparameter called the Lagrange multiplier (λ). The larger the value of λ, the greater the penalty applied and the reduction performed (also called shrinkage) [81].

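$$Y = \beta_0 + \sum_{j=1}^{p} \beta_j X_j + \varepsilon \tag{3.1}$$

$$\mathrm{OLS} = \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Big)^2 \tag{3.2}$$

$$\text{Penalized OLS} = \mathrm{OLS} + \lambda \Big( \alpha \sum_{j=1}^{p} |\beta_j| + (1 - \alpha) \sum_{j=1}^{p} \beta_j^2 \Big) \tag{3.3}$$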

The type of penalty is decided by the parameter α, which can range from 0, referred to as ridge regression, to 1, referred to as least absolute shrinkage and selection operator (LASSO) regression. Ridge regression is especially good at handling sparse sample sizes with many variables, which could be multicollinear. The primary advantage of ridge regression is its ability to jointly model correlated variables and shrink their coefficients together. However, ridge regression cannot shrink coefficients completely down to zero, and hence it cannot completely remove any variables from the model. LASSO, on the other hand, can shrink the coefficients of non-informative variables down to zero. As such, LASSO is preferred when there are many non-informative variables that could be excluded from the model. A disadvantage of LASSO is, however, its inability to handle redundant variables. While ridge regression would value them equally, LASSO would instead keep only one of them, which could be undesirable in a discovery study where we typically want to extract all biological variables that carry information. Therefore, using an α of 0.5 is often promoted, as it enables a mix between the ridge and LASSO penalties, commonly referred to as elastic-net regression, and enables utilization of both their strengths [81].

A Bayesian interpretation of ridge versus LASSO regression is that ridge regression corresponds to a normal prior distribution on the coefficients, while LASSO can be interpreted as linear regression where the coefficients have a Laplace prior distribution, peaking sharply at zero. As regularized regression can shrink the coefficients of non-informative variables, regularization methods are natural variable selection strategies. While non-informative variables are minimized or excluded from the equation, informative variables can be ranked by the absolute values of their coefficients. In Paper III, regularized regression was used to extract metabolic features that were highly distinctive for PMS patients.
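The different behaviour of the ridge, LASSO and elastic-net penalties can be seen in the simplest possible case of Equation 3.3: a single predictor, no intercept, and the predictor scaled so that its squared values sum to one. Under these assumptions (made here for illustration only) the OLS term reduces to (b − β)² plus a constant, where b is the unpenalised estimate, and the penalised minimiser has a closed form. The function names are hypothetical.

```python
def soft_threshold(b, t):
    """Move b towards zero by t, and set it to zero if it is within t of zero."""
    if b > t:
        return b - t
    if b < -t:
        return b + t
    return 0.0


def shrink(b, lam, alpha):
    """Minimiser of (b - beta)^2 + lam * (alpha*|beta| + (1 - alpha)*beta^2).

    alpha = 0 gives the ridge solution, alpha = 1 the LASSO solution and
    values in between an elastic-net solution.
    """
    return soft_threshold(b, lam * alpha / 2.0) / (1.0 + lam * (1.0 - alpha))


lam = 1.0
for b in (0.3, 2.0):  # a weak and a strong unpenalised coefficient
    ridge = shrink(b, lam, alpha=0.0)
    lasso = shrink(b, lam, alpha=1.0)
    enet = shrink(b, lam, alpha=0.5)
    print(b, round(ridge, 4), round(lasso, 4), round(enet, 4))
```

With these values the weak coefficient is halved by ridge but set exactly to zero by LASSO, while the strong coefficient survives under both penalties, which mirrors the variable selection behaviour described above.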

3.11 Multilevel modelling

In longitudinal studies, where, for instance, several subjects have been followed and sampled over time, there will be a dependency between data points belonging to the same subject. As such, many of the commonly used statistical methods cannot be applied, as they assume independence. In the biological field, dependencies in data are very common. Consider for example a genotype study of children. If there are children sharing the same parents in the study, i.e. siblings, those children will not be independent.

Multilevel models, also called hierarchical models or mixed effect models, are models that account for a hierarchical structure in data, denoted as levels [87, 88]. For instance, in the typical pupil-classroom example there are two levels, pupils (level 1) in classrooms (level 2). Between-classroom variance, corresponding to observed or unobserved classroom aspects that affect the pupils, leads to correlations between pupils from the same class that may need to be incorporated into the modelling. Levels are the keystones in multilevel modelling. Two-levelled models are the most common type, but more levels can, of course, be used if required by the data structure [88]. In multilevel linear modelling, there are a variety of designs that can be utilized. A simple design is a two-levelled model that allows the intercept (β0) to vary between subjects while keeping the slope (β1) fixed (sometimes called a random-intercept model) [88]. This design can be written as follows:

Level 1

$$Y_{ij} = \beta_{0j} + \beta_{1} x_{ij} + \varepsilon_{ij}$$

Level 2

$$\beta_{0j} = \gamma_{00} + U_{0j}$$

However, the slope may also be expected to vary between subjects, which can be incorporated as:

Level 1

$$Y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$$

Level 2

$$\beta_{0j} = \gamma_{00} + U_{0j}$$

$$\beta_{1j} = \gamma_{10} + U_{1j}$$

However, even though this design might be a more accurate representation of the problem at hand, it makes the model more complex, as it requires the model to estimate more parameters. This trade-off needs to be taken into account when choosing the model design. In Paper IV, we modelled the relationship between a metabolite and a clinical or protein measure over time in several individuals. Herein, we suspected that the relationship between them varied over time. To incorporate this scenario into the model, we extended the random-intercept model to include an interaction term between time and the clinical or protein measure. This allowed the relationship between the metabolite and the clinical or protein measure to vary over time (both in terms of the intercept and the slope):

Level 1

$$Y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij}$$

Level 2

$$\beta_{0j} = \gamma_{00} + \gamma_{01} t_{j} + U_{0j}$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} t_{j}$$
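Substituting the level 2 equations into the level 1 equation gives the combined form of the last model. This is a direct algebraic rearrangement of the equations as written above, using the same symbols and adding no further assumptions:

```latex
\begin{aligned}
Y_{ij} &= (\gamma_{00} + \gamma_{01} t_{j} + U_{0j})
          + (\gamma_{10} + \gamma_{11} t_{j})\, x_{ij} + \varepsilon_{ij} \\
       &= \gamma_{00} + \gamma_{01} t_{j} + \gamma_{10} x_{ij}
          + \gamma_{11}\, t_{j} x_{ij} + U_{0j} + \varepsilon_{ij}
\end{aligned}
```

Written this way, the coefficient γ11 multiplies the product of time and the clinical or protein measure, γ01 shifts the intercept with time, and U0j remains the only random term besides the residual.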

3.12 Achieving robust results

In explorative studies with multidimensional measurements there is a high risk of over-fitting, i.e. finding variational differences that are unique to a specific dataset or that arise by random chance [86, 89]. We have demonstrated this phenomenon by comparing the separations of real group assignments versus randomised groups [83]. In that demonstration, a generalisation of Fisher’s discriminant analysis was used to find decompositions giving clear group separations for both the actual and the randomly assigned groups, Figure 3.4. To avoid such results, validation is required, e.g. by cross-validation, test-set validation or other validation techniques.

Figure 3.4. Separation achieved by a generalisation of Fisher’s discriminant analysis using the correct class labels (left) versus the separation achieved with randomised class labels (right). The large and clearly visible separation achieved with the correct classes is hardly more striking than the equally large and clear separation achieved with the randomised classes. From Herman et al. 2017 [83].

Cross-validation is a computational validation procedure that is commonly used in studies when samples are sparse and none can be spared for validation. Cross-validation iteratively resamples the dataset and repeatedly trains a model on parts of the data while evaluating the model on the remaining part [81]. In the studies presented herein, K-fold cross-validation, typically repeated ten times, has been used. Briefly, the samples were randomly divided into K equally sized and balanced groups, of which K-1 were used for training and the last group for testing. The modelling was repeated K times, so that each group acted as a test-set once. Finally, the samples were resampled into K new groups and the whole procedure was repeated. The average and standard deviations of model performances and variable contributions (e.g. VIP scores from PLS-DA models or coefficients from the regularized regression) were then computed over all repetitions.

Although cross-validation is a convenient validation strategy when samples are limited, a test-set validation will always be superior. Test-set validation refers to the use of a separate and preferably independent dataset to evaluate the validity of results. In an explorative study, where the aim is to extract discriminatory variables using variable selection strategies, cross-validation is a valid and favourable strategy to get as robust results as possible from the data. But when promising targets have been extracted and need to be tested for e.g. future clinical use in the overall population, one or several independent cohorts, covering a wide range of attributes such as age, gender and country of origin, should be used to estimate the robustness and variability of the potential markers.

The permutation test, also called significance test, is an increasingly common statistical method to evaluate if an observed model performance is significant, i.e. better than random chance. Permutation tests iteratively train models on shuffled data to construct a reference distribution of model performances achieved by random chance. Comparing the observed model performance achieved with the true labels to the reference distribution gives an estimate of the significance of the model performance (i.e. its p-value) [90, 91].
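The permutation test can be sketched as follows. To keep the example self-contained, the "model performance" is replaced by a simple stand-in score (the absolute difference in group means of one variable); in practice this would be the cross-validated performance of the trained model. The function names, the number of permutations and the data are assumptions.

```python
import random


def mean_difference(values, labels):
    """Stand-in performance score: absolute difference between group means."""
    a = [v for v, l in zip(values, labels) if l == 1]
    b = [v for v, l in zip(values, labels) if l == 0]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def permutation_p_value(values, labels, score=mean_difference, n_perm=999, seed=0):
    """Share of label shufflings that score at least as well as the true labels."""
    rng = random.Random(seed)
    observed = score(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the link between values and labels
        if score(values, shuffled) >= observed:
            hits += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (hits + 1) / (n_perm + 1)


# Hypothetical metabolite levels for six cases (1) and six controls (0).
values = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 3.9, 4.1, 4.0, 3.8, 4.2, 4.0]
labels = [1] * 6 + [0] * 6
p_value = permutation_p_value(values, labels)
print(p_value)
```

The smallest attainable p-value is 1/(n_perm + 1), so the number of permutations sets the resolution of the test.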

3.13 Model performance estimation

There are several measures that can be used to estimate model performance. One of these is the area under the ROC curve (AUC or AUROC), where ROC stands for receiver operating characteristic. AUC is a performance measure for binary classification problems. The ROC curve shows the ratio of true positives to all positives versus the ratio of false positives to all negatives at various decision thresholds. One class is assigned as the positive class, typically the study cases in a case-control study, and the other as the negative class. AUC approaches 1.0 for a perfect separation, whereas 0.5 indicates no separation (i.e. random chance) [56, 90].

The error rate is a more intuitive measure, as it is simply the ratio of misclassified observations to all observations, i.e. the percentage of observations that have been misclassified. A disadvantage of the error rate is its sensitivity to imbalanced datasets, i.e. datasets containing classes of different sizes. Therefore, balanced error rates (BER) have been used in the work herein, where class-specific error rates are estimated and then averaged.

Other popular measures are R2 and Q2, commonly used for (O)PLS-DA models. R2 is the percentage of variation in the response explained by the model, whereas Q2 is the predictive performance. Q2 is a cumulative measure of the individual performance of each latent variable. In datasets where the number of features is much greater than the number of observations, high Q2 values can be obtained by chance. Therefore, permutation tests (i.e. shuffling the class labels) should be performed to estimate the likelihood of obtaining the acquired R2 and Q2 values by random chance, i.e. their statistical significance [90].
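The AUC and the balanced error rate can be computed as in the sketch below. The AUC is calculated through its rank interpretation, i.e. the probability that a randomly chosen positive observation receives a higher score than a randomly chosen negative one (ties counting one half), which equals the area under the ROC curve. The scores, labels and the decision threshold of 0.5 are made-up values for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def error_rate(predicted, labels):
    """Share of misclassified observations."""
    return sum(p != l for p, l in zip(predicted, labels)) / len(labels)


def balanced_error_rate(predicted, labels):
    """Average of the class-specific error rates."""
    rates = []
    for c in sorted(set(labels)):
        idx = [i for i, l in enumerate(labels) if l == c]
        rates.append(sum(predicted[i] != c for i in idx) / len(idx))
    return sum(rates) / len(rates)


# Hypothetical imbalanced dataset: two cases (1) and eight controls (0).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.05]
predicted = [1 if s >= 0.5 else 0 for s in scores]

print(auc(scores, labels))                     # 0.9375
print(error_rate(predicted, labels))           # 0.2
print(balanced_error_rate(predicted, labels))  # 0.3125
```

In this example half of the cases are misclassified, which the plain error rate of 0.2 hides because the controls dominate the dataset; the balanced error rate of 0.3125 weights both classes equally.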

3.14 Conformal prediction

The abovementioned performance measures are typically estimated using a held-out subset of observations, e.g. by cross-validation or test-set validation. These convey the model’s performance on a global level, but do not give any information on how confident each prediction is. Conformal prediction is a framework for complementing predictions from standard classification or regression algorithms with a measure of their confidence [92–94]. This confidence is given on a per-prediction basis rather than relying on the average performance. As such, unclear cases are given less confidence in their predictions (hence a larger prediction interval) and vice versa. As an example, any traditional biomarker used in the clinic today provides a point estimate that needs to be compared to a reference interval, commonly calibrated by characterising the variation amongst healthy individuals. Similarly, conformal prediction puts the evaluated observation in perspective to the known cases of the evaluated class, producing a measure of its similarity. Conformal prediction was introduced by Vovk et al. and has been adopted in several scientific fields, such as computational drug discovery [94–96].

The similarity is quantified by a conformal p-value, which is not to be confused with p-values from traditional statistical testing. The conformal p-value denotes how similar a predicted case is to the preceding class members, where a high p-value indicates a high similarity and vice versa [93]. Each evaluated observation will receive a p-value for each class. If these are similar, it may indicate an indeterminate case, which could possibly be a mix of the two classes. As such, conformal prediction is especially fitting for multiple sclerosis subtype distinction, where the progressive phase appears to follow the relapsing-remitting phase through a transitional stage.

Another advantage of using conformal prediction is that it enables the use of confidence thresholds. For instance, utilizing a confidence threshold of 95% in a binary classification problem will give rise to three types of predictions. In clear cases, the methodology will deliver a single-labelled prediction with 95% confidence or above. If the predictions of both classes achieve > 95% confidence, the methodology will present both predictions, i.e. a double-labelled prediction. Finally, if no prediction achieves 95% confidence, the methodology will give an empty-labelled prediction. The ratio of double-labelled predictions at a given confidence is typically used to measure the model efficiency. Related to the efficiency is the model validity. A model is said to be valid if the frequency of errors does not exceed the allowance of the chosen confidence level. In the case of a 95% confidence level, the frequency of errors is allowed to be up to 5% [93]. In Paper III, a binary classifier distinguishing PMS from RRMS patients was complemented with conformal prediction to estimate the prediction confidence on a per-patient level. This was further used to evaluate the longitudinal effects of rituximab as a treatment for PMS patients.
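How per-class conformal p-values lead to single-, double- and empty-labelled predictions can be sketched with a common class-conditional formulation. This is an illustrative sketch and not necessarily the setup used in Paper III: it assumes that each class has a set of calibration cases with a nonconformity score (higher meaning less typical of the class), and that a label is kept when its p-value exceeds the significance level, i.e. one minus the confidence. The scores below are invented, and the scoring function that would produce them is left out.

```python
def conformal_p_value(test_score, calibration_scores):
    """Share of calibration cases that are at least as atypical as the test case."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(test_scores, calibration, significance=0.05):
    """Labels whose conformal p-value exceeds the significance level."""
    p_values = {label: conformal_p_value(test_scores[label], scores)
                for label, scores in calibration.items()}
    labels = {label for label, p in p_values.items() if p > significance}
    return labels, p_values


# Hypothetical nonconformity scores of 39 calibration cases per class.
calibration = {"RRMS": list(range(1, 40)), "PMS": list(range(1, 40))}

# Three hypothetical patients, each scored against both classes.
clear_case = {"RRMS": 3.5, "PMS": 50.0}     # typical RRMS, atypical PMS
ambiguous_case = {"RRMS": 10.5, "PMS": 10.5}
atypical_case = {"RRMS": 50.0, "PMS": 50.0}

for patient in (clear_case, ambiguous_case, atypical_case):
    labels, p_values = prediction_set(patient, calibration, significance=0.05)
    print(sorted(labels), p_values)
```

The first patient receives a single label, the second a double label because both p-values are high and similar, and the third an empty prediction because the case resembles neither class.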

4. Study summaries

4.1 Paper I Biochemical differences in cerebrospinal fluid between secondary progressive and relapse-remitting multiple sclerosis. The aim of this study was to characterise the biochemical differences in CSF between the SPMS and RRMS phenotypes, and evaluate if these can distinguish the SPMS patients. To assess the biochemical differences, we profiled the CSF metabolome of 46 multiple sclerosis patients and ten non-inflammatory neurological control subjects, using high-resolution mass spectrometry. Thirty of the patients were diagnosed with RRMS and 16 with SPMS. All patients met the diagnostic criteria for multiple sclerosis, i.e. the revised McDonald’s criteria [97], and underwent a clinical examination and lumbar puncture at inclusion. In addition, disease activity was recorded using MRI within a week after the lumbar puncture. In total, 117 metabolites present in at least 75% of the study participants were successfully identified using an in-house library. Seventeen (15%) of these demonstrated an age dependence and were corrected using linear detrending. To extract inter-group differences, PLS-DA models were trained using 5-fold cross-validation repeated ten times. Model performance was estimated using AUC, and VIP scores were collected in each iteration. The model comparing SPMS with RRMS patients achieved statistically significant (p-value<0.05) quality metrics of R2=0.81 and Q2=0.47, and a cross-validated AUC of 0.92(±0.097), Figure 4.1. An averaged VIP score ≥1.0 was chosen as a variable selection cut off, resulting in 37 altered metabolites. Univariate analyses revealed that 28 of these were significantly altered in independence, of which 21 remained statistically significant after correction for multiple comparisons. 
Connecting these metabolites to biochemical pathways revealed eight biochemical pathways that were affected once transitioning to SPMS: aminoacyl-tRNA biosynthesis; nitrogen metabolism; phenylalanine metabolism; purine metabolism; pyrimidine metabolism; tryptophan metabolism; valine, leucine and isoleucine biosynthesis; and valine, leucine and isoleucine degradation, Figure 4.2. Finally, the extracted metabolites were associated with clinical measures including the disease duration, MRI-derived measures and the Expanded Disability Status Scale (EDSS) used to quantify disability. The strongest association was found between 4-acetamidobutanoate and the disease duration, where 4-acetamidobutanoate did not display any association to age. Metabolites from the pyrimidine metabolism, including deoxyuridine, glutamine, methionine, thymine and uridine, were significantly associated with disability, disease activity and brain atrophy. Similar to 4-acetamidobutanoate, none of these metabolites displayed any association with ageing, making them particularly interesting in understanding disease progression and the transitional event.

From this study, we could conclude that SPMS patients do have a distinctive biochemical profile that is able to distinguish them from RRMS patients and control subjects. Hence, low-molecular markers show promise in SPMS distinction and could potentially detect a transitional event.

Figure 4.1. Metabolic differences in SPMS compared with RRMS patients and control subjects. Average ROC curves with corresponding average AUC and standard deviation for the PLS-DA models comparing (A) SPMS with RRMS (AUC = 0.92 (±0.097)) and (B) SPMS with controls (AUC = 0.84 (±0.149)). Pathway analyses on altered metabolites in (C) SPMS compared with RRMS patients and (D) SPMS patients compared with controls. The size of the node indicates the pathway impact (equivalent to the x-axis) computed by the relative betweenness centrality and the colour corresponds to the pathway. Pathways that were found non-significant in both comparisons have been coloured white. The red lines indicate a significance level of 0.05. Pathways shown: aminoacyl-tRNA biosynthesis; valine, leucine and isoleucine biosynthesis; purine metabolism; phenylalanine metabolism; pyrimidine metabolism; tryptophan metabolism; valine, leucine and isoleucine degradation; nitrogen metabolism; caffeine metabolism.

Figure 4.2. Metabolite to biochemical pathway linkage. The altered metabolites have been linked with pathways as colour-coded ribbons. Blue-to-red coding next to the metabolite labels indicates the magnitude and direction of the log2 fold change. The outer layer represents SPMS in comparison with RRMS patients and the inner layer represents SPMS patients compared with control subjects. Significant fold changes with a q-value < 0.05 have been marked with an asterisk ’*’.

4.2 Paper II

Integration of magnetic resonance imaging and protein and metabolite CSF measurements to enable early diagnosis of secondary progressive multiple sclerosis.

Even though the biochemical profile was able to distinguish the SPMS patients, the high-dimensional nature of omics measurements makes them impractical to implement in clinical practice. Therefore, in a second study on the same cohort, we proceeded to identify a small set of highly informative measures that in combination would be able to distinguish between RRMS and SPMS patients. In this study, we combined the CSF metabolic measurements with CSF protein, MRI-derived and clinical information. Using a rigorous variable selection scheme utilizing PLS-DA and repeated cross-validation, a subset of eleven measures was extracted.
Out of these, three were radiological measures (the size of the spinal cord and the third ventricle, and the total number of T1 hypointense lesions), six were proteins (galectin-9, monocyte chemoattractant protein-1 (MCP-1), tumor necrosis factor alpha (TNF-α), transforming growth factor alpha (TGF-α), soluble CD40L (sCD40L) and platelet-derived growth factor AA (PDGF-AA)) and two were metabolites (20β-dihydrocortisol (20β-DHF) and indolepyruvate), Figure 4.3. Performing a PCA on these measurements revealed a highly significant (p-value=8.5×10⁻⁹) separation between the RRMS and SPMS patients in the first principal component, Figure 4.3A.

To compare the integrated information with the separate measurements, group differences were also evaluated in the individual measures. Statistically significant differences were found between RRMS and SPMS patients for 20β-DHF, galectin-9, indolepyruvate, MCP-1, PDGF-AA, sCD40L, TGF-α, total T1, and the size of the spinal cord and third ventricle, Figure 4.3C. In addition, the ROC curves with their corresponding AUC values were estimated for the combined as well as the separate measurements, Figure 4.3D. The combined measurements achieved an AUC of 0.97, which was an improvement over the best-performing individual measures (i.e. the size of the spinal cord and third ventricle) with AUCs of 0.85.

Finally, we also assessed disease progression in SPMS by stratifying the SPMS patients into three groups based on their clinical degree of change after follow-up. Myelin basic protein (MBP) and macrophage-derived chemokine (MDC), as well as the metabolites 20β-DHF and 5,6-dihydroxyprostaglandin F1a (5,6-DH-PGF1), demonstrated statistically significant (p-value<0.05) differences between the groups, Figure 4.4.
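The AUC values used above to compare combined and individual measures can be read as the probability that a randomly chosen patient from one group scores higher than a randomly chosen patient from the other (the normalised Mann–Whitney U statistic). The sketch below computes it directly from two lists of scores. The scores are hypothetical values, for instance first-principal-component scores, and are not data from Paper II.

```python
from typing import Sequence


def auc(scores_positive: Sequence[float], scores_negative: Sequence[float]) -> float:
    """Probability that a positive outscores a negative; ties count as one half."""
    wins = 0.0
    for positive in scores_positive:
        for negative in scores_negative:
            if positive > negative:
                wins += 1.0
            elif positive == negative:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical scores for two small groups.
spms_scores = [2.1, 1.4, 0.3]
rrms_scores = [0.5, -0.2, -1.0, -1.3]

separation = auc(spms_scores, rrms_scores)  # 11 of the 12 pairs are ordered correctly
```

An AUC of 0.5 corresponds to no separation and 1.0 to perfect separation, which is why a single score that summarises several markers can be compared directly with each marker on its own.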

Figure 4.3. (A) Principal component analysis (PCA) of the extracted eleven variables. The distribution of the first principal component (PC1) scores to the right displays a highly significant difference (p-value=8.5×10⁻⁹) between the RRMS and SPMS patients. (B) The absolute values of the loadings for PC1, conveying the variable contribution in PC1. The variables have been colour-coded by the class with the highest abundance. (C) Statistically significant differences were found between RRMS and SPMS patients for the size of the third ventricle, galectin-9, total T1, 20β-DHF, MCP-1, the size of the spinal cord, indolepyruvate, TGF-α, sCD40L and PDGF-AA. Significance levels have been denoted as ’*’ p-value<0.05, ’**’ p-value<0.01 and ’***’ p-value<0.001, whereas non-significant is denoted NS. (D) Discriminative performance of the combined and single measurements. The receiver operating characteristic (ROC) curves for the compressed information in PC1 as well as for the separate variables alone, with corresponding areas under the curve (AUC).

Overall, this study demonstrated the importance and value of combining information from multiple markers to get a more accurate patient stratification. In addition, the proteins and metabolites that were associated with disease progression in SPMS patients can be of future value for the prediction of disease course.

Figure 4.4. Myelin basic protein (MBP), macrophage-derived chemokine (MDC), 20β-dihydrocortisol (20β-DHF) and 5,6-dihydroxyprostaglandin F1a (5,6-DH-PGF1) showed statistically significant differences between the groups, where 1 corresponds to no clinical change, 2 to an intermediate clinical change and 3 to a severe clinical change after follow-up. The abundance in transitioning patients has been marked with arrows on the y-axes.

4.3 Paper III

Disease phenotype prediction in multiple sclerosis.

In our third study, we continued our methodological thinking from Paper II by including an additional cohort of 39 RRMS and 35 PMS patients, as well as 49 healthy control subjects. By matching the metabolic measurements to the old cohort, we hoped to extract metabolites that could be reproducibly measured in CSF. We further wanted to improve our methodology by complementing our classifier with conformal prediction.

Figure 4.5. Principal component analysis (PCA) of the selected 28 metabolic features in both cohorts. (A) The separation between PMS (red triangles) and RRMS (blue circles) in cohort 1 (filled) and cohort 2 (non-filled). (B) The 95% confidence space of each diagnostic group and cohort were computed and the controls were projected into the model space (black squares) revealing that, in terms of this molecular signature, the RRMS patients resemble the control subjects. (C) To quantify the separation of the first principal component (PC1), the area under the curve (AUC) was estimated to 0.94 (95% confidence interval of 0.90–0.99). This was an improvement over the best performing individual feature (AUC=0.80).

The metabolic measurements from both cohorts were separately subjected to a variable selection procedure utilizing regularized regression, implemented in a nested cross-validation scheme. Selecting metabolites that were important for distinguishing PMS from RRMS patients in both cohorts resulted in 28 consistent metabolites, Figure 4.5A. As in Paper II, the combined information from the 28 metabolites outperformed the individual measurements alone, Figure 4.5C. Remarkably, projecting the held-out control subjects from both cohorts into a PCA of the 28 metabolites revealed that the control subjects clearly clustered amongst the RRMS patients, Figure 4.5B. This suggests that the extracted 28 metabolites are a PMS-specific signature that is altered in PMS compared with both RRMS patients and control subjects.

We further assessed the prediction confidence on a per-patient level using conformal prediction. This revealed that 88% of the patients had been correctly classified with 90% confidence, Figure 4.6A. One out of eight RRMS patients that developed manifest PMS within three years was predicted as PMS at study onset, whereas six had highly uncertain (<50%) predictions, Figure 4.6B. This could potentially indicate a transitional stage.

Finally, as a proof of concept, this methodology was also applied to a longitudinal collection of CSF from 22 patients in a phase 1b clinical trial of rituximab for PMS. Seventeen of these patients were classified as PMS at the onset of the study, of which twelve patients decreased their prediction certainty after treatment, indicating a potential treatment effect, Figure 4.6C. Also, on a group level, a significant (p-value<0.01) treatment effect could be seen after twelve months of follow-up, Figure 4.7.

Ultimately, this study demonstrates that it is plausible to generate a condensed set of molecular markers that can distinguish PMS from RRMS patients. Further, the confidence in single-patient predictions can be quantified using conformal prediction and may be used to monitor disease course. This conceptual work may be useful in clinical practice for personalised treatment strategies in future trials.

Figure 4.6. Conformal prediction performed on the extracted molecular signature. (A) The PMS phenotype is shown on the y-axis and RRMS on the x-axis, where the PMS patients are represented by red triangles and RRMS as blue circles. The majority of the patients are located close to their corresponding phenotype, while some are located along the diagonal. The area around the diagonal indicates an almost equal confidence in both MS predictions. (B) The eight transitioning patients are represented as black stars, where one has a clear PMS phenotype, six have highly uncertain predictions (<50%) and one still shows high similarity to the RRMS phenotype. (C) An example of a patient (patient 1 from the ITT-PMS study) that displayed a treatment effect, supported by its clinical measures at baseline and follow-up (visualised in the radar chart). (D) An example of a patient (patient 18 from the ITT-PMS study) that demonstrated no treatment effect, but rather achieved higher confidence in its PMS prediction after follow-up. The following abbreviations are used in the figure; 25FWT: 25-ft walk test, 9-HPT(D): 9-hole peg test dominant hand, 9-HPT(ND): 9-hole peg test non-dominant hand, EDSS: expanded disability status score, FSMC: fatigue scale for motor and cognitive function, SDMT: symbol digit modalities test.

Figure 4.7. The repeated samplings from patients in the phase 1b clinical trial of rituximab for PMS, where the first principal component (PC1) scores are shown as boxplots to the left and the score plot of PC1 and PC2 to the right (the treated patients are shown as black triangles). A statistically significant (p-value<0.01) difference on a group-level could be seen after one year.

4.4 Paper IV

A biochemical signature of progressive multiple sclerosis.

As a continuation of our third study, we investigated if the 28 metabolites were related to the clinical outcome of rituximab for PMS and if they could predict a poor or beneficial treatment response. We also investigated how each of the metabolites was altered in RRMS and PMS patients in comparison to healthy control subjects and to each other. Finally, we investigated the metabolites’ associations to a panel of six CSF protein biomarkers of axonal, myelin and astrocyte damage as well as T- and B-cell activation and differentiation.

Figure 4.8. (A) The biochemical differences between PMS and RRMS patients (blue), PMS and healthy controls (red), and RRMS and healthy controls (green). (B) Clustering of the log2 fold changes from the group comparisons between the PMS and RRMS patients as well as the healthy controls (HC), with the Euclidean distance as a similarity measure. Statistical significance is marked with asterisks: ’*’ p-value<0.05 and ’**’ p-value<0.01, ’***’ p-value<0.001.

The protein markers were measured in the longitudinal CSF samples from all treated patients who participated in the clinical trial of rituximab. Sixteen of these patients were followed for an extended time period, where clinical examinations were performed before treatment (baseline) and at 6-, 12- and 24-months follow-ups. Ten of the 22 treated patients accumulated disability within two years after treatment. Four patients were excluded from the analysis as they were not followed for an extended time period and did not show a significant deterioration within the first year. Using the composite signature and a leave-one-out cross-validation modelling scheme utilizing PLS-DA, an AUC of 0.63 was achieved. Inspecting the sensitivity and specificity, we found that 90% of the patients displaying a poor treatment response were correctly classified, whereas only 38% of the patients with a beneficial outcome (i.e. demonstrating no disease progression) could be correctly classified. Hence, the signature was predominantly predicting patients to have a poor outcome.
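The sensitivity and specificity quoted above can be checked with a short calculation. The group sizes follow from the text (22 treated patients, four excluded, ten with accumulated disability), but the counts of correctly classified patients (nine and three) are inferred here from the rounded percentages and are an assumption, not numbers stated in Paper IV.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


treated = 22
excluded = 4
poor_response = 10  # patients who accumulated disability within two years

analysed = treated - excluded                  # 18 patients in the model
beneficial_response = analysed - poor_response  # 8 patients without progression

# Assumed counts: 9 of 10 poor responders and 3 of 8 beneficial responders correct.
sensitivity, specificity = sensitivity_specificity(tp=9, fn=1, tn=3, fp=5)
```

Under these assumed counts, five of the eight patients with a beneficial outcome would have been predicted to have a poor outcome, which matches the observation that the signature predominantly predicts a poor outcome.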

Comparing the individual metabolite levels in the PMS patients with those in RRMS patients revealed that the majority of the metabolites were significantly altered, of which 13 were also significantly altered in comparison with healthy control subjects. Only six significant alterations were found between the RRMS patients and control subjects, reinforcing that this is a PMS-specific signature, Figure 4.8.

To investigate if there were any relations between the metabolites and the clinical measures or protein markers, associations were assessed using multilevel modelling. Twelve of the metabolites displayed associations to one or several protein markers, whereas a different subset of twelve metabolites displayed associations to cognitive and motoric measures, Figure 4.9. None of the metabolites showed any association with EDSS in the current cohort.

Figure 4.9. Associations between the metabolites and protein biomarkers as well as eight clinical measures: 25FWT (m/s), 6MWT (m/s), 9-HPT of the dominant (D) and non-dominant (ND) hand, EDSS, SDMT, and the cognitive and motoric measure of the FSMC. To estimate the strength and direction of the associations, the sign (1 or -1) of each coefficient was multiplied by the corresponding -log10-transformed p-value. Positive associations are marked in red, and negative in blue. Hierarchical clustering analysis was performed using Pearson’s correlation as a similarity measure. Statistical significance is marked with asterisks: ’*’ p-value<0.05 and ’**’ p-value<0.01, ’***’ p-value<0.001.

To summarize, most of the individual metabolites did show significantly distinctive CSF levels in the PMS patients and some of them were also related to the cognitive and motoric performance. This suggests that some of the investigated metabolites might have potential as individual markers. However, as previously shown in Paper II & III, there is value in combining information from an optimized set of markers.
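The association score described in the caption of Figure 4.9 multiplies the sign of a model coefficient by the -log10-transformed p-value. A minimal sketch of that transformation is given below. The function name and the example values are hypothetical, and treating a coefficient of exactly zero as positive is an assumption of this sketch.

```python
import math


def signed_log_p(coefficient: float, p_value: float) -> float:
    """Sign of the coefficient (1 or -1) times the -log10-transformed p-value."""
    sign = 1.0 if coefficient >= 0 else -1.0  # assumption: zero counts as positive
    return sign * -math.log10(p_value)


positive_association = signed_log_p(coefficient=0.8, p_value=0.01)
negative_association = signed_log_p(coefficient=-0.3, p_value=0.05)
```

With this score, a p-value of 0.05 maps to a magnitude of about 1.3, so values beyond roughly ±1.3 correspond to the single-asterisk significance level in the figure.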

4.5 Principal findings

Altogether, we have shown that several biochemical pathways are affected once a transition to the progressive phase occurs. These alterations enable discrimination of RRMS and PMS patients based on their biochemical profiles. However, considering the infeasibility of implementing a clinical test that is based on the complete profile, we further demonstrated how to extract an optimized and limited subset of markers to detect PMS patients. Finally, we demonstrated how a concise biochemical signature could be complemented with conformal predictions to assess the prediction confidence of single patient assessments to provide a personalised evaluation.

5. Reflections

5.1 Experimental aspects

Throughout these four years, I have learned a tremendous number of dos and don’ts. Regarding experimental work, one should strive towards standardizing as much as possible. This includes even tiny details such as, if feasible, preparing master mixes for all solutions and buffers at the beginning of sample preparation. Minor differences in solution properties may give rise to major effects in the final data and, as metabolites are very heterogeneous, they may be affected differently by such variation, which makes it challenging to correct for computationally. Nevertheless, there will be many sources of variation: different batches of tubes, pipettes, the temperature in the room, etc. Clever constrained randomisation, where study groups are equally distributed throughout the sample preparation and analysis order, is key to guard against potentially confounding effects and to enable post-analysis corrections [31]. If the experimental and instrumental errors are randomly distributed throughout the study groups, supervised algorithms and variable selection procedures will most often be able to bypass this ’bad’ variation and extract the variation of interest [57].
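One possible way to obtain a run order in which the study groups are equally distributed is sketched below. The text does not describe the scheme that was actually used, so this is only an assumed illustration: each group is shuffled, every sample is given a random position inside its own equal-width slice of the run, and all samples are then sorted by position. The function name, the sample labels and the seed are hypothetical; the group sizes are those of Paper I.

```python
import random
from typing import Dict, List, Tuple


def constrained_run_order(
    samples_by_group: Dict[str, List[str]], seed: int = 0
) -> List[Tuple[str, str]]:
    """Return (group, sample) pairs in a run order that spreads every group evenly."""
    rng = random.Random(seed)
    keyed = []
    for group, samples in samples_by_group.items():
        shuffled = rng.sample(samples, len(samples))  # random order within the group
        for index, sample in enumerate(shuffled):
            # One sample per equal-width slice of the run, at a random spot in the slice.
            position = (index + rng.random()) / len(shuffled)
            keyed.append((position, group, sample))
    keyed.sort()
    return [(group, sample) for _, group, sample in keyed]


groups = {
    "RRMS": [f"RRMS_{i:02d}" for i in range(30)],
    "SPMS": [f"SPMS_{i:02d}" for i in range(16)],
}
run_order = constrained_run_order(groups, seed=1)
first_half = run_order[: len(run_order) // 2]
spms_in_first_half = sum(1 for group, _ in first_half if group == "SPMS")
```

Because every group contributes one sample per slice, roughly half of each group ends up in each half of the run, so a drift in instrument response cannot line up with the group labels.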

5.2 Computational thoughts

In computational analyses, it is important to remember that any trained model is built on assumptions. A model is neither true nor false; it is merely a simplification of the problem at hand. It may give rise to extraordinary discoveries, hidden in the data, or it may produce irrational and potentially dangerous outcomes which may have large real-world implications [98]. For instance, a flawed algorithm put into medical practice for choice of treatment or diagnostic purposes will probably cause much more harm to patients than a single doctor’s mistake [1, 99]. Hence, it is important to remember that models are tools to aid us in solving tasks that are too complex to be handled by a human brain. Nevertheless, the models are not aware of what is a reasonable answer or way to act. They merely conduct whatever programmed calculations they have been given. Expertise is needed to interpret the given output, or needs to be incorporated into the model as prior knowledge. Some may argue that imposing prior knowledge is biased, but we need to remember that a model is never going to be a true replica of the world either way and neglecting decades of gathered knowledge would be ignorant.

A specific example of where my assumptions may not be true for all cases is the assumption of linearity in the molecular effect of ageing. Based on my own observations, there seem to be cases where metabolites decrease in a non-linear fashion in semi-healthy individuals, Figure 5.1. These cases might benefit from using a non-linear age correction strategy. Besides, age effects could potentially also be cumulative, i.e. only seen when molecules are viewed in combination. Alternative age correction strategies could in those cases be the utilization of non-linear models or targeted supervised multivariate methods.

Figure 5.1. Two metabolites that show an age dependence in a non-linear (left) and linear (right) fashion in CSF from semi-healthy individuals (i.e. individuals that had a clinical reason to take a CSF sample, but did not get a diagnosis).

In a study with the aim of characterising the normal levels of tau and the 42-amino acid form of β-amyloid in CSF from neurologically healthy individuals, it was found that while 42-β-amyloid levels stayed constant between the age of 25–95, tau demonstrated an exponential-like increase after 40 years of age [100]. This would suggest that a non-linear curve fitting strategy would probably perform better than a linear one and give a more accurate age effect estimation in the case of tau, while 42-β-amyloid would not need any correction.

Ageing is an important factor that might be confounded with the pathological events in neurological diseases such as multiple sclerosis. In multiple sclerosis, PMS patients (SPMS patients in particular) are typically older than RRMS patients as SPMS follows the RRMS stage. Therefore, when comparing PMS with RRMS patients, there is typically a systematic age difference between these patient groups. Ignoring age in such a case would risk confusing biochemical differences that are caused by age with those that are caused by the disease stages. To avoid such confounded results, it is essential to correct for group-wise age differences.
However, as the disease stages are confounded with age, age correction using patient data will risk removing disease-related differences. Therefore, control subjects, preferably healthy, need to be included to enable estimation of a healthy age effect that can be used for correction of the patient data. In the second cohort that was investigated in Paper III & IV, 49 healthy control subjects ranging from 18–74 years of age were included and used for correction. Unfortunately, in the first cohort (investigated in Paper I–III) there were only ten control subjects ranging between 19–65 years of age. As the control subjects in the first cohort were few, it was decided that the RRMS patients (ranging between 18–70 years) would be included in the estimation of age effects, with the assumption that the effect of the predominantly autoimmune and inflammatory disease activity would not overlap with the effects of ageing.

Another thing that I would like to reflect upon is the choice of validation strategy. As previously stated, an external and preferably independent validation cohort is the preferred validation strategy. However, in the case where samples are sparse, such a test-set would have to be very small and inevitably the validation would be very sensitive to outlier samples in the test-set. In these cases, cross-validation would be more appropriate as all samples would contribute to the validation and the effect of any outlier samples would be minimized. The cross-validation approach was therefore used in all the work herein.

In Paper III, where data from two cohorts were analysed, it could be argued that one of the cohorts should be used for variable selection and model training and the other one for validation. However, we chose to analyse the cohorts separately and construct a consensus result by the overlapping findings (an approach related to meta-analysis). Our main reason for doing so was that we wanted to extract molecules that could be reproducibly measured and were consistently selected as discriminatory.
Even though the cohorts were analysed in the same manner with the same instrumental settings, there will be many cohort specific findings. A drawback with this approach was, however, that we did not have any true test-set to evaluate our findings in a fair way. Instead, we performed a PCA on the selected molecules in all patients and projected the control subjects from both cohorts into the PCA. Remarkably, the control subjects that had been left out in the selection procedure, clearly clustered amongst the RRMS patients. Even though this is neither a traditional nor ideal validation, it did show that the RRMS patients resembled control subjects and that the extracted findings represented alterations unique to the PMS patients. Hence, it did emphasize that the findings were not random but displayed relevant differences in the PMS patients. The next step is, nevertheless, to validate these alterations in a new set of RRMS and PMS patients.
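The consensus approach discussed in this section, where the cohorts are analysed separately and only overlapping findings are kept, can be sketched as an intersection of per-cohort selections. The sketch assumes that each cohort yields a selection frequency per feature (for example, the share of cross-validation rounds in which the feature was selected) and that a fixed threshold defines a stable selection. The threshold, the frequencies and the function names are hypothetical, and the feature labels merely imitate the style used in Figure 4.8.

```python
from typing import Dict, List, Set


def stably_selected(selection_frequency: Dict[str, float], threshold: float) -> Set[str]:
    """Features selected in at least `threshold` of the cross-validation rounds."""
    return {name for name, freq in selection_frequency.items() if freq >= threshold}


def consensus_features(
    cohort_1: Dict[str, float], cohort_2: Dict[str, float], threshold: float = 0.5
) -> List[str]:
    """Features that are stably selected in both cohorts."""
    overlap = stably_selected(cohort_1, threshold) & stably_selected(cohort_2, threshold)
    return sorted(overlap)


# Hypothetical selection frequencies from two separately analysed cohorts.
frequencies_cohort_1 = {"591pos": 0.9, "185pos": 0.8, "20pos": 0.2}
frequencies_cohort_2 = {"591pos": 0.7, "185pos": 0.3, "20pos": 0.95}

signature = consensus_features(frequencies_cohort_1, frequencies_cohort_2)
```

Only the feature that is stable in both cohorts survives, which mirrors the trade-off described above: cohort-specific findings are discarded, at the cost of having no untouched test set left.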

Figure 5.2. The CSF levels of 4-acetamidobutanoate in the study subjects from both cohorts. A significant difference can be seen between RRMS and PMS patients, whereas intermediate levels are displayed by RRMS patients that were transitioning to PMS within three years after sampling. The plot also includes measurements from the participants in the phase 1b clinical trial of rituximab for PMS.

5.3 Biological contemplations

In Paper I, multiple biochemical pathways were found to be perturbed in PMS compared to RRMS patients. These included the tryptophan metabolism, covering both the kynurenine pathway and the serotonin pathway. Multiple international research groups have been focusing on the kynurenine pathway in multiple sclerosis as well as in other neurological diseases [101–107]. The kynurenine pathway is the major tryptophan-degradation pathway that ultimately leads to the production of NAD+. Dysregulation of this pathway has been shown to lead to lower levels of NAD+ in multiple sclerosis patients and is, therefore, hypothesized to cause energy depletion [105]. In Paper I, we confirmed that the kynurenine pathway is altered in multiple sclerosis, including when patients are transitioning to the PMS stage.

We also saw alterations in the serotonin pathway, which has not been studied thoroughly in connection to multiple sclerosis. Nevertheless, serotonin alterations are related to depression, and multiple sclerosis has been reported to have a higher rate of depression compared with other neurological disorders [108]. This enhanced depression rate is believed to be caused by a combination of psychosocial factors and biochemical alterations [108, 109]. Furthermore, the use of antidepressants has been indicated as having a beneficial effect on patients with multiple sclerosis. As such, it was hypothesized that the pathology of depression and multiple sclerosis may have shared biochemical mechanisms or that serotonin-based antidepressants may provide neuroprotective benefits to patients with multiple sclerosis [108]. However, the MS-SMART trial evaluating the benefits of exclusive treatment with fluoxetine (an antidepressant that increases the serotonin levels in the CNS) for PMS, recently announced that no evidence of neuroprotection could be found [110].

4-Acetamidobutanoate was extracted as a highly discriminant metabolite in Paper I, where it was significantly increased in PMS patients compared with RRMS patients and control subjects. Increased CSF levels of 4-acetamidobutanoate in PMS patients were further seen in the second cohort of multiple sclerosis patients investigated in Paper III & IV, Figure 5.2. 4-Acetamidobutanoate is an alternative precursor of GABA and might indicate a shift in energy metabolism in the dopaminergic neurons that have been shown to utilize this pathway [111]. However, this hypothesis needs to be further investigated and, regardless, 4-acetamidobutanoate may be of value in itself as a marker for PMS. Even though there were only eight RRMS patients in the two cohorts that transitioned to PMS within three years, they did show intermediate CSF levels of 4-acetamidobutanoate, locating them in-between the levels of the RRMS and PMS patients, Figure 5.2.
This might suggest that 4-acetamidobutanoate is increased in the CSF before PMS has clinically manifested, and continues to increase until it stabilizes at high levels in the PMS stage.

5.4 Future work

In Paper III, we demonstrated a conceptual methodology that could discern PMS from RRMS patients using a set of 28 metabolites. Using conformal prediction, we could generate a prediction confidence for each individual patient prediction, enabling an evaluation of a patient’s current biochemical state. As the transition from RRMS to PMS occurs gradually, there is reason to believe that during this transitional period there could, biochemically, be a mix of the two phenotypes. As such, patients who are going through this change should display a similar conformity to both phenotypes. Even though the variable subset could be optimized, potentially including more heterogeneous sources of information (as was done in Paper II), the methodology holds promise in itself. Integrating information from multiple sources could provide a more holistic view that could improve diagnostic performance.

The work herein has been carried out in order to push the diagnostic boundary closer to when the transition to the PMS stage occurs biochemically. The results suggest that we should be able to detect transitioning patients who have not yet been diagnosed with PMS. A natural continuation of this work would be to gather enough samples from RRMS patients who are diagnosed with PMS within three years after sampling and compare these to RRMS patients who remain in the RRMS stage after a three-year follow-up. This would enable a biochemical comparison between RRMS patients who develop PMS in the near future and those who do not, and an extraction of even more fine-tuned biomarkers that could potentially detect patients who are entering the transitioning stage. Such a cohort would also enable a validation of the intermediate levels of 4-acetamidobutanoate displayed by patients in an ongoing transitional event.

Furthermore, even though the primary source of information in this work has been the metabolome, many other fields have shown great potential in finding efficient biomarkers for multiple sclerosis and its subtypes. For instance, small non-coding RNAs, and in particular microRNAs, are considered a novel type of highly suitable biomarkers as they seem to be key regulators of several biological processes, including immunity and inflammation [112–114]. I believe that utilizing all omics layers, together with clinical and prior knowledge of a patient, to achieve a holistic view of the patient’s condition may be the most effective strategy. Extending the work that was done in Paper II with even more prior knowledge should be our next stage. Having access to such a holistic overview would enable thorough work towards a full characterisation of PMS and the extraction of a minimal set of highly informative markers that would be the most effective combination to identify even the slightest sign of transitioning. Furthermore, if this proved to be as efficient in blood as in CSF, blood would be the preferred biological matrix, owing to its less invasive collection.
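To make the idea of a patient showing "similar conformity to both phenotypes" more concrete, the following is a minimal sketch of class-conditional (Mondrian) inductive conformal prediction. It is not the setup used in Paper III: the nonconformity measure (one minus the predicted class probability), the function names, the label coding (0 = RRMS, 1 = PMS) and all toy numbers are assumptions made purely for illustration, and the probabilities stand in for the output of any underlying classifier.

```python
import numpy as np


def mondrian_p_values(cal_probs, cal_labels, test_probs):
    """Return one conformal p-value per test sample and class.

    cal_probs  : (n_cal, n_classes) predicted class probabilities for a
                 held-out calibration set (from any underlying model).
    cal_labels : (n_cal,) true class index of each calibration sample.
    test_probs : (n_test, n_classes) predicted probabilities for new samples.
    """
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)
    n_classes = cal_probs.shape[1]
    p_values = np.empty((test_probs.shape[0], n_classes))
    for c in range(n_classes):
        # Nonconformity of calibration samples that truly belong to class c.
        cal_alpha = 1.0 - cal_probs[cal_labels == c, c]
        # Nonconformity of each test sample under the hypothesis "label is c".
        test_alpha = 1.0 - test_probs[:, c]
        # Count calibration samples at least as nonconforming as the test one.
        n_geq = (cal_alpha[None, :] >= test_alpha[:, None]).sum(axis=1)
        p_values[:, c] = (n_geq + 1) / (cal_alpha.size + 1)
    return p_values


def prediction_sets(p_values, significance):
    """Keep every class whose p-value exceeds the chosen significance level."""
    return [tuple(int(c) for c in np.flatnonzero(row > significance))
            for row in p_values]


# Hypothetical toy data: columns are P(RRMS), P(PMS); labels 0 = RRMS, 1 = PMS.
CAL_PROBS = np.array([[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.4, 0.6],
                      [0.1, 0.9], [0.2, 0.8], [0.4, 0.6], [0.6, 0.4]])
CAL_LABELS = np.array([0, 0, 0, 0, 1, 1, 1, 1])
TEST_PROBS = np.array([[0.95, 0.05], [0.50, 0.50], [0.05, 0.95]])

if __name__ == "__main__":
    p = mondrian_p_values(CAL_PROBS, CAL_LABELS, TEST_PROBS)
    print(p)
    print(prediction_sets(p, significance=0.25))
```

With these toy numbers, the first and third samples receive a single label, whereas the second sample obtains the same p-value for both classes and therefore a prediction set containing both RRMS and PMS. This is the pattern one would expect, under the assumptions above, from a patient whose biochemical state lies between the two phenotypes.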

6. Concluding remarks

Although the road to computationally aided healthcare solutions is crooked and not without challenges, the field holds great potential. I foresee a future where data-driven solutions will aid clinical expertise, enabling more precise and sensitive diagnoses at an earlier stage. Once clinical practitioners have become accustomed to the concept, there will come an era when computationally aided solutions enter general healthcare. Until then, we as researchers and developers in the field need to: (1) collect high-quality data for model training, (2) standardize the development and validation of computational healthcare solutions, and (3) build trust in computational solutions by, e.g., increasing model explainability, initiating collaborative efforts with clinical practitioners and proving the value of development. Thereafter, we may take the next serious step towards algorithmic technology for clinical practice and, hence, possibly an earlier detection of progressive multiple sclerosis.

7. Acknowledgements

This work was carried out at the Department of Medical Sciences, Division of Clinical Chemistry at Uppsala University.

I would like to thank all colleagues and friends who supported me throughout this work, in particular:

My main supervisor Kim Kultima for your continuous support and encouragement for me to improve both as a researcher and as a person. You have taught me what research means, along with much life advice, such as “the devil lies in the details” and “having suspenders and belt”. These are quotes and mindsets that will probably stay with me for life.

My co-supervisor Joachim Burman for always making time in your busy schedule for our regular meetings, and for always answering my questions with the most clever real-world metaphors. You have taught me the value of great communication, which has resulted in my own striving to improve my communication skills.

My co-supervisor Ola Spjuth for your continuous positive attitude and for always having an open door. It was a real pleasure working with you. Also, thank you for letting me be part of the Uppsala PhenoMeNal team. It was a great experience that I learned a lot from.

My co-supervisors Camilla Svensson and Johan Lengqvist, as well as my almost co-supervisor Torbjörn Åkerfeldt.

My examiner Lena Hamelius.

Our research group leader at the Division of Clinical Chemistry, Anders Larsson.

My opponent Gabi Kastenmüller, and my thesis committee members Anders Helander, Carl Brunius and Maja Jagodic as well as my half-time reviewers Anna Fogdell-Hahn and Tuulia Hyötyläinen.

My fellow past and present colleagues Andreas Giannisis, Asma Al-Grety, Caroline Bjurnemark, Efthymia Chantzi, Elena Ossipova, Henrik Carlsson, Kristian Peters, Muhammad Kashif, Niclas Rollborn, Obaid Aftab, Sandy Abujrais and Shibu Krishnan with whom I have had the privilege to share office space.

My colleague Christina Zjukovskaja for grammar checking this thesis.

Payam Emami Khoonsari and Eva Freyhult for some of the most interesting discussions about multivariate statistics, multilevel modelling, supervised learning, to mention some of the topics that have been covered.

My colleagues Anders Larsson, Jon Ander Novella, Marco Capuccini, Samuel Lampa and Staffan Arvidsson McShane from my affiliated Department of Pharmaceutical Biosciences, Division of Pharmaceutical Bioinformatics. Thank you for great collaborations and for teaching me about cloud computing, conformal prediction, reproducible research and workflows.

My collaborators and co-authors Andreas Tolf, Anna Wiberg, Julia Steinmetz and Valter Niemelä for great collaborations and exchanged knowledge.

All co-authors that I have not mentioned.

Mia Wadelius for the time in FUG and for always having an open door, and Katarina Jonasson Vangen and Maria Nord for all administrative help and for friendly conversations.

The R&D team at Pelago Bioscience, for letting me apply my analytical skills in a different research setting and allowing me the freedom to explore new analytical concepts.

My brother Andreas Herman for great discussions about life, life goals, statistics, machine learning and everything that comes into our minds.

My parents Lena Herman and Milos Herman for shaping me into the person I am today.

My grandparents Ann-Britt Bengtsson, Ove Ronström, Ludmilla & Theodore Herman.

My friends Alona Nyberg, Joel Ås, Lovisa Pettersson and Marcus Hong for the frequent afterworks on Wednesdays, and all minor or major get-togethers whenever there is reason to celebrate (and oh there are many!).

My friend Ann-Christine (Anki) Immersköld, for being there for me since 2004.

All my other great friends and people around that bring joy in my life.

My significant other and husband-to-be Joakim Hellner. Thank you for always being there for me, for proofreading this thesis and for keeping me healthy with your great food.

Finally, I would like to give my warmest thanks to my cats Curie and Miramis who always bring me comfort in tough times.

Sincerely,

Stephanie Herman

July 2020

References

[1] E. J. Topol. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med., 25(1):44–56, 2019.
[2] K. H. Yu, A. L. Beam, and I. S. Kohane. Artificial intelligence in healthcare. Nat Biomed Eng, 2(10):719–731, 2018.
[3] Eric R. Kandel. Principles of Neural Science. 5th edition, 2013.
[4] Dale Purves. Neuroscience. 5th edition, 2012.
[5] A. P. Anokhin, N. Birbaumer, W. Lutzenberger, A. Nikolaev, and F. Vogel. Age increases brain complexity. Electroencephalogr Clin Neurophysiol, 99(1):63–68, 1996.
[6] D. Scherman, C. Desnos, F. Darchen, P. Pollak, F. Javoy-Agid, and Y. Agid. Striatal dopamine deficiency in Parkinson’s disease: role of aging. Ann. Neurol., 26(4):551–557, 1989.
[7] M. Berger, J. A. Gray, and B. L. Roth. The expanded biology of serotonin. Annu. Rev. Med., 60:355–366, 2009.
[8] R. Spector, S. Robert Snodgrass, and C. E. Johanson. A balanced view of the cerebrospinal fluid composition and functions: Focus on adult humans. Exp. Neurol., 273:57–68, 2015.
[9] C. E. Teunissen, A. Petzold, J. L. Bennett, F. S. Berven, L. Brundin, M. Comabella, D. Franciotta, J. L. Frederiksen, J. O. Fleming, R. Furlan, R. Q. Hintzen, S. G. Hughes, M. H. Johnson, E. Krasulova, J. Kuhle, M. C. Magnone, C. Rajda, K. Rejdak, H. K. Schmidt, V. van Pesch, E. Waubant, C. Wolf, G. Giovannoni, B. Hemmer, H. Tumani, and F. Deisenhammer. A consensus protocol for the standardization of cerebrospinal fluid collection and biobanking. Neurology, 73(22):1914–1922, 2009.
[10] G. Moalem, R. Leibowitz-Amit, E. Yoles, F. Mor, I. R. Cohen, and M. Schwartz. Autoimmune T cells protect neurons from secondary degeneration after central nervous system axotomy. Nat. Med., 5(1):49–55, 1999.
[11] Y. Ziv, N. Ron, O. Butovsky, G. Landa, E. Sudai, N. Greenberg, H. Cohen, J. Kipnis, and M. Schwartz. Immune cells contribute to the maintenance of neurogenesis and spatial learning abilities in adulthood. Nat. Neurosci., 9(2):268–275, 2006.
[12] J. Kipnis, H. Cohen, M. Cardon, Y. Ziv, and M. Schwartz. T cell deficiency leads to cognitive dysfunction: implications for therapeutic vaccination for schizophrenia and other psychiatric conditions. Proc. Natl. Acad. Sci. U.S.A., 101(21):8180–8185, 2004.
[13] J. D. Cherry, J. A. Olschowka, and M. K. O’Banion. Neuroinflammation and M2 microglia: the good, the bad, and the inflamed. J Neuroinflammation, 11:98, 2014.
[14] W. J. Streit, R. E. Mrak, and W. S. Griffin. Microglia and neuroinflammation: a pathological perspective. J Neuroinflammation, 1(1):14, 2004.
[15] B. Hemmer, M. Kerschensteiner, and T. Korn. Role of the innate and adaptive immune responses in the course of multiple sclerosis. Lancet Neurol, 14(4):406–419, 2015.
[16] C. A. Dendrou, L. Fugger, and M. A. Friese. Immunopathology of multiple sclerosis. Nat. Rev. Immunol., 15(9):545–558, 2015.
[17] H. Inojosa, U. Proschmann, K. Akgün, and T. Ziemssen. A focus on secondary progressive multiple sclerosis (SPMS): challenges in diagnosis and definition. J. Neurol., 2019.
[18] J. Lorscheider, K. Buzzard, V. Jokubaitis, T. Spelman, E. Havrdova, D. Horakova, M. Trojano, G. Izquierdo, M. Girard, P. Duquette, A. Prat, A. Lugaresi, F. Grand’Maison, P. Grammond, R. Hupperts, R. Alroughani, P. Sola, C. Boz, E. Pucci, J. Lechner-Scott, R. Bergamaschi, C. Oreja-Guevara, G. Iuliano, V. Van Pesch, F. Granella, C. Ramo-Tello, D. Spitaleri, T. Petersen, M. Slee, F. Verheul, R. Ampapa, M. P. Amato, P. McCombe, S. Vucic, J. L. Sánchez Menoyo, E. Cristiano, M. H. Barnett, S. Hodgkinson, J. Olascoaga, M. L. Saladino, O. Gray, C. Shaw, F. Moore, H. Butzkueven, and T. Kalincik. Defining secondary progressive multiple sclerosis. Brain, 139(Pt 9):2395–2405, 2016.
[19] F. D. Lublin, S. C. Reingold, J. A. Cohen, G. R. Cutter, P. S. Sørensen, A. J. Thompson, J. S. Wolinsky, L. J. Balcer, B. Banwell, F. Barkhof, B. Bebo, P. A. Calabresi, M. Clanet, G. Comi, R. J. Fox, M. S. Freedman, A. D. Goodman, M. Inglese, L. Kappos, B. C. Kieseier, J. A. Lincoln, C. Lubetzki, A. E. Miller, X. Montalban, P. W. O’Connor, J. Petkau, C. Pozzilli, R. A. Rudick, M. P. Sormani, O. Stüve, E. Waubant, and C. H. Polman. Defining the clinical course of multiple sclerosis: the 2013 revisions. Neurology, 83(3):278–286, 2014.
[20] C. Mainero, F. Caramia, C. Pozzilli, A. Pisani, I. Pestalozza, G. Borriello, L. Bozzao, and P. Pantano. fMRI evidence of brain reorganization during attention and memory tasks in multiple sclerosis. Neuroimage, 21(3):858–867, 2004.
[21] M. López-Góngora, A. Escartín, S. Martínez-Horta, R. Fernández-Bobadilla, L. Querol, S. Romero, M. Á. Mañanas, and J. Riba. Neurophysiological Evidence of Compensatory Brain Mechanisms in Early-Stage Multiple Sclerosis. PLoS ONE, 10(8):e0136786, 2015.
[22] B. A. Cree, P. A. Gourraud, J. R. Oksenberg, C. Bevan, E. Crabtree-Hartman, J. M. Gelfand, D. S. Goodin, J. Graves, A. J. Green, E. Mowry, D. T. Okuda, D. Pelletier, H. C. von Büdingen, S. S. Zamvil, A. Agrawal, S. Caillier, C. Ciocca, R. Gomez, R. Kanner, R. Lincoln, A. Lizee, P. Qualley, A. Santaniello, L. Suleiman, M. Bucci, V. Panara, N. Papinutto, W. A. Stern, A. H. Zhu, G. R. Cutter, S. Baranzini, R. G. Henry, and S. L. Hauser. Long-term evolution of multiple sclerosis disability in the treatment era. Ann. Neurol., 80(4):499–510, 2016.
[23] H. Lassmann, J. van Horssen, and D. Mahad. Progressive multiple sclerosis: pathology and pathogenesis. Nat Rev Neurol, 8(11):647–656, 2012.
[24] K. Dettmer, P. A. Aronov, and B. D. Hammock. Mass spectrometry-based metabolomics. Mass Spectrom Rev, 26(1):51–78, 2007.
[25] G. J. Patti, O. Yanes, and G. Siuzdak. Innovation: Metabolomics: the apogee of the omics trilogy. Nat Rev Mol Cell Biol., 13(4):263–269, 2012.
[26] C. B. Newgard. Metabolomics and Metabolic Diseases: Where Do We Stand? Cell Metab., 25(1):43–56, 2017.
[27] E. Bach, S. Szedmak, C. Brouard, S. Böcker, and J. Rousu. Liquid-chromatography retention order prediction for metabolite identification. Bioinformatics, 34(17):i875–i883, 2018.
[28] M. Vinaixa, E. L. Schymanski, S. Neumann, M. Navarro, R. M. Salek, and O. Yanes. Mass spectral databases for LC/MS- and GC/MS-based metabolomics: State of the field and future prospects. Trends in Analytical Chemistry, 78:23–35, 2016.
[29] A. C. Schrimpe-Rutledge, S. G. Codreanu, S. D. Sherrod, and J. A. McLean. Untargeted Metabolomics Strategies-Challenges and Emerging Directions. J. Am. Soc. Mass Spectrom., 27(12):1897–1905, 2016.
[30] D. Broadhurst, R. Goodacre, S. N. Reinke, J. Kuligowski, I. D. Wilson, M. R. Lewis, and W. B. Dunn. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics, 14(6):72, 2018.
[31] W. B. Dunn, I. D. Wilson, A. W. Nicholls, and D. Broadhurst. The importance of experimental design and QC samples in large-scale and MS-driven untargeted metabolomic studies of humans. Bioanalysis, 4(18):2249–2264, 2012.
[32] T. Thomas, D. Stefanoni, J. A. Reisz, T. Nemkov, L. Bertolone, R. O. Francis, K. E. Hudson, J. C. Zimring, K. C. Hansen, E. A. Hod, S. L. Spitalnik, and A. D’Alessandro. COVID-19 infection results in alterations of the kynurenine pathway and fatty acid metabolism that correlate with IL-6 levels and renal status. medRxiv, 2020.
[33] S. Herman, V. Niemelä, P. Emami Khoonsari, J. Sundblom, J. Burman, A. M. Landtblom, O. Spjuth, D. Nyholm, and K. Kultima. Alterations in the tyrosine and phenylalanine pathways revealed by biochemical profiling in cerebrospinal fluid of Huntington’s disease subjects. Sci Rep, 9(1):4129, 2019.
[34] C. Peña-Bautista, M. Roca, D. Hervás, A. Cuevas, R. López-Cuevas, M. Vento, M. Baquero, A. García-Blanco, and C. Cháfer-Pericás. Plasma metabolomics in early Alzheimer’s disease patients diagnosed with amyloid biomarker. J Proteomics, 200:144–152, 2019.
[35] L. Shi, C. Brunius, I. A. Bergdahl, I. Johansson, O. Rolandsson, C. Donat Vargas, H. Kiviranta, K. Hanhineva, A. Åkesson, and R. Landberg. Joint Analysis of Metabolite Markers of Fish Intake and Persistent Organic Pollutants in Relation to Type 2 Diabetes Risk in Swedish Adults. J. Nutr., 149(8):1413–1423, 2019.
[36] M. Rådjursöga, H. M. Lindqvist, A. Pedersen, G. B. Karlsson, D. Malmodin, C. Brunius, L. Ellegård, and A. Winkvist. The 1H NMR serum metabolomics response to a two meal challenge: a cross-over dietary intervention study in healthy human volunteers. Nutr J, 18(1):25, 2019.
[37] F. Madrid-Gambin, C. Brunius, M. Garcia-Aloy, S. Estruel-Amades, R. Landberg, and C. Andres-Lacueva. Untargeted 1H NMR-Based Metabolomics Analysis of Urine and Serum Profiles after Consumption of Lentils, Chickpeas, and Beans: An Extended Meal Study To Discover Dietary Biomarkers of Pulses. J. Agric. Food Chem., 66(27):6997–7005, 2018.
[38] C. Wildberg, A. Masuch, K. Budde, G. Kastenmüller, A. Artati, W. Rathmann, J. Adamski, T. Kocher, H. Völzke, M. Nauck, N. Friedrich, and M. Pietzner. Plasma metabolomics to identify and stratify patients with impaired glucose tolerance. J Clin Endocrinol Metab., 104(12):6357–6370, 2019.
[39] M. Pietzner, A. Kaul, A. K. Henning, G. Kastenmüller, A. Artati, M. M. Lerch, J. Adamski, M. Nauck, and N. Friedrich. Comprehensive metabolic profiling of chronic low-grade inflammation among generally healthy individuals. BMC Med., 15(1):210, 2017.
[40] G. Quintas, X. Reche, J. D. Sanjuan-Herráez, H. Martínez, M. Herrero, X. Valle, M. Masa, and G. Rodas. Urine metabolomic analysis for monitoring internal load in professional football players. Metabolomics, 16(45), 2020.
[41] H. Carlsson, S. Abujrais, S. Herman, P. E. Khoonsari, T. Åkerfeldt, A. Svenningsson, J. Burman, and K. Kultima. Targeted metabolomics of CSF in healthy individuals and patients with secondary progressive multiple sclerosis using high-resolution mass spectrometry. Metabolomics, 16(2):26, 2020.
[42] H. Carlsson, K. Hjorton, S. Abujrais, L. Rönnblom, T. Åkerfeldt, and K. Kultima. Measurement of hydroxychloroquine in blood from SLE patients using LC-HRMS-evaluation of whole blood, plasma, and serum as sample matrices. Arthritis Res Ther., 22(1), 2020.
[43] V. Gallo, M. Egger, V. McCormack, P. B. Farmer, J. P. Ioannidis, M. Kirsch-Volders, G. Matullo, D. H. Phillips, B. Schoket, U. Stromberg, R. Vermeulen, C. Wild, M. Porta, and P. Vineis. STrengthening the Reporting of OBservational studies in Epidemiology–Molecular Epidemiology (STROBE-ME): an extension of the STROBE Statement. PLoS Med., 8(10):e1001117, 2011.
[44] Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin. Pharmacol. Ther., 69(3):89–95, 2001.
[45] J. M. Colón-Franco, P. M. M. Bossuyt, A. Algeciras-Schimnich, C. Bird, J. Engstrom-Melnyk, M. Fleisher, M. Kattan, and G. Lambert-Messerlian. Current and Emerging Multianalyte Assays with Algorithmic Analyses-Are Laboratories Ready for Clinical Adoption? Clin. Chem., 64(6):885–891, 2018.
[46] A. Möller, H. Olsson, H. Grönberg, M. Eklund, M. Aly, and T. Nordström. The Stockholm3 blood-test predicts clinically-significant cancer on biopsy: independent validation in a multi-center community cohort. Prostate Cancer Prostatic Dis., 22(1):137–142, 2019.
[47] H. Grönberg, J. Adolfsson, M. Aly, T. Nordström, P. Wiklund, Y. Brandberg, J. Thompson, F. Wiklund, J. Lindberg, M. Clements, L. Egevad, and M. Eklund. Prostate cancer screening in men aged 50-69 years (STHLM3): a prospective population-based diagnostic study. Lancet Oncol., 16(16):1667–1676, 2015.
[48] C. S. Ho, C. W. Lam, M. H. Chan, R. C. Cheung, L. K. Law, L. C. Lit, K. F. Ng, M. W. Suen, and H. L. Tai. Electrospray ionisation mass spectrometry: principles and clinical applications. Clin Biochem Rev, 24(1):3–12, 2003.
[49] S. Banerjee and S. Mazumdar. Electrospray ionization mass spectrometry: a technique to access the information beyond the molecular weight of the analyte. Int J Anal Chem, 2012:282574, 2012.
[50] Elizabeth S. Hecht, Michaela Scigelova, Shannon Eliuk, and Alexander Makarov. Fundamentals and Advances of Orbitrap Mass Spectrometry, pages 1–40. American Cancer Society, 2019.
[51] Q. Hu, R. J. Noll, H. Li, A. Makarov, M. Hardman, and R. Graham Cooks. The Orbitrap: a new mass spectrometer. J Mass Spectrom, 40(4):430–443, 2005.
[52] M. Pettersson Bergstrand, M. R. Meyer, O. Beck, and A. Helander. Human urinary metabolic patterns of the designer benzodiazepines flubromazolam and pyrazolam studied by liquid chromatography-high resolution mass spectrometry. Drug Test Anal, 10(3):496–506, 2018.
[53] I. Kohler, A. Verhoeven, R. J. Derks, and M. Giera. Analytical pitfalls and challenges in clinical metabolomics. Bioanalysis, 8(14):1509–1532, 2016.
[54] J. A. Kirwan, D. I. Broadhurst, R. L. Davidson, and M. R. Viant. Characterising and correcting batch variation in an automated direct infusion mass spectrometry (DIMS) metabolomics workflow. Anal Bioanal Chem, 405(15):5147–5157, 2013.
[55] O. Teahan, S. Gamble, E. Holmes, J. Waxman, J. K. Nicholson, C. Bevan, and H. C. Keun. Impact of analytical bias in metabonomic studies of human blood serum and plasma. Anal. Chem., 78(13):4307–4318, 2006.
[56] D. I. Broadhurst and D. B. Kell. Statistical strategies for avoiding false discoveries in metabolomics and related experiments. Metabolomics, 2(4):171–196, 2006.
[57] Douglas C. Montgomery. Design and Analysis of Experiments. 4th edition, 1997.
[58] R. King, R. Bonfiglio, C. Fernandez-Metzler, C. Miller-Stein, and T. Olah. Mechanistic investigation of ionization suppression in electrospray ionization. J. Am. Soc. Mass Spectrom., 11(11):942–950, 2000.
[59] N. C. Hughes, E. Y. K. Wong, J. Fan, and N. Bajaj. Determination of carryover and contamination for mass spectrometry-based chromatographic assays. AAPS J, 9(3):E353–E360, 2007.
[60] P. Emami Khoonsari, P. Moreno, S. Bergmann, J. Burman, M. Capuccini, M. Carone, M. Cascante, P. de Atauri, C. Foguet, A. N. Gonzalez-Beltran, T. Hankemeier, K. Haug, S. He, S. Herman, D. Johnson, N. Kale, A. Larsson, S. Neumann, K. Peters, L. Pireddu, P. Rocca-Serra, P. Roger, R. Rueedi, C. Ruttkies, N. Sadawi, R. M. Salek, S. A. Sansone, D. Schober, V. Selivanov, E. A. Thévenot, M. van Vliet, G. Zanetti, C. Steinbeck, K. Kultima, and O. Spjuth. Interoperable and scalable data analysis with microservices: applications in metabolomics. Bioinformatics, 35(19):3752–3760, 2019.
[61] M. Sturm, A. Bertsch, C. Gröpl, A. Hildebrandt, R. Hussong, E. Lange, N. Pfeifer, O. Schulz-Trieglaff, A. Zerck, K. Reinert, and O. Kohlbacher. OpenMS - an open-source software framework for mass spectrometry. BMC Bioinformatics, 9:163, 2008.
[62] M. R. Berthold, N. Cebron, F. Dill, T. R. Gabriel, T. Kötter, T. Meinl, P. Ohl, C. Sieb, K. Thiel, and B. Wiswedel. KNIME: The Konstanz Information Miner. In Data Analysis, Machine Learning and Applications, pages 319–326, 2008.
[63] J. Zhang, E. Gonzalez, T. Hestilow, W. Haskins, and Y. Huang. Review of peak detection algorithms in liquid-chromatography-mass spectrometry. Curr. Genomics, 10(6):388–401, 2009.
[64] E. Kenar, H. Franken, S. Forcisi, K. Wörmann, H. Häring, R. Lehmann, P. Schmitt-Kopplin, A. Zell, and O. Kohlbacher. Automated label-free quantification of metabolites from liquid chromatography–mass spectrometry data. Mol Cell Proteomics., 13(1):348–359, 2014.
[65] H. Weisser, S. Nahnsen, J. Grossmann, L. Nilse, A. Quandt, H. Brauer, M. Sturm, E. Kenar, O. Kohlbacher, R. Aebersold, and L. Malmström. An automated pipeline for high–throughput label-free quantitative proteomics. J Proteome Res., 12(4):1628–1644, 2013.
[66] H. Mizuno, K. Ueda, Y. Kobayashi, N. Tsuyama, K. Todoroki, J. Z. Min, and T. Toyo’oka. The great importance of normalization of LC-MS data for highly-accurate non-targeted metabolomics. Biomed Chromatogr., 31(1), 2017.
[67] B. A. Ejigu, D. Valkenborg, G. Baggerman, M. Vanaerschot, E. Witters, J. Dujardin, T. Burzykowski, and M. Berg. Evaluation of normalization methods to pave the way towards large-scale LC-MS-based metabolomics profiling experiments. OMICS: A Journal of Integrative Biology, 17(9):473–485, 2013.
[68] W. B. Dunn, D. Broadhurst, P. Begley, E. Zelena, S. Francis-McIntyre, N. Anderson, M. Brown, J. D. Knowles, A. Halsall, J. N. Haselden, A. W. Nicholls, I. D. Wilson, D. B. Kell, and R. Goodacre. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nat Protoc, 6(7):1060–1083, 2011.
[69] K. Kultima, A. Nilsson, B. Scholz, U. L. Rossbach, M. Fälth, and P. E. Andrén. Development and evaluation of normalization methods for label-free relative quantification of endogenous peptides. Mol. Cell Proteomics, 8(10):2285–2295, 2009.
[70] C. Menni, G. Kastenmüller, A. K. Petersen, J. T. Bell, M. Psatha, P. C. Tsai, C. Gieger, H. Schulz, I. Erte, S. John, M. J. Brosnan, S. G. Wilson, L. Tsaprouni, E. M. Lim, B. Stuckey, P. Deloukas, R. Mohney, K. Suhre, T. D. Spector, and A. M. Valdes. Metabolomic markers reveal novel pathways of ageing and early development in human populations. Int J Epidemiol, 42(4):1111–1119, 2013.
[71] B. F. Darst, R. L. Koscik, K. J. Hogan, S. C. Johnson, and C. D. Engelman. Longitudinal plasma metabolomics of aging and sex. Aging (Albany NY), 11(4):1262–1282, 2019.
[72] Z. Yu, G. Zhai, P. Singmann, Y. He, T. Xu, C. Prehn, W. Römisch-Margl, E. Lattka, C. Gieger, N. Soranzo, J. Heinrich, M. Standl, E. Thiering, K. Mittelstraß, H. E. Wichmann, A. Peters, K. Suhre, Y. Li, J. Adamski, T. D. Spector, T. Illig, and R. Wang-Sattler. Human serum metabolic profiles are age dependent. Aging Cell, 11(6):960–967, 2012.
[73] M. J. Rist, A. Roth, L. Frommherz, C. H. Weinert, R. Krüger, B. Merz, D. Bunzel, C. Mack, B. Egert, A. Bub, B. Görling, P. Tzvetkova, B. Luy, I. Hoffmann, S. E. Kulling, and B. Watzl. Metabolite patterns predicting sex and age in participants of the Karlsruhe Metabolomics and Nutrition (KarMeN) study. PLoS ONE, 12(8):e0183228, 2017.
[74] W. B. Dunn, W. Lin, D. Broadhurst, P. Begley, M. Brown, E. Zelena, A. A. Vaughan, A. Halsall, N. Harding, J. D. Knowles, S. Francis-McIntyre, A. Tseng, D. I. Ellis, S. O’Hagan, G. Aarons, B. Benjamin, S. Chew-Graham, C. Moseley, P. Potter, C. L. Winder, C. Potts, P. Thornton, C. McWhirter, M. Zubair, M. Pan, A. Burns, J. K. Cruickshank, G. C. Jayson, N. Purandare, F. C. Wu, J. D. Finn, J. N. Haselden, A. W. Nicholls, I. D. Wilson, R. Goodacre, and D. B. Kell. Molecular phenotyping of a UK population: defining the human serum metabolome. Metabolomics, 11:9–26, 2015.
[75] K. Mittelstrass, J. S. Ried, Z. Yu, J. Krumsiek, C. Gieger, C. Prehn, W. Roemisch-Margl, A. Polonikov, A. Peters, F. J. Theis, T. Meitinger, F. Kronenberg, S. Weidinger, H. E. Wichmann, K. Suhre, R. Wang-Sattler, J. Adamski, and T. Illig. Discovery of sexual dimorphisms in metabolic and genetic biomarkers. PLoS Genet., 7(8):e1002215, 2011.
[76] J. Krumsiek, K. Mittelstrass, K. T. Do, F. Stückler, J. Ried, J. Adamski, A. Peters, T. Illig, F. Kronenberg, N. Friedrich, M. Nauck, M. Pietzner, D. O. Mook-Kanamori, K. Suhre, C. Gieger, H. Grallert, F. J. Theis, and G. Kastenmüller. Gender-specific pathway differences in the human serum metabolome. Metabolomics, 11(6):1815–1833, 2015.
[77] F. Falahati, D. Ferreira, H. Soininen, P. Mecocci, B. Vellas, M. Tsolaki, I. Kłoszewska, S. Lovestone, M. Eriksdotter, L. Wahlund, A. Simmons, E. Westman, AddNeuroMed consortium, and the Alzheimer’s Disease Neuroimaging Initiative. The effect of age correction on multivariate classification in Alzheimer’s disease, with a focus on the characteristics of incorrectly and correctly classified subjects. Brain Topogr., 29(2):296–307, 2016.
[78] S. Wold, M. Sjöström, and L. Eriksson. PLS-regression: a basic tool of chemometrics. Chemometr. Intell. Lab. Syst., 58(2):109–130, 2001.
[79] K. Pearson. On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 2(11):559–572, 1901.
[80] H. Hotelling. Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6):417–441, 1933.
[81] Trevor Hastie, Robert Tibshirani, and Jerome H. Friedman. The Elements of Statistical Learning. 2nd edition, 2009.
[82] M. Ringnér. What is principal component analysis? Nat. Biotechnol., 26(3):303–304, 2008.
[83] S. Herman, P. Emami Khoonsari, O. Aftab, S. Krishnan, E. Strömbom, R. Larsson, U. Hammerling, O. Spjuth, K. Kultima, and M. Gustafsson. Mass spectrometry based metabolomics for in vitro systems pharmacology: pitfalls, challenges, and computational solutions. Metabolomics, 13(7):79, 2017.
[84] J. Trygg and S. Wold. Orthogonal projections to latent structures (O-PLS). Journal of Chemometrics, 16(3):119–128, 2002.
[85] M. Barker and W. Rayens. Partial least squares for discrimination. Journal of Chemometrics, 17(3):166–173, 2003.
[86] Peter Bühlmann and Sara Van De Geer. Statistics for High-Dimensional Data. 2011.
[87] Andrew Gelman and Jennifer Hill. Data Analysis Using Regression and Multilevel/Hierarchical Models. 2006.
[88] Stephen W. Raudenbush and Anthony S. Bryk. Hierarchical Linear Models: Applications and Data Analysis Methods. 2nd edition, 2002.
[89] Kevin P. Murphy. Machine Learning: A Probabilistic Perspective. 2012.
[90] J. A. Westerhuis, H. C. J. Hoefsloot, S. Smit, D. J. Vis, A. K. Smilde, E. J. J. van Velzen, J. P. M. van Duijnhoven, and F. A. van Dorsten. Assessment of PLSDA cross validation. Metabolomics, 4:81–89, 2008.
[91] Paul W. Mielke Jr. and Kenneth J. Berry. Permutation Methods: A Distance Function Approach. 2001.
[92] Harris Papadopoulos. Tools in Artificial Intelligence, chapter Inductive Conformal Prediction: Theory and Application to Neural Networks. IntechOpen, 2008.
[93] U. Norinder, L. Carlsson, S. Boyer, and M. Eklund. Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination. J Chem Inf Model, 54(6):1596–1603, 2014.
[94] Vladimir Vovk, Alexander Gammerman, and Glenn Shafer. Algorithmic Learning in a Random World. 2005.
[95] C. Saunders, A. Gammerman, and V. Vovk. Machine learning applications of algorithmic randomness. Proceedings of the 16th Int. Conf. on Machine Learning., pages 444–453, 1999.
[96] C. Saunders, A. Gammerman, and V. Vovk. Transduction with confidence and credibility. Proceedings of the 16th Int. Joint Conf. on Artificial Intelligence, 2:722–726, 1999.
[97] C. H. Polman, S. C. Reingold, B. Banwell, M. Clanet, J. A. Cohen, M. Filippi, K. Fujihara, E. Havrdova, M. Hutchinson, L. Kappos, F. D. Lublin, X. Montalban, P. O’Connor, M. Sandberg-Wollheim, A. J. Thompson, E. Waubant, B. Weinshenker, and J. S. Wolinsky. Diagnostic criteria for multiple sclerosis: 2010 revisions to the McDonald criteria. Ann. Neurol., 69(2):292–302, 2011.
[98] Richard McElreath. Statistical Rethinking: A Bayesian Course with Examples in R and Stan. 2015.
[99] C. Ross and I. Swetlitz. IBM’s Watson supercomputer recommended ’unsafe and incorrect’ cancer treatments, internal documents show. In Stat News, 2018. https://www.statnews.com/2018/07/25/ibm-watson-recommended-unsafeincorrect-treatments/.
[100] M. Sjögren, H. Vanderstichele, H. Agren, O. Zachrisson, M. Edsbagge, C. Wikkelsø, I. Skoog, A. Wallin, L. O. Wahlund, J. Marcusson, K. Nägga, N. Andreasen, P. Davidsson, E. Vanmechelen, and K. Blennow. Tau and Abeta42 in Cerebrospinal Fluid From Healthy Adults 21-93 Years of Age: Establishment of Reference Values. Clin Chem., 47(10):1776–81, 2001.
[101] K. Rejdak, H. Bartosik-Psujek, B. Dobosz, T. Kocki, P. Grieb, G. Giovannoni, W. A. Turski, and Z. Stelmasiak. Decreased level of kynurenic acid in cerebrospinal fluid of relapsing-onset multiple sclerosis patients. Neurosci Lett., 331(1):63–65, 2002.
[102] Z. Hartai, P. Klivenyi, T. Janaky, B. Penke, L. Dux, and L. Vecsei. Kynurenine metabolism in multiple sclerosis. Acta Neurol Scand., 112(2):93–96, 2005.
[103] B. Nourbakhsh, P. Bhargava, H. Tremlett, J. Hart, J. Graves, and E. Waubant. Altered tryptophan metabolism is associated with pediatric multiple sclerosis risk and course. Ann Clin Transl Neurol., 5(10):1211–1221, 2018.
[104] M. D. Lovelace, B. Varney, G. Sundaram, M. J. Lennon, C. K. Lim, K. Jacobs, G. J. Guillemin, and B. J. Brew. Recent Evidence for an Expanded Role of the Kynurenine Pathway of Tryptophan Metabolism in Neurological Diseases. Neuropharmacology, 112(Pt B):373–388, 2017.
[105] C. K. Lim, A. Bilgin, D. B. Lovejoy, V. Tan, S. Bustamante, B. V. Taylor, A. Bessede, B. J. Brew, and G. J. Guillemin. Kynurenine pathway metabolomics predicts and provides mechanistic insight into multiple sclerosis progression. Sci Rep., 7:41473, 2017.
[106] M. D. Lovelace, B. Varney, G. Sundaram, N. F. Franco, M. L. Ng, S. Pai, C. K. Lim, G. J. Guillemin, and B. J. Brew. Current evidence for a role of the kynurenine pathway of tryptophan metabolism in multiple sclerosis. Front Immunol., 7:246, 2016.
[107] C. Rajda, Z. Majláth, D. Pukoli, and L. Vécsei. The dialogue between the immune system and the central nervous system. Int J Mol Sci., 16:18270–18282, 2015.
[108] L. B. Grech, E. Butler, S. Stuckey, and R. Hester. Neuroprotective benefits of antidepressants in multiple sclerosis: Are we missing the mark? J Neuropsychiatry Clin Neurosci., 31(4):289–297, 2019.
[109] S. B. Patten, R. A. Marrie, and M. G. Carta. Depression in multiple sclerosis. Int Rev Psychiatry., 29(5):463–472, 2017.
[110] J. Chataway, F. D. Angelis, P. Connick, R. A. Parker, D. Plantone, A. Doshi, N. John, J. Stutters, D. MacManus, F. P. Carrasco, F. Barkhof, S. Ourselin, M. Braisher, M. Ross, G. Cranswick, S. H. Pavitt, G. Giovannoni, C. A. G. Wheeler-Kingshott, C. Hawkins, B. Sharrack, R. Bastow, C. J. Weir, N. Stallard, S. Chandran, and MS-SMART Investigators. Efficacy of three neuroprotective drugs in secondary progressive multiple sclerosis (MS-SMART): a phase 2b, multiarm, double-blind, randomised placebo-controlled trial. Lancet Neurol., 19(3):214–225, 2020.
[111] J. I. Kim, S. Ganesan, S. X. Luo, Y. Wu, E. Park, E. J. Huang, L. Chen, and J. B. Ding. Aldehyde dehydrogenase 1a1 mediates a GABA synthesis pathway in midbrain dopaminergic neurons. Science, 350(6256):102–106, 2015.
[112] E. Piket, G. Y. Zheleznyakova, L. Kular, and M. Jagodic. Small non-coding RNAs as important players, biomarkers and therapeutic targets in multiple sclerosis: A comprehensive overview. J. Autoimmun., 101:17–25, 2019.
[113] K. Regev, B. C. Healy, A. Paul, C. Diaz-Cruz, M. A. Mazzola, R. Raheja, B. I. Glanz, P. Kivisäkk, T. Chitnis, M. Jagodic, F. Piehl, T. Olsson, M. Khademi, S. Hauser, J. Oksenberg, S. J. Khoury, H. L. Weiner, and R. Gandhi. Identification of MS-specific serum miRNAs in an international multicenter study. Neurol Neuroimmunol Neuroinflamm, 5(5):e491, 2018.
[114] P. Bergman, E. Piket, M. Khademi, T. James, L. Brundin, T. Olsson, F. Piehl, and M. Jagodic. Circulating miR-150 in CSF is a novel candidate biomarker for multiple sclerosis. Neurol Neuroimmunol Neuroinflamm, 3(3):e219, 2016.
