Towards an Earlier Detection of Progressive Multiple Sclerosis

To everyone who supported me

List of papers

This thesis is based on the following papers, which are referred to in the text by their roman numerals.

I Herman S, Åkerfeldt T, Spjuth O, Burman J, Kultima K. Biochemical differences in cerebrospinal fluid between secondary progressive and relapsing–remitting multiple sclerosis. Cells 8(2):84 (2019).

II Herman S, Khoonsari PE, Tolf A, Steinmetz J, Zetterberg H, Åkerfeldt T, Jakobsson PJ, Larsson A, Spjuth O, Burman J, Kultima K. Integration of magnetic resonance imaging and protein and metabolite CSF measurements to enable early diagnosis of secondary progressive multiple sclerosis. Theranostics 8(16):4477–4490 (2018).

III Herman S, Arvidsson McShane S, Zhukovshy C, Khoonsari PE, Burman J, Spjuth O, Kultima K. Disease phenotype prediction in multiple sclerosis. Manuscript (2020).

IV Herman S, Khoonsari PE, Spjuth O, Burman J, Kultima K. A biochemical signature of progressive multiple sclerosis. Manuscript (2020).

Reprints were made with permission from the publishers.

List of related papers

The following papers were not included in the thesis.

I Carlsson H, Abujrais S, Herman S, Khoonsari PE, Åkerfeldt T, Svenningsson A, Burman J, Kultima K. Targeted metabolomics of CSF in healthy individuals and patients with secondary progressive multiple sclerosis using high-resolution mass spectrometry. Metabolomics 16(2):26 (2020).

II Wiberg A, Olsson–Strömberg U, Herman S, Kultima K, Burman J. Profound but transient changes in the inflammatory milieu of the blood during autologous hematopoietic stem cell transplantation. Biology of Blood and Marrow Transplantation 26(1):50–57 (2020).

III Peters K, Bradbury J, Bergmann S, Capuccini M, Cascante M, de Atauri P, Ebbels TMD, Foguet C, Glen R, Gonzalez-Beltran A, Günther UL, Handakas E, Hankemeier T, Haug K, Herman S, Holub P, Izzo M, Jacob D, Johnson D, Jourdan F, Kale N, Karaman I, Khalili B, Emami Khonsari P, Kultima K, Lampa S, Larsson A, Ludwig C, Moreno P, Neumann S, Novella JA, O’Donovan C, Pearce JTM, Peluso A, Piras ME, Pireddu L, Reed MAC, Rocca–Serra P, Roger P, Rosato A, Rueedi R, Ruttkies C, Sadawi N, Salek RM, Sansone SA, Selivanov V, Spjuth O, Schober D, Thévenot EA, Tomasoni M, van Rijswijk M, van Vliet M, Viant MR, Weber RJM, Zanetti G, Steinbeck C. PhenoMeNal: processing and analysis of metabolomics data in the cloud. Gigascience 8(2) (2019).

IV Khoonsari PE, Moreno P, Bergmann S, Burman J, Capuccini M, Carone M, Cascante M, de Atauri P, Foguet C, Gonzalez–Beltran A, Hankemeier T, Haug K, He S, Herman S, Johnson D, Kale N, Larsson A, Neumann S, Peters K, Pireddu L, Rocca–Serra P, Roger P, Rueedi R, Ruttkies C, Sadawi N, Salek RM, Sansone SA, Schober D, Selivanov V, Thévenot EA, van Vliet M, Zanetti G, Steinbeck C, Kultima K, Spjuth O. Interoperable and scalable metabolomics data analysis with microservices. Bioinformatics 35(19):3752–3760 (2019).

V Novella JA, Khoonsari PE, Herman S, Whitenack D, Capuccini M, Burman J, Kultima K, Spjuth O. Container-based bioinformatics with Pachyderm. Bioinformatics 35(5):839–846 (2019).

VI Herman S, Niemelä V, Khoonsari PE, Sundblom J, Burman J, Landtblom AM, Spjuth O, Nyholm D, Kultima K. Alterations in the tyrosine and phenylalanine pathways revealed by biochemical profiling in cerebrospinal fluid of Huntington’s disease subjects. Scientific Reports 9(1):4129 (2019).
VII Herman S, Khoonsari PE, Aftab O, Krishnan S, Strömbom E, Larsson R, Hammerling U, Spjuth O, Kultima K, Gustafsson M. Mass spectrometry based metabolomics for in vitro systems pharmacology: pitfalls, challenges, and computational solutions. Metabolomics 13(7):79 (2017).

Contents

1 Introduction
    1.1 The central nervous system
    1.2 Neurons
    1.3 Neurotransmitters
    1.4 Cerebrospinal fluid
    1.5 Neuroimmunology
    1.6 Multiple sclerosis
    1.7 Metabolomics
    1.8 Biomarkers and multianalyte algorithmic assays
2 Aims
3 Methodologies
    3.1 Mass spectrometry
    3.2 Tandem mass spectrometry
    3.3 Liquid chromatography
    3.4 Experimental design
    3.5 Metabolite identification
    3.6 Metabolite quantification
    3.7 Normalization
    3.8 Covariate correction
    3.9 Dimensionality reduction and latent variable models
    3.10 Regularization techniques
    3.11 Multilevel modelling
    3.12 Achieving robust results
    3.13 Model performance estimation
    3.14 Conformal prediction
4 Study summaries
    4.1 Paper I
    4.2 Paper II
    4.3 Paper III
    4.4 Paper IV
    4.5 Principal findings
5 Reflections
    5.1 Experimental aspects
    5.2 Computational thoughts
    5.3 Biological contemplations
    5.4 Future work
6 Concluding remarks
7 Acknowledgements
References

Abbreviations

AUC/AUROC    Area under the ROC curve
BER          Balanced error rate
CID          Collision-induced dissociation
CNS          Central nervous system
CSF          Cerebrospinal fluid
EDSS         Expanded disability status score
ESI          Electrospray ionization
HCD          Higher collision energy dissociation
HMDB         The Human Metabolome Database
HPLC         High-performance liquid chromatography
LASSO        Least absolute shrinkage and selection operator
LC-MS        Liquid chromatography-mass spectrometry
m/z          Mass-to-charge ratio
MAAA         Multianalyte assay with algorithmic analyses
MRI          Magnetic resonance imaging
MS/MS        Tandem mass spectrometry
OLS          Ordinary least squares
OPLS-DA      Orthogonal partial least squares discriminant analysis
PC           Principal component
PCA          Principal component analysis
PLS-DA       Partial least squares discriminant analysis
PMS          Progressive multiple sclerosis
PNS          Peripheral nervous system
PPMS         Primary progressive multiple sclerosis
ROC          Receiver operating characteristic
