Proteomic-based Signature of Brain-related Proteins as Novel Candidate Biomarkers for Alzheimer’s Disease Diagnosis

by

Ilijana Begcevic

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

Laboratory Medicine and Pathobiology

University of Toronto

© Copyright by Ilijana Begcevic 2018

Proteomic-based Signature of Brain-related Proteins as Novel Candidate Biomarkers for Alzheimer’s Disease Diagnosis

Ilijana Begcevic

Doctor of Philosophy

Laboratory Medicine and Pathobiology

University of Toronto

2018

Abstract

Alzheimer’s disease (AD) is the most common type of dementia and is characterized by progressive memory loss and cognitive decline, with a typical age of onset after 65 years. The main disease hallmarks are extracellular amyloid beta deposits and intracellular neurofibrillary tangles. Brain pathology is reflected in cerebrospinal fluid (CSF), where the levels of the core biomarkers amyloid beta 1-42, total tau, and phosphorylated tau are altered relative to cognitively normal elderly individuals. There is a need for novel disease-specific biomarkers that capture the disease more accurately and at an earlier stage, predict disease severity, and can be used in research and clinical trial settings for patient enrichment and stratification or to monitor drug efficacy. The aim of the present dissertation is to evaluate the diagnostic potential of brain-related proteins in the CSF of AD patients.

The study focused on proteins with high specificity for brain tissue. Only proteins expressed predominantly in the brain and reliably identified in the human CSF proteome were selected for development of a multiplex mass spectrometry-based assay. Candidate biomarkers were then evaluated in AD patients across a spectrum of disease severity and in cognitively healthy individuals (154 subjects in total). This exploratory study suggests the APLP1 protein as a promising biomarker for early AD detection. Specifically, the protein was elevated in mild cognitive impairment compared to cognitively healthy individuals (p<0.001) and was able to discriminate the two groups with high accuracy (AUC: 0.789); this potential was also observed for the multivariate panel of two proteins, APLP1 and SPP1 (AUC: 0.841). Future studies need to validate the biomarkers in a larger, independent cohort including AD, MCI and preclinical AD patients.


Acknowledgments

First, I would like to thank my supervisor, Dr. Eleftherios P. Diamandis, for giving me the opportunity to pursue a PhD as an international student in his lab, and for his input and feedback during this journey. I would also like to thank my Master’s thesis supervisor, Dr. Ana-Maria Simundic, for supporting me and persuading me to do my PhD abroad.

I would like to acknowledge my PhD advisory committee, Dr. Isabelle Aubert, Dr. Lili-Naz Hazrati and Dr. George Yousef, for their valuable advice and comments during my study. In addition, I acknowledge all the members of my final oral examination committee, Dr. Gerold Schmitt-Ulms, Dr. Vinod Chandran and Dr. Cheryl Wellington, for their time and feedback on my PhD thesis. I would like to thank the Department of Laboratory Medicine and Pathobiology at the University of Toronto, especially Dr. Harry Elsholtz for his support and assistance.

Special thanks to Dr. Davor Brinc for his mentorship, collaboration and guidance during all these years. Many thanks as well to Dr. Eduardo Martinez-Morillo for his training, support and advice. I have learned so much from both of you; your knowledge and dedication always inspired me.

All the members of the ACDC lab were very helpful and supportive in so many ways. Thanks to Ihor Batruch and Dr. Andrei Drabovich for their mass spectrometry input, expertise and help with troubleshooting. Thanks to Antoninus Soosaipillai for all of his assistance and advice.

I would like to give thanks to my dear friends who were incredibly supportive during my studies: Andrea, Ana, Christina, Lampros, Tereza, Marta and Dorotea. Thanks to Maria for her support, both professionally and as my very close friend. Thanks to Igor for his support and understanding.

Finally, I would like to thank my family; without you nothing would be possible. My deepest thanks to my Mom Finka, Dad Drago, my beautiful sisters, Mimi and Ines, my brother-in-law Vlado and my little nephew Ivan. I cannot thank you enough for all your love and support.


Table of Contents

Acknowledgments
Table of Contents
List of Tables
List of Figures
List of Appendices
List of Abbreviations
Chapter 1
1 Introduction
  1.1 Alzheimer’s disease
    1.1.1 Alzheimer’s disease and dementias
    1.1.2 Epidemiology of dementia and Alzheimer’s disease
    1.1.3 Alzheimer’s disease pathogenesis and hypothesis
    1.1.4 Risk factors for Alzheimer’s disease
    1.1.5 Clinical presentation of dementia and Alzheimer’s disease
    1.1.6 Diagnosis of Alzheimer’s disease
    1.1.7 Treatment
  1.2 Mass spectrometry-based biomarker development
    1.2.1 Introduction to biomarkers and biomarker’s characteristics
    1.2.2 Phases of protein biomarker development
    1.2.3 Common samples used for biomarker discovery
    1.2.4 Mass spectrometry-based protein biomarker identification and quantification
    1.2.5 Targeted mass spectrometry protein quantification
    1.2.6 Tissue-specific databases
  1.3 Alzheimer’s disease CSF biomarkers
    1.3.1 Cerebrospinal fluid
    1.3.2 Types of Alzheimer’s disease biomarkers
    1.3.3 Clinical use of CSF AD biomarkers
    1.3.4 Pre-analytical factors and AD CSF biomarkers
    1.3.5 Analytical methods used for evaluation of CSF biomarkers
    1.3.6 Diagnostic performance of CSF biomarkers
    1.3.7 Other AD biomarkers
    1.3.8 Conclusion
  1.4 Rationale and aims of the study
    1.4.1 Rationale
    1.4.2 Hypothesis
    1.4.3 Objectives of the study
Chapter 2
2 Semiquantitative proteomic analysis of human hippocampal tissues from Alzheimer’s disease and age-matched control brains
  2.1 Introduction
  2.2 Methods
    2.2.1 Brain tissues samples
    2.2.2 Mass spectrometry sample preparation
    2.2.3 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)
    2.2.4 Data analysis
  2.3 Results and Discussion
Chapter 3
3 Identification of brain-related proteins in the cerebrospinal fluid proteome
  3.1 Introduction
  3.2 Methods
    3.2.1 Cerebrospinal fluid sample preparation
    3.2.2 Strong cation exchange chromatography
    3.2.3 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)
    3.2.4 Data analysis
  3.3 Results
    3.3.1 Cerebrospinal fluid proteome
    3.3.2 Identification of brain-related proteins in the CSF proteome
    3.3.3 Cell type-specific brain-related proteins in the CSF proteome
  3.4 Discussion
Chapter 4
4 Development of targeted mass spectrometry assay for relative quantification of brain-related proteins
  4.1 Introduction
  4.2 Methods
    4.2.1 Selection of brain-related proteins as biomarker candidates
    4.2.2 Selection of peptides for candidate proteins
    4.2.3 Identification of peptides and selection of transitions in CSF
    4.2.4 Mass spectrometry sample preparation
    4.2.5 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)
    4.2.6 Linearity
    4.2.7 Reproducibility
    4.2.8 Freeze-thaw assay and carry-over effect
    4.2.9 Data analysis
  4.3 Results
    4.3.1 Identification of proteotypic peptides in CSF for SRM assay development
    4.3.2 Linearity
    4.3.3 Reproducibility
    4.3.4 Freeze-thaw assay and carry-over effect
  4.4 Discussion
Chapter 5
5 Verification of brain-related proteins as potential diagnostic biomarkers of Alzheimer’s disease
  5.1 Introduction
  5.2 Methods
    5.2.1 Cerebrospinal fluid samples
    5.2.2 Multiplex selected reaction monitoring assay
    5.2.3 Mass spectrometry sample preparation
    5.2.4 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)
    5.2.5 Quality control
    5.2.6 Data analysis
    5.2.7 Statistical analysis
  5.3 Results
    5.3.1 Patients’ characteristics
    5.3.2 Candidates’ comparison
    5.3.3 Diagnostic performance
    5.3.4 Multivariate analysis (Cohort 1)
    5.3.5 Correlation of candidate proteins with MMSE and CDR tests
    5.3.6 Distribution of APOE phenotype among groups
    5.3.7 Distribution of candidate proteins among APOE phenotypes
  5.4 Discussion
Chapter 6
6 General discussion and future direction
References
Appendices


List of Tables

Table 1.1: Analytical characteristics of ELISA, xMAP and MSD assays
Table 3.1: Number of identified proteins and peptides in six individual CSF samples
Table 3.2: Overlap of proteins in individual samples
Table 3.3: Overlap of peptides in individual samples
Table 3.4: Representative cell type-specific brain-expressed proteins according to the Human Protein Atlas immunohistochemistry data
Table 4.1: Proteins and peptides of the developed method
Table 5.1: Patients’ characteristics (Cohort 1)
Table 5.2: Patients’ characteristics (Cohort 2)
Table 5.3: APOE phenotype distribution (Cohort 1)
Table 5.4: APOE phenotype distribution (Cohort 2)


List of Figures

Figure 1.2: Protein biomarker development phases
Figure 1.3: Selected reaction monitoring (SRM) methodology
Figure 1.4: Simplified diagram of the cerebrospinal fluid flow
Figure 2.1: Proteins identified in Alzheimer’s disease (AD) and control brain samples
Figure 2.2: Gene ontology (GO) analysis of AD hippocampal proteome
Figure 3.1: Candidate selection workflow
Figure 3.2: Venn diagram of proteins identified in 6 individual CSF samples
Figure 3.3: Venn diagram of peptides identified in 6 individual CSF samples
Figure 3.4: CSF brain tissue-enriched and group-enriched proteins and their relative abundance
Figure 3.5: Relative abundance of CSF proteome and 78 protein candidates
Figure 3.6: Gene ontology (GO) analysis of 78 protein candidates
Figure 4.1: Identification of endogenous peptides for SRM assay development: peptide example
Figure 5.1: Distribution of cognitive test scores (Cohort 1)
Figure 5.2: Distribution of cognitive test scores (Cohort 2)
Figure 5.3: Distribution of CSF biomarkers Aβ1-42, t-tau and p-tau
Figure 5.4: Distribution of candidate protein biomarkers in CSF samples (n=53)
Figure 5.5: Distribution of candidate protein biomarkers in CSF samples, Set 1
Figure 5.6: Distribution of candidate protein biomarkers in CSF samples, Set 2
Figure 5.7: Receiver-operating characteristic (ROC) curve for best performing candidates (Cohort 1)
Figure 5.8: Receiver-operating characteristic (ROC) curve for best performing candidate (Cohort 2)
Figure 5.9: Correlation between candidate levels in CSF and cognitive tests (Cohort 1)


List of Appendices

Appendix 2.1: Top 10 upregulated proteins in Alzheimer's disease (AD) tissues in comparison to Control tissues
Appendix 2.2: Top 10 upregulated proteins in 'Control' tissues in comparison to Alzheimer's disease (AD) tissues
Appendix 2.3: List of 40 CSF proteins that were found exclusively in Alzheimer's disease hippocampal tissues
Appendix 3.1: Brain-enriched and group-enriched proteins identified in brain proteome
Appendix 3.2: 78 tissue-enriched and brain-enriched proteins
Appendix 3.3: Supplementary method
Appendix 3.4: KLK6 concentration in brain tissue extracts and CSF pool
Appendix 4.1: Endogenous peptides used for prediction of retention time (RT)
Appendix 4.2: Pierce peptides used for prediction of retention time (RT)
Appendix 4.3: Retention time correlations
Appendix 4.4: Endogenous peptides identified in CSF
Appendix 4.5: Peptides and transitions of the developed SRM assay
Appendix 4.6: Analytical characteristics of SRM assays for 30 proteins
Appendix 4.7: Reproducibility assay
Appendix 4.8: Carry-over effect
Appendix 5.1: Multiplex SRM assays for clinical samples analysis
Appendix 5.2: Statistical analysis (Cohort 1)
Appendix 5.3: Total assay reproducibility (Cohort 1)
Appendix 5.4: Statistical analysis (Cohort 2)
Appendix 5.5: Statistical analysis of proteins’ abundance among APOE phenotypes
Appendix 5.6: Statistical analysis of proteins’ abundance among APOE ε4 phenotypes


List of Abbreviations

AA, Alzheimer’s Association

Aβ, amyloid-β

AD, Alzheimer’s disease

ADRDA, Alzheimer’s Disease and Related Disorders Association

APLP1, amyloid-like protein 1

APOE, apolipoprotein E

APP, amyloid precursor protein

AUC, area under the curve

BACE, β-site APP cleaving enzyme / β-secretase

BBB, blood-brain barrier

CAA, cerebral amyloid angiopathy

CDR, Clinical Dementia Rating

CNS, central nervous system

CNTN2, contactin-2

CSF, cerebrospinal fluid

CV, coefficient of variation

DLB, dementia with Lewy bodies

ECL, electrochemiluminescence

ELISA, enzyme-linked immunosorbent assay

EOAD, early-onset Alzheimer’s disease

ESI, electrospray ionization

FDA, Food and Drug Administration

FDG, fluorodeoxyglucose

FDR, false discovery rate


FFPE, formalin-fixed paraffin-embedded

FTD, frontotemporal dementia

GO, gene ontology

HPA, human protein atlas

IPI, international protein index

iTRAQ, isobaric tags for relative and absolute quantification

LC, liquid chromatography

LOAD, late-onset Alzheimer’s disease

LOB, level of blank

LOD, level of detection

LTP, long-term potentiation

LTQ, linear ion trap

m/z, mass to charge

MALDI, matrix-assisted-laser-desorption-ionization

MCI, mild cognitive impairment

MMSE, Mini-Mental State Examination

MRI, magnetic resonance imaging

MS, mass spectrometry

MS1, full mass spectrum

MS/MS, tandem mass spectrometry

NFT, neurofibrillary tangles

NIA, National Institute on Aging

NINCDS, National Institute of Neurological and Communicative Disorders and Stroke

NPTXR, neuronal pentraxin receptor

PD, Parkinson’s disease

PET, positron emission tomography


PiB, Pittsburgh compound B

PSEN1, presenilin 1

PSEN2, presenilin 2

P-tau, phosphorylated tau

PTM, post-translational modification

QC, quality control

R2, coefficient of determination

ROC, receiver-operating characteristic

RT, retention time

SCX, strong cation exchange chromatography

SILAC, stable isotope labeling with amino acids in cell culture

SPP1, osteopontin

SRM, selected reaction monitoring

SWATH, sequential window acquisition of all theoretical ion spectra

TFA, trifluoroacetic acid

TOF, time-of-flight

T-tau, total tau

VaD, vascular dementia



Chapter 1

1 Introduction

1.1 Alzheimer’s disease

1.1.1 Alzheimer’s disease and dementias

Alzheimer’s disease (AD) is a slowly progressing neurodegenerative disorder that presents with a range of cognitive impairments, typically memory deficits, poor judgement, impaired perception, changes in behaviour, and deteriorating language skills.

AD was first described in 1906 by the German physician Alois Alzheimer. He presented the clinical and neuropathological characteristics of a single case, the female patient Auguste D, describing memory loss and hallucinations as well as brain deposits, plaques and tangles, known today as the main pathological features of the disease. Emil Kraepelin named the disorder Alzheimer’s disease in 1910 [1].

The pathology of AD is characterized by brain deposits of amyloid-β (amyloid plaques) and tau (neurofibrillary tangles, NFT). Diagnosis is currently based on the clinical presentation (core clinical criteria). CSF biomarkers (tau and amyloid-β proteins) and imaging biomarkers (e.g. amyloid PET, structural MRI) can increase the certainty of an AD diagnosis, but are at present recommended for use in research and clinical trials, and as an optional tool when available and deemed appropriate by the clinician, rather than for routine clinical diagnosis [2].

AD is an evolving field of research, with the goal of understanding the cause(s) of AD and designing curative or at least disease-modifying therapies. Recently, a new concept of preclinical AD has been proposed, defined as the asymptomatic stage of AD, raising questions about risk factors and markers of progression to clinical AD. In addition, the stage of mild cognitive impairment has been recognized, representing the symptomatic, pre-dementia stage of AD. The use of biomarkers for preclinical and clinical AD diagnosis, and for prediction of progression from preclinical to clinical AD and of further AD severity, is emphasized. As multiple drug treatments are developed and reach the clinical trial stage, biomarkers are also needed for patient selection and as indicators of response to treatment and target engagement [3, 4].

Clinical AD is part of the dementia spectrum of diseases. Dementia refers to a syndrome caused by several neurological disorders that present with altered cognitive and behavioural functions that interfere with everyday activities. The most common dementias include AD dementia, vascular dementia (VaD), dementia with Lewy bodies (DLB), and frontotemporal dementia (FTD). Among the dementia subtypes, AD is considered the most common, accounting for 50-70% of all dementia cases [5]. Vascular dementia accounts for about 10 to 15% of all cases and is typically caused by subcortical and cortical infarcts and vascular bleeding [6]. Lewy body dementia is a term that encompasses DLB and Parkinson’s disease dementia. Typically, in DLB dementia presents before or together with parkinsonism (Parkinson-like syndromes, such as movement disorder), while patients with Parkinson’s disease dementia first develop Parkinson’s disease (PD), followed by dementia onset as the disease progresses (the dementia rate among PD patients is approximately 10% per year) [7, 8]. DLB accounts for about 10-15% of dementia cases. Lewy body dementia is characterized by brain deposits of α-synuclein known as Lewy bodies (in the cell body) and Lewy neurites (in neuronal processes) [7]. Frontotemporal dementia is a broad term that describes progressive clinical symptoms such as impaired personality, behaviour, executive functions (such as attention, judgement, organization) and language skills. This dementia group is a common form among the younger dementia population, with a prevalence between 3 and 26% of early-onset dementias in patients before the age of 65. Most FTD cases (approximately 60%) are thus in the range of 45 to 64 years, while about 30% are diagnosed after the age of 64. Typical pathological hallmarks of FTD are brain deposits of tau and TDP-43 protein and degeneration of the frontal lobes, anterior temporal lobes, anterior cingulate cortex and insular cortex [9]. AD can be present together with other forms of dementia [7, 10].

AD has been classified into a sporadically occurring form, late-onset AD (LOAD), and a familial form, early-onset AD (EOAD). Sporadic LOAD typically presents after age 65 and is the most prevalent form of AD. Both environmental and genetic risk factors have been associated with this form of AD [3].

EOAD typically affects individuals before 65 years of age and is caused by mutations in several genes, such as amyloid precursor protein (APP), presenilin 1 (PSEN1) and presenilin 2 (PSEN2). Multiple mutations in these genes have been identified and typically show an autosomal dominant transmission pattern and full penetrance; many EOAD cases remain unexplained, though [11]. This form of AD is rare and represents a small minority of all AD patients (<1%) [12].

1.1.2 Epidemiology of dementia and Alzheimer’s disease

The 2005 Delphi epidemiology study estimated that 24.3 million individuals worldwide had dementia, with an incidence of 4.6 million new cases per year [13]. According to the 2015 report of Alzheimer’s Disease International, there were 46.8 million affected individuals around the globe, with 9.9 million new cases [14]. The 2017 report of the Alzheimer’s Association indicates that 5.5 million Americans had AD [15].

The number of people with dementia is expected to approximately double every 20 years; thus, by 2050, 131.5 million people will be living with this condition. The region with the highest number of people with dementia is Asia (22.9 million), followed by Europe (10.5 million), the Americas (9.4 million) and Africa (4.0 million), while China is the country with the most dementia cases (9.5 million). About 58% of people with dementia live in low- and middle-income countries, a share predicted to reach 68% by 2050. The prevalence and incidence of dementia increase exponentially with age; the largest share of new cases falls in the age range 75-79 years in Asia, 80-84 in Europe and the Americas, and 65-69 in Africa [14]. After the age of 65, the overall estimated incidence of AD is 1-3% and the prevalence 10-30% [12]. Sex-related differences in the incidence and prevalence of dementia and AD have been reported for most world regions, with higher values in women than in men (particularly among the oldest old) [5].

Together with the rising number of dementia cases, the overall cost of dementia is increasing and presents a major health and social burden. The estimated cost in 2015 was US$ 818 billion, compared to US$ 604 billion in 2010, and it is expected to increase to US$ 2 trillion by 2030. Dementia influences not only the quality of life of affected individuals, but also that of their families, friends and caregivers. It is among the leading diseases (top 10) that cause a heavy burden for the elderly [14]. People with dementia generally have a shorter lifespan; for AD, the estimated median survival is 7 years. However, this is difficult to estimate directly due to common comorbidities, which may contribute or lead to mortality [10].
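The cost figures quoted above can be put in proportion with a short calculation. The sketch below is illustrative only: it assumes a constant compound annual growth rate between 2010 and 2015 and carries it forward to 2030, which is not necessarily how the cited report derived its projection. The function names are hypothetical.

```python
# Illustrative arithmetic on the cost figures quoted in the text
# (US$ 604 billion in 2010, US$ 818 billion in 2015).
# Assumption: constant compound growth; the cited report's own
# projection method is not described in the text.

COST_2010 = 604.0  # US$ billion
COST_2015 = 818.0  # US$ billion


def relative_increase(old, new):
    """Fractional increase from old to new (0.10 means +10%)."""
    return (new - old) / old


def annual_growth_rate(old, new, years):
    """Compound annual growth rate implied by old -> new over `years`."""
    return (new / old) ** (1.0 / years) - 1.0


def project(value, rate, years):
    """Value after `years` of compound growth at `rate` per year."""
    return value * (1.0 + rate) ** years


if __name__ == "__main__":
    rate = annual_growth_rate(COST_2010, COST_2015, 5)
    print(f"Increase 2010-2015: {relative_increase(COST_2010, COST_2015):.1%}")
    print(f"Implied annual growth: {rate:.2%}")
    print(f"Extrapolated 2030 cost: US$ {project(COST_2015, rate, 15):.0f} billion")
```

Under this constant-growth assumption, the increase of roughly 35% over five years corresponds to about 6% per year, and the extrapolation lands close to the US$ 2 trillion quoted for 2030.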


1.1.3 Alzheimer’s disease pathogenesis and hypothesis

1.1.3.1 Alzheimer’s disease pathogenesis

The well-recognized pathological hallmarks of AD are brain deposits of extracellular amyloid-β (Aβ) plaques and intracellular NFT [16, 17]. Both deposits promote synaptic loss, aberrant neuronal network activity and neuronal cell death, leading to brain atrophy and impaired memory and cognition [18]. Neuronal loss usually occurs in specific brain regions (such as the CA1 region of the hippocampus), while the general loss of brain volume is suggested to be due to the loss of neuronal processes [16].

Amyloid β is produced under normal conditions from its precursor, amyloid precursor protein (APP), whose neuronal functions are still largely unknown [12]. APP is a type 1 transmembrane glycoprotein, and Aβ fragments are generated by APP cleavage, in the Aβ domain region, by the membrane-associated α-secretase, β-secretase and the γ-secretase enzyme complex. The Aβ domain consists of 28 residues outside of the membrane and another 12-14 residues in the transmembrane region [1]. The enzyme α-secretase cleaves APP outside of its transmembrane part at several possible sites, at amino acid positions 13-16 or 17-20, while β-secretase (or β-site APP cleaving enzyme, BACE) cleaves at position 1 of the Aβ sequence, in the extracellular region of APP. The protease γ-secretase is composed of several units: presenilin, nicastrin, presenilin enhancer 2 (PEN2) and anterior pharynx defective 1 (APH1), which together form a proteolytic complex involved in the processing of many substrates, including APP, whose cleavage occurs in the transmembrane part of APP, at amino acid positions 33-42 of the Aβ domain [19]. Enzymatic processing of APP can follow a non-pathological or a pathological pathway. In the non-pathological (non-amyloidogenic) cleavage pathway, APP is cleaved by α-secretase, generating an extracellular soluble APP fragment (sAPPα) and an intracellular carboxy-terminal fragment (CTFα). Alternatively, APP can be processed first by β-secretase and then by α-secretase, producing a soluble APP fragment (sAPPβ, released after β-secretase cleavage), short Aβ fragments (13-16 amino acids long) and the CTFα fragment. The pathological, amyloidogenic pathway involves cleavage of APP by β-secretase and γ-secretase and the generation of sAPPβ, the Aβ1-42 peptide and other Aβ carboxy-terminal fragments (Aβ1-33 to 1-40, Aβ1-17 to 1-20), and the APP intracellular domain (AICD) produced by γ-secretase. The amyloidogenic peptide fragment Aβ1-42 is prone to aggregation, leading to the formation of toxic soluble oligomers (dimers, trimers and larger oligomers), and further to the formation of insoluble, mature Aβ fibrils and finally Aβ plaques, which also include other Aβ fragments (e.g. Aβ1-40, which is less pathogenic and more abundant) [19]. Microscopically, senile (amyloid) plaques are complex structures and can exist in the brain as diffuse amyloid plaques and as amyloid deposits (the core of the plaque) surrounded by dystrophic neurites (typically tau immuno-positive), called neuritic plaques, which are associated with neuronal injury and greater glial activation [20].
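The three APP cleavage routes described above can be summarized as a small lookup table. The following is a minimal sketch for orientation only: the dictionary layout, the key names and the function `cleave_app` are illustrative assumptions, and the product lists simply restate the simplified description given in the text rather than a complete biochemical model.

```python
# Lookup of APP cleavage routes as summarized in the text above.
# Keys are the secretases in order of action; all names are illustrative.
APP_CLEAVAGE_ROUTES = {
    ("alpha",): {
        "pathway": "non-amyloidogenic",
        "products": ["sAPPα", "CTFα"],
    },
    # Described in the text as an alternative route; it is read here
    # as non-amyloidogenic because no full-length Aβ1-42 is produced.
    ("beta", "alpha"): {
        "pathway": "non-amyloidogenic (alternative)",
        "products": ["sAPPβ", "short Aβ fragments (13-16 aa)", "CTFα"],
    },
    ("beta", "gamma"): {
        "pathway": "amyloidogenic",
        "products": ["sAPPβ", "Aβ peptides (including Aβ1-42)", "AICD"],
    },
}


def cleave_app(secretases):
    """Return pathway name and products for an ordered list of secretases.

    Raises ValueError for combinations not described in the text.
    """
    key = tuple(secretases)
    if key not in APP_CLEAVAGE_ROUTES:
        raise ValueError(f"cleavage route not described in the text: {key}")
    return APP_CLEAVAGE_ROUTES[key]


if __name__ == "__main__":
    for route in APP_CLEAVAGE_ROUTES:
        result = cleave_app(route)
        print(" then ".join(route), "->", result["pathway"], result["products"])
```

Keying the table on the ordered tuple of enzymes reflects that the text distinguishes routes by which secretase acts first.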

In addition, Aβ pathology is not immediately distributed throughout all brain regions; rather, it follows a specific pattern of progression. According to Thal and colleagues, there are five phases of Aβ plaque deposition or progression in the AD brain [21]. In the first phase, Aβ deposits are found only in the neocortex; in the second phase, the allocortex (including the hippocampus) is also affected; in the third and fourth phases, amyloid deposits are found in subcortical regions such as the diencephalic nuclei, striatum, cholinergic nuclei of the basal forebrain and brainstem nuclei; and in the fifth phase the cerebellum is affected. This characterization of Aβ plaque pathology (the so-called Thal phases) was included, together with neuritic plaque neuropathology, in the guidelines for the post-mortem neuropathological examination of AD brains and confirmation of definite AD [20].


The main component of intracellular NFT is the microtubule-associated protein tau, which is hyperphosphorylated in these lesions. Tau protein is encoded by the microtubule-associated protein tau gene (MAPT) at 17q21, which encompasses 16 exons. Alternative splicing of exon 2, exon 3 and exon 10 leads to the translation of six tau isoforms in the adult human brain. Tau isoforms differ in the number of near-amino-terminal inserts (0, 1 or 2 inserts; 0N, 1N and 2N, respectively) and in the presence of the R2 repeat domain, giving three or four carboxy-terminal repeat domains (3R or 4R, respectively). The resulting tau splice forms are 2N4R (441 amino acids), 1N4R (412 amino acids), 0N4R (383 amino acids), 2N3R (410 amino acids), 1N3R (381 amino acids) and 0N3R (352 amino acids). Tau protein is mainly expressed by neurons (and to a lesser extent by glial cells), and its expression shows regional differences (e.g. higher abundance is found in the neocortex compared to the cerebellum and white matter) [22]. Under physiological conditions, tau protein is bound to microtubules, allowing their stabilization and axonal outgrowth. During pathogenesis, tau is hyperphosphorylated by several kinases, such as the serine/threonine kinases cyclin-dependent kinase 5 (CDK5), glycogen synthase kinase 3 (GSK3) and microtubule-affinity-regulating kinases (MARK). This aberrant post-translational event leads to detachment of tau from the microtubules, resulting in axonal destabilization, impaired trafficking and synaptic dysfunction. Hyperphosphorylated tau has amyloidogenic properties; it is therefore prone to self-aggregation and formation of paired helical filaments (PHFs), which ultimately aggregate into NFT and neuropil threads (described as axonal and dendritic processes of the NFT-containing neurons) [19].
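The six tau isoforms listed above follow from combining the number of amino-terminal inserts (0, 1 or 2) with the number of repeat domains (3 or 4). The short sketch below enumerates these combinations and reproduces the quoted lengths. The per-element increments (29 residues per N insert, 31 residues for the R2 repeat) are not stated in the text; they are inferred here from the differences between the listed isoform lengths, and the function name is hypothetical.

```python
from itertools import product

# Length of the shortest isoform listed in the text (0N3R).
BASE_LENGTH = 352
# Inferred from the listed lengths, not stated in the text:
N_INSERT_LENGTH = 29   # 381 - 352 (1N3R vs 0N3R)
R2_REPEAT_LENGTH = 31  # 383 - 352 (0N4R vs 0N3R)


def tau_isoforms():
    """Enumerate tau isoforms as {name: length in amino acids}."""
    isoforms = {}
    for n_inserts, repeats in product((0, 1, 2), (3, 4)):
        name = f"{n_inserts}N{repeats}R"
        length = (
            BASE_LENGTH
            + n_inserts * N_INSERT_LENGTH
            + (repeats - 3) * R2_REPEAT_LENGTH
        )
        isoforms[name] = length
    return isoforms


if __name__ == "__main__":
    # Print isoforms from shortest to longest.
    for name, length in sorted(tau_isoforms().items(), key=lambda kv: kv[1]):
        print(f"{name}: {length} amino acids")
```

Three insert states times two repeat states give the six isoforms, and the additive length model matches all six values quoted in the text.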

According to Braak and Braak, there are six stages of NFT progression in the human AD brain, known as Braak stages [23]. These stages are incorporated into the neuropathological guidelines for characterizing NFT in AD brains post mortem: in Braak stages I and II, NFT are present mostly in the entorhinal cortex and closely related areas; in Braak stages III and IV, NFT are more abundant in the hippocampus and amygdala; and in Braak stages V and VI, NFT are spread throughout the neocortex [20].

1.1.3.2 Amyloid hypothesis

The “amyloid hypothesis” suggests that an altered balance between Aβ production and Aβ clearance is the primary and initial pathological event in AD, while tau pathology is downstream of the Aβ pathology. Disruption of the Aβ production/clearance balance results in Aβ accumulation, neuronal degeneration and dementia symptoms. In EOAD Aβ production is impaired, and in LOAD Aβ clearance is impaired. This hypothesis was initially proposed in 1991 and has been updated over the years [24, 25]. There is accumulating evidence in favor of this hypothesis.

Identification of mutations in genes causing EOAD provided evidence that changes in Aβ production, processing and aggregation are the prime mechanism of disease development. Missense mutations in APP, PSEN1 and PSEN2, as well as duplication of APP, cause Aβ aggregation and plaque formation in early-onset familial cases. Individuals with familial mutations in APP and PSENs initially develop Aβ pathology, which is then followed by formation of NFT of wild-type tau protein. Also, a certain mutation in APP (A673T, at the second amino acid of the Aβ domain, next to the cleavage site of β-secretase) acts as a “protective” mutation against developing AD in elderly individuals without disease (carriers had a lower risk of developing AD and cognitive decline); this mutation seems to result in decreased β-secretase proteolytic processing of APP and reduced formation of pathogenic peptides (demonstrated in vitro) [26]. In contrast, mutations in the tau gene do not cause Aβ pathology and are associated with the development of FTD [27].


Evidence from AD biomarker research suggests that Aβ precedes tau neuropathological changes, as Aβ biomarkers (positive Aβ imaging and lower CSF Aβ1-42) are detectable first, even before AD symptoms appear, compared to tau biomarkers (increased total tau (t-tau) and phosphorylated tau (p-tau) CSF levels) [28].

The amyloid hypothesis initially postulated that amyloid deposited in plaques is neurotoxic; however, amyloid oligomers seem to be a more toxic Aβ species than Aβ fibrils and plaques, causing severe impairment of synaptic function [29]. Intracerebral injection of human Aβ oligomers inhibited hippocampal synaptic long-term potentiation (LTP) in rats; this effect was absent when the injections were pretreated with antibodies against Aβ [30]. Moreover, soluble Aβ oligomers isolated from human AD brain decreased hippocampal dendritic spine density and impaired the memory of newly learned behaviour in rats, and inhibited LTP and promoted long-term depression in mice [31].

Some of the challenges to the amyloid hypothesis come from the neuropathological study by Braak, which suggested that tau pathology actually precedes Aβ deposits in the human brain [32]. In addition, tau has been found to be necessary for Aβ-induced neurodegeneration: treatment of hippocampal neurons (expressing mouse or human tau) with Aβ fibrils induced neuronal degeneration, while tau knockout neurons did not show degeneration upon exposure to fibrillar Aβ [33]. However, Aβ fibrils were able to enhance activation of MARK (an enzyme that phosphorylates tau). In addition, synaptic loss, one of the hallmarks of AD, showed a better correlation with cognitive decline in AD patients than Aβ or tau deposits [34]. Part of the controversy arises from the failures of clinical trials focused on Aβ pathology as a potential drug target; this is discussed in more detail in section 1.1.7.


1.1.3.3 Transmission hypothesis

The recently proposed “transmission hypothesis” highlights the prion-like properties of the pathological forms of Aβ and tau. According to this hypothesis, pathogenic misfolded proteins are prone to self-aggregation and formation of seeds (the so-called nucleation phase), which can further initiate aggregation of the same proteins (previously natively folded) into pathogenic protein oligomers and then fibrils (growth phase). Eventually the growing fibrils can fragment, and these seed forms can then propagate from neuron to neuron and act as nodes for subsequent new fibril formation, leading to disease progression [35]. The development of transgenic animal models of AD allowed this hypothesis to be demonstrated in vivo. More specifically, intracerebral injection of Aβ-containing brain extracts (from human AD brain or aged APP transgenic mice) into a transgenic mouse model with hAPP provoked early Aβ deposition, in addition to vascular Aβ deposition known as cerebral amyloid angiopathy (CAA). However, this phenomenon was not observed in wild-type mice [36]. Furthermore, when Aβ brain extracts were injected locally into a specific brain region, the pathology was evident in closely, axonally connected regions [35]. In a similar fashion, when brain extracts containing tau filaments (from a transgenic mouse model expressing mutant human tau) were injected into a mouse model expressing wild-type human tau, pathological tau deposition was induced and spread over time to neighboring regions [37]. Moreover, the pattern of Aβ and tau pathological progression among regions of the human AD brain (Aβ first affects the neocortex, then the allocortex and subcortical regions; tau first affects the entorhinal cortex, hippocampus and amygdala, and then the neocortex) is suggestive of self-propagation of AD pathologies [38].

1.1.4 Risk factors for Alzheimer’s disease

1.1.4.1 Early onset AD

Early onset AD is defined as AD diagnosed before age 65. Around 10-15% of familial EOAD patients have autosomal dominant mutations in APP, PSEN1 and PSEN2 genes. These mutations account for less than 1% of AD cases [11, 12].

Investigation of the genes responsible for the familial form of AD started in 1987 with the cloning of APP and its localization to chromosome 21 [39]. The APP gene is located at chromosome 21q21; its expression is regulated by alternative splicing, resulting in translation of three isoforms: APP695, APP751 and APP770. APP is expressed by many tissues in the body; in the brain, APP695 is the main neuronal isoform, and APP751 the main isoform found in astrocytes. A total of 51 pathogenic mutations in the APP gene have been reported, along with 15 non-pathogenic mutations and 1 of unclear pathogenic nature (source: http://www.molgen.vib-ua.be/ADMutations, last accessed April 2017, search by gene), and 34 APP mutations were related to the AD phenotype only (source: http://www.molgen.vib-ua.be/ADMutations, last accessed April 2017, search by phenotype) [40]. Around 14% of autosomal dominant EOAD cases are due to mutations in the APP gene. Because APP is the precursor of Aβ, mutations in APP disrupt Aβ processing and promote aggregation. Certain APP mutations (around the C-terminal Aβ domain) result in aberrant APP processing by γ-secretase, changing the relative ratio of Aβ1-42 and Aβ1-40 towards higher production of the more amyloidogenic peptide Aβ1-42 without actually changing the total amount of Aβ. Other mutations of APP in the Aβ domain, such as Arctic (E693G), lead to enhanced Aβ aggregation [41]. The Swedish mutation KM670/671NL (a double mutation) causes an increase in Aβ production [42, 43].

While most mutations are inherited in an autosomal dominant pattern, recessive mutations (A673V and 693Δ) were recently reported to cause EOAD in homozygous patients [44].

Copy number mutations have also been found in APP. Duplications in the APP gene have variable frequency among EOAD cases and are found in different populations; e.g. there were no reports of APP duplication in Swedish and Finnish families, while duplications were found in around 2% of Dutch, 8% of French and 18% of Japanese families with EOAD. The duplication of APP seems to result in Aβ brain deposition but also vascular Aβ deposition or CAA [45]. Moreover, individuals with chromosome 21 trisomy, or Down syndrome, develop AD neuropathology during their life (by the age of 35); however, if there is only a partial chromosome 21 trisomy, without involvement of the APP gene, the AD pathology will not evolve [12, 44].

PSEN1 and PSEN2 are important components of the γ-secretase complex (they form the catalytic site), involved in the processing of APP [24]. They are highly homologous integral membrane proteins with nine transmembrane domains and a hydrophilic intracellular loop domain, expressed by many tissues in the body. In the brain they are more abundant in the hippocampus and cerebellum [46]. PSEN1 is located at chromosome 14q24.3, and PSEN2 at 1q31-q42. Mutations related to PSEN1 and PSEN2 are mostly missense mutations [46]. These mutations have an impact on γ-secretase processing by inhibiting/decreasing cleavage of the intracellular APP domain, generating longer, premature Aβ peptides and changing the preferential enzyme cleavage site (with predominant cleavage at the positions 49-50 or 51-51), leading to altered ratios of Aβ1-42 and Aβ1-40 fragments [24, 44].

A total of 219 pathogenic PSEN1 mutations have been reported, along with 4 non-pathogenic mutations and 7 with unclear pathogenic properties (source: http://www.molgen.vib-ua.be/ADMutations, last accessed April 2017, search by gene); 154 were described as pathogenic and related to the AD phenotype only (source: http://www.molgen.vib-ua.be/ADMutations, last accessed April 2017, search by phenotype) [40]. Autosomal dominant mutations in PSEN1 cause around 80% of familial EOAD cases. Interestingly, some PSEN1 and APP dominant mutations are found in LOAD cases, even though they represent familial early onset mutations [44].

Fewer mutations in PSEN2 have been reported to date; overall, 16 pathogenic mutations were described, along with 7 non-pathogenic mutations and 16 with unclear pathogenic properties (source: http://www.molgen.vib-ua.be/ADMutations, last accessed April 2017, search by gene), while 11 mutations were reported to be pathogenic for the AD phenotype only (source: http://www.molgen.vib-ua.be/ADMutations, last accessed April 2017, search by phenotype) [40]. Mutations of PSEN2 are responsible for around 5% of familial EOAD [44].

1.1.4.2 Late onset AD

Late onset AD is defined as AD diagnosed after age 65 and is also referred to as the sporadic form of AD. A combination of genetic and environmental susceptibility factors plays a role in the development of LOAD, such as older age, family history, lifestyle and certain medical conditions.

Advanced age is the most important risk factor for AD. It is estimated that the incidence of AD doubles every five years after the age of 65 [47]. However, advanced age is not sufficient for developing AD, nor can AD be considered a normal part of aging.
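
The doubling estimate above can be read as a simple exponential relationship. The following is a minimal sketch, assuming an idealized doubling every five years from age 65 onward; the function name and parameters are illustrative and not taken from the cited source, and real incidence data will deviate from this rule.

```python
def relative_incidence(age: float,
                       reference_age: float = 65.0,
                       doubling_years: float = 5.0) -> float:
    """Incidence at `age` relative to incidence at `reference_age`.

    Assumes a pure doubling every `doubling_years` (an idealization of the
    estimate quoted in the text, not a fitted epidemiological model).
    """
    if age < reference_age:
        raise ValueError("the doubling rule is only stated for ages at or above the reference age")
    return 2.0 ** ((age - reference_age) / doubling_years)


if __name__ == "__main__":
    for age in (65, 70, 75, 80, 85):
        print(age, relative_incidence(age))
```

Under this assumption, the incidence at age 85 would be roughly 16 times the incidence at age 65.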

Risk factors related to lifestyle have been suggested, such as saturated fat intake, lack of physical activity, smoking, heavy alcohol consumption and low education level. Obesity, high serum cholesterol level, hypertension, atherosclerosis and cerebral vascular lesions are also important risk factors. In addition, other factors such as depression and traumatic brain injury can increase the risk of AD development [5].

Family history is another risk factor for AD. The first-degree relatives of AD patients have a higher chance of developing disease, especially the ones with more than two family members with diagnosed AD [15, 47].

There are several susceptibility genes for the onset of sporadic AD, among which the apolipoprotein E (APOE) ε4 allele is the strongest reported to date. The presence of this allele can increase the risk of AD about 3-fold in heterozygotes and 15-fold in homozygotes, compared to carriers of the ε3/ε3 genotype [48]. The association between ε4 allele frequency and AD was first discovered in the early 1990s [49, 50]. The APOE gene is located at chromosome 19q13.2; this gene has three polymorphic alleles: ε2, ε3 and ε4, with frequencies in the general population of 8, 78 and 14%, respectively [48]. The three APOE isoforms (APOE2, APOE3, APOE4) differ in one or two amino acids at positions 112 and 158, where either arginine or cysteine is present: APOE2 (Cys112, Cys158), APOE3 (Cys112, Arg158), APOE4 (Arg112, Arg158) [51]. Consequently, there are six possible APOE phenotypes: three homozygous (E2/2, E3/3, and E4/4) and three heterozygous (E2/3, E2/4, and E3/4). APOE is expressed by several tissues in the body, with the most abundant expression in the liver and the brain, where it is produced mainly by astrocytes [52]. Similar to its functions at the periphery, in the brain APOE is involved in cholesterol metabolism, transporting cholesterol to neurons by receptor-mediated endocytosis. The single amino acid variations among the three isoforms result in different affinities towards lipids, receptors and Aβ [48].
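
The isoform definitions and the count of six phenotypes described above can be checked with a short enumeration. This is an illustrative sketch only; the dictionary and function names are hypothetical, and the residue assignments are copied from the text.

```python
from itertools import combinations_with_replacement

# Residues at positions 112 and 158 for each isoform, as listed in the text.
ISOFORM_RESIDUES = {
    "E2": {112: "Cys", 158: "Cys"},
    "E3": {112: "Cys", 158: "Arg"},
    "E4": {112: "Arg", 158: "Arg"},
}


def apoe_phenotypes():
    """Enumerate unordered isoform pairs, labelled in the E2/3 style used in the text."""
    isoforms = sorted(ISOFORM_RESIDUES)
    return [f"{a}/{b[1:]}" for a, b in combinations_with_replacement(isoforms, 2)]


def differing_positions(isoform_a, isoform_b):
    """Return the positions (112 and/or 158) at which two isoforms differ."""
    return [pos for pos in (112, 158)
            if ISOFORM_RESIDUES[isoform_a][pos] != ISOFORM_RESIDUES[isoform_b][pos]]


phenotypes = apoe_phenotypes()

if __name__ == "__main__":
    print(phenotypes)
    print(differing_positions("E2", "E4"))
```

Three alleles give three homozygous and three heterozygous combinations, and any two isoforms differ at one or both of the two positions.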

One copy of the ε4 allele increases the risk of LOAD approximately 3 times, while two copies increase the risk approximately 15 times compared to non-carriers (among Caucasians). In contrast, the ε2 allele decreases the risk of AD [53]. The association of APOE ε4 and AD was stronger for Japanese subjects and weaker in African American and Hispanic subjects [53]. The frequency of the ε4 allele increases from ~14% in the general population to ~40% in AD patients and is associated with earlier age of AD onset, with mean disease onset at 68 years in ε4 homozygotes and 76 years in heterozygotes, compared to 84 years in non-carriers [48]. In addition, the ε4 allele has been associated with accelerated Aβ pathology compared to non-carriers [54]. Indeed, amyloid plaques are more abundant in ε4 carriers [54], who show lower CSF concentrations of Aβ1-42 among AD patients [55]. In addition to enhanced Aβ plaques in the brain, ε4 carriers exhibit vascular Aβ deposition (CAA) [54]. Similarly, APOE ε4 carriers have enhanced Aβ pathology among MCI patients and cognitively normal individuals. MCI patients who are ε4 carriers have lower levels of CSF Aβ1-42, higher t-tau and more brain atrophy than non-carriers. Finally, MCI patients who are ε4 carriers are at higher risk of progression to AD dementia [48]. Cognitively normal ε4 carriers also showed lower CSF Aβ1-42 levels and more amyloid deposits, measured with Pittsburgh compound B (PiB) PET imaging, than non-carriers [56, 57]. Aβ depositions measured with PiB-PET can be detected in cognitively normal individuals at the age of 56 in APOE ε4 carriers, but at the age of 76 in non-carriers [48].

Data from animal models (transgenic mice expressing human APP (hAPP) and human APOE isoforms) revealed higher amyloid pathology in the hippocampus of mice expressing APOE4 than in those expressing APOE3 and APOE2. Slower Aβ clearance from the hippocampal interstitial fluid was observed in APOE4 mice compared to APOE3 and APOE2 isoform-expressing mice [57].

It has been suggested that APOE affects Aβ clearance and accumulation in an isoform-dependent manner [58]. For instance, APOE can bind to Aβ peptide and may act as a chaperone to promote formation of Aβ filaments; the isoforms differ in their ability to promote filament formation (APOE4>APOE3>APOE2) [59]. There are different pathways through which APOE may promote Aβ clearance in the brain; lipidated APOE particles mediate Aβ cellular uptake by receptor-mediated endocytosis, a mechanism which is suggested to depend on lipidation and isoform status. In addition, APOE may influence Aβ clearance through the blood-brain barrier (BBB), transporting soluble Aβ to blood, the efficiency of which depends on the APOE isoform (transport efficiency: E2>E3>E4). Lastly, it has been proposed that APOE can facilitate intracellular clearance of soluble Aβ in macroglia [52].

Additionally, APOE4 may contribute to AD pathology by Aβ-independent mechanisms, by influencing synaptic plasticity, cholesterol homeostasis, neurovascular function and neuroinflammation. For example, APOE4 transgenic mouse model showed lower synaptic density compared to APOE3 transgenic mice; in addition, impaired synaptic transmission was observed in very young APOE4 transgenic mice [48].

Over 20 risk loci and genes for AD other than APOE have been identified through genome-wide association studies (GWAS) (e.g. BIN1, PICALM) and next-generation sequencing (NGS) studies (e.g. TREM2). These genes seem to be implicated in several pathways/functions, such as lipid metabolism (e.g. CLU, ABCA7), immune system (e.g. CLU, CR1, CD33, MS4A, TREM2), endocytosis and synaptic function (e.g. PICALM, BIN1, EPHA1, SORL1, CD2AP) [60]. However, these risk variants contribute much less to the overall individual risk of developing AD than APOE ε4.

1.1.5 Clinical presentation of dementia and Alzheimer’s disease

Dementia can present as a spectrum of clinical symptoms of cognitive and behavioural impairment. Common dementia symptoms include impaired episodic memory (the inability to learn and remember new information), inability to perform complex, multistep tasks, poor reasoning and judgment, aberrant visuospatial functions (e.g. failure to recognize familiar faces/objects), difficulties with language skills, and significant change in behaviour, personality and social engagement [2].

Three stages of dementia are typically described: mild, moderate and severe dementia. The symptoms present at each stage and the duration of each stage may differ among patients [10].

In the early stage (or mild dementia) patients commonly experience memory impairment, some difficulties with language skills, impaired decision-making, unawareness of the exact time and date, getting lost in familiar surroundings, and noticeable changes in behavior and personality (depression, anxiety, aggressiveness, apathy). This stage is typical for the first two years after disease onset. The middle stage (or moderate dementia) is characterized by advanced memory loss and communication problems and more significant unawareness of time and space; patients start to depend on other people’s help (family, friends, caregivers) and experience sleep disturbance, hallucinations, sadness and aggressiveness. This stage lasts approximately until the fifth year from disease onset.

In the late stage (or severe dementia) patients are completely dependent and cannot function by themselves. Apart from severe memory loss, patients are unable to communicate, comprehend or recognize familiar faces/objects, exhibit loss of orientation (e.g. can get lost even in their own home), and have difficulties with eating, swallowing and walking. They may experience urinary and fecal incontinence. Although patients at this stage are extremely inactive and show apathy, they may display significant agitation and aggressiveness. This last stage usually starts after the fifth year from onset [2, 10].

Apart from symptoms characteristic of dementia (regardless of the primary cause), there are symptoms more specific to AD. Patients with AD have a slowly progressing onset of symptoms [2]. Most patients have a typical or amnestic clinical presentation with impaired episodic memory and at least one more impaired cognitive function (such as visuospatial abilities, reasoning and judgment). The atypical or non-amnestic symptoms usually start before memory impairment and are initially more obvious, such as impairment in language domains, visuospatial abilities or executive functions (e.g. problem solving and reasoning) [2, 3]. Atypical presentation of AD is more frequently seen in EOAD patients, while typical presentation is mostly seen in LOAD [61].

Differential diagnosis of dementias can be challenging. Typical symptoms more specific to DLB (at an earlier stage of disease) compared to AD are visual hallucinations, parkinsonism and fluctuating cognition [7]. In VaD, clinical characteristics such as impaired attention and executive functions are more common; depression and apathy are also more prominent than in AD [6]. Symptoms more suggestive of FTD are behavioural changes (e.g. socially inappropriate behaviour) and language difficulties, while memory and visuospatial deficits are more suggestive of AD [9].

1.1.6 Diagnosis of Alzheimer’s disease

Currently, the diagnosis of AD is based mostly on clinical criteria, including a thorough medical history, mental status/neuropsychological testing, and a physical and neurological exam, based on which only a probable AD diagnosis can be made [2]. A definite diagnosis can only be made by neuropathological examination of specific post-mortem brain regions, which thus remains the gold standard for AD diagnosis. An accurate diagnosis of AD is still challenging, mostly due to misdiagnosis involving other types of dementia. In addition, some reversible disorders can mimic the clinical presentation of dementia, such as vitamin deficiency (vitamin B12), depression and hypothyroidism [18].

Initial clinical criteria for AD diagnosis were published in 1984 by the National Institute of Neurological and Communicative Disorders and Stroke (NINCDS) and the Alzheimer’s Disease and Related Disorders Association (ADRDA) workgroup, known as the NINCDS-ADRDA criteria, and were used in practice for more than 25 years. According to these criteria, a probable AD diagnosis was based mostly on the exclusion of other possible disorders. Disease was diagnosed only when signs of dementia were present, which means that the pre-clinical stage and milder forms of cognitive impairment were not considered in these guidelines. In addition, there were no clear recommendations for diagnosis of mixed forms of dementia, or for differentiation of AD from other dementia types, since there was a general lack of knowledge and recognition of other dementias and of appreciation of their co-existence with AD [62].

In 2011, the National Institute on Aging (NIA) and the Alzheimer’s Association (AA) proposed new, revised diagnostic and research criteria [62]. The new recommendations for AD diagnosis were intended to refine diagnostic accuracy and improve identification of AD cases [2]. The newly revised criteria described pathophysiological and clinical presentation separately, without the previous assumption that pathology and symptoms arise simultaneously. This initiative was based on the fact that AD-like pathology can coexist with other dementias [63], that AD pathology can be present in individuals without any clinical signs of disease [64], and that AD pathology can result in an atypical clinical presentation [65]. The important improvement in the new criteria is the acknowledgment that there are different stages of AD and, further, the integration of biomarkers (CSF and imaging) related to pathological changes into the diagnostic criteria. Biomarkers are divided into two groups which describe pathophysiological changes: 1) biomarkers of amyloid-β deposition (decreased CSF Aβ1-42, abnormal Aβ PET imaging) and 2) biomarkers of neuronal injury and degeneration (increased CSF t-tau and p-tau, reduction of fluorodeoxyglucose (FDG) uptake by tissue (of specific regions) detected by PET imaging, atrophy of specific regions as detected by structural MRI). Biomarkers of amyloid-β pathology could appear up to 20 years before clinical symptoms become evident, while biomarkers of neuronal injury and degeneration appear later in the disease course [62].

Three stages of AD are suggested: the preclinical stage, mild cognitive impairment (MCI) due to AD, and dementia due to AD. However, it has been recognized that it is challenging to distinguish between the preclinical stage and MCI, as well as between MCI and dementia due to AD (at the stage of mild dementia).

The preclinical phase is defined by the presence of early AD pathology, based on biomarker evidence (CSF Aβ1-42 and tau, and imaging), in asymptomatic individuals or individuals with very subtle cognitive impairment. This group precedes MCI due to AD and represents individuals at risk of progression to clinical AD. The NIA-AA recommendation for characterization of the preclinical stage is intended purely as research criteria at present, without clinical diagnostic purposes. Within the preclinical stage, three further stages are proposed: stage 1 for asymptomatic individuals with positive biomarkers of Aβ accumulation, stage 2 for asymptomatic individuals with positive biomarkers of Aβ accumulation and neuronal degeneration/neuronal injury, and stage 3 for individuals with positive biomarkers of Aβ accumulation and neuronal degeneration/neuronal injury showing subtle cognitive impairment [66].
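
The three preclinical stages described above can be summarized as a small decision rule. The following is a sketch for illustration only, assuming each criterion has already been reduced to a yes/no value; the function and parameter names are hypothetical, and combinations that the text does not describe are returned as None rather than assigned to a stage.

```python
from typing import Optional


def preclinical_stage(amyloid_positive: bool,
                      injury_positive: bool,
                      subtle_decline: bool) -> Optional[int]:
    """Map biomarker and cognition status to preclinical stage 1, 2 or 3.

    Returns None for combinations not covered by the three stages in the text
    (for example, no evidence of Abeta accumulation).
    """
    if not amyloid_positive:
        return None
    if not injury_positive:
        # Stage 1 is described for asymptomatic, Abeta-positive individuals only.
        return None if subtle_decline else 1
    return 3 if subtle_decline else 2


if __name__ == "__main__":
    print(preclinical_stage(True, False, False))
    print(preclinical_stage(True, True, False))
    print(preclinical_stage(True, True, True))
```

In practice each input depends on assay cut-offs and clinical judgment, which this sketch does not model.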

MCI due to AD defines symptomatic patients at the pre-dementia phase of AD. The NIA-AA outlined core clinical diagnostic criteria for use in the clinic, as well as research criteria. Core clinical criteria are based on an individual’s change in cognition, with one or more cognitive functions mildly affected (e.g. memory, executive, visuospatial function, language) and independence in everyday activities still preserved. In addition, evidence of longitudinal decline in cognition and elimination of alternative diagnoses increase certainty that MCI is due to AD pathology. The differential diagnosis may include assessing clinical presentation specific to other dementias, such as parkinsonism for DLB, prominent language or behavioural impairment for FTD, or evidence of cerebrovascular pathology examined with structural brain imaging for VaD. Research criteria include the use of biomarkers (in addition to core clinical criteria) to increase the diagnostic accuracy of MCI due to AD. Positive biomarkers of Aβ deposition and neuronal injury provide the highest certainty that AD pathology is the cause of MCI [67].

According to NIA-AA core clinical criteria probable AD dementia is diagnosed based on the clinical presentation of general dementia symptoms and symptoms more specific for AD (as described under section 1.1.5). Clinical criteria consider both typical and atypical clinical presentation in AD dementia diagnosis. In addition, it is critical to exclude other possible causes of dementia (based on history, clinical presentation or tests that would indicate other etiology) such as cerebrovascular pathology, DLB, FTD or other causes (e.g. due to medications), in order to diagnose probable AD dementia [2].

Evidence of causative genetic mutations in the APP, PSEN1 or PSEN2 genes in patients who meet core clinical criteria for probable AD increases certainty that AD is the underlying pathology and cause of dementia. The presence of the APOE ε4 allele was not recognized as specific enough to increase certainty of AD diagnosis. CSF and imaging biomarkers can increase confidence in the AD diagnosis, but are for the time being recommended for use in research settings, clinical trials and as an optional test when considered appropriate by the clinician (if available), and not for routine clinical diagnosis [2]. A definite AD diagnosis can be made post-mortem, based on the neuropathological examination of individuals who met ante-mortem criteria for AD dementia [2, 20].

The NIA-AA core clinical criteria for diagnosis of AD (MCI, AD dementia) are designed to be useful for AD diagnosis by general healthcare providers (without access to neuropsychological tests or biomarkers) as well as by dementia specialists.

Similar to the NIA-AA guidelines, the International Working Group (IWG) proposed research criteria for diagnosis of AD. This group acknowledged the importance of revising the NINCDS-ADRDA criteria, incorporating biomarkers, and characterizing earlier stages of AD. The diagnosis of probable AD is based on the core clinical criteria in addition to one or more supportive parameters (imaging or CSF biomarker evidence, and/or presence of autosomal dominant mutations) and exclusion criteria (evidence of other causes of dementia based on history, clinical presentation) [68]. The IWG advanced criteria defined individuals in the preclinical stage (as defined by NIA-AA) as asymptomatic at risk (asymptomatic patients with biomarker evidence of AD pathology) and pre-symptomatic at risk (asymptomatic patients with proven autosomal dominant mutations in APP, PSEN1, PSEN2) [69].

1.1.7 Treatment

At present, only symptomatic therapy is available for AD patients. Acetylcholinesterase inhibitors (donepezil, galantamine and rivastigmine) and the N-methyl-D-aspartic acid (NMDA) receptor antagonist (glutamate antagonist) memantine are the currently approved medications used in the clinic for patients’ treatment. As symptomatic drugs, these medications tend to improve/stabilize cognitive functions.

In the AD brain there is a significant loss of cholinergic neurons in the basal nuclei of the forebrain, producing aberrant cholinergic transmission in the hippocampus and cortex. As a result, there is an insufficient amount of the neurotransmitter acetylcholine, which is connected with the cognitive and learning deficits in AD [70].

Glutamate-mediated transmission is another signalling pathway impaired in AD, through a process known as glutamate excitotoxicity. Glutamate is the major neurotransmitter of the excitatory neurons in the brain and it targets several receptors of the CNS, such as the NMDA receptor, involved in memory and learning. Under pathological conditions, the NMDA receptors are overstimulated by excessive glutamate concentration, causing neuronal injury and damage [71].

Acetylcholinesterase inhibitors are thus designed to inhibit the enzyme acetylcholinesterase (the enzyme that hydrolyzes acetylcholine) in order to increase the amount and availability of the neurotransmitter acetylcholine in the synaptic space. However, the benefit of these medications is only short-term, with commonly observed moderate cognitive improvement for approximately 6 months. Apart from stabilization of cognitive functions, other symptoms, such as behavioural ones, are improved; overall, their benefits on symptoms are seen for the first year of treatment. Acetylcholinesterase inhibitors are prescribed for the mild and moderate stages of AD dementia [18].

Memantine is a non-competitive NMDA receptor antagonist designed to block overstimulation of NMDA receptors while preserving normal functioning of the receptors. It is a generally well-tolerated drug (although some adverse effects are possible), prescribed for treatment of moderate to severe AD dementia, with observed improvement in cognition, along with behavioural functions, over 6 months of treatment [18].

Significant efforts have been made over the past decades towards development of disease-modifying agents focused on the AD pathological mechanisms.

Initial drug discovery was based on the amyloid hypothesis. As a result, the APP processing enzymes β- and γ-secretase were targets in early clinical trials. Development of γ-secretase inhibitors has been challenging since γ-secretase is implicated in the processing of various transmembrane proteins, leading to adverse effects of these drugs. Semagacestat and avagacestat, small-molecule γ-secretase inhibitors, failed to provide cognitive benefits or slow down the disease course and were withdrawn from the trials for safety reasons; semagacestat worsened the symptoms in AD patients and was associated with skin cancer and infections. Avagacestat also worsened cognitive symptoms, and was associated with gastrointestinal side effects, glycosuria, skin cancer and asymptomatic cerebral microbleeds [72, 73]. Clinical trials of β-secretase inhibitors failed due to pharmacokinetic properties or due to side effects such as liver toxicity. However, there are ongoing clinical trials of novel β-secretase inhibitors (phase II/III), as well as of a γ-secretase modulator (phase II) [74].

Immunotherapy is also under investigation as a potential option for AD patients. The main idea of this group of drugs is to increase clearance of brain Aβ through active or passive immunization. Active immunization of patients with full-length Aβ1-42 (AN-1792) resulted in reduction of Aβ plaques; however, some of the participants (6%) developed meningoencephalitis and the study was terminated [75]. Passive immunization with monoclonal antibodies against Aβ showed more promise. Bapineuzumab, a humanized antibody directed against the N-terminal Aβ epitope, was tested in a phase III trial on patients with mild to moderate AD. There was no observed cognitive improvement in AD patients, and an adverse effect of edema was found among patients receiving the drug. Only APOE ε4 carriers displayed lower levels of p-tau and decreased PiB-PET signal. However, a significant percentage of APOE ε4 non-carriers did not have positive Aβ pathology at baseline, which raised a question regarding the reliability of the diagnosis of these patients [76].

Another humanized Aβ antibody, solanezumab, binds to the central region of Aβ and targets soluble Aβ (the more neurotoxic Aβ forms). This drug was tested in a large clinical trial, also in AD patients with mild to moderate dementia, but failed to provide benefits on cognition and functioning (as primary outcomes). Increased Aβ1-42 levels but no change in tau proteins were found in the solanezumab group compared to placebo [77]. However, this drug was shown to be safe, and additional clinical trials of solanezumab are ongoing [74, 77].

Overall, clinical trials targeting Aβ (inhibiting/modulating Aβ processing or promoting Aβ clearance) suggested that treatments are likely to be less effective in patients at the dementia stage of AD, and highlighted that such treatments should be introduced earlier in the disease course, in the prodromal and presymptomatic stages, when disease is less prominent. Indeed, several other clinical trials are currently under way for treatment of patients with MCI and mild AD (e.g. crenezumab, gantenerumab, aducanumab). The early study (phase Ib) of aducanumab, a human monoclonal antibody designed to target aggregated Aβ (soluble oligomers and insoluble fibrils), showed reduction of Aβ in the brain measured by Aβ imaging (florbetapir-PET) in a dose-dependent manner, as well as slower cognitive progression [3, 78]. This drug is currently in a phase III clinical trial. Moreover, there are ongoing trials focusing on disease prevention (prevention trials) [79]. Solanezumab is being tested in a phase III clinical trial (A4 trial, Anti-Amyloid in Asymptomatic Alzheimer’s disease) on asymptomatic patients at risk for developing AD (preclinical or presymptomatic stage) with positive Aβ pathology (pre-screened with PET imaging) (https://www.clinicaltrials.gov/ct/show/NCT02008357). Apart from preclinical trials in individuals at risk of developing sporadic AD, there are additional trials oriented toward individuals with genetic risk factors (e.g. APOE ε4 homozygotes) and also families with autosomal-dominant mutations related to AD (e.g. Dominantly Inherited Alzheimer’s Network-Treatment Unit trial) (ClinicalTrials.gov, identifiers: NCT02565511, NCT01760005, respectively).

In addition, the failure of earlier studies could be partially explained by incorrect diagnosis, insufficient target engagement, or the possibility of directing drugs against the wrong target (working under the assumption of the amyloid hypothesis).

An alternative therapeutic target is tau, with drug development aimed at preventing tau aggregation. The tau aggregation inhibitor TRx0237 is currently in a phase III clinical trial, and the active tau vaccine AADvac1 is in a phase II trial in AD patients (ClinicalTrials.gov, identifiers: NCT02245568, NCT02579252, respectively). AADvac1 is the first tau-based active immunotherapy being tested in humans. This peptide vaccine is designed to specifically target a pathological tau conformational epitope, aiming to inhibit progression of tau pathology and slow disease progression. The results from phase I trials in mild to moderate AD indicate high immunogenicity in treated patients with a favorable safety profile. In addition, another active vaccine (ACI-35) is under investigation, and there are also studies designed for development of tau-based passive immunotherapies [74, 80].

Biomarkers serve several purposes in clinical trials: diagnosis, especially of preclinical AD and early clinical stages such as MCI, and prediction or indication of preclinical-to-clinical progression. Biomarkers can also help in selecting patients for a particular drug (e.g. by focusing on patients at a certain stage of AD or with a certain AD subtype) or in indicating drug target engagement and downstream effects (so-called theragnostic biomarkers) [4].

Besides disease-modifying drugs targeting Aβ or tau pathology, medications intended to moderate neuroinflammation in AD are being tested in clinical trials (against microglial activation, e.g. azeliragon in phase III) [74]. There are also a significant number of further trials testing dietary compounds and symptomatology-oriented medications (e.g. for treatment of agitation) in AD patients [3].

1.2 Mass spectrometry-based biomarker development

1.2.1 Introduction to biomarkers and biomarker’s characteristics

1.2.1.1 Biomarker definition and classification

A biological marker, or biomarker, can be defined as an unbiased in vivo measure of a physiological or pathological process in the body, or of a pharmacological response to a therapeutic treatment [81]. Biomarkers have several applications in the clinic and can be used to establish a clinical diagnosis (diagnostic biomarkers), to assess the stage of the disease course (staging biomarkers), to predict disease risk (risk biomarkers) and prognosis (prognostic biomarkers), as well as to predict and monitor the efficacy of treatment or to guide therapeutic dose (companion biomarkers) [4, 81]. In research and clinical trial settings, biomarkers can be used to improve diagnostic accuracy and allow patients’ stratification or enrichment of specific patients’ subgroups [4]. Moreover, pharmaceutical research encompasses several types of biomarkers: target engagement biomarkers (designed to monitor the ability of a therapeutic to bind the target site), pharmacodynamic biomarkers (used to measure the therapeutic effects on the downstream disease mechanisms), safety biomarkers, and surrogate endpoints (biomarkers used instead of clinical endpoints to predict clinical benefit) [82]. Theragnostic biomarkers are biomarkers used to identify and monitor the effect of a drug on disease molecular mechanisms (e.g. to monitor target engagement and the downstream effect of a drug) [4].


1.2.1.2 Analytical and clinical evaluation of biomarkers

Before a biomarker or a diagnostic test is introduced into the clinic, it is necessary to establish a reliable analytical method for its measurement and evaluate its performance in an appropriate clinical context.

1.2.1.2.1 Analytical evaluation

Accuracy of the biomarker measurement is an important analytical specification of the method; it defines the closeness of agreement between the measured analyte value and its true value, or in other words how correctly the method of choice measures the analyte of interest. A related method characteristic is analytical specificity, the ability of the method to measure only the particular analyte, which is essential for reliable quantification [83].

Precision of the method reflects how closely the results obtained from repeated analysis of the same sample agree with each other. Poor precision can mask small differences among samples. Different terms have been suggested for assay imprecision, depending on the time span and the conditions over which the measurement is repeated. Repeatability measures precision over a short period of time under the same conditions and is also known as intra-assay, within-run, or within-day precision. Intermediate precision measures within-laboratory precision over a longer period (days) under different conditions (e.g. using different reagents or calibrations). This type of precision is also called between-run, between-day or inter-assay precision. Reproducibility measures variation between laboratories using the same analytical test that is being evaluated [83].
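As a rough illustration of how these precision terms are often quantified, the sketch below computes the coefficient of variation (CV%) for two sets of replicate measurements. The use of CV%, the function name `cv_percent`, and all replicate values are illustrative assumptions and are not taken from this work.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (%): sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical replicate measurements of the same sample (arbitrary units).
within_run = [10.1, 9.9, 10.0, 10.2, 9.8]    # same run, same conditions
between_run = [10.1, 9.2, 10.9, 9.5, 10.8]   # different days and reagent lots

repeatability_cv = cv_percent(within_run)    # within-run imprecision
intermediate_cv = cv_percent(between_run)    # between-run imprecision
```

In this made-up data set the between-run CV is larger than the within-run CV, which is the pattern usually expected when more sources of variation are allowed to act.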

Analytical assay assessment also includes defining assay linearity, the range over which there is a proportional relationship between the measured value and the corresponding concentration of the analyte. It is usually determined by performing serial dilutions of a sample with known concentration. In assessing linearity it is important to cover the analytical range within which the biomarker concentration in clinical samples is expected to fall.

Analytical sensitivity has been defined in different ways. The limit of blank (LOB) is the highest measurement value expected from a blank sample that contains no analyte of interest. The limit of detection (LOD) is the lowest concentration at which the analyte can be detected, while the limit of quantification (LOQ) is the lowest level at which the analyte can be quantified with acceptable precision and accuracy [83].
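A minimal sketch of how LOB and LOD might be estimated is given below. It assumes one commonly used parametric convention (LOB as the mean of blank replicates plus 1.645 standard deviations, LOD as LOB plus 1.645 standard deviations of a low-concentration sample); this convention, the function names and the example values are assumptions for illustration and are not necessarily the approach described in [83].

```python
from statistics import mean, stdev

def limit_of_blank(blank_values):
    """LOB under the assumed convention: mean(blank) + 1.645 * SD(blank)."""
    return mean(blank_values) + 1.645 * stdev(blank_values)

def limit_of_detection(blank_values, low_sample_values):
    """LOD under the assumed convention: LOB + 1.645 * SD(low-concentration sample)."""
    return limit_of_blank(blank_values) + 1.645 * stdev(low_sample_values)

# Hypothetical replicate signals (arbitrary units).
blanks = [0.0, 1.0, 2.0]
low_sample = [4.0, 5.0, 6.0]

lob = limit_of_blank(blanks)
lod = limit_of_detection(blanks, low_sample)
```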

1.2.1.2.2 Clinical evaluation

Different measures of biomarker clinical performance have been established over time. Basic measures of diagnostic accuracy include sensitivity, specificity, predictive values, the receiver-operating characteristic (ROC) curve, and likelihood ratios.

Diagnostic accuracy of a biomarker or test is its ability to correctly classify individuals into their designated clinical groups (such as disease versus non-disease). Here the test under evaluation (index test) is compared against the results obtained with the reference standard test, the so-called “gold standard”, in the same patients suspected of having the disease of interest [84].

Ideally, the diagnostic test would completely separate the disease (true positive, TP) from the non-disease (true negative, TN) cases; however, there is typically an overlap between these two groups, with the possibility of two types of test error. A false negative (FN) error occurs if the diagnostic test fails to identify the disease in a patient who has it, while a false positive (FP) error occurs if the diagnostic test indicates the disease in an individual who does not have it. Overall, the diagnostic accuracy of an index test can be quantitatively expressed as the proportion of true positives and true negatives among all cases: accuracy = (TP + TN)/(TP + TN + FP + FN) [84].

Diagnostic sensitivity describes a biomarker’s ability to identify patients who have a disease condition: the number of true positives divided by the number of all cases with a disease.

Sensitivity = TP/(TP + FN)

Diagnostic specificity is the ability of a biomarker to identify individuals who do not have a specific disease condition: the number of true negatives divided by the number of all cases without the disease.

Specificity = TN/(TN + FP)

Positive predictive value (PPV) describes the percentage of patients with the positive biomarker test that have a specific disease; or in other words the number of true positives divided by the number of all cases with a positive test, which basically illustrates the probability of having the specific condition/disease in an individual with a positive test.

PPV = TP/(TP + FP)

On the other hand, the negative predictive value (NPV) describes the percentage of the individuals with a negative biomarker test that do not have a specific disease; or the number of true negatives divided by the number of all cases with the negative test, which again refers to the probability of not having a specific condition/disease in an individual with a negative test.

NPV = TN/(TN + FN)


The likelihood ratios combine sensitivity and specificity. The positive likelihood ratio (LR+) describes how much more likely a positive test result is in a patient with the disease condition than in a person without it. The negative likelihood ratio (LR-) describes how likely a negative test result is in a patient with the disease condition compared to a person without it [84].

LR(+) = sensitivity/(1 - specificity)

LR(-) = (1 - sensitivity)/specificity
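The formulas above can be collected into a short sketch that derives all measures from the four cells of a 2x2 table. The function name `diagnostic_metrics` and the example counts are hypothetical and serve only to illustrate the arithmetic.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Diagnostic accuracy measures from true/false positive and negative counts.

    Note: LR+ is undefined when specificity is 1 and LR- when specificity is 0;
    this sketch does not handle those edge cases.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }

# Hypothetical study: 100 diseased and 100 non-diseased individuals.
m = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
```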

Diagnostic accuracy can also be presented using the ROC curve, which plots sensitivity (y-axis) against 1 - specificity (x-axis) at different cut-off values. Overall accuracy is calculated as the area under the curve (AUC), which can range from 0.5, indicating no discriminative ability (random classification), to 1, indicating perfect performance or accuracy [85]. The AUC thus quantifies how close the ROC curve under evaluation is to the ideal ROC curve (AUC = 1). For instance, an AUC of 0.85 would mean that 85% of the time a randomly selected subject from the disease group will have a higher test result than a randomly selected subject from the non-disease group [86].
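The probabilistic reading of the AUC can be made concrete with a small sketch that compares every disease subject against every control subject. The function name and the example values are hypothetical; counting ties as one half is a common convention and is an assumption here.

```python
def auc_pairwise(disease_values, control_values):
    """AUC as the fraction of disease/control pairs in which the disease
    subject has the higher test result (ties count as one half)."""
    wins = 0.0
    for d in disease_values:
        for c in control_values:
            if d > c:
                wins += 1.0
            elif d == c:
                wins += 0.5
    return wins / (len(disease_values) * len(control_values))

# Hypothetical biomarker levels (arbitrary units).
auc_overlap = auc_pairwise([3, 4, 5], [1, 2, 3])   # partial overlap between groups
auc_perfect = auc_pairwise([4, 5, 6], [1, 2, 3])   # complete separation
```

This pairwise calculation is quadratic in the number of subjects, so it is meant to show the interpretation rather than to serve as an efficient implementation.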

The added value of a biomarker test can be estimated by comparing ROC curve areas or by using different reclassification measures, namely net reclassification improvement (NRI), integrated discrimination improvement (IDI), the predictiveness curve, and decision curve analysis [87].

The accuracy of a biomarker can differ depending on the disease stage or patient subgroup in which it is assessed. For example, a biomarker may be very accurate in differentiating a more advanced disease stage from normal controls, because concentrations differ widely between the groups, but poor in differentiating an early, clinically poorly defined stage from healthy individuals.

Some of the measures of diagnostic accuracy are affected by disease prevalence; for example, higher disease prevalence will result in higher PPV [88].

The desirable level of PPV and NPV can depend on the intended use of the test; for example, a test with very good specificity (e.g. 99%) applied as a screening test in a population with very low disease prevalence may still produce a prohibitive number of false positives.

It is therefore important to specify the population and the clinical context for which the biomarker is sought and to design the study accordingly.
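The dependence of PPV on prevalence can be illustrated with a short calculation. The specificity of 99% follows the example in the text; the sensitivity of 99% and the two prevalence values are assumptions chosen only for illustration.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV expressed through sensitivity, specificity and disease prevalence."""
    true_pos = sensitivity * prevalence                # fraction of population: diseased and test-positive
    false_pos = (1 - specificity) * (1 - prevalence)   # fraction of population: healthy but test-positive
    return true_pos / (true_pos + false_pos)

# Same hypothetical test applied in two different settings.
ppv_screening = ppv_from_prevalence(0.99, 0.99, 0.001)  # screening, prevalence 0.1%
ppv_clinic = ppv_from_prevalence(0.99, 0.99, 0.20)      # specialist clinic, prevalence 20%
```

Under these assumptions fewer than one in ten positive screening results would be a true positive, whereas in the higher-prevalence setting the large majority of positive results would be.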

Another important aspect of diagnostic accuracy is the correct and reliable classification of the disease and non-disease condition, based on the most accurate diagnostic criteria (reference standard). A poor choice of reference standard will affect the diagnostic accuracy [89, 90].

Diagnostic accuracy should ideally be assessed in participants selected randomly on the basis of the presence of the expected/defined clinical symptoms (disease group), while the control (non-disease) group should include individuals who were suspected of having the disease on the basis of clinical symptoms but were proven not to have it. If the control group consists of healthy individuals or of some other disease group, the estimate of diagnostic accuracy can be unrepresentative and biased. In the case of a rare disease and a small disease group, the control group should be of similar size or two to three times larger [84].

Before a clinical test/biomarker is recommended for clinical practice, its clinical utility has to be evaluated. Physicians and all relevant decision makers have to assess the impact of the new clinical test/biomarker on patients’ health outcomes (e.g. whether the test will lead to a health benefit, who can benefit from the test the most, and how the test improves health outcomes compared to the existing alternative) [91].

In order to make the reporting of diagnostic accuracy more transparent and consistent across studies, the STAndards for the Reporting of Diagnostic accuracy (STARD) statement was established [92]. STARD offers guidelines in the form of a list of information that a diagnostic accuracy study has to report, such as cohort characteristics, description of the index test and reference method, methods used to estimate diagnostic accuracy, flow of all participants (as a diagram), clinical characteristics of participants, etc. [93]. A biomarker study must provide sufficient detail to allow replication and confirmation by future studies.

Biomarker development faces many challenges related to the pre-analytical (e.g. patient availability, sample collection, handling, storage), analytical (e.g. reliable analytical method) and post-analytical (e.g. interpretation of results, appropriate statistical analysis) factors.

One of the main challenges in biomarker research is bias, which can be defined as a systematic difference between groups that is not attributable to the disease under study. Bias can be difficult to identify and to address or minimize. Some strategies for avoiding bias can be applied, such as designing an unbiased, equal comparison between the disease and control groups, so that the only factor that varies in the study is the measured factor (outcome). This should be applied during study design (e.g. selection of subjects/samples), conduct (reporting subjects’ characteristics, sample collection, handling, and analysis) and interpretation (discussing possible bias) [94].

The control group and the disease group should be carefully selected to match the target population for which the biomarker is intended [95]. The effect of patient characteristics (such as age, sex, and lifestyle) and biological variation (intra- and inter-individual) on the biomarker level should be investigated so that adequate controls are selected.

Pre-analytical factors can also introduce spurious differences between patients, as discussed in detail in the following section.

1.2.1.3 Pre-analytical phase and biomarkers

Variations in biomarker concentration can arise from pre-analytical factors, which have to be evaluated as a potential source of bias. These factors can be of physiological or non-physiological nature and can generally be classified into three groups: physiological factors, sample collection/handling, and interfering factors [96].

Some of the physiological factors that can influence biomarker concentration are age, sex, time (circadian rhythm), seasonal changes, altitude, conditions such as pregnancy and menstruation, as well as lifestyle (e.g. exercise, smoking, alcohol consumption, diet). Moreover, intra- and inter-individual variation of the biomarker has to be assessed, since these can account for higher variability than the analytical variation; inter-individual biological variation in biomarker concentration is usually two-fold higher than the intra-individual variation [97].

Sample collection, handling and storage should be standardized. Variations related to sample collection can be due to the length of fasting time (e.g. triglycerides, glucose), the time of sample collection (e.g. urine), or posture during collection (e.g. albumin increases in sitting posture). Blood collection tubes can also affect biomarker concentration, for example by adsorbing the biomarker, affecting biomarker stability, or causing analytical interference (e.g. the plasma anticoagulant EDTA chelates cations and may influence enzyme activity assays) [96].


Moreover, handling conditions, such as time and temperature of sample storage, centrifugation duration/speed, are also important processing factors that may introduce spurious variations in the biomarker concentration.

Finally, interfering factors can affect biomarker concentration. The most common interferences in the clinical laboratory are hemolysis, icterus, and lipemia. Hemolysis can affect the concentration of components that are present at much higher concentration within erythrocytes (e.g. potassium, lactate dehydrogenase), cause degradation of some analytes (e.g. ACTH, due to release of proteolytic enzymes), or cause analytical interference; hyperlipidemia can cause volume displacement or spectral interference; icterus can affect redox reactions and also cause spectral interference. Other endogenous interfering factors include circulating antibodies, drugs, etc.

1.2.2 Phases of protein biomarker development

A typical protein biomarker development pipeline encompasses several phases: biomarker identification and qualification, verification, pre-clinical assay development, validation, and commercialization or final approval of an assay for clinical use by the health agencies (e.g. FDA). The entire biomarker development process is usually costly and time consuming; it can take several years or even decades until a potential biomarker reaches the clinic [98]. Before a biomarker development process begins, it is important first to identify the unmet clinical need and clearly define the purpose of the future biomarker. The relevant clinical specimen for discovery and ultimate clinical use should be defined. Commonly used biological samples for biomarker development are summarised in section 1.2.3.

A simplified overview of the biomarker development phases is presented in Figure 1.2.


1.2.2.1 Protein biomarker identification phase

The discovery phase of biomarker development includes protein identification and quantification (typically semi-quantitative) in an attempt to find proteins that differ in concentration between a disease and a control state [99].

Typically, the identification stage of biomarker development involves cell lines, animal models, or proximal fluids and blood samples as a source of potential biomarkers. Several approaches can be taken to identify potential candidates, such as genomics, transcriptomics and proteomics.

Mass spectrometry has proven to be a powerful method for large-scale protein identification in various biological fluids; proteomes thus identified are considered the most valuable source of potential clinical biomarkers [100]. One of the limitations of MS-based proteomics is the limited sensitivity of MS methods (see section 1.2.5.1). In order to allow identification of lower-abundance proteins in a complex matrix (e.g. blood, proximal fluids), different fractionation strategies are employed, which can increase imprecision and result in lower throughput.

In this phase, thousands of proteins are typically identified using a small number of samples (typically around 10). There is a high possibility of false positive biomarker discovery at this initial stage. This can be a particular problem for lower-abundance proteins: these proteins are selected for fragmentation less often, and thus their identification and quantification can be random [101]. It is important to include a larger number of samples and repeated analyses, as far as this is possible. In large-scale proteomics, the false discovery rate (FDR) is defined as the number of false positive identifications divided by the total number of peptide identifications and is an estimate of identification confidence.
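Since the true number of false positives is unknown, the FDR has to be estimated in practice. The sketch below assumes a target-decoy strategy, in which spectra are searched against real (target) and reversed or shuffled (decoy) sequences and decoy hits above a score threshold stand in for false positives. The strategy, the function name, and the scores are assumptions for illustration; the text above does not specify how the FDR was estimated.

```python
def estimate_fdr(psms, score_threshold):
    """Estimate FDR as decoy hits / target hits among matches at or above a threshold.

    psms: list of (score, is_decoy) tuples, one per peptide-spectrum match.
    """
    accepted = [is_decoy for score, is_decoy in psms if score >= score_threshold]
    n_decoy = sum(accepted)              # True counts as 1
    n_target = len(accepted) - n_decoy
    if n_target == 0:
        return 0.0
    return n_decoy / n_target

# Hypothetical peptide-spectrum matches: (search score, matched a decoy sequence?)
psms = [(50, False), (45, False), (40, True), (35, False), (30, False), (20, True)]

fdr_loose = estimate_fdr(psms, score_threshold=30)
fdr_strict = estimate_fdr(psms, score_threshold=45)
```

Raising the score threshold lowers the estimated FDR at the price of fewer accepted identifications.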


1.2.2.2 Protein biomarker qualification phase

In the qualification phase, biomarkers identified in the discovery phase as showing different concentrations between groups are tested again in a small number of independent samples. Moreover, if the initial sample for the discovery phase was a proximal fluid, cell line or animal model and the ultimate sample of choice for the verification/validation phase is, for instance, plasma, then the candidates’ differential expression should be confirmed in plasma during the qualification phase [101]. The method utilized here can be mass spectrometry, but with modest fractionation. Furthermore, additional filtering criteria can be applied in this phase to decrease the number of candidate biomarkers to a number suitable for the verification stage of quantification. Such criteria can include selection of tissue-specific or secreted and membrane proteins [98]. The most promising candidates from this phase are then selected for the verification phase.

1.2.2.3 Protein biomarker verification phase

The verification phase includes evaluation of candidates in a larger number of independent samples. The samples employed should reflect the target population for the intended clinical use of the biomarker.

Mass spectrometry-based targeted quantification is commonly utilized at this stage, providing multiplexing capacity, higher throughput, and good accuracy and sensitivity (especially compared to discovery-based methods) [102]. Immunoassays can also be utilized and offer the advantages of faster sample analysis and better sensitivity. However, immuno-based assays may have limited availability, considering that for some newly discovered proteins reliable antibody-based assays or high-quality antibodies may not exist. A detailed explanation of targeted mass spectrometry-based quantification is given in section 1.2.5.


1.2.2.4 Protein biomarker validation phase

In the validation phase, only a few candidate biomarkers, those shown to be the most promising in the verification phase, are tested in a much larger number of samples. Ideally, samples are obtained from different clinical centers. At this stage, higher-throughput immunoassays with excellent reproducibility are usually used. The validation phase is characterized by long duration and high cost.

Once the biomarker has completed all the phases of biomarker development, the next step is the development and commercialization of the new test for clinical use; before the biomarker and assay reach the clinic, certain rules and regulations have to be met so that they can be approved by health care agencies, such as the FDA in the United States [101].

Figure 1.1: Protein biomarker development phases. The figure shows the main phases of biomarker development as described in the text. At the beginning of the biomarker discovery thousands of candidates are identified in lower number of samples, resulting in the limited throughput. Candidate/s passing the quantification and verification phases can be considered for the validation phase during which 1-10 candidates are evaluated in large number of samples using high throughput method. The process from the biomarker discovery to the clinical use is long and high in cost with rigorous evaluations performed by the healthcare agencies before the biomarker and assay is finally approved.


1.2.3 Common samples used for biomarker discovery

Different specimens can be used for protein biomarker discovery, including plasma or serum, urine, proximal fluids (cerebrospinal fluid, amniotic fluid, seminal plasma), as well as tissue samples, cell lines, or animal models. Each of these samples has advantages and limitations. Among the body fluids, blood is commonly the specimen of choice; it can be obtained in large amounts with a minimally invasive procedure, i.e. venipuncture, and because of the systemic circulation and contact with many organs, blood plasma and serum can reflect physiological/pathological changes in the body. However, blood is also a very challenging fluid for mass spectrometry-based biomarker identification, considering its high complexity and wide dynamic range of protein concentration (more than ten orders of magnitude), with high-abundance proteins masking the identification of low-abundance proteins [97]. Also, blood can dilute biomarkers released locally. By applying different fractionation strategies and depleting high-abundance proteins, the matrix complexity can be decreased, which can facilitate identification of lower-abundance proteins, but at the cost of lower throughput and higher process variability [100, 103].

The advantages of urine as a sample are its non-invasive collection and the large sample volume available. The protein content of urine is lower than that of blood, but it is still considered a complex matrix with a high dynamic range [104]. Moreover, standardization of urine collection and normalization of protein levels among samples are required because of changes in the urinary flow rate.

Proximal fluid refers to a fluid that surrounds a certain organ, usually in direct contact with it, and can thus reflect physiological and pathological changes. Examples of such fluids are cerebrospinal fluid, nipple aspirate fluid, saliva, seminal plasma, ascites, and amniotic fluid. They are less complex samples than blood and can contain higher amounts of tissue-derived proteins, which makes them especially attractive for biomarker discovery since they can directly mirror pathological processes [105]. Their common drawbacks are invasive collection, and consequently lower sample availability, small sample volume, and the possibility of blood contamination.

Tissue samples can also be used for biomarker discovery, considering the high concentration of potential biomarkers in pathological tissues. However, their collection by biopsy or during surgery is an invasive procedure, and control, healthy tissues are often of limited availability. Tissue samples are very heterogeneous, with different cell types represented, and some tissue samples, such as formalin-fixed, paraffin-embedded (FFPE) sections, are challenging to analyze due to formalin-induced protein cross-linking. The latter difficulties can be partially overcome by employing laser capture micro-dissection of tissue as well as novel protocols for FFPE tissues that are more compatible with mass spectrometry analysis [106, 107]. In addition, protein biomarker candidates identified in tissue may not be present in the fluid of interest (e.g. proximal fluid or blood) due to limited leakage into the fluid or degradation by proteolytic enzymes [100].

Alternatively, cell lines and tissues/fluids from animal models can be used as the initial sample of choice. Advantages of cell lines (cell lysates or secretome) are their availability and medium sample complexity, which enable identification of a wide range of proteins under controlled biological and experimental conditions [108]. At the same time, a cell line is a very simplified construct, lacking in vivo tissue heterogeneity and interaction with the microenvironment. Also, cell lines are often established from a specific patient at a specific time in the disease course, and that patient type and time point may not be relevant for biomarker discovery.


Animal models offer controlled biological and experimental variation and sample availability (tissues, fluids) at different disease stages. However, the relevance of the modelled pathology to human pathology should be demonstrated.

1.2.4 Mass spectrometry-based protein biomarker identification and quantification

1.2.4.1 The mass spectrometer

The principal parts of a mass spectrometer are the ion source, the mass analyzer, and the detector. The ion source ionizes and volatilizes proteins/peptides before they enter the mass analyzer; the most common types of ion source for proteins are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). ESI ionizes analytes in a liquid sample and is usually coupled to liquid chromatography and subsequently to an ion trap or quadrupole type of mass spectrometer; it is more frequently used for ionization of complex mixtures. At the ESI source, a high voltage is applied, resulting in the release of fine droplets (a spray) of charged peptides from solution, which then migrate to the mass analyzer under the electric field; the addition of a drying gas further facilitates solvent evaporation and peptide vaporization [109]. MALDI ionizes peptides incorporated into a dry, crystallized matrix using a pulsed laser beam; it is more appropriate for simpler sample matrices and is usually coupled to a time-of-flight type of mass analyzer [110]. The mass analyzer is the main part of the mass spectrometer; here peptides are separated according to their mass-to-charge (m/z) ratios. The main types of analyzers are ion trap, time-of-flight (TOF), quadrupole, and Fourier transform ion cyclotron resonance (FT-MS). These analyzers separate ions by applying an electric field (ion trap, TOF, quadrupole) or a magnetic field (FT-MS). The mass analyzers most frequently used for proteomic studies (protein identification) are ion trap and TOF. Ion traps are sensitive and relatively inexpensive but have rather low mass accuracy; TOF analyzers perform similarly to ion traps but have a larger mass range. Mass analyzers are commonly combined for tandem mass analysis (MS/MS). Briefly, ions of intact peptides are selected in the first analyzer and then fragmented, followed by analysis of the peptide fragments. Examples of hybrid instruments are LTQ-Orbitrap (types of ion traps), TOF-TOF, and quadrupole-TOF. The most common hybrid instrument for targeted quantification (with selected reaction monitoring assays) is the triple quadrupole, described in more detail in section 1.2.5.

The detector is the third part of the mass spectrometer; typically it is a microchannel plate detector containing electron multipliers. When ions hit the surface of the electron multipliers, electron emission is initiated; the resulting electron flow is proportional to the number of ions of a specific m/z that reached the detector, thus reflecting peptide abundance [109].
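To make the m/z quantity concrete, the following sketch calculates the expected m/z of a protonated peptide from its sequence and charge state. The residue masses are standard monoisotopic values; the function name and the test sequence PEPTIDE are illustrative assumptions, and modifications such as cysteine alkylation are deliberately ignored.

```python
# Monoisotopic masses (Da) of amino acid residues, water and the proton.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def peptide_mz(sequence, charge):
    """m/z of an unmodified peptide carrying `charge` protons."""
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge

# The same hypothetical peptide observed as a singly and a doubly charged ion.
mz_1 = peptide_mz("PEPTIDE", 1)
mz_2 = peptide_mz("PEPTIDE", 2)
```

The same peptide therefore appears at different m/z values depending on its charge state, which is why the charge has to be determined before a mass can be assigned.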

1.2.4.2 Mass spectrometry-based protein identification

Protein identification with mass spectrometry can be achieved with a top-down approach, identifying intact proteins, or with a bottom-up approach, identifying peptides which are then matched to the corresponding proteins. Bottom-up proteomics is more widely used and is currently the method of choice for large-scale protein identification.

The bottom-up workflow starts with denaturation of proteins in the sample, followed by chemical modification of cysteine residues in order to avoid formation of disulfide bonds. In the next step, proteins are digested with proteolytic enzymes, such as trypsin, which break proteins into smaller fragments that are easier to analyze in the mass spectrometer. In order to decrease sample complexity and allow identification of lower-abundance proteins, several fractionation strategies can be employed before mass spectrometry analysis. In a common strategy, the peptide mixture is fractionated with multidimensional protein identification technology (MudPIT), which separates peptides by two-dimensional liquid chromatography: strong cation exchange chromatography (SCX; peptides separated by their charge), usually performed off-line, in addition to reverse phase liquid chromatography (RP-LC; peptides separated by their hydrophobicity), classically coupled on-line to the mass spectrometer [111].

However, separation can also be performed using one-dimensional electrophoresis (1DE) (such as SDS-PAGE); protein bands are cut from the gel and digested into peptides (in-gel digestion).
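The enzymatic digestion step described above can be mimicked computationally. The sketch below assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P); this rule, the function name, and the example sequence are assumptions for illustration, and missed cleavages are not modelled.

```python
def tryptic_digest(protein_sequence):
    """Split a protein sequence into fully cleaved tryptic peptides."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein_sequence):
        next_residue = protein_sequence[i + 1] if i + 1 < len(protein_sequence) else ""
        # Cleave C-terminal to K or R, but not when followed by P.
        if residue in "KR" and next_residue != "P":
            peptides.append(protein_sequence[start:i + 1])
            start = i + 1
    if start < len(protein_sequence):
        peptides.append(protein_sequence[start:])  # C-terminal peptide
    return peptides

# Hypothetical short sequence.
peptides = tryptic_digest("AGKLMRPTEKSV")
```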

After fractionation, peptides are volatilized and ionized by electrospray ionization (ESI) and then transferred in the gas phase to the mass analyzer, where they are separated by their m/z and subjected to collision-induced fragmentation; the fragment ions are then detected and fragment mass spectra are generated (tandem MS or MS/MS). Isolation of peptides for fragmentation is typically based on selecting the most intense precursors (data-dependent acquisition).
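The precursor selection in data-dependent acquisition can be sketched as a simple ranking of the peaks in one MS1 scan. The function name, the value of n, and the peak list are hypothetical; real instruments additionally apply criteria such as dynamic exclusion and charge-state filtering, which are not modelled here.

```python
def select_top_n_precursors(ms1_peaks, n):
    """Return the m/z values of the n most intense peaks in one MS1 scan.

    ms1_peaks: list of (mz, intensity) tuples.
    """
    ranked = sorted(ms1_peaks, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _intensity in ranked[:n]]

# Hypothetical MS1 scan with four peptide ions.
scan = [(400.2, 1.0e5), (512.3, 5.0e6), (633.8, 2.0e4), (745.4, 8.0e5)]
selected = select_top_n_precursors(scan, n=2)
```

In this example the two weakest ions are never fragmented, which illustrates why low-abundance peptides are identified less consistently in discovery experiments.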

Experimental fragment spectra are then matched to theoretical mass spectra derived from a protein sequence database and assigned to peptides and the corresponding proteins. Based on these matches, scores reflecting the statistical probability of peptide identity are assigned by various available search algorithms, such as Sequest, Mascot, X!Tandem, and Andromeda, and the false discovery rate is calculated [112, 113]. This approach to bottom-up protein identification is also known as “shotgun” proteomics.

1.2.4.3 Mass spectrometry-based protein quantification

Mass spectrometry-based quantitative methods can be label-free or can use labeled analogs, and may provide relative (differential) or absolute quantification. Typically, relative quantification is achieved with a label-free approach or with metabolic or chemical stable-isotope labeling (e.g. SILAC and iTRAQ, respectively), while absolute quantification is commonly performed with metabolic or chemical stable-isotope labeling (such as QconCAT and AQUA peptides, respectively) using a targeted mass spectrometry approach [114]. Targeted mass spectrometry and the related relative/absolute quantification are discussed in section 1.2.5.

Global protein quantification can be achieved along with protein identification. Label-free relative quantification is frequently based on spectral counting or on intensity-based quantification. Spectral counting is performed at the MS/MS level: the number of observed peptide spectra is taken as an indication of protein abundance, i.e. the more abundant a peptide is, the more often it will be sequenced, resulting in more observed MS/MS fragmentation events and thus a higher spectral count. However, this strategy can introduce bias and variability due to differing peptide properties (e.g. peptide length). Several modifications of the spectral counting technique have therefore been developed, such as the Normalized Spectral Abundance Factor (NSAF), which takes into account the length of the protein, or the Protein Abundance Index (PAI, emPAI), which accounts for the number of observed tryptic peptides per protein [115, 116]. The signal intensity or area under the curve (AUC) measurement is based on the peptide’s m/z ratio and elution retention time, resulting in MS1 intensities. This type of label-free quantification hence requires constant chromatographic conditions, with reproducible retention times (RT) and chromatograms. Commonly used MS1 peptide signal intensity methods are MaxQuant LFQ and intensity-based absolute quantification (iBAQ) [117, 118].
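As an illustration of length-normalized spectral counting, the sketch below computes NSAF in its usual form: each protein's spectral count is divided by its length, and the result is divided by the sum of these values over all proteins in the sample. The protein names, counts, and lengths are hypothetical.

```python
def nsaf(spectral_counts, protein_lengths):
    """Normalized Spectral Abundance Factor for each protein.

    spectral_counts: dict protein -> number of MS/MS spectra assigned to it
    protein_lengths: dict protein -> length in amino acids
    """
    saf = {p: spectral_counts[p] / protein_lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}

# Two hypothetical proteins with equal spectral counts but different lengths.
counts = {"protein_A": 10, "protein_B": 10}
lengths = {"protein_A": 100, "protein_B": 400}
abundance = nsaf(counts, lengths)
```

Although both proteins have the same raw count, the shorter protein receives the higher normalized value because it offers fewer peptides that could have been sequenced.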

In a label-based approach, protein quantification is based on the labeling of proteins or peptides with stable isotopes. The stable isotope labeling with amino acids in cell culture (SILAC) approach is based on the addition of heavy isotope-labeled (13C and 15N) amino acids to the cell culture, which results in metabolic labeling of proteins [119]. Typically, cell cultures labeled with the heavy isotopes are subjected to the desired experimental treatment (e.g. drug addition), while the other, unlabeled cells (containing light 12C and 14N isotopes) are treated as a control. Next, an equimolar mixture of the “heavy” and “light” lysates is processed for mass spectrometry sample preparation and analysis. The relative abundance of the peptides is then calculated from the heavy-to-light peptide ratio. SILAC provides accurate relative protein quantification, but it is restricted to actively dividing cells (such as cancer cell lines), and its use on primary cell cultures, biological fluids and tissue samples is thus limited. However, whole organisms (e.g. mice) can be labeled by feeding them a heavy-labeled diet [120].

Other approaches for relative protein quantification are based on non-metabolic, chemical labeling of peptides, using heavy or light isotope-labeled, chemically reactive tags. Isobaric tags for relative and absolute quantification (iTRAQ) consist of three parts, a reporter group, a balance group and a peptide-reactive group, and label peptides by binding to the amines of the N-terminus and lysine side chains. The labeling is performed on the protein digest, before samples with different tags are combined and subjected to mass spectrometry analysis. The reporter groups have different masses, which are “balanced” by the balance group so that the summed m/z remains constant [121]. This means that peptides from different samples labeled with these isobaric tags differ only when subjected to peptide fragmentation in the mass spectrometer (MS/MS). Relative peptide abundance is obtained by comparing the signal intensities of the iTRAQ reporter ions. Thus, iTRAQ provides multiplexing capabilities and, unlike SILAC, is applicable to the analysis of fluids and tissues, but it is also prone to variability introduced during sample preparation and to underestimation of the measured ratios [122]. Besides iTRAQ, other chemical labeling methods are available, such as tandem mass tags (TMT, amine-reactive tags, quantification based on MS/MS) and isotope-coded affinity tags (ICAT, cysteine-reactive tags, quantification based on MS1).


1.2.5 Targeted mass spectrometry protein quantification

The verification phase of biomarker development usually involves testing hundreds of potential candidates in a large number of samples, highlighting the need for a multiplex methodology that allows simultaneous measurement of many proteins while still preserving good accuracy and reproducibility. Selected reaction monitoring (SRM) is a targeted mass spectrometry platform, based on a priori knowledge of the protein and peptide of interest, for detection and relative or absolute quantification of proteins in a complex matrix [114]. An important benefit of this technology is the capability to measure hundreds of proteins simultaneously in a single run, typically without the need for sample enrichment or fractionation.

Protein quantification is based on prior selection of a targeted precursor, or proteotypic, peptide that is unique to the protein of interest, and of its corresponding fragment ions [123]. Such monitoring of specific pairs of a precursor and its fragment ions, called transitions, is commonly achieved with a triple quadrupole mass spectrometer. In an SRM experiment, digested peptides (separated by online-coupled LC and ionized with ESI) enter the first quadrupole (Q1) of the mass analyzer, where only peptides with the predefined m/z are selected; these are then fragmented in the second quadrupole (Q2) by collision-induced fragmentation. In the third quadrupole (Q3), fragment ions of specific m/z are selected and reach the detector, where signal intensity is recorded over time. The selected transitions and the RT are the principal and unique parameters of the SRM method for a specific peptide.

Relative quantification with SRM is achieved with spiked isotopically labeled peptide standards (heavy peptides) of unknown absolute amount. Absolute SRM quantification can be achieved with stable isotope dilution peptides of known amount (AQUA, typically containing a trypsin-cleavable tag) [124], quantification concatemers (QconCAT) [125] or heavy-labeled whole protein standards (protein standard absolute quantification, PSAQ) [126]. When protein quantification is performed with PSAQ, the endogenous and the standard protein are treated in the same way during the entire sample preparation, which accounts for the total variability of the pre-analytical and analytical phases; isotopically labeled peptides, on the other hand, are usually added before or after trypsin digestion.
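The role of the heavy peptide standard in SRM quantification can be sketched as follows. The sketch assumes that the integrated peak areas of all monitored transitions are summed, that the light (endogenous) and heavy (standard) peptides give an equal response, and that a single-point calibration is acceptable. The function names, peak areas and spiked amount are hypothetical.

```python
def light_to_heavy_ratio(light_areas, heavy_areas):
    """Relative quantity: summed light peak area over summed heavy peak area."""
    return sum(light_areas) / sum(heavy_areas)


def endogenous_amount(light_areas, heavy_areas, spiked_heavy_fmol):
    """Absolute quantity (fmol), assuming the heavy spike amount is known."""
    return light_to_heavy_ratio(light_areas, heavy_areas) * spiked_heavy_fmol


# Hypothetical integrated peak areas for three transitions of one peptide.
light = [1200.0, 800.0, 400.0]
heavy = [2400.0, 1600.0, 800.0]

ratio = light_to_heavy_ratio(light, heavy)
amount = endogenous_amount(light, heavy, spiked_heavy_fmol=100.0)
print(ratio, amount)
```

When the absolute amount of the heavy peptide is unknown, only the ratio can be compared between samples, which corresponds to relative quantification.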

Figure 1.2: Selected reaction monitoring (SRM) methodology. The SRM assay is based on the capabilities of the triple quadrupole mass spectrometer: the first quadrupole (Q1) acts as a filter for peptides with predefined m/z, which are then fragmented in the second quadrupole (Q2), while the third quadrupole (Q3) serves as a filter for preselected fragment ions. Finally, the fragment ion intensities are measured and integrated. ESI: electrospray ionization; I: signal intensity; RT: retention time.

1.2.5.1 Comparison of targeted mass spectrometry and immunoassays

The main advantages of an SRM assay are high specificity, good reproducibility, and multiplexing capabilities. Although the development of a high-quality targeted assay requires initial effort, it is still time- and cost-efficient compared to the development of an immunoassay such as an enzyme-linked immunosorbent assay (ELISA). Limitations of SRM assays are often related to lower sensitivity than ELISA for very low-abundance proteins in a complex mixture. Measurement of proteins in digested samples without any enrichment strategy typically allows quantification down to the low microgram or high nanogram per milliliter range [114]. However, in order to improve sensitivity to the lower nanogram per milliliter levels, samples can be prefractionated, or a typical SRM assay can be preceded by antibody-based enrichment of proteins or peptides [127, 128]. The additional sample manipulation can, on the other hand, reduce throughput, especially when fractionation methods are used prior to mass spectrometry analysis.

Mass spectrometry is considered a highly specific method for the measurement of biomarkers. Different protein isoforms, posttranslational modifications (e.g. diverse glycoforms), and even mutated protein forms (application of proteogenomics) can be accurately identified and quantified using mass spectrometry technologies.

In contrast, immunoassays can lack specificity (for example, in differentiating between protein isoforms, or in measuring the unbound or “free” vs. “total” (dimers, multimers, etc.) concentration of a specific protein). The critical step in immunoassay development is the generation of high-quality antibodies. The choice of antigens used in immunization, a reliable method for selecting the generated antibodies and a consistent source of antibodies are some of the critical factors in antibody development. Several recent publications highlight the potential problems with the use of antibodies of poor quality [129, 130]; this has led to proposals for better quality control of antibodies and to initiatives such as the sharing of validation data for publicly available antibodies through the portal Antibodypedia [131].

Immunoassays offer the highest sensitivity; however, high-quality, sensitive antibodies are a prerequisite. For instance, interleukin-6, a protein with a low concentration in plasma (normal range 0-5 pg/mL), can be quantified in a hospital laboratory using immunoassays, which is not possible with targeted mass spectrometry [97].


1.2.6 Tissue-specific databases

There are several ongoing efforts to catalogue gene transcripts and proteins in different fluids, tissues and cell types using various omics technologies and imaging. The results are then made available via public databases. Examples of such databases are BioGPS (www.biogps.org), based on mRNA expression (microarray), the Human Protein Atlas database (www.humanproteinatlas.org), based on mRNA (RNAseq) and protein expression (antibody-based expression), and the Human Proteome Map (www.humanproteomemap.org), based on protein expression (mass spectrometry analysis).

The Human Protein Atlas (HPA) is the most inclusive and informative proteomic database, consisting of three main atlases: the cell atlas, the normal tissue atlas and the cancer atlas [132]. The current version 16.1 of HPA provides information on the RNA of all, and the protein of 87% of, predicted human genes. The normal tissue atlas covers 44 different human tissues and a total of 76 different cell types. In addition, the HPA tissue proteome provides several sub-proteomes, such as the tissue-specific proteomes, the housekeeping proteome, the secretome and membrane proteome, the cancer proteome and the druggable proteome. The HPA tissue-specific proteome has been defined by high mRNA expression of genes in a particular tissue relative to the other body tissues, and the genes have been categorized into three groups: tissue-enriched (with at least five times higher mRNA expression in a particular tissue relative to other tissues), group-enriched (with at least five times higher mRNA expression in a group of 2-7 tissues) and tissue-enhanced genes (with at least five times higher mRNA expression in a particular tissue relative to the average expression in all other tissues). The organ with the highest number of tissue-elevated genes is the testis (n=2,220), while the brain, represented by the cerebral cortex, is the second organ, with 1,437 elevated genes (http://www.proteinatlas.org/humanproteome/tissue+specific). The Gene Ontology (GO) analysis of the brain-elevated genes indicates that the main functions of brain proteins are synaptic transmission and neurological processes, whereas most of the brain-enriched genes encode membrane-bound and secreted proteins. This comprehensive database provides an indispensable repository of the human proteome and its applications for disease diagnostics and drug discovery.
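The three categories of tissue-elevated genes can be illustrated with a minimal sketch that applies the five-fold criteria to the mRNA expression values of a single gene. This is a simplified reading of the definitions given above and not the HPA pipeline itself; the function name, the handling of zero expression and the example values are assumptions made for illustration.

```python
def classify_gene(expression, fold=5.0):
    """Classify one gene from a mapping of tissue name -> mRNA expression."""
    # Rank tissues from highest to lowest expression.
    tissues = sorted(expression, key=expression.get, reverse=True)
    values = [expression[t] for t in tissues]
    if len(values) < 2 or values[0] <= 0:
        return "not elevated", []

    # Tissue-enriched: the top tissue is at least `fold` times every other tissue.
    if values[0] >= fold * values[1]:
        return "tissue-enriched", tissues[:1]

    # Group-enriched: each of the top 2-7 tissues is at least `fold` times
    # every tissue outside the group.
    for k in range(2, 8):
        if k < len(values) and values[k - 1] > 0 and values[k - 1] >= fold * values[k]:
            return "group-enriched", tissues[:k]

    # Tissue-enhanced: a tissue is at least `fold` times the mean of all others.
    enhanced = []
    for tissue in tissues:
        others = [expression[o] for o in tissues if o != tissue]
        mean_others = sum(others) / len(others)
        if expression[tissue] > 0 and expression[tissue] >= fold * mean_others:
            enhanced.append(tissue)
    if enhanced:
        return "tissue-enhanced", enhanced
    return "not elevated", []


# Hypothetical expression values (arbitrary units).
print(classify_gene({"brain": 100, "liver": 5, "kidney": 2}))
print(classify_gene({"brain": 100, "testis": 80, "liver": 5, "kidney": 2}))
```

The categories are tested in order of decreasing stringency, so a gene that qualifies as tissue-enriched is not also reported as group-enriched or tissue-enhanced.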

Such databases can assist in biomarker discovery. For instance, in order to reduce the number of candidate biomarkers to a more realistic number suitable for the verification phase, additional filtering criteria can be applied to the list of preselected candidates.

1.3 Alzheimer’s disease CSF biomarkers

1.3.1 Cerebrospinal fluid

Cerebrospinal fluid (CSF) is a clear and transparent fluid that surrounds the brain and the spinal cord. It is in direct contact with the CNS, acting to protect, support and nurture brain tissue and to eliminate metabolic brain products (the so-called sink function). CSF is important for the homeostasis of the extracellular environment and for the hormonal-to-neuropeptide balance in the CNS [133].

1.3.1.1 CSF production and composition

It is widely accepted that the majority of CSF is produced by the choroid plexus, as a plasma ultrafiltrate, in the lateral, third and fourth ventricles, whereas a smaller portion, around 20%, is derived from the parenchyma via interstitial fluid and capillary endothelium [134]. CSF production is a dynamic process, with a rate of about 500 mL per day and a total volume of approximately 150 mL (i.e. a turnover of 3-4 times per day). Production is influenced by the circadian rhythm, with the highest secretion observed during the night [135].


The choroid plexus is composed of a single layer of ependyma, forming numerous villi at the apical side of the ventricles, which enlarge the surface of the plexus. Epithelial cells are connected by tight junctions close to the apical side, while on the basal side they line the basement membrane. Below the basement membrane is connective tissue with collagen, fibroblasts and nerve fibres. The capillaries are positioned at the center of each villus and are lined with fenestrated endothelial cells, which are more permeable than the capillary endothelial cells of the blood-brain barrier. The tight junctions between epithelial cells of the choroid plexus form the blood-CSF barrier [134]. These tight junctions prevent paracellular molecular transport from the blood into the CSF.

CSF is actively secreted by the epithelium of the choroid plexus into the ventricles through unidirectional secretion of ions, made possible by the polarity of the epithelium (a consequence of the arrangement of transporter proteins at the basolateral and apical membranes), which results in an osmotic gradient and the passage of water [134]. This net water flux into the ventricles is primarily mediated by the net flux of Na+, Cl- and HCO3- across the epithelial cells, facilitated by specialized protein transporters located at the basolateral and/or apical side; water is transported by diffusion or facilitated by water transporters.

The CSF matrix is relatively simpler than that of blood; it is an almost acellular fluid with a normal cell count of 0-4 white blood cells (WBC)/µL [136]. Due to the selective properties of the blood-CSF barrier, the CSF protein concentration is much lower (approximately 200 times) than in blood, typically in the range of 0.2 to 0.4 mg/mL. Among the electrolytes, Cl- and Mg2+ are present at higher concentrations, while K+, HCO3- and Ca2+ are present at lower concentrations than in plasma [137]. Apart from electrolytes, CSF also contains peptides and proteins, amino acids, sugars, organic acids, vitamins and hormones.


The blood-brain barrier is formed by the tight junctions between the endothelial cells of the parenchymal capillaries, which prevent direct contact between the blood and the brain interstitial fluid. The endothelial cells lie on a basement membrane, under which are the pericytes and the astrocytic processes. Molecules and ions therefore cross the endothelial cells either by passing through the membrane lipid layer (transmembrane diffusion) or by transporter-dependent passage. For example, small molecules such as oxygen and carbon dioxide cross by passive diffusion, while molecules such as glucose and amino acids use protein transporters to pass from the blood to the brain; these solute carriers can facilitate unidirectional or bidirectional transport of molecules into and out of the brain [138]. Drugs can pass by transmembrane diffusion or through various transporters, depending on their properties [139]. ATP-binding cassette transporters (ABC, active transporters) mediate efflux of non-polar lipophilic molecules out of the brain and endothelium and can thus reduce the penetration of such drugs. Larger peptides and proteins can be transported via receptor-mediated transcytosis (e.g. insulin, transferrin, immunoglobulin G) or via adsorptive transcytosis (e.g. albumin), both of which are vesicular transport systems. Under normal conditions, leucocytes can cross the blood-brain barrier via diapedesis. In the case of pathology (e.g. brain inflammation), tight junctions can be modified, which can facilitate paracellular movement of mononuclear cells [138]. The blood-brain and blood-CSF barriers share similar mechanisms for the transport of molecules across barriers; however, they differ in the expression of ion channels, transporters and the proteins that form tight junctions [140].

1.3.1.2 CSF flow and reabsorption

The CSF circulation within the brain is complex [141-143]. According to the generally accepted model, CSF moves from the ventricular cavities (from the lateral to the third and fourth ventricles) to the subarachnoid space by unidirectional bulk flow, circulating further downwards, surrounding the spinal cord, and upwards, surrounding the brain parenchyma (Figure 1.3). Most of the CSF is absorbed into the blood through the arachnoid villi (arachnoid granulations) into the superior sagittal venous sinus [134]. A small portion of CSF can drain to the peripheral lymphatic system along the cranial nerves [134]. In addition, recent studies have discovered the presence of dural lymphatic vessels in the brain, suggesting that they contribute to CSF drainage [142, 143]. A new working hypothesis suggests that CSF is constantly being produced and reabsorbed throughout the entire CSF system, through water exchange between the CSF system and the brain tissue across the capillary endothelium, challenging the “classical” model of CSF production, absorption and flow [141].

Figure 1.3: Simplified diagram of the cerebrospinal fluid flow. According to the “classical” model, cerebrospinal fluid (CSF) is largely produced in the choroid plexus (CP) of the lateral, third and fourth ventricles. CSF then travels into the subarachnoid space (SAS), from where it flows upwards towards the superior sagittal sinus (SSS), where most of the CSF is absorbed, and downwards into the SAS surrounding the spinal cord.


1.3.1.3 CSF proteins

Approximately 80% of the total CSF protein is derived from the plasma and another 20% is secreted by the CNS [133]. Plasma protein transfer into CSF is dependent on molecular size, with smaller molecules being transported faster and thus having a higher CSF/plasma ratio [144]. Proteins produced by the CNS can originate from neurons and glia or be synthesised by the choroid plexus and leptomeninges [145].

Despite lower protein concentration in CSF relative to blood, the most abundant proteins found in plasma are also present in highest abundance in CSF (e.g. albumin) [133]. The concentration of brain-derived proteins is generally much higher in CSF compared to plasma, while plasma-derived proteins have lower concentration in CSF compared to blood [145].

Examples of proteins with a higher CSF concentration and high CSF-to-blood serum ratios include prostaglandin D2 synthase (ratio 34/1), S-100B (18/1), tau protein (10/1), and cystatin C (5/1) [145, 146]. Brain-derived proteins are not necessarily brain-specific, since they can also be produced by other tissues. Some of the brain-derived proteins have high specificity for cerebral tissue and certain cell types (e.g. the neuronal and glial proteins S100B, NSE and tau) and have been associated with several neurological pathologies [145]. For example, elevated CSF levels of S100B and NSE were found in stroke patients and in Creutzfeldt-Jakob disease, while tau protein (total and phosphorylated tau) is known to be elevated in AD [4, 145]. Blood-related proteins such as apolipoprotein B-100 and hemoglobin are commonly used as an indication of blood contamination of CSF, since they are not normally present in CSF [133, 147].

The protein concentration differs between the sites of the CSF system and is also dependent on the CSF flow rate and the origin of the protein found in CSF. The concentration of plasma-derived proteins is higher in the lumbar region than in the ventricles, increasing along the rostro-caudal (ventricular-to-lumbar) gradient. In contrast, the concentration of neuronal and glial brain-derived proteins decreases from the ventricular to the lumbar region, while the concentration of leptomeningeal proteins increases [144, 145]. Nevertheless, there are also reports suggesting no change in the concentration of brain-derived proteins between the ventricular and lumbar space (investigated in pathological CSF samples) [148, 149].

Flow rate is influenced by age, decreasing with advanced age. A decrease in flow rate results in an increase of blood-derived and leptomeningeal proteins in CSF, while the neuronal and glial protein concentration remains constant, as suggested by Reiber [145]. Using a sheep model, Chen et al. observed an age-dependent increase in plasma proteins, but a decrease in brain-derived and choroid plexus-derived proteins [150].

1.3.2 Types of Alzheimer’s disease biomarkers

In 1998 the Ronald and Nancy Reagan Research Institute of the Alzheimer’s Association and the National Institute on Aging Working Group defined the characteristics of an ideal AD biomarker. According to the Working Group, such a biomarker should reflect pathological processes in the brain, be validated in post-mortem confirmed AD cases, be able to precisely differentiate AD from other forms of dementia, and allow AD detection at an early stage of the disease course. Moreover, an ideal biomarker measurement should be reliable, reproducible, non-invasive (or moderately invasive), simple to perform and inexpensive. The sensitivity and specificity (separating AD from other dementias) of such a biomarker should be >80%, while the positive predictive value should be ≥80% [151]. As mentioned in the previous section, and according to the Working Group, biomarkers can be used for several purposes, such as diagnosis, screening, prediction and progression, as well as to monitor response to treatment and to determine the relationship between pathology and clinical signs. Still, significant interest lies in early AD diagnostic biomarkers, driven in part by the need for biomarkers in clinical trials testing disease-modifying treatments in the early stages of disease.
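The performance criteria quoted above can be made concrete with a short sketch that derives sensitivity, specificity and positive predictive value from the counts of a two-by-two diagnostic table and compares them with the thresholds stated in the text. The function names and the example counts are hypothetical; the positive predictive value additionally depends on the disease prevalence in the tested population.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity and PPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # AD cases correctly called positive
        "specificity": tn / (tn + fp),  # non-AD cases correctly called negative
        "ppv": tp / (tp + fp),          # positive calls that are true AD cases
    }


def meets_working_group_criteria(metrics):
    """Apply the thresholds quoted in the text (>80%, >80%, >=80%)."""
    return (
        metrics["sensitivity"] > 0.80
        and metrics["specificity"] > 0.80
        and metrics["ppv"] >= 0.80
    )


# Hypothetical cohort: 100 AD patients and 100 patients with other dementias.
metrics = diagnostic_metrics(tp=85, fp=15, tn=85, fn=15)
print(metrics, meets_working_group_criteria(metrics))
```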

AD biomarkers connected to the pathophysiological changes of the disease are categorized as core biomarkers, which can be separated into those that reflect Aβ pathology and those that reflect NFT pathology and axonal degeneration. Imaging and CSF biomarkers encompass both of these categories.

Biomarkers that reflect Aβ brain deposition are CSF levels of Aβ1-42 (decreased in AD) and PET Aβ (increased Aβ tracer retention on PET, for example with the Pittsburgh compound B tracer, PiB-PET, which binds to fibrillar Aβ). Biomarkers that reflect neuronal dysfunction and degeneration are CSF levels of tau protein (t-tau and p-tau) (increased in AD), abnormal FDG-PET imaging (decreased FDG uptake), and abnormal structural MRI (indicating brain atrophy) [2].

These biomarkers have been evaluated as diagnostic and prognostic/prediction biomarkers, the latter for predicting progression both from MCI to AD and from cognitively normal status to MCI/AD [152]. In addition, they have been used in clinical trial settings to monitor target engagement, to track disease modification and for patient enrichment [153, 154].

1.3.2.1 Comparison of imaging and CSF core biomarkers

Autopsy and ante-mortem (in vivo) studies demonstrated that CSF Aβ1-42 levels correlate negatively with amyloid deposition in the brain [155], suggesting that low CSF Aβ1-42 is a consequence of Aβ1-42 accumulation in the brain [4]. The PiB ligand binds to fibrillar brain Aβ, allowing in vivo monitoring of Aβ deposition; PiB-PET imaging likewise showed mostly good agreement with post-mortem Aβ studies [28]. The two biomarkers of Aβ accumulation correlate with each other: reduced CSF levels of Aβ1-42 correlate with positive PiB-PET imaging (high binding) [156]. Neither PiB-PET nor CSF Aβ1-42 correlates well with cognitive decline [157].

It has been suggested that CSF t-tau is a measure of neuronal and axonal injury and degeneration due to tau pathology; high t-tau levels have been found to correlate with NFT accumulation observed post-mortem [155]. High t-tau levels are, however, not specific for AD pathology, and high CSF levels have been observed in acute brain injuries, such as stroke [158] and brain trauma [159], as well as in different forms of dementia (e.g. FTD, DLB) [160]; the highest CSF t-tau increase has been observed in Creutzfeldt-Jakob disease [161]. High CSF tau levels have been suggestive of faster progression from MCI to AD [162] and more rapid cognitive decline in AD patients [163].

CSF p-tau seems to be more specific for AD pathology than t-tau, with increased levels reflecting tau phosphorylation and the formation of NFT. It also shows good correlation with the autopsy examination of NFT and with the rate of atrophy in the hippocampal region, as well as with fast progression from MCI to AD. In addition, CSF levels of t-tau and p-tau were found to correlate strongly with each other in AD patients [4, 157].

FDG-PET imaging is a measure of brain glucose metabolism, reflecting synaptic function. The typical pattern of FDG uptake is characterized by decreased metabolism in the temporo-parietal and posterior cingulate brain regions [164, 165]. Autopsy and ante-mortem studies confirmed a good correlation between positive FDG-PET imaging in AD patients and NFT neuropathology; in addition, reduced FDG uptake has been shown to correlate with greater cognitive impairment. Another imaging biomarker is structural MRI, which indicates brain atrophy resulting from neuronal and synaptic loss, typically presenting in the medial temporal lobe, paralimbic and temporo-parietal regions. Atrophy on structural MRI is not a specific feature of AD pathology, but the grade of atrophy correlates with disease severity and neuronal loss, as well as with the autopsy Braak staging of NFT pathology and with tau immunostaining [157]. In addition, it has been proposed that synaptic dysfunction and brain atrophy correlate better with cognitive decline than NFT and plaque load.

1.3.2.2 Dynamic model of AD biomarkers

It has been widely accepted that AD pathology appears gradually, over decades, before the first clinical signs become evident, with the dementia representing the end stage of the disease pathological changes.

CSF biomarkers change over time, and the extent of this change depends on the disease course [153]. A longitudinal study evaluating CSF biomarkers at the stage of AD dementia (patients with mild-to-moderate AD) indicated that t-tau levels did not change over a 2-year period of disease progression, and no correlation with cognitive decline was observed [166]. In another study, all three biomarkers were stable in AD patients over a 6-month period [167]. Similarly, they seem not to correlate well with cognitive decline, suggesting that they may not be good indicators of disease progression [168]. The change in biomarkers over time and its relation to cognitive function have to be further evaluated in much longer follow-up studies.

Several models have recently been proposed that describe biomarker change over time in the context of AD development. The initial model by Jack et al. considers the change of AD biomarkers across disease stages. As suggested by the model, the change of the biomarkers over the course of disease progression (preclinical-MCI-AD) typically follows a sigmoidal curve, eventually reaching a plateau phase as the disease progresses. According to this model, Aβ biomarkers become abnormal first, early in the disease course, even before clinical signs of disease are present. Neurodegenerative biomarkers become abnormal later, and they correlate with disease severity. This model was designed to put into perspective the biomarker dynamics over the course of all AD stages (cognitively normal-preclinical, MCI, AD dementia), to help with AD staging [28].

In an updated model by Jack et al., changes are modelled over the time of disease progression rather than over clinical stages [157]. CSF Aβ1-42 is the first biomarker to become abnormal during disease development, followed closely by amyloid PET imaging, after which CSF tau becomes abnormal; MRI and FDG-PET become abnormal last and simultaneously (the final event on the time course is the occurrence of cognitive decline).

In this model, high and low risk of developing AD were taken into account, since these can change the time of onset of cognitive decline; for instance, individuals at high risk of AD (e.g. carriers of genetic risk alleles, or individuals with low brain reserve or co-pathologies) have an earlier onset of cognitive impairment than low-risk individuals (those carrying protective mutations or having higher brain reserve).

In the same report, Jack and colleagues also proposed an additional model, which recognizes evidence of tau as the initial pathological event in AD. This model is based on both autopsy evidence and in vivo (biomarker) evidence and thus differs from the previous model, which is based solely on biomarker evidence of AD pathology. It suggests that tau pathology might appear early in time, before amyloid biomarkers, but cannot be detected with in vivo biomarkers. Tau and amyloid each follow an independent pathological pathway early in the disease course. Amyloid pathology thus accelerates independently over time, reaches the detection threshold and is detected first with in vivo biomarkers. This in turn accelerates tau pathology, with subsequent detection of CSF tau. MRI and FDG-PET biomarkers become abnormal last, followed by cognitive impairment [157].

1.3.3 Clinical use of CSF AD biomarkers

Diagnosis of AD based on the clinical criteria defined by NINCDS-ADRDA can achieve a sensitivity in the range of 71 to 87% and a specificity of 44 to 71% when compared to autopsy-confirmed diagnosis, as assessed by studies in elderly patients at specialized AD clinics [169]. It has been recognized that the diagnostic accuracy may be lower at primary care centers [1]. The high clinical misdiagnosis rate is related to cases of early AD, atypical AD and mixed dementias [170].

The use of biomarkers in the clinic could thus be beneficial for more accurate AD diagnosis. Recently, a consensus report was published by members of the Alzheimer’s Biomarkers Standardization Initiative (ABSI), suggesting that CSF biomarkers should be considered and measured in individuals with memory complaints and/or individuals admitted to a memory clinic because of memory decline. This would encompass subjects with MCI, early onset dementia, atypical clinical symptoms or a complex clinical differential diagnosis [170]. The ABSI members indicated that an estimate of cost-effectiveness should be made to assess the clinical utility of the biomarkers. They recommended the use of all three biomarkers (Aβ1-42, t-tau and p-tau-181) for an accurate diagnosis, in combination with clinical history, neurophysiological examination and neuroimaging data, with cut-off values determined by each laboratory. As will be discussed in the following sections, one of the main obstacles to the wide use of these biomarkers in clinical routine is the significant variation in values between different centers, which has resulted in a lack of established universal diagnostic cut-off values. Therefore, different studies evaluating AD CSF biomarkers commonly use diverse cut-off values [171, 172].

Sjogren and colleagues proposed reference values for Aβ1-42 and t-tau based on 291 neurologically healthy individuals aged 21 to 93 years. Since they observed a correlation of t-tau with age, they stratified the values by age range: t-tau was <300 pg/mL for 21-50 years, <450 pg/mL for 51-70 years and <500 pg/mL for 71-93 years. Levels of Aβ1-42 were not influenced by age and were >500 pg/mL [173]. However, according to the ABSI consensus, there is no need for stratification of AD biomarkers by age, and most studies do not include age-dependent cut-offs [170].
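The age-stratified reference values reported by Sjogren and colleagues can be expressed as a simple lookup, sketched below. The function names are hypothetical, ages are assumed to be given in whole years, and the sketch is meant only to illustrate the stratification described above, not to serve as a diagnostic rule.

```python
def t_tau_upper_limit(age_years):
    """Upper reference limit for CSF t-tau (pg/mL) by age range [173]."""
    if 21 <= age_years <= 50:
        return 300
    if 51 <= age_years <= 70:
        return 450
    if 71 <= age_years <= 93:
        return 500
    raise ValueError("age outside the studied range of 21-93 years")


def within_reference(age_years, t_tau_pg_ml, abeta42_pg_ml):
    """True if both analytes fall within the reported reference values."""
    # The Abeta1-42 reference value (>500 pg/mL) was not age dependent.
    return t_tau_pg_ml < t_tau_upper_limit(age_years) and abeta42_pg_ml > 500


# Hypothetical measurement in a 65-year-old individual.
print(within_reference(65, t_tau_pg_ml=400, abeta42_pg_ml=600))
```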

1.3.4 Pre-analytical factors and AD CSF biomarkers

The impact of pre-analytical factors on AD CSF biomarkers (Aβ1-42, t-tau and p-tau) has been investigated in several studies [174-176]. Polypropylene tubes have been recommended for CSF collection, since other types of tubes, such as polyethylene or glass tubes, showed a reduction in Aβ1-42 levels [177]. However, it has been reported that even different commercial polypropylene tubes, which are significantly heterogeneous in their composition, resulted in variations in the concentration of all three biomarkers, with maximum median variations among tubes of -48% to +31% for Aβ1-42, -8% to +8% for t-tau and -4% to +6% for p-tau; the largest differences were thus observed for Aβ1-42. This effect was explained by the adsorption of the hydrophobic Aβ1-42 peptide to the collection tubes [176].

Other factors can also contribute to biomarker variability, such as storage temperature at -20 or -80 degrees (for t-tau and p-tau, but not Aβ1-42), delay before storage (Aβ1-42, p-tau) and freeze/thaw cycles (Aβ1-42, t-tau). It has been recommended to store CSF at -80 degrees as soon as possible after collection and to allow a maximum of 3 freeze/thaw cycles [175]. Some discrepancies between studies regarding pre-analytical variations have been reported, e.g. for variations in Aβ1-42 related to the influence of CSF centrifugation, delayed sample storage, or diurnal variation [174, 175, 177, 178].

Variations in AD CSF biomarkers due to pre-analytical influences highlight the need for standardized procedures for sample collection and processing. To address these concerns, the ABSI has recognized the pre-analytical impact on existing AD CSF biomarkers and has given recommendations for sample collection, handling and storage in order to reduce the variability caused by these factors [177]. Although some pre-analytical factors are still under evaluation (e.g. diurnal variation, different polypropylene tubes), the ABSI group currently recognizes no impact of circadian rhythm, fasting, or CSF gradient on biomarker concentration; CSF collection should be performed by lumbar puncture between the L3-L5 vertebrae into polypropylene tubes, centrifugation was recommended only for blood-contaminated samples, and sample storage was recommended at -80 degrees (-20 degrees for less than 2 months) with up to 2 freeze/thaw cycles [177].

1.3.5 Analytical methods used for evaluation of CSF biomarkers

The most common assays used to measure the CSF AD biomarkers Aβ1-42, t-tau and p-tau are immunoassays, especially ELISA and the Luminex xMAP assay, followed by the recently introduced electrochemiluminescence (ECL) method.

1.3.5.1 ELISA method

The most widely used ELISA platform for measurement of CSF Aβ1-42, t-tau and p-tau (181) is the Innotest ELISA [179, 180]. These assays are sandwich-type immunoassays performed on a solid-phase, 96-well plate. The Aβ1-42 ELISA is based on a combination of specific monoclonal antibodies: 21F12 (targeting the C-terminus) and biotinylated 3D6 (targeting the N-terminus) [180]. The t-tau ELISA is based on the AT120 capture monoclonal antibody, which recognises an epitope common to all tau isoforms (irrespective of phosphorylation), and two biotinylated antibodies (HT7, which binds irrespective of phosphorylation, and BT2, which binds non-phosphorylated tau). The p-tau ELISA targets phosphorylation at Thr 181, using the capture monoclonal antibody HT7 (which captures all isoforms) and detection with the biotinylated monoclonal antibody AT270 (specific for Thr 181) [181]. Detection is by a streptavidin-peroxidase-based colorimetric reaction. The analytical range of the calibrators is 125-2,000 pg/mL for Aβ1-42, 75-1,200 pg/mL for t-tau and 15.6-500 pg/mL for p-tau (181) [19].

1.3.5.2 Luminex xMAP method

The first simultaneous measurement of all three biomarkers (Aβ1-42, t-tau and p-tau 181) was performed with Luminex xMAP technology (INNO-BIA AlzBio3 kit). This is a sandwich, bead-based immunoassay, with capture antibodies covalently bound to microspheres (each analyte-specific microsphere carries a different fluorescent label) and fluorescently labeled detection antibodies. The combination of labeled detection antibody and labeled microsphere, analyzed by flow cytometry, identifies the antigens of interest in the same sample [182]. The method uses a combination of antibodies similar to that of the Innotest ELISA: for Aβ1-42, the capture monoclonal antibody is 4D7A3 (which recognizes the C-terminus) and the detection antibody is the same as for ELISA (3D6). For t-tau, the capture antibody is AT120 and the detection antibody is HT7, while the antibody combination for p-tau is the reverse of that used in ELISA (capture AT270, detection HT7). The range of the calibrators is 60-2,000 pg/mL for Aβ1-42, 25-1,500 pg/mL for t-tau and 10-250 pg/mL for p-tau (181). Compared to ELISA, xMAP showed a wider measurement range and different absolute biomarker values; nevertheless, a correction factor was suggested to convert xMAP values to ELISA-equivalent values [19].


1.3.5.3 Electrochemiluminescence method

Recently, ECL has been introduced as a novel type of immunoassay in dementia CSF biomarker research. The assay is based on a biotin-labeled capture antibody and a ruthenium complex-labeled detection antibody. The advantages of this type of assay are a shorter analysis time, a smaller required sample volume, high sensitivity and a large dynamic range. So far, two ECL platforms have been evaluated, from Meso Scale Discovery (MSD) and Roche Diagnostics, for quantification of t-tau and Aβ1-42 in CSF, in addition to other Aβ species and soluble APP forms (e.g. Aβ1-38, Aβ1-40, sAPPβ and sAPPα) [19, 183-185]. The analytical performance of the MSD platform for Aβ1-42 and t-tau indicated a larger calibrator range (Aβ1-42: 0.19-1,370 pg/mL, t-tau: 4.39-9,600 pg/mL) and thus lower detection limits compared to ELISA and xMAP technology; however, the MSD calibrators were aqueous-based [19]. The novel Elecsys Aβ1-42 ECL assay developed by Roche Diagnostics is a fully automated assay with low lot-to-lot and between-laboratory variability and improved total assay variability (CV = 2-5%) compared to the other immunoassay platforms (discussed in section 1.3.5.4); the assay was linear across the working range, from 200 to 1,700 pg/mL [185].

1.3.5.4 Analytical evaluation of ELISA, xMAP and MSD platforms

Comparison of the analytical performance of the ELISA, xMAP and MSD assays showed a wider analytical range for the MSD system (for Aβ1-42 and t-tau), while the variability in measured levels, estimated from multicentre studies, was higher for xMAP and MSD than for ELISA [19, 186, 187]. The analytical performance of the ELISA, xMAP and MSD assays is summarized in Table 1.1.


Table 1.1: Analytical characteristics of ELISA, xMAP and MSD assays.

                                 ELISA        xMAP         MSD
Calibrators' range (pg/mL)¹:
  Aβ1-42                         125-2,000    60-2,000     0.19-3,170
  t-tau                          75-1,200     25-1,500     4.39-9,600
  p-tau                          15.6-500     10-250       NA
Overall variability (CV, %)²:
  Aβ1-42                         17-29        17-38        13-36
  t-tau                          17-27        13-28        NA
  p-tau                          12-28        11-30        NA

¹ Based on Kang et al. 2013.
² Overall variability (within-run, within-laboratory, between-laboratory and lot-to-lot variability), based on Mattsson et al. 2013.

Measured levels of CSF Aβ1-42, t-tau and p-tau vary across studies, which makes it difficult to establish universal diagnostic cut-off values for these biomarkers. The observed variations can be a consequence of pre-analytical and analytical factors. Because of these limitations of the current technologies, the Alzheimer’s Association quality control program was launched to monitor the analytical performance of different platforms and to investigate the sources of variability for Aβ1-42, t-tau and p-tau in CSF.

The first report of the quality control group assessed variation in biomarkers using the ELISA, xMAP and MSD platforms among 40 laboratories. They observed substantial between-laboratory variations among platforms, with total coefficients of variation (CV) ranging from 13% to 36% and the highest variation noted for the xMAP and MSD systems. Total variation was 16-28% (CV) for ELISA, 13-36% (CV) for xMAP and 16-36% (CV) for the MSD assay [187]. In the subsequent report from the Alzheimer’s Association quality control program group, biomarker variation was evaluated among 84 laboratories for the three platforms. The overall variation (including between-laboratory, within-run, within-laboratory and lot-to-lot variation) was around 20-30% (CV). The within-run variation was estimated to be <5-10% (CV) and the within-laboratory variation 5-19% (CV), while the between-laboratory variation was 19-28% (CV) and accounted for the majority of the overall variation. The variability in Aβ1-42 levels was attributed to both between-laboratory and lot-to-lot variation, while the variability in t-tau and p-tau was related more to between-laboratory performance [186].
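As a rough illustration of why the between-laboratory component dominates the overall variation, the following sketch combines component CVs as a root sum of squares. It assumes that the components are independent and non-overlapping, which is a simplification, and the function name and example values (picked from within the ranges quoted above) are hypothetical rather than taken from the cited reports.

```python
import math

def combined_cv(*component_cvs_percent):
    """Combine independent CV components (in %) as a root sum of squares."""
    return math.sqrt(sum(cv ** 2 for cv in component_cvs_percent))

# Hypothetical example values chosen from within the quoted ranges (assumption).
within_run = 7.0
within_lab = 12.0
between_lab = 23.0

total = combined_cv(within_run, within_lab, between_lab)
# Share of the total variance contributed by the between-laboratory component.
between_lab_share = between_lab ** 2 / total ** 2
print(f"overall CV = {total:.1f}%, between-laboratory share = {between_lab_share:.0%}")
```

Because variances rather than CVs add under this assumption, the largest component contributes disproportionately to the total.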

These analytical and manufacturing-related differences in biomarker measurements are one of the main obstacles to determining uniform cut-off values for AD biomarkers.

1.3.5.5 Mass spectrometry quantitative methods

Lame et al. developed a selected reaction monitoring (SRM)-based mass spectrometry method for measurement of Aβ1-38, Aβ1-40 and Aβ1-42 peptides, using mixed-mode solid phase extraction (SPE) as an enrichment method [188]. A subsequent study employed improved SRM-based quantification (a scheduled method monitoring three transitions per peptide), also using SPE for purification of the Aβ1-38, Aβ1-40 and Aβ1-42 peptides. With this assay, higher levels of Aβ1-42 peptide were observed compared to ELISA, and the SRM method for Aβ1-42 reliably separated controls from AD patients [189]. A candidate reference measurement procedure for quantification of Aβ1-42 in CSF was proposed by Leinenbach et al. and the IFCC Working Group on CSF Proteins [190]. The assay was based on parallel reaction monitoring (PRM) on a high-resolution mass spectrometer. The linear working range for Aβ1-42 was 150 to 4,000 pg/mL, with intra-assay and inter-assay imprecision of 5% and 6.4%, respectively. This method can ultimately be used to assign the Aβ1-42 concentration in a certified reference material, with the goal of standardizing and harmonizing methods for Aβ1-42 measurement and establishing cut-off values.

In addition to Aβ, mass spectrometry-based quantification has been developed for tau protein. In a recent study, after initial immuno-enrichment with a monoclonal antibody, total tau protein was quantified using a tryptic peptide common to all six tau isoforms. The method differentiated AD patients from healthy controls with a 1.7-fold change, which was somewhat lower than the difference observed with a total tau immunoassay in the same study; good correlation was observed between the two methods [191]. Another MS-based tau quantification was an SRM method developed without prior immunoprecipitation of tau protein, using instead protein precipitation with perchloric acid followed by SPE and protein digestion. In this method, 7 tau peptides covering different parts of the protein sequence were monitored in CSF pools. Levels observed with ELISA were 17 to 25 times lower than the SRM levels [192]. An extensive quantitative analysis of various tau peptides was achieved using PRM-MS and protein precipitation/SPE as the extraction method, showing that the abundance of tau peptides varies with their position in the protein sequence, with the highest abundance observed for peptides located in the middle of the sequence [193].

1.3.6 Diagnostic performance of CSF biomarkers

1.3.6.1 Diagnostic performance based on the methods

Early investigations of Aβ started in the 1980s with the isolation and characterization of Aβ from the brains of AD patients with extensive cerebrovascular amyloidosis [194]. The finding that Aβ can be detected and quantified in the CSF of healthy individuals suggested that this protein may have diagnostic value for AD [195]. The development of an ELISA specific for the Aβ1-42 form by Motter et al. indicated that CSF Aβ1-42 levels are decreased in AD patients compared to controls, while total Aβ levels remain unchanged [196]. An approximately 2-fold reduction in Aβ1-42 was observed in AD compared to control cases, with a sensitivity of 86% at 90% specificity (Innotest ELISA) [181].

One of the initial studies evaluating tau protein in CSF by ELISA was performed by Vandermeeren et al. in 1993, using polyclonal antibodies for detection [197]. The study demonstrated that tau can be detected in CSF, with increased tau levels in AD patients compared with controls; the authors also found an overlap in tau levels between AD and other neurological disorders. Additional studies evaluating tau levels in CSF also observed an increase of tau in AD, further confirming that tau is an important CSF marker of neuronal degeneration [179, 198]. An approximately 3-fold increase in t-tau was observed in AD compared to healthy elderly controls, with 81% sensitivity and 90% specificity (Innotest ELISA) [181].

In terms of p-tau, the forms phosphorylated at residues 231 and 181 are the most commonly investigated. Independent studies showed an increase of CSF p-tau 231 and p-tau 181 levels in AD patients compared to the control group and additionally showed improved discrimination between AD and other neurological diseases; a strong correlation between p-tau and t-tau was observed as well [179, 199-201]. For discriminating AD from healthy controls, the mean sensitivity of different p-tau ELISAs is 80% at a specificity of 92% [181]. The performance of different p-tau epitopes has been examined; the results showed that p-tau forms, including p-231, p-181 and p-199, were significantly higher in AD in comparison with other dementias, other neurological disorders and controls. All three assays showed similar performance in discriminating AD from non-demented controls. A specificity of ≥75% was achieved for p-181 and p-231 in discriminating AD from the non-AD group when sensitivity was set to ≥85%; specifically, discrimination between AD and FTD was improved with p-231, while p-181 maximized discrimination between AD and LBD [201].

Consistent with previous ELISA findings, the initial xMAP study demonstrated decreased Aβ1-42 levels in AD compared to healthy controls (sensitivity 91%, specificity 81%) and other neurological diseases (specificity 75%). t-tau and p-tau levels were also increased in AD patients in comparison with healthy controls (t-tau: sensitivity 83%, specificity 83%; p-tau: sensitivity 72%, specificity 76%) and other neurological disorders (t-tau: specificity 89%; p-tau: specificity 95%). In addition, lower levels of Aβ1-42 and increased levels of t-tau were found in MCI patients who progressed to AD than in controls (Aβ1-42 sensitivity 67%, t-tau sensitivity 80%), while p-tau levels were not significantly increased [182].

Pan et al. assessed the clinical performance of the Aβ1-42 and t-tau markers in AD using the MSD platform; the t-tau/Aβ1-42 ratio showed the best performance in distinguishing AD from control subjects (sensitivity 86-87%, specificity 72-84%) [202].

1.3.6.2 Overall diagnostic performance

The overall diagnostic performance of CSF AD biomarkers across different studies suggests a sensitivity and specificity of approximately 80-90% for Aβ1-42, t-tau and p-tau in differentiating AD from cognitively healthy elderly [181]. However, the performance of single biomarkers showed wide variation across studies and substantial overlap between control and AD groups [19, 203]. Moreover, CSF Aβ1-42 and t-tau lack specificity when used to differentiate AD from other types of dementia in which these biomarkers also appear abnormal (e.g. a decrease in CSF Aβ1-42 and an increase in t-tau can be found in dementia with Lewy bodies and FTD), while p-tau is considered to be more specific for AD [201].


The reasons for lower specificity include the mixed etiologies of dementias, overlapping pathologies and clinical signs, and the presence of AD pathological changes in cognitively normal (asymptomatic) individuals. All of these can affect differential biomarker performance and exclude the possibility of reaching 100% diagnostic accuracy. If the clinical diagnosis of recruited subjects is not accurate and the relevant groups are not homogeneous (e.g. the control group includes preclinical AD, or diagnosed AD cases actually have mixed dementia or DLB), the performance of the biomarker under evaluation is underestimated.

The approximate individual sensitivity and specificity of Aβ1-42, t-tau and p-tau for discriminating AD from other forms of dementia (sensitivity = 75%, 75% and 75%; specificity = 71%, 78% and 77%, respectively) are lower than for discriminating AD from cognitively healthy individuals (sensitivity = 82%, 82% and 79%; specificity = 83%, 86% and 79%, respectively) [19]. However, the combination of these biomarkers discriminates AD better, both from other forms of dementia and from cognitively healthy individuals. For example, the combination of Aβ1-42 and t-tau yielded a sensitivity of approximately 86% and a specificity of 84% between AD and non-AD dementias, while the combination of Aβ1-42 and p-tau resulted in a sensitivity of 96% and a specificity of 89%. Similar performance was observed when combining Aβ1-42 and t-tau to differentiate AD from healthy controls (89% sensitivity and specificity) [19].

Several studies demonstrated the diagnostic relevance of the combined biomarkers in predicting progression from MCI to AD. Higher t-tau and lower Aβ1-42 levels were observed in a small cohort of patients with MCI who progressed to AD compared to stable MCI patients and healthy controls (but not AD patients), with a combined biomarker sensitivity of 90% (follow-up period of 18 months, measured by ELISA) [204]. A longitudinal study monitored progression of MCI to AD (follow-up period of 4 to 6 years) with xMAP technology, measuring Aβ1-42, t-tau and p-181 tau in MCI patients and healthy controls. Patients who converted to AD had lower Aβ1-42 and higher t-tau and p-181 tau at baseline compared to MCI patients who remained stable, MCI patients who developed other dementias and the control group. The combination of Aβ1-42 and t-tau at baseline resulted in 95% sensitivity and 83% specificity for progression of MCI patients to AD (with a positive predictive value (PPV) of 81% and a negative predictive value (NPV) of 96%). Similar performance was observed with p-tau and Aβ1-42 in combination, as well as with t-tau and the Aβ1-42/p-tau ratio [172]. The performance of combined CSF biomarkers was further examined in larger, multicenter studies [171, 205, 206]. In one of these studies, Mattsson and colleagues recruited a total of 750 MCI patients, monitored for at least 2 years, to evaluate the performance of CSF biomarkers in identifying incipient AD; the combination of Aβ1-42/p-tau and t-tau resulted in a sensitivity of 83% and a specificity of 72% (PPV 62%, NPV 88%), with inter-site variability observed in the assays; these values were lower than those reported from single-center studies [171]. Overall, the sensitivity and specificity of all three biomarkers for identifying MCI that progresses to AD seem to be similar to or somewhat lower than those found for AD cases.
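The positive and negative predictive values reported alongside sensitivity and specificity depend on the proportion of MCI patients who actually convert in a given cohort, which may partly explain why they differ between studies. A minimal sketch of this relationship, using Bayes' rule, is given below; the function name and the prevalence values are assumptions chosen for illustration and are not taken from the cited studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence (all as fractions)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Sensitivity and specificity as quoted above for the Abeta1-42 and t-tau combination;
# the prevalence values (fraction of MCI patients converting to AD) are assumed.
for prevalence in (0.2, 0.4, 0.6):
    ppv, npv = predictive_values(0.95, 0.83, prevalence)
    print(f"prevalence={prevalence:.0%}: PPV={ppv:.0%}, NPV={npv:.0%}")
```

For a fixed sensitivity and specificity, the PPV rises and the NPV falls as the proportion of converters in the cohort increases.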

Based on longitudinal studies, the diagnostic sensitivity and specificity for detecting prodromal AD in MCI patients typically range from 75% to 95% [172, 207]. Considering the variation in reported accuracy and in absolute biomarker values between centers, there is still a need for more research directed toward standardization of the methods, as well as for longitudinal studies with longer patient follow-up when evaluating progression from MCI to AD, since the approximate rate of conversion is 10-15% per year.
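To illustrate why the length of follow-up matters, the sketch below estimates the cumulative fraction of an MCI cohort expected to have converted to AD after a given number of years. It assumes a constant annual conversion rate applied to those who have not yet converted, which is a simplification; the function name is hypothetical.

```python
def cumulative_conversion(annual_rate, years):
    """Fraction of an MCI cohort converted after `years`, assuming a constant
    annual conversion rate among patients who have not yet converted."""
    return 1 - (1 - annual_rate) ** years

for rate in (0.10, 0.15):
    for years in (2, 5, 10):
        converted = cumulative_conversion(rate, years)
        print(f"annual rate {rate:.0%}, {years} years: {converted:.0%} converted")
```

Under this assumption, a short follow-up leaves many eventual converters still classified as stable MCI, which would tend to lower the apparent specificity of a baseline biomarker.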

AD pathology can be present in cognitively healthy elderly, with positive amyloid brain pathology (PiB-PET/CSF Aβ1-42) estimated in approximately 20-40% of clinically normal individuals [66]. AD biomarkers have been investigated for predicting progression from the asymptomatic stage to the MCI stage; low levels of CSF Aβ1-42 were found to predict cognitive decline in healthy elderly subjects. Prediction of progression to MCI has also been suggested for the CSF tau/Aβ1-42 combination. Nevertheless, these indications have to be further evaluated in larger, longitudinally followed cohorts [152].

1.3.7 Other AD biomarkers

Additional potential CSF candidates have been evaluated in an attempt to find novel AD biomarkers. Aβ oligomers, other truncated Aβ isoforms (e.g. Aβ1-38, Aβ1-40), soluble forms of APP (sAβPPα and sAβPPβ), and several synaptic and neuronal biomarkers (e.g. VLP-1, SNAP25) have been studied in CSF. Some of the challenges are the lack of reproducible results in the observed fold change (e.g. sAβPPα and sAβPPβ) and low detectable levels in CSF (e.g. Aβ oligomers), while some markers showed potential when combined with Aβ1-42 (e.g. the Aβ1-42/Aβ1-40 ratio) [4].

Plasma is an attractive fluid for AD biomarker research since its collection is less invasive and it can thus be obtained more easily than CSF. However, the search for AD biomarkers in plasma has not been as successful as in CSF. The reasons include the limited detection of brain proteins in blood and inconsistent replication of findings. Amyloid-β has been commonly studied in plasma; however, contradictory results on changes in plasma Aβ1-42 between AD and controls have been reported [4]. Low t-tau levels in blood are the major obstacle to its reliable measurement. A novel ultra-sensitive assay has been developed for t-tau measurement in blood, and a significant increase was detected in AD compared to controls and MCI patients; however, significant overlap between the groups was observed and no correlation between CSF and blood t-tau levels was found, so more research is needed to further evaluate its clinical performance [208].


A recent large meta-analysis by Olsson and colleagues assessed 15 CSF and blood biomarkers, including Aβ1-42, t-tau and p-tau, from 231 studies encompassing a total of 15,699 AD patients and 13,018 healthy elderly controls. The evaluated biomarkers reflected APP metabolism, neurodegeneration, NFT pathology, blood-brain barrier function, and glial activation. In terms of fold change between the AD and control groups, the best-performing, statistically significant biomarkers were the core CSF AD biomarkers, with the following average fold changes: t-tau 2.54, p-tau 1.88 and Aβ1-42 0.54. These biomarkers also performed well in separating patients with MCI due to AD from those with stable MCI, with somewhat smaller average ratios. Among other CSF biomarkers, NFL also showed a major fold difference between AD and controls (2.35), while the remaining CSF biomarkers showed only modest differences (NSE, VLP-1, HFABP and YKL-40, with average ratios below 1.50), low differences (MCP-1, albumin CSF/serum ratio, Aβ1-40), or no significant difference (GFAP, Aβ1-38, sAβPPα, sAβPPβ). On the other hand, among plasma or serum biomarkers, only t-tau showed a significant and major fold difference between AD and controls, with an average ratio of 1.95 [209].

1.3.8 Conclusion

Current AD candidate biomarkers have demonstrated good diagnostic performance, and significant efforts have been made towards analytical standardization and their possible implementation in the clinic. However, reported accuracy still varies between studies, cut-off values are not universally defined, and analytical and pre-analytical procedures are not completely standardized. While their diagnostic potential is promising, their utility in clinical trials for monitoring the disease-modifying effects of drugs is questionable.

The need for novel AD biomarkers has been widely recognized. Current AD biomarkers reflect amyloid plaque and NFT formation and neuronal injury; however, other pathophysiological pathways are affected in AD that these biomarkers do not mirror. Therefore, detailed characterisation of other pathological mechanisms in AD is warranted in order to find novel biomarkers that reflect pathological changes and are associated with disease severity. New biomarkers may be of particular importance for clinical trials, but also for more accurate AD diagnosis, especially in non-specialized settings such as primary care units.


1.4 Rationale and aims of the study

1.4.1 Rationale

Alzheimer’s disease is a progressive neurodegenerative disease mainly affecting elderly people after the age of 65. Amyloid plaques and NFT are the main pathological features of the disease and are associated with the spectrum of clinical signs and symptoms in AD patients [1].

Typical symptoms include progressive cognitive decline, apathy, depression, behavioural changes, impaired judgment, disorientation and, in later stages, difficulty speaking, swallowing and walking. Currently, no effective disease-modifying treatments are available to cure the disease or slow its progression; only symptomatic therapies are used in the clinic to treat the dementia phase, and these are mostly effective during the first year of treatment.

The diagnosis of probable AD is currently based on the core clinical criteria, including medical history, mental status testing, and a physical and neurological exam. The gold standard for a definite AD diagnosis is post-mortem histological examination of the brain [2]. The use of neuroimaging and CSF biomarkers for diagnostic purposes is currently limited mostly to research settings, as recommended by the newly revised NIA-AA research and diagnostic criteria for AD and MCI [2, 67].

The challenge of early and accurate diagnosis of AD is commonly related to the overlap of symptoms and brain pathology with other forms of dementia. There is also a high possibility of co-pathology, in which both AD and other dementias are present. The current focus of AD research is the development of treatments effective in the early stage of the disease (i.e. disease-modifying treatments), with the aim of preventing or slowing disease onset and progression.

Significant efforts are being invested in the development of more targeted treatments with disease-modifying potential, with a number of drugs currently being evaluated in clinical trials [74]. These treatments are predicted to be most effective if introduced early in the disease course. With the availability of disease-modifying drugs, biomarkers will be needed to diagnose early AD stages accurately. This would allow identification of patients for recruitment into clinical trials, and eventually help in classifying patients for drug use and in later monitoring of the response to therapy.

In addition, biomarkers are needed for monitoring drug efficacy or as surrogate endpoints in clinical trials testing disease-modifying treatments.

The CSF is a proximal fluid of the CNS, surrounding the brain and the spinal cord. It resides in direct contact with the brain parenchyma and thus can reflect physiological and pathological changes in the brain. As such, CSF may be the most promising source of AD biomarkers, especially of highly specific, brain-related protein biomarkers.

The most extensively studied and evaluated AD biomarkers to date are CSF Aβ1-42, t-tau, and p-tau. These core AD biomarkers reflect the main AD hallmarks: the Aβ1-42 peptide is a marker of Aβ plaque formation, while t-tau and p-tau are biomarkers of neuronal injury and neurodegeneration. Decreased CSF levels of Aβ1-42 and increased levels of t-tau and p-tau have been observed in AD patients compared with healthy controls, although with substantial overlap in levels between the two groups [203]. While the diagnostic performance of single biomarkers is variable, it has been recognized that the combination of these biomarkers has good overall discriminatory power for differentiating AD from healthy controls (sensitivity and specificity around 85 to 90%). However, the diagnostic accuracy of the combined biomarkers in predicting progression from MCI to AD varies across studies (approximate range of sensitivity and specificity 75-95%). In addition, the power of these three biomarkers to discriminate AD from other types of dementia is lower than their power to discriminate AD from cognitively healthy controls [19]. Finally, it has been previously reported that current AD biomarkers do not correlate well with cognitive decline in AD patients [166, 210].


The current CSF biomarkers have been tested in clinical trials, but with contradictory results, calling into question their usefulness as indicators of the efficacy of new therapies [154]. While some studies testing the efficacy of Aβ immunotherapies showed decreased CSF levels of p-tau but no change in t-tau and Aβ1-42 [211], others showed increased Aβ1-42 levels but no change in tau proteins [77]. In addition, the reduction in biomarkers (reduced levels of p-tau in APOE ε4 carriers, but not in non-carriers) after treatment with the Aβ immunotherapy bapineuzumab was not accompanied by an improvement in cognitive function in AD patients; a similar observation was made in the solanezumab trials.

Another problem associated with the use of current biomarkers in clinical settings is the lack of method standardization, which complicates the definition of cut-off values and the comparison of results obtained across multiple centres. Access to biomarkers is also limited [2].

Therefore, there is a need for novel, specific biomarkers that can accurately and proactively identify evolving cases of AD. Biomarkers that change over the course of the disease, from the asymptomatic stage, through MCI due to AD, to the different severities of AD dementia, will allow reliable diagnosis of the full AD spectrum. Understanding of AD pathology is still incomplete, and it is now known that other pathological mechanisms apart from plaques and NFT are involved, such as dysfunction and degeneration of synapses and microglial activation. Novel biomarkers are needed to improve understanding of the underlying pathological mechanisms.

Biomarker searches have typically used the strategy of comparing proteomes in samples from patients with those in samples from relevant controls. This strategy includes a number of steps, with an initial untargeted survey of proteins in different samples, followed by several cycles of validation with more precise and focused methods in independent groups of samples. In the first step, the use of technologies such as mass spectrometry allows identification of a large number of candidates.


Application of filter criteria is a common approach to narrow down the protein selection and create a manageable list of candidates for the verification step. Different samples have been suggested as initial source material for the discovery of potential biomarkers, such as cell line supernatants, tissues, proximal fluids or blood [98].

The focus on tissue-specific proteins as initial candidate biomarkers has previously been demonstrated as a promising approach for the discovery of novel protein biomarkers [212-214]. Differential expression of proteins specific to a particular tissue can have strong disease specificity, indicating pathology unique to that tissue. Some of these tissue-specific proteins showed promise as potential biomarkers, such as in male infertility (testis-specific protein TEX101) and in cerebral hemorrhagic stroke (brain-specific proteins NFM, α-Inx and β-Syn) [212, 213].

Hippocampal tissue may be of particular interest for seeking candidate biomarkers that reflect AD; it is one of the earliest affected regions, and its involvement results in memory deficits.

The Human Protein Atlas (HPA) portal provides comprehensive data on tissue-specific transcriptomes and proteomes [132]. The tissue-specific proteome has been defined by high mRNA expression in a particular tissue relative to other body tissues and has been categorized into three groups: tissue-enriched (at least five times higher mRNA expression relative to all other tissues), group-enriched (at least five times higher mRNA expression in a group of 2-7 tissues, including the tissue of interest, relative to all other tissues) and tissue-enhanced (at least five times higher mRNA expression relative to the average expression in all other tissues).
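The three categories can be made concrete with a short sketch. The code below is a simplified reading of the definitions quoted above and is not the HPA implementation: the function name, the input format (a mapping from tissue name to mRNA expression in arbitrary units) and the example values are hypothetical, and any minimum-detection thresholds are omitted.

```python
def classify_tissue_specificity(expression, tissue, fold=5.0, max_group_size=7):
    """Classify one gene's expression in `tissue`.

    `expression` maps tissue name -> mRNA expression level (arbitrary units).
    Returns 'tissue-enriched', 'group-enriched', 'tissue-enhanced' or None.
    """
    target = expression[tissue]
    others = {t: v for t, v in expression.items() if t != tissue}
    if target <= 0 or not others:
        return None

    # Tissue-enriched: at least `fold` times higher than every other tissue.
    if target >= fold * max(others.values()):
        return "tissue-enriched"

    # Group-enriched: a group of 2-7 top-expressing tissues (including the tissue
    # of interest) whose lowest member is at least `fold` times higher than the
    # highest of the remaining tissues.
    ranked = sorted(expression, key=expression.get, reverse=True)
    for size in range(2, max_group_size + 1):
        group, rest = ranked[:size], ranked[size:]
        if not rest:
            break
        if tissue in group and expression[group[-1]] >= fold * expression[rest[0]]:
            return "group-enriched"

    # Tissue-enhanced: at least `fold` times the average of all other tissues.
    if target >= fold * (sum(others.values()) / len(others)):
        return "tissue-enhanced"
    return None

# Hypothetical expression profile for a single gene (illustrative values only).
example = {"brain": 120.0, "testis": 90.0, "liver": 4.0, "kidney": 3.0, "lung": 2.0}
print(classify_tissue_specificity(example, "brain"))
```

The categories are checked from the most to the least stringent, so a gene is assigned to the strictest category it satisfies.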

Secreted and membrane-bound proteins can be of particular interest. Significant amounts of membrane-shed and secreted proteins may be released into proximal fluids (such as CSF) and can serve as candidate biomarkers; these proteins have previously been suggested as promising biomarker candidates for various diseases (e.g. prostate and lung cancer) [215, 216]. Based on the HPA gene ontology (GO) analysis, the majority of brain-enriched proteins are of membrane and secreted origin; likewise, most of the proteins found in CSF are annotated as membrane and secreted [118, 132].

Mass spectrometry-based SRM technology allows multiplexed and specific analysis of a large number of proteins, which is of particular value in the early stages of biomarker discovery and verification. Quantification of potential biomarkers with conventional immunoassays has certain limitations: the availability of specific antibodies for antibody-based assays is often limited, there is potential cross-reactivity with other proteins, and the multiplexing capabilities of such assays are restricted. In contrast, mass spectrometry-based quantification can offer reliable and accurate quantification of multiple proteins simultaneously, with high specificity. For this reason, state-of-the-art targeted mass spectrometry is a valuable platform for detection and quantification of numerous potential biomarker candidates in biological fluids such as CSF.

1.4.2 Hypothesis

We hypothesize that brain-related proteins, found in the CSF, may serve as potential biomarkers of different stages of AD.

1.4.3 Objectives of the study

Objective 1: Identify Alzheimer’s disease-unique proteins in hippocampal tissue and cerebrospinal fluid proteome.

Objective 2: Create comprehensive CSF proteome and select tissue-enriched and group-enriched (secreted and/or membrane bound) proteins reliably identified in CSF.


Objective 3: Develop mass spectrometry-based quantitative selected reaction monitoring assays for simultaneous quantification of candidate proteins in CSF.

Objective 4: Verify diagnostic potential of candidate proteins in the CSF samples of the Alzheimer’s disease cohort.


Chapter 2

Parts of this chapter were published in:

Begcevic I#, Kosanam H#, Martinez-Morillo E, Dimitromanolakis A, Diamandis P, Kuzmanov U, Hazrati LN, Diamandis E. Semiquantitative proteomic analysis of human hippocampal tissues from Alzheimer’s disease and age-matched control brains. Clin Proteomics 2013;10:5-11.

# Equal contribution to this work.

Copyright permission has been granted.


2 Semiquantitative proteomic analysis of human hippocampal tissues from Alzheimer’s disease and age-matched control brains

2.1 Introduction

Alzheimer’s disease (AD) is a progressive neurodegenerative disease mainly affecting people over the age of 65. The hallmarks of AD are the extracellular deposits known as amyloid β (Aβ) plaques and the intracellular neurofibrillary tangles (NFT), the principal players thought to be involved in synaptic loss and neuronal cell death [16, 17]. Currently, diagnosis of AD is based on clinical criteria that rely on neuropsychological examination, mental status testing and insight into the medical history of the patients. However, the gold standard for AD diagnosis remains histological examination of post-mortem brain regions [2]. Furthermore, there are no accurate methods to track the efficacy of new therapies. Hence, there is a pressing need for specific biomarkers that identify evolving cases of AD early and may lead to more favorable medical outcomes.

Cerebrospinal fluid (CSF) has so far been the most promising source of potential protein biomarkers. The CSF amyloid β 1-42 fragment (Aβ1-42) shows about a 50% decrease in AD patients in comparison to cognitively normal individuals [203]; however, it does not consistently distinguish AD from other forms of dementia [152]. Other prospective candidates, total tau (t-tau) and phosphorylated tau (p-tau), have been found to be increased in the CSF of AD cases compared to controls [217]. While t-tau levels tend to be elevated in other neurodegenerative diseases as well [179], indicating a lack of specificity, p-tau levels may discriminate AD from other types of dementia [201]. Together, these three biomarkers reflect Aβ deposition as well as neuronal injury, and they show better diagnostic performance when combined [4].

Human brain tissue proteomics has been studied increasingly over the last decade [218-220]. A recent mass spectrometry-based proteomic study of the temporal lobe region reported a total of 197 proteins differentially abundant in AD versus controls [221], whereas another study, by Sultana et al., identified 18 proteins with altered levels in the hippocampus, which are involved in different cellular functions in AD pathology [222]. Together with the temporal lobe, the hippocampus is one of the earliest affected regions in AD pathology, leading to impairment of memory and cognitive functions [21, 23]. Therefore, proteomic analysis of the AD hippocampus could help in defining the etiology of the disease as well as in identifying potential biomarkers and therapeutic targets.

In this pilot study we present one of the first comprehensive proteomic analyses of the hippocampal region of three brains affected by AD and three age-matched controls; in addition, the presence of hippocampal proteins in human CSF was examined.

2.2 Methods

2.2.1 Brain tissue samples

Post-mortem frozen brain hippocampal tissues were obtained with the Research Ethics Board approval (#11-1012-T) from the University Health Network, Toronto, Canada. Three pathologically confirmed AD tissues (all three had Braak stage 6/6) were obtained from three female patients (aged 69, 75 and 98 years) with post-mortem interval (PMI) of 13, 4 and 19.5 hours, respectively, while three control tissues were obtained from one female (aged 77 years) and two male patients (aged 78 and 80 years) with PMI of 12, 12 and 4 hours, respectively. Control patients were diagnosed with non-metastatic colon cancer, cardiovascular disease and heart failure, respectively.

2.2.2 Mass spectrometry sample preparation

Prior to digestion, frozen tissue sections from both AD and controls were cut and weighed (~150 mg wet weight). Proteins from these six brain tissues were extracted and solubilized using 0.2% RapiGest (Waters Corporation, Milford, USA) in 50 mM ammonium bicarbonate. Briefly, tissue samples were homogenized (Polytron PT3100, Capitol Scientific, Austin, USA) at 15,000 rpm for 15 s and sonicated on ice three times for 15 s with a MISONIX immersion tip sonicator (Q SONICA LLC, CT, USA). The samples were centrifuged at 15,000 g at 4 °C for 20 min; the supernatants were collected and measured for total protein content. Three AD tissues and three control tissues were pooled separately and an equal amount (3 mg) of protein from each pool was processed. Proteins were reduced and alkylated with 5 mM dithiothreitol and 15 mM iodoacetamide. To digest the proteins, sequencing grade trypsin (Promega, WI, USA) was added at an enzyme to substrate ratio of 1:50 and the digestion was carried out at 37 °C for 18 hours. Fractionation of acidified tryptic peptides was performed on a PolySULFOETHYL aspartamide strong cation exchange (SCX) column (The Nest Group, Inc., MA, USA) connected to an Agilent 1100 HPLC system. SCX fractionation was performed in triplicate for AD and control pools, and 20 fractions were collected per chromatographic run. This amounted to a total of 120 SCX fractions, which were then subjected to LC-MS/MS analysis after a brief desalting procedure.

2.2.3 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

For SCX fractionation, a 60 min linear gradient from buffer A to buffer B (buffer A: 0.26 M formic acid (FA) in 5% acetonitrile; buffer B: 0.26 M FA in 5% acetonitrile and 1 M ammonium formate) was run at a flow rate of 250 μL/min. The peptides from SCX fractions were desalted and injected onto a nano-LC system (Proxeon Biosystems, Odense, Denmark) connected online to an LTQ-Orbitrap XL mass spectrometer (Thermo Fisher Scientific, San Jose, CA, USA). A 90 min linear-gradient reverse-phase separation (buffer A: 0.1% FA in water; buffer B: 0.1% FA in acetonitrile) at a flow rate of 400 nL/min was performed to resolve peptides on a C18 column (75 μm × 5 cm). The mass spectra were acquired in data-dependent mode.

2.2.4 Data analysis

The MS spectra were searched against the non-redundant IPI human database (version 3.71 containing both forward and reverse protein sequences) using two search engines, separately: Mascot, version 2.1.03 (Matrix Science) and the Global Proteome Machine manager, version 2006.06.01. The resulting Mascot DAT and X! Tandem XML files were merged using Scaffold® (version 2.06, Proteome Software Inc., Portland, Oregon) with the ‘MudPIT’ (multidimensional protein identification technology) option checked. Protein false-positive rate was set to ≤ 1%.

2.3 Results and Discussion

From the proteomic analysis of 6 hippocampal tissue specimens (pool of 3 AD and pool of 3 controls), 2,954 proteins were identified, with at least two peptides for 1,203 of them. A total of 2,354 proteins were detected in AD tissues and 2,750 in control tissues, with 204 proteins exclusively detected in AD and 600 in controls (Figure 2.1A). Furthermore, 1,605 proteins were identified in all three AD technical replicates, and 1,755 proteins were identified in all three control technical replicates. Of the 204 AD-exclusive proteins, 124 were identified with ≥2 peptides. Of the 600 control-exclusive proteins, 255 were identified with ≥2 peptide hits.
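The exclusive and shared counts follow from inclusion-exclusion on the reported totals. The short check below is an illustrative sketch (the function name is an assumption); it recovers the 204 and 600 exclusive proteins and shows that the reported totals imply 2,150 proteins common to both pools.

```python
def venn_two_sets(total_union, n_a, n_b):
    """Return (only_a, shared, only_b) given |A ∪ B|, |A| and |B|."""
    shared = n_a + n_b - total_union  # inclusion-exclusion
    only_a = n_a - shared
    only_b = n_b - shared
    if min(shared, only_a, only_b) < 0:
        raise ValueError("inconsistent counts")
    return only_a, shared, only_b


# Totals reported above: 2,954 overall, 2,354 in AD, 2,750 in controls.
ad_only, shared, control_only = venn_two_sets(2954, 2354, 2750)
print(ad_only, shared, control_only)  # 204 2150 600
```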

Figure 2.1: Proteins identified in Alzheimer’s disease (AD) and control brain samples. A) Overlap of the proteins identified in post-mortem hippocampal tissue specimens from AD patients and age-matched healthy controls. AD and control pools (n=3) were analysed in triplicate. B) Overlap of AD and control proteins with literature-based CSF proteome [223]. The comparison reveals that 40 AD-specific proteins identified in the current study were also present in the CSF database.

Analysis of technical and biological replicates is necessary to ensure the accuracy and biological significance of proteomic datasets. Pooling of biological replicates does not allow statistical comparisons, but it produces sufficient sample to enhance the detection of low-abundance proteins through technical replicates and helps to balance between-subject variability. In the current study, we pooled biological replicates and performed SCX fractionation of each pool in triplicate. A total of 120 fractions were analysed with approximately 300 hours of instrument time. An ideal proteomic study includes both biological and technical replicates. With this study, we intended to generate a comprehensive hippocampal proteome dataset and build a platform for future translational investigations in AD. Therefore, we pooled our biological samples to save precious instrument time but included technical triplicates to enhance confidence in protein identification and increase the depth of proteomic analysis. Despite the shortcomings of our pooling strategy, our study unveiled one of the largest hippocampal proteome databases reported to date. We believe that the proteome presented here provides a valuable resource for researchers aiming to develop novel AD biomarkers and therapeutic targets.

Post-mortem interval (PMI) is a critical factor affecting the integrity of the proteome in post-mortem tissues. A short PMI (<2 h) is advantageous in proteomic studies of human tissues, since PMI-prone artifacts, such as proteolytic degradation, insolubility and oxidation/nitration of certain proteins, are minimized and results more likely represent the intrinsic situation in AD and control brains. However, due to the scarcity of post-mortem tissues, it is not always possible to secure tissues with short PMIs. In the current study, 2 of 6 tissues had a PMI of 4 h and the rest were collected at 12-19.5 h PMI. A recent study using 2-dimensional gel electrophoresis (2DE) showed that only a small percentage (6.5% of ~2,500 proteins) of brain proteins underwent proteolytic degradation after 48 h PMI, with no significant changes in their solubility [224]. Techniques such as 2DE and western blots capture only intact proteins, leaving out proteolytic peptides, and this severely impacts the quality of proteomic analysis. On the other hand, sample preparation methods used in the current study, e.g. SCX chromatography, capture the total proteome, which includes proteolytically degraded peptides as well as trypsin-generated peptides. In addition, we avoided the dialysis procedure to prevent loss of low molecular weight proteolytic peptides. Therefore, the overall effect of proteolytic degradation during PMI is minimized.

It should be noted that when the same amount of total protein was processed for proteomic analysis, more proteins were identified in controls than in AD tissues. The lower number of protein identifications in AD tissues may be attributed to the inherent insolubility of protein aggregates, which renders them inaccessible to trypsin proteolytic activity. Moesin (MSN), heat shock protein beta-1 (HSPB1), S100B protein (S100B), and chloride intracellular channel protein 1 (CLIC1) were the top four over-expressed proteins (≥ 4.5-fold measured in terms of spectral counts) in AD tissues. Elongation factor 1-alpha 2 (EEF1A2), 2-oxoglutarate dehydrogenase (OGDH), isoform 1 of immunoglobulin superfamily member 8 (IGSF8) and actin-related protein 2 (ACTR2) were highly down-regulated proteins (≥ 4.5-fold) in AD tissues. Fold changes of the top 10 over-expressed and under-expressed proteins are presented in Appendix 1 and 2.
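A fold change based on spectral counts can be sketched as follows. The text does not state how the counts were normalized, so the scaling by total spectra, the pseudocount and the example numbers below are all assumptions chosen for illustration, not the procedure used in this study.

```python
def spectral_count_fold_change(count_a, count_b, total_a, total_b, pseudocount=1.0):
    """Fold change of condition A over condition B from spectral counts.

    Assumptions (not taken from the study): counts are scaled by the total
    number of spectra in each pool, and a pseudocount avoids division by
    zero for proteins absent from one pool.
    """
    norm_a = (count_a + pseudocount) / total_a
    norm_b = (count_b + pseudocount) / total_b
    return norm_a / norm_b


# Hypothetical protein: 44 spectra in the AD pool, 9 in the control pool,
# with equal total spectra in both pools.
fold = spectral_count_fold_change(44, 9, total_a=100_000, total_b=100_000)
print(f"{fold:.1f}-fold")
```

Under these assumptions a value above 1 indicates higher abundance in condition A, and the reciprocal gives the corresponding down-regulation factor.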

We relied upon “Protein Center” (Thermo Fisher Scientific, USA) to retrieve Gene Ontology information for cellular localization, biological processes and molecular function of AD proteins. As expected, the majority of identified proteins were of membranous and cytoplasmic origin and were associated with metabolic processes, protein binding and catalytic functions (Figure 2.2).

Figure 2.2: Gene Ontology (GO) analysis of AD hippocampal proteome. Diagram showing cellular localization (A), biological processes (B) and molecular function (C) for AD hippocampal proteome. Gene Ontology information was retrieved from the ‘Protein Center’ database.

Proteins identified in the AD pool were compared against the human proteome database to identify statistically significant over-represented gene ontology functions using the BiNGO (Cytoscape plugin) enrichment map. A p-value threshold of <0.001 and a false discovery rate (FDR) of <5% were used to confidently predict enriched GO terms among AD proteins. Protein binding (p=3.28E-82), catalytic activity (p=8.72E-37), oxidoreductase activity (p=7.78E-25), adenyl nucleotide binding (p=1.10E-9), SNARE binding (p=4.56E-8) and syntaxin binding (p=1.5E-6) were the most significantly enriched molecular functions of AD proteins. Among the biological processes, cellular metabolic processes (p=1.05E-42), primary metabolic process (p=8.21E-23), vesicle mediated transport (p=1.16E-23), cellular ketone metabolic process (p=3.08E-21), oxidative phosphorylation (p=1.01E-17) and positive regulation of ubiquitin activity (p=3.15E-9) were of high statistical significance.
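Over-representation of a GO term is commonly scored with a hypergeometric test followed by a Benjamini-Hochberg correction. The text does not state which test or correction was applied here, so the following is a minimal sketch under that assumption, with made-up counts and hypothetical function names.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated proteins in a list of n,
    when K of the N background proteins carry the annotation."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up example: 30 of 200 listed proteins carry a term held by
# 500 of 20,000 background proteins.
p = hypergeom_pvalue(30, 200, 500, 20000)
print(p < 0.001)
print(benjamini_hochberg([0.01, 0.04, 0.03]))
```

A term would then be reported as enriched only if it passes both the raw p-value threshold and the FDR threshold.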

The underlying objective of the current study was to segregate promising candidate biomarkers from the list of 2,954 proteins. To this end, we considered the 204 and 600 proteins that were identified exclusively in AD tissues and control tissues, respectively. Failure to detect these proteins in one group does not prove their absence; however, it does suggest that these proteins are differentially expressed. Mere differential expression of a protein in AD tissues does not qualify the protein to be a biomarker, unless its disease-specific higher expression in tissues is reflected in easily accessible bio-fluids such as CSF or serum. In this light, we compared the current hippocampal proteome with the literature-based CSF proteome [223]; 25% of the 2,954 tissue proteins were present in CSF. A notable finding is that 40 of the 204 AD-exclusive proteins and 106 of the 600 control-exclusive proteins were also detected in CSF (Figure 2.1B).

Secretory origin is one of the most important qualifications of a biomarker candidate [214]. It is well-established that the majority of CSF and serum proteins are of extracellular and secretory origin. Therefore, we assumed that extracellular and secreted proteins, identified in CSF and up/down-regulated in AD tissues, are favourable candidates for biomarker verification. As most of the 40 and 106 proteins that are present in the CSF proteome were of either extracellular/secretory or membranous origin, it is worthwhile to include these proteins as a potential source of candidate biomarkers. Appendix 2.3 shows the 40 AD-exclusive proteins found in the literature CSF proteome, together with their cellular localization.

In conclusion, the hippocampus is one of the primary regions of the brain affected by AD. This structure is known to host tangles and plaques in the earliest phases of the disease cascade. The proteome of such a pivotal region represents a promising source of diagnostic markers and molecular targets for therapeutic intervention. Herein, we performed proteomic analysis of freshly-frozen post-mortem hippocampal tissue sections from AD (n=3) and age-matched controls (n=3). Our detailed proteomic analysis utilizing offline multidimensional chromatography coupled with the LTQ-Orbitrap XL mass spectrometer and semi-quantitative spectral counting methods identified 2,954 proteins, one of the largest human hippocampal proteome databases published to date. We applied a hypothesis-driven set of filtering criteria, based on a protein’s cellular origin and its identification in the CSF proteome, to find proteins that can be used as potential biomarkers in CSF.


Chapter 3

Parts of this chapter were published in:

Begcevic I#, Brinc D, Drabovich AP, Batruch I, Diamandis EP. Identification of brain-enriched proteins in the cerebrospinal fluid proteome by LC-MS/MS profiling and mining of the Human Protein Atlas. Clin Proteomics 2016;13:11-23.

# Performed all experiments and data analysis. Graphical representation of data was conducted in collaboration with DB.

Copyright permission has been granted.


3 Identification of brain-related proteins in the cerebrospinal fluid proteome

3.1 Introduction

Cerebrospinal fluid (CSF) is a proximal fluid residing in direct contact with the cerebral parenchyma. CSF acts to protect, support and nurture brain tissues and is essential for brain functioning. Apart from hydro-mechanical protection, CSF is also important for the homeostasis of the extracellular environment and hormonal-to-neuropeptide balance in the central nervous system (CNS) [133, 141]. The majority of CSF is produced as plasma ultra-filtrate by the choroid plexus in the lateral, third and fourth ventricles, whereas a smaller portion is derived from the cerebral interstitial fluid and cerebral capillaries [134]. CSF production is a dynamic process with a rate of about 500 mL per day, and CSF absorption is mainly performed through arachnoid villi from the subarachnoid space into the venous sinuses [134]. Approximately 80% of the total CSF protein is derived from the plasma, upon crossing the blood-CSF barrier, and another 20% is secreted by the CNS [133]. Examples of proteins with higher CSF concentration and high CSF-to-blood serum ratios include prostaglandin D2 synthase (ratio 34/1), S-100B (18/1), tau protein (10/1), and cystatin C (5/1) [145, 146]. The most abundant blood-derived proteins in CSF are albumin and immunoglobulins. Blood-related proteins in CSF such as apolipoprotein B-100 and hemoglobin are commonly used as an indication of blood contamination of CSF [133, 147].

Detailed composition of the CSF proteome may provide novel insights for the in-depth understanding of CNS functioning under physiological and pathological conditions. The advantages of tissue-specific proteomes have been previously demonstrated for the discovery of novel protein biomarkers [212, 213]. The Human Protein Atlas (HPA) provides comprehensive data on the tissue-specific transcriptomes and proteomes, based on the RNA-sequencing analysis of 32 human tissues and immunohistochemistry analysis of 44 tissues, respectively [132]. Apart from the tissue-specific proteomes, HPA includes comprehensive summaries of regulatory, secreted and membrane, cancer-specific and druggable proteomes. This makes HPA an indispensable repository of the human proteome and its applications for disease diagnostics and drug discovery. It is worth noting that the brain has the second-largest number of tissue-specific genes of any organ. Of the 1,134 elevated genes in the brain (HPA v13), 318 are tissue-enriched genes, 226 are elevated in a group of 2-7 tissues (group-enriched) and 590 are annotated as tissue-enhanced genes. Tissue-enriched genes are genes with mRNA expression at least five times higher in the cerebral cortex relative to other tissues, while group-enriched genes have mRNA expression at least five times higher in a group of 2-7 tissues, including cerebral cortex, relative to all other tissues. Lastly, tissue-enhanced genes have at least five times higher mRNA expression in brain relative to the average expression in all other tissues. The Gene Ontology (GO) analysis of the elevated genes indicates that the main functions of brain proteins are synaptic transmission and neurological processes, whereas most of the brain-enriched genes encode membrane-bound and secreted proteins. Interestingly, membrane-bound and secreted proteins represent the majority of the CSF proteome and their fraction is much higher in CSF than in blood [118, 133]. Considering that membrane and secreted proteins are overrepresented in the CSF, they could potentially be reliably identified and quantified, which makes them attractive biomarker candidates. Besides, significant amounts of membrane-shed and secreted proteins may be released into proximal fluids (such as CSF); these proteins have been previously suggested as promising biomarker candidates of various diseases [215, 216].


The field of CSF proteomics is constantly expanding and many efforts have been made to characterize the CSF proteome. The most extensive CSF protein mappings to date are those by Zhang et al., who identified 3,256 proteins [118], and by Guldbrandsen et al., who identified 3,081 proteins [225], followed by Schutzer et al. with 2,630 [223] and Pan et al. with 2,594 proteins [226].

The main purpose of the present study was to expand the knowledge of the human CSF proteome and generate a panel of brain-enriched proteins that can potentially serve as a platform for biomarker discovery of Alzheimer’s disease (AD). Here, we performed two-dimensional chromatography (off-line strong-cation exchange fractionation followed by on-line reverse-phase separation) and mass spectrometry analysis to generate an extensive proteome of normal CSF samples. HPA data were further applied to select brain-related secreted and membrane-bound proteins found in the CSF. Since high-quality antibodies and ELISAs may not be available for many brain tissue-specific proteins, we provided a list of brain-enriched proteins detectable by mass spectrometry and thus quantifiable in CSF by antibody-free selected reaction monitoring (SRM) assays [102, 227].

3.2 Methods

3.2.1 Cerebrospinal fluid sample preparation

Six non-pathological (normal) CSF samples were retrospectively retrieved for CSF proteome analysis; they had been archived after routine biochemical examinations at the Mount Sinai Hospital, Toronto, and stored at -80 ºC until further use. All samples were transparent, clear and without any visible blood contamination. The patients’ age ranged from 32 to 72 years and included 3 female and 3 male patients. The ethical approval was obtained from the Mount Sinai Hospital Research Ethics Board.


For the CSF proteomic analysis, samples were thawed at room temperature, centrifuged for 10 min at 17,000 g and subjected to mass spectrometry sample preparation. Each CSF sample was adjusted to a volume equivalent to 300 µg total protein, denatured with 0.05% RapiGest (Waters, Milford, USA) and reduced with 5 mM dithiothreitol (Sigma-Aldrich, Oakville, Canada) at 60 ºC for 40 min. Alkylation was achieved with 15 mM iodoacetamide (Sigma-Aldrich, Oakville, Canada) for 60 min in the dark at room temperature. Protein digestion was carried out with trypsin (Sigma-Aldrich, Oakville, Canada) in 50 mM ammonium bicarbonate (1:30 trypsin to total protein ratio) for 18 hours at 37 ºC. Digestion was stopped and RapiGest was cleaved with 1% trifluoroacetic acid, followed by sample centrifugation at 500 g for 30 min. Samples were frozen at -80 ºC until strong-cation exchange (SCX) HPLC peptide separation.
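The two quantities fixed by this protocol, the CSF volume holding 300 µg of protein and the trypsin mass at a 1:30 ratio, can be worked out as in the sketch below. The total protein concentration in the example is a hypothetical value chosen for illustration; it is not a measurement from this study.

```python
def digestion_plan(protein_conc_ug_per_ul, protein_ug=300.0, trypsin_ratio=30.0):
    """Return (CSF volume in µL, trypsin mass in µg) for one digestion.

    protein_ug and trypsin_ratio default to the values quoted in the text
    (300 µg total protein, 1:30 trypsin to protein).
    """
    if protein_conc_ug_per_ul <= 0:
        raise ValueError("protein concentration must be positive")
    volume_ul = protein_ug / protein_conc_ug_per_ul
    trypsin_ug = protein_ug / trypsin_ratio
    return volume_ul, trypsin_ug


# Hypothetical CSF sample at 0.4 µg/µL (0.4 mg/mL) total protein.
volume, trypsin = digestion_plan(0.4)
print(f"{volume:.0f} µL CSF, {trypsin:.0f} µg trypsin")
```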

3.2.2 Strong cation exchange chromatography

Trypsinized samples were diluted two-fold with SCX Buffer A (0.26 M formic acid, 5% acetonitrile) and loaded on the SCX PolySULFOETHYL Column (The Nest Group, Inc, Southborough, USA) coupled to the Agilent 1100 HPLC system. The peptides were eluted with a gradual increase of SCX Buffer B (0.26 M formic acid, 5% acetonitrile, 1 M ammonium formate) during the 70 min gradient (30-40 min 20% SCX Buffer B; 45-55 min 100% SCX Buffer B) at a flow rate of 200 µL/min. The eluent was monitored at 280 nm and fractions (400 µL) were collected. Based on the elution profile, 15 individual fractions and one pooled fraction (for low absorbance fractions, at the end of the gradient) per sample were selected for mass spectrometry analysis. Peptides were purified by extraction using OMIX C18 tips, eluted with 5 µL of acetonitrile solution (65% acetonitrile, 0.1% formic acid) and finally diluted with 60 µL of water-formic acid (0.01% formic acid) solution.


3.2.3 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

In total, 96 desalted SCX fractions from six individual CSF samples were loaded on a 96-well plate. Using an auto-sampler, 18 µL of each sample was injected onto an in-house packed 3.3 cm trap pre-column (5 μm C18 particle, column inner diameter 150 μm) and peptides were eluted from the 15 cm analytical column (3 μm C18 particle, inner diameter 75 μm, tip diameter 8 μm). The liquid chromatography EASY-nLC system (Thermo Fisher, Odense, Denmark) was coupled online to the Q-Exactive Plus (Thermo Fisher, San Jose, USA) mass spectrometer with a nanoelectrospray ionization source. A 60-min liquid chromatography (LC) gradient was applied with an increasing percentage of buffer B (0.1% formic acid in acetonitrile) for peptide elution, at a flow rate of 300 nL/min. A full MS1 scan was acquired from 400 to 1500 m/z in the Orbitrap at a resolution of 70,000, followed by MS2 scans on the top 12 precursor ions at a resolution of 17,500 in data-dependent acquisition (DDA) mode. Dynamic exclusion was enabled for 45 s, and precursors with unassigned charge, as well as charge states +1 and +4 to ≥8, were omitted from MS2 fragmentation.
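The precursor selection rules above (top 12 by intensity, charge states +2 and +3 only, 45 s dynamic exclusion) can be illustrated with a simplified sketch. This is not the instrument's acquisition logic: the function name is hypothetical, and matching excluded precursors by m/z rounded to two decimals is an assumption standing in for a proper mass tolerance window.

```python
def select_precursors(peaks, scan_time, excluded_until,
                      top_n=12, allowed_charges=(2, 3), exclusion_s=45.0):
    """Pick precursors for MS2 from one MS1 scan.

    peaks: list of (mz, intensity, charge), charge is None if unassigned.
    excluded_until: dict mapping rounded m/z -> time (s) when exclusion ends;
    it is updated in place so it can be carried from scan to scan.
    """
    eligible = [
        p for p in peaks
        if p[2] in allowed_charges
        and excluded_until.get(round(p[0], 2), float("-inf")) <= scan_time
    ]
    chosen = sorted(eligible, key=lambda p: p[1], reverse=True)[:top_n]
    for mz, _, _ in chosen:
        excluded_until[round(mz, 2)] = scan_time + exclusion_s
    return chosen


# Made-up MS1 peaks: (m/z, intensity, charge).
peaks = [(500.25, 1e6, 2), (600.30, 5e5, 1), (700.35, 8e5, 3),
         (800.40, 9e5, None), (900.45, 1e5, 4)]
exclusion = {}
print(select_precursors(peaks, 0.0, exclusion))   # the +2 and +3 ions
print(select_precursors(peaks, 10.0, exclusion))  # both still excluded
```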

3.2.4 Data analysis

The Human Protein Atlas (HPA) [132] version 13 (the tissue specific proteome database) was utilized to generate a list of secreted and membrane-bound brain-expressed proteins that had high mRNA expression in the brain relative to other human tissues. The list of 318 brain-enriched proteins (with mRNA expression at least 5 times higher in the cerebral cortex relative to other tissues) and 226 group-enriched proteins (with mRNA expression at least 5 times higher in the group of 2-7 tissues, including cerebral cortex) was downloaded from the HPA database (www.proteinatlas.org). Brain-related proteins were then merged with the secretome (n=3,171 proteins) and the membrane proteome (n=5,570 proteins), generated based on the prediction algorithms for membrane and secreted proteins. The resulting list of brain-related proteins was merged with the in-house generated brain hippocampal proteome (IPI protein identifiers of brain proteins were first transformed into Ensembl gene IDs, and 2,926 proteins were merged with the generated HPA protein list based on the Ensembl gene ID).

Raw files were uploaded into Proteome Discoverer, version 1.4 (Thermo Fisher, San Jose, USA), and searched with both Mascot and Sequest HT algorithms against the human TrEMBL database (July 2014 release). Searching parameters included: two maximum missed cleavages, cysteine carbamidomethylation as a static modification, methionine oxidation as a dynamic modification, precursor mass tolerance of 7 ppm, fragment mass tolerance of 0.02 Da. Proteins were grouped automatically by Proteome Discoverer software and the master protein per group was assigned by the Parsimony Principle. Decoy database search was set to 1% false discovery rate at the peptide level. The final list of brain-enriched and group-enriched candidates was selected based on protein identification in at least 4 out of 6 individual samples.
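The idea behind a decoy-based false discovery rate can be shown with a generic target-decoy sketch. It is not the Proteome Discoverer implementation: the function name, the simple decoys/targets estimate and the example scores are assumptions for illustration.

```python
def score_cutoff_at_fdr(matches, max_fdr=0.01):
    """Return the lowest score cutoff at which decoys/targets <= max_fdr.

    matches: iterable of (score, is_decoy); a higher score is a better match.
    Returns None if no cutoff satisfies the threshold.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything accepted at or above this score.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Made-up peptide matches: (score, is_decoy).
matches = [(10, False), (9, False), (8, True), (7, False)]
print(score_cutoff_at_fdr(matches, max_fdr=0.01))  # 9
```

Peptides scoring at or above the returned cutoff would be accepted; a stricter threshold moves the cutoff up and shortens the accepted list.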

Brain-enriched (n=196) and group-enriched (n=138) proteins were first retrieved from HPA, merged with secreted/membrane proteome to generate a list of brain/group enriched secreted/membrane proteins which were then merged with the in-house generated CSF proteome (based on the gene name) using R statistical software version 2.15.2 (www.Rproject.org). Label-free quantification of the CSF proteome and 78 candidate proteins was performed using MS1 area obtained with Proteome Discoverer (v1.4). Venn diagram for inter-individual sample reproducibility was prepared using Jvenn [228]. The GO analysis of candidate proteins was executed with PANTHER classification system [229].
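The candidate selection described above (merge by gene name, then keep proteins identified in at least 4 of 6 samples) can be sketched as follows. The study performed this step in R; the Python function and the placeholder gene names below are assumptions for illustration only.

```python
from collections import Counter


def reproducible_candidates(sample_genes, candidate_genes, min_samples=4):
    """Return candidate genes identified in at least `min_samples` samples.

    sample_genes: dict mapping sample name -> iterable of gene names
    identified in that sample.
    candidate_genes: iterable of gene names from the tissue-specific list.
    """
    counts = Counter(
        gene for genes in sample_genes.values() for gene in set(genes)
    )
    return sorted(g for g in set(candidate_genes) if counts[g] >= min_samples)


# Placeholder gene names and identifications, not data from this study.
samples = {
    "CSF1": ["GENE_A", "GENE_B", "GENE_C"],
    "CSF2": ["GENE_A", "GENE_B"],
    "CSF3": ["GENE_A", "GENE_C"],
    "CSF4": ["GENE_A", "GENE_B"],
    "CSF5": ["GENE_B"],
    "CSF6": ["GENE_A"],
}
print(reproducible_candidates(samples, ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]))
# ['GENE_A', 'GENE_B']
```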

Immunohistochemistry-based expression of the candidate proteins was manually assessed using the HPA database. Validation data were annotated as IHC brain evidence (detected, not detected, NA-not available) and IHC tissue expression (number of tissues in which the protein is expressed/total number of tissues evaluated), considering four brain-derived tissues as a single tissue. Mass spectrometry-based protein evidence was assessed by merging candidate proteins (based on Ensembl gene ID) with the previously developed brain hippocampal proteome (Chapter 2), annotated as present/absent. The comparison between the in-house developed CSF proteome and CSF proteomes from the literature (Guldbrandsen et al. and Zhang et al.) was performed with R statistical software (v 2.15.2), merging UniProt accession protein identifiers.

3.3 Results

3.3.1 Cerebrospinal fluid proteome

To generate an in-house CSF proteome covering a wide age range of healthy individuals, six non-pathological CSF samples from three female and three male individuals were selected (Figure 3.1), with patients’ age from 32 to 72 years. Numbers of identified proteins in each individual CSF sample ranged from 1,109 to 1,421, while numbers of identified peptides varied from 6,272 to 8,632 at 1% FDR at the peptide level. Merging of proteomes of six individuals resulted in 2,615 proteins (12,443 peptides) which represented our complete CSF proteome. Table 3.1 includes the number of proteins and peptides identified in all 6 CSF samples.


Figure 3.1: Candidate selection workflow. Six individual CSF samples were digested, fractionated with SCX-HPLC and analyzed with LC-MS/MS to generate the in-house human CSF proteome. 196 brain-enriched and 138 group-enriched proteins (secreted/membrane) were compared against individual CSF proteomes and 78 brain-related proteins, found reproducibly in individual CSF samples, were selected. A- absorbance, RI- relative intensity, m/z- mass-to-charge ratio.

Table 3.1: Number of identified proteins and peptides in six individual CSF samples

Sample    # Proteins    # Peptides
CSF1      1200          6978
CSF2      1282          7629
CSF3      1109          6272
CSF4      1421          8632
CSF5      1305          7253
CSF6      1241          6756
Total     2615          12443


Between any two samples, the average percentage of common proteins was 66.9%. Fewer proteins were common between 3-6 samples. Specifically, 1,188 (45.4%) proteins were common in at least 3 samples, 947 (36.2%) in at least 4 samples, 734 (28.1%) proteins were shared among at least 5 samples, while 546 (20.9%) were shared among all 6 samples (Figure 3.2, Table 3.2). At the peptide level, the average percentage of peptides common between any two samples was 74%. Similar to proteins, fewer peptides were common among more samples. 7,423 (59.7%) identified peptides were shared among at least 3 samples, 6,138 (49.3%) among at least 4 samples, 4,937 (39.7%) peptides among at least 5 samples, while 3,625 (29.1%) were shared among all 6 samples (Figure 3.3, Table 3.3).

Table 3.2: Overlap of proteins in individual samples

Overlapped CSF A and B        Number    Percentage of A    Percentage of B
CSF1 and CSF2                 859       71.6               67.0
CSF1 and CSF3                 757       63.1               68.3
CSF1 and CSF4                 823       68.6               57.9
CSF1 and CSF5                 827       68.9               63.4
CSF1 and CSF6                 787       65.6               63.4
CSF2 and CSF3                 794       61.9               71.6
CSF2 and CSF4                 924       72.1               65.0
CSF2 and CSF5                 900       70.2               69.0
CSF2 and CSF6                 841       65.6               67.8
CSF3 and CSF4                 780       70.3               54.9
CSF3 and CSF5                 771       69.5               59.1
CSF3 and CSF6                 752       67.8               60.6
CSF4 and CSF5                 894       62.9               68.5
CSF4 and CSF6                 859       60.5               69.2
CSF5 and CSF6                 843       64.6               67.9
common in all 6 CSFs*         546       20.9               NA
common in at least 5 CSFs     734       28.1               NA
common in at least 4 CSFs     947       36.2               NA
common in at least 3 CSFs     1188      45.4               NA

*among 2615 proteins
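Each pairwise row of Table 3.2 expresses the shared count as a percentage of sample A and of sample B. The sketch below reproduces the CSF1 and CSF2 row from the counts in Tables 3.1 and 3.2; the function name is an assumption for illustration.

```python
def overlap_percentages(n_a, n_b, n_shared):
    """Return the shared count as a percentage of A and of B (one decimal)."""
    return round(100 * n_shared / n_a, 1), round(100 * n_shared / n_b, 1)


# CSF1 (1200 proteins) and CSF2 (1282 proteins) share 859 proteins.
print(overlap_percentages(1200, 1282, 859))  # (71.6, 67.0)
```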


Figure 3.2: Venn diagram of proteins identified in 6 individual CSF samples. Total number of identified proteins in all samples was 2,615 with 546 (21%) common proteins for all 6 samples. Number of individual proteins ranged from 1,109 to 1,421.


Table 3.3: Overlap of peptides in individual samples

*among 12443 peptides

Overlapped CSF A and B        Number   Percentage of A   Percentage of B
CSF1 and CSF2                 5696     81.6              74.7
CSF1 and CSF3                 4982     71.4              79.4
CSF1 and CSF4                 5502     78.8              63.7
CSF1 and CSF5                 5367     76.9              74.0
CSF1 and CSF6                 5003     71.7              74.1
CSF2 and CSF3                 5180     67.9              82.6
CSF2 and CSF4                 6181     81.0              71.6
CSF2 and CSF5                 5748     75.3              79.2
CSF2 and CSF6                 5276     69.2              78.1
CSF3 and CSF4                 5102     81.3              59.1
CSF3 and CSF5                 4988     79.5              68.8
CSF3 and CSF6                 4633     73.9              68.6
CSF4 and CSF5                 5815     67.4              80.2
CSF4 and CSF6                 5415     62.7              80.2
CSF5 and CSF6                 5199     71.7              77.0
common in all 6 CSFs*         3625     29.1              NA
common in at least 5 CSFs     4937     39.7              NA
common in at least 4 CSFs     6138     49.3              NA
common in at least 3 CSFs     7423     59.7              NA


Figure 3.3: Venn diagram of peptides identified in 6 individual CSF samples. The total number of identified peptides in all samples was 12,443, with 3,625 (29%) peptides common to all 6 samples. The number of peptides per sample ranged from 6,272 to 8,632.

3.3.2 Identification of brain-related proteins in the CSF proteome

According to our analysis, the total number of tissue-enriched and group-enriched proteins with HPA evidence of high mRNA expression in the brain was 318 and 226, respectively. Of those, 196 tissue-enriched and 138 group-enriched proteins were secreted and/or membrane proteins. Mass spectrometry protein evidence was further assessed in the generated hippocampal proteome (Chapter 2). Out of 196 brain-enriched proteins, 57 were found at the protein level in the hippocampal proteome, as were 29 out of 138 group-enriched proteins (Appendix 3.1). We then examined our CSF proteome for the presence of the 196 tissue-enriched and 138 group-enriched proteins (Figure 3.1).

Less than 30% of brain-enriched (33 proteins) or group-enriched proteins (24 proteins) were found in all six CSF replicates. Additional proteins were found in at least 4 or 5 of the 6 replicate samples. In total, 78 brain-related proteins (secreted or membrane-bound) were found in the CSF of at least 4 different individuals. Appendix 3.2 contains the list of all proteins with their relative abundance in CSF based on average area (AA), average number of unique peptides, RNA tissue-specific score (RNA TS), IHC evidence based on HPA and mass spectrometry evidence based on their presence in the hippocampal proteome (Chapter 2). According to these experimental data, the tissue-enriched proteins with the highest abundance in CSF were amyloid-like protein 1, APLP1 (AA=1.07x10^10), with an average of 9 unique peptides identified, followed by secretogranin-3, SCG3 (AA=7.98x10^9 and 24 unique peptides). Of the HPA proteins identified in CSF, V-set and transmembrane domain-containing protein 2B, VSTM2B (RNA TS=108) and neurocan core protein, NCAN (RNA TS=60) had the highest RNA TS. Among the group-enriched proteins, the most abundant were kallikrein-6, KLK6 (AA=1.61x10^10; 14 unique peptides) and secreted phosphoprotein 1/osteopontin, SPP1 (AA=1.32x10^10; 10 unique peptides). Within this group, neurexophilin-1, NXPH1 (RNA TS=44) and contactin-associated protein-like 5, CNTNAP5 (RNA TS=16) had the highest RNA TS. Figure 3.4 shows tissue-enriched and group-enriched candidates and their abundance in CSF. In addition, KLK6 was validated at the protein level in brain tissues and a CSF pool using an SRM assay. These findings, together with the methods used, are reported in Appendix 3.3 and Appendix 3.4.


Figure 3.4: CSF brain tissue-enriched and group-enriched proteins and their relative abundance. A) 45 brain-enriched and B) 33 group-enriched proteins were detected in at least 4 out of 6 CSF samples, and the average MS1 area was used as a proxy of protein abundance. Abundance is indicated for representative protein isoform. Shaded bars show proteins that are detected in all 6 samples.


To compare the relative abundance (based on MS1 area) of the 78 selected proteins with that of the complete CSF proteome, we plotted the MS1 areas of the candidate proteins over the MS1 areas of all identified proteins (Figure 3.5). Most of the 78 proteins were positioned in the middle and upper range of the relative abundance of the complete CSF proteome. This distribution suggests that the abundance of most of the 78 proteins is medium to high relative to the CSF proteome and that they should therefore be measurable by SRM assays in CSF samples. Knowledge of protein abundances is important to predict whether proteins can be quantified in clinical samples using SRM assays, as we previously demonstrated for testis-specific proteins in seminal plasma [102]. The most represented GO molecular functions of the 78 proteins were binding (35% of proteins) and receptor activity (33% of proteins), as shown in Figure 3.6.

Figure 3.5: Relative abundance of CSF proteome and 78 protein candidates. Shaded dots show 78 protein candidates over the complete proteome. Most of the selected 78 brain-specific proteins were positioned in the range of medium- and upper-abundance proteins of the CSF proteome.


Figure 3.6: Gene Ontology (GO) analysis of 78 protein candidates. The most represented GO molecular functions were binding (35% of proteins) and receptor activity (33% of proteins).

3.3.3 Cell type-specific brain-related proteins in the CSF proteome

Given that HPA also contains data on IHC staining of proteins in several brain regions (hippocampus, lateral ventricle, cortex and cerebellum) and cell types, we analyzed the CSF proteins in order to identify brain region- and cell type-specific proteins. Since some CNS diseases originate in specific regions [23, 230] or cell types [230], measurement of CSF proteins with specific expression in the corresponding regions or cell types may pinpoint the pathological process with high diagnostic sensitivity. Proteins with staining specific for a single cell type are shown in Table 3.4. The neuron-specific proteins were neurosecretory protein VGF, receptor-type tyrosine-protein phosphatase-like N and neurexophilin-1; the neuropil-specific proteins were neurocan core protein, tenascin-R and cell adhesion molecule 3; and the protein with staining specific for Purkinje cells was transmembrane protein 132D. Immunohistochemical images of these proteins can be found at the HPA website (http://www.proteinatlas.org).


Table 3.4: Representative cell type-specific brain-expressed proteins according to the Human Protein Atlas immunohistochemistry data.

Ctx- cerebral cortex; Hp-Hippocampus; LV- lateral ventricles; Cb- cerebellum.

Cell Type       Gene Name   Staining level   Brain region
Neuron          VGF         medium           Ctx, Hp, LV
Neuron          PTPRN       medium           Hp
Neuron          PTPRN       low              Ctx
Neuron          NXPH1       low              Hp
Neuropil        NCAN        medium           Ctx
Neuropil        TNR         medium           Ctx
Neuropil        CADM3       medium           Ctx
Purkinje cell   TMEM132D    low              Cb

3.4 Discussion

The prime goal of this study was to generate a comprehensive proteome of normal CSF samples and to define the brain-related proteins identified in it. In order to obtain in-depth proteome coverage of normal CSF and allow for identification of low-abundance proteins, we performed off-line SCX fractionation of individual CSF samples, followed by LC-MS/MS analysis. The Q Exactive Plus mass spectrometer provided high resolution, high mass accuracy, wide dynamic range and excellent sensitivity and, together with the pre-fractionation strategy, facilitated identification of an extensive CSF proteome. With a total of 2,615 identified proteins, this study provides additional information about the CSF proteome when compared to previous proteomic studies [223, 225, 226, 231-233]. Recent studies of CSF identified a similar number of proteins, utilizing different separation methodologies and mass spectrometry-based proteomics [118, 223, 225, 226].

We compared our CSF proteome to the CSF proteome identified by Guldbrandsen et al., with 3,081 protein sets or 2,875 protein groups reported (available from: http://probe.uib.no/csf-pr), and to that of Zhang et al., with 2,513 proteins reported with at least two unique peptides. When CSF proteins from both studies were compared against our proteome, the combined CSF proteome consisted of 4,649 proteins for Guldbrandsen et al. plus our proteome and 4,346 proteins for Zhang et al. plus our proteome. Overall, the combined CSF proteome for all three studies consisted of 5,133 proteins. The number of overlapping proteins between the Guldbrandsen study and ours was 819 (18% of the combined proteomes, 31% of our proteome), with 2,034 proteins detected only in the Guldbrandsen study and 1,796 only in the present study. Similarly, the number of overlapping proteins between Zhang et al. and our study was 782 (18% of the combined proteomes, 30% of our proteome), with 1,731 proteins detected only in the Zhang study and 1,833 only in the present study. In addition, the number of unique proteins, identified only in this study, was 1,764. These discrepancies in CSF proteins are partially due to the different proteomic workflows and other technical differences. For example, Guldbrandsen et al. used three different separation approaches (immuno-depletion, SDS-PAGE, MM RP-AX, glycoprotein enrichment) while we used a single (SCX) strategy. However, inter-individual variation of CSF composition seems to be the major factor, since only 21% of our identified proteins were common to all 6 samples. The fact that numerous unique proteins were identified by the different groups indicates a need for more studies in order to obtain a complete picture of the CSF proteome. In addition, pre-analytical variables should be standardized to allow for reliable and comparable proteomic research.


It should also be noted that availability of high-quality clinical samples represents a recognized issue in the field of biomarker discovery [98]. Most of the previous studies employed pools of CSF samples for protein identification. Here, we analyzed individual samples in order to obtain complete CSF proteome and to evaluate inter-individual reproducibility. The biological reproducibility among six individual samples indicated that only 21% of the proteins were common to all samples, which led us to the conclusion that the inter-individual heterogeneity was an important contributor to variation of CSF proteins, as also observed in previous studies [223, 234]. Some of the inter-individual differences could be explained by sex and age differences [231, 235]. Sample size for this comparison is relatively small and sex differences should be further examined. Although the samples in this study cover a wide age range, the age influence on CSF proteome composition was not within the scope of this study.

CSF analysis can be affected by several pre-analytical parameters, such as variability of sample collection tubes, stability, sample storage and other parameters, which should be standardized [176, 236]. One of the common pre-analytical parameters that can affect the CSF protein composition is blood contamination, possibly introduced during the lumbar puncture procedure. Protein concentration in CSF is much lower than in blood (approximately 150 times lower). Therefore, even a small blood contamination can significantly increase the protein amount in the CSF and have an impact on qualitative and quantitative analysis of the CSF proteome. In order to ensure the quality of the CSF samples in this study, visual and biochemical analyses were performed and samples with no visible blood contamination or xanthochromia were selected. We also sought to determine the contribution of plasma proteins to our CSF proteome. The database of 1,050 plasma proteins generated by Guldbrandsen et al. (http://probe.uib.no/csf-pr) was utilized. The number of proteins common to CSF and blood plasma was 415, indicating that 2,200 proteins were unique to the CSF.


CSF communicates closely with brain tissue, and as such it can be considered an ideal specimen for biomarker discovery in CNS diseases and for basic neuroscience research. Thus, the next goal of the study was to create a signature of highly specific brain-derived proteins identified in our CSF proteome. The HPA-based brain-specific proteome (defined here as the combination of tissue-enriched and group-enriched proteins from HPA) was utilized for candidate selection. Only proteins of secreted or membrane origin were considered. A list of 78 brain-specific proteins found in at least 4 out of 6 of our CSF proteomes was generated (Figure 3.4 and Appendix 3.2). Overall, 57 (52%) of the brain-related proteins (identified in the CSF proteome) were present in all six individual proteomes, 67 (61%) in at least 5 proteomes, 78 (72%) in at least 4 proteomes, 85 (78%) in at least 3 proteomes and 95 (87%) in at least 2 proteomes (data not shown). In addition, 95% and 96% of the candidates were found in the proteomes of Zhang et al. and Guldbrandsen et al., respectively. We intend to develop highly accurate and specific SRM assays for their quantification in AD, to determine their potential as diagnostic biomarkers. For some of the candidates, no commercial antibodies have been developed, resulting in limited information about their distribution and concentration in brain tissues or CSF (for example, the VSTM2B protein, previously linked to the pathogenesis of ataxia telangiectasia [237]). Furthermore, highly specific brain proteins identified in this study could reveal new pathways or disease mechanisms and lead to the discovery of novel therapeutic targets. However, some possible limitations of the biomarker discovery approach utilized in this study should be considered. In a disease state, due to neuronal cell degeneration, some intracellular proteins could be released into the extracellular space or secreted into the CSF. Any immune cells recruited to the lesion may also secrete proteins into the CSF, although these would not be considered brain-specific. These proteins would thus remain undetected by our study.


Notably, some of the proteins found in CSF have been previously linked to neurodegenerative diseases. For example, APLP1 is a membrane-bound glycoprotein associated with synaptic function and a member of a highly conserved gene family, together with amyloid precursor protein (APP) and amyloid precursor-like protein 2 (APLP2). Several studies have shown co-localization of APLP1 with APP in control subjects and in Alzheimer’s disease brain plaques [238, 239]. In addition, APLP1 is one of the substrates of BACE1, an enzyme involved in Alzheimer’s disease pathology [240]. Finally, a recent study also suggests that APLP1 has significance as a potential biomarker of Parkinson’s disease progression [241].

SCG3, part of the family involved in secretory granule biogenesis and neurotransmitter storage and transport, can accumulate in the senile plaques of Alzheimer’s disease patients [242]. It has also been reported in the context of Parkinson’s disease, in an in vitro model, where SH-SY5Y cell exposure to the neurotoxin paraquat resulted in decreased SCG3 expression levels [243]. SCG3 and SCG2 were previously evaluated as potential biomarkers of multiple sclerosis, and decreased levels of both were observed in serum and CSF samples of multiple sclerosis patients [244, 245].

Kallikrein 6 (KLK6) was the most abundant of the group-enriched proteins. KLK6 belongs to a 15-member family of secreted serine proteases with trypsin-like activity. Among all tissues in the body, KLK6 has the highest expression in the central nervous system, and high amounts of KLK6 are present in the CSF [246-248]. It has been suggested that KLK6 may process APP and in this way contribute to AD pathology [249, 250]. Several other studies found decreased levels of KLK6 in AD brain regions (e.g. parietal and frontal cortex) [251, 252]. These findings were previously confirmed by our group at the protein level, indicating lower KLK6 levels in AD brain tissue extracts [248, 253]. Studies of KLK6 levels in the CSF are still limited and conflicting, showing both low and high levels of KLK6 in AD CSF samples [248, 254]. Recent findings revealed that α-synuclein, a protein involved in the pathology of Parkinson’s disease, is also a potential KLK6 substrate [255-257]. Even more intriguing is the finding that overexpression of KLK6 in an α-synuclein transgenic mouse model leads to clearance of α-synuclein, suggesting a potential therapeutic application [258]. In contrast, elevated levels of KLK6 have been observed in multiple sclerosis patients, and its role in the disease pathology has been related to an immuno-inflammatory pathway, particularly through activation of PAR receptors, key triggers of inflammatory processes [259-261]. Here, we evaluated the KLK6 protein level in brain tissue extracts and in a pool of CSF samples as a complement to the mRNA expression data from HPA (KLK6 immunohistochemistry is not available from the HPA). These findings confirmed its abundance in the CSF, as well as in the brain tissue extracts, where significantly different levels were observed between brain regions (Appendix 3.3).

Another group-enriched protein connected with neurodegenerative diseases and observed with high abundance is the glycosylated phosphoprotein SPP1. A recent study revealed its potential as a diagnostic biomarker of Parkinson’s disease [241]. SPP1 protein expression was found in neurons, Lewy bodies and microglia of the substantia nigra region in Parkinson’s disease, in pyramidal neurons of the hippocampus in AD (with increased levels relative to age-matched controls) and in astrocytes within plaques and white matter of multiple sclerosis patients (with increased levels relative to controls) [262-264]. SPP1 levels in CSF were also elevated in AD and mild cognitive impairment [265], and in multiple sclerosis patients [262, 266].

In conclusion, the present study contributes to the existing knowledge of the human CSF proteome and, in addition, provides a panel of highly specific brain-derived proteins that can be robustly measured in CSF by mass spectrometry assays. In the future, we intend to develop quantitative SRM assays for the selected 78 proteins and use them as a signature biomarker panel for evaluation of AD.


Chapter 4

Parts of this chapter are submitted in:

Begcevic I#, Brinc D, Dukic L, Simundic AM, Zavoreo I, Basic-Kes V, Martinez-Morillo E, Batruch I, Drabovich AP, Diamandis EP. Targeted mass spectrometry-based assays for relative quantification of thirty brain-related proteins and their clinical applications. J Proteome Res 2017.

# Performed all experiments and data analysis. Graphical representation of data was conducted in collaboration with DB.


Chapter 4

4 Development of targeted mass spectrometry assay for relative quantification of brain-related proteins

4.1 Introduction

Cerebrospinal fluid (CSF) has been the sample of choice in the quest for novel biomarkers of numerous neurological disorders. Given its direct contact with the brain parenchyma, CSF may reflect pathological changes in the brain and serve as a source of brain-related proteins. The CSF matrix is relatively simpler than plasma, with a much lower total protein concentration; however, most of the abundant proteins found in plasma are also present in the highest abundance in CSF (e.g. albumin) [133]. Still, the concentration of brain-derived proteins is generally much higher in CSF than in plasma, while plasma-derived proteins have a lower concentration in CSF than in blood [145]. Brain-derived proteins are not necessarily brain-specific, since they can be produced by other tissues. Yet, some of the brain-derived proteins have high specificity for cerebral tissue and certain cell types (e.g. neuronal and glial proteins: S100B, NSE, tau) and have been associated with several neurological pathologies [145]. For example, elevated CSF levels of the proteins S100B and NSE were found in stroke patients and in Creutzfeldt-Jakob disease, while tau proteins (total and phosphorylated tau) are known to be elevated in Alzheimer’s disease (AD) [4, 145].

In the search for disease-associated indicators, brain-specific proteins can be of significant interest as novel candidate biomarkers for various neurological disorders [212]. Previously, we have reliably identified brain-related proteins, highly specific to the brain tissue, in the proteome of non-pathological CSF samples (Chapter 3) [267]. Development of a quantitative, multiplex assay for these proteins would be a valuable tool for testing their potential as biomarkers. Similar efforts have been made by our group for the discovery of tissue-specific biomarkers using targeted proteomics for central nervous system (CNS) and other disorders [212, 213].

Biomarker discovery is dependent on the application of high-throughput technologies for discovery and subsequent verification and validation of proteins abnormally regulated in disease. Selected reaction monitoring (SRM) is a targeted mass spectrometry platform used in proteomics for relative and absolute quantification of proteins in complex biological samples. Peptide and protein quantification is based on the prior selection of the targeted precursor ion and its fragmentation pattern, suitable for an optimal assay with high specificity and sensitivity [114]. The ability to quantify a large number of analytes (even hundreds of proteins) simultaneously in complex samples makes this technology a common method of choice for the biomarker verification phase [98, 123]. Although the development of a high-quality targeted assay requires significant effort, it is still time- and cost-efficient compared to the development of an immunoassay such as ELISA.

The principal aim of the present chapter is therefore to develop multiplex mass spectrometry-based SRM assays for relative quantification of brain-related, highly specific proteins, previously identified in non-pathological CSF samples (Chapter 3). The developed assay will then be used to evaluate candidate proteins in a set of AD patients (Chapter 5).

4.2 Methods

4.2.1 Selection of brain-related proteins as biomarker candidates

Brain-related proteins were selected using the Human Protein Atlas (HPA) database (version 13) of tissue-specific proteins (http://www.proteinatlas.org) and our comprehensive, in-house developed CSF proteome, as previously published [267] and described in Chapter 3. Briefly, brain-enriched and group-enriched proteins, secreted and/or of membrane origin (HPA source), reproducibly detected in individual CSF proteomes were further selected as candidate proteins for SRM assay development (n=78).

4.2.2 Selection of peptides for candidate proteins

Peptides and transitions for the 78 candidate proteins were initially selected from the SRM Atlas database (www.srmatlas.org) and from our previously developed and published SRM assays [212]. Initially, between 1 and 7 unique proteotypic peptides per protein were selected (7 to 25 amino acids in length), with a maximum of 7 transitions per precursor peptide (b- and y-ions). Peptides containing C-terminal cysteine or glutamine were excluded and peptides with methionine in the sequence were preferentially avoided. In addition, peptides with +1 and +4 charge were omitted, as were peptides with possible tryptic missed cleavage at the N- or C-terminus. Peptides were searched with the Basic Local Alignment Search Tool (BLAST, www..org/blast) to confirm peptide uniqueness for the targeted proteins.
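The sequence-level selection rules listed above can be expressed as a simple filter. The following is a minimal Python sketch under stated assumptions: the function names and example sequences are hypothetical, the terminal-residue rule is implemented exactly as worded in the text, and the missed-cleavage check and the BLAST uniqueness search are not covered.

```python
def passes_peptide_filters(sequence, charge, min_len=7, max_len=25):
    """Apply the hard exclusion rules from the text to a single peptide."""
    if not (min_len <= len(sequence) <= max_len):
        return False
    # C-terminal cysteine or glutamine excluded, as worded in the text.
    if sequence[-1] in ("C", "Q"):
        return False
    # Precursor charge states +1 and +4 omitted.
    if charge in (1, 4):
        return False
    return True


def rank_candidates(peptides):
    """Keep peptides that pass the hard filters; list methionine-free ones first.

    Methionine-containing peptides are only deprioritized, not removed,
    because the text says they were "preferentially avoided".
    """
    kept = [(seq, z) for seq, z in peptides if passes_peptide_filters(seq, z)]
    return sorted(kept, key=lambda p: ("M" in p[0], p[0]))


# Hypothetical sequences for illustration only.
candidates = [
    ("AMLESIK", 2),      # contains methionine: kept, ranked last
    ("LSDVAPGQK", 2),    # passes all filters
    ("SHORT", 2),        # too short
    ("ELVISLIVESK", 4),  # excluded charge state
    ("AAAAAAAQ", 2),     # C-terminal glutamine
]
ranked = rank_candidates(candidates)
```

Keeping the hard exclusions separate from the soft methionine preference mirrors the wording of the criteria; a stricter filter could instead drop methionine-containing peptides whenever an alternative exists for the same protein.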

4.2.3 Identification of peptides and selection of transitions in CSF

Sets of peptide transitions for 78 candidate proteins selected under 4.2.2 (above) were further evaluated to establish a single injection multiplex SRM assay. First, retention times (RT) for each peptide were predicted using two prediction methods.

In the first approach, RT was predicted using 9 endogenous CSF peptides with an elution profile covering most of the 60-min liquid chromatography (LC) gradient. The endogenous peptides were selected from our CSF proteome based on high protein abundance (area under the curve) and their RT, as observed on the Q Exactive Plus mass spectrometer. Peptides with methionine and cysteine or C-terminal glutamine were avoided, if possible. The 60-min gradient was applied on the TSQ Quantiva (Thermo Scientific, San Jose, USA), a triple quadrupole mass spectrometer, and a CSF pooled sample digest for RT prediction was run in triplicate. The experimentally observed RTs were correlated with the RTs of the same peptides obtained from the SRM Atlas database. In the second approach, RT was predicted using the elution profile of a commercial heavy peptide set (15 synthetic peptides, heavy-labeled at lysine or arginine; Pierce Peptide Retention Time Calibration Mixture, Thermo Scientific). Briefly, 100 fmol of the Pierce peptide solution was spiked into the CSF pool digest. The same 60-min LC gradient was applied on the TSQ Quantiva instrument and the CSF pool digest was run in triplicate. The experimentally observed RT was correlated with the hydrophobicity of the peptide sequence (calculated with SSRCalc 3.0, 300 Å, in Skyline software, version 3.1).

Predicted RTs were calculated for all peptides under evaluation and scheduled SRM assays were prepared (avoiding peptide elution overlap). RT windows were set ± 3 min from predicted elution time. Unscheduled methods were prepared for peptides for which predicted RT was outside of LC gradient.
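Both prediction approaches amount to fitting a calibration line between a reference value (a library RT or a hydrophobicity index) and the RT observed on the instrument, and then placing a scheduling window around each predicted RT. The sketch below illustrates this in Python with an ordinary least-squares fit. It is not the Skyline implementation; the function names and calibrant values are hypothetical, and clipping the window to the gradient limits is an assumption made for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def schedule(reference_value, slope, intercept, half_window=3.0, gradient_min=60.0):
    """Predict RT and return a (start, end) window in minutes.

    Returns None when the predicted RT falls outside the gradient, in which
    case the peptide would be a candidate for an unscheduled method.
    """
    rt = slope * reference_value + intercept
    if rt < 0 or rt > gradient_min:
        return None
    # Assumption: windows are clipped to the gradient limits.
    return (max(0.0, rt - half_window), min(gradient_min, rt + half_window))


# Hypothetical calibrants: reference values and observed RTs (minutes).
reference = [10.0, 20.0, 30.0, 40.0]
observed_rt = [15.0, 25.0, 35.0, 45.0]
slope, intercept = fit_line(reference, observed_rt)
window = schedule(22.0, slope, intercept)
```

With the ±3 min half-window stated in the text, a peptide predicted at 27 min would be monitored from 24 to 30 min.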

For peptides positively identified in our in-house developed CSF proteome, the RT observed with the discovery Q Exactive Plus instrument (RT extracted with Proteome Discoverer software, Thermo Scientific) was compared with the RT observed with the TSQ Quantiva (both with a 60-min LC gradient) and was used as one of the criteria to confirm peptide identification.

Peptides were then selected over several rounds of peptide evaluation, based on the following criteria: i) absence of interferences, ii) observed coelution of transitions, and iii) pattern and order of transitions (compared to the available SRM Atlas data). Peptides containing proline were preferred, since they generate a high-intensity signal.


Peptide identity was further confirmed using heavy peptides. Lyophilized heavy-labeled peptides (JPT Peptide Technology, Berlin, Germany) were reconstituted in 100 µL of 20% ACN (Fisher Scientific) in 0.1 M ammonium bicarbonate (Fisher Scientific), aliquoted and stored at -20 ºC. Heavy-labeled peptides were combined in equal amounts to create a heavy peptide pool for further analysis. The identity of light (endogenous) peptides was confirmed using heavy-labeled peptides, based on the same observed RT and on the identical order and relative ratios of transitions in both light and heavy peptides. Transitions for the final SRM method were then selected; transitions with the highest intensities were preferred.

4.2.4 Mass spectrometry sample preparation

For SRM assay development phase, two different pools of non-pathological CSF samples were prepared (combining 5 to 6 individual samples), stored at -80 ºC and used for assay evaluation. Individual CSF samples were retrospectively collected as leftovers after routine biochemical examinations at Mount Sinai Hospital, Toronto. Ethics approval was obtained from the Mount Sinai Hospital Research Ethics Board (REB 15-0265-E).

Aliquots of CSF pools were thawed and centrifuged for 10 min at 17,000 g. Volumes corresponding to 10-15 µg of total protein were denatured with 0.05% RapiGest (Waters, Milford, USA) and reduced with 5 mM dithiothreitol (Sigma-Aldrich, Oakville, Canada) at 60 ºC for 40 min. Alkylation was achieved with 15 mM iodoacetamide (Sigma-Aldrich, Oakville, Canada) for 60 min in the dark at 22 ºC. Protein digestion was carried out with trypsin (Sigma-Aldrich, Oakville, Canada) in 50 mM ammonium bicarbonate (1:30 trypsin to total protein ratio) for 24 hours at 37 ºC. Heavy peptides were then spiked into the digest, followed by addition of 1% trifluoroacetic acid (TFA, Fisher Scientific). Samples were then centrifuged at 1,000 g for 30 min and supernatants were retained. Peptides were purified using OMIX C18 tips, eluted in 5/4.5 µL of acetonitrile solution (65% acetonitrile, 0.1% formic acid) and finally diluted with 60/54 µL of water-formic acid (0.01% formic acid).

4.2.5 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

Peptides were analysed using a triple quadrupole mass spectrometer, TSQ Quantiva. Each sample was injected (18 μL) into an in-house packed 3.3 cm pre-column (5 μm C18 particle, column inner diameter 150 μm) followed by a 15 cm analytical column (3 μm C18 particle, inner diameter 75 μm, tip diameter 8 μm). The liquid chromatography system, EASY-nLC 1000 (Thermo Fisher, Odense, Denmark), was coupled online to the TSQ Quantiva mass spectrometer with a nanoelectrospray ionization source. The 60-min LC gradient was applied with an increasing percentage of buffer B (0.1% formic acid in acetonitrile) for peptide elution at a flow rate of 300 nL/min. Collision energy (CE) for each precursor peptide was tested over the range of 8 V (2 steps, ±2 V from nonoptimized CE). The optimized CE for each peptide was selected based on the highest peak area observed. Additionally, dwell time and RT windows were adjusted to ensure a minimum of 15 points per LC peak. The SRM assay parameters were thus set up as follows: positive-ion mode, optimized collision energy values, adjusted dwell time, 0.2/0.7 Th Q1 resolution of full width at half-maximum and 0.7 Th Q3 resolution. Raw data were uploaded and analyzed with Skyline software (University of Washington).
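The trade-off between dwell time, the number of concurrently monitored transitions and the number of points per LC peak can be illustrated with a short calculation. This is a simplified sketch, not the instrument vendor's scheduling algorithm: it assumes that the cycle time equals the number of concurrent transitions multiplied by the dwell time and ignores inter-scan overhead. The function names and the numerical values (peak width, transition count) are hypothetical.

```python
def points_per_peak(peak_width_s, n_concurrent_transitions, dwell_time_ms):
    """Approximate number of data points across one chromatographic peak."""
    cycle_time_s = n_concurrent_transitions * dwell_time_ms / 1000.0
    return peak_width_s / cycle_time_s


def max_dwell_time_ms(peak_width_s, n_concurrent_transitions, min_points=15):
    """Longest dwell time (ms) that still gives min_points across the peak."""
    return 1000.0 * peak_width_s / (min_points * n_concurrent_transitions)


# Hypothetical values: 30 s wide peak, 100 transitions monitored concurrently.
n_points = points_per_peak(30.0, 100, 10.0)
dwell_limit = max_dwell_time_ms(30.0, 100)
```

Narrowing the RT windows reduces the number of transitions monitored at any one time, which in this approximation allows a longer dwell time for the same number of points per peak.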

During the SRM development phase, 60-min LC gradients were used. In the interest of reducing instrument run time for clinical sample analysis, the gradient was shortened from 60 to 37 min; this gradient was used for further analysis. All LC peaks were manually inspected to ensure that sufficient points per LC peak were achieved with the modified gradient.


4.2.6 Linearity

A pool of CSF samples was prepared and a volume corresponding to 15 µg of total protein was aliquoted for each point of linearity. Digestion was carried out as previously described. The heavy-labeled peptide pool solution was prepared as defined before and used for linearity analysis. A serial dilution of the heavy-labeled peptide pool was spiked into aliquots of CSF pool digests with a constant concentration of endogenous (light) peptide (4000, 2000, 1000, 500, 250, 125, 62.5, 31.25, 15.62, 7.81, 3.9, 1.95, 0.98, 0.48, 0.24, 0.12, 0.06 and 0.03 fmol of heavy peptides per injection). All samples were analysed in triplicate and run on the instrument from lowest to highest concentration. The coefficient of variation (CV) was calculated for all points within the linear range.
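The dilution series listed above corresponds to 18 two-fold dilutions starting at 4000 fmol, and the CV at each point is the standard deviation of the triplicate measurements divided by their mean. A minimal Python sketch follows; the function names are hypothetical and the replicate values in the example are invented for illustration, not taken from the study.

```python
from statistics import mean, stdev


def serial_dilution(top_amount, n_points, factor=2.0):
    """Amounts for an n-point serial dilution, from highest to lowest."""
    return [top_amount / factor ** i for i in range(n_points)]


def cv_percent(replicates):
    """Coefficient of variation (%) using the sample standard deviation."""
    return 100.0 * stdev(replicates) / mean(replicates)


# 18 two-fold points from 4000 fmol down to about 0.03 fmol per injection.
amounts = serial_dilution(4000.0, 18)

# Invented triplicate heavy-to-light ratios for one dilution point.
example_cv = cv_percent([9.0, 10.0, 11.0])
```

The text reports the injection order as lowest to highest concentration, so the list would be reversed (for example with `amounts[::-1]`) to obtain the run order.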

Based on the linearity and the observed heavy-to-light ratios, the amount of heavy-labeled peptide was optimized relative to the amount of light peptide observed in the CSF pool so that the ratio of light to heavy peptide was preferably close to 1. The heavy peptide pool with amounts close to the levels of endogenous peptides was prepared and used for the reproducibility, freeze/thaw and carry-over studies and for clinical sample analysis.

4.2.7 Reproducibility

Four aliquots of the CSF pool were prepared for testing the reproducibility of the assay. Aliquots were digested independently, then distributed over multiple wells, run in triplicate (12 injections in total) and analysed over several days.

4.2.8 Freeze-thaw assay and carry-over effect

For the freeze/thaw (F/T) study, a pool of CSF samples (prepared from 9 individual samples, previously kept below -20 ºC) was subjected to four additional F/T cycles (-80 ºC freeze and 20 ºC thaw). After each cycle, an aliquot corresponding to 15 µg of total protein was kept for mass spectrometry sample preparation. All cycles were processed at the same time and run on the instrument in duplicate. For each F/T cycle, the mean value of the L/H ratio was calculated and compared against the baseline (first F/T cycle).

The effect of carry-over was tested under the LC-MS/MS conditions. A CSF pool sample digest was prepared (as described under mass spectrometry sample preparation) and the heavy peptide mix was spiked into the digest. The carry-over effect for endogenous and heavy-labeled peptides was examined using the following pattern of injections: two blank injections, three sample injections, followed by three blank injections (blank: BSA solution in MS Buffer A). The experiment was repeated three times. The carry-over ratio was estimated for endogenous and heavy-labeled peptides using the formula: [AUC (first blank injection) - AUC (last blank injection)] / [AUC (last sample injection) - AUC (last blank injection)], according to the published guidelines and recommendations (AUC = area under the curve) [268, 269]. The result was presented as percentage of carry-over effect.
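The carry-over formula above can be sketched in a few lines of Python. This is an illustration only, not the code used in the study: the function name and the AUC values are hypothetical, and it assumes that "first blank" means the first blank injected after the sample injections and "last blank" the final blank of the sequence, which the text does not spell out.

```python
def carry_over_percent(auc_first_blank, auc_last_blank, auc_last_sample):
    """Carry-over (%) = (first blank - last blank) / (last sample - last blank) * 100.

    Assumption: the first blank is the blank run directly after the samples
    and the last blank is the final blank of the injection sequence.
    """
    denominator = auc_last_sample - auc_last_blank
    if denominator <= 0:
        raise ValueError("last sample AUC must exceed last blank AUC")
    return 100.0 * (auc_first_blank - auc_last_blank) / denominator


# Hypothetical AUC values (arbitrary units), for illustration only.
example = carry_over_percent(
    auc_first_blank=2500.0, auc_last_blank=500.0, auc_last_sample=100500.0
)
print(f"carry-over: {example:.1f}%")
```

A negative result simply means the first blank had a smaller AUC than the last blank, i.e. no measurable carry-over above background.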

4.2.9 Data analysis

The raw files were uploaded to Skyline software (version 3.5.0.9319, University of Washington), which was used for peak integration and quantification of the AUC as well as light to heavy peptide ratios (AUC_light/AUC_heavy). For relative quantification, the average AUC_light/AUC_heavy was multiplied by the amount of the heavy labeled peptide spiked into the samples to calculate the relative amount of endogenous light peptide, taking into consideration the volume of CSF used. For peptides with methionine in the sequence, AUC_light/AUC_heavy was calculated from the sum of AUCs for the mono-oxidized form (due to methionine oxidation during sample processing) and the non-oxidized form of the peptide, for both light and heavy labeled peptides, as indicated: AUC (oxidized + non-oxidized)_light / AUC (oxidized + non-oxidized)_heavy. SRM data were manually evaluated and samples with poor integration and unreliable quantification were excluded. Linearity was assessed (linear regression, sigmoidal curve, coefficient of variation profile) using R statistical and graphics software (www.Rproject.org).
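As a minimal sketch of the relative quantification described above, the following Python snippet sums the AUCs of all monitored peptide forms before taking the light/heavy ratio and then scales by the heavy spike. The function names and numbers are hypothetical, and normalizing by CSF volume with a simple division is an assumption about what "taking into consideration the volume of CSF used" means.

```python
def light_to_heavy_ratio(light_aucs, heavy_aucs):
    """L/H ratio from summed AUCs of all monitored forms of one peptide.

    For methionine-containing peptides, pass the AUCs of the oxidized and
    non-oxidized forms; for other peptides, pass a single-element list.
    """
    heavy_total = sum(heavy_aucs)
    if heavy_total <= 0:
        raise ValueError("summed heavy AUC must be positive")
    return sum(light_aucs) / heavy_total


def relative_amount_per_ul(mean_ratio, heavy_spike_fmol, csf_volume_ul):
    """Relative endogenous amount (fmol per µL CSF), assuming simple scaling."""
    return mean_ratio * heavy_spike_fmol / csf_volume_ul


# Hypothetical values: [oxidized, non-oxidized] AUCs for light and heavy forms.
ratio = light_to_heavy_ratio([3000.0, 1000.0], [6000.0, 2000.0])
amount = relative_amount_per_ul(ratio, heavy_spike_fmol=100.0, csf_volume_ul=20.0)
print(ratio, amount)
```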

4.3 Results

4.3.1 Identification of proteotypic peptides in CSF for SRM assay development

Peptide identification in CSF for SRM assay development was a multistep process with several rounds of peptide evaluation. RT prediction algorithms were used for initial peptide assessment. Experimentally observed peptides’ RT was compared with predicted RT; peptides consistently detected around the specific RT within the scheduled RT window during the evaluation steps were selected. Additionally, if peptides were identified in our discovery phase, their observed RT (with SRM assay) was compared with discovery RT and was used as confirmation of peptides’ identity (Figure 4.1). Apart from RT, the following criteria were applied for the final endogenous peptide selection: i) absence of interference, ii) transitions’ coelution, iii) transition’s pattern and order.


Figure 4.1: Identification of endogenous peptides for SRM assay development: peptide example. (A) Predicted retention time (RT) based on RT observed from SRM Atlas, SSRCalc 3.0 and the discovery experiment. (B) Co-elution of 4 transitions (with the same order) from endogenous (light) and isotopically labeled standard (heavy) peptides. Both peptides elute at the same time. Heavy peptide is spiked in at around the same level as the endogenous peptide.

Initially, 377 proteotypic peptides and 2591 transitions, representing 78 candidate proteins, were selected from SRM Atlas for SRM assay development. Predicted RTs for targeted peptides were calculated based on linear regression equations from the two RT prediction algorithms (first: RT of synthetic peptides vs. hydrophobicity index; second: RT of endogenous peptides vs. their RT from SRM Atlas) and used to schedule the SRM assay. The coefficient of determination (r²) obtained for both RT prediction algorithms was 0.98 (Appendix 4.1-4.3).

Peptides were monitored over a six-minute RT window (predicted RT ± 3 min). In total, 48 scheduled (approximately 100 transitions per method) and 6 unscheduled (approximately 45 transitions per method) SRM methods were initially prepared.
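The scheduling logic described above (a linear regression used to predict RT, then a window of predicted RT ± 3 min) can be illustrated with a short Python sketch. The calibration values, function names and the use of ordinary least squares are assumptions made for illustration; they are not the study's actual calibration data or software.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def scheduled_window(predicted_rt, half_width=3.0):
    """Six-minute acquisition window: predicted RT +/- 3 min by default."""
    return (predicted_rt - half_width, predicted_rt + half_width)


# Hypothetical calibration: hydrophobicity index vs observed RT (min).
hydrophobicity = [10.0, 20.0, 30.0, 40.0]
observed_rt = [8.0, 13.0, 18.0, 23.0]
slope, intercept = fit_line(hydrophobicity, observed_rt)

# Predict the RT of a new peptide with hydrophobicity index 25 and schedule it.
predicted = slope * 25.0 + intercept
window = scheduled_window(predicted)
print(predicted, window)
```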


Light peptides were then evaluated at three consecutive rounds in two different CSF pools. If multiple peptides per protein were available after several rounds of evaluation, the peptides were analyzed in serially diluted CSF and the peptide with the higher intensity transitions observed at the lower points of the serial dilution was selected. In addition, for peptides with possible +3 charge, both charge forms of the peptides were evaluated (+2 and +3) and the peptide with the more intense signal for a given charge was selected. After several rounds of peptide evaluation, 47 peptides, with 3-5 transitions per peptide, were considered for final assessment based on heavy-labeled synthetic peptides.

Light peptides were then evaluated based on the observed co-elution of transitions for both heavy and light peptides, identical order of transitions’ intensities, equivalent transition intensity ratios and RT alignment (Figure 4.1). Out of 47 peptides, 17 endogenous peptides were not confirmed using heavy-peptides in the CSF pool tested, while 30 peptides were identified and confirmed (Table 4.1, Appendix 4.4). For the final method with 30 peptides, two to four transitions per peptide were selected; three transitions were retained for most of the peptides, two for the peptides with interferences in other transitions (or for some very high abundance proteins) and four transitions were reserved for the lower abundance peptides, preferably.

Table 4.1: Proteins and peptides of the developed method

Accession (UniProt) | Gene name | Protein name | Peptide sequence | Precursor m/z
P51693 | APLP1 | Amyloid-like protein 1 | DELAPAGTGVSR | 586.80
Q96GW7 | BCAN | Brevican core protein | FNVYCFR | 503.23
Q8N3J6 | CADM2 | Cell adhesion molecule 2 | SDDGVAVICR | 546.26
Q8IUK8 | CBLN2 | Cerebellin-2 | VAFSATR | 376.21
Q96KN2 | CNDP1 | Beta-Ala-His dipeptidase | ALEQDLPVNIK | 620.35
Q02246 | CNTN2 | Contactin-2 | VTVTPDGTLIIR | 642.88
Q8NFT8 | DNER | Delta and Notch-like epidermal growth factor-related receptor | VTATGFQQCSLIDGR | 826.91
Q92876 | KLK6 | Kallikrein-6 | LSELIQPLPLER | 704.41
O14594 | NCAN | Neurocan core protein | TGFPSPAER | 481.24
O95502 | NPTXR | Neuronal pentraxin receptor | VAQLPLSLK | 484.81
Q92823 | NRCAM | Neuronal cell adhesion molecule | VFNTPEGVPSAPSSLK | 815.43
Q99784 | OLFM1 | Noelin-1 | LTGISDPVTVK | 565.33
Q14982 | OPCML | Opioid-binding protein/cell adhesion molecule | ITVNYPPYISK | 647.86
P23471 | PTPRZ1 | Receptor-type tyrosine-protein phosphatase zeta | AIIDGVESVSR | 573.31
P13521 | SCG2 | Secretogranin-2 | ALEYIENLR | 560.80
Q9BYH1 | SEZ6L | Seizure 6-like protein | ETGTPIWTSR | 574.29
P10451 | SPP1 | Osteopontin | AIPVAQDLNAPSDWDSR | 927.95
O15240 | VGF | Neurosecretory protein VGF | FGEGVSSPK | 454.23
Q9NT99 | LRRC4B | Leucine-rich repeat-containing protein 4B | DLAEVPASIPVNTR | 741.40
Q8N126 | CADM3 | Cell adhesion molecule 3 | LLLHCEGR | 333.18
Q8WXD2 | SCG3 | Secretogranin-3 | LLNLGLITESQAHTLEDEVAEVLQK | 921.83
Q15818 | NPTX1 | Neuronal pentraxin-1 | FQLTFPLR | 511.30
Q9P0K9 | FRRS1L | DOMON domain-containing protein FRRS1L | HDIDSPPASER | 612.29
Q96PX8 | SLITRK1 | SLIT and NTRK-like protein 1 | LSNVQELFLR | 609.85
P61278 | SST | Somatostatin | SANSNPAMAPR | 558.27
Q16653 | MOG | Myelin-oligodendrocyte glycoprotein | FSDEGGFTCFFR | 735.31
P01303 | NPY | Pro-neuropeptide Y | ESTENVPR | 466.23
Q86UN3 | RTN4RL2 | Reticulon-4 receptor-like 2 | LFLQNNLIR | 565.84
Q99574 | SERPINI1 | Neuroserpin | ALGITEIFIK | 552.84
O60241 | BAI2 | Brain-specific angiogenesis inhibitor 2 | LLAPAALAFR | 521.82

In addition to the developed SRM assay for 30 candidate proteins, a protein predicted not to change in the disease state (extracellular matrix protein 1, ECM1) was included as a negative control. The list of peptides and transitions of the developed assay can be found in Appendix 4.5.

4.3.2 Linearity

Assay linearity for all peptides was assessed using CSF samples spiked with different levels of heavy peptides covering a range of five orders of magnitude (0.03 to 4000 fmol/injection). Data were inspected manually and outliers in three replicate measurements were excluded if an aberrant chromatogram was observed. A minimum of two transitions per protein was used for peptide quantification, for both endogenous and isotope-labeled peptides.

The developed assays showed a wide dynamic range, with an overall median fold span of 8.16 × 10³. All SRM assays showed good linearity, with coefficient of determination r² > 0.987 for most of the peptides; two lower abundance peptides representing OLFM1 and SLITRK1 proteins showed good, but somewhat lower r² values in comparison to other proteins (r² = 0.961 and 0.974, respectively).


Coefficients of variation (CV) were calculated for each L/H ratio across the established linear range for each peptide. All points within the linear range had CV ≤ 20% for all peptides; the lowest observed median CV was 0.7% (PTPRZ1 protein) and the highest was 9.6% (OLFM1 protein). Linear range, correlation coefficients and median CV values for all peptides are summarized in Appendix 4.6.
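The CV criterion above can be illustrated with a short Python sketch that computes the CV of replicate ratios at each dilution point and keeps the points with CV ≤ 20%. The replicate values and function names are hypothetical; the study itself established the linear range in R from linear regression, sigmoidal curve and CV profile together, so this covers only the CV part.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation (%) using the sample standard deviation."""
    return 100.0 * stdev(values) / mean(values)


def points_within_cv(replicates_by_amount, max_cv=20.0):
    """Return the spiked amounts whose replicate ratios have CV <= max_cv."""
    return sorted(
        amount
        for amount, ratios in replicates_by_amount.items()
        if cv_percent(ratios) <= max_cv
    )


# Hypothetical triplicate ratios keyed by heavy peptide amount (fmol/injection).
replicates = {
    0.03: [0.5, 1.0, 1.8],      # noisy point near the detection limit
    1.0: [0.98, 1.00, 1.02],
    100.0: [99.0, 100.0, 101.0],
}
accepted = points_within_cv(replicates)
print(accepted, round(cv_percent(replicates[1.0]), 2))
```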

The amount of heavy-labeled peptide spiked into the CSF samples should result in an L/H ratio within the linear range, to allow reliable quantification. The amount of heavy peptide for further sample analysis was chosen to obtain an L/H ratio close to one. This was possible for most of the peptides. The signal for some heavy labeled peptides was not reliable for quantification at the level of the endogenous peptides, and for these peptides higher spikes were selected (heavy/light between 11 and 44 for 10 peptides), also within the linear range.

4.3.3 Reproducibility

The reproducibility of the analytical process was tested over several days (n=8), using aliquots of the same CSF pool. CVs for all peptides were below 20%. The precision for all analysed replicates (total CV) derived from four reproducibility samples and for all peptides was below 10%, except for the protein FRRS1L with a total CV of 17.0%. Overall, CVs for all peptides ranged from 2.0% (SEZ6L) to 17.0% (FRRS1L), with a median CV of 4.0%. Higher CVs were mostly observed for the lower abundance proteins and for proteins represented by very hydrophobic peptides (derived from SCG3 protein). Total reproducibility for all peptides is shown in Appendix 4.7.


4.3.4 Freeze-thaw assay and carry-over effect

Repeated freeze-thaw cycles were performed using a CSF sample pool. Comparison of mean values of peptide abundance (L/H ratio) for all F/T cycles against the first F/T cycle as a baseline, showed that the levels of all peptides were not affected by 5 F/T cycles (data not shown).

The carry-over effect on the LC-MS/MS instrument was tested using samples with optimized L/H ratios as described earlier. The carry-over for all candidate peptides, both endogenous and heavy-labeled, was below 2.5% in all three experiments performed, apart from the peptide HDIDSPPASER (corresponding to protein FRRS1L), with an observed carry-over of 4.0% in one out of three experiments (experiment 2: -0.6% and experiment 3: -2.2%). Average carry-over for all peptides (endogenous and heavy-labeled) was below 2% (Appendix 4.8).

4.4 Discussion

In this study we aimed to develop a multiplex SRM assay for a panel of 30 brain-related and highly specific CSF proteins. The assay shows good linearity (r² ≥ 0.961), a wide dynamic range (median fold range across peptides = 8.16 × 10³), and acceptable reproducibility (CV < 20%). The SRM assay will be used to evaluate the candidates’ diagnostic potential in AD.

Targeted, multiplexed proteomic platforms have been utilized for evaluation of candidate biomarkers in CSF of various CNS disorders. To our knowledge, SRM assays for 19 of our candidates have been previously reported by other groups for quantification in CSF, using either different peptides or the same peptides reported here [212, 241, 270-273]. For 11 of our candidates, this study is the first report of an SRM assay for quantification in CSF (e.g. SST, SEZ6L, SLITRK1, RTN4RL2, BAI2).

The main advantages that make SRM the method of choice for verification of novel biomarkers are high specificity, good sensitivity, relatively fast method development and multiplex capacity. By employing SRM-based quantification, even hundreds of peptides/proteins have been successfully measured in an unfractionated biological sample [274].

While immunoassays, such as ELISA, can suffer from lack of specificity, SRM assays provide highly selective and specific measurements of analytes in complex samples.

Limitations of SRM assays are often related to poor sensitivity for very low abundance proteins and the use of heavy-labeled standards. From the initial 78 candidates for SRM assay development, we were able to develop assays for 30 proteins. This can be explained partially by the limited sensitivity for peptide detection in a complex protein background; initial identification of the 78 proteins was performed after decreasing CSF complexity (SCX fractionation) [267]. Some brain-related proteins may have a very low concentration in CSF and may not be consistently detected with SRM assays. Furthermore, it is possible that for some proteins, the selected peptides were not optimal for development of a targeted assay. Assays developed in this study do not require enrichment methods and are suitable for relatively fast sample preparation; however, for the very low abundance proteins, sensitivity can be improved with the immuno-SRM methodology. This can be achieved by applying an anti-protein or anti-peptide antibody as an enrichment strategy prior to SRM analysis. For instance, Anderson et al. described a peptide-based immunoaffinity enrichment named stable-isotope standards and capture by anti-peptide antibody (SISCAPA) and demonstrated approximately 120-fold enrichment of selected peptides in human plasma prior to quantification with MS [275]. This method was subsequently enhanced with automated and multiplexed abilities (up to 9 peptides), enabling peptide detection in the low picogram per milliliter range, after extraction from a larger sample volume (1 mL) [128]. Likewise, protein-based antibody enrichment has also been utilized prior to SRM analysis, again showing an impressive improvement in peptide detection [276].

Nevertheless, while immuno-based strategies coupled to SRM provide improved sensitivity, they require high-quality antibodies and commonly involve longer assay development time and higher cost.

SRM protein quantification relies on the measurement of proteotypic peptides unique to the selected proteins. Thus the first step towards SRM assay development is the selection of “representative” or proteotypic peptides. Peptides can be initially selected based on experimental data generated in-house, from publicly available peptide repositories (e.g. SRM Atlas) or the scientific literature, or generated in silico (e.g. using software to generate tryptic peptides). Moreover, peptides with the highest MS intensities, good ionization properties and an m/z ratio within the range of the instrument are preferred (typical peptide length is 8-25 amino acids) [277]. Additionally, peptides with possible tryptic missed cleavages should be avoided, as well as peptides with N-terminal amino acids prone to chemical modifications, such as glutamine and cysteine, which can modify the peptide’s m/z and RT; methionine can also be modified, but usually to a smaller degree. Considering that part of the endogenous peptide (and heavy-labeled peptide) can be modified, it is important to determine the extent of modified over unmodified peptide when absolute quantification is performed [124]. Lastly, peptides should ideally not contain post-translational modifications.

Heavy isotope-labeled peptide standards are used to ensure correct detection and accurate relative or absolute quantification. In this study relative quantification was performed since spiked heavy-labeled peptides were of unknown absolute amount and the tryptic digestion was assumed to be complete. Absolute SRM quantification can be achieved with stable isotope-labeled peptides of known amount (typically containing a trypsin-cleavable tag) [124], using quantification concatemers (QconCAT) [125] and heavy-labeled whole protein standards (protein standard absolute quantification, PSAQ) [126].

In summary, the field of biomarker research is constantly expanding and various sets of candidate proteins are being tested as potential biomarkers of AD. Our approach was to focus on proteins that are highly specific for brain tissue and develop a multiplex SRM assay intended for their quantification in CSF. Utilizing targeted mass spectrometry abilities, we were able to develop highly reliable SRM assays for simultaneous relative quantification of 30 proteins in human CSF samples. This assay was subsequently used in our preliminary verification study of AD patients (Chapter 5).


Chapter 5

Parts of this chapter are in manuscript preparation:

Begcevic I#, Brinc D, Brown M, Martinez-Morillo E, Goldhardt O, Grimmer T, Magdolen V, Batruch I, Diamandis EP. Brain-related proteins as potential CSF biomarkers of Alzheimer’s disease: a targeted mass spectrometry approach.

Begcevic I, Brinc D, Brown M, Martinez-Morillo E, Tsolaki M, Lazarou I, Batruch I, Diamandis EP. Brain-related proteins as CSF indicators of different Alzheimer’s disease stages.

# Performed all experiments and data analysis. Statistical analysis was performed in collaboration with DB and MB.


Chapter 5

5 Verification of brain-related proteins as potential diagnostic biomarkers of Alzheimer’s disease

5.1 Introduction

Alzheimer’s disease (AD) is a progressive neurodegenerative disease and the most common type of dementia, mainly affecting people after age 65. The pathological hallmarks of AD are the extracellular deposits known as amyloid β (Aβ) plaques, composed of Aβ-aggregates, and intracellular neurofibrillary tangles, consisting of hyperphosphorylated protein tau (p-tau) fibrils [16, 17]. Amyloid β fragments are produced by cleavage of amyloid precursor protein (APP) by the membrane-associated enzymes β-secretase and the enzyme complex γ-secretase. Both Aβ and tau aggregates cause impaired synaptic plasticity and neuronal cell death [18].

Currently, the diagnosis of AD is based on clinical criteria, including medical history, mental status testing and a physical and neurological exam. With such tests, only probable AD can be diagnosed [2]. According to the newly revised diagnostic and research criteria of the National Institute on Aging (NIA) and the Alzheimer’s Association, there are three stages of AD: preclinical stage, mild cognitive impairment (MCI) due to AD, and dementia due to AD. The preclinical stage describes individuals with existing early brain pathology who show no symptoms of memory and cognitive decline, while individuals with MCI due to AD are prodromal, with mild symptoms related to disease pathology. Patients with dementia due to AD have memory, thinking and behavioural symptoms that significantly interfere with their everyday activities and are accompanied by more severe pathological brain changes. The stage of dementia is overall characterized by a spectrum of clinical symptoms. These symptoms typically appear gradually, indicating different levels of dementia: mild dementia (or early stage), moderate (or middle stage) and severe dementia (or late stage) [10].

Cerebrospinal fluid (CSF) composition can reflect pathological changes of the brain. Currently, the most established CSF biomarkers for AD are amyloid β 1-42 fragment (Aβ1-42) and tau protein (total tau (t-tau) and p-tau) [19]. These core CSF biomarkers reflect the main AD hallmarks: Aβ1-42 peptide is a marker of Aβ plaque formation, while t-tau and p-tau are biomarkers of neuronal injury and degeneration [4]. It is known that AD patients have decreased CSF levels of Aβ1-42 and increased levels of t-tau and p-tau, compared with healthy controls. Even though these CSF biomarkers have been extensively studied, they are not widely used in the clinic, largely due to the lack of method standardisation, and are mostly utilized in research settings, as suggested by the new AD diagnostic criteria [2].

It has been outlined previously that an ideal AD biomarker should reflect pathological changes in the brain with accurate performance (differentiating AD from other forms of dementia) and with specificity and sensitivity >80% [151]. In general, biomarkers can have several utilities, such as diagnosis, screening, prognosis and monitoring disease progression; however, the main interest in the application of AD biomarkers lies in early AD diagnosis and their use in clinical trials (e.g. for patient enrichment and/or for monitoring drug efficacy and target engagement).

Current biomarkers were inconsistent in tracking the treatment efficacy of Aβ immunotherapy in clinical trials, with a lack of correlation with cognitive function [77, 211]. Moreover, their diagnostic accuracy still has to be further validated due to the wide variations across studies, especially for identifying incipient AD in MCI cases [19]. The lower diagnostic accuracy is also related to differential diagnosis of other forms of dementia, mostly due to the common overlap of pathologies that can co-exist with AD pathology (such as vascular and Lewy body pathologies). Nevertheless, defining the underlying pathophysiological mechanisms is important for further understanding of disease pathology and for identifying new biomarkers and disease-modifying treatments. For these reasons there is still a need for novel biomarkers which can facilitate an early and accurate diagnosis, predict disease progression and cognitive decline, improve understanding of neuropathological changes in AD and be used in clinical trials [18].

A number of studies have already examined the CSF proteome [118, 223, 225]. In our recent study, we identified a set of brain-specific proteins that can be consistently detected in the normal CSF proteome (Chapter 3) [267]. We have further developed mass spectrometry-based assays for quantification of highly specific brain proteins, suitable for biomarker studies (Chapter 4).

Genetically, the strongest risk factor for developing AD is the ε4 allele of the apolipoprotein E (APOE) gene. The human APOE gene contains three polymorphic alleles: ε2, ε3 and ε4, with frequencies of 8, 78 and 14%, respectively [48]. The frequency of the ε4 allele increases to ~40% in AD patients and is associated with earlier age of AD onset and accelerated Aβ pathology, compared to non-carriers. Indeed, amyloid plaques are more abundant in ε4 carriers, who show lower CSF concentration of Aβ1-42, as well as enhanced Pittsburgh compound B (PiB) binding to Aβ aggregates on PET imaging [48]. Moreover, it has been suggested that APOE affects Aβ clearance, aggregation and deposition in an isoform-dependent manner [58]. Additionally, APOE4 may contribute to AD pathology in an Aβ-independent fashion, influencing synaptic plasticity, cholesterol homeostasis, neurovascular function and neuroinflammation [48].


The main goal of the present study is to use mass spectrometry-based selected reaction monitoring (SRM) assays to evaluate the diagnostic potential of 30 brain-related proteins in cognitively healthy individuals, patients with a diagnosis of mild cognitive impairment (MCI) and AD, as well as in patients with different degrees of severity of AD dementia. In this exploratory study, candidate biomarkers were assessed for differential expression, diagnostic performance (single and multivariate protein panel) and correlation with cognitive tests. Furthermore, since APOE ε4 is the main risk factor for developing AD, and is associated with disease pathology, another aim was to assess whether protein abundance changes with the APOE phenotype and whether biomarker performance is affected by the APOE phenotype.

5.2 Methods

5.2.1 Cerebrospinal fluid samples

5.2.1.1 Cohort 1: cognitively normal individuals, MCI, AD

Fifty-three CSF samples were retrospectively collected at the Centre for Cognitive Disorders, an out-patient clinic of the Department of Psychiatry and Psychotherapy of the Technical University of Munich. Patient characteristics are shown in Table 5.1.

Individuals with suspected cognitive decline had undergone a standardized diagnostic procedure with detailed physical, neurologic and psychiatric examination, as well as extensive neuropsychological testing. Patients with a final diagnosis of AD and MCI were included in the study; one patient with a diagnosis of posterior cortical atrophy due to AD and one with subtle cognitive impairment were excluded from the study. Clinical diagnoses of dementia due to AD and MCI due to AD were made based on standard diagnostic criteria [2, 67]. Cognitively healthy individuals without report of any memory deficits and suspected or confirmed diagnosis of cancer were recruited from urological surgeries with spinal anesthesia at the Department of Anesthesiology to represent “normal” controls. These participants signed independent consent for biobanking of cognitively healthy controls. CSF samples from control donors were inspected for AD biomarkers; only samples with non-pathological values of AD biomarkers were included in the study. All participants signed written informed consent for biobanking, approved by the local ethics committee. The following clinical parameters were collected for all participants: age, sex, and scores of cognitive examination: mini-mental state examination (MMSE) and clinical dementia rating scale (CDR).

All samples were collected by lumbar puncture, inspected microscopically for blood contamination (RBC count), centrifuged and stored at -80 °C in polypropylene tubes. Collected CSF samples were shipped on dry ice to the Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada and stored below -20 °C until processing. Ethics approval was obtained from the Mount Sinai Hospital Research Ethics Board for usage of these samples.

CSF samples utilized as quality controls for the assays were retrospectively retrieved as leftover samples archived after routine biochemical examinations at Mount Sinai Hospital, Toronto. Use of these samples for SRM assay development and evaluation was approved by the Mount Sinai Hospital Research Ethics Board.

5.2.1.2 Cohort 2: MCI, mild, moderate and severe AD

One hundred and one CSF samples were retrospectively collected at the Memory and dementia clinic of the 3rd Department of Neurology, G.H "G.Papanikolaou", School of Medicine, Aristotle University of Thessaloniki, Greece and from the Day Centers of the Greek Association of Alzheimer’s Disease and Related Disorders (GAARD), Thessaloniki, Greece. A summary of patient characteristics is given in Table 5.2.


Patients suspected of having AD were examined by a specialist neuropsychiatrist and the diagnosis was made based on the NINCDS/ADRDA criteria for probable AD [278]. Disease severity was determined based on the MMSE and CDR scores and patients were categorized as having mild (MMSE=20-26, CDR=1), moderate (MMSE=10-19, CDR=2) or severe (MMSE=0-9, CDR=3) dementia. Diagnosis of MCI was based on the description by Petersen [279], which is nearly equivalent to the NIA-AA criteria for MCI due to AD [67]. The study was approved by the GAARD scientific and ethics committee.
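The severity cut-offs stated above can be written as a small lookup in Python. This is only a sketch of the published thresholds: the function name is hypothetical, and returning `None` when MMSE and CDR fall in different categories is an assumption, since the text does not say how discordant scores were handled.

```python
def dementia_severity(mmse, cdr):
    """Map MMSE and CDR scores to the severity categories used for Cohort 2.

    Returns None when the two scores do not agree on one category
    (assumption: the handling of discordant scores is not described).
    """
    if 20 <= mmse <= 26 and cdr == 1:
        return "mild"
    if 10 <= mmse <= 19 and cdr == 2:
        return "moderate"
    if 0 <= mmse <= 9 and cdr == 3:
        return "severe"
    return None


# Hypothetical patients, for illustration only.
labels = [dementia_severity(22, 1), dementia_severity(15, 2), dementia_severity(5, 3)]
print(labels)
```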

CSF samples were inspected for AD biomarkers (Aβ1-42, t-tau, p-tau) using the Innotest ELISA kit (Fujirebio Europe) [280]. The following clinical parameters were collected for all participants: age, sex, scores of cognitive examination (mini-mental state examination, MMSE) and CSF Aβ1-42, t-tau and p-tau values (if available).

All samples were collected by lumbar puncture, inspected macroscopically for blood contamination, centrifuged and stored at -80 °C in polypropylene tubes. Samples were shipped to the Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada and stored in the same way as Cohort 1.

5.2.2 Multiplex selected reaction monitoring assay

A multiplexed, scheduled SRM assay was developed for 30 brain-related proteins, as described in Chapter 4. A protein found not to change in disease (extracellular matrix protein 1, ECM1) was included as a negative control. Another protein primarily related to demyelinating diseases (myelin basic protein, MBP) was also included; the SRM method for MBP has been described previously [212]. A peptide corresponding to apolipoprotein B (APOB), which is normally not present in CSF, was monitored to further check for blood contamination [124]. For peptides containing methionine (SST protein), both oxidized and non-oxidized forms of the peptide were monitored. Four peptides that represent different APOE phenotypes were also added to the assay, including an APOE peptide for total APOE, as a control. For APOE peptides with methionine and N-terminal cysteine, peptide forms with methionine oxidation and cysteine cyclization were monitored as well. This APOE method has been described in detail elsewhere [124].

5.2.3 Mass spectrometry sample preparation

Clinical samples were subjected to mass spectrometry sample preparation as described in Chapter 4 (section 4.2.4). A mixture of APOE heavy peptides was spiked into samples prior to addition of trypsin, while a mixture of 32 heavy peptides (30 candidates, ECM1, MBP) and the heavy peptide for total APOE was spiked into the digest after trypsinization (24 hours at 37 °C), followed by addition of 1% trifluoroacetic acid. Samples were then purified using OMIX C18 tips, eluted in 4.5 µL of acetonitrile solution (65% acetonitrile, 0.1% formic acid) and finally diluted with 54 µL of water-formic acid (0.01% formic acid).

5.2.4 Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

Peptides were analysed using a triple quadrupole mass spectrometer, TSQ Quantiva, with pre-column, analytical column and coupled liquid chromatography as specified before (Chapter 4, section 4.2.5). A 37-min LC gradient was applied, with an increasing percentage of buffer B (0.1% formic acid in acetonitrile) for peptide elution at a flow rate of 300 nL/min. The SRM assay parameters were set up as follows: positive-ion mode, optimized collision energy values, adjusted dwell time, and Q1 and Q3 resolution of 0.7 Th (full width at half-maximum). LC peaks for all peptides were manually inspected to ensure acquisition of a minimum of 10 points per LC peak. Raw data were uploaded and analyzed with Skyline software (University of Washington).


5.2.5 Quality control

A CSF pool was prepared (from 6 individual CSF samples) in five aliquots corresponding to 15 µg of total protein each. These samples were then used for quality control, to test assay reproducibility during analysis of clinical samples. CSF aliquots were digested independently and simultaneously with clinical samples. In total, 14 replicates for Cohort 1 (four aliquots in three injections and one aliquot in duplicate injection) were analysed over several days during the run of clinical samples (before, in the middle and at the end of the run sequence).

5.2.6 Data analysis

Clinical samples were randomized and run in duplicate. The raw files were uploaded to Skyline software (version 3.5.0.9319), which was used for peak integration and quantification of the area under the curve (AUC). Relative quantification was performed as previously described (Chapter 4, section 4.2.9). For peptides containing methionine in the sequence, AUClight/AUCheavy was calculated as: AUC (oxidized + non-oxidized)light/AUC (oxidized + non-oxidized)heavy. SRM data were manually evaluated and samples with poor integration were excluded. All samples were analysed in a blinded fashion. APOE phenotype was determined as indicated in our previous publication [124]. The STRING database version 10.5 (https://string-db.org/) was used to predict protein-protein interactions (physical and functional) of the top candidate biomarkers [281]. The database was searched by protein name, and protein networks were assessed by confidence and by molecular interaction. The confidence score threshold was set to high confidence (0.700).
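The light-to-heavy ratio for methionine-containing peptides described above can be sketched as follows. The function name, dictionary keys and peak areas are hypothetical placeholders; in the study the areas were integrated in Skyline.

```python
def light_to_heavy_ratio(light_areas, heavy_areas):
    """Sum peak areas over all monitored forms of a peptide and return light/heavy.

    For methionine-containing peptides the forms are the oxidized and
    non-oxidized species; for other peptides a single form is passed.
    """
    light_total = sum(light_areas.values())
    heavy_total = sum(heavy_areas.values())
    if heavy_total <= 0:
        raise ValueError("summed heavy peptide area must be positive")
    return light_total / heavy_total


# Hypothetical integrated peak areas (arbitrary units).
light = {"non_oxidized": 8.0e5, "oxidized": 2.0e5}
heavy = {"non_oxidized": 1.6e6, "oxidized": 4.0e5}
ratio = light_to_heavy_ratio(light, heavy)
print(ratio)
```

Summing both forms before taking the ratio keeps the result insensitive to the extent of methionine oxidation, provided the endogenous and heavy peptides oxidize to a similar degree.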


5.2.7 Statistical analysis

Statistical analysis was performed with R statistical and graphics software (www.Rproject.org). Means, medians, standard deviations, interquartile ranges and coefficients of variation were calculated. Linear regression was used to test for differences in age. For tests involving a dichotomous variable, such as gender, Fisher’s exact test was used. To account for differences in age and gender between groups, multivariate linear regression was performed to test for differences in protein abundance, MMSE and CDR scores across disease stages. Correlation analysis for MMSE, CDR scores and protein abundance was performed using Spearman’s rank correlation test. ROC curves were prepared for the most significant proteins and AUC values with 95% confidence intervals were calculated. AUC values were covariate-adjusted by age or gender when there was a significant association (p<0.05) between a marker and the covariates. A multivariate prediction panel was built using the Least Absolute Shrinkage and Selection Operator (LASSO) method by selecting a fixed number of proteins into the model. P-values for comparisons between groups are reported both non-adjusted and adjusted for multiple comparisons by the Holm method (p<0.05 was considered statistically significant).
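The Holm adjustment mentioned above can be illustrated with a short, self-contained sketch. The study itself used R; this Python version is only an illustration of the step-down procedure, and the p-values in the example are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted p-values in the original order.

    The smallest p-value is multiplied by m, the next by m - 1, and so on;
    a running maximum keeps the adjusted values monotone, and each value
    is capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted p-values for three proteins.
raw_p = [0.01, 0.04, 0.03]
adj_p = holm_adjust(raw_p)
print(adj_p)
```

In this example only the first protein would remain below 0.05 after adjustment, which mirrors how a candidate can be nominally significant yet fail to survive correction.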

5.3 Results

5.3.1 Patients’ characteristics

5.3.1.1 Cohort 1

CSF samples from patients with a diagnosis of MCI or AD and from control patients (cognitively healthy or suffering from unrelated disease) were analyzed for 30 brain-related proteins. In total, 53 individuals were included in the study: 20 patients with a diagnosis of MCI due to AD, 10 patients with dementia due to AD, and 23 cognitively healthy individuals (control group) (Table 5.1). Patients with an AD diagnosis were at a stage of mild (n=7) or moderate (n=3) dementia.

Table 5.1: Patients’ characteristics (Cohort 1)

Group                Control         Mild cognitive impairment    Alzheimer’s disease dementia
Participants, n      23              20                           10
Age (a)              66 (57, 70)     72.5 (69.7, 76)              70.5 (68.2, 73.5)
Age (b)              64.7 (9.2)      72.4 (4.9)                   70.3 (3.9)
Sex-female, n (%)    5 (22%)         10 (50%)                     6 (60%)
MMSE score (a)       30 (29, 30)     26 (24.5, 27)                22 (18.5, 24.75)
MMSE score (b)       29.4 (0.74)     25.7 (1.92)                  21.2 (3.82)
CDR score (a)        0 (0, 0)        0.5 (0.5, 0.5)               1 (0.5, 1)
CDR score (b)        0 (0)           0.471 (0.121)                0.778 (0.264)

(a) Expressed as median (25th, 75th percentile)
(b) Expressed as mean (standard deviation)

The cognitive tests, MMSE and CDR, were significantly different between cognitively healthy individuals, MCI and AD patients, confirming that our samples represent the diagnostic groups of interest and are suitable for biomarker studies (Figure 5.1, Table 5.1). As expected, the highest MMSE score was found in the control group (median score 30), followed by the MCI group (median score 26) and the AD group (median score 22). The lowest CDR score was found in the control group (median score 0), followed by the MCI group (median score 0.5) and the AD group (median score 1).


Figure 5.1: Distribution of cognitive test scores (Cohort 1). Cognitive tests, mini-mental state examination (MMSE) and clinical dementia rating scale (CDR), were compared between cognitively healthy individuals (control group), mild cognitive impairment (MCI) and Alzheimer’s disease (AD) patients. A statistically significant difference in cognitive performance was observed among the three groups based on both tests (p<0.001).

The mean age for MCI and AD patients was 72.4 and 70.3 years, respectively, while the mean age in the control group was 64.7 years. There were 10 females in the MCI group, six in the AD group, whereas five females were among control individuals. Statistically significant differences were found for age (p=0.0023), sex (p=0.0439) as well as MMSE and CDR scores (p<0.001) between the three groups. These data are also shown in Table 5.1.

5.3.1.2 Cohort 2

CSF samples from MCI patients and AD patients with different dementia severity (n=101) were randomized into two sets. In the first set, 8 patients had a diagnosis of MCI, 11 mild dementia, 24 moderate dementia and 15 severe dementia, while in the second set, 6 patients had MCI, 8 mild dementia, 16 moderate dementia and 13 severe dementia (Table 5.2).


Table 5.2: Patients’ characteristics (Cohort 2)

Set 1                Mild cognitive impairment    Mild AD dementia     Moderate AD dementia    Severe AD dementia
Participants, n      8                            11                   24                      15
Age (a)              75 (70.7, 80.5)              71 (68, 76.5)        76.5 (70.7, 78.25)      76 (69.5, 82)
Age (b)              74.5 (7.8)                   71.4 (8.4)           75.7 (6.4)              74.4 (9.3)
Sex-female, n (%)    3 (38)                       3 (27)               13 (54)                 6 (40)
MMSE (a)             28 (26, 29)                  24 (22, 25.5)        19 (16.75, 20)          8 (2.5, 10)
MMSE (b)             27.62 (1.77)                 23.91 (1.7)          18.46 (2.04)            6.53 (4.63)

Set 2                Mild cognitive impairment    Mild AD dementia     Moderate AD dementia    Severe AD dementia
Participants, n      6                            8                    16                      13
Age (a)              68 (60, 74)                  76.5 (71.5, 80.7)    78.5 (74.7, 83.2)       75 (72, 76)
Age (b)              67.6 (9.2)                   76.2 (8.8)           78.1 (6.9)              71.1 (9.0)
Sex-female, n (%)    5 (83)                       3 (38)               6 (38)                  2 (15)
MMSE (a)             27.5 (26.25, 28.75)          24 (22.75, 24)       17.5 (16.75, 19)        7 (2, 10)
MMSE (b)             27.67 (1.63)                 23.62 (1.30)         17.62 (2.19)            6.23 (4.4)

(a) Expressed as median (25th, 75th percentile)
(b) Expressed as mean (standard deviation)

The MMSE cognitive test was significantly different (p<0.001) in both sets between the four groups tested (Figure 5.2, Table 5.2). As expected, MCI patients had the highest MMSE score, followed by mild, moderate, and severe AD. In the first set the mean age for the MCI group was 74.5, for the mild dementia group 71.4, for moderate 75.7 and for severe dementia 74.4. In the second set the mean age for the MCI group was 67.6, for the mild dementia group 76.2, for moderate 78.2 and for severe dementia 71.1. In set 1 there were three females in each of the MCI and mild dementia groups, thirteen in the moderate and six in the severe dementia group, whereas in set 2 there were five females in the MCI, three in the mild, six in the moderate and two in the severe dementia group. There was no difference in age (p=0.514) or gender (p=0.504) in set 1, while statistically significant differences were found for age (p=0.041) and sex (p=0.047) between the four groups in set 2.

Figure 5.2: Distribution of cognitive test scores (Cohort 2). The cognitive test, mini-mental state examination (MMSE), was compared between mild cognitive impairment (MCI) and mild, moderate and severe Alzheimer’s disease (AD) dementia patients. A statistically significant difference in cognitive performance was observed between the four groups in both sets (p<0.001).


Differences in AD CSF biomarkers between MCI, mild, moderate and severe AD were tested in a subset of the participants (total n=101). Overall, 54 participants were tested for Aβ1-42 (MCI: n=10, mild: n=7, moderate: n=23, severe: n=14), 42 for t-tau (MCI: n=9, mild: n=6, moderate: n=16, severe: n=11) and 43 for p-tau (MCI: n=9, mild: n=5, moderate: n=21, severe: n=8). A statistically significant difference between disease groups was observed for Aβ1-42 and t-tau (p<0.05). The distribution of Aβ1-42, t-tau and p-tau is shown in Figure 5.3.

Figure 5.3: Distribution of CSF biomarkers Aβ1-42, t-tau and p-tau. Proteins were tested between MCI, mild, moderate and severe AD. Aβ1-42 and t-tau were significantly different between tested groups (p<0.05).


5.3.2 Candidates’ comparison

A multiplexed, 30-protein SRM assay was utilized to explore the diagnostic potential of brain-related proteins in two AD cohorts. This SRM assay was previously developed for relative quantification of CSF proteins; mass spectrometry parameters of the method can be found in Chapter 4. The complete multiplex assay for clinical samples (30 candidates, ECM1, MBP, APOB, APOE proteins) is presented in Appendix 5.1.

5.3.2.1 Cohort 1

Although the participants in the present cohort were all aged individuals, there was a small, but significant, difference in age among the groups. The number of males and females also differed among MCI, AD and control groups; both variables were therefore included as covariates for protein abundance comparison between the groups. After adjusting for differences in age and gender, 11 proteins showed statistically significant differences among MCI, AD and control groups (p<0.05; Appendix 5.2). These proteins displayed an overall expression pattern of increase in the MCI group, but not in the AD group (Figure 5.4). The observed ratios between MCI and controls were 1.2-1.5, between AD and controls 1.0-1.2 and between MCI and AD 1.2-1.4. However, when correction for multiple testing was taken into account (using the Holm method), only protein APLP1 was significantly different among the tested groups (p=0.012), with elevated levels observed in MCI in comparison to healthy controls (p<0.001; Figure 5.4). The control protein ECM1 did not differ among MCI, AD and control groups (p=0.585). The coefficient of variation (technical and process reproducibility) for all peptides was below 20% (CV<20%). All CV values are indicated in Appendix 5.3.


Figure 5.4: Distribution of candidate protein biomarkers in CSF samples (n=53). Candidate proteins were measured with SRM assay (relative quantification) and compared between controls (n=23), mild cognitive impairment, MCI (n=20) and Alzheimer’s disease (n=10) patient samples. APLP1 protein showed statistically significant difference between controls and MCI groups (p<0.001).


5.3.2.2 Cohort 2

For initial evaluation of 30 biomarker candidates, 101 samples from MCI and different AD severity were randomized into two separate processing/analysis steps in order to have two independent measurements.

Overall, the majority of the proteins showed the same pattern/trend in the first and second set of samples. In the first data set, proteins NPTXR, NPY and VGF were significantly different between the four groups (p=0.014, 0.033, 0.038, respectively) before correction for multiple comparison testing by the Holm method; after correction the significance did not remain. Likewise, these findings were not replicated in the second data set. In addition, in the first data set some proteins showed differential levels when MCI was compared with moderate and severe AD, namely BAI2, ECM1, FRRS1L, NPTXR, NPY, SLITRK1 and VGF (p=0.044, 0.033, 0.042, 0.004, 0.004, 0.048, 0.005, respectively), but most of these findings were not replicated in the second data set. Only protein NPTXR showed a statistically significant difference between MCI and the combined moderate and severe AD groups in both sets (Set 1 p=0.004, Set 2 p=0.039). Still, this significance did not remain after correction by the Holm method. Control protein ECM1 did not differ among MCI, mild, moderate and severe AD (Set 1 p=0.200, Set 2 p=0.926) but differed between MCI and the moderate and severe AD groups (only in Set 1); however, when correction for multiple testing was applied the difference did not remain (p>0.05). Statistical analysis of Cohort 2 is shown in Appendix 5.4. Reproducibility (technical and process reproducibility) was assessed in a similar way as in Cohort 1, with observed CV<20% (data not shown). The distribution of all candidate proteins between different stages is shown in Figure 5.5 and Figure 5.6.


Figure 5.5: Distribution of candidate protein biomarkers in CSF samples, Set 1. Candidate proteins were measured with SRM assay (relative quantification) and compared between MCI (n=8), mild (n=11), moderate (n=24) and severe AD (n=15).


Figure 5.6: Distribution of candidate protein biomarkers in CSF samples, Set 2. Candidate proteins were measured with SRM assay (relative quantification) and compared between MCI (n=6), mild (n=8), moderate (n=16) and severe AD (n=13).


5.3.3 Diagnostic performance

5.3.3.1 Cohort 1

Diagnostic performance was evaluated by calculating the AUC for discrimination between MCI vs. control and AD vs. control groups. The highest AUC values for MCI vs. control were achieved for CNTN2 protein (AUC=0.809, 95% CI: 0.65, 0.92), followed by SPP1 (AUC=0.800, 95% CI: 0.64, 0.94), APLP1 (AUC=0.789, 95% CI: 0.63, 0.91) and MOG proteins (AUC=0.739, 95% CI: 0.58, 0.87). Figure 5.7A shows the ROC curves for individual proteins. The diagnostic performance of these proteins for AD vs. control differentiation was somewhat lower and not significant (AUC=0.670, 0.600, 0.591, 0.626, respectively; ROC curves not shown).
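For readers less familiar with ROC analysis, the AUC of a single marker can be computed as the probability that a randomly chosen case has a higher marker value than a randomly chosen control (the Mann-Whitney formulation). The sketch below assumes unadjusted AUCs and uses hypothetical relative abundances; it does not reproduce the covariate adjustment or confidence intervals reported in this chapter.

```python
def roc_auc(case_values, control_values):
    """Return the ROC AUC as the fraction of case/control pairs in which
    the case has the higher value; ties count as half a pair."""
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical relative abundances of one protein.
mci = [1.4, 1.2, 0.9]
controls = [1.0, 0.8, 1.1]
auc = roc_auc(mci, controls)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.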

Figure 5.7: Receiver-operating characteristic (ROC) curve for best performing candidates (Cohort 1). A) Individual ROC of best protein candidates and multivariate panel proteins (APLP1 and SPP1) for mild cognitive impairment (MCI) vs. control group classification. B) Individual ROC of best protein candidates and multivariate panel proteins (APLP1, SPP1 and CNTN2) for combined disease group (AD+MCI) vs. control group classification.


5.3.3.2 Cohort 2

Diagnostic performance was evaluated by calculating the AUC for discrimination of MCI vs. moderate and severe AD. Based on the performance of candidates in both sets, only NPTXR protein showed significant and reproducible separation between the two groups. In the first set the AUC for NPTXR was 0.799 (95% CI: 0.628, 0.928) and in the second set 0.799 (95% CI: 0.586, 0.960). Figure 5.8 shows the ROC curve for this protein in both sets. Only a small subset of patients had values available for the current AD biomarkers (Aβ1-42, t-tau, p-tau); therefore, in order not to introduce bias due to non-random availability of these values and small sample size, the diagnostic accuracy (ROC curve analysis) of NPTXR was not compared with these biomarkers.

Figure 5.8: Receiver-operating characteristic (ROC) curve for best performing candidate (Cohort 2). ROC curve of NPTXR protein in Set 1 and Set 2; AUC value for Set 1 was 0.799 (95% CI: 0.628, 0.928) and in Set 2 0.799 (95% CI: 0.586, 0.960).


5.3.4 Multivariate analysis (Cohort 1)

The multivariate model was generated for prediction of MCI vs. control and MCI plus AD vs. control classification using the LASSO method. The selected linear predictor model for the multivariate panel (MCI vs. control) was 0.25 × APLP1 + 0.009 × SPP1, with an AUC value of 0.841 (95% CI: 0.734 - 0.949). Figure 5.7A shows the ROC curve of this multivariate panel. A multivariate panel could not be developed for AD vs. healthy controls due to the small sample size of the AD group. For the combination of MCI and AD vs. control group classification, the multivariate linear predictor was 1.344 × APLP1 + 0.866 × CNTN2 + 0.102 × SPP1, and had an AUC of 0.758 (95% CI: 0.655 - 0.861). Figure 5.7B shows ROC curves of this multivariate panel and its individual proteins for comparison.
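The two linear predictors reported above can be evaluated for a sample as a weighted sum of relative protein abundances. In the sketch below the coefficients are those stated in the text, while the function name and the abundance values are hypothetical; any intercept or scaling used in the original model is not reported here and is therefore omitted.

```python
# Coefficients as reported in the text for the two LASSO panels.
MCI_VS_CONTROL = {"APLP1": 0.25, "SPP1": 0.009}
MCI_AD_VS_CONTROL = {"APLP1": 1.344, "CNTN2": 0.866, "SPP1": 0.102}


def linear_predictor(abundances, coefficients):
    """Return the weighted sum of the abundances of the panel proteins."""
    return sum(weight * abundances[protein]
               for protein, weight in coefficients.items())


# Hypothetical relative abundances (light/heavy ratios) for one sample.
sample = {"APLP1": 2.0, "SPP1": 10.0, "CNTN2": 1.0}
score_mci = linear_predictor(sample, MCI_VS_CONTROL)
score_combined = linear_predictor(sample, MCI_AD_VS_CONTROL)
print(score_mci, score_combined)
```

The resulting score is the quantity that would be ranked across subjects when drawing the ROC curve of the panel.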

5.3.5 Correlation of candidate proteins with MMSE and CDR tests

5.3.5.1 Cohort 1

Pairwise Spearman’s rank correlation was used to assess whether protein candidates correlate with the cognitive tests MMSE and CDR. Five proteins showed negative correlation with MMSE score, while three proteins showed positive correlation with CDR score (p<0.05). Proteins APLP1, CNTN2 and SPP1 correlated with both MMSE (Spearman’s rho = -0.40, -0.37, -0.46, respectively) and CDR (Spearman’s rho = 0.35, 0.29, 0.44, respectively), whereas proteins KLK6 and MOG correlated exclusively with MMSE score (Spearman’s rho = -0.28 for both). Figure 5.9 illustrates Spearman’s rank correlation coefficients between candidates and cognitive tests, for pairs significant at the 0.05 level.
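Spearman’s rho, as used above, is the Pearson correlation computed on ranks, with tied values sharing their average rank. The following is a minimal sketch with hypothetical protein abundances and MMSE scores; the helper names are illustrative and the significance test is not included.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Return Spearman's rank correlation coefficient of two equal-length lists."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5


# Hypothetical data: protein abundance rises as the MMSE score falls.
protein = [1.0, 1.2, 1.5, 1.1]
mmse = [30, 27, 22, 29]
rho = spearman_rho(protein, mmse)
print(rho)
```

A negative rho with MMSE together with a positive rho with CDR, as reported for APLP1, CNTN2 and SPP1, is the pattern expected for a marker that increases with cognitive impairment.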


Figure 5.9: Correlation between candidate levels in CSF and cognitive tests (Cohort 1). Spearman’s rank correlation was performed to assess correlation between candidate proteins and mini-mental state examination test scores (MMSE) and clinical dementia rating scale scores (CDR). APLP1, SPP1 and CNTN2 proteins correlated with both cognitive tests.


5.3.5.2 Cohort 2

Pairwise Spearman’s rank correlation was used to assess whether protein candidates correlate with MMSE scores. A few proteins showed positive correlation with MMSE score (data not shown). Spearman’s rank correlation coefficients between candidate levels and the cognitive test (for pairs significant at the 0.05 level) were: 0.21 for BAI2, 0.23 for NCAN, 0.29 for NPY, 0.22 for OPCML, 0.29 for RTN4RL2, 0.26 for SCG2, 0.23 for SEZ6L, 0.25 for SST and 0.32 for VGF.

5.3.6 Distribution of APOE phenotype among groups

5.3.6.1 Cohort 1

APOE phenotype was analyzed using SRM in the AD cohorts. There were 80% APOE ε4 carriers among AD patients, 45% among MCI and 35% among the control group. APOE ε4 homozygous patients were present only in the MCI (n=2) and AD (n=2) groups, as expected. The proportion of ε4 carriers vs. non-carriers differed marginally among MCI, AD and control groups (p=0.05, data not shown). Overall, four APOE phenotypes were identified in all subjects, ε2/ε3, ε3/ε3, ε3/ε4 and ε4/ε4, with no difference in the APOE phenotype frequencies among the tested groups (p=0.069). Frequencies of APOE phenotypes are shown in Table 5.3.

Table 5.3: APOE phenotype distribution (Cohort 1).

APOE phenotype      Cognitively Healthy Control    Mild Cognitive Impairment    Alzheimer's Disease
ε4-carriers (%)     35                             45                           80
ε2/ε3, n            0                              1                            0
ε3/ε3, n            15                             10                           2
ε3/ε4, n            8                              7                            6
ε4/ε4, n            0                              2                            2
Total, n            23                             20                           10


5.3.6.2 Cohort 2

Overall, 31% of patients were APOE ε4 carriers: 14% among MCI, 21% among mild, 30% among moderate and 46% among severe AD dementia. APOE ε4 homozygous patients were present only in the mild (n=2) and severe AD (n=2) groups. There was no significant difference in the distribution of ε4 carriers between patients with different disease severity (p=0.138). Overall, five APOE phenotypes were identified in all subjects, ε2/ε3, ε2/ε4, ε3/ε3, ε3/ε4 and ε4/ε4, with no difference in the APOE phenotype frequencies among tested groups (p=0.160). In set 1 all five APOE phenotypes were present, ε2/ε3 (n=2), ε2/ε4 (n=1), ε3/ε3 (n=38), ε3/ε4 (n=13) and ε4/ε4 (n=4), while in set 2 only three were present: ε2/ε3 (n=3), ε3/ε3 (n=27) and ε3/ε4 (n=13). Frequencies of APOE phenotypes are shown in Table 5.4.

Table 5.4: APOE phenotype distribution (Cohort 2).

APOE phenotype      MCI    mild dementia    moderate dementia    severe dementia    Total
ε4-carriers (%)     14     21               30                   46                 31
ε2/ε3               0      2                2                    1                  5
ε2/ε4               0      0                1                    0                  1
ε3/ε3               12     13               26                   14                 65
ε3/ε4               2      2                11                   11                 26
ε4/ε4               0      2                0                    2                  4
Grand Total         14     19               40                   28                 101
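The ε4-carrier percentages in Table 5.4 follow directly from the phenotype counts, as the short sketch below illustrates. The counts are taken from the table; the variable and function names are illustrative only.

```python
# Phenotype counts per group, as listed in Table 5.4.
PHENOTYPE_COUNTS = {
    "MCI":      {"e2/e3": 0, "e2/e4": 0, "e3/e3": 12, "e3/e4": 2,  "e4/e4": 0},
    "mild":     {"e2/e3": 2, "e2/e4": 0, "e3/e3": 13, "e3/e4": 2,  "e4/e4": 2},
    "moderate": {"e2/e3": 2, "e2/e4": 1, "e3/e3": 26, "e3/e4": 11, "e4/e4": 0},
    "severe":   {"e2/e3": 1, "e2/e4": 0, "e3/e3": 14, "e3/e4": 11, "e4/e4": 2},
}


def e4_carrier_percent(counts):
    """Return the percentage of subjects carrying at least one e4 allele."""
    total = sum(counts.values())
    carriers = sum(n for phenotype, n in counts.items() if "e4" in phenotype)
    return 100.0 * carriers / total


carrier_percent = {group: round(e4_carrier_percent(counts))
                   for group, counts in PHENOTYPE_COUNTS.items()}

# Pool all groups to obtain the overall carrier frequency.
pooled = {}
for counts in PHENOTYPE_COUNTS.values():
    for phenotype, n in counts.items():
        pooled[phenotype] = pooled.get(phenotype, 0) + n
overall_percent = round(e4_carrier_percent(pooled))
print(carrier_percent, overall_percent)
```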

5.3.7 Distribution of candidate proteins among APOE phenotypes

5.3.7.1 Cohort 1

Most of the proteins (n=17) showed significantly different abundances between APOE ε4 homozygous, ε4 heterozygous, and ε4 non-carriers (p<0.05, when not corrected for multiple testing). For individual APOE phenotypes, six of the proteins showed significantly different abundance between ε2/ε3, ε3/ε3, ε3/ε4 and ε4/ε4 phenotypes (p<0.05, when not corrected for multiple testing). Six proteins (MOG, NRCAM, SEZ6L, CNDP1, NPTXR, CADM3) showed significantly elevated levels in ε4/ε4 vs. ε3/ε3 and ε3/ε4 phenotypes, while one protein (MOG) showed elevated levels in ε4/ε4 vs. ε2/ε3 (p<0.05, when not corrected for multiple testing). Appendix 5.5 shows statistical analysis of proteins related to individual APOE phenotypes, while Appendix 5.6 shows statistical analysis of proteins between different APOE ε4 carriers. When correction for multiple testing was applied, no significance remained between protein abundance and APOE phenotype or between proteins and APOE ε4 carrier status (p>0.05).

5.3.7.2 Cohort 2

In both sets of samples, none of the proteins showed a reproducible difference in relative abundance between APOE phenotypes (data not shown). One protein showed modest significance only in the first set (p=0.040, when not adjusted for multiple comparisons). There was no difference in protein levels between phenotypes in set 2 (p>0.05).

5.4 Discussion

The main goal of the present study was to evaluate 30 CSF, brain-related proteins, as biomarkers of AD-associated cognitive impairment. The diagnostic potential of these proteins was assessed in a relatively small cohort of cognitively healthy individuals (control group) and patients with MCI and AD (n=53 in total), in addition to the cohort with MCI patients and patients with different severity of AD dementia (n=101 in total). APLP1 showed the most promise as a potential biomarker of early cognitive decline, since its concentration was elevated at the MCI stage. A few other proteins were also promising (SPP1, CNTN2 and NPTXR).


The well-known CSF AD biomarkers have good diagnostic performance for discriminating AD from cognitively healthy elderly, with approximate sensitivity and specificity of 80-90%. Based on longitudinal studies, diagnostic sensitivity and specificity to determine prodromal AD in MCI patients is typically higher than 75%, with some studies reporting even higher values (sensitivity 83%, specificity 95%) [171, 172, 207]. Still, the performance of these biomarkers is not consistent among the studies and has to be further validated. Likewise, CSF Aβ1-42 and tau lack specificity when used to differentiate AD from other types of dementia, mostly due to the co-existence of non-AD pathologies, which also alter the levels of these biomarkers; p-tau is considered more specific for AD. The drawback for their widespread use in clinical settings is the lack of established method standardization, resulting in method-specific results and cut-off values; access to these biomarkers is also limited [2].

In the quest for novel biomarkers that could reflect pathophysiological mechanisms other than, or associated with, plaque and NFT load, especially at an early stage of disease, and that correlate with disease severity, we focused on brain-enriched proteins as candidates that could mirror disease-associated changes.

In the present study increased levels of APLP1 were found in MCI vs. control group with an observed 1.3-fold difference. Although this difference is relatively small and should be interpreted with caution, a similar fold change has been observed with other biomarker studies in AD [271, 272]. For example, the Wildsmith study found fold-changes ranging from 0.6 to 2 for comparisons between AD, MCI and controls. The Paterson study found fold-changes ranging from 1.3 to 2 for comparisons between AD vs. non-AD subjects, for their most significant markers [271, 272].


Apart from APLP1, some other proteins show a similar pattern (increase in MCI, but not in AD), particularly CNTN2, MOG and SPP1. The best diagnostic performance was observed with a multivariate panel that included APLP1 and SPP1. Therefore, SPP1 may add value in patient classification. In addition, APLP1, SPP1 and CNTN2 showed the most promising association between protein abundance and cognitive decline, with significant correlation with both MMSE and CDR tests. Biomarkers that reflect clinical endpoints are of special importance in clinical trial settings, as they allow monitoring of therapeutic effect and can serve as surrogate endpoints. The current biomarkers lack these characteristics [154].

In the second cohort, only protein NPTXR showed potential in discriminating MCI patients from more advanced AD stages, i.e. moderate and severe AD (based on AUC performance). A trend towards differential abundance of this protein was also observed between these two groups, suggesting NPTXR may be more relevant as a marker of progression; however, the difference did not remain after correction for multiple comparison testing. NPTXR is a transmembrane presynaptic protein, suggested to be involved in activation of both excitatory and inhibitory neurons [282]. It has been reported as a potential prognostic AD biomarker [271], more specific for AD [283]. Moreover, differential abundance of NPTXR has been observed in asymptomatic carriers of AD familial mutations compared with non-carriers, with elevated levels observed in asymptomatic carriers [284].

APLP1, CNTN2 and SPP1 have previously been investigated in the context of AD. APLP1 is a membrane glycoprotein, a member of the highly conserved APP gene family, which also includes two homologous proteins, amyloid precursor protein (APP) and amyloid precursor-like protein-2 (APLP2). While APP and APLP2 are ubiquitously expressed proteins, APLP1 is exclusively expressed in brain tissue [285]. It has been suggested that APLP1 is enzymatically processed in a similar way to APP [240, 286]. Several APLP1-derived peptides have been identified in human CSF; some of these endogenous, Aβ-like peptides have been proposed as surrogate CSF biomarkers of brain Aβ1-42 production [287] and as biomarkers of target engagement (γ-secretase modulators) [288]. Interestingly, all of these APLP1 peptides contain in their sequence the tryptic peptide monitored by SRM in this study to measure APLP1 levels. Contrary to our findings, another targeted proteomic study by Wildsmith et al. [271], measuring the same APLP1 peptide, did not find any difference in APLP1 between MCI and controls; however, their MCI group had only five subjects.

Protein SPP1 is a secreted glyco-phosphoprotein with a broad spectrum of functions. It is involved in inflammatory and anti-apoptotic processes, and acts as a cell adhesion molecule and as a cytokine. It is expressed in several tissues, including the brain, where increased levels were found in pyramidal neurons of the hippocampus in AD patients, compared to control subjects [289]. In human CSF, elevated levels of SPP1 have been found in patients with AD pathology [272, 290], MCI patients who progressed to AD [265, 291], and in individuals carrying mutations related to familial AD (mutations in PSEN1 and APP genes) [284]. In contrast, one mass spectrometry-based study did not observe any difference in AD patients, yet this protein was part of the multiprotein panel for prediction of MCI patients converting to AD [270]. Differences in SPP1 levels have also been observed in other neurodegenerative and neuroinflammatory diseases, such as Lewy body dementia [292], Parkinson’s disease [241] and multiple sclerosis [293], which indicates that this marker could reflect inflammatory brain pathology.

Although not significant after correction for multiple testing, SPP1 was increased in MCI compared to controls (1.5-fold change). The same fold difference was observed in the Paterson study, but between AD and non-AD patients [272]. SPP1 levels have also been associated with cognitive decline in AD, showing a positive correlation with MMSE [290]; in our study, a negative correlation with MMSE was observed. This discrepancy could be due to our inclusion of MCI, AD and control participants, while previous studies examined this correlation in AD patients only. Since controls had lower levels of SPP1 and higher MMSE scores compared to patients, an overall negative correlation was observed in our study. Indeed, a trend of weak positive correlation was observed between SPP1 and MMSE when only MCI and AD patients were considered; however, this was not significant (data not shown).

CNTN2 has not previously been extensively studied as an AD biomarker. Two targeted, quantitative mass spectrometry-based studies did not find any changes in CSF of AD patients [270, 271], and one study found elevated levels in AD in one out of two discovery-based mass spectrometry approaches [283]. Together with SPP1, CNTN2 was one of the proteins in a multiprotein panel designed to predict MCI patients converting to AD [270]. Still, CNTN2 seems to be relevant for AD pathology. CNTN2 is a neuronal cell adhesion molecule and it was defined as a functional APP ligand, leading to γ-secretase-dependent release of the APP intracellular domain, moderating neurogenesis [294]. Furthermore, CNTN2 seems to be a BACE1 substrate; BACE1 enzymatic activity is associated with decreased cell surface and increased secreted CNTN2 form [295]. This protein has also been studied as a potential autoantigen in multiple sclerosis patients [296]; in our previous report (paper under submission) elevated levels have been associated with clinically isolated syndrome (early stage of multiple sclerosis).

The present study thus offers new insights into the biomarker potential of APLP1, SPP1 and CNTN2 proteins in discriminating MCI from healthy controls and in correlating with cognitive decline. If this potential is confirmed in further validation, these proteins may aid patient stratification in research and clinical trial settings and/or monitoring of therapeutic effect on disease severity.


These three proteins do not have any direct or indirect (functional) interactions with each other (STRING database analysis, data not shown); however, other proteins from our 30-protein panel appear to have some interaction with CNTN2. Specifically, NRCAM and PTPRZ1 have been identified as CNTN2 interacting partners (edge confidence score 0.953 and 0.868, respectively). When molecular interactions of the proteins were assessed, molecular binding was associated with NRCAM and CNTN2, while no molecular interaction was assigned between PTPRZ1 and CNTN2. Based on network functional enrichment analysis, significantly enriched biological processes (GO) for NRCAM and CNTN2 are neuronal ion channel clustering (GO: 0045161), axonal fasciculation (GO: 0007413) and neuron maturation (GO: 0042551), while cell adhesion molecules is the enriched KEGG pathway (pathway ID: 4514).

The utility of targeted proteomics in verification of novel candidate biomarkers is recognized [212, 270, 271]. It has several advantages for biomarker verification over immuno-based assays, such as high specificity, time-efficient assay development, multiplexing capability and antibody-free performance. For these reasons we selected SRM methodology as the tool of choice for testing biomarker performance of our candidate proteins. However, future validation studies can be performed with an alternative method (e.g. ELISA) to confirm observed changes.

APOE ε4 is the major risk factor for developing AD, and it has also been associated with AD pathology. The APOE ε4 carriers have earlier onset of AD pathology compared to non-carriers, and have more severe amyloid burden (e.g. lower levels of Aβ1-42 have been observed in both AD and MCI patients who were ε4 carriers) [48]. For this reason, we investigated if levels of our candidate proteins change with APOE phenotype. Six proteins (MOG, NRCAM, SEZ6L, CNDP1, NPTXR, CADM3) showed a different distribution among APOE phenotypes and 17 proteins among APOE ε4 homozygous and heterozygous carriers, vs. non-carriers.


However, only four ε4 homozygous subjects were identified in the cohort, and not all APOE phenotypes were represented (Cohort 1); thus, this type of comparison should be repeated in a cohort with a larger number of subjects and more evenly distributed APOE phenotypes in each group.

In conclusion, novel biomarkers of AD are needed to achieve early and more accurate diagnosis, predict disease progression and cognitive decline and facilitate patient stratification in clinical and therapeutic research settings. We evaluated the performance of 30 candidate brain-related proteins as biomarkers for diagnosis of MCI and AD using targeted, SRM assays. APLP1, SPP1 and CNTN2 showed potential as indicators of disease with best discriminatory performance for cognitively impaired patients (MCI plus AD) as well as MCI vs. control patients, while NPTXR showed potential to discriminate early (MCI) from more advanced AD stages. APLP1 levels were found elevated in MCI patients. The observed findings need to be validated in a larger, independent cohort of MCI and AD patients.

Chapter 6

6 General discussion and future direction

The present dissertation is part of an effort to find reliable biomarkers of AD that would allow early and accurate AD detection or prediction.

Our initial pilot study of AD hippocampal tissues focused mostly on global proteome characterization; AD-exclusive proteins were identified, and a portion of these were found to be present in the literature-based CSF proteome.

Our goal was then to focus on brain tissue-specific proteins present in the CSF. The main hypothesis is that brain-related, highly specific proteins present in CSF can serve as potential AD biomarkers. The approach taken was to detect proteins that can be reliably found in the CSF proteome and then to develop quantitative assays for measuring the candidate proteins in the CSF. The final objective was to validate the candidate biomarkers in CSF samples from different AD cohorts.

In the first step, we used the publicly available HPA database to select highly specific brain proteins, as defined by their mRNA expression, and then focused further on membrane-bound and secreted proteins as candidate biomarkers.

In the second step, the CSF proteome was analyzed, and membrane-bound or secreted brain-specific proteins that could be reliably detected in CSF samples from healthy individuals were selected. A mass spectrometry assay for simultaneous quantification of these candidate biomarkers was developed. The multiplex selected reaction monitoring assay was then used to quantify the potential biomarkers in cohorts of cognitively normal individuals, MCI patients and AD patients with different disease severity (i.e. mild, moderate and severe AD), in order to assess their diagnostic potential. The most promising candidate for differentiating MCI due to AD from cognitively normal individuals was the APLP1 protein; specifically, an increase in APLP1 was observed in the MCI group relative to the control group. The levels decreased again in the AD group. APLP1 also correlated with cognitive decline. This is the first mass spectrometry-based study to report differential abundance of APLP1 in the early stage of AD, i.e. MCI due to AD.

Other candidates, specifically CNTN2 and SPP1, also showed promising diagnostic performance in separating MCI due to AD from cognitively healthy individuals. The multivariate panel of APLP1 and SPP1 proved to be the best discriminator of the MCI due to AD stage vs. controls (based on AUC performance). Lastly, the NPTXR protein showed promise for differentiating the MCI stage from more advanced AD stages (based on AUC performance).
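
To make the AUC criterion concrete, the following minimal sketch computes an AUC as the probability that a randomly chosen patient has a higher marker value than a randomly chosen control (the Mann-Whitney interpretation of the ROC area). The function name `auc` and all concentration values are hypothetical and chosen only for illustration; they are not data from this study, and the sketch is not the statistical software used in the analysis.

```python
def auc(cases, controls):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random case value exceeds a random
    control value, with ties counted as one half.
    """
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical relative abundances of one marker (illustration only).
mci_values = [1.8, 2.1, 2.4, 1.5, 2.9]
control_values = [1.2, 1.6, 1.4, 2.0, 1.1]

auc_value = auc(mci_values, control_values)
print(f"AUC = {auc_value:.2f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups; for a multivariate panel, the same calculation would be applied to a combined score (for example, the predicted probability from a logistic regression model) instead of a single protein level.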

APLP1 is a membrane glycoprotein and a member of the highly conserved APP gene family. In contrast to the other family members, the APP and APLP2 proteins, it is expressed exclusively in brain tissue [285]. APLP1 is involved in synaptic cell adhesion (trans-interaction) and has an important role in preserving synaptic density and basal transmission [297]. Some of the endogenous APLP1 peptides found in CSF have been proposed as surrogate CSF biomarkers of brain Aβ1-42 production [287] and as biomarkers of target engagement upon treatment with γ-secretase modulators [288].

Although not significantly different between the control, MCI and AD groups after correction for multiple comparisons, the SPP1 and CNTN2 proteins are worthy of further investigation. These proteins had good diagnostic performance in discriminating MCI from controls, and their levels showed weak to moderate correlation with cognitive decline. SPP1 is a secreted protein with several functions, acting as a matrix protein and as a cytokine [263]. There is no known direct link between SPP1 and AD pathology; however, elevated levels of this protein have been reported in AD and in MCI patients who progressed to AD dementia [290, 291].

CNTN2 is an axonal cell surface protein with a role in cell adhesion. It is located at the juxtaparanodal region of the nodes of Ranvier where, together with the protein CNTNAP2, it contributes to the clustering of voltage-gated potassium channels [298]. The three proteins, APLP1, SPP1 and CNTN2, do not appear to have direct or indirect (functional) interactions with one another (as assessed by bioinformatic analysis).

Other studies have evaluated APLP1, SPP1 and CNTN2 as AD biomarkers, but they differ from this study in several respects: they assessed a different type of biomarker potential, used different study cohorts (e.g. selected groups for comparison, sample size), monitored different peptides (e.g. CNTN2), or did not assess correlation with cognitive decline (e.g. APLP1, CNTN2). A previous study that assessed APLP1 as a biomarker did not show a difference in protein abundance in the MCI group vs. control. The present study thus offers new insights into the biomarker potential of the APLP1, SPP1 and CNTN2 proteins for discriminating MCI from healthy controls and into their correlation with cognitive decline.

The NPTXR protein showed a trend towards differential abundance between MCI and more severe AD stages (moderate and severe AD); significant performance was observed in differentiating MCI from moderate and severe AD (based on AUC performance), suggesting that it may be more relevant as a marker of progression. This transmembrane presynaptic protein has previously been proposed as a potential prognostic AD biomarker [271] that is more specific for AD (compared to PD) [283].

APOE ε4 is the major risk factor for developing AD, and it has also been associated with AD pathology. Hence, one of the aims was to assess whether the abundance of the candidates changes with the APOE phenotype. Although several proteins showed slightly different concentrations among the identified APOE phenotypes or between APOE ε4 homozygous and heterozygous carriers vs. non-carriers, these differences were not significant after correction for multiple comparisons. This type of comparison should be repeated in cohorts with a larger sample size and equally distributed APOE phenotypes in each group.
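
The effect of correcting for multiple comparisons can be sketched as follows. The correction procedure applied in the study is not restated in this chapter, so the Benjamini-Hochberg false discovery rate adjustment below is an assumption used purely as an example; the function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five proteins (illustration only).
raw_p = [0.004, 0.03, 0.02, 0.60, 0.045]
adj_p = benjamini_hochberg(raw_p)

for raw, adj in zip(raw_p, adj_p):
    print(f"raw p = {raw:.3f} -> adjusted p = {adj:.3f}")
```

In this toy example, a protein with a raw p-value of 0.045 would be nominally significant at the 0.05 level but no longer passes after adjustment, which mirrors the situation described above for the APOE comparisons.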

It should be noted that monitoring brain-enriched proteins as potential novel candidates in neurodegenerative diseases has recently been attempted by another group, although with a different approach and technology (antibody suspension bead array). That study found elevated CSF levels of neurogranin and neuromodulin (GAP-43) in AD, proteins that were not present in our initial list of candidates [299].

Mass spectrometry is a powerful state-of-the-art technology for protein biomarker identification and quantification in various biological samples. The SRM assay is often the method of choice for biomarker quantification because of its high specificity, excellent multiplexing capabilities and applicability to proteins for which immunoassays do not exist. In addition, SRM is independent of matrix effects (such as protein binding to other proteins or protein aggregation) that are known to affect the measurement of many analytes (e.g. CSF Aβ1-42 protein), although the sensitivity of the method is matrix-dependent. SRM is an antibody-free quantitative method, which eliminates potential antibody cross-reactivity as well as batch-to-batch reagent variability, both common concerns with immunoassays.

Introducing mass spectrometry methods for routine measurement of AD biomarkers in clinics is challenging because of the limited availability of mass spectrometry instrumentation and expertise. It may be possible to set up AD testing centres that would employ mass spectrometry technology. Alternatively, immunoassays for the most promising candidates would offer several advantages, such as higher sensitivity, higher throughput and lower cost.

For future studies involving more extensive validation of the candidates in larger cohorts, accurate absolute mass spectrometry-based quantification or immunoassays can be developed.

There are several limitations of the current study. Candidates were selected using predefined criteria that combined the HPA database with non-pathological CSF samples for the final selection. This eliminates candidates that would be found only during pathogenesis or as a response to pathogenesis. For instance, proteins could be overexpressed in a disease state and be detected in pathologic CSF but not in CSF derived from healthy individuals. To include such proteins, the initial survey of candidates should be performed in both disease and non-disease CSF proteomes.

In addition, proteins that are differentially abundant between disease CSF and healthy CSF would also be missed. Preliminary studies seeking AD CSF biomarkers on the basis of differential abundance between AD and control groups have been reported previously [300].

Future studies are needed to examine a more comprehensive set of proteins in AD patients in comparison with controls. For this approach, a relatively novel mass spectrometry-based relative quantification method could be employed: sequential window acquisition of all theoretical ion spectra (SWATH) [301]. This approach allows a higher multiplexing capacity (thousands of proteins) than SRM (hundreds of proteins), with accuracy and reproducibility closely comparable to SRM. The basis of this method is the prior generation of spectral libraries for the samples of interest (diseased and controls), consisting of the peptides' fragment spectra (from all ionized peptides present in the sample) and their RTs. Based on matching of the unique fragment pattern, m/z and RT, peptides in the actual samples can then be identified and further quantified in a targeted mass spectrometry-based manner [302]. The absence of peptides in the samples used for building the libraries, interfering signals, background noise and time-consuming assay optimisation are some of the limitations of SWATH-MS.
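
The library-matching idea described above can be illustrated with a deliberately simplified sketch. It is a toy model, not the algorithm of any SWATH analysis software: the class `LibraryEntry`, the function `quantify`, the tolerance values, the peptide name and all peak values are hypothetical. A peptide is reported only if enough of its library fragments are observed near the expected retention time, and its signal is the summed intensity of the matched fragments.

```python
from dataclasses import dataclass


@dataclass
class LibraryEntry:
    peptide: str          # peptide identifier (hypothetical)
    fragment_mz: list     # expected fragment m/z values from the library
    rt_min: float         # expected retention time in minutes


def quantify(entry, peaks, mz_tol=0.02, rt_tol=1.0, min_fragments=3):
    """Sum intensities of peaks matching the library fragments.

    peaks: list of (mz, rt_min, intensity) tuples from one sample.
    Returns the summed intensity, or None if fewer than
    min_fragments distinct library fragments were matched.
    """
    matched = {}
    for mz, rt, intensity in peaks:
        if abs(rt - entry.rt_min) > rt_tol:
            continue  # outside the expected retention time window
        for frag in entry.fragment_mz:
            if abs(mz - frag) <= mz_tol:
                matched[frag] = matched.get(frag, 0.0) + intensity
    if len(matched) < min_fragments:
        return None
    return sum(matched.values())


# Hypothetical library entry and sample peaks (illustration only).
entry = LibraryEntry("EXAMPLE_PEPTIDE", [400.20, 513.28, 626.37], 25.0)
peaks = [
    (400.21, 25.2, 1000.0),
    (513.27, 24.9, 800.0),
    (626.37, 25.1, 600.0),
    (513.28, 40.0, 5000.0),  # right m/z, wrong retention time: ignored
    (700.00, 25.0, 300.0),   # right retention time, no library match
]

signal = quantify(entry, peaks)
print(signal)
```

The sketch also shows one of the limitations mentioned above: a peptide that is missing from the library can never be matched, regardless of how abundant it is in the sample.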

Another limitation related to the selection of candidates was the focus on predicted secreted and membrane-bound proteins. Intracellular proteins, which were excluded, may have lower abundance in normal CSF, but under pathological conditions they could be released into the extracellular space and leak into the CSF. An example of such an intracellular protein is tau. Likewise, several novel CSF biomarker candidates, such as neurogranin, are intracellular proteins found to be elevated in AD [303]. To monitor tissue-specific proteins regardless of their cellular localization, SWATH-MS would be a favorable method of choice, since it can provide quantitative analysis of a large number of proteins without any pre-selection. However, a possible challenge here is the sensitivity for low-abundance brain-derived proteins present in CSF.

Other limitations are related to the patient cohorts used in the study. Several patient cohorts were tested. The cohorts had limited sample sizes, and there were slight differences in age and sex distribution between groups. An important consideration for future studies is therefore the estimation of power and sample size, taking into account technical and biological variation. In addition, the age and sex of the selected control and disease groups should be matched. The cohorts did not include preclinical AD, which would have allowed assessment of the proposed candidates at a very early stage of potentially developing AD. Some of the cohorts lacked healthy age-matched controls, which allowed comparison only between stages of AD.
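
As a rough illustration of such a power calculation, the sketch below uses the standard normal-approximation formula for comparing two group means, n = 2(z_{1-α/2} + z_{1-β})² / d² per group, where d is the expected difference divided by the total standard deviation. Combining biological and technical variation as independent variance components is an assumption of this sketch, and all numerical inputs are hypothetical rather than estimates from this study.

```python
import math
from statistics import NormalDist


def standardized_effect(delta, sd_biological, sd_technical):
    """Expected mean difference divided by the total standard deviation.

    Assumes biological and technical variation are independent,
    so their variances add.
    """
    total_sd = math.sqrt(sd_biological ** 2 + sd_technical ** 2)
    return delta / total_sd


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Subjects per group for a two-sided, two-sample comparison."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical inputs on a log-abundance scale (illustration only).
d = standardized_effect(delta=0.4, sd_biological=0.6, sd_technical=0.3)
n = n_per_group(d)
print(f"effect size = {d:.2f}, subjects per group = {n}")
```

Because the required sample size scales with 1/d², reducing technical variation (for example through replicate measurements) or targeting larger expected differences lowers the number of subjects needed per group.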

Moreover, diagnostic accuracy of promising candidate proteins identified in this study was not compared with the existing AD CSF biomarkers (Aβ1-42, t-tau, p-tau) due to lack of available data.

Regardless of our effort to define novel AD biomarkers, there are still certain unsolved issues in the field; some of them were not the primary scope of the present study.

One of the issues is the ability of biomarkers to identify AD independently of other dementias. This would improve differential diagnosis and also correctly identify patients for any AD-specific treatment. It may also be expected that such biomarkers are close to the AD-specific pathological mechanism and are therefore more relevant for understanding the disease course. In this study we did not test the diagnostic potential of the candidate biomarkers in differentiating between forms of dementia. In future studies, our proposed AD biomarkers should be validated in patients with different types of dementia, such as DLB, VaD and FTD. Likewise, AD patients with co-morbidities (e.g. mixed dementia types) should be excluded.

Another outstanding problem is the search for biomarkers of risk for progression from preclinical to advanced stages. Our cohorts did not include longitudinally followed patients, which would have allowed assessment of the candidates as predictors of disease course. Patients should be monitored for a sufficient period of time to allow disease progression to the next stage; for example, the approximate annual rate of progression of patients with MCI due to AD to the AD dementia stage is 10 to 15% [172].
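
To give a sense of what a "sufficient period of time" might mean, the sketch below converts an annual progression rate into the expected cumulative fraction of progressors after several years of follow-up. It assumes a constant annual rate applied independently each year, which is a simplification introduced here for illustration and not a model taken from the cited literature.

```python
def cumulative_progression(annual_rate, years):
    """Fraction of patients expected to have progressed after `years`,
    assuming the same annual rate applies independently every year."""
    return 1 - (1 - annual_rate) ** years


# Annual rates of 10% and 15%, evaluated for 1 to 5 years of follow-up.
for rate in (0.10, 0.15):
    fractions = [cumulative_progression(rate, y) for y in range(1, 6)]
    formatted = ", ".join(f"{f:.0%}" for f in fractions)
    print(f"annual rate {rate:.0%}: {formatted}")
```

Under this simplified model, roughly 27% to 39% of patients with MCI due to AD would be expected to progress within three years, which indicates the order of magnitude of follow-up and cohort size needed to observe enough progression events.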

There are several therapeutic options being pursued in AD. Biomarkers that can measure target engagement, pathological changes or indirectly predict change in disease course are also needed. Our study did not address this issue as the samples from such trials were not available.

Finally, novel biomarkers are needed to further elucidate AD pathogenesis; the present study neither correlated the proposed biomarkers with pathology findings nor tested their functional significance in pathogenesis in animal or in vitro models. Further basic research on the candidate proteins may provide more insight into their potential role in AD pathogenesis.

In summary, the current dissertation presents a mass spectrometry-based approach for the identification and verification of brain-specific AD biomarkers. Brain-related proteins reliably identified in the CSF proteomes were selected for SRM assay development. The diagnostic potential of 30 candidates was assessed in CSF samples from AD patients across a spectrum of disease severity (MCI, mild, moderate and severe dementia) and from cognitively healthy individuals. The protein APLP1 was found to be elevated in the MCI stage, indicating promise as an early AD biomarker; APLP1 and SPP1 showed promising diagnostic potential in discriminating the MCI group from the control group, while the protein NPTXR showed potential to discriminate early (MCI) from more advanced AD stages. Lastly, the proteins APLP1, SPP1 and CNTN2 may be indicators of disease progression, demonstrating weak to moderate correlation with cognitive tests. In conclusion, this study identifies new proteins with biomarker potential at early and more advanced stages of AD severity, providing support for our hypothesis that brain-specific proteins may be biomarkers of AD at different stages. Our findings have to be confirmed in an independent set of samples. If the performance of the proposed biomarkers is confirmed, these proteins may add value in the clinic or in clinical trial settings as diagnostic biomarkers (alone or in combination with the existing biomarkers) of the prodromal AD stage, and in monitoring disease progression.

References

1. Blennow K, de Leon MJ, Zetterberg H: Alzheimer's disease. Lancet 2006, 368:387-403.

2. McKhann GM, Knopman DS, Chertkow H, Hyman BT, et al: The diagnosis of dementia due to

Alzheimer's disease: recommendations from the National Institute on Aging-Alzheimer's

Association workgroups on diagnostic guidelines for Alzheimer's disease. Alzheimers Dement

2011, 7:263-269.

3. Scheltens P, Blennow K, Breteler MM, de Strooper B, et al: Alzheimer's disease. Lancet 2016,

388:505-517.

4. Blennow K, Hampel H, Weiner M, Zetterberg H: Cerebrospinal fluid and plasma biomarkers in

Alzheimer disease. Nat Rev Neurol 2010, 6:131-144.

5. Winblad B, Amouyel P, Andrieu S, Ballard C, et al: Defeating Alzheimer's disease and other

dementias: a priority for European science and society. Lancet Neurol 2016, 15:455-532.

6. O'Brien JT, Thomas A: Vascular dementia. Lancet 2015, 386:1698-1706.

7. Walker Z, Possin KL, Boeve BF, Aarsland D: Lewy body dementias. Lancet 2015, 386:1683-1697.

8. Emre M, Aarsland D, Brown R, Burn DJ, et al: Clinical diagnostic criteria for dementia

associated with Parkinson's disease. Mov Disord 2007, 22:1689-1707; quiz 1837.

9. Bang J, Spina S, Miller BL: Frontotemporal dementia. Lancet 2015, 386:1672-1682.

10. WHO: Dementia: a public health priority. WHO 2012.

11. Cacace R, Sleegers K, Van Broeckhoven C: Molecular genetics of early-onset Alzheimer's

disease revisited. Alzheimers Dement 2016, 12:733-748.

12. Masters CL, Bateman R, Blennow K, Rowe CC, et al: Alzheimer's disease. Nat Rev Dis Primers

2015, 1:15056.

13. Ferri CP, Prince M, Brayne C, Brodaty H, et al: Global prevalence of dementia: a Delphi

consensus study. Lancet 2005, 366:2112-2117.

14. Prince M, Wimo A, Guerchet M, Ali G, et al: The World Alzheimer Report 2015, The global

impact of dementia: an analysis of prevalence, incidence, cost and trends. London: Alzheimer’s

Disease International; 2015.

15. Alzheimer's Association: 2017 Alzheimer's disease facts and figures. Alzheimers Dement 2017, 13:325-

373.

16. Huang Y, Mucke L: Alzheimer mechanisms and therapeutic strategies. Cell 2012, 148:1204-

1222.

17. Ballatore C, Lee VM, Trojanowski JQ: Tau-mediated neurodegeneration in Alzheimer's disease

and related disorders. Nat Rev Neurosci 2007, 8:663-672.

18. Ballard C, Gauthier S, Corbett A, Brayne C, et al: Alzheimer's disease. Lancet 2011, 377:1019-

1031.

19. Kang JH, Korecka M, Toledo JB, Trojanowski JQ, et al: Clinical utility and analytical challenges in

measurement of cerebrospinal fluid amyloid-beta(1-42) and tau proteins as Alzheimer

disease biomarkers. Clin Chem 2013, 59:903-916.

20. Hyman BT, Phelps CH, Beach TG, Bigio EH, et al: National Institute on Aging-Alzheimer's

Association guidelines for the neuropathologic assessment of Alzheimer's disease. Alzheimers

Dement 2012, 8:1-13.

21. Thal DR, Rub U, Orantes M, Braak H: Phases of A beta-deposition in the human brain and its

relevance for the development of AD. Neurology 2002, 58:1791-1800.

22. Wang Y, Mandelkow E: Tau in physiology and pathology. Nat Rev Neurosci 2016, 17:5-21.

23. Braak H, Braak E: Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol

1991, 82:239-259.

24. Selkoe DJ, Hardy J: The amyloid hypothesis of Alzheimer's disease at 25 years. EMBO Mol Med

2016, 8:595-608.

25. Hardy J, Allsop D: Amyloid deposition as the central event in the aetiology of Alzheimer's

disease. Trends Pharmacol Sci 1991, 12:383-388.

26. Jonsson T, Atwal JK, Steinberg S, Snaedal J, et al: A mutation in APP protects against

Alzheimer's disease and age-related cognitive decline. Nature 2012, 488:96-99.

27. Iqbal K, Liu F, Gong CX: Tau and neurodegenerative disease: the story so far. Nat Rev Neurol

2016, 12:15-27.

28. Jack CR, Jr., Knopman DS, Jagust WJ, Shaw LM, et al: Hypothetical model of dynamic

biomarkers of the Alzheimer's pathological cascade. Lancet Neurol 2010, 9:119-128.

29. Palop JJ, Mucke L: Amyloid-beta-induced neuronal dysfunction in Alzheimer's disease: from

synapses toward neural networks. Nat Neurosci 2010, 13:812-818.

30. Walsh DM, Klyubin I, Fadeeva JV, Cullen WK, et al: Naturally secreted oligomers of amyloid

beta protein potently inhibit hippocampal long-term potentiation in vivo. Nature 2002,

416:535-539.

31. Shankar GM, Li S, Mehta TH, Garcia-Munoz A, et al: Amyloid-beta protein dimers isolated

directly from Alzheimer's brains impair synaptic plasticity and memory. Nat Med 2008,

14:837-842.

32. Braak H, Del Tredici K: The pathological process underlying Alzheimer's disease in individuals

under thirty. Acta Neuropathol 2011, 121:171-181.

33. Rapoport M, Dawson HN, Binder LI, Vitek MP, et al: Tau is essential to beta-amyloid-induced

neurotoxicity. Proc Natl Acad Sci U S A 2002, 99:6364-6369.

34. Terry RD, Masliah E, Salmon DP, Butters N, et al: Physical basis of cognitive alterations in

Alzheimer's disease: synapse loss is the major correlate of cognitive impairment. Ann Neurol

1991, 30:572-580.

35. Jucker M, Walker LC: Self-propagation of pathogenic protein aggregates in neurodegenerative

diseases. Nature 2013, 501:45-51.

36. Meyer-Luehmann M, Coomaraswamy J, Bolmont T, Kaeser S, et al: Exogenous induction of

cerebral beta-amyloidogenesis is governed by agent and host. Science 2006, 313:1781-1784.

37. Clavaguera F, Bolmont T, Crowther RA, Abramowski D, et al: Transmission and spreading of

tauopathy in transgenic mouse brain. Nat Cell Biol 2009, 11:909-913.

38. Jucker M, Walker LC: Pathogenic protein seeding in Alzheimer disease and other

neurodegenerative disorders. Ann Neurol 2011, 70:532-540.

39. Kang J, Lemaire HG, Unterbeck A, Salbaum JM, et al: The precursor of Alzheimer's disease

amyloid A4 protein resembles a cell-surface receptor. Nature 1987, 325:733-736.

40. Cruts M, Theuns J, Van Broeckhoven C: Locus-specific mutation databases for

neurodegenerative brain diseases. Hum Mutat 2012, 33:1340-1344.

41. Nilsberth C, Westlind-Danielsson A, Eckman CB, Condron MM, et al: The 'Arctic' APP mutation

(E693G) causes Alzheimer's disease by enhanced Abeta protofibril formation. Nat Neurosci

2001, 4:887-893.

42. Mullan M, Crawford F, Axelman K, Houlden H, et al: A pathogenic mutation for probable

Alzheimer's disease in the APP gene at the N-terminus of beta-amyloid. Nat Genet 1992,

1:345-347.

43. Citron M, Oltersdorf T, Haass C, McConlogue L, et al: Mutation of the beta-amyloid precursor

protein in familial Alzheimer's disease increases beta-protein production. Nature 1992,

360:672-674.

44. Karch CM, Cruchaga C, Goate AM: Alzheimer's disease genetics: from the bench to the clinic.

Neuron 2014, 83:11-26.

45. Rovelet-Lecrux A, Hannequin D, Raux G, Le Meur N, et al: APP locus duplication causes

autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy. Nat

Genet 2006, 38:24-26.

46. Guerreiro RJ, Gustafson DR, Hardy J: The genetic architecture of Alzheimer's disease: beyond

APP, PSENs and APOE. Neurobiol Aging 2012, 33:437-456.

47. Budson AE, Kowall NW: Common dementias. In The handbook of Alzheimer’s disease and other

dementias. Wiley-Blackwell; 2011: 1-193

48. Liu CC, Kanekiyo T, Xu H, Bu G: Apolipoprotein E and Alzheimer disease: risk, mechanisms and

therapy. Nat Rev Neurol 2013, 9:106-118.

49. Strittmatter WJ, Saunders AM, Schmechel D, Pericak-Vance M, et al: Apolipoprotein E: high-

avidity binding to beta-amyloid and increased frequency of type 4 allele in late-onset familial

Alzheimer disease. Proc Natl Acad Sci U S A 1993, 90:1977-1981.

50. Poirier J, Davignon J, Bouthillier D, Kogan S, et al: Apolipoprotein E polymorphism and

Alzheimer's disease. Lancet 1993, 342:697-699.

51. Mahley RW, Rall SC, Jr.: Apolipoprotein E: far more than a lipid transport protein. Annu Rev

Genomics Hum Genet 2000, 1:507-537.

52. Kim J, Basak JM, Holtzman DM: The role of apolipoprotein E in Alzheimer's disease. Neuron

2009, 63:287-303.

53. Farrer LA, Cupples LA, Haines JL, Hyman B, et al: Effects of age, sex, and ethnicity on the

association between apolipoprotein E genotype and Alzheimer disease. A meta-analysis.

APOE and Alzheimer Disease Meta Analysis Consortium. JAMA 1997, 278:1349-1356.

54. Schmechel DE, Saunders AM, Strittmatter WJ, Crain BJ, et al: Increased amyloid beta-peptide

deposition in cerebral cortex as a consequence of apolipoprotein E genotype in late-onset

Alzheimer disease. Proc Natl Acad Sci U S A 1993, 90:9649-9653.

55. Prince JA, Zetterberg H, Andreasen N, Marcusson J, et al: APOE epsilon4 allele is associated

with reduced cerebrospinal fluid levels of Abeta42. Neurology 2004, 62:2116-2118.

56. Head D, Bugg JM, Goate AM, Fagan AM, et al: Exercise Engagement as a Moderator of the

Effects of APOE Genotype on Amyloid Deposition. Arch Neurol 2012, 69:636-643.

57. Castellano JM, Kim J, Stewart FR, Jiang H, et al: Human apoE isoforms differentially regulate

brain amyloid-beta peptide clearance. Sci Transl Med 2011, 3:89ra57.

58. Verghese PB, Castellano JM, Holtzman DM: Apolipoprotein E in Alzheimer's disease and other

neurological disorders. Lancet Neurol 2011, 10:241-252.

59. Ma J, Yee A, Brewer HB, Jr., Das S, et al: Amyloid-associated proteins alpha 1-antichymotrypsin

and apolipoprotein E promote assembly of Alzheimer beta-protein into filaments. Nature

1994, 372:92-94.

60. Karch CM, Goate AM: Alzheimer's disease risk genes and mechanisms of disease

pathogenesis. Biol Psychiatry 2015, 77:43-51.

61. van der Flier WM, Pijnenburg YA, Fox NC, Scheltens P: Early-onset versus late-onset

Alzheimer's disease: the case of the missing APOE ε4 allele. Lancet Neurol 2011,

10:280-288.

62. Jack CR, Jr., Albert MS, Knopman DS, McKhann GM, et al: Introduction to the

recommendations from the National Institute on Aging-Alzheimer's Association workgroups

on diagnostic guidelines for Alzheimer's disease. Alzheimers Dement 2011, 7:257-262.

63. Schneider JA, Arvanitakis Z, Leurgans SE, Bennett DA: The neuropathology of probable

Alzheimer disease and mild cognitive impairment. Ann Neurol 2009, 66:200-208.

64. Price JL, Morris JC: Tangles and plaques in nondemented aging and "preclinical" Alzheimer's

disease. Ann Neurol 1999, 45:358-368.

65. Alladi S, Xuereb J, Bak T, Nestor P, et al: Focal cortical presentations of Alzheimer's disease.

Brain 2007, 130:2636-2645.

66. Sperling RA, Aisen PS, Beckett LA, Bennett DA, et al: Toward defining the preclinical stages of

Alzheimer's disease: recommendations from the National Institute on Aging-Alzheimer's

Association workgroups on diagnostic guidelines for Alzheimer's disease. Alzheimers Dement

2011, 7:280-292.

67. Albert MS, DeKosky ST, Dickson D, Dubois B, et al: The diagnosis of mild cognitive impairment

due to Alzheimer's disease: recommendations from the National Institute on Aging-

Alzheimer's Association workgroups on diagnostic guidelines for Alzheimer's disease.

Alzheimers Dement 2011, 7:270-279.

68. Dubois B, Feldman HH, Jacova C, Dekosky ST, et al: Research criteria for the diagnosis of

Alzheimer's disease: revising the NINCDS-ADRDA criteria. Lancet Neurol 2007, 6:734-746.

69. Dubois B, Feldman HH, Jacova C, Hampel H, et al: Advancing research diagnostic criteria for

Alzheimer's disease: the IWG-2 criteria. Lancet Neurol 2014, 13:614-629.

70. Ballinger EC, Ananth M, Talmage DA, Role LW: Basal Forebrain Cholinergic Circuits and

Signaling in Cognition and Cognitive Decline. Neuron 2016, 91:1199-1218.

71. Scarpini E, Scheltens P, Feldman H: Treatment of Alzheimer's disease: current status and new

perspectives. Lancet Neurol 2003, 2:539-547.

72. Doody RS, Raman R, Farlow M, Iwatsubo T, et al: A phase 3 trial of semagacestat for treatment

of Alzheimer's disease. N Engl J Med 2013, 369:341-350.

73. Coric V, van Dyck CH, Salloway S, Andreasen N, et al: Safety and tolerability of the gamma-

secretase inhibitor avagacestat in a phase 2 study of mild to moderate Alzheimer disease.

Arch Neurol 2012, 69:1430-1440.

74. Graham WV, Bonito-Oliva A, Sakmar TP: Update on Alzheimer's Disease Therapy and

Prevention Strategies. Annu Rev Med 2017, 68:413-430.

75. Holmes C, Boche D, Wilkinson D, Yadegarfar G, et al: Long-term effects of Abeta42

immunisation in Alzheimer's disease: follow-up of a randomised, placebo-controlled phase I

trial. Lancet 2008, 372:216-223.

76. Salloway S, Sperling R, Fox NC, Blennow K, et al: Two phase 3 trials of bapineuzumab in mild-

to-moderate Alzheimer's disease. N Engl J Med 2014, 370:322-333.

77. Doody RS, Thomas RG, Farlow M, Iwatsubo T, et al: Phase 3 trials of solanezumab for mild-to-

moderate Alzheimer's disease. N Engl J Med 2014, 370:311-321.

78. Sevigny J, Chiao P, Bussiere T, Weinreb PH, et al: The antibody aducanumab reduces Abeta

plaques in Alzheimer's disease. Nature 2016, 537:50-56.

79. Abbott A, Dolgin E: Failed Alzheimer's trial does not kill leading theory of disease. Nature

2016, 540:15-16.

80. Novak P, Schmidt R, Kontsekova E, Zilka N, et al: Safety and immunogenicity of the tau vaccine

AADvac1 in patients with Alzheimer's disease: a randomised, double-blind, placebo-

controlled, phase 1 trial. Lancet Neurol 2017, 16:123-134.

81. Biomarkers Definitions Working Group: Biomarkers and surrogate endpoints: preferred definitions

and conceptual framework. Clin Pharmacol Ther 2001, 69:89-95.

82. Zhao X, Modur V, Carayannopoulos LN, Laterza OF: Biomarkers in Pharmaceutical Research.

Clin Chem 2015, 61:1343-1353.

83. QMP-LS: Chemistry-General Broadsheet – Analytical Method Validation: What does the

laboratory need to do and when? Toronto (ON): QMP-LS; 2010.

84. Linnet K, Bossuyt PM, Moons KG, Reitsma JB: Quantifying the accuracy of a diagnostic test or

marker. Clin Chem 2012, 58:1292-1301.

85. Soreide K: Receiver-operating characteristic curve analysis in diagnostic, prognostic and

predictive biomarker research. J Clin Pathol 2009, 62:1-5.

86. Zweig MH, Campbell G: Receiver-operating characteristic (ROC) plots: a fundamental

evaluation tool in clinical medicine. Clin Chem 1993, 39:561-577.

87. Moons KG, de Groot JA, Linnet K, Reitsma JB, et al: Quantifying the added value of a diagnostic

test or marker. Clin Chem 2012, 58:1408-1417.

88. Leeflang MM, Bossuyt PM, Irwig L: Diagnostic test accuracy may vary with prevalence:

implications for evidence-based diagnosis. J Clin Epidemiol 2009, 62:5-12.

89. Reitsma JB, Rutjes AW, Khan KS, Coomarasamy A, et al: A review of solutions for diagnostic

accuracy studies with an imperfect or missing reference standard. J Clin Epidemiol 2009,

62:797-806.

90. Biesheuvel C, Irwig L, Bossuyt P: Observed differences in diagnostic test accuracy between

patient subgroups: is it real or due to reference standard misclassification? Clin Chem 2007,

53:1725-1729.

91. Bossuyt PM, Reitsma JB, Linnet K, Moons KG: Beyond diagnostic accuracy: the clinical utility of

diagnostic tests. Clin Chem 2012, 58:1636-1643.

92. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, et al: Towards complete and accurate

reporting of studies of diagnostic accuracy: the STARD initiative. Standards for Reporting of

Diagnostic Accuracy. Clin Chem 2003, 49:1-6.

93. Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, et al: STARD 2015: An Updated List of Essential

Items for Reporting Diagnostic Accuracy Studies. Clin Chem 2015, 61:1446-1452.

94. Ransohoff DF: Bias as a threat to the validity of cancer molecular-marker research. Nat Rev

Cancer 2005, 5:142-149.

95. Pavlou MP, Diamandis EP, Blasutig IM: The long journey of cancer biomarkers from the bench

to the clinic. Clin Chem 2013, 59:147-157.

96. Narayanan S: The preanalytic phase. An important component of laboratory medicine. Am J

Clin Pathol 2000, 113:429-452.

97. Anderson NL, Anderson NG: The human plasma proteome: history, character, and diagnostic

prospects. Mol Cell Proteomics 2002, 1:845-867.

98. Drabovich AP, Martinez-Morillo E, Diamandis EP: Toward an integrated pipeline for protein

biomarker development. Biochim Biophys Acta 2015, 1854:677-686.

99. Batruch I, Lecker I, Kagedan D, Smith CR, et al: Proteomic analysis of seminal plasma from

normal volunteers and post-vasectomy patients identifies over 2000 proteins and candidate

biomarkers of the urogenital system. J Proteome Res 2011, 10:941-953.

100. Good DM, Thongboonkerd V, Novak J, Bascands JL, et al: Body fluid proteomics for biomarker

discovery: lessons from the past hold the key to success in the future. J Proteome Res 2007,

6:4549-4555.

101. Rifai N, Gillette MA, Carr SA: Protein biomarker discovery and validation: the long and

uncertain path to clinical utility. Nat Biotechnol 2006, 24:971-983.

102. Drabovich AP, Jarvi K, Diamandis EP: Verification of male infertility biomarkers in seminal

plasma by multiplex selected reaction monitoring assay. Mol Cell Proteomics 2011, 10:M110.004127.

103. Whiteaker JR, Zhang H, Eng JK, Fang R, et al: Head-to-head comparison of serum fractionation

techniques. J Proteome Res 2007, 6:828-836.

104. Konvalinka A, Scholey JW, Diamandis EP: Searching for new biomarkers of renal diseases

through proteomics. Clin Chem 2012, 58:353-365.

105. Sedlaczek P, Frydecka I, Gabrys M, Van Dalen A, et al: Comparative analysis of CA125, tissue

polypeptide specific antigen, and soluble interleukin-2 receptor alpha levels in sera, cyst, and

ascitic fluids from patients with ovarian carcinoma. Cancer 2002, 95:1886-1893.

106. Minjarez B, Valero Rustarazo ML, Sanchez del Pino MM, Gonzalez-Robles A, et al: Identification

of polypeptides in neurofibrillary tangles and total homogenates of brains with Alzheimer's

disease by tandem mass spectrometry. J Alzheimers Dis 2013, 34:239-262.

107. Casadonte R, Caprioli RM: Proteomic analysis of formalin-fixed paraffin-embedded tissue by

MALDI imaging mass spectrometry. Nat Protoc 2011, 6:1695-1709.

108. Pavlou MP, Dimitromanolakis A, Diamandis EP: Coupling proteomics and transcriptomics in the

quest of subtype-specific proteins in breast cancer. Proteomics 2013, 13:1083-1095.

109. Gevaert K, Vandekerckhove J: Gel-free proteomics: methods and protocols. New York: Humana

Press; 2011.

110. Aebersold R, Mann M: Mass spectrometry-based proteomics. Nature 2003, 422:198-207.

111. Batruch I, Smith CR, Mullen BJ, Grober E, et al: Analysis of seminal plasma from patients with

non-obstructive azoospermia and identification of candidate biomarkers of male infertility. J

Proteome Res 2012, 11:1503-1511.

112. Nesvizhskii AI, Vitek O, Aebersold R: Analysis and validation of proteomic data generated by

tandem mass spectrometry. Nat Methods 2007, 4:787-797.

113. Cox J, Neuhauser N, Michalski A, Scheltema RA, et al: Andromeda: a peptide search engine

integrated into the MaxQuant environment. J Proteome Res 2011, 10:1794-1805.

114. Picotti P, Aebersold R: Selected reaction monitoring-based proteomics: workflows, potential,

pitfalls and future directions. Nat Methods 2012, 9:555-566.

115. Ishihama Y, Oda Y, Tabata T, Sato T, et al: Exponentially modified protein abundance index

(emPAI) for estimation of absolute protein amount in proteomics by the number of

sequenced peptides per protein. Mol Cell Proteomics 2005, 4:1265-1272.

116. Zybailov B, Mosley AL, Sardiu ME, Coleman MK, et al: Statistical analysis of membrane

proteome expression changes in Saccharomyces cerevisiae. J Proteome Res 2006, 5:2339-

2347.

117. Cox J, Mann M: MaxQuant enables high peptide identification rates, individualized p.p.b.-

range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 2008,

26:1367-1372.

118. Zhang Y, Guo Z, Zou L, Yang Y, et al: A comprehensive map and functional annotation of the

normal human cerebrospinal fluid proteome. J Proteomics 2015, 119:90-99.

119. Drabovich AP, Pavlou MP, Schiza C, Diamandis EP: Dynamics of Protein Expression Reveals

Primary Targets and Secondary Messengers of Estrogen Receptor Alpha Signaling in MCF-7

Breast Cancer Cells. Mol Cell Proteomics 2016, 15:2093-2107.

120. Kruger M, Moser M, Ussar S, Thievessen I, et al: SILAC mouse for quantitative proteomics

uncovers kindlin-3 as an essential factor for red blood cell function. Cell 2008, 134:353-364.

121. Ross PL, Huang YN, Marchese JN, Williamson B, et al: Multiplexed protein quantitation in

Saccharomyces cerevisiae using amine-reactive isobaric tagging reagents. Mol Cell Proteomics

2004, 3:1154-1169.

122. Unwin RD, Griffiths JR, Whetton AD: Simultaneous analysis of relative protein expression

levels across multiple samples using iTRAQ isobaric tags with 2D nano LC-MS/MS. Nat Protoc

2010, 5:1574-1582.

123. Lange V, Picotti P, Domon B, Aebersold R: Selected reaction monitoring for quantitative

proteomics: a tutorial. Mol Syst Biol 2008, 4:222.

124. Martinez-Morillo E, Nielsen HM, Batruch I, Drabovich AP, et al: Assessment of peptide chemical

modifications on the development of an accurate and precise multiplex selected reaction

monitoring assay for apolipoprotein E isoforms. J Proteome Res 2014, 13:1077-1087.

125. Pratt JM, Simpson DM, Doherty MK, Rivers J, et al: Multiplexed absolute quantification for

proteomics using concatenated signature peptides encoded by QconCAT genes. Nat Protoc

2006, 1:1029-1043.

126. Huillet C, Adrait A, Lebert D, Picard G, et al: Accurate quantification of cardiovascular

biomarkers in serum using Protein Standard Absolute Quantification (PSAQ) and selected

reaction monitoring. Mol Cell Proteomics 2012, 11:M111.008235.

127. Keshishian H, Addona T, Burgess M, Kuhn E, et al: Quantitative, multiplexed assays for low

abundance proteins in plasma by targeted mass spectrometry and stable isotope dilution.

Mol Cell Proteomics 2007, 6:2212-2229.

128. Whiteaker JR, Zhao L, Anderson L, Paulovich AG: An automated and multiplexed method for

high throughput peptide immunoaffinity enrichment and multiple reaction monitoring mass

spectrometry-based quantification of protein biomarkers. Mol Cell Proteomics 2010, 9:184-

196.

129. Prassas I, Brinc D, Farkona S, Leung F, et al: False biomarker discovery due to reactivity of a

commercial ELISA for CUZD1 with cancer antigen CA125. Clin Chem 2014, 60:381-388.

130. Haverland N, Pottiez G, Wiederin J, Ciborowski P: Immunoreactivity of anti-gelsolin antibodies:

implications for biomarker validation. J Transl Med 2010, 8:137.

131. Bjorling E, Uhlen M: Antibodypedia, a portal for sharing antibody and antigen validation data.

Mol Cell Proteomics 2008, 7:2028-2037.

132. Uhlen M, Fagerberg L, Hallstrom BM, Lindskog C, et al: Proteomics. Tissue-based map of the

human proteome. Science 2015, 347:1260419.

133. Kroksveen AC, Opsahl JA, Aye TT, Ulvik RJ, et al: Proteomics of human cerebrospinal fluid:

discovery and verification of biomarker candidates in neurodegenerative diseases using

quantitative proteomics. J Proteomics 2011, 74:371-388.

134. McComb JG: Recent research into the nature of cerebrospinal fluid formation and absorption.

J Neurosurg 1983, 59:369-383.

135. Nilsson C, Stahlberg F, Thomsen C, Henriksen O, et al: Circadian variation in human

cerebrospinal fluid production measured by magnetic resonance imaging. Am J Physiol 1992,

262:R20-24.

136. Reiber H, Peter JB: Cerebrospinal fluid analysis: disease-related data patterns and evaluation

programs. J Neurol Sci 2001, 184:101-122.

137. Hladky SB, Barrand MA: Mechanisms of fluid movement into, through and out of the brain:

evaluation of the evidence. Fluids Barriers CNS 2014, 11:26.

138. Abbott NJ, Patabendige AA, Dolman DE, Yusof SR, et al: Structure and function of the blood-

brain barrier. Neurobiol Dis 2010, 37:13-25.

139. Banks WA: Characteristics of compounds that cross the blood-brain barrier. BMC Neurol 2009,

9 Suppl 1:S3.

140. Redzic Z: Molecular biology of the blood-brain and the blood-cerebrospinal fluid barriers:

similarities and differences. Fluids Barriers CNS 2011, 8:3.

141. Oreskovic D, Klarica M: The formation of cerebrospinal fluid: nearly a hundred years of

interpretations and misinterpretations. Brain Res Rev 2010, 64:241-262.

142. Louveau A, Smirnov I, Keyes TJ, Eccles JD, et al: Structural and functional features of central

nervous system lymphatic vessels. Nature 2015, 523:337-341.

143. Aspelund A, Antila S, Proulx ST, Karlsen TV, et al: A dural lymphatic vascular system that drains

brain interstitial fluid and macromolecules. J Exp Med 2015, 212:991-999.

144. Reiber H: Proteins in cerebrospinal fluid and blood: barriers, CSF flow rate and source-related

dynamics. Restor Neurol Neurosci 2003, 21:79-96.

145. Reiber H: Dynamics of brain-derived proteins in cerebrospinal fluid. Clin Chim Acta 2001,

310:173-186.

146. Redzic ZB, Preston JE, Duncan JA, Chodobski A, et al: The choroid plexus-cerebrospinal fluid

system: from development to aging. Curr Top Dev Biol 2005, 71:1-52.

147. Zhang J: Proteomics of human cerebrospinal fluid - the good, the bad, and the ugly.

Proteomics Clin Appl 2007, 1:805-819.

148. Brandner S, Thaler C, Lewczuk P, Lelental N, et al: Neuroprotein dynamics in the cerebrospinal

fluid: intraindividual concomitant ventricular and lumbar measurements. Eur Neurol 2013,

70:189-194.

149. Aasebo E, Opsahl JA, Bjorlykke Y, Myhr KM, et al: Effects of blood contamination and the

rostro-caudal gradient on the human cerebrospinal fluid proteome. PLoS One 2014, 9:e90429.

150. Chen CP, Chen RL, Preston JE: The influence of cerebrospinal fluid turnover on age-related

changes in cerebrospinal fluid protein concentrations. Neurosci Lett 2010, 476:138-141.

151. Consensus report of the Working Group on: "Molecular and Biochemical Markers of

Alzheimer's Disease". The Ronald and Nancy Reagan Research Institute of the Alzheimer's

Association and the National Institute on Aging Working Group. Neurobiol Aging 1998,

19:109-116.

152. Holtzman DM: CSF biomarkers for Alzheimer's disease: current utility and potential future

use. Neurobiol Aging 2011, 32 Suppl 1:S4-9.

153. Lleo A, Cavedo E, Parnetti L, Vanderstichele H, et al: Cerebrospinal fluid biomarkers in trials for

Alzheimer and Parkinson diseases. Nat Rev Neurol 2015, 11:41-55.

154. Khan TK, Alkon DL: Alzheimer's Disease Cerebrospinal Fluid and Neuroimaging Biomarkers:

Diagnostic Accuracy and Relationship to Drug Efficacy. J Alzheimers Dis 2015, 46:817-836.

155. Tapiola T, Alafuzoff I, Herukka SK, Parkkinen L, et al: Cerebrospinal fluid beta-amyloid 42 and

tau proteins as biomarkers of Alzheimer-type pathologic changes in the brain. Arch Neurol

2009, 66:382-389.

156. Fagan AM, Mintun MA, Mach RH, Lee SY, et al: Inverse relation between in vivo amyloid

imaging load and cerebrospinal fluid Abeta42 in humans. Ann Neurol 2006, 59:512-519.

157. Jack CR, Jr., Knopman DS, Jagust WJ, Petersen RC, et al: Tracking pathophysiological processes

in Alzheimer's disease: an updated hypothetical model of dynamic biomarkers. Lancet Neurol

2013, 12:207-216.

158. Hesse C, Rosengren L, Andreasen N, Davidsson P, et al: Transient increase in total tau but not

phospho-tau in human cerebrospinal fluid after acute stroke. Neurosci Lett 2001, 297:187-

190.

159. Neselius S, Brisby H, Theodorsson A, Blennow K, et al: CSF-biomarkers in Olympic boxing:

diagnosis and effects of repetitive head trauma. PLoS One 2012, 7:e33606.

160. Green AJ, Harvey RJ, Thompson EJ, Rossor MN: Increased tau in the cerebrospinal fluid of

patients with frontotemporal dementia and Alzheimer's disease. Neurosci Lett 1999, 259:133-

135.

161. Otto M, Wiltfang J, Tumani H, Zerr I, et al: Elevated levels of tau-protein in cerebrospinal fluid

of patients with Creutzfeldt-Jakob disease. Neurosci Lett 1997, 225:210-212.

162. Blom ES, Giedraitis V, Zetterberg H, Fukumoto H, et al: Rapid progression from mild cognitive

impairment to Alzheimer's disease in subjects with elevated levels of tau in cerebrospinal

fluid and the APOE epsilon4/epsilon4 genotype. Dement Geriatr Cogn Disord 2009, 27:458-

464.

163. Samgard K, Zetterberg H, Blennow K, Hansson O, et al: Cerebrospinal fluid total tau as a

marker of Alzheimer's disease intensity. Int J Geriatr Psychiatry 2010, 25:403-410.

164. Hoffman JM, Welsh-Bohmer KA, Hanson M, Crain B, et al: FDG PET imaging in patients with

pathologically verified dementia. J Nucl Med 2000, 41:1920-1928.

165. Minoshima S, Giordani B, Berent S, Frey KA, et al: Metabolic reduction in the posterior

cingulate cortex in very early Alzheimer's disease. Ann Neurol 1997, 42:85-94.

166. Sunderland T, Wolozin B, Galasko D, Levy J, et al: Longitudinal stability of CSF tau levels in

Alzheimer patients. Biol Psychiatry 1999, 46:750-755.

167. Blennow K, Zetterberg H, Minthon L, Lannfelt L, et al: Longitudinal stability of CSF biomarkers

in Alzheimer's disease. Neurosci Lett 2007, 419:18-22.

168. Sluimer JD, Bouwman FH, Vrenken H, Blankenstein MA, et al: Whole-brain atrophy rate and

CSF biomarker levels in MCI and AD: a longitudinal study. Neurobiol Aging 2010, 31:758-764.

169. Beach TG, Monsell SE, Phillips LE, Kukull W: Accuracy of the clinical diagnosis of Alzheimer

disease at National Institute on Aging Alzheimer Disease Centers, 2005-2010. J Neuropathol

Exp Neurol 2012, 71:266-273.

170. Molinuevo JL, Blennow K, Dubois B, Engelborghs S, et al: The clinical use of cerebrospinal fluid

biomarker testing for Alzheimer's disease diagnosis: a consensus paper from the Alzheimer's

Biomarkers Standardization Initiative. Alzheimers Dement 2014, 10:808-817.

171. Mattsson N, Zetterberg H, Hansson O, Andreasen N, et al: CSF biomarkers and incipient

Alzheimer disease in patients with mild cognitive impairment. JAMA 2009, 302:385-393.

172. Hansson O, Zetterberg H, Buchhave P, Londos E, et al: Association between CSF biomarkers

and incipient Alzheimer's disease in patients with mild cognitive impairment: a follow-up

study. Lancet Neurol 2006, 5:228-234.

173. Sjogren M, Vanderstichele H, Agren H, Zachrisson O, et al: Tau and Abeta42 in cerebrospinal

fluid from healthy adults 21-93 years of age: establishment of reference values. Clin Chem

2001, 47:1776-1781.

174. Bjerke M, Portelius E, Minthon L, Wallin A, et al: Confounding factors influencing amyloid Beta

concentration in cerebrospinal fluid. Int J Alzheimers Dis 2010, 2010.

175. Le Bastard N, De Deyn PP, Engelborghs S: Importance and impact of preanalytical variables on

Alzheimer disease biomarker concentrations in cerebrospinal fluid. Clin Chem 2015, 61:734-

743.

176. Perret-Liaudet A, Pelpel M, Tholance Y, Dumont B, et al: Risk of Alzheimer's disease biological

misdiagnosis linked to cerebrospinal collection tubes. J Alzheimers Dis 2012, 31:13-20.

177. Vanderstichele H, Bibl M, Engelborghs S, Le Bastard N, et al: Standardization of preanalytical

aspects of cerebrospinal fluid biomarker testing for Alzheimer's disease diagnosis: a

consensus paper from the Alzheimer's Biomarkers Standardization Initiative. Alzheimers

Dement 2012, 8:65-73.

178. Schoonenboom NS, Mulder C, Vanderstichele H, Van Elk EJ, et al: Effects of processing and

storage conditions on amyloid beta (1-42) and tau concentrations in cerebrospinal fluid:

implications for use in clinical practice. Clin Chem 2005, 51:189-195.

179. Blennow K, Wallin A, Agren H, Spenger C, et al: Tau protein in cerebrospinal fluid: a

biochemical marker for axonal degeneration in Alzheimer disease? Mol Chem Neuropathol

1995, 26:231-245.

180. Andreasen N, Hesse C, Davidsson P, Minthon L, et al: Cerebrospinal fluid beta-amyloid(1-42) in

Alzheimer disease: differences between early- and late-onset Alzheimer disease and stability

during the course of disease. Arch Neurol 1999, 56:673-680.

181. Blennow K, Hampel H: CSF markers for incipient Alzheimer's disease. Lancet Neurol 2003,

2:605-613.

182. Olsson A, Vanderstichele H, Andreasen N, De Meyer G, et al: Simultaneous measurement of

beta-amyloid(1-42), total tau, and phosphorylated tau (Thr181) in cerebrospinal fluid by the

xMAP technology. Clin Chem 2005, 51:336-345.

183. Welge V, Fiege O, Lewczuk P, Mollenhauer B, et al: Combined CSF tau, p-tau181 and amyloid-

beta 38/40/42 for diagnosing Alzheimer's disease. J Neural Transm (Vienna) 2009, 116:203-

212.

184. Hertze J, Minthon L, Zetterberg H, Vanmechelen E, et al: Evaluation of CSF biomarkers as

predictors of Alzheimer's disease: a clinical follow-up study of 4.7 years. J Alzheimers Dis

2010, 21:1119-1128.

185. Bittner T, Zetterberg H, Teunissen CE, Ostlund RE, Jr., et al: Technical performance of a novel,

fully automated electrochemiluminescence immunoassay for the quantitation of beta-

amyloid (1-42) in human cerebrospinal fluid. Alzheimers Dement 2016, 12:517-526.

186. Mattsson N, Andreasson U, Persson S, Carrillo MC, et al: CSF biomarker variability in the

Alzheimer's Association quality control program. Alzheimers Dement 2013, 9:251-261.

187. Mattsson N, Andreasson U, Persson S, Arai H, et al: The Alzheimer's Association external

quality control program for cerebrospinal fluid biomarkers. Alzheimers Dement 2011, 7:386-

395 e386.

188. Lame ME, Chambers EE, Blatnik M: Quantitation of amyloid beta peptides Abeta(1-38),

Abeta(1-40), and Abeta(1-42) in human cerebrospinal fluid by ultra-performance liquid

chromatography-tandem mass spectrometry. Anal Biochem 2011, 419:133-139.

189. Pannee J, Portelius E, Oppermann M, Atkins A, et al: A selected reaction monitoring (SRM)-

based method for absolute quantification of Abeta38, Abeta40, and Abeta42 in cerebrospinal

fluid of Alzheimer's disease patients and healthy controls. J Alzheimers Dis 2013, 33:1021-

1032.

190. Leinenbach A, Pannee J, Dulffer T, Huber A, et al: Mass spectrometry-based candidate

reference measurement procedure for quantification of amyloid-beta in cerebrospinal fluid.

Clin Chem 2014, 60:987-994.

191. McAvoy T, Lassman ME, Spellman DS, Ke Z, et al: Quantification of tau in cerebrospinal fluid by

immunoaffinity enrichment and tandem mass spectrometry. Clin Chem 2014, 60:683-689.

192. Bros P, Vialaret J, Barthelemy N, Delatour V, et al: Antibody-free quantification of seven tau

peptides in human CSF using targeted mass spectrometry. Front Neurosci 2015, 9:302.

193. Barthelemy NR, Fenaille F, Hirtz C, Sergeant N, et al: Tau Protein Quantification in Human

Cerebrospinal Fluid by Targeted Mass Spectrometry at High Sequence Coverage Provides

Insights into Its Primary Structure Heterogeneity. J Proteome Res 2016, 15:667-676.

194. Glenner GG, Wong CW: Alzheimer's disease: initial report of the purification and

characterization of a novel cerebrovascular amyloid protein. Biochem Biophys Res Commun

1984, 120:885-890.

195. Seubert P, Vigo-Pelfrey C, Esch F, Lee M, et al: Isolation and quantification of soluble

Alzheimer's beta-peptide from biological fluids. Nature 1992, 359:325-327.

196. Motter R, Vigo-Pelfrey C, Kholodenko D, Barbour R, et al: Reduction of beta-amyloid peptide42

in the cerebrospinal fluid of patients with Alzheimer's disease. Ann Neurol 1995, 38:643-648.

197. Vandermeeren M, Mercken M, Vanmechelen E, Six J, et al: Detection of tau proteins in normal

and Alzheimer's disease cerebrospinal fluid with a sensitive sandwich enzyme-linked

immunosorbent assay. J Neurochem 1993, 61:1828-1834.

198. Mori H, Hosoda K, Matsubara E, Nakamoto T, et al: Tau in cerebrospinal fluids: establishment

of the sandwich ELISA with antibody specific to the repeat sequence in tau. Neurosci Lett

1995, 186:181-183.

199. Kohnken R, Buerger K, Zinkowski R, Miller C, et al: Detection of tau phosphorylated at

threonine 231 in cerebrospinal fluid of Alzheimer's disease patients. Neurosci Lett 2000,

287:187-190.

200. Vanmechelen E, Vanderstichele H, Davidsson P, Van Kerschaver E, et al: Quantification of tau

phosphorylated at threonine 181 in human cerebrospinal fluid: a sandwich ELISA with a

synthetic phosphopeptide for standardization. Neurosci Lett 2000, 285:49-52.

201. Hampel H, Buerger K, Zinkowski R, Teipel SJ, et al: Measurement of phosphorylated tau

epitopes in the differential diagnosis of Alzheimer disease: a comparative cerebrospinal fluid

study. Arch Gen Psychiatry 2004, 61:95-102.

202. Pan C, Korff A, Galasko D, Ginghina C, et al: Diagnostic Values of Cerebrospinal Fluid T-Tau and

Abeta42 using Meso Scale Discovery Assays for Alzheimer's Disease. J Alzheimers Dis 2015,

45:709-719.

203. Sunderland T, Linker G, Mirza N, Putnam KT, et al: Decreased beta-amyloid1-42 and increased

tau levels in cerebrospinal fluid of patients with Alzheimer disease. JAMA 2003, 289:2094-

2103.

204. Riemenschneider M, Lautenschlager N, Wagenpfeil S, Diehl J, et al: Cerebrospinal fluid tau and

beta-amyloid 42 proteins identify Alzheimer disease in subjects with mild cognitive

impairment. Arch Neurol 2002, 59:1729-1734.

205. Visser PJ, Verhey F, Knol DL, Scheltens P, et al: Prevalence and prognostic value of CSF markers

of Alzheimer's disease pathology in patients with subjective cognitive impairment or mild

cognitive impairment in the DESCRIPA study: a prospective cohort study. Lancet Neurol 2009,

8:619-627.

206. Shaw LM, Vanderstichele H, Knapik-Czajka M, Clark CM, et al: Cerebrospinal fluid biomarker

signature in Alzheimer's disease neuroimaging initiative subjects. Ann Neurol 2009, 65:403-

413.

207. Andreasen N, Minthon L, Davidsson P, Vanmechelen E, et al: Evaluation of CSF-tau and CSF-

Abeta42 as diagnostic markers for Alzheimer disease in clinical practice. Arch Neurol 2001,

58:373-379.

208. Zetterberg H, Wilson D, Andreasson U, Minthon L, et al: Plasma tau levels in Alzheimer's

disease. Alzheimers Res Ther 2013, 5:9.

209. Olsson B, Lautner R, Andreasson U, Ohrfelt A, et al: CSF and blood biomarkers for the diagnosis

of Alzheimer's disease: a systematic review and meta-analysis. Lancet Neurol 2016, 15:673-

684.

210. Williams JH, Wilcock GK, Seeburger J, Dallob A, et al: Non-linear relationships of cerebrospinal

fluid biomarker levels with cognitive function: an observational study. Alzheimers Res Ther

2011, 3:5.

211. Blennow K, Zetterberg H, Rinne JO, Salloway S, et al: Effect of immunotherapy with

bapineuzumab on cerebrospinal fluid biomarker levels in patients with mild to moderate

Alzheimer disease. Arch Neurol 2012, 69:1002-1010.

212. Martinez-Morillo E, Garcia Hernandez P, Begcevic I, Kosanam H, et al: Identification of novel

biomarkers of brain damage in patients with hemorrhagic stroke by integrating

bioinformatics and mass spectrometry-based proteomics. J Proteome Res 2014, 13:969-981.

213. Drabovich AP, Dimitromanolakis A, Saraon P, Soosaipillai A, et al: Differential diagnosis of

azoospermia with proteomic biomarkers ECM1 and TEX101 quantified in seminal plasma. Sci

Transl Med 2013, 5:212ra160.

214. Prassas I, Chrystoja CC, Makawita S, Diamandis EP: Bioinformatic identification of proteins

with tissue-specific expression for biomarker discovery. BMC Med 2012, 10:39.

215. Saraon P, Musrap N, Cretu D, Karagiannis GS, et al: Proteomic profiling of androgen-

independent prostate cancer cell lines reveals a role for protein S during the development of

high grade and castration-resistant prostate cancer. J Biol Chem 2012, 287:34019-34031.

216. Planque C, Kulasingam V, Smith CR, Reckamp K, et al: Identification of five candidate lung

cancer biomarkers by proteomics analysis of conditioned media of four lung cancer cell lines.

Mol Cell Proteomics 2009, 8:2746-2758.

217. Hampel H, Burger K, Teipel SJ, Bokde AL, et al: Core candidate neurochemical and imaging

biomarkers of Alzheimer's disease. Alzheimers Dement 2008, 4:38-48.

218. Korolainen MA, Nyman TA, Aittokallio T, Pirttila T: An update on clinical proteomics in

Alzheimer's research. J Neurochem 2010, 112:1386-1414.

219. Donovan LE, Higginbotham L, Dammer EB, Gearing M, et al: Analysis of a membrane-enriched

proteome from postmortem human brain tissue in Alzheimer's disease. Proteomics Clin Appl

2012, 6:201-211.

220. Xu B, Gao Y, Zhan S, Xiong F, et al: Quantitative protein profiling of hippocampus during

human aging. Neurobiol Aging 2016, 39:46-56.

221. Andreev VP, Petyuk VA, Brewer HM, Karpievitch YV, et al: Label-free quantitative LC-MS

proteomics of Alzheimer's disease and normally aged human brains. J Proteome Res 2012,

11:3053-3067.

222. Sultana R, Boyd-Kimball D, Cai J, Pierce WM, et al: Proteomics analysis of the Alzheimer's

disease hippocampal proteome. J Alzheimers Dis 2007, 11:153-164.

223. Schutzer SE, Liu T, Natelson BH, Angel TE, et al: Establishing the proteome of normal human

cerebrospinal fluid. PLoS One 2010, 5:e10980.

224. Crecelius A, Gotz A, Arzberger T, Frohlich T, et al: Assessing quantitative post-mortem changes

in the gray matter of the human frontal cortex proteome by 2-D DIGE. Proteomics 2008,

8:1276-1291.

225. Guldbrandsen A, Vethe H, Farag Y, Oveland E, et al: In-depth characterization of the

cerebrospinal fluid (CSF) proteome displayed through the CSF proteome resource (CSF-PR).

Mol Cell Proteomics 2014, 13:3152-3163.

226. Pan S, Zhu D, Quinn JF, Peskind ER, et al: A combined dataset of human cerebrospinal fluid

proteins identified by multi-dimensional chromatography and tandem mass spectrometry.

Proteomics 2007, 7:469-473.

227. Drabovich AP, Pavlou MP, Dimitromanolakis A, Diamandis EP: Quantitative analysis of energy

metabolic pathways in MCF-7 breast cancer cells by selected reaction monitoring assay. Mol

Cell Proteomics 2012, 11:422-434.

228. Bardou P, Mariette J, Escudie F, Djemiel C, et al: jvenn: an interactive Venn diagram viewer.

BMC Bioinformatics 2014, 15:293.

229. Mi H, Muruganujan A, Casagrande JT, Thomas PD: Large-scale gene function analysis with the

PANTHER classification system. Nat Protoc 2013, 8:1551-1566.

230. Braak H, Ghebremedhin E, Rub U, Bratzke H, et al: Stages in the development of Parkinson's

disease-related pathology. Cell Tissue Res 2004, 318:121-134.

231. Zhang J, Goodlett DR, Peskind ER, Quinn JF, et al: Quantitative proteomic analysis of age-

related changes in human cerebrospinal fluid. Neurobiol Aging 2005, 26:207-227.

232. Xu J, Chen J, Peskind ER, Jin J, et al: Characterization of proteome of human cerebrospinal

fluid. Int Rev Neurobiol 2006, 73:29-98.

233. Zougman A, Pilch B, Podtelejnikov A, Kiehntopf M, et al: Integrated analysis of the

cerebrospinal fluid peptidome and proteome. J Proteome Res 2008, 7:386-399.

234. Stoop MP, Coulier L, Rosenling T, Shi S, et al: Quantitative proteomics and metabolomics

analysis of normal human cerebrospinal fluid samples. Mol Cell Proteomics 2010, 9:2063-

2075.

235. Preston JE: Ageing choroid plexus-cerebrospinal fluid system. Microsc Res Tech

2001, 52:31-37.

236. Teunissen CE, Tumani H, Bennett JL, Berven FS, et al: Consensus Guidelines for CSF and Blood

Biobanking for CNS Biomarker Studies. Mult Scler Int 2011, 2011:246412.

237. Bartsch O, Schindler D, Beyer V, Gesk S, et al: A girl with an atypical form of ataxia

telangiectasia and an additional de novo 3.14 Mb microduplication in region 19q12. Eur J Med

Genet 2012, 55:49-55.

238. McNamara MJ, Ruff CT, Wasco W, Tanzi RE, et al: Immunohistochemical and in situ analysis of

amyloid precursor-like protein-1 and amyloid precursor-like protein-2 expression in

Alzheimer disease and aged control brains. Brain Res 1998, 804:45-51.

239. Kim TW, Wu K, Xu JL, McAuliffe G, et al: Selective localization of amyloid precursor-like protein

1 in the cerebral cortex postsynaptic density. Brain Res Mol Brain Res 1995, 32:36-44.

240. Li Q, Sudhof TC: Cleavage of amyloid-beta precursor protein and amyloid-beta precursor-like

protein by BACE 1. J Biol Chem 2004, 279:10542-10550.

241. Shi M, Movius J, Dator R, Aro P, et al: Cerebrospinal fluid peptides as potential Parkinson

disease biomarkers: a staged pipeline for discovery and validation. Mol Cell Proteomics 2015,

14:544-555.

242. Pla V, Paco S, Ghezali G, Ciria V, et al: Secretory sorting receptors carboxypeptidase E and

secretogranin III in amyloid beta-associated neural degeneration in Alzheimer's disease. Brain

Pathol 2013, 23:274-284.

243. Li F, Tian X, Zhou Y, Zhu L, et al: Dysregulated expression of secretogranin III is involved in

neurotoxin-induced dopaminergic neuron apoptosis. J Neurosci Res 2012, 90:2237-2246.

244. Mattsson N, Ruetschi U, Podust VN, Stridsberg M, et al: Cerebrospinal fluid concentrations of

peptides derived from chromogranin B and secretogranin II are decreased in multiple

sclerosis. J Neurochem 2007, 103:1932-1939.

245. Teunissen CE, Koel-Simmelink MJ, Pham TV, Knol JC, et al: Identification of biomarkers for

diagnosis and progression of MS by MALDI-TOF mass spectrometry. Mult Scler 2011, 17:838-

850.

246. Petraki CD, Karavana VN, Skoufogiannis PT, Little SP, et al: The spectrum of human kallikrein 6

(zyme/protease M/neurosin) expression in human tissues as assessed by

immunohistochemistry. J Histochem Cytochem 2001, 49:1431-1441.

247. Shaw JL, Diamandis EP: Distribution of 15 human kallikreins in tissues and biological fluids.

Clin Chem 2007, 53:1423-1432.

248. Diamandis EP, Yousef GM, Soosaipillai AR, Grass L, et al: Immunofluorometric assay of human

kallikrein 6 (zyme/protease M/neurosin) and preliminary clinical applications. Clin Biochem

2000, 33:369-375.

249. Little SP, Dixon EP, Norris F, Buckley W, et al: Zyme, a novel and potentially amyloidogenic

enzyme cDNA isolated from Alzheimer's disease brain. J Biol Chem 1997, 272:25135-25142.

250. Magklara A, Mellati AA, Wasney GA, Little SP, et al: Characterization of the enzymatic activity

of human kallikrein 6: Autoactivation, substrate specificity, and regulation by inhibitors.

Biochem Biophys Res Commun 2003, 307:948-955.

251. Ashby EL, Kehoe PG, Love S: Kallikrein-related peptidase 6 in Alzheimer's disease and vascular

dementia. Brain Res 2010, 1363:1-10.

252. Ogawa K, Yamada T, Tsujioka Y, Taguchi J, et al: Localization of a novel type trypsin-like serine

protease, neurosin, in brain tissues of Alzheimer's disease and Parkinson's disease. Psychiatry

Clin Neurosci 2000, 54:419-426.

253. Zarghooni M, Soosaipillai A, Grass L, Scorilas A, et al: Decreased concentration of human

kallikrein 6 in brain extracts of Alzheimer's disease patients. Clin Biochem 2002, 35:225-231.

254. Mitsui S, Okui A, Uemura H, Mizuno T, et al: Decreased cerebrospinal fluid levels of neurosin

(KLK6), an aging-related protease, as a possible new risk factor for Alzheimer's disease. Ann N

Y Acad Sci 2002, 977:216-223.

255. Kasai T, Tokuda T, Yamaguchi N, Watanabe Y, et al: Cleavage of normal and pathological forms

of alpha-synuclein by neurosin in vitro. Neurosci Lett 2008, 436:52-56.

256. Recchia A, Debetto P, Negro A, Guidolin D, et al: Alpha-synuclein and Parkinson's disease.

FASEB J 2004, 18:617-626.

257. Tatebe H, Watanabe Y, Kasai T, Mizuno T, et al: Extracellular neurosin degrades alpha-

synuclein in cultured cells. Neurosci Res 2010, 67:341-346.

258. Spencer B, Michael S, Shen J, Kosberg K, et al: Lentivirus mediated delivery of neurosin

promotes clearance of wild-type alpha-synuclein and reduces the pathology in an alpha-

synuclein model of LBD. Mol Ther 2013, 21:31-41.

259. Scarisbrick IA, Radulovic M, Burda JE, Larson N, et al: Kallikrein 6 is a novel molecular trigger of

reactive astrogliosis. Biol Chem 2012, 393:355-367.

260. Burda JE, Radulovic M, Yoon H, Scarisbrick IA: Critical role for PAR1 in kallikrein 6-mediated

oligodendrogliopathy. Glia 2013, 61:1456-1470.

261. Hebb AL, Bhan V, Wishart AD, Moore CS, et al: Human kallikrein 6 cerebrospinal levels are

elevated in multiple sclerosis. Curr Drug Discov Technol 2010, 7:137-140.

262. Maetzler W, Berg D, Schalamberidze N, Melms A, et al: Osteopontin is elevated in Parkinson's

disease and its absence leads to reduced neurodegeneration in the MPTP model. Neurobiol

Dis 2007, 25:473-482.

263. Carecchio M, Comi C: The role of osteopontin in neurodegenerative diseases. J Alzheimers Dis

2011, 25:179-185.

264. Sinclair C, Mirakhur M, Kirk J, Farrell M, et al: Up-regulation of osteopontin and alphaBeta-

crystallin in the normal-appearing white matter of multiple sclerosis: an

immunohistochemical study utilizing tissue microarrays. Neuropathol Appl Neurobiol 2005,

31:292-303.

265. Sun Y, Yin XS, Guo H, Han RK, et al: Elevated osteopontin levels in mild cognitive impairment

and Alzheimer's disease. Mediators Inflamm 2013, 2013:615745.

266. Housley WJ, Pitt D, Hafler DA: Biomarkers in multiple sclerosis. Clin Immunol 2015, 161:51-58.

267. Begcevic I, Brinc D, Drabovich AP, Batruch I, et al: Identification of brain-enriched proteins in

the cerebrospinal fluid proteome by LC-MS/MS profiling and mining of the Human Protein

Atlas. Clin Proteomics 2016, 13:11.

268. Henry P: The measurement of splashover and carryover in centrifugal analyzers. J Automat

Chem 1979, 1:195-198.

269. International Union of Pure and Applied Chemistry (IUPAC): Proposals for the description and

measurement of carry-over effects in clinical chemistry. Pure Appl Chem 1991, 63:301-306.

270. Spellman DS, Wildsmith KR, Honigberg LA, Tuefferd M, et al: Development and evaluation of a

multiplexed mass spectrometry based assay for measuring candidate peptide biomarkers in

Alzheimer's Disease Neuroimaging Initiative (ADNI) CSF. Proteomics Clin Appl 2015, 9:715-731.

271. Wildsmith KR, Schauer SP, Smith AM, Arnott D, et al: Identification of longitudinally dynamic

biomarkers in Alzheimer's disease cerebrospinal fluid by targeted proteomics. Mol

Neurodegener 2014, 9:22.


272. Paterson RW, Heywood WE, Heslegrave AJ, Magdalinou NK, et al: A targeted proteomic

multiplex CSF assay identifies increased malate dehydrogenase and other neurodegenerative

biomarkers in individuals with Alzheimer's disease pathology. Transl Psychiatry 2016, 6:e952.

273. Kroksveen AC, Jaffe JD, Aasebo E, Barsnes H, et al: Quantitative proteomics suggests decrease

in the secretogranin-1 cerebrospinal fluid levels during the disease course of multiple

sclerosis. Proteomics 2015, 15:3361-3369.

274. Burgess MW, Keshishian H, Mani DR, Gillette MA, et al: Simplified and efficient quantification

of low-abundance proteins at very high multiplex via targeted mass spectrometry. Mol Cell

Proteomics 2014, 13:1137-1149.

275. Anderson NL, Anderson NG, Haines LR, Hardie DB, et al: Mass spectrometric quantitation of

peptides and proteins using Stable Isotope Standards and Capture by Anti-Peptide Antibodies

(SISCAPA). J Proteome Res 2004, 3:235-244.

276. Karakosta TD, Soosaipillai A, Diamandis EP, Batruch I, et al: Quantification of Human Kallikrein-

Related Peptidases in Biological Fluids by Multiplatform Targeted Mass Spectrometry Assays.

Mol Cell Proteomics 2016, 15:2863-2876.

277. Gallien S, Duriez E, Domon B: Selected reaction monitoring applied to proteomics. J Mass

Spectrom 2011, 46:298-312.

278. McKhann G, Drachman D, Folstein M, Katzman R, et al: Clinical diagnosis of Alzheimer's

disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of

Health and Human Services Task Force on Alzheimer's Disease. Neurology 1984, 34:939-944.

279. Petersen RC, Smith GE, Waring SC, Ivnik RJ, et al: Mild cognitive impairment: clinical

characterization and outcome. Arch Neurol 1999, 56:303-308.

280. Toledo JB, Zetterberg H, van Harten AC, Glodzik L, et al: Alzheimer's disease cerebrospinal fluid

biomarker in cognitively normal subjects. Brain 2015, 138:2701-2715.


281. Szklarczyk D, Morris JH, Cook H, Kuhn M, et al: The STRING database in 2017: quality-

controlled protein-protein association networks, made broadly accessible. Nucleic Acids Res

2017, 45:D362-D368.

282. Lee SJ, Wei M, Zhang C, Maxeiner S, et al: Presynaptic Neuronal Pentraxin Receptor Organizes

Excitatory and Inhibitory Synapses. J Neurosci 2017, 37:1062-1080.

283. Yin GN, Lee HW, Cho JY, Suk K: Neuronal pentraxin receptor in cerebrospinal fluid as a

potential biomarker for neurodegenerative diseases. Brain Res 2009, 1265:158-170.

284. Ringman JM, Schulman H, Becker C, Jones T, et al: Proteomic changes in cerebrospinal fluid of

presymptomatic and affected persons carrying familial Alzheimer disease mutations. Arch

Neurol 2012, 69:96-104.

285. Ludewig S, Korte M: Novel Insights into the Physiological Function of the APP (Gene) Family

and Its Proteolytic Fragments in Synaptic Plasticity. Front Mol Neurosci 2016, 9:161.

286. Eggert S, Paliga K, Soba P, Evin G, et al: The proteolytic processing of the amyloid precursor

protein gene family members APLP-1 and APLP-2 involves alpha-, beta-, gamma-, and epsilon-

like cleavages: modulation of APLP-1 processing by n-glycosylation. J Biol Chem 2004,

279:18146-18156.

287. Yanagida K, Okochi M, Tagami S, Nakayama T, et al: The 28-amino acid form of an APLP1-

derived Abeta-like peptide is a surrogate marker for Abeta42 production in the central

nervous system. EMBO Mol Med 2009, 1:223-235.

288. Sjodin S, Andersson KK, Mercken M, Zetterberg H, et al: APLP1 as a cerebrospinal fluid

biomarker for gamma-secretase modulator treatment. Alzheimers Res Ther 2015, 7:77.

289. Wung JK, Perry G, Kowalski A, Harris PL, et al: Increased expression of the remodeling- and

tumorigenic-associated factor osteopontin in pyramidal neurons of the Alzheimer's disease

brain. Curr Alzheimer Res 2007, 4:67-72.


290. Comi C, Carecchio M, Chiocchetti A, Nicola S, et al: Osteopontin is increased in the

cerebrospinal fluid of patients with Alzheimer's disease and its levels correlate with cognitive

decline. J Alzheimers Dis 2010, 19:1143-1148.

291. Simonsen AH, McGuire J, Hansson O, Zetterberg H, et al: Novel panel of cerebrospinal fluid

biomarkers for the prediction of progression to Alzheimer dementia in patients with mild

cognitive impairment. Arch Neurol 2007, 64:366-370.

292. Heywood WE, Galimberti D, Bliss E, Sirka E, et al: Identification of novel CSF biomarkers for

neurodegeneration and their validation by a high-throughput multiplexed targeted proteomic

assay. Mol Neurodegener 2015, 10:64.

293. Bornsen L, Khademi M, Olsson T, Sorensen PS, et al: Osteopontin concentrations are increased

in cerebrospinal fluid during attacks of multiple sclerosis. Mult Scler 2011, 17:32-42.

294. Ma QH, Futagawa T, Yang WL, Jiang XD, et al: A TAG1-APP signalling pathway through Fe65

negatively modulates neurogenesis. Nat Cell Biol 2008, 10:283-294.

295. Gautam V, D'Avanzo C, Hebisch M, Kovacs DM, et al: BACE1 activity regulates cell surface

contactin-2 levels. Mol Neurodegener 2014, 9:4.

296. Derfuss T, Parikh K, Velhin S, Braun M, et al: Contactin-2/TAG-1-directed autoimmunity is

identified in multiple sclerosis patients and mediates gray matter pathology in animals. Proc

Natl Acad Sci U S A 2009, 106:8302-8307.

297. Schilling S, Mehr A, Ludewig S, Stephan J, et al: APLP1 Is a Synaptic Cell Adhesion Molecule,

Supporting Maintenance of Dendritic Spines and Basal Synaptic Transmission. J Neurosci

2017, 37:5345-5365.

298. Gennarini G, Bizzoca A, Picocci S, Puzzo D, et al: The role of Gpi-anchored axonal glycoproteins

in neural development and neurological disorders. Mol Cell Neurosci 2017, 81:49-63.


299. Remnestal J, Just D, Mitsios N, Fredolini C, et al: CSF profiling of the human brain enriched

proteome reveals associations of neuromodulin and neurogranin to Alzheimer's disease.

Proteomics Clin Appl 2016, 10:1242-1253.

300. Abdi F, Quinn JF, Jankovic J, McIntosh M, et al: Detection of biomarkers with a multiplex

quantitative proteomic platform in cerebrospinal fluid of patients with neurodegenerative

disorders. J Alzheimers Dis 2006, 9:293-348.

301. Gillet LC, Navarro P, Tate S, Rost H, et al: Targeted data extraction of the MS/MS spectra

generated by data-independent acquisition: a new concept for consistent and accurate

proteome analysis. Mol Cell Proteomics 2012, 11:O111 016717.

302. Schubert OT, Gillet LC, Collins BC, Navarro P, et al: Building high-quality assay libraries for

targeted analysis of SWATH MS data. Nat Protoc 2015, 10:426-441.

303. Thorsell A, Bjerke M, Gobom J, Brunhage E, et al: Neurogranin in cerebrospinal fluid as a

marker of synaptic degeneration in Alzheimer's disease. Brain Res 2010, 1362:13-22.


Appendices

1. Appendix 2.1: Top 10 upregulated proteins in Alzheimer's disease (AD) tissues in comparison to Control tissues.

| Gene Name | Spectral counts 'AD' | Spectral counts 'Control' | Fold-change |
|---|---|---|---|
| MSN | 48.01 | 8.76 | 5.5 |
| HSPB1 | 41.51 | 8.12 | 5.1 |
| S100B | 197.88 | 39.58 | 5.0 |
| CLIC1 | 29.57 | 6.23 | 4.7 |
| DDAH2 | 34.67 | 8.24 | 4.2 |
| RDX | 41.15 | 11.01 | 3.7 |
| NDRG2 | 229.94 | 67.60 | 3.4 |
| PRDX6 | 154.51 | 45.81 | 3.4 |
| EPHX1 | 28.37 | 8.71 | 3.3 |
| AHCY | 16.93 | 5.22 | 3.2 |

A minimum of 2-peptide hit protein identification was used.
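The fold-change column appears to be the ratio of the two spectral-count columns (AD over Control), rounded to one decimal. The following is a minimal Python sketch under that assumption, using three rows of Appendix 2.1; the function name `fold_change` is illustrative and not part of the original analysis.

```python
def fold_change(counts_numerator: float, counts_denominator: float) -> float:
    """Ratio of two spectral counts, rounded to one decimal as in the table."""
    if counts_denominator <= 0:
        raise ValueError("denominator spectral count must be positive")
    return round(counts_numerator / counts_denominator, 1)


# Spectral counts ('AD', 'Control') copied from Appendix 2.1.
rows = {"MSN": (48.01, 8.76), "HSPB1": (41.51, 8.12), "S100B": (197.88, 39.58)}
for gene, (ad, control) in rows.items():
    print(gene, fold_change(ad, control))
```

For Appendix 2.2 the same ratio would be taken in the opposite direction (Control over AD), for example `fold_change(37.25, 5.06)` for EEF1A2.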

2. Appendix 2.2: Top 10 upregulated proteins in 'Control' tissues in comparison to Alzheimer's disease (AD) tissues.

| Gene Name | Spectral counts 'AD' | Spectral counts 'Control' | Fold-change |
|---|---|---|---|
| EEF1A2 | 5.06 | 37.25 | 7.4 |
| OGDH | 10.11 | 47.95 | 4.7 |
| IGSF8 | 5.65 | 26.06 | 4.6 |
| ACTR2 | 9.93 | 44.05 | 4.4 |
| IGHV4-31 | 17.55 | 76.61 | 4.4 |
| DCTN1 | 8.60 | 37.29 | 4.3 |
| RPLP2 | 10.24 | 43.40 | 4.2 |
| PAFAH1B1 | 5.79 | 22.78 | 3.9 |
| SPTBN2 | 33.94 | 131.50 | 3.9 |
| TF | 8.55 | 32.62 | 3.8 |

A minimum of 2-peptide hit protein identification was used.


3. Appendix 2.3: List of 40 CSF proteins that were found exclusively in Alzheimer's disease hippocampal tissues.

| Gene Name | Cellular localization* |
|---|---|
| ACAN | extracellular |
| PTN | extracellular, endoplasmic reticulum, cytoplasm |
| SEMA4C | cytoskeleton, membrane, cytoplasm |
| ENDOD1 | extracellular |
| PI16 | extracellular, membrane |
| BID | mitochondrion, membrane, cytoplasm, cytosol |
| EFEMP1 | extracellular, cell surface, membrane |
| CREG1 | extracellular, organelle lumen, nucleus |
| SUMF2 | |
| LAMP1 | cell surface, membrane, cytoplasm, vacuole, endosome |
| SEMA7A | cell surface, membrane |
| SCG5 | extracellular, cytoplasm |
| SPARC | extracellular, cytoplasm, nucleus |
| VASN | extracellular, membrane |
| ADRM1 | membrane, cytoplasm, proteasome, organelle lumen, nucleus |
| IGSF1 | extracellular, membrane |
| LYPLA1 | mitochondrion, cytoplasm |
| CD14 | extracellular, cell surface, membrane, cytoplasm, endosome |
| SNX12 | membrane |
| NID2 | extracellular, cell surface, membrane |
| LMAN2 | membrane, endoplasmic reticulum, cytoplasm, Golgi |
| DDRGK1 | endoplasmic reticulum, cytoplasm |
| PODXL2 | membrane |
| CTSH | extracellular, cytoplasm, vacuole, cytosol |
| SDC4 | extracellular, cell surface, membrane, cytoplasm, Golgi, vacuole, organelle lumen |
| TXNDC5 | endoplasmic reticulum, cytoplasm, organelle lumen, vacuole |
| NLN | mitochondrion, cytoplasm |
| HS6ST2 | membrane, cytoplasm, Golgi |
| LAMC1 | extracellular, cytoskeleton, organelle lumen, chromosome, nucleus |
| PHOSPHO2 | |
| LAMA5 | extracellular, cytoskeleton, membrane, cytoplasm |
| LGI3 | extracellular, cytoplasm |
| CDH4 | membrane |
| PODXL | membrane |
| COL6A1 | extracellular, membrane, endoplasmic reticulum, cytoplasm, organelle lumen |
| SCN1B | extracellular, membrane |
| S100A6 | membrane, cytoplasm, cytosol, nucleus |
| CNDP1 | extracellular |
| IL6ST | extracellular, cell surface, membrane |
| LAMA2 | extracellular, cytoskeleton, membrane, mitochondrion, cytoplasm, organelle lumen, chromosome, cytosol, nucleus |

* Gene Ontology information was retrieved from Protein Center.

4. Appendix 3.1: Brain-enriched and group-enriched proteins identified in brain proteome. A) Number of brain-enriched proteins (secreted/membrane) identified in brain hippocampal proteome (n=57). B) Number of group-enriched (secreted/membrane) proteins identified in brain hippocampal proteome (n=23).

5. Appendix 3.2: 78 tissue-enriched and brain-enriched proteins.

| Gene | Average unique peptides | Average areas | Number of samples detected | RNA tissue category | RNA TS | IHC (HPA) brain evidence | IHC (HPA) tissue expression | Brain proteome |
|---|---|---|---|---|---|---|---|---|
| APLP1 | 9 | 1.07E+10 | 6 | Tissue enriched | 6 | detected | 1/42* | present |
| SCG3 | 24 | 7.98E+09 | 6 | Tissue enriched | 8 | detected | 7/42 | present |
| BCAN | 21 | 3.64E+09 | 6 | Tissue enriched | 15 | detected | 1/42 | present |
| VGF | 23 | 3.62E+09 | 6 | Tissue enriched | 10 | detected | 3/42 | present |
| NPTX1 | 11 | 3.27E+09 | 6 | Tissue enriched | 18 | NA | NA | present |
| NCAN | 22 | 2.40E+09 | 6 | Tissue enriched | 60 | detected | 1/42 | present |
| OPCML | 7 | 1.56E+09 | 6 | Tissue enriched | 14 | NA | NA | present |
| NPTXR | 14 | 1.39E+09 | 6 | Tissue enriched | 7 | detected | 2/42 | present |
| CNTN2 | 26 | 1.35E+09 | 6 | Tissue enriched | 6 | detected | 1/42 | present |
| SEZ6 | 13 | 1.35E+09 | 6 | Tissue enriched | 35 | detected | 1/42 | absent |
| NRXN2 | 20 | 1.18E+09 | 6 | Tissue enriched | 5 | NA | NA | absent |
| SLITRK1 | 7 | 1.00E+09 | 6 | Tissue enriched | 27 | detected | 1/42 | absent |
| BAI2 | 4 | 8.40E+08 | 6 | Tissue enriched | 8 | NA | NA | absent |
| NRXN1 | 20 | 7.82E+08 | 6 | Tissue enriched | 16 | NA | NA | present |
| CADM2 | 6 | 7.32E+08 | 6 | Tissue enriched | 10 | detected | 3/42 | present |
| MOG | 3 | 6.13E+08 | 6 | Tissue enriched | 48 | detected | 1/42 | absent |
| VSTM2B | 5 | 5.84E+08 | 6 | Tissue enriched | 108 | NA | NA | absent |
| CNTNAP4 | 12 | 5.38E+08 | 6 | Tissue enriched | 15 | detected | 1/42 | absent |
| MEGF10 | 7 | 4.35E+08 | 6 | Tissue enriched | 5 | detected | 31/42 | absent |
| LY6H | 4 | 4.06E+08 | 6 | Tissue enriched | 8 | NA | NA | present |
| SERPINI1 | 8 | 3.91E+08 | 6 | Tissue enriched | 8 | detected | 1/42 | present |
| TNR | 6 | 3.66E+08 | 6 | Tissue enriched | 25 | detected | 1/42 | present |
| GRIA4 | 4 | 3.31E+08 | 6 | Tissue enriched | 6 | NA | NA | present |
| LRRC4B | 4 | 3.18E+08 | 6 | Tissue enriched | 5 | detected | 37/42 | present |
| PCDHGC5 | 4 | 2.62E+08 | 6 | Tissue enriched | 32 | detected | NA | absent |
| LINGO1 | 6 | 1.52E+08 | 6 | Tissue enriched | 8 | NA | NA | present |
| SCN3B | 1 | 1.50E+08 | 6 | Tissue enriched | 8 | NA | NA | present |
| TMEM132D | 4 | 1.32E+08 | 6 | Tissue enriched | 13 | detected | 2/42 | absent |
| OLFM1 | 4 | 8.97E+07 | 6 | Tissue enriched | 13 | detected | 26/42 | present |
| PCDH8 | 3 | 8.41E+07 | 6 | Tissue enriched | 5 | detected | NA | absent |
| PCDH9 | 3 | 6.03E+07 | 6 | Tissue enriched | 9 | detected | NA | present |
| CDH8 | 2 | 4.23E+07 | 6 | Tissue enriched | 8 | detected | 1/40 | absent |
| KIAA1549L | 2 | 1.84E+07 | 6 | Tissue enriched | 13 | detected | NA | absent |
| CSPG5 | 1 | 2.55E+08 | 5 | Tissue enriched | 43 | detected | 1/42 | present |
| CBLN2 | 1 | 2.30E+08 | 5 | Tissue enriched | 9 | NA | NA | absent |
| FAM19A1 | 1 | 1.09E+08 | 5 | Tissue enriched | 17 | detected | 18/42 | absent |
| ATP1B2 | 1 | 9.09E+07 | 5 | Tissue enriched | 6 | detected | 1/41 | present |
| ST8SIA3 | 1 | 5.82E+07 | 4 | Tissue enriched | 6 | detected | 1/41 | absent |
| C1QTNF4 | 2 | 4.86E+07 | 4 | Tissue enriched | 8 | detected | 19/41 | absent |
| TMEM59L | 2 | 2.74E+07 | 4 | Tissue enriched | 11 | detected | 1/42 | absent |
| LGI1 | 2 | 2.19E+07 | 4 | Tissue enriched | 6 | NA | NA | present |
| LRRTM2 | 1 | 1.83E+07 | 5 | Tissue enriched | 9 | NA | NA | absent |
| GPM6A | 1 | 1.70E+07 | 4 | Tissue enriched | 23 | detected | 1/42 | present |
| GPR158 | 1 | 9.62E+06 | 4 | Tissue enriched | 9 | detected | 1/42 | present |
| IL1RAPL1 | 1 | 8.09E+06 | 4 | Tissue enriched | 8 | detected | NA | absent |
| KLK6 | 14 | 1.61E+10 | 6 | Group enriched | 6 | NA | NA | absent |
| SPP1 | 10 | 1.32E+10 | 6 | Group enriched | 10 | detected | 25/42 | present |
| NRCAM | 35 | 9.43E+09 | 6 | Group enriched | 6 | detected | 2/42 | present |
| CNDP1 | 25 | 9.05E+09 | 6 | Group enriched | 7 | detected | 3/42 | present |
| SCG2 | 18 | 1.46E+09 | 6 | Group enriched | 6 | detected | 39/42 | present |
| CADM3 | 10 | 1.38E+09 | 6 | Group enriched | 6 | detected | 3/42 | present |
| PTPRZ1 | 12 | 1.58E+09 | 6 | Group enriched | 6 | detected | 1/42 | present |
| SEZ6L | 8 | 2.90E+08 | 6 | Group enriched | 15 | NA | NA | absent |
| SPOCK3 | 5 | 5.79E+08 | 6 | Group enriched | 7 | NA | NA | absent |
| RTN4RL2 | 7 | 5.51E+08 | 6 | Group enriched | 6 | detected | 41/41 | absent |
| CNTNAP2 | 15 | 4.92E+08 | 6 | Group enriched | 5 | detected | 2/42 | present |
| DNER | 3 | 4.05E+08 | 6 | Group enriched | 6 | detected | 1/42 | absent |
| CAMK2B | 3 | 2.67E+08 | 6 | Group enriched | 6 | detected | 6/42 | present |
| ICAM5 | 8 | 2.52E+08 | 6 | Group enriched | 7 | detected | 1/42 | present |
| FSTL5 | 6 | 2.51E+08 | 6 | Group enriched | 15 | detected | 42/42 | absent |
| FRRS1L | 4 | 2.12E+08 | 6 | Group enriched | 7 | detected | 28/42 | absent |
| CAMK2A | 3 | 2.06E+08 | 6 | Group enriched | 12 | detected | 34/42 | present |
| MAG | 3 | 1.55E+08 | 6 | Group enriched | 7 | detected | 1/42 | present |
| NXPH1 | 2 | 9.31E+07 | 6 | Group enriched | 44 | detected | 34/41 | absent |
| MDGA2 | 3 | 8.04E+07 | 6 | Group enriched | 9 | not detected | 20/42 | absent |
| EPHA5 | 2 | 7.29E+07 | 6 | Group enriched | 5 | NA | NA | present |
| SST | 1 | 3.61E+07 | 6 | Group enriched | 5 | detected | 8/42 | absent |
| TMEM132B | 2 | 3.30E+07 | 6 | Group enriched | 6 | detected | NA | present |
| CDH18 | 2 | 2.78E+07 | 6 | Group enriched | 14 | detected | 3/42 | absent |
| IGLON5 | 3 | 8.09E+08 | 5 | Group enriched | 6 | detected | 1/42 | present |
| CNTNAP5 | 3 | 1.50E+08 | 4 | Group enriched | 16 | NA | NA | absent |
| EFNA3 | 1 | 1.38E+08 | 4 | Group enriched | 5 | detected | 16/42 | absent |
| CALY | 1 | 8.85E+07 | 5 | Group enriched | 15 | detected | 38/41 | absent |
| AQP4 | 1 | 8.10E+07 | 5 | Group enriched | 9 | detected | 4/42 | present |
| NPY | 2 | 6.95E+07 | 4 | Group enriched | 15 | detected | 6/42 | absent |
| PTPRN | 3 | 4.48E+07 | 4 | Group enriched | 6 | detected | 3/42 | present |
| PTPRT | 2 | 3.58E+07 | 4 | Group enriched | 6 | detected | 29/42 | absent |
| FAIM2 | 2 | 2.72E+07 | 4 | Group enriched | 6 | detected | 1/42 | present |

* number of tissues protein is expressed / total number of tissues evaluated, including brain
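The "RNA tissue category" and "RNA TS" columns follow the Human Protein Atlas classification. This appendix does not state the cut-offs, so the sketch below assumes a five-fold mRNA threshold and a group size of two to seven tissues; both values, and the function name `classify_expression`, are assumptions made for illustration. The lowest TS values in the table (5) are at least consistent with a five-fold cut-off.

```python
def classify_expression(levels: dict[str, float], fold: float = 5.0,
                        max_group: int = 7) -> tuple[str, float]:
    """Return (category, specificity score) for one gene.

    `levels` maps tissue name -> mRNA level. The 5-fold threshold and the
    group size limit are ASSUMED values, not taken from this appendix.
    """
    values = sorted(levels.values(), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two tissues")

    def ratio(numerator: float, denominator: float) -> float:
        if denominator == 0:
            return float("inf") if numerator > 0 else 0.0
        return numerator / denominator

    # Tissue enriched: highest tissue versus the second-highest tissue.
    top_score = ratio(values[0], values[1])
    if top_score >= fold:
        return "Tissue enriched", top_score

    # Group enriched: mean of the top k tissues versus the highest remaining one.
    for k in range(2, min(max_group, len(values) - 1) + 1):
        group_score = ratio(sum(values[:k]) / k, values[k])
        if group_score >= fold:
            return "Group enriched", group_score
    return "Not enriched", top_score


# Hypothetical expression levels, for illustration only.
print(classify_expression({"brain": 60.0, "liver": 5.0, "kidney": 4.0}))
print(classify_expression({"brain": 50.0, "pituitary": 40.0, "liver": 5.0, "kidney": 4.0}))
```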

6. Appendix 3.3: Supplementary method.

KLK6 selected reaction monitoring (SRM) assay

Post-mortem brain tissue samples were obtained with Research Ethics Board approval from the University Health Network, Toronto, Canada. Frozen tissue sections from several regions (frontal cortex, substantia nigra and cerebellum) from three control patients (diagnosed with non-metastatic colon cancer, cardiovascular disease, or heart failure) were first homogenized in liquid nitrogen using a mortar and pestle, followed by homogenization in 50 mM ammonium bicarbonate with a Polytron PT3100 homogenizer (Capitol Scientific, USA) at 15,000 rpm for 30 s, and then sonicated on ice three times for 10 s with a MISONIX immersion tip sonicator (Q SONICA LLC, USA). The samples were centrifuged at 15,000 g at 4 °C for 10 min; supernatants were collected and adjusted to equal total protein amounts. In addition, a CSF pool of non-pathological samples was prepared and subjected to mass spectrometry sample preparation.

For SRM-based quantification, 10 µg of total protein from the tissue extracts and the CSF pool was denatured with 0.05% RapiGest, reduced with 5 mM dithiothreitol (40 min at 60 °C) and alkylated with 15 mM iodoacetamide (1 hour, 22 °C, in the dark). Heavy-labelled synthetic KLK6 proteotypic peptide (SpikeTides TQL, JPT Peptide Technologies, Berlin, Germany) was spiked into each sample (10 fmol and 20 fmol/injection for tissue extracts and CSF pool, respectively), after which samples were digested with trypsin overnight at 37 °C, at an enzyme-to-protein ratio of 1:10 for tissue extracts and 1:30 for the CSF pool. Peptides were then purified by extraction with OMIX C18 tips, separated by liquid chromatography on an EASY-nLC system, ionized by nano-electrospray ionization, and analysed using TSQ Vantage and TSQ Quantiva mass spectrometers (Thermo Fisher Scientific, USA).

The proteotypic peptide LSELIQPLPLER was used for KLK6 quantification. Only the three most intense and specific fragment ion transitions (m/z 965.6, 852.5 and 724.4 for the light peptide, precursor m/z 704.4; m/z 975.6, 862.5 and 734.4 for the heavy peptide, precursor m/z 709.4) were used for quantification. The analysis was done using PinPoint (Thermo Fisher Scientific, USA) and Skyline (University of Washington) software.
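The precursor and fragment m/z values quoted for LSELIQPLPLER can be checked from monoisotopic residue masses. The sketch below is an illustration rather than the software used in the study; it assumes a doubly charged precursor, singly charged y-ions, and a heavy C-terminal arginine (13C6, 15N4, +10.008 Da). It returns values that agree with the light precursor (704.4), the y8, y7 and y6 fragments (965.6, 852.5, 724.4) and the heavy precursor listed in Appendix 4.5 (709.42).

```python
# Monoisotopic residue masses (Da) for the amino acids in LSELIQPLPLER.
RESIDUE_MASS = {
    "L": 113.08406, "I": 113.08406, "S": 87.03203, "E": 129.04259,
    "Q": 128.05858, "P": 97.05276, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
HEAVY_ARG_SHIFT = 10.00827  # 13C6,15N4-arginine; ASSUMED label on the C-terminal R


def precursor_mz(sequence: str, charge: int = 2, heavy: bool = False) -> float:
    """m/z of the intact peptide carrying `charge` protons."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    if heavy:
        mass += HEAVY_ARG_SHIFT
    return (mass + charge * PROTON) / charge


def y_ion_mz(sequence: str, n: int, heavy: bool = False) -> float:
    """m/z of the singly charged y-ion made of the last n residues."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence[-n:]) + WATER + PROTON
    return mass + HEAVY_ARG_SHIFT if heavy else mass


peptide = "LSELIQPLPLER"
print(round(precursor_mz(peptide), 2))              # light precursor
print(round(precursor_mz(peptide, heavy=True), 2))  # heavy precursor
print([round(y_ion_mz(peptide, n), 2) for n in (8, 7, 6)])
```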


7. Appendix 3.4: KLK6 concentration in brain tissue extracts and CSF pool. Brain tissue extracts and CSF pool were subjected to mass spectrometry sample preparation and analyzed using TSQ Vantage (brain tissues) and TSQ Quantiva (CSF) mass spectrometers. One-way ANOVA and Bonferroni's multiple comparison test were performed between brain regions with GraphPad Prism (n=3, *p<0.05). Data are shown as mean ± standard error of the mean (SEM). TP, total protein; SNc, substantia nigra.
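With a spiked heavy peptide, the endogenous amount is commonly estimated from the light-to-heavy peak area ratio, and group values are then summarized as mean ± SEM. The sketch below illustrates both steps. It assumes single-point calibration with equal response of the light and heavy peptides, and all peak areas are hypothetical placeholders rather than study data; the function names are illustrative.

```python
import statistics


def endogenous_fmol(light_area: float, heavy_area: float, spiked_fmol: float) -> float:
    """Amount of endogenous (light) peptide from the light/heavy area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy peak area must be positive")
    return light_area / heavy_area * spiked_fmol


def mean_sem(values: list[float]) -> tuple[float, float]:
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates")
    return statistics.mean(values), statistics.stdev(values) / n ** 0.5


# HYPOTHETICAL (light, heavy) peak areas for three tissue extracts,
# each injected with 10 fmol of heavy peptide.
areas = [(2.0e5, 1.0e5), (3.0e5, 1.0e5), (4.0e5, 1.0e5)]
amounts = [endogenous_fmol(light, heavy, spiked_fmol=10.0) for light, heavy in areas]
mean, sem = mean_sem(amounts)
print(amounts, mean, round(sem, 2))
```

Converting these per-injection amounts to a concentration per unit of total protein would additionally require the amount of total protein injected, which is not modelled here.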

8. Appendix 4.1: Endogenous peptides used for prediction of retention time (RT).

| Protein | Peptide Sequence | Precursor (m/z) | Average Measured RT (min) |
|---|---|---|---|
| CST3 | ASNDMYHSR | 360.82 | 13.2 |
| A2M | GPTQEFK | 403.71 | 17.3 |
| RBP4 | LIVHNGYCDGR | 435.21 | 19.6 |
| APOE | LGPLVEQGR | 484.78 | 24.9 |
| SCG2 | VLEYLNQEK | 568.30 | 26.9 |
| SERPINF2 | LCQDLGPGAFR | 617.31 | 29.9 |
| CST3 | ALDFAVGEYNK | 613.81 | 33.5 |
| KLK6 | LSELIQPLPLER | 704.41 | 40.0 |
| APOH | FICPLTGLWPINTLK | 886.99 | 52.2 |


9. Appendix 4.2: Pierce peptides used for prediction of retention time (RT).

| Peptide Sequence | Precursor (m/z) | Average Measured RT (min) |
|---|---|---|
| SSAAPPPPPR* | 493.77 | 14.47 |
| GISNEGQNASIK* | 613.32 | 17.14 |
| DIPVPKPK* | 451.28 | 20.98 |
| IGDYAGIK* | 422.74 | 23.00 |
| TASEFDSAIAQDK* | 695.83 | 25.8 |
| SAAGAFGPELSR* | 586.80 | 27.68 |
| ELGQSGVDTYLQTK* | 773.90 | 29.05 |
| GLILVGGYGTR* | 558.33 | 34.43 |
| GILFVGSGVSGGEEGAR* | 801.41 | 35.01 |
| SFANQPLEVVYSK* | 745.39 | 34.22 |
| LTILEELR* | 498.80 | 39.54 |
| NGFILDGFPR* | 573.30 | 41.18 |
| ELASGLSFPVGFK* | 680.37 | 41.41 |
| LSSEAPALFQFDLK* | 787.42 | 44.11 |

K*, heavy lysine; R*, heavy arginine

10. Appendix 4.3: Retention time correlations. Correlation between (A) the measured retention time (RT) of endogenous peptides (n=9) and their RT from SRM Atlas, and (B) the measured RT of Pierce peptides (n=14) and their hydrophobicity indexes calculated with SSRCalc 3.0 in Skyline. Shaded regions indicate 95% confidence bounds.
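Correlations of this kind are typically turned into retention time predictions by fitting a straight line and applying it to peptides that have not yet been measured. The sketch below shows an ordinary least-squares fit in plain Python. The measured RTs are four Pierce peptides from Appendix 4.2, but the hydrophobicity indexes paired with them are hypothetical placeholders (not SSRCalc output), and the function names are illustrative.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float, float]:
    """Least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def predict_rt(index: float, slope: float, intercept: float) -> float:
    """Predicted retention time (min) for a given hydrophobicity index."""
    return slope * index + intercept


# Measured RTs (min) of four Pierce peptides, copied from Appendix 4.2.
measured_rt = [14.47, 23.00, 34.43, 44.11]
# HYPOTHETICAL hydrophobicity indexes, placeholders for illustration only.
hypothetical_index = [5.0, 15.0, 28.0, 40.0]

slope, intercept, r2 = fit_line(hypothetical_index, measured_rt)
print(round(predict_rt(20.0, slope, intercept), 1), round(r2, 3))
```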


11. Appendix 4.4: Endogenous peptides identified in CSF.

| Gene Name | Peptide Sequence | m/z | z | Predicted RT P1* (min) | Predicted RT P2* (min) | Predicted RT P3* (min) | Observed RT S1-P1^ (min) | Observed RT S1-P2^ (min) | Observed RT S2 (min) | Observed RT S3 (min) |
|---|---|---|---|---|---|---|---|---|---|---|
| APLP1 | DELAPAGTGVSR | 586.80 | 2 | 22.4 | 21.7 | 20.89 | 19.8 | 20.1 | 20.4 | 20.1 |
| BCAN | FNVYCFR | 503.23 | 2 | 33.3 | 31.9 | 34.91 | 33.7 | 34.4 | 34.6 | 33.7 |
| CADM2 | SDDGVAVICR | 546.26 | 2 | 24.5 | 23.3 | 23.90 | 22.2 | 22.5 | 23.1 | 23 |
| CBLN2 | VAFSATR | 376.21 | 2 | 20.3 | 17.7 | NA | 17.9 | 18.4 | 19.2 | 18.9 |
| CNDP1 | ALEQDLPVNIK | 620.35 | 2 | 34.5 | 32.4 | 33.07 | 32.5 | 32.6 | 33 | 32.3 |
| CNTN2 | VTVTPDGTLIIR | 642.88 | 2 | 35.1 | 34.9 | 34.28 | 33.5 | 33.7 | 34 | 33.7 |
| DNER | VTATGFQQCSLIDGR | 551.61 | 3 | 31.6 | 31.8 | 31.52 | 34 | 33.7 | 34.2 | 34 |
| KLK6 | LSELIQPLPLER | 704.41 | 2 | 41.0 | 42.6 | 40.04 | 39.3 | 39.8 | 40.2 | 39.7 |
| NCAN | TGFPSPAER | 481.24 | 2 | 22.6 | 22.4 | 23.33 | 23 | 22.4 | 22.9 | 22.3 |
| NPTXR | VAQLPLSLK | 484.81 | 2 | 32.8 | 32.4 | 34.51 | 34 | 33.6 | 34.2 | 33.7 |
| NRCAM | VFNTPEGVPSAPSSLK | 543.95 | 3 | 32.3 | 33.1 | 30.81 | 30.4 | 30.4 | 30.9 | 30.6 |
| OLFM1 | LTGISDPVTVK | 565.33 | 2 | 26.8 | 27.9 | 28.04 | 27.2 | 27.7 | 28.2 | 28.5 |
| OPCML | ITVNYPPYISK | 647.86 | 2 | 32.6 | 30.4 | 32.48 | 31.9 | 32.5 | 32.3 | 31.7 |
| PTPRZ1 | AIIDGVESVSR | 573.31 | 2 | NA | 27.9 | 27.76 | 26.4 | 27 | 27.5 | 27.2 |
| SCG2 | ALEYIENLR | 560.80 | 2 | 31.4 | 34.7 | 33.86 | 33 | 32.6 | 33.9 | 33 |
| SEZ6L | ETGTPIWTSR | 574.29 | 2 | 26.9 | 26.4 | 27.66 | 27.4 | 26.4 | 27.7 | 27.2 |
| SPP1 | AIPVAQDLNAPSDWDSR | 927.95 | 2 | 35.2 | 37.5 | 35.20 | 34.6 | x | 35.2 | 34.8 |
| VGF | FGEGVSSPK | 454.23 | 2 | 20.3 | 18.0 | 19.06 | 18.3 | 17.7 | 18.5 | 18.4 |
| LRRC4B | DLAEVPASIPVNTR | 741.40 | 2 | 34.8 | 32.8 | 32.61 | 32.2 | 31.9 | 32.3 | 32.2 |
| CADM3 | LLLHCEGR | 333.18 | 3 | 21.1 | 25.3 | 21.22 | 19.3 | x | 19.7 | 19.3 |
| SCG3 | LLNLGLITESQAHTLEDEVAEVLQK | 921.83 | 3 | 58.4 | 53.2 | NA | x | 54.4 | 54.5 | 54.5 |
| NPTX1 | FQLTFPLR | 511.30 | 2 | 42.9 | 40.3 | 42.85 | 42.2 | 41.9 | 42.8 | 42.4 |
| FRRS1L | HDIDSPPASER | 408.53 | 3 | NA | 15.8 | NA | 15.7 | 15.5 | 15.6 | 15.6 |
| SLITRK1 | LSNVQELFLR | 609.85 | 2 | 38.7 | 39.3 | 40.45 | 39.9 | 40 | 40.5 | 40.3 |
| SST | SANSNPAMAPR | 558.27 | 2 | 17.3 | 15.9 | NA | 15.4 | 15.4 | 15.7 | 15.4 |
| MOG | FSDEGGFTCFFR | 735.31 | 2 | 40.3 | 39.8 | 43.28 | 42.9 | 42.8 | 43.6 | 43 |
| NPY | ESTENVPR | 466.23 | 2 | 15.4 | 11.5 | NA | 13.3 | 13.9 | 14.2 | 13.8 |
| RTN4RL2 | LFLQNNLIR | 565.84 | 2 | 34.0 | 38.3 | 38.55 | x | 37.5 | 38.1 | 37.2 |
| SERPINI1 | ALGITEIFIK | 552.84 | 2 | 43.7 | 42.9 | 44.23 | 43.3 | 43.2 | 44.2 | 43.9 |
| BAI2 | LLAPAALAFR | 521.82 | 2 | 38.1 | 37.8 | 39.80 | 39.8 | 38.6 | 39.8 | 39.2 |

NA, not available; x, not found.

* P1: Prediction 1, based on RT (endogenous peptides, Quantiva) vs. RT (endogenous peptides, SRM Atlas); P2: Prediction 2, based on RT (Pierce peptides) vs. hydrophobicity index; P3: Prediction 3, based on RT (Q Exactive Plus) vs. RT (Quantiva).

^ S1-P1: Step 1, observed RT based on P1; S1-P2: Step 1, observed RT based on P2; S2: Step 2; S3: Step 3.
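One way to compare the three prediction approaches is the mean absolute difference between each predicted RT and the finally observed RT. The sketch below does this for four rows copied from Appendix 4.4, taking the S3 column as the reference (a choice made here for illustration) and skipping "NA" entries. The function name `mean_abs_error` is illustrative.

```python
from typing import Optional

# (gene, P1, P2, P3, S3) copied from Appendix 4.4; None marks "NA".
ROWS = [
    ("APLP1", 22.4, 21.7, 20.89, 20.1),
    ("BCAN", 33.3, 31.9, 34.91, 33.7),
    ("CBLN2", 20.3, 17.7, None, 18.9),
    ("KLK6", 41.0, 42.6, 40.04, 39.7),
]


def mean_abs_error(predicted: list[Optional[float]], observed: list[float]) -> float:
    """Mean |predicted - observed| over peptides that have a prediction."""
    diffs = [abs(p - o) for p, o in zip(predicted, observed) if p is not None]
    if not diffs:
        raise ValueError("no predictions available")
    return sum(diffs) / len(diffs)


observed = [row[4] for row in ROWS]
for name, col in (("P1", 1), ("P2", 2), ("P3", 3)):
    error = mean_abs_error([row[col] for row in ROWS], observed)
    print(name, round(error, 2))
```

Extending `ROWS` to all 30 peptides would give the comparison for the full assay.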


12. Appendix 4.5: Peptides and transitions of the developed SRM assay.

| Gene Name | Peptide Sequence | Peptide | Precursor m/z | Product m/z (one value per transition) |
|---|---|---|---|---|
| APLP1 | DELAPAGTGVSR | light | 586.80 | 815.44, 744.40, 358.16 |
| APLP1 | DELAPAGTGVSR | heavy | 591.80 | 825.45, 754.41, 358.16 |
| BCAN | FNVYCFR | light | 503.23 | 744.35, 482.22, 262.12 |
| BCAN | FNVYCFR | heavy | 508.24 | 754.36, 492.23, 262.12 |
| CADM2 | SDDGVAVICR | light | 546.26 | 618.34, 547.30, 448.23 |
| CADM2 | SDDGVAVICR | heavy | 551.27 | 628.35, 557.31, 458.24 |
| CBLN2 | VAFSATR | light | 376.21 | 652.34, 434.24, 347.20, 291.16 |
| CBLN2 | VAFSATR | heavy | 381.21 | 662.35, 444.24, 357.21, 296.16 |
| CNDP1 | ALEQDLPVNIK | light | 620.35 | 798.47, 683.45, 570.36 |
| CNDP1 | ALEQDLPVNIK | heavy | 624.36 | 806.49, 691.46, 578.38 |
| CNTN2 | VTVTPDGTLIIR | light | 642.88 | 884.52, 672.44, 442.76 |
| CNTN2 | VTVTPDGTLIIR | heavy | 647.88 | 894.53, 682.45, 447.77 |
| DNER | VTATGFQQCSLIDGR | light | 826.91 | 660.37, 460.25, 347.17, 232.14 |
| DNER | VTATGFQQCSLIDGR | heavy | 831.91 | 670.38, 470.26, 357.18, 242.15 |
| KLK6 | LSELIQPLPLER | light | 704.41 | 852.49, 724.44 |
| KLK6 | LSELIQPLPLER | heavy | 709.42 | 862.50, 734.44 |
| NCAN | TGFPSPAER | light | 481.24 | 656.34, 559.28, 306.14 |
| NCAN | TGFPSPAER | heavy | 486.24 | 666.34, 569.29, 306.14 |
| NPTXR | VAQLPLSLK | light | 484.81 | 557.37, 347.23 |
| NPTXR | VAQLPLSLK | heavy | 488.82 | 565.38, 355.24 |
| NRCAM | VFNTPEGVPSAPSSLK | light | 815.43 | 1168.62, 786.44, 531.31 |
| NRCAM | VFNTPEGVPSAPSSLK | heavy | 819.43 | 1176.64, 794.45, 539.33 |
| OLFM1 | LTGISDPVTVK | light | 565.33 | 1016.56, 915.51, 745.41 |
| OLFM1 | LTGISDPVTVK | heavy | 569.33 | 1024.58, 923.53, 753.42 |
| OPCML | ITVNYPPYISK | light | 647.86 | 981.50, 867.46, 704.40 |
| OPCML | ITVNYPPYISK | heavy | 651.86 | 989.52, 875.48, 712.41 |
| PTPRZ1 | AIIDGVESVSR | light | 573.31 | 848.41, 733.38, 577.29 |
| PTPRZ1 | AIIDGVESVSR | heavy | 578.32 | 858.42, 743.39, 587.30 |
| SCG2 | ALEYIENLR | light | 560.80 | 644.37, 531.29 |
| SCG2 | ALEYIENLR | heavy | 565.81 | 654.38, 541.30 |
| SEZ6L | ETGTPIWTSR | light | 574.29 | 917.48, 759.41, 549.28 |
| SEZ6L | ETGTPIWTSR | heavy | 579.29 | 927.49, 769.42, 559.29 |
| SPP1 | AIPVAQDLNAPSDWDSR | light | 927.95 | 933.41, 862.37, 262.15 |
| SPP1 | AIPVAQDLNAPSDWDSR | heavy | 932.96 | 943.41, 872.38, 272.16 |
| VGF | FGEGVSSPK | light | 454.23 | 760.38, 574.32 |
| VGF | FGEGVSSPK | heavy | 458.24 | 768.40, 582.33 |
| LRRC4B | DLAEVPASIPVNTR | light | 741.40 | 954.54, 786.45, 586.33 |
| LRRC4B | DLAEVPASIPVNTR | heavy | 746.41 | 964.54, 796.46, 596.34 |
| CADM3 | LLLHCEGR | light | 333.18 | 658.27, 521.21, 386.18 |
| CADM3 | LLLHCEGR | heavy | 336.52 | 668.28, 531.22, 391.19 |
| SCG3 | LLNLGLITESQAHTLEDEVAEVLQK | light | 921.83 | 1070.54, 227.18, 341.22 |
| SCG3 | LLNLGLITESQAHTLEDEVAEVLQK | heavy | 924.50 | 1074.55, 227.18, 341.22 |
| NPTX1 | FQLTFPLR | light | 511.30 | 746.46, 633.37, 276.13 |
| NPTX1 | FQLTFPLR | heavy | 516.30 | 756.46, 643.38, 276.13 |
| ECM1* | ELLALIQLER | light | 599.36 | 842.51, 771.47, 658.39 |
| ECM1* | ELLALIQLER | heavy | 604.37 | 852.52, 781.48, 668.40 |
| FRRS1L | HDIDSPPASER | light | 612.29 | 858.40, 743.37, 656.34, 253.09 |
| FRRS1L | HDIDSPPASER | heavy | 617.29 | 868.40, 753.38, 666.34, 253.09 |
| SLITRK1 | LSNVQELFLR | light | 609.85 | 805.46, 677.40, 548.36 |
| SLITRK1 | LSNVQELFLR | heavy | 614.85 | 815.46, 687.41, 558.36 |
| SST | SANSNPAMAPR | light | 558.27 | 756.38, 642.34, 474.25 |
| SST | SANSNPAMAPR | heavy | 563.27 | 766.39, 652.35, 484.26 |
| SST | SANSNPAM[+16]APR | light | 566.26 | 772.38, 658.33, 490.24, 343.21 |
| SST | SANSNPAM[+16]APR | heavy | 571.27 | 782.39, 668.34, 500.25, 353.22 |
| MOG | FSDEGGFTCFFR | light | 735.31 | 991.45, 934.42 |
| MOG | FSDEGGFTCFFR | heavy | 740.32 | 1001.45, 944.43 |
| NPY | ESTENVPR | light | 466.23 | 614.33, 485.28, 272.17 |
| NPY | ESTENVPR | heavy | 471.23 | 624.33, 495.29, 282.18 |
| RTN4RL2 | LFLQNNLIR | light | 565.84 | 870.52, 757.43, 515.33 |
| RTN4RL2 | LFLQNNLIR | heavy | 570.84 | 880.52, 767.44, 525.34 |
| SERPINI1 | ALGITEIFIK | light | 552.84 | 920.55, 750.44, 520.35, 407.27 |
| SERPINI1 | ALGITEIFIK | heavy | 556.84 | 928.56, 758.45, 528.36, 415.28 |
| BAI2 | LLAPAALAFR | light | 521.82 | 816.47, 745.44, 577.35 |
| BAI2 | LLAPAALAFR | heavy | 526.83 | 826.48, 755.44, 587.35 |

* Negative control

13. Appendix 4.6: Analytical characteristics of SRM assays for 30 proteins

| Gene Name | Peptide Sequence | R² | Linear Range Low | Linear Range High | Median CV (%) |
|---|---|---|---|---|---|
| APLP1 | DELAPAGTGVSR | 0.997 | 0.49 | 4000 | 1.3 |
| BCAN | FNVYCFR | 0.997 | 0.49 | 4000 | 1.2 |
| CADM2 | SDDGVAVICR | 0.997 | 1.95 | 4000 | 3.1 |
| CBLN2 | VAFSATR | 0.992 | 0.49 | 2000 | 8 |
| CNDP1 | ALEQDLPVNIK | 0.997 | 0.49 | 4000 | 1.1 |
| CNTN2 | VTVTPDGTLIIR | 0.998 | 0.49 | 4000 | 3.7 |
| DNER | VTATGFQQCSLIDGR | 0.999 | 1.95 | 4000 | 1.3 |
| KLK6 | LSELIQPLPLER | 0.996 | 0.49 | 4000 | 1.7 |
| NCAN | TGFPSPAER | 0.999 | 0.49 | 4000 | 3.1 |
| NPTXR | VAQLPLSLK | 0.989 | 0.49 | 4000 | 3.1 |
| NRCAM | VFNTPEGVPSAPSSLK | 0.995 | 0.49 | 4000 | 0.9 |
| OLFM1 | LTGISDPVTVK | 0.961 | 0.49 | 2000 | 9.6 |
| OPCML | ITVNYPPYISK | 0.995 | 0.49 | 4000 | 7.7 |
| PTPRZ1 | AIIDGVESVSR | 0.996 | 1.95 | 4000 | 0.7 |
| SCG2 | ALEYIENLR | 0.996 | 0.49 | 4000 | 1.6 |
| SEZ6L | ETGTPIWTSR | 0.999 | 0.49 | 4000 | 1.8 |
| SPP1 | AIPVAQDLNAPSDWDSR | 0.998 | 1.95 | 4000 | 1.5 |
| VGF | FGEGVSSPK | 0.994 | 0.49 | 4000 | 2 |
| LRRC4B | DLAEVPASIPVNTR | 0.997 | 0.49 | 4000 | 2.4 |
| CADM3 | LLLHCEGR | 0.998 | 0.49 | 4000 | 2.9 |
| SCG3 | LLNLGLITESQAHTLEDEVAEVLQK | 0.999 | 31.25 | 4000 | 3.4 |
| NPTX1 | FQLTFPLR | 0.996 | 0.49 | 4000 | 1 |
| FRRS1L | HDIDSPPASER | 0.987 | 1.95 | 4000 | 9.5 |
| SLITRK1 | LSNVQELFLR | 0.974 | 0.49 | 4000 | 3.3 |
| SST | SANSNPAMAPR | 0.999 | 0.49 | 4000 | 3.6 |
| MOG | FSDEGGFTCFFR | 0.998 | 1.95 | 4000 | 3.1 |
| NPY | ESTENVPR | 0.999 | 0.49 | 4000 | 4.2 |
| RTN4RL2 | LFLQNNLIR | 0.997 | 1.95 | 4000 | 1.9 |
| SERPINI1 | ALGITEIFIK | 0.996 | 0.49 | 4000 | 1.3 |
| BAI2 | LLAPAALAFR | 0.993 | 0.49 | 4000 | 2 |
| ECM1 | ELLALIQLER | 0.997 | 1.95 | 4000 | 1.8 |
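The limits of the linear range listed in Appendix 4.6 (0.49, 1.95, 31.25, 2000 and 4000; units are not stated in this appendix) are consistent with a two-fold serial dilution starting at 4000. That reading is an inference from the numbers, not a statement of the protocol. The sketch below generates such a series; the function name `dilution_series` is illustrative.

```python
def dilution_series(top: float, steps: int, factor: float = 2.0) -> list[float]:
    """Levels of a serial dilution, highest first (top, top/factor, ...)."""
    if top <= 0 or factor <= 1 or steps < 0:
        raise ValueError("need top > 0, factor > 1 and steps >= 0")
    return [top / factor ** i for i in range(steps + 1)]


# ASSUMED design: 13 two-fold dilution steps below the top level of 4000.
levels = dilution_series(4000.0, 13)
print([round(level, 2) for level in levels])
```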

14. Appendix 4.7: Reproducibility assay.

| Gene Name | Modified Peptide Sequence | CV (%) |
|---|---|---|
| APLP1 | DELAPAGTGVSR | 3.8 |
| BCAN | FNVYC[+57]FR | 3.5 |
| CADM2 | SDDGVAVIC[+57]R | 3.8 |
| CBLN2 | VAFSATR | 9.5 |
| CNDP1 | ALEQDLPVNIK | 4.4 |
| CNTN2 | VTVTPDGTLIIR | 5.3 |
| DNER | VTATGFQQC[+57]SLIDGR | 4.9 |
| KLK6 | LSELIQPLPLER | 2.6 |
| NCAN | TGFPSPAER | 3.1 |
| NPTXR | VAQLPLSLK | 8.8 |
| NRCAM | VFNTPEGVPSAPSSLK | 2.5 |
| OLFM1 | LTGISDPVTVK | 8.2 |
| OPCML | ITVNYPPYISK | 2.1 |
| PTPRZ1 | AIIDGVESVSR | 2.5 |
| SCG2 | ALEYIENLR | 2.3 |
| SEZ6L | ETGTPIWTSR | 2.0 |
| SPP1 | AIPVAQDLNAPSDWDSR | 2.4 |
| VGF | FGEGVSSPK | 2.7 |
| LRRC4B | DLAEVPASIPVNTR | 3.4 |
| CADM3 | LLLHC[+57]EGR | 2.9 |
| SCG3 | LLNLGLITESQAHTLEDEVAEVLQK | 9.0 |
| NPTX1 | FQLTFPLR | 5.1 |
| ECM1 | ELLALIQLER | 2.5 |
| FRRS1L | HDIDSPPASER | 17.0 |
| SLITRK1 | LSNVQELFLR | 6.9 |
| SST | SANSNPAMAPR | 5.0 |
| SST | SANSNPAM[+16]APR | 4.8 |
| MOG | FSDEGGFTC[+57]FFR | 2.3 |
| NPY | ESTENVPR | 3.1 |
| RTN4RL2 | LFLQNNLIR | 5.8 |
| SERPINI1 | ALGITEIFIK | 4.9 |
| BAI2 | LLAPAALAFR | 3.0 |
| SST | SANSNPAMAPR+SANSNPAM[+16]APR | 4.2 |
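The coefficient of variation reported for each peptide is the standard deviation of replicate measurements divided by their mean, expressed in percent. The sketch below shows the calculation. It assumes the sample (n-1) standard deviation, and the replicate light/heavy ratios are hypothetical values chosen for illustration.

```python
import statistics


def cv_percent(values: list[float]) -> float:
    """Coefficient of variation: sample SD divided by the mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two replicates")
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    return statistics.stdev(values) / mean * 100.0


# HYPOTHETICAL light/heavy ratios of one peptide across replicate runs.
ratios = [0.95, 1.00, 1.05]
print(round(cv_percent(ratios), 1))
```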

15. Appendix 4.8: Carry-over effect.

Gene Name  Peptide Sequence           Peptide  Experiment 1 (%)  Experiment 2 (%)  Experiment 3 (%)  Average (%)
APLP1      DELAPAGTGVSR               light    0.072             0.041             0.025             0.046
APLP1      DELAPAGTGVSR               heavy    0.074             0.065             0.011             0.050
BCAN       FNVYCFR                    light    0.001             0.089             0.127             0.072
BCAN       FNVYCFR                    heavy    0.140             0.058             0.093             0.097
CADM2      SDDGVAVICR                 light    -0.181            0.380             0.101             0.100
CADM2      SDDGVAVICR                 heavy    0.078             0.069             0.096             0.081
CBLN2      VAFSATR                    light    0.017             -0.327            0.092             -0.073
CBLN2      VAFSATR                    heavy    0.019             0.072             0.114             0.068
CNDP1      ALEQDLPVNIK                light    0.076             0.068             0.086             0.077
CNDP1      ALEQDLPVNIK                heavy    0.047             0.087             0.034             0.056
CNTN2      VTVTPDGTLIIR               light    0.124             0.404             0.165             0.231
CNTN2      VTVTPDGTLIIR               heavy    0.179             0.342             0.425             0.316
DNER       VTATGFQQCSLIDGR            light    0.435             1.845             1.589             1.290
DNER       VTATGFQQCSLIDGR            heavy    0.645             1.659             1.554             1.286
KLK6       LSELIQPLPLER               light    0.064             0.051             0.076             0.064
KLK6       LSELIQPLPLER               heavy    0.063             0.077             0.062             0.068
NCAN       TGFPSPAER                  light    0.086             0.029             0.023             0.046
NCAN       TGFPSPAER                  heavy    0.025             0.064             0.033             0.041
NPTXR      VAQLPLSLK                  light    0.509             0.932             0.756             0.732
NPTXR      VAQLPLSLK                  heavy    -0.018            0.099             -0.029            0.017
NRCAM      VFNTPEGVPSAPSSLK           light    0.244             0.022             0.096             0.121
NRCAM      VFNTPEGVPSAPSSLK           heavy    0.046             0.056             0.045             0.049
OLFM1      LTGISDPVTVK                light    0.000             0.000             -0.081            -0.027
OLFM1      LTGISDPVTVK                heavy    0.022             0.068             0.045             0.045
OPCML      ITVNYPPYISK                light    0.020             0.044             0.032             0.032
OPCML      ITVNYPPYISK                heavy    0.023             0.064             0.047             0.045
PTPRZ1     AIIDGVESVSR                light    0.440             0.882             1.236             0.853
PTPRZ1     AIIDGVESVSR                heavy    0.373             0.821             1.210             0.801
SCG2       ALEYIENLR                  light    0.039             0.141             0.115             0.098
SCG2       ALEYIENLR                  heavy    0.094             0.221             0.112             0.142
SEZ6L      ETGTPIWTSR                 light    0.003             0.026             -0.007            0.007
SEZ6L      ETGTPIWTSR                 heavy    0.009             -0.004            -0.119            -0.038
SPP1       AIPVAQDLNAPSDWDSR          light    0.070             0.089             0.065             0.074
SPP1       AIPVAQDLNAPSDWDSR          heavy    0.076             0.068             0.089             0.078
VGF        FGEGVSSPK                  light    -0.002            0.008             0.006             0.004
VGF        FGEGVSSPK                  heavy    0.078             0.070             0.105             0.084
LRRC4B     DLAEVPASIPVNTR             light    0.012             0.049             0.029             0.030
LRRC4B     DLAEVPASIPVNTR             heavy    0.074             0.125             0.103             0.100
CADM3      LLLHCEGR                   light    0.038             0.060             0.001             0.033
CADM3      LLLHCEGR                   heavy    0.079             0.026             0.039             0.048
SCG3       LLNLGLITESQAHTLEDEVAEVLQK  light    0.373             0.419             0.657             0.483
SCG3       LLNLGLITESQAHTLEDEVAEVLQK  heavy    0.242             0.307             0.821             0.457
NPTX1      FQLTFPLR                   light    0.066             -0.004            0.095             0.052
NPTX1      FQLTFPLR                   heavy    0.101             0.027             0.087             0.072
ECM1       ELLALIQLER                 light    0.943             2.407             2.335             1.895
ECM1       ELLALIQLER                 heavy    1.011             2.285             2.198             1.831
FRRS1L     HDIDSPPASER                light    3.979             -0.606            -2.220            0.384
FRRS1L     HDIDSPPASER                heavy    0.001             0.084             0.068             0.051
SLITRK1    LSNVQELFLR                 light    -0.089            0.635             1.072             0.539
SLITRK1    LSNVQELFLR                 heavy    0.104             0.439             0.422             0.322
SST        SANSNPAMAPR                light    0.000             -0.007            -0.005            -0.004
SST        SANSNPAMAPR                heavy    -0.019            0.038             0.068             0.029
SST        SANSNPAM[+16]APR           light    -0.162            0.967             -0.494            0.104
SST        SANSNPAM[+16]APR           heavy    0.105             0.163             0.009             0.092
MOG        FSDEGGFTCFFR               light    0.115             0.670             0.564             0.450
MOG        FSDEGGFTCFFR               heavy    0.278             0.561             0.599             0.479
NPY        ESTENVPR                   light    0.103             0.111             -0.001            0.071
NPY        ESTENVPR                   heavy    0.042             0.143             0.010             0.065
RTN4RL2    LFLQNNLIR                  light    0.100             0.323             0.352             0.258
RTN4RL2    LFLQNNLIR                  heavy    0.346             0.358             0.029             0.244
SERPINI1   ALGITEIFIK                 light    0.248             0.517             0.575             0.447
SERPINI1   ALGITEIFIK                 heavy    0.389             0.814             0.789             0.664
BAI2       LLAPAALAFR                 light    0.013             0.021             0.045             0.027
BAI2       LLAPAALAFR                 heavy    0.055             0.115             0.088             0.086

16. Appendix 5.1: Multiplex SRM assays for clinical samples analysis.

Gene Name       Peptide Sequence           Peptide  Precursor m/z  Product m/z  Transition Ion Type  Quantifier
APLP1           DELAPAGTGVSR               light    586.80         815.44       y9                   ●
APLP1           DELAPAGTGVSR               light    586.80         744.40       y8                   ●
APLP1           DELAPAGTGVSR               light    586.80         358.16       b3                   ●
APLP1           DELAPAGTGVSR               heavy    591.80         825.45       y9                   ●
APLP1           DELAPAGTGVSR               heavy    591.80         754.41       y8                   ●
APLP1           DELAPAGTGVSR               heavy    591.80         358.16       b3                   ●
BCAN            FNVYCFR                    light    503.23         744.35       y5                   ●
BCAN            FNVYCFR                    light    503.23         482.22       y3                   ●
BCAN            FNVYCFR                    light    503.23         262.12       b2                   ●
BCAN            FNVYCFR                    heavy    508.24         754.36       y5                   ●
BCAN            FNVYCFR                    heavy    508.24         492.23       y3                   ●
BCAN            FNVYCFR                    heavy    508.24         262.12       b2                   ●
CADM2           SDDGVAVICR                 light    546.26         618.34       y5                   ●
CADM2           SDDGVAVICR                 light    546.26         547.30       y4                   ●
CADM2           SDDGVAVICR                 light    546.26         448.23       y3                   ●
CADM2           SDDGVAVICR                 heavy    551.27         628.35       y5                   ●
CADM2           SDDGVAVICR                 heavy    551.27         557.31       y4                   ●
CADM2           SDDGVAVICR                 heavy    551.27         458.24       y3                   ●
CBLN2           VAFSATR                    light    376.21         652.34       y6
CBLN2           VAFSATR                    light    376.21         434.24       y4                   ●
CBLN2           VAFSATR                    light    376.21         347.20       y3                   ●
CBLN2           VAFSATR                    light    376.21         291.16       y5                   ●
CBLN2           VAFSATR                    heavy    381.21         662.35       y6
CBLN2           VAFSATR                    heavy    381.21         444.24       y4                   ●
CBLN2           VAFSATR                    heavy    381.21         357.21       y3                   ●
CBLN2           VAFSATR                    heavy    381.21         296.16       y5                   ●
CNDP1           ALEQDLPVNIK                light    620.35         798.47       y7                   ●
CNDP1           ALEQDLPVNIK                light    620.35         683.45       y6                   ●
CNDP1           ALEQDLPVNIK                light    620.35         570.36       y5                   ●
CNDP1           ALEQDLPVNIK                heavy    624.36         806.49       y7                   ●
CNDP1           ALEQDLPVNIK                heavy    624.36         691.46       y6                   ●
CNDP1           ALEQDLPVNIK                heavy    624.36         578.38       y5                   ●
CNTN2           VTVTPDGTLIIR               light    642.88         884.52       y8                   ●
CNTN2           VTVTPDGTLIIR               light    642.88         672.44       y6
CNTN2           VTVTPDGTLIIR               light    642.88         442.76       y8                   ●
CNTN2           VTVTPDGTLIIR               heavy    647.88         894.53       y8                   ●
CNTN2           VTVTPDGTLIIR               heavy    647.88         682.45       y6
CNTN2           VTVTPDGTLIIR               heavy    647.88         447.77       y8                   ●
DNER            VTATGFQQCSLIDGR            light    826.91         660.37       y6                   ●
DNER            VTATGFQQCSLIDGR            light    826.91         460.25       y4                   ●
DNER            VTATGFQQCSLIDGR            light    826.91         347.17       y3                   ●
DNER            VTATGFQQCSLIDGR            light    826.91         232.14       y2                   ●
DNER            VTATGFQQCSLIDGR            heavy    831.91         670.38       y6                   ●
DNER            VTATGFQQCSLIDGR            heavy    831.91         470.26       y4                   ●
DNER            VTATGFQQCSLIDGR            heavy    831.91         357.18       y3                   ●
DNER            VTATGFQQCSLIDGR            heavy    831.91         242.15       y2                   ●
KLK6            LSELIQPLPLER               light    704.41         852.49       y7                   ●
KLK6            LSELIQPLPLER               light    704.41         724.44       y6                   ●
KLK6            LSELIQPLPLER               heavy    709.42         862.50       y7                   ●
KLK6            LSELIQPLPLER               heavy    709.42         734.44       y6                   ●
NCAN            TGFPSPAER                  light    481.24         656.34       y6                   ●
NCAN            TGFPSPAER                  light    481.24         559.28       y5
NCAN            TGFPSPAER                  light    481.24         306.14       b3                   ●
NCAN            TGFPSPAER                  heavy    486.24         666.34       y6                   ●
NCAN            TGFPSPAER                  heavy    486.24         569.29       y5
NCAN            TGFPSPAER                  heavy    486.24         306.14       b3                   ●
NPTXR           VAQLPLSLK                  light    484.81         557.37       y5                   ●
NPTXR           VAQLPLSLK                  light    484.81         347.23       y3                   ●
NPTXR           VAQLPLSLK                  light    484.81         299.17       b3
NPTXR           VAQLPLSLK                  heavy    488.82         565.38       y5                   ●
NPTXR           VAQLPLSLK                  heavy    488.82         355.24       y3                   ●
NPTXR           VAQLPLSLK                  heavy    488.82         299.17       b3
NRCAM           VFNTPEGVPSAPSSLK           light    815.43         1168.62      y12                  ●
NRCAM           VFNTPEGVPSAPSSLK           light    815.43         786.44       y8                   ●
NRCAM           VFNTPEGVPSAPSSLK           light    815.43         531.31       y5                   ●
NRCAM           VFNTPEGVPSAPSSLK           heavy    819.43         1176.64      y12                  ●
NRCAM           VFNTPEGVPSAPSSLK           heavy    819.43         794.45       y8                   ●
NRCAM           VFNTPEGVPSAPSSLK           heavy    819.43         539.33       y5                   ●
OLFM1           LTGISDPVTVK                light    565.33         1016.56      y10                  ●
OLFM1           LTGISDPVTVK                light    565.33         915.51       y9                   ●
OLFM1           LTGISDPVTVK                light    565.33         745.41       y7
OLFM1           LTGISDPVTVK                heavy    569.33         1024.58      y10                  ●
OLFM1           LTGISDPVTVK                heavy    569.33         923.53       y9                   ●
OLFM1           LTGISDPVTVK                heavy    569.33         753.42       y7
OPCML           ITVNYPPYISK                light    647.86         981.50       y8                   ●
OPCML           ITVNYPPYISK                light    647.86         867.46       y7                   ●
OPCML           ITVNYPPYISK                light    647.86         704.40       y6                   ●
OPCML           ITVNYPPYISK                heavy    651.86         989.52       y8                   ●
OPCML           ITVNYPPYISK                heavy    651.86         875.48       y7                   ●
OPCML           ITVNYPPYISK                heavy    651.86         712.41       y6                   ●
PTPRZ1          AIIDGVESVSR                light    573.31         848.41       y8                   ●
PTPRZ1          AIIDGVESVSR                light    573.31         733.38       y7                   ●
PTPRZ1          AIIDGVESVSR                light    573.31         577.29       y5                   ●
PTPRZ1          AIIDGVESVSR                heavy    578.32         858.42       y8                   ●
PTPRZ1          AIIDGVESVSR                heavy    578.32         743.39       y7                   ●
PTPRZ1          AIIDGVESVSR                heavy    578.32         587.30       y5                   ●
SCG2            ALEYIENLR                  light    560.80         644.37       y5                   ●
SCG2            ALEYIENLR                  light    560.80         531.29       y4                   ●
SCG2            ALEYIENLR                  heavy    565.81         654.38       y5                   ●
SCG2            ALEYIENLR                  heavy    565.81         541.30       y4                   ●
SEZ6L           ETGTPIWTSR                 light    574.29         917.48       y8                   ●
SEZ6L           ETGTPIWTSR                 light    574.29         759.41       y6                   ●
SEZ6L           ETGTPIWTSR                 light    574.29         549.28       y4                   ●
SEZ6L           ETGTPIWTSR                 heavy    579.29         927.49       y8                   ●
SEZ6L           ETGTPIWTSR                 heavy    579.29         769.42       y6                   ●
SEZ6L           ETGTPIWTSR                 heavy    579.29         559.29       y4                   ●
SPP1            AIPVAQDLNAPSDWDSR          light    927.95         933.41       y8                   ●
SPP1            AIPVAQDLNAPSDWDSR          light    927.95         862.37       y7                   ●
SPP1            AIPVAQDLNAPSDWDSR          light    927.95         262.15       y2                   ●
SPP1            AIPVAQDLNAPSDWDSR          heavy    932.96         943.41       y8                   ●
SPP1            AIPVAQDLNAPSDWDSR          heavy    932.96         872.38       y7                   ●
SPP1            AIPVAQDLNAPSDWDSR          heavy    932.96         272.16       y2                   ●
VGF             FGEGVSSPK                  light    454.23         760.38       y8                   ●
VGF             FGEGVSSPK                  light    454.23         574.32       y6                   ●
VGF             FGEGVSSPK                  heavy    458.24         768.40       y8                   ●
VGF             FGEGVSSPK                  heavy    458.24         582.33       y6                   ●
LRRC4B          DLAEVPASIPVNTR             light    741.40         954.54       y9                   ●
LRRC4B          DLAEVPASIPVNTR             light    741.40         786.45       y7                   ●
LRRC4B          DLAEVPASIPVNTR             light    741.40         586.33       y5                   ●
LRRC4B          DLAEVPASIPVNTR             heavy    746.41         964.54       y9                   ●
LRRC4B          DLAEVPASIPVNTR             heavy    746.41         796.46       y7                   ●
LRRC4B          DLAEVPASIPVNTR             heavy    746.41         596.34       y5                   ●
CADM3           LLLHCEGR                   light    333.18         658.27       y5                   ●
CADM3           LLLHCEGR                   light    333.18         521.21       y4                   ●
CADM3           LLLHCEGR                   light    333.18         386.18       y6                   ●
CADM3           LLLHCEGR                   heavy    336.52         668.28       y5                   ●
CADM3           LLLHCEGR                   heavy    336.52         531.22       y4                   ●
CADM3           LLLHCEGR                   heavy    336.52         391.19       y6                   ●
SCG3            LLNLGLITESQAHTLEDEVAEVLQK  light    921.83         1070.54      y19                  ●
SCG3            LLNLGLITESQAHTLEDEVAEVLQK  light    921.83         227.18       b2                   ●
SCG3            LLNLGLITESQAHTLEDEVAEVLQK  light    921.83         341.22       b3                   ●
SCG3            LLNLGLITESQAHTLEDEVAEVLQK  heavy    924.50         1074.55      y19                  ●
SCG3            LLNLGLITESQAHTLEDEVAEVLQK  heavy    924.50         227.18       b2                   ●
SCG3            LLNLGLITESQAHTLEDEVAEVLQK  heavy    924.50         341.22       b3                   ●
NPTX1           FQLTFPLR                   light    511.30         746.46       y6                   ●
NPTX1           FQLTFPLR                   light    511.30         633.37       y5                   ●
NPTX1           FQLTFPLR                   light    511.30         276.13       b2                   ●
NPTX1           FQLTFPLR                   heavy    516.30         756.46       y6                   ●
NPTX1           FQLTFPLR                   heavy    516.30         643.38       y5                   ●
NPTX1           FQLTFPLR                   heavy    516.30         276.13       b2                   ●
ECM1            ELLALIQLER                 light    599.36         842.51       y7                   ●
ECM1            ELLALIQLER                 light    599.36         771.47       y6
ECM1            ELLALIQLER                 light    599.36         658.39       y5                   ●
ECM1            ELLALIQLER                 heavy    604.37         852.52       y7                   ●
ECM1            ELLALIQLER                 heavy    604.37         781.48       y6
ECM1            ELLALIQLER                 heavy    604.37         668.40       y5                   ●
FRRS1L          HDIDSPPASER                light    612.29         858.40       y8                   ●
FRRS1L          HDIDSPPASER                light    612.29         743.37       y7                   ●
FRRS1L          HDIDSPPASER                light    612.29         656.34       y6                   ●
FRRS1L          HDIDSPPASER                light    612.29         253.09       b2                   ●
FRRS1L          HDIDSPPASER                heavy    617.29         868.40       y8                   ●
FRRS1L          HDIDSPPASER                heavy    617.29         753.38       y7                   ●
FRRS1L          HDIDSPPASER                heavy    617.29         666.34       y6                   ●
FRRS1L          HDIDSPPASER                heavy    617.29         253.09       b2                   ●
SLITRK1         LSNVQELFLR                 light    609.85         805.46       y6                   ●
SLITRK1         LSNVQELFLR                 light    609.85         677.40       y5                   ●
SLITRK1         LSNVQELFLR                 light    609.85         548.36       y4
SLITRK1         LSNVQELFLR                 heavy    614.85         815.46       y6                   ●
SLITRK1         LSNVQELFLR                 heavy    614.85         687.41       y5                   ●
SLITRK1         LSNVQELFLR                 heavy    614.85         558.36       y4
SST             SANSNPAMAPR                light    558.27         756.38       y7                   ●
SST             SANSNPAMAPR                light    558.27         642.34       y6                   ●
SST             SANSNPAMAPR                light    558.27         474.25       y4
SST             SANSNPAMAPR                heavy    563.27         766.39       y7                   ●
SST             SANSNPAMAPR                heavy    563.27         652.35       y6                   ●
SST             SANSNPAMAPR                heavy    563.27         484.26       y4                   ●
SST             SANSNPAM[+16]APR           light    566.26         772.38       y7                   ●
SST             SANSNPAM[+16]APR           light    566.26         658.33       y6                   ●
SST             SANSNPAM[+16]APR           light    566.26         490.24       y4                   ●
SST             SANSNPAM[+16]APR           light    566.26         343.21       y3                   ●
SST             SANSNPAM[+16]APR           heavy    571.27         782.39       y7                   ●
SST             SANSNPAM[+16]APR           heavy    571.27         668.34       y6                   ●
SST             SANSNPAM[+16]APR           heavy    571.27         500.25       y4                   ●
SST             SANSNPAM[+16]APR           heavy    571.27         353.22       y3                   ●
MOG             FSDEGGFTCFFR               light    735.31         991.45       y8                   ●
MOG             FSDEGGFTCFFR               light    735.31         934.42       y7                   ●
MOG             FSDEGGFTCFFR               heavy    740.32         1001.45      y8                   ●
MOG             FSDEGGFTCFFR               heavy    740.32         944.43       y7                   ●
NPY             ESTENVPR                   light    466.23         614.33       y5                   ●
NPY             ESTENVPR                   light    466.23         485.28       y4                   ●
NPY             ESTENVPR                   light    466.23         272.17       y2                   ●
NPY             ESTENVPR                   heavy    471.23         624.33       y5                   ●
NPY             ESTENVPR                   heavy    471.23         495.29       y4                   ●
NPY             ESTENVPR                   heavy    471.23         282.18       y2                   ●
RTN4RL2         LFLQNNLIR                  light    565.84         870.52       y7                   ●
RTN4RL2         LFLQNNLIR                  light    565.84         757.43       y6                   ●
RTN4RL2         LFLQNNLIR                  light    565.84         515.33       y4
RTN4RL2         LFLQNNLIR                  heavy    570.84         880.52       y7                   ●
RTN4RL2         LFLQNNLIR                  heavy    570.84         767.44       y6                   ●
RTN4RL2         LFLQNNLIR                  heavy    570.84         525.34       y4
SERPINI1        ALGITEIFIK                 light    552.84         920.55       y8                   ●
SERPINI1        ALGITEIFIK                 light    552.84         750.44       y6                   ●
SERPINI1        ALGITEIFIK                 light    552.84         520.35       y4
SERPINI1        ALGITEIFIK                 light    552.84         407.27       y3
SERPINI1        ALGITEIFIK                 heavy    556.84         928.56       y8                   ●
SERPINI1        ALGITEIFIK                 heavy    556.84         758.45       y6                   ●
SERPINI1        ALGITEIFIK                 heavy    556.84         528.36       y4
SERPINI1        ALGITEIFIK                 heavy    556.84         415.28       y3
BAI2            LLAPAALAFR                 light    521.82         816.47       y8                   ●
BAI2            LLAPAALAFR                 light    521.82         745.44       y7                   ●
BAI2            LLAPAALAFR                 light    521.82         577.35       y5                   ●
BAI2            LLAPAALAFR                 heavy    526.83         826.48       y8                   ●
BAI2            LLAPAALAFR                 heavy    526.83         755.44       y7                   ●
BAI2            LLAPAALAFR                 heavy    526.83         587.35       y5                   ●
MBP             GVDAQGTLSK                 light    488.26         819.42       y8                   ●
MBP             GVDAQGTLSK                 light    488.26         633.36       y6                   ●
MBP             GVDAQGTLSK                 light    488.26         505.30       y5                   ●
MBP             GVDAQGTLSK                 heavy    492.27         827.43       y8                   ●
MBP             GVDAQGTLSK                 heavy    492.27         641.37       y6                   ●
MBP             GVDAQGTLSK                 heavy    492.27         513.31       y5                   ●
APOB            GFEPTLEALFGK               light    654.84         975.55       y9
APOB            GFEPTLEALFGK               light    654.84         664.37       y6
APOB            GFEPTLEALFGK               light    654.84         535.32       y5
APOE-peptide A  LGADMEDVC[+57.0]GR         light    611.76         735.31       y6
APOE-peptide A  LGADMEDVC[+57.0]GR         light    611.76         606.27       y5
APOE-peptide A  LGADMEDVC[+57.0]GR         light    611.76         392.17       y3
APOE-peptide A  LGADMEDVC[+57.0]GR         heavy    616.77         745.32       y6
APOE-peptide A  LGADMEDVC[+57.0]GR         heavy    616.77         616.27       y5
APOE-peptide A  LGADMEDVC[+57.0]GR         heavy    616.77         402.18       y3
APOE-peptide A  LGADM[+16.0]EDVC[+57.0]GR  light    619.76         735.31       y6
APOE-peptide A  LGADM[+16.0]EDVC[+57.0]GR  light    619.76         606.27       y5
APOE-peptide A  LGADM[+16.0]EDVC[+57.0]GR  light    619.76         392.17       y3
APOE-peptide A  LGADM[+16.0]EDVC[+57.0]GR  heavy    624.76         745.32       y6
APOE-peptide A  LGADM[+16.0]EDVC[+57.0]GR  heavy    624.76         616.27       y5
APOE-peptide A  LGADM[+16.0]EDVC[+57.0]GR  heavy    624.76         402.18       y3
APOE-peptide B  LGADMEDVR                  light    503.24         892.38       y8
APOE-peptide B  LGADMEDVR                  light    503.24         835.36       y7
APOE-peptide B  LGADMEDVR                  light    503.24         764.32       y6
APOE-peptide B  LGADMEDVR                  heavy    508.24         902.39       y8
APOE-peptide B  LGADMEDVR                  heavy    508.24         845.37       y7
APOE-peptide B  LGADMEDVR                  heavy    508.24         774.33       y6
APOE-peptide B  LGADM[+16.0]EDVR           light    511.23         908.38       y8
APOE-peptide B  LGADM[+16.0]EDVR           light    511.23         851.36       y7
APOE-peptide B  LGADM[+16.0]EDVR           light    511.23         780.32       y6
APOE-peptide B  LGADM[+16.0]EDVR           heavy    516.24         918.39       y8
APOE-peptide B  LGADM[+16.0]EDVR           heavy    516.24         861.36       y7
APOE-peptide B  LGADM[+16.0]EDVR           heavy    516.24         790.33       y6
APOE-peptide C  LAVYQAGAR                  light    474.77         764.40       y7
APOE-peptide C  LAVYQAGAR                  light    474.77         665.34       y6
APOE-peptide C  LAVYQAGAR                  light    474.77         502.27       y5
APOE-peptide C  LAVYQAGAR                  heavy    479.77         774.41       y7
APOE-peptide C  LAVYQAGAR                  heavy    479.77         675.34       y6
APOE-peptide C  LAVYQAGAR                  heavy    479.77         512.28       y5
APOE-peptide D  C[+57.0]LAVYQAGAR          light    554.78         835.44       y8
APOE-peptide D  C[+57.0]LAVYQAGAR          light    554.78         764.40       y7
APOE-peptide D  C[+57.0]LAVYQAGAR          light    554.78         665.34       y6
APOE-peptide D  C[+57.0]LAVYQAGAR          heavy    559.79         845.45       y8
APOE-peptide D  C[+57.0]LAVYQAGAR          heavy    559.79         774.41       y7
APOE-peptide D  C[+57.0]LAVYQAGAR          heavy    559.79         675.34       y6
APOE-peptide D  C[+40.0]LAVYQAGAR          light    546.27         665.34       y8
APOE-peptide D  C[+40.0]LAVYQAGAR          light    546.27         502.27       y7
APOE-peptide D  C[+40.0]LAVYQAGAR          light    546.27         374.21       y6
APOE-peptide D  C[+40.0]LAVYQAGAR          heavy    551.27         675.34       y8
APOE-peptide D  C[+40.0]LAVYQAGAR          heavy    551.27         512.28       y7
APOE-peptide D  C[+40.0]LAVYQAGAR          heavy    551.27         384.22       y6
APOE-total      LGPLVEQGR                  light    484.78         701.39       y6
APOE-total      LGPLVEQGR                  light    484.78         588.31       y5
APOE-total      LGPLVEQGR                  light    484.78         489.24       y4
APOE-total      LGPLVEQGR                  heavy    489.78         711.40       y6
APOE-total      LGPLVEQGR                  heavy    489.78         598.32       y5
APOE-total      LGPLVEQGR                  heavy    489.78         499.25       y4
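Each heavy precursor m/z in this table is offset from its light counterpart by an amount that depends on the C-terminal residue and the precursor charge. The sketch below reproduces that offset. It assumes the heavy standards carry a 13C6,15N4-labelled C-terminal arginine (+10.00827 Da) or a 13C6,15N2-labelled C-terminal lysine (+8.01420 Da); this labelling scheme, the charge states, and the function name are assumptions, since this appendix does not state them.

```python
# Monoisotopic mass shifts of common heavy C-terminal residues (Da).
# Assumed labels: 13C6,15N4-Arg and 13C6,15N2-Lys.
HEAVY_SHIFT_DA = {"R": 10.00827, "K": 8.01420}


def heavy_precursor_mz(light_mz, sequence, charge):
    """Expected heavy precursor m/z for a tryptic peptide ending in K or R."""
    c_term = sequence[-1]
    if c_term not in HEAVY_SHIFT_DA:
        raise ValueError(f"no heavy label defined for C-terminal residue {c_term!r}")
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # The mass shift is shared across all charges, so m/z moves by shift / z.
    return light_mz + HEAVY_SHIFT_DA[c_term] / charge


# Light precursor values from the table; charge states are assumed.
print(f"{heavy_precursor_mz(586.80, 'DELAPAGTGVSR', 2):.2f}")  # 591.80 (APLP1)
print(f"{heavy_precursor_mz(620.35, 'ALEQDLPVNIK', 2):.2f}")   # 624.36 (CNDP1)
print(f"{heavy_precursor_mz(333.18, 'LLLHCEGR', 3):.2f}")      # 336.52 (CADM3)
```

Under these assumptions the computed values agree with the tabulated heavy precursors to within rounding, including the triply charged CADM3 and SCG3 peptides, whose offsets are one third of the label mass.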

17. Appendix 5.2: Statistical analysis (Cohort 1).

Multivariate linear regression

Gene Name  p-value (unadjusted)  p-value (adjusted by Holm)
APLP1      0.0004                0.0122*
SPP1       0.0046                0.1378
CNTN2      0.0059                0.1703
NCAN       0.0217                0.6086
DNER       0.0271                0.6837
NPTX1      0.0253                0.6837
SEZ6L      0.0260                0.6837
NRCAM      0.0335                0.8047
NPY        0.0380                0.8737
LRRC4B     0.0421                0.9267
PTPRZ1     0.0440                0.9267
BAI2       0.0574                1.0000
BCAN       0.0576                1.0000
CADM2      0.2237                1.0000
CADM3      0.0769                1.0000
CBLN2      0.0675                1.0000
CNDP1      0.1357                1.0000
ECM1       0.5853                1.0000
FRRS1L     0.0882                1.0000
KLK6       0.1043                1.0000
MOG        0.0565                1.0000
NPTXR      0.0698                1.0000
OLFM1      0.2911                1.0000
OPCML      0.0921                1.0000
RTN4RL2    0.0524                1.0000
SCG2       0.1778                1.0000
SCG3       0.1069                1.0000
SERPINI1   0.0544                1.0000
SLITRK1    0.0690                1.0000
SST        0.2100                1.0000
VGF        0.2103                1.0000

* statistical significance at p<0.05, adjusted by Holm
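The Holm-adjusted column can be related to the unadjusted one with a short calculation. The sketch below assumes the standard Holm step-down procedure applied across all 31 proteins; the function name is hypothetical, and the software actually used for the adjustment is not stated in this appendix.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Small illustrative input (not thesis data).
print(holm_adjust([0.01, 0.04, 0.03]))
```

With 31 tests, the smallest unadjusted p-value is multiplied by 31 and the second smallest by 30. For SPP1 this gives 0.0046 × 30 ≈ 0.138, in line with the tabulated 0.1378; small differences elsewhere are expected because the unadjusted values are shown rounded to four decimals.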

18. Appendix 5.3: Total assay reproducibility (Cohort 1).

Gene Name  Peptide Sequence           Modified Peptide Sequence  CV (%)
APLP1      DELAPAGTGVSR               DELAPAGTGVSR               4.2
BCAN       FNVYCFR                    FNVYC[+57]FR               3.6
CADM2      SDDGVAVICR                 SDDGVAVIC[+57]R            4.8
CBLN2      VAFSATR                    VAFSATR                    12.0
CNDP1      ALEQDLPVNIK                ALEQDLPVNIK                3.8
CNTN2      VTVTPDGTLIIR               VTVTPDGTLIIR               5.0
DNER       VTATGFQQCSLIDGR            VTATGFQQC[+57]SLIDGR       5.2
KLK6       LSELIQPLPLER               LSELIQPLPLER               3.1
NCAN       TGFPSPAER                  TGFPSPAER                  4.5
NPTXR      VAQLPLSLK                  VAQLPLSLK                  10.8
NRCAM      VFNTPEGVPSAPSSLK           VFNTPEGVPSAPSSLK           3.7
OLFM1      LTGISDPVTVK                LTGISDPVTVK                13.2
OPCML      ITVNYPPYISK                ITVNYPPYISK                4.0
PTPRZ1     AIIDGVESVSR                AIIDGVESVSR                2.7
SCG2       ALEYIENLR                  ALEYIENLR                  4.9
SEZ6L      ETGTPIWTSR                 ETGTPIWTSR                 3.5
SPP1       AIPVAQDLNAPSDWDSR          AIPVAQDLNAPSDWDSR          4.1
VGF        FGEGVSSPK                  FGEGVSSPK                  4.6
LRRC4B     DLAEVPASIPVNTR             DLAEVPASIPVNTR             3.1
CADM3      LLLHCEGR                   LLLHC[+57]EGR              5.1
SCG3       LLNLGLITESQAHTLEDEVAEVLQK  LLNLGLITESQAHTLEDEVAEVLQK  7.0
NPTX1      FQLTFPLR                   FQLTFPLR                   4.5
ECM1       ELLALIQLER                 ELLALIQLER                 2.5
FRRS1L     HDIDSPPASER                HDIDSPPASER                15.7
SLITRK1    LSNVQELFLR                 LSNVQELFLR                 6.9
SST        SANSNPAMAPR                SANSNPAMAPR                7.2
SST        SANSNPAMAPR                SANSNPAM[+16]APR           7.8
MOG        FSDEGGFTCFFR               FSDEGGFTC[+57]FFR          3.8
NPY        ESTENVPR                   ESTENVPR                   3.4
RTN4RL2    LFLQNNLIR                  LFLQNNLIR                  4.1
SERPINI1   ALGITEIFIK                 ALGITEIFIK                 3.5
BAI2       LLAPAALAFR                 LLAPAALAFR                 5.1

19. Appendix 5.4: Statistical analysis (Cohort 2).

Multivariate linear regression

           MCI vs. mild vs. moderate vs. severe AD           MCI vs. moderate and severe AD
Gene Name  p-value (unadjusted)  p-value (adjusted by Holm)  p-value (unadjusted)  p-value (adjusted by Holm)

Set 1
APLP1      0.491                 1.000                       0.347                 1.000
BAI2       0.191                 1.000                       0.044                 1.000
BCAN       0.597                 1.000                       0.370                 1.000
CADM2      0.305                 1.000                       0.088                 1.000
CADM3      0.588                 1.000                       0.238                 1.000
CBLN2      0.214                 1.000                       0.062                 1.000
CNDP1      0.193                 1.000                       0.143                 1.000
CNTN2      0.851                 1.000                       0.605                 1.000
DNER       0.522                 1.000                       0.217                 1.000
ECM1       0.200                 1.000                       0.033                 0.923
FRRS1L     0.155                 1.000                       0.042                 1.000
KLK6       0.700                 1.000                       0.335                 1.000
LRRC4B     0.686                 1.000                       0.294                 1.000
MOG        0.450                 1.000                       0.172                 1.000
NCAN       0.226                 1.000                       0.060                 1.000
NPTX1      0.218                 1.000                       0.057                 1.000
NPTXR      0.014                 0.422                       0.004                 0.117
NPY        0.033                 1.000                       0.004                 0.117
NRCAM      0.364                 1.000                       0.105                 1.000
OLFM1      0.456                 1.000                       0.128                 1.000
OPCML      0.205                 1.000                       0.051                 1.000
PTPRZ1     0.667                 1.000                       0.654                 1.000
RTN4RL2    0.238                 1.000                       0.062                 1.000
SCG2       0.378                 1.000                       0.086                 1.000
SCG3       0.684                 1.000                       0.312                 1.000
SERPINI1   0.466                 1.000                       0.143                 1.000
SEZ6L      0.190                 1.000                       0.055                 1.000
SLITRK1    0.192                 1.000                       0.048                 1.000
SPP1       0.712                 1.000                       0.500                 1.000
SST        0.439                 1.000                       0.094                 1.000
VGF        0.038                 1.000                       0.005                 0.146

Set 2
APLP1      0.262                 1.000                       0.864                 1.000
BAI2       0.617                 1.000                       0.288                 1.000
BCAN       0.442                 1.000                       0.206                 1.000
CADM2      0.262                 1.000                       0.357                 1.000
CADM3      0.230                 1.000                       0.138                 1.000
CBLN2      0.269                 1.000                       0.406                 1.000
CNDP1      0.910                 1.000                       0.932                 1.000
CNTN2      0.338                 1.000                       0.663                 1.000
DNER       0.280                 1.000                       0.369                 1.000
ECM1       0.926                 1.000                       0.453                 1.000
FRRS1L     0.221                 1.000                       0.726                 1.000
KLK6       0.690                 1.000                       0.997                 1.000
LRRC4B     0.476                 1.000                       0.298                 1.000
MOG        0.370                 1.000                       0.859                 1.000
NCAN       0.285                 1.000                       0.158                 1.000
NPTX1      0.271                 1.000                       0.058                 1.000
NPTXR      0.118                 1.000                       0.039                 1.000
NPY        0.251                 1.000                       0.111                 1.000
NRCAM      0.265                 1.000                       0.214                 1.000
OLFM1      0.561                 1.000                       0.392                 1.000
OPCML      0.355                 1.000                       0.309                 1.000
PTPRZ1     0.486                 1.000                       0.754                 1.000
RTN4RL2    0.337                 1.000                       0.125                 1.000
SCG2       0.412                 1.000                       0.229                 1.000
SCG3       0.960                 1.000                       0.963                 1.000
SERPINI1   0.138                 1.000                       0.054                 1.000
SEZ6L      0.412                 1.000                       0.256                 1.000
SLITRK1    0.224                 1.000                       0.187                 1.000
SPP1       0.149                 1.000                       0.781                 1.000
SST        0.075                 1.000                       0.080                 1.000
VGF        0.370                 1.000                       0.167                 1.000

20. Appendix 5.5: Statistical analysis of proteins’ abundance among APOE phenotypes.

Kruskal-Wallis

Gene Name  p-value (unadjusted)  p-value (adjusted by Holm)
MOG        0.026                 0.801
NRCAM      0.036                 1.000
SEZ6L      0.039                 1.000
CNDP1      0.040                 1.000
NPTXR      0.046                 1.000
CADM3      0.049                 1.000
OPCML      0.052                 1.000
CNTN2      0.057                 1.000
SLITRK1    0.058                 1.000
BAI2       0.062                 1.000
DNER       0.062                 1.000
KLK6       0.062                 1.000
SPP1       0.064                 1.000
RTN4RL2    0.087                 1.000
NPTX1      0.090                 1.000
CBLN2      0.093                 1.000
NCAN       0.098                 1.000
VGF        0.099                 1.000
SST        0.106                 1.000
CADM2      0.109                 1.000
FRRS1L     0.137                 1.000
NPY        0.139                 1.000
SCG2       0.142                 1.000
SERPINI1   0.149                 1.000
LRRC4B     0.188                 1.000
ECM1       0.204                 1.000
BCAN       0.246                 1.000
APLP1      0.318                 1.000
SCG3       0.326                 1.000
PTPRZ1     0.384                 1.000
OLFM1      0.810                 1.000

21. Appendix 5.6: Statistical analysis of proteins’ abundance among APOE ε4 phenotypes.

Kruskal-Wallis: ε4+/+ vs. ε4+/- vs. ε4-/-

Protein    unadjusted p-value    adjusted p-value
MOG        0.012                 0.381
CNDP1      0.016                 0.490
NRCAM      0.018                 0.517
NPTXR      0.018                 0.517
OPCML      0.022                 0.584
SLITRK1    0.024                 0.626
CNTN2      0.024                 0.626
SEZ6L      0.025                 0.626
KLK6       0.026                 0.626
DNER       0.026                 0.626
SPP1       0.030                 0.626
BAI2       0.030                 0.626
RTN4RL2    0.038                 0.727
NPTX1      0.039                 0.727
VGF        0.044                 0.745
CADM3      0.044                 0.745
CBLN2      0.047                 0.745
CADM2      0.052                 0.745
SST        0.059                 0.764
FRRS1L     0.066                 0.789
NCAN       0.067                 0.789
SCG2       0.068                 0.789
SERPINI1   0.072                 0.789
NPY        0.081                 0.789
ECM1       0.104                 0.789
LRRC4B     0.118                 0.789
BCAN       0.126                 0.789
APLP1      0.180                 0.789
PTPRZ1     0.266                 0.797
SCG3       0.345                 0.797
OLFM1      0.681                 0.797