INVESTIGATING THE MOLECULAR BASIS OF CONCUSSION USING WHOLE EXOME SEQUENCING AND BIOINFORMATICS

Omar Ezzeldin Ibrahim Abdelrahman

BBehavSc(HonsPsych), GCertBiotech

Submitted in fulfilment of the requirements for the degree of

Doctor of Philosophy

School of Biomedical Sciences

Faculty of Health

Queensland University of Technology

2021

LIST OF PUBLICATIONS:

1- Heidi G. Sutherland, Neven Maksemous, Cassie L. Albury, Omar Ibrahim, Robert A. Smith,

Rod A. Lea, Larisa M. Haupt, Bronwyn Jenkins, Benjamin Tsang, Lyn R. Griffiths.

Comprehensive exonic sequencing of hemiplegic migraine related genes in a cohort of

suspected probands identifies known and potential pathogenic variants. Accepted in Cells.

2- Ibrahim O, Sutherland HG, Maksemous N, Smith R, Haupt LM, Griffiths LR. Exploring

Neuronal Vulnerability to Head Trauma Using a Whole Exome Approach. J Neurotrauma.

2020 Sep 1;37(17):1870-1879. doi: 10.1089/neu.2019.6962. Epub 2020 May 4. PMID:

32233732; PMCID: PMC7462038. Q1 Journal

3- Maksemous N, Smith RA, Sutherland HG, Maher BH, Ibrahim O, Nicholson GA, Carpenter

EP, Lea RA, Cader MZ, Griffiths LR. Targeted next generation sequencing identifies a genetic

spectrum of DNA variants in patients with hemiplegic migraine. Cephalalgia Reports. 2019

Oct 11;2:2515816319881630.

4- Ibrahim O, Sutherland HG, Haupt LM, Griffiths LR. Saliva as a comparable-quality source of

DNA for Whole Exome Sequencing on Ion platforms. Genomics. 2020 Mar;112(2):1437-1443.

doi: 10.1016/j.ygeno.2019.08.014. Epub 2019 Aug 21. PMID: 31445087. Q1 Journal

5- Ibrahim O, Sutherland HG, Avgan N, Spriggens LK, Lea RA, Haupt LM, Shum DH, Griffiths

LR. Investigation of the CADM2 polymorphism rs17518584 in memory and executive

functions measures in a cohort of young healthy individuals. Neurobiology of learning and

memory. 2018 Nov 1;155:330-6. Q1 Journal.

CONCUSSION GENETICS i

6- Ibrahim O, Sutherland HG, Haupt LM, Griffiths LR. An emerging role for epigenetic factors

in relation to executive function. Briefings in functional genomics. 2017 Nov 20;17(3):170-

80. Q1 Journal.

7- Avgan N, Sutherland HG, Spriggens LK, Yu C, Ibrahim O, Bellis C, Haupt LM, Shum DH,

Griffiths LR. BDNF variants may modulate long-term visual memory performance in a healthy

cohort. International journal of molecular sciences. 2017 Mar 17;18(3):655. Q1 Journal

Planned/Under Review

1- Rare Variants and Persistent Post-Concussion Symptoms: an exploratory study. Planned

Submission: Frontiers in Molecular Neuroscience – Q1

2- Mitochondrial Variants and Post-Concussion Migraine: A whole exome approach. Planned

submission: International Journal of Molecular Sciences. – Q1

3- A Machine Learning approach to investigating concussion and head trauma outcomes. Under

Review: Journal of Molecular Medicine – Q1

LIST OF CONFERENCES:

Oral Presentations:

1- O. Ibrahim, H. Sutherland, R. Lea, L. Haupt, L. Griffiths. A novel approach to identify

functional genetic variants associated with persistent post-concussion symptoms. Oral

Presentation. Australasian Neuroscience Society, Adelaide, December 2019.

2- O. Ibrahim, H. Sutherland, L. Haupt, L. Griffiths. Investigating the molecular basis of

concussion susceptibility. IHBI Inspires. Oral Presentation. Brisbane, August 2018.

Poster Presentations:

1. O. Ibrahim, H. Sutherland, P. Dunn, L. Haupt, L. Griffiths. An alternative approach to

investigating neurological disturbances following head trauma, Gene Mappers, Sydney,

November 2019.

2. O. Ibrahim, H. Sutherland, L. Haupt, L. Griffiths. Genetics of Post-Concussion

Symptoms. ASMR Post Graduate Conference. Brisbane, May 2019.

3. O. Ibrahim, H. Sutherland, L. Haupt, L. Griffiths. Exploring Mutations Implicated in Severe

Reactions to Trivial Head Trauma. Genetics Society of Australasia Annual Scientific

Meeting, Sydney, August 2018.

4. H. Sutherland, N. Maksemous, C. Albury, O. Ibrahim, R. Smith, L. Haupt, R. Lea, and L.

Griffiths. PRRT2 mutations in familial hemiplegic migraine. GeneMappers, QLD, 2018.

5. O. Ibrahim, H.G. Sutherland, T. Sheppard, N. Avgan, L.G. Spriggens, L.M. Haupt, D.K.

Shum, L.R. Griffiths. Potential role for CADM2 SNP in Visual Memory, Prospective Memory,

and Executive Function in Healthy Individuals. Human Genetics Society of Australasia Annual

Scientific Meeting, Brisbane, August 2017.

6. Omar Ibrahim, Cassie Albury, Heidi Sutherland, Miles Benton, Larisa Haupt, Lyn Griffiths.

Comparing Whole Exome Sequencing Data Quality between Saliva and Blood Samples.

ASMR postgraduate conference Queensland, June 2017.

7. O. Ibrahim, H.G. Sutherland, N. Avgan, L. M. Haupt, T. Sheppard, L.K. Spriggens, D.H.K.

Shum, L.R. Griffiths. HEY1 gene related SNP playing a role in working memory, visual spatial

capacity, and verbal recall. ASMR postgraduate conference Queensland, June 2016.

Awards and Scholarships:

1- 2019 : European Molecular Biology Laboratory PhD Course Delegate

2- 2018 : Chronic Disease and Ageing Manuscript Award

3- 2018 : QUT Faculty of Health Scholarship

4- 2017 : QUT Bluebox Innovation Camp 1st place, Best Pitch

5- 2017 : QUT PhD Tuition Fee Waiver Scholarship

Keywords

Concussion, mTBI, Neurogenetics, brain injury, next generation sequencing, Whole

Exome Sequencing, Post-concussion symptoms, head trauma, head injuries, Machine Learning,

Mitochondrial DNA, Epigenetics, Neuronal Vulnerability

Abstract

Concussion or mild traumatic brain injury (mTBI) is a transient neurological dysfunction that follows closed head injury. It is often attributed to the acceleration and deceleration forces that cause shear and tear to the neuronal axons. The molecular aftermath of concussion has recently been further understood and has highlighted an important role for neuronal ion channels and neurotransmitters in restoring neuronal homeostasis. These processes are crucial as concussion is often paired with neuronal depolarisation which is suggested to cause most post-concussion symptoms (PCS). Rare variants in ion channel genes (e.g. CACNA1A and

ATP1A2) have been implicated in response to head trauma and the ensuing prolonged symptoms, yet most existing research has focussed on single common variant association studies. To date, there is little consensus on the genetic basis of concussion development and outcomes. Further, there are no studies that have explored the use of whole exome sequencing

(WES) as well as machine learning (ML) methods to investigate the potential genetic basis of

PCS.

This PhD project utilised three groups of participants: a) N = 16 individuals with severe neurological reactions to trivial head trauma; b) N = 26 individuals with persistent post-concussion symptoms; and c) N = 18 individuals who recovered from mTBI within a normal and reasonable time frame (up to 6 weeks). DNA extracted from saliva and whole blood from the same individuals was compared to determine the suitability of saliva as a relevant source of

DNA for these groups. WES was undertaken on the Ion Platform Instrumentation (Ion Proton and GeneStudio S5 Plus) with Variant Call Files (VCFs) of groups a and b analysed for rare

deleterious variants (MAF < 0.01). Mitochondrial variants were also explored in the entire cohort by realigning the WES data to the mitochondrial reference genome. In addition, methylation changes at sites implicated in previous studies were explored using pyrosequencing. Finally, the VCFs of groups a, b, and c were used to develop an ML classification model based on genetic variants.

Saliva was noted to be a comparable high-quality source of DNA for use in WES interchangeably with whole-blood isolated DNA. This study identified variants in a number of ion channel and neurotransmitter pathways to have a potential role in concussion related symptoms. Using WES, 11 variants were identified in genes implicated in neurological recessive disorders. Rare and novel mutations in Ion Channel genes were found in 5 cases,

Neurotransmitter pathway mutations were found in 2 cases, and Ubiquitin-related mutations were identified in 4 cases, 2 of whom are related. All the identified mutations were predicted to be deleterious by in-silico tools and, where available, by previous studies. WES further identified 42 variants in individuals with persistent PCS. These variants were mostly predicted to be deleterious by in-silico prediction tools but were classified as Variants of Unknown Significance (VUS) in the clinical database ClinVar. These 42 variants were found in genetic pathways similar to the ones identified in the first group with severe neurological symptoms (a). Three likely pathogenic mitochondrial variants, one of which is implicated in ATPase pathways, were identified in the larger cohort at a frequency similar to that in publicly available neurological databases and higher than that in healthy control databases containing similarly sequenced samples. ML models, in particular Gradient Boosted Trees and Random Forests, were then used to distinguish individuals who recover normally after mTBI from individuals with hemiplegic migraine (HM)-like symptoms following trivial head trauma and those with persistent PCS, with an AUC of

0.7, showing potential future uses in larger cohorts. The application of ML methods also implicated a tau-related gene (TTBK2) and an insulin-resistance related one in post-head trauma

outcomes. These preliminary results suggest a potential role for combining genomics and AI to tailor treatments and care for mTBI patients, in line with the larger goals of personalised medicine.

Conclusion: DNA extracted from the blood and saliva of the same person appears to yield similar coverage and depth when examined by WES on an Ion Torrent platform, indicating that saliva is a valid source of DNA for use in WES. It is hypothesised that heterozygous deleterious mutations in genes implicated in recessive neurological dysfunctions might place individuals at risk of responding poorly to trivial head trauma due to neurological vulnerability and compromised homeostatic processes. Genetic neuronal vulnerability to head trauma was hypothesised as a framework for understanding severe cases of brain injury. The same framework could be used to further explore milder forms of PCS, as evidenced by the variants identified in the second cohort. Non-linear ML algorithms, in particular those built on decision trees, can classify individuals into general mTBI response groups with high accuracy through

WES data.

Table of Contents

KEYWORDS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
STATEMENT OF ORIGINAL AUTHORSHIP
ACKNOWLEDGEMENTS

CHAPTER 1: INTRODUCTION
1.1 BACKGROUND
1.2 CONTEXT
1.3 AIMS & HYPOTHESIS
1.4 SIGNIFICANCE, SCOPE AND DEFINITION
1.5 THESIS OUTLINE

CHAPTER 2: LITERATURE REVIEW
2.1 PHYSIOLOGY OF MTBI
2.2 OUTCOMES OF MTBI
2.3 DIAGNOSTIC AND THERAPEUTIC APPROACHES
2.4 GENETIC PATHWAYS
2.5 CONCLUSION

CHAPTER 3: RESEARCH DESIGN
3.1 METHODOLOGY AND RESEARCH DESIGN
3.2 PARTICIPANTS
3.3 INSTRUMENTS
3.4 ANALYSIS
3.5 LIMITATIONS

CHAPTER 4: SALIVA AS A COMPARABLE SOURCE OF DNA FOR WES
4.1 BACKGROUND
4.2 EXPERIMENTAL AIM
4.3 METHODS
4.4 RESULTS
4.5 DISCUSSION
4.6 CONCLUSION

CHAPTER 5: RARE VARIANTS IMPLICATED IN SEVERE REACTIONS TO TRIVIAL HEAD TRAUMA
5.1 BACKGROUND
5.2 METHODS
5.3 RESULTS
5.4 DISCUSSION
5.5 LIMITATIONS
5.6 CONCLUSION

CHAPTER 6: EXPLORING THE ROLE OF RARE ION CHANNEL AND NEURONAL-RELATED VARIANTS IN PERSISTENT POST-CONCUSSION SYMPTOMS
6.1 BACKGROUND
6.2 METHODS
6.3 RESULTS
6.4 DISCUSSION
6.5 CONCLUSION
6.6 FUTURE DIRECTIONS

CHAPTER 7: MITOCHONDRIAL AND EPIGENETIC CORRELATES OF PCS
7.1 MITOCHONDRIAL VARIANTS IMPLICATED IN RESPONSE TO HEAD TRAUMA
7.2 METHYLATION CHANGES POST-CONCUSSION

CHAPTER 8: PREDICTING OUTCOMES OF HEAD TRAUMA USING MACHINE LEARNING
8.1 BACKGROUND
8.2 METHODS
8.3 RESULTS
8.4 DISCUSSION
8.5 CONCLUSION

CHAPTER 9: DISCUSSION, FUTURE DIRECTIONS AND CONCLUDING REMARKS
9.1 KEY FINDINGS
9.2 LIMITATIONS
9.3 FUTURE DIRECTIONS

BIBLIOGRAPHY

APPENDICES

List of Figures

Figure 1: Ion P1 Chip heat map of library loading density
Figure 2: Workflow of the process comparing the utility of saliva-extracted DNA in WES experiments as an equal alternative to blood-extracted DNA
Figure 3: Screenshot of Integrative Genomics Viewer (IGV)
Figure 4: Variant filtering pipeline to explore relevant variants in patients with severe reaction to trivial head trauma (n=16) using Ion Reporter software
Figure 5: Pathway analysis for candidate genes based on the GeneMANIA tool
Figure 6: Pedigree charts of families included in the study
Figure 7: Boxplots demonstrating methylation averages for the 4 CpGs
Figure 8: Machine Learning types and approaches
Figure 9: Number of groups and corresponding BIC
Figure 10: K-means clustering results

List of Tables

Table 1: Diagnostic Criteria for mTBI
Table 2: Quality Control (QC) Metrics of the 8 samples
Table 3: Variant concordance between different samples donor
Table 4: Variants concordance using other variant caller and concordance analysis
Table 5: Clinical Notes for 24 cases referred for diagnostic testing of suspected Hemiplegic Migraine with notable varied neurological dysfunctions following minor and/or trivial head trauma
Table 6: Quality Metrics of WES by Ion Proton
Table 7: Minor Allele Frequencies (MAFs) of variants deemed relevant to symptoms by WES analysis
Table 8: Functional Scores
Table 9: Gene Set Enrichment Analysis Top Results
Table 10: Rare variants identified in a cohort of persistent post-concussion patients and their minor allele frequencies (MAF)
Table 11: Functional Scores of Identified Variants
Table 12: mtDNA Variants of Interest identified in the cohort
Table 13: Haplogroups identified in the cohort
Table 14: Average methylation across CN samples and GOM controls at four loci in the RAB5B gene
Table 15: Independent-sample t-test comparing methylation levels between concussed and healthy patients at each CpG location
Table 16: K-means Clustering Results
Table 17: Performance of different algorithms on various pairing of subgroups of responses to head trauma
Table 18: Confusion matrices for three different models where 0 is control and 1 is a case, depending on the model. The higher the number is in the intersecting cell between respective 0 or 1 rows or columns, the more accurate the prediction is
Table 19: Variants of most importance to the two models with above chance (AUC > 0.5) prediction accuracy

List of Abbreviations

AUC: Area Under the Curve

BDNF: Brain-Derived Neurotrophic Factor

CT: Computed Tomography

DP: Depth of Coverage

FHM: Familial Hemiplegic Migraine

GQ: Genotyping quality

GRC: Genomics Research Centre

ML: Machine Learning

MRI: Magnetic Resonance Imaging

mTBI: Mild Traumatic Brain Injury

mtDNA: Mitochondrial DNA

NGS: Next generation sequencing

PCR: Polymerase Chain Reaction

PCS: Post-Concussion Symptoms

PolyPhen: Polymorphism Phenotyping

ROC: Receiver Operating Characteristic

SIFT: Sorting Intolerant from Tolerant

SNP: Single Nucleotide Polymorphism

SNV: Single Nucleotide Variant

VGCC: Voltage Gated Calcium Channel

WES: Whole Exome Sequencing

Statement of Original Authorship

The work contained in this thesis has not been previously submitted to meet requirements for an award at this or any other higher education institution. To the best of my knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made.

Signature: QUT Verified Signature

Date: 9/06/2021

Acknowledgements

This document is, like most PhD theses, a labour of love, self-doubt, curiosity, and most of all, a testament to the support I have in my life. First and foremost, I couldn’t have done this without my family, and the unconditional support they have given me over the past years, tolerating the emotional highs and lows that come with the territory. Notably, my mother, to whom I owe my interest in pursuing knowledge, and who has been an unequivocal believer in my choices, a stable sounding board throughout my life and a tour de force role model in lifelong learning.

I wouldn’t be where I am now without the three people who opened the doors of science and genomics for me, my supervisors. I am forever grateful to you. To Prof Lyn Griffiths, thank you for all the chances you have given me, and taken on me, not knowing who I was or if I was going to live up to any potential, and for trusting me with this project. To A/Prof Larisa Haupt, thank you for all the extra hours you spent helping me craft my science, writing, and presentation, for helping me grow both in and out of academia, and for listening to all my self-doubts throughout this journey. To Dr Heidi Sutherland, I owe you knowing what it means to work in a lab and how to write for the life sciences; without your patience with my novice ways, I would have given up on this field-shift years ago.

To Paul Dunn and Nick Harvey, thank you for taking me in, for the irregular coffee runs, for all the memes, and here is to supporting each other in our respective journeys. Last but not least, thank you to the GRC team, and all the friends and loved ones I have made and/or lost throughout this journey. Finally, I’d like to acknowledge the funding I have received for this degree, namely my supervisor’s scholarship as well as the Faculty of Health Postgraduate Scholarship.

Chapter 1: Introduction

This chapter outlines the background (section 1.1) and context (section 1.2) of the

research, and its purpose (section 1.3). Section 1.4 describes the significance and scope of this

research and provides definitions of terms used. Finally, section 1.5 includes an outline of the

remaining chapters of the thesis.

1.1 BACKGROUND

Brain injuries are a global source of disability and a burden on health care systems.

Acquired brain injuries, which can be non-traumatic or traumatic, often develop after birth.

While the former is usually due to endogenous causes (e.g. stroke), the latter is often caused by

an external force. Currently, the Glasgow Coma Score is used to rate the gravity of a traumatic

brain injury, usually as either mild (score of 13-15), moderate (score of 9-12), or severe (score

of 3-8) 1. The literature often uses “Mild Traumatic Brain Injury” (mTBI) and “concussion”

interchangeably 2, and thus the same convention will be followed throughout this research

document. Similarly, sports-related concussion (SRC), a term often used in the sports medicine

literature to denote a direct impact to any body part that sends impulsive forces to the brain 3,

will be referred to as “concussion” throughout this document.

Concussion can be defined as “a traumatically induced transient disturbance of brain function”,

usually as a result of a head impact, which is underpinned by intertwined physiological

processes 4. It presents either with or without loss of consciousness and a range of symptoms

including headaches, transient memory problems, confusion, lack of coordination, dizziness,

and neuropsychiatric distress, as well as susceptibility to neuropsychiatric disorders following

repeated incidence 5, 6. There are also numerous documentations of progressive neurological

dysfunction following repeated injuries 7. Further, chronic pain is an occasional long-term effect that decreases the quality of life for affected individuals 8. Traumatic Brain Injury (TBI), including its mild form mTBI, is a common occurrence in contact sports, road accidents, outdoor activities, as well as occupations with physical activity components. Estimates of concussion prevalence vary significantly, particularly for sports-related concussion which is rarely reported and is of unknown prevalence in Australia 9. In the US, around 2 million brain injuries are recorded annually, with mTBI accounting for the majority of cases 10. It is estimated that around 85% of TBI cases presenting to hospitals at any given moment will be mTBI. More alarmingly, some estimate that up to 1 in 2 people will experience a form of TBI in their lifetime

11. The current view of mTBI is that it has both acute and chronic components, whereby symptoms and recovery are not linear processes but involve several peaks and troughs on the way back to a normal life, which is a near impossibility for some chronic sufferers 12.
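The Glasgow Coma Score bands quoted earlier in this section (mild 13-15, moderate 9-12, severe 3-8) can be written as a simple lookup. The following is a minimal illustrative sketch only; the function name `classify_tbi_severity` is hypothetical and is not part of any clinical software or of the analyses in this thesis.

```python
def classify_tbi_severity(gcs_score: int) -> str:
    """Map a Glasgow Coma Score to the severity band given in Section 1.1.

    Bands (from the text): mild 13-15, moderate 9-12, severe 3-8.
    The function name and error handling are illustrative assumptions.
    """
    if not 3 <= gcs_score <= 15:
        raise ValueError("Glasgow Coma Score must be between 3 and 15")
    if gcs_score >= 13:
        return "mild"
    if gcs_score >= 9:
        return "moderate"
    return "severe"


if __name__ == "__main__":
    for score in (15, 12, 8):
        print(score, classify_tbi_severity(score))
```

A score in the mild band corresponds to the mTBI or concussion cases that are the focus of this work.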

Despite its prevalence (approximately 6 per 1000 people), concussion is often underdiagnosed, as structural imaging does not necessarily show changes in the immediate aftermath when intracranial injuries are absent 13. In particular, micro-tears usually have a delayed onset and usually go undetected by the scans available in emergency departments 14. In most cases, presenting symptoms following concussion are transient, and hence little attention is paid to the long-lasting effects that might develop later in life. Returning to sports or work can exacerbate symptoms if a secondary trauma is inflicted. Second Impact Syndrome (SIS), an often-fatal condition in which the brain swells rapidly shortly after a person suffers a second concussion before symptoms from an earlier concussion have subsided, can occur in cases where insufficient time is given for recovery 15. It is thus imperative to explore factors which predispose individuals to complicated symptomatology or adverse outcomes 16.

Post-mTBI or concussion outcomes, including developing long term sequelae, are

diverse, spanning physical, cognitive, behavioural and emotional changes 17. This suggests a

role, at least in part, for genetics 18, which is closely linked to functional changes

underpinning the mechanisms of concussion, especially those related to neurometabolism and

ion channel disruption. Post-concussion symptoms are also closely intertwined with other

biological (e.g. sex or gender) and socioeconomic (e.g. access to health care or return to work

expectations) statuses that have been demonstrated to affect the outcomes of TBI 19. A role for

ion channel genes has already been established through studies of severe responses to trivial

head trauma 20-22. However, as will be explored in this literature review, few studies have been

able to establish strong genetic associations with concussion incidence and persistent symptoms

to date. Given that rare variants, identified by filtering on allele frequency in control groups,

have been instrumental in other disorders (e.g. epilepsy and schizophrenia) 23, 24, it is possible to

explore them in the context of post-concussion symptoms. Concussion has been further linked

to a myriad of mood, neuropsychiatric, and neurodegenerative disorders, as well as conditions

like Chronic Traumatic Encephalopathy (CTE) 25, 26. Hence, exploring the underlying genetic

aspects of concussion, to potentially predict unfavourable outcomes after an incident, could

allow for well-tailored prevention and treatment for individuals who are exposed to head trauma

as part of their everyday activities (contact sports athletes, construction workers, motorcyclists).

Most studies exploring the genetics of concussion have focused on candidate-genes or common

polymorphisms rather than potential causal rare variants, a gap that will be explored in detail in

the literature review of this dissertation.
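The idea of selecting rare variants by allele frequency in control groups can be sketched as follows. This is a minimal illustration, assuming the MAF < 0.01 threshold stated in the abstract; the record layout, field names, and placeholder variants are hypothetical and do not represent the Ion Reporter pipeline or any result from this thesis.

```python
from typing import List, Optional, TypedDict


class Variant(TypedDict):
    """Hypothetical, simplified variant record (not a real VCF schema)."""
    variant_id: str
    gene: str
    control_maf: Optional[float]  # None = not observed in the control database
    predicted_deleterious: bool   # e.g. flagged by an in-silico tool


MAF_THRESHOLD = 0.01  # rare-variant cut-off stated in the abstract


def is_rare(variant: Variant, threshold: float = MAF_THRESHOLD) -> bool:
    """Treat variants absent from controls (novel) or below threshold as rare."""
    maf = variant["control_maf"]
    return maf is None or maf < threshold


def filter_rare_deleterious(variants: List[Variant]) -> List[Variant]:
    """Keep variants that are both rare and predicted deleterious."""
    return [v for v in variants if is_rare(v) and v["predicted_deleterious"]]


if __name__ == "__main__":
    # Placeholder records for illustration only; not data from this study.
    example: List[Variant] = [
        {"variant_id": "var1", "gene": "GENE_A", "control_maf": 0.15, "predicted_deleterious": True},
        {"variant_id": "var2", "gene": "GENE_B", "control_maf": 0.002, "predicted_deleterious": True},
        {"variant_id": "var3", "gene": "GENE_C", "control_maf": None, "predicted_deleterious": True},
        {"variant_id": "var4", "gene": "GENE_D", "control_maf": 0.001, "predicted_deleterious": False},
    ]
    print([v["variant_id"] for v in filter_rare_deleterious(example)])
```

Treating variants that are absent from the control database as rare is a design choice made here so that novel variants are retained rather than discarded.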

1.2 CONTEXT

Exploring the genetic factors associated with concussion incidence is of paramount

relevance, particularly because a history of multiple concussions has been found to

be associated with neurodegenerative pathologies 27-29. These pathologies add to the healthcare burden of existing conditions among an ageing world population. Although there is some evidence that neuropsychological outcomes are modulated by genetic variants 30, to date there have been no studies using Next Generation Sequencing (NGS), which can detect thousands of variants at a time. In particular, there are no Whole Exome

Sequencing studies to explore rare and functional variants implicated in persistent post-concussion symptoms. It has become clear that candidate gene associations are inadequate, as they do not allow for gene discovery and often lead to conflicting results and replicability problems if underpowered. There is clearly a knowledge gap in the literature surrounding tailored concussion management protocols, and with the advent of personalised medicine, incorporating genetic profiles into neuroprotection or treatment plans may reduce the chances of exacerbating a concussion 31 or of prolonged debilitating symptoms. Further, clinical trials of neuroprotective compounds 32, predominantly ion channel blockers, have had inconsistent results

33, with most of the promising results limited to in-vitro and animal models. With the advent of pharmacogenomics, and its incorporation into various potential treatments for other neuropsychiatric conditions, tailoring neuroprotection treatments or post-concussion treatments by genotype might be one of the missing links in concussion management.

This research project is conducted in Australia, with the majority of collaborating clinicians and recruitment avenues based in Australia and New Zealand. However, seeing that the case rates are comparable to those in North America, and the fact that concussion/TBI is a global problem, it is expected that the results/findings will be useful to researchers and clinicians around the world.

1.3 AIMS & HYPOTHESIS

Hypothesis

The risk of concussion and the severity of its symptoms are modulated through genetic

and epigenetic factors, including variants in structural, ion channel, neurotrophic,

neurotransmitter, and mitochondrial genes. The use of whole exome sequencing (WES) and

machine learning will identify rare variants and genetic pathways implicated in neuronal

vulnerability to head trauma.

Aims

The aim of the research described in this thesis is to identify the role of genetics in the

risk and severity of concussion and its outcomes. The study encompasses four specific aims:

Aim 1: To develop a protocol for using saliva and blood-extracted DNA in the same WES

analysis on Ion platforms.

Aim 2: To perform WES on individuals from concussion-affected cohorts to investigate

rare variants in ion channels, neurotransmitter and neurological structural genes that may be

implicated in the development and persistence of post-concussion symptoms (PCS), and severe

neurological dysfunction following trivial head trauma.

Aim 3: To explore epigenetic and mitochondrial correlates of head trauma response and

incidence.

Aim 4: To use bioinformatics and machine learning to explore candidate genes and genetic

pathways implicated in outcomes of head trauma through supervised non-linear models.
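Performance of the supervised models in Aim 4 is summarised in this thesis as the area under the ROC curve (AUC). As a hedged illustration of what that number means, the sketch below computes AUC as the proportion of case-control pairs in which the case receives the higher predicted score (ties count as half). The function name `roc_auc` and the toy labels and scores are assumptions made for illustration; this is not the modelling code used in Chapter 8.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a case, 0 for a control.
    scores: model output where higher means "more likely a case".
    """
    cases = [s for lab, s in zip(labels, scores) if lab == 1]
    controls = [s for lab, s in zip(labels, scores) if lab == 0]
    if not cases or not controls:
        raise ValueError("Need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0   # case ranked above control
            elif case_score == control_score:
                wins += 0.5   # tie counts as half
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Toy values for illustration only; not results from this study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(toy_labels, toy_scores))
```

Under this reading, an AUC of 0.5 corresponds to chance-level ranking and an AUC of 1.0 to perfect separation of cases from controls.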

1.4 SIGNIFICANCE, SCOPE AND DEFINITION

The existing research includes investigating the established role of CACNA1A mutations in

specific concussion cases, and APOE in other studies, as well as the myriad of candidate gene

polymorphism association studies. To date, few studies have been able to establish a direct

association between concussion incidence and specific genes, including ion channel genes.

Most studies have focused on the outcomes of concussion and their genetic association but are

yet to explore the link between genes of interest and concussion incidence or development.

Further, the studies detailed in this literature review have all focused on common Single

Nucleotide Polymorphisms. The current candidate gene approaches depend on association

analyses of specific variants with the traits being studied. However, gene discovery methods

include burden analysis and sequencing of trios to identify functional variants that potentially

segregate affected from non-affected individuals. Besides rare variants in CACNA1A and

ATP1A2, which have been linked to concussion-like symptoms following trivial head trauma,

little attention has been given to rare mutations that affect protein function and their implication

in the processes that follow head trauma. Consequently, this dissertation proposes that a mixed

approach of Whole Exome Sequencing (WES) and Machine Learning will allow the

identification of previously unknown relevant genetic variants that may contribute to the

development of and response to concussion. Furthermore, as with most neuropsychological

conditions, concussion symptoms are multifactorial, and hence exploring the whole exome to

identify genetic pathways involved in concussion could potentially provide more insight into

its development.

Given the high numbers of mTBI cases in civilian, military, and professional sports

environments, and the expected toll long term complications take on health, it is imperative to

explore more streamlined diagnostic and therapeutic approaches to manage acute and long-term

symptoms. The significance of this research will be demonstrated through future reduction in

health care costs and improved quality of life.

1.5 THESIS OUTLINE

The objective of this thesis is to detail the work undertaken to explore new approaches in

the genetics of concussion development and outcomes. A critical review of past and current

literature, justifying the need for the new approaches proposed here, can be found in Chapter 2.

Chapter 3 further details the study aims and links them to selected methods. The theory

underlying the methods and relevant procedures is also presented in Chapter 3. Chapter 4

(published) presents the results of Aim 1, to confirm saliva as a valid source of DNA for WES

on Ion platforms. Chapter 5 (published) outlines the findings of the first part of Aim 2, to use

WES to identify rare variants implicated in severe responses to trivial head trauma. Chapter 6

outlines the findings of the second part of Aim 2, to explore the presence of rare variants in a

cohort of persistent PCS. Chapter 7 presents the results from exploratory studies investigating

mitochondrial and epigenetic correlates of concussion. The results of using machine learning

methods (Aim 4) to predict outcomes following head trauma are presented in Chapter 8. Finally,

Chapter 9 includes a summary and discussion of the most prominent results of this work as well

as concluding remarks and future directions.

Chapter 2: Literature Review

Concussion can be defined as “a traumatically induced transient disturbance of brain

function”, usually as a result of a head impact, which is underpinned by intertwined

physiological processes 4. No direct insult to the brain is required, and most concussions are

the result of closed head injuries, where the skull sustains no fractures. It presents either with

or without loss of consciousness and a range of symptoms including headaches, transient

memory problems, confusion, lack of co-ordination, dizziness, and neuropsychiatric distress,

as well as susceptibility to neuropsychiatric disorders following repeated incidence 5. Further,

chronic pain is an occasional long-term effect that decreases the quality of life for affected

individuals 8.

2.1 PHYSIOLOGY OF MTBI

2.1.1 Macro-level changes in TBI

The different layers/mesenchyme that surround the brain and often provide an extra protective

layer outside the blood-brain-barrier do not necessarily function well in the case of

concussion. That is mainly because brain injuries, especially concussion, are generally

attributed to biomechanical forces that create shear and tear forces, and are predominantly

caused by the brain moving with a heavy load of momentum relative to the skull 34, leading

to transient neurological dysfunctions 35. The brain is a viscoelastic organ in response to

pressure or impact 36, 37. The viscoelastic nature (both viscous and elastic when deformed)

of the brain allows for shears and micro-tears to spread beyond the point of impact, affecting

cells and cytoskeletal elements 38, 39. The shear and tear forces spread into neurons and glia,


potentially damaging dendrites, axons, and astrocytes 35. These cellular injuries then progress into pathological processes that sometimes cause secondary injury 40.

Typical concussion-causing injuries involve some form of acceleration and deceleration, with impact-induced brain changes including shearing and straining rotational forces, which may lead to loss of consciousness 41. The injury can be localised or focused yet cause “stress waves” that affect remote parts of the brain 42. Traumatic Brain Injury (TBI) is also associated with metabolic changes, specifically a metabolic depression that lasts between 5 and 14 days depending on the injury 43, which suggests a role for glucose metabolism in susceptibility to, or protection from, concussion 44. A neuroinflammatory response post-concussion has been documented 45, with proinflammatory micro regulators responding to the severity of injury 46. These responses have been suggested to share symptomatology with some inflammatory disorders, especially persistent post-concussion ones, which Rathbone and colleagues 47 described as post-inflammatory brain syndrome (PIBS). This overarching umbrella of syndromes is hypothesised to consist of conditions that arise when an unfavourable inflammatory response is invoked in the central nervous system, including encephalopathy, as well as post-operative and cancer treatment-related neurocognitive dysfunction. However, the work of Almeida-Suhett and colleagues 48, 49 on animal models suggests that inflammatory markers do not travel as far from the impact site as the concussive energy waves do, and the authors attribute most of the damage to those waves.

While there are some small physiological changes that occur with concussion/mTBI, they tend to be microstructural and do not typically show as structural changes in first-response imaging, which suggests that the molecular changes play a role in making concussion a “functional” injury 50.


2.1.1.1 Cellular and Molecular level changes

Molecular-level changes observed in association with concussion include, but are not

limited to, ionic and metabolic changes, as well as neurotransmission and connectivity

disruption. These changes are typically on a micro-level in the extracellular matrix of the

brain. Consequently, these changes are not easily detectable through structural imaging, the

first stage of diagnosis in cases of head traumas 35. The shear and tear forces can lead to

cellular distortion and changes in membrane permeability, which sequentially contribute to an

acute imbalance in extracellular ion gradients, in particular, K+, which is expelled into the

extracellular space 42, 51. Glutamate and other excitatory neurotransmitters are then released

by neurons, likely due to the imbalance caused by K+ efflux, paired with Na+ and Ca2+ influx

into the neurons, with the resulting ion imbalances contributing to local and downstream

neuronal depolarisation 35. The increase in extracellular glutamate in the aftermath of injury

has been linked to functional changes of AMPA and NMDA receptors 52, 53. This imbalance

in excitatory and inhibitory neurotransmitters becomes crucial in the development of

neurological and neuropsychiatric symptoms that depend on Long Term Potentiation and

Depression (LTP and LTD) 54, 55. Cellular level homeostasis can be re-established through ion

pump activation. These pumps place high adenosine triphosphate (ATP) energy demands and

oxidative stress on the neurons 51. This energy depletion is characterised by hyperglycolysis

and interference by Ca2+ ions with cellular and mitochondrial functioning. This can

last for a period of up to 2 weeks, equivalent to the average length of recovery time from

concussion 56. The mitochondrial dysfunction-caused hypermetabolism is followed by a

paucity of glucose metabolism 57. In addition, Tau protein, often associated with dementia

and Alzheimer’s disease, has been detected in metabolic models of concussion and described


as a sign of cytoskeletal damage caused by calcium influx 58. On an anatomical level, it has

been shown that the axonal stretch, wear and tear is quite similar in most parts of the brain

regardless of the injury location 59. As ion channel changes are involved in several stages of

concussion, in particular the earliest stages which follow an impact, they are hypothesised to

play a role in concussion severity and subsequent recovery and consequently may have a role

in concussion risk.
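The link described above between extracellular K+ accumulation and neuronal depolarisation can be illustrated with the Nernst equation, which gives the equilibrium potential of a single ion species from its concentrations on either side of the membrane. The following is a minimal sketch only: the function name and the concentrations are illustrative textbook-style assumptions, not measurements from the studies cited in this section, and a real neuron's membrane potential depends on several ions and their permeabilities.

```python
import math

GAS_CONSTANT = 8.314       # J / (mol * K)
FARADAY_CONSTANT = 96485.0  # C / mol


def nernst_potential_mv(conc_out_mm, conc_in_mm, valence=1, temp_c=37.0):
    """Equilibrium potential (mV) of one ion species from the Nernst equation."""
    if conc_out_mm <= 0 or conc_in_mm <= 0:
        raise ValueError("concentrations must be positive")
    temp_k = temp_c + 273.15
    volts = (GAS_CONSTANT * temp_k) / (valence * FARADAY_CONSTANT) * math.log(
        conc_out_mm / conc_in_mm
    )
    return 1000.0 * volts


# Assumed illustrative concentrations in mM (hypothetical, for demonstration only).
resting_e_k = nernst_potential_mv(conc_out_mm=4.0, conc_in_mm=140.0)
post_efflux_e_k = nernst_potential_mv(conc_out_mm=20.0, conc_in_mm=140.0)

print(f"E_K at rest:         {resting_e_k:.1f} mV")
print(f"E_K after K+ efflux: {post_efflux_e_k:.1f} mV")
```

Under these assumed values, raising extracellular K+ shifts the potassium equilibrium potential towards less negative values, which is the direction of change expected to favour depolarisation.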

2.2 OUTCOMES OF MTBI

2.2.1 Immediate

Post-concussive symptoms include dizziness, nausea, as well as attention and working

memory problems 60. In addition, migraine and headache are common after mTBIs. In a cohort

of military members with diagnosed mTBI, 76.8% reported a headache within 7 days after an

injury, with migraine features most commonly presenting 61. Post-traumatic amnesia is

another outcome where concussed individuals suffer a temporary loss of memories leading up to

or following the injury 2. Further, in a review of the neuropsychiatric outcomes of concussion,

Radhakrishnan and colleagues 62 divide the signs and symptoms of concussions into cognitive,

somatic, and affective symptoms that vary in representation and degree among individuals.

These distinct groupings have also been supported by other studies 63. The symptoms that

manifest immediately after injury usually resolve in 7-10 days 64, 65. Nevertheless, McFie and

colleagues 66 found that 10% to 15% of athletes do not recover spontaneously in the common

time frame. Similar numbers (a small but noticeable 10%) have been documented throughout

the literature 67. There is little known about spontaneous recovery in non-sports concussion,

highlighting the need for wider research in non-sports and non-military settings. Interestingly,


Donnan and colleagues 68 note that “concussion” episodes are not unique to mTBI injuries, which explains the reported incidents of concussion following trivial head trauma.

2.2.2 Prolonged and more severe outcomes

Despite the fact that most concussion symptoms fully resolve by three months post-injury, an estimated 5-43% of the injured population will develop prolonged sequelae 69, often referred to as Post-Concussion Syndrome (PCS). In a longitudinal study, Hiploylee and colleagues 41 found that only 27% of their population eventually recovered. PCS presents both short-term and long-term consequences; the former include the inability to return to work and society, and the latter include more adverse neuropsychiatric disorders 70. While the diagnosis of post-concussion syndrome is contested in some areas of the literature, post-concussion symptoms are much clearer to define. Hence, PCS will be used throughout this document as a general term for post-concussion symptoms and syndrome. Structural abnormalities, including white and grey matter damage, as well as cortical thickness changes, tend to persist in some cases for a year or more after trauma 71. These structural abnormalities have been associated with a myriad of unfavourable cognitive and psychological presentations, including, but not limited to, working memory problems, personality changes, and executive function deficits 72. In a scoping review, McInnes and colleagues 73 found that around 50% of single mTBI incidences cause prolonged cognitive deficits well beyond the expected three months. Nonetheless, this high number should be considered with caution, seeing that their definition of cognitive impairment was quite broad and numerous cohorts were included in the final count despite only showing a marginally significant association with a single cognitive test 74. Donnan and colleagues 68 found in their systematic review that, when excluding genetic factor studies, the severity of injury is the most reliable predictor of symptoms and recovery. This, however, does not explain the different symptoms and recovery of people who sustain injuries of similar severity, nor does it account for findings from other studies 75-77 in veterans indicating that the functional toll of mTBI is not related to the severity of injury. Similar inferences can be made from the rare documented cases of cognitive deficits following sub-concussive injuries 31, suggesting a factor beyond the severity of impact in neuronal vulnerability. There are varying reports on the differences between mTBI and post-concussion symptoms, with some showing that psychological factors contribute more to male athletes developing worse PCS, owing to their attitudes towards reporting sports concussions as well as their knowledge of symptoms and management 78.

Of note, second impact syndrome (SIS) is one of the severe outcomes following brain injuries/concussion. SIS is often fatal and results from sustaining a brain injury before total remission from a previous incident 79. The mechanisms of SIS are, as yet, not fully understood, and the definition of the syndrome is itself contested 16, primarily due to the lack of consensus on diagnostic criteria. Hence, genetic risk factors may provide a useful approach to exploring and better defining what SIS entails. Despite extensive research, and given the varying degrees of outcomes, our understanding of the impact of mechanical insult and the spreading wave on long-term molecular and anatomical functionality remains limited, mostly due to the challenging nature of tracking brain changes 59.

In summary, concussion symptoms are varied and encompass a range of neurological, behavioural, and cognitive outcomes. They typically resolve in weeks to months, although a documented portion of the population struggles with persistent PCS.


2.3 DIAGNOSTIC AND THERAPEUTIC APPROACHES

2.3.1 Current Diagnostic Approaches

The diagnosis of concussion is currently a contested area, in particular defining what injuries and symptoms encompass a concussion or mTBI 42. The lack of consensus regarding diagnostic criteria 62 has been attributed to the vague and general spectrum of symptoms currently included as concussion, often dependent on self-reporting 61 and leading to a varied array of heterogeneous symptoms that might be attributed to mTBI 80. The fact that a large part of concussion symptoms have a psychological aspect (e.g. fatigue or cognitive impact) ultimately causes a self-report disparity and further hinders the development of a universal diagnostic method 81.

The Glasgow Coma Scale (GCS) is one of the more widely used assessments to diagnose mTBI/concussion, which is indicated when a person scores 13-15, with or without loss of consciousness for no more than 30 minutes, and with amnesia lasting no more than 24 hours 57. Further details on the Glasgow Coma Scale and different criteria for TBI types are provided in Table 1. Another challenge to diagnosing mTBI worth noting is that numerous neurological and neuropsychiatric symptoms depend on self-report, which is subjective at best 38. Moreover, the Standardised Assessment of Concussion (SAC) is a test usually administered in early post-injury phases, which provides a more accurate estimation of neurocognitive domains (orientation, immediate memory, concentration and delayed recall) 82, and provides an attempt at navigating the uncertainty of symptom diagnosis. One of the challenges facing the diagnosis of concussion is that many people who have concussion will wait until the symptoms resolve without seeking diagnosis or treatment when faced with the pressure of returning to work or the game 9. The currently accepted theory is that after clinical recovery, there is a “window of cerebral vulnerability” where a second impact can cause severe deterioration and sometimes permanent damage 83. This is consistent with the observation that recovery from symptoms, which evolve in minutes to hours post injury and typically resolve in up to 10 days, is not always paired with recovery in neuropsychological correlates or sustained structural recovery 84. Consequently, validating diagnostic methods to quantify the extent of injury and predict the ensuing damage might be instrumental in reducing further damage during the vulnerability window.

Table 1: Diagnostic Criteria for mTBI

Loss of consciousness (LoC) < 30 minutes

Loss of Memory (retrograde or anterograde amnesia) up to 24 hours

Change of mental state (disorientation, confusion, dizziness)

Glasgow Coma Score >= 13

Adapted from (Coyle, 2015)
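The thresholds in Table 1 can be expressed as a simple rule check. The sketch below is illustrative only and is not a clinical tool: the class and function names are hypothetical, the strict "< 30 minutes" bound follows the table as printed, and the requirement that at least one clinical sign be present is an added assumption rather than part of the adapted criteria.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Clinical findings for one patient; field names are illustrative."""
    gcs: int                            # Glasgow Coma Score (3-15)
    loc_minutes: float = 0.0            # duration of loss of consciousness
    amnesia_hours: float = 0.0          # duration of retrograde/anterograde amnesia
    altered_mental_state: bool = False  # disorientation, confusion, dizziness


def meets_mtbi_criteria(p: Presentation) -> bool:
    """Return True when the findings fall within the Table 1 thresholds."""
    if not 3 <= p.gcs <= 15:
        raise ValueError("GCS must be between 3 and 15")
    # Thresholds taken from Table 1: GCS >= 13, LoC < 30 min, amnesia up to 24 h.
    within_mild_range = (
        p.gcs >= 13 and p.loc_minutes < 30 and p.amnesia_hours <= 24
    )
    # Assumption (not stated in Table 1): at least one sign must be present.
    has_clinical_sign = (
        p.loc_minutes > 0 or p.amnesia_hours > 0 or p.altered_mental_state
    )
    return within_mild_range and has_clinical_sign
```

For example, `meets_mtbi_criteria(Presentation(gcs=14, loc_minutes=2, amnesia_hours=1))` evaluates to true, whereas a presentation with 45 minutes of lost consciousness falls outside the mild range.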

Brain imaging provides crucial information that helps to determine the appropriate course of treatment following a concussion incident. However, imaging, in particular X-ray and magnetic resonance imaging (MRI), is not routinely administered 85. In some cases, computed tomography (CT) is recommended to predict the risk of any possible complications 86. CT is also more generally available than MRI, which is more expensive but superior in differentiating grey from white matter 87. However, potential side effects and the high cost of MRI make it difficult to adopt as a standard for all cases 10. Although an mTBI may not present with cognitive or neuropsychological sequelae, this does not negate the existence of injury, nor its severity 2. Further, the myriad of differing symptoms and their severity following a concussion poses a widely acknowledged challenge to accurately diagnosing mTBI 38. Consequently, employing newer imaging technologies may allow for more refined exploration of the impact of head injuries. Diffusion tensor imaging allows more detailed analysis of microstructural injuries in axons not detectable via CT 35. Of interest, a CT-diagnosed intracranial haemorrhage means that the mTBI can be subclassified as complicated mild TBI 88. Voxel-based morphometry is also used to supplement data regarding grey matter consistencies following trauma 71. White matter changes and morphology post-injury seem to be a promising avenue, as they have been linked to persistent symptoms one year after injury 89, 90. More specifically, it has been demonstrated that white matter changes as early as 2 weeks post injury are correlated with persistent neuropsychological symptoms 1 year post injury 91.

Assessing TBI is a comprehensive multidisciplinary process of which neurocognitive assessments are a crucial element. Neurocognitive assessments are now an endorsed aspect of concussion management, due to their ability to detect abnormal deficits following head trauma 92. Further, a review by Kontos and colleagues 93 reiterated the importance of including neurocognitive assessments in assessing injury-related damages, identifying their sensitivity to both acute and long-term deficits.

Biomarkers are still sought as an alternative in the diagnostic process of mTBI. The high cost and intensive nature of imaging call for quicker and more routinely administrable ways to explore brain health post injury, such as biomarkers 94. Sharma and colleagues 10 designed a biomarker panel based on serum protein levels that can accurately identify the severity of a TBI. The panel includes CRP, a broad systemic inflammation marker, MMP-2, a zinc-dependent protease, and CKBB, an energy production-related creatine kinase 10. A

systematic review by Cairelli and colleagues 95 using network analysis identified 17 potential

substances that could be the basis for a myriad of biomarkers, including genetic variants,

presented in a network in Figure 1. Nevertheless, these diagnostic and therapeutic approaches

do not take into consideration any predisposition to develop concussion symptoms or

unfavourable outcomes.

2.3.2 Current therapeutic approaches

Because of the diverse physiological and molecular changes that occur with TBI, finding

universally effective therapies has been challenging 32. For example, Omelchenko and

colleagues 96 demonstrated that inhibiting sodium-calcium exchanger 1 (NCX1) promoted

better recovery from the oxidative stress that has been described earlier, and in turn could

provide better recovery from diffuse axonal injury. In mouse models, multi-channel blockers

97 were demonstrated to provide a protective effect against ischemic brain injury, which shares

several pathways with mTBI/concussion in terms of ion channel roles and oxidative stress, and

has been the subject of more neuroprotection research 98. To date, several therapies have been

tested for their efficacy in minimising negative mTBI outcomes. They mostly encompass

therapies that target metabolites and other molecules that are by-products of mTBI, or

therapies that act as neuroprotective factors in early injury phases. Antioxidants, particularly

N-acetylcysteine, have been shown to improve mTBI outcomes and might have

neuroprotective effects on at-risk subjects, as indicated by intraperitoneal injection in

murine TBI models 99. Specifically, early administration either reversed or limited behavioural

problems post injury. Cholinergic systems have been suggested to play a role in memory

functions 100 with further disturbances in the cerebral cholinergic pathways post injury often


linked to cognitive impairment 101. As such, cholinesterase inhibitors are suggested to improve the cognitive prognosis of mTBI 102. Calcium channels, primarily voltage-gated calcium channels (VGCC), have been established to have neuroprotective characteristics 103. Hence, calcium channel blockers have been trialled for treatment of TBI 104, 105. However, a recent meta-analysis identified no significant improvement following severe TBIs after treatment with calcium channel blockers 106. It is possible that the efficacy of neuroprotective agents is genotype dependent, which accentuates the relevance of studying the genetic basis of concussion symptoms and development.

2.4 GENETIC PATHWAYS

It is almost impossible to claim that genetics will have an isolated or singular role to play, considering that mTBI provides a prime example of the ever-changing interaction between biology and environment 19. To date, research exploring the genetics of concussion has been inconclusive, with few studies able to isolate specific genes or mutations implicated in concussion risk and severity or persistence of its outcomes. Several systematic reviews have been published in relation to the genetic factors of concussion or sport concussion 5, 17, 18, 107.

Nevertheless, based on our current knowledge regarding the mechanisms of concussion, ion channel-, neurotransmitter-, and metabolism-related genes provide plausible targets for further examination in the processes of concussion. Studying the genetic basis of concussion, or the risk thereof, entails exploring the genetic basis of physiological changes, cognitive and behavioural outcomes, response to injury, as well as the subsequent neurotrauma 107. In addition, genetic factors may affect the risk of concussion by modulating risk-taking behaviour and impulsivity 108, 109. This risk may involve rare variants, which are present in less than 1% of the wider population as recorded in the 1000 Genomes Project and dbSNP databases, as well as common variants, with a minor allele frequency (MAF) of over 5% in the population. Furthermore, persistent post-concussion symptoms have been explored from a common disease, common variant approach. However, recent studies have alluded to the notion that a common disease, rare variant approach might be the missing link in understanding the genetic basis of certain phenotypes 110.
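The frequency thresholds above translate directly into a classification rule. The following is a minimal sketch, assuming allele counts are already available from a reference panel; the function names are hypothetical, and the "low-frequency" label for variants between 1% and 5% is an added convention not defined in the text above.

```python
def minor_allele_frequency(alt_allele_count: int, total_alleles: int) -> float:
    """Frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0 or not 0 <= alt_allele_count <= total_alleles:
        raise ValueError("allele counts are inconsistent")
    alt_freq = alt_allele_count / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(alt_freq, 1.0 - alt_freq)


def classify_by_maf(maf: float, rare_below: float = 0.01,
                    common_above: float = 0.05) -> str:
    """Label a variant using the 1% and 5% thresholds described in the text."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency lies between 0 and 0.5")
    if maf < rare_below:
        return "rare"
    if maf > common_above:
        return "common"
    # Assumption: the 1-5% band is labelled "low-frequency" for illustration.
    return "low-frequency"
```

For instance, a variant observed on 3 of 1,000 reference chromosomes has a MAF of 0.003 and would be labelled rare under these thresholds.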

Especially for complex diseases or phenotypes, a combination of rare variants is likely to precipitate a higher risk, once penetrance estimation has been mitigated 111. This is in line with evidence that functional regions of the genome contain more variants that are enriched for complex trait heritability 112. Schizophrenia, for example, a neuropsychiatric disorder for which heritability estimates have been consistently low, has benefited from NGS identifying sporadic de novo mutations in patients 113, and from machine learning classifying the exomes of healthy individuals and patients with schizophrenia 114. Li and colleagues 115 found that four neuropsychiatric disorders share genes with de novo variants that are suggested to contribute to their aetiology. Girard and colleagues 113 have also explored the role of de novo mutations in schizophrenia beyond common SNP approaches. These rare variants indicate potential deleteriousness on the basis that “evolution is the best classifier of deleteriousness”, which means that the more heavily conserved a gene is among species (fewer polymorphisms), the more potentially damaging rare variants in it will be 116. Genetic networks are important to consider when exploring complex neurological disorders such as persistent PCS. It could be hypothesised that networks of genetic variants are represented in changes in brain networks, or what is known as the “connectome” 117.

Genetic studies in the context of concussion can be divided into two groups: i) studies that examine the genetic risk of developing concussion, including severe reaction to minor head trauma; and ii) studies that explore genetic association with concussion recovery and outcomes in the long or short term. The variants in the first group can either be polymorphisms (SNPs) that are associated with risk or deleterious rare mutations, which cause conditions relating to, or overlapping with, concussion symptoms. In concussion research, genetic associations have typically been based on candidate gene approaches, unlike most other traits where a genome-wide association study (GWAS) approach is more common. However, polygenic risk scores, which combine networks of genes along with other demographic values, can provide a valuable alternative to relying on GWAS significance alone 70. A recent review by Kurowski and colleagues 118 used a systems biology approach to identify candidate genes potentially associated with recovery from TBI. Their analysis revealed two major groups of implicated genes: those related to neural recovery processes, and genes that play a role in functional cognitive and behavioural reserves.
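In its simplest form, a polygenic risk score is a weighted sum of the number of effect alleles an individual carries at each variant, with weights usually taken from association effect sizes. The sketch below illustrates only this genetic component under stated assumptions: the function name, variant identifiers, and weights are hypothetical placeholders, demographic covariates are not modelled, and variants missing from a genotype are assumed to contribute zero copies.

```python
from typing import Mapping


def polygenic_risk_score(dosages: Mapping[str, int],
                         weights: Mapping[str, float]) -> float:
    """Weighted sum of effect-allele dosages (0, 1 or 2 copies per variant)."""
    score = 0.0
    for variant_id, weight in weights.items():
        # Assumption: a variant absent from the genotype counts as 0 copies.
        dosage = dosages.get(variant_id, 0)
        if dosage not in (0, 1, 2):
            raise ValueError(f"invalid dosage for {variant_id}: {dosage}")
        score += weight * dosage
    return score


# Hypothetical per-variant weights and one individual's genotype.
example_weights = {"variant_a": 0.2, "variant_b": -0.1, "variant_c": 0.05}
example_dosages = {"variant_a": 2, "variant_b": 1}
example_score = polygenic_risk_score(example_dosages, example_weights)
```

Scores computed this way are only comparable between individuals genotyped for the same variant set, which is one reason such scores are typically standardised within a cohort before use.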

The need for genetics research is evident when looking at the contradictory results from fMRI studies exploring post-concussion working memory 119 or how Transcranial Magnetic Stimulation signals are NMDA receptor dependent 120. A role for genetics in predisposition to more negative outcomes is further supported by studies exploring longitudinal data of children with and without mTBIs, finding that the higher risk for mTBI was also linked with higher initial and persistent outcomes 121.

2.4.1 Ion Channels

Exploring genetic variation in ion channel genes may provide an informative approach to studying concussion risk, as patients with ion channelopathies often suffer adverse reactions to head trauma 83. Giza and Hovda 35 hypothesise that voltage- or ligand-gated ion channels are activated after ionic flux and create “a diffuse spreading depression-like state that may be the biological substrate for very acute post concussive impairments”.


Calcium ions are among the first to enter cells following impact and disruption of the extracellular ionic gradient. Hence, it is no surprise that the calcium channel gene CACNA1A, which encodes the Calcium Voltage-Gated Channel Subunit Alpha1 A (Cav2.1), has been strongly associated with concussion susceptibility, specifically the rare missense mutation S218L, a C to T change at location 928 on chromosome 19 122. Delayed fatal cerebral oedema and coma are potential outcomes for patients with this mutation after mTBI 21, 123, an effect similar to secondary concussion which can result in oedema or death 124. CACNA1A dysfunctions also cause Familial Hemiplegic Migraine (FHM1), episodic ataxia type 2 (EA2) and spinocerebellar ataxia type 6 (SCA6) 125. The overlap of symptoms between FHM1 and concussion (i.e., lack of coordination, migraine headache, altered sensory perception, confusion, language incomprehension, and temporary or progressive attacks of muscle weakness) 126 suggests a common underlying mechanism and potentially genetic commonalities. This is further supported by the finding that a history of migraine or headache predisposes individuals to more unfavourable outcomes after a concussion, or at least a prolonged recovery time 127. Indeed, much like in concussion, cortical spreading depression due to increased Ca2+ neuronal influx is common in FHM1 mutation cases 125, 128. This proposition is supported by mouse models, where analysis of the human S218L FHM1 mutation identified severe oedema and brain damage following mTBI 129. Further, Ca2+ homeostasis has been recorded to be disrupted following TBIs 130, leading to several complications. S218L is one of the few rare causal mutations of concussion identified to date.

The role of calcium channels extends beyond mutations in CACNA1A, with studies establishing a role for other calcium channels and calcium-activated potassium channels in both healthy and disease-related cerebrovascular response to head trauma 131.


In addition to calcium channels, concussion entails disturbance to the sodium-potassium ionic gradient of the neuronal matrix. Consequently, the sodium-potassium pumps (Na+/K+-ATPases) are activated to restore neuronal membrane potential. The membranes of central nervous system cells contain the α2 subunit of the Na+/K+ pump, encoded by the ATP1A2 gene 132. Dysfunction in these subunits caused by ATP1A2 mutations leads to impaired removal of glutamate by glial cells, causing cortical spreading depression and leading to FHM2 and symptoms similar to concussion events 133, 134. A recent study by Maksemous and colleagues 20 identified 8 more ATP1A2 rare variants in a cohort of individuals with severe neurological reactions to trivial head trauma, thus implicating a second ion channel gene in concussion development and response.

Other sodium and potassium channel genes have also been implicated in concussion; however, limited research to date supports definitive links. For example, the potassium ion channel gene KCNJ10 plays a role in altered resting polarisation levels for cells, which influences the threshold of post-injury ion flux activation (variants included Q212R, L166Q, and G83V) 135. The SCN7A and SCN10A genes have also been linked to blood pressure and cardiac function regulation, which in turn affect blood flow alterations in the aftermath of a concussion, potentially affecting recovery times 136-138. Another form of channel relates to solute carriers. Madura and colleagues 139 found that a polymorphism in the SLC17A7 promoter (rs74174284), which codes for the vesicular transporters that regulate synaptic glutamate, is significantly associated with prolonged recovery periods after concussion.

2.4.2 Neurotransmitter Genes

While little is known of the role of neurotransmitter genes in concussion, several common polymorphisms in neurotransmitter genes have been associated with concussion aetiology. Glutamate and γ-aminobutyric acid (GABA), as the primary excitatory and inhibitory neurotransmitters in the brain, play important roles in the concussion process, particularly with the indiscriminate glutamate release that leads to increased intracellular Ca2+ after concussion 140. Glutamate is also believed to contribute to secondary injuries through excitotoxicity 141. In a network analysis study of the literature, Cairelli and colleagues explored the role of glutamate in neuroprotection following mTBI and its association with lactate production 95. McDevitt and colleagues found a significant association between recovery times after concussion and the long variant of variable number tandem repeat (VNTR) alleles of the GRIN2A gene that encodes an NMDA glutamate receptor subunit 142. This is consistent with other evidence indicating the NMDA glutamate receptor subunit composition is altered in the aftermath of concussion 140.

The neurotransmitters serotonin and dopamine have been implicated in several mood disorders, which develop occasionally after concussion. Smyth and colleagues 143 hypothesised that the serotonin gene HTR1A affects concussion outcomes in children, specifically post-concussive depressive symptoms; however, no significant association between the common rs6295 G allele and post-concussive symptoms was found. Interestingly, some studies showed that children with higher incidence of mTBI or prolonged PCS tend to have more pre-injury behavioural problems 144, 145, which could hint at a role for behaviourally linked genes, particularly those implicated in risk-taking, which happen to lie in neurotransmitter pathways. For example, Yue and colleagues 146 found that the DRD2 rs6277 polymorphism is associated with improved verbal learning 6 months after injury. However, the cohort consisted of patients with minimal head trauma, suggesting the effects of DRD2 are injury severity dependent. Conversely, Dretsch and colleagues 147 explored the role of the dopamine receptor gene (DRD2) Taq 1A common SNP rs1800497 in concussion risk in active-duty soldiers and found no significant association. DRD2 is of particular interest due

to its role in executive function and proposed link to neurodegenerative disorders 148. Further,

McFie and colleagues 66 suggest that dopamine levels might be contributing to risk-taking

behaviour, thus modulating concussion rates. Catechol-O-methyltransferase (COMT) is a

gene that has been associated with memory and executive function due to its expression in the prefrontal

cortex (PFC) and its role in the breakdown of neurotransmitters, including dopamine 149. Winkler

and colleagues 150 found that the COMT common SNP (rs4680) Val158Met potentially

contributes to better nonverbal cognition after mTBIs.

2.4.3 Neural Structural Genes

Apolipoprotein E (ApoE) has been shown to be vital for neural integrity maintenance

and recovery from brain injuries 151. ApoE protein plays an important role in neural tissue

healing and repair 152, and ApoE is one of the most studied genes in relation to mTBI outcomes, often

yielding conflicting results. Several studies have found no correlation between the ApoE-ε4

allele and deteriorated conditions after mTBI 153-155. Nevertheless, ApoE has been suggested

to predispose individuals to a 3-fold increase in concussion risk through the TT genotype of the

G-219T promoter polymorphism, as opposed to the GG genotype 156. Terrell and colleagues 157 found that the

ApoE-ε4 allele protects against concussion risk. Further, in accordance with its association

with memory impairment in neurodegenerative disorders, the ApoE-ε4 allele has been shown

to be associated with decreased verbal memory 6 months after an mTBI 30. The Tau protein is

crucial for establishing microtubules in the neurons, and ApoE-ε4 has been found to play a

role in tau-mediated neurodegeneration 158. Additionally, Tau has been implicated in the

concussion processes post-impact, yet no significant link has been found between the Ser53Pro

variant and increased concussion risk 156. In contrast, Tierney and colleagues 159 identified a

significant association between carriers of all ApoE rare alleles (E2, E4, promoter) and


previous concussions, as well as a significant association between the ApoE minor allele and

a history of two or more concussions. These data suggest that ApoE may play different roles before

and after concussion, potentially related to co-expressed proteins. Along with Tau protein,

heavy neurofilaments are part of the neural cytoskeleton, and their integrity is crucial to its

maintenance 160. It follows that the genes encoding them might be implicated in the response to

injury. However, McDevitt and colleagues found no significant association between a

neurofilament heavy (NEFH) polymorphism and concussion risk 161.

2.4.4 Pro and anti-inflammatory cytokines

The bone marrow tyrosine kinase gene located on the X chromosome (BMX) has been

shown to regulate cytokine receptor signalling processes 162 and the recruitment of

inflammatory cells during brain injury 163. As cytokines have an established role in

inflammatory processes, BMX has been implicated in TBI. Wang and colleagues explored the

role of 4 BMX polymorphisms in a cohort of mTBI patients and found a significant association

between two common SNPs, rs16979956 and rs35697037, and post-injury anxiety and

dizziness symptoms, respectively 164. In accordance with the inflammatory response theory

discussed earlier, it follows that genes implicated in inflammatory responses (IL-1B and IL-6)

could be relevant to concussion outcomes 66. This is supported by previous

evidence linking the levels of both proteins to post-concussion symptoms and recovery 165-167. In a

candidate gene study, McFie and colleagues 66 investigated the roles of IL-6 rs1800795 and

IL-1B rs16944 polymorphisms in concussion incidence and severity, and linked certain genotypes

to a lower frequency of injuries and a shorter duration of symptoms. Interestingly, in a study exploring

gene expression post-mTBI in animals, it was mediators of interleukin rather than cytokines

that were differentially expressed 48. Nonetheless, in support of the inflammation hypothesis,


Mountney and colleagues 168 found higher neuroinflammatory biomarker levels to be

associated with worse injury outcomes.

2.4.5 Neurotrophic genes

Brain-derived neurotrophic factor (BDNF) is known to play an important role in brain repair,

neural regeneration, and survival processes, which are crucial for recovery from concussion

and influencing clinical outcomes 169, 170. BDNF has also been implicated in several memory

functions as well as neuroplasticity levels 171. Narayanan and colleagues found a significant

association of the BDNF rs6265 Val66Met polymorphism with neurocognitive outcomes of

concussion patients, in particular, Met allele carriers 172. The same allele has also been found

to be associated with improved visual-auditory working tasks in concussed athletes 173 and to

affect olfactory functions in concussed female athletes 174. Additionally, the rs1157659 SNP

minor allele was found to be significantly associated with hippocampal size reduction and

functional connectivity in war veterans 175. Interestingly, BDNF levels in serum have been

found to act as an indicator of brain injury severity in the immediate aftermath of trauma and

as a predictor of injury prognosis over 6 months 170. Further, a study of controlled mTBI in

animals found significant changes in BDNF mRNA levels 176.

2.4.6 Epigenetics of Concussion

Several studies have established epigenetic changes post-concussion or mild traumatic

brain injury. However, to date, most of these studies have focussed on animal models. For

example, expression of the enzymes that regulate methylation processes has been shown to

be severely affected by the level of brain injury 177, and site-dependent methylation changes

(hypomethylation and hypermethylation) have been shown to affect hippocampal tissues more

than prefrontal cortex in animal models 178. In addition, Sagarkar and colleagues 179 found


that mTBI can lead to substantial and persistent changes in the methylation of BDNF gene

promoters in animal models. Notably, BDNF methylation has previously been found

to be associated with executive function 180, which may relate to the cognitive decline that

presents after TBIs. Similarly, the expression of cyclic adenosine 3′,5′-monophosphate

response element-binding protein (CREB), another gene that has been implicated in

memory functions 181, was found to be affected by injury in juvenile rat models 182.

Further, Hehar and colleagues 183 found that epigenetic modification, particularly methylation

in the promoter regions of BDNF, IGF-2, and IGF-R, is inherited, with concussion to offspring

contributing to PCS symptoms. An exploratory study recently compared epigenome-wide

methylation in patients post-concussion and patients who had never been concussed 184.

Bahado and colleagues identified 4 CpG sites in genes (FLOT2, RAB5B, FCMR, GALNT10)

implicated in neural processes and integrity that were differentially methylated between concussed

and non-concussed patients.

2.4.7 Mitochondrial Correlates of Concussion

ApoE, as established in prior paragraphs, is one of the most studied genes in relation to

TBI outcomes due to its role in protecting neurons from oxidative cell death (22 in

Bulstrode et al). This ties in with the oxidative stress that occurs post-concussion, and an

interaction between mtDNA variants and ApoE genotypes has been found in a cohort of patients with

neurodegenerative disorders 185. Age, a predictive factor of TBI outcomes 186, has also been

found to have its effect modulated by genomic mtDNA variants 187. ApoE has also been

associated with mitochondrial toxicity in neurodegenerative disorders 188, 189. There is

substantial evidence for involvement of mitochondrial variants in response to TBI and

correlation with persistent symptoms. A recent study has also explored the role of

mitochondrial haplogroups in modulating the risk of sustaining concussion as well as


developing PCS 190. This comes as no surprise given the role energy production plays in the

aftermath of concussion/head trauma. Nonetheless, there is little evidence exploring the role

of mitochondrial variants in cases with severe responses to trivial head trauma or prolonged

PCS.

2.5 CONCLUSION

Despite the high prevalence of concussion, there are few established biomarkers,

including genetic ones, to detect the risk of concussion development and persistent concussive

symptoms. The strongest evidence for a role of genetic factors in concussion is the rare ion

channel gene mutations implicated in severe reactions to trivial head trauma. However,

several genetic pathways have also been implicated in recovery from concussion or worsening

of the symptoms in the long term. At present, there seems to be little consensus about how

these genetic components interact or affect the development of concussion symptoms with

regard to the severity of trauma.


Chapter 3: Research Design

This chapter describes the design adopted by this research to achieve the aims and

objectives stated in section 3 of Chapter 1. Section 3.1 discusses the methodology used in the

study, the stages by which the methodology was implemented, and the research design;

section 3.2 details the participants in the study; section 3.3 lists all the instruments used in the

study and justifies their use; section 3.4 discusses how the data was analysed; finally, section

3.5 discusses the ethical considerations of the research and its potential problems and

limitations.

3.1 METHODOLOGY AND RESEARCH DESIGN

3.1.1 Methodology

This study predominantly relied on Next Generation Sequencing (NGS), in particular WES,

to undertake the genotyping required for the aims. A suite of bioinformatics and machine learning (ML) tools

was used to analyse the data produced in a variety of ways suited to each of the aims.

3.1.2 Research Design

Aim 1: To develop a protocol for using saliva and blood-extracted DNA in the same

WES analysis on Ion platforms.

Hypothesis 1: Saliva provides a more convenient alternative of comparable quality to

blood as a source of DNA for WES on Ion platforms.

Methods:


a) Recruit healthy volunteers who donate saliva and whole blood samples (n = 4).

b) Extract genomic DNA from each of the samples using commercially available kits (Oragene and QIAGEN midi-kit) and perform WES using the Ion Proton instrumentation.

c) Use Ion Server to create variant files for each of the samples.

d) Using Ion Reporter software, compare the quality metrics of the blood-extracted and saliva-extracted DNA.

e) Using Ion Reporter software, compare the variants identified in the same person based on their DNA sample source and calculate concordance levels. Repeat the concordance analysis using other software for replication purposes.

f) Perform statistical analysis using an independent-groups t-test to confirm that no significant differences exist between groups.
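
Steps e) and f) can be sketched in code. The following is a minimal illustration and not the Ion Reporter implementation. It assumes each variant is represented as a (chromosome, position, reference, alternate) tuple and defines concordance as the shared variants divided by the union of both call sets, which is one common convention and may differ from the definition used in the analysis. The t statistic uses the pooled-variance (Student's) form. All variants and metric values below are invented.

```python
from math import sqrt
from statistics import mean, variance


def concordance(calls_a, calls_b):
    """Percentage of variants shared by two call sets, relative to their union."""
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        raise ValueError("both call sets are empty")
    return 100.0 * len(a & b) / len(union)


def pooled_t_statistic(group_a, group_b):
    """Independent-groups (Student's) t statistic with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))


# Invented example: variants called from blood- and saliva-derived DNA of one person.
blood = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"),
         ("chr3", 300, "G", "A"), ("chr4", 400, "T", "C")}
saliva = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"),
          ("chr3", 300, "G", "A"), ("chr5", 500, "A", "T")}
pair_concordance = concordance(blood, saliva)  # 3 shared out of 5 distinct variants

# Invented quality metric (e.g. mean read depth) for four blood and four saliva samples.
blood_depth = [98.0, 102.0, 100.0, 104.0]
saliva_depth = [97.0, 101.0, 99.0, 103.0]
t_stat = pooled_t_statistic(blood_depth, saliva_depth)
```

A p-value would then be read from the t distribution with n_a + n_b - 2 degrees of freedom, typically through a statistics package.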

Aim 2: To perform WES on individuals from concussion-affected cohorts to investigate

rare variants in ion channels, neurotransmitter and neurological structural genes that may be

implicated in the development and persistence of PCS.

Hypothesis 2: Individuals who develop severe neurological reactions to trivial head

trauma or persistent PCS will harbour rare variants in ion-channel genes.

Methods:

a) Undertake WES on selected individuals from 2 concussion-affected cohorts (HM and PCS as defined in section 3.2) to identify variants that relate to concussion susceptibility, severe reactions to minor head trauma, and persistent post-concussive symptoms. Sequencing will be conducted using Ion Proton and Ion S5+ as per section 3.3.2.

b) Investigate rare variants in control populations and unaffected family members to ensure minor allele frequencies are consistent.


Aim 3: Investigate epigenetic and mitochondrial correlates of head trauma response and

incidence.

Hypothesis 3: Mitochondrial DNA variants and epigenetic differences can explain

differences in response to head trauma.

Study Design:

a) Identify mitochondrial variants that can be extracted from WES data and assess them for higher

prevalence in individuals with multiple concussion history or persistent PCS (n = 49 from HM

and PCS cohorts in section 3.2). Variant identification was conducted using the tools in sections

3.4.1 and 3.4.5.

b) Compare epigenetic changes, particularly methylation of relevant gene promoters,

between two cohorts (n = 32 in each of the healthy and concussion-affected groups) derived from

concussion patients and athletes with long-term concussion symptoms. Methylation analysis

will be undertaken using pyrosequencing as described in section 3.4.6.

Aim 4: Use machine learning to explore candidate genes and genetic pathways

implicated in concussion.

Hypothesis 4: Machine learning models can classify sub-cohorts of concussion patients

with above-chance accuracy.

Methods:

a) Use machine learning (ML) modelling (gradient boosting specifically) to explore the potential association between the identified variants and common symptoms across the different populations. Sections 3.4.4 and 8.2 detail the methods used for ML.
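
To make the idea behind gradient boosting concrete, the sketch below fits a sequence of one-split decision stumps, each to the residuals left by the previous ones. It is a deliberately simplified illustration under stated assumptions: squared-error loss, binary carrier-status features, and a 0.5 decision threshold. It is not the implementation used in this thesis, and a real analysis would normally rely on an established library. The function names and the toy data are hypothetical.

```python
from statistics import mean


def fit_stump(X, residuals):
    """Pick the binary feature whose 0/1 split best fits the residuals (least squares)."""
    best = None
    for j in range(len(X[0])):
        absent = [r for row, r in zip(X, residuals) if row[j] == 0]
        present = [r for row, r in zip(X, residuals) if row[j] == 1]
        if not absent or not present:
            continue  # this feature does not split the cohort
        m0, m1 = mean(absent), mean(present)
        sse = sum((r - m0) ** 2 for r in absent) + sum((r - m1) ** 2 for r in present)
        if best is None or sse < best[0]:
            best = (sse, j, m0, m1)
    return best


def fit_boosting(X, y, n_rounds=20, learning_rate=0.3):
    """Gradient boosting with squared-error loss: each stump is fitted to current residuals."""
    base = mean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        best = fit_stump(X, residuals)
        if best is None:
            break
        _, j, m0, m1 = best
        stumps.append((j, m0, m1))
        preds = [p + learning_rate * (m1 if row[j] else m0) for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    """Classify one individual as 1 or 0 by thresholding the boosted score at 0.5."""
    base, stumps, learning_rate = model
    score = base + sum(learning_rate * (m1 if row[j] else m0) for j, m0, m1 in stumps)
    return 1 if score >= 0.5 else 0


# Invented toy data: rows are individuals, columns are carrier status for two variants,
# and labels mark membership of a hypothetical sub-cohort.
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 1]]
y = [1, 1, 0, 0, 1, 0]
model = fit_boosting(X, y)
predictions = [predict(model, row) for row in X]
```

In this toy cohort the first variant separates the two groups, so every stump splits on it. With real WES data the number of variants far exceeds the number of individuals, so held-out evaluation is needed before any above-chance accuracy can be claimed.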


3.2 PARTICIPANTS

3.2.1 Ethical Approval

Samples were collected using standardised saliva and blood collection kits, with

DNA extractions performed using protocols currently optimised in the Genomics Research

Centre (GRC). Ethical approval was granted for the samples that were obtained through

the diagnostic facility of the GRC (Approval Number 1400000748), and for the 2 other cohorts

(Approval Number 1700000811).

3.2.2 Participant groups

a) Group 1: 16 patients who were referred to the Genomics Research Centre for diagnostic testing

due to FHM-like symptoms following minor head trauma but had tested negative for mutations

in the known FHM genes. They are referred to throughout the document as the FHM or HM

group. Their clinical presentation is detailed in Chapter 5.

b) Group 2: The second group comprised 33 individuals who reported persistent

post-concussion symptoms, a history of multiple concussion incidents following a few minor head

traumas, or both. Participants were recruited via local sporting clubs and university organisations,

as well as through relevant media announcements (e.g. radio and TV interviews and relevant

newsletter articles). Where available, their affected and unaffected family members were also

recruited. This group is referred to as PCS in most of the document.

c) Group 3: These participants (n = 18) were either recruited directly or referred through

collaborators (namely Prof Fatima Nasrallah). They had sustained an mTBI and recovered

within the expected time frame. This group is referred to as the recovery group throughout most of the document.


3.3 INSTRUMENTS

3.3.1 DNA extraction and quantification

Saliva

Saliva was collected using Oragene kits, with participants following the instructions provided

with the kits. Samples were then incubated at 50°C in a water incubator overnight. In a 15mL

tube, each entire sample was mixed with 1/25th volume of PT-L2P and vortexed for a few seconds. This

was followed by incubation on ice for 10 minutes and centrifugation at room temperature

for 10 minutes at high speed (3,500 x g). The clear supernatant was then transferred to a

fresh tube and mixed by inversion 10 times with 1.2x volume of room temperature 100%

ethanol, then centrifuged for 15 minutes at high speed. The pellet contained the DNA, and the

supernatant was carefully removed. For an ethanol wash, 1 mL of 70% ethanol was added carefully to the tube

without disturbing the smear or the pellet, left to stand at room temperature

for 1 minute, then pipetted out. The DNA pellet/smear was allowed to fully air-dry (minimum

3 hours), then rehydrated by adding 0.5 mL of distilled water and vortexing the sample

for 30 seconds.

Blood

Participants were directed to pathology labs where their blood was collected by

professional phlebotomists. Blood samples were stored at -80°C immediately after collection

and were allowed to thaw at room temperature for 2-4 hours prior to extraction. DNA was

extracted from blood samples using the QIAGEN midi-kit extraction protocol. QIAGEN Protease

was mixed with blood and when necessary, PBS. After adding 2.4 ml Buffer AL, the sample

was mixed thoroughly by inverting the tube 15 times, followed by additional vigorous shaking


for at least 1 min and incubation at 70°C for 10 minutes. Ethanol (100%) was then added to the sample and mixed in by inverting the tube 10 times, followed by additional vigorous shaking. The resultant solution was then transferred to a QIAamp Midi column placed in a 15 ml centrifuge tube and centrifuged at 3500 x g for 5 min. After removing the filtrate (the solution that passes through the column’s filter after spinning down), 2 ml of Buffer AW1 was added to the QIAamp Midi column and centrifuged at 4500 x g for 1 minute. Finally, the

QIAamp Midi column was placed in a clean 15 ml centrifuge tube, and 300 µl of distilled water was pipetted directly onto the membrane of the QIAamp Midi column. After incubation at room temperature for 5 min and centrifugation at 4500 x g for 2 min, the eluted DNA was transferred from the centrifuge tube for storage.

Quantification using Qubit

DNA extracted from blood and saliva was quantified using the Qubit® Fluorometer

(Thermo Fisher), following the manufacturer's protocol. The first step of the

Qubit® dsDNA High Sensitivity (HS) Assay is preparing the 2 standards it requires. Qubit® working solution was prepared by diluting the Qubit® dsDNA HS Reagent 1:200 in Qubit® dsDNA HS Buffer. 190 µL of Qubit® working solution was added to each of the tubes

(0.6 mL) used for standards. 10 µL of each Qubit® standard was added to the respectively labelled tube (Std 1 and Std 2), then mixed by vortexing for 2–3 seconds and spinning down for

5 seconds. The tubes were left to incubate at room temperature for 2 minutes, and then the instructions on the instrument were followed to define the standards.

The same steps were then followed for the samples, whereby 198 µL of the Qubit® working solution was added to 2 µL of each sample in individual tubes. After vortexing,


spinning, and incubation, each sample tube was inserted into the sample chamber and “read

tube” was selected on the instrument. The instrument displays the results on the assay screen,

whereby the top value (in large font) is the concentration of the original sample. This was

repeated until all samples were read.
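
The dilution arithmetic behind these steps can be sketched as follows. This is an illustrative calculation only: it assumes a 200 µL assay volume per tube (190 µL + 10 µL for standards, 198 µL + 2 µL for samples, as described above), budgets a full 200 µL of working solution per tube as a slight overestimate, and assumes the in-tube reading is expressed in ng/mL. The function names are hypothetical, and the instrument performs the back-calculation itself once the sample volume is entered.

```python
def working_solution_volumes(n_tubes, assay_volume_ul=200.0):
    """Reagent and buffer volumes (µL) for a 1:200 dilution, budgeting 200 µL per tube."""
    total = n_tubes * assay_volume_ul
    reagent = total / 200.0
    return reagent, total - reagent


def stock_concentration(tube_ng_per_ml, sample_ul=2.0, assay_volume_ul=200.0):
    """Back-calculate the undiluted sample concentration (ng/µL) from the in-tube reading."""
    dilution_factor = assay_volume_ul / sample_ul      # 200 / 2 = 100-fold
    return tube_ng_per_ml * dilution_factor / 1000.0   # ng/mL -> ng/µL


# Ten tubes (for example 2 standards and 8 samples) need 10 µL reagent in 1,990 µL buffer.
reagent_ul, buffer_ul = working_solution_volumes(10)

# An invented in-tube reading of 500 ng/mL corresponds to a 50 ng/µL stock.
example_stock = stock_concentration(500.0)
```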

3.3.2 WES

The human genome contains around 3x10^9 bases of DNA, of which about 30 million

form the exome, which is translated into functional proteins (Rabbani, 2014). These bases equate

to about 1-2% of the whole genome but are hypothesised to harbour most of the disease-causing

mutations, especially in Mendelian disorders (Majewski et al 2011).

Hence, whole exome sequencing, which involves amplifying and sequencing these exonic

regions, is a potentially useful method to identify said mutations, in particular rare variants,

which might be missed in GWAS panels. It is also more cost- and time-effective

than more comprehensive sequencing methods such as Whole Genome Sequencing.

WES is often used in family pedigree studies and population studies. WES is now also used

routinely in diagnostic and clinical settings and has improved detection rates for numerous

Mendelian diseases 191, 192. The routine use of WES has helped numerous patients with genetic

disease receive a diagnosis that would not have been possible otherwise 193. WES has proven

to be a powerful, efficient, and affordable research and diagnostic tool 194 and hence was

chosen for this project. For WES, Ion Torrent instrumentation was used:

Ion Proton and Ion S5 for sequencing, and Ion Chef for template preparation. The Ion P1 chip

was used for sequencing on the Ion Proton, and Ion 540 and 550 chips were used for

sequencing on the Ion S5 instrument.


WES studies incorporate either a case-control or family-based design. Family-based

WES studies typically recruit families where one or more related individuals, often referred to as

probands 195, exhibit a phenotype or a disease that is not shared by the rest of the family.

Through elimination and a series of steps detailed in multiple guidelines, it is

often possible to identify one or two mutations that only the probands harbour. Case-control

studies, on the other hand, are primarily focused on finding mutations or rare variants that are

enriched in a cohort of cases with a certain

condition/disease. The controls can be either a healthy cohort that is specifically

chosen/recruited for the purposes of comparison to a diseased group, or a public database of

individuals who meet certain criteria or are chosen for their healthy status. The steps for WES

on both instruments are the same and are as follows:

1- Library preparation: Ion platforms ultimately sequence pools of amplicons.

Consequently, the first step requires the amplification and enrichment of

exonic regions from genomic DNA. 75 ng of input gDNA was used for the initial amplification

reaction. These amplicons are then pooled, and primers are partially digested prior

to adapter ligation. IonXpress barcodes (ThermoFisher Scientific) specific to the Ion Torrent

platform were used to uniquely tag each library. Following a series of clean-up steps using the

Ethanol/Agencourt AMPure XP clean-up method, purified adapter-ligated libraries were

subjected to a final amplification step to enrich adapter-ligated DNA and to increase library

concentration. Following this, libraries underwent a two-round purification process using the

Ethanol/Agencourt AMPure XP clean-up method and were quantified using the Qubit Fluorometer

(ThermoFisher Scientific). Ion AmpliSeq exome libraries typically yield 300 – 1,500 ng/mL

per sample. Following calculation of library concentration, samples were diluted to a final

concentration of 22 ng/mL (~100 pM) and pooled prior to template preparation.


2- Template Preparation: Template preparation is an automated process done through the Ion

Chef instrument using Hi-Q Chef Solutions and Chef Kits (ThermoFisher Scientific,

Waltham, Massachusetts). The ultimate goal of the process is to isolate exome-amplified

fragments and load samples on the ion-based sequencing chip of choice.

3- Analysis of variants in the samples: Variants can be filtered using a number of

limiting parameters, and the number of variants retained depends on how many filters are applied and how stringent or lenient they are.

Our stringent filtering criteria resulted in a

small number of candidate genes in each family, which allowed us to take a candidate gene approach

a step further. The stringency of the filtering often depends on the hypothesis and whether

it was generic or specific to certain genes.
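
The library dilution in step 1 (to 22 ng/mL, approximately 100 pM) can be illustrated with a short calculation. This is a sketch under stated assumptions: the conversion uses the common approximation of 660 g/mol per base pair of double-stranded DNA, and the mean library fragment length of about 333 bp is not given in the text but is simply the value at which 22 ng/mL equals roughly 100 pM under that approximation. The function names and the example stock concentration are hypothetical.

```python
def mass_to_molar_pm(ng_per_ml, mean_length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/mL) to picomolar."""
    return ng_per_ml * 1e6 / (g_per_mol_per_bp * mean_length_bp)


def dilution_volumes(stock_ng_per_ml, target_ng_per_ml=22.0, library_ul=1.0):
    """Fold dilution and diluent volume (µL) needed to reach the target concentration."""
    if stock_ng_per_ml < target_ng_per_ml:
        raise ValueError("library is already below the target concentration")
    fold = stock_ng_per_ml / target_ng_per_ml
    return fold, library_ul * (fold - 1.0)


# An invented library at 660 ng/mL needs a 30-fold dilution: 1 µL library + 29 µL diluent.
fold, diluent_ul = dilution_volumes(660.0)

# Under the assumed ~333 bp mean fragment length, 22 ng/mL is close to 100 pM.
approx_pm = mass_to_molar_pm(22.0, 333)
```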

3.3.3 Methylation Analysis

There are several methods to assess methylation levels, including methylation-specific

restriction endonucleases (MSRE), methylation-specific high-resolution melting (MS-HRM),

quantitative methylation-specific polymerase chain reaction (qMSP), and pyrosequencing

196. Pyrosequencing has several advantages, including simplicity of use and accuracy in both

CpG-rich and CpG-poor regions 197. In a review of the most popular methods for assessing

methylation levels, Setakova and colleagues concluded that pyrosequencing was the most

feasible, providing the highest consistency at base-level resolution. Assay design is flexible,

as the distance from the first base to be sequenced can be varied, and therefore the primer can

usually be positioned in a region free of CpG sites. In addition, there are four options for

design: the assay can be performed in forward or reverse orientations, and on either the top or

the bottom strands 198. In this study, methylation levels were analysed using pyrosequencing,

which uses PCR-amplified regions from bisulphite-converted DNA.


A complete bisulphite conversion is essential for accurate and reliable

pyrosequencing, as unconverted cytosines can incorrectly be counted as methylated loci 196.

Conversion kits have come a long way from using high bisulphite salt concentrations with high

temperatures and low pH, which used to require high DNA input and resulted in a large

fragmented/lost portion of the already low yield 199. Nowadays, there is a wide variety of

commercial kits available that are able to convert as little as 100 pg of DNA in less than 2 h

200. These kits use a convenient column system and guarantee more than 99% conversion

efficiency. Pyrosequencing was done using the PyroMark Q48 (Qiagen) instrumentation, which

provides real-time sequence-based pyrosequencing technology for detection and

quantification in genetic analyses and epigenetic methylation studies. There are 5 steps to a

pyrosequencing reaction:

a) A DNA segment is amplified using PCR, and the strand serving as the pyrosequencing template is biotinylated.

b) DNA polymerase is used to incorporate dNTPs onto the sequencing primer. Each incorporation event is accompanied by a proportionate release of pyrophosphate (PPi).

c) ATP sulfurylase converts PPi to ATP in the presence of adenosine 5' phosphosulfate (APS). This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin, which generates visible light in amounts that are proportional to the amount of ATP. The light produced in the luciferase-catalysed reaction is detected by CCD sensors and seen as a peak in the raw data output (Pyrogram). The height of each peak (light signal) is proportional to the number of nucleotides incorporated.

d) Unincorporated dNTPs and ATP are continuously degraded by apyrase before further nucleotides are added.


e) As the incorporation goes on, the complementary DNA strand is elongated, and the nucleotide

sequence is determined from the signal peaks in the Pyrogram trace.
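
Because peak height is proportional to the number of nucleotides incorporated, the methylation level at a CpG site can be estimated from the relative C and T signals after bisulphite conversion, where methylated cytosines remain C and unmethylated cytosines are read as T. The sketch below illustrates this calculation, assuming a forward-orientation assay and invented peak heights. The function names are hypothetical, and the instrument software normally reports these percentages directly.

```python
from statistics import mean


def methylation_percent(c_signal, t_signal):
    """Percent methylation at one CpG site from its C and T peak heights."""
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("no signal at this CpG site")
    return 100.0 * c_signal / total


def mean_promoter_methylation(sites):
    """Average methylation across the CpG sites of one assay, given (C, T) peak pairs."""
    return mean(methylation_percent(c, t) for c, t in sites)


# Invented peak heights for two CpG sites in one promoter assay.
site_values = [(30.0, 70.0), (50.0, 50.0)]
per_site = [methylation_percent(c, t) for c, t in site_values]
promoter_mean = mean_promoter_methylation(site_values)
```

Per-site or per-promoter percentages computed this way are the quantities that would be compared between the healthy and concussion-affected groups.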

3.3.4 Sanger Sequencing

The first step in validation of relevant variants detected via WES will be confirmation

that the variants are real and not sequencing artefacts. Despite some recent reviews suggesting

a lack of utility for validation of NGS 201, Sanger sequencing is still considered the gold

standard and used for confirmation of variants, particularly those with low coverage, and

familial segregation analysis when appropriate. It is the “first-generation” sequencing method, which

is based on PCR amplification of the specific genetic fragments that are to be sequenced.

The next step, which differentiates Sanger sequencing from a normal amplification process, is

the incorporation of fluorescently labelled 2’, 3’- dideoxynucleotide triphosphates (ddNTPs)

into the growing strands, terminating DNA synthesis (Heather and Chain, 2016). Genetic

sequencers such as the 3500 Applied Biosystems Genetic Analyser are used to separate the synthesised

products and read them based on the different wavelengths emitted by the different nucleotides (e.g. 420nm is

adenine) (Gupta and Gupta 2014). The advantage of using Sanger sequencing as a complementary

method to next generation sequencing is its high accuracy (99.99%) (Morey,

Fernandez, 2013), which is crucial in clinical and diagnostic settings and less so in research

settings.

Sanger sequencing Steps:


1. Design primers that amplify the region where the mutation is located using NCBI Primer-BLAST

software.

2. Run gradient PCR to decide on the optimal annealing temperature for the primer pair that

yields the highest concentration of PCR product and least primer-dimers. The DNA

polymerase used is GoTaq® Flexi DNA Polymerase from Promega. PCR protocol is

described in detail in Appendix 1.

3. Amplify the sample with the mutation using the optimised protocol, as well as another sample

as a “positive control”.

4. Clean the PCR products with EXO-SAP kit.

5. Use the cleaned products in a BigDye™ Terminator v3.1 reaction, which is compatible

with the 3500 Applied Biosystems Genetic Analyser.

3.4 ANALYSIS

3.4.1 Bioinformatics pipeline

Following WES, the Ion Torrent Server is used to generate quality metrics and align reads

to the human genome reference hg19, and the Ion Torrent Variant Caller (TVC) is used to call

sequence variants. Whole exomes are first analysed for genes implicated in known head-trauma-related

conditions, followed by analysis of all genes related to ion channels or neuronal

functions. Ion Reporter software (latest version) is used to explore and filter variants

based on Minor Allele Frequency (MAF), gene ontology, and functional scores.
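
The filtering logic described above can be sketched as a simple predicate over annotated variants. This is an illustration, not Ion Reporter's implementation: the dictionary fields, the 1% MAF cut-off, and the SIFT (≤ 0.05) and PolyPhen (≥ 0.85) thresholds are assumptions based on commonly used conventions, a plain gene list stands in for the gene ontology filter, and all variant records are invented.

```python
def passes_filters(variant, genes_of_interest, max_maf=0.01,
                   max_sift=0.05, min_polyphen=0.85):
    """Keep rare variants in genes of interest that either tool predicts to be damaging."""
    maf = variant["maf"]
    rare = maf is None or maf <= max_maf  # absent from population databases counts as rare
    in_gene = variant["gene"] in genes_of_interest
    damaging = variant["sift"] <= max_sift or variant["polyphen"] >= min_polyphen
    return rare and in_gene and damaging


# Invented annotated variants; "GENE_X" is a placeholder for a gene outside the list.
variants = [
    {"id": "var1", "gene": "CACNA1A", "maf": 0.0002, "sift": 0.01, "polyphen": 0.97},
    {"id": "var2", "gene": "CACNA1A", "maf": 0.20, "sift": 0.02, "polyphen": 0.90},
    {"id": "var3", "gene": "GENE_X", "maf": None, "sift": 0.00, "polyphen": 0.99},
    {"id": "var4", "gene": "ATP1A2", "maf": 0.001, "sift": 0.40, "polyphen": 0.10},
]

# Illustrative gene list only; the actual lists used are described in the results chapters.
ion_channel_genes = {"CACNA1A", "ATP1A2", "SCN1A"}
kept = [v["id"] for v in variants if passes_filters(v, ion_channel_genes)]
```

Only the first record survives: the second is too common, the third lies outside the gene list, and the fourth is not predicted to be damaging by either score.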

In-silico prediction tools are then used to investigate the pathogenicity of the mutations.

Two sets of tools were used, training-based algorithms (PolyPhen-2 and Mutation Taster) and


non-training-based algorithms (SIFT and Mutation Assessor), to avoid results based solely on machine learning tools, which are prone to over-fitting 202, 203.

Mutation Assessor, an in-silico prediction tool based on protein function, was used to explore the potential impact of amino acid (AA) changes on protein structure. Mutation

Assessor produces functional impact (FI) scores, wherein scores below 0.8 indicate a neutral impact, scores between 0.8 and 1.9 a low impact, scores between 1.9 and 3.5 a medium impact, and scores higher than 3.5 a high impact 204. PredictSNP is a suitable tool for assessing the potential pathogenicity of variants, as it compiles the data of 8 of the top established prediction models and metrics (MAPP, nsSNPAnalyzer, PANTHER, PhD-SNP,

PolyPhen-1, PolyPhen-2, SIFT and SNAP 205) to produce a percentage of deleteriousness or non-pathogenicity as an average of all the tool prediction scores. This is used as an integrative tool to make sure that there is no bias arising from SIFT and PolyPhen as an initial step using

Ion Reporter. Mutation Taster, which predicts how potentially disease-causing a variant is 206, is used in conjunction with PredictSNP to establish a more detailed understanding of the effects of certain mutations identified as amino acid sequence changes, frameshifts, and splice sites. All reported mutations are confirmed via Sanger sequencing. Variant relevance is usually determined based on: a) biological relevance, such as the expression of the gene in the brain, as determined by the NCBI Gene database; b) the limited number of individuals with the variant in other databases such as dbSNP and gnomAD; c) existing evidence of the involvement of the variant or the gene in a relevant neurological disorder or pathology; and d) evidence of involvement of the gene in a relevant biological process (i.e. response to concussion or neuroprotective factors).
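
The Mutation Assessor score bands quoted above translate directly into a small lookup. The sketch below uses the thresholds stated in the text (0.8, 1.9 and 3.5). How scores falling exactly on a boundary are assigned is an assumption, as the text does not specify it, and the function name is hypothetical.

```python
def fi_category(score):
    """Map a Mutation Assessor functional impact (FI) score to its impact band."""
    if score < 0.8:
        return "neutral"
    if score <= 1.9:   # boundary handling is an assumption
        return "low"
    if score <= 3.5:   # boundary handling is an assumption
        return "medium"
    return "high"


# Invented example scores, one per band.
example_bands = [fi_category(s) for s in (0.5, 1.0, 2.5, 4.0)]
```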


3.4.2 Functional prediction tools for mutations (PredictSNP, Mutation Taster)

SNPs are extremely common across the human genome. It is easier to predict the

pathogenic effect of SNPs that cause early codon termination or splice site changes than

missense variants. Hence, prediction tools are needed to identify variants based on possible

effects on protein features, splice sites, coding frames, and phenotypes in general. Sorting

Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping (PolyPhen) are two of

the common in-silico bioinformatics tools used to predict the effect of a mutation and its likely

pathogenicity 202. PolyPhen appears to be of more value when exploring variants leading to

loss of function rather than gain of function, whereas SIFT equally predicts both types of

functional changes 207. Ion Reporter offers filtering by the SIFT and PolyPhen functional scores,

which is why they were used in the targeted analyses of Chapters 5 and 6. Further, PredictSNP

is a suitable tool for assessing the potential pathogenicity of a certain variant, as it compiles

the data of 8 of the top established prediction models and metrics; MAPP, nsSNPAnalyzer,

PANTHER, PhD-SNP, PolyPhen-1, PolyPhen-2, SIFT and SNAP 205. A percentage of

deleteriousness or benignity is then produced as an average of all the tool prediction scores.

However, most of these tools depend on a training-set machine learning model, and thus

predictions are not always accurate 208. MutationTaster is another in-silico prediction tool

that predicts how likely a variant is to be disease-causing, rather than its deleteriousness 206.

Consequently, MutationTaster is used in conjunction with PredictSNP to establish a more

detailed understanding of the effects of a certain mutation on amino acid sequences,

frameshifts, and splice sites. Finally, Combined Annotation-Dependent Depletion (CADD),

which can integrate several predictors into a single score 209, has been used in Chapter 8 to

prioritise machine learning input variants. While CADD has many advantages and is the


preferred score in many ways, the Ion Reporter platform at the time of conducting these

experiments was limited to SIFT and PolyPhen.
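As a rough illustration of how a score-based filter of this kind behaves, the following sketch keeps a variant when at least one available score suggests a damaging effect. The cut-off values, the function name, and the example variants are assumptions made for illustration only; they are not the settings used in Ion Reporter for this project.

```python
from typing import Optional

# Illustrative cut-offs (assumptions, not settings taken from Ion Reporter).
SIFT_MAX_DAMAGING = 0.05      # SIFT: lower scores are less tolerated
POLYPHEN_MIN_DAMAGING = 0.85  # PolyPhen: higher scores are more damaging


def predicted_damaging(sift: Optional[float], polyphen: Optional[float]) -> bool:
    """Return True when at least one available score crosses its cut-off."""
    sift_hit = sift is not None and sift <= SIFT_MAX_DAMAGING
    polyphen_hit = polyphen is not None and polyphen >= POLYPHEN_MIN_DAMAGING
    return sift_hit or polyphen_hit


# Hypothetical missense variants: (label, SIFT score, PolyPhen score).
variants = [
    ("variant_A", 0.01, 0.98),  # both tools suggest damage
    ("variant_B", 0.40, 0.10),  # both tools suggest a benign change
    ("variant_C", 0.03, None),  # PolyPhen score unavailable
]
kept = [name for name, sift, pph in variants if predicted_damaging(sift, pph)]
print(kept)
```

Treating a missing score as "no evidence" rather than as benign is one possible design choice; a stricter filter could instead require both scores to be present.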

3.4.3 Variant Frequency tools

The Exome Aggregation Consortium (ExAC) dataset is a collection of exomes aggregated from

several clinical sequencing projects submitted by collaborators around the world. ExAC is now part

of the Genome Aggregation Database (gnomAD), which aims to be a repository for both exome

and genome sequencing data from a wide variety of large-scale sequencing projects. It contains

around 123,136 exome sequences and 15,496 whole-genome sequences from unrelated

individuals so far. dbSNP is a free repository for genetic variants developed and hosted by

the National Center for Biotechnology Information (NCBI) in collaboration with the National

Human Genome Research Institute (NHGRI). It is connected to several clinical databases

where clinical symptoms are recorded along with their respective mutations.

3.4.4 Machine Learning: Ronin and cloud computing

Machine learning requires substantial computational power, as iterations of the same model

can number in the millions 210. WES files provide thousands of variants that are

potentially contributing to the prediction model of PCS. Consequently, handling of the

preprocessing files as well as the modelling needs to be run using High Performance

Computers (HPC). Ronin provides a graphical user interface (GUI) with access to Amazon

Web Services (AWS). The platform allows for creating virtual machines with varying

computing, memory, and storage capacities. This method predominantly relates to Aim 4.


3.4.5 Mitochondrial Analysis of variants

Ion Torrent sequencing amplicons overlap with some of the mitochondrial DNA,

making it possible to re-align the raw BAM files to the human mitochondrial DNA; this is

done in a virtual machine (HPC) using SAMtools 211. This method is used for Aim 3. The

re-aligned FASTQ files are then converted to a smaller mitochondrial-only BAM file, from which

variants are called using VCFtools 212. There are numerous in-silico prediction tools available

to predict the pathogenicity of a given mitochondrial variant (e.g. MitoMap 213) or to identify

an individual’s most likely haplotype based on the haplogroup to which the majority of their

mitochondrial variants belong.

Mitochondrial variants predicted to be pathogenic or deleterious and over-represented

in the study cohort compared to various databases are then confirmed by Sanger sequencing,

as the average coverage of mitochondrial variants extracted from Ion WES data is only around

40-50x, which is considerably less than the average of genomic/exonic variants.

3.4.6 Methylation Analysis

Quantification of methylation levels at a given CpG site is done using the PyroMark

Pyrosequencing platform. This requires preprocessing of the genomic DNA using bisulfite

conversion to allow for methylation quantification using any platform. Genomic DNA is

converted using the Zymo EZ DNA Methylation bisulfite conversion kit 200, as

per the manufacturer’s protocol. Target regions are then amplified by PCR using primers designed by

the Qiagen pyro primer design software. The samples are then analysed following the PyroSeq

on-screen instructions. To minimise sequencing artefacts, samples of comparable demographics

belonging to different cohorts (e.g. recovered concussion patients vs PCS) are analysed in the


same run, which increases the confidence that differences observed are real. This method is

used for Aim 3.
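The quantification step can be illustrated with a small sketch. After bisulfite conversion, unmethylated cytosines are read as thymine while methylated cytosines remain cytosine, so methylation at a CpG site is commonly expressed as the cytosine signal divided by the combined cytosine and thymine signal. The function name and the peak heights below are hypothetical and serve only to show the arithmetic; this is not the PyroMark software's implementation.

```python
def percent_methylation(c_signal: float, t_signal: float) -> float:
    """Return methylation (%) at one CpG site from C and T peak signals."""
    total = c_signal + t_signal
    if total <= 0:
        raise ValueError("C and T signals must sum to a positive value")
    return 100.0 * c_signal / total


# Hypothetical peak heights for the same CpG site in two samples.
recovered = percent_methylation(c_signal=30.0, t_signal=70.0)
pcs = percent_methylation(c_signal=45.0, t_signal=55.0)
print(f"recovered: {recovered:.1f}%  PCS: {pcs:.1f}%")
```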

3.5 LIMITATIONS

This study was expected to face some limitations, particularly around recruitment of

appropriate patient cohorts to enable high-powered statistical analysis. However, the samples

obtained provided enough material for WES-based analysis which depends mostly on

functional rare variants rather than association analysis with polymorphisms. The sample size

of this study is clearly a major limitation to making generalised statements. There were also

limitations in obtaining pre- and post-concussion DNA samples from individuals to

establish statistically significant epigenetic changes. Further limitations will be discussed in

Chapter 9 along with future directions.


Chapter 4: Saliva as a Comparable Source of DNA for WES

This chapter details the results of Aim 1: to develop a protocol for using saliva- and blood-extracted DNA in the same WES analysis on Ion platforms.

Because saliva is easy to collect non-invasively, the DNA samples provided to the research team for some of the cohorts were in the form of saliva specimens. To date, WES has predominantly utilised DNA extracted from blood samples. Hence, it was necessary to develop a protocol and pipelines that ensure data from saliva- and blood-extracted DNA can be used in the same analysis for calling variants. The results presented here were published as

“Ibrahim O, Sutherland HG, Haupt LM, Griffiths LR. Saliva as a comparable-quality source of DNA for Whole Exome Sequencing on Ion platforms. Genomics. 2020;112(2):1437-1443. doi:10.1016/j.ygeno.2019.08.014”. Genomics is a Q1 journal with Impact Factor: 6.2

Abstract

Background: Whole Exome Sequencing (WES) utilises overlapping fragments that are prone to sequencing artefacts. Saliva, a non-invasive source of DNA, has been successfully used in WES studies on various platforms. This study aimed to explore the validity and quality of saliva as a DNA source when compared to whole blood on an Ion platform. Methods: DNA was extracted from both whole blood and saliva from four individuals. WES was performed on the Ion Proton platform. Quality metrics (depth, genotyping quality, etc.) and variant identification were compared within sample pairs from the same donor. Results: No significant differences in quality metrics were identified between data obtained from whole blood and saliva samples, with saliva samples having higher coverage depth in some instances. Variants from the same donor, called from the two genomic DNA sources, had an average concordance on par with other studies on other platforms with different chemistry. Conclusion: Saliva-extracted DNA provides comparable sequencing quality to whole blood for WES on Ion Torrent platforms.


4.1 BACKGROUND

WES is a high throughput sequencing approach that provides rapid sequencing of

exomes and genomes. It is based on targeted enrichment, which aids in mitigating the

limitations arising from using low-yield DNA such as that extracted from saliva 214. According

to Wall and colleagues 215, the most common platforms for whole exome/genome sequencing

are the Illumina (IL) and Complete Genomics (CG) platforms, which have been compared in

only one study 216.

An alternative to these platforms is the Ion Proton suite of Next Generation Sequencing

(NGS) instrumentation located in the Genomics Research Centre (GRC), which was used

in this study. The Ion instrumentation is based on semiconductor chips that detect the change

in pH caused by hydrogen ions released as each nucleotide is incorporated. Ion technology uses chips with discrete wells

connected to sensors as a mechanism for base calling, predicting incorporation signals for the

sequenced bases. This eliminates the need for randomly generated DNA clusters on flow cells 217.

DNA can be extracted from several tissues/cells, the most common being blood and buccal

cells in saliva. DNA extracted from saliva/buccal cells is easily collected and has been used

in various sequencing applications, including Polymerase Chain Reaction (PCR) and NGS.

DNA extracted from saliva often has low yield and contains non-human DNA 214. It is also

assumed that DNA extracted from buccal cells will have fewer mapped reads compared to

that extracted from blood, potentially due to non-human components. Generally, achieving

100% concordance with the reference genome is a practical impossibility, with most DNA

sources providing around 95% mappable reads 217. While a comparison has been performed


using these two DNA sources for an NGS panel on an Ion Proton Platform (the one used for this project) 218, no WES comparison has been previously reported.

Variant Calling

Although sequencing costs have dropped steeply in recent years, allowing for wider use of WES technology, computing costs are not dropping at a similar rate 219. The Ion Torrent Reporter variant caller maps sequence reads to a reference genome (the Hg19 assembly of the human genome) and then uses several algorithms to compare the sample sequence data with the reference, looking for alternate alleles in order to call variants 111, 220. Approximately 0.01% of the sequenced bases will be identified as mismatching the reference genome, with few likely to be identified as true variants 221. O’Rawe and colleagues 222 found that five different pipelines produce only 57% common variants when starting with the same dataset. They compared results from whole exome and genome sequencing data in 4 individuals across the Illumina and CG platforms, calculated genotype discordance rates predominantly depending on GQ scores, and found a discordance rate of 0.152 on the Illumina platform between DNA extracted from saliva and blood samples from the same individual. It should be noted that concordance rates are never 100% even when the same sample has been analysed through different pipelines, with examples of higher discordance than between different samples 222. More recently, a study by Rosenfeld and colleagues 223 found that approximately 19% of variants were unique to one dataset when comparing the same sample in both the 1000 Genomes Project and CG platform data, which is attributed to differences in sequencing technology and chemistry, and to the fact that NGS is based on “reads” that are prone to error. Nevertheless, there are several critical measures to apply while filtering sequencing data to minimise the identification of erroneous variants, predominantly based on the depth of read (depth), genotype quality score (GQ), and the alternate allele ratio 221. Carson and


colleagues 224 attempted to filter genotypes by depth and GQ scores and found improved

concordance rates.
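The three filtering measures named above (depth, GQ, and alternate allele ratio) can be sketched as a simple per-call check. The record structure, the threshold values, and the example calls are illustrative assumptions rather than the filters applied in the cited studies or in this project.

```python
from dataclasses import dataclass


@dataclass
class GenotypeCall:
    variant: str    # hypothetical label for the site
    depth: int      # total reads covering the site
    gq: int         # genotype quality score
    alt_reads: int  # reads supporting the alternate allele


def passes_filters(call, min_depth=20, min_gq=30, min_alt_ratio=0.35):
    """Apply depth, GQ, and alternate allele ratio cut-offs (assumed values)."""
    if call.depth <= 0 or call.depth < min_depth or call.gq < min_gq:
        return False
    return call.alt_reads / call.depth >= min_alt_ratio


calls = [
    GenotypeCall("site_1", depth=120, gq=99, alt_reads=58),  # well supported
    GenotypeCall("site_2", depth=8, gq=40, alt_reads=4),     # too shallow
    GenotypeCall("site_3", depth=90, gq=99, alt_reads=12),   # skewed ratio
]
kept = [c.variant for c in calls if passes_filters(c)]
print(kept)
```

Raising any of the three thresholds trades sensitivity for specificity, so a call that is marginal in one sample of a pair can disappear from that sample only.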

4.2 EXPERIMENTAL AIM

To test whether saliva-extracted DNA has a similar quality of depth, coverage, and

variant detection rate to the DNA extracted from blood on an Ion Proton WES Platform.

This data will be used to develop a protocol for examination of saliva-extracted DNA

by WES and may provide important insight into potential subtle differences in template and

data quality following NGS. Ensuring that there is ample sequencing coverage obtained from

saliva-extracted DNA will allow analysis of the data obtained through saliva samples to be

performed in conjunction with the blood-extracted samples.

4.3 METHODS

Saliva and blood samples were both collected from 4 healthy individuals (2 males and

2 females) and DNA extracted from each sample for use in WES. One male identified as North

African and the remaining 3 identified as Caucasian. The average age was 28.25 (SD = 4.6).

DNA was extracted from Oragene saliva kits according to the manufacturer’s manual, with

DNA extracted from blood samples using the QIAGEN DNeasy kit as per the

manufacturer’s instructions. A standard amount of DNA (75 ng) was used after quantification

by Qubit in all sequencing runs regardless of the DNA source, thus accounting for potential

source yield differences.

WES was performed using the Ion AmpliSeq™ Exome RDY Library Preparation

protocol and the Ion P1™ Hi‐Q Chef Kit protocol as per the manufacturer’s instructions. Each

of the sample pairs (saliva and blood sourced from the same person) was loaded on the same


chip by the Ion Chef, to standardise sequencing and loading conditions pertaining to chip differences, leading to a total of 4 chips. The Ion Proton was then used to run the WES. The Ion Torrent Server generated quality metrics while the Ion Reporter Server was used to call sequence variants.

4.3.1 Analysis

Following WES, the Ion Torrent Server generated quality metrics, which were above the accepted thresholds (reads > 20 M and average base coverage depth > 20-30×), while the Ion Torrent Variant Caller (TVC) was used to call sequence variants. Samples were labelled 1b, 1s, 2b, 2s, 3b, 3s, 4b, and 4s, with “b” denoting DNA extracted from whole blood and “s” denoting DNA extracted from saliva. The Ion Reporter™ software was used to explore variants in each sample pair, using the visualise function, which depicts the variant intersection between two given samples. Filter parameters (genotype quality score and depth) were adjusted to explore the effect on concordance rates. Two lists of discordant variants were created from each sample pair, corresponding to either saliva or whole blood origin. Variants were then arranged by depth, and the variants with the 5 highest and 5 lowest depth scores were loaded for analysis in the Integrative Genomics Viewer (IGV). We further used the bcftools version 1.6 variant caller (mpileup command) to recall the variants of each sample. Next, we used the bcftools stats command to compare the discordance between sample pairs, adding to the robustness of the results. The pipeline script is available upon request.
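The pairwise comparison described above can be illustrated with a minimal sketch that treats each sample's calls as a set and reports the variants unique to each sample, the total, and the resulting concordance. The definition used here (shared variants divided by all variants seen in either sample) is an assumption made for illustration, and the variant keys are hypothetical; this is not the Ion Reporter or bcftools implementation.

```python
def compare_calls(first, second):
    """Summarise the overlap between two sets of variant keys."""
    shared = first & second
    union = first | second
    if not union:
        raise ValueError("both call sets are empty")
    concordance = 100.0 * len(shared) / len(union)
    return {
        "first_only": len(first - second),
        "second_only": len(second - first),
        "total": len(union),
        "concordance_pct": round(concordance, 2),
        "discordance_pct": round(100.0 - concordance, 2),
    }


# Hypothetical calls keyed by (chromosome, position, ref, alt, genotype).
saliva = {
    ("chr1", 1000, "A", "G", "0/1"),
    ("chr2", 2000, "C", "T", "1/1"),
    ("chr3", 3000, "G", "A", "0/1"),
    ("chr4", 4000, "T", "C", "0/1"),
}
blood = {
    ("chr1", 1000, "A", "G", "0/1"),
    ("chr2", 2000, "C", "T", "1/1"),
    ("chr3", 3000, "G", "A", "0/1"),
    ("chr5", 5000, "G", "C", "0/1"),
}
summary = compare_calls(saliva, blood)
print(summary)
```

Including the genotype in the key means that a site called A/A in one sample and A/G in the other counts as discordant, which matches the kind of discrepancy inspected in IGV.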


4.4 RESULTS

Quality Metrics

Table 2 details the quality metrics for each of the 8 exomes. The figures show

that 3 of the 4 saliva samples had a higher number of mapped reads than

their same-donor blood equivalents, while all 4 blood samples had a higher percentage

of on-target reads.

Quality Control of Sequencing

The 4 chips demonstrated consistently high loading of ISPs (average 93-96%

across the 4 chips) and were all within the commonly accepted mean depth and number

of reads figures. An example of good loading quality of libraries on one of the chips is shown

in Figure 1, and the 4 chips’ heat maps are provided in Appendix 2.


Table 2 Quality Control (QC) Metrics of the 8 samples

Metrics were generated by the Ion Torrent Server for saliva and blood samples from each of 4 donors. All

8 samples passed minimum QC thresholds (reads >20 million, average depth >20).

Saliva-extracted DNA has comparable WES quality metrics to blood-extracted DNA.

Sample (DNA source)   Number of mapped reads   Percentage of reads on target (%)   Average base coverage depth   Uniformity of base coverage (%)
1 (S)                 34,067,384               64.74                               64.68                         86.82
1 (B)                 24,110,301               79.24                               56.77                         77.83
2 (S)                 38,997,671               84.64                               105.3                         89.14
2 (B)                 36,843,687               89.31                               104.8                         85.3
3 (S)                 42,068,276               87.21                               116.7                         89.71
3 (B)                 38,093,213               89.02                               107.1                         86.95
4 (S)                 43,768,912*              83.73                               117.5                         91.78
4 (B)                 44,514,504*              91.1                                130.2                         90.21
Mean                  37,807,994               84                                  100                           87
Std. deviation        6151273.734              7.9613581                           24.2742                       4.057579
Minimum               24,110,301               65                                  57                            78

*An observed difference between saliva and blood in one sample pair (same donor, different sources) that departs from the pattern seen in the other 3 sample pairs.
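As a worked check on the depth figures in Table 2, the short sketch below computes the per-donor difference in average base coverage depth (saliva minus blood) and the mean of each source. The values are copied from Table 2; the variable names are illustrative, and the calculation is a descriptive summary rather than the statistical comparison used in the study.

```python
from statistics import mean

# Average base coverage depth per donor (donors 1 to 4), copied from Table 2.
saliva_depth = [64.68, 105.3, 116.7, 117.5]
blood_depth = [56.77, 104.8, 107.1, 130.2]

# Paired differences (saliva minus blood) for each donor.
differences = [s - b for s, b in zip(saliva_depth, blood_depth)]

print("mean saliva depth:", round(mean(saliva_depth), 2))
print("mean blood depth:", round(mean(blood_depth), 2))
print("mean paired difference:", round(mean(differences), 2))
```

Saliva depth exceeds blood depth in three of the four donors, and the positive mean difference reflects the slightly higher average depth of the saliva samples.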


Figure 1 Ion P1 Chip heat map of library loading density.

The red circle represents the part of the chip where Ion Sphere™ Particles (ISPs) containing libraries are loaded. The axis on the right is a heat scale from blue (0%) to red (100%) indicating live ISPs (ISPs with a sequencing library attached). The loading density average generated by the Ion Torrent Server is the figure title. The left and bottom axes represent the physical dimensions of the chip in micrometres.

Variant Calling

Ion Reporter™ software was used to generate variant discordance figures and lists of discordant variants between each sample pair (different sources of DNA from the same person).

For each sample, the discordant variants from one source were arranged in descending order of read depth, to exclude low coverage as a cause of the discordance. The 5 discordant variants with the highest depth and the 5 discordant variants with the lowest depth from each sample were analysed in the Integrative Genomics Viewer (IGV) for both samples (blood and saliva). This is demonstrated in Figure 3, where the two Binary Alignment Map (BAM) files are almost identical, yet one sample was called as wild-type and the other as a heterozygous genotype. For the 5 discordant variants with the highest read depth, the discordance appears to be a variant calling error, as the sequencing was almost identical in each case. For the 5 with the lowest depth, however, there appear to be more sequencing artefacts, as depth was quite low and most reads are unreliable 225 in that bracket of depth. Applying a higher depth cut-off for variant inclusion increased the discordance between the samples, indicating that several true concordant variants had only average depth (25×), whereas lowering the cut-off did not decrease discordance. Figure 2 details the steps of the analysis and Table 3 details the variant concordance between the samples. Other examples are presented in Appendix 3. To increase the robustness of the results, we used bcftools (mpileup command) to recall the variants in the 8 samples and recalculated the concordance rates using vcftools (compare), as demonstrated in Table 4.

Table 3 Variant concordance between different samples and donors.

(s = saliva-extracted sample, b = blood-extracted sample). Average variant concordance between different samples from the same donor is > 90%. Results are based on the Torrent Variant Caller (TVC).

Comparison of samples   Variants in 1st only   Variants in 2nd only   Total variants   Concordance (%)   Discordance (%)
1*s vs. 1*b             3008                   828                    39506            90.15             9.85
2*s vs. 2*b             855                    549                    38084            96.2              3.8
3*s vs. 3*b             837                    544                    38977            96.35             3.65
4*s vs. 4*b             557                    560                    38768            97.05             2.95
1*s vs. 2*s             15448                  14304                  53003            43.86             56.13
1*s vs. 3*s             15201                  14949                  53648            43.8              56.2
1*s vs. 4*s             15204                  14730                  53386            43.93             56.07
1*b vs. 2*s             14462                  15498                  53387            43.88             56.12
1*b vs. 3*s             14252                  16180                  52699            42.25             57.75
1*b vs. 4*s             14300                  16011                  52530            42.3              57.7
2*s vs. 3*s             13201                  14093                  51648            47.15             52.85
2*s vs. 4*s             13113                  13788                  51343            47.61             52.39
2*b vs. 1*s             14086                  15536                  52785            43.88             56.12
2*b vs. 3*s             14230                  13032                  51479            47.04             52.96
2*b vs. 4*s             12905                  13881                  51085            47.43             52.57
3*s vs. 4*s             13712                  13495                  51942            47.62             52.38
3*b vs. 1*s             14699                  15244                  53398            43.92             56.08
3*b vs. 2*s             13916                  13317                  51471            47.09             52.91
3*b vs. 4*s             13589                  13665                  51819            47.41             52.59
4*b vs. 1*s             14788                  15259                  53487            43.82             56.18
4*b vs. 2*s             13796                  13123                  51351            47.58             52.42
4*b vs. 3*s             13459                  13678                  51872            47.68             52.32
1*b vs. 2*b             14476                  15206                  51725            42.62             57.38
1*b vs. 3*b             14285                  15920                  52439            42.4              57.6
1*b vs. 4*b             14284                  15993                  52512            42.34             57.66
2*b vs. 3*b             13089                  13994                  51207            47.11             52.89
2*b vs. 4*b             12962                  13941                  51190            47.44             52.56
3*b vs. 4*b             13590                  13664                  51818            47.4              52.6

*Numbers indicate the donor of saliva or whole blood.

Table 4 Variant concordance using an alternative variant caller and concordance analysis.

Comparison of samples   Variant concordance (%) (recalled by bcftools and compared by vcftools)
1*s vs. 1*b             78
2*s vs. 2*b             79.4
3*s vs. 3*b             78.9
4*s vs. 4*b             80
1*s vs. 2*s             25
1*s vs. 3*s             36
1*s vs. 4*s             40
1*b vs. 2*s             12.8
1*b vs. 3*s             42
1*b vs. 4*s             40
2*s vs. 3*s             21
3*b vs. 1*s             34

*Numbers indicate the donor of saliva or whole blood.

WES
• Library preparation
• Ion Proton sequencing
• QC checks passed

Variants Called
• Variants were called for each sample and exported to VCF files.
• VCFs from the same participant were compared for concordant and discordant variants.

Filters
• Filter parameters were adjusted to explore the effect on concordance rates.

Arrange Discordant Variants
• Two lists of discordant variants were created from each sample pair.
• Variants were arranged by depth, with the variants with the 5 highest and 5 lowest depth scores taken for analysis in IGV.
• Variants were examined for false positives.

Confirmation
• An average measure of concordance between samples from the same participant can be achieved, in line with similar experiments. Discordant variants determined as true positives.

Figure 2 Workflow of the process comparing the utility of saliva-extracted DNA in WES experiments as an equal alternative to blood-extracted DNA.


Figure 3 Screenshot of the Integrative Genomics Viewer (IGV)

IGV visualises BAM files (containing sequencing alignment data before variant calling). This window was loaded with 2 BAM files, one from a saliva sample (top) and another from a blood sample (bottom) of the same participant. A discordant variant (chr15:78,556,571) was called A/A by Ion Reporter in one sample and A/G in the other, despite both samples having the same genotype. Each of the horizontal lines represents one called base. Green represents As and orange represents Gs. The reference sequence (Human genome 19 (Hg19)) is shown at the bottom of the window.


4.5 DISCUSSION

We explored saliva as a DNA source yielding similar sequencing quality to whole

blood-extracted DNA in a WES experiment using an Ion platform. While previous studies

have explored this in NGS targeted-gene panels 218, to our knowledge this is the first study to

compare whole blood and saliva DNA performance in Ion platform WES in healthy unrelated

individuals. WES poses a challenge in obtaining an even depth of coverage across regions,

which ultimately means that each time a region is target-captured, the depth of coverage may

vary 219. This, along with the inconsistent nature of sequencing fragments, makes WES more

prone to sequencing errors than gene-specific NGS. These errors come as no surprise, with

numerous genotyping errors documented in the literature 226-228.

In accordance with Wall and colleagues 215, whole blood- and saliva-extracted genomic

DNA did not yield significant differences in quality and error rates of the sequencing data. In

fact, saliva samples did, on average, have slightly higher coverage depth than their whole

blood counterparts. One of the concerns about using saliva in NGS is the potential of human

microbiome genome reads interfering with the sequencing process. However, current

mapping algorithms will not align those reads to the human reference genome, thereby excluding

them from any further computational processes, such as variant calling. We find WES on an Ion

platform to yield similar results for DNA extracted from saliva and whole blood.

Consequently, we propose that using saliva is an appropriate, less invasive method of

collecting DNA for research (and potentially diagnostic) NGS applications in centres relying

on Ion technology.

Further, there was no bias in the presence of the known Ion platform error types in one

source of DNA over the other. As the Ion platform uses reiterative early prediction of bases,


this may explain why several variants were identified to have different sequences based on a slight variation in the allele ratios. This warrants using a stricter allele ratio filter (e.g. 65-35%) when using these variants in different statistical analyses. Further, the Ion platform has more errors when sequencing insertions or deletions (indels) in homopolymeric DNA 217, which could explain the low-quality errors at the bottom of the arranged list of discordant variants. Of particular interest, the discordance between different-individual samples increases from ~52% to ~56% in all comparisons including 1b or 1s, the samples with lower sequencing quality. This suggests that lower sequencing quality will increase false positives. Further, the discordance rates identified did not decrease with increasing GQ score cut-off or with increasing Depth/GQ. Instead, we noted that discordance levels increased when the quality metric thresholds were changed. This is attributed to variants that TVC identified in both samples with a marginal quality metric in one sample, leading the variant “call” to disappear from the marginal sample upon applying more stringent filters. This is consistent with O’Rawe and colleagues 222, who found that increasing the threshold/stringency of variant callers did not increase SNV concordance between different pipelines with the same sample.

In addition to the Torrent Variant Caller analysis, comparing variant concordance from bcftools variant calling using vcftools yielded similar, although slightly lower, concordance rates between samples from the same donor. This, too, is consistent with the literature presented in the introduction, wherein multiple variant callers will never yield the same results.

We note that the Torrent Variant Caller might be producing better concordance rates due to its prior configuration for the platform. We also note that the concordance rates drop for all 28 sample comparisons, which suggests a role for the variant caller, consistent with the literature. Despite the many advantages of the Ion platform, it has been shown to have disadvantages in variant calling compared to other platforms. For example, using different variant callers on the same sample on an Ion platform will yield higher


discordances than IL platforms 229. Nevertheless, the quality of the data is demonstrated by the fact that

discordance was down to an average of 2% when including only SNVs in the analysis.

Inconsistent coverage of regions seems to be the cause of several variant calling errors.

In particular, the sample 1 pair had, on average, lower-quality reads than the remaining 3,

which might be due to a range of factors, from library preparation to PCR conditions and

reaction quality. Hence it had the highest discordance rate of called variants. Further, we noted

that the same regions had different values for coverage for each sample. This observation is

consistent with that of Kidd and colleagues 214, who found no systematic differences between

saliva sample quality metrics. Furthermore, as the discordant variants were not in specific

regions, there does not appear to be an underlying pattern to coverage discrepancy, as might

be the case with regions that are not “reliably callable” 230. Patel and colleagues 221 suggest that

sequencing artefacts provide a limitation to NGS platforms and utilising the metadata of VCF

files might be a way forward.

There are several challenges that still face DNA base calling and alignment algorithms

231 in rapidly developing NGS technologies. These challenges are further pronounced when

attempting to process high-throughput data from next-generation platforms to variants 232, in

particular where single base read quality influences sequencing and variant calling e.g. Single

Nucleotide Variants (SNVs) 233. Nevertheless, at the current capacity, NGS provides genomics

diagnostic and research data that are crucial for advancing the field of genomics.

4.6 CONCLUSION

DNA extracted from the blood and saliva of the same person yields similar coverage

and depth when examined by WES on an Ion Torrent platform, indicating saliva is a valid

source of DNA for use in WES. Furthermore, there is an average concordance rate of called


variants of approximately 95% between the different sources of DNA sample in the same individual. We are certain that these differences are related to saliva and blood because of the patterns found despite accounting for batch error by sequencing pairs of saliva and blood samples from each individual in the same run.

It is hypothesized that the discordant percentage is due to variant calling artifacts rather than sequencing errors. Further, discordant variants were shown to have allele ratios indicative of false positives, and thus such a variant would not be analysed in any NGS experiment regardless of the sample source. That is, in typical NGS experiments where variants of interest are scrutinised for quality, visualised using software such as IGV, and sometimes confirmed by Sanger sequencing, these variants would not be an issue.


Chapter 5: Rare Variants Implicated in Severe Reactions to Trivial Head Trauma

This chapter presents the details of the first section of Aim 2, to explore rare variants implicated in severe reactions to trivial head trauma. The cohort used for this study consisted of 16 individuals who had presented with concussion-related symptoms following trivial head trauma, including coma-like symptoms and other transient neurological disturbances, consistent with carriers of Familial Hemiplegic Migraine (FHM) mutations. They had been previously screened and found to be negative for mutations in the known FHM genes (CACNA1A, ATP1A2, SCN1A, NOTCH3, and TRESK18), which have also been implicated in conditions with symptomatic overlap with FHM. We hypothesised that individuals in this cohort may be harbouring pathogenic variants which contribute to excessive symptomatology in response to minor head trauma. The results of this chapter were published as “Ibrahim O, Sutherland HG, Maksemous N, Smith R, Haupt LM, Griffiths LR. Exploring Neuronal Vulnerability to Head Trauma Using a Whole Exome Approach. J Neurotrauma. 2020;37(17):1870-1879. doi:10.1089/neu.2019.6962”.

Journal of Neurotrauma is a Q1 journal with Impact Factor: 3.75

Abstract

Background: Mutations in ion channel genes (e.g. CACNA1A, ATP1A2) have been implicated in the development of severe neurological disturbances in response to trivial head trauma. However, a portion of the population presents with similar symptoms and has no mutations in the known genes. Aim: To use Whole Exome Sequencing (WES) to explore rare deleterious variants causing severe neurological disturbance following mild head trauma, with


a focus on Ion Channel genes. Methods: This study included 16 patients who presented with a

range of neurological symptoms following trivial trauma/minor head injuries. They had also

previously tested negative for mutations using the Genomics Research Centre Familial

Hemiplegic Migraine 5-gene diagnostic panel. WES was performed on the Ion Proton (Life

Technologies/ThermoFisher Scientific) instrument. The Ion Reporter server was used for

bioinformatics analysis. Analysis was performed using filters that excluded variants deemed by

Sorting Intolerant From Tolerant (SIFT) and Polymorphism Phenotyping (PolyPhen) scores to

be benign and included only genes related to neural processes and which are highly expressed

in the brain. Candidate variants were explored using other in-silico prediction tools

(PredictSNP, Mutation Taster, and Mutation Assessor). Results: Rare and novel mutations in

Ion Channel genes were found in 5 cases, Neurotransmitter pathway mutations were found in

2, and Ubiquitin related mutations in 4 cases, 2 of whom are related. All the mutations were

predicted to be deleterious. Discussion: Ion channel genes are crucial in restoring neuronal homeostasis following head trauma, along with GABAergic and glutamatergic pathways that modulate neurotoxicity. Ubiquitination of ion channels has also been implicated in several neurological dysfunctions. It is hypothesised that heterozygous deleterious mutations in genes implicated in recessive neurological dysfunctions might place individuals at risk of responding poorly to trivial head trauma due to neurological vulnerability and compromised homeostatic processes.

5.1 BACKGROUND

Neuronal and synaptic ion channels, which include calcium, potassium, and sodium

channels, ATPase transporters, and solute carriers, control the flow of ions and

neurotransmitters through cellular membranes 234. In addition to their normal functions, they

are crucial for restoring neuronal homeostasis in the aftermath of a head impact 235. Ion

channelopathies, a wide range of dysfunctions related to ion channels and ion transporters, have long been documented as one of the main causes of several neurological disturbances. The calcium channel gene, CACNA1A, has been identified as a key gene in which mutations can cause Familial Hemiplegic Migraine (FHM), along with several other FHM-related ion channel genes (ATP1A2, SCN1A). CACNA1A, which encodes the α1 pore-forming subunit of the voltage-gated Ca2+ channel Cav2.1, has been shown to harbour mutations implicated in the development of an array of neurological disturbances and concussion-related symptoms (migraine, epilepsy, coma, and cerebral oedema) following minor or trivial head trauma 22, 236, 237. Recently, mutations in the ATP1A2 gene have also been found in individuals of a large family with an over-representation of concussion incidents 238. Maksemous and colleagues 20 additionally reported 8 rare ATP1A2 variants in individuals who show severe responses to trivial head trauma. In addition, head trauma was found to trigger a myriad of neurological symptoms in individuals with variants in ABCD1, an X-linked adrenomyeloneuropathy-causing gene that is implicated in cellular peroxisome functioning 239. These mutations can change the functionality and structure of ion channels, leading to abnormal responses to minor trauma or mild Traumatic Brain Injury (mTBI) 21. Additionally, ion channels are suggested to play a role in the development of acquired neurological dysfunction (e.g. epilepsy) following environmental disturbances (e.g. head injury or trauma) 240. Further evidence suggests that genotype, particularly that of neuronal genes including KIAA0319 25, BDNF 172-174, and ApoE 30, 156, plays a role in response to head injury and recovery from mTBI.

The aftermath of a head trauma

As detailed in the literature review, head injury or impact leads to cellular distortion and changes in membrane permeability, which contribute to an acute imbalance in extracellular ion gradients, in particular of K+, which is expelled into the extracellular space 42, 51. Glutamate and other excitatory neurotransmitters are then released by neurons, likely due to the imbalance caused by K+ efflux paired with Na+ and Ca2+ influx into the neurons, with the resulting ion imbalances contributing to local and downstream neuronal depolarisation 35. Cellular-level homeostasis can be re-established through ion pump activation, which requires high Adenosine Triphosphate (ATP) expenditure, a process that leads to higher neuronal oxidative stress 51. This energy depletion is characterised by hyper-glycolysis (an above-average increase in glucose utilisation) 241 and interference of Ca2+ ions with cellular and mitochondrial functioning. The high influx of Ca2+ following a TBI has been linked to long-term neuronal dysfunction 105, leading to an array of varied symptoms, with headache and migraine being the most common presentation 61. While mutations in the calcium channel gene (CACNA1A) have been shown to affect neurological response to head trauma in carriers 122, 129, 237, 242, several cases present with similar reactions to head trauma with no mutations present in CACNA1A. In a cohort of patients referred for FHM genetic testing following severe reactions to trivial head trauma, the majority of individuals were shown to be negative for variants in FHM genes. Post-head injury symptoms include migraine and cognitive or affective changes 62. However, the current study focuses on somatic rather than cognitive or behavioural symptoms.

Based on the crucial role ion channels and neurotransmitters play in response to head trauma, we hypothesise that underlying mutations in ion channel and neurotransmitter genes might be implicated in the development of concussion symptoms following minor head trauma, similar to those observed in CACNA1A and ATP1A2 cases. In this study we investigate the role of rare and functional variants in other ion homeostasis genes in relation to neurological symptoms following minor head trauma.

5.2 METHODS

5.2.1. Cohort

The cohort comprised 16 patients who were referred to the Genomics Research Centre (GRC) for diagnostic testing with FHM-like symptoms following minor head trauma but had tested negative for mutations in the known FHM genes through a comprehensive NGS 5-gene panel. Ethical approval was granted for further investigation of the samples that were obtained through the diagnostic facility of the GRC (Approval Number 1400000748). Their clinical presentation is detailed in Table 5.

Table 5 Clinical Notes for 16 cases referred for diagnostic testing of suspected Hemiplegic Migraine with notable varied neurological dysfunctions following minor and/or trivial head trauma.

CASE ID | Age | Sex | Clinical Notes
R170 | 10-18 | M | Confusional migraine following minor head injury.
R211 | 10-18 | M | Severe migraine and ataxia following head injury.
R259 | 43374 | F | Netball hit to right side of temple. Patient described surroundings going black and then quickly recovering. Playing a soccer game made them more confused and presented to ER with hemiplegia. A background of headaches, unilateral throbbing, increases upon sitting up, which typically occur in the morning and are a little bit unpredictable. Photophobia, noise sensitivity, and nausea reported. No other neurological phenomena i.e. aura reported. Both maternal and paternal aunties have bad migraines however neither parents nor siblings or anyone else in the extended family suffer from migraines.
R117 | 0-10 | F | Recurrent episodes of pallor and vomiting following minor head injuries. Slurred speech with inappropriate words. Mild biparietal headache on most days since the injury, these headaches usually last for several minutes and resolve spontaneously. Consistent noise sensitivity and light sensitivity occurred for three weeks post injury. Mother has a history of migraines but there is no other significant family history of neurological problems.
R118 | 0-10 | M | Catastrophic cerebral oedema following trivial injury, mum has had hemiplegic migraines.
R197 | 10-18 | M | Migraine on minimal trauma, has two episodes of confusion and headache post rugby games.
R150 | 0-10 | F | Ischaemic stroke after mild head injury.
R171 | 10-18 | M | Acute confusional migraine after head injury.
R120 | 0-10 | M | Severe bleed following suspected minor fall. Only a relatively minor fall but was followed by malignant cerebral oedema and a relatively small subdural bleed with herniation within half an hour of the fall. There is no history or family history of migraine.
R240 | 10-18 | F | Family history of migraine. Netball hit head, led to headache and confusion.
R110 | 10-18 | M | Repeated attacks of "concussion" after minor head trauma. After minor head trauma he develops a "migraine like" episode with slurred speech, diplopia, headache and vomiting.
R111 | 30-50 | M | Similar to son R110.
R167 | 18-30 | M | Head injury induced migraine.
R206 | 18-30 | M | Left sided numbness following a concussion.
R222 | 10-18 | M | Multiple episodes in a few months of visual disturbance, headache, and vomiting following trivial head trauma, most recent from hitting head against a player's chest at basketball.
R256 | 18-18 | F | An episode of stroke related to minor head trauma, seizure disorder and episodic ataxia.

5.2.2. WES analysis

As per the protocols detailed in Chapter 3, DNA was extracted from saliva and blood samples. WES was performed using the Ion AmpliSeq™ Exome RDY Library Preparation protocol and the Ion PI™ Hi-Q™ Chef Kit protocol as per the manufacturer’s instructions. NGS was performed on the Ion Proton instrument. The Ion Torrent platform was used to generate quality metrics, while Ion Reporter was used to call the variants. Samples had an average of 25 million reads, indicating high quality sequencing, with none having fewer than 20 million reads. An Ion Reporter filter was applied to explore variants in each sample based on gene ontology (neural, neurological, brain, ion channel, neuro*), with Minor Allele Frequency (MAF) below 0.001 according to dbSNP, which is the database that Ion Reporter uses for filtering.

To increase the robustness of the analysis, the 16 exomes were also analysed using the in-house variant browser pipeline, VCF-DART. The pipeline utilises predetermined gene tiers (see Supplementary Appendix S1) as well as scores from various in silico prediction tools to filter variants of interest. The genes in those tiers were selected from established genetic diagnostic tests for several neurological disorders, with the last tier showing all variants that are not present in previous tiers, thus reducing the likelihood of missing any relevant variants in genes not selected in the tiers. Further, the analysis was conducted alongside other neurological exomes from epilepsy, CADASIL syndrome, and mTBI patients with no severe responses to trivial head trauma.
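The filtering logic described in this section can be sketched roughly as follows. This is an illustration only and not the Ion Reporter or VCF-DART implementation: the `Variant` fields, the placeholder gene names, and the rule that a variant is dropped only when both SIFT and PolyPhen call it benign are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

MAF_THRESHOLD = 0.001  # rare-variant cut-off quoted in the text
ONTOLOGY_TERMS = ("neural", "neurological", "brain", "ion channel")


@dataclass
class Variant:
    gene: str
    maf: Optional[float]  # None means absent from the frequency database (novel)
    sift: str             # "D" (deleterious) or "T" (tolerated)
    polyphen: str         # "probably", "possibly" or "benign"
    ontology: str         # free-text ontology annotation (hypothetical field)


def matches_ontology(annotation: str) -> bool:
    """True if the annotation has a listed term or any word starting with 'neuro'."""
    text = annotation.lower()
    if any(term in text for term in ONTOLOGY_TERMS):
        return True
    return any(word.startswith("neuro") for word in text.split())


def passes_filters(variant: Variant) -> bool:
    rare = variant.maf is None or variant.maf < MAF_THRESHOLD
    # Assumption: a variant is dropped only when BOTH tools call it benign.
    benign_by_both = variant.sift == "T" and variant.polyphen == "benign"
    return rare and not benign_by_both and matches_ontology(variant.ontology)


# Placeholder variants, not data from the cohort.
candidates = [
    Variant("GENE_A", 0.00006, "D", "probably", "ion channel activity"),
    Variant("GENE_B", 0.05, "D", "probably", "ion channel activity"),  # too common
    Variant("GENE_C", None, "T", "benign", "neuron projection"),       # benign by both
    Variant("GENE_D", None, "D", "benign", "brain development"),       # kept: SIFT says D
    Variant("GENE_E", None, "D", "probably", "keratin filament"),      # ontology mismatch
]
kept = [v.gene for v in candidates if passes_filters(v)]
print(kept)
```

Treating novel variants (no database frequency) as rare is a design choice of this sketch; a pipeline that required an explicit frequency would discard them instead.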

5.3 RESULTS

In order to identify variants in ion channel and neuronal genes that might be involved in concussion response, WES was performed in 16 individuals who were referred for diagnostic testing after a severe response to a minor head trauma, but for whom no likely pathogenic variants were detected by full exonic sequencing of the known FHM genes. Quality metrics were above the accepted thresholds 243 for all samples assayed, as detailed in Table 6. Samples had an average of 33 million reads, with a minimum of 21 million reads. The average mean coverage depth was 99X across all samples, with a minimum coverage depth of 61X. Ion Torrent exome fragments are 200 base pairs (bp) long; the average read length in the samples was 187 bp, with a range of 19 bp and SD of 5.2 bp, indicating an acceptable level of fragment integrity. Samples had an average of 37,550 called variants and a minimum of 35,600 variants.

Ion Reporter has an ontology filter, wherein gene expression, function, and gene families can be used to filter genes. This filter is based on the Gene Ontology (GO) database 244, which is one of the most comprehensive ontology resources available online. Following our hypothesis, which is based primarily on ion channel genes, we included the following gene ontologies in our analyses: neural, neurological, brain, ion channel, and neuro* (a search expression matching any word starting with the prefix neuro). To identify rare and novel mutations, only variants with Minor Allele Frequency (MAF) below 0.001 were included. SIFT and PolyPhen-2, two complementary pathogenicity prediction tools, were used to filter variants in Ion Reporter to ensure that the mutations included had a predicted functional impact. It is worth mentioning that only one variant (CACNA1C p.Ile662Leu) was predicted to be benign by PolyPhen, which was reflected in the lowest deleterious prediction score in Table 8a (82%). However, as all other in-silico tools predicted this variant to be damaging, it was still included. Each case had an average of 25 filtered variants. The 16 filtered Variant Call Format (VCF) files were screened for genes carrying variants in more than one case. However, as none were identified, each case was assessed individually to identify gene variants most likely to contribute to poor response to head trauma as outlined below.
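The screen for genes carrying variants in more than one case can be illustrated with a small sketch. The function name, the family label `FAM_A`, and the reduced gene sets (taken from the candidate genes reported later in this chapter rather than from the full filtered VCFs) are assumptions for illustration; related cases are grouped so that a gene shared only within one family is not counted as recurrent.

```python
from collections import defaultdict


def genes_shared_between_families(genes_by_case, family_of=None):
    """Return genes with filtered variants in more than one unrelated family."""
    family_of = family_of or {}
    families_by_gene = defaultdict(set)
    for case, genes in genes_by_case.items():
        family = family_of.get(case, case)  # an unrelated case is its own family
        for gene in genes:
            families_by_gene[gene].add(family)
    return {
        gene: sorted(families)
        for gene, families in families_by_gene.items()
        if len(families) > 1
    }


# Illustrative subset of cases and candidate genes.
genes_by_case = {
    "R170": {"ATP10A"},
    "R211": {"ATP7B", "CACNA1I"},
    "R259": {"CACNA1C"},
    "R110": {"SQSTM1"},
    "R111": {"SQSTM1"},
}
family_of = {"R110": "FAM_A", "R111": "FAM_A"}  # hypothetical label: father and son

shared = genes_shared_between_families(genes_by_case, family_of)
print(shared)
```

Without the family mapping, SQSTM1 would be reported as shared between R110 and R111, which is why relatedness is handled explicitly here.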

Table 6: Quality Metrics of WES by Ion Proton

CASE ID | Mapped Reads (Millions) | Number of Variants | Mean Read Length (bp) | Mean Depth
R170 | 22.32 | 36432 | 177 | 61.7
R211 | 36.67 | 38092 | 192 | 112.9
R259 | 32.1 | 37658 | 190 | 96.71
R117 | 37.22 | 38028 | 190 | 111
R118 | 35.24 | 37166 | 188 | 100
R197 | 38.27 | 38119 | 186 | 112.7
R150 | 21.7 | 38563 | 186 | 65.6
R171 | 32.44 | 37356 | 185 | 95.5
R120 | 47.17 | 37912 | 187 | 140
R240 | 40.3 | 37635 | 192 | 123
R110 | 26.6 | 37431 | 189 | 82.34
R111 | 30.8 | 37451 | 187 | 94.8
R167 | 29 | 37394 | 193 | 88.4
R206 | 27.8 | 37595 | 181 | 79.5
R222 | 41.1 | 38347 | 174 | 114.7
R256 | 39.3 | 35628 | 191 | 119.5
MEAN | 33.6269 | 37550.4 | 186.75 | 99.8969
SD | 6.8414 | 699.854 | 5.21416 | 20.538
Min | 21.7 | 35628 | 174 | 61.7
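As a cross-check, the summary rows of Table 6 can be recomputed from the per-sample values. The sketch below does this for the read length and mean depth columns; the names `READ_LENGTH_BP`, `MEAN_DEPTH` and `summarise` are introduced here for illustration. Using the population standard deviation is an assumption, chosen because it reproduces the tabulated SD for read length.

```python
from statistics import mean, pstdev

# Per-sample values transcribed from Table 6 (R170 ... R256, in table order).
READ_LENGTH_BP = [177, 192, 190, 190, 188, 186, 186, 185,
                  187, 192, 189, 187, 193, 181, 174, 191]
MEAN_DEPTH = [61.7, 112.9, 96.71, 111, 100, 112.7, 65.6, 95.5,
              140, 123, 82.34, 94.8, 88.4, 79.5, 114.7, 119.5]


def summarise(values):
    """Mean, population SD, minimum and maximum of a column."""
    return {
        "mean": mean(values),
        "sd": pstdev(values),  # population SD, divisor n rather than n - 1
        "min": min(values),
        "max": max(values),
    }


read_length = summarise(READ_LENGTH_BP)
depth = summarise(MEAN_DEPTH)

print(f"read length: mean {read_length['mean']:.2f} bp, SD {read_length['sd']:.2f} bp, "
      f"range {read_length['max'] - read_length['min']} bp")
print(f"mean depth: {depth['mean']:.1f}X, minimum {depth['min']}X")
```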

Rare and novel variants were found in 12 cases, including in genes encoding potassium channels (KCNJ10 and KCNAB1), calcium channels (CACNA1I and CACNA1C), ATPases (ATP10A and ATP7B), solute carriers (SLC26A4), and neurotransmitter receptors (GABRG1 and GRIK1). Furthermore, in 4 of the cases, while no relevant variants in ion channel or neurotransmitter genes were found, variants in ubiquitin-related genes (SQSTM1, HECTD1, and TRIM2), which were included as part of the neuronal ontology filter, were detected and are of potential interest. The MAFs of each of the identified variants are summarised in Table 7.

Table 7 Minor Allele Frequencies (MAFs) of variants deemed relevant to symptoms by WES analysis

CASE ID | Gene | Variant | gnomAD (v2.1)
R170 | ATP10A | rs142704035 | 19 in 277134 (0.00006)
R211 | ATP7B | rs751710854 | 7 in 246244 (0.00002)
R211 | CACNA1I | rs751729397 | 4 in R117482 (0.00001)
R259 | CACNA1C | chr12:2690844 | -
R117 | KCNJ10 | rs138457635 | 97 in 277144 (0.00035)
R118 | KCNAB1 | No rs; 3:156232893 C/G | 1 in 239230 (0.000004)
R197 | SLC26A4 | rs111033199 | 44 in 276848 (0.0001)
R150 | GABRG1 | rs759786658 | 4 in 274046 (0.00001)
R171 | GRIK1 | rs757997768 | 1 in 246130 (0.000004)
R120 | TRIM2 | No rs; chr4:154191614G>C | -
R240 | HECTD1 | rs371260055 | 13 in 276216 (0.00004)
R110, R111 | SQSTM1 | rs771966860 | 4 in 246230 (0.00001)
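The frequencies in Table 7 follow from dividing the observed allele count by the total number of alleles genotyped in gnomAD. A minimal sketch of that arithmetic is given below, using the rows of the table whose counts are fully legible; the dictionary and function names are introduced here for illustration, and the 0.001 threshold is the rarity cut-off used in this chapter.

```python
MAF_THRESHOLD = 0.001

# (allele count, allele number) pairs transcribed from Table 7.
GNOMAD_COUNTS = {
    "ATP10A rs142704035": (19, 277134),
    "ATP7B rs751710854": (7, 246244),
    "KCNJ10 rs138457635": (97, 277144),
    "KCNAB1 3:156232893 C/G": (1, 239230),
    "SLC26A4 rs111033199": (44, 276848),
    "GABRG1 rs759786658": (4, 274046),
    "GRIK1 rs757997768": (1, 246130),
    "HECTD1 rs371260055": (13, 276216),
    "SQSTM1 rs771966860": (4, 246230),
}


def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Frequency of the alternate allele among all genotyped alleles."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


frequencies = {
    variant: allele_frequency(count, number)
    for variant, (count, number) in GNOMAD_COUNTS.items()
}
for variant, frequency in sorted(frequencies.items(), key=lambda item: item[1]):
    status = "rare" if frequency < MAF_THRESHOLD else "common"
    print(f"{variant}: {frequency:.6f} ({status})")
```

The computed values are slightly higher than some of the tabulated ones (for example 19/277134 is about 0.00007), which suggests the table reports truncated rather than rounded frequencies.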

Scores from in-silico prediction tools are reported in Table 8 for all the variants. All mutations identified via WES were predicted to be over 80% deleterious by PredictSNP2 and predicted to be disease-causing (not a polymorphism) by Mutation Taster. Despite the variation in symptoms among the cases included in this study, the cohort can be categorised into four groups of main symptomatology: migraine, light and noise sensitivity, hemiparesis or hemiplegia, and cerebral oedema or stroke. Using an alternative variant explorer (VCF-DART) 245 produced results identical to those reached using Ion Reporter. This was assessed by exploring the variants with MAF less than 0.01 that were predicted to be deleterious in the gene lists of Appendix 4. In the 2 cases where the variant most relevant to the presentation was not in the gene lists, it was identified in the last tier, where variants predicted to be highly deleterious in any additional genes are presented.

Of the 7 cases who developed migraine following minor head injuries, 2 had ATPase-related variants. The variant in ATP10A has a high predicted functional impact and a MAF of 0.00006, while the ATP7B variant has a medium predicted functional impact and an even lower MAF of 0.00002. In the related cases R110 and R111 (father and son as per clinical records), a Sequestosome 1 (SQSTM1) mutation was identified (MAF = 0.00001). Another case, R197, was found to have an SLC26A4 mutation that is a known pathogenic variant for Pendred syndrome according to HGMD and dbSNP (gnomAD MAF = 0.0001). Finally, a glutamate receptor (GRIK1) mutation found in only one other individual in gnomAD (MAF = 0.000004) was found in case R171.

There were 2 cases who developed light and noise sensitivity following their minor head injuries. The first (R259) had a novel CACNA1C mutation. The other (R117) had a mutation in KCNJ10, which has been reported as a variant of uncertain significance for the autosomal recessive Seizures, Sensorineural deafness, Ataxia, Mental retardation, and Electrolyte imbalance (SeSAME) syndrome and has a slightly higher MAF (0.0003) than the rest of the variants found in this cohort. A third group presented with cerebral oedema and ischaemic stroke following head injuries. One case had a novel variant (not previously reported in gnomAD, ClinVar, or dbSNP) in TRIM2, one had a GABRG1 variant with a MAF of 0.00001, and one had a KCNAB1 variant that does not have an assigned rs number in dbSNP.

Table 8 Functional Scores

Table 8a: Predicted deleteriousness by several in-silico tools for the variants found in Ion Channel genes

CASE ID | Gene | Transcript | Variant/AA change | SIFT | Polyphen | PredictSNP2 | Mutation Taster | Mutation Assessor (FI score)
R170 | ATP10A | NM_024490.3 | c.2642C>T / p.Ala881Val | D | Probably Damaging | 87% D | D/AA/Protein/SS | High (3.98)
R211 | ATP7B | NM_000053.3 | c.2383C>T / p.Leu795Phe | D | Probably Damaging | 82% D | D/AA/Protein/SS/Known potential disease mutation (HGMD CM970141) | Medium (3.35)
R211 | CACNA1I | NM_021096.3 | c.331C>G / p.Arg111Gly | D | Probably Damaging | 87% D | D/AA/Protein/SS | NA
R259 | CACNA1C | NM_199460.2 | c.1984A>C / p.Ile662Leu | D | B | 82% D | D/AA/Protein/SS | NA
R117 | KCNJ10 | NM_002R120.4 | c.52C>T / p.Arg18Trp | D | Possibly Damaging | 87% D | D/AA/Protein/SS | Medium (2.11)
R118 | KCNAB1 | NM_172160.2 | c.749C>G / p.Ala250Gly | D | Probably Damaging | 87% D | D/AA/Protein/SS | Medium (2.915)
R197 | SLC26A4 | NM_000441.1 | c.412G>A / p.Val138Ile | D | Probably Damaging | 87% D | D/AA/Protein/SS | Medium (3.215)

Table 8b: Predicted deleteriousness by several in-silico tools for the variants found in neurotransmitter genes

CASE ID | Gene | Transcript | Variant/AA change | SIFT | Polyphen | PredictSNP2 | Mutation Taster | Mutation Assessor (FI score)
R150 | GABRG1 | NM_173536.3 | c.137A>G / p.Asp46Gly | D | Probably Damaging | 87% D | D/AA/Protein/SS | Medium (2.11)
R171 | GRIK1 | NM_000830.4 | c.1282A>T / p.Asn428Tyr | D | Probably Damaging | 87% D | D/AA/Protein/SS | Medium (3.43)

Table 8c: Predicted deleteriousness by several in-silico tools for the variants found in ubiquitin genes

CASE ID | Gene | Transcript | Variant/AA change | SIFT | Polyphen | PredictSNP2 | Mutation Taster | Mutation Assessor (FI score)
R120 | TRIM2 | NM_015271.3 | c.158G>C / p.Cys53Ser | D | Probably Damaging | 87% D | D/AA/Protein/SS | High (4.6)
R240 | HECTD1 | NM_015382.3 | c.5000G>A / p.Arg1667His | D | Probably Damaging | 87% D | D/AA/Protein/SS | Low (1.83)
R110, R111 | SQSTM1 | NM_003900.4 | c.1210A>G / p.Met404Val | D | Possibly Damaging | 87% D | D/AA/Protein/SS | Medium (2.98)

Table 8 (a, b, c) key
SIFT: T = Tolerated; D = Deleterious
Polyphen: B = Predicted Benign
PredictSNP2: D = Deleterious; DL = Deleterious Low Confidence
Mutation Taster: D = Disease causing; AA = Amino acid changes; Protein = Protein feature might be affected; SS = Splice site changes
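The Mutation Assessor categories in Table 8 are derived from the functional impact (FI) score. The sketch below maps scores to categories using cut-offs of 0.8, 1.935 and 3.5; these values are the ones commonly quoted for Mutation Assessor and are an assumption here, since the chapter does not state the thresholds. With these cut-offs, every scored variant in Table 8 receives the category reported in the table.

```python
# Assumed cut-offs (not stated in this chapter): commonly quoted
# Mutation Assessor boundaries between neutral/low/medium/high impact.
LOW_CUTOFF = 0.8
MEDIUM_CUTOFF = 1.935
HIGH_CUTOFF = 3.5


def fi_category(score: float) -> str:
    """Map a Mutation Assessor FI score to its impact category."""
    if score > HIGH_CUTOFF:
        return "High"
    if score > MEDIUM_CUTOFF:
        return "Medium"
    if score > LOW_CUTOFF:
        return "Low"
    return "Neutral"


# (FI score, category) pairs transcribed from Tables 8a to 8c.
TABLE_8_SCORES = {
    "ATP10A": (3.98, "High"),
    "ATP7B": (3.35, "Medium"),
    "KCNJ10": (2.11, "Medium"),
    "KCNAB1": (2.915, "Medium"),
    "SLC26A4": (3.215, "Medium"),
    "GABRG1": (2.11, "Medium"),
    "GRIK1": (3.43, "Medium"),
    "TRIM2": (4.6, "High"),
    "HECTD1": (1.83, "Low"),
    "SQSTM1": (2.98, "Medium"),
}

mismatches = {
    gene: (score, reported, fi_category(score))
    for gene, (score, reported) in TABLE_8_SCORES.items()
    if fi_category(score) != reported
}
print(mismatches or "all tabulated categories match the assumed cut-offs")
```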

Network and Pathway analysis

KCNJ10, KCND3, and CACNA1I are co-localised, that is, expressed in the same tissue 246, and ATP10A, SLC26A4, and CACNA1C have predicted genetic interactions, with ATP10A sharing 2 protein domains with ATP7B 247. KCNAB1 has a predicted genetic interaction with CACNA1A, while KCNB3 interacts with ATP1A2 247, which has strong links to trivial trauma response. Pathways visualised with the GeneMANIA interaction visualisation tool are shown in Figure 5. Further, gene set enrichment analysis results are presented in Table 9.

Table 9 Gene Set Enrichment Analysis Top Results

Enriched Category name | ID | Genes
Ion Channel Transport | R-HSA-983712 | ATP1A2/ATP7B/ATP10A
Ion Transport by P-type ATPases | R-HSA-936837 | ATP1A2/ATP7B/ATP10A
NCAM1 interactions | R-HSA-419037 | CACNA1I/CACNA1C
Integration of energy metabolism | R-HSA-163685 | CACNA1I/CACNA1C
Transmission across Chemical Synapses | R-HSA-112315 | GRIK1/KCNJ10/CACNA1A

No Filter (101839 variants, 18160 genes)
Filter 1: MAF < 0.01 (21903 variants, 10755 genes)
Filter 2: In-silico prediction (2091 variants, 1891 genes)
Filter 3: Gene Ontology (Neuro*, Neura*, Ion Channels) (151 variants, 135 genes)
Filter 4: Post hoc manual selection based on biological relevance and literature

Figure 4 Variant filtering pipeline to explore relevant variants in patients with severe reaction to trivial head trauma (n=16) using Ion Reporter software.
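The effect of each step in Figure 4 can be expressed as the fraction of variants retained relative to the previous step. The short sketch below computes this from the counts shown in the figure; the `FUNNEL` list and `retention` helper are names introduced here for illustration, and the manual selection step is omitted because the figure gives no count for it.

```python
# (step label, variants remaining, genes remaining) from Figure 4.
FUNNEL = [
    ("No filter", 101839, 18160),
    ("MAF < 0.01", 21903, 10755),
    ("In-silico prediction", 2091, 1891),
    ("Gene ontology", 151, 135),
]


def retention(steps):
    """Fraction of variants kept at each step relative to the previous step."""
    return [
        (label, count / previous_count)
        for (_, previous_count, _), (label, count, _) in zip(steps, steps[1:])
    ]


step_retention = retention(FUNNEL)
overall = FUNNEL[-1][1] / FUNNEL[0][1]

for label, fraction in step_retention:
    print(f"{label}: {fraction:.1%} of the previous step retained")
print(f"overall: {overall:.2%} of all called variants retained")
```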

Figure 5: Pathway analysis for candidate genes based on the GeneMANIA tool.

5.4 DISCUSSION

This chapter detailed Whole Exome Sequencing of 16 individuals who were referred for genetic testing for suspected FHM mutations. Most individuals of this cohort were found to have deleterious variants that are either known to be pathogenic for autosomal recessive neurological disorders, or in genes where other mutations are known to cause neurological disorders with symptoms like those developing following trivial head trauma. No variants were found to be present in multiple unrelated samples, nor were there any shared variants at the gene level. We hypothesised that ion channel genes would harbour rare deleterious mutations in individuals who responded severely to trivial trauma. This was supported by finding likely pathogenic mutations in potassium channel, ATPase, and solute carrier genes. The hypothesis of genetic risk modulating the severity of acquired neurological dysfunction has already been demonstrated in animal models in the case of epilepsy 248. Further, our results suggest that heterozygous mutations in genes underlying autosomal recessive neurological disorders could be contributing to subtle neural changes that are exacerbated by a certain trigger (e.g. head trauma). In addition, recent studies have investigated a potential relationship between multiple concussions and neurodegenerative diseases 28, 31, which sheds further light on the mutations found in ubiquitin genes that are implicated in neurodegeneration. We hypothesise that the relationship is not linear, but rather stems from a neuronal vulnerability caused by shared genetic pathways.

5.4.1 Potassium Channel genes

The mutation found in KCNJ10 is a documented potential pathogenic variant for the autosomal recessive Seizures, Sensorineural deafness, Ataxia, Mental retardation, and Electrolyte imbalance (SeSAME) syndrome. Our patient presented with recurrent episodes of pallor and vomiting following minor head injuries, as well as slurred speech with inappropriate words. After that she had a mild biparietal headache on most days; these headaches usually lasted for several minutes and resolved spontaneously. She was sensitive to loud noises and to light for three weeks following the event. Additionally, the mother has a history of migraines but there was no other significant family history of neurological problems. A SeSAME-like syndrome is to be expected in heterozygous carriers of other protein-changing variants in KCNJ10 (Sala-Rabanal, Kucheryavykh, Skatchkov, Eaton, & Nichols, 2010). The mutation changes the positively charged Arginine to polar-neutral Tryptophan (R18W), with a medium functional impact on the protein (Mutation Assessor: 2.11). K+ channels, predominantly Kv4.1, are crucial for clearing potassium following excitability and avoiding abnormal brain function (Isaksen & Lykke-Hartmann, 2016). Further, Kir4.1, encoded by KCNJ10, is an inwardly rectifying potassium channel that restores the negative resting potential and is involved in clearing extracellular glutamate (Mendez-Gonzalez et al., 2016). Consequently, impairment of glutamate uptake due to Kir4.1 dysfunction has been implicated in seizures (Inyushin et al., 2010). Interestingly, mutations in KCNJ10 have been found to impair K+ and glutamate uptake in astrocytes (Inyushin et al., 2010), which could explain the presentation of the patient.

Mutations in KCNAB1, which is strongly transcribed in the dorsolateral prefrontal cortex, have been found to cause epileptic encephalopathy 249. The case with a KCNAB1 mutation suffered a catastrophic cerebral oedema following trivial injury. The mutation is an Alanine to Glycine amino acid change (A250G), with a medium functional impact (FI score = 2.915). As per the 3D modelling in Appendix 5, this mutation changes 4 different sites in the structure of the protein tetramer, potentially affecting its folding.

5.4.2 ATPase genes

Mutations in ATP10A have been found in individuals with Autism (which shares some pathways with migraine 250) and Angelman Syndrome. Russo and colleagues suggested that ATP10A imprinting is linked to Migraine with Aura 251. They found that the region 15q11-q13, which contains the maternally imprinted gene ATP10A, contributes to migraine with aura susceptibility. The case with an ATP10A variant presented with confusional migraine following minor head injury. The variant is an Alanine to Valine mutation with a high functional impact (3.985), which is surprising given that this is a conservative change.

The mutation we found in ATP7B is documented as a likely pathogenic allele for Wilson syndrome, which is an autosomal recessive copper metabolism disorder. Wilson Disease is characterised by neurological dysfunctions as the primary presentation in patients in their 20s and 30s 252. Further, copper ion imbalance has been linked to migraine and ataxia 253, 254, and dysfunctional copper metabolism has been linked to axonal neuropathy 255. The patient with the ATP7B mutation presented with severe migraine and ataxia following head injury (an extreme stressor). The ATP7B gene has several domains, predominantly coding for copper-binding proteins 252. Heterozygous ATP7B mutations have also been found to cause symptoms 256. Further, SNPs in ATP7B have been linked to Alzheimer’s Disease 257, which further supports the gene’s implication in neural processes. This mutation (L795F) has a predicted medium functional impact on the protein (FI score 3.385). This patient also had a mutation in a calcium channel gene (CACNA1I), which is discussed in Section 5.4.3.

5.4.3 Calcium Channel genes

Voltage-gated calcium channels have been previously implicated in epilepsy, hemiplegic migraine, and schizophrenia 258, 259. Further, extracellular calcium levels influence the level of neurotransmitter secretion 260. CACNA1A remains the most explored gene with regard to trivial head trauma response. However, mutations in CACNA1A can have heterogeneous neurological presentations, as in the example of a family harbouring the same mutation yet developing varying symptoms including migraine, hemiplegia, coma, and progressive cerebellar ataxia 261, which are all symptoms present in this cohort. Calcium channels, primarily voltage-gated calcium channels (VGCC), have been established to have neuroprotective characteristics 103. Hence, calcium channel blockers have been trialled for treatment of TBI 104, 105.

A novel deleterious CACNA1C mutation (p.Ile662Leu) was detected in an individual presenting with prolonged migraines and photophobia following a hit on the head during netball. CACNA1C controls calcium influx into cells and has been implicated in 5 major neuropsychiatric disorders 262. CACNA1C mechanisms have also been implicated in the survival of young neurons in the hippocampus and in the forebrains of animal models 263. Although the I662L mutation has a low functional score (1.33), it is predicted to be deleterious by most prediction tools.

A rare CACNA1I mutation was detected in a patient with severe migraine and ataxia following head injury. CACNA1I has been implicated in neurological and neuropsychiatric disorders including arthritis and schizophrenia. This mutation was deemed mosaic by Sanger sequencing, which might suggest a tissue-specific impact. This patient also had the Wilson syndrome ATP7B mutation described above. Although the CACNA1I variant has a low predicted functional impact, it is still predicted to be disease-causing and splice site changing. Mutations in this gene have been found to disrupt Cav3.3 channel activity 264, impacting sleep spindle activity. Cav3.3 channels are mostly expressed in GABAergic neurons of the thalamic reticular nucleus (TRN) 265. Most relevantly, these channels are most active upon transient membrane hyperpolarisations, where they regulate rebound bursting 266.

5.4.4 Solute Carriers

One patient had a known disease-causing heterozygous SLC26A4 mutation. This patient developed migraine on minimal trauma, and confusion and headache after rugby games, which could be the trigger for the effects of the pathogenic allele to manifest. This mutation is a known pathogenic allele for Pendred syndrome, a recessive vestibular disorder that presents with several symptoms including enlarged vestibular aqueduct and inner ear abnormality 267. Further, vestibular migraines are quite common among vestibular dysfunctions 268, which are associated with enlarged vestibular aqueducts and associated sensorineural dysfunctions 269. The V138F mutation has a medium predicted functional impact on the protein (FI score 3.215).

5.4.5 Neurotransmission

Gamma-aminobutyric acid (GABA) type-A receptor gamma1 subunit (GABRG1),

encodes a protein that is part of the ligand-gated ion channel family. It is highly and solely

expressed in the brain. Mutations in this gene have been implicated in epilepsy and seizure

cases 270. Following brain injuries, a significant change in the expression of the neurotransmitter

GABA related genes has been documented 271. Drexel and colleagues 272 found an abnormal

expression of the GABA receptor subunits, particularly in post-synaptic neurons following a

brain injury, including GABRG1 and GABRA6. In addition, GABA agonists have been found

to protect against ischemia following injury 273.

The case with a GABRG1 mutation had developed an ischemic stroke following a mild

head injury, which could indicate impaired neuroprotective function of GABRG1. The D46G

mutation has a medium impact as predicted by Mutation Assessor (FL score 2.11).

The principle of neuroprotection depends on balance between excitatory (e.g. glutamate)

and inhibitory (e.g. GABA) neurotransmitters 274. Consequently, impairments in glutamate

receptors could be hypothesised to disturb this balance and put neurons at risk following any

stress (e.g. trauma). Glutamate levels rise dramatically following impact or stroke and are

hypothesised to play a role in the aftermath of TBI, with glutamate levels responsible for certain


levels of neurotoxicity. Glutamate and γ-aminobutyric acid (GABA) act as primary excitatory

and inhibitory neurotransmitters in the brain and play important roles in the concussion process,

particularly with the indiscriminate glutamate release that leads to increased intracellular Ca2+

after concussion 140. Glutamate is also believed to contribute to secondary injuries through

excitotoxicity 141. GRIK1, the glutamate ionotropic receptor kainate type subunit 1, has been

implicated in epilepsy-related neurological disorders 275. More relevant, however, is that

this gene encodes an ionotropic glutamate receptor subunit; ionotropic glutamate receptors, including NMDA receptors, have been found to play a role in migraine

pathogenesis 276. Mutations in the GRIK1 gene have been found in epilepsy cases, as well as

linked to Ca2+ deregulation following brain injury. The latter finding specifically correlates

with the presentation of the patient with a GRIK1 mutation who presented with confusional

migraine following minor head injury. This mutation results in a N428Y change, with a medium

functional impact predicted, and FL score of 3.43.

5.4.6 Ubiquitin

Through analysis of filtered variants in neuronal-related genes, 4 cases (including a father

and son) were found to have deleterious mutations in the ubiquitin-related genes SQSTM1, TRIM2,

and HECTD1, and none had mutations of interest in ion channel or neurotransmitter genes.

Sequestosome 1 (SQSTM1) encodes the ubiquitin-binding protein p62, and its mutations have been

implicated in frontotemporal dementia and/or amyotrophic lateral sclerosis 277-280. Several

studies found that the mechanism by which p62 deficiency contributes to neuropathology is

via impairment of complex I mitochondrial respiration. This has been replicated in other studies

281, 282. SQSTM1 mutations have also been implicated in neurodegeneration with ataxia, dystonia, and gaze

palsy, mostly linked to autophagy, and mitophagy in particular 283, 284. This pathogenic

mutation is present in both a father and a son with a similar presentation, wherein after a minor

head trauma they have “migraine like” episodes, with slurred speech, diplopia, headache and


vomiting. These neurological disturbances seem to overlap with those that may arise from the heterogeneous set of disorders in which this gene has been implicated. This mutation (M404V) has a medium functional impact (FL score 2.98). From modelling, it appears to be in the middle of the alpha-carbon helix, which indicates folding implications 285.

TRIM2 has a neuroprotective role and is an E3 ubiquitin ligase in proteasome-mediated degradation of target proteins. Loss-of-function mutations in this gene have been implicated in early-onset axonal neuropathy 286 and autosomal recessive Charcot-Marie-Tooth disease 287.

Thompson and colleagues 288 demonstrated that one of the pathways for neuroprotection is that

TRIM2 ubiquitinates a cell death mediator (Bim/Bcl-2), and suppressing expression of TRIM2 blocked ubiquitination of the cell-death mediator and blocked any neuroprotective processes.

Further, Boone and colleagues 289 found that TRIM2 expression is suppressed in dying hippocampal cells following Traumatic Brain Injury (TBI). This patient suffered from a severe bleed following a minor fall, causing malignant oedema and subdural bleed. Impairments of the ubiquitin proteasome system are implicated in neurodegeneration 290. Homozygous animal models of compromised TRIM2 brain expression suffer from axonal swelling and accumulation of disorganized neurofilaments and microtubules 291. Consequently, the effect of a heterozygote mutation affecting the integrity of TRIM2 ubiquitination properties might be exacerbated in the aftermath of an external trigger (a fall, in our case). In other words, while a full-fledged axonal neuropathy is unlikely in our case, the deleterious variant (C26S, FL score 4.6, predicted high functional impact) affects TRIM2 protein function and is likely to contribute to neurofilament dysregulation.

HECTD1 is, like TRIM2, an E3 ubiquitin ligase, with mutations implicated in Autism

Spectrum disorders and neurodegeneration. HECTD1 plays a role in the development of the head mesenchyme and neural tube closure 292. Mutations in this gene have been linked to neural


tube defects 293. This person presented with confusion and headache following a minor netball

injury to the head. This mutation is predicted to be disease causing and is present in only 3 exomes in

ExAC, but has a high-low functional impact on Mutation Assessor (1.83). This person also had a

pathogenic mutation in the APC gene (medium, FL score 2.01), which is highly expressed in the brain

and plays a role in cell adhesion. It is worth noting that APC and HECTD1 have physical

interactions based on the GeneMANIA database. More specifically, HECTD1 was found to

regulate the APC-Axin interaction through ubiquitination 294. Changes in proteins developing

the mesenchyme and cell adhesion could contribute to neural vulnerability towards trivial

trauma.

It is no surprise to see genes implicated in neurodegeneration 295 as candidates in

individuals suffering from head trauma reactions, as a review by Sundman and colleagues 296

explored the occurrence of ALS in mTBI cases, finding higher odds ratios in those with

prior head trauma. Here, we posit that the neural vulnerability is shared between vulnerability

to head trauma and neurodegeneration.

5.5 LIMITATIONS

We acknowledge our lack of comprehensive access to family members available for this

study, thus limiting our ability to establish implicated mutations as candidate causal ones.

Further, there are some limitations surrounding the suitability of grouping the individuals of the

study. These individuals have been referred specifically for FHM suspected symptoms, which

are quite varied and diverse. The only common element is how easily they developed negative

outcomes to trivial head trauma. Hence, we acknowledge the heterogeneity of their

presentations and therefore shy away from making any conclusion beyond our neuronal vulnerability

hypothesis. This grouping is only suitable for a pilot study and any more generalisable


conclusions will need to be explored in larger cohorts of mTBI and concussion patients with

clearer groupings. Also, our analysis was targeted to include only genes with neuronal related

ontology, and variants with a low MAF and high deleteriousness scores. While this approach

leads to identifying variants with the highest penetrance, we acknowledge that there might be

other more common, less deleterious variants that contribute to each case’s etiology. It is also

imperative to acknowledge that this approach of selectively looking for rare variants in specific

gene sets has its own limitations. While traditionally it has been used in diagnostic genetics

studies, it carries the risks of finding rare variants amongst the genes of interest and retrofitting

a link that fits the original hypothesis. Nevertheless, these samples were analysed alongside

samples from epilepsy, CADASIL, and normal mTBI patients with no severe responses to

trivial head trauma, where no variants of interest that met the filtering criteria were detected.

5.6 CONCLUSION

Mutations in ion channel and neurotransmitter-related genes are implicated in

vulnerability to minimal or trivial head trauma. Ubiquitin-related genes are a new finding that

needs to be explored in further studies. These mutations can cause structural changes to neurons,

changing the influx of ions or efflux of neurotransmitters. Consequently, the trivial trauma acts

as a precipitant of disturbance, in which a heterozygote mutation does not lead to neurological

symptoms under normal conditions. However, symptoms of neurological disorders which are

autosomal recessive can develop in response to extreme disturbance (head impact, trivial

trauma) as concussion-like or migraine symptoms.

We have found rare and predicted-to-be-damaging variants in ion channel and

neurotransmitter-related genes, in case samples with a severe response to minor head trauma,

implicating them in vulnerability to head trauma. Most of those genes were connected to


CACNA1A via a shared protein domain, co-expression, or pathway. These variants may cause functional changes to neurons, changing the influx of ions or efflux of neurotransmitters.

Consequently, we hypothesise that while a heterozygote mutation does not lead to neurological symptoms under normal conditions, head trauma acts as a precipitant of disturbance. Therefore, symptoms of neurological disorders which are autosomal recessive could develop in response to extreme disturbance (head impact, trivial trauma) as concussion-like or migraine symptoms.

As the evidence for ion channel involvement in head trauma increases, larger affected cohorts are required to be recruited to confirm the relevance of each gene to symptom subsets.


Chapter 6: Exploring the role of rare ion channel and neuronal-related variants in persistent post-concussion symptoms

This chapter details the second part of Aim 2, using Whole Exome Sequencing to explore rare variants in a cohort of individuals who presented with either persistent post-concussion symptoms or a history of high susceptibility to concussion incidents. Unlike the cohort in

Chapter 5, their injuries were mostly related to high impact incidents, and their symptoms are not as severe as those with FHM mutations. Where possible, affected and unaffected family members were recruited and screened for variants of interest.

Abstract

Background: Concussion is a transient brain dysfunction that often develops in response to head trauma with acceleration/deceleration that creates stress waves. Post-concussion symptoms include a wide range of neurological, cognitive, and behavioural dysfunctions that tend to resolve in the weeks or months following the incident. A percentage of the population exhibits persistent post-concussion symptoms that do not resolve with time. Genetic studies have used large collegiate, athlete or military veteran cohorts to find associations between common polymorphisms and persistent symptoms, albeit with inconsistent results.

Methods: Whole Exome Sequencing was performed on the Ion S5 instrument for a cohort of 33 individuals who struggle with persistent post-concussion symptoms. Ion Torrent Server was used to generate Variant Call Format (VCF) files and analysis was done using the Ion Reporter variant viewer and VCF-Dart.


Results: Rare variants that are implicated in neural processes or neurological disorders

were found in individuals with related symptoms. While few common variants have been found

between cases, this is the first study that explores candidate rare variants in concussive cases

which are not suspected to have Familial Hemiplegic migraine; we find a myriad of rare variants

in ion channel and neurotransmitter genes, as well as variants of pathogenicity linked to other

neurological disorders, similar to the results presented in Chapter 5 for the cohort with severe

reactions to trivial head trauma.

6.1 BACKGROUND

Concussion, as detailed in the literature review, is a transient neuronal disturbance that

follows a head injury or impact where the brain is strained under shear and tear forces 34. It is

often the result of a mild traumatic brain injury (mTBI), but the literature often uses concussion

and mTBI interchangeably. The transient neurological disturbances that follow concussion are

closely intertwined with its molecular aetiology, wherein the stress waves create an ion

imbalance that needs to be rectified through some energy expenditure causing oxidative stress

to the neurons 35. It is hypothesised that the myriad of genes that contribute to these processes

of restoring neuronal homeostasis can harbour deleterious variants that are detrimental to their

function, thus affecting the overall neurological outcome. Concussion outcomes vary from

behavioural changes (e.g. depression and anxiety), cognitive changes (e.g. memory problems),

to more somatic symptoms (e.g. migraine and headache).

While most concussed individuals recover within 4 to 6 weeks post-injury, a percentage

of them develop persistent post-concussion symptoms (PCS). Most of these symptoms are not

severe enough to require hospitalisation but are enough to impair normal everyday functioning

and cause chronic health care needs.


It is for these reasons, and others detailed in chapter 2, that the need for understanding the

genetic contribution to persistent PCS arises. Most genetic studies have explored the association

between common polymorphisms and concussion outcomes. Nevertheless, rare variants have

not been explored in any cohorts. Therefore, this study details the preliminary results of a

growing cohort of individuals with PCS who come from all walks of life and sports, in an effort

to reduce the over representation of certain sports or careers in the existing literature.

6.2 METHODS

Samples were recruited via local sporting clubs and university organisations, as well as

through relevant media announcements (e.g. radio and TV interviews and relevant newsletter

articles). As per the protocols detailed in Chapter 3 (3.3.1 and 3.3.2), DNA was extracted from

saliva samples. WES was performed using the Ion AmpliSeq™ Exome RDY Library

Preparation protocol and the Ion PI™ Hi-Q™ Chef Kit protocol, as per the manufacturer’s

instructions. NGS was performed on the Ion S5 plus instrument, which uses the same template

and chemistry as the Ion Proton used in Chapters 4 and 5 but allows for a higher number of

flows and reads and thus more samples per run. Ion Torrent Platform was used to generate

quality metrics while Ion Reporter was used to call the variants. Samples had an average read

depth of 25 million reads, indicating high quality sequencing, with none having less than 18

million reads. Ion Reporter filtering was applied to explore variants in each sample based on

gene ontology (neural, neurological, brain, ion channel, neuro*), with Minor Allele Frequency

(MAF) below 0.01. The filter also included functional scores of SIFT and PolyPhen to include

variants predicted to be deleterious. In order to identify rare variants that might contribute to

the persistent post-concussion symptoms present in this cohort of 33 individuals, a stepwise

elimination process was undertaken to identify the top 1 or 2 variants of interest that offer the


best explanation of the symptoms. As detailed in Chapter 3, a combination of in-silico prediction tools and literature searches was then used to identify genes or variants that may be implicated in concussion development or outcomes, and to prioritise the different variants for inclusion in or exclusion from further analysis.
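
The filtering criteria described in this section can be sketched in a few lines of Python. This is a minimal illustration only and not the Ion Reporter implementation: the `Variant` record, its field names, and the example entries (`GENE_A` to `GENE_D`) are hypothetical, the ontology filter is reduced to a simple keyword match, and the sketch assumes a variant is retained when either SIFT or PolyPhen predicts a damaging effect, which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Optional

# Ontology keywords named in the text; "neuro*" is treated as a substring match.
ONTOLOGY_TERMS = ("neural", "neurological", "brain", "ion channel", "neuro")
MAF_THRESHOLD = 0.01  # variants must be rarer than this

DAMAGING_POLYPHEN = {"probably_damaging", "possibly_damaging"}
DAMAGING_SIFT = {"deleterious"}


@dataclass
class Variant:
    """Hypothetical record layout, for illustration only."""
    gene: str
    ontology: str          # free-text ontology annotation
    maf: Optional[float]   # None = not present in the reference database (novel)
    polyphen: str
    sift: str


def passes_filter(v: Variant) -> bool:
    in_ontology = any(term in v.ontology.lower() for term in ONTOLOGY_TERMS)
    is_rare = v.maf is None or v.maf < MAF_THRESHOLD
    # Assumption: keep the variant if either tool predicts damage.
    predicted_damaging = v.polyphen in DAMAGING_POLYPHEN or v.sift in DAMAGING_SIFT
    return in_ontology and is_rare and predicted_damaging


# Made-up example entries, not data from this study.
variants = [
    Variant("GENE_A", "ion channel activity", 0.0004, "probably_damaging", "deleterious"),
    Variant("GENE_B", "neuronal development", 0.2, "probably_damaging", "deleterious"),
    Variant("GENE_C", "lipid metabolism", 0.001, "probably_damaging", "deleterious"),
    Variant("GENE_D", "brain development", None, "benign", "tolerated"),
]
shortlist = [v.gene for v in variants if passes_filter(v)]
print(shortlist)
```

In this sketch only the first entry survives: the second is too common, the third falls outside the ontology terms, and the fourth is not predicted to be damaging by either tool.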

For this study, aside from the individual cases, we were able to recruit family members for 4 related individuals:

Family 1 included 3 participants, a son and a daughter with PCS, and a father who has sustained concussions and recovered. The mother had passed away so unfortunately a full trio analysis could not be performed.

Family 2 included 4 participants, two brothers (CN7, CN8) who both had to quit rugby due to persistent post-concussion neurological problems as well as short term memory problems. Their parents (CN35,CN36) had sustained head injuries in a car accident with only the mother developing PCS symptoms.

Family 3 included 3 generations, a grandfather with Parkinson’s disease (CN23), a grandson with PCS (CN15), his sister who recovered from sports concussion throughout her life (CN12), and a cousin who was identified as susceptible to multiple concussions.

Family 4 was recruited where one member suffers from more severe concussion-related symptoms than the rest of the family (CN17, CN18, CN20, and CN21). A professional athlete

(CN18) presented with a high rate of concussion incidents and persistent neurological post-concussion symptoms. Interestingly, his brother played the same sport and never suffered from any concussions despite being more prone to head trauma as reported by their coach. Further, a sister (CN17) who is a regular gymnast and dancer, and has also sustained multiple head


trauma, has never had a diagnosed concussion or any neurological symptoms. Unfortunately,

we have not been able to obtain research consent from the unaffected brother, only the affected

male and unaffected sister.

6.3 RESULTS

Samples had an average number of reads of 30M with a minimum of 18M reads. The

average mean coverage depth was 80X across all samples, with the minimum coverage depth

30X. Ion Torrent exome fragments are 200 base pairs (bp) long, so an average read length in

the samples of 110 bp, with a range of 18 and SD of 5.2, indicates an acceptable level of

fragment integrity. Samples had an average of 39,000 called variants and a minimum of 34,000

variants. Ion Reporter has an ontology filter, wherein gene expression, function, and gene

families can be used to filter genes. Following the hypothesis detailed in chapter 2 which is

based primarily on ion channel genes, the following gene ontologies were included in the

analyses: neural, neurological, brain, ion channel, and neuro*. To identify rare and novel

mutations, only variants with Minor Allele Frequency (MAF) below 0.01 were included. The

MAF cut-off was increased from the 0.001 used for the study with the FHM cohort (Chapter 5) as this

cohort exhibits less severe and thus more common symptoms. SIFT and PolyPhen-2, two

complementary pathogenicity prediction tools, were used to filter variants in Ion Reporter to

ensure that the mutations included were of a certain level of functional impact.

Seeing that the MAF filter is relaxed for this study to 0.01, a higher number of candidate

variants were identified in each sample, all in genes expressed in the brain and/or with

biological relevance. This was expected seeing that the ontology-based filtering included

“neuro” as part of the inclusion criteria. Constraint scores were then used to prioritise variants.

Constraint scores (also known as Z-scores) can be found on databases like gnomAD and


ExAC 297. In large-cohort studies, they have been used to prioritise rare variants 298. For missense variants, positive Z-scores indicate more constraint (fewer observed variants than expected), and negative scores indicate less constraint (more observed variants than expected).

A greater Z-score indicates more intolerance to the class of variation. Z-scores on the gnomAD database are generated by a sequence-context-based mutational model that predicts the number of expected rare (< 1% MAF) variants per transcript. The square root of the chi-squared value of the deviation of observed counts from expected counts is multiplied by -1 if the observed count is greater than the expected count, and by 1 if it is smaller.
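
As a worked illustration of the signed Z-score described above, the following sketch assumes that the chi-squared value takes the single-cell form (observed - expected)^2 / expected. The text does not give the exact formula, so this form, the function name `constraint_z`, and the example counts are assumptions for illustration rather than the gnomAD implementation.

```python
import math


def constraint_z(observed: float, expected: float) -> float:
    """Signed square root of the chi-squared deviation of observed from expected."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    # Assumed single-cell chi-squared form; see the lead-in above.
    chi_squared = (observed - expected) ** 2 / expected
    magnitude = math.sqrt(chi_squared)
    # Negative when more variants are observed than expected (less constraint),
    # positive when fewer are observed than expected (more constraint).
    return -magnitude if observed > expected else magnitude


# Hypothetical counts: a transcript with 100 expected rare missense variants.
print(constraint_z(50, 100))   # fewer observed than expected: positive score
print(constraint_z(150, 100))  # more observed than expected: negative score
```

Under these assumptions, a transcript with half the expected number of variants scores +5, while one with 50% more than expected scores -5, matching the sign convention described in the text.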

In this cohort, there were a few rare variants identified in ion channel genes, neuronal structure genes, neurotransmitters, and ubiquitin genes. Some of the most notable neurological conditions that have been linked to the genes of interest in this chapter are ataxia, migraine, epilepsy, and neurodegenerative disorders. Nonetheless, there was little evidence that variants segregate with affected individuals in families, and hence private and shared variants will be described individually.

Table 10 details the variants identified in each case and their MAF, while Table 11 details the predicted pathogenicity scores from multiple in-silico prediction tools.


Table 10 Rare variants identified in a cohort of persistent post-concussion patients and their minor allele frequencies (MAF)

ID     GENE     VARIANT         MAF (gnomAD)
CN2&3  NOS1     rs771520384     0.000008
CN5    GRIN3A   rs34755188      0.01
CN6    DARS2    rs121918208     0.0001
CN7    CACNA1A  rs41276886      0.001
CN8    CACNA1A  rs41276886      0.001
CN7    PHKB     rs34667348      0.002
CN8    PHKB     rs34667349      0.002
CN9    ATM      rs587779844     0.0001
CN10   RELN     rs114344654     0.0002
CN11   TTC19    rs141892030     0.0002
CN15   VDAC3    rs150041962     0.0008
CN15   RIC3     rs144806410     0.002
CN18   SCN11A   rs146942592     0.0001
CN18   TENM4    rs191549326     0.002
CN24   NGF      rs11466111      0.01
CN24   NTRK1    rs6336          0.04
CN24   CHRNB1   rs200684767     0.0001
CN25   GPRC5B   rs779671010     0.00004
CN25   SRR      rs140102145     0.0004
CN26   P2RX7    rs28360447      0.01
CN26   GRIN1    rs201908310     0.0004
CN27   KIF1B    rs117525287     0.004
CN27   NGF      rs11466111      0.01
CN28   GABRP    rs138089418     0.003
CN28   SYNJ1    rs769125182     0.00002
CN29   UNC13B   rs918241456     0.000007
CN29   SRR      rs140102145     0.0004
CN30   GRIN3A   rs140872676     0.0002
CN31   GRIA1    rs370642711     0.00001
CN32   TRIM46   rs80254867      0.01
CN32   APBB1    rs120074117     0.0001
CN32   KCNJ10   rs142596580     0.0002
CN33   GABRA6   rs377498858     0.00005
CN33   MKKS     rs74315394      0.05
CN34   DLG1     rs78190191      0.01
CN36   SDHAF4   rs146446063     0.01
CN36   TIAM1    rs141720377     0.008
CN40   RNF112   chr17:19319355  Novel
CN41   ANK2     rs142534126     0.0001
CN41   SOD1     rs121912455     0.000003
CN47   RIMS2    rs143929294     0.001


Table 11: Functional Scores of Identified Variants

ID     GENE     VARIANT         Transcript         AA CHANGE     PolyPhen           SIFT
CN2&3  NOS1     rs771520384     ENST00000317775.6  p.Arg975Cys   probably_damaging  deleterious
CN5    GRIN3A   rs34755188      ENST00000361820.3  p.Arg480His   probably_damaging  tolerated
CN6    DARS2    rs121918208     ENST00000361951.4  p.Cys152Phe   possibly_damaging  deleterious
CN7    CACNA1A  rs41276886      ENST00000360228.5  p.Ala453Thr   probably_damaging  -
CN8    CACNA1A  rs41276886      -                  -             -                  -
CN7    PHKB     rs34667348      ENST00000323584.5  p.Gln657Ter   High-confidence    -
CN8    PHKB     chr16:47684830  -                  p.Gln657Lys   -                  -
CN9    ATM      rs587779844     ENST00000278616.4  p.Thr1743Ile  possibly_damaging  deleterious
CN10   RELN     rs114344654     ENST00000428762.1  p.Val2372Met  probably_damaging  deleterious
CN11   TTC19    rs141892030     ENST00000261647.5  p.Ala263Ser   probably_damaging  deleterious
CN15   VDAC3    rs150041962     ENST00000022615.4  p.Thr137Asn   probably_damaging  tolerated
CN15   RIC3     rs144806410     ENST00000309737.6  p.Pro101Ser   probably_damaging  deleterious
CN18   SCN11A   rs146942592     ENST00000302328.3  p.Arg238Cys   possibly_damaging  deleterious
CN18   TENM4    rs191549326     ENST00000278550.7  p.Pro485Ser   possibly_damaging  tolerated
CN24   NGF      rs11466111      ENST00000369512.2  p.Arg80Gln    probably_damaging  deleterious
CN24   NTRK1    rs6336          ENST00000524377.1  p.His604Tyr   probably_damaging  deleterious
CN24   CHRNB1   rs200684767     ENST00000306071.2  p.Arg216Gln   possibly_damaging  deleterious
CN25   GPRC5B   rs779671010     ENST00000537135.1  p.Ala87Val    probably_damaging  deleterious
CN25   SRR      rs140102145     ENST00000301364.5  p.Arg737Trp   probably_damaging  deleterious
CN26   P2RX7    rs28360447      ENST00000546057.1  p.Gly150Arg   probably_damaging  deleterious
CN26   GRIN1    rs201908310     -                  -             -                  -
CN27   KIF1B    rs117525287     ENST00000263934.6  p.Asn731Ser   probably_damaging  deleterious
CN27   NGF      rs11466111      ENST00000369512.2  p.Arg80Gln    probably_damaging  deleterious
CN28   GABRP    rs138089418     -                  -             -                  -
CN28   SYNJ1    rs769125182     ENST00000433931.2  p.Ala1251Thr  benign             tolerated_low_confidence
CN29   UNC13B   rs918241456     ENST00000396787.1  p.Asn1047Lys  probably_damaging  deleterious
CN29   SRR      rs140102145     -                  -             -                  -
CN30   GRIN3A   rs140872676     -                  -             -                  -
CN31   GRIA1    rs370642711     ENST00000518783.1  p.Arg739Gln   probably_damaging  deleterious
CN32   TRIM46   rs80254867      ENST00000334634.4  p.Arg161His   probably_damaging  deleterious
CN32   APBB1    rs120074117     ENST00000342245.4  p.Arg498Leu   probably_damaging  deleterious
CN32   KCNJ10   rs142596580     ENST00000368089.3  p.Lys354Arg   benign             deleterious
CN33   GABRA6   rs377498858     ENST00000274545.5  p.Thr113Met   probably_damaging  deleterious
CN33   MKKS     rs74315394      ENST00000347364.3  p.Ala242Ser   probably_damaging  deleterious
CN34   DLG1     rs78190191      ENST00000346964.2  p.Arg819Gln   possibly_damaging  deleterious
CN36   SDHAF4   rs146446063     ENST00000370474.3  p.Pro75Ser    probably_damaging  deleterious
CN36   TIAM1    rs141720377     ENST00000455508.1  p.Arg31His    probably_damaging  deleterious
CN40   RNF112   chr17:19319355  -                  -             -                  -
CN41   TNR      rs768208178     ENST00000367674.2  p.Ser1220Phe  possibly_damaging  deleterious
CN41   ANK2     rs142534126     ENST00000357077.4  p.Thr1437Met  probably_damaging  -
CN41   SOD1     rs121912455     ENST00000270142.6  p.Gly73Ser    probably_damaging  deleterious
CN47   RIMS2    rs143929294     -                  -             -                  -


Pedigree of family 1 (CN1-3)

Pedigree of family 2 (CN7,8,35,36)


Family 3 (CN15 and CN38 were affected by post-concussion migraine and multiple concussions respectively).

Pedigree of family 4 (CN17,18,21,22)

Figure 6 Pedigree Charts of Families included in the study


6.4 DISCUSSION

This chapter details Whole Exome Sequencing of 33 individuals who were recruited into

the study based on their persistent post-concussion symptoms (PCS). Like the diagnostic

genomics cohort, most individuals of this cohort were found to have predicted deleterious

variants in genes that are known to cause neurological disorders with symptoms like those

involved in their PCS. Nonetheless, there are two main differences from the severe reaction

cohort. The first is that a large number of the variants identified are documented as benign or

likely benign for a specific disorder, whilst being predicted to have a deleterious effect, and the

second being that the average MAF for these variants was slightly higher than those identified

in the Chapter 5 cohort. The former observation may be attributed to the fact that the functional

damage caused by the variant is not enough to cause a diagnosable disease; however, it may

cause enough impairment to normal neuronal functions that restore homeostasis. As for the

allele frequency, it might be attributed to the fact that the symptoms in this cohort are not as

severe as those described in chapter 5. However, the explanation is most likely more innocuous: a direct

result of the higher MAF filtering threshold. Previous studies have found that heterozygote

mutations of dominant models are associated with increased susceptibility to complex disorders

such as schizophrenia 194. We hypothesised that neuronal-related genes would harbour rare

deleterious variants in individuals with persistent post-concussion symptoms or a history of

high concussion incidence. This was supported by finding likely pathogenic variants in calcium,

sodium, potassium channel, and solute carrier genes. Further, the results support the preliminary

findings of chapter 5 where genes implicated in neurological autosomal recessive disorders

could be contributing to subtle neural changes that are exacerbated by a certain trigger (e.g.

head trauma). We hypothesise that the relationship is not linear, but rather stemming from a


neuronal vulnerability caused by shared genetic pathways. Most probands had private

individual variants that were not shared with others in this cohort. However, GRIN3A and SRR

had separate variants in 2 individuals, while NGF had a variant that was identified in 2 unrelated

individuals.

6.4.1 ION CHANNEL GENES

The first two related cases are brothers (CN7 and CN8; their parents are CN35 and CN36), who both had to

quit rugby due to persistent post-concussion neurological problems as well as short term

memory problems. They shared a CACNA1A mutation that is predicted to be disease causing

by all in-silico tools, and in some disease databases (HGMD CM072914). However, it is

suggested to be benign in ClinVar. Nonetheless, CACNA1A remains one of the most extensively

studied genes in relation to head trauma and migraine, which fits in with their presentation. The

fact that it is benign for FHM could explain why neither of them developed idiopathic

symptoms, but rather milder forms of migraine when exposed to external triggers (head

trauma). Their parents had been involved in a car accident in the past, and while the mother

developed persistent post-concussion headache, the father seemed to recover from the accident

quickly. However, the CACNA1A variant was found in the father, not the mother. The siblings both

have symptoms in line with the reported implications of CACNA1A ion channelopathies. There

might be an explanation for why the father never developed the same symptoms, which is

having his own unknown protective variants. Alternatively, the siblings could have risk variants

from the mother that amplified the neuronal vulnerability imparted by the CACNA1A mutation.

However, no shared rare variant in ion channel genes was detected in the mother and both sons.

Two other variants found in the siblings are in the AMBRA1 and PHKB genes. The latter

may be of more relevance as it is documented as a pathogenic variant causing glycogen storage


disease (autosomal recessive); it therefore likely affects protein function. Notably, PHKB expression was found to be elevated in individuals with persistent post-concussion symptoms 299. While this is not a direct link, there are reports of neurological symptoms in some forms of glycogen storage disease 300. Glycogen-related dysfunctions have also been linked to epilepsy 301, which, as established in the previous chapter, shares molecular pathways with concussion and migraine.

CN41

Another case in this cohort is an otherwise healthy male in his 30s, who has a history of high concussion incidence in contact sport (rugby). Upon analysis, rare variants were shortlisted to include three genes, ANK2, SOD1, and TNR. ANK2 codes for ankyrin B, a membrane protein that affects the localization of ion channels and transporters 302. It is a member of the family of genes involved in long QT syndromes, which are expressed in the brain and cardiac tissues 303.

Despite its close relation to ion channels and transporters, and its expression in the brain, mutations reported for ANK2 mainly cause cardiac problems, such as arrhythmia and sinus arrhythmia, as well as a risk of sudden death due to epilepsy 304, 305. Recent studies have indicated that the role ANK2 plays in axonal branching might be a key factor in neurodevelopmental disorders, including ASD 306.

As for the copper and zinc superoxide dismutase gene (SOD1), mutations in that gene have been found to cause excitotoxicity, oxidative stress, ER stress, mitochondrial dysfunction, axonal transport disruption, prion-like propagation, and non-cell autonomous toxicity of neuroglia 307. Glia with a SOD1 mutation (G93A) have also been found to present a “neuroinflammatory phenotype” 308, which is a hallmark of concussion processes. SOD1 was also characterised decades ago as a cause of mitochondrial degeneration in ALS families, with

mitochondrial processes being a major suspect in concussion outcomes. While SOD1 is a likely candidate, TNR, which encodes a member of the tenascin family of extracellular matrix glycoproteins and is expressed exclusively in the central nervous system, might also be playing a role. Of most interest, TNR-expressing cells are a subpopulation of astrocytes and regulate extracellular glutamate homeostasis 309.

CN26

Another case is a woman in her 50s (CN26) presenting with severe neuropsychiatric symptoms including depression, lethargy, lack of mobility, memory problems, and inability to focus or concentrate. Whilst the onset of these symptoms was attributed by the participant and the treating physician to a brain injury, the participant reported a history of abuse and maltreatment as well as current mood problems. The gene with the most relevant exonic rare variant was PXTR2. It is, perhaps unsurprisingly, implicated in ion channel processes as well as the development of depressive disorders 310. More specifically, the link to concussion processes is strengthened by recent understanding of the role of ATP release and signalling through the P2X7 purinergic receptor, which contributes to changes in brain tau levels 311. With a Z score of 5.8, it is one of the genes in this chapter that is most intolerant to missense mutations, which partly strengthens the link proposed here. GRIN1 is also among the genes most intolerant to variation, with a Z score of 6, and mutations in that gene have been implicated in encephalopathy with movement disorders 312 and in depression and schizophrenia symptoms 313 because of its role in neural plasticity 314. A rare intronic variant in GRIN1, which has been reported in ClinVar, was found in one individual, but lack of conservation at that base suggests that it is unlikely to be functional (or to affect splicing).

CN18

A professional athlete (CN18) presented with a high rate of concussion incidents and persistent neurological post-concussion symptoms. The two most prominent rare variants found in CN18 were in the SCN11A and TENM4 genes, both of which are linked to neurodegeneration and tremors 315. SCN11A is particularly linked to episodic pain 316, and some missense mutations cause loss of pain perception 317, which could potentially contribute to more risk-taking behaviour. Further, the voltage-gated sodium channel encoded by SCN11A seems to be closely linked to the molecular processes that ensue after a concussion, including neuroinflammation.

SCN11A is particularly relevant to concussion processes due to its role in neuronal excitability levels and in allowing stimuli (e.g. light head injury) that would not normally cause depolarising effects to have prolonged influence 318. TENM4 seems to complement the effect of SCN11A, as it plays a role in establishing proper neuronal connectivity during development, and has also been implicated in essential tremor 319.

CN9

One case (CN9) developed persistent post-concussion migraine for years. Their most notable variants were in SCN9A and ATM. SCN9A is involved in epilepsies 320 (though the variant itself is classified as likely benign in ClinVar) and, more interestingly, in pain disorders 321, a set of genetic disorders or traits shared with another case of persistent post-concussion symptoms in this cohort who carries an SCN11A mutation (CN18).

ATM, on the other hand, is involved in Ataxia Telangiectasia 322, which is generally fatal when both copies of the pathogenic variant are present, as it is an autosomal recessive condition.

However, there is evidence that mutations in the ATM gene cause neurodegenerative changes

323, which could contribute to much less severe neurological symptoms (e.g. migraine) when an external trigger such as concussion is involved.

CN10

A rare variant (p.Val2372Met) was identified in the reelin gene, RELN, which is known to have a neuronal function, in a 50-year-old female (CN10) who struggled for 6 months with severe visual disturbances and memory problems post-concussion; the symptoms have since persisted in a more moderate form. Mutations in RELN have been found to cause Autosomal Dominant Lateral Temporal Lobe Epilepsy (ADLTE) 324, 325, and it is the gene with the highest Z score among this person’s candidate variants (2.25). However, this specific variant was found to be benign for ADLTE or of unknown significance in previous studies reported in ClinVar, which may explain why the person never developed idiopathic epilepsy. RELN is suggested to regulate synaptic plasticity as well as neurotransmission 326. It is also involved in memory functions and, more importantly, encodes Reelin, which is implicated in normal synapse functioning and in abnormal synapse functioning in neurodegenerative disorders like AD 327.

Impairment of such functions might suggest why this person’s brain was never capable of fully recovering. Another gene of interest that has been identified is SLC24A4. While some polymorphisms 328 and methylation levels 329 in this gene have been linked to late-onset AD, in line with the neurodegenerative pattern that is forming in this cohort, the majority of the evidence seems to point to dental implications 330, 331.

CN47

An otherwise healthy male in his 30s (CN47) volunteered for this study after having persistent problems with sleeping, headaches, ringing ears and cognitive ability, 18 months after being diagnosed with PCS by a neurologist. This person had no mutations with a direct link to brain injury or neurodegenerative diseases. However, it is worth noting that they had a mutation in RIMS2, which regulates synaptic functions, in particular synaptic membrane exocytosis, which is of interest to concussion processes 332, 333. Mutations in RIMS2 are also causative of Syndromic Congenital Cone-Rod Synaptic Disease with Neurodevelopmental and Pancreatic Involvement 334.

CN34

Another participant (CN34) had to stop playing in the Australian Football League because of multiple concussion incidents and long-term problems with memory and reaction times. One variant of initial interest, in DLG1, made it through the filtering process.

Although DLG1 was found to be highly expressed in the brain, and was hence included in the primary stage of results, it appears to play a role mostly in neuropsychiatric disorders, in particular schizophrenia. DLG1 may be important for the pruning of neuron branches that occurs during adolescence. This crucial brain remodelling mechanism may be the basis for the altered neural connectivity and functioning that can result in schizophrenia later in life 335.

CN29

In a case (CN29) who suffers from migraine with aura, post-concussion depression and apathy, as well as stroke, there were three variants of interest. The first was found in UNC13B, which encodes a presynaptic protein that promotes the priming of synaptic vesicles by acting through syntaxin 336. It is also responsible for exocytosis, a process that is closely correlated with calcium ion channels 337, and concussion processes in general. Exocytosis has also been

implicated in another gene found in this cohort. Nonetheless, the literature suggests that mutations in UNC13B mostly cause neuropsychiatric conditions like bipolar disorder and schizophrenia 338, rather than neurological ones like the migraine this carrier suffers from. The participant had also reported prolonged apathy and depression, which might be linked to the higher expression of this gene in the cerebral cortex. Of interest, the case had another rare variant in the serine racemase gene SRR, in which mutations can cause alterations in D-serine levels, evident in schizophrenia 339 and Parkinson’s Disease (PD) 340. SRR has also been linked to neurodegeneration 341 and in particular to a specific type of ataxia associated with dysfunctional ATPase 342, 343. As discussed earlier, ATPase plays a crucial role in the molecular processes in the aftermath of a concussion that aim to restore homeostasis.

CN27

In another case (CN27) with migraine and depression, two rare variants were detected. The first is in KIF1B, a neuronally expressed gene that is likely related to the irreversible axonal loss characterizing MS in the long term 344. Despite other studies finding conflicting links to MS as a disease, most literature agrees that mutations in this gene contribute to neurodegeneration 345. Of interest, KIF1B is involved in mitochondrial transport processes 346. As established earlier, mitochondrial processes are crucial for restoring homeostasis post-concussion, and impairment thereof might very well explain some of the persistent symptoms.

As this gene has a Z score of 3.6, it is likely to harbour the variant of higher effect, since the gene is less tolerant to missense mutations. In contrast, NGF, where the other variant was found, has a Z score of 0.9. However, despite having a lower Z score, the Nerve Growth Factor gene (NGF) has been suggested to be a migraine mediator 347, and more specifically a neurotrophin elevated with the incidence of migraine 348, 349. Seeing that the evidence linking NGF variants

to migraine is quite limited to date 350, it follows that whilst the variant found might not cause migraine idiopathically, it is possible that the concussion triggered it.

CN25

In a case of persistent post-concussion symptoms (CN25), only one variant of interest was detected, in GPRC5B. Most literature suggests involvement in behavioural traits rather than neurological ones 351, 352. However, this case also had a mutation in the SRR gene, which was found in another case and is implicated in neurodegeneration as well as types of ataxia.

Interestingly, this person reported anxiety, depression, apathy, heightened emotions, as well as short-term memory problems. They also reported a family history of Alzheimer’s.

CN24

There was another case (CN24) in this cohort with an NGF rare variant. More interestingly, they had another variant in a neurotrophic gene, NTRK1. The two genes share numerous signalling pathways, which contribute to their previously described role in synaptic plasticity. A combination of deleterious variants in both genes could prove detrimental to the health of the neurons in response to a brain injury. NTRK1 in particular has been implicated in migraine aetiology, which shares multiple pathways with the molecular aftermath of concussion 353.

While this person also had an ion channel variant, in CHRNB1, that gene is reported to be more implicated in muscle weakness presentations 354. This person’s main symptoms included migraine as well as short- and long-term memory problems.

CN6

An otherwise healthy male in his 60s (CN6) sustained multiple concussions over his lifetime that led him to quit professional contact sport after developing long-term memory problems as well as depression. The only variant of interest found in his exome was in the DARS2 gene, which has been implicated in leukoencephalopathy with brainstem and spinal cord involvement and elevated lactate (LBSL). While he does not present with the full-blown disorder, cognitive impairment seems to be common among patients with LBSL and DARS2 mutations, especially in processing speed and working memory 355. LBSL is most often a relatively mild disorder, characterized by juvenile onset of slowly progressive ataxia, spasticity, and cerebellar, pyramidal, and dorsal column dysfunction 356. However, the clinical course of LBSL is not uniform, and there is a lack of longitudinal data on these patients 357.

Most relevant is the fact that DARS2 mutations are involved in mitochondrial function 358, another emerging theme among some cases of this cohort.

CN4

Case CN4 presented with a history of multiple head injuries, one of which resulted in a stroke. He suffers from hemisensory events as a result. As per the variants table, there were a few candidate variants based on first-tier filtering. The variant with the highest deleteriousness scores was in KALRN, the gene encoding the human Kalirin protein, which has been implicated in several neurological and neuropsychiatric disorders, including ischaemic stroke 359, 360.

CN2&CN3

A brother and a sister (CN2 and CN3) had to stop their contact sports (rugby and field hockey respectively) after struggling with a history of high concussion incidence. While their

father (CN1) had also been a regular athlete in a contact sport, he had never had any problems with concussion, and in over 30 years sustained just one after a justifiable head impact, from which he recovered within the expected timeframe. A rare, predicted-deleterious variant in the neuronal nitric oxide synthase (nNOS) gene NOS1 was found in the siblings but not in the father. Unfortunately, the mother had passed away after struggling with non-neurological health problems, so there is little documentation of her history with concussion. nNOS is expressed in neuronal cell bodies, and NO derived from nNOS (nNOS-NO) acts as an important neurotransmitter associated with neuronal plasticity, memory formation, regulation of central nervous system blood flow, transmission of pain signals and neurotransmitter release 361. While NOS1 variants have not been studied in relation to mTBI, the availability of neuronal NOS certainly has been. Deficiency of NOS1, but not of NOS2 or NOS3, attenuates SOD2 nitration after experimental TBI. Nitration and inactivation of SOD2 could lead to self-amplification of oxidative stress in the brain, progressively enhancing PN production and secondary damage. With linkages to neurotransmitter release (glutamate excitotoxicity) and oxidative stress, there is a link to concussion processes that warrants further investigation of the NOS1 gene in larger cohorts.

6.4.2 UBIQUITIN GENES

In the previous chapter, two genes linked to the E3 ubiquitin ligase were identified in the cohort with severe responses to trivial head trauma. In this cohort with persistent post-concussion symptoms, two other E3 ubiquitin ligase genes were identified.

CN40

The first one was RNF112, identified in a 50-year-old female with post-concussion migraine and short- and long-term memory problems. RNF112 is also known as neurolastin and is implicated

in neuronal function and growth 362. Overexpression of this zinc finger protein has been found to have neuroprotective properties and antiapoptotic effects in astrocytes 363. It was also found to play a role in protecting neurons from intracerebral haemorrhage when a deletion causes its overexpression following brain injury 364.

CN32

A 50-year-old former athlete (CN32) who retired from sports due to persistent post-concussion symptoms, including depression and memory problems, had a rare variant in the E3 ubiquitin ligase gene TRIM46. It is suggested to be involved in the pathological localisation of Tau in neurodegenerative diseases 365. Without TRIM46, all neurites have a dendrite-like mixed microtubule organization, resulting in Tau missorting and altered cargo trafficking 366. Interestingly, this person had another mutation in an Alzheimer’s and amyloid-related gene, APBB1 367, 368. While a direct role for its polymorphisms in AD is contested 369, the biological processes related to APBB1 are in line with the neuronal vulnerability detailed earlier. Seeing that both genes have higher than average Z scores, it is possible that the combination might be causing neurotoxicity in response to head trauma. In addition, they had a KCNJ10 variant; this gene was discussed in detail in Chapter 4 as implicated in disorders like epilepsy and SeSAME syndrome, as an individual in the FHM cohort had a different mutation in this gene.

6.4.3 NEUROTRANSMITTER GENES

In a participant (CN31) with persistent yet mild post-concussion headaches, a rare variant was found in a glutamate receptor gene, GRIA1. It has been linked to migraine in numerous studies 370 and, as described earlier, contributes to neuronal excitotoxicity. Interestingly, some studies suggest an implication of glutamatergic dysfunction in the ATP1A2 FHM models 371,

which is one of the genes driving this study based on the findings from severe reactions to trivial head trauma.

The same applies to GRIN3A, another glutamatergic receptor gene that has been implicated in migraine 372 and was found in our cohort. This participant (CN30) struggled with multiple concussion incidents from non-contact sport activities and developed long-term depression in the aftermath. Another GRIN3A variant was found in a case (CN5) who reported susceptibility to concussion during contact sports and had to stop rugby due to the anxiety that ensued from his concussion incidents. This GRIN3A variant is slightly more common than the other variants detected in this cohort, but it is still predicted to be deleterious.

The two neurotransmitters often cited in concussion studies are GABA and glutamate. As detailed in the previous Chapter, their balance maintains a neuroprotective element in the brain.

While one case (CN28) had a rare variant in a GABA-related gene (GABRP) and a sodium channel gene (SCN3B), the gene with the highest Z score was SYNJ1. The synaptojanin 1 gene has been implicated in multiple studies on early-onset parkinsonism 373. More importantly, other studies have indicated that the form of parkinsonism caused by SYNJ1 mutations is characterised by seizures 374, which sometimes present independently of Parkinson’s 375. Seeing that this person reported prolonged migraine attacks post-concussion, and knowing the established common neurobiological pathways of migraine and epilepsy, it is not unreasonable to hypothesise that this variant could be contributing to this individual’s persistent post-concussion symptoms.

6.5 CONCLUSION

In an exploratory study of 33 individuals with persistent post-concussion symptoms, WES identified rare variants that are implicated in neurological processes or symptoms critical to concussion recovery. Three of those variants were supported by data from siblings or other family members, who either shared the same variant and presentation or were unaffected and negative for the variant. In this cohort, we find numerous variants in ion channel genes, as well as in genes involved in neurodegeneration. This is in line with the preliminary study in Chapter 5, where more severe responses seemed to be linked to the same gene families. Nonetheless, it is crucial to note that identification bias could be playing a role in identifying these gene families, seeing that the methodology focussed on ion channel genes based on the literature. In Chapter 5, neuronal vulnerability to head trauma was hypothesised as a framework for understanding severe cases of brain injury. Here, it is proposed that the same framework could be used to further explore milder forms of PCS. While the research on concussion genetics has been limited, it is possible that a rare variant approach could shed further light on genes that are yet to be explored.

6.6 FUTURE DIRECTIONS

The next step for this study is to apply for access to global healthy datasets (e.g. UK Biobank) and utilise rare variant analysis methods to validate these data. There are multiple proposed ways to do that, including machine learning models (classification, clustering, etc). Other statistically informed approaches would include burden testing and rare variant association analyses. While the former is based on statistically comparing the total count of rare variants in a certain gene to a control population 376, the latter statistically compares specific variants and their frequency in cases vs control groups, whether specifically recruited or from

public databases (e.g. gnomAD) 377, 378. Further, exploring variant combinations in this cohort compared to individuals who recover from mTBI within the expected time frames would provide further insights pertaining to the recovery process.
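
The burden-testing idea outlined above can be illustrated with a short sketch. It is not an analysis performed in this thesis: the sample identifiers, carrier assignments and cohort sizes below are hypothetical placeholders, and the function names are introduced only for illustration. Rare variants are first collapsed per gene into a count of carriers, and a one-sided hypergeometric (Fisher) tail then gives the probability of observing at least that many carriers among the cases if carrier status were unrelated to case status.

```python
from math import comb


def collapse_carriers(sample_genes, gene):
    """Count samples carrying at least one qualifying rare variant in `gene`."""
    return sum(1 for genes in sample_genes.values() if gene in genes)


def burden_p_value(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher (hypergeometric) tail: P(carriers among cases >= observed)."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    favourable = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, n_cases)


# Hypothetical data: each sample maps to the set of genes in which it
# carries a rare, predicted-deleterious variant (invented for illustration).
HYPOTHETICAL_CASES = {
    "case_1": {"SCN9A"},
    "case_2": {"SCN9A", "NGF"},
    "case_3": set(),
    "case_4": {"NGF"},
}
HYPOTHETICAL_CONTROLS = {
    "ctrl_1": set(),
    "ctrl_2": {"NGF"},
    "ctrl_3": set(),
    "ctrl_4": set(),
}

case_hits = collapse_carriers(HYPOTHETICAL_CASES, "SCN9A")
control_hits = collapse_carriers(HYPOTHETICAL_CONTROLS, "SCN9A")
p_value = burden_p_value(
    case_hits, len(HYPOTHETICAL_CASES), control_hits, len(HYPOTHETICAL_CONTROLS)
)
print(f"SCN9A carriers: {case_hits} cases vs {control_hits} controls, p = {p_value:.3f}")
```

With cohorts of realistic size, the same collapsing step would be repeated per gene and the resulting p-values corrected for the number of genes tested.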

Chapter 7: Mitochondrial and Epigenetic correlates of PCS

This chapter details the work undertaken as part of Aim 3, to conduct an exploratory study

investigating potential mitochondrial DNA and epigenetic correlates of concussion outcomes.

Section 1 presents the background, methods and results of identifying mitochondrial variants

in the concussion cohort. Section 2 details the background, methods, and results of exploring

methylation levels in a persistent PCS cohort.

7.1 MITOCHONDRIAL VARIANTS IMPLICATED IN RESPONSE TO HEAD TRAUMA

7.1.1 Background

Mitochondria are cellular organelles that are essential for energy production and the oxidative cycle. They are mobile, constantly fusing and dividing 379. Mitochondria have their own non-nuclear DNA, with a genome that is extremely small in comparison to the nuclear genome 380. The maternal inheritance of mtDNA, along with human migration, has created unique clusters of mutations that are used to assign sequences to haplogroups 381. mtDNA has some unique characteristics; it is continuously fusing and mixing content with neighbouring cells, and the effect of most mutations is not expressed until they accumulate into a high proportion of the mtDNA copies present 382. For a long time, mitochondrial genetics and diseases that are due to defects in mitochondrial DNA (mtDNA) were considered anomalies due to the different genetic rules by which mtDNA is inherited compared with nuclear DNA, the presence of several mtDNA copies in individual cells and a belief that mtDNA disease is rare 383. However, it is now believed that mtDNA is more susceptible to oxidative damage and mutation than nuclear DNA due to inadequate repair mechanisms, the absence of histones and

the lack of genome recombination. Together with oxidative damage, increased mtDNA replication errors also generate mutations 384.

mtDNA is most commonly extracted from saliva or whole blood, with other sources including hair follicles 385. However, some studies have explored extracting mtDNA from disease-related tissues, which assists with exploring heteroplasmies. Despite its widespread use, WES is still limited with regard to investigating mitochondrial disorders 386. In particular, the use of WES as a method of exploring mitochondrial DNA (mtDNA) variants has not been published in the neurotrauma and mTBI literature to date.

Mitochondrial haplogroups 387 and polymorphisms 388 have been shown to play a role in risk and protection from neurodegenerative disorders such as Parkinson’s. Deletions in mitochondrial DNA (mtDNA) have been found to reduce neuronal respiratory processes, in turn contributing to the onset of multiple sclerosis (MS) 389. More importantly, age-related mitochondrial dysfunction following injury has been demonstrated in animal models 390.

Bulstrode and colleagues 187 found a significant association between mitochondrial haplotypes and mTBI outcomes. This association related to the metabolic stress that TBI imparts on the brain, which leads to reduced oxidative phosphorylation (OXPHOS) in neuronal cells 391. It is well established that some haplogroups (K, T) are more efficient in relation to OXPHOS and thus are under-represented in neurodegenerative diseases and over-represented in healthy centenarians. Further, past studies have explored deletions 392 and polymorphisms 393 following severe traumatic brain injuries. In particular, Conley and colleagues 393 found that the mitochondrial polymorphism A10398G, which impacts the function of complex I of mitochondrial OXPHOS, is associated with reduced functional capacity and slower recovery following mTBI. This same polymorphism has also been found to be associated with

neurodegenerative disorders 394. However, to date, there are no studies exploring rare

mitochondrial variants and mutations in relation to PCS.

This is a cohort of varying responses to concussion/head trauma, and while they can be

subgrouped in several ways, the most basic one is that some of them reacted well and others

reacted badly to head trauma. Consequently, this was the grouping adopted for purposes of this

study.

7.1.2 Methods

7.1.2.1 Alignment and Variant Calling

As detailed in the methods chapter, sequencing reads from the Ion platform were realigned to the mitochondrial reference genome. While some studies used the WES alignment to explore mtDNA variants 386, we opted to realign the original files of all the samples (groups 1, 2, and 3 from section 3.2) to the mitochondrial reference genome, in order to capture any possible variants. A few mitochondrial variants fall within the coverage of the exome kits and are often shown when annotating Ion VCFs with third-party software. This method of realigning off-target reads has been shown to be of sufficient quality for clinical diagnostics 395. Mitochondrial variants were then called and annotated using the bcftools suite.

7.1.2.2 Analyses

Online tools were used to identify variants of interest, including MITOMAP and MitoMaster 396. Multiple VCFs were generated at different coverage cut-offs, as mtDNA has been shown to require higher coverage levels than genomic DNA. The VCFs were then converted into FASTA files, the preferred format for MITOMAP. To explore potential variants

of interest, multiple coverage cut-offs were used at first to explore the optimum balance

between false negative and false positive variant calling.
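
The coverage cut-off comparison described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in this study: it assumes that each VCF record carries a read depth in a `DP=` entry of the INFO column, the positions are taken from Table 12, and the depth values, `EXAMPLE_VCF` and the function names are hypothetical.

```python
def read_depth(info_field):
    """Return the DP value from a VCF INFO string, or None if it is absent."""
    for item in info_field.split(";"):
        if item.startswith("DP="):
            return int(item[3:])
    return None


def filter_by_coverage(vcf_lines, min_depth):
    """Keep variant records whose depth is at least `min_depth`."""
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        depth = read_depth(fields[7])  # INFO is the eighth VCF column
        if depth is not None and depth >= min_depth:
            kept.append((int(fields[1]), fields[3], fields[4], depth))
    return kept


# Hypothetical records: positions from Table 12, depths invented for illustration.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chrM\t9055\t.\tG\tA\t50\tPASS\tDP=112",
    "chrM\t12308\t.\tA\tG\t48\tPASS\tDP=64",
    "chrM\t11467\t.\tA\tG\t31\tPASS\tDP=35",
]

# Compare how many calls survive each candidate cut-off.
calls_by_cutoff = {cutoff: filter_by_coverage(EXAMPLE_VCF, cutoff) for cutoff in (30, 60)}
for cutoff, calls in calls_by_cutoff.items():
    print(f"min depth {cutoff}x: {len(calls)} variants retained")
```

Running the same filter at several thresholds makes the trade-off explicit: a lower cut-off retains more calls, at the cost of admitting calls supported by few reads.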

MitoMaster was used to annotate the variants from the FASTA files, which were then uploaded to the MSeqDR mvTool portal. MitoVar was used to generate pathogenicity predictions for the variants identified in this cohort 397. Haplogrep 398 was used to predict the haplogroup of each individual, with results presented in Table 20.

Two databases, ALFA and MGP, provide insight into the relevance of variant frequency in this population. They were accessed initially through the MITOMAP portal. ALFA project: 2743 participants aged 45 to 74 years were included in the ALFA parent cohort. This cohort, mostly composed of cognitively normal offspring of AD patients, is enriched for AD genetic risk factors 399. MGP (Medical Genome Project): MGP contains aggregated information on 267 healthy individuals, representative of the Spanish population, who were used as controls in the MGP. Fisher’s exact test was used to determine whether there are any associations between two categorical variables (i.e. variants). It is particularly useful for small sample sizes 400, as is the case with this study.

7.1.3 Results

Haplotype accuracy seemed to change with the coverage cut-off for the variants included.

While a cut-off of 30x seemed to allow for more variants to be detected, the majority were

deemed to be false positives. Further, a coverage cut-off of 60x increased the overall accuracy

of haplotyping individuals by an average of 10%.

According to in-silico haplotyping results based on uploading mitochondrial variants to

MitoMap, most individuals belonged to U and H haplogroups, as reported in Table 13 and

Appendix 6. While the percentages show a higher number of certain haplogroups in the concussion group compared to the recovery group, Fisher’s exact test between the two groups showed no significant association (p = 0.15). Of the 69 mitochondrial variants in 49 individuals (min coverage 60x), 3 are predicted to be likely pathogenic (Table 12). The variant of utmost interest is an ATPase6 variant (ATPase6:A177T), which was found at a higher frequency in our population than in the databases described in the Methods (ALFA and MGP). It is found at a 12% frequency in our population (not including siblings), which is higher than the highest reported frequency of 4.2%. When more samples were added to the analysis (mostly healthy recovery), the frequency of ATPase6:A177T decreased to 8.6%, which is still higher than the average reported frequency of 3%. It is predicted to be likely pathogenic on MitoVar, which suggests a change in protein function.

The other mitochondrial variant identified (rs2853498) was a tRNA-related variant, which was found at a higher frequency (18%) in this study’s cohort than the 12% reported in other cohorts in public data sets.
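
One way to gauge whether a cohort frequency exceeds a database frequency by more than chance is an exact one-sided binomial test, sketched below. This is an illustration and not an analysis reported in this thesis. The reference frequency of 4.2% is taken from the text above, but the counts (6 carriers among 50 unrelated individuals, giving 12%) are hypothetical, as the exact denominator is not restated here; the sketch also assumes the database frequency can be treated as a fixed value, which ignores its own sampling error.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): an exact one-sided test."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical counts chosen only to reproduce a 12% cohort frequency.
carriers, cohort_size = 6, 50
reference_frequency = 0.042  # highest reported database frequency (from the text)

cohort_frequency = carriers / cohort_size
p_value = binomial_upper_tail(carriers, cohort_size, reference_frequency)
print(f"cohort frequency = {cohort_frequency:.1%}, one-sided p = {p_value:.4f}")
```

A small p-value here would only indicate enrichment relative to that particular reference population; differences in ancestry or haplogroup composition between cohort and database could produce the same pattern.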

Table 12 mtDNA Variants of Interest identified in the cohort

rCRS Position | Ref | Alt | Mut type | Locus | Patient (Samples) Report | Other | dbSNP (MAF) | dbSNP rs number
9055 | G | A | transition | ATPase6 | ATPase6:A177T | PD protective factor | 0.15 (ALFA project) / 0.06 (MGP) | rs193303045
12308 | A | G | transition | L(CUN) | MitoTIP 42.00% | Stroke / Altered brain pH / sCJD | 0.22 (ALFA) / 0.11 (MGP) | rs2853498
11467 | A | G | transition | ND4 | ND4:L236L | Altered brain pH / sCJD | 0.25 (ALFA) / 0.14 (MGP) | rs2853493

Table 13 Haplogroups identified in the cohort

Haplogroup | Recovery | Concussion
H | 8 (11.59%) | 18 (26.09%)
JT | 2 (2.90%) | 6 (8.70%)
N | 4 (5.80%) | 3 (4.35%)
R | 1 (1.45%) | 2 (2.90%)
U | 2 (2.90%) | 18 (26.09%)
X | 0 (0%) | 5 (7.25%)
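
To illustrate how a Fisher’s exact test operates on haplogroup counts such as those in Table 13, the following is a minimal sketch rather than the analysis reported above. It assumes the table is collapsed to a single 2x2 comparison (haplogroup U versus all other haplogroups), which is an illustrative simplification; the function name `fisher_exact_2x2` is hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is as likely as, or less likely than, the observed one.
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))

# Counts taken from Table 13. Collapsing to "U" versus "all other haplogroups"
# is an assumption made for illustration, not the comparison reported in the text.
recovery_u, recovery_other = 2, 15        # 15 = 8 + 2 + 4 + 1 + 0
concussion_u, concussion_other = 18, 34   # 34 = 18 + 6 + 3 + 2 + 5
p_value = fisher_exact_2x2(recovery_u, recovery_other,
                           concussion_u, concussion_other)
print(f"U vs other haplogroups: p = {p_value:.3f}")
```

A test across all six haplogroups at once would require the r x c generalisation of the same idea, which statistical packages provide.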

7.1.4 Discussion

Mitochondrial variants are an emerging area of research for exploring the genetics of

concussion incidence and outcomes. The balance of variants attained through optimising

coverage cut-offs is consistent with the literature wherein mtDNA requires higher levels of

coverage than genomic DNA. In a cohort of severe responders to trivial head trauma,

individuals with persistent post-concussion symptoms, and individuals who recovered from

severe head injuries, there is a higher frequency of the ATPase6:A177T variant than indicated

in other databases. It is also suggested to be likely pathogenic. The role and biology of this

gene, and its link to ATPase processes hint at a possible connection to post-concussion

physiology. Of further interest, other mutations in the ATPase6 gene have been found to inhibit

ATPase activity, causing some cases of maternally inherited ataxia 401. PolyPhen and SIFT

analysis predicted the p.A177T variant to be “Probably damaging” with a PSIC score of 0.845 and

“Affect protein function” with a score of 0.02, respectively. In the wild-type protein, A177 formed three polar interactions, with L173, I174, and M181, while T177 in the altered protein formed four: the same three with L173, I174, and M181, plus an extra interaction with L173 402.

Noer, Sudoyo 403 reported p.A177T in patients with mitochondrial encephalomyopathy, characterized clinically by myoclonic epilepsy and ragged-red fibre (MERRF) syndrome.

Abnormal patterns of mitochondrial translation products were observed in the skeletal muscle of patients, consistent with the expected consequential defect in protein synthesis. Interestingly, this variant is suggested to be a PD protective factor in the MITOMAP database, which warrants further investigation into the biology of Parkinson’s and concussion. However, upon further investigation of the possible reasons for this “PD protective factor” tag in the database, it appears that the protective status is assigned to variants in the UK cluster. Variants in the UK cluster have been shown to be PD protective factors, while those in the H haplogroup are associated with PD risk 404. Nonetheless, studies exploring haplogroups or clusters as risk or protective factors for neurodegenerative diseases should be carefully scrutinised. According to a systematic review 405, the same haplogroup or cluster can confer risk or protection for AD depending on certain demographics (e.g. biological sex). Further, other studies have found this gene to be particularly susceptible to mutations in breast cancer patients 406 and osteosarcoma 407.

Consequently, making any causal connections would be unsubstantiated at the moment. It is important to recognise that mtDNA is highly prone to oxidative damage, and it is highly possible that neurons where heteroplasmy occurs are contributing to the neurodegenerative phenotype with higher penetrance but, because such variants are cell specific, they are not detected in our cohort.

The other 2 variants (rs2853498, rs2853493) found in this cohort are suggested to be risk factors for altered brain pH, as well as sporadic Creutzfeldt–Jakob disease (sCJD), a neurodegenerative disorder. It was found in a higher number of cases and in no controls in an unpublished study reported on the MitoTIP database (Zhang, Dong and Xiaoping, 2014). The fact that both variants seem to be found at a much higher (almost 2x) ratio in AD-related individuals than in healthy controls, with MAFs closer to our cohort, suggests potential involvement in neurodegenerative processes. Two synonymous AA changes, m.11467A>G in ND4 and m.12372G>A in ND5, are more frequently observed in sCJD patients. Rollins, Martin 408 found that individuals with one of these variants have a significantly higher pH value (7.006±0.18 SD) in brain tissues compared with that of controls (6.86±0.18 SD). It has been hypothesized that these variants disrupt coupling processes due to less excess mitochondrial oxidation and decreased H+ ion gradients in the outer membrane. The roles of those two mutations in the pathogenesis of prion diseases remain unsettled 409.

Further, while the results are inconclusive with the current sample size, there seems to be no pattern of haplogroups in any of the sub-groups in this cohort. This is at odds with previous studies that have found a link between certain haplogroups (K and T) and neuroprotection against persistent PCS 190. Nonetheless, given the paucity of healthy controls who never had concussion or mTBI, it is interesting to see no one with the haplogroups K and T. Finally, it is important to acknowledge the role population and sampling bias could have played in the range of haplotypes present in this cohort. Due to the small sample size, it is not possible to ascertain whether different ethnicities are present in our population and hence contribute to the varying haplotypes. Further, these haplotype estimates are broad indicators, seeing that they were derived from WES data. In addition, all our participants have had a head injury in the past, with varying degrees of recovery, which is inherently a sampling bias that could only be remedied by recruiting individuals who had never had a concussion or head injury and including them in the haplotyping analysis.


7.1.5 Conclusion

mtDNA variants extracted from WES data can offer valuable insights into the processes

by which mitochondria are implicated in the development and persistence of concussive

symptoms. In this cohort, an ATPase-related mtDNA variant seems to be represented at a higher

frequency than the average reported in databases. Further, there is little presence of the haplogroups indicated

to be protective from TBI in the literature in this cohort, which might suggest another layer of

inherent risk.

7.2 METHYLATION CHANGES POST-CONCUSSION

This section explores the proposed methodology and rationale for the second part of aim

3.

7.2.1 Background

DNA methylation is one of several epigenetic modifications that regulate the process of

gene expression. Cytosine residues in CpG sites are often the target of methylation, in which

DNA methyltransferases (DNMTs) add a methyl group. Methylation is often considered an expression

inhibitor as it limits the accessibility of DNA during transcription 410. Epigenetic markers are

not static and are prone to external environmental influences 411, including minor injuries.

Further, DNA methylation changes have been indicated as potential biomarkers for specific

traits and disorders 412 413 414 or as universal biomarkers for a wide range of inflammatory and

immune conditions 415. There are two approaches to methylation analysis. The first encompasses identifying genes of

interest and exploring the methylation levels in the CpGs of that gene. A relevant example of

that is the use of BDNF methylation levels as a biomarker for an array of psychiatric disorders

416. A more comprehensive approach involves genome wide methylation analysis (methylome),


which requires a larger sample size and multiple testing corrections but has been promising in substance use disorders 417, smoking status, and depression 418.

With regards to neurological disorders, including post-concussion symptoms, methylation analysis is quite challenging, seeing that brain tissue is often inaccessible for individuals who do not require surgery. Hence, methylation analysis depends predominantly on peripheral tissues, like blood and saliva for example.

A recent study 184 identified that, among other loci, cg20569893 in the FLOT2 region and cg00611535 in RAB5B had statistically significant variations in methylation between paediatric subjects that had experienced concussions and those that had not. There were numerous other genes that had statistical significance, but those two genes had some of the highest predictive

Area Under the Curve (AUC) values. In other words, whether the person has had a concussion or not could be accurately determined based on the methylation levels in those genes.
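
To make the AUC interpretation above concrete, the sketch below computes the AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen control. The scores and labels are hypothetical, the function name `roc_auc` is illustrative, and this is not the procedure used in the cited study.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random case outranks a random control."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairwise wins; a tie counts as half a win.
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical methylation-derived scores; 1 = concussed, 0 = not concussed.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(roc_auc(scores, labels))
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation of the two groups.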

FLOT2 encodes Flotillin-2, a scaffolding protein implicated in various functions, the most relevant being ionotropic glutamate receptor binding. Methylation levels of FLOT2 have been found to play a role in panic disorder 419. FLOT2 is highly expressed in the brain 420.

Flotillin is an integral membrane protein that has also been shown to be an important part of regulating homeostasis and neuronal signalling 421, which is often disrupted in the aftermath of concussion.

As for RAB5B, it is mostly crucial in the Rab5-mediated endolysosomal trafficking pathway 422, and belongs to the Ras Analog in Brain (RAB) proteins. They are small guanosine triphosphatases (GTPases) that belong to the Ras-like GTPase superfamily, and they can regulate vesicle trafficking, which affects apoptosis 423. Most relevantly to neuronal function,


early pathogenesis of Alzheimer’s Disease (AD) 424 has been linked to the dysregulation of

Rab5‐mediated endocytic pathways.

The initial plan for this part of the project was to recruit individuals who are actively at

risk of developing a concussion (e.g. contact sports athletes) and obtain saliva samples

before and after a concussion. However, there were consistent challenges with recruitment and

such samples were not possible to obtain. Therefore, healthy controls with no history of head

trauma were used in comparison to individuals who have been concussed in the past.

7.2.2 Methods

While it is logistically difficult to obtain brain tissues for methylation analysis, blood and

saliva both offer suitable alternatives where specific genomics methylation is representative of

methylation profile in other tissues 425. Further, Braun and colleagues 426 found that saliva yields

closer similarity of DNA methylation (DNAm) to brain tissues than blood. Pyrosequencing is

suggested to measure methylation levels at CpG sites with high accuracy, all the while being

relatively quick, allowing it to be used for screening large groups of samples simultaneously

427. As discussed in chapter 3, pyrosequencing is suggested to provide the best balance in terms

of accuracy, cost-efficiency and ability to sequence individual CpGs.

PyroMark Assay Design software 2.0 (Qiagen Inc., CA) was used to design specific sets

of primers that would amplify the target regions of the CpG sites. Primers are available in

Appendix. Pyrosequencing requires biotinylated PCR amplicons, hence one primer of each

set was biotin labelled. An independent group t-test was used to analyse the differences in

averages of methylation across the two groups. A t-test compares the means of two unrelated

groups regarding a certain variable with the null hypothesis being that there are no significant

differences.
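
The comparison described above can be sketched as follows. This is a minimal illustration that assumes the pooled-variance (Student) form of the independent t-test; whether the pooled or Welch form was used in this study is not stated, and the methylation values below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Student's t statistic and degrees of freedom for two unrelated groups."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance assumes both groups share a common variance.
    pooled = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, n_a + n_b - 2

# Hypothetical methylation percentages at one CpG site (not study data).
concussed = [6.0, 6.5, 5.8, 6.3, 6.4]
controls = [6.1, 5.9, 6.2, 6.0, 6.3]
t_stat, df = independent_t(concussed, controls)
print(f"t = {t_stat:.2f}, df = {df}")
```

The p-value is then read from the t distribution with the returned degrees of freedom; a t statistic close to zero indicates that the group means are similar relative to their variability.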


There were two groups of participants used in this cohort: a) concussion patients recruited through partner neurologists (N = 38); and b) a cohort of 400 healthy individuals whose executive and memory functions are within their age range average. Average age was 40 years old with SD = 18. There were 18 males and 20 females. The healthy individuals have also never reportedly had any head injuries in the past. The data available for these participants include a battery of memory tests as well as executive function tests (Wechsler Memory Test, Trail Making Test, Stroop Effect Test, and Shum Visual Memory) 428. Further details about this cohort, including the tests, are outlined in a study published as part of this candidature 428.

DNA samples underwent bisulphite conversion, which converts unmethylated cytosine to uracil while leaving methyl-cytosine intact. The EZ DNA Methylation kit (ZYMO Research, D5002) was used for the conversion process. PCR reactions were carried out as per the GRC standard protocols and after optimising using temperature gradients. Pyrosequencing was performed for CpG regions of genes established in the literature as either contributing to memory and executive functions or changing in terms of methylation post trauma. Methylation quantification was done via Bio-Molecular Systems' Qseq pyrosequencer. It can sequence up to 48 samples per run, with multiple CpG sites in amplicons up to ~70 bp. Pyromark® Q25 software was used to calculate the methylation percentage at each of the CpG sites. Participants were aged 18 to 68 years and included both males and females. Participants were matched as closely as possible to controls without evidence of concussion (‘GOM’ samples, as they were obtained from a Genetics of Memory study) of the same age and sex, and the comparative methylation of the RAB5B and FLOT2 loci was to be examined. Of all the genes identified in the original study, these 2 genes were the ones for which assays could be designed without the authors’ input, as they were contacted to confirm primer designs. Pyromark software was used to analyse methylation level in CpG mode as it is considered to be less prone to manual errors 429.
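
The principle behind bisulphite-based quantification can be sketched in a few lines. The example assumes the usual reading of a converted template, in which unmethylated cytosine is read as T after PCR, and the common ratio C / (C + T) for the methylation percentage at a CpG site. The sequence, signal values and function names are hypothetical and do not reproduce the Pyromark software.

```python
def bisulphite_convert(sequence, methylated_positions):
    """Return the sequence as read after bisulphite treatment and PCR.

    Unmethylated C is deaminated to U and read as T; methylated C is protected.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(sequence)
    )

def methylation_percent(c_signal, t_signal):
    """Percentage methylation at one CpG from the C and T signal intensities."""
    return 100.0 * c_signal / (c_signal + t_signal)

# Hypothetical amplicon fragment with a methylated cytosine at index 3.
converted = bisulphite_convert("ACGCGTC", {3})
print(converted)
# Hypothetical peak heights at a single CpG site.
print(round(methylation_percent(6.2, 93.8), 1))
```

Because only methylated cytosines survive conversion, the proportion of C signal at a CpG position reflects the fraction of template molecules that were methylated at that site.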

7.2.3 Results

As can be seen in Table 14 below, the average percentage of methylation at all four loci

of RAB5B gene in both the sample population and control groups is extremely similar,

suggesting inconclusive results. Boxplots (Figure 7) show little difference, and an independent

t-test (Table 15) returned no significant p-values. Sample-specific methylation percentages are

reported in Appendix. As for FLOT2 sites, multiple attempts were made to amplify and quantify

the region but none of the used primers were successful.

Table 14: Average methylation across CN samples and GOM controls at four loci in

the RAB5B gene

Site | CN methylation (%) | GOM methylation (%)
1 | 6.2 | 6.1
2 | 5.5 | 5.7
3 | 3.7 | 3.7
4 | 3.4 | 3.9

*CN = Concussion Samples GOM = Genetics of memory cohort


Table 15: Independent-sample t-test comparing methylation levels between concussed and healthy patients at each CpG location

CpG | t-test | p-value
CpG1 | 0.39 | 0.34
CpG2 | -0.28 | 0.38
CpG3 | 0 | 0.5
CpG4 | -1.09 | 0.13

Figure 7: Boxplots demonstrating methylation averages for the 4 CpGs.


7.2.4 Discussion and Limitations

In this pilot study of methylation levels analysis, no significant differences were found

between PCS patients and healthy individuals who had never had a TBI in their life. This

analysis was limited to a single gene (RAB5B) where significant differences had been identified

in a similarly sized paediatric cohort. While the most innocuous explanation for the difference

is that both studies are underpowered and thus do not produce statistically robust results, other

possible explanations need to be explored. The cohorts used in both studies are quite different.

The first difference is age, with the original study being conducted with a paediatric cohort,

where methylation differences might be more malleable to environmental triggers. This is

supported by evidence that methylation levels are most dynamic in the first 10 years of life 430.

Another difference that could contribute to the discrepancy is the source of DNA, wherein

this study employed saliva-extracted DNA and the original used whole-blood DNA.

Nonetheless, there is emerging evidence suggesting that saliva-DNA is the closest in terms of

true representation of brain methylation levels. The healthy cohort was comprised mainly of

young individuals, with over 70% of the population aged between 16 and 25. There were

multiple samples in the PCS cohort whose ages were above 68, and no exact age matches were found

for them. Consequently, identifying whether the methylation differences are due to concussion

or age-related methylation changes could prove difficult to ascertain. Further, there are expected

limitations to be faced when recruiting samples for epigenetic comparison pre- and post-

concussion.


Chapter 8: Predicting outcomes of head trauma using Machine Learning

This chapter details using Machine Learning to explore non-linear interactions between variants that contribute to post-concussion outcomes. A principal component analysis was first undertaken followed by 3 types of classification algorithms (Gradient Boosted Trees, Random

Forests, and Logistic Regression). Results are described in terms of prediction accuracy metrics and discussed in relation to their biological relevance.

The results of this chapter will be published in the manuscript: “O Ibrahim, H

Sutherland, R Lea, L Haupt, L Griffiths. A machine learning approach to predicting neurological outcomes of head trauma.” Planned submission: Neurology Journal, Q1, IF: 8

ABSTRACT

Background:

Concussion is a transient disruption of neuronal homeostasis that is often the result of a head trauma in an acceleration/deceleration context. Most individuals develop non-severe neurological symptoms that resolve in a matter of weeks to months at most. Nonetheless, there are individuals who suffer prolonged and persistent post-concussion symptoms (PCS) following average head injuries. There are also individuals who develop severe neurological dysfunction following trivial head trauma. Finding biomarkers that predict head trauma outcome is of clinical and economic interest seeing that persistent or severe PCS can often be


debilitating. It is also of biological interest as finding the right biomarker will provide further insights into the etiology and development of concussion. To date, no studies have explored the use of Machine Learning methods on genomic data to explore outcomes of head trauma.

Methods:

This study utilised three groups of individuals (N = 60): a) 16 individuals with severe neurological responses to trivial head trauma; b) 26 individuals with persistent PCS; c) 18 individuals with normal recovery from concussion or mTBI. WES was completed using the Ion

Platforms instruments (S5 and Proton). Bioinformatic data cleaning and manipulation was done using Ion Reporter and an in-house suite of tools. Machine learning models were implemented in the RStudio environment. For unsupervised clustering, k-means clustering was used, while

Random Forests and eXtreme Gradient Boosting (xgboost) were used for supervised classification.

Results:

K-means clustering showed no intrinsic grouping in the population, which was supported by a principal component analysis (PCA). By using variants of Minor Allele Frequency between 0.2 and 0.4 in the training set (randomly sampled 70%), both RF and Xgboost distinguished individuals who recovered well from concussion from individuals with severe reactions to trivial head trauma or individuals with persistent PCS with an AUC of 0.7, indicating above-chance prediction. Two genes appeared to be implicated in the prediction: the Tau-related gene

TTBK2 and PPP1R3A, which has a role in insulin resistance.


Discussion:

Machine Learning methods in combination with WES data have the potential to distinguish severe

or prolonged responses to head trauma from healthy recovery. While the numbers are not of

clinical significance, they are indicative of an above-chance trend relative to the sample size.

Metrics including sensitivity, specificity, and kappa were all within acceptable range to support

the prediction accuracy. Furthermore, as sequencing artefacts which are a common occurrence

in WES platforms were manually removed, the final results reflect findings based on inclusion

of true variants.

8.1 BACKGROUND

Machine Learning (ML) is commonly used to describe a set of analyses that employ

statistical and probabilistic methods to infer relationships between variables, and has seen an

increase of utilisation in the biomedical research field in the past 10 years 431. Its use in

genomics has been versatile. To date, ML and, more broadly, Artificial Intelligence (AI) have

been used in genomics mostly in 2 areas: A) AI in genetic sequencing, where machine and deep

learning have been used to call “sequencing” variants based on platform specific signals. This

has provided a major catalyst to the accessibility of sequencing technologies, potentially

lowering cost and decreasing the time needed to sequence whole genomes 432. B) AI in variant

pathogenicity. While the study of genetic causes of diseases depends on these findings, with

tens of thousands of variants expected per individual, it is virtually impossible to manually sift

through them 433, and filtering mechanisms are required to prioritize the likely “pathogenicity”

of detected variants 434. Most of the current in-silico pathogenicity prediction tools depend on

training sets from global genomics databases.


8.1.1 ML in common and rare disease genetic aetiology

One of the biggest challenges of genetics to date is determining “causal” genetic variants,

especially in complex disorders, where tens or hundreds of genes are implicated. Hence,

expanding the current knowledge of variants implicated in idiopathic (organically occurring)

or environment-triggered disease will be paramount to the development of personalised and

precision medicine. The combination of machine learning and genomics technologies offers a

way to advance precision medicine, both in linking phenotypes to risk genes and variants and in

tailoring treatments via pharmacogenomics 435.

There are two widely used approaches to explore the genetic variants associated with a

certain disorder or trait: 1) Genome Wide Association Studies and 2) Rare variant association

studies. The first postulates that common polymorphisms (usually present in >10% of the

population) can in combination contribute to causing complex disorders whose genetic basis has

been difficult to understand. This approach can require hundreds of thousands of samples

to achieve statistical power as there are potentially millions of polymorphisms to account for as

variables. This is where machine and deep learning become instrumental in finding genetic

signatures of possible polymorphism combinations that cluster or classify cases and controls.

The second approach assumes that a combination of rarer variants (in less than 1% of the

population) in each individual contributes to the emergence of certain disorders where

heritability is still to be explained. ML approaches have been instrumental in decoding the

heritability of some complex neurological disorders (e.g. schizophrenia 114 ), where new genes

of interest were detected based on non-linear interactions and NGS data. However, global scale

initiatives remain limited in terms of using ML to find genetic signatures of common disorders

or traits (e.g. intelligence).


Single-variant statistical thresholds have been the basis for GWAS analyses, yielding very low heritability estimates. Some studies have reported low odds ratios/small effect sizes for rare alleles in Crohn’s disease (IBD) when using high throughput sequencing methods and a large number of patients 436, 437. Polygenic traits have been suggested to have weak, hard to detect, genetic interactions between common SNPs contributing to their development 438. It is now predicted that interactions or combinations of variants, either as epistasis (where the effect of one mutation is dependent on the absence or presence of another) or genetic interaction, could hold more answers to heritability estimates of complex traits and disorders than single variants. Nonetheless, it is difficult to find those interactions through linear analyses 439. Consequently, in the context of neurological disorders, where rare variants seem to contribute more, it follows that similar rarely detectable interactions could be part of their development or pathology, especially if “rare variant” disorders are usually studied in the context of novel or rare mutations. Machine learning models have recently gained popularity in polygenic risk score studies, but are not yet widely used to produce large-scale results 440.
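
The difficulty of detecting epistasis with single-variant tests can be shown with a toy example. The sketch below constructs a hypothetical cohort in which the phenotype appears only when exactly one of two variants is present; all data are synthetic and serve only to illustrate the argument above.

```python
from itertools import product

# Toy cohort: every combination of two binary variants, 25 individuals each.
# The phenotype is 1 only when exactly one variant is present (pure epistasis).
cohort = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2) for _ in range(25)]

def case_rate(rows):
    """Proportion of individuals in rows that show the phenotype."""
    return sum(y for _, _, y in rows) / len(rows)

def marginal_effect(locus):
    """Difference in case rate between carriers and non-carriers at one locus."""
    carriers = [r for r in cohort if r[locus] == 1]
    others = [r for r in cohort if r[locus] == 0]
    return case_rate(carriers) - case_rate(others)

# Each variant on its own carries no signal in this construction.
print(marginal_effect(0), marginal_effect(1))

# A rule that considers both loci together recovers the phenotype.
joint_accuracy = sum((a ^ b) == y for a, b, y in cohort) / len(cohort)
print(joint_accuracy)
```

In this construction, a single-variant association test sees no difference at either locus, whereas a model able to represent the interaction (for example, a tree that splits on one variant and then the other) separates cases from controls.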

Nonetheless, depending on the effect size, it is likely that some more common SNPs would be contributing to the phenotypic variability. However, when the sample size at hand is small, it is not possible to include all variants without risking over-fitting, where too many variables create a high prediction accuracy in the training set that cannot be replicated in the testing set.

Neuropsychiatric disorders with multiple susceptibility genes as well as high sensitivity to environmental influences have been difficult to fully understand through regular Mendelian genetics alone. However, using ML approaches has been useful, particularly in the case of autism spectrum disorder (ASD) and schizophrenia 441. Supervised learning methods, in particular, have been able to differentiate between overlapping medical conditions, including different dementias and cognitive impairments 442.


Further, using ML approaches to classify exomes has yielded high accuracy in predictions, and provided new insights into the mechanisms of missing heritability of schizophrenia through a rare variant approach 114. This study considers Post-Concussion

Syndrome (PCS) to be a common neurological/neuropsychiatric disorder. Li and colleagues 115 found that four neuropsychiatric disorders share genes with de novo variants that are suggested to contribute to their etiology. Girard and colleagues 113 have also explored the role of de novo mutations through exome sequencing in schizophrenia as an alternative to common SNP approaches. This approach identified a significantly higher number of variants than expected and provided new insights into the understanding of schizophrenia’s missing heritability genes.

More specifically, pertinent to concussion, is the fact that the neurological hallmark of migraine, cortical spreading depression, as well as cortical and subcortical functional and structural abnormalities in multisensory processing pathways 443, 444 are shared with other neurological disorders, suggesting an underlying genetic pathway to neurological susceptibility, especially as a result of a trigger (stress or injury). ML, contrary to other methods, offers both reliability and generalisability, which is useful when exploring biomarkers for contested clinical diagnoses. Results from multiple studies demonstrate that machine learning-based, nonparametric algorithms seem to outperform linear or regression models in the genomic variant effect context, yielding better prediction 445. As symptom reporting is often subjective, or symptoms are diagnosed by a clinician based on equally subjective reporting/instruments, there have been few attempts to automatically classify different outcomes of mTBI. Similarly, other neurological disorders, including frontotemporal epilepsy, present similar challenges, especially when it comes to subjective methods of diagnosis 446. Using gene/variant features might provide insights into the networks impaired post-concussion that are dependent on different genotypes or impairment to gene functionality. This is supported by research proposing that PCS is linked to structural connectivity changes 38 and cortico-cortical connections rerouting following brain


injury 447. Methods that explore large data sets of genetic information in decision-making

processes rely on either linear regression towards the effects of certain variants, or the genomic

covariance between cases or controls 445.

8.1.2 Machine Learning types and methods

Linear ML models, which do not account for interaction between genetic variants, have

been found to perform on par, no better than simple linear regression models, in the context of

common variant based disorders 439. Furthermore, as this research is primarily focused on rare

variants and the sample size is too small to explore regression models, non-linear ML models

were chosen. Data mining algorithms can learn from examples and model the non-linear

relationships between variables. The most commonly used data mining algorithms for health

care systems include naive Bayes classifiers, neural networks, support vector machines, logistic

regression, fuzzy rules, and decision trees 446. Non-linear interactions are best explored through

ensemble methods like random forests and gradient boosted trees. While regression trees alone

explore higher order interactions, and boosted trees add individual weak learners to create an

overall strong predictor 440, a combination of both approaches might provide insights related to

varying levels of information. Gradient boosting, an ensemble method 448, is also based on

decision trees. A decision tree is a representation of an analogous machine learning approach

used for the classification of objects into decision classes, where an object is represented in a

form of an attribute‐value vector (attribute 1, attribute 2, … attribute N, decision class)

(Quinlan, 2014). Decision trees can be used for two types of problems: classification wherein

the decision class belongs to a categorical variable; and regression, used when the decision class

is a continuous variable. A decision class is one special attribute whose value is known for the

objects in the learning set and which will be predicted according to the induced decision tree

for all further unseen objects. The premise of classification is to predict the value of the decision


class of previously unseen objects based on training with objects for which the decision class

value is known.
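
A single node of a classification tree can be sketched as a search for the attribute and threshold that best separate the decision classes. The example below uses Gini impurity as the split criterion, which is one common choice and an assumption here; the genotype vectors, class labels and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Find the (feature, threshold) whose split minimises weighted Gini impurity."""
    best = (None, None, gini(labels))
    for f in range(len(rows[0])):
        # The largest value is skipped so that both sides are never empty.
        for threshold in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best[2]:
                best = (f, threshold, score)
    return best

# Hypothetical attribute-value vectors: alternate-allele counts at three variants
# per individual, with the decision class held in a parallel list.
rows = [(0, 2, 1), (0, 1, 0), (1, 2, 1), (2, 0, 0), (2, 1, 1), (1, 0, 0)]
labels = ["Recovered", "Recovered", "PCS", "PCS", "PCS", "Recovered"]
feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

A full decision tree repeats this search recursively on each resulting subset until the subsets are pure or a stopping rule is met; unseen objects are then classified by following the learned splits.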

Classifiers typically start with supervised learning phases, in which the model is built

based on training sets, where the values of the decision classes are known to the algorithm 449.

To ensure model utility for various datasets, datasets are often divided into a training and testing

dataset, whereby the latter is never introduced to the algorithm in the learning (training) phase.

Hence the responses of the methods give an idea of how the methods would perform on a new

dataset (new cases).
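
The division into training and testing sets can be sketched as below. The example assumes a 70/30 split and stratifies by group so that each class is represented in both sets; stratification and the function name `stratified_split` are illustrative assumptions, and the labels are placeholders sized after the three groups in this cohort.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, test_idx), keeping class proportions similar in both."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        cut = round(train_fraction * len(members))
        train_idx += members[:cut]
        test_idx += members[cut:]
    return train_idx, test_idx

# Group sizes follow the cohort description (16, 26 and 18 individuals);
# the labels themselves are placeholders.
labels = ["FHM"] * 16 + ["PCS"] * 26 + ["Recovered"] * 18
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

The model is fitted only on the training indices, and the held-out indices are used once, at the end, to estimate performance on new cases.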

[Figure 8 panels: Supervised learning (classification, binary; regression, continuous); Unsupervised learning (clustering, e.g. biological data; dimension reduction, imaging and text); Semi-supervised learning (learning from both labelled and unlabelled data); Reinforcement learning (e.g. robotics and gaming)]

Figure 8: Machine Learning types and approaches


Boosting algorithms are usually good for large data sets, as they create a sequential combination of predictors (variables) 450: a group of predictors is formed along with their residual, to which new predictors are added, leading to an overall prediction score 445. Boosting seems to be an ideal form of dealing with the dimensionality problem 451. That problem, also known as the curse of dimensionality, was first coined by Bellman in 1957, who predicted that the larger data sets become and the more variables they include, the more dimensions of the data will need to be analysed in every step of the algorithm, making its cost almost prohibitive. While boosting works on different types of algorithms, the specific type applied here is gradient boosted trees, which, like random forests, are built from decision trees. Nonetheless, the literature is slowly growing with examples of using classification algorithms in smaller sample sizes. For example, Kassahun and colleagues (Kassahun et al., 2014) used an ML approach to classify a cohort of around 50 individuals, which is similar to the size explored in this study. Regression trees are most useful when investigating nonlinear and higher-order interactions, mainly due to their ability to reweight different weak variants to maximise the prediction power of the model overall 440.
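
The sequential, residual-driven nature of boosting can be illustrated with a minimal sketch. It assumes squared-error loss, a single numeric feature and one-split regression stumps as the weak learners; it is a simplified illustration of the principle and not the xgboost implementation. All data and function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regression stump on one feature (least squares)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def boost(xs, ys, rounds=20, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predict = lambda x: base + learning_rate * sum(s(x) for s in stumps)
    for _ in range(rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

# Hypothetical one-dimensional data with a pattern no single split can capture.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 1, 0, 0, 1, 1]
model = boost(xs, ys)
mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(round(mse, 4))
```

Each stump is a weak predictor on its own, yet the shrunken sum of stumps fitted to successive residuals lowers the training error below that of the mean-only baseline. The learning rate is one form of the regularisation used to limit over-fitting.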

On the other hand, deep network approaches work better for images and signs due to their capacity to reduce noise, but have been shown to be less effective in tabular data prediction (in this case, tables of genetic variants) 452. A common critique of ML methods is that higher prediction accuracy can be a sign of overfitting, wherein the classifier works well on the training data but fails in new test sets. Preventing overfitting can be undertaken in multiple forms, including but not limited to, using more regularised implementations of boosted trees. It can also be undertaken by using a separate set for training and a completely novel testing set to assess the model performance. There are two main possible goals for using ML: prediction or interpretation. While the former is used primarily in clinical settings, the latter is predominantly

helpful in exploratory research. For example, when an algorithm is presented with a set of data points to classify based on a number of predictors, the generative approach builds a full model of each class, whereas the discriminative approach focuses only on separating the two classes 434.

Feature selection is a crucial part of designing any machine learning model, particularly when the sample size (i.e. cases and controls) is smaller than the number of predictors (features) being input into the model. It also allows a priori knowledge to be incorporated to inform the model, making it more biologically relevant, for example.

This study, therefore, aimed to explore the potential of using machine learning models to

elicit further insights from the wealth of data obtained from WES presented in Chapters 5-7.

By incorporating machine learning, there is a better chance of elucidating non-linear interactions,

hence maximising the utility of this experiment and building on the neuronal vulnerability

hypothesis.

8.2 METHODS

There were 3 groups in this cohort (N=60) that were used in the analysis. The first group comprised 16 cases who developed severe reactions to trivial head trauma (referred to as FHM, since their presentations overlap with that condition). Their clinical presentation is detailed in Chapter 5. The second group comprised 26 individuals who reported persistent post-concussion symptoms that were not severe enough to warrant long term medical care (referred to as PCS for short). Their clinical presentation is detailed in Chapter 6. The third group comprised 18 individuals who had sustained a concussion or head trauma but had no severe or long-lasting symptoms (referred to as Recovered). These were either referred through clinicians or recruited directly as individuals who met the criteria. Family members were excluded from this analysis to reduce weighted bias towards over-represented variants.

Whole Exome Sequencing (WES) was completed on the Ion Proton and Ion S5+ instruments, and variants were called using the Torrent Variant Caller (TVC). Variant Call Format (VCF) files were then exported to a UNIX environment for filtering and merging. VCFs were first normalised using the bcftools norm command, then merged using the vcf-merge command, specifying that a reference allele is to be denoted by “0”. Two merged files were created: the first contained samples with severe responses similar to hemiplegic migraine (HM) and samples with non-severe post-concussion symptoms (PCS), while the second contained PCS and healthy recovery (Healthy) samples. The merged files were then annotated using the dbSNP database and the snpEff program. The dbSNP database annotates the variants with multiple prediction scores for pathogenicity (including CADD, SIFT, and Polyphen, which are described in detail in Chapter 3) and minor allele frequencies from multiple public databases (including gnomAD and 1000 Genomes). snpEff was then used to create filtered VCFs based on coverage above 20x and minor allele frequency < 0.01. Quality Controls (QC) seem to influence prediction accuracy and AUC figures 453, 454. As one established way to mitigate biases is using higher QC thresholds 439, data was filtered for variants with >20x coverage. Genotype Quality (a metric produced by the variant caller as a confidence measure of genotyping accuracy) above 90 was also used to filter variants, in an attempt to reduce sequencing artefacts. To avoid missing rare variants that might, for example, be synonymous, no filtering was done for variant type. Samples were sequenced in random pairings (on sequencing chips), with sequencing runs often including individuals from different groups. This randomisation reduces any batch effects that might arise.
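
The three quality thresholds described above (coverage above 20x, minor allele frequency below 0.01 and Genotype Quality above 90) amount to a simple predicate applied to each variant. The sketch below is illustrative Python only; the study performed this step with snpEff in a UNIX environment. The field names depth, maf and gq, the function name passes_qc and the example records are all hypothetical.

```python
def passes_qc(variant, min_depth=20, max_maf=0.01, min_gq=90):
    """Keep a variant only if it clears all three thresholds from the text."""
    return (variant["depth"] > min_depth
            and variant["maf"] < max_maf
            and variant["gq"] > min_gq)


# Invented example records; identifiers and values are placeholders.
variants = [
    {"id": "var1", "depth": 45, "maf": 0.004, "gq": 99},  # passes all filters
    {"id": "var2", "depth": 12, "maf": 0.002, "gq": 95},  # coverage too low
    {"id": "var3", "depth": 60, "maf": 0.150, "gq": 99},  # too common
    {"id": "var4", "depth": 38, "maf": 0.008, "gq": 40},  # low genotype quality
]

kept = [v["id"] for v in variants if passes_qc(v)]
print(kept)
```

Note that no field for variant type appears in the predicate, mirroring the decision not to filter on variant type.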

For the VCF of HM and PCS, the idea was to explore all rare variants and their ability to contribute to the model, hence no prediction scores were used to filter variants. For PCS and Healthy recovery samples, a rare-variants VCF was created, as well as one with more common variants that are predicted to be deleterious (SIFT deleterious, Polyphen Probably Damaging, and CADD > 5). This second approach was justified as their presentation is more common and therefore it is possible that common variants may be more useful for classification. The VCFs were then imported to RStudio, where the variants were converted into tabular format using the adegenet and vcfR packages.

8.2.1 PCA

Principal Component Analysis (PCA), a statistical method to reveal existing structure in a data set, has been widely used in population genetics to control for population structure or stratification that may lead to biased data in genetic association studies 455 456, and was applied to classification in this study. A PCA was done using k-means clustering and discriminant analysis of principal components (DAPC) to make sure there was no underlying population stratification or structure that could inflate certain classification results. This aimed at exploring broader within-sample similarities or differences. The analysis was done in RStudio using k-means packages and the dapc function. Results were visualised through the ggplot2 package.

8.2.2 Classification

For the classification analysis, 3 types of supervised algorithms were used: 1) gradient boosted trees with regularisation (Extreme Gradient Boosting implementation) using the xgboost package; 2) Random Forests using the randomForest package; and 3) Logistic Regression using the glm function.

The training and test R code used for the analysis is available for each of the algorithms, with HM/PCS data and PCS/Healthy recovery data. Boosting is prone to overfitting 450, 457, which is why using a low learning rate is recommended and is what was implemented (eta = 0.01). There were 1088 variants used to classify HM and recovery, 1311 variants used to classify PCS and HM, and 1330 variants used to classify PCS and recovery.

Using the prediction values generated on the test set for each model, as well as the true membership values of each sample, a confusion matrix was generated along with accuracy, sensitivity, specificity, and Kappa statistics using the confusionMatrix function of the caret package. Furthermore, to evaluate the prediction accuracy and performance of the models, the Receiver Operating Characteristic (ROC) curve was plotted to explore the trade-off between true positives and false positives, and the Area Under the Curve (AUC) for each model was calculated. Wray and colleagues 458 explored the merits of using ROC curves in genomics instead of a prediction threshold, and found that they provide a more robust result that is adaptable to clinical situations, which are rarely an either-or scenario. The shape of the curve suggests high sensitivity or specificity, often assisting with inferring basic information about the model 459.
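
As an illustration of how the statistics above follow from a confusion matrix, the sketch below recomputes them for the third matrix in Table 18 (prediction rows 8, 2 and 1, 3). It is written in Python rather than with the caret package used in the study. Two points are assumptions: that the flattened table reads as rows = prediction and columns = reference, and that class 0 is treated as the positive class. The function name confusion_metrics is hypothetical.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from four counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "kappa": kappa,
    }


# Third matrix of Table 18, with class 0 taken as "positive" (an assumption).
m = confusion_metrics(tp=8, fp=2, fn=1, tn=3)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the sketch gives a sensitivity of about 0.89, a specificity of 0.6, a balanced accuracy of about 0.74 and a kappa of about 0.51.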

The number of folds was adjusted as well as the learning rate to achieve optimum balance between sensitivity and specificity.

Each model was tested once with internal bootstrapping, then the sampling was repeated 10,000-100,000 times to ensure that all combinations of samples (different controls and cases) produced a consistent average of accuracy, area under the curve, sensitivity and specificity, confirming that the prediction accuracy is not dependent on a certain division of the sample, as adopted in 439. Using a 70%-30% cut-off for the train/test data sets is a more conservative and cautious way of avoiding overfitting; in a study using machine learning to predict good neurological outcomes, the cut-off was 80%-20% for training-test, which still accounted for overfitting 460. Multiple attempts at variant filtering were done to reach the best feature (variant) selection for each model. All the variants included in each analysis iteration as the top predictors were visualised in IGV to exclude obvious sequencing artefacts. When a variant proved to be a sequencing artefact, it was removed from the analysis and the entire training and testing process was repeated.
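
The repeated 70%-30% resampling described above can be sketched as follows. This is an illustrative Python outline, not the R code used in the study. It assumes that each split is stratified by class, which the text does not state, and the function names and the evaluate callback (standing in for training and scoring a model) are hypothetical.

```python
import random


def stratified_split(labels, train_frac=0.7, rng=random):
    """Return (train_idx, test_idx), sampling train_frac of each class."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return train, test


def repeated_evaluation(labels, evaluate, n_repeats=100, seed=0):
    """Average a score over many random train/test divisions."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        train, test = stratified_split(labels, rng=rng)
        scores.append(evaluate(train, test))
    return sum(scores) / len(scores)


# Group sizes from the Methods: 26 PCS (coded 1) and 18 Recovered (coded 0).
labels = [1] * 26 + [0] * 18


def majority_class_accuracy(train, test):
    """Placeholder scorer: accuracy of always predicting class 1 on the test set."""
    return sum(labels[i] == 1 for i in test) / len(test)


mean_score = repeated_evaluation(labels, majority_class_accuracy)
print(round(mean_score, 3))
```

In practice the placeholder scorer would be replaced by a function that fits the classifier on the training indices and returns accuracy, AUC, sensitivity or specificity on the test indices.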

8.3 RESULTS

8.3.1 PCA

Using k-means clustering algorithms to determine any existing groups or population substructures yielded no internal groups, both when using all the variants in the 60 individuals and when using only rare variants (MAF under 0.01). This is demonstrated by Table 16, where Bayesian information criterion (BIC) values, which should decrease dramatically when the optimum number of K (internal groups) is found, remain constant. It is further confirmed in Figures 9a and 9b, where no sharp drop or “elbow” can be seen on the plot when allowing the algorithm to explore the maximum possible K value, a different analysis from that reported in Table 16.

Discriminant Analysis of Principal Components (DAPC) was then used to further investigate internal grouping based on variants and assign individuals to prospective groups. It can be seen in Figures 10a and 10b that all individuals were predicted to be in the same group, except for 1, 2, 3 or 4 individuals that were forced to belong to another group when the DAPC algorithm assumed K = 2, 3, 4 or 5, respectively. It is evident that the placement of those individuals is not consistent and thus it is highly unlikely that they are outliers. In other words, when running the algorithm with the assumption that there are 2 groups, only one random individual is assigned to the 2nd group while the rest group together; likewise, when assuming 3 groups from the beginning, only 2 random individuals are assigned to the 2nd and 3rd groups while the rest group together. Ancestry does not seem to be the deciding factor in choosing said random individuals. Therefore, there does not appear to be underlying sub-structure within the population or between the subgroups, which justified proceeding with the classification analysis.

Table 16 K-means Clustering Results

MAF < 0.01           All Variants
Group K    BIC       Group K    BIC
1          570       1          383
2          570       2          383
3          570       3          383
4          570       4          383
5          570       5          383
6          570       6          383

Figure 9: Number of groups and corresponding BIC.

Panel 9a (left) shows k-means clustering with all variants, while panel 9b (right) shows k-means clustering with variants of MAF < 0.01.

Figure 10 K-means clustering results

Panel 10a (top) shows DAPC results with all variants, while panel 10b (bottom) shows DAPC results with variants of MAF < 0.01.

8.3.2 Classifying Severe Responses to head trauma and persistent PCS

To determine whether we could use ML to distinguish HM-like individuals (n = 16) from persistent PCS individuals (n = 26), xgboost was applied to the tabular data of variants with a coverage of over 20x in all cases and a MAF of less than 0.01. In machine learning, the area under the curve (AUC) is a measure of the ability of a classifier to distinguish between classes and is used as a summary of the ROC curve, with higher values indicating better predictive accuracy; a value above 0.5 means above chance (which is 50% for each class in binary classifications). Kappa, on the other hand, measures the validity of the prediction accuracy, with values of 0.3-0.5 reflecting acceptable validity, and 0.6-0.7 reflecting strong validity.
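
One way to make the meaning of the AUC concrete is its rank interpretation: it equals the probability that a randomly chosen case receives a higher prediction score than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way; it is an illustration and not the routine used in the study, and the function name auc and the example scores are invented.

```python
def auc(scores, labels):
    """AUC as the fraction of case/control pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(cases) * len(controls))


labels = [1, 1, 0, 1, 0]

# A model with some discriminative ability (invented scores).
informative = auc([0.9, 0.8, 0.4, 0.3, 0.2], labels)

# A model that gives every sample the same score, i.e. assigns all samples
# to one class: every pair is a tie, so the AUC is exactly 0.5.
constant = auc([0.5, 0.5, 0.5, 0.5, 0.5], labels)

print(informative, constant)
```

The constant-score case shows why a classifier that places every sample in one class (sensitivity 1 and specificity 0, or the reverse) yields an AUC of 0.5.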

In the first instance, an AUC of 0.95 and a Kappa of 0.83 were achieved, using all variants with an allele frequency of 0.01 or less. Nonetheless, upon examining the variants contributing to the model, the top variants appeared to be sequencing artefacts (i.e. present in all samples with a threshold-like allele ratio, hence they were randomly called as variants in some cases and as wild-types in the others). Removing these artefacts resulted in a drop of the AUC to 0.5. However, this figure was consistently obtained through a specificity of 1 and sensitivity of 0, rather than an overall reduction in both sensitivity and specificity. An AUC of 0.5, with a sensitivity of 1 and a specificity of 0, was likewise obtained when only variants with CADD over 15 were included to ensure functional impact.

8.3.3 Classifying Severe responses to head trauma and healthy recovery from mTBI

To determine whether we could use ML to distinguish HM-like individuals (n = 16) from individuals who recovered following a concussion, xgboost was used on multiple VCFs with different variant filtering thresholds. The first instance included rare variants with MAF below 0.01. However, this resulted in a specificity of almost 0, and the AUC was 0.5. When using variants with MAF of 0.1 - 0.4 together with CADD, and after removing the previously known sequencing artefacts, an AUC of 0.71 was obtained, suggesting a trend. Further details on the model performance can be found in Tables 17 and 18. As per Table 19, three variants contributed to over 0.5 of the model: 1) chr15:43170793 (0.43), 2) chr11:5411398 (0.23) and 3) chr6:31238155 (0.04), and were confirmed to be real variants by visually inspecting the BAM files in the Integrative Genomics Viewer (IGV).

Classifying Persistent PCS vs Healthy Recovery

Multiple attempts at variant filtering were done to reach the best feature (variant) selection for this model. The first one included variants with MAF below 0.01, similar to the previous model, to classify individuals with persistent PCS and individuals who recovered following a concussion, but accuracy was very low (< 50%) using the three algorithms. The second attempt included variants with MAF between 0.01 and 0.1, and similar results for accuracy and AUC were obtained. In those two instances the AUC value varied between 0.4 and 0.5, but there were scenarios where sensitivity and specificity were as polar as when including severe responses to trivial head trauma cases. Finally, using variants of higher frequency, i.e. MAF between 0.2 and 0.5, a high accuracy of classification was obtained, as per Table 17.

Gradient boosted trees were able to accurately classify 87.5% of the cases and controls, with an AUC of 0.875 and a Kappa of 0.7 as presented in Table 17. The confusion matrices (Table 18) show a balanced ratio of cases and controls with which to interpret Kappa. Random Forests achieved similar results, with an AUC of 0.9 and a Kappa of 0.69 as reported in Table 17. However, similar to the other model (HM and PCS), logistic regression (glm) was the lowest performing model, with an AUC of 0.78 and a Kappa of 0.57. As per Table 19, two variants contributed to over 0.5 of the model: 1) chr7:113518434 (0.4) and 2) chr1:152749091 (0.1).

Table 17: Performance of different algorithms on various pairings of subgroups of responses to head trauma.

HM: individuals with hemiplegic-like symptoms with severe reactions to trivial head trauma. PCS: individuals with persistent post-concussion symptoms. Recovery: individuals who recovered in the normal timeframe from a diagnosed concussion/mTBI.

                     HM/Recovery              HM/PCS                   PCS/Recovery
                     (0.1 < MAF < 0.4)        (0.01 & CADD > 15)       (0.1 < MAF < 0.4)
                     & CADD > 15
                     xgboost    Rf            xgboost    Rf            xgboost    Rf
Balanced Accuracy    50         50            71.6       73            74         71
AUC                  0.5        0.5           0.716      0.732         0.74       0.71
Sensitivity          1          1             0.83       0.71          0.88       0.84
Specificity          0          0             0.6        1             0.6        0.6
Kappa                0          0             0.4        0.85          0.51       0.49

Table 18: Confusion matrices for three different models where 0 is control and 1 is a case, depending on the model. The higher the number is in the intersecting cell between respective 0 or 1 rows or columns, the more accurate the prediction is.

              HM/Recovery              HM/PCS                   PCS/Recovery
              (0.1 < MAF < 0.4)        (0.01 & CADD > 15)       (0.1 < MAF < 0.4)
              & CADD > 15
              Reference                Reference                Reference
Prediction    0        1               0        1               0        1
0             8        4               5        1               8        2
1             0        0               2        3               1        3

Table 19 Variants of most importance to the two models with above chance (AUC > 0.5) prediction accuracy

Feature            Gain    Frequency in    gnomAD       Gene       SIFT                          PolyPhen
                           population      frequency

FHM vs Recovery
chr15:43170793     0.47    0.3             0.23         TTBK2/     tolerated_low_confidence      benign               0.37
chr11:5411398      0.23    0.16            0.11         OR51M1     Deleterious                   Possibly_damaging    0.36
chr6:31238155      0.04    0.04            0.02         HLA-C      deleterious_low_confidence    benign               0.3

PCS vs Recovery
chr7:113518434     0.45    0.36            0.29         PPP1R3A    deleterious                   probably_damaging    0.2
chr1:152749091     0.15    0.14            0.11         LCE1F      tolerated_low_confidence      unknown              0.16
chr10:105205302    0.08    0.12            0.14         PDCD11     deleterious_low_confidence    possibly_damaging    0.37

8.4 DISCUSSION

This study utilised the WES data of 60 individuals with varied responses to head trauma and mTBI. Machine Learning classifier algorithms were used to classify individuals with severe responses to trivial head trauma and individuals with non-severe persistent post-concussion symptoms based on rare variants. They were also used to classify individuals with persistent post-concussion symptoms and individuals with normal recovery following medically diagnosed mTBI. Initial results of some models showed above average prediction accuracy, with various figures including sensitivity, specificity, AUC, and Kappa supporting that accuracy level. The results are discussed below in further detail. Overall, the initial results for all of these models appeared well above average, but when accounting for sequencing artefacts several models performed quite poorly, mostly at chance level, while others were still able to achieve an above-chance prediction accuracy.

8.4.1 Subgrouping logic and outcomes

This study recruited a group of individuals who have all had a head injury of some sort,

and the clearest grouping was by two factors, severity of the symptoms relative to the severity

of the injury, and persistence of the symptoms after the normal recovery times (up to 3 months).

Based on clinical notes and self-reported symptoms, the cohort was divided into 3 groups that

have different outcomes following head trauma.

Non-linear ML algorithms (xgboost and random forests) were applied to classify three pairings of these subgroups. The pairing with the least prediction success was that distinguishing individuals with severe neurological dysfunction following mTBI or trivial head trauma from individuals who developed persistent, less severe post-concussion symptoms. As per Table 17, despite obtaining an AUC above the clinical threshold of 0.85, a sensitivity and specificity above 85% and a kappa value of 0.8, further examination of the variables contributing to the model suggested that the top ones were sequencing artefacts. That is, they were variants of high coverage and genotyping quality but were present in all samples due to sequencing errors, and were only counted as variables in some samples due to chance differences in variant calling, a process that has been demonstrated in Chapter 4. This result (AUC 0.8) could not be replicated after removing those sequencing artefacts, regardless of the method used for variant filtering. The analyses (training and testing) were rerun after removing all the sequencing artefacts identified earlier. There could be several explanations for this. The most obvious one is that the sample size is not large enough to distinguish between two cohorts of individuals who both have neurological symptoms, albeit of different severities and in response to varying head injuries.

Further support for this reasoning comes from the successful attempts to classify each of these clinical/symptomatic cohorts from a group of people who have sustained brain injuries/head trauma and never developed any severe or long-term sequelae.

Those classifications were undertaken by training the models on a subset of 70% of the total group. The models were first trained using a set of variants that had a minor allele frequency of less than 0.01, in line with the rare variant approach discussed earlier. However, rare variants performed poorly in all classifications. This is potentially due to the small sample size, and the fact that no rare variant would be represented at a high enough frequency to create a trend. Instead, only functional common variants (CADD > 15) achieved high enough accuracy and Kappa levels relative to the sample size.

8.4.2 Biological Relevance

Besides the potential to use the algorithm in clinical outcome prediction, classifiers

produce a list of variable importance to the prediction process. Seeing that this model is based

on small sample size and the prediction accuracy is below the clinical utility threshold (AUC =

0.85), it is the biological insights that provide more value to understanding responses to head

trauma and outcomes of concussion. Of all the variants that were identified as highest

contributors to the models, the top variants in each of the models classifying symptomatic

individuals from healthy recovery individuals seemed to have the most biological relevance.

For classifying individuals with HM-like symptoms following trivial head trauma, a

variant in the Tau tubulin kinase 2 (TTBK2) gene was the strongest predictor (Table 19). It is a

kinase that phosphorylates Tau and tubulin and has been shown to be involved in several crucial

cellular and neural processes 461. Tau is a protein discussed in the literature review as a target

of interest in neurodegeneration and concussion studies. Tau has 4 segments, with 85 potential

phosphorylation sites 462. Those sites, and the process of phosphorylation, have long been

implicated in neurodegeneration 463, which is closely linked to brain injury. Aggregation of

Tau, a hallmark of AD, occurs partially due to hyperphosphorylation 464. This aggregation is

detrimental to Tau’s healthy functioning in axonal growth, vesicle transport, and signal

propagation mediated by microtubules, with those effects suggested to be linked to

AD/neurodegeneration 465. It is also worth noting that all the 10 sites in Tau that TTBK2 can

phosphorylate are related to AD 466. The variant identified in this cohort has been heavily

reported in the ClinVar database. However, it was shown to be benign for AD. Nonetheless, this is

in line with the neuronal vulnerability hypothesis on which this research is built, whereby

variants that are biologically relevant to severe neurological dysfunction but are not causal

precipitate a higher risk of pathologic response to a trigger like head trauma or brain injury.

On another relevant note, TTBK2 mutations that cause premature truncations of the protein have been shown to cause spinocerebellar ataxia type 11 (SCA11), a rare, dominantly inherited human ataxia characterized by atrophy of Purkinje neurons in the cerebellum 467.

In the model classifying persistent PCS from healthy mTBI recovery, the variant (Table 19) with the most importance was in PPP1R3A, a gene not directly linked to mTBI but rather to insulin resistance (IR) 468. It is not highly expressed in the brain. However, seeing that it was confirmed not to be a sequencing artefact, further exploration is warranted. Recent research has demonstrated a crucial role for insulin in regulating the brain’s functions and energy homeostasis, as opposed to the historical view that the brain is insulin-independent 469. Brain IR can develop intrinsically in transient ways to regulate bodily functions, but it can also occur in response to environmental triggers such as brain injury. Circulating insulin is often accessible to the brain through the blood-brain barrier, which is often impacted in the aftermath of a concussion, and maintains two functions: controlling food intake and regulating cognitive functions. It is no surprise that there is accumulating evidence that dysregulated insulin signalling is strongly implicated in neurodegeneration 470. There is an emerging body of literature linking insulin resistance to neuroprotective and healing mechanisms following mTBI. Several reports have demonstrated that TBI reduces insulin sensitivity and secretion, contributing to IR and glucose homeostasis impairment 471 472. Further, it is documented that TBI launches a cascade of inflammatory events that lead to downstream inactivation of insulin receptor substrate proteins and subsequently insulin signalling 473. One prominent downstream target of insulin signalling is AKT, a neuroprotective kinase that also influences neuronal energy metabolism 474. Most relevantly, this disruption therefore affects the neuroprotective mechanisms of AKT and its role in energy metabolism. Susceptibility to insulin resistance, or developing it following traumatic brain injury, could therefore increase vulnerability to neurodegeneration and other concussion symptoms in the absence of such a neuroprotective pathway. The variant with the highest importance in classifying PCS from healthy recovery is documented in ClinVar as benign for insulin resistance.

Performance Evaluation

Machine learning boosting methods often require smaller samples than gene-score based methods. Several studies have demonstrated high performance on WES samples with samples of similar size to this study 459. This is demonstrated by the level of acceptable preliminary results that boosting methods have provided using our small initial sample. Calibration is essential when predicting observed traits based on gene traits, and unlike regression-based predictiveness, it is often calculated from the absolute difference between predicted and actual phenotype 440. In this case, there was little calibration needed beyond the model parameters, seeing that the phenotypes were of binary nature. We acknowledge that a binary classification is somewhat reductionist seeing the wide variety of presentations in our cohort. However, due to the sample size and the varied amount of clinical information collected for each sample, this exploratory trial of ML considered the outcomes of mTBI as discrete or categorical variables.

Despite the small sample size, there were several steps taken to reduce biases. One way to reduce sequencing run bias, especially when it comes to cases and controls, is to split them between sequencing runs. As the sequencing runs for this study included samples that were both cases and controls, on average, it is highly unlikely that a certain group shares the same sequencing artefacts. On the off chance that this is the case, each model’s top variants were manually scrutinised in IGV to ensure the variants were real.

Accuracy is a commonly used initial indicator of a model’s performance, and it is calculated as the fraction of correctly classified values of the test set (i.e. cases or controls). However, the reliability of accuracy as a predictor of model performance is highly contested 475. In particular, high accuracy should not be equated with high performance or prediction accuracy on other test sets 476. The failings of accuracy as a measure are often accentuated in cases where there is a significant imbalance between classes. Highly imbalanced or skewed training data is a common occurrence in medical data sets, in particular those including cases and controls. Hence, a classifier assigning a case status to every sample in a test set in which the majority are cases may have high accuracy, but no proper real-world application. The conditions under which cohorts or samples are recruited often mean that the ratio of cases and controls might not necessarily reflect reality 477. Consequently, rebalancing the data by ensuring that a minimum number of cases and controls is used in both the training and test sets is one advisable way to control for that skewness. To further demonstrate the pitfalls of overall accuracy, Lu and colleagues 478 presented an example in which, for a test set containing 99 controls but only one case, an underperforming classifier that predicts all samples to be controls will have an apparent overall accuracy of 99/100 = 0.99, even though the accuracy for the case class is 0.

The overall accuracy of the predictions (that is, the percentage of predictions that were correct) was 99.9%. However, a null classifier which is specifically designed to predict everything as controls will achieve a similar accuracy, showing the drawbacks of solely relying on accuracy as a performance measure 478.
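
The 99-controls-and-one-case example above can be reproduced in a few lines. The sketch below is illustrative Python with hypothetical variable names; it shows that a null classifier reaches an overall accuracy of 0.99 while its accuracy on the case class is 0, and that averaging the two per-class accuracies (balanced accuracy) exposes the problem.

```python
labels = [0] * 99 + [1]          # 99 controls (0) and a single case (1)
predictions = [0] * len(labels)  # null classifier: everything is a control

# Overall accuracy: fraction of all samples classified correctly.
correct = sum(p == y for p, y in zip(predictions, labels))
accuracy = correct / len(labels)

# Per-class accuracy: fraction of each class classified correctly.
case_preds = [p for p, y in zip(predictions, labels) if y == 1]
control_preds = [p for p, y in zip(predictions, labels) if y == 0]
case_accuracy = sum(p == 1 for p in case_preds) / len(case_preds)
control_accuracy = sum(p == 0 for p in control_preds) / len(control_preds)

# Balanced accuracy weights both classes equally regardless of their size.
balanced_accuracy = (case_accuracy + control_accuracy) / 2

print(accuracy, case_accuracy, balanced_accuracy)
```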

Consequently, other metrics were used to measure the performance of the algorithms, including the Area under the Curve (AUC) of Receiver-Operating-Characteristic (ROC). AUC is considered the de facto standard for performance visualization for binary classification, but its generalization to higher class numbers is not as effective 479. While AUC, sensitivity, and specificity are limited to binary classification, Kappa is sometimes used in both binary and multi-class situations. Moreover, while there are some suggestions that Kappa occasionally

performs in anomalous ways in multi-class classification, no such evidence of this occurring in binary classification has been reported in the literature 479. Kappa gives a better indicator of how the classifier performed across all instances because a simple accuracy can be skewed if the class distribution is similarly skewed 480.

Kappa is often less misleading than accuracy as a crude figure, as it takes into account the probability that results will agree with a random classifier by chance 481. For example, suppose that, based on the observed population frequencies, the expected accuracy of prediction is 75%. A prediction accuracy of 80% is then much less of an improvement above chance than it would be if the expected accuracy were 50%. Despite its apparent importance as a figure, there is no standardised interpretation of the kappa statistic. Landis and Koch 482 consider 0-0.20 as slight, 0.21-0.40 as fair, 0.41-0.60 as moderate, 0.61-0.80 as substantial, and 0.81-1 as almost perfect. On the other hand, Fleiss 483 considers kappas > 0.75 as excellent, 0.40-0.75 as fair to good, and < 0.40 as poor. It is important to note that both scales are somewhat arbitrary 484. At least two further considerations should be taken into account when interpreting the kappa statistic. Kappa is best interpreted in light of a confusion matrix, which provides an insight into the balance or lack thereof in the population. Confusion matrices are presented in the results section, showing a somewhat balanced distribution of cases and controls. Setting an accuracy benchmark, especially in exploratory machine learning, is an arbitrary process, which is mostly dependent on the end goal or the main purpose of the analysis. In this study, both classification models had acceptable Kappa values of above 0.4, higher than chance sensitivity and specificity (> 60%), as well as adjusted accuracy that is reflective of balanced numbers (> 70%).
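
The comparison of 80% observed accuracy against expected accuracies of 75% and 50%, together with the two interpretation scales quoted above, can be worked through as a short sketch. The Python below is illustrative only. The function names are hypothetical, values are rounded to two decimals before banding because the scales are stated to two decimals, and the label returned for negative kappa is an assumption since the quoted bands start at 0.

```python
def kappa(observed, expected):
    """Cohen's kappa from observed and chance-expected accuracy."""
    return (observed - expected) / (1 - expected)


def landis_koch(k):
    """Band a kappa value using the Landis and Koch ranges quoted in the text."""
    k = round(k, 2)
    if k < 0:
        return "below chance"  # not part of the quoted bands (assumption)
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if k <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


def fleiss(k):
    """Band a kappa value using the Fleiss ranges quoted in the text."""
    k = round(k, 2)
    if k > 0.75:
        return "excellent"
    return "fair to good" if k >= 0.40 else "poor"


k_skewed = kappa(0.80, 0.75)    # expected accuracy 75%: kappa is about 0.2
k_balanced = kappa(0.80, 0.50)  # expected accuracy 50%: kappa is about 0.6

for k in (k_skewed, k_balanced):
    print(round(k, 2), landis_koch(k), fleiss(k))
```

The same 80% accuracy therefore falls in the "slight" band when chance agreement is 75% but in the "moderate" band when chance agreement is 50%.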

Our feature selection was hypothesis driven at first, including only rare deleterious

variants. However, while that might have produced high prediction levels for cases that showed

severe, in other words, rare presentations, it seemed to have little value by way of classifying

individuals with more common symptoms from controls who developed no symptoms

following mTBI. This is where the importance of Gain Metric is demonstrated. It is assumed to

be one of the better predictors of relative importance to the model 114. It is entirely possible that

moving forward, a combination of algorithms will be needed to establish the best prediction

power. It is known in the computational intelligence literature that no single algorithm is the

best for all problems, a concept referred to as the “no free lunch” theorem 477, 485.

Machine Learning results are paired with inferential statistics, which demonstrate whether a sample of results supports a certain hypothesis and whether the conclusions reached can be generalized beyond what was tested 486. Single-problem analysis is quite common in classification settings, and numerous statistical tests have been developed to assess accuracy from early on 487. Although the conditions required for using parametric statistics are not usually checked, a parametric statistical study can reach conclusions similar to those of a non-parametric one. Our problem here, given that the cohort does not follow a normal distribution with regard to the phenotypes exhibited, is a single-problem analysis and a non-parametric one. While the principles of machine learning are not new, advances in technology over the years have made its application to a wide variety of problems far more accessible than previously thought.

8.5 CONCLUSION

This study provided “proof-of-concept” level evidence for the utility of Machine Learning models, particularly decision-tree-based ones, in providing further insights into the biology of concussion. These models will need to be applied to larger cohorts and compared against control populations (e.g. UK Biobank). There are several limitations that need to be acknowledged, aside from the sample size. Only SNPs were included in this analysis, to reduce any multidimensionality biases. However, that potentially limited the information that indels in the populations (especially FHM) could provide to the prediction models. The next step will be to expand this cohort and use it alongside a larger healthy control group; replicating the same biological and statistical findings would be the only way to move towards clinical utility. Non-linear Machine Learning algorithms, in particular those built on decision trees, can classify individuals into general mTBI response groups with high accuracy using WES data. As these results demonstrate, there is considerable potential for incorporating ML into the routine assessment and treatment of head trauma/mTBI. While at the current level of understanding the model is far from being used in clinical settings, there are two main applications of the approach of combining machine learning and whole exome sequencing in concussion genetics. The first is immediate and pertains to the genetic variants found to have high importance in classifying different groups or phenotypes. These variants, and indeed their respective genes, offer an insight into the biological underpinnings of persistent post-concussion symptoms and severe reactions to trivial head trauma. These insights could be used for more tailored treatments as pharmacogenomics becomes more commonplace in everyday healthcare. The other application, which is more future oriented and dependent on replicating these results in larger and more diverse cohorts, is population screening to offer advice with regard to career or sporting choices.

Chapter 9: Discussion, Future Directions and Concluding Remarks.

This chapter begins with a summary and a discussion of key findings discovered during this

candidature (Section 9.1). Section 9.2 discusses the limitations associated with the

project design. Section 9.3 lists possible future directions that can be used to further

advance this study. Section 9.4 describes how this study can be translated for use in a

clinical setting, and Section 9.5 completes this thesis with final concluding remarks.

9.1 KEY FINDINGS

9.1.1 Saliva as a comparable source of quality DNA

This study explored saliva as a DNA source yielding similar sequencing quality to whole

blood-extracted DNA in a WES experiment using an Ion platform. While previous studies have

explored this in NGS targeted-gene panels 218, to our knowledge this was the first study to

compare whole blood and saliva DNA performance in Ion platform WES in healthy unrelated

individuals. Whole blood- and saliva-extracted genomic DNA did not yield significant

differences in quality and error rates of the sequencing data. In fact, saliva samples did – on

average – have slightly higher coverage depth than their whole blood counterparts. There was

no bias in the presence of the known Ion platform error types in one source of DNA over the

other. Furthermore, variant concordance between the sample pairs (same donor, different DNA

source) was above 90% in all pairs using two different bioinformatics pipelines. This was also

supported by the finding that discordance between same-source samples was minimal compared

with the discordance between samples from different individuals, regardless of the source (saliva or whole

blood). This discordance was therefore attributed to the variant caller software. Establishing

saliva as a comparable source of DNA allowed for using a combination of samples in this study

from both blood and saliva collections.
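The pairwise comparison described above can be sketched as follows. This is a minimal illustration that assumes concordance is defined as the proportion of variant calls shared by both samples out of all calls made in either sample; the exact definition used by each bioinformatics pipeline may differ. The function name and the variant tuples are hypothetical.

```python
def variant_concordance(calls_a, calls_b):
    """Return shared calls divided by all calls seen in either sample.

    Each call is a (chromosome, position, reference, alternate) tuple.
    """
    set_a, set_b = set(calls_a), set(calls_b)
    union = set_a | set_b
    if not union:
        return 1.0  # assumption: two empty call sets are treated as fully concordant
    return len(set_a & set_b) / len(union)


# Hypothetical calls for one donor (not data from this study).
blood = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
    ("chr4", 4000, "T", "C"),
]
saliva = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
    ("chr5", 5000, "A", "T"),
]
print(f"Concordance: {variant_concordance(blood, saliva):.2f}")
```

Applying the same function to samples from different donors gives the between-individual baseline against which paired-sample discordance can be judged.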

The significance of this finding is related to the applicability of using saliva in numerous

settings as opposed to whole blood collection, which can comparatively present logistical

difficulties.

9.1.2 Neuronal Vulnerability to Trivial trauma

In this study, we hypothesised that ion channel and neurotransmitter genes would harbour

rare (MAF <= 0.001) deleterious mutations in individuals who developed concussion-related

symptoms following mostly trivial head trauma. Among the 16 individuals screened by WES

in this study, 12 cases were identified to have variants predicted to be deleterious in other

neuronal genes (ATP10A, ATP7B, CACNA1I, CACNA1C, KCNJ10, KCNAB1, SLC26A4,

GABRG1, GRIK1, TRIM2, HECTD1, SQSTM1). Of the latter, deleterious variants either

implicated in autosomal recessive neurological disorders, or located in genes in which other

mutations cause a neurological disorder were observed in 8/12 cases. Therefore, we propose

that, similar to some CACNA1A and ATP1A2 mutation carriers, individuals with deleterious

variants in other neuronal genes, particularly those involved in ion homeostasis, may have an

increased vulnerability to head trauma. This premise of genetic risk modulating the severity of

acquired neurological dysfunction has already been demonstrated in animal models of epilepsy

248. As such, it follows that this hypothesis might also suit the case of neuronal vulnerability to

head trauma. These results provide preliminary evidence that WES can be used in cases where

a hemiplegic migraine diagnosis is suspected with no mutations found in its genes. This primary

study established a hypothesis for neuronal vulnerability to head trauma that was used to inform

further experiments.
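The rare deleterious variant criterion used for this hypothesis (MAF <= 0.001) can be expressed as a simple filter. The sketch below is illustrative only: the field names, gene labels and frequencies are hypothetical, and treating a variant that is absent from population databases (no MAF available) as rare is an assumption made for this example.

```python
RARE_MAF_THRESHOLD = 0.001  # threshold stated in the hypothesis above


def is_rare_deleterious(variant, threshold=RARE_MAF_THRESHOLD):
    """Keep variants that are rare and predicted to be deleterious."""
    maf = variant.get("maf")
    # Assumption: a missing MAF means the variant is unseen in reference populations.
    is_rare = maf is None or maf <= threshold
    return is_rare and variant.get("deleterious", False)


# Hypothetical annotated variants (not data from this study).
variants = [
    {"gene": "GENE_A", "maf": 0.0004, "deleterious": True},
    {"gene": "GENE_B", "maf": 0.12, "deleterious": True},
    {"gene": "GENE_C", "maf": None, "deleterious": True},
    {"gene": "GENE_D", "maf": 0.0002, "deleterious": False},
]
candidates = [v["gene"] for v in variants if is_rare_deleterious(v)]
print(candidates)
```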

9.1.3 Ion Channel variants in Persistent PCS

In a cohort of 33 individuals who reported prolonged non-severe PCS, WES identified rare variants of interest in 42 genes related to various ion channel and neurotransmitter processes. Most notably, in cases where unaffected family members were recruited, the variants’ significance was further supported (SCN11A, NOS1, PHKB, and VDAC3). Most of those families shared a similar presentation, wherein related family members shared the same risk factors (e.g. certain contact sports), yet only one would develop persistent PCS or show a higher tendency to develop concussion. In accordance with the results obtained from the severe neurological symptoms cohort, a high number of these variants have been identified in neurological disease cohorts. Interestingly, the variants in this cohort were mostly listed as likely benign or variants of unknown significance in the clinical database ClinVar. This is perhaps in line with the observation that most of the symptoms in this cohort are non-severe and only trauma induced. Thus, variants with too little impact to cause a full disease presentation can nevertheless contribute to these individuals’ vulnerability to concussion.

9.1.4 Mitochondrial Correlates

Mitochondrial DNA variants are rarely explored in the literature with regards to

concussion or PCS. They are also rarely explored in WES data sets, despite the high coverage

that some regions provide for their respective variants. In this cohort, haplotyping was

undertaken based on WES data and we found a paucity of the haplogroups K and T, which were

previously suggested as TBI-protective. Further, an ATPase6 variant linked to ATPase

processes was found to be over-represented in this cohort. More interestingly, 3 variants were

identified in the cohort, which were found to be over-represented in neurological cohorts, and

are documented as likely pathogenic for mitochondrial neurodegenerative disorders.

9.1.5 Machine Learning approach

This study investigated the use of Machine Learning (ML) methods in exploring concussion outcomes. Implementing a clustering algorithm (k-means) as well as a principal component analysis (PCA) revealed no population sub-groups in the cohort. This justified moving forward with a classification algorithm without taking prior memberships into account. Three classification algorithms (Gradient Boosted Trees, Random Forests, and Logistic Regression) were used, and the first two seemed to outperform the third in all models. Using a 70-30 train-test sampling approach, the accuracy of each algorithm was tested on the 30% of each group that it had not seen during training. For predicting whether someone will respond severely to trivial head trauma or not, it appears that using common variants with MAF between 0.1 and 0.4 to train the classifiers achieved a high accuracy of prediction with an AUC above 0.7, and adequate sensitivity and specificity. As for predicting whether someone will develop persistent non-severe PCS or recover in a normal timeframe, more common variants used in training sets appeared to be the best classifiers, predicting groups with high accuracy and AUC above 0.7. In both models, the figures used to ensure that the high prediction scores are not due to chance or sample imbalance (Kappa (> 0.4), Sensitivity (0.7), and Specificity (> 0.6)) seem to support the findings. Consequently, this study demonstrated a utility for combining WES data and ML methods to predict mTBI outcomes in a sample as small as 50 individuals. ML also implicated two biological pathways of future interest: Tau phosphorylation, and insulin signalling and resistance.
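The 70-30 sampling described above, in which 30% of each group is held out, corresponds to a stratified split. The following is a minimal sketch using only the Python standard library; the function name, the seed and the group sizes are hypothetical and do not reproduce the software or the exact procedure used in this study.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_indices, test_indices), holding out a fraction of each group."""
    by_group = defaultdict(list)
    for index, label in enumerate(labels):
        by_group[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_group):
        indices = by_group[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical cohort of 30 cases and 20 controls (illustrative sizes only).
labels = ["case"] * 30 + ["control"] * 20
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting within each group keeps the case-to-control ratio of the test set close to that of the full cohort, which matters when kappa, sensitivity and specificity are later read from the test-set confusion matrix.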

9.2 LIMITATIONS

While the results of this research are promising, and there is potential for application in clinical settings, the research is not without limitations. These limitations serve as an inspiration for future directions, where the utility of this novel research will be further tested. They are explored below, along with their implications for the integrity of the data and the research design.

Sample size: While the cohort is comparable to other small studies that have been conducted on concussion outcomes, it is nowhere near enough to establish statistical power using linear or multivariate statistical models. It also limits the application of rare-variant burden testing in comparison to public database populations. Attempts were made during the first 2 years of this project to recruit as many participants as possible through clinical referrals, to ensure that adequate clinical data and standardised testing were available for the analysis; however, very few clinicians were willing to refer patients with PCS for genetic testing/study.

Retrospective concussion report: The reported post-concussion symptoms were grouped together arbitrarily, because participation took place outside of health settings and completing prolonged questionnaires would have created a hindrance to recruitment, which was already limited. Consequently, post-concussion symptoms are based on self-report, as is the participants’ indication that those symptoms only began post-concussion.

No baseline measurements: The literature suggests that baseline neurological, cognitive, and behavioural figures are some of the best predictors of post-concussion outcomes. Hence, multiple attempts were made to recruit athletes from different sports, as their baseline is often well documented; however, sports clubs’ management had concerns about the recruitment of their athletes.

Limited family numbers: While some family members were recruited, others were problematic to recruit. The case where one sibling with vulnerability to concussions had an SCN11A variant while the sister did not would have been stronger had the third sibling, a professional rugby player, agreed to participate.

Control populations: We had no access to databases with specific brain injury/concussion

history. Despite having access to some public databases, none provide detailed information on

previous brain injuries.

Rare variant exploration: The approach taken in chapters 5 and 6 is one that had been applied

in exploring various neurological conditions over the years. However, a key disadvantage of

such methodology is the possibility of over-extending the possible links between variants and

phenotypes. In other words, the probability of individuals carrying variants that happen to be

in genes of interest is high, and thus larger numbers are required to ascertain with statistical

approaches that those rare variants do cluster more in individuals with that phenotype.

Furthermore, larger datasets will allow for creating a comparator set of ontology genes that

might not be directly related to the phenotype, but can be tested for a higher percentage of rare

variants of interest. This cannot be applied with the size of the current data set due to errors

associated with multiple testing.

9.3 FUTURE DIRECTIONS

There are several future directions for this research. It is the first to demonstrate a

potential utility for Whole Exome Sequencing and Machine Learning methods for better

understanding and management of concussion. However, much larger cohorts are required to

allow for more confident results and other types of statistical tests that are not accessible with

the current sample size (e.g. burden testing). Crucially, functional assays for some of the

variants identified will establish stronger links with the cellular and molecular processes of concussion. It will be important for sports organisations and clinicians to collaborate further with researchers moving forward, allowing DNA samples to be collected pre- and post-concussion from individuals at high risk of concussion (e.g. athletes). This will open the door to better epigenomic analyses and understanding of the immediate and long-term methylome changes that occur after concussion. While some individuals who had recovered from concussion were used in this study as a type of control, individuals who have never had a concussion in their life would make a different, equally important control population. This would provide more certainty and confidence in the results of rare-variant analyses as well as Machine Learning models.

Moving forward, larger sample sizes and more detailed clinical observations (e.g. psychometrics) will permit the use of more data-intensive models, including deep learning or neural networks. Finally, this research recommends utilising the current international organisations dedicated to managing and treating concussion to establish international cohorts that would unify clinical reporting of mTBI phenotypes, a crucial step in developing biomarkers for diagnosis and tailored treatments.

Bibliography

1. Boswell J. Glasgow Coma Scale. 2015. p. 450-2.

2. Levin HS, Diaz-Arrastia RR. Diagnosis, prognosis, and clinical management of mild traumatic brain injury. LANCET NEUROLOGY. 2015;14(5):506-17.

3. McCrory P, Meeuwisse W, Dvorak J, et al. Infographic: Consensus statement on concussion in sport. British journal of sports medicine. 2017 Nov;51(21):1557-8.

4. Harmon KG, Drezner J, Gammons M, et al. American Medical Society for Sports Medicine Position Statement: Concussion in Sport. Clinical Journal of Sport Medicine. 2013;23(1):1-18.

5. McAllister TW. Genetic Factors Modulating Outcome After Neurotrauma. PM&R. 2010;2(12):S241-S52.

6. Committee on Sports-Related Concussions in Youth, Board on Children, Youth, and Families, Institute of Medicine, National Research Council. The National Academies Collection: Reports funded by National Institutes of Health. In: Graham R, Rivara FP, Ford MA, Spicer CM, editors. Sports-Related Concussions in Youth: Improving the Science, Changing the Culture. Washington (DC): National Academies Press (US); 2014.

7. Bailes JE, Dashnaw ML, Petraglia AL, Turner RC. Cumulative effects of repetitive mild traumatic brain injury. Progress in neurological surgery. 2014;28:50-62.

8. Sang CN, Sundararaman L. Chronic Pain Following Concussion. Current pain and headache reports. 2017 January 03;21(1):1.

9. Finch CF, Clapperton AJ, McCrory P. Increasing incidence of hospitalisation for sport-related concussion in Victoria, Australia. Medical Journal of Australia. 2013;198(8):427-30.

10. Sharma R, Rosenberg A, Bennett ER, Laskowitz DT, Acheson SK. A blood-based biomarker panel to risk-stratify mild traumatic brain injury. PLoS One. 2017;12(3):e0173798.

11. Maas AIR, Menon DK, Adelson PD, et al. Traumatic brain injury: integrated approaches to improve prevention, clinical care, and research. Lancet Neurol. 2017 Dec;16(12):987-1048.

12. Masel BE, DeWitt DS. Traumatic brain injury: a disease process, not an event. Journal of neurotrauma. 2010 Aug;27(8):1529-40.

13. Wintermark M, Sanelli PC, Anzai Y, et al. Imaging evidence and recommendations for traumatic brain injury: conventional neuroimaging techniques. Journal of the American College of Radiology. 2015;12(2):e1-e14.

14. Smits M, Houston GC, Dippel DWJ, et al. Microstructural brain injury in post-concussion syndrome after minor head injury. Neuroradiology. 2011;53(8):553-63.

15. Wetjen NM, Pichelmann MA, Atkinson JL. Second impact syndrome: concussion and second injury brain complications. Journal of the American College of Surgeons. 2010;211(4):553-7.

16. Hebert O, Schlueter K, Hornsby M, Van Gorder S, Snodgrass S, Cook C. The diagnostic credibility of second impact syndrome: A systematic literature review. Journal of Science and Medicine in Sport. 2016;19(10):789-94.

17. Davidson J, Cusimano MD, Bendena WG. Post-Traumatic Brain Injury: Genetic Susceptibility to Outcome. The Neuroscientist : a review journal bringing neurobiology, neurology and psychiatry. 2015 Aug;21(4):424-41.

18. McGrew CA. Sports-related Concussion — Genetic Factors. Current Sports Medicine Reports. 2019;18(1):20-2.

19. Mollayeva T, Mollayeva S, Colantonio A. Traumatic brain injury: sex, gender and intersecting vulnerabilities. Nature reviews Neurology. 2018 Dec;14(12):711-22.

20. Maksemous N, Smith RA, Sutherland HG, et al. Targeted next generation sequencing identifies a genetic spectrum of DNA variants in patients with hemiplegic migraine. Cephalalgia Reports. 2019;2:2515816319881630.

21. Curtain RP, Smith RL, Ovcaric M, Griffiths LR. Minor Head Trauma–Induced Sporadic Hemiplegic Migraine Coma. Pediatric Neurology. 2006;34(4):329-32.

22. Terwindt G, van den Maagdenberg A. Early seizures and cerebral oedema after trivial head trauma associated with the CACNA1A S218L mutation. 2009.

23. Ng SB, Turner EH, Robertson PD, et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature. 2009;461(7261):272-6.

24. Taylor JC, Martin HC, Lise S, et al. Factors influencing success of clinical genome sequencing across a broad spectrum of disorders. Nat Genet. 2015 Jul;47(7):717-26.

25. Walter A, Herrold AA, Gallagher VT, et al. KIAA0319 Genotype Predicts the Number of Past Concussions in a Division I Football Team: A Pilot Study. Journal of neurotrauma. 2019 Apr 1;36(7):1115-24.

26. Stein TD, Alvarez VE, McKee AC. Concussion in Chronic Traumatic Encephalopathy. Current pain and headache reports. 2015 Oct;19(10):47.

27. McKee AC, Abdolmohammadi B, Stein TD. The neuropathology of chronic traumatic encephalopathy. Handbook of clinical neurology. 2018;158:297-307.

28. McKee AC, Gavett BE, Stern RA, et al. TDP-43 proteinopathy and motor neuron disease in chronic traumatic encephalopathy. Journal of neuropathology and experimental neurology. 2010 Sep;69(9):918-29.

29. Omalu B, Bailes J, Hamilton RL, et al. Emerging histomorphologic phenotypes of chronic traumatic encephalopathy in American athletes. Neurosurgery. 2011 Jul;69(1):173-83; discussion 83.

30. Yue J, Robinson CK, Burke JF, et al. Apolipoprotein E epsilon 4 (APOE-ε4) genotype is associated with decreased 6-month verbal memory performance after mild traumatic brain injury. Brain and Behavior. 2017;7(9):n/a-n/a.

31. Guth T, Ketcham CJ, Hall EE. Influence of Concussion History and Genetics on Event-Related Potentials in Athletes: Potential Use in Concussion Management. Sports (Basel, Switzerland). 2018 Jan 19;6(1).

32. Corps KN, Roth TL, McGavern DB. Inflammation and Neuroprotection in Traumatic Brain Injury. JAMA Neurology. 2015;72(3):355-62.

33. Carbonara M, Fossi F, Zoerle T, et al. Neuroprotection in Traumatic Brain Injury: Mesenchymal Stromal Cells can Potentially Overcome Some Limitations of Previous Clinical Trials. Frontiers in Neurology. 2018;9:885.

34. Silverberg ND, Gardner AJ, Brubacher JR, Panenka WJ, Li JJ, Iverson GL. Systematic Review of Multivariable Prognostic Models for Mild Traumatic Brain Injury. Journal of neurotrauma. 2015;32(8):517-26.

35. Giza CC, Hovda DA. The New Neurometabolic Cascade of Concussion. Neurosurgery. 2014;75 Suppl 4:S24-S33.

36. Hirakawa K, Hashizume K, Hayashi T. [Viscoelastic property of human brain -for the analysis of impact injury (author's transl)]. No to shinkei = Brain and nerve. 1981 Oct;33(10):1057-65.

37. Budday S, Sommer G, Holzapfel G, Steinmann P, Kuhl E. Viscoelastic parameter identification of human brain tissue. Journal of the mechanical behavior of biomedical materials. 2017;74:463-76.

38. Coyle HL, Ponsford J, Hoy KE. Understanding individual variability in symptoms and recovery following mTBI: A role for TMS-EEG? Neuroscience and biobehavioral reviews. 2018 Sep;92:140-9.

39. Meythaler JM, Peduzzi JD, Eleftheriou E, Novack TA. Current concepts: diffuse axonal injury–associated traumatic brain injury. Archives of physical medicine and rehabilitation. 2001;82(10):1461-71.

40. Armstrong RC, Mierzwa AJ, Sullivan GM, Sanchez MA. Myelin and oligodendrocyte lineage cells in white matter pathology and plasticity after traumatic brain injury. Neuropharmacology. 2016;110:654-9.

41. Hiploylee C, Dufort PA, Davis HS, et al. Longitudinal Study of Postconcussion Syndrome: Not Everyone Recovers. Journal of neurotrauma. 2017;34(8):1511-23.

42. Meaney DF, Smith DH. Biomechanics of Concussion. Clinics in Sports Medicine. 2011;30(1):19-31.

43. Prins ML, Alexander D, Giza CC, Hovda DA. Repeated Mild Traumatic Brain Injury: Mechanisms of Cerebral Vulnerability. Journal of neurotrauma. 2013;30(1):3-38.

44. Otori T, Friedland JC, Sinson G, McIntosh TK, Raghupathi R, Welsh FA. Traumatic brain injury elevates glycogen and induces tolerance to ischemia in rat brain. Journal of neurotrauma. 2004;21(6):707-18.

45. Patterson ZR, Holahan MR. Understanding the neuroinflammatory response following concussion to develop treatment strategies. Frontiers in cellular neuroscience. 2012;6:58.

46. Lai AY, Todd KG. Differential regulation of trophic and proinflammatory microglial effectors is dependent on severity of neuronal injury. Glia. 2008;56(3):259-70.

47. Rathbone ATL, Tharmaradinam S, Jiang S, Rathbone MP, Kumbhare DA. A review of the neuro-and systemic inflammatory responses in post concussion symptoms: introduction of the “post-inflammatory brain syndrome” PIBS. Brain, behavior, and immunity. 2015;46:1-16.

48. Almeida A. Genetic determinants of neuronal vulnerability to apoptosis. Cellular and molecular life sciences : CMLS. 2013 Jan;70(1):71-88.

49. Almeida-Suhett CP, Li Z, Marini AM, Braga MFM, Eiden LE. Temporal course of changes in gene expression suggests a cytokine-related mechanism for long-term hippocampal alteration after controlled cortical impact. Journal of neurotrauma. 2014;31(7):683-90.

50. McCrory P, Meeuwisse W, Dvorak J, et al. Consensus statement on concussion in sport—the 5th international conference on concussion in sport held in Berlin, October 2016. 2017;51(11):838-47.

51. Finnoff JT, Jelsing EJ, Smith J. Biomarkers, genetics, and risk factors for concussion. PM and R. 2011;3(10):S452-S9.

52. Cantu D, Walker K, Andresen L, et al. Traumatic Brain Injury Increases Cortical Glutamate Network Activity by Compromising GABAergic Control. Cerebral Cortex. 2015;25(8):2306-20.

53. Yasen AL, Smith J, Christie AD. Glutamate and GABA concentrations following mild traumatic brain injury: a pilot study. Journal of neurophysiology. 2018;120(3):1318-22.

54. Froemke RC. Plasticity of cortical excitatory-inhibitory balance. Annual review of neuroscience. 2015 Jul 8;38:195-219.

55. Luscher C, Malenka RC. NMDA receptor-dependent long-term potentiation and long-term depression (LTP/LTD). Cold Spring Harbor perspectives in biology. 2012 Jun 1;4(6).

56. Giza CC, Kutcher JS, Ashwal S, et al. Summary of evidence-based guideline update: evaluation and management of concussion in sports: report of the Guideline Development Subcommittee of the American Academy of Neurology. Neurology. 2013;80(24):2250-7.

57. Esterov D, Greenwald BD. Autonomic Dysfunction after Mild Traumatic Brain Injury. Brain Sci. 2017;7(8):100.

58. Shultz SR, McDonald SJ, Haar CV, et al. The potential for animal models to provide insight into mild traumatic brain injury: Translational challenges and strategies. Neuroscience and biobehavioral reviews. 2017;76:396-414.

59. Okamoto RJ, Romano AJ, Johnson CL, Bayly PV. Insights Into Traumatic Brain Injury From MRI of Harmonic Brain Motion. Journal of experimental neuroscience. 2019;13:1179069519840444.

60. Jacobs B, Beems T, Stulemeijer M, et al. Outcome prediction in mild traumatic brain injury: age and clinical variables are stronger predictors than CT abnormalities. Journal of neurotrauma. 2010;27(4):655-68.

61. Finkel AG, Yerry JA, Klaric JS, Ivins BJ, Scher A, Choi YS. Headache in military service members with a history of mild traumatic brain injury: A cohort study of diagnosis and classification. Cephalalgia. 2017;37(6):548-59.

62. Radhakrishnan R, Garakani A, Gross LS, et al. Neuropsychiatric aspects of concussion. LANCET PSYCHIATRY. 2016;3(12):1166-75.

63. Ayr LK, Yeates KO, Taylor HG, Browne M. Dimensions of postconcussive symptoms in children with mild traumatic brain injuries. Journal of the International Neuropsychological Society : JINS. 2009 Jan;15(1):19-30.

64. Iverson GL. Outcome from mild traumatic brain injury. Current Opinion in Psychiatry. 2005;18(3):301-17.

65. Pearce AJ, Tommerdahl M, King DA. Neurophysiological abnormalities in individuals with persistent post-concussion symptoms. Neuroscience. 2019;408:272-81.

66. Mc Fie S, Abrahams S, Patricios J, Suter J, Posthumus M, September AV. Inflammatory and apoptotic signalling pathways and concussion severity: a genetic association study. Journal of sports sciences. 2018 Oct;36(19):2226-34.

67. Guinto G, Guinto-Nishimura Y. Postconcussion syndrome: a complex and underdiagnosed clinical entity. World neurosurgery. 2014 Nov;82(5):627-8.

68. Donnan J, Walsh S, Fortin Y, et al. Factors associated with the onset and progression of neurotrauma: A systematic review of systematic reviews and meta-analyses. Neurotoxicology. 2017 Jul;61:234-41.

69. Iverson. Outcome from mild traumatic brain injury. Curr Opin Psychiatry. 2005 May;18(3):301-17.

70. Polimanti R, Chen C-Y, Ursano RJ, et al. Cross-Phenotype Polygenic Risk Score Analysis of Persistent Post-Concussive Symptoms in U.S. Army Soldiers with Deployment-Acquired Traumatic Brain Injury. Journal of neurotrauma. 2017;34(4):781-9.

71. Dean PJA, Sato JR, Vieira G, McNamara A, Sterr A. Long-term structural changes after mTBI and their relation to post-concussion symptoms. Brain Injury. 2015;29(10):1211-8.

72. Eme R. Neurobehavioral Outcomes of Mild Traumatic Brain Injury: A Mini Review. Brain Sci. 2017;7(5):46.

73. McInnes K, Friesen CL, MacKenzie DE, Westwood DA, Boe SG. Mild Traumatic Brain Injury (mTBI) and chronic cognitive impairment: A scoping review. PLoS One. 2017;12(4):e0174847.

74. Iverson GL, Karr JE, Gardner AJ, Silverberg ND, Terry DP. Results of scoping review do not support mild traumatic brain injury being associated with a high incidence of chronic cognitive impairment: Commentary on McInnes et al. 2017. PLoS One. 2019;14(9):e0218997.

75. Kulas JF, Rosenheck RA. A Comparison of Veterans with Post-traumatic Stress Disorder, with Mild Traumatic Brain Injury and with Both Disorders: Understanding Multimorbidity. Military medicine. 2018 Mar 1;183(3-4):e114-e22.

76. Hoge CW, Goldberg HM, Castro CA. Care of war veterans with mild traumatic brain injury--flawed perspectives. The New England journal of medicine. 2009 Apr 16;360(16):1588-91.

77. Polusny MA, Kehle SM, Nelson NW, Erbes CR, Arbisi PA, Thuras P. Longitudinal effects of mild traumatic brain injury and posttraumatic stress disorder comorbidity on postdeployment outcomes in national guard soldiers deployed to Iraq. Archives of general psychiatry. 2011 Jan;68(1):79-89.

78. Kroshus E, Baugh CM, Stein CJ, Austin SB, Calzo JP. Concussion reporting, sex, and conformity to traditional gender norms in young adults. Journal of adolescence. 2017 Jan;54:110-9.

79. Weinstein E, Turner M, Kuzma BB, Feuer H. Second impact syndrome in football: new imaging and insights into a rare and devastating condition. J Neurosurg Pediatr. 2013 Mar;11(3):331-4.

80. Rosenbaum SB, Lipton ML. Embracing chaos: the scope and importance of clinical and pathological heterogeneity in mTBI. Brain imaging and behavior. 2012 Jun;6(2):255-82.

81. Johansson B, Rönnbäck L. Long-lasting mental fatigue after traumatic brain injury–a major problem most often neglected diagnostic criteria, assessment, relation to emotional and cognitive problems, cellular background, and aspects on treatment. Traumatic brain injury: IntechOpen; 2014.

82. Yengo-Kahn AM, Hale AT, Zalneraitis BH, Zuckerman SL, Sills AK, Solomon GS. The sport concussion assessment tool: a systematic review. Neurosurgical focus. 2016;40(4):E6.

83. McCrea M, Meier T, Huber D, et al. Role of advanced neuroimaging, fluid biomarkers and genetic testing in the assessment of sport-related concussion: a systematic review. British journal of sports medicine. 2017 Jun;51(12):13.

84. McCrea M, Barr WB, Guskiewicz K, et al. Standard regression-based methods for measuring recovery after sport-related concussion. Journal of the International Neuropsychological Society. 2005;11(1):58-69.

85. Denay KL, Martin ER. Concussion Diagnostic Imaging Options. In: Patel DS, editor. Concussion Management for Primary Care : Evidence Based Answers to Cases and Questions. Cham: Springer International Publishing; 2020. p. 77-87.

86. Pelc NJ. Recent and future directions in CT imaging. Annals of biomedical engineering. 2014;42(2):260-8.

87. Irimia A, Maher AS, Rostowsky KA, Chowdhury NF, Hwang DH, Law EM. Brain Segmentation From Computed Tomography of Healthy Aging and Geriatric Concussion at Variable Spatial Resolutions. Frontiers in Neuroinformatics. 2019;13(9).

88. Voormolen DC, Haagsma JA, Polinder S, et al. Post-concussion symptoms in complicated vs. uncomplicated mild traumatic brain injury patients at three and six months post-injury: results from the center-tbi study. 2019;8(11):1921.

89. Dadar M, Zeighami Y, Yau Y, et al. White matter hyperintensities are linked to future cognitive decline in de novo Parkinson's disease patients. Neuroimage Clin. 2018;20:892-900.

90. Narayana PA. White matter changes in patients with mild traumatic brain injury: MRI perspective. Concussion. 2017;2(2):CNC35-CNC.

91. Strauss SB, Kim N, Branch CA, et al. Bidirectional Changes in Anisotropy Are Associated with Outcomes in Mild Traumatic Brain Injury. AJNR American journal of neuroradiology. 2016 Nov;37(11):1983-91.

92. Van Kampen DA, Lovell MR, Pardini JE, Collins MW, Fu FH. The "value added" of neurocognitive testing after sports-related concussion. American Journal of Sports Medicine. 2006;34(10):1630-5.

93. Kontos AP, Sufrinko A, Womble M, Kegel N. Neuropsychological Assessment Following Concussion: an Evidence‐Based Review of the Role of Neuropsychological Assessment Pre- and Post-Concussion. Current pain and headache reports. 2016;20(6):1-7.

94. Mountney A, Boutte AM, Cartagena CM, et al. Functional and Molecular Correlates after Single and Repeated Rat Closed-Head Concussion: Indices of Vulnerability after Brain Injury. Journal of neurotrauma. 2017 Oct 1;34(19):2768-89.

95. Cairelli MJ, Fiszman M, Zhang H, Rindflesch TC. Networks of neuroinjury semantic predications to identify biomarkers for mild traumatic brain injury. J Biomed Semantics. 2015;6:25.

96. Omelchenko A, Shrirao AB, Bhattiprolu AK, et al. Dynamin and reverse-mode sodium calcium exchanger blockade confers neuroprotection from diffuse axonal injury. Cell Death & Disease. 2019 Sep 27;10(10):727.

97. Kotoda M, Ishiyama T, Mitsui K, Hishiyama S, Matsukawa T. Neuroprotective effects of amiodarone in a mouse model of ischemic stroke. BMC Anesthesiology. 2017 Dec 8;17(1):168.

98. Aoki Y, Tamura M, Itoh Y, et al. Effective plasma concentration of a novel Na+/Ca2+ channel blocker NS-7 for its cerebroprotective actions in rats with a transient middle cerebral artery occlusion. J Pharmacol Exp Ther. 2001 Feb;296(2):306-11.

99. Eakin K, Baratz-Goldstein R, Pick CG, et al. Efficacy of N-Acetyl Cysteine in Traumatic Brain Injury. PLOS ONE. 2014;9(4):e90617.

100. Robinson L, Platt B, Riedel G. Involvement of the cholinergic system in conditioning and perceptual memory. Behav Brain Res. 2011 Aug 10;221(2):443-65.

101. Arciniegas DB, Silver JM. Pharmacotherapy of posttraumatic cognitive impairments. Behavioural Neurology. 2006;17(1):25-42.

102. Hernandez-Tejada MA, Brawman-Mintzer O. Chapter 17 - Interventional Drugs for TBI Rehabilitation of Cognitive Impairment: The Cholinesterase Inhibitor Rivastigmine. In: Heidenreich KA, editor. New Therapeutics for Traumatic Brain Injury. San Diego: Academic Press; 2017. p. 273-85.

103. Shahlaie K, Lyeth BG, Gurkoff GG, Muizelaar JP, Berman RF. Neuroprotective effects of selective N-type VGCC blockade on stretch-injury-induced calcium dynamics in cortical neurons. Journal of neurotrauma. 2010;27(1):175-87.

104. Bibak B, Khaksari M, Badavi M, Rashidy-Pour A. Effect of calcium channel blocker nicardipine on brain edema in rats. International Journal of Pharmacology. 2007;3(3):248-53.

105. Gurkoff G, Shahlaie K, Lyeth B, Berman R. Voltage-gated calcium channel antagonists and traumatic brain injury. Pharmaceuticals. 2013;6(7):788-812.

106. Xu G-Z, Wang M-D, Liu K-G, Bai Y-A, Wu W, Li W. A meta-analysis of treating acute traumatic brain injury with calcium channel blockers. Brain Research Bulletin. 2013;99:41-7.

107. Weaver SM, Portelli JN, Chau A, Cristofori I, Moretti L, Grafman J. Genetic polymorphisms and traumatic brain injury: the contribution of individual differences to recovery. Brain imaging and behavior. 2014;8(3):420-34.

108. Panenka WJ, Gardner AJ, Dretsch MN, Crynen GC, Crawford FC, Iverson GL. Systematic Review of Genetic Risk Factors for Sustaining a Mild Traumatic Brain Injury. Journal of neurotrauma. 2017;34(13):2093-9.

109. Abrahams S, Mc Fie S, Patricios J, Posthumus M, September AV. Risk factors for sports concussion: an evidence-based systematic review. British journal of sports medicine. 2014;48(2):91-7.

110. Cirulli ET, Goldstein DB. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nature Reviews Genetics. 2010 Jun 1;11:415.

111. Li H, Ruan J, Durbin R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Research. 2008;18(11):1851-8.

112. Márquez-Luna C, Gazal S, Loh P-R, Furlotte N, Auton A, Price AL. Modeling functional enrichment improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. bioRxiv. 2018:375337.

113. Girard SL, Gauthier J, Noreau A, et al. Increased exonic de novo mutation rate in individuals with schizophrenia. Nat Genet. 2011 Jul 10;43(9):860-3.

114. Trakadis YJ, Sardaar S, Chen A, Fulginiti V, Krishnan A. Machine learning in schizophrenia genomics, a case-control study using 5,090 exomes. American journal of medical genetics Part B, Neuropsychiatric genetics : the official publication of the International Society of Psychiatric Genetics. 2019 Mar;180(2):103-12.

115. Li J, Cai T, Jiang Y, et al. Genes with de novo mutations are shared by four neuropsychiatric disorders discovered from NPdenovo database. Molecular psychiatry. 2016 Feb;21(2):290-7.

116. Cooper GM, Shendure J. Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nature Reviews Genetics. 2011 Aug 18;12:628.

117. Catani M, Thiebaut de Schotten M, Slater D, Dell'Acqua F. Connectomic approaches before the connectome. NeuroImage. 2013 Oct 15;80:2-13.

118. Kurowski BG, Treble-Barna A, Pitzer AJ, et al. Applying Systems Biology Methodology To Identify Genetic Factors Possibly Associated with Recovery after Traumatic Brain Injury. Journal of neurotrauma. 2017 Jul 15;34(14):2280-90.

119. Bryer E, Medaglia J, Rostami S, Hillary FG. Neural recruitment after mild traumatic brain injury is task dependent: a meta-analysis. Journal of the International Neuropsychological Society. 2013;19(7):751-62.

120. Premoli I, Castellanos N, Rivolta D, et al. TMS-EEG signatures of GABAergic neurotransmission in the human cortex. Journal of Neuroscience. 2014;34(16):5603-12.

121. Taylor HG, Dietrich A, Nuss K, et al. Post-concussive symptoms in children with mild traumatic brain injury. Neuropsychology. 2010;24(2):148-59.

122. Stam AH, Luijckx G-J, Poll-Thé BT, et al. Early seizures and cerebral oedema after trivial head trauma associated with the CACNA1A S218L mutation. Journal of Neurology, Neurosurgery & Psychiatry. 2009;80(10):1125-9.

123. Tantsis EM, Gill D, Griffiths L, et al. Eye movement disorders are an early manifestation of CACNA1A mutations in children. Developmental Medicine & Child Neurology. 2016;58(6):639-44.

124. Shrey DW, Griesbach GS, Giza CC. The Pathophysiology of Concussions in Youth. Physical Medicine and Rehabilitation Clinics of North America. 2011;22(4):577-602.

125. Tolner EA, Houben T, Terwindt GM, de Vries B, Ferrari MD, van den Maagdenberg A. From migraine genes to mechanisms. PAIN. 2015;156(4):S64-S74.

126. Sun YM, Lu C, Wu ZY. Spinocerebellar ataxia: relationship between phenotype and genotype – a review. Clinical Genetics. 2016;90(4):305-14.

127. Mihalik JP, Register-Mihalik J, Kerr ZY, Marshall SW, McCrea MC, Guskiewicz KM. Recovery of Posttraumatic Migraine Characteristics in Patients After Mild Traumatic Brain Injury. The American Journal of Sports Medicine. 2013;41(7):1490-6.

128. Vecchia D, Tottene A, van den Maagdenberg A, Pietrobon D. Mechanism underlying unaltered cortical inhibitory synaptic transmission in contrast with enhanced excitatory transmission in Ca(V)2.1 knockin migraine mice. Neurobiology of Disease. 2014;69:225-34.

129. Van den Maagdenberg AM, Pizzorusso T, Kaja S, et al. High cortical spreading depression susceptibility and migraine-associated symptoms in Ca(v)2.1 S218L mice. Annals of neurology. 2010;67(1):85-98.

130. Weber JT. Calcium homeostasis following traumatic neuronal injury. Curr Neurovasc Res. 2004 Apr;1(2):151-71.

131. Pramanik K, Chun CZ, Garnaas MK, et al. Dusp-5 and Snrk-1 coordinately function during vascular development and disease. Blood. 2009 Jan 29;113(5):1184-91.

132. Isaksen TJ, Lykke-Hartmann K. Insights into the Pathology of the alpha(2)-Na+/K+-ATPase in Neurological Disorders; Lessons from Animal Models. Frontiers in Physiology. 2016;7.

133. Bottger P, Glerup S, Gesslein B, et al. Glutamate-system defects behind psychiatric manifestations in a familial hemiplegic migraine type 2 disease-mutation mouse model. Scientific Reports. 2016;6:22047.

134. Leo L, Gherardini L, Barone V, et al. Increased susceptibility to cortical spreading depression in the mouse model of Familial hemiplegic migraine type 2. PLoS genetics. 2011;7(6):e1002129.

135. Mendez-Gonzalez MP, Kucheryavykh YV, Zayas-Santiago A, et al. Novel KCNJ10 Gene Variations Compromise Function of Inwardly Rectifying Potassium Channel 4.1. Journal of Biological Chemistry. 2016;291(14):7716-26.

136. Behr ER, Savio-Galimberti E, Barc J, et al. Role of common and rare variants in SCN10A: results from the Brugada syndrome QRS locus gene discovery collaborative study. Cardiovascular Research. 2015;106(3):520-9.

137. Daniil G, Fernandes-Rosa FL, Chemin J, et al. CACNA1H Mutations Are Associated With Different Forms of Primary Aldosteronism. EBioMedicine. 2016;13:225-36.

138. Zhang B, Li M, Wang L, et al. The Association between the Polymorphisms in a Sodium Channel Gene SCN7A and Essential Hypertension: A Case‐Control Study in the Northern Han Chinese. Annals of Human Genetics. 2015;79(1):28-36.

139. Madura SA, McDevitt JK, Tierney RT, et al. Genetic variation in SLC17A7 promoter associated with response to sport-related concussions. Brain Injury. 2016;30(7):908-13.

140. Guerriero RM, Giza CC, Rotenberg A. Glutamate and GABA Imbalance Following Traumatic Brain Injury. Current Neurology and Neuroscience Reports. 2015;15(5):1-11.

141. Chodobski A, Zink BJ, Szmydynger-Chodobska J. Blood–Brain Barrier Pathophysiology in Traumatic Brain Injury. Translational Stroke Research. 2011;2(4):492-516.

142. McDevitt J, Tierney RT, Phillips J, Gaughan JP, Torg JS, Krynetskiy E. Association between GRIN2A promoter polymorphism and recovery from concussion. Brain Injury. 2015;29(13-14):1674-81.

143. Smyth K, Sandhu SS, Crawford S, Dewey D, Parboosingh J, Barlow KM. The role of serotonin receptor alleles and environmental stressors in the development of post-concussive symptoms after pediatric mild traumatic brain injury. Dev Med Child Neurol. 2014 Jan;56(1):73-7.

144. Goldstrohm SL, Arffa S. Preschool children with mild to moderate traumatic brain injury: an exploration of immediate and post-acute morbidity. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists. 2005 Aug;20(6):675-95.

145. Light R, Asarnow R, Satz P, Zaucha K, McCleary C, Lewis R. Mild closed-head injury in children and adolescents: Behavior problems and academic outcomes. Journal of Consulting and Clinical Psychology. 1998;66(6):1023-9.

146. Yue JK, Winkler EA, Rick JW, et al. DRD2 C957T polymorphism is associated with improved 6-month verbal learning following traumatic brain injury. Neurogenetics. 2017 Jan 1;18(1):29-38.

147. Dretsch MN, Silverberg N, Gardner AJ, et al. Genetics and Other Risk Factors for Past Concussions in Active-Duty Soldiers. Journal of neurotrauma. 2017;34(4):869-75.

148. Jordan BD. Genetic influences on outcome following traumatic brain injury. Neurochemical research. 2007 Apr-May;32(4-5):905-15.

149. Witte AV, Flöel A. Effects of COMT polymorphisms on brain function and behavior in health and disease. Brain Research Bulletin. 2011;88(5):418-28.

150. Winkler EA, Yue J, McAllister TW, et al. COMT Val 158 Met polymorphism is associated with nonverbal cognition following mild traumatic brain injury. Neurogenetics. 2016;17(1):31-41.

151. Lynch JR, Pineda JA, Morgan D, et al. Apolipoprotein E affects the central nervous system response to injury and the development of cerebral edema. Annals of neurology. 2002;51(1):113-7.

152. Crawford F, Wood M, Ferguson S, et al. Apolipoprotein E-genotype dependent hippocampal and cortical responses to traumatic brain injury. Neuroscience. 2009 Apr 10;159(4):1349-62.

153. Kristman VL, Tator CH, Kreiger N, et al. Does the apolipoprotein epsilon 4 allele predispose varsity athletes to concussion? A prospective cohort study. Clinical Journal of Sport Medicine. 2008;18(4):322-8.

154. Chamelian L, Reis M, Feinstein A. Six-month recovery from mild to moderate Traumatic Brain Injury: the role of APOE-epsilon 4 allele. Brain. 2004;127:2621-8.

155. Pruthi N, Chandramouli BA, Kuttappa TB, et al. Apolipoprotein E polymorphism and outcome after mild to moderate traumatic brain injury: A study of patient population in India. Neurology India. 2010;58(2):264-9.

156. Terrell TR, Bostick RM, Abramson R, et al. APOE, APOE Promoter, and Tau Genotypes and Risk for Concussion in College Athletes. Clinical Journal of Sport Medicine. 2008;18(1):10-7.

157. Terrell TR, Abramson R, Barth JT, et al. Genetic polymorphisms associated with the risk of concussion in 1056 college athletes: a multicentre prospective cohort study. British journal of sports medicine. 2018;52(3):192.

158. Shi Y, Yamada K, Liddelow SA, et al. ApoE4 markedly exacerbates tau-mediated neurodegeneration in a mouse model of tauopathy. Nature. 2017;549(7673):523.

159. Tierney RT, Mansell JL, Higgins M, et al. Apolipoprotein E genotype and concussion in college athletes. Clin J Sport Med. 2010 Nov;20(6):464-8.

160. Godsel LM, Hobbs RP, Green KJ. Intermediate filament assembly: dynamics to disease. Trends in Cell Biology. 2007;18(1):28-37.

161. McDevitt JK, Tierney RT, Mansell JL, et al. Neuronal structural protein polymorphism and concussion in college athletes. Brain Injury. 2011;25(11):1108-13.

162. Chen K-Y, Huang L-M, Kung H-J, Ann DK, Shih H-M. The role of tyrosine kinase Etk/Bmx in EGF-induced apoptosis of MDA-MB-468 breast cancer cells. Oncogene. 2004;23(10):1854-62.

163. Chen K-Y, Wu C-C, Chang C-F, et al. Suppression of Etk/Bmx Protects against Ischemic Brain Injury. Cell Transplantation. 2012;21(1):345-54.

164. Wang Y-J, Hsu Y-W, Chang C-M, et al. The influence of BMX gene polymorphisms on clinical symptoms after mild traumatic brain injury. Biomed Res Int. 2014;2014:293687.

165. Yang SH, Gangidine M, Pritts TA, Goodman MD, Lentsch AB. Interleukin 6 mediates neuroinflammation and motor coordination deficits after mild traumatic brain injury and brief hypoxia in mice. Shock (Augusta, Ga). 2013;40(6):471.

166. Clausen F, Hånell A, Israelsson C, et al. Neutralization of interleukin‐1β reduces cerebral edema and tissue loss and improves late cognitive outcome following traumatic brain injury in mice. European Journal of Neuroscience. 2011;34(1):110-23.

167. Sun Y, Bai L, Niu X, et al. Elevated serum levels of inflammation-related cytokines in mild traumatic brain injury are associated with cognitive performance. Frontiers in Neurology. 2019;10:1120.

168. Mountney A, Boutté AM, Cartagena CM, et al. Functional and molecular correlates after single and repeated rat closed-head concussion: indices of vulnerability after brain injury. Journal of neurotrauma. 2017;34(19):2768-89.

169. Karpova NN. Role of BDNF epigenetics in activity-dependent neuronal plasticity. Neuropharmacology. 2014;76(C):709-18.

170. Korley FK, Diaz-Arrastia R, Wu AHB, et al. Circulating Brain-Derived Neurotrophic Factor Has Diagnostic and Prognostic Value in Traumatic Brain Injury. Journal of neurotrauma. 2016;33(2):215-25.

171. Bekinschtein P, Cammarota M, Izquierdo I, Medina JH. BDNF and memory formation and storage. The Neuroscientist : a review journal bringing neurobiology, neurology and psychiatry. 2008;14(2):147-56.

172. Narayanan V, Veeramuthu V, Ahmad-Annuar A, et al. Missense Mutation of Brain Derived Neurotrophic Factor (BDNF) Alters Neurocognitive Performance in Patients with Mild Traumatic Brain Injury: A Longitudinal Study. PLOS ONE. 2016;11(7):e0158838.

173. Niechwiej-Szwedo E, Gonzalez D, Tapper A, Mardian E, Roy E, Duncan R. The BDNF Val66Met polymorphism is associated with improved performance on a visual-auditory working memory task in varsity athletes. Journal of Vision. 2015;15(12):676.

174. Larson-Dupuis C, Chamard É, Falardeau V, et al. Impact of BDNF Val66Met polymorphism on olfactory functions of female concussed athletes. Brain Injury. 2015 Jul 3;29(7-8):963-70.

175. Hayes JP, Reagan A, Logue MW, et al. BDNF genotype is associated with hippocampal volume in mild traumatic brain injury. Genes, Brain and Behavior. 2018;17(2):107-17.

176. Almeida-Suhett CP, Li Z, Marini AM, Braga MF, Eiden LE. Temporal course of changes in gene expression suggests a cytokine-related mechanism for long-term hippocampal alteration after controlled cortical impact. Journal of neurotrauma. 2014 Apr 1;31(7):683-90.

177. Bailey ZS, Grinter MB, De La Torre Campos D, VandeVord PJ. Blast induced neurotrauma causes overpressure dependent changes to the DNA methylation equilibrium. Neuroscience Letters. 2015 Sep 14;604:119-23.

178. Wong VS, Langley B. Epigenetic changes following traumatic brain injury and their implications for outcome, recovery and therapy. Neuroscience Letters. 2016;625:26-33.

179. Sagarkar S, Bhamburkar T, Shelkar G, Choudhary A, Kokare DM, Sakharkar AJ. Minimal traumatic brain injury causes persistent changes in DNA methylation at BDNF gene promoters in rat amygdala: A possible role in anxiety-like behaviors. Neurobiology of Disease. 2017;106:101-9.

180. Ibrahim O, Sutherland HG, Haupt LM, Griffiths LR. An emerging role for epigenetic factors in relation to executive function. Brief Funct Genomics. 2017 Nov 20.

181. Avgan N, Sutherland HG, Lea RA, et al. A CREB1 Gene Polymorphism (rs2253206) Is Associated with Prospective Memory in a Healthy Cohort. Frontiers in Behavioral Neuroscience. 2017;11.

182. Mychasiuk R, Hehar H, Ma I, Esser MJ. Dietary intake alters behavioral recovery and gene expression profiles in the brain of juvenile rats that have experienced a concussion. Frontiers in Behavioral Neuroscience. 2015;9:17.

183. Hehar H, Ma I, Mychasiuk R. Intergenerational Transmission of Paternal Epigenetic Marks: Mechanisms Influencing Susceptibility to Post-Concussion Symptomology in a Rodent Model. Scientific Reports. 2017;7:1.

184. Bahado-Singh RO, Vishweswaraiah S, Er A, et al. Artificial intelligence and the detection of pediatric concussion using epigenomic analysis. Brain Research. 2019 Oct 16:146510.

185. Carrieri G, Bonafè M, De Luca M, et al. Mitochondrial DNA haplogroups and APOE4 allele are non-independent variables in sporadic Alzheimer's disease. Hum Genet. 2001 Mar;108(3):194-8.

186. Lingsma HF, Roozenbeek B, Steyerberg EW, Murray GD, Maas AI. Early prognosis in traumatic brain injury: from prophecies to predictions. Lancet Neurol. 2010 May;9(5):543-54.

187. Bulstrode H, Nicoll JA, Hudson G, Chinnery PF, Di Pietro V, Belli A. Mitochondrial DNA and traumatic brain injury. Annals of neurology. 2014;75(2):186-95.

188. Blennow K, Mattsson N, Schöll M, Hansson O, Zetterberg H. Amyloid biomarkers in Alzheimer's disease. Trends in Pharmacological Sciences. 2015;36(5):297-309.

189. Hauser PS, Ryan RO. Impact of apolipoprotein E on Alzheimer's disease. Current Alzheimer Research. 2013;10(8):809-17.

190. Bulstrode H, Nicoll JA, Hudson G, Chinnery PF, Di Pietro V, Belli A. Mitochondrial DNA and traumatic brain injury. Annals of neurology. 2014 Feb;75(2):186-95.

191. Jagadeesh KA, Wenger AM, Berger MJ, et al. M-CAP eliminates a majority of variants of uncertain significance in clinical exomes at high sensitivity. Nature Genetics. 2016 Dec 1;48(12):1581-6.

192. Iglesias A, Anyane-Yeboa K, Wynn J, et al. The usefulness of whole-exome sequencing in routine clinical practice. Genetics in medicine : official journal of the American College of Medical Genetics. 2014 Dec;16(12):922-31.

193. Yang Y, Muzny DM, Reid JG, et al. Clinical Whole-Exome Sequencing for the Diagnosis of Mendelian Disorders. New England Journal of Medicine. 2013;369(16):1502-11.

194. Solomon BD, Hadley DW, Pineda-Alvarez DE, et al. Incidental medical information in whole-exome sequencing. Pediatrics. 2012 Jun;129(6):e1605-11.

195. Patowary A, Won SY, Oh SJ, et al. Family-based exome sequencing and case-control analysis implicate CEP41 as an ASD gene. Transl Psychiatry. 2019;9(1):4.

196. Šestáková Š, Šálek C, Remešová H. DNA Methylation Validation Methods: a Coherent Review with Practical Comparison. Biological Procedures Online. 2019 Oct 1;21(1):19.

197. Reed K, Poulin ML, Yan L, Parissenti AM. Comparison of bisulfite sequencing PCR with pyrosequencing for measuring differences in DNA methylation. Analytical Biochemistry. 2010;397(1):96-106.

198. England R, Pettersson M. Pyro Q-CpG™: quantitative analysis of methylation in multiple CpG sites by Pyrosequencing®. Nature Methods. 2005 Oct 1;2(10):i-ii.

199. Sant KE, Nahar MS, Dolinoy DC. DNA methylation screening and analysis. Developmental Toxicology: Springer; 2012. p. 385-406.

200. Hernández HG, Tse MY, Pang SC, Arboleda H, Forero DA. Optimizing methodologies for PCR-based DNA methylation analysis. Biotechniques. 2013;55(4):181-97.

201. Beck TF, Mullikin JC, Biesecker LG, Program NCS. Systematic evaluation of Sanger validation of next-generation sequencing variants. Clinical chemistry. 2016;62(4):647-54.

202. Flanagan SE, Patch A-M, Ellard S. Using SIFT and PolyPhen to Predict Loss-of-Function and Gain-of-Function Mutations. Genetic Testing and Molecular Biomarkers. 2010;14(4):533-7.

203. Korvigo I, Afanasyev A, Romashchenko N, Skoblov M. Generalising better: Applying deep learning to integrate deleteriousness prediction scores for whole-exome SNV studies. PLOS ONE. 2018;13(3):e0192829.

204. Reva B, Antipin Y, Sander C. Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res. 2011;39(17):e118.

205. Glass K, Girvan M. Annotation Enrichment Analysis: An Alternative Method for Evaluating the Functional Properties of Gene Sets. Scientific Reports. 2014 Feb 26;4:4191.

206. Schwarz JM, Rödelsperger C, Schuelke M, Seelow D. MutationTaster evaluates disease-causing potential of sequence alterations. Nature methods. 2010;7(8):575.

207. Valdmanis PN, Verlaan DJ, Rouleau GA. The proportion of mutations predicted to have a deleterious effect differs between gain and loss of function genes in neurodegenerative disease. Human mutation. 2009 Mar;30(3):E481-9.

208. Care MA, Needham CJ, Bulpitt AJ, Westhead DR. Deleterious SNP prediction: be mindful of your training data! Bioinformatics. 2007;23(6):664-72.

209. Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 2019 Jan 8;47(D1):D886-D94.

210. Al-Jarrah OY, Yoo PD, Muhaidat S, Karagiannidis GK, Taha K. Efficient Machine Learning for Big Data: A Review. Big Data Research. 2015 Sep 1;2(3):87-93.

211. Li H, Handsaker B, Wysoker A, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9.

212. Danecek P, Auton A, Abecasis G, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156-8.

213. Lott MT, Leipzig JN, Derbeneva O, et al. mtDNA Variation and Analysis Using Mitomap and Mitomaster. Current protocols in bioinformatics. 2013 Dec;44(123):1.23.1-6.

214. Kidd JM, Sharpton TJ, Bobo D, et al. Exome capture from saliva produces high quality genomic and metagenomic data. BMC Genomics. 2014;15(1):262.

215. Wall JD, Tang LF, Zerbe B, et al. Estimating genotype error rates from high-coverage next-generation sequence data. Genome Research. 2014;24(11):1734-9.

216. Drmanac R, Sparks AB, Callow MJ, et al. Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays. Science. 2010;327(5961):78-81.

217. Cliften P. Base Calling, Read Mapping, and Coverage Analysis. In: Kulkarni S, Pfeifer JD, editors. Clinical genomics. Amsterdam: Elsevier/Academic Press; 2015.

218. Meghnani V, Mohammed N, Giauque C, Nahire R, David T. Performance Characterization and Validation of Saliva as an Alternative Specimen Source for Detecting Hereditary Breast Cancer Mutations by Next Generation Sequencing. International Journal of Genomics. 2016 Oct 13;2016:2059041.

219. Laurie S, Fernandez-Callejo M, Marco-Sola S, et al. From Wet-Lab to Variations: Concordance and Speed of Bioinformatics Pipelines for Whole Genome and Whole Exome Sequencing. Hum Mutat. 2016 Dec;37(12):1263-71.

220. Altschul SF, Gish W, Miller W, Meyers EW, Lipman DJ. Basic Local Alignment Search Tool. Journal of Molecular Biology. 1990;215(3):403-10.

221. Patel ZH, Kottyan LC, Lazaro S, et al. The struggle to find reliable results in exome sequencing data: filtering out Mendelian errors. Frontiers in genetics. 2014;5:16.

222. O'Rawe J, Jiang T, Sun G, et al. Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing. Genome Med. 2013;5(3):28.

223. Rosenfeld JA, Mason CE, Smith TM. Limitations of the Human Reference Genome for Personalized Genomics. PLOS ONE. 2012;7(7):e40294.

224. Carson AR, Smith EN, Matsui H, et al. Effective filtering strategies to improve data quality from population-based whole exome sequencing studies. BMC Bioinformatics. 2014;15(1):125.

225. Sims D, Sudbery I, Ilott NE, Heger A, Ponting CP. Sequencing depth and coverage: key considerations in genomic analyses. Nature Reviews Genetics. 2014;15(2):121-32.

226. Ledergerber C, Dessimoz C. Base-calling for next-generation sequencing platforms. Briefings in Bioinformatics. 2011;12(5):489-97.

227. Paul JS, Nielsen R, Song YS, Albrechtsen A. Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics. 2011;12(6):443-51.

228. Koboldt DC, Ding L, Mardis ER, Wilson RK. Challenges of sequencing human genomes. Briefings in Bioinformatics. 2010;11(5):484-98.

229. Hwang S, Kim E, Lee I, Marcotte EM. Systematic comparison of variant calling pipelines using gold standard personal exome variants. Sci Rep. 2015 Dec 7;5:17875.

230. Zook JM, Chapman B, Wang J, et al. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls. Nat Biotechnol. 2014 Mar;32(3):246-51.

231. Rimmer A, Phan H, Mathieson I, et al. Integrating mapping-, assembly- and haplotype-based approaches for calling variants in clinical sequencing applications. Nat Genet. 2014 Aug;46(8):912-8.

232. Raczy C, Petrovski R, Saunders CT, et al. Isaac: ultra-fast whole-genome secondary analysis on Illumina sequencing platforms. Bioinformatics (Oxford, England). 2013;29(16):2041-3.

233. McKenna A, Hanna M, Banks E, et al. The genome analysis toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research. 2010;20(9):1297-303.

234. Mazzo F, Zwart R, Serratto GM, et al. Reconstitution of synaptic ion channels from rodent and human brain in Xenopus oocytes: a biochemical and electrophysiological characterization. Journal of Neurochemistry. 2016;138(3):384-96.

235. Stiefel MF, Tomita Y, Marmarou A. Secondary ischemia impairing the restoration of ion homeostasis following traumatic brain injury. Journal of neurosurgery. 2005;103(4):707-14.

236. Maksemous N, Roy B, Smith RA, Griffiths LR. Next-generation sequencing identifies novel CACNA1A gene mutations in episodic ataxia type 2. Molecular Genetics & Genomic Medicine. 2016;4(2):211-22.

237. Kors EE, Terwindt GM, Vermeulen FL, et al. Delayed cerebral edema and fatal coma after minor head trauma: role of the CACNA1A calcium channel subunit gene and relationship with familial hemiplegic migraine. Annals of Neurology: Official Journal of the American Neurological Association and the Child Neurology Society. 2001;49(6):753-60.

238. Pelzer N, Blom D, Stam A, et al. Recurrent coma and fever in familial hemiplegic migraine type 2. A prospective 15-year follow-up of a large family with a novel ATP1A2 mutation. Cephalalgia. 2017;37(8):737-55.

239. Raymond GV, Seidman R, Monteith TS, et al. Head trauma can initiate the onset of adreno-leukodystrophy. Journal of the neurological sciences. 2010 Mar 15;290(1):70-4.

240. Berkovic SF, Mulley JC, Scheffer IE, Petrou S. Human epilepsies: interaction of genetic and acquired factors. Trends in Neurosciences. 2006 Jul 1;29(7):391-7.

241. Bergsneider M, Hovda DA, Shalmon E, et al. Cerebral hyperglycolysis following severe traumatic brain injury in humans: a positron emission tomography study. Journal of neurosurgery. 1997 Feb;86(2):241-51.

242. Blumkin L, Michelson M, Leshinsky-Silver E, Kivity S, Lev D, Lerman-Sagie T. Congenital Ataxia, Mental Retardation, and Dyskinesia Associated With a Novel CACNA1A Mutation. Journal of Child Neurology. 2010 Jul 1;25(7):892-7.

243. Carson AR, Smith EN, Matsui H, et al. Effective filtering strategies to improve data quality from population-based whole exome sequencing studies. BMC Bioinformatics. 2014;15:125.

244. Harris MA, Clark J, Ireland A, et al. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 2004;32(Database issue):D258-D61.

245. Benton MC, Smith RA, Haupt LM, et al. Variant Call Format–Diagnostic Annotation and Reporting Tool: A Customizable Analysis Pipeline for Identification of Clinically Relevant Genetic Variants in Next-Generation Sequencing Data. The Journal of Molecular Diagnostics. 2019;21(6):951-60.

246. Johnson JM, Castle J, Garrett-Engele P, et al. Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science. 2003 Dec 19;302(5653):2141-4.

247. Lin A, Wang RT, Ahn S, Park CC, Smith DJ. A genome-wide map of human genetic interactions inferred from radiation hybrid genotypes. Genome Res. 2010 Aug;20(8):1122-32.

248. Frankel WN, Taylor L, Beyer B, Tempel BL, White HS. Electroconvulsive thresholds of inbred mouse strains. Genomics. 2001;74(3):306-12.

249. Zhang Y, Kong W, Gao Y, et al. Gene Mutation Analysis in 253 Chinese Children with Unexplained Epilepsy and Intellectual/Developmental Disabilities. PLOS ONE. 2015;10(11):e0141782.

250. Gargus JJ. Genetic Calcium Signaling Abnormalities in the Central Nervous System: Seizures, Migraine, and Autism. Annals of the New York Academy of Sciences. 2009;1151(1):133-56.

251. Russo L, Mariotti P, Sangiorgi E, et al. A New Susceptibility Locus for Migraine with Aura in the 15q11-q13 Genomic Region Containing Three GABA-A Receptor Genes. The American Journal of Human Genetics. 2005 Feb 1;76(2):327-33.

252. Machado AAC, Deguti MM, Genschel J, et al. Neurological manifestations and ATP7B mutations in Wilson's disease. Parkinsonism & Related Disorders. 2008 Apr 1;14(3):246-9.

253. Borkum JM. Migraine triggers and oxidative stress: a narrative review and synthesis. Headache: The Journal of Head and Face Pain. 2016;56(1):12-35.

254. Dhillon KS, Singh J, Lyall JS. A new horizon into the pathobiology, etiology and treatment of migraine. Medical hypotheses. 2011;77(1):147-51.

255. Bandmann O, Weiss KH, Kaler SG. Wilson's disease and other neurological copper disorders. The Lancet Neurology. 2015 Jan 1;14(1):103-13.

256. Peña-Quintana L, García-Luzardo MR, García-Villarreal L, et al. Manifestations and Evolution of Wilson Disease in Pediatric Patients Carrying ATP7B Mutation L708P. Journal of Pediatric Gastroenterology and Nutrition. 2012;54(1):48-54.

257. Squitti R, Polimanti R, Bucossi S, et al. Linkage Disequilibrium and Haplotype Analysis of the ATP7B Gene in Alzheimer's Disease. Rejuvenation Research. 2013;16(1):3-10.

258. Zamponi GW. Targeting voltage-gated calcium channels in neurological and psychiatric diseases. Nature reviews Drug discovery. 2016;15(1):19.

259. Zhuchenko O, Bailey J, Bonnen P, et al. Autosomal dominant cerebellar ataxia (SCA6) associated with small polyglutamine expansions in the α1A-voltage-dependent calcium channel. Nature genetics. 1997;15(1):62.

260. Petersen OH, Michalak M, Verkhratsky A. Calcium signalling: past, present and future. Cell calcium. 2005;38(3-4):161-9.

261. Wada T, Kobayashi N, Takahashi Y, Aoki T, Watanabe T, Saitoh S. Wide clinical variability in a family with a CACNA1A T666M mutation: hemiplegic migraine, coma, and progressive ataxia. Pediatric Neurology. 2002;26(1):47-50.

262. Cross-Disorder Group of the Psychiatric Genomics Consortium. Identification of risk loci with shared effects on five major psychiatric disorders: a genome-wide analysis. The Lancet. 2013;381(9875):1371-9.

263. De Jesús-Cortés H, Rajadhyaksha AM, Pieper AA. Cacna1c: Protecting young hippocampal neurons in the adult brain. Neurogenesis. 2016;3(1):e1231160.

264. Andrade A, Hope J, Allen A, Yorgan V, Lipscombe D, Pan JQ. A rare schizophrenia risk variant of CACNA1I disrupts CaV3.3 channel activity. Sci Rep. 2016 Oct 19;6:34233.

265. Andrade A, Hope J, Allen A, Yorgan V, Lipscombe D, Pan JQ. A rare schizophrenia risk variant of CACNA1I disrupts CaV3.3 channel activity. Scientific Reports. 2016;6(1):34233.

266. Lee SE, Lee J, Latchoumane C, et al. Rebound burst firing in the reticular thalamus is not essential for pharmacological absence seizures in mice. Proc Natl Acad Sci U S A. 2014 Aug 12;111(32):11828-33.

267. Wémeau J-L, Kopp P. Pendred syndrome. Best Practice & Research Clinical Endocrinology & Metabolism. 2017;31(2):213-24.

268. Frejo L, Giegling I, Teggi R, Lopez-Escamez JA, Rujescu D. Genetics of vestibular disorders: pathophysiological insights. Journal of Neurology. 2016;263:45-53.

269. Reyes S, Wang G, Ouyang X, et al. Mutation analysis of SLC26A4 in mainland Chinese patients with enlarged vestibular aqueduct. Otolaryngol Head Neck Surg. 2009 Oct;141(4):502-8.

270. Yuan H, Low CM, Moody OA, Jenkins A, Traynelis SF. Ionotropic GABA and Glutamate Receptor Mutations and Human Neurologic Diseases. Mol Pharmacol. 2015 Jul;88(1):203-17.

271. Gibson CJ, Meyer RC, Hamm RJ. Traumatic brain injury and the effects of diazepam, diltiazem, and MK-801 on GABA-A receptor subunit expression in rat hippocampus. Journal of Biomedical Science. 2010;17(1):38.

272. Drexel M, Puhakka N, Kirchmair E, Hörtnagl H, Pitkänen A, Sperk G. Expression of GABA receptor subunits in the hippocampus and thalamus after experimental traumatic brain injury. Neuropharmacology. 2015;88:122-33.

273. Liu B, Li L, Zhang Q, et al. Preservation of GABAA receptor function by PTEN inhibition protects against neuronal death in ischemic stroke. Stroke. 2010;41(5):1018-26.

274. Xu J, Liu Y, Zhang G-Y. Neuroprotection of GluR5-containing Kainate Receptor Activation against Ischemic Brain Injury through Decreasing Tyrosine Phosphorylation of N-Methyl-d-aspartate Receptors Mediated by Src Kinase. The Journal of biological chemistry. 2008;283(43):29355-66.

275. Wang J, Lin Z-J, Liu L, et al. Epilepsy-associated genes. Seizure. 2017;44:11-20.

276. Vikelis M, Mitsikostas DD. The role of glutamate and its receptors in migraine. CNS Neurol Disord Drug Targets. 2007 Aug;6(4):251-7.

277. Van Der Zee J, Van Langenhove T, Kovacs GG, et al. Rare mutations in SQSTM1 modify susceptibility to frontotemporal lobar degeneration. Acta neuropathologica. 2014;128(3):397-410.

278. Rubino E, Rainero I, Chiò A, et al. SQSTM1 mutations in frontotemporal lobar degeneration and amyotrophic lateral sclerosis. Neurology. 2012;79(15):1556-62.

279. Rea SL, Majcher V, Searle MS, Layfield R. SQSTM1 mutations–bridging Paget disease of bone and ALS/FTLD. Experimental cell research. 2014;325(1):27-37.

280. Le Ber I, Camuzat A, Guerreiro R, et al. SQSTM1 mutations in French patients with frontotemporal dementia or frontotemporal dementia with amyotrophic lateral sclerosis. JAMA neurology. 2013;70(11):1403-10.

281. Seibenhener ML, Du Y, Diaz-Meco M-T, Moscat J, Wooten MC, Wooten MW. A Role for Sequestosome1/p62 in Mitochondrial Dynamics, Import and Genome Integrity. Biochimica et biophysica acta. 2013;1833(3):452-9.

282. Bartolome F, Esteras N, Martin-Requero A, et al. Pathogenic p62/SQSTM1 mutations impair energy metabolism through limitation of mitochondrial substrates. Scientific Reports. 2017;7:1666.

283. Narendra D, Kane LA, Hauser DN, Fearnley IM, Youle RJ. p62/SQSTM1 is required for Parkin-induced mitochondrial clustering but not mitophagy; VDAC1 is dispensable for both. Autophagy. 2010 Nov;6(8):1090-106.

284. East DA, Fagiani F, Crosby J, et al. PMI: a DeltaPsim independent pharmacological regulator of mitophagy. Chem Biol. 2014 Nov 20;21(11):1585-96.

285. Nemtseva EV, Gerasimova MA, Melnik TN, Melnik BS. Experimental approach to study the effect of mutations on the protein folding pathway. PLoS One. 2019;14(1):e0210361.

286. Ylikallio E, Pöyhönen R, Zimon M, et al. Deficiency of the E3 ubiquitin ligase TRIM2 in early-onset axonal neuropathy. Human molecular genetics. 2013;22(15):2975-83.

287. Pehlivan D, Akdemir ZC, Karaca E, et al. Exome sequencing reveals homozygous TRIM2 mutation in a patient with early onset CMT and bilateral vocal cord paralysis. Human genetics. 2015;134(6):671-3.

288. Thompson S, Pearson AN, Ashley MD, et al. Identification of a novel BIM (BCL-2 interacting mediator of cell death) E3-ligase, tri-partite motif containing protein 2 (TRIM2), and its role in rapid ischemic tolerance-induced neuroprotection. Journal of Biological Chemistry. 2011:jbc.M110.197707.

289. Boone DK, Weisz HA, Bi M, et al. Evidence linking microRNA suppression of essential prosurvival genes with hippocampal cell death after traumatic brain injury. Scientific Reports. 2017;7(1):6645.

290. McKinnon C, Tabrizi SJ. The ubiquitin-proteasome system in neurodegeneration. Antioxid Redox Signal. 2014 Dec 10;21(17):2302-21.

291. Balastik M, Ferraguti F, Pires-da Silva A, et al. Deficiency in ubiquitin ligase TRIM2 causes accumulation of neurofilament light chain and neurodegeneration. Proceedings of the National Academy of Sciences. 2008.

292. Yamaguchi Y, Miura M. How to form and close the brain: insight into the mechanism of cranial neural tube closure in mammals. Cellular and molecular life sciences. 2013;70(17):3171-86.

293. Greene ND, Stanier P, Copp AJ. Genetics of human neural tube defects. Human molecular genetics. 2009;18(R2):R113-R29.

294. Tran H, Bustos D, Yeh R, et al. HectD1 E3 ligase modifies adenomatous polyposis coli (APC) with polyubiquitin to promote the APC-axin interaction. The Journal of biological chemistry. 2013 Feb 8;288(6):3753-67.

295. Muto V, Flex E, Kupchinsky Z, et al. Biallelic SQSTM1 mutations in early-onset, variably progressive neurodegeneration. Neurology. 2018;91(4):e319-e30.

296. Sundman MH, Hall EE, Chen NK. Examining the relationship between head trauma and neurodegenerative disease: A review of epidemiology, pathology and neuroimaging techniques. Journal of Alzheimer's disease & Parkinsonism. 2014 Jan 31;4.

297. Lek M, Karczewski KJ, Minikel EV, et al. Analysis of protein-coding genetic variation in 60,706 humans. Nature. 2016;536(7616):285-91.

298. Samocha KE, Robinson EB, Sanders SJ, et al. A framework for the interpretation of de novo mutation in human disease. Nat Genet. 2014 Sep;46(9):944-50.

299. Hardy JJ, Mooney SR, Pearson AN, et al. Assessing the accuracy of blood RNA profiles to identify patients with post-concussion syndrome: A pilot study in a military patient population. PloS one. 2017;12(9):e0183113.

300. Shin YS. Glycogen Storage Disease: Clinical, Biochemical, and Molecular Heterogeneity. Seminars in Pediatric Neurology. 2006;13(2):115-20.

301. Vilchez D, Ros S, Cifuentes D, et al. Mechanism suppressing glycogen synthesis in neurons and its demise in progressive myoclonus epilepsy. Nature Neuroscience. 2007;10(11):1407-13.

302. Mohler PJ, Bennett V. Ankyrin-based cardiac arrhythmias: a new class of channelopathies due to loss of cellular targeting. Current opinion in cardiology. 2005;20(3):189-93.

303. Nashef L, Hindocha N, Makoff A. Risk factors in sudden death in epilepsy (SUDEP): the quest for mechanisms. Epilepsia. 2007;48(5):859-71.

304. Kashef F, Li J, Wright P, et al. Ankyrin-B Protein in Heart Failure: Identification of a New Component of Metazoan Cardioprotection. Journal of Biological Chemistry. 2012;287(36):30268-81.

305. Mohler PJ, Splawski I, Napolitano C, et al. A cardiac arrhythmia syndrome caused by loss of ankyrin-B function. Proceedings of the National Academy of Sciences. 2004;101(24):9137-42.

306. Yang R, Walder-Christensen KK, Kim N, et al. ANK2 autism mutation targeting giant ankyrin-B promotes axon branching and ectopic connectivity. Proceedings of the National Academy of Sciences. 2019;116(30):15262-71.

307. Hayashi Y, Homma K, Ichijo H. SOD1 in neurotoxicity and its controversial roles in SOD1 mutation-negative ALS. Advances in biological regulation. 2016;60:95-104.

308. Hensley K, Abdel-Moaty H, Hunter J, et al. Primary glia expressing the G93A-SOD1 mutation present a neuroinflammatory phenotype and provide a cellular system for studies of glial inflammation. Journal of neuroinflammation. 2006;3(1):2.

309. Okuda H, Tatsumi K, Morita S, et al. Chondroitin sulfate proteoglycan tenascin-R regulates glutamate uptake by adult brain astrocytes. Journal of Biological Chemistry. 2014;289(5):2620-31.

310. Lucae S, Salyakina D, Barden N, et al. P2RX7, a gene coding for a purinergic ligand-gated ion channel, is associated with major depressive disorder. Human Molecular Genetics. 2006;15(16):2438-45.

311. Gao AF, editor. The Role of ATP and P2X Purinoceptor 7 in the Development of Cerebral Tau. 2018.

312. Ohba C, Shiina M, Tohyama J, et al. GRIN1 mutations cause encephalopathy with infantile-onset epilepsy, and hyperkinetic and stereotyped movement disorders. Epilepsia. 2015;56(6):841-8.

313. Georgi A, Jamra RA, Klein K, et al. Possible association between genetic variants at the GRIN1 gene and schizophrenia with lifetime history of depressive symptoms in a German sample. Psychiatric Genetics. 2007;17(5):308-10.

314. Weder N, Zhang H, Jensen K, et al. Child Abuse, Depression, and Methylation in Genes Involved With Stress, Neural Plasticity, and Brain Circuitry. Journal of the American Academy of Child & Adolescent Psychiatry. 2014;53(4):417-24.e5.

315. Hor H, Francescatto L, Bartesaghi L, et al. Missense mutations in TENM4, a regulator of axon guidance and central myelination, cause essential tremor. Human Molecular Genetics. 2015;24(20):5677-86.

316. Leng X-R, Qi X-H, Zhou Y-T, Wang Y-P. Gain-of-function mutation p.Arg225Cys in SCN11A causes familial episodic pain and contributes to essential tremor. Journal of Human Genetics. 2017;62(6):641-6.

317. Leipold E, Liebmann L, Korenke GC, et al. A de novo gain-of-function mutation in SCN11A causes loss of pain perception. Nature Genetics. 2013;45(11):1399-404.

318. Herzog R, Cummins T, Waxman S. Persistent TTX-resistant Na+ current affects resting potential and response to depolarization in simulated spinal sensory neurons. Journal of Neurophysiology. 2001;86(3):1351-64.

319. Houle G, Schmouth JF, Leblond CS, et al. Teneurin transmembrane protein 4 is not a cause for essential tremor in a Canadian population. Movement disorders : official journal of the Movement Disorder Society. 2017 Feb;32(2):292-5.

320. Singh NA, Pappas C, Dahle EJ, et al. A role of SCN9A in human epilepsies, as a cause of febrile seizures and as a potential modifier of Dravet syndrome. PLoS genetics. 2009;5(9):e1000649.

321. Drenth JPH, Waxman SG. Mutations in sodium-channel gene SCN9A cause a spectrum of human genetic pain disorders. The Journal of Clinical Investigation. 2007;117(12):3603-9.

322. Sandoval N, Platzer M, Rosenthal A, et al. Characterization of ATM Gene Mutations in 66 Ataxia Telangiectasia Families. Human Molecular Genetics. 1999;8(1):69-79.

323. Li J, Chen J, Ricupero CL, et al. Nuclear accumulation of HDAC4 in ATM deficiency promotes neurodegeneration in ataxia telangiectasia. Nature medicine. 2012;18(5):783-90.

324. Česká K, Aulická Š, Horák O, et al. Autosomal dominant temporal lobe epilepsy associated with heterozygous reelin mutation: 3 T brain MRI study with advanced neuroimaging methods. Epilepsy Behav Case Rep. 2018;11:39-42.

325. Michelucci R, Pulitano P, Di Bonaventura C, et al. The clinical phenotype of autosomal dominant lateral temporal lobe epilepsy related to reelin mutations. Epilepsy Behav. 2017;68:103-7.

326. Dazzo E, Fanciulli M, Serioli E, et al. Heterozygous Reelin Mutations Cause Autosomal-Dominant Lateral Temporal Epilepsy. The American Journal of Human Genetics. 2015;96(6):992-1000.

327. Lussier AL, Weeber EJ, Rebeck GW. Reelin Proteolysis Affects Signaling Related to Normal Synapse Function and Neurodegeneration. Frontiers in cellular neuroscience. 2016;10:75.

328. Lu H, Zhu X-C, Wang H-F, et al. Lack of Association Between SLC24A4 Polymorphism and Late-onset Alzheimer's Disease in Han Chinese. Curr Neurovasc Res. 2016;13(3):239-43.

329. Yu L, Chibnik LB, Srivastava GP, et al. Association of Brain DNA Methylation in SORL1, ABCA7, HLA-DRB5, SLC24A4, and BIN1 With Pathological Diagnosis of Alzheimer Disease. JAMA Neurology. 2015;72(1):15-24.

330. Wang S, Choi M, Richardson AS, et al. STIM1 and SLC24A4 Are Critical for Enamel Maturation. Journal of Dental Research. 2014;93(7_suppl):94S-100S.

331. Parry DA, Poulter JA, Logan CV, et al. Identification of Mutations in SLC24A4, Encoding a Potassium-Dependent Sodium/Calcium Exchanger, as a Cause of Amelogenesis Imperfecta. The American Journal of Human Genetics. 2013;92(2):307-12.

332. Saeedeh H, Ehsan A, Hassan R-K, et al. A meta-analysis of gene expression data highlights synaptic dysfunction in the hippocampus of brains with Alzheimer’s disease. Scientific Reports (Nature Publisher Group). 2020;10(1).

333. Tucsek Z, Valcarcel-Ares MN, Tarantini S, et al. Hypertension-induced synapse loss and impairment in synaptic plasticity in the mouse hippocampus mimics the aging phenotype: implications for the pathogenesis of vascular cognitive impairment. GeroScience. 2017;39(4):385-406.

334. Mechaussier S, Almoallem B, Zeitz C, et al. Loss of Function of RIMS2 Causes a Syndromic Congenital Cone-Rod Synaptic Disease with Neurodevelopmental and Pancreatic Involvement. The American Journal of Human Genetics. 2020.

335. Cocchi E, Drago A, Serretti A. Hippocampal Pruning as a New Theory of Schizophrenia Etiopathogenesis. Molecular Neurobiology. 2016;53(3):2065-81.

336. Bi C, Wu J, Jiang T, et al. Mutations of ANK3 identified by exome sequencing are associated with autism susceptibility. Human mutation. 2012;33(12):1635-8.

337. Nakamura T, Jimbo K, Nakajima K, Tsuboi T, Kato T. De novo UNC13B mutation identified in a bipolar disorder patient increases a rare exon-skipping variant. Neuropsychopharmacology Reports. 2018;38(4):210-3.

338. Egawa J, Hoya S, Watanabe Y, et al. Rare UNC13B variations and risk of schizophrenia: Whole-exome sequencing in a multiplex family and follow-up resequencing and a case–control study. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics. 2016;171(6):797-805.

339. Labrie V, Fukumura R, Rastogi A, et al. Serine racemase is associated with schizophrenia susceptibility in humans and in a mouse model. Human molecular genetics. 2009;18(17):3227-43.

340. Nickels SL, Walter J, Bolognin S, et al. Impaired serine metabolism complements LRRK2-G2019S pathogenicity in PD patients. Parkinsonism & Related Disorders. 2019;67:48-55.

341. Anderson JF, Siller E, Barral JM. The Neurodegenerative-Disease-Related Protein Sacsin Is a Molecular Chaperone. Journal of Molecular Biology. 2011;411(4):870-80.

342. Anderson JF, Siller E, Barral JM. The Sacsin Repeating Region (SRR): A Novel Hsp90-Related Supra-Domain Associated with Neurodegeneration. Journal of Molecular Biology. 2010;400(4):665-74.

343. Li X. Autosomal Recessive Spastic Ataxia of Charlevoix-Saguenay (ARSACS): a once obscure neurodegenerative disease with increasing significance for neurological research. McGill Science Undergraduate Research Journal. 2013;8(1).

344. Aulchenko YS, Hoppenbrouwers IA, Ramagopalan SV, et al. Genetic variation in the KIF1B locus influences susceptibility to multiple sclerosis. Nature Genetics. 2008;40(12):1402-3.

345. Chevalier-Larsen E, Holzbaur ELF. Axonal transport and neurodegenerative disease. Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease. 2006;1762(11):1094-108.

346. Sheng Z-H, Cai Q. Mitochondrial transport in neurons: impact on synaptic homeostasis and neurodegeneration. Nature Reviews Neuroscience. 2012;13(2):77-93.

347. Giniatullin R, Nistri A, Fabbretti E. Molecular Mechanisms of Sensitization of Pain-transducing P2X3 Receptors by the Migraine Mediators CGRP and NGF. Molecular Neurobiology. 2008;37(1):83.

348. Martins LB, Duarte H, Ferreira AVM, Rocha NP, Teixeira AL, Domingues RB. Migraine is associated with altered levels of neurotrophins. Neuroscience Letters. 2015;587:6-10.

349. Blandini F, Rinaldi L, Tassorelli C, et al. Peripheral Levels of BDNF and NGF in Primary Headaches. Cephalalgia. 2006;26(2):136-42.

350. Coskun S, Varol S, Ozdemir HH, et al. Association of brain-derived neurotrophic factor and nerve growth factor gene polymorphisms with susceptibility to migraine. Neuropsychiatric disease and treatment. 2016;12:1779-85.

351. Sano T, Kohyama-Koganeya A, Kinoshita MO, et al. Loss of GPRC5B impairs synapse formation of Purkinje cells with cerebellar nuclear neurons and disrupts cerebellar synaptic plasticity and motor learning. Neuroscience research. 2018;136:33-47.

352. Askland KD. Editorial: “Ion channels and mental illness: exploring etiology and pathophysiology in major psychiatric disorders”. Frontiers in Genetics. 2015;6:152.

353. Martins L, Teixeira A, Domingues R. Neurotrophins and migraine. Vitamins and hormones: Elsevier; 2017. p. 459-73.

354. Kullmann DM. Neurological Channelopathies. Annual Review of Neuroscience. 2010;33(1):151-72.

355. Martikainen MH, Ellfolk U, Majamaa K. Impaired information-processing speed and working memory in leukoencephalopathy with brainstem and spinal cord involvement and elevated lactate (LBSL) and DARS2 mutations: a report of three adult patients. Journal of Neurology. 2013;260(8):2078-83.

356. Van Berge L, Hamilton EM, Linnankivi T, et al. Leukoencephalopathy with brainstem and spinal cord involvement and lactate elevation: clinical and genetic characterization and target for therapy. Brain. 2014;137(4):1019-29.

357. N’Gbo N’Gbo Ikazabo R, Mostosi C, Jissendi P, Labaisse MA, Vandernoot I. A New DARS2 Mutation Discovered in an Adult Patient. Case Reports in Neurology. 2020;12(1):107-13.

358. Isohanni P, Linnankivi T, Buzkova J, et al. DARS2 mutations in mitochondrial leucoencephalopathy and multiple sclerosis. Journal of Medical Genetics. 2010;47(1):66-70.

359. Dang M, Wang Z, Zhang R, et al. KALRN Rare and Common Variants and Susceptibility to Ischemic Stroke in Chinese Han Population. NeuroMolecular Medicine. 2015;17(3):241-50.

360. Vasudeva K, Munshi A. Genetics of platelet traits in ischaemic stroke: focus on mean platelet volume and platelet count. International Journal of Neuroscience. 2019;129(5):511-22.

361. Garry P, Ezra M, Rowland M, Westbrook J, Pattinson K. The role of the nitric oxide pathway in brain injury and its treatment—from bench to bedside. Experimental neurology. 2015;263:235-43.

362. Lomash RM, Gu X, Youle RJ, Lu W, Roche KW. Neurolastin, a dynamin family GTPase, regulates excitatory synapses and spine density. Cell reports. 2015;12(5):743-51.

363. Wang S-M, Lee Y-C, Ko C-Y, et al. Increase of zinc finger protein 179 in response to CCAAT/enhancer binding protein delta conferring an antiapoptotic effect in astrocytes of Alzheimer’s disease. Molecular neurobiology. 2015;51(1):370-82.

364. Zhang F, Zhang C. Rnf112 deletion protects brain against intracerebral hemorrhage (ICH) in mice by inhibiting TLR-4/NF-κB pathway. Biochemical and biophysical research communications. 2018;507(1-4):43-50.

365. Zempel H, Mandelkow E. Lost after translation: missorting of Tau protein and consequences for Alzheimer disease. Trends in neurosciences. 2014;37(12):721-32.

366. van Beuningen SFB, Will L, Harterink M, et al. TRIM46 Controls Neuronal Polarity and Axon Specification by Driving the Formation of Parallel Microtubule Arrays. Neuron. 2015;88(6):1208-26.

367. Cousin E, Hannequin D, Ricard S, et al. A risk for early-onset Alzheimer's disease associated with the APBB1 gene (FE65) intron 13 polymorphism. Neuroscience Letters. 2003;342(1):5-8.

368. Khanahmadi M, Farhud DD, Malmir M. Genetic of Alzheimer's Disease: A Narrative Review Article. Iran J Public Health. 2015;44(7):892-901.

369. Guénette SY, Bertram L, Crystal A, et al. Evidence against association of the FE65 gene (APBB1) intron 13 polymorphism in Alzheimer's patients. Neuroscience Letters. 2000;296(1):17-20.

370. Formicola D, Aloia A, Sampaolo S, et al. Common variants in the regulative regions of GRIA1 and GRIA3 receptor genes are associated with migraine susceptibility. BMC Medical Genetics. 2010;11(1):103.

371. Gasparini CF, Smith RA, Griffiths LR. Genetic insights into migraine and glutamate: a protagonist driving the headache. Journal of the neurological sciences. 2016;367:258-68.

372. Formicola D, Esposito T, Magliocca S, et al. A coding variant in GRIN3A gene is associated with migraine in Italian population. 2011.

373. Quadri M, Fang M, Picillo M, et al. Mutation in the SYNJ1 Gene Associated with Autosomal Recessive, Early-Onset Parkinsonism. Human mutation. 2013;34(9):1208-15.

374. Krebs CE, Karkheiran S, Powell JC, et al. The Sac1 Domain of SYNJ1 Identified Mutated in a Family with Early-Onset Progressive Parkinsonism with Generalized Seizures. Human mutation. 2013;34(9):1200-7.

375. Dyment DA, Smith AC, Humphreys P, et al. Homozygous nonsense mutation in SYNJ1 associated with intractable epilepsy and tau pathology. Neurobiology of aging. 2015;36(2):1222.e1-e5.

376. Guo MH, Plummer L, Chan Y-M, Hirschhorn JN, Lippincott MF. Burden testing of rare variants identified through exome sequencing via publicly available control data. The American Journal of Human Genetics. 2018;103(4):522-34.

377. Lee S, Abecasis GR, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. The American Journal of Human Genetics. 2014;95(1):5-23.

378. Wang GT, Peng B, Leal SM. Variant association tools for quality control and analysis of large-scale sequence and genotyping array data. The American Journal of Human Genetics. 2014;94(5):770-83.

379. Suen D-F, Norris KL, Youle RJ. Mitochondrial dynamics and apoptosis. Genes & development. 2008;22(12):1577-90.

380. Andrews RM, Kubacka I, Chinnery PF, Lightowlers RN, Turnbull DM, Howell N. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat Genet. 1999 Oct;23(2):147.

381. Macaulay V, Richards DM. Human mitochondrial DNA and the evolution of Homo sapiens: Springer; 2006.

382. Viscomi C, Zeviani M. MtDNA-maintenance defects: syndromes and genes. Journal of Inherited Metabolic Disease. 2017;40(4):587-99.

383. Taylor RW, Turnbull DM. Mitochondrial DNA mutations in human disease. Nature reviews Genetics. 2005;6(5):389-402.

384. Kenney MC, Chwa M, Atilano SR, et al. Molecular and bioenergetic differences between cells with African versus European inherited mitochondrial DNA haplogroups: implications for population susceptibility to diseases. Biochim Biophys Acta. 2014 Feb;1842(2):208-19.

385. Graffy EA, Foran DR. A Simplified Method for Mitochondrial DNA Extraction from Head Hair Shafts. Journal of Forensic Sciences. 2005;50(5):JFS2005126-4.

386. Pronicka E, Piekutowska-Abramczuk D, Ciara E, et al. New perspective in diagnostics of mitochondrial disorders: two years' experience with whole-exome sequencing at a national paediatric centre. J Transl Med. 2016;14(1):174.

387. Pyle A, Foltynie T, Tiangyou W, et al. Mitochondrial DNA haplogroup cluster UKJT reduces the risk of PD. Annals of Neurology. 2005;57(4):564-7.

388. van der Walt JM, Nicodemus KK, Martin ER, et al. Mitochondrial polymorphisms significantly reduce the risk of Parkinson disease. Am J Hum Genet. 2003 Apr;72(4):804-11.

389. Campbell GR, Ziabreva I, Reeve AK, et al. Mitochondrial DNA deletions and neurodegeneration in multiple sclerosis. Annals of Neurology. 2011;69(3):481-92.

390. Gilmer LK, Ansari MA, Roberts KN, Scheff SW. Age-Related Mitochondrial Changes after Traumatic Brain Injury. Journal of neurotrauma. 2010;27(5):939-50.

391. Vagnozzi R, Marmarou A, Tavazzi B, et al. Changes of cerebral energy metabolism and lipid peroxidation in rats leading to mitochondrial dysfunction after diffuse brain injury. Journal of neurotrauma. 1999 Oct;16(10):903-13.

392. McDonald RP, Horsburgh KJ, Graham DI, Nicoll JA. Mitochondrial DNA deletions in acute brain injury. Neuroreport. 1999;10(9):1875-8.

393. Conley YP, Okonkwo DO, Deslouches S, et al. Mitochondrial polymorphisms impact outcomes after severe traumatic brain injury. Journal of neurotrauma. 2014;31(1):34-41.

394. Van Der Walt JM, Nicodemus KK, Martin ER, et al. Mitochondrial polymorphisms significantly reduce the risk of Parkinson disease. The American Journal of Human Genetics. 2003;72(4):804-11.

395. Wagner M, Berutti R, Lorenz-Depiereux B, et al. Mitochondrial DNA mutation analysis from exome sequencing-A more holistic approach in diagnostics of suspected mitochondrial disease. J Inherit Metab Dis. 2019 Sep;42(5):909-17.

396. Lott MT, Leipzig JN, Derbeneva O, et al. mtDNA variation and analysis using MITOMAP and MITOMASTER. Current Protocols in Bioinformatics. 2013;44(1):1.23.1-26.

397. Shen L, Attimonelli M, Bai R, et al. MSeqDR mvTool: A mitochondrial DNA Web and API resource for comprehensive variant annotation, universal nomenclature collation, and reference genome conversion. Human mutation. 2018;39(6):806-10.

398. Weissensteiner H, Pacher D, Kloss-Brandstätter A, et al. HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 2016;44(W1):W58-W63.

399. Molinuevo JL, Gramunt N, Gispert JD, et al. The ALFA project: A research platform to identify early pathophysiological features of Alzheimer's disease. Alzheimer's & dementia (New York, N Y). 2016 Jun;2(2):82-92.

400. Connelly LM. Fisher's exact test. MedSurg Nursing. 2016;25(1):58-60.

401. Kucharczyk R, Rak M, Di Rago J-P. Biochemical consequences in yeast of the human mitochondrial DNA 8993T > C mutation in the ATPase6 gene found in NARP/MILS patients. BBA - Molecular Cell Research. 2009;1793(5):817-24.

402. Heidari MM, Mirfakhradini FS, Tayefi F, Ghorbani S, Khatami M, Hadadzadeh M. Novel Point Mutations in Mitochondrial MT-CO2 Gene May Be Risk Factors for Coronary Artery Disease. Applied Biochemistry and Biotechnology. 2020;191(3):1326-39.

403. Noer AS, Sudoyo H, Lertrit P, et al. A tRNA(Lys) mutation in the mtDNA is the causal genetic lesion underlying myoclonic epilepsy and ragged-red fiber (MERRF) syndrome. Am J Hum Genet. 1991 Oct;49(4):715-22.

404. Khusnutdinova E, Gilyazova I, Ruiz-Pesini E, et al. A Mitochondrial Etiology of Neurodegenerative Diseases: Evidence from Parkinson's Disease. Annals of the New York Academy of Sciences. 2008;1147(1):1-20.

405. Ridge PG, Kauwe JSK. Mitochondria and Alzheimer’s Disease: the Role of Mitochondrial Genetic Variation. Current Genetic Medicine Reports. 2018;6(1):1-10.

406. Ghaffarpour M, Mahdian R, Fereidooni F, Kamalidehghan B, Moazami N, Houshmand M. The mitochondrial ATPase6 gene is more susceptible to mutation than the ATPase8 gene in breast cancer patients. Cancer Cell Int. 2014;14(1):21.

407. Guo X-G, Liu C-T, Dai H, Guo Q-N. Mutations in the mitochondrial ATPase6 gene are frequent in human osteosarcoma. Experimental and Molecular Pathology. 2013;94(1):285-8.

408. Rollins B, Martin MV, Sequeira PA, et al. Mitochondrial variants in schizophrenia, bipolar disorder, and major depressive disorder. PloS one. 2009;4(3):e4913.

409. Zhang J, Zhang Z-X, Du P-C, et al. Analyses of the mitochondrial mutations in the Chinese patients with sporadic Creutzfeldt–Jakob disease. European Journal of Human Genetics. 2015;23(1):86-91.

410. Fessele KL, Wright F. Primer in genetics and genomics, article 6: Basics of epigenetic control. Biological Research for Nursing. 2018;20(1):103-10.

411. Alghanim H, Wu W, McCord B. DNA methylation assay based on pyrosequencing for determination of smoking status. Electrophoresis. 2018;39(21):2806-14.

412. Mah W-C, Lee CGL. DNA methylation: potential biomarker in Hepatocellular Carcinoma. Biomarker Research. 2014;2(1):5.

413. Tang Q, Cheng J, Cao X, Surowy H, Burwinkel B. Blood-based DNA methylation as biomarker for breast cancer: a systematic review. Clinical Epigenetics. 2016;8(1):115.

414. Jubb AM, Quirke P, Oates AJ. DNA Methylation, a Biomarker for Colorectal Cancer. Annals of the New York Academy of Sciences. 2003;983(1):251-67.

415. Levenson VV. DNA methylation as a universal biomarker. Expert Review of Molecular Diagnostics. 2010;10(4):481-8.

416. Zheleznyakova GY, Cao H, Schiöth HB. BDNF DNA methylation changes as a biomarker of psychiatric disorders: literature review and open access database analysis. Behavioral and Brain Functions. 2016;12(1):17.

417. Andersen AM, Dogan MV, Beach SR, Philibert RA. Current and future prospects for epigenetic biomarkers of substance use disorders. Genes. 2015;6(4):991-1022.

418. Numata S, Ishii K, Tajima A, et al. Blood diagnostic biomarkers for major depressive disorder using multiplex DNA methylation profiles: discovery and validation. Epigenetics. 2015;10(2):135-41.

419. Shimada-Sugimoto M, Otowa T, Miyagawa T, et al. Epigenome-wide association study of DNA methylation in panic disorder. Clinical Epigenetics. 2017;9(1):6.

420. Uhlén M, Fagerberg L, Hallström BM, et al. Proteomics. Tissue-based map of the human proteome. Science. 2015 Jan 23;347(6220):1260419.

421. Sasaki Y, Oshima Y, Koyama R, et al. Identification of Flotillin-2, a Major Protein on Lipid Rafts, as a Novel Target of p53 Family Members. Molecular Cancer Research. 2008;6(3):395-406.

422. Nagano M, Toshima JY, Siekhaus DE, Toshima J. Rab5-mediated endosome formation is regulated at the trans-Golgi network. Communications Biology. 2019;2(1):419.

423. Yuan W, Song C. The Emerging Role of Rab5 in Membrane Receptor Trafficking and Signaling Pathways. Biochemistry Research International. 2020;2020:4186308.

424. Xu W, Fang F, Ding J, Wu C. Dysregulation of Rab5-mediated endocytic pathways in Alzheimer's disease. Traffic. 2018;19(4):253-62.

425. Langie SAS, Moisse M, Declerck K, et al. Salivary DNA Methylation Profiling: Aspects to Consider for Biomarker Identification. Basic & Clinical Pharmacology & Toxicology. 2017;121(Suppl 3):93-101.

426. Braun P, Hafner M, Nagahama Y, et al. Genome-Wide DNA Methylation Comparison Between Live Human Brain and Peripheral Tissues Within Individuals. European Neuropsychopharmacology. 2017;27:S506.

427. Fakruddin M, Chowdhury A, Hossain MN, Mannan K, Mazumda R. Pyrosequencing-principles and applications. Int J Life Sci Pharma Res. 2012;2:65-76.

428. Ibrahim O, Sutherland HG, Avgan N, et al. Investigation of the CADM2 polymorphism rs17518584 in memory and executive functions measures in a cohort of young healthy individuals. Neurobiology of learning and memory. 2018 Nov;155:330-6.

429. Grasso C, Trevisan M, Fiano V, et al. Performance of Different Analytical Software Packages in Quantification of DNA Methylation by Pyrosequencing. PLOS ONE. 2016;11(3):e0150483.

430. Pérez RF, Santamarina P, Tejedor JR, et al. Longitudinal genome-wide DNA methylation analysis uncovers persistent early-life DNA methylation changes. Journal of Translational Medicine. 2019;17(1):15.

431. Auffray C, Griffin JL, Khoury MJ, Lupski JR, Schwab M. Ten years of genome medicine. BioMed Central; 2019.

432. Nagarajan N, Yapp EK, Le NQK, Kamaraj B, Al-Subaie AM, Yeh H-Y. Application of Computational Biology and Artificial Intelligence Technologies in Cancer Precision Drug Discovery. Biomed Res Int. 2019;2019.

433. Ho SS, Urban AE, Mills RE. Structural variation in the sequencing era. Nature Reviews Genetics. 2019:1-19.

434. Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nature Reviews Genetics. 2015;16(6):321-32.

435. Ashley EA. The precision medicine initiative: a new national effort. Jama. 2015 Jun 2;313(21):2119-20.

436. Momozawa Y, Mni M, Nakamura K, et al. Resequencing of positional candidates identifies low frequency IL23R coding variants protecting against inflammatory bowel disease. Nature genetics. 2011;43(1):43.

437. Huang H, Fang M, Jostins L, et al. Fine-mapping inflammatory bowel disease loci to single-variant resolution. Nature. 2017;547(7662):173-8.

438. Yang J, Benyamin B, McEvoy BP, et al. Common SNPs explain a large proportion of the heritability for human height. Nature genetics. 2010;42(7):565.

439. Romagnoni A, Jégou S, Van Steen K, Wainrib G, Hugot J-P. Comparative performances of machine learning methods for classifying Crohn Disease patients using genome-wide genotyping data. Scientific reports. 2019;9(1):1-18.

440. Pare G, Mao S, Deng WQ. A machine-learning heuristic to improve gene score prediction of polygenic traits. Scientific reports. 2017;7(1):1-11.

441. Sardaar S, Qi B, Dionne-Laporte A, Rouleau GA, Rabbany R, Trakadis YJ. Machine learning analysis of exome trios to contrast the genomic architecture of autism and schizophrenia. BMC Psychiatry. 2020 Feb 28;20(1):92.

442. Er F, Iscen P, Sahin S, Çinar N, Karsidag S, Goularas D. Distinguishing age-related cognitive decline from dementias: A study based on machine learning algorithms. Journal of clinical neuroscience : official journal of the Neurosurgical Society of Australasia. 2017 Aug;42:186-92.


443. Messina R, Filippi M, Goadsby PJ. Recent advances in headache neuroimaging. Current opinion in neurology. 2018 Aug;31(4):379-85.

444. Rocca MA, Ceccarelli A, Falini A, et al. Brain gray matter changes in migraine patients with T2-visible lesions: a 3-T MRI study. Stroke. 2006 Jul;37(7):1765-70.

445. González-Recio O, Jiménez-Montero J, Alenda R. The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets. Journal of dairy science. 2013;96(1):614-24.

446. Kassahun Y, Perrone R, De Momi E, et al. Automatic classification of epilepsy types using ontology-based and genetics-based machine learning. Artificial Intelligence in Medicine. 2014 Jun 1;61(2):79-88.

447. Johnson B, Zhang K, Gay M, et al. Alteration of brain default network in subacute phase of injury in concussed individuals: resting-state fMRI study. NeuroImage. 2012;59(1):511-8.

448. Hastie T, Tibshirani R, Friedman J. Boosting and additive trees. The elements of statistical learning: Springer; 2009. p. 337-87.

449. Kokol P, Pohorec S, Štiglic G, Podgorelec V. Evolutionary design of decision trees for medical application. WIREs Data Mining and Knowledge Discovery. 2012;2(3):237-54.

450. Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of statistics. 2001:1189-232.

451. Kuo FY, Sloan IH. Lifting the curse of dimensionality. Notices of the AMS. 2005;52(11):1320-8.

452. Shavitt I, Segal E, editors. Regularization learning networks: deep learning for tabular datasets. Advances in Neural Information Processing Systems; 2018.

453. Abraham G, Kowalczyk A, Zobel J, Inouye M. Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease. Genetic epidemiology. 2013;37(2):184-95.

454. Chen G-B, Lee SH, Montgomery GW, et al. Performance of risk prediction for inflammatory bowel disease based on genotyping platform and genomic risk score method. BMC medical genetics. 2017;18(1):94.

455. Ma J, Amos CI. Investigation of Inversion Polymorphisms in the Human Genome Using Principal Components Analysis (Principal Components Analysis of Inversions). PLoS ONE. 2012;7(7):e40224.

456. Bertolini F, Galimberti G, Calò DG, Schiavo G, Matassino D, Fontanesi L. Combined use of principal component analysis and random forests identify population-informative single nucleotide polymorphisms: application in cattle breeds. Journal of Animal Breeding and Genetics. 2015;132(5):346-56.

457. Friedman J, Hastie T, Tibshirani R. Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors). The annals of statistics. 2000;28(2):337-407.

458. Wray NR, Yang J, Goddard ME, Visscher PM. The genetic interpretation of area under the ROC curve in genomic profiling. PLoS genetics. 2010 Feb 26;6(2):e1000864.

459. Daneshjou R, Wang Y, Bromberg Y, et al. Working toward precision medicine: Predicting phenotypes from exomes in the Critical Assessment of Genome Interpretation (CAGI) challenges. Human mutation. 2017;38(9):1182-92.

460. Park JH, Shin SD, Song KJ, et al. Prediction of good neurological recovery after out-of-hospital cardiac arrest: A machine learning analysis. Resuscitation. 2019 Sep;142:127-35.

461. Liao J-C, Yang TT, Weng RR, Kuo C-T, Chang C-W. TTBK2: A Tau Protein Kinase beyond Tau Phosphorylation. Biomed Res Int. 2015 Apr 9;2015:575170.

462. Wang J-Z, Xia Y-Y, Grundke-Iqbal I, Iqbal K. Abnormal hyperphosphorylation of tau: sites, regulation, and molecular mechanism of neurofibrillary degeneration. Journal of Alzheimer's disease. 2013;33(s1):S123-S39.

463. Hanger DP, Anderton BH, Noble W. Tau phosphorylation: the therapeutic challenge for neurodegenerative disease. Trends in molecular medicine. 2009;15(3):112-9.

464. Gendron TF, Petrucelli L. The role of tau in neurodegeneration. Molecular neurodegeneration. 2009;4(1):13.

465. LaPointe NE, Morfini G, Pigino G, et al. The amino terminus of tau inhibits kinesin-dependent axonal transport: implications for filament toxicity. Journal of neuroscience research. 2009;87(2):440-51.

466. Sato S, Cerny RL, Buescher JL, Ikezu T. Tau-tubulin kinase 1 (TTBK1), a neuron-specific tau kinase candidate, is involved in tau phosphorylation and aggregation. Journal of neurochemistry. 2006;98(5):1573-84.

467. Bowie E, Norris R, Anderson KV, Goetz SC. Spinocerebellar ataxia type 11-associated alleles of Ttbk2 dominantly interfere with ciliogenesis and cilium stability. PLoS genetics. 2018;14(12):e1007844.

468. Sánchez-Pozos K, Ortíz-López MG, Peña-Espinoza BI, et al. Whole-exome sequencing in maya indigenous families: variant in PPP1R3A is associated with type 2 diabetes. Molecular Genetics and Genomics. 2018;293(5):1205-16.


469. Cetinkalp S, Simsir IY, Ertek S. Insulin resistance in brain and possible therapeutic approaches. Current vascular pharmacology. 2014;12(4):553-64.

470. Laron Z. Insulin and the brain. Archives of physiology and biochemistry. 2009 May;115(2):112-6.

471. Strommer L, Wickbom M, Wang F, et al. Early impairment of insulin secretion in rats after surgical trauma. European journal of endocrinology. 2002;147(6):825-33.

472. Shi J, Dong B, Mao Y, et al. Review: Traumatic brain injury and hyperglycemia, a potentially modifiable risk factor. Oncotarget. 2016;7(43):71052-61.

473. Karelina K, Weil ZM. Neuroenergetics of traumatic brain injury. Concussion. 2015;1(2):CNC9-CNC.

474. Ahn J-Y. Neuroprotection signaling of nuclear akt in neuronal cells. Exp Neurobiol. 2014;23(3):200-6.

475. Ben-David A. A lot of randomness is hiding in accuracy. Engineering Applications of Artificial Intelligence. 2007;20(7):875-85.

476. Fernandes JA, Irigoien X, Goikoetxea N, et al. Fish recruitment prediction, using robust supervised classification methods. Ecological Modelling. 2010;221(2):338-52.

477. Garcia-Moral AI, Solera-Urena R, Pelaez-Moreno C, Diaz-de-Maria F. Data balancing for efficient training of hybrid ANN/HMM automatic speech recognition systems. IEEE Transactions on audio, speech, and language processing. 2010;19(3):468-81.

478. Lu H, Xu Y, Ye M, Yan K, Gao Z, Jin Q. Learning misclassification costs for imbalanced classification on gene expression data. BMC Bioinformatics. 2019 Dec 24;20(25):681.

479. Delgado R, Tibau X-A. Why Cohen’s Kappa should be avoided as performance measure in classification. PloS one. 2019;14(9):e0222916.

480. Ben-David A. Comparison of classification accuracy using Cohen’s Weighted Kappa. Expert Systems with Applications. 2008;34(2):825-32.

481. Viera AJ, Garrett JM. Understanding interobserver agreement: the kappa statistic. Fam med. 2005;37(5):360-3.

482. Landis JR, Koch GG. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics. 1977:363-74.

483. Fleiss JL, Nee JC, Landis JR. Large sample variance of kappa in the case of different sets of raters. Psychological bulletin. 1979;86(5):974.


484. Seigel DG, Podgor MJ, Remaley NA. Acceptable values of kappa for comparison of two groups. American journal of epidemiology. 1992;135(5):571-8.

485. Wolpert DH, Macready WG. No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation. 1997;1(1):67-82.

486. García S, Fernández A, Luengo J, Herrera F. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information Sciences. 2010 May 15;180(10):2044-64.

487. Dietterich TG. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Comput. 1998;10(7):1895-923.


Appendices

Appendix 1

Setting up a standard PCR reaction:

Reagent                  Single Reaction Volume (µL)   __X Reaction Volume (µL)
10X PCR Buffer (green)   5
25 mM MgCl2              1.75
5 mM dNTP                1
Forward primer           1
Reverse primer           1
Nuclease-free water      13
Taq polymerase           0.25
Master Mix Total         23
DNA (20 ng/µL)           2                             -
Reaction TOTAL           25                            -
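The per-reaction volumes above scale linearly when a master mix is prepared for several reactions (the blank __X Reaction column). The following is a minimal Python sketch of that arithmetic. The function name `master_mix` and the 10% pipetting overage are illustrative assumptions and are not part of the protocol; template DNA is left out of the mix because the table adds it to each reaction separately.

```python
# Per-reaction volumes (µL) copied from the Appendix 1 table.
PER_REACTION_UL = {
    "10X PCR Buffer (green)": 5.0,
    "25 mM MgCl2": 1.75,
    "5 mM dNTP": 1.0,
    "Forward primer": 1.0,
    "Reverse primer": 1.0,
    "Nuclease-free water": 13.0,
    "Taq polymerase": 0.25,
}
DNA_UL = 2.0  # template DNA (20 ng/µL), added per tube rather than to the mix


def master_mix(n_reactions, overage=0.10):
    """Scale each reagent to n_reactions plus a fractional overage.

    The 10% default overage is an assumption for illustration only.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    mix = master_mix(10)
    for name, vol in mix.items():
        print(f"{name:<24}{vol:>8.2f} µL")
    print(f"{'Master mix total':<24}{sum(mix.values()):>8.2f} µL")
```

Each tube would then receive 23 µL of master mix and 2 µL of DNA, giving the 25 µL reaction total listed in the table.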


Incubate the reaction mix in a thermocycler using the following cycle conditions:

Cycles   Temperature   Time
1        95.0°C        2:00 min
30       94.0°C        1:00 min
         65.0°C        1:00 min
         72.0°C        1:00 min
1        72.0°C        2:00 min
         4.0°C         ∞
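As a quick check on scheduling, the programmed hold time of the cycling conditions above can be added up. This is a small Python sketch under stated assumptions: the stage labels in the comments (initial denaturation, anneal, final extension) are the conventional reading of the table and are not named in the protocol, ramp times between temperatures are ignored, and the open-ended 4.0°C hold is excluded. `programmed_minutes` is a hypothetical helper name.

```python
# Cycling program from Appendix 1 as (repeats, [(temperature in °C, seconds), ...]).
PROGRAM = [
    (1, [(95.0, 120)]),                          # initial denaturation (assumed label)
    (30, [(94.0, 60), (65.0, 60), (72.0, 60)]),  # denature / anneal / extend
    (1, [(72.0, 120)]),                          # final extension (assumed label)
]  # the 4.0°C hold has no fixed duration and is left out


def programmed_minutes(program):
    """Return the total programmed hold time in minutes, ignoring ramp time."""
    seconds = sum(repeats * sum(t for _, t in steps) for repeats, steps in program)
    return seconds / 60


if __name__ == "__main__":
    print(f"Programmed time: {programmed_minutes(PROGRAM):.0f} min")
```

The real run will take somewhat longer than this figure because the instrument needs time to ramp between temperatures.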


Appendix 2

Heatmaps for 4 chips comparing sequencing of DNA extracted from both saliva and blood for 4 pairs of samples.


Appendix 3

IGV traces of several variants that were called differently in saliva and blood samples.


Appendix 4

Tier1     Tier2      Tier3
ABCD1     ABHD14B    ACAN
ABCD1P2   ABHD17AP2  ADAMTS2
ABCD1P3   APTR       ADAMTS9
ABCD1P4   ATP2B1     ANK3
ABCD1P5   KATNBL1P2  ANXA1
ATP1A1    LAPTM4B    Anxa2
ATP1A1OS  MAPT       APOE
ATP1A2    NACAD      ARHGAP20
ATP1A3    ABCA8      B3GNT7
ATP1A4    ABCB1      B4GALT2
CACNA1A   ABCB10     BCL11A
CACNA1A   ABCB10P1   CAMK4
CACNA1B   ABCB10P3   CDH13
CACNA1C   ABCB10P4   CHST5
CACNA1C   ABCB11     CHST6
CACNA1D   ABCB4      COLEC12
CACNA1D   ABCB5      DGKH
CACNA1E   ABCB6      DPP10
CACNA1F   ABCB7      DYSF
CACNA1F   ABCB8      ELMO1
CACNA1G   ABCB9      ESR1
CACNA1H   ABHD1      ESRRB
CACNA1H   ABHD10     FMOD
CACNA1I   ABHD11     GRIN2A
CACNA1S   ABHD12     KERA
CACNA2D1  ABHD12B    LIPC
CACNA2D2  ABHD13     LRRC7
CACNA2D3  ABHD14A    LRRN3
CACNA2D4  ABHD15     LUM
FSCN1     ABHD16A    LYPLAL1
FSCN2     ABHD16B    MAPK1
FSCN3     ABHD17A    NDN
OBSCN     ABHD17AP1  NMU
PRRT1     ABHD17AP3  NR4A2
PRRT2     ABHD17AP4  OGN
PRRT3     ABHD17AP5  OMD
PRRT4     ABHD17AP6  PAWR
SCN10A    ABHD17B    PCDH11X
SCN10A    ABHD17C    PDE7B
SCN11A    ABHD18     PLAT
SCN11A    ABHD2      PLEKHG1
SCN1A     ABHD3      PLG
SCN1B     ABHD4      PPP1R3B
SCN2A     ABHD5      PRELP
SCN2B     ABHD6      PROX1
SCN3A     ABHD8      RPS6KA1
SCN3B     ACAD10     S100A10
SCN4A     ACAD11     S100A4
SCN4A     ACAD8      SCARA3
SCN4B     ACAD9      SDCBP
SCN5A     ACADL      SEMA3A
SCN5A     ACADM      SERINC2
SCN7A     ACADS      SLC17A7
SCN8A     ACADSB     SLC26A4
SCN9A     ACADVL     SMC2
SCN9A     AFG3L1P    SP6
SCNM1     AFG3L2     TCEB1
SCNN1A    AFG3L2P1   TGFB2
SCNN1B    ANO1       TRIM72
SCNN1D    ANO10      TSHZ2
SCNN1G    ANO2       TTLL7
-         ANO3       TYRP1
-         ANO4       UX2
-         ANO5       ZDHHC21
-         ANO6       ZIC5
-         ANO7       CAPN1
-         ANO7P1     CAPN2
-         ANO8       CAPN3
-         ANO9       CAPN5
-         ANOS1      CAPN6
-         APTX       CAPN7
-         ATCAY      CAPN8
-         ATM        CAPN9
-         ATMIN      CAPN10
-         ATN1       CAPN11
-         ATP2B2     CAPN12
-         ATP2B3     CAPN13
-         ATP2B4     CAPN14
-         ATXN1      CAPN15
-         ATXN10     ADGB
-         ATXN1L     CASP1
-         ATXN2      CASP2
-         ATXN2L     CASP3
-         ATXN3      CASP4
-         ATXN3L     CASP5
-         ATXN7      CASP6
-         ATXN7L1    CASP7
-         ATXN7L2    CASP8
-         ATXN7L3    CASP9
-         ATXN7L3B   CASP10
-         ATXN8OS    CASP12
-         BEAN1      CASP14
-         C10orf2    CASP16P
-         C10orf25   IL1B
-         CA8        IL1-RN
-         CCDC88C    IL6
-         CDCA8      IL8
-         COQ10A     IL10
-         COQ10B     PTGS2
-         COQ10BP2   TNFA
-         COQ2       CCBP2
-         COQ3       CCL1
-         COQ4       CCL2
-         COQ5       CCL3
-         COQ6       CCL4
-         COQ7       CCL5
-         COQ8B      CCL6
-         COQ9       CCL7
-         CTDP1      CCL8
-         CWF19L1    CCL9
-         DCTPP1     CCL11
-         DNMT1      CCL12
-         DNMT3A     CCL13
-         DNMT3B     CCL14
-         DNMT3L     CCL15
-         EEF2       CCL16
-         EEF2K      CCL17
-         EEF2KMT    CCL18
-         ELOVL4     CCL19
-         ELOVL5     CCL20
-         FGF14      CCL21
-         FXN        CCL22
-         GAPT       CCL23
-         GATM       CCL24
-         GLTPP1     CCL25
-         GRID2      CCL26
-         GRID2IP    CCL27
-         GRM1       CCL28
-         ITPR1      CCR1
-         KATNA1     CCR2
-         KATNAL1    CCR3
-         KATNAL2    CCR4
-         KATNB1     CCR5
-         KATNBL1    CCR6
-         KATNBL1P1  CCR7
-         KATNBL1P3  CCR8
-         KATNBL1P4  CCR9
-         KATNBL1P5  CCR10
-         KATNBL1P6  CCRL2
-         KCNC1      CX3CL1
-         KCNC2      CX3CR1
-         KCNC3      CXCL1
-         KCNC4      CXCL2
-         KCND1      CXCL3
-         KCND2      CXCL4
-         KCND3      CXCL5
-         LAPTM4A    CXCL6
-         LAPTM5     CXCL7
-         MAFG       CXCL9
-         MATN1      CXCL9
-         MATN2      CXCL10
-         MATN3      CXCL11
-         MATN4      CXCL12
-         MIATNB     CXCL13
-         NANOG      CXCL14
-         NANOGNB    CXCL15
-         NANOGNBP1  CXCL16
-         NANOGNBP2  CXCL17
-         NANOGNBP3  CXCR3
-         NANOGP1    CXCR4
-         NANOGP10   CXCR5
-         NANOGP2    CXCR6
-         NANOGP3    CXCR7
-         NANOGP4    DUFFY
-         NANOGP5    IL8
-         NANOGP6    IL8RA
-         NANOGP7    IL8RB
-         NANOGP8    ITGA4
-         NANOGP9    ITGB1
-         NANOS1     TREM1
-         NANOS2     TREM2
-         NANOS3     TREM3
-         NOP56      XCL1
-         PANO1      XCR1
-         PAPOLG     GRIN1
-         PDSS1      GRIN2B
-         PDSS2      GRIN2C
-         PDYN       GRIN2D
-         POLG       GRIN3A
-         POLG2      GRIN3B
-         PPP2R2A    GRIA1
-         PPP2R2B    GRIA2
-         PPP2R2C    GRIA3
-         PPP2R2D    GRIA4
-         PPP2R2DP1  TARDBP
-         PRKCG      APP
-         RUBCN      SNCA
-         RUBCNL     SOD1
-         SACS       FUS
-         SETX       ACE
-         SFXN1      ENPEP
-         SFXN2      NOS1
-         SFXN3      NOS2
-         SFXN4      NOS3
-         SFXN5      PHACTR1
-         SIL1       COMT
-         SLC1A3     MAOA
-         SPTBN2     SLC1A2
-         STUB1      SLC17A7
-         SYNE1      SLC6A4
-         SYT14      MTDH
-         SYT14P1    AGT
-         TAPT1      EDN1
-         TBP        EDN3
-         TDP1       BDNF
-         TGM6       CALCA
-         TPP1       AGTR2
-         TTBK2      ADRA2A
-         TTPA       EDNRB
-         TTPAL      HTR1A
-         VAMP1      HTR2A
-         VLDLR      ADORA1
-         WWOX       EDNRA
-         ZNF592     PTGER2
-         OQ8A       HTR1B
-         -          ABCC1
-         -          ABCC8
-         -          DUSPS
-         -          TRPM2
-         -          TRPM8
-         -          CACN2B4
-         -          TRPM4
-         -          AQP4
-         -          MMP2
-         -          MMP3
-         -          MMP9
-         -          TSPO


Appendix 5

TRIM2 ATP10A


KCNAB1

SQSTM1


Appendix 6

ID    Recruited as  Age  Gender  Previous contact sports                            Clinician diagnosed  Still suffering?                      Multiple concussions
CN1   PCS           30   F       Yes (Hockey)                                       Yes                  No                                    Yes
CN2   PCS           33   M       Yes (Rugby)                                        Yes                  No                                    Yes
CN3   PCS           66   M       No                                                 Yes                  No                                    No
CN4   PCS           36   M       No                                                 Yes                  No                                    Yes
CN5   PCS           21   M       Yes (Rugby)                                        Yes                  No                                    Yes
CN6   PCS           61   M       Yes (AFL)                                          No                   No                                    Yes
CN7   PCS           36   M       Yes (Rugby)                                        Yes                  No                                    Yes
CN8   PCS           29   M       Yes (Rugby)                                        Yes                  No                                    Yes
CN9   PCS           23   F       No                                                 Yes                  No                                    N
CN10  PCS           56   F       No                                                 Yes                  Yes (Mental and cognitive fatigue)    No
CN11  PCS           31   F       No                                                 Yes                  Yes (Light and sound sensitivity)     Yes
CN15  PCS           35   M       No                                                 Yes                  Yes (memory retention, irritability)  Yes (3-5)
CN17  Head injury   27   F       No                                                 No                   N/A                                   N/A
CN18  PCS           25   M       Yes                                                Yes                  No                                    Yes
CN22  PCS           29   M       Yes                                                Yes                  No                                    Yes (6-10)
CN23  Head injury   93   M       No                                                 Unknown              N/A                                   N/A
CN24  PCS           31   F       No                                                 Yes                  Yes                                   Yes
CN25  PCS           53   M       No                                                 Yes                  Yes                                   No
CN26  PCS           67   F       No                                                 Yes                  Yes                                   Yes
CN27  PCS           42   F       Yes (Rugby/AFL)                                    Yes                  Yes                                   Yes
CN28  PCS           26   F       Yes                                                Yes                  Yes                                   Yes
CN29  PCS           26   M       Yes (Boxing, Jujitsu, weight lifting and running)  Yes                  Yes                                   Yes
CN30  PCS           64   M       No                                                 Yes                  Yes                                   Yes
CN31  PCS           33   F       No                                                 Yes                  Yes                                   Yes
CN32  PCS           52   M       Rugby                                              Yes                  No                                    Yes
CN33  PCS           23   F       Yes                                                Yes                  Yes                                   Yes
CN34  PCS           22   F       Yes                                                Yes                  Yes                                   No
CN36  Head injury   64   F       Yes (Netball)                                      No                   Yes                                   No
CN37  Head injury   20   F       No                                                 No                   No                                    No
CN38  PCS           20   F       Yes                                                Yes                  No                                    Yes (2)
CN39  PCS           51   F       No                                                 Yes                  No                                    Yes (2)
CN40  PCS           50   F       No                                                 No                   Unknown                               Yes
CN41  PCS           34   M       Rugby Union                                        Unknown              No                                    Yes
CN48  PCS           82   F       No                                                 No                   Yes                                   Yes
CN49  PCS           52   F       No                                                 Yes                  Yes                                   Yes
CN50  PCS           73   F       No                                                 Yes                  No                                    No
CN51  PCS           66   F       No                                                 Yes                  No                                    No
CN52  PCS           51   M       Yes                                                Yes                  No                                    Yes
CN53  PCS           50   M       Yes                                                No                   No                                    Yes
CN54  PCS           30   M       Yes                                                Yes                  Yes                                   Yes
CN55  -             41   F       No                                                 Yes                  Yes                                   Yes
CN56  -             20   F       No                                                 N/A                  Yes                                   No


Appendix 7

Mitochondrial haplogroups

Sample    Haplogroup
BB11      H
BB12      R
BB13      N
BB15      JT
BB17      N
BB18      H
BB19      H
BB21      H
BB27      H
BB3       JT
BB7       N
DG1091    H
DG1140    X
DG185     JT
DG186     H
DG223     H
DG237     U
DG241     R
DG254     H
DG433     H
DG478     H
DG494     U
DG518     H
DG522     H
DG584     H
DG707     U
DG738     H
DG750     U
DG906     JT
DG951     U
DG986     U
DG990     X
CN10      U
CN11      H
CN12      H
CN15      H
CN17      U
CN18      H
CN22      JT
CN24      U
CN25      H
CN26      U
CN27      H
CN28      JT
CN29      U
CN30      U
CN31      U
CN32      JT
CN33      U
CN34      H
CN35      U
CN36      X
CN37      N
CN38      R
CN39      H
CN3       U
CN40      U
CN41      JT
CN43      H
CN44      N
CN45      U
CN46      H
CN47      U
CN4       N
CN5       U
CN6       H
CN7       X
CN8       X
CN9       N

Appendix 8

FLOT2 sequence:

TTACCCGCCCCTACCAGCATCCTCCCCTGGAAGGCAACTTCCTGCCAGCTTCCTGGTCACTCAGCC

FLOT2 F: AAGGGTAGTAATAATAGGAAGTTGAAA

bio-FLOT2 R: AAATACAATTAAAAAAAACTTCCAAATCTT

FLOT2 seq: AGGAAGTTGAAATTGTTAGA

bio-RAB5B F: bio-GTATATATATAGAGGTAGGAAGAGGGTTT

RAB5B R: CCCTAACATCAACCCTTCCTCCTTTTAAT

RAB5B seq: CCTCCTTTTAATAAAACTTCA

RAB5B sequence:

AACATCTCAGACCGCCCGCGGCCACCCCAAATACGCCTGTCCAC


Appendix 9

Sample  site_1  site_2  site_3  site_4  Group
CN01    5       7       3       3       Concussion
CN02    5       10      20      5       Concussion
CN03    6       9       4       2       Concussion
CN04    6       4       3       3       Concussion
CN05    6       7       4       4       Concussion
CN06    7       5       3       3       Concussion
CN07    7       7       3       4       Concussion
CN08    8       7       4       4       Concussion
CN09    7       5       4       4       Concussion
CN10    7       5       3       3       Concussion
CN11    7       7       3       4       Concussion
CN12    6       6       3       3       Concussion
CN13    7       4       3       3       Concussion
CN14    7       7       3       3       Concussion
CN15    5       3       3       3       Concussion
CN16    6       3       2       3       Concussion
CN17    5       5       3       2       Concussion
CN18    6       5       3       3       Concussion
CN19    6       5       3       3       Concussion
CN20    4       4       2       3       Concussion
CN21    4       4       3       3       Concussion
CN22    3       3       3       3       Concussion
CN24    4       4       3       3       Concussion
CN25    13      9       4       6       Concussion
CN26    8       8       5       5       Concussion
CN27    11      7       5       5       Concussion
CN28    9       8       4       6       Concussion
CN29    6       5       4       4       Concussion
CN31    11      8       4       6       Concussion
CN32    11      8       5       6       Concussion
CN33    5       3       3       2       Concussion
CN34    3       3       2       1       Concussion
CN35    3       4       2       2       Concussion
CN36    4       3       3       2       Concussion
CN37    3       3       3       1       Concussion
CN38    3       2       2       1       Concussion
CN23    7       5       2       3       Concussion
CN30    NA      NA      NA      NA      NA
GOM01   9       7       6       4       Healthy
GOM02   5       6       3       3       Healthy
GOM03   2       4       3       1       Healthy
GOM04   5       5       4       4       Healthy
GOM05   7       6       4       4       Healthy
GOM06   5       4       3       3       Healthy
GOM07   6       6       3       4       Healthy
GOM08   7       6       3       5       Healthy
GOM09   8       7       4       4       Healthy
GOM10   6       5       5       4       Healthy
GOM11   4       4       2       2       Healthy
GOM12   6       4       3       2       Healthy
GOM13   7       8       4       4       Healthy
GOM14   4       4       3       3       Healthy
GOM15   3       4       2       2       Healthy
GOM16   8       10      5       5       Healthy
GOM17   7       9       5       5       Healthy
GOM18   6       6       4       4       Healthy
GOM19   9       9       4       5       Healthy
GOM20   11      10      8       9       Healthy
GOM21   7       7       3       5       Healthy
GOM22   7       6       3       12      Healthy
GOM24   6       6       3       4       Healthy
GOM25   5       5       3       3       Healthy
GOM26   16      9       7       9       Healthy
GOM27   7       5       4       4       Healthy
GOM28   6       5       2       2       Healthy
GOM29   3       2       2       2       Healthy
GOM31   3       3       3       2       Healthy
GOM32   2       4       2       2       Healthy
GOM33   3       3       3       2       Healthy
GOM34   17      15      11      10      Healthy
GOM35   2       2       2       1       Healthy
GOM36   2       2       2       2       Healthy
GOM37   4       4       2       2       Healthy
GOM38   3       3       3       2       Healthy
GOM23   NA      NA      NA      NA      NA
GOM30   3       3       3       2       Healthy
