Linköping Studies in Science and Technology Dissertation No. 2017

Advanced analysis of MRI data

Xuan Gu Linköping Studies in Science and Technology Dissertaons, No. 2017

Advanced analysis of diffusion MRI data

Xuan Gu

Linköping University Department of Biomedical Engineering Division of Biomedical Engineering SE-581 83 Linköping, Sweden

Linköping 2019 Edition 1:1

© Xuan Gu, 2019 ISBN 978-91-7519-003-7 ISSN 0345-7524 URL http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-161288

Published articles have been reprinted with permission from the respective copyright holder. Typeset using XƎTEX

Printed by LiU-Tryck, Linköping 2019

ii POPULÄRVETENSKAPLIG SAMMANFATTNING

Diffusions-viktad magnetresonansavbildning (dMRI) är en icke-invasiv metod för att stude- ra hur vattenmolekyler rör sig på grund av diffusion. För att uppnå detta används speciella sekvenser i en magnetkamera, som sätter fart på molekylerna med hjälp av starka magnet- fält. dMRI kan användas som ett verktyg för att studera hjärnkonnektivitet, det vill säga hur olika delar av hjärnan är ihopkopplade via nervfibrer.

Denna avhandling presenterar nya metoder för att analysera data från dMRI experiment. En första utmaning är att dMRI data innehåller flera typer av artefakter. Geometriska distortioner skapas via skillnader i hur olika vävnader reagerar på starka magnetfält. Det är även vanligt att försökspersonen rör på huvudet under experimentet.

Efter att data har korrigerats för distortioner skattas en -modell för varje område i hjärnan. Denna modell säger hur stark diffusionen är och om den är starkare i en riktning jämfört med andra riktningar, vilket kan representeras med en . Från tensorn kan man sen räkna ut olika skalära mått, som medeldiffusionen i alla riktningar och hur riktad diffusionen är. Dessa skalära mått kan användas i gruppanalyser med friska kontrollpersoner och personer med en viss sjukdom, för att bättre förstå olika hjärnsjukdomar. Det är även viktigt att veta hur bra en diffusions-modell passar till data, vilket kan kvantifieras som osäkerhet. Denna avhandling presenterar flera sätt att skatta osäkerhet.

iii ABSTRACT

Diffusion magnetic resonance imaging (diffusion MRI) is a non-invasive imaging modality which can measure diffusion of water molecules, by making the MRI acquisition sensitive to diffusion. Diffusion MRI provides unique possibilities to study structural connectivity of the human brain, e.g. how the connects different parts of the brain. Diffusion MRI enables a range of tools that permit qualitative and quantitative assessments of many neurological disorders, such as and Parkinson.

This thesis introduces novel methods for diffusion MRI data analysis. Prior to estimating a diffusion model in each location () of the brain, the diffusion data needs tobepre- processed to correct for geometric distortions and head motion. A deep learning approach to synthesize diffusion scalar maps from a T1-weighted MR image is proposed, and it is shown that the distortion-free synthesized images can be used for distortion correction. An evaluation, involving both simulated data and real data, of six methods for susceptibility distortion correction is also presented in this thesis.

A common problem in diffusion MRI is to estimate the uncertainty of a diffusion model. An empirical evaluation of , a technique that permits reconstruction of white matter pathways in the human brain, is presented in this thesis. The evaluation is based on analyzing 32 diffusion datasets from a single healthy subject, to study how reliable tractography is. In most cases only a single dataset is available for each subject. This thesis presents methods based on frequentistic (bootstrap) as well as Bayesian inference, which can provide uncertainty estimates when only a single dataset is available. These uncertainty measures can then, for example, be used in a group analysis to downweight subjects with a higher uncertainty.

iv Acknowledgments

Undertaking this PhD has been a truly life-changing experience for me and it would not have been possible to do without the support and guidance that I received from many people. I would like to first say a very big thank you to my supervisor DrAnders Eklund for all the support and encouragement he gave me. Without his guidance and constant feedback this PhD would not have been achievable. My sincere thanks also go to Professor Hans Knutsson who has provided me an opportunity to start the PhD. Many thanks also to Dr Evren Özarslan who has provided insightful discussions and suggestions about the research. My deep appreciation goes out to the present members of the group: Cem Yolcu, David Abramian, Deneb Boito and Magnus Herberthson. I also thank people who were part of the group, including Jens Sjölund, Mats Andersson, Olivier Cros and Snehlata Shakya. I am honored to work with all of you. I also thank the wonderful staff in the Department of Biomedical Engineering for always being so helpful and friendly. I gratefully acknowledge the funding received towards my PhD from the Swedish research council, the ITEA/VINNOVA funded project BENEFIT, and the Seeing Organ Function (SOF) project supported by the Knut and Alice Wallenberg Foundation. I especially thank my parents for their unconditional trust, timely encour- agement, and endless patience. Finally, I would like to thank the unsung hero Harley who works 24/7 in the server room to process my endless analyses.

v Contents

Abstract iii

Acknowledgments v

Contents vi

List of Figures ix

List of Tables xii

1 Introduction 1 1.1 Introduction ...... 1 1.2 Research questions ...... 1 1.3 Outline ...... 2 1.4 Included publications ...... 2 1.5 Abbreviations ...... 3 1.6 Reproducibility and ethics ...... 4

2 Magnetic resonance imaging 5 2.1 History of MRI ...... 5 2.2 MR physics ...... 6 2.3 MR signals ...... 8 2.4 MR spatial encoding ...... 10 2.5 k-space ...... 11

3 Diffusion MRI 13 3.1 History of Diffusion MRI ...... 13 3.2 Diffusion basics ...... 14 3.3 Diffusion MRI techniques ...... 15

4 Preprocessing of diffusion MRI data 27 4.1 Introduction ...... 27 4.2 Susceptibility-induced distortion correction ...... 28 4.3 Eddy-current-induced distortion correction ...... 35

vi 4.4 Head motion correction ...... 36 4.5 Discussion ...... 37

5 Tractography 39 5.1 Introduction ...... 39 5.2 Deterministic tractography ...... 41 5.3 Probabilistic tractography ...... 43 5.4 Stopping and rejection criteria ...... 48 5.5 Limitations ...... 51

6 Uncertainty in diffusion models 53 6.1 Introduction ...... 53 6.2 Bootstrap ...... 54 6.3 Bayesian methods ...... 59 6.4 Reducing uncertainty using a spatial model ...... 63

7 Deep learning in diffusion MRI 67 7.1 Introduction ...... 67 7.2 Architectures ...... 68 7.3 Applications ...... 71

8 Summary of papers 77 8.1 Introduction ...... 77 8.2 Paper I - Repeated tractography of a single subject: how high is the variance? ...... 77 8.3 Paper II - Bayesian diffusion tensor estimation with spatial priors 78 8.4 Paper III - Multi-fiber estimation and tractography for diffusion MRI using mixture of non-central Wishart distributions . . . . 78 8.5 Paper IV - Using the wild bootstrap to quantify uncertainty in mean apparent propagator MRI ...... 78 8.6 Paper V - Generating diffusion MRI scalar maps ...... 79 8.7 Paper VI - Evaluation of six phase encoding based susceptibility distortion correction methods for diffusion MRI ...... 79

Bibliography 81

Paper I 97

Paper II 123

Paper III 137

Paper IV 145

Paper V 161

vii Paper VI 173

viii List of Figures

2.1 Precession of protons’ magnetic moment (the dotted line) in an external B0 (the solid line)...... 7 2.2 The net magnetization M (in a rotating frame of reference) with a flip angle α in an external magnetic field B0 and a RF pulse magnetic field B1...... 7 ∗ 2.3 T1, T2 and T2 relaxation...... 8 2.4 The free induction decay (FID) signal is oscillating with a constant frequency. Immediately after the pulse, the transverse component of the precessing magnetization will decay and induce an alternat- ing current in the coil...... 9 2.5 Examples of T1- and T2-weighted images from HCP...... 10 2.6 An example of an image in k-space and its reconstructed image. . 12

3.1 Schematic diagram of the PGSE sequence. Diffusion gradients are applied on both sides of the 180○ pulse. The signal is acquired immediately following the second diffusion gradient...... 14 3.2 Examples of diffusion weighted images for different b-values. The images are scaled independently. A higher b-value results in a lower signal to noise ratio. Data is from MGH adult diffusion (Setsompop et al., 2013)...... 15 3.3 Diffusion tensor (from left to right): sphere (λ1 = λ2 = λ3), planar (λ1 = λ2 > λ3) and cigar (λ1 > λ2 = λ3)...... 17 3.4 DTI scalars (from left to right): FA, RGB coded FA, MD, AD, RD. Data is from MGH adult diffusion (Setsompop et al., 2013). . . . . 18 3.5 Probability density functions with different kurtosis...... 19 3.6 In the ball-and-stick model, the free water diffusion is represented by a ball and the fibers are represented by sticks...... 21 3.7 An example of a NODDI scalar map: isotropic (CSF) volume frac- tion. Data is from the NODDI Matlab toolbox tutorial...... 21 3.8 The commonly employed acquisition scheme for DSI features Cartesian sampling where only those points that remain within a sphere in q-space are retained...... 22

ix 3.9 In FRT the integral of the signal over each equator is assigned to the point farthest from that equator...... 23 3.10 MAP-MRI scalars (from left to right): RTOP, NG and PA for a healthy control. Data is from MGH adult diffusion (Setsompop et al., 2013)...... 26

4.1 A diffusion weighted image damaged by slice misalignment dueto within volume motion. Left: Sagittal view, middle: coronal view, right: axial view. Data is from the developing HCP project (Hughes et al., 2017)...... 28 4.2 A diffusion weighted image damaged by severe signal dropouts across multiple slices. Left: Sagittal view, middle: coronal view, right: axial view. Data is from the developing HCP project (Hughes et al., 2017)...... 28 4.3 LR: b0 with LR distortion. RL: b0 with RL distortion. AP: b0 with AP distortion. PA: b0 with PA distortion. These images were gener- ated by running real Human Project images through the POSSUM simulator...... 29 4.4 A fieldmap estimated using FSL’s function topup. Each pixel de- scribes the off-resonance field value in the unit ofHz...... 30 4.5 Diffusion weighted images with eddy-current distortions. ThePE directions are LR (first row) and RL (second row). Left: Sagittal view, middle: coronal view, right: axial view. Data is from WU- Minn HCP (Van Essen et al., 2013)...... 35

5.1 An example of deterministic tractography using the Stanford HARDI dataset. The seed mask is two sagittal slices of the . Tracking from this seed mask gives us a model of the corpus callosum tract...... 41 5.2 Tractography starting at a seed point and following the local major direction step-by-step...... 42 5.3 A streamline based tractography algorithm...... 43 5.4 For the repetition bootstrap, a large number of datasets repetitions would be required. The parameter of interest is calculated from each dataset. The N samples can be used to construct a good representation of the uncertainty function...... 45 5.5 A graphical illustration of the bootstrap samples obtained by sam- pling with replacement. By repeating the process NB times, one can obtain NB bootstrap samples. These samples can then be used to obtain the probability distributions of a parameter of interest. . 46 5.6 An example for deterministic tractography using the Stanford HARDI dataset. The seed mask is two sagittal slices of the corpus callosum. The stopping criterion is an FA threshold of 0.2 (top), 0.4 (middle) and 0.6 (bottom)...... 49

x 5.7 An example for deterministic tractography using the Stanford HARDI dataset. The seed mask is two sagittal slices of the corpus callosum. Left: no termination mask used, right: the termination mask is a binary white matter mask...... 50 5.8 An example of probabilistic tractography using stopping and re- jection criteria to increase the tract accuracy. The seed mask is located at the lateral geniculate nucleus of the thalamus. The lat- eral primary visual cortex is used as the inclusion mask and the termination mask. The curvature threshold is 0.2. Top: no stopping and rejection criteria used, bottom: both the inclusion mask and the termination mask are used. Data is from the FSL tractography tutorial...... 51

6.1 Standard deviation of RTOP, NG and PA for subjects MGH-1003, 1005, 1007, 1010, slice 45, using 500 bootstrap samples. CSF were removed prior to data analysis...... 60 6.2 CV of FA maps calculated from the MCMC samples. First row: MCMC with 3D spatial priors, second row: MCMC without 3D spatial priors, third row: ratios of CV of FA maps calculated from MCMC without and with 3D spatial priors. Columns from left to right: b = 0/1000, 0/3000, 0/5000, 0/10, 000 s/mm2 and the entire dataset...... 66

7.1 A CNN architecture where feature extraction by filters is learned through back propagation. Pooling operations, such as max or mean in a region, are commonly used to reduce the number of pixels in each layer of the network...... 69 7.2 A noise-to-image GAN contains a generator and a discriminator, and can be trained to generate realistic synthetic images...... 70 7.3 The CycleGAN model contains two generators GA2B ∶ A → B and GB2A ∶ B → A, and two discriminators DA and DB for image-to- image translation. A cycle loss is introduced to capture the differ- ence between the original data and the reconstructed data...... 71 7.4 T1-to-FA/MD image translation results for 4 subjects. First row: True T1-weighted images, second row: true FA/MD images, third row: synthetic FA/MD images, fourth row: Difference between real and synthetic FA/MD...... 75

xi List of Tables

4.1 A list of susceptibility-induced distortion correction tools...... 34 4.2 A list of eddy-current-induced distortion correction tools...... 34 4.3 A list of head motion correction tools...... 34

5.1 A list of diffusion tractography tools...... 40

xii 1 Introduction

All things are difficult before they are easy. — Thomas Fuller 1.1 Introduction

Diffusion magnetic resonance imaging was introduced in the mid-1980s and has become one of the most important modern neuroimaging techniques. Dif- fusion MRI reveals the microarchitecture in the brain by measuring the ran- dom motion of water molecules in brain tissues. Diffusion MRI has many clinical applications, such as the diagnosis of acute brain stroke. It can also produce astonishing maps of white matter tracts in the brain, with the po- tential to aid the understanding of psychiatric disorders and improve radiation therapy planning. In this thesis, the focus is on the analysis of diffusion MRI data, including: pre-processing, tractography, uncertainty estimation as well as taking advantage of recent developments in deep learn- ing.

1.2 Research questions

1. How can the uncertainty of different diffusion MRI models be quantified? 2. How can we correct for distortions in diffusion MRI data and which software gives the best performance? 3. Can developments in deep learning be used in diffusion MRI, to further improve data analysis?

1 1. Introduction

1.3 Outline

This thesis consists of 8 chapters. The first chapter is the present introduction. The second chapter is about the basics of magnetic resonance imaging and is followed by a chapter about diffusion MRI. Preprocessing of diffusion MRI data is covered in chapter 4 and chapter 5 focuses on tractography. Chapter 6 covers uncertainty of diffusion models, and chapter 7 contains an introduction to deep learning and its use in diffusion MRI. Chapter 8 provides a brief summary of the included papers.

1.4 Included publications

I Gu, X., Eklund, A. and Knutsson, H., 2017. Repeated tractography of a single subject: how high is the variance?. In Modeling, Analysis, and Visualization of (pp. 331-354). Springer, Cham.

II Gu, X., Sidén, P., Wegmann, B., Eklund, A., Villani, M. and Knutsson, H., 2017. Bayesian diffusion tensor estimation with spatial priors. In International Conference on Computer Analysis of Images and Patterns (pp. 372-383). Springer, Cham.

III Shakya, S., Gu, X., Batool, N., Özarslan, E. and Knutsson, H., 2017. Multi-fiber estimation and tractography for diffusion MRI using mixtureof non-central Wishart distributions. In Eurographics Workshop on Visual Computing for Biology and Medicine, Bremen, Germany (pp. 1-5). The Eurographics Association.

IV Gu, X., Eklund, A., Özarslan, E. and Knutsson, H., 2019. Using the wild bootstrap to quantify uncertainty in mean apparent propagator MRI. Frontiers in Neuroinformatics, 13, p.43 (13 pages).

V Gu, X., Knutsson, H., Nilsson, M. and Eklund, A., 2019. Gener- ating diffusion MRI scalar maps from T1 weighted images using generative adversarial networks. In Scandinavian Conference on Image Analysis (pp. 489-498). Springer, Cham.

VI Gu, X. and Eklund, A., 2019. Evaluation of six phase encoding based susceptibility distortion correction methods for diffusion MRI. Manuscript submitted for publication. 20 pages.

2 1.5. Abbreviations

1.5 Abbreviations

This table lists the abbreviations that are used in this thesis.

AD Axial Diffusivity ADC Apparent Diffusion Coefficient ANN Artificial Neural Network AP Anterior Posterior CSF CNN Convolutional Neural Network CV Coefficient of Variation DR-BUDDI Diffeomorphic Registration for Blip-Up Blip-Down Diffusion Imaging DNN Deep Neural Network DKI Diffusion Kurtosis Imaging DSC Dice similarity coefficient DSI Diffusion Spectrum Imaging DTI Diffusion Tensor Imaging EC Extracellular Compartment EPI Echo-Planar Imaging FA Fractional Anisotropy FB Fieldmap Based FID Free Induction Decay FRT Funk-Radon transform GAN Generative Adversarial Network GMRF Gaussian Markov Random Field GRE Gradient Echo HARDI High Angular Resolution Diffusion Imaging HCP HySCO Hyperelastic Susceptibility Correction IC Intracellular Compartment IQT Image Quality Transfer LR Left Right LSR Least Squares Regression LSR-MS Multi Step Least Squares Regression IVIM Intravoxel Incoherent Motion MAP Maximum A Posteriori MAP-MRI mean apparent propagator MRI MCMC Markov chain Monte Carlo MD Mean Diffusivity MRA Magnetic Resonance MRF Markov Random Field MRI Magnetic Resonance Imaging

3 1. Introduction

NMR Nuclear Magnetic Resonance NG Non-Gaussianity NODDI Neurite Orientation Dispersion and Density Imaging ODF Orientation Distribution Function OLS Ordinary Least Squares PA Posterior Anterior PA Propagator Anisotropy PB Phase Encoding Based PCG Preconditioned Conjugate Gradient PD Proton-Density PE Phase Encoding PGSE Pulsed Gradient Spin Echo QBI Q-Ball Imaging RB Registration Based RD Radial Diffusivity ReLU Rectified Linear Unit RL Right Left RTOP Return-to-Origin Probability RTPP Return-to-Plane Probability RTAP Return-to-Axis Probability RF Radio-Frequency SE Spin Echo SH Spherical harmonics SPM Statistical Parametric Mapping TR Repetition Time UNIT Unsupervised Image-to-Image Translation UGL Unweighted Graph-Laplacian WLS Weighted Least Squares

1.6 Reproducibility and ethics

All included papers are based on openly available data, and the used process- ing scripts are also shared on GitHub 1. Together, this facilitates reproduc- tion and extension of our results (Poldrack and Gorgolewski, 2014; Eklund, Nichols, and Knutsson, 2017; Poldrack, Baker, Durnez, et al., 2017). The ethics board decided that no additional ethics approvals are required to ana- lyze the open datasets.

1https://github.com/xuagu37

4 2 Magnetic resonance imaging

It was eerie. I saw myself in that machine. — Isidor Isaac Rabi

2.1 History of MRI

Rabi, Zacharias, Millman, et al. (1938) discovered nuclear magnetic resonance (NMR) in molecular beams. For this work, Rabi was awarded the Nobel Prize in Physics in 1944. Purcell, Torrey, and Pound (1946) and Bloch (1946) extended this technique to liquids and solids, for which they were jointly awarded the Nobel Prize in Physics in 1952. Hahn (1950) discovered the spin echo. Carr (1953) described the first techniques for using gradients in magnetic fields and produced a one-dimensional image in his PhD thesis. Damadian (1971) reported that tumors can be differentiated from healthy tissues because of their different relaxation times in vivo by NMR. Based on Carr’s technique, Lauterbur (1973) developed a way to generate the first 2D and 3D magnetic resonance imaging (MRI) images. In the same year, Lauterbur produced the first image of a live subject, a clam. Mansfield and Grannell (1975) expanded Lauterbur’s work and developed the MRI sequence called echo-planar imaging (EPI) which greatly sped up the imaging process. Together with his collaborator Andrew Maudsley, Mansfield produced the first image of a human body part, a finger. Lauterbur and Mansfield shared the 2003 Nobel Prize in Physiology or Medicine for their work in MRI. The world’s first 1.5 T scanner was built in GE Research Center, New York (Bottomley, Hart, Edelstein, et al., 1982). Nowadays MRI is most commonly performed in the clinic at 1.5 T and 3 T, higher fields such as 7 T are gaining more

5 2. Magnetic resonance imaging popularity because of their increased sensitivity and resolution (Nowogrodzki, 2018). In research laboratories, human studies have been performed at up to 9.4 T (Vaughan, DelaBarre, Snyder, et al., 2006) and animal studies have been performed at up to 21.1 T (Qian, Masad, Rosenberg, et al., 2012).

2.2 MR physics

Spin, precession The most widely utilized nuclei in MRI are the protons in the hydrogen atom. Protons possess a fundamental property of nature known as spin angular momentum (or spin for short). The protons have their own magnetic fields and will tend to align themselves to the external magnetic field B0, which results in a type of movement called precession, as shown in Figure 2.1. The precession occurs when the magnetic moments are not aligned with the external magnetic field B0. The precession frequency is given by the Larmor equation

ω0 = γB0, (2.1) where ω0 is the Larmor precession frequency, γ is the gyromagnetic ratio, and B0 is the strength of the external magnetic field. The sum of many spins’ magnetic moments, the net magnetization, M0 is aligned with the external magnetic field B0 in thermal equilibrium (Liang and Lauterbur, 2000) and can be calculated according to

2̵ 2 γ h B0Ns M0 = , (2.2) 4KTs ̵ where h is the Planck’ constant h divided by 2π, Ns is the total number of spins, K is the Boltzmann constant and Ts is the absolute temperature.

RF pulse, resonance, flip angle

A radio-frequency (RF) pulse is applied with its magnetic field B1 perpendic- ular to the main magnetic field B0. The Larmor equation gives the necessary frequency of the RF pulse. The protons will come into phase with the RF pulse, which results in a decreased longitudinal magnetization Mz and a newly established transversal magnetization Mxy. The phenomenon that protons pick up energy from the RF pulse is called resonance. The radiation energy that can be absorbed by protons must be equal to the energy difference ∆E between the adjacent spin states. That is, ̵ ∆E = γhB0, (2.3) which is known as the resonance condition. The protons will afterwards emit the energy while returning to a more stable state. The net magnetization M

6 2.2. MR physics

Figure 2.1: Precession of protons’ magnetic moment (the dotted line) in an external magnetic field B0 (the solid line). is flipped down with an angle α, as shown in Figure 2.2, according to

τp α = ∫ γB1(t)dt, (2.4) 0 where τp is the length of the RF pulse.

Figure 2.2: The net magnetization M (in a rotating frame of reference) with a flip angle α in an external magnetic field B0 and a RF pulse magnetic field B1.

7 2. Magnetic resonance imaging

T1 and T2 Relaxation After the RF pulse is switched off, the protons go back to their original state. This process is called relaxation, see Figure 2.3. The longitudinal magnetiza- tion Mz goes back to its original size M0 and the newly established transversal magnetization Mxy starts to disappear, these effects are called longitudinal relaxation (T1 relaxation) and transversal relaxation (T2 relaxation), respec- tively. T1 and T2 relaxations can be modelled as (Bloch, 1946)

−t/T1 Mz(t) = M0 (1 − e ) , (2.5)

−t/T2 Mxy(t) = Mxy(0)e . (2.6)

The time that it takes for the longitudinal magnetization Mz to reach 63% of its maximum value M0 is called T1 relaxation time. T2 represents the time required for the transversal magnetization Mxy to return to 1/e (about 37%) of its initial maximum value. In practice, the spins dephase faster than the intrinsic T2 mechanism as a result of magnetic field inhomogeneities. This ∗ type of decay is called T2 relaxation.

∗ Figure 2.3: T1, T2 and T2 relaxation.

2.3 MR signals

Free induction decay After the RF pulse is switched off, the transverse component of the net mag- netization M generates a current in the receiver coil. The received signal alternates with the Larmor frequency. This type of signal is called free induc- tion decay (FID) signal and can be described mathematically as (Hashemi, Bradley, and Lisanti, 2012)

∗ −t/T Mxy(t) = M0e 2 cos(ω0t). (2.7)

8 2.3. MR signals

Figure 2.4: The free induction decay (FID) signal is oscillating with a constant frequency. Immediately after the pulse, the transverse component of the precessing magnetization will decay and induce an alternating current in the coil.

Spin echo A 90○ RF pulse and a 180○ RF pulse together produce a spin echo (SE). This type of MR sequence is called the spin echo sequence (Hahn, 1950). The 90o RF pulse first tips the net magnetization M into the transverse plane. After ○ ∗ ○ the 90 RF pulse the protons dephase due to the T2 relaxation. The 180 RF pulse rephases the dephasing protons and the result is a spin echo. The time between the middle of the first RF pulse and the peak of the spin echo is called the echo time (TE). The sequence then repeats at the repetition time (TR). During the 1960s and early 1970s, pulse sequences based on SEs largely dominated the MR literature (Elster, 1993). The SE sequences can be adjusted to give T1-weighted, T2-weighted and proton density images.

Gradient echo A single RF pulse in conjunction with a gradient reversal produce a gradient echo (GRE). This type of MR sequence is called the gradient echo sequence. Interest in GRE signal formation largely floundered until 1976, when Mans- field and Maudsley (1977) proposed their revolutionary “fast scan imaging”, which used gradient reversals to generate echoes. The combination of short TR and short TE values allows for very rapid signal acquisition. For this reason GRE sequences form the basis for most rapid imaging.

Image contrast The TR and the TE can be used to control the image contrast, i.e. the weighting of the MR image. The spin echo signal is given by

− / − / s ∝ (1 − e TR T1 )e TE T2 . (2.8)

9 2. Magnetic resonance imaging

The values of T1, T2 are specific to a tissue or pathology. Different combina- tions of TR and TE values give different contrast as T1-weighted, T2-weighted, or PD (proton-density) weighted:

• Short TR + short TE = T1-weighted

• Long TR + long TE = T2-weighted • Long TR + short TE = PD-weighted

Figure 2.5 shows some examples of T1- and T2-weighted images, data are from the Human Connectome Project (HCP)1 (Van Essen, Smith, Barch, et al., 2013; Glasser, Sotiropoulos, Wilson, et al., 2013).

Figure 2.5: Examples of T1- and T2-weighted images from HCP.

2.4 MR spatial encoding

Slice selection When a subject is put into an MR scanner, all the protons in the whole body spin with the same Larmor frequency and will be excited by the RF pulses at the same time. To excite a specific slice only, a slice selecting gradient Gss is superimposed on the external magnetic field B0. The precession frequency of the excited slice is given by the Larmor equation according to

ω = γ(B0 + Gssz), (2.9)

1Data collection and sharing for this project was provided by the Human Connectome Project (U01-MH93765) (HCP; Principal Investigators: Bruce Rosen, M.D., Ph.D., Arthur W. Toga, Ph.D., Van J.Weeden, MD). HCP funding was provided by the National Institute of Dental and Craniofacial Research (NIDCR), the National Institute of Mental Health (NIMH), and the National Institute of Neurological Disorders and Stroke (NINDS). HCP data are disseminated by the Laboratory of Neuro Imaging at the University of Southern California.

10 2.5. k-space

where z is the slice position along the slice selecting gradient Gss. When an RF pulse of frequency ωRF is applied, setting

ωRF = ω (2.10) yields the slice position. In practice, the RF pulse has a range of frequencies. The slice thickness is determined by the RF pulse bandwidth and the strength of the slice selecting gradient Gss; a stronger gradient and a smaller RF pulse bandwidth produce thinner slices, and vice versa.

Frequency encoding

Within the selected slice, a frequency encoding gradient Gf along the x axis is applied superimposed on the external magnetic field B0 at the time of signal detection. The effective magnetic field B(x) along Gf is given by

B(x) = B0 + Gf x. (2.11)

Now the protons along the frequency encoding gradient Gf will experience different strengths of magnetic fields, which lead to different spin frequencies

ω(x) = γB(x) = γ(B0 + Gf x). (2.12)

Phase encoding

A phase encoding gradient Gp, perpendicular to the frequency encoding gra- dient Gf , is applied for a short time along the y axis. The protons along the y axis thus precess at different frequencies during the switch-on of Gp and when Gp is switched off, the protons will have the same precession frequency but different phases. The phase shift at position y is given by

φ(y) = γGpytp, (2.13) for a rectangular gradient Gp, where tp is the time that Gp is applied.

2.5 k-space

The signal measured in MRI represents the Fourier transform of the spin vector. K-space represents spatial frequencies in the MR image, that is, k- space is the Fourier transform of the MR image. Only after the k-space is filled and the inverse Fourier transform has been applied to it, an MR image can be visualized, see Figure 2.6. With the slice selection employed, the relation between the MR signal s(kx, ky) and the MR image I(kx, ky) can be described as (Liang and Lauterbur, 2000)

−i2π(kxx+ky y) s(kx, ky) = ∫ ∫ I(x, y)e dxdy, (2.14)

11 2. Magnetic resonance imaging

where the three time-dependent components kx and ky are related to the respective gradients

t γ ′ ′ kx(t) = ∫ Gx(t )dt , (2.15) 2π 0 t γ ′ ′ ky(t) = ∫ Gy(t )dt . (2.16) 2π 0 If a Cartesian sampling pattern is used, the MR image can easily be recon- structed by an inverse Fourier transform

I(x, y) = IFT (s(x, y)) . (2.17)

The MR image intensity I(x, y) is a function of relaxation times (T1 and T2), spin density ρ, diffusion and other parameters. The design of the pulse sequence allows one contrast mechanism to be emphasized over the others, which results in different MR image modalities such as T1-weighted images, T2-weighted images, PD-weighted images, as well as diffusion weighted images.

Figure 2.6: An example of an image in k-space and its reconstructed image.

12 3 Diffusion MRI

Out of clutter, find simplicity. — Albert Einstein

3.1 History of Diffusion MRI

Hahn (1950) noticed that the self-diffusion of liquid molecules leads to anat- tenuation of observed signals in addition to the decay due to T1 and T2 relax- ation. Hahn also pointed out that the self-diffusion in liquids can be a means of measuring the self-diffusion coefficient. To eliminate the diffusion effect, Carr and Purcell (1954) extended Hahn’s analysis and proposed a new scheme for measuring T2 using multiple echo pulses. To account for the diffusion ef- fect, Henry Torrey generalized the Bloch Equation (Bloch, 1946) by modifying the description of transverse magnetization to include diffusion terms (Torrey, 1956). The first quantification of diffusion changes was pioneered by Stejskal and Tanner (1965), in which they made the image contrast dependent on diffu- sion in the basic spin echo sequence. Based on the pioneering work of Stejskal and Tanner, Le Bihan and Breton (1985) tried to differentiate liver tumors from angiomas using diffusion encoding gradients. Due to the limitation of the hardware and the slow imaging method, the first diffusion images on liver were greatly corrupted by huge motion artifacts. Mansfield (1977) described the general principles of echo-planar imaging (EPI), which made it possible to collect MR images within seconds. The EPI based diffusion sequences solved the problem of serious motion artifacts because of long acquisition time and collection of “clean” MR images could become a reality.

13 3. Diffusion MRI

3.2 Diffusion basics

Free water diffusion The Einstein equation for diffusion describes the relationship between the mean squared displacement ⟨∣r∣2⟩ and the diffusion coefficient D. The Einstein equation for diffusion in three dimensions (Einstein, 1906) states that

⟨∣r∣2⟩ = 6Dt, (3.1) or in one dimension

⟨x2⟩ = 2Dt, (3.2) where ⟨⋅⟩ gives the expected value, r or x is the displacement of water molecules, D is the diffusion coefficient, t is the diffusion time. For free water diffusion, the water molecules displacement is described by a Gaussian probability density function

2 1 − ∣r∣ p(r, t) = √ e 4Dt . (3.3) (4πDt)3

The diffusion coefficient D of pure water in body temperature is roughly 3 × 10−3 mm2/s.

Stejskal and Tanner sequence The pulsed gradient spin echo (PGSE) (Stejskal and Tanner, 1965) sequence, see Figure 3.1, laid the foundation of all modern diffusion MR sequences. Ste-

Figure 3.1: Schematic diagram of the PGSE sequence. Diffusion gradients are ap- plied on both sides of the 180○ pulse. The signal is acquired immediately following the second diffusion gradient. jskal and Tanner solved the Bloch-Torrey equation and derived the diffusion

14 3.3. Diffusion MRI techniques echo attenuation as

S −γ2δ2G2(∆− δ )D = e 3 , (3.4) S0 where S is the signal strength of the PGSE sequence with diffusion gradients, S0 is the signal strength of the PGSE sequence without diffusion gradients, γ is the gyromagnetic ratio, δ and ∆ are the duration and separation of the two diffusion gradients, G is the diffusion gradient amplitude, D is the diffusion coefficient of the water molecules. The diffusion-weighting factor, i.e.the b-value (from Denis Le Bihan), is defined in units of s/mm2 as δ b = γ2δ2G2 (∆ − ) . (3.5) 3 In practice, the gradient amplitude G, the gradient duration δ and interval ∆ are changed to control the b-value. The PGSE sequence permits us to measure the diffusion coefficient D which is typically calculated from the plot of ln S versus b. A higher b-value makes the water molecules more sensitive S0 to the molecular displacement, which results in more signal attenuation, see Figure 3.2.

Figure 3.2: Examples of diffusion weighted images for different b-values. The images are scaled independently. A higher b-value results in a lower signal to noise ratio. Data is from MGH adult diffusion (Setsompop, Kimmlingen, Eberlein, et al., 2013).

3.3 Diffusion MRI techniques

Diffusion tensor imaging (DTI) assumes that water diffusion follows a3D Gaussian distribution, and the estimated tensor is exactly the covariance ma- trix of the Gaussian (Basser, Mattiello, and LeBihan, 1994a; Basser, Mat- tiello, and LeBihan, 1994b). The ball-and-stick model is a multi-compartment model, where the ball is the isotropic Gaussian, and the sticks are purely anisotropic Gaussian (Behrens, Woolrich, Jenkinson, et al., 2003). The strength of these methods is that they only require a few samples to estimate the whole distribution. The weakness is that these methods are based on some assumptions, and it is common that the diffusion in practice does not follow

15 3. Diffusion MRI the assumptions. The methods diffusion spectrum imaging (DSI) (Wedeen, Hagmann, Tseng, et al., 2005) and q-ball imaging (QBI) (Tuch, 2004) use Fourier transformation or spherical harmonics to calculate the orientation dis- tribution function (ODF). The advantage of these methods is that they are not limited by the Gaussian distribution assumption and it does not require data fitting which usually involves some complicated optimization problem. The weakness of these methods is that they often need more diffusion data.

Diffusion tensor imaging Diffusion tensor imaging (Basser, Mattiello, and LeBihan, 1994a; Basser, Mat- tiello, and LeBihan, 1994b) was proposed to account for the anisotropy of diffusion in tissues. It is able to characterize the principal diffusion direc- tion of fibers in human brains. In homogenous materials, the diffusion process can be characterized by a scalar diffusion coefficient D. However, in biological tissues such as white matter of the brain, the diffusion is anisotropic with direction dependent diffusion. Diffusion in anisotropic media is better described by the diffusion tensor, a 3 × 3 real symmetric representing diffusion along different directions ⎡ ⎤ ⎢D D D ⎥ ⎢ xx xy xz⎥ = ⎢ ⎥ D ⎢Dxy Dyy Dyz⎥ (3.6) ⎢ ⎥ ⎣Dxz Dyz Dzz ⎦ and Equation 3.4 can be re-written as

S T = e−bg Dg, (3.7) S0 where g is a 3 × 1 unit vector of the gradient direction. It is necessary to measure the normalized diffusion attenuation S along at least 6 independent S0 directions to estimate the six unknown tensor elements. A real symmetric matrix D can be decomposed as (Bosch, 1986) D = QΛQT , (3.8) where Q = [e , e , e ] is an orthogonal matrix whose columns are the eigen- 1 2 3 ⎡ ⎤ ⎢λ 0 0 ⎥ ⎢ 1 ⎥ = ⎢ ⎥ vectors of D, and Λ ⎢ 0 λ2 0 ⎥ is a diagonal matrix whose entries are the ⎢ ⎥ ⎣ 0 0 λ3⎦ eigenvalues of D. The diffusion tensor can be defined with its shape described by the three eigenvalues and its orientation described by the three eigenvectors, see Figure 3.3. In a diffusion experiment, the diffusion-weighted signal Si of the ith measurement for one voxel, ignoring the effect of imaging gradients, is given by

Si −bgT Dg = e i i , for i = 1, 2, ⋯,M, (3.9) S0

16 3.3. Diffusion MRI techniques

Figure 3.3: Diffusion tensor ellipsoids (from left to right): sphere (λ1 = λ2 = λ3), planar (λ1 = λ2 > λ3) and cigar (λ1 > λ2 = λ3). where M is the total number of diffusion measurements. Taking the logarithm of both sides, the equation above becomes

( ) = ( ) − T ln Si ln S0 bgi Dgi, (3.10) which, assuming additive noise on the log scale, can be structured into the well-known multiple linear regression form

y = Xw + εε, (3.11)

T where y = [ln(S1), ln(S2), ⋯, ln(SM )] represents the logarithm of the mea- T sured signal, w = [Dxx,Dyy,Dzz,Dxy,Dxz,Dyz, ln(S0)] are the unknown regression coefficients, X is a M × 7 design matrix containing the different diffusion gradient directions,

⎡ 2 2 2 1 ⎤ ⎢ g g g 2g1xg1y 2g1xg1z 2g1yg1z − ⎥ ⎢ 1x 1y 1z b ⎥ ⎢ g2 g2 g2 2g g 2g g 2g g − 1 ⎥ = − ⎢ 2x 2y 2z 2x 2y 2x 2z 2y 2z b ⎥ X b ⎢ ⎥ , (3.12) ⎢ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⎥ ⎢ 2 2 2 − 1 ⎥ ⎣gMx gMy gMz 2gMxgMy 2gMxgMz 2gMygMz b ⎦

T and ε = [ε1, ε2, ⋯, εM ] are the error terms. We will consider diffusion data containing M volumes with N voxels ordered in a M ×N matrix Y. The linear regression model given in Equation 3.11 can be rewritten for simultaneous estimation of all parameters, according to

Y = XW + E, (3.13) where W is a 7 × N matrix of regression coefficients and E is a M × N matrix of error terms. After fitting the diffusion data, a diffusion tensor is obtained in each voxel. Commonly used DTI scalar metrics are fractional anisotropy (FA), mean diffusivity (MD), axial diffusivity (AD) and radial diffusivity (RD)

17 3. Diffusion MRI

(Basser and Pierpaoli, 1996), which are defined as

λ + λ + λ MD = 1 2 3 , (3.14) ¿ 3 Á 2 2 2 ÀÁ(λ1 − λ2) + (λ2 − λ3) + (λ3 − λ1) FA = , (3.15) ( 2 + 2 + 2) 2 λ1 λ2 λ3

AD = λ1, (3.16) λ + λ RD = 1 2 . (3.17) 2 MD is the average of all eigenvalues and provides information regarding the amount of obstruction to water molecule diffusion. The FA value is the vari- ance of the eigenvalues in a given voxel. FA values range from 0 to 1, with 0 corresponding to isotropic diffusion, suggesting, for example, no neuronal fiber tracts or randomly oriented fiber tracts, and 1 to anisotropic diffusion, suggesting coherent and intact fiber tracts. It is possible to display FA values using RGB coding where the colors red, green and blue represent the principal eigenvector in the x, y, z axes respectively, see Figure 3.4.

Figure 3.4: DTI scalars (from left to right): FA, RGB coded FA, MD, AD, RD. Data is from MGH adult diffusion (Setsompop, Kimmlingen, Eberlein, et al., 2013).

The common diffusion tensor model has very limited capabilities to resolve multiple-fiber orientations inside a voxel, due to the use of a single tensorto model complex diffusion processes occurring within a voxel. It is obvious that biological tissues are usually heterogeneous, and a single diffusion tensor is not able to distinguish complex fiber structures, for example, crossing, kissing, or branching within a voxel.

Diffusion kurtosis imaging The diffusion tensor model framework provides a complete account of free dif- fusion. In practice, biological tissues are heterogeneous and water molecules are encountering barriers when moving across compartments. As a result, a water molecule will not diffuse as far as one that follows a Gaussian distribu- tion in a given time interval. Liu, Bammer, Acar, et al. (2004) proposed to

18 3.3. Diffusion MRI techniques include a series of higher‐order to account for the non‐Gaussian effect

S ( ) ( ) ( ) ( ) ln [ ] = j2b 2 D 2 + j3b 3 D 3 i1i2 i1i2 i1i2i3 i1i2i3 S0 ( ) ( ) ( ) ( ) + j4b 4 D 4 + j5b 5 D 5 + ⋯, (3.18) i1i2i3i4 i1i2i3i4 i1i2i3i4i5 i1i2i3i4i5

( ) where j is the imaginary unit, b n is the element of an nth order b-matrix, i1i2⋯in ( ) D n is the element of an nth order diffusion tensor. The fourth-order i1i2⋯in truncation of Equation 3.18 is known as diffusion kurtosis imaging (DKI) (Jensen, Helpern, Ramani, et al., 2005). Kurtosis is a concept borrowed from probability theory and statistics, and it is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. Kurtosis is calculated as E(x − µ)4 K = , (3.19) σ4 where µ is the mean, σ is the standard deviation, and E(⋅) represents the expected value. The kurtosis of the normal distribution is K = 3. Figure 3.5 shows the probability density functions with different kurtosis. Diffusion

Figure 3.5: Probability density functions with different kurtosis. kurtosis imaging (Jensen, Helpern, Ramani, et al., 2005) was proposed as a natural extension of diffusion tensor imaging. By performing a one-step Taylor expansion of the exponent in the diffusion tensor imaging, the attenuation can be written as

S −bD(g)+ 1 b2D(g)2K(g) = e 6 , (3.20) S0

19 3. Diffusion MRI where

3 3 D(g) = ∑ ∑ gigjDij, (3.21) i=1 j=1 MD2 3 3 3 3 K(g) = ∑ ∑ ∑ ∑ g g g g W . (3.22) ( )2 i j k l ijkl D g i=1 j=1 k=1 l=1

Dij are the elements of the second order diffusion tensor D, and Wijkl are the elements of the fourth order kurtosis tensor W. Similarly as the diffusion tensor D, the kurtosis tensor W is symmetric and has 15 additional elements to be estimated.

Multi-tensor model The multi-tensor model (Tuch, Reese, Wiegell, et al., 2002) was proposed as a natural extension of the single-tensor model to characterize diffusivity of a voxel containing different fiber compartments. Each compartment canbe described using a distinct tensor, according to

N T Si = S0 ∑ fn exp(−bgi Dngi), (3.23) n=1 where N is the number of tensors per voxel and fn is the volume fraction for each tensor. Jian, Vemuri, Özarslan, et al. (2007) and Shakya, Batool, Özarslan, et al. (2017) proposed extensions in which each compartment is not described by a single tensor but a Wishart distribution of diffusion tensors. The ball-and-stick model (Behrens, Woolrich, Jenkinson, et al., 2003; Behrens, Berg, Jbabdi, et al., 2007), one of the most popular multi-tensor models, was originally implemented to estimate fiber orientations in FSL Diffusion Toolkit (bedpostx). The ball-and-stick model is a special case of the multi-tensor model and mainly aims for resolving crossing fibers and estimating their relative volume fractions. The fiber populations are represented by sticks, which have a unique tensor representation with radial diffusivity removed. The free water diffusion is represented by a ball, which is an isotropic tensor, seeFigure 3.6. The diffusion signal can be written according to

N N T S −bd −bg Dig = (1 − ∑ fi) e + (∑ fi) e , (3.24) S0 i=1 i=1 where N is the maximum number of fibers (sticks), d is a diffusivity parameter, fi represents the stick’s volume fraction and Di is the tensor of the i-th stick compartment. The stick tensor Di has one positive eigenvalue and two zero eigenvalues. The ball-and-stick model assumes that all fibers share the same diffusion profile (the sticks), and it cannot provide axial or radial diffusivity information.

20 3.3. Diffusion MRI techniques

Figure 3.6: In the ball-and-stick model, the free water diffusion is represented bya ball and the fibers are represented by sticks.

Another popular multi-tensor model is neurite orientation dispersion and density imaging (NODDI) (Zhang, Schneider, Wheeler-Kingshott, et al., 2012), see Figure 3.7. NODDI adopts a three-compartment model, the in- tracellular compartment (IC), the extracellular compartment (EC) and the cerebrospinal fluid (CSF) component. The intracellular compartment canbe viewed as a set of sticks. The extra-cellular components are modelled as sim- ple (Gaussian) anisotropic diffusion. The normalized diffusion signal canbe written as

E = (1 − fiso)(ficEic + fecEec) + fisoEiso, (3.25) where Eic, Eec and Eiso are respectively the normalized signals coming from intra-cellular space, extracellular space and CSF, fiso, fic and fec are respec- tively the fractions occupied by the CSF, the intra-cellular space and the extra-cellular space.

Figure 3.7: An example of a NODDI scalar map: isotropic (CSF) volume fraction. Data is from the NODDI Matlab toolbox tutorial1.

Diffusion spectrum imaging Diffusion Spectrum Imaging (DSI) (Wedeen, Hagmann, Tseng, etal., 2005) is a method based on the Fourier transform relation between diffusion signals

1http://mig.cs.ucl.ac.uk/index.php?n=Tutorial.NODDImatlab

21 3. Diffusion MRI and the averaged propagator. DSI is a typical representative of the q-space encoding technique, see Figure 3.8. The signal can be written as

S(q) = ∫ p(r)ei2πqrdr, (3.26) S0 where r is the spin displacement and p(r) is the probability density function of the average spin displacement, known as the averaged propagator or the diffusion spectrum. The q-space and the b-space (the gradient g) are related = 1 by q 2π γδg. When no model is used, estimation of the fiber orientation in a voxel requires the characterization of a 3D diffusion probability density function, i.e. the averaged propagator p(r). At each voxel, the averaged prop- agator can be reconstructed as the Inverse Fourier transform of the measured diffusion signal,

S(q) p(r) = IFT { } . (3.27) S0

By acquiring the q-space signal S(q) and taking the inverse Fourier transform, the averaged propagator p(r) can be derived, giving a model-free description of the diffusion process within the voxel of interest. Hence, to sample theq- space of size Nx, Ny and Nz, a total of Nx × Ny × Nz diffusion-weighted scans are required. From the averaged propagator p(r) of the molecules within a

Figure 3.8: The commonly employed acquisition scheme for DSI features Cartesian sampling where only those points that remain within a sphere in q-space are retained. voxel, various information about the tissue microstructure can potentially be extracted. For example, the orientation distribution function (ODF) can be defined as

ODF (u) = ∫ p(ru)r2dr, (3.28) and ODF measures the quantity of diffusion in the direction of the unit vec- tor u. The rich microstructure information provided by DSI comes at the

22 3.3. Diffusion MRI techniques expensive cost of a very long acquisition time. It was reported that a DSI ac- quisition protocol that requires around 500 measurements led to a total scan time of about 30 minutes (Wedeen, Hagmann, Tseng, et al., 2005).

Q-ball imaging Q-ball imaging (QBI) (Tuch, 2004) aims to reconstruct the ODF from diffusion measurements taken on a sphere in q-space, using the Funk-Radon transform. Instead of sampling q-space in a 3D Cartesian grid, the QBI technique samples q-space on a sphere, see Figure 3.9. By doing so, the required number of diffusion-encoding directions is considerably lower compared with DSI. The definition of the ODF used by Tuch(2004) is, however, different from the actual marginal propagator of diffusion in constant solid angle. It is computed as a linear radial projection of the propagator 1 ψ(u) = ∫ P (ru)dr, (3.29) Z where Z is a normalization constant. The ODF can be directly approximated from the the spherically sampled q-space signals, by simply taking the Funk- Radon transform (FRT), according to 1 S(q) ODF (u) ≈ FRT ( ) . (3.30) Z S0 However, there is no golden standard for the choice of the optimal q-space

Figure 3.9: In FRT the integral of the signal attenuation over each equator is as- signed to the point farthest from that equator. sampling radius and the optimal number of sampled points of QBI experi- ments. Regular choices are using b-values in the range 2500 to 4000 s/mm2 with more than 250 directions.

Mean apparent propagator MRI In mean apparent propagator MRI (MAP-MRI) (Özarslan, Koay, Shepherd, et al., 2013) the three-dimensional q-space diffusion signal attenuation E(q)

23 3. Diffusion MRI is expressed as

Nmax ( ) = ∑ ∑ ( ) E q an1n2n3 Φn1n2n3 A, q , (3.31) N=0 {n1,n2,n3} ( ) where Φn1n2n3 A, q are related to Hermite basis functions and depend on the second-order tensor A and the q-space vector q. A can be taken to be the covariance matrix of displacement, defined as

2 ⎛ux 0 0 ⎞ = T = ⎜ 2 ⎟ A 2R DRtd 0 uy 0 , (3.32) ⎝ 2⎠ 0 0 uz where R is the transformation matrix whose columns are the eigenvectors of the standard diffusion tensor D, and td is the diffusion time. The non-negative indices ni are the order of Hermite basis functions which satisfy the condition n1 + n2 + n3 = N, and Nmax is the maximum order. The q-space vector q is defined as q = γδG/2π, where γ is the gyromagnetic ratio, δ is the diffusion gradient duration, and G determines the gradient strength and direction. The diffusion propagator, a 3-dimensional probability density function, isthe three-dimensional inverse Fourier transform of E(q), and can be expressed as

Nmax ( ) = ∑ ∑ ( ) P r an1n2n3 Ψn1n2n3 A, r , (3.33) N=0 {n1,n2,n3} ( ) where Ψn1n2n3 A, r are the corresponding basis functions in displacement space r. The number of coefficients for MAP-MRI is given by 1 N N N N = (⌊ max ⌋ + 1) (⌊ max ⌋ + 2)(4⌊ max ⌋ + 3) . (3.34) coef 6 2 2 2 ( ) ( ) The MAP-MRI basis functions, Φn1n2n3 A, q in q-space and Ψn1n2n3 A, r in displacement r-space, are given by ( ) = ( ) ( ) ( ) Φn1n2n3 A, q ϕn1 ux, qx ϕn2 uy, qy ϕn3 uz, qz , (3.35) ( ) = ( ) ( ) ( ) Ψn1n2n3 A, r ψn1 ux, x ψn2 uy, y ψn3 uz, z , (3.36) with (Özarslan, Koay, and Basser, 2008)

−n i −2 2 2 2 ϕn(u, q) = √ e π q u Hn(2πuq), (3.37) 2nn! 1 −x2/2u2 ψn(u, x) = √ e Hn(x/u), (3.38) 2n+1πn!u where Hn(x) is the nth order Hermite polynomial. Equation 3.31 can be written in matrix form (with error term added on the right side) as

y = Qa + εε, (3.39)

24 3.3. Diffusion MRI techniques

where y is a vector of T signal values, Q is a T × Ncoef design matrix formed ( ) by the basis functions Φn1n2n3 A, q , a contains the parameters to estimate, and ε is the error. The coefficients a can be obtained by solving the following quadratic minimization problem, min ∣∣y − Qa∣∣2, Ka ≥ 0, 1T Ka ≤ 0.5, (3.40) a where 0 and 1 are vectors with elements 0 and 1, respectively. The rows of ( ) the constraint matrix K are the basis functions Ψn1n2n3 A, r evaluated on a uniform Cartesian grid in the positive z half space. The first constraint en- forces non-negativity of the propagator, and the second one limits the integral of the probability density (propagator) to a value no greater than 1. Zero displacement probabilities include the return-to-origin probability (RTOP), and its variants in 1D and 2D: the return-to-plane probability (RTPP), and the return-to-axis probability (RTAP), respectively. In terms of MAP-MRI coefficients they are given by (Özarslan, Koay, Shepherd, etal., 2013)

Nmax 1 N/2 RTOP = √ ∑ ∑ (−1) an n n Bn n n , (3.41) 3∣ ∣ 1 2 3 1 2 3 8π A N=0 {n1,n2,n3} N 1 max ( + )/ = √ ∑ ∑ (− ) n2 n3 2 RT AP 1 an1n2n3 Bn1n2n3 , (3.42) 2πuyuz N=0 {n1,n2,n3}

Nmax 1 (n1)/2 RTPP = √ ∑ ∑ (−1) an n n Bn n n , (3.43) 2πu 1 2 3 1 2 3 x N=0 {n1,n2,n3} where ( )1/2 = n1!n2!n3! Bn1n2n3 Kn1n2n3 , (3.44) n1!!n2!!n3!! = and Kn1n2n3 1 if n1, n2 and n3 are all even and 0 otherwise. The non- Gaussianity (NG) measures the dissimilarity between the propagator and its Gaussian parts, according to (Özarslan, Koay, Shepherd, et al., 2013) ¿ Á a2 NG = ÀÁ1 − 000 . (3.45) ∑Nmax ∑ a2 n=0 {n1,n2,n3} n1n2n3 The propagator anisotropy (PA) measures the angular similarity between the propagator and its isotropic part, according to (Özarslan, Koay, Shepherd, et al., 2013) ¿ 2 Á Nmax Á (∑ = ∑{ } an n n on n n ) PA = ÀÁ1 − n 0 n1,n2,n3 1 2 3 1 2 3 , (∑Nmax ∑ a2 )(∑Nmax ∑ o2 ) n=0 {n1,n2,n3} n1n2n3 m=0 {m1,m2,m3} m1m2m3 (3.46) where on1n2n3 are the MAP-MRI coefficients of its isotropic part. Figure 3.10 shows some examples of MAP-MRI scalars for a healthy control.

25 3. Diffusion MRI

Figure 3.10: MAP-MRI scalars (from left to right): RTOP, NG and PA for a healthy control. Data is from MGH adult diffusion (Setsompop, Kimmlingen, Eberlein, et al., 2013).

26 4 Preprocessing of diffusion MRI data

Have no fear of perfection - you’ll never reach it. — Salvador Dali

4.1 Introduction

Having collected the raw diffusion weighted images, it is necessary to perform a number of steps in order to correct the data for typical distortions associated with the EPI sequence used to acquire the data, and to generate diffusion scalar maps. There are many calculations and image processing steps that contribute to generating the final results and scalar maps, and each stepcan add uncertainty to the final results. This chapter reviews the most common type of distortions, and how they can be corrected. There are a number of artifacts that can reduce the quality of diffusion MRI data, including susceptibility-induced distortions, eddy-current-induced distortions, head motion, ghosting, and signal dropouts, see Figure 4.1, 4.2, 4.3, 4.4 and 4.5 for examples. The vast majority of diffusion imaging is per- formed using EPI. Spatial encoding in diffusion imaging is achieved by using field gradients that cause the signal from different positions to be emittedat different frequencies. It is assumed that there is a linear relationship between position and frequency. However, in practice there are several fields added to the linear gradients, which cause susceptibility-induced distortions and eddy- current–induced distortions. A comprehensive preprocessing pipeline for dif- fusion MRI data should therefore accomplish the following goals:

1. susceptibility-induced distortion correction,

27 4. Preprocessing of diffusion MRI data

2. eddy-current-induced distortion correction, 3. head motion correction.

Figure 4.1: A diffusion weighted image damaged by slice misalignment due to within volume motion. Left: Sagittal view, middle: coronal view, right: axial view. Data is from the developing HCP project (Hughes, Cordero-Grande, Murgasova, et al., 2017).

Figure 4.2: A diffusion weighted image damaged by severe signal dropouts across multiple slices. Left: Sagittal view, middle: coronal view, right: axial view. Data is from the developing HCP project (Hughes, Cordero-Grande, Murgasova, et al., 2017).

4.2 Susceptibility-induced distortion correction

Susceptibility, or magnetic susceptibility is a measure of how much a mate- rial will become magnetized in an applied magnetic field. When a subject (consisting of flesh and bone) is placed in the scanner, it will resist mag- netization and disrupt the homogeneity of the magnetic field. The difference

28 4.2. Susceptibility-induced distortion correction between the ideal homogeneous magnetic field and the disrupted field is called a susceptibility-induced off-resonance field. Whenever there are two materi- als (e.g. bone and air) with different susceptibility, the magnetic field will have an inhomogeneous distribution described by a set of partial differential equations. The transformations applied to the images have two components: a displacement that is restricted along the phase-encoding direction, and an intensity modulation given by the Jacobian determinant of the geometrical transformation. Figure 4.3 shows some examples of susceptibility distortions for four phase encoding (PE) directions: LR (left right) and RL (right left), or AP (anterior posterior) and PA (posterior anterior). Strategies for correct- ing susceptibility-induced distortions depend on the available data. There are three main techniques used for correcting the susceptibility distortion, reg- istration based (RB) methods, fieldmap based (FB) methods and phase en- coding based (PB) methods. Table 4.1 shows a list of susceptibility-induced distortion correction tools.

Figure 4.3: LR: b0 with LR distortion. RL: b0 with RL distortion. AP: b0 with AP distortion. PA: b0 with PA distortion. These images were generated by running real Human Connectome Project images through the POSSUM simulator.

Registration based methods If the diffusion MRI data is only acquired with a single PE direction, the best strategy is the registration based (RB) methods. This class of methods does a (nonlinear) alignment to the anatomical image, mapping the spatial contrast of the b0 image to the spatial contrast in the anatomical image. A non-diffusion weighted image (i.e. b0, since it has a higher SNR compared to higher b-values) and the high spatial resolution anatomical image can be used for the registration, but distortions in the b0 image and differences in contrast complicate the registration. If several non-diffusion weighted images are collected, the average b0 volume can be used for the registration. FSL provides a script epi_reg designed to register b0 images to anatomical (e.g. T1-weighted) images using boundary- based registration (Greve and Fis-

29 4. Preprocessing of diffusion MRI data chl, 2009). This script will perform a white-matter segmentation of the struc- tural image to define a white-matter boundary. The white-matter boundary is mapped to the b0 image using a 6 degree-of-freedom transformation, and samples of the b0 image intensity are then taken along both sides of the bound- ary. The difference between the intensities in each pair is used to calculate the cost function. To prevent it from being unduly influenced by outliers, the difference is run through a sigmoid-like non-linear function to suppress the effect of very large differences. The sign of the difference isalsoimpor- tant and it is set-up to expect a higher intensity in the grey-matter in the b0 image. Viewing the registered b0 image with an overlay of the white-matter boundaries is a good way to assess the accuracy of the registration.

Fieldmap based methods A fieldmap describes the inhomogeneity field (off-resonance frequency) ateach voxel. There are many different ways that field maps can be acquired. A simple method is to collect MR images at two slightly different echo times, and then estimate the field map from those two scans (Jezzard and Balaban, 1995; Nayak and Nishimura, 2000; Cusack, Brett, and Osswald, 2003). Given a field ∆B0, we would expect the observed difference in phase between the two acquisitions to be

∆Φ = 2πγ∆B0∆t, (4.1) where γ is the gyromagnetic ratio, ∆t is the echo time difference. Figure 4.4 shows an example of an estimated fieldmap using FSL’s function topup. It is

Figure 4.4: A fieldmap estimated using FSL’s function topup. Each pixel describes the off-resonance field value in the unit ofHz. hence easy to calculate the fieldmap as ∆Φ ∆B = . (4.2) 0 2πγ∆t

30 4.2. Susceptibility-induced distortion correction

With the fieldmap estimated, if the diffusion MRI data is acquired with single PE direction, FUGUE1 (FMRIB’s Utility for Geometrically Unwarping EPIs) in FSL can be used for the susceptibility distortion correction. It performs unwarping of the b0 image based on the estimated fieldmap.

Phase encoding based methods It has been demonstrated that it is not possible to reconstruct distortion- free diffusion data from the data acquired using a single PE direction, even with the known displacement field (Andersson, Skare, and Ashburner, 2003). However, it is possible to estimate the displacement field by maximizing the similarity of two distorted images collected with opposing phase-encoding di- rections. With the estimated displacement field, an undistorted image can be reconstructed in a least squares sense. This is how the FSL function topup estimates and corrects susceptibility-induced distortions. The inhomogene- ity field ∆B0 is directly related to the displacement field d of a given pixel (x, y, z). Along the phase-encoding direction, the displacement field d is pro- portional to the slice readout time (Tro) and the inhomogeneity field ∆B0 as follows (Jezzard and Balaban, 1995)

d(x, y, z) = γ∆B0(x, y, z)Tro. (4.3)

The inhomogeneity field ∆B0(x, p) and the displacement field d(x, p) can be parameterised by some vector p. The main equation which describes the updating rule for estimating the parameters from the diffusion data is given by − ⎡ df ⎤ 1 ⎡ ⋯ ⎤ ⎢ 1 ⎥ ⎛ ⎢R 0 0 ⎥ ⎢ dp ⎥⎞ ⎢ ⎥ df ⎜ T T T ⎢ 0 R 0 0 ⎥ ⎢ 2 ⎥⎟ − = ⎜[ df1 df2 df1 ] ⎢ ⎥ ⎢ dp ⎥⎟ p p0 ⎜ ( ) ( ) ⋯ ( ) ⎢ ⎥ ⎢ ⎥⎟ ⎜ dp dp dp ⎢ 0 0 R 0 ⎥ ⎢ ⋮ ⎥⎟ ⎢ ⎥ ⎢ ⎥ ⎝ ⎣ 0 0 0 R⎦ ⎢ df1 ⎥⎠ ⎣ dp ⎦ ⎡ ⎤ ⎡ ⎤ ⎢R 0 ⋯ 0 ⎥ ⎢f1⎥ ⎢ ⎥ ⎢ ⎥ T T T ⎢ 0 R 0 0 ⎥ ⎢f2⎥ × [ df1 df2 df1 ] ⎢ ⎥ ⎢ ⎥ ( ) ( ) ⋯ ( ) ⎢ ⎥ ⎢ ⎥ , (4.4) dp dp dp ⎢ 0 0 R 0 ⎥ ⎢ ⋮ ⎥ ⎢ ⎥ ⎢ ⎥ ⎣ 0 0 0 R⎦ ⎣f1⎦ where

⎡ + ⎤ df ⎢ dfi ⎥ i = ⎢ dp ⎥ , (4.5) ⎢ dfi− ⎥ dp ⎣ dp ⎦ I −I R = [ ] , (4.6) −II

fi+ fi = [ ] . (4.7) fi−

1https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FUGUE

31 4. Preprocessing of diffusion MRI data

p is the vector of weights of the basis warps, fi denotes the ith column of the EPI image, + and − denote the two phase-encoding directions. We can formu- late a model for data acquired using two opposing phase-encoding directions f+ and f− according to

f+ K+ [ ] = [ ] ρ, (4.8) f− K− where the matrix K depends on the inhomogeneity field ∆B0(x, p) (estimated using Equation 4.4) and maps the undistorted image ρ to the distorted EPI image f. The equation can be easily solved in a least squares sense by

−1 T T K+ T T f+ ρˆ = ([K+ K− ] [ ]) [K+ K− ] [ ] . (4.9) K− f−

Similarly, a reversed gradient method named hyperelastic susceptibility correction (HySCO) was proposed in (Ruthotto, Kugel, Olesch, et al., 2012; Ruthotto, Mohammadi, Heck, et al., 2013) and integrated as a batch tool into the Statistical Parametric Mapping (SPM) (Penny, Friston, Ashburner, et al., 2011) software. Given two oppositely distorted images f+ and f−, the inhomo- geneity field ∆B0 can be estimated by minimizing the distance function

1 d∆B0(x) D (∆B ) = ∫ f+ (x + ∆B (x)v)(1 + ) 0 2 0 dv

d∆B0(x) − f−(x − ∆B (x)v)(1 − ) , (4.10) 0 dv where v denotes the phase-encoding direction, for example v = (0, 1, 0)T . Finally, the corrected images are obtained by applying the transformations for ∆ˆB0 on f+ and f−. Diffeomorphic Registration for Blip-Up blip-Down Diffusion Imaging (DR- BUDDI) (Irfanoglu, Modi, Nayak, Knutsen, et al., 2014; Irfanoglu, Modi, Nayak, Hutchinson, et al., 2015) is a module of the TORTOISE software (Pierpaoli, Walker, Irfanoglu, et al., 2010; Irfanoglu, Nayak, Jenkins, et al., 2017) which is used to correct susceptibility-induced distortions using blip-up and blip-down data (another term for data acquired with two phase encoding directions). The processing starts with head motion and eddy current distor- tion correction for both the blip-up and down datasets using the DIFFPREP tool of TORTOISE. After this step, all diffusion volumes in both datasets are aligned to their corresponding non-diffusion weighted volumes, and the b vec- tors for all volumes are properly reoriented (Leemans and Jones, 2009). DIFF- PREP also outputs two transformation files, one for each dataset, describing the entire motion and eddy-current distortions for all volumes. These trans- formation files are fed into the DR-BUDDI tool. The non-diffusion weighted images for both blip-up and down are then quadratically registered (Rohde,

32 4.2. Susceptibility-induced distortion correction

Barnett, Basser, et al., 2004) to the structural image within DR-BUDDI to correct for concomitant field distortions. The resulting deformation fields are subsequently combined with each volume’s motion and eddy-current trans- formations to generate the overall displacement fields, which are used along with the original diffusion data to generate the corrected blip-up and down ′ ′ diffusion datasets f+ af− with a single interpolation step (to avoid unnecessary blurring). Lastly, these two datasets are combined using geometric averaging to generate the final corrected dataset ¯f, i.e. according to

′ f f−′ ¯f = 2 + . (4.11) f+′ + f−′

33 4. Preprocessing of diffusion MRI data Website Website Website https://www.nitrc.org/projects/dtic/ https://www.nitrc.org/projects/dtic/ https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/eddy https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/topup https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FNIRT https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FUGUE https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/eddy http://cmictig.cs.ucl.ac.uk/wiki/index.php/Reg_f3d https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/MCFLIRT http://www.diffusiontools.com/documentation/hysco.html http://www.diffusiontools.com/documentation/ECMOCO.html https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FLIRT/UserGuide#epi_reg https://science.nichd.nih.gov/confluence/display/nihpd/DR-BUDDI https://afni.nimh.nih.gov/pub/dist/doc/program_help/3dQwarp.html http://www.diffusiontools.com/documentation/ECMOCO.html https://afni.nimh.nih.gov/pub/dist/doc/program_help/unWarpEPI.py.html A list of head motion correction tools. http://brainsuite.org/processing/diffusion/distortion-correction-and-co-registration/ https://tortoise.nibib.nih.gov/tortoise/v313/7-step-3-eddy-motion-and-epi-distortion-correction https://na-mic.org/wiki/Mbirn:_B0_and_eddy_current_correction_code_for_diffusion_MRI https://tortoise.nibib.nih.gov/tortoise/v313/7-step-3-eddy-motion-and-epi-distortion-correction eddy correction ECMOCO Head motion flirt (mcflirt) A list of eddy-current-induced distortion correction tools. A list of susceptibility-induced distortion correction tools. Table 4.3: dtic dtic fnirt eddy BDP topup Mbirn HySCO epi_reg FUGUE NiftyReg 3dQwarp Table 4.2: ECMOCO Table 4.1: unWarpEPI DR-BUDDI DIFFPREP DIFFPREP FSL FSL SPM eddy_correct distortion correction distortion correction Susceptibility-induced Eddy-current -induced analysis software Diffusion MRI data FSL FSL FSL FSL FSL FSL SPM SPM AFNI AFNI reg_f3d BrainSuite TORTOISE TORTOISE TORTOISE analysis software analysis software Diffusion MRI data Diffusion MRI data

34 4.3. Eddy-current-induced distortion correction

4.3 Eddy-current-induced distortion correction

Eddy currents are induced in the MR scanner coils due to the quickly changing magnetic fields. Eddy-current-induced distortions usually blur the boundaries between white matter and gray matter, and warp the data along the phase encoding direction, see Figure 4.5, which can cause serious problems when correcting for head motion or registering a subject to a brain template (for a group analysis). The most appealing way to reduce the eddy-current- induced distortions is to register the diffusion images to a reference image, usually the first non-diffusion weighted volume (b0). The bonus of this registration is the simultaneous correction of head motion (Graham, Drobnjak, and Zhang, 2016). Table 4.2 shows a list of eddy-current-induced distortion correction tools.

Figure 4.5: Diffusion weighted images with eddy-current distortions. The PE direc- tions are LR (first row) and RL (second row). Left: Sagittal view, middle: coronal view, right: axial view. Data is from WU-Minn HCP (Van Essen, Smith, Barch, et al., 2013).

FSL’s tool eddy_correct, calling another FSL function flirt (Jenkinson and Smith, 2001; Jenkinson, Bannister, Brady, et al., 2002), registers each volume in a dataset to a b0 image in order to simultaneously correct for head motion and eddy-current-induced distortions. Flirt is a global optimisation based tool for linear affine intra- and inter-modal brain image registration. Cost functions include least squares and normalised correlation for within-modality registration, as well as correlation ratio (the default), mutual information and normalised mutual information for between-modality registration.

35 4. Preprocessing of diffusion MRI data

It has been demonstrated that using eddy_correct leads to significant er- rors (Graham, Drobnjak, and Zhang, 2016) thus there is a new more ad- vanced tool in FSL called eddy (Andersson and Sotiropoulos, 2016) to correct for eddy-current-induced distortions and subject movements. Using eddy re- quires that the data have been collected in a specific way. If the diffusion data is sampled using gradient directions spread over the whole sphere, rather than over the half sphere, eddy will be better able to correct for distortions. Also, it is easy to estimate the distortions from the diffusion data acquired with a reversed phase encoding direction pair. This is because eddy attempts to combine the correction for susceptibility, eddy currents and head motion, and therefore it is necessary to feed eddy the susceptibility distortions calculated from topup. Eddy assumes the fieldmap format given by the topup tool in FSL. If there is no topup output available, it is possible to use a traditional dual echo-time fieldmap prepared using the FSL function PRELUDE. If a fieldmap is not supplied, eddy will still be able to correct for eddy currents and sub- ject movement, though the results will still contain susceptibility distortions. There also needs to be a minimum number of diffusion directions depending on the b-value, a larger number of directions are required for higher b-values. The algorithm can be summarized as follows:

1. N diffusion weighted volumes fi with acquisition parameters (PE‐direction and bandwidth) ai as input. 2. Susceptibility field h as input.

3. Initialize eddy current parameters βi and subject movement ri to 0.

4. Calculate the estimated intensity sˆi in the undistorted space s.

5. Train Gaussian Process prediction maker using sˆi

6. Draw prediction sample si from Gaussian Process ˆ 7. Calculate the prediction in acquisition fi ˆ 8. Update βi and ri based on fi − fi

4.4 Head motion correction

Head motion correction is an important issue in diffusion data analysis as even the slightest patient motion can induce significant motion distortion, particularly at tissue boundaries, at the edge of the brain or near major vessels. The relatively long scan time means that bulk motion effects are unavoidable in awake subjects. Image registration is commonly employed to correct for distortions caused by head motion, and is in eddy_correct performed at the same time as eddy current correction. It is harder to estimate head motion for large b-values, since the SNR is lower for such volumes. Table 4.3 shows a list of head motion correction tools.

36 4.5. Discussion

4.5 Discussion

Not only the distortions themselves, but also the strategies employed to cor- rect them may introduce errors. For example, when correcting for motion and eddy-current-induced distortions (by performing linear registration of the diffusion-weighted images to one of the non-diffusion weighted images), itis important to also reorient the encoding gradient vectors with the same ro- tation matrix. Neglecting to perform this important step may have minimal impact on scalar indices such as FA, but it can introduce biases on the order of a couple of degrees to estimates of the principal eigenvector (Leemans and Jones, 2009). It is also important to correct for any residual eddy currents along the phase-encoding direction (by modulating the signal intensity back to its correct value by scaling the intensity in proportion to the change in the volume of the voxel). Neglecting to do this can introduce biases in quantita- tive metrics and estimates of orientation (Jones and Cercignani, 2010).

37

5 Tractography

Anything is one of a million paths. — Carlos Castenda

5.1 Introduction

Tractography is a collection of methods used to reconstruct white matter pathways in the human brain, using the orientation information derived from the used diffusion model. The diffusion of water in white matter is greatest along the fiber orientation. Thus the white matter fiber bundle direction is assumed to be along the major direction of the local diffusivity. Tractography does not model individual white matter fibers but average bundles. The possibility to extract white matter fiber bundles from diffusion weighted data can be used to delineate brain connectivity and study changes and disease of white matter fibers. Therefore, tractography has been of great interest to both clinicians and basic science researchers. Tractography algorithms can be categorized into two classes, deterministic and probabilistic tractography, see Table 5.1.

39 5. Tractography ) ) ) ) 2010 ) 2005 ) ) 2014 2012 ) ) ) ) 2009 ) 2006 2006 2005 2006 2004 2004 2010 ) 1996 Reference Goebel ( Yeh, Wedeen, and Tseng ( Pieper, Halle, and Kikinis ( Park, Shenton, and Westin ( Jiang, Van Zijl, Kim, et al. ( Zhang, Yushkevich, and Gee ( Fillard, Toussaint, and Pennec ( Cook, Bai, Nedjati-Gilani, et al. ( Leemans, Jeurissen, Sijbers, et al. ( Jenkinson, Beckmann, Behrens, et al. ( Garyfallidis, Brett, Amirbekian, et al. ( Papademetris, Jackowski, Rajeevan, et al. ( Cárdenes, Muñoz-Moreno, Tristan-Vega, et al. ( Website dipy.org www.slicer.org fsl.fmrib.ox.ac.uk www.mristudio.org camino.cs.ucl.ac.uk www.exploredti.com github.com/medInria dti-tk.sourceforge.net www.brainvoyager.com dsi-studio.labsolver.org www.lpi.tel.uva.es/saturn A list of diffusion tractography tools. neuroimage.yonsei.ac.kr/dodti bioimagesuiteweb.github.io/webapp white.stanford.edu/newlm/index.php/MrDiffusion Table 5.1: probabilistic deterministic deterministic deterministic deterministic deterministic deterministic deterministic deterministic deterministic Tractography deterministic/probabilistic deterministic/probabilistic deterministic/probabilistic deterministic/probabilistic FSL Dipy DoDTI Camino DTI-TK SATURN MedINRIA DSI Studio DTI Studio ExploreDTI MrDiffusion BrainVoyager BioImage Suite analysis software Diffusion MRI data

40 5.2. Deterministic tractography

5.2 Deterministic tractography

With deterministic tractography, streamlines are steered according to a fixed (deterministic) direction at each voxel, which can be computed from the es- timated diffusion model. Deterministic tractography is frequently employed in the scientific and clinical setting due to its simplicity and computation speed. However, deterministic tractography algorithms are unable to account for inherent uncertainty in estimates of fiber orientations, and are sensitive to the estimated principal direction and susceptible to noise. Figure 5.1 shows an example illustrating deterministic tractography performed on the Stanford HARDI dataset (150 measurements, b = 2000 s/mm2 ) (Rokem, Yeatman, Pestilli, et al., 2015). The seed mask is two sagittal slices of the corpus callo- sum.

Figure 5.1: An example of deterministic tractography using the Stanford HARDI dataset. The seed mask is two sagittal slices of the corpus callosum. Tracking from this seed mask gives us a model of the corpus callosum tract.

The directional information of a diffusion model enables estimating and modeling the trajectories of white matter tracts in the human brain. Deter- ministic tractography methods are primarily based on streamline algorithms, where the local white matter fiber bundle direction is defined by the major direction of the diffusion model. In many brain regions, it is reasonable to model the fiber tracts as smooth curves at the resolution of voxels in diffusion MRI (around 2 mm). Tractography can be performed by simply starting at a seed point and following the local major direction (the vector field) on a step-by-step basis, see Figure 5.2. The most straightforward way to perform tractography is connecting each voxel to its neighboring voxel following the fiber direction, e.g. the major direction of the diffusion model. However, the vector field

41 5. Tractography of the major direction derived from diffusion model is inherently discrete on a voxel-by-voxel basis. Thus when using this approach, the estimated fiber bundle often deviates from the true fiber orientation since for each voxel the choice of direction is limited to only 3 × 3 × 3 − 1 = 26 directions in 3D space.

Figure 5.2: Tractography starting at a seed point and following the local major direction step-by-step.

To avoid this problem, several interpolation approaches have been pro- posed to estimate a continuous representation of the local major direction. Aldroubi and Basser (1999) and Pajevic, Aldroubi, and Basser (2002) devel- oped a mathematical framework to generate a continuous tensor field from a discrete one, by repeatedly performing one-dimensional B-spline transforma- tions of the diffusion data. Using the generated continuous representation of the diffusion tensor, a streamline based tractography algorithm was proposed by Basser (1998), see Figure 5.3. Mathematically, the streamline r(s) is a 3D curve and can be parameterized by an arc length s. The arc length is defined as the distance along the streamline from the start. The streamline can be associated with the major eigenvector of the diffusion tensor e1 according to dr(s) = t(s), (5.1) ds t(s) = e1(r(s)), (5.2) where t(s) is the tangent vector of the streamline. This equation describes that the major eigenvector e1 is tangent to the fiber bundles. To trace fiber bundle trajectories, the tangent at s to the streamline is assumed to be the local estimate of fiber orientation. For the diffusion tensor model, the local fiber orientation is the eigenvector corresponding to the largest eigenvalue of the diffusion tensor, i.e. t(s) = e1(r(s)). Since Equation 5.1 is a differential equation which describes how the streamline changes, there is no direct equa- tion to calculate the streamline at any point. Numerically the streamline can be integrated by approximation using a Taylor expansion (Basser, 1998):

r(si+1) ≈ r(si) + αe1(r(si)), (5.3)

42 5.3. Probabilistic tractography

Figure 5.3: A streamline based tractography algorithm. where α is a sufficiently small constant. Although this approximation iseasy to implement, this method has the drawback that all errors in the calculation of the streamline will be accumulated as the streamline progresses. A more reliable solution is to adopt the high order Runge-Kutta method (Press, Teukolsky, Vetterling, et al., 2007) given by α r(s + ) ≈ r(s ) + (k + 2k + 2k + k ) , (5.4) i 1 i 6 1 2 3 4 k1 = t(si), (5.5) α k = t (s + k ) , (5.6) 2 i 2 1 α k = t (s + k ) , (5.7) 3 i 2 2 α k = t (s + k ) . (5.8) 4 i 2 3 The high order Runge-Kutta method is a nonlinear and more accurate solution to integrate the streamline, but it is much more computationally demanding. A graphics card (GPU) can perform tractography in parallel to speed up the process enormously, as each thread can independently process one seed (Eklund, Dufort, Forsberg, et al., 2013).

5.3 Probabilistic tractography

The uncertainty of the principal orientation increases with higher noise lev- els, fewer gradient directions and lower diffusion anisotropy. Tractography is very sensitive to uncertainty in the principal orientation estimation. For each tracking step into an adjacent voxel, tractography errors will be accu-

43 5. Tractography mulated and the estimated white matter tract will further deviate from the truth. Tractography errors propagate through the tractography process and lead to erroneous results about white matter connectivity. The uncertainty cannot be quantified in conventional deterministic trac- tography. However, the amount of uncertainty in the orientation estimation can be used to delineate the potential pathways. Repeated deterministic tractography from some seed voxels will return identical results, based on the estimated principal orientation of the seed voxels. In probabilistic tractogra- phy, the principal orientation is not seen as fixed but as a distribution with a certain uncertainty. Repeated tractography will therefore lead to slightly different results, even if the same seed voxel is used every time. After gener- ating a number of streamlines from each seed voxel, a tract probability map can be derived from the number of streamlines passing through each voxel. The goal of probabilistic tractography is to develop a full representation of the uncertainty accumulated in each step of the tractography process. Prob- abilistic tractography aims to estimate the confidence that the path of least hindrance for diffusion from seed point A passes through region B, giventhe model and the data. Probabilistic tractography is much more computation- ally demanding than deterministic tractography, due to the estimation and usage of orientation distributions in all voxels. There are many tractography algorithms implemented on GPUs to accelerate the process, such as FSL’s bedpostx_gpu and probtrackx_gpu (Hernández, Guerrero, Cecilia, et al., 2013; Hernandez-Fernandez, Reguly, Jbabdi, et al., 2019).

Principal direction uncertainty The first step in probabilistic tractography is to build a 3D uncertainty func- tion, that characterizes the uncertainty in the fiber orientation estimation. This orientation distribution function is a sphere density function, whose sur- face value describes the probability density of the true fiber orientation. There are several ways to estimate uncertainty, the most common being frequentistic methods, such as bootstrap, and Bayesian methods.

Bootstrap Bootstrap (Efron, 1979) is a non-parametric statistical technique used to quantify uncertaintiy of parameters. Bootstrap has been widely used in dif- fusion data analysis to study the uncertainty associated with diffusion model parameter estimation (Jones, 2008; Haroon, Morris, Embleton, et al., 2009; Berman, Chung, Mukherjee, et al., 2008). The repetition bootstrap method requires multiple diffusion datasets col- lected with the same diffusion gradients to perform resampling, see Figure 5.4. The uncertainty function samples are obtained by fitting diffusion mod- els to each of these datasets and estimating the principal diffusion direction.

44 5.3. Probabilistic tractography

However, a large number of data (e.g. 1000) repetitions would be required

Figure 5.4: For the repetition bootstrap, a large number of datasets repetitions would be required. The parameter of interest is calculated from each dataset. The N samples can be used to construct a good representation of the uncertainty function. to construct a good representation of the uncertainty function. It is expen- sive and time consuming to acquire additional repeated diffusion datasets in practice. An implementation of bootstrap sampling with replacement for the dif- fusion tensor model was first proposed by Pajevic and Basser (1999) and Pajevic and Basser (2003) using both simulated and real diffusion MRI data. A healthy subject was scanned using 6 directions and for each direction the measurement was repeated 4 times. For the ith distinct b-matrix bi, the mea- surement is repeated ni (ni > 1) times and designated as subset Si. Together

45 5. Tractography

with the zero diffusion weighting, this yields a total of NS = M + 1 distinct diffusion gradients where M is the number of non-collinear diffusion gradi- = M ent directions. The total number of measurements is thus NM ∑i=0 ni. To perform bootstrap sampling, ni measurements will be drawn randomly with replacement for each subset Si and a diffusion tensor can be estimated from the NM randomly picked measurements, see Figure 5.5. With bootstrap sam- pling repeated NB times, NB diffusion tensors can be used to estimate the probability distribution of the quantities of interest, e.g. principal direction. A full representation of the principal direction uncertainty can be constructed using a large number of bootstrap samples.

Figure 5.5: A graphical illustration of the bootstrap samples obtained by sampling with replacement. By repeating the process NB times, one can obtain NB bootstrap samples. These samples can then be used to obtain the probability distributions of a parameter of interest.

It is not practical to acquire a large number of diffusion datasets with the same diffusion gradients. For most clinical and research applications, itis more interesting to obtain a higher number of gradient directions, since dif- fusion parameter estimation can be more precise with high angular resolution diffusion imaging (HARDI) (Tuch, Reese, Wiegell, etal., 2002). To be able to use bootstrap for diffusion data with many gradient directions, instead of repetitions of the same direction, residual bootstrap can be used with the assumption that the error terms have constant variance. The wild bootstrap is suited when data are heteroscedastic (i.e. have non-constant variance). Bootstrap for diffusion MRI is further discussed in Chapter 6.

46 5.3. Probabilistic tractography

Bayesian methods In the Bayesian inference framework, the posterior distribution quantifies the uncertainty of the parameters β given the data S, according to Bayes theorem:

P (S∣β)P (β) P (β∣S) = , (5.9) P (S) where p(β) is the prior information about the parameters, and p(S∣β) is the likelihood of S given β. The weakness of a Bayesian approach is that it re- quires explicit representations of all the assumptions made in the likelihood and prior distributions, thus a closed form expression for the posterior dis- tribution is normally not available. The strength of a Bayesian approach, on the other hand, is that it offers a way to calculate uncertainty given these assumptions. The likelihood must include a parametric assumption for the re- lationship between the data and quantities of interest, and also about the noise distributions. The prior distribution describes information known about the parameters before any data are collected. Once the likelihood and prior distri- butions have been described mathematically, there are various techniques to recover the posterior distribution by analytical solutions, functional approxi- mations, or efficient strategies for generating samples from the posterior (e.g. Markov chain Monte Carlo (MCMC)). Because of its mathematical simplicity, most applications of Bayesian methods to diffusion MRI have used a MCMC estimation process. By constructing a Markov chain that has the desired pos- terior distribution as its equilibrium distribution, MCMC is guaranteed to generate samples from the true posterior distribution if given enough time. However, like all sampling techniques MCMC is computationally expensive, and it can sometimes be difficult to ascertain when convergence to the true distribution has been achieved. The Bayesian ball-and-stick model used in FSL is further discussed in Chapter 6.

Sampling uncertainty In the previous section, we have shown how it is possible to compute the uncertainty associated with a measurement of local fiber structure. With this uncertainty taken into consideration, there are now an infinite number of possible principal diffusion directions we might follow when the streamline comes to each voxel. There are therefore an infinite number of possible paths starting from the seeds. To solve this problem, we can resort to an intuitive technique that repetitively samples from the resulting probability distribution function of the uncertainty. The procedure can be summarized as follows

1. Start at a seed voxel,

2. Draw a sample from the principal direction distribution function at the current voxel,

47 5. Tractography

3. Move the streamline along the sampled principal direction,

4. Repeat 2 and 3 until a stopping criterion is met.

By repeating this procedure many times, we may build the entire spatial probability density function of the white matter tracts initialized from a seed voxel. One way to obtain probabilities is to count the number of streamlines that pass through each voxel and divide it by the total number of sample streamlines. The resulting discrete values can be interpreted as probabilities that the pathway through the diffusion data passes through a particular region (e.g. a voxel).

5.4 Stopping and rejection criteria

In order to perform tractography, we need a method for identifying when the tracking must stop. Tractography has no mechanism for estimating uncer- tainty in each step as the streamline progresses. Therefore, some heuristic stopping criteria must be defined to terminate the streamline. A stopping criterion determines if the tracking stops or continues at each tracking po- sition. Two commonly used stopping criteria are fractional anisotropy (FA) and curvature. Unlike a stopping criterion which only stops the streamline at the current position, a rejection criterion rejects the entire streamline when some conditions are met.

FA A diffusion scalar map can be used to define a stopping criterion. The stopping criterion uses a diffusion scalar map, e.g. an FA map, to stop the tracking whenever the scalar value is lower than a fixed threshold. Here, we show an example using the FA map of the DTI model. The threshold stopping criterion uses trilinear interpolation at the tracking position. The rationale for using an FA threshold is that voxels with low FA values tend to have a greater degree of ambiguity in the major direction of the fiber bundle. The FA threshold also prevents the streamline from going into gray matter. A too high FA threshold will result in a lower number of reconstructed fiber tracts. Figure 5.6 shows an example of deterministic tractography using the FA map of the DTI model. The obvious limitation of using an FA threshold as the stopping criterion is that FA can be quite low in regions of white matter, such as the centrum semiovale where fiber tracts are crossing. This will cause some tracts to terminate prematurely.

Curvature The curvature is measured along the streamline as the angle between two suc- cessive steps, and a curvature threshold restricts the maximum angle through

48 5.4. Stopping and rejection criteria

Figure 5.6: An example for deterministic tractography using the Stanford HARDI dataset. The seed mask is two sagittal slices of the corpus callosum. The stopping criterion is an FA threshold of 0.2 (top), 0.4 (middle) and 0.6 (bottom). which paths can propagate, which will prevent bended fiber trajectories. Al- though the assumption of low curvature is valid for most of the cases, there are some exceptions. For example, the Meyer’s loop of the optic pathway and the short subcortical U-fibers have high curvature.

49 5. Tractography

Termination mask The termination mask is used as a stopping criterion, so that whenever a streamline reaches a voxel in the termination mask it will stop. Figure 5.7 shows an example of deterministic tractography using a binary white mat- ter mask as the termination mask. The binary white matter mask prevents streamlines from entering the gray matter, since the goal of most diffusion tractography studies is to map out specific white matter pathways. The ter- mination mask can also be used as the destination of the tracts. The mask will then prevent tracts from continuing past the destination to other parts of the brain.

Figure 5.7: An example for deterministic tractography using the Stanford HARDI dataset. The seed mask is two sagittal slices of the corpus callosum. Left: no termination mask used, right: the termination mask is a binary white matter mask.

Inclusion mask and exclusion mask Anatomical priors can be used to define the inclusion mask and the exclusion mask (Smith, Tournier, Calamante, et al., 2012). The inclusion mask is used to specify a region that all streamlines must pass through. That is, if a particular streamline does not pass through the inclusion mask it will be rejected. The exclusion mask is a region which rejects any streamline that touches it. This can, for example, be very helpful to stop tracts crossing between the hemispheres (by including a mask of the mid-sagittal plane as an exclusion mask). To prevent streamlines from progressing through regions of a large un- certainty, and to prevent false positives, deterministic streamline algorithms usually have stopping criteria based on FA and streamline curvature. On the other hand, the motivation of probabilistic tractography is to quantify the un- certainty of tracts, so that a streamline can progress beyond large-uncertainty regions. This consideration reduces the need for stopping criteria like FA and curvature thresholds in probabilistic tractography. Stopping criteria and re- jection criteria can be used in combination to increase tractography accuracy

50 5.5. Limitations

(Girard, Whittingstall, Deriche, et al., 2014). For example, considering a connection from A to B, we can incorporate anatomical priors such that the streamline must go through region C (the inclusion mask), or it must not go through D (the exclusion mask). The streamline would be terminated if it arrives at region B, or if it enters region D (the termination mask). The use of stopping and rejection criteria is extremely useful for isolating specific tracts of interest. Figure 5.8 shows a probabilistic tractography example of the optic radiation (also known as the geniculocalcarine tract) using a seed mask at the lateral geniculate nucleus of the thalamus. The lateral primary visual cortex is used as the inclusion mask and the termination mask. It is demonstrated that the optic radiation is reconstructed more accurately with the inclusion mask and the termination mask, compared to not using the two masks.

Figure 5.8: An example of probabilistic tractography using stopping and rejection criteria to increase the tract accuracy. The seed mask is located at the lateral geniculate nucleus of the thalamus. The lateral primary visual cortex is used as the inclusion mask and the termination mask. The curvature threshold is 0.2. Top: no stopping and rejection criteria used, bottom: both the inclusion mask and the termination mask are used. Data is from the FSL tractography tutorial1.

5.5 Limitations

Even though tractography has demonstrated success in reconstructing and visualizing major white matter fiber bundles, it still has several limitations and the errors of an estimated white matter tract are generally unknown (Maier-Hein, Neher, Houde, et al., 2017). There are many potential sources of

1https://www.fmrib.ox.ac.uk/primers/intro_primer/ExBox8/IntroBox8.html

51 5. Tractography errors that can confound white matter tracking results. The main limitations are summarized below.

Validation and reliability Validity and reliability issues are hindering tractography’s widespread imple- mentation, which questions the real clinical value of tractography in a clinical and surgical setting.

Diffusion imaging artifacts The diffusion measurements are highly sensitive to a wide range of artifacts including acquisition noise, head motion, geometric distortions, and eddy- current distortions, as discussed in Chapter 4. These artifacts will impair the directional information derived from the diffusion model fitting. Very small perturbations in the diffusion data (i.e. noise, distortion, etc.) may leadto significant errors.

Fiber orientation errors One of the most challenging aspects of tractography is to estimate the local fiber orientations from the diffusion MR images based on a certain diffusion model. It is very difficult to obtain accurate fiber orientations given thata single white matter voxel can contain hundreds of thousands of fibers. This means that different fiber configurations such as bending, fanning, crossing, and kissing of fibers can all give rise to the same diffusion measurements, mak- ing it impossible to distinguish between these cases at the local voxel level. One of the most widely used diffusion models for characterizing fiber orien- tation is the diffusion tensor, but it will only give an average orientation, i.e. there may not be any fiber in the direction of the principal eigenvector. More advanced diffusion models such as DSI and QBI represent fiber orientations as a continuous function (i.e. ODF), making it possible to perform tractography even in white matter regions with complex fiber architecture. However, these advanced models require a very large number of measurements to reconstruct the 3D diffusion propagator, resulting in a long scan time. Even the advanced diffusion models are still based on simplified approximations of the physical reality. As such, these models will still be subject to modeling errors.

Streamline error accumulation Errors in streamline-based tractography algorithms will be accumulated as the streamline propagates. Errors are large in gray matter, CSF and some white matter regions where more than one fiber bundle may exist.

52 6 Uncertainty in diffusion models

In these matters the only certainty is that nothing is certain. — Pliny the Elder

6.1 Introduction

Uncertainty and its representation play an important role in inferring useful information from diffusion data. In diffusion MRI, researchers attempt to infer information about, for example, diffusion anisotropy or the underlying fiber tract direction, by fitting models of the diffusion and measurement pro- cesses to the diffusion data. In this scheme, uncertainty is caused bothby the noise and artifacts present in the diffusion data, but also by the incom- plete modeling of the diffusion signal. To use diffusion MRI metrics as clinical biomarkers, it is necessary to estimate the metric itself as well as its uncer- tainty. Without an uncertainty measure it is difficult to know if a subject, or a specific brain region, for example has a lower non-Gaussianity, or ifitis lower by random chance. For group analyses, an uncertainty for each subject and voxel can be used to downweight outliers or subjects with more uncer- tain estimates (e.g. due to more severe head motion). Diffusion data analysis pipelines are based on a number of parameters and settings. An uncertainty measure can be used to select the parameters and settings that result in the lowest uncertainty. Similarly, it becomes possible to investigate how much the uncertainty is increased when the amount of diffusion data is reduced (to decrease MR scanner time). Furthermore, an uncertainty measure can be used for comparing different softwares and preprocessing pipelines, as a better pipeline should result in a lower uncertainty.

53 6. Uncertainty in diffusion models

6.2 Bootstrap

Bootstrap was proposed by Efron (1979) inspired by earlier work on the jack- knife (Quenouille, 1949). The main idea behind bootstrap is simple, but the calculations can be time consuming. To estimate the uncertainty of the mean of N samples, one simply performs resampling (with replacement) a large number of times (e.g. 1000), and calculates the sample mean for each repli- cation. The standard deviation of the 1000 means can then be seen as the uncertainty of the mean value. For neuroimaging data, the process is repeated for all brain voxels (20,000 - 100,000) which leads to long processing times. For diffusion MRI, there are three possible ways of using the bootstrap; repetition bootstrap, residual bootstrap and wild bootstrap.

Repetition bootstrap The repetition bootstrap method requires multiple diffusion datasets collected with the same diffusion gradients to perform resampling. The uncertainty function samples are obtained by fitting a diffusion model to each of these datasets, and estimating the principal diffusion direction (or any other pa- rameter of interest, such as fractional anisotropy). However, a large number of data (e.g. 1000) repetitions would be required to obtain a good estimate of the uncertainty. It is expensive and time consuming to acquire so much diffusion data for each subject. An implementation of bootstrap sampling with replacement for the dif- fusion tensor model was first proposed in (Pajevic and Basser, 1999). To estimate the diffusion tensor D, 6 oblique diffusion gradients b plus b = 0 were used to simulate 7 diffusion measurements. For each of the 7 diffusion gradients b, simulation was repeated n times which yields a total 7n mea- surements called the original set. A bootstrap set was created by drawing n random samples with replacement from each of the seven b sets. By repeat- ing this step NB times, NB bootstrap sets were obtained. Distributions of quantities of interest, e.g. FA, MD, can then be reconstructed from the NB bootstrap samples. It was demonstrated that the probability distribution of the first tensor component Dxx obtained within a single voxel in gray matter followed a Gaussian distribution (Pajevic and Basser, 1999). Repetition bootstrap for diffusion tensor imaging was generalized in (Paje- vic and Basser, 2003). For the ith distinct diffusion gradient bi, the measure- ment is repeated ri times to form a set Si. To perform bootstrap sampling, ni measurements satisfying ni < ri will be drawn randomly with replacement from each set Si. It was reported that the reliability of this bootstrap method increased monotonically as ri increased, at the cost of longer acquisition time. The repetition bootstrap method has been successfully used in (Pajevic and Basser, 2003) to study the distribution of the 6 diffusion tensor components Dij, the trace Tr(D), the relative anisotropy RA and eigenvalues λi. It has

54 6.2. Bootstrap

been shown that the estimated probability density function of Dij for both white and gray matter voxels fits well to a Gaussian distribution. The esti- mated probability density function of the trace follows a Gaussian distribution for most white and gray matter voxels, but a Gaussian distribution is not as good for voxels in cerebrospinal fluid. Jones (2003) used repetition bootstrap to determine confidence intervals in fiber orientation by constructing maps of the “cone of uncertainty”. Similarly, repetition bootstrap has been applied to Q-ball imaging data to obtain uncer- tainty of fiber orientations (Hess, Mukherjee, Berman, et al., 2006; Berman, Mukherjee, Hess, et al., 2007). It is not practical to acquire a large number of diffusion datasets with the same diffusion gradients. For most clinical and research applications, itis more interesting to obtain a higher number of gradient directions, since dif- fusion parameter estimation can be more precise with high angular resolution diffusion imaging (HARDI). It has been shown that in order to obtain precise and robust estimates of the PDF for diffusion tensor derived parameters, at least four or five repetitions are needed (O’Gorman and Jones, 2006).

Residual bootstrap Freedman et al. (1981) developed residual bootstrap to extend the original bootstrap (Efron, 1979) to regression models. Residual bootstrap refers to the bootstrap resampling technique applied to the linear regression model, where the residuals (based on the initially fitted model) are resampled instead of the raw sample values. To be able to use bootstrap for diffusion data with many gradient directions, instead of repetitions of the same direction, residual bootstrap can be used with the assumption that the error terms have constant variance (homoscedasticity). While the assumption of a constant variance can be true for single shell diffusion data, where a single b-value is used, itis unlikely to be true for multi-shell diffusion data, since a higher b-value leads to a lower signal to noise ratio. In this case, raw residuals need to be modified to have constant variance. Using the residual bootstrap for diffusion tensor analysis was proposed in (Chung, Lu, and Henry, 2006). Residual bootstrap can be used in the homoscedastic condition, and also in the heteroscedastic condition as long as the heteroscedasticity can be modelled. In a diffusion experiment, the diffusion-weighted signal Si of the ith mea- surement in one voxel for the diffusion tensor model is given by = (− T ) = ⋯ Si S0 exp bgi Dgi , for i 1, 2, ,M, (6.1) where S is the signal without diffusion weighting, b is the diffusion weighting 0 ⎡ ⎤ ⎢D D D ⎥ ⎢ xx xy xz⎥ = ⎢ ⎥ × factor, D ⎢Dxy Dyy Dyz⎥ is the diffusion tensor in the form ofa 3 3 ⎢ ⎥ ⎣Dxz Dyz Dzz ⎦ positive definite matrix, gi is a 3 × 1 unit vector of the gradient direction,

55 6. Uncertainty in diffusion models and M is the total number of measurements. Using the log transform, the equation becomes

( ) = ( ) − T ln Si ln S0 bgi Dgi, (6.2) which, assuming additive noise on the log scale, can be structured into the well-known multiple linear regression form

y = Xβ + εε, (6.3)

T where y = [ln(S1), ln(S2), ⋯, ln(SM )] represents the logarithm of the mea- T sured signal, β = [Dxx,Dyy,Dzz,Dxy,Dxz,Dyz, lnS0] are the unknown re- gression coefficients, X is a M × 7 design matrix containing the different dif- fusion gradient directions,

⎡ 2 2 2 1 ⎤ ⎢ g g g 2g1xg1y 2g1xg1z 2g1yg1z − ⎥ ⎢ 1x 1y 1z b ⎥ ⎢ g2 g2 g2 2g g 2g g 2g g − 1 ⎥ = − ⎢ 2x 2y 2z 2x 2y 2x 2z 2y 2z b ⎥ X b ⎢ ⎥ , (6.4) ⎢ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⎥ ⎢ 2 2 2 − 1 ⎥ ⎣gMx gMy gMz 2gMxgMy 2gMxgMz 2gMygMz b ⎦

T and ε = [ε1, ε2, ⋯, εM ] are the error terms. The weighted least squares (WLS) estimate of β is given by

βˆ = (XT WX)−1XT Wy. (6.5)

To determine the weight matrix, we first need to calculate the ordinary least squares (OLS) estimate

ˆ T −1 T βOLS = (X X) X y, (6.6) and then calculate the OLS fitted diffusion measurements ˆ SˆOLS = exp(XβOLS). (6.7)

The weight matrix can then be expressed as (Chung, Lu, and Henry, 2006)

= (ˆ2 ) W diag SOLS . (6.8)

In order to perform residual bootstrap, the error terms εi need to be i.i.d. (independent and identically distributed) to satisfy the basic assumption of bootstrap. However, generally, the raw residuals ε do not satisfy this condi- tion. However, the variance of the residuals from WLS can be modelled as (Chung, Lu, and Henry, 2006)

σ2 Var(εi) = . (6.9) Si

56 6.2. Bootstrap

Therefore, raw residuals from WLS can be modified to have constant variance by the transformation

εˆi rˆi = , (6.10) − 1 1 2 ( − ) 2 wi 1 hi ˆ where εˆi = yi − xiβ is the residual of the WLS estimate, wi is the ith diagonal element of W and hi is the ith diagonal element of the matrix H given by H = X(XT WX)−1XT W. (6.11)

Finally, residual bootstrap resampling is defined as

∗ − 1 ∗ = ˆ + 2 yi xiβ wi εi , (6.12) ∗ ∗ where yi is the ith element of resampled y, xi is the ith row of X and εi is a random draw with replacement from the centered modified residuals ˆr − r¯. With this modified-variance based residual bootstrap, Chung, Lu, and Henry (2006) demonstrated that there was nearly no bias estimating the 95th percentile confidence interval of the angle of primary eigenvector e1, and very small root-mean-square error for high anisotropy of FA = 0.5/0.8 for simulated diffusion data with Rician noise as the source of uncertainty. Haroon, Morris, Embleton, et al. (2009) presented a study of an implemen- tation of residual bootstrap for Q-ball analysis. The residuals were obtained by a multi-tensor model fitting,

N T Si = S0 ∑ fn exp(−bgi Dngi), (6.13) n=1 where fn is defined as the fractional contribution of each diffusion tensor Dn, such that

N ∑ fn = 1. (6.14) n=1 The initial estimate of the number of intravoxel fiber populations was obtained from Q-ball analysis. In each of the gradient sampling directions, a residual εˆi is calculated according to

εˆi = Si − Sˆi. (6.15) For all voxels, residual bootstrap was performed by randomly selecting one of the εˆi and adding it to the predicted signal, i.e. ∗ = ˆ + ∗ Si Si εi , (6.16) ∗ where Si is the generated bootstrap sample of the diffusion-weighted signal for gradient direction gi. The diffusion data created by each residual boot- strap sampling was then processed with Q-ball to extract the estimates of

57 6. Uncertainty in diffusion models underlying fiber orientations. However, we know that the assumption ofho- moscedasticity is not valid when estimating the diffusion tensor using a linear regression. The violation is induced by applying the logarithm transform in order to achieve the linear relation. Haroon, Morris, Embleton, et al. (2009) applied the residual bootstrap to a tensor mixture model without discussing the heteroscedasticity issue, which make the results less reliable (Wegmann, Eklund, and Villani, 2017a).

Wild bootstrap The wild bootstrap, proposed originally by (Wu et al., 1986), is suited when residuals are heteroscedastic (i.e. have non-constant variance). Using the wild bootstrap for diffusion tensor analysis was proposed in (Whitcher, Tuch, and Wang, 2005). Wild bootstrap resampling is defined as ∗ = ˆ + ( ) ∗ yi xiβ f εˆi ui , (6.17) ( ) ∗ where f εˆi is a transformation of the ith residual εˆi and ui is a random variable with mean 0 and variance 1. Possible choices for f(εˆi) include √ N f(εˆ ) = εˆ , (6.18) i − i √N K 1 f(εˆi) = εˆi, (6.19) 1 − hi 1 f(εˆi) = εˆi, (6.20) 1 − hi where N is the number of diffusion measurements, K is the number of esti- mated parameters, and hi is the ith diagonal element of the (unweighted) hat matrix H defined by

H = X(XT X)−1XT . (6.21)

∗ The most common two-point distributions choices for ui are (Mammen et al., 1993; Davidson and Flachaire, 2008) ⎧ √ √ ⎪− 5−1 with probability √5+1 , u∗ = ⎨ √ 2 √2 5 (6.22) i ⎪ 5+1 with probability √5−1 ; ⎩ 2 2 5 −1 with probability 1 , ∗ = { 2 ui 1 (6.23) 1 with probability 2 . ( ) ∗ With this design, f εˆi ui will have constant variance if the residual terms are heteroscedastic. Berman, Chung, Hess, et al. (2007) and Berman, Chung, Mukherjee, et al. (2008) applied the wild bootstrap to assess uncertainty in ODF from the

58 6.3. Bayesian methods q-ball restriction. The raw noisy data were fitted to a fourth order spheri- cal harmonic representation and the residuals were then adjusted to have a constant variance as = εˆi rˆi 1 , (6.24) (1 − hi) 2 where hi is the ith diagonal element of the hat matrix H, given by = ( T )−1 T H ZQ ZQZQ ZQ, (6.25) where ZQ is constructed from the diffusion gradient directions (Hess, Mukher- jee, Han, et al., 2006). Although studies have shown that wild bootstrapping introduces bias and a level of error into the estimation of uncertainty, which is substantially reduced when using the residual bootstrap method (Chung, Lu, and Henry, 2006; Chung, Berman, Rae, et al., 2007), it is not always pos- sible to apply the residual bootstrap since the variance of the residuals from a model fitting sometimes cannot be adjusted to have a constant variance. To the best of our knowledge, most of the studies about bootstrap are for single- shell diffusion data (Pajevic and Basser, 2003; Heim, Hahn, Sämann, et al., 2004; Chung, Lu, and Henry, 2006; Berman, Chung, Mukherjee, et al., 2008; Yuan, Zhu, Ibrahim, et al., 2008; Zhu, Liu, Connelly, et al., 2008; Haroon, Morris, Embleton, et al., 2009). To study uncertainty of MAP-MRI, we generated 500 wild bootstrap sam- ples of three MAP-MRI metrics (RTOP, NG and PA) for four the MGH adult subjects from the Human Connectome Project (HCP) (Van Essen, Smith, Barch, et al., 2013). Diffusion tensor fitting, MAP-MRI fitting and bootstrap sampling were implemented using C++ and OpenMP to reduce the process- ing time. Figure 6.1 shows the standard deviation of RTOP, NG and PA for subjects MGH-1003, 1005, 1007 and 1010. There are two main clusters of voxels in the RTOP standard deviation maps wherein the white matter areas generally appear hyperintense, while the gray matter areas make up the lower intensity regions. A portion of the white matter regions shows higher standard deviation values for RTOP, most notably in the splenium of the cor- pus callosum. Within the corpus callosum, the splenium part shows a higher standard deviation than the genu and body of the corpus callosum. The stan- dard deviation maps for RTOP, NG and PA show marked differences across subjects.

6.3 Bayesian methods

Another approach for estimating uncertainty is to use Bayesian methods. In the Bayesian framework, the posterior distribution of the parameters given the data can be formulated as P (S∣β)P (β) P (β∣S) = , (6.26) P (S)

59 6. Uncertainty in diffusion models

Figure 6.1: Standard deviation of RTOP, NG and PA for subjects MGH-1003, 1005, 1007, 1010, slice 45, using 500 bootstrap samples. CSF voxels were removed prior to data analysis. where S is the diffusion data and β are the diffusion parameters to be esti- mated. Compared to frequentistic approaches (e.g. bootstrap), the researcher has to use informative or non-informative priors about the parameters of in- terest. A benefit of Bayesian methods is that the uncertainty is easy to obtain (i.e. without any time consuming calculations), as long as the diffusion model is linear (Sjölund, Eklund, Özarslan, et al., 2018).

Diffusion tensor model The noise is modelled separately for each voxel as independently identically distributed Gaussian, with a mean of zero and standard deviation of σ. The parameters of real interest in the tensor are the three eigenvalues and the three angles, defining the shape and orientation of the tensor. The setof

60 6.3. Bayesian methods parameters to be estimated in the diffusion tensor model is

β = (θ, ϕ, ψ, λ1, λ2, λ3, S0, σ), (6.27) where (θ, ϕ, ψ) are the three angles defining the shape and orientation of the tensor, λ1, λ2, λ3 are the three eigenvalues of the tensor, S0 is the non-diffusion weighted measurement. Please note that it is also possible to estimate the 6 tensor components Dxx,Dyy,Dzz,Dxy,Dxz,Dyz directly with the set of parameters

β = (Dxx,Dyy,Dzz,Dxy,Dxz,Dyz, S0, σ). (6.28)

The likelihood of seeing the data at each voxel S = [S1 S2 ⋯ SM ] given a diffusion model may now be written as

M P (S∣β) = ∏ P (Si∣β), (6.29) i=1

P (Si∣β) ∼ N (Sˆi, σ), (6.30) where M is the number of diffusion measurements, and Si and Sˆi are the measured and fitted values of the ith acquisition, respectively. Some non- informative choices of the prior distribution can be (Behrens, Woolrich, Jenk- inson, et al., 2003)

P (θ, ϕ, ψ) ∝ sin(θ), (6.31)

P (S0) ∼ U(0, ∞), (6.32)

P (λi) = Γ(aλ, bλ), (6.33) 1 P ( ) ∼ Γ(a , b ). (6.34) σ2 σ σ

Ball-and-stick model In the ball-and-stick model (Behrens, Woolrich, Jenkinson, et al., 2003; Behrens, Berg, Jbabdi, et al., 2007), the diffusion signal is modelled as

N N T S −bd −bg Dig = (1 − ∑ fi) e + (∑ fi) e , (6.35) S0 i=1 i=1 where N is the maximum number of fibers (sticks), d is a diffusivity parameter, fi represents the stick’s volume fraction and Di is the tensor of the i-th stick compartment. The stick tensor Di has one positive eigenvalue and two zero eigenvalues, pointing along the fiber direction (θ, ϕ). The parameter set β now has six parameters (θ, ϕ, S0, σ, d, f). Since the number of sticks (fibers) in each voxel is unknown, automatic relevance determination is used to empirically

61 6. Uncertainty in diffusion models estimate it. Some non-informative choices of the prior distribution can be (Behrens, Woolrich, Jenkinson, et al., 2003)

P (θ, ϕ) ∝ sin(θ), (6.36)

P (S0) ∼ U(0, ∞), (6.37) P (f) ∼ U(0, 1), (6.38)

P (d) = Γ(ad, bd), (6.39) 1 P ( ) ∼ Γ(a , b ). (6.40) σ2 σ σ

MCMC The Bayesian approach offers a way to estimate uncertainty by deriving the posterior probability from a prior probability and a likelihood function. The likelihood must include a parametric assumption of the relationship between the data and quantities of interest, and also about the noise distributions. The prior distribution describes information known about the parameters before any data are collected. The weakness of a Bayesian approach is that it requires explicit representations of all the assumptions made in the likelihood and prior distributions, thus a closed form expression for the posterior distribution is normally not available. Once the likelihood and prior distributions have been described mathe- matically, there are various of techniques to recover the posterior distribution by analytical solutions, functional approximations (e.g. variational Bayes), or efficient strategies for generating samples from the posterior (Markov chain Monte Carlo (MCMC)). MCMC methods comprise a class of algorithms for sampling from a pos- terior distribution. A commonly used example of random MCMC methods is Gibbs sampling (Geman and Geman, 1984). MCMC entails sampling from a conditional distribution given a multivariate distribution. Suppose we want to obtain N posterior samples of a parameter set β = (β1, β2, ⋯, βK ) from a joint posterior distribution p(β1, β2, ⋯, βK ). We denote the ith sample by i = ( i i ⋯ i ) β β1, β2, , βK . The MCMC process can then be described as 0 = ( 0 0 ⋯ 0 ) 1. Initialize β β1 , β2 , , βK . i+1 = ( i+1 i+1 ⋯ i+1) 2. Get the next sample β β1 , β2 , , βK based on the current i = ( i i ⋯ i ) = ⋯ − sample β β1, β2, , βK for i 0, 1, ,N 1. Each component of the parameter vector βi+1 is sampled from the full conditional distribution conditioned on all other components sampled so far. The full conditional distribution for the kth component can be written as ( i+1∣ i+1 i+1 ⋯ i+1 i ⋯ i ) P βk β1 , β2 , , βk−1, βk+1, , βK .

3. Repeat step 2 N times.

62 6.4. Reducing uncertainty using a spatial model

Because of its mathematical simplicity, most applications of Bayesian methods to diffusion MRI have used a MCMC estimation process. MCMC is guaranteed to sample from the true posterior distribution if given enough time. However, like all sampling techniques MCMC is computationally ex- pensive, and it can sometimes be difficult to ascertain when convergence to this true distribution has been achieved, especially if convergence has to be examined for each voxel.

6.4 Reducing uncertainty using a spatial model

The spatial information present in diffusion MRI data is ignored by widely used voxel-by-voxel diffusion tensor estimation algorithms, e.g. ordinary least squares (OLS) and weighted least squares (WLS) (Chung, Lu, and Henry, 2006), which assume that the data in each voxel is independent. This is in contrast to functional MRI, where Gaussian smoothing is a standard pre- processing step. Algorithms which assume independent voxels make the pa- rameter estimation more vulnerable to the image noise. The robustness and reliability of the estimated diffusion parameters can be improved by incorpo- rating spatial information in the parameter estimation. Spatial regularization is a technique that encourages similarity between pa- rameters over a neighborhood of voxels. This is typically done by minimizing a cost function which is often the sum of a data error term and a regulariza- tion term (Poupon, Clark, Frouin, et al., 2000; Wang, Vemuri, Chen, et al., 2003; Raj, Hess, and Mukherjee, 2011). One of the drawbacks of this method is that the weight of the regularization term must be tuned in order to find the optimal solution. However, determining the appropriate regularization weight is a difficult problem, and there is no general solution. An alternative but similar concept is the use of probabilistic spatial priors in a Bayesian framework to spatially regularize the model fitting procedure (Walker-Samuel, Orton, Boult, et al., 2011; Sidén, Eklund, Bolin, et al., 2017; Demiralp and Laidlaw, 2011; King, Gadian, and Clark, 2009; Poupon, Man- gin, Clark, et al., 2001). The regularization strength is then governed by a hyperparameter that scales the prior precision matrix which can be estimated directly during the fitting procedure, rather than being estimated separately. Poupon, Mangin, Clark, et al. (2001) used a Markov random field (MRF) framework to obtain a regularized fiber orientation map, in order to improve diffusion tractography. Martı́n-Fernández, San Josá-Estépar, Westin, etal. (2003) and Martı́n-Fernández, Westin, and Alberola-López (2004) both pro- posed Gaussian and non-Gaussian MRF approaches for the 2D spatial regu- larization of diffusion tensor fields. Martı́n-Fernández, Westin, and Alberola- López (2004) extended the 2D Gaussian MRF method to 3D using multivari- ate Gaussian Markov random field (GMRF) ideas. The maximum a posteri- ori (MAP) estimator was found by the simulated annealing algorithm in all

63 6. Uncertainty in diffusion models the three approaches. King, Gadian, and Clark (2009) tackled the crossing- fiber problem using an MRF model and MCMC sampling for the multi-fiber ball-and-stick model outlined in (Behrens, Berg, Jbabdi, et al., 2007), which offered a useful solution for the crossing-fiber problem in diffusion tractogra- phy. Walker-Samuel, Orton, Boult, et al. (2011) proposed a MCMC method using GMRF to incorporate spatial information when estimating the apparent diffusion coefficient (ADC). To use the spatial dependency between the signals in neighbouring vox- els for accurate inference, we extend a MCMC algorithm with spatial priors, previously used for fMRI analysis (Sidén, Eklund, Bolin, et al., 2017), to dif- fusion MRI. A preconditioned conjugate gradient (PCG) approach is applied to make the MCMC algorithm a practical option for whole brain 3D analysis. We calculate fractional anisotropy (FA) maps from three diffusion tensor es- timation methods: least-squares, and MCMC with and without spatial priors for various b-values. The coefficient of variation (CV) is calculated to measure the dispersion of the FA maps from the MCMC samples.

Diffusion tensor estimation with spatial priors We will consider diffusion data containing M volumes with N voxels ordered in a M ×N matrix Y. The linear regression model can be used to simultaneously estimate the parameters for all voxels, according to

Y = XB + E, (6.41) where B is a 7 × N matrix of regression coefficients and E is a M × N matrix of error terms. Here we assume that the noise in each voxel is modelled as independent and identically distributed (i.i.d.) Gaussian noise. The likelihood then becomes (Sidén, Eklund, Bolin, et al., 2017)

N ( ∣ ) = N ( −1 ) p Y B, λ ∏ Y⋅,n; XB⋅,n, λn IM , (6.42) n=1 with Y⋅,n and B⋅,n denoting the nth column of Y and B, λn is the noise precision of voxel n and IM is the M × M identity matrix. The spatial part of the model is incorporated via the following prior on the regression coefficients B

7 ( ∣ ) = ( T ∣ ) T ∣ ∼ N ( −1 −1) p B α ∏ p Bk,⋅ αk , Bk,⋅ αk 0, αk Dw , (6.43) k=1 T × where Bk,⋅ denotes the transposed kth row of B, Dw is a N N spatial T precision matrix and α = [α1, α1, ⋯, α7] are hyperparameters that control the smoothness, which are to be estimated from the data for each regressor in X. We choose the unweighted graph-Laplacian (UGL) (Penny, Trujillo- Barreto, and Friston, 2005) which has 6’s on the diagonal when modeling the

64 6.4. Reducing uncertainty using a spatial model

3D brain and 4’s when modeling a 2D slice, and Dw(i, j) = −1 if i and j are adjacent. The log likelihood can be expressed as

M N log p(Y∣B,λλ) = ∑ log(λn) 2 n=1 N − 1 ( T − T + T T ) ∑ λn Y⋅,nY⋅,n 2Y⋅,nXB⋅,n B⋅,nX XB⋅,n , (6.44) 2 n=1 where we have ignored everything that is constant with respect to the param- eters. Since Y and X will not change during the MCMC algorithm, quantities such as XT X can be effectively pre-computed, removing the fourth dimension from the likelihood which leads to significant speed up. We obtain closed form expressions for all full conditional posteriors, and can therefore perform Gibbs sampling. The full conditional posterior of B is given by 1 log p(B∣Y,λλλ,αα) = − wT Bw˜ + bT w , (6.45) 2 r r w r

T T where bw = vec(diag(λ)Y X), B˜ = X X ⊗ diag(λ) + diag(α) ⊗ Dw, wr = vec(BT ). The full conditional posterior of λ can be written as

N N λn log p(λ∣Y, B,αα) = (u˜2 − 1) ∑ log(λn) − ∑ , (6.46) n=1 n=1 u˜1n

1 1 T T T T T 1 M where = (Y⋅ Y⋅,n − 2Y⋅ XB⋅ + B⋅ X XB⋅,n)+ , u˜2 = +u2. The u˜1n 2 ,n ,n ,n ,n u1 2 full conditional posterior of α is given by

7 7 αk log p(α∣Y, B,λλ) = (q˜2 − 1) ∑ log(αk) − ∑ , (6.47) k=1 k=1 q˜1k

1 1 T T 1 N where = (Y⋅ XB⋅ ) + , q˜2 = + q2. We refer to Sidén, Eklund, q˜1k 2 ,n ,n q1 2 Bolin, et al. (2017) for further details, and how to use sparse pre-conditioned conjugate gradient (PCG) to work with large covariance matrices (for 100,000 voxels and 7 regressors, the covariance matrix is 700, 000 × 700, 000). We use the MGH adult diffusion dataset from the Human Connectome Project (HCP) (Van Essen, Smith, Barch, et al., 2013). The number of MCMC samples we use is 1000, and the number of samples for burn-in is 500. We use the CV (the ratio of the standard deviation to the mean) to measure the uncertainty of the FA maps calculated from the MCMC samples, as showed in Figure 6.2. A lower uncertainty is found in the first row for MCMC with 3D spatial priors. For b = 0/1000 and 0/3000 s/mm2, the de- crease in uncertainty was clearly higher than other combinations of b-values.

65 6. Uncertainty in diffusion models

Figure 6.2: CV of FA maps calculated from the MCMC samples. First row: MCMC with 3D spatial priors, second row: MCMC without 3D spatial priors, third row: ratios of CV of FA maps calculated from MCMC without and with 3D spatial priors. Columns from left to right: b = 0/1000, 0/3000, 0/5000, 0/10, 000 s/mm2 and the entire dataset.

66 7 Deep learning in diffusion MRI

But they are useless. They can only give you answers. — Pablo Picasso

7.1 Introduction

Accurate mathematical models that express relationships to a diffusion weighted signal are sometimes intractable. In such situations, machine learn- ing methods, and especially deep learning methods, can be a better choice (Le- Cun, Bengio, and Hinton, 2015). Machine learning methods are data-driven computational methods that model the output from its input while making the smallest number of assumptions possible. For example, Rajpurkar, Irvin, Zhu, et al. (2017) trained a convolutional neural network with 121 layers, using 100,000 X-ray images, to enable automatic detection of 14 different diseases, and the performance was similar or better compared to a radiologist. Using machine learning for is not a new concept, but the increasing amount of available training data in combination with faster computers (es- pecially graphics cards (Eklund, Dufort, Forsberg, et al., 2013)) has made it practically possible to train very advanced machine learning models. In med- ical imaging, deep learning has been focused extensively on lesion detection and segmentation (Greenspan, Van Ginneken, and Summers, 2016), since this is a hard problem, but deep learning has been applied to many different appli- cations. In diffusion MRI, deep learning has been used for diffusion parameter estimation (Bertleff, Domsch, Weingärtner, et al., 2017; Golkov, Dosovitskiy, Sämann, et al., 2015; Koppers and Merhof, 2016), image quality enhancement (Blumberg, Tanno, Kokkinos, et al., 2018; Tanno, Worrall, Ghosh, et al., 2017;

67 7. Deep learning in diffusion MRI

Albay, Demir, and Unal, 2018), and data synthesis (Koppers, Haarburger, and Merhof, 2016). Deep learning has also been used for clinical purposes, such as lesion size estimation (Bagher-Ebadian, Jafari-Khouzani, Mitsias, et al., 2011), survival time prediction (Schuster, Hardiman, and Bede, 2017; Nie, Zhang, Adeli, et al., 2016) and cancer grading (Abraham and Nair, 2018), using diffusion MR images in combination with other MR images.

7.2 Architectures

Artificial Neural Network The first artificial neural network (ANN), called a perceptron, was invented by Rosenblatt (1958). It was intended to model how the human brain pro- cessed visual data and learned to recognize objects. By the late 1980s, many researchers were using ANNs for a variety of purposes and this architecture became a hot topic in the field of artificial intelligence. An ANN is an ensem- ble of perceptrons, where each perceptron is defined by a non-linear activation function f and by two set of parameters: W (the weights) and b (the bias). The output of each neuron is the linear combination of the input x with the W added to the bias b, followed by the application of the non-linear activation function f (e.g. a sigmoid or a hyperbolic tangent function). Mathematically this can be expressed as

z = Wx, (7.1) a = f(z). (7.2)

Once the ANN has been trained, for example through back propagation, the network works in a deterministic manner based on the optimized weights W.

Deep Neural Network Many hidden layers can be added to an ANN, defining a deep architecture referred to as a deep neural network (DNN). In general, a DNN includes an input layer, multiple hidden layers, and an output layer, but there is no clear definition regarding how many layers that are required to obtain adeep network.

Convolutional Neural Network For image data, the number of nodes and the number of parameters required by a DNN can become very large, and it will be difficult to train such a network. For example, a fully connected network that works on a 1,000 × 1,000 pixels image needs to learn 1,000,000 weights per node per layer. Con- volutional neural networks (CNNs) have been proposed exactly to overcome this issue. In a CNN, a number of filters are applied in every node in every

68 7.2. Architectures layer of the network, to derive important features that are then used by a final fully connected layer, see Figure 7.1. For a convolutional layer with 100 nodes with filters of size 3 × 3, only 1000 weights have to be learned (900 filter weights plus 100 bias weights). CNNs are usually trained through back propagation, which requires many training images with ground truth. CNNs became popular in 2012, when they provided state of the art performance for the ImageNet database with 1.3 million images and 1000 classes (Krizhevsky, Sutskever, and Hinton, 2012).

Figure 7.1: A CNN architecture where feature extraction by filters is learned through back propagation. Pooling operations, such as max or mean in a region, are com- monly used to reduce the number of pixels in each layer of the network.

Autoencoder An autoencoder uses an unsupervised learning strategy to train the network. This procedure enables to encode the data in a lower dimensional space and extract the most discriminative features. An autoencoder can be divided into two parts: encoder and decoder. The encoder (denoted as f(x)) is used to generate a reduced feature representation from an initial input x, and the decoder (denoted as g(f(x))) is used to reconstruct the initial input from the output of the encoder by minimizing the loss function L(x, g(f(x))).

Generative Adversarial Networks Generative adversarial networks (GANs) is one of the most important ideas in machine learning in the last 20 years (Goodfellow, Pouget-Abadie, Mirza, et al., 2014). A GAN contains a generator, which is trained to generate as realistic images as possible, and a discriminator, which is trained to classify

69 7. Deep learning in diffusion MRI if an image is real or synthetic, see Figure 7.2. For images, both the gener- ator and the discriminator is a CNN. GANs can for example be trained to synthesize very realistic faces (Karras, Aila, Laine, et al., 2018). GANs can broadly be divided into noise-to-image GANs (generating new images from a noise vector) and image-to-image GANs (transforming a given image into a new image). GANs have already been widely used for medical image process- ing applications, such as denoising, reconstruction, segmentation, detection, classification and image synthesis. However, GANs for medical image trans- lation is still rather unexplored, especially for cross-modality translation of MR images (Yi, Walia, and Babyn, 2019).

Figure 7.2: A noise-to-image GAN contains a generator and a discriminator, and can be trained to generate realistic synthetic images.

A CycleGAN (Zhu, Park, Isola, et al., 2017) can be trained using two unpaired groups of images, to translate images between domain A and domain B. A CycleGAN usually consists of four main components, two generators (GA2B and GB2A) and two discriminators (DA and DB), as shown in Figure 7.3. The two generators synthesize domain A/B images based on images from domain B/A. The two discriminators make the judgement if the input images are real or synthetic. The translation between the two image domains is regularized by the cycle consistency, expressed as

GB2A(GA2B(IA)) ≈ IA, (7.3)

GA2B(GB2A(IB)) ≈ IB, (7.4) where IA and IB are two images from domain A and B. The loss function contains two terms: adversarial loss and cycle loss, and can be written as (Zhu, Park, Isola, et al., 2017)

2 2 Ladv =Ea∈A[(DA(a) − 1) ] + Eb∈B[(DB(b) − 1) ] (7.5) 2 2 + Ea∈A[DB(GA2B(a)) ] + Eb∈B[DA(GB2A(b)) ] 2 2 + Ea∈A[(DB(GA2B(a)) − 1) ] + Eb∈B[(DA(GB2A(b)) − 1) ],

Lcyc = Ea∈A[∣GB2A(GA2B(a)) − a∣] + Eb∈B[∣GA2B(GB2A(b)) − b∣]. (7.6)

70 7.3. Applications

Figure 7.3: The CycleGAN model contains two generators GA2B ∶ A → B and GB2A ∶ B → A, and two discriminators DA and DB for image-to-image translation. A cycle loss is introduced to capture the difference between the original data and the reconstructed data.

The adversarial loss encourages the discriminators to approve the real images of the corresponding groups, and to reject images generated by the corre- sponding generators. The generators are also encouraged to generate images that can fool the discriminators. The cycle loss guarantees that the image can be reconstructed from the other domain, as stated in Equation 7.4. The total loss is the sum of the adversarial loss and the cycle loss

Ltotal = Ladv + Lcyc. (7.7)

7.3 Applications

Diffusion parameter estimation The task of estimating a vector of diffusion parameters P from a vector of signal measurements S can be formalized as

P = f(S). (7.8)

The diffusion parameter vector P can represent diffusion model parameters, e.g. diffusion tensor elements or diffusion model scalars, e.g. FA, oracombi- nation of the two categories. The operation f can represent diffusion model fitting, diffusion scalar calculation, or a joint step of the two. The objective of a deep learning method is to adjust the weights of a neural network, repre- senting f(⋅), such that the output of the network well approximate the target vector P. ANNs have been used for voxel‐wise parameter estimation for the com- bined intravoxel incoherent motion (IVIM) and kurtosis model facilitating robust diffusion parameter mapping in the human brain (Bertleff, Domsch, Weingärtner, et al., 2017). This model can be written as

K S(b) = S ⋅ (f ⋅ exp (−bD∗) + (1 − f) ⋅ exp (−bD + b2D2)) , (7.9) 0 6 where S0 is the signal intensity without diffusion weighting, f is the volume fraction of the perfusion‐dominated compartment, D∗ is the pseudo-diffusion

71 7. Deep learning in diffusion MRI coefficient, D represents the ordinary diffusion coefficient and thekurtosis K is a measure of non‐Gaussian diffusion behavior. The parameters to be estimated are thus

⎛ S0 ⎞ ⎜ ⎟ ⎜ f ⎟ ⎜ ⎟ P = ⎜ D ⎟ . (7.10) ⎜ ⎟ ⎜ D∗ ⎟ ⎝ k ⎠ Training data were simulated according to Equation 7.9, by varying the five parameters individually and then adding Rician noise (Wegmann, Eklund, and Villani, 2017b; Gudbjartsson and Patz, 1995). A feed‐forward ANN with one hidden layer and 10 nodes was used. The activation functions between hidden layers were rectified linear units (ReLUs) (Nair and Hinton, 2010), which are expressed as

f(z) = max(0, z). (7.11)

Compared with state‐of‐the‐art least squares regression (LSR) and multi-step least squares regression (LSR-MS), the proposed ANN approach demonstrated better accuracy, and further reduced intra‐subject variations and outliers. Golkov, Dosovitskiy, Sperl, et al. (2016) proposed to perform diffusion model fitting and scalar calculation as a single optimized step. Anetwork was trained to predict diffusion scalars directly from the subset Sα, to achieve a scan time reduction factor of n/α where n is the total number of measure- ments and α is the number of measurements in the subset. The ANN used has three hidden layers, each consisting of 150 hidden units with ReLU activation functions. It has previously been recommended to use 150 and 88 measure- ments for the DKI and NODDI model, respectively (Lu, Jensen, Ramani, et al., 2006; Zhang, Schneider, Wheeler-Kingshott, et al., 2012) whereas Golkov, Dosovitskiy, Sperl, et al. (2016) demonstrated that their method worked with only 12 measurements for DKI and 8 measurements for NODDI, resulting in a twelve-fold reduction of scan time and strongly improved clinical feasibility. Koppers and Merhof (2016) presented an approach for estimating the fiber orientation distribution functions (fODF) directly from raw diffusion data within a single voxel, by converting the model fitting process into a classifi- cation problem based on a CNN, where each class defines a certain direction. For this purpose one hemisphere of the fODF was equidistantly sampled for 250 possible fiber directions. Results showed that this method outperformed the ball-and-stick model on real data.

Image quality transfer Deep learning has shown promise in the task of transferring low quality images to high quality images, i.e. image quality transfer (IQT) (Blumberg, Tanno,

72 7.3. Applications

Kokkinos, et al., 2018). The proposed memory-efficient 3D CNN was used to enhance the resolution of diffusion MR images. The training data was cre- ated by sampling subvolumes from the ground truth high resolution diffusion images, which were downsampled to generate the low resolution counterparts. Downsampling was performed by taking a block-wise mean. Tanno, Worrall, Ghosh, et al. (2017) proposed an implementation of Bayesian IQT via 3D CNNs in which they introduced two uncertainty mea- sures: intrinsic uncertainty and parameter uncertainty. The intrinsic uncer- tainty quantifies the irreducible variance of the statistical mapping from low- resolution to high resolution, by introducing a heteroscedastic noise model. The resolution and parameter uncertainty quantifies the degree of ambiguity in the model parameters that best explain the observed data through varia- tional dropout (Srivastava, Hinton, Krizhevsky, et al., 2014). The training data was created the same way as in (Blumberg, Tanno, Kokkinos, et al., 2018). In medical imaging, GANs have also shown its success in the super- resolution task applied to diffusion MR images. Albay, Demir, and Unal (2018) utilized a GAN to obtain a higher spatial resolution. The introduced GAN takes a down-sampled low resolution slice of diffusion data and synthe- sizes its high resolution counterpart. The results (FA maps, color FA maps, reconstructed tensors and ODFs) demonstrate that GANs can be trained to create high resolution data using low resolution input.

Synthesis Koppers, Haarburger, and Merhof (2016) proposed an approach based on DNNs to learn the non-linear mapping between different shells of diffusion signals, which makes it possible to augment existing shells by predicting fur- ther shells from those that have actually been acquired. The proposed DNN is designed to predict spherical harmonics coefficients representing a HARDI signal on one shell, from HARDI signals on one or more other shells of the same voxel. Diffusion MR signals can be represented by Spherical harmonics (SH) according to (Descoteaux, Angelino, Fitzgibbons, et al., 2006) S = BC, (7.12) where S is the diffusion signal, B is the SH basis and C is the SH coefficient vector. The SH coefficients C are estimated for every shell using a least- squares fit with regularization −1 C = (BT B + λL) BT S, (7.13) where λ is a regularization parameter. The DNN consists of an input layer, three hidden layers and an output layer. The DNN takes diffusion signals as input and predicts SH coefficients as output. This approach can be particu- larly useful to reduce the acquisition time of multi-shell diffusion data.

73 7. Deep learning in diffusion MRI

Implementations of CycleGAN and unsupervised image-to-image transla- tion (UNIT) for 2D T1-T2-weighted translation were reported in (Welander, Karlsson, and Eklund, 2018), and results showed that visually realistic syn- thetic T1- and T2-weighted images can be generated from the other modality, proven via a perceptual study. In (Dar, Yurt, Karacan, et al., 2018) a condi- tional GAN was proposed to do the translation between T1- and T2-weighted images, in which a probabilistic GAN (PGAN) and a CycleGAN were trained for paired and unpaired source-target images, respectively. The proposed GAN demonstrated visually and quantitatively accurate translations for both healthy subjects and glioma patients. A patch-based conditional GAN (Olut, Sahin, Demir, et al., 2018) was proposed to generate magnetic resonance angiography (MRA) images from T1- and T2-weighted images jointly. The steerable filter responses were incorporated in the loss function to extract directional features of the vessel structure. MR image translation based on downsampled images was investigated in (Dar, Yurt, Shahdloo, et al., 2018), to reduce scan time. Three different types of input were fed to the GANs: downsampled target images, downsampled source images, and downsampled target and source images jointly. It was demonstrated that the GAN with the combination of downsampled target and source images as the input out- performed its two competitors, which resulted in a reduction of the scan time up to a factor of 50. A 3D conditional GAN (Yu, Zhou, Wang, et al., 2018) was applied to synthesize FLAIR images from T1-weighted images, and the synthetic FLAIR images improved brain tumor segmentation, compared with using only T1-weighted images. Similarly, Abramian and Eklund (2019) used a 3D CycleGAN to synthesize fMRI images from T1-weighted images. Translation between structural and diffusion images can be achieved us- ing CycleGAN. We used diffusion and T1-weighted images from the Human Connectome Project (HCP)1 (Van Essen, Smith, Barch, et al., 2013; Glasser, Sotiropoulos, Wilson, et al., 2013) for 1065 healthy subjects. We followed the network architecture design given in the original CycleGAN paper (Zhu, Park, Isola, et al., 2017). We used 2 feature extraction convolutional layers, 9 residual blocks and 3 deconvolutional layers for the generators. For the discriminators we used 4 feature extraction convolutional layers and a final layer to produce a one-dimensional output. We trained the network with a learning rate of 0.0004. We kept the same learning rate for the first half of the training, and linearly decayed the learning rate to zero over the sec- ond half. A total of 1000 subjects were used for training, and 65 subjects were used for testing. An Nvidia Titan X Pascal graphics card was used to

1Data collection and sharing for this project was provided by the Human Connectome Project (U01-MH93765) (HCP; Principal Investigators: Bruce Rosen, M.D., Ph.D., Arthur W. Toga, Ph.D., Van J.Weeden, MD). HCP funding was provided by the National Institute of Dental and Craniofacial Research (NIDCR), the National Institute of Mental Health (NIMH), and the National Institute of Neurological Disorders and Stroke (NINDS). HCP data are disseminated by the Laboratory of Neuro Imaging at the University of Southern California.

74 7.3. Applications

train the network. Figure 7.4 shows the qualitative results of T1-to-FA image translation for 4 test subjects. The results show a good match between the synthetic FA images and their ground truth, for both texture of white matter tracts and global content. However, when compared with the ground truth, it is observed that the synthetic FA images have a reduced level of details on white matter tracts. The difference image shows the absolute error be- tween synthetic and real FA images. Results of T1-to-MD image translation are also shown in Figure 7.4. The synthetic MD images demonstrate great visual similarity to the ground truth. The CSF region and its boundaries are accurately synthesized. While the synthetic FA images appear realistic, the

Figure 7.4: T1-to-FA/MD image translation results for 4 subjects. First row: True T1-weighted images, second row: true FA/MD images, third row: synthetic FA/MD images, fourth row: Difference between real and synthetic FA/MD. training of the GAN will depend on the training data used. For example, if the GAN is trained using data from healthy controls it is likely that the GAN will be biased for brain tumor patients, and can for example remove existing tumors (Cohen, Luck, and Honari, 2018). Future research may focus on creating other diffusion-derived scalar maps from more advanced diffusion models, such as mean apparent propagator (MAP) MRI. In this work, we have only used 2D CycleGAN, but it has been reported that 3D GANs using spatial information (Nie, Trullo, Lian, et al., 2017) across slices yield better mappings between two domains (at the cost of a higher memory consumption and a higher computational cost). A comparison study of image translation using 2D and 3D GANs is thus worth looking into.

75 7. Deep learning in diffusion MRI

Clinical applications To determine the final size of an ischemic lesion, an ANN was trained topre- dict the 3-month chronic T2-weighted images using diffusion, T1-, T2-weighted and proton density images (Bagher-Ebadian, Jafari-Khouzani, Mitsias, et al., 2011). The trained ANN produced maps of predicted outcome that were well correlated (r = 0.80, p < 0.0001) with the T2-weighted images at 3 months for all test patients. Schuster, Hardiman, and Bede (2017) proposed to use clinical character- istics in combination with diffusion MR images and T1-weighted images to predict survival of amyotrophic lateral sclerosis patients using deep learning. Clinical data alone had a survival prediction accuracy of 66.67%. Deep learn- ing prediction accuracy increased greatly to 79.17% by combining the three sources of information. Similarly, Nie, Zhang, Adeli, et al. (2016) proposed using 3D CNNs to automatically extract features from multi-modal preoper- ative brain images, i.e. T1-weighted, functional and diffusion MR images of high-grade glioma patients. The CNN extracted features, along with the piv- otal clinical features, were used to train a support vector machine to predict if the patient has a long or short overall survival time. It was demonstrated that using the extracted features from the trained 3D CNN can achieve an accuracy as high as 89.95%. In contrast, with hand crafted features alone, the accuracy obtained was just 62.96%. To determine the grade group in prostate cancer using multi-parametric MR (T2- and diffusion weighted images) biomarkers, Abraham and Nair(2018) investigated using an autoencoder to extract high-level features from hand- crafted texture. This method achieved first place winning over 43 methods submitted by 21 groups.

76 8 Summary of papers

The end of all our exploring will be to arrive where we started. — Thomas Stearns Eliot

8.1 Introduction

In this chapter, we summarize the included papers.

8.2 Paper I - Repeated tractography of a single subject: how high is the variance?

In this paper, we investigated the test-retest reliability of diffusion tractogra- phy, using 32 diffusion datasets from a single healthy subject. Tractography was carried out using FSL and Dipy. Two seed masks (corticospinal and cin- gulum gyrus tracts) were created from the JHU White-Matter Tractography atlas. The standard deviation, the coefficient of variation (CV) and the Dice similarity coefficient (DSC) were used to examine the reproducibility. Our results show that the multi-fiber model in FSL is able to reveal more con- nections between brain areas, compared to the single fiber model, and that distortion correction increases the reproducibility.

77 8. Summary of papers

8.3 Paper II - Bayesian diffusion tensor estimation with spatial priors

In this paper, we proposed a fast Markov chain Monte Carlo (MCMC) method for diffusion tensor estimation, for both 2D and 3D spatial priors. The regu- larization parameter was jointly estimated with the tensor using MCMC. We compared fractional anisotropy (FA) maps for various b-values using three dif- fusion tensor estimation methods: least-squares and MCMC with and without spatial priors. Coefficient of variation (CV) was calculated to measure theun- certainty of the FA maps calculated from the MCMC samples. Our results show that the MCMC algorithm with spatial priors provides a denoising effect and reduces the uncertainty of the MCMC samples.

8.4 Paper III - Multi-fiber estimation and tractography for diffusion MRI using mixture of non-central Wishart distributions

In this paper, we estimated the orientations of crossing fibers within a voxel using a continuous mixture of diffusion tensors model in which the mixing distribution is a non-central Wishart distribution. We compared the results with other fiber orientation reconstruction methods using synthetic data with different angles between crossing fibers and different noise levels. Wecon- ducted tractography for real human data. Results show that the model with a non-central Wishart distribution was comparable and, in some cases, better than the state-of-the-art algorithms.

8.5 Paper IV - Using the wild bootstrap to quantify uncertainty in mean apparent propagator MRI

In this paper, the uncertainty of different MAP-MRI metrics was quanti- fied by estimating the empirical distributions using the wild bootstrap. We applied the wild bootstrap to both phantom data and human brain data, and obtain empirical distributions for the MAP-MRI metrics return-to-origin probability (RTOP), non-Gaussianity (NG) and propagator anisotropy (PA). We demonstrated the impact of diffusion acquisition scheme (number of shells and number of measurements per shell) on the uncertainty of MAP-MRI met- rics. We demonstrated how the uncertainty of these metrics can be used to improve group analyses, and to compare different preprocessing pipelines.

78 8.6. Paper V - Generating diffusion MRI scalar maps

8.6 Paper V - Generating diffusion MRI scalar maps

from T1 weighted images using generative adversarial networks

In this paper, we demonstrated how Generative Adversarial Networks (GANs) can be used to generate synthetic diffusion scalar maps from structural T1- weighted images. Specially, we trained the popular CycleGAN model to learn to map a T1 image to fractional anisotropy (FA) and mean diffusivity (MD), and vice versa. As an application, we showed that synthetic FA images can be used as a target for non-linear registration, to correct for geometric distortions common in diffusion MRI.

8.7 Paper VI - Evaluation of six phase encoding based susceptibility distortion correction methods for diffusion MRI

In this paper we quantitatively evaluated six popular phase encoding based methods for correcting susceptibility distortions in diffusion MRI data. We employed a framework that allows for the simulation of realistic diffusion MRI data with susceptibility distortions. We evaluated the ability for methods to correct distortions by comparing the corrected data with the ground truth. We also validated two popular indirect metrics using both simulated data and real data. We found that EPIC, HySCO and TOPUP offered better correction than the other three methods. We suggest that indirect metrics must be interpreted cautiously when evaluating methods for correcting susceptibility distortions in diffusion MRI data.

79

Bibliography

Abraham, Bejoy and Madhu S Nair (2018). “Computer-aided classification of prostate cancer grade groups from MRI images using texture features and stacked sparse autoencoder”. In: Computerized Medical Imaging and Graphics 69, pp. 60–68. Abramian, David and Anders Eklund (2019). “Generating fMRI volumes from T1-weighted volumes using 3D CycleGAN”. In: arXiv preprint arXiv:1907.08533. Albay, Enes, Ugur Demir, and Gozde Unal (2018). “Diffusion MRI spatial super-resolution using generative adversarial networks”. In: International Workshop on Predictive Intelligence In Medicine. Springer, pp. 155–163. Aldroubi, Akram and Peter Basser (1999). “Reconstruction of vector and tensor fields from sampled discrete data”. In: Contemporary Mathematics 247, pp. 1–16. Andersson, Jesper LR, Stefan Skare, and John Ashburner (2003). “How to correct susceptibility distortions in spin-echo echo-planar images: applica- tion to diffusion tensor imaging”. In: NeuroImage 20.2, pp. 870–888. Andersson, Jesper LR and Stamatios N Sotiropoulos (2016). “An integrated approach to correction for off-resonance effects and subject movement in diffusion MR imaging”. In: NeuroImage 125, pp. 1063–1078. Bagher-Ebadian, Hassan, Kourosh Jafari-Khouzani, Panayiotis D Mitsias, et al. (2011). “Predicting final extent of ischemic infarction using artificial neural network analysis of multi-parametric MRI in patients with stroke”. In: PLoS ONE 6.8, e22626.

81 Bibliography

Basser, Peter J (1998). “Fiber-tractography via diffusion tensor MRI (DT- MRI)”. In: Proceedings of the International Society for Magnetic Resonance in Medicine. Vol. 1226, p. 1226. Basser, Peter J, James Mattiello, and Denis LeBihan (1994a). “Estimation of the effective self-diffusion tensor from the NMR spin echo”. In: Journal of Magnetic Resonance, Series B 103.3, pp. 247–254. Basser, Peter J, James Mattiello, and Denis LeBihan (1994b). “MR diffusion tensor spectroscopy and imaging”. In: Biophysical Journal 66.1, pp. 259– 267. Basser, Peter J and Carlo Pierpaoli (1996). “Microstructural and physiological features of tissues elucidated by quantitative-diffusion-tensor MRI”. In: Journal of Magnetic Resonance, Series B 111, pp. 209–219. Behrens, Timothy EJ, H Johansen Berg, Saad Jbabdi, et al. (2007). “Proba- bilistic diffusion tractography with multiple fibre orientations: What can we gain?” In: NeuroImage 34.1, pp. 144–155. Behrens, Timothy EJ, Mark W Woolrich, Mark Jenkinson, et al. (2003). “Characterization and propagation of uncertainty in diffusion-weighted MR imaging”. In: Magnetic Resonance in Medicine 50.5, pp. 1077–1088. Berman, Jeffrey I, SungWon Chung, Pratik Mukherjee, et al. (2008). “Prob- abilistic streamline Q-ball tractography using the residual bootstrap”. In: NeuroImage 39.1, pp. 215–222. Berman, JI, S Chung, CP Hess, et al. (2007). “Model-based bootstrap re- sampling methods for HARDI: Quantitation of ODF uncertainty without multiple acquisitions”. In: Proceedings of the International Society for Mag- netic Resonance in Medicine, p. 1471. Berman, JI, P Mukherjee, CP Hess, et al. (2007). “Bootstrap tractography using multimodal fiber orientation data from high angular resolution dif- fusion imaging”. In: Proceedings of the International Society for Magnetic Resonance in Medicine, p. 526. Bertleff, Marco, Sebastian Domsch, Sebastian Weingärtner, et al. (2017). “Dif- fusion parameter mapping with the combined intravoxel incoherent motion and kurtosis model using artificial neural networks at 3 T”. In: NMR in Biomedicine 30.12, e3833. Bloch, Felix (1946). “Nuclear induction”. In: Physical Review 70.7-8, p. 460. Blumberg, Stefano B, Ryutaro Tanno, Iasonas Kokkinos, et al. (2018). “Deeper image quality transfer: Training low-memory neural networks for 3D images”. In: International Conference on and Computer-Assisted Intervention. Springer, pp. 118–125. Bosch, AJ (1986). “The factorization of a square matrix into two symmetric matrices”. In: The American Mathematical Monthly 93.6, pp. 462–464. Bottomley, PA, HR Hart, WA Edelstein, et al. (1982). “Head imaging and spectroscopy at 1.5 Tesla”. In: Magnetic Resonance Imaging 1.4, p. 231. Cárdenes, Rubén, Emma Muñoz-Moreno, Antonio Tristan-Vega, et al. (2010). “Saturn: a software application of tensor utilities for research in neu-

82 Bibliography

roimaging”. In: Computer Methods and Programs in Biomedicine 97.3, pp. 264–279. Carr, Herman Y (1953). “Free precession techniques in nuclear magnetic res- onance”. PhD thesis. Harvard University. Carr, Herman Y and Edward M Purcell (1954). “Effects of diffusion on free precession in nuclear magnetic resonance experiments”. In: Physical Review 94.3, p. 630. Chung, S, JI Berman, C Rae, et al. (2007). “Effect of DTI bootstrap bias on the DTI uncertainty measurements and probabilistic tractography”. In: Proceedings of the International Society for Magnetic Resonance in Medicine, p. 1596. Chung, SungWon, Ying Lu, and Roland G Henry (2006). “Comparison of bootstrap approaches for estimation of uncertainties of DTI parameters”. In: NeuroImage 33.2, pp. 531–541. Cohen, Joseph Paul, Margaux Luck, and Sina Honari (2018). “Distribution matching losses can hallucinate features in medical image translation”. In: arXiv preprint arXiv:1805.08841v3. Cook, PA, Y Bai, SKKS Nedjati-Gilani, et al. (2006). “Camino: open-source diffusion-MRI reconstruction and processing”. In: Proceedings of the Inter- national Society for Magnetic Resonance in Medicine. Vol. 2759. Seattle WA, USA, p. 2759. Cusack, Rhodri, Matthew Brett, and Katja Osswald (2003). “An evaluation of the use of magnetic field maps to undistort echo-planar images”. In: NeuroImage 18.1, pp. 127–142. Damadian, Raymond (1971). “Tumor detection by nuclear magnetic reso- nance”. In: Science 171.3976, pp. 1151–1153. Dar, Salman Ul Hassan, Mahmut Yurt, Levent Karacan, et al. (2018). “Image synthesis in multi-contrast MRI with conditional generative adversarial networks”. In: arXiv preprint arXiv:1802.01221. Dar, Salman Ul Hassan, Mahmut Yurt, Mohammad Shahdloo, et al. (2018). “Synergistic reconstruction and synthesis via generative adver- sarial networks for accelerated nulti-contrast MRI”. In: arXiv preprint arXiv:1805.10704. Davidson, Russell and Emmanuel Flachaire (2008). “The wild bootstrap, tamed at last”. In: Journal of Econometrics 146.1, pp. 162–169. Demiralp, Cagatay and David H Laidlaw (2011). “Generalizing diffusion ten- sor model using probabilistic inference in Markov random fields”. In: Pro- ceedings of the International Society for Magnetic Resonance in Medicine, p. 1935. Descoteaux, Maxime, Elaine Angelino, Shaun Fitzgibbons, et al. (2006). “Ap- parent diffusion coefficients from high angular resolution diffusion imaging: Estimation and applications”. In: Magnetic Resonance in Medicine 56.2, pp. 395–410.

83 Bibliography

Efron, Bradley (1979). “Bootstrap methods: another look at the jackknife”. In: vol. 7. 1, pp. 1–26. Einstein, Albert (1906). “A new determination of molecular dimensions”. In: Annals of Physics 19, pp. 289–306. Eklund, Anders, Paul Dufort, Daniel Forsberg, et al. (2013). “Medical im- age processing on the GPU–Past, present and future”. In: Medical Image Analysis 17.8, pp. 1073–1094. Eklund, Anders, Thomas E Nichols, and Hans Knutsson (2017). “Reply to Brown and Behrmann, Cox, et al., and Kessler et al.: Data and code shar- ing is the way forward for fMRI”. In: Proceedings of the National Academy of Sciences 114.17, E3374–E3375. Elster, Allen D (1993). “Gradient-echo MR imaging: techniques and acronyms”. In: 186.1, pp. 1–8. Fillard, Pierre, Nicolas Toussaint, and Xavier Pennec (2006). “Medinria: DT- MRI processing and visualization software”. In: SIMILAR NoE Tensor Workshop. Vol. 5. Freedman, David A et al. (1981). “Bootstrapping regression models”. In: The Annals of Statistics 9.6, pp. 1218–1228. Garyfallidis, Eleftherios, Matthew Brett, Bagrat Amirbekian, et al. (2014). “Dipy, a library for the analysis of diffusion MRI data”. In: Frontiers in Neuroinformatics 8, p. 8. Geman, Stuart and Donald Geman (1984). “Stochastic relaxation, Gibbs dis- tributions, and the Bayesian restoration of images”. In: Readings in Com- puter Vision. Elsevier, pp. 564–584. Girard, Gabriel, Kevin Whittingstall, Rachid Deriche, et al. (2014). “Towards quantitative connectivity analysis: reducing tractography biases”. In: Neu- roImage 98, pp. 266–278. Glasser, Matthew F, Stamatios N Sotiropoulos, J Anthony Wilson, et al. (2013). “The minimal preprocessing pipelines for the Human Connectome Project”. In: NeuroImage 80, pp. 105–124. Goebel, R (1996). “Brainvoyager: A program for analyzing and visualizing functional and structural magnetic resonance data sets”. In: NeuroImage 3.3, S604. Golkov, Vladimir, Alexey Dosovitskiy, Philipp Sämann, et al. (2015). “Q- Space deep learning for twelve-fold shorter and model-free diffusion MRI scans”. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 37–44. Golkov, Vladimir, Alexey Dosovitskiy, Jonathan I Sperl, et al. (2016). “Q- space deep learning: twelve-fold shorter and model-free diffusion MRI scans”. In: IEEE Transactions on Medical Imaging 35.5, pp. 1344–1351. Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, et al. (2014). “Generative adversarial nets”. In: Advances in Neural Information Processing Systems, pp. 2672–2680.

84 Bibliography

Graham, Mark S, Ivana Drobnjak, and Hui Zhang (2016). “Realistic simula- tion of artefacts in diffusion MRI for validating post-processing correction techniques”. In: NeuroImage 125, pp. 1079–1094. Greenspan, Hayit, Bram Van Ginneken, and Ronald M Summers (2016). “Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique”. In: IEEE Transactions on Medical Imaging 35.5, pp. 1153–1159. Greve, Douglas N and Bruce Fischl (2009). “Accurate and robust brain im- age alignment using boundary-based registration”. In: NeuroImage 48.1, pp. 63–72. Gu, Xuan and Anders Eklund (2019). “Evaluation of six phase encoding based susceptibility distortion correction methods for diffusion MRI”. In: BioRxiv, p. 766139. Gu, Xuan, Anders Eklund, and Hans Knutsson (2017). “Repeated tractogra- phy of a single subject: how high is the variance?” In: Modeling, Analysis, and Visualization of Anisotropy. Springer, Cham, pp. 331–354. Gu, Xuan, Anders Eklund, Evren Özarslan, et al. (2019). “Using the wild bootstrap to quantify uncertainty in mean apparent propagator MRI”. In: Frontiers in Neuroinformatics 13, p. 43. Gu, Xuan, Hans Knutsson, Markus Nilsson, et al. (2019). “Generating diffu- sion MRI scalar maps from T1 weighted images using generative adversar- ial networks”. In: Scandinavian Conference on Image Analysis. Springer, pp. 489–498. Gu, Xuan, Per Sidén, Bertil Wegmann, et al. (2017). “Bayesian diffusion ten- sor estimation with spatial priors”. In: International Conference on Com- puter Analysis of Images and Patterns. Springer, pp. 372–383. Gudbjartsson, Hákon and Samuel Patz (1995). “The Rician distribution of noisy MRI data”. In: Magnetic Resonance in Medicine 34.6, pp. 910–914. Hahn, Erwin L (1950). “Spin echoes”. In: Physical Review 80.4, p. 580. Haroon, Hamied A, David M Morris, Karl V Embleton, et al. (2009). “Using the model-based residual bootstrap to quantify uncertainty in fiber orien- tations from Q-ball analysis”. In: IEEE Transactions on Medical Imaging 28.4, pp. 535–550. Hashemi, Ray Hashman, William G Bradley, and Christopher J Lisanti (2012). MRI: The basics. Lippincott Williams & Wilkins. Heim, S, K Hahn, PG Sämann, et al. (2004). “Assessing DTI data quality using bootstrap analysis”. In: Magnetic Resonance in Medicine 52.3, pp. 582–589. Hernández, Moisés, Ginés D Guerrero, José M Cecilia, et al. (2013). “Ac- celerating fibre orientation estimation from diffusion weighted magnetic resonance imaging using GPUs”. In: PLoS ONE 8.4, e61892. Hernandez-Fernandez, Moises, Istvan Reguly, Saad Jbabdi, et al. (2019). “Us- ing GPUs to accelerate computational diffusion MRI: From microstruc- ture estimation to tractography and ”. In: NeuroImage 188, pp. 598–615.

85 Bibliography

Hess, Christopher P, Pratik Mukherjee, Eric T Han, et al. (2006). “Q-ball re- construction of multimodal fiber orientations using the spherical harmonic basis”. In: Magnetic Resonance in Medicine 56.1, pp. 104–117. Hess, CP, P Mukherjee, JL Berman, et al. (2006). “In vivo comparison of Q- ball, spherical deconvolution and diffusion orientation transform HARDI for fiber orientation estimation using the bootstrap method”. In: Proceed- ings of the International Society for Magnetic Resonance in Medicine, p. 159. Hughes, Emer, Lucilio Cordero-Grande, Maria Murgasova, et al. (2017). “The Developing Human Connectome: announcing the first release of open ac- cess neonatal brain imaging”. In: Poster presented at the Annual Meeting of the Organization for Human . Irfanoglu, M Okan, Pooja Modi, Amritha Nayak, Elizabeth B Hutchinson, et al. (2015). “DR-BUDDI (Diffeomorphic Registration for Blip-Up blip- Down Diffusion Imaging) method for correcting echo planar imaging dis- tortions”. In: NeuroImage 106, pp. 284–299. Irfanoglu, M Okan, Pooja Modi, Amritha Nayak, Andrew Knutsen, et al. (2014). “DR-BUDDI: Diffeomorphic Registration for Blip Up-down Diffu- sion Imaging”. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 218–226. Irfanoglu, Mustafa Okan, Amritha Nayak, Jeffrey Jenkins, et al. (2017). “TORTOISE V3: improvements and new features of the NIH diffusion MRI processing pipeline”. In: Proceedings of the International Society for Magnetic Resonance in Medicine. ISMRM Hawaii, USA. Jenkinson, Mark, Peter Bannister, Michael Brady, et al. (2002). “Improved optimization for the robust and accurate linear registration and motion correction of brain images”. In: NeuroImage 17.2, pp. 825–841. Jenkinson, Mark, Christian F Beckmann, Timothy EJ Behrens, et al. (2012). “FSL”. In: NeuroImage 62.2, pp. 782–790. Jenkinson, Mark and Stephen Smith (2001). “A global optimisation method for robust affine registration of brain images”. In: Medical Image Analysis 5.2, pp. 143–156. Jensen, Jens H, Joseph A Helpern, Anita Ramani, et al. (2005). “Diffu- sional kurtosis imaging: the quantification of non-gaussian water diffu- sion by means of magnetic resonance imaging”. In: Magnetic Resonance in Medicine 53.6, pp. 1432–1440. Jezzard, Peter and Robert S Balaban (1995). “Correction for geometric dis- tortion in echo planar images from B0 field variations”. In: Magnetic Res- onance in Medicine 34.1, pp. 65–73. Jian, Bing, Baba C Vemuri, Evren Özarslan, et al. (2007). “A novel tensor distribution model for the diffusion-weighted MR signal”. In: NeuroImage 37.1, pp. 164–176. Jiang, Hangyi, Peter CM Van Zijl, Jinsuh Kim, et al. (2006). “DtiStudio: resource program for diffusion tensor computation and fiber bundle track-

86 Bibliography

ing”. In: Computer Methods and Programs in Biomedicine 81.2, pp. 106– 116. Jones, Derek K (2003). “Determining and visualizing uncertainty in estimates of fiber orientation from diffusion tensor MRI”. In: Magnetic Resonance in Medicine 49.1, pp. 7–12. Jones, Derek K (2008). “Tractography gone wild: probabilistic fibre tracking using the wild bootstrap with diffusion tensor MRI”. In: IEEE Transac- tions on Medical Imaging 27.9, pp. 1268–1274. Jones, Derek K and Mara Cercignani (2010). “Twenty-five pitfalls in the anal- ysis of diffusion MRI data”. In: NMR in Biomedicine 23.7, pp. 803–820. Karras, Tero, Timo Aila, Samuli Laine, et al. (2018). “Progressive growing of GANs for improved quality, stability, and variation”. In: International Conference on Learning Representations. King, Martin D, David G Gadian, and Chris A Clark (2009). “A random effects modelling approach to the crossing-fibre problem in tractography”. In: NeuroImage 44.3, pp. 753–768. Koppers, Simon, Christoph Haarburger, and Dorit Merhof (2016). “Diffu- sion MRI signal augmentation: from single shell to multi shell with deep learning”. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 61–70. Koppers, Simon and Dorit Merhof (2016). “Direct estimation of fiber orienta- tions using deep learning in diffusion imaging”. In: International Workshop on Machine Learning in Medical Imaging. Springer, pp. 53–60. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E Hinton (2012). “Imagenet classification with deep convolutional neural networks”. In: Advances in Neural Information Processing Systems, pp. 1097–1105. Lauterbur, Paul C (1973). “Image formation by induced local interactions: examples employing nuclear magnetic resonance”. In: Nature 242, pp. 190– 191. Le Bihan, Denis and E Breton (1985). “Imagerie de diffusion in-vivo par réso- nance magnétique nucléaire”. In: Comptes-Rendus de l’Académie des Sci- ences 93.5, pp. 27–34. LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton (2015). “Deep learning”. In: Nature 521.7553, p. 436. Leemans, A, B Jeurissen, J Sijbers, et al. (2009). “ExploreDTI: a graphical toolbox for processing, analyzing, and visualizing diffusion MR data”. In: Proceedings of the International Society for Magnetic in Medicine. Vol. 17, p. 3537. Leemans, Alexander and Derek K Jones (2009). “The B-matrix must be ro- tated when correcting for subject motion in DTI data”. In: Magnetic Res- onance in Medicine 61.6, pp. 1336–1349. Liang, Zhi-Pei and Paul C Lauterbur (2000). Principles of magnetic resonance imaging: a signal processing perspective. SPIE Optical Engineering Press.

87 Bibliography

Liu, Chunlei, Roland Bammer, Burak Acar, et al. (2004). “Characterizing non- Gaussian diffusion by using generalized diffusion tensors”. In: Magnetic Resonance in Medicine 51.5, pp. 924–937. Lu, Hanzhang, Jens H Jensen, Anita Ramani, et al. (2006). “Three- dimensional characterization of non-gaussian water diffusion in humans using diffusion kurtosis imaging”. In: NMR in Biomedicine 19.2, pp. 236– 247. Maier-Hein, Klaus H, Peter F Neher, Jean-Christophe Houde, et al. (2017). “The challenge of mapping the human connectome based on diffusion trac- tography”. In: Nature Communications 8.1, p. 1349. Mammen, Enno et al. (1993). “Bootstrap and wild bootstrap for high dimen- sional linear models”. In: The Annals of Statistics 21.1, pp. 255–285. Mansfield, P (1977). “Multi-planar imaging formation using NMR spin echoes”. In: Journal of Physics C: Solid State Physics 10.3, pp. 55–58. Mansfield, P and PK Grannell (1975). “”Diffraction” and microscopy insolids and liquids by NMR”. In: Physical Review 12.9, p. 3618. Mansfield, P and Andrew A Maudsley (1977). “Planar spin imaging byNMR”. In: Journal of Magnetic Resonance (1969) 27.1, pp. 101–119. Martı́n-Fernández, Marcos, Raul San Josá-Estépar, Carl-Fredrik Westin, et al. (2003). “A novel Gauss-Markov random field approach for regularization of diffusion tensor maps”. In: International Conference on Computer Aided Systems Theory. Springer, pp. 506–517. Martı́n-Fernández, Marcos, Carl-Fredrik Westin, and Carlos Alberola-López (2004). “3D Bayesian regularization of diffusion tensor MRI using multi- variate Gaussian Markov random fields”. In: pp. 351–359. Nair, Vinod and Geoffrey E Hinton (2010). “Rectified linear units improve restricted boltzmann machines”. In: Proceedings of the International Con- ference on Machine Learning, pp. 807–814. Nayak, Krishna S and Dwight G Nishimura (2000). “Automatic field map gen- eration and off-resonance correction for projection reconstruction imaging”. In: Magnetic Resonance in Medicine 43.1, pp. 151–154. Nie, Dong, Roger Trullo, Jun Lian, et al. (2017). “Medical image synthe- sis with context-aware generative adversarial networks”. In: International Conference on Medical Image Computing and Computer-Assisted Inter- vention, pp. 417–425. Nie, Dong, Han Zhang, Ehsan Adeli, et al. (2016). “3D deep learning for multi- modal imaging-guided survival time prediction of brain tumor patients”. In: International Conference on Medical Image Computing and Computer- Assisted Intervention. Springer, pp. 212–220. Nowogrodzki, A (2018). “The world’s strongest MRI machines are pushing human imaging to new limits”. In: Nature 563.7729, p. 24. O’Gorman, Ruth L and Derek K Jones (2006). “Just how much data need to be collected for reliable bootstrap DT-MRI?” In: Magnetic Resonance in Medicine 56.4, pp. 884–890.

88 Bibliography

Olut, Sahin, Yusuf Huseyin Sahin, Ugur Demir, et al. (2018). “Generative adversarial training for MRA image synthesis using multi-contrast MRI”. In: International Conference on Medical Image Computing and Computer- Assisted Intervention, pp. 147–154. Özarslan, E, CG Koay, and PJ Basser (2008). “Simple harmonic oscillator based estimation and reconstruction for one-dimensional q-space MR”. In: Proceedings of the International Society for Magnetic Resonance in Medicine. Vol. 16, p. 35. Özarslan, Evren, Cheng Guan Koay, Timothy M Shepherd, et al. (2013). “Mean apparent propagator (MAP) MRI: a novel diffusion imaging method for mapping tissue microstructure”. In: NeuroImage 78, pp. 16–32. Pajevic, S and PJ Basser (1999). “Non-parametric statistical analysis of dif- fusion tensor MRI data using the bootstrap method”. In: Proceedings of the International Society for Magnetic Resonance in Medicine, p. 1790. Pajevic, Sinisa, Akram Aldroubi, and Peter J Basser (2002). “A continuous tensor field approximation of discrete DT-MRI data for extracting mi- crostructural and architectural features of tissue”. In: Journal of Magnetic Resonance 154.1, pp. 85–100. Pajevic, Sinisa and Peter J Basser (2003). “Parametric and non-parametric statistical analysis of DT-MRI data”. In: Journal of Magnetic Resonance 161.1, pp. 1–14. Papademetris, Xenophon, Marcel Jackowski, Nallakkandi Rajeevan, et al. (2005). “BioImage Suite: An integrated medical image analysis suite”. In: The Insight Journal 1.3, p. 13. Park, Hae-Jeong, Martha E Shenton, and Carl-Fredrik Westin (2004). “An analysis tool for quantification of diffusion tensor MRI data”. In: Inter- national Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 1089–1090. Penny, William D, Karl J Friston, John T Ashburner, et al. (2011). Statistical parametric mapping: the analysis of functional brain images. Elsevier. Penny, William D, Nelson J Trujillo-Barreto, and Karl J Friston (2005). “Bayesian fMRI time series analysis with spatial priors”. In: NeuroImage 24.2, pp. 350–362. Pieper, Steve, Michael Halle, and Ron Kikinis (2004). “3D Slicer”. In: Inter- national Symposium on Biomedical Imaging. IEEE, pp. 632–635. Pierpaoli, C, L Walker, MO Irfanoglu, et al. (2010). “TORTOISE: an inte- grated software package for processing of diffusion MRI data”. In: Pro- ceedings of the International Society for Magnetic Resonance in Medicine, p. 1597. Poldrack, Russell A, Chris I Baker, Joke Durnez, et al. (2017). “Scanning the horizon: towards transparent and reproducible neuroimaging research”. In: Nature Reviews Neuroscience 18.2, p. 115.

89 Bibliography

Poldrack, Russell A and Krzysztof J Gorgolewski (2014). “Making big data open: data sharing in neuroimaging”. In: Nature Neuroscience 17.11, p. 1510. Poupon, Cyril, Chris A Clark, Vincent Frouin, et al. (2000). “Regularization of diffusion-based direction maps for the tracking of brain white matter fascicles”. In: NeuroImage 12.2, pp. 184–195. Poupon, Cyril, J-F Mangin, Chris A Clark, et al. (2001). “Towards inference of human brain connectivity from MR diffusion tensor data”. In: Medical Image Analysis 5.1, pp. 1–15. Press, William H, Saul A Teukolsky, William T Vetterling, et al. (2007). Nu- merical recipes 3rd edition: The art of scientific computing. Cambridge university press. Purcell, Edward M, H Co Torrey, and Robert V Pound (1946). “Resonance absorption by nuclear magnetic moments in a solid”. In: Physical Review 69.1-2, p. 37. Qian, Chunqi, Ihssan S Masad, Jens T Rosenberg, et al. (2012). “A volume birdcage coil with an adjustable sliding tuner ring for neuroimaging in high field vertical magnets: ex and in vivo applications at 21.1 T”.In: Journal of Magnetic Resonance 221, pp. 110–116. Quenouille, Maurice H (1949). “Approximate tests of correlation in time-series 3”. In: Mathematical Proceedings of the Cambridge Philosophical Society. Vol. 45. 3. Cambridge University Press, pp. 483–484. Rabi, I. I., J. R. Zacharias, S. Millman, et al. (1938). “A new method of measuring nuclear magnetic moment”. In: Physical Review 53 (4), pp. 318– 318. Raj, Ashish, Christopher Hess, and Pratik Mukherjee (2011). “Spatial HARDI: improved visualization of complex white matter architecture with Bayesian spatial regularization”. In: NeuroImage 54.1, pp. 396–409. Rajpurkar, Pranav, Jeremy Irvin, Kaylie Zhu, et al. (2017). “Chexnet: Radiologist-level pneumonia detection on chest X-rays with deep learn- ing”. In: arXiv preprint arXiv:1711.05225. Rohde, Gustavo Kunde, AS Barnett, PJ Basser, et al. (2004). “Comprehen- sive approach for correction of motion and distortion in diffusion-weighted MRI”. In: Magnetic Resonance in Medicine 51.1, pp. 103–114. Rokem, Ariel, Jason D Yeatman, Franco Pestilli, et al. (2015). “Evaluating the accuracy of diffusion MRI models in white matter”. In: PLoS ONE 10.4, e0123272. Rosenblatt, Frank (1958). “The perceptron: a probabilistic model for informa- tion storage and organization in the brain”. In: Psychological Review 65.6, p. 386. Ruthotto, L, H Kugel, J Olesch, et al. (2012). “Diffeomorphic susceptibility artifact correction of diffusion-weighted magnetic resonance images”. In: Physics in Medicine & Biology 57.18, p. 5715.

90 Bibliography

Ruthotto, Lars, Siawoosh Mohammadi, Constantin Heck, et al. (2013). “Hy- perelastic susceptibility artifact correction of DTI in SPM”. In: Bildverar- beitung für die Medizin. Springer, pp. 344–349. Schuster, Christina, Orla Hardiman, and Peter Bede (2017). “Survival predic- tion in Amyotrophic lateral sclerosis based on MRI measures and clinical characteristics”. In: BMC Neurology 17.1, p. 73. Setsompop, Kawin, R Kimmlingen, E Eberlein, et al. (2013). “Pushing the limits of in vivo diffusion MRI for the Human Connectome Project”. In: NeuroImage 80, pp. 220–233. Shakya, Snehlata, Nazre Batool, Evren Özarslan, et al. (2017). “Multi-fiber reconstruction using probabilistic mixture models for diffusion MRI ex- aminations of the brain”. In: Modeling, Analysis, and Visualization of Anisotropy. Springer, pp. 283–308. Shakya, Snehlata, Xuan Gu, Nazre Batool, et al. (2017). “Multi-fiber esti- mation and tractography for diffusion MRI using mixture of non-central wishart distributions”. In: Proceedings of the Eurographics Workshop on Visual Computing for Biology and Medicine. The Eurographics Associa- tion, pp. 119–123. Sidén, Per, Anders Eklund, David Bolin, et al. (2017). “Fast Bayesian whole- brain fMRI analysis with spatial 3D priors”. In: NeuroImage 146, pp. 211– 225. Sjölund, Jens, Anders Eklund, Evren Özarslan, et al. (2018). “Bayesian uncer- tainty quantification in linear models for diffusion MRI”. In: NeuroImage 175, pp. 272–285. Smith, Robert E, Jacques-Donald Tournier, Fernando Calamante, et al. (2012). “Anatomically-constrained tractography: improved diffusion MRI streamlines tractography through effective use of anatomical information”. In: NeuroImage 62.3, pp. 1924–1938. Srivastava, Nitish, Geoffrey Hinton, Alex Krizhevsky, et al. (2014). “Dropout: a simple way to prevent neural networks from overfitting”. In: The Journal of Machine Learning Research 15.1, pp. 1929–1958. Stejskal, Edward O and John E Tanner (1965). “Spin diffusion measurements: spin echoes in the presence of a time-dependent field gradient”. In: The Journal of Chemical Physics 42.1, pp. 288–292. Tanno, Ryutaro, Daniel E Worrall, Aurobrata Ghosh, et al. (2017). “Bayesian image quality transfer with CNNs: exploring uncertainty in dMRI super- resolution”. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 611–619. Torrey, Henry C (1956). “Bloch equations with diffusion terms”. In: Physical Review 104.3, p. 563. Tuch, David S (2004). “Q-ball imaging”. In: Magnetic Resonance in Medicine 52.6, pp. 1358–1372.

91 Bibliography

Tuch, David S, Timothy G Reese, Mette R Wiegell, et al. (2002). “High an- gular resolution diffusion imaging reveals intravoxel white matter fiber heterogeneity”. In: Magnetic Resonance in Medicine 48.4, pp. 577–582. Van Essen, David C, Stephen M Smith, Deanna M Barch, et al. (2013). “The WU-Minn Human Connectome Project: An overview”. In: NeuroImage 80, pp. 62–79. Vaughan, Thomas, Lance DelaBarre, Carl Snyder, et al. (2006). “9.4 T hu- man MRI: preliminary results”. In: Magnetic Resonance in Medicine 56.6, pp. 1274–1282. Walker-Samuel, Simon, Matthew Orton, Jessica KR Boult, et al. (2011). “Im- proving apparent diffusion coefficient estimates and elucidating tumor het- erogeneity using Bayesian adaptive smoothing”. In: Magnetic Resonance in Medicine 65.2, pp. 438–447. Wang, Zhizhou, Baba C Vemuri, Yunmei Chen, et al. (2003). “A constrained variational principle for direct estimation and smoothing of the diffusion tensor field from DWI”. In: Biennial International Conference on Infor- mation Processing in Medical Imaging. Springer, pp. 660–671. Wedeen, Van J, Patric Hagmann, Wen-Yih Isaac Tseng, et al. (2005). “Map- ping complex tissue architecture with diffusion spectrum magnetic reso- nance imaging”. In: Magnetic Resonance in Medicine 54.6, pp. 1377–1386. Wegmann, Bertil, Anders Eklund, and Mattias Villani (2017a). “Bayesian het- eroscedastic regression for diffusion tensor imaging”. In: Modeling, Analy- sis, and Visualization of Anisotropy. Springer, pp. 257–282. Wegmann, Bertil, Anders Eklund, and Mattias Villani (2017b). “Bayesian Ri- cian regression for neuroimaging”. In: Frontiers in Neuroscience 11, p. 586. Welander, Per, Simon Karlsson, and Anders Eklund (2018). “Generative adversarial networks for image-to-image translation on multi-contrast MR images-a comparison of CycleGAN and UNIT”. In: arXiv preprint arXiv:1806.07777. Whitcher, B, DS Tuch, and L Wang (2005). “The wild bootstrap to quantify variability in diffusion tensor MRI”. In: Proceedings of the International So- ciety for Magnetic Resonance in Medicine. ISMRM Miami Beach, p. 1333. Wu, Chien-Fu Jeff et al. (1986). “Jackknife, bootstrap and other resampling methods in regression analysis”. In: The Annals of Statistics 14.4, pp. 1261– 1295. Yeh, Fang-Cheng, Van Jay Wedeen, and Wen-Yih Isaac Tseng (2010). “Gen- eralized q-sampling imaging”. In: IEEE Transactions on Medical Imaging 29.9, pp. 1626–1635. Yi, Xin, Ekta Walia, and Paul Babyn (2019). “Generative adversarial network in medical imaging: A review”. In: Medical Image Analysis, p. 101552. Yu, Biting, Luping Zhou, Lei Wang, et al. (2018). “3D cGAN based cross- modality MR image synthesis for brain tumor segmentation”. In: Interna- tional Symposium on Biomedical Imaging, pp. 626–630.

92 Bibliography

Yuan, Ying, Hongtu Zhu, Joseph G Ibrahim, et al. (2008). “A note on the validity of statistical bootstrapping for estimating the uncertainty of tensor parameters in diffusion tensor images”. In: IEEE Transactions on Medical Imaging 27.10, pp. 1506–1514. Zhang, Hui, Torben Schneider, Claudia A Wheeler-Kingshott, et al. (2012). “NODDI: practical in vivo neurite orientation dispersion and density imag- ing of the human brain”. In: NeuroImage 61.4, pp. 1000–1016. Zhang, Hui, Paul A Yushkevich, and James C Gee (2005). “Deformable reg- istration of diffusion tensor MR images with explicit orientation opti- mization”. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, pp. 172–179. Zhu, Jun-Yan, Taesung Park, Phillip Isola, et al. (2017). “Unpaired image-to- image translation using cycle-consistent adversarial networks”. In: IEEE International Conference on Computer Vision, pp. 2223–2232. Zhu, Tong, Xiaoxu Liu, Patrick R Connelly, et al. (2008). “An optimized wild bootstrap method for evaluation of measurement uncertainties of DTI- derived parameters in human brain”. In: NeuroImage 40.3, pp. 1144–1156.

93

Papers

The papers associated with this thesis have been removed for copyright reasons. For more details about these see: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-161288 FACULTY OF SCIENCE AND ENGINEERING

Linköping Studies in Science and Technology, Dissertation No. 2017, 2019 Department of Biomedical Engineering

Linköping University SE-581 83 Linköping, Sweden www.liu.se