Validation and Comparison of Three Freely Available Methods For
Total Page:16
File Type:pdf, Size:1020Kb
bioRxiv preprint doi: https://doi.org/10.1101/2020.10.17.343574; this version posted January 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 1 Validation and comparison of three freely available methods for 2 extracting white matter hyperintensities: FreeSurfer, UBO Detector 3 and BIANCA 4 5 Authors 6 Isabel Hotz a,b, Pascal F. Deschwanden a, Franziskus Liem b, Susan Mérillat b, Spyridon Kollias d, 7 Lutz Jäncke a,b,c 8 9 Author Affiliations 10 11 a Division of Neuropsychology, Department of Psychology, University of Zurich, Switzerland 12 b University Research Priority Program (URPP), Dynamics of Healthy Aging, University of Zurich, Zurich, Switzerland 13 c Department of Special Education, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia 14 d Department of Neuroradiology, University Hospital Zurich, Zurich, Switzerland 15 16 17 Data and code availability statement 18 19 The data supporting this manuscript are not publicly available because the used consent does not 20 allow for the public sharing of the data. We are currently working on a solution for this matter. 21 Our code for preparing the data for BIANCA and executing BIANCA can be found here: 22 https://github.com/dynage/bianca 23 24 The following openly available software were used: 25 FreeSurfer (https://surfer.nmr.mgh.harvard.edu/fswiki/DownloadAndInstall) 26 UBO Detector (https://cheba.unsw.edu.au/research-groups/neuroimaging/pipeline) 27 BIANCA (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/BIANCA/Userguide), part of FSL software (RRID: 28 SCR_002823, https://fsl.fmrib.ox.ac.uk) and the MATLAB implementation of LOCATE 29 (https://git.fmrib.ox.ac.uk/vaanathi/LOCATE-BIANCA) 30 31 32 Funding statement 33 34 This research was funded by the Velux Stiftung (project No. 369) and the University Research 35 Priority Program «Dynamics of Healthy Aging» of the University of Zurich (UZH). 36 37 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.17.343574; this version posted January 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 38 Conflict of interest disclosure 39 40 The authors declare no conflicts of interest. 41 42 Ethics approval statement 43 The study was approved by the ethical committee of the canton of Zurich. 44 45 Patient consent statement 46 Participation was voluntary and all participants gave written informed consent in accordance with the 47 declaration of Helsinki. 48 49 ORCID 50 51 Isabel Hotz: http://orcid.org/0000-0002-7938-0688 52 53 54 55 56 Corresponding Authors 57 58 Isabel Hotz 59 Binzmuehlestrasse 14, Box 25 60 CH-8050 Zurich 61 Switzerland 62 [email protected] 63 64 Lutz Jäncke 65 Binzmuehlestrasse 14, Box 25 66 CH-8050 Zurich 67 Switzerland 68 [email protected] 69 70 71 72 73 74 2 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.17.343574; this version posted January 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 75 Abstract 76 White matter hyperintensities of presumed vascular origin (WMH) are frequently found in MRIs of 77 patients with neurological and vascular disorders, but also in healthy elderly subjects. Here, we 78 compared and validated three freely available algorithms for WMH extraction: FreeSurfer, UBO 79 Detector, and the Brain Intensity AbNormality Classification Algorithm (BIANCA). 80 The algorithms were applied to longitudinal MRI data of cognitively healthy older adults (baseline N 81 = 231, age range 64 – 87 years). We fully manually segmented WMH in T1w, 3D FLAIR, 2D 82 FLAIR MR images of 16 subjects and compared the corresponding algorithm outputs to assess the 83 algorithm’s segmentation accuracy. To validate the algorithms, we associated the extracted WMH 84 volumes with the Fazekas scores and age, and explored correlations between repeated measures using 85 the full longitudinal dataset. We further performed post hoc analyses of the algorithm’s outputs to 86 verify longitudinal robustness. 87 FreeSurfer fundamentally underestimated the WMH volumes and scored the worst in Dice Similarity 88 Coefficient. Nevertheless, FreeSurfer’s WMH volumes strongly correlated with the Fazekas scores 89 and exhibited longitudinal robustness. BIANCA accomplished a high segmentation accuracy, but 90 associations with the Fazekas scores were weak and a significant number of outlier-WMH volumes 91 was identified in the within-person trajectories. UBO Detector achieved the best results in terms of 92 the cost-benefit ratio. Although there is room for optimization, its performance was of high accuracy 93 and consistency, despite being based on a built-in training dataset. In summary, our results emphasize 94 the great importance of carefully contemplating the choice of the WMH segmentation algorithm in 95 future studies. 96 97 Keywords 98 White matter hyperintensities 99 Automated segmentation 100 MRI 101 Healthy aging 102 Validation 3 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.17.343574; this version posted January 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 103 1. Introduction 104 As our lifespan increases and the population ages, cognitive limitations caused by cerebrovascular 105 diseases will become more common (Baker et al., 2012). White matter hyperintensities of presumed 106 vascular origin (WMH) are considered as a marker of cerebrovascular diseases. They appear 107 hyperintense on T2-weighted MRI images, like fluid-attenuated inversion recovery (FLAIR) 108 sequences, without cavitation, and isointense or hypointense on T1-weighted (T1w) sequences 109 (Wardlaw et al., 2013). The FLAIR sequence is generally the most sensitive structural sequence for 110 visualizing WMH via magnetic resonance imaging (MRI) (Wardlaw et al., 2013). WMH are often 111 seen in MR images of the brain in patients with neurological and vascular disorders but also in 112 healthy elderly people (Caligiuri et al., 2015). In the context of clinical diagnostics, the Fazekas scale 113 (Fazekas, Chawluk, Alavi, Hurtig, & Zimmerman, 1987), the Scheltens scale (Scheltens et al., 1993), 114 and the age-related white matter changes scale (ARWMC) (Wahlund et al., 2001) are commonly 115 used to visually assess the severity and progression of WMH. However, these visual rating scales 116 unfortunately do not provide true quantitative data, have a relatively low reliability and are time- 117 consuming to obtain (Mäntylä et al., 1997). Compared to such scales, volumetric measurements are 118 more reliable and more sensitive to age effects in longitudinal studies of WMH (T. L. A. van den 119 Heuvel et al., 2016), especially in cognitively healthy samples where the expected WMH volume 120 increases over time are rather small. While several previous studies have been segmenting WMH 121 manually (Anbeek, Vincken, van Osch, Bisschops, & van der Grond, 2004; Dadar et al., 2017; de 122 Sitter et al., 2017; Klöppel et al., 2011; Kuijf et al., 2019; Steenwijk et al., 2013), this extremely time- 123 consuming option seems not feasible in future studies, especially when considering the current trend 124 towards big data (i.e., datasets with a large N and multiple time points of data acquisition). 125 Automated methods that can detect WMH robustly and with high accuracy are therefore very useful 126 and promising. Recently, Caligiuri and colleagues (2015) compared different existing algorithms 127 including supervised, unsupervised, and semi-automated techniques. They found that many of these 128 algorithms are not freely available, study and/or protocol specific, and have been validated on small 129 sample sizes. There is still no consensus on which algorithm(s) is (are) of good quality and should be 4 bioRxiv preprint doi: https://doi.org/10.1101/2020.10.17.343574; this version posted January 21, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. 130 applied to detect WMH (Dadar et al., 2017; Frey et al., 2019). Consequently, the methodology of 131 pertinent studies is very heterogeneous and compromises the comparability of such studies. 132 The primary goal of our current work is therefore to validate and precisely compare the performance 133 of three freely available WMH extraction methods: FreeSurfer (Fischl, 2012), UBO Detector (Jiang 134 et al., 2018), and BIANCA (Brain Intensity AbNormality Classification Algorithm) (Griffanti et al., 135 2016). The FreeSurfer Image Analysis Suite (Fischl, 2012) is a fully automated software for the 136 surface- and volume-based analysis of brain structure using information of T1w images. While 137 quantifying WMH is not FreeSurfer’s main aim, it still provides an unsupervised WMH segmentation 138 as part of its pipeline and enables WMH quantification based on T1w images alone. UBO Detector 139 and BIANCA are supervised tools specifically developed to automatically respectively semi- 140 automatically segment WMH based on the k-nearest neighbor (KNN) algorithm. Although recent 141 studies have been using these three algorithms, their validity and reliability is still not sufficiently 142 clear. Former validations were often performed i) in patients with moderate to high WMH load, ii) 143 using cross-sectional data, iii) in small samples.