
Article

A Novel Tool for Supervised Segmentation Using 3D Slicer

Daniel Chalupa * and Jan Mikulka

Department of Theoretical and Experimental Electrical Engineering, Brno University of Technology, Technická 3082/12, 616 00 Brno, Czech Republic; [email protected]

* Correspondence: [email protected]

 Received: 15 October 2018; Accepted: 7 November 2018; Published: 12 November 2018 

Abstract: The rather impressive extension library of the medical image-processing platform 3D Slicer lacks a wide range of machine-learning toolboxes. The authors have developed such a toolbox that incorporates commonly used machine-learning libraries. The extension uses a simple graphical user interface that allows the user to preprocess data, train a classifier, and use that classifier in common medical image-classification tasks, such as tumor staging or various anatomical segmentations, without deeper knowledge of the inner workings of the classifiers. A series of experiments were carried out to showcase the capabilities of the extension and to quantify the symmetry between the physical characteristics of pathological tissues and the parameters of a classifying model. These experiments also include an analysis of the impact of training vector size and feature selection on the sensitivity and specificity of all included classifiers. The results indicate that the training vector size can be minimized for all classifiers. Using the data from the Brain Tumor Segmentation Challenge, Random Forest appears to have the widest range of parameters that produce sufficiently accurate segmentations, while the optimal Support Vector Machine training parameters are concentrated in a narrow feature space.

Keywords: 3D slicer; classification; extension; random forest; segmentation; sensitivity analysis; support vector machine; tumor

1. Introduction

3D Slicer [1] is a free open-source platform for medical image visualization and processing. Its main functionality comes from the Extension Library, which consists of various modules that allow specific analyses of the input data, such as filtering, artefact suppression, or surface reconstruction. There is a lack of machine-learning extensions, except for the open-source DeepInfer [2]. This deep-learning deployment kit uses 3D convolutional neural networks to detect and localize the target tissue. The development team demonstrated the use of this kit on the prostate segmentation problem for image-guided therapy. Researchers and practitioners are able to select a publicly available task-oriented network through the module without the need to design or train it. To enable the use of other machine-learning techniques, we developed the Supervised Segmentation Toolbox as an extensible machine-learning platform for 3D Slicer. Currently, Support Vector Machine (SVM) and Random Forest (RF) classifiers are included. These classifiers are well researched and often used in image-processing tasks, as demonstrated in References [3–8]. SVMs [9] are trained by maximizing the distance between marginal samples (also referred to as support vectors) and the discriminative hyperplane, which corresponds to maximizing f in the equation:

$$
f(\alpha_1, \dots, \alpha_n) = \sum_i \alpha_i - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j \, y_i y_j \, \vec{x}_i \cdot \vec{x}_j, \qquad (1)
$$

where y is defined as +1 for a class A sample and −1 for a class B sample, α is a Lagrangian multiplier, and x⃗ is the feature vector of the individual sample. On real data, this is often too strict because of noisy samples that might cross their class boundary. This is solved by using a combination of techniques known as the kernel trick and soft margining. The kernel trick uses a kernel function φ to remap the original feature space to a higher-dimensional one by replacing the dot product in Equation (1) with φ(x⃗_i) · φ(x⃗_j). This allows linear separation of the data, as required by the SVM. An example of the kernel function is the radial basis function used in this study. Incorporating a soft margin allows some samples to cross their class boundary. Examples of such soft-margin SVMs are the C-SVM [10] and the N-SVM [11]. The C parameter of the C-SVM modifies the influence of each support vector on the final discriminatory hyperplane. The larger the C, the closer the soft-margin C-SVM is to a hard-margin SVM. The N parameter of the N-SVM defines a minimum number of support vectors and, consequently, an upper bound on the guaranteed maximum percentage of misclassifications. Further modifications to the SVM can also be made, such as using fuzzy-data points for the training dataset, as demonstrated in References [5,6].

RF uses a voting system on the results of the individual decision trees. Each decision tree is created on the basis of a bootstrapped sample of the training data [12].
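To make the roles of C, N, and the radial basis kernel concrete, the following minimal Python sketch instantiates the three classifier types discussed above. It uses scikit-learn purely as an illustrative stand-in; the extension itself is backed by the dlib and Shark-ml C++ libraries, and all names and values below are assumptions:

```python
# Illustrative sketch only; scikit-learn stands in for the dlib/Shark-ml
# backends used by the extension.
import numpy as np
from sklearn.svm import SVC, NuSVC
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # toy per-voxel feature vectors x_i
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # toy class labels (A = 1, B = 0)

# C-SVM with a radial basis kernel: the larger C is, the closer the
# soft margin comes to a hard margin.
c_svm = SVC(kernel="rbf", C=1.0, gamma=1.0).fit(X, y)

# Nu-SVM: nu bounds the fraction of support vectors and margin errors,
# playing the role of the N parameter described above.
n_svm = NuSVC(kernel="rbf", nu=0.1, gamma=1.0).fit(X, y)

# Random Forest: each tree is grown on a bootstrapped sample and the
# forest classifies by majority vote.
rf = RandomForestClassifier(n_estimators=800).fit(X, y)
```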

2. Materials and Methods

The data used in this study describe Low Grade Glioma (LGG) obtained by MRI. The dataset consists of four images of one patient (LG_0001, Figure 1): T1- and T2-weighted, contrast-enhanced T1C, and Fluid Attenuated Inversion Recovery (FL). These data are part of the training set featured in the Brain Tumor Segmentation (BraTS) 2012 Challenge.

Figure 1. Slice 88 of the LG_0001 data from the 2012 Brain Tumor Segmentation (BraTS) Challenge. The dataset consists of a (a) T1-weighted image with labels as overlay; (b) postcontrast T1-weighted image; (c) T2-weighted image; and (d) Fluid Attenuated Inversion Recovery (FL) image.

The free and open-source Supervised Segmentation Toolbox extension [13] for 3D Slicer was used throughout this study. The extension allows the user to train a range of classifiers using labeled data. The user is also able to perform a grid search to select the optimal parameters for the classifier. To achieve this, the extension uses either an already available function or a cross-validation algorithm developed by the author of the extension, depending on the classifier library used. Currently, the N-SVM and C-SVM from the dlib library [14] and the C-SVM and Random Forest from the Shark-ml library [15] are incorporated.

The extension takes care of the parallelizable parts of the training and classification subtasks, thus significantly reducing computation times. A preprocessing algorithm selection is also a part of the extension. This allows for artefact correction or feature extraction. The extension workflow is depicted in Figure 2.

(Workflow stages: Training Samples → Preprocessing → Training → Segmentation)

Figure 2. Supervised segmentation extension workflow.
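In code form, the workflow of Figure 2 reduces to a generic fit/predict pipeline. The sketch below is a schematic reading of the diagram, assuming a hypothetical preprocess callable and any classifier object; it is not the extension's actual API:

```python
# Schematic of the Figure 2 workflow; all names are illustrative assumptions.
def run_pipeline(train_voxels, train_labels, target_voxels,
                 preprocess, classifier):
    """Preprocess labeled training samples, train the chosen classifier,
    and segment a new volume with it."""
    classifier.fit(preprocess(train_voxels), train_labels)  # Training
    return classifier.predict(preprocess(target_voxels))    # Segmentation
```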

3. Results and Discussion

A series of tests were performed in order to provide a sense of the speed and accuracy of the provided classifiers. Sensitivity and specificity metrics were used to evaluate the results. A classifier that had a larger sum of specificity and sensitivity (or their respective means, when cross-validation was used) was considered a better classifier. During the first test run, each type of classifier was trained and evaluated using the single-patient image set. Optimal training parameters of the classifiers were obtained using a grid-search approach. The results are presented in Figures 3–5. The γ parameter is common to both SVM classifiers and influences the variance of the radial basis kernel. A large γ means that more data points will look similar, thus preventing overfitting. Using the aforementioned dataset, the results indicate a relative insensitivity of the classification accuracy to this parameter. For the given dataset, C values of the C-SVM larger than 1 seem optimal. The best results of the N-SVM classifier are obtained with N around 10% or lower combined with a high-variance radial basis function. Optimal RF training parameters were: a small node size of under 5, a number of trees higher than 800, no out-of-bag samples, and no random attributes.
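The grid search can be pictured with the following hedged sketch; GridSearchCV from scikit-learn replaces the extension's internal cross-validation, and the parameter ranges merely mirror the axes of Figures 3 and 4:

```python
# Hedged sketch of the parameter grid search; not the extension's own code.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

param_grid = {
    "C": np.logspace(-3, 3, 7),       # sweep over log10(C), cf. Figure 4
    "gamma": np.logspace(-2, 2, 5),   # sweep over log10(gamma), cf. Figures 3 and 4
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      scoring="balanced_accuracy",  # mean of sens. and spec.
                      cv=5)
# search.fit(features, labels)   # features: voxel vectors; labels: 0/1 mask
# print(search.best_params_)
```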

Figure 3. N-Support Vector Machine (SVM) classifier (a) sensitivity and (b) specificity using different parameter pairs. (Heat maps; axes: N versus log10 γ.)

Figure 4. C-SVM classifier (a) sensitivity and (b) specificity using different parameter pairs. (Heat maps; axes: log10 C versus log10 γ.)

Figure 5. Random Forest classifier sensitivity and specificity using different parameters. Left to right: different node size, number of trees, OOB, and number of random attributes.

The second test run consisted of using a different number of slices around the center of the tumor to reveal the impact of the size of the training set on the specificity and sensitivity of all classifiers (Figures 6a and 7). The results indicate that reducing the number of unique training samples has a negligible effect on the subsequent classification accuracy. RF shows a slightly better classification-accuracy improvement when using a larger training vector. Using a reduced training dataset influences the training process length and might result in a simpler classifier, which is easier to interpret and has shorter classification computation times. The classification time is a limiting factor of using these methods in real-time applications.
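A plausible reading of this test run, with hypothetical array names and axis layout, is sketched below: the training vector is assembled from a varying number of slices centered on the tumor.

```python
# Hypothetical sketch of building training vectors from `slice_count`
# slices around a central slice; names and axis order are assumptions.
import numpy as np

def extract_training_vector(volumes, mask, center=88, slice_count=10):
    """volumes: list of co-registered 3D arrays (T1, T2, T1C, FL);
    mask: 3D ground-truth label array of the same shape."""
    half = slice_count // 2
    zs = slice(center - half, center + half + 1)
    # One feature vector per voxel, one column per image type.
    features = np.stack([v[:, :, zs].ravel() for v in volumes], axis=1)
    labels = mask[:, :, zs].ravel()
    return features, labels
```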

Figure 6. (a) N-SVM and (b) C-SVM sensitivity and specificity using different training vector sizes.

Figure 7. RF-classifier sensitivity and specificity using different training vector sizes.

The effect of different image types on classifier accuracy was examined in the last test run (Figures 8a and 9). Slice 88 of the LG_0001 images was used as the source of training samples. Sufficient sensitivity and specificity were obtained by using only the T1- and T2-weighted images. Furthermore, all classifiers benefited from the addition of a postcontrast T1-weighted image. The RF classifier achieved the best overall results with the use of the FL, postcontrast T1-, and T2-weighted images.

Figure 8. Classifier sensitivity using different input images. N-SVM (a), C-SVM (b).

Figure 9. RF-classifier sensitivity using different input images.

The following standardized procedure was designed in order to compare classifier performance. Training samples were extracted from the whole volume of the unmodified T1C and T2 images. Then, sensitivity and specificity were obtained using fivefold cross-validation. The best-performing parameters and results are reported in Table 1. Segmentations are shown in Figure 10. Classification results can be further improved by using preprocessed data instead of raw data, and by means of postprocessing to remove the outlying voxels and inlying holes, as demonstrated in Reference [16].
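As a reference for how the reported metrics relate to the confusion counts, a small Python sketch follows; the function name is ours, and it states the standard formula definitions rather than the extension's code:

```python
# Sketch of the Table 1 metrics computed from boolean prediction and
# ground-truth masks of equal shape.
import numpy as np

def segmentation_metrics(pred, truth):
    tp = np.count_nonzero(pred & truth)    # true positives
    tn = np.count_nonzero(~pred & ~truth)  # true negatives
    fp = np.count_nonzero(pred & ~truth)   # false positives
    fn = np.count_nonzero(~pred & truth)   # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "jaccard": tp / (tp + fp + fn),
    }
```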

Table 1. Classifier comparison and best-performing parameters.

Classifier | Parameters | Sensitivity | Specificity | Acc. | Prec. | DICE | Jaccard
C-SVM | γ = 1.0, C = 1.0 | 0.72 | 0.98 | 0.99 | 0.96 | 0.80 | 0.66
N-SVM | γ = 10^−3, N = 0.1 | 0.77 | 0.97 | 0.99 | 0.96 | 0.82 | 0.70
RF | 0% OOB, 0 random attributes, 1200 trees, node size 2 | 1.00 | 0.95 | 1.00 | 0.96 | 0.94 | 0.89



Figure 10. (a) RF, (b) C-SVM, and (c) N-SVM classification results of slice 88 (white) and ground truth (red).

Lastly, the performance of the RF classifier trained on all tumor cores of the 20 real high-grade glioma volumes using the 3D Slicer extension was compared to similar studies performed on the BraTS dataset. The values were obtained as a mean of fivefold cross-validation. This comparison is shown in Table 2. The other DICE values are from Reference [17].

The means to combine the results of different classifiers were added to further expand the usability of the Supervised Segmentation Toolbox extension. Currently, logical AND and OR and a majority voting system are implemented (a minimal sketch follows Table 2). An addition of a Multiple Classifier System (MCS) is currently being considered. A review of the advantages of MCS is provided by Woźniak et al. [18]. Termenon and Graña [19] used a two-stage MCS where the second classifier was trained on low-confidence data obtained by training and analysis of the first classifier. In the future, the authors expect to implement additional classifiers as well. Adding a Relevance Vector Machine (RVM), for example, might bring an improvement over SVM [20].


Table 2. RF classifier comparison with similar studies. The classifier was trained using all 20 of the real high-grade glioma volumes, and the DICE value is a mean of fivefold cross-validation.

Paper | Approach | DICE
This paper | RF | 0.43
Geremia [21] | Spatial decision forests with intrinsic hierarchy | 0.32
Kapás [16] | RF | 0.58
Bauer [22] | Integrated hierarchical RF | 0.48
Zikic [23] | Context-sensitive features with a decision tree ensemble | 0.47
Festa [24] | RF using neighborhood and local context features | 0.50
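The combination rules mentioned above are simple voxelwise operations on the classifiers' binary output masks. A minimal sketch, assuming boolean mask arrays of equal shape (the function name is ours):

```python
# Minimal sketch of mask combination: logical AND, OR, and majority vote.
import numpy as np

def combine_masks(masks, rule="majority"):
    stack = np.stack(masks)          # shape: (n_classifiers, ...)
    if rule == "and":
        return np.all(stack, axis=0)
    if rule == "or":
        return np.any(stack, axis=0)
    # Majority vote: foreground where more than half of the classifiers agree.
    return np.count_nonzero(stack, axis=0) > stack.shape[0] / 2
```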

4. Conclusions

The Supervised Segmentation Toolbox extension was presented as an addition to the 3D Slicer extension library. This extension allows the user to train and use three types of classifiers, with more to be added in the future. The usability of the extension was demonstrated on a brain-tumor segmentation use case.

The effects of the training parameters of all classifiers on the final sensitivity and specificity of the classification were considered to provide an insight into usable parameter selection for future studies. For both SVM classifiers, a low γ in combination with softer margin terms resulted in a better-performing classifier. This might be largely due to a limited training sample, and a broader dataset should be analyzed in order to generalize the results. The RF classifier performed best using no added randomization, a relatively large tree count, and a small node size.

The possibility of reducing the training vector size in order to reduce model complexity and decrease classification time was verified. A 20-fold increase in the number of unique training samples resulted, at best, in a 2% increase in specificity.

All combinations of input images were considered as a training input for all classifiers, and the significance of adding more types of images was discussed. A combination of T1C and T2 images performed sufficiently for all classifiers. The addition of the FL image brought a slight improvement in sensitivity.

Lastly, the best-performing parameter combinations were listed and the corresponding results were compared. The RF classifier had the largest sensitivity and the worst specificity, and the C-SVM performed oppositely. The significance of these two metrics largely depends on the type of task for which the classifiers are used. All sensitivity and specificity data were obtained directly using the 3D Slicer extension.

Supplementary Materials: The authors publicly release the source code of the Supervised Segmentation Toolbox. The source code can be found on GitHub [13]. The BraTS 2012 dataset is available on the challenge's website.

Author Contributions: J.M. conceived and designed the experiments; D.C. performed the experiments, programmed the extension, analyzed the data, and wrote the paper.

Funding: This work was supported by the Ministry of Health of the Czech Republic, grant no. NV18-08-00459.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

BraTS   Brain Tumor Segmentation
FL      Fluid Attenuated Inversion Recovery
LGG     Low-Grade Glioma
MCS     Multiple Classifier System
OOB     Out-Of-Bag
RVM     Relevance Vector Machine
RF      Random Forest
SVM     Support Vector Machine

References

1. Fedorov, A.; Beichel, R.; Kalpathy-Cramer, J.; Finet, J.; Fillion-Robin, J.C.; Pujol, S.; Bauer, C.; Jennings, D.; Fennessy, F.; Sonka, M.; et al. 3D Slicer as an image computing platform for the Quantitative Imaging Network. Magn. Reson. Imaging 2012, 30, 1323–1341.
2. Mehrtash, A.; Sedghi, A.; Ghafoorian, M.; Taghipour, M.; Tempany, C.M.; Wells, W.M.; Kapur, T.; Mousavi, P.; Abolmaesumi, P.; Fedorov, A. Classification of Clinical Significance of MRI Prostate Findings Using 3D Convolutional Neural Networks. Proc. SPIE Int. Soc. Opt. Eng. 2017, 10134.
3. Zhang, Y.; Dong, Z.; Wang, S.; Ji, G.; Yang, J. Preclinical diagnosis of magnetic resonance (MR) brain images via discrete wavelet packet transform with Tsallis entropy and generalized eigenvalue proximal support vector machine (GEPSVM). Entropy 2015, 17, 1795–1813.
4. Zhang, Y.; Wang, S.; Phillips, P.; Dong, Z.; Ji, G.; Yang, J. Detection of Alzheimer's disease and mild cognitive impairment based on structural volumetric MR images using 3D-DWT and WTA-KSVM trained by PSOTVAC. Biomed. Signal Process. Control 2015, 21, 58–73.
5. Wang, S.; Li, Y.; Shao, Y.; Cattani, C.; Zhang, Y.; Du, S. Detection of dendritic spines using wavelet packet entropy and fuzzy support vector machine. CNS Neurol. Dis.-Drug Targets 2017, 16, 116–121.
6. Zhang, Y.; Yang, Z.; Lu, H.; Zhou, X.; Phillips, P.; Liu, Q.; Wang, S. Facial emotion recognition based on biorthogonal wavelet entropy, fuzzy support vector machine, and stratified cross validation. IEEE Access 2016, 4, 8375–8385.
7. Zheng, Q.; Wu, Y.; Fan, Y. Integrating semi-supervised and supervised learning methods for label fusion in multi-atlas based image segmentation. Front. Neuroinform. 2018, 12, 69.
8. Amiri, S.; Mahjoub, M.A.; Rekik, I. Tree-based Ensemble Classifier Learning for Automatic Brain Glioma Segmentation. Neurocomputing 2018, 313, 135–142.
9. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
10. Chang, C.C.; Lin, C.J. LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1–27.
11. Schölkopf, B.; Smola, A.J.; Williamson, R.C.; Bartlett, P.L. New support vector algorithms. Neural Comput. 2000, 12, 1207–1245.
12. Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; pp. 278–282.
13. Chalupa, D. Supervised Segmentation Toolbox for 3D Slicer. Source code available under the GNU General Public License v3.0. Available online: https://github.com/chalupaDaniel/slicerSupervisedSegmentation (accessed on 14 October 2018).
14. King, D.E. Dlib-ml: A machine learning toolkit. J. Mach. Learn. Res. 2009, 10, 1755–1758.
15. Igel, C.; Heidrich-Meisner, V.; Glasmachers, T. Shark. J. Mach. Learn. Res. 2008, 9, 993–996.
16. Kapás, Z.; Lefkovits, L.; Szilágyi, L. Automatic Detection and Segmentation of Brain Tumor Using Random Forest Approach. In Modeling Decisions for Artificial Intelligence; Springer: New York, NY, USA, 2016; pp. 301–312.
17. Menze, B.H.; Jakab, A.; Bauer, S.; Kalpathy-Cramer, J.; Farahani, K.; Kirby, J.; Burren, Y.; Porz, N.; Slotboom, J.; Wiest, R.; et al. The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Trans. Med. Imaging 2015, 34, 1993–2024.
18. Woźniak, M.; Graña, M.; Corchado, E. A survey of multiple classifier systems as hybrid systems. Inf. Fusion 2014, 16, 3–17.
19. Termenon, M.; Graña, M. A two stage sequential ensemble applied to the classification of Alzheimer's disease based on MRI features. Neural Process. Lett. 2012, 35, 1–12.
20. Tipping, M.E. Sparse Bayesian learning and the relevance vector machine. J. Mach. Learn. Res. 2001, 1, 211–244.
21. Geremia, E.; Menze, B.H.; Ayache, N. Spatial Decision Forests for Glioma Segmentation in Multi-Channel MR Images. MICCAI Chall. Multimodal Brain Tumor Segmentation 2012, 34.
22. Bauer, S.; Nolte, L.P.; Reyes, M. Fully automatic segmentation of brain tumor images using support vector machine classification in combination with hierarchical conditional random field regularization. In Medical Image Computing and Computer-Assisted Intervention—MICCAI; Springer: Berlin/Heidelberg, Germany, 2011.
23. Criminisi, A.; Zikic, D.; Glocker, B.; Shotton, J. Context-sensitive classification forests for segmentation of brain tumor tissues. Proc. MICCAI-BraTS 2012, 1, 1–9.
24. Festa, J.; Pereira, S.; Mariz, J.; Sousa, N.; Silva, C. Automatic brain tumor segmentation of multi-sequence MR images using random decision forests. Proc. NCI-MICCAI BRATS 2013, 1, 23–26.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).