
Article
Development of a Fully Automated Glioma-Grading Pipeline Using Post-Contrast T1-Weighted Images Combined with Cloud-Based 3D Convolutional Neural Network

Hiroto Yamashiro 1, Atsushi Teramoto 1,*, Kuniaki Saito 1 and Hiroshi Fujita 2

1 Graduate School of Health Sciences, Fujita Health University, 1-98 Dengakugakubo, Kutsukake, Toyoake 470-1192, Aichi, Japan; [email protected] (H.Y.); [email protected] (K.S.)
2 Faculty of Engineering, Gifu University, 1-1 Yanagido, Gifu 501-1194, Gifu, Japan; [email protected]
* Correspondence: [email protected]

Featured Application: The proposed grading pipeline, which combines a cloud-based pretrained 3D CNN and our original 3D CNN, is useful for the early treatment of patients and the prediction of their prognosis.

Abstract: Glioma is the most common type of brain tumor, and its grade influences its treatment policy and prognosis. Therefore, artificial-intelligence-based tumor grading methods have been studied. However, in most studies, two-dimensional (2D) analysis and manual tumor-region extraction were performed. Additionally, deep learning research that uses medical images experiences difficulties in collecting image data and preparing hardware, thus hindering its widespread use. Therefore, we developed a 3D convolutional neural network (3D CNN) pipeline for realizing a fully automated glioma-grading system by using the pretrained Clara segmentation model provided by NVIDIA and our original classification model. In this method, the brain tumor region was extracted using the Clara segmentation model, and the volume of interest (VOI) created using this extracted region was assigned to a grading 3D CNN and classified as either grade II, III, or IV. Through evaluation using 46 regions, the grading accuracy for all tumors was 91.3%, which was comparable to that of methods using multiple sequences. The proposed pipeline scheme may enable the creation of a fully automated glioma-grading pipeline in a single sequence by combining the pretrained 3D CNN and our original 3D CNN.

Keywords: brain tumor; magnetic resonance imaging; grading; convolutional neural network

1. Introduction
Glioma is a type of primary brain tumor and is the most common type of brain tumor. The grade of a glioma is given as an index of its malignancy, and it significantly influences its treatment policy and prognosis [1]. In clinical practice, the treatment policy for grade IV glioma is different from that of the other grades because it progresses rapidly and has a poor prognosis. Therefore, accurate diagnosis of the grade leads to accurate early treatment. Nevertheless, grading as a definite diagnosis cannot be performed without the pathological examination of the removed tumor tissue. Therefore, before the surgical operation, neurologists estimate the tumor grade using magnetic resonance imaging (MRI) findings, such as the presence or absence of the ring enhancement effect. However, these characteristics vary among patients, which makes diagnosis difficult [2–5]. Therefore, many researchers are trying to solve this problem of low accuracy using a convolutional neural network (CNN), which is one of the excellent image-analysis technologies. Yang et al. [6] classified the grade of glioma using fine-tuned GoogLeNet. Furthermore, Abd-Ellah et al. [7] proposed a glioma detection and grading system using a parallel deep CNN. In addition to gliomas, computer-aided diagnosis (CAD) systems for brain tumors using CNNs are being developed.



Abd El Kader et al. [8] proposed a differential deep CNN model to classify abnormal or normal MR brain images, and Díaz-Pernas et al. [9] proposed a multiscale CNN model for region extraction and classification of three types of brain tumors, including glioma, in post-contrast T1-weighted images. However, in these studies, 3D MR images were analyzed on a slice-by-slice basis. It is considered that a more accurate analysis would be possible using 3D images because tumors grow in multiple directions. Additionally, when the tumor regions were manually extracted, the variations in these regions influenced the grading accuracy. Therefore, these problems can be solved by applying automated tumor-region extraction and grading using 3D MR images. In fact, Chen et al. [10] developed an automatic CAD system for gliomas that combines automatic segmentation and radiomics. Through training and evaluation using the Multimodal Brain Tumor Segmentation Challenge 2015 (BraTS2015) dataset, the grading accuracy was observed to be 91.3%. Furthermore, Zhuge et al. [11] proposed an automated glioma-grading system using a 3D U-Net and a 3D CNN. Through training and evaluation using the BraTS2018 dataset, the grading accuracy was observed to be 97.1%.

Nevertheless, many of these studies, such as the method proposed by Chen et al. [10], required multi-sequence MR images as input, which leads to a reduction in the number of cases due to the lack of specific sequences and increases the computational cost by enlarging the input data size. In addition, a significant amount of data and a high computational cost are required for training a deep learning network [12].
However, in the case of using medical images, the amount of available data is limited owing to several challenges, including ethical problems and a lack of cooperation among hospitals [13,14]. Additionally, not everyone can install a high-performance machine that can withstand a substantial computational load [15,16]. Therefore, fine-tuning a model pretrained on natural images contributes toward reducing the amount of medical image data required as well as the computational cost [17,18]. However, it is difficult to adapt such a model to 3D images, such as brain MR images. Therefore, NVIDIA is attempting to solve this problem by providing models trained on various medical images in the Clara project [19]. A grading system can be easily developed using the project's brain tumor segmentation model for single-sequence MR images. Therefore, in this study, we developed a fully automated glioma-grading pipeline for post-contrast T1-weighted images using two 3D CNNs: the pretrained model for tumor region extraction and our original model for grading.

The outline of the proposed method is depicted in Figure 1. First, a post-contrast T1-weighted image was fed to a 3D CNN for brain tumor region extraction. The extracted tumor region was then fed to the other 3D CNN for grading and, thus, the glioma grade was obtained. So far, it has been difficult to develop a fully automated pipeline that extracts and grades glioma regions from 3D images due to problems such as calculation cost. In contrast, the proposed method achieves segmentation and grading with a low computational cost by using a pretrained 3D CNN for the segmentation task, which is computationally expensive to train.
Furthermore, by using only one type of MR image, the proposed method achieves the same or better performance than methods using multiple sequences, which are computationally expensive.

Figure 1. Outline of the proposed method. The processing in the method is fully automated, and there is no manual extraction of the tumor region or creation of the VOI.
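As a concrete illustration of the flow in Figure 1, the following Python sketch chains the two stages; the function names (segment_tumor, crop_voi, grading_cnn) are hypothetical placeholders, not the authors' code.

```python
def run_grading_pipeline(t1ce_volume, segment_tumor, crop_voi, grading_cnn):
    """Two-stage pipeline of Figure 1 (illustrative names, not the authors' code).

    t1ce_volume : 3D array, post-contrast T1-weighted MR volume
    segment_tumor : callable returning a binary tumor mask (pretrained Clara model)
    crop_voi : callable cropping a cubic VOI around the extracted region
    grading_cnn : callable returning probabilities for [grade II/III, grade IV]
    """
    mask = segment_tumor(t1ce_volume)   # stage 1: tumor-region extraction
    voi = crop_voi(t1ce_volume, mask)   # build the VOI from the extracted region
    probs = grading_cnn(voi)            # stage 2: grading 3D CNN with softmax output
    return "grade IV" if probs[1] >= 0.5 else "grade II/III"
```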

2. Materials and Methods

2.1. Image Dataset
We used the training dataset of the Multimodal Brain Tumor Segmentation Challenge 2018 (BraTS2018) [20–22]. This dataset included MR images of 285 cases of pathologically proven WHO grade II/III or grade IV glioma (grades II and III: 75 cases; grade IV: 210 cases). This dataset did not contain detailed information regarding the cancer type of the glioma, and only information about the grade was provided. Each case included four sequences (a T1-weighted image, a post-contrast T1-weighted image, a T2-weighted image, and a FLAIR image) and a ground truth. The ground truth comprised three labels (manually revised by neuroradiologists): the contrast-enhanced region, the necrotic and non-enhanced region, and the edema region. The region comprising the contrast-enhanced region and the necrotic and non-enhanced region was used as the ground truth of the brain tumor region to target only the substantial tumor region. Additionally, the image size was adjusted to 240 × 240 × 155 voxels with 1 mm³ per voxel, and parts other than the brain parenchyma, such as the skull, were removed from the images. The details of the dataset are provided in [20–22].
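For reference, merging the ground-truth labels into the "substantial tumor" target described above can be written as a short NumPy sketch. It assumes the standard BraTS label convention (1 = necrotic/non-enhancing core, 2 = edema, 4 = contrast-enhancing tumor), which is our assumption rather than something stated in the paper.

```python
import numpy as np

def substantial_tumor_mask(gt_labels: np.ndarray) -> np.ndarray:
    # Keep the contrast-enhanced (4) and necrotic/non-enhanced (1) labels and
    # exclude the edema label (2), per the ground-truth definition above.
    return np.isin(gt_labels, (1, 4)).astype(np.uint8)
```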

2.2. Region Extraction
For extraction of the brain tumor region, we used a 3D CNN provided by the Clara project developed by NVIDIA [19]. In the Clara project, NVIDIA is attempting to create and provide trained models for various tasks to popularize medical artificial intelligence (AI); in addition, high-performance machines manufactured by NVIDIA, such as the NVIDIA DGX-1 server, are being used. In this study, we used a model that was trained on the previously mentioned BraTS2018 dataset. The voxel value x of the training data was normalized (Z-score) by using Equation (1), where µ and σ denote the mean and standard deviation of the voxel values, respectively [23].

$$x' = \frac{x - \mu}{\sigma} \qquad (1)$$

Furthermore, the data were augmented to prevent overfitting. For data augmentation, the voxel values were randomly shifted by −0.1 to 0.1 times the standard deviation of the voxel values in the images, and the scale of the images was randomly changed between 90% and 110% of the original images. Additionally, the images were randomly flipped around the X-, Y-, and Z-axes. During the training period, 224 × 224 × 128 voxels, which sufficiently contained the brain parenchyma and tumor regions, were cropped from the images to reduce the data size and were fed to the 3D CNN for extracting the brain tumor region.

The architecture of the 3D CNN used for region extraction is shown in Figure 2. The network comprises ResNet-like blocks. Each ResNet-like block comprises two sets of group-normalization layers, the rectified linear unit (ReLU) function, and a convolutional layer. Because of the limited memory, the batch size was set to 1, and group normalization was adopted. In the group-normalization layer, the data obtained from the previous layer were normalized and fed to the convolutional layer through the ReLU function. In the convolutional layer, convolution was performed using filters with a kernel size of 3 × 3 × 3. The number of epochs was 460, and Adam was adopted as the optimization algorithm [24]. The initial learning rate α was 1 × 10⁻⁴, and it decayed according to Equation (2).

$$\alpha = 1 \times 10^{-4} \times \left(1 - \frac{e}{N_e}\right)^{0.9} \qquad (2)$$

where e denotes the epoch counter and $N_e$ is the total number of epochs.
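A minimal sketch of these two training rules, under a straightforward reading of Equations (1) and (2), is given below; it is illustrative and not the Clara implementation.

```python
import numpy as np

def z_score_normalize(volume: np.ndarray) -> np.ndarray:
    """Equation (1): zero-mean, unit-variance scaling of the voxel values."""
    return (volume - volume.mean()) / volume.std()

def learning_rate(epoch: int, n_epochs: int = 460, alpha0: float = 1e-4) -> float:
    """Equation (2): polynomial decay of the learning rate toward zero."""
    return alpha0 * (1.0 - epoch / n_epochs) ** 0.9
```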

Figure 2. 3D CNN architecture used for extracting the brain tumor region. This model was provided by the Clara project developed by NVIDIA. It comprises ResNet-like blocks and skip connections.

This network was approximately divided into three parts: an encoder, a decoder, and a variational autoencoder (VAE). The ResNet-like blocks were connected by multiple layers. In the encoder part, each time two blocks are passed, the kernel size remains the same, and the stride is set to 2 to double the image features and halve the image size. At the end, the encoder part branches to the decoder and VAE parts, and in the decoder part, the extracted image features are halved, and the image size is doubled for upsampling. Finally, the tumor region is extracted with the same image size as that of the input image. The VAE part is a network that reconstructs an image using the extracted image features and a Gaussian distribution; it regularizes the encoder part during the training period. Skip connections are provided within the encoder or between the encoder and decoder to facilitate backpropagation during the training period. The NVIDIA Tesla V100 32 GB GPU was used for training this model, which was implemented using the TensorFlow AI platform.

2.3. Tumor Grading
The tumors were graded using the original 3D CNN model designed by us. In this model, grading was performed on the volume of interest (VOI) that was cropped according to the size of the tumor region. The VOIs were cubical, and the size to be extracted was taken as twice the maximum side of the circumscribed rectangle of the tumor; this was done to consider the information regarding the surrounding brain parenchyma. Data augmentation was performed to prevent overfitting when training the 3D CNN for grading.
All the VOIs were laterally flipped, and lateral and cephalocaudal rotations were performed between −10° and 10°. Because a gap existed in the number of cases between grade II/III and grade IV tumors, the rotation interval was adjusted to 5° for grades II and III and 10° for grade IV to balance the training data. Consequently, the number of training images was 3200 VOIs for grades II and III and 3402 VOIs for grade IV. Additionally, the images were distorted by randomly moving the coordinates of the eight apexes of each VOI along the lateral, cephalocaudal, and anteroposterior directions; the maximum expansion and contraction distances were 10% of the matrix size on each side of the VOIs. The VOIs were then resized to 64 × 64 × 64 voxels and, after normalization, were fed to the 3D CNN for grading. The voxel values were normalized by dividing each voxel value of the VOI by the maximum voxel value of all VOIs in the dataset. Here, the maximum value was calculated using the denoised VOIs obtained by applying a median filter to the original images.
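The VOI preparation described above could look like the following sketch (SciPy's ndimage.zoom is used for the 64 × 64 × 64 resize); the boundary handling and the precomputed dataset-wide maximum are simplifying assumptions on our part.

```python
import numpy as np
from scipy import ndimage

def crop_and_prepare_voi(volume, tumor_mask, dataset_max, out_size=64):
    """Crop a cubic VOI twice the tumor bounding box's longest side, resize it
    to out_size^3 voxels, and scale by the dataset-wide maximum (a sketch of
    Section 2.3 with simplified handling of the volume borders)."""
    zs, ys, xs = np.nonzero(tumor_mask)
    center = [(zs.min() + zs.max()) // 2,
              (ys.min() + ys.max()) // 2,
              (xs.min() + xs.max()) // 2]
    half = max(zs.ptp(), ys.ptp(), xs.ptp()) + 1        # VOI side = 2 * longest side
    slices = tuple(slice(max(c - half, 0), min(c + half, dim))
                   for c, dim in zip(center, volume.shape))
    voi = volume[slices]
    zoom = [out_size / s for s in voi.shape]
    voi = ndimage.zoom(voi, zoom, order=1)              # linear resize to 64^3
    return voi / dataset_max                            # max from median-filtered VOIs
```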


The architecture of the 3D CNN model used for grading is shown in Figure 3. This model is a 3D extension of ResNet and comprises residual blocks [25]. Each residual block comprises three sets of batch-normalization layers, the ReLU function, and a convolutional layer. Additionally, the skip connections in the residual blocks reduce the vanishing gradient. Convolution was performed in the convolutional layers using filters with a kernel size of 1 × 1 × 1 or 3 × 3 × 3 and a stride of 1, following which image features were extracted. A filter with a kernel size of 1 × 1 × 1 was used to reduce the dimension, and subsequent filters with kernel sizes of 3 × 3 × 3 and 1 × 1 × 1 were used to restore the dimension. This bottleneck architecture prevents the degradation of computational efficiency due to the multi-layered structure [26]. The first residual block has max pooling just before the first convolution, where the image size was halved. Furthermore, each time three or four residual blocks were passed, the stride was changed to 2, and the image size was halved. Finally, in the fully connected layer, these image features were integrated, and the grading result of the tumor was obtained using the softmax function. The batch size was 16, the number of epochs was 30, and SGD (learning rate = 1 × 10⁻⁴, momentum = 0.9) was adopted as the optimization algorithm [27]. During the training period, 20% of the training dataset was randomly selected and used for validation. The NVIDIA GeForce RTX 2080 Ti 11 GB GPU was used to train this model, which was implemented using the Keras and TensorFlow AI platforms.
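A sketch of one such bottleneck residual block in Keras is shown below. It follows the description above (batch normalization, ReLU, then convolution, with 1 × 1 × 1 / 3 × 3 × 3 / 1 × 1 × 1 kernels and a skip connection) but reflects our reading of the design, not the authors' exact code.

```python
from tensorflow.keras import layers

def bottleneck_block_3d(x, filters: int, stride: int = 1):
    """3D bottleneck residual block: each of the three stages applies batch
    normalization, ReLU, and a convolution; the 1x1x1 convolutions shrink and
    restore the channel dimension around the 3x3x3 convolution."""
    shortcut = x
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv3D(filters, 1, strides=stride, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv3D(filters, 3, strides=1, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv3D(4 * filters, 1, strides=1, padding="same")(y)
    # Project the shortcut when the spatial size or channel count changes.
    if stride != 1 or x.shape[-1] != 4 * filters:
        shortcut = layers.Conv3D(4 * filters, 1, strides=stride, padding="same")(x)
    return layers.Add()([y, shortcut])
```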

Figure 3. 3D CNN architecture used for grading. This model is an extension of ResNet, and it was designed and trained by us.

2.4. Evaluation Metrics
To determine the usefulness of this method, evaluation was performed using the holdout method on 42 cases (grades II and III: 12 cases; grade IV: 30 cases) of the BraTS2018 dataset that were not used for training. The tumor region was extracted from the 42 cases previously mentioned, although there were 4 cases with 2 lesions. Therefore, grading was performed on 13 VOIs of grade II/III and 33 VOIs of grade IV.

The Dice similarity coefficient (DSC) was used to evaluate the tumor-region-extraction accuracy [28]. It is used to quantitatively evaluate the similarity between two sets and is expressed by Equation (3), where A and B denote the results of tumor-region extraction and the ground truth, respectively.

$$\mathrm{DSC} = \frac{2|A \cap B|}{|A| + |B|} \qquad (3)$$
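Equation (3) translates directly into a few lines of NumPy; the sketch below assumes binary masks of identical shape.

```python
import numpy as np

def dice_similarity(pred: np.ndarray, truth: np.ndarray) -> float:
    """Equation (3): DSC = 2|A ∩ B| / (|A| + |B|) for binary 3D masks."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    return 2.0 * np.logical_and(pred, truth).sum() / (pred.sum() + truth.sum())
```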

In addition, accuracy, sensitivity, and specificity were used to evaluate the grading performance. They are expressed by Equations (4)–(6), which are defined as:

$$\mathrm{Accuracy} = \frac{\text{VOIs classified correctly}}{\text{All VOIs}} \qquad (4)$$

$$\mathrm{Sensitivity} = \frac{\text{grade IV VOIs classified correctly}}{\text{All grade IV VOIs}} \qquad (5)$$

$$\mathrm{Specificity} = \frac{\text{grade II and III VOIs classified correctly}}{\text{All grade II and III VOIs}} \qquad (6)$$

These metrics were used for the performance evaluation. Furthermore, the receiver operating characteristic (ROC) curve was created by changing the classification threshold on the probability of grades II and III or grade IV from 0 to 1, and the area under the curve (AUC) was calculated [29].

3. Evaluation Results
In the automated extraction of tumor regions from the evaluation cases, the tumors were detected in all of the cases. The average similarity index (DSC) between the tumor-region extraction results and the ground-truth images was 0.839. Figures 4 and 5 show examples with high and low similarity indices, respectively, in the extraction results obtained using the proposed method. No training time was needed because a pretrained model was used as the 3D CNN for tumor region extraction; however, the training time had been 2.87 days over 460 epochs during the development at NVIDIA, and the inference time was approximately 4 s per case. In evaluating the grading, the accuracy and AUC were observed to be 91.3% and 0.927, respectively. The confusion matrix of the grading and the processing results are presented in Tables 1 and 2. Furthermore, the ROC curve, the AUC, and examples of tumors correctly classified and misclassified by our proposed method are shown in Figures 6 and 7. For grades II and III, 69.2% of the lesions were correctly classified, and for grade IV, all the lesions were correctly classified.
The training time of the 3D CNN for grading was 27.1 min, and the inference time was 0.11 s for all 46 VOIs.
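For completeness, Equations (4)–(6) and the AUC from Section 2.4 can be computed as in the sketch below (scikit-learn is used for the AUC); grade IV is treated as the positive class, and the names are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def grading_metrics(y_true, p_grade4, threshold=0.5):
    """Equations (4)-(6) plus the AUC. y_true: 1 = grade IV, 0 = grade II/III;
    p_grade4: predicted probability of grade IV per VOI. Illustrative only."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(p_grade4) >= threshold).astype(int)
    accuracy = (y_pred == y_true).mean()
    sensitivity = (y_pred[y_true == 1] == 1).mean()   # recall on grade IV VOIs
    specificity = (y_pred[y_true == 0] == 0).mean()   # recall on grade II/III VOIs
    auc = roc_auc_score(y_true, p_grade4)
    return accuracy, sensitivity, specificity, auc
```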

Figure 4. Example of a case with a high similarity index. (a) Input image, (b) ground-truth image, (c) output image, and DSC values calculated in the volume data.


Figure 5. Example of a case with a low similarity index. (a) Input image, (b) ground-truth image, (c) output image, and DSC values calculated in the volume data.

Table 1. Confusion matrix of the grading result.

|                     | Estimated Grade II/III | Estimated Grade IV |
| Actual grade II/III | 9                      | 4                  |
| Actual grade IV     | 0                      | 33                 |

Table 2. Processing result of the grading.

|                 | Accuracy (%) | Sensitivity (%) | Specificity (%) |
| Proposed method | 91.3         | 100             | 69.2            |

Figure 6. ROC curve and AUC for the proposed method.


Figure 7. Examples of input images and the corresponding grading results. The blue outlined frames are correctly classified, and the red outlined frames are incorrectly classified.

4. Discussion
The DSC value was 0.839 for the tumor-region extraction, and the extraction was highly accurate even in a single sequence. Additionally, because the matrix sizes of the VOIs were defined as twice the tumor diameter on the basis of this output result, VOIs that sufficiently contained the tumor region and the surrounding brain parenchyma could be created. Additionally, as shown in Figure 7, some cases with the ring enhancement effect, a typical characteristic of high-grade glioma, were correctly extracted. However, in the cases without the ring enhancement effect, as shown in Figure 7, the peripheral edema region was incorrectly extracted as the tumor region, and thus, the similarity index tended to be low. Therefore, the 3D CNN used for tumor-region extraction was mainly trained to recognize the ring enhancement effect as the edge of the tumor region. The pretrained model of the Clara project was trained on MR images of glioma patients, and parameter tuning had already been performed. By using this model, it was possible to eliminate the large amount of time required for training and parameter tuning.

Regarding tumor grading, the accuracy, sensitivity, and specificity were 91.3%, 100%, and 69.2%, respectively. Table 3 shows a comparison of our approach with other approaches in terms of the methods and results. From Table 3, it is observed that the method of Yang et al. classified the grade by processing one slice of a single sequence; however, the accuracy was 94.5%, which was higher than that of our proposed method. In addition, the fact that they used a different database from that of the other studies makes simple comparisons difficult.
Furthermore, this was not a fully automated grading system because tumor detection and region extraction were performed manually. Subsequently, we focused on studies adapting fully automated 3D processing, including tumor region extraction and grading, similar to our study. The sensitivity of this method (100%) was higher than that of the method proposed by Zhuge et al. (94.7%), which shows that all high-grade tumors were correctly classified. Therefore, it is considered that grade IV tumors can be classified preoperatively with 100% accuracy by this method, leading to the early treatment of patients and the prediction of prognosis. In addition, the accuracy of our method (91.3%) was lower than that of the method proposed by Zhuge et al. (97.1%) but comparable to that of Chen et al. (91.3%). While these two methods adopted multi-sequence processing, our method uses a single sequence. From these results, it was confirmed that a single sequence can be used for grading with high accuracy comparable to that of multi-sequence methods. Therefore, this method can create a highly accurate grade classification pipeline while preventing the occurrence of unusable cases and the increase in calculation cost due to the lack of a specific sequence. However, four VOIs (31%) of grades II and III were misclassified, and the cases with the ring enhancement effect tended to be misclassified as belonging to grade IV (see Figure 7). This may be because the ring enhancement effect is a typical characteristic of high-grade glioma.

However, grade II and III tumors without the ring enhancement effect were also misclassified as grade IV; therefore, some features other than the ring enhancement effect might also have contributed to the incorrect classification results. Furthermore, multi-sequence images are used for preoperative grading in actual clinical settings; additionally, a definitive postoperative diagnosis is performed by obtaining multilateral information based on the results of both MRI and pathological examinations. Nevertheless, our evaluation results confirmed that one can perform grading with high accuracy using this method by considering only the post-contrast T1-weighted images.

Table 3. Comparison of the proposed approach with other approaches.

| Authors | Methods | Region Extraction | Dimension/Sequence | Dataset (Cases) | Accuracy (%) | Sensitivity (%) | Specificity (%) |
| Yang et al. [6] | 2D CNN | Manual | 2D/single sequence | ClinicalTrials.gov (113 cases) | 94.5 | - | - |
| Chen et al. [10] | 3D CNN/Radiomics | Automated | 3D/multi-sequence | BraTS2015 (274 cases) | 91.3 | - | - |
| Zhuge et al. [11] | 3D CNN | Automated | 3D/multi-sequence | BraTS2018+TCIA (289 cases) | 97.1 | 94.7 | 96.8 |
| Proposed method | 3D CNN | Automated | 3D/single sequence | BraTS2018 (285 cases) | 91.3 | 100 | 69.2 |

In both the tumor-region extraction and grading tasks, the small number of grade II and III cases might have decreased the accuracy. Therefore, the limitation of this study is the small number of grade II and III cases. Accordingly, we must evaluate more cases in the future.

5. Conclusions
We developed a fully automated glioma-grading pipeline using the segmentation model of the NVIDIA Clara project and our original 3D CNN model. In doing so, a fully automated grading system using 3D image analysis was developed to analyze 3D MR images accurately. The two contributions of our study are the incorporation of a pretrained model for tumor region extraction into a grading pipeline and processing using only a single sequence. When the tumor region is extracted from a 3D image, it is usually necessary to prepare a large number of medical images and a high-performance machine that can withstand a substantial computational cost; however, using the pretrained model of the Clara project solved these problems. In fact, in this study, we were able to incorporate a high-performance tumor region extraction model without consuming the time required for training and parameter tuning, and we were able to examine the grading part intensively. Furthermore, in this study, only images of a single sequence were required for grading, and the classification accuracy was comparable to that of previous studies using multiple sequences. This reduced the computational cost and increased the number of applicable patients. These results indicate that a fully automated glioma-grading pipeline may be created in a single sequence by combining a cloud-based pretrained 3D CNN and our original 3D CNN. In the future, we plan to conduct external verification using databases other than BraTS2018 and finally apply this method to predict IDH status (mutation or wild type).

Author Contributions: Conceptualization, H.Y., A.T., K.S., and H.F.; Formal analysis, H.Y. and A.T.; Methodology, H.Y., A.T., and H.F.; Software, H.Y. and A.T.; Writing—original draft preparation, H.Y.

and A.T.; Writing—review & editing, H.Y., A.T., K.S., and H.F.; Funding acquisition, A.T. All authors have read and agreed to the published version of the manuscript.

Funding: This research received no external funding.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.

Data Availability Statement: Datasets released to the public were analyzed in this study. The dataset can be found in the BraTS 2018 dataset: https://www.med.upenn.edu/sbia/brats2018/data.html (accessed on 31 May 2021).

Conflicts of Interest: The authors declare no conflict of interest.

References
1. Caulo, M.; Panara, V.; Tortora, D.; Mattei, P.A.; Briganti, C.; Pravatà, E.; Salice, S.; Cotroneo, A.R.; Tartaro, A. Data-driven grading of brain gliomas: A multiparametric MR imaging study. Radiology 2014, 272, 494–503. [CrossRef]
2. Barker, F.G.; Chang, S.M.; Huhn, S.L.; Davis, R.L.; Gutin, P.H.; McDermott, M.W.; Wilson, C.B.; Prados, M.D. Age and the risk of anaplasia in magnetic resonance—Nonenhancing supratentorial cerebral tumors. Cancer 1997, 80, 936–941. [CrossRef]
3. Ginsberg, L.E.; Fuller, G.N.; Hashmi, M.; Leeds, N.E.; Schomer, D.F. The significance of lack of MR contrast enhancement of supratentorial brain tumors in adults: Histopathological evaluation of a series. Surg. Neurol. 1998, 49, 436–440. [CrossRef]
4. Scott, J.N.; Brasher, P.M.A.; Sevick, R.J.; Rewcastle, N.B.; Forsyth, P.A. How often are nonenhancing supratentorial gliomas malignant? A population study. Neurology 2002, 59, 947–949. [CrossRef]
5. Kondziolka, D.; Lunsford, L.D.; Martinez, A.J. Unreliability of contemporary neurodiagnostic imaging in evaluating suspected adult supratentorial (low-grade) astrocytoma. J. Neurosurg. 1993, 79, 533–536. [CrossRef] [PubMed]
6. Yang, Y.; Yan, L.F.; Zhang, X.; Han, Y.; Nan, H.Y.; Hu, Y.C.; Hu, B.; Yan, S.L.; Zhang, J.; Cheng, D.L.; et al. Glioma grading on conventional MR images: A deep learning study with transfer learning. Front. Neurosci. 2018, 12, 804. [CrossRef]
7. Abd-Ellah, M.K.; Awad, A.I.; Hamed, H.F.A.; Khalaf, A.A.M. Parallel deep CNN structure for glioma detection and classification via brain MRI images. In Proceedings of the International Conference on Microelectronics, ICM, Cairo, Egypt, 15–18 December 2019; pp. 304–307. [CrossRef]
8. Abd El Kader, I.; Xu, G.; Shuai, Z.; Saminu, S.; Javaid, I.; Salim Ahmad, I. Differential deep convolutional neural network model for brain tumor classification. Brain Sci. 2021, 11, 352. [CrossRef]
9. Díaz-Pernas, F.J.; Martínez-Zarzuela, M.; Antón-Rodríguez, M.; González-Ortega, D. A deep learning approach for brain tumor classification and segmentation using a multiscale convolutional neural network. Healthcare 2021, 9, 153. [CrossRef] [PubMed]
10. Chen, W.; Liu, B.; Peng, S.; Sun, J.; Qiao, X. Computer-aided grading of gliomas combining automatic segmentation and radiomics. Int. J. Biomed. Imaging 2018, 2018, 2512037. [CrossRef] [PubMed]
11. Zhuge, Y.; Ning, H.; Mathen, P.; Cheng, J.Y.; Krauze, A.V.; Camphausen, K.; Miller, R.W. Automated glioma grading on conventional MRI images using deep convolutional neural networks. Med. Phys. 2020, 47, 3044–3053. [CrossRef]
12. Wang, G. A perspective on deep imaging. IEEE Access 2016, 4, 8914–8924. [CrossRef]
13. Teramoto, A.; Tsukamoto, T.; Yamada, A.; Kiriyama, Y.; Imaizumi, K.; Saito, K.; Fujita, H. Deep learning approach to classification of lung cytological images: Two-step training using actual and synthesized images by progressive growing of generative adversarial networks. PLoS ONE 2020, 15, e0229951. [CrossRef]
14. Onishi, Y.; Teramoto, A.; Tsujimoto, M.; Tsukamoto, T.; Saito, K.; Toyama, H.; Imaizumi, K.; Fujita, H. Multiplanar analysis for pulmonary nodule classification in CT images using deep convolutional neural network and generative adversarial networks. Int. J. Comput. Assist. Radiol. Surg. 2020, 15, 173–178. [CrossRef]
15. Tajbakhsh, N.; Shin, J.Y.; Gurudu, S.R.; Hurst, R.T.; Kendall, C.B.; Gotway, M.B.; Liang, J. Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Trans. Med. Imaging 2016, 35, 1299–1312. [CrossRef] [PubMed]
16. Beluch, W.H.; Genewein, T.; Nürnberger, A.; Köhler, J.M. The power of ensembles for active learning in image classification. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9368–9377. [CrossRef]
17. Teramoto, A.; Yamada, A.; Kiriyama, Y.; Tsukamoto, T.; Yan, K.; Zhang, L.; Imaizumi, K.; Saito, K.; Fujita, H. Automated classification of benign and malignant cells from lung cytological images using deep convolutional neural network. Inform. Med. Unlocked 2019, 16, 100205. [CrossRef]
18. He, K.; Girshick, R.; Dollár, P. Rethinking ImageNet pre-training. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 4917–4926. [CrossRef]
19. Myronenko, A. 3D MRI brain tumor segmentation using autoencoder regularization. In Proceedings of the International MICCAI Brainlesion Workshop, Granada, Spain, 16 September 2018; pp. 311–320. [CrossRef]
20. Bakas, S.; Akbari, H.; Sotiras, A.; Bilello, M.; Rozycki, M.; Kirby, J.S.; Freymann, J.B.; Farahani, K.; Davatzikos, C. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci. Data 2017, 4, 170117. [CrossRef] [PubMed]

21. Bakas, S.; Reyes, M.; Jakab, A.; Bauer, S.; Rempfler, M.; Crimi, A.; Shinohara, R.T.; Berger, C.; Ha, S.M.; Rozycki, M.; et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BRATS Challenge. arXiv 2018, arXiv:1811.02629.
22. Menze, B.H.; Jakab, A.; Bauer, S.; Kalpathy-Cramer, J.; Farahani, K.; Kirby, J.; Burren, Y.; Porz, N.; Slotboom, J.; Wiest, R.; et al. The multimodal brain tumor image segmentation benchmark (BRATS). IEEE Trans. Med. Imaging 2015, 34, 1993–2024. [CrossRef] [PubMed]
23. Uddin, M.Z.; Hassan, M.M. Activity recognition for cognitive assistance using body sensors data and deep convolutional neural network. IEEE Sens. J. 2019, 19, 8413–8419. [CrossRef]
24. Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR, San Diego, CA, USA, 7–9 May 2015. [CrossRef]
25. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [CrossRef]
26. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 630–645. [CrossRef]
27. Robbins, H.; Monro, S. A stochastic approximation method. Ann. Math. Stat. 1951, 22, 400–407. [CrossRef]
28. Dice, L.R. Measures of the amount of ecologic association between species. Ecology 1945, 26, 297–302. [CrossRef]
29. Bradley, A.P. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 1997, 30, 1145–1159. [CrossRef]