www.nature.com/scientificreports

OPEN

Machine learning based differentiation of glioblastoma from brain metastasis using MRI derived radiomics

Sarv Priya1*, Yanan Liu2, Caitlin Ward3, Nam H. Le2, Neetu Soni1, Ravishankar Pillenahalli Maheshwarappa1, Varun Monga4, Honghai Zhang2, Milan Sonka2 & Girish Bathla1

Few studies have addressed radiomics based differentiation of Glioblastoma (GBM) and intracranial metastatic disease (IMD). However, the effect of different tumor masks, comparison of single versus multiparametric MRI (mp-MRI) or select combination of sequences remains undefined. We cross-compared multiple radiomics based machine learning (ML) models using mp-MRI to determine optimized configurations. Our retrospective study included 60 GBM and 60 IMD patients. Forty-five combinations of ML models and feature reduction strategies were assessed for features extracted from whole tumor and edema masks using mp-MRI [T1W, T2W, T1-contrast enhanced (T1-CE), ADC, FLAIR], individual MRI sequences and combined T1-CE and FLAIR sequences. Model performance was assessed using receiver operating characteristic curve. For mp-MRI, the best model was LASSO model fit using full feature set (AUC 0.953). FLAIR was the best individual sequence (LASSO-full feature set, AUC 0.951). For combined T1-CE/FLAIR sequence, adaBoost-full feature set was the best performer (AUC 0.951). No significant difference was seen between top models across all scenarios, including models using FLAIR only, mp-MRI and combined T1-CE/FLAIR sequence. Top features were extracted from both the whole tumor and edema masks. Shape sphericity is an important discriminating feature.

Glioblastoma (GBM) and intracranial metastatic disease (IMD) together constitute the vast majority of malignant brain neoplasms1,2. Gliomas account for about 25.5% of all primary brain and other CNS tumors and approximately 80.8% of primary malignant brain tumors.
Of these, GBM is the most common, accounting for over half of the gliomas (57.3%) with an annual age-adjusted incidence rate of 3.22 per 100,000 population in the United States3. IMD, on the other hand, has an incidence rate of approximately 10 per 100,000 population and is more common than GBM1. The distinction between GBM and IMD is important since it has diagnostic, therapeutic and prognostic implications2,4,5.

Histopathological tissue confirmation is considered the gold standard for diagnosis, but is not always optimal, with misdiagnosis and under grading of tumors reported in 9.2 and 28% of lesions respectively6. The reported biopsy complication rate varies between 6 and 12% with a mortality rate of 0–1.7%7.

On conventional imaging, factors such as multiplicity of lesions, morphology, cerebellar localization and known history of underlying primary cancer can be helpful to differentiate IMD from GBM1,2. However, brain metastases may present as a solitary lesion in approximately half of the patients or be associated with undiagnosed systemic malignancy in about 15–30%8,9. Thus, conventional imaging alone may be insufficient for accurate classification. Prior studies have used advanced MRI techniques such as perfusion imaging10–12, spectroscopy, diffusion-weighted and tensor imaging13, new diffusion-weighted techniques like neurite orientation dispersion and density imaging14, and more recently other advanced sequences like non-contrast inflow-based vascular-space-occupancy MR imaging15 to distinguish between these entities, with variable success16–20. However, these advanced imaging sequences are not performed universally, and conventional imaging is still the mainstay in clinical practice.

Radiomics is a technique applied to medical images to extract quantitative features invisible to the human eye21. These features may provide a complementary tool for the expert human reader. Radiomic features have been employed in multiple prior studies for tumor grading, classification and prognosis21–24. The advantage of radiomics is that it can be applied to routinely acquired conventional clinical images25.

The application of radiomics based machine learning techniques (MLT) to differentiate GBM from IMD has only been explored in a few prior studies, mostly using limited MRI sequences and MLT1,2,4,26–29. Whether using one, a few, or all conventional MRI sequences (T1 WI, T2 WI, ADC, FLAIR and T1-CE) is superior, as well as the impact of feature reduction and the type of machine learning model, remains largely unexplored. In this study, we aimed to determine the optimal radiomics based MLT for this specific two-class problem using routinely available conventional MRI sequences.

1Department of Radiology, University of Iowa Hospital and Clinics, 200 Hawkins Drive, Iowa City, IA 52242, USA.
2College of Engineering, University of Iowa, Iowa City, IA, USA.
3Department of Biostatistics, University of Iowa, Iowa City, IA, USA.
4Department of Medicine, University of Iowa Hospitals and Clinics, Iowa City, IA, USA.
*email: [email protected]

Scientific Reports | (2021) 11:10478 | https://doi.org/10.1038/s41598-021-90032-w

                          GBM        Metastases
Patients (120)            60         60; Breast (20); Lung (40)
Age, years (mean ± SD)    62 ± 11    62 ± 10
Gender
  Male                    36         27
  Female                  24         33
Localization
  Supratentorial          58         Breast (10); Lung (25)
  Infratentorial          2          Breast (6); Lung (8)
  Both                    0          Breast (4); Lung (7)
Multiplicity
  Single                  53         Breast (12); Lung (24)
  Two                     5          Breast (2); Lung (7)
  ≥ Two (Multiple)        2          Breast (6); Lung (9)
Necrosis
  Yes                     59         Breast (12); Lung (24)
  No                      1          Breast (8); Lung (16)

Table 1. Patient demographics and tumor characteristics. GBM: glioblastoma.

Results

Patient characteristics. There were 120 patients (males 63, females 57) in the study population (GBM 60, metastases 60). The majority of metastatic tumors were from lung cancer (40) followed by breast cancer (20).
The demographic and tumor characteristics are provided in Table 1.

Model performance on mp-MRI. Using mp-MRI, the two best performing models were the LASSO (least absolute shrinkage and selection operator) and elastic net fit to the full feature set. The LASSO classifier had a mean cross-validated area under the curve (AUC) of 0.953 and the elastic net classifier had a mean cross-validated AUC of 0.952. Figure 1 displays the mean cross-validated AUC for all 45 MLT combinations fit using all sequences.

Model performance on individual sequences. For the models fit to each sequence separately, the LASSO and elastic net fit to the full feature set were again the top performing models, with both being fit to the FLAIR sequence. Interestingly, seven of the top 10 best performing sequence-specific models were derived from the FLAIR sequence. The LASSO classifier on the FLAIR sequence had a mean cross-validated AUC of 0.951 while the elastic net classifier had a mean cross-validated AUC of 0.948. Figure 2 shows the mean AUC for all models fit using the FLAIR sequence, as many of the top performing individual sequence models came from this sequence. Table 2 displays the mean and standard deviation of AUC for the 10 best performing models for mp-MRI and individual sequences.

Model performance from combined T1-CE and FLAIR sequences. For the models fit to the T1-CE and FLAIR sequences in combination, the adaBoost and LASSO models fit to the full feature set were the top performing models. The adaBoost classifier had a mean cross-validated AUC of 0.951. The LASSO classifier had a mean cross-validated AUC of 0.950. Figure 3 shows the mean AUC for all models fit using the combined T1-CE and FLAIR sequences. Table 3 displays the mean and standard deviation of AUC for the 10 best performing models for the T1-CE and FLAIR combination.

Comparison of predictive performance between mp-MRI, individual sequence, and combination of T1-CE with FLAIR.
Overall, the best performing models using mp-MRI (LASSO fit to the full feature set), FLAIR sequence (LASSO full) and combined T1-CE and FLAIR sequences (adaBoost full) had similar predictive performance (p-value > 0.05 for all) (Table 4). These results indicate no statistically significant differences in predictive performance between the top models in each of the three scenarios.

Figure 1. Diagnostic performance using multiparametric MRI. Mean cross-validated ROC AUC for all 45 machine learning and feature reduction combinations using all sequences.

Figure 2. Diagnostic performance using FLAIR sequence. Mean AUC for all models fit using the FLAIR sequence, as many of the top performing models came from this sequence.

Feature importance for the models. Features with higher relative importance were derived from both the whole tumor and edema masks. Shape sphericity was the most important feature in all sequence combinations. A boxplot showing the distribution of this feature for the two tumor types on the FLAIR sequence is shown in Fig. 4. Supplementary tables S1-S3 (and supplementary Fig. 1–3) display the ranking by variable importance for the ten most important features for the two best models for mp-MRI, FLAIR and the combination of T1-CE and FLAIR sequences. Supplementary Fig. 4 shows the heatmap of feature importance for mp-MRI models (features with relative importance greater than 40 were included).

Discussion

Our study showed that radiomics based MLT can differentiate GBM and IMD with excellent performance. We found LASSO and elastic net to be the top performing models. Another key observation from our study was that the diagnostic performance of the best models was similar for mp-MRI, FLAIR sequence and combined T1-CE and FLAIR sequence.
Finally, radiomic features with high relative importance were derived from both the whole tumor and edema masks, and shape sphericity was the most important feature.

LASSO and elastic net models are both penalized regression models30. The LASSO model forces the coefficient estimates of the variables with limited contribution to the outcome to be exactly zero.
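
The shrinkage behaviour described above can be illustrated with a minimal sketch. This is not the pipeline used in this study: it assumes a squared-error loss rather than the classification loss a two-class model would use, runs on synthetic data, and uses an arbitrary penalty value (`lam = 0.5`). The function names `soft_threshold` and `lasso_coordinate_descent` are hypothetical and chosen only for this illustration.

```python
import random


def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; anything in [-gamma, gamma] becomes exactly 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Mean squared value of each column, used to rescale each coordinate update.
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Average product of feature j with the residual that ignores feature j.
            rho = 0.0
            for row, target in zip(X, y):
                partial = sum(row[k] * beta[k] for k in range(p) if k != j)
                rho += row[j] * (target - partial)
            rho /= n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# Synthetic data (hypothetical): 5 features, only the first two drive the outcome.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(100)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0.0, 0.1) for row in X]

beta = lasso_coordinate_descent(X, y, lam=0.5)  # lam is an arbitrary choice
print([round(b, 3) for b in beta])
```

In this toy setting the three uninformative features should receive coefficients of exactly zero, while the two informative features are shrunk toward zero but retained. This is the sense in which the L1 penalty performs variable selection.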