Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia
Total Page:16
File Type:pdf, Size:1020Kb
1 Dual-Sampling Attention Network for Diagnosis of COVID-19 from Community Acquired Pneumonia Xi Ouyangy, Jiayu Huoy, Liming Xiay, Fei Shany, Jun Liuy, Zhanhao Moy, Fuhua Yany, Zhongxiang Dingy, Qi Yangy, Bin Songy, Feng Shi, Huan Yuan, Ying Wei, Xiaohuan Cao, Yaozong Gao, Dijia Wu, Qian Wang∗, Dinggang Shen∗ Abstract—The coronavirus disease (COVID-19) is rapidly Results show that our algorithm can identify the COVID-19 spreading all over the world, and has infected more than images with the area under the receiver operating characteristic 1,436,000 people in more than 200 countries and territories as curve (AUC) value of 0.944, accuracy of 87.5%, sensitivity of April 9, 2020. Detecting COVID-19 at early stage is essential of 86.9%, specificity of 90.1%, and F1-score of 82.0%. With to deliver proper healthcare to the patients and also to protect this performance, the proposed algorithm could potentially aid the uninfected population. To this end, we develop a dual- radiologists with COVID-19 diagnosis from CAP, especially in sampling attention network to automatically diagnose COVID- the early stage of the COVID-19 outbreak. 19 from the community acquired pneumonia (CAP) in chest Index Terms—COVID-19 Diagnosis, Online Attention, Ex- computed tomography (CT). In particular, we propose a novel plainability, Imbalanced Distribution, Dual Sampling Strategy. online attention module with a 3D convolutional network (CNN) to focus on the infection regions in lungs when making decisions of diagnoses. Note that there exists imbalanced distribution of the sizes of the infection regions between COVID-19 and CAP, I. INTRODUCTION partially due to fast progress of COVID-19 after symptom onset. HE disease caused by the novel coronavirus, or Coro- Therefore, we develop a dual-sampling strategy to mitigate the navirus Disease 2019 (COVID-19) is quickly spreading imbalanced learning. Our method is evaluated (to our best T knowledge) upon the largest multi-center CT data for COVID- globally. It has infected more than 1,436,000 people in more 19 from 8 hospitals. In the training-validation stage, we collect than 200 countries and territories as of April 9, 2020 [1]. On 2186 CT scans from 1588 patients for a 5-fold cross-validation. February 12, 2020, the World Health Organization (WHO) In the testing stage, we employ another independent large-scale officially named the disease caused by the novel coronavirus as testing dataset including 2796 CT scans from 2057 patients. Coronavirus Disease 2019 (COVID-19) [2]. Now, the number of COVID-19 patients, is dramatically increasing every day y X. Ouyang, J. Huo, L. Xia, F. Shan, J. Liu, Z. Mo, F. Yan, Z. Ding, Q. around the world [3]. Compared with the prior Severe Acute Yang, and B. Song contributed equally to this work. ∗ Corresponding authors: Q. Wang ([email protected]) and D. Shen Respiratory Syndrome (SARS) and Middle East Respiratory ([email protected]). Syndrome (MERS), COVID-19 has spread to more places and X. Ouyang, J. Huo and Q. Wang are with the Institute for Medical Imaging caused more deaths, despite its relatively lower fatality rate [4], Technology, School of Biomedical Engineering, Shanghai Jiao Tong Univer- sity, Shanghai, China. X. Ouyang and J. Huo are interns at Shanghai United [5]. Considering the pandemic of COVID-19, it is important to Imaging Intelligence Co. during this work. (e-mail: fxi.ouyang, jiayu.huo, detect COVID-19 early, which could facilitate the slowdown [email protected]). of viral transmission and thus disease containment. L. Xia is with the Department of Radiology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, In clinics, real-time reverse-transcriptionpolymerase-chain- China. (e-mail: [email protected]). reaction (RT-PCR) is the golden standard to make a definitive F. Shan is with the Department of Radiology, Shanghai Public diagnosis of COVID-19 infection [6]. However, the high false Health Clinical Center, Fudan University, Shanghai, China. (e-mail: shan- fei [email protected]). negative rate [7] and unavailability of RT-PCR assay in the arXiv:2005.02690v2 [cs.CV] 20 May 2020 J. Liu is with the Department of Radiology, The Second Xiangya Hospital, early stage of an outbreak may delay the identification of Central South University, Changsha, Hunan Province, China, and is also potential patients. Due to the highly contagious nature of the with the Department of Radiology Quality Control Center, Changsha, Hunan Province, China. (e-mail: [email protected]). virus, it then constitutes a high risk for infecting a larger Z. Mo is with the Department of Radiology, China-Japan Union Hospital population. At the same time, thoracic computed tomography of Jilin University, Changchun, China. (e-mail: [email protected]). (CT) is relatively easy to perform and can produce fast F. Yan is with the Department of Radiology, Ruijin Hospital, Shang- hai Jiao Tong University School of Medicine, Shanghai, China. (e-mail: diagnosis [8]. For example, almost all COVID-19 patients [email protected]). have some typical radiographic features in chest CT, including Z. Ding is with the Department of Radiology, Affiliated Hangzhou First ground-glass opacities (GGO), multifocal patchy consolida- Peoples Hospital, Zhejiang University School of Medicine, Hangzhou, Zhe- jiang, China. (e-mail: [email protected]). tion, and/or interstitial changes with a peripheral distribution Q. Yang is with the Beijing Chaoyang hospital, Capital Medical University. [9]. Thus chest CT has been recommended as a major tool (e-mail: [email protected]). for clinical diagnosis especially in the hard-hit region such as B. Song is with the Department of Radiology, Sichuan University West China Hospital, Chengdu, China. (e-mail: [email protected]). Hubei, China [6]. Considering the need of high-throughput F. Shi, H. Yuan, Y. Wei, X. Cao, Y. Gao, D. Wu and D. Shen are screening by chest CT and the workload for radiologists with the Department of Research and Development, Shanghai United Imag- especially in the outbreak, we design a deep-learning-based ing Intelligence Co., Ltd., Shanghai, China. (e-mail: ffeng.shi, huan.yuan, ying.wei, xiaohuan.cao, yaozong.gao, [email protected], Ding- method to automatically diagnose COVID-19 infection from [email protected]). the community acquired pneumonia (CAP) infection. 2 COVID-19 can effectively improve classification performance. In our work, an important observation is that COVID-19 cases usually have more severe infection than CAP cases [30], Distribution of Pneumonia Infection Area although some COVID-19 cases and CAP cases do have sim- ilar infection sizes. To illustrate it, we use an established VB- Net toolkit [10] to automatically segment lungs and pneumonia infection regions on all the cases in our training-validation (TV) set (with details of our TV set provided in Section IV), and show the distribution of the ratios between the infection CAP regions and lungs in Fig. 1. We can see the imbalanced distribution of the infection size ratios in both COVID-19 and CAP data. In this situation, the conventional uniform sampling on the entire dataset to train the network could lead to unsatisfactory diagnosis performance, especially concerning the limited cases of COVID-19 with small infections and also the limited cases of CAP with large infections. To this end, we train the second network with the size-balanced sampling strategy, by sampling more cases of COVID-19 with small infections and also more cases of CAP with large infections Fig. 1. Examples of CT images and infection segmentations of two COVID- 19 patients (upper left) and two CAP patients (bottom left), and the size within mini-batches. Finally, we apply ensemble learning to distribution of the infection regions of COVID-19 and CAP in our training- integrate the networks of uniform sampling and size-balanced validation set (right). The segmentation results of the lungs and infection sampling to get the final diagnosis results, by following the regions are obtained from an established VB-Net toolkit [10]. The sizes of the infection regions are denoted by the volume ratio of the segmented infection dual-sampling strategy. regions and the whole lung. Compared with CAP, the COVID-19 cases tend As a summary, the contributions of our work are in three- to have more severe infections in terms of the infection region sizes. fold: • We propose an online module to utilize the segmented pneumonia infection regions to refine the attention for With the development of deep learning [11], [12], [13], the network. This ensures the network to focus on the [14], [15], the technology has a wide range of applications infection regions and increase the adoption of visual in medical image processing, including disease diagnosis attention for model interpretability and explainability. [16], and organ segmentation [17], etc. Convolutional neural • We propose a dual-sampling strategy to train the network, network (CNN) [18], one of the most representative deep which further alleviates the imbalanced distribution of the learning technology, has been applied to reading and analyzing sizes of pneumonia infection regions. CT images in many recent studies [19], [20]. For example, • To our knowledge, we have used the largest multi-center Koichiro et. al. use CNN for differentiation of liver masses on CT data in the world for evaluating automatic COVID- dynamic contrast agentenhanced CT images [21]. Also, some 19 diagnosis. In particular, we conduct extensive cross- studies focus on the diagnoses of lung diseases in chest CT, validations in a TV dataset of 2186 CT scans from 1588 e.g., pulmonary nodules [22], [23] and pulmonary tuberculosis patients. Moreover, to better evaluate the performance and [24]. Although deep learning has achieved remarkable perfor- generalization ability of the proposed method, a large mance for abnormality diagnoses of medical images [16], [25], independent testing set of 2796 CT scans from 2057 [26], physicians have concerns especially in the lack of model patients is also used.