Author Manuscript Published OnlineFirst on November 10, 2020; DOI: 10.1158/1078-0432.CCR-20-3159
Author manuscripts have been peer reviewed and accepted for publication but have not yet been edited.
Downloaded from clincancerres.aacrjournals.org on October 2, 2021. © 2020 American Association for Cancer Research.

Original article

A prospective validation and observer performance study of a deep learning algorithm for pathologic diagnosis of gastric tumors in endoscopic biopsies

Jeonghyuk Park1*; Bo Gun Jang2*; Yeong Won Kim1; Hyunho Park1; Baek-hui Kim3; Myung Ju Kim4; Hyungsuk Ko5; Jae Moon Gwak5; Eun Ji Lee5; Yul Ri Chung5; Kyungdoc Kim1; Jae Kyung Myung6; Jeong Hwan Park7; Dong Youl Choi5; Chang Won Jung5; Bong-Hee Park5; Kyu-Hwan Jung1+; Dong-Il Kim5+

1VUNO Inc., Seoul, South Korea, 2Department of Pathology, Jeju National University School of Medicine and Jeju National University Hospital, Jeju, South Korea, 3Department of Pathology, Korea University Guro Hospital, Seoul, South Korea, 4Department of Anatomy, Dankook University College of Medicine, Chonan, Chungnam, South Korea, 5Department of Pathology, Green Cross Laboratories, Yongin, Gyeonggi, South Korea, 6Department of Pathology, College of Medicine, Hanyang University, Seoul, South Korea, 7Department of Pathology, SMG-SNU Boramae Medical Center, Seoul, South Korea

*,+these authors equally contributed to this paper

Running title: deep learning-assisted diagnosis in gastric biopsies

Keywords: deep learning; whole slide image; digital pathology; gastric biopsy; gastric cancer

Financial support: This study was supported by Green Cross Laboratory and VUNO Inc.

Address for correspondence:
Kyu-Hwan Jung, PhD, VUNO Inc., 5F, 507, Gangnam-daero, Seocho-gu, Seoul, South Korea.
E-mail: [email protected]
Dong-Il Kim, MD, Green Cross Laboratories, Department of Pathology, 107, Ihyeonro 30 Beon-gil, Giheng-gu, Yongin-Si, Gyeonggi-do, South Korea.
E-mail: [email protected]

Disclosure of potential conflicts of interests:
JP, YWK, HP, KK, KHJ are employees of VUNO Inc. HK, JMG, EJL, YRC, DYC, CWJ, BHP, DIK are employees of Green Cross Laboratories. KHJ is an equity holder of VUNO Inc. The remaining authors disclose no conflict of interest.

TRANSLATIONAL RELEVANCE
Diagnostic workload for gastric biopsy specimens is increasing steadily in Northeast Asia. Previous studies on deep learning–assisted analysis of digital gastric biopsy images have limitations with respect to clinical validation. In this study, we developed and evaluated a deep learning algorithm for histological classification of gastric epithelial tumors in actual clinical practice. Our algorithm demonstrated a superb performance with diagnostic accuracy of 0.97-0.99 calculated from the area under the receiver operating characteristics curve (AUROC) in both retrospective and prospective studies. Algorithm-assisted diagnosis significantly saved the average review time per image, particularly for negative cases, reaching 100% sensitivity.
We believe that our algorithm can potentially serve as a screening or an assistance tool not only in countries with heavy diagnostic workloads of gastric biopsy specimens but also in areas where experienced pathologists are not available.

ABSTRACT
Purpose: Gastric cancer remains the leading cause of cancer death in Northeast Asia. Population-based endoscopic screenings in the region have yielded successful results in early detection of gastric tumors. Subsequently, endoscopic screening rates are continuously increasing, and there is a need for an automatic computerized diagnostic system to reduce the diagnostic burden. In this study, we developed an algorithm to classify gastric epithelial tumors automatically and assessed its performance in a large series of gastric biopsies and its benefits as an assistance tool.
Experimental Design: Using 2,434 whole slide images (WSIs), we developed an algorithm based on convolutional neural networks (CNN) to classify a gastric biopsy image into one of three categories: negative for dysplasia (NFD), tubular adenoma (TA), or carcinoma (CA). The performance of the algorithm was evaluated with 7,440 biopsy specimens collected prospectively. The impact of algorithm-assisted diagnosis was assessed by six pathologists using 150 gastric biopsy cases.
Results: Diagnostic performance evaluated by the area under the receiver operating characteristic curve (AUROC) in the prospective study was 0.9790 for two-tier classification; negative (NFD) vs. positive (all cases except NFD).
When limited to epithelial tumors, the sensitivity and specificity were 1.000 and 0.9749, respectively. The algorithm-assisted digital image viewer resulted in a 47% reduction in review time per image compared with the digital image viewer alone and a 58% reduction compared with microscopy.
Conclusions: Our algorithm demonstrated high accuracy in classifying epithelial tumors and showed benefits as an assistance tool, suggesting that it can serve as a potential screening aid system in diagnosing gastric biopsy specimens.

Abbreviations used in this paper: AUROC, area under receiver operating characteristic curve; CA, carcinoma; CG, chronic gastritis; CI, confidence interval; CNN, convolutional neural networks; H&E, hematoxylin and eosin; IM, intestinal metaplasia; MALT, mucosa-associated lymphoid tissue; NFD, negative for dysplasia; RACNN, representation aggregation CNN; SRCC, signet ring cell carcinoma; TA, tubular adenoma; WSI, whole slide image.
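
The two-tier evaluation described in the Results (negative, i.e. NFD, vs. positive, i.e. all cases except NFD) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `to_two_tier` and `sensitivity_specificity` are hypothetical, and the labels are made-up toy values rather than study data. The sketch only shows how three-class labels might be collapsed into two tiers before sensitivity and specificity are computed.

```python
# Illustrative sketch only. The labels below are toy values, NOT study data,
# and the helper names are assumptions introduced for this example.

NEGATIVE_CLASS = "NFD"  # negative for dysplasia


def to_two_tier(label):
    """Collapse a three-class label (NFD, TA, CA) into negative/positive."""
    return "negative" if label == NEGATIVE_CLASS else "positive"


def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for the two-tier classification.

    Assumes both inputs have the same length and that the ground truth
    contains at least one positive and one negative case.
    """
    pairs = [(to_two_tier(t), to_two_tier(p)) for t, p in zip(truth, predicted)]
    tp = sum(1 for t, p in pairs if t == "positive" and p == "positive")
    fn = sum(1 for t, p in pairs if t == "positive" and p == "negative")
    tn = sum(1 for t, p in pairs if t == "negative" and p == "negative")
    fp = sum(1 for t, p in pairs if t == "negative" and p == "positive")
    sensitivity = tp / (tp + fn)  # fraction of true positives that were flagged
    specificity = tn / (tn + fp)  # fraction of true negatives that were cleared
    return sensitivity, specificity


# Toy example: 4 negative and 4 positive ground-truth cases.
truth = ["NFD", "NFD", "NFD", "NFD", "TA", "CA", "CA", "TA"]
predicted = ["NFD", "NFD", "TA", "NFD", "TA", "CA", "TA", "CA"]

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In this toy data, a TA case predicted as CA (or the reverse) still counts as a correct positive, because the two-tier scheme only distinguishes NFD from everything else.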

INTRODUCTION
Gastric cancer is the third leading cause of cancer death in both men and women worldwide.(1) Although its incidence is decreasing globally, the incidence and mortality of gastric cancer remain considerably high in Northeast Asian countries including China, Japan, and Korea.(2) Gastric cancer screening is done on a population basis in Japan and Korea, and such mass screening has been shown to be effective in detecting gastric cancer at an early stage, thereby reducing mortality rates.(3,4) As a result, endoscopic screening rates for gastric cancer underwent an annual increase of 4.2% from 2004 to 2013 in Korea, reaching as high as 73.6% in the population over 40 years old in the country.(5) Accordingly, the diagnostic workload for gastric biopsy specimens has increased steadily. Therefore, there is a need for an automatic computerized diagnostic system to reduce the increasing diagnostic burden and prevent misdiagnosis.

Developing an automated screening method can reduce heavy diagnostic workloads, an excellent example of which is the automated image analysis of cervical cytology specimens for cervical cancer screening.(6) With advances in digital scanning devices and deep learning technologies, automated cancer diagnostic systems are being developed using whole slide images (WSIs); these, however, have mostly been on breast and colorectal cancers.(7-9) As for gastric cancers, a small number of groups have reported their automated histological classification systems using convolutional neural networks (CNN).(10-15) Sharma et al. trained CNNs to detect gastric cancer yielding overall classification accuracy of 69.9%; however, their dataset was limited to only 15 WSIs.(12) Li et al.
proposed a deep learning-based framework GastricNet for automated gastric cancer detection and demonstrated its diagnostic accuracy of 100%, a performance superior to the already well-known, state-of-the-art networks including DenseNet and ResNet.(16) However, this study was conducted only on a publicly-available gastric slide dataset lacking validation using an independent set of samples. Furthermore, gastric adenomas were not included in their study design. Yoshida et al. developed an image analysis software named “e-Pathologist” that classifies gastric biopsy images into either carcinoma, adenoma, or