Automated Tissue Image Analysis Using Pattern Recognition
Total Page:16
File Type:pdf, Size:1020Kb
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1175 Automated Tissue Image Analysis Using Pattern Recognition JIMMY AZAR ACTA UNIVERSITATIS UPSALIENSIS ISSN 1651-6214 ISBN 978-91-554-9028-7 UPPSALA urn:nbn:se:uu:diva-231039 2014 Dissertation presented at Uppsala University to be publicly examined in Häggsalen, Ångströmlaboratoriet, Lägerhyddsvägen 1, Uppsala, Monday, 20 October 2014 at 09:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Marco Loog (Delft University of Technology, Pattern Recognition & Bioinformatics Group). Abstract Azar, J. 2014. Automated Tissue Image Analysis Using Pattern Recognition. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1175. 106 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-554-9028-7. Automated tissue image analysis aims to develop algorithms for a variety of histological applications. This has important implications in the diagnostic grading of cancer such as in breast and prostate tissue, as well as in the quantification of prognostic and predictive biomarkers that may help assess the risk of recurrence and the responsiveness of tumors to endocrine therapy. In this thesis, we use pattern recognition and image analysis techniques to solve several problems relating to histopathology and immunohistochemistry applications. In particular, we present a new method for the detection and localization of tissue microarray cores in an automated manner and compare it against conventional approaches. We also present an unsupervised method for color decomposition based on modeling the image formation process while taking into account acquisition noise. The method is unsupervised and is able to overcome the limitation of specifying absorption spectra for the stains that require separation. This is done by estimating reference colors through fitting a Gaussian mixture model trained using expectation-maximization. Another important factor in histopathology is the choice of stain, though it often goes unnoticed. Stain color combinations determine the extent of overlap between chromaticity clusters in color space, and this intrinsic overlap sets a main limitation on the performance of classification methods, regardless of their nature or complexity. In this thesis, we present a framework for optimizing the selection of histological stains in a manner that is aligned with the final objective of automation, rather than visual analysis. Immunohistochemistry can facilitate the quantification of biomarkers such as estrogen, progesterone, and the human epidermal growth factor 2 receptors, in addition to Ki-67 proteins that are associated with cell growth and proliferation. As an application, we propose a method for the identification of paired antibodies based on correlating probability maps of immunostaining patterns across adjacent tissue sections. Finally, we present a new feature descriptor for characterizing glandular structure and tissue architecture, which form an important component of Gleason and tubule-based Elston grading. The method is based on defining shape-preserving, neighborhood annuli around lumen regions and gathering quantitative and spatial data concerning the various tissue-types. Keywords: tissue image analysis, pattern recognition, digital histopathology, immunohistochemistry, paired antibodies, histological stain evaluation Jimmy Azar, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction, Box 337, Uppsala University, SE-75105 Uppsala, Sweden. © Jimmy Azar 2014 ISSN 1651-6214 ISBN 978-91-554-9028-7 urn:nbn:se:uu:diva-231039 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-231039) List of Papers This thesis is based on the following papers, which are referred to in the text by their Roman numerals. I J. C. Azar, C. Busch and I. B. Carlbom, "Microarray Core Detection by Geometric Restoration," Analytical Cellular Pathology, vol. 35, no. 5–6, pp. 381–393, 2012. II M. Gavrilovic, J. C. Azar, J. Lindblad, C. Wählby, E. Bengtsson, C. Busch and I. B. Carlbom, "Blind Color Decomposition of Histological Images," IEEE Transactions on Medical Imaging, vol. 32, no. 6, pp. 983–994, 2013. III J. C. Azar, C. Busch, I. B. Carlbom, "Histological Stain Evaluation for Machine Learning Applications," in International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), Nice, France, 2012. IV J. C. Azar, M. Simonsson, E. Bengtsson and A. Hast, "Image Segmentation and Identification of Paired Antibodies in Breast Tissue," Journal of Computational and Mathematical Methods in Medicine, vol. 2014, Article ID 647273, 11 pages, 2014. doi:10.1155/2014/647273. V J. C. Azar, M. Simonsson, E. Bengtsson and A. Hast, "Automated Classification of Glandular Tissue by Statistical Proximity Sampling," 2014. Submitted for journal publication. The author has significantly contributed to the work in the above papers. In Paper II, the author contributed to the unsupervised aspect of the method. Reprints were made with permission from the respective publishers. The papers in this thesis have been proofread for errors. Contents 1 Introduction ........................................................................................... 9 2 Background .......................................................................................... 13 2.1 Tissue Preparation ........................................................................... 13 2.2 Histological Staining ....................................................................... 13 2.3 Tissue Microarrays .......................................................................... 15 2.4 Immunohistochemistry .................................................................... 15 2.5 Cancer Grading ............................................................................... 16 3 Pattern Recognition & Image Analysis ............................................... 18 3.1 Principal Component Analysis ........................................................ 18 3.2 Clustering ........................................................................................ 20 3.2.1 Hierarchical Clustering .......................................................... 22 3.2.2 Gaussian Mixture Model ....................................................... 24 3.3 Supervised Classification ................................................................ 26 3.3.1 Bayes Optimal Classifier ....................................................... 28 3.3.2 Quadratic Classifier ............................................................... 29 3.3.3 Normal-based Linear Classifier ............................................. 31 3.3.4 Nearest Mean Classifier ......................................................... 32 3.3.5 K-Nearest Neighbor Classifier ............................................... 33 3.3.6 Support Vector Classifier ....................................................... 36 3.3.7 Multiple Instance Learning .................................................... 40 3.4 Digital Image Processing ................................................................ 42 3.4.1 Neighborhood Connectivity ................................................... 42 3.4.2 Morphological Operations ..................................................... 43 3.4.3 Morphological Granulometry ................................................ 44 4 Summary of Publications ..................................................................... 46 4.1 Paper I: Microarray Core Detection by Geometric Restoration ...... 47 4.1.1 Problem Description .............................................................. 47 4.1.2 Proposed Method ................................................................... 50 4.1.3 Contributions ......................................................................... 55 4.2 Paper II: Blind Color Decomposition of Histological Images ........ 56 4.2.1 Problem Description .............................................................. 56 4.2.2 Proposed Method ................................................................... 56 4.2.3 Contributions ......................................................................... 62 4.3 Paper III: Histological Stain Evaluation for Machine Learning Applications .................................................................................... 63 4.3.1 Problem Description .............................................................. 63 4.3.2 Proposed Method ................................................................... 64 4.3.3 Contributions ......................................................................... 70 4.4 Paper IV: Image Segmentation and Identification of Paired Antibodies in Breast Tissue ............................................................ 71 4.4.1 Problem Description .............................................................. 71 4.4.2 Proposed Method ................................................................... 72 4.4.3 Contributions ......................................................................... 81 4.5 Paper V: Automated Classification of Glandular Tissue by Statistical Proximity Sampling ........................................................ 82 4.5.1 Problem Description .............................................................. 82 4.5.2 Proposed Method ................................................................... 83 4.5.3 Contributions ........................................................................