A METHOD FOR MICROCALCIFICATIONS DIAGNOSIS IN BREAST MAMMOGRAMS

Ass. Prof. Dr. Abbas Hanon AL-Asadi, Ahmed Kazim Hamed Al-Saadi
Computer Science Department, Science College, Basrah University, Basrah, Iraq
[email protected] [email protected]

Abstract

Breast cancer is the main cause of death for women above age 40. This paper proposes a method for building a Computer-Aided Diagnosis (CADx) system that provides a second opinion in the diagnosis of microcalcifications and supports decision making by classifying microcalcifications as benign or malignant. The proposed method consists of three steps. The first step extracts the region of interest (ROI). The second step is feature extraction, in which a set of features is computed from the ROI by applying wavelet decomposition. The third step is classification, in which benign and malignant cases are discriminated using a Nearest Neighbor classifier. The method was evaluated on the Mammographic Image Analysis Society (MIAS) mammographic database and achieved satisfactory results.

Keywords: Breast Cancer, computer aided diagnosis, Microcalcifications, Mammography, Wavelet Transform.

1. INTRODUCTION

Breast cancer is a malignant tumor that starts in the cells of the breast [1]. In modern countries, medical statistics estimate that one woman in eight will develop breast cancer. Malignant tumors are more likely to appear in the breast tissue of women above age 40, and they represent 35% of the abnormalities detected in women's breasts. Breast cancer is currently responsible for more than 30% of cancer deaths in women, which is about 1% of all deaths worldwide. Men are also affected by breast cancer, since it represents 1.5% of all cancer deaths in men [2, 3].

The early detection of breast cancer is very important because the treatment of an undeveloped and non-metastasized tumor does not require massive surgical intervention.

Microcalcifications are deposits of calcium (Ca5(PO4)3OH) in breast tissue [4]. They are small lesions, typically ranging from 0.05 to 1 mm in diameter. With these dimensions, microcalcifications are relatively difficult to detect [1, 5].

There is a high correlation between the presence of microcalcifications and breast cancer, particularly when a number of microcalcifications are grouped together. Such a group is termed a cluster, and it may be a strong indication of cancer. A cluster is defined as at least three microcalcifications within a 1 cm2 area. Therefore, accurate detection of microcalcifications is essential to the early detection of the majority of breast cancers [6, 7].

In the literature, a number of techniques are described to detect the presence of microcalcifications in digital mammograms and to classify them as benign or malignant.

Chan et al. [8-10] investigated a computer-based method for the detection of microcalcifications in digital mammograms. The method is based on a difference image technique in which a signal-suppressed image is subtracted from a signal-enhanced image to remove structured background in the mammogram. Global and local thresholding techniques are then used to extract potential microcalcification signals. Subsequently, signal extraction criteria are imposed on the potential microcalcifications to distinguish true positives from noise and artifacts.

Karssemeijer [11-13] developed a statistical method for the detection of microcalcifications in digital mammograms. The method is based on the use of statistical models and the general framework of Bayesian image analysis.

Strickland et al. [14-19] developed a method based on undecimated biorthogonal wavelet transforms and optimal subband weighting for detecting and segmenting clustered microcalcifications.

Yoshida et al. [20, 21] used the decimated wavelet transform and supervised learning for the detection of microcalcifications.

Cheng et al. [22] proposed an approach using fuzzy logic for the detection of microcalcifications.

Yu and Guan [23] developed a CAD system for the automatic detection of clustered microcalcifications in two steps. The first step segments potential microcalcification pixels by using wavelet and gray-level statistical features and connects them into potential individual microcalcification objects. The second step checks these potential objects by using 31 statistical features. Neural network classifiers were used.

Jiang et al. [24] proposed a Genetic Algorithm (GA) technique that transforms input images into a feature domain, where each pixel is represented by its mean and standard deviation inside a surrounding window of 9x9 pixels. In the feature domain, chromosomes are constructed to populate the initial generation, and further features are extracted to enable the proposed GA to search for optimized classification and detection of microcalcification clusters via regions of 128x128 pixels.

2. MAMMOGRAM DATABASE

The Mammography Image Analysis Society (MIAS), which is an organization of UK research groups interested in the understanding of mammograms, has produced a digital mammography database ( ftp://peipa.essex.ac.uk ).

The database used in the experiments of this work was taken from MIAS because it contains complete information about the abnormalities of each mammographic image. The X-ray films in the database have been carefully selected from the United Kingdom National Breast Screening Program and digitized with a Joyce-Loebl scanning microdensitometer to a resolution of 50 μm × 50 μm. Each image is 1024×1024 pixels, and each pixel is represented by 8 bits (256 gray levels).

The images are in the grayscale Portable Graymap (.pgm) file format. The database contains left and right breast images for 161 patients, 322 images in total, which belong to three types: normal, benign, and malignant. There are 208 normal, 63 benign, and 51 malignant (abnormal) images.

3. WAVELET ANALYSIS

An important branch of CADx methods in mammography employs wavelet transforms for feature enhancement. This work uses the Two-Dimensional Discrete Wavelet Transform (2D-DWT), which can be defined as:

$$C(a,b) = C(j,k) = \sum_{x \in \mathbb{Z}} \sum_{y \in \mathbb{Z}} f(x,y)\, g_{j,k}(x,y) \tag{1}$$

with a = 2^j, b = k·2^j, j,k ∈ N, where f is the original image, g is the wavelet function, a is a scale factor of the wavelet function, b is a location parameter of the wavelet function, and C(a,b) is the set of obtained coefficients. The inverse process is calculated by:

$$f(x,y) = \sum_{j} \sum_{k} C(j,k)\, \psi_{j,k}(x,y) \tag{2}$$

where ψ is the wavelet function used to reconstruct the image.

The multiresolution representation carried out by the 2D-DWT fragments the frequency spectrum of an image f(x,y) into a low-pass subband image cA^j and a set of band-pass subband images cDH^j, cDV^j, cDD^j, j = 1,…,L, where L denotes the number of levels of the representation. cDH^j is formed by low-pass filtering the rows followed by high-pass filtering the columns, and is therefore sensitive to horizontally oriented features. In the same way, cDV^j contains vertically oriented structure, and cDD^j contains primarily diagonal structure [25].

Generally speaking, multiresolution representations are implemented by a cascade of analysis/synthesis (A/S) filter banks. The discrete wavelet transform uses two different mother wavelets: h(x) for multiresolution decomposition (analysis) and g(x) for reconstruction (synthesis) of the original image from its multiresolution representation. An efficient way to implement the discrete wavelet transform using filters was developed by Mallat [26].

Figure (1) shows the implementation of a one-level (L = 1) multiresolution representation of the discrete wavelet transform, which divides orientations into three bands.

As seen in Figure (1), the forward 2D wavelet transform is implemented using a bank of 1D low-pass (h1(x)) and high-pass (h2(x)) analysis filters. The reconstruction process, or inverse wavelet transform, is likewise computed via 1D synthesis filters, g1(x) and g2(x).
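The analysis/synthesis filter-bank view described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: it uses the two-tap Haar filters as a stand-in for whichever wavelet family is chosen, it handles a single decomposition level, and the function names (`haar_dwt2`, `haar_idwt2`, and the helpers) are hypothetical rather than taken from this work. The subband names follow the notation of this section.

```python
import math

SQRT2 = math.sqrt(2.0)

def transpose(m):
    return [list(col) for col in zip(*m)]

def analyze_rows(img):
    """Low-pass and high-pass filter each row, keeping every second sample."""
    low = [[(r[i] + r[i + 1]) / SQRT2 for i in range(0, len(r), 2)] for r in img]
    high = [[(r[i] - r[i + 1]) / SQRT2 for i in range(0, len(r), 2)] for r in img]
    return low, high

def analyze_cols(img):
    """Same filtering and downsampling, applied along the columns."""
    low, high = analyze_rows(transpose(img))
    return transpose(low), transpose(high)

def haar_dwt2(img):
    """One-level 2D Haar DWT of an image with even height and width."""
    rows_low, rows_high = analyze_rows(img)
    cA, cDH = analyze_cols(rows_low)    # low rows: low / high columns
    cDV, cDD = analyze_cols(rows_high)  # high rows: low / high columns
    return cA, cDH, cDV, cDD

def synthesize_rows(low, high):
    """Inverse of analyze_rows: upsample and combine the two row subbands."""
    out = []
    for lo_row, hi_row in zip(low, high):
        row = []
        for a, d in zip(lo_row, hi_row):
            row += [(a + d) / SQRT2, (a - d) / SQRT2]
        out.append(row)
    return out

def synthesize_cols(low, high):
    return transpose(synthesize_rows(transpose(low), transpose(high)))

def haar_idwt2(cA, cDH, cDV, cDD):
    """Inverse one-level 2D Haar DWT (synthesis filter bank)."""
    rows_low = synthesize_cols(cA, cDH)
    rows_high = synthesize_cols(cDV, cDD)
    return synthesize_rows(rows_low, rows_high)

# Toy 4x4 image with a horizontal edge between the first and second row.
image = [[0, 0, 0, 0], [8, 8, 8, 8], [8, 8, 8, 8], [8, 8, 8, 8]]
cA, cDH, cDV, cDD = haar_dwt2(image)
restored = haar_idwt2(cA, cDH, cDV, cDD)
```

For this toy image only the approximation and horizontal-detail subbands are non-zero, which matches the statement that the horizontal subband responds to horizontally oriented features, and the synthesis step recovers the input up to floating-point rounding. Longer filters need boundary handling that this sketch deliberately leaves out.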

Figure (1): Wavelet decomposition algorithm

Wavelet-based image decomposition can be interpreted as an image filtering process. For a given image I of size 2^n × 2^n, wavelet-based subband decomposition can be performed as follows. The wavelet filters h1(x) and h2(x) are applied to the rows of the image I. The filter h1(x) is a low-pass filter with frequency response H1(w), and h2(x) is a high-pass filter with frequency response H2(w). Filtering the image I with H1(w) yields the low-frequency information (background), and filtering it with H2(w) yields the high-frequency information (edges). After downsampling by a factor of two, two subbands are obtained: H1rI and H2rI (the subscript r indicates that the filters are applied to the rows of the image I). The filters H1(w) and H2(w) are then applied to the columns of the subbands H1rI and H2rI, followed by downsampling by a factor of two, and the following four subbands are obtained: cA^j, cDH^j, cDV^j and cDD^j. The subband cA^j contains the smooth information of the image, and the subbands cDH^j, cDV^j and cDD^j contain the detail information of the image. At the second decomposition level, the cA^j subband is in turn split into four subbands, and so on.

4. MICROCALCIFICATIONS DIAGNOSIS METHOD

In this method, the first step is preprocessing, which improves the edge of the breast. It is followed by a segmentation process (region of interest) that eliminates regions of the image which are not useful for the mammographic interpretation. Features are then extracted after applying the DWT, and finally a classification process is performed to classify the microcalcifications. The flowchart in Figure (2) depicts the steps of this method.

Figure (2): Flowchart of the steps of Microcalcifications Diagnosis method.

4.1. Preprocessing

This step enhances the breast's edge before the region of interest (ROI) is segmented from the mammographic images. It can be performed by using the logarithm transformation for dynamic range manipulation. Logarithm transformations are implemented using the expression:

g = c * log(1 + double(f))

where g is the output image, f is the input image, and c is a constant.

4.2. Segmentation

The aim of segmenting the image is to reduce the amount of data and, consequently, to reduce the computational time and the effort of searching for regions of interest that include a lesion with high probability. This step eliminates the parts of the image that are not useful for the mammographic interpretation and extracts only the region of the image that corresponds to the breast. We performed this step by binarizing the image with an automatic threshold to obtain a binary image, followed by morphological operations to extract the breast region.

4.3. Features Extraction

In an automated analysis system, an important goal is to extract features that are able to "summarize" meaningful information in the mammographic image, information that is otherwise distributed among a large number of pixels. Features here are extracted from the ROI based on the wavelet decomposition process, that is, from the coefficients produced by the wavelet decomposition. These features are passed to the classification stage. There are three processing steps in the features extraction stage, discussed in detail below.

i. Wavelet Decomposition

In this work, the wavelet decomposition is applied to the region of interest. The outputs of the wavelet analysis are the decomposition vector C and the corresponding bookkeeping matrix S. The vector C consists of horizontal, vertical, and diagonal detail coefficients and one approximation.

ii. Coefficients Extraction

The horizontal, vertical, and diagonal detail coefficients and the approximation coefficients were extracted from the wavelet decomposition structure [C, S]. These vectors were extracted at level one of the wavelet decomposition.

iii. Energy, Mean, and Standard Deviation Computation

The energy, mean, and standard deviation have been computed for the extracted coefficients. The resulting values are used as features (12 features) for the classification process.

4.4. Classification

The classification process is divided into a training phase and a testing phase. In the training phase, labeled data are given: data on candidate regions which have already been decided as microcalcifications or as normal are supplied, and the classifier is trained. In the testing phase, unknown data are given and the classification is performed using the trained classifier. The number of images used in the training and testing sets is shown in Table (1). This step can be divided into two processes:

Table (1): Number of training and testing images

| Category | No. of images | No. of training sets | No. of testing sets |
|---|---|---|---|
| Normal | 40 | 20 | 20 |
| Abnormal | 25 | 17 | 8 |
| Benign | 12 | 8 | 4 |
| Malignant | 13 | 9 | 4 |

1. Features Reduction

After computing the features for every image in the training set, we reduce the number of features by estimating the mean of each wavelet coefficient feature over the feature vectors. Table (2) and Table (3) are examples of the training database used in classification.

Table (2): Energy Features for Malignant Class

| Approximation | Horizontal | Vertical | Diagonal |
|---|---|---|---|
| 99.9327 | 0.0094 | 0.0558 | 0.002 |
| 99.9728 | 0.0128 | 0.013 | 0.0014 |
| 99.9709 | 0.0099 | 0.0178 | 0.0014 |
| 99.9879 | 0.0032 | 0.008 | 0.00098003 |
| 99.942 | 0.0138 | 0.0417 | 0.0025 |
| 99.9675 | 0.0132 | 0.0174 | 0.0019 |
| 99.9141 | 0.0091 | 0.0746 | 0.0023 |
| 99.965 | 0.004 | 0.0302 | 0.00082275 |
| 99.9782 | 0.0076 | 0.013 | 0.0011 |

Table (3): Energy Feature for Benign Class

| Approximation | Horizontal | Vertical | Diagonal |
|---|---|---|---|
| 99.9402 | 0.0075 | 0.0509 | 0.0014 |
| 99.9006 | 0.01 | 0.0878 | 0.0016 |
| 99.9674 | 0.0069 | 0.0241 | 0.0016 |
| 99.9398 | 0.0123 | 0.0462 | 0.0016 |
| 99.9299 | 0.0064 | 0.0624 | 0.0013 |
| 99.9649 | 0.0103 | 0.0231 | 0.0018 |
| 99.9305 | 0.007 | 0.0606 | 0.0019 |
| 99.9757 | 0.0076 | 0.0151 | 0.0016 |

2. Classify
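As a rough illustration of how the two processes fit together, the following minimal Python sketch averages a few training feature vectors per class and assigns a vector to the class whose mean is nearest in Euclidean distance. It rests on several assumptions: only the first three rows of Table (2) and Table (3) are used, only the four energy features are included, no normalization is applied, and the names `class_mean`, `euclidean`, and `classify` are hypothetical. It is not the implementation used in this work and says nothing about its accuracy.

```python
import math

# Energy features (approximation, horizontal, vertical, diagonal), copied from
# the first three rows of Table (2) and Table (3); using a subset is an assumption.
malignant = [
    [99.9327, 0.0094, 0.0558, 0.0020],
    [99.9728, 0.0128, 0.0130, 0.0014],
    [99.9709, 0.0099, 0.0178, 0.0014],
]
benign = [
    [99.9402, 0.0075, 0.0509, 0.0014],
    [99.9006, 0.0100, 0.0878, 0.0016],
    [99.9674, 0.0069, 0.0241, 0.0016],
]

def class_mean(vectors):
    """Features reduction: component-wise mean of a class's feature vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def euclidean(x, m):
    """Euclidean distance between a feature vector and a class mean vector."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))

def classify(x, means):
    """Return the label of the class mean closest to x."""
    return min(means, key=lambda label: euclidean(x, means[label]))

means = {"malignant": class_mean(malignant), "benign": class_mean(benign)}
label = classify([99.95, 0.01, 0.03, 0.002], means)  # a made-up query vector
```

Because the approximation energy is several orders of magnitude larger than the detail energies, it dominates an unnormalized distance, which is one reason to normalize the features before comparing them.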

The Nearest Neighbor (NN) classifier has been used to classify the normal, malignant, and benign classes. The classifier uses the Euclidean distance as a metric between the corresponding normalized wavelet coefficients, as shown in Equation (3):

$$D_{\text{Euclidean}} = \sqrt{\sum_{i=1}^{n} \left( x[i] - x_j[i] \right)^2} \tag{3}$$

where x is the feature vector of the unknown pattern, x_j is the mean vector of class j, and n is the dimensionality of the feature space. The Nearest Neighbor rule then assigns the vector x to the class with the minimum distance.

5. CADx EVALUATION

This section presents the results achieved by the microcalcifications diagnosis method. We obtained classification results among the normal, benign, and malignant classes using the Nearest Neighbor (NN) classifier.

Initially, the classification method was tested twice: once between normal and microcalcifications, and once between benign and malignant. The results were good in classifying malignant and benign images, whereas they were poor in classifying the normal images. The method was also tested with one, two, and three levels of decomposition using the db4 wavelet family, and the best results were obtained with level one.

Using the image sets of Table (1) and the db4 wavelet family with one level of decomposition, the method achieved 100% sensitivity (Equation 4), 100% specificity (Equation 5), 100% accuracy (Equation 6), and 100% precision (Equation 7). Table (4) shows the success rates of classification for the normal, benign, and malignant classes.

$$\text{Sensitivity} = \frac{\text{true positives}}{\text{true positives} + \text{false negatives}} \tag{4}$$

$$\text{Specificity} = \frac{\text{true negatives}}{\text{true negatives} + \text{false positives}} \tag{5}$$

$$\text{Accuracy} = \frac{\text{true positives} + \text{true negatives}}{\text{true negatives} + \text{true positives} + \text{false positives} + \text{false negatives}} \tag{6}$$

$$\text{Precision} = \frac{\text{true positives}}{\text{true positives} + \text{false positives}} \tag{7}$$

Table (4): Successful rates of classification

| Class | No. of testing sets | No. of successful sets | Successful rates |
|---|---|---|---|
| Normal | 20 | 20 | 100% |
| Benign | 4 | 4 | 100% |
| Malignant | 4 | 4 | 100% |

6. CONCLUSIONS

In this work, we proposed a method for microcalcifications diagnosis in breast mammograms. After implementing this method, we concluded the following:

1. Despite the abundance of theoretical studies in this field, computerized microcalcification detection and diagnosis still needs a great deal of study and practice. There are many reasons for this: the great variability among mammograms in the database; the use of poor-resolution mammograms; the small number of available databases; the fact that some mammograms are still film-based and read using a light box, so that commercial CADx systems digitize the film and present markers on a small display or on a separate printout; and the fact that, whereas for detection and classification the image is only decomposed to generate features for a classifier, the task of enhancement is more complex because it also requires image reconstruction.

2. Microcalcifications are subtle signs of breast cancer and are very difficult to detect and diagnose in mammographic images because of their small size, their low contrast with respect to normal breast tissue, and their proximity to the surrounding tissues.

3. Because of the importance of the information in medical images, our goal was to preserve this information. The characteristics of the wavelet transform may have had a significant impact on achieving this goal.

7. REFERENCES

[1] American Cancer Society (2011). Breast Cancer. Atlanta, Ga: American Cancer Society.

[2] Zimmerman, B. T. (2004). Understanding Breast Cancer Genetics. United States of America: University Press of Mississippi.

[3] Denarie, B. E. (2010). Using SURF imaging for efficient detection of micro-calcifications. M.Sc. thesis, Department of Engineering Cybernetics, Mathematics and Electrical Engineering, Faculty of Information Technology, Norwegian University of Science and Technology, Norway.

[4] Nesbitt, D. (1995). Automated Detection of Microcalcifications in Digitized Mammogram Film Images. M.Sc. thesis, Electrical Engineering, The Faculty of Graduate Studies, The University of British Columbia.

[5] Sankar, D., and Thomas, T. (2010). A New Fast Fractal Modeling Approach for the Detection of Microcalcifications in Mammograms. J. Digit. Imaging, Vol. 23, No. 5, pp. 538-546.

[6] Uchiyama, N., and Nascimento, M. (2012). Mammography - Recent Advances. InTech.

[7] Li, H., Liu, R., and Lo, S. (1997). Fractal Modeling and Segmentation for the Enhancement of Microcalcifications in Digital Mammograms. IEEE Trans. on Medical Imaging, Vol. 16, No. 6, pp. 785-798.

[8] Chan, H. P., Doi, K., Galhotra, S., Vyborny, C. J., MacMahon, H., and Jokich, P. M. (1987). Image feature analysis and computer-aided diagnosis in digital radiography, 1. Automatic detection of microcalcifications in mammography. Med. Phys., Vol. 14, No. 4, pp. 538-548.

[9] Chan, H. P., Doi, K., Vyborny, C. J., Lam, K. L., and Schmidt, R. A. (1988). Computer-aided detection of microcalcifications in mammograms methodology and preliminary clinical study. Investigative Radiol., Vol. 23, pp. 664-671.

[10] Chan, H. P., Doi, K., Vyborny, C. J., Schmidt, R. A., Metz, C., Lam, K. L., Ogura, T., Wu, Y., and MacMahon, H. (1990). Improvement in radiologists' detection of clustered microcalcifications on mammogram: The potential of computer-aided diagnosis. Investigative Radiol., Vol. 25, pp. 1102-1110.

[11] Karssemeijer, N. (1991). A stochastic model for automated detection of calcifications in digital mammograms. In Proc. 12th Int. Conf. Information Processing Medical Imaging, Wye, U.K., pp. 227-238.

[12] Karssemeijer, N. (1993). Recognition of clustered microcalcifications using a random field model, biomedical image processing and biomedical visualization. In SPIE Proc., Vol. 1905, San Jose, CA, pp. 776-786.

[13] Karssemeijer, N. (1993). Adaptive noise equalization and recognition of microcalcification clusters in mammograms. Int. J. Pattern Recognit. Artificial Intell., Vol. 7, No. 6, pp. 1357-1376.

[14] Strickland, R. N. and Hahn, H. I. (1994). Wavelet transforms for detecting microcalcifications in mammography. In Proc. Int. Conf. Image Processing, Austin, TX, pp. 402-406.

[15] Strickland, R. N. and Hahn, H. I. (1994). Detection of microcalcifications using wavelets. In Digital Mammography'94, Proc. 2nd Int. Workshop Digital Mammography, York, U.K., pp. 79-88.

[16] Strickland, R. N. and Hahn, H. I. (1995). Wavelet transform matched filters for the detection and classification of microcalcifications in mammography. In Proc. IEEE Int. Conf. Image Processing, Washington, DC, pp. 422-425.

[17] Strickland, R. N., Hahn, H. I., and Baig, L. J. (1996). Wavelet methods for combining CAD with enhancement of mammograms. In Medical Imaging 1996: Image Processing, SPIE Proc., Vol. 2710, pp. 888-903.

[18] Strickland, R. N. and Hahn, H. I. (1996). Wavelet transform for detecting microcalcifications in mammograms. IEEE Trans. Med. Imag., Vol. 15, pp. 218-229.

[19] Strickland, R. N. and Hahn, H. I. (1997). Wavelet transform methods for objects detection and recovery. IEEE Trans. Image Processing, Vol. 6, pp. 724-735.

[20] Yoshida, H., Doi, K., and Nishikawa, R. M. (1994). Automated detection of clustered microcalcifications in digital mammograms using wavelet transform techniques. In Medical Imaging 1994: Image Processing, Proc. SPIE, Vol. 2167, Newport Beach, CA, pp. 868-886.

[21] Yoshida, H., Zhang, W., Cai, W., Doi, K., Nishikawa, R. M., and Giger, M. L. (1995). Optimizing wavelet transform based on supervised learning for detection of microcalcifications in digital mammograms. In Proc. IEEE Int. Conf. Image Processing, Vol. 3, Washington, DC, pp. 152-155.

[22] Cheng, H., Lui, Y. M., and Freimanis, R. I. (1998). A novel approach to microcalcification detection using fuzzy logic techniques. IEEE Trans. Med. Imag., Vol. 17, pp. 442-450.

[23] Yu, S., and Guan, L. (2000). A CAD System for the Automatic Detection of Clustered Microcalcifications in Digitized Mammogram Films. IEEE Transactions on Medical Imaging, Vol. 19, No. 2, pp. 115-126.

[24] Jiang, J., Yao, B., and Wason, A. M. (2007). A genetic algorithm design for microcalcification detection and classification in digital mammograms. Computerized Medical Imaging and Graphics, Vol. 31, No. 1, pp. 49-61.

[25] Laine, A., Schuler, S., Fan, J., and Huda, W. (1994). Mammographic feature enhancement by multiscale analysis. IEEE Transactions on Medical Imaging, Vol. 13, No. 4, pp. 725-740.

[26] Mallat, S. G. (1989). Multifrequency channel decomposition of images and wavelet models. IEEE Trans. Acoust., Speech, Signal Processing, Vol. 37, pp. 2091-2110.
