ANTICANCER RESEARCH 31: 2569-2574 (2011)

First Evaluation of the Diagnostic Accuracy of an Automated 3D System in a Breast Screening Setting

FRANK STÖBLEN1, SOLVEIG LANDT2, RUTH STELKENS-GEBHARDT1, JALID SEHOULI3, MAHDI REZAI4 and SHERKO KÜMMEL5

1Department of Radiology, Huyssensstift Kliniken Essen-Mitte, Essen, Germany; 2Department of Gynecology and , University Hospital Düsseldorf, Düsseldorf, Germany; 3Department of Gynecology and Obstetrics, Charité University Hospital Berlin, Berlin, Germany; 4Breast Center, Luisenkrankenhaus, Düsseldorf, Germany; 5Breast Center, Huyssensstift Kliniken Essen-Mitte, Essen, Germany

Abstract. Background/Aim: Automated ultrasound examination of suspicious findings can reduce the physician’s workload in screening. The present study examines the diagnostic accuracy of this method in comparison to mammography as the reference standard for the first time. Patients and Methods: A total of 304 patients underwent automated 3D ultrasound examination after screening mammography. Mammograms and ultrasound images were assessed by independent examiners, and sensitivity, specificity and the degree of agreement between both methods were calculated. Results: The degree of agreement was moderate (Cohen’s κ=0.130 for all and 0.153 for positive/negative ratings), mainly owing to a high percentage of false-positive ultrasound results. However, the results of sonographical re-examination of suspicious mammograms were favorable. The only two undetected proven malignant lesions were microcalcified, and in three more cases with disagreement, the ultrasound diagnosis was correct. Conclusion: Automated 3D ultrasound imaging appears to be on a par with hand-held ultrasound in terms of diagnostic quality.

Correspondence to: Frank Stöblen, Department of Radiology, Hyssensstift Kliniken-Essen-Mitte, Henricistrasse 92, 45136 Essen, Germany. Tel: +49 020117435001, Fax: +49 020117435000, e-mail: [email protected]

Key Words: Breast cancer, screening, mammography, ultrasound, automated 3D imaging.

0250-7005/2011 $2.00+.40

Ultrasound examination of the breast has evolved into an indispensable diagnostic tool for early detection of breast cancer since its introduction in the 1950s (1). According to current guidelines in Germany, ultrasound examination is recommended in the following situations (2): as a first-line method for the assessment of palpable lesions in women under 40 years of age; for the assessment of mammographically suspicious lesions (ACR 3-4, BI-RADS 0, III, IV and V) detected after clinical suspicion (40-49 and over 70 years) or routine screening mammography (50-69 years); and for interventional biopsy in BI-RADS IV/V lesions. Presently, ultrasound examination is not recommended as the sole method of breast cancer screening.

Like ultrasound in general, sonographic breast diagnosis can be extremely accurate (3), but its quality depends not only on a high standard of technical equipment (1, 2), but also on the proficiency, experience and diligence of the examiner (4). When performed thoroughly, the examination requires approximately 20 minutes of physician time according to the recently published results of the ACRIN 6666 study (5).

Another potential pitfall is the fact that sonography results are not automatically stored, limiting the possibilities for review under diagnostic, but also forensic aspects.

Since breast cancer screening is a setting with high economic impact, an optimized relationship between cost and benefit is of pivotal importance, and the reduction of physician time can be a means to this end, directing research focus on automated diagnostic systems that can be applied by medical technicians or assistants (6).

A relatively recent development in ultrasound diagnostics of the breast is the possibility for automated 3D real-time imaging that offers a number of potentially significant advantages over conventional ultrasound (1, 7, 8): improved differentiation of architectural distortion, especially in the ‘bird’s eye’ view; better appreciation of tumour volume, indispensable for monitoring of patients under neo-adjuvant treatment regimens; storage of the complete imaging data for off-line expert review and image processing; securing of complete volume coverage; and reduction of physician time, i.e. cost.

Presently the rationale for automated 3D imaging in breast cancer screening in addition to or instead of conventional ultrasound techniques is unclear, and clinical experience is rather limited (7-10). The issue of a possible application of automated ultrasound breast examination for the screening of dense breasts in addition to mammography is, however relevant, not the subject of the present study; rather, this study examines the diagnostic accuracy of automated ultrasound breast examination in comparison to mammography as the reference standard.

Table I. Contingency tables of mammography and breast scanner results (identical classification in grey fields).

Positive/negative ratings (κ=0.153)

Mammography BI-RADS    Ultrasound BI-RADS
                       I-III    IV-V     Total
I-II                   230      60       290
IV                     5¥       9        14
Total                  235      69       304

All ratings (κ=0.130)

Mammography BI-RADS†   Ultrasound BI-RADS
                       I      II     III    IV     V      Total
I                      114    34     26     35     11     220
II                     33     13     10     10     4      70
IV                     2‡     1§     2$     5      4      14
Total                  149    48     38     50     19     304

†Conference decision where applicable; ‡one microcalcified DCIS, in one case ultrasound correct, no tumor; §ultrasound correct, cyst; $one microcalcified DCIS, in one case ultrasound correct, benign; ¥2 microcalcified DCIS, US correct in three cases.

Patients and Methods

Patients. Patients were recruited for the trial between August 26th and November 10th, 2008. All women attending the diagnostic centre for screening mammography [which is funded by compulsory health insurance (CHI) on a biennial basis for women of 50-69 years of age in Germany] during this period were eligible, and 310 consecutive patients were considered for participation.

Breast density was not a criterion for enrolment; patients with densities ACR 1-4 were enrolled, and the majority were classified as ACR 2 (‘fat with some fibroglandular tissue’).

Informed consent (oral and written) was obtained from all patients, and no patient refused participation after receiving the information regarding the study. Pre-menopausal patients were questioned about the possibility of pregnancy, which was denied by all.
The study met the criteria of ‘Good Clinical Practice’ and the principles of the Declaration of Helsinki.

Patients were between 50.1 and 69.8 years of age upon examination (mean 58.3±6.4 years). The diagnostic procedure was completed in 304 patients per protocol; the patient data was made anonymous and processed for statistical evaluation.

Data collection was planned before the breast scanner and mammographic examinations were performed, i.e. the study was prospective.

Diagnostic procedures. After declaring informed consent, patients first underwent mammography with the Mammomat NovationDR full-field digital system in combination with the syngo Acquisition Workstation (AWS) and MammoReport breast care workplace (Siemens Healthcare, Erlangen, Germany). This system is nationally and internationally approved for breast cancer screening. According to the manufacturer’s recommendations, the W/Rh target/filter combination was employed for all breast types even though Mo/Mo and Mo/Rh target/filter combinations are also available. All examinations were performed under automatic exposure control (AEC).

After completion of mammography, automated 3D ultrasound examination was performed with a SomoVu device (U-Systems, San Jose, CA, USA in technology and distribution collaboration with Siemens Health Care Inc., Ultrasound Division, Mountain View, CA, USA). This system allows automated acquisition of multi-plane ultrasound images of the breast by serial images in a volume of up to 14.5×17×5 cm. The images are interpreted at the BreastView workstation.

In contrast to standard screening procedure, mammographies were assessed by two independent, experienced radiologists who had no access to each other’s diagnoses or the ultrasound images in order to rule out inter-individual differences as a cause for diagnostic errors.

The breast scanner images were evaluated by an independent investigator who was oblivious to the mammography results. Each case of disagreement (with regard to BI-RADS classification) between two investigators was reviewed by an external senior radiologist otherwise uninvolved in the study, and a conference decision was made.

As part of the regular screening follow-up (2), patients were re-examined at 6-monthly intervals; none of these re-examinations led to the reversal of a diagnosis made during the study. Patients with unsuspicious mammograms, but suspicious ultrasound findings, were re-examined, but the results of this re-examination were not part of the present study.

Statistical data evaluation. After completion of the last patient’s examination the following information was entered for data processing and evaluation: patients’ age and date of examination; medical history (previous examinations or treatments of the breasts); mammography results (tissue density, classification of lesions, if any); and ultrasound results (description, size, localisation and histological characteristics of lesions [if any], BI-RADS classification).

The diagnostic accuracy of the automated ultrasound scanning system was assessed in comparison with mammography as the reference standard, and data was analysed according to the Standards for Reporting of Diagnostic Accuracy (STARD) recommendations (11).

For the statistical analysis, the STATISTICA software package was employed (StatSoft, Tulsa, OK, USA). Sensitivity and specificity of the ultrasound breast scan in comparison to mammography as a reference were calculated from the χ² contingency table, and the degree of agreement between both methods was determined by means of Cohen’s κ statistics.
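As an illustration of the calculations named above, the following minimal sketch recomputes sensitivity, specificity, prevalence and an unweighted Cohen's κ from the positive/negative counts of Table I (230, 60, 5 and 9). It is not the STATISTICA procedure used in the study; the function and variable names are assumptions made for this example. The unweighted κ comes out near 0.15, close to the reported 0.153; the source does not explain the small difference.

```python
# Counts from the positive/negative panel of Table I
# (rows: mammography as reference, columns: automated ultrasound).
tn, fp = 230, 60   # mammography BI-RADS I-II: ultrasound I-III / IV-V
fn, tp = 5, 9      # mammography BI-RADS IV:   ultrasound I-III / IV-V


def accuracy_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity and prevalence for a 2x2 table."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # reference-positive cases rated positive
    specificity = tn / (tn + fp)   # reference-negative cases rated negative
    prevalence = (tp + fn) / n     # share of reference-positive cases
    return sensitivity, specificity, prevalence


def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_obs = sum(table[i][i] for i in range(k)) / n
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n**2
    return (p_obs - p_exp) / (1 - p_exp)


sens, spec, prev = accuracy_measures(tp, fp, fn, tn)
kappa = cohen_kappa([[tn, fp], [fn, tp]])
print(f"sensitivity={sens:.1%} specificity={spec:.1%} "
      f"prevalence={prev:.1%} kappa={kappa:.3f}")
```

The five-category table is not square in the reported data (no mammography ratings of III or V), so a weighted κ for all ratings would need an explicit choice of weights and padding that the source does not specify; it is left out here.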

2570 Stöblen et al: Diagnostic Accuracy of Automated 3D Ultrasound in Breast Screening

Table II. Diagnostic details of patients who would have undergone ultrasound examination in a clinical screening setting (BI-RADS IV in one rating) and had an ultrasound rating of ≤III (possible false-negative ultrasound results).

          Mammography BI-RADS
Case no.  Assessments of    Conference   Breast scanner   Result
          both examiners    decision     BI-RADS

18        I/IVb             I            I                Normal breast, ACR 2, artefact
21        IVa/IVa           IVa          III              Microcalcified DCIS, 5 mm in diameter
23        I/IVa             II           II               Benign
24        IVa/IVa           IVa          III              No tumor
35        I/IVa             IVa          I                Benign†
36        IVa/I             II           I                No tumor
105       IVa/III           II           II               Benign
107       IVa/II            IVa          I                Microcalcified DCIS
109       IVa/I             I            II               No tumor
116       IVa/I             I            III              No tumor
129       IVa/II            II           I                Benign
135       IVa/I             I            II               No tumor
165       I/IVa             IVa          II               Cyst, no malignancy‡
168       I/IVa             I            I                No tumor
170       IVa/I             I            I                No tumor
171       II/IVa            II           I                Benign
179       IVa/I             I            III              Benign
223       IVa/I             II           I                Benign
225       IVa/IVa           II           I                Cyst, no malignancy
251       IVa/I             I            I                No tumor
256       IVa/I             II           II               Microcalcifications, no tumor
265       I/IVa             II           I                Microcalcifications, no tumor
270       II/IVa            II           II               Cyst, no malignancy
276       IVa/I             II           III              Benign
290       IVa/I             I            II               No tumor
296       IVa/I             I            I                No tumor
297       IVa/I             II           I                Microcalcified epithelial proliferation, surgically removed
298       IVa/IVa           I            II               No tumor

†Imaging findings displayed in Figure 2; ‡Imaging findings displayed in Figure 3.
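The rows of Table II can be tallied to see how often the two mammography examiners disagreed on a suspicious versus non-suspicious rating, compared with how often the conference decision disagreed with the breast scanner. The sketch below is only an illustration: the tuple layout and the helper `suspicious` are assumptions made for this example, and because Table II contains only cases with at least one BI-RADS IV rating and an ultrasound rating of ≤III, the counts describe this selected subset and not the whole sample.

```python
# (case no., examiner 1, examiner 2, conference decision, breast scanner)
# transcribed from Table II.
rows = [
    (18, "I", "IVb", "I", "I"),      (21, "IVa", "IVa", "IVa", "III"),
    (23, "I", "IVa", "II", "II"),    (24, "IVa", "IVa", "IVa", "III"),
    (35, "I", "IVa", "IVa", "I"),    (36, "IVa", "I", "II", "I"),
    (105, "IVa", "III", "II", "II"), (107, "IVa", "II", "IVa", "I"),
    (109, "IVa", "I", "I", "II"),    (116, "IVa", "I", "I", "III"),
    (129, "IVa", "II", "II", "I"),   (135, "IVa", "I", "I", "II"),
    (165, "I", "IVa", "IVa", "II"),  (168, "I", "IVa", "I", "I"),
    (170, "IVa", "I", "I", "I"),     (171, "II", "IVa", "II", "I"),
    (179, "IVa", "I", "I", "III"),   (223, "IVa", "I", "II", "I"),
    (225, "IVa", "IVa", "II", "I"),  (251, "IVa", "I", "I", "I"),
    (256, "IVa", "I", "II", "II"),   (265, "I", "IVa", "II", "I"),
    (270, "II", "IVa", "II", "II"),  (276, "IVa", "I", "II", "III"),
    (290, "IVa", "I", "I", "II"),    (296, "IVa", "I", "I", "I"),
    (297, "IVa", "I", "II", "I"),    (298, "IVa", "IVa", "I", "II"),
]


def suspicious(rating):
    """BI-RADS IV or V (including subcategories such as IVa) counts as suspicious."""
    return rating.startswith(("IV", "V"))


# Cases in which the two examiners differ on suspicious vs. non-suspicious.
examiner_disagree = sum(
    suspicious(a) != suspicious(b) for _, a, b, _, _ in rows
)
# Cases in which conference decision and breast scanner differ in the same sense.
conference_vs_scanner_disagree = sum(
    suspicious(c) != suspicious(u) for _, _, _, c, u in rows
)

print(f"{len(rows)} cases: examiners disagree in {examiner_disagree}, "
      f"conference vs. breast scanner in {conference_vs_scanner_disagree}")
```

The five conference-versus-scanner disagreements are the cases with a conference decision of IVa (cases 21, 24, 35, 107 and 165).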

Results

The results of the comparison of raw mammography and 3D-sonography data are displayed in Table I. In all instances where a consensus conference was held because of contradictory results, those results were used; in all other cases, the higher mammography BI-RADS rating was employed for analysis. According to this table, the prevalence of BI-RADS IV was 4.6%, and the sensitivity and specificity of the 3D ultrasound scanning were 64.3% and 79.3%, respectively. Cohen’s κ (weighted) was 0.130 for all ratings and 0.153 for positive/negative ratings, respectively, indicating only a moderate degree of agreement between both methods. This was mainly due to a substantial percentage of false-positive ultrasound ratings: In 60 cases (20.4% of the entire sample and 20.7% of patients with mammography BI-RADS I or II), the ultrasound examination yielded a rating of BI-RADS IV/V, despite a non-suspicious mammogram (I/II). Moreover, in those 5 cases where the sensitivity/specificity analysis yielded false-negative results of the breast scan (measured against the mammography as a reference), further analysis showed that only in two cases was the ultrasound rating indeed wrong, whereas in three cases, the mammography conference decision was wrong and the sonography rating was correct (Table I).

According to the clinical application of ultrasound in breast screening, however, false-positive results are irrelevant because only mammographically suspicious lesions are evaluated in the first place. Therefore, those cases in which the ultrasound examination would have been performed in a clinical screening setting (i.e. mammographies with BI-RADS classification of III, IV, and V by one or both investigators) were reviewed in more detail (Table II).

In both cases where an actual tumor was not detected sonographically, the neoplasm was microcalcified, hence eluding ultrasound imaging. In all other instances, the conference decision or the histological diagnosis, respectively, was in accordance with the breast scanner result, so that false-negative results of the latter occurred exclusively when microcalcification was present. More importantly, the automated 3D ultrasound imaging was more accurate than the mammography in all instances where there was disagreement between the two and lesions were not microcalcified (Figure 1). The imaging findings of two patients are shown in Figure 2 (benign growth) and Figure 3 (cyst).

Furthermore, Table II underlines that the level of disagreement between both investigators rating the mammograms was substantially higher than that between the conference decision and the ultrasound result.

Figure 1. Frequency distribution of cases under conference review because of BI-RADS IV/V ratings in mammography.

Figure 2. Mammographic and sonographic findings in patient #35.

Figure 3. Mammographic and sonographic findings in patient #165.

Discussion

From a practical point of view, automated 3D ultrasound has a number of compelling advantages that warrant its further examination in systematic studies. Most important is the reduction and adaptation of the time a physician spends on the diagnosis of a given patient. The actual examination of 6 scans per patient at the workstation only requires approximately 5 min. Moreover, off-line assessment at a time of the examiner’s choice greatly reduces organizational burden, and the complete storage of imaging data allows for the convenient consultation of colleagues. A spin-off effect is the complete documentation of the diagnostic process for later retrieval, be it for medical or forensic reasons. Further potential benefits are the opportunity to utilize volume measurement as a tool for the assessment of chemotherapy effect, differential viewing perspectives and whole-breast imaging.

The present study indicates that the choice of mammography as a reference standard for breast cancer imaging modalities may require critical review. In terms of detection of actual lesions, it is only superior to ultrasound when microcalcification is present, and this shortcoming of sonographical diagnosis is well known (3, 12, 13). On the other hand, the automated ultrasound scanning led to the correction of three BI-RADS IVa conference decisions, therefore contributing significantly to the crucial avoidance of overdiagnosis and over-therapy.

However, we also confirmed another shortcoming of ultrasound imaging of the breast, namely the relatively high percentage of false-positive results (3, 14, 15) that currently disqualifies sonography from being a first-line screening method. Considering the results in their entirety, they suggest that the sequence of mammography and ultrasound imaging as per the current screening standard makes perfect sense, and that automated 3D ultrasound scanning is probably equivalent to conventional hand-held imaging in terms of diagnostic accuracy. The latter conclusion is shared by Wenkel et al. (10) based on a recently published comparative study of hand-held vs. automated sonography and confirms previous studies with similar results (16, 17).

Both the aforementioned pitfalls are also not exclusive to automated 3D ultrasound, but affect manual hand-held ultrasonography in equal measure (3, 12-15). Therefore there are presently no evidence-based reasons to favor one ultrasound modality over the other from a diagnostic point of view.

Obviously, the achievement of high diagnostic accuracy in ultrasound imaging requires a high degree of technical and medical proficiency from the examiner in charge. Whereas this generally applies to both methods, the automated 3D scan is less prone to examiner-related errors for two reasons: Firstly, the close adherence to a strict diagnostic protocol, which also yields a high accuracy when applied to hand-held ultrasound (18), is uncoupled from the examiner and ensured automatically; secondly, off-line examination allows for unlimited careful re-runs and the consultation of senior specialists independently of their geographical location.

To date, there is certainly a broader base of diagnostic breast centers with expertise and experience in hand-held ultrasound imaging of the breast than in automated 3D imaging since the latter has only been available for about three years (1, 3, 8-10), but this cannot justify an unquestioned preference for the former.

Conclusion

The impact of possible diagnostic advantages of automated ultrasound imaging such as the ‘bird’s eye’ view, better volume appreciation and whole breast coverage cannot be assessed at the moment, and certainly not based on the results of the present study. An evidence-based assessment of this issue would require a comparative study setting in which both methods are employed, either on the same patients or in a randomized controlled study, and positive imaging results are verified or falsified by histological or cytological examination.

In full appreciation of the effort involved in such a trial, we consider it well justified based on the results of the present study, which are doubtlessly preliminary but nevertheless suggestive.

Acknowledgements

Professor Per Skaane MD (Department of Radiology, Breast Imaging Center, Ullevaal University Hospital, Kirkeveien 166, N-0407 Oslo, Norway) reviewed each case of disagreement between two investigators with regard to BI-RADS classification. Hartmut Buhck, M.D., provided editorial advice and assistance in statistical data evaluation, as well as for the methodical aspects of result interpretation.

References

1 Athanasiou A, Tardivon A, Ollivier L, Thibault F, El Khoury C and Neuenschwander S: How to optimize breast ultrasound. Eur J Radiol 69: 6-13, 2009.
2 Albert US, Altland H, Duda V, Engel J, Geraedts M, Heywang-Köbrunner S, Hölzel D, Kalbheim E, Koller M, König K, Kreienberg R, Kühn T, Lebeau A, Nass-Griegoleit I, Schlake W, Schmutzler R, Schreer I, Schulte H, Schulz-Wendtland R, Wagner U and Kopp I: Kurzfassung der aktualisierten Stufe-3-Leitlinie Brustkrebs-Früherkennung in Deutschland 2008. Rofo 180: 455-465, 2008.
3 Sehgal CM, Weinstein SP, Arger PH and Conant EF: A review of breast ultrasound. J Mammary Gland Biol Neoplasia 11: 113-123, 2006.
4 Education and Practical Standards Committee, European Federation of Societies for Ultrasound in Medicine and Biology: Minimum training recommendations for the practice of medical ultrasound. Ultraschall Med 27: 79-105, 2006.
5 Berg WA, Blume JD, Cormack JB, Mendelson EB, Lehrer D, Böhm-Vélez M, Pisano ED, Jong RA, Evans WP, Morton MJ, Mahoney MC, Larsen LH, Barr RG, Farria DM, Marques HS and Boparai K: Combined screening with ultrasound and mammography vs. mammography alone in women at elevated risk of breast cancer. JAMA 299: 2151-2163, 2008.
6 Elsheikh TM: Does the new automated ‘HALO’ nipple aspiration fluid system really deliver as promised? The answer is "No, but...": A literature review of the role of breast fluid cytology in cancer risk assessment. Diagn Cytopathol 37: 699-704, 2009.
7 Fischer T, Filimonow S, Hamm B, Slowinski T and Thomas A: Dignitätsbeurteilung mammasonographischer Herde mittels dreidimensionaler Darstellung. Rofo 178: 1224-1234, 2006.
8 Kotsianos-Hermle D, Wirth S, Fischer T, Hiltawsky KM and Reiser M: First clinical use of a standardized three-dimensional ultrasound for breast imaging. Eur J Radiol 71: 102-108, 2009.
9 Kotsianos-Hermle D, Hiltawsky KM, Wirth S, Fischer T, Friese K and Reiser M: Analysis of 107 breast lesions with automated 3D ultrasound and comparison with mammography and manual ultrasound. Eur J Radiol 71: 109-115, 2009.
10 Wenkel E, Heckmann M, Heinrich M, Schwab SA, Uder M, Schulz-Wendtland R, Bautz WA and Janka R: Automated breast ultrasound: lesion detection and BI-RADS classification - a pilot study. Rofo 180: 804-808, 2008.
11 Bossuyt PM, Reitsma JB, Bruns DE, Gatsonis CA, Glasziou PP, Irwig LM, Lijmer JG, Moher D, Rennie D and de Vet HC: Towards complete and accurate reporting of studies of diagnostic accuracy: The STARD Initiative. Radiology 226: 24-28, 2003.
12 Flobbe K, Nelemans PJ, Kessels AG, Beets GL, von Meyenfeldt MF and van Engelshoven JM: The role of ultrasonography as an adjunct to mammography in the detection of breast cancer. A systematic review. Eur J Cancer 38: 1044-1050, 2002.
13 Prasad SN and Houserkova D: A comparison of mammography and ultrasonography in the evaluation of breast masses. Biomed Pap Med Fac Univ Palacky Olomouc Czech Repub 151: 315-322, 2007.
14 Smith DN: . Radiol Clin North Am 39: 485-497, 2001.
15 Houssami N, Lord SJ and Ciatto S: Breast cancer screening: emerging role of new imaging techniques as adjuncts to mammography. Med J Aust 190: 493-497, 2009.
16 Guingrich J, Destounis S, Kaplan S, Thurmond A and Youker J: Evaluation of the Somo-Vu™ by U-Systems in diagnostic patients. Radiological Society of North America scientific assembly and annual meeting program. Oak Brook, Ill: Radiological Society of North America, 2006.
17 Chou YH, Tiu CM, Chiang HR, Chen SP, Chiou HJ and Chiou SY: Ultrasound ACR BI-RADS® categories applied in an automated breast ultrasound system: Diagnostic reliability. Radiological Society of North America scientific assembly and annual meeting program. Oak Brook, Ill: Radiological Society of North America, 2006.
18 Berg WA, Blume JD, Cormack JB and Mendelson EB: Operator dependence of physician-performed whole-breast US: lesion detection and characterization. Radiology 241: 355-365, 2006.

Received March 16, 2011
Revised June 21, 2011
Accepted June 21, 2011
