4. Efficacy of Breast Cancer Screening
Total Page:16
File Type:pdf, Size:1020Kb
4. EFFICACY OF BREAST CANCER SCREENING 4.1 Methodological and analytical control arm. Hence, an intention-to-treat anal- issues ysis of RCTs, i.e. an analysis in which the data are analysed according to the original random- To evaluate the efficacy of screening, it is ized design, may actually have a limited ability to important to consider the definitions of efficacy address efficacy, due to non-ideal circumstances and effectiveness for an intervention, to define (Gulati et al., 2012). outcome measures, and to consider potential This section focuses primarily on the assess- biases. ment of efficacy; methodological issues in the assessment of effectiveness are addressed in 4.1.1 Efficacy versus effectiveness Section 5.1. The term “efficacy” should be distinguished 4.1.2 Primary outcome measures from the term “effectiveness”. Efficacy is “the extent to which a specific intervention, proce- The primary outcome measure is reduction dure, regimen, or service produces a beneficial in breast cancer mortality, although increasing result under ideal conditions” (Porta, 2014), life expectancy or reduction of metastatic disease whereas effectiveness is “a measure of the extent can also be considered efficacy measures. Given to which a specific intervention, procedure, the natural history of the disease, a minimum regimen, or service, when deployed in the field in requirement in addressing efficacy is a sufficiently the usual circumstances, does what it is intended long follow-up (Hanley, 2011). Some authors have to do for a specified population” Porta,( 2014). In suggested that the use of breast cancer mortality practice, true efficacy [under ideal conditions] as the end-point of a trial may have led to unre- can rarely be estimated. Randomized controlled liable estimates of the relative risk reduction, trials (RCTs), which are conducted to initially due to possible uncertainties surrounding the assess whether screening works, assess efficacy determination of breast cancer death (leading by estimating a primary outcome, such as reduc- to misclassification of deaths), and that the tion in breast cancer mortality in the study arm use of all-cause mortality as the end-point of a compared with the control arm. However, the trial would resolve this bias (Black et al., 2002; measure of efficacy is limited by the implementa- Gøtzsche & Jørgensen, 2013). However, others tion of the intervention and other practical issues have argued that all-cause mortality is not an – for instance, less than 100% compliance in appropriate end-point for screening trials for a the study arm and unintended screening in the specific disease Tabár( et al., 2002; Marmot et al., 237 IARC HANDBOOKS OF CANCER PREVENTION – 15 2013; Weiss, 2014). Although using all-cause including slower-growing tumours, which mortality avoids the need to determine cause are generally thought to be associated with a of death precisely, breast cancer deaths reflect a better prognosis and therefore longer survival. small fraction of all-cause mortality, and trials of This again leads to bias apparently in favour of the size needed to have sufficient statistical power screening. The most extreme form of length bias to detect the expected small effects of screening is referred to as overdiagnosis. Some ductal carci- on all-cause mortality would be logistically and noma in situ (DCIS) may never progress to inva- financially impracticable. A Swedish review, sive cancer or present clinically (in the absence of which incorporated all Swedish RCTs of breast screening) (Yen et al., 2003), and some invasive cancer screening, showed a 2% non-significant cancers may be sufficiently indolent that they reduction in all-cause mortality (Nyström et al., would never have presented clinically during the 2002a), which is in line with the expected 0.94% woman’s lifetime if they had not been detected by (Nyström et al., 2002b). screening (see Section 4.2.3c). The last important bias in evaluating 4.1.3 Biases screening is selection bias. Women attend screening voluntarily, and participants might Several sources of bias have important effects therefore generally be more health-conscious on the estimation of screening efficacy. and have a lower baseline risk of breast cancer The first important bias is lead-time bias. than non-participants, although in practice this The general concept of screening is that by early assumption may not hold true (Paap et al., 2011). detection of disease and subsequent treatment, The decision to attend screening may also be prognosis is improved and the probability of influenced by certain demographic and social death from the disease is reduced. The time factors (see Section 3.1) that affect disease prog- between screen detection and the point at which nosis, for example familial risk. In RCTs with a tumour would have presented and been clin- mortality as the end-point, such a selection may ically diagnosed (in the absence of screening) hamper the generalizability of the results. is referred to as “lead time” (Cole & Morrison, Evaluations of efficacy and effectiveness must 1980). The survival time, the time from breast control for the above-mentioned biases if they cancer diagnosis to death, of screen-detected are to provide credible estimates. To eliminate cases is increased because of this lead time, lead-time and length bias, differences in breast even for individuals who do not benefit from cancer mortality rates (between the trial arms or screening. Lead-time bias may therefore appear different populations) should be the end-point to act in favour of screening, if efficacy is evalu- of a study rather than survival, because survival ated by survival analyses. time in cancer patients is extended due to lead The second important bias is length bias time and is more favourable due to length-biased (Cole & Morrison, 1980) (sometimes referred sampling. Selection bias can partially be quanti- to as length-time bias). The probability of a fied by comparing non-participants with histor- tumour being detected at screening is (partially) ical or recent data on mortality or risk factors dependent on the growth rate of the tumour, and can, perhaps, be controlled for by adjusting because slow-growing tumours have a longer for risk factors or their surrogates (e.g. socioec- preclinical detectable phase (sojourn time) and onomic status; Allgood et al., 2008) or by the are therefore more likely to be detected than fast- application of an empirically derived adjust- growing tumours. Tumours detected at screening ment factor (Paap et al., 2014). In addition, it thus reflect a biased sample of preclinical lesions, has been argued that any bias due to selection 238 Breast cancer screening for screening is likely to be small in organized randomization by cluster is an alternative. Both programmes with invitation schemes based on types of randomization have been used in the population registries and with high attendance RCTs of breast cancer screening. Recruitment rates (van Schoor et al., 2011a, b). and randomization are less complex with cluster randomization, but an equal distribution of risk 4.1.4 Use of randomized controlled trials factors between the intervention group and the control group is less likely to be achieved than Reduction in breast cancer mortality in with individual randomization. Furthermore, women offered screening relative to women not subjects with a previous diagnosis of breast offered screening is the appropriate measure of cancer at the time of randomization are, ideally, benefit of an RCT. Lead-time and length bias excluded from the trial. Whereas a previous are then eliminated in the analyses. Women are diagnosis can be determined more easily in followed up from the time of randomization RCTs with individual randomization, this may instead of from the time of diagnosis, which be more difficult to achieve beforehand with avoids lead time, and all deaths from breast cluster randomization. An important advantage cancer that occur during the follow-up period of cluster randomization is that contamination of are included in the analysis. The RCTs of breast (screening in) the control group may be reduced. cancer screening are evaluated in accordance As mentioned above, the screening effect with the intention-to-screen principle, taking in RCTs is dependent on, among other things, into account in the intervention group both the compliance in the intervention group and women who accept the invitation to screening the limitation of contamination of the control and women who decline the invitation. The group. Low compliance reduces the estimate of resulting point estimate of reduction in breast effect and must therefore be reported. Screening cancer mortality therefore does not evaluate the of controls by services outside of the trial will screened groups of individuals only. also dilute the effect of screening on breast In RCTs, participants are randomly assigned cancer mortality. Possible contamination of to either the intervention group or the control the control group is often difficult to measure, group to prevent confounding at baseline, especially because mammography is also used accounting for both observable characteris- for clinical diagnosis of breast cancer and this tics and unknown confounders. However, even use may not be easily distinguished from use for well performed randomization schemes may opportunistic screening. Methods to adjust for not prevent potential imbalances completely. contamination and poor compliance have been To take into account possible differences in risk proposed (Cuzick et al., 1997; Baker et al., 2002). factors for death from breast cancer between Furthermore, unless the breast cancer mortality the intervention group and the control group, analysis is limited to those diagnosed with breast an assessment should preferably be made of the cancer during the screening phase of the trial distribution of risk factors in both groups at period, with longer follow-up, screening of the trial entry, which would permit adjustment in control group can influence the observed differ- the analysis (although most known risk factors ence between the intervention group and the for breast cancer seem to have limited predic- control group.