An Independent External Validation and Evaluation of QRISK Cardiovascular Risk Prediction: a Prospective Open Cohort Study
Total Page:16
File Type:pdf, Size:1020Kb
RESEARCH BMJ: first published as 10.1136/bmj.b2584 on 7 July 2009. Downloaded from An independent external validation and evaluation of QRISK cardiovascular risk prediction: a prospective open cohort study Gary S Collins, medical statistician,1 Douglas G Altman, professor of statistics in medicine1 Centre for Statistics in Medicine, ABSTRACT high risk population for cardiovascular disease in the Wolfson College Annexe, Objective To independently evaluate the performance of United Kingdom. QRISK underestimates 10 year University of Oxford, Oxford OX2 6UD the QRISK score for predicting 10 year risk of cardiovascular disease risk, but the magnitude of Correspondence to: G S Collins cardiovascular disease in an independent UK cohort of underprediction is smaller than the overprediction with [email protected] patients from general practice and compare the Anderson Framingham. performance with Framingham equations. Cite this as: BMJ 2009;339:b2584 doi:10.1136/bmj.b2584 Design Prospective open cohort study. INTRODUCTION Setting 274 practices from England and Wales Risk prediction models can play an important role in contributing to the THIN database. decision making and future management of individual Participants 1.07 million patients, registered between 1 or groups of patients with a particular medical January 1995 and 1 April 2006, aged 35-74 years (5.4 condition.1 Such models are designed, in principle, to million person years) with 43 990 cardiovascular events. estimate or predict the probability or risk of a patient Main outcome measures First diagnosis of cardiovascular developing some future clinical event based on a num- disease (myocardial infarction, coronary heart disease, ber of patient and disease characteristics. http://www.bmj.com/ stroke, and transient ischaemic attack) recorded in Unfortunately, of the overwhelming plethora of risk general practice records. score indices published every year most make no clin- Results This independent validation indicated that QRISK ical impact, offer little in the design of future prognostic offers an improved performance in predicting the 10 year studies and subsequently disappear into the archives.2 risk of cardiovascular disease in a large cohort of UK To date, the absence of explicit validation guidelines or patients over the Anderson Framingham equation. indeed reporting guidelines for risk prediction has Discrimination and calibration statistics were better with hampered the quality and clarity of published studies. on 2 October 2021 by guest. Protected copyright. QRISK. QRISK explained 32% of the variation in men and Studies often have similar methodological problems to 37% in women, compared with 27% and 31% other types of studies: poor design, small sample size, respectively for Anderson Framingham. QRISK underpredicted risk by 13% for men and 10% for women, incomplete data, inappropriate statistical analyses, and whereas Anderson Framingham overpredicted risk by optimistic interpretation are concerns. In addition, 32% for men and 10% for women. In total, 85 010 (8%) of there is a general lack of convincing external 3 patients would be reclassified from high risk (≥20%) with validation. Anderson Framingham to low risk with QRISK, with an The latter point, external validation, is an essential observed 10 year cardiovascular disease risk of 17.5% step in any risk model development in order to evalu- (95% confidence interval 16.9% to 18.1%) for men and ate and show transportability of the model so that it 16.8% (15.7% to 18.0%) for women. The incidence rate of could be applied with confidence on a cohort other 45 cardiovascular disease events among men was 30.5 per than the derivation cohort. Box and Draper aptly 1000 person years (95% confidence interval 29.9 to 31.2) put it: “All models are wrong; the practical question 6 in high risk patients identified with QRISK and 23.7 per is how wrong do they have to be to not be useful.” 1000 person years (23.2 to 24.1) in high risk patients The act of validation is thus merely quantifying and identified with Anderson Framingham. Similarly, the judging how useful (or not) the model is in estimating incidence rate of cardiovascular disease events among the risk of developing a particular outcome. women was 26.7 per 1000 person years (25.8 to 27.7) in The interpretation of the results from studies devel- high risk patients identified with QRISK compared with oping and validating risk prediction models is too often 22.2 per 1000 person years (21.4 to 23.0) in high risk focused on classifying patients into risk groups, relying patients identified with Anderson Framingham. mainly on receiver operating characteristic curves, and Conclusions The QRISK cardiovascular disease risk neglects the accuracy of the actual risk prediction (cali- equation offers an improvement over the long established bration). Accurate prediction is crucial as any systema- Anderson Framingham equation in terms of identifying a tic overprediction would inevitably lead to a BMJ | ONLINE FIRST | bmj.com page 1 of 10 RESEARCH disproportionate number of people being targeted for for missing data. They then applied this revised treatment, affecting healthcare resources and poten- model on an external dataset (THIN) in a second vali- tially exposing patients to unnecessary treatments. dation study to assess model performance.13 BMJ: first published as 10.1136/bmj.b2584 on 7 July 2009. Downloaded from Similarly, any systematic underprediction of risk Despite the two published papers on the develop- could potentially deny patients much needed treat- ment and validation of QRISK, on impressively large ment. cohorts, showing a more than competitive perfor- QRISK, a new multifactor cardiovascular disease mance when compared with the Anderson Framing- risk prediction algorithm, was recently developed ham and ASSIGN equations,14 the National Institute and validated for use in the United Kingdom. Initial for Health and Clinical Excellence (NICE) has not results describing the derivation and internal valida- recommended QRISK and has recommended contin- tion were published in July 2007.7 QRISK includes tra- ued use of Framingham.15 In addition, despite Fra- ditional cardiovascular disease risk factors—(1) age, (2) mingham’s limitations and known shortcomings,16-19 sex, (3) systolic blood pressure, (4) smoking status, and NICE has recommended an unexplained adjustment (5) serum cholesterol: high density lipoprotein ratio— factor to adjust for family history and ethnicity20 (eth- that are incorporated in the long established Framing- nicity is not a risk factor in the version of QRISK in this ham risk equations,8 but it also includes (6) body mass paper). However, this adjusted Anderson Framingham index, (7) family history of cardiovascular disease, (8) approach has, to date, not undergone any validation or 9 social deprivation (Townsend score), and (9) the use of peer reviewed consultation in the public domain. The antihypertensive treatment. We note QRISK excludes QRISK developers have also been criticised for not patients with a pre-existing diagnosis of diabetes and making the QRISK algorithm publicly available so does not include electrocardiogram assessment of left that head-to-head comparison can be made.21 Finally, ventricular hypertrophy, both included in Framing- an unpublished report, attempting to revalidate the ham. The development of QRISK made use of elabo- QRISK model on the THIN cohort, failed to replicate rate statistical methods such as fractional poly- the results of the external validation and has further 10 nomials to model any non-linear risk relationships contributed doubts about the performance of with continuous variables and multiple imputation QRISK.22 This report has subsequently been shown 11 methods to avoid potentially biased estimates to be incorrect and misleading and based on an incor- obtained from a reduced complete-case dataset. rect and naive specification of a model, and to date it In response to a number of critiques of the original 23 has not been made available in the public domain. http://www.bmj.com/ BMJ 12 papers, the QRISK authors undertook a revision This article describes and synthesises the results to account for statin use and implemented an improved from an independent and external evaluation of approach to multiple imputation method to account QRISK commissioned by the Department of Health and compares the performance of QRISK against the Anderson Framingham equation8 currently recom- Table 1 | Characteristics of 1 072 800 patients from the THIN database. Values are numbers mended by NICE.20 In addition, we compare QRISK (percentages) of patients unless stated otherwise with a recently developed sex-specific Framingham on 2 October 2021 by guest. Protected copyright. Characteristic Men (n=529 813) Women (n=542 987) risk equation.24 This article presents additional mate- Median (interquartile range) age (years) 48 (40-57) 49 (41-59) rial on the performance of QRISK currently not pre- Mean (SD) systolic blood pressure (mm Hg) 135.6 (19.4) 132.1 (21.0) sented elsewhere. As the QRISK equation has not been Mean (SD) total serum cholesterol (mmol/l) 5.7 (1.1) 5.8 (1.2) made available in the public domain, we emphasise Mean (SD) high density lipoprotein cholesterol 1.3 (0.4) 1.6 (0.4) that we were granted full access by the QRISK authors (mmol/l) to the QRISK algorithm and accompanying documen- Mean (SD) total serum cholesterol: high density 4.5 (1.3) 3.9 (1.2) tation, enabling an accurate implementation. In addi- lipoprotein