Development and External Validation of a COVID-19 Mortality Risk Prediction Algorithm: a Multicentre Retrospective Cohort Study
Total Page:16
File Type:pdf, Size:1020Kb
Open access Original research BMJ Open: first published as 10.1136/bmjopen-2020-044028 on 24 December 2020. Downloaded from Development and external validation of a COVID-19 mortality risk prediction algorithm: a multicentre retrospective cohort study Jin Mei,1 Weihua Hu,2 Qijian Chen,3 Chang Li,4 Zaishu Chen,5 Yanjie Fan,6 Shuwei Tian,6 Zhuheng Zhang,6 Bin Li,6 Qifa Ye,7 Jiang Yue ,6,8 Qiao- Li Wang9 To cite: Mei J, Hu W, ABSTRACT Strengths and limitations of this study Chen Q, et al. Development Objective This study aimed to develop and externally and external validation of validate a COVID-19 mortality risk prediction algorithm. ► We involved all patients with COVID-19 in the de- a COVID-19 mortality risk Design Retrospective cohort study. prediction algorithm: a fined hospitals during the study period and followed Setting Five designated tertiary hospitals for COVID-19 in multicentre retrospective them up for the coming 60 days at hospitals, which Hubei province, China. cohort study. BMJ Open reduced the chance of selection or detection bias. Participants We routinely collected medical data of 2020;10:e044028. doi:10.1136/ An independent population was used to externally 1364 confirmed adult patients with COVID-19 between 8 bmjopen-2020-044028 validate the prediction model. January and 19 March 2020. Among them, 1088 patients ► A cross- validation strategy was used to assess ► Prepublication history and from two designated hospitals in Wuhan were used to additional material for this model performance. Discrimination and calibration develop the prognostic model, and 276 patients from three paper are available online. To evaluation of the derivation and external validation hospitals outside Wuhan were used for external validation. view these files, please visit cohort indicated a good performance of the model. All patients were followed up for a maximal of 60 days the journal online (http:// dx. doi. ► As the prediction algorithm was generated based on after the diagnosis of COVID-19. org/ 10. 1136/ bmjopen- 2020- the COVID-19 cases from the Chinese population, Methods The model discrimination was assessed by the 044028). model validation in other populations might be war- area under the receiver operating characteristic curve ranted before its direct application. Q- LW and JM contributed (AUC) and Somers’ D test, and calibration was examined Keywords: COVID-19; prediction; survival, risk factor; equally. by the calibration plot. Decision curve analysis was prognosis conducted. http://bmjopen.bmj.com/ Received 23 August 2020 Revised 20 November 2020 Main outcome measures The primary outcome was Accepted 23 November 2020 all- cause mortality within 60 days after the diagnosis of COVID-19. pile up, making it a great challenge for the Results The full model included seven predictors of healthcare resources to meet the increased age, respiratory failure, white cell count, lymphocytes, demand for hospital beds and medical equip- platelets, D- dimer and lactate dehydrogenase. The simple model contained five indicators of age, respiratory failure, ment (eg, ventilators). A prediction model coronary heart disease, renal failure and heart failure. After for prognosis is needed for the risk stratifica- cross- validation, the AUC statistics based on derivation tion of confirmed cases. Early identification on September 25, 2021 by guest. Protected copyright. cohort were 0.96 (95% CI, 0.96 to 0.97) for the full model and intervention of patients with COVID-19 and 0.92 (95% CI, 0.89 to 0.95) for the simple model. can reduce mortality and morbidity as well The AUC statistics based on the external validation cohort as mitigating the burden on the healthcare were 0.97 (95% CI, 0.96 to 0.98) for the full model and system. There are two assessment tools to 0.88 (95% CI, 0.80 to 0.96) for the simple model. Good evaluate community- acquired pneumonia calibration accuracy of these two models was found in the © Author(s) (or their including CURB-65 and Pneumonia Severity derivation and validation cohort. employer(s)) 2020. Re- use Index for adults. However, these tools were Conclusion The prediction models showed good model permitted under CC BY-NC. No not specific for COVID-19 and they do not commercial re- use. See rights performance in identifying patients with COVID-19 with a include known risk factors for COVID-19- and permissions. Published by high risk of death in 60 days. It may be useful for acute BMJ. risk classification. related prognosis. Previous studies have For numbered affiliations see well documented the association between end of article. laboratory indicators or comorbidities and INTRODUCTION COVID-19 severity. The reported risk factors Correspondence to The pandemic of COVID-19 has spread associated with the poor prognosis include Dr Jiang Yue; yuejiang@ whu. edu. cn and rapidly across the world since December older age, cardiovascular metabolic diseases, Dr Qiao- Li Wang; 2019. The number of confirmed cases is respiratory disease and increased blood qiaoli. wang@ ki. se continuing to rise and the related deaths lactate dehydrogenase level.1–5 However, Mei J, et al. BMJ Open 2020;10:e044028. doi:10.1136/bmjopen-2020-044028 1 Open access BMJ Open: first published as 10.1136/bmjopen-2020-044028 on 24 December 2020. Downloaded from there are currently few models to predict mortality risk count, neutrophils and lymphocytes) were collected at among patients with COVID-19, and these models had a the time point for the patients’ first hospital admission. high risk of selection and detection bias as well as model overfitting.6 Model calibration and external validation of Derivation of the models risk prediction models are also lacking in the models.7–9 For both the derivation cohort and external validation Therefore, this study aimed to develop a valid and easy- cohort, multiple imputations based on the multivariate to- use risk prediction algorithm that estimates the risk of normal distribution were conducted for variables with short- term mortality and to externally validate the model more than 5% missing rate.15–18 Ten imputations were in an independent population. conducted for missing variables. We identified potential auxiliary variables that had absolute correlations greater than 0.4 with variables with missing data.19 The conver- gence of multiple imputation models performed well, METHODS and it was assessed by trace plots and autocorrelation Study design, participants and data collection plots. To explore the risk pattern of short- term mortality We performed a multicentre retrospective cohort study of among patients with COVID-19, univariate logistic regres- 1364 confirmed cases from designated tertiary hospitals sion was conducted to estimate the odds ratio (OR) with in Hubei province, China. All confirmed 1088 cases of a 95% CI for each of the 51 variables. COVID-19 in the derivation cohort were from two desig- We initially included all predictor variables in a multi- nated tertiary hospitals in Wuhan (the Fifth Hospital of variate logistic regression model. Then the stepwise Wuhan and Hubei No. 3 People’s Hospital of Jianghan selection approach was applied for prediction selection, University) during the period of 8 January 2020–19 March with a predefined nominal significance level of 0.05 for 2020. All diagnosed patients with COVID-19 (n=276) from both model entry and retention.20 21 To avoid substantial three designated tertiary hospitals outside Wuhan (Jiayu improvement of the goodness of model fit in the likeli- People’s Hospital, Jingzhou First People’s Hospital and hood ratio test by the omitted predictors, the excluded People’s Hospital of Nanzhang County) were included as predictors from the first step were later re-entered into an independent validation cohort. the model and re- evaluated one by one. Age was included All the patients were diagnosed by the confirmatory without any evaluation as older age has been reported to testing for COVID-19, a real-time reverse transcrip- be strongly associated with death in patients with COVID- tion- PCR assay with nasal and pharyngeal swab specimens 19.2 22 23 Interaction terms of predictors were tested and 10 according to the WHO interim guidance. The virus added to the model. The risk equation for predicting the detection was repeated two times for each patient. The log odds of short- term mortality after COVID-19 infection patients were followed up for maximal 60 days after the was computed using the estimated β estimates multiplied diagnosis. We extracted the medical records and collected by the corresponding selected predictors, along with the http://bmjopen.bmj.com/ the information using a standardised case report form. average intercept. The predicted log odds of mortality Data collection included demographic factors (eg, age (marked as µ) from the derivative model were further at diagnosis and sex), medical history (eg, COVID-19 used for computing the predicted absolute risk of short- diagnosis date, death or discharge status and comorbid- term mortality with the algorithm of predicted risk=1/ ities at diagnosis), symptoms and vital signs at admission, (1+e−µ). and laboratory indicators (eg, C- reactive protein and For a quick classification of patients with COVID-19 D- dimer) for each participant. Inclusion criteria were with a high risk of short-term death, we also developed confirmed COVID-19 cases aged ≥20 years old during the a simple model excluding laboratory tests but including on September 25, 2021 by guest. Protected copyright. study period. comorbidities which had been previously reported to be associated with the poor prognosis among patients with Outcomes COVID-19. The simple model was developed without The primary outcome was all-cause mortality, using the any predictor evaluation because all included predic- date of death recorded on the medical records. Patients tors have been previously reported to be risk factors for were followed up for a maximum of 60 days until the mortality among patients with COVID-19.2 12 13 22–26 The occurrence of death or discharge.