Machine Learning–Based Preoperative Predictive Analytics for Lumbar Spinal Stenosis
Total Page:16
File Type:pdf, Size:1020Kb
NEUROSURGICAL FOCUS Neurosurg Focus 46 (5):E5, 2019 Machine learning–based preoperative predictive analytics for lumbar spinal stenosis Alessandro Siccoli, BMed,1 Marlies P. de Wispelaere, PDEng,2 Marc L. Schröder, MD, PhD,1 and Victor E. Staartjes, BMed1,3,4 1Department of Neurosurgery, Bergman Clinics, Amsterdam; 2Department of Clinical Informatics, Bergman Clinics, Amsterdam; 3Amsterdam UMC, Vrije Universiteit Amsterdam, Neurosurgery, Amsterdam Movement Sciences, Amsterdam, The Netherlands; and 4Department of Neurosurgery, Clinical Neuroscience Centre, University Hospital Zurich, University of Zurich, Switzerland OBJECTIVE Patient-reported outcome measures (PROMs) following decompression surgery for lumbar spinal stenosis (LSS) demonstrate considerable heterogeneity. Individualized prediction tools can provide valuable insights for shared decision-making. The authors aim to evaluate the feasibility of predicting short- and long-term PROMs, reoperations, and perioperative parameters by machine learning (ML) methods. METHODS Data were derived from a prospective registry. All patients had undergone single- or multilevel mini-open facet-sparing decompression for LSS. The prediction models were trained using various ML-based algorithms to predict the endpoints of interest. Models were selected by area under the receiver operating characteristic curve (AUC). The endpoints were dichotomized by minimum clinically important difference (MCID) and included 6-week and 12-month numeric rating scales for back pain (NRS-BP) and leg pain (NRS-LP) severity and the Oswestry Disability Index (ODI), as well as prolonged surgery (> 45 minutes), extended length of hospital stay (> 28 hours), and reoperations. RESULTS A total of 635 patients were included. The average age was 62 ± 10 years, and 333 patients (52%) were male. At 6 weeks, MCID was seen in 63%, 76%, and 61% of patients for ODI, NRS-LP, and NRS-BP, respectively. At internal validation, the models predicted MCID in these variables with accuracies of 69%, 76%, and 85%, and with AUCs of 0.75, 0.79, and 0.92. At 12 months, 66%, 63%, and 51% of patients reported MCID; the observed accuracies were 62%, 74%, and 66%, with AUCs of 0.68, 0.72, and 0.79. Reoperations occurred in 60 patients (9.5%), of which 27 (4.3%) occurred at the index level. Overall and index-level reoperations were predicted with 69% and 63% accuracy, respec- tively, and with AUCs of 0.66 and 0.61. In 15%, a length of surgery greater than 45 minutes was observed and predicted with 78% accuracy and AUC of 0.54. Only 15% of patients were admitted to the hospital for longer than 28 hours. The developed ML-based model enabled prediction of extended hospital stay with an accuracy of 77% and AUC of 0.58. CONCLUSIONS Preoperative prediction of a range of clinically relevant endpoints in decompression surgery for LSS using ML is feasible, and may enable enhanced informed patient consent and personalized shared decision-making. Access to individualized preoperative predictive analytics for outcome and treatment risks may represent a further step in the evolution of surgical care for patients with LSS. https://thejns.org/doi/abs/10.3171/2019.2.FOCUS18723 KEYWORDS machine learning; outcome prediction; lumbar spinal stenosis; patient-reported outcome; decompression; laminectomy ITH mounting life expectancy, the incidence of patients, and many improvements in surgical techniques symptomatic lumbar spinal stenosis (LSS) has have been made over the past decades.5,21,22 However, indi- been steadily increasing.27 Although conserva- cations for and outcomes after lumbar decompression sur- Wtive treatment is recommended and effective in the major- gery remain heterogeneous, particularly in those patients ity of cases, many studies suggest an advantage of decom- who present with concomitant low-grade spondylolisthe- pression surgery over nonsurgical treatments for selected sis.12,14,15,21–24 ABBREVIATIONS ASA = American Society of Anesthesiologists; AUC = area under the receiver operating characteristic curve; GLM = generalized linear model; LSS = lumbar spinal stenosis; MCID = minimum clinically important difference; ML = machine learning; NPV = negative predictive value; NRS = numeric rating scale; NRS-BP = NRS for back pain; NRS-LP = NRS for leg pain; ODI = Oswestry Disability Index; PPV = positive predictive value; PROMs = patient-reported outcome measures; XGBoost = extreme gradient boosting. SUBMITTED December 28, 2018. ACCEPTED February 14, 2019. INCLUDE WHEN CITING DOI: 10.3171/2019.2.FOCUS18723. ©AANS 2019, except where prohibited by US copyright law Neurosurg Focus Volume 46 • May 2019 1 Unauthenticated | Downloaded 09/23/21 10:34 PM UTC Siccoli et al. The combination of hard-to-predict outcomes and a approach, a mini-open retractor was inserted. The inter- strong disease-linked psychological component requires spinous ligament was cut, and the spinous process was in-depth preoperative counseling.4 A range of factors partially resected.36 A bilateral partial hemilaminectomy influencing the outcome of surgery for LSS have been was then performed. The interspinous ligament was de- identified.3,5–7, 11,13,16,19,33 However, it is not always feasible liberately resected to contralaterally undercut the hyper- to integrate many single patient factors into one commu- trophic ligamentum flavum and osteophytes. Whenever nicable risk for poor outcome. Prediction models repre- possible, the facet joints were left untouched to preserve sent an interface between these important and complex biomechanical integrity. Only if necessary, their hypertro- factors, and produce an output that can be benchmarked phic medial part was partially resected. The lateral recess- and is interpretable and clinically valuable for both pa- es and foramina were further opened until the nerve roots tients and surgeons.38 Recently, interest has shifted toward appeared to be fully released. Discectomy was performed machine learning (ML) algorithms in predictive model- only in cases of significant nerve root compression by a ing.10,20,29 While some ML algorithms have been around bulging disc at the index level. for decades, they have only recently gained major interest for predictive analytics in medicine.31,32 This increase in Data Collection the use of ML is attributable to the advent of the “big data” Clinical and radiological baseline data were obtained era, but also to the development of new algorithms and 25 at the first outpatient visit to the treating surgeon. Patients steadily improving computational power. Various ML underwent MRI and a full clinical workup. The 15 col- techniques such as gradient-boosting machines, decision lected variables consisted of baseline patient-reported out- trees, random forests, and neural networks have already come measures (PROMs), as well as sex, age, height (cm), proven an increased predictive ability compared to more weight (kg), BMI, smoking status (active/inactive), regular traditional statistical modeling techniques in neurosurgi- 1,31,34,37 alcohol intake (yes/no), American Society of Anesthe- cal patient care. siologists (ASA) grade, prior surgery at the index level, The hope for these new techniques is that they will single- or multilevel decompression, index levels, presence enable surgeons to accurately predict adverse events and of concomitant spondylolisthesis, and the predominant outcomes that were previously near unpredictable, such presenting symptom (neurogenic claudication or [mono] as reoperations or a positive outcome, in order to identify radiculopathy). those patients who will benefit most from surgical care. In addition, such tools may aid patients in making the best-in- Outcome Measures formed decision for themselves. Therefore, the purpose of this study was to investigate whether comprehensive ML- Patient-Reported Outcome based preoperative predictive analytics are feasible and ro- For PROM measurement, patients completed a stan- bust in predicting a range of clinically relevant endpoints in dardized questionnaire including numeric rating scales patients undergoing decompression surgery for LSS. (NRSs) for back pain (NRS-BP) and leg pain (NRS-LP) severity, ranging from 0 to 10, and a validated Dutch ver- Methods sion of the Oswestry Disability Index (ODI) to capture functional disability, ranging from 0 to 100, with higher Patient Population values representing increasing severity. All PROM data In a retrospective analysis based on data from a pro- were collected using a validated web-based PROM assess- spective registry, we identified all patients who had under- ment tool at baseline, 6 weeks, and 12 months after sur- gone decompression surgery for LSS between November gery. 30 We defined clinical success as achievement of the 2013 and November 2018 at a single Dutch spine center minimum clinically important difference (MCID) thresh- (Bergman Clinics). All patients underwent single- or mul- old set by Ostelo et al. at the 6-week or 12-month postop- tilevel decompression by the senior author (M.L.S.). Adult erative follow-up.26 Thus, an improvement from baseline patients with ≥ 80% baseline data completion were includ- of ≥ 30% represented clinical success in functional dis- ed in this study. In patients with concomitant low-grade ability (ODI) or pain severity scores (NRS). spondylolisthesis, the decision of whether to add a fusion procedure to decompression alone was based upon a vali- Reoperation dated decision-making protocol.36 This study was devised Any reoperations were tracked. The analysis