Predictive Models of Medication Non-Adherence Risks of Patients
Total Page:16
File Type:pdf, Size:1020Kb
Clinical Care/Education/Nutrition BMJ Open Diab Res Care: first published as 10.1136/bmjdrc-2019-001055 on 9 March 2020. Downloaded from Open access Original research Predictive models of medication non- adherence risks of patients with T2D based on multiple machine learning algorithms Xing- Wei Wu ,1,2 Heng- Bo Yang,3 Rong Yuan,4 En- Wu Long,1,2 Rong- Sheng Tong1,2 To cite: Wu X- W, Yang H- B, Yuan R, et al. Predictive models of medication non- adherence risks of patients ABSTRACT with T2D based on multiple Significance of this study machine learning algorithms. Objective Medication adherence plays a key role in type 2 diabetes (T2D) care. Identifying patients with high risks BMJ Open Diab Res Care What is already known about this subject? 2020;8:e001055. doi:10.1136/ of non-compliance helps individualized management, bmjdrc-2019-001055 especially for China, where medical resources are ► Medication adherence is a key to diabetes care. relatively insufficient. However, models with good ► Proper care can help improve medication adherence. predictive capabilities have not been studied. This study Additional material is What are the new findings? ► aims to assess multiple machine learning algorithms and published online only. To view ► The newly developed model has a good performance please visit the journal online screen out a model that can be used to predict patients’ to predict the risk of non-medica tion adherence in (http:// dx. doi. org/ 10. 1136/ non- adherence risks. diabetics. bmjdrc- 2019- 001055). Methods A real-world registration study was conducted at Sichuan Provincial People’s Hospital from 1 April How might these results change the focus of 2018 to 30 March 2019. Data of patients with T2D on research or clinical practice? This article has been displayed demographics, disease and treatment, diet and exercise, ► This model can be used to filter patients with high in the conference of ISPOR mental status, and treatment adherence were obtained by risk of non-medica tion adherence and precise edu- 2019 as a poster presentation, face- to- face questionnaires. The medication possession cational interventions can be conducted. the 2019 Chinese Conference ratio was used to evaluate patients’ medication adherence on Pharmacoepidemiology status. Fourteen machine learning algorithms were and the 9th International Graduate Student Symposium applied for modeling, including Bayesian network, Neural to improve life quality and reduce the prob- on Pharmaceutical Sciences as Net, support vector machine, and so on, and balanced ability of related disability and death.2–4 oral presentations. In addition, sampling, data imputation, binning, and methods of Studies have confirmed that good medication feature selection were evaluated by the area under the http://drc.bmj.com/ this article has won the Second adherence is important for patients with T2D Prize of Thesis in the 9th receiver operating characteristic curve (AUC). We use International Graduate Student two- way cross- validation to ensure the accuracy of model to improve glycemic control, avoid compli- Symposium on Pharmaceutical evaluation, and we performed a posteriori test on the cations, and reduce overall health expen- 5 6 Sciences, and the Third Prize of sample size based on the trend of AUC as the sample size diture. Interventions such as integrative Thesis in the Annual Congress increase. health coaching can significantly improve of Chinese Clinical Pharmacy Results A total of 401 patients out of 630 candidates medication adherence.7–9 Considering the and the 15th Chinese Clinical were investigated, of which 85 were evaluated as poor on September 27, 2021 by guest. Protected copyright. Pharmacist Forum. large crowd and complex features of patients adherence (21.20%). A total of 16 variables were selected with T2D, identifying high-risk groups with Received 18 December 2019 as potential variables for modeling, and 300 models were poor medication adherence and carrying out built based on 30 machine learning algorithms. Among Revised 8 January 2020 precise educational interventions are feasible Accepted 16 January 2020 these algorithms, the AUC of the best capable one was 0.866±0.082. Imputing, oversampling and larger sample measures to improve the overall disease size will help improve predictive ability. control. Conclusions An accurate and sensitive adherence Studies found that numerous factors may © Author(s) (or their prediction model based on real-world registration data be associated with medication adherence employer(s)) 2020. Re- use was established after evaluating data filling, balanced permitted under CC BY. in patients with T2D, including the dura- Published by BMJ. sampling, and so on, which may provide a technical tool tion of the disease, mental state, anxiety, for individualized diabetes care. For numbered affiliations see depression, irritability, smoking, cost, and so 5 10 11 end of article. on. These factors vary with study designs and regions. That may be resulted from Correspondence to INTRODUCTION the differences in the selection of variables, Dr Rong- Sheng Tong; tongrs@ 126. com and En- Wu Type 2 diabetes (T2D), one of the most popular patients’ income levels, education levels, 1 Long; disorders all over the world, is a chronic cultural characteristics, living habits, and so 1043743338@ qq. com condition that requires long-term treatment on. Therefore, it is necessary to explore the BMJ Open Diab Res Care 2020;8:e001055. doi:10.1136/bmjdrc-2019-001055 1 BMJ Open Diab Res Care: first published as 10.1136/bmjdrc-2019-001055 on 9 March 2020. Downloaded from Clinical Care/Education/Nutrition main influencing factors of local patients’ medication were left. The adherence status, the medication possession compliance and to establish a model that can accurately ratio (the proportion of medication’s available days, higher predict therapeutic compliance in specific regions.12 13 than 80% is regarded as good medication compliance16 17), Despite several compliance-related predictive models that was determined as the target variable. have been reported recently in patients with tubercu- losis14 and heart failure,15 studies on patients with T2D Data blindness based on multivariate machine learning algorithms have All variable names were encoded to X1–X44 to achieve not been retrieved. statistical blindness in this study. Data analysis was In this study, a rigorous and comprehensive ques- performed using the encoded variable names. The vari- tionnaire survey was conducted and dozens of machine ables were unblinded after the model evaluation processes. learning- based models were built and assessed. In addi- Data cleaning tion, we also examined the impacts of data preprocessing, modeling methods and different sample sizes on predic- Variables with more than 50% missing or a certain cate- tion performance. gory’s proportion greater than 80% were excluded. The maximum likelihood ratio method was used to assess the correlations between the input variables and the target RESEARCH DESIGN AND METHODS variable. Variables with p value >0.1 were considered unim- Study design, study area, and participants portant and were excluded after the likelihood ratio test. A survey was conducted in the outpatient clinic of Data filling was performed using mean value (for numer- Sichuan Provincial People’s Hospital from April 2018 to ical variables), median (for ordinal variables) or mode (for March 2019. The hospital mainly serves patients from nominal variables).18 Outlier values were modified as the Sichuan Province, a populous province in southwestern maximum or minimum of normalized values. China. A total of 630 patients were approached, and 401 Because of the proportional imbalance between completed the survey. Researchers conducted a face- to- good and non- adherence patients, sampling methods, face questionnaire survey and filled out questionnaires including oversampling (four times for patients with according to the answers from the patients who partici- poor compliance) and undersampling (50% for patients pated in the survey. with good adherence), were taken to make up the Patients with T2D were selected according to the shortage caused by the imbalanced sample size between criteria: (1) examined HbA1c on the day of the question- the different levels of the target variable. Unbalanced naire, and (2) willing to take part in the survey and to data were analyzed simultaneously to evaluate the risk of provide information to the investigators. Patients were overfitting on account of balancing dispose. eliminated if he or she (1) was a non-T2D patient, (2) did not receive hypoglycemic agency treatment, (3) less Data partition than 18 years old, and (4) with a short life expectancy. There were two data partitioning processes in this study for two-way cross-validation. The first was conducted before Data collection data cleaning. In this process, the original data were http://drc.bmj.com/ The data in this study were collected from electronic randomly divided into two subsets (named set 1 and set medical records (EMR) and face-to- face questionnaires. 2) in a ratio of 8:2, which would be used for model estab- Clinical laboratory results were collected according to lishment and verification, respectively. During the second EMRs. The results of the previous examination and the partition, set 1 was divided into a training set and a testing duration from the last test were recalled by patients and, set by 7:3 after data cleaning. The training set would be if necessary, confirmed