Missing Data in Clinical Studies
The perils of the unknown: Missing data in clinical studies

Helen Bridge1 and Thomas M. Schindler2
1 AstraZeneca, Cambridge, UK
2 Boehringer Ingelheim Pharma, Biberach, Germany

Correspondence to:
Helen Bridge or Thomas M. Schindler
Medical Writing Europe
Boehringer Ingelheim Pharma
Birkendorfer Str. 65
88307 Biberach an der Riß, Germany
Thomas.Schindler@boehringer-ingelheim.com

A version of this article previously appeared in the AMWA Journal 2017;32(3):116–19, 123.

56 | March 2018 Medical Writing | Volume 27 Number 1

Abstract
The phenomenon of missing data is ubiquitous in clinical studies. Both the extent of missing data and the structure of missing data can introduce bias into study results and lead to wrong conclusions. Medical writers should be aware of the extent of missing data and should describe the methods used to deal with the issue. This article outlines some of the most commonly used statistical methods for handling missing data. The traditionally used last-observation-carried-forward (LOCF) method to fill data gaps is problematic in many ways. It is better to employ a method that reduces bias, such as multiple imputation (MI) or mixed-effects models for repeated measures (MMRM). Clinical study design can also help minimise the quantity of missing data.

[Figure 1. A depiction of missing data in clinical trials. Labels: Visit 1 to Visit 5. © TM Schindler, all rights reserved.]

You may think that this topic does not concern medical writers. You may think this is something for data managers or statisticians. Well, you may need to think again. Every study you write about in a publication or a study report will have dealt with the problem of missing data. Moreover, the way this problem was handled by those conducting the study can have far-reaching consequences. The complexity of clinical studies means that everything is related to everything else, so the issue of missing data is linked to many other aspects, from study design to patient retention, data analysis, and the conclusions that can be drawn. Because of this, every scientific report about a clinical study must take note of the extent to which data are missing and how this unfortunate but inevitable fact has been handled.

Why are data missing in clinical studies?
In an ideal world and an ideal clinical trial, all patients would come to all visits, all patients would take their medication each day at the right time, and all patients would undergo all procedures as planned. No study investigators or patients would move or decide to leave the study, and nobody would have an accident, fall ill, or die during the study. Only in such a scenario could the medical writer be absolved of having to talk about missing data. But as seen from this non-exhaustive list, in the real world things are never perfect, and the issue of missing data will invariably arise.

What are the issues?
We cannot assume that we will obtain all the data for all patients in a clinical study. This, however, may or may not be a problem, depending on the quantity and nature of the missing data. There can be no doubt about it: the more data are missing, the shakier the results and conclusions become. It is very difficult to say when a critical limit of missing data has been reached because the size of the study, the indication being studied, the magnitude of difference between treatments, and the frequency and nature of the assessments must all be considered. However, if the trial is testing for a difference in outcome events (e.g. heart attacks) then even a small number of missing data may be important. If outcome data are missing for a sizable proportion of the patients, the whole trial may become invalid.

A second issue with missing data arises when the pattern of missing data differs between the treatment groups. This is likely to introduce bias in the interpretation of results. Data can be missing for various reasons. On the one hand, it could be pure chance that values are missing. For example, a patient misses a study visit because her car broke down and she could not get to the study site. Or a patient decides to leave the study because he needs to move for his wife to take up a new job in a different region. On the other hand, the fact that data are missing could be related to the outcome that is being measured and/or the study treatment. For example, we might have a much higher dropout rate in one treatment group than in the other. This may happen for many reasons, e.g. because of adverse events, lack of efficacy, or unknown reasons. Often it is difficult to know whether data are missing by chance or because of the treatment. Consider a drug that may cause dizziness and a patient who has a traffic accident on her way to the study clinic and ends up in hospital. Is this a chance event or related to the treatment?

Statisticians have developed a theoretical framework to categorise the reasons for missing data. In brief, they distinguish data that are “missing completely at random (MCAR)” from data that are “missing at random (MAR)” and data that are “missing not at random (MNAR)”. As the elaboration of these concepts is beyond the scope of this short paper, please consult the reading list.

In a randomised trial comparing two treatments, […] problem, provided they are rare. We would expect chance events to occur with similar frequency in both treatment groups, and therefore no bias is being introduced. However, “missingness” related to the treatment or outcome variable leaves us on very difficult ground.

Suppose we have a study comparing a new wonderdrug (WD) and placebo. WD may cause adverse events that lead to dropout of patients, while patients in the placebo group carry on. Conversely, WD may have good efficacy and no tolerability issues, so the patients taking it remain in the study, while patients in the placebo group drop out because they see no improvement. In these scenarios, we risk underestimating or overestimating the size of the treatment effect.

Differential withdrawal between treatment groups will result in a serious conceptual problem. The goal of randomisation is that the two treatment groups will have similar characteristics at the start of the study. If many patients in one group but not in the other withdraw from the study, the two groups may no longer be comparable at the end. If a sizeable proportion of patients in the WD group drops out because of tolerability issues, we will not only have more missing data in this group, we will also have a different group of people at study end. By exposing patients to WD for some weeks, we unintentionally “select” those patients who are able to tolerate the treatment. Hence, at study end we arrive at a comparison of the placebo group with all its initial demographic and disease characteristics and a modified WD group that consists only of those patients who have tolerated the treatment. Their demographic and baseline disease characteristics may be […] would make it very difficult to draw any conclusions about the efficacy or safety of WD.

When reporting clinical studies, medical writers need to be alert to signs that missing data are not due to chance and therefore have the potential to cause bias. Signs to watch out for include differences between treatment groups in the proportions of patients with missing values or the reasons for withdrawals. Clusterings of withdrawals or missed visits around certain points in time should also raise suspicion. A starting point could be the tables detailing the disposition of patients. If you detect any issues, it is advisable to ask the statistician to provide further information on the missing data.

Now let’s look at an example of what missing data can look like for individual patients. Let’s assume that we are looking at a trial in patients with type 2 diabetes. We want to find out what effect our new drug has on the long-term marker for blood sugar levels, haemoglobin A1c (HbA1c). We are looking at the change from baseline to study end as our primary endpoint for efficacy. Table 1 depicts the data of five patients.

In this example, we have all values only for patient 1, who has completed all visits. Thus only for her can we easily calculate the change from baseline. Data analysis will be more complicated for the other four patients because they have data missing for some visits. Would it therefore be a good idea to ignore the data from patients 2 to 5, i.e. concentrate the analysis only on “completers”? No, it would not. Looking at the table does not tell us the reasons why the data are missing, and this is a common situation in clinical trials. We may know the broad reasons why some patients withdrew (e.g. “adverse event” or “lack of efficacy”) and the reason why a patient died, but patients who are lost to follow-up or who missed some visits may not have detailed reasons recorded.
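
To make the risk of a completers-only analysis more concrete, here is a minimal Python sketch of the wonderdrug scenario. It is an illustration under invented assumptions, not an analysis from any real trial: the arm means, the standard deviation, the dropout probabilities, and the function names (simulate_trial, summarise) are all hypothetical. We assume that placebo patients who see no improvement in HbA1c withdraw far more often than everyone else, and then compare the treatment difference in all randomised patients (which we could never observe in practice) with the difference among completers.

```python
import random
import statistics


def simulate_trial(n_per_arm=2000, seed=1):
    """Simulate HbA1c change from baseline (percentage points) in two arms.

    Every number here is invented purely for illustration.
    """
    rng = random.Random(seed)
    patients = []
    for arm, mean_change in (("WD", -1.0), ("placebo", -0.2)):
        for _ in range(n_per_arm):
            change = rng.gauss(mean_change, 1.0)
            # Assumed dropout rule: placebo patients with no improvement
            # (change >= 0) withdraw with probability 0.6 ("lack of efficacy");
            # all other patients withdraw by chance with probability 0.05.
            if arm == "placebo" and change >= 0:
                p_drop = 0.6
            else:
                p_drop = 0.05
            completed = rng.random() >= p_drop
            patients.append({"arm": arm, "change": change, "completed": completed})
    return patients


def summarise(patients):
    """Per arm: fraction with missing endpoint, mean change overall and in completers."""
    out = {}
    for arm in ("WD", "placebo"):
        rows = [p for p in patients if p["arm"] == arm]
        completers = [p for p in rows if p["completed"]]
        out[arm] = {
            "missing_fraction": 1 - len(completers) / len(rows),
            "mean_all": statistics.mean(p["change"] for p in rows),
            "mean_completers": statistics.mean(p["change"] for p in completers),
        }
    return out


summary = summarise(simulate_trial())
# Treatment difference (WD minus placebo) in all randomised patients
true_effect = summary["WD"]["mean_all"] - summary["placebo"]["mean_all"]
# Treatment difference when only completers are analysed
completer_effect = (
    summary["WD"]["mean_completers"] - summary["placebo"]["mean_completers"]
)

for arm, stats in summary.items():
    print(arm, {key: round(value, 2) for key, value in stats.items()})
print("effect, all randomised patients:", round(true_effect, 2))
print("effect, completers only:", round(completer_effect, 2))
```

Under these assumptions, two of the warning signs described above show up together: the proportion of patients with a missing endpoint is clearly higher in the placebo group, and the completers-only comparison understates the benefit of WD because the placebo patients who remain are mostly those who happened to improve. Reversing the dropout rule (for example, letting WD patients withdraw because of adverse events) would shift the bias in a different way, which is why the reasons for missingness matter as much as its extent.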