
Paper PO06

Randomization in Studies

David Shen, WCI, Inc. Zaizai Lu, AstraZeneca Pharmaceuticals

ABSTRACT

Randomization is of central importance in clinical trials. It prevents selection bias and insures against accidental bias. It produces comparable groups and eliminates the source of bias in treatment assignments. Finally, it permits the use of probability theory to express the likelihood of chance as a source for the difference between outcomes. This paper discusses four common randomization methods. SAS implementation of randomization is provided with the RANUNI and RANNOR functions, PROC SURVEYSELECT and PROC PLAN.

INTRODUCTION

A good clinical trial minimizes variability of the evaluation and provides an unbiased evaluation of the intervention by avoiding confounding from other factors. Randomization ensures that each patient has an equal chance of receiving any of the treatments under study, and it generates comparable intervention groups that are alike in all important aspects except for the intervention each group receives. It also provides a basis for the statistical methods used in analyzing the data.

WHY RANDOMIZATION

The basic benefits of randomization include:

1. Eliminates selection bias.
2. Balances arms with respect to prognostic variables (known and unknown).
3. Forms the basis for statistical tests: an assumption-free statistical test of the equality of treatments.

In general, a randomized trial is an essential tool for testing the efficacy of a treatment.

CRITERIA FOR RANDOMIZATION

1. Unpredictability
• Each participant has the same chance of receiving any of the interventions.
• Allocation is carried out using a chance mechanism so that neither the participant nor the investigator will know in advance which intervention will be assigned.

2. Balance
• Treatment groups are of similar size and constitution; groups are alike in all important aspects and differ only in the intervention each group receives.

3. Simplicity
• Easy for investigators and staff to implement.

METHODS OF RANDOMIZATION

The common types of randomization include (1) simple, (2) block, (3) stratified and (4) unequal randomization. Some other methods such as biased coin, minimization and response-adaptive methods may be applied for specific purposes.

1. Simple Randomization

This method is equivalent to tossing a coin for each subject who enters a trial, such as Heads = Active, Tails = Placebo. In practice a random number generator is generally used. The method is simple and easy to implement, and treatment assignment is completely unpredictable. However, treatment assignment can become imbalanced, especially in smaller trials, and imbalanced randomization reduces statistical power. In a trial of 10 participants, the variance of the treatment effect estimate for a 5-5 split relative to a 7-3 split is (1/5+1/5)/(1/7+1/3) = .84, so a 7-3 split is only 84% as efficient as a 5-5 split. Even if treatment is balanced at the end of a trial, it may not be balanced at some time during the trial. For example, the trial may be balanced at the end with 100 participants, but the first 10 assignments might be AAAATATATA. If the trial is monitored while it is running, we would like to have balance in the number of subjects on each treatment over time.
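The 84% figure can be traced to the variance of the difference between two group means. The sketch below assumes a common outcome variance \sigma^2 in both groups, which the text does not state explicitly; n_A and n_B are the numbers of subjects on each treatment.

```latex
\[
\operatorname{Var}(\bar{x}_A - \bar{x}_B)
  = \sigma^2\left(\frac{1}{n_A} + \frac{1}{n_B}\right),
\qquad
\frac{\tfrac{1}{5} + \tfrac{1}{5}}{\tfrac{1}{7} + \tfrac{1}{3}}
  = \frac{0.400}{0.476} \approx 0.84 .
\]
```

Under this assumption the 5-5 split gives the smallest variance possible with 10 subjects, and any departure from it inflates the variance by the inverse of this ratio.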

2. Block Randomization

Simple randomization does not guarantee balance during the trial. In particular, if patient characteristics change with time (e.g. early patients are sicker than later ones), early imbalances cannot be corrected. Block randomization is often used to fix this issue. The basic idea is to divide potential patients into m blocks of size 2n and randomize within each block so that n patients are allocated to A and n to B; the blocks are then chosen randomly. This method ensures equal treatment allocation within each block, provided the complete block is used.

Example: two treatments, A and B, and a block size of 2 x 2 = 4. The possible treatment allocations within each block are:

(1) AABB (2) BBAA (3) ABAB (4) BABA (5) ABBA (6) BAAB

Block size depends on the number of treatments: it should be short enough to prevent imbalance and long enough to prevent guessing of the allocation. The block size should be at least 2x the number of treatments (ref ICH E9). The block size is not stated in the protocol, so the clinical staff and investigators are blind to it. If treatment is not masked, as in open-label trials, the sequence becomes somewhat predictable (e.g. with 2n = 4):

B A B ?   The next assignment must be A.
A A ? ?   The next two assignments must be B B.

This could lead to selection bias. Two ways to avoid it are (1) do not reveal the blocking mechanism and (2) use random block sizes. If treatment is double blinded, selection bias is not likely. Note that if only one block is requested, the method produces a single sequence of random assignment, i.e. simple randomization.
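The idea of random block sizes mentioned above can be sketched in a DATA step. This is only an illustration under stated assumptions: the seed value, the choice between block sizes 4 and 6, the target of at least 24 subjects, and the data set names BLOCKS and SCHEDULE are all hypothetical and do not come from the paper.

```sas
/* Permuted blocks with randomly chosen block sizes (4 or 6).      */
/* Seed, block sizes and the target of 24 subjects are assumptions. */
data BLOCKS;
  seed = 2468;
  block = 0;
  total = 0;
  do while (total < 24);
    block = block + 1;
    /* choose the size of this block at random */
    if ranuni(seed) < 0.5 then size = 4;
    else size = 6;
    do pos = 1 to size;
      /* half of each block goes to A, the other half to B */
      if pos <= size / 2 then group = 'A';
      else group = 'B';
      r = ranuni(seed);   /* random key used to shuffle within the block */
      output;
    end;
    total = total + size;
  end;
  keep block size group r;
run;

/* shuffle the treatments within each block */
proc sort data=BLOCKS;
  by block r;
run;

/* number the subjects in their final order */
data SCHEDULE;
  set BLOCKS;
  subject = _n_;
  drop r;
run;
```

Because the last block is always completed, the schedule may contain a few more than 24 entries. The SIZE column is kept here only for checking and would normally be hidden from investigators.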

3. Stratified Randomization

Imbalance in the numbers of subjects reduces statistical power, but imbalance in prognostic factors also makes the estimate of the treatment effect less efficient. A trial may not be valid if it is not well balanced across prognostic factors. For example, with 6 diabetics there is a 22% chance of a 5-1 or 6-0 split under block randomization alone. Stratified randomization is the solution for achieving balance within subgroups: use block randomization separately for diabetics and non-diabetics. For example:

Age Group: ≤ 40, 41-60, > 60
Sex: M, F
Total number of strata = 3 x 2 = 6

Stratification balances subjects on baseline covariates and tends to produce groups that are comparable with regard to certain characteristics (e.g. gender, age, race, disease severity), and thus produces valid statistical tests. The block size should be relatively small to maintain balance in small strata. Increasing the number of stratification variables, or the number of levels within a variable, leads to fewer patients per stratum. Subjects should have baseline measurements taken before randomization. Large clinical trials often do not use stratification, because a serious imbalance in subject characteristics is unlikely in a large randomized trial.
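The 22% figure quoted above for 6 diabetics can be reproduced with a binomial calculation. The sketch assumes each diabetic is allocated to A or B independently with probability 1/2, which is an approximation of what happens when blocking ignores diabetic status; X denotes the number of diabetics assigned to A.

```latex
\[
P(\text{5-1 or 6-0 split})
  = P(X \in \{0, 1, 5, 6\})
  = \frac{2\left[\binom{6}{0} + \binom{6}{1}\right]}{2^{6}}
  = \frac{14}{64} \approx 0.22 ,
\qquad X \sim \operatorname{Binomial}(6, \tfrac{1}{2}) .
\]
```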

4. Unequal Randomization

Most randomized trials allocate equal numbers of patients to the experimental and control groups. This is the most statistically efficient randomization ratio, as it maximizes statistical power for a given total sample size. However, it may not be the most economically efficient, or the most ethically or practically feasible. When two or more treatments under evaluation differ in cost, it may be more economical to randomize fewer patients to the expensive treatment and more to the cheaper one. Substantial cost savings can be achieved with a moderate randomization ratio such as 2:1, with only a modest loss in statistical power. In some oncology trials, one arm may save lives while the other, such as placebo or medical care only, does little to save them, so a subject's survival time depends on which treatment is received. A more extreme allocation may be used in these trials to put fewer patients into the placebo group. Generally, a randomization ratio of 3:1 loses considerable statistical power, and ratios more extreme than 3:1 are not very useful because they require a much larger sample size.
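The loss of power from unequal allocation can be quantified with the same variance argument used for the 5-5 versus 7-3 comparison. The sketch below assumes a fixed total sample size N, a common outcome variance in both arms, and an allocation ratio of k:1; these symbols are introduced here for illustration and are not part of the original text.

```latex
\[
n_1 = \frac{kN}{k+1}, \quad n_2 = \frac{N}{k+1}
\;\Longrightarrow\;
\frac{1}{n_1} + \frac{1}{n_2} = \frac{(k+1)^2}{kN},
\qquad
\text{relative efficiency} = \frac{4/N}{(k+1)^2/(kN)} = \frac{4k}{(k+1)^2} .
\]
\[
k = 2:\ \frac{8}{9} \approx 0.89, \qquad
k = 3:\ \frac{12}{16} = 0.75, \qquad
k = 4:\ \frac{16}{25} = 0.64 .
\]
```

Under these assumptions a 2:1 ratio keeps about 89% of the efficiency of equal allocation, while 3:1 keeps only 75%, which is consistent with the guidance above.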

SAS IMPLEMENTATION

1. SAS Random Number Generators

SAS provides several functions that work as random number generators:

• RANUNI: generates random numbers between 0 and 1 from a uniform distribution.
• RANNOR: generates random numbers from a standard normal N(0, 1) distribution.
• RANBIN: generates random numbers from a binomial distribution.

Random number generators are used to produce randomization schedules for clinical trials and to carry out simulation studies. Suppose subjects are to receive either a drug or a placebo with equal probability, and the variable GROUP represents the assignment: GROUP = 'A' or GROUP = 'P'. RANUNI generates a random number R between 0 and 1. If R is less than .5, the subject is assigned to GROUP = 'A'. If R is greater than or equal to .5, the subject is assigned to GROUP = 'P'. The code that does this is the following:

    data ONE;
      seed = 123;                  /* fixed seed so the schedule can be reproduced */
      do i = 1 to 100;
        r = ranuni(seed);          /* uniform random number between 0 and 1 */
        if r < .5 then group = 'A';
        else group = 'P';
        output;
      end;
    run;

    /* count the subjects assigned to each group */
    proc freq data=ONE;
      tables group;
    run;

The SEED for the random number generator determines the starting value. The same positive SEED in the program always generates the same results. However, if SEED is 0 or a negative number, the result will be different each time, because SAS then uses the current computer clock time as the seed. The result is impossible to predict, but this is generally not recommended. You need to select a starting seed value so that you can reproduce the results with the same seed at a later date. Otherwise you may have to wait for thousands of years to get the same result. Note that in this example the treatment assignments are unbalanced according to the PROC FREQ result: there are 56 assignments to placebo P and only 44 assignments to active treatment. This is not an unusual imbalance. The following code puts the same number of subjects into each group by sorting on the random number and then assigning drug and placebo along the random sequence.

    data ONE;
      seed = 123;
      do i = 1 to 100;
        r = ranuni(seed);
        output;
      end;
    run;

    /* put the subjects in random order */
    proc sort data=ONE;
      by r;
    run;

    /* first 50 in the random order get A, the remaining 50 get P */
    data TWO;
      set ONE;
      if _n_ <= 50 then group = 'A';
      else group = 'P';
    run;

What if we want to split 100 subjects into more than 2 treatment groups? PROC RANK can easily accomplish this.

    proc rank data=ONE groups=5 out=THREE;
      var r;
      ranks group;
    run;

PROC RANK categorizes the values of the numeric variable R in data set ONE and creates the new data set THREE. The new variable GROUP created by PROC RANK indicates the group to which each observation belongs. The GROUPS=N option sets the number of groups to create; the group values run from 0 to N-1.

RANNOR is another SAS random number generator. It produces random numbers that have a normal distribution with mean 0 and standard deviation 1. RANNOR is used in much the same way as RANUNI.
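As a minimal sketch of using RANNOR in the same way as RANUNI, the random normal value can serve as a sort key for a balanced two-group assignment. The seed value and the data set names NORM, NORM_SORTED and NORM_GROUPS are illustrative assumptions.

```sas
/* Generate one standard normal random number per subject */
data NORM;
  seed = 123;                 /* assumed seed, chosen only for illustration */
  do i = 1 to 100;
    z = rannor(seed);         /* draw from N(0, 1) */
    output;
  end;
run;

/* Put the subjects in random order by sorting on the random value */
proc sort data=NORM out=NORM_SORTED;
  by z;
run;

/* First 50 in the random order get A, the remaining 50 get P */
data NORM_GROUPS;
  set NORM_SORTED;
  length group $1;
  if _n_ <= 50 then group = 'A';
  else group = 'P';
run;
```

Only the ordering of the random values matters here, so the choice between a uniform and a normal generator does not change the properties of the resulting assignment.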

2. PROC SURVEYSELECT

This procedure was originally designed for drawing a relatively small random sample from a very large data set. The SURVEYSELECT procedure provides a variety of methods for selecting probability-based random samples. It can select a simple random sample or can sample according to a complex multistage sample design that includes stratification and unequal randomization. The following is the simple randomization.

    /* 100 subjects, numbered 1 to 100 */
    data ONE;
      do i = 1 to 100;
        output;
      end;
    run;

    /* select 50 of them at random, without replacement */
    proc surveyselect data=ONE method=srs n=50 out=TWO;
    run;

METHOD=SRS specifies simple random sampling, in which each subject has an equal probability of selection and sampling is without replacement. The N=50 option specifies the sample size. The OUT= option names the data set that stores the sample. If we define the subjects in TWO as active treatment, then the rest of the subjects in ONE will be treated with placebo. The following information is displayed in the output and summarizes the sample selection.

The SURVEYSELECT Procedure

Selection Method         Simple Random Sampling

Input Data Set           ONE
Random Number Seed       56895
Sample Size              50
Selection Probability    0.5
Sampling Weight          2
Output Data Set          TWO

The random number seed is 56895. Since the SEED= option is not specified in the PROC statement, the seed value is obtained from the time on the computer's clock. You can specify SEED=56895 to reproduce this sample. It is recommended that a seed be specified so that the sample can be replicated. In the next example, data set ONE has 100 subjects, of which 20 are male. We would like to randomly split them into two treatment groups and also ensure that each group has an equal number of males, i.e. 10 males in each group.

    /* 100 subjects: the first 20 are male, the rest female */
    data ONE;
      do n = 1 to 100;
        if n <= 20 then sex = 'M';
        else sex = 'F';
        output;
      end;
    run;

    /* the input must be sorted by the STRATA variable */
    proc sort data=ONE;
      by sex;
    run;

    /* select 40 females and 10 males (strata are in sorted order: F, M) */
    proc surveyselect data=ONE method=srs n=(40 10) out=TWO;
      strata sex;
    run;

Stratification is added to the sampling. Random samples are selected independently within the strata. N=(40 10) requests 40 subjects from the Female stratum and 10 subjects from the Male stratum. PROC SURVEYSELECT requires that the input data set be sorted by the STRATA variables. PROC FREQ with TABLES SEX displays the sampling result, as expected.

sex    Frequency    Percent
F         40         80.00
M         10         20.00

Alternatively, the N= option can be replaced by rate=(0.5, 0.5). RATE= gives the proportion of observations to select from each stratum, 50% from Female and 50% from Male in this example. The rate can be adjusted for unequal randomization. The following randomization selects 25 subjects. Suppose that they are put into the placebo group; the rest of the subjects will then be in the active treatment group. The randomization ratio is 1:3, and it is also stratified by SEX.

    proc surveyselect data=ONE method=srs rate=(0.25, 0.25) out=THREE;
      strata sex;
    run;
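The selections above return only the sampled subjects. One way to carry the result through to a full schedule with both arms is the OUTALL option, which keeps every input observation and adds the indicator variable Selected. The sketch below repeats the stratified 1:3 selection under some assumptions: the seed value and the data set names SUBJECTS, ALLSUBJ and SCHEDULE are hypothetical and not taken from the paper.

```sas
/* 100 subjects, 20 male and 80 female, as in the example above */
data SUBJECTS;
  do n = 1 to 100;
    if n <= 20 then sex = 'M';
    else sex = 'F';
    output;
  end;
run;

proc sort data=SUBJECTS;
  by sex;
run;

/* OUTALL keeps all subjects; Selected = 1 marks the sampled ones */
proc surveyselect data=SUBJECTS method=srs rate=(0.25, 0.25)
                  seed=13579 out=ALLSUBJ outall;
  strata sex;
run;

/* sampled subjects get placebo, the rest get active treatment */
data SCHEDULE;
  set ALLSUBJ;
  length group $1;
  if Selected = 1 then group = 'P';
  else group = 'A';
run;

/* check the 1:3 split within each sex */
proc freq data=SCHEDULE;
  tables sex * group;
run;
```

Specifying SEED= here follows the recommendation above, so the same schedule can be regenerated later.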

3. PROC PLAN

The PLAN procedure is designed specifically for more complex designs and randomization plans, such as factorial designs and nested and crossed designs. It can also be used for many basic randomization designs. The syntax is somewhat tricky, so care should be taken when using the procedure. The first example is a simple randomization that divides 12 subjects into 3 treatments.

    proc plan;
      factors Subject=12;
      treatments Group=12 cyclic (1 1 1 1 2 2 2 2 3 3 3 3);
      output out=ONE;
    run;
    quit;

Simple Randomization with 3 Levels of Treatments

Subject    Group
   1         3
   2         2
   3         3
   4         1
   5         1
   6         3
   7         2
   8         3
   9         2
  10         1
  11         2
  12         1

Once again, a SEED should be specified. Otherwise SAS generates its own seed, which is displayed in the LOG: At the start of processing, random number seed=37786.

Our next example is about the block randomization design for 12 subjects: 2 treatments of A & B, block size 2x2=4 and 12/4 =3 blocks.

    PROC PLAN SEED=12345678;
      FACTORS Block=3 random Size=4 random;
      OUTPUT out=C Size cvals=('A' 'A' 'B' 'B');
    RUN;

It can be seen that the two treatments are always balanced within each block.

Block Randomization Design With 3 Blocks of Size 4, Treatments of A & B

Obs    Block    Size
  1      1       B
  2      1       A
  3      1       B
  4      1       A
  5      2       A
  6      2       B
  7      2       B
  8      2       A
  9      3       B
 10      3       B
 11      3       A
 12      3       A

CONCLUSION

Randomization in clinical trials is convenient with the power of SAS. The randomization numbers generated are stored in the central computer center (CORE) or put into sealed envelopes (opaque, not resealable). Each subject must have a unique identification number and keep that number throughout the study. Each subject should be determined to be eligible by uniform and clear eligibility criteria and should have signed the informed consent form (ICF) before randomization. The subject's randomization number can be obtained by calling the randomization center through an interactive voice response system (IVRS) or by accessing the web-based central randomization system.

CONTACT INFORMATION

Zaizai Lu [email protected] AstraZeneca Pharmaceuticals Wilmington, Delaware

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.