An Introduction to Dynamic Treatment Regimes

Marie Davidian

Department of Statistics, North Carolina State University
http://www4.stat.ncsu.edu/davidian

Dynamic Treatment Regimes Webinar

Outline

• What is a dynamic treatment regime, and why study them?
• Clinical trials to study dynamic treatment regimes
• Thinking in terms of dynamic treatment regimes
• Constructing dynamic treatment regimes
• Discussion

Hot topic

Personalized Medicine

Source of graphic: http://www.personalizedmedicine.com/

A perspective on personalized medicine

Clinical practice: Clinicians make (a series of) treatment decision(s) over the course of a patient’s disease or disorder
• Key decision points in the disease process
• Fixed schedule, milestone in the disease process, event necessitating a decision
• Several treatment options at each decision point
• Accruing information on the patient
• “Personalize” treatment to the patient

That is: Treatment in practice involves sequential decision-making based on accruing information
• Suggests thinking about and studying treatment from this perspective...

Clinical decision-making

How are these decisions made?
• Clinical judgment
• Practice guidelines based on study results, expert opinion
• Synthesize all information on a patient up to the point of the decision to determine the next treatment action

Can clinical decision-making be formalized and made “evidence-based”?

Dynamic treatment regime

Dynamic treatment regime:
• A set of sequential decision rules, each corresponding to a key decision point
• Each rule dictates the treatment to be given from among the available options based on the accrued information on the patient to that point
• Taken together, the rules define an algorithm for making treatment decisions
• Dynamic because the treatment action can vary depending on the accrued information
• Ideally, provides an “evidence-based” approach to personalized treatment
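The definition above can be pictured in code as an ordered list of functions, one per decision point, each mapping the accrued information to a treatment. The following is a minimal sketch only: the dictionary representation of accrued information, the treatment labels "A", "B", "continue", "switch", and the age threshold are all hypothetical and are not taken from the slides.

```python
from typing import Any, Callable, Dict, List

# Accrued information is modelled as a plain dict (an assumption for illustration).
History = Dict[str, Any]
Rule = Callable[[History], str]


def apply_regime(rules: List[Rule], histories: List[History]) -> List[str]:
    """Apply rule k to the information accrued by decision point k."""
    if len(rules) != len(histories):
        raise ValueError("need one accrued-information record per decision point")
    return [rule(h) for rule, h in zip(rules, histories)]


def rule_1(h: History) -> str:
    # Hypothetical first-decision rule; the threshold 50 is illustrative only.
    return "A" if h["age"] < 50 else "B"


def rule_2(h: History) -> str:
    # Hypothetical second-decision rule: continue if responding, else switch.
    return "continue" if h["responded"] else "switch"


regime = [rule_1, rule_2]
patient = [{"age": 43}, {"age": 43, "responded": False}]
actions = apply_regime(regime, patient)
print(actions)
```

The regime is "dynamic" in the sense of the slide because the same list of rules can return different actions for patients with different accrued information.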

Treatment regime

Terminology/Convention:
• Often, treatment regime is used to refer generally to any approach to deciding on treatment
• And dynamic treatment regime is reserved for the case where patient information is used
• We will use these terms interchangeably

In fact: Many common situations can be cast as involving (dynamic) treatment regimes

ADHD therapy

Sequential (scheduled) decision points
• Decision 1: Low dose therapy – 2 options: medication or behavior modification
• Subsequent monthly decisions:
  - Responders – Continue initial therapy
  - Non-responders – 2 options: add the other therapy or increase dose of current therapy
• Objective: Improved end-of-school-year performance

Example from Susan Murphy,

Cancer treatment

Two (milestone) decision points:
• Decision 1: Induction chemotherapy (options C1, C2)
• Decision 2:
  - Maintenance treatment for patients who respond (options M1, M2)
  - Salvage chemotherapy for those who don’t respond (options S1, S2)
• Objective: Maximize survival time

Possible treatment regimes

Possible rules at Decision 1:
• “Give C1” (non-dynamic)
• “If age < 50, progesterone receptor level < 10 fmol, RAD51 mutation, then give C1, else, give C2”
• “If patient is a Libra, Scorpio, or Sagittarius, give C1, else, give C2”

Possible rules at Decision 2:
• “If patient responds, give maintenance M1; if does not respond, give salvage S1” (dynamic)
• “If patient responds, age < 60, CEA > 10 ng/mL, progesterone receptor level < 8 fmol, give M1, else, give M2; if does not respond, age > 65, P53 mutation, CA 15-3 > 25 units/mL, then give S1, else, give S2”
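Several of the example rules above translate directly into small functions. This is a sketch under assumptions: the comma-separated conditions in the quoted rules are read as a logical AND, and the dictionary keys (`age`, `pr_fmol`, `rad51_mutation`, `cea_ng_ml`, `p53_mutation`, `ca15_3_units_ml`, `responded`) are hypothetical names chosen for illustration.

```python
def d1_static(h1):
    """Non-dynamic rule: 'Give C1' regardless of patient information."""
    return "C1"


def d1_biomarker(h1):
    """Dynamic Decision 1 rule based on age, PR level and RAD51 status."""
    if h1["age"] < 50 and h1["pr_fmol"] < 10 and h1["rad51_mutation"]:
        return "C1"
    return "C2"


def d2_simple(h2):
    """Dynamic Decision 2 rule that uses response status only."""
    return "M1" if h2["responded"] else "S1"


def d2_detailed(h2):
    """Dynamic Decision 2 rule that also uses age and biomarkers."""
    if h2["responded"]:
        favourable = h2["age"] < 60 and h2["cea_ng_ml"] > 10 and h2["pr_fmol"] < 8
        return "M1" if favourable else "M2"
    favourable = h2["age"] > 65 and h2["p53_mutation"] and h2["ca15_3_units_ml"] > 25
    return "S1" if favourable else "S2"


# One hypothetical patient, before and after induction therapy.
h1 = {"age": 45, "pr_fmol": 6.0, "rad51_mutation": True}
h2 = dict(h1, responded=True, cea_ng_ml=12.0, p53_mutation=False, ca15_3_units_ml=20.0)
print(d1_static(h1), d1_biomarker(h1), d2_simple(h2), d2_detailed(h2))
```

Pairing any Decision 1 function with any Decision 2 function gives one regime, which is why the number of possible regimes grows so quickly.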

Possible treatment regimes

Result: Rules, and thus regimes, can be simple or complex (or not realistic)
• More complex rules involve more “personalization” and more closely mimic clinical practice
• There is an infinitude of possible rules at each decision point, and thus an infinitude of possible regimes
• Ultimate goal: Find the “best” or “optimal” regime

Regimes of interest and “optimal” depend on the question
• For definiteness, assume larger outcomes are preferred

Classes of treatment regimes

1. Classical treatment comparison:
• Focus on a single decision point
• Cancer example: Decision 1
• Two regimes of interest: “Give C1” vs. “Give C2”
• Class of regimes of interest is D = {“Give C1”, “Give C2”}
• Usual question: “If all patients in the population were to be given C1, would mean outcome (mean survival time) be different from (better than) that if all patients in the population were to be given C2?”
• Optimal regime in D: The regime such that, if all patients in the population were to receive treatment according to it, mean outcome would be the largest among all regimes in D (here, “Give C1” or “Give C2”)
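The verbal definition of the optimal regime can be written compactly. The notation here is an assumption introduced for illustration and does not appear on the slide: $Y^{*}(d)$ stands for the outcome a patient would have if treated according to regime $d$, and the expectation is over the patient population. The `\operatorname*` command assumes the `amsmath` package.

```latex
\[
  d^{\mathrm{opt}} \;=\; \operatorname*{arg\,max}_{d \in \mathcal{D}} \; E\{ Y^{*}(d) \},
  \qquad
  \mathcal{D} = \{\text{``Give C1''},\ \text{``Give C2''}\}.
\]
```

With only two regimes in $\mathcal{D}$, finding $d^{\mathrm{opt}}$ reduces to the usual comparison of two population means.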

Classes of treatment regimes

2. Which is the “best” treatment sequence?
• Multiple decision points
• Cancer example: Eight dynamic regimes of interest:
  1. Give C1 followed by (M1 if response, S1 if no response)
  2. Give C1 followed by (M1 if response, S2 if no response)
  3. Give C1 followed by (M2 if response, S1 if no response)
  4. Give C1 followed by (M2 if response, S2 if no response)
  5. Give C2 followed by (M1 if response, S1 if no response)
  6. Give C2 followed by (M1 if response, S2 if no response)
  7. Give C2 followed by (M2 if response, S1 if no response)
  8. Give C2 followed by (M2 if response, S2 if no response)
• Class D of interest contains these 8 regimes
• Question: Comparison of mean outcomes if all patients in the population were to follow each regime
• Optimal regime in D: The regime such that, if all patients were to receive treatment according to it, mean outcome would be the largest among all regimes in D
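The eight regimes are simply every combination of one induction option, one maintenance option and one salvage option (2 × 2 × 2 = 8). A short sketch that enumerates them in the same order as the list above; the dictionary keys are hypothetical names chosen for illustration.

```python
from itertools import product

# Every combination of induction, maintenance and salvage options.
regimes = [
    {"induction": c, "if_response": m, "if_no_response": s}
    for c, m, s in product(("C1", "C2"), ("M1", "M2"), ("S1", "S2"))
]

for number, r in enumerate(regimes, start=1):
    print(
        f"{number}. Give {r['induction']} followed by "
        f"({r['if_response']} if response, {r['if_no_response']} if no response)"
    )
```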

Classes of treatment regimes

3. “Best” dynamic regime in a “feasible class”?
• Single or multiple decision points
• Cancer example: Decision 1
• X1 = (lots of) patient information available at Decision 1
• In resource-limited setting, interested in rules depending on a subset of X1 routinely collected, e.g., of form

  “If age < η1 and PR < η2 give C2; else give C1” (PR = progesterone receptor level)

• Class D of interest consists of all regimes of this form (so for all values of η1 and η2)
• Optimal regime in D: The regime defined by values $\eta_1^{\mathrm{opt}}$, $\eta_2^{\mathrm{opt}}$ such that, if all patients in the population were to receive treatment according to it, mean outcome would be the largest among all regimes in D
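A class of regimes indexed by (η1, η2) can be sketched as a function that returns a rule for each choice of the two thresholds. The helper name `make_rule` and the threshold values used below are hypothetical; only the form of the rule comes from the slide.

```python
def make_rule(eta1, eta2):
    """Return the Decision 1 rule 'if age < eta1 and PR < eta2 give C2; else give C1'."""
    def d1(age, pr):
        return "C2" if (age < eta1 and pr < eta2) else "C1"
    return d1


# Two members of the class; the thresholds are illustrative values only.
rule_a = make_rule(eta1=50, eta2=10)
rule_b = make_rule(eta1=40, eta2=10)

# The same patient (age 45, PR 6 fmol) is treated differently by the two regimes.
print(rule_a(45, 6.0), rule_b(45, 6.0))
```

Searching for the optimal regime in this class then amounts to searching over the two numbers η1 and η2 rather than over arbitrary rules.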

Classes of treatment regimes

4. “Optimal” overall dynamic treatment regime:
• Single or multiple decision points
• Cancer example: Two decision points
• X1 = patient information available at Decision 1, X2 = additional information collected between Decisions 1 and 2
• Accrued information at each decision

  Decision 1: H1 = X1
  Decision 2: H2 = {X1, A1, X2}

• Class D of interest: All possible sets of rules

  {d1(H1), d2(H2)}

• Each rule takes as input the accrued information and outputs a treatment from among the available options
• Optimal regime in D: $\{d_1^{\mathrm{opt}}(H_1), d_2^{\mathrm{opt}}(H_2)\}$ such that, if all patients were to receive treatment according to it, mean outcome would be the largest among all regimes in D
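The way information accrues can be made concrete with a small sketch, reading A1 as the treatment given at Decision 1. The function `follow_regime`, the example rules, and the `observe_x2` callback (a deterministic stand-in for whatever is actually observed between the two decisions) are all hypothetical and serve only to show how H1 and H2 are assembled.

```python
def follow_regime(d1, d2, x1, observe_x2):
    """Apply a two-decision regime, building H1 and H2 as information accrues."""
    h1 = {"X1": x1}                       # H1 = X1
    a1 = d1(h1)                           # treatment chosen at Decision 1
    x2 = observe_x2(a1)                   # information collected after Decision 1
    h2 = {"X1": x1, "A1": a1, "X2": x2}   # H2 = {X1, A1, X2}
    a2 = d2(h2)                           # treatment chosen at Decision 2
    return a1, a2


def d1(h1):
    # Hypothetical Decision 1 rule; the threshold is illustrative only.
    return "C1" if h1["X1"]["age"] < 50 else "C2"


def d2(h2):
    # Hypothetical Decision 2 rule that uses both response status and A1.
    if h2["X2"]["responded"]:
        return "M1" if h2["A1"] == "C1" else "M2"
    return "S1"


def observe_x2(a1):
    # Stand-in for real follow-up data: pretend only C1 patients respond.
    return {"responded": a1 == "C1"}


print(follow_regime(d1, d2, {"age": 45}, observe_x2))
print(follow_regime(d1, d2, {"age": 60}, observe_x2))
```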

Classes of treatment regimes

In all of Cases 1–4: A set of rules at each of K decision points, K = 1 or 2, depending on accrued information

  Decision 1: H1 = X1
  Decision 2: H2 = {X1, A1, X2}

Dynamic treatment regime

  d = d1(H1) or d = {d1(H1), d2(H2)}

• Case 1: K = 1, rules of form (simple)

  d1(H1) = Cj for all H1, j = 1, 2

• Case 2: K = 2, rules of form (simple)

  d1(H1) = Cj for all H1, j = 1, 2

  X2 contains response status

  d2(H2) = Mk if response, Sℓ if no response, k, ℓ = 1, 2

Classes of treatment regimes

In all of Cases 1–4: A set of rules at each of K decision points, K = 1 or 2, depending on accrued information

  Decision 1: H1 = X1
  Decision 2: H2 = {X1, A1, X2}

Dynamic treatment regime

  d = d1(H1) or d = {d1(H1), d2(H2)}

• Case 3: K = 1, code {C1, C2} = {0, 1}, rules of form

  d1(H1) = I(age < η1, PR < η2)

• Case 4: K = 2, general rules {d1(H1), d2(H2)}; e.g., with two options coded as {0, 1} at each decision

  $d_1(H_1) = I(\eta_1^T H_1 > 0), \quad d_2(H_2) = I(\eta_2^T H_2 > 0)$

  Rules involve linear combinations of accrued information
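The linear rules of Case 4 can be sketched in a few lines. This is an illustration under assumptions: the accrued information is encoded as a numeric tuple whose first entry is a constant 1 (so that η can include an intercept), and the coefficient values are hypothetical.

```python
def linear_rule(eta):
    """Return the rule d(H) = I(eta^T H > 0), with options coded as 0 and 1."""
    def d(h):
        if len(h) != len(eta):
            raise ValueError("eta and H must have the same length")
        score = sum(e * x for e, x in zip(eta, h))
        return 1 if score > 0 else 0
    return d


# Decision 1: H1 = (1, age). With eta1 = (50, -1) the rule is I(50 - age > 0),
# that is, option 1 for patients younger than 50 (illustrative values only).
d1 = linear_rule((50.0, -1.0))

# Decision 2: H2 = (1, age, A1, response). This eta2 reacts to response only.
d2 = linear_rule((-0.5, 0.0, 0.0, 1.0))

print(d1((1.0, 45.0)), d1((1.0, 62.0)))
print(d2((1.0, 45.0, 1.0, 1.0)), d2((1.0, 45.0, 1.0, 0.0)))
```

The first example also shows that a single-threshold rule such as I(age < 50) is a special case of the linear form.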

Studying dynamic treatment regimes

How do we find an optimal treatment regime within a class of interest?
• Required: Appropriate data
• Case 1. Classical, single decision treatment comparison: Data from a standard clinical trial comparing C1 and C2
• Case 2. Optimal treatment sequence for two decision points (simple dynamic treatment regimes)
• We will return to Cases 3 and 4 later

Clinical trials for studying treatment regimes

Recall: In our example, D consists of eight regimes

1. Give C1 followed by (M1 if response, S1 if no response)

2. Give C1 followed by (M1 if response, S2 if no response)

3. Give C1 followed by (M2 if response, S1 if no response)

4. Give C1 followed by (M2 if response, S2 if no response)

5. Give C2 followed by (M1 if response, S1 if no response)

6. Give C2 followed by (M1 if response, S2 if no response)

7. Give C2 followed by (M2 if response, S1 if no response)

8. Give C2 followed by (M2 if response, S2 if no response)

How do we compare the regimes in D and identify the “best?”

Clinical trials for studying treatment regimes

Can’t we base this on data from a series of previous trials?
• In one trial, C1 was compared against C2 in terms of response rate
• In another trial, M1 and M2 were compared on the basis of survival time in subjects who responded to their induction chemotherapy
• In yet another, S1 and S2 were compared (survival) in subjects for whom induction therapy did not induce response
• Can’t we just “piece together” the results from these separate trials to figure out the “best regime”?
• E.g., figure out the best “C” treatment for inducing response and then the best “M” and “S” treatments for prolonging survival?
• Wouldn’t the regime that uses these have to have the “best” mean outcome?

Clinical trials for studying treatment regimes

One problem with this: Delayed effects
• E.g., C1 may yield a higher proportion of responders than C2 but may also have other effects that render subsequent maintenance treatments less effective in terms of mean survival time
• Implication: Must study entire regimes in the same patients

Data for doing this:
• Design a clinical trial expressly for this purpose (next)
• Use longitudinal observational data, where treatments actually received at each decision point have been recorded (with other information)
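A small numerical illustration of a delayed effect may help. All numbers below are invented for illustration and are not from any trial. The mean outcome of a regime "give Cj, then M1 if response, S1 if no response" is written as a weighted average of mean survival among responders and non-responders, weighted by the response probability under Cj.

```python
def regime_value(p_response, mean_if_response, mean_if_no_response):
    """Mean outcome of a regime as a response-weighted average of subgroup means."""
    return p_response * mean_if_response + (1.0 - p_response) * mean_if_no_response


# Hypothetical inputs (mean survival in months).
# C1 induces response more often, but its responders do worse on M1 afterwards.
value_c1 = regime_value(p_response=0.60, mean_if_response=24.0, mean_if_no_response=10.0)
value_c2 = regime_value(p_response=0.50, mean_if_response=32.0, mean_if_no_response=10.0)

print(f"C1 then (M1 if response, S1 if not): {value_c1:.1f} months")
print(f"C2 then (M1 if response, S1 if not): {value_c2:.1f} months")
print("Regime starting with C2 is better:", value_c2 > value_c1)
```

In this made-up scenario, choosing the induction therapy by response rate alone would pick C1, yet the regime starting with C2 has the larger mean survival, which is why entire regimes need to be studied in the same patients.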

Clinical trials for studying treatment regimes

Clinical trials:
• An eight arm trial – subjects randomized to the jth arm follow the jth regime
• A Sequential, Multiple Assignment, Randomized Trial (next slide...)
• How to analyze the data to compare regimes and find the optimal regime? What else can be learned from such trials?

Clinical trials for studying treatment regimes

SMART: Sequential, Multiple Assignment, Randomized Trial (randomization at •s)

[Figure: SMART design for the cancer example]
Cancer • C1: Response • M1 or M2; No Response • S1 or S2
       • C2: Response • M1 or M2; No Response • S1 or S2

Pioneered by Susan Murphy, Phil Lavori, and others
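The SMART design for the cancer example can be sketched as a short simulation: each subject is randomized to C1 or C2, response is observed, and the subject is then randomized again among the options appropriate to their response status. Equal allocation at each randomization, the response probabilities, and the function name `smart_assign` are assumptions made for illustration.

```python
import random


def smart_assign(rng, p_response):
    """Simulate one subject's path through the two-stage SMART."""
    induction = rng.choice(("C1", "C2"))              # first randomization
    responded = rng.random() < p_response[induction]  # observed response status
    options = ("M1", "M2") if responded else ("S1", "S2")
    second = rng.choice(options)                      # second randomization
    return induction, responded, second


rng = random.Random(2024)
p_response = {"C1": 0.6, "C2": 0.5}  # hypothetical response probabilities
subjects = [smart_assign(rng, p_response) for _ in range(8)]
for subject in subjects:
    print(subject)
```

Each simulated subject ends up with a treatment experience such as C1, response, M2, which is the kind of record the trial data would contain.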

Clinical trials for studying treatment regimes

Embedded regimes: The eight regimes in D are embedded in the SMART

[Figure: SMART diagram. Cancer patients are randomized to C1 or C2; responders are then randomized to M1 or M2, non-responders to S1 or S2]

Clinical trials for studying treatment regimes

Examples of SMARTs: SMARTs have been carried out or are ongoing, mainly in behavioral disorders; see http://methodology.psu.edu/ra/smart/projects

• SMARTs have also been done in oncology (coming up...)

Remarks:
• There is really no conceptual difference between randomizing up front or sequentially
• Advantages and disadvantages, e.g., consent, balance
• Important: Making efficient use of the data

Seminal reference: Murphy SA. (2005). An experimental design for the development of adaptive treatment strategies, Statistics in Medicine, 24, 1455–1481.

Clinical trials for studying treatment regimes

Remark 1: Individuals following the same regime can have different realized treatment experiences, e.g.,

Give C1 followed by (M1 if response, S1 if no response)

• Subject 1: Receives C1, responds, receives M1
• Subject 2: Receives C1, does not respond, receives S1
• Both subjects’ experiences are consistent with following this regime

Remark 2: Individuals following different regimes can have the same realized treatment experience, e.g., the experience

C1 ⇒ Response ⇒ M1

is consistent with having followed EITHER OF the regimes
• C1 followed by (M1 if response, S1 if no response)
• C1 followed by (M1 if response, S2 if no response)
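The two remarks can be checked mechanically. The sketch below is illustrative only: the tuple encodings and the function name `is_consistent` are assumptions, not notation from the slides.

```python
def is_consistent(experience, regime):
    """Check whether a realized experience could have arisen from a regime.

    experience: (first-stage treatment, responded?, second-stage treatment)
    regime:     (first-stage treatment, treatment if response,
                 treatment if no response)
    """
    first, responded, second = experience
    reg_first, if_response, if_no_response = regime
    if first != reg_first:
        return False
    # The regime dictates the second-stage treatment given response status.
    dictated = if_response if responded else if_no_response
    return second == dictated

regime_a = ("C1", "M1", "S1")
regime_b = ("C1", "M1", "S2")

# Remark 1: two different experiences, both consistent with regime_a.
subject_1 = ("C1", True, "M1")
subject_2 = ("C1", False, "S1")
print(is_consistent(subject_1, regime_a), is_consistent(subject_2, regime_a))

# Remark 2: subject_1's experience is also consistent with regime_b,
# whereas subject_2's is not.
print(is_consistent(subject_1, regime_b), is_consistent(subject_2, regime_b))
```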

Clinical trials for studying treatment regimes

Remark 3: Do not confuse the regime with the possible realized experiences that can result from following it
• “C1 followed by response followed by M1” and “C1 followed by no response followed by S1” are not regimes but are possible results of following the above regime
• The regime is the algorithm (set of rules)

Remark 4: Do not confuse dynamic treatment regimes themselves or SMARTs with response-adaptive clinical trial designs for classical treatment comparisons
• A dynamic treatment regime is an algorithm for treating a single patient
• This has nothing to do with other patients in a study
• An adaptive trial is one in which the data are used to alter the design (e.g., drop an arm, sample size)
• The design of a SMART does not change

Clinical trials for studying treatment regimes

Estimation of mean outcome (e.g., mean survival):
• Usual approach under up-front randomization: estimate mean for regime j by sample average outcome based on subjects randomized to regime j only
• However: Subjects will have realized experiences consistent with more than one regime!
• This can be exploited to improve precision...

Estimating mean outcome for embedded regimes

Demonstration:
• A certain kind of SMART is common in oncology...
• ...but the way these trials are usually analyzed does not focus on comparing the embedded dynamic treatment regimes and finding the best treatment sequence
• We demonstrate the general principle of how to exploit realized experiences consistent with more than one regime to do this

Reference: Lunceford JK, Davidian M, Tsiatis AA. (2002). Estimation of survival distributions of treatment policies in two-stage randomization designs in clinical trials. Biometrics, 58, 48–57.

Estimating mean outcome for embedded regimes

Cancer and Leukemia Group B (CALGB) Protocol 8923: Double-blind, placebo-controlled trial of 338 elderly subjects with acute myelogenous leukemia (AML) with randomizations at two key decision points
• Decision 1: Subjects randomized to either standard induction chemotherapy C1 OR standard induction therapy + granulocyte-macrophage colony-stimulating factor (GM-CSF) C2 (two options)
• Decision 2:
  – If response, subjects randomized to M1, M2 = intensification/maintenance treatments I, II (two options)
  – If no response, only one option: follow-up with physician
• All subjects followed for the outcome survival time

Estimating mean outcome for embedded regimes

Four possible regimes: The class D of interest comprises

1. C1 followed by (M1 if response, else follow-up) (C1M1)

2. C1 followed by (M2 if response, else follow-up) (C1M2)

3. C2 followed by (M1 if response, else follow-up) (C2M1)

4. C2 followed by (M2 if response, else follow-up) (C2M2)

Estimating mean outcome for embedded regimes

Schematic of CALGB 8923: Randomization at •s

[Figure: AML patients are randomized (•) to Chemo + Placebo or Chemo + GM-CSF. In each arm, responders are randomized (•) to Intensification I or Intensification II, and nonresponders go to follow-up.]

Estimating mean outcome for embedded regimes

Standard analysis:

• Compare response rates to C1 and C2

• Compare survival between M1 and M2 among responders

• Compare survival between C1 and C2 regardless of subsequent response

• Does not address the embedded regimes

Estimating mean outcome for embedded regimes

Goal: Find the regime in D such that, if all patients in the population were to receive treatment according to it, mean survival would be the largest
• Estimate mean survival if all patients followed each of the four embedded regimes CjMk, j = 1, 2, k = 1, 2
• Use data from all subjects whose realized experience is consistent with having followed CjMk
• I.e., subjects with either

Cj ⇒ response ⇒ Mk

Cj ⇒ no response ⇒ follow up with physician

Estimating mean outcome for embedded regimes

Statistical framework: Causal inference perspective
• Characterize in terms of potential outcomes

Consider first: Classical single decision treatment comparison

Statistical framework

Case 1: Classical, single decision treatment comparison

• D = { “Give C1”, “Give C2” }
• Hypothesize potential outcomes under each regime in D
• $Y^{(1)}$ = outcome that would be achieved if a randomly chosen patient from the population were to follow regime “Give C1”; $Y^{(2)}$ defined analogously
• $E(Y^{(1)})$ = the mean outcome if all patients in the population were to follow “Give C1”; $E(Y^{(2)})$ analogously
• Usual question: “If all patients in the population were to be given C1, would mean outcome be different from (better than) that if all patients were to be given C2?” ⇒ Compare $E(Y^{(1)})$ and $E(Y^{(2)})$

Statistical framework

Clinical trial: Do not observe $Y^{(1)}$ and $Y^{(2)}$ on each subject

• With $A = 1$ (2) if subject randomized to “Give C1” (“Give C2”), we do observe $(Y, A)$, where $Y = Y^{(1)} I(A = 1) + Y^{(2)} I(A = 2)$

• By randomization, $Y^{(1)}, Y^{(2)} \perp\!\!\!\perp A$ ⇒ $E(Y^{(1)}) = E(Y^{(1)} \mid A = 1) = E(Y \mid A = 1)$, and similarly for $E(Y^{(2)})$

• Thus, from observed data $(Y_i, A_i)$, $i = 1, \ldots, n$ (iid), can estimate $E(Y^{(1)})$ by

$$\frac{\sum_{i=1}^n Y_i\, I(A_i = 1)}{\sum_{i=1}^n I(A_i = 1)},$$

the usual sample average, and $E(Y^{(2)})$ similarly
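A small simulation can illustrate the single-decision argument. This is a sketch under made-up assumptions: the outcome distributions, sample size, and the helper name `arm_mean` are all hypothetical. Each subject has both potential outcomes, only one is observed, and the arm-specific sample average still recovers the population mean because assignment is randomized.

```python
import random

random.seed(0)

# Simulated trial; every number here is invented for illustration.
n = 2000
A = [random.choice([1, 2]) for _ in range(n)]      # randomized arm
Y1 = [random.gauss(10.0, 2.0) for _ in range(n)]   # potential outcome under "Give C1"
Y2 = [random.gauss(12.0, 2.0) for _ in range(n)]   # potential outcome under "Give C2"

# Observed outcome: Y = Y1 * I(A = 1) + Y2 * I(A = 2)
Y = [y1 if a == 1 else y2 for a, y1, y2 in zip(A, Y1, Y2)]

def arm_mean(Y, A, arm):
    """Sample average of Y among subjects randomized to `arm`."""
    num = sum(y for y, a in zip(Y, A) if a == arm)
    den = sum(1 for a in A if a == arm)
    return num / den

est1 = arm_mean(Y, A, 1)  # estimates E(Y^(1)); true value is 10 in this simulation
est2 = arm_mean(Y, A, 2)  # estimates E(Y^(2)); true value is 12 in this simulation
print(est1, est2)
```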

Statistical framework

Case 2: Optimal treatment sequence for two decision points

• D = { CjMk, j, k = 1, 2 }
• Hypothesize potential outcomes under each regime in D
• $Y^{(jk)}$ = survival time that would be achieved if a randomly chosen patient from the population were to follow CjMk
• Question: Compare mean survival if all patients followed each of CjMk, j, k = 1, 2 ⇒ Compare (estimate) $E(Y^{(jk)})$, j, k = 1, 2
• Or survival probabilities

$$S_{jk}(t) = \mathrm{pr}(Y^{(jk)} > t) = E\{ I(Y^{(jk)} > t) \}, \quad j, k = 1, 2$$

• Assume no censoring (can be generalized)

Statistical framework

Clinical trial (e.g., SMART): Do not observe $Y^{(jk)}$, j, k = 1, 2
• Can we make a connection between potential outcomes and observed data as we did in Case 1?
• Consider j = 1; j = 2 similar. Observed for each subject: $(R, RZ, Y)$
  – $Y$ = survival time
  – $R = 1$ if subject responds to C1, $R = 0$ if not
  – $Z = k$ for responder randomized to Mk, k = 1, 2 (not defined if $R = 0$)
• Assume when $R = 0$, $Y^{(11)}$ and $Y^{(12)}$ are the same; then

$$Y = (1 - R)\,Y^{(11)} + R\,I(Z = 1)\,Y^{(11)} + R\,I(Z = 2)\,Y^{(12)}$$

• From observed data $(R_i, R_i Z_i, Y_i)$, $i = 1, \ldots, n$ (iid), estimate $E(Y^{(11)})$, $E(Y^{(12)})$, and similarly for j = 2

Estimating mean outcome for embedded regimes

Consider j = 1: Responders to C1 are randomized to M1 with probability π = 1/2

• Nonresponders to C1 ⇒ follow up

• Half of responders get M1, half get M2

• Estimate mean survival for C1M1 by weighted average

• Nonresponders represent themselves ⇒ weight = 1

• Each responder who got M1 represents him/herself and another similar subject who got randomized to M2 ⇒ weight = 2

• Estimator for C1M2: switch roles

• Note: Survival times from nonresponders are used to estimate the mean for both C1M1 and C1M2

Estimating mean outcome for embedded regimes

Formally: For j = 1 (j = 2 similar), $(R_i, R_i Z_i, Y_i)$, $i = 1, \ldots, n$
• $Y_i$ = survival time for subject i
• $R_i = 1$ if i responds to C1, $R_i = 0$ if not
• $Z_i = k$ for responder randomized to Mk, k = 1, 2
• $\mathrm{pr}(Z_i = 1 \mid R_i = 1) = \pi$ (= 1/2 in previous)

Estimators for $E(Y^{(11)})$: with $Q_i = 1 - R_i + R_i\, I(Z_i = 1)\, \pi^{-1}$,

$$n^{-1} \sum_{i=1}^n Q_i Y_i \quad \text{or} \quad \left( \sum_{i=1}^n Q_i \right)^{-1} \sum_{i=1}^n Q_i Y_i$$

• $Q_i = 0$ if i is inconsistent with C1M1 (consistent with C1M2)
• $Q_i = 1$ if $R_i = 0$
• $Q_i = \pi^{-1}$ if $R_i = 1$ and $Z_i = 1$
• Similarly for $E(Y^{(12)})$
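The weighted estimators can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function name `weighted_estimates`, the simulated response rate, and the outcome means are all hypothetical, and censoring is ignored as on the slide.

```python
import random

def weighted_estimates(R, Z, Y, k=1, pi=0.5):
    """Estimate E(Y^(1k)) from subjects whose first-stage treatment was C1.

    R[i] = 1 if subject i responded, 0 otherwise.
    Z[i] = 1 or 2 for responders (value ignored when R[i] = 0).
    pi   = pr(Z = 1 | R = 1), known from the randomization.
    Returns the two estimators: n^{-1} sum(Q*Y) and sum(Q*Y) / sum(Q).
    """
    prob_k = pi if k == 1 else 1.0 - pi
    # Q = 1 for nonresponders, 1/prob_k for responders assigned Mk, else 0.
    Q = [1.0 if r == 0 else (1.0 / prob_k if z == k else 0.0)
         for r, z in zip(R, Z)]
    qy = sum(q * y for q, y in zip(Q, Y))
    return qy / len(Y), qy / sum(Q)

# Simulated C1 arm; all numbers below are made up for illustration.
random.seed(1)
n = 20000
R = [int(random.random() < 0.5) for _ in range(n)]
Z = [random.choice([1, 2]) if r == 1 else None for r in R]
mean_by_path = {None: 6.0, 1: 14.0, 2: 10.0}  # follow-up, M1, M2
Y = [random.gauss(mean_by_path[z], 1.0) for z in Z]

est_c1m1 = weighted_estimates(R, Z, Y, k=1)
est_c1m2 = weighted_estimates(R, Z, Y, k=2)
print(est_c1m1, est_c1m2)  # true means in this simulation are 10 and 8
```

Note that the nonresponders contribute to both calls, which is the precision gain the slides describe.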

Estimating mean outcome for embedded regimes

Estimators for $E(Y^{(11)})$: with $Q_i = 1 - R_i + R_i\, I(Z_i = 1)\, \pi^{-1}$,

$$n^{-1} \sum_{i=1}^n Q_i Y_i \quad \text{or} \quad \left( \sum_{i=1}^n Q_i \right)^{-1} \sum_{i=1}^n Q_i Y_i$$

• Can show: $E(QY) = E(Y^{(11)})$, $E(Q) = 1$
• And similarly for j, k = 1, 2
• ⇒ Consistent estimators for $E(Y^{(jk)})$ (Appendix)
• Estimators for $E(Y^{(jk)})$, k = 1, 2, are correlated
• Can derive statistics for comparison ⇒ identify optimal regime in D
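A sketch of why $E(QY) = E(Y^{(11)})$ follows. The slides defer the formal argument to an appendix, so the steps below are a reconstruction, not a quotation. It uses the expression of $Y$ in terms of potential outcomes, the identities $R^2 = R$, $R(1 - R) = 0$, and $I(Z = 1) I(Z = 2) = 0$, and the assumption (guaranteed by the second-stage randomization) that $\mathrm{pr}(Z = 1 \mid R = 1, Y^{(11)}, Y^{(12)}) = \pi$.

```latex
\begin{align*}
QY &= \{1 - R + R\,I(Z=1)\,\pi^{-1}\}
      \{(1-R)\,Y^{(11)} + R\,I(Z=1)\,Y^{(11)} + R\,I(Z=2)\,Y^{(12)}\} \\
   &= \{1 - R + R\,I(Z=1)\,\pi^{-1}\}\,Y^{(11)} \;=\; Q\,Y^{(11)}, \\[4pt]
E(QY) &= E\big\{ Y^{(11)}\, E(Q \mid R, Y^{(11)}, Y^{(12)}) \big\}
       = E\big[ Y^{(11)} \{ 1 - R + R\,\pi\,\pi^{-1} \} \big]
       = E(Y^{(11)}).
\end{align*}
```

The same conditioning step with $Y^{(11)}$ replaced by 1 gives $E(Q) = 1$.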

Estimating mean outcome for embedded regimes

Remarks:
• Subjects may die before having a chance to respond – nonresponders at the time of death ($R = 0$)
• Survival time may be right-censored – can incorporate inverse probability of censoring weighting
• Randomization at each decision is key ⇒ subjects are prognostically similar
• Can be generalized to arbitrary number of decisions, numbers of options at each

Designing SMARTs

Considerations:
• Class of regimes should involve key decision points where it is feasible to randomize
• And with more than one treatment option and no consensus on choice among options
• Simplicity – small numbers of decision points and options
• Embedded regimes should have simple decision rules; e.g., depending only on a few variables (response status)
• Criteria and methods for sample size determination are an open problem
• Critical: Collect rich patient information at baseline and between decision points to inform development of more complex, optimal regimes (e.g., Cases 3 and 4)
• More shortly...

Designing SMARTs

Schematic of CALGB 8923: Randomization at •s

[Figure: AML patients are randomized (•) to Chemo + Placebo or Chemo + GM-CSF. In each arm, responders are randomized (•) to Intensification I or Intensification II, and nonresponders go to follow-up.]

Thinking in terms of dynamic treatment regimes

Questions not addressed in a conventional clinical trial:
• If a treatment is effective, what should be the duration of administration?
• How would the randomized treatments have compared if no patients had discontinued their assigned treatments?

Such questions can be cast as questions about dynamic treatment regimes
• Available data are almost always observational
• Databases from registries
• Databases from completed clinical trials

Thinking in terms of dynamic treatment regimes

Example: Optimal treatment duration
• ESPRIT trial – Integrilin vs. placebo in PCI/stent patients
• Primary analysis: Integrilin superior
• Protocol: Infusion duration of 18–24 hours with mandatory stopping for adverse events
• Duration of infusion left to physician discretion
• What should be the “recommended” treatment duration?
• Data are observational with respect to this question

More precisely: Treatment duration of t hours means infuse for t hours or until an adverse event requiring stopping, whichever comes first
• This is a dynamic treatment regime for each t because realized duration depends on the adverse event status

Johnson BA, Tsiatis AA. (2004). Estimating mean response as a function of treatment duration in an observational study, where duration may be informatively censored. Biometrics, 60, 315–323.

Thinking in terms of dynamic treatment regimes

Duration regime of t hours:

[Figure: Start Integrilin infusion. If an adverse event (AE) occurs before t hours, stop infusion immediately; if no AE occurs before t hours, stop infusion at t hours.]

• D = { all regimes of the form “infuse for t hours or until an adverse event requiring stopping, whichever comes first” for 18 ≤ t ≤ 24 }

• Objective: Find $t^{opt} \in [18, 24]$ leading to largest mean outcome (probability of no CVD event in 30 days)
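The duration regime indexed by t can be written as a rule mapping a patient's adverse-event status to a realized infusion time. The sketch below is illustrative; the function name `realized_duration` and its arguments are assumptions, not part of the trial protocol.

```python
def realized_duration(t, ae_time=None):
    """Hours of infusion actually delivered under the regime indexed by t.

    ae_time is the time (in hours) of an adverse event requiring stopping,
    or None if no such event occurs.
    """
    if not 18 <= t <= 24:
        raise ValueError("t must lie in [18, 24]")
    if ae_time is not None and ae_time < t:
        return ae_time  # adverse event before t hours: stop immediately
    return t            # no adverse event before t hours: stop at t hours

print(realized_duration(20))        # 20
print(realized_duration(20, 7.5))   # 7.5
print(realized_duration(20, 22.0))  # 20
```

Two patients following the same regime (t = 20) can therefore have different realized durations, which is why the recorded duration alone does not identify the regime followed.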

Thinking in terms of dynamic treatment regimes

Example: Treatment comparison in presence of treatment discontinuation
• SYNERGY trial – enoxaparin (ENOX) vs. unfractionated heparin (UFH) in ACS patients (open label)
• Primary (intent-to-treat) analysis: No difference
• Lots of treatment discontinuation (switching, stopping)
• Some mandatory due to adverse events, some at clinician/patient discretion
• How would the treatments compare if there were no discontinuation?

Objective: Compare the two dynamic treatment regimes “Take ENOX (UFH) until completion or discontinuation for mandatory reasons”

Zhang M, Tsiatis AA, Davidian M, Pieper KS, Mahaffey KW. (2011). Inference on treatment effects from a clinical trial in the presence of premature treatment discontinuation: The SYNERGY trial. Biostatistics, 12, 258–269.

Studying regimes based on observational data

Again: Data are observational with respect to these questions
• Decisions on duration, treatment discontinuation were not randomized
• Made at clinician/patient discretion

Difficulties for studying regimes:
• Confounding – subjects receiving one treatment or another may not be prognostically similar
• E.g., subjects who discontinued may be sicker, older, etc.
• Standard methods are available to adjust for confounding, e.g., regression, propensity scores, etc., assuming no unmeasured confounders
• However, the time-dependent nature of treatment causes additional complications

Studying regimes based on observational data

Time-dependent confounding: Treatments actually received over time depend on accruing information
• Temptation: “Adjust” for such time-dependent confounding
• E.g., a Cox model for outcome including time-dependent intermediate variables and treatments
• However: Part of the effect of treatment on outcome may be mediated through intermediate variables
• ⇒ Adjustment would incorrectly remove this effect and hence misrepresent the true treatment effect

Studying regimes based on observational data

Resolution:
• Requires a generalization of no unmeasured confounders
• Unverifiable from the observed data

Sequential randomization assumption: At any point where a treatment decision is made, the treatment received (among the options available) depends only on the accrued information on the patient and not additionally on his/her future prognosis
• At some level, this must be true
• In a SMART, this is automatically true by randomization
• With observational data, this is tenable only if all accrued information used to make decisions is available in the database

Studying regimes based on observational data

Under sequential randomization: Inference on dynamic treatment regimes
• Can use weighted methods similar to those discussed earlier for Case 2, extended to multiple decision points
• Critical difference: Rather than weighting based on known randomization probabilities, weighting is based on the propensities of receiving treatment at each decision as a function of accrued information
• Modeling/estimation of propensities

Moral:
• Many complex questions can be posed in terms of a class of dynamic treatment regimes
• Methods are available for inference on regimes in the class
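To make the "critical difference" concrete, the sketch below replaces a known randomization probability with a propensity estimated from the data. It is a deliberately simplified illustration, not the method of the cited papers: it assumes a single decision point, a single discrete covariate X, no unmeasured confounders, and propensities estimated by within-stratum frequencies. The function name `ipw_mean` and all simulated numbers are hypothetical.

```python
import random
from collections import defaultdict

def ipw_mean(X, A, Y, treatment):
    """Inverse-propensity-weighted estimate of the mean outcome if everyone
    received `treatment`, with propensities estimated within strata of X."""
    count = defaultdict(int)
    treated = defaultdict(int)
    for x, a in zip(X, A):
        count[x] += 1
        treated[x] += (a == treatment)
    propensity = {x: treated[x] / count[x] for x in count}
    num = sum(y / propensity[x] for x, a, y in zip(X, A, Y) if a == treatment)
    den = sum(1 / propensity[x] for x, a in zip(X, A) if a == treatment)
    return num / den

# Simulated observational data: sicker patients (X = 1) are more likely
# to receive treatment 1 and have worse outcomes. Numbers are made up.
random.seed(2)
n = 20000
X = [int(random.random() < 0.5) for _ in range(n)]
A = [int(random.random() < (0.8 if x == 1 else 0.3)) for x in X]
Y = [10.0 - 4.0 * x + 2.0 * a + random.gauss(0.0, 1.0) for x, a in zip(X, A)]

naive_1 = sum(y for y, a in zip(Y, A) if a == 1) / sum(A)  # confounded
ipw_1 = ipw_mean(X, A, Y, treatment=1)  # true mean under treatment 1 is 10 here
ipw_0 = ipw_mean(X, A, Y, treatment=0)  # true mean under treatment 0 is 8 here
print(naive_1, ipw_1, ipw_0)
```

With several decision points, the weight for a subject would instead be built from the product of the estimated propensities at each decision, which is where the modeling effort mentioned in the last bullet arises.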

Constructing dynamic treatment regimes

Cases 3 and 4: More complex regimes focused on personalizing treatment to the patient
• Case 3: D = specified class of feasible regimes
• Case 4: D = all possible regimes
• Rules involve accrued information on the patient

Can we estimate an optimal regime within these classes?
• From data from a SMART in which detailed accruing information was collected?
• From data from an observational database?

Characterizing an optimal regime

Demonstration: Characterize an optimal regime $d^{opt}$ in the class D of all possible regimes d (Case 4)
• Single decision point
• Two treatment options coded as {0, 1}
• d ∈ D is a single rule $d_1(X_1)$ taking values 0 or 1
• Data from a conventional clinical trial (simplest SMART)

$$(X_{1i}, A_{1i}, Y_i), \quad i = 1, \ldots, n \ \text{(iid)}$$

where $A_1$ is treatment received, taking values in {0, 1}
• Assume large outcomes are better

56/64 Dynamic Treatment Regimes Webinar Characterizing an optimal regime Potential outcome for a regime: For any regime d ∈ D • Y (0) and Y (1) are potential outcomes if a randomly chosen patient were to receive treatments 0 and 1, respectively • Potential outcome if a randomly chosen patient were to follow regime d (d) (1) (0) Y = Y I{d(X1) = 1} + Y I{d(X1) = 0} (1) (0) = Y d(X1) + Y {1 − d(X1)} • E(Y (d)) = mean outcome if all patients in the population were to follow regime d

Optimal regime d opt : d opt maximizes E(Y (d)) among all d ∈ D

• Can we estimated opt satisfying this from the trial data ?
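The definition of $Y^{(d)}$ and of the value $E(Y^{(d)})$ can be illustrated with a small simulation. The sketch below is not from the webinar: the generative model, the function names (`simulate_population`, `value`) and the three example rules are all assumptions chosen for illustration. Both potential outcomes are visible here only because the data are simulated; in a real trial each patient reveals just one of them.

```python
import random


def simulate_population(n, seed=0):
    """Draw (X1, Y0, Y1) for n hypothetical patients.

    Assumed model, for illustration only: X1 ~ Uniform(-1, 1),
    Y0 = 1 + noise and Y1 = 1 + 2*X1 + noise, so option 1 is
    better on average exactly when X1 > 0.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        y0 = 1.0 + rng.gauss(0.0, 0.5)
        y1 = 1.0 + 2.0 * x1 + rng.gauss(0.0, 0.5)
        population.append((x1, y0, y1))
    return population


def value(regime, population):
    """Monte Carlo approximation of E(Y^(d)) for a rule d: X1 -> {0, 1}."""
    total = 0.0
    for x1, y0, y1 in population:
        d = regime(x1)
        total += y1 * d + y0 * (1 - d)  # Y^(d) = Y1*d(X1) + Y0*{1 - d(X1)}
    return total / len(population)


def never_treat(x1):
    return 0


def always_treat(x1):
    return 1


def treat_if_positive(x1):
    return 1 if x1 > 0 else 0


population = simulate_population(50_000)
values = {
    rule.__name__: value(rule, population)
    for rule in (never_treat, always_treat, treat_if_positive)
}
for name, v in values.items():
    print(f"{name}: {v:.3f}")
```

Under this assumed model the two static rules have the same value, while the rule that uses $X_1$ has a larger one, which is the sense in which a regime that personalizes treatment can beat giving everyone the same option.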

Estimating an optimal regime

Observed outcome:
$$Y = Y^{(1)} I(A_1 = 1) + Y^{(0)} I(A_1 = 0) = Y^{(1)} A_1 + Y^{(0)} (1 - A_1)$$
• By randomization, $Y^{(0)}, Y^{(1)} \perp\!\!\!\perp A_1 \mid X_1$
$$\Rightarrow \; E(Y^{(1)} \mid X_1) = E(Y^{(1)} \mid X_1, A_1 = 1) = E(Y \mid X_1, A_1 = 1)$$
and similarly for $Y^{(0)}$

Thus:
$$\begin{aligned}
E(Y^{(d)}) &= E\{ E(Y^{(d)} \mid X_1) \} \\
&= E\big[ E(Y^{(1)} \mid X_1)\, d(X_1) + E(Y^{(0)} \mid X_1) \{1 - d(X_1)\} \big] \\
&= E\big[ E(Y^{(1)} \mid X_1, A_1 = 1)\, d(X_1) + E(Y^{(0)} \mid X_1, A_1 = 0) \{1 - d(X_1)\} \big] \\
&= E\big[ E(Y \mid X_1, A_1 = 1)\, d(X_1) + E(Y \mid X_1, A_1 = 0) \{1 - d(X_1)\} \big]
\end{aligned}$$
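The last line of this derivation involves only observable quantities, which can be checked numerically. The following sketch is an illustration under assumptions that are not in the slides: a binary baseline covariate, 1:1 randomization, a made-up outcome model, and hypothetical helper names. It compares the value computed from observed-data cell means with the value computed directly from the simulated potential outcomes.

```python
import random
from collections import defaultdict


def simulate_trial(n, seed=1):
    """Rows (X1, A1, Y, Y0, Y1) from a hypothetical randomized trial.

    Assumed model: X1 is binary, Y0 = 2 + noise, Y1 = 1 + 2*X1 + noise,
    and A1 is a fair coin flip, independent of (Y0, Y1) by construction.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.randint(0, 1)
        y0 = 2.0 + rng.gauss(0.0, 1.0)
        y1 = 1.0 + 2.0 * x1 + rng.gauss(0.0, 1.0)
        a1 = rng.randint(0, 1)
        y = y1 * a1 + y0 * (1 - a1)  # observed outcome
        rows.append((x1, a1, y, y0, y1))
    return rows


def cell_means(rows):
    """Sample analogue of E(Y | X1 = x, A1 = a) for each (x, a) cell."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x1, a1, y, _, _ in rows:
        sums[(x1, a1)] += y
        counts[(x1, a1)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def value_from_observed(regime, rows):
    """E[ E(Y|X1,A1=1) d(X1) + E(Y|X1,A1=0) {1 - d(X1)} ], observed data only."""
    m = cell_means(rows)
    total = 0.0
    for x1, *_ in rows:
        d = regime(x1)
        total += m[(x1, 1)] * d + m[(x1, 0)] * (1 - d)
    return total / len(rows)


def value_from_potential(regime, rows):
    """Mean of Y^(d); available only because the data are simulated."""
    total = 0.0
    for x1, _, _, y0, y1 in rows:
        d = regime(x1)
        total += y1 * d + y0 * (1 - d)
    return total / len(rows)


def treat_if_x1(x1):
    return x1  # give option 1 exactly when X1 = 1


rows = simulate_trial(40_000)
v_obs = value_from_observed(treat_if_x1, rows)
v_pot = value_from_potential(treat_if_x1, rows)
print(f"observed-data value: {v_obs:.3f}, potential-outcome value: {v_pot:.3f}")
```

The two numbers should agree up to sampling error. If `a1` were instead chosen in a way that depends on `y0` or `y1`, the independence step would fail and the agreement would generally break.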

Estimating an optimal regime

Recall: We wish to maximize
$$E(Y^{(d)}) = E\big[ E(Y \mid X_1, A_1 = 1)\, d(X_1) + E(Y \mid X_1, A_1 = 0) \{1 - d(X_1)\} \big]$$
• Clearly: $E(Y^{(d)})$ is maximized by
$$d^{opt}(X_1) = I\{ E(Y \mid X_1, A_1 = 1) > E(Y \mid X_1, A_1 = 0) \}$$
• $E(Y \mid X_1, A_1)$ is the regression of outcome on baseline information and treatment received

Suggests: Posit a regression model for $E(Y \mid X_1, A_1)$
$$Q(X_1, A_1; \beta)$$
• Fit the model to trial data ⇒ $Q(X_1, A_1; \hat\beta)$
• Estimated optimal regime
$$\hat d^{opt}(X_1) = I\{ Q(X_1, 1; \hat\beta) > Q(X_1, 0; \hat\beta) \}$$
• Issue: What if the model $Q(X_1, A_1; \beta)$ is misspecified?
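The regression-based recipe can be sketched in a few lines. Everything specific here is an assumption for illustration, not taken from the webinar: the simulated trial, the linear working model $Q(X_1, A_1; \beta) = \beta_0 + \beta_1 X_1 + A_1(\beta_2 + \beta_3 X_1)$, and the helper names. Because this working model has its own intercept and slope in each arm, its least-squares fit is the same as fitting one straight line per arm, which is what the code does.

```python
import random


def simulate_trial(n, seed=2):
    """(X1, A1, Y) from a hypothetical two-arm randomized trial.

    Assumed truth: E(Y | X1, A1) = 1 + 0.5*X1 + A1*(2*X1 - 0.5),
    so option 1 is better exactly when X1 > 0.25.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x1 = rng.uniform(-1.0, 1.0)
        a1 = rng.randint(0, 1)
        y = 1.0 + 0.5 * x1 + a1 * (2.0 * x1 - 0.5) + rng.gauss(0.0, 0.5)
        data.append((x1, a1, y))
    return data


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_q(data):
    """Return the fitted Q(x1, a1; beta-hat) as a function."""
    fits = {}
    for arm in (0, 1):
        xs = [x for x, a, _ in data if a == arm]
        ys = [y for _, a, y in data if a == arm]
        fits[arm] = fit_line(xs, ys)

    def q(x1, a1):
        intercept, slope = fits[a1]
        return intercept + slope * x1

    return q


def estimated_regime(q):
    """d-hat(x1) = I{ Q(x1, 1; beta-hat) > Q(x1, 0; beta-hat) }."""
    return lambda x1: 1 if q(x1, 1) > q(x1, 0) else 0


q_hat = fit_q(simulate_trial(4_000))
d_hat = estimated_regime(q_hat)
for x in (-0.9, 0.0, 0.5, 0.9):
    print(f"X1 = {x:+.1f} -> recommend option {d_hat(x)}")
```

In this sketch the working model contains the true regression, so the estimated rule recovers the true one closely. It says nothing about the misspecification issue raised above; with a wrong model for $E(Y \mid X_1, A_1)$ the plug-in rule can be far from optimal.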

Estimating an optimal regime

Shameless promotion: Discussion of estimation of an optimal regime within a broad class of regimes D, with a focus on personalized treatment as in Cases 3 and 4, merits its own shortcourse
• Robustness to misspecification of models?
• Alternative approaches?
• Extension to multiple decision points?
• Etc., etc.

Personalized Medicine and Dynamic Treatment Regimes
• Half-day shortcourse at 2015 ENAR Spring Meeting (Sunday, March 15, morning)

Forthcoming book: Kosorok, M. R. and Moodie, E. E. M. (2015). Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine. SIAM.

Discussion

• Dynamic treatment regimes formalize clinical decision-making and provide a framework for personalized treatment
• A broad range of problems can be cast in terms of dynamic treatment regimes
• SMARTs are the “gold standard” data source for estimation of dynamic treatment regimes
• Design considerations for SMARTs? Broader adoption? Implications for how treatments are evaluated?
• Estimation of optimal treatment regimes is a wide open area of research

Thought Leaders

Susan Murphy (2013 MacArthur Fellow) and Jamie Robins

Resources

Introductory material:
• http://methodology.psu.edu/
• http://www-personal.umich.edu/~dalmiral/
• http://www.huffingtonpost.com/american-statistical-association/being-smart-about-constru_b_4963862.html
• http://impact.unc.edu/Symposium2014Agenda

Literature: See the separate list of references

Appendix

Consistency of estimators for $E(Y^{(11)})$: With
$$Q_i = 1 - R_i + R_i I(Z_i = 1)\pi^{-1},$$
the estimators are
$$n^{-1} \sum_{i=1}^n Q_i Y_i \quad \text{or} \quad \left( \sum_{i=1}^n Q_i \right)^{-1} \sum_{i=1}^n Q_i Y_i,$$
and the observed outcome is
$$Y = (1 - R) Y^{(11)} + R I(Z = 1) Y^{(11)} + R I(Z = 2) Y^{(12)}$$

Want to show: $E(QY) = E(Y^{(11)})$
• Using $R(1 - R) = 0$, $I(Z = 1) I(Z = 2) = 0$, etc.
$$E(QY) = E\big[ Y^{(11)} \{(1 - R) + R I(Z = 1)\pi^{-1}\} \big] = E\big[ Y^{(11)} E\{(1 - R) + R I(Z = 1)\pi^{-1} \mid Y^{(11)}\} \big]$$
• So it suffices to show $E\{(1 - R) + R I(Z = 1)\pi^{-1} \mid Y^{(11)}\} = 1$

$$\begin{aligned}
E\{(1 - R) + R I(Z = 1)\pi^{-1} \mid Y^{(11)}\}
&= E\{(1 - R) + R I(Z = 1)\pi^{-1} \mid R = 0, Y^{(11)}\}\, P(R = 0 \mid Y^{(11)}) \\
&\quad + E\{(1 - R) + R I(Z = 1)\pi^{-1} \mid R = 1, Y^{(11)}\}\, P(R = 1 \mid Y^{(11)}) \\
&= P(R = 0 \mid Y^{(11)}) + E\{ I(Z = 1) \mid R = 1, Y^{(11)}\}\, \pi^{-1} P(R = 1 \mid Y^{(11)}) \\
&= P(R = 0 \mid Y^{(11)}) + P(R = 1 \mid Y^{(11)}) = 1
\end{aligned}$$

Because: By randomization, assignment to $M_1 \perp\!\!\!\perp Y^{(11)}$, so
$$E\{ I(Z = 1) \mid R = 1, Y^{(11)}\} = P(Z = 1 \mid R = 1, Y^{(11)}) = P(Z = 1 \mid R = 1) = \pi$$

For $k = 2$: Same argument, with $Q = 1 - R + R I(Z = 2)(1 - \pi)^{-1}$
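The consistency argument can also be checked by simulation. The sketch below uses an assumed data-generating model that is not from the webinar: a latent severity `u` drives both response $R$ and the potential outcomes, responders are randomized to $Z = 1$ with probability $\pi = 0.5$, and the true $E(Y^{(11)})$ is 5 by construction. The function names and the placeholder `z = 0` for nonresponders are illustrative only.

```python
import random


def simulate_smart(n, pi=0.5, seed=3):
    """Rows (Q, Y, Y11) from a hypothetical SMART.

    Assumed model: Y11 = 5 + u + noise, Y12 = 4 + u + noise, and
    R = 1 when u + noise > 0, so response is correlated with Y11.
    Responders get Z = 1 with probability pi, else Z = 2.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)
        y11 = 5.0 + u + rng.gauss(0.0, 1.0)
        y12 = 4.0 + u + rng.gauss(0.0, 1.0)
        r = 1 if u + rng.gauss(0.0, 1.0) > 0 else 0
        z = 0  # placeholder: nonresponders are not re-randomized
        if r == 1:
            z = 1 if rng.random() < pi else 2
        # Y = (1 - R) Y11 + R I(Z = 1) Y11 + R I(Z = 2) Y12
        y = (1 - r) * y11 + r * (z == 1) * y11 + r * (z == 2) * y12
        # Q = 1 - R + R I(Z = 1) / pi
        q = (1 - r) + r * (z == 1) / pi
        rows.append((q, y, y11))
    return rows


def ipw_estimators(rows):
    """Return n^{-1} sum(Q Y) and sum(Q)^{-1} sum(Q Y)."""
    n = len(rows)
    sum_qy = sum(q * y for q, y, _ in rows)
    sum_q = sum(q for q, _, _ in rows)
    return sum_qy / n, sum_qy / sum_q


rows = simulate_smart(50_000)
est_simple, est_ratio = ipw_estimators(rows)
oracle = sum(y11 for _, _, y11 in rows) / len(rows)  # needs the unobservable Y11
print(f"n^-1 sum(QY): {est_simple:.3f}  ratio form: {est_ratio:.3f}  oracle: {oracle:.3f}")
```

Both weighted estimators should land close to the oracle mean of $Y^{(11)}$, even though response depends on the same latent factor as the outcome. Replacing `pi` in the weight with a value other than the true randomization probability is a quick way to see the first estimator become biased.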
