Statistical Methods for Bridging Experimental Data and Dynamic Models with Biomedical Applications

Hulin Wu, Ph.D.

Dr. D.R. Seth Family Professor & Associate Chair, Department of Biostatistics, School of Public Health

Professor, School of Biomedical Informatics

University of Texas Health Science Center at Houston

Pittsburgh, March, 2017

Hulin Wu UTSPH March 2017

Outline

1. Introduction
2. Statistical estimation and inference methods for dynamic ODE models
   - Naive method: LS or MLE principle
   - Local solution and time-varying parameter problems
   - Smoothing-based methods
   - Sparse longitudinal data: mixed-effects ODE models
   - Bayesian methods
   - High-dimensional ODE models: ODE model selection
3. Other dynamic models
4. Ongoing and future work
5. Conclusions

Statistical Modeling Cultures

- Leo Breiman (Statistical Science, 2001): Two cultures
  - Data modeling (98% of statisticians): What do the data look like? E.g., regression models
  - Algorithmic modeling (2% of statisticians): No models; for prediction purposes, e.g., neural nets and decision trees
- A third culture:
  - Mechanistic modeling (<1% of statisticians): Build mathematical models based on the mechanisms behind the data
  - How are the data generated?
  - Goal: Understand principles or biological mechanisms

Dynamic Systems/Models

Many biological and other systems can be described by dynamic models:

- Differential equations:
  - Ordinary differential equations (ODE): simplest
  - Delay differential equations (DDE)
  - Hybrid differential equations (HDE)
  - Partial differential equations (PDE)
  - Stochastic differential equations (SDE)
- Difference equations and state-space models
- Stochastic process models: branching processes, etc.
- Agent-based models and cellular automata
- ...

Modeling Goals

- Forward problems: $\theta \mapsto P_\theta$ (easier to do)
  - Predictions
- Inverse problems: $Y \mapsto \theta \in \Theta$ (more challenging)
  - Determine model structures/forms
  - Estimate unknown parameters: $\theta$

A Dynamic System: ODE Model

$$\frac{d}{dt} X(t) = G[X(t), \theta], \quad X(0) = X_0 \tag{1}$$

$$Y(t_i) = H[X(t_i), \beta] + e(t_i), \quad e(t_i) \sim (0, \sigma^2 I), \quad i = 1, \dots, n \tag{2}$$

where

- $G(\cdot)$: linear or nonlinear functions
- $H(\cdot)$: observation functions
- $(\theta, \beta)$: unknown parameters
- $e(t_i)$: measurement error

The NLS method:

$$\min_{\theta, \beta, X_0} \sum_{i=1}^{n} \{Y(t_i) - H[X(t_i, \theta), \beta]\}^T \{Y(t_i) - H[X(t_i, \theta), \beta]\},$$

where $X(t_i, \theta)$ is evaluated numerically from Eq. (1).
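The NLS criterion above can be sketched in a few lines. This is only an illustration under stated assumptions, not the estimator used in the cited work: the right-hand side `G` is a hypothetical logistic-growth model, the observation function H is taken as the identity, and the data are simulated. `solve_ivp` and `least_squares` are standard SciPy routines; every other name is made up for the example.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def G(t, x, theta):
    """Hypothetical right-hand side of Eq. (1): logistic growth."""
    r, K = theta
    return r * x * (1.0 - x / K)

def solve_state(theta, x0, t_obs):
    """Numerically evaluate X(t_i, theta) at the observation times."""
    sol = solve_ivp(G, (t_obs[0], t_obs[-1]), [x0], t_eval=t_obs,
                    args=(theta,), rtol=1e-8, atol=1e-10)
    return sol.y[0]

def residuals(params, t_obs, y_obs):
    """Y(t_i) - H[X(t_i, theta), beta], with H assumed to be the identity."""
    r, K, x0 = params
    return y_obs - solve_state((r, K), x0, t_obs)

# Simulated data (assumed values, for illustration only).
rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 10.0, 25)
r_true, K_true, x0_true = 0.8, 10.0, 0.5
y_obs = solve_state((r_true, K_true), x0_true, t_obs) + rng.normal(0.0, 0.2, t_obs.size)

# Minimize the sum of squared residuals over (theta, X_0).
# Every residual evaluation re-solves the ODE, which is the cost noted in the slides.
fit = least_squares(residuals, x0=[0.5, 8.0, 1.0], args=(t_obs, y_obs),
                    bounds=([0.01, 1.0, 0.01], [5.0, 50.0, 5.0]))
r_hat, K_hat, x0_hat = fit.x
print(r_hat, K_hat, x0_hat)
```

The optimizer starts from a single initial guess, so it can stop at a local solution; that limitation is one of the problems listed on the next slide.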

Naive NLS Method: Challenging Problems

1. Identifiability problem
2. Local solutions
3. Time-varying parameters
4. Need to solve the forward problem numerically and many times: numerical error vs. measurement error
5. Slow convergence and high computational cost
6. Sparse longitudinal data problem
7. Nonlinear optimization
8. High-dimensional parameter space

These problems motivate new statistical methods for dynamic models.

Identifiability issues

- Theoretical identifiability: mathematical identifiability
- Practical identifiability: statistical and numerical identifiability
- Need to be investigated before the ...
- How to deal with unidentifiable models?
  - Simplify or revise the model
  - Lump some parameters together
  - Fix some parameters
  - Bayesian approach: use priors

Identifiability issues: References

- Wu, H., Zhu, H., Miao, H., and Perelson, A.S. (2008), Parameter Identifiability and Estimation of HIV/AIDS Dynamic Models, Bulletin of Mathematical Biology, 70(3), 785-799.
- Miao, H., Dykes, C., Demeter, L.M., Cavenaugh, J., Park, S.Y., Perelson, A.S., and Wu, H. (2008), Modeling and Estimation of Kinetic Parameters and Replicative Fitness of HIV-1 from Flow-Cytometry-Based Growth Competition Experiments, Bulletin of Mathematical Biology, 70, 1749-1771.
- Miao, H., Dykes, C., Demeter, L., Wu, H. (2009), Differential Equation Modeling of HIV Viral Fitness Experiments: Model Identification, Model Selection, and Multi-Model Inference, Biometrics, 65, 292-300.
- Liang, H., Miao, H., and Wu, H. (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
- Miao, H., Xia, X., Perelson, A.S., Wu, H. (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1): 3-39.

Naive NLS Method: Local solution and numerical error problems

- Local solution problem:
  - Global optimization methods: differential evolution and genetic algorithms (Storn et al. 1997)
  - Mixture of stochastic global optimization and deterministic methods: scatter search method (Rodriguez-Fernandez et al. 2006)
- Numerical error problem:
  - Xue, Miao and Wu (Annals of Statistics, 2010): theoretical results on numerical error vs. measurement error

Naive NLS Method: Time-varying parameter problem

Xue, Miao and Wu, Annals of Statistics (2010)

$$\frac{dX(t)}{dt} = F\{t, X(t), \theta, \eta(t)\}$$

- The spline approach can be used to approximate the time-varying parameter:

$$\eta(t) = \pi(t)^T \alpha,$$

  where $\pi(t) = (B_1(t), \dots, B_N(t))^T$ is a vector of basis functions.

- The time-varying coefficient ODE model becomes an ODE model with constant parameters:

$$\frac{dX(t)}{dt} = F\{t, X(t), \theta, \pi(t)^T \alpha\}$$
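The substitution of a spline expansion for the time-varying parameter can be sketched as follows. The example is a minimal illustration under assumptions: the function F is a hypothetical production/clearance model, the basis is a clamped cubic B-spline basis built with SciPy's `BSpline`, and the helper names (`make_basis`, `eta`, `solve_state`) are invented here.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import BSpline

def make_basis(t_min, t_max, n_interior=3, degree=3):
    """Clamped B-spline basis pi(t) = (B_1(t), ..., B_N(t)) on [t_min, t_max]."""
    interior = np.linspace(t_min, t_max, n_interior + 2)[1:-1]
    knots = np.r_[[t_min] * (degree + 1), interior, [t_max] * (degree + 1)]
    n_basis = len(knots) - degree - 1
    # Identity coefficients make the spline return all N basis values at once.
    return BSpline(knots, np.eye(n_basis), degree), n_basis

basis, N = make_basis(0.0, 5.0)

def eta(t, alpha):
    """Time-varying parameter eta(t) = pi(t)^T alpha."""
    return basis(t) @ alpha

def rhs(t, x, theta, alpha):
    # Hypothetical F: production at rate theta, clearance at rate eta(t).
    return theta - eta(t, alpha) * x

def solve_state(theta, alpha, x0, t_eval):
    """Solve the ODE whose only unknowns are the constants (theta, alpha)."""
    sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), [x0], t_eval=t_eval,
                    args=(theta, alpha), rtol=1e-8, atol=1e-10)
    return sol.y[0]

t_grid = np.linspace(0.0, 5.0, 11)
alpha = np.full(N, 0.3)   # equal coefficients give a constant eta(t) = 0.3
x_path = solve_state(0.0, alpha, 1.0, t_grid)
print(x_path[-1])
```

Once eta(t) is written this way, the vector alpha can be estimated together with theta by the same NLS machinery used for constant parameters.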

Smoothing-Based Approaches: ODE Computational Problem

- Earlier ideas: Hemker (1972) and Varah (1982)
- Two-stage decoupling approaches: Chen and Wu (JASA 2008, Statistica Sinica 2008) and Liang and Wu (JASA, 2008)
- Parameter cascading method: Ramsay et al., JRSS-B (2007) and Wang et al., Stat Comput (2014)

Smoothing-Based Approaches: Two-Stage Method

Chen and Wu (JASA 2008, Statistica Sinica 2008) and Liang and Wu (JASA, 2008):

$$X'(t_i) = F[X(t_i), \theta] \tag{3}$$

$$Y(t_i) = X(t_i) + e_1(t_i), \quad e_1(t_i) \sim (0, \sigma^2 I) \tag{4}$$

- Step 1: Use nonparametric smoothing to estimate $X(t)$ and $X'(t)$ from model (4).
- Step 2: Substitute the estimates $\hat{X}(t_i)$ and $\hat{X}'(t_i)$ into model (3) to obtain:

$$\hat{X}'(t_i) = F[\hat{X}(t_i), \theta] + e_2(t_i). \tag{5}$$

  Then fit the regression model (5) to estimate $\theta$.

- $F(\cdot)$: linear or nonlinear
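The two steps can be sketched as below. This is an illustration under assumptions rather than the estimators of the cited papers: the smoother is SciPy's `UnivariateSpline` (the papers use other smoothers such as local polynomials), the noise level is treated as known when choosing the smoothing factor, and F is a hypothetical logistic model written as F[X, θ] = aX + bX², which is linear in θ = (a, b).

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Simulated noisy observations of a logistic curve (assumed values).
rng = np.random.default_rng(0)
r_true, K_true, x0, sigma = 0.8, 10.0, 0.5, 0.05
t = np.linspace(0.0, 10.0, 60)
x_true = K_true / (1.0 + (K_true / x0 - 1.0) * np.exp(-r_true * t))
y = x_true + rng.normal(0.0, sigma, t.size)

# Step 1: smooth the data to estimate X(t) and X'(t); no ODE solver is called.
smoother = UnivariateSpline(t, y, k=3, s=t.size * sigma**2)
x_hat = smoother(t)
dx_hat = smoother.derivative()(t)

# Step 2: regress the estimated derivative on F[X_hat, theta] = a X + b X^2.
# A few boundary points are dropped because derivative estimates are poorer there.
inner = slice(3, -3)
design = np.column_stack([x_hat[inner], x_hat[inner] ** 2])
(a_hat, b_hat), *_ = np.linalg.lstsq(design, dx_hat[inner], rcond=None)

r_hat, K_hat = a_hat, -a_hat / b_hat   # logistic model: a = r, b = -r / K
print(r_hat, K_hat)
```

The second step is an ordinary least-squares fit, which is why standard regression software applies; the estimate inherits the smoothing error from Step 1.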

Smoothing-Based Approaches: Two-Step Methods

- Step 2 decouples the system of ODEs: fit the ODEs one by one
- Convert ODE models to regression: standard regression software tools can be used
- Avoid numerically solving the ODEs
- Computationally fast and efficient: easy to deal with high-dimensional ODEs
- Price to pay:
  - The estimate may not be accurate
  - The decoupled system: some information is lost
  - The "coupled" property is destroyed
- Extension to higher-order numerical discretization-based algorithms: Wu, Xue and Kumar (Biometrics 2012)

Parameter Cascading or Profiling Method

Ramsay, Hooker, Campbell, Cao, JRSS-B, 2007

Fitting to data

- Observations: $y(t_i)$
- Nonparametric function: $f(t) = \phi(t)^T c$
- Fitting to data: $C_1 = \sum_{i=1}^{n} [y(t_i) - f(t_i)]^2$

Fidelity to DE $x'(t) = g(x|\beta)$

- $f'(t) = \phi'(t)^T c$
- Difference between two sides of DE: $Lf(t) = f'(t) - g(f(t)|\beta)$
- Fidelity to DE: $C_2 = \int [Lf(t)]^2 \, dt$

Criterion to estimate $c$: $J(c|\beta) = C_1 + \lambda C_2$

Criterion to estimate $\beta$: $H(\beta) = \sum_{i=1}^{n} [y(t_i) - \phi(t_i)^T \hat{c}(\beta)]^2$
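The nested structure of the two criteria can be sketched for a deliberately simple case. The assumptions are loud ones: the DE is the hypothetical linear model x'(t) = -βx(t), so J(c|β) is quadratic in c and the inner minimizer has a closed form; the integral in C2 is replaced by a trapezoid rule; λ is fixed at an arbitrary value; and the data are simulated. None of these choices comes from the cited paper.

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize_scalar

# Simulated data from x(t) = 2 exp(-0.7 t) (assumed values).
rng = np.random.default_rng(1)
t_obs = np.linspace(0.0, 4.0, 30)
beta_true = 0.7
y = 2.0 * np.exp(-beta_true * t_obs) + rng.normal(0.0, 0.05, t_obs.size)

# Cubic B-spline basis phi(t) and its derivative phi'(t).
degree = 3
interior = np.linspace(0.0, 4.0, 10)[1:-1]
knots = np.r_[[0.0] * (degree + 1), interior, [4.0] * (degree + 1)]
n_basis = len(knots) - degree - 1
phi = BSpline(knots, np.eye(n_basis), degree)
dphi = phi.derivative()

# Trapezoid-rule nodes and weights approximating the integral in C2.
t_quad = np.linspace(0.0, 4.0, 201)
w = np.full(t_quad.size, t_quad[1] - t_quad[0])
w[[0, -1]] *= 0.5
Phi_obs, Phi_q, dPhi_q = phi(t_obs), phi(t_quad), dphi(t_quad)

def c_hat(beta, lam):
    """Inner step: minimize J(c|beta) = C1 + lam * C2 over c (closed form here)."""
    L = dPhi_q + beta * Phi_q   # rows of L f = f' - g(f|beta), with g(x|beta) = -beta x
    A = Phi_obs.T @ Phi_obs + lam * (L * w[:, None]).T @ L
    return np.linalg.solve(A, Phi_obs.T @ y)

def H(beta, lam=100.0):
    """Outer criterion: data misfit of the profiled curve phi(t)^T c_hat(beta)."""
    return np.sum((y - Phi_obs @ c_hat(beta, lam)) ** 2)

res = minimize_scalar(H, bounds=(0.01, 3.0), method="bounded")
beta_hat = res.x
print(beta_hat)
```

For a nonlinear g the inner step has no closed form and must itself be solved iteratively, which is where most of the computational cost of profiling comes from.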

Numerical Comparisons: NLS, Profiling and Two-Stage Estimates

Ding and Wu, Statistica Sinica, 2014

- NLS: not stable in reaching the global solution; computationally expensive
- Profiling:
  - A 3-step iterative procedure
  - More stable than NLS in reaching a better solution
  - Computational efficiency: similar to NLS
- Two-stage method: computationally fast, but not accurate

Sparse Longitudinal Data Problem: Mixed-Effects Modeling Approaches

Deal with sparse data: borrow information across subjects

- The MLE principle: Nonlinear Mixed-Effects Modeling (NLME)
  - Treat the ODE solution as a nonlinear regression function
  - Computational challenge: Stochastic Approximation EM (SAEM)
- Two-stage smoothing-based mixed-effects modeling approaches
  - Fang, Wu and Zhu, Statistica Sinica (2011)
  - Linear ODE: linear mixed-effects model (LME)
  - Nonlinear ODE: NLME
- Bayes methods
  - A three-stage hierarchical model: implemented by MCMC
  - Computation: expensive

Mixed-Effects ODE Model: NLME

- Within-subject variation:

$$\frac{d}{dt} X_i(t) = G[X_i(t), \theta_i], \quad X_i(0) = X_{i0} \tag{6}$$

$$Y_i(t_i) = H_i[X_i(t_i), \theta_i] + e_i(t_i), \quad i = 1, \dots, n$$

  - $X_i(t_i)$: ODE solution for Subject $i$
  - $Y_i = (y_{i1}(t_1), \dots, y_{im_i}(t_{m_i}))^T$: data from Subject $i$
  - $e_i = (e_i(t_1), \dots, e_i(t_{m_i}))^T \sim N(0, \sigma^2 I_{m_i})$: measurement error

- Between-subject variation:

$$\theta_i = \mu + b_i, \quad [b_i | \Sigma] \sim N(0, \Sigma)$$

  - $\mu$: population parameter
  - $b_i$: random effects

- Estimation and inference: Stochastic Approximation EM (SAEM)
  - Delyon, Lavielle and Moulines (1999), Kuhn and Lavielle (2005), Grenier, Louvet, Vigneaux (2014)

Smoothing-based Two-Stage Mixed-Effects Model

Fang, Wu and Zhu, Statistica Sinica (2011):

$$X'(t_i) = F[X(t_i), \theta] \tag{7}$$

$$Y(t_i) = X(t_i) + e_1(t_i), \quad e_1(t_i) \sim (0, \sigma^2 I) \tag{8}$$

- Step 1: Use nonparametric smoothing to estimate $X(t)$ and $X'(t)$ from model (8).
- Step 2: Substitute the estimates $\hat{X}(t_i)$ and $\hat{X}'(t_i)$ into model (7) to obtain:

$$\hat{X}'(t_i) = F[\hat{X}(t_i), \theta] + e_2(t_i). \tag{9}$$

- Convert model (9) into an LME or an NLME, according to whether $F(x)$ is linear or nonlinear.
- Fit the LME or NLME using a standard approach or the SAEM method

Bayesian Methods: Borrow Information to Deal with Sparse Data and Identifiability Problems

Huang, Liu and Wu, Biometrics (2006): Example

- A viral dynamic model: describes the population dynamics of HIV and its target cells in plasma

$$\begin{aligned}
\frac{d}{dt} T &= \lambda - \rho T - [1 - \gamma(t)] k T V \\
\frac{d}{dt} T^* &= [1 - \gamma(t)] k T V - \delta T^* \\
\frac{d}{dt} V &= N \delta T^* - c V
\end{aligned} \tag{10}$$

- $T$, $T^*$, $V$: target uninfected cells, infected cells, virus
- $\gamma(t)$: time-varying antiviral drug efficacy
- $(\lambda, \rho, k, \delta, N, c)$: unknown parameters to be estimated
- The equations (10): no closed-form solution
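Because system (10) has no closed-form solution, it has to be solved numerically whenever the model is evaluated. The sketch below does this with SciPy's `solve_ivp`. The parameter values are the posterior means of the population parameters reported in the results table of these slides; the initial state and the constant efficacy functions are hypothetical values chosen only for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Posterior means of the population parameters from the results table in these slides.
PARAMS = dict(lam=100.645, rho=0.0997, k=9.183e-6, delta=0.3729, N=1004.988, c=2.9867)

def hiv_rhs(t, state, gamma, lam, rho, k, delta, N, c):
    """Right-hand side of system (10); gamma is a function of time."""
    T, T_star, V = state
    new_infections = (1.0 - gamma(t)) * k * T * V
    return [lam - rho * T - new_infections,
            new_infections - delta * T_star,
            N * delta * T_star - c * V]

def simulate(gamma, y0, t_eval, params=PARAMS):
    """Numerically solve system (10) for a given efficacy function gamma(t)."""
    args = (gamma, params["lam"], params["rho"], params["k"],
            params["delta"], params["N"], params["c"])
    return solve_ivp(hiv_rhs, (t_eval[0], t_eval[-1]), y0, t_eval=t_eval,
                     args=args, rtol=1e-8, atol=1e-10)

t_eval = np.linspace(0.0, 20.0, 81)
y0 = [500.0, 20.0, 1.0e5]                        # hypothetical T, T*, V at baseline
perfect = simulate(lambda t: 1.0, y0, t_eval)    # gamma(t) = 1: infection fully blocked
partial = simulate(lambda t: 0.8, y0, t_eval)    # hypothetical constant 80% efficacy
print(np.log10(perfect.y[2, -1]), np.log10(partial.y[2, -1]))
```

With gamma(t) = 1 the infected-cell equation reduces to pure exponential decay at rate δ, which gives a simple check on the numerical solution.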

Antiviral Drug Efficacy Model

- A modified $E_{max}$ (M-M) model for drug efficacy:

$$\gamma(t) = \frac{C(t) A(t)}{\phi \, IC_{50}(t) + C(t) A(t)} = \frac{IQ(t) A(t)}{\phi + IQ(t) A(t)}, \quad 0 \le \gamma(t) \le 1 \tag{11}$$

- $C(t)$: the plasma drug concentration
- $A(t)$: drug adherence measurements
- $IC_{50}$: in vitro phenotype drug resistance marker
- $\phi$: a conversion factor parameter
- $IQ = C(t) / IC_{50}(t)$: the Inhibitory Quotient (IQ)
- If $\gamma(t) = 1$, the drug is 100% effective
- If $\gamma(t) = 0$, the drug has no effect

Drug Susceptibility Model

- The phenotype marker $IC_{50}$ is used to quantify agent-specific drug sensitivity
- The following function describes changes over time in $IC_{50}$:

$$IC_{50}(t) = \begin{cases} I_0 + \dfrac{I_r - I_0}{t_r} \, t & \text{for } 0 < t < t_r, \\ I_r & \text{for } t \ge t_r, \end{cases} \tag{12}$$

- $I_0$ and $I_r$: respective values of $IC_{50}(t)$ at baseline and at time point $t_r$, at which drug-resistant mutations appear
- If $I_r = I_0$, no resistance mutation developed during treatment
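Equations (11) and (12) translate directly into two small functions. In this sketch the value of φ is the posterior mean reported in the results table of these slides; the concentration, adherence, I0, Ir and tr values are hypothetical and serve only to exercise the formulas.

```python
import numpy as np

def ic50(t, I0, Ir, tr):
    """Eq. (12): linear change from I0 to Ir before tr, then constant at Ir."""
    t = np.asarray(t, dtype=float)
    return np.where(t < tr, I0 + (Ir - I0) / tr * t, Ir)

def efficacy(C, A, ic50_t, phi):
    """Eq. (11): gamma(t) = C(t) A(t) / (phi IC50(t) + C(t) A(t))."""
    return C * A / (phi * ic50_t + C * A)

t = np.array([0.0, 28.0, 56.0, 112.0])            # days
ic = ic50(t, I0=4.0, Ir=12.0, tr=56.0)            # hypothetical I0, Ir, tr
gam = efficacy(C=20.0, A=0.9, ic50_t=ic, phi=2.1091)
print(ic, gam)
```

As resistance develops and IC50(t) rises, the computed efficacy falls, which is the mechanism that links the resistance marker to viral rebound in model (10).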

A Challenging Problem

How to estimate the unknown parameters in the complex dynamic model?

- Difficulties:
  - Identifiability problem: too many parameters, $(\phi, \lambda, \rho, k, \delta, N, c)$, some of which are not identifiable
  - Data from individuals: sparse, only $V(t)$ measured
  - Nonlinear differential equation model: no closed-form solutions

Viral load data from a clinical trial

[Figure: real data up to day 112; log10(RNA) (copies/ml) versus time (days), with the level log10(50) marked.]

Bayesian Modeling

- A three-stage Bayesian hierarchical model
- Stage 1. Within-subject variation:

$$y_i = f_i(\theta_i) + e_i, \quad [e_i | \sigma^2, \theta_i] \sim N(0, \sigma^2 I_{m_i})$$

  - $f_i(\theta_i) = (f_{i1}(\theta_i, t_1), \dots, f_{im_i}(\theta_i, t_{m_i}))^T$: ODE solutions
  - $y_i = (y_{i1}(t_1), \dots, y_{im_i}(t_{m_i}))^T$: data from Subject $i$
  - $e_i = (e_i(t_1), \dots, e_i(t_{m_i}))^T$: measurement error

- Stage 2. Between-subject variation:

$$\theta_i = \mu + b_i, \quad [b_i | \Sigma] \sim N(0, \Sigma)$$

- Stage 3. Hyperprior distributions:

$$\sigma^{-2} \sim Ga(a, b), \quad \mu \sim N(\eta, \Lambda), \quad \Sigma^{-1} \sim Wi(\Omega, \nu)$$

  - Gamma ($Ga$), Normal ($N$) and Wishart ($Wi$): independent distributions
  - Hyper-parameters $a$, $b$, $\eta$, $\Lambda$, $\Omega$ and $\nu$: known

Bayesian Estimation: Implementation

- Choose prior distributions
  - Informative prior and non-informative prior
  - Rule of thumb: choose non-informative prior distributions for parameters of interest
- Implement the MCMC algorithm
  - Gibbs sampling step: closed form of conditional distributions for $\sigma^{-2}$, $\mu$, $\Sigma^{-1}$
  - Metropolis-Hastings step: no closed form of conditional distributions for $\theta_i$
- Run a long chain: choose the number of iterations, discard an initial "burn-in", and keep every fifth sample
- Obtain posterior distributions (posterior means or credible intervals) based on the final MCMC samples
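The alternation between a Gibbs step and a Metropolis-Hastings step can be sketched on a much smaller problem. This is a single-subject simplification and not the three-stage hierarchical model of the slides: the "ODE solution" is the closed-form decay curve 3·exp(-θt), there are no random effects, and the hyper-parameter values, proposal scale, chain length and burn-in are all assumed for illustration.

```python
import numpy as np

# Simulated data from a hypothetical one-parameter decay model.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 5.0, 20)
theta_true, sigma_true = 0.6, 0.1
y = 3.0 * np.exp(-theta_true * t) + rng.normal(0.0, sigma_true, t.size)

def f(theta):
    """Model prediction; stands in for a numerically solved ODE."""
    return 3.0 * np.exp(-theta * t)

a, b = 0.01, 0.01        # assumed Ga(a, b) prior (shape, rate) on sigma^-2
eta, Lam = 1.0, 10.0     # assumed N(eta, Lam) prior mean and variance for theta

def log_cond_theta(theta, tau):
    """Log conditional density of theta given tau = sigma^-2, up to a constant."""
    return -0.5 * tau * np.sum((y - f(theta)) ** 2) - 0.5 * (theta - eta) ** 2 / Lam

theta, tau = 1.0, 1.0
theta_draws, tau_draws = [], []
for it in range(6000):
    # Gibbs step: the conditional of sigma^-2 is a Gamma distribution.
    sse = np.sum((y - f(theta)) ** 2)
    tau = rng.gamma(a + 0.5 * t.size, 1.0 / (b + 0.5 * sse))
    # Metropolis-Hastings step: random-walk proposal for theta.
    proposal = theta + rng.normal(0.0, 0.05)
    if np.log(rng.uniform()) < log_cond_theta(proposal, tau) - log_cond_theta(theta, tau):
        theta = proposal
    # Discard the burn-in and keep every fifth sample.
    if it >= 1000 and it % 5 == 0:
        theta_draws.append(theta)
        tau_draws.append(tau)

theta_draws, tau_draws = np.array(theta_draws), np.array(tau_draws)
post_mean = theta_draws.mean()
ci_low, ci_high = np.percentile(theta_draws, [2.5, 97.5])
print(post_mean, (ci_low, ci_high))
```

In the full model each evaluation of f requires solving the ODE system numerically for every subject, which is why the slides describe the computation as expensive.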

A Clinical Study: A5055

- A study of HIV-1 infected patients failing protease inhibitor-containing therapies
- Two salvage regimens: 44 patients
  - Arm A: IDV 800 mg q12h + RTV 200 mg q12h + two NRTIs
  - Arm B: IDV 400 mg q12h + RTV 400 mg q12h + two NRTIs
- Plasma HIV-1 RNA (viral load) measured at days 0, 7, 14, 28, 56, 84, 112, 140 and 168 of follow-up

Clinical Data: Results of Population Parameters

Parameter   PM              SD              95% CI
φ           2.1091          0.6354          (1.2143, 3.6392)
c           2.9867          0.1466          (2.7139, 3.2881)
δ           0.3729          0.0184          (0.3387, 0.4105)
λ           100.645         4.9431          (91.497, 110.830)
ρ           0.0997          0.0049          (0.0905, 0.1099)
N           1004.988        49.795          (912.074, 1106.654)
k           9.183 × 10^-6   0.290 × 10^-6   (8.632 × 10^-6, 9.774 × 10^-6)

PM: posterior mean; SD: standard deviation; CI: credible interval.

- The posterior mean for the population parameter φ is 2.1091, with an SD of 0.6354 and a 95% CI of (1.2143, 3.6392)
- As φ transforms the in vitro IC50 into the in vivo IC50, our estimate shows about a 2-fold difference between in vitro IC50 and in vivo IC50

Clinical Data: Results of Individual Parameters

Patient   φi      ci      δi      λi         ρi      Ni         ki              e
1         0.447   2.254   0.270   410.462    0.024   456.757    8.33 × 10^-6    0.97
2         5.371   2.969   1.183   29.619     0.426   4795.813   10.84 × 10^-6   0.17
3         3.723   2.283   0.456   36.877     0.289   3258.347   8.66 × 10^-6    0.37
4         4.960   2.761   0.798   44.956     0.313   3051.988   9.09 × 10^-6    0.34
5         7.066   2.306   0.663   71.295     0.201   2735.239   6.54 × 10^-6    0.64
6         0.786   4.633   0.183   375.882    0.025   247.416    11.18 × 10^-6   0.89
7         0.091   7.008   0.299   4015.398   0.003   30.559     18.54 × 10^-6   0.98
8         8.484   2.280   0.663   32.722     0.416   4530.531   8.37 × 10^-6    0.24

- The individual-specific parameter estimates suggest a large inter-subject variation
- The model provides a good fit to the clinical data

Patient 1

Fitted individual curves, drug efficacy, IC50 and adherence with IQ = c12h/IC50

[Figure, Patient 1: four panels versus time (days): IC50 (IDV, RTV), adherence (IDV, RTV), log10(RNA), and drug efficacy.]

Patient 2

[Figure, Patient 2: four panels versus time (days): IC50 (IDV, RTV), adherence (IDV, RTV), log10(RNA), and drug efficacy.]

Patient 3

[Figure, Patient 3: four panels versus time (days): IC50 (IDV, RTV), adherence (IDV, RTV), log10(RNA), and drug efficacy.]

Bayesian Methods: Pros & Cons

- Pros
  - Use priors to solve the identifiability problem
  - Deal with extremely complicated models such as nonlinear differential equation models
  - Borrow information across subjects:
    - Deal with sparse longitudinal data
    - Estimate parameters for both population and individuals
  - Always get reasonable estimates
  - Use posterior distributions: easy to quantify "uncertainty" for inference
- Cons
  - Computation: complex and expensive
  - Prior: may dominate the results

High-Dimensional ODEs

- Require computationally fast and efficient methods
- Need to incorporate selection approaches: LASSO, SCAD, etc.
- Easy to deal with longitudinal data: mixed-effects models
- Two-stage smoothing-based method: good for this purpose

Linear ODEs

Time course gene expression data: dynamic gene regulatory network (GRN) reconstruction (Lu, Liang, Li and Wu, JASA 2011)

$$\frac{dx_i}{dt} = \sum_{j=1}^{n} \theta_{ij} x_j, \quad i = 1, \dots, n, \tag{13}$$

- When $n$ is small, standard statistical inference and variable selection methods can be used
- When $n$ is large: curse of dimensionality

High-Dimensional Linear ODE: Identifying Significant Regulations

Two-Stage Method (Chen and Wu 2008a, 2008b; Liang and Wu 2008):

- Obtain mean expression curves and their derivatives, $\hat{M}_k(t)$ and $\hat{M}'_k(t)$, from Step II.
- Substitute $\hat{M}_k(t)$ and $\hat{M}'_k(t)$ into the ODE model to form a regression model

High Dimensional Linear Regression Model

$$y_k(t) = \sum_{j=1}^{p} \beta_{kj} x_j(t) + \varepsilon_k(t), \quad k = 1, \dots, p; \quad t = t_1, t_2, \dots, t_N,$$

where $y_k(t) = \hat{M}'_k(t)$ and $x_j(t) = \hat{M}_j(t)$.
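Once the ODE for module k has been converted to the linear regression above, a penalized fit selects the nonzero coefficients. The sketch below uses the LASSO penalty with a plain coordinate-descent solver written in NumPy; the cited work uses SCAD, so this is a swapped-in, simpler penalty. The design matrix is random synthetic data standing in for the smoothed curves, and the penalty level `lam` is an arbitrary assumed value.

```python
import numpy as np

def soft_threshold(z, g):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2N)||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    N, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / N
    r = y.astype(float).copy()            # current residual y - X b
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]           # remove coordinate j from the fit
            rho = X[:, j] @ r / N
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]           # add the updated coordinate back
    return b

# Synthetic stand-in: columns of X play the role of the smoothed curves x_j(t),
# y plays the role of the estimated derivative y_k(t) for one module k.
rng = np.random.default_rng(0)
n_times, p = 100, 10
X = rng.normal(size=(n_times, p))
beta_true = np.zeros(p)
beta_true[[0, 3]] = [1.5, -2.0]
y = X @ beta_true + rng.normal(0.0, 0.05, n_times)

beta_hat = lasso_cd(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(beta_hat) > 1e-8)
print(selected, beta_hat[selected])
```

Because the two-stage method decouples the system, this fit is run separately for each k, so the cost grows only linearly in the number of equations.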

High Dimensional Model Selection

- Two-stage method
  - Decouple the high-dimensional ODEs
  - Convert the ODE model into a simple linear model
  - Computationally fast
- Stepwise selection and subset selection
- Bridge selection (Frank and Friedman 1993)
- Least absolute shrinkage and selection operator (LASSO) (Tibshirani 1996)
- Smoothly Clipped Absolute Deviation (SCAD) (Fan and Li 2001; Kim, Choi and Oh 2008)

Estimation Refinement: Stochastic Approximation EM (SAEM) Algorithm

Mixed-Effects ODE Model for Module k

$$\frac{dx_{ki}}{dt} = \sum_{j=1}^{m_k} \beta_{kij} M_{[kj]}(t), \quad i = 1, \dots, n_k; \quad k = 1, \dots, p, \tag{14}$$

Longitudinal Measurement Model

$$y_{ki}(t) = x_{ki}(t) + \varepsilon_{ki}(t) \tag{15}$$

Random Effects Model

$$\beta_{ki} = \beta_k + b_{ki}, \quad b_{ki} \sim N(0, D_k) \tag{16}$$

Application: Identification of Dynamic GRN for Yeast Cell Cycle

DNA microarray experiment: 18 equally spaced time points during two cell cycles (Spellman 1998)

- Step I: 800 significant genes identified
- Step II: Cluster the 800 genes into 41 functional modules
- Step III: Smoothing
- Step IV: Linear ODE model identification: SCAD variable selection
- Step V: Estimation refinement
- Step VI: Function enrichment analysis

Yeast Cell Cycle Gene Expression Profile

[Figure, shown over three slides: gene expression profiles for Modules 1 to 41, one panel per module; Modules 8, 9, 14, 15, 18, 20, 25, 30, 32, 38 and 41 are marked with an asterisk.]

Graph of Yeast Cell Cycle GRN

[Figure: network graph whose nodes are the 41 modules, labeled 1 to 41.]

High-Dimensional Nonlinear/Nonparametric ODEs

- Generalized ODEs: Miao, Wu and Xue, Journal of the American Statistical Association (2014)
- Sparse additive ODEs: Wu, Lu, Xue and Liang, Journal of the American Statistical Association (2014)
- Additive nonlinear ODEs: Xue, Wu, Wu and Wu, a manuscript (2017)

Other Dynamic Models: State-Space Models (SSM)

Linear SSM:

$$X_{t+1} = F_t X_t + V_t, \quad V_t \sim (0, Q_t) \tag{17}$$

$$Y_t = G_t X_t + W_t, \quad W_t \sim (0, R_t) \tag{18}$$

where

- $V_t$ and $W_t$: independent model noise and measurement noise
- Standard Kalman filter (Kalman, 1960): the core algorithm for prediction and smoothing of state vectors
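The filtering recursion for the linear SSM (17)-(18) can be sketched as follows. For brevity the sketch assumes time-invariant matrices F, G, Q and R (the slide allows them to vary with t) and Gaussian noise; the function name, the test data and the diffuse initial covariance are choices made for this example.

```python
import numpy as np

def kalman_filter(Y, F, G, Q, R, x0, P0):
    """Filtered means and covariances for X_{t+1} = F X_t + V_t, Y_t = G X_t + W_t."""
    x, P = x0, P0
    means, covs = [], []
    for y in Y:
        # Update: combine the prediction with the observation Y_t.
        S = G @ P @ G.T + R                    # innovation covariance
        K = P @ G.T @ np.linalg.inv(S)         # Kalman gain
        x = x + K @ (y - G @ x)
        P = (np.eye(len(x)) - K @ G) @ P
        means.append(x)
        covs.append(P)
        # Predict: propagate the state one step ahead.
        x = F @ x
        P = F @ P @ F.T + Q
    return np.array(means), np.array(covs)

# Check on a constant hidden level observed with unit-variance noise:
# with Q = 0 and a diffuse prior, the filtered mean is the running average.
rng = np.random.default_rng(0)
Y = rng.normal(5.0, 1.0, size=(50, 1))
F = G = R = np.eye(1)
Q = np.zeros((1, 1))
means, covs = kalman_filter(Y, F, G, Q, R, x0=np.zeros(1), P0=1e8 * np.eye(1))
print(means[-1, 0], covs[-1, 0, 0])
```

The same recursion supplies the likelihood pieces (innovations and their covariances) needed when unknown parameters in F, G, Q or R are to be estimated.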

Statistical Methods for State-Space Models

- Zhu and Wu, JCGS (2007)
- Liu, Lu, Niu and Wu, Biometrics (2011)
- Liu, Wu, Zhu, Miao, BMC Bioinformatics (2014)
- Chen et al., PLOS ONE (2017), submitted

Extension to SDE and PDE: Possible but Challenging

- Theoretically difficult
- Computationally challenging
- Applications: not common

Ongoing and Future Research

- High-dimensional ODEs: How to improve accuracy without sacrificing too much computational speed?
- Extra-high dimensional ODE: 1000 ODEs with 1 million parameters (Wu, Qiu, Yuan and Wu, 2017, submitted)
- Characteristic analyses of large ODE systems: controllability and stability analysis with uncertainty in parameter estimation
  - Sun, Hu, Wu, Qiu, Linel, Wu, Infectious Disease Modelling 2016
- AI-driven ODE Model Builder

Conclusions

Dynamic Models:

- Practically useful for both understanding associations and predictions
- Both theoretically and computationally challenging
- Statistical methods for dynamic models: more work needed

Dr. Hulin Wu's Publications on ODE Models by Topics

ODE identifiability

1. Wu, H.*, Zhu, H.+, Miao, H.+, and Perelson, A.S. (2008), Parameter Identifiability and Estimation of HIV/AIDS Dynamic Models, Bulletin of Mathematical Biology, 70(3), 785-799.
2. Miao, H.+, Dykes, C., Demeter, L., Wu, H.* (2009), Differential Equation Modeling of HIV Viral Fitness Experiments: Model Identification, Model Selection, and Multi-Model Inference, Biometrics, 65, 292-300.
3. Liang, H., Miao, H., and Wu, H.* (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
4. Miao, H., Xia, X., Perelson, A.S., Wu, H.* (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1): 3-39.
5. Lee, Y.+ and Wu, H.* (2012), MARS Approach for Global Sensitivity Analysis of Differential Equation Models with Applications to Dynamics of Influenza Infection, Bulletin of Mathematical Biology, 74, 73-90.
6. Wu, H.*, Miao, H., Xue, H., Topham, D.J., Zand, M. (2015), Quantifying Immune Response to Influenza Virus Infection via Multivariate Nonlinear ODE Models with Partially Observed State Variables and Time-Varying Parameters, Statistics in Biosciences, 7(1):147-166.

NLS Estimation of ODE parameters

1. Wu, H.*, Huang, Y.+, Dykes, C., Liu, D., Ma, J., Perelson, A.S., Demeter, L. (2006), Modeling and Estimation of Replication Fitness of HIV-1 in Vitro Experiments Using a Growth Competition Assay, Journal of Virology, 80, 2380-2389.
2. Xue, H.+, Miao, H., Wu, H.* (2010), Sieve Estimation of Constant and Time-Varying Coefficients in Nonlinear Ordinary Differential Equation Models by Considering Both Numerical Error and Measurement Error, Annals of Statistics, 38(4), 2351-87.

Two-stage methods for ODE models

1. Liang, H., Wu, H.*, (2008), Parameter Estimation for Differential Equation Models Using a Framework of Measurement Error in Regression Models, Journal of the American Statistical Association, 103, 1570-1583.
2. Chen, J.+ and Wu, H.* (2008), Efficient Local Estimation for Time-varying Coefficients in Deterministic Dynamic Models with Applications to HIV-1 Dynamics, Journal of the American Statistical Association, 103, 369-384.
3. Fang, Y.+, Wu, H.*, Zhu, L. (2011), A Two-Stage Estimation Method for Random Coefficient Differential Equation Models with Application to Longitudinal HIV Dynamic Data, Statistica Sinica, 21, 1145-1170.
4. Lu, T.+, Liang, H., Li, H., Wu, H.* (2011), High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification, Journal of the American Statistical Association, 106, 1242-1258.
5. Wu, H.*, Xue, H., Kumar A.+ (2012), Numerical Discretization-Based Estimation Methods for Ordinary Differential Equation Models via Penalized Spline Smoothing with Applications in Biomedical Research, Biometrics, 68(2), 344-353.
6. Ding, A.A. and Wu, H.* (2014), Estimation of ODE Parameters Using Constrained Local Polynomial Regression, Statistica Sinica, 24, 1613-1631.
7. Wu, H.*, Miao, H., Xue, H., Topham, D.J., Zand, M. (2015), Quantifying Immune Response to Influenza Virus Infection via Multivariate Nonlinear ODE Models with Partially Observed State Variables and Time-Varying Parameters, Statistics in Biosciences, 7(1):147-166.

Time-varying parameter estimation in ODE Models

1. Chen, J.+ and Wu, H.* (2008), Efficient Local Estimation for Time-varying Coefficients in Deterministic Dynamic Models with Applications to HIV-1 Dynamics, Journal of the American Statistical Association, 103, 369-384.
2. Chen, J.+ and Wu, H.* (2008), Estimation of time-varying parameters in deterministic dynamic models, Statistica Sinica, 18, 987-1006.
3. Liang, H., Miao, H., and Wu, H.* (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
4. Xue, H.+, Miao, H., Wu, H.* (2010), Sieve Estimation of Constant and Time-Varying Coefficients in Nonlinear Ordinary Differential Equation Models by Considering Both Numerical Error and Measurement Error, Annals of Statistics, 38(4), 2351-87.
5. Cao, J., Huang, J.Z., Wu, H. (2012), Penalized Nonlinear Least Squares Estimation of Time-Varying Parameters in Ordinary Differential Equations, Journal of Computational and Graphical Statistics (JCGS), 21(1), 42-56.

Bayesian and mixed-effects ODE modeling approaches for longitudinal data

1. Wu, H.*, Ding, A. and DeGruttola, V. (1998), Estimation of HIV Dynamic Parameters, Statistics in Medicine, 17, 2463-2485.
2. Wu, H.* and Ding, A. (1999), Population HIV-1 Dynamics in Vivo: Applicable Models and Inferential Tools for Virological Data from AIDS Clinical Trials, Biometrics, 55, 410-418.
3. Huang, Y.+ and Wu, H.* (2006), A Bayesian Approach for Estimating Antiviral Efficacy in HIV Dynamic Models, Journal of Applied Statistics, 33, 155-174.
4. Huang, Y., Liu, D.+ and Wu, H.* (2006), Hierarchical Bayesian Methods for Estimation of Parameters in a Longitudinal HIV Dynamic System, Biometrics, 62, 413-423.
5. Fang, Y.+, Wu, H.*, Zhu, L. (2011), A Two-Stage Estimation Method for Random Coefficient Differential Equation Models with Application to Longitudinal HIV Dynamic Data, Statistica Sinica, 21, 1145-1170.

High-dimensional ODE models and model selections

1. Lu, T.+, Liang, H., Li, H., Wu, H.* (2011), High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification, Journal of the American Statistical Association, 106, 1242-1258.
2. Wu, H.*, Lu, T.+, Xue, H., and Liang, H. (2014), Sparse Additive ODEs for Dynamic Gene Regulatory Network Modeling, Journal of the American Statistical Association, 109:506, 700-716.

Nonlinear/nonparametric ODE models

1. Wu, H.*, Lu, T.+, Xue, H., and Liang, H. (2014), Sparse Additive ODEs for Dynamic Gene Regulatory Network Modeling, Journal of the American Statistical Association, 109:506, 700-716.
2. Miao, H., Wu, H., and Xue, H. (2014), Generalized Ordinary Differential Equation Models, Journal of the American Statistical Association, 109:508, 1672-1682.

Statistical methods for state-space models

1. Zhu, H.+ and Wu, H.* (2007), Estimation of Smoothing Time-Varying Parameters in State Space Models, Journal of Computational and Graphical Statistics (JCGS), 16(4), 813-832.
2. Liu, D.+, Lu, T.+, Niu, X.F., and Wu, H.* (2011), Mixed-Effects State Space Models for Analysis of Longitudinal Dynamic Systems, Biometrics, 67, 476-485.

ODE experimental design

1. Wu, H.* and Ding, A.A. (2002), Design of Viral Dynamic Studies for Efficiently Assessing Anti-HIV Therapies in AIDS Clinical Trials, Biometrical Journal, 2, 175-196.
2. Huang, Y. and Wu, H.* (2008), Bayesian Experimental Design for Long-Term Longitudinal HIV Dynamic Studies, Journal of Statistical Planning and Inference, 138, 105-113.
3. Miao, H., Xia, X., Perelson, A.S., Wu, H.* (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1): 3-39.

Dynamic model property analysis with uncertainty

1. Sun, X.+, Hu, F.+, Wu, S., Qiu, X., Linel, P.+, Wu, H.* (2016), Controllability and Stability Analysis of Large Transcriptomic Dynamic Systems for Host Response to Influenza Infection in Human, Infectious Disease Modelling, 1(1), 52-70.

Our recent work in nonlinear/high-dimensional ODE models

- Lu, T., Liang, H., Li, H., Wu, H. (2011), High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification, JASA, 106, 1242-1258.
- Wu, H., Xue, H., Kumar A. (2012), Numerical Discretization-Based Estimation Methods for Ordinary Differential Equation Models via Penalized Spline Smoothing with Applications in Biomedical Research, Biometrics, 68(2), 344-353.
- Miao, H., Wu, H., and Xue, H. (2014), Generalized Ordinary Differential Equation Models, JASA, 109:508, 1672-1682.
- Wu, H., Lu, T., Xue, H., and Liang, H. (2014), Sparse Additive ODEs for Dynamic Gene Regulatory Network Modeling, JASA, 109:506, 700-716.
- Wu, S., Liu, Z.P., Qiu, X., and Wu, H. (2014), Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations, PLOS ONE, 9(5):e95276.
- Linel, P., Wu, S., Deng, N., Wu, H. (2014), Dynamic transcriptional signatures and network responses for clinical symptoms in influenza-infected human subjects using systems biology approaches, Journal of PK/PD, 41, 509-521.
- Qiu, X. et al. (2015), Diversity in Compartmental Dynamics of Gene Regulatory Networks: The Immune Response in Primary Influenza A Infection in Mice, PLOS ONE, 10(9).

Acknowledgement

- More than 30 postdocs, students and collaborators

Thank You!
