Statistical Methods for Bridging Experimental Data and Dynamic Models with Biomedical Applications
Hulin Wu, Ph.D. Dr. D.R. Seth Family Professor & Associate Chair Department of Biostatistics, School of Public Health
Professor, School of Biomedical Informatics University of Texas Health Science Center at Houston
Pittsburgh, March, 2017
Hulin Wu UTSPH March 2017 1 / 52 Outline
1 Introduction
2 Statistical estimation and inference methods for dynamic ODE models
I Naive Method: LS or MLE principle
I Local solution and time-varying parameter problems
I Smoothing-based methods
I Sparse longitudinal data: mixed-effects ODE models
I Bayesian methods
I High-dimensional ODE models: ODE model selection
3 Other dynamic models
4 Ongoing and future Work
5 Conclusions
Statistical Modeling Cultures
I Leo Breiman (Statistical Science, 2001): Two cultures
I Data modeling (98% of statisticians): What do the data look like? e.g., regression models
I Algorithmic modeling (2% of statisticians): No models; for prediction purposes, e.g., neural nets and decision trees
I A third culture:
I Mechanistic modeling (<1% statisticians): Build mathematical models based on the mechanisms behind the data
I How are the data generated?
I Goal: Understand physics principles or biological mechanisms
Dynamic Systems/Models
Many engineering and biological systems can be described by dynamic models:
I Differential equations:
I Ordinary differential equations (ODE)–simplest
I Delay differential equations (DDE)
I Hybrid differential equations (HDE)
I Partial differential equations (PDE)
I Stochastic differential equations (SDE)
I Difference equations and state-space models
I Stochastic processes models: branching process etc.
I Agent-based models and cellular automata
I ...
Modeling Goals
I Forward Problems: $\theta \mapsto P_\theta$ (easier to do)
I Predictions
I Inverse Problems: $Y \mapsto \theta \in \Theta$ (more challenging)
I Determine model structures/forms
I Estimate unknown parameters: θ
A Dynamic System: ODE Model
$\frac{d}{dt} X(t) = G[X(t), \theta], \quad X(0) = X_0$ (1)
$Y(t_i) = H[X(t_i), \beta] + e(t_i), \quad e(t_i) \sim (0, \sigma^2 I), \quad i = 1, \ldots, n$ (2)
where
I $G(\cdot)$: linear or nonlinear functions
I $H(\cdot)$: observation functions
I $(\theta, \beta)$: unknown parameters
I $e(t_i)$: measurement error
The NLS method:
$\min_{\theta, \beta, X_0} \sum_{i=1}^{n} \{Y(t_i) - H[X(t_i, \theta), \beta]\}^T \{Y(t_i) - H[X(t_i, \theta), \beta]\}$,
where $X(t_i, \theta)$ is evaluated numerically from Eq. (1).
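A minimal sketch of the NLS method, not the speaker's implementation. It assumes a hypothetical logistic-growth right-hand side for G, an identity observation function H (so there is no β), and simulated data; the names `G`, `solve_state` and `residuals` are illustrative only. Each evaluation of the objective solves the ODE numerically, which is the cost the later slides try to avoid.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def G(t, x, theta):
    # Hypothetical right-hand side: logistic growth with rate r and capacity K.
    r, K = theta
    return r * x * (1.0 - x / K)

def solve_state(theta, x0, t_obs):
    # Forward problem: numerical solution X(t_i, theta) of Eq. (1) at the observation times.
    sol = solve_ivp(G, (t_obs[0], t_obs[-1]), [x0], t_eval=t_obs,
                    args=(theta,), rtol=1e-8, atol=1e-10)
    return sol.y[0]

def residuals(params, t_obs, y_obs):
    # Residuals Y(t_i) - X(t_i, theta); H is assumed to be the identity.
    r, K, x0 = params
    return y_obs - solve_state((r, K), x0, t_obs)

# Simulated data (assumption): known truth plus Gaussian measurement error.
rng = np.random.default_rng(0)
t_obs = np.linspace(0.0, 10.0, 25)
true_params = np.array([0.8, 10.0, 0.5])          # (r, K, X_0)
y_obs = solve_state(true_params[:2], true_params[2], t_obs)
y_obs = y_obs + rng.normal(0.0, 0.2, t_obs.size)

# Inverse problem: minimise the sum of squared residuals over (theta, X_0).
start = [0.3, 5.0, 1.0]
fit = least_squares(residuals, x0=start, args=(t_obs, y_obs),
                    bounds=([0.01, 1.0, 0.01], [5.0, 50.0, 5.0]))
theta_hat = fit.x
```

The result depends on the starting value `start`; a local optimiser of this kind is exactly where the local-solution problem arises.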
Naive NLS Method: Challenging Problems
1 Identifiability problem
2 Local solutions
3 Time-varying parameters
4 Need to solve the forward problem numerically and many times: Numerical error vs. measurement error
5 Slow convergence and high computational cost
6 Sparse longitudinal data problem
7 Nonlinear optimization
8 High-dimensional parameter space
Motivate new statistical methods for dynamic models
Identifiability issues
I Theoretical identifiability: Mathematical identifiability
I Practical identifiability: Statistical and numerical identifiability
I Need to be investigated before the inverse problem
I How to deal with unidentifiable models?
I Simplify or revise the model
I Lump some parameters together
I Fix some parameters
I Bayesian approach: Use priors
Identifiability issues: References
I Wu, H., Zhu, H., Miao, H., and Perelson, A.S. (2008), Parameter Identifiability and Estimation of HIV/AIDS Dynamic Models, Bulletin of Mathematical Biology, 70(3), 785-799.
I Miao, H., Dykes, C., Demeter, L.M., Cavenaugh, J., Park, S.Y., Perelson, A.S., and Wu, H. (2008), Modeling and Estimation of Kinetic Parameters and Replicative Fitness of HIV-1 from Flow-Cytometry-Based Growth Competition Experiments, Bulletin of Mathematical Biology, 70, 1749-1771.
I Miao, H., Dykes, C., Demeter, L., Wu, H. (2009), Differential Equation Modeling of HIV Viral Fitness Experiments: Model Identification, Model Selection, and Multi-Model Inference, Biometrics, 65, 292-300.
I Liang, H., Miao, H., and Wu, H. (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
I Miao, H., Xia, X., Perelson, A.S., Wu, H. (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1): 3-39.
Naive NLS Method: Local solution and numerical error problems
I Local solution problem:
I Global optimization methods: Differential evolution algorithms and genetic algorithms (Storn et al 1997).
I Mixture of stochastic global optimization method and deterministic methods: scatter search method (Rodriguez-Fernandez et al. 2006)
I Numerical error problem:
I Xue, Miao and Wu (Annals of Statistics, 2010): theoretical results on numerical error vs. measurement error
Naive NLS Method: Time-varying parameter problem
Xue, Miao and Wu, Annals of Statistics (2010)
$\frac{dX(t)}{dt} = F\{t, X(t), \theta, \eta(t)\}$
I The spline approach can be used to approximate the time-varying parameter:
$\eta(t) = \pi(t)^T \alpha$,
where $\pi(t) = (B_1(t), \cdots, B_N(t))^T$ is a vector of basis functions.
I The time-varying coefficient ODE model becomes an ODE model with constant parameters:
$\frac{dX(t)}{dt} = F\{t, X(t), \theta, \pi(t)^T \alpha\}$
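A short sketch of the spline approximation, under stated assumptions: a clamped cubic B-spline basis, a hypothetical scalar model for F, and arbitrary illustrative values for α and θ. The helper `make_eta` is not from the cited paper. Once η(t) is written as a spline with constant coefficients α, the ODE can be solved (and α estimated) like any constant-parameter model.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import BSpline

def make_eta(alpha, t_min, t_max, degree=3):
    # eta(t) = pi(t)^T alpha with pi(t) = (B_1(t), ..., B_N(t))^T, N = len(alpha).
    # A clamped knot vector of length N + degree + 1 defines exactly N basis functions.
    n_basis = len(alpha)
    n_interior = n_basis - degree - 1
    interior = np.linspace(t_min, t_max, n_interior + 2)[1:-1]
    knots = np.concatenate([[t_min] * (degree + 1), interior, [t_max] * (degree + 1)])
    return BSpline(knots, np.asarray(alpha, dtype=float), degree, extrapolate=True)

def F(t, x, theta, eta):
    # Hypothetical model: growth at time-varying rate eta(t), loss at constant rate theta.
    return (eta(t) - theta) * x

alpha = np.array([0.2, 0.5, 1.0, 0.6, 0.3, 0.1])   # illustrative spline coefficients
eta = make_eta(alpha, 0.0, 10.0)
t_grid = np.linspace(0.0, 10.0, 101)
sol = solve_ivp(F, (0.0, 10.0), [1.0], t_eval=t_grid, args=(0.4, eta), rtol=1e-8)
```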
Smoothing-Based Approaches: ODE Computational Problem
I Earlier ideas: Hemker (1972) and Varah (1982)
I Two-stage decoupling approaches: Chen and Wu (JASA 2008, Statistica Sinica 2008) and Liang and Wu (JASA, 2008)
I Parameter cascading method: Ramsay et al. JRSS-B (2007) and Wang et al. Stat Comput 2014.
Smoothing-Based Approaches: Two-Stage Method
Chen and Wu (JASA 2008, Statistica Sinica 2008) and Liang and Wu (JASA, 2008):
$X'(t_i) = F[X(t_i), \theta]$ (3)
$Y(t_i) = X(t_i) + e_1(t_i), \quad e_1(t_i) \sim (0, \sigma^2 I)$ (4)
I Step 1: Use nonparametric smoothing to estimate $X(t)$ and $X'(t)$ from model (4).
I Step 2: Substitute the estimates $\hat{X}(t_i)$ and $\hat{X}'(t_i)$ into model (3) to obtain:
$\hat{X}'(t_i) = F[\hat{X}(t_i), \theta] + e_2(t_i)$. (5)
Then fit the above regression model (5) to estimate θ.
I F (·): Linear or nonlinear function
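The two steps can be sketched as follows. This is an illustration under assumptions, not the estimator of the cited papers: the model is a hypothetical logistic ODE that is linear in θ, the smoother is a quartic smoothing spline from SciPy, and a few boundary points are dropped because derivative estimates tend to be worst there.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Simulated data (assumption): x' = theta1 * x - theta2 * x^2 observed with noise.
rng = np.random.default_rng(1)
theta_true = np.array([0.8, 0.08])
t = np.linspace(0.0, 10.0, 60)
K, x0 = theta_true[0] / theta_true[1], 0.5
x_true = K / (1.0 + (K / x0 - 1.0) * np.exp(-theta_true[0] * t))  # closed-form logistic curve
sigma = 0.1
y = x_true + rng.normal(0.0, sigma, t.size)

# Step 1: smooth the data to estimate X(t) and X'(t) without solving the ODE.
spline = UnivariateSpline(t, y, k=4, s=t.size * sigma**2)
inner = slice(3, -3)                      # drop boundary points
x_hat = spline(t)[inner]
dx_hat = spline.derivative()(t)[inner]

# Step 2: regress the derivative estimate on F[X_hat, theta]; here F is linear in theta,
# so ordinary least squares is enough.
design = np.column_stack([x_hat, -x_hat**2])
theta_hat, *_ = np.linalg.lstsq(design, dx_hat, rcond=None)
```

No numerical ODE solver appears anywhere in the fit, which is the source of the speed advantage and also of the accuracy loss noted on the following slides.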
Smoothing-Based Approaches: Two-Step Methods
I Step 2 decoupled the system of ODEs: Fit the ODE one-by-one
I Convert ODE models to regression: Standard regression software tools can be used
I Avoid numerically solving the ODEs
I Computationally fast and efficient: Easy to deal with high-dimensional ODEs
I Price to pay:
I The derivative estimate may not be accurate
I The decoupled system: Some information lost
I The "coupled" property: destroyed
Extension to higher-order numerical discretization-based algorithms: Wu, Xue and Kumar (Biometrics 2012)
Parameter Cascading or Profiling Method
Ramsay, Hooker, Campbell, Cao, JRSS-B, 2007
Fitting to data
I Observations: $y(t_i)$
I Nonparametric function: $f(t) = \phi(t)^T c$
I Fitting to data: $C_1 = \sum_{i=1}^{n} [y(t_i) - f(t_i)]^2$
Fidelity to DE $x'(t) = g(x \mid \beta)$
I $f'(t) = \phi'(t)^T c$
I Difference between two sides of DE: $Lf(t) = f'(t) - g(f(t) \mid \beta)$
I Fidelity to DE: $C_2 = \int [Lf(t)]^2 \, dt$
Criterion to estimate $c$: $J(c \mid \beta) = C_1 + \lambda C_2$
Criterion to estimate $\beta$: $H(\beta) = \sum_{i=1}^{n} [y(t_i) - \phi(t_i)^T \hat{c}(\beta)]^2$
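A toy sketch of the two nested criteria, assuming the linear ODE x'(t) = -βx(t), a monomial basis for φ(t), trapezoid quadrature for C2, and a fixed λ. None of these choices come from the cited paper, which uses spline bases and more careful tuning. For this linear ODE the inner criterion J(c|β) is quadratic in c, so the inner step is a single least-squares solve.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Simulated data (assumption): x(t) = x0 * exp(-beta * t) plus noise.
rng = np.random.default_rng(2)
beta_true, x0 = 1.0, 2.0
t_obs = np.linspace(0.0, 2.0, 21)
y = x0 * np.exp(-beta_true * t_obs) + rng.normal(0.0, 0.02, t_obs.size)

degree = 6
powers = np.arange(degree + 1)

def phi(t):
    # Rows are phi(t)^T for the monomial basis 1, t, ..., t^degree.
    return np.power.outer(np.asarray(t, dtype=float), powers)

def dphi(t):
    # Rows are phi'(t)^T: the derivative of t^p is p * t^(p-1).
    t = np.asarray(t, dtype=float)
    d = np.zeros((t.size, degree + 1))
    d[:, 1:] = powers[1:] * np.power.outer(t, powers[:-1])
    return d

s = np.linspace(0.0, 2.0, 201)            # quadrature grid for C2
w = np.full(s.size, s[1] - s[0])
w[[0, -1]] *= 0.5                         # trapezoid weights
lam = 1e3                                 # smoothing parameter lambda (fixed here)

def c_hat(beta):
    # Inner step: minimise J(c|beta) = C1 + lam * C2 over c.
    # Lf(s) = f'(s) - g(f(s)|beta) with g(x|beta) = -beta * x, so Lf = (phi' + beta*phi)^T c.
    L = dphi(s) + beta * phi(s)
    A = np.vstack([phi(t_obs), np.sqrt(lam * w)[:, None] * L])
    b = np.concatenate([y, np.zeros(s.size)])
    return np.linalg.lstsq(A, b, rcond=None)[0]

def H(beta):
    # Outer criterion: data fit only, evaluated at the profiled coefficients c_hat(beta).
    return np.sum((y - phi(t_obs) @ c_hat(beta)) ** 2)

res = minimize_scalar(H, bounds=(0.1, 3.0), method="bounded")
beta_hat = res.x
```

The design point is that β never enters an ODE solver: it only affects the fit through the penalty that shapes the profiled coefficients.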
Numerical Comparisons: NLS, Profiling and Two-Stage Estimates
Ding and Wu, Statistica Sinica, 2014
I NLS: Not stable to get the global solution, computationally expensive
I Profiling:
I A 3-step iterative algorithm
I More stable than NLS to get a better solution
I Computational efficiency: similar to NLS
I Two-Stage Method: Computationally fast, but not accurate.
Sparse Longitudinal Data Problem: Mixed-Effects Modeling Approaches
Deal with sparse data: Borrow information across subjects
I The MLE principle: Nonlinear Mixed-Effects Modeling (NLME)
I Treat the ODE solution as a nonlinear regression function
I Computational challenge: Stochastic Approximation EM (SAEM)
I Two-stage smoothing-based mixed-effects modeling approaches
I Fang, Wu and Zhu, Statistica Sinica (2011)
I Linear ODE: Linear mixed-effects model (LME)
I Nonlinear ODE: NLME
I Bayes methods
I A three-stage hierarchical model: implemented by MCMC
I Computation: expensive
Mixed-Effects ODE Model: NLME
I Within-subject variation:
$\frac{d}{dt} X_i(t) = G[X_i(t), \theta_i], \quad X_i(0) = X_{i0}$ (6)
$Y_i(t_j) = H_i[X_i(t_j), \theta_i] + e_i(t_j), \quad i = 1, \ldots, n, \; j = 1, \ldots, m_i$
I $X_i(t_j)$: ODE solution for Subject $i$
I $Y_i = (y_{i1}(t_1), \cdots, y_{im_i}(t_{m_i}))^T$: Data from Subject $i$
I $e_i = (e_i(t_1), \cdots, e_i(t_{m_i}))^T \sim N(0, \sigma^2 I_{m_i})$: Measurement error
I Between-subject variation:
θi = µ + bi, [bi|Σ] ∼ N (0, Σ)
I µ: population parameter
I bi: random effects
I Estimation and inference: Stochastic Approximation EM (SAEM)
I Delyon, Lavielle and Moulines (1999), Kuhn and Lavielle (2005), Grenier, Louvet, Vigneaux (2014)
Smoothing-based Two-Stage Mixed-Effects Model
Fang, Wu and Zhu, Statistica Sinica (2011):
$X'(t_i) = F[X(t_i), \theta]$ (7)
$Y(t_i) = X(t_i) + e_1(t_i), \quad e_1(t_i) \sim (0, \sigma^2 I)$ (8)
I Step 1: Use nonparametric smoothing to estimate $X(t)$ and $X'(t)$ from model (8).
I Step 2: Substitute the estimates $\hat{X}(t_i)$ and $\hat{X}'(t_i)$ into model (7) to obtain:
$\hat{X}'(t_i) = F[\hat{X}(t_i), \theta] + e_2(t_i)$. (9)
I Convert model (9) into an LME model if $F(\cdot)$ is linear, or an NLME model if $F(\cdot)$ is nonlinear.
I Fit the LME or NLME model using a standard approach or the SAEM method
Bayesian Methods: Borrow Information to Deal with Sparse Data and Identifiability Problems
Huang, Liu and Wu, Biometrics (2006): Example
I A viral dynamic model: describes the population dynamics of HIV and its target cells in plasma
$\frac{d}{dt} T = \lambda - \rho T - [1 - \gamma(t)] k T V$
$\frac{d}{dt} T^* = [1 - \gamma(t)] k T V - \delta T^*$
$\frac{d}{dt} V = N \delta T^* - c V$ (10)
I $T, T^*, V$: target uninfected cells, infected cells, virus
I γ(t): time-varying antiviral drug efficacy
I (λ, ρ, k, δ, N, c): unknown parameters to be estimated
I The equations (10): no closed-form solution
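Because system (10) has no closed-form solution, it has to be solved numerically. The sketch below does this with SciPy. The parameter values are borrowed from the population posterior means reported later in these slides; the constant efficacy γ(t) = 0.9, the 28-day horizon, and the choice of the untreated infected steady state as the initial condition are assumptions made purely for illustration.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Population posterior means reported in the talk: (lambda, rho, k, delta, N, c).
lam, rho, k, delta, N, c = 100.645, 0.0997, 9.183e-6, 0.3729, 1004.988, 2.9867

def gamma(t):
    # Assumption: constant 90% drug efficacy instead of the time-varying gamma(t).
    return 0.9

def rhs(t, state):
    # Right-hand side of system (10) for the state (T, T*, V).
    T, T_star, V = state
    infection = (1.0 - gamma(t)) * k * T * V
    return [lam - rho * T - infection,
            infection - delta * T_star,
            N * delta * T_star - c * V]

# Assumed initial state: infected steady state of (10) with gamma = 0 (no treatment).
T0 = c / (k * N)
Ts0 = (lam - rho * T0) / delta
V0 = N * delta * Ts0 / c

t_eval = np.linspace(0.0, 28.0, 57)
sol = solve_ivp(rhs, (0.0, 28.0), [T0, Ts0, V0], method="LSODA",
                t_eval=t_eval, rtol=1e-8, atol=1e-8)
log10_V = np.log10(sol.y[2])              # viral load on the log10 scale
```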
Antiviral Drug Efficacy Model
I A modified Emax (M-M) model for drug efficacy:
$\gamma(t) = \frac{C(t) A(t)}{\phi \, IC_{50}(t) + C(t) A(t)} = \frac{IQ(t) A(t)}{\phi + IQ(t) A(t)}, \quad 0 \le \gamma(t) \le 1$ (11)
I $C(t)$: the plasma drug concentration
I $A(t)$: drug adherence measurements
I $IC_{50}$: in vitro phenotype drug resistance marker
I $\phi$: a conversion factor parameter
I $IQ(t) = C(t) / IC_{50}(t)$: the Inhibitory Quotient (IQ)
I If γ(t) = 1, the drug: 100% effective
I If γ(t) = 0, the drug: no effect
Drug Susceptibility Model
I Phenotype marker $IC_{50}$ is used to quantify agent-specific drug sensitivity
I The function: to describe changes over time in $IC_{50}$
$IC_{50}(t) = I_0 + \frac{I_r - I_0}{t_r} t$ for $0 < t < t_r$; $\quad IC_{50}(t) = I_r$ for $t \ge t_r$ (12)
I I0 and Ir: respective values of IC50(t) at baseline and time point tr at which drug resistant mutations appear
I If Ir = I0, no resistance mutation developed during treatment
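The efficacy and susceptibility models combine into two small functions. This is a sketch with hypothetical argument names and made-up input values; only the formulas (11) and (12) come from the slides.

```python
import numpy as np

def ic50(t, I0, Ir, tr):
    # Eq. (12): linear change from I0 to Ir on [0, tr), constant Ir afterwards.
    t = np.asarray(t, dtype=float)
    return np.where(t < tr, I0 + (Ir - I0) * t / tr, Ir)

def efficacy(t, C, A, phi, I0, Ir, tr):
    # Eq. (11): gamma(t) = IQ(t) A(t) / (phi + IQ(t) A(t)) with IQ(t) = C(t) / IC50(t).
    iq = C / ic50(t, I0, Ir, tr)
    return iq * A / (phi + iq * A)

# Illustrative values (assumptions): constant concentration and adherence.
t = np.linspace(0.0, 168.0, 9)
g = efficacy(t, C=20.0, A=0.95, phi=2.0, I0=1.0, Ir=6.0, tr=84.0)
```

With these inputs the efficacy falls as IC50 rises toward Ir and then stays constant after tr, which is how emerging resistance enters the viral dynamic model.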
A Challenging Problem
How to estimate the unknown parameters in the complex dynamic model?
I Difficulties:
I Identifiability problem: Too many parameters, $(\phi, \lambda, \rho, k, \delta, N, c)$; some of them are not identifiable
I Data from individuals: sparse, only V (t) measured
I Nonlinear differential equations model: no closed-form solutions
Viral load data from a clinical trial
Real data up to day 112
[Figure: scatter plot of viral load, log10(RNA) copies/ml (axis 1 to 5, with a reference line at log10(50)), against Time (days), 0 to 100]
Bayesian Modeling
I A three-stage Bayesian hierarchical model
I Stage 1. Within-subject variation:
$y_i = f_i(\theta_i) + e_i, \quad [e_i \mid \sigma^2, \theta_i] \sim N(0, \sigma^2 I_{m_i})$
I $f_i(\theta_i) = (f_{i1}(\theta_i, t_1), \cdots, f_{im_i}(\theta_i, t_{m_i}))^T$: ODE solutions
I $y_i = (y_{i1}(t_1), \cdots, y_{im_i}(t_{m_i}))^T$: Data from Subject $i$
I $e_i = (e_i(t_1), \cdots, e_i(t_{m_i}))^T$: Measurement error
I Stage 2. Between-subject variation:
$\theta_i = \mu + b_i, \quad [b_i \mid \Sigma] \sim N(0, \Sigma)$
I Stage 3. Hyperprior distributions:
$\sigma^{-2} \sim Ga(a, b), \quad \mu \sim N(\eta, \Lambda), \quad \Sigma^{-1} \sim Wi(\Omega, \nu)$
I Gamma (Ga), Normal (N) and Wishart (Wi): independent distributions
I Hyper-parameters $a, b, \eta, \Lambda, \Omega$ and $\nu$: known
Bayesian Estimation: Implementation
I Choose prior distributions
I Informative prior and non-informative prior
I Rule of thumb: choose non-informative prior distributions for parameters of interest
I Implement MCMC algorithm
I Gibbs sampling step: closed form of conditional distributions for $\sigma^{-2}$, $\mu$, $\Sigma^{-1}$
I Metropolis-Hastings step: no closed form of conditional distributions for $\theta_i$
I Run a long chain: choose the number of iterations, discard an initial "burn-in" period, and keep every fifth sample
I Obtain posterior distributions (posterior means or credible intervals) based on the final MCMC samples
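A toy sketch of the Metropolis-Hastings step with burn-in and every-fifth-sample thinning. It is deliberately much simpler than the three-stage hierarchical model: one subject, one parameter, a closed-form decay curve standing in for the ODE solution, a known σ, and an assumed vague normal prior. The proposal scale and chain length are arbitrary choices.

```python
import numpy as np

# Simulated data (assumption): y = x0 * exp(-theta * t) + noise, with x0 and sigma known.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 5.0, 20)
x0, theta_true, sigma = 2.0, 0.5, 0.05
y = x0 * np.exp(-theta_true * t) + rng.normal(0.0, sigma, t.size)

def log_post(theta):
    # Log posterior up to a constant: Gaussian likelihood plus a vague N(1, 10^2) prior.
    if theta <= 0.0:
        return -np.inf
    resid = y - x0 * np.exp(-theta * t)
    log_lik = -0.5 * np.sum(resid**2) / sigma**2
    log_prior = -0.5 * ((theta - 1.0) / 10.0) ** 2
    return log_lik + log_prior

n_iter, burn_in, thin = 6000, 1000, 5
chain = np.empty(n_iter)
theta, lp = 1.0, log_post(1.0)
for it in range(n_iter):
    prop = theta + rng.normal(0.0, 0.02)          # random-walk proposal
    lp_prop = log_post(prop)
    if np.log(rng.uniform()) < lp_prop - lp:      # Metropolis accept/reject
        theta, lp = prop, lp_prop
    chain[it] = theta

samples = chain[burn_in::thin]                    # drop burn-in, keep every fifth draw
post_mean = samples.mean()
ci = np.percentile(samples, [2.5, 97.5])          # equal-tailed 95% credible interval
```

In the full model this step would be repeated for each subject's θ_i inside a Gibbs sweep over σ^{-2}, μ and Σ^{-1}, with the likelihood evaluated through a numerical ODE solve.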
A Clinical Study: A5055
I A study of HIV-1 infected patients failing PI-containing therapies.
I Two salvage regimens: 44 patients
I Arm A: IDV 800 mg q12h+RTV 200mg q12h+two NRTIs
I Arm B: IDV 400 mg q12h+RTV 400mg q12h+two NRTIs
I Plasma HIV-1 RNA (viral load) measured at days 0, 7, 14, 28, 56, 84, 112, 140 and 168 of follow-up
Clinical Data: Results of Population Parameters
Parameter   PM (posterior mean)   SD              95% CI
φ           2.1091                0.6354          (1.2143, 3.6392)
c           2.9867                0.1466          (2.7139, 3.2881)
δ           0.3729                0.0184          (0.3387, 0.4105)
λ           100.645               4.9431          (91.497, 110.830)
ρ           0.0997                0.0049          (0.0905, 0.1099)
N           1004.988              49.795          (912.074, 1106.654)
k           9.183 × 10^-6         0.290 × 10^-6   (8.632 × 10^-6, 9.774 × 10^-6)
I Posterior mean for the population parameter φ is 2.1091 with a SD of 0.6354 and the 95% CI of (1.2143, 3.6392)
I As φ plays a role of transforming the in vitro IC50 into in vivo IC50, our estimate shows that there is about 2-fold difference between in vitro IC50 and in vivo IC50
Clinical Data: Results of Individual Parameters
Patient   φi      ci      δi      λi         ρi      Ni         ki              e
1         0.447   2.254   0.270   410.462    0.024   456.757    8.33 × 10^-6    0.97
2         5.371   2.969   1.183   29.619     0.426   4795.813   10.84 × 10^-6   0.17
3         3.723   2.283   0.456   36.877     0.289   3258.347   8.66 × 10^-6    0.37
4         4.960   2.761   0.798   44.956     0.313   3051.988   9.09 × 10^-6    0.34
5         7.066   2.306   0.663   71.295     0.201   2735.239   6.54 × 10^-6    0.64
6         0.786   4.633   0.183   375.882    0.025   247.416    11.18 × 10^-6   0.89
7         0.091   7.008   0.299   4015.398   0.003   30.559     18.54 × 10^-6   0.98
8         8.484   2.280   0.663   32.722     0.416   4530.531   8.37 × 10^-6    0.24
I The individual-specific parameter estimates suggest a large inter-subject variation
I The model provides a good fit to the clinical data
Patient 1
Fitted individual curves, drug efficacy, IC50 and adherence with IQ=c12h/IC50
[Figure, Patid = 1: four panels against Time (day), 0 to 200: IC50 (IDV, RTV); Adherence (IDV, RTV); log10(RNA) with fitted curve; Drug efficacy]
Patient 2
[Figure, Patid = 2: four panels against Time (day), 0 to 200: IC50 (IDV, RTV); Adherence (IDV, RTV); log10(RNA) with fitted curve; Drug efficacy]
Patient 3
[Figure, Patid = 3: four panels against Time (day), 0 to 200: IC50 (IDV, RTV); Adherence (IDV, RTV); log10(RNA) with fitted curve; Drug efficacy]
Bayesian Methods: Pros & Cons
I Pros
I Use prior to solve the identifiability problem
I Deal with extremely complicated models such as nonlinear differential equation models
I Borrow information across subjects:
I Deal with sparse longitudinal data
I Estimate parameters for both population and individuals
I Always get reasonable estimates
I Use posterior distributions: Easy to quantify "uncertainty" for inference
I Cons
I Computation: complex and expensive
I Prior: may dominate the results
High-Dimensional ODEs
I Require computationally fast and efficient methods
I Need to incorporate variable selection approaches: LASSO, SCAD etc.
I Easy to deal with longitudinal data: Mixed-effects models
I Two-stage smoothing-based method: good for this purpose
Linear ODEs
Time course gene expression data: Dynamic gene regulatory network (GRN) reconstruction (Lu, Liang, Li and Wu, JASA 2011)
$\frac{dx_i}{dt} = \sum_{j=1}^{n} \theta_{ij} x_j, \quad i = 1, \cdots, n$ (13)
I When n is small, standard statistical inference and variable selection methods can be used
I When n is large, curse-of-dimensionality
High-Dimensional Linear ODE: Identifying Significant Regulations
Two-Stage Method (Chen and Wu 2008a, 2008b; Liang and Wu 2008):
I Obtain mean expression curves and their derivatives $\hat{M}_k(t)$ and $\hat{M}'_k(t)$ from Step II.
I Substitute $\hat{M}_k(t)$ and $\hat{M}'_k(t)$ into the ODE model to form a regression model
High Dimensional Linear Regression Model
$y_k(t) = \sum_{j=1}^{p} \beta_{kj} x_j(t) + \varepsilon_k(t), \quad k = 1, \cdots, p; \; t = t_1, t_2, \ldots, t_N$,
where $y_k(t) = \hat{M}'_k(t)$ and $x_j(t) = \hat{M}_j(t)$
High Dimensional Model Selection
I Two-stage method
I Decouple the high-dimensional ODEs
I Convert the ODE model into a simple linear model
I Computationally fast
I Stepwise selection and subset selection
I Bridge selection (Frank and Friedman 1993)
I Least absolute shrinkage and selection operator (LASSO) (Tibshirani 1996)
I Smoothly Clipped Absolute Deviation (SCAD)(Fan and Li 2001; Kim, Choi and Oh 2008)
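A small sketch of penalized selection on the regression form of a linear ODE system. It uses LASSO rather than SCAD (the penalty applied to the yeast data later in the talk), implements it with a plain cyclic coordinate descent so that only NumPy and SciPy are needed, and runs on a made-up four-variable system. The smoother, the noise level and the penalty `lam` are all assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.interpolate import UnivariateSpline

def soft_threshold(z, a):
    return np.sign(z) * max(abs(z) - a, 0.0)

def lasso_cd(X, y, lam, n_sweeps=500, tol=1e-8):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = np.asarray(y, dtype=float).copy()         # current residual y - X b
    for _ in range(n_sweeps):
        max_step = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ r / n + col_sq[j] * b[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            if new != b[j]:
                r -= X[:, j] * (new - b[j])       # keep the residual in sync
                max_step = max(max_step, abs(new - b[j]))
                b[j] = new
        if max_step < tol:
            break
    return b

# Hypothetical sparse linear ODE system x' = Theta x, observed with small noise.
Theta = np.array([[-0.5, 1.0, 0.0, 0.0],
                  [-1.0, -0.5, 0.0, 0.0],
                  [0.0, 0.0, -0.3, 0.0],
                  [0.0, 0.5, 0.0, -0.4]])
p = Theta.shape[0]
t = np.linspace(0.0, 8.0, 80)
sol = solve_ivp(lambda s, x: Theta @ x, (0.0, 8.0), [1.0, 0.0, 1.0, 0.5],
                t_eval=t, rtol=1e-8)
rng = np.random.default_rng(5)
noise_sd = 0.01
Y = sol.y + rng.normal(0.0, noise_sd, sol.y.shape)

# Stage 1: smooth each curve and differentiate the smoother.
splines = [UnivariateSpline(t, Y[k], k=4, s=t.size * noise_sd**2) for k in range(p)]
M = np.column_stack([sp(t) for sp in splines])                 # x_j(t) = M_hat_j(t)
dM = np.column_stack([sp.derivative()(t) for sp in splines])   # y_k(t) = M_hat'_k(t)

# Stage 2: one penalized regression per equation; row k estimates (beta_k1, ..., beta_kp).
Theta_hat = np.vstack([lasso_cd(M, dM[:, k], lam=0.01) for k in range(p)])
```

Because the equations are decoupled, the p regressions are independent and could be run in parallel, which is what makes the approach workable in high dimensions.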
Estimation Refinement: Stochastic Approximation EM (SAEM) Algorithm
Mixed-Effects ODE Model for Module k
$\frac{dx_{ki}}{dt} = \sum_{j=1}^{m_k} \beta_{kij} M_{[kj]}(t), \quad i = 1, \cdots, n_k; \; k = 1, \ldots, p$ (14)
Longitudinal Measurement Model
yki(t) = xki(t) + εki(t) (15)
Random Effects Model
βki = βk + bki (16)
bki ∼ N (0, Dk)
Application: Identification of Dynamic GRN for Yeast Cell Cycle
DNA microarray experiment: 18 equally spaced time points during two cell cycles (Spellman 1998)
I Step I: 800 significant genes identified
I Step II: Cluster 800 genes into 41 functional modules
I Step III: Smoothing
I Step IV: Linear ODE model identification: SCAD variable selection
I Step V: Estimation Refinement
I Step VI: Function Enrichment Analysis
Yeast Cell Cycle Gene Expression Profile
[Figure, shown over three slides: expression profiles of Modules 1 to 41 across the 18 time points, one panel per module; Modules 8, 9, 14, 15, 18, 20, 25, 30, 32, 38 and 41 are marked with an asterisk]
Graph of Yeast Cell Cycle GRN
[Figure: network graph of the identified gene regulatory network; nodes are Modules 1 to 41]
High-Dimensional Nonlinear/Nonparametric ODEs
I Generalized ODEs: Miao, Wu and Xue, Journal of the American Statistical Association (2014)
I Sparse additive ODEs: Wu, Lu, Xue and Liang, Journal of the American Statistical Association (2014)
I Additive nonlinear ODEs: Xue, Wu, Wu and Wu, a manuscript (2017)
Other Dynamic Models: State-Space Models (SSM)
Linear SSM:
$X_{t+1} = F_t X_t + V_t, \quad V_t \sim (0, Q_t)$ (17)
$Y_t = G_t X_t + W_t, \quad W_t \sim (0, R_t)$ (18)
where
I $V_t$ and $W_t$: independent model noise and measurement noise
I Standard Kalman filter (Kalman, 1960): the core algorithm for prediction and smoothing of state vectors
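A compact sketch of the standard Kalman filter recursion for the linear SSM (17)-(18). It assumes time-invariant F, G, Q and R and known noise covariances, and is demonstrated on a simulated scalar random walk; the function name and the simulated values are illustrative.

```python
import numpy as np

def kalman_filter(y, F, G, Q, R, x0, P0):
    """Filtered state means for X_{t+1} = F X_t + V_t, Y_t = G X_t + W_t."""
    n, d = len(y), len(x0)
    x_pred = np.asarray(x0, dtype=float)
    P_pred = np.asarray(P0, dtype=float)
    filtered = np.empty((n, d))
    for t in range(n):
        # Update: combine the one-step prediction with the new observation y[t].
        S = G @ P_pred @ G.T + R                  # innovation covariance
        K = P_pred @ G.T @ np.linalg.inv(S)       # Kalman gain
        x_filt = x_pred + K @ (y[t] - G @ x_pred)
        P_filt = (np.eye(d) - K @ G) @ P_pred
        filtered[t] = x_filt
        # Predict: propagate the filtered state through the state equation (17).
        x_pred = F @ x_filt
        P_pred = F @ P_filt @ F.T + Q
    return filtered

# Simulated scalar random walk observed with heavy measurement noise (assumption).
rng = np.random.default_rng(4)
n = 200
x = np.cumsum(rng.normal(0.0, 0.1, n))
y = (x + rng.normal(0.0, 1.0, n)).reshape(-1, 1)
F = np.eye(1)
G = np.eye(1)
Q = np.array([[0.01]])
R = np.array([[1.0]])
x_filt = kalman_filter(y, F, G, Q, R, x0=np.zeros(1), P0=np.eye(1))

rmse_obs = np.sqrt(np.mean((y[:, 0] - x) ** 2))
rmse_filt = np.sqrt(np.mean((x_filt[:, 0] - x) ** 2))
```

Comparing `rmse_filt` with `rmse_obs` shows how much the state model helps when the measurement noise is large relative to the process noise.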
Statistical Methods for State-Space Models
I Zhu and Wu, JCGS (2007)
I Liu, Lu, Niu and Wu, Biometrics (2011)
I Liu, Wu, Zhu, Miao, BMC Bioinformatics (2014)
I Chen et al. PLoS ONE (2017), submitted
Extension to SDE and PDE: Possible but Challenging
I Theoretically difficult
I Computationally challenging
I Applications: Not common
Ongoing and Future Research
I High-dimensional ODEs: How to improve accuracy without sacrificing too much on computing?
I Extra-high dimensional ODE: 1000 ODEs with 1 million parameters (Wu, Qiu, Yuan and Wu, 2017, submitted).
I Characteristic analyses of large ODE systems: Controllability and stability analysis with uncertainty in parameter estimation
I Sun, Hu, Wu, Qiu, Linel, Wu, Infectious Disease Modelling 2016
I AI-driven ODE Model Builder
Conclusions
Dynamic Models:
I Practically useful for both understanding associations and predictions
I Both theoretically and computationally challenging
I Statistical methods for dynamic models: More work needed
Dr. Hulin Wu’s Publications on ODE Models by Topics
ODE identifiability
1. Wu, H.*, Zhu, H.+, Miao, H.+, and Perelson, A.S. (2008), Parameter Identifiability and Estimation of HIV/AIDS Dynamic Models, Bulletin of Mathematical Biology, 70(3), 785-799.
2. Miao, H.+, Dykes, C., Demeter, L., Wu, H.* (2009), Differential Equation Modeling of HIV Viral Fitness Experiments: Model Identification, Model Selection, and Multi-Model Inference, Biometrics, 65, 292-300.
3. Liang, H., Miao, H., and Wu, H.* (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
4. Miao, H., Xia, X., Perelson, A.S., Wu, H.* (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1): 3-39.
5. Lee, Y.+ and Wu, H.* (2012), MARS Approach for Global Sensitivity Analysis of Differential Equation Models with Applications to Dynamics of Influenza Infection, Bulletin of Mathematical Biology, 74, 73-90.
6. Wu, H.*, Miao, H., Xue, H., Topham, D.J., Zand, M. (2015), Quantifying Immune Response to Influenza Virus Infection via Multivariate Nonlinear ODE Models with Partially Observed State Variables and Time-Varying Parameters, Statistics in Biosciences, 7(1):147-166.
NLS Estimation of ODE parameters
1. Wu, H.*, Huang, Y.+, Dykes, C., Liu, D., Ma, J., Perelson, A.S., Demeter, L. (2006), Modeling and Estimation of Replication Fitness of HIV-1 in Vitro Experiments Using a Growth Competition Assay, Journal of Virology, 80, 2380-2389.
2. Xue, H.+, Miao, H., Wu, H.* (2010), Sieve Estimation of Constant and Time-Varying Coefficients in Nonlinear Ordinary Differential Equation Models by Considering Both Numerical Error and Measurement Error, Annals of Statistics, 38(4), 2351-87.
Two-stage methods for ODE models
1. Liang, H., Wu, H.*, (2008), Parameter Estimation for Differential Equation Models Using a Framework of Measurement Error in Regression Models, Journal of the American Statistical Association, 103, 1570-1583.
2. Chen, J.+ and Wu, H.* (2008), Efficient Local Estimation for Time-varying Coefficients in Deterministic Dynamic Models with Applications to HIV-1 Dynamics, Journal of the American Statistical Association, 103, 369-384.
3. Fang, Y.+, Wu, H.*, Zhu, L. (2011), A Two-Stage Estimation Method for Random Coefficient Differential Equation Models with Application to Longitudinal HIV Dynamic Data, Statistica Sinica, 21, 1145-1170.
4. Lu, T.+, Liang, H., Li, H., Wu, H.* (2011), High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification, Journal of the American Statistical Association, 106, 1242-1258.
5. Wu, H.*, Xue, H., Kumar A.+ (2012), Numerical Discretization-Based Estimation Methods for Ordinary Differential Equation Models via Penalized Spline Smoothing with Applications in Biomedical Research, Biometrics, 68(2), 344-353.
6. Ding, A.A. and Wu, H.* (2014), Estimation of ODE Parameters Using Constrained Local Polynomial Regression, Statistica Sinica, 24, 1613-1631.
7. Wu, H.*, Miao, H., Xue, H., Topham, D.J., Zand, M. (2015), Quantifying Immune Response to Influenza Virus Infection via Multivariate Nonlinear ODE Models with Partially Observed State Variables and Time-Varying Parameters, Statistics in Biosciences, 7(1):147-166.
Time-varying parameter estimation in ODE Models
1. Chen, J.+ and Wu, H.* (2008), Efficient Local Estimation for Time-varying Coefficients in Deterministic Dynamic Models with Applications to HIV-1 Dynamics, Journal of the American Statistical Association, 103, 369-384.
2. Chen, J.+ and Wu, H.* (2008), Estimation of time-varying parameters in deterministic dynamic models, Statistica Sinica, 18, 987-1006.
3. Liang, H., Miao, H., and Wu, H.* (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
4. Xue, H.+, Miao, H., Wu, H.* (2010), Sieve Estimation of Constant and Time-Varying Coefficients in Nonlinear Ordinary Differential Equation Models by Considering Both Numerical Error and Measurement Error, Annals of Statistics, 38(4), 2351-87.
5. Cao, J., Huang, J.Z., Wu, H. (2012), Penalized Nonlinear Least Squares Estimation of Time-Varying Parameters in Ordinary Differential Equations, Journal of Computational and Graphical Statistics (JCGS), 21(1), 42-56.
Bayesian and mixed-effects ODE modeling approaches for longitudinal data
1. Wu, H.*, Ding, A. and DeGruttola, V. (1998), "Estimation of HIV Dynamic Parameters," Statistics in Medicine, 17, 2463-2485.
2. Wu, H.* and Ding, A. (1999), "Population HIV-1 Dynamics in Vivo: Applicable Models and Inferential Tools for Virological Data from AIDS Clinical Trials," Biometrics, 55, 410-418.
3. Huang, Y.+ and Wu, H.* (2006), A Bayesian Approach for Estimating Antiviral Efficacy in HIV Dynamic Models, Journal of Applied Statistics, 33, 155-174.
4. Huang, Y., Liu, D.+ and Wu, H.* (2006), Hierarchical Bayesian Methods for Estimation of Parameters in a Longitudinal HIV Dynamic System, Biometrics, 62, 413-423.
5. Fang, Y.+, Wu, H.*, Zhu, L. (2011), A Two-Stage Estimation Method for Random Coefficient Differential Equation Models with Application to Longitudinal HIV Dynamic Data, Statistica Sinica, 21, 1145-1170.
High-dimensional ODE models and model selections
1. Lu, T.+, Liang, H., Li, H., Wu, H.* (2011), High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification, Journal of the American Statistical Association, 106, 1242-1258.
2. Wu, H.*, Lu, T.+, Xue, H., and Liang, H. (2014), Sparse Additive ODEs for Dynamic Gene Regulatory Network Modeling, Journal of the American Statistical Association, 109:506, 700-716.
Nonlinear/nonparametric ODE models
1. Wu, H.*, Lu, T.+, Xue, H., and Liang, H. (2014), Sparse Additive ODEs for Dynamic Gene Regulatory Network Modeling, Journal of the American Statistical Association, 109:506, 700-716.
2. Miao, H., Wu, H., and Xue, H. (2014), Generalized Ordinary Differential Equation Models, Journal of the American Statistical Association, 109:508, 1672-1682.
Statistical methods for state-space models
1. Zhu, H.+ and Wu, H.* (2007), Estimation of Smoothing Time-Varying Parameters in State Space Models, Journal of Computational and Graphical Statistics (JCGS), 16(4), 813-832.
2. Liu, D.+, Lu, T.+, Niu, X.F., and Wu, H.* (2011), Mixed-Effects State Space Models for Analysis of Longitudinal Dynamic Systems, Biometrics, 67, 476-485.
ODE experimental design
1. Wu, H.* and Ding, A.A. (2002), "Design of Viral Dynamic Studies for Efficiently Assessing Anti-HIV Therapies in AIDS Clinical Trials," Biometrical Journal, 2, 175-196.
2. Huang, Y. and Wu, H.* (2008), Bayesian Experimental Design for Long-Term Longitudinal HIV Dynamic Studies, Journal of Statistical Planning and Inference, 138, 105-113.
3. Miao, H., Xia, X., Perelson, A.S., Wu, H.* (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1): 3-39.
Dynamic model property analysis with uncertainty
1. Sun, X.+, Hu, F.+, Wu, S., Qiu, X., Linel, P.+, Wu, H.* (2016), Controllability and Stability Analysis of Large Transcriptomic Dynamic Systems for Host Response to Influenza Infection in Human, Infectious Disease Modelling, 1(1), 52-70.
Our recent work in nonlinear/high-dimensional ODE models
I Lu, T., Liang, H., Li, H., Wu, H. (2011), High Dimensional ODEs Coupled with Mixed-Effects Modeling Techniques for Dynamic Gene Regulatory Network Identification, JASA, 106, 1242-1258.
I Wu, H., Xue, H., Kumar A. (2012), Numerical Discretization-Based Estimation Methods for Ordinary Differential Equation Models via Penalized Spline Smoothing with Applications in Biomedical Research, Biometrics, 68(2), 344-353.
I Miao, H., Wu, H., and Xue, H. (2014), Generalized Ordinary Differential Equation Models, JASA, 109:508, 1672-1682.
I Wu, H., Lu, T., Xue, H., and Liang, H. (2014), Sparse Additive ODEs for Dynamic Gene Regulatory Network Modeling, JASA, 109:506, 700-716.
I Wu, S., Liu, Z.P., Qiu, X., and Wu, H. (2014), Modeling genome-wide dynamic regulatory network in mouse lungs with influenza infection using high-dimensional ordinary differential equations, PLOS ONE, 9(5):e95276.
I Linel, P., Wu, S., Deng, N., Wu, H. (2014), Dynamic transcriptional signatures and network responses for clinical symptoms in influenza-infected human subjects using systems biology approaches, Journal of PK/PD, 41, 509-521.
I Qiu, X. et al. (2015), Diversity in Compartmental Dynamics of Gene Regulatory Networks: The Immune Response in Primary Influenza A Infection in Mice, PLoS ONE, 10(9).
Acknowledgement
I More than 30 postdocs, students and collaborators
Thank You!