
Statistical Methods for Bridging Experimental Data and Dynamic Models with Biomedical Applications

Hulin Wu, Ph.D.
Dr. D.R. Seth Family Professor & Associate Chair, Department of Biostatistics, School of Public Health
Professor, School of Biomedical Informatics
University of Texas Health Science Center at Houston
Pittsburgh, March, 2017

Hulin Wu UTSPH March 2017

Outline

1. Introduction
2. Statistical estimation and inference methods for dynamic ODE models
   - Naive method: LS or MLE principle
   - Local solution and time-varying parameter problems
   - Smoothing-based methods
   - Sparse longitudinal data: mixed-effects ODE models
   - Bayesian methods
   - High-dimensional ODE models: ODE model selection
3. Other dynamic models
4. Ongoing and future work
5. Conclusions

Statistical Modeling Cultures

- Leo Breiman (Statistical Science, 2001): Two cultures
  - Data modeling (98% statisticians): What do the data look like? E.g., regression models
  - Algorithmic modeling (2% statisticians): No models, aimed at prediction, e.g., neural nets and decision trees
- A third culture:
  - Mechanistic modeling (<1% statisticians): Build mathematical models based on the mechanisms behind the data
  - How are the data generated?
  - Goal: Understand physics principles or biological mechanisms

Dynamic Systems/Models

Many engineering and biological systems can be described by dynamic models:

- Differential equations:
  - Ordinary differential equations (ODE), the simplest
  - Delay differential equations (DDE)
  - Hybrid differential equations (HDE)
  - Partial differential equations (PDE)
  - Stochastic differential equations (SDE)
- Difference equations and state-space models
- Stochastic process models: branching processes etc.
- Agent-based models and cellular automata
- ...

Modeling Goals

- Forward problems: $\theta \mapsto P_\theta$ (easier to do)
  - Predictions
  - Simulations
- Inverse problems: $Y \mapsto \theta \in \Theta$ (more challenging)
  - Determine model structures/forms
  - Estimate unknown parameters: $\theta$

A Dynamic System: ODE Model

\[ \frac{d}{dt} X(t) = G[X(t), \theta], \quad X(0) = X_0 \tag{1} \]
\[ Y(t_i) = H[X(t_i), \beta] + e(t_i), \tag{2} \]
\[ e(t_i) \sim (0, \sigma^2 I), \quad i = 1, \ldots, n \]

where

- $G(\cdot)$: linear or nonlinear functions
- $H(\cdot)$: observation functions
- $(\theta, \beta)$: unknown parameters
- $e(t_i)$: measurement error

The NLS method:

\[ \min_{\theta, \beta, X_0} \sum_{i=1}^{n} \{Y(t_i) - H[X(t_i; \theta), \beta]\}^T \{Y(t_i) - H[X(t_i; \theta), \beta]\}, \]

where $X(t_i)$ is evaluated numerically from Eq. (1).

Naive NLS Method: Challenging Problems

1. Identifiability problem
2. Local solutions
3. Time-varying parameters
4. Need to solve the forward problem numerically, many times: numerical error vs. measurement error
5. Slow convergence and high computational cost
6. Sparse longitudinal data problem
7. Nonlinear optimization
8. High-dimensional parameter space

These problems motivate new statistical methods for dynamic models.

Identifiability issues

- Theoretical identifiability: Mathematical identifiability
- Practical identifiability: Statistical and numerical identifiability
- Identifiability needs to be investigated before solving the inverse problem
- How to deal with unidentifiable models?
  - Simplify or revise the model
  - Lump some parameters together
  - Fix some parameters
  - Bayesian approach: Use priors

Identifiability issues: References

- Wu, H., Zhu, H., Miao, H., and Perelson, A.S. (2008), Parameter Identifiability and Estimation of HIV/AIDS Dynamic Models, Bulletin of Mathematical Biology, 70(3), 785-799.
- Miao, H., Dykes, C., Demeter, L.M., Cavenaugh, J., Park, S.Y., Perelson, A.S., and Wu, H. (2008), Modeling and Estimation of Kinetic Parameters and Replicative Fitness of HIV-1 from Flow-Cytometry-Based Growth Competition Experiments, Bulletin of Mathematical Biology, 70, 1749-1771.
- Miao, H., Dykes, C., Demeter, L., Wu, H. (2009), Differential Equation Modeling of HIV Viral Fitness Experiments: Model Identification, Model Selection, and Multi-Model Inference, Biometrics, 65, 292-300.
- Liang, H., Miao, H., and Wu, H. (2010), Estimation of constant and time-varying dynamic parameters of HIV infection in a nonlinear differential equation model, Annals of Applied Statistics, 4, 460-483.
- Miao, H., Xia, X., Perelson, A.S., Wu, H. (2011), On Identifiability of Nonlinear ODE Models and Applications in Viral Dynamics, SIAM Review, 53(1), 3-39.

Naive NLS Method: Local solution and numerical error problems

- Local solution problem:
  - Global optimization methods: Differential evolution algorithms and genetic algorithms (Storn et al 1997).
  - Mixture of stochastic global optimization methods and deterministic methods: scatter search method (Rodriguez-Fernandez et al. 2006)
- Numerical error problem:
  - Xue, Miao and Wu (Annals of Statistics, 2010): theoretical results on numerical error vs. measurement error

Naive NLS Method: Time-varying parameter problem

Xue, Miao and Wu, Annals of Statistics (2010)

\[ \frac{dX(t)}{dt} = F\{t, X(t), \theta, \eta(t)\} \]

- The spline approach can be used to approximate the time-varying parameter:
  \[ \eta(t) = \pi(t)^T \alpha, \]
  where $\pi(t) = (B_1(t), \cdots, B_N(t))^T$ is a vector of basis functions.
- The time-varying coefficient ODE model becomes an ODE model with constant parameters:
  \[ \frac{dX(t)}{dt} = F\{t, X(t), \theta, \pi(t)^T \alpha\} \]

Smoothing-Based Approaches: ODE Computational Problem

- Earlier ideas: Hemker (1972) and Varah (1982)
- Two-stage decoupling approaches: Chen and Wu (JASA 2008, Statistica Sinica 2008) and Liang and Wu (JASA, 2008)
- Parameter cascading method: Ramsay et al. JRSS-B (2007) and Wang et al. Stat Comput 2014.
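
For reference, the naive NLS method of Eqs. (1)-(2), whose repeated forward solves motivate the smoothing-based approaches, can be sketched as below. This is only an illustration under stated assumptions: the logistic right-hand side `G`, the identity observation function, the simulated data, and the starting values and bounds are all hypothetical choices, not taken from the talk. Every evaluation of the objective solves the ODE numerically, which is where the computational cost arises.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares


def G(t, x, theta):
    # Hypothetical right-hand side G[X(t), theta]: logistic growth.
    r, K = theta
    return r * x * (1.0 - x / K)


def solve_ode(theta, x0, times):
    # Forward problem: numerical solution X(t_i; theta) of Eq. (1).
    sol = solve_ivp(G, (times[0], times[-1]), [x0], t_eval=times,
                    args=(theta,), rtol=1e-8, atol=1e-10)
    return sol.y[0]


def residuals(params, times, y):
    # Y(t_i) - H[X(t_i; theta)], with H assumed to be the identity.
    r, K, x0 = params
    return y - solve_ode((r, K), x0, times)


# Simulated data (assumed true values r = 0.8, K = 10, X0 = 0.5).
rng = np.random.default_rng(0)
times = np.linspace(0.0, 10.0, 25)
true_params = np.array([0.8, 10.0, 0.5])
y = solve_ode(true_params[:2], true_params[2], times)
y = y + rng.normal(0.0, 0.1, times.size)

# Inverse problem: minimize the sum of squared residuals over (theta, X0).
fit = least_squares(residuals, x0=[0.5, 8.0, 1.0], args=(times, y),
                    bounds=([0.01, 1.0, 0.01], [5.0, 50.0, 5.0]))
print("estimated (r, K, X0):", fit.x)
```

A local optimizer such as this one can stop at a local solution when started far from the truth, which is the issue the global optimization methods listed earlier are meant to address.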

Smoothing-Based Approaches: Two-Stage Method

Chen and Wu (JASA 2008, Statistica Sinica 2008) and Liang and Wu (JASA, 2008):

\[ X'(t_i) = F[X(t_i), \theta] \tag{3} \]
\[ Y(t_i) = X(t_i) + e_1(t_i), \quad e_1(t_i) \sim (0, \sigma^2 I), \tag{4} \]

- Step 1: Use nonparametric smoothing to estimate $X(t)$ and $X'(t)$ from model (4).
- Step 2: Substitute the estimate $\hat{X}(t_i)$ into model (3) to obtain:
  \[ \hat{X}'(t_i) = F[\hat{X}(t_i), \theta] + e_2(t_i). \tag{5} \]
  Then fit the regression model (5) to estimate $\theta$.
- $F(\cdot)$: linear or nonlinear function

Smoothing-Based Approaches: Two-Step Methods

- Step 2 decouples the system of ODEs: Fit the ODEs one by one
- Convert ODE models to regression: Standard regression software tools can be used
- Avoid numerically solving the ODEs
- Computationally fast and efficient: Easy to deal with high-dimensional ODEs
- Price to pay:
  - The derivative estimate may not be accurate
  - The decoupled system: Some information is lost
  - The "coupled" property: destroyed

Extension to higher-order numerical discretization-based algorithms: Wu, Xue and Kumar (Biometrics 2012)

Parameter Cascading or Profiling Method

Ramsay, Hooker, Campbell, Cao, JRSS-B, 2007

Fitting to data

- Observations: $y(t_i)$
- Nonparametric function: $f(t) = \phi(t)^T c$
- Fitting to data: $C_1 = \sum_{i=1}^{n} [y(t_i) - f(t_i)]^2$

Fidelity to DE $x'(t) = g(x \mid \beta)$

- $f'(t) = \phi'(t)^T c$
- Difference between the two sides of the DE: $Lf(t) = f'(t) - g(f(t) \mid \beta)$
- Fidelity to DE: $C_2 = \int [Lf(t)]^2 \, dt$

Criterion to estimate $c$: $J(c \mid \beta) = C_1 + \lambda C_2$

Criterion to estimate $\beta$: $H(\beta) = \sum_{i=1}^{n} [y(t_i) - \phi(t_i)^T \hat{c}(\beta)]^2$

Numerical Comparisons: NLS, Profiling and Two-Stage Estimates

Ding and Wu, Statistica Sinica, 2014

- NLS: Not stable in reaching the global solution, computationally expensive
- Profiling:
  - A 3-step iterative algorithm
  - More stable than NLS in reaching a better solution
  - Computational efficiency: similar to NLS
- Two-Stage Method: Computationally fast, but not accurate.

Sparse Longitudinal Data Problem: Mixed-Effects Modeling Approaches

Deal with sparse data: Borrow information across subjects

- The MLE principle: Nonlinear Mixed-Effects Modeling (NLME)
  - Treat the ODE solution as a nonlinear regression function
  - Computational challenge: Stochastic Approximation EM (SAEM)
- Two-stage smoothing-based mixed-effects modeling approaches
  - Fang, Wu and Zhu, Statistica Sinica (2011)
  - Linear ODE: Linear mixed-effects model (LME)
  - Nonlinear ODE: NLME
- Bayes methods
  - A three-stage hierarchical model: implemented by MCMC
  - Computation: expensive

Mixed-Effects ODE Model: NLME

- Within-subject variation:
  \[ \frac{d}{dt} X_i(t) = G[X_i(t), \theta_i], \quad X_i(0) = X_{i0} \tag{6} \]
  \[ Y_i(t_i) = H_i[X_i(t_i), \theta_i] + e_i(t_i), \quad i = 1, \ldots, n \]
  - $X_i(t_i)$: ODE solution for Subject $i$.
  - $Y_i = (y_{i1}(t_1), \cdots, y_{i m_i}(t_{m_i}))^T$: Data from Subject $i$
  - $e_i = (e_i(t_1), \cdots, e_i(t_{m_i}))^T \sim N(0, \sigma^2 I_{m_i})$: Measurement error
- Between-subject variation: $\theta_i = \mu + b_i$, $[b_i \mid \Sigma] \sim N(0, \Sigma)$
  - $\mu$: population parameter
  - $b_i$: random effects
- Estimation and inference: Stochastic Approximation EM (SAEM)
  - Delyon, Lavielle and Moulines (1999), Kuhn and Lavielle (2005), Grenier, Louvet, Vigneaux (2014)

Smoothing-based Two-Stage Mixed-Effects Model

Fang, Wu and Zhu, Statistica Sinica (2011):

\[ X'(t_i) = F[X(t_i), \theta] \tag{7} \]
\[ Y(t_i) = X(t_i) + e_1(t_i), \quad e_1(t_i) \sim (0, \sigma^2 I), \tag{8} \]

- Step 1: Use nonparametric smoothing to estimate $X(t)$ and $X'(t)$ from model (8).
- Step 2: Substitute the estimate $\hat{X}(t_i)$ into model (7) to obtain:
  \[ \hat{X}'(t_i) = F[\hat{X}(t_i), \theta] + e_2(t_i). \tag{9} \]
- Convert model (9) into an LME or NLME model, according to whether $F(\cdot)$ is linear or nonlinear.
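
Steps 1 and 2 can be sketched for a single subject as follows. This is a minimal illustration, not the authors' implementation: the logistic form of $F$, the simulated data, the use of a smoothing spline as the nonparametric smoother, and the choice of smoothing level from a known noise variance are all assumptions made here. In the mixed-effects version, the ordinary least-squares fit in Step 2 would be replaced by an LME or NLME fit across subjects.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Simulated data from an assumed logistic ODE X' = r X (1 - X / K),
# observed with additive noise as in model (8).
rng = np.random.default_rng(1)
r_true, K_true, x0, sigma = 0.8, 10.0, 0.5, 0.1
t = np.linspace(0.0, 10.0, 60)
x_true = K_true / (1.0 + (K_true / x0 - 1.0) * np.exp(-r_true * t))
y = x_true + rng.normal(0.0, sigma, t.size)

# Step 1: nonparametric smoothing gives estimates of X(t) and X'(t).
# The smoothing level s = n * sigma^2 assumes the noise variance is known.
spline = UnivariateSpline(t, y, k=4, s=t.size * sigma**2)
x_hat = spline(t)
dx_hat = spline.derivative()(t)

# Step 2: plug the estimates into the ODE and fit it as a regression.
# Here F is linear in (a, b): X' = a X + b X^2, with a = r and b = -r / K,
# so ordinary least squares suffices and no ODE solver is needed.
design = np.column_stack([x_hat, x_hat**2])
(a_hat, b_hat), *_ = np.linalg.lstsq(design, dx_hat, rcond=None)
r_hat = a_hat
K_hat = -a_hat / b_hat
print("estimated r:", r_hat, "estimated K:", K_hat)
```

The estimates depend on how well the derivative is recovered in Step 1, which is the accuracy trade-off noted for two-stage methods.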