JournalofMachineLearningResearch11(2010)2745-2783 Submitted 5/10; Revised 9/10; Published 10/10
Mean Field Variational Approximation for Continuous-Time Bayesian Networks∗
Ido Cohn† IDO [email protected] Tal El-Hay† [email protected] Nir Friedman [email protected] School of Computer Science and Engineering The Hebrew University Jerusalem 91904, Israel
Raz Kupferman [email protected] Institute of Mathematics The Hebrew University Jerusalem 91904, Israel
Editor: Manfred Opper
Abstract Continuous-time Bayesian networks is a natural structured representation language for multi- component stochastic processes that evolve continuously over time. Despite the compact represen- tation provided by this language, inference in such models is intractable even in relatively simple structured networks. We introduce a mean field variational approximation in which we use a prod- uct of inhomogeneous Markov processes to approximate a joint distribution over trajectories. This variational approach leads to a globally consistent distribution, which can be efficiently queried. Additionally, it provides a lower bound on the probability of observations, thus making it attractive for learning tasks. Here we describe the theoretical foundations for the approximation, an efficient implementation that exploits the wide range of highly optimized ordinary differential equations (ODE) solvers, experimentally explore characterizations of processes for which this approximation is suitable, and show applications to a large-scale real-world inference problem. Keywords: continuous time Markov processes, continuous time Bayesian networks, variational approximations, mean field approximation
1. Introduction
Many real-life processes can be naturally thought of as evolving continuously in time. Examples cover a diverse range, starting with classical and modern physics, but also including robotics (Ng et al., 2005), computer networks (Simma et al., 2008), social networks (Fan and Shelton, 2009), gene expression (Lipshtat et al., 2005), biological evolution (El-Hay et al., 2006), and ecological systems (Opper and Sanguinetti, 2007). A joint characteristic of all above examples is that they are complex systems composed of multiple components (e.g., many servers in a server farm and multiple residues in a protein sequence). To realistically model such processes and use them in
∗. A preliminary version of this paper appeared in the Proceedings of the Twenty Fifth Conference on Uncertainty in Artificial Intelligence, 2009 (UAI 09). †. These authors contributed equally.