Parallel Implementations of Random Time Algorithm for Chemical Network Stochastic Simulations
Chuanbo Liu (a), Jin Wang (a,b,∗)

(a) State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Jilin, People’s Republic of China
(b) Department of Chemistry, Physics and Applied Mathematics, State University of New York at Stony Brook, Stony Brook, USA
∗ To whom correspondence should be addressed. Email: [email protected]

arXiv:2103.00405v1 [q-bio.MN] 28 Feb 2021. Preprint submitted to arxiv March 2, 2021.

Abstract

In this study, we have developed a parallel version of the random time simulation algorithm. First, we give a rigorous basis for the random time description of the stochastic process of chemical reaction network time evolution. We then review the random time simulation algorithm and give implementations of the parallel version of the next reaction random time algorithm. The discussion of computational complexity suggests that the random time simulation algorithm reduces computation time by a factor of M (the connection number of the network) compared with other exact stochastic simulation algorithms, such as the Gillespie algorithm. For large-scale systems, such as the protein-protein interaction network, M is on the order of $10^8$. We further demonstrate the power of random time simulation with a GPGPU parallel implementation, which achieved roughly 100-fold acceleration compared with CPU implementations. The stochastic simulation method developed here can therefore be of great value for simulating the time evolution of large-scale networks.

Keywords: Random time, Stochastic simulations, Parallel algorithm, GPGPU

1. Introduction

Chemical reaction network time evolution is intrinsically stochastic. Starting with Gillespie [1], many algorithms have been developed for stochastic simulation of the time evolution of chemical reaction networks, such as the First Reaction Method (FRM) and the Direct Method (DM). Improved methods have also been developed to accelerate the simulation. The Optimized Direct Method (ODM) computes $\sum a_j$ incrementally. The Next Reaction Method (NRM) uses a single random number for each simulation step. Others, such as the Sorting Direct Method (SDM), Logarithmic Direct Method (LDM), Partial-propensity Stochastic Simulation Algorithm (PSSA), PSSA-composition rejection (PSSA-CR), and Sorting Partial-propensity Direct Method (SPDM), focus on different aspects of the simulation steps.

The development of hardware, especially General Purpose Graphics Processing Units (GPGPU), gives the opportunity to develop parallel algorithms that take advantage of the large number of thread blocks (TBs). Various data structures and implementations have also been developed to accelerate stochastic simulation at the fine-grid or coarse-grid level. These implementations can achieve about 10-fold acceleration compared with mature CPU-based simulation toolkits such as StochKit. Here we demonstrate a new implementation of the random time algorithm with GPGPU for stochastic simulations. By carefully arranging the data distribution on global and local memories, we can accelerate the stochastic simulation in a data-parallel manner by roughly 100-fold. In this paper, we first briefly review the methods of random time simulation, then discuss the relative computational complexity. Finally, we demonstrate the methods with an oscillating predator-prey model.

2. A brief introduction to the random time approach

In this part we follow [2], while aiming to give a rigorous, generalized approach.

Consider $N \geq 1$ chemical species in a well-mixed system. Also consider $M \geq 1$ reactions, labeled with $k$; the stoichiometric number of the $j$-th chemical species in the $k$-th reaction is $\sigma_{jk}$. The simplest stochastic model for describing the system is a continuous time Markov chain. The state is represented by the molecule numbers of the chemical species, $X(t) = \{X_1, X_2, \ldots, X_N\}$, and reactions are modeled as possible transitions of the chain. It was shown by Gillespie [6, 1] that in a well-mixed system, the probability that a specific reaction takes place is governed by the propensity function $a_k(X(t), t)$ when considering time-inhomogeneous chemical reaction networks. The propensity function is

$$a_k(X(t), t) = c_k(t)\, h_k(X(t)) \tag{1}$$

where

$$h_k(X(t)) = \prod_j \binom{X_j(t)}{\sigma_{jk}} = \prod_j \frac{X_j(t)!}{\sigma_{jk}!\,(X_j(t) - \sigma_{jk})!} \tag{2}$$

When considering the time evolution, the state change vector for the $k$-th jump is $v_k$. With the Markov property, the conditional probability can be expressed as

$$P\{k\text{-th reaction fired once in } (t, t+\Delta t) \mid \mathcal{F}_t\} = a_k(X(t), t)\,\Delta t \tag{3}$$

where $\mathcal{F}_t$ is the σ-algebra representing the information about the system that is available at time $t$ [7]. A consequence of Eq. 3 is that the probability of two reactions occurring at the same time is on the order of $(\Delta t)^2$, which means it is very unlikely to happen. This can be shown by simply calculating the joint probability as

$$\begin{aligned}
P\{R_k(t+\Delta t) - R_k(t) \geq 2\} &= P\{R_k(t+\Delta t) - R_k(t)\}^2 \\
&\sim P\{R_k(t+\Delta t) - R_k(t) \geq 1,\; R_j(t+\Delta t) - R_j(t) \geq 1\} \\
&= P\{R_k(t+\Delta t) - R_k(t)\} \cdot P\{R_j(t+\Delta t) - R_j(t)\} \\
&\sim O((\Delta t)^2)
\end{aligned} \tag{4}$$

The system evolution can be described equivalently as a counting process by replacing the random variables: instead of the molecule numbers of the chemical species, we track the number of times each specific reaction has fired. If $R_k(t)$ is the number of times that the $k$-th reaction has fired up to time $t$, then the state at $t$ that originated from $X(0)$ is given by

$$X(t) = X(0) + \sum_{k=1}^{M} R_k(t)\, v_k \tag{5}$$

This counting process is a generalized Poisson process $G(\lambda(t), t)$ in the sense that the arrival rate $\lambda$ is a function of time $t$. The total number of arrivals for the generalized Poisson process $G$ can be calculated as

$$R(t) = \int_0^t dG(\lambda(s), s) \tag{6}$$

The Poisson distribution with parameter $\lambda$ in the time interval $(t, s]$ describes the number of arrivals of a Poisson process $N$ in this time interval. So we have

$$P\{N(\lambda, s) - N(\lambda, t) = k\} = \frac{(\lambda (s-t))^k}{k!}\, e^{-\lambda (s-t)} \tag{7}$$

Therefore the probability that $k$ events happen in the time interval $(t, t+\Delta t]$ is

$$\begin{aligned}
P\{R(t+\Delta t) - R(t) = k \mid \mathcal{F}_t\}
&= P\left\{\int_t^{t+\Delta t} dG(\lambda(s), s) = k \,\middle|\, \mathcal{F}_t\right\} \\
&= P\{G(\lambda(t), t+\Delta t) - G(\lambda(t), t) = k \mid \mathcal{F}_t\} \\
&= \frac{(\lambda(t)\Delta t)^k}{k!}\, e^{-\lambda(t)\Delta t} \\
&= \frac{\left(\int_t^{t+\Delta t} \lambda(s)\,ds\right)^k}{k!}\, e^{-\int_t^{t+\Delta t} \lambda(s)\,ds} \\
&= P\left\{Y\!\left(\int_0^{t+\Delta t} \lambda(s)\,ds\right) - Y\!\left(\int_0^{t} \lambda(s)\,ds\right) = k \,\middle|\, \mathcal{F}_t\right\}
\end{aligned} \tag{8}$$

where $Y$ is a unit rate Poisson process. The trick here is that $\lambda$ does not change in the infinitesimal time interval $(t, t+\Delta t]$. It can also be noticed that

$$\lim_{\Delta t \to 0} \int_t^{t+\Delta t} \lambda(s)\,ds = \lim_{\Delta t \to 0} \lambda(t)\,\Delta t \tag{9}$$

With Eq. 7, the result is obvious. Eq. 8 suggests

$$R(t) = Y\!\left(\int_0^t \lambda(s)\,ds\right) \tag{10}$$

Next we establish the relationship between $\lambda$ and the propensity function $a_k$. We can rewrite the left side of Eq. 3 as

$$\begin{aligned}
P\{R_k(t+\Delta t) - R_k(t) = 1 \mid \mathcal{F}_t\}
&= \left(\int_t^{t+\Delta t} \lambda_k(s)\,ds\right) e^{-\int_t^{t+\Delta t} \lambda_k(s)\,ds} \\
&= \left(\int_t^{t+\Delta t} \lambda_k(s)\,ds\right) + O((\Delta t)^2)
\end{aligned} \tag{11}$$

Comparing with the right side of Eq. 3, we have

$$\int_t^{t+\Delta t} \lambda_k(s)\,ds = a_k(X(t), t)\,\Delta t \tag{12}$$

which means

$$\lambda_k(s) = a_k(X(t), t) \tag{13}$$

Therefore, from Eq. 5, Eq. 10 and Eq. 13, the system evolution equation is

$$X(t) = X(0) + \sum_{k=1}^{M} Y_k\!\left(\int_0^t a_k(X(s), s)\,ds\right) v_k \tag{14}$$

So we have represented the chemical reaction network evolution process as an incremental counting process. Every infinitesimal time frame of this stochastic process can be further decomposed into $M$ independent unit rate Poisson processes. Eq. 14 can be rewritten in the same form as Eq. 5 by introducing the “internal time” for each chemical reaction channel. The internal time $T_k(t)$ is defined as

$$T_k(t) = \int_0^t a_k(X(s), s)\,ds \tag{15}$$

and Eq. 5 becomes

$$X(t) = X(0) + \sum_{k=1}^{M} Y_k(T_k(t))\, v_k \tag{16}$$

This is where the random time notion comes from.

3. Random time simulation algorithm

In order to perform a complete stochastic simulation, two questions must be answered first:

1. how much time passes before one of the stochastic processes, $Y_k$, fires;
2. which $Y_k$ fires at that later time.

If we view the state of chemical reaction network as points

Eq. 21 tells us that if a random number $r$ is uniformly distributed on $[0, 1]$, then the time interval is

$$\Delta t = \ln(1/r) \tag{22}$$

In the random time representation of the stochastic model of a chemical reaction network, for every infinitesimal time frame the stochastic process can be described by a counting process and further decomposed into unit rate Poisson processes. All these unit rate Poisson processes are independent and remain stationary until some chemical reaction channel fires. For the $k$-th channel, following the same argument from Eq. 17 to Eq. 20, the internal time has the same distribution formula as Eq. 22:

$$\Delta T_k(t) = \ln(1/r_k) \tag{23}$$

where $r_k$ is a random number uniformly distributed on $[0, 1]$. So if a set of random numbers $\{r_1, r_2, \ldots, r_M\}$ is given, the corresponding real time for the $k$-th chemical channel can be calculated from Eq.