Received February 1, 2013, accepted September 2, 2013, date of publication September 16, 2013, date of current version September 24, 2013.

Digital Object Identifier 10.1109/ACCESS.2013.2281668

Approximating Extremely Large Networks via Continuum Limits

YANG ZHANG1, EDWIN K. P. CHONG1 (Fellow, IEEE), JAN HANNIG2, AND DONALD ESTEP3

1Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, CO 80523-1373, USA
2Department of Statistics and Operations Research, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3260, USA
3Department of Mathematics and Department of Statistics, Colorado State University, Fort Collins, CO 80523-1373, USA

Corresponding author: Y. Zhang ([email protected])

The work of Y. Zhang and E. K. P. Chong was supported in part by the National Science Foundation (NSF) under Grant ECCS-0700559 and by ONR under Grant N00014-08-1-110. The work of J. Hannig was supported in part by NSF under Grants 1007543 and 1016441. The work of D. Estep was supported in part by the Defense Threat Reduction Agency under Grant HDTRA1-09-1-0036; the Department of Energy under Grants DE-FG02-04ER25620, DE-FG02-05ER25699, DE-FC02-07ER54909, DE-SC0001724, DE-SC0005304, and INL00120133; the Lawrence Livermore National Laboratory under Grants B573139, B584647, and B590495; the National Aeronautics and Space Administration under Grant NNG04GH63G; the National Institutes of Health under Grant 5R01GM096192-02; the National Science Foundation under Grants DMS-0107832, DMS-0715135, DGE-0221595003, MSPA-CSE-0434354, ECCS-0700559, DMS-1016268, and DMS-FRG-1065046; and the Idaho National Laboratory under Grants 00069249 and 00115474.

ABSTRACT This paper is concerned with the modeling of networks with an extremely large number of components using partial differential equations (PDEs). This modeling method is based on the convergence of a sequence of underlying Markov chains of the network indexed by N, the number of components in the network. As N goes to infinity, the sequence converges to a continuum limit, which is the solution of a certain PDE. We provide sufficient conditions for the convergence and characterize the rate of convergence. As an application, we model large wireless sensor networks by PDEs. While traditional Monte Carlo simulation for extremely large networks is practically infeasible, PDEs can be solved with reasonable computational overhead using well-established mathematical tools.

INDEX TERMS Modeling, partial differential equations, Markov processes, network modeling.

I. INTRODUCTION

This paper is concerned with modeling of networks involving an extremely large number of components. The conventional way to study large networks is by computer modeling and simulation [1]. The approach involves representing the network in computer software and then applying a numerical simulation method to study how the network behaves. Typically, each individual component is explicitly represented as a separate entity. As we are confronted with larger and larger networks, the number of components that have to be represented increases, and this significantly lengthens the time it takes to write, manage, and run computer simulation programs. Simulating large networks typically requires expensive, highly sophisticated supercomputers involving large parallel computing hardware with specialized software. It is not uncommon for a simulation run to take days or weeks, even on a large supercomputer. The larger the network, the longer it takes. The computational overhead associated with direct simulation thus severely limits the size and complexity of networks that can be studied in this fashion.

Our recent papers [2]–[5] address this problem by using continuum modeling to capture the global characteristics of large networks. In large networks, we are often more interested in the global characteristics of an entire network than in a particular individual component. Continuum models do away with the need to represent each individual component of a large network as a separate entity, and consider the behavior of the components on the scale of the aggregate rather than of the individual. Similar to treating water as a continuous fluid instead of a large number of individual molecules, continuum modeling treats the large number of communicating components (or nodes) in a network collectively as a continuum.

The continuum modeling strategies in [3]–[5] use partial differential equations (PDEs) to approximate large sensor or cellular networks modeled by a certain class of Markov chains. The PDE model represents the global characteristics of the network, while the individual characteristics of the components enter the model through the form and the parameters of the PDE.

PDEs are well-suited to the modeling of continuum behavior. Although uncommon in modeling networks, they are common in modeling many physical phenomena, including heat, sound, electromagnetism, and fluid flow. There are well-established mathematical tools to solve PDEs, such as the finite element method [6] and the finite difference method [7], incorporated into computer software packages such as Matlab and Comsol. We can use these tools to greatly reduce computation time. As a result, the effort to run the PDE models in a computer no longer suffers from the curse of sheer size. (In fact, as we will show, the larger the network, the closer the PDE approximates it.) Continuum modeling thus provides a powerful way to deal with the number of components in large networks. This, in turn, would make it possible to carry out—with reasonable computational burden even for extremely large systems—network performance evaluation and prototyping, network design, systematic parameter studies, and optimization of network characteristics.

The work in this paper is motivated by the continuum modeling strategies in the papers [3]–[5] mentioned above, and by the need for a rigorous description of the heuristic limiting process underlying the construction of their PDE models. We analyze the convergence of a class of Markov chains to their continuum limits, which are the solutions of certain PDEs. We consider a general model in an abstract setting instead of that of any particular network model. We do this for two reasons: first, our network modeling results involve a class of Markov chains modeling a variety of communication networks; second, Markov chain models akin to ours arise in several other contexts. For example, a very recent paper [8] on human crowd modeling derives a limiting PDE in a fashion similar to our approach.

In the convergence analysis, we show that a sequence of Markov chains indexed by N, the number of components in the system that they model, converges in a certain sense to its continuum limit, which is the solution of a time-dependent PDE, as N goes to ∞. The PDE solution describes the global spatio-temporal behavior of the model in the limit of large system size. We apply this abstract result to the modeling of a large wireless sensor network by approximating a particular global aspect of the network states (queue length) by a nonlinear convection-diffusion-reaction PDE. This network model includes the network example discussed in [3] as a special case.

A. RELATED LITERATURE

The modeling and analysis of stochastic systems such as networks is a large field of research, and many of the previous contributions share goals with the work in this paper.

In the field of direct numerical simulation approaches, many efforts have been made to accelerate the simulation. For example, parallel simulation techniques have been developed to exploit the computation power of multiprocessor and/or cluster platforms [9]–[12]; new mechanisms for executing the simulation have been designed to improve the efficiency of event handling in event-driven simulations (see, e.g., [13], [14]); and fluid simulations, in contrast to traditional packet-level ones, have been used to simplify the network model by treating network traffic (not nodes) as continuous flows rather than discrete packets [15]–[18]. However, as the number of nodes in the network grows extremely large, computer-based simulations involving individual nodes eventually become practically infeasible. For the remainder of this subsection, we review some existing results on analysis of stochastic networks that do not depend on direct numerical simulation.

Our convergence analysis in this paper uses Kushner's ordinary differential equation (ODE) method [19]. This method essentially studies a ''smoothing'' limit as a certain ''averaging'' parameter goes to ∞, but not a ''large-system'' limit as the number of components in the system goes to ∞. In contrast, the limiting process analyzed in this paper involves two steps: the first similar to that in Kushner's ODE method, and the second a ''large-system'' limit. (We provide more details about the two-step procedure later in Sec. I-D.) In other words, while Kushner's method deals with a fixed state space, we treat a sequence of state spaces {R^N} indexed by increasing N, where N is the number of components in the system.

Kushner's ODE method is closely related to the line of research called stochastic approximation, started by Robbins and Monro [20] and Kiefer and Wolfowitz [21] in the early 1950s, which studies stochastic processes similar to those addressed by Kushner's ODE method, and has been widely used in many areas (see, e.g., [22], [23], for surveys). Among the numerous following efforts, several ODE methods including that of Kushner were first developed in the 1970s (see, e.g., [24], [25]) and extensively studied thereafter (see, e.g., [26]–[28]), many times addressing problems outside of the category of stochastic approximation (see, e.g., [19]).

The general subject of the approximation of Markov chains (or equivalently, the convergence of sequences of Markov chains to certain limits) goes beyond the scope of ODE methods or stochastic approximation, and there are results on the convergence of different models to different limits. A huge class of Markov chains (discrete-time or continuous-time) that model various systems, phenomena, and abstract problems, hence having in general very different forms from ours, have been shown to converge either to ODE solutions [29]–[31] (and more generally, abstract Cauchy problems [32]), or to stochastic processes like diffusion processes [19], [33]. These results use methods different from Kushner's, but share with it the principal idea of ''averaging out'' the randomness of the Markov chain. Their deeper connection lies in weak convergence theory [19], [33], [34] and methods to prove such convergence that they have in common: the operator semigroup convergence theorem [35], the martingale characterization method [36], and identification of the limit as the solution to a stochastic differential equation [19], [33].

The reader is referred to [19], [33] and the references therein for additional information on these methods.

Similar to Kushner's ODE method, all the convergence results discussed above differ from our approach in the sense that they essentially study only the single-step ''smoothing'' limit as an ''averaging'' parameter goes to ∞, but do not have our second-step convergence to the ''large-system'' limit (PDE) (as N → ∞). There are systems in which the ''averaging'' parameter represents some ''size'' of the system (e.g., population in the epidemic model, the metapopulations model, and the searching parasites problem [31], [32]). However, it is still the case that the convergence requires a fixed dimension of the state space of the Markov chain, like the case of Kushner's ODE convergence, and does not apply to the ''large-system'' limit in our second step. For example, in the epidemic model, the Markov chain represents the number of people in a population in two states: infected and uninfected, and the large population limit is studied. This is a single-step limit and the state space of the Markov chain is always in R^2. Notice that in these cases, the Markov chains model the number or proportion of the components in different states in the system, and unlike our model, the indexing or locations of the components are either unimportant or ignored. In contrast, in our case, the spatial index of nodes is addressed throughout.

In fact, a variety of other approximation methods for large systems, in general built on different ideas from the aforementioned ones, take a similar direction: they study the number or proportion of components in a certain state (or some related performance parameters), thus ignoring their order or the difference in their spatial locations. For example, the famous work of Gupta and Kumar [37], followed by many others (e.g., [38]–[41]), derives scaling laws of network performance parameters (e.g., throughput); and many efforts based on mean field theory [42]–[51] or on the theory of large deviations [52]–[56] study the convergence with regard to the so-called empirical (or occupancy) measure or distribution, which essentially represents the proportion of components in certain states, to a deterministic function, as the number of components grows large, treating the components as exchangeable in terms of their order or spatial indices. These approaches differ from our work at least in the sense that they only study the statistical instead of the spatio-temporal characteristics of the system. As a result of treating the components without regard to their locations, when the limits obtained by these approaches are in fact differential equations, they are usually ODEs instead of PDEs. Note that the statistical parameters studied in these works correspond to some deterministic quantities easily obtained from our deterministic limits that directly approximate the state of the systems. For example, the proportion of nodes with, say, empty queues in our network model can be directly calculated from the limiting PDE solution (in addition, their locations are directly observable); and the instantaneous throughput can be obtained by integrating the PDE solution at a certain time over the spatial domain.

Of course, there do exist numerous continuum models in a wide spectrum of areas such as physics, chemistry, ecology, economics, transportation, and sociology (e.g., [8], [57]–[63]), many of which use PDEs to formulate spatio-temporal phenomena and approximate quantities such as the probability density of particle velocity in thermodynamic systems, the concentration of reactants in chemical reactions, the population density in animal swarms, the wealth of a company in consumption-investment processes, the car density on highways, and the density of people in human crowds. All these works differ from the work presented here both by the properties of the system being studied and the analytic approaches. In addition, most of them study distributions of limiting processes that are random, while our limiting functions themselves are deterministic. We especially emphasize the difference between our results and those of the mathematical physics of hydrodynamics [64]–[69], because the latter have a similar style by deducing macroscopic behavior from microscopic interactions of individual particles, and in some special cases result in similar PDEs. However, they use an entirely different approach, which usually requires different assumptions on the systems such as translation invariant transition probabilities, conservation of the number of particles, and particular distributions of the initial state; and their limiting PDE is not the direct approximation of system state, but the density of some associated probability measure.

There is a vast literature on the convergence of a large variety of network models different from ours, to essentially two kinds of limits: the fluid limit (or functional law of large numbers approximation) [70]–[79] and the diffusion approximation (or functional central limit theorem approximation), under the so-called fluid and diffusion scalings, respectively, with the latter limit mostly studied in networks in heavy traffic [80]–[91]. (Some papers study both limits [92]–[94].) Unlike our work, this field of research focuses primarily on networks with a fixed number of nodes.

Our work is to be distinguished from approaches where the model is constructed to be a continuum representation from the start. For example, many papers treat nodes as a continuum by considering only the average density of nodes [95]–[102]; and others model network traffic as a continuum by capturing certain average characteristics of the data packet traffic, with the averaging being over possibly different time scales [103]–[105]. The latter shares a similar idea with fluid simulations discussed at the beginning of this section.

B. MARKOV CHAIN MODEL

We first describe our model in full generality. Consider N points V_N = {v_N(1), ..., v_N(N)} in a compact, convex Euclidean domain D representing a spatial region. We assume that these points form a uniform grid, though the model generalizes to nonuniform spacing of points under certain conditions (see Sec. IV for discussion). We refer to these N points in D as grid points. We consider a discrete-time Markov chain

X_{N,M}(k) = [X_{N,M}(k, 1), ..., X_{N,M}(k, N)]^⊤ ∈ R^N

(the superscript ⊤ represents transpose) whose evolution is described by the stochastic difference equation

X_{N,M}(k + 1) = X_{N,M}(k) + F_N(X_{N,M}(k)/M, U_N(k)).   (1)

Here, X_{N,M}(k, n) is the real-valued state associated with the grid point v_N(n) at time k, where n = 1, ..., N is a spatial index and k = 0, 1, ... is a temporal index; U_N(k) are i.i.d. random vectors that do not depend on the state X_{N,M}(k); M is an ''averaging'' parameter (explained later); and F_N is a given function. Treating N and M as indices that grow, the equation (1) defines a doubly indexed family X_{N,M}(k) of Markov chains indexed by both N and M. (We will later take M to be a function of N, and treat this family as a sequence X_N(k) of the single index N.) Below we give a concrete example of a system described by (1).

FIGURE 1. (a) An illustration of a wireless sensor network over a two-dimensional domain. Destination nodes are located at the far edge. We show the possible path of a message originating from a node located in the left-front region. (b) An illustration of the collision protocol: reception at a node fails when one of its other neighbors transmits (regardless of the intended receiver). (c) An illustration of the time evolution of the queues in the one-dimensional network model.

C. A STOCHASTIC NETWORK MODEL

In this subsection we demonstrate the various objects in the abstract Markov chain model analyzed in this paper on a prototypical example. We begin by describing a stochastic model of a wireless sensor network.

Consider a network of N wireless sensor nodes uniformly placed over the domain D. That is, the N nodes are located on the grid points V_N = {v_N(1), ..., v_N(N)} described above. We label the node at v_N(n) by n, where n = 1, ..., N. The sensor nodes generate, according to a probability distribution, data messages that need to be communicated to the destination nodes located on the boundary of the domain, which represent specialized devices that collect the sensor data. The sensor nodes also serve as relays for routing messages to the destination nodes. Each sensor node has the capacity to store messages in a queue, and is capable of either transmitting or receiving messages to or from its immediate neighbors. (Generalization to further ranges of transmission can be found in our paper [106].) At each time instant k = 0, 1, ..., each sensor node probabilistically decides to be a transmitter or receiver, but not both. This simplified rule of transmission allows for a relatively simple representation. We illustrate such a network over a two-dimensional domain in Fig. 1(a).

In this network, communication between nodes is interference-limited because all nodes share the same wireless channel. We assume a simple collision protocol: a transmission from a transmitter to a neighboring receiver is successful if and only if none of the other neighbors of the receiver is a transmitter, as illustrated in Fig. 1(b). We assume that in a successful transmission, one message is transmitted from the transmitter to the receiver.

We assume that the probability that a node decides to be a transmitter is a function of its normalized queue length (normalized by an ''averaging'' parameter M). That is, at time k, node n decides to be a transmitter with probability W(n, X_{N,M}(k, n)/M), where X_{N,M}(k, n) is the queue length of node n at time k, and W is a given function.

In this section, for the sake of explanation, we simplify the problem even further and consider a one-dimensional domain (a two-dimensional example will be given in Sec. II-E.3). Here, N sensor nodes are equidistributed in an interval D ⊂ R and labeled by n = 1, ..., N. The destination nodes are located on the boundary of D, labeled by n = 0 and n = N + 1.

We assume that if node n is a transmitter at a certain time instant, it randomly chooses to transmit one message to the right or the left immediate neighbor with probability P_r(n) and P_l(n), respectively, where P_r(n) + P_l(n) ≤ 1. In contrast to strict equality, the inequality here allows for a more general stochastic model of transmission: after a sensor node randomly decides to transmit over the wireless channel, there is still a positive probability that the message is not transferred to its intended receiver (what might be called an ''outage'').

The special destination nodes at the boundaries of the domain do not have queues; they simply receive any message transmitted to them and never themselves transmit anything. We illustrate the time evolution of the queues in the network in Fig. 1(c).

The queue lengths

X_{N,M}(k) = [X_{N,M}(k, 1), ..., X_{N,M}(k, N)]^⊤ ∈ R^N

form a Markov chain network model given by (1), where

U_N(k) = [Q(k, 1), ..., Q(k, N), T(k, 1), ..., T(k, N), G(k, 1), ..., G(k, N)]^⊤

is a random vector comprising independent random variables: Q(k, n) are uniform random variables on [0, 1] used to determine if the node is a transmitter or not; T(k, n) are ternary random variables used to determine the direction a message is passed, which take values R, L, and S (representing transmitting to the right, the left, and neither, respectively) with probabilities P_r(n), P_l(n), and 1 − (P_r(n) + P_l(n)), respectively; and G(k, n) are the number of messages generated at node n at time k. We model G(k, n) by independent Poisson random variables with mean g(n).
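To make the protocol concrete, the following is a minimal simulation sketch of one time step of the one-dimensional network just described (transmitter/receiver decisions, direction choice, collision check, and Poisson message generation); the formal definition of F_N is given in (2) below. This is only an illustration of the model as we read it, not the authors' simulation code: the function and variable names are ours, and the destination nodes are handled through the convention that boundary states are zero and never transmit.

```python
import numpy as np

def simulate_step(X, M, Pr, Pl, g, rng):
    """One step of the 1-D network Markov chain X(k+1) = X(k) + F_N(X(k)/M, U(k)).

    X      : integer queue lengths of nodes 1..N (length-N array)
    M      : averaging parameter; transmit probability W(n, y) = min(1, y)
    Pr, Pl : length-N arrays of right/left transmission probabilities
    g      : length-N array of Poisson message-generation means
    """
    X = np.asarray(X)
    N = len(X)
    Q = rng.uniform(size=N)                       # decides transmitter vs. receiver
    W = np.minimum(1.0, X / M)                    # transmit probability per node
    is_tx = Q < W                                 # True -> node is a transmitter
    U = rng.uniform(size=N)                       # direction: R, L, or S (stay silent)
    direction = np.where(U < Pr, 'R', np.where(U < Pr + Pl, 'L', 'S'))
    G = rng.poisson(g)                            # newly generated messages

    newX = X + G
    for n in range(N):                            # attempt transmissions
        if not is_tx[n] or direction[n] == 'S' or X[n] == 0:
            continue                              # (W = 0 when the queue is empty)
        r = n + 1 if direction[n] == 'R' else n - 1   # intended receiver
        if r < 0 or r >= N:                       # destination node: always succeeds
            newX[n] -= 1
            continue
        if is_tx[r]:                              # receiver is itself transmitting
            continue
        # collision protocol: reception fails if any *other* neighbor of r transmits
        others = [m for m in (r - 1, r + 1) if 0 <= m < N and m != n]
        if any(is_tx[m] for m in others):
            continue
        newX[n] -= 1
        newX[r] += 1
    return newX
```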


For a generic x = [x_1, ..., x_N]^⊤ ∈ R^N, the nth component of F_N(x, U_N(k)), where n = 1, ..., N, is

  1 + G(k, n),  if  Q(k, n−1) < W(n−1, x_{n−1}), T(k, n−1) = R, Q(k, n) > W(n, x_n), Q(k, n+1) > W(n+1, x_{n+1});
                or  Q(k, n+1) < W(n+1, x_{n+1}), T(k, n+1) = L, Q(k, n) > W(n, x_n), Q(k, n−1) > W(n−1, x_{n−1});
  −1 + G(k, n), if  Q(k, n) < W(n, x_n), T(k, n) = L, Q(k, n−1) > W(n−1, x_{n−1}), Q(k, n−2) > W(n−2, x_{n−2});
                or  Q(k, n) < W(n, x_n), T(k, n) = R, Q(k, n+1) > W(n+1, x_{n+1}), Q(k, n+2) > W(n+2, x_{n+2});
  G(k, n),      otherwise,   (2)

where x_n with n ≤ 0 or n ≥ N + 1 are defined to be zero, and W is the function that specifies the probability that a node decides to be a transmitter, as defined earlier. Here, the three possible values of F_N correspond to the three events that at time k, node n successfully receives one message, successfully transmits one message, and does neither of the above, respectively. The inequalities and equations on the right describe conditions under which these three events occur: for example, Q(k, n−1) < W(n−1, x_{n−1}) corresponds to the choice of node n−1 to be a transmitter at time k, T(k, n−1) = R corresponds to its choice to transmit to the right, Q(k, n) > W(n, x_n) corresponds to the choice of node n to be a receiver at time k, and so on.

We simplify the situation further by assuming that W(n, y) = min(1, y). (We use this assumption throughout the paper.) With the collision protocol described earlier, this provides the analog of a network with backpressure routing [107].

After presenting the main results of the paper, we will revisit this network model in Sec. II-E and present a PDE that approximates its global behavior as an application of the main results.

D. OVERVIEW OF RESULTS IN THIS PAPER

In this subsection, we provide a brief description of the main results in Sec. II.

The Markov chain model (1) is related to a deterministic difference equation. We set

f_N(x) = E F_N(x, U_N(k)),   x ∈ R^N,   (3)

and define x_{N,M}(k) = [x_{N,M}(k, 1), ..., x_{N,M}(k, N)]^⊤ ∈ R^N by

x_{N,M}(k + 1) = x_{N,M}(k) + (1/M) f_N(x_{N,M}(k)),   x_{N,M}(0) = X_{N,M}(0)/M  a.s.   (4)

(''a.s.'' is short for ''almost surely'').

Example 1: For the one-dimensional Markov chain network model introduced in Sec. I-C, it follows from (2) (with the particular choice of W(n, y) = min(1, y)) that for x = [x_1, ..., x_N]^⊤ ∈ [0, 1]^N, the nth component of f_N(x) in its corresponding deterministic difference equation (4), where n = 1, ..., N, is (after some tedious algebra, as described in [3])

(1 − x_n)[P_r(n−1) x_{n−1} (1 − x_{n+1}) + P_l(n+1) x_{n+1} (1 − x_{n−1})]
  − x_n [P_r(n)(1 − x_{n+1})(1 − x_{n+2}) + P_l(n)(1 − x_{n−1})(1 − x_{n−2})] + g(n),   (5)

where x_n with n ≤ 0 or n ≥ N + 1 are defined to be zero.

We analyze the convergence of the Markov chain to the solution of a PDE using a two-step procedure. The first step depends heavily on the relation between X_{N,M}(k) and x_{N,M}(k). We show that for each N, as M → ∞, the difference between X_{N,M}(k)/M and x_{N,M}(k) vanishes, by proving that they both converge in a certain sense to the solution of the same ODE. The basic idea of this convergence is that as the ''fluctuation size'' of the system decreases and the ''fluctuation rate'' of the system increases, the stochastic system converges to a deterministic ''small-fast-fluctuation'' limit, which can be characterized as the solution of a particular ODE. In our case, the smallness of the fluctuation size and largeness of the fluctuation rate is quantified by the ''averaging'' parameter M. We use a weak convergence theorem of Kushner [19] to prove this convergence.

In the second step, we treat M as a function of N, written M_N (therefore treating X_{N,M_N}(k) and x_{N,M_N}(k) as sequences of the single index N, written X_N(k) and x_N(k), respectively), and show that for any sequence {M_N}, as N → ∞, x_N(k) converges to the solution of a certain PDE (and we show how to construct the PDE). This is essentially a convergence analysis on the approximation error between x_N(k) and the PDE solution. We stress that this is different from the numerical analysis of classical finite difference schemes (see, e.g., [7], [108], [109]), because our difference equation (4), which originates from particular system models, differs from those designed specifically for the purpose of numerically solving differential equations. The difficulty in our convergence analysis arises from both the different form of (4) and the fact that it is in general nonlinear. We provide not only sufficient conditions for the convergence, but also a practical criterion for verifying such conditions that are otherwise difficult to check.

Finally, based on these two steps, we show that as N and M go to ∞ in a dependent way, the continuous-time-space extension (explained later) of the normalized Markov chain X_N(k)/M converges to the PDE solution. We also characterize the rate of convergence. We note that special caution is needed for specifying the details of this dependence between the two indices N and M of the doubly indexed family X_{N,M}(k) of Markov chains in the limiting process.
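As a numerical companion to Example 1, the following sketch evaluates the mean drift (5) and iterates the deterministic recursion (4). It is a straightforward transcription of the formulas above under the convention x_n = 0 outside 1, ..., N; the helper names and the vectorized padding are our own choices, and g is assumed to already contain the per-slot generation means g(n).

```python
import numpy as np

def f_N(x, Pr, Pl, g):
    """Mean drift f_N(x) = E F_N(x, U) for the 1-D network, eq. (5)."""
    xp = np.pad(np.asarray(x, dtype=float), 2)   # zero-pad: x_n = 0 for n <= 0, n >= N+1
    Prp = np.pad(np.asarray(Pr, dtype=float), 2)
    Plp = np.pad(np.asarray(Pl, dtype=float), 2)
    n = np.arange(2, 2 + len(x))
    receive = (1 - xp[n]) * (Prp[n - 1] * xp[n - 1] * (1 - xp[n + 1])
                             + Plp[n + 1] * xp[n + 1] * (1 - xp[n - 1]))
    transmit = xp[n] * (Prp[n] * (1 - xp[n + 1]) * (1 - xp[n + 2])
                        + Plp[n] * (1 - xp[n - 1]) * (1 - xp[n - 2]))
    return receive - transmit + np.asarray(g, dtype=float)

def iterate(x0, M, K, Pr, Pl, g):
    """Deterministic recursion (4): x(k+1) = x(k) + f_N(x(k)) / M, for K steps."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(K):
        x = x + f_N(x, Pr, Pl, g) / M
        traj.append(x.copy())
    return np.array(traj)
```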


E. OUTLINE OF THE PAPER

The remainder of the paper is organized as follows. In Sec. II, we present the main theoretical results, apply the results to the wireless sensor network introduced above, and present some numerical experiments. In Sec. III, we present the proofs of the main results. Finally, we conclude the paper and discuss future work in Sec. IV.

II. MAIN RESULTS AND APPLICATIONS

A. CONSTRUCTION OF THE LIMITING PDE

We begin with the construction of the PDE whose solution describes the limiting behavior of the abstract Markov chain model.

For each N and the grid points V_N = {v_N(1), ..., v_N(N)} ⊂ D as introduced in Sec. I-B, we denote the distance between any two neighboring grid points by ds_N. For any continuous function w : D → R, let y_N be the vector in R^N composed of the values of w at the grid points v_N(n); i.e., y_N = [w(v_N(1)), ..., w(v_N(N))]^⊤. Given a point s ∈ D, we let {s_N} ⊂ D be any sequence of grid points s_N ∈ V_N such that as N → ∞, s_N → s. Let f_N(y_N, s_N) be the component of the vector f_N(y_N) corresponding to the location s_N; i.e., if s_N = v_N(n) ∈ V_N, then f_N(y_N, s_N) is the nth component of f_N(y_N).

In order to obtain a limiting PDE, we have to make certain technical assumptions on the asymptotic behavior of the sequence of functions {f_N} that ensure that f_N(y_N, s_N) is asymptotically close to an expression that looks like the right-hand side of a time-dependent PDE. Such conditions are familiar in the context of PDE limits of Brownian motion. Checking these conditions often amounts to a simple algebraic exercise. We provide a concrete example (the network model) in Sec. II-E where f_N satisfies these assumptions.

We assume that there exist sequences {δ_N}, {β_N}, {γ_N}, and {ρ_N}, functions f and h, and a constant c < ∞, such that as N → ∞, δ_N → 0, δ_N/β_N → 0, γ_N → 0, ρ_N → 0, and:

• Given s in the interior of D, there exists a sequence of functions {φ_N} : D → R such that

f_N(y_N, s_N)/δ_N = f(s_N, w(s_N), ∇w(s_N), ∇²w(s_N)) + φ_N(s_N),   (6)

for any sequence of grid points s_N → s, and for N sufficiently large, |φ_N(s_N)| ≤ c γ_N; and

• Given s on the boundary of D, there exists a sequence of functions {ϕ_N} : D → R such that

f_N(y_N, s_N)/β_N = h(s_N, w(s_N), ∇w(s_N), ∇²w(s_N)) + ϕ_N(s_N),   (7)

for any sequence of grid points s_N → s, and for N sufficiently large, |ϕ_N(s_N)| ≤ c ρ_N.

Here, ∇^i w represents all the ith order derivatives of w, where i = 1, 2.

Fix T > 0 for the rest of this section. Assume that there exists a unique function z : [0, T] × D → R that solves the limiting PDE

ż(t, s) = f(s, z(t, s), ∇z(t, s), ∇²z(t, s)),   (8)

with boundary condition

h(s, z(t, s), ∇z(t, s), ∇²z(t, s)) = 0   (9)

and initial condition z(0, s) = z_0(s).

Recall that x_{N,M}(k) is defined by (4). Suppose that we associate the discrete time k with points on the real line spaced apart by a distance proportional to δ_N. Then, the technical assumptions (6) and (7) imply that x_{N,M}(k) is, in a certain sense, close to the solution of the limiting PDE (8) with boundary condition (9). Below we develop this argument rigorously.

Establishing existence and uniqueness for the resulting nonlinear models is a difficult problem in the theoretical analysis of PDEs in general. The techniques are heavily dependent on the particular form of f. Therefore, as is common with numerical analysis, we assume that this has been established. Later, we apply the general theory to the modeling of networks of particular characteristics. The resulting limiting PDE is a nonlinear reaction-convection-diffusion problem. Existence and uniqueness for such problems for ''small'' data and short times can be established under general conditions. Key ingredients are coercivity, which will hold as long as z is bounded away from 1, and diffusion dominance, which will also hold as long as z is bounded above.

B. CONTINUOUS TIME-SPACE EXTENSION OF THE MARKOV CHAIN

Next we define the continuous time-space extension of the Markov chain X_{N,M}(k). For each N and M, define

dt_{N,M} = δ_N/M,   t_{N,M}(k) = k dt_{N,M},   K_{N,M} = ⌈T/dt_{N,M}⌉,   and   T̃_N = T/δ_N.   (10)

First, we construct the continuous-time extension X^{(o)}_{N,M}(t̃) of X_{N,M}(k), as the piecewise-constant time interpolant with interval length 1/M and normalized by M:

X^{(o)}_{N,M}(t̃) = X_{N,M}(⌊M t̃⌋)/M,   t̃ ∈ [0, T̃_N].   (11)

Similarly, define the continuous-time extension x^{(o)}_{N,M}(t̃) of x_{N,M}(k) by

x^{(o)}_{N,M}(t̃) = x_{N,M}(⌊M t̃⌋),   t̃ ∈ [0, T̃_N].   (12)

Let X^{(p)}_{N,M}(t, s), where (t, s) ∈ [0, T] × D, be the continuous-space extension of X^{(o)}_{N,M}(t̃) (with t̃ ∈ [0, T̃_N]) by piecewise-constant space extensions on D and with time scaled by δ_N so that the time-interval length is δ_N/M := dt_{N,M}. By piecewise-constant space extension of X^{(o)}_{N,M}, we mean the piecewise-constant function on D such that the value of this function at each point in D is the value of the component of the vector X^{(o)}_{N,M} corresponding to the grid point that is ''closest to the left'' (taken one component at a time). Then X^{(p)}_{N,M}(t, s) is the continuous time-space extension of X_{N,M}(k), and for each t, X^{(p)}_{N,M}(t, ·) is a real-valued function defined on D. We illustrate in Fig. 2.

FIGURE 2. An illustration of X_{N,M}(k) and X^{(p)}_{N,M}(t, s) in one dimension, represented by solid dots and dashed-line rectangles, respectively.

The function X^{(p)}_{N,M}(t, s) with (t, s) ∈ [0, T] × D is in the space D_D[0, T] of functions from [0, T] × D to R that are càdlàg with respect to the time component, i.e., right-continuous at each t ∈ [0, T), with left-hand limits at each t ∈ (0, T]. Denote by ‖·‖^{(p)} the norm on D_D[0, T] such that for x ∈ D_D[0, T],

‖x‖^{(p)} = sup_{t ∈ [0, T]} ∫_D |x(t, s)| ds.   (13)

C. MAIN RESULTS FOR CONTINUUM LIMIT OF THE ABSTRACT MARKOV CHAIN MODEL

In this subsection, we present the main theorem, Theorem 1, which states that under some conditions, the continuous-time-space extension X^{(p)}_{N,M} of the Markov chain X_{N,M}(k) converges to the solution z of the limiting PDE (8) in the norm defined by (13), as N and M go to ∞ in a dependent way. By this we mean that we set M to be a function of N, written M_N, such that M_N → ∞ as N → ∞. Then we can treat X_{N,M_N}(k), x_{N,M_N}(k), X^{(p)}_{N,M_N}, dt_{N,M_N}, t_{N,M_N}, and K_{N,M_N} all as sequences of the single index N, written X_N(k), x_N(k), X^{(p)}_N, dt_N, t_N, and K_N, respectively. We apply such changes of notation throughout the rest of the paper whenever M is treated as a function of N.

Define z_N(k, n) = z(t_N(k), v_N(n)) and z_N(k) = [z_N(k, 1), ..., z_N(k, N)]^⊤ ∈ R^N. Define the truncation error

u_N(k, n) = f_N(z_N(k), n)/δ_N − (z_N(k + 1, n) − z_N(k, n))/dt_N,   (14)

and u_N(k) = [u_N(k, 1), ..., u_N(k, N)]^⊤ ∈ R^N. Define

ε_N(k, n) = x_N(k, n) − z_N(k, n),   (15)

and ε_N(k) = [ε_N(k, 1), ..., ε_N(k, N)]^⊤ ∈ R^N.

By (4), (10), (14), and (15), we have that

ε_N(k + 1) = ε_N(k) + (1/M_N)(f_N(x_N(k)) − f_N(z_N(k))) + dt_N u_N(k)
           = ε_N(k) + (1/M_N)(f_N(z_N(k) + ε_N(k)) − f_N(z_N(k))) + dt_N u_N(k).   (16)

Let ε_N = [ε_N(1)^⊤, ..., ε_N(K_N)^⊤]^⊤ and u_N = [u_N(0)^⊤, ..., u_N(K_N − 1)^⊤]^⊤ denote vectors in the (K_N N)-dimensional vector space R^{K_N N}. Assume that

ε_N(0) = 0.   (17)

Then by (16), for fixed z, there exists a function H_N : R^{K_N N} → R^{K_N N} such that

ε_N = H_N(u_N).   (18)

Define the vector norm ‖·‖^{(N)} on R^{K_N N} such that for x = [x(1)^⊤, ..., x(K_N)^⊤]^⊤ ∈ R^{K_N N}, where x(k) = [x(k, 1), ..., x(k, N)]^⊤ ∈ R^N,

‖x‖^{(N)} = ds_N Σ_{k=1}^{K_N} max_{n=1,...,N} |x(k, n)|.   (19)

Define

μ_N = lim_{α→0} sup_{‖u‖^{(N)} ≤ α} ‖H_N(u)‖^{(N)} / ‖u‖^{(N)}.   (20)

We now present the main theorem.

Theorem 1 (Main Theorem): Assume that:
T1.1. there exist a sequence {ξ_N} and c_1 < ∞ such that as N → ∞, ξ_N → 0, and for N sufficiently large, ‖u_N‖^{(N)} < c_1 ξ_N;
T1.2. for each N, there exists an identically distributed sequence {λ_N(k)} of integrable random variables such that for each k and x, |F_N(x, U_N(k))| ≤ λ_N(k) a.s.;
T1.3. for each N, the function F_N(x, U_N(k)) is continuous in x a.s.;
T1.4. for each N, the ODE ẏ = f_N(y) has a unique solution on [0, T̃_N] for any initial condition y(0), where T̃_N is as defined by (10);
T1.5. z is Lipschitz continuous on [0, T] × D;
T1.6. for each N, (17) holds; and
T1.7. the sequence {μ_N} is bounded.

Then a.s., there exist c_0 < ∞, N_0, and M̂_1 < M̂_2 < M̂_3 < ... such that for each N ≥ N_0 and each M_N ≥ M̂_N,

‖X^{(p)}_N − z‖^{(p)} < c_0 max{ξ_N, ds_N}.

This theorem states that as N and M_N go to ∞ in a dependent way, X^{(p)}_N converges to z in ‖·‖^{(p)} a.s. We prove this in Sec. III-C.
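To make the convergence metric in Theorem 1 concrete, the sketch below approximates ‖X^{(p)}_N − z‖^{(p)} from simulation data by sampling the piecewise-constant extension and the PDE solution on the same time-space grid and using a Riemann sum for the spatial integral. This is only an illustration of the norm (13) and plays no role in the proofs; all names are ours.

```python
import numpy as np

def sup_L1_error(X_over_M, z, t_grid, s_grid, ds):
    """Approximate ||X^(p) - z||^(p) = sup_t  ∫_D |X^(p)(t, s) - z(t, s)| ds.

    X_over_M : array of shape (K+1, N), normalized chain samples X(k, n)/M
    z        : callable z(t, s) giving the PDE solution
    t_grid   : times t_N(k) = k * dt_N, length K+1
    s_grid   : grid points v_N(n), length N
    ds       : grid spacing ds_N
    Both functions are piecewise constant between samples, so the spatial
    integral is approximated by a Riemann sum with cell width ds.
    """
    Z = np.array([[z(t, s) for s in s_grid] for t in t_grid])
    l1_per_time = ds * np.abs(np.asarray(X_over_M) - Z).sum(axis=1)  # ∫_D |...| ds at each t
    return l1_per_time.max()                                         # sup over t in [0, T]
```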


D. SUFFICIENT CONDITIONS ON f_N FOR THE BOUNDEDNESS OF {μ_N}

The key assumption of Theorem 1 is that the sequence {μ_N} is bounded (Assumption T1.7). We present in the following theorem a result that gives specific sufficient conditions on f_N that guarantee that {μ_N} is bounded. This provides a practical criterion to verify this key assumption, which is otherwise difficult to check.

Consider fixed z. We assume that f_N ∈ C¹ and denote the Jacobian matrix of f_N at x by Df_N(x). Define for each N and k = 0, ..., K_N − 1,

A_N(k) = I_N + (1/M_N) Df_N(z_N(k)),   (21)

where I_N is the identity matrix in R^{N×N}.

We denote the 1-norm on R^N and its induced norm on R^{N×N} both by ‖·‖_1^{(N)}; i.e., for a vector x = [x_1, ..., x_N]^⊤ ∈ R^N,

‖x‖_1^{(N)} = Σ_{n=1}^{N} |x_n|,   (22)

and for a matrix A ∈ R^{N×N} with a_ij being its (i, j)th component,

‖A‖_1^{(N)} = max_{j=1,...,N} Σ_{i=1}^{N} |a_ij|.   (23)

We then have

Theorem 2 (Sufficient condition for key assumption): Assume that:
T2.1. for each N, (17) holds;
T2.2. for each N, f_N ∈ C¹; and
T2.3. there exists c < ∞ such that for N sufficiently large and for k = 1, ..., K_N − 1, ‖A_N(k)‖_1^{(N)} ≤ 1 + c dt_N, where ‖·‖_1^{(N)} is defined by (23).

Then {μ_N} is bounded.

We prove this in Sec. III-D. In Sec. III-E, we will show that these sufficient conditions hold for the network model described in Sec. I-C, and use this theorem to prove the convergence of its underlying Markov chain to a PDE.

E. APPLICATION TO NETWORK MODELS

In this subsection, we apply the main results to show how the Markov chain modeling the network introduced in Sec. I-C can be approximated by the solution of a PDE. This approximation was heuristically developed in [3].

We first deal with the one-dimensional network model. Its corresponding stochastic and deterministic difference equations (1) and (4) were specified by (2) and (5), respectively. For this model we set δ_N (introduced in Sec. II-A) to be ds_N². Then dt_{N,M} := δ_N/M = ds_N²/M.

Assume that

P_l(n) = p_l(v_N(n))   and   P_r(n) = p_r(v_N(n)),   (24)

where p_l(s) and p_r(s) are real-valued functions defined on D such that

p_l(s) = b(s) + c_l(s) ds_N   and   p_r(s) = b(s) + c_r(s) ds_N.

Let c = c_l − c_r. The values b(s) and c(s) correspond to diffusion and convection quantities in the limiting PDE. Because p_l(s) + p_r(s) ≤ 1, it is necessary that b(s) ≤ 1/2. In order to guarantee that the number of messages entering the system from outside over finite time intervals remains finite throughout the limiting process, we set g(n) = M g_p(v_N(n)) dt_N, where g_p : D → R is called the message generation rate.

Assume that b, c_l, c_r, and g_p are in C¹. Further assume that x_{N,M}(k) ∈ [0, 1]^N for each k. Then f_N is in C¹.

We have assumed above that the probabilities P_l and P_r of the direction of transmission are the values of the continuous functions p_l and p_r at the grid points, respectively. This may correspond to stochastic routing schemes where nodes in close vicinity behave similarly based on some local information that they share; or to those with an underlying network-wide directional configuration that is continuous in space, designed to relay messages to destination nodes at known locations. On the other hand, the results can be extended to situations with certain levels of discontinuity, as discussed in Sec. IV.

By these assumptions and definitions, it follows from (5) that the function f in (8) for this network model is

f(s, z(t, s), ∇z(t, s), ∇²z(t, s)) = b(s) ∂/∂s[(1 − z(t, s))(1 + 3z(t, s)) z_s(t, s)]
  + 2(1 − z(t, s)) z_s(t, s) b_s(s) + z(t, s)(1 − z(t, s))² b_ss(s)
  + ∂/∂s[c(s) z(t, s)(1 − z(t, s))²] + g_p(s).   (25)

Here, a single subscript s represents a first derivative and a double subscript ss represents a second derivative.

Note that the computations needed to obtain (25) (and later, (26), (48), and (49)) require tedious but elementary algebraic manipulations. For this purpose, we found it helpful to use the symbolic tools in Matlab.

Based on the behavior of nodes n = 1 and n = N next to the destination nodes, we derive the boundary condition (9) of the PDE of this network. For example, the node n = 1 receives messages only from the right and encounters no interference when transmitting to the left. Replacing x_n with n ≤ 0 or n ≥ N + 1 by 0, it follows that the 1st component of f_N(x) is

(1 − x_n) P_l(n + 1) x_{n+1} − x_n [P_l(n) + P_r(n)(1 − x_{n+1})(1 − x_{n+2})] + g(n).

Similarly, the Nth component of f_N(x) is

(1 − x_n) P_r(n − 1) x_{n−1} − x_n [P_r(n) + P_l(n)(1 − x_{n−1})(1 − x_{n−2})] + g(n).
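Criterion T2.3 can also be checked numerically for a given model before attempting an analytic verification. The sketch below forms A_N(k) = I_N + Df_N(z_N(k))/M_N by finite-difference differentiation of the drift and evaluates the induced 1-norm (23). It is a diagnostic aid of our own, assuming some drift function f (such as the f_N sketch given after Example 1), and is not part of the proof of Theorem 2.

```python
import numpy as np

def matrix_one_norm(A):
    """Induced 1-norm (23): maximum absolute column sum."""
    return np.max(np.abs(A).sum(axis=0))

def A_N_one_norm(f, z_k, M_N, eps=1e-6):
    """Numerically evaluate ||A_N(k)||_1 with A_N(k) = I_N + Df_N(z_N(k)) / M_N, eq. (21).

    f    : callable mapping R^N -> R^N (the drift f_N)
    z_k  : PDE solution sampled on the grid at step k, i.e. z_N(k)
    M_N  : averaging parameter
    The Jacobian is approximated by forward differences; for the network
    drift (5) it could also be written in closed form.
    """
    z_k = np.asarray(z_k, dtype=float)
    N = len(z_k)
    f0 = f(z_k)
    J = np.empty((N, N))
    for j in range(N):
        zp = z_k.copy()
        zp[j] += eps
        J[:, j] = (f(zp) - f0) / eps        # column j of Df_N(z_N(k))
    A = np.eye(N) + J / M_N
    return matrix_one_norm(A)

# Criterion T2.3 asks that ||A_N(k)||_1 <= 1 + c * dt_N for some constant c < infinity.
```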


Set β_N, defined in Sec. II-A, to be 1. Then from each of the two boundary components of f_N derived above we get the function h in (9) for the one-dimensional network:

h(s, z(t, s), ∇z(t, s), ∇²z(t, s)) = −b(s) z(t, s)³ + b(s) z(t, s)² − b(s) z(t, s).   (26)

Note that the function h is the limit of f_N(y_N, s_N)/β_N, not f_N(y_N, s_N)/δ_N (whose limit is f). Solving h = 0 for real z, we have the boundary condition z(t, s) = 0.

Let z be the solution of the PDE (8) with f specified by (25) and with boundary condition z(t, s) = 0 and initial condition z(0, s) = z_0(s). Assume that (17) holds. As in Sec. II-C, we treat M as a sequence of N, written M_N. In the following theorem we show the convergence of the Markov chain modeling the one-dimensional network to the PDE solution.

Theorem 3 (Convergence of network model): For the one-dimensional network model, a.s., there exist c_0 < ∞, N_0, and M̂_1 < M̂_2 < M̂_3 < ... such that for each N ≥ N_0 and each M_N ≥ M̂_N, ‖X^{(p)}_N − z‖^{(p)} < c_0 ds_N.

We prove this in Sec. III-E.

FIGURE 3. The PDE solution at a fixed time that approximates the normalized queue lengths of the network.

1) INTERPRETATION OF LIMITING PDE

Now we make some remarks on how to interpret a given limiting PDE. First, for fixed N and M, the normalized queue length of node n at time k is approximated by the value of the PDE solution z at the corresponding point in [0, T] × D; i.e., X_{N,M}(k, n)/M ≈ z(t_{N,M}(k), v_N(n)).

Second, we discuss how to interpret C(t_o) := ∫_D z(t_o, s) ds, the area below the curve z(t_o, s) for fixed t_o ∈ [0, T]. Let k_o = ⌊t_o/dt_{N,M}⌋. Then we have that z(t_o, v_N(n)) ds_N ≈ (X_{N,M}(k_o, n)/M) ds_N, the area of the nth rectangle in Fig. 3. Therefore

C(t_o) ≈ Σ_{n=1}^{N} z(t_o, v_N(n)) ds_N ≈ Σ_{n=1}^{N} (X_{N,M}(k_o, n)/M) ds_N,

the sum of all rectangles. If we assume that all messages in the queue have roughly the same number of bits, and think of ds_N as the ''coverage'' of each node, then the area under any segment of the curve measures a kind of ''data-coverage product'' of the nodes covered by the segment, in the unit of ''bit·meter.'' As N → ∞, the total normalized queue length Σ_{n=1}^{N} X_{N,M}(k_o, n)/M of the network does go to ∞; however, the coverage ds_N of each node goes to 0. Hence the sum of the ''data-coverage product'' can be approximated by the finite area C(t_o).

2) COMPARISONS OF THE PDE SOLUTIONS AND MONTE CARLO SIMULATIONS OF THE NETWORKS

In the remainder of this section, we compare the limiting PDE solutions with Monte Carlo simulations of the networks.¹

We first consider a one-dimensional network over the domain D = [−1, 1]. We use the initial condition z_0(s) = l_1 e^{−s²}, where l_1 > 0 is a constant, so that initially the nodes in the middle have messages to transmit, while those near the boundaries have very few. We set the message generation rate g_p(s) = l_2 e^{−s²}, where l_2 > 0 is a parameter determining the total load of the system.

We use three sets of values of N = 20, 50, 80 and M = N³, and show the PDE solution and the Monte Carlo simulation results with different N and M at t = 1 s. The networks have diffusion b = 1/2 and convection c = 0 in Fig. 4 and c = 1 in Fig. 5, respectively, where the x-axis denotes the node location and the y-axis denotes the normalized queue length.

FIGURE 4. The Monte Carlo simulations (with different N and M) and the PDE solution of a one-dimensional network, with b = 1/2 and c = 0, at t = 1 s.

¹This paper has supplementary downloadable material available at http://ieeexplore.ieee.org, provided by the authors. This includes eight multimedia AVI format movie clips, which show comparisons of network simulations and limiting PDE solutions. This material is 34.7 MB in size.
For the three sets of values of N = 20, 50, 80 and M = N³, with c = 0, the maximum absolute errors of the PDE approximation are 5.6 × 10⁻³, 1.3 × 10⁻³, and 1.1 × 10⁻³, respectively; and with c = 1, the errors are 4.4 × 10⁻³, 1.5 × 10⁻³, and 1.1 × 10⁻³, respectively. As we can see, as N and M increase, the resemblance between the Monte Carlo simulations and the PDE solution becomes stronger. In the case of very large N and M, it is difficult to distinguish the results.

FIGURE 5. The Monte Carlo simulations (with different N and M) and the PDE solution of a one-dimensional network, with b = 1/2 and c = 1, at t = 1 s.

We stress that the PDEs only took fractions of a second to solve on a computer, while the Monte Carlo simulations took on the order of tens of hours.

3) A TWO-DIMENSIONAL NETWORK

The generalization of the continuum model to higher dimensions is straightforward, except for more arduous algebraic manipulation. Likewise, the convergence analysis is similar to the one-dimensional case.

We consider a two-dimensional network of N = N_1 × N_2 sensor nodes uniformly placed over a domain D ⊂ R². Here we switch to a two-dimensional labeling scheme. We label the nodes by (n_1, n_2), where n_1 = 1, ..., N_1 and n_2 = 1, ..., N_2, and denote the grid point in D corresponding to node (n_1, n_2) by v_N(n_1, n_2). This labeling scheme is more intuitive for this two-dimensional scenario, but is essentially equivalent to the single-label one. (E.g., if we set n := (n_1 − 1)N_2 + n_2 and v̂_N(n) := v_N(n_1, n_2), then the v̂_N(n) form the same grid.) Again let the distance between any two neighboring nodes be ds_N.

Assume that node (n_1, n_2) randomly chooses to transmit to the east, west, north, or south immediate neighbor with probabilities P_e(n_1, n_2) = b_1(v_N(n_1, n_2)) + c_e(v_N(n_1, n_2)) ds_N, P_w(n_1, n_2) = b_1(v_N(n_1, n_2)) + c_w(v_N(n_1, n_2)) ds_N, P_n(n_1, n_2) = b_2(v_N(n_1, n_2)) + c_n(v_N(n_1, n_2)) ds_N, and P_s(n_1, n_2) = b_2(v_N(n_1, n_2)) + c_s(v_N(n_1, n_2)) ds_N, respectively, where P_e(n_1, n_2) + P_w(n_1, n_2) + P_n(n_1, n_2) + P_s(n_1, n_2) ≤ 1. Therefore it is necessary that b_1(s) + b_2(s) ≤ 1/2. Define c_1 = c_w − c_e and c_2 = c_s − c_n.

The derivation of the limiting PDE is similar to that of the one-dimensional case, except that we now have to consider transmission to and interference from four directions instead of two. We present the limiting PDE here without the detailed derivation:

ż = Σ_{j=1}^{2} [ b_j ∂/∂s_j((1 + 5z)(1 − z)³ ∂z/∂s_j) + 2(1 − z)³ (∂z/∂s_j)(db_j/ds_j) + z(1 − z)⁴ d²b_j/ds_j² + ∂/∂s_j(c_j z(1 − z)⁴) ] + g_p,

with boundary condition z(t, s) = 0 and initial condition z(0, s) = z_0(s), where t ∈ [0, T] and s = (s_1, s_2) ∈ D.

We now compare the PDE approximation and the Monte Carlo simulations of a network over the domain D = [−1, 1] × [−1, 1]. We use the initial condition z_0(s) = l_1 e^{−(s_1²+s_2²)}, where l_1 > 0 is a constant. We set the message generation rate g_p(s) = l_2 e^{−(s_1²+s_2²)}, where l_2 > 0 is a constant.

We use three different sets of values of N_1 × N_2 and M, where N_1 = N_2 = 20, 50, 80 and M = N_1³. We show the contours of the normalized queue length from the PDE solution and the Monte Carlo simulation results with different sets of values of N_1, N_2, and M, at t = 0.1 s. The networks have diffusion b_1 = b_2 = 1/4 and convection c_1 = c_2 = 0 in Fig. 6 and c_1 = −2, c_2 = −4 in Fig. 7, respectively.

For the three sets of values of N_1 = N_2 = 20, 50, 80 and M = N_1³, with c_1 = c_2 = 0, the maximum absolute errors are 3.2 × 10⁻³, 1.1 × 10⁻³, and 6.8 × 10⁻⁴, respectively; and with c_1 = −2, c_2 = −4, the errors are 4.1 × 10⁻³, 1.0 × 10⁻³, and 6.6 × 10⁻⁴, respectively. Again the accuracy of the continuum model increases with N_1, N_2, and M.

It took 3 days to do the Monte Carlo simulation of the network at t = 0.1 s with 80 × 80 nodes and the maximum queue length M = 80³, while the PDE solved on the same machine took less than a second. We could not do Monte Carlo simulations of any larger networks or greater values of t because of prohibitively long computation time.
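For readers who want to reproduce the one-dimensional limiting behavior numerically, the following is a minimal explicit finite-difference sketch for the PDE (8) with f given by (25) and boundary condition z = 0. The authors report solving their PDEs with standard packages such as Matlab and Comsol; the scheme below, its conservative time step, and the parameter values in the commented example (our stand-ins for l_1 and l_2) are our own assumptions, not settings taken from the paper.

```python
import numpy as np

def solve_limit_pde(b, c, gp, z0, L=1.0, N=200, T=1.0, dt=None):
    """Explicit finite-difference sketch for the 1-D limiting PDE (25)
    on D = [-L, L] with boundary condition z = 0 and initial condition z0(s).

    b, c, gp, z0 : callables of the spatial variable s
    Returns (s, z) with z the approximate solution at time T.
    """
    s = np.linspace(-L, L, N)
    ds = s[1] - s[0]
    if dt is None:
        dt = 0.2 * ds**2               # conservative step for the explicit scheme
    bv, cv, gv = b(s), c(s), gp(s)
    b_s = np.gradient(bv, ds)
    b_ss = np.gradient(b_s, ds)
    z = np.asarray(z0(s), dtype=float)
    z[0] = z[-1] = 0.0

    def ddx(u):                        # central differences (one-sided at the ends)
        return np.gradient(u, ds)

    t = 0.0
    while t < T:
        zs = ddx(z)
        diffusion = bv * ddx((1 - z) * (1 + 3 * z) * zs)
        lower_order = 2 * (1 - z) * zs * b_s + z * (1 - z) ** 2 * b_ss
        convection = ddx(cv * z * (1 - z) ** 2)
        z = z + dt * (diffusion + lower_order + convection + gv)
        z[0] = z[-1] = 0.0             # enforce the boundary condition z = 0
        t += dt
    return s, z

# Illustrative parameters only (b = 1/2, c = 0; l_1 = 0.2 and l_2 = 0.05 are our choices):
# s, z = solve_limit_pde(lambda s: 0.5 + 0 * s, lambda s: 0 * s,
#                        lambda s: 0.05 * np.exp(-s**2),
#                        lambda s: 0.2 * np.exp(-s**2), T=1.0)
```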


III. PROOFS OF THE MAIN RESULTS

This section is devoted solely to the proofs of the results in Sec. II. As such, the material here is highly technical and might be tedious to follow in detail, though we have tried our best to make it as readable as possible. The reader can safely skip this section without doing violence to the main ideas of the paper, though much of our hard work is reflected here.

We first prove Theorem 1 (Main Theorem) by analyzing the convergence of the Markov chains X_{N,M}(k) to the solution of the limiting PDE in a two-step procedure. In the first step, for each N, we show in Sec. III-A that as M → ∞, X_{N,M}(k)/M converges to x_{N,M}(k). In the second step, we treat M as a function of N, written M_N, and for any sequence {M_N}, we show in Sec. III-B that as N → ∞, x_N(k) converges to the PDE solution. Based on the two steps, we show in Sec. III-C that as N and M_N go to ∞ in a dependent way, X^{(p)}_N converges to the PDE solution, proving Theorem 1. We then prove Theorem 2 (Sufficient condition for key assumption) in Sec. III-D. Finally, in Sec. III-E, we prove Theorem 3 (Convergence of network model) using Theorems 1 and 2.

FIGURE 6. The Monte Carlo simulations (from top to bottom, with N_1 = N_2 = 20, 50, 80, respectively, and M = N_1³) and the PDE solution of a two-dimensional network, with b_1 = b_2 = 1/4 and c_1 = c_2 = 0, at t = 0.1 s.

FIGURE 7. The Monte Carlo simulations (from top to bottom, with N_1 = N_2 = 20, 50, 80, respectively, and M = N_1³) and the PDE solution of a two-dimensional network, with b_1 = b_2 = 1/4 and c_1 = −2, c_2 = −4, at t = 0.1 s.

A. CONVERGENCE OF X_{N,M}(k) AND x_{N,M}(k) TO THE SOLUTION OF THE SAME ODE

In this subsection, we show that for each N, X_{N,M}(k)/M and x_{N,M}(k) are close in a certain sense for large M under certain conditions, by proving that both their continuous-time extensions converge to the solution of the same ODE.

For fixed T and N, by (10), T̃_N is fixed. As defined by (11) and (12) respectively, both X^{(o)}_{N,M}(t̃) and x^{(o)}_{N,M}(t̃) with t̃ ∈ [0, T̃_N] are in the space D^N[0, T̃_N] of R^N-valued càdlàg functions on [0, T̃_N]. Since they both depend on M, each one of them forms a sequence of functions in D^N[0, T̃_N] indexed by M = 1, 2, .... Define the ∞-norm ‖·‖_∞^{(o)} on D^N[0, T̃_N]; i.e., for x ∈ D^N[0, T̃_N],

‖x‖_∞^{(o)} = max_{n=1,...,N} sup_{t ∈ [0, T̃_N]} |x_n(t)|,

where x_n is the nth component of x.

Now we present a lemma stating that under some conditions, for each N, as M → ∞, X^{(o)}_{N,M} converges uniformly to the solution of the ODE ẏ = f_N(y), and x^{(o)}_{N,M} converges uniformly to the same solution, both on [0, T̃_N].

Lemma 1: Assume, for each N, that:
L1.1. there exists an identically distributed sequence {λ_N(k)} of integrable random variables such that for each k and x, |F_N(x, U_N(k))| ≤ λ_N(k) a.s.;
L1.2. the function F_N(x, U_N(k)) is continuous in x a.s.; and
L1.3. the ODE ẏ = f_N(y) has a unique solution on [0, T̃_N] for any initial condition y(0).

Suppose that as M → ∞, X^{(o)}_{N,M}(0) →^P y(0) and x^{(o)}_{N,M}(0) → y(0), where ''→^P'' represents convergence in probability. Then, for each N, as M → ∞, ‖X^{(o)}_{N,M} − y‖_∞^{(o)} →^P 0 and ‖x^{(o)}_{N,M} − y‖_∞^{(o)} → 0 on [0, T̃_N], where y is the unique solution of ẏ = f_N(y) with initial condition y(0).

To prove Lemma 1, we first present a lemma due to Kushner [19].

Lemma 2: Assume, for each N, that:
L2.1. the set {|F_N(x, U_N(k))| : k ≥ 0} is uniformly integrable;
L2.2. for each k and each bounded random variable X,

lim_{δ→0} E sup_{|Y| ≤ δ} |F_N(X, U_N(k)) − F_N(X + Y, U_N(k))| = 0;

and
L2.3. there is a function f̂_N(·) [continuous by L2.2] such that as n → ∞,

(1/n) Σ_{k=0}^{n} F_N(x, U_N(k)) →^P f̂_N(x).


Suppose that, for each $N$, $\dot y = \hat f_N(y)$ has a unique solution on $[0, \tilde T_N]$ for any initial condition, and that $X_{N,M}^{(o)}(0) \Rightarrow y(0)$, where "$\Rightarrow$" denotes weak convergence. Then for each $N$, as $M \to \infty$, $\|X_{N,M}^{(o)} - y\|_\infty^{(o)} \Rightarrow 0$ on $[0, \tilde T_N]$.

We note that in Kushner's original theorem, the convergence of $X_{N,M}^{(o)}$ to $y$ is stated in terms of the Skorokhod norm [19], but it is equivalent to the $\infty$-norm in our case, where the time interval $[0, \tilde T_N]$ is finite and the limit $y$ is continuous [110].

We now prove Lemma 1 by showing that Assumptions L2.1, L2.2, and L2.3 of Lemma 2 hold under Assumptions L1.1, L1.2, and L1.3 of Lemma 1.

Proof of Lemma 1: Since $\lambda_N(k)$ is integrable, as $a \to \infty$, $E|\lambda_N(k)|\mathbf{1}_{\{|\lambda_N(k)|>a\}} \to 0$, where $\mathbf{1}_A$ is the indicator function of the set $A$. By Assumption L1.1, for each $k$, $x$, and $a > 0$,
$$E|F_N(x, U_N(k))|\mathbf{1}_{\{|F_N(x,U_N(k))|>a\}} \le E|\lambda_N(k)|\mathbf{1}_{\{|F_N(x,U_N(k))|>a\}} \le E|\lambda_N(k)|\mathbf{1}_{\{|\lambda_N(k)|>a\}}.$$
Therefore, as $a \to \infty$,
$$\sup_{k\ge 0} E|F_N(x, U_N(k))|\mathbf{1}_{\{|F_N(x,U_N(k))|>a\}} \to 0;$$
i.e., the family $\{|F_N(x, U_N(k))| : k \ge 0\}$ is uniformly integrable, and hence Assumption L2.1 holds.

By Assumption L1.2, for each $k$ and each bounded $X$, a.s.,
$$\lim_{\delta\to 0}\ \sup_{|Y|\le\delta} |F_N(X, U_N(k)) - F_N(X+Y, U_N(k))| = 0.$$
By Assumption L1.1, for each $k$ and each bounded $X$ and $Y$, a.s.,
$$|F_N(X, U_N(k)) - F_N(X+Y, U_N(k))| \le |F_N(X, U_N(k))| + |F_N(X+Y, U_N(k))| \le 2\lambda_N(k).$$
Therefore, for each $k$, each bounded $X$, and each $\delta$, a.s.,
$$\sup_{|Y|\le\delta} |F_N(X, U_N(k)) - F_N(X+Y, U_N(k))| \le 2\lambda_N(k),$$
which is an integrable random variable. By the dominated convergence theorem,
$$\lim_{\delta\to 0} E\sup_{|Y|\le\delta} |F_N(X, U_N(k)) - F_N(X+Y, U_N(k))| = E\lim_{\delta\to 0}\sup_{|Y|\le\delta} |F_N(X, U_N(k)) - F_N(X+Y, U_N(k))| = 0.$$
Hence Assumption L2.2 holds.

Since the $U_N(k)$ are i.i.d., by the weak law of large numbers and the definition of $f_N$ in (3), as $n \to \infty$,
$$\frac{1}{n}\sum_{k=0}^{n} F_N(x, U_N(k)) \xrightarrow{P} f_N(x).$$
Hence Assumption L2.3 holds.

Therefore, by Lemma 2, for each $N$, as $M \to \infty$, $\|X_{N,M}^{(o)} - y\|_\infty^{(o)} \Rightarrow 0$ on $[0, \tilde T_N]$. For any sequence of random processes $\{X_n\}$, if $A$ is a constant, then $X_n \Rightarrow A$ if and only if $X_n \xrightarrow{P} A$. Therefore, as $M \to \infty$, $\|X_{N,M}^{(o)} - y\|_\infty^{(o)} \xrightarrow{P} 0$ on $[0, \tilde T_N]$. The same argument implies the deterministic convergence of $x_{N,M}^{(o)}$: as $M \to \infty$, $\|x_{N,M}^{(o)} - y\|_\infty^{(o)} \to 0$ on $[0, \tilde T_N]$. $\blacksquare$

Based on Lemma 1, we obtain the following lemma, which states that, for each $N$, $X_{N,M}^{(o)}$ and $x_{N,M}^{(o)}$ are close with high probability for large $M$.

Lemma 3: Let the assumptions of Lemma 1 hold. Then for any sequence $\{\zeta_N\}$, for each $N$ and for $M$ sufficiently large,
$$P\big\{\|X_{N,M}^{(o)} - x_{N,M}^{(o)}\|_\infty^{(o)} > \zeta_N\big\} \le 1/N^2 \quad\text{on } [0, \tilde T_N].$$

Proof: By the triangle inequality,
$$\|X_{N,M}^{(o)} - x_{N,M}^{(o)}\|_\infty^{(o)} \le \|X_{N,M}^{(o)} - y\|_\infty^{(o)} + \|x_{N,M}^{(o)} - y\|_\infty^{(o)}.$$
By Lemma 1, for each $N$, as $M \to \infty$, $\|X_{N,M}^{(o)} - x_{N,M}^{(o)}\|_\infty^{(o)} \xrightarrow{P} 0$ on $[0, \tilde T_N]$. This completes the proof. $\blacksquare$

Since $X_{N,M}^{(o)}$ and $x_{N,M}^{(o)}$ are the continuous-time extensions of $X_{N,M}(k)$ and $x_{N,M}(k)$ by piecewise-constant extension, respectively, we have the following corollary, which states that for each $N$, as $M \to \infty$, $X_{N,M}(k)/M$ converges uniformly to $x_{N,M}(k)$. (A toy numerical illustration of this convergence is sketched after the proof of Lemma 4 below.)

Corollary 1: Let the assumptions of Lemma 1 hold. Then for any sequence $\{\zeta_N\}$, for each $N$ and for $M$ sufficiently large, we have that
$$P\left\{\max_{\substack{k=1,\dots,K_{N,M}\\ n=1,\dots,N}} \left|\frac{X_{N,M}(k,n)}{M} - x_{N,M}(k,n)\right| > \zeta_N\right\} \le \frac{1}{N^2}.$$

B. CONVERGENCE OF $x_N(k)$ TO THE LIMITING PDE
For the remainder of this section, we treat $M$ as a function of $N$, written $M_N$. We now state conditions under which $\varepsilon_N$ converges to 0 for any sequence $\{M_N\}$ as $N \to \infty$.

Lemma 4: Assume that:
L4.1 there exist a sequence $\{\xi_N\}$ and $c_1 < \infty$ such that as $N \to \infty$, $\xi_N \to 0$, and for $N$ sufficiently large, $\|u_N\|^{(N)} < c_1\xi_N$;
L4.2 for each $N$, (17) holds; and
L4.3 the sequence $\{\mu_N\}$ is bounded.
Then there exists $c_0 < \infty$ such that for any sequence $\{M_N\}$ and $N$ sufficiently large, $\|\varepsilon_N\|^{(N)} < c_0\xi_N$.

Proof: By the definition (20) of $\mu_N$, for each $N$, there exists $\delta > 0$ such that for $\alpha < \delta$,
$$\sup_{\|u\|^{(N)}\le\alpha} \frac{\|H_N(u)\|^{(N)}}{\|u\|^{(N)}} \le \mu_N + 1.$$
By Assumption L4.1, $\|u_N\|^{(N)} \to 0$ as $N \to \infty$. Then there exists $\alpha_1$ such that for $N$ sufficiently large, $\|u_N\|^{(N)} \le \alpha_1 < \delta$, and hence
$$\frac{\|H_N(u_N)\|^{(N)}}{\|u_N\|^{(N)}} \le \sup_{\|u\|^{(N)}\le\alpha_1} \frac{\|H_N(u)\|^{(N)}}{\|u\|^{(N)}} \le \mu_N + 1.$$
Therefore, for $N$ sufficiently large,
$$\|\varepsilon_N\|^{(N)} = \|H_N(u_N)\|^{(N)} \le (\mu_N + 1)\|u_N\|^{(N)}.$$
By Assumption L4.3, and because the derivation above does not depend on the choice of the sequence $\{M_N\}$, the proof is completed. $\blacksquare$
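To make the first convergence step (Lemma 3 and Corollary 1) concrete, the following is a minimal numerical sketch in Python. It is not the sensor-network model of this paper: the single-cell occupancy chain, the arrival/departure propensities p and q, and the horizon are hypothetical choices of ours. The chain's update has the generic form X(k+1) = X(k) + F(X(k)/M, U(k)), and the script reports the uniform gap between X(k)/M and the corresponding deterministic recursion, which shrinks as M grows, in line with Corollary 1.

import numpy as np

# Toy illustration (not the paper's model): a single-cell occupancy chain.
# A packet arrives w.p. p*(1-x) and departs w.p. q*x given occupancy
# fraction x = X/M, so the mean drift is f(x) = p*(1-x) - q*x and the
# deterministic recursion x(k+1) = x(k) + f(x(k))/M plays the role of
# x_{N,M}(k) in Corollary 1.
rng = np.random.default_rng(0)
p, q = 0.6, 0.4        # hypothetical propensities
HORIZON = 5            # run K = HORIZON * M steps, i.e., fixed "time" K/M

def drift(x):
    return p * (1.0 - x) - q * x

def run_chain(M):
    X, x, max_gap = 0, 0.0, 0.0
    for _ in range(HORIZON * M):
        frac = X / M
        u = rng.random()
        if u < p * (1.0 - frac):                    # arrival
            X += 1
        elif u < p * (1.0 - frac) + q * frac:       # departure
            X -= 1
        x += drift(x) / M                           # deterministic recursion
        max_gap = max(max_gap, abs(X / M - x))
    return max_gap

for M in (10, 100, 1000, 10000):
    print(f"M = {M:6d}, sup_k |X(k)/M - x(k)| ~ {run_chain(M):.4f}")

On a typical run the printed gap decreases roughly like $1/\sqrt{M}$, which is consistent with $M$ acting as the fast averaging parameter in this first convergence step.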


C. PROOF OF THEOREM 1
We now prove the main theorem.

Proof of Theorem 1: By Lemma 4, there exist a sequence $\{\xi_N\}$ and $c_2 < \infty$ such that as $N \to \infty$, $\xi_N \to 0$, and for $N$ sufficiently large, $\|\varepsilon_N\|^{(N)} \le c_2\xi_N$.

Let $X_N = [X_N(1)^\top,\dots,X_N(K_N)^\top]^\top/M_N$, $x_N = [x_N(1)^\top,\dots,x_N(K_N)^\top]^\top$, and $z_N = [z_N(1)^\top,\dots,z_N(K_N)^\top]^\top$ denote vectors in $\mathbb{R}^{K_N N}$. Hence $\varepsilon_N = x_N - z_N$.

For $x \in \mathbb{R}^{K_N N}$, where $x = [x(1)^\top,\dots,x(K_N)^\top]^\top$ and $x(k) = [x(k,1),\dots,x(k,N)]^\top \in \mathbb{R}^N$, we have that
$$\|x\|^{(N)} \le \max_{\substack{k=1,\dots,K_N\\ n=1,\dots,N}} |x(k,n)|.$$
Therefore, by Corollary 1, there exists a sequence $\{\tilde M_N\}$ such that if for each $N$, $M_N \ge \tilde M_N$, then
$$\sum_{N=1}^{\infty} P\big\{\|X_N - x_N\|^{(N)} > \xi_N\big\} \le \sum_{N=1}^{\infty} \frac{1}{N^2} < \infty.$$
It follows from the first Borel-Cantelli lemma that a.s., there exists $N_1$ such that for $N \ge N_1$ and $M_N \ge \tilde M_N$, $\|X_N - x_N\|^{(N)} \le \xi_N$.

By the triangle inequality,
$$\|X_N - z_N\|^{(N)} \le \|X_N - x_N\|^{(N)} + \|\varepsilon_N\|^{(N)}.$$
Therefore, a.s., there exists $N_2$ such that for $N \ge N_2$ and $M_N > \tilde M_N$,
$$\|X_N - z_N\|^{(N)} < (c_2 + 1)\xi_N. \quad (27)$$

Let $z_N^{(p)}(t,s)$, where $(t,s) \in [0,T]\times D$, be the continuous-time-space extension of $z_N(k)$, defined in the same way as $X_N^{(p)}(t,s)$ is defined from $X_N(k)$. Then by its definition, we have that
$$\|X_N^{(p)} - z_N^{(p)}\|^{(p)} = \|X_N - z_N\|^{(N)}. \quad (28)$$

Let $\square_N(k,n) = \square_N^{(t)}(k)\times\square_N^{(s)}(n)$ be the subset of $[0,T]\times D$ containing $(t_N(k), v_N(n))$ over which $z_N^{(p)}$ is piecewise constant; i.e., $t_N(k)\in\square_N^{(t)}(k)$ and $v_N(n)\in\square_N^{(s)}(n)$, and for all $(t,s)\in\square_N(k,n)$, $z_N^{(p)}(t,s) = z_N^{(p)}(t_N(k), v_N(n)) = z(t_N(k), v_N(n))$.

By (10), there exists a sequence $\{\bar M_N\}$ such that if for each $N$, $M_N \ge \bar M_N$, then for $N$ sufficiently large, $dt_N \le ds_N$. By Assumption T1.5, there exists $c_3 < \infty$ such that for $N$ sufficiently large, for $M_N \ge \bar M_N$, and for $k = 1,\dots,K_N$ and $n = 1,\dots,N$,
$$|z(t_N(k), v_N(n)) - z(t,s)| \le c_3\,ds_N, \qquad (t,s)\in\square_N(k,n).$$
Then we have that
$$\|z_N^{(p)} - z\|^{(p)} = \sup_{t\in[0,T]}\int_D \big|z_N^{(p)}(t,s) - z(t,s)\big|\,ds = \sup_{t\in[0,T]}\sum_n \int_{\square_N^{(s)}(n)} \big|z_N^{(p)}(t,s) - z(t,s)\big|\,ds$$
$$= \max_k \sup_{t\in\square_N^{(t)}(k)} \sum_n \int_{\square_N^{(s)}(n)} \big|z_N^{(p)}(t,s) - z(t,s)\big|\,ds = \max_k \sup_{t\in\square_N^{(t)}(k)} \sum_n \int_{\square_N^{(s)}(n)} \big|z(t_N(k), v_N(n)) - z(t,s)\big|\,ds$$
$$\le \max_k \sum_n \int_{\square_N^{(s)}(n)} c_3\,ds_N\,ds = c_3\,ds_N\,|D|, \quad (29)$$
where $|D|$ is the Lebesgue measure of $D$.

By the triangle inequality,
$$\|X_N^{(p)} - z\|^{(p)} \le \|X_N^{(p)} - z_N^{(p)}\|^{(p)} + \|z_N^{(p)} - z\|^{(p)}.$$
Set $\hat M_N = \max\{\tilde M_N, \bar M_N\}$. By (27), (28), and (29), a.s., there exist $c_0 < \infty$ and $N_0$ such that for $N \ge N_0$ and $M_N \ge \hat M_N$,
$$\|X_N^{(p)} - z\|^{(p)} < c_0\max\{\xi_N, ds_N\}. \qquad\blacksquare$$
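The summability step in the proof above carries the entire weight of the Borel-Cantelli argument, and it is easy to visualize numerically. The following sketch is ours and assumes nothing about the paper's model (the numbers of sample paths and indices are arbitrary): it draws independent events $E_N$ with $P(E_N) = 1/N^2$ and confirms that, on every sample path, only a handful of them ever occur, which is exactly the "a.s. for all $N$ sufficiently large" conclusion used above. In the proof, $E_N$ is the event $\{\|X_N - x_N\|^{(N)} > \xi_N\}$, whose probability is controlled by Corollary 1.

import numpy as np

# Monte Carlo illustration (ours, not from the paper) of the Borel-Cantelli
# step: if P(E_N) <= 1/N^2, then sum_N P(E_N) < infinity, so almost surely
# only finitely many of the events E_N occur.
rng = np.random.default_rng(2)
paths, N_max = 1000, 5000
N = np.arange(1, N_max + 1)

occur = rng.random((paths, N_max)) < 1.0 / N**2   # E_N occurs w.p. 1/N^2
counts = occur.sum(axis=1)                        # events per sample path

print("mean number of events per path:", counts.mean())   # ~ sum 1/N^2 ~ 1.64
print("max number of events on any path:", counts.max())  # small on every path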


D. PROOF OF THEOREM 2
To prove Theorem 2, we first prove Lemmas 5 and 6 below. First, we provide in Lemma 5 a sequence bounding $\{\mu_N\}$ from above. By (18), for each $N$, for $k = 1,\dots,K_N$ and $n = 1,\dots,N$, we can write $\varepsilon_N(k,n) = H_N^{(k,n)}(u_N)$, where $H_N^{(k,n)}$ maps $\mathbb{R}^{K_N N}$ to $\mathbb{R}$. Suppose that $H_N$ is differentiable at 0. Define
$$D_{H_N} = \max_{k=1,\dots,K_N}\ \max_{j=1,\dots,N}\ \sum_{i=1}^{K_N}\sum_{n=1}^{N} \left|\frac{\partial H_N^{(k,n)}}{\partial u(i,j)}(0)\right|. \quad (30)$$
We have the following.

Lemma 5: Assume that:
L5.1 for each $N$, (17) holds; and
L5.2 for each $N$, $H_N \in C^1$ locally at 0.
Then for each $N$, $\mu_N \le D_{H_N}$.

Proof: Let $J_N$ be the Jacobian matrix of $H_N$ at 0. Note that $J_N \in \mathbb{R}^{K_N N\times K_N N}$. Let $J_N(l,m)$ be its $(l,m)$th component, where $l,m = 1,\dots,K_N N$. Then we have that for $k,i = 1,\dots,K_N$ and $n,j = 1,\dots,N$,
$$\frac{\partial H_N^{(k,n)}}{\partial u(i,j)}(0) = J_N\big((k-1)N + n,\ (i-1)N + j\big).$$
Let $C_N(k,i)$ be the matrix in $\mathbb{R}^{N\times N}$ such that for $n,j = 1,\dots,N$, the $(n,j)$th component of $C_N(k,i)$ is $\partial H_N^{(k,n)}/\partial u(i,j)(0)$; i.e., $C_N(k,i)$ is the $(k,i)$th block in the partition of $J_N$ into $N\times N$ blocks (there are $K_N\times K_N$ such blocks), where $k,i = 1,\dots,K_N$. Then by (30),
$$D_{H_N} = \max_{k=1,\dots,K_N} \sum_{i=1}^{K_N} \|C_N(k,i)\|_1^{(N)}. \quad (31)$$
($\|\cdot\|_1^{(N)}$ is defined by (23).)

By (19) and (22), for $u = [u(1)^\top,\dots,u(K_N)^\top]^\top \in \mathbb{R}^{K_N N}$, where $u(k) = [u(k,1),\dots,u(k,N)]^\top \in \mathbb{R}^N$,
$$\|J_N u\|^{(N)} = ds_N \max_{k=1,\dots,K_N} \Big\|\sum_{i=1}^{K_N} C_N(k,i)u(i)\Big\|_1^{(N)} \le ds_N \max_{k=1,\dots,K_N} \sum_{i=1}^{K_N} \big\|C_N(k,i)u(i)\big\|_1^{(N)}$$
$$\le ds_N \max_{k=1,\dots,K_N} \sum_{i=1}^{K_N} \|C_N(k,i)\|_1^{(N)}\,\|u(i)\|_1^{(N)} \le \max_{k=1,\dots,K_N} \sum_{i=1}^{K_N} \|C_N(k,i)\|_1^{(N)}\cdot ds_N \max_{l=1,\dots,K_N}\|u(l)\|_1^{(N)} = D_{H_N}\|u\|^{(N)},$$
where the last equation follows from (31), (19), and (22). Therefore, for $u \ne 0$,
$$D_{H_N} \ge \frac{\|J_N u\|^{(N)}}{\|u\|^{(N)}}. \quad (32)$$

Note that if $u_N = 0$, then by (16) and (17), $\varepsilon_N = 0$. Therefore
$$H_N(0) = 0. \quad (33)$$
By Assumption L5.2 and Taylor's theorem, there exists a function $\tilde H_N$ such that
$$H_N(u) = J_N u + \tilde H_N(u), \quad (34)$$
and
$$\lim_{\alpha\to 0}\ \sup_{\|u\|^{(N)}\le\alpha} \frac{\|\tilde H_N(u)\|^{(N)}}{\|u\|^{(N)}} = 0. \quad (35)$$
By (34) and the triangle inequality, we have that
$$\|H_N(u)\|^{(N)} \le \|J_N u\|^{(N)} + \|\tilde H_N(u)\|^{(N)}.$$
Therefore, by (20),
$$\mu_N \le \lim_{\alpha\to 0}\ \sup_{\|u\|^{(N)}\le\alpha} \left(\frac{\|J_N u\|^{(N)}}{\|u\|^{(N)}} + \frac{\|\tilde H_N(u)\|^{(N)}}{\|u\|^{(N)}}\right).$$
Hence by (32) and (35), we complete the proof. $\blacksquare$

Next we present in Lemma 6 a relationship between $f_N$ and $D_{H_N}$. Define, for each $N$ and for $k,l = 1,\dots,K_N$,
$$B_N^{(k,l)} = \begin{cases} A_N(k-1)A_N(k-2)\cdots A_N(l), & 1 \le l < k;\\ I_N, & l = k;\\ 0, & l > k, \end{cases} \quad (36)$$
where $A_N(l)$ is as defined by (21). We have the following.

Lemma 6: Assume that:
L6.1 for each $N$, (17) holds; and
L6.2 for each $N$, $f_N \in C^1$.
Then we have that for each $N$, for $k,i = 1,\dots,K_N$ and $n,j = 1,\dots,N$,
$$\frac{\partial H_N^{(k,n)}}{\partial u(i,j)}(0) = B_N^{(k,i)}(n,j)\,dt_N.$$

Proof: By Assumption L6.2 and Taylor's theorem, for fixed $z$, there exists a function $\tilde f_N$ such that
$$f_N(x_N(k)) - f_N(z_N(k)) = Df_N(z_N(k))\varepsilon_N(k) + \tilde f_N\big(z_N(k) + \varepsilon_N(k),\ z_N(k)\big),$$
and for each $z$,
$$\tilde f_N(z, z) = 0, \quad (37)$$
and
$$\lim_{\|\varepsilon\|^{(N)}\to 0} \frac{\|\tilde f_N(z + \varepsilon, z)\|^{(N)}}{\|\varepsilon\|^{(N)}} = 0. \quad (38)$$
Then we have from (16) that for $k = 0,\dots,K_N - 1$,
$$\varepsilon_N(k+1) = \varepsilon_N(k) + \frac{1}{M_N} Df_N(z_N(k))\varepsilon_N(k) + \frac{1}{M_N}\tilde f_N\big(z_N(k) + \varepsilon_N(k),\ z_N(k)\big) + dt_N u_N(k).$$
Therefore
$$\varepsilon_N(k+1) = A_N(k)\varepsilon_N(k) + dt_N u_N(k) + \frac{\tilde f_N\big(z_N(k) + \varepsilon_N(k),\ z_N(k)\big)}{M_N}.$$
For $k = 0,\dots,K_N - 1$, define
$$\eta_N(k) = dt_N u_N(k) + \frac{\tilde f_N\big(z_N(k) + \varepsilon_N(k),\ z_N(k)\big)}{M_N}. \quad (39)$$
Then $\varepsilon_N(k+1) = A_N(k)\varepsilon_N(k) + \eta_N(k)$. Therefore, for $k = 1,\dots,K_N$,
$$\varepsilon_N(k) = A_N(k-1)\cdots A_N(1)\eta_N(0) + A_N(k-1)\cdots A_N(2)\eta_N(1) + \cdots + A_N(k-1)\eta_N(k-2) + \eta_N(k-1).$$
Then it follows from (36) that for $k = 1,\dots,K_N$,
$$\varepsilon_N(k) = \sum_{l=1}^{k} B_N^{(k,l)}\eta_N(l-1). \quad (40)$$
Write $\varepsilon_N(k) = H_N^{(k)}(u_N)$. By (39),
$$\eta_N(k) = dt_N u_N(k) + \frac{\tilde f_N\big(z_N(k) + H_N^{(k)}(u_N),\ z_N(k)\big)}{M_N}.$$
Hence by (40), for $k = 1,\dots,K_N$,
$$\varepsilon_N(k) = \sum_{l=1}^{k} B_N^{(k,l)}dt_N u_N(l-1) + \sum_{l=1}^{k} B_N^{(k,l)}\frac{\tilde f_N\big(z_N(l-1) + H_N^{(l-1)}(u_N),\ z_N(l-1)\big)}{M_N}.$$
Denote by $g_N^{(k,l,n)}(\cdot): \mathbb{R}^{K_N N}\to\mathbb{R}$ the $n$th component of
$$B_N^{(k,l)}\tilde f_N\big(z_N(l-1) + H_N^{(l-1)}(\cdot),\ z_N(l-1)\big).$$
By (37) and (33), $g_N^{(k,l,n)}(0) = 0$.


Let $\{e(i,j) : i = 1,\dots,K_N,\ j = 1,\dots,N\}$ be the standard basis for $\mathbb{R}^{K_N N}$; i.e., $e(i,j)$ is the element of $\mathbb{R}^{K_N N}$ with the $(i,j)$th entry being 1 and all other entries being 0. Then
$$\frac{\partial H_N^{(k,n)}}{\partial u(i,j)}(0) = B_N^{(k,i)}(n,j)\,dt_N + \lim_{h\to 0}\frac{1}{M_N}\sum_{l=1}^{k}\frac{g_N^{(k,l,n)}(h\,e(i,j))}{h}.$$
It remains to show that
$$\lim_{h\to 0}\frac{g_N^{(k,l,n)}(h\,e(i,j))}{h} = 0.$$
Denote by $\theta_N^{(l,d)}(\cdot): \mathbb{R}^{K_N N}\to\mathbb{R}$ the $d$th component of $\tilde f_N\big(z_N(l) + H_N^{(l)}(\cdot),\ z_N(l)\big)$. Then
$$g_N^{(k,l,n)}(u) = \sum_{d=1}^{N} B_N^{(k,l)}(n,d)\,\theta_N^{(l-1,d)}(u).$$
Denote by $\tilde f_N^{(l,d)}(\cdot): \mathbb{R}^N\to\mathbb{R}$ the $d$th component of $\tilde f_N\big(z_N(l) + (\cdot),\ z_N(l)\big)$. Then
$$\theta_N^{(l,d)}(u) = \tilde f_N^{(l,d)}\big(H_N^{(l)}(u)\big). \quad (41)$$
Then it remains to show that
$$\lim_{\|u\|^{(N)}\to 0}\frac{\theta_N^{(l,d)}(u)}{\|u\|^{(N)}} = 0. \quad (42)$$
By Assumption L6.2 and by induction, it follows from (16) that for fixed $z$, $\varepsilon_N$ is a $C^1$ function of $u_N$, because the composition of $C^1$ functions is still $C^1$. Hence Assumption L6.2 here implies Assumption L5.2 of Lemma 5. By Assumption L5.2 and (33), there exists $c$ with $|c| < \infty$ such that for each $\varepsilon_1 > 0$, there exists $\delta_1(\varepsilon_1)$ such that for $\|u\|^{(N)} < \delta_1(\varepsilon_1)$,
$$\left|\frac{\|H_N^{(l)}(u)\|^{(N)}}{\|u\|^{(N)}} - c\right| < \varepsilon_1.$$
Hence for $\|u\|^{(N)} < \delta_1(\varepsilon_1)$,
$$\|H_N^{(l)}(u)\|^{(N)} < (|c| + \varepsilon_1)\|u\|^{(N)}. \quad (43)$$
By (38), $\lim_{\|x\|^{(N)}\to 0} \tilde f_N^{(l,d)}(x)/\|x\|^{(N)} = 0$. Hence for each $\varepsilon_2 > 0$, there exists $\delta_2(\varepsilon_2)$ such that for $\|x\|^{(N)} < \delta_2(\varepsilon_2)$, $|\tilde f_N^{(l,d)}(x)|/\|x\|^{(N)} < \varepsilon_2/(|c|+1)$. Hence for $0 < \|x\|^{(N)} < \delta_2(\varepsilon_2)$,
$$\big|\tilde f_N^{(l,d)}(x)\big| < \frac{\varepsilon_2}{|c|+1}\,\|x\|^{(N)}. \quad (44)$$
For each $\varepsilon$, let $\hat\varepsilon(\varepsilon)$ be sufficiently small such that
$$\big(|c| + \hat\varepsilon(\varepsilon)\big)\,\delta_1\big(\hat\varepsilon(\varepsilon)\big) < \delta_2(\varepsilon), \quad (45)$$
and
$$\hat\varepsilon(\varepsilon) < 1. \quad (46)$$
Then by (43) and (45), for $\|u\|^{(N)} < \delta_1(\hat\varepsilon(\varepsilon))$, $\|H_N^{(l)}(u)\|^{(N)} < \delta_2(\varepsilon)$. Therefore, in the case that $\|H_N^{(l)}(u)\|^{(N)} > 0$, by (41) and (44),
$$\big|\theta_N^{(l,d)}(u)\big| = \big|\tilde f_N^{(l,d)}\big(H_N^{(l)}(u)\big)\big| < \frac{\varepsilon}{|c|+1}\,\|H_N^{(l)}(u)\|^{(N)}.$$
By (43) and (46),
$$\|H_N^{(l)}(u)\|^{(N)} < \big(|c| + \hat\varepsilon(\varepsilon)\big)\|u\|^{(N)} < (|c| + 1)\|u\|^{(N)}.$$
By the above two inequalities,
$$\frac{\theta_N^{(l,d)}(u)}{\|u\|^{(N)}} < \varepsilon. \quad (47)$$
By (37), $\tilde f_N^{(l,d)}(0) = 0$. Therefore, in the case that $\|H_N^{(l)}(u)\|^{(N)} = 0$, $\theta_N^{(l,d)}(u) = 0$, and thus (47) still holds. Therefore, (42) holds. $\blacksquare$

Now we prove Theorem 2 using the preceding lemmas.

Proof of Theorem 2: By (30), Lemma 5, and Lemma 6, we have that
$$\mu_N \le \max_{k=1,\dots,K_N}\ \max_{j=1,\dots,N}\ \sum_{i=1}^{K_N}\sum_{n=1}^{N}\big|B_N^{(k,i)}(n,j)\big|\,dt_N = \max_{k=1,\dots,K_N}\sum_{i=1}^{K_N}\big\|B_N^{(k,i)}\big\|_1^{(N)}dt_N$$
$$\le \max_{k=1,\dots,K_N} K_N \max_{i=1,\dots,K_N}\big\|B_N^{(k,i)}\big\|_1^{(N)}dt_N \le T \max_{\substack{k=1,\dots,K_N\\ i=1,\dots,K_N}}\big\|B_N^{(k,i)}\big\|_1^{(N)}.$$
($\|\cdot\|_1^{(N)}$ is defined by (23).) Therefore, by (36) and by the sub-multiplicative property of induced norms, we have that
$$\mu_N \le T\max_{\substack{k=1,\dots,K_N\\ i=1,\dots,k-1}}\big\|A_N(k-1)A_N(k-2)\cdots A_N(i)\big\|_1^{(N)} \le T\max_{\substack{k=1,\dots,K_N\\ i=1,\dots,k-1}}\big\|A_N(k-1)\big\|_1^{(N)}\cdots\big\|A_N(i)\big\|_1^{(N)}.$$
Then by Assumption T2.3, there exists $c < \infty$ such that for $N$ sufficiently large,
$$\mu_N \le T(1 + c\,dt_N)^{K_N}.$$
As $N\to\infty$, $K_N\to\infty$, and
$$(1 + c\,dt_N)^{K_N} = \left(1 + \frac{cT}{K_N}\right)^{K_N} \to e^{cT}.$$
Therefore $\{\mu_N\}$ is bounded. $\blacksquare$
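Two mechanical ingredients of the argument above are easy to check numerically on toy data. The Python sketch below is ours and uses arbitrary small matrices rather than the paper's $A_N(k)$ (the sizes K and n are hypothetical): it first verifies that the transition products $B^{(k,l)}$ of (36) give the closed form (40) for the linear recursion $\varepsilon(k+1) = A(k)\varepsilon(k) + \eta(k)$ with $\varepsilon(0) = 0$, as is implicit in the expansion in the proof of Lemma 6, and it then illustrates the limit $(1 + cT/K_N)^{K_N} \to e^{cT}$ used at the end of the proof of Theorem 2.

import numpy as np

rng = np.random.default_rng(1)
K, n = 8, 5                                    # toy sizes, not the paper's K_N, N
A = [rng.standard_normal((n, n)) for _ in range(K)]
eta = [rng.standard_normal(n) for _ in range(K)]

# Forward recursion eps(k+1) = A(k) eps(k) + eta(k), eps(0) = 0.
eps, forward = np.zeros(n), []
for k in range(K):
    eps = A[k] @ eps + eta[k]
    forward.append(eps.copy())

def B(k, l):
    # B^{(k,l)} = A(k-1) A(k-2) ... A(l) for l < k, identity for l = k,
    # as in (36); only called here with l <= k.
    M = np.eye(n)
    for j in range(k - 1, l - 1, -1):
        M = M @ A[j]
    return M

closed = [sum(B(k, l) @ eta[l - 1] for l in range(1, k + 1))
          for k in range(1, K + 1)]            # closed form (40)
print(max(np.linalg.norm(a - b) for a, b in zip(forward, closed)))  # ~1e-15

c, T = 2.0, 1.0
for KN in (10, 100, 1000, 10000):
    print(KN, (1 + c * T / KN) ** KN, np.exp(c * T))   # converges to e^{cT}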


E. PROOF OF THEOREM 3
We now prove Theorem 3 using Theorems 1 and 2.

Proof of Theorem 3: It follows from (5) that there exist $c_1, c_2 < \infty$ such that for $N$ sufficiently large and $k = 0,\dots,K_N-1$,
$$|u_N(k,n)| < c_1,\quad n = 1, N;\qquad |u_N(k,n)| < c_2\,ds_N,\quad n = 2,\dots,N-1. \quad (48)$$
Therefore, there exists $c_3 < \infty$ such that for $N$ sufficiently large,
$$\max_{k=0,\dots,K_N-1}\sum_{n=1}^{N}|u_N(k,n)| < c_3,$$
and hence by (19), we have that for $N$ sufficiently large,
$$\|u_N\|^{(N)} < c_3\,ds_N.$$
Hence Assumption T1.1 of Theorem 1 holds.

By (5), for each $N$, for $x = [x_1,\dots,x_N]^\top \in [0,1]^N$, the $(n,m)$th component of $Df_N(x)$, where $n,m = 1,\dots,N$, is
$$\begin{cases}
P_l(n)x_n(1-x_{n-1}), & m = n-2;\\[2pt]
(1-x_n)\big[P_r(n-1)(1-x_{n+1}) - P_l(n+1)x_{n+1}\big] + P_l(n)x_n(1-x_{n-2}), & m = n-1;\\[2pt]
-P_r(n-1)x_{n-1}(1-x_{n+1}) - P_l(n+1)x_{n+1}(1-x_{n-1})\\
\qquad - P_r(n)(1-x_{n+1})(1-x_{n+2}) - P_l(n)(1-x_{n-1})(1-x_{n-2}), & m = n;\\[2pt]
(1-x_n)\big[P_l(n+1)(1-x_{n-1}) - P_r(n-1)x_{n-1}\big] + P_r(n)x_n(1-x_{n+2}), & m = n+1;\\[2pt]
P_r(n)x_n(1-x_{n+1}), & m = n+2;\\[2pt]
0, & \text{otherwise},
\end{cases}$$
where $x_n$ with $n \le 0$ or $n \ge N+1$ are defined to be zero. It then follows that for each $k$,
$$\|A_N(k)\|_1^{(N)} = 1. \quad (49)$$
Hence Assumption T2.3 of Theorem 2 holds. We note that obtaining (49) and (48) requires tedious, but elementary, algebraic manipulation. One can also verify that the other assumptions of Theorems 1 and 2 hold. By Theorem 1, this completes the proof. $\blacksquare$
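The banded structure of $Df_N(x)$ displayed above transcribes directly into code. The Python sketch below is ours: the function name, the placeholder transmission probabilities $P_l$, $P_r$, and the random state $x$ are all illustrative choices, and only the Jacobian assembly itself is taken from the five cases listed in the proof (with the stated convention $x_n = 0$ for $n \le 0$ or $n \ge N+1$). The final line is a conservation-style sanity check of our own, not a claim from the paper: away from the boundary, the columns of $Df_N(x)$ sum to zero, which is consistent with interior transitions only moving packets between neighboring nodes; the paper's stated consequence of this structure is the norm identity (49).

import numpy as np

def jacobian(x, Pl, Pr):
    """Assemble Df_N(x) from the five banded cases in the proof of Theorem 3."""
    N = len(x)
    xp = np.zeros(N + 4)              # zero padding: x_n = 0 outside 1..N
    xp[2:N + 2] = x                   # xp[n + 1] stores x_n for n = 1..N

    def X(n):                         # x_n with the zero-padding convention
        return xp[n + 1]

    J = np.zeros((N, N))
    for n in range(1, N + 1):         # row index n
        entries = {
            n - 2: Pl[n] * X(n) * (1 - X(n - 1)),
            n - 1: (1 - X(n)) * (Pr[n - 1] * (1 - X(n + 1)) - Pl[n + 1] * X(n + 1))
                   + Pl[n] * X(n) * (1 - X(n - 2)),
            n:     -Pr[n - 1] * X(n - 1) * (1 - X(n + 1))
                   - Pl[n + 1] * X(n + 1) * (1 - X(n - 1))
                   - Pr[n] * (1 - X(n + 1)) * (1 - X(n + 2))
                   - Pl[n] * (1 - X(n - 1)) * (1 - X(n - 2)),
            n + 1: (1 - X(n)) * (Pl[n + 1] * (1 - X(n - 1)) - Pr[n - 1] * X(n - 1))
                   + Pr[n] * X(n) * (1 - X(n + 2)),
            n + 2: Pr[n] * X(n) * (1 - X(n + 1)),
        }
        for m, val in entries.items():
            if 1 <= m <= N:
                J[n - 1, m - 1] = val
    return J

N = 20
rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, N)
Pl = rng.uniform(0.0, 0.5, N + 2)     # placeholder probabilities, index 0..N+1,
Pr = rng.uniform(0.0, 0.5, N + 2)     # so that Pr[n-1] and Pl[n+1] exist
J = jacobian(x, Pl, Pr)

# Our own sanity check: interior columns of Df_N(x) sum to zero.
print(np.abs(J.sum(axis=0)[2:N - 2]).max())   # ~ 1e-16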
Estep, ‘‘On continuum limits of Markov chains and network modeling,’’ in Proc. 49th IEEE Conf. CDC, simulate networks with a very large number of components, Dec. 2010, pp. 6779–6784. which is practically infeasible for Monte Carlo simulation. [3] E. K. P. Chong, D. Estep, and J. Hannig, ‘‘Continuum modeling of large Such a tool enables us to tackle problems such as per- networks,’’ Int. J. Numer. Model., Electron. Netw., Devices Fields, vol. 21, no. 3, pp. 169–186, 2008. formance analysis and prototyping, resource provisioning, [4] K. Wong, C. Gin, R. Rajeswari, W. Woon, and E. Chong, ‘‘Feasability network design, network parametric optimization, network of a PDE-based teletraffic model for cellular networks,’’ in Proc. ICIAS, Nov. 2007, pp. 383–388. control, network tomography, and inverse problems, for very [5] N. Burch, E. Chong, D. Estep, and J. Hannig, ‘‘Analysis of routing proto- large networks. For example, we can now use the PDE model cols and interference-limited communication in large wireless networks,’’ J. Eng. Math., vol. 79, no. 1, pp. 183–199, 2012. to optimize certain performance metrics (e.g., throughput) of [6] G. R. Liu and S. S. Quek, The Finite Element Method: A Practical Course. a large network by adjusting the placement of destination Oxford, U.K.: Butterworth-Heinemann, 2003. nodes or the routing parameters (e.g., coefficients in convec- [7] A. R. Mitchell and D. F. Griffiths, The Finite Difference Method in Partial Differential Equations. New York, NY, USA: Wiley, 1980. tion terms), with relatively negligible computation overhead [8] M. Burger, P. Markowich, and J. Pietschmann, ‘‘Continuous limit compared with that of the same task done by Monte Carlo of a crowd motion and herding model: Analysis and numerical simulations,’’ Kinetic Rel. Methods, vol. 4, no. 4, pp. 1025–1047, simulations. 2011. For simplicity, we have treated sequences of grid points [9] S. Bhatt, R. Fujimoto, A. Ogielski, and K. Perumalla, ‘‘Parallel simulation that are uniformly located. As with finite difference meth- techniques for large-scale networks,’’ IEEE Commun. Mag., vol. 36, no. 8, pp. 42–47, Aug. 1998. ods for differential equations, the convergence results can [10] R. M. Fujimoto, Parallel and Distributed Simulation Systems. New York, be extended to models that have nonuniform points spacing NY, USA: Wiley, 2000. [11] R. M. Fujimoto, K. Perumalla, A. Park, H. Wu, M. H. Ammar, and G. F. under assumptions that insure the points in the sequence Riley, ‘‘Large-scale network simulation—How big? How fast?’’ in Proc. should become dense in the underlying domain uniformly in Symp. MASCOTS, 2003, pp. 1–8.


[12] B. Szymanski, Y. Liu, and R. Gupta, "Parallel network simulation under distributed genesis," in Proc. 17th Workshop Parallel Distrib. Simul., Jun. 2003, pp. 61–68.
[13] M. Villén-Altamirano and J. Villén-Altamirano, "On the efficiency of RESTART for multidimensional state systems," ACM Trans. Model. Comput. Simul., vol. 16, pp. 251–279, Jul. 2006.
[14] P. M. Papazoglou, D. A. Karras, and R. C. Papademetriou, "An efficient scheduling mechanism for simulating concurrent events in wireless communications based on an improved priority queue (PQ) TDM layered multi-threading approach," WSEAS Trans. Commun., vol. 7, no. 3, pp. 207–215, Mar. 2008.
[15] D. Anick, D. Mitra, and M. Sondhi, "Stochastic theory of a data handling system with multiple sources," Bell Syst. Tech. J., vol. 61, no. 8, pp. 1871–1894, Oct. 1982.
[16] G. Kesidis, A. Singh, D. Cheung, and W. Kwok, "Feasibility of fluid event-driven simulation for ATM networks," in Proc. Commun., Key Global Prosperity, Global Telecommun. Conf., vol. 3, Nov. 1996, pp. 2013–2017.
[17] K. Kumaran and D. Mitra, "Performance and fluid simulations of a novel shared buffer management system," in Proc. IEEE 17th Annu. Joint Conf. Comput. Commun. Soc., vol. 3, Mar./Apr. 1998, pp. 1449–1461.
[18] B. Liu, D. Figueiredo, Y. Guo, J. Kurose, and D. Towsley, "A study of networks simulation efficiency: Fluid simulation vs. packet-level simulation," in Proc. 20th Annu. Joint Conf. IEEE Comput. Commun. Soc., vol. 3, Apr. 2001, pp. 1244–1253.
[19] H. J. Kushner, Approximation and Weak Convergence Methods for Random Processes, with Applications to Stochastic Systems Theory. Cambridge, MA, USA: MIT Press, 1984.
[20] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Stat., vol. 22, no. 3, pp. 400–407, 1951.
[21] J. Kiefer and J. Wolfowitz, "Stochastic estimation of the maximum of a regression function," Ann. Math. Stat., vol. 23, pp. 462–466, 1952.
[22] M. Benaïm, "Dynamics of stochastic approximation algorithms," in Séminaire de Probabilités XXXIII (Lecture Notes in Mathematics), vol. 1709. Berlin/Heidelberg, Germany: Springer-Verlag, 1999, pp. 1–68.
[23] T. L. Lai, "Stochastic approximation: Invited paper," Ann. Stat., vol. 31, no. 2, pp. 391–406, 2003.
[24] L. Ljung, "Analysis of recursive stochastic algorithms," IEEE Trans. Autom. Control, vol. 22, no. 4, pp. 551–575, Aug. 1977.
[25] H. J. Kushner and D. S. Clark, Stochastic Approximation for Constrained and Unconstrained Systems. New York, NY, USA: Springer-Verlag, 1978.
[26] A. Benveniste, M. Métivier, and P. Priouret, Stochastic Approximations and Adaptive Algorithms. New York, NY, USA: Springer-Verlag, 1990.
[27] M. Benaïm and M. W. Hirsch, "Stochastic approximation algorithms with constant step size whose average is cooperative," Ann. Appl. Probab., vol. 9, no. 1, pp. 216–241, 1999.
[28] H. J. Kushner and G. G. Yin, Stochastic Approximation and Recursive Algorithms and Applications. New York, NY, USA: Springer-Verlag, 2003.
[29] T. G. Kurtz, "Solutions of ordinary differential equations as limits of pure jump Markov processes," J. Appl. Probab., vol. 7, no. 1, pp. 49–58, 1970.
[30] R. W. R. Darling. (2002, Dec.). Fluid Limits of Pure Jump Markov Processes: A Practical Guide [Online]. Available: http://arxiv.org/abs/math/0210109
[31] R. W. R. Darling and J. R. Norris, "Differential equation approximations for Markov chains," Probab. Surveys, vol. 5, pp. 37–79, 2008.
[32] R. McVinish and P. Pollett, "The deterministic limit of a stochastic logistic model with individual variation," Math. Biosci., vol. 241, no. 1, pp. 109–114, Jan. 2013.
[33] S. N. Ethier and T. G. Kurtz, Markov Processes: Characterization and Convergence (Probability and Mathematical Statistics). New York, NY, USA: Wiley, 1986.
[34] P. Billingsley, Convergence of Probability Measures. New York, NY, USA: Wiley, 1999.
[35] H. F. Trotter, "Approximation of semi-groups of operators," Pacific J. Math., vol. 8, no. 4, pp. 887–919, 1958.
[36] D. W. Stroock and S. S. Varadhan, Multidimensional Diffusion Processes. Berlin, Germany: Springer-Verlag, 1979.
[37] P. Gupta and P. R. Kumar, "The capacity of wireless networks," IEEE Trans. Inf. Theory, vol. 46, no. 2, pp. 388–404, Mar. 2000.
[38] M. Grossglauser and D. N. C. Tse, "Mobility increases the capacity of ad hoc wireless networks," IEEE/ACM Trans. Netw., vol. 10, no. 4, pp. 477–486, Aug. 2002.
[39] L. L. Xie and P. Kumar, "A network information theory for wireless communication: Scaling laws and optimal operation," IEEE Trans. Inf. Theory, vol. 50, no. 5, pp. 748–767, May 2004.
[40] J. Herdtner and E. Chong, "Throughput-storage tradeoff in ad hoc networks," in Proc. 24th Annu. Joint Conf. IEEE Comput. Commun. Soc., vol. 4, Mar. 2005, pp. 2536–2542.
[41] P. Jelenkovic, P. Momcilovic, and M. Squillante, "Scalability of wireless networks," IEEE/ACM Trans. Netw., vol. 15, no. 2, pp. 295–308, Apr. 2007.
[42] C. Graham, "Chaoticity on path space for a queueing network with selection of the shortest queue among several," J. Appl. Probab., vol. 37, no. 1, pp. 198–211, 2000.
[43] M. Benaïm and J.-Y. Le Boudec, "A class of mean field interaction models for computer and communication systems," Perform. Evaluat., vol. 65, nos. 11–12, pp. 823–838, 2008.
[44] R. Bakhshi, L. Cloth, W. Fokkink, and B. R. Haverkort, "Mean-field analysis for the evaluation of gossip protocols," ACM SIGMETRICS Perform. Evaluat. Rev., vol. 36, pp. 31–39, Nov. 2008.
[45] R. Bakhshi, J. Endrullis, S. Endrullis, W. Fokkink, and B. Haverkort, "Automating the mean-field method for large dynamic gossip networks," in Proc. 7th Int. Conf. QEST, Sep. 2010, pp. 241–250.
[46] C. Bordenave, D. McDonald, and A. Proutiére, "A particle system in interaction with a rapidly varying environment: Mean field limits and applications," Netw. Heterogeneous Media, vol. 5, no. 1, pp. 31–62, 2010.
[47] H. Tembine, J.-Y. Le Boudec, R. El-Azouzi, and E. Altman, "Mean field asymptotics of Markov decision evolutionary games and teams," in Proc. Int. Conf. Game Theory Netw., May 2009, pp. 140–150.
[48] P. E. Caines, "Mean field stochastic control," Bode Lecture, in Proc. 48th IEEE Conf. Decision Control, Shanghai, China, Dec. 2009.
[49] M. Nourian, R. Malhamé, M. Huang, and P. Caines, "Optimality of adaption based mean field control laws in leader-follower stochastic collective dynamics," in Proc. 49th IEEE Conf. CDC, Dec. 2010, pp. 2270–2275.
[50] M. Nourian, P. Caines, and R. Malhame, "Synthesis of Cucker-Smale type flocking via mean field stochastic control theory: Nash equilibria," in Proc. 48th Annu. Allerton Conf. Commun., Control, Comput., Sep./Oct. 2010, pp. 814–819.
[51] N. Gast, B. Gaujal, and J.-Y. L. Boudec, "Mean field for Markov decision processes: From discrete to continuous optimization," IEEE Trans. Autom. Control, vol. 57, no. 9, pp. 2266–2280, Sep. 2012.
[52] O. Zeitouni and M. Gutman, "On universal hypotheses testing via large deviations," IEEE Trans. Inf. Theory, vol. 37, no. 2, pp. 285–290, Mar. 1991.
[53] Y. Amit and M. Miller, "Large deviations for coding Markov chains and Gibbs random fields," IEEE Trans. Inf. Theory, vol. 39, no. 1, pp. 109–118, Jan. 1993.
[54] A. Weiss, "An introduction to large deviations for communication networks," IEEE J. Sel. Areas Commun., vol. 13, no. 6, pp. 938–952, Aug. 1995.
[55] I. Paschalidis and Y. Chen, "Anomaly detection in sensor networks based on large deviations of Markov chain models," in Proc. 47th IEEE CDC, Dec. 2008, pp. 2338–2343.
[56] I. Paschalidis and G. Smaragdakis, "Spatio-temporal network anomaly detection by assessing deviations of empirical measures," IEEE/ACM Trans. Netw., vol. 17, no. 3, pp. 685–697, Jun. 2009.
[57] S. L. Sobolev, Partial Differential Equations of Mathematical Physics. New York, NY, USA: Dover, 1964.
[58] D. Blount, "Law of large numbers in the supremum norm for a chemical reaction with diffusion," Ann. Appl. Probab., vol. 2, no. 1, pp. 131–141, 1992.
[59] Y. Tyutyunov, I. Senina, and R. Arditi, "Clustering due to acceleration in the response to population gradient: A simple self-organization model," Amer. Naturalist, vol. 164, no. 6, pp. 722–735, 2004.
[60] C. F. Huang and H. Pages, "Optimal consumption and portfolio policies with an infinite horizon: Existence and convergence," Ann. Appl. Probab., vol. 2, no. 1, pp. 36–64, 1992.
[61] G. L. Xu and S. E. Shreve, "A duality method for optimal consumption and investment under short-selling prohibition. II. Constant market coefficients," Ann. Appl. Probab., vol. 2, no. 2, pp. 314–328, 1992.
[62] W. F. Phillips, A New Continuum Model for Traffic Flow. Washington, DC, USA: U.S. Dept. of Transportation, Research and Special Programs Administration, 1981.
[63] P. Kachroo, Pedestrian Dynamics: Feedback Control of Crowd Evacuation. New York, NY, USA: Springer-Verlag, 2008.
[64] C. B. Morrey, "On the derivation of the equations of hydrodynamics from statistical mechanics," Proc. Nat. Acad. Sci. USA, vol. 40, no. 5, pp. 317–322, 1954.


[65] R. L. Dobrushin and R. Siegmund-Schultze, "The hydrodynamic limit for systems of particles with independent evolution," Math. Nachrichten, vol. 105, no. 1, pp. 199–224, 1982.
[66] M. Z. Guo, G. C. Papanicolaou, and S. R. S. Varadhan, "Nonlinear diffusion limit for a system with nearest neighbor interactions," Commun. Math. Phys., vol. 118, no. 1, pp. 31–59, 1988.
[67] C. Kipnis and C. Landim, Scaling Limits of Interacting Particle Systems. New York, NY, USA: Springer-Verlag, 1999.
[68] A. Koukkous, "Hydrodynamic behavior of symmetric zero-range processes with random rates," Stochastic Process. Appl., vol. 84, no. 2, pp. 297–312, 1999.
[69] C. Bahadoran, H. Guiol, K. Ravishankar, and E. Saada, "Euler hydrodynamics of one-dimensional attractive particle systems," Ann. Probab., vol. 34, no. 4, pp. 1339–1369, 2006.
[70] H. Chen and A. Mandelbaum, "Discrete flow networks: Bottleneck analysis and fluid approximations," Math. Oper. Res., vol. 16, no. 2, pp. 408–446, 1991.
[71] H. Chen, "Fluid approximations and stability of multiclass queueing networks: Work-conserving disciplines," Ann. Appl. Probab., vol. 5, no. 3, pp. 637–665, 1995.
[72] J. G. Dai, "On positive Harris recurrence of multiclass queueing networks: A unified approach via fluid limit models," Ann. Appl. Probab., vol. 5, no. 1, pp. 49–77, 1995.
[73] J. G. Dai, "A fluid limit model criterion for instability of multiclass queueing networks," Ann. Appl. Probab., vol. 6, no. 3, pp. 751–757, 1996.
[74] S. P. Meyn, "Transience of multiclass queueing networks via fluid limit models," Ann. Appl. Probab., vol. 5, no. 4, pp. 946–957, 1995.
[75] A. Eryilmaz and R. Srikant, "Fluid models of wireless networks," in Proc. SCSC, Montreal, QC, Canada, Aug. 2003, pp. 1–5.
[76] F. P. Kelly and R. J. Williams, "Fluid model for a network operating under a fair bandwidth-sharing policy," Ann. Appl. Probab., vol. 14, no. 3, pp. 1055–1083, 2004.
[77] M. Lelarge, "Fluid limit of generalized Jackson queueing networks with stationary and ergodic arrivals and service times," J. Appl. Probab., vol. 42, no. 2, pp. 491–512, 2005.
[78] R. Egorova, S. Borst, and B. Zwart, "Bandwidth-sharing networks in overload," Perform. Evaluat., vol. 64, pp. 978–993, Oct. 2007.
[79] H. C. Gromoll and R. J. Williams, "Fluid limits for networks with bandwidth sharing and general document size distributions," Ann. Appl. Probab., vol. 19, no. 1, pp. 243–280, 2009.
[80] D. L. Iglehart, "Multiple channel queues in heavy traffic," Probab. Theory Rel. Fields, vol. 17, no. 2, pp. 168–180, 1971.
[81] J. M. Harrison, "A limit theorem for priority queues in heavy traffic," J. Appl. Probab., vol. 10, no. 4, pp. 907–912, 1973.
[82] G. Foschini and J. Salz, "A basic dynamic routing problem and diffusion," IEEE Trans. Commun., vol. 26, no. 3, pp. 320–327, Mar. 1978.
[83] S. Halfin and W. Whitt, "Heavy-traffic limits for queues with many exponential servers," Oper. Res., vol. 29, no. 3, pp. 567–588, 1981.
[84] J. M. Harrison and M. I. Reiman, "Reflected Brownian motion on an orthant," Ann. Probab., vol. 9, no. 2, pp. 302–308, 1981.
[85] W. Whitt, "On the heavy-traffic limit theorem for GI/G/∞ queues," Adv. Appl. Probab., vol. 14, no. 1, pp. 171–190, 1982.
[86] J. M. Harrison and R. J. Williams, "Brownian models of feedforward queueing networks: Quasireversibility and product form solutions," Ann. Appl. Probab., vol. 2, no. 2, pp. 263–293, 1992.
[87] J. M. Harrison and R. J. Williams, "A multiclass closed queueing network with unconventional heavy traffic behavior," Ann. Appl. Probab., vol. 6, no. 1, pp. 1–47, 1996.
[88] J. M. Harrison, "Heavy traffic analysis of a system with parallel servers: Asymptotic optimality of discrete-review policies," Ann. Appl. Probab., vol. 8, no. 3, pp. 822–848, 1998.
[89] J. M. Harrison, "Brownian models of open processing networks: Canonical representation of workload," Ann. Appl. Probab., vol. 10, no. 1, pp. 75–103, 2000.
[90] A. Budhiraja and A. P. Ghosh, "Diffusion approximations for controlled stochastic networks: An asymptotic bound for the value function," Ann. Appl. Probab., vol. 16, no. 4, pp. 1962–2006, 2006.
[91] M. Bramson, J. G. Dai, and J. M. Harrison, "Positive recurrence of reflecting Brownian motion in three dimensions," Ann. Appl. Probab., vol. 20, no. 2, pp. 753–783, 2010.
[92] H. Chen and J. Shanthikumar, "Fluid limits and diffusion approximations for networks of multi-server queues in heavy traffic," Discrete Event Dyn. Syst., vol. 4, no. 3, pp. 269–291, 1994.
[93] V. Nguyen, "Fluid and diffusion approximations of a two-station mixed queueing network," Math. Oper. Res., vol. 20, no. 2, pp. 321–354, 1995.
[94] W. Chen, D. Huang, A. Kulkarni, J. Unnikrishnan, Q. Zhu, P. Mehta, S. Meyn, and A. Wierman, "Approximate dynamic programming using fluid and diffusion approximations with applications to power management," in Proc. 48th IEEE CDC/CCC, Dec. 2009, pp. 3575–3580.
[95] E. W. Grundke and A. N. Zincir-Heywood, "A uniform continuum model for scaling of ad hoc networks," in Proc. 2nd Int. Conf. ADHOC-NOW, 2003, pp. 96–103.
[96] P. Jacquet, "Geometry of information propagation in massively dense ad hoc networks," in Proc. 5th ACM Int. Symp. Mobile Ad Hoc Netw. Comput., 2004, pp. 157–162.
[97] B. Sirkeci-Mergen, A. Scaglione, and G. Mergen, "Asymptotic analysis of multistage cooperative broadcast in wireless networks," IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2531–2550, Jun. 2006.
[98] S. Toumpis and L. Tassiulas, "Optimal deployment of large wireless sensor networks," IEEE Trans. Inf. Theory, vol. 52, no. 7, pp. 2935–2953, Jul. 2006.
[99] E. Altman, A. Silva, P. Bernhard, and M. Debbah, "Continuum equilibria for routing in dense ad-hoc networks," in Proc. 45th Annu. Allerton Conf. Commun., Control, Comput., 2007, pp. 1–31.
[100] J.-M. Kelif, M. Coupechoux, and P. Godlewski, "Spatial outage probability for cellular networks," in Proc. IEEE Global Telecommun. Conf., Nov. 2007, pp. 4445–4450.
[101] C. F. Chiasserini, R. Gaeta, M. Garetto, M. Gribaudo, D. Manini, and M. Sereno, "Fluid models for large-scale wireless sensor networks," Perform. Evaluat., vol. 64, nos. 7–8, pp. 715–736, 2007.
[102] V. Blondel, J. Hendrickx, and J. Tsitsiklis, "On Krause's multi-agent consensus model with state-dependent connectivity," IEEE Trans. Autom. Control, vol. 54, no. 11, pp. 2586–2597, Nov. 2009.
[103] V. Misra, W. B. Gong, and D. Towsley, "Fluid-based analysis of a network of AQM routers supporting TCP flows with an application to RED," in Proc. Conf. Appl., Technol., Archit., Protocols Comput. Commun., 2000, pp. 151–160.
[104] L. Liu, P. Parag, J. Tang, W. Y. Chen, and J. F. Chamberland, "Resource allocation and quality of service evaluation for wireless communication systems using fluid models," IEEE Trans. Inf. Theory, vol. 53, no. 5, pp. 1767–1777, May 2007.
[105] J. Lee, S. Bohacek, J. Hespanha, and K. Obraczka, "Modeling communication networks with hybrid systems," IEEE/ACM Trans. Netw., vol. 15, no. 3, pp. 630–643, Jun. 2007.
[106] Y. Zhang, E. K. P. Chong, J. Hannig, and D. Estep, "Continuum modeling and control of large nonuniform networks," in Proc. 49th Annu. Allerton Conf. Commun., Control, Comput., 2011, pp. 1670–1677.
[107] M. Neely, E. Modiano, and C. Rohrs, "Dynamic power allocation and routing for time-varying wireless networks," IEEE J. Sel. Areas Commun., vol. 23, no. 1, pp. 89–103, Jan. 2005.
[108] M. K. Jain, Numerical Solution of Differential Equations. New York, NY, USA: Wiley, 1984.
[109] E. Isaacson and H. B. Keller, Analysis of Numerical Methods. New York, NY, USA: Wiley, 1966.
[110] P. Billingsley, Probability and Measure. New York, NY, USA: Wiley, 1995.
[111] Y. Zhang, E. K. P. Chong, J. Hannig, and D. Estep, "Continuum modeling and control of large nonuniform wireless networks via nonlinear partial differential equations," Abstract Appl. Anal., vol. 2013, pp. 262581-1–262581-16, Mar. 2013.

YANG ZHANG received the B.S. and M.S. degrees in electrical engineering from Xi'an Jiaotong University, Xi'an, China, in 2004 and 2008, respectively. He is currently pursuing the Ph.D. degree in electrical engineering at Colorado State University, Fort Collins, CO, USA. His current research interests include network modeling, stochastic analysis, simulation, optimization, and control.


EDWIN K. P. CHONG (F'04) received the B.E. (Hons.) degree from the University of Adelaide, Adelaide, South Australia, in 1987, and the M.A. and Ph.D. degrees from Princeton University, Princeton, NJ, USA, in 1989 and 1991, respectively, where he held an IBM Fellowship. He joined the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA, in 1991, where he was named a University Faculty Scholar in 1999. Since August 2001, he has been a Professor of electrical and computer engineering and a Professor of mathematics at Colorado State University, Fort Collins, CO, USA. His current research interests include stochastic modeling and control, optimization methods, and communication and sensor networks. He coauthored the best-selling book, An Introduction to Optimization (Fourth Edition, Wiley-Interscience, 2013). He received the NSF CAREER Award in 1995 and the ASEE Frederick Emmons Terman Award in 1998. He was a co-recipient of the 2004 Best Paper Award for a paper in the journal Computer Networks. In 2010, he received the IEEE Control Systems Society Distinguished Member Award.

Prof. Chong was the Founding Chairman of the IEEE Control Systems Society Technical Committee on Discrete Event Systems, and served as an IEEE Control Systems Society Distinguished Lecturer. He is currently a Senior Editor of the IEEE Transactions on Automatic Control, and also serves on the editorial boards of Computer Networks and the Journal of Control Science and Engineering. He served as a member of the IEEE Control Systems Society Board of Governors and is currently a Vice President for Financial Activities. He has also served on the organizing committees of several international conferences. He has been on the program committees for the IEEE Conference on Decision and Control, the American Control Conference, the IEEE International Symposium on Intelligent Control, the IEEE Symposium on Computers and Communications, and the IEEE Global Telecommunications Conference. He has also served on the executive committees for the IEEE Conference on Decision and Control, the American Control Conference, the IEEE Annual Computer Communications Workshop, the International Conference on Industrial Electronics, Technology and Automation, and the IEEE International Conference on Communications. He was the Conference (General) Chair for the Conference on Modeling and Design of Wireless Networks, part of SPIE ITCom 2001. He was the General Chair for the 2011 Joint 50th IEEE Conference on Decision and Control and European Control Conference.

DONALD ESTEP received the B.A. degree from Columbia College, Columbia University, New York, NY, USA, in 1981, and the M.S. and Ph.D. degrees in applied mathematics from the University of Michigan, Ann Arbor, MI, USA, in 1987. He joined the faculty of the School of Mathematics at the Georgia Institute of Technology, Atlanta, GA, USA, in 1987. In 2000, he joined the faculties of the Departments of Mathematics and of Statistics at Colorado State University, Fort Collins, CO, USA. He was appointed as the Colorado State University Interdisciplinary Research Scholar in 2009. At Colorado State University, he directed the NSF IGERT Program for Interdisciplinary Mathematics, Ecology, and Statistics from 2003 to 2009, and has directed the Center for Interdisciplinary Mathematics and Statistics since 2006. He was an NSF International Research Fellow from 1993 to 1995, won the Computational and Mathematical Methods in Sciences and Engineering Prize in 2005, and won the Pennock Distinguished Service Award in 2009 and the University Scholarship Impact Award from Colorado State University in 2011. From 2013 to 2014, he holds the Chalmers Jubilee Professorship at the Chalmers University of Technology, Gothenburg, Sweden. He is the founding Editor-in-Chief of the SIAM/ASA Journal on Uncertainty Quantification, became Editor-in-Chief of the SIAM Book Series on Computational Science and Engineering in 2009, and serves as an Editor for five other journals. He served as the founding Chair of the SIAM Activity Group on Uncertainty Quantification, and currently serves on the SIAM Book Committee and on the Governing Board of the Statistical and Applied Mathematical Sciences Institute.

JAN HANNIG is a Professor in the Department of Statistics and Operations Research at the University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. His current research interests include applied probability, theoretical statistics, generalized fiducial inference, and applications to biology and engineering. He received the Mgr. degree in mathematics from Charles University, Prague, Czech Republic, in 1996, and the Ph.D. degree in statistics and probability from Michigan State University, East Lansing, MI, USA, in 2000, under the direction of Prof. A. V. Skorokhod. From 2000 to 2008, he was a faculty member in the Department of Statistics at Colorado State University, Fort Collins, CO, USA, where he was promoted to Associate Professor in 2006. He joined the Department of Statistics and Operations Research at the University of North Carolina at Chapel Hill in 2008 and was promoted to Professor in 2013. He is an Associate Editor of the Electronic Journal of Statistics, Stat-The ISI's Journal for the Rapid Dissemination of Statistics Research, and the Journal of Computational and Graphical Statistics. He is an elected member of the International Statistical Institute. He has been a PI and co-PI on several federally funded projects.
