Kelly Cache Networks
Total Page:16
File Type:pdf, Size:1020Kb
Kelly Cache Networks Milad Mahdian, Armin Moharrer, Stratis Ioannidis, and Edmund Yeh Electrical and Computer Engineering, Northeastern University, Boston, MA, USA fmmahdian,amoharrer,ioannidis,[email protected] Abstract—We study networks of M/M/1 queues in which nodes We make the following contributions. First, we study the act as caches that store objects. Exogenous requests for objects problem of optimizing the placement of objects in caches in are routed towards nodes that store them; as a result, object Kelly cache networks of M/M/1 queues, with the objective traffic in the network is determined not only by demand but, crucially, by where objects are cached. We determine how to of minimizing a cost function of the system state. We show place objects in caches to attain a certain design objective, such that, for a broad class of cost functions, including packet as, e.g., minimizing network congestion or retrieval delays. We delay, system size, and server occupancy rate, this optimization show that for a broad class of objectives, including minimizing amounts to a submodular maximization problem with matroid both the expected network delay and the sum of network constraints. This result applies to general Kelly networks with queue lengths, this optimization problem can be cast as an NP- hard submodular maximization problem. We show that so-called fixed service rates; in particular, it holds for FIFO, LIFO, and continuous greedy algorithm [1] attains a ratio arbitrarily close processor sharing disciplines at each queue. to 1 − 1=e ≈ 0:63 using a deterministic estimation via a power The so-called continuous greedy algorithm [1] attains a series; this drastically reduces execution time over prior art, 1 − 1=e approximation for this NP-hard problem. However, which resorts to sampling. Finally, we show that our results it does so by computing an expectation over a random vari- generalize, beyond M/M/1 queues, to networks of M/M/k and symmetric M/D/1 queues. able with exponential support via randomized sampling. The number of samples required to attain the 1−1=e approximation I. INTRODUCTION guarantee can be prohibitively large in realistic settings. Our second contribution is to show that, for Kelly networks of Kelly networks [2] are multi-class networks of queues M/M/1 queues, this randomization can be entirely avoided: capturing a broad array of queue service disciplines, including a closed-form solution can be computed using the Taylor FIFO, LIFO, and processor sharing. Both Kelly networks and expansion of our problem’s objective. To the best of our their generalizations (including networks of quasi-reversible knowledge, we are the first to identify a submodular maxi- and symmetric queues) are well studied and classic topics [2]– mization problem that exhibits this structure, and to exploit it [5]. One of their most appealing properties is that their steady- to eschew sampling. Finally, we extend our results to networks state distributions have a product-form: as a result, steady state of M/M/k and symmetric M/D/1 queues, and prove a negative properties such as expected queue sizes, packet delays, and result: submodularity does not arise in networks of M/M/1/k server occupancy rates have closed-form formulas as functions queues. We extensively evaluate our proposed algorithms over of, e.g., routing and scheduling policies. several synthetic and real-life topologies. In this paper, we consider Kelly networks in which nodes are The remainder of our paper is organized as follows. We equipped with caches, i.e., storage devices of finite capacity, review related work in Sec. II. We present our mathematical which can be used to store objects. Exogenous requests for model of a Kelly cache network in Sec. III, and our results objects are routed towards nodes that store them; upon reach- on submodularity and the continuous-greedy algorithm in ing a node that stores the requested object, a response packet networks of M/M/1 queues in Sections IV and V, respectively. containing the object is routed towards the request source. As Our extensions are described in Sec. VI; our numerical eval- arXiv:1901.04092v2 [cs.NI] 28 Jan 2019 a result, object traffic in the network is determined not only by uation is in Sec. VII. Finally, we conclude in Sec. VIII. the demand but, crucially, by where objects are cached. This abstract setting is motivated by–and can be used to model– II. RELATED WORK various networking applications involving the placement and Our approach is closest to, and inspired by, recent work by transmission of content. This includes information centric Shanmugam et al. [19] and Ioannidis and Yeh [8]. Ioannidis networks [6]–[8], content delivery networks [9], [10], web- and Yeh consider a setting very similar to ours but without caches [11]–[13], wireless/femtocell networks [14]–[16], and queuing: edges are assigned a fixed weight, and the objective peer-to-peer networks [17], [18], to name a few. is a linear function of incoming traffic scaled by these weights. In many of these applications, determining the object place- This can be seen as a special case of our model, namely, one ment, i.e., how to place objects in network caches, is a decision where edge costs are linear (see also Sec. III-B). Shanmugam that can be made by the network designer in response to et al. [19] study a similar optimization problem, restricted to object popularity and demand. To that end, we are interested the context of femtocaching. The authors show that this is in determining how to place objects in caches so that traffic an NP-hard, submodular maximization problem with matroid attains a design objective such as, e.g., minimizing delay. constraints. They provide a 1 − 1=e approximation algorithm based on a technique by Ageev and Sviridenko [20]: this Designated Server involves maximizing a concave relaxation of the original ob- Packet Departure ir jective, and rounding via pipage-rounding [20]. Ioannidis and Cache Hit r r p v u p Yeh show that the same approximation technique applies to Request ? rArrival Packet Arrival λ more general cache networks with linear edge costs. They also ? λr provide a distributed, adaptive algorithm that attains an 1−1=e Response Departure approximation. The same authors extend this framework to jointly optimize both caching and routing decisions [21]. (a) Kelly Network (b) Kelly Cache Network Our work can be seen as an extension of [8], [19], in that it incorporates queuing in the cache network. In contrast to Fig. 1: (a) Example of a Kelly network. Packets of class r enter the network with rate λr , are routed through consecutive queues over path pr , and subsequently exit the network. both [8] and [19] however, costs like delay or queue sizes (b) Example of a Kelly cache network. Each node v 2 V is equipped with a cache of r are highly non-linear in the presence of queuing. From a capacity cv . Exogenous requests of type r for object i enter the network and are routed over a predetermined path pr towards the designated server storing ir . Upon reaching technical standpoint, this departure from linearity requires us an intermediate node u storing the requested object ir , a response packet containing to employ significantly different optimization methods than the the object is generated. The response is then forwarded towards the request’s source in the reverse direction on path pr . Request packets are of negligible size compared to ones in [8], [19]. In particular, our objective does not admit a response messages; as a result, we ignore request traffic and focus on queuing due to concave relaxation and, consequently, the technique by Ageev response traffic alone. and Sviridenko [20] used in [8], [19] does not apply. Instead, we must solve a non-convex optimization problem directly (c.f. Eq. (13)) using the so-called continuous-greedy algorithm. a response packet containing the object is routed towards Several papers have studied the cache optimization problems the node that generated the request. As a result, content under restricted topologies [9], [22]–[25]. These works model traffic in the network is determined not only by demand but, the network as a bipartite graph: nodes generating requests crucially, by how contents are cached. For completeness, we connect directly to caches in a single hop. The resulting review classic Kelly networks in Appendix A. An illustration algorithms do not readily generalize to arbitrary topologies. highlighting the differences between Kelly cache networks, In general, the approximation technique of Ageev and Sviri- introduced below, and classic Kelly networks, can be found in denko [20] applies to this bipartite setting, and additional Fig. 1. approximation algorithms have been devised for several vari- Although we describe Kelly cache networks in terms of ants [9], [22]–[24]. We differ by (a) considering a multi-hop FIFO M/M/1 queues, the product form distribution (c.f. (4)) setting, and (b) introducing queuing, which none of the above arises for many different service principles beyond FIFO works considers. (c.f. Section 3.1 of [2]) including Last-In First-Out (LIFO) Submodular function maximization subject to matroid con- and processor sharing. All results we present extend to these straints appears in many important problems in combinatorial service disciplines; we discuss more extensions in Sec. VI. optimization; for a brief review of the topic and applications, A. Kelly Cache Networks see [26] and [27], respectively. Nemhauser et al. [28] show that the greedy algorithm produces a solution within 1/2 of Graphs and Paths. We use the notation G(V; E) for a directed the optimal. Vondrak´ [29] and Calinescu et al.