JOURNAL OF IEEE TRANSACTIONS ON COMPUTING, VOL. XX, NO. XX, XX 2017 1 Dynamic VM Scaling: Provisioning and Pricing through an Online Auction

Xiaoxi Zhang, Student Member, IEEE, Zhiyi Huang, Member, IEEE, Chuan Wu, Senior Member, IEEE Zongpeng Li, Senior Member, IEEE and Francis C.M. Lau, Senior Member, IEEE

Abstract—Today’s IaaS clouds allow dynamic scaling of VMs allocated to a user, according to real-time demand of the user. There are two types of scaling: horizontal scaling (scale-out) by allocating more VM instances to the user, and vertical scaling (scale-up) by boosting resources of VMs owned by the user. It has been a daunting issue how to efficiently allocate the resources on physical servers to meet the scaling demand of users on the go, which achieves the best server utilization and user utility. An accompanying critical challenge is how to effectively charge the incremental resources, such that the economic benefits of both the cloud provider and cloud users are guaranteed. There has been online auction design dealing with dynamic VM provisioning, where the resource bids are not related to each other, failing to handle VM scaling where later bids may rely on earlier bids of the same user. As the first in the literature, this paper designs an efficient, truthful online auction for resource provisioning and pricing in the practical cases of dynamic VM scaling, where: (i) users bid for customized VMs to use in future durations, and can bid again in the following time to increase resources, indicating both scale-up and scale-out options; (ii) the cloud provider packs the demanded VMs on heterogeneous servers for energy cost minimization on the go. We carefully design resource prices maintained for each type of resource on each server to achieve threshold-based online allocation and charging, as well as a novel competitive analysis technique based on submodularity of the offline objective, to show a good competitive ratio is achieved. The efficacy of the online auction is validated through solid theoretical analysis and trace-driven simulations.

Index Terms—Algorithm design, online auction, !

1 INTRODUCTION VMs in the first place, even to just provide a somewhat good guarantee of resource availability for future scaling Manual or auto scaling of resources has been widely sup- demands. The challenge escalates when we take server costs ported in today’s IaaS clouds. Most of the major cloud into consideration, striving to achieve high efficiency in providers provide horizontal scaling (scale-out), to add power consumption and server utilization at the same time. more virtual machine (VM) instances to a user’s application An effective online solution is still missing, to optimize both upon demand [1][2][3]. Some others enable vertical scaling user satisfaction and provider utility, i.e., the social welfare. (scale-up) by live addition of resources to VMs used by a An accompanying important challenge, economics wise, user [4][5][6], using hot-plug technologies of CPU, memory, is how to price the incremental resources on the go, such disk storage, etc. [7][8]. Scale-up can be potentially easier, that the benefits of both the cloud provider and users are but the resource increase cannot exceed the physical limit of guaranteed. Almost all the current IaaS offerings adopt fixed a physical server; scale-out may involve data replication and pricing [9], to charge a fixed unit price per preconfigured be thus costly, but the resources that can be increased are VM or per unit of resources, which does not change in potentially unlimited. The choice and preference of scaling the short term. There have been recent practices and pro- modes are at user’s call, depending on their respective posals towards market-based pricing, for timely adaptation resource need and program implementation. to demand-supply relation changes, that allocates resources Efficient timely allocation of resources on servers to to users who value the resources most and boosts both accommodate time-varying demands of users is at the core provider revenue and user utility. The Spot Instance market of dynamic VM scaling. Practically, without knowing which of Amazon EC2 is the pioneer production system adopting VM is to be scaled up and which user will opt for scaling bidding-based dynamic VM pricing, but has been shown out, it is daunting to decide on which servers to place the by studies not being a truly market-driven pricing system [10][11]. • X. Zhang, Z. Huang, C. Wu, and F. Lau are with the Department of A number of online auction mechanisms have been Computer Science, The University of Hong Kong. E-mail: {xxzhang2, cwu, zhiyi, fcmlau}@cs.hku.hk proposed to achieve dynamic cloud resource allocation • Z. Li is with the School of Computer Science, Wuhan University. and pricing. They treat dynamically-arrival user demands E-mail:[email protected] as independent bids, ignoring possible connections among Manuscript received July 6, 2016; revised March 1, 2017. The research bids submitted at different times by a user. In the practical was supported in part by grants from Hong Kong RGC under the con- scenarios of VM scaling, a user may bid repeatedly after tracts HKU 717812, 718513, 17204715, 17225516, 27200214E, C7036-15G submitting his initial bid, to increase the resources needed, (CRF), a grant NSFC 61628209, a grant from Huawei Technologies Co. Ltd. (HO2016050002BE), a grant from the Natural Sciences and Engineering and hence later bids from the same user are related to earlier Research Council of Canada (NSERC), and a grant from Wedge Networks. bids. With such time-coupling bids, the existing cloud auc- JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 2

tion mechanisms are not applicable: our study has identified bids, reserving resources for upcoming high value bids. that the offline optimal resource allocation problem, whose Computed before taking in an accepted bid, such a payment solution any online decisions strive to approach, renders is bid-independent, hence guaranteeing truthfulness of the a completely different form from those studied in existing mechanism. online cloud auctions. This brings significant difficulty in Third, we design a novel competitive analysis technique online mechanism design to approximate the offline opti- which, based on submodularity of the objective function of mum, and calls for new solutions. the offline problem, obtains an upper bound of the offline Targeting market-driven, dynamic resource provisioning optimum and shows a good competitive ratio achieved by and pricing for VM scaling, this paper proposes a provenly our online mechanism. Currently, no competitive online efficient online auction mechanism. The following practical algorithms exist for optimizing a submodular offline ob- auction model is investigated: (i) Users bid for tailor-made jective function under a complicated problem structure as VMs (with customized bundles of resources) to use in future in our case, i.e.,, complex non-linear constraints. Our online durations, e.g., based on empirical estimation/prediction of analysis is novel in the literature and may be useful for other resource needs of their jobs; (ii) a user can bid again in online algorithm design problems with a submodular offline the following time to increase resources, either before or objective. after the start of his VM usage, when he learns better his The rest of the paper is organized as follows. We discuss resource need, and the bid indicates his preferences in scale- related work in Sec. 2, and define the auction model in Sec. 3. up or scale-out; (iii) the cloud provider packs the demanded Sec. 4 presents our online auction design with analysis given VMs onto heterogeneous servers on the go, considering in Sec. 4.2. We present simulation results in Sec. 5 and scaling preferences of users and striving for energy cost conclude the paper in Sec. 6. minimization on servers. Our online auction design aims to achieve truthfulness, individual rationality, computational efficiency, and competitiveness in social welfare, i.e., a small 2 RELATED WORK ratio between the social welfare of the online mechanism There have been a number of system work (e.g., [14][15]) over that of the offline optimum, computed assuming full on resource scaling design in IaaS cloud, which exploit knowledge over the system span. prediction of user demand for proactive resource scaling. We reveal the following key technical challenge in They target at cost-effective practical schemes, without achieving such an efficient online mechanism. To guarantee showing theoretical guarantee of the performance. Dynamic competitiveness, the online allocation and pricing algorithm and efficient resource allocation is at the heart of resource should typically be designed based on the structure of scaling, and has been extensively investigated. For example, the offline resource allocation problem, and connections Alicherry et al. [16] study VM allocation in distributed cloud between the online decisions and increments of the offline systems, taking into consideration the latency among the social welfare should be established. To the best of our VMs. Joe-Wong et al. [17] seek to balance efficiency and knowledge, the only well established technique for online fairness when allocating VMs to users. However, pricing of auctions, where users bid for future resources with produc- the allocated VMs is left out of the scope of these studies. tion (server) cost, is to exploit the online primal-dual frame- Lin et al. [18] and Jiao et al. [19] design online algorithms for work [12][13], which requires that the corresponding offline cloud resource provisioning considering switching cost of optimization problem is convex with linear constraints. servers. They focus on minimizing total cost incurred while We identify that the offline optimization problem for VM we aim to maximize social welfare which includes both scaling with time-coupling bids is non-standard, with a valuation of user jobs and the operational cost of servers. submodular objective function and non-linear constraints, Towards market-driven pricing together with dynamic such that none of the existing primal-dual frameworks is resource allocation, cloud auctions have been proposed in applicable. both offline/one-time settings [20][21] and online scenarios Our contributions. We address the above challenge and [22][23][24]. Most of the online auctions are built on dif- design an efficient online auction as follows. ferent assumptions of the cloud system. Zhang et al. [22] First, we construct an expressive bidding language to consider a single type of cloud resource. Shi et al. [23] characterize different cases of dynamic demand scaling, allow preemption of resources already occupied and paid which allows users to request and scale resources on differ- by a user, which is typically not true in practice. In [24], ent servers according to their preferences. We also uncover users come and go over multiple rounds of the auction, the underlying relation between users’ online bidding be- while each user only occupies the allocated resources for havior and the offline social welfare maximization problem. one round. Moreover, none of these studies consider VM We identify an important property of the offline social wel- placement on physical servers and server cost minimization, fare function, namely, submodularity, which plays a crucial which is in fact an important element in the social welfare role in our analysis. that they aim to maximize. Therefore, these schemes cannot Second, we design an efficient online auction mechanism be easily extended to handle resource scaling, which relies based on carefully designed price functions. The prices for heavily on whereabout of the VMs on different servers. A each unit of each resource on each server for each future recent work [12] designs online auctions, where users bid for time slot are maintained according to resource utilization future resources and the cloud provider packs user-specified on the server, and used to compute potential payments of resource bundles onto heterogeneous servers on the fly, users if their requested resources are allocated on the server. considering server cost in social welfare maximization and Such payments serve as thresholds to filter out low value resource reusability. Unfortunately, the online primal-dual JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 3 framework for algorithm design that they applied is not [T ], and Bni denote the ith bid of user n, submitted in time 1 applicable to an offline resource allocation problem with a slot tni. Each bid can be expressed by non-convex objective function and non-linear constraints. − + Bni = (tni, tni, {dnik(t)}k∈[K],t∈[t− ,t+ ], {bnis}s∈[S]). (1) Recently, a sustainable body of theoretical literature has ni ni investigated submodular objective functions in both of- − + Here tni and tni represent the desired start time and end fline [25][26] and online [27][28] optimization. For example, time of usage of the required resources, respectively. Nat- Feldman et al. [28] relax the sectary problem to a model − + urally, we have tni ≤ tni ≤ tni. The third element in with a submodular objective, assuming elements arrive at a Bni describes resources requested for each time slot during random order. Buchbinder et al. [27] study online algorithms the usage interval, where dnik(t) is the demand of type- for submodular function maximization in several models, k resource in time slot t, set according to prediction of allowing the algorithms to preempt previously accepted workload. If this is the very first bid of user n, or all previous elements. We have identified that these models studied are bids of the user fail, dnik(t), ∀k ∈ [K], specify resource structurally different from our model. Instead, we design composition of the new VM at time t; if there exist successful novel competitive analysis techniques to prove the efficiency bids before Bni is submitted, dnik(t)’s specify incremental of our online mechanism. resources to be added, either onto a VM that the user has successfully acquired, or in the form of a new VM. User n indicates his choice or preferences of creating a 3 PROBLEM MODEL new VM or adding resources onto an acquired VM using bidding prices bnis, ∀s ∈ [S]. bnis denotes the willingness- 3.1 Resource Scaling Auction to-pay for the resources, if they are allocated on server s. Consider an IaaS cloud with S servers. Let [X] denote the If the bid is successful, resources are allocated on a server set of integers {1, 2,...,X}. A server s ∈ [S] provisions K assigned with a previously-acquired VM of the same user, − + types of resources, e.g., CPU, RAM and disks, with capacity and that VM’s running span overlaps with [tni, tni], the Cks of each resource k ∈ [K]. During a long span of T time incremental resources will be added to that VM (the scale- slots, the cloud provider allocates the resources to N cloud up case). If the bid is accepted on a server without such users, who may come and go throughout the span, through an already acquired VM, a new VM will be created and an online auction. Each user as a bidder may demand a vir- launched in due time (the scale-out case or deployment of tual machine (VM) of different resource composition from the first-ever VM). A user can set bnis’s according to concrete time to time, and use the requested VMs for variant lengths implementation of his program and detailed jobs running of time. The resource composition of each VM, as specified on the VMs. For example, if launching a new VM involves in a user bid, can vary over its usage span, according to significant data replication, the user may prefer scale-up time-varying need of the user’s job on the VM. This can than scale-out, by setting bnisˆ (where sˆ is the server his be enabled by dynamic addition and removal of resources previous VM is running on) larger than any other bnis’s. If to and from the VM, through “hotplug” technologies that the user program is not implemented to exploit dynamically adjust CPU cycles, memory and disks allocated to a run- added CPU cores, but can smoothly handle distributed ning VM, supported in various virtualization environments computation, the user may prefer scale-out and set a bnis [7][29]. (where s 6=s ˆ) larger than bnisˆ. Moreover, a user can (re)submit a bid later, to adjust Note that “s” is used to differentiate bid prices for his resource demands needed on the VMs requested earlier, presentation clarity only: in practice, the cloud provider according to the bidding results of previous VM requests does not have to inform a user which actual servers his (accepted or rejected), as well as his more precise estimation VMs are running on; the bidder can communicate with the of future workload as the time draws near. In particular, if provider by a message like “I am willing to pay a if my his previous bid fails, he can ask for the same VM again, requested resources are added to the VM I acquired in time or a VM with adjusted resource composition (increase or slot b, and I can pay c if they are put on another server”. decrease) and/or usage span (expand or shrink); if the If the cloud provider can share more insider view of the previous bid is successful, the user can ask for more re- network, a user can differentiate bid prices for sources either on an acquired VM (scale-up) or in the form provisioning his new VM on a server on the same rack (with of another VM on another server (scale-out). In the latter one that runs an existing VM of his), on a different rack but case, the new bid can be submitted either before or after sharing the same switch, etc. an acquired VM has been launched: besides adding more We next give concrete examples to illustrate our bidding resources in each time slot of VM usage, (i) if the VM is not model. Suppose user 1 plans to run a MapReduce job. yet running, the starting time of the VM can be advanced There are 3 servers. At time slot 1, he requests one VM and the end time can be extended; (ii) if the VM has been with one unit of resource (considering one type of resource launched, the end time can be postponed. for simplicity) to run a Mapper during time slots 5 to 7, with the bid B11 = (5, 7, {1, 1, 1}, {1, 1, 1}), and requests a second VM to run a reducer during time slots 6 to 8 with 3.2 Bidding Language B12 = (6, 8, {1, 1, 1}, {1, 1, 1}). He sets the same price of 1 in We develop the following bidding language to model the 1. We allow concurrent submission of multiple bids in the same time above behavior of bidders. Let In be the total number of slot, i.e., tni = tnj for j 6= i, and randomly order simultaneous bids bids that a user n ∈ [N] submits during the entire time span arriving at the same time. JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 4

both bids, no matter which server the VMs are provisioned at a much larger time scale than those for VM allocation, e.g., on. At time slot 3, he realizes that the resource asked for the Amazon EC2 adjusts its server provisioning roughly once Mapper VM is insufficient. If B11 has been successful with per month [12]. Therefore, we realistically assume that all S the Mapper assigned to server 1, he can submit another servers are turned on in the span T under our investigation. bid B13 = (5, 7, {1, 2, 3}, {1, 0, 0}) to request adding 1, 2 Each server consumes a basic amount of power with no VM and 3 units of resource to the same Mapper in its three running, and power usage increases with the increase of running time slots, respectively, at the additional price of resource occupied on the server. Let fks(·) denote the cost 1. Or he can submit B13 = (5, 7, {1, 2, 3}, {0, 1, 1}) instead, function of server s on yks(t), the amount of type-k resource if he wants to run a new Mapper on either of the other used on the server at time slot t. We consider the following two servers. If B11 has been rejected, he may resubmit a cost function: new bid in time slot 3 with boosted resources and prices, ( 2 hksyks, if yks(t) ∈ [0,Cks] B13 = (5, 7, {2, 3, 4}, {2, 2, 2}). If after the start of the fks(yks(t)) = (2) +∞, yks(t) > Cks Mapper acquired through B11, at time slot 6, the user realizes that the resource for the Mapper is still insufficient, When the consumption of type-k resource, yks(t), on server he may submit another bid B = (7, 9, {1, 5, 5}, {5, 0, 0}), 14 s does not exceed the capacity Cks, the respective server cost to boost the VM resource to 5 from the next time slot (7) till increases linearly with the increase of resource usage; other- time slot 9. An illustration of this example is shown in Fig. wise, the cost becomes infinity, which would never happen 1. if the resource capacity constraint is respected. Such a linear cost has been shown through measurements: server power 3.3 Goals of Online Auction Design consumption increases roughly linearly with the utilization of CPU, memory, disk I/O and network I/O [31], if CPU Upon receiving each bid B , the cloud provider makes the ni DVFS (Dynamic Voltage Frequency Scaling) is not enabled.3 following decisions on the spot. (i) Resource provisioning: h indicates the relative weight of cost due to each type of x , ∀s ∈ [S], indicating whether resources requested in ks nis resource in the overall server cost. It has also been shown B are to be provisioned on server s (x = 1) or not ni nis that power consumption of memory, disk I/O and network (x = 0). x , being zero for all s implies rejection of B . nis nis ni I/O are significantly lower than that of the CPU, further (ii) Payment: pˆ , denoting the charge to user n if B is ni ni ranked in a decreasing order among themselves [32], and accepted. the power usage due to different resources is additive [31]. Our online mechanism design targets the following We note that when hks = 0, the cost function becomes a properties (which will be shown in Theorem 3): (1) Truth- zero-infinity function, which represents the special case of fulness. For any bidder, declaring his true information (start no server cost considered. and end time of resource usage, resource needs) and true valuations of the resources in his bid always maximizes 3.4 The Offline Social Welfare Maximization Problem his utility (valuation minus payment), regardless of all the We next formulate the offline social welfare maximization other bids. (2) Individual rationality. Each user always obtains problem, solving which provides the offline optimal deci- a non-negative utility by submitting a bid in the auction. sions, assuming all user scaling demands in [T ] are known (3) Computational efficiency. Polynomial-time algorithms for and truthful bidding is guaranteed. The offline optimum resource allocation and payment calculation are needed will serve as the basis, to compare our online algorithm for the auction, to run efficiently in an online fashion. (4) with. Competitiveness in social welfare. The social welfare over the In participating in the auction, each user n has a value system span [T ] is the sum of cloud provider’s profit (aggre- function v : {0, 1}In×S 7→ [0, +∞). The user’s overall gate payment of all winning bids minus total cost) and all n value, for all I bids he submits, is denoted as v (x ), users’ utilities, which equals the aggregate user valuation n n n which depends on x , a feasible overall resource assignment minus the provider’s cost. A cloud system operates at the n scheme for user n in [T ]. Here x = {x } ∈ maximal efficiency if social welfare is maximized over the n nis i∈[In],s∈S {0, 1}In×S, where x = 1 if user n’s bid i is accepted on running span, benefiting both the cloud provider and users. nis server s and x = 0 otherwise. The resource assignment We aim to achieve a small competitive ratio, computed by nis is feasible if it satisfies P x ≤ 1 for any i ∈ [I ], i.e., dividing the offline optimal social welfare by the social s∈[S] nis n any bid B is accepted to at most one server. welfare achieved by the online mechanism. ni Let D (t) denote the overall demand of user n for In computing cloud provider’s profit, we consider the nik type-k resource for future time slot t, which is his updated operational cost of servers, which is mainly due to the estimation of resource need for the future time slot when he power cost decided by server power consumption. In prac- submits his ith bid at t . Note that this D (t) is the overall tice, most cloud data centers keep their servers on, which ni nik resource (type k) that user n wants to use in time slot t: remain in the low-power idle mode if no jobs are running, to some of the amount may have been allocated by accepting avoid time-consuming booting up if switched completely off the user’s earlier bids before B , and some is requested [30]. Decisions on turning servers on/off are usually made ni through Bni. The relationship between the asked resource dnik(t) in Bni and this Dnik(t) will be discussed soon after 2. Note that in this case the demands and valuations of B13 become larger since B13 has to include the demand and valuations of the we formulate the offline problem, as follows. rejected bid B11. In this sense, for each user, previously accepted bids are fewer, the resource demands and valuations (or bidding prices) are 3. We leave the case of super-linear CPU cost function due to DVFS larger. for future studies. JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 5

CPU CPU B12 RAM B12 RAM accepted to Disk accepted to Disk

server 1 server 2 server 3 B11 server 1 server 2 server 3 B11

1 0 0 0 1 1

bidding prices of B13 bidding prices of B13

B13 B13 1 2 3 4 5 6 7 8 9 t 1 2 3 4 5 6 7 8 9 t submit B11, B12 submit B13 submit B11, B12 submit B13

Case 1: B11 is accepted to server 1; B13 requires to scale up Case 2: B11 is accepted to server 1; B13 requires to scale out

CPU CPU B12 B12 RAM RAM Disk accepted to Disk

server 1 server 2 server 3 server 1 server 2 server 3 B11

2 2 2 5 0 0

bidding prices of B13 bidding prices of B14

B13 B13 B14 1 2 3 4 5 6 7 8 9 t 1 2 3 4 5 6 7 8 9 t submit B11, B12 submit B13 submit B11, B12 submit B14

Case 3: B11 is rejected; B13 increases resource demands and bidding prices Case 4: B11 has started running; B14 requires to scale up

Fig. 1: An Illustration for four example cases

due to our definition of the cost function in (2). Based on our X X X X best attempts, constraint (3b) cannot be formulated into an maximize: vn(xn) − fks(yks(t)) (3) equivalent linear constraint, leading to significant difficulty n∈[N] t∈[T ] s∈[S] k∈[K] in computing both the offline and online solutions. We subject to: next show the objective function is submodular, after giving relations between our offline formulation and online bids of X xnis ≤ 1, ∀n ∈ [N], i ∈ [In] (3a) the users. s∈[S] X X X Relation between online and offline problems. In the xnis(Dnik(t) − max {Dnjk(t) xnjs}) ≤yks(t), 1≤j≤i−1: online auction, a bid of a user Bni depends on his previous n∈[N] i∈[In] − + s∈[S] tnj ≤t≤tnj bids in two ways: (1) the amount of resources requested ∀k ∈ [K], s ∈ [S], t ∈ [T ] (3b) and, thus, the bid prices may depend on which of the previous bids are accepted (recalling the example in the last paragraph of Sec. 3.2) (2) the bid prices may also depend The offline problem aims to maximize social welfare on the servers to which the previous accepted bids are over system span T , i.e., the sum of the overall val- assigned (if the user has a preference between scaling out ues of all users minus the total server cost over t ∈ and scaling up). As a reasonable assumption, the online [T ]. We will discuss the connection between user’s bnis’s bid prices can be formulated by the offline value function and his overall value vn soon. Constraint (3a) ensures (assuming truthful bidding) which embodies the auction’s that each bid Bni is accepted to at most one server. decisions on previous bids. In particular, an online bid price Constraint (3b) connects xnis’s and yks(t)’: Dnik(t) − b is set as the additional value user n can obtain, if the P nis max − + {Dnjk(t) xnjs} is the addi- 1≤j≤i−1:tnj ≤t≤tnj s∈[S] bid Bni is accepted to server s, as follows: tional amount of resource k allocated to user n for t, if (i−1) (is) (i−1) 0 bnis = vn(x ∨ e ) − vn(x )(vn(x ) = 0) . Bni is accepted; the maximization operation is to come n n n up with the overall allocated resource to the user before (4) v (x(i−1) ∨e(is)) n Bni. Hence summing up all such resource allocation for all n n is the overall value of user up to the time accepted bids on server s, we obtain the total amount of after submitting bid Bni if the bid is accepted on server s, (i−1) type-k resource allocated on s at t, i.e., yks(t). Note that and vn(xn ) is the overall value of user n so far otherwise. (i−1) the capacity constraint, yks(t) ≤ Cks, is implicitly satisfied Here xn denotes user n’s resource assignment vector by by any feasible solution achieving non-trivial social welfare, the online auction after handling its (i − 1)th bid, i.e., for JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 6 (i−1) − + any j < i, xnjs0 = 1 if the online auction accepted bid j δnik for each unit of resource k in interval [tni, tni] if his ith 0 0 on server s and 0 otherwise, and for any j ≥ i and any s , bid is accepted. His “pure” value un is simply the sum of (is) (is) xnjs0 = 0. e is an In × S vector with eis = 1 and all the unit values of the resources obtained from the accepted other entries being zero, and x ∨ x0 denotes a vector whose bids in the entire span [T]. More correctly, readers may think entries are equal to the max of the corresponding entries in of the “pure” value function to be the one in the above 0 0 0 (i−1) example throughout the rest of the paper. Nonetheless, note x and x , i.e., (x ∨ x )i = 1 if xi = 1 or xi = 1. So xn ∨ (is) (i) that a much richer family of “pure” value functions satisfy e equals xn with bid i accepted to server s. In addition, − + n the amount of resource k for time t ∈ [t , t ] that user n submodularity. A user may be subjected to a cost when ni ni i j requests in his bid B , should be the incremental amount his bids and are accepted on the same server, or on ni φup to meet his total resource demand D (t), depending on different servers. Let n(ij) denote the scale-up cost of user nik n i j whether his previous bids are accepted or not: when and are accepted on the same server, for all i < j, which may represent the delay of resource hotplug. Let φout denote his scale-out cost if i and j are accepted on X n(ij) dnik(t) = Dnik(t) − max {Dnjk(t) xnjs} different servers, e.g., data replication and communication 1≤j≤i−1:t− ≤t≤t+ nj nj s∈[S] cost among the VMs. The total cost of user n in [T ] due to scaling up/out between accepted bids can be computed as (Dn0k(t) = 0). (5) X X up X X out To give a better understanding of the relation of online Φn(xn) = φn(ij)xnisxnjs + φn(ij)xnisxnjs0 and offline problem, we introduce the “pure” value part i

TABLE 1: Key Notation to which server it is assigned if accepted. We maintain a price for each unit of resource of each type k on each N # of users T # of time slots server s for each future time slot t, denoted by p (t), K # of resource types S # of servers ks according to reserved allocation of the respective resource B bid i of user n t arrival time of B ni ni ni on the respective server at that time due to already accepted I # of bids that user n submits n bids. We compute the potential payment of B , if the C k s ni ks capacity of type- resource on server requested resources are to be provisioned on server s, as t− (t+ ) start (end) time of resource usage in B P P ni ni ni Pnis = − + dnik(t)pks(t), ∀s ∈ [S]. Hence t∈[tni,tni] k∈[K] dnik(t) demand of type-k resource at t of Bni the corresponding utility gain received by user n, if the bid x serve B on server s (1) or not (0) nis ni is accepted, would be bnis − Pnis (bidding price minuses Dnik(t) offline overall demand of type-k resource at payment). We check if the user’s utility is positive on some t when user n submits the ith bid servers. If so, we choose the server giving the largest utility δnik value per unit resource k demand of Bni ∗ gain, i.e., s = arg maxs∈[S](bnis − Pnis), to serve the bid, up ∗ φn(ij) the cost incurred by winning bid Bni i.e., xnis∗ = 1 and xnis = 0 for all s 6= s , and increase and Bnj on a same server (scale-up). the amount of allocated resources yks∗ (t) by dnik(t) on out ∗ φn(ij) the cost incurred by winning Bni and Bnj server s for all resources k ∈ [K] and all time slots − + on different servers (scale-out). t ∈ [tni, tni]. If no server provides a positive utility, reject bnis bidding price if Bni is accepted to server s the bid, i.e., xnis = 0 for all s ∈ [S] . The above utility- bnis Uk max − + maximizing approach for each bid leads to truthfulness and s∈[S],i∈[In],n∈[N],t∈[tni,tni] dnik(t) bnis approximate social welfare maximization for some carefully L min P k s∈[S],i∈[In],n∈[N] K − + d (t) t∈[t ,t ] nik ni ni chosen prices which we will explain next. yks(t) allocation amount of resource k on s at t Payment design. We design a price function pks(t) that pks(t) marginal price of resource k on s at t depends on the amount of allocated resource, yks(t), for all pˆni payment of bid Bni k ∈ [K], s ∈ [S], t ∈ [T ], as follows: g(x) pure value minus server cost given an assignment x yks(t)   C Lk − hks 2KS(Uk − hks) ks w(x) social welfare function pks(t) = + hks given an assignment x 2KS Lk − hks wˆnis(x) marginal welfare of accepting a user n’s bid i (8) given an assignment x bnis ˆ where Lk = min P φnis(x) marginal scaling cost of accepting s∈[S],i∈[In],n∈[N] K t∈[t− ,t+ ] dnik(t) a user n’s bid i given an assignment x ni ni bnis Uk = max − + d (t) s∈[S],i∈[In],n∈[N],t∈[tni,tni] nik Lemma 1. The “pure” value minus server cost g(x) defined in Here Uk is an upper bound of the unit price of resource k (7) is submodular. in the sense that no user would want his bid to be accepted The detailed proof is given in Appendix A. at price Uk. Lk is a lower bound on the unit price of type- k resource in the sense that even the lowest bid would be Theorem 1. The social welfare function w(x) (which equals the accepted if the unit price for resource k is Lk for all k. The objective in (3)) is submodular. first term of the above price design ensures that (i) the cloud would accept even the lowest bid at the beginning, but (ii) The detailed proof is given in Appendix B. the price grows exponentially as the demand increases so Assumption 2 (small scale up/out costs). Suppose x is a the cloud would not allocate all resources to low value bids feasible resource assignment that does not accept bid i of that come early at the risk of having no capacity left for high 0 user n (i.e., xnis0 = 0 for all s ∈ [S]). Then, the increase value bids in the future. The second term ensures that (iii) in g(x) in (7) for accepting the bid on any server is at least the gain in social welfare when a bid is served outweighs α times the increase in the total scale up/out cost Φn(xn), the loss in total server cost. Indeed, designing the online where α is a large enough value (e.g., α ≥ 3). pricing rules is the key to obtain a good competitive ratio Since the social welfare equals g(x)−Φ(x), this assump- in social welfare. The price of serving each Bni on server tion states that accepting a bid increases the social welfare s∗ is computed by summing up the products of resource substantially, which is reasonable in practice as well. demands and current unit prices on server s∗, over all k − + and t ∈ [tni, tni], i.e., 4 ONLINE AUCTIONFOR SOCIAL WELFARE MAX- X X pˆni = dnik(t)pks∗ (t) . IMIZATION − + k∈[K] t∈[tni,tni] In this section, we introduce our online auction design. Online Auction. We summarize the online auction mecha- nism, referred to as T oast, in Alg. 1. Note that the payment 4.1 Online Auction (T oast) (line 10) is computed based on unit resource prices on the Resource provisioning. Let us first explain how the auc- selected server s∗ before counting in resources requested in tioneer decides whether to accept a bid Bni or not and the bid (line 2). Hence, the payment is independent of the JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 8

ALGORITHM 1: The Online Auction Mechanism (T oast) Let wˆnis(x) denote the marginal welfare of accepting a Input: S, K, C, U, L user n’s bid i on server s given an assignment x, i.e., Output: x, p (nis) Initialize: x = 0, y = 0, p(y) = p(0), ˆp = 0 wˆnis(x) = w(x ∨ e ) − w(x). (9) 1 Upon arrival of bid Bni according to (1): P P Similarly, let 2 Compute Pnis = − + k∈[K] dnik(t)pks(t), ∀s ∈ [S]; t∈[tni,tni] X up X X out 3 uni = max (bnis − Pnis); ˆ 0 s∈[S] Φnis(x) = φn(ij)xnjs + φn(ij)xnjs (10) 4 if uni > 0 then j6=i j6=i s06=s 5 Accept Bni; ∗ 6 s = argmaxs∈[S](bnis − Pnis); denote the additional scale-up/out costs for accepting a user 7 Update xnis∗ = 1, and allocate each resource k at n’s bid i on server s given x. As a corollary of Assumption ∗ − + amount of dnik(t) on server s , for each t ∈ [tni, tni]; 2, the following lemma holds. 8 Update yks∗ (t) = yks∗ (t) + dnik(t), 0 − + 0 ∀k ∈ [K], t ∈ [tni, tni]; Lemma 2. Suppose x satisfies that xnis = 0 for all s ∈ [S], − + P 9 Update pks∗ (t) in (8), ∀k ∈ [K], t ∈ [tni, tni]; and s0∈[S] xnjs0 ≤ 1 for all j. Then, we have 10 Charge Bni the payment of pˆni = Pnis∗ ; 11 else 1 Φˆ nis(x) ≤ wˆnis(x) 12 Reject Bni; 2 Proof of Lemma 2 is given in Appendix E. As a further corollary, we have the following lemma. bid, which is the key to guarantee truthfulness. In addition, 0 Lemma 3. Suppose x satisfies that xnis0 = 0 for all s ∈ [S], although our online auction requires Uk and Lk as input, P and s0∈[S] xnjs0 ≤ 1 for all j Then wˆnis(x) ≥ 0 whose exact values are not known before all bids have arrived, we can adopt estimated values of these upper and In case the bid has been accepted to some other servers lower bounds as input to our online algorithm, e.g., based in x, we lower bound the changes in the welfare function on past experience. We will show in Sec. 5 the impact of the with the following lemma, which follows by monotonicity estimation accuracy. of g(x). Note that the algorithm can guarantee that each bid is accepted to at most one server while in the analysis Theorem 2. Alg. 1 outputs a feasible solution of (3). we temporarily allow multiple entries of x to be 1. The reason is that a well-defined submodular function should The detailed proof is given in Appendix C. P not be restricted by such a constraint s∈[S] xnis ≤ 1. The Theorem 3. The online auction in Alg. 1 is truthful in valuation, following lemma essentially shows that if we choose more bid arrival time and VM execution duration, achieves individual than one servers for an accepted bid, the decreased social rationality for each bidder and the cloud provider, and processes welfare (−wˆnis(x)) will be no more than the additional each bid in O(KST ) time. scale-up/out costs incurred. The detailed proof is given in Appendix D. Lemma 4. Suppose x satisfies that there exists s0 6= s such that xnis0 = 1. Then, wˆnis(x) ≥ −Φˆ nis(x).

4.2 Competitive Analysis of Algorithm T oast Proof of Lemma 4 is given in Appendix F. To upper bound the marginal values of accepting a bid We next analyse the competitive ratio achieved by our that is accepted by the offline optimum in x? but not by the online auction. Let x(A) be the assignment vector of the algorithm in x(A), w.r.t. x(A), we divide such bids into two algorithm and x? be the offline optimal assignment. In order subsets. to show competitiveness of the algorithm, we need to show ? ? 0 ? S = {(n, j, s) | x = 1, ∀s 6= s, x(A) 0 = 0} that w(x(A)) is comparable to w(x ). However, it is difficult 1 njs njs (11) ? ? 0 to compare them directly. We adopt a two-step approach S2 = {(n, j, s) | xnjs = 1, ∃s 6= s, x(A)njs0 = 1} (12) from the literature of submodular maximization (e.g., [27]): Similarly, divide the bids that are accepted by the algorithm (i) first show that w(x(A)) is comparable to w(x(A) ∨ x?), in x(A) but not by the optimal in x? into two subsets which is the welfare of accepting both the bids accepted by 0 ? our algorithm and those accepted by the offline optimum; S1 = {(n, j, s) | x(A)njs = 1, ∀s 6= s, xnjs0 = 0} (13) ? ? then (ii) show that w(x(A)∨x ) is comparable to w(x ), i.e., 0 ? S2 = {(n, j, s) | x(A)njs = 1, ∃s 6= s, xnjs0 = 1} (14) the offline optimum. To show (i), imagine we start with w(x(A)) and add one The following lemma is to show that w(x? ∨ x(A)) by one the bids that are accepted by the offline optimum is comparable to w(x?) as we stated in step (ii) at the in x? but not by the algorithm in x(A). We need to upper beginning of this subsection. bound the marginal value of accepting each such bid. Simi- larly, to show (ii), imagine we start from w(x?) and add one Lemma 5. The social welfare of the offline optimum is upper by one the bids that are accepted by the algorithm in x(A) bounded by but not by the optimum in x?. We need to lower bound the ? ? P ˆ ? w(x ) ≤ w(x ∨ x(A)) + (n,i,s)∈S Φnis(x ∨ x(A)) (potentially negative) marginal value of accepting each such 2 bid. Proof of Lemma 5 is given in Appendix G. JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 9

The following lemma is to show that w(x(A)) is compa- There are in total T = 1000 time slots and each time rable to w(x?∨x(A)) as we stated in step (i) at the beginning slot is 10 seconds long. cluster data provides the of this subsection. arrival time of each job, and the start time and end time of each task in a job. We scale the duration of Google jobs Lemma 6. We have down to our time scale, and set the time a user submits his ? X w(x ∨ x(A)) ≤ w(x(A)) + wˆnis(x(A)) first bid according to the job arrival time in Google data. (n,i,s)∈S? We also order tasks in a job according to their start time, 1 − + and set start (end) time of VM usage in the bids, tni (tni), Proof of Lemma 6 is given in Appendix H. according to the start (end) time of the respective tasks. The Lemma 7. We have arrival time of each bid i (i ≥ 2) of a user n is set uniformly − 1 1 within [tn(i−1), tni]. We set the bidding prices in each bid X ˆ ? ? up Φnis(x ∨ x(A)) ≤ w(x ) + w(x(A)) φ 2 2 according to (4). We uniformly randomly set n(ij)’s and (n,i,s)∈S out P P 2 φ ’s within [0, − + dnik(t)δnik/In] (note n(ji) t∈[tni,tni] k∈[K] Proof of Lemma 7 is given in Appendix I. scaling up/out cost of each bid is not always as small as Assumption 2 requires). The unit values δnik of users are set Lemma 8. We have within the upper and lower bounds of users’ value per unit P P   U L ? wˆnis(x(A)) ≤ pks(Yks(t)) − hks Cks of resource per unit of time, k and k, which will be varied (n,i,s)∈S1 k,s,t in different experiments. By default, Uk = 50, Lk = 1. Proof of Lemma is given in Appendix J. We simulate servers with heterogenous resource capacities 2(L −h )C (C ) following the distribution of server configurations Lemma 9. Suppose w(x?) ≥ P k ks ks . Then, ks k,s,t KS summarized from the Google data as follows (CPU and X X X   Memory units are normalized so that the maximum capacity pks(Yks(t)) − hks Cks t∈[T ] s∈[S] k∈[K] is 1): 2KS(U − h ) 1 ≤ max ln k ks w(x(A)) + w(x?) percentage of machines 53% 30% 8% 6% 3% s∈[S],k∈[K] Lk − hks 4 CPU 0.50 0.50 0.50 1.00 0.25 Memory 0.50 0.25 0.750 1.00 0.25 Proof of Lemma 9 is given in Appendix K.

? P 2(Lk−hks)Cks Since the Google data does not provide disk configurations, Theorem 4. Suppose w(x ) ≥ k,s,t KS . Then, our designed online auction T oast has a competitive ratio of we set the disk storage capacity of servers randomly within 2KS(Uk−hks)  [320, 800](GB). The total capacity of each type of resource O maxs∈[S],k∈[K] ln( ) . Lk−hks to provision, and hence the number of servers to simulate, Proof. Putting Lemma 5 –9 together proves the theorem. is roughly according to the total amount of demand from all bids multiplying a random number in [0.4, 0.8]. hks is Recall that Lk is a lower bound of the bidding prices per uniformly distributed within [0.4, 0.6] for CPU (different for unit of demand among all the bids. Thus Lk − hks is the different servers s), and within [0.005, 0.02] for RAM and minimum social welfare contributed by a bid if it is served disk, roughly following the percentage measured in [32]. by server s and (Lk−hks)Cks is the minimal social welfare if We compare our online auction with the offline opti- the capacity of k-type resource on server s is fully occupied. mum, as well as two heuristic schemes, Twice-the-Cost (TC) Thus, the assumption in Theorem 4 is essentially that the and Twice-the-Index (TI) (we identified a lack of comparable workload at each time slot is high enough to fully occupy approaches from the cloud auction literature). The two at least one resource on two servers or two resources on one heuristics share the same steps with our online auction, server, which is easy to fulfill in practice. except for adopting different price functions [34]: For TC, 0 pks(t) = 2f (yks(t)), i.e., the marginal price at t is twice ks 0 of the marginal cost for t; for TI, pks(t) = f (2yks(t)), i.e., 5 PERFORMANCE EVALUATION ks the marginal price for t is the marginal cost on twice of the 5.1 Experimental Setup resource usage at t. We evaluate our online auction using trace-driven simu- lations, exploiting Google cluster-usage data [33], which 5.2 Comparison with Offline Optimum contains resource capacity of servers and job information submitted to the Google cluster. A job comprises multiple We study the ratios computed by dividing the offline op- tasks, each of which is accompanied by its resource require- timal social welfare by the social welfare achieved by our ment (CPU, RAM and disk). We associate each job with a online algorithm under different settings. The offline opti- user and translate each of the job’s tasks into a VM bid of the mal social welfare is obtained by solving the offline problem user, requesting K = 3 types of resource demands extracted (3) exactly with a brute force approach, by enumerating all from the traces. Especially, the overall resource demand of possible solutions, due to the hardness of solving the non- user n when submitting the ith bid (Dnik(t)) equals the total convex offline problem. Due to the high time complexity of demand of the resource in the first i tasks of the job; the the method, we limit the largest number of bids from all resource requirement specified in the ith bid (dnik(t)) is set users to be 30 in this set of experiments and the number of according to (5). The number of bids per user (In) is within servers to very small as well. Fig. 2 shows that the ratios are [1, 50]. around 2 with varying numbers of servers. JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 10

2.5 1400 S=1 S=2 S=3 20% 60% 100% 140% 180% 1200 2 1000

1.5 800

1 600

Social Welfare400 Competitive0.5 Ratio 200

0 0 5 10 15 20 25 30 400 600 800 1000 1200 1400 Number of Bids Number of Bids Fig. 2: Competitive ratio of T oast with different server Fig. 3: Social welfare achieved by T oast with different numbers estimations of Lk

1400 1400 20% 60% 100% 140% 180% Toast TC TI 1200 1200

1000 1000

800 800

600 600

Social Welfare400 Social Welfare400

200 200

0 0 400 600 800 1000 1200 1400 400 600 800 1000 1200 1400 Number of Bids Number of Bids

Fig. 4: Social welfare achieved by T oast with different Fig. 5: Comparison among T oast, TC, and TI estimations of Uk

2500 1400 TC1 Toast1 TC2 Toast2 TC3 Toast3 Toast1 TI1 Toast2 TI2 Toast3 TI3 1200 2000 1000

1500 800

1000 600

Social Welfare Social Welfare400 500 200

0 0 400 600 800 1000 1200 1400 400 600 800 1000 1200 1400 Number of Bids Number of Bids

Fig. 6: Comparison between T oast and TC with vary- Fig. 7: Comparison between T oast and TI with vary- ing Uk/Lk ing Uk/Lk

5.3 Performance with Under/Over-Estimation of Uk and out low value bids when the range of bidding price per unit Lk of demand is large. We also further compare T oast and TI in Fig. 7, by We evaluate performance of our online auction at different setting the ratio of total resource capacity/total demand of levels of under-estimation and over-estimation of the actual each type of resource in the system at each time within dif- values (the case of 100%) of Uk and Lk, which are input in ferent ranges. The range is [0.6, 0.8], [0.4, 0.6], and [0.2, 0.4] the price functions in T oast. Fig. 3 and Fig. 4 illustrate how for T oast1 (TI1), T oast2 (TI2), and T oast3 (TI3), respec- the social welfare achieved is influenced by their under- tively. We observe that when the resource is more scarce, our estimation or over-estimation, with over-estimation more mechanism outperforms TI more. This is expected since TI preferred for Lk and under-estimation more preferred for rejects more bids when the server utilization is high, because Uk. its prices surge when resource allocation reaches half of the server capacity. 5.4 Comparison with Two Heuristics Fig. 5 shows that T oast outperforms TI significantly and 6 CONCLUDING REMARKS achieves slightly better performance than TC. In Fig. 6, we This work designs a truthful and efficient online auction for further compare T oast and TC by running the algorithms dynamic resource scaling and pricing, where cloud users under different values of Uk/Lk. Especially, Uk/Lk is 10 repeatedly bid for resources into the future with increased for TC1 and T oast1, 50 for TC2 and T oast2, and 250 amounts, according to their scale-up/out preferences. We for TC3 and T oast3 in the figure. We observe that when consider server energy cost minimization in social welfare Uk/Lk is larger, T oast outperforms TC more. This can be maximization, and reveal an important property, submodu- explained by that the price function of TC is only related to larity, of the objective function in the resulting significantly the marginal server cost which underperforms in filtering more challenging offline problem. A novel competitive JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 11

analysis framework is established for submodular function Inequality (15) holds by our assumption of g’s being sub- optimization with non-linear constraints, demonstrating a modular. Next, we show (16). By (6), we have that for any good competitive ratio of our online mechanism. The cur- x, Φ(x ∨ e(nis)) − Φ(x) is equal to rent work focuses on dynamic increase of resources, which P up P out φ + 0 φ , is supported in today’s IaaS clouds. On the other hand, j6=i:xnjs=1 n(ij) j6=i:∃s 6=s,xnjs0 =1 n(ij) resource scale-down and scale-in involve selling resources up out which is non-decreasing in x since φn(ij)’s and φn(ij)’s are back to the provider. Designing efficient online mechanisms 0 non-negative. So (16) holds because xnjs0 ≥ x 0 for all for a bi-directional market is significantly more challenging, njs j ∈ [In] and s ∈ [S]. which we seek to investigate in our future work, possibly exploiting double auctions.

APPENDIX C 7 ACKNOWLEDGEMENTS PROOFOF THEOREM 2 This work was supported in part by grants from Proof. VM consolidation constraint (3a). This follows by the Hong Kong RGC under the contracts HKU 17204715, definition of our online auction (line 8 and 9 of Alg. 1). 17225516, 17202115E, C7036-15G (CRF), grants NSFC Capacity constraint (3b). By our payment design in (8), we (61571335, 61628209), HKU URC Matching Funding, have pks(t) = Uk when yks(t) = Cks, which is high enough Huawei HIRP (HO2016050002BE), and Hubei Science Foun- to ensure negative utility for any bid. So the algorithm dation (2016CFA030, 2017AAA125). assigns resources within the server capacities.

APPENDIX A APPENDIX D PROOFOF LEMMA 1 PROOFOF THEOREM 3 Proof. Informally, (7) is submodular because the marginal Proof. (Truthfulness in bidding price) The marginal prices that increase of (7) due to accepting a bid i (of user n) is propor- the cloud provider presents to bid Bni depend only on the tional to the amount of resources obtained through the bid, demands of resources before the arrival of bid Bni and the which decreases if more of the other bids are accepted. demands of bid Bni, thus, are independent on Bni’s bidding Formally, we need to show that for any n ∈ [N], i ∈ [In], price. Further, the cloud provider always assigns bids to 0 0 s ∈ [S], and any x, x such that xnis = xnis = 0 and xnjs0 ≥ servers to maximize each bid’s utility given the current 0 0 xnjs0 for all j ∈ [In] and s 6= s, the marginal increase in marginal prices. Assuming the bidders are myopic, i.e., a “pure” value minus server cost of accepting bid i on server bidder only cares about the utility in his current bid. So it s satisfies falls into the family of sequential posted price mechanisms (e.g., [35]) and, thus, a bidder cannot improve its utility by g(x ∨ e(nis)) − g(x) ≤ g(x0 ∨ e(nis)) − g(x0) lying about the bidding price of any of his bids. P P (Truthfulness in arrival time) Since the marginal prices are Since g(x) = n∈[N] un(xn) − k,s,t fks(yks(t)), we have non-decreasing in the amount of allocated resources, which is non-decreasing over the resource allocation time, i.e., let (nis) P P P g(x∨e ) = − + (δnik−hks)dnik(t) n∈[N] k∈[K] t∈[tni,tni] yks(tni, t) denote the amount of resource k on s in t which has been allocated by tni (tni ≤ t) and yks(tni, t) is non- According to Assumption 1, δnik ≥ hks. And recall that decreasing over tni. Hence a bidder cannot decrease the total P dnik(t) = Dnik(t)−max − + {Dnjk(t) xnjs}. price of the resource that it requests by delaying its arrival. 1≤j≤i−1:tnj ≤tnj s∈[S] Note that the arrival time of a bid is the first time the bidder 0 Since we have xnjs0 ≥ xnjs0 for all j ∈ [In] and s ∈ is aware of his latest updated demands so the arrival time (nis) [S], dnik(t) under x ∨ e is no larger than that under can not be earlier. 0 (nis) x ∨ e . Thus the lemma follows. (Truthfulness in VM execution duration) Dropping part of the true resource occupation duration in the request would risk failing to complete the job. So no bidder would APPENDIX B do that. On the other hand, the marginal prices are non- PROOFOF THEOREM 1 negative according to (8). So requesting a superset of the Proof. We need to show that for any n ∈ [N], i ∈ [In], s ∈ true VM occupation duration increases a bidder’s payment 0 0 [S], and any x, x such that xnis = xnis = 0 and xnjs0 ≥ and decreases her utility. 0 0 xnjs0 for all j ∈ [In] and s 6= s, the marginal increase in (Individually rational) According to line 4 of Alg. 1, the social welfare of accepting bid i on server s satisfies utility in any bid of a bidder is always non-negative. The profit of the provider is also non-negative based on the w(x ∨ e(nis)) − w(x) ≤ w(x0 ∨ e(nis)) − w(x0) formulation of pˆni in Sec. 4 which implies pks(yks(t)) > 0 Since w(x) = g(x) − Φ(x), it suffices to show that fks(yks(t)). (Polynomial running time) To process a bid Bni, the (nis) 0 (nis) 0 g(x ∨ e ) − g(x) ≤ g(x ∨ e ) − g(x ) (15) algorithm first sums up the marginal prices for all requested resources over the occupation duration for each server and s ∈ [S] to compute the payment that bid Bni should pay Φ(x ∨ e(nis)) − Φ(x) ≥ Φ(x0 ∨ e(nis)) − Φ(x0) (16) if it would be served on server s. This step runs in O(KST ) JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 12

time. Then, the algorithm computes uni and decides the APPENDIX I allocation and payment of bid Bni by checking the utility PROOFOF 7 of the bid if it would be served on each server s, which Proof. Note that Φˆ nis(x) is linear in x (Eqn. (10)). We can be done in O(S) time. Finally, the algorithm updates ? ? get that Φˆ nis(x ∨ x(A)) ≤ Φˆ nis(x ) + Φˆ nis(x(A)). Since the amount of allocated resources yks(t) and the respective up out ˆ ˆ φnii = φnii = 0, we have that Φnis(x(A)) = Φnis(x(A) ∧ marginal prices pks(t), which can be done in O(KST ) time. (nis) e ). Further, by Lemma 2, we have Φˆ nis(x(A) ∧ (nis) 1 (nis) 0 e ) ≤ 2 wˆnis(x(A) ∧ e ). Here x ∧ x denotes a vector whose entries are equal to the min of the cor- 0 0 APPENDIX E responding entries in x and x , i.e., (x ∧ x )i = 1 if both x = 1 and x0 = 1 hold. By submodularity, we PROOFOF LEMMA 2 i i w(x(A)) ≥ P wˆ (x(A) ∧ e(nis)) ≥ get that (n,i,s):xnis=1 nis Proof. According to Assumption 2, the increase in g(x) for P (nis) (n,i,s)∈S wˆnis(x(A) ∧ e ), where the second inequal- x α ≥ 3 2 turning nis from 0 to 1 is at least ( ) times the cor- ity holds because S2 is a subset of the (n, i, s)’s that are responding increase in Φn(xn). Moreover, we have g(x) = accepted by x (i.e., x = 1) and each term in the sum ˆ ˆ nis w(x) + Φ(x). Thus we have wˆnis(x) + Φnis(x) ≥ 3Φnis(x), P Φˆ (x(A)) ≤ is non-negative (Lemma 3). Thus xnis∈S2 nis which the lemma follows. 1 2 w(x(A)). 0 Moreover, for each xnis ∈ S2, there exists a s 6= s such ? ? APPENDIX F that xnis0 ∈ x , i.e., xnis0 = 1. According to Assumption 2, P ? we have − + (δnik − hks0 )dnik(t) ≥ 3Φˆ nis0 (x ) PROOFOF LEMMA 4 k∈[K],t∈[tni,tni] P ˆ ? and k∈[K],t∈[t− ,t+ ](δnik − hks0 )dnik(t) ≥ 3Φnis(x ). Proof. According to Lemma 1, g(x) is monotone, i.e., the ni Pni Thus we have k∈[K],t∈[t− ,t+ ](δnik − hks0 )dnik(t) − increase in g(x) for turning xnis from 0 to 1 is always ni ni ? 2 P ˆ Φˆ nis0 (x ) ≥ − + (δnik − hks0 )dnik(t) ≥ nonnegative. Therefore, wˆnis(x) + Φnis(x) ≥ 0, which the 3 k∈[K],t∈[tni,tni] ˆ ? ˆ ? 1 ? ? lemma follows. 2Φnis(x ), i.e., Φnis(x ) ≤ 2 wˆnis0 (x ). Since w(x ) ≥ P ? P ˆ ? 0 wˆnis0 (x ), we have Φnis(x ) ≤ (nis ):xnis0 =1 xnis∈S2 1 w(x?). Putting together the inequalities shows the APPENDIX G 2 lemma. PROOFOF LEMMA5

S1 Proof. Let e be a vector with ones in the entries in S1 and APPENDIX J S2 zeros in other entries. Define e similarly. Then, we have PROOFOF 8 x? ∨ x(A) = x? ∨ eS1 ∨ eS2 . By submodularity, we have Proof. Note that the marginal valuation wˆnis(x(A)) repre- sents the valuation of bid Bni if it is served on server s w(x? ∨ x(A)) = w(x? ∨ eS1 ∨ eS2 ) conditioned on the decisions on all other bids being the ? ? X ? S1 same as those by our algorithm T oast. For each xnis ∈ S1 , ≥ w(x ) + wˆnis(x ∨ e ) bid Bni is rejected by our algorithm. Therefore, the marginal (n,i,s)∈S1 valuation of accepting bid Bni on server s when the bid X ? S1 S2 + wˆnis(x ∨ e ∨ e ) arrived (which, by submodularity, is greater than or equal (n,i,s)∈S2 to wˆnis(x(A))), is smaller than the payment on that server (ni) Note that x? ∨ eS1 satisfies the conditions in Lemma 3. s. Let yks (t) denote the allocated amount of resource k at So the second term on the RHS of the above inequality is t on s just before the arrival of Bni. Then we have ? S1 S2 at least 0. By Lemma 4, we have wˆnis(x ∨ e ∨ e ) = ? S1 S2 ? X X (ni) −Φˆ nis(x ∨ e ∨ e ) = −Φˆ nis(x ∨ x(A)). So the lemma   wˆnis(x(A)) ≤ pks(yks (t)) − hks dnik(t) follows. − + k∈[K] t∈[tni,tni] X X   ≤ pks(Yks(t)) − hks dnik(t) − + k∈[K] APPENDIX H t∈[tni,tni] PROOFOF 6 where the second inequality holds because pks(yks(t)) is S? S? Proof. Define e 1 and e 2 similarly as in the previous monotone. ? S? S? ? proof. Then, we have x ∨ x(A) = x(A) ∨ e 1 ∨ e 2 . By Moreover, the bids in S1 are a subset of the bids accepted submodularity, we have by the offline optimal. So the total demand of these bids for ? ? any resource on any server does not exceed the capacity of ? S1 S2 w(x ∨ x(A)) = w(x(A) ∨ e ∨ e ) the server. Thus, summing up over all (n, i, s) ∈ S?, we X 1 ≤ w(x(A)) + wˆnis(x(A)) have ? (n,i,s)∈S1 ? X S1 X X X X   + wˆnis(x(A) ∨ e ) wˆnis(x(A)) ≤ pks(Yks(t))−hks Cks ? ? (n,i,s)∈S2 (n,i,s)∈S1 t∈[T ] s∈[S] k∈[K] By Lemma 4, the last term is less than or equal to 0. JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 13

APPENDIX K [14] Sethuraman Subbiah, Xiaohui Gu, and John Wilkes. CloudScale: PROOFOF 9 Elastic Resource Scaling for Multi-tenant Cloud Systems. In Proc. of ACM SoCC, 2011. B [15] Hiep Nguyen, Zhiming Shen, Xiaohui Gu, Sethuraman Subbiah, Proof. For any bid ni that is accepted by the algorithm on and John Wilkes. Agile: Elastic Distributed Resource Scaling for some server s, i.e., xnis = 1, its marginal value is at least the Infrastructure-as-a-Service. In Proc. of the USENIX ICAC, 2013. payment. Thus, we have [16] M Alicherry and T V Lakshman. Network Aware Resource Allocation in Distributed Clouds. In Proc. of IEEE INFOCOM, 2012. [17] Carlee Joe-Wong, Soumya Sen, Tian Lan, and Mung Chiang. Multi Z Yks(t) X X X   Resource Allocation: Fairness-Efficiency Tradeoffs in a Unifying w(x(A)) ≥ pks(yks(t)) − hks dyks(t) Framework. In Proc. of IEEE INFOCOM, 2012. t∈[T ] s∈[S] k∈[K] 0 [18] L. L. Andrew E. Thereska M. Lin, A. Wierman. Dynamic Right- (17) Sizing for Power-Proportional Data Centers. In Proc. of IEEE INFOCOM, 2011. Next, we lower bound each term in the above sum as [19] J. Llorca Y. Jin A. Sala L. Jiao, A. Tulino. Smoothed Online Resource Allocation in Multi-Tier Distributed Cloud Networks. In IPDPS, follows: 2016. Z Yks(t) [20] Sharrukh Zaman and Daniel Grosu. Combinatorial Auction-based   pks(yks(t)) − hks dyks(t) Allocation of Virtual Machine Instances in Clouds. Journal of 0 Parallel and Distributed Computing, 73(4):495–508, 2013. yks(t) [21] Linquan Zhang, Zongpeng Li, and Chuan Wu. Dynamic Resource Z Yks(t)   C Lk − hks 2KS(Uk − hks) ks  Provisioning in Cloud Computing: A Randomized Auction Ap- = dyks(t) proach. In Proc. of IEEE INFOCOM, 2014. 0 2KS Lk − hks [22] Hong Zhang, Hongbo Jiang, Bo Li, Fangming Liu, Athanasios V. Yks(t) (Lk − hks)Cks 2KS(Uk − hks) C Vasilakos, and Jiangchuan Liu. A Framework for Truthful On- = ks 2KS(U −h ) line Auctions in Cloud Computing with Heterogeneous User 2KS ln( k ks ) Lk − hks Lk−hks Demands. In Proc. of IEEE INFOCOM, 2013. (L − h )C [23] Weijie Shi, Chuan Wu, and Zongpeng Li. RSMOA: A Revenue and − k ks ks 2KS(U −h ) Social Welfare Maximizing Online Auction for Dynamic Cloud 2KS ln( k ks ) Resource Provisioning. In Proc. of IWQoS, 2014. Lk−hks  [24] Weijie Shi, Linquan Zhang, Chuan Wu, Zongpeng Li, and 1   (Lk − hks)Cks Fracis C.M. Lau. An Online Auction Framework for Dynamic = pks(Yks(t)) − hks Cks − ln( 2KS(Uk−hks) ) 2KS Resource Provisioning in Cloud Computing. In Proc. of ACM Lk−hks SIGMETRICS, 2014. where the last equality holds by our payment design in (8). [25] Jeff Bilmes Rishabh Iyer. Submodular Optimization with Submod- ular Cover and Submodular Knapsack Constraints. In Proc. of The lemma then follows by summing over all k ∈ [K], s ∈ NIPS, 2013. [S], and t ∈ [T ] and using the assumption that the optimal [26] Uriel Feige and Jan Vondr ak. Approximation algorithms for 2(L −h )C welfare is at least w(x?) ≥ P k ks ks . allocation problems: Improving the factor of 1 - 1/e. In Proc. of k,s,t KS IEEE FOCS, 2006. [27] Niv Buchbinder, Moran Feldman, and Roy Schwartz. Online submodular maximization with preemption. In Proc. of ACM- REFERENCES SIAM SODA, 2015. [28] Moran Feldman, Joseph (Seffi) Naor, , and Roy Schwartz. Im- proved competitive ratios for submodular secretary problems. In [1] Amazon. Amazon Auto Scaling. Proc. of APPROX/RANDOM, 2011. http://aws.amazon.com/autoscaling/ 2014. [29] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Har- [2] Azure. Autoscaling and ris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Azure. https://msdn.microsoft.com/en- Xen and the Art of Virtualization. In Proc. of ACM SOSP, 2003. us/library/hh680945(v=pandp.50).aspx 2012. [30] Hao Chen, Michael C. Caramanis, and Ayse K. Coskun. Reducing [3] Google. Autoscaler - Compute Engine . the Data Center Electricity Costs Through Participation in Smart https://cloud.google.com/compute/docs/autoscaler/ 2015. Grid Programs. In Proc. of IGCC, 2014. [4] ProfitBricks. ProfitBricks Live Verti- [31] Bhavani Krishnan, Hrishikesh Amur, Ada Gavrilovska, and cal Scaling. https://www.profitbricks.de/ Karsten Schwan. VM Power Metering: Feasibility and Challenges. sites/default/files/PB LiveVerticalScaling by ACM SIGMETRICS Performance Evaluation Review, 38(3):56–60, ProfitBricks EN.pdf 2012. 2010. [5] Lunacloud. lunacloud. http://www.lunacloud.com 2014. [32] JAman Kansal, Feng Zhao, Jie Liu, Nupur Kothari, and Arka A. [6] Ninefold. ninefold. https://ninefold.com 2013. Bhattacharya. Virtual Machine Power Metering and Provisioning. [7] Linux-kvm. KVM CPU Hotplug. http://www.linux- In Proc. of ACM SoCC, 2010. kvm.org/page/CPUHotPlug. [33] Charles Reiss, John Wilkes, and Joseph L. Hellerstein. Google [8] Backdrift. Xen Memory Hotplug. http://backdrift.org/xen- cluster-usage traces: format + schema. Technical report. memory-hot-add-and-remove. http://code.google.com/p/googleclusterdata/wiki/TraceVersion2. [9] Al-Roomi May, Al-Ebrahim Shaikha, Buqrais Sabika, and Ahmad [34] A. Blum, A. Gupta, Y. Mansour, and A. Sharma. Welfare and Profit Imtiaz. Cloud Computing Pricing Models: A Survey. International Maximization with Production Costs. In Proc. of IEEE FOCS, 2011. Journal of Grid and Distributed Computing, 6(5):93–106, 2013. [35] Shuchi Chawla, Jason D Hartline, David L Malec, and Balasubra- [10] Orna Agmon Ben-Yehuda, Muli Ben-Yehuda, Assaf Schuster, and manian Sivan. Multi-Parameter Mechanism Design and Sequen- Dan Tsafrir. Deconstructing Amazon EC2 Spot Instance Pricing. tial Posted Pricing. In Proc. of ACM STOC, 2010. In Proc. of IEEE CloudCom, 2011. [11] Qian Wang, Kui Ren, and Xiaoqiao Meng. When Cloud meets eBay: Towards Effective Pricing for Cloud Computing. In Proc. of IEEE INFOCOM, 2012. [12] Xiaoxi Zhang, Zhiyi Huang, Chuan Wu, Zongpeng Li, and Fran- cis C.M. Lau. Online Auctions in IaaS Clouds: Welfare and Profit Maximization with Server Costs. In Proc. of ACM SIGMETRICS, 2015. [13] Zhiyi Huang and Anthony Kim. Welfare Maximization with Production Costs: a Primal Dual Approach. In Proc. of the ACM- SIAM SODA, 2015. JOURNAL OF IEEE TRANSACTIONS ON CLOUD COMPUTING, VOL. XX, NO. XX, XX 2017 14

1 . Xiaoxi Zhang is currently a postdoctoral re- searcher at Carnegie Mellon University. She re- ceived her B.E. in electronics and information engineering from Huazhong University of Sci- ence and Technology in 2013 and her Ph.D. in computer science from The University of Hong Kong in 2017. Her research interests include cloud computing, online algorithms, online learn- ing, and algorithmic game theory.

2 . Zhiyi Huang is an assistant professor of Com- puter Science at the University of Hong Kong. He works broadly on theoretical computer science. Before joining HKU, He was a postdoc at Stan- ford University from 2013 to 2014, working with Tim Roughgarden. He obtained his Ph.D. from the University of Pennsylvania under Sampath Kannan and Aaron Roth in 2013, interned at Microsoft Research Redmond under Nikhil R. Devanur in the summers of 2011 and 2012, and got his bachelor degree from the first ”Yao Class” under Andrew Yao at Tsinghua University in 2008.

3 . Chuan Wu received her B.E. and M.E. de- grees in 2000 and 2002 from Department of Computer Science and Technology, Tsinghua University, , and her Ph.D. degree in 2008 from the Department of Electrical and Computer Engineering, University of Toronto, Canada. She is currently an associate professor in the De- partment of Computer Science, The University of Hong Kong, China. Her research interests in- clude cloud computing and online/mobile social network.

4 . Zongpeng Li received his BSc from Tsinghua University in 1999, and his PhD from University of Toronto in 2005. His research interests in- clude computer networks and cloud computing. He was a faculty member at the University of Calgary, and is now with the School of Computer Science, Wuhan University, China. He won the Outstanding Young Computer Scientist Prize in 2015.

5 . Francis C.M. Lau received his Ph.D. de- gree from the Department of computer science from University of Waterloo. He is currently a professor in computer science in The Univer- sity of Hong Kong. He is the editor-in-chief of the Journal of Interconnection Networks. His re- search interests include computer systems, net- works, programming languages, and application of computing in arts.