Price-Based Distributed Optimization in Large-Scale Networked Systems

Dissertation document submitted to the

Division of Research and Advanced Studies

of the University of Cincinnati

in partial fulfillment of the

requirements of degree of

Doctor of Philosophy

in the School of Dynamic Systems

College of Engineering and Applied Sciences

by

Baisravan HomChaudhuri

Summer 2013

MS, University of Cincinnati, 2010

BE, Jadavpur University, 2007

Committee Chair: Dr. Manish Kumar

© 2013 Baisravan HomChaudhuri, All Rights Reserved

Abstract

This work is intended towards the development of distributed optimization methods for large-scale complex networked systems. Advances in technological fields such as networking, communication, and computing have facilitated the development of networks that are massively large-scale in nature. Examples of such systems include the power grid, communication networks, the Internet, large manufacturing plants, and cloud computing clusters. One of the important challenges in these networked systems is the evaluation of the optimal point of operation of the system. Most centralized optimization algorithms suffer from the curse of dimensionality, which raises the issue of scalability of algorithms for large-scale systems. The problem is especially challenging not only due to the high dimensionality of the problem, but also due to the distributed nature of resources, the lack of global information, and the uncertain and dynamic nature of operation of most of these systems. The inadequacies of traditional centralized optimization techniques in addressing these issues have prompted researchers to investigate distributed optimization techniques. In particular, distributed resource allocation is a promising paradigm of special relevance to complex networked systems. This research focuses on developing techniques to carry out global optimization in a distributed fashion, exploring the fundamental idea of decomposing the overall optimization problem into a number of sub-problems that utilize limited information exchanged over the network using neighborhood relationships. Inspired by price-based mechanisms, the research develops two methods. First, a distributed optimization method consisting of dual decomposition and update of the dual variables in the subgradient direction, also known as a market-based method, is developed for several different classes of resource allocation problems.
Although the dual decomposition based method requires less communication and is easy to implement, it has its own drawbacks, including a slow rate of convergence. To address some of these drawbacks in the field of distributed optimization, this dissertation develops a Newton-based distributed interior point optimization method. The proposed approach, which is iterative in nature, focuses on the generation of both primal and dual feasible solutions at each iteration and on the development of mechanisms that demand less communication. The convergence and rate of convergence of both the primal and the dual variables in the system are also analyzed using a benchmark Network Utility Maximization (NUM) problem, followed by numerical simulation results for the NUM problem. A comparative study between the proposed distributed method and a centralized method of optimization is also provided to evaluate the performance of the proposed method.

The proposed distributed optimization techniques have been applied to real-world systems such as optimal power allocation in the Smart Grid and utility maximization in cloud computing systems. Both problems belong to the class of large-scale complex network problems and have immense significance in the current and future power distribution and computing worlds. In power grids, the power from the generation units is required to be allocated to the end users in an optimal fashion. The challenges in this problem are augmented by the nature of the decision variables, coupling effects in the network, the global constraints in the system, the uncertain nature of renewable power generators, and, last but not least, the large-scale distributed nature of the problem. In cloud computing, resources such as memory, processing, and bandwidth need to be allocated to a large number of users to maximize the users' quality of experience.

Finally, the research focuses on the development of a stochastic distributed optimization method for solving problems with multi-modal cost functions. As opposed to unimodal function optimization, the widely practiced gradient descent methods fail to reach the global optimum solution when multi-modal cost functions are considered. In this dissertation, an effort is made to develop a stochastic distributed optimization method that exploits noise-based solution updates to prevent the algorithm from converging to local optimum solutions. The method is applied to the Network Utility Maximization problem with multi-modal cost functions and is compared with a Genetic Algorithm (GA).

Acknowledgments

I would like to express my deep gratitude to Dr. Manish Kumar for his valuable and constructive suggestions during the planning and development of this research work. His patience and willingness to give his time so generously have been very much appreciated. I have been extremely fortunate to have an adviser like Dr. Kumar, who gave me the freedom to explore on my own and at the same time guided me when required.

My deepest gratitude to Dr. Kelly Cohen, Dr. David Thompson, and Dr. Sam Anand for being a part of my PhD committee. I deeply appreciate their insightful comments on my research, as they helped me shape my dissertation.

I am thankful to my parents and my elder sister. They were always supportive of my decisions and always encouraged me to pursue my studies.

I would also like to thank my lab mates Alireza Nemati, Balaji Sharma, Benson Isaac, Gaurav Mukherjee, Jiankun Fan, Joseph Anthony Pietrykowski, Paul Rael, Ruoyu Tan and Sushil Garg for always keeping a lively atmosphere in the lab.

I am grateful to the National Science Foundation Grant number EFRI-1024608 for financially supporting me for some time during my PhD. I am thankful to Dr. Vijay Kumar Devabhaktuni of the University of Toledo, Dr. Prasad Calyam of the University of Missouri, and Kshitij Fadnis of Ohio State University for collaborating on different projects. Finally, I would like to thank the University of Cincinnati and the School of Dynamic Systems for giving me the opportunity to pursue my PhD degree.

Contents

1 Introduction and Motivation
1.1 Network Flow Problem
1.2 Coordination in Networked Systems
1.3 Distributed Optimization
1.4 Optimization with Multimodal Functions
1.5 Application Areas
1.6 Document Outlines

2 Literature Review
2.1 Decomposition Methods
2.2 Distributed Optimization
2.3 Multi-modal Cost Function
2.4 Application Areas

3 Problem Description
3.1 Minimum Cost Network Flow Problem: Mincost Problem
3.1.1 Mincost Problem 1: Allocation of Indivisible Resources
3.1.2 Mincost Problem 2: Allocation of Divisible Resources
3.2 Maximum Flow Problem: Maxflow Problem
3.3 Optimization with Multi-Modal Cost Function

4 Approach and Simulation Results for Minimum Cost Network Flow Problem
4.1 Mincost Problem 1: Resource Allocation with Indivisible Resources
4.2 Mincost Problem 2: Allocation of Divisible Resources
4.3 Simulation Results
4.3.1 Mincost Problem 1: Resource Allocation with Indivisible Resources
4.3.2 Mincost Problem 2: Resource Allocation with Divisible Resources

5 Approach and Simulation Results for Maximum Flow Problem
5.1 Optimality Conditions
5.2 Duality Gap
5.3 Distributed Primal and Dual Update
5.4 Dual Step Size
5.5 Primal Step Size
5.6 Convergence Analysis
5.6.1 Descent Direction
5.6.2 Convergence of Primal Variables
5.6.3 Convergence of Dual Variables
5.7 Simulation Results

6 Approach and Simulation Results for Multi-Modal NUM Problem
6.1 Simulation Results

7 Applications
7.1 Optimal Power Flow Problem in Power Grids
7.1.1 Buyer Strategy
7.1.2 Dealer Strategy
7.2 Utility Maximization in Cloud Computing
7.3 Simulation Results
7.3.1 Optimal Power Flow Problem in Power Grids
7.3.2 Cloud Computing Systems

8 Dissertation Contributions and Future Scope
8.1 Dissertation Contributions
8.2 Future Scope

List of Tables

4.1 Comparison Between Market-Based Method and Integer Programming (Mincost Problem 1)
4.2 Comparison Between Market-Based Method and Linear Programming (Mincost Problem 2)
4.3 Task Demand
4.4 Resource Capacities

5.1 Comparison Between the Proposed Distributed Optimization and Central Optimization Method for Case 1 (200 Source Nodes and 500 Links)
5.2 Comparison of Cost for Case 2 (2500 Source Nodes and 4500 Links)

6.1 Comparison Between the Proposed Distributed Optimization and Genetic Algorithm

7.1 OPF Solution for IEEE 30-bus System Using Market-Based Approach
7.2 Comparison of Market-Based Solution and Greedy Solution for 6 Different Cases

List of Figures

1.1 A Multimodal Function

3.1 Schematic of a Network
3.2 Multi-Modal Cost Function

4.1 Mincost Problem 1 (Scenario 1): 25 Resources and 25 Tasks
4.2 Mincost Problem 1 (Scenario 2): 50 Resources and 50 Tasks
4.3 Mincost Problem 1 (Scenario 3): 100 Resources and 100 Tasks
4.4 Mincost Problem 2 (Scenario 1): 10 Resources and 15 Tasks
4.5 Mincost Problem 2 (Scenario 2): 21 Resources and 43 Tasks
4.6 Mincost Problem 2 (Scenario 3): 50 Resources and 50 Tasks

5.1 CASE 1: (200 Source Nodes and 500 Links)
5.2 CASE 2: (2500 Source Nodes and 4500 Links)

6.1 Geometric Interpretation of the Approach
6.2 Evolution of System Utility with the Number of Iterations
6.3 (Scenario 1): Comparison of Utility for 20 Different Runs
6.4 (Scenario 2): Comparison of Utility for 20 Different Runs
6.5 (Scenario 3): Comparison of Utility for 20 Different Runs

7.1 Schematic Diagram of the IEEE 30-bus System. The horizontal bars represent the buses in the system; the generators are shown as circles attached to the specific buses; the arrowheads represent the loads at different buses; and the transmission lines connect one bus with another
7.2 Evolution of Price with Respect to Iteration
7.3 Evolution of Error Values for 5 Randomly Chosen Buses with Respect to Iteration
7.4 Problem 4: The Solution to the DC OPF Problem for Scenario 1; the Power Generated is Shown in Red, Power Demand at Each Node is Shown in Blue, and Power Flow Between the Transmission Lines is Shown in Black
7.5 Comparison of Total Generation Costs Between MATPOWER and Market-Based Solutions for 10 Different Scenarios
7.6 Performance of Market-Based Method Compared with a Popular Heuristic Approach in Cloud Computing

Chapter 1

Introduction and Motivation

Large-scale networked systems are an important part of modern human society. A network can be broadly defined as any interconnected system of devices, machines, or even people. It essentially comprises a large number of entities, capable of decision making, with some form of communication and flow of resources between them. In multi-agent systems, these entities are generally called agents or nodes, which are connected with each other by links. Generally speaking, networked systems are composed of a large number of subsystems that are capable of taking local decisions and coordinating among themselves (according to the communication protocol) to accomplish the system's goal. Complex networked systems describe a large number of social, biological, and engineered systems of high technological and intellectual importance. In particular, they represent several key engineering infrastructures in our modern society, including the Internet, power grids, sensor networks, electronic financial transaction networks, modern transportation networks, capital flow networks [1], cloud computing systems, and social networks. Even though these systems are diverse in nature, they possess some common characteristics, including: i) distributed resources; ii) local interaction, i.e., every element interacts with its immediate neighbors; iii) scarcity of resources (shared resources); and iv) the phenomenon of emergence, i.e., local interactions leading to global behavior.

Network resource allocation, or the network flow problem, has a wide range of applicability. For example, in the transportation network problem, the flow of vehicles through the railway or road network is considered; in wireless sensor networks, the flow of measured data through a wireless channel is considered; and in the Internet, information packets flow through the wired network. Other examples of allocation of resources to tasks in a network include allocation of computing resources [2, 3, 4] such as memory or processors, allocation of network resources to consumers [2, 5] in computer or communication networks, and parts and machine allocation in factories [6]. In general terms, resources are any means that enable operations of an engineered system or process. Some examples of resources include energy, information, materials, capabilities of plants, bandwidth, and machines. In most real-world problems, resources are always limited. Generally, in an engineered system, there are two sets of subsystems: one that needs resources and the other that offers resources. Thus the resource allocation problem deals with the allocation of resources from one set of subsystems to another set of subsystems so that the overall operation of the system is carried out in an optimal manner.

Among the existing networked systems, the Internet can be considered the most popular and widely used system; it comprises a large number of computers and users connected together through a network. Similarly, in recent years, a lot of emphasis has been given to power grids, which also fall into the large-scale networked system category. To address the drawbacks of the current power grid, a modernized version of the power grid, called the Smart Grid, has been proposed that includes improved computing, communication, and the incorporation of sensors in different parts of the network. The futuristic power grid (smart grid), with increased communication and decision-making capability, can be considered a good example of a large-scale networked system that will be more distributed, dynamic, and unpredictable in nature because of the incorporation of renewable and distributed energy resources. Another example of a large-scale networked system is the cloud computing system, which treats delivery of computing and storage as a service rather than a product. Cloud computing systems are composed of multiple service users and service providers (services include computing, bandwidth, and memory resources) that are geographically distributed. Even modern-day cars and aircraft can be considered large-scale networked systems, as they include a large number of sensors and controllers that work together to make the driver's or pilot's work much easier and safer. Other potential networked systems include cyber-physical systems, cognitive networks, participatory or opportunistic sensing networks, software-defined communications, and many more. These systems are present in a lot of unexpected places, and the number of these systems is expected to grow with developments in the fields of sensing, communication, and computing.

This dissertation focuses on the network flow problem [7] of complex networked systems, which emphasizes fair resource allocation in a network. The network flow problem is one of the fundamental problems of many research fields, such as mathematical optimization, operations research, computer science, and many engineering and real-world problems. Loosely speaking, the problem deals with the flow of resources within the network in an optimal manner, which can be interpreted as the evaluation of the optimal point of operation of the network. Network resource allocation is a part of the resource management problem, which is concerned with objectives such as task planning, resource deployment, and resource planning so that the system achieves its overall goal. Resource allocation is a lower-level function of resource management that involves the assignment of available resources to tasks in an optimal manner.

The networked optimization problem becomes challenging as network users with different objectives compete for shared resources [8]. Apart from the scarcity of shared resources, the problem becomes more complex with the uncertainty and dynamic nature of the system, the presence of distributed resources, and the absence of central information in the system. This necessitates the use of distributed optimization methods that can operate in parallel and independently, utilizing only local information [9]. The primary challenge here is to obtain the global outcome using local interactions and incomplete information. Even though large-scale networked systems have become pervasive in our society, the existing state of knowledge in this field is primitive and lacks the analytical tools to accurately predict, and hence control, their behavior. Apart from that, some of the existing challenges associated with these systems include: i) the assurance of network control performance under communication constraints between network nodes; ii) the effect of noise, delay, network overhead, latency, and packet dropouts; iii) coverage, consensus, and optimized cooperation in the system; and iv) information patterns for distributed and decentralized control of the network [10]. In this context, of particular interest is the determination of the optimal operating point of these networks to ensure their efficiency/performance, robustness to uncertainties, and responsiveness to dynamic changes.

The network resource allocation and optimization problem is generally dealt with in a centralized manner [6]. To perform the optimization in a centralized manner, the central processing node requires perfect information about the system it is required to optimize [2]. Hence, the central processing node requires knowledge of the current states of resources and tasks at every instance of decision making. Among the several unique challenges to carrying out optimization in these large-scale networks, the most fundamental one is high dimensionality, since the system includes a large number of agents. This is due to the fact that the more agents or entities there are in the system, the greater the number of decision variables in the optimization problem. Thus the computational complexity of centralized optimization methods increases with the number of decision variables, and centralized methods therefore scale poorly for these kinds of problems. In addition, for some complex problems, it is often impossible to define a single system-wide metric for the optimization process [2]. Apart from computational complexity, reliability is another issue associated with centralized methods. Centralized methods have a single point of failure, i.e., if the central node in a networked system fails, then the whole system fails. In many networks, it is not efficient or possible for an agent to share its objective function information with the other agents due to privacy or energy constraints [11]. In such cases, a centralized optimization method cannot perform due to the lack of global information.

To alleviate some of these issues with centralized methods, recent years have seen the emergence of the concept of decentralized or distributed optimization, which strives to make use of distributed agents utilizing limited information to carry out the optimization. A distributed optimization and resource allocation method not only alleviates the problems of high dimensionality and a single point of failure, but also addresses the issues of scalability and flexibility. It is also advantageous to decentralize the optimization process because in many cases the resources and tasks are geographically separated, possess different characteristics and abilities, and have different information. Another aspect of such systems that impedes the applicability of centralized optimization techniques is dynamic change. A centralized method would require an entire recalculation of the optimal decision variables every time there is a change. Distributed techniques are more robust to these changes, since most of the changes are at the local level and can be addressed by changing local decision variables. Though the concepts of decentralized or distributed optimization have been around for decades, it is only recently that advances in computing and communication technology, coupled with the increase in complexity of networks, have made these distributed concepts more applicable due to the ease with which information can be shared. However, solving the global optimization problem via control of local variables in a distributed and asynchronous fashion is still an open problem in the literature.

The emerging applications of large-scale networked systems have drawn much attention to the necessity of exploiting real-time information and operating in dynamic environments. The industrial and operations research community identifies the real-time control and optimization of large-scale dynamic networked systems as one of the emerging challenges in research today [12].

The main challenge associated with networked systems is to obtain desirable global group behavior through individual decisions in the presence of limited information and local and global constraints.

1.1 Network Flow Problem

The network flow problem [7] is a widely researched topic that deals with the optimal flow of resources through a network while satisfying the network constraints. The network flow problem is broadly classified into the minimum cost flow problem and the maximum flow problem. In the minimum cost flow problem, tasks are completed while the cost of the flow of resources through the network is minimized and the satisfaction of the constraints is ensured. Problems such as transportation problems and assignment problems fall into this category. In all these problems, a cost is associated with every movement of the resources. For example, in the transportation problem, the cost can be the fuel cost for traveling from one city to another, and the objective is to finish a given task at minimum cost. Maximum flow problems, on the other hand, deal with the maximization of the flow of resources (generally dictated by a utility function) through a network while meeting the network constraints. Although the minimum cost and maximum flow problems may look identical, the fundamental difference between them is that the former minimizes cost while the latter is concerned with the maximization of flow in the network. Generally, in maximum flow problems, the utility or the negative cost depends only on the amount of resource flowing through the network rather than the path the resources take. Problems such as the minimum cut problem, the matching problem, and the network utility maximization (NUM) problem fall into the maximum flow problem category. The importance of these problems derives not only from existing applications, but also from futuristic networked problems. For example, according to [13], futuristic communication networks are expected to modify their data transfer rates according to the available bandwidth within the network. Sensor networks are a good example that utilizes this problem, where the goal is to maximize the rate of data flow through the network.
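In symbols, the minimum cost flow problem described above can be written as follows (the notation here is a generic textbook form introduced for illustration; the dissertation's precise formulations appear in Chapter 3):

```latex
% Minimum cost flow on a directed graph G = (N, E):
%   x_{ij} = flow on edge (i,j),  c_{ij} = unit cost,  u_{ij} = capacity,
%   b_i    = supply (positive) or demand (negative) at node i.
\min_{x}\ \sum_{(i,j)\in E} c_{ij}\, x_{ij}
\quad \text{subject to} \quad
\sum_{j:(i,j)\in E} x_{ij} \;-\; \sum_{j:(j,i)\in E} x_{ji} \;=\; b_i
\quad \forall i \in N,
\qquad
0 \le x_{ij} \le u_{ij}.
```

The maximum flow variant keeps the same conservation and capacity constraints but replaces the cost objective with the maximization of a utility of the delivered flow, which is why the path taken matters less than the total amount of flow.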

1.2 Coordination in Networked Systems

Developments in the field of large-scale dynamic networked systems focus on modeling, algorithmic development of coordination, and optimization techniques in the absence of a central coordinator [14, 15]. In distributed scenarios, only limited information is available to every computational unit in the system. Evaluation of the optimal point of operation with only limited information is an extremely difficult task. Thus cooperation (information exchange) among the agents is required, and the overall protocol of cooperation among the agents can be considered the coordination mechanism in the network. Coordination is a very important aspect of distributed optimization under limited information, especially in the presence of global constraints. Some of the commonly practiced coordination methods include:

• Central Coordination: In this form of coordination, a central agent in the network takes in all the information required to solve the optimization problem for every agent in the network. The central agent performs all the optimization while every agent takes part in information exchange.

• Hierarchical Coordination: The coordination responsibility is given to a separate coordination unit. This feature is generally seen in hierarchical decision-making systems, where the coordination is performed by the top level and the components being coordinated constitute the base level. Generally, the top-level and base-level entities are mutually affected by each other's decisions.

• Distributed Coordination: The coordination is achieved through the individual entities in the network in a self-organized fashion. This is common in cooperative multi-agent systems, where each component of the network cooperates with its neighbors to coordinate its local actions.

The coordination protocol defines the information exchange rules throughout the network and is required in every distributed resource allocation method. According to the information exchanged, different actions are taken by different agents in the network.

1.3 Distributed Optimization

In most engineering systems, the common practice is to solve the optimization problem in a centralized manner. Although centralized optimization works very well in a number of systems, there are problems associated with it when applied to multi-agent systems such as large-scale networked systems (as explained earlier). The importance of solving optimization problems in networked systems lies in the fact that 70 percent of real-world mathematical programming problems can be formulated as network optimization problems [16].

In distributed optimization methods, the computational load is distributed among the participating agents, where each agent solves its own optimization problem in the presence of limited information. A coordination mechanism (Section 1.2) is used so that a globally optimal solution can be achieved in the presence of limited information. In this dissertation, a distributed optimization method is proposed that utilizes the dual variable information, which can be interpreted as the prices of the resources, as the coordination mechanism within the system to handle the global constraints of the network. Two different distributed optimization approaches to solving network flow problems are discussed. First, a distributed optimization method involving dual decomposition and update of the dual variables by the subgradient method, also known as a market-based method, is used to solve some generic minimum cost flow problems. A distributed Newton-based interior point method is later introduced and applied to solve a class of problems called the Network Utility Maximization (NUM) problem.

The market-based distributed optimization method [17] uses the process of buying and selling to solve the optimization problem. In market-based methods, every agent is categorized as a buyer, a seller, or both, and the equilibrium (the optimal point of operation) is reached through the process of competitive buying and selling. Generally, in market-based methods, a special agent called a dealer agent is introduced that receives limited information from buyers and sellers and helps in the transaction of resources. The computational load on the dealer agent depends on the problem structure. A detailed description of distributed optimization is presented in Chapter 4. The challenges in market-based optimization methods lie in the development of market mechanisms specific to the optimization problem. A market mechanism can be considered a set of rules followed by the participating agents in the network in developing the buyer's strategy, the seller's strategy, and the pricing of resources. The objective function and the local and global constraints of the problem are considered in the development of these market mechanisms.
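The price-update idea behind dual decomposition can be illustrated on a deliberately small example: a single shared resource with logarithmic utilities (weights, capacity, and step size below are hypothetical values chosen for illustration; the dissertation's problems and mechanisms are more general). Each agent locally maximizes its utility minus the price it pays, and the price moves in the subgradient direction of the dual function, rising when demand exceeds capacity:

```python
# Dual decomposition with a subgradient price update (illustrative sketch).
# Each agent i maximizes w_i*log(x_i) - p*x_i locally; its best response is
# x_i = w_i / p. The price p rises when total demand exceeds capacity C.
w = [1.0, 2.0, 3.0]   # agents' utility weights (hypothetical values)
C = 6.0               # shared resource capacity (hypothetical value)
p, step = 0.5, 0.05   # initial price and fixed subgradient step size

for _ in range(500):
    x = [wi / p for wi in w]                 # local, agent-level optimizations
    p = max(1e-9, p + step * (sum(x) - C))   # dual (price) subgradient update

# At equilibrium p* = sum(w)/C = 1, so the allocation approaches x_i = w_i.
```

The only quantity communicated to the agents is the scalar price, which is what makes the scheme attractive for networks with limited information exchange; the trade-off, as noted above, is the slow convergence of the fixed-step subgradient iteration.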

Although these methods can perform with very little information exchange, they suffer from a slow rate of convergence [18, 19, 20]. Apart from the slow convergence rate [20], these methods have a tendency to provide infeasible primal solutions for a finite number of iterations. The infeasible primal solutions are generated due to the suboptimality in the computation of the dual variables in the initial iterations. Some of these issues are addressed in the distributed primal-dual interior point method developed in this research. This method uses a Newton step for the update of both the primal and dual variables. The Newton-based distributed interior point method updates both the primal and dual variables of the system in a manner that generates feasible primal and dual solutions at each iteration.
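The benefit of second-order information can be seen even on a one-constraint toy NUM instance with logarithmic utilities (constructed here purely for illustration; the dissertation's distributed algorithm in Chapter 5 is far more general). A Newton step on the dual (price) variable reaches the equilibrium price in a handful of iterations, where a fixed-step subgradient update would typically need hundreds:

```python
# Newton update on the dual variable for a toy one-constraint NUM problem:
#   maximize sum_i w_i*log(x_i)  subject to  sum_i x_i <= C.
# The dual function g(p) has g'(p) = C - S/p and g''(p) = S/p^2, with S = sum(w).
# Illustrative, centralized sketch only (hypothetical weights and capacity).
w = [1.0, 2.0, 3.0]
C = 6.0
S = sum(w)

p = 0.25                      # initial price
for _ in range(8):
    grad = C - S / p          # first derivative of the dual function
    hess = S / p ** 2         # second derivative of the dual function
    p -= grad / hess          # Newton step: quadratic convergence to p* = S/C

x = [wi / p for wi in w]      # primal recovery: each agent's best response
# p converges to S/C = 1, giving allocations x_i = w_i.
```

Eight Newton steps drive the price error below machine precision here, which is the kind of convergence-rate improvement that motivates the Newton-based method over the subgradient update.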

1.4 Optimization with Multimodal Functions

In the field of applied mathematics, multimodal optimization deals with the task of determining all or most of the minima or maxima of a multimodal function. A multimodal function can be roughly defined as a function with multiple optimum points. Fig. 1.1 shows an example of a multimodal function that contains multiple minima and maxima. Generally, the task of any optimization algorithm is to find the best possible point of operation (the global minimum or maximum, according to the problem) in the search space, while the search space can have single or multiple minima or maxima.

Figure 1.1: A Multimodal Function

Evaluation of the global optimum solution while dealing with multimodal functions is a challenging task, and the normal descent algorithms have the tendency to converge to a local optimal solution. A number of optimization algorithms with probabilistic search rules, such as the genetic algorithm [21], the particle swarm algorithm [22], and [23], have been used to address the issue of local convergence. Many distributed approaches to these methods are available in the literature [24, 25, 26] that address the improvement of the convergence rate of these algorithms through parallel processing. One distinct feature of these methods is that though the individual agents search different overlapping regions of the search space, the decision variable of each agent is generally of the same dimension as that of the overall problem. Apart from that, these methods mostly depend on global inter-agent communication rather than the local inter-agent communication that is prevalent in networked systems. Although much research has been done using unimodal functions, solving distributed optimization with a multimodal cost function is still an open research problem.

In this dissertation, a noise-based stochastic primal-dual distributed optimization method is proposed that utilizes randomness to escape from local minima. This method is applied to a network utility maximization framework, where the global constraints are satisfied through the dual variables, while randomness is introduced in the primal search direction (which in turn affects the dual update) to search for the global optimum in a distributed manner. This search direction is conceptually similar to the q-derivative approach for global optimization [27], where the q-derivative at any point is directed along the slope of the secant line at that point. This method is compared with the genetic algorithm and the results are provided in Chapter 6.
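The role of the noise term can be sketched on a toy one-dimensional problem (the function, step size and noise schedule are all assumed values; the actual method and its dual update appear in Chapter 6): large early perturbations of the descent direction let the iterate jump between basins, and the decaying amplitude lets it settle later.

```python
import math
import random

def f(x):                       # assumed multimodal test function
    return 0.1 * x * x + math.sin(3.0 * x)

def grad(x):
    return 0.2 * x + 3.0 * math.cos(3.0 * x)

random.seed(1)
x = 1.5                         # start inside the basin of a local minimum
step, noise = 0.02, 7.0
best = x
for _ in range(20000):
    # perturbed descent direction: gradient plus decaying Gaussian noise
    x -= step * (grad(x) + noise * random.gauss(0.0, 1.0))
    noise *= 0.9999
    if f(x) < f(best):
        best = x
print(best, f(best))   # the best point seen during the noisy search
```

The tracked `best` can only improve on the starting point; how often the noisy search actually leaves the initial basin depends on the noise schedule relative to the barrier heights.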

1.5 Application Areas

Some of the application areas considered in this dissertation are: i) solving the optimal power flow problem, and ii) solving distributed resource allocation in a cloud computing system. Each system has its own set of challenges and constraints, and the distributed algorithm is required to be modified accordingly.

The Smart Grid is a modernized power generation and distribution network, considered to be a multi-faceted solution to the problems faced by the existing power grids. The Smart Grid incorporates the idea of utilizing more information about power consumption and production to allocate power, control assets and schedule tasks in order to reduce the cost involved in operation, maintenance and system planning, and to improve reliability, efficiency and security [28]. Essentially, a Smart Grid uses sensors, communication devices, control and intelligence to enhance the overall functionality of the power system. The increasing complexity of the management of a bulk power grid, aging assets, environmental concerns, renewable energy sources, energy independence, and the growth in demand and quality of service have contributed to the development of the Smart Grid concept.

Incorporation of renewable power generation technologies in the future power grid will make the grid massively distributed and decentralized in nature. The dependence of these renewable technologies on environmental conditions would make the grid extremely dynamic, and hence less predictable [29]. This creates the necessity for developing demand, supply and power flow control technologies for the protection, management and optimization of the new grid. In the smart grid, operating conditions of the grid assets and dynamic loads are balanced to maximize the efficiency of power delivery, and these actions are tracked by advanced grid monitoring, optimization and control applications [29]. According to Nguyen et al. [30], distributed control methods in power systems are expected to be helpful for power system health and security, as they reduce dependencies in the system and improve the system's ability to remain operational after disturbances or loss of equipment. However, increased interactions between producers and consumers, due to the envisioned transformation of the traditional consumers of power into producers/consumers via the use of emerging renewable technologies, would lead to dynamic load conditions and changes in the directionality of power flow based on market conditions [29]. Also, the incorporation of renewable sources induces stochasticity in grid operations, as these sources depend on uncertain environmental conditions. The traditional centralized way of carrying out optimization for power flow decisions would be ineffective in such a massively distributed, highly dynamic and uncertain power grid. In order to overcome some of these issues, this dissertation explores a market-based decentralized optimization technique.

In cloud computing, computing and storage are delivered to the customers as a service rather than a product. In this system, shared resources, software, and information are provided to computers and other devices as a metered service over a network. Common user applications such as email, photos, videos and data storage are already being supported at Internet scale by cloud platforms. One of the central issues in cloud computing is the allocation of resources such as computer processors, memory, and bandwidth to end users so that the overall user quality of experience is enhanced. The current methods of resource allocation in cloud computing include the use of rules of thumb and heuristics, which leads to overcompensation and suboptimal resource allocation. The cloud computing system will be used as one of the application areas to demonstrate the feasibility of the market-based algorithms developed in this research.

1.6 Document Outline

The outline of this document is provided below:

Chapter 1: In this chapter, an introduction and the motivation of this research are presented. The motivation of the research in the direction of distributed optimization is presented; in particular, the merits of this research as opposed to centralized methods of optimization are provided. The different challenges faced in the field of large-scale networked optimization problems are discussed.

Chapter 2: The background review and the related works are provided in this chapter. The previous and ongoing research is highlighted.

Chapter 3: In this chapter, the different types of problems addressed in this dissertation are presented. This chapter formulates some of the network flow problems that are considered in this dissertation, such as the minimum-cost network flow problem, the maximum flow problem and the networked optimization problem with multi-modal cost function. The different problem structures are discussed in detail.

Chapter 4: The distributed optimization approaches to solving the minimum cost network flow problems are discussed in this chapter. Market-based distributed optimization methods are explained, and the different mechanisms used in solving different types of problems are provided. The simulation results are also provided.

Chapter 5: The primal-dual distributed interior point method for solving the network utility maximization problem is explained in this chapter. Simulation results for solving the network utility maximization problem are provided in this chapter.

Chapter 6: The noise-based primal-dual distributed optimization method used to solve the maximum flow problem with multi-modal cost function is explained in this chapter, followed by the simulation results.

Chapter 7: In this chapter, two real-world problems, i) optimal power flow in power grids and ii) utility maximization in cloud computing systems, are formulated. The distributed methods used to solve these problems are also discussed in this chapter, followed by the simulation results.

Chapter 8: The conclusion of the dissertation is provided in this chapter and the future scope of this research area is discussed.

Chapter 2

Literature Review

This chapter presents a background survey of the existing literature on different approaches to distributed optimization and resource allocation techniques. The different challenges in large-scale networked systems have boosted research towards distributed optimization and resource allocation. To cope with the distributed information exchange and the computational complexity of a potentially large number of interacting agents in large-scale networked systems, a number of distributed optimization methods have been introduced in fields ranging from mathematical programming to distributed artificial intelligence to game theory and economics. The concept of distributed computing has led to the emergence of the field of multi-agent systems, which deals with multiple interacting computing elements called agents. Agents are defined as entities or software that possess perception to understand the environment and independence to make decisions on their own. The field of multi-agent systems has found applications in distributed resource allocation, where resources and tasks can be considered to be agents capable of perceiving the environment and making decisions on their own.

2.1 Decomposition Methods

The fundamental paradigm behind a distributed optimization method is the optimization of a global objective based on local information and local decisions by individual agents. As such, this is a topic of great interest among researchers and remains an open research problem. The problem is usually approached using decomposition methods [31], where the global objective is broken down into local objectives of individual agents so that the global objective can be met by minimization of the local objective functions. The process of decomposition is carried out using global constraints [32] imposed on local objectives. While local objectives are optimized by every agent utilizing local information, global constraints serve as a mechanism to facilitate global information exchange to ensure that local optimization by agents leads to the optimization of the global objective. Decomposition methods are applicable for solving a problem with a decomposable structure, where the complete problem is divided into a number of subproblems which together can be solved more efficiently than the original problem. There are two main approaches to decomposition: primal decomposition [33, 9] and dual decomposition [9].

Primal decomposition has an interpretation of direct resource allocation, since the required resources are directly allocated to the subproblems. The dual decomposition method has an interpretation of resource allocation by pricing: the prices of the resources are first set and, according to the prices, the subproblems decide on the amount of resource to use. Primal decomposition is used in problems that have coupling decision variables, i.e., the decision variables are shared among the subproblems, and dual decomposition is used in problems that have coupling constraints, i.e., the resources are shared among the subproblems [9]. Applications of decomposition methods are found in [34, 35, 36, 37, 38, 39]. For example, in [34], decomposition methods were used for multi-commodity network design, and an extension was presented in [35]. Other applications include the locomotive and car assignment problem [38, 36, 37], the large-scale water resource management problem, and the two-stage stochastic linear problem [39].
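The pricing interpretation of dual decomposition can be sketched in a few lines (a toy illustration with assumed weights and capacity, not a problem instance from this dissertation): each subproblem responds to the current price with its locally optimal demand, and the price is adjusted by the excess demand, which is a subgradient of the dual function.

```python
# Toy shared-resource problem: agent i chooses x_i to maximize
# w_i*log(x_i) - price*x_i, whose closed-form best response is x_i = w_i/price.
# The coordinator raises or lowers the price by the excess demand, which is a
# subgradient of the dual function (weights and capacity are assumed toy data).
w = [1.0, 2.0, 3.0]   # agents' utility weights
C = 6.0               # shared capacity: sum of x_i should equal C at optimum
price = 2.0           # initial price of the resource

for _ in range(500):
    x = [wi / price for wi in w]               # each agent's local best response
    excess = sum(x) - C                        # demand minus supply (subgradient)
    price = max(price + 0.05 * excess, 1e-6)   # market-style price update

print(price, x)   # converges to price = 1.0 and x = [1.0, 2.0, 3.0]
```

A diminishing step size gives convergence guarantees under weaker assumptions; a constant step is used here only to keep the toy example short.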

Other decomposition methods include indirect decomposition, where the primal or dual decomposition method is applied indirectly using auxiliary variables, and hierarchical decomposition, where primal or dual decomposition is applied recursively to obtain smaller problems [9]. The Lagrangian decomposition method [40] is a form of dual decomposition; some of its applications are found in [41, 42, 43]. Another decomposition method, called cross decomposition [44], simultaneously exploits the primal and dual structure of the centralized problem. The difference between this method and the other decomposition methods is that it captures the profound relationship between primal and dual decomposition rather than only the primal or dual structure.

2.2 Distributed Optimization

The distributed constrained optimization problem (DCOP) is a widely used model for a vast range of distributed reasoning and multi-agent coordination problems, where all the interacting agents in the network communicate with each other to cooperatively solve a constrained optimization problem [45]. DCOP can be considered the distributed version of constrained optimization.

A distributed constrained optimization problem essentially includes a set of variables, and each variable is assigned to an agent that controls its value. The goal of the DCOP is to coordinate the values of the variables among agents so that the global objective function is optimized. In this type of problem, the global objective function is modeled as a set of constraints, where every agent has knowledge of the constraints in which its variables are involved [46]. The idea here is to model the objective function as a set of constraints and minimize the number of constraints that are not satisfied [46]. Some of its applications include target tracking in distributed sensor networks [46], distributed scheduling in large organizations [47] and nurses' timetabling in large hospitals [48]. ADOPT [46] is considered to be one of the leading polynomial-space algorithms for DCOP. However, the network load of ADOPT increases at an exponential rate as the number of agents in the network increases, making the method non-scalable. Apart from that, the DCOP framework considers only local constraints rather than global or coupling constraints, and this framework is not suitable for many real-world problems where the decision variables have associated utility or cost functions.
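A toy instance in this spirit, solved here by centralized brute force purely for clarity (real DCOP algorithms such as ADOPT coordinate the search among the agents; the variables, domains and constraints below are assumed toy data):

```python
from itertools import product

# Each variable belongs to one agent; the objective is encoded as constraints
# and the goal is to minimize how many of them are unsatisfied.
domains = {"a": [0, 1, 2], "b": [0, 1, 2], "c": [0, 1, 2]}
constraints = [
    lambda v: v["a"] != v["b"],       # neighboring agents must differ
    lambda v: v["b"] != v["c"],
    lambda v: v["a"] + v["c"] == 2,   # a coupling preference
]

best, best_violations = None, len(constraints) + 1
for combo in product(*domains.values()):
    v = dict(zip(domains, combo))
    violations = sum(not cst(v) for cst in constraints)
    if violations < best_violations:
        best, best_violations = v, violations

print(best, best_violations)   # {'a': 0, 'b': 1, 'c': 2} satisfies all constraints
```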

In the multi-agent system framework for distributed resource allocation, each agent generally has local capabilities, limited interactions, and a decision-making mechanism without any central controller [49]. A general framework that facilitates parallel and distributed computation in multi-agent systems can be found in [50, 51]. Distributed systems frameworks have been used to solve a number of complex problems. There are two standard approaches used in the development of distributed optimization methods for the optimization and control of multi-agent systems. One of the methods uses dual decomposition and the subgradient update method [13], while the other involves consensus-based schemes [52]. The first approach has an economic interpretation where the dual variables represent the prices of the resources, and its subgradient update is similar to the price update in real markets, where prices are updated according to demand and supply. The price-based or market-based methods [17] can be implemented in a distributed manner, where the primal and dual variables are evaluated by the resource users and resource providers respectively. In these methods, the dual variables, or the prices of the resources, are used to navigate the system into obtaining feasible and optimal primal solutions. There exists no strict definition of what constitutes market-based, auction-like or economically inspired methods [53]. Rather, there are certain features found in markets that have inspired their application in distributed resource allocation methods. Decentralization, interaction among agents, and a notion of numerous, distributed resources that need to be allocated are some of such features. In market-based systems, as in distributed systems, the control emerges from the individual goals of the agents in the system rather than from any centralized command.

A market can be considered to be a system with locally interacting agents (components) that achieve some overall global behavior. In a market, simple interactions among individual agents, such as buying and selling, lead to a desirable global effect such as stable prices and fair allocation of resources. The ability of market-based methods to facilitate fair allocation of resources with limited information makes them an attractive solution for many complex problems. The similarities between distributed computer systems and economic systems suggest that models previously developed for economic systems can be directly implemented on distributed computer systems [54]. Auction algorithms are easy to implement, highly intuitive, and can be explained in terms of economic competition concepts [55]. From both computational and software perspectives, market approaches help in natural decomposition [56, 57]. Market approaches can easily handle the addition or deletion of agents and thus give more flexibility [57]. Since these agents represent resources or tasks, market-based methods are inherently robust to the dynamic addition or deletion of resources or tasks.

In most market-based resource allocation or optimization techniques, there are three kinds of agents: buyer agents, seller agents, and the auctioneer or dealer. In a typical implementation of a market-based method, a seller agent would correspond to a resource provider and a buyer agent would correspond to a resource user or task. Generally, in a market, only the price information of the resource flows between the participating agents. The price of the resources can be set by the dealer or auctioneer in the market or by the resources (seller agents) themselves. In most implementations of the market-based approach, the prices of the resources are considered to be global information, available to all the agents, who adjust their allocation strategy according to the price dynamics. Market control often contains both centralized and decentralized aspects. Since each agent in the market acts on its own, the whole system is distributed, and the dealer acts as an entity to facilitate global information exchange. The market-based methods are thus said to exploit the advantages of both the centralized and decentralized methods.

There are a number of different market mechanisms which find application in solving different types of optimization problems. Among these mechanisms, fixed price markets are the most common [58]. In a fixed price market, the prices of the resources do not change with time, i.e., the resource price does not depend on the consumers' demand for the resource. A buyer agent's allocation strategy thus will not change, since the price of resources remains fixed, and so fixed price markets have limited application to optimization problems. In dynamic price markets, the prices of the commodities or resources change with time or demand, and the buyers change their allocations according to the prices of the resources.

In market-based methods for resource allocation, commodities can be traded in both continuous and discrete amounts [58]. Discrete commodity trade generally occurs when the resources are indivisible in nature, as in the case of the auction of flight tickets. In the case of the allocation of electric power or communication bandwidth, on the other hand, continuous trade takes place. Hence, market-based methods, in general, are applicable to both discrete and continuous optimization problems. There are overall two categories of market-based models that are generally used in resource management: the commodity market model and the auction model. In the commodity market model, the resource providers specify their prices and charge the consumers (resource users or tasks) according to their consumption. The consumers and the providers in auction models act independently and settle on a selling price privately. The auction methods are generally used for resources that have no standard values and whose prices change dynamically according to the supply and demand at a specific instant. Auction methods are decentralized, require little information and are easy to implement [59]. There are four basic types of auctions based on the interaction between the consumer and the provider: the ascending auction, the descending auction, the first price and second price sealed auction, and the double auction [60].
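As a minimal sketch of one of these mechanisms, the ascending auction can be written in a few lines (the private valuations are assumed toy data): the price rises by a fixed increment until at most one bidder remains willing to pay it.

```python
# Ascending (English) auction sketch with assumed private valuations.
valuations = {"a": 12.0, "b": 9.5, "c": 7.0}
price, increment = 5.0, 0.5
active = set(valuations)
while len(active) > 1:
    price += increment
    # bidders drop out once the price exceeds their private valuation
    active = {bidder for bidder in active if valuations[bidder] >= price}
winner = next(iter(active)) if active else None
print(winner, price)   # "a" wins near the second-highest valuation: price 10.0
```

The winner pays approximately the second-highest valuation, which is the familiar outcome of this auction format.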

In the distributed resource allocation or optimization problem, the overall centralized problem is decomposed into as many subproblems as there are buyer agents, or buyer and seller agents. The buyer agents are generally associated with tasks or demands that need to be met by the resources represented by the seller agents. Both the buyer and the seller agents in the market act selfishly by maximizing their own profit margins. Every buyer agent thus intends to buy resources at lower costs and every seller agent intends to sell its resources at higher prices. The process of competitive buying and selling continues until equilibrium is achieved. In the equilibrium scenario, all the agents reach their Pareto-optimal solution, which corresponds to a solution where no alternative solutions exist that would make some agents' solutions better without making some other agent's solution worse [61]. In market-based resource allocation methods, since every buyer and seller agent is egoistical in nature and tries to maximize its own profit margin, no further trading can take place when a Pareto-optimal solution is reached. It is clear that even though Pareto-optimal solutions do not necessarily correspond to globally optimal solutions, they do represent efficient solutions. Although these methods can perform with very little information exchange, as mentioned in the previous chapter, they suffer from a slow rate of convergence [18, 19, 20]. Apart from that, for a finite number of iterations, these methods have a tendency to provide infeasible primal solutions, which is not a good feature for time-critical problems. The infeasible primal solutions are generated due to the suboptimality in the computation of the dual variables.

The well-studied consensus problems belong to a class of canonical problems on networked multi-agent systems [52, 62, 63, 64], where the idea is to reach an 'agreement' between the agents on the estimate of the optimal solution of the problem. Consensus problems, which first originated in management science [65], have been used in computer science for a long time and form the basis of distributed computing [66]. In networks, a consensus algorithm is the protocol for solution update and information exchange between the neighbors of an agent [64]. The theoretical framework for formulating and solving the consensus problem for networked dynamic systems is found in [67], and the problem of reaching agreement without the computation of any objective function, also known as the alignment problem, was shown in [63]. The extension of this work is found in [68, 69], which uses directed information flow in the network. A common approach used to reach a consensus on the solution in a network optimization problem is through inter-agent communication, where each agent updates its estimate of the optimal solution by a weighted sum of its neighbors' estimates of the solution over a fixed or time-varying network. It has been proved that by using a well-defined weighting matrix, called the consensus matrix, an agreement between the agents can be reached, i.e., all the agents converge to a unique solution. Under some mild assumptions on the connectivity of the graph and the update rules, it has been shown [70] that the suboptimality of the solution converges at a linear rate. Apart from some assumptions on connectivity, the consensus-based scheme relies upon sharing the solution estimate of each agent, which might not be feasible in many real-world problems, such as those which involve agents from different functional entities, for example different utility companies. Furthermore, the consensus-based schemes are generally applicable to problems with local constraints rather than coupled constraints.
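The neighbor-averaging update can be sketched on a hypothetical 4-node ring with an assumed doubly stochastic consensus matrix: each agent repeatedly replaces its estimate with a weighted sum of its neighbors' estimates, and all estimates converge to the network average.

```python
import numpy as np

# 4-node ring; each agent averages with its two neighbors using a doubly
# stochastic weight (consensus) matrix -- an assumed toy choice of weights.
W = np.array([
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
])
x = np.array([1.0, 5.0, 3.0, 7.0])   # initial local estimates
for _ in range(100):
    x = W @ x                        # each entry becomes a weighted neighbor sum

print(x)   # every agent's estimate approaches the network average, 4.0
```

Because the weight matrix is doubly stochastic, the sum of the estimates is preserved at every step, which is why the common limit is exactly the average of the initial values.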

Other examples of multi-agent unconstrained optimization problems are found in [19, 71], where the problem is solved using a novel combination of a consensus algorithm and the subgradient update method. The authors of these works provide a method for solving unconstrained, non-differentiable, and separable convex optimization problems, where each agent updates its solution using its subgradient information and the solutions obtained by its neighboring nodes. A similar problem is solved in [72], which is an extension of [19], considering only local constraints, which are satisfied using projection methods. A distributed primal-dual Newton method for networked optimization is proposed in [20], where the cost function is twice differentiable and the solution of each agent is updated in the Newton direction. A deterministic incremental subgradient method is shown in [73], while a randomized incremental subgradient method is proposed in [74, 75]. Most of these papers consider unconstrained problems or local constraints which can be satisfied by each individual agent. The problems become much more complex when global constraints are considered, which are present in almost all real-world optimization and resource allocation problems. The global constraints can be considered as resources that are shared among the participating agents in resource allocation problems. In most real-world problems, shared resources are scarce, and hence optimization in a distributed manner in the presence of global constraints becomes relevant. In [76], the authors have proposed a distributed Newton method for network utility maximization problems. They propose an approach for the distributed computation of the dual variables using concepts of matrix splitting and information exchange among the agents. The authors provide an in-depth mathematical analysis of the convergence of the algorithm along with the relationship between the network topology and the convergence rate. The method is shown to converge at a superlinear rate, much faster than the distributed subgradient methods.
However, the computation of the dual variables via the matrix splitting method adds an extra inner loop in every iteration of the algorithm and requires extra information exchange (the first and second derivatives of the cost function) between the agents in the system. Furthermore, the finite number of iterations practically used for matrix splitting often results in inaccuracies in the computation of the dual variables. Another issue in this approach is the estimation of the Newton decrement based on consensus methods, which not only results in associated errors but also requires more computation and more communication between the agents.
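The effect of truncating the matrix-splitting inner loop can be illustrated on a small linear system (a generic Jacobi splitting on toy data, not the actual dual system of [76]): a few inner iterations leave a visible residual, while more iterations shrink it.

```python
import numpy as np

# Jacobi splitting A = D + R turns A y = b into the fixed-point iteration
# y <- D^{-1} (b - R y). The system below is a toy diagonally dominant
# example, so the iteration converges.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
D = np.diag(np.diag(A))
R = A - D

y = np.zeros(3)
for _ in range(5):                       # few inner iterations
    y = np.linalg.solve(D, b - R @ y)
residual_few = np.linalg.norm(A @ y - b)

for _ in range(50):                      # many more inner iterations
    y = np.linalg.solve(D, b - R @ y)
residual_many = np.linalg.norm(A @ y - b)

print(residual_few, residual_many)   # the residual shrinks with more iterations
```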

There have been several applications of multi-agent systems, including the control of manufacturing, traffic or multi-robot systems, and logistics [77, 78, 79]. Most of these applications can be viewed as resource allocation problems. For example, in a manufacturing process, the capacity of the machines can be considered to be the resource that needs to be allocated among the workpieces and jobs. Similarly, the transportation routes and vehicles/trucks are the resources in logistics systems that are required to be allocated to meet the transportation demand of the consumers. One of the primary reasons behind the success of most of these applications lies in the use of distributed resource allocation facilitated by the framework of multi-agent systems [49].

Game theory [80] and economic theory have provided two other approaches to understanding and solving problems involving multiple interacting agents. Game theory, in particular, finds many applications in modeling interactive computations and multi-agent systems [81]. Game theory focuses on optimal strategies and the achievement of equilibrium in the presence of multiple conflicting agents [80]. In game theory, rule-based algorithms [82] are used to understand the emerging coordination pattern arising from the local decisions taken in a cooperative game. In these methods, the agents interact among themselves to learn their optimal behavior through repeated cycles of trial and error. The authors in [83] show how the collective coordination of swarms for search missions, capable of adapting to a dynamically changing environment, is achieved using game-theoretic approaches.

2.3 Multi-modal Cost Function

Many real-world problems require solving an optimization problem with a multimodal cost function. The general approach to solving these multimodal optimization problems is to perform unimodal searches from multiple points in the search space [84]. Another common practice is to incorporate probabilistic transition rules in the search method, as shown in [21, 85]. Many distributed approaches to these methods are available in the literature ([24, 25, 26]) that address the improvement of the convergence rate of these algorithms by using parallel computation. One distinct feature of these methods is that, though the individual agents search different overlapping regions of the search space, the decision variables of each agent are generally of the same dimension as those of the overall problem. Apart from that, these methods mostly depend on global inter-agent communication rather than the local inter-agent communication that is prevalent in networked systems.
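The multi-start strategy mentioned above can be sketched as follows (a toy one-dimensional function and hand-picked starting points, purely illustrative): a plain local descent is launched from several initial points and the best result is kept.

```python
import math

def f(x):   # assumed multimodal test function
    return 0.1 * x * x + math.sin(3.0 * x)

def descend(x, step=1e-3, iters=20000):
    # plain gradient descent using the analytic derivative of f
    for _ in range(iters):
        x -= step * (0.2 * x + 3.0 * math.cos(3.0 * x))
    return x

# run the same unimodal (local) search from several starting points
starts = [-3.0, -1.0, 0.0, 1.0, 3.0]
best = min((descend(s) for s in starts), key=f)
print(best, f(best))   # global minimum near x = -0.51, f = -0.973
```

Only the runs started inside the global basin reach the global minimum; the `min` over starting points selects that result.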

Some real-world network resource allocation problems with multimodal cost functions are the optimal power flow (OPF) problem in the power grid [86] and quality-of-experience maximization in multimedia applications [87]. The cost function in the OPF problem of the power grid turns multimodal when valve-point loading effects of thermal generators are considered, when flexible alternating current transmission system (FACTS) devices are introduced [86], or even when security constraints are used [88]. In the rate allocation of multimedia applications, the utility increases in a discontinuous fashion which resembles a staircase function, which is multimodal in nature. This staircase function can be approximated as a sum of sigmoid functions [87]. The work in [86] utilizes a centralized particle swarm optimization method to solve the OPF problem, while a suboptimal heuristic method is used in [87] to solve the rate allocation problem.
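The sigmoid-sum approximation of a staircase utility can be sketched as follows (the step locations and sharpness are assumed toy values, not taken from [87]): one sigmoid per quality step, each contributing roughly one unit of utility once the rate passes its step.

```python
import math

# One sigmoid per quality step; steps at rates 1, 2, 3 with an assumed
# sharpness. The sum rises by ~1 near each step, approximating a staircase.
def sigmoid_utility(rate, steps=(1.0, 2.0, 3.0), sharpness=20.0):
    return sum(1.0 / (1.0 + math.exp(-sharpness * (rate - s))) for s in steps)

print(sigmoid_utility(0.5), sigmoid_utility(1.5), sigmoid_utility(3.5))
# roughly 0, 1 and 3: below the first step, past one step, past all three
```

Unlike the exact staircase, this approximation is smooth, which is what makes gradient-based methods applicable, at the cost of making the objective multimodal.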

As mentioned before, most of the existing optimization methods used for problems involving multimodal cost functions use probabilistic transition rules. The probabilistic search methods [21, 85] help in exploring different parts of the search space rather than converging to local optima. One of the recent methods for solving unconstrained problems with multimodal cost functions involves the utilization of the q-derivative as the search direction [27]. In this method, randomness is incorporated in the computation of the q-derivative. Some well-known optimization methods used for problems with multimodal cost functions are the Genetic Algorithm [21], Simulated Annealing [23] and the Particle Swarm Optimization method [89].
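A simplified reading of the q-derivative idea can be sketched as follows (a toy one-dimensional function; the actual q-gradient method of [27] differs in detail): the q-derivative is the slope of a secant of the function between x and q·x, drawing q at random widens the search early on, and it recovers the ordinary derivative as q approaches 1.

```python
import math
import random

def f(x):   # assumed multimodal test function
    return 0.1 * x * x + math.sin(3.0 * x)

def q_derivative(x, q):
    # slope of the secant of f between x and q*x; tends to f'(x) as q -> 1
    if abs((q - 1.0) * x) < 1e-12:
        return 0.2 * x + 3.0 * math.cos(3.0 * x)
    return (f(q * x) - f(x)) / ((q - 1.0) * x)

random.seed(0)
x, step, spread = 2.0, 5e-3, 1.0
for _ in range(20000):
    q = 1.0 + spread * random.gauss(0.0, 1.0)   # random secant endpoint
    x -= step * q_derivative(x, q)
    spread *= 0.9995                            # shrink q toward 1 over time
print(x, f(x))
```

As the spread of q decays, the update degenerates into ordinary gradient descent, so the iterate settles into whichever basin the wide-secant phase has steered it to.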

2.4 Application Areas

Over the past few decades, the OPF problem has been extensively studied [90, 91, 92]. The OPF problem is a non-linear optimization problem with a set of constraints and has been solved using both conventional and non-conventional methods. The conventional methods used for solving the OPF problem include the Newton method, gradient methods, and interior-point methods [90, 91, 92, 93]. In recent times, non-conventional or meta-heuristic methods such as Evolutionary Programming [94, 95], the Genetic Algorithm [96, 21], Particle Swarm Optimization [97], [98] and Simulated Annealing [99] have been used for solving the OPF problem. Most of the techniques mentioned above use a centralized approach for solving the optimization problem. It may be worth noting that solving a complex optimization problem in a centralized manner is computationally cumbersome due to the very large solution space resulting from a large number of decision variables. The issue is highly pronounced in the problem under consideration in this dissertation due to the massively distributed nature of smart grids. One of the approaches to address the above issue is the use of distributed optimization techniques that decompose the global problem into subproblems that involve a smaller number of (local) decision variables and local information.

In recent years, interest in decentralized optimal power flow has grown significantly with the deregulation and liberalization of the electricity market [100, 101, 102]. With increasing performance requirements and the availability of parallel processing hardware, the incorporation of parallel processing in power systems problems is considered to be an important task [103]. The authors in [102] demonstrated three mathematical decomposition coordination methods useful for the implementation of the distributed OPF problem. An approach to parallelize the optimal power flow problem for coarse-grained distributed implementation in large-scale interconnected networks is found in [104]. Research on the multi-area OPF problem is reported in [101, 105, 106]. In these problems, the global problem is decomposed into a number of overlapping areas, and each area solves its own optimization problem along with the coupling constraints in the overlap. The dual decomposition and subgradient based approach, or the market-based method, has been used in solving resource allocation and optimal power flow problems in the smart grid [107, 108, 109, 110].

Cloud computing systems consist of a number of virtual desktops requesting resources, such as memory, processing, and network bandwidth, from the resource providers, which are the virtual desktop cloud systems. The virtual desktops are generally allocated resources from the distributed data centers using a greedy approach while maintaining the network constraints. Over time, this opportunistic approach leads to a 'resource fragmentation' problem which affects the quality of experience of the virtual desktops. This requires the use of global optimization methods for 'defragmentation'. In the defragmentation process, the resource and data center assignment to each virtual desktop is re-evaluated. Resource fragmentation is a well-studied problem in storage and memory systems. Most of the current literature uses heuristic or meta-heuristic methods to solve the fragmentation problem [111, 112, 113]. The authors in [111] use a genetic algorithm based optimization method for resource allocation, while in [112] a min-max fairness scheme is used for fair allocation, and in [113] a mathematical programming method is used for the purpose of resource allocation.

Chapter 3

Problem Description

In this chapter, the different kinds of network flow optimization problems addressed in this dissertation are discussed. First, the minimum cost network flow problem is discussed, and within its framework the general resource allocation problem with both indivisible and divisible resources is presented. The network maximum flow problem is presented next, where the network utility maximization (NUM) problem is formulated. This is followed by a discussion of the network resource allocation problem with a multi-modal cost function.

3.1 Minimum Cost Network Flow Problem: Mincost Problem

The minimum cost network flow problem belongs to the class of network flow problems in which a cost is associated with every flow in the network. Considering a graph G = (V, E), with the set V consisting of the nodes and the set E consisting of the edges, a minimum cost network flow problem can be described as:

\min \sum_{(i,j) \in E} C_{ij} x_{ij}

such that the network constraints are satisfied. Here x_{ij} is the amount of flow of resources through the edge (i, j) \in E, and C_{ij} is the cost associated with a unit flow of resource through the edge (i, j) \in E; C_{ij} is 0 if i is not connected to j. A schematic of a network is shown in Fig. 3.1, where p1, p2, p3, p4 are producers of resources while c1, c2, c3, c4 are consumers of the resources. The arrowheads in the figure show the connectivity of the network, and the goal of the minimum cost network flow problem is to minimize the total cost of flow in the network while maintaining the network constraints.

Figure 3.1: Schematic of a Network (producers p1 through p4 connected to consumers c1 through c4 by directed edges)

Resources can be classified as continuous or discrete, divisible or indivisible, sharable or non-sharable, static or perishable, and single unit or multi unit.

• continuous or discrete resources: A resource can be continuous, like power or bandwidth, or discrete, like plane tickets or fruits. This physical property influences how the resources are traded.

• divisible or indivisible resources: Resources can be divisible or indivisible. Being continuous or discrete is a property of the resource itself, while whether a resource is divisible or indivisible is decided at the level of the allocation mechanism.

• sharable or non-sharable resources: A sharable resource can be allocated to a number of different agents at the same time (as in Earth Observation Satellite applications), while this is not the case with non-sharable resources.

• static or perishable resources: A consumable resource, such as fuel or food, that is used up by the agent holding it is considered to be perishable. On the other hand, resources that do not change their properties during the negotiation period are considered to be static. The general practice is to assume all resources to be static throughout a particular negotiation process.

• single unit or multi unit resources: In a multi-unit setting, many resources of the same type are available and are identified by the same name. In this setting, resources of the same type are non-distinguishable, as opposed to the single-unit setting.

In this section, resource allocation problems with both divisible and indivisible resources are considered. In both cases, the resources can be considered to be either continuous or discrete, and to be static, multi-unit, and non-sharable.

3.1.1 Mincost Problem 1: Allocation of Indivisible Resources

The allocation of indivisible resources is essentially an assignment problem, a combinatorial optimization problem of the Non-deterministic Polynomial time Hard (NP-Hard) category [114], widely studied in the optimization and operations research literature. Generally, an assignment problem involves two sets, usually referred to as "tasks" and "resources", so that the tasks or jobs can be carried out, or demands fulfilled, via use of the resources [115] while optimizing a certain objective. In the original version of the assignment problem, each task is assigned to a different resource so that a one-to-one assignment is achieved. The assignment problem can also be formulated as a maximum weight matching problem in a weighted bipartite graph in the case of two sets, or a multipartite graph in the case of multiple sets [115]. The generalized assignment problem has a number of applications, such as resource scheduling problems, scheduling of project networks, storage space allocation, and the design of communication networks with node capacity constraints [116]. Examples include the vehicle routing problem [117], the assignment of jobs to computers in a computer network [118], the assignment of ships to yards for regular overhauls [119], and the multi-robot task allocation problem [120]. This problem can be mathematically formulated as follows. Consider a problem of optimal assignment of n resources to n tasks. Each of the n tasks is considered to have a demand given by d_i (i = 1, 2, ..., n), while each of the n resources bears the capability of fulfilling the demand of each agent. The objective of the resource allocation problem is to minimize:

\min_{x_{ij}} \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} d_i x_{ij}   (3.1a)

such that \sum_{j=1}^{n} x_{ij} = 1   (3.1b)

x_{ij} \in \{0, 1\}   (3.1c)

In Eq. (3.1a), C_{ij} is the per unit cost of utilization of the j-th resource for the i-th task. Eq. (3.1b) makes sure that every resource gets assigned to exactly one task (a one-to-one assignment problem). Eq. (3.1c) shows that the problem is an indivisible resource allocation problem, i.e., the decision variable x_{ij} is a binary variable that can only take the values 0 or 1. Some transportation problems, such as the taxi dispatch problem, where each consumer location needs to be visited by exactly one vehicle, fall into this class of optimization problems.

3.1.2 Mincost Problem 2: Allocation of Divisible Resources

This problem deals with resources that can be partially allocated to one or multiple tasks or jobs. In these kinds of problems, rather than a one-to-one assignment, the same resource can be allocated to multiple tasks without exceeding the capacity of the resource producer. To mathematically formulate the problem, consider n tasks (buyer agents) and m resources (seller agents). Each buyer agent is considered to have a demand d_i (i = 1, 2, ..., n), while each seller agent is considered to have a capacity c_j (j = 1, 2, ..., m). The problem is to minimize:

\min_{x_{ij}} \sum_{i=1}^{n} \sum_{j=1}^{m} C_{ij} x_{ij}   (3.2a)

such that \sum_{j=1}^{m} x_{ij} = d_i  (i = 1, 2, ..., n)   (3.2b)

\sum_{i=1}^{n} x_{ij} \le c_j  (j = 1, 2, ..., m)   (3.2c)

x_{ij} \ge 0   (3.2d)

In Eq. (3.2a), C_{ij}, as in Section 3.1.1, is the unit cost of utilization of resources from seller j by buyer i. Here the decision variable x_{ij} is the amount of resource allocated from seller j to buyer i, which is always a real non-negative value, as shown in Eq. (3.2d). Eq. (3.2b) makes sure that the demand d_i of every buyer agent i is met by the seller agents it is getting resources from. The constraint in Eq. (3.2c) makes sure that the total capacities of the sellers are not exceeded, i.e., the net resource sold by a seller j cannot exceed its capacity c_j. This problem is applicable to power allocation in the electric grid, where the power generated by the generation units (seller agents) is required to be allocated to the end users (buyer agents) while considering the constraints of the system. Another example is cloud computing, where resources such as memory, processing, and bandwidth need to be allocated to a large number of users to maximize the users' quality of experience.

3.2 Maximum Flow Problem: Maxflow Problem

In this dissertation, we use the Network Utility Maximization (NUM) problem to formulate the Maxflow problem. The NUM problem represents a class of network resource allocation and utility maximization problems that can be formulated as a constrained maximization problem [9], and is often used as a benchmark problem in the area of networked resource optimization. In maximum flow problems, the goal is to maximize the flow of resources from a source node (resource provider) to a sink node (resource user) through a predetermined path while satisfying the system requirements and constraints.

The network is considered to be composed of m directed links and shared by n different sources, so that each link j can be considered to be shared by a set S(j) of source nodes. Each source i is considered to transmit non-negative resources x_i, i = 1, 2, ..., n, via a predetermined route consisting of a set of links L(i), while each link possesses a finite positive capacity c_j, j = 1, 2, ..., m, of resource movement. To use standard nomenclature, x_i of each source node i is considered to be the rate of flow (of resources) through its predetermined path. The capacity constraint at each link in the network can be described by:

Ax \le c,  x \in R^n,  c \in R^m,  A \in R^{m \times n}

The rate vector x = [x_1, x_2, ..., x_n] \in R^n consists of the primal variables of the problem, so that the i-th element of x denotes the flow rate for source node i. Similarly, the j-th element of the capacity vector c \in R^m is the capacity of the rate of flow through link j. The matrix A is the routing matrix of dimension m \times n, so that

A_{ij} = 1 if link j is on the route of source i, and A_{ij} = 0 otherwise.   (3.3)
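The routing matrix of Eq. (3.3) can be built directly from the per-source link sets L(i). In the following sketch the topology, rates, and capacities are all hypothetical:

```python
# L(i): the set of links on the predetermined route of source i (hypothetical).
routes = {0: [0, 1], 1: [1, 2], 2: [0, 2], 3: [2]}
m, n = 3, 4  # m links, n sources

# A is m x n with A[j][i] = 1 iff link j lies on the route of source i (Eq. 3.3).
A = [[1 if j in routes[i] else 0 for i in range(n)] for j in range(m)]

# Capacity check Ax <= c for a candidate rate vector x.
x = [1.0, 0.5, 0.5, 1.0]
c = [2.0, 2.0, 2.5]
loads = [sum(A[j][i] * x[i] for i in range(n)) for j in range(m)]
print("link loads:", loads)                                  # [1.5, 1.5, 2.0]
print("feasible:", all(l <= cj for l, cj in zip(loads, c)))  # True
```

Each row of A aggregates the rates of exactly the sources sharing that link, which is how the coupled constraint Ax <= c arises.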

Each source i is considered to be associated with a utility function U_i(x_i), and the net utility of the network is considered to be the sum of the utilities of the source nodes in the network (additive utility): \sum_{i=1}^{n} U_i(x_i). The network utility maximization problem can be given by:

\max \sum_{i=1}^{n} U_i(x_i)   or   \min -\sum_{i=1}^{n} U_i(x_i)   (3.4a)

such that Ax \le c   (3.4b)

x \ge 0   (3.4c)

The following assumptions are considered:

1. The utility function is considered to be twice differentiable, strictly concave, and monotonically nondecreasing on (0, \infty).

2. There exists a solution (x*, \lambda*) to the problem in Eq. (3.4) that satisfies the optimality conditions provided in Eq. (5.1) in Chapter 5.

3. The cost function f(x) = -\sum_{i=1}^{n} U_i(x_i), its first derivative \nabla f(x), and its second derivative \nabla^2 f(x) are all both upper and lower bounded.

4. \nabla^2 f(x) is positive definite, and the gradient \nabla f(x) is Lipschitz continuous, i.e., ||\nabla f(x_{k+1}) - \nabla f(x_k)|| \le \gamma ||x_{k+1} - x_k||.

5. The routing matrix A has full row rank.

Throughout the dissertation, the norm represented by ||.|| is taken to be the 2-norm.

3.3 Optimization with Multi-Modal Cost Function

In this section, we discuss the optimization problem with a multi-modal cost function in the context of the NUM problem. Multi-modal functions are functions which have more than one minimum or maximum, as shown in Fig. 3.2, which depicts a one-dimensional multi-modal function with three minima. The constrained NUM problem with a multi-modal cost function is mathematically similar to Eq. (3.4), but the cost function is multi-modal rather than unimodal. The multi-modal cost function considered in this dissertation is composed of a sum of multiple exponential functions, as shown in Eq. (3.5):

f(x) = -\sum_{j=1}^{n_p} a_j \exp\left( \frac{-(x - \mu_j)^2}{2\sigma_j^2} \right)   (3.5)

In the above equation, n_p is the total number of exponential functions, \mu_j and \sigma_j are the mean and standard deviation of the j-th exponential function, and a_j is a scaling factor. Thus the constrained multi-modal NUM problem can be stated as:

\max \sum_{i=1}^{n} -f_i(x_i) = \min \sum_{i=1}^{n} \sum_{j=1}^{n_p} -a_j \exp\left( \frac{-(x_i - \mu_j)^2}{2\sigma_j^2} \right)   (3.6a)

Ax \le c   (3.6b)

x \ge 0   (3.6c)

x \in R^n;  c \in R^m;  A \in R^{m \times n}

Similar to the problem in Eq. (3.4), the overall cost function is the sum of the cost functions of the individual nodes in the system. In Eq. (3.6a), n is the total number of source nodes; x and c in Eqs. (3.6b) and (3.6c) are the decision variable vector and the link capacity vector, respectively, while the matrix A is the routing matrix.
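As an illustration of Eq. (3.5), the following sketch evaluates a one-dimensional instance of the multi-modal cost with three Gaussian wells; all parameter values are hypothetical. A coarse grid scan locates the global minimum that a purely local gradient method, started in the wrong basin, would miss:

```python
import math

# Hypothetical parameters for Eq. (3.5): three Gaussian wells, giving a
# one-dimensional cost with three minima as sketched in Fig. 3.2.
a = [1.0, 3.0, 2.0]
mu = [1.0, 2.5, 4.0]
sigma = [0.3, 0.3, 0.3]

def f(x):
    # Multi-modal cost of Eq. (3.5)
    return -sum(aj * math.exp(-(x - mj) ** 2 / (2 * sj ** 2))
                for aj, mj, sj in zip(a, mu, sigma))

# Coarse grid scan over [0, 5]: a gradient method started near x = 1 or x = 4
# stalls in a shallow local minimum, while the deepest well lies at mu = 2.5.
grid = [i * 0.01 for i in range(501)]
x_best = min(grid, key=f)
print("global minimum near x =", round(x_best, 2))  # -> 2.5
```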

Figure 3.2: Multi-Modal Cost Function (a one-dimensional utility curve over [0, 5] exhibiting a first, second, and third minimum)

Chapter 4

Approach and Simulation Results for Minimum Cost Network Flow Problem

In this chapter, the distributed optimization approaches used in solving the problems described in Sections 3.1.1 and 3.1.2 are presented. The approach taken in solving the described problems can be categorized as a Market-Based Distributed Optimization method. In the following sections, the different market mechanisms used in solving the different problems are discussed.

4.1 Mincost Problem 1: Resource Allocation with Indivisible Resources

This problem, as formulated in Section 3.1.1, is a typical integer/binary programming problem and can be solved in a centralized manner using well-studied integer programming methods. However, these methods suffer from the issue of scalability and the other issues associated with centralized techniques mentioned earlier. In the proposed market mechanism [108], which is based on an auction model, there are n buyers (corresponding to tasks), n sellers (corresponding to resources), and one dealer. The algorithm works in an iterative fashion: in each iteration, every task or buyer agent calculates the cost of acquiring the required resource from each of the seller agents. In this problem, it is considered that each buyer agent's demand can be met by any one of the seller agents. The Lagrangian relaxation of the problem described by Eq. (3.1) is:

L(x, p) = \sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij} d_i x_{ij} + \sum_{j=1}^{n} p_j \left( \sum_{i=1}^{n} x_{ij} - 1 \right)   (4.1)

In the above equation, p_j \forall j represents the dual variables of the problem, or the prices of the resources. The cost (Cost_{ij})_k of acquiring the resource from a particular seller agent j by buyer agent i in iteration k is a function of the per unit cost C_{ij} of resource flow between buyer i and seller j and the price (p_j)_k of the resources of seller j, and is given by:

(Cost_{ij})_k = C_{ij} d_i + (p_j)_k   (4.2)

This cost reflects the total price the buyer agent has to pay to procure the resources from a given seller agent. Each buyer agent i maintains an n-dimensional cost vector (Cost_i)_k whose elements are obtained using Eq. (4.2). To maximize its own profit margin, each buyer agent i (i = 1, 2, ..., n) bids for the resource that results in the minimum cost, i.e., \min_j (Cost_{ij})_k. Thus each buyer agent maximizes only its own profit, and in doing so, more than one buyer agent may request resources from the same resource provider, which violates the one-to-one assignment constraint. To alleviate this problem, the dual variables, or the prices of the resources, are updated according to supply and demand (using the subgradient method).

The seller agent is considered to maintain a very simple strategy. The initial selling price (p_j)_k for seller agent j starts from a low value and increases as the number of requests to seller agent j increases. The evolution of the price from iteration k to k+1 is given by:

(p_j)_{k+1} = \max(0, (p_j)_k + \gamma(n_j - 1))   (4.3)

where n_j is the number of buyers bidding for seller agent j at iteration k and \gamma is a constant that governs the rate of change of the price. Eq. (4.3) suggests that if only one buyer bids for seller j, its price remains unchanged. When more than one buyer bids for the seller, its price goes up. If no buyer bids for seller agent j, its price goes down, but not below zero, since the dual variables are interpreted as prices; this is enforced by the projection \max(0, \cdot) in Eq. (4.3). Thus the change in the price of the seller agent is directly related to the demand for the seller agent's resource. It may also be noted that the price update is in the subgradient direction.

A dealer can be considered to assign a particular seller agent j to the highest-bidding buyer agent i, and the transaction price is the cost of acquiring the resource from seller agent j for the highest bidder, given by Eq. (4.2). Although a separate dealer can be considered in the system, in this approach each seller agent can assume the role of the dealer if necessary, and thus the problem can be solved in a completely distributed manner. In this method, the update of the prices of the resources (Eq. (4.3)) helps in reaching the equilibrium point of the system. As shown in Eq. (4.3), the price increases as the demand on a particular resource increases, so at a higher price, only the most deserving buyer agent receives the resource. The process of bidding, buying, and selling over iterations results in a competitive equilibrium corresponding to the maximum price obtained by the sellers and the minimum cost incurred by the buyers.
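The bidding loop described above can be sketched as follows. The instance data are random and hypothetical, and the plain update of Eq. (4.3) may fail to break ties on degenerate instances (auction algorithms typically add an epsilon perturbation to guarantee termination), so the sketch simply bounds the number of iterations:

```python
import random

random.seed(2)
n = 5
C = [[random.uniform(1, 10) for _ in range(n)] for _ in range(n)]  # unit costs C_ij
d = [1.0] * n       # unit demands, for simplicity
p = [0.0] * n       # initial seller prices (dual variables)
gamma = 0.1         # price step size in Eq. (4.3)

for k in range(500):
    # Each buyer i bids for the seller minimizing the total cost of Eq. (4.2).
    bids = [min(range(n), key=lambda j: C[i][j] * d[i] + p[j]) for i in range(n)]
    counts = [bids.count(j) for j in range(n)]
    if all(cnt == 1 for cnt in counts):
        break  # one-to-one assignment reached
    # Price update of Eq. (4.3): price rises with excess demand, floored at zero.
    p = [max(0.0, p[j] + gamma * (counts[j] - 1)) for j in range(n)]

print("final bids:", bids)
print("final prices:", [round(pj, 2) for pj in p])
```

Oversubscribed sellers become more expensive over the iterations, pushing all but the highest-valuing buyer toward other sellers, which is the mechanism driving the method toward a competitive equilibrium.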

4.2 Mincost Problem 2: Allocation of Divisible Resources

This problem, as formulated in Section 3.1.2, is a typical linear programming problem and can be solved in a centralized fashion using existing methods in the literature. However, a centralized optimization technique such as linear programming assumes knowledge of the global system and becomes increasingly computationally intractable as the number of buyer and seller agents increases.

In the proposed market-based approach [110], based on an auction model, the centralized problem in Eq. (3.2) is decomposed into agent-level optimization problems that can be solved by each of the buyer agents. In the proposed technique, each end-user agent solves its own optimization problem and submits its request, along with the valuation of the request, to the dealer. The valuation, also called the bid for the request, is a measure of how valuable the request is to the buyer agent. Each seller agent submits its resource prices along with its capacity to the dealer in the market. The dealer facilitates the transaction of resources between the buyer and seller agents so that the overall utility is maximized, and it also sets the prices of the resources.

The proposed approach works in an iterative fashion, where agents adjust their bidding and selling prices based on the current allocations decided by the dealer in each iteration. In the next iteration, agents try to re-adjust the allocation in a manner that improves the overall utility. The process is described as follows. At iteration k, the optimization problem solved by each buyer agent i is to minimize:

\min_{(x_{ij})_k} \sum_j (C_{ij} + (p_j)_k)(x_{ij})_k   (4.4a)

such that \sum_j (x_{ij})_k = d_i   (4.4b)

(x_{ij})_k = (x_{ij})_{k-1} + (w_{ij})_k   (4.4c)

(W_i)_k = [w_{i1}, w_{i2}, ..., w_{im}]_k \in R^m   (4.4d)

V(W_i)_k = \sum_j C_{ij}(x_{ij})_{k-1} - \sum_j C_{ij}(x_{ij})_k   (4.4e)

In Eq. (4.4a), (p_j)_k is the unit price of the resource of seller agent j. (w_{ij})_k in Eq. (4.4c) is the re-allocation decision taken by buyer agent i for seller agent j at iteration k, and (W_i)_k is the vector containing all the re-allocation decisions of buyer agent i for all seller agents, as shown in Eq. (4.4d). Eq. (4.4d) shows that in a system consisting of m seller agents, (W_i)_k consists of m elements. V(W_i)_k is the valuation, or bid, that buyer agent i associates with the bundle (W_i)_k at iteration k. Thus each buyer agent submits its most preferable bundle (W_i)_k along with its valuation or bid V(W_i)_k for the bundle to the dealer.

The dealer solves its own optimization problem, called the market matching problem, where it decides what percentage of the request of each buyer agent will be met, considering the limits on the resource availability of the seller agents given by the constraint in Eq. (3.2c). If, at any particular iteration, none of the requests of a buyer agent is met, the unmatched request along with its valuation or bid value is carried over to the next iteration for that particular buyer agent. Thus, at any particular iteration k, a buyer agent can place a request of size l, W_i^l, which would consist of (l - 1) unmatched requests from previous iterations and one current request found by solving the problem in Eq. (4.4) at the current iteration. Since up to l requests can be submitted to the dealer by a buyer agent (l \le k, where k is the current iteration), the l bids V_l(W_i^l) associated with the requests must be submitted to the dealer as well.

In this problem, the dealer is considered to solve the following problem after it receives the required information from the buyer and the seller agents:

\max_{y_i^l} \sum_{i=1}^{n} \sum_l V_l((W_i^l)_k) y_i^l   (4.5a)

such that \sum_{i=1}^{n} \sum_l (W_i^l)_k y_i^l \le (c_j)_k,  j = 1, 2, ..., m   (4.5b)

\sum_l y_i^l \le 1   (4.5c)

In Eq. (4.5a), y_i^l can be considered to be the percentage of the l-th request from buyer i met by the dealer, and (c_j)_k is the remaining resource of seller agent j at iteration k. The dealer also publishes the shadow prices (p_j)_{k+1} of the resources, which are the dual variables of the constraint in Eq. (4.5b). For each non-zero solution y_i^l, a transaction takes place and the buyer agent is allocated (x_{ij})_k = (x_{ij})_{k-1} + \sum_l (W_i^l)_k y_i^l. The buyer agents again solve their own optimization problem given by Eq. (4.4), and the process continues till the termination conditions are met.

In this method, the equilibrium is reached when the prices of the resources converge. The prices of the resources are the shadow prices of the constraint in Eq. (4.5b), which increase as the demands on the resources increase. Thus, similar to the previous method, at a higher resource price, only the most deserving buyer receives the resource.
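The buyer-side step of Eq. (4.4) has a simple closed form when only the demand constraint of Eq. (4.4b) is active: with prices fixed, each buyer routes its whole demand to the seller with the smallest priced cost C_ij + p_j. A sketch with hypothetical costs, prices, and demands:

```python
# Hypothetical unit costs C[i][j], dealer prices p[j], and buyer demands d[i].
C = [[2.0, 5.0, 4.0],
     [6.0, 1.0, 3.0]]
p = [1.5, 4.0, 0.5]   # current shadow prices published by the dealer
d = [3.0, 2.0]

# Each buyer i solves min_j (C_ij + p_j) and requests its full demand there.
requests = []
for i, di in enumerate(d):
    j_star = min(range(len(p)), key=lambda j: C[i][j] + p[j])
    requests.append((i, j_star, di))

print(requests)   # [(0, 0, 3.0), (1, 2, 2.0)]
```

Note how the prices redirect demand: buyer 1's cheapest raw cost is seller 1 (C = 1.0), but seller 1's high price of 4.0 makes seller 2 the better choice, which is exactly how the dealer's shadow prices steer buyers away from congested sellers.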

4.3 Simulation Results

4.3.1 Mincost Problem 1: Resource Allocation with Indivisible Resources

In this problem, a number of different scenarios are considered, and the cost C_{ij} and the resource demands d_i of the task agents (buyer agents) are generated randomly. In order to visualize the results, the buyer agents and the resource agents (seller agents) are considered to be geographically distributed, and the cost C_{ij} between buyer agent i and seller agent j is taken to be the geometric distance between the two. As mentioned in Section 3.1.1, the problem in Eq. (3.1) can be solved using an integer programming method, which is centralized in nature. The solutions from the proposed market-based allocation method are thus compared with the centralized integer programming method to evaluate the effectiveness of the procedure. Fig. 4.1 – Fig. 4.3 show the solutions achieved from the market-based method as well as the integer programming method for three different scenarios: i) 25 resources and 25 tasks; ii) 50 resources and 50 tasks; and iii) 100 resources and 100 tasks. Each figure shows: a) the assignment results obtained from the centralized integer programming method, which provides the globally optimal result; b) the assignment results obtained from the proposed market-based method; and c) the evolution of the global cost function as the number of iterations progresses in the market-based method. Table 4.1 compares the results obtained from both methods for 9 different scenarios, ranging from small to large assignment problems. The cost is computed using Eq. (3.1a), where C_{ij} is the distance between task agent i and resource agent j.

Figure 4.1: Mincost Problem 1 (Scenario 1): 25 Resources and 25 Tasks. (a) Solution from the integer programming method; (b) solution from the market-based method; (c) global cost vs. number of iterations obtained using the market-based method.

Figure 4.2: Mincost Problem 1 (Scenario 2): 50 Resources and 50 Tasks. (a) Solution from the integer programming method; (b) solution from the market-based method; (c) global cost vs. number of iterations obtained using the market-based method.

4.3.2 Mincost Problem 2: Resource Allocation with Divisible Resources

In this problem, similar to the previous one, the resource agents and the task agents are considered to be geographically distributed, and the cost C_{ij} between task agent i and resource agent j is taken to be the geometric distance between the two. As in Section 4.3.1, the costs C_{ij}, the demands of the task or buyer agents d_i, and the capacities of the resource or seller agents c_j are generated randomly, while maintaining the inequality constraint \sum_{i=1}^{n} d_i \le \sum_{j=1}^{m} c_j so that the problem is feasible. The results obtained via the market-based method are compared with the linear programming method, which is centralized in nature. Fig. 4.4 – Fig. 4.6 present the results for three different scenarios: i) 10 resources and 15 tasks; ii) 21 resources and 43 tasks; and iii) 50 resources and 50 tasks. The task demands and resource capacities for scenario 1 (15 tasks and 10 resources) used in the simulation are shown in Table 4.3 and Table 4.4, respectively. Figs. 4.4, 4.5, and 4.6 show the assignment results obtained for the three scenarios, respectively. Each figure shows: a) the assignment obtained from the proposed market-based method; b) the assignment obtained from the centralized linear programming method, which provides the globally optimal result; and c) the evolution of the global cost function as the number of iterations progresses in the market-based method. Table 4.2 compares the results obtained from the linear programming and market-based methods for 5 different scenarios.

Figure 4.3: Mincost Problem 1 (Scenario 3): 100 Resources and 100 Tasks. (a) Solution from the integer programming method; (b) solution from the market-based method; (c) global cost vs. number of iterations obtained using the market-based method.

Table 4.1: Comparison between Market-Based Method and Integer Programming (Mincost Problem 1)

Scenario (No. of Resources X No. of Tasks)   Market-Based Cost   Integer Programming Cost
5X5                                          575.11              575.11
10X10                                        849.13              849.13
15X15                                        1192.1              1192.1
20X20                                        1320.2              1320.2
25X25                                        1558.3              1558.3
30X30                                        1999.3              1999.3
50X50                                        2714.0              2714.0
100X100                                      6342.1              6342.1
200X200                                      8755.5              8755.5

Table 4.2: Comparison between Market-Based Method and Linear Programming (Mincost Problem 2)

Scenario (No. of Resources X No. of Tasks)   Market-Based Cost   Linear Programming Cost
10X15                                        1429.2              1429.2
25X35                                        1470                1469.1
21X43                                        1887.92             1886.4
50X50                                        2191.41             2189.2
75X75                                        2313.4              2311.3

Table 4.3: Task Demands (Scenario 1, 15 tasks)

3.4822, 4.6827, 4.7144, 4.9272, 3.9778, 3.4406, 3.4524, 4.0736, 4.5242, 3.6951, 3.9225, 4.2786, 4.8347, 3.3231, 4.4313

Table 4.4: Resource Capacities (Scenario 1, 10 resources)

7.3315, 7.0427, 7.9445, 6.9622, 6.5340, 7.4427, 7.4241, 6.8319, 7.7820, 8.1750

Figure 4.4: Mincost Problem 2 (Scenario 1): 10 Resources and 15 Tasks. (a) Solution from the linear programming method; (b) solution from the market-based method; (c) global cost vs. number of iterations obtained using the market-based method.

Figure 4.5: Mincost Problem 2 (Scenario 2): 21 Resources and 43 Tasks. (a) Solution from the linear programming method; (b) solution from the market-based method; (c) global cost vs. number of iterations obtained using the market-based method.

Figure 4.6: Mincost Problem 2 (Scenario 3): 50 Resources and 50 Tasks. (a) Solution from the linear programming method; (b) solution from the market-based method; (c) global cost vs. number of iterations obtained using the market-based method.

Chapter 5

Approach and Simulation Results for Maximum Flow Problem

In this chapter, a primal-dual distributed interior point method and its application in solving the network utility maximization (NUM) formulation of the network maximum flow problem, presented in Section 3.2, is described. The simulation results of this proposed distributed approach are provided at the end of this chapter, along with a comparison with the centralized method to evaluate the method's performance. A distributed primal-dual interior point approach is taken in this dissertation for solving the NUM problem. In primal-dual optimization methods, both the primal and dual variables are updated at each iteration using a Newton step on a modified form of the optimality conditions [121].

5.1 Optimality Conditions

A very common technique to handle inequality constraints is to introduce a barrier function within the cost function [121]. The constraints in Eq. (3.4b) are called complicating or coupled constraints because each constraint involves more than one primal variable, while the constraints in Eq. (3.4c) are local in nature, since each involves only one primal variable. Thus, including a barrier function to satisfy the constraints in Eq. (3.4b) would hamper the separability of the total cost function, while incorporating a barrier function to satisfy the constraints in Eq. (3.4c) does not disturb that separability. The non-negativity constraint x_i > 0 of each primal variable in Eq. (3.4c) can thus be satisfied by including a barrier function in the utility function of each source node i. The cost function of each source node is therefore augmented with a logarithmic barrier function: f_i(x_i) = -U_i(x_i) - \mu \log(x_i), where the parameter \mu decreases as the algorithm progresses. The Lagrangian relaxation of Eq. (3.4) is given by:

\sum_{i=1}^{n} f_i(x_i) + \lambda^T (Ax - c),  such that \lambda \ge 0

Here, \lambda \in R^m is the vector of dual variables of the problem, which can be considered to be the prices of the respective links, such that \lambda_j is the price of link j. Considering f(x*) to be the optimal total cost, the Karush-Kuhn-Tucker (KKT) optimality conditions [121] of the problem can be formulated as in Eq. (5.1). We define the diagonal matrix diag(Y) derived from a vector Y \in R^p as a matrix of size R^{p \times p} with diagonal elements [diag(Y)]_{ii} = Y_i and all other elements zero.

∇f(x*) + A^T λ* = 0   (5.1a)

diag(λ*) s* = 0   (5.1b)

x*, s*, λ* ≥ 0   (5.1c)

In Eqs. (5.1b)-(5.1c), s ∈ R^m is the vector of slack variables defined by s = (c − Ax), while s* denotes the optimal slack variables given by s* = (c − Ax*). Eq. (5.1b) gives the complementary slackness conditions [121], which are analogous to Walras' Law [122] in economic markets. Eq. (5.1c) gives the feasibility conditions of the problem, where s* ≥ 0, i.e., (Ax* − c) ≤ 0, and x* ≥ 0 are the primal feasibility conditions while λ* ≥ 0 are the dual feasibility conditions.

5.2 Duality Gap

The duality gap is defined as the difference between the primal and dual solutions. It is always non-negative, represents a bound on the suboptimality of the solution [123], and is used in many cases as a stopping criterion [121]. In most distributed optimization problems, computation of the duality gap at a particular iteration is either impossible or computationally too expensive. In such cases, the surrogate duality gap [121] is used instead of the traditional duality gap as a measure of convergence of the algorithm. The surrogate duality gap is given by:

η̂ = s^T λ

It may be pointed out here that, since the dual and the slack variables (the prices of the resources and the remaining resources) are considered to be global information available to every node of the network, the surrogate duality gap can be computed very easily.
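As a concrete illustration, the surrogate gap is a single inner product between the slack and price vectors, so a stopping test is cheap to evaluate. The following is a minimal numpy sketch; the vector values and the tolerance are illustrative assumptions, not taken from the dissertation.

```python
import numpy as np

def surrogate_duality_gap(s, lam):
    """Surrogate duality gap: eta_hat = s^T lambda."""
    return float(s @ lam)

# Illustrative values for a network with 3 links: s holds the remaining
# capacities (slacks) and lam holds the unit prices of the links.
s = np.array([0.5, 0.0, 1.2])
lam = np.array([0.0, 2.0, 0.1])
eta_hat = surrogate_duality_gap(s, lam)   # 0.12 for these values

# A typical stopping rule: terminate once the gap falls below a tolerance.
converged = eta_hat < 1e-6
```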

5.3 Distributed Primal and Dual Update

The optimality conditions are modified by the introduction of the parameter t, which is adjusted according to the duality gap or the surrogate duality gap as the algorithm progresses. The modified KKT optimality conditions are obtained by modifying the complementary slackness condition in Eq. (5.1b) as:

∇f(x) + A^T λ = 0
diag(λ)(c − Ax) = (1/t) 1_m   (5.2)

In Eq. (5.2), t is a non-negative parameter which is inversely proportional to the duality gap. It measures the suboptimality of the solution, and as t → ∞, the original optimality condition (Eq. (5.1b)) is recovered. 1_m is a vector of dimension m with all values equal to 1. The modified optimality conditions can be expressed as r_t(x, λ) = 0 for a fixed t, so that

r_t(x, λ) = [ ∇f(x) + A^T λ ;  diag(λ)(c − Ax) − (1/t) 1_m ] = 0   (5.3)

For a fixed t, the solution (x, λ) of Eq. (5.3) is evaluated using the iterative Newton method, so that the primal-dual search direction (Δx, Δλ) is the Newton step for solving the non-linear equations r_t(x, λ) = 0. If y = (x, λ) denotes the current point of the iteration, the Newton step Δy = (Δx, Δλ) is characterized by the linear equations [121]:

r_t(y + Δy) ≈ r_t(y) + ∇r_t(y) Δy = 0

∇r_t(y) Δy = −r_t(y)   (5.4)

In a more elaborate form, according to Eq. (5.4), the primal-dual search direction can be computed by:

[ ∇²f(x)      A^T     ] [ Δx ]   [ −∇f(x) − A^T λ        ]
[ −diag(λ)A   diag(s) ] [ Δλ ] = [ −diag(λ)s + (1/t) 1_m ]   (5.5)

From Eq. (5.5), the primal search direction at a particular iteration k is given by

Δx_k = [∇²f(x_k)]^{-1} [−∇f(x_k) − A^T λ_k]   (5.6)

while the dual search direction at a particular iteration k is evaluated by:

Δλ_k = diag(λ_k) [diag(s_k)]^{-1} A Δx_k − λ_k + (1/t_k) [diag(s_k)]^{-1} 1_m   (5.7)

Eq. (5.6) provides the primal update direction. Thus the i-th primal variable, the decision variable of source node i, can be updated using:

Δx_k^i = −[∇²f_i(x_k^i)]^{-1} [∇f_i(x_k^i) + (A^T λ_k)_i]   (5.8)

In Eq. (5.8), the term (A^T λ_k)_i is given by the product of the i-th column of matrix A and the vector λ_k, i.e., A^i λ_k. It may be noted that the matrix ∇²f(x_k) is an n×n diagonal matrix and its inverse, [∇²f(x_k)]^{-1}, is also an n×n diagonal matrix with [∇²f(x_k)]^{-1}_ii = 1/[∇²f(x_k)]_ii. Each source node is considered to have the knowledge of its own route, which means a source node i has the information of the i-th column of the routing matrix A. The dual variables, or the prices of the resources, λ_k, are considered to be global information, i.e., available to every source node in the system. Thus the primal variables can be updated in a completely distributed fashion with the available knowledge.
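Because the Hessian is diagonal, the Newton step of Eq. (5.8) decomposes into one scalar computation per source node. The sketch below illustrates this for the logarithmic utilities U_i(x_i) = b_i log(x_i) used later in the simulations, with the log barrier folded in; the problem sizes and numerical values are illustrative assumptions.

```python
import numpy as np

def primal_direction(x, lam, A, b, mu):
    """Per-source Newton direction of Eq. (5.8) for
    f_i(x_i) = -(b_i + mu) * log(x_i), i.e., logarithmic utility plus
    logarithmic barrier. The gradient and Hessian are diagonal, so every
    source i only needs its own column of A and the global price vector."""
    grad = -(b + mu) / x           # gradient of f_i at x_i
    hess = (b + mu) / x**2         # diagonal Hessian entries of f_i
    price = A.T @ lam              # (A^T lam)_i: total price along route i
    return -(grad + price) / hess  # Delta x_i, independent per source

# Toy instance: 2 links, 2 sources (illustrative values).
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])         # routing matrix: link x source
x = np.array([1.0, 1.0])
lam = np.array([0.1, 0.2])
b = np.array([1.0, 2.0])
dx = primal_direction(x, lam, A, b, mu=0.1)
```

Each source evaluates its own entry of `dx` using only its route (one column of `A`) and the broadcast prices, which matches the information pattern described in the text.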

Considering a step size d_k, the primal variable associated with each source node i can be updated using:

x_{k+1}^i = x_k^i + d_k Δx_k^i   (5.9)

From Eq. (5.7), the j-th element of the vector Δλ_k is given by:

Δλ_k^j = (A_j Δx_k / s_k^j − 1) λ_k^j + 1/(t_k s_k^j)   (5.10)

Here, (A_j Δx_k) denotes the product of the j-th row of the matrix A and the vector Δx_k and is given by Σ_{i∈S(j)} A_ji Δx_k^i. The term (A_j Δx_k), ∀j, represents the net increase or decrease in the rate of flow through link j at iteration k, which can be considered to be known by link j. Since the slack variable s_k^j, which can be considered to be the amount of available resources, is known to link j, ∀j, the dual variables, or the unit-prices of the links, can be computed in a completely distributed manner by each link using Eq. (5.10). Thus, the dual variable λ^j can be updated by each link j using:

λ_{k+1}^j = λ_k^j + β_k^j Δλ_k^j   (5.11)

A proper choice of the step size β_k^j (0 ≤ β_k^j ≤ 1) in Eq. (5.11) helps in the generation of feasible dual solutions and in faster convergence of the dual solutions, as discussed in the following section.
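Each link can evaluate Eq. (5.10)-(5.11) from purely local quantities: its own price, its own slack, and the net flow change through it. A minimal sketch (the numerical values are illustrative assumptions):

```python
def dual_direction(lam_j, s_j, flow_change_j, t_k):
    """Eq. (5.10) for a single link j: flow_change_j = A_j * Delta x_k,
    the net change of flow through link j, observable locally."""
    return (flow_change_j / s_j - 1.0) * lam_j + 1.0 / (t_k * s_j)

# Illustrative local state of one link:
lam_j, s_j = 0.5, 2.0            # current price and slack of link j
flow_change_j, t_k = 0.4, 10.0   # measured flow change and the parameter t
dlam_j = dual_direction(lam_j, s_j, flow_change_j, t_k)

# Local price update of Eq. (5.11) with some admissible step size:
beta_j = 0.5
lam_next = lam_j + beta_j * dlam_j
```

No coordination beyond observing its own traffic is needed: the link never sees the full vectors x_k or Δx_k.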

5.4 Dual Step Size

In this section, we provide our result that helps in determining an appropriate value of the step size β_k^j. The step size rule in the following theorem serves two purposes: first, generation of dual feasible solutions, given by Eq. (5.1c), and second, convergence of the dual variables of the problem.

Theorem 5.1. Given a strictly positive initial dual vector λ_0 > 0, the sequence of dual vectors {λ_k} generated by the dual update equation Eq. (5.11) is non-negative if the dual update step size β_k^j is chosen as:

β_k^j ≤ λ_k^j s_k^j t_k / [(s_k^j − A_j Δx_k) t_k λ_k^j − 1]   if A_j Δx_k − s_k^j < 0
0 ≤ β_k^j ≤ 1   if A_j Δx_k − s_k^j ≥ 0

Proof. The dual feasibility as given by Eq. (5.1c) is λ_{k+1} ≥ 0, or λ_{k+1}^j ≥ 0, ∀j. Essentially, for the proof of this theorem, we find bounds on β_k^j that ensure that λ_{k+1}^j ≥ 0 if λ_k^j ≥ 0. From Eq. (5.11), the dual update is given by λ_{k+1}^j = λ_k^j + β_k^j Δλ_k^j, ∀j. Thus, for λ_{k+1}^j ≥ 0,

λ_k^j + β_k^j [(A_j Δx_k / s_k^j − 1) λ_k^j + 1/(t_k s_k^j)] ≥ 0
[1 + β_k^j (A_j Δx_k − s_k^j)/s_k^j] λ_k^j + β_k^j/(t_k s_k^j) ≥ 0

Now, if A_j Δx_k − s_k^j ≥ 0, then for any β_k^j ∈ (0, 1), λ_{k+1}^j ≥ 0, ∀j. If A_j Δx_k − s_k^j ≤ 0, then for λ_{k+1}^j ≥ 0,

[1 + β_k^j (A_j Δx_k − s_k^j)/s_k^j] λ_k^j ≥ −β_k^j/(t_k s_k^j)

Re-arranging the above inequality gives:

β_k^j ≤ λ_k^j s_k^j t_k / [(s_k^j − A_j Δx_k) t_k λ_k^j − 1]
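The rule of Theorem 5.1 translates into a few lines per link. In the sketch below, the branch for a non-positive denominator reflects our reading of the proof (the bracketed term is then non-negative, so any β in (0, 1] keeps the price non-negative); this interpretation, and the numerical values, are illustrative assumptions.

```python
import numpy as np

def dual_step_size(lam_j, s_j, flow_change_j, t_k):
    """Step-size bound of Theorem 5.1 for link j, clipped to (0, 1]."""
    if flow_change_j - s_j >= 0.0:
        return 1.0                       # any beta in (0, 1] is admissible
    denom = (s_j - flow_change_j) * t_k * lam_j - 1.0
    if denom <= 0.0:                     # the bound is vacuous in this case
        return 1.0
    return min(1.0, lam_j * s_j * t_k / denom)

def dual_direction(lam_j, s_j, flow_change_j, t_k):
    """Eq. (5.10), repeated so the sketch is self-contained."""
    return (flow_change_j / s_j - 1.0) * lam_j + 1.0 / (t_k * s_j)

# With these values the bound is tight: the updated price lands exactly at 0.
lam_j, s_j, flow_change_j, t_k = 0.1, 1.0, -5.0, 10.0
beta_j = dual_step_size(lam_j, s_j, flow_change_j, t_k)        # bound = 0.2
lam_next = lam_j + beta_j * dual_direction(lam_j, s_j, flow_change_j, t_k)
```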

5.5 Primal Step Size

The primal update step size is chosen to generate feasible primal solutions, given by Eq. (5.1c), and to help in the convergence of the algorithm. The inclusion of the barrier function in the individual cost functions helps in the satisfaction of the constraint x_{k+1} > 0, and the primal step size is used for the satisfaction of the constraint c − A x_{k+1} ≥ 0. The primal step size is chosen based on the feasibility of the primal solutions and is evaluated with little information exchange within the network. For optimization problems involving gradient descent or Newton methods, the step size is generated using backtracking methods [121]. For Newton methods, it has been shown that a backtracking line search always results in a step size d_k ≥ b̂/(1 + N̂_d^k), b̂ ∈ (0, 1) [121]. N̂_d^k is called the Newton decrement, defined as N̂_d^k = (Δx_k^T ∇²f(x_k) Δx_k)^{1/2}. It can be seen that a step size d_k = b̂/(1 + N̂_d^k) satisfies the backtracking line search condition. Apart from convergence, feasibility

of the intermediate solutions has to be taken into account, and the following theorem provides the bound on the step size to generate feasible solutions.

Theorem 5.2. Given a strictly positive initial slack variable vector s_0, a sequence of primal vectors {x_k} obtained by Eq. (5.9) will generate a sequence of non-negative slack variable vectors {s_{k+1}} if the primal step size d_k (0 ≤ d_k ≤ 1) is chosen as:

d_k = P_k / (1 + N̂_d^k)

where

P_k ≤ min_j P_j   if Δl_{k+1}^j + β_k^j λ_k^j − β_k^j/(t_k s_k^j) > 0
0 ≤ P_k ≤ 1   otherwise

with P_j = β_k^j s_k^j λ_k^j / [s_k^j Δl_{k+1}^j + s_k^j β_k^j λ_k^j − β_k^j/t_k] and Δl_{k+1}^j = λ_{k+1}^j − λ_k^j.

Proof. Our approach to prove this theorem is to consider s_k ≥ 0 for an arbitrary k, and then obtain the condition on the primal step size that ensures that s_{k+1} ≥ 0. For s_{k+1} to be non-negative, the following has to be satisfied:

A x_{k+1} ≤ c
A x_k + d_k A Δx_k ≤ c

Since c − A x_k = s_k, this gives d_k A Δx_k ≤ s_k. Thus

d_k ≤ s_k^j / (A_j Δx_k), ∀j   (5.12)

Now, the slack variable update can be given by:

s_k^j = c^j − A_j x_k
s_{k+1}^j = c^j − A_j x_{k+1}
Thus s_{k+1}^j = c^j − A_j x_k − d_k A_j Δx_k
s_{k+1}^j = s_k^j − d_k A_j Δx_k   (5.13)

Thus, if A_j Δx_k ≤ 0, choosing the primal step size as 0 ≤ d_k ≤ 1 will not violate the global constraint condition s_{k+1} ≥ 0. On the other hand, if A_j Δx_k > 0, d_k should follow Eq. (5.12). If the information A_j Δx_k were shared by the links with the source nodes, the upper bound on d_k could be computed very easily. The term A_j Δx_k represents the total change of flow through link j at iteration k. This information is readily available to the respective links. Here, we consider that this information is not shared with the source nodes, in order to minimize the communication requirement of sharing it. In this case, the upper bound on d_k can be computed by using Eq. (5.10) and Eq. (5.11) as:

λ_{k+1}^j − λ_k^j = β_k^j λ_k^j (A_j Δx_k / s_k^j) − β_k^j λ_k^j + β_k^j/(t_k s_k^j)
Δl_{k+1}^j / (β_k^j λ_k^j) = A_j Δx_k / s_k^j − 1 + 1/(t_k s_k^j λ_k^j)
s_k^j / (A_j Δx_k) = β_k^j s_k^j λ_k^j / [s_k^j Δl_{k+1}^j + β_k^j s_k^j λ_k^j − β_k^j/t_k]   (5.14)

Hence, from Eq. (5.12) and Eq. (5.14), the primal update should follow:

d_k ≤ β_k^j s_k^j λ_k^j / [s_k^j Δl_{k+1}^j + β_k^j s_k^j λ_k^j − β_k^j/t_k]

Since P_k ≤ min_j P_j, it is quite evident that d_k = P_k/(1 + N̂_d^k) will not violate the condition for feasibility given by Eq. (5.12).

It may be noted that the primal update rule obtained by Theorem 5.2 complies with the feasibility of the solution as well as the backtracking line search rule. Another aspect of the distributed optimization method stems from the fact that the computation of the Newton decrement in a distributed manner is a challenging task. According to the definition of N̂_d^k,

(N̂_d^k)² = Δx_k^T ∇²f(x_k) Δx_k = Σ_{i=1}^n (Δx_k^i)² ∇²f_i(x_k^i)
(N̂_d^k)² ≤ n max_i [(Δx_k^i)² ∇²f_i(x_k^i)]
N̂_d^k ≤ √n [max_i (Δx_k^i)² ∇²f_i(x_k^i)]^{1/2} = N_d^k   (5.15)

Here, N_d^k is the upper bound of the Newton decrement at iteration k. N̂_d^k ≤ N_d^k means 1/(1 + N̂_d^k) ≥ 1/(1 + N_d^k), and so choosing the primal step size as d_k = P_k/(1 + N_d^k) implies d_k ≤ P_k/(1 + N̂_d^k) and thus d_k ≤ P_k. Thus, replacing the Newton decrement with its upper bound will not violate the primal feasibility condition. Although using the upper bound of the Newton decrement in place of the Newton decrement reduces the primal step length, and thus reduces the speed of primal convergence, it reduces the computational complexity by replacing the summation of the elements from all nodes, Σ_{i=1}^n (Δx_k^i)² ∇²f_i(x_k^i), with a comparison of the elements, (Δx_k^i)² ∇²f_i(x_k^i), among all the nodes. So, the upper bound of the Newton decrement, N_d^k = √n [max_i (Δx_k^i)² ∇²f_i(x_k^i)]^{1/2}, is used as an estimate of the actual Newton decrement, which provides feasible primal solutions as well as helps in convergence, as will be shown in the next section. It may be noted that, for the computation of the upper bound of the Newton decrement, exchange of the information (Δx_k^i)² ∇²f_i(x_k^i) among the source nodes is required.
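The trade-off between the exact decrement (a network-wide sum) and its upper bound (a network-wide max) can be sketched directly from Eq. (5.15). The vectors below are illustrative assumptions.

```python
import numpy as np

def newton_decrements(dx, hess_diag):
    """Exact Newton decrement and the upper bound of Eq. (5.15).
    Each node contributes one term (dx_i)^2 * hess_i; the exact value
    needs the sum of all terms, while the bound only needs their maximum."""
    terms = dx**2 * hess_diag
    nd_exact = np.sqrt(terms.sum())             # exact decrement
    nd_bound = np.sqrt(len(dx) * terms.max())   # sqrt(n * max term)
    return nd_exact, nd_bound

dx = np.array([0.3, -0.2, 0.1])       # per-source Newton directions
hess = np.array([1.0, 2.0, 4.0])      # diagonal Hessian entries
nd_exact, nd_bound = newton_decrements(dx, hess)
# Using the bound shrinks the step: P_k/(1 + nd_bound) <= P_k/(1 + nd_exact),
# trading convergence speed for cheaper, max-based coordination.
```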

5.6 Convergence Analysis

This section provides the theoretical analysis of the proposed distributed primal-dual interior point method. The convergence analysis of both the primal and dual variables is presented in this section.

5.6.1 Descent Direction

Any update Δx is considered to be in a descent direction if it satisfies the following condition:

∇f(x)^T Δx ≤ 0   (5.16)

This is because such an update direction results in the reduction of the global cost, as shown below:

f(x + Δx) ≈ f(x) + ∇f(x)^T Δx
f(x + Δx) ≤ f(x)

For constrained optimization problems, any line search method [121, 124] can be used for the evaluation of the step size, which, when multiplied with the update direction, generates proper descent directions. In the case of the primal-dual interior point method, the line search method requires global knowledge [121], and performing it in a distributed manner is a challenging task. It can be seen from Eq. (5.8) that the Newton step, and hence the descent direction (Eq. (5.16)), depends on the dual vector λ_k at any iteration k:

∇f(x_k)^T Δx_k = −∇f(x_k)^T [∇²f(x_k)]^{-1} [∇f(x_k) + A^T λ_k]
= Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_k] / ∇²f_i(x_k^i)
= Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1} + A^i diag(β_k) Δλ_{k−1}] / ∇²f_i(x_k^i)   (5.17)

Here, |∇f_i(x_k^i)| indicates the absolute value of ∇f_i(x_k^i). It can be seen from Eq. (5.17) that certain circumstances, such as a large increase in the values of the dual variables, can result in violation of the descent criterion given by Eq. (5.16). This method provides a mechanism to ensure, in a distributed manner, that the update direction is a descent direction by using a bounded update of the dual variables. Our main result on this is provided below as Theorem 5.3.

Theorem 5.3. Assuming very small values for the elements of the initial dual vector λ_0, and considering a sequence of dual vectors {λ_k} updated using Theorem 5.1, the primal update direction is a descent direction if the positive updates of the dual variables (j ∈ P | Δλ_k^j > 0) are bounded, so that

G_k ≤ Σ_{i=1}^n g_i / Σ_{i=1}^p q_i

where G_k = max_j (β_k^j Δλ_{k−1}^j) for j ∈ P, ∀k; p is the cardinality of the set P that consists of source nodes that encounter a positive dual update; q_i = |∇f_i(x_k^i)| / ∇²f_i(x_k^i); and g_i = |∇f_i(x_k^i)| [|∇f_i(x_k^i)| − A^i λ_{k−1}] / ∇²f_i(x_k^i).

Proof. From Eq. (5.17),

∇f(x_k)^T Δx_k = Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1} + A^i diag(β_k) Δλ_{k−1}] / ∇²f_i(x_k^i)
= Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1}] / ∇²f_i(x_k^i) + Σ_{i=1}^n |∇f_i(x_k^i)| A^i diag(β_k) Δλ_{k−1} / ∇²f_i(x_k^i)
= Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1}] / ∇²f_i(x_k^i) + Σ_{i∈P} |∇f_i(x_k^i)| A^i diag(β_k) Δλ_{k−1} / ∇²f_i(x_k^i) + Σ_{i∉P} |∇f_i(x_k^i)| A^i diag(β_k) Δλ_{k−1} / ∇²f_i(x_k^i)

If we eliminate the terms that involve the non-positive updates (i ∉ P), we get:

∇f(x_k)^T Δx_k ≤ Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1}] / ∇²f_i(x_k^i) + Σ_{i∈P} |∇f_i(x_k^i)| A^i diag(β_k) Δλ_{k−1} / ∇²f_i(x_k^i)

If m_i is the number of links on the route of source node i that encounter a positive dual update, then, considering the upper bounds of the positive dual updates, we get:

∇f(x_k)^T Δx_k ≤ Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1}] / ∇²f_i(x_k^i) + G_k Σ_{i∈P} m_i |∇f_i(x_k^i)| / ∇²f_i(x_k^i)   (5.18)

The assumption of very small initial dual variables is made so that the initial primal update direction is a descent direction. Now, the bounds on the positive updates of the dual variables for the subsequent iterations can be obtained using the descent direction criterion in Eq. (5.16) and Eq. (5.18):

Σ_{i=1}^n |∇f_i(x_k^i)| [−|∇f_i(x_k^i)| + A^i λ_{k−1}] / ∇²f_i(x_k^i) + G_k Σ_{i=1}^p m_i |∇f_i(x_k^i)| / ∇²f_i(x_k^i) ≤ 0
G_k Σ_{i=1}^p m_i |∇f_i(x_k^i)| / ∇²f_i(x_k^i) ≤ Σ_{i=1}^n |∇f_i(x_k^i)| [|∇f_i(x_k^i)| − A^i λ_{k−1}] / ∇²f_i(x_k^i)

Since mi ≥ 1, we have:

G_k Σ_{i=1}^p |∇f_i(x_k^i)| / ∇²f_i(x_k^i) ≤ Σ_{i=1}^n |∇f_i(x_k^i)| [|∇f_i(x_k^i)| − A^i λ_{k−1}] / ∇²f_i(x_k^i)
G_k ≤ Σ_{i=1}^n g_i / Σ_{i=1}^p q_i

5.6.2 Convergence of Primal Variables

The following theorem describes the convergence property of the proposed method.

Theorem 5.4. Considering that the primal step size d_k is computed by Theorem 5.2 and the primal update is in the descent direction as maintained by Theorem 5.3, a sequence of primal vectors {x_k} updated by Eq. (5.9) converges as k → ∞, i.e., lim_{k→∞} Δx_k → 0, and the improvement of the cost function at each step satisfies the relationship:

f(x_{k+1}) − f(x_k) ≤ α̂ γ d_k ||Δx_k||² / (κ − 1)

Proof. Since d_k satisfies the backtracking line search criteria [121], there exists κ (d_k ≤ κ ≤ 1) such that ∇f(x_{k+1})^T Δx_k ≥ κ ∇f(x_k)^T Δx_k [125]. Now, considering a bounded cost function,

∇f(x_{k+1})^T Δx_k ≥ κ ∇f(x_k)^T Δx_k
[∇f(x_{k+1}) − ∇f(x_k)]^T Δx_k ≥ (κ − 1) ∇f(x_k)^T Δx_k ≥ 0   (5.19)

This is because κ ≤ 1 and ∇f(x_k)^T Δx_k ≤ 0 according to Theorem 5.3. Now, according to the Cauchy-Schwarz inequality, we have:

||[∇f(x_{k+1}) − ∇f(x_k)]^T Δx_k|| ≤ ||∇f(x_{k+1}) − ∇f(x_k)|| ||Δx_k||

Since the gradient of the function is Lipschitz continuous with constant γ ≥ 0, the above inequality can be rewritten as:

||[∇f(x_{k+1}) − ∇f(x_k)]^T Δx_k|| ≤ γ ||x_{k+1} − x_k|| ||Δx_k|| = γ d_k ||Δx_k||²

Since, from Eq. (5.19), [∇f(x_{k+1}) − ∇f(x_k)]^T Δx_k ≥ 0, the above inequality can be written as:

[∇f(x_{k+1}) − ∇f(x_k)]^T Δx_k ≤ γ d_k ||Δx_k||²   (5.20)

From Eq. (5.19) and Eq. (5.20), we have:

∇f(x_k)^T Δx_k ≤ [∇f(x_{k+1}) − ∇f(x_k)]^T Δx_k / (κ − 1) ≤ γ d_k ||Δx_k||² / (κ − 1)   (5.21)

Now, considering satisfaction of the line search criteria for some positive α̂ [121], the cost function update is given by:

f(x_{k+1}) − f(x_k) ≤ α̂ ∇f(x_k)^T Δx_k ≤ α̂ γ d_k ||Δx_k||² / (κ − 1)

It can be seen from the above equation that, since κ ≤ 1, the R.H.S. of the above inequality is negative and hence the cost function decreases in every iteration. Since the function is bounded, it is evident that the decrease in the cost function will eventually vanish, and the point where this happens is the optimal point where the cost function is minimized. At optimality, f(x_{k+1}) − f(x_k) = 0 and so Δx_k = 0, which means lim_{k→∞} ||Δx_k|| → 0.

5.6.3 Convergence of Dual Variables

The following theorem provides our results on the convergence property and rate of convergence of the dual variables.

Theorem 5.5. Assuming a routing matrix A with full row rank, a sequence of dual vector updates Δλ_k, updated as the Newton step of Eq. (5.5), converges, i.e., Δλ_k^j → 0, as the primal variables converge.

Proof. The update of each primal variable is given by Eq. (5.8). With the convergence of each primal variable, Δx_k^i = 0, we have:

−[∇²f_i(x_k^i)]^{-1} [∇f_i(x_k^i) + (A^T λ_k)_i] = 0
−∇f_i(x_k^i) = (A^T λ_k)_i
Similarly, −∇f_i(x_{k+1}^i) = (A^T λ_{k+1})_i

Subtracting the last two equations, and since xk+1 = xk when the primal variables converge, we have:

(A^T Δλ_k)_i = 0

That is, A^T Δλ_k = 0   (5.22)

Since the routing matrix has full row rank, i.e., the rows of A are linearly

independent, from Eq. (5.22) it can be concluded that Δλk = 0 as the primal variables converge.

Theorem 5.6. The Karush-Kuhn-Tucker optimality conditions given by Eq. (5.1) are satisfied when both the primal and dual variables converge as per Theorems 5.4 and 5.5.

Proof. Since the slack variables can only be non-negative, at the optimal point some of the slack variables will converge to positive values and the rest will go to zero. Let the set S_{k0} contain the m − j_p slack variables that converge to zero, and let the set S_{kp} contain the j_p slack variables that converge to positive values. Let us determine the dual variables λ_k for the case corresponding to slack variables converging to positive values, s_k^j > 0, ∀j ∈ S_{kp}. At the convergence of the primal variables, Δx_k = 0 and hence A_j Δx_k = 0, ∀j.

Now, using the dual step size rule of Theorem 5.1 with A_j Δx_k = 0:

β_k^j ≤ λ_k^j s_k^j t_k / [(s_k^j − A_j Δx_k) t_k λ_k^j − 1]
β_k^j ≤ λ_k^j s_k^j t_k / (s_k^j t_k λ_k^j − 1)
(β_k^j − 1) s_k^j t_k λ_k^j ≤ β_k^j
λ_k^j ≤ β_k^j / [(β_k^j − 1) s_k^j t_k]   (5.23)

Since β_k^j ≤ 1, the term (β_k^j − 1) ≤ 0. As s_k^j and t_k are always positive, from Eq. (5.23),

λ_k^j ≤ 0   (5.24)

Since the dual variables must always be non-negative, when A_j Δx_k = 0 the dual variables must be zero, i.e., λ_k^j = 0, ∀j ∈ S_{kp}. Now, to prove the satisfaction of the KKT optimality conditions given by Eq. (5.1): it can be seen from the update vector of primal variables given by Eq. (5.6) that, with the convergence of the primal variables, Δx_k = 0, the first optimality condition given by Eq. (5.1a) is satisfied. For the second optimality condition, the complementary slackness condition given by Eq. (5.1b), the term s_k^T λ_k is given by:

s_k^T λ_k = Σ_{j=1}^m s_k^j λ_k^j = Σ_{j∈S_{k0}} s_k^j λ_k^j + Σ_{j∈S_{kp}} s_k^j λ_k^j   (5.25)

In the above equation, the term Σ_{j∈S_{k0}} s_k^j λ_k^j = 0, since s_k^j = 0 for j ∈ S_{k0}, and the term Σ_{j∈S_{kp}} s_k^j λ_k^j = 0, since λ_k^j = 0 for j ∈ S_{kp}. Thus, from Eq. (5.25), the complementary slackness condition at optimality can be given by:

s_k^T λ_k = 0   (5.26)

Thus, the complementary slackness condition, Eq. (5.1b), is maintained as the algorithm converges. The satisfaction of the third optimality condition, given by Eq. (5.1c), is ensured by Theorems 5.1 and 5.2 and the incorporation of the log barrier in the cost functions.

5.7 Simulation Results

In this section, the performance of the proposed distributed primal-dual interior point method is evaluated on different randomly generated networks and is compared with the centralized interior-point method. The utility function of each source node is considered to be different but logarithmic in nature, so that U_i(x_i) = b_i log(x_i), which represents many real-world utility functions. The constants b_i for the source nodes are sampled from a uniform distribution. The routes of the source nodes are randomly chosen, which means that the elements of the matrix A (Eq. (3.3)) are chosen in a random fashion while ensuring that A satisfies Assumption 5 made in section 3.2. The capacity of each link, c_j, is also chosen in a random fashion from a uniform distribution. We present our results for two cases, where the first case represents a medium-sized problem and the second case represents a large-sized problem. All the simulations are performed using a computer with a 3.47 GHz X990 i7 processor and Matlab 2011b.
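For reference, an instance in the spirit of this setup can be generated as follows; the sampling ranges and the fixed route length are illustrative assumptions, not the dissertation's exact settings.

```python
import numpy as np

def random_num_instance(n_sources, m_links, links_per_route=3, seed=0):
    """Random NUM instance: weights b_i for U_i = b_i * log(x_i), a 0/1
    routing matrix A with a few links per source, and uniform capacities."""
    rng = np.random.default_rng(seed)
    b = rng.uniform(1.0, 5.0, size=n_sources)     # utility weights
    A = np.zeros((m_links, n_sources))
    for i in range(n_sources):
        route = rng.choice(m_links, size=links_per_route, replace=False)
        A[route, i] = 1.0                         # source i uses these links
    c = rng.uniform(5.0, 10.0, size=m_links)      # link capacities
    return b, A, c

b, A, c = random_num_instance(n_sources=200, m_links=500)
```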

Case 1: Here, we consider a problem that involves n = 200 source nodes and m = 500 links. The cost value of the optimal solution, for a typical scenario generated as described above for this case, provided by the proposed distributed optimization method is 325.12. The solution for this scenario provided by the centralized interior point optimization is 324.85, which is close to the solution provided by the proposed distributed method. Fig. 5.1a shows the evolution of the total system cost with the number of iterations. Now, in order to carry out a more exhaustive numerical study of the optimality of the proposed method, we carried out simulations for 10 different scenarios for this case. In each scenario, the values of b_i, the A matrix, and the values c_j were randomly chosen. Fig. 5.1b and Tab. 5.1 show the comparison of the cost functions evaluated by the proposed distributed method and the centralized optimization method for the different scenarios for Case 1.

Tab. 5.1 also shows the computation times for the proposed distributed optimization method and the centralized method in solving the medium-sized problem. For the proposed distributed optimization method, the computation time in solving the problem is the time taken by the algorithm to finish 5000 iterations, as shown in Fig. 5.1a. Although Tab. 5.1 shows that the centralized optimization method takes less than one third of the time taken by the proposed distributed method, it may be noted that the distributed method converges close to the optimal solution long before the 5000 iterations, as seen in Fig. 5.1a. Furthermore, the algorithm runs on a single computer rather than in a parallel fashion, and the code is not optimized in the way the in-built Matlab optimization functions used for the centralized method are. Case 2: This case considers a network with 2500 source nodes and 4500 links. The evolution of the total cost for a typical scenario for Case 2 with the

Table 5.1: Comparison Between the Proposed Distributed Optimization and Central Optimization Method for Case 1 (200 Source Nodes and 500 Links)

Cost (Distributed) | Cost (Centralized) | Time, Distributed (secs) | Time, Centralized (secs)
325.12 | 324.85 | 73.28 | 24.13
292.84 | 292.72 | 71.78 | 26.91
284.31 | 284.05 | 77.99 | 23.29
287.93 | 287.79 | 79.42 | 28.75
279.47 | 279.04 | 78.85 | 21.77
289.06 | 288.90 | 79.90 | 22.16
302.52 | 302.18 | 81.49 | 22.45
287.90 | 287.56 | 73.01 | 24.60
308.01 | 307.67 | 72.64 | 28.39
282.56 | 282.28 | 79.25 | 23.84

number of iterations is shown in Fig. 5.2a. Fig. 5.2b and Tab. 5.2 show the comparison of the cost function evaluated by the proposed distributed method and the centralized optimization method for 10 different scenarios for Case 2. It may be noted here that, for convex optimization problems, the interior-point centralized optimization method always provides the globally optimum solution, provided enough iterations are performed. It may be noted in Tab. 5.2 that, although the proposed distributed optimization method is not implemented in a parallel fashion and the code is not optimized as in Matlab's in-built functions, the proposed distributed method takes less than one tenth of the computation time required by the centralized method. Furthermore, the computation time shown in Tab. 5.2 for the proposed distributed method is measured at the end of 10000 iterations, while the algorithm converges long before that.

The goal of this research is to develop a distributed interior-point optimization algorithm that can carry out the computations in a completely distributed manner with lower communication requirements and provide solutions close to the global optimum. It can be seen from both Tab. 5.1 and 5.2 that the distributed approach is capable of providing solutions close to the globally optimum solutions, and its computation time is significantly low for large problems. It may also be noted that a parallel, distributed implementation of the proposed method would lead to further reduction in the computation time.

Table 5.2: Comparison of Cost for Case 2 (2500 Source Nodes and 4500 Links)

Cost (Distributed) | Cost (Centralized) | Time, Distributed (secs) | Time, Centralized (secs)
5245.9 | 5240.2 | 9101.72 | 105092.55
5117.8 | 5114.6 | 8862.05 | 103101.73
5217.5 | 5211.4 | 8928.63 | 104040.08
5200.7 | 5196.7 | 8919.49 | 102389.33
5160.4 | 5156.7 | 8991.94 | 101848.65
5253.7 | 5251.0 | 9323.30 | 103038.27
5194.8 | 5190.2 | 9229.91 | 106188.58
5197.4 | 5193.1 | 9245.03 | 101379.18
5211.7 | 5209.7 | 9285.23 | 103629.97
5233.7 | 5227.1 | 9243.26 | 104426.86

[Figure 5.1: CASE 1 (200 Source Nodes and 500 Links). (a) Total cost vs. number of iterations for a typical scenario using the proposed distributed method. (b) Comparison of cost obtained by the proposed distributed method and the centralized optimization method for 10 different scenarios.]

[Figure 5.2: CASE 2 (2500 Source Nodes and 4500 Links). (a) Total cost vs. number of iterations for a typical scenario using the proposed distributed method. (b) Comparison of cost obtained by the proposed distributed method and the centralized optimization method for 10 different scenarios.]

Chapter 6

Approach and Simulation Results for Multi-Modal NUM Problem

In this chapter, the approach taken to solve the constrained NUM problem with a multi-modal cost function is discussed. It may be noted that most of the currently available distributed optimization techniques and results are applicable to convex functions. The choice of algorithms reduces dramatically when the cost function in question is multi-modal [84]. This is due to the fact that gradient descent techniques stop prematurely at the first local optimum they encounter. Distributed optimization methods have been used extensively in solving convex optimization problems, which involve unimodal functions, rather than multi-modal functions. Optimization with multi-modal functions is difficult because it is hard to evaluate the required search direction towards the global optimum solution. Multi-modal functions are very common in real-world applications, and some examples of them can be found in [126] and [127].

In this chapter, a primal-dual stochastic distributed approach is taken to solve the constrained NUM problem with a multi-modal cost function. In this method, the primal variables are updated using a combination of the Newton direction and a random noise. Similar to the primal variable update in chapter 5, each primal variable x_i is updated in a completely distributed manner given by the following equation:

Δx_k^i = −[∇²f_i(x_k^i)]^{-1} [∇f_i(x_k^i) + (A^T λ_k)_i] + N(μ, σ_k)   (6.1)

The above equation is similar to the primal variable update in Eq. (5.8); the only difference is that a Gaussian noise term (N(μ, σ_k), where μ and σ_k are the mean and standard deviation) is added to the search direction of the primal update. Since the cost function is multi-modal in nature, the random term is introduced to help the algorithm escape from local minima, which the gradient descent methods fail to achieve. The geometric interpretation of the random term can be explained using Fig. 6.1. The dotted line in Fig. 6.1 shows the gradient at a point A in the plot. It is clear from the figure that following the gradient direction will lead to the first minimum, which is a suboptimal solution, as shown in Fig. 6.1. The incorporation of the random term makes the new search direction the slope of a secant line, as shown by the solid arrows in Fig. 6.1. Since there can be a large number of secant lines at any point of a curve, the new search direction will depend on the values of the mean and standard deviation of the Gaussian noise. A search direction along the slope of the secant line helps in looking beyond the local minima and hence helps in searching for the global optimum point, as can be seen in Fig. 6.1. This method of using the slope of the secant as the search direction is utilized in [27], where the q-derivative points towards the slope of the secant at any point. Unlike the method in [27], the method developed in this dissertation does not require the computation of the q-derivative, which can be computationally complex. Furthermore, this method solves a constrained optimization problem, which is not considered in [27].

In order to converge to a solution, the standard deviation of the Gaussian noise is reduced in every iteration, so that as the number of iterations increases, the method eventually reduces to the distributed primal-dual interior point method described in chapter 5. This reduction of the randomness in the search direction is a concept used in Simulated Annealing, where the 'temperature' parameter reduces with the number of iterations. Apart from the randomness in the search direction, the new search point x_{k+1}^i is chosen probabilistically, similar to the Simulated Annealing method. Mathematically, the new search point is computed as:

If f_i(x_{k+1}^i) < f_i(x_k^i):
    x_{k+1}^i = x_k^i + d_k Δx_k^i
Otherwise:
    x_{k+1}^i = x_k^i + d_k Δx_k^i   with probability e^{−(f_i(x_{k+1}^i) − f_i(x_k^i))}
    x_{k+1}^i = x_k^i   with probability 1 − e^{−(f_i(x_{k+1}^i) − f_i(x_k^i))}   (6.2)

The above equation means that if there is an improvement in the cost function in the current search direction, then that step is taken deterministically, while if the cost function worsens in the current search direction, then that search direction is chosen with probability e^{−(f_i(x_{k+1}^i) − f_i(x_k^i))}. This probabilistic move in the search space is also similar to Simulated

[Figure 6.1: Geometric Interpretation of the Approach. The gradient (dotted line) and a secant direction (solid arrow) at a point A on a multi-modal utility curve.]

Annealing, the search direction takes into account the Newton direction at the current point, and hence a random yet directed move in the search space is made. Thus, in this stochastic distributed approach, the randomness in the search direction helps in exploring different regions of the search space, while the probabilistic move helps to avoid the worse regions. The problem considered in this dissertation is a constrained NUM problem, and hence the dual variables are updated in a distributed manner by using Eq. (5.10). The primal step size, d_k, and the dual step sizes, β^j_k, are computed following Theorems (5.2) and (5.1) respectively. The pseudo-code of the algorithm is provided in Algorithm 1.

Algorithm 1: Stochastic Distributed Primal-Dual Method
    Result: Primal-Dual Search Direction
    Initialize x^i_k, λ^j_k and σ_k;
    while the solution does not converge do
        Compute Δx^i_k and Δλ^j_k by Eq. (6.1) and Eq. (5.10);
        Compute d_k and β^j_k by Theorems (5.2) and (5.1);
        x^i_{k+1} = x^i_k + d_k Δx^i_k  ∀i;   λ^j_{k+1} = λ^j_k + β^j_k Δλ^j_k  ∀j;
        Accept or reject each x^i_{k+1} according to Eq. (6.2);
        Reduce σ_k;
    end
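As a concrete sketch, the loop above can be written as follows for a single scalar variable. The test function, step size, cooling factor and noise parameters below are illustrative stand-ins rather than the dissertation's NUM quantities, and a plain gradient replaces the Newton direction for simplicity:

```python
import math
import random

def noisy_descent(f, grad, x0, sigma0=1.0, gamma=0.99, d=0.1, iters=2000, seed=0):
    """Sketch of the noise-based primal update of Eq. (6.2): Gaussian noise is
    added to the descent direction, the noise standard deviation is reduced
    (annealed) every iteration, and worsening steps are accepted only with
    probability exp(-(f(x_new) - f(x)))."""
    rng = random.Random(seed)
    x, sigma = x0, sigma0
    for _ in range(iters):
        step = -grad(x) + rng.gauss(0.0, sigma)   # randomized search direction
        x_new = x + d * step
        delta = f(x_new) - f(x)
        # Improvement: accept deterministically; otherwise accept probabilistically.
        if delta < 0 or rng.random() < math.exp(-delta):
            x = x_new
        sigma *= gamma                             # anneal the randomness
    return x
```

With the noise switched off (sigma0 = 0) and a convex cost, the loop reduces to ordinary deterministic descent, mirroring how the full method collapses to the Chapter 5 update as σ_k → 0.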

6.1 Simulation Results

In this section, the simulation results of the proposed stochastic distributed method are provided. Similar to the NUM problem, the capacity of the links and the topology of the network (the A matrix) are chosen in a random manner. The cost function (negative of the utility function) of each source node is considered to be a sum of multiple exponential functions, as shown in Eq. (3.5). Here, the utility function is considered to be composed of a sum of 3 exponential functions (np = 3). The network is considered to consist of 200 source nodes and 500 links. To evaluate the performance of the proposed distributed algorithm in obtaining optimal solutions, the method is compared with the Genetic Algorithm optimization method with a population size of 900 and 900 generations. The solution (total utility) obtained by the proposed distributed optimization is 606.02, which is very close to the solution obtained by the centralized Genetic Algorithm method, which is 608.96. Even though the Genetic Algorithm provided a better solution (higher system utility), it took

16221.17 secs (approx. 4.5 hours) to find the solution, while the proposed distributed optimization method took only 164.84 secs. It may be noted that the computation time of the GA largely depends on the population size and the number of generations. Thus the computation time of the GA will reduce if fewer generations and a smaller population size are used, but that will cause the solution from the GA to worsen.

In this simulation, the proposed approach runs on a single computer and the code is not optimized; the computation time of the proposed method will thus decrease further when the algorithm works in a distributed and parallel manner. Fig. 6.2 shows the evolution of the overall system utility with the number of iterations. The red solid line in the figure only graphically shows the final solution of the Genetic Algorithm, not the evolution of the utility with iteration for the Genetic Algorithm.

The proposed distributed method is stochastic in nature, and hence it is expected to behave differently for different runs on the same scenario. The proposed noise-based distributed optimization method is compared with the Genetic Algorithm (GA), which is also stochastic in nature, for 20 different runs on one particular scenario consisting of 200 sources and 500 links. The results are shown in Figs. 6.3, 6.4 and 6.5. The proposed noise-based distributed method is compared with the GA for 3 different scenarios, each with a different routing matrix A, link capacities c_j and utility functions. The mean utility and the mean computation time of the proposed distributed method and the GA for the 3 different scenarios (20 runs in each scenario) are shown in Table 6.1. It can be seen that the proposed noise-based distributed optimization method performs comparably to the GA when optimality is considered, but

Figure 6.2: Evolution of System Utility with the Number of Iterations

the proposed method outperforms the GA when computation time is considered.

It may be noted that the GA uses a population size of 900. The Matlab GA toolbox used in this problem uses a stopping criterion for termination of the algorithm, and the computation time reported here corresponds to the time when the algorithm meets the stopping criterion. The computation time depends on the population size, and an optimal population size needs to be obtained for a given problem size. It may also be observed that a GA with a larger population size terminates earlier than a GA with a smaller population size. In this study, a population size of 900 was chosen on the higher side to ensure that the solution obtained was global in nature, since the focus here was to compare the optimality of the proposed noise-based stochastic distributed optimization method. The comparison of the

computation time between the GA and the noise-based stochastic distributed optimization method is thus only a qualitative indication of the computational efficiency of the proposed method. More investigation into the proper choice of parameters for the GA and the noise-based stochastic distributed method is needed to obtain a quantitative comparison of computation time between the two methods.

Figure 6.3: (Scenario 1) Comparison of Utility for 20 Different Runs

Table 6.1: Comparison Between the Proposed Distributed Optimization and Genetic Algorithm

Scenario | Mean Utility Using the Proposed Distributed Method | Mean Utility Using GA | Mean Computation Time Using the Proposed Distributed Method (secs) | Mean Computation Time Using GA (secs)
1 | 484.93 | 486.14 | 138.42 | 16306
2 | 460.15 | 459.84 | 140.37 | 16112
3 | 526.40 | 524.55 | 136.03 | 15851

Figure 6.4: (Scenario 2) Comparison of Utility for 20 Different Runs

Figure 6.5: (Scenario 3) Comparison of Utility for 20 Different Runs

Chapter 7

Applications

In this dissertation, two real-world application areas are considered: power grids and cloud computing systems. Both problems fall into the category of large-scale optimization problems, and both are of growing importance in modern applications. The optimization problems considered in this dissertation are the optimal power flow (OPF) problem in power grids and the utility maximization problem in cloud computing systems. In the following sections, these problems are discussed and the approaches taken to solve them in a distributed fashion are explained.

7.1 Optimal Power Flow Problem in Power Grids

Electrical power is delivered from the bulk generation units to the consumers through the transmission and distribution system. The electric distribution system is the final stage in the delivery of power, where the distribution network carries

power from the sub-station to the end users (customers). In the transmission system, with minor exceptions, electrical power cannot be stored, and thus the power generated should be as close to the power demand as possible. Therefore, a control system has to be incorporated to ensure this requirement while minimizing the total generation cost.

In the real world, power engineers perform optimization, monitoring and control of different aspects of power systems, including economic dispatch, state estimation, unit commitment, automatic generation control and optimal power flow (OPF). Among these tasks, OPF is considered an important one and has been significantly researched since its introduction. The goal of the OPF problem is to evaluate the power system network settings that optimize a certain objective function while satisfying the power flow equations, security requirements and equipment operational constraints. In order to achieve maximum asset utilization and autonomous functioning of the power grid, an allocation technique that can carry out decision making for OPF while incorporating distributed generation systems is required. In this dissertation, the power flow problem has been solved for the transmission network, where the total cost of power generation is considered to be the objective function.

In this optimization problem, the resources are the power generated from central power plants and distributed generation units such as solar panels, wind turbines, fuel cells, etc. The problem is a large-scale problem since it involves a large number of power production units, both big central power plants and small distributed generation units, and a huge number of end users. The scale of this problem can range from a microgrid to the whole American power grid. Apart from the high dimensionality of the problem, the resources are distributed and the available information is local in nature. These reasons, along with the deregulation of the electricity market, require an optimization method that can perform in a decentralized fashion.

At every instance of time, each consumer has a certain requirement of power, which has to be met by the power sources. Although an aspect of the smart grid (modernized power grid) treats the load or the demand of the consumers as a variable (demand-response), in this dissertation the power demand is considered to be a constant. The power requirement of each of the consumers can be met from a number of sources, but the most efficient allocation technique would result in the minimization of the overall cost. The cost can be modeled in a number of ways depending on the problem being solved. The usual practice in the optimal power flow (OPF) problem is to consider the total generation cost (the cost of power production) as the overall objective function. This kind of objective function is realistic and is used as a benchmark for OPF problems.

The optimal power flow problem is a non-linear optimization problem. The power transmission system comprises a number of buses which may contain only generation units, only loads (consumer demand), or both loads and generation units. The optimal power flow problem boils down to fulfillment of the consumer demand (the load at each bus) from the generation units while minimizing the overall cost of the network. In this dissertation, the DC optimal power flow problem is considered, which is an approximation of the AC optimal power flow using realistic assumptions. The assumptions include: i) small voltage angles, so that the sines and cosines can be approximated, ii) the magnitudes of the bus voltages are very close to one, iii) the resistance of the power lines is significantly less than the reactance, and iv) the real power flow through the lines is significantly higher than the reactive power.

The electric grid can be assumed to be composed of n buses that contain m generation units (price elastic, i.e., power generation changes with the change in price) with finite power production capacities, S_k (k = 1, 2, ..., m), and loads with finite power demands (price inelastic, i.e., the demand does not change with the change of price), d_i, where i = 1, 2, ..., n. Fig. 7.1 shows a typical IEEE 30-bus system. The electric grid can be represented as

Figure 7.1: Schematic Diagram of the IEEE 30-bus system. The horizontal bars represent the buses in the system; the generators are shown as circles attached to specific buses; the arrowheads represent the loads at different buses; and the transmission lines connect one bus with another.

a connected graph, G = (V, E), where V is the set of vertices or nodes and E is the set of edges or lines. The adjacency matrix A of the graph G represents the connectivity of the graph, i.e., A(i, j) = 1 if there exists an edge between node i and node j, and A(i, j) = 0 otherwise. The nodes or vertices V can be considered to represent the buses with loads and generators, and the set of edges or lines E is represented by the transmission lines connecting the buses. The graph G representing the grid is considered to be connected, i.e., there exists a set of edges that connects every node of G with every other node. The global optimization problem is given by:

    min_x  Σ_{i=1}^{m} F_i(x)        (7.1a)

    subject to  h(x) = 0             (7.1b)

                g(x) ≤ 0             (7.1c)

Eq. (7.1) shows the typical structure of the optimal power flow problem, where Eq. (7.1a) is the overall cost function and Eq. (7.1b) and (7.1c) are the equality and inequality constraints of the optimization problem. The decision variable for a DC OPF problem is x = [x_1, x_2, ..., x_n], where x_i = [P_Gi, θ_i]; it generally includes the real power generated by the generation units and the voltage angles of the nodes in the network. Eq. (7.1b) is called the load balance equation, which ensures that Kirchhoff's law is maintained. Eq. (7.1c) represents the branch flow constraints, so that the power flow capacity of each line is not exceeded. The cost function (Eq. (7.1a)) in optimal power flow problems is generally modeled as a quadratic function of the real power generation. The nonlinear optimal power flow problem can be expressed by the

following equations:

    min_x  Σ_{i=1}^{n} F_i(P_Gi)                                              (7.2a)

    such that  Σ_{i=1}^{n} [P_Gi − P_Di − Σ_{j∈N_i} B_ij(θ_i − θ_j)] = 0      (7.2b)

               B_ij(θ_i − θ_j) ≤ Limit_ij                                     (7.2c)

               θ_i,min ≤ θ_i ≤ θ_i,max    ∀i (i = 1, 2, ..., n)               (7.2d)

               P_i,min ≤ P_Gi ≤ P_i,max   ∀i (i = 1, 2, ..., n)               (7.2e)

Here the cost function in Eq. (7.2a) is given by the quadratic equation F_i(P_Gi) = c_i2 P_Gi² + c_i1 P_Gi + c_i0, where P_Gi is the active power generated and c_i2, c_i1 and c_i0 are the cost coefficients of the generation unit connected to node/bus i. The constraints of the global problem are given by Eqs. (7.2b)-(7.2e). Eq. (7.2b) is the load balance equation, where P_Di is the price-inelastic load at node/bus i; B_ij is the susceptance of the transmission line connecting buses i and j; θ_i and θ_j are the voltage phase angles at buses i and j respectively; B_ij(θ_i − θ_j) is the power transmitted between nodes i and j; and N_i is the set of neighboring nodes/buses of bus i, i.e., the nodes/buses directly connected to node i via a transmission line. Limit_ij in Eq. (7.2c) is the maximum allowable power transfer in the transmission line connecting nodes i and j, which takes care of line congestion. The decision variables (the variables that are needed to

be obtained to solve the optimization problem) are P_Gi and θ_i, i.e., for node i, the decision variable is x_i = [P_Gi, θ_i]. Eq. (7.2d) and Eq. (7.2e) provide the local constraints on the decision variables at node i. It may be noted here that, in general, not all the buses/nodes in the network contain generation units; for those buses/nodes, θ_i is the only decision variable, and it is computed to meet the constraints.
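The quadratic per-unit cost and the overall objective of Eq. (7.2a) are simple to evaluate; the sketch below shows their composition, with purely illustrative coefficient values:

```python
def generation_cost(p_g, c2, c1, c0):
    """Quadratic generation cost F_i(P_Gi) = c_i2*P_Gi**2 + c_i1*P_Gi + c_i0."""
    return c2 * p_g ** 2 + c1 * p_g + c0

def total_cost(generations, coeffs):
    """Overall objective of Eq. (7.2a): sum of the per-unit quadratic costs.
    `coeffs` holds the (c_i2, c_i1, c_i0) tuple of each generation unit."""
    return sum(generation_cost(p, c2, c1, c0)
               for p, (c2, c1, c0) in zip(generations, coeffs))
```

For example, a unit with coefficients (0.02, 2.0, 5.0) generating 10 MW incurs a cost of 0.02·100 + 2·10 + 5 = 27 units.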

In this approach, the centralized problem in Eq. (7.2) is decomposed into agent-level optimization problems that can be solved by each of the nodes. An iterative market-based approach has been used in this dissertation, where every node/bus adjusts its strategy in each iteration. In this approach, realistic decision variables, such as the real power generation from the generation units and the voltage angle of each node, are considered. In an iterative pricing mechanism, each node/bus adjusts its strategy according to the price signal received from the dealer or coordinator in the market. The strategies of each node/bus and of the coordinator or auctioneer are described below.

7.1.1 Buyer Strategy

The power network is a coupled system, which means that any change in a decision variable of one node affects the other nodes it is connected to, and this change propagates throughout the network. Thus information flow within the network is required to avoid suboptimal solutions. In this problem, each node i (i = 1, 2, ..., n) is considered to share its voltage angle information θ_i with the nodes in its neighborhood N_i, i.e., with the nodes it is directly connected to. At each iteration t, each node i in the transmission network solves the optimization problem given by:

    min_{P_Gi, θ_i, θ̂_j}  F_i((P_Gi)_t) + Σ_{j∈N_i} β((θ̂_j)_t − (θ_j)_t)² − (λ_i)_t (P_Gi)_t    (7.3a)

    subject to  (P_Gi)_t − P_Di − Σ_{j∈N_i} B_ij((θ_i)_t − (θ̂_j)_t) = 0                          (7.3b)

                B_ij((θ_i)_t − (θ̂_j)_t) ≤ Limit_ij                                               (7.3c)

As shown in Eq. (7.3a), the decision variables of node i (i = 1, 2, ..., n) include its active power generation (if a generation unit is attached to it), (P_Gi)_t, its voltage phase angle, (θ_i)_t, and the voltage phase angles of the nodes directly connected to it, θ̂_j, j ∈ N_i. Eq. (7.3b) is the load balance equation, and Eq. (7.3c) makes sure the lines are not congested. (λ_i)_t is the price signal computed by the dealer in the market. The cost function of each node in Eq. (7.3a) can be explained as follows: the F_i((P_Gi)_t) term is the power generation cost of the generation unit at node i; the second term, β((θ̂_j)_t − (θ_j)_t)², is the penalty for angle mismatch with node j, where (θ_j)_t is the voltage angle at node j computed by node j and β is a constant. This term in the cost function helps to minimize the angle mismatch between the nodes. The third term, (λ_i)_t (P_Gi)_t, can be interpreted as the profit gained by selling the generated power at the unit price (λ_i)_t computed by the dealer. Apart from the constraints shown in Eq. (7.3b) and Eq. (7.3c), each node also maintains the constraints in Eq. (7.2d) and Eq. (7.2e). This ensures that the load balancing and line congestion issues are taken care of in the agent-level optimization problem itself.

7.1.2 Dealer Strategy

The dealer maintains a rather simple strategy. The dealer generates the price signal that is used by each individual node in the network to re-adjust its decision variables. The price update rules in market mechanisms generally follow a subgradient update rule [128]. The price signals in a network are used to satisfy the network constraints; here, the price signal is generated from the load balance constraint given by Eq. (7.2b). The price update rule is given by:

    (λ_i)_{t+1} = (λ_i)_t + α( Σ_i P_Di − Σ_i (P_Gi)_t )        (7.4)

Here, α is the scalar step size at iteration t that controls the rate of price change. It can be seen in Eq. (7.4) that the price signal is updated according to the mismatch between the net power demand and the power generated. A shortage in total power production is thus compensated by raising the prices λ_i to encourage more power generation.
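The dealer update of Eq. (7.4) can be sketched as a single function; the demand, generation and step-size figures used in the example are invented for illustration:

```python
def dealer_price_update(price, alpha, demands, generations):
    """Subgradient price update of Eq. (7.4): the price moves in proportion
    to the mismatch between total demand and total generation, so a shortage
    raises the price and encourages more generation."""
    mismatch = sum(demands) - sum(generations)
    return price + alpha * mismatch
```

For instance, with total demand 80 MW, total generation 70 MW and α = 0.1, a price of 1.0 rises to 2.0; a surplus would instead lower the price.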

7.2 Utility Maximization in Cloud Computing

Cloud computing systems consist of a number of virtual desktops requesting resources, such as memory, processing and network bandwidth, from the resource providers, which are the virtual desktop cloud systems. The virtual desktops are generally allocated resources from the distributed data centers using a greedy approach while maintaining the network constraints. Over time, this opportunistic approach leads to a 'resource fragmentation' problem which affects the quality of experience of the virtual desktops. This requires the use of global optimization methods for 'defragmentation'. In the defragmentation process, the resource and data center assignments of each virtual desktop are re-evaluated.

The resource allocation problem in cloud computing is a large-scale optimization problem, as it includes a huge number of virtual desktops using bandwidth, memory and processing resources from a number of data centers. In this section, the problem is mathematically formulated and the approach taken to solve the defragmentation problem in cloud computing is shown. The whole system is composed of a number of virtual desktops (resource users), which can be categorized into a finite number of categories, and a number of data centers which provide the resources. The utility function of each virtual desktop is a function of its quality of experience, which depends on the amount of allocated resources. The goal of this problem is to maximize the overall system utility, measured as the sum of the utility functions of the virtual desktops, while maintaining the system constraints, which include the limits on the available resources. Apart from the resource constraint, two other constraints are required to be satisfied: a Quality Constraint and a Fairness Constraint. The quality constraint ensures a minimum quality of experience for the virtual desktops, while the fairness constraint ensures that the quality of experience of all the virtual desktops is the same.

Considering n virtual desktops (vd), ng user groups and nd data centers, the problem can be formulated as:

    max_x  U(x) = Σ_{i=1}^{n} U_i(x_i)   =   min_x  −U(x) = −Σ_{i=1}^{n} U_i(x_i)    (7.5a)

    such that  Σ_{i=1}^{n_j} x_i ≤ s_j   ∀j = 1, 2, ..., nd                           (7.5b)

               Σ_{j=1}^{nd} n_j = n                                                   (7.5c)

The above problem is a utility maximization problem. x_i ∈ R³ is a 3-dimensional vector whose first element represents the CPU resource, second element the RAM resource and third element the network bandwidth resource of each virtual desktop i. s_j ∈ R³ represents the capacity of the jth data center, a 3-dimensional vector whose first, second and third elements represent the total available CPU, RAM and network bandwidth resources in data center j. n_j represents the number of virtual desktops requesting resources from the jth data center.

The quality of experience of a virtual desktop is measured as a function of the allocated resources, and the utility of each virtual desktop is the product of its quality of experience and its latency. The utility function is seen to have an approximately linear relationship with the amount of allocated resources up to a point, after which it reaches a saturation level. The minimum quality requirement of the virtual desktop and the point at which the utility reaches saturation can be mapped to the minimum and the maximum amounts of resources required for each virtual desktop.
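This saturating shape can be sketched with a simple piecewise-linear model; the exact functional form and the parameter names (x_min, x_sat, u_max) are assumptions made for illustration, not the form used in the dissertation:

```python
def vd_utility(x, x_min, x_sat, u_max):
    """Illustrative saturating utility of an allocated resource x: zero below
    the minimum requirement x_min, approximately linear up to the saturation
    point x_sat, and constant at u_max beyond it."""
    if x <= x_min:
        return 0.0
    if x >= x_sat:
        return u_max
    return u_max * (x - x_min) / (x_sat - x_min)  # linear ramp between the two
```

Allocating more than x_sat yields no additional utility, which is why the defragmentation step can reclaim over-allocated resources without hurting quality of experience.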

In this cloud computing problem, 3 types of user groups are considered (ng = 3): i) Campus Computers, ii) Distance Learning and iii) Engineering Sites. The importance of each user within a group is the same, and thus the resources allocated to all the virtual desktops of a particular user group are the same. For example, if virtual desktops p and q belong to the same user group (say campus computers) and p is allocated resources [x1, x2, x3], then q will also be allocated resources [x1, x2, x3], although p and q might get their resources from different data centers.

Considering all the constraints, the central problem is divided into agent-level optimization problems. For each data center j, the cost of resource usage by a virtual desktop i is given by:

    C_j = min_{x_i}  ( −U_i(x_i^j) + λ_j^T x_i^j + Σ_{k=1}^{ng} (Q_i − Q_k)² + M_bj )    (7.6a)

    x_min^i ≤ x_i ≤ x_max^i,   x_i ∈ R³                                                  (7.6b)

Here, λ_j ∈ R³ is the cost of resources for the jth data center, Q_i is the quality of the ith virtual desktop, Q_k is the quality of the kth user group, and M_bj is the cost of moving (known as the migration cost) from the bth data center (b can be considered to be the data center to which i was assigned in the previous iteration) to the jth data center, such that M_bb = 0. The first term, U_i(x_i), is the utility of the ith virtual desktop; the second term, λ_j^T x_i^j, is the cost of resource utilization from the jth data center; the third term is used to make sure the quality of experience for all the user groups is the same; and the last term, M_bj, is the migration cost. The data center for the ith virtual desktop is chosen as l = arg min_j C_j (j = 1, 2, ..., nd), and the resource request is chosen as x_i^l. Thus the new allocation is: i takes x_i^l resources from the lth data center.

The prices of the resources for data center j are calculated as follows:

    λ_j(t + 1) = max( 0, λ_j(t) + α( Σ_{i=1}^{n_j} x_i^j − s_j ) )      (7.7)

Thus the prices stabilize when Σ_{i=1}^{n_j} x_i^j − s_j → 0. The prices of the resources are increased as the demand for the resources goes higher, and decreased when the demand decreases. In mathematical terms, the prices are updated using subgradient methods.

The above procedure can be explained with a simple example. Consider 3 data centers and 3 virtual desktops, with the initial prices of the resources of all the data centers set to zero. Suppose the first virtual desktop, after solving its own optimization problem, chooses the first data center since it receives the least cost from it, the second virtual desktop chooses the second data center, and the third virtual desktop chooses the first data center to receive its resources. According to the market-based method, since the first data center receives two requests, the second data center receives one request, and the third data center receives no request, the prices of the resources in the first and second data centers will increase, with a higher price increase in the first data center, while the prices of the resources in the third data center will not change. The price update can be rationalized using the economic interpretation that prices increase as the demand for resources increases. Thus, in the next iteration, as the cost of resource usage becomes higher in the first data center than in the second and third data centers, the first data center might receive no request while the second and third data centers receive one and two requests respectively. As before, the prices of the resources in the third data center will increase as its demand goes up, and the prices of the resources of the first data center will decrease since the demand for its resources decreases. This process continues till the prices converge and the optimal allocation of resources is achieved.
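The iteration sketched in this example — each virtual desktop picking the cheapest data center, followed by the price update of Eq. (7.7) — can be illustrated as follows. Scalar resources stand in for the 3-dimensional CPU/RAM/bandwidth vectors, and all numbers are illustrative:

```python
def choose_data_center(costs):
    """Each virtual desktop picks the data center with the least cost C_j."""
    return min(range(len(costs)), key=lambda j: costs[j])

def update_resource_prices(prices, alpha, requested, capacities):
    """Price update of Eq. (7.7): lambda_j(t+1) = max(0, lambda_j(t) +
    alpha * (total resources requested from data center j - its capacity))."""
    return [max(0.0, p + alpha * (r - c))
            for p, r, c in zip(prices, requested, capacities)]
```

With zero initial prices, an over-requested first data center, a balanced second one and an idle third one, only the first data center's price rises, steering later requests elsewhere — exactly the dynamic described in the example above.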

7.3 Simulation Results

In this section the simulation results are presented.

7.3.1 Optimal Power Flow Problem in Power Grids

In this section, the simulation results of the proposed distributed optimization method are presented. The IEEE 30-bus system is chosen as the test scenario. The first column of Table 7.1 shows the nodes with generation units, and the second column shows the corresponding capacities of those generation units. The proposed distributed optimization technique is applied to the above scenario, and the obtained optimal solution (i.e., the values of power generated at each generation unit) is shown in the third column (OPF Solution) of Table 7.1. This optimal solution is the total power required to be generated by the respective generation units so that the overall cost of power generation is minimized, while the demand at each bus is met and the constraints are satisfied. The optimal value of the cost function obtained using our approach is 565.23; the solution obtained using MATPOWER 4.0's inbuilt optimization scheme is 565.206. Fig. 7.2 shows the update of the price as the number of iterations increases. It can be seen that the price converges before 100 iterations. The net error at bus i, computed as Σ_{j∈N_i} ((θ_i)_t − (θ̂_i)_t), reduces to zero as the number of iterations

Table 7.1: OPF Solution for the IEEE 30-bus System Using the Market-Based Approach

Node | Generation Capability (MW) | OPF Solution (MW)
1 | 80 | 48.08
2 | 80 | 57.55
13 | 50 | 22.11
22 | 55 | 15.28
23 | 30 | 30.82
27 | 40 | 15.28

increases. Fig. 7.3 shows this update of the error for 5 randomly selected buses (buses 2, 8, 12, 24 and 29). Fig. 7.4 shows the solution to the DC optimal power flow problem for the first scenario, where the amount of power generated at each bus (that contains power generation units) is marked in red, the load (power demand) at each bus is marked in blue, and the amount of power flow in the transmission lines is marked in black. All the units of power are in megawatts (MW).

Different scenarios are generated for the IEEE 30-bus system with MATPOWER 4.0 by changing the demand conditions and generator capacities. Fig. 7.5 shows the comparison between MATPOWER 4.0 and the distributed market-based optimization solution for 10 different scenarios. It can be seen that the distributed market-based method and MATPOWER's inbuilt optimization scheme provide almost the same result. However, it may be noted that the proposed market-based method enables the computation to take place in a distributed manner using mostly local information, local decision variables and global price information. The proposed method derives its significance from its scalability properties, achieved due to the distributed nature of the computations. This allows the proposed method to be applied to large-scale systems such as future smart grids. It may be noted that the method presented in this chapter still uses limited global information.


Figure 7.2: Evolution of Price With Respect to Iteration

7.3.2 Cloud Computing Systems

In this section, the simulation results of solving the utility maximization problem in a cloud computing system are provided. In this problem, 3 data centers are considered (nd = 3), and each data center is considered to have 16 units of CPU, 32 units of memory and 50 units of network bandwidth. All the other data, including the quality measurement and latency of each virtual desktop, the number of virtual desktops in the system and the initial allocation, are taken from the Ohio Supercomputing Center. 6 different test scenarios are considered, and the results obtained from the distributed


Figure 7.3: Evolution of Error Values for 5 Randomly Chosen Buses With Respect to Iteration

market-based solution, as well as a comparison with the greedy approach, are provided in Table 7.2. Fig. 7.6 shows a bar plot of the comparative results.

Table 7.2: Comparison of Market Based Solution and Greedy Solution for 6 Different Cases

Test Scenario | Utility from Greedy Approach | Utility from Distributed Market-Based Approach
1 | 21.73 | 49.06
2 | 18.35 | 41.67
3 | 17.63 | 35.53
4 | 18.89 | 43.54
5 | 17.23 | 31.48
6 | 18.39 | 43.02

Figure 7.4: The Solution to the DC OPF Problem for Scenario 1; the Power Generated is Shown in Red, the Power Demand at Each Node is Shown in Blue, and the Power Flow in the Transmission Lines is Shown in Black


Figure 7.5: Comparison of Total Generation Costs Between MATPOWER and Market Based Solutions for 10 Different Scenarios


Figure 7.6: Performance of market-based method compared with a popular heuristic approach in cloud computing

Chapter 8

Dissertation Contributions and Future Scope

In this chapter, the major contributions of the dissertation and the future scope of this research are discussed.

8.1 Dissertation Contributions

This dissertation focuses on the development and application of distributed optimization methods for network flow problems in large-scale networked systems, and seeks to contribute to the growing literature on distributed optimization and control of such systems. The major contributions of the dissertation are that it presents different types of price-based mechanisms for different types of minimum cost network flow problems; it develops a novel distributed optimization method for maximum flow problems; it shows the applicability of some of the distributed optimization methods to real-world problems in power grids and cloud computing systems; and it develops a novel noise-based distributed optimization method for problems with multi-modal cost functions.

Even though a large body of literature is available on market-based techniques, there is a lack of guidelines on which mechanism to choose. It turns out that the choice of market mechanism depends heavily on the class of problems being solved. Chapter 4 provides a description of different types of market mechanisms and an illustration of how they are utilized to solve two different classes of problems: i) assignment of indivisible resources and ii) allocation of divisible resources. These problems have many academic and real-world applications. The resource allocation problem has been formulated in a number of ways; in this dissertation, we formulate some of the benchmark resource allocation problems as minimum cost network flow problems and identify different market mechanisms to solve them.
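As a concrete illustration of a market mechanism for the first class, the assignment of indivisible resources can be solved by an auction in which each unassigned agent bids for its most profitable item, raising that item's price by its profit margin over the second-best item. The sketch below follows the spirit of Bertsekas-style auction algorithms rather than any specific algorithm from Chapter 4; the instance, prices, and bid increment are illustrative only:

```python
# value[i][j]: agent i's value for item j (illustrative instance)
value = [[10.0, 6.0, 2.0],
         [ 8.0, 9.0, 3.0],
         [ 7.0, 5.0, 4.0]]
n = len(value)
price = [0.0] * n            # item prices, raised by competitive bidding
owner = [None] * n           # owner[j]: agent currently holding item j
eps = 0.01                   # bid increment; controls the optimality gap

unassigned = list(range(n))
while unassigned:
    i = unassigned.pop(0)
    # Net profit of each item for agent i at the current prices
    profit = [value[i][j] - price[j] for j in range(n)]
    best = max(range(n), key=lambda j: profit[j])
    second = max(p for j, p in enumerate(profit) if j != best)
    # Raise the price just enough to outbid the runner-up item by eps
    price[best] += profit[best] - second + eps
    if owner[best] is not None:
        unassigned.append(owner[best])   # the outbid agent re-enters
    owner[best] = i
```

With a small enough increment eps, auctions of this type terminate with an assignment whose total value is within n*eps of optimal; on this toy instance the diagonal assignment (total value 23) is recovered.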

The dissertation also develops a novel distributed primal-dual interior point method for maximum flow problems in networks that focuses on fast convergence, completely distributed computation of the primal and dual variables, lower communication requirements, and the generation of feasible primal and dual solutions at every iteration. This novel distributed optimization approach is explained in more detail in Chapter 5. Computation of the primal and dual variables in a completely distributed manner helps achieve faster convergence for large-scale systems as compared to centralized or semi-centralized methods. The generation of feasible primal and dual solutions at every iteration ensures that acceptable solutions are available in time-critical cases where the algorithm must stop prematurely, and also allows the system to operate in a dynamic manner. In addition, the feasible solutions at every iteration guarantee acceptable primal solutions in the presence of suboptimality in the dual solutions, which dual decomposition based subgradient methods cannot guarantee.

A Newton based method is proposed in this research for the update of the primal and dual variables in a distributed manner; it utilizes weighted update rules to generate feasible primal and dual solutions at each iteration. In this respect, this research is close to the work reported in [76]. However, this research contributes by developing a distributed primal-dual interior-point optimization method that provides a natural distributed solution for the computation of the primal as well as the dual variables without the need for an extra inner loop, avoiding the extra iterations and the error in dual variable computation involved in the matrix-splitting method. Moreover, the extra information exchange in [76] of the first and second derivatives of the cost function from the source nodes to the links, used to evaluate the dual variables, is not required in the proposed method. Furthermore, the complexity involved in computing the Newton decrement is avoided by using an upper bound of the Newton decrement. Although using this upper bound in place of the actual Newton decrement slows the convergence of the primal variables, it reduces the computational complexity and communication requirements, and thus helps in the development of distributed algorithms for large-scale systems.
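The feasible-at-every-iteration idea can be illustrated on a toy maximum flow instance: two parallel links from source to sink with capacities c, maximizing the total flow x1 + x2 under a log-barrier, with a fraction-to-boundary step rule that keeps every iterate strictly inside the capacity box. This is a small centralized sketch with made-up numbers, not the distributed algorithm of Chapter 5:

```python
# Maximize x1 + x2 with 0 < x_i < c_i via a log-barrier: every iterate is
# strictly feasible, so the run can stop early and still return a usable flow.
c = [4.0, 6.0]
x = [ci / 2.0 for ci in c]      # strictly feasible starting flow
mu = 1.0                        # barrier weight, halved each outer pass

for outer in range(40):
    for inner in range(50):
        # Per-link gradient/Hessian of x_i + mu*(log x_i + log(c_i - x_i))
        g = [1.0 + mu * (1.0 / xi - 1.0 / (ci - xi)) for xi, ci in zip(x, c)]
        h = [-mu * (1.0 / xi ** 2 + 1.0 / (ci - xi) ** 2) for xi, ci in zip(x, c)]
        dx = [-gi / hi for gi, hi in zip(g, h)]      # Newton ascent direction
        # Fraction-to-boundary damping keeps x strictly inside (0, c)
        limit = min(
            ((ci - xi) / di if di > 0 else -xi / di) if di != 0 else float("inf")
            for xi, ci, di in zip(x, c, dx)
        )
        alpha = min(1.0, 0.99 * limit)
        x = [xi + alpha * di for xi, di in zip(x, dx)]
        # g^T H^{-1} g is the (squared) Newton decrement; the distributed
        # method in the text replaces it with a cheaper upper bound
        if sum(gi * di for gi, di in zip(g, dx)) < 1e-12:
            break
    mu *= 0.5
```

The flow approaches the capacities [4, 6] as the barrier weight shrinks, while remaining strictly feasible at every iteration.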

This dissertation also relaxes the requirement of convex cost functions and develops a noise-based primal-dual distributed optimization method for problems with multi-modal cost functions. The general approach to solving these multi-modal optimization problems is to perform unimodal searches from multiple points in the search space [84]. Although there exist global optimization methods for multi-modal cost functions, such as the Genetic Algorithm [21], Simulated Annealing [23] and Particle Swarm Optimization [89], these methods require global information. They can be implemented in a parallel fashion, but a distinct feature of these methods is that the decision variables of each agent are generally of the same dimension as those of the overall problem. Moreover, these methods mostly depend on global inter-agent communication rather than the local inter-agent communication that is prevalent in networked systems. The noise-based distributed approach proposed in this dissertation can be considered conceptually similar to the q-derivative approach in [27], but the proposed method solves constrained problems and does not require the computation of the q-derivative, which can be computationally complex.
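The core noise-based idea can be sketched on a hypothetical one-dimensional multi-modal function: noise is added to the search direction and decayed over iterations, so the method explores widely early on and reduces to plain gradient descent late. The objective, seed, step size, and noise schedule below are all illustrative, not the parameters of the method in Chapter 6:

```python
import random

# Hypothetical multi-modal objective: (x^2 - 4)^2 + x has a local minimum
# near x = +2 and its global minimum near x = -2.
f = lambda x: (x * x - 4.0) ** 2 + x
df = lambda x: 4.0 * x * (x * x - 4.0) + 1.0

random.seed(0)
x = 2.1                       # start inside the basin of the local minimum
best_x, best_f = x, f(x)
step, sigma = 0.02, 20.0      # descent step and initial noise intensity

for k in range(5000):
    # Noise perturbs the search direction; its intensity decays so the
    # iteration explores early and behaves like pure descent late.
    x = x - step * (df(x) + sigma * random.gauss(0.0, 1.0))
    sigma *= 0.999
    if f(x) < best_f:
        best_x, best_f = x, f(x)
```

Depending on the noise realization, the iterate may hop out of the right-hand basin into the global one near x = -2; tracking the best point seen makes the run useful either way, and the decay schedule is exactly the quantity the mathematical analysis discussed in Section 8.2 would need to tune.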

Finally, this dissertation applies the distributed optimization methods to real-world problems in power grids and cloud computing systems. In the power grid, we solve the optimal power flow (OPF) problem, one of the most fundamental problems of the power system. Some literature exists on distributed optimization approaches to the OPF problem, such as [104, 102, 129, 130, 106], but in most of it the OPF problem is distributed into a small number of interconnected areas of the transmission network, and within each area the problem is solved in a centralized manner; this reduces the number of complicating constraints. The market-based distributed optimization approach proposed in this dissertation decomposes the whole problem down to the bus level in the power grid, and hence makes the solution much more decentralized in nature. This approach also reduces the computational burden on the auctioneer/dealer in the system and thus makes the approach much more scalable. The utility maximization problem in cloud computing systems is typically solved in a heuristic, centralized fashion. A distributed market-based approach is presented in this dissertation to solve the utility maximization problem in the cloud system and enhance the net system utility.
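The price-based mechanism underlying the market-based OPF solution can be sketched on a network-free economic dispatch toy problem: each generator bus solves a local profit maximization at the posted price, and an auctioneer moves the price in the subgradient direction of the supply-demand imbalance (dual decomposition with a subgradient dual update). Line constraints are omitted and the cost coefficients and demand are illustrative:

```python
# Two generator buses with quadratic costs cost_i(g) = a_i * g^2,
# serving a fixed demand (toy instance, no transmission constraints).
a = [0.5, 1.0]
g_max = [10.0, 10.0]
demand = 9.0
lam = 0.0                     # price: dual variable of the power balance
alpha = 0.1                   # subgradient step size

for k in range(300):
    # Local response at each bus: argmin_g a_i g^2 - lam*g on [0, g_max]
    # is lam / (2 a_i), clipped to the generation limits.
    g = [min(max(lam / (2.0 * ai), 0.0), gm) for ai, gm in zip(a, g_max)]
    # Price rises when demand exceeds supply and falls otherwise.
    lam += alpha * (demand - sum(g))
```

At convergence the price clears the market (total generation meets demand) and each generator's marginal cost 2 a_i g_i equals the price, here lam = 6 with g = [6, 3]; only the price and the aggregate imbalance travel over the network, which is what makes the decomposition scale to the bus level.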

8.2 Future Scope

In this dissertation, the application of the developed distributed optimization methods to network flow problems, together with their mathematical analysis, is provided. Apart from the problem discussed in Section 3.3, all the network flow problems considered are convex optimization problems. In recent years, many real-world optimization problems have been found to possess non-convex cost functions, which requires modification of the existing algorithms to address the non-convexity. Although some optimization techniques and their mathematical analysis exist for non-convex optimization problems, executing them in a distributed framework remains an open problem. A future research direction thus includes the development of distributed optimization methods, and their mathematical analysis, for non-convex optimization problems.

Chapter 6 presents the developed distributed optimization method for optimization problems with multi-modal cost functions. In this method, noise is added to the search direction to avoid suboptimal convergence and to explore different regions of the cost function. Although the simulation results presented in Chapter 6 show that the developed algorithm works fairly well in comparison to an existing method (the Genetic Algorithm), the algorithm still requires mathematical analysis. The future research scope in this direction thus includes a mathematical analysis of the developed distributed optimization method, in particular an analysis of the effect of the noise model on global optimality. Such analysis will help in choosing the right noise intensity at every iteration of the algorithm so that faster convergence is achieved.

In this dissertation, the dual variables (prices of the resources) are assumed to be computed by the sources in minimum cost network flow problems and by the links in maximum flow problems. In some problems, such as sensor networks or multi-robot task allocation, the sources or the links might not be equipped with sensing and computational capabilities, and the computation of dual variables in such cases is challenging. A future research direction thus includes the development of distributed optimization methods that explore different auction methods and learning methods to handle these issues.

Future scope in this research direction also includes the hardware implementation of the developed distributed optimization methods, which essentially requires performing the distributed optimization in a parallel computational framework. The challenges involved will include real-world problems of communication delay, loss of communication, asynchronous operation, and other computational constraints. This would require the development of robust algorithms and the modification of the existing methods so that they perform well in spite of these added constraints.

Bibliography

[1] C. Perez, Technological revolutions and financial capital: The dynamics of bubbles and golden ages. Edward Elgar Publishing, 2002.

[2] R. A. Giliano and P. Mitchem, Market based control: A paradigm for distributed resource allocation, ch. Valuation of Network Computing Resources, pp. 28–52. World Scientific Publishing, 1996.

[3] K. Harty and D. Cheriton, Market based control: A paradigm for distributed resource allocation, ch. A market approach to operating system memory allocation, pp. 126–155. World Scientific Publishing, 1996.

[4] D. F. Ferguson, C. Nickolaou, J. Sairamesh, and Y. Yemini, Market based control: A paradigm for distributed resource allocation, ch. Economic Models for Allocating Resources in Computer Systems, pp. 156–183. World Scientific Publishing, 1996.

[5] K. Kuwabara, T. Ishida, Y. Nishibe, and T. Suda, Market based control: A paradigm for distributed resource allocation, ch. An equilibratory market-based approach for distributed resource allocation and its application to communication network control, pp. 53–73. World Scientific Publishing, 1996.

[6] A. D. Baker, Market based control: A paradigm for distributed resource allocation, ch. A case study where agents bid with actual costs to schedule a factory, pp. 184–223. World Scientific Publishing, 1996.

[7] A. V. Goldberg, É. Tardos, and R. E. Tarjan, “Network flow algorithms,” tech. rep., DTIC Document, 1989.

[8] H. Terelius, U. Topcu, and R. Murray, “Decentralized multi-agent optimization via dual decomposition,” in World Congress of the International Federation of Automatic Control, IFAC, 2011.

[9] D. P. Palomar and M. Chiang, “A tutorial on decomposition methods for network utility maximization,” Selected Areas in Communications, IEEE Journal on, vol. 24, no. 8, pp. 1439–1451, 2006.

[10] J. Baillieul and P. Antsaklis, “Control and communication challenges in networked real-time systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 9–28, 2007.

[11] S. Sundhar Ram, A. Nedić, and V. Veeravalli, “Distributed stochastic subgradient projection algorithms for convex optimization,” Journal of Optimization Theory and Applications, vol. 147, no. 3, pp. 516–545, 2010.

[12] Y. Lu, “An integrated algorithm for distributed optimization in networked systems,” HKU Theses Online (HKUTO), 2010.

[13] F. P. Kelly, A. K. Maulloo, and D. K. Tan, “Rate control for communication networks: shadow prices, proportional fairness and stability,” Journal of the Operational Research Society, vol. 49, no. 3, pp. 237–252, 1998.

[14] P. Antsaklis and J. Baillieul, “Special issue on technology of networked control systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 5–8, 2007.

[15] F. Bullo, J. Cortés, and B. Piccoli, “Special issue on control and optimization in cooperative networks,” SIAM Journal on Control and Optimization, vol. 48, 2009.

[16] H. Taha, Operations research: an introduction, vol. 8. Prentice Hall, Upper Saddle River, NJ, 1997.

[17] M. Wellman, “A market-oriented programming environment and its application to distributed multicommodity flow problems,” arXiv preprint cs/9308102, 1993.

[18] S. Boyd, L. Xiao, and A. Mutapcic, “Subgradient methods,” lecture notes, Stanford University, Autumn Quarter, vol. 2004, 2003.

[19] A. Nedic and A. Ozdaglar, “Distributed subgradient methods for multi-agent optimization,” IEEE Transactions on Automatic Control, vol. 54, no. 1, pp. 48–61, 2009.

[20] A. Jadbabaie, A. Ozdaglar, and M. Zargham, “A distributed Newton method for network optimization,” in Proceedings of the 48th IEEE Conference on Decision and Control, held jointly with the 2009 28th Chinese Control Conference (CDC/CCC), pp. 2736–2741, IEEE, 2009.

[21] D. Goldberg, Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, 1989.

[22] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings, IEEE International Conference on Neural Networks, vol. 4, pp. 1942–1948, IEEE, 1995.

[23] S. Kirkpatrick, C. Gelatt Jr, and M. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, pp. 671–680, 1983.

[24] J. Liang, A. Qin, P. Suganthan, and S. Baskar, “Comprehensive learning particle swarm optimizer for global optimization of multimodal functions,” Evolutionary Computation, IEEE Transactions on, vol. 10, no. 3, pp. 281–295, 2006.

[25] L. de Castro and J. Timmis, “An artificial immune network for multimodal function optimization,” in CEC’02, Proceedings of the 2002 Congress on Evolutionary Computation, vol. 1, pp. 699–704, IEEE, 2002.

[26] B. Miller and M. Shaw, “Genetic algorithms with dynamic niche sharing for multimodal function optimization,” in Proceedings of IEEE International Conference on Evolutionary Computation, pp. 786–791, IEEE, 1996.

[27] A. C. Soterroni, R. L. Galski, and F. M. Ramos, “The q-gradient method for global optimization,” to appear in Journal of Global Optimization, pp. 1–13, 2012.

[28] J. Momoh, “Smart grid design for efficient and flexible power networks operation and control,” in Power Systems Conference and Exposition (PSCE ’09), IEEE/PES, pp. 1–8, IEEE, 2009.

[29] E. Santacana, G. Rackliffe, L. Tang, and X. Feng, “Getting smart,” IEEE Power and Energy Magazine, vol. 8, no. 2, pp. 41–48, 2010.

[30] P. Nguyen, W. Kling, G. Georgiadis, M. Papatriantafilou, L. Bertling, et al., “Distributed routing algorithms to manage power flow in agent-based active distribution network,” in Innovative Smart Grid Technologies Conference Europe (ISGT Europe), IEEE PES, pp. 1–7, IEEE, 2010.

[31] G. Dantzig and P. Wolfe, “Decomposition principle for linear programs,” Operations research, pp. 101–111, 1960.

[32] M. Zhu and S. Martinez, “On distributed optimization under inequality constraints via Lagrangian primal-dual methods,” in American Control Conference (ACC), pp. 4863–4868, IEEE, 2010.

[33] G. J. Silverman, “Primal decomposition of mathematical programs by resource allocation: I—basic theory and a direction-finding procedure,” Operations Research, vol. 20, no. 1, pp. 58–74, 1972.

[34] A. Geoffrion and G. Graves, “Multicommodity distribution system design by Benders decomposition,” Management Science, pp. 822–844, 1974.

[35] J. Cordeau, F. Pasin, and M. Solomon, “An integrated model for logistics network design,” Annals of Operations Research, vol. 144, no. 1, pp. 59– 82, 2006.

[36] J. Cordeau, F. Soumis, and J. Desrosiers, “A Benders decomposition approach for the locomotive and car assignment problem,” Transportation Science, vol. 34, no. 2, pp. 133–149, 2000.

[37] J. Cordeau, F. Soumis, and J. Desrosiers, “Simultaneous assignment of locomotives and cars to passenger trains,” Operations Research, pp. 531–548, 2001.

[38] M. Florian, G. Guérin, and G. Bushell, The engine scheduling problem in a railway network. Département d’informatique, Université de Montréal, 1972.

[39] G. Zhao, “A log-barrier method with Benders decomposition for solving two-stage stochastic linear programs,” Mathematical Programming, vol. 90, no. 3, pp. 507–536, 2001.

[40] M. Fisher, “The Lagrangian relaxation method for solving integer programming problems,” Management Science, pp. 1–18, 1981.

[41] C. Tan, D. Palomar, and M. Chiang, “Distributed optimization of coupled systems with applications to network utility maximization,” in ICASSP 2006 Proceedings, IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 5, pp. V–V, IEEE, 2006.

[42] B. Awerbuch, R. Khandekar, and S. Rao, “Distributed algorithms for multicommodity flow problems via approximate steepest descent framework,” in Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, pp. 949–957, Society for Industrial and Applied Mathematics, 2007.

[43] H. Li and J. Tsai, “A distributed computation algorithm for solving portfolio problems with integer variables,” European Journal of Operational Research, vol. 186, no. 2, pp. 882–891, 2008.

[44] T. Van Roy, “Cross decomposition for mixed integer programming,” Mathematical programming, vol. 25, no. 1, pp. 46–63, 1983.

[45] M. Yokoo and K. Hirayama, “Algorithms for distributed constraint satisfaction: A review,” Autonomous Agents and Multi-Agent Systems, vol. 3, no. 2, pp. 185–207, 2000.

[46] P. Modi, W. Shen, M. Tambe, and M. Yokoo, “Adopt: Asynchronous distributed constraint optimization with quality guarantees,” Artificial Intelligence, vol. 161, no. 1, pp. 149–180, 2005.

[47] R. Maheswaran, M. Tambe, E. Bowring, J. Pearce, and P. Varakantham, “Taking DCOP to the real world: Efficient complete solutions for distributed multi-event scheduling,” in Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1, pp. 310–317, IEEE Computer Society, 2004.

[48] E. Kaplansky and A. Meisels, “Distributed personnel scheduling—negotiation among scheduling agents,” Annals of Operations Research, vol. 155, no. 1, pp. 227–255, 2007.

[49] H. Voos, “Resource allocation in continuous production using market-based multi-agent systems,” in 5th IEEE International Conference on Industrial Informatics, vol. 2, pp. 1085–1090, IEEE, 2007.

[50] D. Bertsekas and J. Tsitsiklis, “Parallel and distributed computation,” 1989.

[51] J. Tsitsiklis, “Problems in decentralized decision making and computation,” tech. rep., DTIC Document, 1984.

[52] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, “Randomized gossip algorithms,” IEEE Transactions on Information Theory, vol. 52, no. 6, pp. 2508–2530, 2006.

[53] S. Clearwater, Market-based control: A paradigm for distributed resource allocation. World Scientific Pub Co Inc, 1996.

[54] J. Kurose and R. Simha, “A microeconomic approach to optimal resource allocation in distributed computer systems,” IEEE Transactions on Computers, vol. 38, no. 5, pp. 705–717, 1989.

[55] D. Bertsekas, “Auction algorithms for network flow problems: A tutorial introduction,” Computational Optimization and Applications, vol. 1, no. 1, pp. 7–66, 1992.

[56] G. Schreiber, Knowledge engineering and management: the CommonKADS methodology. The MIT Press, 2000.

[57] F. Ygge, Market-oriented programming and its application to power load management. PhD thesis, Lund University, 1998.

[58] M. Karlsson, F. Ygge, and A. Andersson, “Market-based approaches to optimization,” Computational Intelligence, vol. 23, no. 1, pp. 92–109, 2007.

[59] R. Buyya, D. Abramson, J. Giddy, and H. Stockinger, “Economic models for resource management and scheduling in grid computing,” Concurrency and Computation: Practice and Experience, vol. 14, no. 13-15, pp. 1507–1542, 2002.

[60] Z. Tan and J. Gurd, “Market-based grid resource allocation using a stable continuous double auction,” in Proceedings of the 8th IEEE/ACM International Conference on Grid Computing, pp. 283–290, IEEE Computer Society, 2007.

[61] J. Stiglitz, “Pareto optimality and competition,” The Journal of Finance, vol. 36, no. 2, pp. 235–251, 1981.

[62] R. Freeman, P. Yang, and K. Lynch, “Stability and convergence properties of dynamic average consensus estimators,” in 45th IEEE Conference on Decision and Control, pp. 338–343, IEEE, 2006.

[63] A. Jadbabaie, J. Lin, and A. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.

[64] R. Olfati-Saber, J. A. Fax, and R. M. Murray, “Consensus and cooperation in networked multi-agent systems,” Proceedings of the IEEE, vol. 95, no. 1, pp. 215–233, 2007.

[65] M. H. DeGroot, “Reaching a consensus,” Journal of the American Statistical Association, vol. 69, no. 345, pp. 118–121, 1974.

[66] N. A. Lynch, Distributed algorithms. Morgan Kaufmann, 1996.

[67] R. Olfati-Saber and R. M. Murray, “Consensus problems in networks of agents with switching topology and time-delays,” Automatic Control, IEEE Transactions on, vol. 49, no. 9, pp. 1520–1533, 2004.

[68] L. Moreau, “Stability of multiagent systems with time-dependent communication links,” Automatic Control, IEEE Transactions on, vol. 50, no. 2, pp. 169–182, 2005.

[69] W. Ren and R. W. Beard, “Consensus seeking in multiagent systems under dynamically changing interaction topologies,” Automatic Control, IEEE Transactions on, vol. 50, no. 5, pp. 655–661, 2005.

[70] V. Blondel, J. Hendrickx, A. Olshevsky, and J. Tsitsiklis, “Convergence in multiagent coordination, consensus, and flocking,” in Decision and Control, 2005 and 2005 European Control Conference (CDC-ECC ’05), 44th IEEE Conference on, pp. 2996–3000, IEEE, 2005.

[71] B. Johansson, On distributed optimization in networked systems. Elektro- och systemteknik, Kungliga Tekniska högskolan, 2008.

[72] A. Nedic, A. Ozdaglar, and P. Parrilo, “Constrained consensus and optimization in multi-agent networks,” IEEE Transactions on Automatic Control, vol. 55, no. 4, pp. 922–938, 2010.

[73] K. Kiwiel, “Convergence of approximate and incremental subgradient methods for convex optimization,” SIAM Journal on Optimization, vol. 14, p. 807, 2004.

[74] A. Nedic and D. Bertsekas, “Incremental subgradient methods for nondifferentiable optimization,” SIAM Journal on Optimization, vol. 12, no. 1, pp. 109–138, 2001.

[75] B. Johansson, M. Rabi, and M. Johansson, “A randomized incremental subgradient method for distributed optimization in networked systems,” SIAM Journal on Optimization, vol. 20, no. 3, p. 1157, 2009.

[76] E. Wei, A. Ozdaglar, and A. Jadbabaie, “A distributed Newton method for network utility maximization,” in Decision and Control (CDC), 2010 49th IEEE Conference on, pp. 1816–1821, IEEE, 2010.

[77] A. W. Colombo, R. Schoop, and R. Neubert, “Collaborative (agent-based) factory automation,” ch. 109 in The Industrial Information Technology Handbook. CRC Press, 2004.

[78] N. Jennings and S. Bussmann, “Agent-based control systems,” IEEE Control Systems, vol. 23, no. 3, pp. 61–74, 2003.

[79] F. Klügl, A. Bazzan, and S. Ossowski, Applications of agent technology in traffic and transportation. Birkhäuser Verlag, 2005.

[80] J. Von Neumann, O. Morgenstern, A. Rubinstein, and H. Kuhn, Theory of games and economic behavior. Princeton University Press, 2007.

[81] C. Boutilier, Y. Shoham, and M. Wellman, “Economic principles of multi-agent systems,” Artificial Intelligence, vol. 94, no. 1-2, pp. 1–6, 1997.

[82] J. Ferber, Multi-agent systems: an introduction to distributed artificial intelligence. Addison-Wesley Longman Publishing Co., Inc., 1999.

[83] P. Gaudiano, B. Shargel, E. Bonabeau, and B. Clough, “Swarm intelligence: A new C2 paradigm with an application to control swarms of UAVs,” tech. rep., DTIC Document, 2003.

[84] A. Corana, M. Marchesi, C. Martini, and S. Ridella, “Minimizing multimodal functions of continuous variables with the “simulated annealing” algorithm,” ACM Transactions on Mathematical Software (TOMS), vol. 13, no. 3, pp. 262–280, 1987.

[85] F. De França, F. Von Zuben, and L. De Castro, “An artificial immune network for multimodal function optimization on dynamic environments,” in Proceedings of the 2005 Conference on Genetic and Evolutionary Computation, pp. 289–296, ACM, 2005.

[86] L. Shengsong, W. Min, and H. Zhijian, “Hybrid algorithm of chaos optimisation and SLP for optimal power flow problems with multimodal characteristic,” in IEE Proceedings on Generation, Transmission and Distribution, vol. 150, pp. 543–547, IET, 2003.

[87] M. S. Talebi, A. Khonsari, M. H. Hajiesmaili, and S. Jafarpour, “A suboptimal network utility maximization approach for scalable multimedia applications,” in Global Telecommunications Conference (GLOBECOM 2009), IEEE, pp. 1–6, IEEE, 2009.

[88] P. Onate Yumbla, J. Ramirez, and C. Coello Coello, “Optimal power flow subject to security constraints solved with a particle swarm optimizer,” IEEE Transactions on Power Systems, vol. 23, no. 1, pp. 33–40, 2008.

[89] Y. Shi and R. C. Eberhart, “Empirical study of particle swarm optimization,” in Evolutionary Computation, 1999 (CEC 99), Proceedings of the 1999 Congress on, vol. 3, IEEE, 1999.

[90] J. Momoh and J. Zhu, “Improved interior point method for OPF problems,” IEEE Transactions on Power Systems, vol. 14, no. 3, pp. 1114–1120, 1999.

[91] A. Santos Jr and G. Da Costa, “Optimal-power-flow solution by Newton’s method applied to an augmented Lagrangian function,” in IEE Proceedings - Generation, Transmission and Distribution, vol. 142, pp. 33–36, IET, 1995.

[92] O. Alsac and B. Stott, “Optimal load flow with steady-state security,” IEEE Transactions on Power Apparatus and Systems, no. 3, pp. 745–751, 1974.

[93] J. Weber, Implementation of a Newton-based optimal power flow into a power system simulation environment. PhD thesis, University of Illinois, 1997.

[94] J. Yuryevich and K. Wong, “Evolutionary programming based optimal power flow algorithm,” IEEE Transactions on Power Systems, vol. 14, no. 4, pp. 1245–1250, 1999.

[95] R. Gnanadass, P. Venkatesh, and N. Padhy, “Evolutionary programming based optimal power flow for units with non-smooth fuel cost functions,” Electric Power Components and Systems, vol. 33, no. 3, pp. 349–361, 2004.

[96] M. Osman, M. Abo-Sinna, and A. Mousa, “A solution to the optimal power flow using genetic algorithm,” Applied Mathematics and Computation, vol. 155, no. 2, pp. 391–405, 2004.

[97] M. Abido, “Optimal power flow using particle swarm optimization,” International Journal of Electrical Power & Energy Systems, vol. 24, no. 7, pp. 563–571, 2002.

[98] M. Abido, “Optimal power flow using tabu search algorithm,” Electric Power Components and Systems, vol. 30, no. 5, pp. 469–483, 2002.

[99] C. Roa-Sepulveda and B. Pavez-Lazo, “A solution to the optimal power flow using simulated annealing,” International Journal of Electrical Power & Energy Systems, vol. 25, no. 1, pp. 47–57, 2003.

[100] J. Jacobo and D. De Roure, “A decentralised dc optimal power flow model,” in DRPT 2008, Third International Conference on Electric Utility Deregulation and Restructuring and Power Technologies, pp. 484–490, IEEE, 2008.

[101] F. Nogales, F. Prieto, and A. Conejo, “A decomposition methodology applied to the multi-area optimal power flow problem,” Annals of Operations Research, vol. 120, no. 1, pp. 99–116, 2003.

[102] B. Kim and R. Baldick, “A comparison of distributed optimal power flow algorithms,” IEEE Transactions on Power Systems, vol. 15, no. 2, pp. 599–604, 2000.

[103] D. Tylavsky, A. Bose, F. Alvarado, R. Betancourt, K. Clements, G. Heydt, G. Huang, M. Ilic, M. La Scala, and M. Pai, “Parallel processing in power systems computation,” IEEE Transactions on Power Systems, vol. 7, no. 2, 1992.

[104] B. Kim and R. Baldick, “Coarse-grained distributed optimal power flow,” IEEE Transactions on Power Systems, vol. 12, no. 2, pp. 932–939, 1997.

[105] A. Conejo and J. Aguado, “Multi-area coordinated decentralized DC optimal power flow,” Power Systems, IEEE Transactions on, vol. 13, no. 4, pp. 1272–1278, 1998.

[106] G. Hug-Glanzmann and G. Andersson, “Decentralized optimal power flow control for overlapping areas in power systems,” IEEE Transactions on Power Systems, vol. 24, no. 1, pp. 327–336, 2009.

[107] B. HomChaudhuri, M. Kumar, and V. Devabhaktuni, “Market based approach for solving optimal power flow problem in smart grid,” in American Control Conference (ACC), 2012, pp. 3095–3100, IEEE, 2012.

[108] B. HomChaudhuri and M. Kumar, “Market based allocation of power in smart grid,” in American Control Conference (ACC), 2011, pp. 3251–3256, IEEE, 2011.

[109] B. HomChaudhuri and M. Kumar, “Market-based distributed optimization approaches for three classes of resource allocation problems,” Parallel and Distributed Computing and Networks 2012, vol. 1, no. 1, 2012.

[110] B. HomChaudhuri, M. Kumar, and V. Devabhaktuni, “A market based distributed optimization for power allocation in smart grid,” in Dynamic Systems and Control Conference, ASME, 2011.

[111] G. Lee, N. Tolia, P. Ranganathan, and R. H. Katz, “Topology-aware resource allocation for data-intensive workloads,” in Proceedings of the first ACM asia-pacific workshop on Workshop on systems, pp. 1–6, ACM, 2010.

[112] A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, “Dominant resource fairness: Fair allocation of heterogeneous resources in datacenters,” University of California, Berkeley, EECS Department, Technical Report UCB/EECS-2010-55, 2010.

[113] H. N. Van, F. D. Tran, and J.-M. Menaud, “Sla-aware virtual resource management for cloud infrastructures,” in Computer and Information Technology, 2009. CIT’09. Ninth IEEE International Conference on, vol. 1, pp. 357–362, IEEE, 2009.

[114] M. Fisher, R. Jaikumar, and L. Van Wassenhove, “A multiplier adjustment method for the generalized assignment problem,” Management Science, pp. 1095–1103, 1986.

[115] D. Pentico, “Assignment problems: A golden anniversary survey,” European Journal of Operational Research, vol. 176, no. 2, pp. 774–793, 2007.

[116] M. Grigoriadis, D. Tang, and L. Woo, “Considerations in the optimal synthesis of some communication networks,” in 45th Joint National ORSA/TIMS Meeting, 1974.

[117] M. Fisher and R. Jaikumar, “A generalized assignment heuristic for vehicle routing,” Networks, vol. 11, no. 2, pp. 109–124, 1981.

[118] V. Balachandran, “An integer generalized transportation model for optimal job assignment in computer networks,” Operations Research, pp. 742–759, 1976.

[119] D. Gross and C. Pinkus, “Optimal allocation of ships to yards for regular overhauls,” Tech. Memorandum, vol. 63095, 1972.

[120] B. Gerkey and M. Mataric, “Multi-robot task allocation: Analyzing the complexity and optimality of key architectures,” in Proceedings, IEEE International Conference on Robotics and Automation (ICRA ’03), vol. 3, pp. 3862–3868, IEEE, 2003.

[121] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge University Press, 2004.

[122] M. Donato, M. Milasi, and C. Vitanza, “Dynamic Walrasian price equilibrium problem: evolutionary variational approach with sensitivity analysis,” Optimization Letters, vol. 2, no. 1, pp. 113–126, 2008.

[123] A. Zymnis, N. Trichakis, S. Boyd, and D. O’Neill, “An interior-point method for large scale network utility maximization,” in Proceedings of the Allerton Conference on Communication, Control, and Computing, Citeseer, 2007.

[124] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty, Nonlinear programming: theory and algorithms. Wiley-Interscience, 2006.

[125] J. E. Dennis and R. B. Schnabel, Numerical methods for unconstrained optimization and nonlinear equations, vol. 16. Society for Industrial Mathematics, 1987.

[126] D. Crutchley and M. Zwolinski, “Using evolutionary and hybrid algorithms for DC operating point analysis of nonlinear circuits,” in CEC’02, Proceedings of the 2002 Congress on Evolutionary Computation, vol. 1, pp. 753–758, IEEE, 2002.

[127] Z. Dong, M. Lu, Z. Lu, and K. Wong, “A differential evolution based method for power system planning,” in CEC 2006. IEEE Congress on Evolutionary Computation, pp. 2699–2706, IEEE, 2006.

[128] D. Bertsekas, Nonlinear programming. Athena Scientific, 1999.

[129] E. Dall’Anese, H. Zhu, and G. B. Giannakis, “Distributed optimal power flow for smart microgrids,” to appear in IEEE Transactions on Smart Grid, 2012.

[130] R. Baldick, B. H. Kim, C. Chase, and Y. Luo, “A fast distributed implementation of optimal power flow,” Power Systems, IEEE Transactions on, vol. 14, no. 3, pp. 858–864, 1999.
