Generalized Hopfield Networks for Constrained Optimization

Generalized Hop eld Networks for Constrained Optimization Jan van den Berg Email: jvandenb [email protected] Abstract A twofold generalization of the classical continuous Hop eld neural network for model ling constrained optimization problems is proposed. On the one hand, non-quadratic cost functions are admitted corresponding to non-linear output summation functions in the neurons. On the other hand it is shown under which conditions various new types of constraints can be incorporated directly. The stability properties of several relaxation schemes are shown. If a direct incorporation of the constraints appears to be impossible, the Hop eld-Lagrange model can be applied, the stability properties of which are analyzed as wel l. Another good way to deal with constraints is by means of dynamic penalty terms, using mean eld annealing in order to end up in a feasible solution. A famous example in this context is the elastic net, although it seems impossible - contrary to what is suggested in the literature - to derive the architecture of this network from a constrained Hop- eld model. Furthermore, a non-equidistant elastic net is proposed and its stability properties are compared to those of the classical elastic network. In addition to certain simulation results as known from the literature, most theoretical statements of this paper are validated with simulations of toy problems while in some cases, more sophisti- cated combinatorial optimization problems have been tried as wel l. In the nal section, we discuss the possibilities of applying the various models in the area of constrained optimization. It is also demonstrated how the new ideas as inspired by the analysis of generalized continuous Hop eld models, can betransferredto discrete stochastic Hop eld models. By doing so, simulating annealing can be exploited in other to improve the quality of solutions. The transfer also opens new avenues for continued theoretical research. 1 Intro duction We start presenting a general outline of how Hop eld neural networks are used in the area of combinatorial optimization. Since the source of inspiration in this pap er is supplied by the classical continuous Hop eld net, we next recall this mo del to mind. In the consecutive subsections, we present { in a concise overview { whichvariations and extensions of the original mo del are prop osed after its app earance. All this information serves as a preparation for the rest of the pap er, the outline of whichissketched in the nal subsection of this intro duction. 1.1 Hop eld networks, combinatorial optimization, and statistical physics Hop eld and allied networks have b een used in applications which are mo delled as an `asso ciative memory', and in problems emanating from the eld of `combinatorial optimization' ever since their conception [14,15]. In b oth typ es of applications, an energy or cost function is minimized, while { in case of dealing with combinatorial optimization problems{agiven set of constraints should b e ful lled as well. In the latter case, the problem can formally b e stated as minimize E x sub ject to : C x=0; =1;:::;m; 1 where x =x ;x ;:::;x is the state vector or system state of the neural net, x representing the 1 2 n i output neuron i, and where any C x = 0 is a constraint. If the value of the state vector is such that 8 : C x = 0, wesay that x represents a `valid' or `feasible' solution. Roughly sp oken, three metho ds are available in order to deal with the constraints. The rst and eldest one is the p enalty approach where usually xed and quadratic p enalty functions are added to the original cost function. In practice, it turns out very dicult to nd p enalty weights that guarantee b oth valid and high quality solutions. In a second approach, constraints are directly incorp orated in the neural network bycho osing appropriate transfer functions in the neurons. Up till now, the applicability of this metho d was centered in indep endent, symmetric linear constraints. A third wayto grapple with the constraints, is combining the neural network with the Lagrange multiplier metho d resulting in what we call a Hop eld-Lagrange mo del. All these metho ds can be combined with a typ e of `annealing' [1] where during relaxation of the recurrent network, the `temp erature' of the system is gradually lowered in order to try not to land in a lo cal minimum. The technique of annealing originates from an analysis of Hop eld-typ e networks using the theory of statistical mechanics [26]. In this pap er however, we emphasize on a mathematical analysis and only refer to physical interpretations if these yield relevant additional insights. The three ab ove-mentioned metho ds for resolving constrained optimization problems form part of this analysis. 1.2 The classical continuous Hop eld mo del Applying the classical discrete [17] or classical continuous [18] Hop eld mo del, the cost or energy function to b e minimized is quadratic and is expressed in the output of the neurons. In this pap er the source of inspiration is the continuous mo del. Then, the neurons are continuous-valued and the continuous energy function E V is given by c Z V i X X X 1 1 E V = w V V I V + g v dv 2 c ij i j i i 2 0 i;j i i {z } | {z } | = E V + E V : 3 h n E V corresp onds to the cost function of equation 1, where V =V ;V ;:::;V 2 [0; 1] represents 1 2 n the state vector. E V , whichwe call the `Hop eld term', has a statistical mechanical interpretation h based on a so-called mean eld analysis of a stochastic Hop eld mo del [16,34, 15,14,35, 4]. Its general e ect is a displacement of the minima of E V towards the interior of the state space [18] whose magnitude dep ends on the current `temp erature' in the system: the higher the temp erature 1 is, the larger is the displacementtowards the interior . The motion equations corresp onding to 2 are X @E V c _ U = = w V + I U ; 5 i ij j i i @V i j where V = g U should hold continuously. U represents the weighted input of neuron i. After a i i i random initialization, the network is generally not in an equilibrium state. Then, while maintaining V = g U , the input values U are adapted conform 5. The following theorem [18] gives conditions i i i for which an equilibrium state will b e reached: Theorem 1 Hop eld. If w is a symmetric matrix and if 8i : V = g U is a monotone in- ij i i creasing, di erentiable function, then E is a Lyapunov function [3,15,14] for motion equations 5. c Under the given conditions, the theorem guarantees convergence to an equilibrium state of the neural net where X 8i : V = g U ^ U = w V + I : 6 i i i ij j i j 1.3 The p enalty mo del The oldest approach for solving combinatorial optimization problems using Hop eld mo dels con- sists of a so-called p enalty metho d, sometimes called the `soft' approach [35, 29]: extra `p enalty' terms are added to the original energy function, p enalizing violation of constraints. The various 1 In case of cho osing the sigmoid V = 1=1 + exp U as the transfer function g U , the part of i i i temp erature is acted by1= and the Hop eld term in 3 can b e written as [16 , 35, 4 , 6, 7] X 1 E V = V ln V +1 V ln1 V : 4 i i i i h s i From this, the displacement of solutions towards towards the interior is recognized easily since E V has h s one absolute minimum, precisely in the middle of the state space where 8i : V =0:5. i 2 I 1 w 11 P r U V r 1 1 g j r w 1n I 2 w 21 P r U V r 2 2 g j r w 2n I n w n1 P U V n n g j w nn Figure 1: The original continuous Hop eld network. p enalty terms are weighted with { originally { xed weights c in case we shall sp eak of a static p enalty metho d and chosen in sucha way that m X c C V has a minimum value , V represents a valid solution. 7 =1 In many cases, the chosen p enalty terms are quadratic expressions. Applying a continuous Hop eld network, the original problem 1 is converted into m X minimize E V =E V + c C V +E V ; 8 p h =1 E V and E V b eing given by 3. The corresp onding up dating rule is h X @E @C p _ U = = w V + I c U : 9 i ij j i i @V @V i i j Ignoring the Hop eld term for the momentby applying a low temp erature, the energy function E isaweighted sum of m + 1 terms and a diculty arises in determining correct weights c . The p minimum of E is a compromise between ful lling the constraints and minimizing the original cost p function E V . Applying this p enalty approach to the travelling salesman problem TSP [19,40], the weights had to b e determined by trial and error.

Generalized Hopfield Networks for Constrained Optimization

A Relativistic Extension of Hopfield Neural Networks Via the Mechanical Analogy

Optimization by Mean Field Annealing

7Network Models

Chapter 14 Stochastic Networks

Statistical Computations Underlying the Dynamics of Memory Updating

Forecast of Consumer Behaviour Based on Neural Networks Models Comparison

Artificial Neural Networks for Financial Time Series Prediction and Portfolio

Online Discrimination of Nonlinear Dynamics with Switching Differential Equations

On the Mapping Between Hopfield Networks and Restricted Boltzmann

Generalization in a Hopfield Network J.F

A Novel Neural Network Training Framework with Data Assimilation

Multi-Agent Market Modeling Based on Neural Networks