
VOLUME 83, NUMBER 7          PHYSICAL REVIEW LETTERS          16 AUGUST 1999

Computational Complexity for Continuous Time Dynamics

Hava T. Siegelmann and Asa Ben-Hur
Faculty of Industrial Engineering and Management, Technion, Haifa 32000, Israel

Shmuel Fishman
Department of Physics, Technion, Haifa 32000, Israel
(Received 22 March 1999)

Dissipative flows model a large variety of physical systems. In this Letter the evolution of such systems is interpreted as a process of computation; the attractor of the dynamics represents the output. A framework for an algorithmic analysis of dissipative flows is presented, enabling the comparison of the performance of discrete and continuous time analog computation models. A simple algorithm for finding the maximum of n numbers is analyzed, and shown to be highly efficient. The notion of tractable (polynomial) computation in the Turing model is conjectured to correspond to computation with tractable (analytically solvable) dynamical systems having polynomial complexity.

PACS numbers: 05.45.-a, 89.70.+c, 89.80.+h

The computation of a digital computer, and its mathematical abstraction, the Turing machine, is described by a map on a discrete configuration space. In recent years scientists have developed new approaches to computation, some of them based on continuous time analog systems. The most promising are neuromorphic systems [1], models of human memory [2], and experimentally realizable quantum computers [3]. Although continuous time systems are widespread in experimental realizations, no theory exists for their algorithmic analysis. The standard theory of computation and computational complexity [4] deals with computation in discrete time and in a discrete configuration space, and is inadequate for the description of such systems. This Letter describes an attempt to fill this gap. Our model of a computer is based on dissipative dynamical systems (DDS), characterized by flow to attractors, which are a natural choice for the output of a computation. This makes our theory realizable by small-scale classical physical systems (since there dissipation is usually not negligible) [5]. We define a measure of computational complexity which reflects the convergence time of a physical implementation of the continuous flow, enabling a comparison of the efficiency of continuous time algorithms with discrete ones. On the conceptual level, the framework introduced here strengthens the connection between the theory of computational complexity and the field of dynamical systems.

Turing universality of dynamical systems is a fundamental issue; see [6] and a recent book [7]. A system of ordinary differential equations (ODEs) which simulates a Turing machine was constructed in [8]. Such constructions retain the discrete nature of the simulated map, in that they follow its computation step by step by a continuous flow. In the present Letter, on the other hand, we consider continuous systems as is, and interpret their dynamics as a process of computation.

The view of the process of computation as a flow to an attractor has been taken by a number of researchers. The Hopfield neural network is a dynamical system which evolves to attractors which are interpreted as memories; the network is also used to solve optimization problems [2]. Brockett introduced a set of ODEs that perform various tasks such as sorting and solving linear programming problems [9]. Numerous other applications can be found in [10]. An analytically solvable ODE for the linear programming problem was proposed by Faybusovich [11]. Our theory is, to some extent, a continuation of their work, in that it provides a framework for the complexity analysis of continuous time algorithms.

Our model is restricted to exponentially convergent autonomous dissipative ODEs

    dx/dt = F(x),    (1)

for x ∈ ℝ^n and F an n-dimensional vector field, where n depends on the input. For a given problem, F takes the same mathematical form, and only the length of the various objects in it (vectors, tensors, etc.) depends on the size of the instance, corresponding to "uniformity" in computer science [12]. We discuss only systems with fixed point attractors, and the term attractor will be used to denote an attracting fixed point. We study only autonomous systems since for these the time parameter is not arbitrary (contrary to nonautonomous ones): under any nonlinear transformation of the time parameter the system is no longer autonomous, as will be explained in what follows. The restricted class of exponentially convergent vector fields describes the "typical" convergence scenario for dynamical systems [13]. Robustness of exponentially convergent flows is an important property for analog computers. As a further justification we argue that exponential convergence is a prerequisite for efficient computation, provided the computation requires reaching the asymptotic regime, as is usually the case. Asymptotically, |x(t) - x*| ~ e^{-t/t_ch} [see Eq. (5)]. When a trajectory is close to its attractor, in a time t_ch ln2 a digit of the attractor is computed. Thus the computation of L digits requires a time which scales as t_ch L.
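The digits-per-unit-time argument can be made concrete with a small numerical experiment. The sketch below is ours, not part of the Letter: the one-dimensional flows, the Euler step size, and the checkpoint times are illustrative assumptions. It integrates an exponentially convergent flow dx/dt = -(x - x*)/t_ch and a polynomially convergent one dx/dt = -(x - x*)^3, and reports the number of correct decimal digits of x* as a function of time; the exponential flow gains digits at a roughly constant rate, the polynomial flow slows down.

    import numpy as np

    # Illustrative 1D flows (toy example, not from the Letter):
    # exponential convergence |x - x*| ~ exp(-t/t_ch) vs. polynomial |x - x*| ~ t^(-1/2).
    X_STAR = 1.0          # attractor
    T_CH = 1.0            # characteristic time of the exponential flow

    def f_exp(x):         # dx/dt = -(x - x*)/t_ch  -> exponential convergence
        return -(x - X_STAR) / T_CH

    def f_poly(x):        # dx/dt = -(x - x*)**3    -> polynomial convergence
        return -(x - X_STAR) ** 3

    def correct_digits(x):
        """Number of correct decimal digits of the attractor, -log10 of the error."""
        err = abs(x - X_STAR)
        return np.inf if err == 0 else -np.log10(err)

    def integrate(f, x0, t_max, dt=1e-3):
        """Plain Euler integration, reporting digits at a few checkpoints."""
        x, t = x0, 0.0
        out = []
        for t_report in np.arange(5.0, t_max + 1e-9, 5.0):
            while t < t_report:
                x += dt * f(x)
                t += dt
            out.append((t_report, correct_digits(x)))
        return out

    if __name__ == "__main__":
        for name, f in [("exponential", f_exp), ("polynomial", f_poly)]:
            report = integrate(f, x0=0.5, t_max=20.0)
            print(name, [f"t={t:.0f}: {d:.1f} digits" for t, d in report])

With t_ch = 1 the exponential flow reaches roughly nine digits by t = 20, while the polynomial flow has not yet produced one, in line with the scaling argument above.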

This is in contrast with polynomially convergent vector fields: suppose that |x(t) - x*| ~ t^{-β} for some β > 0; then in order to compute x* with L significant digits, we need to have |x(t) - x*| < 2^{-L}, or t > 2^{L/β}, for an exponential time complexity.

Last, we concentrate on ODEs with a formal solution, since for these, complexity is readily analyzed, and it is easy to provide criteria for halting a computation. Dynamical systems with an analytical solution are an exception. But despite their scarcity, we argue later that a subclass of analytically solvable DDS's which converge exponentially to fixed point attractors is a counterpart of the classical complexity class P. This then suggests a correspondence between tractability in the realm of dynamical systems and tractability in the Turing model.

The input of a DDS can be modeled in various ways. One possible choice is the initial condition. This is appropriate when the aim of the computation is to decide to which attractor out of many possible ones the system flows. This approach was pursued in [14]. The main problem within this approach is related to initial conditions in the vicinity of basin boundaries. The flow in the vicinity of the boundary is slow, resulting in very long computation times. In the present Letter, on the other hand, the parameters on which the vector field depends are the input, and the initial condition is a function of the input, chosen in the correct basin, and far from basin boundaries, to obtain an efficient computation. For the gradient vector field of Eq. (7), designed to find the maximum of n numbers, the n numbers c_i constitute the input, and the initial condition given by Eq. (11) is untypically simple. More generally, when dealing with the problem of optimizing some cost function E(x), e.g., by a gradient flow dx/dt = grad E(x), an instance of the problem is specified by the parameters of E(x), i.e., by the parameters of the vector field.

The vector x(t) represents the state of the corresponding physical system at time t. The time parameter is thus time as measured in the laboratory, and has a well-defined meaning. Therefore we suggest it as a measure of the time complexity of a computation. However, for nonautonomous ODEs that are not directly associated with physical systems, the time parameter seems to be arbitrary: if the time variable t of a nonautonomous vector field is replaced by s = g(t), where g(t) is strictly monotonic, we obtain a nonautonomous system

    dx/ds = F(x, t) / ġ(g^{-1}(s)) ≡ F_s(x, s).

In this way arbitrary speed-up can be achieved in principle. However, the time-transformed system F_s(x, s) is a new system. Only once it is constructed does its time parameter take on the role of physical time, and it is no longer arbitrary. Therefore speed-up is a relevant concept only within the bounds of physical realizability. We stress the distinction between linear and nonlinear transformations of the time parameter: a linear transformation is merely a change of the time unit; a nonlinear transformation effectively changes the system itself. Therefore we suggest autonomous systems as representing the intrinsic complexity of the class of systems that can be obtained from them by changing the time parameter.

The evolution of a DDS reaches an attractor only in the infinite time limit. Therefore for any finite time we can compute it only to some finite precision. This is sufficient since for combinatorial problems with integer or rational inputs, the set of fixed points (the possible solutions) will be distributed on a grid of some finite precision. A computation will be halted when the attractor is computed with enough precision to infer a solution to the associated problem by rounding to the nearest grid point.

The evolution of a dynamical system may be rather complicated, and a major problem is to decide when a point approached by the trajectory is indeed the attractor of the dynamics, and not a saddle point. An attractor is certified by its attracting region, which is a subset of the trapping set of the attractor in which the distance from the attractor is monotonically decreasing in time. The convergence time to an attracting region U, t_c(U), is the time it takes for a trajectory starting from the initial condition x_0 to enter U.

When the computation has reached the attracting region of a fixed point, and is also within the precision required for solving the problem, ε_p, the computation can be halted. We thus define the halting region of a DDS with attracting region U and required precision ε_p as H = U ∩ B(x*, ε_p), where B(x*, ε_p) is a ball of radius ε_p around the attractor x*. The computation time is the convergence time to the halting region, t_c(H), given by

    t_c(H) = max[t_c(ε_p), t_c(U)],    (2)

where t_c(ε_p) is the convergence time to B(x*, ε_p).

In general, we cannot calculate t_c(H) for a DDS algorithm. Thus we resort to halting by a bound on the computation time of all instances of size L:

    T(L) = max_{|P|=L} t_c(H(P)),    (3)

where P denotes the input, and L = |P| is its size in bits. The definition of the input size depends on the input space considered. The bit size (used here) is the suitable measure for unbounded integers. See [15] for more details.

Time complexity is a dimensionless number, whereas T(L) depends on the time units of the system at hand. To make it dimensionless we express it in terms of the time scale for convergence to the fixed point. Let x*(P) be the attracting fixed point of dx/dt = F(x) on input P. In the vicinity of x* the linear approximation d(δx)/dt = DF|_{x*} δx holds, where δx = x - x*. Let λ_i be the real part of the ith eigenvalue of DF|_{x*}. We define

    λ = min_i |λ_i|.    (4)

λ determines the rate of convergence to an attractor, since in its vicinity |x(t) - x*| ~ e^{-λt}, leading to the definition of the characteristic time

    t_ch = 1/λ.    (5)

We finally define the time complexity of a DDS algorithm:

    T'(L) = T(L)/t_ch(P_0),    (6)

where P_0 is a fixed instance of the problem. This is a valid definition of complexity since it is invariant under linear transformations of the time parameter.

We demonstrate our approach with a simple DDS algorithm for the "MAX" problem, which is the problem of finding the maximum of n numbers. Let the numbers be c_1, ..., c_n, and define the linear cost function E(x) = c^T x. The MAX problem can be formulated as a constrained optimization problem: find the maximum of E(x) subject to the constraints Σ_{i=1}^n x_i = 1, x_i ≥ 0. This is recognized as the linear programming problem on the (n-1)-dimensional simplex Δ_{n-1} = {x ∈ ℝ^n : x_i ≥ 0, Σ_{i=1}^n x_i = 1}. We use the vector field

    F_i = (c_i - Σ_{j=1}^n x_j c_j) x_i,    (7)

which is the gradient of the function E on Δ_{n-1} relative to a Riemannian metric which enforces the positivity constraints [10]. This flow is a special case of the Faybusovich vector field [11], which we analyze elsewhere [16].

We denote by e_1, ..., e_n the standard basis of ℝ^n. The fixed points of F are the vertices of the simplex, e_1, ..., e_n. We first assume a unique maximum. Also suppose that c_1 > c_2 and c_2 ≥ c_j, j = 3, ..., n. Under this assumption the flow converges exponentially to e_1, as witnessed by the analytical solution

    x_i(t) = e^{c_i t} x_i(0) / Σ_{j=1}^n e^{c_j t} x_j(0),    (8)

where x_i(0) are the components of the initial condition. We note that the analytical solution does not help in determining which of the fixed points is the attractor of the system: the solution to the specific instance of the problem is required for that. Thus the analytical solution is only formal, and one has to follow the dynamics of the vector field (7) to find the maximum. However, the formal solution is useful in obtaining tight bounds on T(L).

A linearization of F shows that the time scale for convergence to the attractor e_1 is

    t_ch = 1/(c_1 - c_2).    (9)

By solving for t in the equation ||x(t) - e_1|| < ε, an upper bound on the time to reach an ε vicinity of the vertex e_1 is found [10]:

    t_c(ε) ≤ t_ch |ln(x_1(0) ε²)|.    (10)

The divergence as x_1(0) tends to zero is due to initial conditions close to the basin boundary. To minimize the contribution of flow near basin boundaries we choose as an initial condition the symmetric vector

    (1/n) e = (1/n) (1, ..., 1)^T.    (11)

The coordinates of the possible solutions (vertices of the simplex) are integer, and therefore ε_p = 1/2. Using Eq. (10) we obtain that the convergence time to the ε_p vicinity of e_1 is bounded by

    t_c(ε_p) < t_ch ln(4n).    (12)

Next we show a bound on the convergence time to the attracting region. The attracting region of the problem is the region in which dx_i/dt < 0 for i > 1. By the positivity of the x_i's, this is satisfied if Σ_{j=1}^n c_j x_j > c_i, i = 2, ..., n. Inserting the analytical solution yields a bound on t_c(U):

    t_c(U) ≤ t_ch [ln(t_ch c_2) + ln n].    (13)

The maximum of (12) and (13) gives the bound

    t_c(H(c)) ≤ t_ch [ln(t_ch c_2) + ln(4n)].    (14)

If integer inputs are considered then t_ch ≤ 1. The size of the input in bits is L = Σ_{i=1}^n [1 + log_2(c_i + 1)]. Expressing the bound on t_c(H(c)) in terms of L:

    t_c(H(c)) = O(ln L).    (15)

A sublinear (logarithmic) complexity arises because the model is inherently parallel: the variables of a DDS algorithm can be considered as processing units, and their number in this algorithm increases with the size of the input. When the inputs are bounded integers the complexity becomes O(log n), similar to the complexity obtained in models of parallel computation in the Turing framework. The analysis of the MAX problem was presented as an illustration. In [16] we analyze the maximum network flow problem and the general linear programming problem.

In the following, we no longer assume integer inputs (see [15]). Suppose that c_1 = c_2; then the maximum is not unique, and arbitrarily small perturbations of such an instance have different attractors as the solution. The bound on t_c(U) also shows that in such a case it takes a long time for the system to distinguish between attractors with close values of the cost function, since when c_2 is close to c_1 we have a large time scale, t_ch. Thus problems which are near to problems with a nonunique maximizer are "hard." Instances of a problem which have a nonunique solution are termed "ill-posed" in the numerical analysis literature [17], and the difficulty of problems which are close to ill-posed ("ill-conditioned") is expressed in "condition number theorems," which state that the inverse of the distance of an instance from the set of ill-posed problems is equal to its condition number. In our context we define t_ch as the condition number, and indeed find that it is the inverse of the distance of an instance from the set of problems in which c_1 = c_2 [15]. A more general condition number theorem is also argued in [15].
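The MAX algorithm above can be simulated directly. The sketch below is our own illustration: the fixed-step Euler integrator, its step size, and the renormalization guard are choices made here rather than prescriptions of the Letter (a physical realization would evolve Eq. (7) in continuous time). It starts from the symmetric initial condition of Eq. (11) and halts once the trajectory is within ε_p = 1/2 of a vertex, which is then returned as the index of the maximum.

    import numpy as np

    def max_flow_dds(c, dt=1e-2, eps_p=0.5, t_max=1e4):
        """DDS algorithm for MAX, Eq. (7): dx_i/dt = (c_i - c.x) x_i on the simplex.

        Starts from x = (1/n, ..., 1/n), Eq. (11), and halts when ||x - e_k|| < eps_p
        for some vertex e_k (eps_p = 1/2, since the candidate solutions sit on an
        integer grid).  Euler time stepping is an illustrative discretization choice.
        """
        c = np.asarray(c, dtype=float)
        n = len(c)
        x = np.full(n, 1.0 / n)            # Eq. (11)
        t = 0.0
        while t < t_max:
            x = x + dt * (c - c @ x) * x   # Eq. (7)
            x = np.clip(x, 0.0, None)
            x /= x.sum()                   # numerical guard against drift off the simplex
            t += dt
            k = int(np.argmax(x))          # nearest vertex
            e_k = np.zeros(n)
            e_k[k] = 1.0
            if np.linalg.norm(x - e_k) < eps_p:
                return k, t                # index of the maximum and halting time
        raise RuntimeError("time bound exceeded")

    if __name__ == "__main__":
        c = [3, 7, 2, 5]
        k, t = max_flow_dds(c)
        print(f"argmax = {k} (c_k = {c[k]}), halted at t = {t:.2f}")

For simplicity the sketch checks only the ε_p condition; the halting region of Eq. (2) additionally requires that the trajectory has entered the attracting region U.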


Ill-conditioned problems for models where the input is the parameters are reminiscent of problems with the initial condition near the basin boundary in models where the initial condition is the input.

In the following, we compare complexity in our model with the classical theory. The complexity class P in classical computational complexity is the class of problems for which there exists an algorithm which runs in polynomial time on a Turing machine. Its counterpart in our framework is called CP (continuous P), and contains the set of problems which have a DDS algorithm with a polynomial number of variables and polynomial time complexity. Note that since the variables play the role of processing units, their number needs to be limited as well. In [16] we present a DDS algorithm for the maximum network flow problem which has polynomial time complexity. In addition we define the class CLOG (continuous log) of problems that have a DDS algorithm with a polynomial number of variables and logarithmic time complexity. We have shown that MAX is in CLOG. We note that for a comparison with the classical theory to be meaningful it is necessary to impose constraints on the complexity of the vector field. Otherwise, the computational power of the model can be attributed to the complexity of the vector field. We assume in the following that the vector field is in the parallel complexity class NC [12], of computations in polylogarithmic (polynomial in log) time with a polynomial number of processors; NC is the complexity class just below P.

We argue that CP = P. For the inclusion P ⊆ CP we rely on the claim that the P-complete [18] problem of maximum network flow is in CP [16]. If we use the Turing reductions from a P problem to maximum network flow, we have in fact shown that all efficient Turing computations can be performed polynomially in our framework. However, relying on Turing reductions which are external to our model might be considered unsatisfactory. As of yet we have no argument that CP ⊆ P. However, we believe that a polynomial time simulation of the ODE with some numerical integration scheme should be possible because of the convergence to fixed points.

In this Letter we concentrated on analytically tractable dynamical systems, a property which helped us in computing bounds on the convergence time to the attracting region, and in finding initial conditions far from basin boundaries. For a large class of systems without an analytical solution one can resort to probabilistic verification of an attractor: when it is suspected that a fixed point is approached, a number of trajectories are initiated in an ε ball around the trajectory. If this ball shrinks, then with high probability the fixed point is an attractor. If the ball has expanded in some direction, then the fixed point is a saddle. This yields a Co-RP type of complexity class [12], which is applicable to gradient flows, for example [14]. Only fixed point attractors were considered here. In [14] computation of chaotic attractors is discussed. Such attractors are found to be computable efficiently by means of nondeterminism. The inherent difference between fixed points and chaotic attractors may shed light on the P vs NP question, which is a main open problem in computer science.

The authors would like to thank E. Sontag, K. Ko, C. Moore, and J. Crutchfield for helpful discussions. This work was supported in part by the U.S.-Israel Binational Science Foundation (BSF), by the Israeli Ministry of Arts and Sciences, by the Fund for Promotion of Research at the Technion, and by the Minerva Center for Nonlinear Physics of Complex Systems.

[1] C. Mead, Analog VLSI and Neural Systems (Addison-Wesley, Reading, Massachusetts, 1989).
[2] J. J. Hopfield and D. W. Tank, Biol. Cybernet. 52, 141 (1985).
[3] C. P. Williams, in Proceedings of Quantum Computing and Quantum Communications: First NASA International Conference, QCQC'98, Springer Lecture Notes in Computer Science Vol. 1509 (Springer, Berlin, 1998).
[4] The term computational complexity in computer science denotes the scaling of the resources needed to solve a problem with its size [12].
[5] E. Ott, Chaos in Dynamical Systems (Cambridge University Press, Cambridge, 1993).
[6] C. Moore, Phys. Rev. Lett. 64, 2354 (1990).
[7] H. T. Siegelmann, Neural Networks and Analog Computation: Beyond the Turing Limit (Birkhäuser, Boston, Massachusetts, 1999), and references therein.
[8] M. S. Branicky, in Proceedings of the IEEE Workshop on Physics and Computation (Dallas, Texas, 1994), pp. 265-274.
[9] R. W. Brockett, Linear Algebr. Appl. 146, 79 (1991).
[10] U. Helmke and J. B. Moore, Optimization and Dynamical Systems (Springer-Verlag, London, 1994).
[11] L. Faybusovich, IMA J. Math. Control Inf. 8, 135 (1991).
[12] C. Papadimitriou, Computational Complexity (Addison-Wesley, Reading, Massachusetts, 1995).
[13] B. R. Hunt, T. Sauer, and J. A. Yorke, Bull. Am. Math. Soc. 27, 217 (1992).
[14] H. T. Siegelmann and S. Fishman, Physica (Amsterdam) 120D, 214 (1998).
[15] H. T. Siegelmann, A. Ben-Hur, and S. Fishman (to be published).
[16] A. Ben-Hur, H. T. Siegelmann, and S. Fishman (to be published).
[17] S. Smale, SIAM Rev. 32, 211 (1990).
[18] A problem in P is called P-complete if any problem in P can be reduced to it [12].
