<<

week ending PRL 110, 168702 (2013) PHYSICAL REVIEW LETTERS 19 APRIL 2013

Causal Entropic

A. D. Wissner-Gross1,2,* and C. E. Freer3,† 1Institute for Applied Computational Science, Harvard University, Cambridge, Massachusetts 02138, USA 2The Media Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA 3Department of Mathematics, University of Hawaii at Manoa, Honolulu, Hawaii 96822, USA (Received 24 May 2012; revised manuscript received 26 February 2013; published 19 April 2013) Recent advances in fields ranging from cosmology to computer science have hinted at a possible deep connection between intelligence and maximization, but no formal physical relationship between them has yet been established. Here, we explicitly propose a first step toward such a relationship in the form of a causal generalization of entropic forces that we find can cause two defining behaviors of the human ‘‘cognitive niche’’—tool use and social cooperation—to spontaneously emerge in simple physical systems. Our results suggest a potentially general thermodynamic model of adaptive behavior as a nonequilibrium process in open systems.

DOI: 10.1103/PhysRevLett.110.168702 PACS numbers: 05.65.+b, 05.70.a, 07.05.Mh, 45.80.+r

Recent advances in fields ranging from cosmology where T is the reservoir temperature, SðXÞ is the entropy to computer science have hinted at a possible deep con- associated with macrostate X,andX0 is the present nection between intelligence and entropy maximization. macrostate. In cosmology, the causal entropic principle for anthropic Inspired by recent developments [1–6] to naturally selection has used the maximization of entropy production generalize such biases so that they uniformly maximize in causally connected space-time regions as a thermody- entropy production between the present and a future time namic proxy for intelligent observer concentrations in the horizon, rather than just greedily maximizing instanta- prediction of cosmological parameters [1]. In geoscience, neous entropy production, we can also contemplate gener- entropy production maximization has been proposed as a alized entropic forces over paths through configuration unifying principle for nonequilibrium processes underly- space rather than just over the configuration space itself. ing planetary development and the of life [2–4]. In particular, we can promote microstates from instanta- In computer science, maximum entropy methods have neous configurations to fixed-duration paths through con- been used for inference in situations with dynamically figuration space while still partitioning such microstates revealed information [5], and strategy algorithms have into macrostates according to the initial coordinate of each even started to beat human opponents for the first time at path. More formally, for any open thermodynamic system, historically challenging high look-ahead depth and branch- we can treat the phase-space paths taken by the system xðtÞ ing factor games like Go by maximizing accessible future over the time interval 0 t as microstates and parti- game states [6]. However, despite these insights, no formal tion them into macrostates fXg according to the equiva- physical relationship between intelligence and entropy max- lence relation xðtÞx0ðtÞ iff xð0Þ¼x0ð0Þ, thereby imization has yet been established. In this Letter, we explic- identifying every macrostate X with a unique present itly propose a first step toward such a relationship in the form of a causal generalization of entropic forces that we show can spontaneously induce remarkably sophisticated behaviors associated with the human ‘‘cognitive niche,’’ including tool use and social cooperation, in simple physical systems. Our results suggest a potentially general thermo- dynamic model of adaptive behavior as a nonequilibrium process in open systems. A nonequilibrium physical system’s bias towards maximum instantaneous entropy production [3] is re- FIG. 1. Schematic depiction of a causal entropic . (a) A causal macrostate X with horizon time , consisting of path flected by its evolution toward higher-entropy macroscopic microstates xðtÞ that share a common initial system state xð0Þ,in states [7], a process characterized by the formalism an open thermodynamic system with initial environment state of entropic forces [8]. In the , the xð0Þ. (b) Path microstates belonging to a causal macrostate X, entropic force F associated with a macrostate partition, in which (for illustrative purposes) there is an environmentally fXg,isgivenby imposed excluded path-space volume that breaks translational symmetry, resulting in a causal entropic force F directed away F ðX Þ¼ r ðXÞj 0 T XS X0 ; (1) from the excluded volume.

0031-9007=13=110(16)=168702(5) 168702-1 Ó 2013 American Physical Society week ending PRL 110, 168702 (2013) PHYSICAL REVIEW LETTERS 19 APRIL 2013 system state xð0Þ, as schematically illustrated in Fig. 1(a). translationally symmetric dynamics, (ii) as ! 0, and We can then define the causal path entropy Sc of a macro- (iii) if the system and its environment are deterministic. state X with associated present system state xð0Þ as the For concreteness, we will now explore the effect of path integral selectively applying causal entropic forcing to the posi- Z tions of one or more degrees of freedom of a classical ScðX;Þ¼kB Pr ðxðtÞjxð0ÞÞ ln Pr ðxðtÞjxð0ÞÞDxðtÞ; mechanical system. Working with a bounded allowed xðtÞ region in unbounded classical phase space to break trans- (2) lational symmetry, we will denote system paths through position-momentum phase space as xðtÞ¼ðqðtÞ; pðtÞÞ where Pr ðxðtÞjxð0ÞÞ denotes the conditional probability of and denote the forced degrees of freedom by fjg, with x the system evolving through the path ðtÞ assuming the effective masses mj. As a simple means for introducing initial system state xð0Þ, integrating over all possible paths nondeterminism, let us choose the environment xðtÞ to x ðtÞ taken by the open system’s environment during the be a heat reservoir at temperature Tr that is coupled only same interval: to the system’s forced degrees of freedom, and that Z periodically with time scale rethermalizes those degrees Pr ðxðtÞjxð0ÞÞ ¼ Pr ðxðtÞ; xðtÞjxð0ÞÞDxðtÞ: (3) of freedom via nonlinear Langevin dynamics with tem- xðtÞ porally discretized additive thermal noise and friction Our proposed causal path entropy measure (2) can be terms. Specifically, let us assume the overall energetic ðx Þ seen as a special form of path information entropy—a more force on the forced degrees of freedom to be gj ðtÞ;t ðx Þ ðx Þ general measure that was originally proposed in the con- pjðbt=cÞ= þ fjðbt=cÞþhj ðtÞ , where hj ðtÞ text of stationary states and the fluctuation theorem [9]—in collects any deterministic state-dependent internal system which macrostate variables are restricted to the initial forces and fjðtÞ is a piecewise-constant random Gaussian states of path microstates (hence, each path microstate force with mean hfjðtÞi ¼ 0 and correlation function 0 2 can be seen as a ‘‘causal’’ consequence of the macrostate hfjðtÞfj0 ðt Þi ¼ mjkBTrj;j0 bt=c;bt0=c= . to which it belongs). In contrast, path information entropy Under these assumptions, the components of the causal by definition imposes no such causal macrostate restriction entropic force (4) at macrostate X0 with associated initial and allows temporally delocalized macrostate variables system state ðqð0Þ; pð0ÞÞ can be expressed as such as time-averaged quantities over paths [10]. By @S ðX;Þ restricting the macrostate variables to the present (t ¼ 0) ðX Þ¼ c Fj 0; Tc 0 ; (5) state of the microscopic degrees of freedom, a causal @qjð Þ X¼X0 entropy gradient force can always be defined that can be treated as a real microscopic force that acts directly on for the positions of the forced degrees of freedom. Since those degrees of freedom and that can prescribe well- paths that run outsideR the allowed phase space region have 1 Prðxð Þjxð0ÞÞ ð0Þ ð0Þ¼0 defined flows in system state space. Again, in contrast, probability zero, 1 @ t =@qj dqj ,so gradients of path information entropy can only generically bringing the partial derivative from (5) inside the path be interpreted as phenomenological forces that describe a integral in (2) yields system’s macroscopic dynamics. The generic limitation of Z @ Pr ðxðtÞjxð0ÞÞ path information entropy to serving as a macroscopic FjðX0;Þ¼kBTc x @q ð0Þ description, rather than a microscopic physical prescrip- ðtÞ j tion, for dynamics was noted recently [11]. ln Pr ðxðtÞjxð0ÞÞDxðtÞ: (6) A path-based causal entropic force F corresponding to (2), as schematically illustrated in Fig. 1(b), may then be Since the system is deterministic within each period 1 expressed as ½n; ðn þ Þ, we can express any nonzero conditional path probability Pr ðxðtÞjxð0ÞÞ as the first-order Markov F X r X ð 0;Þ¼Tc XScð ;ÞjX0 ; (4) chain NY1 where Tc is a causal path temperature that parametrizes the PrðxðtÞjxð0ÞÞ ¼ Prðxðt 1Þjxðt ÞÞ PrðxðÞjxð0ÞÞ; system’s bias toward macrostates that maximize causal nþ n (7) n¼1 entropy. Alternatively, Tc can be interpreted as parametriz- ing the rate at which paths in a hypothetical dynamical where tn n and N =, and therefore ensemble of all possible fixed-duration paths transition into Pr ðx x 0 Þ each other, in analogy to the transitions between configura- @ ðtÞj ð Þ 0 tional microstates of an responsible for its @qjð Þ entropic . Note that the force (4) is completely NY1 @ Pr ðxðÞjxð0ÞÞ determined by only two free parameters, T and , and ¼ Pr ðxðt 1Þjxðt ÞÞ : (8) c nþ n @q ð0Þ vanishes in three degenerate limits, (i) in systems with n¼1 j

168702-2 week ending PRL 110, 168702 (2013) PHYSICAL REVIEW LETTERS 19 APRIL 2013

Choosing to be much faster than local position-dependent 1 1=2 f½2mjjrqð0Þhjðxð0ÞÞj gand momentum-dependent 1 [ jrpð0Þhjðxð0ÞÞj ] variation in internal system forces, the position qjðÞ is given by 0 0 0 pjð Þ fjð Þþhjð Þ 2 qjðÞ¼qjð0Þþ þ : (9) 2mj 2mj 2 It follows from (9) that qjðÞhqjðÞi ¼ fjð0Þ =ð2mjÞ 2 2 2 4 FIG. 2. Illustrative examples of causal entropic forcing. (a) A and hqj ðÞi hqjðÞi ¼ kBTr =ð mjÞ, and since the distribution Pr ðxðÞjxð0ÞÞ is Gaussian in q ðÞ, we can particle in a box is forced toward the center of its box. (b) A j cart and pole system in which only the cart is forced success- therefore write fully swings up and stabilizes an initially downward-hanging @ Pr ðxðÞjxð0ÞÞ @ Pr ðxðÞjxð0ÞÞ pole. (See also Supplemental Material, Movies 1 and 2 [13], ¼ respectively.) @qjð0Þ @qjðÞ 2f ð0Þ ¼ j Pr ðxðÞjxð0ÞÞ: (10) the box. In contrast, if the particle had merely diffused kBTr from its given initial state, while its expectation position Finally, substituting (10) into (8), and (8) into (6), we find would relax towards the center of the box, its maximum that likelihood position would remain stationary. As a second example, we considered a cart and pole 2 Z Tc (or inverted pendulum) system [13]. The upright stabiliza- FjðX0;Þ¼ fjð0Þ Tr xðtÞ tion of a pole by a mobile cart serves as a standard model Pr ðxðtÞjxð0ÞÞ ln Pr ðxðtÞjxð0ÞÞDxðtÞ: (11) for bipedal locomotion [16,17], an important feature of the evolutionary divergence of hominids from apes [15] that Qualitatively, the effect of (11) can be seen as driving the may have been responsible for freeing up prehensile hands forced degrees of freedom (j) with a temperature-dependent for tool use [14]. We found that causal entropic forcing of strength (Tc=Tr) in an average of short-term directions the cart resulted in the successful swing-up and upright [fjð0Þ] weighted by the diversity of long-term paths stabilization of an initially downward-hanging pole, as [Pr ðxðtÞjxð0ÞÞ ln Pr ðxðtÞjxð0ÞÞ] that they make reach- shown in Fig. 2(b) and Movie 2 in the Supplemental able, where path diversity is measured over all degrees of Material [13]. During upright stabilization, the angular freedom of the system, and not just the forced ones. variation of the pole was reminiscent of the correlated To better understand this classical-thermal form random walks observed in human postural sway [18]. (11) of causal entropic forcing, we simulated its effect Heuristically, again, swinging up and stabilizing the pole [12,13] on the evolution of the causal macrostates of a made it more energetically favorable for the cart to sub- variety of simple mechanical systems: (i) a particle in a sequently swing the pole to any other angle, hence, max- box, (ii) a cart and pole system, (iii) a tool use puzzle, imizing the diversity of causally accessible paths. Similarly, and (iv) a social cooperation puzzle. The first two while allowing the pole to circle around erratically instead systems were selected for illustrative purposes. The of remaining upright might make diverse pole angles caus- latter two systems were selected because they isolate ally accessible, it would limit the diversity of causally major behavioral capabilities associated with the human accessible pole angular momenta due to its directional ‘‘cognitive niche’’ [14]. The widely-cited cognitive bias. This result advances beyond previous evidence that niche theory proposes that the unique combination of instantaneous noise can help to stabilize an inverted pen- ‘‘cognitive’’ adaptive behavioral traits that are hyper- dulum [19,20] by effectively showing how potential future developed in humans with respect to other animals noise can not only stabilize an inverted pendulum but can have evolved in order to facilitate competitive adaptation also swing it up from a downward hanging position. on faster time scales than natural evolution [15]. As a third example, we considered a tool use puzzle [13] As a first example of causal entropic forcing applied to a based on previous experiments designed to isolate tool use mechanical system, we considered a particle in a two- abilities in nonhuman animals such as chimpanzees [21] dimensional box [13]. We found that applying forcing to and crows [22], in which inanimate objects are used as both of the particle’s momentum degrees of freedom had tools to manipulate other objects in confined spaces that are the effect of pushing the particle toward the center of the not directly accessible. We modeled an animal as a disk box, as shown in Fig. 2(a) and Movie 1 in the Supplemental (disk I) undergoing causal entropic forcing, a potential tool Material [13]. Heuristically, as the overall farthest position as a smaller disk (disk II) outside of a tube too narrow for from boundaries, the central position maximized the diver- disk I to enter, and an object of interest as a second smaller sity of causal paths accessible by within disk (disk III) resting inside the tube, as shown in Fig. 3(a)

168702-3 week ending PRL 110, 168702 (2013) PHYSICAL REVIEW LETTERS 19 APRIL 2013

FIG. 4. Effect of causal entropic forcing on a social coopera- tion puzzle. (a) Initial configuration. (b) Disks I and II sponta- FIG. 3. Effect of causal entropic forcing on a tool use puzzle. neously synchronize their positions and push down in tandem on (a) Disk I is too large to fit into a tube containing disk III, both string handles. (c) Disk III has been successfully pulled to whereas disk II is small enough. (b) Disk I collides with disk II. an accessible position for direct manipulation. (d) Disks I and II (c) Disk II enters the tube and collides with disk III. (d) Disk III directly manipulate disk III. (See also Supplemental Material, has been successfully released from its initially fixed position in Movie 4 [13].) the tube, and is now free for direct manipulation by disk I. (See also Supplemental Material, Movie 3 [13].) As a result, disk III was successfully pulled toward the compartments containing disks I and II, as shown in and Movie 3 in the Supplemental Material [13]. We found Fig. 4(c), allowing disks I and II to directly manipulate disk that disk I spontaneously collided with disk II [Fig. 3(b)], III, as shown in Fig. 4(d). Heuristically, by cooperating to so as to cause disk II to then collide with disk III inside the pull disk III into their compartments and thereby making it tube [Fig. 3(c)], successfully releasing disk III from its accessible for direct manipulation through collisions, disks initially fixed position and making its degrees of freedom I and II maximized the diversity of accessible causal paths. accessible for direct manipulation and even a sort of To the best of our knowledge, these tool use puzzle and ‘‘play’’ by disk I [Fig. 3(d)]. As a fourth example, we considered a social cooperation social cooperation puzzle results represent the first success- puzzle [13] based on previous experiments designed to ful completion of such standard animal cognition tests using isolate social cooperation abilities in nonhuman animals only a simple physical process. The remarkable spontane- such as chimpanzees [23], rooks [24], and elephants [25], ous emergence of these sophisticated behaviors from such a in which a pair of animals cooperate to perform synchro- simple physical process suggests that causal entropic forces nized pulling on both ends of a rope or string in order to might be used as the basis for a general—and potentially retrieve an object from an inaccessible region. We modeled universal—thermodynamic model for adaptive behavior. a pair of animals as two disks (disks I and II) undergoing Namely, adaptive behavior might emerge more generally independent causal entropic forcing and residing in in open thermodynamic systems as a result of physical separate compartments, which they initially shared with agents acting with some or all of the systems’ degrees of a pair of lightweight ‘‘handle’’ disks that were connected freedom so as to maximize the overall diversity of acces- by a string, as shown in Fig. 4(a) and Movie 4 in the sible future paths of their worlds (causal entropic forcing). Supplemental Material [13]. The string was wound around In particular, physical agents driven by causal entropic a horizontal bar free to move vertically and supporting a forces might be viewed from a Darwinian perspective as target object (disk III), which could move horizontally on it competing to consume future histories, just as biological and which was initially inaccessible to disks I and II. Disk replicators compete to consume instantaneous material masses and drag forces were chosen such that synchro- resources [26]. In practice, such agents might estimate nized pulling on both ends of the string resulted in a much causal entropic forces through internal Monte Carlo sam- larger downward force on disk III than asynchronous or pling of future histories [13] generated from learned models single-sided pulling. Moreover, the initial positions of of their world. Such behavior would then ensure their disks I and II were set asymmetrically such that coordi- uniform aptitude for adaptiveness to future change due to nated timing of the onset of pushing down on handle disks interactions with the environment, conferring a potential was required in order for the string not to be pulled away survival advantage, to the extent permitted by their strength from either disks I or II. We found that independent causal (parametrized by a causal path temperature, Tc) and their entropic forcing of disks I and II caused them to first ability to anticipate the future (parametrized by a causal spontaneously align their positions and push down in time horizon, ). Consistent with this model, nontrivial tandem on both handles of the string, as shown in Fig. 4(b). behaviors were found to arise in all four example systems

168702-4 week ending PRL 110, 168702 (2013) PHYSICAL REVIEW LETTERS 19 APRIL 2013 when (i) the characteristic of the forcing (kBTc)was [4] E. J. Chaisson, Cosmic Evolution: The Rise of Complexity larger than the characteristic energy of the system’s internal in Nature (Harvard University Press, Cambridge, MA, dynamics (e.g., for the cart and pole example, the energy 2002). required to lift a downward-hanging pole), and (ii) the [5] B. D. Ziebart, J. A. Bagnell, and A. K. Dey, in causal horizon was longer than the characteristic time Proceedings, Twenty-seventh International Conference on Machine Learning: Haifa, Israel, 2010, edited by J. scale of the system’s internal dynamics (e.g., for the cart and Fu¨rnkranz and T. Joachimspp (International Machine pole example, the time the pole would need to swing Learning Society, Madison, WI, 2010), p. 1255. through a semicircle due to gravity). [6] S. Gelly, L. Kocsis, M. Schoenauer, M. Sebag, D. Silver, These results have broad physical relevance. In con- C. Szepesva´ri, and O. Teytaud, Commun. ACM 55, 106 densed matter , our results suggest a novel means (2012). for driving physical systems toward self-organized criti- [7] C. R. Shalizi and C. Moore, arXiv:cond-mat/0303625. cality [27]. In particle theory, they suggest a natural gen- [8] E. Verlinde, J. High Energy Phys. 04 (2011), 1. eralization of [8]. In econophysics, they [9] R. C. Dewar, J. Phys. A 36, 631 (2003). suggest a novel physical definition for wealth based on [10] R. C. Dewar, J. Phys. A 38, L371 (2005). 11 causal entropy [28,29]. In cosmology, they suggest a path [11] R. C. Dewar, Entropy , 931 (2009). entropy-based refinement to current horizon entropy-based [12] Our general-purpose causal entropic force simulation soft- ware will be made available for exploration at http:// anthropic selection principles that might better cope with www.causalentropy.org. black hole horizons [1]. Finally, in biophysics, they suggest [13] See Supplemental Material at http://link.aps.org/ new physical measures for the behavioral adaptiveness and supplemental/10.1103/PhysRevLett.110.168702 for sophistication of systems ranging from biomolecular con- additional example system details, calculation details, figurations to planetary ecosystems [2,3]. control experiment details, and movies. In conclusion, we have explicitly proposed a novel [14] S. Pinker, Proc. Natl. Acad. Sci. U.S.A. 107, 8993 (2010). physical connection between adaptive behavior and en- [15] J. Tooby and I. DeVore, in The Evolution of Human tropy maximization, based on a causal generalization of Behavior: Primate Models, edited by W. G. Kinzey entropic forces. We have examined in detail the effect of (State University of New York Press, New York, 1987), such causal entropic forces for the general case of a clas- pp. 183–237. sical mechanical system partially connected to a heat [16] P. Holmes, R. J. Full, D. Koditschek, and J. Guckenheimer, SIAM Rev. 48, 207 (2006). reservoir, and for the specific cases of a variety of simple [17] D. Schmitt, J. Exp. Biol. 206, 1437 (2003). example systems. We found that some of these systems [18] J. J. Collins and C. J. DeLuca, Phys. Rev. Lett. 73, 764 exhibited sophisticated spontaneous behaviors associated (1994). with the human ‘‘cognitive niche,’’ including tool use and [19] J. G. Milton, T. Ohira, J. L. Cabrera, R. M. Fraiser, J. B. social cooperation, suggesting a potentially general ther- Gyorffy, F. K. Ruiz, M. A. Strauss, E. C. Balch, P.J. Marin, modynamic model of adaptive behavior as a nonequilib- and J. L. Alexander, PLoS ONE 4, e7427 (2009). rium process in open systems. [20] J. Milton, J. L. Cabrera, T. Ohira, S. Tajima, Y. Tonosaki, We thank E. Kaxiras, J. Gore, J. A. Barandes, M. C. W. Eurich, and S. A. Campbell, Chaos 19, 026110 Tegmark, E. S. Boyden, T. Hylton, L. Lerman, J. M. (2009). [21] C. Sanz, J. Call, and D. Morgan, Biol. Lett 5, 293 (2009). Smart, Z. D. Wissner-Gross, and T. M. Sullivan for helpful 297 discussions. C. E. F. was partially supported by NSF Grant [22] A. A. S. Weir, J. Chappell, and A. Kacelnik, Science , 981 (2002). No. DMS-0901020. [23] A. P. Melis, B. Hare, and M. Tomasello, Animal Behaviour 72, 275 (2006). [24] A. M. Seed, N. S. Clayton, and N. J. Emery, Proc. R. Soc. B 275, 1421 (2008). *To whom all correspondence should be addressed. [25] J. M. Plotnik, R. Lair, W. Suphachoksahakun, and F. B. M. [email protected] de Waal, Proc. Natl. Acad. Sci. U.S.A. 108, 5116 (2011). http://www.alexwg.org [26] R. Dawkins, The Selfish Gene (Oxford University Press, †Present address: Computer Science and Artificial New York, 1976). Intelligence Laboratory, Massachusetts Institute of [27] S. S. Elnashaie and J. R. Grace, Chem. Eng. Sci. 62, 3295 Technology, Cambridge, Massachusetts 02139, USA. (2007). [1] R. Bousso, R. Harnik, G. D. Kribs, and G. Perez, Phys. [28] T. P. Wallace, Wealth, Energy, And Human Values Rev. D 76, 043513 (2007). (AuthorHouse, Bloomington, 2009). [2] A. Kleidon, Phys. Life Rev. 7, 424 (2010). [29] D. Deutsch, The Beginning of Infinity (Allen Lane, [3] L. Martyushev and V. Seleznev, Phys. Rep. 426, 1 (2006). London, 2011).

168702-5