International Journal of Automation and Computing 14(5), October 2017, 532-541 DOI: 10.1007/s11633-017-1093-8

Why Deep Neural Nets Cannot Ever Match Biological Intelligence and What To Do About It?

Danko Nikolić 1, 2, 3

1 DXC Technology, Frankfurt am Main, Germany
2 Frankfurt Institute for Advanced Studies (FIAS), Ruth-Moufang-Straße 1, D-60438 Frankfurt/M, Germany
3 Department of Psychology, Faculty of Humanities and Social Sciences, University of Zagreb, Croatia

Abstract: The recently introduced theory of practopoiesis offers an account of how adaptive intelligent systems are organized. According to that theory, biological agents adapt at three levels of organization, and this structure applies also to our brains. This is referred to as the tri-traversal theory of the organization of mind or, for short, a T3-structure. To implement a similar T3-organization in an artificially intelligent agent, it is necessary to have multiple policies, as usually used in the theory of reinforcement learning. These policies have to form a hierarchy. We define adaptive practopoietic systems in terms of a hierarchy of policies and calculate whether the total variety of behavior required by the real-life conditions of an adult human can be satisfactorily accounted for by a traditional approach to artificial intelligence based on T2-agents, or whether a T3-agent is needed instead. We conclude that the complexity of real life can be dealt with appropriately only by a T3-agent. This means that current approaches to artificial intelligence, such as deep architectures of neural networks, will not suffice with fixed network architectures. Rather, they will need to be equipped with intelligent mechanisms that rapidly alter the architectures of those networks.

Keywords: Artificial intelligence, neural networks, strong artificial intelligence, practopoiesis, machine learning

Research Article
Special Issue on Human-inspired Computing
Manuscript received January 16, 2017; accepted May 10, 2017; published online July 4, 2017
This work was supported by Hertie Foundation and Deutsche Forschungsgemeinschaft. Recommended by Associate Editor Hong Qiao.
© The Author(s) 2017

1 Introduction: Hierarchy of policies

Practopoiesis is a recent theory of how adaptive agents are organized, and it proposes a number of principles under which such systems operate[1]. One of the key presumptions of practopoiesis is that adaptive mechanisms are organized into a specific type of hierarchy: Mechanisms lower on the hierarchy determine the properties of the mechanisms higher on the hierarchy. Interactions among those levels of organization are described by concepts such as the monitor-and-act unit and cybernetic knowledge, and by principles such as knowledge extraction, knowledge shielding, downward pressure for adjustment, and equi-level interactions. It has also been proposed that practopoiesis has implications for the development of machine learning and artificial intelligence (AI)[1, 2].

Practopoietic systems can be described from the perspective of machine learning as follows. The entire set of adaptive capabilities of an organism (i.e., its monitor-and-act units) at one level of organization can be described, in the terminology of machine learning, as the policy (π) for generating actions[1]. Similarly, cybernetic knowledge can be understood as an optimal policy of machine learning. Recent analyses showed that adaptations of neurons and neural networks enhance computations[3, 4].

Importantly, a practopoietic agent may have different sets of policies, some of them acting on the environment but others acting on the agent itself. These sets form a hierarchy. Thus, a practopoietic hierarchy is an arrangement in which, for a policy x, there is a policy y whose actions change policy x. This makes the agent a T2-agent, due to the actions executed at two levels of organization. To indicate that the actions of policy y change policy x, we write:

πy → πx.

In that case, TD-learning[5] and Q-learning[6] algorithms are considered special policies belonging to πy.

Importantly, however, according to practopoietic theory, biological T3-systems also have a third policy — hence the reference to a tri-traversal theory of the human mind[1]. The theory claims that there are fast and slow "learning" mechanisms and that the slow mechanisms train the fast ones. Thus, a full agent can then be described as

πG → πA → πN

whereby, following the tri-traversal theory, we presume that πG is stored in the genes, πA in the rules for neural adaptation responsible for fast changes to the system (referred to also as anapoiesis), and πN in the properties of the neural network.

This relationship can also be described as fast learning (by operations of πA) and slow learning how to learn fast (by operations of πG). Here, most of the knowledge acquired through the lifetime of an agent is acquired in the properties of πA, not in πN, as would be the case in most modern approaches to AI. In other words, T3-agents store knowledge in the rules on how to quickly learn.

To describe the interaction between an agent and its environment, we can write

πG → πA → πN → U

where U stands for the surrounding world, or Umwelt.

To describe the adaptive capabilities of an entire species, there is one additional policy, πE, which determines the genome πG. This policy operates according to the rules of evolution by natural selection[1]. Thus, the adaptive structure of an entire species can be described as

πE → πG → πA → πN.

These levels of adaptation in Tn-systems should be distinguished from layers in the hierarchy of neural networks. Even if a network has 1 000 layers, it still operates as a T1 system if there is no learning, and as a T2 system if a typical form of learning such as backpropagation is applied. To create a T3 system, one needs to add a level at which the learning algorithm itself adapts. For example, an algorithm that uses feedback from the environment to adapt the learning rate of a gradient descent can be considered to contribute overall to a T3 organization of the entire system. This is because there are three levels of adaptation: network < learning algorithm (teacher of the network) < teacher of the learning algorithm (learning to learn). In the present paper, we consider systems in which not only a small bit of knowledge is used at the lower levels of organization to form a T3 system (e.g., only one parameter value, the learning rate), but where the knowledge on how to learn contains a huge amount of information, i.e., the system becomes an expert on how to learn. See Fig. 1 for the organization of biological minds according to the theory of practopoiesis.

Practopoiesis proposes that all living individuals have a T3 organization, starting from each biological cell. Importantly, combining multiple Tn systems does not increase n. So, while each living cell alone operates as a T3 system, an entire organism built of billions of such cells still presents a T3 system. Similarly, a group of individual organisms or an entire society still forms a T3 system. The inclusion of evolution to form a T4 system applies to a single species and to the biosphere as a whole[1]. Thus, according to the theory, the entire evolving life on planet Earth can be understood as operating with four levels of policies, whereas each individual agent has "only" three.

1.1 Generalizing actions

In reinforcement learning theory, the actions of an agent are conceptually different from the processes of learning by that agent. Practopoiesis generalizes all those forms of actions to one single concept: adaptive traverses (here indicated by arrows, →). A traverse occurs whenever the knowledge of an agent, together with feedback from the environment, is used to make some changes either to the agent itself or to the environment. Two traverses are considered different if their mutual impacts are asymmetric: one traverse affects the knowledge of the other but not the other way around. For example, a backpropagation learning rule changes the mapping function of a neural net, but the mapping function of the neural net does not change the learning rule. Hence, here two traverses can be distinguished.
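To make the distinction between adaptive levels concrete, the learning-rate example above can be sketched in a few lines of Python. This is a hypothetical toy, not an implementation from the paper: πN is a single weight, πA is a gradient-like learning rule with one parameter (the learning rate), and a third, πG-like level adapts the learning rule itself from feedback, yielding a minimal T3-style organization.

```python
# Hypothetical sketch (not from the paper): a T3-organization in which
# each level adapts the level above it, as in pi_G -> pi_A -> pi_N.

class T3Agent:
    def __init__(self):
        self.weight = 0.0          # pi_N: the network's knowledge (one parameter)
        self.learning_rate = 2.5   # pi_A: the rule that changes pi_N ("fast learning")
        self.meta_rate = 0.1       # pi_G: the rule that changes pi_A ("learning to learn")

    def act(self, x):
        # Traverse of pi_N: act on the environment (T1-level behavior).
        return self.weight * x

    def learn(self, x, target):
        # Traverse of pi_A: use environmental feedback to change pi_N (T2 level).
        error = target - self.act(x)
        self.weight += self.learning_rate * error * x
        return abs(error)

    def learn_to_learn(self, prev_error, new_error):
        # Traverse of pi_G: use feedback to change pi_A itself (T3 level),
        # e.g., shrink the learning rate whenever the error stops improving.
        if new_error >= prev_error:
            self.learning_rate *= (1.0 - self.meta_rate)

agent = T3Agent()
prev = float("inf")
for _ in range(50):
    err = agent.learn(x=1.0, target=3.0)
    agent.learn_to_learn(prev, err)
    prev = err
print(round(agent.act(1.0), 2))  # -> 3.0
```

With the deliberately too-large initial learning rate of 2.5, the third traverse shrinks the rate until the lower-level learning converges; if the learn_to_learn() calls are removed, the same system remains a T2-agent and, with this fixed rate, diverges instead.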

Fig. 1 The organization of biological minds according to the theory of practopoiesis. The descriptions of the three traverses are shown on the right. On the left is the knowledge that each traverse uses for its operations, as well as the knowledge that each traverse creates by its operations. Top to Top-3 indicate the depths of the adaptive levels.

A system with operational capabilities at n levels is a Tn-system and has n traverses, each directed either towards the outside of the agent or towards the inside of the agent (the latter often being referred to as learning).

The total number of traverses equals the number of organization levels at which policies exist. This is because the actions of the policy at the top level of organization, πN, affect the environment directly. Hence, the full interaction (with all the arrows) between a biological species as an agent and its environment can be written as

πE → πG → πA → πN → U.    (1)

That way, reinforcement learning can be considered a special case of practopoietic systems. From the above considerations, we can also see that reinforcement learning is not nearly as sophisticated an implementation of practopoiesis as is natural intelligence. The reason is the larger number of adaptive levels in natural intelligence.

1.2 Generalizing feedback

In reinforcement learning, learning mechanisms receive feedback. In practopoietic systems, policies at each level of organization receive feedback inputs too. These inputs are conceptually different in reinforcement learning theory: The input for πx is the state s; the input for πy is the reward r.

Practopoiesis conceptually generalizes those feedback inputs as ik, where k is the level of organization of the system (E, G, A or N). Feedback inputs can be acquired through sensory inputs shared across policies at different levels. For example, a camera can provide the necessary information for both πN and πA.

The goal of the present study is to demonstrate the advantages of such a hierarchical organization of multiple policies (traverses) as compared to an agent with fewer policies. The present analysis focuses on the interactions πA → πN → U, presuming that the policies πA have already been put in place. The broader question of how the developmental processes responsible for acquiring the proper architecture of πA operate through the application of πG is not addressed here. This topic has been discussed in some detail in [1].

2 Calculating variety of a human agent and its Umwelt

The problem addressed here is founded in Ashby's law of requisite variety[7]. This law states that, for successful control of a system, the system that controls has to have at least as many states as the system that is being controlled. Otherwise, it would not be possible to produce a sufficient number of different responses, given the number of different challenges that the environment poses on a controller, or on an agent in general. The question is then: How many states can a human brain (or an AI-agent) theoretically assume, and is this number sufficiently large to address the variety of the real-life problems that such agents face? The immediately following question is: Can an increase in the number of policies (traverses) overcome any limitations in the variety of the agent?

The key presumption behind the present calculations is that the upper limit on the total variety of states that a policy of an agent can produce is set by the total amount of memory that the policy requires. The available amount of memory represents the maximum entropy that the system can generate such that its actions are still informed about the environment in which it acts. Therefore, although one may argue that an agent could produce high entropy simply by generating noise, only behavior that is informed about the properties of the environment is relevant for satisfying the law of requisite variety. The latter follows from the good regulator theorem[8], which states that a regulator can be successful in regulating a system only if it is a good model of that system. The memory requirements on variety are the memory requirements for becoming a good model of the world in which the agent operates.

The calculation of total variety gives an upper bound estimate of what an agent can perform: how many different sensory inputs it can distinguish in order to consider making different actions. If one thinks of an agent's memory as a set of templates against which the input is matched, then the estimate of total memory is related to an estimate of the number of templates that could be used by that agent.

In other words, we ask how many different patterns (templates) the brain can store in its (synaptic) memory. As we will see, we are looking at the brain more as something like a read-only memory (ROM) than as a memory that can easily replace its contents. The reason for that is that it takes us a lifetime to acquire knowledge about the world and, once acquired, this knowledge is not easily replaceable. The number of patterns acquired through a lifetime indicates the maximum variety of states that a developed brain, as an Ashby regulator, can generate for the agent in order to produce meaningful actions on the environment and thereby help the agent's survival. In other words, this number indicates: 1) the amount of knowledge that the brain can possibly have on how to respond in a given situation and, by doing so, 2) the variety of responses to sensory inputs that it can produce based on that knowledge.

We are interested only in meaningful, informed states, i.e., states that reflect some previously acquired knowledge about the surrounding world. Of course, the molecules in the brain can have many more states, but if these additional states are not stimulus-dependent, they either have to be mutually dependent (correlated) or have to be understood as noise.

We consider here first agents that do not learn (only later do we consider learning). Traditional brain theory and traditional AI both rely on two traverses (i.e., T2-agents). One of these traverses is for learning, which is the adaptive mechanism located lower on the practopoietic hierarchy. The other traverse is implemented by the mechanisms for processing inputs and executing actions, usually referred to as neural activity in brain sciences and as policies in machine learning. We are interested in the variety of this single top-level traverse. In contrast to the traditional T2-agents, the novel approach with T3-agents presumes the use of the two highest-level traverses for processing inputs and executing actions[1].

The question addressed presently is whether T2-agents can possibly generate sufficient variety of behavior in real-life situations or whether instead only a T3-agent can satisfy those needs. So far, T2-agents have been applied (either as a brain theory or as an AI) to limited domains of problems, which require much less variety than what an average adult human person may need in real life. The present question is whether these T2-approaches can scale up to the real human-level demands and thus to human-level intelligence.

2.1 Variety generated by human brain

For an average adult human brain without further learning, we can estimate the total variety based on the total number of synapses and the number of bits stored at each synapse. According to a recent study, an optimistic estimate is that a single synapse can store 4.6 bits of information[9]. Furthermore, if there are about 1 000 synapses for each neuron and there are about 100 billion (10^11) neurons in the brain, we have in total 10^11 × 1 000 = 10^14, or 100 trillion, synapses. We can now compute two types of variety: one for a brain that has already matured and cannot make substantial changes easily (i.e., it cannot suddenly replace its memories with the memories of another person), and the other for all possible sets of lifelong experiences that a maturing brain could potentially encounter (all the different lives that a person could theoretically live).

Given that 4.6 bits of information can be stored per synapse[9], the upper bound of the total theoretical variety that a mature adult human brain can generate without further learning is

4.6 × 10^14 bits.

As 4.6 bits can assume a total of 2^4.6 = 24.25, or about 24, states, we have a total of about

2.4 × 10^15 different states.    (2)

This is roughly 500 terabytes of memory, which is within the realms of what can be achieved by today's IT technology.

It is first important to note that this number indicates the total variety of input-output mappings that an adult brain can generate and that it corresponds to the total memory storage.

A different calculation is for the total number of possible states that the memory can assume. In that case, the total number of possible states is the number of states at a synapse, 2^4.6 ≈ 24, raised to the power of the number of synapses, i.e.,

24^(10^14)    (3)

which is a much larger number (even larger than the number of atoms in the visible universe, which is only about 10^80). So, why would we not consider the latter number as the theoretical limit of the variety that a human brain can produce? The difference between the numbers calculated in (2) and (3) is the difference between what an adult brain can produce once it has gone through development, education, etc., as opposed to the variety of all the different developments and educations that a person may have gotten. The number in (3) is the variety of all possible lives that a person could theoretically live. However, once we have lived one life, the specific set of experiences defines what we have acquired in our long-term memory, and that knowledge cannot be changed any longer, or at least not easily (in a way, it acts as a ROM). And this is what the number in (2) describes. In the present paper, we are concerned with (2): We are considering the number of states a trained brain can generate, without having an option to live another life with a different set of experiences. According to the above estimate, this variety is limited to only about 10^15 states.¹

¹ The difference between the two types of variety can be illustrated by a medium for memory storage such as a DVD-ROM. If a DVD has 4.7 GB of memory, there are two different types of variety we can calculate. One is the number of different movies that can be stored on that DVD. This number corresponds to the calculation made in (3) and is a huge number (also much larger than the number of atoms in the universe). The second variety is the amount of data that can be read from the memory storage once a movie has been burned on the DVD. This number is much smaller: it is only 4.7 GB. We are considering this smaller number in the present study.

The number 10^15 gives us the maximum number of different responses that a mature adult brain can create without further learning. To imagine what this number may indicate, consider the famous patient H.M., who at adult age lost the ability to create new long-term memories after bilateral medial temporal lobectomy[10]. H.M. retained all his previous knowledge and was perfectly able to hold a conversation, read text, watch a movie or interact with his environment. He only could not create any new long-term memories. He could not change the knowledge acquired up to the point of the surgery, but he could perfectly use it. The question is then: How many different stimuli, sentences, events and situations could H.M. distinguish and respond to meaningfully? According to the above calculation and a T2-theory of the brain, this number is 4.6 × 10^14 bits, and it indicates the total richness of his mental life that could possibly occur.

In other words, by freezing learning, we are turning a T2-agent into a T1-agent, albeit a well-trained one. This number tells us how many input-output mappings can maximally be performed.

The real task behind processing a sensory input, for H.M. or any other intelligent adaptive agent, is to compare inputs with the entire existing knowledge of all possible patterns that the agent can detect. In a simple pattern recognition task, the agent has to identify the stimulus against its entire database. And we humans can do this very well, immediately. For example, we can just see a car by checking the shape in the stimulus against all of the other shapes that we have in memory.

This direct distinguishability of stimuli at the perceptual level for the human mind can be tested in experiments with perceptual pop-out[11, 12]. These experiments tell us that we also have limitations. We are not able to distinguish any arbitrary set of random stimuli; e.g., "IOVGJIZGSIOHIO" versus "IOVGJIKGSIOHIO" cannot be distinguished without a slow serial search for a difference. However, either of the two sequences above can easily be distinguished from "IOVGJI_SIOHIO". The underscore symbol induces a so-called perceptual pop-out. Any everyday visual scene is full of perceptual pop-outs for a human mind.

However, what we humans really excel at in comparison to machines is that we are able to combine this variety of perceptual stimuli with the variety of semantic information. We test everything in parallel, the picture and its meaning. If the only problem of AI was finding the difference between two visual stimuli, a simple search algorithm would do the job and would by far outperform any human.

It turns out that there is a way to estimate how much information we are able to process in parallel. Our ability to process semantic information in parallel is related to the size of our working memory (or short-term memory). This memory storage is highly limited in capacity[13, 14], is highly correlated to intelligence quotient (IQ)[15], and is based on semantic information extracted from long-term memory[13, 14, 16]. Our ability to store information in working memory is determined by how much knowledge we already possess about the stimuli. A color expert will be able to store more information about colors than a non-expert. An educated Chinese-speaking person will be able to store more Chinese characters than a non-Chinese speaker, etc.

And humans generally outperform today's AI. We humans can solve many AI-related problems immediately, i.e., much like H.M. could, without a need for additional learning. We can just look at a visual scene or just hear a narration and extract much more relevant information than an existing AI-machine can today. We use these simultaneous detection capabilities to make decisions while driving a car, watching a movie or comprehending language, and for making purchasing decisions. In all those acts, we compare the current stimulus with all our knowledge acquired until that point in time – and we do it in a blink of an eye.

Thus, the high demands on the variety of an AI-agent come from this parallel template matching against the entire knowledge of the agent. These human capabilities of performing such matching processes fast make us smarter than the machine. Our superiority is seen most obviously in situations in which the variety of sensory inputs has to be combined with the variety of semantic knowledge. This efficient combination of sensory+semantic contents makes us much better at understanding visual scenes and natural speech, or simply at playing the game of go (until recently).

The present analysis is about the question of whether 10^15 provides sufficient storage for the patterns that the brain needs for such pattern-matching analyses. We compare two different theories of how the brain is adaptively organized (T2 versus T3 organization) with estimates of the variety demands posed by the real life of an adult human person.

The present analyses are made under the assumption that all 10^15 combinations are used without any redundancies or other sub-optimalities. Thus, we are estimating the maximal theoretical limits of pattern-matching mechanisms, presuming that those have been implemented in the most optimal way possible.

2.2 Variety of real life

This number, 10^15, seems large for producing a lot of intelligent behavior, but the question is: Is it large enough? The other side of the equation is: How much variety does real life require?

The question of the variety in real life can be approached by calculating the amount of meaningful variety in the sensory inputs to agents. We do not want to estimate the total number of combinations that the pixels of an image can assume. We are interested only in the number of combinations that need to be understood by the agent in order to behave successfully in a given world. The question is: How many different situations may a human observer need to distinguish, understand and respond to meaningfully? This would then be an estimate of how much variety the human brain should be able to account for.

In the first step of the analysis, we focus only on the number of different sentences that a human mind may need to be able to comprehend. Our language is generative, and a person may expect from the surrounding world any possible message, and should thus be able to decode any of them. To make a rough estimate of the order of magnitude of the combinations that can emerge, let us presume that an educated native speaker of English has 15 000 words in their vocabulary[17, 18]. In addition, let us presume that adverbs, adjectives, verbs and nouns correspond respectively to 5%, 20%, 20% and 55% of the vocabulary. This leads to 750, 3 000, 3 000 and 8 250 words in each of the four categories for an average speaker.

From those numbers, we can calculate the number of all combinations of sentences of different lengths. For three-word sentences that consist of a noun, followed by a verb and ending with a noun, we obtain roughly

8 250 × 3 000 × 8 250 ≈ 2 × 10^11

combinations.

This number fits within the variety of the human brain neuron and 4.6 bits per synapse, the network would not estimated above. But if we add an adjective to each of the be large enough to associate a different response for each nouns to make five-word sentences: adjective-noun-verb- of the possible five-word sentences but could only do it for adjective-noun, we get a total of 2 × 1011 × 3 000 × 3 000 three-word sentences.

18 If the properties of this network correspond to the capac- ≈ 2 × 10 ities of our brain, we also could not distinguish most pairs combinations. of five-word sentences. Those pairs should sound the same This number is already bigger than the limit that is posed to us if pronounced, or look the same if written on a paper. by the total number of synapses in the brain, presuming But this is clearly not the case. For us, it is easy to that synapses are indeed the storage of information and that distinguish such sentences. How is that possible? each synapse can store about 5 bits. If we add an adverb to Before addressing this question and discussing the prop- each verb, the number of combinations grows even further, erties of T3-agents, let us first note that the combinatorial etc. Therefore, there seems not to be enough memory in problem of the real world versus the limited variety of a our brains to generate a different response for five-word or brain, does not stop at language. The problem is the same larger sentences. and becomes possibly even bigger when vision is considered. Importantly, however, we have to consider which of Vision may require even larger variety than language both these sentences are meaningful to an average human, and at the level of semantics and at the level of sensory inputs. which are not. One sentence that is not meaningful cannot Visual objects have different colors, sizes, shapes, positions, be necessarily considered as producing a different brain shades, etc. state as another sentence that is also not meaningful. When trying to understand the variety of processing in Rather, all meaningless sentences may be considered to vision, we can ask a question of how many meaningful visual result in one the same state (e.g., a “meaningless sentence” scenes our brain is capable of perceiving and distinguishing? state). 
Indeed, most likely, majority of the sentences To estimated that number, we will turn to the capacity of in the above calculation can be considered as not being visual working memory (a.k.a. short-term memory). Work- successfully processed by human semantic machinery ing memory is not just a storage of information. It is a place and hence as meaningless. To illustrate that point, where information is processed and this processing/storage [16, 19] we list here a few randomly generated sentences (from depends primarily on the meaningfulness of the items . http://watchout4snakes.com/wo4snakes/Random/Random Working memory stores information by the very means of [13] Sentence): finding meaning in it . Hence, the capacity of working “The agony damages the regional spur below a pride.” memory can be used as an indicator of how much meaning “Our insult prices the flame.” can visual system extract from a visual scene. “Behind the younger textbook quibbles an implied dealer.” Experiments indicate that visual working memory can [20, 21] Hence, only a small fraction of five-word sentences should store about four objects and only if we are very famil- [14, 19−21] be counted as meaningful. But there are also many more iar with them and only if a category exists for each [16] six-word, seven-word long and longer sentences that hu- object . Thus, if we conservatively assume that an adult mans find perfectly meaningful. Each of these meaningful human is able to distinguish 10 000 different categories of sentences should induce a different state in the brain. Thus, objects, working memory for four objects would require a the total number of possible meaningful sentences is not total variety of easy to estimate. Below, we make this assessment based on (104)4 =1016 the capacity of human working memory. combinations. 
This would mean that already the com- Before that, let us first point out that T2-theories pre- binations needed for visual working memory cannot be sume that both sensory and semantic processing are per- 15 formed at the same level of organization and thus, that it accounted for by the memory of 10 states. Working is not just the meaning that the brain (or an AI) needs to memory capacity reflects human capacity to understand a visual scene and is tightly related to the attentional account for. It is the also the sensory inputs. All of those [19, 22, 23] 14 capacity . The present result would mean that the functions are covered by the number 4.6 × 10 bits. This means that the above calculations suggest that a variety provided by our synaptic memory is not sufficient human brain should be unable to distinguish already at the to enable us to understand a visual scene of four objects. sensory level most of five-word sentences (let alone their One possibility is that a T2-brain has more capacity to meaning). As the discrepancy is not small but is almost generate variety than the currently estimated. Another pos- sibility would be that the capacity of four is an overestimate four orders of magnitude, this would mean that most of the and involves some type of chunking (as shown for tasks that pairs of random five-word sentences a brain with the variety [14] of 4.6 × 1014 bits could not be even noticed as two different show capacity larger than four and that the “true” ca- sentences. pacity of visual working memory is perhaps just three ob- In other words, if we simulate on a computer an artificial jects. The latter hypothesis would lead to neural network with 100 billion neurons, 1 000 synapses per (104)3 =1012 538 International Journal of Automation and Computing 14(5), October 2017 combinations, and would fit well within the supposed 1015 2.3 Variety of T3-agents combinations of a T2-brain. 
Therefore, similarly to what we have concluded for the semantics of verbal materials, the semantic properties of visual working memory may fit, with some stretching(!), within the apparent limits of the brain.

However, even if both of the above hypotheses were correct, and the brain had at the same time more storage than assumed (e.g., more synapses) and a working-memory capacity of three objects, not four, still another source of a combinatorial problem would remain. The above calculation accounts only for the semantic memory, i.e., object identities, and does not take into account the variety of sensory inputs with which these objects come[20]. The fact is that there is not a single shape, size, color or shading for most of the objects that we can recognize and categorize. Normally, visual objects come in a huge variety of visual appearances, and this variety also needs to be taken care of by the brain.

If we conservatively assume that we can perceptually easily detect each object in just 10 000 different forms, this leads to

(10^12)^4 = 10^48

combinations for three-object working memory (attention capacity), and to

(10^16)^4 = 10^64

combinations for four objects. These numbers readily exceed the estimated capacity of the brain.

In fact, the number of visual combinations in which visual objects can come and can be perceptually distinguished by our visual system without any significant effort may be even larger. If we just assume that we can perceive an object, e.g., a car, in 10 different shapes, in 10 sizes of retinal projections and in 10 orientations, with 10 different colors and 10 patterns of shading, we already have 10^5 combinations for that object. And these numbers are likely to be much higher in reality. A similar problem holds for auditory inputs and recognition of speech.

These real-life variety numbers seem too high to be accounted for by stretching the estimates of the number of synapses or their individual memory capacity. Rather, it seems that there is a fundamental discrepancy between what a T2-brain of reasonable size can offer (be it biological or not) and what the real-life demands pose on human-level intellectual capabilities.

In conclusion, it seems that the T2-theory of the brain, which bases mental operations on a single policy, may account for the total variety of semantics, but the problem is with the additional variety of perceptual inputs. It seems that the combinatorial possibilities of perceptual inputs in real life create the real problem, as they need to combine with semantics, and the resulting variety exceeds by far what a maximally optimized brain with 100 billion neurons and 1 000 synapses per neuron could possibly deal with.

2.3 Variety of T3-agents

The above problems have been encountered when a single policy was considered. Here, we will discuss how multiple policies can provide a relief for that problem (called variety relief in practopoietic theory[1]). To understand the solution offered by the variety relief in T3-agents, it is useful to first consider the boost in variety that can be achieved by the process of learning in a T2-agent. If learning is not frozen, and we thus presume a fully healthy brain (not H.M.'s brain), we can repurpose the resources and replace one type of knowledge that is no longer needed with new knowledge that may be more valid in a new situation. That way, when learning is allowed, a much higher total variety can be produced.

For example, if memory storage for some text-storage device is limited to just one million characters, only one or a few books can be stored in this memory. However, if the device can "relearn" by deleting old and loading new books, the device can store all possible books that do not exceed 1 million characters. In fact, the total possible variety of that memory storage is 26^(10^6) ≈ 10^1414973 different combinations of 26 letters in the English alphabet (for comparison, as mentioned, the number of atoms in the visible universe is about 10^80).

With a limited brain size or neural-network size, changes to the network's knowledge are thus the key process for boosting variety. But what if, in addition to the slow learning of facts and skills, another mechanism operated and made the brain change its knowledge at another level and at high speed? If the brain had some quick way of reorganizing its anatomy and changing its memories, it could produce a much higher variety than 10^15. It may in fact have enough variety to account for the richness of the sensory inputs.

The hierarchy of policies in a T3-agent described above in (1) can offer exactly this learning-based boost in variety. As policy πA can change policy πN, the total variety of the agent increases.

The speed of this adaptive change at πA would correspond to the "speed" of our thought. Each time we think a new thought or create a new idea, we may be reorganizing our brains in such a fast way. Hence, the speed of this change should correspond to the speed of our mental processes, the lower limit being known to lie somewhere at 100 or 150 milliseconds[1].

2.4 How much can the variety increase theoretically?

We have seen that the maximum possible variety of a 1-million character storage is 26^(10^6) ≈ 10^1414973, and this puts the upper bound, as it presumes a "learning" mechanism (i.e., the loading mechanism) that is itself unlimited in knowledge creation capabilities. However, in most cases, this is not realistic. The learning mechanism has its own limitations.
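The unlimited-loader upper bound from the book-storage example can be checked numerically. The sketch below computes only the order of magnitude, since the exact integer 26^(10^6) has roughly 1.4 million digits:

```python
import math

# A 1-million-character store over a 26-letter alphabet admits
# 26**1_000_000 distinct contents. The integer is far too large to
# print, so we compute only its order of magnitude via logarithms.
chars, alphabet = 1_000_000, 26
log10_variety = chars * math.log10(alphabet)
print(f"variety ~ 10^{log10_variety:.0f}")  # variety ~ 10^1414973

# For comparison: the commonly cited count of atoms in the visible
# universe, about 10^80, is negligible next to this bound.
print(log10_variety > 80)  # True
```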

In general, when the variety of the learning mechanism is considered, the combined variety across two levels of organization can be computed as a product of the two varieties: If πA has NA possible states, and πN has NN states, the maximum total theoretical number of states that could be produced by the combined agent is

NA × NN.

For example, in the 1-million character memory from the example above, we may presume a book-loading "learning" mechanism that has only 10 different states. This loading mechanism cannot load more than 10 different books. As a consequence, the total possible variety of the entire system (memory + loader) is 10 million different states. In general, depending on the limitations of the learning mechanisms, there will normally be a stark reduction in the number of combinations in comparison to what would be achieved by an unlimited learning mechanism (in the above example, 10^7 instead of the 26^(10^6) achievable with an unlimited loader).

In adaptive systems, the limitations on learning come from the limited sources of knowledge. If the knowledge were already prepared in a ready-to-use form and stored elsewhere, it could simply be loaded (like from a larger hard-disk to the smaller RAM memory of a computer). This would make the problem trivial. Unfortunately, adaptive systems do not have such an auxiliary depository of knowledge of how to interact with the world. Rather, biological systems have to extract that knowledge from the environment, which is why they are adaptive in the first place.

As mentioned, the process of extracting knowledge from the environment is referred to as traverse in [1]. For example, application of reinforcement learning is a traverse: knowledge on how to learn, stored at a lower level, is applied through interaction with the environment in order to create new knowledge (a new policy) at a higher level. The hierarchy of policies in (1) generalizes that relation.

2.5 How many states can a frozen T3-brain theoretically produce?

Let us presume that the brain is a T3-agent and that when frozen (i.e., without learning), it becomes a T2-agent. Let us also presume that the brain uses much of its variety for the lower level of the two remaining, i.e., for storing πA. This is where the abstract knowledge is stored, such as concepts. Hence, this level of organization can be referred to as ideatheca (meaning storage of concepts).

Let us conservatively assume that ideatheca (i.e., πA) has just 10^12 states, which is what we estimated above as the lower bound of semantic capacity enabling three-item working memory. Next, let us presume that πN has even less variety and set it to the value 10^10. This presumes that only a small portion of the entire brain's resources is under the influence of ideatheca and can be changed quickly in less than a second. In particular, the choice of this number presumes that only 1/1 000-th of the total memory machinery of the brain is being changed in such a rapid way.

Under these assumptions, the total number of states that a combined πA → πN could produce without any additional learning is

10^12 × 10^10 = 10^22.

This number is much larger than 10^15 and much more suitable for coping with the estimated real-life requirements on variety. This number indicates that if H.M. was a T3-agent before the surgery and became limited to a T2-agency after the surgery (losing his third traverse), this patient may have had the possible richness of mental life that could deal with 10^22 combinations. Irrespective of whether the estimate of his semantic memory of concepts is about 10^12 or 10^16, there is still a lot of room left for additional combinations of sensory inputs that indicate those concepts in the surrounding world and that H.M. could efficiently process.

The number 10^22 would also correspond to a neural network that has 100 billion neurons and 1 000 synapses per neuron, but also has an additional set of mechanisms that change the properties of the network at a rapid rate and on the basis of the incoming sensory inputs. To achieve a variety of 10^22, it would be sufficient to enable changing one bit of information per neuron (there are about 100 billion neurons in the human brain). For example, a neuron could be switched on or off by its adaptation mechanisms.

For this to work, a pre-requirement is that the slow learning mechanisms, noted as πG in (1), provide the knowledge to πA on how to adjust πN. In other words, by slow learning mechanisms and throughout many years of the development of the nervous system, the network must first learn how to make these quick adjustments to its πN. That is, the network has to acquire the 10^12 amount of πA knowledge through its development time. In that case, the agent can be considered as "understanding" the sensory inputs. Understanding would mean that the operation of πA gives the stimuli the best possible interpretation given all of the knowledge that the agent has acquired through lifetime (for details see the section on abductive reasoning in tri-traversal agents in [1]).

The alternative to extending the hierarchy would be to cope with the variety requirements by simply increasing the total size of the given policy, i.e., by increasing the network size. In that case, variety grows linearly with the number of components. To double the variety of patterns stored in the brain, the size of the brain needs to be doubled. To increase variety to 10^22 states, from 10^15 states in a 1.5-kilogram brain, we would need an increase to 1.5 × 10^7 kilograms of biological mass. This is more than the cumulative size of all the brains of all the people currently living on planet Earth.

3 Conclusions

A T2-AI, which means an AI based on a single memory storage and on a single set of learning mechanisms, cannot
possibly reach the intelligence of a human. This conclusion is made on the basis of Ashby's requisite variety theorem[7] and an estimate of the total theoretical variety of control that a brain can create given the number of neurons and synapses. It turns out that the variety the brain could possibly create if it was a T2-agent would not be enough to deal successfully with the demands of the real-world environment. However, we also show that if the organization of the brain formed a T3-agent, the size of the brain would suffice. Accordingly, an AI that would mimic human intelligence would have to be organized as a T3-agent too.

The variety of a T2-agent would be sufficient to implement all of the semantic knowledge of an adult human person that can be expressed in words, but would not suffice for the requirements of the sensory detection of those semantic categories. The objects and situations that need to be detected from the sensory data require too much variety to be dealt with by a T2-brain. This is the case for both recognition of visual scenes and understanding of speech, and the problem does not go away even if the coding and processing is maximally optimized in the brain.

The important implication of the present analysis is that no novel optimization or invention of a new algorithm, or discovery of a new architecture for neural circuits, can possibly bring a T2-agent (i.e., a traditional single-policy + learning-mechanism agent) with a reasonable size of resources to a human-level intelligence. The present calculations already presume that all the operations and coding schemes in the organization of the agent have been optimized to the theoretical maximum. Thus, no new creative invention in machine learning is possible that could bring the modern approaches to AI to the intelligence level of humans. In other words, to build artificial general intelligence, we need to seek beyond deep learning networks, Markov chains, Bayesian networks and the similar. Otherwise, we would need to scale up the resources to prohibitively large sizes.

The only way to create an artificial system that is human-level intelligent with reasonable resources is to implement a hierarchy of policies, which then makes it possible for the decisions about driving, walking, moving, etc. to rely on the full variety of the sensory data. A T3-agent with realistic computational resources can perform such a task and, once it has acquired the knowledge of an average adult person, it could generate a variety of 10^22 states. This number is sufficient to deal with all the semantic knowledge, and still plenty of sensory information can be processed. And, if needed, there would be enough room for increasing that number within the realm of current IT technology.

A change from T2 to T3-organization comes with some costs[1]. One cost is that the entire agent always operates more slowly with more traverses than with fewer. This is because the additional adaptive processes require time to complete. In human mental operations, this slowdown ranges from 100s of milliseconds to seconds (for more details see [1]).

In [1], it has been proposed that the physiological mechanisms underlying anapoiesis, i.e., the application of knowledge in ideatheca to change network properties, are implemented through neural adaptation. Furthermore, these mechanisms are proposed to rely on sensory inputs and hence, largely on the variety stored in synapses that process those sensory inputs. In [24], several testable empirical predictions have been proposed as derived from the theorized T3-organization of the brain.

In conclusion, an AI that matches human intellectual capabilities is possible only in tri-traversal systems.

Acknowledgments

The author would like to thank Hrvoje Nikolić, Raul C. Muresan, Shan Yu and Matt Mahoney for valuable comments on previous versions of the manuscript.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

[1] D. Nikolić. Practopoiesis: Or how life fosters a mind. Journal of Theoretical Biology, vol. 373, pp. 40–61, 2015.

[2] D. Nikolić. Practopoiesis: How cybernetics of biology can help AI. [Online], Available: https://www.singularityweblog.com/practopoiesis/, 2014.

[3] G. R. Chen. Pinning control and controllability of complex dynamical networks. International Journal of Automation and Computing, vol. 14, no. 1, pp. 1–9, 2017.

[4] Y. Jiang, J. Y. Dai. An adaptive regulation problem and its application. International Journal of Automation and Computing, vol. 14, no. 2, pp. 221–228, 2017.

[5] R. S. Sutton, A. G. Barto. Reinforcement Learning, Cambridge, MA, USA: MIT Press, 1998.

[6] C. J. C. H. Watkins. Learning from Delayed Rewards, Ph.D. dissertation, Cambridge University, UK, 1989.

[7] W. R. Ashby. Principles of the self-organizing dynamic system. The Journal of General Psychology, vol. 37, no. 2, pp. 125–128, 1947.

[8] R. C. Conant, W. R. Ashby. Every good regulator of a system must be a model of that system. International Journal of Systems Science, vol. 1, no. 2, pp. 89–97, 1970.

[9] T. M. Bartol, C. Bromer, J. P. Kinney, M. A. Chirillo, J. N. Bourne, K. M. Harris, T. J. Sejnowski. Hippocampal spine head sizes are highly precise. bioRxiv, [Online], Available: http://dx.doi.org/10.1101/016329, March 11, 2015.

[10] S. Corkin. Lasting consequences of bilateral medial temporal lobectomy: Clinical course and experimental findings in H.M. Seminars in Neurology, vol. 4, no. 2, pp. 249–259, 1984.

[11] A. M. Treisman, G. Gelade. A feature-integration theory of attention. Cognitive Psychology, vol. 12, no. 1, pp. 97–136, 1980.

[12] A. Treisman. Preattentive processing in vision. Computer Vision, Graphics, and Image Processing, vol. 31, no. 2, pp. 156–177, 1985.

[13] G. A. Miller. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, vol. 63, no. 2, pp. 81–97, 1956.

[14] N. Cowan. The magical number 4 in short-term memory: A reconsideration of mental storage capacity. Behavioral and Brain Sciences, vol. 24, no. 1, pp. 87–114, 2001.

[15] R. W. Engle, M. Kane, S. W. Tuholski. Individual differences in working memory capacity and what they tell us about controlled attention, general fluid intelligence, and functions of the prefrontal cortex. Models of Working Memory: Mechanisms of Active Maintenance and Executive Control, A. Miyake, P. Shah, Eds., Cambridge, USA: Cambridge University Press, pp. 102–134, 1999.

[16] H. Olsson, L. Poom. Visual memory needs categories. Proceedings of the National Academy of Sciences of the United States of America, vol. 102, no. 24, pp. 8776–8780, 2005.

[17] A. Cervatiuc. Highly Proficient Adult Non-native English Speakers' Perceptions of their Second Language Vocabulary Learning Process, Ph.D. dissertation, University of Calgary, Canada, 2007.

[18] P. Nation, R. Waring. Vocabulary size, text coverage and word lists. Vocabulary: Description, Acquisition and Pedagogy, N. Schmitt, M. McCarthy, Eds., Cambridge, USA: Cambridge University Press, pp. 6–19, 1997.

[19] G. A. Alvarez, P. Cavanagh. The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychological Science, vol. 15, no. 2, pp. 106–111, 2004.

[20] S. J. Luck, E. K. Vogel. The capacity of visual working memory for features and conjunctions. Nature, vol. 390, no. 6657, pp. 279–281, 1997.

[21] D. Nikolić, W. Singer. Creation of visual long-term memory. Perception & Psychophysics, vol. 69, no. 6, pp. 904–912, 2007.

[22] E. Awh, J. Jonides. Overlapping mechanisms of attention and spatial working memory. Trends in Cognitive Sciences, vol. 5, no. 3, pp. 119–126, 2001.

[23] J. S. Mayer, R. A. Bittner, D. Nikolić, C. Bledowski, R. Goebel, D. E. J. Linden. Common neural substrates for visual working memory and attention. Neuroimage, vol. 36, no. 2, pp. 441–453, 2007.

[24] D. Nikolić. Testing the theory of practopoiesis using closed loops. Closed Loop Neuroscience, A. El Hady, Ed., Amsterdam: Academic Press, 2016.

Danko Nikolić received a degree in psychology and a degree in civil engineering from the University of Zagreb, Croatia. He received the Master's degree and the Ph.D. degree in cognitive psychology from the University of Oklahoma, USA. In 2010, he received a Private Docent title from the University of Zagreb, and in 2014 an associate professor title from the same university. He is now associated with the Frankfurt Institute for Advanced Studies and works at DXC Technology in the field of artificial intelligence and data science.

He has a keen interest in addressing the explanatory gap between the brain and the mind. His interest is in how the physical world of neuronal activity produces the mental world of perception and cognition. For many years he headed an electrophysiological lab at the Max-Planck Institute for Brain Research. He approached the problem of the explanatory gap from both sides, bottom-up and top-down. The bottom-up approach begins from brain physiology. The top-down approach investigates behavior and experiences. Each of the two approaches led him to develop a theory: The work on behavior and experiences led to the discovery of the phenomenon of ideasthesia (meaning "sensing concepts"). The work on physiology resulted in the theory of practopoiesis (meaning "creation of actions").

He has conducted many empirical studies in the background of those theories. These studies involved simultaneous recordings of activity of 100+ neurons in the visual cortex (extracellular recordings), behavioral and imaging studies in visual cognition (attention, working memory, long-term memory), and empirical investigations of phenomenal experiences. His research was supported by grants from the Hertie Foundation, Deutsche Forschungsgemeinschaft (DFG) and other sources.

E-mail: [email protected] (Corresponding author)
ORCID iD: 0000-0002-9317-8494