CS765 Interactive Cognitive Systems
Lecture 5: Cybernetics & Information
Matthew Egbert
Jim Warren

Back in Lecture 3…
What are some situated, embodied and dynamical approaches other than evolutionary robotics?
- cybernetics
- enaction

Feedback
A number of approaches to understanding cognition have focused upon different types of feedback. These tend to be more non-representationalist views that embrace aspects of situatedness, embodiment and dynamics. In particular, cybernetics (negative feedback as purposefulness).
- Positive feedback: the more XXXX there is, the more XXXX there will be!
- Negative feedback: the more XXXX there is, the less XXXX there will be!

We tend not to be very good at thinking about systems that involve circular causality.
Is it possible for a passive vehicle on a flat surface to go directly downwind faster than the wind that is powering it?
- wheeled vehicle
- no motor
- moving parts are allowed

Back to by-hand engineering
Instead of using EAs to design brains in SED robots, we can design controllers by hand.
- Understandable
- Test hypotheses
- Understand implications of an architecture or assumptions
- Demonstrate ideas
- But perhaps hard to do?
Can be representationalist or non-representationalist, or perhaps even somewhere in between?

The Homeostat
"the closest thing to a synthetic brain so far designed by man". The Thinking Machine, Time, 24 January 1949.

W. Ross Ashby
W. Ross Ashby was a psychiatrist who (like many cyberneticists) did some of his most interesting work outside of his professional life (almost a hobby).[1]
He wrote (very readable) books that had substantial (though not widely recognised) impact upon AI:
- Design for a Brain
- Introduction to Cybernetics
One of Ashby's most famous contributions was the homeostat.
[1] Pickering 2010. The Cybernetic Brain. The University of Chicago Press.

Ashby's Homeostat
The homeostat was assembled from bomb parts after WWII. It consisted of four identical units that were coupled to each other.
Homeostat
Each of the four units has a state (an electrical potential). This state is influenced by all of the other states. When any of the state variables goes out of a predefined "viable range", the way that each unit influences the other units is randomized until all of the variables return to the viable range.

Nested feedback loops
[Diagram labels: essential variable(s) of the system]
There is an environmental interaction feedback loop (between the Environment and the Response-system, R). This feedback loop is part of a larger feedback loop: if the Env-R interaction causes the essential variables to leave a predefined region, the Selector modifies R so as to produce a different Env-R interaction.

Ultrastability
This produces a remarkable form of adaptive regulation termed 'ultrastability'.
For instance, if the homeostat were appropriately hooked up to the controls for an aircraft, it would be capable of adjusting to a variety of major disruptions. Even total inversion of the controls, so that steering right makes the plane roll left!
- Essential variable = how level the plane is
- R = how to influence the plane for various instrument states

Human perception is also capable of adapting to radical transformations, such as inversion of the visual field.
What happens when someone wears goggles that flip the world upside down for a month? (Ivo Kohler 1951)
Initially they are horribly clumsy, but after time they adapt: they can ride a bike, ski, etc.! After a time, they report that the world does not appear upside down!
Anyone who wears glasses has experienced a milder version of this. Initially the world swims around, and then that disappears.
(YouTube search 'inverting goggles' for some amusing relevant videos.)
How could you build an artificial system that operated in this ultrastable way?

No model
We have an example here of a system that adapts to the world without modelling it.
The limits of this kind of adaptation are less explicit than in representationalist AI.
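The ultrastable mechanism described above (coupled units whose connections are randomized whenever a state variable leaves its viable range) can be illustrated with a toy simulation. This is only a minimal sketch under loud assumptions, not a model of Ashby's hardware: the units are linear with a leak term, weights are drawn uniformly from [-1, 1], a fresh random rewiring happens on every time step in which any state is out of range, and a polarity flip on unit 0 stands in for an "inversion of the controls". The names `run_homeostat`, `bias`, `invert_at` and all parameter values are hypothetical.

```python
import random

def run_homeostat(n_units=4, steps=40000, dt=0.01, limit=1.0,
                  invert_at=20000, bias=0.1, seed=0):
    """Simulate a toy homeostat; return (final_state, number_of_rewirings)."""
    rng = random.Random(seed)

    def random_weights():
        # One random setting of how strongly each unit influences each unit.
        return [[rng.uniform(-1.0, 1.0) for _ in range(n_units)]
                for _ in range(n_units)]

    weights = random_weights()
    state = [rng.uniform(-0.5, 0.5) for _ in range(n_units)]
    polarity = [1.0] * n_units  # assumed stand-in for a polarity switch
    rewirings = 0

    for t in range(steps):
        if t == invert_at:
            polarity[0] = -1.0  # disturbance: unit 0's output is inverted
        outputs = [p * s for p, s in zip(polarity, state)]
        # Euler step: leak towards zero + constant disturbance + coupling.
        state = [s + dt * (-s + bias + sum(w * o for w, o in zip(row, outputs)))
                 for s, row in zip(state, weights)]
        # Essential variables out of the viable range: rewire at random.
        if any(abs(s) > limit for s in state):
            weights = random_weights()
            rewirings += 1
    return state, rewirings

if __name__ == "__main__":
    final_state, rewirings = run_homeostat()
    print("rewirings:", rewirings)
    print("final state:", [round(s, 3) for s in final_state])
```

Nothing in the code models the environment or the inversion itself. The only "knowledge" is the viable range, and the random search simply continues until a wiring is found that keeps the states inside it.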
The homeostat responds to things that disturb its essential variables, regardless of what they are. This is not to say that it is guaranteed to adapt successfully, but just that the set of things that it can respond to is not 'circumscribed'.[1]
Discuss: Compare to micro-worlds and human adaptability.

Adapting to inversion of the visual field
Di Paolo (2000, 2003), Iizuka et al. 2013, etc.
- Two-wheeled robot with light sensors.
- Plastic neural network. If neural activity leaves predefined bounds, network weights are rearranged.
- Artificial evolution selects for NN parameters that accomplish phototaxis and keep neural activity within the predefined bounds.
- Robot adapts to inversion of the visual field (which was not present in evolution).

Information
The notion of 'information' is central in many approaches in AI. But what is information?
Claude Shannon

What is information?
Is there any way to be more quantitative about information?
Is this the same thing?

Shannon Information
Claude Shannon wrote a seminal paper in 1948 entitled "A Mathematical Theory of Communication", which considered information in the context of sending messages over a noisy channel. How much information can be sent over such channels, given a set of possible messages, an encoding, an amount of noise, etc.?
This paper essentially founded the entire field of Information Theory, with numerous applications, including lossless compression, satellite communications, etc.
Image taken from: Information Theory: A Tutorial Introduction, James V Stone (2018), arXiv:1802.05968 [cs.IT]

Bits & Channel Capacity
Assuming a noise-free channel: if you have two possible messages, equally probable, and you send one message every N seconds, the channel capacity is 1/N bits per second.
More generally, if we assume each message is equally likely to occur, the channel capacity is
$C = \frac{\log_2 X}{N}$ bits per second,
where X is the number of possible messages.

Imagine: the apocalypse has taken place. Your bunker has an ancient teletype device with 32 keys. Your zombie bite wounds are rather painful and so you can only type one key per second.
Q: What is the maximum channel capacity for requesting help using this device?!?!!1!
A: 5 bits / second
But note that with messages like "OWWWWWWWW. MY ARM HURTS.", you are not using that full capacity. Some letters are more common than others.

Entropy
...So a channel has a maximum capacity. How much of that capacity does an encoded message (an information 'source') utilize?
Entropy is a measure of the average information content of an information source.
If you think about the source of information as a random process, the entropy of that source is how uncertain you are about the result of sampling it.

Example: Consider a message that communicates the result of a coin flip. If the coin is 'fair', there are 2 equally possible messages, and so the message conveys 1 bit of information. ($\log_2 X$, where X is the number of possible messages, $= \log_2 2 = 1$)
Q: How many coin flip messages could you communicate per second using the post-apocalyptic zombie teletype? Remember, you can only push one button per second.
But what if the coin flip is biased? Does the entropy go up or down?
Consider the extreme. Every time it is flipped it comes up heads. How much uncertainty do you have in the message?

Entropy (continued)
When the probability of messages is uneven, the following equation can be used to describe the entropy of a source.
$H(X) = -\sum_{i} p(x_i) \log_2 p(x_i)$
The greater the number of possible messages, and the more evenly distributed their likelihood, the higher the entropy of the random process.
Example 1: a coin biased 25% / 75%.
H = -( (0.25 × -2.0) + (0.75 × -0.415) ) = 0.811 bits
Example 2: post-apocalyptic teletype maximum entropy per keystroke (i.e. all 32 keys evenly used).
H = -( 32 (1/32 × -5) ) = 5 bits

Joint Entropy
The entropy of two independent variables is calculated in a similar way.
$H(X,Y) = -\sum_{x} \sum_{y} p(x,y) \log_2 p(x,y)$

Conditional Entropy
If you know X, how much entropy remains in Y?
Read H(Y|X) as "entropy of Y given X".
Q: If Y and X are binary variables, and X is always the opposite value of Y, what is H(Y|X)?
Q: If Y and X are completely independent, what is H(Y|X)?

Mutual Information
If I know X, how much does that reduce the entropy of Y?
Put another way: how much does knowledge of X reduce uncertainty about the state of Y?
People say as a shorthand, "X contains information about Y," but be careful you don't confuse yourself with this language.
This measure is symmetrical: I(X;Y) = I(Y;X)
https://en.wikipedia.org/wiki/Mutual_information

But what does it all mean?
Given the excitement of a neuron (X) and the presence of something in the environment (Y), what does I(X;Y) describe? What does it mean for there to be a high (or low) degree of mutual information between these variables?
This will be the topic we discuss with Randy Beer when he visits (21 May).
(Note also that these information-theoretic values are used in ML and other areas of AI not covered here.)

Be careful! Don't confuse yellow definitions with the blue definitions!
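The quantities in this part of the lecture can be checked numerically. The following is a small sketch that uses the standard identities H(Y|X) = H(X,Y) - H(X) and I(X;Y) = H(Y) - H(Y|X), which are textbook definitions rather than formulas taken from the slides. The function names and the dictionary layout for joint distributions (`{(x, y): probability}`) are assumptions made for illustration.

```python
from math import log2

def channel_capacity(n_messages, seconds_per_message):
    """Noise-free capacity in bits/second for equally likely messages."""
    return log2(n_messages) / seconds_per_message

def entropy(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return -sum(p * log2(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of one variable from a joint table {(x, y): p}."""
    totals = {}
    for outcome, p in joint.items():
        totals[outcome[axis]] = totals.get(outcome[axis], 0.0) + p
    return totals

def conditional_entropy(joint):
    """H(Y|X) = H(X,Y) - H(X)."""
    return entropy(joint.values()) - entropy(marginal(joint, 0).values())

def mutual_information(joint):
    """I(X;Y) = H(Y) - H(Y|X)."""
    return entropy(marginal(joint, 1).values()) - conditional_entropy(joint)

# Teletype: 32 equally likely keys, one key per second.
teletype_capacity = channel_capacity(32, 1)       # 5 bits per second
biased_coin_entropy = entropy([0.25, 0.75])       # about 0.811 bits
teletype_entropy = entropy([1 / 32] * 32)         # 5 bits per keystroke

# Two binary variables where X is always the opposite of Y.
opposite = {(0, 1): 0.5, (1, 0): 0.5}
# Two independent fair binary variables.
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}
```

With these tables, `conditional_entropy(opposite)` is 0 bits (knowing X leaves no uncertainty about Y) and `mutual_information(opposite)` is 1 bit, while for `independent` the conditional entropy stays at 1 bit and the mutual information is 0.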