Vision and the Statistics of the Visual Environment Eero P Simoncelli
Total Page:16
File Type:pdf, Size:1020Kb
To appear: Current Opinion in Neurobiology, Vol 13, April 2003 Vision and the Statistics of the Visual Environment Eero P Simoncelli Summary topic in a variety of recent workshops and meet- It is widely believed that visual systems are opti- ings [2], as well as several review articles [3, 4] and mized for the visual properties of the environment a special journal issue [5]. Recent results are inter- inhabited by the host organism. A specific instance esting, albeit controversial. Because of this flurry of this principle known as the Efficient Coding Hy- of recent interest, and because Bayesian theories pothesis holds that the purpose of early visual pro- of vision have been reviewed in a number of other cessing is to produce an efficient representation of places [e.g., 6, 7, 8, 9, 10, 11], I willl focus almost the incoming visual signal. The theory provides entirely on efficient coding in this article, with em- a quantitative link between the statistical proper- phasis on articles published in 2001 or later. ties of the world and the structure of the visual system. As such, specific instances of this theory have been tested experimentally, and have be used Efficient Coding to motivate and constrain models for early visual processing. The theory of information plays a natural role in models of neural systems, by providing abstract Address but unique quantitative definitions for information 4 Washington Place, Room 809, New York, NY [12]. Barlow [13] recognized the importance of in- 10003-6604, USA; phone: (212) 998-3938; fax: formation theory in this context, and hypothesized (212) 995-4011; email: [email protected] that the efficient coding of visual information could serve as a fundamental constraint on neural pro- cessing. That is, a group of neurons should encode Introduction information as compactly as possible, in order to most effectively utilize the available computing re- One of the primary roles of theory in the sciences is sources. Mathematically, this is expressed as a de- to provide fundamental principles that explain why sire to maximize the information that neural re- the natural world is constructed as it is. In biology, sponses provide about the visual environment. the main example of such a principle is the theory The simplest form of this hypothesis (in particu- of evolution by natural selection. In the context lar, ignoring the noise in neural responses) decou- of vision, this sort of philosophy was championed ples naturally into two separate statements: one re- by David Marr, who argued that it is essential to garding the statistics of individual neural responses consider the visual system at an abstract compu- and a second regarding the joint statistics of the re- tational level in order to understand its design [1]. sponse of a population [14, 3, 15]: Two specific instances of this philosophy provide a quantitative link between the statistical properties of the visual environment and the structure of bio- • The responses of an individual neuron to the logical visual systems: The so-called Efficient Cod- natural environment should fully utilize its ing Hypothesis, and the formulation of early vision output capacity, within the limits of any con- problems in terms of Bayesian estimation or deci- straints on the response (e.g., maximum firing sion theory. As such, specific instances of both rate). theories may be tested experimentally, and may be • used to motivate and constrain models for vision. The responses of different neurons to the nat- ural environment should be statistically inde- The Efficient Coding Hypothesis has been a central pendent of each other. In other words, the in- formation carried by each neuron should not dard choice of unit for information, but the ab- be redundant with that carried by the others. stract definition of information is well-motivated, This is also consistent with a notion that the unique, and is most certainly relevant to the brain. visual system strives to decompose a scene into statistically independent constituents (e.g., in- dividual objects). Experimentally observed dependency. An- other criticism of the Hypothesis is that \some More detailed discussions of these ideas, as well experimental data from multi-neuron recordings as the role of noise, may be found in many other show correlation, synchronization, or other forms references [16, 13, 17, 18, 19, 20, 21, 22, 3, 15]. of statistical dependency between neurons". Most such experiments do not use naturalistic stimuli, and thus dependencies in the neural responses are not directly relevant to the hypothesis. In addi- Criticisms tion, recent studies suggest that responses to nat- ural stimuli in primary visual cortex are relatively A variety of criticisms have been voiced regard- independent[23, 24, 25]. Finally, even if one were ing the Efficient Coding Hypothesis. A number to observe dependencies in neural responses under of these represent misconceptions about the the- natural stimulus conditions, the hypothesis states ory, some are aimed at particular variants of the only that the system strives for independence: The theory, some are about practical experimental is- constraints of neural processing may prevent actual sues, whereas others are more fundamental. Below achievement. Perhaps a more realistic expectation, is brief discussion of some of these (see the recent then, is that successive stages of processing (e.g., review article by Horace Barlow for additional dis- along an ascending sensory pathway) should reduce cussion [4]). statistical dependence [26]. The purpose of vision. It has often been ar- Over-representation in cortex. A further gued that efficient coding of visual information is criticism is based on a comparison of the num- irrelevant because the purpose of vision is not to ber of retinal ganglion cells to the number of neu- encode or reconstruct the visual world. There is rons in primary visual cortex. Critics argue that some truth to this criticism, in that the hypothe- \the number of neurons devoted to processing sen- sis does not take into account how the information sory information seems to expand as one goes that has been extracted is to be used. This may be deeper into the system, suggesting that the brain viewed as either an advantage (because one does increases redundancy." This argument usually as- not need to assume any specific visual task or goal, sumes, however, that the coding capacity of all and does not even need to specify what is being rep- neurons (and in particular those in retina and cor- resented) or a limitation (because tasks and goals tex) is the same. The distribution of information are clearly relevant for visual processing). More amongst more neurons does not necessitate more complete theories, such as that given by Bayesian redundancy if the form of neural coding employed estimation and decision, can take into account both by those neurons is allowed to differ. For exam- the statistical structure of the environment and the ple (as Barlow points out in [4]) cortical neurons visual task or goal. tend to have lower firing rates, and may well use a different form of code than retinal neurons. In ad- dition, cortical neurons have more complex tempo- Relevance of information theory. A second ral dynamic properties (e.g., adaptation) that may criticism of the Efficient Coding Hypothesis is that serve to encode information over longer timescales. \information theory is irrelevant because the brain Although the redundancy of retinal and V1 neu- is not concerned with bits." Bits are just a stan- rons has not been experimentally compared, a re- 2 lated comparison in the auditory system was able Testing the Hypothesis to demonstrate a reduction of redundancy [26]. Although the efficient coding hypothesis is roughly fifty years old, it has only recently been explored quantitatively. This recent progress is due to three Experimental impracticality. Many authors fundamental improvements: (1) we have a much have pointed out that \Estimation of information- better understanding of early sensory processing; theoretic quantities requires enormous amounts of (2) mathematical and engineering tools have been data, and is thus impractical for experimental ver- developed to describe and manipulate more com- ification.” This is a significant problem, especially plex statistical models; and (3) advances in com- since commonly used estimators of information are puting and imaging technologies allow us to gather also known to be heavily biased. Nevertheless, and manipulate vast quantities of image data, both cases of successful experimental measurement give for statistical modeling purposes, and for use as ex- reason for optimism (see below). perimental stimuli. There are two basic methodologies for testing and refining the efficient coding hypotheses. The direct approach is to examine the statistical properties Definition of input and output. The Hypoth- of neural responses under natural stimulation con- esis depends critically on the probability distribu- ditions [e.g. 27, 21, 28, 29, 23]. An alternative tion of natural images and the definition of neu- approach is to use the statistical properties of nat- ral response, both of which are underconstrained. ural images to constrain or derive a model for early Thus the theory is not as assumption-free as one sensory processing [e.g. 30, 31, 32, 33, 34, 19, 35, is led to believe. In my opinion, this is the most 36, 37, 38]. Below, I'll review some recent examples fundamental problem with the hypothesis. The in- of each of these. put distribution is typically not defined explicitly , but is assumed to be well represented by a collec- tion of calibrated \naturalistic" images. One must Experimental Tests also specify which neurons are meant to satisfy the hypothesis (e.g., neurons within a particular cell In recent years, there have been a number of inter- class, or within a specific visual area, or across mul- esting experimental articles examining neural re- tiple visual areas), and how their responses are to sponses to naturalistic images or image sequences be measured (e.g., mean firing rates vs.