Active efficient coding explains the development of binocular vision and its failure in amblyopia

Samuel Eckmann^{a,b,c,1}, Lukas Klimmasch^{a,b}, Bertram E. Shi^{d}, and Jochen Triesch^{a,b,1}

^{a}Frankfurt Institute for Advanced Studies, 60438 Frankfurt am Main, Germany; ^{b}Department of Computer Science and Mathematics, Goethe University, 60629 Frankfurt am Main, Germany; ^{c}Max Planck Institute for Brain Research, 60438 Frankfurt am Main, Germany; and ^{d}Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong

Edited by Martin S. Banks, University of California, Berkeley, CA, and approved January 29, 2020 (received for review May 13, 2019)

The development of vision during the first months of life is an active process that comprises the learning of appropriate neural representations and the learning of accurate eye movements. While it has long been suspected that the two learning processes are coupled, there is still no widely accepted theoretical framework describing this joint development. Here, we propose a computational model of the development of active binocular vision to fill this gap. The model is based on a formulation of the active efficient coding theory, which proposes that eye movements as well as stimulus encoding are jointly adapted to maximize the overall coding efficiency. Under healthy conditions, the model self-calibrates to perform accurate vergence and accommodation eye movements. It exploits disparity cues to deduce the direction of defocus, which leads to coordinated vergence and accommodation responses. In a simulated anisometropic case, where the refraction power of the two eyes differs, an amblyopia-like state develops in which the foveal region of one eye is suppressed due to inputs from the other eye. After correcting for refractive errors, the model can only reach healthy performance levels if receptive fields are still plastic, in line with findings on a critical period for binocular vision development. Overall, our model offers a unifying conceptual framework for understanding the development of binocular vision.

efficient coding | active perception | amblyopia | vergence | accommodation

Significance

Brains must operate in an energy-efficient manner. The efficient coding hypothesis states that sensory systems achieve this by adapting neural representations to the statistics of sensory input signals. Importantly, however, these statistics are shaped by the organism's behavior and how it samples information from the environment. Therefore, optimal performance requires jointly optimizing neural representations and behavior, a theory called active efficient coding. Here, we test the plausibility of this theory by proposing a computational model of the development of binocular vision. The model explains the development of accurate binocular vision under healthy conditions. In the case of refractive errors, however, the model develops an amblyopia-like state and suggests conditions for successful treatment.

Our brains are responsible for 20% of our energy consumption (1). Therefore, organizing neural circuits to be energy efficient may provide a substantial evolutionary advantage. One means of increasing energy efficiency in sensory systems is to attune neural representations to the statistics of sensory signals. Based on this efficient coding hypothesis (2), numerous experimental observations in different sensory modalities have been explained (3, 4). For instance, it has been shown that receptive field properties in the early visual pathway can be explained through models that learn to efficiently encode natural images (5, 6). These findings have extended classic results showing that receptive field shapes in visual cortex are highly malleable and a product of the organism's sensory experience (7–10).

Importantly, however, animals can shape the statistics of their sensory inputs through their behavior (Fig. 1). This gives them additional degrees of freedom to optimize coding efficiency by jointly adapting their neural representations and behavior. This idea has recently been advanced as active efficient coding (11, 12). It can be understood as a generalization of the efficient coding hypothesis (2) to active perception (13). Along these lines, active efficient coding models have been able to explain the development of visual receptive fields and the self-calibration of smooth pursuit and vergence eye movements (11, 12). This has been achieved by optimizing the neural representation of the sensory signal statistics while simultaneously, via eye movements, optimizing the statistics of sensory signals themselves for maximal coding efficiency.

In our formulation of active efficient coding, we maximize coding efficiency as measured by the Shannon mutual information $I(R, C)$ between the retinal stimulus represented by retinal ganglion cell activity $R$ and its cortical representation $C$ under a limited resource constraint. The mutual information can be decomposed as

$$I(R, C) = H(R) - H(R \mid C), \tag{1}$$

where $H(R)$ is the entropy of the retinal response and $H(R \mid C)$ is its conditional entropy given the cortical representation. Prior formulations focused on minimizing the conditional entropy $H(R \mid C)$ only (6). $H(R \mid C)$ is a measure of the information that is lost (i.e., not represented in the cortical encoding). The limitation of this prior formulation is that this quantity can be minimized by simply reducing $H(R)$, the entropy of the retinal response, since $H(R \mid C) \le H(R)$. Thus, an active agent could minimize $H(R \mid C)$ by, for example, defocusing or closing the eyes altogether. In the free energy and predictive processing literature, this is known as the "dark room problem" (14, 15). In our formulation, maximizing $I(R, C)$ is achieved by maximizing $H(R)$ and minimizing $H(R \mid C)$ simultaneously, thus avoiding this problem. We demonstrate this approach through a concrete model of the development of binocular vision, including

Author contributions: S.E., B.E.S., and J.T. designed research; S.E. and L.K. performed research; S.E. and L.K. analyzed data; and S.E., B.E.S., and J.T. wrote the paper.
The authors declare no competing interest.
This article is a PNAS Direct Submission.
This open access article is distributed under Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
Data deposition: Documented MATLAB code of the model is available in ModelDB under accession no. 261483.
1 To whom correspondence may be addressed. Email: [email protected] or triesch@fias.uni-frankfurt.de.
This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1908100117/-/DCSupplemental.
First published March 2, 2020.
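The decomposition in Eq. 1 and the dark room problem can be illustrated numerically. The following is a minimal sketch and not part of the published model: it treats R and C as small discrete variables, and the three joint probability tables as well as the helper names `entropy` and `mutual_information` are illustrative assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """joint[r][c] = P(R=r, C=c). Returns (H(R), H(R|C), I(R,C)) in bits."""
    p_r = [sum(row) for row in joint]          # marginal over retinal states
    p_c = [sum(col) for col in zip(*joint)]    # marginal over cortical codes
    h_r = entropy(p_r)
    h_joint = entropy([p for row in joint for p in row])
    h_r_given_c = h_joint - entropy(p_c)       # H(R|C) = H(R,C) - H(C)
    return h_r, h_r_given_c, h_r - h_r_given_c


# Four equally likely retinal states, each with its own cortical code.
faithful = [[0.25, 0, 0, 0], [0, 0.25, 0, 0], [0, 0, 0.25, 0], [0, 0, 0, 0.25]]
# Same retinal states, but the code merges them in pairs (information is lost).
lossy = [[0.25, 0], [0.25, 0], [0, 0.25], [0, 0.25]]
# "Dark room": the retinal input never changes, so there is nothing to encode.
dark_room = [[1.0]]

for name, joint in [("faithful", faithful), ("lossy", lossy), ("dark room", dark_room)]:
    h_r, h_cond, info = mutual_information(joint)
    print(f"{name:10s} H(R)={h_r:.2f}  H(R|C)={h_cond:.2f}  I(R,C)={info:.2f}")
```

In this toy setting, the faithful code and the dark room both reach zero conditional entropy, but only the faithful code transmits any information, which is why minimizing $H(R \mid C)$ alone is not a sufficient objective.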

the simultaneous calibration of vergence and accommodation control.

Indeed, newborns have difficulties bringing objects into focus and cannot yet properly verge their eyes (16). How infants manage to self-calibrate their control mechanisms while interacting with their visual environment is currently unknown. Additionally, in certain medical conditions, the calibration of vergence and accommodation control is impaired. For example, anisometropia describes a difference in the refractive error between the eyes. If not corrected early during development, this can evoke amblyopia: a disorder of the developing visual system that is characterized by an interocular difference in visual acuity that is not immediately resolved by refractive correction. Amblyopia can be associated with a loss of stereopsis and in severe cases, leads to monocular blindness (17). Furthermore, vergence and accommodation eye movements are either less accurate or completely absent (18, 19).

Although there have been recent advances in the treatment of amblyopia (20, 21), existing treatment methods do not lead to satisfactory outcomes in all patients. This is aggravated by the fact that treatment success strongly depends on the stage of neural circuit maturation (20). When young patients are still in a critical period of visual cortex plasticity (10), they often recover after refractive errors are corrected, while adults mostly remain impeded (22, 23).

The above findings are all readily explained by our model. Under healthy conditions, our model develops accurate vergence and accommodation eye movements. When the model is impaired due to strong monocular hyperopia, we observe that an amblyopia-like state develops. We show that this is due to the abnormal development of binocular receptive fields in the model and demonstrate that healthy binocular vision is regained as the receptive fields readapt after refraction correction. However, if the sensory encoding is no longer plastic and does not adapt to the changes in the visual input statistics, amblyopia prevails. Overall, our model suggests that coding efficiency may provide a unifying explanation for the development of binocular vision.

Fig. 1. The action–perception loop in active efficient coding. The sensory input is obtained by sampling input signals from the environment (e.g., via eye movements). A percept is formed by neural encoding, which drives the selection of actions and thereby, shapes the sampling process. Therefore, perception depends on both neural encoding and active sampling. Classic efficient coding theories do not consider the active sampling component (orange).

6156–6162 | PNAS | March 17, 2020 | vol. 117 | no. 11 | www.pnas.org/cgi/doi/10.1073/pnas.1908100117

Model Formulation

The active efficient coding model that we propose has a modular structure (Fig. 2). A cortical coding module models the learning of an efficient representation $C$ of the binocular retinal representation $R$ by minimizing the conditional entropy $H(R \mid C)$. At the same time, an accommodation reinforcement learning module maximizes $H(R)$, and a vergence reinforcement learning module minimizes $H(R \mid C)$ (Eq. 1). All three modules are plastic and adjust simultaneously in response to changes in the sensory input statistics. The exact choice of algorithms is not important for the model to function. In fact, different cortical coding and reinforcement learning models have been successfully applied in previous active efficient coding models (24, 25).

Our model is presented with a textured planar object. The object is sampled by the two eyes for 10 iterations, which constitute one fixation. After each fixation, a new object is presented at a new random distance. The retinal images are rendered based on the positions of the accommodation, vergence, and object planes (Fig. 3). The inputs are whitened, contrast adjusted by an interocular suppression mechanism, and then, binocularly encoded by a population of cortical neurons. The reinforcement learning modules control the retinal input of the next iteration by shifting accommodation and vergence planes along the egocentric axis (Fig. 3).

Fig. 2. Model architecture, with solid arrows representing the flow of sensory information and dashed arrows representing the flow of control commands. Sampled input images with given defocus blur and disparity are whitened at the retinal stage and contrast adjusted through an interocular suppression mechanism based on the recent history of cortical activity (Left). Thereafter, they are encoded by a set of binocular neurons that represents the cortical encoding (Center). The cortical population activity serves as input to two reinforcement learning modules (Right) that control vergence and accommodation commands. Details are in Materials and Methods.

Cortical Encoding. In our model, the cortical population activity represents the binocular "percept" based on which behavioral commands are generated (compare Fig. 1, Upper). The cortical encoding comprises two efficient coders: one for fine details in the foveal region and one for the periphery that receives a low pass-filtered input. Both are implemented using the standard matching pursuit algorithm (26) (Materials and Methods).

To find a set of neurons that best encodes the input, instead of minimizing the conditional entropy directly, we minimize an upper bound (i.e., the average of the encoding error $\|S\|^2$) (27):

$$H(R \mid C) \le \mathbb{E}\left[\|R - \hat{R}(C)\|^2\right] \equiv \mathbb{E}\left[\|S\|^2\right], \tag{2}$$

where $\hat{R}$ is an estimate of the input $R$ based on the activities $c_j$ of the cortical neurons with receptive fields $b_j$ (Materials and Methods has details):

$$\hat{R} = \sum_j c_j b_j. \tag{3}$$

In every iteration, both activities $c_j$ and receptive fields $b_j$ adjust online to minimize the encoding error $\|S\|^2$ (Materials and Methods). Thus, the receptive fields reflect the stimulus statistics (28) and resemble those of simple cells in the visual cortex (6) (SI Appendix, Fig. S1).

Vergence Learning. The vergence reinforcement learner also aims to minimize the conditional entropy $H(R \mid C)$ (i.e., the encoding error $\|S\|^2$). Therefore, vergence movements are favored

that produce visual input that can be most accurately encoded with the current set of receptive fields. This leads to a self-reinforcing feedback cycle (Fig. 4A). If inputs of a certain disparity can be encoded particularly well, the vergence learner will try to produce visual input that is dominated by this disparity. This will cause even more neurons to become selective for this disparity and make the encoding of this disparity even more efficient (Fig. 4B). Thus, an initial bias for, say, small disparities can be magnified until the model always favors input with small disparities and most neurons are tuned to small disparities.

Accommodation Learning. The entropy of the retinal response $H(R)$ (Eq. 1) is maximized via the accommodation reinforcement learning module. For this, $H(R)$ is approximated by the squared activity of the retinal representation $\|R\|^2$ (Materials and Methods). We assume the spatial frequency tuning of retinal ganglion cells to be static and thus, independent of the distribution of spatial frequencies in the retinal input as suggested by deprivation experiments (9, 29, 30). However, the exact receptive field shape does not matter for the model to favor focused input (Fig. 4C).

Suppression Model. Interocular suppression is thought to be a central mechanism in amblyopia. We use a basic interocular suppression model (Fig. 5A) to describe dynamic contrast modulation based on the ocular balance of the input encoding. If mostly right (left) monocular receptive fields are recruited during cortical encoding, the contrast of the left (right) eye input becomes suppressed in subsequent iterations. This is in agreement with reciprocal excitation of similarly tuned neurons in visual cortex (31, 32). At the same time, the total input energy is kept balanced to ensure similar activity levels for monocular and binocular visual experience as observed experimentally at high contrast levels (33–35) (Materials and Methods). This leads to a self-reinforcing suppression cycle when left and right eye inputs are dissimilar (Fig. 5B).

Results

Active Efficient Coding Leads to Self-Calibration of Active Binocular Vision. In the healthy condition without refractive errors, the model learns to perform precise vergence and accommodation eye movements (Fig. 6A and SI Appendix, Fig. S2A). The object is continuously tracked by the eyes (SI Appendix, Fig. S3A), and most neurons develop binocular receptive fields (SI Appendix, Fig. S1). This is not due to artificially introducing a bias for zero disparity during initialization of the model. When receptive fields are adapted to a uniform input disparity distribution, the encoding of zero-disparity input is still most efficient (Fig. 4B). Due to the overlap of the left and right eye visual fields, the information contained in the retinal response $H(R)$ is smallest for zero disparity when the images projected onto the two eyes maximally overlap. Thus, even an unbiased encoder that can encode inputs of all disparities equally well will tend to encode zero-disparity input more accurately because such input contains less information. This bootstraps the positive feedback loop of active efficient coding (Fig. 4A and SI Appendix, Fig. S4).

Accommodation performance becomes highly accurate as well. This is due to the edge-enhancing nature of retinal ganglion cell receptive fields. With their center-surround shape, they are selective for sharp contrasts and respond poorly when out-of-focus input is presented (Fig. 4C). For sharper input, the range of responses across the population and thus, the response entropy increase (SI Appendix, Figs. S5 and S6).

Furthermore, accurate accommodation is achieved without obvious sign cues: in our simplified visual environment, defocus blur is independent of whether an eye focuses behind or in front of the object. Also, neither chromatic nor other higher-order aberrations are provided in our model, which could help to steer focus in the right direction (36, 37). Instead, the model learns to infer the sign of defocus from disparity cues (SI Appendix,

NEUROSCIENCE | BIOPHYSICS AND COMPUTATIONAL BIOLOGY

Fig. 3. Input sampling from the environment. (A) Object (obj.) position, vergence (verg.) distance, and left (l.) and right (r.) accommodation (acc.) distance are represented as different plane positions. (B) Abstraction of A. The gray horizontal bar indicates the range where objects are presented during the simulation and also, indicates the fixation range (i.e., possible vergence plane positions). Horizontal axes indicate reachable accommodation plane positions for the left (light blue) and right (green) eyes. Note that, when the stimulus is placed at, for example, position 0, it cannot be focused by the right eye in this example. Accommodation and vergence errors are measured as the distance between the respective planes and the object position in a.u. (C) Position range of accommodation and vergence planes under different conditions. Same scheme as in B. (D) Examples of retinal input images for different plane position configurations. For better visibility, disparity shifts and defocus blur are increased compared to actual values.

Fig. 4. The feedback loop of active efficient coding and reward dependencies. (A) Positive feedback loop of active efficient coding. An efficiently encoded stimulus is preferred over other stimuli (acting). Therefore, the sensory system is more frequently exposed to the stimulus, and neural circuits adapt to reflect this overrepresentation (statistical learning), which further increases encoding efficiency (neural coding). (B) Normalized (norm.) vergence (verg.) reward for different disparity distributions and neural populations (averaged over 300 textures). The receptive fields of 300 neurons adapted to different distributions of input disparities with color-coded SDs. Gray indicates unbiased/uniform, pink and purple indicate Laplacian distributed, and dark blue indicates model trained under healthy conditions. In each case, stimuli seen at zero absolute (abs.) disparity produce the highest average (avg.) vergence reward (i.e., the most efficient encoding). This advantage is even more pronounced when small disparities have been encountered more frequently (i.e., for smaller σ). (C) Normalized accommodation (acc.) reward for different whitening filters. Zero-blur input yields the highest accommodation reward independent of the size of the whitening filter. However, smaller whitening filters induce a stronger preference for focused input. The smallest filter (dark blue) was used for the simulation (Materials and Methods has details).
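Why focused input yields a larger accommodation reward can be illustrated with a one-dimensional toy example. The sketch below is a simplified stand-in and not the model's actual whitening filter: a step edge is blurred with a Gaussian whose SD grows by 0.8 px per distance unit (the value stated in Materials and Methods), passed through a three-tap center-surround filter chosen here purely for illustration, and scored by the summed squared response as a proxy for $\|R\|^2$. All function names are hypothetical.

```python
import math


def gaussian_blur(signal, sigma):
    """Blur a 1D signal with a normalized Gaussian kernel (edges replicated)."""
    if sigma == 0:
        return list(signal)
    radius = int(3 * sigma) + 1
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    total = sum(kernel)
    n = len(signal)
    return [
        sum(w / total * signal[min(max(i + k, 0), n - 1)]
            for k, w in zip(offsets, kernel))
        for i in range(n)
    ]


def center_surround(signal):
    """Excitatory center, inhibitory neighbors: a crude edge-enhancing filter."""
    n = len(signal)
    return [2 * signal[i] - signal[max(i - 1, 0)] - signal[min(i + 1, n - 1)]
            for i in range(n)]


def retinal_energy(signal):
    """Summed squared filter response, used here as a proxy for ||R||^2."""
    return sum(r * r for r in center_surround(signal))


edge = [0.0] * 32 + [1.0] * 32                 # a sharp luminance edge
for distance in range(4):                       # accommodation error in a.u.
    sigma = 0.8 * distance                      # 1 a.u. corresponds to 0.8 px of SD
    energy = retinal_energy(gaussian_blur(edge, sigma))
    print(f"accommodation error {distance} a.u. -> energy {energy:.4f}")

energies = [retinal_energy(gaussian_blur(edge, 0.8 * d)) for d in range(4)]
```

In this toy setting, the response energy is largest for the unblurred edge and falls as defocus grows, so a learner rewarded by this quantity is pushed toward sharp focus without needing an explicit sign cue.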

Fig. S7). We further examined this entanglement under abnormal input conditions (e.g., when simulated lenses were placed in front of the eyes of an agent trained under healthy conditions). We find the responses of the model to qualitatively agree with experimental results (38, 39) (SI Appendix, Fig. S8).

Anisometropia Drives Model into Amblyopic State. To test how the model evolves under abnormal rearing conditions, we simulated an anisometropic case by adding a simulated lens in front of the right eye such that it became hyperopic and was unable to focus objects at close distances (Fig. 3C, Middle). Therefore, unlike in the healthy case, where neither eye is favored over the other, in the anisometropic case, the impaired eye receives systematically more defocused input. Cortical receptive fields reflect this imbalance and become more monocular, favoring the unimpaired eye (compare Fig. 6C, Upper and SI Appendix, Fig. S1, Center). The combined effect of imbalanced input and adapting receptive fields results in a vicious cycle that drives the model into an amblyopia-like state (Fig. 5B). Foveal input from the hyperopic eye becomes actively suppressed (SI Appendix, Fig. S9A), while the low-resolution peripheral input is unaffected and still provides binocular information such that a coarse control of vergence is maintained (Fig. 6A and SI Appendix, Figs. S2B and S3B). This results in stable binocular receptive fields in the periphery (SI Appendix, Fig. S1, Lower Center), which provide enough information for coarse stereopsis as observed in experiments (40–42). Accommodation adapts such that the stimulus is continuously tracked with the unimpaired eye (SI Appendix, Fig. S3).

When both eyes were similarly impaired but with opposite sign of the refractive error (Fig. 3C, Bottom), receptive fields still become more monocular, but no eye is preferred (SI Appendix, Fig. S1, Upper Right). As a result, the relatively more myopic eye is used for near vision, the relatively less myopic eye is used for distant vision (SI Appendix, Fig. S10A), and the respective other defocused eye is suppressed. At intermediate ranges, the stimulus history determines which eye gets recruited (SI Appendix, Fig. S10 B and C). This configuration is similar to monovision, which results from a treatment method for presbyopia, where the ametropic condition is achieved via optical lenses or surgery (43).

Early but Not Late Refractive Correction Rescues Binocular Vision. To test if the anisometropic model can recover from amblyopia upon correction of the refractive error, we first trained a fully plastic model under anisometropic conditions until it had converged to the amblyopic state. Then, all refractive errors were corrected. When the receptive fields were fixed after the refractive error was corrected, the model did not recover from the amblyopic state. Instead, the receptive fields remained monocular, and it maintained a high level of vergence error (Fig. 6B). In contrast, when receptive fields remained plastic and could adapt to the changed input statistics, the vergence error decreased (Fig. 6B), and the strong suppression of the formerly impaired eye was restored to lower values (SI Appendix, Fig. S9B). This was due to a shift from monocular to binocular receptive fields as a result of the changed input statistics (Fig. 6C). This is in line with a large body of evidence suggesting that limited cortical plasticity in adults prevents recovery from amblyopia after the correction of refractive errors (10, 20, 21). Furthermore, it predicts that therapies reinstating visual cortex plasticity should be effective.

Fig. 5. Interocular suppression model. (A) When mostly right (left) monocular neurons $c_j^r$ ($c_j^l$) are activated to encode an input image patch, the right (left) contrast unit $y_r$ ($y_l$) is excited, and the left (right) retinal image is suppressed in subsequent iterations. Color hue indicates response selectivity for left (blue) or right (green) eye. Connection strength is represented by line thickness. Dashed (solid) lines indicate inhibitory (excitatory) interactions. We model interocular suppression as being scale specific (i.e., when the high-resolution foveal region of the left eye is suppressed, the low-resolution periphery of the left eye may still provide unattenuated input) (Materials and Methods). (B) Feedback cycle of the suppression model. Disparate inputs to both eyes lead to preferential recruitment of monocular neurons, which results in interocular suppression inducing competition between the eyes. This impedes precise vergence eye movements and exacerbates disparate input (purple; left cycle). On a slower timescale, receptive fields (RFs) adapt to suppression by becoming more monocular, which makes future suppression more likely (red; right cycle). Dashed lines indicate feedback that affects future input processing.

Fig. 6. Model performance. (A) Average (avg.) absolute (abs.) vergence (verg.) and accommodation (acc.) errors of the left (l.) and right (r.) eye after training under healthy and anisometropic conditions. The dashed line indicates the expected average vergence error when accommodation planes are moved randomly under healthy conditions. (B) Vergence performance of formerly anisometropic model after correction of all refractive errors at iteration $5 \times 10^6$ (vertical gray line). The (dotted) solid line indicates the model with (non-)plastic receptive fields (RFs). The initial increase in the vergence error is due to the recalibration of the reinforcement learning module. (C) Histogram of foveal RFs binocularity as measured by the right monocular (monoc.) dominance $d(\cdot, r)$ before and after refractive error correction (Materials and Methods has details).

Discussion

We have shown how simultaneously optimizing both behavior and encoding for efficiency leads to the self-calibration of active binocular vision. Specifically, our model, which is based on the active efficient coding theory, accounts for the simultaneous development of vergence and accommodation.

Previous computational models have focused on either the development of disparity tuning or the development of vergence and accommodation control but have failed to capture their rich interdependence (28, 44–46). For example, a model by Hunt et al. (28) explained how disparity tuning may emerge through sparse coding and how alternate rearing conditions could give rise to systematic differences in receptive field properties, but their model completely neglected vergence and accommodation behavior. Conversely, others have presupposed populations of cells readily providing error signals for vergence and accommodation control without explaining their developmental origin (44, 46). Therefore, previous models have failed to explain how the visual system solves the fundamental "chicken and egg" problem of disparity tuning and eye movement control: the development of fine disparity detectors requires the ability to accurately focus and align the eyes, which in turn, relies on the ability to detect fine disparities. Our active efficient coding model solves this problem through the positive feedback loop between disparity tuning, which facilitates the control of eye movements, and improved accommodation and vergence behavior, which enhances the

representation of fine disparities. In the end, the tuning properties of sensory neurons reflect the image statistics produced by the system's own behavior (47). Under healthy conditions, the model develops accurate vergence and accommodation eye movements. For a simulated anisometropia, however, where one eye suffers from a refractive error while the other eye is unaffected, it develops into an amblyopia-like state with monocular receptive fields and loss of fine stereopsis. Recovery from this amblyopia-like state is only possible if receptive fields in the model remain plastic, matching findings of a critical period for binocular development (10).

An important mechanism in amblyopia is interocular suppression. The simple logic behind the model's suppression mechanism is that every neuron suppresses input that is incongruent to its own receptive field (34, 35). This implementation proved sufficient to account for the development of an amblyopia-like state, with mostly monocular receptive fields in the representation of the fovea. More sophisticated suppression models could be incorporated in the future (48, 49), but we do not expect them to change the conclusions from the present model. Future work should focus on understanding the principles of interocular suppression within the active efficient coding framework. A topic of current interest is how suppression develops during disease and treatment (e.g., with the standard patching method) (50). A better understanding of the role of suppression in amblyopia could lead to improved therapies in the future.

While we have focused on the development of active binocular vision, including accommodation and vergence control, our formulation of active efficient coding is very general and could be applied to many active perception systems across species and sensory modalities. Active efficient coding is rooted in classic efficient coding ideas (2–6), of which predictive coding theories are special examples (51–53). Classic efficient coding does not, however, consider optimizing behavior. Friston's active inference approach does consider the generation of behavior in a very general fashion. There, motor commands are generated to fulfill sensory predictions. In our formulation of active efficient coding, motor commands are learned to maximize the mutual information between the sensory input and its cortical representation. This implies maximizing the amount of sensory information sampled from the environment and avoids the problem of deliberately using accommodation to defocus the eyes or closing the eyes altogether to make the sensory input easy to encode and/or predict.

Materials and Methods

Input Image Rendering. We used 300 grayscale-converted natural images of the "manmade" category from the McGill Database (54). One image was presented at a random position (Fig. 3 A and B) during one fixation (i.e., 10 subsequent iterations) before the next image and position were randomly selected for the next fixation.

For every distance unit between vergence and object plane, the left (right) eye image was shifted 1 px (pixel) to the left (right). This resulted in a disparity of 2 px per distance unit. A Gaussian blur filter was applied to the left and the right eye image, where the SDs depended linearly on the distance between object and accommodation planes. The 1-a.u. distance equals 0.8 px of SD (SI Appendix, SI Text and Figs. S11 and S12). Errors were measured in arbitrary units (a.u.) as distances to the object plane. For the foveal (peripheral) scale, two retinal images of size 72 × 72 (160 × 160) pixels were cropped from the center of the original image (SI Appendix, Fig. S13).

Input Processing.

The contrast-adjusted patches $R$ were encoded with the matching pursuit algorithm (26).

For each patch, we recruited $N = 10$ of 300 cortical neurons to most efficiently encode the image. The cortical response $C = (c_1, \ldots, c_{300})$ was determined via an iterative process, where the activities of neurons that were not selected for encoding remained zero. In the first encoding step, $n = 1$, the neuron with the receptive field $b_j$ that was most similar to the retinal input was selected:

$$i_n = \operatorname*{argmax}_j \left|\langle b_j, S_{n-1} \rangle\right|, \qquad S_0 \equiv R, \tag{4}$$

where the similarity between a receptive field $b_j$ and retinal input $R$ was measured with the scalar product $\langle b_j, R \rangle$. When selecting the next neuron, all information that was already encoded by the first neuron is subtracted from the original input $R$:

$$S_1 = R - c_{i_1} b_{i_1}, \qquad c_{i_1} = \langle b_{i_1}, S_0 \rangle, \tag{5}$$

$$S_n = R - \sum_{k=1}^{n} c_{i_k} b_{i_k}, \tag{6}$$

where $S_1$ is the residual image after the first encoding step and $S_n$ is the generalized residual after the $n$th encoding step. Subsequent neurons are selected based on the similarity of their receptive fields with the residual according to Eq. 4. By greedily selecting the neuron with maximum response $c_{i_k}$ in each encoding step, the reconstruction error $\|S_N\|^2$ is minimized (i.e., coding efficiency is maximized).

After encoding, all receptive fields were updated through gradient descent on $\|S_N\|^2$ and normalized to unit length. Thus, their tuning reflects the input statistics (SI Appendix, Fig. S1):

$$b_j \leftarrow \frac{b_j + \Delta b_j}{\|b_j + \Delta b_j\|}, \qquad \Delta b_j = -\eta \frac{\partial}{\partial b_j} \|S_N\|^2 = 2 \eta c_j S_N, \tag{7}$$

where $2\eta = 5 \times 10^{-5}$ is a learning rate. Each patch of the foveal scale was encoded by a subset of the same 300 neural receptive fields. For the peripheral scale, a separate set of 300 neurons was used for encoding (SI Appendix, Fig. S13). At the beginning of each simulation, all receptive field weights were drawn randomly from a zero-mean Gaussian distribution and subsequently normalized to unit norm.

Reinforcement Learning. We used two separate natural actor critic reinforcement learners (55, 56) with identical architectures to control the accommodation planes and the vergence plane. Possible actions $a$ correspond to shifts in the respective plane positions: $a \in \{-2, -1, 0, 1, 2\}$ (compare Fig. 3). The state information vector comprises the patch-averaged squared responses of the cortical neurons:

$$f_i = \frac{1}{n_p} \sum_p c_{(i,p)}^2, \tag{8}$$

where $c_{(i,p)}$ is the activity of neuron $i$ after encoding patch $p$ and $n_p = 81$ is the number of patches per scale. Therefore, $f_i$ is spatially invariant due to averaging over patches and does not depend on the polarity of the input due to the squaring. This is similar to the properties of complex cells in primary visual cortex (57, 58). After they were normalized to unit norm, the peripheral and foveal scale state vectors are concatenated into the combined state vector $f$ of size 300 × 2. The next action $a$ is chosen with probability $\pi_a(f)$, and the state value is estimated as $V(f)$:

$$\pi_a = \frac{\exp(z_a)}{\sum_j \exp(z_j)}, \qquad z_a = \sum_j u_{aj} f_j, \qquad V(f) = \sum_j v_j f_j. \tag{9}$$

The weights $u_{aj}$ and $v_j$ are updated via approximate natural gradient descent on an approximation of the temporal difference error $\delta$ (algorithm 3 in ref. 56):

$$\delta_t = r_t + \gamma V(f^{(t)}) - V(f^{(t-1)}), \tag{10}$$
The left and right retinal input images were whitened as described by Olshausen and Field (6) (SI Appendix has details). For where rt is the accommodation or vergence reward and γ = 0.6 is the tem- each scale, images were cut and merged into 81 binocular patches of size poral discounting factor. At the beginning of each simulation, all network 2 ×8× 8 px, where the peripheral scale was down-sampled with a Gaussian weights were initialized randomly. pyramid by a factor of four (SI Appendix, Fig. S13). The whitened reti- nal patches R* were normalized to zero mean intensity and subsequently Approximating Mutual Information. Rewards for the reinforcement learners contrast adjusted via the interocular suppression mechanism (see below). are based on the squared response after whitening and cortical encoding.

Together, this can be understood as an empirical estimate of the mutual information between the whitened response $R$ and cortical response $C$:

$$ I(R, C) = H(R) - H(R \mid C), $$ [11]

where the conditional entropy $H(R \mid C)$ is upper bounded by the reconstruction error $\|S\|^2$ (Eq. 3 and SI Appendix). Due to the "energy conservation" property of the matching pursuit algorithm (26), the energy of the residual image is equal to the energy of the retinal representation minus the energy of the cortical representation (SI Appendix):

$$ \|S\|^2 = \|R\|^2 - \|C\|^2. $$ [12]

Therefore, we take the difference between cortical and retinal response energy as the reward for the vergence learner.

For the accommodation learner, we maximize the entropy of the whitened retinal response $H(R)$. We take each entry of $R$ as an independent sample of the same probability distribution. The underlying random variable is well approximated by a Laplace distribution, independent of the level of blur in the input (SI Appendix, Fig. S5). Therefore, we approximate the entropy of $R$ with the entropy of a Laplace distribution with the same SD $\sigma_R$:

$$ H(R) \approx \ln\left( e \sqrt{2 \sigma_R^2} \right). $$ [13]

Since the expected squared activity of the retinal representation, $E(\|R\|^2)$, is equal to the variance $\sigma_R^2$, it is also a monotonic function of the entropy $H(R)$, with $\|R\|^2$ as an empirical estimate of $\sigma_R^2$. More generally, since the retinal response has bounded support and its probability distribution is unimodal and Lipschitz continuous, the variance $\sigma_R^2$ is a monotonic function of a lower bound of the entropy (59). Therefore, we use $\|R\|^2$ for the reward of the accommodation reinforcement learning module.

As one would expect for the entropy $H(R)$, $\|R\|^2$ also decreases for increasing input blur. Under the assumption of a flat frequency spectrum after whitening, one finds (SI Appendix)

$$ E(\|R\|^2) \propto 1/\sigma, $$ [14]

where $\sigma$ is the SD of the Gaussian blur filter that is applied before whitening to simulate defocus blur.

Reward Normalization. Before being passed to the reinforcement learning agents, the accommodation and vergence rewards were normalized online to zero mean and unit variance: that is,

$$ r^{(t)} \leftarrow \frac{r^{(t)} - \hat{r}^{(t)}}{\hat{\sigma}_r^{(t)}}, $$ [15]

where $\hat{r}$ is the exponentially weighted running average of the reward $r$ and $\hat{\sigma}_r^2$ is an online estimate of its variance:

$$ \hat{r}^{(t+1)} = (1 - \alpha)\, \hat{r}^{(t)} + \alpha\, r^{(t)}, $$ [16]

$$ \hat{\sigma}_r^{2\,(t+1)} = (1 - \alpha) \left( \hat{\sigma}_r^{2\,(t)} + \alpha \left[ r^{(t)} - \hat{r}^{(t)} \right]^2 \right), $$ [17]

where $\alpha = 0.001$ is an update rate that sets the decay of the exponential weighting (60).

Suppression Mechanism. There are two separate suppression modules, one per scale, that adjust the contrast of left and right input images (Fig. 5). We introduce a measure of the relative amount of left (right) monocular input over the previous iterations. The monocular dominance $m_k$, $k \in \{l, r\}$, is defined as

[Eq. 18: not recoverable from the extracted text]

where $b_{(i,k)}$ is the left/right monocular subfield of neuron $i$ and $\langle \cdot \rangle_{t,\tau}$ is the exponential moving average over time with decay constant $\tau = 10$. The monocular dominance of each neuron is weighted with its relative patch-averaged squared activation $f_i / \sum_j f_j$. The contrast estimate of the left and right subfields of the input image is separately processed by two contrast units:

[Eqs. 19 and 20: not recoverable from the extracted text]

As $x_k$ increases and crosses the threshold $\theta$, the output $y(x_k)$ increases from zero until it saturates. We chose the threshold $\theta = 0.6$ just above 0.5 to provide some margin before the self-reinforcing feedback loop becomes active at perfect binocular input (Fig. 5B). Furthermore, we set the saturation to 0.8 to prevent total suppression of one eye. Finally, the subsequent input subpatches for the cortical coder are adjusted to

[Eq. 21: not recoverable from the extracted text]

Note that $I(R, C) = I(R^*, C)$ since $R$ is homeomorphic to $R^*$ (61). Therefore, in our theoretical framework, we do not distinguish between the raw and contrast-adjusted retinal response. For the model implementation, the contrast-adjusted response is used. SI Appendix has additional detail.

Software and Documentation. Documented MATLAB code of the model is available in ModelDB under accession no. 261483 (62).

ACKNOWLEDGMENTS. This work was supported by German Federal Ministry of Education and Research Grants 01GQ1414 and 01EW1603A, Hong Kong Research Grants Council Grant 618713, and the Quandt Foundation. We thank Maria Fronius and Rowan Candy for numerous discussions.

Eckmann et al. | PNAS | March 17, 2020 | vol. 117 | no. 11 | 6161 | www.pnas.org/cgi/doi/10.1073/pnas.1908100117

1. D. D. Clark, L. Sokoloff, "Circulation and energy metabolism of the brain" in Basic Neurochemistry: Molecular, Cellular and Medical Aspects, G. J. Siegel, B. W. Agranoff, R. W. Albers, S. K. Fisher, M. D. Uhler, Eds. (Lippincott-Raven, Philadelphia, PA, 1999), pp. 637–670.
2. H. B. Barlow et al., Possible principles underlying the transformation of sensory messages. Sens. Commun. 1, 217–234 (1961).
3. E. P. Simoncelli, B. A. Olshausen, Natural image statistics and neural representation. Annu. Rev. Neurosci. 24, 1193–1216 (2001).
4. M. S. Lewicki, Efficient coding of natural sounds. Nat. Neurosci. 5, 356–363 (2002).
5. J. J. Atick, A. N. Redlich, What does the retina know about natural scenes? Neural Comput. 4, 196–210 (1992).
6. B. A. Olshausen, D. J. Field, Sparse coding with an overcomplete basis set: A strategy employed by v1? Vis. Res. 37, 3311–3325 (1997).
7. T. N. Wiesel, D. H. Hubel, Single-cell responses in striate cortex of kittens deprived of vision in one eye. J. Neurophysiol. 26, 1003–1017 (1963).
8. H. V. B. Hirsch, D. N. Spinelli, Visual experience modifies distribution of horizontally and vertically oriented receptive fields in cats. Science 168, 869–871 (1970).
9. J. A. Movshon, R. C. Van Sluyters, Visual neural development. Annu. Rev. Psychol. 32, 477–522 (1981).
10. J. Sebastian Espinosa, M. P. Stryker, Development and plasticity of the primary visual cortex. Neuron 75, 230–249 (2012).
11. Y. Zhao, C. A. Rothkopf, J. Triesch, B. E. Shi, "A unified model of the joint development of disparity selectivity and vergence control" in 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL) (IEEE, Piscataway, NJ, 2012), pp. 1–6.
12. C. Teulière et al., Self-calibrating smooth pursuit through active efficient coding. Robot. Autonom. Syst. 71, 3–12 (2015).
13. J. J. Gibson, The Senses Considered as Perceptual Systems (Houghton Mifflin, Boston, MA, 1966).
14. K. Friston, C. Thornton, A. Clark, Free-energy minimization and the dark-room problem. Front. Psychol. 3, 130 (2012).
15. A. Sims, "The problems with prediction" in Philosophy and Predictive Processing, T. K. Metzinger, W. Wiese, Eds. (MIND Group, Frankfurt am Main, Germany, 2017), chap. 23.
16. G. M. Tondel, T. R. Candy, Accommodation and vergence latencies in human infants. Vis. Res. 48, 564–576 (2008).
17. C. Blakemore, R. C. Van Sluyters, Experimental analysis of amblyopia and strabismus. Br. J. Ophthalmol. 58, 176–182 (1974).
18. R. V. Kenyon, K. J. Ciuffreda, L. Stark, Dynamic vergence eye movements in strabismus and amblyopia: Symmetric vergence. Invest. Ophthalmol. Vis. Sci. 19, 60–74 (1980).
19. V. Manh, A. M. Chen, K. Tarczy-Hornoch, S. A. Cotter, T. R. Candy, Accommodative performance of children with unilateral amblyopia. Invest. Ophthalmol. Vis. Sci. 56, 1193–1207 (2015).
20. D. M. Levi, D. C. Knill, D. Bavelier, Stereopsis and amblyopia: A mini-review. Vis. Res. 114, 17–30 (2015).
21. M. P. Stryker, S. Loewel, Amblyopia: New molecular/pharmacological and environmental approaches. Vis. Neurosci. 35, E018 (2018).
22. S. A. Cotter et al.; Pediatric Eye Disease Investigator Group, Treatment of anisometropic amblyopia in children with refractive correction. Ophthalmology 113, 895–903 (2006).

23. A. L. Steele et al., Successful treatment of anisometropic amblyopia with spectacles alone. J. Am. Assoc. Pediatr. Ophthalmol. Strabismus 10, 37–43 (2006).
24. Q. Zhu, J. Triesch, B. E. Shi, Joint learning of binocularly driven saccades and vergence by active efficient coding. Front. Neurorob. 11, 58 (2017).
25. L. Klimmasch, A. Lelais, A. Lichtenstein, B. E. Shi, J. Triesch, "Learning of active binocular vision in a biomechanical model of the oculomotor system" in Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (IEEE, Piscataway, NJ, 2017), pp. 21–26.
26. S. G. Mallat, Z. Zhang, Matching pursuits with time-frequency dictionaries. IEEE Trans. Signal Process. 41, 3397–3415 (1993).
27. T. M. Cover, J. A. Thomas, Elements of Information Theory (John Wiley & Sons, 2006).
28. J. J. Hunt, P. Dayan, G. J. Goodhill, Sparse coding can predict primary visual cortex receptive field changes induced by abnormal visual input. PLoS Comput. Biol. 9, e1003005 (2013).
29. S. M. Sherman, J. Stone, Physiological normality of the retina in visually deprived cats. Brain Res. 60, 224–230 (1973).
30. J. A. Movshon et al., Effects of early unilateral blur on the macaque's visual system. iii. Physiological observations. J. Neurosci. 7, 1340–1351 (1987).
31. T. Yoshioka, G. G. Blasdel, J. B. Levitt, J. S. Lund, Relation between patterns of intrinsic lateral connectivity, ocular dominance, and cytochrome oxidase-reactive regions in macaque monkey striate cortex. Cerebr. Cortex 6, 297–310 (1996).
32. M. Florencia Iacaruso, I. T. Gasler, S. B. Hofer, Synaptic organization of visual space in primary visual cortex. Nature 547, 449–452 (2017).
33. F. Moradi, D. J. Heeger, Inter-ocular contrast normalization in human visual cortex. J. Vis. 9, 13 (2009).
34. C. P. Said, D. J. Heeger, A model of binocular rivalry and cross-orientation suppression. PLoS Comput. Biol. 9, e1002991 (2013).
35. M. Carandini, D. J. Heeger, Normalization as a canonical neural computation. Nat. Rev. Neurosci. 13, 51–62 (2012).
36. L. Chen, P. B. Kruger, H. Hofer, B. Singer, D. R. Williams, Accommodation with higher-order monochromatic aberrations corrected with adaptive optics. J. Opt. Soc. Am. A 23, 1–8 (2006).
37. M. Zannoli, G. D. Love, R. Narain, M. S. Banks, Blur and the perception of depth at occlusions. J. Vis. 16, 17 (2016).
38. S. R. Bharadwaj, T. R. Candy, Accommodative and vergence responses to conflicting blur and disparity stimuli during development. J. Vis. 9, 1–18 (2009).
39. S. R. Bharadwaj, T. R. Candy, The effect of lens-induced anisometropia on accommodation and vergence during human visual development. Invest. Ophthalmol. Vis. Sci. 52, 3595–3603 (2011).
40. K. Holopigian, R. Blake, M. J. Greenwald, Selective losses in binocular vision in anisometropic amblyopes. Vis. Res. 26, 621–630 (1986).
41. R. J. Babu, S. Clavagnier, W. R. Bobier, B. Thompson, R. F. Hess, Regional extent of peripheral suppression in amblyopia. Invest. Ophthalmol. Vis. Sci. 58, 2329–2340 (2017).
42. A. Bradley, R. D. Freeman, Contrast sensitivity in anisometropic amblyopia. Invest. Ophthalmol. Vis. Sci. 21, 467–476 (1981).
43. B. J. W. Evans, Monovision: A review. Ophthalmic Physiol. Optic. 27, 417–439 (2007).
44. C. M. Schor, J. Alexander, L. Cormack, S. Stevenson, Negative feedback control model of proximal convergence and accommodation. Ophthalmic Physiol. Optic. 12, 307–318 (1992).
45. P. O. Hoyer, A. Hyvärinen, Independent component analysis applied to feature extraction from colour and stereo images. Netw. Comput. Neural Syst. 11, 191–210 (2000).
46. A. Gibaldi, M. Chessa, A. Canessa, S. P. Sabatini, F. Solari, A cortical model for binocular vergence control without explicit calculation of disparity. Neurocomputing 73, 1065–1073 (2010).
47. A. Gibaldi, A. Canessa, S. P. Sabatini, The active side of stereopsis: Fixation strategy and adaptation to natural environments. Sci. Rep. 7, 44800 (2017).
48. R. F. Hess, B. Thompson, Amblyopia and the binocular approach to its therapy. Vis. Res. 114, 4–16 (2015).
49. L. E. Hallum et al., Altered balance of receptive field excitation and suppression in visual cortex of amblyopic macaque monkeys. J. Neurosci. 37, 8216–8226 (2017).
50. S. Kehrein, T. Kohnen, M. Fronius, Dynamics of interocular suppression in amblyopic children during electronically monitored occlusion therapy: First insight. Strabismus 24, 51–62 (2016).
51. R. P. N. Rao, D. H. Ballard, Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 2, 79–87 (1999).
52. K. Friston, The free-energy principle: A unified brain theory? Nat. Rev. Neurosci. 11, 127–138 (2010).
53. S. E. Palmer, O. Marre, M. J. Berry, W. Bialek, Predictive information in a sensory population. Proc. Natl. Acad. Sci. U.S.A. 112, 6908–6913 (2015).
54. A. Olmos, F. A. Kingdom, A biologically inspired algorithm for the recovery of shading and reflectance images. Perception 33, 1463–1473 (2004).
55. R. S. Sutton, A. G. Barto, Reinforcement Learning: An Introduction (MIT Press, 2018).
56. S. Bhatnagar, R. Sutton, M. Ghavamzadeh, M. Lee, Natural actor-critic algorithms. Automatica 45, 2471–2482 (2009).
57. D. H. Hubel, T. N. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. 160, 106–154 (1962).
58. E. H. Adelson, J. R. Bergen, The Plenoptic Function and the Elements of Early Vision (MIT Press, 1991), pp. 3–20.
59. H. W. Chung, B. M. Sadler, A. O. Hero, Bounds on variance for unimodal distributions. IEEE Trans. Inf. Theor. 63, 6936–6949 (2017).
60. T. Finch, Incremental Calculation of Weighted Mean and Variance (University of Cambridge Computing Service, 2009).
61. A. Kraskov, H. Stögbauer, P. Grassberger, Estimating mutual information. Phys. Rev. E 69, 066138 (2004).
62. S. Eckmann, L. Klimmasch, Active efficient coding explains the development of binocular vision and its failure in amblyopia. ModelDB. http://modeldb.yale.edu/261483. Deposited 15 February 2020.
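
Illustrative sketch: matching pursuit encoding. The greedy encoding described under Input Processing (Eqs. 4–6) may be easier to follow in code. The following is a minimal Python sketch, not the authors' MATLAB implementation. The function names (`dot`, `normalize`, `matching_pursuit`), the toy sizes, and the choice to let a neuron be selected more than once are assumptions made for illustration. The final lines check the energy relation that the vergence reward relies on: the residual energy equals the input energy minus the summed squared coefficients.

```python
import math
import random


def dot(u, v):
    """Scalar product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(patch, fields, n_active=10):
    """Greedily encode `patch` with unit-norm receptive fields.

    Returns (steps, residual), where steps is a list of
    (neuron index, coefficient) pairs and residual is S_N.
    Assumption of this sketch: a neuron may be picked more than once.
    """
    residual = list(patch)  # S_0 = R
    steps = []
    for _ in range(n_active):
        # Eq. 4: select the field most similar to the current residual.
        responses = [dot(b, residual) for b in fields]
        i_n = max(range(len(fields)), key=lambda j: abs(responses[j]))
        c = responses[i_n]
        steps.append((i_n, c))
        # Eqs. 5 and 6: subtract what this neuron already encodes.
        residual = [s - c * b for s, b in zip(residual, fields[i_n])]
    return steps, residual


# Toy usage with random data (sizes are illustrative, not the paper's).
rng = random.Random(0)
toy_fields = [normalize([rng.gauss(0, 1) for _ in range(16)]) for _ in range(40)]
toy_patch = [rng.gauss(0, 1) for _ in range(16)]
toy_steps, toy_residual = matching_pursuit(toy_patch, toy_fields, n_active=5)
# Energy check: ||S||^2 should equal ||R||^2 minus the summed squared coefficients.
print(dot(toy_residual, toy_residual),
      dot(toy_patch, toy_patch) - sum(c * c for _, c in toy_steps))
```

The energy identity holds exactly only because every receptive field has unit norm, which is why the normalization in Eq. 7 matters for the reward computation.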

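
Illustrative sketch: state features, policy, and temporal difference error. The quantities in Eqs. 8–10 can be written down directly. The Python sketch below is an illustration under stated assumptions and does not reproduce the natural actor critic weight update itself (algorithm 3 in ref. 56). The function names, the nested-list data layout (`responses[p][i]` for patch `p` and neuron `i`), and the max-subtraction inside the softmax (a standard numerical safeguard) are choices of this sketch, not details taken from the paper.

```python
import math
import random

ACTIONS = (-2, -1, 0, 1, 2)  # plane shifts, as listed in the text
GAMMA = 0.6                  # temporal discounting factor from Eq. 10


def state_features(responses):
    """Eq. 8: f_i is the mean over patches of the squared activity c_(i,p)."""
    n_patches = len(responses)
    n_neurons = len(responses[0])
    return [sum(r[i] ** 2 for r in responses) / n_patches
            for i in range(n_neurons)]


def policy(u, f):
    """Eq. 9: softmax over z_a = sum_j u[a][j] * f[j], one row of u per action."""
    z = [sum(w * x for w, x in zip(row, f)) for row in u]
    z_max = max(z)  # subtracting the maximum leaves the softmax unchanged
    expz = [math.exp(v - z_max) for v in z]
    total = sum(expz)
    return [e / total for e in expz]


def value(v, f):
    """Eq. 9: linear state value V(f) = sum_j v[j] * f[j]."""
    return sum(w * x for w, x in zip(v, f))


def td_error(reward, v, f_now, f_prev, gamma=GAMMA):
    """Eq. 10: delta_t = r_t + gamma * V(f_t) - V(f_{t-1})."""
    return reward + gamma * value(v, f_now) - value(v, f_prev)


def sample_action(probs, rng):
    """Draw one plane shift according to the policy probabilities."""
    return rng.choices(ACTIONS, weights=probs, k=1)[0]


# Toy usage: two patches, two neurons, untrained (zero) action weights.
features = state_features([[1.0, -2.0], [3.0, 0.0]])
probs = policy([[0.0, 0.0] for _ in ACTIONS], features)
print(features, probs, sample_action(probs, random.Random(0)))
```

Because the features are squared and averaged over patches, flipping the sign of any response leaves `state_features` unchanged, which mirrors the polarity invariance noted in the text.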
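
Illustrative sketch: online reward normalization. The Reward Normalization step rescales each reward with an exponentially weighted running mean and variance, using update rate α = 0.001 (60). The Python sketch below assumes the standard incremental update for an exponentially weighted mean and variance described in ref. 60. The class name `RewardNormalizer`, the initial estimates (mean 0, variance 1), the small `eps` guard against division by zero, and the decision to normalize with the estimates from before the update are all assumptions of this sketch.

```python
import math
import random


class RewardNormalizer:
    """Online normalization of rewards to roughly zero mean, unit variance."""

    def __init__(self, alpha=0.001, mean=0.0, var=1.0, eps=1e-8):
        self.alpha = alpha  # update rate of the exponential weighting
        self.mean = mean    # running average of the reward (assumed start)
        self.var = var      # running variance estimate (assumed start)
        self.eps = eps      # guard against division by zero (assumption)

    def __call__(self, reward):
        # Normalize with the current estimates, then update them.
        normalized = (reward - self.mean) / math.sqrt(self.var + self.eps)
        diff = reward - self.mean
        # Equivalent to mean = (1 - alpha) * mean + alpha * reward.
        self.mean += self.alpha * diff
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * diff * diff)
        return normalized


# Toy usage: rewards drawn from a Gaussian with mean 3 and SD 2.
rng = random.Random(1)
normalizer = RewardNormalizer(alpha=0.01)
for _ in range(5000):
    normalizer(rng.gauss(3.0, 2.0))
print(normalizer.mean, normalizer.var)
```

A larger update rate tracks changes in the reward scale faster at the price of noisier estimates; the toy usage uses 0.01 rather than 0.001 only so that the estimates settle within a few thousand samples.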