FIXATIONAL EYE MOVEMENTS: DATA AUGMENTATION FOR THE BRAIN?

A PREPRINT

Beren Millidge
Department of Informatics
University of Edinburgh

Richard Shillcock
Department of Informatics
University of Edinburgh

February 28, 2019

ABSTRACT

Fixational eye movements are ubiquitous and have a large impact on visual perception. Although their physical characteristics and, to some extent, their neural underpinnings are well documented, their function, with the exception of preventing visual fading, remains poorly understood. In this paper, we propose that the visual system might utilise the relatively large number of similar, slightly jittered images produced by fixational eye movements to help learn robust and spatially invariant representations, as a form of neural data augmentation. Additionally, we form a link between effects such as retinal stabilisation and predictive processing theory, and argue that they may be best explained under such a paradigm.

Keywords Fixational Eye Movements · Data Augmentation · Predictive Processing · Neural Networks

1 Introduction

Even when the eyes are fixated on a point, they are never still. During each fixation, the eyes constantly make tiny movements which are imperceptible to us phenomenologically. These movements are called fixational eye movements and have been broadly separated into three main types: tremor, drift, and microsaccades (Rolfs, 2009).


Microsaccades are short, abrupt, high-velocity, jerk-like movements which carry the eye to a new location. Unlike regular saccades, they are generally smaller in amplitude, they are involuntary and unnoticeable to the subject, and they occur during voluntary fixation. Microsaccades last about 25ms (Ditchburn, 1980) and can carry the retinal image across several tens to several hundreds of photoreceptors (Ratliff and Riggs, 1950; Martinez-Conde et al., 2000). Like regular saccades, microsaccades are conjugate, meaning that both eyes make the same movement (Møller et al., 2002; Lord, 1951). Additionally, like regular saccades, microsaccades show a linear ‘main-sequence’ relationship between amplitude and velocity (Zuber et al., 1965) and may be generated in the superior colliculus (SC), the same region as regular saccades (Hafed et al., 2009; Goffart et al., 2012; Hafed and Krauzlis, 2012). Moreover, the neural mechanisms underlying both microsaccades and regular saccades appear to be the same: SC neurons with receptive fields tuned to a given amplitude and orientation exhibit a pre-movement buildup of activity, a strong burst during the microsaccade, and then a return to baseline, in a manner indistinguishable from neurons tuned to saccadic amplitudes and orientations (Ahissar et al., 2016).

Drifts are slower movements that occur in the intervals between microsaccades. Initially drifts seem random, and have been modelled as random-walk processes (Burak et al., 2010; Engbert and Kliegl, 2004; Engbert et al., 2011; Kuang et al., 2012), perhaps caused by oculomotor or neural noise (Ditchburn and Ginsborg, 1953; Cornsweet, 1956). However, drifts have also been shown to compensate for inaccurate microsaccades (Cyr and Fender, 1969), and to help maintain fixation when microsaccades are suppressed (Nachmias, 1959, 1961). During a drift, the retinal image of the fixated point can move across up to a dozen foveal photoreceptors (Ratliff and Riggs, 1950), so drifts could play a major part in ‘refreshing’ the retinal image or in data augmentation. Indeed, Ahissar et al. (2016) have proposed that drifts, rather than microsaccades, are the main fixational eye movements underlying vision, sampling from the visual world, while microsaccades, like macro-saccades, serve only to reorient the eye to a different part of the visual field. Unlike microsaccades, drifts have been found to be both conjugate (Ditchburn and Ginsborg, 1953; Spauschus et al., 1999) and non-conjugate (Krauskopf et al., 1960). Interestingly, Cherici et al. (2012) find that untrained subjects exhibit significantly more frequent and higher-amplitude fixational eye movements than trained subjects, including drift velocities up to ten times higher than reported in earlier studies.

Tremor is an aperiodic oscillatory motion (Riggs et al., 1953), typically of about 90 Hz (Carpenter, 1988), superimposed on the more regular drift motion. Tremor is not thought to play a large role in human vision, since its frequencies are often higher than the flicker-fusion frequency (Ditchburn, 1955; Gerrits and Vendrik, 1970), although other studies have indicated that tremor frequencies can often be below it (Spauschus et al., 1999). The amplitude of tremor is about the diameter of one cone receptor in the fovea (Yarbus, 1967), so even if tremor frequencies were slow enough to be perceptible, the amount of activation each movement would engender would be small and, perhaps, insignificant against the neural noise.


Shortly after fixational eye movements were discovered, another counterintuitive phenomenon was found. When fixational eye movements are artificially counteracted such that the image is kept fixed on the same point of the retina, the phenomenological percept of the image rapidly fades away (Riggs et al., 1953; Gerrits et al., 1966; Ratliff, 1958; Pritchard, 1961). This is called retinal stabilisation. The original studies found that it took several seconds to achieve full fading (Martinez-Conde et al., 2004), but this may have been due to imperfect stabilisation on the retina (Barlow, 1963). A more recent study found remarkably short fading times (80ms) for entoptic stimuli such as the vasculature of the retina (Coppola and Purves, 1996). Moreover, even short stabilisation times can lead to a dramatic decrease in visual acuity (Rucci and Desbordes, 2003). Furthermore, for low-contrast stimuli in the periphery of vision, retinal stabilisation is not even necessary; mere fixation suffices for fading (Troxler, 1804).


In light of findings that superimposing movements similar to microsaccades under conditions of retinal stabilisation could restore perception (Martinez-Conde et al., 2006; Carpenter, 1988; Ditchburn, 1980; Martinez-Conde and Macknik, 2017), it was argued that one function of microsaccades could be to ‘refresh’ the image by moving it imperceptibly across the retina (Livingstone et al., 1996; McCamy et al., 2012). This is supported by neurophysiological findings that microsaccades are predominantly excitatory throughout the striate cortex (Gur and Snodderly, 1997; Bair and O’keefe, 1998; Leopold and Logothetis, 1998; Snodderly et al., 2001; Macknik and Livingstone, 1998), the LGN (Reppas et al., 2002; Martinez-Conde et al., 2002), and even the retina (Greschner et al., 2002), and generally cause a long, tight burst of spikes immediately upon the cessation of the microsaccade (Martinez-Conde et al., 2000, 2002). Moreover, the neural response to microsaccades is predominantly visual in nature, since microsaccading across a blank image does not result in a similar pattern of neural activity (Martinez-Conde et al., 2000). However, a strong counterargument to the idea that microsaccades function to refresh the image on the retina is that subjects can suppress their own microsaccades without noticeable fading (Steinman et al., 1967; Fiorentini and Ercoles, 1966; Steinman et al., 1973). Additionally, such a reduction of microsaccades requires minimal instruction (Winterson and Collewijn, 1976); often just telling subjects to ‘hold’ their eyes still rather than ‘fixate’ sufficed (Steinman et al., 1967), implying that this ability is common and does not require special training. Moreover, Kowler and Steinman (1979) and others have found that microsaccades are naturally suppressed during high-acuity visual tasks (Kowler and Steinman, 1977) such as threading a needle or shooting a rifle (Bridgeman and Palca, 1980; Winterson and Collewijn, 1976), a strange result if microsaccades were necessary to preserve acuity by preventing visual fading. However, these results are not incontestable. De Bie (1986) found a significant increase in microsaccade rate for two high-acuity tasks (vernier-offset and Landolt-C discrimination) and argued that the previous high-acuity experiments may have been confounded by expectation effects, since in both studies the trial culminated in an expected event (Rolfs, 2009). Moreover, a more recent study by Ko et al. (2010) found that microsaccade rates increased while subjects threaded a needle, and that microsaccades and other fixational eye movements were crucial to high performance, dynamically moving the eye to focus on regions of high importance to the task.


Although visual fading under stabilised conditions is typically explained in terms of neural adaptation (Rolfs, 2009; Martinez-Conde et al., 2013), we propose that couching the explanation in terms of the predictive processing theory (Clark, 2012; Friston, 2012) newly prominent in cognitive science is more enlightening. Predictive processing proposes a unifying theory of the brain (Friston, 2010), and argues that the brain is composed of a hierarchy of probabilistic generative models constantly attempting to predict their own inputs and, thus, the outside sensory universe. Each layer of the hierarchy attempts to predict the bottom-up input it receives, taking into account the predictions received from higher layers, and updates itself to reduce its prediction error. The prediction errors generated at each level are then transmitted up the hierarchy for higher levels to handle (Clark, 2015). In this way, the brain as a whole minimises its free-energy (Friston, 2009; Friston et al., 2006), a variational quantity roughly equivalent to its prediction error. From this high-level formalism, biologically plausible update rules, similar to Hebbian and delta-rule updating, can be derived (Feldman and Friston, 2010). Visual fading upon retinal stabilisation is a natural consequence of predictive processing in the visual system. Since each layer updates itself to minimise prediction error, and since only the prediction error of a layer is passed on to subsequent layers, if a stimulus is presented ‘stabilised’ in one location for too long, the first neural layers will eventually learn to predict it perfectly; there is then no prediction error to pass on to higher layers, so the percept fades. Conversely, the small displacements of the image caused by fixational eye movements, as long as they are not predictable, will generate prediction errors which will be propagated to higher visual areas. This explains the reappearance of the percept, as well as the bursts of spikes accompanying microsaccades in neurophysiological studies, which can be interpreted as encoding the new prediction errors generated by the microsaccade translating the image across the retina.
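This fading dynamic can be illustrated with a toy numerical sketch (not the model used later in this paper): a single predictive ‘layer’ updated by a delta rule. All names and parameter values below are illustrative assumptions. With a stabilised input, the layer’s prediction error decays towards zero; with an unpredictably jittered input, residual error persists.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random(100)     # a fixed 'retinal' input (a 1D stand-in for an image)
lr = 0.1                    # delta-rule learning rate

def step(x, p):
    """One predictive update: compute the error, then move the prediction towards the input."""
    error = x - p
    return p + lr * error, np.abs(error).sum()

# Stabilised condition: the same input every step, so the error decays towards zero.
p = np.zeros(100)
for _ in range(100):
    p, e_stabilised = step(image, p)

# Jittered condition: the input is randomly shifted each step, so some
# prediction error always remains and the 'percept' persists.
p = np.zeros(100)
for _ in range(100):
    p, e_jittered = step(np.roll(image, int(rng.integers(-3, 4))), p)

print(f"final summed error: stabilised {e_stabilised:.3f}, jittered {e_jittered:.3f}")
```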


Moreover, the predictive processing explanation can account for why microsaccades and other fixational eye movements are necessary to maintain a visual percept, yet may be suppressed in high-acuity tasks. This can occur because of a quantity called precision. In predictive processing theory, the precision is the weight given to incoming sensory inputs over top-down predictions (Feldman and Friston, 2010). Attention in predictive processing is proposed to be implemented through increasing the precision of the incoming sense-data (Kanai et al., 2015). Since tasks requiring high levels of visual acuity will also require a strong attentional focus, the precision of the input will be substantially enhanced. This enhanced precision, and the consequent upweighting of the prediction errors arising from the sense-data, will counteract to some extent the lessening of prediction errors due to enhanced prediction, thus enabling the percept to persist longer without being refreshed by fixational eye movements.
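A correspondingly minimal sketch of this precision effect, under the simplifying assumptions that precision acts as a multiplicative gain on the sensory prediction error and that the percept ‘fades’ once the weighted error falls below some threshold (the threshold value and function name are hypothetical):

```python
import numpy as np

def steps_until_fading(precision, threshold=0.05, lr=0.2, max_steps=100):
    """Number of updates before the precision-weighted prediction error of a
    layer learning a static input falls below a hypothetical fading threshold."""
    x, p = 1.0, 0.0                    # scalar input and prediction
    for t in range(max_steps):
        error = x - p
        if precision * abs(error) < threshold:
            return t                   # the weighted error is now 'imperceptible'
        p += lr * error                # the layer learns to predict the input
    return max_steps

print(steps_until_fading(precision=1.0))  # fades sooner
print(steps_until_fading(precision=4.0))  # higher precision: the percept persists longer
```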


Despite the claims of Steinman and Kowler (Kowler and Steinman, 1979, 1980) that fixational eye movements serve no purpose and are an evolutionary mystery, or are an obstacle that must be overcome to avoid blurring (Packer and Williams, 1992) and to learn complex spatial representations (Burak et al., 2010), a large number of useful purposes for fixational eye movements have been proposed. Microsaccades and drifts have been found to counteract fixational errors (Ditchburn and Ginsborg, 1953; De Bie and Van den Brink, 1984), to maintain fixation on a stimulus (Cornsweet, 1956; Engbert and Mergenthaler, 2006; Møller et al., 2006), to compensate for nonhomogeneous visual acuity within the fovea (Poletti et al., 2013), and to improve acuity in the periphery (Hennig and Wörgötter, 2004). The direction of microsaccades (Hafed and Clark, 2002; Rolfs et al., 2004) and of drift can be modulated by attention (Pastukhov and Braun, 2010; Laubrock et al., 2010; Costela et al., 2013; Engbert and Kliegl, 2003) and by task demands (Gao et al., 2015; Di Stasi et al., 2013), which implies that they have some function beyond simply refreshing the image upon the retina, for which random movements would suffice. Engbert and Kliegl (2004) find that, viewed as a random walk, drift is persistent over short timescales, tending to carry the eye away from the fixation point and thus helping to sample multiple points of the image, while over longer timescales the walk is antipersistent, bringing the eye position back to the fixation point. Rucci and Casile (2005) find that random fixational eye movements introduce a component into the retinal input signals which lacks spatial correlations, resulting in significantly decorrelated or whitened neural activity in the retina and LGN, which may enhance further processing through decreased redundancy. Similarly, Rucci (2008) and Kuang et al. (2012) find that the movements of drift have the effect of equalising spectral power across a wide range of spatial frequencies (up to 10 cycles/deg), which tends to whiten the inputs and emphasise fine spatial detail. Furthermore, Rucci and Victor (2015) argue, following dynamic theories of vision (Ahissar and Arieli, 2001, 2012), that the primary mode of vision may be through spatiotemporal representations built up out of the transients caused by the random jitter of fixational eye movements.

We propose that fixational eye movements may serve another useful function for the brain: helping to build robust and spatially invariant representations of the visual scene. In parsing the visual input, the brain faces a difficult problem. Not only must it solve the inverse problem and try to reconstruct what is ‘out there’ given only the retinal image, but it must also discover that the same object can appear in multiple positions or orientations, each of which may give it a completely different retinal aspect. Fixational eye movements may help the brain develop spatially invariant representations of objects since, at each fixation, the brain receives not a single image of the scene, but rather many slightly different images, translated and jittered slightly by microsaccades and drift. Since the brain must know, in some sense, that all of these successive sense impressions represent almost the same vantage point on the same scene, it can use these inputs to build more robust representations which are invariant to spatially irrelevant transformations. This mechanism is analogous to data augmentation in machine learning (Krizhevsky et al., 2012; Chatfield et al., 2014; Zhang et al., 2016; Gan et al., 2015), in which additional ‘data’ is generated from a dataset by applying small perturbations to each data item which do not change its ‘meaning’ (Wu et al., 2015). In visual classification tasks, this typically includes small translations, skews, and rotations which do not meaningfully alter the classification of the item, as well as transformations of colour luminances that mimic ambient lighting shifts. Data augmentation has been found to be a useful regularisation technique that improves validation and test-set performance, increases the robustness of the learned representations, and helps networks learn spatial invariances. We argue that small fixational eye movements may serve the same purpose for the brain, by re-presenting the same visual scene during a fixation while applying what are effectively small translations to the objects in it. This succession of images may help the brain learn more spatially invariant object representations. Additionally, since each fixated object will be perceived from multiple slightly different angles, these differences during a fixation may enable the brain to resolve some of the inevitable ambiguities which arise when trying to solve the inverse problem of inferring a 3D world from a 2D image.

Although both microsaccades and drifts may be of use in this manner, it seems likely that the primary means of sampling from the visual world is drift, and that microsaccades serve, like regular saccades, mainly to reorient the eye within the visual scene. Microsaccades are too fast to encode much visual information during scanning: during a microsaccade, the stimulus moves so fast that it will hardly stimulate any single detector (Vuong et al., 1984), and if the detector has a centre-surround receptive field, both centre and surround will be activated near-simultaneously, cancelling out the effect (Amthor and Grzywacz, 1993). Moreover, microsaccades occur too infrequently, at between 0.5 and 2 Hz (Martinez-Conde et al., 2004), or only once or twice as often as macro-saccades, for the information encoded at their destinations to be of much use. Microsaccades are useful, however, in reorienting gaze within the foveal region to areas of interest, which is supported by the finding of McCamy et al. (2014) that microsaccades are significantly more frequent in highly informative regions. Drift, by contrast, occurs continuously and in a random-walk fashion, thus ensuring a good sampling of the visual scene, and drift velocity is such that the resulting temporal delays can be reliably decoded by neural circuits (Ahissar, 1998). Additionally, while neural responses to microsaccades are short, tight bursts, responses to drifts are longer, sustained bursts, better suited to encoding fine spatial and locational detail about the visual input (Kagan et al., 2008). Thus, we propose that a function of fixational eye movements is to sample a given foveated point from a large number of slightly different angles, and from these additional datapoints to learn more robust and invariant representations of the visual world than could be obtained by keeping the eyes fixed unerringly on a single point. In this paper, we support these proposals with a neural network model inspired by predictive processing theory that empirically demonstrates both visual fading upon retinal stabilisation and the benefits to performance of fixational eye movements conceptualised as data augmentation.

2 Method and Results

2.1 Network Architecture

The network followed a simple convolutional encoder-decoder architecture, consisting of three convolutional encoding layers followed by three deconvolutional decoding layers. Each convolutional or deconvolutional layer except the final one was accompanied by a 2x2 max-pooling layer or a 2x2 upsampling layer, respectively, and was batch-normalised (Ioffe and Szegedy, 2015). Dropout (Srivastava et al., 2014) was applied between the encoder and decoder layers with a coefficient of 0.1. The network was trained with a pixel-wise mean-square-error loss function and a learning rate of 0.0001, and was optimised by standard stochastic gradient descent. The hyperparameters were chosen somewhat arbitrarily and were not optimised in any significant way, so the results are likely robust to reasonable hyperparameter variations.

As a whole, the network learned a mapping from its input image back to that same image. To stop the network from simply learning the trivial identity mapping, a dimensional bottleneck was introduced between the encoder and decoder, so that the network had to learn to recreate its original input from a low-dimensional representation.
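A minimal PyTorch sketch of such an architecture is given below. The channel counts are illustrative assumptions, since they are not specified above, and upsampling followed by convolution stands in for explicit deconvolution layers:

```python
import torch
import torch.nn as nn

class PredictiveAutoencoder(nn.Module):
    """Convolutional encoder-decoder with a dimensional bottleneck (channel counts are illustrative)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 4, 3, padding=1), nn.BatchNorm2d(4), nn.ReLU(),  # bottleneck
        )
        self.dropout = nn.Dropout(0.1)  # dropout between encoder and decoder
        self.decoder = nn.Sequential(
            nn.Conv2d(4, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU(), nn.Upsample(scale_factor=2),
            nn.Conv2d(8, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(), nn.Upsample(scale_factor=2),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),  # final layer: no pooling or normalisation
        )

    def forward(self, x):
        return self.decoder(self.dropout(self.encoder(x)))

model = PredictiveAutoencoder()
criterion = nn.MSELoss()                                  # pixel-wise mean-square-error loss
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)  # plain stochastic gradient descent
```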

The prediction error was calculated in a pixel-wise manner as simply the difference between the pixel value of the predicted image and that of the veridical image:

$$E_p = \sum_{i=0}^{N} |x_i - \hat{x}_i|$$

where $x_i$ is the actual value of image pixel $i$ and $\hat{x}_i$ is its predicted value. The total error is then the sum of all the pixel-wise errors, and the error map is a new image in which each pixel takes the value of its prediction error. Since in predictive coding theory the prediction error is what is assumed to be passed on to the next level of the hierarchy, the error maps are treated here as the main output of the network.
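In code, the error map and total error are straightforward; this is a short NumPy sketch with hypothetical function names:

```python
import numpy as np

def error_map(x, x_hat):
    """Pixel-wise absolute prediction error: the signal assumed to be passed up the hierarchy."""
    return np.abs(x - x_hat)

def total_error(x, x_hat):
    """Total prediction error E_p: the error map summed over all pixels."""
    return error_map(x, x_hat).sum()
```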

The data were derived from the MNIST dataset (LeCun et al., 2010), a database of 70,000 grayscale handwritten digits, 28 by 28 pixels in size. The dataset was chosen because it is very commonly used in machine learning and very high accuracies have been achieved on it. Furthermore, since the models employed here are proofs of concept rather than intended for production use, it was thought wise to use a simple and well-established dataset for the demonstration.

2.2 Visual Fading

To simulate fixational eye movements, each input image was augmented with ten additional images in which the input image was translated vertically or horizontally by a random number of pixels (chosen uniformly between 0 and 8). In the stabilised condition, each input image was instead augmented with ten additional copies of itself. This was done so that in both conditions the total number of input images fed to the network was the same.
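A sketch of the two augmentation regimes, with hypothetical function names; note that np.roll wraps pixels around the image border, a simplifying assumption rather than necessarily the exact treatment used here:

```python
import numpy as np

def microsaccade_augment(image, n_copies=10, max_shift=8, rng=np.random.default_rng()):
    """Return n_copies of `image`, each translated vertically or horizontally by
    a uniformly chosen number of pixels, as a crude stand-in for a microsaccade."""
    augmented = []
    for _ in range(n_copies):
        shift = int(rng.integers(0, max_shift + 1))  # uniform in [0, 8]
        axis = int(rng.integers(0, 2))               # 0: vertical, 1: horizontal
        augmented.append(np.roll(image, shift, axis=axis))
    return np.stack(augmented)

def stabilised_augment(image, n_copies=10):
    """Control condition: the same image repeated, equalising the data volume."""
    return np.stack([image] * n_copies)
```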

To simulate visual fading, the network was trained on sequential presentations of either the jittered or the stabilised input data. In the two figures below, the ‘response’ of the network (the prediction error map) is shown for successive epochs under the stabilised and non-stabilised conditions.

Figure 1: Visual fading of the image is rapidly observed over time under the stabilised condition

As can be seen, under the stabilised condition the network is able to learn the precise details of the image well enough to eliminate almost all prediction error, and therefore its output fades rapidly. In the non-stabilised condition, the small, unpredictable jitters caused by fixational eye movements are sufficient to maintain the percept. Although there is some fading, which may also occur with real fixational eye movements when fixating on a point for a long time, the prediction error, and hence the percept, is largely preserved intact.


Figure 2: Although there is some slight fading, fixational eye movements do a good job of maintaining the image under the non-stabilised condition

To our knowledge, this is the first demonstration of visual fading under retinal stabilisation in a neural network model, and it confirms the hypothesis that the small perturbations of the image created by fixational eye movements are necessary to ‘refresh’ the image and thus prevent visual fading.

2.3 Usefulness of microsaccadic data augmentation

In addition to preventing visual fading under retinally stabilised conditions, we hypothesised that the data-augmentation effect of small fixational eye movements on the retina would be useful to the brain. To test this, we calculated the overall error on a test-set of 60,000 input images for networks trained with either the stabilised input or the microsaccadic input. The total error was calculated as the mean sum of the errors in each error map over the 60,000 test-set images. Under the microsaccadic condition, the data were augmented with virtual microsaccades as described above in the visual fading section; in the stabilised condition, the data were augmented with copies of the same image.
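A sketch of this evaluation procedure, assuming a trained model like the one sketched earlier and a test loader that yields batches of image tensors (both assumptions for illustration):

```python
import torch

def mean_test_error(model, test_loader):
    """Mean total prediction error (the summed error map) over the test images."""
    model.eval()
    total_error, n_images = 0.0, 0
    with torch.no_grad():
        for x in test_loader:                 # each batch: (N, 1, 28, 28) images
            x_hat = model(x)
            total_error += (x - x_hat).abs().sum().item()
            n_images += x.shape[0]
    return total_error / n_images
```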

While the network trained under the microsaccadic condition obtained a lower test error than the network trained under the copy condition, this difference was not significant (t = 2.70; p = 0.010), implying that if there is an advantage to microsaccadic data augmentation alone, its magnitude is unlikely to be large or robust.


Figure 3: The average error on the test-set of the network trained in the stabilised condition and the network trained in the microsaccadic condition

Additionally, to test the hypothesis that microsaccadic data augmentation may help the brain build spatial invariances into its representations, the networks were tested on a new test-set of images constructed by translating the images of the old test-set by 0, 2, 4, 6, or 8 pixels in a random direction. A plot of the errors of the networks trained under both conditions is shown below:

Figure 4: The errors obtained by networks trained in the stabilised or microsaccadic condition on a test-set composed of stimuli offset by a certain number of pixels, to test the spatial invariance of the representations learned by the network.


Although the network trained in the stabilised condition performed slightly better in the 0-pixel condition (unsurprising, since that condition was effectively the training set of the stabilised copy model while being unusual for the microsaccade-trained model), in all the other conditions the microsaccade-trained model outperformed the copy model, showing that the fixational eye movements helped it generalise and build more robustly invariant representations of the input. Notably, the relative performance of the microsaccade-trained model increases with the amount of image translation, showing that its representations are increasingly robust to larger perturbations compared with those of the copy model.

2.4 Usefulness of drift data augmentation

To simulate drift, each input image was augmented with ten additional images derived from a Gaussian random walk which translated the image horizontally or vertically by a number of pixels selected from a Gaussian distribution with a mean of 0 and a variance of 1 (negative values resulted in leftward translations; positive ones in rightward translations). As before, in the stabilised condition, each input image was augmented with ten additional copies of itself.
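A sketch of this drift augmentation, again with hypothetical function names and wrap-around translations via np.roll:

```python
import numpy as np

def drift_augment(image, n_steps=10, rng=np.random.default_rng()):
    """Gaussian random walk: each of the n_steps augmented images is the previous
    one shifted by a step drawn from N(0, 1) pixels (rounded to an integer);
    negative steps translate in the opposite direction to positive ones."""
    augmented = []
    current = image
    for _ in range(n_steps):
        step = int(round(rng.normal(0.0, 1.0)))  # mean 0, variance 1
        axis = int(rng.integers(0, 2))           # shift vertically or horizontally
        current = np.roll(current, step, axis=axis)
        augmented.append(current)
    return np.stack(augmented)
```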

As in the microsaccadic condition, the mean error on the test-set of images was calculated for networks trained under either the stabilised regime or the drift regime. A bar chart of the errors is shown below:

Figure 5: The errors obtained by networks trained in the stabilised or drift condition on the test-set

Here the advantage of the network trained under the drift condition over the network trained under the copy condition was significant (t = 33.5, p ≈ 0) and, importantly, substantially greater than the non-significant result of training with microsaccades alone.

Additionally, to test the spatial invariances learnt by the networks under both conditions, they were tested on test-sets consisting of test stimuli translated either forwards or backwards by a certain number of pixels. The mean errors of the networks under these conditions are plotted below:


Figure 6: The errors obtained by networks trained in the stabilised or drift condition on a test-set composed of stimuli offset by a certain number of pixels, to test the spatial invariance of the representations learned by the network.

While the drift-trained model does appear to be slightly better than the copy model at larger pixel displacements, the difference is fairly marginal, and it is not immediately apparent which model performs better overall. These results suggest that the benefit of drifts alone is quite small.

2.5 Usefulness of drift and microsaccadic data augmentation

To simulate microsaccades and drifts together, each input was augmented with ten additional images. These images were generated by applying either a microsaccade (a large translation by a number of pixels uniformly sampled between 0 and 8) or a drift step sampled from a Gaussian with mean 2 and variance 1. A microsaccade was chosen with probability 0.1; otherwise, a drift step was taken. In the stabilised condition, the original image was augmented with ten copies, as before.
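A sketch of the combined regime, under the same illustrative assumptions as the earlier augmentation snippets:

```python
import numpy as np

def combined_augment(image, n_copies=10, p_microsaccade=0.1, rng=np.random.default_rng()):
    """Each augmented image is produced either by a simulated microsaccade
    (probability 0.1: a large uniform 0-8 pixel jump) or by a drift step
    (a Gaussian step with mean 2 and variance 1)."""
    augmented = []
    current = image
    for _ in range(n_copies):
        axis = int(rng.integers(0, 2))                # vertical or horizontal shift
        if rng.random() < p_microsaccade:
            shift = int(rng.integers(0, 9))           # microsaccade: uniform in [0, 8]
        else:
            shift = int(round(rng.normal(2.0, 1.0)))  # drift: Gaussian, mean 2, variance 1
        current = np.roll(current, shift, axis=axis)
        augmented.append(current)
    return np.stack(augmented)
```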


Figure 7: The errors obtained by networks trained in the stabilised or combined drift-and-microsaccade condition on the test-set

The test-set performance of the model trained with both drifts and microsaccades was significantly better than that of the model trained with only copies of the input data (t = 47.3, p ≈ 0) and, importantly, better than that of either the drift-only or the microsaccade-only model, suggesting that the combination of drifts and microsaccades, rather than either one individually, may be required for the maximum benefit.

Figure 8: The errors obtained by networks trained in the stabilised or combined drift-and-microsaccade condition on a test-set composed of stimuli offset by a certain number of pixels, to test the spatial invariance of the representations learned by the network.


This pattern was maintained in the invariance experiments. The drift-and-microsaccade model not only significantly outperformed the copy model, and increasingly so as the degree of pixel displacement increased, but also significantly outperformed the models which used drifts or microsaccades alone. This suggests that the combination of both drifts and microsaccades leads to the development of the most robust and generalisable input representations.

3 Discussion

In this paper, we have implemented a simple convolutional autoencoder model of predictive processing and shown that it exhibits visual fading under conditions of retinal stabilisation. To our knowledge, this is the first neural network model of visual fading, and it confirms that the retinal stabilisation effect falls out as a natural consequence of predictive processing theory.

Predictive processing theory predicts that, insofar as both microsaccades and drift shift the retinal image over sufficiently many photoreceptors for a reasonably large prediction error to accumulate, both should be able to refresh the image on the retina. One key caveat is that such fixational eye movements would need to be unpredictable: if they were predictable, they would be predicted, and would thus generate no prediction error upon being made and so produce no perceptual refresh. It would seem, then, that the ideal fixational eye movements from a predictive processing standpoint would be entirely random, hence entirely unpredictable, and so generate the maximum prediction error. How, then, should we accommodate the large number of findings (Pastukhov and Braun, 2010; Costela et al., 2013; Gao et al., 2015) showing that fixational eye movements, especially microsaccades, are not random and are sensitive to a large number of factors including attention, task demands, and visual saliency and informativeness (McCamy et al., 2014)? One response is to note that there is still a significant stochastic component to microsaccades, even given such modulations; insofar as they are at least partly unpredictable, they will still generate some prediction error and thus refresh the retinal image.


Secondly, the reason for the non-randomness of fixational eye movements may lie in their other functions apart from simply refreshing the retinal image. A number of such functions have been proposed. Rucci and colleagues (Rucci, 2008; Rucci and Victor, 2015) have argued that microsaccades shift the spatiotemporal statistics of the input and serve to whiten spatial-frequency power, which aids low-level neural processing and also has the effect of enhancing fine spatial detail. Moreover, Ahissar et al. (2016) argue that microsaccades serve to reorient attention to informative stimuli within the foveal region while drifts sample from various image locations. This reorientational role is supported by findings of attentional modulation, and also by the results of McCamy et al. (2014), who found that microsaccades are more common in more informative regions. There is also a large body of work supporting the idea that fixational eye movements help to counteract fixation errors (Engbert and Mergenthaler, 2006; Møller et al., 2006). These additional functions of input whitening, maintaining accurate fixation, and enhanced sampling of regions of interest may be able to explain the non-random aspects of fixational eye movements. This reorienting of the eye to focus on regions of interest also fits well within our theory of fixational eye movements as data augmentation, since the most valuable regions from which to gather additional data would be those with the highest information content for the brain.


Furthermore, the lack of predictability in microsaccades may be less necessary in tasks requiring high visual acuity or when responding to attentional modulation, since attention itself increases precision, which increases the weighting of the sensory prediction errors and thus may suffice to reduce fixational fading by itself. An interesting question for further work presents itself here: is attention able to prevent visual fading under conditions of retinal stabilisation to some extent, as predicted by predictive processing theories in which attention is implemented as an increase in precision, or a relative up-weighting of the sensory evidence? An answer to this question would shed light on the general suitability of the predictive processing framework for encompassing relatively low-level phenomena such as visual stabilisation, and on other questions relating directly to adaptation at the retinal and earliest cortical levels.

A third possibility for the non-stochasticity relates to the multiple distinct channels and generative models comprising the brain. Fixational eye movements need only be unpredictable to the predictive generative models instantiated in the visual cortex for them to generate prediction errors there, and this may be the case even if they are generated entirely non-stochastically in the superior colliculus, as long as the superior colliculus does not transfer that information to the visual cortex before the fixational eye movements are generated. This separation of concerns and information in the brain would enable the completely predictable, non-stochastic operations of some regions to be unpredictable to other regions and thus to generate prediction errors.

A deeper question arises as to why vision is constantly refreshed. Many other senses, such as touch and, to some extent, hearing, are not. When sitting down, you are not constantly aware of the feeling of the surface against you; similarly, you are well able to tune out constant and predictable noises. In these sensory modalities, no equivalent of the refreshing seen in vision occurs; instead the percept is allowed to fade completely. The real question, then, is not why fading upon retinal stabilisation occurs, but rather why we have fixational eye movements which guarantee a continually refreshed and thus constantly preserved visual sensorium. Could it be that fixational eye movements evolved independently of any need for constant phenomenological visual awareness, so that constant visual presence is in effect a byproduct of other functions, such as the need for data augmentation or simply the maintenance of accurate fixation?


While we agree with Rucci and colleagues (Rucci and Victor, 2015; Rucci, 2008; Kuang et al., 2012) that the spatiotemporal shifts of the input statistics engendered by fixational eye movements are important and may help the brain parse the visual input and aid the first steps of neural processing, we also argue that, beyond the statistical input shifts, the additional ‘snapshots’ observed by the brain during fixational eye movements may be usefully processed to help the brain develop robust and spatially invariant representations. This is because such snapshots present multiple views of the same scene from almost, but not quite, the same perspective. Representations consistent with this augmented input must possess a significantly greater degree of generalisability and spatial invariance than they otherwise would, since they must map many slightly different jittered inputs to the same ultimate representation of the scene. This process is analogous to data augmentation in machine learning, which has been found to be a powerful regularisation technique and one that encourages the learning of robust and generalisable representations. We support this hypothesis with modelling work which demonstrates the benefits of fixational eye movements as data augmentation. In both the drift and the combined drift-and-microsaccade augmentation conditions, the model in the augmented condition outperformed the model in the fixation condition, which utilised simple copies of the input image instead of augmented data items. The data-augmented model achieved both a lower mean error on the test-set and substantially more invariant representations. This supports the hypothesis that the small movements of the input engendered by fixational eye movements are helpful in developing better, more robust, and more spatially invariant representations of the input.

Additionally, it is possible that the constant slight perturbations of the input which the brain receives due to fixational eye movements may help it develop representations more resilient to adversarial examples (Kurakin et al., 2016; Tabacof and Valle, 2016; Liu et al., 2016), to which artificial neural networks are susceptible. An adversarial example is a new input, generated by applying a tiny, unnoticeable perturbation to the original input, which suffices to make the network wildly misclassify that input. As far as we are aware, no corresponding class of adversarial images exists for biological visual systems. We hypothesise that the intrinsic regularising effects of the constant data augmentation provided by fixational eye movements contribute to this fact, since they would force the brain to learn representations insensitive to small perturbations of the input and would encourage the learned classification surface to be smooth and its generalisations to be robust.

Interestingly, we found that augmenting the data with microsaccades alone did little to improve test-set performance or the development of invariant representations. One possibility is that the shifts created by the simulated microsaccades were too large, making it difficult for the network to learn generalisations of the input from them alone, while the smaller shifts of drift provided a training set more amenable to learning.

Additionally, we found that the combination of drifts and microsaccades outperformed either drifts or microsaccades individually. This is potentially because drifts and microsaccades are complementary: drifts force the model to learn small-scale spatial invariances of the order of a few pixels at a time, while microsaccades force the network to learn larger-scale spatial invariance but may, alone, leave it vulnerable to small perturbations. The combination of the two covers both types of shift, and thus leads the network to develop maximally robust and generalisable representations, invariant to shifts and noise over multiple scales. The finding that the combination of drifts and microsaccades is best for developing strong representations thus provides a clue as to why the brain utilises multiple types of fixational eye movements in perception.

Additionally, while we argue that fixational eye movements may serve as a kind of natural data augmentation for the brain, this does not preclude alternative hypotheses about their role in attentional modulation or in correcting errors of various kinds. Indeed, it seems likely that fixational eye movements are a complex behaviour, built up out of multiple interacting components and fulfilling several important functions simultaneously, one of which is effective data augmentation at multiple scales to help develop better and more robust representations of the visual world.

4 Conclusion

In this paper we make two main contributions. First, we propose that retinal stabilisation fading is best explained under the predictive processing paradigm, and we present the first neural network model of visual fading. Secondly, we propose an additional function for fixational eye movements: data augmentation. We argue that by sampling multiple closely related points of the image through microsaccades and drift, the brain is able to use this additional input to learn more robust, generalisable, and spatially invariant representations. We buttress this claim with neural network modelling work in which networks trained under the fixational eye movement conditions obtain lower error overall and are more resistant to jittered inputs than networks trained in the stabilised condition.

References


Ahissar, E. (1998). Temporal-code to rate-code conversion by neuronal phase-locked loops. Neural Computation, 10(3):597–650.

Ahissar, E. and Arieli, A. (2001). Figuring space by time. Neuron, 32(2):185–201.

Ahissar, E. and Arieli, A. (2012). Seeing via miniature eye movements: a dynamic hypothesis for vision. Frontiers in Computational Neuroscience, 6:89.

Ahissar, E., Arieli, A., Fried, M., and Bonneh, Y. (2016). On the possible roles of microsaccades and drifts in visual perception. Vision Research, 118:25–30.

Amthor, F. R. and Grzywacz, N. M. (1993). Inhibition in on-off directionally selective ganglion cells of the rabbit retina. Journal of Neurophysiology, 69(6):2174–2187.

Bair, W. and O’keefe, L. P. (1998). The influence of fixational eye movements on the response of neurons in area mt of the macaque. Visual Neuroscience, 15(4):779–786.

Barlow, H. (1963). Slippage of contact lenses and other artefacts in relation to fading and regeneration of supposedly stable retinal images. Quarterly Journal of Experimental Psychology, 15(1):36–51.

Bridgeman, B. and Palca, J. (1980). The role of microsaccades in high acuity observational tasks. Vision Research, 20(9):813–817.

Burak, Y., Rokni, U., Meister, M., and Sompolinsky, H. (2010). Bayesian model of dynamic image stabilization in the visual system. Proceedings of the National Academy of Sciences, 107(45):19525–19530.

Carpenter, R. H. (1988). Movements of the Eyes, 2nd Rev. Pion Limited.

Chatfield, K., Simonyan, K., Vedaldi, A., and Zisserman, A. (2014). Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint arXiv:1405.3531.

Cherici, C., Kuang, X., Poletti, M., and Rucci, M. (2012). Precision of sustained fixation in trained and untrained observers. Journal of Vision, 12(6):31–31.

Clark, A. (2012). Dreaming the whole cat: Generative models, predictive processing, and the enactivist conception of perceptual experience. Mind, 121(483):753–771.

Clark, A. (2015). Surfing uncertainty: Prediction, action, and the embodied mind. Oxford University Press.

Coppola, D. and Purves, D. (1996). The extraordinarily rapid disappearance of entoptic images. Proceedings of the National Academy of Sciences, 93(15):8001–8004.

Cornsweet, T. N. (1956). Determination of the stimuli for involuntary drifts and saccadic eye movements. JOSA, 46(11):987–993.


Costela, F., Otero-Millan, J., McCamy, M., Macknik, S., Troncoso, X., and Martinez-Conde, S. (2013). Microsaccades correct fixation errors due to blinks. Journal of Vision, 13(9):1335–1335.

Cyr, G. J. S. and Fender, D. H. (1969). The interplay of drifts and flicks in binocular fixation. Vision Research, 9(2):245–265.

De Bie, J. (1986). The control properties of small eye movements. PhD thesis, TU Delft, Delft University of Technology.

De Bie, J. and Van den Brink, G. (1984). Small stimulus movements are necessary for the study of fixational eye movements. In Advances in Psychology, volume 22, pages 63–70. Elsevier.

Di Stasi, L. L., McCamy, M. B., Catena, A., Macknik, S. L., Canas, J. J., and Martinez-Conde, S. (2013). Microsaccade and drift dynamics reflect mental fatigue. European Journal of Neuroscience, 38(3):2389–2398.

Ditchburn, R. (1955). Eye-movements in relation to retinal action. Optica Acta.

Ditchburn, R. (1980). The function of small saccades. Vision Research, 20(3):271–272.

Ditchburn, R. and Ginsborg, B. (1953). Involuntary eye movements during fixation. The Journal of Physiology, 119(1):1–17.

Engbert, R. and Kliegl, R. (2003). Microsaccades uncover the orientation of covert attention. Vision Research, 43(9):1035–1045.

Engbert, R. and Kliegl, R. (2004). Microsaccades keep the eyes’ balance during fixation. Psychological Science, 15(6):431–431.

Engbert, R. and Mergenthaler, K. (2006). Microsaccades are triggered by low retinal image slip. Proceedings of the National Academy of Sciences, 103(18):7192–7197.

Engbert, R., Mergenthaler, K., Sinn, P., and Pikovsky, A. (2011). An integrated model of fixational eye movements and microsaccades. Proceedings of the National Academy of Sciences, 108(39):E765–E770.

Feldman, H. and Friston, K. (2010). Attention, uncertainty, and free-energy. Frontiers in Human Neuroscience, 4:215.

Fiorentini, A. and Ercoles, A. (1966). Involuntary eye movements during attempted monocular fixation. Atti della Fondazione Giorgio Ronchi, 21:199–217.

Friston, K. (2009). The free-energy principle: a rough guide to the brain? Trends in Cognitive Sciences, 13(7):293–301.

Friston, K. (2010). The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2):127.

Friston, K. (2012). Prediction, perception and agency. International Journal of Psychophysiology, 83(2):248–252.

Friston, K., Kilner, J., and Harrison, L. (2006). A free energy principle for the brain. Journal of Physiology-Paris, 100(1-3):70–87.

Gan, Z., Henao, R., Carlson, D., and Carin, L. (2015). Learning deep sigmoid belief networks with data augmentation. In Artificial Intelligence and Statistics, pages 268–276.

Gao, X., Yan, H., and Sun, H.-j. (2015). Modulation of microsaccade rate by task difficulty revealed through between- and within-trial comparisons. Journal of Vision, 15(3):3–3.

Gerrits, H., De Haan, B., and Vendrik, A. (1966). Experiments with retinal stabilized images. relations between the observations and neural data. Vision Research, 6(7-8):427–440.


Gerrits, H. and Vendrik, A. (1970). Artificial movements of a stabilized image. Vision Research, 10(12):1443–1456.

Goffart, L., Hafed, Z. M., and Krauzlis, R. J. (2012). Visual fixation as equilibrium: evidence from superior colliculus inactivation. Journal of Neuroscience, 32(31):10627–10636.

Greschner, M., Bongard, M., Rujan, P., and Ammermüller, J. (2002). Retinal ganglion cell synchronization by fixational eye movements improves feature estimation. Nature Neuroscience, 5(4):341.

Gur, M. and Snodderly, D. M. (1997). Visual receptive fields of neurons in primary visual cortex (v1) move in space with the eye movements of fixation. Vision Research, 37(3):257–265.

Hafed, Z. M. and Clark, J. J. (2002). Microsaccades as an overt measure of covert attention shifts. Vision Research, 42(22):2533–2545.

Hafed, Z. M., Goffart, L., and Krauzlis, R. J. (2009). A neural mechanism for microsaccade generation in the primate superior colliculus. Science, 323(5916):940–943.

Hafed, Z. M. and Krauzlis, R. J. (2012). Similarity of superior colliculus involvement in microsaccade and saccade generation. Journal of Neurophysiology, 107(7):1904–1916.

Hennig, M. H. and Wörgötter, F. (2004). Eye micro-movements improve stimulus detection beyond the nyquist limit in the peripheral retina. In Advances in Neural Information Processing Systems, pages 1475–1482.

Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167.

Kagan, I., Gur, M., and Snodderly, D. M. (2008). Saccades and drifts differentially modulate neuronal activity in v1: effects of retinal image motion, position, and extraretinal influences. Journal of Vision, 8(14):19–19.

Kanai, R., Komura, Y., Shipp, S., and Friston, K. (2015). Cerebral hierarchies: predictive processing, precision and the pulvinar. Phil. Trans. R. Soc. B, 370(1668):20140169.

Ko, H.-k., Poletti, M., and Rucci, M. (2010). Microsaccades precisely relocate gaze in a high visual acuity task. Nature Neuroscience, 13(12):1549.

Kowler, E. and Steinman, R. M. (1977). The role of small saccades in counting. Vision Research, 17(1):141–146.

Kowler, E. and Steinman, R. M. (1979). Miniature saccades: Eye movements that do not count. Vision Research, 19(1):105–108.

Kowler, E. and Steinman, R. M. (1980). Small saccades serve no useful purpose: reply to a letter by R. W. Ditchburn. Vision Research, 20(3):273–276.

Krauskopf, J., Cornsweet, T., and Riggs, L. (1960). Analysis of eye movements during monocular and binocular fixation. JOSA, 50(6):572–578.

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105.

Kuang, X., Poletti, M., Victor, J. D., and Rucci, M. (2012). Temporal encoding of spatial information during active visual fixation. Current Biology, 22(6):510–514.

Kurakin, A., Goodfellow, I., and Bengio, S. (2016). Adversarial machine learning at scale. arXiv preprint arXiv:1611.01236.


Laubrock, J., Kliegl, R., Rolfs, M., and Engbert, R. (2010). When do microsaccades follow spatial attention? Attention, Perception, & Psychophysics, 72(3):683–694.

LeCun, Y., Cortes, C., and Burges, C. (2010). MNIST handwritten digit database. AT&T Labs [Online]. Available: http://yann.lecun.com/exdb/mnist.

Leopold, D. A. and Logothetis, N. K. (1998). Microsaccades differentially modulate neural activity in the striate and extrastriate visual cortex. Experimental Brain Research, 123(3):341–345.

Liu, Y., Chen, X., Liu, C., and Song, D. (2016). Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770.

Livingstone, M., Freeman, D., and Hubel, D. (1996). Visual responses in v1 of freely viewing monkeys. In Cold Spring Harbor Symposia on Quantitative Biology, volume 61, pages 27–37. Cold Spring Harbor Laboratory Press.

Lord, M. P. (1951). Measurement of binocular eye movements of subjects in the sitting position. The British Journal of Ophthalmology, 35(1):21.

Macknik, S. L. and Livingstone, M. S. (1998). Neuronal correlates of visibility and invisibility in the primate visual system. Nature Neuroscience, 1(2):144.

Martinez-Conde, S. and Macknik, S. L. (2017). Unchanging visions: the effects and limitations of ocular stillness. Phil. Trans. R. Soc. B, 372(1718):20160204.

Martinez-Conde, S., Macknik, S. L., and Hubel, D. H. (2000). Microsaccadic eye movements and firing of single cells in the striate cortex of macaque monkeys. Nature Neuroscience, 3(3):251.

Martinez-Conde, S., Macknik, S. L., and Hubel, D. H. (2002). The function of bursts of spikes during visual fixation in the awake primate lateral geniculate nucleus and primary visual cortex. Proceedings of the National Academy of Sciences, 99(21):13920–13925.

Martinez-Conde, S., Macknik, S. L., and Hubel, D. H. (2004). The role of fixational eye movements in visual perception. Nature Reviews Neuroscience, 5(3):229.

Martinez-Conde, S., Macknik, S. L., Troncoso, X. G., and Dyar, T. A. (2006). Microsaccades counteract visual fading during fixation. Neuron, 49(2):297–305.

Martinez-Conde, S., Otero-Millan, J., and Macknik, S. L. (2013). The impact of microsaccades on vision: towards a unified theory of saccadic function. Nature Reviews Neuroscience, 14(2):83.

McCamy, M. B., Otero-Millan, J., Di Stasi, L. L., Macknik, S. L., and Martinez-Conde, S. (2014). Highly informative natural scene regions increase microsaccade production during visual scanning. Journal of Neuroscience, 34(8):2956– 2966.

McCamy, M. B., Otero-Millan, J., Macknik, S. L., Yang, Y., Troncoso, X. G., Baer, S. M., Crook, S. M., and Martinez-Conde, S. (2012). Microsaccadic efficacy and contribution to foveal and peripheral vision. Journal of Neuroscience, 32(27):9194–9204.

Møller, F., Laursen, M., and Sjølie, A. (2006). The contribution of microsaccades and drifts in the maintenance of binocular steady fixation. Graefe’s Archive for Clinical and Experimental Ophthalmology, 244(4):465.


Møller, F., Laursen, M., Tygesen, J., and Sjølie, A. (2002). Binocular quantification and characterization of microsaccades. Graefe’s Archive for Clinical and Experimental Ophthalmology, 240(9):765–770.

Nachmias, J. (1959). Two-dimensional motion of the retinal image during monocular fixation. JOSA, 49(9):901–908.

Nachmias, J. (1961). Determiners of the drift of the eye during monocular fixation. JOSA, 51(7):761–766.

Packer, O. and Williams, D. R. (1992). Blurring by fixational eye movements. Vision Research, 32(10):1931–1939.

Pastukhov, A. and Braun, J. (2010). Rare but precious: microsaccades are highly informative about attentional allocation. Vision Research, 50(12):1173–1184.

Poletti, M., Listorti, C., and Rucci, M. (2013). Microscopic eye movements compensate for nonhomogeneous vision within the fovea. Current Biology, 23(17):1691–1695.

Pritchard, R. M. (1961). Stabilized images on the retina. Scientific American, 204(6):72–79.

Ratliff, F. (1958). Stationary retinal image requiring no attachments to the eye. JOSA, 48(4):274_1–275.

Ratliff, F. and Riggs, L. A. (1950). Involuntary motions of the eye during monocular fixation. Journal of Experimental Psychology, 40(6):687.

Reppas, J. B., Usrey, W. M., and Reid, R. C. (2002). Saccadic eye movements modulate visual responses in the lateral geniculate nucleus. Neuron, 35(5):961–974.

Riggs, L. A., Ratliff, F., Cornsweet, J. C., and Cornsweet, T. N. (1953). The disappearance of steadily fixated visual test objects. JOSA, 43(6):495–501.

Rolfs, M. (2009). Microsaccades: small steps on a long way. Vision Research, 49(20):2415–2441.

Rolfs, M., Engbert, R., and Kliegl, R. (2004). Microsaccade orientation supports attentional enhancement opposite a peripheral cue: commentary on tse, sheinberg, and logothetis (2003). Psychological Science, 15(10):705–707.

Rucci, M. (2008). Fixational eye movements, natural image statistics, and fine spatial vision. Network: Computation in Neural Systems, 19(4):253–285.

Rucci, M. and Casile, A. (2005). Fixational instability and natural image statistics: Implications for early visual representations. Network: Computation in Neural Systems, 16(2-3):121–138.

Rucci, M. and Desbordes, G. (2003). Contributions of fixational eye movements to the discrimination of briefly presented stimuli. Journal of Vision, 3(11):18–18.

Rucci, M. and Victor, J. D. (2015). The unsteady eye: an information-processing stage, not a bug. Trends in Neurosciences, 38(4):195–206.

Snodderly, D. M., Kagan, I., and Gur, M. (2001). Selective activation of visual cortex neurons by fixational eye movements: implications for neural coding. Visual Neuroscience, 18(2):259–277.

Spauschus, A., Marsden, J., Halliday, D. M., Rosenberg, J. R., and Brown, P. (1999). The origin of ocular microtremor in man. Experimental Brain Research, 126(4):556–562.

Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958.


Steinman, R. M., Cunitz, R. J., Timberlake, G. T., and Herman, M. (1967). Voluntary control of microsaccades during maintained monocular fixation. Science, 155(3769):1577–1579.

Steinman, R. M., Haddad, G. M., Skavenski, A. A., and Wyman, D. (1973). Miniature eye movement. Science, 181(4102):810–819.

Tabacof, P. and Valle, E. (2016). Exploring the space of adversarial images. In Neural Networks (IJCNN), 2016 International Joint Conference on, pages 426–433. IEEE.

Troxler, D. (1804). Ophthalmologisches bibliothek. Vol. II. Himly, K. & Schmidt, JA (eds.), pages 1–53.

Vuong, T., Chabre, M., and Stryer, L. (1984). Millisecond activation of transducin in the cyclic nucleotide cascade of vision. Nature, 311(5987):659.

Winterson, B. J. and Collewijn, H. (1976). Microsaccades during finely guided visuomotor tasks. Vision Research, 16(12):1387–1390.

Wu, R., Yan, S., Shan, Y., Dang, Q., and Sun, G. (2015). Deep image: Scaling up image recognition. arXiv preprint arXiv:1501.02876, 7(8).

Yarbus, A. (1967). Eye Movements and Vision. Plenum Press, New York.

Zhang, C., Bengio, S., Hardt, M., Recht, B., and Vinyals, O. (2016). Understanding deep learning requires rethinking generalization. arXiv preprint arXiv:1611.03530.

Zuber, B., Stark, L., and Cook, G. (1965). Microsaccades and the velocity-amplitude relationship for saccadic eye movements. Science, 150(3702):1459–1460.
