V1 as an Egocentric Cognitive Map

Paul Linton
Centre for Applied Vision Research
City, University of London
Northampton Square
London EC1V 0HB, UK
[email protected]

Abstract

We typically distinguish between V1 as an egocentric perceptual map and the hippocampus as an allocentric cognitive map. In this article we argue that V1 also functions as a post-perceptual egocentric cognitive map. We argue that three well-documented functions of V1, namely (a) the estimation of distance, (b) the estimation of size, and (c) multisensory integration, are better understood as post-perceptual cognitive inferences. This argument has two important implications. First, we argue that V1 must function as the neural correlate of the visual perception / cognition distinction, and suggest how this can be accommodated by V1's laminar structure. Second, we use this insight to propose a low-level account of visual consciousness, in contrast to mid-level accounts (recurrent processing theory; integrated information theory) and higher-level accounts (higher-order thought; global workspace theory). Detection thresholds have traditionally been used to rule out such an approach, but we explain why it is a mistake to equate visibility (and therefore the presence / absence of visual experience) with detection thresholds.

Introduction

Since The Hippocampus as a Cognitive Map (O'Keefe & Nadel, 1978), we typically distinguish between V1 as an egocentric perceptual map and the hippocampus as an allocentric cognitive map. In this article we explain why V1 also functions as an egocentric cognitive map.

To the extent that cognitive processing has been discussed in V1, it has focused on (a) the allocation of attention, (b) top-down influences on perception, and (c) the transition from egocentric perception to allocentric navigation. By contrast, in this article we argue that three well-documented functions of V1, namely (a) the estimation of distance, (b) the estimation of size, and (c) multisensory integration, are better understood as post-perceptual cognitive inferences.

In Section 1, we outline how the field's understanding of V1 has evolved from a retinal map, to a depth map, to a distance map, to a size map, to a multisensory map.

In Section 2, we explain why our recent experimental work suggests that the last three developments, namely the perception of distance, the perception of size, and multisensory integration, may be better thought of as purely cognitive in nature, and therefore reflect V1 functioning as a cognitive as well as a perceptual map.

In Section 3, we consider how V1 could fulfil this dual function through the potential distribution of perceptual and cognitive functions to different layers of V1.

In Section 4, we explain why this commits us to a low-level account of visual consciousness, in contrast to mid-level accounts (recurrent processing theory; integrated information theory) and higher-level accounts (higher-order thought; global workspace theory).

1. V1 as a Spatial Map

In this first section we outline how the field's understanding of V1 as a spatial map has evolved from a retinal map in the early 20th century, to a depth map in the 1960s, to a distance map in the 1990s, to a size map in the early 2000s, to a multisensory map in the last couple of decades.

a. V1 as a Retinal Map

V1 provides the initial, and largest, map of the visual field in the cortex (Brewer et al., 2002; Dougherty et al., 2003). Adams & Horton (2003) illustrate just how literally V1 is a retinal map (or a "metabolic imprint of the retina", as they put it) by mapping in V1 the shadows cast by retinal blood vessels. The earliest evidence that V1 was retinotopically organised came from Panizza (1855) (Zago et al., 2000; Colombo et al., 2002), Munk (1881), and Henschen (1890). Building upon pre-existing associations between points on the retina and angular directions in the visual field (Kepler, 1604; Descartes, 1637; 'local signs': Lotze, 1846; Hering, 1879; 19th century optometry: Atchison, 1979; Johnson et al., 2011), Tatsuji Inouye, and then Gordon Holmes, plotted V1's retinotopic map (Inouye: Inouye, 1909; see Glickstein & Whitteridge, 1987; Glickstein & Fahle, 2000; Leff, 2004; Jokl & Hiyama, 2007; Leff, 2015; Lindegger & Pless, 2019; Holmes: Lister & Holmes, 1916; Holmes, 1918; Holmes, 1919a; see also Riddoch, 1917), a map that was only accurately charted in the last 30 years (Horton & Hoyt, 1991; see Wandell et al., 2009; Carandini, 2012).

b. V1 as a Depth Map?

Before 1960, it was generally believed that "binocular depth perception was based on high-level quasi-cognitive events" (Bishop & Pettigrew, 1986; see also Julesz, 1960; Julesz, 1986), and was therefore processed in higher visual areas.
However, disparity selective neurons in V1 were predicted by Pettigrew (1965) and found by Barlow et al. (1967) and Nikara et al. (1968). Although Barlow et al. (1967) suggested that this depth information would still have to be "sorted out by higher order visual neurones", they suggested that these disparity selective neurons provided the "neural basis of stereopsis". For a time it seemed certain that V1 was the site of human stereoscopic depth processing, with Bishop & Pettigrew (1986) distinguishing between 'Stereopsis before 1960: Mentalism Prevails' and 'Retreat from Mentalism begins in the 1960's'.

However, contemporary reviews of the neural processing of depth perception strike a very different tone (Cumming & DeAngelis, 2001; Parker, 2007; Welchman, 2011; Welchman & Kourtzi, 2013; Parker, 2014; Welchman, 2016; Parker, 2016; Parker et al., 2016; Bridge, 2016). According to this literature, the responses of binocular neurons in V1 "do not reflect perceived depth, but rather compute a local correlation between two images" (Bridge, 2016). So it is now widely accepted that depth perception is 'extra-striate' (post-V1) (Cumming & DeAngelis, 2001).
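To make the contrast concrete, here is a minimal sketch (ours, purely illustrative, not taken from any of the cited studies; the 1D 'images', window size, and disparity range are arbitrary) of the kind of local image correlation binocular V1 neurons are now thought to compute, as opposed to an estimate of perceived depth:

```python
import numpy as np

def local_disparity_correlation(left, right, x, size, max_disp):
    """Score how well a left-eye patch matches horizontally shifted
    right-eye patches: a local image correlation, not perceived depth."""
    patch_l = left[x:x + size]
    pl = (patch_l - patch_l.mean()) / (patch_l.std() + 1e-9)
    scores = {}
    for d in range(-max_disp, max_disp + 1):
        patch_r = right[x + d:x + d + size]
        pr = (patch_r - patch_r.mean()) / (patch_r.std() + 1e-9)
        scores[d] = float(np.mean(pl * pr))  # normalised cross-correlation
    return scores

# A 1D 'scene' viewed with a 3-pixel horizontal shift between the eyes
rng = np.random.default_rng(0)
left_img = rng.standard_normal(200)
right_img = np.roll(left_img, 3)        # right-eye image shifted by 3 pixels

scores = local_disparity_correlation(left_img, right_img, x=80, size=21, max_disp=10)
print(max(scores, key=scores.get))      # correlation peaks at the true disparity, d = 3
```

Note that the peak of this match score still has to be "sorted out" downstream; nothing in the computation itself is a depth percept.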

c. V1 as a Distance Map

Although V1 is no longer thought of as a depth map, it is still thought to encode viewing distance. As Cumming & DeAngelis (2001) observe: "Most of the complex questions have not been addressed at the neurophysiological level. One exception is the effect of viewing distance." Cumming & DeAngelis (2001) refer to Trotter et al. (1992) and Gonzalez & Perez (1998), but the principle dates back to Kepler (1604) and Descartes (1637). The closer an object is, the more the two eyes have to rotate to fixate upon it, and so this degree of eye rotation (or 'vergence') provides the visual system with the viewing distance. Trotter et al. (1992) found that neurons in V1 were responsive to changes in vergence, and this was confirmed in Trotter et al. (1993) and Trotter et al. (1996), where they used prisms to change the vergence angle whilst keeping the retinal image and accommodation constant.

Trotter & Celebrini (1999) complete the transition to thinking of V1 in spatiotopic rather than retinotopic terms by showing that half the cells in the monkey V1 are responsive to gaze direction (version eye movements). Importantly, gaze direction (version), when coupled with fixation distance (vergence), marks out a specific location in 3D space, implying that V1 cells "are dedicated to certain volumes of visual space" (Trotter & Celebrini, 1999).

d. V1 as a Size Map

Since the early 2000s, it has been suggested that the spatial map in V1 reflects the perceived angular size of objects rather than their retinal size, recasting V1 as a size map rather than a retinal map. To explain this distinction, consider the following illusion: https://osf.io/8uac3/ (Anonymous, 2013; Reddit, 2013). Since this is the same car pasted three times into the picture, each car has the same retinal image size. And yet the cars appear to be different sizes: the far car appears to take up more of the picture than the near car. This difference in perceived angular size is quantified at around 20% (Murray et al., 2006; Fang et al., 2008). Importantly, Murray et al. (2006) found that the activation in V1 reflected the perceived angular size of the stimulus, rather than its retinal size. It varied by 20% when participants looked from a near to a far pictorial object, reflecting almost perfectly the change in perceived angular size.
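Since the distinction between physical size, angular size, and retinal size does a lot of work in what follows, a short worked example of the underlying geometry may help (ours, purely illustrative; the 1.8 m object size and the viewing distances are arbitrary):

```python
import math

def visual_angle_deg(physical_size_m, distance_m):
    """Visual angle subtended by an object of a given physical size
    at a given viewing distance (the determinant of retinal size)."""
    return 2 * math.degrees(math.atan(physical_size_m / (2 * distance_m)))

# The same 1.8 m car at 10 m and 40 m subtends very different angles...
near = visual_angle_deg(1.8, 10)
far = visual_angle_deg(1.8, 40)
print(f"near: {near:.2f} deg, far: {far:.2f} deg")

# ...so three cars with the SAME angular size (as in the picture at
# https://osf.io/8uac3/) are geometrically consistent with very
# different physical sizes at different pictorial distances.
for d in (10, 20, 40):
    implied_size = 2 * d * math.tan(math.radians(near / 2))
    print(f"at {d} m, an object subtending {near:.2f} deg implies a size of {implied_size:.1f} m")
```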
But how is it possible for V1 to reorganise itself in this way? After all, V1 is supposed to be a retinal map. MacEvoy & Fitzpatrick (2006) hypothesised, and Ni et al. (2014) and He et al. (2015) found, that V1 receptive fields dynamically reorganise themselves according to the content of the picture (most likely through feedback from LOC; Zeng et al., 2020). This represents "a dynamic redrawing of the map of visual space" (MacEvoy & Fitzpatrick, 2006), where specific objects in the picture are seen as magnified or minified (Ni et al., 2014).

Another size constancy effect is 'neurological zoom'. This is the suggestion that V1 magnifies the whole visual field with far fixation, and minifies it with near fixation. Marg & Adams (1970) argue that this is responsible for (a) the improvement of visual acuity at far fixation (Aubert & Foerster, 1857; Schindel & Arnold, 2010), and (b) changes in the apparent size of after-images with fixation distance (Emmert's Law: Emmert, 1881).

Human single cell recordings find evidence of receptive field changes with near fixation in LGN (Richards, 1967; 1968; cf. Cavonius & Hilz, 1973) and V1 (Marg & Adams, 1970), although both sets of data found reduced magnification towards the periphery. More recently, human fMRI appears to tell a similar story. Sperandio et al. (2012) find that viewing distance modulates V1 activity when observers view an after-image (with Juan Chen et al., 2019 finding that the time course suggests recurrent processing of the vergence signal in V1 and/or feedback from higher visual areas), and Cottereau et al. (2014) appear to confirm Marg & Adams (1970)'s finding that vergence modulates receptive fields in V1.

Sperandio et al. (2012) and Juan Chen et al. (2019) suggest that whilst pictorial size constancy is limited to 10-45% (the recent literature suggests around 20%: Murray et al., 2006; Fang et al., 2008), the scaling of after-images is near perfect because of the additional presence of triangulation cues (vergence, accommodation, binocular disparity, motion parallax). However, Jing Chen et al. (2019) were surprised to find that although their subjects experienced close to perfect size constancy in real world viewing conditions, recorded neural activity could only account for half of this effect. The problem is that Sperandio et al. (2012), Juan Chen et al. (2019), and Jing Chen et al. (2019) all overlook the distinction between perceived physical size and perceived angular size. It's true that participants may judge the perceived physical size of objects almost perfectly in real world viewing conditions. But that doesn't mean that the perceived angular size of an object never reduces with distance. So there are two effects at play. First, our ability to judge constant physical size despite a perceived reduction in the angular size of an object with distance. To me, this is clearly a cognitive effect, since it makes up for whatever shortcomings there are in our perception. Second, the extent to which there are shortcomings in our perception: does the perceived angular size simply reflect the retinal image size, or does it also incorporate perceptual effects ('size constancy')? Clearly something is going on in the illusion with the cars, but is this a perceptual or a cognitive effect?

e. V1 as a Multisensory Map

In the last couple of decades there has been increasing evidence of V1 as a site of multisensory integration, with Murray et al. (2016) observing that "convergence and integration of information from different senses within low-level cortices is a rule rather than an exception (e.g. Ghazanfar & Schroeder, 2006; van Atteveldt et al., 2014; De Meo et al., 2015; ten Oever et al., 2016)". We find evidence of multisensory integration in V1 in three specific contexts:

i. Vision and Sound

The quote from Murray et al. (2016) comes from the audio-visual integration literature, where there is evidence of bottom-up signals travelling from the auditory cortex to V1 (see Falchier et al., 2002 and Rockland & Ojima, 2003 on monkeys, and Beer et al., 2011 on humans).

In human fMRI, Calvert et al. (2001) find increased activation in V1 when a visual stimulus is accompanied by a synchronous sound, whilst Martuzzi et al. (2007) find that sound alone can activate V1. Watkins et al. (2006; 2007) studied the sound-induced flash illusion, and find increased activation in V1 when one visual flash and two auditory beeps are experienced as two visual flashes. Studies also demonstrate that sound can modulate the orientation preference of neurons in layers 2/3 (Ibrahim et al., 2016; Chanauria et al., 2019; McClure & Polack, 2019) and 5/6 (Chanauria et al., 2019) of V1, but not layer 4 (Ibrahim et al., 2016; Chanauria et al., 2019), emphasising that we shouldn't think of V1 as a single map, but should instead consider the functional role of each layer (a point developed below).

In addition to these bottom-up effects, there is also evidence of top-down contextual effects (Petro et al., 2017). Vetter and colleagues show that high-level semantic categories ('bird singing', 'people talking', 'traffic') can be decoded from V1 in blindfolded (Vetter et al., 2014) and congenitally blind (Vetter et al., 2020) participants. By contrast, although Masuda et al. (2021) also find task-dependent signals in V1 in the blind visual field of participants with macular degeneration, these signals are not detectable in control participants, leading them to question the extent to which these top-down signals have an impact on normal visual processing in V1. Masuda et al. (2021) conclude that "top-down, modality independent signals are ordinarily suppressed by feed-forward signals and become profound in the absence of retinal inputs."

ii. Vision and Touch

Masuda et al. (2021) come to the same conclusion for the integration of vision and touch. Whilst they find touch signals present in the blind visual field of participants with macular degeneration during active tasks, these signals are not present when touch is passively applied, nor are they found in normally sighted controls, suggesting that they merely reflect "task-related feedback signals rather than reorganized feedforward visual inputs." Similarly, Couto et al. (2019) found that neurons in the mouse V1 failed to respond to tactile stimuli when the mouse's whiskers were passively stroked, suggesting that the impact of other modalities on V1 is restricted to modality-independent top-down effects.

Sadato et al. (1996) did find activation of V1 during braille reading by blind participants. However, this doesn't appear to reflect typical multisensory integration so much as the recruitment of V1 for touch after a period of visual deprivation (Merabet et al., 2008; Bedny, 2017). On the other hand, Merabet et al. (2004) did find a small effect of TMS to V1 on sighted participants' ability to discriminate the spacing of a row of dots felt by touch, potentially reflecting a compromised ability to assign spatial attention (Macaluso et al., 2000).

iii. Vision and Locomotion

The perception of self-motion relies on integrating retinal stimulation (optic flow) with signals about bodily motion, so this is a prime candidate for multisensory integration in V1.

We can distinguish between passive self-motion and active self-motion. For passive self-motion, Roberts et al. (2017) find increased activation in V1 in human participants when visual and vestibular cues are congruent, but no effect when they are incongruent. Roberts et al. (2018) also find that vestibular neuritis patients (who typically suffer from acute vertigo due to impaired vestibular function) have reduced V1 activation in congruent conditions.

However, much of the recent attention has been on active self-motion, specifically in mice (for reviews see Pakan et al., 2018; Busse et al., 2017; Wang et al., 2020; Saleem, 2020). Niell & Stryker (2010) found increased firing in V1 with motion, and Keller et al. (2012) and Saleem et al. (2013) found that running in darkness activates V1. Ayaz et al. (2013) also found that motion reduces surround suppression, enabling V1 neurons to integrate over larger regions of visual space.

In terms of multisensory integration, Saleem et al. (2013) found that most of the responsive V1 neurons encoded a weighted sum of optic flow speed and running speed. Furthermore, Keller et al. (2012) found that layer 2/3 of V1 is strongly modulated by a mismatch between actual and expected optic flow, leading them to suggest that V1 is engaged in predictive coding (Rao & Ballard, 1999; Koch & Poggio, 1999; Friston, 2018), with Zmarz & Keller (2016) finding that this occurs locally at the level of the receptive fields of individual neurons (for further discussion of predictive coding of optic flow in V1 see Leinweber et al., 2017; Fridman & Petreanu, 2017; Keller & Mrsic-Flogel, 2018; Jordan & Keller, 2020; Padamsey & Rochefort, 2020).
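As a rough illustration of the two response types just described (a sketch of the general idea, not the analysis code of Saleem et al., 2013 or Keller et al., 2012; the flow gain, cue weights, and 'playback halt' window are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
time_steps = 1000
run_speed = np.abs(rng.standard_normal(time_steps))  # locomotion signal (a.u.)
gain = 0.8                                           # expected flow per unit run speed
optic_flow = gain * run_speed
optic_flow = optic_flow.copy()
optic_flow[500:600] = 0.0   # visual flow halted while the animal keeps running

# Saleem et al. (2013)-style response: a weighted sum of visual and motor speeds
w_visual, w_motor = 0.6, 0.4
weighted_sum_response = w_visual * optic_flow + w_motor * run_speed

# Keller et al. (2012)-style mismatch response: expected minus actual flow,
# rectified so the unit fires when flow is less than the running speed predicts
expected_flow = gain * run_speed
mismatch_response = np.maximum(expected_flow - optic_flow, 0.0)

print(mismatch_response[:500].mean())    # ~0 while the prediction holds
print(mismatch_response[500:600].mean()) # large during the mismatch window
```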

Increasingly, V1 has been found to encode the location of objects in the visual scene and their behavioural relevance (see Pakan et al., 2018; Saleem, 2020; Flossmann & Rochefort, 2021). Ji & Wilson (2007) found that during sleep V1 replayed the same experience (presumably as dreams) as the hippocampus. Poort, Khan, et al. (2015) found that perceptual learning led to an increase in the response of layer 2/3 to a visual stimulus, and Fiser, Mahringer, Oyibo, et al. (2016) found different responses to the same visual stimulus depending on where it was encountered in the environment, with both studies also finding neurons that anticipated the stimulus. There is increasing evidence that these responses are due to top-down feedback from the anterior cingulate cortex (ACC), and likely also the retrosplenial cortex (Fiser et al., 2016). Pakan, Currie, Fischer, et al. (2018) find that the layer 2/3 neurons that encode spatial position also respond to self-motion defined estimates of distance when visual cues are absent, and give their results a predictive coding explanation (Pakan et al., 2018).

Other studies that report subjective position in V1 include Saleem, Diamanti, et al. (2018) and Diamanti et al. (2020) (in layer 2/3) and Fournier, Saleem, Diamanti, et al. (2020) (in layers 4/6), but these have a slightly different focus. Pioneering parallel recordings in V1 and the hippocampus, they ask how we get from egocentric (vision, V1) to allocentric (navigation, hippocampus) models of the environment. Because Fournier, Saleem, Diamanti, et al. (2020) find that the effect of subjective estimates of position is reduced around visual landmarks, they suggest that V1 plays an early role in navigation by recalibrating estimates of the distance travelled when a salient visual landmark is passed.

2. V1 as an Egocentric Cognitive Map

The last section on multisensory integration demonstrated increasing evidence of non-visual top-down processing in V1. What is this processing for? We outline the three explanations in the literature, before advancing our own proposal that, in addition to functioning as a retinal map, V1 acts as a post-perceptual egocentric cognitive map.

1. Anchoring the Hippocampus' Cognitive Map: We classically draw a distinction between V1 as an egocentric visual map and the hippocampus as an allocentric cognitive map (Tolman, 1948; O'Keefe & Nadel, 1978; O'Keefe & Nadel, 1979). As we just saw, one explanation for non-visual processing in V1 is that V1 plays an early role in calibrating the allocentric cognitive map (navigation, hippocampus) to our current egocentric visual experience (perception, V1) (see also Nau et al., 2018). This is one facet of a very active research question that seeks to explain how we 'anchor the cognitive map to the visual world' (Julian, 2017), with others also pointing to egocentric, as well as allocentric, frames of reference in the hippocampus as another potential mechanism (see Wang et al., 2020 for a review). The key thing to recognise is that this research agenda maintains the classic distinction between egocentric (vision, V1) and allocentric (cognitive map, hippocampus) models of the environment. The underlying assumption of this line of research is that spatial perception is egocentric and spatial cognition is allocentric; and the question is how we translate between them.

2. Cognitive Penetration: There are two paradigms in the multisensory literature (Couto et al., 2019). The first treats multisensory integration in V1 as bottom-up, and horizontal rather than hierarchical (Watkins et al., 2006). However, given the prevalence of top-down multisensory signals reviewed in the last section, this leads to the suggestion that bottom-up processes in V1 are subject to top-down 'cognitive penetration' (Vetter & Newen, 2014; Newen & Vetter, 2017). What this means is that these top-down signals are influencing our visual experience. Take the findings by Poort, Khan, et al. (2015) and Fiser, Mahringer, Oyibo, et al. (2016) that expectations lead to an increased response in V1. Poort, Khan, et al. (2015) suggest that this provides the observer with "enhanced and more distinct representations of task-relevant stimuli", and Fiser, Mahringer, Oyibo, et al. (2016) suggest that it works to "enhance the discriminability of similar stimuli in different contexts."

3. Predictive Processing: The second, more radical, but increasingly orthodox paradigm in the multisensory literature treats perception as one large predictive process (Rao & Ballard, 1999; Keller & Mrsic-Flogel, 2018; Petro et al., 2017; Newen & Vetter, 2017 also frame their work in these terms). On this account perception is hierarchical, and the perception / cognition distinction is effectively eradicated. This is consistent with predictive (Bayesian) models of visual perception (Knill & Richards, 1996; Trommershauser et al., 2011), according to which (at least on one popular interpretation) we see what we predict.

4. Post-Perceptual Cognitive Processing: Both 'cognitive penetration' and 'predictive processing' are committed to the idea that top-down signals to V1 influence our visual experience. With the exception of the suggestion that V1 provides the early stages of anchoring the hippocampus' cognitive map, it is a truism of the literature that V1 processing equates with visual processing. By contrast, we argue that V1 is the site of egocentric spatial cognition, as well as egocentric spatial perception.

The account of visual perception I develop in Linton (2017) (reviewed by Erkelens, 2018; and elaborated in Linton, 2018a; the Brains Blog: 2018b; 2018c; 2018d; 2018e; 2018f; and two experimental papers: Linton, 2020a; 2021) argues that many of the supposedly basic aspects of visual perception are better understood as post-perceptual cognitive influences. These include (a) visual scale (the size and distance of objects; see Linton, 2017, pp.134-136 and Linton, 2018a), (b) visual shape from multiple depth cues (depth cue integration; see Linton, 2017, chapter 2), (c) pictorial cues and pictorial perception (Linton, 2017, chapter 3), and (d) multisensory integration (Linton, 2017, pp.65-66). However, as Erkelens (2018) notes, a "shortcoming of the book is that … it does not discuss the new theory in relation to the neurophysiology of the visual system." The purpose of this article is to rectify this shortcoming.

Given our survey of V1 in Section 1, the implications of my theory for the neurophysiology of the visual system become apparent. On the one hand, as we saw in Section 1, both visual scale (the size and distance of objects) and multisensory integration are now regarded as paradigmatic V1 processes. On the other hand, under my account, both visual scale and multisensory integration are automatic post-perceptual cognitive inferences. This leads us to the conclusion that, in addition to functioning as an egocentric perceptual map, V1 must also be functioning as a post-perceptual egocentric cognitive map.

In the rest of this section I will focus on substantiating this claim. But before I do, I need to explain what I mean by 'automatic post-perceptual cognitive inferences'.

a. Perception / Cognition Distinction

We mark the 'perception' / 'cognition' boundary as the point at which mental processes become about our perceptual experience, rather than contributing to our perceptual experience. We define 'perceptual experience' as experience in any sensory modality; for instance, visual experience (what we see), auditory experience (what we hear), and tactile experience (what we feel through touch).
And the focus of Linton (2017) is to try to understand which visual cues contribute to our visual experience, as opposed to merely influencing our cognition or understanding of our experience.

To make this distinction clearer, we appeal to the example of reading. We argue that two people have the same visual experience of words on a page even if one can read and the other cannot, even though they have very different cognitive experiences of them. In this way, being able to read changes our experience of a visual stimulus, without changing our visual experience of the stimulus (understanding or attributing meaning often has an experiential quality; see Strawson, 1994; Bayne & Montague, 2011 on 'cognitive phenomenology'). We believe that reading captures a common intuition about the 'perception' / 'cognition' distinction. Defining reading as 'perceptual', as opposed to 'cognitive', would appear to define 'perception' too broadly. Indeed, one wonders what, if any, automatic cognitive process that relies on visual input (driving, navigation, etc.) would be excluded by such a broad definition of 'perception'.

We apply this thought about reading to the interpretation of signs more broadly, which covers associations as diverse as understanding that there is fire by seeing smoke, understanding when to stop and when to go at a traffic light, understanding hand gestures, and understanding symbols like the McDonald's logo (Atkin, 2013). This attribution of meaning through association is, we argue, paradigmatic of post-perceptual cognitive processes. So, under our definition, work in computer vision based on object recognition in images (Tarr & Bulthoff, 1998), as well as the ventral stream processing that instantiates object recognition in the human cortex (Goodale & Milner, 1992), should be thought of as cognitive rather than perceptual.

In Linton (2017), we go further still, and argue that many apparently basic aspects of visual perception (size and distance; depth cue integration; multisensory integration; pictorial cues and pictorial perception) are better thought of as automatic post-perceptual cognitive inferences about our visual experience, rather than as contributions to our visual experience. Ultimately, this is an empirical question – we cannot rely on our introspection to tell us whether a process contributes to our visual experience or not – and so the focus of this section is on presenting our empirical data suggesting that a number of well-studied V1 processes are purely cognitive in nature.

This is a broad definition of 'cognition'. Cognition can be unconscious (we are unaware of it at work), automatic (we do not have to do anything), and involuntary (we often cannot overrule it). Most cognitive scientists accept a broad definition of 'cognition' (see Heyes in Bayne et al., 2019; 'System 1' vs. 'System 2' in Kahneman, 2011). And this was how 'cognition' was conceived during the 'cognitive revolution' of the 1960s, a term Neisser (1967) coined to "do justice … to the continuously creative process by which the world of experience is constructed."

"Visual cognition … deals with the processes by which a perceived, remembered, and thought-about world is brought into being from as unpromising a beginning as the retinal patterns." Neisser (1967)

However, unlike most cognitive scientists, I do not believe these 'mid-level' processes contribute to our visual experience. In Linton (2017), I therefore draw a sharp distinction between my own approach and contemporary approaches to 'visual cognition' (Cavanagh, 2011). Again, this is not necessarily counter-intuitive. The more 'mid-level' processes we add to vision, the more we open the door to the suggestion that (in Cavanagh, 2011's words) "our visual systems, on their own, rank with the most advanced species in cognitive evolution":

"...the unconscious inferences of the visual system may include models of goals of others as well as some version of the rules of physics. If a 'Theory of Mind' could be shown to be independently resident in the visual system, it would be a sign that our visual systems, on their own, rank with the most advanced species in cognitive evolution."

And, by equating 'vision' with any automatic process that relies on visual input, we go far beyond visual experience.
For instance, in Glennerster (2016) "the visual system" is now responsible for performing our everyday tasks, such as getting a mug from another room:

"The visual system must somehow know that the mug is behind one door rather than another, even if it does not store the three-dimensional location of the mug."

This is the problem we lead ourselves into by identifying every visually informed unconscious inference or automatic process as 'visual perception'.

In contrast to my negative ('not visual experience') definition of cognition (Linton, 2017; see also Brainard in Bayne et al., 2019), most philosophers adopt a positive and high-level definition of 'cognition' (e.g. Bayne in Bayne et al., 2019; see Heyes in Bayne et al., 2019). This is typically what people mean when they ask whether 'cognition' penetrates 'perception' (Pylyshyn, 1999; Firestone & Scholl, 2016). But this definition of 'cognition' is liable to distort discussions. For instance, Knotts et al. (2020) describe automatic inferences about perception as 'intra-perceptual', rather than 'post-perceptual' or 'cognitive', because automatic inferences are immune from 'cognitive penetration'. But this new category ('intra-perceptual') risks conflating our biased post-perceptual judgements about our perceptual experience (when "subjective evaluation of [our visual experience] is systematically biased, or inflated") with perceptual experience itself, both of which may be immune to 'cognitive penetration'.

The essential question of the 'perception' / 'cognition' distinction is whether a process contributes to our visual experience or not (Linton, 2017; Brainard in Bayne et al., 2019; a point Abid, 2019 makes well in response to Odegaard et al., 2018). Knotts et al. (2020) appear to suggest that the answer to this question in the context of 'intra-perception' is 'yes', in which case it's hard to see why we need a new category in addition to 'perception'.

Knotts et al. (2020) give three examples of 'intra-perceptual' biases: 1. the Müller-Lyer illusion, 2. multisensory integration, and 3. amodal completion. But all three examples seem to fit quite neatly within the existing 'perception' / 'cognition' distinction. Most vision scientists would agree with Knotts et al. (2020) that both the Müller-Lyer illusion and multisensory integration affect our visual experience, and this is why most scientists would refer to them as 'perceptual'. By contrast, I argue that the Müller-Lyer illusion and multisensory integration do not affect our visual experience, and this is why I refer to them as merely 'cognitive' (Linton, 2017, pp.60-66; 2018c). Amodal completion is an interesting case. I agree with Knotts et al. (2020) that many would regard amodal completion as 'perceptual'. However, I disagree with Knotts et al. (2020) that the borders of the occluded object are "still visually represented" in amodal completion. As Knotts et al. (2020) themselves admit, amodal completion is merely "a feeling". In which case it is merely an automatic cognitive inference about our percept, and fits naturally within 'cognition'. Note also that Bergmann et al. (2019) were unable to decode amodal completion in V1, suggesting that this effect is very much a purely cognitive process.

Having outlined how I draw the distinction between 'perception' and 'cognition', let me now substantiate why I believe that visual scale (size and distance) and multisensory integration fall on the 'cognition' side of the distinction, with an emphasis on V1 processing.

b. V1 as a Cognitive Distance Map

You'll recall that viewing distance is instantiated in V1 using vergence eye movements (Trotter et al., 1992), so that "the visuo-oculomotor transformation process appears to start as early as the primary visual cortex" (Trotter et al., 1993).

But is this a 'perceptual' or 'cognitive' process? Applying the 'perception' / 'cognition' distinction we outlined above, the question is whether eye movements really affect our visual experience. One way to test this is to vary vergence gradually, and see if vergence still has an effect on our visual experience, and therefore on our distance judgements, even when participants are subjectively unaware that their eyes have moved. In Linton (2020a) we tested this by gradually manipulating vergence, and found that participants were unable to use their visual experience to judge distance. Instead, they were effectively guessing (Fig. 1).
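To see why this result is so striking, consider the geometry: for a known interpupillary distance, the vergence angle fully specifies fixation distance. The sketch below (ours, purely illustrative; the 0.064 m interpupillary distance is a typical value, and the 0.16 gain and 37 cm intercept are taken from the regression in Fig. 1) contrasts the distance a veridical triangulator could recover from vergence with the near-flat responses we actually observed:

```python
import math

IPD = 0.064  # interpupillary distance in metres (a typical value)

def vergence_angle_deg(distance_m):
    """Vergence angle demanded by fixating a target at distance_m."""
    return 2 * math.degrees(math.atan(IPD / (2 * distance_m)))

def triangulated_distance_m(vergence_deg):
    """Distance recoverable from vergence alone (veridical, y = x)."""
    return IPD / (2 * math.tan(math.radians(vergence_deg) / 2))

for d_cm in (25, 30, 35, 40, 45):
    theta = vergence_angle_deg(d_cm / 100)
    veridical = 100 * triangulated_distance_m(theta)  # y = x
    observed = 0.16 * d_cm + 37                       # regression from Fig. 1
    print(f"{d_cm} cm target: vergence {theta:.2f} deg, "
          f"veridical {veridical:.1f} cm, observed ~{observed:.1f} cm")
```

The signal is there to be read off; participants simply could not use it once subjective knowledge of their eye movements was removed.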

[Figure 1: two panels plotting pointed distance (cm) against vergence distance (cm), comparing our results (y = 0.16x + 37) with veridical triangulation (y = x) and Mon-Williams & Tresilian (1999).]

Figure 1. Results from Experiment 2 in Linton (2020a). Mon-Williams & Tresilian (1999) represents how accurate vergence is traditionally assumed to be. My set-up was very similar to theirs, but in Mon-Williams & Tresilian (1999) the changes in vergence were sudden rather than gradual, which gave participants subjective knowledge about their changing eye movements. In my experiment, vergence was manipulated gradually by increasing or decreasing the separation between two targets on a screen. Metal plates either side of the observer's head ensured the right eye only saw the left target and the left eye only saw the right target, enabling us to manipulate the vergence distance (where the two lines of sight intersect, indicated by the arrow). Participants then had to point to the distance of the target using a non-visible hand.

These results are complemented by our results in Linton (2020b) on size perception (discussed below). Together they pose a real challenge to the suggestion that eye movements affect our visual experience. This work has two important implications.

First, we are not challenging Trotter et al. (1992)'s finding that vergence is encoded in V1. However, since vergence is encoded in V1, and since vergence does not appear to affect our visual experience, we conclude that V1 must be cognitively (and not perceptually) encoding egocentric distance from vergence. V1 might be encoding another distance cue that does affect our visual experience. However, the point developed in Linton (2018a) and Linton (2020a) is that all of our other absolute distance cues are either similarly ineffective (absolute motion parallax, vertical disparities) or merely cognitive (familiar size) in nature. In any case, this isn't essential to the point, since Trotter et al. (1992) specifically test absolute distance signals in V1 in reduced cue conditions in order to isolate the vergence signal from other absolute distance cues.

Second, if vergence (the eyes moving in opposite directions) doesn't affect our visual experience, we have to question whether version (the eyes moving in the same direction) does. Version eye movements are thought to be essential for visual direction (in order to convert retinotopic coordinates into spatiotopic coordinates). So, unless eye movements affect our visual experience, we are forced to conclude that visual direction is also merely cognitive. This will be the focus of future experiments. It also opens up the question of whether retinal stability is merely cognitive. This is because the visual system is making roughly 3 saccades per second (Yarbus, 1967). So unless eye movements affect our visual experience (Keller & Mrsic-Flogel, 2018), the stabilisation of the visual scene over time would have to be merely cognitive as well.

c. V1 as a Cognitive Size Map

What about the size constancy effects reported in V1? Are they 'perceptual' or merely 'cognitive'? You'll recall that Sperandio et al. (2012) draw a distinction between size constancy (a) in the real world (which, according to Sperandio et al., 2012 is close to perfect), and (b) in pictures (which is around 20%; Murray et al., 2006).

i. Real World Size Constancy

Sperandio et al. (2012) suggest that in full cue conditions their fMRI results show that V1 represents the perceived rather than retinal size of the object. Size constancy is close to perfect, they argue, because in addition to pictorial cues, in real world viewing participants have access to additional depth cues including vergence, binocular disparity, and motion parallax. However, the participants in Sperandio et al. (2012) did not have motion parallax (they were in an fMRI scanner), and binocular disparity is thought to primarily rely on vergence to provide absolute distance information (Wallach & Zuckerman, 1963). So it's no surprise that a central question is how reliant we are on vergence for size constancy.

A key site for this discussion is the Taylor illusion (Taylor, 1941). Here participants view an after-image of their hand (which fixes the retinal size of their hand) in complete darkness, and then move their hand forwards or backwards in space. Taylor (1941) found that participants report that the hand appears to shrink when brought close to the face, and expand when moved away. The accepted explanation of the Taylor illusion is that it is entirely due (Taylor, 1941; Morrison & Whiteside, 1984; Mon-Williams et al., 1997), or almost entirely due (Sperandio et al., 2013), to changes in vergence as the eyes track the hand in darkness (see also Gregory et al., 1959; Carey & Allan, 1996; Bross, 2000; Ramsay et al., 2007; Faivre et al., 2017a; and on vergence scaling Urist, 1959; Suzuki, 1986; Lou, 2007; Zenkin & Petrov, 2015).

However, the problem with these experiments is that they rely on manipulating the vergence angle in a way that informs the participant about the change in fixation distance, by asking participants to track their own hand as it moves. These experiments are therefore open to an alternative cognitive explanation, namely that changes in vergence don't affect the perceived angular size of the hand, but merely affect our cognitive understanding of its size given its fixed angular size and our subjective knowledge about our changing fixation distance.

In order to differentiate between these two interpretations, we replicated the vergence changes in Sperandio et al. (2013) (a distance change from 50 to 25cm over 5 seconds), but this time controlling for any cognitive knowledge about the change in distance by having participants track a visual target, rather than an LED or their hand. Tracking their own hand clearly gives participants knowledge about their changing gaze position. So too does an LED moving in depth, because the eyes are very bad at tracking a single point moving in depth, leading to retinal slip (motion of the LED on the retina). By contrast, we used a 3° target to ensure that any retinal slip from the target moving in depth was undetectable. We found no evidence that vergence affects perceived size once subjective knowledge about the changing gaze position has been controlled for (Linton, 2020c; 2021). We therefore conclude that the vergence size constancy effects reported in the literature are merely cognitive rather than perceptual.

ii. Size Constancy in Pictures

What about the pictorial size constancy in Murray et al. (2006)? There's no denying that the far car seems to take up more of the picture than the near car in the following example: https://osf.io/8uac3/ (Anonymous, 2013; Reddit, 2013). But a key point of Linton (2017) is that we can't rely on introspection to tell us if an effect is perceptual or merely cognitive. Instead, a useful strategy is what we call 'scaffolding': introducing elements into the visual scene that help us to judge what we are seeing. In Linton (2018c) we apply this logic to pictorial size constancy, as well as the Müller-Lyer illusion. The claim in the literature, as Schwarzkopf (2015) rightly points out, is that pictorial size constancy works by "distorting the entire map of visual space through shifting the receptive fields as suggested by electrophysiological recordings." But if this is right, and pictorial size constancy really does distort visual space, by enlarging some regions and shrinking others, then it should also distort any additional pictorial elements introduced into the picture. However, consider the following example: https://osf.io/u2jd3/ (Linton, 2018c). According to the logic of the literature, squares that bound the exact same area in visual space as the cars should be just as distorted in size as the cars. But this isn't what we experience. Instead, the size difference, if any, between the squares is vastly reduced. So what we took to be a perceived change in size between the cars is actually entirely (our position), or almost entirely (if you insist there is a small difference in size between the squares), a cognitive effect.

d. V1 as a Cognitive Multisensory Map

This way of thinking about the 'perception' / 'cognition' distinction also leads us to reinterpret multisensory integration as cognitive rather than perceptual. The central question is whether multisensory integration actually affects our visual experience, rather than merely our interpretation of it. And we appeal to four specific examples to argue that it does not:

i. Mandatory Fusion

In Linton (2017), we argue that different kinds of visual depth cues (pictorial cues and binocular disparity) are not perceptually integrated, but merely cognitively integrated. This is because the experiments that test visual cue integration (Hillis et al., 2002) test the stimuli in highly artificial conditions conducive to visual cues being judged to be perceptually fused when they are not. And, if merely cognitive integration explains the integration of visual cues, then we would also expect it to explain less immediate effects like multisensory integration.

But you don't need to accept my analysis of visual cue integration to come to this conclusion. If two cues are perceptually integrated, you would expect a change in one of them in one direction to cancel out a change in the other in the other direction. This is known as 'mandatory fusion' (see the sketch below). But Hillis et al. (2002) find that, unlike visual cues, multisensory integration (between vision and touch) is not subject to 'mandatory fusion'. Similarly, failures of 'mandatory fusion' have been found between vision and sound (Caziot & Mamassian, 2017) and between vision and vestibular cues (Ash et al., 2011). This raises the following dilemma (Linton, 2018b). Where, perceptually, are these multisensory percepts experienced? Not in vision, which reports the visual estimate. And not in touch, which reports the touch estimate. So there doesn't appear to be a sensory experience where we actually experience the multisensory integration. It doesn't change our visual experience, and it doesn't change our tactile experience. So where is it? (Contrast this with the suggestions in the next subsection that visual experience "relies on multimodal signals" and that "visual consciousness is shaped by the body".) Instead, we conclude (Linton, 2017, p.66) that "the absence of mandatory fusion in … cross-modal perception appears to suggest that when information is integrated in these contexts it is by virtue of an integrated judgement rather than an integrated percept."
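As an illustration of this logic, here is a minimal sketch of standard maximum-likelihood cue combination (the model assumed by Hillis et al., 2002 and Trommershauser et al., 2011; the reliability values are arbitrary, and the code is ours, purely illustrative). Perceptually fused cues are weighted by their reliabilities, so opposite perturbations scaled by those weights should cancel, leaving the fused percept unchanged and the conflict undetectable:

```python
def fused_estimate(cue_a, var_a, cue_b, var_b):
    """Maximum-likelihood combination: weights inversely
    proportional to each cue's variance (noise)."""
    w_a = (1 / var_a) / (1 / var_a + 1 / var_b)
    w_b = 1 - w_a
    return w_a * cue_a + w_b * cue_b

var_vision, var_touch = 1.0, 2.0   # arbitrary cue reliabilities
base = fused_estimate(10.0, var_vision, 10.0, var_touch)

# Perturb the cues in opposite directions, scaled so the weighted
# changes cancel: under mandatory fusion the percept is unchanged,
# and an observer with access ONLY to the fused estimate cannot
# detect the conflict.
delta = 1.0
perturbed = fused_estimate(10.0 + delta * var_vision, var_vision,
                           10.0 - delta * var_touch, var_touch)
print(base, perturbed)  # identical under mandatory fusion
```

The empirical finding is that observers do detect such conflicts across modalities, which is what the failures of mandatory fusion cited above amount to.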
ii. Taylor Illusion

Our approach to vergence eye movements provides a model for multisensory integration more generally. Sperandio et al. (2013) find that hand movements, and not just the change in vergence, directly affect the perceived size of the after-image of the hand in the Taylor illusion. They demonstrate this by showing that the strength of the illusion is marginally reduced when vergence and the hand move in opposite directions. Consequently, the Taylor illusion is thought to provide evidence for the suggestion that vision "relies on multimodal signals" (Sperandio et al., 2013; Chen et al., 2018) and that "visual consciousness is shaped by the body" (Faivre et al., 2015; Faivre et al., 2017a; Faivre et al., 2017b). But if, as we argue in Linton (2020c; 2021), the integration of the retinal image with the proprioceptive sensation of our own vergence eye movements is purely cognitive (and this is the major driver of the Taylor illusion), then it becomes very likely that the integration of the retinal image with the proprioceptive sensation of our own hand movements is also purely cognitive, given that this is only a minor driver of the Taylor illusion. This cognitive interpretation of the influence of hand movements on visual perception also suggests a cognitive interpretation of variants of the Taylor illusion that show multisensory integration with the rubber-hand illusion (Faivre et al., 2017a) and tool use (Grove et al., 2019).

iii. Rubber Hand Illusion

The rubber-hand illusion (Botvinick & Cohen, 1998) poses another question: how is it that visual and tactile sensations can be spatially decoupled in this way? The traditional explanation is that the signal from each sense is corrupted by noise (Trommershauser et al., 2011), so each signal is liable to give different spatial estimates. But the degree of decoupling involved in the rubber-hand illusion far exceeds what one would expect from two identical, but noisy, estimates of the same spatial location. Instead, in Linton (2018a) we argue that vision and touch can come apart in this way because they don't share a common metric. Put simply, they aren't giving conflicting spatial information because they don't operate in the same space. Without eye movements influencing visual perception, vision is retinotopic not spatiotopic (we put to one side the question of whether touch is head-centric or body-centric; if the latter, a further conversion between head-centric and body-centric coordinates would be required for vision). So without eye movements influencing visual perception, vision doesn't provide us with absolute distance or direction. Touch, we presume, does. And so a further, cognitive, step is required to convert vision into the same metric as touch so that they can be integrated. But this further, cognitive, step is subject to significant error. (Contrast this account with a very different cognitive interpretation of the rubber-hand illusion by Lush et al., 2020; Roseboom & Lush, 2020, according to which there is no integration, perceptual or cognitive.)

iv. Self-Motion

Finally, at the conceptual level, why should we expect multisensory integration to be perceptual? In Linton (2018a) we discuss this thought in the context of the integration of visual (optic flow) and vestibular cues to passive self-motion. Adding vestibular cues to optic flow increases the impression of self-motion over and above optic flow alone (Ash, Palmisano, & Kim, 2011). But why would we expect this to affect our visual experience? It's not as if the optic flow has to be seen as flowing faster in order to incorporate this vestibular cue into our determination of self-motion. Instead, the visual (optic flow) and the non-visual (vestibular) cues merely feed into a common post-perceptual cognitive determination of speed.

3. V1 as the Neural Correlate of the Visual Perception / Cognition Boundary

We therefore argue that V1 is the site of the visual perception / cognition distinction, acting both as (a) the site of visual experience, and (b) the site where automatic post-perceptual inferences about our own visual experience occur.

Our suggestion of V1 as a cognitive map is very different from Zhaoping (2014; 2019)'s description of V1 as a 'saliency map'. Zhaoping (2019)'s 'saliency map' is concerned with attributing cognitive meaning (saliency) to the scene prior to perception in order to drive eye movements to specific meaningful locations in the scene. This is because, under Zhaoping (2019)'s account, "V1 activities are unrelated to visual awareness", and so fixational eye movements are required to pass the stimulus via attention to higher visual areas for visual awareness. Under Zhaoping (2019)'s account, then, (a) V1 is pre-perceptual, and (b) Zhaoping (2019)'s 'saliency map' is a pre-perceptual 'cognitive map' of sorts. By contrast, we are arguing that (a) V1 is the site of visual experience (V1 as visual map), but also (b) V1 is the site of the first layer of post-perceptual cognition (V1 as cognitive map).

But how can V1 be both a visual map (providing our visual experience) and a post-perceptual cognitive map (interpreting our visual experience) at the same time? We suggest that the answer lies in the laminar structure of V1. It is typical to distinguish between superficial (2/3), mid (4), and deep (5/6) layers of V1. Visual inputs from the LGN feed primarily into layer 4C, so layer 4 would appear to be the most likely area for 'pure' binocular visual experience divorced from cognitive inferences (cf. Fournier, Saleem, Diamanti, et al., 2020 on navigation; Self et al., 2013 on figure/ground; and, given my comments on motion below, Livingstone & Hubel, 1988).

Differentiating between the layers of V1 may also help us to explain why our visual experience doesn't appear to be high-pass spatially filtered, whilst such filtering is regarded as a key aspect of V1 visual processing (Carandini, 2012). For instance, if high-pass spatial filtering were primarily an aspect of layer 2/3 (feeding into higher visual areas), but the inputs at layer 4 remained unfiltered, and were experienced as visual perception, then this would provide a convincing explanation of why spatial filtering is a key aspect of V1 visual processing and yet isn't directly experienced.

Since higher visual areas predominantly feed into layer 2/3, and more distant cortical areas predominantly feed into layers 5/6, one suggestion is that top-down influences on layer 2/3 are reflected in illusions that we experience as visual perception, whilst top-down influences on layers 5/6 are reflected as imagery (Bergmann et al., 2019; Koenig-Robert & Pearson, 2021). One potential problem with this account is that Kok et al. (2016) find that the Kanizsa triangle illusion only activates layers 5/6, not 2/3. However, it could be that the Kanizsa triangle really is better thought of as a form of involuntary visual imagination (consistent with evidence in Kok et al., 2016 of attendant suppression of perception in layers 2/3 and 4), although we continue to prefer a purely cognitive interpretation of the Kanizsa triangle (Linton, 2017, p.11).

However, our key concern with Bergmann et al. (2019) and Koenig-Robert & Pearson (2021)'s suggestion that we experience layers 2/3 as visual perception, and layers 5/6 as imagery, is that visual inputs also affect layers 5/6, as if imagery and perception are being integrated. Instead, we regard one of two solutions as the most likely interpretation.

The simplest solution would be to regard layer 4 as the site of conscious visual experience (which we describe above as 'pure' binocular visual experience) and layers 2/3 and 5/6 as two different layers of cognitive processing. This is supported by Jordan & Keller (2020)'s finding of two different interactions between visual and motor signals in V1: comparison in layers 2/3 (for visual stability), and integration in layers 5/6 (for the integrated 'percept') (both of which I would give a cognitive interpretation) (see also Keller & Mrsic-Flogel, 2018; Padamsey & Rochefort, 2020). Further support comes from Self et al. (2013) on the time course of figure-ground segmentation in layers 2/3 and 5/6.

Another approach, which we also have some sympathy for, would be to regard layers 5/6 as the site of visual experience, over which bottom-up perception and top-down imagination compete, with dreams resulting when imagination wins that competition globally, and hallucinations resulting when imagination wins that competition in a local region of space. Whichever interpretation is correct, the key point is that somewhere feedforward visual processing is dominant, and this is reflected as our visual experience.
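To summarise the functional division we are proposing, here is a schematic of the hypothesis only, not an established mapping (under the alternative reading above, the 'experience' entry would move from layer 4 to layers 5/6):

```python
# Hypothesised division of labour across V1's laminar structure
# (our 'simplest solution'; the alternative places experience in 5/6).
V1_LAMINAR_HYPOTHESIS = {
    "layer 4": {
        "input": "feedforward from LGN (via layer 4C)",
        "role": "'pure' binocular visual experience (V1 as perceptual map)",
    },
    "layers 2/3": {
        "input": "feedback from higher visual areas",
        "role": "post-perceptual cognition (e.g. comparison, for visual stability)",
    },
    "layers 5/6": {
        "input": "feedback from more distant cortical areas",
        "role": "post-perceptual cognition (e.g. integration; possibly imagery)",
    },
}

for layer, info in V1_LAMINAR_HYPOTHESIS.items():
    print(f"{layer}: {info['input']} -> {info['role']}")
```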

4. Low Level Theory of Visual Consciousness

If V1 is the first site of post-perceptual cognitive processing, then it must logically be the site of visual experience as well (unless you want to suggest that visual experience is situated prior to V1, in the LGN, which is a conceptual possibility, but one which V1 acting as the site of binocular integration appears to rule out).

Thinking of V1 as the site of visual experience provides some potentially helpful insights about the nature of visual 'illusions' which, because we give them a cognitive interpretation, are "better thought of as delusions (false judgements) rather than illusions (false percepts)" (Linton, 2017, pp.8, 69, 74). First, as we have already noted, Bergmann et al. (2019) were unable to decode amodal completion in V1, suggesting that this effect is very much a purely cognitive process. Second, Bergmann et al. (2019) were only able to decode the edges of the neon spreading illusion in V1. But the neon spreading illusion is of a single illusory surface. This suggests that the illusory surface in the neon spreading illusion is also cognitive rather than perceptual.

Distinguishing between the layers of V1 (and arguing that feedforward processing in layer 4 reflects visual perception, whilst feedback processing from higher visual areas into layers 2/3 merely reflects cognition) leads us to embrace a low-level theory of visual consciousness, in contrast to mid-level theories (recurrent processing theory; integrated information theory) and higher-level theories (higher-order thought; global workspace theory). This is important, because most of the recent adversarial collaborations in consciousness science have focused on adjudicating between mid- and higher-level approaches, whilst missing this third distinct alternative (Templeton grants 0567: He, Peters, Denison, Brascamp, Block, & Chalmers, 'An adversarial collaboration to test predictions of first-order and higher-order theories of consciousness', 2020-2023; 0484: Block, 'Analyzing and merging theories of consciousness', 2020-2023; and 21569: Block & Lau, 'Arbitrating philosophical theories of consciousness by cognitive neuroscience experiments', 2011-2015). Let me explain the differences between these approaches, and why my third distinct alternative is required.

1. Higher-level Theories (higher-order thought; global workspace theory): The key attribute of higher-level theories for our purposes is that they do not situate visual experience in the visual cortex (V1-V6). Classically the distinction has been thought of in terms of whether the neural correlates of consciousness are in the front or the back of the brain (Boly et al., 2017; Odegaard et al., 2017; Liu et al., 2019). The prefrontal cortex (PFC) has become the site of visual experience for higher-level theories of visual consciousness (Frith, 2019), including global workspace theories (Dehaene & Naccache, 2001) and higher-order thought theories (Lau, 2019).

2. Mid-level Theories (recurrent processing theory; integrated information theory): Mid-level theories typically attribute visual consciousness to recurrent processing within the visual cortex (V1-V6), either as the basis of their account (recurrent processing theory; Lamme & Roelfsema, 2000; Lamme, 2018), or because recurrent processing is thought to reflect information integration (integrated information theory; Tononi & Edelman, 1998; Oizumi et al., 2014). Given the importance of recurrent processing, we might also include the predictive coding accounts discussed above (Rao & Ballard, 1999).

I will focus on the subset of these theories that explicitly suggests that V1 may be the site of visual experience, including Stoerig and colleagues (Stoerig & Cowey, 1995; Stoerig, 2001), Pollen, in limited circumstances (Pollen, 1995; 1999; 2003), Lamme and colleagues (Lamme et al., 2000; Lamme & Roelfsema, 2000; Lamme, 2003; 2004; 2006; Fahrenfort & Lamme, 2012; Lamme, 2018; 2020), Tong (Tong, 2003; 2009), and Rees and colleagues (Watkins et al., 2006; Rees, 2007; cf. Rees et al., 2002); see Boly et al. (2017) for a recent defence.

3. Low-level theories (my feedforward account): Where I differ from these mid-level theories of visual consciousness is that they are unanimous in requiring feedback from higher visual areas to V1 for conscious visual perception (Lamme, 2018). As Oizumi et al. (2014) note of both recurrent processing theory and integrated information theory: "The idea that 'feed-back', 'reentry', or 'recursion' of some kind may be an essential ingredient of consciousness has many proponents". By contrast, the essence of my account is that recurrent processing from higher visual areas to V1 is not necessary for visual consciousness.

This is in keeping with mouse physiology, where there are no clear analogues of the higher visual areas required for such recurrent processing.

It is also in keeping with V1 as a cognitive map. On this account, as we have emphasised, feed-forward visual processing in layer 4 reflects our visual experience (V1 as visual map), whilst feed-back visual processing from higher visual areas reflects cognitive processing (V1 as cognitive map). Which of these two approaches is correct? I will take Victor Lamme's work (Lamme, 2020) as my target, and highlight three key distinctions between our theories. First, both Lamme (2020) and I (Linton, 2017, Ch.1) agree that all leading theories of visual perception (including his) are 'inferential', but this is exactly what, from a theoretical perspective, I want to reject. Second, advocates of recurrent processing equate visibility with detectability, but I argue that this leads us into a paradox. Third, Fahrenfort & Lamme (2012) and I disagree as to what constitutes a good paradigm of 'visual perception'. They focus on illusions, whilst I argue that random-dot stereograms, which were favoured by the early vision literature of the 1960s-80s but have recently fallen out of fashion, teach us the most about the mechanisms of visual experience.

a. Vision Without Inference

Lamme (2020) rightly identifies 'inference' as essential to all leading theories of vision (by inference I mean to include Lamme, 2020's 'categorizations' and 'interference'; cf. Albertazzi et al., 2010, who reject a narrower sense of 'inference', namely 'inverse optics', than Lamme, 2020 and I are discussing). What I propose in Linton (2017), Ch.1 is a theory of vision without inference. By inference I mean the visual constancies, Gestalt groupings, and the attribution of meaning to the retinal image. So just as we can distinguish between low-level, mid-level, and higher-level theories of visual processing (Cavanagh, 2011), we can distinguish between low-level (non-inferential), mid-level (recurrent processing theory; integrated information theory), and higher-level (higher-order thought; global workspace theory) theories of visual consciousness.

The essence of my account is that vision is not about anything. Vision is not trying to 'represent' anything, or 'predict' anything, or 'convey' anything. Perception does not involve 'information processing' or the making of 'estimates' about the external world (Neisser, 1967; Miller, 2003), much less is vision, or visual processing, in some sense 'predictive' (Frith, 2013; Clark, 2013). Instead, to borrow from a long tradition of naïve realists, I insist (Linton, 2017, p.74) that vision is 'silent': it makes no claims or representations (Austin, 1962; Travis, 2004; Travis, 2013). However, the claim that vision is 'silent' is not just the preserve of naïve realism. I develop my own account as a form of naïve or perceptual idealism (Linton, 2018b; 2018e), and Papineau (2021) advances a similar claim from a sensationalist perspective. This is a very different way of thinking about vision from Marr (1982)'s suggestion that "vision is the process of discovering from images what is present in the world and where it is."

The question whether we can account for all of our visual experience without invoking 'inference' is an empirical one. First, as I outline below, we need to determine whether a process is properly categorised as 'perceptual' or 'cognitive'.
Second, if it truly is perceptual (as, for instance, I argue that depth from binocular disparity is), we need to ask whether it can be accounted for in non-inferential terms (for instance, my argument that binocular depth is the product of rivalry eradication, rather than of the visual system trying to estimate scene geometry).

b. Detection ≠ Perception

A key argument in favour of recurrent processing being necessary for visual consciousness is the suggestion that V1 activation on its own is subjectively undetectable and therefore invisible (Lamme et al., 2000; Koivisto & Revonsuo, 2010; Koivisto et al., 2010; Lamme, 2020). Lamme et al. (2000) point to the fact that subthreshold changes in colour and luminance are processed in V1 even though they are 'objectively' invisible to the observer. But underpinning this suggestion is the assumption that just because a change isn't detected, it isn't perceived. And I argue that this is wrong. Let me explain why this is a shortcoming of signal detection theory.

Consider someone in a contrast detection task who is shown a series of stimuli in succession, one after another. Imagine the stimulus is a luminous display that takes up their whole visual field. First, they are shown the display at one luminance. Then this is replaced (without an interval) by a second stimulus where the display is now lighter than the first, but the participant can't detect that the second stimulus is lighter. Does this mean that the luminance difference between the first and second stimulus is not apparent in their visual experience? Signal detection theory suggests so. But this leads to the following paradox. Now the second stimulus is replaced (without an interval) by a third stimulus that is lighter than the second, but again the difference is not detectable. And the third stimulus is replaced (without an interval) by a fourth stimulus that is lighter than the third, but again the difference is not detectable. And so on. Clearly the observer's visual experience is changing. For instance, had the first stimulus been replaced directly by the fourth, the observer would have immediately noticed. But according to signal detection theory these changes are not just undetectable, but invisible (i.e. they have no effect on our visual experience). And we argue this is wrong. The observer's visual experience will change even though none of the individual changes were detectable. And yet this is the criterion of 'invisibility' that recurrent processing theorists employ.

A similar example would be gradual changes, e.g. someone who fails to detect the change in position of a clock hand over time. A succession of small but undetectable changes is essentially what gradual change blindness is. Simons et al. (2000) and Auvray & O'Regan (2003) suggested that these gradual changes are not only undetectable, but also imperceptible (i.e. they're subjectively invisible). We maintain that the same logic illustrates why this cannot be the case. Imagine a region of a picture gradually changing from its initial red to blue. It can't really be maintained that the observer's visual experience remains red over the course of the transition. Instead, a series of subthreshold changes must be changing the observer's conscious visual perception (Linton, 2017, p.102; Linton, 2020a, p.3189).
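The staircase argument can be made concrete with a minimal simulation. The threshold and step sizes below are illustrative assumptions rather than values drawn from the detection literature; the point is only that a chain of sub-threshold steps can sum to a supra-threshold difference:

```python
# A minimal sketch of the staircase argument: each individual luminance
# step falls below a hypothetical just-noticeable difference (JND), yet
# the cumulative change does not. All values are illustrative assumptions.

JND = 2.0    # hypothetical detection threshold (arbitrary units)
STEP = 1.0   # each increment is deliberately sub-threshold

levels = [100.0]                 # the first stimulus
for _ in range(3):               # the second, third, and fourth stimuli
    levels.append(levels[-1] + STEP)

# No successive pair of stimuli differs detectably...
assert all(b - a < JND for a, b in zip(levels, levels[1:]))

# ...yet the first and fourth stimuli do.
assert levels[-1] - levels[0] > JND

print(levels)  # [100.0, 101.0, 102.0, 103.0]
```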
Note that this is a different point from Lamme (2003; 2004)'s and Block (2007)'s argument that sudden (masked) 'change blindness' is not in fact blindness, but merely 'inattentional inaccessibility'. Their argument rests on there being a failure of attention in the masked (sudden) 'change blindness' case. But gradual change blindness (sub-threshold failures of detection) is not due to failures of attention. As O'Regan (2011) notes, "'you' may even be looking directly at the thing that is changing and still not discover it." Instead, ours is the very different point that detection thresholds in signal detection theory provide us with no justification for claiming that sub-threshold stimuli are not consciously experienced.

The same logic applies to the insistence that feed-forward processing in V1 can't be conscious because, when backwards masking is applied to halt recurrent processing, such feed-forward processing is undetectable. Lamme (2020) terms this 'objective' invisibility. But whilst Lamme (2020) suggests that sudden (masked) 'change blindness' (which he argues is consciously experienced) "is clearly different from invisibility from masking", we've been given no reason, apart from behaviour, to conclude that backwards masking extinguishes vision in one context (V1 processing without recurrent processing), but merely working memory in the other (sudden change blindness).

c. Competing Visual Paradigms

Finally, Fahrenfort & Lamme (2012) and I disagree over what constitutes a good paradigm for visual processing, and therefore over what makes a good stimulus for testing whether recurrent processing is necessary for visual perception. Fahrenfort & Lamme (2012) use three different types of illusions in their 'no report' paradigm: 1. pictorial size constancy in the Ponzo illusion, 2. illusory contours in the Kanizsa triangle, and 3. the checker shadow illusion. However, I disagree that any of these three illusions are really perceptual.

1. Pictorial size constancy in the Ponzo illusion: First, as we have already discussed in Section 2(c)(ii) in the context of pictorial size constancy, it's not at all clear that the apparent effect of perspective cues on perceived size actually distorts visual space, as opposed to merely distorting our ability to make accurate judgements about visual space.

2. Illusory contours of the Kanizsa triangle: Second, whilst the illusory contours of the Kanizsa triangle are reliant on top-down processing from V2 (Qiu & von der Heydt, 2005), we argue again (Linton, 2017, p.11) that there is little reason to suppose that these illusory contours are visually perceived. Indeed, how do they manifest themselves in our visual experience? It is sometimes suggested that the contours are seen as lighter than the surround. But there seems to be little evidence that this effect can cancel out an actual physical decrease in the luminance of the contours, which would be required to show that this really is a perceptual effect.

3. Checker shadow illusion: The checker shadow illusion (Adelson, 1995) is heavily reliant upon pictorial cues for its effect. Given our argument (in Linton, 2017) that pictures and pictorial cues are merely cognitive in nature, we would argue that such luminance constancy effects are likewise purely cognitive. The same thought applies to colour constancy, which explains why we do not find Block (1998)'s critique of V1 as the site of visual experience, based on the Land effect (colour constancy), determinative.

4. Random-dot stereograms: What, then, is an appropriate paradigm for thinking about visual perception? This is the topic of the recent debate between myself and Morales, Bax, & Firestone (Morales et al., 2020; Linton, 2021a; Morales & Firestone, 2021; Morales et al., in press). Morales et al. (2020) are concerned with everyday viewing conditions, and suggest that visual perception provides us with both the 'distal stimulus' (a representation of the physical 3D world) and the 'proximal stimulus' (a representation of the retinal image). There is a long tradition of equating one-eye vision with the retinal image ('local signs': Lotze, 1846; Hering, 1879; 19th Century optometry: Atchison, 1979; Johnson et al., 2011), which formed the basis of early investigations into the retinotopic nature of V1 (Inouye, 1909). And since I argue that pictorial cues are merely cognitive, I am very sympathetic to the idea that monocular vision is just our retinal image. In Linton (2017), Ch.4 I explored the possibility that defocus blur might impart some monocular depth (following Zannoli et al., 2016), but I am now inclined to think blur is merely a pictorial depth cue, and so all monocular vision is flat in Sacks (2006)'s sense.

However, the argument in Linton (2021a) is that ordinary vision with two eyes doesn't provide us with our retinal image, nor does it provide us with a representation of the physical world. Instead, what matters is how the visual system resolves binocular disparity (the differences in the retinal images of the two eyes) to give us a single coherent percept. I therefore suggest that viewing a random-dot stereogram in complete darkness and isolation is the appropriate paradigm through which to begin to understand vision.
In this regard I appeal to an older tradition, dominant in the 1960s-80s, of 'basic' or 'early' vision (Julesz, 1960; Julesz, 1971; Julesz, 1986; Bishop & Pettigrew, 1986; Julesz, 1995; Papathomas et al., 1995).
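The construction of such a stereogram is itself instructive: each eye's image on its own is pure noise, and the depth percept only exists binocularly. Below is a minimal sketch of the classic Julesz construction; the array sizes, the shift value, and the placement of the refilled strip are arbitrary illustrative choices, not a reconstruction of any particular published stimulus:

```python
import numpy as np

def make_rds(size=128, square=48, shift=4, seed=0):
    """Minimal Julesz-style random-dot stereogram.

    The two eyes' images are identical random dots, except that a central
    square is shifted horizontally in the right eye's image. Viewed
    monocularly each image is noise; fused, the square stands out in depth.
    All sizes are arbitrary illustrative values."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 2, (size, size))
    right = left.copy()
    r0 = c0 = (size - square) // 2
    # Copy the central square into the right image, shifted leftwards
    # (crossed disparity, so the square appears nearer than the background).
    right[r0:r0 + square, c0 - shift:c0 - shift + square] = \
        left[r0:r0 + square, c0:c0 + square]
    # The strip uncovered by the shift gets fresh, uncorrelated dots.
    right[r0:r0 + square, c0 + square - shift:c0 + square] = \
        rng.integers(0, 2, (square, shift))
    return left, right

left, right = make_rds()  # e.g. display side by side for cross-fusion
```

Note that no individual dot carries any depth information; the depth is defined only by the correlation between the two images, which is why Julesz took the random-dot stereogram to isolate binocular mechanisms from all monocular and cognitive cues.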


Figure 2. Example of a random-dot stereogram from Serrano-Pedraza et al. (2016), and an illustration of the shape it produces when the right eye sees the left image, and the left eye sees the right image ('cross-fusion'). © Serrano-Pedraza et al. (CC-BY 4.0)

What are the basic components of seeing a random-dot stereogram? You might think it has two components: first, its distance; second, its shape. But we have argued that its distance is not visually conveyed to us (see Section 2(b) above on V1 as a Cognitive Distance Map). So all that is left is the perception of 3D shape, with its visual scale being ambiguous (what I call 'phenomenal geometry without scale' in Linton, 2017).

Turning to 3D shape, and consistent with my suggestion that vision is 'silent', I don't believe stereo vision is a process of trying to estimate the scene geometry of the world from the different view-points of the two eyes. Indeed, trying to do this would be meaningless without the vergence signal (Wallach & Zuckerman, 1963), which is exactly the signal we have illustrated the visual system doesn't have. Instead, I think the most plausible explanation for stereo depth is that it is the solution to a very different problem, one that has nothing to do with 3D shape. And that problem is binocular rivalry. The percept we see in Fig.2 is the only percept we could see if we are to eradicate binocular rivalry. So on this account depth from disparity isn't about 3D shape, but merely the eradication of binocular rivalry (Linton, in prep). 3D depth is the solution to a problem, not a problem to be solved itself. Specifically, there are no inferences about the 3D geometry of the scene involved.

Strictly speaking, nor do I think we should even talk about perceiving 3D shape or 3D geometry, as opposed to simply 3D depth (which is merely the resolution of rivalry). Take the following random-dot stereogram from Linton (2021a) (Fig.3). The process of rivalry resolution I am describing is the process of locating each individual point in depth. This is the same process whether the points make up a single surface (as they do here) or are completely unrelated (as they would be in a random dot field). That we interpret the points in Figure 3 as making up a single surface seems to be a cognitive inference based on Gestalt groupings. But all our visual experience conveys to us is individual dots located in 3D space.
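The per-dot character of this proposal can be sketched in a few lines. The dot positions, the gain constant, and the sign convention below are made-up values for illustration; the point is only that each dot's depth follows from its own disparity, with no surface model or scene-geometry inference anywhere in the loop:

```python
# Illustrative sketch: depth as a purely local, dot-by-dot resolution of
# rivalry. Each matched dot pair is displaced in depth by its own
# disparity; nothing here knows whether the dots form a surface.
# Positions and gain are made-up values for illustration.

dots = [   # (x_left, x_right): a dot's horizontal position in each eye's image
    (10.0, 10.0),  # zero disparity: remains on the fixation plane
    (22.0, 20.0),  # crossed disparity: displaced towards the viewer
    (35.0, 37.0),  # uncrossed disparity: displaced away from the viewer
]

GAIN = 1.0  # arbitrary depth units per pixel of disparity

for x_left, x_right in dots:
    disparity = x_left - x_right      # positive = crossed, by convention
    depth_offset = GAIN * disparity   # the same local rule for every dot
    print(f"disparity {disparity:+.1f} px -> depth offset {depth_offset:+.1f}")
```

Whether those three dots are then grouped into a single surface is, on this account, a further cognitive step that adds nothing to the depth of any individual dot.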

Figure 3. Random-dot stereogram from Linton (2021a), to be viewed with red-blue glasses.

5. Motion perception: One criticism of this approach is that it focuses on static binocular vision. Surely motion perception is perceptual, and it involves areas beyond V1, such as MT. However, we argue that motion perception is purely cognitive.

First, the fact that the 'double drift' motion illusion relies on the prefrontal cortex is taken as evidence that motion perception, and therefore perception in general, supports a higher-level theory of visual consciousness (Liu et al., 2019). However, the illusion does not affect saccadic eye movements (Lisi & Cavanagh, 2015; cf. Lisi & Cavanagh, 2017). Either our eye movements don't act on what we see, or, as we argue instead, eye movements provide a more immediate gauge of our visual experience, divorced from the cognitive biases that illusions introduce into the perceptual judgements and more deliberative actions tested by Lisi & Cavanagh (2017) and Liu et al. (2019).

Second, motion perception is really just the integration of extinguished percepts over time. In Linton (2017), pp.58-61 we ask why this process should be presumed to be perceptual rather than cognitive. For instance, we see apparent motion between two dots blinking one after the other ('beta motion'; Boeykens et al., 2020). But take one of these dots blinking in isolation. It's not at all clear that the judgement that the dot is 'blinking' (which relies heavily on working memory to integrate percepts over time) is perceptual rather than cognitive. But if that's true of a single dot blinking, why shouldn't it be true of motion between the two dots?

Third, blindsight is often claimed to involve a perceptual judgement, without visual experience, that something is moving in the visual field (Barbur et al., 1994; Zeki & Ffytche, 1998; Azzopardi & Hock, 2011). Whether blindsight exists in this sense, or is merely degraded visual experience, continues to be debated (Azzopardi & Cowey, 1997; Overgaard et al., 2008; Weiskrantz, 2009; Phillips, 2020). But the idea that motion perception is merely cognitive remains a distinct possibility here, and, if correct, may provide insights into 'normal' motion perception.

5. Objections

Finally, let us turn to four potential objections reviewers have raised against my account:

a. Is this a theory of visual consciousness?

Whilst Section 3 provides a theory of the neural correlates of visual consciousness, the objection is that Section 4 is not yet a theory of visual consciousness because it doesn't 'explain why'. First, it doesn't explain why we experience visual consciousness (the 'hard problem' is unresolved; Chalmers, 1995). Second, it doesn't explain why we experience cognitive phenomenology in some contexts (e.g. reading), but not in others. In reply, I would argue that consciousness science today is very much akin to astronomy in the 17th Century. It's important to remember that there were accounts that 'explained why' the planets moved, based on magnetism (Gilbert, 1600) and vortices (Descartes, 1644). But we made progress by embracing a theory that did not attempt to 'explain why' (Newton, 1687: "I feign no hypotheses"), even though this failure to 'explain why' left Newton open to the objection (from Leibniz in Clarke, 1717) that it rendered planetary motion "inexplicable, unintelligible, precarious, groundless, and unexampled."

b. What is pure visual consciousness without cognition?

I try to address this with my discussion of random-dot stereograms viewed in isolation and in darkness. With single-eye viewing, there's nothing to suggest that we experience anything different from our retinal image. What we see is flat. We know this because if we introduce a second, identical retinal image from the other eye (zero disparity), our visual experience will not change, and zero-disparity stimuli are flat (I think it's helpful to think of single-eye viewing as akin to zero-disparity viewing).
The random-dot surface is at some indeterminate extension away from us, but I argue that it would be a category mistake to attribute any physical distance to this phenomenal extension:

"When we watch a movie, the movie is always at the same fixed phenomenal depth: the screen never appears to move towards us or away from us. And yet the fact that the movie is always at the same fixed phenomenal depth in no way impedes our enjoyment of the film: the movie can transition from wide-panning shots of a cityscape to close-ups of a face without this transition being jarring. Attributing a variety of different scales to scenes viewed at a single fixed phenomenal depth is therefore something that seems natural to humans." (Linton, 2018f)

The difference between monocular (one-eye) and binocular (two-eye) viewing is the sense of depth we perceive with two eyes, which is experienced in a random-dot stereogram when there is disparity (a difference in the retinal images). But this, I argue, is simply the outcome of the visual system resolving binocular rivalry: it moves visual points in depth (off the focal plane) so that they no longer rival one another. There is no inference involved.

c. How do we determine whether an effect is perceptual or cognitive?

For any other potentially perceptual process, it is an empirical question whether it is really perceptual or cognitive. We have to ask two questions.

First, how does the process affect our visual experience? For the Kanizsa triangle, the illusory contours are said to look brighter. For the Müller-Lyer illusion, lines are meant to look longer or shorter. For vergence, objects are meant to look smaller and closer. The key point is that you have to be able to point to something in our visual experience that changes. It can't just be "a feeling" (see my disagreement with Knotts et al., 2020 in Section 2(a)). This is why I resist the suggestion that reading is perceptual. What changes in our visual experience?

Second, to test whether any apparent change in our visual experience really is a change in our visual experience, you need to devise an experiment that shows that the effect persists where a purely cognitive effect would not. With vergence, I showed that this was not the case by retesting vergence as a size and distance cue whilst removing any subjective knowledge of the change in vergence, and showing that changes in vergence had no effect on our visual experience.

d. Visual consciousness in the absence of V1

Silvanto et al. (2007) and Koivisto et al. (2010) document how neural plasticity can give rise to visual experience when V1 is no longer intact. Does this indicate that V1 isn't actually necessary for visual experience? In a literal sense, yes. But I would make two points:

First, Ajina et al. (2015) illustrate that when V5/MT+ provides the basis for visual experience in these patients, it doesn't represent 'ordinary' V5/MT+ functioning, but instead appears to take on the functions of the missing V1. And, as we saw above, it appears that V1 itself can be recruited for other modalities after extended visual deprivation. How the brain can change a neuron from one experience to another (in the case of V1 recruitment) is essentially as puzzling as the hard problem itself (why any neuron should give rise to experience, and, just as importantly, to that specific experience). But I think what this literature shows is just how flexible the brain is at allocating functions, as opposed to showing that the functions associated with V1 feedforward processing can't give rise to visual experience.
Second, the fact that feedforward processing from LGN to V5/MT+ gives rise to visual experience (Ajina & Bridge, 2018), in much the same way that feedforward processing from LGN to V1 does, is actually supportive of my account, since in these cases there seems to be little prospect of the recurrent processing that recurrent processing accounts require.

Conclusion

One approach to visual neuroscience is to take a perceptual phenomenon that's already relatively well understood, and to try to find the neural basis for it. But another approach is to take a neural mechanism that's already well explored, and to argue for a new interpretation of it. For instance, Rao & Ballard (1999)'s initial argument for predictive coding rested on a reinterpretation of the surround suppression found by Hubel & Wiesel (1965) (see Keller & Mrsic-Flogel, 2018). In this paper we take two neural mechanisms that have been relatively well explored over the last three decades, the extraction of distance from vergence and the scaling of the retinal image using vergence, and argue for a reinterpretation of these basic visual functions as cognitive instead of perceptual. Given the interaction between vergence and multisensory integration in the Taylor illusion, as well as failures of mandatory fusion, I extend this account to multisensory integration as well. We argue that the finding that such basic functions as the processing of size and distance in V1 are cognitive should have significant implications for our understanding of V1, not just as an egocentric cognitive map, but also as the site of a low-level theory of visual consciousness.

Acknowledgements

Thank you to the editor and the three reviewers for their very helpful comments on the manuscript. The paper develops arguments that were presented in my 2018 Brains Blog discussion of my book The Perception and Cognition of Visual Space (2017), and I thank John Schwenkler, Robert Briscoe, and the Brains Blog team for that opportunity.

23 1062 References 1063 1064 Abid, G. (2019). Deflating inflation: The connection (or lack thereof) between decisional and

1065 metacognitive processes and visual phenomenology. Neuroscience of Consciousness,

1066 2019(niz015). https://doi.org/10.1093/nc/niz015

1067 Adams, D. L., & Horton, J. C. (2003). A Precise Retinotopic Map of Primate Striate Cortex

1068 Generated from the Representation of Angioscotomas. The Journal of Neuroscience,

1069 23(9), 3771–3789. https://doi.org/10.1523/JNEUROSCI.23-09-03771.2003

1070 Adelson, T. E. (1995). Checker shadow illusion:

1071 Http://persci.mit.edu/gallery/checkershadow.

1072 Ajina, S., & Bridge, H. (2018). Blindsight relies on a functional connection between hMT+

1073 and the lateral geniculate nucleus, not the pulvinar. PLOS Biology, 16(7), e2005769.

1074 https://doi.org/10.1371/journal.pbio.2005769

1075 Ajina, S., Kennard, C., Rees, G., & Bridge, H. (2015). Motion area V5/MT+ response to

1076 global motion in the absence of V1 resembles early visual cortex. Brain, 138(1), 164–

1077 178. https://doi.org/10.1093/brain/awu328

1078 Albertazzi, L., Tonder, G. J. V., & Vishwanath, D. (2010). Perception Beyond Inference: The

1079 Information Content of Visual Processes. MIT Press.

1080 Anonymous. (2013). “Copy-paste, no resize”, https://imgur.com/WBAzkuI (last accessed

1081 17th December 2020).

1082 Ash, A., Palmisano, S., & Kim, J. (2011). Vection in Depth during Consistent and

1083 Inconsistent Multisensory Stimulation. Perception, 40(2), 155–174.

1084 https://doi.org/10.1068/p6837

1085 Atchison, D. A. (1979). History of Visual Field Measurement. The Australian Journal of

1086 Optometry, 62(8), 345–354. https://doi.org/10.1111/j.1444-0938.1979.tb02329.x

24 1087 Atkin, A. (2013). Peirce’s Theory of Signs. In E. N. Zalta (Ed.), The Stanford Encyclopedia

1088 of Philosophy (Summer 2013). Metaphysics Research Lab, Stanford University.

1089 https://plato.stanford.edu/archives/sum2013/entries/peirce-semiotics/

1090 Aubert, H. R., & Foerster, C. F. R. (1857). Beiträge zur Kenntniss des indirecten Sehens. (I).

1091 Untersuchungen über den Raumsinn der Retina. Archiv Für Ophthalmologie, 3, 1–37.

1092 Austin, J. L. (1962). Sense and Sensibilia. Clarendon Press.

1093 Auvray, M., & O’Regan, J. (2003). L’influence des facteurs sémantiques sur la cécité aux

1094 changements progressifs dans les scènes visuelles. L’Année psychologique, 103(1), 9–

1095 32. https://doi.org/10.3406/psy.2003.29621

1096 Ayaz, A., Saleem, A. B., Schölvinck, M. L., & Carandini, M. (2013). Locomotion Controls

1097 Spatial Integration in Mouse Visual Cortex. Current Biology, 23(10), 890–894.

1098 https://doi.org/10.1016/j.cub.2013.04.012

1099 Azzopardi, P., & Cowey, A. (1997). Is blindsight like normal, near-threshold vision?

1100 Proceedings of the National Academy of Sciences, 94(25), 14190–14194.

1101 https://doi.org/10.1073/pnas.94.25.14190

1102 Azzopardi, P., & Hock, H. S. (2011). Illusory motion perception in blindsight. Proceedings of

1103 the National Academy of Sciences, 108(2), 876–881.

1104 https://doi.org/10.1073/pnas.1005974108

1105 Barbur, J. L., Harlow, A. J., & Weiskrantz, L. (1994). Spatial and temporal response

1106 properties of residual vision in a case of hemianopia. Philosophical Transactions of

1107 the Royal Society of London. Series B, Biological Sciences, 343(1304), 157–166.

1108 https://doi.org/10.1098/rstb.1994.0018

1109 Barlow, H. B., Blakemore, C., & Pettigrew, J. D. (1967). The neural mechanism of binocular

1110 depth discrimination. The Journal of Physiology, 193(2), 327–342.

1111 https://doi.org/10.1113/jphysiol.1967.sp008360

25 1112 Bayne, T., Brainard, D., Byrne, R. W., Chittka, L., Clayton, N., Heyes, C., Mather, J.,

1113 Ölveczky, B., Shadlen, M., Suddendorf, T., & Webb, B. (2019). What is cognition?

1114 Current Biology, 29(13), R608–R615. https://doi.org/10.1016/j.cub.2019.05.044

1115 Bayne, T., & Montague, M. (2011). Cognitive Phenomenology. OUP Oxford.

1116 Bedny, M. (2017). Evidence from Blindness for a Cognitively Pluripotent Cortex. Trends in

1117 Cognitive Sciences, 21(9), 637–648. https://doi.org/10.1016/j.tics.2017.06.003

1118 Beer, A. L., Plank, T., & Greenlee, M. W. (2011). Diffusion tensor imaging shows white

1119 matter tracts between human auditory and visual cortex. Experimental Brain

1120 Research, 213(2), 299. https://doi.org/10.1007/s00221-011-2715-y

1121 Bergmann, J., Morgan, A. T., & Muckli, L. (2019). Two distinct feedback codes in V1 for

1122 ‘real’ and ‘imaginary’ internal experiences. BioRxiv, 664870.

1123 https://doi.org/10.1101/664870

1124 Bishop, P. O., & Pettigrew, J. D. (1986). Neural mechanisms of . Vision

1125 Research, 26(9), 1587–1600. https://doi.org/10.1016/0042-6989(86)90177-X

1126 Block, N. (1998). How to Find the Neural Correlate of Consciousness*. Royal Institute of

1127 Philosophy Supplements, 43, 23–34. https://doi.org/10.1017/S1358246100004288

1128 Block, N. (2007). Consciousness, accessibility, and the mesh between psychology and

1129 neuroscience. The Behavioral and Brain Sciences, 30(5–6), 481–499; discussion 499-

1130 548. https://doi.org/10.1017/S0140525X07002786

1131 Boeykens, C., Wagemans, J., & Moors, P. (2020). Perception of the ambiguous motion

1132 quartet: A stimulus-observer interaction approach. PsyArXiv.

1133 https://doi.org/10.31234/osf.io/t5p6g

1134 Boly, M., Massimini, M., Tsuchiya, N., Postle, B. R., Koch, C., & Tononi, G. (2017). Are the

1135 Neural Correlates of Consciousness in the Front or in the Back of the Cerebral

26 1136 Cortex? Clinical and Neuroimaging Evidence. Journal of Neuroscience, 37(40),

1137 9603–9613. https://doi.org/10.1523/JNEUROSCI.3218-16.2017

1138 Botvinick, M., & Cohen, J. (1998). Rubber hands “feel” touch that eyes see. Nature,

1139 391(6669), 756. https://doi.org/10.1038/35784

1140 Brewer, A. A., Press, W. A., Logothetis, N. K., & Wandell, B. A. (2002). Visual areas in

1141 macaque cortex measured using functional magnetic resonance imaging. The Journal

1142 of Neuroscience: The Official Journal of the Society for Neuroscience, 22(23),

1143 10416–10426.

1144 Bridge, H. (2016). Effects of cortical damage on binocular depth perception. Philosophical

1145 Transactions of the Royal Society B: Biological Sciences, 371(1697), 20150254.

1146 https://doi.org/10.1098/rstb.2015.0254

1147 Bross, M. (2000). Emmert’s law in the dark: Active and passive proprioceptive effects on

1148 positive visual . Perception, 29(11), 1385–1391.

1149 https://doi.org/10.1068/p3038

1150 Busse, L., Cardin, J. A., Chiappe, M. E., Halassa, M. M., McGinley, M. J., Yamashita, T., &

1151 Saleem, A. B. (2017). Sensation during Active Behaviors. Journal of Neuroscience,

1152 37(45), 10826–10834. https://doi.org/10.1523/JNEUROSCI.1828-17.2017

1153 Calvert, G. A., Hansen, P. C., Iversen, S. D., & Brammer, M. J. (2001). Detection of audio-

1154 visual integration sites in humans by application of electrophysiological criteria to the

1155 BOLD effect. NeuroImage, 14(2), 427–438. https://doi.org/10.1006/nimg.2001.0812

1156 Carandini, M. (2012). Area V1. Scholarpedia, 7(7), 12105, revision #137292 (30th October

1157 2013). https://doi.org/10.4249/scholarpedia.12105

1158 Carey, D. P., & Allan, K. (1996). A motor signal and “visual” size perception. Experimental

1159 Brain Research, 110(3), 482–486. https://doi.org/10.1007/BF00229148

27 1160 Cavanagh, P. (2011). Visual cognition. Vision Research, 51(13), 1538–1551.

1161 https://doi.org/10.1016/j.visres.2011.01.015

1162 Cavonius, C. R., & Hilz, R. (1973). Invariance of visual receptive-field size and visual acuity

1163 with viewing distance. JOSA, 63(8), 929–933.

1164 https://doi.org/10.1364/JOSA.63.000929

1165 Caziot, B., & Mamassian, P. (2017). Limits of sensory fusion in audio-visual cue conflict

1166 stimuli. Journal of Vision, 17(10), 195–195. https://doi.org/10.1167/17.10.195

1167 Chalmers, D. (1995). Facing Up to the Problem of Consciousness. Journal of Consciousness

1168 Studies, 2(3), 200–219.

1169 Chanauria, N., Bharmauria, V., Bachatene, L., Cattan, S., Rouat, J., & Molotchnikoff, S.

1170 (2019). Sound Induces Change in Orientation Preference of V1 Neurons: Audio-

1171 Visual Cross-Influence. Neuroscience, 404.

1172 https://doi.org/10.1016/j.neuroscience.2019.01.039

1173 Chen, Jing, McManus, M., Valsecchi, M., Harris, L. R., & Gegenfurtner, K. R. (2019).

1174 Steady-state visually evoked potentials reveal partial size constancy in early visual

1175 cortex. Journal of Vision, 19(6), 8–8. https://doi.org/10.1167/19.6.8

1176 Chen, Juan, Sperandio, I., & Goodale, M. A. (2018). Proprioceptive Distance Cues Restore

1177 Perfect Size Constancy in Grasping, but Not Perception, When Vision Is Limited.

1178 Current Biology: CB, 28(6), 927-932.e4. https://doi.org/10.1016/j.cub.2018.01.076

1179 Chen, Juan, Sperandio, I., Henry, M. J., & Goodale, M. A. (2019). Changing the Real

1180 Viewing Distance Reveals the Temporal Evolution of Size Constancy in Visual

1181 Cortex. Current Biology, 29(13), 2237-2243.e4.

1182 https://doi.org/10.1016/j.cub.2019.05.069

28 1183 Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of

1184 cognitive science. Behavioral and Brain Sciences, 36(3), 181–204.

1185 https://doi.org/10.1017/S0140525X12000477

1186 Clarke, S. (1717). A Collection of Papers, which passed between the late Learned Mr.

1187 Leibniz, and Dr. Clarke, In the Years 1715 and 1716. James Knapton.

1188 Colombo, M., Colombo, A., & Gross, C. (2002). Bartolomeo Panizza’s Observations on the

1189 (1855). Brain Research Bulletin, 58, 529–539.

1190 https://doi.org/10.1016/S0361-9230(02)00831-6

1191 Cottereau, B., Durand, J.-B., Vayssière, N., & Trotter, Y. (2014). The influence of eye

1192 vergence on retinotopic organization in human early visual cortex. Perception, ECVP

1193 Abstracts 2014, 81. https://doi.org/10.1177/03010066140430S101, poster available at

1194 https://f1000research.com/posters/1096674

1195 Couto, J., Kandler, S., Mao, D., McNaughton, B. L., Arckens, L., & Bonin, V. (2019).

1196 Spatially segregated responses to visuo-tactile stimuli in mouse neocortex during

1197 active sensation. BioRxiv, 199364. https://doi.org/10.1101/199364

1198 Cumming, B. G., & DeAngelis, G. C. (2001). The physiology of stereopsis. Annual Review of

1199 Neuroscience, 24, 203–238. https://doi.org/10.1146/annurev.neuro.24.1.203

1200 De Meo, R., Murray, M. M., Clarke, S., & Matusz, P. J. (2015). Top-down control and early

1201 multisensory processes: Chicken vs. egg. Frontiers in Integrative Neuroscience, 9.

1202 https://doi.org/10.3389/fnint.2015.00017

1203 Dehaene, S., & Naccache, L. (2001). Towards a cognitive neuroscience of consciousness:

1204 Basic evidence and a workspace framework. Cognition, 79(1), 1–37.

1205 https://doi.org/10.1016/S0010-0277(00)00123-2

29 1206 Descartes, R. (1637). Dioptrique (Optics). In J. Cottingham, R. Stoothoff, & D. Murdoch

1207 (Eds.), The Philosophical Writings of Descartes: Volume 1 (1985). Cambridge

1208 University Press.

1209 Descartes, R. (1644). Principia Philosophiae. Apud Ludovicum Elzevirium.

1210 Diamanti, E. M., Reddy, C. B., Schröder, S., Muzzu, T., Harris, K. D., Saleem, A. B., &

1211 Carandini, M. (2020). Spatial modulation of visual signals arises in cortex with active

1212 navigation. BioRxiv, 832915. https://doi.org/10.1101/832915

1213 Dougherty, R. F., Koch, V. M., Brewer, A. A., Fischer, B., Modersitzki, J., & Wandell, B. A.

1214 (2003). Visual field representations and locations of visual areas V1/2/3 in human

1215 visual cortex. Journal of Vision, 3(10), 1–1. https://doi.org/10.1167/3.10.1

1216 Emmert, E. (1881). Größenverhältnisse der Nachbilder. Klinische Monatsblätter Für

1217 Augenheilkunde Und Für Augenärztliche Fortbildung, 19, 443–450.

1218 Erkelens, C. (2018). Linton, P. The Perception and Cognition of Visual Space. Perception,

1219 47(10–11), 1110–1111. https://doi.org/10.1177/0301006618793311

1220 Fahrenfort, J. J., & Lamme, V. A. F. (2012). A true science of consciousness explains

1221 phenomenology: Comment on Cohen and Dennett. Trends in Cognitive Sciences,

1222 16(3), 138–139. https://doi.org/10.1016/j.tics.2012.01.004

1223 Faivre, N., Arzi, A., Lunghi, C., & Salomon, R. (2017b). Consciousness is more than meets

1224 the eye: A call for a multisensory study of subjective experience. Neuroscience of

1225 Consciousness, 2017(1). https://doi.org/10.1093/nc/nix003

1226 Faivre, N., Dönz, J., Scandola, M., Dhanis, H., Ruiz, J. B., Bernasconi, F., Salomon, R., &

1227 Blanke, O. (2017a). Self-Grounded Vision: Hand Ownership Modulates Visual

1228 Location through Cortical β and γ Oscillations. Journal of Neuroscience, 37(1), 11–

1229 22. https://doi.org/10.1523/JNEUROSCI.0563-16.2016

30 1230 Faivre, N., Salomon, R., & Blanke, O. (2015). Visual consciousness and bodily self-

1231 consciousness. Current Opinion in Neurology, 28(1), 23–28.

1232 https://doi.org/10.1097/WCO.0000000000000160

1233 Falchier, A., Clavagnier, S., Barone, P., & Kennedy, H. (2002). Anatomical Evidence of

1234 Multimodal Integration in Primate Striate Cortex. Journal of Neuroscience, 22(13),

1235 5749–5759. https://doi.org/10.1523/JNEUROSCI.22-13-05749.2002

1236 Fang, F., Boyaci, H., Kersten, D., & Murray, S. O. (2008). Attention-dependent

1237 representation of a size illusion in human V1. Current Biology: CB, 18(21), 1707–

1238 1712. https://doi.org/10.1016/j.cub.2008.09.025

1239 Firestone, C., & Scholl, B. J. (2016). Cognition does not affect perception: Evaluating the

1240 evidence for “top-down” effects. The Behavioral and Brain Sciences, 39, e229.

1241 https://doi.org/10.1017/S0140525X15000965

1242 Fiser, A., Mahringer, D., Oyibo, H. K., Petersen, A. V., Leinweber, M., & Keller, G. B.

1243 (2016). Experience-dependent spatial expectations in mouse visual cortex. Nature

1244 Neuroscience, 19(12), 1658–1664. https://doi.org/10.1038/nn.4385

1245 Flossmann, T., & Rochefort, N. L. (2021). Spatial navigation signals in rodent visual cortex.

1246 Current Opinion in Neurobiology, 67, 163–173.

1247 https://doi.org/10.1016/j.conb.2020.11.004

1248 Fournier, J., Saleem, A. B., Diamanti, E. M., Wells, M. J., Harris, K. D., & Carandini, M.

1249 (2020). Mouse Visual Cortex Is Modulated by Distance Traveled and by Theta

1250 Oscillations. Current Biology, 30(19), 3811-3817.e6.

1251 https://doi.org/10.1016/j.cub.2020.07.006

1252 Fridman, M., & Petreanu, L. (2017). Cortical Processing: How Mice Predict the Visual

1253 Effects of Locomotion. Current Biology: CB, 27(23), R1272–R1274.

1254 https://doi.org/10.1016/j.cub.2017.10.038

31 1255 Friston, K. (2018). Does predictive coding have a future? Nature Neuroscience, 21(8), 1019–

1256 1021. https://doi.org/10.1038/s41593-018-0200-7

1257 Frith, C. (2013). Making up the Mind: How the Brain Creates Our Mental World. John Wiley

1258 & Sons.

1259 Frith, C. D. (2019). The neural basis of consciousness. Psychological Medicine, 1–13.

1260 Ghazanfar, A. A., & Schroeder, C. E. (2006). Is neocortex essentially multisensory? Trends

1261 in Cognitive Sciences, 10(6), 278–285. https://doi.org/10.1016/j.tics.2006.04.008

1262 Gilbert, W. (1600). De magnete. Peter Short.

1263 Glennerster, A. (2016). A moving observer in a three-dimensional world. Philosophical

1264 Transactions of the Royal Society B: Biological Sciences, 371(1697), 20150265.

1265 https://doi.org/10.1098/rstb.2015.0265

1266 Glickstein, M., & Fahle, M. (2000). T. Inouye: Visual disturbances following gunshot

1267 wounds of the cortical visual area (Translation). Brain, 123 (Supplementary Volume).

1268 Glickstein, Mitchell, & Whitteridge, D. (1987). Tatsuji Inouye and the mapping of the visual

1269 fields on the human cerebral cortex. Trends in Neurosciences, 10(9), 350–353.

1270 https://doi.org/10.1016/0166-2236(87)90066-X

1271 Gonzalez, F., & Perez, R. (1998). Modulation of cell responses to horizontal disparities by

1272 ocular vergence in the visual cortex of the awake Macaca mulatta monkey.

1273 Neuroscience Letters, 245(2), 101–104. https://doi.org/10.1016/s0304-

1274 3940(98)00191-8

1275 Goodale, M. A., & Milner, A. D. (1992). Separate visual pathways for perception and action.

1276 Trends in Neurosciences, 15(1), 20–25. https://doi.org/10.1016/0166-2236(92)90344-

1277 8

1278 Gregory, R. L., Wallace, J. G., & Campbell, F. W. (1959). Changes in the size and shape of

1279 visual after-images observed in complete darkness during changes of position in

32 1280 space. Quarterly Journal of Experimental Psychology, 11(1), 54–55.

1281 https://doi.org/10.1080/17470215908416288

1282 Grove, C., Cardinali, L., & Sperandio, I. (2019). Does Tool-Use Modulate the Perceived Size

1283 of an Afterimage During the Taylor Illusion? [Abstract ECVP Leuven 2019].

1284 Perception, 48, 181.

1285 He, D., Mo, C., Wang, Y., & Fang, F. (2015). Position shifts of fMRI-based population

1286 receptive fields in human visual cortex induced by Ponzo illusion. Experimental

1287 Brain Research, 233(12), 3535–3541. https://doi.org/10.1007/s00221-015-4425-3

1288 Henschen, S. E. (1890). Klinische und anatomische Beitrage zur Pathologie des Gehirns (Pt

1289 1). Almquist and Wiksell.

1290 Hering, E. (1879). Der Raumsinn und die Bewegungen der Auges. In L. Hermann: Handbuch

1291 der Physiologie 3 (Part 1). English translation, edited by C. A. Radde: Spatial Sense

1292 and Movements of the Eye. Am. .J. Optom., 1942.

1293 Hillis, J. M., Ernst, M. O., Banks, M. S., & Landy, M. S. (2002). Combining Sensory

1294 Information: Mandatory Fusion Within, but Not Between, Senses. Science,

1295 298(5598), 1627–1630. https://doi.org/10.1126/science.1075396

1296 Holmes, G. (1918). DISTURBANCES OF VISION BY CEREBRAL LESIONS. The British

1297 Journal of Ophthalmology, 2(7), 353–384.

1298 Holmes, G. (1919a). The cortical localization of vision. British Medical Journal, 2(3059),

1299 193–199.

1300 Hubel, D. H., & Wiesel, T. N. (1965). Receptive fields and functional architecture in two

1301 nonstriate visual areas (18 and 19) of the cat. Journal of Neurophysiology, 28(2),

1302 229–289.

1303 Ibrahim, L. A., Mesik, L., Ji, X.-Y., Fang, Q., Li, H.-F., Li, Y.-T., Zingg, B., Zhang, L. I., &

1304 Tao, H. W. (2016). Cross-Modality Sharpening of Visual Cortical Processing through

33 1305 Layer-1-Mediated Inhibition and Disinhibition. Neuron, 89(5), 1031–1045.

1306 https://doi.org/10.1016/j.neuron.2016.01.027

1307 Inouye, T. (1909). Die Sehstörungen bei Schussverletzungen de kortikalen Sehsphäre nach

1308 Beobachtungen an Versundeten der letzten Japanische Kriege. W. Engelmann.

1309 Ji, D., & Wilson, M. A. (2007). Coordinated memory replay in the visual cortex and

1310 hippocampus during sleep. Nature Neuroscience, 10(1), 100–107.

1311 https://doi.org/10.1038/nn1825

1312 Johnson, C. A., Wall, M., & Thompson, H. S. (2011). A history of perimetry and visual field

1313 testing. Optometry and Vision Science: Official Publication of the American Academy

1314 of Optometry, 88(1), E8-15. https://doi.org/10.1097/OPX.0b013e3182004c3b

1315 Jokl, D. H.-K., & Hiyama, F. (2007). Tatsuji Inouye-Topographer of the Visual Cortex.

1316 Exemplar of the Germany-Japan Ophthalmic Legacy of the Meiji Era. Neuro-

1317 Ophthalmology, 31(3), 33–43. https://doi.org/10.1080/01658100701436264

1318 Jordan, R., & Keller, G. B. (2020). Opposing Influence of Top-down and Bottom-up Input on

1319 Excitatory Layer 2/3 Neurons in Mouse Primary Visual Cortex. Neuron, 108(6),

1320 1194-1206.e5. https://doi.org/10.1016/j.neuron.2020.09.024

1321 Julesz, B. (1960). Binocular depth perception of computer-generated patterns. Bell System

1322 Technical Journal, 39, 1125–1162. https://doi.org/10.1002/j.1538-

1323 7305.1960.tb03954.x

1324 Julesz, B. (1971). Foundation of Cyclopean Perception. MIT Press.

1325 Julesz, B. (1986). Stereoscopic vision. Vision Research, 26(9), 1601–1612.

1326 https://doi.org/10.1016/0042-6989(86)90178-1

1327 Julesz, B. (1995). Dialogues on Perception. MIT Press.

1328 Julian, J. (2017). Anchoring The Cognitive Map To The Visual World. Publicly Accessible

1329 Penn Dissertations. https://repository.upenn.edu/edissertations/2376

34 1330 Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.

1331 Keller, G. B., Bonhoeffer, T., & Hübener, M. (2012). Sensorimotor Mismatch Signals in

1332 Primary Visual Cortex of the Behaving Mouse. Neuron, 74(5), 809–815.

1333 https://doi.org/10.1016/j.neuron.2012.03.040

1334 Keller, G. B., & Mrsic-Flogel, T. D. (2018). Predictive Processing: A Canonical Cortical

1335 Computation. Neuron, 100(2), 424–435. https://doi.org/10.1016/j.neuron.2018.10.003

1336 Kepler, J. (1604). Paralipomena to Witelo. In W. H. Donahue (Trans.), Optics: Paralipomena

1337 to Witelo and Optical Part of Astronomy. Green Lion Press, 2000.

1338 Knill, D. C., & Richards, W. (1996). Perception as Bayesian Inference. Cambridge

1339 University Press.

1340 Knotts, J. D., Michel, M., & Odegaard, B. (2020). Defending subjective inflation: An

1341 inference to the best explanation. Neuroscience of Consciousness, 2020(niaa025).

1342 https://doi.org/10.1093/nc/niaa025

1343 Koch, C., & Poggio, T. (1999). Predicting the visual world: Silence is golden. Nature

1344 Neuroscience, 2(1), 9–10. https://doi.org/10.1038/4511

1345 Koenig-Robert, R., & Pearson, J. (2021). Why do imagery and perception look and feel so

1346 different? Philosophical Transactions of the Royal Society B: Biological Sciences,

1347 376(1817), 20190703. https://doi.org/10.1098/rstb.2019.0703

1348 Koivisto, M., Mäntylä, T., & Silvanto, J. (2010). The role of early visual cortex (V1/V2) in

1349 conscious and unconscious visual perception. NeuroImage, 51(2), 828–834.

1350 https://doi.org/10.1016/j.neuroimage.2010.02.042

1351 Koivisto, M., & Revonsuo, A. (2010). Event-related brain potential correlates of visual

1352 awareness. Neuroscience & Biobehavioral Reviews, 34(6), 922–934.

1353 https://doi.org/10.1016/j.neubiorev.2009.12.002

35 1354 Kok, P., Bains, L. J., van Mourik, T., Norris, D. G., & de Lange, F. P. (2016). Selective

1355 Activation of the Deep Layers of the Human Primary Visual Cortex by Top-Down

1356 Feedback. Current Biology: CB, 26(3), 371–376.

1357 https://doi.org/10.1016/j.cub.2015.12.038

1358 Lamme, V. A. F. (2004). Separate neural definitions of visual consciousness and visual

1359 attention; a case for phenomenal awareness. Neural Networks, 17(5), 861–872.

1360 https://doi.org/10.1016/j.neunet.2004.02.005

1361 Lamme, V., & Roelfsema, P. (2000). The Distinct Modes of Vision Offered by Feedforward

1362 and Recurrent Processing. Trends in Neurosciences, 23, 571–579.

1363 https://doi.org/10.1016/S0166-2236(00)01657-X

1364 Lamme, Victor A. F. (2003). Why visual attention and awareness are different. Trends in

1365 Cognitive Sciences, 7(1), 12–18. https://doi.org/10.1016/s1364-6613(02)00013-x

1366 Lamme, Victor A. F. (2006). Towards a true neural stance on consciousness. Trends in

1367 Cognitive Sciences, 10(11), 494–501. https://doi.org/10.1016/j.tics.2006.09.001

1368 Lamme, Victor A. F. (2018). Challenges for theories of consciousness: Seeing or knowing,

1369 the missing ingredient and how to deal with panpsychism. Philosophical Transactions

1370 of the Royal Society B: Biological Sciences, 373(1755), 20170344.

1371 https://doi.org/10.1098/rstb.2017.0344

1372 Lamme, Victor A. F. (2020). Visual Functions Generating Conscious Seeing. Frontiers in

1373 Psychology, 11. https://doi.org/10.3389/fpsyg.2020.00083

1374 Lamme, Victor A. F, Supèr, H., Landman, R., Roelfsema, P. R., & Spekreijse, H. (2000). The

1375 role of primary visual cortex (V1) in visual awareness. Vision Research, 40(10),

1376 1507–1521. https://doi.org/10.1016/S0042-6989(99)00243-6

1377 Lau, H. (2019). Consciousness, Metacognition, & Perceptual Reality Monitoring. PsyArXiv.

1378 https://doi.org/10.31234/osf.io/ckbyf

36 1379 Leff, A. (2004). A historical review of the representation of the visual field in primary visual

1380 cortex with special reference to the neural mechanisms underlying macular sparing.

1381 Brain and Language, 88(3), 268–278. https://doi.org/10.1016/S0093-934X(03)00161-

1382 5

1383 Leff, A. P. (2015). Tatsuji Inouye (1881–1976). Journal of Neurology, 262(10), 2399–2400.

1384 https://doi.org/10.1007/s00415-015-7801-9

1385 Leinweber, M., Ward, D. R., Sobczak, J. M., Attinger, A., & Keller, G. B. (2017). A

1386 Sensorimotor Circuit in Mouse Cortex for Visual Flow Predictions. Neuron, 95(6),

1387 1420-1432.e5. https://doi.org/10.1016/j.neuron.2017.08.036

1388 Lindegger, D. J., & Pless, M. (2019). The Discovery of Visual Field Representation in the

1389 Brain. Journal of Ophthalmological Science, 1(1).

1390 https://www.ophthalmolscience.com/articles/the-discovery-of-visual-field-

1391 representation-in-the-brain.html

1392 Linton, P. (2017). The Perception and Cognition of Visual Space. Palgrave.

1393 Linton, P. (in prep). Depth from Disparity isn’t about 3D Shape.

1394 Linton, P. (2018a). Do We See Scale? https://www.biorxiv.org/content/10.1101/371948v1

1395 Linton, P. (2018f). Do We See Scale? Brains Blog.

1396 https://philosophyofbrains.com/2018/06/29/do-we-see-scale.aspx

1397 Linton, P. (2018e). Perceptual Idealism and Phenomenal Geometry. Brains Blog.

1398 https://philosophyofbrains.com/2018/06/28/perceptual-idealism-and-phenomenal-

1399 geometry.aspx

1400 Linton, P. (2018c). Perceptual Integration and Visual Illusions. Brains Blog.

1401 https://philosophyofbrains.com/2018/06/26/perceptual-integration-and-visual-

1402 illusions.aspx

37 1403 Linton, P. (2018d). Seeing Depth with One Eye and Pictorial Space. Brains Blog.

1404 https://philosophyofbrains.com/2018/06/27/seeing-depth-with-one-eye-and-pictorial-

1405 space.aspx

1406 Linton, P. (2018b). Visual Space and the Perception / Cognition Divide. Brains Blog.

1407 https://philosophyofbrains.com/2018/06/25/visual-space-and-the-perception-

1408 cognition-divide.aspx

1409 Linton, Paul. (2020a). Does vision extract absolute distance from vergence? Attention,

1410 Perception, & Psychophysics, 82(6), 3176–3195. https://doi.org/10.3758/s13414-020-

1411 02006-1

1412 Linton, Paul. (2020b). Eye Movements Do Not Affect Perceived Size. BioRxiv,

1413 2020.02.23.961649. https://doi.org/10.1101/2020.02.23.961649

1414 Linton, Paul. (2020c). No Vergence Size Constancy. Journal of Vision, 20(11), 1048–1048.

1415 https://doi.org/10.1167/jov.20.11.1048

1416 Linton, Paul. (2021a). Conflicting shape percepts explained by perception cognition

1417 distinction. Proceedings of the National Academy of Sciences, 118(10).

1418 https://doi.org/10.1073/pnas.2024195118

1419 Linton, Paul. (2021b). Does Vergence Affect Perceived Size? BioRxiv, 2020.02.23.961649.

1420 https://doi.org/10.1101/2020.02.23.961649

1421 Lisi, M., & Cavanagh, P. (2015). Dissociation between the Perceptual and Saccadic

1422 Localization of Moving Objects. Current Biology: CB, 25(19), 2535–2540.

1423 https://doi.org/10.1016/j.cub.2015.08.021

1424 Lisi, M., & Cavanagh, P. (2017). Different spatial representations guide eye and hand

1425 movements. Journal of Vision, 17(2), 12–12. https://doi.org/10.1167/17.2.12

1426 Lister, W. T., & Holmes, G. (1916). Disturbances of Vision from Cerebral Lesions, with

1427 Special Reference to the Cortical Representation of the Macula. Proceedings of the

38 1428 Royal Society of Medicine, 9(Sect_Ophthalmol), 57–96.

1429 https://doi.org/10.1177/003591571600901711

1430 Liu, S., Yu, Q., Tse, P. U., & Cavanagh, P. (2019). Neural Correlates of the Conscious

1431 Perception of Visual Location Lie Outside Visual Cortex. Current Biology, 29(23),

1432 4036-4044.e4. https://doi.org/10.1016/j.cub.2019.10.033

1433 Livingstone, M., & Hubel, D. (1988). Segregation of form, color, movement, and depth:

1434 Anatomy, physiology, and perception. Science (New York, N.Y.), 240(4853), 740–

1435 749. https://doi.org/10.1126/science.3283936

1436 Lotze, H. (1846). Seele und Seelenleben. In R. Wagner, Handwörterbuch der Physiologie,

1437 mit Rücksicht auf physiologische Pathologie. Vol. 3.1. Vieweg.

1438 Lou, L. (2007). Apparent Afterimage Size, Emmert’s Law, and Oculomotor Adjustment.

1439 Perception, 36(8), 1214–1228. https://doi.org/10.1068/p5513

1440 Lush, P., Botan, V., Scott, R. B., Seth, A. K., Ward, J., & Dienes, Z. (2020). Trait

1441 phenomenological control predicts experience of mirror synaesthesia and the rubber

1442 hand illusion. Nature Communications, 11(1), 4853. https://doi.org/10.1038/s41467-

1443 020-18591-6

1444 Macaluso, E., Frith, C. D., & Driver, J. (2000). Modulation of human visual cortex by

1445 crossmodal spatial attention. Science (New York, N.Y.), 289(5482), 1206–1208.

1446 https://doi.org/10.1126/science.289.5482.1206

1447 MacEvoy, S. P., & Fitzpatrick, D. (2006). Visual Physiology: Perceived Size Looms Large.

1448 Current Biology, 16(9), R330–R332. https://doi.org/10.1016/j.cub.2006.03.076

1449 Marg, E., & Adams, J. E. (1970). Evidence for a neurological zoom system in vision from

1450 angular changes in some receptive fields of single neurons with changes in fixation

1451 distance in the human visual cortex. Experientia, 26(3), 270–271.

1452 https://doi.org/10.1007/BF01900088

39 1453 Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and

1454 Processing of Visual Information. MIT Press.

Martuzzi, R., Murray, M. M., Michel, C. M., Thiran, J.-P., Maeder, P. P., Clarke, S., & Meuli, R. A. (2007). Multisensory interactions within human primary cortices revealed by BOLD dynamics. Cerebral Cortex, 17(7), 1672–1679. https://doi.org/10.1093/cercor/bhl077

Masuda, Y., Takemura, H., Terao, M., Miyazaki, A., Ogawa, S., Horiguchi, H., Nakadomari, S., Matsumoto, K., Nakano, T., Wandell, B. A., & Amano, K. (2021). V1 Projection Zone Signals in Human Macular Degeneration Depend on Task Despite Absence of Visual Stimulus. Current Biology. https://doi.org/10.1016/j.cub.2020.10.034

McClure, J. P., & Polack, P.-O. (2019). Pure tones modulate the representation of orientation and direction in the primary visual cortex. Journal of Neurophysiology, 121(6), 2202–2214. https://doi.org/10.1152/jn.00069.2019

Merabet, L. B., Hamilton, R., Schlaug, G., Swisher, J. D., Kiriakopoulos, E. T., Pitskel, N. B., Kauffman, T., & Pascual-Leone, A. (2008). Rapid and Reversible Recruitment of Early Visual Cortex for Touch. PLOS ONE, 3(8), e3046. https://doi.org/10.1371/journal.pone.0003046

Merabet, L., Thut, G., Murray, B., Andrews, J., Hsiao, S., & Pascual-Leone, A. (2004). Feeling by sight or seeing by touch? Neuron, 42(1), 173–179. https://doi.org/10.1016/s0896-6273(04)00147-3

Miller, G. A. (2003). The cognitive revolution: A historical perspective. Trends in Cognitive Sciences, 7(3), 141–144. https://doi.org/10.1016/S1364-6613(03)00029-9

Mon-Williams, M., & Tresilian, J. R. (1999). Some Recent Studies on the Extraretinal Contribution to Distance Perception. Perception, 28(2), 167–181. https://doi.org/10.1068/p2737

Mon-Williams, M., Tresilian, J. R., Plooy, A., Wann, J. P., & Broerse, J. (1997). Looking at the task in hand: Vergence eye movements and perceived size. Experimental Brain Research, 117(3), 501–506. https://doi.org/10.1007/s002210050246

Morales, J., Bax, A., & Firestone, C. (2020). Sustained representation of perspectival shape. Proceedings of the National Academy of Sciences, 117(26), 14873–14882. https://doi.org/10.1073/pnas.2000715117

Morales, J., Bax, A., & Firestone, C. (in press). Perspectival interference up close. Proceedings of the National Academy of Sciences.

Morales, J., & Firestone, C. (2021). A “living” supplement to: Morales, Bax, & Firestone (2020), Sustained Representation of Perspectival Shape, PNAS. PsyArXiv. https://doi.org/10.31234/osf.io/5kq24

Morrison, J. D., & Whiteside, T. C. (1984). Binocular cues in the perception of distance of a point source of light. Perception, 13(5), 555–566. https://doi.org/10.1068/p130555

Munk, H. (1881). Über die Funktionen der Grosshirnrinde. A. Hirschwald. (English translation in G. von Bonin (Ed.) (1960), The Cerebral Cortex (pp. 97–117). Springfield: Thomas.)

Murray, M. M., Thelen, A., Thut, G., Romei, V., Martuzzi, R., & Matusz, P. J. (2016). The multisensory function of the human primary visual cortex. Neuropsychologia, 83, 161–169. https://doi.org/10.1016/j.neuropsychologia.2015.08.011

Murray, S. O., Boyaci, H., & Kersten, D. (2006). The representation of perceived angular size in human primary visual cortex. Nature Neuroscience, 9(3), 429–434. https://doi.org/10.1038/nn1641

Nau, M., Julian, J. B., & Doeller, C. F. (2018). How the Brain’s Navigation System Shapes Our Visual Experience. Trends in Cognitive Sciences, 22(9), 810–825. https://doi.org/10.1016/j.tics.2018.06.008

Neisser, U. (1967). Cognitive Psychology. Appleton-Century-Crofts.

Newen, A., & Vetter, P. (2017). Why cognitive penetration of our perceptual experience is still the most plausible account. Consciousness and Cognition, 47, 26–37. https://doi.org/10.1016/j.concog.2016.09.005

Newton, I. (1687). Philosophiæ Naturalis Principia Mathematica. Royal Society.

Ni, A. M., Murray, S. O., & Horwitz, G. D. (2014). Object-centered shifts of receptive field positions in monkey primary visual cortex. Current Biology, 24(14), 1653–1658. https://doi.org/10.1016/j.cub.2014.06.003

Niell, C. M., & Stryker, M. P. (2010). Modulation of visual responses by behavioral state in mouse visual cortex. Neuron, 65(4), 472–479. https://doi.org/10.1016/j.neuron.2010.01.033

Nikara, T., Bishop, P. O., & Pettigrew, J. D. (1968). Analysis of retinal correspondence by studying receptive fields of binocular single units in cat striate cortex. Experimental Brain Research, 6(4), 353–372. https://doi.org/10.1007/BF00233184

Odegaard, B., Chang, M. Y., Lau, H., & Cheung, S.-H. (2018). Inflation versus filling-in: Why we feel we see more than we actually do in peripheral vision. Philosophical Transactions of the Royal Society B: Biological Sciences, 373(1755), 20170345. https://doi.org/10.1098/rstb.2017.0345

Odegaard, B., Knight, R. T., & Lau, H. (2017). Should a Few Null Findings Falsify Prefrontal Theories of Conscious Perception? Journal of Neuroscience, 37(40), 9593–9602. https://doi.org/10.1523/JNEUROSCI.3217-16.2017

Oizumi, M., Albantakis, L., & Tononi, G. (2014). From the Phenomenology to the Mechanisms of Consciousness: Integrated Information Theory 3.0. PLOS Computational Biology, 10(5), e1003588. https://doi.org/10.1371/journal.pcbi.1003588

O’Keefe, J., & Nadel, L. (1978). The Hippocampus as a Cognitive Map. Clarendon Press.

O’Keefe, J., & Nadel, L. (1979). Précis of O’Keefe & Nadel’s The hippocampus as a cognitive map. Behavioral and Brain Sciences, 2(4), 487–494. https://doi.org/10.1017/S0140525X00063949

O’Regan, J. K. (2011). Why Red Doesn’t Sound Like a Bell: Understanding the Feel of Consciousness. Oxford University Press.

Overgaard, M., Fehl, K., Mouridsen, K., Bergholt, B., & Cleeremans, A. (2008). Seeing without Seeing? Degraded Conscious Vision in a Blindsight Patient. PLOS ONE, 3(8), e3028. https://doi.org/10.1371/journal.pone.0003028

Padamsey, Z., & Rochefort, N. L. (2020). Defying Expectations: How Neurons Compute Prediction Errors in Visual Cortex. Neuron, 108(6), 1016–1019. https://doi.org/10.1016/j.neuron.2020.12.005

Pakan, J. M., Francioni, V., & Rochefort, N. L. (2018b). Action and learning shape the activity of neuronal circuits in the visual cortex. Current Opinion in Neurobiology, 52, 88–97. https://doi.org/10.1016/j.conb.2018.04.020

Pakan, J. M. P., Currie, S. P., Fischer, L., & Rochefort, N. L. (2018a). The Impact of Visual Cues, Reward, and Motor Feedback on the Representation of Behaviorally Relevant Spatial Locations in Primary Visual Cortex. Cell Reports, 24(10), 2521–2528. https://doi.org/10.1016/j.celrep.2018.08.010

Panizza, B. (1855). Osservazioni sul nervo ottico. Gior IR Ist Lomb Sci Lett Art., 7, 237–252.

Papathomas, T. V., Chubb, C., Gorea, A., & Kowler, E. (1995). Early Vision and Beyond. MIT Press.

Papineau, D. (2021). The Metaphysics of Sensory Experience. Oxford University Press.

Parker, A. J. (2007). Binocular depth perception and the cerebral cortex. Nature Reviews Neuroscience, 8(5), 379–391. https://doi.org/10.1038/nrn2131

Parker, A. J. (2014). Cortical pathways for binocular depth. In L. Chalupa & J. S. Werner (Eds.), The New Visual Neurosciences (pp. 397–408). MIT Press.

Parker, A. J. (2016). Vision in our three-dimensional world. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1697), 20150251. https://doi.org/10.1098/rstb.2015.0251

Parker, A. J., Smith, J. E. T., & Krug, K. (2016). Neural architectures for stereo vision. Philosophical Transactions of the Royal Society B: Biological Sciences, 371(1697), 20150261. https://doi.org/10.1098/rstb.2015.0261

Petro, L. S., Paton, A. T., & Muckli, L. (2017). Contextual modulation of primary visual cortex by auditory signals. Philosophical Transactions of the Royal Society B: Biological Sciences, 372(1714), 20160104. https://doi.org/10.1098/rstb.2016.0104

Pettigrew, J. D. (1965). Binocular interaction of single units of the striate cortex of the cat [BSc (Med) thesis, Department of Physiology]. University of Sydney.

Phillips, I. (2020). Blindsight is qualitatively degraded conscious vision. Psychological Review. https://doi.org/10.1037/rev0000254

Pollen, D. A. (1995). Cortical areas in visual awareness. Nature, 377(6547), 293–294. https://doi.org/10.1038/377293b0

Pollen, D. A. (1999). On the neural correlates of visual perception. Cerebral Cortex, 9(1), 4–19. https://doi.org/10.1093/cercor/9.1.4

Pollen, D. A. (2003). Explicit Neural Representations, Recursive Neural Networks and Conscious Visual Perception. Cerebral Cortex, 13(8), 807–814. https://doi.org/10.1093/cercor/13.8.807

Poort, J., Khan, A. G., Pachitariu, M., Nemri, A., Orsolic, I., Krupic, J., Bauza, M., Sahani, M., Keller, G. B., Mrsic-Flogel, T. D., & Hofer, S. B. (2015). Learning Enhances Sensory and Multiple Non-sensory Representations in Primary Visual Cortex. Neuron, 86(6), 1478–1490. https://doi.org/10.1016/j.neuron.2015.05.037

Pylyshyn, Z. (1999). Is vision continuous with cognition? The case for cognitive impenetrability of visual perception. Behavioral and Brain Sciences, 22(3), 341–365; discussion 366–423. https://doi.org/10.1017/s0140525x99002022

Qiu, F. T., & von der Heydt, R. (2005). Figure and Ground in the Visual Cortex: V2 Combines Stereoscopic Cues with Gestalt Rules. Neuron, 47(1), 155–166. https://doi.org/10.1016/j.neuron.2005.05.028

Ramsay, A. I. G., Carey, D. P., & Jackson, S. R. (2007). Visual-proprioceptive mismatch and the Taylor illusion. Experimental Brain Research, 176(1), 173–181. https://doi.org/10.1007/s00221-006-0609-1

Rao, R. P. N., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87. https://doi.org/10.1038/4580

Reddit. (2013). Copy-paste, no resize. https://www.reddit.com/r/illusionporn/comments/1mb61j/copypaste_no_resize_xpost_rwoahdude/ (last accessed 17th December 2020).

Rees, G. (2007). Neural correlates of the contents of visual awareness in humans. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1481), 877–886. https://doi.org/10.1098/rstb.2007.2094

Rees, G., Kreiman, G., & Koch, C. (2002). Neural correlates of consciousness in humans. Nature Reviews Neuroscience, 3(4), 261–270. https://doi.org/10.1038/nrn783

Richards, W. (1967). Apparent modifiability of receptive fields during accommodation and convergence and a model for size constancy. Neuropsychologia, 5(1), 63–72. https://doi.org/10.1016/0028-3932(67)90051-6

Richards, W. (1968). Spatial remapping in the primate visual system. Kybernetik, 4(4), 146–156. https://doi.org/10.1007/BF00288548

Riddoch, G. (1917). Dissociation of visual perceptions due to occipital injuries, with especial reference to appreciation of movement. Brain: A Journal of Neurology, 40, 15–57. https://doi.org/10.1093/brain/40.1.15

Roberts, R. E., Ahmad, H., Arshad, Q., Patel, M., Dima, D., Leech, R., Seemungal, B. M., Sharp, D. J., & Bronstein, A. M. (2017). Functional neuroimaging of visuo-vestibular interaction. Brain Structure and Function, 222(5), 2329–2343. https://doi.org/10.1007/s00429-016-1344-4

Roberts, R. E., Ahmad, H., Patel, M., Dima, D., Ibitoye, R., Sharif, M., Leech, R., Arshad, Q., & Bronstein, A. M. (2018). An fMRI study of visuo-vestibular interactions following vestibular neuritis. NeuroImage: Clinical, 20, 1010–1017. https://doi.org/10.1016/j.nicl.2018.10.007

Rockland, K., & Ojima, H. (2003). Multisensory convergence in calcarine visual areas in macaque monkey. International Journal of Psychophysiology, 50, 19–26. https://doi.org/10.1016/S0167-8760(03)00121-1

Roseboom, W., & Lush, P. (2020). Serious problems with interpreting rubber hand illusion experiments. PsyArXiv. https://doi.org/10.31234/osf.io/uhdzs

Sacks, O. (2006). Stereo Sue. The New Yorker, 64.

Sadato, N., Pascual-Leone, A., Grafman, J., Ibañez, V., Deiber, M. P., Dold, G., & Hallett, M. (1996). Activation of the primary visual cortex by Braille reading in blind subjects. Nature, 380(6574), 526–528. https://doi.org/10.1038/380526a0

Saleem, A. B. (2020). Two stream hypothesis of visual processing for navigation in mouse. Current Opinion in Neurobiology, 64, 70–78. https://doi.org/10.1016/j.conb.2020.03.009

Saleem, A. B., Ayaz, A., Jeffery, K. J., Harris, K. D., & Carandini, M. (2013). Integration of visual motion and locomotion in mouse visual cortex. Nature Neuroscience, 16(12), 1864–1869. https://doi.org/10.1038/nn.3567

Saleem, A. B., Diamanti, E. M., Fournier, J., Harris, K. D., & Carandini, M. (2018). Coherent encoding of subjective spatial position in visual cortex and hippocampus. Nature, 562(7725), 124–127. https://doi.org/10.1038/s41586-018-0516-1

Schindel, R., & Arnold, D. H. (2010). Visual Sensitivity Can Scale with Illusory Size Changes. Current Biology, 20(9), 841–844. https://doi.org/10.1016/j.cub.2010.02.068

Schwarzkopf, D. S. (2015). Where Is Size in the Brain of the Beholder? Multisensory Research, 28(3–4), 285–296. https://doi.org/10.1163/22134808-00002474

Self, M. W., van Kerkoerle, T., Supèr, H., & Roelfsema, P. R. (2013). Distinct Roles of the Cortical Layers of Area V1 in Figure-Ground Segregation. Current Biology, 23(21), 2121–2129. https://doi.org/10.1016/j.cub.2013.09.013

Serrano-Pedraza, I., Vancleef, K., & Read, J. C. A. (2016). Avoiding monocular artifacts in clinical stereotests presented on column-interleaved digital stereoscopic displays. Journal of Vision, 16(14), 13. https://doi.org/10.1167/16.14.13

Silvanto, J., Cowey, A., Lavie, N., & Walsh, V. (2007). Making the blindsighted see. Neuropsychologia, 45(14), 3346–3350. https://doi.org/10.1016/j.neuropsychologia.2007.06.008

Simons, D. J., Franconeri, S. L., & Reimer, R. L. (2000). Change Blindness in the Absence of a Visual Disruption. Perception, 29(10), 1143–1154. https://doi.org/10.1068/p3104

Sperandio, I., Chouinard, P. A., & Goodale, M. A. (2012). Retinotopic activity in V1 reflects the perceived and not the retinal size of an afterimage. Nature Neuroscience, 15(4), 540–542. https://doi.org/10.1038/nn.3069

Sperandio, I., Kaderali, S., Chouinard, P. A., Frey, J., & Goodale, M. A. (2013). Perceived size change induced by nonvisual signals in darkness: The relative contribution of vergence and proprioception. The Journal of Neuroscience, 33(43), 16915–16923. https://doi.org/10.1523/JNEUROSCI.0977-13.2013

Stoerig, P. (2001). The neuroanatomy of phenomenal vision: A psychological perspective. Annals of the New York Academy of Sciences, 929, 176–194. https://doi.org/10.1111/j.1749-6632.2001.tb05716.x

Stoerig, P., & Cowey, A. (1995). Visual perception and phenomenal consciousness. Behavioural Brain Research, 71(1), 147–156. https://doi.org/10.1016/0166-4328(95)00050-X

Strawson, G. (1994). Mental Reality. MIT Press.

Suzuki, K. (1986). Effects of oculomotor cues on the apparent size of afterimages. Japanese Psychological Research, 28(4), 168–175. https://doi.org/10.4992/psycholres1954.28.168

Tarr, M. J., & Bülthoff, H. H. (1998). Image-based object recognition in man, monkey and machine. Cognition, 67, 1–20.

Taylor, F. V. (1941). Change in size of the afterimage induced in total darkness. Journal of Experimental Psychology, 29(1), 75–80. https://doi.org/10.1037/h0058125

ten Oever, S., Romei, V., van Atteveldt, N., Soto-Faraco, S., Murray, M. M., & Matusz, P. J. (2016). The COGs (context, object, and goals) in multisensory processing. Experimental Brain Research, 234(5), 1307–1323. https://doi.org/10.1007/s00221-016-4590-z

Tolman, E. C. (1948). Cognitive maps in rats and men. Psychological Review, 55(4), 189–208. https://doi.org/10.1037/h0061626

Tong, F. (2003). Primary visual cortex and visual awareness. Nature Reviews Neuroscience, 4, 219–229. https://doi.org/10.1038/nrn1055

Tong, F. (2009). V1 and visual consciousness. In The Oxford Companion to Consciousness. Oxford University Press. https://www.oxfordreference.com/view/10.1093/acref/9780198569510.001.0001/acref-9780198569510-e-339

Tononi, G., & Edelman, G. M. (1998). Consciousness and Complexity. Science, 282(5395), 1846–1851. https://doi.org/10.1126/science.282.5395.1846

Travis, C. S. (2004). The Silence of the Senses. Mind, 113(449), 57–94. https://doi.org/10.1093/mind/113.449.57

Travis, C. S. (2013). The Silence of the Senses. In Perception: Essays After Frege (pp. 23–58). Oxford University Press.

Trommershauser, J., Kording, K., & Landy, M. S. (Eds.). (2011). Sensory Cue Integration. Oxford University Press.

Trotter, Y., Celebrini, S., Stricanne, B., Thorpe, S., & Imbert, M. (1992). Modulation of neural stereoscopic processing in primate area V1 by the viewing distance. Science, 257(5074), 1279–1281. https://doi.org/10.1126/science.1519066

Trotter, Y., Celebrini, S., Stricanne, B., Thorpe, S., & Imbert, M. (1996). Neural processing of stereopsis as a function of viewing distance in primate visual cortical area V1. Journal of Neurophysiology, 76(5), 2872–2885. https://doi.org/10.1152/jn.1996.76.5.2872

Trotter, Y., & Celebrini, S. (1999). Gaze direction controls response gain in primary visual-cortex neurons. Nature, 398, 239–242. https://doi.org/10.1038/18444

Trotter, Y., Stricanne, B., Celebrini, S., Thorpe, S., & Imbert, M. (1993). Neural processing of stereopsis as a function of viewing distance. Oxford University Press. https://www.oxfordscholarship.com/view/10.1093/acprof:oso/9780198547853.001.0001/acprof-9780198547853-chapter-027

Urist, M. J. (1959). Afterimages and ocular muscle proprioception. A.M.A. Archives of Ophthalmology, 61(2), 230–232. https://doi.org/10.1001/archopht.1959.00940090232006

van Atteveldt, N., Murray, M. M., Thut, G., & Schroeder, C. E. (2014). Multisensory integration: Flexible use of general operations. Neuron, 81(6), 1240–1253. https://doi.org/10.1016/j.neuron.2014.02.044

Vetter, P., Bola, Ł., Reich, L., Bennett, M., Muckli, L., & Amedi, A. (2020). Decoding Natural Sounds in Early “Visual” Cortex of Congenitally Blind Individuals. Current Biology, 30(15), 3039–3044.e2. https://doi.org/10.1016/j.cub.2020.05.071

Vetter, P., & Newen, A. (2014). Varieties of cognitive penetration in visual perception. Consciousness and Cognition, 27, 62–75. https://doi.org/10.1016/j.concog.2014.04.007

Vetter, P., Smith, F. W., & Muckli, L. (2014). Decoding sound and imagery content in early visual cortex. Current Biology, 24(11), 1256–1262. https://doi.org/10.1016/j.cub.2014.04.020

Wallach, H., & Zuckerman, C. (1963). The constancy of stereoscopic depth. The American Journal of Psychology, 76, 404–412.

Wang, C., Chen, X., & Knierim, J. J. (2020). Egocentric and allocentric representations of space in the rodent brain. Current Opinion in Neurobiology, 60, 12–20. https://doi.org/10.1016/j.conb.2019.11.005

Watkins, S., Shams, L., Josephs, O., & Rees, G. (2007). Activity in human V1 follows multisensory perception. NeuroImage, 37(2), 572–578. https://doi.org/10.1016/j.neuroimage.2007.05.027

Watkins, S., Shams, L., Tanaka, S., Haynes, J.-D., & Rees, G. (2006). Sound alters activity in human V1 in association with illusory visual perception. NeuroImage, 31(3), 1247–1256. https://doi.org/10.1016/j.neuroimage.2006.01.016

Weiskrantz, L. (2009). Is blindsight just degraded normal vision? Experimental Brain Research, 192(3), 413–416. https://doi.org/10.1007/s00221-008-1388-7

Welchman, A. E. (2011). Decoding the Cortical Representation of Depth. In J. Trommershauser, K. Kording, & M. S. Landy (Eds.), Sensory Cue Integration (pp. 346–367). Oxford University Press. https://oxford.universitypressscholarship.com/view/10.1093/acprof:oso/9780195387247.001.0001/acprof-9780195387247-chapter-19

Welchman, A. E. (2016). The Human Brain in Depth: How We See in 3D. Annual Review of Vision Science, 2(1), 345–376. https://doi.org/10.1146/annurev-vision-111815-114605

Welchman, A. E., & Kourtzi, Z. (2013). Linking brain imaging signals to visual perception. Visual Neuroscience, 30(5–6), 229–241. https://doi.org/10.1017/S0952523813000436

Yarbus, A. L. (1967). Eye Movements During Perception of Complex Objects. In Eye Movements and Vision (pp. 171–211). Springer US. https://doi.org/10.1007/978-1-4899-5379-7_8

Zago, S., Nurra, M., Scarlato, G., & Silani, V. (2000). Bartolomeo Panizza and the Discovery of the Brain’s Visual Center. Archives of Neurology, 57(11), 1642–1648. https://doi.org/10.1001/archneur.57.11.1642

Zannoli, M., Love, G. D., Narain, R., & Banks, M. S. (2016). Blur and the perception of depth at occlusions. Journal of Vision, 16(6), 17. https://doi.org/10.1167/16.6.17

Zeki, S., & Ffytche, D. H. (1998). The Riddoch syndrome: Insights into the neurobiology of conscious vision. Brain: A Journal of Neurology, 121(Pt 1), 25–45. https://doi.org/10.1093/brain/121.1.25

Zeng, H., Fink, G. R., & Weidner, R. (2020). Visual Size Processing in Early Visual Cortex Follows Lateral Occipital Cortex Involvement. Journal of Neuroscience, 40(22), 4410–4417. https://doi.org/10.1523/JNEUROSCI.2437-19.2020

Zenkin, G. M., & Petrov, A. P. (2015). Transformation of the Visual Afterimage Under Subject’s Eye and Body Movements and the Visual Field Constancy Mechanisms. Perception, 44(8–9), 973–985. https://doi.org/10.1177/0301006615594937

Zhaoping, L. (2014). Understanding Vision: Theory, Models, and Data. Oxford University Press.

Zhaoping, L. (2019). A new framework for understanding vision from the perspective of the primary visual cortex. Current Opinion in Neurobiology, 58, 1–10. https://doi.org/10.1016/j.conb.2019.06.001

Zmarz, P., & Keller, G. B. (2016). Mismatch Receptive Fields in Mouse Visual Cortex. Neuron, 92(4), 766–772. https://doi.org/10.1016/j.neuron.2016.09.057
