Supplementary Information

Theory of weakly coupled oscillators

Detailed reviews and mathematical descriptions of the theory, including its extensions and limitations, can be found in a number of publications [1–19]. In essence, many oscillatory phenomena in the natural world represent dynamic systems with a limit-cycle attractor [2]. Even though the underlying system might be complex (e.g. a neuron or neural population), the dynamics of the system can be reduced to a phase variable if the interaction strength among oscillators is weak. In general, coupled oscillators can interact through adjustments of their amplitudes and phases, yet if the interaction strength is weak, amplitude changes are small and play a minor role in the oscillatory dynamics (no strong deviation from the limit cycle), and only the adjustments of the phase are essential for understanding the behaviour. The transformation of a complex system to a phase variable, if valid, reduces the dimensionality of the problem, thereby allowing exact mathematical investigation. The manner in which mutually coupled oscillators adjust their phases, either by phase delay or by phase advance, is described by the phase response curve (also called phase resetting curve), the PRC [7,9,14–16,19–21]. The PRC is important because, if the PRC of a system can be described, the synchronization behaviour of the system can be understood and predicted.

Many biological oscillators are inherently noisy or chaotic. It is therefore important to take this variability of the dynamics into account for a better understanding of biological data. Cortical neurons in vivo exhibit irregular spiking patterns [24], and neural network oscillations also show significant variability over time [25]. This type of variation is referred to as phase noise and is distinct from measurement noise, the latter being unrelated to the dynamics of the system.
According to the theory of weakly coupled oscillators, the synchronization of two coupled oscillators can be predicted from the forces they exert on each other as a function of their instantaneous phase difference. This function is referred to as the phase response curve (PRC). Accordingly, the phase evolution of two given cortical V1 locations is reduced to:

28 1  112121121 ε H 

29 2  2  2ε 12H 12 2 1 2 30

where φ1,2 is the phase, φ̇1,2 its temporal derivative, ω1,2 the preferred frequency, ε12 and ε21 the interaction strengths, H12 and H21 the single PRCs, and η1,2 a phase-noise term with η1,2 ~ N(0, σ²), N being the normal distribution. The two equations, as given in the main text equation 1, can be further simplified to:

36 3    εG   37


where θ = φ1 − φ2 is the phase difference, ∆ω = ω1 − ω2 the detuning, εG(θ) = ε21H21(θ) − ε12H12(−θ) the combined interaction term, with ε being the interaction strength and G(θ) the mutual PRC, and η = η1 − η2 the phase noise with η ~ N(0, 2σ²).

Analytical derivation of phase-locking and mean phase difference

Equation 3 is a stochastic differential equation (a Langevin equation) and was solved as described in [1]. Equation 3 can be rewritten in the form of a Fokker-Planck equation, which describes analytically the evolution of the probability distribution P of a particle influenced by a drag force (first term on the right side of the equation) and a random Gaussian noise process (second term). The drag force is here the combined systematic force of the detuning ∆ω and the interaction function εG(θ):

$$\frac{\partial P(\theta,t)}{\partial t} = -\frac{\partial}{\partial \theta}\Big[\big(\Delta\omega + \varepsilon G(\theta)\big)\,P(\theta,t)\Big] + \sigma^{2}\,\frac{\partial^{2} P(\theta,t)}{\partial \theta^{2}} \tag{4}$$

The stationary (time-independent) solution of the Fokker-Planck equation is:

$$P(\theta) = \frac{1}{C}\int_{\theta}^{\theta+2\pi} \exp\!\left[\frac{V(\theta') - V(\theta)}{\sigma^{2}}\right] d\theta' \tag{5}$$

$$P(\theta) = \frac{1}{C}\int_{\theta}^{\theta+2\pi} \exp\!\left[-\frac{1}{\sigma^{2}}\int_{\theta}^{\theta'} \big(\Delta\omega + \varepsilon G(\psi)\big)\, d\psi\right] d\theta' \tag{6}$$

where C is a normalization constant defined by ∫ P(θ) dθ = 1 (integrated over one period) and V(θ) = −∫ (∆ω + εG(ψ)) dψ, integrated up to θ. V(θ) represents the influence of the systematic force as a function of phase difference. P(θ) is the phase difference probability distribution and describes how likely a particular phase difference is to occur. A uniform distribution means that every phase difference is equally likely and the oscillators are hence asynchronous. If the distribution approximates a delta distribution (meaning only one phase difference has non-zero probability), then the oscillators are completely synchronized. All distributions in between signify intermittent (partial) synchronization (also called cycle slipping or phase walk-through [1,2]). To quantify the narrowness of the distribution, we use the phase-locking value (the mean resultant vector length [3]), defined here as:

$$\mathrm{PLV} = \left|\int_{-\pi}^{\pi} P(\theta)\, e^{i\theta}\, d\theta\right| \tag{7}$$

Further, we were also interested in the mean phase difference, also described as the preferred phase difference, defined here as:

$$\bar{\theta} = \arg\!\left(\int_{-\pi}^{\pi} P(\theta)\, e^{i\theta}\, d\theta\right) \tag{8}$$
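As a purely illustrative sketch (not the code used for the reported results), the stationary distribution of equation 5 and the summary measures of equations 7 and 8 can be evaluated numerically on a grid of phase differences. The sketch assumes a sinusoidal interaction function G(θ) = sin(θ), for which V(θ) = −∆ω·θ + ε·cos(θ). The function names, the grid size n and the rectangle-rule integration are assumptions made for illustration only.

```python
import cmath
import math


def stationary_distribution(delta_omega, eps, sigma, n=360):
    """Stationary density P(theta) on a uniform grid, assuming G(theta) = sin(theta)."""
    d = 2.0 * math.pi / n
    theta = [-math.pi + k * d for k in range(n)]

    def potential(x):
        # V(x) = -integral of (delta_omega + eps*sin(x)) dx
        return -delta_omega * x + eps * math.cos(x)

    log_integral = []
    for th in theta:
        v0 = potential(th)
        # Exponents of the integrand over one full period starting at th.
        expo = [(potential(th + j * d) - v0) / sigma ** 2 for j in range(n)]
        top = max(expo)
        # Log-sum-exp keeps the rectangle-rule integral finite for small sigma.
        total = sum(math.exp(e - top) for e in expo) * d
        log_integral.append(top + math.log(total))

    shift = max(log_integral)
    p = [math.exp(v - shift) for v in log_integral]
    norm = sum(p) * d  # plays the role of the normalization constant C
    return theta, [v / norm for v in p]


def plv_and_mean_phase(theta, p):
    """Phase-locking value and mean phase difference of a density on a uniform grid."""
    d = theta[1] - theta[0]
    z = sum(pk * cmath.exp(1j * th) for th, pk in zip(theta, p)) * d
    return abs(z), cmath.phase(z)


# Hypothetical example values in arbitrary, mutually consistent units.
theta, p = stationary_distribution(delta_omega=0.0, eps=1.0, sigma=0.5)
plv, mean_phase = plv_and_mean_phase(theta, p)
print(plv, mean_phase)
```

Under these assumptions, setting ε = 0 yields a uniform density with a PLV close to zero, whereas ε > 0 with small detuning concentrates the density around the stable equilibrium.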

A phase difference between oscillators in neural networks implies spike timing differences. It has been shown that spike timing is an important characteristic in addition to spike synchrony [4–9].

General behaviour of the model in terms of PLV and mean phase difference

In Fig.S1 the model's behaviour is illustrated as a function of detuning ∆ω and interaction strength ε. To understand how the PLV and the mean phase difference change, it is illustrative to consider the noise-free case first. In the noise-free case one needs to solve equation 3 for zero-points (roots or equilibrium points), meaning that the time derivative of the phase difference is zero (θ̇ = 0, i.e. zero frequency difference). Let us assume here for illustration that G(θ) is a sine function, as is for example used for the Kuramoto model [10,11]. Equation 3 then becomes (also called the Adler equation [1]):

$$\dot{\theta} = 0 = \Delta\omega + \varepsilon \sin(\theta) \tag{9}$$

The zero-point or equilibrium is stable if the derivative of the PRC is negative (stable if dG(θ)/dθ < 0). The sine function has two zero crossings (at 0 and π with ∆ω = 0), but only one is stable. In the case ε = 0 (no interaction), only the condition ∆ω = 0 (no detuning) can lead to synchrony. If ε = 0 and ∆ω ≠ 0 (no interaction with detuning), the oscillators will be asynchronous and there will be linear phase precession with the speed determined by the value of ∆ω. If ε > 0 (there is interaction), then an equilibrium can only exist if the detuning is not stronger than the interaction strength, that is, |∆ω| ≤ ε. This is because for equilibrium the detuning needs to be counterbalanced by the interaction. In the case of |∆ω| > ε (detuning exceeding interaction strength), the interaction cannot counterbalance the detuning and the oscillators start to phase precess. If the oscillators do phase precess but are coupled (ε > 0), the phase precession is non-linear.
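The noise-free conditions described above can be checked with a small sketch. It assumes G(θ) = sin(θ) as in equation 9; the function names, the forward-Euler step size and the number of steps are hypothetical choices for illustration. The stable root is returned unwrapped, in the interval [π/2, 3π/2].

```python
import math


def adler_equilibria(delta_omega, eps):
    """Return (stable, unstable) roots of 0 = delta_omega + eps*sin(theta), or None."""
    if eps <= 0 or abs(delta_omega) > eps:
        return None  # no equilibrium: the oscillators phase precess
    unstable = math.asin(-delta_omega / eps)  # cos(theta) >= 0, slope eps*cos(theta) >= 0
    stable = math.pi - unstable               # cos(theta) <= 0, slope eps*cos(theta) <= 0
    return stable, unstable


def relax(theta0, delta_omega, eps, dt=1e-3, steps=20000):
    """Integrate the noise-free equation 9 with forward Euler; return the final phase."""
    theta = theta0
    for _ in range(steps):
        theta += dt * (delta_omega + eps * math.sin(theta))
    return theta


print(adler_equilibria(0.5, 1.0))  # inside the tongue: two roots
print(adler_equilibria(2.0, 1.0))  # outside the tongue: None
print(relax(0.0, 0.5, 1.0))        # settles near the stable root
```

Inside the synchronization region the integrated phase difference settles at the stable root, while for |∆ω| > ε no root exists and the phase difference keeps drifting.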
The precession rate is determined by the detuning (∆ω), the modulation shape (G(θ)), and the modulation amplitude (ε). In this case, the oscillators are in the intermittent synchronization regime. The transition point ∆ω = ε defines the border between high PLV values and low PLV values and defines the borders of the so-called Arnold tongue, the synchronization region in the ∆ω-ε parameter space [1,2]. The larger ε is, the larger the detuning can be for the oscillators to still synchronize. This leads to a triangular shape of the synchronization region in the ∆ω-ε space, as illustrated in Fig.S1A. The mean (preferred) phase difference can also be derived from equation 6. To reach equilibrium, the detuning ∆ω and the interaction term εsin(θ) need to counterbalance each other. For different detuning values ∆ω, the counterbalance is reached at different phase difference values of the interaction term εsin(θ). This can be represented graphically (Fig.S1B [2]) as the intersection of a horizontal line (∆ω) with the graph of εsin(θ). Notice that the stronger the interaction strength ε is, the smaller the slope of the detuning-to-phase-difference translation becomes. Outside of the Arnold tongue (∆ω > ε), there will be phase precession, yet with a preference (minimal precession rate) for a phase difference where the PRC is at its minimum or maximum.

Phase noise has important effects on the synchronization behaviour [1]. Strictly speaking, complete synchrony does not exist for noisy oscillators, as the phase difference cannot remain stable all the time (there will always be small fluctuations). If ∆ω + η < ε, meaning that in spite of frequency variability the detuning can be counterbalanced by the interaction strength, then the oscillators remain close to the equilibrium point and have high phase-locking. Yet, when noise is larger, it is likely that the interaction term cannot counterbalance the detuning all the time and the oscillators will phase precess from time to time. However, also in this case, the transition point ∆ω = ε still determines the Arnold tongue border and outlines the transition between the regions of high and low phase-locking, but the transition is smoother [1,2]. In Fig.S1 we show the Arnold tongue in terms of PLV and mean phase difference for coupled oscillators with moderate levels of white noise (σ = 15 Hz).

Biophysical modeling of coupled gamma-generating neural networks

To demonstrate that the results from the phase-oscillator equations generalize to more biophysically realistic neuronal network oscillations [12], we simulated two coupled excitatory-inhibitory spiking neural networks generating pyramidal-interneuron gamma (PING [13]) oscillations.

The neural voltage dynamics v were of the Izhikevich type [14] and defined as follows:

$$\dot{v} = 0.04v^{2} + 5v + 140 - u + I, \qquad \dot{u} = a\,(bv - u) \tag{10}$$

$$\text{if } v \geq 30\ \text{mV, then } v \leftarrow c,\quad u \leftarrow u + d \tag{11}$$

where u is the recovery variable, I the input, and a, b, c and d the model parameters. The coupled differential equations were solved numerically using the Euler method (1 ms step size). Both networks were composed of two types of neurons: 200 regular-spiking (RS) neurons (a=0.02, b=0.2, c=-65mV, d=8) and 50 fast-spiking (FS) interneurons (a=0.1, b=0.2, c=-65mV, d=2). RS neurons were excitatory and FS neurons inhibitory (ratio 4:1). The neural networks were all-to-all synaptically connected. Synapses were modelled as exponentially decaying functions, reset to 1 after the presynaptic neuron fired. Each connection type had a maximum synaptic connection strength (max syn). The synaptic strengths were drawn from a uniform random distribution between 0 and the maximal connection strength. Within a network, RS neurons projected excitatory AMPA synapses (decay constant = 2 ms) onto FS neurons (max syn = 0.45) and among themselves (max syn = 0.05).
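The single-neuron update of equations 10 and 11 can be sketched as follows. This is a minimal single-neuron illustration under stated assumptions, not the network simulation itself: synapses, noise and coupling are omitted, the function name is hypothetical, and a constant input of 10 is used for both cell types purely for comparison. The parameter values for RS and FS cells and the 1 ms Euler step are taken from the text.

```python
def simulate_izhikevich(a, b, c, d, current, t_ms=1000, dt=1.0):
    """Forward-Euler simulation of one Izhikevich neuron; returns spike times in ms."""
    v, u = c, b * c  # start at the reset potential
    spike_times = []
    for step in range(int(t_ms / dt)):
        if v >= 30.0:  # equation 11: spike detected, reset v and increment u
            spike_times.append(step * dt)
            v, u = c, u + d
        # equation 10, advanced by one Euler step of size dt
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
    return spike_times


# Regular-spiking (RS) and fast-spiking (FS) parameter sets from the text,
# both driven by a constant input of 10 (an assumption for this comparison).
rs_spikes = simulate_izhikevich(a=0.02, b=0.2, c=-65.0, d=8.0, current=10.0)
fs_spikes = simulate_izhikevich(a=0.1, b=0.2, c=-65.0, d=2.0, current=10.0)
print(len(rs_spikes), len(fs_spikes))
```

With identical drive, the FS parameter set fires at a higher rate than the RS set because its recovery variable decays faster and is incremented less per spike.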
FS neurons projected GABA-A synapses (decay constant = 8 ms) onto RS neurons (max syn = -0.35) and among themselves (max syn = -0.2). For cross-connections between the networks, we included RS→FS connections (E→I, max syn (default) = 0.015) and RS→RS connections (E→E, max syn (default) = 0.007). We did not include inter-network FS→FS or FS→RS connections, reflecting that V1 horizontal connectivity is dominated by excitatory connections originating from pyramidal cells [15–19].

The input drive to RS neurons was composed of a fixed input current to each neuron (= 10), Gaussian input noise unique to a given neuron (SD = 3) and Gaussian input noise shared among neurons of the same network (SD = 1). Thus each network received Gaussian input noise to RS neurons, with the effect of inducing instantaneous frequency variation in the network over time (similar to intrinsic phase noise in the phase-oscillator model). Each FS neuron received a fixed input current (= 4) and Gaussian input noise (SD = 3). FS neurons received further excitatory drive from the RS neurons. For estimating the instantaneous phase, phase difference and frequency of the network oscillation, we used a population signal defined as the mean membrane voltage of all RS neurons of a given network. We simulated in total n = 697 conditions (17 coupling and 41 detuning conditions) to compare them to analytical predictions.

Surgical procedures

Two adult male rhesus monkeys (Macaca mulatta) were used in this experiment. Two chambers were implanted above early visual cortex, one positioned over V1/V2 and the second over V4. For the experiment reported here we used data from the V1/V2 chamber only. A head post was implanted to head-fix the monkey during the experiment. All procedures were in accordance with the European Council Directive 2010/63/EU and the Dutch 'Experiments on Animals Act' (1997), and were approved by the Radboud University ethical committee on experiments with animals (Dier-Experimenten-Commissie, DEC).

Recording techniques

V1 recordings were made with Plexon U-probes (Plexon Inc.) consisting of 16 contacts (10 µm diameter, 0.5-1 MΩ impedance, and 150 µm inter-contact spacing). Three probes were inserted through a sharp guide tube, which was lowered through granulation tissue to just above the level of the dura surface. The probes were arranged linearly, separated from each other by ~2-3 mm. The probes were then advanced by separate microdrives (Nan Instruments LTD.). The probes were connected to headstages of high input impedance, and data were acquired via the Plexon 'Multichannel Acquisition system' (MAP, Plexon Inc.). The measured extracellular signal was filtered online between 150 Hz and 8 kHz to extract spiking activity and between 0.7 Hz and 300 Hz to obtain the local field potential (LFP). The signal was amplified and digitized at 1 kHz for the LFP and 40 kHz for the spike signal.
The data were converted from Plexon to Matlab file format and cut into trials from fixation onset to stimulus offset using the FieldTrip toolbox [20]. For the LFP data, line noise was removed using the FieldTrip DFT filter, which fits a sine and a cosine at 50, 100 and 150 Hz and subtracts these components from the data. We collected 7 recording sessions in monkey M1 and 6 sessions in M2. Each recording session had on average ~590 trials in M1 and ~718 trials in M2.

Current source density (CSD)

First, to extrapolate the CSD to the outermost contacts at the top and bottom of the probe, a replica of the LFP of the first and last contact, respectively, was appended [21]. The LFP was then smoothed with a Gaussian (zero-phase) filter with an SD of 1.2 and a range of 5 (effectively weighting signals around the centre electrode by 24% in the centre, 20% for immediate neighbours, 12% for contacts 2 positions away, and 5% for contacts 3 positions away). Then the standard CSD algorithm was applied for each contact position x, with our inter-contact spacing h of 150 µm and a conductivity C of 0.3 S/m:

$$\mathrm{CSD}(x) = -C \cdot \frac{\mathrm{LFP}(x-h) - 2\,\mathrm{LFP}(x) + \mathrm{LFP}(x+h)}{h^{2}} \tag{12}$$
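A minimal sketch of the second-spatial-derivative estimate in equation 12 for a single time sample is given below. It includes the replication of the outermost contacts but omits the Gaussian smoothing step. The function name is hypothetical, and the sign convention (negative conductivity times the second spatial difference) is the standard one, assumed here rather than confirmed by the text.

```python
def csd_profile(lfp, h=150e-6, conductivity=0.3):
    """Second-spatial-derivative CSD estimate for one time sample.

    lfp: potentials (in volts) at equally spaced contacts, ordered by depth.
    h: inter-contact spacing in metres; conductivity in S/m.
    Returns one CSD value (A/m^3) per contact.
    """
    # Replicate the outermost contacts so the estimate extends to the probe ends.
    padded = [lfp[0]] + list(lfp) + [lfp[-1]]
    return [
        -conductivity * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1]) / h ** 2
        for i in range(1, len(padded) - 1)
    ]


# Toy example with unit spacing: a quadratic depth profile has a constant
# second difference away from the (replicated) ends.
print(csd_profile([0.0, 1.0, 4.0, 9.0, 16.0], h=1.0))
```

In practice the same three-point formula would be applied to every time sample of the smoothed LFP, one depth profile at a time.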


We used CSD signals for the main analysis to reduce effects of volume conduction (see also section MUA-CSD and MUA-MUA analysis).

Receptive field mapping

Receptive fields (RFs) were mapped using both spiking and LFP information as described in [22]. Briefly, monkeys fixated centrally while high-contrast black and white squares of sizes 0.1-1 degree were presented pseudorandomly on a 10x10 grid. The locations where the spiking or the LFP response exceeded the 75th percentile of the response distribution were defined as the RF. Unlike in Roberts et al. [22], the LFP response was based on the envelope of the broadband gamma power (30-150 Hz) in the CSD, which we found to produce a localized result in line with spiking RFs. CSDs were computed as described above, but with a narrower Gaussian filter (SD 0.6, filter range 2), meaning that only the two neighbouring electrodes of the centre electrode had some influence on the RF estimate of a given contact. This was done to avoid mislocalization of RF shifts in size or position that are indicative of a shift to a different column or to V2 [23] (see Fig.S2B, rightmost plots, for an example of CSD and spiking RFs with such a shift). To obtain estimates of cortical distance (in mm) between the probes, we took advantage of the well-known retinotopy of V1. We measured the distance between RF centres and calculated the cortical distance by converting differences in visual degrees using a cortical magnification factor (CMF [24,25]). The CMF was estimated individually for each monkey using the measured physical distance between the laminar probes (fixed to the holder) before insertion into cortex (M1: ~2.7 mm/deg, M2: ~2.5 mm/deg).

Laminar alignment

We inserted the laminar probes on each recording day.
The exact laminar positions of the probes (Fig.S2) differed within and between sessions, and hence we depth-aligned the probes based on their stimulus-evoked response and inter-laminar coherence characteristics [26]. For depth alignment (to assign each contact a particular cortical depth value) we used the following procedure:

1. We computed the CSD-VEP response. The different sink-source profiles were aligned using a parallel-tempering technique [27]. This is an iterative procedure that minimizes the squared error between all probes, shifting the position of one probe by one position on each iteration. Central to the parallel tempering algorithm is the parallel start of the procedure at multiple "temperatures", each of which in our case starts with a different initial, random offset in the probes. Higher temperatures accept larger increases in error with a shift in the position of a probe. If a procedure running at a high temperature achieves a lower error than one at a lower temperature (overcoming a local minimum), it swaps the achieved shift vector with the lower temperature to find the new minimum around it. Similar to Godlove et al. [28] (who used a genetic algorithm), we implemented a lenient maximum shift constraint between electrodes (allowing shifts of 4 channels upwards and downwards, which for any two probes enforces a minimal overlap of 50%) to prevent trivial solutions. For our data, we used 3000 iterations at 4 different temperatures and different error tolerances per temperature (log-spaced between zero and 1). The procedure showed asymptotic behaviour (no further decrease in error) at ≤ 1500 iterations. Note that the optimal number of iterations required for this algorithm will depend on the number of probes/sessions entered.
2. We then computed the within-probe LFP coherence matrix [29]. It has been shown that there is a sharp decrease in coherence around the L4/L5 border [26]. We used this to refine the depth alignments of step 1, using the coherence matrix and again parallel tempering, with the initial values defined by the output of step 1. An advantage of the coherence matrix is that it is a robust feature and insensitive to possible gain differences among contacts.
3. We manually checked for outliers, of which none were found in this dataset.

Layer assignment

Channels were labelled as supragranular, granular and infragranular (Fig.S2B) based on the location of the initial sink-source reversal (as established by the position of the reversal in the aligned grand average) in relation to known anatomy. We consider the position of the sink-source reversal to correspond to the border between layers 4 and 5 [30,31]. Specifically, given our inter-contact spacing of 150 micrometres and a width of about 500 micrometres generally used per layer [26,32], channels from this border to 450 micrometres below it were labelled infragranular, channels up to 450 micrometres above it as granular, and channels 600 micrometres above it and higher as supragranular. Data were averaged within supragranular and granular layers or within infragranular layers, in agreement with the two separable sites of gamma-power synchronization as indicated in the text.

Definition of the V1-White Matter-V2 borders

The depth probes often collected signals beyond the lower border of V1 layer 6 and often reached the deep V2 infragranular layers.
When the probes reached deep V2, the RFs shifted abruptly by several degrees, as expected from V1-V2 retinotopy (Fig.S2B, rightmost plots) [23]. The white matter situated between the two areas appeared relatively thin, often comprising 1-2 contacts (150-300 microns).

To estimate the lower V1 layer 6 boundary, we first used spiking RFs to determine the transition. We computed an RF centre distance measure, referenced to the L4-L5 border, to determine at which contact the transition to deep V2 occurred. Before the transition, often 1 or 2 contacts did not show spike RFs at all and were thus likely to represent white matter. The V1 layer 6 border was then defined as the contact with the last low RF centre distance (threshold < 0.5 deg). In probes with low spiking quality, we used CSD signals filtered in the gamma range (30-150 Hz) to determine the V1 L6 border.

Single-session RF and CSD evaluation

For each session and probe, the CSD from full-screen checkerboard flashes (37), the task and RF data were plotted side by side. CSDs from flashes and the grating onset were very similar in the initial response (data not shown). The task data from a single, high-contrast condition were split into an early and a later half to detect any changes in depth over the session, and were also compared with flash CSDs before and after the task (where available). Recordings were stable in depth according to this measure. The RF mapping was used to detect changes in the size or location of RFs over depth and to ascertain that there were no gradual drifts in RF location, indicative of a probe not inserted fully orthogonal to cortex. In cases where noticeable shifts were observed, the affected deeper channels were removed from the analysis. The final cut-off between deep V1 and white matter/V2 was determined based on the distance from the layer 4/5 reversal (see Layer assignment). This border, 450 microns below the 4/5 reversal, was typically above the level where RF shifts were observed, leading to removal of further deep channels from the analysis.

Visual stimulation paradigm

The monkeys were trained to accept head-fixation and were placed in a Faraday-isolated darkened booth at a distance of 57 cm from a computer screen. Stimuli were presented on a Samsung TFT screen (SyncMaster 940bf, 38ºx30º, 60 Hz). The screen was calibrated to linearize luminance as a function of RGB values. During stimulation and pre-stimulus time the monkey maintained eye position (measured by infra-red camera, Arrington, 60 Hz sampling rate) within a square window of 2x2°. This window was relatively large to allow for noise associated with the camera; recording with a second high-speed, high-resolution camera showed that eye position was generally held more stable than the window required. The monkey was rewarded for keeping its gaze within the eye window during the whole trial.

We aimed to manipulate gamma frequency differences between three recorded locations in V1, each separated by ~2-3 mm, corresponding to receptive fields (RFs) separated by ~1 degree in visual space. The probes were arranged linearly, either perpendicular or parallel to the lunate sulcus; thus receptive fields were arranged either horizontally or vertically, respectively.
To manipulate gamma frequency differences, we manipulated local stimulus contrast differences in a large square-wave grating (2 cycles/degree, presented at two opposite phases, randomly interleaved). Contrast was varied smoothly between the three locations. The direction of the contrast difference was parallel to the arrangement of RFs and orthogonal to the orientation of the grating. To prevent the contrast manipulation from attracting exogenous and endogenous attention (possibly by appearing as an object or object boundary), we manipulated contrast differences in a repeating symmetric pattern over the entire screen. Additionally, the stimulus was isoluminant at all points and isoluminant with the pre-stimulus grey screen. The contrast at the location of the centre RF was constant over all conditions. We presented 8 levels of contrast difference and one stimulus where contrast was the same at all points. The exact contrasts differed slightly between the two monkeys since we used different screens (of the same type), which had somewhat different luminance levels. Contrast levels are given in Table S1.

We aimed to align the stimulus so that the receptive fields at the three cortical locations would align with the highest point, lowest point and midpoint of one cycle of the contrast variation. However, RFs did not always fall exactly as intended and there was often some variability in RFs within each probe. To get the best possible alignment in a given session, we placed the stimulus such that receptive fields from the upper portion of the central probe fell on the midpoint between the peak and trough of the contrast variation. We then selected a stimulus where the distance between the peak and trough best matched the distance between RFs from the flanking probes. In most cases this led to a peak-to-trough distance of 2 degrees. In some cases we used a distance of 1 or 3 degrees. In some sessions we recorded with only two probes in V1. In those cases the stimulus was aligned so that the midpoint was midway between the RFs of the two probes.

Most analyses were based on the measured gamma frequency rather than the stimulus contrast, so any mismatch between the stimulus contrast a particular RF received and the contrast we planned to present did not affect our conclusions. Where statistical analysis (see section 'Effects of visual contrast and eccentricity on gamma frequency' below) was based on stimulus contrast, we took the stimulus contrast present at the centre of the measured RF of each single electrode contact. For Fig.1 and Fig.S4 the data are shown binned by stimulus contrast values for illustration.

Analysis of L2-L4 and L5-L6 gamma-band synchronization

For the main analysis of synchronization, we limited the analysis to data recorded from L2-L4, representing most of the gamma power in V1 [22,26,32–34]. The lowest gamma power was observed around the L4-L5 border. We observed a second gamma peak around L5-L6 [32,34] and gamma power going into deep V2. To distinguish L6 from deep V2 we used marked receptive field shifts (as described above) as an indicator for the transition from V1 to V2.

We applied exactly the same analysis to quantify synchronization between pairs of L5-6 gamma as was used for L2-4 gamma (Fig.S3). We could confirm the observation of an Arnold tongue in terms of PLV and mean phase difference also for the deep gamma, showing that the observed synchronization properties generalize over different laminar compartments. We propose that calculating the PRC and Arnold tongue between various cortical locations would be a fruitful way to understand the connectivity between networks.
Effects of visual contrast and eccentricity on gamma frequency

Local stimulus contrast had a significant effect on the V1 gamma frequency (linear regression, M1: R² = 0.31, n = 1179; M2: R² = 0.25, n = 1134; both p < 10⁻¹⁰; see Fig.S4) in both monkeys M1 and M2, confirming previous studies of monkey and human visual cortex [22,35–39]. Stimulus contrast led to a monotonic increase of the frequency, here measured as the mean of the instantaneous gamma frequency (the same results were obtained using the conventional frequency of the power spectral peak). Both LFP and CSD gamma gave the same result. The MUA spike rate also increased significantly with stimulus contrast (linear regression, M1: R² = 0.14, n = 1179; M2: R² = 0.12, n = 1134; both p < 10⁻¹⁰), as is well established by previous work [40,41], suggesting that a likely source of the frequency change is a change in network excitation [13,42]. We inserted laminar probes acutely into the visual cortex, and the probes had, depending on their arrangement, differences in their visual eccentricities. There was also variation across sessions. It has been shown in previous work that the V1 gamma frequency is modulated by eccentricity [43,44]. We confirmed these observations. The gamma frequency decreased significantly with visual eccentricity (linear regression, M1: R² = 0.12, n = 1179; M2: R² = 0.15, n = 1134; both p < 10⁻¹⁰). We also observed that the MUA spike rate decreased with visual eccentricity (linear regression, M1: R² = 0.04, n = 1179; M2: R² = 0.08, n = 1134; both p < 10⁻¹⁰), similarly to gamma frequency. Frequency differences (detuning) between all V1 pairs were therefore a function of both stimulus contrast, being the strongest factor, and visual eccentricity (multiple linear regression, M1: ∆contrast, R² = 0.28, ∆eccentricity, R² = 0.09, n = 9632; M2: ∆contrast, R² = 0.25, ∆eccentricity, R² = 0.11, n = 7938; all p < 10⁻¹⁰).
We observed that the frequency difference was closely related to the MUA spike rate difference among probes (linear regression, M1: R² = 0.53, n = 9632; M2: R² = 0.36, n = 7938; both p < 10⁻¹⁰), indicating that gamma frequency differences (and hence detuning) between locations are related to excitability differences. The lower excitability in more eccentric locations could reflect network differences, or that the stimulus, with a spatial frequency of 2 cycles/degree, was better suited to more foveal sites.

Estimation of instantaneous gamma phase, frequency and amplitude

For quantifying the phase-locking value and the preferred phase difference we relied on the reconstruction of the instantaneous phase [45]. Methods based on the instantaneous phase deal better with non-stationary dynamics, which were present in the gamma-band signals investigated here. The main challenge is to decompose the often complex, multi-component measured LFP/CSD signal into a well-defined gamma oscillatory component from which the instantaneous phase can be extracted (i.e., after a Hilbert transform or directly from a time-frequency representation (TFR) [46]). We used a method based on the singular spectrum decomposition of the signal (SSD, see https://project.dke.maastrichtuniversity.nl/ssd/) [47]. SSD is a recently proposed method for the decomposition of nonlinear and non-stationary time series [47,48] in a completely data-driven manner. The method originates from singular spectrum analysis (SSA), which is a nonparametric spectral estimation method used for analysis and prediction of time series. For a given signal x(t) we applied SSD to each trial separately to extract the gamma oscillatory component (SSDγ). Here a short overview is presented; for more information see [47]. The following steps were implemented to retrieve the gamma oscillatory component SSDγ [48], where each iteration produces one component. The iteration stopped when 10 components were extracted or only 1% residual variance remained.

1. The signal x(t) is embedded, giving a trajectory matrix X:

$$X = \begin{pmatrix} x(1) & x(2) & \cdots & x(N) \\ x(2) & x(3) & \cdots & x(1) \\ \vdots & \vdots & \vdots & \vdots \\ x(M) & x(M+1) & \cdots & x(M-1) \end{pmatrix} \tag{13}$$

   Here N denotes the number of samples of x(t). Particular to the SSD approach, the embedding dimension M is estimated automatically, in a completely data-driven manner, as 1.2*Fs/fmax, with fmax being the dominant frequency in the power spectral density (PSD) of x(t) and Fs the sampling frequency. The factor 1.2 allows M to cover a time span 20% larger than the average period of the wanted component (to account for a variable period).

2. The singular value decomposition (SVD) of the trajectory matrix X is then computed:

$$X = U D V^{T} \tag{14}$$

3. Out of the M principal components of X, an approximated version of X is obtained by selecting those principal components with a dominant frequency in the range [fmax − δf, fmax + δf], where the width of the dominant peak δf is estimated by means of a Gaussian interpolation of the power spectral density of the time series x(t). The signal is then reconstructed by diagonal averaging. The reconstructed component signal is subtracted from the original signal and a new iteration of the steps is started.

The SSD procedure results in a set of components representing rhythmic variation of the signal with different dominant frequencies. We were interested in the component that represented the gamma band. We therefore selected the component with the largest fraction of spectral power at gamma frequencies (25-60 Hz). In most cases, there was one clear component representing gamma-band fluctuations in the LFP/CSD signals.

For deriving the instantaneous phase of an SSD component, the Hilbert transform (HT) was applied using the Matlab implementation.

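$$ \mathrm{SSD}^{a}_{\gamma}(t) = \mathrm{SSD}_{\gamma}(t) + i\,\mathrm{HT}\left(\mathrm{SSD}_{\gamma}(t)\right) \tag{15} $$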

where HT(SSDγ) is the Hilbert transform of the selected SSD gamma component. The HT of a real-valued signal is added as the imaginary component to the real-valued signal itself to obtain the analytic signal; SSDaγ is the analytic signal of SSDγ. The instantaneous phase φ can then easily be derived from the analytic signal:

$$ \varphi(t) = \mathrm{Arg}\left(\mathrm{SSD}^{a}_{\gamma}(t)\right) \tag{16} $$

Arg is the argument of the complex value SSDaγ. The instantaneous frequency (IF) can be determined as the derivative of the instantaneous phase. The phases need to be unwrapped before applying the derivative. However, the IF might exhibit strong outliers if the signal is noisy. We therefore used a Savitzky-Golay filter [49] to smooth the phase trajectory (and hence the IF) using a polynomial fitting approach (kernel=31ms).
The HT is a standard approach for reconstructing the instantaneous phase; however, a problem of the HT is its sensitivity to low SNR. We therefore used another approach for estimating the instantaneous phase that is more robust against noise but remains valid [50]. We approximated the instantaneous phase by using the time-frequency representation (TFR) of the signals based on Morlet wavelets [46]. This approach was used mainly for estimating phase-locking strength (PLV). The Morlet wavelet approach was defined as follows:

$$ W(t,\omega) = \int_{-\infty}^{\infty} \Psi^{*}(x - t, \omega)\, \mathrm{SSD}_{\gamma}(x)\, dx \tag{17} $$

where W(t,ω) is the wavelet coefficient of the gamma SSD component and Ψ*(t,ω) is the complex conjugate of the Morlet wavelet, both as a function of time t and frequency ω. Morlet wavelets were defined as:

$$ \Psi(t,\omega) = \sqrt{\omega}\; e^{i 2\pi\omega t}\; e^{-t^{2}/(2\sigma^{2})} \tag{18} $$

where σ defines the width of the wavelet, which also defines the number of cycles (nc=6fσ). Here we used 6 cycles. The argument of the complex wavelet coefficients gives the instantaneous phase for each frequency-time point:

$$ \varphi(t,\omega) = \mathrm{Arg}\left(W(t,\omega)\right) \tag{19} $$

Estimation of phase-locking strength and mean phase difference
The mean phase difference was defined as the mean circular phase difference between two oscillations (averaged in the complex domain), where θ = ϕ1 − ϕ2:

$$ \bar{\theta} = \mathrm{Arg}\left( \frac{1}{T} \sum_{t=1}^{T} e^{i\theta(t)} \right) \tag{20} $$

with a range of [-π, π]. Arg is the argument function and θ is the instantaneous phase difference derived from the Hilbert transform. For estimating phase-locking we computed the phase-locking value (PLV [3]) based on the instantaneous phase derived from the wavelet TFR. The PLV was computed by averaging the complex values with unit amplitude:

$$ \mathrm{PLV} = \left| \frac{1}{T} \sum_{t=1}^{T} e^{i\theta(t)} \right| \tag{21} $$

$$ \theta(t) = \varphi_{1}(t,\omega_{1}) - \varphi_{2}(t,\omega_{2}) \tag{22} $$

where T is the total number of time points (trials were concatenated). The frequencies ω1 and ω2 were chosen based on the frequency of the gamma spectral power peaks of the respective contacts. The PLV ranges from 1, corresponding to full phase consistency, to 0, corresponding to fully random phase relations. Importantly, the PLV measure allows the oscillations to have different frequencies (a form of cross-frequency coupling measure [50]). Both HT-PLV and wavelet TFR-PLV gave similar results. However, the wavelet TFR-PLV is more robust to SNR changes over different probes or sessions, and we chose this as our preferred method for the main analysis. The main results also did not depend on applying SSD, and similar results could be obtained by combining filtering and HT or wavelet TFR on raw signals. For the MUA signals analysed (see below) we used wavelet TFR-PLV on the raw MUA signals.

Correction for CSD-induced phase shifts
When applying CSD on a laminar probe, the resultant signal from a given contact will likely show a constant (artificial) phase shift relative to the phase of the original LFP. This is because the CSD computes the difference among nearby LFP contacts, which can change the polarity. For statistical analysis at the single-contact level these shifts are not problematic (as they are constant for a given contact pair), nor are they for the directionality measures, but they would give a scrambled picture for the Arnold tongue mapping, where all contact pairs are needed for analysis. To reduce the effect of the phase shifts, we normalized the phase differences for each given contact pair to the condition having the smallest frequency difference. Hence, for CSD the phase difference is by definition 0 at frequency difference (detuning) zero. This was done because gamma oscillations had zero phase difference at zero frequency difference, as shown by 1) LFP-LFP analysis and 2) confirmed by MUA-MUA analysis. An alternative correction of the CSD phase difference using the estimated time-lags from the PSI gave similar results.
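The circular averaging of equations 20 and 21 and the re-referencing of phase differences to the condition with the smallest frequency difference can be illustrated with a minimal Python sketch. This is not the Matlab code used for the analysis; the function names and the list-based data layout (one mean phase difference and one detuning value per condition of a single contact pair) are assumptions made for illustration only.

```python
import cmath


def wrap_to_pi(angle):
    """Wrap an angle (radians) into the interval [-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))


def mean_phase_difference(theta):
    """Circular mean of the phase differences (cf. equation 20)."""
    resultant = sum(cmath.exp(1j * th) for th in theta) / len(theta)
    return cmath.phase(resultant)


def phase_locking_value(theta):
    """Length of the mean resultant vector (cf. equation 21)."""
    resultant = sum(cmath.exp(1j * th) for th in theta) / len(theta)
    return abs(resultant)


def normalize_to_smallest_detuning(mean_phases, detunings):
    """Re-reference the mean phase differences of one contact pair.

    mean_phases[i] and detunings[i] belong to condition i (hypothetical
    layout). The mean phase of the condition with the smallest absolute
    frequency difference is subtracted from all conditions, so that this
    condition ends up at a phase difference of 0.
    """
    ref = min(range(len(detunings)), key=lambda i: abs(detunings[i]))
    offset = mean_phases[ref]
    return [wrap_to_pi(p - offset) for p in mean_phases]
```

For example, `normalize_to_smallest_detuning([1.0, 0.3, -0.5], [2.0, 0.1, -3.0])` takes the second condition as the reference and shifts all three mean phases by 0.3 rad.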


SNR estimation and SNR-correction
For experimental data it is important to consider (external) measurement noise. Measurement noise is noise that adds to the biological signal and is completely unrelated to the underlying dynamics. The amount of measurement noise is often expressed as the signal-to-noise ratio (SNR). Although the SNR of invasive LFP or MUA measurements is higher than that of non-invasive EEG/MEG measures, the SNR is still a limiting factor and needs to be considered for a better interpretation of the data. At low SNR, the PLV is largely underestimated. For example, an SNR of 3 can reduce the PLV by more than half. Further, it is also important for separating effects of true gamma amplitude from effects of SNR. A further important motivation for considering SNR correction was to be able to compare experimental PLV to the analytical predictions from the coupled oscillator equations, which are SNR free.
In the data the exact amount of biological signal and external noise is unclear and needs to be approximated. We approximated the gamma-band SNR by using the fact that most of the gamma power is induced by stimulation. We therefore compared gamma power during stimulation to gamma power during the baseline period. The power spectra in the baseline period looked similar to 1/f, indicating that the approximation is plausible. The gamma SNR was defined as follows:

(23) [equation not recoverable from the extracted text]

To obtain PLV values that are relatively SNR independent, we simulated artificial oscillatory synchronization data using phase-oscillator equations [50] for different SNR levels. We applied the exact same PLV estimation procedure as used for the experimental data and quantified how the SNR level changes the PLV estimate. The PLV estimates were compared to the analytically derived expected PLV obtained by solving the phase-oscillator equations. From these analyses we derived an SNR inverse function which gives a correction factor for the PLV measured at a particular SNR.
In addition, we performed the same procedure for the estimation of the interaction strength ε in experimental data, which is also sensitive to SNR. At low SNR, the interaction strength ε is underestimated. Here too we computed a correction factor based on simulated data with different levels of SNR.

Instantaneous frequency modulations by phase-difference
Synchronization counteracts the phase precession by either accelerating or decelerating the precession depending on the form of the phase-response curve (PRC). Hence, phase-difference-dependent frequency modulations are expected from synchronization theory. To quantify the phase-difference-dependent frequency modulation in simulation/experimental data, we first computed for each pair of oscillations their instantaneous phases and their derivative (instantaneous frequency, see Fig.S5). To estimate the modulation, we computed the mean instantaneous frequency for a given instantaneous phase difference. For this, we binned the instantaneous phase difference data into bins of equal size (bin size = 0.1rad), and for each bin we estimated the mean instantaneous frequency, here for contact 1:


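$$ \mathrm{IF}_{1}(\theta) = \frac{1}{T_{\theta}} \sum_{t_{\theta}=1}^{T_{\theta}} \mathrm{IF}_{1}(t_{\theta}) \tag{24} $$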

where IF is the instantaneous frequency, Tθ is the number of time points having phase difference θ, and tθ are the individual time points with phase difference θ.

The modulation of the instantaneous frequency difference between the two contacts is then:

$$ \Delta\mathrm{IF}(\theta) = \mathrm{IF}_{1}(\theta) - \mathrm{IF}_{2}(\theta) \tag{25} $$

Estimation of detuning value ∆ω
The intrinsic frequency, i.e. the frequency an oscillator would have without interactions with other oscillators, could not be directly measured experimentally. The simple mean (emergent) frequency difference between oscillations will change as a function of synchronization. The stronger the synchronization, the smaller the (emergent) frequency difference will become, up to the point where the oscillators are completely synchronized (common frequency).
Yet, the intrinsic frequency can be approximated from the phase-difference-dependent instantaneous frequency fluctuations. If there are no interactions among oscillators, the measured frequency is equal to the intrinsic frequency. However, if the oscillators synchronize, the instantaneous frequency will fluctuate as a function of the phase difference. At the preferred phase difference the IF difference between oscillators is minimal, whereas at the anti-preferred phase it is maximal. Importantly, if both the interaction strength and the PRC are similar for both oscillators, then the mean of ∆IF(θ) will be equal to the detuning. Hence, to derive the detuning, we first assumed that the interaction functions between oscillations were symmetric, which seems plausible considering the isotropic horizontal connectivity properties in V1 [18]. The detuning value was then defined as follows:

$$ \Delta\omega = \frac{1}{N} \sum_{\theta} \Delta\mathrm{IF}(\theta) \tag{26} $$

where the sum runs over the N phase-difference bins,

assuming that (1) ε12 ≈ ε21 and (2) H12 ≈ H21. The validity of the approach was tested using the phase-oscillator model as well as the coupled PING network model. In the former, the true detuning was a given parameter, and in the latter the detuning could be measured by decoupling the two PING networks. Both modelling types showed that the detuning could be robustly retrieved if interaction strengths were approximately symmetric. Equation 26 can be adapted to deal with cases of interaction asymmetry (ε12 ≠ ε21) by using the individual interaction strengths ε12 and ε21 for a weighted averaging. In our experimental V1 data we observed covariation and similarity of the individual interaction strengths ε12 and ε21, in line with the isotropic connectivity structure of V1. We observed systematic deviations from symmetry in cases where the PING networks or V1 contacts had large amplitude differences. Oscillation amplitude can influence the interaction strength ε, because a network that can send a larger number of more synchronized spikes to another network will have a stronger influence [51–53].

Estimation of interaction strength ε

A straightforward method is to estimate, for each contact pair, the modulation amplitude ε as (max−min)/2 of the modulation. Although the method works in many cases, especially for the PING simulation data, it is not very robust against low SNR and has a tendency to overestimate the interaction strength, as tested with phase-oscillator simulations where the true interaction is known. We therefore used another approach (used in the main analysis) based on the Fourier transform (FFT) of the modulation function:

$$ F(\omega) = \left\| \mathrm{FFT}\left( \Delta\mathrm{IF}(\theta) \right) \right\| \tag{27} $$

where ω is frequency. The first Fourier coefficient is the mean offset of the modulation. Since we observed an approximately sinusoidal shape of the frequency modulation that was periodic over a phase difference of 2π, the amplitude of the modulation is captured in the second Fourier coefficient. We also included the third Fourier coefficient to capture to some extent the asymmetries observed in the modulation shape. The higher Fourier coefficients should mainly represent noise. For estimating the modulation strength ε we summed the second and third Fourier coefficients and subtracted the estimated noise. This noise was assumed to be uniform across all Fourier components. It was therefore estimated as the mean amplitude of the second quadrant of the N Fourier coefficients (N defined by the number of phase bins of the modulation function).

$$ \varepsilon = F(2) + F(3) - \frac{4}{N} \sum_{k=N/4+1}^{N/2} F(k) \tag{28} $$

This gave much more robust estimates for lower-SNR data and reduced the tendency to overestimate lower interaction strengths. Rather, it had a weak tendency towards underestimation (especially if the modulation shape was more asymmetric). However, for both methods the estimate of the interaction strength ε systematically decreased with lower SNR.
For the experimental data, we scaled the ε values for the analytical predictions to account for the SNR in the macaque V1 data. For this, as described above, we estimated the known interaction strengths of the phase-oscillator data after adding different levels of external noise to mimic the SNR seen in the monkey data. This yielded a curve describing how the accuracy of the interaction strength estimate depends on SNR. The inverse of the curve gave us the rescaling values to compensate for SNR. Based on the estimated SNR of the monkey M1 and M2 data, we rescaled the estimate of ε to make it approximately SNR independent. For the main analysis, conditions with larger detuning were chosen to estimate the modulation function of instantaneous frequency by phase difference, because data points were more equally distributed over the different phases for these conditions. For each contact pair we used all conditions with a detuning value larger than 4Hz and took the mean of those estimated ε values.
In our experimental data we always had cases of large detuning, owing to our experimental manipulation of contrast difference. In cases where only small detuning values are recorded, we suggest that estimations of ε can be based on the absolute instantaneous frequency differences. This will avoid cancellation of fluctuations around zero, which would give a severe underestimation of the true underlying interaction strength. The price will be a tendency towards overestimation for very low interaction strengths.
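The Fourier-based estimate of ε described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the amplitude scaling (a sinusoid of amplitude A yields A), the exact index range taken as the "second quadrant", and the choice to subtract the noise floor once are all assumptions, and the function names are hypothetical. A direct DFT is used instead of an FFT routine to keep the sketch dependency-free.

```python
import cmath
import math


def fourier_amplitudes(values):
    """Amplitudes of the first N/2 + 1 Fourier coefficients of a real sequence.

    Computed with a direct DFT and scaled so that a sinusoid of amplitude A
    yields A at its frequency (this scaling is an assumption of the sketch).
    """
    n = len(values)
    amplitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * j / n)
                    for j, v in enumerate(values))
        # The mean term (and the Nyquist term for even n) has no mirror partner.
        unpaired = k == 0 or (n % 2 == 0 and k == n // 2)
        amplitudes.append(abs(coeff) * (1.0 if unpaired else 2.0) / n)
    return amplitudes


def estimate_epsilon(delta_if):
    """Interaction strength from the binned modulation function Delta IF(theta).

    delta_if holds one value per phase bin covering a full 2*pi cycle
    (at least 8 bins are assumed).
    """
    n = len(delta_if)
    amps = fourier_amplitudes(delta_if)
    # "Second" and "third" coefficients in 1-based counting: indices 1 and 2.
    signal = amps[1] + amps[2]
    # Noise floor: mean amplitude in the second quarter of the N coefficients.
    noise_band = amps[n // 4: n // 2]
    noise = sum(noise_band) / len(noise_band)
    return signal - noise
```

For a noise-free sinusoidal modulation such as ∆IF(θ) = 3 + 1.5 sin(θ) sampled on 64 equally spaced bins, this sketch returns a value very close to 1.5, because the offset sits in the first coefficient and the noise band is empty.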



Estimation of the mutual phase response curve G(θ)
Given that the detuning value (∆ω) and the interaction strength (ε) were estimated from the mean instantaneous frequency modulation by phase difference (∆IF(θ)), G(θ) can simply be estimated by the following equation:

$$ G(\theta) = \frac{\Delta\mathrm{IF}(\theta) - \Delta\omega}{\varepsilon} \tag{29} $$

This approach was tested using the phase-oscillator model, where G(θ) was a known function. Using the described approach, any shape of G(θ) could be estimated from the simulation data, assuming the oscillators were mutually connected with approximately similar interaction strengths (hence being symmetric).
G(θ) describes how the rate of phase precession (equivalent to the instantaneous frequency difference) between oscillators is altered as a function of phase difference, whereas the single PRC H(θ) describes how the phase evolution of a single oscillator (= instantaneous frequency) is altered as a function of phase difference. The H(θ) in the PING networks were asymmetric, with a stronger positive (advancing) component and a weak negative (delaying) component, hence being closer to the so-called Type 1 PRC [54,55]. When PING networks were unidirectionally coupled, they exhibited asymmetric Arnold tongues due to the asymmetric H(θ). Whether this applies to unidirectionally coupled brain areas needs to be tested. For the main analysis, the PING networks were mutually connected with the same strength. The resultant G(θ) therefore had symmetric negative and positive components.
For making predictions of the PLV or mean phase difference we assumed that the underlying interaction properties (the shape of the PRCs) did not change over the different conditions. We therefore used one estimate of the G(θ) function for the whole dataset for each monkey or for all PING simulations.
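The chain from binned instantaneous frequencies to G(θ) (equations 24 to 26 and 29) can be sketched as below. This is a minimal illustration, assuming that the phase differences are already wrapped to [-π, π], that the instantaneous frequencies of both contacts are given in Hz per time point, and that ε has been estimated separately; the function names and the default of 63 bins (roughly 0.1 rad per bin) are choices made for this sketch.

```python
import math


def bin_by_phase(theta, values, n_bins):
    """Mean of `values` per equal-width bin of theta over [-pi, pi]."""
    width = 2 * math.pi / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for th, v in zip(theta, values):
        # theta == pi is assigned to the last bin.
        idx = min(int((th + math.pi) / width), n_bins - 1)
        sums[idx] += v
        counts[idx] += 1
    # Empty bins are marked as NaN rather than filled in.
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


def estimate_g(theta, if_1, if_2, epsilon, n_bins=63):
    """Return (detuning, G) for one pair of oscillations.

    theta: phase difference per time point, wrapped to [-pi, pi].
    if_1, if_2: instantaneous frequency per time point for each contact.
    epsilon: interaction strength, estimated beforehand.
    """
    diffs = [a - b for a, b in zip(if_1, if_2)]
    delta_if = bin_by_phase(theta, diffs, n_bins)       # cf. equations 24, 25
    filled = [d for d in delta_if if not math.isnan(d)]
    detuning = sum(filled) / len(filled)                # cf. equation 26
    g = [(d - detuning) / epsilon for d in delta_if]    # cf. equation 29
    return detuning, g
```

Averaging the per-time-point difference within a bin gives the same result as subtracting the two binned means, since both contacts share the same phase-difference bins.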
To obtain a population estimate of G(θ) for a whole dataset we averaged the single absolute |G(θ)| from all contact pairs that had a sufficient level of detuning (|∆ω| > 4Hz). We took the absolute value to make G(θ) independent of sign and so avoid the estimates cancelling each other out during averaging. Taking only conditions where |∆ω| > 4Hz was necessary, first, to ensure low synchronization and therefore a more uniform phase-difference probability distribution, which eases the estimation of the instantaneous frequency difference for all phase differences. Second, the minima of G(θ) shifted in their mean (preferred) phase difference mainly within the detuning range of -4Hz to 4Hz. This would lead to smearing of the obtained population estimate. Restricting to conditions of |∆ω| > 4Hz ensured that the individual minima approximately overlapped.
While we used the population G(θ) for the main analysis, using a G(θ) from a single contact pair led to good prediction values for the whole dataset in many cases. However, there were also individual cases that deviated from the norm. Generally, estimation of G(θ) in contact pairs with low detuning that show a very high level of synchronization is difficult, as the oscillators remain constantly around their preferred phase relation. This was the case especially in PING simulations with low phase noise levels. In this case, perturbation techniques might be more appropriate. Further, in cases of strong amplitude differences between contacts, we observed asymmetries at the level of the single PRCs, leading to different G(θ) properties. Further, we observed small to moderate amplitude modulations as a function of phase difference (Fig.S6) that might have affected the shape of G(θ) of contact pairs with strong interaction values. The use of CSD for reducing volume conduction led to additional noise due to artificial phase shifts. Given all these considerations, single G(θ) estimates could be noisy for experimental single contact pairs. The population average G(θ) was a better representation of the interaction properties of V1 horizontal connections.
The use of IF(θ) for estimating a single PRC or G(θ) needs careful consideration. However, our work shows how much important information these modulations can contain about the underlying synchronization process. Our approach can be applied to other brain regions and frequency bands to improve understanding of the underlying synchronization properties.

Estimation of noise variance σ²
The estimation of the phase noise variance σ² of the noise process ƞ(t) from data is not trivial, due to measurement noise and the general complexity of LFP/CSD signals. The phase noise is intrinsic to the oscillatory process (dynamic noise), is relevant for understanding the dynamics, and is therefore distinct from external measurement noise. Phase noise implies variability in the instantaneous frequency of the oscillators (see equations 1-3), and the overall variability of the instantaneous frequency should scale with the noise variance σ². We approximated the noise variance σ² by determining in the phase-oscillator model what σ² value would produce the same instantaneous frequency difference distribution as observed in the PING or experimental V1 data. It is important to note that the observed frequency variance is not the same as the (intrinsic) variance going into equations 1-3.
This is because synchronization also counteracts the intrinsic variability. The procedure involved two main steps:
1. Estimate the (population average) standard deviation of the observed instantaneous frequency difference distribution of SSD gamma.
2. Use the phase-oscillator equations to find the value of σ that reproduces the observed standard deviation of the instantaneous frequency difference distribution, given the observed signal-to-noise level.

Analytical predictions for PING and experimental V1 gamma
Using equations 5 and 6 and the G(θ) and noise variance σ² estimated from the data, we could make predictions for any value of detuning Δω and interaction strength ε. The phase difference probability distribution was analytically predicted, from which we quantified the phase-locking strength (PLV, see equation 7) and the mean phase difference (equation 8). The predicted PLV and mean phase difference were compared to the observed PLV and mean phase difference from the data with the same detuning and interaction strength.
For PING networks we modulated the interaction strength ε by changing the inter-network connectivity strength, and the detuning Δω by giving different excitatory input drive to the two networks. For each simulation we had an estimate of interaction strength ε and detuning Δω.
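The two-step search for σ described in the section on noise variance can be sketched as a simulation-matching procedure. The sketch below makes several assumptions that are not taken from the source: the phase difference is modelled by a single Langevin equation with a sinusoidal G(θ) = −sin(θ), the smoothing is a plain 31-sample moving average standing in for the Savitzky-Golay step, measurement noise is omitted, and the simulated standard deviation is assumed to increase with σ so that bisection applies. All names and default parameter values are hypothetical.

```python
import math
import random


def simulated_if_diff_std(sigma, detuning=2.0, epsilon=1.0, dt=0.001,
                          n_steps=5000, window=31, seed=0):
    """Std (Hz) of the smoothed instantaneous frequency difference.

    Euler-Maruyama integration of
        d(theta)/dt = 2*pi*(detuning - epsilon*sin(theta)) + sigma*noise,
    i.e. an assumed sinusoidal G(theta); detuning and epsilon are in Hz.
    """
    rng = random.Random(seed)  # same noise realization for every sigma
    theta = 0.0
    inst_freq = []
    for _ in range(n_steps):
        drift = 2 * math.pi * (detuning - epsilon * math.sin(theta))
        step = drift * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        theta += step
        inst_freq.append(step / (2 * math.pi * dt))
    # Moving average as a simple stand-in for the smoothing of the IF.
    running = sum(inst_freq[:window])
    smoothed = [running / window]
    for i in range(window, n_steps):
        running += inst_freq[i] - inst_freq[i - window]
        smoothed.append(running / window)
    mean = sum(smoothed) / len(smoothed)
    return math.sqrt(sum((s - mean) ** 2 for s in smoothed) / len(smoothed))


def fit_sigma(target_std, lo=0.0, hi=10.0, n_iter=30):
    """Bisection for the sigma whose simulated std matches target_std.

    Assumes target_std lies between the simulated values at lo and hi.
    """
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if simulated_if_diff_std(mid) < target_std:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In practice `target_std` would be the population-average standard deviation from step 1, and the simulation would use the estimated G(θ), detuning and interaction strength instead of the placeholder values used here.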


For experimental V1 gamma, interaction strength ε was modulated by cortical distance, and detuning Δω by local contrast and, to a weaker extent, eccentricity (see above). For each contact pair, we had its interaction strength ε and its detuning Δω.

Mapping of the Arnold tongue
For PING data, we mapped the data according to their detuning and inter-network synaptic connectivity strength. For data corresponding to a particular connectivity strength, we estimated the interaction strength ε and used these values for the rescaling of the y-axis, because the interaction strength ε was the parameter we wanted to compare to the theoretical model.
For experimental V1, we binned the contact pairs according to their detuning (±bin size=0.35Hz, bin steps=0.2Hz) and cortical distance (±bin size=0.6mm, bin steps=0.3mm). These averaged data, binned according to detuning and cortical distance, are what is termed population data. For data corresponding to a particular bin, we estimated the interaction strength ε and used these values for the rescaling of the y-axis. This was done to make sure that the interaction strength ε dimension was independent of the detuning ∆ω dimension, because binning directly by interaction strength ε and detuning had a potential risk of inducing dependencies between dimensions (e.g., due to SNR fluctuations), as both were based on estimation of ∆IF(θ).

Evaluation of prediction accuracy of analytical model
We estimated the accuracy of the model predictions using the coefficient of determination, R², for phase-locking strength (PLV) and the mean phase difference. Notice that here we evaluated the model accuracy without optimizing the parameters to enhance fitting.

$$ R^{2} = 1 - \frac{SS_{res}}{SS_{tot}} \tag{30} $$

SSres is the sum of squares of the prediction error (the residuals, i.e. the difference between observed and predicted data), and SStot is the sum of squares of the demeaned observed data.
For the PING networks we observed that for both PLV (R²=0.93, n=697) and mean phase difference (R²=0.94, n=697) the model predictions explained a large and significant part of the variance.
For experimental V1 gamma data we observed that the model predictions also captured a large and significant part of the PLV variance (population level: M1: R²=0.88, n=638, M2: R²=0.9, n=638; single-contact level: M1: R²=0.18, n=9632, M2: R²=0.32, n=7938) and of the mean phase difference variance in both monkeys (population level: M1: R²=0.94, n=638, M2: R²=0.88, n=638; single-contact level: M1: R²=0.56, n=9632, M2: R²=0.27, n=7938). The population-level data represent the single-contact data binned and averaged according to detuning and cortical distance (see section Mapping of the Arnold tongue).

Instantaneous amplitude modulation by phase difference
We also investigated whether instantaneous gamma amplitude changed as a function of phase difference (Fig.S6). In both the PING networks and V1 gamma oscillations, we observed small to moderate amplitude modulation (up to ~15% modulation from mean amplitude). The modulations observed in the PING model looked strikingly similar to the V1 gamma amplitude modulations (compare Fig.S6A/B with Fig.S6C). These modulations are not expected from the weakly coupled oscillator theory, but as mentioned above (section Estimation of detuning value ∆ω), oscillation amplitude can influence the interaction strength ε, as more synchronized spikes are more effective in influencing receiving neurons. This might affect synchronization behaviour for phase-locking and mean phase difference, especially if amplitude modulations become more substantial. Future work should explicitly account for the effect of amplitude [56].

Multiple regression fitting of PING and experimental V1 data
We tested what factors determine variation of gamma PLV and mean phase difference using multiple linear regression including detuning, interaction strength/cortical distance, amplitude difference and their interactions. We did this for both the PING network data and the experimental V1 gamma data (Fig.S7). For PLV the strongest predictive factors were detuning Δω and interaction strength ε in both PING and experimental data. For experimental V1 data, using cortical distance instead of interaction strength led to similar results. The effect of amplitude difference (as well as total amplitude) was significant, but minor in comparison to the other two factors.
For mean phase difference, detuning Δω was the largest factor in both PING and experimental V1 data. Interaction strength ε had no main effect on mean phase difference, but the interaction between detuning and interaction strength was highly significant. This was expected from theory, because interaction strength changes the slope of the detuning-to-phase difference translation [12,57–60]. This was observed in both PING and experimental V1 data.
The results show generally striking similarity between the synchronization behaviour of two weakly coupled PING networks and the synchronization of nearby V1 gamma locations. The gamma amplitude did affect the synchronization behaviour, but only to a minor extent (although more strongly in some conditions than in others), which is not predicted by the theory of weakly coupled oscillators. However, theories dealing explicitly with amplitude effects [56] might be applied for better prediction of V1 gamma synchronization.

MUA-CSD and MUA-MUA analysis
We also computed gamma PLV and the mean phase difference using multi-unit activity (MUA) by computing both CSD-MUA locking and MUA-MUA locking. The MUA represents a local population spike rate signal and is thought to reflect more the ‘output’ of the network, whereas LFP/CSD represents the synaptic input of the network [61,62]. Further, in the main analysis we estimated the gamma synchronization behaviour by using current-source density (CSD) signals derived from our V1 16-contact laminar probes. The important advantage of CSD compared to the local field potential (LFP) is the strong reduction of volume conduction, which would otherwise bias the PLV as well as the mean phase difference more substantially the closer the laminar probes are. The local second spatial derivative across nearby contacts on the laminar probes used for deriving the CSD [21,63] reduced the effect of far electrical fields. However, application of CSD likely cannot completely eliminate the influence of volume conduction between very near probes. Therefore we used the more local MUA signal to test whether we could confirm a similar gamma synchronization behaviour as observed with CSD. A disadvantage of the MUA signal in our recording data was its much lower SNR compared to the LFP/CSD signals. We analysed the aggregate MUA signals of all L2-L4 contacts of a single laminar probe, and converted the spikes into spike densities smoothed using a Gaussian filter (σ=4ms). In Fig.S8 the results of the MUA-CSD and MUA-MUA analysis are illustrated, showing that the Arnold tongue in terms of PLV and mean phase difference could be observed in MUA-CSD as well as MUA-MUA signals. The results show that similar gamma synchronization behaviour in V1 can be observed at the level of CSD, representing mainly synaptic inputs, and spiking data, representing neural output. They also show that volume conduction, already minimized for CSD signals, cannot be an influential determinant.

References
1. Pikovsky, A., Rosenblum, M., Kurths, J. & Hilborn, R. C. Synchronization: A Universal Concept in Nonlinear Science. Am. J. Phys. 70, 655 (2002).
2. Izhikevich, E. M. Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting. Dyn. Syst. 25, (2007).
3. Lachaux, J. P., Rodriguez, E., Martinerie, J. & Varela, F. J. Measuring phase synchrony in brain signals. Hum. Brain Mapp. 8, 194–208 (1999).
4. Masquelier, T., Hugues, E., Deco, G. & Thorpe, S. J. Oscillations, phase-of-firing coding, and spike timing-dependent plasticity: an efficient learning scheme. J. Neurosci. 29, 13484–13493 (2009).
5. Markram, H., Gerstner, W. & Sjöström, P. J. Spike-timing-dependent plasticity: a comprehensive overview. Front. Synaptic Neurosci. 4, 2 (2012).
6. Tiesinga, P., Fellous, J.-M. & Sejnowski, T. J. Regulation of spike timing in visual cortical circuits. Nat. Rev. Neurosci. 9, 97–107 (2008).
7. Dan, Y. & Poo, M. M. Spike timing-dependent plasticity of neural circuits. Neuron 44, 23–30 (2004).
8. London, M. & Häusser, M. Dendritic computation. Annu. Rev. Neurosci. 28, 503–32 (2005).
9. Heitmann, S., Boonstra, T. & Breakspear, M. A dendritic mechanism for decoding traveling waves: principles and applications to motor cortex. PLoS Comput. Biol. 9, e1003260 (2013).
10. Breakspear, M., Heitmann, S. & Daffertshofer, A. Generative models of cortical oscillations: neurobiological implications of the Kuramoto model. Front. Hum. Neurosci. 4, 190 (2010).
11. Strogatz, S. H. From Kuramoto to Crawford: exploring the onset of synchronization in populations of coupled oscillators. Phys. D Nonlinear Phenom. 143, 1–20 (2000).
12. Lowet, E. et al. Input-Dependent Frequency Modulation of Cortical Gamma Oscillations Shapes Spatial Synchronization and Enables Phase Coding. PLoS Comput. Biol. 11, e1004072 (2015).
13. Tiesinga, P. & Sejnowski, T. J. Cortical enlightenment: are attentional gamma oscillations driven by ING or PING? Neuron 63, 727–732 (2009).
14. Izhikevich, E. M. Simple model of spiking neurons. IEEE Trans. Neural Netw. 14, 1569–1572 (2003).
15. Boucsein, C., Nawrot, M. P., Schnepel, P. & Aertsen, A. Beyond the cortical column: abundance and physiology of horizontal connections imply a strong role for inputs from the surround. Front. Neurosci. 5, 32 (2011).
16. Angelucci, A. & Bullier, J. Reaching beyond the classical receptive field of V1 neurons: Horizontal or feedback axons? J. Physiol. Paris 97, 141–154 (2003).
17. Bosking, W. H., Zhang, Y., Schofield, B. & Fitzpatrick, D. Orientation Selectivity and the Arrangement of Horizontal Connections in Tree Shrew Striate Cortex. J. Neurosci. 17, 2112–2127 (1997).
18. Stettler, D. D., Das, A., Bennett, J. & Gilbert, C. D. Lateral Connectivity and Contextual Interactions in Macaque Primary Visual Cortex. Neuron 36, 739–750 (2002).
19. Angelucci, A. et al. Circuits for local and global signal integration in primary visual cortex. J. Neurosci. 22, 8633–46 (2002).
20. Oostenveld, R., Fries, P., Maris, E. & Schoffelen, J.-M. FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput. Intell. Neurosci. 2011, 156869 (2011).
21. Vaknin, G., DiScenna, P. G. & Teyler, T. J. A method for calculating current source density (CSD) analysis without resorting to recording sites outside the sampling volume. J. Neurosci. Methods 24, 131–135 (1988).
22. Roberts, M. J. et al. Robust gamma coherence between macaque V1 and V2 by dynamic frequency matching. Neuron 78, 523–36 (2013).
23. Gattass, R., Gross, C. G. & Sandell, J. H. Visual topography of V2 in the macaque. J. Comp. Neurol. 201, 519–39 (1981).
24. Sereno, M. et al. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging. Science 268, 889–893 (1995).
25. Schwartz, E. L. Computational anatomy and functional architecture of striate cortex: A spatial mapping approach to perceptual coding. Vision Res. 20, 645–669 (1980).
26. Maier, A., Adams, G. K., Aura, C. & Leopold, D. A. Distinct superficial and deep laminar domains of activity in the visual cortex during rest and stimulation. Front. Syst. Neurosci. 4, (2010).
27. Frenkel, D. & Smit, B. Understanding molecular simulation: from algorithms to applications. (Academic Press, 2001).
28. Godlove, D. C., Maier, A., Woodman, G. F. & Schall, J. D. Microcircuitry of agranular frontal cortex: testing the generality of the canonical cortical microcircuit. J. Neurosci. 34, 5355–69 (2014).
29. Carter, G., Knapp, C. & Nuttall, A. Estimation of the magnitude-squared coherence function via overlapped fast Fourier transform processing. IEEE Trans. Audio Electroacoust. 21, (1973).
30. Mitzdorf, U. & Singer, W. Laminar segregation of afferents to lateral geniculate nucleus of the cat: an analysis of current source density. J. Neurophysiol. 40, 1227–1244 (1977).
31. Schroeder, C. E., Tenke, C. E., Givre, S. J., Arezzo, J. C. & Vaughan, H. G. Striate cortical contribution to the surface-recorded pattern-reversal VEP in the alert monkey. Vision Res. 31, 1143–1157 (1991).
32. van Kerkoerle, T. et al. Alpha and gamma oscillations characterize feedback and feedforward processing in monkey visual cortex. Proc. Natl. Acad. Sci. 111, 14332–14341 (2014).
33. Buffalo, E. A., Fries, P., Landman, R., Buschman, T. J. & Desimone, R. Laminar differences in gamma and alpha coherence in the ventral stream. Proc. Natl. Acad. Sci. U. S. A. 108, 11262–7 (2011).

34. Xing, D., Yeh, C.-I., Burns, S. & Shapley, R. M. Laminar analysis of visually evoked activity in the primary visual cortex. Proc. Natl. Acad. Sci. U. S. A. 109, 13871–6 (2012).
35. Hadjipapas, A., Lowet, E., Roberts, M. J., Peter, A. & De Weerd, P. Parametric variation of gamma frequency and power with luminance contrast: A comparative study of human MEG and monkey LFP and spike responses. Neuroimage (2015). doi:10.1016/j.neuroimage.2015.02.062
36. Jia, X., Xing, D. & Kohn, A. No consistent relationship between gamma power and peak frequency in macaque primary visual cortex. J Neurosci 33, 17–25 (2013).
37. Ray, S. & Maunsell, J. H. R. Differences in gamma frequencies across visual cortex restrict their possible use in computation. Neuron 67, 885–96 (2010).
38. Hall, S. D. et al. The missing link: analogous human and primate cortical gamma oscillations. Neuroimage 26, 13–7 (2005).
39. Self, M. W. et al. The Effects of Context and Attention on Spiking Activity in Human Early Visual Cortex. PLoS Biol. 14, e1002420 (2016).
40. Sclar, G., Maunsell, J. H. R. & Lennie, P. Coding of image contrast in central visual pathways of the macaque monkey. Vision Res. 30, 1–10 (1990).
41. Contreras, D. & Palmer, L. Response to Contrast of Electrophysiologically Defined Cell Classes in Primary Visual Cortex. J. Neurosci. 23, 6936–6945 (2003).
42. Traub, R. D., Whittington, M. A., Colling, S. B., Buzsáki, G. & Jefferys, J. G. Analysis of gamma rhythms in the rat hippocampus in vitro and in vivo. J. Physiol. 493 (Pt 2), 471–84 (1996).
43. Lima, B., Singer, W., Chen, N.-H. & Neuenschwander, S. Synchronization dynamics in response to plaid stimuli in monkey V1. Cereb. Cortex 20, 1556–73 (2010).
44. van Pelt, S. & Fries, P. Visual stimulus eccentricity affects human gamma peak frequency. Neuroimage 78C, 439–447 (2013).
45. Picinbono, B. On instantaneous amplitude and phase of signals. IEEE Trans. Signal Process. 45, 552–560 (1997).
46. Le Van Quyen, M. et al. Comparison of Hilbert transform and wavelet methods for the analysis of neuronal synchrony. J. Neurosci. Methods 111, 83–98 (2001).
47. Bonizzi, P., Karel, J. M. H., Meste, O. & Peeters, R. L. M. Singular spectrum decomposition: A new method for time series decomposition. Adv. Adapt. Data Anal. 1450011 (2014). doi:10.1142/S1793536914500113
48. Bonizzi, P. et al. Singular spectrum analysis improves analysis of local field potentials from macaque V1 in active fixation task. Conf. Proc. ... Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. IEEE Eng. Med. Biol. Soc. Annu. Conf. 2012, 2945–8 (2012).
49. Schafer, R. W. What is a Savitzky-Golay filter? IEEE Signal Process. Mag. 28, 111–117 (2011).
50. Lowet, E., Roberts, M. J., Bonizzi, P., Karel, J. & De Weerd, P. Quantifying Neural Oscillatory Synchronization: A Comparison between Spectral Coherence and Phase-Locking Value Approaches. PLoS One 11, e0146443 (2016).
51. Fries, P. Rhythms For Cognition: Communication Through Coherence. Neuron (2015).
52. Womelsdorf, T. et al. Modulation of neuronal interactions through neuronal synchronization. Science 316, 1609–1612 (2007).
53. Tiesinga, P. H., Fellous, J.-M., Salinas, E., José, J. V & Sejnowski, T. J. Inhibitory synchrony as a mechanism for attentional gain modulation. J. Physiol. Paris 98, 296–314 (2005).

54. Cannon, J. & Kopell, N. The Leaky Oscillator: Properties of Inhibition-Based Rhythms Revealed through the Singular Phase Response Curve. SIAM J. Appl. Dyn. Syst. 14, 1930–1977 (2015).
55. Schwemmer, M. A. & Lewis, T. J. in Phase Response Curves Neurosci. 3–31 (2012). doi:10.1007/978-1-4614-0739-3
56. Aronson, D. G., Ermentrout, G. B. & Kopell, N. Amplitude response of coupled oscillators. Phys. D Nonlinear Phenom. 41, 403–449 (1990).
57. Tiesinga, P. H. & Sejnowski, T. J. Mechanisms for Phase Shifting in Cortical Networks and their Role in Communication through Coherence. Front Hum Neurosci 4, 196 (2010).
58. Hoppensteadt, F. C. & Izhikevich, E. M. Thalamo-cortical interactions modeled by weakly connected oscillators: could the brain use FM radio principles? Biosystems 48, 85–94 (1998).
59. Ermentrout, G. B. & Kopell, N. Frequency Plateaus in a Chain of Weakly Coupled Oscillators, I. SIAM J. Math. Anal. 15, 215–237 (1984).
60. Ermentrout, G. B. & Kleinfeld, D. Traveling electrical waves in cortex: insights from phase dynamics and speculation on a computational role. Neuron 29, 33–44 (2001).
61. Buzsáki, G., Anastassiou, C. A. & Koch, C. The origin of extracellular fields and currents--EEG, ECoG, LFP and spikes. Nat. Rev. Neurosci. 13, 407–20 (2012).
62. Buzsáki, G. & Schomburg, E. W. What does gamma coherence tell us about inter-regional neural communication? Nat. Neurosci. 18, 484–489 (2015).
63. Einevoll, G. T., Kayser, C., Logothetis, N. K. & Panzeri, S. Modelling and analysis of local field potentials for studying the function of cortical circuits. Nat. Rev. Neurosci. 14, 770–85 (2013).


Fig.S1: Illustration of the coupled oscillator behaviour with a sinusoidal PRC and phase noise of σ=15Hz. (A) Change of the phase-locking strength (here quantified by the phase-locking value, PLV) as a function of detuning ∆ω and interaction strength ε. Interaction strength defines the amplitude of the PRC, as illustrated in plots I-III. The plots show the PRC as a function of phase difference. The phase adjustments (y-axis) are expressed in Hz (change of instantaneous frequency difference). The detuning ∆ω can be represented as a horizontal line shifting up or down. As long as the magnitude of the detuning ∆ω is less than or equal to the interaction strength ε, the phase-locking will be high (a stable fixed point can be defined). The black lines (ε=|∆ω|) represent the predicted transition point from high to low phase-locking strength for the noise-free case. The stronger the interaction strength ε, the larger the detuning values ∆ω that can be tolerated, as illustrated in I-III. In IV the whole space of detuning and interaction strengths is shown. The phase-locking strength (PLV) is color-coded, with high values being yellow-red. An inverted triangular synchronization region can be observed, the Arnold tongue. Note that the PLV does not reach 1 (perfect locking) within the Arnold tongue because of the phase noise. (B) Change of the mean phase difference as a function of detuning ∆ω and interaction strength ε. I-IV show illustrative plots with the PRC and different levels of detuning. The detuning ∆ω can be seen as shifting the PRC with respect to the zero line. The new mean value of the PRC is represented as a grey dashed line. The fixed point, which defines the preferred phase difference, is given by the intersection of the PRC and the zero line (where θ’=0). Only the stable fixed point (where the derivative of θ’ is negative) is shown. In V the mean phase difference is mapped in the parameter space of detuning and interaction strength.
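The behaviour summarized in Fig.S1 can be reproduced qualitatively with a short simulation. The following is a minimal sketch, assuming the phase difference θ evolves as dθ/dt = 2π(∆ω − ε·sin θ) plus white phase noise; this specific equation, the Euler-Maruyama integration, and the names `simulate_phase_difference`, `plv_and_mean_phase` and `noise_sd` are illustrative assumptions rather than the code used for the figure. The noise scale is given in rad/√s and is not calibrated to the σ=15Hz of the figure.

```python
import cmath
import math
import random


def simulate_phase_difference(detuning_hz, epsilon_hz, noise_sd=0.0,
                              duration_s=5.0, dt=0.001, seed=0):
    """Euler-Maruyama integration of an assumed phase-difference equation:

    d(theta)/dt = 2*pi*(detuning - epsilon*sin(theta)) + white noise
    """
    rng = random.Random(seed)
    theta = 0.0
    trace = []
    for _ in range(int(round(duration_s / dt))):
        drift = 2.0 * math.pi * (detuning_hz - epsilon_hz * math.sin(theta))
        theta += drift * dt + noise_sd * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        trace.append(theta)
    return trace


def plv_and_mean_phase(trace, discard_fraction=0.2):
    """PLV is the length of the mean unit vector; mean phase is its angle."""
    kept = trace[int(len(trace) * discard_fraction):]  # drop initial transient
    mean_vector = sum(cmath.exp(1j * th) for th in kept) / len(kept)
    return abs(mean_vector), cmath.phase(mean_vector)


# Inside the tongue (|detuning| <= epsilon), noise-free: stable fixed point
# at asin(detuning / epsilon), so PLV is close to 1.
locked_plv, locked_phase = plv_and_mean_phase(
    simulate_phase_difference(2.0, 5.0))

# Outside the tongue (|detuning| > epsilon): the phase difference drifts
# and the PLV is low.
drift_plv, _ = plv_and_mean_phase(simulate_phase_difference(10.0, 1.0))

# Inside the tongue with phase noise: PLV stays below 1.
noisy_plv, _ = plv_and_mean_phase(
    simulate_phase_difference(2.0, 5.0, noise_sd=3.0))

print(locked_plv, locked_phase, drift_plv, noisy_plv)
```

Sweeping `detuning_hz` and `epsilon_hz` over a grid and plotting the resulting PLV would give a map analogous to panel A-IV under these assumptions.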

25

Fig.S2: Cortical depth alignment and analysis. (A) Alignment procedure: CSDs from single sessions and probes were shifted iteratively to minimize the squared error between all probes using a parallel tempering algorithm until an optimal constellation of shifts was reached. Gamma-range coherence was then used to confirm or improve the constellation (see text for details). From left to right: example sessions from the first few and last few recording days; at the far right: average across all sessions, including sessions not shown here. Top row: CSD; bottom row: gamma-range coherence during the baseline grey screen period. Black/grey lines indicate the location of layers 2-4 versus 5-6. Example sessions are taken from monkey M2. (B) Depth-aligned grand average features per monkey; top row monkey M1, bottom row monkey M2. Black/grey lines indicate the location of layers 2-4 versus 5-6. Zero indicates the last channel included in L2-4; black/grey lines overlap the reversal point. From left to right: CSD, LFP gamma-band coherence as used for alignment, visual evoked potential, peristimulus time histogram (PSTH), LFP and CSD power in the gamma range, area assignment based on receptive field jumps. The PSTH of each session was normalized to the maximum activity of the maximally active spike channel. Relative power was computed as stimulus/baseline (baseline averaged across trials) for both LFP and CSD. The rightmost column shows the number of contacts assigned to each depth and their assignment to V1 vs. white matter or V2 based on receptive field mapping alone, providing an estimate independent of the CSD reversal point. Grey contacts were either those likely positioned in white matter, just preceding a receptive field jump, or all contacts if no receptive field jump was present (most likely representing all contacts still in V1). An example RF mapping to the right, together with a sketch, illustrates this procedure. Receptive fields as estimated by spiking (black) and CSD (red) show a clear jump at contact 14, entering the deep layers of V2; the two contacts above cannot be assigned unambiguously.
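The alignment step in (A) amounts to searching for depth shifts that minimize a squared-error cost. As a deliberately simplified sketch, the code below aligns one depth profile to a reference by exhaustive search over integer shifts. It does not implement the parallel tempering search over all probes described above; the function name `best_shift`, the `min_overlap` safeguard and the toy profiles are hypothetical.

```python
import math


def best_shift(reference, profile, max_shift, min_overlap=None):
    """Return (shift, error) minimizing the mean squared error between
    reference[i] and profile[i + shift] over the overlapping contacts."""
    if min_overlap is None:
        # Assumed safeguard: require at least half of the contacts to overlap,
        # so that tiny overlaps cannot win with a trivially small error.
        min_overlap = len(reference) // 2
    best_s, best_err = None, math.inf
    for s in range(-max_shift, max_shift + 1):
        pairs = [(reference[i], profile[i + s])
                 for i in range(len(reference))
                 if 0 <= i + s < len(profile)]
        if len(pairs) < min_overlap:
            continue
        err = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if err < best_err:
            best_s, best_err = s, err
    return best_s, best_err


# Toy depth profiles: the second is the first displaced by two contacts.
reference_profile = [0, 0, 1, 3, 1, 0, 0, 0]
session_profile = [0, 0, 0, 0, 1, 3, 1, 0]
shift, error = best_shift(reference_profile, session_profile, max_shift=3)
print(shift, error)
```

With many probes the shifts interact, which is presumably why a stochastic global search such as parallel tempering was used instead of a pairwise exhaustive search like this one.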

Fig.S3: Arnold tongue mapping of gamma from L2-4 and L5-6 cortical laminae. To the left, illustrative figures indicate the laminar origin of the analysed signals. The top row shows the analysis of L2-4 signals and the bottom row shows the analysis of L5-6 signals. (A) The Arnold tongue mapping of L2-4 in terms of PLV, combined over the two monkeys. The black lines (ε=|∆ω|) represent the transition from high to low phase-locking strength. (B) The same as (A), but now in terms of the mean phase difference. (A-B) represent the data used for the main analysis. (C) The Arnold tongue mapping of L5-6 deep layer gamma in terms of PLV. (D) The same as (C), but now in terms of the mean phase difference. The analysis shows that the Arnold tongue structure was present at both cortical depths. Note that interaction strengths were lower between deep sites of equal cortical distance.


Fig.S4: Effect of contrast and eccentricity on macaque V1 gamma frequency. (A) On the left, the relative power spectra (15-55Hz) of monkey M1 are shown for three representative grating contrasts (see Table.S1), showing a monotonic increase in the preferred frequency range and a nonlinear power change with contrast. To the right, the dependence of (instantaneous) gamma frequency on local contrast is quantified. (B) The same as in A, but for monkey M2. (C) Left, the power spectra of contacts in monkey M1 with lower eccentricity (<4.6deg) and higher eccentricity (>4.6deg) are shown. A decrease in the peak frequency with higher eccentricity can be observed. This is quantified on the right in the plot of estimated gamma frequency against eccentricity. (D) The same as in C, but for monkey M2. (E) In M1, the MUA spike rate was correlated with contrast and eccentricity in a similar way as gamma frequency. In general, we observed that differences in MUA spike rate among V1 locations predicted changes in gamma frequency difference well. (F) The same as in E, but for monkey M2.


Fig.S5: (A) Example plots of instantaneous frequency (IF) modulation by phase difference in monkey M1. The IF was estimated using the Hilbert transform. Left and right show two different modulation amplitudes. Blue and red represent the IF modulations of the two respective contacts. Line thickness represents standard error. (B) The same as in A, but for monkey M2. (C) IF modulations are not method-dependent: here, we computed the wavelet TFR and averaged the time points as a function of phase difference for a given contact pair (one contact from probe 1 and one from probe 2). This gives the phase-difference-averaged TFRs. In I-III we show different examples (from monkey M1). Clear modulations of the preferred gamma frequency can be seen with changes in the phase difference between the contacts. (D) The procedure for the estimation of the (mutual) PRC (G(θ)) is illustrated. For the given contact pair, the difference of the IF modulations by phase difference (left) was computed, resulting in the middle plot. The mean (dashed line) was defined as the detuning ∆ω and ε was estimated as the amplitude. To get an estimate of G(θ), ∆ω was subtracted from the IF difference modulation and the result was normalized by ε, giving the right plot. (E) The interaction strength ε decreased with cortical distance in both monkeys M1 and M2. This was expected, because the amount of horizontal connectivity also decreases with cortical distance in V1. (F) In the intermittent synchronization regime, the mean phase difference is determined by the phase difference at which the contacts have their minimal frequency difference (illustrated as filled dots in the small plots to the left). In line with theory, we indeed observed that the phase difference with minimal frequency difference shifted with detuning ∆ω similarly to the measured mean phase difference.
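The estimation steps in (D) can be written out compactly. The sketch below assumes that the IF difference has already been averaged into equally spaced phase-difference bins and that the "amplitude" is taken as half the peak-to-peak range; both the binning and this amplitude definition, as well as the name `estimate_prc`, are assumptions made for illustration and may differ from the procedure actually used.

```python
import math


def estimate_prc(if_difference):
    """Estimate detuning, interaction strength and normalized PRC.

    if_difference: IF difference (Hz) per phase-difference bin.
    Returns (detuning, epsilon, g) with g = (if_difference - detuning) / epsilon.
    """
    detuning = sum(if_difference) / len(if_difference)  # mean (dashed line)
    # Assumed amplitude definition: half the peak-to-peak modulation.
    epsilon = (max(if_difference) - min(if_difference)) / 2.0
    if epsilon == 0:
        raise ValueError("flat modulation: interaction strength is zero")
    g = [(value - detuning) / epsilon for value in if_difference]
    return detuning, epsilon, g


# Synthetic check: 3 Hz detuning, 2 Hz interaction strength, sinusoidal PRC.
bins = [-math.pi + 2.0 * math.pi * k / 36 for k in range(36)]
synthetic = [3.0 + 2.0 * math.sin(theta) for theta in bins]
detuning, epsilon, g = estimate_prc(synthetic)
print(detuning, epsilon)
```

On noisy data a fitted sinusoid amplitude would be a more robust estimate of ε than the peak-to-peak range used here, since extremes of noisy bins bias the range upward.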

Fig.S6: Gamma amplitude modulation as a function of phase difference. (A) CSD gamma amplitude modulation from monkey M1 of both contacts for a given pair, with strength expressed in percent change from the mean (y-axis) and as a function of phase difference (x-axis). From left to right the interaction strength is increased (cortical distance decreased). We observed small to intermediate modulation amplitudes of gamma. The modulation strength increased with interaction strength ε. Line thickness represents standard error across contact pairs. (B) The same as in A, but for monkey M2. (C) Quantification of the gamma amplitude modulation in two coupled simulated PING networks. As for the monkey data, we observed systematic amplitude modulations that increased with interaction strength (synaptic connectivity). Line thickness represents the standard error of the instantaneous frequency for a given phase difference.

Fig.S7: Multiple-regression analysis of phase-locking and phase difference for PING networks (n=697) and macaque V1 data (M1: n=9632, M2: n=7938). Factors included were interaction strength ε, detuning ∆ω, gamma power difference and their interactions. Their contribution is expressed in explained variance (R²). A significance value of P<0.01 is marked with an asterisk. (A) We applied multiple regression to predict changes in phase-locking strength (PLV) and phase difference for gamma synchronization among coupled PING networks. The results for PLV are on the left and those for phase difference are on the right. For PLV, most of the variance was explained by interaction strength and detuning. For the phase difference, most variance was explained by detuning, whereas interaction strength explained variance only through an interaction effect with detuning. Power differences explained only very small amounts of variance. (B) The same analysis as in A for V1 gamma from monkey M1 (top) and M2 (bottom). For PLV, interaction strength and detuning were the strongest factors explaining variance. In M2 there was also a substantial contribution of power. For phase difference, detuning was the strongest factor (as in A), followed by interaction strength ε through an interaction effect with detuning (as in A). Power differences again played a minor role.

Fig.S8: Arnold tongue mapping for CSD-MUA and MUA-MUA signals. (A) For each contact pair, we selected the CSD signal from one contact and the MUA signal from the other contact for computing PLV as a function of detuning ∆ω and interaction strength ε / cortical distance. We observed an Arnold tongue in PLV similar to that observed for the CSD-CSD analysis. (B) The same as in A, but for mean phase difference. (C) The same analysis as in A, but for MUA-MUA signals. We also observed an Arnold tongue for MUA-MUA PLV. Note that the MUA signals had much lower SNR than the CSD signals. (D) As in C, but for phase difference.

Contrast difference condition (monkey M1)
Range   44.7   35.9   24.8   13.3   0      -13.3  -24.8  -35.9  -44.7
RF 1    66     58.6   51.7   44.3   36.5   31     27     22.7   21.2
RF 2    21.2   22.7   27     31     36.5   44.3   51.7   58.6   66

Contrast difference condition (monkey M2)
Range   42.7   34     24.5   13.6   0      -13.6  -24.5  -34    -42.7
RF 1    62.7   57.5   52.2   46     39.2   32.5   27.7   23.5   20
RF 2    20     23.5   27.7   32.5   39.2   46     52.2   57.5   62.7

Table.S1: Range of contrast difference conditions used for the experimental task for monkeys M1 and M2. The top sub-table shows the contrast difference conditions (in %) used for M1, and the bottom sub-table shows the values for M2.
