Decoding Neural Signals and Discovering Their Representations
bioRxiv preprint doi: https://doi.org/10.1101/2020.06.02.129114; this version posted June 17, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.

DECODING NEURAL SIGNALS AND DISCOVERING THEIR REPRESENTATIONS WITH A COMPACT AND INTERPRETABLE CONVOLUTIONAL NEURAL NETWORK

A PREPRINT

Arthur Petrosuan, Mikhail Lebedev and Alexei Ossadtchi

June 17, 2020

ABSTRACT

Brain-computer interfaces (BCIs) decode information from neural activity and send it to external devices. In recent years, we have seen an emergence of new algorithms for BCI decoding, including those based on deep-learning principles. Here we describe a compact convolutional-network-based architecture for adaptive decoding of electrocorticographic (ECoG) data into finger kinematics. We also propose a theoretically justified approach to interpreting the spatial and temporal weights in architectures that combine adaptation in both space and time, such as the one described here. In these architectures, the weights are optimized not only to align with the target sources but also to tune away from the interfering ones, in both the spatial and the frequency domains. The obtained spatial and frequency patterns characterizing the neuronal populations pivotal to the specific decoding task can then be interpreted by fitting appropriate spatial and dynamical models.

We first tested our solution using realistic Monte-Carlo simulations. Then, when applied to ECoG data from the Berlin BCI Competition IV dataset, our architecture performed comparably to the competition winners without requiring explicit feature engineering.
Moreover, using the proposed approach to network-weight interpretation, we could unravel the spatial and spectral patterns of the neuronal processes underlying the successful decoding of finger kinematics from another ECoG dataset with known sensor positions.

As such, the proposed solution offers a good decoder and a tool for investigating neural mechanisms of motor control.

Keywords: ECoG, limb kinematics decoding, deep learning, machine learning, weights interpretation, spatial filter, temporal filter

1 Introduction

Brain-computer interfaces (BCIs) link the nervous system to external devices [4] or even other brains [18]. While there exist many applications of BCIs [1], clinically relevant BCIs that aid in the rehabilitation of patients with sensory, motor, and cognitive disabilities have received the most attention [13]. Clinical uses of BCIs range from assistive devices to neural prostheses that restore functions abolished by neural trauma or disease [2].

BCIs can deal with a variety of neural signals [17, 9] such as, for example, electroencephalographic (EEG) potentials sampled with electrodes placed on the surface of the head [12], or neural activity recorded invasively with electrodes implanted into the cortex [6] or placed onto the cortical surface [22]. The latter method, which we consider here, is called electrocorticography (ECoG). Accurate decoding of neural signals is key to building efficient BCIs.

Typical BCI signal processing comprises several steps, including signal conditioning, feature extraction, and decoding. In modern machine-learning algorithms, the parameters of the feature-extraction and decoding pipelines are jointly optimized within computational architectures called deep neural networks (DNNs) [10]. DNNs derive features automatically when trained to execute regression or classification tasks.
While it is often difficult to interpret the computations performed by a DNN, such interpretations are essential to gain understanding of the properties of the brain activity contributing to decoding, and to ensure that artifacts or accompanying confounds do not affect the decoding results. DNNs can also be used for knowledge discovery. In particular, interpretation of the features computed by the first several layers of a DNN could shed light on the neurophysiological mechanisms underlying the behavior being studied. Ideally, by examining DNN weights, one should be able to match the algorithm's operation to the functions and properties of the neural circuitry to which the BCI decoder connects. Such physiologically tractable DNN architectures are likely to facilitate the development of efficient and versatile BCIs.

Several useful and compact architectures have been developed for processing EEG and ECoG data. The operation of some blocks of these architectures can be straightforwardly interpreted. Thus, EEGNet [8] contains explicitly delineated spatial and temporal convolutional blocks. This architecture yields high decoding accuracy with a minimal number of parameters. However, due to the cross-filter-map connectivity between any two layers, a straightforward interpretation of the weights is difficult. Some insight regarding the decision rule can be gained using the DeepLIFT technique [25] combined with an analysis of the hidden-unit activation patterns. Schirrmeister et al. describe two architectures: DeepConvNet and its compact version, ShallowConvNet.
The latter architecture consists of just two convolutional layers that perform temporal and spatial filtering, respectively [23]. A recent study by Zubarev et al. [28] reported two compact neural-network architectures, LF-CNN and VAR-CNN, that outperformed other decoders of MEG data, including linear models and more complex neural networks such as ShallowFBCSP-CNN, EEGNet-8, and VGG19. LF-CNN and VAR-CNN contain only a single non-linearity, which distinguishes them from most other DNNs. This feature makes the weights of such architectures readily interpretable with well-established approaches [7, 5]. This methodology, however, has to be applied taking into account the peculiarities of each specific network architecture.

Here we introduce another simple architecture, developed independently but conceptually similar to those listed above, and use it as a testbed to refine the recipes for interpreting the weights in the family of architectures characterized by separated adaptive spatial and temporal processing stages. We thus emphasize that when interpreting the weights in such architectures, we have to keep in mind that these architectures tune their weights not only to adapt to the target neuronal population(s) but also to minimize the distraction from the interfering sources in both the spatial and the frequency domains.

2 Methods

Figure 1 illustrates a hypothetical relationship between motor behavior (hand movements), brain activity, and ECoG recordings. The activity, e(t), of a set of neuronal populations, G_1 - G_I, engaged in motor control, is converted into a movement trajectory, z(t), through a non-linear transform H: z(t) = H(e(t)). The activity of populations A_1 - A_J is unrelated to the movement. The recordings of e(t) with a set of sensors are represented by a K-dimensional vector of sensor signals, x(t).
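The separated spatial and temporal processing stages that characterize this family of architectures can be illustrated with a minimal sketch. The code below is not the paper's implementation; it merely shows, under simplified assumptions, one decoder branch as a spatial filter (a weighted sum across channels) followed by a temporal FIR filter (a 1-D convolution), with hypothetical weight values.

```python
import numpy as np

def spatial_temporal_branch(x, w_spatial, w_temporal):
    """One branch of a compact decoder with separated stages.

    x          : (n_channels, n_samples) multichannel recording
    w_spatial  : (n_channels,) spatial filter weights
    w_temporal : (n_taps,) temporal FIR filter taps
    Returns the spatially and temporally filtered 1-D time course.
    """
    s = w_spatial @ x  # spatial stage: collapse channels into one source estimate
    return np.convolve(s, w_temporal, mode="same")  # temporal stage: FIR filtering

# Toy usage: 8 channels, 1000 samples, a 5-tap moving-average temporal filter.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 1000))
w_s = rng.standard_normal(8)
w_t = np.ones(5) / 5.0
y = spatial_temporal_branch(x, w_s, w_t)
```

Because each stage is a plain linear operation before any non-linearity, the learned spatial and temporal weights of such a branch remain individually inspectable, which is what makes the interpretation recipes discussed here applicable.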
This vector can be modeled as a linear mixture of signals resulting from the application of the forward-model matrices G and A to the task-related sources, s(t), and task-unrelated sources, f(t), respectively:

x(t) = Gs(t) + Af(t) = \sum_{i=1}^{I} g_i s_i(t) + \sum_{j=1}^{J} a_j f_j(t)    (1)

We will refer to the noisy, task-unrelated component of the recording as \eta(t) = \sum_{j=1}^{J} a_j f_j(t).

Given the linear generative model of electrophysiological data, the inverse mapping used to derive the activity of sources from the sensor signals is also commonly sought in the linear form: \hat{s}(t) = W^T x(t), where the columns of W form a spatial filter that counteracts the volume-conduction effect and decreases the contribution from the noisy, task-unrelated sources.

Neuronal correlates of motor planning and execution have been extensively studied [27]. In the cortical-rhythm domain, the alpha and beta components of the sensorimotor rhythm desynchronize just prior to the execution of a movement and rebound with a significant overshoot upon the completion of a motor act [14]. The magnitude of these modulations correlates with the person's ability to control a motor-imagery BCI [20]. Additionally, the incidence rate of beta bursts in the primary somatosensory cortex is inversely correlated with the ability to detect tactile stimuli [24] and also affects other motor functions. Intracranial recordings, such as ECoG, allow reliable measurement of the faster gamma-band activity, which is temporally and spatially specific to movement patterns [26] and is thought to accompany movement control and execution. Overall, based on this very solid body of research, the rhythmic components of brain sources, s(t), appear to be useful for BCI implementations. Given the linearity of the generative model (1), these rhythmic signals, reflecting the activity of specific neuronal populations, can be computed as linear combinations of narrow-band filtered sensor data x(t).
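The generative model (1) and the linear inverse mapping \hat{s}(t) = W^T x(t) can be sketched numerically. In this toy simulation all dimensions and forward matrices are arbitrary assumptions, and W is taken, purely for illustration (this is not the paper's method of estimating W), from the pseudoinverse of the stacked forward model, which exactly recovers s(t) in this noiseless linear mixture.

```python
import numpy as np

rng = np.random.default_rng(1)
K, I, J, T = 16, 2, 3, 500    # sensors, task-related sources, task-unrelated sources, samples

G = rng.standard_normal((K, I))   # forward model of task-related sources (columns g_i)
A = rng.standard_normal((K, J))   # forward model of task-unrelated sources (columns a_j)
s = rng.standard_normal((I, T))   # task-related source activity s(t)
f = rng.standard_normal((J, T))   # task-unrelated source activity f(t)

x = G @ s + A @ f                 # sensor signals, Eq. (1)
eta = A @ f                       # noisy, task-unrelated component eta(t)

# Illustrative spatial filter: rows of the pseudoinverse of [G A] that
# correspond to the task-related sources. With K > I + J and generic
# forward matrices, this left inverse recovers s(t) exactly.
W_full = np.linalg.pinv(np.hstack([G, A]))   # shape (I + J, K)
s_hat = W_full[:I] @ x
```

In real recordings the mixture is noisy and the forward matrices are unknown, so W must instead be estimated from data; the point of the sketch is only that, under the linear model, unmixing reduces to a fixed set of spatial weights applied to x(t).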
[Figure 1: schematic of the generative model. Task-related neuronal populations G_1 - G_I and task-unrelated populations A_1 - A_J emit source activity that passes through the signal propagation medium to the electrode signals x_1(t), x_2(t), ...; the population activity e(t) is converted by the transform H into the movement trajectory z(t).]