Decoding Dynamic Brain Patterns from Evoked Responses: a Tutorial on Multivariate Pattern Analysis Applied to Time Series Neuroimaging Data
Total Page:16
File Type:pdf, Size:1020Kb
Decoding Dynamic Brain Patterns from Evoked Responses: A Tutorial on Multivariate Pattern Analysis Applied to Time Series Neuroimaging Data Tijl Grootswagers1,2,3, Susan G. Wardle1,2, and Thomas A. Carlson2,3 Abstract ■ Multivariate pattern analysis (MVPA) or brain decoding aim is to “decode” different perceptual stimuli or cognitive states methods have become standard practice in analyzing fMRI data. over time from dynamic brain activation patterns. We show that Although decoding methods have been extensively applied in decisions made at both preprocessing (e.g., dimensionality reduc- brain–computer interfaces, these methods have only recently tion, subsampling, trial averaging) and decoding (e.g., classifier been applied to time series neuroimaging data such as MEG selection, cross-validation design) stages of the analysis can sig- and EEG to address experimental questions in cognitive neuro- nificantly affect the results. In addition to standard decoding, science. In a tutorial style review, we describe a broad set of we describe extensions to MVPA for time-varying neuroimaging options to inform future time series decoding studies from a data including representational similarity analysis, temporal gener- cognitive neuroscience perspective. Using example MEG data, alization, and the interpretation of classifier weight maps. Finally, we illustrate the effects that different options in the decoding we outline important caveats in the design and interpretation of analysis pipeline can have on experimental results where the time series decoding experiments. ■ INTRODUCTION 2011; Lemm, Blankertz, Dickhaus, & Müller, 2011), the The application of “brain decoding” methods to the anal- aims of time series decoding for cognitive neuroscience ysis of fMRI data has been highly influential over the past are distinct from those that drive the application of these 15 years in the field of cognitive neuroscience (Kamitani methods in BCI, thus requiring a targeted introduction. &Tong,2005;Carlson,Schrater,&He,2003;Cox& Although there are many reviews and tutorials for fMRI Savoy, 2003; Haxby et al., 2001; Edelman, Grill-Spector, decoding (Haynes, 2015; Schwarzkopf & Rees, 2011; Kushnir, & Malach, 1998). In addition to their increased Mur, Bandettini, & Kriegeskorte, 2009; Pereira, Mitchell, & sensitivity, the introduction of fMRI decoding methods Botvinick, 2009; Formisano, De Martino, & Valente, 2008; offered the possibility to address questions about infor- Haynes & Rees, 2006; Norman, Polyn, Detre, & Haxby, 2006; mation processing in the human brain, which have comple- Cox & Savoy, 2003), there are no existing tutorial intro- mented traditional univariate analysis techniques. Although ductions to decoding time-varying brain activity. Although decoding methods for time series neuroimaging data the approaches are conceptually similar, there are impor- such as MEG/EEG have been extensively applied in brain– tant distinctions that stem from fundamental differences in computer interfaces (BCI; Müller et al., 2008; Curran & the nature of the neuroimaging data between fMRI and Stokes, 2003; Wolpaw, Birbaumer, McFarland, Pfurtscheller, MEG/EEG. In this article, we provide a tutorial introduction & Vaughan, 2002; Kübler, Kotchoubey, Kaiser, Wolpaw, & using an example MEG data set. Although there are many Birbaumer, 2001; Farwell & Donchin, 1988; Vidal, 1973), possible analyses targeting time series data (e.g., oscillatory they have only recently been applied in cognitive neuro- [Jafarpour, Horner, Fuentemilla, Penny, & Duzel, 2013] or science (Carlson, Hogendoorn, Kanai, Mesik, & Turret, induced responses), we restrict the scope of this article to 2011; Duncan et al., 2010; Schaefer, Farquhar, Blokland, decoding information from evoked responses, with statis- Sadakata, & Desain, 2010). tical inference at the group level on single time points or The goal of this article is to provide a tutorial style guide small time windows. As with most neuroimaging analysis to the analysis of time series neuroimaging data for cog- techniques, the number of possible permutations for a nitive neuroscience experiments. Although introductions given set of analysis decisions is very large, and the par- to BCI exist (Blankertz, Lemm, Treder, Haufe, & Müller, ticular choice of analysis pipeline is guided by the experi- mental question at hand. Here we aim to provide a broad 1Macquarie University, Sydney, Australia, 2ARC Centre of Excel- demonstration of how the analysis may be approached, lence in Cognition and its Disorders, 3University of Sydney rather than prescribing a particular analysis pipeline. ©2017MassachusettsInstituteofTechnology JournalofCognitiveNeuroscience29:4,pp.677–697 doi:10.1162/jocn_a_01068 Early studies using time-resolved decoding methods brain activity was recorded.Thegoalofthedecoding have revealed significant potential for experimental in- analysis is to test whether we can predict if the partici- vestigation using this approach with MEG/EEG (see pant was viewing a blue circle or a red square based on MVPA for MEG/EEG section). However, compared with their patterns of brain activation. If the experimental the popularity of decoding methods in fMRI, to date only stimuli can be successfully “decoded” from the partici- a small number of studies have applied multivariate pat- pant’spatternsofbrainactivation,wecanconcludethat tern analysis (MVPA) techniques to EEG or MEG. Accord- some information relevant to the experimental manipula- ingly, the aims of this article are to (a) introduce the critical tion exists in the neuroimaging data. First, brain activation differences between decoding time series (e.g., MEG/EEG) patterns in response to the different stimuli (or experi- versus spatial (e.g., fMRI) neuroimaging data, (b) illustrate mental conditions) are recorded using standard neuro- the time series decoding approach using a practical tutorial imaging (MEG, fMRI, etc.) techniques (Figure 1A). The with example MEG data, (c) demonstrate the effect that activation levels of the variables (e.g., voxels in fMRI, selecting different analysis parameters has on the results, channels in MEG/EEG) in different experimental conditions and (d) outline important caveats in the interpretation of are represented as complex patterns in high-dimensional time series decoding studies. In summary, this article will space (each voxel, channel, or principal component is one provide a broad overview of available methods to inform dimension). For simplicity, in Figure 1B, these patterns are future time-resolved decoding studies. This tutorial is pre- shown in two-dimensional space. Each point in the plot sented in the context of MEG; however, the methods and represents an experimental observation corresponding to analysis principles generalize to other time-varying brain the simultaneous activation level in two example voxels/ recording techniques (e.g., ECoG, EEG, electrophysiol- channels in response to one of the experimental condi- ogical recordings.). As this review is targeted at providing tions (blue circles or red squares). a broad overview to a general audience, we avoid formal The first step in a decoding analysis involves training mathematical definitions and implementation details of aclassifiertoassociatebrainactivationpatternswith the methods and instead focus on the rationale behind the experimental conditions using a subset of the data the decoding approach as applied to time series data. (Figure 1C). In effect, during training the classifier finds the decision boundary in higher-dimensional space that best separates the patterns of brain activation correspond- MVPA for MEG/EEG ing to the two experimental categories into two distinct The term “multivariate pattern analysis” (or MVPA) en- groups. As neuroimaging data are inherently noisy, this compasses a diverse set of methods for analyzing neuro- separation is not necessarily perfect (note the red square imaging data. The common element that unites these on the wrong side of the decision boundary in Figure 1C). approaches is that they take into account the relation- Next, the trained classifier is used to predict the condition ships between multiple variables (e.g., voxels in fMRI or labels for new data that were not used for training the channels in MEG/EEG), instead of treating them as inde- classifier (Figure 1D). The classifier predicts whether pendent and measuring relative activation strengths. The the new (unlabeled) data are more similar to the pattern term “decoding” refers to the prediction of a model from of activation evoked by viewing a blue circle or a red the data (“encoding” approaches do the reverse, predict- square. If the classifier performs higher than that expected ing the data from the model, reviewed in Naselaris, Kay, by chance (in this case 50% is the guessing rate as there Nishimoto, & Gallant, 2011; see also, e.g., Ding & Simon, are two stimuli), it provides evidence that the classifier 2012, for an example of encoding models for MEG). The can successfully generalize the learned associations to most common application of decoding in cognitive neu- labeling new brain response patterns. Consequently, it is roscience is the use of machine learning classifiers (e.g., assumed that the patterns of brain activation contain infor- correlation classifiers (Haxby et al., 2001) or discriminant mation that distinguishes between the experimental con- classifiers (Carlson et al., 2003; Cox & Savoy, 2003) to ditions (i.e., the conditions blue circle/red square can be identify patterns