Journal of Neuroscience Methods xxx (2014) xxx–xxx

Clinical Neuroscience

Channel selection methods for the P300 Speller

K.A. Colwell a,∗, D.B. Ryan b, C.S. Throckmorton a, E.W. Sellers b, L.M. Collins a

a Department of Electrical & Computer Engineering, Duke University, Durham, NC, USA
b Department of Psychology, East Tennessee State University, Johnson City, TN, USA

Highlights

• Active electrode ("channel") selection leads to higher average P300 Speller classification performance compared to a standard channel set of similar size.
• Jumpwise regression is capable of selecting an effective electrode subset at runtime.
• Some experimental subjects see large gains in accuracy with channel selection.

Article history:
Received 11 October 2013
Received in revised form 6 March 2014
Accepted 10 April 2014

Keywords:
Brain–computer interface
P300 Speller
Channel selection

Abstract

The P300 Speller brain–computer interface (BCI) allows a user to communicate without muscle activity by reading electrical signals on the scalp via electroencephalogram. Modern BCI systems use multiple electrodes ("channels") to collect data, which has been shown to improve speller accuracy; however, system cost and setup time can increase substantially with the number of channels in use, so it is in the user's interest to use a channel set of modest size. This constraint increases the importance of using an effective channel set, but current systems typically utilize the same channel montage for each user. We examine the effect of active channel selection for individuals on speller performance, using generalized standard feature-selection methods, and present a new channel selection method, termed jumpwise regression, that extends the Stepwise Linear Discriminant Analysis classifier. Simulating the selections of each method on real P300 Speller data, we obtain results demonstrating that active channel selection can improve speller accuracy for most users relative to a standard channel set, with particular benefit for users who experience low performance using the standard set. Of the methods tested, jumpwise regression offers accuracy gains similar to the best-performing feature-selection methods, and is robust enough for online use.

© 2014 Published by Elsevier B.V.

∗ Corresponding author.
E-mail addresses: [email protected], [email protected] (K.A. Colwell), [email protected] (D.B. Ryan), [email protected] (C.S. Throckmorton), [email protected] (E.W. Sellers), [email protected] (L.M. Collins).
http://dx.doi.org/10.1016/j.jneumeth.2014.04.009
0165-0270/© 2014 Published by Elsevier B.V.

1. Introduction

Brain–computer interface (BCI) systems are designed to analyze real-time data associated with a human user's brain activity and translate it into computer output. The clearest current motivation for BCI development is to extend a means for communication and control to people with neurological diseases, such as amyotrophic lateral sclerosis (ALS) or spinal-cord injury, who have lost motor ability ("locked-in" patients). However, state-of-the-art BCI systems for such individuals are still expensive and limited in speed and accuracy, and setup for home use is nontrivial; most systems remain in the experimental stage and are primarily used in a laboratory environment (Vaughan et al., 2006).

One BCI that has been successfully deployed to users with ALS (Sellers and Donchin, 2006) is the P300 Speller, first developed by Farwell and Donchin (1988). This system combines measurements of electroencephalogram (EEG) signals on the user's scalp, a software signal processor, an online classifier, and presentation of stimuli that evoke a P300 event-related potential (ERP), in order to sequentially choose items from a list (e.g. the letters in a word, or commands such as "Page Down" or "Escape").

The original P300 Speller, as conceived by Farwell and Donchin, employed only a single electrode (one "channel" of information). The use of additional channels was discovered to improve classification performance, and most if not all modern P300 Speller systems include data from multiple recording sites (e.g. Krusienski et al., 2006; Sellers and Donchin, 2006; Schalk et al., 2004).

However, larger channel sets require more complicated electrode caps and more amplifier channels, which can greatly increase the cost of a system: implementing a 32-channel system rather than an 8-channel system can raise the system cost by tens of thousands of dollars. This cost can be prohibitive to home users. Further, each channel must be calibrated individually for proper placement and impedance before each spelling session, adding to setup time and user discomfort. As a result, clinically relevant systems are limited to using a subset of all possible electrode locations. The selection of these channel locations impacts system performance: one sensitivity analysis concluded that identifying an appropriate channel set for an individual was more important than factors such as feature space, pre-processing hyperparameters, and classifier choice (Krusienski et al., 2008). In addition to empirical demonstrations of the benefit of channel selection (Schröder et al., 2005; Krusienski et al., 2008; Rakotomamonjy and Guigue, 2008; Cecotti et al., 2011), principled reasons for selecting channel sets on a per-subject basis include the difficulty of outpatient calibration of electrode caps by nonclinical aides (such that electrodes that might have been useful do not yield as much information), as well as several neurological motivations, including variation in brain structure and response across subjects arising from their unique cortical folds, and the plasticity of the brain over time, particularly as it adapts to a new system. Disease progression may also impact the optimal set of electrodes. Furthermore, BCI deployment for home use has proven much more challenging than deployment in the laboratory environment (Sellers and Donchin, 2006; Sellers et al., 2006; Kübler et al., 2001); it is possible that the subject-independent channel sets that are effective for healthy subjects do not generalize to other populations. As a practical note, it is anticipated that in order for users to obtain both a performance benefit from selecting channels from an extensive set and the cost and setup savings of employing a system with a small subset of those channels, a channel selection calibration session could first be conducted in a laboratory environment to determine the optimal subset, which would then be the only channels set up for home use.

Although channel selection has been investigated for other BCI paradigms, such as recursive channel elimination for motor imagery tasks (Lal et al., 2004) and mutual information maximization for classification (Lan et al., 2007), examples of per-subject channel selection for the P300 Speller or even time-domain data are more limited. Rakotomamonjy and Guigue (2008) use a channel selection procedure built in to the training of a support vector machine classifier. As such, the method is not modularized to the extent that it could be easily combined with another classifier and compared to other methods. Jin et al. (2010) reported success using Particle Swarm Optimization (PSO) and Bayesian Linear Discriminant Analysis (BLDA) to select channels in a system that spelled Chinese characters. However, PSO can be a computationally-intensive technique, which may limit its effectiveness in the clinical or home setting, where setup time is an important aspect of system usability. Finally, Cecotti et al. (2011) demonstrated a state-of-the-art active channel selection method that improved P300 Speller classification performance, comprised of a sequential reverse selection with a cost function determined by the signal-to-signal-plus-noise ratio (SSNR) after spatial filtering. Due to its relatively low computational time, incorporation of spatial filtering, and effectiveness, this method is included in this study for comparison. Since its use of one-directional sequential selection could leave the method vulnerable to nesting effects, in which the early removal of channels with redundant information hampers performance later in the process, further comparison of standard and new feature selection techniques for the P300 Speller channel selection problem is worth considering. In the current study, the Stepwise Linear Discriminant Analysis (SWLDA) classifier is used due to its support in the literature (e.g. Farwell and Donchin, 1988; Sellers and Donchin, 2006; Krusienski et al., 2008); however, the proposed methods for channel selection could be used in conjunction with other classifiers or could be similarly included into a classifier training phase.

Each channel of EEG data contributes a number of time-samples (features) for classification decisions; channel selection can be viewed as a feature selection problem, with the additional constraint that only entire channels of features may be selected for inclusion or exclusion. Feature selection is commonly performed in machine learning problems with high dimensionality (many features) under the assumption that it is possible to discard some features at a low cost (for example, if the features are irrelevant or redundant). Feature selection reduces data storage and computational requirements by discarding unnecessary dimensions, reduces classifier training time by training on less-complex data, and guards against classifier over-fitting by reducing the effect of the "curse of dimensionality". (See Guyon and Elisseeff, 2003 for an excellent introduction to feature selection.) Methods for feature selection fall into two categories: wrapper methods, which determine feature subsets' value by measuring their performance with the chosen classifier; and filter methods, which choose subsets of features independently of the classifier. Wrapper methods observe a "search space" of possible combinations of features, assigning each combination a value based on the classifier's performance; these methods are appealing for their empiricism and simplicity, but can suffer high computation requirements due to the need to repeatedly train and test the classifier (Guyon and Elisseeff, 2003). Filter methods can be faster, but require a performance metric independent of classification performance. These categories remain when generalizing features to channels, and approaches from both categories are explored here: a heuristic filter method, maximum signal-to-noise ratio selection (Max-SNR); the wrapper methods sequential forward selection (SFS) and k-forward, m-reverse (KFMR) selection; and a new filter method, jumpwise selection (JS). The performance of each method is compared to baseline results obtained with the standard eight-channel montage determined in Krusienski et al. (2008), which provides a compromise of reasonably high accuracy and modest system cost by requiring only an eight-electrode system. For comparison to a current adaptive channel selection algorithm, the results obtained with the montage chosen by the sequential reverse method of Cecotti et al. (2011) are included as well.

2. Methods

2.1. Subjects and equipment

A total of 18 healthy subjects were recruited at East Tennessee State University, gave informed consent, and received course credit in return for participation in this study. Data was collected with subjects using both the row/column paradigm (Farwell and Donchin, 1988) and the experimental "Checkerboard" paradigm (Townsend et al., 2010), but only the row/column data are used for the analysis in this study. Data collection protocols and human subject use were approved by the ETSU Institutional Review Board (IRB). Subsequent data analysis was approved by both the ETSU IRB and the Duke University IRB.

EEG data was collected from a right mastoid-referenced 32-channel cap (by Electro-Cap; montage shown in Fig. 1). The signal was digitized at 256 Hz and bandpass-filtered to [0.5 Hz, 30 Hz] by two 16-channel g.tec g.USBamp amplifiers. Data collection and stimulus presentation were performed by the BCI2000 open-source software suite (Schalk et al., 2004). Before the session, the impedance of each channel was reduced below 10 kΩ.
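The [0.5 Hz, 30 Hz] band-pass above is applied in hardware by the amplifiers; for offline re-analysis, an equivalent digital filter can be applied in software. The sketch below assumes SciPy and a samples-by-channels array; the filter order and the zero-phase (filtfilt) choice are illustrative assumptions, not details taken from the paper.

```python
from scipy.signal import butter, filtfilt

FS = 256.0  # sampling rate in Hz (Section 2.1)

def bandpass(eeg, low_hz=0.5, high_hz=30.0, order=4):
    """Zero-phase Butterworth band-pass applied to each channel (columns of eeg)."""
    nyquist = FS / 2.0
    b, a = butter(order, [low_hz / nyquist, high_hz / nyquist], btype="band")
    return filtfilt(b, a, eeg, axis=0)
```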

Fig. 1. Location of channels used in study (relative distances are not to scale). Set corresponds to the International 10–20 electrode placement (Sharbrough et al., 1991); darkened locations compose the subset defined in Krusienski et al. (2008) and used as a baseline comparison.

2.2. The P300 Speller

2.2.1. Paradigm

The P300 component of an evoked response is an event-related potential that can be elicited via the "oddball" paradigm, in which the subject searches for a target stimulus presented infrequently and unpredictably; the potential begins approximately 300 ms after the target appears (see Polich, 2007). Although the P300 is not consciously generated by the user, it can still be used for communication by allowing the user to consciously decide which of a set of random, repeating stimuli contains the target. The system then assumes that the stimulus that elicited the strongest P300 component is the target.

In this implementation, each participant sat in a comfortable chair, approximately 1 m from a computer screen displaying a 9 × 8 matrix of characters (including the letters of the alphabet, the digits 0–9, and various control characters; see Fig. 2). The user was instructed to spell a preselected word or number. As the user concentrated on each character, the system flashed each individual row and column (i.e. intensified their brightness) in a random order for 62.5 ms, then paused for 62.5 ms; the process of flashing each row and column was repeated five times per character spelled. The system then paused for 3.5 s, and continued to the next character.

Data was collected in two phases. First, EEG data from 38 character selections (comprising five words) were collected following the above procedures for a training phase, in which subjects received no feedback about their spelling performances. The training data was then analyzed in MATLAB to calculate a subject-specific linear classifier. Then, users spelled the same five words again in an online testing phase, during which the system provided real-time feedback by displaying the selected character (i.e. the character at the intersection of the highest-scoring row and highest-scoring column). In the testing phase, row and column data were averaged while spelling to increase the signal-to-noise ratio.

Fig. 2. P300 Speller matrix with example word to be spelled ("MANAGE"). At this moment, the subject is attending to flashes of the letter "M"; the row containing this character is currently illuminated. In "test" mode, after each row and column has flashed five times, the P300 Speller will select a character and display it in the row underneath the target word.

2.2.2. Pre-processing

Raw EEG data is pre-processed for classification in order to eliminate or condense redundant features and increase the P300 signal-to-noise ratio. The method used in this study follows Krusienski et al. (2008): for each 800 ms window after a row or column flash, the time-series data for each channel is moving-averaged and decimated to 17 Hz (i.e., each feature is the average of 1/17 Hz = 58.8 ms of data); then the selected channels' responses are concatenated. This forms the feature vector x_i (for the ith flash); the truth value y_i ∈ {0, 1} records whether flash i illuminated the target character (and can be expected to contain a P300).
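As a concrete illustration of this pre-processing, the sketch below performs the windowing, moving-average decimation, and channel concatenation on a NumPy array. The array layout and function name are assumptions for illustration; only the quoted constants (256 Hz sampling, 800 ms windows, decimation to 17 Hz) come from the paper, and this is not the authors' code.

```python
import numpy as np

FS = 256            # sampling rate (Hz), per Section 2.1
WINDOW_S = 0.8      # 800 ms response window after each flash
DECIMATED_HZ = 17   # rate after moving-average decimation

def flash_features(eeg, flash_onsets, channels):
    """Build one feature vector x_i per flash.

    eeg          : array (n_samples, n_channels) of band-passed EEG
    flash_onsets : sample index at which each row/column flash began
    channels     : indices of the channels to keep (the selected subset)

    Each 800 ms window is averaged in consecutive bins of ~58.8 ms
    (256 / 17 ~ 15 samples) and the per-channel responses are concatenated.
    """
    win = int(round(WINDOW_S * FS))            # ~205 samples per window
    bin_len = int(round(FS / DECIMATED_HZ))    # ~15 samples per bin
    n_bins = win // bin_len
    feats = []
    for onset in flash_onsets:
        window = eeg[onset:onset + n_bins * bin_len, channels]      # (samples, channels)
        binned = window.reshape(n_bins, bin_len, -1).mean(axis=1)   # moving-average + decimate
        feats.append(binned.T.reshape(-1))                          # concatenate channels
    return np.asarray(feats)                                        # (n_flashes, n_features)
```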

2.2.3. Classification

Predicting y_i from the associated x_i is the binary classification problem, which we approach using Stepwise Linear Discriminant Analysis (SWLDA). In Krusienski et al. (2006), a comparison of five classifiers on P300 Speller data, the SWLDA method exhibited competitive classification performance and was cited for built-in feature selection. SWLDA is one of the methods used in Farwell and Donchin's original P300 Speller, as well as in successful systems for ALS patients (Sellers and Donchin, 2006), and is used as a baseline classifier for evaluation of other system parameters (Krusienski et al., 2008).

SWLDA trains a linear discriminant using stepwise regression: linear discriminants have been found to perform favorably for BCI due to their simplicity, robustness, and resistance to over-training (Garrett et al., 2003; Muller et al., 2003; Krusienski et al., 2006), and stepwise regression performs both linear regression and feature selection simultaneously. The algorithm may be summarized as: starting with an empty set of features in a linear regression model, alternate between (1) adding the most-informative unused feature to the subset of selected features; and (2) measuring the individual relevance of every feature in the regression subset and sequentially removing those that fall below a user-defined threshold. "Most-informative" is judged by partial correlation with the response, and "relevance" is judged by p-values obtained via the partial F-test. Per Krusienski et al. (2006), the algorithm was trained using a p-to-enter of 0.10 and a p-to-remove of 0.15 for this study. A full description of the stepwise regression algorithm may be found in Draper and Smith (1981). The feature selection performed therein guards against over-training while still selecting features that offer independent discriminative information.
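Once a classifier of this kind produces a score for every flash, the testing-phase decision rule from Section 2.2.1 (take the character at the intersection of the highest-scoring row and column, after averaging the scores of repeated flashes) is straightforward to express. The sketch below is illustrative only; the names and the 9 × 8 matrix layout are assumptions matching the setup described above.

```python
import numpy as np

def select_character(scores, flash_codes, n_rows=9, n_cols=8):
    """Choose the character at the intersection of the best row and best column.

    scores      : array (n_flashes,) of per-flash classifier scores
    flash_codes : array (n_flashes,); values 0..n_rows-1 mark row flashes,
                  values n_rows..n_rows+n_cols-1 mark column flashes
    Scores of repeated flashes of the same row/column are averaged, as in
    the online testing phase, before taking the argmax.
    """
    scores = np.asarray(scores, dtype=float)
    flash_codes = np.asarray(flash_codes)
    mean_score = lambda code: scores[flash_codes == code].mean()
    best_row = max(range(n_rows), key=mean_score)
    best_col = max(range(n_rows, n_rows + n_cols), key=mean_score)
    return best_row, best_col - n_rows   # indices into the 9 x 8 character matrix
```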

2.3. Channel selection

2.3.1. Performance assessment

Channel selection can be viewed as an optimization problem: choose the optimal subset of channels from the full set available. Since the ultimate goal of the P300 Speller is to select characters from the screen, optimal in this case means "that which maximizes the percentage of correctly-chosen characters."

However, the relatively low speed of the P300 Speller means that training data is limited: in this case the data used to compare the following channel-selection techniques consist of 38 characters per subject. This is a very low-resolution ("blocky") scoring space for comparing performance. Compounding this problem, wrapper methods that use the score directly to select features (such as SFS and KFMR, discussed below) must compare improvements on the order of 0–5 characters per channel added. This problem is circumvented by using the area under receiver operating characteristic score, which is based on each flash rather than each spelled character; each 38-character training dataset includes over 4000 flashes. A receiver operating characteristic (ROC) is a curve that describes the discriminative ability of a binary classifier (see Hanley and McNeil, 1982); it may be estimated by applying the classifier to a set of test data and measuring the overlap between target and non-target classifier scores. In order to turn an estimated ROC into a scalar suitable for scoring channel selection, the area underneath the ROC (AUC) is computed. Information is lost in this calculation, but it is generally true that an increase in AUC represents an increase in discriminative ability (see Hanley and McNeil, 1982). Since the P300 Speller's classification task is to differentiate response-to-target data from response-to-non-target data, this discriminative ability is what drives the P300 Speller's performance, and AUC is an appropriate measure to use for channel-selection algorithm scoring.

AUC and accuracy scores for each channel selection method are calculated in the manner channel selection would be implemented practically: for each subject, all 32 channels of the five training runs of data are analyzed by the channel selection algorithm, which returns an eight-channel subset; then, SWLDA is trained on those channels of the training data; the resulting linear classifier is applied to the selected channels of the five testing runs; and the scores of the testing runs are analyzed as above to form an ROC, from which AUC is calculated. Character runs include five sequences each. Note that, following the goal of finding the most useful small subset of electrodes, each algorithm is constrained to select exactly eight channels. This also facilitates the performance comparison to the "standard set" described in Krusienski et al. (2008): {Fz, Cz, Pz, Oz, P3, P4, PO7, PO8}.
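For reference, the flash-level AUC used here can be computed directly from target and non-target classifier scores with the rank-sum (Mann–Whitney) estimator, which equals the area under the empirical ROC; a library routine such as scikit-learn's roc_auc_score gives the same value. The helper below is an illustrative sketch with assumed names, not the authors' implementation.

```python
import numpy as np
from scipy.stats import rankdata

def flash_auc(scores, is_target):
    """Area under the ROC for per-flash classifier scores.

    scores    : array (n_flashes,) of classifier outputs
    is_target : boolean array, True where the flash contained the attended
                character (y_i = 1)
    """
    scores = np.asarray(scores, dtype=float)
    is_target = np.asarray(is_target, dtype=bool)
    n_pos, n_neg = is_target.sum(), (~is_target).sum()
    ranks = rankdata(scores)                       # average ranks handle ties
    u = ranks[is_target].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)                     # Mann-Whitney U scaled to [0, 1]
```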

2.3.2. Maximum signal-to-noise ratio

A simple baseline method to select channels heuristically can be constructed by making a set of assumptions:

• The channels that are the most helpful for identifying responses to targets are the channels with the strongest P300 response.
• Since the P300 is a relatively large deflection in potential, recordings in which it is present will register a relatively large change in signal energy between target responses and non-target responses.
• This energy difference corresponds to the strength of the P300 response, so the highest-performing channel set will be the set with the largest ratio of average energy between targets and non-targets.

Calculating the energy of a feature vector as E(x_i) = ||x_i||_2^2, this is equivalent to selecting the channels that display the highest signal-to-noise ratio (SNR), if we define "signal" to mean "target response" and "noise" to mean "non-target response". Two versions of this method are used: one in which the energy of the full [0 ms, 800 ms] window is calculated, and one in which energy is calculated only for the window [300 ms, 600 ms], as this generally encompasses the majority of the P300 component for this task.
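Under these assumptions, Max-SNR reduces to ranking channels by the ratio of mean target-response energy to mean non-target-response energy over the chosen window. The sketch below illustrates this; the array layout and names are assumptions for illustration.

```python
import numpy as np

def max_snr_channels(windows, is_target, n_select=8):
    """Select channels by target / non-target energy ratio.

    windows   : array (n_flashes, n_samples, n_channels) holding each
                post-flash response, cropped to either the full 0-800 ms
                window or the 300-600 ms window
    is_target : boolean array (n_flashes,)
    Returns the indices of the n_select channels with the largest ratio of
    mean target energy to mean non-target energy.
    """
    is_target = np.asarray(is_target, dtype=bool)
    energy = (windows ** 2).sum(axis=1)            # per-flash, per-channel energy
    ratio = energy[is_target].mean(axis=0) / energy[~is_target].mean(axis=0)
    return np.argsort(ratio)[::-1][:n_select]
```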

2.3.3. Sequential forward search

Instead of following a heuristic that relies on assumptions about a channel subset's performance, an algorithm can simply evaluate the performance of the subsets in which it is interested, and choose the best one (a "wrapper" method). Unfortunately, there are (32 choose 8) ≈ 10^7 eight-channel subsets of a 32-channel EEG cap, so a bounded search must be performed. Sequential forward selection (SFS) is an intuitive and widely-used feature-selection method in this vein, cited in Muller et al. (2004). Due to its simplicity and monotonic performance increase with channel set size, it is performed here as the baseline active channel selection method. It can be summarized as:

1. Begin with an empty currentSubset. While currentSubset has fewer than eight members:
   (a) For each unusedChannel_j not in currentSubset:
       (i) augmentedSubset ← currentSubset + unusedChannel_j.
       (ii) Evaluate the performance of augmentedSubset.
   (b) Select the unusedChannel whose augmentedSubset performance was highest.
   (c) currentSubset ← currentSubset + selectedChannel.

SFS repeatedly adds channels to the subset, at each step selecting the channel that yields the highest performance increase. However, this performance must be evaluated before encountering the testing data. This is accomplished via 10-fold cross-validation: a classifier is trained on 9/10 of the training data, then applied to the remaining 1/10 of the training data; the process is repeated until all data has been tested, and an ROC is constructed from the entire set of resulting scores.
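The sketch below shows SFS driven by the 10-fold cross-validated AUC described above. Ordinary LDA from scikit-learn is used purely to keep the example self-contained and is a stand-in for SWLDA, not the paper's classifier; the data layout (a dict mapping channel index to that channel's feature columns) and the function names are assumptions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

def cv_auc(features_by_channel, y, subset, n_folds=10):
    """10-fold cross-validated flash-level AUC for one candidate channel subset."""
    X = np.hstack([features_by_channel[c] for c in subset])
    scores = np.zeros(len(y), dtype=float)
    for train, test in StratifiedKFold(n_splits=n_folds).split(X, y):
        clf = LinearDiscriminantAnalysis().fit(X[train], y[train])
        scores[test] = clf.decision_function(X[test])
    return roc_auc_score(y, scores)

def sequential_forward_selection(features_by_channel, y, n_select=8):
    """Greedy SFS: at each step, add the channel that maximizes CV-AUC."""
    remaining = set(features_by_channel)           # channel indices
    current = []
    while len(current) < n_select and remaining:
        best = max(remaining,
                   key=lambda ch: cv_auc(features_by_channel, y, current + [ch]))
        current.append(best)
        remaining.remove(best)
    return current
```

With 32 channels and 8 slots this requires 32 + 31 + ... + 25 = 228 cross-validated classifier fits, which is why the wrapper methods are slow relative to the filter methods.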

2.3.4. Sequential reverse selection

The converse method to SFS, sequential reverse selection (SRS), begins with the full set of channels and sequentially removes the least helpful. However, SRS requires more iterations to reach eight channels, and each iteration must analyze more data than an SFS iteration; as a result, standard SRS (with cross-validated AUC as its cost function) requires over three days of computation for a single subject and was not evaluated for this study.

However, as noted earlier, Cecotti et al. (2011) found that SRS selected channel sets that exhibited improved spelling accuracy when using a different cost function that is based on the signal-to-signal-plus-noise ratio (SSNR). Modeling the signal as a sum of superimposed delay-locked target and nontarget flash responses allows the SSNR to be calculated directly using singular value decomposition, and this is much cheaper computationally than cross-validated classifier training. Additionally, the use of spatial filtering was found to improve classification accuracy in the case of low training data in Rivet et al. (2009), and was incorporated into the SRS experiment in Cecotti et al. (2011). This method is applied to the data in this study as well, with and without spatial filtering, for comparison.

2.3.5. k-forward, m-reverse search

SFS suffers from a "nesting effect" (see Pudil et al., 1994): once added, channels remain in the subset, even if the information they contributed is fully duplicated by later channels (i.e., there is a risk of adding broadly informative but specifically mediocre channels). To adjust for this, k-forward, m-reverse selection was considered (KFMRS, sometimes known as "plus-l minus-r" selection). Referring to the previous definitions of SFS and SRS, the algorithm can be summarized as:

1. Begin with an empty currentSubset. While currentSubset has fewer than eight members:
   (a) Add k members to currentSubset using SFS.
   (b) Remove m members from currentSubset using SRS.

Efficiency can be somewhat improved by maintaining a library of channel subsets for which performance has already been calculated, so that scores can be reused when a forward or backward step revisits a subset. With KFMR comes added computational cost relative to SFS, as well as two parameters k and m (with m < k) that must be set. A high value of k implies that the algorithm may have to begin the SRS phase with a high number of channels in the current subset (a high number of channels takes much longer to analyze, as cross-validation and SWLDA training both require more time with increased feature dimensionality). However, a low value of k may prevent the algorithm from discovering mutually complementary channels. To investigate both possibilities, the (k, m) pairs (3, 1) and (16, 8) were evaluated.
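A sketch of the k-forward, m-reverse loop, written against any subset-scoring function (for example, the cross-validated AUC sketched for SFS); the structure is the point here, and the names are assumptions for illustration.

```python
def kfmr_selection(score, all_channels, k=3, m=1, n_select=8):
    """k-forward, m-reverse ("plus-l minus-r") channel selection.

    score(subset) -> float is any subset-scoring function, e.g.
    score = lambda s: cv_auc(features_by_channel, y, s).
    Each round adds k channels greedily and then removes m greedily,
    so a channel added early can be dropped once it becomes redundant.
    """
    current = []
    while len(current) < n_select:
        for _ in range(k):                           # forward (SFS) phase
            candidates = [c for c in all_channels if c not in current]
            if not candidates:
                break
            current.append(max(candidates, key=lambda c: score(current + [c])))
        for _ in range(m):                           # reverse (SRS) phase
            if len(current) <= 1:
                break
            # The "worst" channel is the one whose removal leaves the best score.
            worst = max(current, key=lambda c: score([x for x in current if x != c]))
            current.remove(worst)
    return current[:n_select]
```

The paper evaluates this with (k, m) = (3, 1) and (16, 8).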

2.3.6. Jumpwise regression for channel selection

Cross-validating with SWLDA over many channel sets is computationally intense and likely infeasible in a clinical or home environment. This motivates the design of an algorithm that utilizes information in the data to choose channels in one process (a filter method, per Lal et al., 2004). Ideally this method would use a statistically-sound principle rather than a heuristic like SNR.

In fact, this task is similar to the feature selection performed in training; this similarity inspired the algorithm considered here, titled "jumpwise regression": a stepwise regression-inspired method that operates on groups of features, taking "jumps" instead of "steps". Instead of adding one feature and then testing each feature in the regression for continued relevance, jumpwise regression adds a channel's worth of features, and then tests each channel's worth of features in the regression model for continued relevance. The statistical relevance tests are identical to stepwise regression, as the partial F-test makes no specific claims on the difference in size of regression models and sub-models. However, jumpwise must add features differently: stepwise regression selects the next feature to add by choosing the feature with the highest partial correlation with the response, controlling for features already in the model, but partial correlation does not have a scalar analog for multidimensional data. Instead, jumpwise selects the next channel to add in the same way channels are removed: each channel is temporarily added to the current model in turn, and the channel judged most relevant by the partial F-test is selected. (MATLAB code for jumpwise regression is electronically available by request.) In this study, p-to-enter is set at 0.10, and p-to-remove is set at 0.15. (Note that there is no natural way to force jumpwise selection to choose more channels than it deems statistically relevant, besides re-running it with a looser relevance test; at these p-values, jumpwise selection chose eight channels for every subject, which was the maximum number allowable as discussed above.) The algorithm can be summarized as:

1. Begin with an empty currentSubset. While currentSubset has fewer than eight members:
   (a) For each unusedChannel_i not in currentSubset:
       i. Perform a partial F-test of the model containing {all features of currentSubset and the features of unusedChannel_i} against {the features of currentSubset alone}.
   (b) If the lowest p-value of the above F-tests is less than p-to-enter, add the corresponding channel to currentSubset; else, quit the algorithm.
   (c) For each selectedChannel_i in currentSubset:
       i. Perform a partial F-test of the model containing {all features of currentSubset} against {all features of currentSubset except those belonging to selectedChannel_i}.
   (d) If the highest p-value of the above F-tests is greater than p-to-remove, remove the corresponding channel from currentSubset and repeat (c); else, go to (a).
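The following is a minimal Python sketch of the jumpwise procedure above, regressing the 0/1 label vector on the selected features by ordinary least squares and using the standard partial F-test for nested models. It is an illustration of the published pseudocode under those assumptions, not the authors' MATLAB implementation (which they state is available by request), and all names are illustrative.

```python
import numpy as np
from scipy.stats import f as f_dist

def _rss(X, y):
    """Residual sum of squares and parameter count of an OLS fit (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X]) if X is not None else np.ones((len(y), 1))
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(resid @ resid), X1.shape[1]

def _partial_f_pvalue(X_full, X_reduced, y):
    """p-value of the partial F-test comparing nested OLS models."""
    rss_full, p_full = _rss(X_full, y)
    rss_red, p_red = _rss(X_reduced, y)
    q = p_full - p_red                      # number of features being tested
    dof = len(y) - p_full
    F = ((rss_red - rss_full) / q) / (rss_full / dof)
    return f_dist.sf(F, q, dof)

def jumpwise_selection(features_by_channel, y, p_enter=0.10, p_remove=0.15, max_ch=8):
    """Stepwise-style selection that adds and removes whole channels of features."""
    y = np.asarray(y, dtype=float)
    selected = []
    stack = lambda chans: (np.hstack([features_by_channel[c] for c in chans])
                           if chans else None)
    while len(selected) < max_ch:
        # Forward "jump": test each unused channel against the current model.
        unused = [c for c in features_by_channel if c not in selected]
        pvals = {c: _partial_f_pvalue(stack(selected + [c]), stack(selected), y)
                 for c in unused}
        best = min(pvals, key=pvals.get)
        if pvals[best] >= p_enter:
            break                           # nothing left that is statistically relevant
        selected.append(best)
        # Backward sweep: drop channels that no longer contribute.
        while len(selected) > 1:
            drop_p = {c: _partial_f_pvalue(stack(selected),
                                           stack([x for x in selected if x != c]), y)
                      for c in selected}
            worst = max(drop_p, key=drop_p.get)
            if drop_p[worst] <= p_remove:
                break
            selected.remove(worst)
    return selected
```

The design point is that each enter/remove decision requires only least-squares solves and F statistics rather than repeated classifier training and cross-validation, which is where the speed advantage over the wrapper methods comes from.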

3. Results

3.1. Performance

Figs. 3–5 display the AUC and percent-correct performance obtained for each subject, using classifiers trained on the respective channel subsets selected by each method; the corresponding ROCs are displayed in Fig. 7. The baseline for comparison is the standard subset, plotted in each figure and sorted from high AUC score to low. The average performance over all subjects is compared in Table 1.

Fig. 3. AUC (area under receiver operating characteristic) and Percent Correct scores for the subsets chosen by full- and short-windowed Max-SNR methods on each subject's data. (Different methods are offset by a slight margin on the x-axis so that identical scores may be viewed.) AUC is a general measure of classifier discriminability, and Percent Correct measures the direct accuracy of the simulated output. Scores are sorted by each subject's AUC score using the "standard" subset defined in Krusienski et al. (2008). In nearly every case, the subset selected by Max-SNR yields lower accuracy than the standard subset.

Fig. 4. AUC and percent correct scores for the subsets chosen by sequential selection methods. In contrast to Max-SNR, the mean accuracy obtained with the selected subsets increases relative to the standard subset. Further, many subjects at the lower end of the spectrum see a large increase in accuracy with channel selection. Subsets selected by SFS obtain performance similar to subsets selected by KFMR.

Fig. 5. AUC and percent correct scores for the subsets chosen by jumpwise selection. The mean accuracy of subsets selected by jumpwise selection is significantly higher than the accuracy of the standard set. The increase is similar in scale to that obtained by SFS, including the large gains for subjects with previously low accuracy. However, jumpwise requires only 1 min to compute, in contrast to the 1 h required by SFS.

Table 1
Results of each method. Subjects' scores are calculated individually; the average over subjects is presented here. Computation time was calculated by training each algorithm on 38 characters of data in MATLAB on a quad-core CPU with 4 GB RAM. The significance test for AUC differences is the clustered AUC comparison test of Obuchowski (see Obuchowski, 1997); for percent-correct, the proportion difference test is used. In each case, scores from the subsets obtained by a new method are compared to the scores of the standard set. Both SNR methods choose subsets that average a decrease in performance relative to the standard set, whereas the other methods choose subsets that result in an average increase in performance, using either scoring metric.

Method                       | Mean time | Mean AUC | p (vs. standard) | Mean Acc. | p (vs. standard)
(Standard set)               | –         | 0.844    | –                | 74.7%     | –
Full-window SNR              | 5 s       | 0.750    | –                | 45.3%     | –
Short-window SNR             | 5 s       | 0.756    | –                | 48.1%     | –
Seq. reverse search (SSNR)   | 90 s      | 0.8407   | –                | 72.2%     | –
Seq. Rev. (SSNR), Sp. Filt.  | 90 s      | 0.8303   | –                | 68.3%     | –
Seq. forward search          | 1 h       | 0.866    | <0.0001          | 79.2%     | 0.048
3-Forward, 1-reverse         | 1.5 h     | 0.868    | 0.00076          | 79.7%     | 0.028
16-Forward, 8-reverse        | 7.5 h     | 0.866    | 0.00021          | 79.1%     | 0.054
Jumpwise selection           | 90 s      | 0.869    | <0.0001          | 79.1%     | 0.054
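Table 1's percent-correct comparisons use a proportion difference test; the exact variant is not specified beyond the caption, so the sketch below shows a standard two-proportion z-test on pooled correct-selection counts as one plausible reading, with illustrative names.

```python
import numpy as np
from scipy.stats import norm

def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Two-sided z-test for a difference between two proportions (pooled variance)."""
    p_a, p_b = correct_a / total_a, correct_b / total_b
    pooled = (correct_a + correct_b) / (total_a + total_b)
    se = np.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    return 2 * norm.sf(abs(z))   # two-sided p-value
```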

As can be noted from Fig. 3, the channel subsets chosen by the heuristic method, Maximum SNR, yield poorer performance than the standard channel subset for nearly every subject. Their mean percent-correct scores are below 50%, making the speller unusable (since incorrectly-spelled letters must be followed by a correctly-spelled backspace character, on average, to be usable). No clear pattern exists differentiating performance by SNR channel sets chosen with the full window versus the shortened window. The mean computational time to run SNR on five training runs is 5.0 s.

There are many possible reasons for the poor performance of the Max-SNR algorithm. For example, channels 17 and 18 are selected for nearly every subject: these channels correspond to electrodes directly above the eyebrows, which see large potential deflections any time the subject's eyes or surrounding facial musculature move. If subjects' eyes are more likely to move when they see their target than when they see a non-target, their target response may have higher variance but retain no difference in mean from the non-target response, making these features worthless to a linear classifier. However, excluding these channels from consideration before applying maximum SNR selection was also attempted, and still did not result in an increase in average accuracy. Similarly, other areas of the scalp could see an energy difference due to responses that are not stable in the time domain. Thus, energy-based metrics for channel selection do not appear effective.

As shown in Fig. 4, SFS selects channel subsets that outperform the standard set for most subjects (p < 0.0001, Obuchowski clustered-AUC test; see Obuchowski, 1997); several subjects see large increases in performance relative to the standard subset (particularly subjects with lower standard-subset scores), and of the few that see a performance decrease, it is by a narrow margin relative to the gains. The average of the SFS scores also demonstrates an improvement over the average of the standard-subset scores. The mean computational time required for SFS to select channels for one subject-session is approximately 1 h. This set of improvements indicates that effective channel selection can indeed reliably outperform the standard channel subset. However, the time required for the algorithm is infeasible in a clinical or home setting.

The results shown in Fig. 4 for KFMR are very similar to the SFS results, although different channels are chosen. The difference in performance relative to the performance with the standard set is also statistically significant (p = 0.00076 for (3, 1)-KFMR and p = 0.00021 for (16, 8)-KFMR, Obuchowski clustered-AUC test). Improvements for the majority of subjects appear in both parameterizations of KFMRS. However, the mean time required to run (3, 1)-KFMRS was 1.5 h, and the time required to run (16, 8)-KFMRS was 7.5 h. Considering the limited improvement over SFS, the increased computational time cannot be justified, and it still falls outside a selection computation time acceptable for practical use.

The results shown in Fig. 5 compare jumpwise selection to SFS, as a representative of the wrapper methods, and to the standard set. Again, a statistically-significant increase in performance is observed over the standard subset (p < 0.0001, Obuchowski clustered-AUC test). Further, the improvement in mean AUC is very close to the improvements found with SFS and KFMR, including the large performance boosts observed for subjects with poor default performance. However, the computation required by jumpwise is approximately 90 s: a 40× increase in speed relative to SFS, and easily fast enough to compute between training and testing runs.

Fig. 6. AUC and percent correct scores for the subsets chosen by the sequential reverse search using signal-to-signal-plus-noise ratio, per Cecotti et al. (2011). Although several subjects observe comparable or better performance using the SRS–SSNR method, two subjects are left with nearly unusable channel sets. On average, SFS and jumpwise selection both outperform the SRS method.

Fig. 6 compares the performance obtained by SFS and jumpwise to the SRS-SSNR method utilized by Cecotti et al. (2011). No clear pattern exists for the SRS-SSNR scores: some subjects obtained substantial improvements (see the 3rd, 11th, and 16th subjects' AUC scores), while others obtained lower scores (8th, 9th, 12th), and two subjects obtained dramatic decreases compared to the standard subset (14th and 15th). Regardless of whether spatial filtering is applied, the mean scores for jumpwise are significantly higher than the mean scores for SRS-SSNR (p = 0.027 and p = 0.0026 with and without spatial filtering, Obuchowski clustered-AUC test). Spatial filtering does not appear to aid performance; however, the performance benefits from spatial filtering in Rivet et al. (2009) were primarily seen in cases with far less training data than this study (2–5 training characters rather than 38). Although SRS-SSNR requires approximately the same amount of time as jumpwise selection to choose a subset, its wide variance in observed scores may make it a less reliable channel selection algorithm; further studies in an online test could help determine whether this is the case.

Fig. 7. Comparison of receiver operating characteristics obtained by SWLDA on each subject's training data (one panel per subject, probability of detection versus probability of false alarm), using the respective channel subsets obtained by each investigated method. Several subjects obtain considerably better ROCs with actively-selected subsets than with the standard set described in Krusienski et al. (2008).

3.2. Channels selected

The consistency of the subsets selected by the effective methods was also considered: if a method improved accuracy by selecting the same (non-standard) subset for all subjects, for example, active channel selection could be avoided by simply replacing the standard subset from Krusienski et al. (2008) with the new subset. Fig. 8 displays the number of times each channel was selected by each method, compiling that method's selections for all subjects. In fact, no method selected even a single channel for more than 15 of 18 subjects, and among the effective methods, nearly every channel was selected at least once. Channels in the standard subset are shown in red in the figure; as expected, many are selected often, implying that the standard subset does provide predictive power. However, most standard-subset channels were still selected for fewer than half of the subjects tested. Also note that channels numbered higher than 24 refer to occipital locations, where over half the standard-subset channels are clustered; these channels were often selected by the effective methods, suggesting that the standard subset is clustered in an area with valuable discriminative information, but that optimal sites differ from user to user. This can be easily seen in the topographical heat map of jumpwise-selected channels in Fig. 8. The subsets selected by each method for three representative subjects are displayed in Fig. 9: the effective methods tend to share more channels with each other than with the standard subset. By contrast, the maximum SNR methods selected very few of the same channels as the standard subset or the effective methods' selected subsets, with the exception of Subject 11, the only user for whom the Maximum SNR subset obtained accuracy similar to the other methods' subsets. Fig. 10 displays the mean ERP for each subject on three representative channels. Little consistency in waveform can be seen from subject to subject, demonstrating that active channel selection is not matching waveforms to a pattern but rather determining whether the channel is most beneficial to the user.

Fig. 8. Histogram representing how many times each channel was selected by each method across all subjects; the channels in the standard subset are shown in red. Channel positions on the scalp are shown in Fig. 1. The maximal value, 18, would be attained only if the method chose this channel for every subject; notably, no method did so for any channel. The effective methods (SFS, KFMR, and jumpwise) selected the channels in the standard set more often than the maximum SNR methods, but most were still selected for fewer than half of subjects. Instead, the effective methods selected nearly every channel for at least one subject. The jumpwise selection histogram is also displayed as a topographical heat map, in which more popular channels are colored darker than less popular channels. Rather than following the exact layout of the standard channels, popular channels are distributed throughout the occipital region.

Fig. 9. Channel subsets chosen by each method for three representative subjects. Subjects 3, 11, and 16 (as sorted by AUC score in Fig. 5) represent users who observe good, average, and poor performance, respectively, with the standard channel subset. Often, the effective methods (SFS, KFMR, and jumpwise) include only a few of the standard channels, although those methods frequently selected many of the same channels. The maximum SNR methods selected very different subsets than the standard set or the effective methods, with the exception of Subject 11 (the only subject to observe improved performance with the SNR subsets). Two methods very rarely selected identical subsets for a given dataset.

4. Discussion

The motivation for this investigation was to enhance the practicality of the P300 Speller using active channel selection, which offers the potential to improve spelling accuracy and correct for potential implementation errors or neural changes in the user, while still utilizing a channel subset of modest size. The obtained results are consistent with the literature (Krusienski et al., 2008; Schröder et al., 2005) in demonstrating that channel selection can have a dramatic effect on P300 Speller usability: poor channel sets, such as those selected by Max-SNR, can make the system unusable, whereas other channel sets, such as those selected by the sequential methods and jumpwise, can increase system accuracy relative to a standard channel subset with statistical significance. However, the ultimate goal of a channel selection algorithm is to enhance the practicality of the system; computation time, therefore, is quite important. Since the sequential methods examined here required far too much computation to be implemented during system setup, their usefulness is limited; further, although more sophisticated and computationally intensive sequential selection methods exist, such as the sequential forward floating search method described in Pudil et al. (1994), their investigation is not motivated by these results. Of the methods tested, jumpwise selection combines the performance improvements of the sequential methods with practical computation requirements that may be put into practice for data collection and home use. While not presented here, these results were replicated using Dataset II of the BCI Competition III, demonstrating that these results are not unique to the 18-subject dataset analyzed here.

Most notable is the dramatic improvement observed in several subjects' sessions that began with low default-set scores. It is easy to see that channel-selected scores are strongly correlated with standard-set scores, implying that the driving factor in score variation is not channel subset. However, this pattern is interrupted by at least four of the poorer performers, suggesting that, for these subjects, poor response near the standard channels may be (a) a primary reason their scores are lower than average, and (b) correctable with these methods. This may be an important consideration for transitioning P300 Spellers for use with populations with disabilities. For example, Sellers and Donchin (2006) observed lower average P300 Speller performance from subjects with ALS, compared to healthy subjects. Therefore, benefits to users in this range of performance may be important.

Fig. 10. Each subject's average ERP response over all trials on three representative channels: a "popular" channel, PO8, which jumpwise selected for 14 of 18 subjects; a "common" channel, P7, selected for 8 subjects; and an "unpopular" channel, F7, selected for only one subject. The ERP waveform is plotted in blue if jumpwise selected the channel for that subject. Wide variation is seen across subjects, even for popular channels: active channel selection finds helpful channels for a user rather than matching waveforms to a particular pattern.

Additionally, the ability of jumpwise selection to choose effective channel subsets suggests that it may have the capacity to counteract poor electrode-cap placement, a risk of home use, by selecting other channels that still contain discriminative information. However, since the data for this study were collected under careful laboratory conditions, this hypothesis cannot be tested using the current dataset.

We have thus presented a new method for channel selection that is based on the SWLDA classifier, is fast enough to deploy for home and clinical use, obtains state-of-the-art performance compared to existing approaches, and may make the P300 Speller a far more effective system for certain subjects, while maintaining or improving performance relative to the standard subset for others. Further, several avenues for investigation into P300 Speller channel selection remain inviting. Although the eight-channel limit demonstrates that these channel-selection methods can select higher-performance channel sets than the standard set, the amount of additional performance "purchased" by adding an electrode may be valuable information to users. Thus, one measure of interest might be the AUC scores of jumpwise selection and SFS as the number of desired channels steps through a wide range. Also, longitudinal results that examine selected-subset consistency for a single user over time are of interest: considerable implementation effort could be saved if a subset of channels need only be determined once per subject.

References

Cecotti H, Rivet B, Congedo M, Jutten C. A robust sensor selection method for P300 brain–computer interfaces. J Neural Eng 2011;8(1):1–21.
Draper N, Smith H. Applied regression analysis. 2nd ed. John Wiley & Sons; 1981.
Farwell L, Donchin E. Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr Clin Neurophysiol 1988;70:510–23.
Garrett D, Peterson D, Anderson C, Thaut M. Comparison of linear, nonlinear, and feature selection methods for EEG signal classification. IEEE Trans Neural Syst Rehabil Eng 2003;11(2):141–4.
Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res 2003;3:1157–82.
Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic curve. Radiology 1982;143:29–36.
Jin J, Allison BZ, Brunner C, Wang B, Wang X, Zhang J, et al. P300 Chinese input system based on Bayesian LDA. Biomed Tech (Berl) 2010;55:5–18.
Krusienski DJ, Sellers EW, Cabestaing F, Bayoudh S, McFarland DJ, Vaughan TM, et al. A comparison of classification techniques for the P300 Speller. J Neural Eng 2006;3(4):299–305.
Krusienski DJ, Sellers EW, McFarland DJ, Vaughan TM, Wolpaw JR. Toward enhanced P300 speller performance. J Neurosci Methods 2008;167(1):15–21.
Kübler A, Kotchoubey B, Kaiser J, Wolpaw JR, Birbaumer N. Brain–computer communication: unlocking the locked in. Psychol Bull 2001;127(3):358–75.
Lal TN, Schröder M, Hinterberger T, Weston J, Bogdan M, Birbaumer N, et al. Support vector channel selection in BCI. IEEE Trans Biomed Eng 2004;51(6):1003–10.
Lan T, Erdogmus D, Adami A, Mathan S, Pavel M. Channel selection and feature projection for cognitive load estimation using ambulatory EEG. Comput Intell Neurosci 2007;2007:74895.
Muller K, Anderson C, Birch G. Linear and nonlinear methods for brain–computer interfaces. IEEE Trans Neural Syst Rehabil Eng 2003;11(2):165–9.
Muller K, Krauledat M, Dornhege G, Curio G, Blankertz B. Machine learning techniques for brain–computer interfaces. In: 2nd international BCI workshop & training course; 2004. p. 11–22.
Obuchowski NA. Nonparametric analysis of clustered ROC curve data. Biometrics 1997;53(2):567–78.
Polich J. Updating P300: an integrative theory of P3a and P3b. Clin Neurophysiol 2007;118(10):2128–48.
Pudil P, Novovičová J, Kittler J. Floating search methods in feature selection. Pattern Recognit Lett 1994;15(11):1119–25.
Rakotomamonjy A, Guigue V. BCI competition III: dataset II - ensemble of SVMs for BCI P300 speller. IEEE Trans Biomed Eng 2008;55(3):1147–54.
Rivet B, Souloumiac A, Attina V, Gibert G. xDAWN algorithm to enhance evoked potentials: application to brain–computer interface. IEEE Trans Biomed Eng 2009;56(8):2035–43.
Schalk G, McFarland DJ, Hinterberger T, Birbaumer N, Wolpaw JR. BCI2000: a general-purpose brain–computer interface (BCI) system. IEEE Trans Biomed Eng 2004;51(6):1034–43.
Schröder M, Lal TN, Hinterberger T, Bogdan M, Hill NJ, Birbaumer N, et al. Robust EEG channel selection across subjects for brain–computer interfaces. EURASIP J Adv Signal Process 2005;2005(19):3103–12.
Sellers EW, Donchin E. A P300-based brain–computer interface: initial tests by ALS patients. Clin Neurophysiol 2006;117(3):538–48.
Sellers EW, Kübler A, Donchin E. Brain–computer interface research at the University of South Florida Cognitive Psychophysiology Laboratory: the P300 Speller. IEEE Trans Neural Syst Rehabil Eng 2006;14(2):221–4.
Sharbrough F, Chatrian G, Lesser R, Luders H, Nuwer M, Picton T. American Electroencephalographic Society guidelines for standard electrode position nomenclature. J Clin Neurophysiol 1991;8(2):200–2.
Townsend G, LaPallo BK, Boulay CB, Krusienski DJ, Frye GE, Hauser CK, et al. A novel P300-based brain–computer interface stimulus presentation paradigm: moving beyond rows and columns. Clin Neurophysiol 2010;121(7):1109–20.
Vaughan TM, McFarland DJ, Schalk G, Sarnacki WA, Krusienski DJ, Sellers EW, et al. The Wadsworth BCI research and development program: at home with BCI. IEEE Trans Neural Syst Rehabil Eng 2006;14(2):229–33.