Relevance Vector Machine Applied to EEG Signals Classification Sandro Chagas, Marcio Eisencraft, Clodoaldo Ap
Total Page:16
File Type:pdf, Size:1020Kb
XXVI SIMPÓSIO BRASILEIRO DE TELECOMUNICAÇÕES - SBrT’08, 02-05 DE SETEMBRO DE 2008, RIO DE JANEIRO, RJ Relevance Vector Machine Applied to EEG Signals Classification Sandro Chagas, Marcio Eisencraft, Clodoaldo Ap. M. Lima Abstract —The electroencephalogram (EEG) is a complex gives better frequency resolution. Also AR method has an and aperiodic time series, which is a sum over a very large advantage over FFT that, it needs shorter duration data number of neuronal membrane potentials. Despite the rapid records than FFT [20]. advances of neuroimaging techniques, EEG recording con- In the late 1890s, a powerful method was proposed to tinues playing an important role in both the diagnosis of perform time-scale analysis of signals: the wavelet trans- neurological diseases and understanding of the psychological process. In order to extract relevant information of brain forms (WT). This method provides a unified framework electrical activity, a variety of computerized-analysis me- for different techniques that have been developed for var- thods have been used. In this paper, we propose the use of a ious applications. Since the WT is appropriate for analysis recently developed machine-leaning technique – relevance of non-stationary signals and this represents a major ad- vector machine (RVM) – for EEG signals classification. vantage over spectral analysis, it is well suited to locating RVM is based on Bayesian estimation theory, which has as transient events. Wavelet’s feature extraction and repre- distinctive feature the fact that it can yield a sparse decision sentation properties can be used to analyze various tran- function defined only by a very small number of so-called re- sient events in biological signals. In [1] was presented an levance vectors. From the experimental results, we can see overview of the discrete wavelet transform (DWT) devel- that estimation and classification based on RVM perform well in EEG signals classification problem compared with oped for recognizing and quantifying spikes, sharp waves traditional approach support vector machine (SVM), which and spike-waves. Through, wavelet decomposition of the indicates that this classification method is valid and has EEG records, transient features are accurately captured promising application. and localized in both time and frequency context. Various other techniques from the theory of signal ana- Keywords - Electroencephalogram, Support Vector Ma- lyses have been used to obtain representations and extract chine, Relevance Vector Machine, Bayesian estimation the features of interest for classification purposes. Neural theory, classification problem networks and statistical pattern recognition methods have been applied to EEG analysis. In [11] was used the raw EEG data as an input to a neural network while in [19] I. INTRODUCTION was used the features proposed by Gotman with an adap- The human brain is a complex system, and exhibits rich tive structure neural network, but his results show a poor spatiotemporal dynamics. Among the techniques for in- false detection rate. In [10] a recurrent neural network vestigating human brain dynamics, electroencephalogra- combined with wavelet pre-processing was proposed to phy (EEG) provides a non-invasive and direct measure of predict the onset of epileptic seizures both on scalp and cortical activity with temporal resolution in milliseconds. intracranial recordings only one-channel of electroence- EEG is a record of the electrical potentials generated by phalogram. the cerebral cortical neurons. Early on, EEG analysis was However, most of the techniques used to train the neu- restricted to visual inspection of EEG records. Since there ral network classifiers are based on the idea of minimi- is no definite criterion evaluated by the experts, visual zing the training error, which is usually called empirical analysis of EEG signals is insufficient. Routine clinical risk. As a result, limited amounts of training data and over diagnosis requires the analysis of EEG signals. Therefore, high training accuracy often lead to over training instead some automation and computer techniques have been of good classification performance. In addition, its classi- used for this aim [5]. Since the early days of automatic fication accuracy is also sensitive to the dimension of the EEG processing, representations based on a Fourier trans- training set. form have been most commonly applied. This approach is On the other hand, the Support Vector Machine (SVM) based on earlier observations that the EEG spectrum con- approach is based on the minimization of the structural tains some characteristic waveforms that fall primarily risk [18], which asserts that the generalization error is de- within four primary components. Such methods have limited by the sum of the training error and a parcel that proved beneficial for various EEG characterizations, but depends on the Vapnik-Chervonenkis dimension. By mi- fast Fourier transform (FFT), suffers from large noise nimizing this summation, high generalization perfor- sensitivity. Parametric power spectrum estimation me- mance may be obtained. Besides, the number of free pa- thods such as AR, reduces the spectral loss problems and rameters in SVM does not explicitly depend upon the in- put dimensionality of the problem at hand. Another im- Sandro Chagas, Marcio Eisencraft, Clodoaldo Ap. M. Lima, Escola de portant feature of the support vector learning approach is Engenharia, Universidade Presbiteriana Mackenzie, São Paulo, Brasil, that the underlying optimization problems are inherently E-mails: [email protected], [email protected], convex and have no local minima, which comes as the re- [email protected]. XXVI SIMPÓSIO BRASILEIRO DE TELECOMUNICAÇÕES - SBrT’08, 02-05 DE SETEMBRO DE 2008, RIO DE JANEIRO, RJ sult of applying Mercer's conditions on the characteriza- physik/eegdata.html [2]. The complete data set consists of tion of kernels [13]. five sets (denoted A-E) each containing 100 single- Although the SVM classification provides successful channel EEG segments. These segments were selected results, a number of significant and practical disadvantag- and cut out from continuous multi-channel EEG record- es are identified as follows [15][16]: ings after visual inspection for artifacts, e.g., due to mus- • Although SVMs are relatively sparse, the number of cle activity or eye movements. Sets A and B consisted of support vector (SVs) typically grows linearly with the segments taken from surface EEG recordings that were size of the training set and therefore, SVMs make carried out on five healthy volunteers using a standardized unnecessarily liberal of basic functions. electrode placement scheme (Figure 1). • Predictions are not probabilistic, and therefore, SVM is not suitable for classification tasks in which post- erior probabilities of class membership are necessary. • In SVM, it is required to estimate the error/margin trade tradeoff parameter C, which generally entails a cross-validation procedure which can be a waste of data as well as computation. • In SVM, the kernel function must satisfy Mercer’ condition, hence, it must be a continuous symmetric kernel of a positive integral operator. The Relevance Vector Machine (RVM) has been intro- duced by [15][16] as a Bayesian treatment alternative to the SVM that does not suffer from the aforementioned li- Figure 1: The 10-20 international system of electrode mitations. The RVM is a statistical learning method and it placement c images of normal and abnormal cases. is a probabilistic sparse kernel model identical in func- tional form to the SVM. It represents a new approach to Volunteers were relaxed in an awake-state with eyes pattern classification that has recently attracted a great open (A) and eyes closed (B), respectively. Sets C, D, and deal of interest in the machine learning community. RVM E originated from EEG archive of pre-surgical diagnosis. can be seen as a new way to train polynomial, neural net- EEGs from five patients were selected, all of whom had work, or Radial Basic Functions classifiers. It operates on achieved complete seizure control after resection of one the induction principle of structural risk minimization, of the hippocampal formations, which was therefore cor- which minimizes an upper bound on the generalization er- rectly diagnosed to be the epileptogenic zone. Segments ror. Because in this approach no probability density is es- in set D were recorded from within the epileptogenic timated, it becomes highly insensitive to the curse of di- zone, and those in set C from the hippocampal formation mensionality. In many problems RVM classifiers have of the opposite hemisphere of the brain. While sets C and been shown to perform much better than other non-linear D contained only activity measured during seizure free in- classifiers such as artificial neural networks. However, tervals, set E only contained seizure activity. Here seg- their application to EEG signals classification problem ments were selected from all recording sites exhibiting ic- has been very limited. tal activity. All EEG signals were recorded with the same In this paper, we propose the use of RVM for EEG sig- 128-channel amplifier system, using an average common nals classification. Furthermore, a comparative study in reference. The data were digitized at 173.61 samples per terms of performance and complexity (number of relev- second using 12 bit resolution. Bandpass filter settings ance vectors versus number support vectors) is realized. were 0.53-40