

Article
Affective Computing on Machine Learning-Based Emotion Recognition Using a Self-Made EEG Device

Ngoc-Dau Mai 1, Boon-Giin Lee 2 and Wan-Young Chung 1,*

1 Department of Artificial Intelligence Convergence, Pukyong National University, Busan 48513, Korea; [email protected]
2 School of Computer Science, The University of Nottingham Ningbo China, Ningbo 315100, China; [email protected]
* Correspondence: [email protected]

Abstract: In this study, we develop an affective computing method based on machine learning for emotion recognition using a wireless protocol and a wearable electroencephalogram (EEG) custom-designed device. The system collects EEG signals using an eight-electrode placement on the scalp; two of these electrodes were placed on the frontal lobe, and the other six electrodes were placed on the temporal lobe. We performed experiments on eight subjects while they watched emotive videos. Six entropy measures were employed for extracting suitable features from the EEG signals. Next, we evaluated our proposed models using three popular classifiers: a support vector machine (SVM), multi-layer perceptron (MLP), and one-dimensional convolutional neural network (1D-CNN) for emotion classification; both subject-dependent and subject-independent strategies were used. Our experimental results showed that the highest average accuracies achieved in the subject-dependent and subject-independent cases were 85.81% and 78.52%, respectively; these accuracies were achieved using a combination of the sample entropy measure and the 1D-CNN. Moreover, our study identifies the T8 position (above the right ear) in the temporal lobe as the most critical channel among the proposed measurement positions for emotion classification through electrode selection. Our results prove the feasibility and efficiency of our proposed EEG-based affective computing method for emotion recognition in real-world applications.

Keywords: electroencephalogram (EEG); affective computing; emotion recognition; entropy measures; support vector machine (SVM); multi-layer perceptron (MLP); one-dimensional convolutional neural network (1D-CNN)

1. Introduction
Emotions play an essential role not only in communication, but also in human

behavior; emotions also influence decision making [1]. Researchers have paid considerable attention to studying emotions in a vast number of interdisciplinary fields [2], such as affective neuroscience, psychology, and the philosophy of emotions. However, research on identifying emotional states automatically and accurately remains largely unexplored. Affective computing, also known as emotion AI [3], applies modern human–computer interaction technology to human emotions [4]. Emotion AI has been proposed and developed in various applications, especially marketing, healthcare, driver assistance, and entertainment [5]. In the field of marketing, emotion AI would include measuring customer reactions to products and services to optimize strategies, satisfy customers, and enhance productivity and profits [6]. An excellent example of the application of affective computing in healthcare is the early detection of a negative emotional state while reading social media news [7]. The extensive development of the Internet has led to the appearance of social media sites that provide users with convenient and quick access to information sources. However, young people are particularly vulnerable to inappropriate content that can lead to emotional disorders with prolonged exposure. Therefore, the development


of emotion recognition systems is essential for preventing undesirable exposure to toxic online content. Some other applications of emotion recognition discussed in other studies are as follows. N. Murali Krishna et al. [8] described an approach based on the generalized mixture distribution model to identify the emotions of mentally impaired persons. Justas Salkevicius et al. [9] developed a virtual reality exposure therapy (VRET) system for handling anxiety disorders using physiological signals like blood volume pressure (BVP), galvanic skin response (GSR), and skin temperature. However, a common point of these studies is that they all employ publicly available datasets or expensive, bulky commercial devices for data acquisition, which makes them hard to adapt to specific applications. Our study proposes a compact, lightweight, self-made device that can be worn for a long time, whose data acquisition capability was verified in our experiments.
Studies have shown that multiple factors can be used to recognize emotions, such as facial expressions, heart rate, body temperature, and electromyography [10–14]. Electroencephalography (EEG) signals have drawn considerable attention because of their high efficiency and reliability [15]. In addition, EEG is a fast and non-invasive method [16] that provides reasonably precise responses to internal or external stimuli, especially emotional stimuli [17]. Most previous investigations employed commercial equipment with a high cost and large size; in this research, we investigated and developed a low-cost, compact, wireless, and wearable device to collect EEG signals. However, electromagnetic interference is common in EEG data collection [18]. Therefore, signal preprocessing is vital to enhance system performance. Numerous studies have verified the success of entropy measures because of their sensitivity to variations in physiological signal properties [19]. Furthermore, researchers have applied the entropy concept to quantify the amount of disorder or randomness in a pattern [20], which is exceptionally suitable for information-rich signals such as EEG. In this study, six varieties of entropy measures were computed in both the time and frequency domains for feature extraction: permutation entropy (PEE), singular value decomposition entropy (SVE), approximate entropy (APE), sample entropy (SAE), spectral entropy (SPE), and continuous wavelet transform entropy (CWE).
Several previous studies have proved that not all electrodes employed for collecting EEG signals contain valuable information for emotion recognition [21]. Consequently, electrode selection approaches were proposed to drop the EEG electrodes that contain unnecessary content [22]; this helps decrease the computational complexity and boost the system speed. One popular selection technique is to combine the analysis of variance (ANOVA) [23] and Tukey's honestly significant difference (HSD) test [24]; we followed this approach in our study. Through electrode selection, we identified T8 as the preferred measurement position for EEG acquisition that holds valuable information for emotion recognition. In the last part of our system, we implemented the classification models that take on the role of automatically and precisely identifying the emotional states.
In this study, we evaluated the performance of our proposed system using three popular classifiers: support vector machine (SVM), multi-layer perceptron (MLP), and one-dimensional convolutional neural network (1D-CNN). Moreover, we enhanced the processing speed by applying and comparing three dimensionality reduction techniques: principal component analysis (PCA), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP). In addition to the benefits of price, portability, and ease of use, we also examined our wireless and wearable EEG device for feasibility and reliability through eyes-closed and eyes-opened testing. Moreover, with the results obtained in emotion recognition, our system proved its suitability for various purposes, such as entertainment, e-learning, and e-healthcare. In summary, the significant contributions of this study are as follows:

• We proposed a novel end-to-end affective computing method based on machine learning for emotion recognition utilizing a wireless and wearable custom-designed EEG device with a low-cost and compact design.
• The combination of sample entropy feature extraction and 1D-CNN was introduced and achieved the best performance among the proposed entropy measures and machine learning models for two kinds of EEG emotion recognition experiments, including the subject-dependent and subject-independent cases.
• We also found that T8 in the temporal lobe was adopted most frequently by the electrode selection method, implying that this position holds more valuable information than other locations for emotion recognition.
The remainder of this paper is organized as follows: Section 2 describes the materials and methods. Next, the experimental results and the outcome of this study are presented in Section 3. Finally, Section 4 presents the conclusions of the paper.

2. Materials and Methods
2.1. Eight-Channel EEG Recording
In this study, we developed a complete EEG-based affective computing method for emotion recognition. The system consists of two main parts: an EEG headset and the software. The main function of the headset is to collect EEG signals at a sampling rate of 256 Hz from the 8-channel electrode cap. The headset then transfers the signals over the Bluetooth wireless protocol to the software set up on a computer, where all the data are managed and processed. The proposed EEG device is shown in Figure 1. We used "dry" active sensors with pre-amplification circuits, produced by Cognionics, Inc., San Diego, CA, USA [25], to overcome the high electrode–skin interfacial impedances (in the range of 100–2000 kOhm on unprepared skin) that arise when collecting EEG signals from the brain through the hair. The ADS1299, an 8-channel, 24-bit analog-to-digital converter specialized for EEG applications [26], was used to digitize the obtained signals. Before the digitized data were sent to a computer via the wireless protocol using the HC-05 Bluetooth module, the data were conveyed to a Teensy 3.2 board used as a microcontroller. The EEG data were gathered and displayed continuously by self-programmed, pre-installed software on our computer. Previous studies have explained how the frontal and temporal lobes are critical brain areas for research on emotions [27–29]. Therefore, in this study we collected EEG signals from the two frontal electrodes (AF3 and AF4) and the six temporal electrodes (FT7, FT8, T7, T8, TP7, and TP8), placed on the scalp according to the international 10–20 system [30] (see Figure 2).

Figure 1. (a) Block diagram of the proposed EEG device; (b) top view of the EEG device; (c) bottom view of the EEG device; (d) dry sensors and active circuits; (e) complete 8-channel EEG device.
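As an illustration of the host-side acquisition loop, the sketch below reads the Bluetooth serial stream from the headset. The port name, baud rate, and comma-separated frame format are assumptions for illustration only; the paper does not document the HC-05 link settings, and `process` is a hypothetical placeholder for the downstream filtering and display.

```python
import serial  # pyserial

# Assumed link settings; the paper does not specify them.
PORT, BAUD = "/dev/rfcomm0", 115200
N_CHANNELS = 8  # AF3, AF4, FT7, FT8, T7, T8, TP7, TP8

def acquire(process):
    """Read one EEG sample per line from the HC-05 serial stream."""
    with serial.Serial(PORT, BAUD, timeout=1) as link:
        while True:
            line = link.readline().decode(errors="ignore").strip()
            if not line:
                continue
            values = line.split(",")  # assumed CSV frame: one value per channel
            if len(values) == N_CHANNELS:
                process([float(v) for v in values])
```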


FigureFigureFigure 2.2. 2. TwoTwo Two frontalfrontal frontal electrodeselectrodes electrodes (AF3(AF3 (AF3 andand and AF4)AF4) AF4) andand and sixsix six temporaltemporal temporal electrodeselectrodes electrodes (FT7,(FT7, (FT7, FT8,FT8, FT8, T7,T7, T7, T8,T8, T8, TP7,TP7, TP7, andandand TP8)TP8) TP8) selectedselected selected inin in thethe the study.study. study.

There are multiple ways to evaluate the feasibility of a device for EEG acquisition, such as using the attention/meditation level, eye blinking, or eyes-closed and eyes-opened testing [31]. This study assessed our device through eyes-closed and eyes-opened resting-state signals (the outcomes are shown in Figure 3). The neural alpha wave is identified by a dominant peak at approximately 10 Hz in the occipital cortex while the eyes are closed. The O1–O2 positions are used to collect EEG signals from the occipital cortex. Welch's periodogram is adopted to compute the power spectral density (PSD) with a 1-s (256 data points) window and a 50% overlap.
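A minimal sketch of this validation check, using SciPy's implementation of Welch's periodogram (the specific code is ours, not the authors'): it computes the PSD of an occipital signal with a 1-s window and 50% overlap and locates the alpha peak expected near 10 Hz during eye closure.

```python
import numpy as np
from scipy.signal import welch

FS = 256  # headset sampling rate (Hz)

def alpha_peak(eeg_o1_o2):
    """PSD via Welch's periodogram: 1-s (256-sample) window, 50% overlap."""
    freqs, psd = welch(eeg_o1_o2, fs=FS, nperseg=FS, noverlap=FS // 2)
    band = (freqs >= 8) & (freqs <= 12)            # alpha band
    return freqs[band][np.argmax(psd[band])], psd  # peak frequency, full PSD
```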

Figure 3. (a) Experimental setup for EEG device testing; (b) power spectral densities (PSDs) during eyes-closed and eyes-opened resting state from O1–O2 locations.

2.2. Stimulus and Protocol
It is important to design reliable stimuli for emotion elicitation. Different stimulants have been employed in emotion investigations in recent years, such as images, music, movies, and games. Databases such as the International Affective Picture System [32], the Database for Emotion Analysis Using Physiological Signals [33], the Geneva Affective Picture Database [34], the Nencki Affective Picture System [35], and EmoMadrid [36] have been used. Studies have underlined the practicability and effectiveness of images in discovering emotional states [37–39]. In our experiments, we selected pictures from the EmoMadrid dataset to induce emotions in the participants. This dataset contains over 700 photos across a wide variety of genres; the dataset contents have normative ratings for three categories of emotions: negative, neutral, and positive [36]. We carefully chose


images from this dataset to create 30 videos, and the same number of images was shown to each participant for each of the three emotional states.
A total of eight persons (all males) in the age group of 25 to 40 years participated in this study. They were all students from Pukyong National University, Busan, Korea. Some previous studies revealed little or almost no significant gender-based difference in emotion classification on EEG signals [7,40,41]. Hence, in this study, we do not concentrate on judging the impact of gender on EEG signals in emotion recognition. All the participants were both physically and mentally healthy, without any eyesight problems. They were instructed not to use any stimulants, such as alcohol or drugs, for a day before the experiment to avoid any adverse effects on the test accuracy. Note that the experiments on human subjects were performed with the approval of the Institutional Review Board at Pukyong National University for biomedical research. Each participant read a clear description of the research and agreed to sign a consent form. The participants were comfortably seated in an upright position in front of the computer screen, and they wore the EEG headsets. In each experiment, a participant watched 30 short videos. In each video, 10 s were given for the hint, 50 s for viewing pictures, and 10 s for rest. Figure 4 shows the detailed protocol.

Figure 4. Protocol of the EEG experiment.

2.3. Feature Extraction
The obtained EEG signals were cleaned with a band-pass filter at 1–50 Hz to reject all unwanted noise. For each trial of the signal collection process, the EEG signals were segmented with a sliding window of 0.5 s (128 data points) having an overlap rate of 50%; this segmentation ensured that no valuable information was lost, because of the temporal correlation of the data points. Next, we applied the six variants of entropy measures to extract useful features from the preprocessed EEG data for classifying the emotional states. Entropy is a measure of uncertainty. The level of disturbance in biomedical signals, including EEG, can be estimated by employing the system's entropy: a higher entropy implies greater uncertainty and a more chaotic system. Entropy is given as follows:

$$\mathrm{Entropy} = \int_{\min(x)}^{\max(x)} P_x \log\!\left(\frac{1}{P_x}\right) dx, \tag{1}$$

where $P_x$ is the probability density function of the signal $x(n)$.
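The preprocessing described above can be sketched as follows (a minimal version, assuming a fourth-order Butterworth band-pass; the paper does not state the filter family or order):

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # sampling rate (Hz)

def preprocess(raw, low=1.0, high=50.0, order=4):
    """Band-pass filter to 1-50 Hz, then cut into 0.5-s segments
    (128 samples) with a 50% overlap (64-sample hop)."""
    b, a = butter(order, [low, high], btype="bandpass", fs=FS)
    clean = filtfilt(b, a, raw)  # zero-phase filtering
    win, hop = FS // 2, FS // 4
    return np.array([clean[i:i + win]
                     for i in range(0, len(clean) - win + 1, hop)])
```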


2.3.1. Permutation Entropy (PEE)
PEE [42] is a robust nonlinear statistical tool for time-series data, which determines the order relationships between the values of a signal to measure the quantitative complexity of a dynamic system; the tool then extracts the probability distribution of a set of ordinal patterns. Notable properties of the method are that it is non-parametric, robust, flexible, and computationally efficient.

2.3.2. Singular Value Decomposition Entropy (SVE)
SVE [43] indicates the number of eigenvectors required for a sufficient description of the time-series data. SVE is a powerful tool for measuring signal uncertainty, and it has vast and vital applications in signal processing.

2.3.3. Approximate Entropy (APE)
APE [44] is a technique used to quantify the degree of regularity and the unpredictability of fluctuations across time-series signals. It was first introduced as a variant of the Kolmogorov–Sinai entropy by Steve M. Pincus in 1991 to investigate medical signals, such as heart rate and EEG. It has since been widely implemented in multiple diverse fields, including climate research.

2.3.4. Sample Entropy (SAE)
SAE [45] is an improvement over APE that eliminates APE's existing limitations. Richman and Moorman proposed it for assessing the complexity of physiological time-series signals; it is independent of the data length and achieves a relatively trouble-free implementation.
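Since SAE turns out to be the best-performing feature in this study, a self-contained sketch of the standard Richman–Moorman algorithm is shown below (the parameter choices m = 2 and r = 0.2·std are common defaults, not values reported by the paper):

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of a 1-D signal (Richman and Moorman, 2000)."""
    x = np.asarray(x, dtype=float)
    tol = r * np.std(x)

    def n_matches(length):
        # Count template pairs whose Chebyshev distance is within tol.
        tpl = np.lib.stride_tricks.sliding_window_view(x, length)
        count = 0
        for i in range(len(tpl) - 1):
            dist = np.max(np.abs(tpl[i + 1:] - tpl[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b, a = n_matches(m), n_matches(m + 1)  # m- and (m+1)-length matches
    return -np.log(a / b) if a > 0 and b > 0 else np.inf
```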

2.3.5. Spectral Entropy (SPE)
SPE [46] measures signal irregularity in terms of the Shannon entropy. Most physiological signals, including EEG signals, are nonlinear; therefore, SPE, as a nonlinear method, is well suited to analyzing neural signals. It employs spectral analysis methods, such as the Fourier transform and the Welch periodogram, to transform signals from the time domain to the frequency domain. The SPE method then applies the Shannon entropy concept to the distribution of spectral power.
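A compact sketch of SPE under these definitions (our own illustrative code): the Welch PSD is normalized into a probability distribution, and the Shannon entropy of that distribution is computed; normalizing by the log of the number of bins, so the result lies in [0, 1], is a common convention that the paper does not state explicitly.

```python
import numpy as np
from scipy.signal import welch

def spectral_entropy(x, fs=256):
    """Shannon entropy of the normalized Welch power spectrum."""
    _, psd = welch(x, fs=fs, nperseg=min(len(x), fs))
    p = psd / psd.sum()  # spectral power as a probability distribution
    p = p[p > 0]         # avoid log(0)
    return -np.sum(p * np.log2(p)) / np.log2(len(p))  # normalized to [0, 1]
```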

2.3.6. Continuous Wavelet Transform Entropy (CWE)
The wavelet transform (WT) [47] is considered an accurate approach for time-frequency analysis that overcomes the constraints of the short-time Fourier transform in window size selection; it decomposes the signal in both the time and frequency domains at multiple resolutions by applying a modulated window moved along the signal at various scales. In this study, we adopted a continuous WT using the Morlet wavelet for EEG signal analysis and obtained the probability distributions of the individual wavelet energies relative to the total wavelet energy. These distributions are then used to characterize the Shannon entropy.
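Under these definitions, CWE can be sketched with PyWavelets' continuous transform and a Morlet wavelet (the scale range and the base-2 logarithm are our illustrative choices, not reported settings):

```python
import numpy as np
import pywt

def cwt_entropy(x, fs=256, scales=np.arange(1, 64)):
    """Shannon entropy of the relative wavelet energies from a Morlet CWT."""
    coeffs, _ = pywt.cwt(x, scales, "morl", sampling_period=1.0 / fs)
    energy = np.sum(np.abs(coeffs) ** 2, axis=1)  # energy per scale
    p = energy / energy.sum()                     # relative wavelet energies
    p = p[p > 0]
    return -np.sum(p * np.log2(p))
```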

2.4. Electrode Selection
EEG signals were collected from multiple electrodes attached to various measurement positions on the scalp. However, not all the measurement positions provided critical and necessary information; in fact, the value of the collected data depended on the purpose of the particular study. Various studies have focused on this issue and determined optimal electrode locations for emotion recognition. Sofien Gannouni et al. [48] described a zero-time windowing-based electrode selection method to identify emotional states. We used ANOVA to select electrode locations with crucial content for emotion classification. ANOVA, developed by Sir Ronald A. Fisher (1925) [23], is a statistical technique to determine whether there are any statistically significant differences between the mean values of three or more independent groups. However, one drawback of ANOVA is that it cannot detect the specific difference between each pair of groups [49]. To obtain details about the differences between any two groups, we append Tukey's HSD test [24], a post hoc test for identifying group means that are significantly distinct from each other. This study employs ANOVA and Tukey's HSD test as an electrode selection method for each feature type and each measurement location to determine

whether there is a significant difference in emotion classification between each pair of emotional states (Positive–Negative, Negative–Neutral, and Neutral–Positive). Then, a decision is made to select or reject the measurement position. This method helps lessen the computational complexity of the models, which boosts the system's speed.
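A sketch of this per-electrode decision rule, using SciPy's one-way ANOVA and statsmodels' Tukey HSD (the array shapes and the helper name are our assumptions, not the authors' code):

```python
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def adopt_electrode(neg, neu, pos, alpha=0.05):
    """Adopt an electrode only if every pair of emotional states differs
    significantly (ANOVA followed by Tukey's HSD, p <= alpha)."""
    _, p = f_oneway(neg, neu, pos)  # omnibus test across the three groups
    if p > alpha:
        return False
    values = np.concatenate([neg, neu, pos])
    labels = ["neg"] * len(neg) + ["neu"] * len(neu) + ["pos"] * len(pos)
    result = pairwise_tukeyhsd(values, labels, alpha=alpha)
    return bool(result.reject.all())  # True only if all three pairs differ
```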

2.5. Dimensionality Reduction
Data with high dimensionality introduces computational complexity into the classification process and can yield unexpectedly low results. For real-world applications, dimensionality reduction aims to convert data from a high-dimensional space into a low-dimensional space to increase the classifier's speed and stability. A further advantage of dimensionality reduction is that it makes data more apparent in a two- or three-dimensional space; such visualization becomes a challenge when a large number of dimensions are involved. In this study, we compare three popular approaches: PCA, t-SNE, and UMAP.
PCA [50] is a statistical technique that extracts the information content of an extensive original dataset into uncorrelated variables called principal components. PCA applies a linear transformation to arrange the components with the highest possible variance generated from the input data. PCA reduces the dataset's dimension and identifies artifactual components in the EEG signals so that unwanted signals can be eliminated. Additionally, t-SNE [51] is a nonlinear technique for dimensionality reduction; it is a helpful tool for the visualization of high-dimensional datasets. Developed in 2008 by Laurens van der Maaten and Geoffrey Hinton, t-SNE calculates the similarities between pairs of instances in the high-dimensional and low-dimensional spaces. PCA is employed for maximizing the variance with large pairwise distances, whereas t-SNE only considers small pairwise distances or local similarities. UMAP [52] is an algorithm based on manifold learning techniques; it depends on ideas from topological data analysis for data reduction. In application and research, UMAP often outperforms t-SNE because of its faster processing speed. Furthermore, UMAP works well for data-dimensionality reduction as a machine learning preprocessing step, whereas t-SNE is mainly suitable for data visualization.
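For reference, the three reducers can be compared in a few lines (a sketch using scikit-learn and the umap-learn package; hyperparameters are library defaults, not values from the paper):

```python
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import umap  # umap-learn package

def reduce_all(X, n_components=2):
    """Project the feature matrix X to n_components dimensions
    with each of the three compared methods."""
    return {
        "pca":  PCA(n_components=n_components).fit_transform(X),
        "tsne": TSNE(n_components=n_components).fit_transform(X),
        "umap": umap.UMAP(n_components=n_components).fit_transform(X),
    }
```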

2.6. Classification
2.6.1. Support Vector Machine (SVM)
SVM is a simple classifier [53] that finds an optimal hyperplane between the data of two classes such that the distance from the closest data points to that hyperplane is greatest. The nearest points are called the support vectors, whereas the distance to be maximized is called the margin. The SVM is a binary classifier that can be extended into a multi-class classifier by using the one-vs-rest approach, which determines a hyperplane that separates one class from all the other classes at once. This study employs the radial basis function, the kernel most commonly used with SVM.
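A minimal scikit-learn sketch of this classifier, explicitly wrapping the binary RBF SVM in a one-vs-rest scheme as the text describes (regularization settings are library defaults, not reported values; `X_train`, `y_train`, and `X_test` are placeholders):

```python
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# One-vs-rest multi-class SVM with an RBF kernel.
clf = OneVsRestClassifier(SVC(kernel="rbf"))
clf.fit(X_train, y_train)   # entropy features and their 3-class labels
y_pred = clf.predict(X_test)
```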

2.6.2. Multi-Layer Perceptron (MLP)
MLP is a feed-forward artificial neural network (ANN) model [54] known as a "vanilla" neural network when it possesses only one hidden layer. We propose an MLP model with three layers: one input layer, one hidden layer, and one output layer (see Figure 5a). The size of the input layer matches that of each extracted feature. The hidden layer consists of 128 neurons followed by the rectified linear activation unit (ReLU), which executes a ReLU nonlinear function. Finally, the SoftMax function, a normalized exponential function, is used to normalize the network's output into a probability distribution over the predicted output classes.


Figure 5. (a) The proposed MLP model; (b) the proposed 1D-CNN model.
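The paper does not name its deep learning framework; as one concrete reading of Figure 5 and the layer sizes given in Sections 2.6.2 and 2.6.3, a Keras sketch of the two models might look as follows (the input dimension and the convolution kernel size are assumptions):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

N_FEATURES = 8  # assumed input dimension (e.g., one entropy value per channel)
N_CLASSES = 3   # negative, neutral, positive

mlp = models.Sequential([
    layers.Input(shape=(N_FEATURES,)),
    layers.Dense(128, activation="relu"),      # single hidden layer (Figure 5a)
    layers.Dense(N_CLASSES, activation="softmax"),
])

cnn = models.Sequential([
    layers.Input(shape=(N_FEATURES, 1)),
    layers.Conv1D(256, kernel_size=3, activation="relu"),  # kernel size assumed
    layers.Flatten(),
    layers.Dense(128, activation="relu"),      # two hidden layers (Figure 5b)
    layers.Dense(128, activation="relu"),
    layers.Dense(N_CLASSES, activation="softmax"),
])

# Training settings from Section 2.6.3: Adam, learning rate 0.001,
# categorical cross-entropy loss.
for model in (mlp, cnn):
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="categorical_crossentropy", metrics=["accuracy"])
```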

2.6.3. 1-D Convolutional Neural Networks (1D-CNN)
The convolutional neural network (CNN) [55] is a powerful ANN technique applied in various fields ranging from image classification to bio-signal processing. CNN models are widely used when working with images, where the input of the model is two- or three-dimensional data representing the pixels and channels of the pictures. 1D-CNN, a variant of CNN, is often applied to one-dimensional data sequences, making it suitable for biomedical signals such as EEG, electrocardiogram, and heart rate. We propose a 1D-CNN model with a 1D-convolution (1D-CONV) layer with 256 neurons and two hidden layers with 128 neurons each (see Figure 5b). The 1D-CONV filter is considered the most critical part of the 1D-CNN because it uses kernels to convolve the input data. In both MLP and 1D-CNN, we set the supervised learning rate to 0.001, the optimizer to Adam, and the loss function to the categorical cross-entropy.

3. Results
Figure 6 shows the workflow diagram of our proposed system. The methods and algorithms used in this study are described concisely in each block.

3.1. EEG Features
First, we applied the band-pass filter (1–50 Hz) to the raw EEG signals. Then, before using the six entropy measures for feature extraction, we split the EEG signals into segments of 0.5 s with an overlapping rate of 50%. The entropy measures used in this study include PEE, SVE, APE, SAE, SPE, and CWE. The values assessed after feature extraction are normalized from 0 to 1 to facilitate computation in the following steps.
Figure 7 presents the mean and standard deviation values of the different entropy measures for the three emotional states. From Figure 7, we can see that the effect of each feature type is distinct in both the magnitude and the distribution of values. PEE and CWE have the highest mean values, and their standard deviations are the smallest. In contrast, SAE has the smallest mean, and its standard deviation is the highest. Nevertheless, relying only on the mean and standard deviation values, it is difficult to determine which of these six features are suitable for emotion recognition.
In this study, each participant watched a total of 30 videos for eliciting their emotions, 10 videos for each type of emotional state corresponding to 3 labels (Negative, Neutral, and Positive). Each video lasts 50 s, and data segmentation is done with a 0.5 s window and 50% overlap. Therefore, in the case of 30 videos per participant, we have 30 (videos) × 8 (channels) × [50 s (each video) × 2 (0.5 s window) × 2 (50% overlap) − 1] = 47,760 segments utilized for feature extraction. After feature extraction, 8 (channels) × 5970 (features) are used as the dataset for building the models. We apply 5-fold cross-validation for training, validation, and evaluation of the models in the subject-dependent case. Additionally, in the subject-independent case, we employ leave-one-subject-out cross-validation, which

handles one subject's sample data as a test set for evaluating the model after it has been trained and validated on the data samples of the remaining subjects.
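Both evaluation schemes map directly onto scikit-learn splitters; a sketch follows (the splitter choice is ours, matching the description rather than quoting the authors' code; `evaluate`, `X`, `y`, and `subjects` are placeholders):

```python
from sklearn.model_selection import KFold, LeaveOneGroupOut

# Subject-dependent: 5-fold cross-validation within one subject's data.
for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    evaluate(X[train_idx], y[train_idx], X[test_idx], y[test_idx])

# Subject-independent: leave-one-subject-out cross-validation,
# where `subjects` holds the subject ID of every sample.
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    evaluate(X[train_idx], y[train_idx], X[test_idx], y[test_idx])
```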

Figure 6. Workflow diagram of the proposed system.

Figure 7. Normalized values with the mean and standard deviation for the six features of the three emotions.

3.2. EEG Electrode Selection and Rejection
Not all the EEG signals obtained from the electrode positions on the head provide the information necessary for emotion classification. Electrode selection not only eliminates irrelevant noise, but also speeds up the computation of the classifiers. Some previous studies have focused on identifying the optimal electrode placement for emotion recognition. In this study, we used ANOVA and Tukey's HSD test for electrode selection. Figure 8 describes the distribution of the data points of the three emotional states over each electrode. It is not complicated to determine which electrodes provide helpful information about the significant differences between the three emotional states. Table 1 lists the p-values of the various pairs of emotional states on all eight channels and shows which positions were chosen or removed. For an electrode to be selected, no pair of emotional states may have a p-value greater than 0.05. For example, for electrode 1 (placed on the AF3 location), the p-values of the negative–neutral and neutral–positive pairs are equal to 0.001 (p < 0.05). However, the p-value of the negative–positive pair was 0.8130 (p > 0.05), which implied that there was no significant difference between these two groups; therefore, we removed the AF3 location.

Figure 8. Normalized values with the mean and standard deviation of three emotional states across each electrode using SAE on Subject 2.

Table 1. EEG electrode selection for SAE over Subject 2. The p-values are from ANOVA and Tukey's HSD test; the selection threshold is p ≤ 0.05.

| Electrode | neg-neu | neg-pos | neu-pos | Decision |
|---|---|---|---|---|
| 1 | 0.001 | 0.8130 | 0.001 | reject |
| 2 | 0.001 | 0.001 | 0.001 | adopt |
| 3 | 0.5900 | 0.0365 | 0.2916 | reject |
| 4 | 0.001 | 0.001 | 0.001 | adopt |
| 5 | 0.1053 | 0.001 | 0.001 | reject |
| 6 | 0.001 | 0.001 | 0.001 | adopt |
| 7 | 0.0029 | 0.001 | 0.8367 | reject |
| 8 | 0.0225 | 0.001 | 0.001 | adopt |

Note that: 1-AF3, 2-AF4, 3-FT7, 4-FT8, 5-T7, 6-T8, 7-TP7, 8-TP8.

Figure 9 shows notable variations in the mean and standard deviation values of the data before and after removing the unnecessary electrodes. The average values of the negative and neutral groups increase simultaneously; however, the average value of the positive data group decreases. We performed electrode selection on all six features across the eight subjects (see Table 2). The outcomes show that the selected electrodes depended on the types of features and varied from subject to subject. Figure 10 shows the rate of selection of the electrodes using a topographical plot. The dark positions show the areas where electrodes are preferred more often. Clearly, T8 in the temporal lobe has been chosen most frequently, which implies that this position holds more helpful information than other positions for classifying the emotional states.


Figure 9. Normalized values with the mean and standard deviation values across the three emotional states before and after electrode selection using SAE on Subject 2.

Table 2. EEG electrode selection for six entropy measures across eight subjects.

| Subject | PEE | SVE | APE | SAE | SPE | CWE |
|---|---|---|---|---|---|---|
| 1 | 1, 3, 4, 5, 6, 7, 8 | 6, 7 | 2, 4, 5, 6, 8 | 4, 5, 6, 7, 8 | 1, 2, 3, 6, 7 | 1, 2, 6, 7 |
| 2 | 2, 5 | 1, 3, 4, 5, 8 | 2, 4, 5, 8 | 2, 4, 6, 8 | 2, 4, 5, 6, 8 | 1, 2, 4, 5, 6 |
| 3 | 4, 8 | All electrodes | 3, 4, 6 | 1, 2, 3, 4, 6, 7, 8 | 1, 2, 3, 4, 7, 8 | 2, 4, 6, 7 |
| 4 | 6, 7 | 1, 6 | 4, 6 | 1, 7 | 1, 3, 6 | 1, 2, 6 |
| 5 | 1, 3, 4, 5, 6, 7, 8 | 2, 4, 6 | 2, 4, 6, 7 | 2, 3, 4, 6, 7 | 1, 2, 5 | 6, 8 |
| 6 | 5, 6, 7 | 1, 2, 4, 5, 6, 7, 8 | 4, 6 | 1, 2, 4, 5, 6, 7, 8 | 1, 2, 5, 6, 7, 8 | 1, 3, 6, 8 |
| 7 | 4, 8 | 1, 2, 4, 5, 6, 7, 8 | 7, 8 | 3, 4, 7, 8 | 7, 8 | 1, 4, 6 |
| 8 | 2, 5, 6 | 1, 3, 7 | 1, 7 | 1, 2, 7, 8 | 1, 7 | 1, 2, 5 |

Note that: 1-AF3, 2-AF4, 3-FT7, 4-FT8, 5-T7, 6-T8, 7-TP7, 8-TP8.

Figure 10. Topographical scalp map of the selected EEG electrodes.




3.3. Classification Results
The features extracted after electrode selection were accepted as the input to the SVM classifier. Figure 11b shows that the accuracy obtained by using SAE from Subject 2 is 87.79%; it is described in detail with the confusion matrix. Accuracy, sensitivity, and specificity are calculated from the confusion matrix as follows:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \tag{2}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \tag{3}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}, \tag{4}$$

where TP, TN, FP, and FN represent the true positives, true negatives, false positives, and false negatives, respectively.
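Equations (2)–(4) extend to the three-class case by treating each class one-vs-rest; a sketch of computing them from a confusion matrix (our illustrative code, not the authors'):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity per class (one-vs-rest)."""
    cm = confusion_matrix(y_true, y_pred)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    tn = cm.sum() - (tp + fp + fn)
    accuracy = (tp + tn) / cm.sum()  # Equation (2)
    sensitivity = tp / (tp + fn)     # Equation (3)
    specificity = tn / (tn + fp)     # Equation (4)
    return accuracy, sensitivity, specificity
```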

Figure 11. The results for the SAE feature on Subject 2: (a) ROC curves for the SVM classifier; (b) normalized confusion matrix of the SVM classifier; (c) loss curve of the proposed 1D-CNN; (d) normalized confusion matrix of the proposed 1D-CNN.

To visualize the performance on the multi-class classification problem, we also used the area under the curve (AUC) of the receiver operating characteristic curve [56], a graphical plot of the true positive rate against the false positive rate at various threshold settings. As the AUC increases, the model has a greater capacity to distinguish between classes. In this case, the AUCs with SVM for the three categories negative, neutral, and positive are 0.93, 0.90, and 0.92, respectively, as shown in Figure 11a. Table 3 lists the classification results using SVM for all six features; the highest average accuracy, using SAE across the eight subjects, is 79.36%.


Table 3. Comparison of classification accuracy (%) among SVM, MLP, and 1D-CNN with six features across eight subjects.

Accuracies (%) for the PEE, SVE, and APE features:

| Subject | PEE SVM | PEE MLP | PEE 1D-CNN | SVE SVM | SVE MLP | SVE 1D-CNN | APE SVM | APE MLP | APE 1D-CNN |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 83.33 | 70.22 | 79.64 | 73.14 | 74.14 | 81.97 | 69.17 | 78.33 | 75.88 |
| 2 | 85.93 | 76.54 | 83.46 | 81.87 | 70.28 | 77.78 | 74.92 | 72.59 | 75.39 |
| 3 | 69.45 | 67.92 | 69.44 | 87.52 | 89.72 | 90.35 | 65.89 | 65.83 | 66.58 |
| 4 | 68.57 | 70.04 | 70.65 | 72.99 | 71.03 | 82.22 | 67.67 | 68.22 | 68.56 |
| 5 | 75.92 | 75.65 | 80.98 | 81.58 | 78.17 | 84.05 | 74.27 | 75.56 | 76.25 |
| 6 | 70.54 | 71.39 | 71.27 | 81.25 | 79.72 | 84.62 | 66.95 | 67.84 | 68.01 |
| 7 | 71.75 | 72.45 | 71.55 | 79.49 | 79.04 | 83.70 | 63.47 | 65.48 | 69.16 |
| 8 | 68.98 | 71.13 | 70.42 | 72.67 | 79.80 | 84.44 | 68.50 | 69.57 | 68.09 |
| Mean | 74.31 | 71.92 | 74.68 | 78.81 | 77.74 | 83.64 | 68.85 | 70.43 | 70.99 |
| Std | 6.73 | 2.71 | 5.30 | 5.04 | 5.78 | 3.28 | 3.69 | 4.35 | 3.82 |

Accuracies (%) for the SAE, SPE, and CWE features:

| Subject | SAE SVM | SAE MLP | SAE 1D-CNN | SPE SVM | SPE MLP | SPE 1D-CNN | CWE SVM | CWE MLP | CWE 1D-CNN |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 70.89 | 80.95 | 86.75 | 78.38 | 75.45 | 86.12 | 68.72 | 71.63 | 72.18 |
| 2 | 87.79 | 92.90 | 93.89 | 87.50 | 84.72 | 84.90 | 73.29 | 72.61 | 72.37 |
| 3 | 83.70 | 81.28 | 86.26 | 79.97 | 84.98 | 80.56 | 66.92 | 65.95 | 66.88 |
| 4 | 73.61 | 81.83 | 83.06 | 79.82 | 82.57 | 81.50 | 64.74 | 65.52 | 66.76 |
| 5 | 84.06 | 80.81 | 84.75 | 73.10 | 77.17 | 76.26 | 75.23 | 75.63 | 76.55 |
| 6 | 79.33 | 79.76 | 85.94 | 72.59 | 74.44 | 77.59 | 68.56 | 67.93 | 68.25 |
| 7 | 78.19 | 75.65 | 82.17 | 74.57 | 77.22 | 76.99 | 65.62 | 66.84 | 67.87 |
| 8 | 77.32 | 78.97 | 83.68 | 75.93 | 80.05 | 80.78 | 69.54 | 70.97 | 72.91 |
| Mean | 79.36 | 81.52 | 85.81 | 77.73 | 79.58 | 80.59 | 69.08 | 69.64 | 70.47 |
| Std | 5.27 | 4.67 | 3.41 | 4.56 | 3.87 | 3.37 | 3.39 | 3.38 | 3.31 |

Note: SVM was used on features with electrode selection, whereas MLP and 1D-CNN were used on features without electrode selection.

This study also employs two additional state-of-the-art classifiers, MLP and 1D-CNN, for EEG-based emotion recognition. To evaluate the performances of the MLP and 1D-CNN models, we used the extracted features without electrode selection as the input data, keeping all the information from all electrodes. As discussed earlier, electrode selection is applied to retain relevant information and reject unnecessary information for emotion recognition; thus, it makes model computation less complicated and faster, which suits traditional machine learning models like SVM. However, Table 1 suggests the rejected electrodes deserve closer consideration. Take electrode 1 as an example; it is dropped because there is no notable difference between the negative–positive pair (p = 0.8130 > 0.05); however, the two remaining pairs still meet the condition p < 0.05. Hence, if these electrodes are removed, there is a high probability that critical information for classification will disappear, reducing the system's accuracy. Moreover, neural networks are well known for their fast computation and their ability to extract valuable features from the input data. Therefore, in the case of MLP and 1D-CNN, we take all the obtained features as input to these models to ensure that no vital information is lost, which increases the system's accuracy. The proposed models were trained and evaluated using 5-fold cross-validation. We split the dataset into five

subsets, of which one subset was used as the test set; the other four subsets were adopted for training the model. This process was iterated over each subset, and the final result was the average of the performance accuracy over the subsets. Table 3 shows that the highest average results using the three proposed models SVM, MLP, and 1D-CNN over the eight subjects are 79.36%, 81.52%, and 85.81%, respectively; all the models used SAE as the feature input. Figure 11c shows the loss curve and confusion matrix of 1D-CNN's performance using SAE on Subject 2; this model achieves the highest accuracy of 93.89%.

3.4. Subject-Dependent and Subject-Independent
To make our system more practical for real-world applications, we trained the proposed models across all subjects with the data used for the subject-independent strategy. For the subject-dependent case, we developed a new model consisting of the training and testing data from each subject's dataset; high accuracy was achieved by this method, but it could not be generalized. For the subject-independent strategy, we used one subject's data sample for testing the proposed models, and the datasets of the remaining subjects were used for training. Figure 12 shows a comparison of the results obtained for the subject-dependent and subject-independent cases using SVM, MLP, and 1D-CNN with SAE. As mentioned, the average accuracies achieved in the subject-independent case using SVM, MLP, and 1D-CNN with SAE were 72.65%, 75.14%, and 78.52%, respectively; these values are lower than the average accuracies obtained for the subject-dependent case (79.36%, 81.52%, and 85.81%, respectively). Even with this comparatively high level of accuracy, the subject-independent models demonstrated that our proposed system for EEG-based emotion recognition is practically feasible.

Figure 12. Comparison of the classification accuracies of the three classifiers for three emotions with both subject-dependent and subject-independent cases for eight subjects using SAE.
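The subject-independent split described above (train on seven subjects, test on the held-out one, rotate) corresponds to scikit-learn's leave-one-group-out scheme. A sketch under assumed data shapes, with a hypothetical per-sample subject-ID array:

```python
# Subject-independent evaluation: each fold holds out one subject entirely.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(800, 8))
y = rng.integers(0, 3, size=800)
subjects = np.repeat(np.arange(8), 100)   # 8 subjects, 100 samples each

logo = LeaveOneGroupOut()
scores = cross_val_score(SVC(), X, y, groups=subjects, cv=logo)
print(f"per-held-out-subject accuracies: {scores}")
```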

3.5. Dimensionality Reduction and Data Visualization

Speed is one of the most critical factors for assessing whether or not a system is feasible in real-world applications. Dimensionality reduction helps reduce the computational complexity for classifiers and thereby improves their stability. In addition, dimensionality reduction is a helpful tool for data visualization, which becomes challenging with multi-dimensional data. In this study, a comparative analysis is made of three approaches, PCA, t-SNE, and UMAP, for reducing the number of EEG data dimensions, and we evaluated the processing speed of each method. Figure 13a shows the runtime values of the three approaches when applied to the resampled dataset using SAE on Subject 2. Dimensionality reduction was performed three times for each subject using multiple dataset sizes to ensure that the evaluation was objective. PCA is always the fastest option across all dataset sizes. On small datasets, t-SNE is more robust than UMAP; however, as the dataset size increases, UMAP outperforms t-SNE. Figure 13b–d show the results obtained by applying the three methods. Based on the data visualization, the data distribution produced by UMAP is clearly the most useful of the three. However, to precisely evaluate their performances in classifying EEG-based emotion states, we applied these approaches to the datasets of the six types of features on eight subjects with SVM as a simple classifier. Figure 14 shows the obtained results. Although PCA is the fastest, it yields the lowest classification accuracy. UMAP demonstrated its strength through high speed and high accuracy on all six types of features across eight subjects, achieving the highest average accuracy of 76.78% with SAE. Therefore, UMAP is suitable for real-world emotion recognition applications.

Figure 13. (a) Runtime of three dimensionality reduction techniques (PCA, t-SNE, and UMAP). (b) Data visualization after using PCA with n_components = 2, svd_solver = 'auto'. (c) Data visualization after using t-SNE with n_components = 2, metric = 'Euclidean'. (d) Data visualization after using UMAP with n_components = 2, metric = 'Euclidean'. Data were obtained by using SAE on Subject 2.

Figure 14. Comparison of classification accuracy of PCA, t-SNE, and UMAP with SVM using all features across eight subjects.
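A sketch of the runtime-and-accuracy comparison behind Figures 13 and 14, using the reducer settings listed in the Figure 13 caption. This assumes the third-party umap-learn package; the data arrays are placeholders, not the study dataset, and classifying on a full-data embedding is a simplification for illustration:

```python
# Compare PCA, t-SNE, and UMAP on runtime and downstream SVM accuracy.
import time
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from umap import UMAP  # pip install umap-learn

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = rng.integers(0, 3, size=1000)

reducers = {
    "PCA":   PCA(n_components=2, svd_solver="auto"),
    "t-SNE": TSNE(n_components=2, metric="euclidean"),
    "UMAP":  UMAP(n_components=2, metric="euclidean"),
}
for name, reducer in reducers.items():
    t0 = time.perf_counter()
    X2 = reducer.fit_transform(X)             # project to 2 dimensions
    runtime = time.perf_counter() - t0
    acc = cross_val_score(SVC(), X2, y, cv=5).mean()
    print(f"{name}: runtime {runtime:.2f}s, SVM accuracy {acc:.4f}")
```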


4. Conclusions

This study proposed an EEG-based affective computing method for emotion recognition with three categories of emotions (negative, neutral, and positive). We collected 8-channel EEG signals using our wireless and wearable custom-designed device. A total of 30 trials were conducted on eight participants while they watched emotive videos. Six entropy measures were employed for feature extraction, and three standard classifiers (SVM, MLP, and 1D-CNN) were implemented for emotion classification. After feature extraction, we classified the emotion states using electrode selection to retain the important information and remove the unnecessary information. The results proved that T8 in the temporal lobe contained much valuable content for emotion recognition. We completed the classification process with subject-dependent and subject-independent strategies using six features and three classifiers across eight subjects. The accuracies of our proposed system for detecting emotion states in segments as short as 0.5 s with the subject-dependent and subject-independent strategies were 85.81% and 78.52%, respectively, using a combination of SAE and 1D-CNN.

To meet the requirements of high-speed processing and stable accuracy in practical applications, we compared three approaches (PCA, t-SNE, and UMAP) for dimensionality reduction and applied SVM as a classifier. The results demonstrated our proposed system's feasibility in real-life applications; the highest accuracy of 76.78% was obtained with UMAP.

Our proposed system also has some limitations that should be addressed in future work. First, although the number of experiments (30 trials) per subject is sufficient to create a complete dataset, the number of participants is still limited; moreover, our research has not yet examined the impact of gender and age on the changes in EEG signals during emotion recognition. Hence, we will conduct experiments on more subjects and assess aspects such as age and gender. Another issue is that, in the signal preprocessing, we currently apply a simple bandpass filter to eliminate unwanted signals outside the frequency range of 1 to 50 Hz. However, such filtering cannot adequately handle the different types of participant-related artifacts, such as electromyography (EMG) caused by muscle movements. Therefore, in a following study, we will implement more advanced and efficient signal processing techniques, such as blind source separation (BSS) or independent component analysis (ICA), to reduce eye movement, blink, muscle, heart, and line noise. Furthermore, our future work will concentrate on building and deploying robust machine learning models on our embedded device and connecting them to IoT healthcare platforms, making our system straightforward and convenient to use in real-world applications.
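The 1-50 Hz bandpass preprocessing mentioned above can be sketched with a zero-phase Butterworth filter from SciPy; the sampling rate and filter order below are illustrative assumptions, not the device's published values:

```python
# Minimal bandpass preprocessing sketch: keep the 1-50 Hz band, zero-phase.
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass(eeg, fs=250.0, low=1.0, high=50.0, order=4):
    """eeg: (n_channels, n_samples) raw signal; returns the filtered signal."""
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, eeg, axis=-1)   # filtfilt avoids phase distortion

raw = np.random.default_rng(0).normal(size=(8, 2500))  # 8 channels, 10 s at 250 Hz
clean = bandpass(raw)
```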

Author Contributions: Conceptualization, W.-Y.C.; methodology, N.-D.M., B.-G.L.; data curation, N.-D.M., B.-G.L.; supervision, W.-Y.C. All authors have read and agreed to the published version of the manuscript.

Funding: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2019R1A2C1089139).

Institutional Review Board Statement: The study was approved by the Institutional Review Board of Pukyong National University (permission number 1041386-202003-HR-11-02).

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to their containing information that could compromise the privacy of research participants.

Conflicts of Interest: The authors declare no conflict of interest.
