Neurolinguistic and Machine-Learning Perspectives on Direct Speech Bcis for Restoration of Naturalistic Communication
Total Page:16
File Type:pdf, Size:1020Kb
Brain-Computer Interfaces ISSN: 2326-263X (Print) 2326-2621 (Online) Journal homepage: http://www.tandfonline.com/loi/tbci20 Neurolinguistic and machine-learning perspectives on direct speech BCIs for restoration of naturalistic communication Olga Iljina, Johanna Derix, Robin Tibor Schirrmeister, Andreas Schulze- Bonhage, Peter Auer, Ad Aertsen & Tonio Ball To cite this article: Olga Iljina, Johanna Derix, Robin Tibor Schirrmeister, Andreas Schulze- Bonhage, Peter Auer, Ad Aertsen & Tonio Ball (2017): Neurolinguistic and machine-learning perspectives on direct speech BCIs for restoration of naturalistic communication, Brain-Computer Interfaces, DOI: 10.1080/2326263X.2017.1330611 To link to this article: http://dx.doi.org/10.1080/2326263X.2017.1330611 © 2017 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group Published online: 21 Jun 2017. Submit your article to this journal View related articles View Crossmark data Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=tbci20 Download by: [93.230.60.226] Date: 22 June 2017, At: 15:41 BRAINCOMPUTER INTERFACES, 2017 https://doi.org/10.1080/2326263X.2017.1330611 OPEN ACCESS Neurolinguistic and machine-learning perspectives on direct speech BCIs for restoration of naturalistic communication Olga Iljinaa,b,c,e,g,h, Johanna Derixe,h, Robin Tibor Schirrmeistere,h, Andreas Schulze-Bonhaged,e, Peter Auera,b,c,f, Ad Aertseng,i and Tonio Balle,h aGRK 1624 ‘Frequency effects in language’, University of Freiburg, Freiburg, Germany; bDepartment of German Linguistics, University of Freiburg, Freiburg, Germany; cHermann Paul School of Linguistics, University of Freiburg, Germany; dEpilepsy Center, Department of Neurosurgery, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; eBrainLinks-BrainTools, University of Freiburg, Freiburg, Germany; fFreiburg Institute for Advanced Studies (FRIAS), University of Freiburg, Freiburg, Germany; gNeurobiology and Biophysics, Faculty of Biology, University of Freiburg, Freiburg, Germany; hTranslational Neurotechnology Lab, Department of Neurosurgery, University Medical Center Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany; iBernstein Center Freiburg, University of Freiburg, Germany ABSTRACT ARTICLE HISTORY The ultimate goal of brain-computer-interface (BCI) research on speech restoration is to develop Received 2 August 2016 devices which will be able to reconstruct spontaneous, naturally spoken language from the underlying Accepted 11 May 2017 neuronal signals. From this it follows that thorough understanding of brain activity and its functional KEYWORDS dynamics during real-world speech will be required. Here, we review current developments in BCI; ECoG; speech intracranial neurolinguistic and BCI research on speech production under increasingly naturalistic production; neurolinguistics; conditions. With an example of neurolinguistic data from our ongoing research, we illustrate the real-world communication plausibility of neurolinguistic investigations in non-experimental, out-of-the-lab conditions of speech production. We argue that interdisciplinary endeavors at the interface of neuroscience and linguistics can provide valuable insight into the functional signifcance of speech-related neuronal data. Finally, we anticipate that work with neurolinguistic corpora composed of real-world language samples and simultaneous neuronal recordings, together with machine-learning methodology accounting for the specifcs of the neurolinguistic material, will improve the functionality of speech BCIs. 1. Introduction Speech reconstruction from brain activity is not only an exciting but also a challenging endeavor, and difer- Establishing approaches to enable communication ent recording methods, both hemodynamic and electro- in severely paralyzed patients is a major aim of brain- physiological, have been used to attempt it. Each of them computer-interface (BCI) research [1]. One increas- ingly pursued method is to infer the message the patient has its own drawbacks and advantages. Popular hemo- desires to convey from the underlying neuronal activity in dynamic methods are functional magnetic resonance speech-related brain areas and to externalize the message imaging (fMRI) and functional near-infrared spectros- using an assistive technical device. A speech BCI based copy (fNIRS). Tey detect brain activity by non-invasive on this method can be referred to as ‘direct’, as it relies measurements of changes in the level of cerebral oxy- on a one-to-one correspondence between the neuronal genation. Tese are correlated with changes in neuronal activity and the behavioral output [2]. For instance, if activation [5] although this correlation can be loose [6] a paralyzed patient wants to say, ‘I’d like to have a cup and dependent on the investigated anatomical area [7]. of strong black cofee, please’, a direct speech BCI, also fMRI has the advantage of whole-head coverage, and referred to as a ‘brain-to-text’ system [3,4], will try to it allows studying the entire network involved in task- reconstruct this utterance as precisely as possible from related processing. In comparison with fMRI, fNIRS can the underlying neuronal signal and convert it into, for only measure hemodynamic responses from the outer example, a text message or into voiced synthetic speech cortical layers, but it is associated with the advantages via a text-to-speech device. of being silent, portable, and cheaper to acquire [8,9]. CONTACT Olga Iljina [email protected] © 2017 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License (http://creativecommons.org/licenses/by-nc- nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way. 2 O. ILJINA ET AL. Recent BCI studies have shown that fNIRS can be used Accordingly, ECoG is available for research only at large to decode intended simple communication via ‘yes’ or neurosurgical centers, and the placement and the number ‘no’ responses in paralyzed subjects [10,11], suggesting of electrodes can difer between subjects due to their indi- that fNIRS can aid restoration of communicative abili- vidual clinical needs. Also, a researcher may ofen have ties in complete paralysis [12]. Hemodynamic methods to wait for a suitable patient, in whom the main areas of of measurement are non-invasive, and they possess good interest are sufciently covered by electrodes and whose spatial resolution. Nevertheless, their main demerit for clinical picture allows for studying the areas of interest speech-related research is that the temporal resolution (e.g. one may want to make sure the seizure onset zone of the recorded signals only allows detecting changes on in patients with epilepsy lies outside the respective areas) the temporal scale of seconds, corresponding to the slow [18,20,21]. Important advantages of using this method speed of metabolic processes inherent to these methods. for speech-related research are that neuronal activity can To be able to capture dynamic speech-related changes be recorded directly from the cortical surface, and that in the time range of milliseconds, electrophysiological recordings, which are usually obtained over an extended recordings [3] or combinations of electrophysiologi- period of one to three weeks of pre-neurosurgical diag- cal and hemodynamic methods [10] may be helpful. nostics, can be used to study naturalistic, self-initiated Electroencephalography (EEG) and magnetoencepha- human behavior [21,22] and real-world communication lography (MEG) are popular non-invasive methods to [18,20,23]. study speech-related neuronal processing. Tese methods Another invasive method used in previous direct can record electrophysiological signals from the cortical speech BCI research is intracortical microelectrodes, surface, and they possess very high temporal resolution which can record extracellular potentials from small multi- and spatial resolution comparable with fNIRS and fMRI, unit populations. Like ECoG, this recording method pos- respectively. One demerit of these methods, however, sesses high signal quality and robustness against ocular is that they can only record bone- and scalp-fltered and other movement-related artifacts, and it has been signals and therefore ofer inferior signal-to-noise ratios used in long-term clinical trials due to the comparatively compared with intracranial electrophysiology [13]. An small area of implantation and hence a reduced risk of important methodological drawback of both hemody- infection or mechanical damage to the neuronal tissue namic and non-invasive electrophysiological methods [24,25]. A major limitation of this approach, however, is for studies on expressive language is that they are prone that it allows recording from small cortical regions, which to distortion of the brain signal by myographic activ- may be a risk factor for long-term stability of the recorded ity. It accompanies expressive speech, during covert neuronal signal. production as well [14], and may thus obscure biologi- In the following, we describe the latest developments cally relevant neuronal responses. Due to this drawback, in the feld of ECoG research