
DESIGNING HIGHER PERFORMANCE NEURAL PROSTHETIC SYSTEMS

A DISSERTATION SUBMITTED TO THE DEPARTMENT OF ELECTRICAL ENGINEERING AND THE COMMITTEE ON GRADUATE STUDIES OF STANFORD UNIVERSITY IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

Gopal Santhanam December 2006

Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. UMI Number: 3242614


UMI Microform 3242614 Copyright 2007 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code.

ProQuest Information and Learning Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, MI 48106-1346

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

(Krishna V. Shenoy) Principal Adviser

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

(Balaji Prabhakar)

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

(Teresa H. Meng)

Approved for the University Committee on Graduate Studies.

Abstract

Many individuals suffer from movement disorders, ranging from neurological deficits in the central nervous system to limb amputations. In extreme cases, higher-level cognitive function can remain intact but the motor output system is blocked and cannot function (e.g., ALS, brain-stem stroke, or spinal cord injuries). It has been proposed that “neural prostheses” can be interfaced with the human brain to read out motor intentions and create control signals to effectively bypass the patient’s pathology. Recent studies have demonstrated that monkeys and humans can use signals from the brain to guide computer cursors. These brain-computer interfaces (BCIs) may someday assist patients, but relatively low system performance remains a major roadblock. In fact, the speed and accuracy with which keys can be selected using BCIs are far lower than for systems relying on simple eye movements.

This dissertation will first describe the design and demonstration, using electrode arrays implanted in monkey dorsal pre-motor cortex, of a manyfold higher performance BCI than previously reported. Our >4 times increase in system performance indicates that a fast and accurate key selection system, capable of operating with a range of keyboard sizes, is indeed possible (up to 6.5 bits/s or ~15 words per minute). Next, an algorithm will be introduced that further increases performance over standard neural decoding models by incorporating correlation structure between recorded neural signals. We find that by using a probabilistic framework, we can appreciably reduce the error rate of our prosthetic system.

Lastly, as such prosthetic systems transition to the clinical setting, there will be a need to more fully characterize electrode array sensor stability, a feature essential for mobile humans. For this purpose, I will describe the design and preliminary data from an embedded system for recording neural data from freely behaving monkeys and our preliminary characterizations of long-duration, continuously-recorded data from a standard, implanted electrode array. Taken together, these results should substantially increase the clinical viability of BCIs in humans as well as provide opportunities for the further study of neural prostheses. Finally, there will be a discussion of collaborative work in the context of my primary research.

Preface

Early in my graduate student career, I was at a crossroads. I had nearly completed my Master’s degree and had to decide whether I would continue on in pursuit of a doctorate. I investigated the vast array of opportunities available in the Department of Electrical Engineering at Stanford and evaluated the research groups that were aligned with my personal background and interests. Nothing really grabbed my attention until I heard about a new research group being established by a young Assistant Professor, Krishna Shenoy. Although I had come from a traditional electrical engineering background and had no experience in neuroscience or bioengineering, I was hooked after my first meeting with Krishna. The prospect of performing scientific experiments to better understand the brain and engineering prosthetic systems for neurologically impaired patients was immediately exciting, motivating, and intellectually challenging.

The field is certainly accessible to people who are uninitiated or non-technical. In this vein, I would like this dissertation to be easily digestible by the reasonably intelligent reader. However, that is not to say that the research field as a whole is light on detail or rigor. There are a great many opportunities for solid, intellectual contributions, including (but not limited to) developing good experimental paradigms, creating efficient experimental apparatus, conducting thorough analyses, and deriving new computational approaches. Above all, I have felt that a methodical approach coupled with a strong attention to detail can lead to very illuminating results. It is my hope that this particular viewpoint will be expressed through the course of this dissertation.

Another quality that attracted me to this research area was the opportunity to engineer and build tangible systems. Too often, academic research can be theoretical without a practical bent, at least in the near term.
While I had had an initial desire to work on theoretical problems, I found that I subconsciously gravitated toward projects that dealt with the development of research-quality infrastructure and the execution of laboratory experiments. This will be a common thread that connects the various projects in this thesis.

Finally, it is important to note that all of the presented work was a team effort. Though I took the lead for the projects described in the main body of this dissertation, I had vital assistance from other members of the Shenoy laboratory, without whom I would not have been able to carry this research forward. Such collaboration is inevitable for these types of projects, especially when they span multiple years and personnel. Likewise, during my tenure in the laboratory, I supported other research conducted by my colleagues. In recognition of that collaboration, I have included an appendix that briefly describes a few of these projects and their impact on my area of research as a whole.

The following outline is a road map for this dissertation.

• Chapter 1 provides a broad introduction to neural prosthetic systems, or brain-computer interfaces, and their utility for helping patients with motor disabilities. I describe the general schematic of a brain-computer interface and point to the pressing need to improve such systems before they are clinically viable. I also introduce a categorization for prosthetic systems that are targeted toward individuals with motor disabilities. This categorization should help the reader understand the various advantages and disadvantages of the types of systems that one might consider developing. An overview of the past literature is also given to provide adequate perspective for the later chapters of this dissertation.

• In Chapter 2, I report some of the infrastructure that was built and used in our brain-computer interface. Specifically, we constructed a more robust system to perform “spike sorting” (the task of discriminating between different neurons on the tip of a recording electrode) while a laboratory experiment is in progress. This task of separating neurons as distinct sources is an important signal processing problem; different neurons provide different views on what the brain is doing and mixing them together will degrade our ability to accurately extract that information. The improvement of real-time “spike sorting” was one aspect in our quest for greater overall performance of neural prosthetic systems.

• Chapter 3 describes the brain-computer interface that we built in the laboratory using a non-human primate animal model. This is the cornerstone of my thesis and demonstrates an actual system, directly analogous to the types of systems that will be used for paralyzed patients. The system is intended as a communication device (e.g., a keyboard interface for typing out emails). It is able to achieve a more than four-fold increase in performance over the current state-of-the-art. We performed careful controls to ensure that our results were not confounded by certain neurophysiological factors. We also varied different parameters to understand the impact of keyboard layout, task timing, and numbers of recorded neurons on the overall performance of the system.

• Chapter 4 delves into the algorithmic decoding component of the brain-computer interface. A neural prosthesis targeted for individuals with motor dysfunction must be able to read signals from the brain, decode their meaning, and produce an output. The algorithm we used for the system characterized in Chapter 3 was relatively simple. The purpose of the work presented in this chapter was to explore how much improvement might be gained by trying a more complicated decoding algorithm. Here, we used more sophisticated mathematical techniques to better model the responses of neurons and take into account their interrelationships. The new approach resulted in a substantial increase in performance in certain situations.

• In Chapter 5, I switch gears and introduce a system that we designed and built in the laboratory to conduct portable neural recordings. As prosthetic systems transition to wider clinical use, there will be a greater need to test laboratory systems in settings that are similar to the usage mode of actual patients. Importantly, the patient population will not only include completely paralyzed individuals; it may eventually encompass paraplegics and amputees who are still ambulatory. To investigate this scenario, we built a device, dubbed HermesB, that can record neural data continuously from a rhesus monkey, even while the monkey is freely behaving in its home cage. We were able to investigate two interesting scientific questions with these long-duration, nearly continuous datasets. First, we examined the stability of neural recordings in freely behaving animals; stability refers to either the gross presence of neural signals recorded from a chronic electrode or the general shape of the voltage waveform emitted by any neuron. Second, we compared the neural recordings between active periods (when the animal is moving) versus inactive periods (when the animal is stationary or asleep). This type of characterization will be invaluable when trying to design high performance systems that expand past a very limited group of disabled patients.

• The second-to-last chapter will provide some brief concluding statements.

• The final chapter contains a list of my journal and conference contributions.

• An appendix, as previously mentioned, will present a brief overview of secondary projects with which I have been involved. These projects have important implications for the thrust of my research to increase the overall performance of neural prosthetic systems.

- When Krishna’s laboratory was in its infancy, Mark Churchland began a project investigating motor planning in the context of fast and slow movements. The premise was simple, namely to better understand how neural responses differed between when a subject was planning a fast-paced reach versus a medium-paced reach to the same target location. The results of this experiment were powerful and the work has spawned many future studies. I was very fortunate to have had the opportunity to modestly contribute to this work (e.g., with infrastructure development and some amount of monkey training) and also gain valuable experience working with Mark.

- In an effort to better understand and characterize the responses in pre-motor cortex, Aaron Batista performed experiments that compared the influence of the current eye position against the influence of the upcoming arm movement. The influence of the eye and the arm were surprisingly comparable, especially when considering that this is a brain area in which a sizable fraction of neurons travel down the spinal cord and activate the musculature. In this project, I was involved with some experimental design, infrastructure development, and data analysis. Coupled with the experiments of Mark Churchland, these results have a substantial impact on how we view computation in the motor areas of the brain.

- One of the more exciting recent research thrusts in the laboratory has been our exploration into the mechanistic details of motor planning. There is still very little known about the actual process of how the central nervous system takes an abstract motor goal and produces the necessary neural responses to coordinate all of the muscles in the arm. Initial, groundbreaking work was performed by Mark Churchland, who demonstrated that neural responses in pre-motor cortex are indicative of a recurrent neural network settling to a solution when planning a movement. Byron Yu and Afsheen Afshar have extended this further by attempting to map out, in an abstract space, the planning trajectory, or the path the motor system traverses from having no plan to having a fully-formed plan, all prior to the start of movement. I have been able to assist with both of these projects by helping to collect relevant experimental data and providing scientific feedback.

- Another project started in the laboratory’s early days was the effort to combine activity from different brain areas to create a better neural prosthesis. Different parts of the brain can provide vastly different information about a particular movement. For example, one brain area might signal information before the movement begins (e.g., specifying the endpoint of the upcoming reach). Another brain area may signal information during a movement (e.g., coordinating the flight of the arm). The project to combine these different types of neural activity has been an ongoing one for the past few years. The studies have been headed up by Caleb Kemere and Byron Yu and several publications in the literature speak to the effectiveness of their approaches. My involvement has included performing simulations, collecting neural data, and providing scientific feedback for their work.

Acknowledgments

First and foremost, I would like to thank Krishna Shenoy for his role as my Ph.D. adviser over the past 5+ years. There were many times during my graduate student career where I would stop by Krishna’s office with a problem, and often I felt the problem was rather serious. Krishna would take time out of his busy schedule, listen to my concerns, and quickly develop a plan of action. Within about 30 minutes, I would be leaving his office, no longer worried, and re-energized about the direction of my research. Furthermore, his keen scientific mind, methodical nature, peer-oriented management style, and genuine concern for his advisees have made my professional and personal relationship with him extremely rewarding.

Krishna should also receive praise for gathering together very talented professionals to support his group. Having joined the group at its inception, I was exposed to many administrative issues as we were setting up the laboratory. We were all rather fortunate to have the steady hand of Sandra Eisensee chaperoning us through the Stanford bureaucracy during this time. Additionally, there were demands on our laboratory since we use rhesus monkeys in our research. Here, Krishna was able to recruit two first-rate lab technicians. The first was Missy Howard, who was on staff for over half of my time in the laboratory; she was a friendly caretaker of the animals and the researchers. After her departure, our group was able to continue efficiently and effectively due to the capable oversight of Mackenzie Risch.

Apart from my adviser, I have grown intellectually and scientifically as a result of my interactions with my labmates. At the time I first decided to join the laboratory, the group consisted of only three people: Krishna, Mark Churchland, and me. Little did I know then how much of a positive impact Mark would have on my research and overall scientific outlook. My first exposure to experiments was through his mentorship, and his clear thinking, ability to design great experiments, and sound intuition for new concepts have been invaluable assets throughout my thesis work. Special thanks to Stephen Ryu and Byron Yu for the wonderful collaborative efforts that led to the development of our very own high-performance brain-computer interface. It was very exciting working in the lab with them for many months and I couldn’t have asked for better research partners.

I would also like to acknowledge Caleb Kemere for his unwavering willingness to lend a hand, Aaron Batista for his insights on animal training and knowledge of the current literature, Afsheen Afshar for providing valuable assistance in running experiments for our brain-computer interface project, and Michael Linderman and Vikash Gilja for their tireless efforts in development, data collection, and analysis for HermesB. There are several newer members of the Shenoy laboratory that have also contributed to a very intellectually stimulating environment and for that I am grateful. Furthermore, we have all benefited from the close interactions with other research programs, including those of Maneesh Sahani from the Gatsby Computational Neuroscience Unit, Professor Bill Newsome in Stanford’s Neurobiology department, and Professor Teresa Meng in Stanford’s Electrical Engineering department.

Finally, I would like to thank my family, who has provided me with solid support throughout these past few years, including my father, mother, grandmother, brother, and sister-in-law. I especially appreciate the countless home cooked meals from my mother that provided me with the biofuel to do research, and I owe so much to my brother who has inspired me to be, among other things, intellectually curious and precise.

Contents

Abstract

Preface

Acknowledgments

1 Introduction
  1.1 Overview
  1.2 Motor and Communication Prostheses
  1.3 Plan and Movement Activity
  1.4 Recent Advances
    1.4.1 Motor Prostheses
    1.4.2 Communication Prostheses
  1.5 Approaches to Improving Performance

2 Real-Time Spike-Sorting
  2.1 Overview
  2.2 Methods
    2.2.1 Spike-Sorting System Diagram
    2.2.2 Basic Platform
    2.2.3 RR: Second Generation Classification Infrastructure
    2.2.4 Spike Clustering Algorithm
    2.2.5 Hoop Design for Online Classification
    2.2.6 RRR: Third Generation Classification Infrastructure
    2.2.7 Data Collection and Analysis
  2.3 Results and Discussion
    2.3.1 Clustering and Classification
    2.3.2 Target Location Estimation
  2.4 Feasibility of Implantable Spike-Sorting Circuits
  2.5 Summary
  2.6 Credits

3 A High-Performance Brain-Computer Interface
  3.1 Overview
  3.2 Methods
    3.2.1 Neural recordings
    3.2.2 Decoding Algorithms
    3.2.3 Model Training
  3.3 Control Experiments
    3.3.1 Selection of Skip Time ($T_{\mathrm{skip}}$)
    3.3.2 Selection of Integration Time ($T_{\mathrm{int}}$)
  3.4 BCI Experiments
    3.4.1 Additional BCI Performance Aspects
  3.5 Summary
  3.6 Addendum: EMG measurements
  3.7 Addendum: Application of Information Theory to BCIs
    3.7.1 Analogy to Communication Systems
    3.7.2 Computations
    3.7.3 Notes
  3.8 Credits

4 Factor Analysis Investigation
  4.1 Overview
  4.2 Methods
    4.2.1 Latent Variable Models
    4.2.2 Poisson Output Model
    4.2.3 Extensions to Accommodate Multiple Targets
  4.3 Results and Discussion
    4.3.1 Data Characterization
    4.3.2 Target Decoding
    4.3.3 Datasets with More Shared Variability
  4.4 Summary
  4.5 Credits
  4.6 Appendix: Mathematical Derivations for FAGO
    4.6.1 E Step
    4.6.2 M Step
    4.6.3 Inference

5 HermesB
  5.1 Overview
  5.2 Background
  5.3 Methods
    5.3.1 System Description
    5.3.2 Recordings and Analyses
    5.3.3 Recording Stability Analyses
  5.4 Results
    5.4.1 System Verification
    5.4.2 Recording Stability
    5.4.3 Neural Correlates of Behavioral Contexts
  5.5 Summary
  5.6 Credits

6 Future Directions

7 Publications
  7.1 Journal Articles
  7.2 Conference Talks, Articles, Abstracts
    7.2.1 2006
    7.2.2 2005
    7.2.3 2004
    7.2.4 2003
    7.2.5 2002

A Select Collaborations
  A.1 Speed Tuning in PMd
    A.1.1 Motivation
    A.1.2 Results
    A.1.3 Significance
  A.2 Reference Frames in PMd
    A.2.1 Motivation
    A.2.2 Results
    A.2.3 Significance
  A.3 Mechanisms of Motor Planning
    A.3.1 Motivation
    A.3.2 Results
    A.3.3 Significance
    A.3.4 Beyond NV
  A.4 Mixture of Trajectory Models
    A.4.1 Motivation
    A.4.2 Methods
    A.4.3 Results
    A.4.4 Significance

List of Tables

2.1 Decoding Performance Improvement due to Spike Sorting
2.2 Decoding Performance Improvement when Further Restricting Electrodes

3.1 BCI Experiments with Highest ITRC
3.2 Comparing Methods for Calculating Information Transfer

4.1 Factor Analysis Performance Comparison
4.2 Second Factor Analysis Performance Comparison

5.1 HermesB Parameters

List of Figures

1.1 Overview Chart of Neural Prostheses
1.2 System Diagram of a Neural Prosthetic System
1.3 Examples of Plan and Movement Activity
1.4 Control of Prostheses with Plan and Movement Activity

2.1 Extraction of Neural Signals
2.2 Screenshot of the Cerebus User Interface
2.3 RR Block Diagram
2.4 Block diagram of the Sahani algorithm
2.5 Clustering Results from the Sahani Algorithm
2.6 Threshold and Hoop Design Example
2.7 RR Block Diagram
2.8 Example of Difficult Hoop Sort

3.1 Instructed-delay and BCI Tasks
3.2 Anatomical Placement of Electrode Arrays
3.3 Multivariate Gaussian Data-fitting
3.4 Empirical Spike Count Distributions
3.5 $T_{\mathrm{skip}}$ Analyses
3.6 $T_{\mathrm{skip}}$ Analyses with Multiple Targets
3.7 $T_{\mathrm{int}}$ Effects in Control Experiments
3.8 ITRC for Multi-target Task
3.9 Effect of $T_{\mathrm{int}}$ and Target Configuration in BCI Experiments
3.10 Single-trial Accuracy as a Function of Number of Neural Units and $T_{\mathrm{int}}$
3.11 ITRC as a Function of Number of Neural Units and $T_{\mathrm{int}}$
3.12 EMG Measurements for Monkey G
3.13 Schematic Diagram of a Communication System
3.14 Confusion Matrices from Two Experiments
3.15 Information Transfer as a Function of Accuracy

4.1 Illustration of Spike Count Covariance
4.2 Latent Space Example for FAGO
4.3 Choosing the Number of Latent Dimensions with Test Likelihood
4.4 Intrinsic Fano Factor
4.5 Choosing the Number of Latent Dimensions for FAGOcmb
4.6 FAGOcmb for BCI Experiments

5.1 Array Lifetime Diagram
5.2 HermesB Block Diagram
5.3 HermesB Components
5.4 Sample Protocol for HermesB Execution
5.5 Sample Neural and Accelerometer Data
5.6 Comparison between CKI and HermesB
5.7 Neural Stability over 48 Hours
5.8 Variation in Vpp and RMS
5.9 Variation in Waveform Relative to Acceleration Events
5.10 Variation in Waveform under High Acceleration
5.11 Neural and Accelerometer Data
5.12 LFP Analyses

A.1 Effect of Direction, Distance, and Instructed-Speed on Neural Firing Rate
A.2 Variety of Reference Frames in PMd
A.3 Optimal-Subspace Hypothesis
A.4 NV Time Course
A.5 Inferred Trial-by-Trial Planning Dynamics
A.6 MTM Trajectory Decoding Examples
A.7 MTM compared against RWM and STM

Chapter 1

Introduction

1.1 Overview

Each year, hundreds of thousands of people suffer from neurological injuries and disease, resulting in the permanent loss of motor function. In many cases, the disability is so severe that it is not even possible to feed oneself or communicate with others. Though surgical and medical interventions have made it possible to repair peripheral nerves and promote recovery in many cases, most central nervous system impairments still do not have effective treatments. Medical systems that electronically interface with the nervous system, termed neural prostheses, have started to fill some of these treatment gaps. There have been successes in other classes of disabilities, including cochlear implants for the profoundly deaf and deep brain stimulators to alleviate Parkinsonian tremor. In the relatively near term, epileptic-seizure disruption systems, artificial vision systems, prosthetic arm robotics, and basic communication systems are believed to be possible, while cognitive, memory, and language-oriented systems may be likely in the more distant future. Furthermore, while much of the initial research in these areas has been enabled by basic discoveries in systems neuroscience over the past three decades, applied neuroprosthetic research is also beginning to provide new views of neural representations and processing.

The ultimate goal of any prosthesis is to restore normal functionality. Though complete restoration is ideal, prostheses are clinically viable when the anticipated quality of life improvement outweighs the potential risks. Since neural prostheses must often measure or perturb neurons in the central nervous system, non-invasive techniques are particularly attractive and have been investigated extensively. Invasive electrode-based techniques have become a major research thrust due to their high signal quality; they promise the potential for enabling extremely high performance prostheses despite the increased risk. However, the use of invasive techniques represents a somewhat long-term approach. In the near-term, due to moderate surgical risk and the extensive investment required per patient, clinical applications may be limited to only the most severely disabled patients. However, if these systems can surpass what is possible with non-invasive measurement techniques, and surgical risk and system costs can be sufficiently minimized, it is anticipated that invasive electrode-based prostheses will find more widespread use (e.g., amputees and moderately non-communicative cerebral palsy patients as opposed to just quadriplegics and “locked in” amyotrophic lateral sclerosis (ALS) patients).

The primary objective of the work presented in this dissertation is to increase the performance of neural prostheses so that these systems can tangibly increase the quality of life for patients with motor disabilities. In this chapter, we begin by first introducing “motor” and “communication” prostheses and their goals, which allows us to better define performance. We then review the use of “plan” neural activity, which is beginning to complement and extend the capabilities of systems that have relied on only “movement” activity until recently. Figure 1.1 depicts the two types of prostheses (motor and communication), two types of neural activity (movement and plan), and two ends of the invasiveness spectrum considered here. Then, we return to the topic of recent advances in prosthetic performance, providing a description of the valuable research already conducted in the field and what remains before systems can be viable in a clinical setting. At the end of this chapter, we will provide a brief overview on our approach to improving prosthetic performance, which is the subject of the balance of this dissertation.

[Figure 1.1 diagram. Box labels: Neural prostheses; Peripheral nervous system; Central nervous system; Cortically controlled; Motor prostheses; Communication prostheses; Invasiveness (low to high); Other activity.]

Figure 1.1: Chart illustrating the relationship among the various types of neural prostheses. Prostheses that interface with the peripheral nervous system (e.g., cochlear implants) are extremely important but are not considered further here. Prostheses interfacing with the central nervous system, including the spinal cord and deep brain structures (e.g., deep brain stimulators) are also very important, but are again not considered. Only cortically-controlled systems which attempt to restore motor and communication functions are considered, along with the underlying types of neural activity (movement, plan and other) and methods of measuring this activity (invasive and non-invasive).


1.2 Motor and Communication Prostheses

Motor prostheses aim to provide natural control of the paralyzed limb, via electrical microstimulation, or of an equivalent prosthetic replacement limb. In the case of upper-limb prostheses considered here, natural control includes the precise three-dimensional movement of all arm and hand segments along the desired path and with the desired speed profile. Such control is indeed a daunting ultimate goal, with many steps along the way leading to clinically viable systems. For example, simply being able to feed oneself, even without being able to deftly cut a steak, could still help thousands of quadriplegics.

Communication prostheses do not aim to restore the ability to communicate in the form of natural voice or typing. Instead they aim to provide a fast and accurate communication channel rivaling the natural communication rate with which most people can speak or type. For example, “locked-in” ALS patients are altogether unable to converse with the outside world; many other neuro-degenerative diseases also severely compromise the quality of speech. Being able to reliably type even a few words per minute on a computer would be a meaningful advance for these patients. In fact, many of the most severely disabled patients, who are the likely recipients of first generation systems, would benefit from a prosthesis capable of performing motor and communication functions (Tkach et al. 2005).

Figure 1.2 illustrates the basic operating principle behind motor and communication prostheses. Neural activity from various brain regions is electronically processed to create control signals for enacting the desired movement. Non-invasive or minimally-invasive sensors can collect neural signals representing the average activity of many neurons. When invasive permanently-implanted arrays of electrodes are employed, it is possible to identify individual neurons near the tip of each electrode through a mathematical process termed action potential (spike) sorting.
Spike sorting (discussed further in Chapter 2) uses waveform shape differences to discriminate between cells and compress the information associated with neurons’ outputs into the times at which the cells “spiked” (emitted an action potential). This can be even further compressed by only considering the number of spikes in a predefined time window (e.g., 50-100 ms); this quantity is often referred to as the neural firing rate. After determining how each neuron responds before and during a movement (tuning), typically accomplished by correlating arm movements made during a behavioral task with associated neural activity, estimation (decode) algorithms can be designed to later infer the desired movement from only the ongoing pattern of neural activity. The system can then generate control signals appropriate for moving a robotic arm, or more simply, a cursor on a screen.

Motor prostheses guide prosthetic arms (robotic arms via actuators or paralyzed limbs via microstimulation) or computer cursors continuously through space in order to restore natural functionality. On the other hand, communication prostheses do not attempt to reconstruct trajectory with high fidelity. Instead, these systems control prosthetic devices, such as a computer cursor, by simply selecting among a discrete set of targets, as we do while typing on a


[Figure 1.2 labels: cortical lobes (frontal, parietal, occipital, temporal); neural signals; spike sort; spike times; control signals.]

Figure 1.2: Concept sketch of cortically-controlled motor and communication prostheses (illustrated with intra-cortical recordings). There exist distinct regions in the cortex of a rhesus monkey (as shown) and in homologous areas in humans that participate in the preparation and execution of natural arm movements. Areas include the medial intra-parietal area (MIP) / parietal reach region (PRR) with largely plan activity, the dorsal aspect of pre-motor cortex with both plan and movement activity, and motor cortex with largely movement activity. The signal path and prosthetic operation are similar when non-invasive (e.g., EEG) signals are used.

keyboard. Their goal is to provide a fast and accurate communication channel.

Motor and communication prostheses are quite similar conceptually, but important differences critically influence their design. Motor prostheses must generate movement trajectories and they attempt to reproduce the desired movement as accurately and precisely as possible. Continuous prosthetic guidance is a necessity, and measures of prosthetic performance must quantify the similarity between the prosthetic movement and the desired trajectory. In contrast, communication prostheses are concerned with information throughput from the subject to the world (e.g., the speed and accuracy with which keys on a keyboard can be selected). Although a continuously guided motor prosthesis could be used to convey information by moving to a key, only the key that is eventually struck contributes to information conveyance. Thus, simple discrete prosthesis positioning is sufficient for a communication prosthesis. For example, if it is possible to predict which letter on a keyboard is desired, a computer cursor could be directly positioned on that key as opposed to sliding it out to strike the key. Measures of communication prosthetic performance must quantify the similarity between the prosthetic selections (speed and accuracy on a given task) and the desired selections. This seemingly subtle distinction between motor and communication prostheses has important implications that profoundly influence the type of neural activity to be used and the overall prosthetic architecture.
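Earlier in this section, sorted spike times are compressed into the number of spikes in a predefined time window (e.g., 50-100 ms), a quantity referred to as the firing rate. The following is a minimal sketch of that compression step only. The function names, the 50 ms window, and the spike times are hypothetical values chosen for illustration; they are not taken from the experiments described in this dissertation.

```python
def bin_spike_counts(spike_times_ms, t_start_ms, t_stop_ms, window_ms=50):
    """Count spikes in consecutive, non-overlapping windows.

    spike_times_ms: spike times of one sorted neuron, in milliseconds.
    window_ms: window length (hypothetical default of 50 ms).
    Each window covers [start, start + window_ms); spikes outside
    [t_start_ms, t_stop_ms) are ignored.
    """
    n_windows = int((t_stop_ms - t_start_ms) // window_ms)
    counts = [0] * n_windows
    for t in spike_times_ms:
        idx = int((t - t_start_ms) // window_ms)
        if 0 <= idx < n_windows:
            counts[idx] += 1
    return counts


def firing_rates_hz(counts, window_ms=50):
    """Convert windowed spike counts to firing rates in spikes per second."""
    return [c * 1000.0 / window_ms for c in counts]


# Made-up spike times (ms) for a single neuron over a 200 ms interval.
spikes = [4, 21, 48, 55, 120, 131, 149, 190]
counts = bin_spike_counts(spikes, 0, 200, window_ms=50)
rates = firing_rates_hz(counts, window_ms=50)
```

A decode algorithm would typically operate on vectors of such counts, one entry per neuron, rather than on the raw spike times.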


1.3 Plan and Movement Activity

Two main types of neural activity are well suited for driving prosthetic movements. Plan activity is present before arm movements begin and is believed to reflect preparatory processing required for the fast and accurate generation of movement. This activity is readily observed before movement initiation in a delayed reach task. Delayed reach tasks begin by presenting a visual reach target. After a delay period of several tenths of a second, a “go cue” indicates that a reach may begin. Figure 1.3 illustrates that plan activity in a dorsal pre-motor cortex (PMd) neuron of a behaving rhesus monkey is correlated with, or “tuned” for, the direction of the upcoming movement. Plan activity is present from soon after target onset until just after the go cue is given (several hundred milliseconds in this example). This activity typically rises at the start of, and is held during, the delay period. Plan activity can also be tuned for movement extent (data not shown, Messier and Kalaska 2000). Movement activity is present from just before movement initiation until just before movement completion, correlating with the movement details of the arm. This activity is tuned for both the direction (panel A) and speed (panel B) of arm movement (e.g., Moran and Schwartz 1999a).

[Figure 1.3 labels: target onset; go cue; 400 ms scale bar; plan activity; movement activity.]

Figure 1.3: Plan and movement activity (illustrated with intra-cortical recordings) from a single PMd neuron. a. Spike histograms showing average plan (green) and movement (red) activity associated with center-out reaches to peripheral targets (blue circles). Fifty representative reach trajectories to the upward-right target are shown in gray (mean trajectory in black). b. Top panel, same fifty representative reach trajectories as in panel a shown as a function of time (horizontal component only). Bottom panel, spike times associated with each of these fifty reaches (one row corresponds to one trial, black tick marks indicate spike times, gray bar indicates movement onset) along with the response averaged across these trials. (Figure adapted from Kemere et al. (2004a).)

Until recently, both motor and communication prostheses have focused exclusively on movement activity. As depicted in Fig. 1.4a, movement activity can be elicited merely by “thinking” about moving the arm. Surprisingly, it has been found that this “movement activity” is even present in the absence of movement or electromyographic (EMG) activity (e.g., Wolpaw and McFarland 2004). When using animal models, such as healthy monkeys, neural activity that can


be generated when there are no muscle contractions is currently considered to be an adequate proxy for neural signals from paralyzed subjects (Taylor et al. 2002), although this assumption has yet to be widely tested. Hochberg et al. (2006) have recently demonstrated that algorithms developed with healthy monkeys are directly applicable to paralyzed patients. Movement activity is then decoded to generate instantaneous direction and speed signals, which are used to slide a prosthetic device such as a computer cursor along the specified trajectory (e.g., Taylor et al. 2002; Serruya et al. 2002; Carmena et al. 2003).

Traditionally, motor prostheses would only incorporate movement activity since the goal is to recreate the desired movement path and speed, and plan activity has been thought to reflect only movement endpoint.1 Nevertheless, plan activity can play an important role in trajectory estimation by providing an estimate of where the movement will end. The goal estimate serves as a probabilistic prior, which helps constrain the instantaneous movement estimates based on movement activity and improves overall performance (Kemere et al. 2002, 2004b). Furthermore, if necessary, plan activity alone is sufficient to guide a motor prosthesis. For example, a quick succession of discrete categorizations such as left, left, and up would be sufficient to guide a limb largely leftward and a bit upward. Alternatively, a typical movement trajectory (e.g., straight path with a bell-shaped speed profile) could be followed from start to finish if the endpoint can be determined from plan activity alone (Shenoy et al. 2003; Kemere et al. 2004b; Musallam et al. 2004). Thus, motor prostheses generally rely on movement activity but can also benefit from plan activity.

In contrast, communication prostheses are not obliged to move the prosthetic device along a continuous path in order to strike a target such as a key on a virtual keyboard (see Fig. 1.2).
Instead, if target location can be estimated directly from neural plan activity, the cursor can be positioned immediately on the desired key. Recent reports suggest that there is considerable performance benefit to using plan activity and direct-positional prosthesis control (Shenoy et al. 2003; Musallam et al. 2004; Hatsopoulos et al. 2004). Figure 1.4b illustrates that plan activity can be elicited merely by “intending” to move the arm to a target/key location, and it is well-established that plan activity does not necessarily produce movements or EMG activity (e.g., Weinrich and Wise 1982; Churchland et al. 2006a). Plan activity is then decoded to yield the desired key and the prosthetic cursor immediately appears to signal the selection of this key. Additionally, movement activity can be used to control a sliding cursor that then makes discrete selections; this would be classified as a communication prosthesis (Kennedy and Bakay 1998; Leuthardt et al. 2004; Wolpaw and McFarland 2004). Thus, communication prostheses can either rely upon plan activity alone, just movement activity, or a combination of the two.
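The step in which plan activity is decoded to yield the desired key can be illustrated with a small classifier. This is a minimal sketch and not the decoder of any study cited above: it assumes that each neuron's spike count in the plan window is an independent Poisson variable whose mean depends on the intended target, and that all targets are equally likely. The function names, target labels, and mean counts are hypothetical.

```python
import math


def poisson_log_likelihood(counts, mean_counts):
    """Log P(counts | target), assuming independent Poisson neurons."""
    total = 0.0
    for k, lam in zip(counts, mean_counts):
        # Log of the Poisson pmf: k*log(lam) - lam - log(k!)
        total += k * math.log(lam) - lam - math.lgamma(k + 1)
    return total


def decode_target(counts, tuning):
    """Return the target whose tuning best explains the observed counts.

    tuning: dict mapping target label -> list of mean spike counts
            (one per neuron) in the plan window; means must be > 0.
    Assumes a uniform prior over targets (maximum likelihood).
    """
    return max(tuning, key=lambda t: poisson_log_likelihood(counts, tuning[t]))


# Hypothetical mean plan-period spike counts for three neurons and
# four targets/keys (made-up numbers for illustration only).
tuning = {
    "left":  [12.0, 3.0, 5.0],
    "right": [2.0, 11.0, 5.0],
    "up":    [6.0, 6.0, 10.0],
    "down":  [6.0, 6.0, 1.0],
}
observed = [10, 4, 6]  # spike counts from one plan period
selected = decode_target(observed, tuning)
```

In a direct-positioning communication prosthesis, the label returned by such a classifier would determine where the cursor is placed, with no intermediate trajectory.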

1We have recently shown that plan activity can also reflect the upcoming speed of the movement and this finding will be discussed in Appendix A.


[Figure 1.4 panel labels: a. “Think” about actually moving arm to leftward target (but do not move arm); movement activity; motor prosthesis. b. “Plan” to move arm to leftward target (but do not move arm); plan activity; communication prosthesis.]

Figure 1.4: Two types of neural activity for controlling two types of prostheses (illustrated with intra-cortical recordings). a. Movement activity can guide a cursor (red circle) along the desired path (e.g., straight or curved dashed red lines) and at the desired speed to hit the target. b. Plan activity can be decoded into a desired endpoint location which can be used to directly position a cursor (green circle) on the desired target. Though motor prostheses often rely on movement activity and communication prostheses often rely on plan activity, both types of activity are useful in both types of prostheses (see Fig. 1.1).


1.4 Recent Advances

Having introduced the basic operation and goals of motor and communication prostheses, as well as movement and plan activity upon which these systems depend, we turn to the question of performance. This is of utmost importance since a certain minimum level of performance is needed before a prosthesis will be deemed clinically viable for each particular patient group. Moreover, basic risk-benefit analysis requires a firm understanding of not only the relative risks between non-invasive and invasive alternatives, but also a quantitative understanding of the relative levels of performance. We begin by considering recent advances in motor prostheses, whose performance is inherently difficult to measure as it must compare the produced trajectory with the desired trajectory. Recent advances in communication prostheses are then described. Performance is considerably simpler to measure in this domain, thereby allowing for a more quantitative comparison among systems.

1.4.1 Motor Prostheses

In the late 1960s and 1970s, Olds, Fetz, and others discovered that nonhuman primates could learn to regulate the firing rate of individual cortical neurons (Olds 1965; Fetz 1969; Fetz and Baker 1973). These pioneering experiments relied on straightforward forms of real-time feedback but clearly demonstrated that firing rates could be brought to requested levels without accompanying muscle contraction, even in motor cortex (M1). In the 1970s and early 1980s, Humphrey, Schmidt, and colleagues proposed that neural activity could be used to directly control prostheses (Humphrey et al. 1970; Schmidt 1980). By the late 1990s, technological advances and a considerably better understanding of how cortical neurons contribute to limb movement (e.g., Georgopoulos et al. 1982, 1986; Schwartz 1992, 1993, 1994; Ashe and Georgopoulos 1994) sparked renewed interest in developing clinically viable systems. This would require an ongoing series of experiments with animal models and disabled human patients with the goal of learning fundamental design principles and quantifying performance.

Chapin, Nicolelis, and colleagues investigated one-dimensional (1D) control of a motor prosthesis by training rats to press a lever to receive a liquid reward (Chapin et al. 1999). The apparatus was then altered such that the lever was controlled by movement activity across a population of cortical neurons. This was accomplished by chronically implanting electrodes in cerebral cortex and using various population decode algorithms to convert spike activity into movement signals. They found that rats could still control the lever to receive a liquid reward. Animals soon learned that actual forelimb movement was not needed and stopped movements altogether while continuing to move the lever with brain-derived activity. Although previous studies had demonstrated neural control of prosthetic devices, they were primarily conceived of as communication prostheses.
In contrast, this and other concurrent studies provided important experimental evidence demonstrating the feasibility of motor prostheses.


While this work demonstrated that a lever could swing through an arc, it is difficult to assess the quality of this 1D control. As discussed above, motor prostheses must reproduce the desired trajectory and it is not clear how to ascertain what speed profile along this arc, including stationary hold periods, the rat actually desired. Success was defined as having received a reward, and the success rate was 60-100%.

Meanwhile, investigations of 2D and full 3D control essential for recreating natural arm movements were underway. Often these studies controlled computer cursors on a screen visible to the subject. While cursors are of direct utility in communication prostheses, their role in motor prosthesis research is also well-founded. Since robotic or prosthetic limbs can operate by following a series of desired hand locations, it is sufficient to demonstrate that the desired hand location (cursor) can be continuously controlled. Electronic controllers could then supplement this end-point trajectory by computing the inverse kinematics necessary for determining joint angles and forces for guiding an arm-like prosthesis. Schwartz and colleagues demonstrated that 2D and 3D hand location could be reconstructed with reasonable fidelity from the movement activity of a population of simultaneously recorded M1 neurons in rhesus monkeys (Isaacs et al. 2000), and Nicolelis and colleagues reported similarly encouraging reconstructions using simultaneous recordings from parietal cortex, PMd and M1 (Wessberg et al. 2000). Together with similar recording studies from Donoghue and colleagues (Maynard et al. 1999), the stage was set for the first 2D and 3D motor prosthesis experiments.

Donoghue and colleagues pursued 2D cursor control with rhesus monkeys (Serruya et al. 2002). A few tens of M1 neurons were recorded simultaneously with a chronically-implanted electrode array as monkeys moved a manipulandum to guide the cursor. By recording spike activity while monkeys tracked a continuously, pseudo-randomly moving target, a linear filter could be learned to relate neural activity to cursor movement. This linear filter was then used in a new task, where neural activity guided the prosthetic cursor to hit visual targets appearing at random locations. Linear filters were re-learned once neural control was underway.
Targets could be hit within roughly one second on average, only slightly longer than during manipulandum control. This study clearly demonstrated 2D cursor control.

Schwartz and colleagues pursued 3D cursor control with rhesus monkeys (Taylor et al. 2002). A few tens of M1 neurons were recorded with a chronically-implanted electrode array, and monkeys made 3D reaching movements to visual targets appearing in a 3D virtual reality environment. Neural responses were characterized in terms of the movement direction eliciting maximal response (preferred direction) and were combined to form a modified population vector. This indicates the instantaneous movement of the hand or, in prosthesis mode, the prosthetic cursor. Monkeys then entered “brain-control” mode wherein the 3D prosthetic cursor was controlled by the neural population vector as opposed to the location of the hand. The prosthetic cursor hit the visual target roughly half the time when several seconds were allowed for target acquisition. Intriguingly, cell tuning properties were observed to change when controlling the prosthetic cursor.


By using a novel control algorithm that tracked these changes (“co-adaptive” algorithm), monkeys were able to substantially improve prosthetic performance. The algorithm was able to bootstrap its training process without actual arm movements by assuming that the subject's intentions matched the instructed targets. This would be similar to the situation of paralyzed patients. Targets could be hit on 70-80% of trials, and within 1.5-2.0 seconds. This study clearly demonstrates 3D control and provides tantalizing evidence that adaptation during prosthetic operation may be used to improve performance. This group has since gone on to demonstrate that monkeys can use these 3D control signals to feed themselves with an anthropomorphic robotic arm (Spalding et al. 2005).

Nicolelis and colleagues pursued 2D cursor control, along with a form of prosthetic grasping, with rhesus monkeys (Carmena et al. 2003). Hundreds of M1, PMd, supplementary motor area (SMA), primary sensory area (S1) and posterior parietal neurons were recorded with chronically-implanted electrode arrays while monkeys performed each of three behavioral tasks. Monkeys were trained to move a pole to control the position of an on-screen cursor (task 1), to grip the pole to control the size of the cursor which indicated grip force (task 2), or the combination of tasks 1 and 2 (task 3). During these tasks, multiple linear models were used to estimate a variety of motor parameters including hand position, velocity, and gripping force from neural activity. After several minutes of training, models converged to an optimal performance and their coefficients were fixed. These models were used in “brain control” mode to translate neural activity into cursor movement (tasks 1 & 3) or cursor size (tasks 2 & 3). Animals initially produced arm movements in brain control mode but soon realized that these were not necessary and ceased to produce them for periods of time.
Intriguingly, performance on each of the three tasks improved substantially over a period of days with performance achieving the following statistics: approximately 80% of visual targets were hit in 2 to 3 seconds (task 1), approximately 95% of the requested grip force ranges were achieved in 1.5 to 3 seconds (task 2), and approximately 75% of the combined tasks were successfully completed in 3 to 3.5 seconds (task 3). This study clearly demonstrates that grip force can be controlled, which is essential once a prosthetic arm arrives at the desired location, and that combined positioning and gripping are possible. Moreover, a robotic arm was inserted into the control loop that increased the delay between neural activity and cursor movement by nearly 0.1 second, but performance was able to nearly fully recover after training.

Taken together, these investigations provide compelling proof-of-concept demonstrations that motor prostheses are possible. Systems are generally capable of guiding a prosthetic cursor to the specified target within a few seconds. Although this level of performance falls short of the speed and accuracy of natural arm movements, it is sufficiently high to motivate next-generation experiments and technological designs. In fact, it is based on these results that an FDA-approved pilot clinical trial of motor prostheses has begun in human patients (Hochberg et al. 2006). It is worth noting that these studies did not quantify motor prosthetic performance in terms of the difference between the desired trajectory and the actual cursor trajectory, on a trial-by-trial basis. It is challenging to quantify true motor prosthetic performance in animal models because it is


difficult to obtain reports of what the desired trajectory might be. Consequently, human clinical trials may be needed to fully understand how well movement trajectories are aligning with desired trajectories. Nevertheless, animal experiments where trajectory “corridors” are specified should allow for more meaningful performance quantification.
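The population vector idea referred to in the 3D cursor study above can be illustrated with a short example. This is a sketch of the basic population vector under simplifying assumptions (cosine-like tuning, with each neuron's preferred direction, baseline rate, and modulation depth already known); it is not the modified or co-adaptive algorithm of Taylor et al. (2002). All names and numbers are hypothetical.

```python
import math


def population_vector(rates, baselines, depths, preferred_dirs):
    """Basic population vector estimate of movement direction.

    Each neuron votes along its preferred direction with a weight equal to
    its normalized rate change, (rate - baseline) / modulation depth.
    Returns a unit vector, or a zero vector if the votes cancel.
    """
    dims = len(preferred_dirs[0])
    vec = [0.0] * dims
    for r, b, m, pd in zip(rates, baselines, depths, preferred_dirs):
        w = (r - b) / m
        for i in range(dims):
            vec[i] += w * pd[i]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


# Six hypothetical neurons with preferred directions along +/- x, y, z.
pds = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
baselines = [20.0] * 6   # spikes/s, made-up values
depths = [10.0] * 6      # spikes/s, made-up values

# Simulate noise-free cosine tuning for one movement direction.
true_dir = (0.6, 0.8, 0.0)
rates = [b + m * sum(p * d for p, d in zip(pd, true_dir))
         for b, m, pd in zip(baselines, depths, pds)]
estimate = population_vector(rates, baselines, depths, pds)
```

With noise-free rates and uniformly arranged preferred directions, the estimate recovers the simulated direction; with real recordings the estimate is noisy, which is part of what the reviewed studies quantify.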

1.4.2 Communication Prostheses

The prosthetic systems described so far have relied exclusively on movement activity and were designed primarily as motor prostheses. Alongside and even preceding this motor prosthetic research effort, there has been active research on cortically-controlled communication prostheses. This interest has been especially prevalent among the non-invasive community, and the intra-cortical community has recently begun to explore this domain as well. Using non-invasive EEG recordings, researchers have explored several approaches for engineering a brain-computer interface that allows for discrete target selection. Some commonly studied categories of EEG signals include slow cortical potentials (SCPs), sensorimotor (μ and β) rhythms, and evoked potentials (P300).

Slow cortical potentials are not entirely sensorimotor-related signals. Rather, they are representative of cognitive state and trained through operant conditioning. As the name suggests, these signals can be controlled only over long timescales (>2 sec); thus, it is inherently difficult to achieve high-speed communication with devices designed for this modality. Birbaumer and colleagues have focused on building practical systems based on the SCP for many years (Birbaumer et al. 1999), but despite various signal processing improvements, the rate of communication is 1-2 characters per minute (Hinterberger et al. 2004). This is equivalent to approximately 0.08-0.16 bits per second (bps), assuming a ~32 key keyboard and perfect accuracy. At these rates, it would take >10 hours to electronically communicate a message of ~100 words.

Scalp recordings can also sense rhythmic activity in the μ and β frequency bands related to sensorimotor cognition. In recent years, Wolpaw and colleagues have been training subjects to use prosthetic systems based on these signals to slide computer cursors to predetermined targets.
The subjects often report that they use motor imagery, where they imagine moving a limb or even the entire body through a trajectory, to first learn how to control the prosthetic system. There are many instantiations of this type of system; most of the highest performing systems report an accuracy on the order of 80-90% and bit rates in the range of 0.25-0.40 bps (McFarland et al. 2003). Furthermore, we can infer from the numbers reported in Wolpaw and McFarland (2004) that the recent 8-target, 2D EEG system is able to perform at 1.25 bps for their best subject, given their usual methods for calculating information transfer rate.

A final EEG communication approach is the P300 evoked potential spelling device. The system displays a grid of possible character choices (usually a 6x6 matrix). The grid is randomly illuminated, one row or one column at a time, during which the patient attends to the preselected


character, mentally noting when it is illuminated (this occurs when either its row or its column is illuminated). The P300 potential is a deflection in the EEG signals occurring approximately 300 milliseconds after unexpected, or random, events. Since the patient is concentrating on a specific row-column pair, the P300 potential will be modulated when either the row or the column is randomly illuminated by the system. By searching for the row and column that induced the most P300 deflection, one can determine the subject’s original character selection post hoc. The initial report of this system was by Farwell and Donchin (1988) and recently there have been some dramatic advances in performance. Modern signal processing and statistical tools, such as support vector machines, have now been imported from other engineering domains to improve performance. Serby et al. (2005) provide a survey of the latest performance of these systems (reaching 1.0 to 1.5 bps for certain subjects, with 44% accuracy) as well as a report on their own adaptive online system that achieves 0.25 bps with 80% accuracy.

In the intra-cortical domain, Kennedy and colleagues demonstrated that just one or two neurons from the motor cortex of locked-in human ALS patients could be used to move a cursor across a virtual keyboard to type out messages (Kennedy et al. 2000). Patients reported simply “imagining” or “thinking” about moving various parts of their bodies, and eventually the computer cursor itself, to guide the cursor. This is an example of a communication prosthesis (goal is to select targets) operating by converting movement activity into continuous cursor control. A severely disabled patient was able to achieve a maximum of 3 characters/min when simultaneously using neural and muscle activities.
While one should take care in noting that this was not a fully cortically-controlled prosthesis (it used some EMG signals from residual muscle function), the performance was equivalent to an information rate of ~0.5 bps. There are other minimally invasive approaches being investigated (Kennedy et al. 2004; Leuthardt et al. 2004) that will likely yield similar information throughput, though results are still preliminary.

In another study, Taylor et al. (2003) provide a nice demonstration that the line between motor prostheses and communication prostheses can be productively blurred; that is, a system designed for the former purpose can be used for the latter application. In this work, the authors demonstrated that the continuous trajectory output from a prosthetic system designed to reach to discrete targets could be processed by an algorithm that chooses the most likely target location. If only a very early portion of the trajectory is needed to make the discrete classification, the system can simply cut the trial short and prepare to decode a new target. With this paradigm, the authors are able to estimate that such a system could theoretically achieve 1.6 bps.2

Different factors limit non-invasive and invasive techniques. The low communication rates of EEG systems are likely due to the inherently low information content in the non-invasive scalp recording caused by spatial averaging across many neurons with dissimilar properties. On the

2We took the data shown in Figure 2 by Taylor et al. (2003) and adjusted the classification time (x-axis) to include an additional 700 ms, accounting for the 500 ms inter-trial interval and the 200 ms delay between target presentation and cursor movement. Then we divided the information (y-axis) by the time (x-axis) to yield bps. We selected the maximum value on that curve.


other hand, the bit rates described by Taylor et al. (2003) are most likely limited by having to employ what is inherently a motor prosthesis design as a communication prosthesis. Recall that communication prostheses can simply position the cursor directly on the target because target selection is their sole function (Shenoy et al. 2003). Furthermore, intra-cortical communication prostheses record data from a population of single neurons that contains relatively high information content, thereby allowing such a system to select targets/keys much more quickly and accurately than EEG-based systems.

Andersen and colleagues recently reported the use of MIP/PRR and some PMd plan activity from rhesus monkeys to select targets/keys (Musallam et al. 2004). Plan activity was used to determine the desired movement endpoint.3 Performance comparable to the systems described above was achieved, though the speed and accuracy with which targets/keys on keyboards of different sizes could be selected were not directly investigated or pushed. In fact, at this time in the literature (2004), there had been no concrete studies demonstrating how intra-cortical designs can offer substantially higher performance than their EEG-based counterparts. This comparison is essential if we are to justify the increased surgical risk associated with intra-cortical electrode implantation. This dissertation helps provide answers to this critical question, as detailed later.

As will be described in Chapter 3, information transfer rate is a natural metric for quantifying performance of communication prostheses, but this metric must be defined and applied precisely. Unfortunately, the precise definition and interpretation of this information transfer rate can be rather inconsistent between the aforementioned studies. Specifically, the information transfer rate of a system is a theoretical maximum that is asymptotically achievable.
It is defined as the information that can be conveyed with zero probability of error, and is often achieved with an infi nite length error correcting code (Shannon 1948). Depending on the nature of the communication prosthesis it may be important to optimize single-trial accuracy or it might be more beneficial to optimize average information transfer rate (bit rate). In other words, if the design goal of a specific prosthetic system is to optimize information transfer rate, there is no benefit to accepting a lower bit rate for the sake of higher average accuracy — any theoretical bit rate computation already implies a zero probability of error. Furthermore, using all of the error statistics when computing the information transfer rate can lead to a more accurate assessment of prosthetic systems. For example, if the errors associated with a particular intended symbol are always assigned to an ad jacent symbol, one can use this fact in the error correcting code. Often studies simply collapse the error statistics to a single accuracy value before computing bit rate. It will be become increasingly important for the field to move toward a consistent interpretation of information theory in order to meaningfully compare the performance of communication prostheses.
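The difference between using the full error statistics and collapsing them to one accuracy value can be illustrated with a small calculation. The sketch below is not taken from any of the cited studies: the function names, the four-symbol confusion matrices, and the choice of the empirical mutual information as the "full statistics" measure are all assumptions made for illustration. The single-accuracy formula assumes that errors are spread evenly over the wrong symbols.

```python
import math

def mutual_information_bits(counts):
    """Empirical mutual information (bits per selection).

    counts[i][j] is the number of trials in which symbol i was intended
    and symbol j was decoded (a hypothetical confusion matrix).
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n > 0:
                bits += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return bits

def collapsed_accuracy_bits(n_symbols, accuracy):
    """Bits per selection from a single accuracy value.

    Assumes equally likely symbols and errors spread uniformly over the
    n_symbols - 1 incorrect symbols.
    """
    bits = math.log2(n_symbols)
    if accuracy > 0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1:
        bits += (1 - accuracy) * math.log2((1 - accuracy) / (n_symbols - 1))
    return bits

# Hypothetical data: 80% correct, every error lands on the adjacent symbol.
adjacent_errors = [
    [80, 20, 0, 0],
    [0, 80, 20, 0],
    [0, 0, 80, 20],
    [20, 0, 0, 80],
]
print(mutual_information_bits(adjacent_errors))  # uses the error structure
print(collapsed_accuracy_bits(4, 0.8))           # discards the error structure
```

With structured errors the full-statistics figure is the larger of the two, because knowing where the errors go removes some uncertainty. Dividing either figure by the time per selection gives a rate in bits per second.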

3 Interestingly, they also discovered that activity in MIP/PRR was modulated by reward expectancy, with neural activity more sharply differentiated relative to target direction when the subject was expecting a larger reward. Leveraging these effects could increase information transfer rates.


1.5 Approaches to Improving Performance

We have provided an overview of neural prostheses as they pertain to patients with motor disabilities and also summarized the research that was current around the time we started our own work. We now wish to frame the critical, outstanding next set of questions whose answers can move the field forward substantially. First, while present systems can definitely improve the quality of life for severely disabled patients, their performance does not come close to fully restoring lost motor function, or even providing a substantial sense of independence. For example, a system that can attain a communication rate of 1.5 bps can only yield a typing speed of ~3.5 words per minute (wpm) on a limited alphanumeric keyboard. Secondly, research has not demonstrated the promise of invasive-based systems, namely their ability to provide greater performance than non-invasive-based systems. Recall that non-invasive-based systems record responses averaged over several million or more neurons that might be representing different information. Consequently, performance can suffer and patient training can be slow and tedious. Invasive-based systems presumably will not suffer from these limitations, but it is imperative to demonstrate this fact in practice, especially if we are to justify the considerably higher surgical risk associated with such approaches. One can address both of these concerns by attempting to raise the performance of invasive-based systems. This is what we chose to do. We first restricted ourselves to investigating communication prostheses because we feel they will provide the highest immediate impact for severely disabled individuals. Again, the clinical population that will be first targeted for such systems is patients who have very severe disabilities, including ALS and quadriplegia, which leave them unable to interact easily with the outside world, if at all.
To address their basic everyday needs, these individuals require devices that improve their ability to communicate, by way of selecting icons on a screen or typing emails. Having decided on communication prostheses, we evaluated whether it might be better to use plan or movement activity. Here, we chose to build a system around plan activity since it does not require decoding of trajectory information and therefore may achieve higher performance, as mentioned in Section 1.3. With these high-level design choices, we chose to investigate the performance of each component in our system schematic (consult Fig. 1.2). Naturally, if we improve the performance of each individual piece, we can improve the performance of the system as a whole. We looked at the initial signal extraction phase by improving the ability to source separate individual neurons recorded on a single electrode tip (Chapter 2). Next, we started with a simple decoding algorithm and optimized its key parameters for performance, as well as analyzed the impact of varying other design parameters such as keyboard size and number of recorded neurons (Chapter 3). Finally, we examined a more complicated decoding algorithm to see how much more performance could be squeezed out of such a system (Chapter 4). With this approach, we have made substantial headway in improving the performance of neural prostheses and delivering on the potential benefits of invasive systems.
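The typing-speed figure quoted earlier in this section (1.5 bps giving roughly 3.5 wpm) can be reproduced with simple arithmetic. The text does not state the keyboard size or word length behind that figure, so the sketch below assumes a 36-key keyboard (26 letters plus 10 digits), equally likely keys, error-free selection, and 5 characters per word; all of these are illustrative assumptions.

```python
import math

def words_per_minute(bits_per_second, n_keys, chars_per_word=5):
    """Convert an information rate into an approximate typing speed.

    Assumes every key is equally likely and selections are error-free,
    so each character costs log2(n_keys) bits.
    """
    bits_per_char = math.log2(n_keys)
    chars_per_minute = bits_per_second * 60 / bits_per_char
    return chars_per_minute / chars_per_word

# Assumed 36-key alphanumeric keyboard and 5 characters per word.
print(round(words_per_minute(1.5, 36), 1))
```

Under these assumptions the result lands close to the ~3.5 wpm stated in the text; a different keyboard size or word length would shift it.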

Chapter 2

Real-Time Spike-Sorting

2.1 Overview

In either basic systems neuroscience or neural prosthetic research, experiments require the collection of data from the brain. Traditionally, this involves investigating the response properties of single neurons. Electrodes are implanted in the brain and situated near several cells. One can measure the voltage surrounding these neurons and attempt to interpret the information communicated by them (Kandel et al. 2000). While modulations in the low-frequency neuronal oscillations may contain useful information, the primary mechanism of information transmission is the emission of a characteristic waveform (i.e., action potential or “spike”). The exact shape of the action potential is very regular across emission events, but is otherwise generally unimportant for the conveyance of information.¹ It is the rate of these emissions (or “firing rate”) that is thought to convey information (Dayan and Abbott 2001). From a signal processing perspective, the spike is the signal of interest and the remainder of the recorded waveform is noise. If the electrode is close to the cell, the sensed action potential will appear large relative to the background noise. This allows one to reliably detect the presence and, importantly, the time of the action potential emission. Likewise, if the electrode is far from the cell, it will be difficult to distinguish the cell’s spikes from the noise, or it may be difficult to discriminate between two different cells’ spikes. The procedure of spike sorting is to infer the times at which one or more neurons emit spikes, as well as to assign each spike to the cell that emitted it. This is done by determining the number of distinct voltage shapes and using those shapes to help identify and classify further events in the recording. A good review of the challenges associated with this blind-source separation problem has been written by Lewicki (1998).
In the context of neural prosthetic systems, spike sorting can lead to greater information extraction from the brain. For example, in the extreme case, it would be highly detrimental to lump two neurons with opposite response properties together. Different neurons are communicating different information and combining them will possibly degrade prosthetic performance. However, there has been a tendency to shy away from spike sorting in this area of research, even though overall decoding performance is of primary interest. One recent study recognizes the importance of spike sorting but argues that it is impractical for large electrode counts given that sorting provides only an incremental performance gain (Carmena et al. 2003). Some studies sort units on small numbers of electrodes (Serruya et al. 2002; Taylor et al. 2002), but do so in a semi-automated fashion. Not surprisingly, there can be a wide variability in the number of neurons and spikes detected when different researchers are asked to manually spike sort an identical raw data stream (Wood et al. 2004). As discussed in Chapter 1, there has been a recent push for implanting large numbers of immovable electrodes (100s) for neural prosthetic research. With these arrays, the electrodes’ locations are fixed and there is little flexibility to increase the signal-to-noise ratio after implantation.² Hence, implantable electrodes are manufactured with only moderately high impedances (e.g., 200-500 kΩ) to ensure recordings from at least one neuron, though in practice they typically record from two or more. While recording from more than one neuron per electrode may sound advantageous at first blush, it substantially increases the need for high-quality and fully-automated spike sorting capable of distinguishing each spike’s neural origin. Sophisticated spike-sorting algorithms exist for training and classifying multiple clusters (“units”) in low signal-to-noise situations (Sahani 1999; Shoham et al. 2003), but none of these have been applied across high electrode counts under real-time classification constraints.

1 The exact characteristics of the action potential might be important to neural biophysicists or single-channel neuroscientists, however.
In this chapter, we present a new infrastructure that leverages existing unsupervised, probabilistic clustering algorithms to sort spikes from a cortical electrode array. The data were collected from a rhesus monkey performing a delayed center-out reach task and are presented here to demonstrate the efficacy of our system. We used both sorted and unsorted (thresholded) action potentials from an array implanted in pre-motor cortex to “predict” the reach target, a common operation in neuroprosthetic research. The use of sorted spikes led to an improvement in decoding accuracy of up to 9.3% on an 8-target task. This system was used in most array recording and all prosthetic experiments in the lab (e.g., Santhanam et al. 2006b). We conclude this chapter with a brief discussion of whether such complex spike-sorting algorithms are feasible in a fully implantable system that must operate in real-time.

2 With single-electrode technology, electrode shanks are movable and can be situated closer to a neuron of interest during the experiment.


2.2 Methods

2.2.1 Spike-Sorting System Diagram

Certain features are common to nearly all spike-sorting algorithms, as shown in Fig. 2.1. The signal from each electrode must be impedance converted, filtered, and amplified (Fig. 2.1a, triangle). Then, the process requires some conversion from the analog waveform to a digital signal because the fundamental data desired is the timing of action potentials. Since only neural spike times are of interest (~100 Hz), but the neural signals are sampled at a high rate (~30 kHz) to quantify action potential shapes, clearly some sort of data reduction must follow the digitization. Next, detected spikes are classified into categories corresponding to individual neurons or inseparable multiunit activity. The parameters for the data reduction are computed during a “training” phase (e.g., where raw data is processed to determine the numbers of neurons detected by each electrode and how best to separate them). Finally, the time and identity of each detected spike can be used by a downstream algorithm to infer a paralyzed patient’s motor intentions. Again, Lewicki (1998) provides a complementary overview of the spike-sorting problem.

[Figure 2.1 panel labels: Electrode, Digitization, Data Reduction, Telemetry; broadband signal; single neural spike (32 samples); data reduction into three classes and noise using orthogonal subspaces; means of class waveforms; final signal: a series of spike times (neurons #1-3).]

Figure 2.1: Extraction of neural signals. a. General block diagram of data extraction from cortical neural recordings for a prosthetic interface: broadband signal (b: 1 s of data; c: 2 ms, showing a spike) recorded on an electrode is first digitally sampled. Then, a feature extraction process reduces the dimensionality of the data (d: spike waveforms in an optimized three-dimensional subspace are easy to distinguish). In this reduced signal space, the activity of individual neurons can be differentiated from each other and from background noise. Optimally, only the spiking times of neurons (e) are finally transmitted from the device to the downstream system, which decodes neural activity into control signals for a prosthetic device. Figure taken from Zumsteg et al. (2005).


2.2.2 Basic Platform

A cornerstone of our system is the Cerebus 128 Channel Data Acquisition System (Cyberkinetics Neurotechnology Systems, Foxborough, MA). We chose to use the Cerebus system because its architecture allows for easy interfacing with our design. First, the Cerebus “front-end” amplifies the incoming signals, applies an anti-aliasing filter, and digitally samples each channel (electrode) at 30 kHz. The digitized output is transmitted via a fiber optic link to the Cerebus Neural Signal Processor (NSP). The NSP can filter the incoming data stream for spike extraction. We chose a fourth-order high-pass Butterworth filter with a cut-off frequency of 250 Hz (one of the available digital filters on the Cerebus system). The NSP compares the filtered data in real-time against a simple threshold trigger; if the trigger is tripped, a 1.6 ms “spike snippet” is sampled. Next, the NSP compares the spike snippet against several sets of time-amplitude window discriminators (“hoops”). Each set of hoops can be used to classify a unit: if a spike waveform passes through all of the active hoops for a specific unit, it is classified with that unit number. There can be up to 4 hoops per unit and 5 units per electrode channel. Snippets that do not satisfy any hoops are tagged as unclassified. The spike snippets, with their classification numbers, are broadcast over a private UDP network. The NSP can optionally broadcast the electrodes’ 30 kHz raw data onto the network as well. A desktop PC runs a graphical user interface (GUI) under Microsoft Windows. The GUI can configure the NSP via the UDP network, including modifying the threshold levels and hoops for online classification. Additionally, the GUI receives the spike snippets and plots each snippet, color coded by classification number. A human operator would ordinarily determine the best sets of hoops (or more generally, the sorting parameters) for each channel by examining the past history of spike snippets.
This is known as the training phase. Figure 2.2 is a screenshot of the user interface for one particular electrode.
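The hoop-based classification described above can be sketched in a few lines. This is an illustration of the idea only, not the Cerebus firmware or its API: the `Hoop` type, the function names, and the example voltages are hypothetical. The limits of 4 hoops per unit and 5 units per channel come from the text, and the 48-sample snippet length is simply 1.6 ms at 30 kHz.

```python
from typing import List, NamedTuple, Sequence

MAX_HOOPS_PER_UNIT = 4   # limit stated in the text
MAX_UNITS_PER_CHANNEL = 5  # limit stated in the text
UNCLASSIFIED = 0

class Hoop(NamedTuple):
    """A time-amplitude window: the waveform must lie in [low, high] at `sample`."""
    sample: int
    low: float
    high: float

def passes_all(snippet: Sequence[float], hoops: Sequence[Hoop]) -> bool:
    """True if the snippet passes through every hoop of one unit."""
    return all(h.low <= snippet[h.sample] <= h.high for h in hoops)

def classify_snippet(snippet: Sequence[float], units: List[List[Hoop]]) -> int:
    """Return the 1-based number of the first unit whose hoops all match.

    Units are checked in list order, so earlier units take priority and
    the classifications are mutually exclusive.
    """
    assert len(units) <= MAX_UNITS_PER_CHANNEL
    for number, hoops in enumerate(units, start=1):
        assert len(hoops) <= MAX_HOOPS_PER_UNIT
        if hoops and passes_all(snippet, hoops):
            return number
    return UNCLASSIFIED

# Hypothetical 48-sample snippet (1.6 ms at 30 kHz) with a trough at sample 10.
snippet = [0.0] * 48
snippet[10], snippet[20] = -120.0, 40.0
units = [
    [Hoop(10, -150.0, -90.0), Hoop(20, 20.0, 60.0)],  # large unit
    [Hoop(10, -80.0, -40.0)],                         # smaller unit
]
print(classify_snippet(snippet, units))
```

Because a hoop only tests one sample against one voltage window, two units with overlapping waveform distributions cannot always be separated this way, which is the limitation discussed later in the chapter.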


Figure 2.2: Screenshot of the Cerebus user interface. Two units are sorted while a third was left unsorted. The remainder of the waveforms are from noise crossing the trigger. The operator sets the trigger threshold (red horizontal line) and places hoops to classify incoming waveforms. The NSP can classify a spike with a round-trip latency of 1-1.5 ms.

Compared to the Cerebus system, other commercial online spike-sorting products offer more advanced visualization tools during the training phase, such as principal components analysis. Even so, these products require a great deal of human intervention. Moreover, their ability to identify separate units on each electrode is considerably less robust than the methods of Sahani (1999) and Shoham et al. (2003), which can allow for automated (unsupervised) training and classification of neural data. When we first started our array recording experiments, Dr. Stephen Ryu was responsible for spike sorting the neural units recorded off of the array prior to each experiment. We affectionately dubbed this the “R” system, after his surname.

2.2.3 RR: Second Generation Classification Infrastructure

We wished to develop a new system that could eliminate the tedious human involvement during the training phase of the spike-sorting process. At the same time, we aimed to have a system that was sufficiently repeatable and scalable to hundreds of electrodes. Our approach was to leverage the data acquisition and classification capabilities of the Cerebus system, while automating the training phase. A block diagram of our second-generation system, which we dubbed “RR” (as a codename for “Ryu Replacement”), is shown in Fig. 2.3. First, the RR server configures the NSP to broadcast the 30 kHz data stream from all active electrodes. The collection time is set so that we capture a sufficient number of neural events for the training algorithm; this was typically 2-3 minutes while the subject was actively performing a relevant behavioral task. The RR server buffers data from all electrodes into main memory.

[Figure 2.3 block labels: Cortical Array, Cerebus NSP, UDP, RR Client (1), RR Client (2), ..., RR Client (N), each client with an RPC to Matlab MEX layer.]

Figure 2.3: System diagram of our “RR” architecture. The Cerebus “front-end” collects raw data from the set of electrodes and interfaces with the Cerebus NSP as usual. The GUI is now relegated to a monitoring role. A second PC, running RTAI Linux (a real-time variant of Linux), is also on the UDP interface; we dub this the “RR server.” It can receive data from the NSP as well as manipulate the NSP’s configuration. The RR server communicates with data processing clients on a separate network interface. These clients train on the data and manipulate the Cerebus NSP parameters by using the RR server as a proxy. We also wrote a Matlab (MathWorks, Natick, MA) MEX interface for communication with the RR server; this allows for easy integration of clustering algorithms that are written in Matlab.

After collection is finished, an RR client can request a specific electrode’s data from the RR server through a remote procedure call (UNIX rpcgen). The client processes the data with the algorithm of choice (as detailed in the following sections), identifying the units present on an electrode. There are typically several computational clients communicating with the server on a TCP/IP network. Each electrode or group of electrodes can be farmed out to one of these clients for parallel processing. This is a key feature since parallelization can dramatically reduce the overall time to train the spike sorter across all of the electrodes. We used generic Pentium 4, 3.0 GHz computers with 2 GB of RAM for the RR server and three accompanying RR clients. Once an electrode’s data is processed, the client uses the clustering information to generate hoops for online classification by the NSP. The sorting clients relay the new threshold level and hoops to the NSP via the RR server. The NSP subsequently classifies all incoming neural events based on these hoops.
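The effect of farming electrodes out to several clients can be estimated with a short calculation. The sketch below is illustrative only: the round-robin assignment and the function names are assumptions, not a description of how the RR server actually schedules work. The figures plugged in (96 electrodes, three clients, roughly 20 seconds of clustering per electrode) are the ones reported in Section 2.3.1.

```python
def assign_round_robin(electrodes, n_clients):
    """Split a list of electrode IDs into one batch per client (assumed policy)."""
    return [electrodes[i::n_clients] for i in range(n_clients)]

def estimated_training_seconds(n_electrodes, n_clients, seconds_per_electrode):
    """Wall-clock estimate when each client processes its batch serially."""
    largest_batch = -(-n_electrodes // n_clients)  # ceiling division
    return largest_batch * seconds_per_electrode

batches = assign_round_robin(list(range(1, 97)), 3)
seconds = estimated_training_seconds(96, 3, 20)
print([len(b) for b in batches], seconds / 60)
```

The estimate ignores data transfer and hoop generation, so it is only a rough check, but it is consistent with the roughly 10 minutes of training time reported for three clients, and it shows why adding clients shortens training almost proportionally.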

2.2.4 Spike Clustering Algorithm

Our architecture can support a variety of specific spike-sorting algorithms. For RR, we chose to use methods described in Sahani (1999) to identify the shapes of action potentials associated with different cells in the recording, and the shapes were then used to design hoops for the Cerebus NSP classification system. This training algorithm was run in Matlab and interfaced with the RR server using compiled Matlab MEX functions. We summarize the algorithm here, but refer the reader to Sahani (1999) for more details. The objective of the algorithm is to estimate the number of sources (neurons) that contribute to the observed signal and to characterize the distribution of action-potential shapes that each source produces. Figure 2.4 provides a diagrammatic summary of the Sahani algorithm and the individual steps are outlined below.

[Figure 2.4 block labels: Neural Data; Real-Time Classification (Threshold, Interp./Peak Align, NWrPCA, Classification); Training (Threshold, Interp./Peak Align, NWrPCA, REM/CMS); NWrPCA Coeff.]

Figure 2.4: Block diagram of the Sahani algorithm. Processing associated with one electrode is illustrated; such processing must be performed for all electrodes. Figure taken from Zumsteg et al. (2005).

1. The data are first high-pass filtered to eliminate low-frequency neuronal oscillations that are not directly related to the action potentials. The “spikes” of interest ride on top of this wave.

2. A threshold is chosen relative to the RMS of the filtered signal. A snippet is sampled around each threshold crossing, but snippets that do not match a predefined shape heuristic are discarded. The recorded signal is also sampled at times where the RMS-derived threshold was not exceeded, so as to build an estimate of the covariance matrix of the noise.

3. The covariance of the background noise (Σn) is computed from all of the snippets that did not cross threshold.

4. Align all of the spike snippets so that their peaks appear at the same sample time.

5. Whiten the noise component of the spike snippets by linearly transforming each with the inverse square root of Σn.

6. Estimate the principal components of the noise-whitened snippets by a fitting technique that is robust to outliers (NWrPCA).

7. Project the snippets to the corresponding 4-dimensional principal subspace, and then fit a mixture-of-Gaussians model to the data using maximum-likelihood. The fitting uses a “relaxation” variant of Expectation-Maximization that reduces the chances of converging to local maxima. The particular relaxation scheme employed allows model selection to be integrated into the fitting procedure, thus automatically identifying the number of cells.
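Steps 2 and 4 of the list above (RMS-relative thresholding and peak alignment) can be sketched as follows. This is a simplified illustration rather than the Sahani (1999) implementation: it assumes a negative-going threshold, aligns on the most negative sample, and omits the shape heuristic, noise covariance estimation, whitening, robust PCA, and mixture fitting. The function names and window sizes are hypothetical; the default multiple of 3 is within the 3-3.5 range reported in Section 2.3.1.

```python
import math

def rms(signal):
    """Root-mean-square value of the (already high-pass filtered) signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

def detect_and_align(signal, multiple=3.0, pre=8, post=24, search=8):
    """Cut out snippets around threshold crossings, aligned on their trough.

    The threshold is -multiple * RMS (assumed negative-going). After each
    crossing, the most negative sample within `search` samples is taken as
    the alignment point, so every returned snippet has its trough at
    index `pre`.
    """
    threshold = -multiple * rms(signal)
    snippets = []
    i = pre
    while i < len(signal) - post - search:
        crossed = signal[i] < threshold and signal[i - 1] >= threshold
        if crossed:
            trough = min(range(i, i + search), key=lambda k: signal[k])
            snippets.append(signal[trough - pre:trough + post])
            i = trough + post  # skip past this event
        else:
            i += 1
    return threshold, snippets

# Hypothetical trace: silence with two small negative-going events.
trace = [0.0] * 200
trace[50:53] = [-5.0, -10.0, -4.0]
trace[120:122] = [-9.0, -3.0]
threshold, found = detect_and_align(trace)
print(threshold, len(found))
```

Aligning on the trough means the same sample index carries comparable information across snippets, which is what makes the later subspace projection meaningful.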

Figure 2.5 shows the results on a two-minute segment of neural data. The false positive and miss rates for four clusters are each less than 5% when examining the a posteriori cluster assignment probabilities in the training set. Note that only two of the units could have been reasonably sorted using hand-positioned hoops. Also, the pre-processing of snippets described above is essential to cell identification; conventional principal components estimated from unprocessed data do not reveal the differences between the three lower-amplitude action-potential shapes.

2.2.5 Hoop Design for Online Classification

Given the mixture model derived by the spike-clustering algorithm, each action-potential snippet can be assigned to the cell from which it is most likely to have originated. However, this operation cannot be carried out on the standard Cerebus NSP hardware. For our second generation system (RR), we developed a novel method that uses the probabilistic assignments from the training set to generate hoop parameters for each cell. Then, the Cerebus NSP can classify new snippets in real-time³:

3 Note that since the NSP does not perform any snippet alignment before classifying, all training spike snippets are locked to NSP threshold crossings for hoop design.


[Figure 2.5 axis labels: PC 1 (left panel); Time Relative to Trough (ms) (right panel).]

Figure 2.5: Clustering results from electrode G20040117.22. Projections into a 2-dimensional principal subspace (after peak alignment and noise whitening) are shown (left panel). Median waveforms for each cluster demonstrate the difference in unit shapes in the temporal domain (right panel).

1. Choose the cluster whose waveforms have the highest power about their peak.

2. Given the set of snippets for this cluster, for each time point consider a hoop whose amplitude window encompasses a fixed multiple of the interquartile range of snippet samples at that time point. Center the windows about the median voltage at the respective time point. This non-parametric metric minimizes the effect of outliers in a given class.

3. Select the hoop from those considered at all time points that minimizes the false positive rate from other neural events in the data stream. Continue this process until there are no false positives remaining or the four available hoops are exhausted.

4. Remove all events that have been correctly classified by this set of hoops. Since the hoop selection is non-optimal and is not as robust as the original clustering, there can be many unclassified neural events remaining for this cluster (i.e., misses). These events continue to remain in the training data since they need to influence the hoop selection for other clusters.

5. Repeat steps 1-4 until all clusters have been assigned hoops.

Although our process of choosing hoops is not optimal, it is a computationally-efficient greedy algorithm. It implements an intuitive heuristic for setting hoops from a set of tagged waveforms. We added an extra heuristic to reduce the leakage of false positives into legitimate classifications. We used the first set of hoops to extract mostly unsortable activity that crosses threshold. Four hoops are placed at equispaced time points shortly after the threshold crossing. Their amplitude windows are twice the threshold level of that channel, centered about zero volts. We call this the “hash unit.” The NSP classifies units in a prioritized fashion and all classifications are mutually exclusive. Hence, the hash unit will reduce the false positive rate at the expense of miscategorizing true spikes into this hash unit.


[Figure 2.6 axis label: Time Relative to Threshold Crossing (ms).]

Figure 2.6: Threshold and hoop design for three clusters on electrode G20040202.14. Shading of waveforms denotes 1.5 times the interquartile range, centered about the median. Hoop positions are graphed with a slight jitter along the x-axis to provide visibility when hoops overlap.

Fig. 2.6 shows the waveforms of each unit along with the hoop settings. This is the final result from the clustering and hoop design process for our RR system. Features of note include: the hash unit (gray hoops) captures most of the green multi-unit cluster; the red unit registers ~10% false positives due to the green unit and ~20% misses due to the hash unit.

2.2.6 RRR: Third Generation Classification Infrastructure

The hoop-based classification of RR was a first attempt at automating our spike sorting procedures in the laboratory. As will be shown subsequently, RR provided adequate sorting capabilities but faltered in situations where the action potential shape from a particular neuron was very similar to another on the same electrode. In order to solve this, we created a more robust infrastructure, dubbed “RRR” (as a code name for “Revised Ryu Replacement”). A diagram of RRR is provided in Fig. 2.7. Unlike in the RR setup, the RRR server performs the real-time classification in lieu of the Cerebus NSP. The Cerebus NSP is relegated to act simply as a data acquisition system. During real-time classification the NSP transmits broadband data to the RRR server and the RRR server performs the actual spike-sorting. In RR, the NSP performed the real-time spike sorting using hoops. In RRR, the server performs the real-time spike sorting using the full probabilistic model afforded by the Sahani algorithm. This allows us to classify spikes in the same principal component subspace used for training and the classification can be performed using maximum-likelihood techniques. The RRR server then sends data to the GUI by mimicking the communication of the NSP. Our early neural prosthetic experiments were conducted using RR (monkey G), and we later transitioned to RRR (monkey H).
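Classification against a fitted mixture model, as RRR does in place of hoops, amounts to evaluating each component's likelihood for a projected snippet and choosing the most probable one. The sketch below is a minimal illustration under simplifying assumptions that are not from the text: each component has a diagonal covariance, the mixing weights are included in the score, and the component parameters shown are invented. The 4-dimensional subspace matches the training procedure described earlier.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance (an assumption)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def posteriors(x, components):
    """Posterior probability of each component given the projected snippet x.

    components is a list of (weight, mean, variance) tuples.
    """
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(logs)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(value - top) for value in logs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def classify_spike(x, components):
    """Index of the component with the highest posterior probability."""
    p = posteriors(x, components)
    return max(range(len(p)), key=p.__getitem__)

# Hypothetical two-unit model in a 4-dimensional principal subspace.
model = [
    (0.5, [0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]),
    (0.5, [5.0, 5.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]),
]
print(classify_spike([4.8, 5.1, 0.3, 0.0], model))
```

Unlike a hoop, this decision uses all subspace dimensions at once, so two units whose waveforms overlap at every individual time point can still be separated if their joint distributions differ.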


[Figure 2.7 block labels: Cortical Array, Cerebus NSP, RRR Server, RRR Clients, each client with an RPC to Matlab MEX layer.]

Figure 2.7: System diagram of our “RRR” architecture.

2.2.7 Data Collection and Analysis

For the purpose of quantifying and illustrating spike-sorting performance, we analyzed data from a rhesus monkey trained to perform delayed center-out reaches to visual targets as briefly outlined in Chapter 1. This behavioral setup will be later detailed in Chapter 3. After testing and verifying the RR and RRR systems, we investigated the benefits of sorting by running analyses to ascertain how well target location can be estimated for each trial from plan period activity. Given a particular target location, the distribution of spike rates for each trial was modeled as a multivariate Gaussian. We employed maximum likelihood methods to determine the highest probability target location for a given trial (see Chapter 3 for specifics). Either sorted data or threshold crossings were input into the estimator. We obtained classification percentages for each day’s session through leave-one-out cross-validation.
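The analysis described above (a Gaussian model of spike rates per target, maximum-likelihood decoding, and leave-one-out cross-validation) can be sketched compactly. The details are deferred to Chapter 3 in the text, so this is only an assumed, simplified form: the covariance is taken to be diagonal, a small variance floor is added to avoid division by zero, and the trial data are invented.

```python
import math
from collections import defaultdict

def fit(trials):
    """Per-target mean and variance of each neuron's spike rate.

    trials is a list of (target, rates) pairs. A diagonal covariance and
    a variance floor of 1e-3 are simplifying assumptions.
    """
    grouped = defaultdict(list)
    for target, rates in trials:
        grouped[target].append(rates)
    model = {}
    for target, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        variances = [
            max(sum((v - m) ** 2 for v in col) / len(col), 1e-3)
            for col, m in zip(columns, means)
        ]
        model[target] = (means, variances)
    return model

def decode(model, rates):
    """Target whose Gaussian model gives the observed rates the highest likelihood."""
    def log_likelihood(target):
        means, variances = model[target]
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (r - m) ** 2 / v)
            for r, m, v in zip(rates, means, variances)
        )
    return max(model, key=log_likelihood)

def leave_one_out_accuracy(trials):
    """Fraction of trials decoded correctly when each is held out in turn."""
    correct = 0
    for i, (target, rates) in enumerate(trials):
        model = fit(trials[:i] + trials[i + 1:])
        correct += decode(model, rates) == target
    return correct / len(trials)

# Hypothetical plan-period rates (spikes/s) from two neurons and two targets.
trials = [
    (0, [10, 2]), (0, [11, 3]), (0, [9, 1]),
    (1, [2, 10]), (1, [3, 11]), (1, [1, 9]),
]
print(leave_one_out_accuracy(trials))
```

The same routine can be run once with sorted units and once with threshold crossings as the input rates, which is the comparison the chapter uses to quantify the benefit of sorting.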


2.3 Results and Discussion

2.3.1 Clustering and Classification

The two key parameters for our algorithm were the threshold level (RR and RRR) and the hoop extent (RR only); these were set to 3-3.5 times the RMS of the filtered data and 3.73 times the interquartile range, respectively. The parameters were determined empirically to provide adequate results. The amount of training data collected per electrode was also important, since it determined how many representative spikes were used to build our clustering models. We used 2 minutes of data to balance the need for sufficient training data against the overall amount of experiment time available per day (~90-120 minutes).

Our infrastructure was highly effective in terms of training time. The clustering algorithm took approximately 20 seconds per electrode, and we sorted 96 electrodes in 10 minutes with three clients. This is at least as fast as human-assisted training, but the strength of our architecture is its scalability and repeatability for very large electrode counts. Training time can be reduced by simply adding more RR/RRR clients.

The traditional problem with testing spike-sorting algorithms on real neural data is that there is no measure of the ground truth: we have no independent way of knowing the true time and identity of each spike. As an alternative, given that the training algorithm is probabilistic by nature, we can compute the a posteriori probability of each spike belonging to each of the classes (which correspond to neurons). This allows us to calculate an average false positive and miss probability for each cluster. A cluster is said to be well-isolated if each type of misclassification probability is under 5%. For example, with our G20040202, G20040312, and G20040330 datasets, there were 62, 40, and 41 units that met this criterion, respectively. For our original RR, we asked whether these units are still well-isolated when classified with hoops. We computed the false positive and miss rates for hoop classification by comparing against the initial clustering results. For the same three datasets, only 46, 33, and 25 units had false positive and miss rates of less than 5% when sorting with hoops. This was a significant drop in the number of well-discriminable units. Furthermore, this error comparison excluded noise snippets that were misclassified as spikes. When we lifted this exemption, we found that noise heavily influences the misclassification rates and many fewer neural units satisfied our goodness criteria.

Ultimately, the hoop-based classifier performed well but did not achieve exceptional results. There was either extraneous noise assigned to legitimate units or a loss of spikes into the hash unit. For example, our hoop-based system was unable to reliably sort all five units on the electrode data shown in Fig. 2.5. Nevertheless, the overall sorting performance was assessed to be qualitatively equivalent to human selection of the hoops; often an individual may feel he is selecting acceptable hoops, but he is unable to fully appreciate the underlying clustering of the data. Figure 2.8 provides an extreme limit case where hoop-based sorting breaks down.
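The isolation criterion described above can be illustrated with a short sketch. The text does not give the exact formulas used to average the posterior probabilities, so the definitions below are an assumption: each spike is assigned to its most probable class, the false positive probability of a class is the mean posterior mass that its assigned spikes place on other classes, and the miss probability is the share of the class's total posterior mass that falls on spikes assigned elsewhere. The function names are hypothetical.

```python
def isolation_errors(posteriors):
    """Return (false_positive, miss) per class.

    posteriors[n][k] is the a posteriori probability that spike n belongs
    to class k. The averaging scheme here is an assumed definition, not
    necessarily the one used in the dissertation.
    """
    n_classes = len(posteriors[0])
    # Hard assignment: each spike goes to its most probable class.
    assigned = [max(range(n_classes), key=lambda k: p[k]) for p in posteriors]
    results = []
    for k in range(n_classes):
        in_k = [p[k] for p, a in zip(posteriors, assigned) if a == k]
        out_k = [p[k] for p, a in zip(posteriors, assigned) if a != k]
        total_mass = sum(in_k) + sum(out_k)
        # Mean probability that a spike assigned to k is really something else.
        false_pos = sum(1.0 - v for v in in_k) / len(in_k) if in_k else float("nan")
        # Fraction of class k's expected spikes that were assigned elsewhere.
        miss = sum(out_k) / total_mass if total_mass > 0 else float("nan")
        results.append((false_pos, miss))
    return results


def well_isolated(posteriors, limit=0.05):
    """Apply the 5% rule from the text to both error types."""
    return [fp < limit and miss < limit for fp, miss in isolation_errors(posteriors)]
```

With this definition, a class with no assigned spikes yields NaN error values and is therefore never reported as well-isolated.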

[Figure 2.8 graphic not reproduced. x-axis: Time Relative to Threshold Crossing (ms); y-axis label not recoverable.]

Figure 2.8: Waveforms from two clusters with the shading height corresponding to two times the interquartile range, centered at the median. These units are easily separated by the clustering algorithm, with low false positive and miss rates. While the median shapes are distinct, a hoop-based classifier struggles with the data due to the spread of the waveforms. Hoops placed for the green unit capture 26.7% false positives from the red unit even though the clustering algorithm estimates false positives at less than 5%. Data were taken from electrode G20040312.21.

Compared to RR, spike-sorting performance was considerably improved when using RRR, as verified by human inspection. This is because the sorts were performed in the reduced-dimension space using the Sahani algorithm’s probabilistic model. As such, unlike hoop classification, the overall shape of the waveform was implicitly considered rather than treating each timepoint as independent. This makes it possible to distinguish shapes with considerable timepoint-by-timepoint overlap (e.g., Fig. 2.8).

2.3.2 Target Location Estimation

Next, we performed a target estimation analysis to verify the hypothesis that spike sorting allows for greater information extraction. For each day’s data, we excluded electrodes that did not have two or more clustered units as determined by our training algorithm. This exclusion is sensible: if an electrode had only one very large amplitude spike waveform, its signal would be de facto spike-sorted even with just a thresholding scheme. Furthermore, for our task, the estimation performance asymptotes as the number of electrodes is increased, even if the additional electrodes only possess unsortable neural activity. To illustrate the benefits of spike sorting, we “biased” the simulations by considering only sortable electrodes. We suggest that this biasing would not be necessary for a more challenging behavioral task (see Carmena et al. 2003, where spike sorting yielded a larger performance gain). The spiking rate was calculated for each unit (or electrode, for the unsorted simulations) in a window 150 to 350 ms after reach target presentation.

The results of the maximum-likelihood estimator are summarized in Table 2.1. For RR, we found a performance increase of between 2.7 and 5.7% when using spike-sorted units for classification. The increase was dependent on the following parameters: model training size, spike integration window, and the electrodes that were excluded. Searching this entire space of parameters is intractable and unnecessary. We can, however, report that in the various scenarios we tested, RR spike sorting resulted in approximately the same performance increases relative to performance when using simple threshold crossings. On two occasions, we compared the automated sorting architecture and hand-optimized hoop locations; the two methods were nearly equivalent in performance.
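As a minimal sketch of the feature extraction described above, the following counts the spikes of one unit in the 150 to 350 ms window after target presentation and converts the count to a rate. The function names, the millisecond time base, and the half-open treatment of the window boundaries are assumptions made for illustration.

```python
from bisect import bisect_left


def count_spikes(spike_times_ms, target_onset_ms, start_ms=150.0, stop_ms=350.0):
    """Count spikes in [onset + start_ms, onset + stop_ms).

    spike_times_ms must be sorted in ascending order.
    """
    lo = bisect_left(spike_times_ms, target_onset_ms + start_ms)
    hi = bisect_left(spike_times_ms, target_onset_ms + stop_ms)
    return hi - lo


def firing_rate_hz(spike_times_ms, target_onset_ms, start_ms=150.0, stop_ms=350.0):
    """Convert the windowed spike count to spikes per second."""
    count = count_spikes(spike_times_ms, target_onset_ms, start_ms, stop_ms)
    return count / ((stop_ms - start_ms) / 1000.0)
```

For the unsorted simulations, the same function would be applied to all threshold crossings on an electrode rather than to the spikes of a single sorted unit.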

Table 2.1: Decoding Performance Improvement due to Spike Sorting

Data Set    # of Tgts   # of Elec.   Unsorted Perf.   RR Perf.   RRR Perf.
G20040329   8           36           64.4%            70.1%      73.2%
G20040330   8           35           66.3%            71.9%      75.6%
G20040413   16          35           75.6%            80.0%      83.5%
G20040417   8           48           91.1%            93.8%      94.5%
G20040421   8           42           83.7%            89.1%      91.3%

When using RRR, there was a consistent increase in performance over even RR, up to +3.7%. The total performance increase over unsorted data is 7.5-9.3% (when excluding G20040417).⁴ If we further restrict our analyses to only electrodes that have 3 or more separate neural units, we see that there can be an even more dramatic difference between unsorted-, RR-, and RRR-based performances. These numbers are listed in Table 2.2. With more neural units on a given electrode, differentiating between them becomes much more important. Hence, better sorting leads to better overall decoding performance, which agrees with intuition. Finally, as will be shown in Chapter 3, what appear to be small gains in decode accuracy can have large impacts on overall performance.
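The gains quoted above can be recomputed from the Table 2.1 entries with a few lines of arithmetic. This is only a cross-check on the tabulated percentages; the dictionary and function names are introduced here for illustration. With the rounded table values, the lower end of the RRR-over-unsorted range comes out at 7.6 rather than the quoted 7.5, which may reflect rounding in the tabulated figures.

```python
# (unsorted, RR, RRR) decoding accuracy in percent, copied from Table 2.1.
TABLE_2_1 = {
    "G20040329": (64.4, 70.1, 73.2),
    "G20040330": (66.3, 71.9, 75.6),
    "G20040413": (75.6, 80.0, 83.5),
    "G20040417": (91.1, 93.8, 94.5),
    "G20040421": (83.7, 89.1, 91.3),
}


def gains(table, exclude=()):
    """Return per-dataset gains in percentage points, rounded to one decimal."""
    rows = {name: row for name, row in table.items() if name not in exclude}
    rr_vs_unsorted = {n: round(rr - u, 1) for n, (u, rr, rrr) in rows.items()}
    rrr_vs_rr = {n: round(rrr - rr, 1) for n, (u, rr, rrr) in rows.items()}
    rrr_vs_unsorted = {n: round(rrr - u, 1) for n, (u, rr, rrr) in rows.items()}
    return rr_vs_unsorted, rrr_vs_rr, rrr_vs_unsorted


rr_gain, rrr_over_rr, rrr_gain = gains(TABLE_2_1)
rrr_gain_restricted = gains(TABLE_2_1, exclude=("G20040417",))[2]

print("RR over unsorted:", min(rr_gain.values()), "to", max(rr_gain.values()))
print("largest RRR over RR gain:", max(rrr_over_rr.values()))
print("RRR over unsorted (excluding G20040417):",
      min(rrr_gain_restricted.values()), "to", max(rrr_gain_restricted.values()))
```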

Table 2.2: Decoding Performance Improvement when Further Restricting Electrodes

Data Set    # of Tgts   # of Elec.   Unsorted Perf.   RR Perf.   RRR Perf.
G20040329   8           20           59.2%            65.5%      69.2%
G20040330   8           16           58.6%            65.1%      70.0%
G20040413   16          16           64.6%            69.9%      72.9%
G20040417   8           18           80.9%            84.2%      86.8%
G20040421   8           24           80.9%            85.5%      87.7%

4 The performance increase when including all of the electrodes is 6.1-7.8%, again omitting G20040417.


2.4 Feasibility of Implantable Spike-Sorting Circuits

Based on the success of proof-of-concept prosthetic systems in the laboratory (see Chapter 3), there is now considerable interest in creating implantable electronics for use in clinical systems. A critical question is whether it is possible to perform the spike-sorting operations in real time and with low power. Low power is essential both for power supply considerations and for heat dissipation in the brain. To answer this question, we performed a feasibility analysis to estimate how much power would theoretically be consumed by an advanced real-time spike-sorting algorithm. Only a summary of this work is provided here; for more details, please refer to Zumsteg et al. (2005).

To demonstrate the feasibility of high-quality real-time spike sorting in implanted hardware, we chose the algorithm that we believe to be both one of the best and one of the most computationally intensive spike-sorting algorithms available. We intentionally sought a state-of-the-art spike-sorting algorithm, one that is uncompromising in spike-sorting quality and relies on principled machine learning techniques, to help ensure that our power estimates would not be overly optimistic. As detailed earlier in Section 2.2.4, our algorithm of choice is the Sahani algorithm.

We addressed the power consumption of the two major computational elements of a spike-sorting system: analog-to-digital conversion (ADC) and the digital training/classification. For the ADC component, we first turned to previous reports of low-energy ADCs (Scott et al. 2003). However, recent developments in low-power ADC design have leveraged extremely power-efficient digital circuits to “aid” the analog design (Murmann and Boser 2004). As a result, using these digital calibration and compensation techniques, the power consumption of ADCs is expected to be reduced by an order of magnitude from the values quoted by Scott et al. (2003).
Hence, a converter consuming close to 1 µW with 8-bit resolution at 30 kHz (or 100 µW for 100 channels) should be achievable.

We estimated the power requirements of the Sahani spike-sorting algorithm by recasting its operations as simple instructions that can be implemented in integrated circuits (ICs). A detailed analysis of the algorithms was carried out, and approximate figures for the number of operations (specifically adds and multiplies) required for each task were obtained.⁵ Operation counts were then translated to power using the conversion factor 1 mW/GOPS (Chandrakasan et al. 1992). This figure is used as the standard power consumption per operation for ASICs implemented in 0.13 µm CMOS technology. Finally, to approximate power usage from memory accesses, we simply doubled the power from instruction execution (Meng et al. 1998). The figures should be taken as an “order of magnitude” indication. However, we believe that they are indicative of the power consumption, and thus achieve the objective of showing that these systems can be implemented in an implantable neural prosthesis.

For the training portion of the Sahani algorithm, assuming a training interval of 12 hours, the total power requirement is approximately 2.8 µW for 100 electrodes. Next, the classification process itself contributes relatively little to the overall power consumption of real-time spike sorting, even though it must be operated continuously. A simplified classification, using only the minimum Euclidean distance to a cluster (i.e., the traditional technique used in concert with PCA), requires 1.3 × 10^4 ops/sec/electrode. This corresponds to 0.026 µW/electrode, or 2.6 µW for 100 electrodes. Finally, most of the real-time computational burden is dominated by the high-pass filter and thresholding. The problem is made more difficult by the fact that the LFP is in the 0.5-100 Hz frequency range, while much of the signal power is concentrated in the 1000-3000 Hz range. With a sampling frequency of 30 kHz, the necessary transition band is somewhat steep. A digital filter consumes approximately 1 µW per electrode, or 100 µW per 100 electrodes. This figure is similar to that of an analog filtering approach, although the digital filter does not require large capacitors and resistors, which can be chip-area intensive.

Therefore, with 100 electrodes, an upper bound on the power consumption of our spike-sorting algorithm (without interpolation during real-time operation) is ~150 µW. Also, we have shown that the one hundred 8-bit, 30 kHz analog-to-digital converters needed for digital spike sorting are expected to consume less than 100 µW of power. Thus, 250 µW is an achievable level of power consumption for an implantable, 100-electrode digital spike-sorting circuit. Assuming heat dissipation over a 16 mm² chip, we have a power-to-area ratio of about 1.6 mW/cm², which is well below the 80 mW/cm² chronic heat dissipation threshold believed to cause tissue damage (Seese et al. 1998). By way of comparison, for 100 electrodes, the all-analog approach of Harrison (2003) would require 5.7 mW, and the wavelet compression technique of Oweiss et al. (2003) would require 120 mW. While these alternative approaches may benefit from some of the architectural techniques that we leverage for our estimates, the loss of information and less-than-ideal data compression remain significant when compared to our proposed implantable spike-sorting approach.⁶

5 Operation counts for some complex linear algebra functions used in the algorithms, like matrix decompositions, were taken from standard texts on numerical linear algebra (Golub and Van Loan 1983).

6 We have not considered the requisite low-noise amplifier in this report, as all approaches to spike sorting require its use, and because recent reports have demonstrated low power (< 1 µW per channel) and noise (~2 µV) levels (Horiuchi et al. 2004; Harrison and Charles 2003).
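The power budget in this section can be retraced with simple arithmetic. The sketch below takes the quoted figures at face value (1 mW/GOPS, a factor of two for memory accesses, and the per-component estimates for 100 electrodes); the variable and function names are introduced here for illustration, and the 2.8 µW training figure is used as quoted without re-deriving it.

```python
MW_PER_GOPS = 1.0       # conversion factor quoted from Chandrakasan et al. (1992)
MEMORY_FACTOR = 2.0     # instruction power doubled to account for memory accesses
N_ELECTRODES = 100


def ops_to_microwatts(ops_per_sec):
    """Convert an operation rate to power in microwatts."""
    gops = ops_per_sec / 1e9
    return gops * MW_PER_GOPS * MEMORY_FACTOR * 1e3  # mW -> uW


# Simplified classification: 1.3e4 ops/sec/electrode.
classify_uw = ops_to_microwatts(1.3e4)               # per electrode

digital_uw = (
    2.8                                              # training, quoted for 100 electrodes
    + classify_uw * N_ELECTRODES                     # continuous classification
    + 1.0 * N_ELECTRODES                             # digital high-pass filter, ~1 uW each
)

# The text rounds the digital part up to ~150 uW and adds up to 100 uW for the ADCs.
total_uw = 150.0 + 100.0
chip_area_cm2 = 16.0 / 100.0                         # 16 mm^2 expressed in cm^2
density_mw_per_cm2 = (total_uw / 1e3) / chip_area_cm2

print(f"classification: {classify_uw:.3f} uW per electrode")
print(f"sum of quoted digital components: {digital_uw:.1f} uW")
print(f"power density: {density_mw_per_cm2:.2f} mW/cm^2")
```

The component sum (about 105 µW) sits below the ~150 µW upper bound stated in the text, and the resulting power density of roughly 1.6 mW/cm² is about fifty times lower than the 80 mW/cm² threshold cited from Seese et al. (1998).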


2.5 Summary

We demonstrated that fully automated spike sorting for laboratory experiments involving hundreds of neural electrodes is practical with present-day technology. Our architecture facilitates use of unsupervised clustering algorithms for configuring existing real-time spike classifiers. Furthermore, we also demonstrated that the performance of a target estimator was improved when using sorted information as opposed to threshold crossings. The performance improvement was moderate and it was gained with little expense. The infrastructure, once installed, is trivial to run before every day’s experiment, and it is extensible past the point where rapid, consistent, human-assisted sorting of hundreds of electrodes becomes untenable. Finally, the training stage is truly quantifiable and can serve as a more robust daily record of the neural implant’s stability.

We have offered two alternatives. The RR architecture is designed to exploit the real-time classification capability of the Cerebus NSP. Since it uses a very simple classifier (time-amplitude hoops), it should extend easily to thousands of electrodes. The RRR architecture is more computationally intensive since it must perform operations similar to the training algorithm (peak-alignment; subspace projection) and then classify based on a maximum-likelihood computation. However, spike shapes can be more accurately sorted using this technique. The extra performance benefits of RRR over RR are measurable, and the sorting method is more mathematically principled and computationally tenable for the ~100 electrode systems we use in the laboratory today. The computational complexities of RRR are also theoretically realizable in a custom, implantable solution given our power feasibility calculations.

It is important to note that sorting units may possibly reduce decode performance. If nearby neurons had similar tuning properties, it could be advantageous to group them together as a single channel rather than separating each into its own unit. This would help combat the inherent spiking variability of neurons, which is often modeled as an inhomogeneous Poisson distribution (Dayan and Abbott 2001). The idea of designing the sorting parameters based on the optimality of the final decoding algorithm is an exciting line of research and requires further investigation. To properly explore this subject, one would have to first spike sort and then ask whether it is better to fuse together the separated units. As such, there is a need for a solid spike-sorting infrastructure regardless.

Finally, for many neuroscience experiments and future prosthetic work where single neuron adaptation is expected, it is critical to track individual neurons. The experimenter needs to validate that shifts in a neuron’s response are indeed legitimate and not an artifact of poor signal separation. Robust architectures like that proposed in this chapter can hopefully facilitate these types of studies.


2.6 Credits

A report of RR has previously appeared in a five-page conference paper format (Santhanam et al. 2004). The starting point for our endeavors was the collection of algorithms developed by Dr. Maneesh Sahani for his doctoral dissertation (Sahani 1999). I wrote and tested all of the real-time software, which interfaced tightly with the algorithmic code provided by Dr. Sahani. Dr. Sahani and I worked closely to develop the new greedy algorithm for RR and I later ported RR to RRR. Dr. Stephen Ryu (i.e., “R”) was the primary surgeon for electrode implantation and provided the initial inspiration for the project; he would perform daily spike sorting by hand prior to the advent of this automated system. Caleb Kemere conducted much of the analysis regarding the feasibility of an implantable spike-sorting solution, Stephen O’Driscoll assisted with the ADC power computations, and Professor Teresa Meng contributed many valuable scientific discussions. We also thank Byron Yu for assisting with data collection, Missy Howard for surgical assistance and veterinary care, and Dr. Nicho Hatsopoulos for surgical assistance with the monkey G implant.

This study was supported by the NDSEG Fellowship (GS), NIH grant NS-10414 (MS), the Coleman Fund (MS), the Christopher Reeve Paralysis Foundation (SIR, KVS), MARCO Center for Circuit & System Solutions (www.c2s2.org) under contract 2003-CT-888 (THM, CK), and the following awards to KVS: the NSF Center for Neuromorphic Systems Engineering at Caltech, ONR, Whitaker Foundation, Center for Integrated Systems at Stanford, Sloan Foundation, and Burroughs Wellcome Fund Career Award in the Biomedical Sciences.

Chapter 3

A High-Performance Brain-Computer Interface

3.1 Overview

As covered in Chapter 1, brain-computer interfaces (BCIs) may someday assist patients suffering from neurological injury or disease, but relatively low system performance remains a major roadblock. In fact, the speed and accuracy with which keys can be selected using BCIs are still far lower than for systems relying on simple eye movements. This is true whether BCIs employ recordings from populations of individual neurons using invasive electrode techniques (Serruya et al. 2002; Taylor et al. 2002; Carmena et al. 2003; Musallam et al. 2004; Kennedy et al. 2000; Hochberg et al. 2006; Patil et al. 2004) or EEG recordings using less- or non-invasive (Leuthardt et al. 2004; Wolpaw and McFarland 2004) techniques. In Chapter 2, we presented a front-end approach to improving prosthetic performance, namely performing a more principled job of discriminating neurons recorded from implanted electrodes. We now turn to improving the task of decoding neural signals to predict the motor intentions of a subject.

Most BCIs translate neural activity into a continuous movement command, which guides a computer cursor to a desired visual target (Kennedy et al. 2000; Serruya et al. 2002; Taylor et al. 2002; Carmena et al. 2003; Leuthardt et al. 2004; Wolpaw and McFarland 2004; Patil et al. 2004; Hochberg et al. 2006). If the cursor is used to select targets representing discrete actions, the BCI serves as a communication prosthesis. Examples include typing keys on a keyboard, turning on room lights, and moving a wheelchair in specific directions. Human-operated BCIs are currently capable of communicating only a few letters per minute (~1 bit/s sustained rate; Wolpaw and McFarland 2004) and monkey-operated systems can only accurately select one target every 1-3 seconds (~1.6 bits/s sustained rate; Taylor et al. 2003), despite using invasive electrodes.

An alternate, potentially higher-performance approach is to translate neural activity into a prediction of the intended target and immediately place the cursor directly on that location. This type of control is appropriate for communication prostheses and benefits from not having to estimate unnecessary parameters such as continuous trajectory (Shenoy et al. 2003; Musallam et al. 2004). In this chapter, we describe how we conducted an iterative series of experiments to investigate how quickly and accurately a BCI could operate under direct endpoint control. We were able to design and demonstrate, using electrode arrays implanted in monkey dorsal pre-motor cortex, a manyfold higher performance BCI than previously reported (Wolpaw and McFarland 2004; Taylor et al. 2003). These results indicate that a fast and accurate key selection system, capable of operating with a range of keyboard sizes, is possible (up to 6.5 bits/s, or ~15 words per minute, with 96 electrodes). The highest information throughput is achieved with unprecedentedly brief neural recordings, even as recording quality degrades over time. These performance results and their implications for system design should substantially increase the clinical viability of BCIs in humans. The significance of this work has been independently assessed by Scott (2006).


3.2 Methods

We trained two rhesus monkeys (G and H) to perform a standard instructed-delay center-out reaching task (Cisek and Kalaska 2004) to first assess neural activity in the arm representation of monkey dorsal pre-motor cortex (PMd), as shown in Fig. 3.1a. Animal protocols were approved by the Stanford University Institutional Animal Care and Use Committee. Hand and eye position were tracked optically (Polaris, Northern Digital, Canada; Iscan, Burlington, MA). Stimuli were back-projected onto a frontoparallel screen 30 cm from the monkey.

Real-reach trials (consult Fig. 3.1a) began when the monkey touched a central yellow square and fixated his eyes on a magenta cross. Following a touch hold time (200-400 ms), a visual reach target appeared on the screen. After a randomized (200-1000 ms) delay period, a “go” cue (central touch and fixation cues were extinguished and the reach target was slightly enlarged) indicated that a reach should be made to the target. As previously reported, neural activity during the delay period (time from target appearance until “go” cue) reflects the endpoint of the upcoming reach (Messier and Kalaska 2000). The reach endpoint can be decoded from delay-period activity using maximum-likelihood techniques (Yu et al. 2004).

Eye fixation was enforced throughout the delay period to control for eye-position-modulated activity in PMd (Cisek and Kalaska 2002; Batista et al. 2005). This fixation requirement is appropriate in a clinical setting if targets are near-foveal, or imagined as in a virtual keyboard setup. The hand was also not allowed to move until the go cue was presented, providing a proxy for the cortical function of a paralyzed subject (Serruya et al. 2002; Taylor et al. 2002; Carmena et al. 2003). Subsequent to a brief reaction time, the reach was executed, the target was held (~200 ms), and a juice reward was delivered along with an auditory tone. An inter-trial interval (~250 ms) was inserted before starting the next trial.

We presented various target configurations (2, 4, 8 or 16 targets) on the screen, including layouts with 2, 4, or 8 directions, and 1 or 2 distances (6-12 cm radially outward). The aforementioned paradigm was used for control experiments that helped us design our BCI system. When actually implementing our BCI system, we modified the task to display targets in rapid succession, as will be detailed later. We call these our BCI experiments. This allowed us to test the true performance of the system in a setting analogous to its usage scenario for human patients.

3.2.1 Neural Recordings

Neural activity was simultaneously recorded from a 96-channel electrode array (Cyberkinetics Neurotechnology Systems, Foxborough, MA) implanted in the arm representation of PMd, contralateral to the reaching arm (left, monkey G; right, monkey H). For monkey G, we used the second-generation spike sorting (RR) described in Chapter 2. The third-generation, more sophisticated system was used for monkey H (RRR; also detailed in the previous chapter). The use of an automatic spike-sorting system ensured a very fast and repeatable method for classifying neural units each day. We recorded 20-30 single neurons and 60-100 multi-neuron units in a typical session. Figure 3.2 shows the anatomical placement of the electrode array in both monkeys.

[Figure 3.1 graphic not reproduced. Panel a timeline: Touch hold, Delay period, 'Go' cue, Real reach. Panel b timeline: Touch hold, Trial #1, Trial #2, Trial #3, Trial #4, Delay period, 'Go' cue.]

Figure 3.1: Instructed-delay (real reach) and BCI (prosthetic cursor) tasks, with accompanying neural data. Large numbered ellipses draw attention to the increase in neural activity related to the peripheral reach target. a. Standard instructed-delay reach trial. Data from selected neural units are shown (gray shaded region); each row corresponds to one unit and black ticks indicate spike times. Units are ordered by angular tuning direction (preferred direction) during the delay period. For hand (H) and eye (E) traces, blue and red lines show the horizontal and vertical coordinates, respectively. Full range of scale for these data is ±15 cm from the center touch cue. b. Chain of three prosthetic cursor trials followed by a standard instructed-delay reach trial. T_skip is denoted by orange in the timeline. Neural activity was integrated (T_int) during the purple shaded interval and used to predict the reach target location. After a short processing time (T_dec+rend ≈ 40 ms), a prosthetic cursor was briefly rendered and a new target was displayed. The dotted circles represent the reach target and prosthetic cursor from the previous trial, both of which were rapidly extinguished before the start of the trial indicated. Trials from experiment H20041106.1 with monkey H.

[Figure 3.2 graphic not reproduced. Panels: Monkey G, Monkey H; orientation labels: Posterior, Anterior.]

Figure 3.2: Placement of electrode arrays in PMd of monkeys G and H. For both monkeys, the arrays were placed in a location that spans dorsal pre-motor and primary motor cortices. The neural signals tended to be responsive during both the delay period and the movement phase of trials. Intraoperative photographs of the array implanted in cerebral cortex are shown with sulci indicated. Overlapping diagram shows the relative array placement between monkeys. Monkey H’s sulcal pattern is reflected vertically and rotated to bring the sulci into alignment with those of monkey G. Ce.S.: central sulcus; S.Pc.D.: superior precentral dimple; Sp.A.S.: spur of the arcuate sulcus; A.S.: arcuate sulcus.

In BCI experiments, a selection process determined which units were to be used during target prediction. For monkey G, we used 0-4 single units for each electrode, the exact number varying from day to day, along with an optional multi-unit classification. Single units were preferentially included by signal-to-noise ranking. We collected data from 18 separate BCI experiments from monkey G, each experiment containing many hundreds of trials.

For monkey H, we included multi-unit activity along with 0-5 single units per electrode. An additional ANOVA criterion was applied to include only units that were significantly modulated by reach target direction during the delay period (p < 0.01). We collected data from 40 separate BCI experiments from monkey H, each experiment containing many hundreds of trials. With the aforementioned selection criteria, ~70-90 neural units were used in our highest performance BCI experiments with monkey H.

To better understand what proportion of single units and multi-units were recorded, we examined neural data with parameters similar to those used during our BCI experiments. For monkey H (using dataset H20041217; 8 targets), there were 25 tuned single units and 89 tuned multi-units (tuning assessed with ANOVA, p < 0.05). Tuned units are those units that show a statistically significant difference in spike count as the direction of the reach is varied. For monkey G (using dataset G20040508; 8 targets), there were 26 tuned single units and 65 tuned multi-units. We evaluated the sort quality of a unit (single versus multi) using all spiking data in each experiment and the discriminability between units in the modified principal components space (Sahani 1999).

3.2.2 Decoding Algorithms

Maximum-likelihood techniques (or decoding algorithms, as we refer to them) are central to our ability to decode neural activity in order to discover a subject’s motor intentions. For each trial, we compress the activity recorded from the array into a vector that denotes the number of spikes from each neuron during the delay period.¹ We then model the spike counts as a random vector drawn from either a multivariate Gaussian or a Poisson distribution. Taking the case of a multivariate Gaussian distribution, we can write the following mathematical expression:

$$ f(y \mid s) = \frac{1}{(2\pi)^{q/2} |\Sigma_s|^{1/2}} \exp\left( -\frac{1}{2} (y - \mu_s)^{\mathsf{T}} \Sigma_s^{-1} (y - \mu_s) \right) \tag{3.1} $$

where $y \in \mathbb{R}^q$ is the vector of spike counts for a single trial, and $\mu_s \in \mathbb{R}^q$ and $\Sigma_s \in \mathbb{R}^{q \times q}$ are the mean vector and the covariance matrix fitted to the data for reach endpoint $s \in \{1, \ldots, M\}$. This is illustrated with simulated data in Fig. 3.3 for $q = 2$ and $M = 3$. The mean vector will be different for each reach endpoint $s$, and for any given trial the observed spike counts will be perturbed by Gaussian noise. The parameters of the model are first fit with a set of trials dubbed “training trials.” Then the fitted model can be used to make predictions of the reach direction given only the neural data. We decoded reach direction for “test” trials using maximum likelihood as follows:

\begin{align}
\hat{s} &= \arg\max_s P(s \mid y) \tag{3.2} \\
        &= \arg\max_s \frac{f(y \mid s)\, P(s)}{f(y)} \tag{3.3} \\
        &= \arg\max_s f(y \mid s), \tag{3.4}
\end{align}

1 The discrimination of individual neurons from the recorded electrode voltages and the determination of their spike times was covered in Chapter 2.

[Figure 3.3 graphic not reproduced. x-axis: y_1 (# of spikes); y-axis label not recoverable.]

Figure 3.3: Multivariate Gaussian data-fitting. Each point corresponds to a single trial and its color corresponds to the actual reach direction on that trial. A covariance ellipsoid is fit to the set of data points for each reach direction. Only three reach directions (0° blue, 90° green, 180° red) are shown. The ellipses correspond to 50% confidence regions.

where $\hat{s}$ is the estimated reach endpoint. Equation (3.3) is obtained using Bayes’ rule and Eq. (3.4) is a result of all reach directions being equally likely and $f(y)$ not being dependent on $s$.

With our datasets, the number of neural units ($q$) is most often comparable to the number of training trials (150-200 neurons versus 50-100 trials per reach endpoint). Therefore, we chose to constrain the covariance matrix ($\Sigma_s \in \mathbb{R}^{q \times q}$) to be diagonal in order to avoid overfitting issues with a full covariance matrix. That is, we assume that the spike counts from each neuron are independent of the counts from the other neurons once the reach direction is given.

Another common approach for modeling the spike counts of neurons is to fit the data to a Poisson distribution (Dayan and Abbott 2001). Fig. 3.4 shows the distribution of spike counts from a random subset of neuron/reach-direction pairs using data collected from our standard experimental setup. Overlaid on the plot is the theoretical distribution from a Poisson distribution (matched mean; blue) and a Gaussian distribution (matched mean and standard deviation; red). A Gaussian distribution can model spike count data well for high mean counts, but when there are fewer spike counts in a given delay period, the Gaussian is no longer a good fit. The Poisson model, however, provides a better fit for lower spike rates. We have also examined the Fano factor and found that this measure roughly agrees with the value expected from a Poisson distribution. For a Poisson-based model, the probability mass function (pmf) can be expressed as:

$$p(y_i \mid s) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!} \qquad (3.5)$$

where $y_i \in \mathbb{N}_0$ is the spike count for neural unit $i$ in a single trial and $\lambda_i \in \mathbb{R}_+$ is the mean spike count fitted to the training data for reach direction $s \in \{1, \ldots, M\}$. It is important to note that the Poisson-like noise properties of neural recordings are not a specific feature of our recordings, but rather a generally accepted model for cortical neurons. There


[Figure 3.4 plot omitted: a 4 x 4 grid of histograms, one per neural unit-target pair (units 12,1; 13,1; 14,1; 15,1; 18,1; 19,1; 20,1; 20,2; 22,1; 23,4; 25,1; 26,3; 29,1; 29,2; 34,1; 40,1; all for dir1). Horizontal axis: spike count.]

Figure 3.4: Histogram of normalized spike counts with overlaid Poisson distribution (red) and Gaussian distribution (blue) for a random selection of neural units. The x-axis denotes spike counts and the y-axis is the normalized frequency or probability. Specific neural unit-target pairs are plotted in each subpanel. Spike counts are summed over the interval [150,250] ms referenced from target presentation. There is an arbitrary y-axis scaling for the Gaussian; this degeneracy was resolved by rescaling the maximum point on the blue curve to coincide with the data histogram.


is strong precedent for neural decoding algorithms that use the Poisson distribution to describe the data. Such algorithms most often use either Gaussian (Maynard et al. 1999; Zhang et al. 1998) or Poisson models (Zhang et al. 1998; Brown et al. 1998; Smith and Brown 2003; Truccolo et al. 2004; Brockwell et al. 2004). The belief that neuronal spike counts are best modeled with a Gaussian or Poisson distribution is quite strong, and, while not perfect (e.g., a Gamma distribution can sometimes be better), it is considered to be a reasonable approximation and also computationally tractable.
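To make the Poisson model of Eq. (3.5) concrete, the following minimal Python sketch fits one mean spike count per neural unit and reach direction and then picks the most likely direction under equal priors, treating units as independent given the reach direction. It is an illustration under stated assumptions rather than the implementation used in these experiments: the function names, the list-of-lists data layout, and the `floor` applied to the fitted means (so that the logarithm stays finite for a unit that never fired in training) are all hypothetical.

```python
import math


def fit_poisson(train_counts, train_labels, n_classes, floor=0.01):
    """Return means[s][i]: mean spike count of unit i for reach direction s.

    train_counts: list of trials, each a list of q spike counts.
    train_labels: reach direction index (0..n_classes-1) for each trial.
    floor: hypothetical lower bound on fitted means (an assumption of this
           sketch). Every direction is assumed to have at least one trial.
    """
    q = len(train_counts[0])
    sums = [[0.0] * q for _ in range(n_classes)]
    n_trials = [0] * n_classes
    for counts, s in zip(train_counts, train_labels):
        n_trials[s] += 1
        for i, c in enumerate(counts):
            sums[s][i] += c
    return [[max(v / n_trials[s], floor) for v in sums[s]]
            for s in range(n_classes)]


def poisson_log_likelihood(counts, lam):
    """Sum over units of the log of Eq. (3.5), assuming independent units."""
    return sum(c * math.log(l) - l - math.lgamma(c + 1)
               for c, l in zip(counts, lam))


def decode(counts, means):
    """Maximum-likelihood reach direction; equal priors make this the MAP choice."""
    scores = [poisson_log_likelihood(counts, lam) for lam in means]
    return max(range(len(means)), key=scores.__getitem__)
```

A diagonal-covariance Gaussian decoder has the same structure, with the per-unit Poisson log-likelihood replaced by a per-unit Gaussian log-density using a fitted mean and variance.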

3.2.3 Model Training

We fit (trained) models based on neural activity collected starting $T_{\mathrm{skip}}$ after the target presentation time and extending for a duration $T_{\mathrm{int}}$. For the control experiments, we either used two separate blocks of trials, one for training and one for prediction, or used leave-one-out cross-validation with all the trials in a dataset. For BCI experiments, training trials were initially collected to fit the models and, during subsequent trials, the target was predicted with the model (Gaussian model for monkey G and Poisson model for monkey H). There were many hundreds of test trials in these BCI experiments. As such, we were able to provide sufficient repeatability in our experiment so as to ensure a high degree of confidence in our results.
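The leave-one-out procedure mentioned above can be sketched generically: each trial is held out in turn, a model is fit to the remaining trials, and the held-out trial is predicted. In the sketch below, `fit` and `predict` are placeholders for whichever spike-count model is used (Gaussian or Poisson); the nearest-mean stand-in is included only so that the example runs on its own and is not the model described in this chapter. All names are hypothetical.

```python
def leave_one_out_accuracy(trials, labels, fit, predict):
    """Fraction of trials predicted correctly when each is held out in turn.

    trials, labels: Python lists of equal length.
    fit(train_trials, train_labels) -> model
    predict(model, trial) -> predicted label
    """
    correct = 0
    for k in range(len(trials)):
        # Train on every trial except trial k.
        train_x = trials[:k] + trials[k + 1:]
        train_y = labels[:k] + labels[k + 1:]
        model = fit(train_x, train_y)
        correct += predict(model, trials[k]) == labels[k]
    return correct / len(trials)


def fit_class_means(xs, ys):
    """Stand-in model (assumption): mean total spike count per label."""
    totals = {}
    for x, y in zip(xs, ys):
        totals.setdefault(y, []).append(sum(x))
    return {y: sum(v) / len(v) for y, v in totals.items()}


def predict_nearest_mean(model, x):
    """Stand-in prediction: label whose mean total count is closest."""
    return min(model, key=lambda y: abs(model[y] - sum(x)))
```

Because the held-out trial never contributes to the fitted parameters, the resulting accuracy is not inflated by testing on training data, which matters when the number of trials per reach direction is small.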


3.3 Control Experiments

Before conducting BCI experiments, we analyzed the neural activity from instructed-delay control experiments to set parameters essential for high-performance BCI operation. We subdivided the delay period into two epochs: a time to skip after target onset while waiting for reach endpoint information to become reliable and readily decodable ($T_{\mathrm{skip}}$) and a time to integrate the neural data that will be used to predict the desired target selection ($T_{\mathrm{int}}$).

3.3.1 Selection of Skip Time (Tskip)

The first epoch, $T_{\mathrm{skip}}$, includes the time for visual information about the target to arrive in PMd (50-70 ms), the time for the subject to select among targets if more than one is present, and the time for neural activity reflecting the desired target to be generated. Despite being of considerable scientific interest (Yu et al. 2006a), neural activity during these early periods is discarded in the present BCI design. Some activity during this period may already be predictive of the desired target, but it is not yet clear how best to decode this information. Choosing a short $T_{\mathrm{skip}}$ can reduce the overall length of each trial, but may adversely affect prediction accuracy. $T_{\mathrm{skip}}$ was chosen to be 150 ms based on control experiments including a multi-target task where the monkey was trained to reach for one of many simultaneously presented targets, as described directly below.

Before visual information is relayed to PMd, the measured neural activity in PMd is not target-related; random neural variability can inject noise into our decoding model. We computed the average single-trial accuracy as a function of $T_{\mathrm{skip}}$, fixing $T_{\mathrm{int}}$ to 50 ms. Figure 3.5 demonstrates that the neural activity in PMd cannot be meaningfully decoded to predict the reach target until ~75 ms after the target is displayed. This estimate includes a ~16-33 ms delay between when the software sends a request to show the stimulus and when it is actually displayed by the CRT projector. Figure 3.5 also reveals that there is target-related information in PMd as early as 50-70 ms after the target is first cued. It would not be possible to decode the target with above-chance probability otherwise. This rough estimate of latency agrees with neural response plots from previous studies in PMd (e.g., Crammond and Kalaska 2000; Kalaska and Crammond 1995), where some neurons show a change in activity very soon after stimulus onset.

This latency has further implications for BCI experiments where reach targets are presented in rapid succession. Figure 3.1b shows that neurons were spiking according to the target location of a previous trial for many tens of milliseconds after the start of a new trial (see just after ellipse #2).

The previous analysis measures the latency of PMd neurons in a very specific situation, where there is a single target displayed and the subject reaches to that target. To estimate the time needed for the brain to select among multiple reach targets, we performed a separate control experiment with both monkeys. We presented each monkey with a multi-target task where all of the eight possible reach locations were shown on every trial, but only one was colored yellow while the rest were colored green. The monkey was trained to reach for the yellow target following the


[Figure 3.5 plot omitted: one curve each for monkey H and monkey G; horizontal axis: skip time ($T_{\mathrm{skip}}$) (ms), 0-400.]

Figure 3.5: PMd latency analysis with the single-target instructed-delay task (one reach target was shown out of 8 possible locations and the remaining 7 locations were invisible) as a function of $T_{\mathrm{skip}}$. Performance was calculated by training a Poisson model on all trials in a dataset and computing the leave-one-out cross-validated performance on the same data. The shaded area denotes the 95% confidence interval (Bernoulli process) around the mean performance (embedded line). Dark curves correspond to monkey G (dataset G20040603) and light curves to monkey H (dataset H20041117). Performance was calculated for a constant $T_{\mathrm{int}}$ of 50 ms with varying $T_{\mathrm{skip}}$.

delay period. Figure 3.6 compares the performance for the conventional single-target instructed-delay and multi-target tasks, as a function of $T_{\mathrm{skip}}$. For the multi-target task, we require a longer $T_{\mathrm{skip}}$ before there is a decodable reach plan. For monkey H, we used both a yellow-green and a yellow-blue color scheme.² Comparing the two color schemes, there is a much larger (+150 ms) latency for the yellow-green scheme. This large difference between the two color schemes demonstrates that the difficulty of the task can greatly influence the speed at which plans are formed.

A question that frequently arises in visually cued studies such as ours is whether the neural activity measured during the delay period is related to a reach plan, the visually cued stimulus, or a combination of both. One such discussion of this issue can be found in Crammond and Kalaska (1995). For example, recording from primary visual cortex could provide excellent prospects for decoding the reach target in our single-target task, but a BCI operating on this neural activity would not represent the motor intentions of the subject. Our multi-target task can serve as a control experiment in this regard. Placing $T_{\mathrm{skip}}$ at the time where the performance curves in Figs. 3.6b and 3.6d converge would provide assurance that such a BCI is decoding motor intention.

The multi-target task is also an inherently more difficult task than the single-target task. The different time courses of these two tasks cannot be entirely due to the difference in visual stimuli, especially since merely changing the color of the non-reach targets caused a considerable shift in the time course for monkey H (cf. Figs. 3.6c and 3.6d). We therefore chose to be neither overly conservative by waiting until the time at which the performance curves fully converged (250 ms), nor overly liberal by selecting $T_{\mathrm{skip}}$ to be coincident with the early plateau in decoder accuracy for the single-target task (75-100 ms). We chose a $T_{\mathrm{skip}}$ of 150 ms.
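The latency estimates drawn from Figs. 3.5 and 3.6 rest on asking when decoding accuracy rises clearly above chance (1 in 8 for the 8-target layout). A minimal sketch of such a check is given below. The figure captions specify a 95% confidence interval for a Bernoulli process but not how it was computed, so the normal (Wald) approximation used here is an assumption, as are the function names.

```python
import math


def bernoulli_ci(n_correct, n_trials, z=1.96):
    """Approximate 95% confidence interval for a success probability.

    Uses the normal (Wald) approximation, which is an assumption of this
    sketch; it is least reliable for few trials or accuracies near 0 or 1.
    """
    p = n_correct / n_trials
    half_width = z * math.sqrt(p * (1.0 - p) / n_trials)
    return max(0.0, p - half_width), min(1.0, p + half_width)


def above_chance(n_correct, n_trials, n_targets):
    """True when the lower confidence bound exceeds chance (1 / n_targets)."""
    lower, _ = bernoulli_ci(n_correct, n_trials)
    return lower > 1.0 / n_targets
```

Applied at each candidate skip time, a test of this kind gives one way to read off the earliest time at which the target can be decoded with above-chance probability.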

2 All colors were measured to be roughly isoluminant with a photometer calibrated for the primate visual system.


[Figure 3.6 plots omitted: panels for monkey G and monkey H; horizontal axis: skip time ($T_{\mathrm{skip}}$) (ms), 100-400.]

Figure 3.6: Direct performance comparison between the single-target and multi-target tasks as a function of $T_{\mathrm{skip}}$. a. Different task configurations. Tasks were interleaved in a pseudorandomized fashion during each experiment. Analysis is similar to that presented in Fig. 3.5. Performance is plotted with $T_{\mathrm{int}}$ fixed at 50 ms and $T_{\mathrm{skip}}$ varied. b. Performance with yellow-green color scheme converges at $T_{\mathrm{skip}} \approx 250$ ms (dataset G20040603). c. Performance with yellow-green color scheme converges at $T_{\mathrm{skip}} \approx 400$ ms (dataset H20041117). d. Performance with yellow-blue color scheme converges at $T_{\mathrm{skip}} \approx 250$ ms (dataset H20041201).


We used the single-target task, as opposed to the more complicated multi-target task, for our BCI experiments because we felt that it provided the simplest analogy to a human prosthetic system. While real patients may have to choose from several objects in their workspace, these objects will ordinarily not be presented immediately prior to a decision to execute a prosthetic reach. Furthermore, BCIs will typically rely on internally-generated target plans as opposed to externally presented stimuli (Shenoy et al. 2003; Afshar et al. 2005) and it has been shown that PMd exhibits robust motor plan activity in the absence of visual stimuli (Crammond and Kalaska 2000). BCIs tested with internally-generated plans may well achieve even greater performance than what we have demonstrated in this report. Internally generated plans can be formed without the added latencies of the visual system. Experiments are underway to test this hypothesis, but these are outside the scope of this dissertation.

3.3.2 Selection of Integration Time (Tint)

The second epoch, $T_{\mathrm{int}}$, directly follows $T_{\mathrm{skip}}$ and provides the neural data used to predict the desired BCI cursor position. Given the Poisson-like noise in the spike timing of cortical neurons, a longer $T_{\mathrm{int}}$ will average away more noise and result in more accurate predictions of reach endpoint. However, a longer $T_{\mathrm{int}}$ will also reduce the total number of cursor positionings that can be made per second. Herein lies the fundamental speed-accuracy tradeoff that we must optimize in order to increase BCI performance.

To determine the best $T_{\mathrm{int}}$ to be used in BCI experiments, we analyzed the effect of this parameter on two performance metrics. The first is single-trial accuracy, which is the percentage of trials in which the target is correctly predicted. We found that accuracy rises and largely saturates around 85-90% as $T_{\mathrm{int}}$ increases to 200-250 ms. Figure 3.7 illustrates this effect as a function of total trial length, which is defined to be the sum of $T_{\mathrm{skip}}$ (150 ms), $T_{\mathrm{int}}$ (variable), and a small system overhead time associated with decoding and rendering the prosthetic cursor on the screen ($T_{\mathrm{dec+rend}} \approx 40$ ms). Should a minimum level of single-trial accuracy be required for a particular application, a corresponding minimum $T_{\mathrm{int}}$ can be chosen.

The second performance metric is information transfer rate capacity (ITRC, in bits/s or bps). This quantity measures the rate at which information is conveyed from the subject, through the BCI, to the environment (Taylor et al. 2003; Shannon 1948). It is the information per trial, which is closely related to single-trial accuracy, divided by the total trial length.³ As shown in Fig. 3.7, the optimal ITRC occurs at short trial lengths, despite relatively low single-trial accuracy at these trial lengths. The highest ITRC is 7.5 bps at a total trial time of 260 ms, which corresponds to a $T_{\mathrm{int}}$ of 70 ms ($T_{\mathrm{skip}}$ = 150 ms, $T_{\mathrm{dec+rend}}$ = 40 ms).

As further confirmation that neural responses are reflecting motor intention even at a rapid

3 As such, ITRC takes into account (1) task complexity, (2) the accuracy of task completion, and (3) the speed of task completion, and it is used universally to quantify performance of communication systems (Shannon 1948; Cover and Thomas 1991). For more discussion on ITRC, see Section 3.7 later in this chapter.


[Figure 3.7 plot omitted; horizontal axis: trial length (ms), 200-400.]

Figure 3.7: Single-trial accuracy and information transfer rate capacity (ITRC) with monkey H. Performance curves investigating the dependence on $T_{\mathrm{int}}$ were calculated from control experiment H20041118 (8-target configuration). The trial length was $T_{\mathrm{skip}} + T_{\mathrm{int}} + T_{\mathrm{dec+rend}}$ with $T_{\mathrm{skip}}$ = 150 ms and $T_{\mathrm{dec+rend}}$ = 40 ms. $T_{\mathrm{int}}$ was varied and performance was computed. Performance metrics were very consistent day after day and between monkeys (data not shown). The theoretical maximum ITRC in bps, assuming 100% accuracy regardless of $T_{\mathrm{int}}$, is plotted as the dotted red curve.

pace, we repeated the above analysis with the multi-target task. Since the multi-target task is a more difficult task, overall performance of a BCI using such a paradigm may not be as high as that of the single-target-based system. Fig. 3.8 compares the ITRC between the single-target task and the multi-target task in control experiments. In summary, there was a ~30% penalty for a system using the more difficult multi-target task.⁴ Similar to the discussion for $T_{\mathrm{skip}}$, the difference in ITRC performance could be attributed to differences in visual stimulus presentation, cognitive difficulty, or a combination of both. Importantly, almost all past studies, including BCIs employing continuous trajectory control, are based on singly presented visual stimuli. This was one reason why we chose to employ a single-target paradigm for BCI experiments and report those results as our primary finding.
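The relationship between information per trial, trial length, and ITRC can be illustrated with a short sketch. The exact ITRC definition is deferred to Section 3.7; the version below assumes that information per trial is the mutual information between cued and decoded target, estimated from a confusion matrix with the empirical input distribution. That estimator, the function names, and the confusion-matrix layout are assumptions of this sketch.

```python
import math


def bits_per_trial(confusion):
    """Mutual information (bits) between cued and decoded target.

    confusion[i][j] = number of trials with target i cued and target j decoded.
    """
    total = sum(sum(row) for row in confusion)
    row_p = [sum(row) / total for row in confusion]
    col_p = [sum(row[j] for row in confusion) / total
             for j in range(len(confusion[0]))]
    info = 0.0
    for i, row in enumerate(confusion):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                p = n / total
                info += p * math.log2(p / (row_p[i] * col_p[j]))
    return info


def trial_length_s(t_skip_ms, t_int_ms, t_dec_rend_ms):
    """Total trial length in seconds: skip + integration + decode/render."""
    return (t_skip_ms + t_int_ms + t_dec_rend_ms) / 1000.0


def itrc_bps(confusion, t_skip_ms, t_int_ms, t_dec_rend_ms):
    """Information per trial divided by total trial length."""
    return bits_per_trial(confusion) / trial_length_s(
        t_skip_ms, t_int_ms, t_dec_rend_ms)


def max_itrc_bps(n_targets, t_skip_ms, t_int_ms, t_dec_rend_ms):
    """Upper bound assuming 100% accuracy: log2(n_targets) bits per trial."""
    return math.log2(n_targets) / trial_length_s(
        t_skip_ms, t_int_ms, t_dec_rend_ms)
```

Under these assumptions, a 2-target task decoded at 94.3% accuracy with symmetric errors carries about 0.68 bits per trial, which at 3.5 trials/s is about 2.4 bps, in line with the 2-target entry for monkey H in Table 3.1. The upper bound for 8 targets at a 260 ms trial length is 3 bits / 0.26 s, or roughly 11.5 bps.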

4 The maximum ITRC in this analysis differs from that found in Fig. 3.7 since different datasets were used for each analysis along with different model training methods.


[Figure 3.8 plots omitted: one panel each for monkey H and monkey G.]

Figure 3.8: ITRC comparisons between single-target (black) and multi-target (gray) tasks. An 8-target layout was used in the experiment. Both tasks were interleaved in a pseudorandomized fashion during each experiment. Trial length was taken to be $T_{\mathrm{skip}} + T_{\mathrm{int}} + T_{\mathrm{dec+rend}}$ with $T_{\mathrm{dec+rend}}$ set to 40 ms. $T_{\mathrm{skip}}$ was fixed at 150 ms for the single-target task and 250 ms for the multi-target task. $T_{\mathrm{int}}$ was varied and performance extrapolated. a. Data from monkey H (H20041201) with a Poisson decoding model; maximum ITRC was 8.0 bps for the single-target instructed-delay task and 5.5 bps for the multi-target task. b. Data from monkey G (G20040603) with a Gaussian decoding model; maximum ITRC was 6.8 bps and 4.6 bps, respectively.


3.4 BCI Experiments

The performance curves in Figs. 3.7 and 3.8 are extrapolations using experimental data from individual trials that had long delay periods and long times between trials (refer to Fig. 3.1a). To directly measure the ITRC performance when actually presenting trials at high speeds, we conducted a series of BCI experiments using a real-time system capable of rapidly decoding neural information. BCI experiments began with the collection of delay-period activity preceding reaches to different target locations (Fig. 3.1a) and fitting statistical models to the activity (model training). Then, during BCI prosthetic cursor trials (Fig. 3.1b), the intended target was decoded and a circular cursor was rendered on the screen at the predicted location. If the prediction was correct, the next target was displayed with very little delay. If the prediction was incorrect, the trial was either considered a failure and aborted, or the monkey was allowed to make a real reach to the target. In this manner, a sequence of high-speed prosthetic cursor trials could be generated. Figure 3.1b illustrates three successful prosthetic cursor trials followed by a standard real-reach trial. Real-reach trials were also interspersed to ensure the monkey remained engaged in the task.⁵

Using this paradigm, we varied the number of locations at which a target could appear on any given trial. This allowed task difficulty to be varied, which contributes to the ITRC metric. Performance values were calculated by averaging data from several hundred trials per condition. Table 3.1 lists the highest ITRC results during BCI experiments with 2, 4, 8 or 16 targets. In all cases, we were careful to avoid placing targets directly below the center touch cue since this location would be obscured by the monkey's hand. We also explored two annular rings (for the 8- and 16-target configurations) to demonstrate 2-dimensional target selection. The best overall performance was achieved with the 8-target task (6.5 and 5.3 bps, monkeys H and G). This performance corresponds to typing ~15 words per minute with a basic alphanumeric keyboard.

We took a conservative approach in computing BCI performance. Specifically, we considered only sustained BCI trials. Not all BCI trials are equivalent in their timing characteristics. In Fig. 3.1b, the first BCI trial contains a large center touch hold time. This period allows the monkey to reset its behavioral state after an immediately preceding reach trial. Consequently, the monkey is not being requested to rapidly switch its plan from a previous BCI trial. Including this particular trial's success or failure in our performance numbers is not a valid indication of sustained performance and could unduly inflate performance results. For the particular chain of trials shown in Fig. 3.1b, we only included trials #2 and #3 in our average performance results. As mentioned before, we take the whole trial time, consisting of $T_{\mathrm{skip}}$, $T_{\mathrm{int}}$, and $T_{\mathrm{dec+rend}}$, when calculating all results that depend on the rate of target presentations.

While a sustained performance rate of 6.5 bps is manyfold greater than reported previously, it is lower than the extrapolated result (7.5 bps; refer to Fig. 3.7). Furthermore, the ITRC peak

5 For videos demonstrating the instructed-delay (real-reach) task, and moderate-speed and high-speed BCI experiments, please see the Nature Publishing Group website and reference the supplementary materials included for Santhanam et al. (2006b).


Table 3.1: BCI experiments with highest ITRC for monkeys H and G. Each row lists the experiment with highest performance (ITRC) for a given target layout. Other experiments yielded higher single-trial accuracy or involved faster cursor rates, but did not achieve the highest ITRC for the corresponding target layout (not shown).

Monkey   # of targets   accuracy (%)   trials/s   bps
H         2             94.3           3.5        2.4
H         4             94.5           2.8        4.7
H         8             68.9           3.5        6.5
H        16             51.1           2.9        6.4
G         2             84.2           3.6        1.3
G         4             93.0           2.5        3.8
G         8             76.8           2.5        5.3
G        16             26.4           2.2        3.1

was expected at a total trial length of 260 ms, but our BCI experiments yielded 5 bps with this timing. These discrepancies are due to the limitations inherent when using control experiments to extrapolate performance for speeds at which the subject must quickly recognize new targets and rapidly change neural activity (i.e., switch reach plans). The differences between extrapolated and directly measured performance were present despite specific model training methods that allowed for a fair comparison.⁶
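The typing-rate equivalence quoted for the 8-target result (6.5 bps, or roughly 15 words per minute) can be reproduced with simple arithmetic under two assumptions that are not stated in the text: a 36-key alphanumeric keyboard (26 letters and 10 digits, all keys equally likely, so $\log_2 36 \approx 5.17$ bits per character) and an average of 5 characters per word.

```latex
\frac{6.5\ \text{bits/s} \times 60\ \text{s/min}}
     {5\ \text{chars/word} \times \log_2 36\ \text{bits/char}}
\approx \frac{390\ \text{bits/min}}{25.8\ \text{bits/word}}
\approx 15\ \text{words/min}
```

A different keyboard size or word length would change the figure proportionally, so this should be read as a plausibility check rather than the derivation actually used.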

3.4.1 Additional BCI Performance Aspects

Having confirmed that large BCI performance gains are possible with a direct endpoint control strategy, we investigated two additional performance aspects. First, we varied $T_{\mathrm{int}}$ in BCI experiments with monkey H to experimentally verify the trends seen in Fig. 3.7. Fig. 3.9 also demonstrates an increase in single-trial accuracy with increasing trial length (black curves) as well as a peak in each ITRC curve (red curves). These results reveal how two- or four-target tasks restrict ITRC by virtue of the lower maximum number of bits per trial (1 and 2, respectively). Furthermore, given the numbers of neural units available in these experiments, it appears that ITRC is approaching a saturation point beyond which adding more target locations may not produce an appreciable increase in performance (doubling targets from 8 to 16 does not increase ITRC⁷).

6 One possible source of decreased performance in BCI experiments includes situations where data used to train the decoding models is dissimilar to data used for prediction. To optimize the similarity between these two conditions, for monkey H we presented rapid sequences of reach targets during the training portion of our experiment, only commanding the monkey to reach for the last target in the sequence. Since the subject presumably planned reaches to every target, statistical models were trained from these high-speed trials that mimic the speed of trials during the prediction portion of the experiment. Overall performance was improved in comparison to decoding using models trained on slower-paced trials. Note that this training procedure can be easily adapted for paralyzed patients as well.

7 The latter layout requires distance tuning, which is known to be weaker than direction tuning (Messier and Kalaska 2000; Churchland et al. 2006a).


Additional target locations should improve ITRC when more neurons are available.

[Figure 3.9 plots omitted: one panel each for 2, 4, 8, and 16 targets; horizontal axis of each panel: trial length (ms), 200-500.]

Figure 3.9: Single-trial accuracy and information transfer rate capacity (ITRC) with monkey H. Performance measured during BCI experiments. Performance is plotted for each target configuration and across varying total trial lengths. Each data symbol represents performance calculated from one experiment (many hundreds of trials). Across target configurations, single-trial accuracy decreases and ITRC increases as more target locations are used.

Though each data point in Fig. 3.9 represents performance consolidated over hundreds of trials in a given session, we would ideally replicate experimental conditions and repeat experiments over multiple sessions. Practically, the electrode array can only provide a quasi-stable number of neurons over a relatively short time (2-3 months). We chose instead to sample the fundamental design parameters (target configuration and $T_{\mathrm{int}}$). Also, in these BCI experiments with monkey H, different $T_{\mathrm{skip}}$ times (150-250 ms) were chosen on an experiment-by-experiment basis based on the cross-validated performance of the training trials, but the majority of experiments were conducted with $T_{\mathrm{skip}}$ = 150 ms.

Second, a common concern for BCIs such as ours is that as the electrode implant ages the number of recordable neurons declines, leading to a drop in overall performance (Schwartz 2004). To investigate the impact of neuron loss, we performed analyses of single-trial accuracy and ITRC using data from control experiments. For a single day's experiment, we selected all neural units from the array that were responsive to target location within a desired $T_{\mathrm{int}}$ using an ANOVA (p<0.05). For each neural ensemble size of interest, our total set of neural units was subdivided by drawing 100 randomized subsets (without replacement). Performance was computed for each subset and these data were averaged within a given ensemble size. This provided a single prediction accuracy and a single ITRC for each $T_{\mathrm{int}}$ and ensemble size. We generated contour plots by using linear interpolation across this 2-dimensional surface.

As expected, single-trial accuracy falls as neuron ensemble size decreases. However, it is possible to partially compensate for this performance loss by increasing $T_{\mathrm{int}}$; BCI speed may be


compromised as a result, but single-trial accuracy can be preserved (Fig. 3.10). For example, at a population subset of ~80 neural units, increasing the subset size improves decode performance. Increasing the integration time also improves decode performance. For very low numbers of neural units, the performance eventually saturates regardless of the size of $T_{\mathrm{int}}$. This effect reflects the inherent noise present from sampling a small subset of neurons as well as the potential mismatch of our (or any) spiking model to the actual statistics of the neural system.
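The ensemble-size analysis described above can be sketched as a resampling loop: for each ensemble size, draw random subsets of neural units, score each subset, and average. In the sketch below, `evaluate` is a placeholder for whatever routine computes cross-validated accuracy or ITRC from the listed units at a given integration time. The function names and the seeded generator are hypothetical, and "without replacement" is interpreted here as meaning that no unit appears twice within a subset, which is an assumption about the original procedure.

```python
import random


def ensemble_size_curve(n_units, sizes, evaluate, n_draws=100, seed=0):
    """Average a performance measure over random subsets of neural units.

    n_units:  total number of available neural units.
    sizes:    ensemble sizes to test (each <= n_units).
    evaluate: placeholder callable, evaluate(unit_indices) -> float.
    n_draws:  random subsets per ensemble size (100 in the text).
    """
    rng = random.Random(seed)  # seeded so the curve is reproducible
    curve = {}
    for size in sizes:
        scores = []
        for _ in range(n_draws):
            # Sample unit indices; no index repeats within one subset.
            subset = rng.sample(range(n_units), size)
            scores.append(evaluate(subset))
        curve[size] = sum(scores) / n_draws
    return curve
```

Repeating this for each integration time yields one averaged value per (ensemble size, integration time) pair, which is the grid from which contour plots such as Figs. 3.10 and 3.11 can be interpolated.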

[Figure 3.10 contour plot omitted.]

Figure 3.10: Single-trial accuracy as a function of the number of neural units and $T_{\mathrm{int}}$. All data are from experiment H20041118, which involved an 8-target configuration. $T_{\mathrm{skip}}$ was fixed at 150 ms. Similar results were obtained for dataset G20040508 from monkey G.

Figure 3.11 plots ITRC as a function of the number of neural units and $T_{\mathrm{int}}$. For small ensembles (e.g., 20 neurons), the ITRC peaks at $T_{\mathrm{int}} \approx 120$ ms but does not decline sharply as $T_{\mathrm{int}}$ is further increased; accuracy (and bits per trial) is increasing so as to offset the longer trial times. For larger ensembles, the information content at small $T_{\mathrm{int}}$ is relatively high such that further lengthening $T_{\mathrm{int}}$ has a dramatic effect on ITRC. Furthermore, for each ensemble size tested, a cubic spline interpolation was used to estimate the particular $T_{\mathrm{int}}$ that maximized the ITRC. Plotting this "optimal" value (Fig. 3.11 inset) for each ensemble size illustrates that the maximum ITRC is achieved with small $T_{\mathrm{int}}$ (60-130 ms), over a broad range of ensemble sizes. Thus, high-performance BCIs may require far shorter trials than have been explored prior to our work.


[Figure 3.11 contour plot omitted; contours are labeled in bits/s and the inset's horizontal axis is the number of neural units.]

Figure 3.11: ITRC as a function of number of neural units and $T_{\mathrm{int}}$. All data are from experiment H20041118, which used an 8-target configuration and contained over 1300 trials. $T_{\mathrm{skip}}$ was fixed at 150 ms. Main panel shows contours of ITRC (bps) as a function of the number of neural units available and $T_{\mathrm{int}}$. The inset shows the value of $T_{\mathrm{int}}$ that achieves the maximum ITRC for each neural ensemble size that we tested. Similar results were obtained for dataset G20040508 from monkey G.


3.5 Summary

Using a direct endpoint control strategy, we have described here an over four-fold (6.5 versus 1.6 bps) increase in BCI performance compared to recent studies. Performance is calculated in a conservative fashion since the entire trial time ($T_{\mathrm{skip}} + T_{\mathrm{int}} + T_{\mathrm{dec+rend}}$) was used; had just $T_{\mathrm{int}}$ been used as is sometimes done, the maximum ITRC would have been 28.4 bps. However, this is not an appropriate metric since it does not reflect an achievable selection throughput.

As described previously, our system differs from continuous BCI approaches in several ways which may account for our performance gains. Additionally, continuous BCIs attempt to move the cursor well enough, although at the expense of speed (1-3 seconds per selection), to avoid making errors for a given selection. Conversely, the direct endpoint control reported here need not correct errors within a given selection since these errors can be rectified with rapid follow-on selections. This concept is intrinsic to our use of information theory and the capacity metric to quantify our communication prosthesis.

Our performance results far exceed EEG-based non-invasive system performance, and help motivate the use of invasive, electrode-based systems in clinical BCIs. Although at its fastest, this direct endpoint control BCI demonstrates selection speeds (~3.5 trials/s) on par with saccadic eye movements, the ITRC for saccades is much higher due to their exceptional precision and accuracy. While eye or even speech control may be effective in specific settings, BCIs attempting to restore lost motor function must rely on the natural neural signals if they are to avoid commandeering and interfering with another motor modality. For example, an eye-tracking system used to control a wheelchair can prove inconvenient if the paralyzed patient wishes to exercise free gaze without controlling the wheelchair.


3.6 Addendum: EMG measurements

We measured EMG from monkey G to verify that the neural activity was not a byproduct of minor limb movements during the delay period of our experiment. The aim was to ensure that the BCI system is operating with motor planning activity as opposed to movement execution activity. This is an important requirement if such a prosthetic system is intended for paralyzed patients. Figure 3.12 shows the data from three different muscles in monkey G. There is no noticeable difference in EMG activity between the periods before and after target presentation for real-reach trials. Furthermore, there is no significant tuning in the EMG activity for target direction during the time period 50 to 300 ms after target presentation (ANOVA, p ≫ 0.05). We also measured EMG activity while presenting targets at a rapid pace ("rapid condition"), akin to the behavioral conditions present in BCI experiments.⁸ Results were very similar to those of the real-reach trials: there was no elevated activity after target presentation and there was no target-specific tuning in the EMG signal during the delay period. The lack of tuned EMG signal in the delay period was typical across other monkeys in our laboratory (Churchland et al. 2006b,a). While we did not measure EMG from monkey H, the endpoint position of the monkey's fingertip did not move with respect to the target direction during the delay period. Our hand tracking apparatus has a sub-millimeter resolution.

8 There was no movement epoch for the rapid condition, much like there was no such period for prosthetic cursor trials during BCI experiments.


Figure 3.12: EMG measurements for monkey G plotted in arbitrary units. Data was collected for real-reach trials (green) as well as trials with rapid presentation of targets (red). The first arrow designates the target presentation time and the second arrow marks 150 ms before the start of the movement. The delay period and movement periods are separated by a slight gap to allow for differing delay periods across trials. a. Measurements from the deltoid muscle. b. Measurements from the biceps muscle. c. Measurements from the triceps muscle. We only collected data from real-reach trials for this muscle.


3.7 Addendum: Application of Information Theory to BCIs

3.7.1 Analogy to Communication Systems

Information theory has been used previously in neuroscience to estimate the information content in neural spike trains or other neural activity. This is not what we did; we did not attempt to estimate the intrinsic information content of the neural signals, at least not in any direct fashion. Instead, we calculate how much information, quantified in bits much like transmission of data over a modem, can be extracted from the subject’s thoughts, by way of our prosthetic system (which includes the entire signal path, from electrode recordings to target decoder). Figure 3.13a shows a schematic of a standard communication system. The system is used as follows:

1. An arbitrary message is chosen by the source (e.g., “Hello World”).

2. The message is encoded into a series of channel symbols (e.g., 011100...) as per a predetermined translation scheme, or code.

3. The symbols are then sent through a noisy channel and corrupted (e.g., 010100...).

4. The receiver processes the output of the channel with a message decoder that utilizes error-correcting features (redundancy) of the code to retrieve an estimate of the entire message.
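The four steps above can be sketched in a few lines of code. This is only an illustration: the 3-fold repetition code, the function names, and the hand-picked error positions are assumptions made here for clarity, since the text does not specify a particular code.

```python
def encode(bits, repeat=3):
    """Repeat each message bit `repeat` times (a simple error-correcting code)."""
    return [b for b in bits for _ in range(repeat)]


def channel(symbols, flip_positions):
    """Corrupt the symbol stream by flipping the bits at the given positions."""
    flips = set(flip_positions)
    return [s ^ 1 if i in flips else s for i, s in enumerate(symbols)]


def decode(symbols, repeat=3):
    """Recover each message bit by majority vote over its block of repeats."""
    blocks = [symbols[i:i + repeat] for i in range(0, len(symbols), repeat)]
    return [1 if sum(block) * 2 > repeat else 0 for block in blocks]


message = [0, 1, 1, 1, 0, 0]                 # step 1: message chosen by the source
sent = encode(message)                       # step 2: channel symbols
received = channel(sent, [2, 4, 15])         # step 3: hypothetical symbol errors
estimate = decode(received)                  # step 4: redundancy corrects them
```

With at most one corrupted symbol per block of three, the majority vote recovers the message exactly; two errors in the same block would defeat this particular code.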

Figure 3.13b creates an analogy between the classical communication system and a cortically-controlled prosthetic system. The subject now must choose the message and also encode it in the form of channel symbols. The encoding scheme and channel symbols are predetermined during the training phase of the prosthetic system. The channel is the prosthetic system that translates the intended symbol into an estimated symbol. The prosthetic system may not be able to perfectly estimate the intended symbol; hence, the channel is noisy. Again, the message decoder uses error-correcting features of the code to accurately recover the message.

Figure 3.13c illustrates the approach we used to quantify system performance. We focus on characterizing the communication channel by presenting various sets of reach targets to the subject. These targets are the channel symbols. The subject’s neural signals first represent the intended target. We record these neural signals from the electrode array, spike sort, and decode the target location. (Each of these steps can inject noise into the final predicted target: for example, we are only recording a small population of intrinsically noisy neurons, our spike sorting, though good, is not perfect, and our decoder assumes a parametrized model that is surely not entirely consistent with the underlying neural representation.) We repeat our measurements many hundreds of times to allow us to characterize the error patterns (statistics) of this noisy channel. Ultimately we can establish bounds on information transmission using the techniques from information theory. Particular targets (symbols) may be decoded incorrectly, but one can asymptotically achieve perfect reconstruction of the message using error-correcting techniques (Shannon 1948).


Figure 3.13: Schematic diagram of a communication system. We focus on characterizing the red components of the system. a. Classical communication system where a message is encoded and sent through a noisy channel, adapted from Cover and Thomas (1991). A decoder is able to reconstruct the original message despite individual errors in the transmission. b. An analogy to a human communication prosthesis. The subject thinks of a message and encodes it. The channel consists of the prosthetic system that estimates the intended symbol (in a potentially noisy fashion). The output of the channel is a series of symbols that are then fed into the message decoder. c. Illustration of how we characterize the communication channel. The communication channel is a black box that encapsulates the neural representation of the target, our recording setup, and our target decoder.


3.7.2 Computations

We start by testing whether the target predicted from neural activity coincides with the target presented. If there is a match, the trial is deemed correct. Otherwise, the trial is an error trial. For error trials, we note the normalized frequency at which particular targets are decoded given a particular target presented. This allows us to characterize the “channel” of the communication system. With these measurements, there are three ways in which to assess the information transfer (IT) per trial. For all calculations, the key quantity of interest is the mutual information metric:

$$I(X;Y) = H(X) - H(X|Y) = -\sum_{x} p(x)\log_2 p(x) - \left[-\sum_{x,y} p(x,y)\log_2 p(x|y)\right] \tag{3.6}$$

This equation simply states that the mutual information ($I$) between the set of presented targets ($X$) and estimated targets ($Y$) is the difference between the entropy (or uncertainty) of the presented target set ($H(X)$) and the entropy after making an estimation ($H(X|Y)$). The experimental data are used to compute $p(y|x)$, the fractional occurrence of each estimated target $y$ given a specific presented target $x$. The other quantities of interest are found from basic probability theory: $p(x,y) = p(x)\,p(y|x)$ and $p(x|y) = p(x,y) / \sum_{x'} p(x',y)$. If $Y$ provides a perfect estimate of $X$, $H(X|Y) = 0$; hence, the information transfer is maximal and equal to $H(X)$, or the information contained in the presented stimuli. Taking an example of 8 targets, all presented with equal frequency, $H(X) = 3$, and $p(x,y) = \frac{1}{8}$ if $y = x$ and 0 otherwise.

Below we discuss three possible ways to compute the IT per trial:

1. Level-1 IT approximation: convert the average prediction accuracy across an experiment to bits per trial. Again, if we have 8 targets, all presented with equal frequency, $p(x,y) = \frac{p_c}{8}$ if $y = x$ and $p(x,y) = \frac{1 - p_c}{8 \cdot 7}$ otherwise, where $p_c$ is the fraction of occurrences that the estimated target matches the presented target averaged over all target presentations. (The number 7 appears in the denominator to equally distribute error across the remaining $y \neq x$ targets.)

2. Level-2 IT approximation: take into account error structure by computing the true mutual information between presented targets and decoded targets. This provides a more accurate representation of $p(x,y)$. In other words, every element of $p(x,y)$ is the fractional occurrence of the presented-estimated target pair $(x,y)$, measured from our experiments. This can be a more accurate representation of information transfer. If, for example, errors are always distributed adjacent to the correct target, such a pattern is useful and taking it into account will lead to increased information transfer.

3. Information Transfer Capacity (ITC): determine the full theoretical capacity of the communication system using the Blahut-Arimoto algorithm. The algorithm attempts to find the “capacity” ($C$) of the channel, namely the bits per use of the channel (averaged over many uses of the channel) such that there is zero probability of error.

$$C = \max_{p(x)} I(X;Y). \tag{3.7}$$

The algorithm starts with a guess for $p(x)$ and iteratively improves the estimate by solving successive constrained maximization problems with Lagrange multipliers until $C$ converges to its global optimum (Cover and Thomas 1991). Unlike the level-2 approximation, the system is not constrained to the relative frequencies of presented targets, $p(x)$, used during data collection. Importantly, this approach yields a system that utilizes certain targets (e.g., those that can be decoded more accurately) more often than other targets.
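The capacity computation in Eq. 3.7 can be sketched as follows. This is a minimal textbook-style Blahut-Arimoto iteration, not the code used for the reported results; the function name, the stopping tolerance, and the list-of-rows channel format are assumptions made for illustration.

```python
import math


def blahut_arimoto(p_y_given_x, tol=1e-9, max_iter=10000):
    """Estimate channel capacity (bits per use) and the maximizing p(x).

    p_y_given_x[x][y] is the probability of decoding target y when target x
    is presented; each row must sum to 1.
    """
    n_inputs = len(p_y_given_x)
    n_outputs = len(p_y_given_x[0])
    p_x = [1.0 / n_inputs] * n_inputs  # initial guess: uniform target usage
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution implied by the current input distribution.
        p_y = [sum(p_x[x] * p_y_given_x[x][y] for x in range(n_inputs))
               for y in range(n_outputs)]
        # c[x] = exp(KL divergence between channel row x and p_y), in nats.
        c = []
        for x in range(n_inputs):
            kl = sum(p * math.log(p / p_y[y])
                     for y, p in enumerate(p_y_given_x[x]) if p > 0.0)
            c.append(math.exp(kl))
        z = sum(p_x[x] * c[x] for x in range(n_inputs))
        lower, upper = math.log(z), math.log(max(c))  # capacity bounds (nats)
        p_x = [p_x[x] * c[x] / z for x in range(n_inputs)]  # reweight inputs
        if upper - lower < tol:
            break
    return lower / math.log(2.0), p_x
```

For a perfectly decoded 8-target channel this returns 3 bits with uniform target usage; for a channel in which some targets are decoded less reliably, the returned `p_x` shifts probability toward the more reliable targets, as described above.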

Figure 3.14 illustrates the pronounced structure in error pattern by plotting two error distributions based on experimental data. Each is a 2D histogram depicting which target was estimated ($y$) for each target presented ($x$). If target 1 was correctly estimated on every presentation, and likewise for all 8 targets, all eight squares along the unity diagonal line would be red, and all other squares would be blue. If the distribution is more diffuse (probability mass more spread away from the unity diagonal), there is more “confusion” between the presented and estimated targets. Figure 3.14a is from one 8-target experiment and demonstrates that when a mistake does occur it is generally only one target off. Figure 3.14b illustrates the error structure from a different experiment where errors were more broadly spread, but still somewhat clustered around the diagonal. The ITC of panel a is higher than that of panel b and both are higher than their respective level-2 or level-1 IT values.


Figure 3.14: Confusion matrices from two experiments. There can be structure in the error pattern. This structure can be exploited (see level-2 IT approximation and ITC, but not level-1 IT approximation) to allow for greater information transfer through the system.

Table 3.2 shows the values obtained when using these different methods of calculating IT for a few representative BCI experiments. In general there was a large gain between the level-1 IT and level-2 IT calculations but only a modest gain (~15%) between the level-2 IT and ITC calculations. One could use any of these three methods of computing IT to then produce an information transfer rate (by dividing by the entire trial length, $T_{\mathrm{skip}} + T_{\mathrm{int}} + T_{\mathrm{dec+rend}}$). The ITC is the standard performance metric in information theory (Shannon 1948). Again, this metric represents the maximum information per use of the channel and is asymptotically achievable with zero transmission error by using an infinite length error-correcting code. We use the ITC to obtain the ITRC ($\mathrm{ITRC} = \mathrm{ITC} / \text{total trial length}$) and thereby evaluated and optimized our BCI.

Table 3.2: Comparing Methods for Calculating Information Transfer

Monkey   # of targets   Max bits/trial (bpt)   Accuracy (%)   Level-1 IT (bpt)   Level-2 IT (bpt)   ITC (bpt)
H        8              3                      68.9           1.2                1.6                1.9
H        8              3                      71.4           1.3                1.6                1.7
H        16             4                      51.1           1.1                1.9                2.2
G        4              2                      93.0           1.5                1.6                1.6
G        8              3                      73.5           1.4                1.8                1.9
G        8              3                      76.8           1.6                2.1                2.1

Figure 3.15 shows the measured level-1 IT, level-2 IT, and ITC from all 8-target BCI experiments with monkey H. As expected, the ITC for a given experiment is greater than its corresponding level-2 IT, which is in turn greater than its corresponding level-1 IT. This reflects the fact that there is structure in the decoding algorithm’s errors (i.e., when a prediction is wrong, a nearby target is often chosen). As a result, level-2 IT and ITC use the more accurate characterization of the communication channel and this structured error distribution allows for more efficient error-correcting codes to be employed.
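To make the level-1 and level-2 calculations concrete, the sketch below evaluates both. The function names and the 4-target confusion counts are hypothetical and chosen only for illustration; the level-1 formula follows from Eq. 3.6 with equally frequent targets and evenly spread errors.

```python
import math


def level1_it(p_correct, n_targets):
    """Bits per trial when errors are spread evenly over the wrong targets."""
    bits = math.log2(n_targets)
    if p_correct > 0.0:
        bits += p_correct * math.log2(p_correct)
    if p_correct < 1.0:
        p_err = 1.0 - p_correct
        bits += p_err * math.log2(p_err / (n_targets - 1))
    return bits


def level2_it(counts):
    """Mutual information (bits) from counts[x][y] of presented/decoded pairs."""
    total = float(sum(sum(row) for row in counts))
    p_x = [sum(row) / total for row in counts]
    p_y = [sum(row[y] for row in counts) / total
           for y in range(len(counts[0]))]
    bits = 0.0
    for x, row in enumerate(counts):
        for y, n in enumerate(row):
            if n > 0:
                p_xy = n / total
                bits += p_xy * math.log2(p_xy / (p_x[x] * p_y[y]))
    return bits


# Hypothetical counts: 80% correct, with every error landing on a neighbor.
counts = [[80, 10, 0, 10],
          [10, 80, 10, 0],
          [0, 10, 80, 10],
          [10, 0, 10, 80]]
structured = level2_it(counts)      # uses the actual error structure
unstructured = level1_it(0.8, 4)    # same accuracy, errors assumed uniform
```

Evaluating `level1_it` at the accuracies listed in Table 3.2 (for example 68.9% with 8 targets) gives values that round to the tabulated level-1 entries. In the hypothetical example, the structured errors yield a higher level-2 value than the level-1 value at the same accuracy.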

Figure 3.15: Information per trial versus accuracy (%). Solid blue curves represent the theoretical relationship between accuracy and level-1 IT for different numbers of targets. Diamonds correspond to measured IT values from online experiments with monkey H using an 8-target configuration. For a given experiment, the three diamonds are plotted against the experiment’s average single-trial accuracy: blue diamonds denote the level-1 IT, green diamonds denote the level-2 IT, and red diamonds denote the complete ITC.

3.7.3 Notes

It is important to clarify that our prosthetic system has not actually achieved the ITC or ITRC. We have simply measured and quantified the fundamental maximum bit rate given the channel’s error statistics. This is standard practice in the communications literature and neuroprosthetic research (Wolpaw and McFarland 2004; Taylor et al. 2003). Any realizable encoding scheme that is used with this channel will achieve performance less than or equal to this bound. The use of information transfer capacity as a metric is critical for a comparison between the channel properties of different prosthetic systems.

While ITRC is a well-established metric, it is often useful to translate this measure into a more tangible number, namely how many words can be typed per minute using a given prosthetic system. With a 5 bps communication prosthesis a patient could select one key per second from a 32 key keyboard ($2^5 = 32$ keys; 26 letters + space bar + 5 numbers). The 5 bps communication prosthesis would allow one 5-character word (from this 32 key keyboard) to be selected every 6 seconds (including the need for a space bar selection) resulting in 10 words/minute. Furthermore, an intelligent entry scheme can increase the communication throughput by exploiting redundancy in the language of interest (e.g., text prediction software for mobile devices).

When reporting our primary results, we first made a rough conversion of the 6.5 bps measurement to 15 words/minute by assuming a 32 key keyboard, 5-character words (including the space bar), and no text prediction. When making the same calculation for 6-character words, the result is 13 words/minute. There is room for an increase of at least several words per minute by using text prediction algorithms that leverage the underlying entropy of English (or similarly French, Tamil, etc.). This is why we arrived at the quoted value of ~15 words/minute.
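The conversion from bits per second to words per minute is simple arithmetic, sketched below. The function name and its default arguments are illustrative assumptions; the calculation assumes equally likely keys and no text prediction.

```python
import math


def words_per_minute(bits_per_second, n_keys=32, selections_per_word=6):
    """Convert an information rate to words/minute for an n_keys keyboard."""
    bits_per_key = math.log2(n_keys)              # 5 bits for 32 keys
    keys_per_second = bits_per_second / bits_per_key
    return 60.0 * keys_per_second / selections_per_word


example_5bps = words_per_minute(5.0)                          # 10 words/minute
primary_5char = words_per_minute(6.5, selections_per_word=5)  # about 15.6
primary_6char = words_per_minute(6.5, selections_per_word=6)  # about 13
```

Here `selections_per_word` counts every key press needed for a word, including the space bar where applicable.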


3.8 Credits

The work detailed in this chapter has been published in a peer-reviewed journal (Santhanam et al. 2006b). It would not have been possible if not for the support of a number of individuals. Dr. Stephen Ryu was responsible for the initial experimental concept and surgical implantation of the electrode array. He also materially assisted with experimental design, animal training, data collection, and preliminary analysis. Byron Yu and Afsheen Afshar supported this study with animal training, data collection, and analysis. I was responsible for experimental design, infrastructure development, animal training, data collection, and in-depth analysis.

We also thank Missy Howard for surgical assistance and veterinary care and Dr. Nicho Hatsopoulos for surgical assistance (monkey G implant), Drs. Mark Churchland and Maneesh Sahani for scientific discussions, and Drs. Eric Knudsen and Tirin Moore for comments on our Nature manuscript.

This study was supported by NDSEG Fellowships (GS, BMY), NSF Graduate Research Fellowships (GS, BMY), the Christopher Reeve Paralysis Foundation (SIR, KVS), the NIH Medical Scientist Training Program (AA) and the following awards to KVS: a Burroughs Wellcome Fund Career Award in the Biomedical Sciences, the Stanford Center for Integrated Systems, the NSF Center for Neuromorphic Systems Engineering at Caltech, ONR, the Sloan Foundation, and the Whitaker Foundation.

Chapter 4

Factor Analysis Investigation

4.1 Overview

In Chapter 3, we demonstrated that the careful design and implementation of prosthetic systems can provide substantial increases in overall performance. At the same time, it is important to recognize that our improvements were largely a product of our approach and careful choice of system parameters, as opposed to the use of complex target decoding algorithms. We employed simple Gaussian and Poisson models of neural firing rate due to their acceptance in the neuroscience field (Dayan and Abbott 2001) and ease of computation. Now, we investigate whether a more sophisticated decoder can be developed and thereby achieve higher prosthetic performance.

As discussed in Section 3.2.2, we assumed that the spike counts for each neuron were independent once the reach endpoint was specified.1 This construction implies that there are no high-level factors (e.g., overall attentiveness to the task, reach speed of the upcoming movement, reach curvature, etc.) that influence the recorded neural data (other than the reach target itself). If there were, then these factors that are uncontrolled, and often unobserved, would modulate the underlying firing rate of our observed neurons in predictable fashions, thereby inducing measurable unit-by-unit correlations in the spike counts that we observe. This would negate the assumption of conditional independence (conditioned on endpoint). With this in mind, our initial assumptions of conditional independence, despite being useful for achieving a high performance system in Chapter 3, are certainly gross approximations.

While one of the primary influences on PMd activity is reach endpoint (Messier and Kalaska 2000), there is evidence that PMd activity can depend on factors other than target location, including the type of grasp (Godschalk et al. 1985), the required accuracy (Gomez et al. 2000), reach curvature (Hocherman and Wise 1991), reach speed (Churchland et al. 2006a), and (to some degree) force (Riehle et al. 1994).
If a given model only describes reach endpoint, the model cannot accurately reflect how the firing rate might change if any one of the unaccounted properties (e.g., reach speed) perturbs the underlying firing rate. These fluctuations will appear as “noise” on the recorded neural output, though the noise will be correlated between the observed neurons. For example, consider the cartoon illustration in Fig. 4.1. Panel a shows the expected number of spike counts of five neurons for a given reach endpoint (e.g., leftward reach). Panels b and c show the observed spike counts on two separate trials. For panel b, we suggest that the subject might have been planning a slightly faster than average reach. Conversely, for panel c, the subject might have been planning a slightly slower than average reach. Note how the reach speed does not necessarily affect all neurons with the same polarity and magnitude. Some neurons elevate their firing rate (and hence observed counts) for faster reaches. Other neurons do the opposite and they do so with different amplitudes. This neuron-by-neuron difference in polarity and magnitude is commonplace among response properties (e.g., Churchland et al. 2006a).

1 For the Gaussian models, this assumption was made to avoid a problem of too little training data when fitting a full covariance matrix. For the Poisson models, independence is a natural consequence of the distribution that we chose.


Figure 4.1: Simple cartoon illustrating how spike counts can co-vary from trial to trial. a. Nominal mean spike counts for 5 neurons for a particular reach endpoint. b. Spike counts during a given trial for the same reach endpoint. Activity is either elevated or suppressed relative to panel a. The modulation may be due to an uncontrolled factor (e.g., speed). c. Spike counts during another trial.

In reality, we may not know if it is reach speed or some other variable that is causing the trial-by-trial modulation; many different factors can be involved and many of them are simply unobservable (e.g., cognitive attentiveness to the task). We can instead attempt to infer a set of abstract factors for each trial, along with the mapping between the factors and the underlying firing rate of the recorded neurons. A good target decoding algorithm can use this knowledge to then avoid mistaking the relatively unimportant trial-to-trial variations as being the signature for an entirely different reach endpoint.

In this chapter, we first survey existing methods for learning these trial-by-trial abstract factors. With these techniques, we were able to find lower-dimensional representations (e.g., 1-2 abstract dimensions) of our high-dimensional data (e.g., ~100 neural units). We found that a small but measurable amount of the total variability in our data can be attributed to the unobserved factors (~15%). We then extended the relevant models to handle multi-target data. With these modifications, we built a classifier that leveraged these learned factors to perform target decoding. The use of this decoder led to a reduction of the decode error by up to ~75% (~20% total prediction error became ~5%). For these models, we also tested whether Poisson-based models were a better choice over Gaussian-based models. We found no benefits to using the more computationally complex Poisson-based models, especially if the Gaussian-based model is fitted to square-root-transformed data.


4.2 Methods

4.2.1 Latent Variable Models

The work here is based on “latent variable models” which have been a statistical tool for analyzing empirical data since the early 1900s. In his brief and clear introduction to the topic, Everitt (1984) defines latent variables as “essentially hypothetical constructs invented by a scientist for the purpose of understanding some research area of interest, and for which there exists no operational method for direct measurement. Although latent variables are not observable, certain of their effects on measurable (manifest) variables are observable, and hence subject to study.” In our case, the observable (i.e., output) variables are the neural spiking data that we record from the electrode array. The latent variables represent the cognitive state of the subject. They encapsulate the intended reach endpoint, as well as the uncontrolled and unobserved variables present during the task. We can use the larger number of observed output variables to help triangulate the smaller number of unobserved latent variables of the system.

The two classic methods to reduce dimensionality, and in essence reveal the underlying latent variables, are Principal Components Analysis (PCA) and Factor Analysis (FA). As shown by Roweis and Ghahramani (1999), both of these techniques posit a generative model (or probabilistic process of creating the data of each trial) with the following form:

$$\mathbf{x} \sim \mathcal{N}(\mathbf{0}, I) \tag{4.1}$$

$$\mathbf{y} \mid \mathbf{x} \sim \mathcal{N}(C\mathbf{x}, R). \tag{4.2}$$

The latent state vector, $\mathbf{x} \in \mathbb{R}^{p \times 1}$, is Gaussian distributed with mean $\mathbf{0}$ and covariance $I$. Most often, it is unobserved. The output, $\mathbf{y} \in \mathbb{R}^{q \times 1}$, is then generated from a Gaussian distribution. The matrix $C \in \mathbb{R}^{q \times p}$ provides the mapping between latent state and observations, and $R \in \mathbb{R}^{q \times q}$ is a diagonal covariance matrix of the output noise process. In classic FA literature, the parameters $C$ and $R$ are often referred to as the loading and uniqueness matrices, respectively. The variables $\mathbf{x}_n$ and $\mathbf{y}_n$ denote independent draws from this generative model over $N$ observations (trials), with $n \in \{1, \ldots, N\}$. For non-zero centered $\mathbf{y}$, the mean across all training trials must be first subtracted before fitting and applying the model.

Both PCA (or rather, sPCA2) and FA require that $R$ be a diagonal matrix. In other words, the variability in the output space is independent once $\mathbf{x}$ is specified. Without knowing the latent variables, $\mathbf{x}$, the individual components of the data may appear correlated but this correlation solely arises from their underlying dependence on the factors in $\mathbf{x}$. The difference between sPCA and FA lies in the form of $R$. In sPCA, $R$ is constrained to have the form $\epsilon I$. For FA, $R$ can be any positive-definite diagonal matrix. This distinction is important. It is often quoted that the intrinsic variability of neurons is nearly proportional to the mean (e.g., Shadlen and Newsome 1998). The proportionality constant is approximately 1. Therefore, forcing all of the neural units (components of $\mathbf{y}$) to have equal variance would not capture the property that higher firing rate units tend to have higher overall variability and lower firing rate units tend to have lower overall variability. Using sPCA on neural data has the disadvantage that the mapping between latent space and observation space is chosen based on the most variable output units rather than capturing a more accurate representation of the latent variables.

2 The “sensible” PCA (sPCA) model is a probabilistic approach to PCA and yields the same mapping between latent states and observations as conventional PCA. This is demonstrated by Roweis (1998).

The procedure of system identification, or “model training,” requires learning the parameters from the observed data. The observed data includes $N$ trials of $\mathbf{y}$, an identically and independently distributed (i.i.d.) sequence $(\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_N)$ denoted by $\{\mathbf{y}\}$. With the model shown in Eqs. 4.1-4.2, we only consider a single reach endpoint. Restricting the fit to only a single endpoint allows for the characterization of the unobserved factors that influence the observations.

The model fitting procedure is an unsupervised problem since the hidden states are unobserved and therefore unknown; we cannot use known values of the latent variables to help fit the parameters $C$ and $R$. The classic approach to system identification in the presence of unobserved latent variables is the Expectation-Maximization (or EM) algorithm. The algorithm maximizes the likelihood of the observed data over the model parameters (i.e., $\theta = \{C, R\}$). The algorithm is iterative and each iteration is performed in two parts, the expectation (E) step and the maximization (M) step. Iterations are performed until the likelihood converges. This results in the parameters that correspond to the highest data likelihood $P(\{\mathbf{y}\} \mid \theta)$. We can then estimate the most likely $\mathbf{x}$ for the observed data $\mathbf{y}$. The exact fitting procedures for sPCA and FA are described elsewhere (Roweis and Ghahramani 1999; Ghahramani and Hinton 1997) and are omitted here for the sake of brevity.

One open question is how to select $p$, the number of latent dimensions. The objective of model training is to best describe the training data within the constraints imposed by Eqs. 4.1-4.2. However, with too many latent dimensions the model training procedure will explain the training data so well through the latent space that there will be unrealistically small amounts of independent observation noise ($R$). This is contrary to obtaining a simpler model (fewer latent dimensions) with a more reasonable amount of observation noise. For example, when $p$ is large, the model will have enough latent dimensions to explain a high proportion of variability (and importantly, covariance) without using the independent observation noise. In this case, the model will not generalize well for new (test) data. The technical term for this is “overfitting.” We used the standard approach of partitioning data into training and test sets to assess at which choice of $p$ does overfitting become a problem. The choosing of $p$ is part of the process of “model selection.”
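One consequence of Eqs. 4.1-4.2 is worth making explicit: marginalizing over the latent state gives the outputs a covariance of $C C^{\top} + R$, so every off-diagonal (unit-by-unit) covariance comes from the shared factors while $R$ only adds to the diagonal. The sketch below computes this matrix; the function name and the numerical values of $C$ and $R$ are hypothetical.

```python
def marginal_covariance(C, r_diag):
    """Covariance of y implied by Eqs. 4.1-4.2: C C^T + R, with R diagonal.

    C is a q-by-p list of lists (loading matrix); r_diag holds the q
    diagonal entries of R (the independent noise variances).
    """
    q = len(C)
    cov = [[sum(ci * cj for ci, cj in zip(C[i], C[j])) for j in range(q)]
           for i in range(q)]
    for i in range(q):
        cov[i][i] += r_diag[i]
    return cov


# Hypothetical example: q = 3 units driven by p = 1 latent factor.
C = [[1.0], [-0.5], [2.0]]
cov_fa = marginal_covariance(C, [0.5, 1.0, 0.2])    # FA: free diagonal R
cov_spca = marginal_covariance(C, [0.3, 0.3, 0.3])  # sPCA: R = epsilon * I
```

The two models share the same off-diagonal entries for a given $C$; they differ only in how much independent variance each unit is allowed, which is the distinction between FA and sPCA drawn above.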

4.2.2 Poisson Output Model

Standard FA uses a Gaussian noise model but this might not be the most appropriate for our type of data. Recall that our output variables are the spike counts from the recorded neurons and these are naturally nonnegative integers. Furthermore, the means of these data are relatively low (e.g., <10). Hence, such data is not necessarily well-suited for a Gaussian distribution, since a Gaussian has nonzero probability density for non-integer values and negative values. Neural count data are usually considered to be Poisson or Poisson-like in their distribution (Dayan and Abbott 2001).

There are two possibilities to contend with this issue. One approach is to modify the raw data by first applying a square-root to the counts and then centering the data about zero. It can be shown that the approximation error induced when using a Gaussian distribution to fit Poisson data is diminished if the Poisson data is first square-rooted (Thacker and Bromiley 2001). The transformed data is then inputted into the standard FA. This is an approach that we tested. The second option is to alter the generative model to allow for Poisson distributed noise in the output variables. With this change, the model is now written as follows:

$$\mathbf{x} \sim \mathcal{N}(\mathbf{0}, I) \tag{4.3}$$

$$y^i \mid \mathbf{x} \sim \mathrm{Poisson}\left(h(\mathbf{c}^i \cdot \mathbf{x} + d^i)\,\Delta\right) \quad \text{for } i \in 1, \ldots, q. \tag{4.4}$$

The outputs, $y^i \in \mathbb{N}_0$, are generated from a Poisson distribution where $h$ is a link function mapping $\mathbb{R} \to \mathbb{R}_+$, $\mathbf{c}^i \in \mathbb{R}^{p \times 1}$ and $d^i \in \mathbb{R}$ are constants, and $\Delta \in \mathbb{R}_+$ is the time bin width. The function $h$ ensures that the mean firing rate argument to the Poisson distribution is nonnegative. We call this family of models “Factor Analysis with Poisson Output” (FAPO). The Poisson output distribution, along with the nonlinear mapping function $h$, makes an analytic solution to the EM algorithm intractable. Hence, we must use a few approximations when performing the Expectation-Maximization algorithm.
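As background for the square-root option described earlier in this section, the sketch below computes the variance of $\sqrt{Y}$ for Poisson-distributed counts by summing the probability mass function directly. It illustrates the well-known variance-stabilizing behavior of the square-root transform rather than reproducing the analysis of Thacker and Bromiley (2001); the function names and the truncation point `k_max` are assumptions.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam), computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def variance_of_sqrt_counts(lam, k_max=500):
    """Variance of sqrt(Y) for Y ~ Poisson(lam), summing the pmf up to k_max."""
    ks = range(k_max + 1)
    mean_sqrt = sum(math.sqrt(k) * poisson_pmf(k, lam) for k in ks)
    mean = sum(k * poisson_pmf(k, lam) for k in ks)  # E[sqrt(Y)^2] = E[Y]
    return mean - mean_sqrt ** 2


# Raw Poisson counts have variance equal to the mean (5 and 20 here),
# whereas the square-rooted counts have variance near 1/4 in both cases.
var_low = variance_of_sqrt_counts(5.0)
var_high = variance_of_sqrt_counts(20.0)
```

After the transform, units with different mean counts have roughly comparable variance, which is more compatible with a Gaussian output model than the raw counts are.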

E Step

The E step requires computing the expected log joint likelihood, $E[\log P(\{\mathbf{x}\},\{\mathbf{y}\} \mid \theta)]$, over the posterior distribution of the hidden state vector, $P(\{\mathbf{x}\} \mid \{\mathbf{y}\}, \theta_k)$, where $\theta_k$ are the parameter estimates at the $k$th EM iteration. Since the observations are i.i.d. we can equivalently maximize the sum of the individual expected log joint likelihoods, $E[\log P(\mathbf{x}_n, \mathbf{y}_n \mid \theta)]$. The posterior distribution can be expressed as follows:

$$P(\mathbf{x}_n \mid \mathbf{y}_n, \theta_k) \propto P(\mathbf{y}_n \mid \mathbf{x}_n, \theta_k)\, P(\mathbf{x}_n \mid \theta_k). \tag{4.5}$$

Because $P(\mathbf{y}_n \mid \mathbf{x}_n)$ is a product of Poisson distributions rather than a multivariate Gaussian, the state posterior $P(\mathbf{x}_n \mid \mathbf{y}_n)$ will not be of a form that allows for easy computation of the log joint likelihood. Instead, we approximated this posterior with a Gaussian centered at the mode of $\log P(\mathbf{x}_n \mid \mathbf{y}_n)$ and whose covariance is given by the negative inverse Hessian of the log posterior at that mode. Certain choices of $h$, including $h_1(z) = e^z$ and $h_2(z) = \log(1 + e^z)$, lead to a log posterior that is strictly concave in $\mathbf{x}_n$. In these cases, the unique mode can easily be found by Newton's method. We chose $h = h_2$ to avoid the problem with $h_1$: because $h_1$ is exponential in $z$, small changes in the trial-by-trial hidden state lead to very large changes in the underlying output mean in the regime of large $z$ (and the opposite effect in the regime of small $z$). For each trial $n$, let $Q_n$ be a Gaussian distribution in $\mathbb{R}^{p \times 1}$ that approximates $P(\mathbf{x}_n \mid \mathbf{y}_n, \theta_k)$. The expectation of the log joint likelihood for a given observation can be expressed as

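$$\mathcal{E}_n = E_{Q_n}\left[\log P(\mathbf{x}_n, \mathbf{y}_n \mid \theta)\right], \tag{4.6}$$

and the expectation of the log joint likelihood over all of the $N$ trials is simply the sum of the individual $\mathcal{E}_n$ terms:

$$\mathcal{E} = E_Q\left[\log P(\{\mathbf{x}\}, \{\mathbf{y}\} \mid \theta)\right] = \sum_{n=1}^{N} \mathcal{E}_n.$$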
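The Gaussian approximation described in this E step (mode found by Newton's method, covariance from the negative inverse Hessian) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the implementation used for the results: it assumes $h(z) = \log(1 + e^z)$, a standard-normal prior on the latent state, and a simple backtracking safeguard that the text does not mention. The function names and synthetic parameters are hypothetical.

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)

def sigmoid(z):
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def log_post(x, y, C, d, delta):
    # log P(x | y) up to a constant: Poisson log likelihood plus N(0, I) log prior
    h = softplus(C @ x + d)
    return np.sum(y * np.log(h * delta) - h * delta) - 0.5 * x @ x

def grad_hess(x, y, C, d, delta):
    z = C @ x + d
    h, h1 = softplus(z), sigmoid(z)          # h and dh/dz
    h2 = h1 * (1.0 - h1)                     # d2h/dz2
    g_z = y * h1 / h - delta * h1            # per-unit first derivative in z
    H_z = y * (h2 * h - h1**2) / h**2 - delta * h2   # per-unit second derivative
    grad = C.T @ g_z - x
    hess = C.T @ (H_z[:, None] * C) - np.eye(len(x))
    return grad, hess

def laplace_posterior(y, C, d, delta, n_iter=50, tol=1e-10):
    x = np.zeros(C.shape[1])
    for _ in range(n_iter):
        grad, hess = grad_hess(x, y, C, d, delta)
        step = np.linalg.solve(hess, grad)   # Newton update is x - step
        t, f0 = 1.0, log_post(x, y, C, d, delta)
        while log_post(x - t * step, y, C, d, delta) < f0 and t > 1e-8:
            t *= 0.5                         # backtrack if the objective fell
        x = x - t * step
        if np.linalg.norm(t * step) < tol:
            break
    grad, hess = grad_hess(x, y, C, d, delta)
    return x, -np.linalg.inv(hess), grad     # mode, covariance, residual gradient

# Synthetic single-trial example with hypothetical parameters.
rng = np.random.default_rng(0)
q, p, delta = 20, 2, 0.2
C = rng.normal(scale=0.5, size=(q, p))
d = np.full(q, 20.0)
x_true = rng.standard_normal(p)
y = rng.poisson(softplus(C @ x_true + d) * delta)
x_mode, cov, grad = laplace_posterior(y, C, d, delta)
```

The returned mean and covariance define $Q_n$ for that trial; the residual gradient is returned only as a convergence check.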

M Step

The M step requires finding (learning) the $\theta_{k+1}$ that satisfies:

$$\theta_{k+1} = \arg\max_{\theta}\; E_Q\left[\log P(\{\mathbf{x}\}, \{\mathbf{y}\} \mid \theta)\right]. \tag{4.7}$$

This can be achieved by differentiating $\mathcal{E}$ with respect to the parameters, $\theta$. Learning the $\mathbf{c}^i$ and $d^i$ parameters in Eq. 4.4 can be somewhat challenging. We wish to maximize the following objective function with respect to $\mathbf{c}^i$ and $d^i$:

$$\sum_{n=1}^{N} \sum_{i=1}^{q} E_{Q_n}\left[ -h(\mathbf{c}^{i\prime}\mathbf{x}_n + d^i)\,\Delta + y_n^i \log\left\{ h(\mathbf{c}^{i\prime}\mathbf{x}_n + d^i)\,\Delta \right\} \right]. \tag{4.8}$$

This optimization can be solved by recasting the expectation over $Q_n$, applying Gaussian quadrature approximations, and then iteratively searching for a solution using conjugate gradient methods (Yu 2007).
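As an illustration of the quadrature step, note that under a Gaussian $Q_n$ the scalar $\mathbf{c}^{i\prime}\mathbf{x}_n + d^i$ is itself Gaussian, so each expectation in Eq. 4.8 reduces to a one-dimensional integral. The sketch below uses Gauss-Hermite quadrature for that integral; the specific rule and number of nodes are assumptions, since the text does not state which quadrature scheme was used.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite_expectation(f, mean, var, n_points=20):
    """Approximate E[f(z)] for scalar z ~ N(mean, var)."""
    t, w = hermgauss(n_points)               # nodes and weights for weight exp(-t^2)
    z = mean + np.sqrt(2.0 * var) * t        # change of variables to N(mean, var)
    return np.sum(w * f(z)) / np.sqrt(np.pi)

def softplus(z):
    return np.logaddexp(0.0, z)

# Expected rate term E[h(z)] for hypothetical posterior moments of z = c'x + d.
m, v = 1.5, 0.4
expected_rate = gauss_hermite_expectation(softplus, m, v)
second_moment = gauss_hermite_expectation(lambda z: z**2, m, v)
```

For a Gaussian with mean `m` and variance `v`, the second moment is `m**2 + v`, which gives a quick sanity check on the weights and the change of variables.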

4.2.3 Extensions to Accommodate Multiple Targets

All of the prior techniques are intended to be used for data collected while the subject is reaching to a single target and can help quantify unobserved factors that affect the neural activity. To use FA (or FAPO) to help decode target endpoint, we tried two different forms of the generative model. We cover these two forms in the context of the Poisson-based framework (refer to Eq. 4.4), but the same formulation is applicable for the Gaussian-distributed outputs, as will be shown later. The first closely mimics the decode algorithms that we used in our BCI experiments (see Section 3.2.2). We fit a separate FAPO model for each target, which is formally written as follows:

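$$\mathbf{x} \sim \mathcal{N}(\mathbf{0}, I) \tag{4.9}$$

$$y^i \mid \mathbf{x}, s \sim \mathrm{Poisson}\left(h(\mathbf{c}_s^{i\prime}\mathbf{x} + d_s^i)\,\Delta\right) \quad \text{for } i \in 1, \ldots, q. \tag{4.10}$$

The random variable $s$ is the mixture component indicator and has a discrete probability distribution over $\{1, \ldots, M\}$ (e.g., $P(s) = \pi_s$). During model fitting, we assume $s$ is known and we take $\pi_s = 1/M$ for all $s$. We then decode test trials by choosing the FAPO model, indexed by reach endpoint, that best describes the data. We do so by finding $\hat{s}$ (the most likely $s$), using the following operation:

$$\hat{s} = \arg\max_s\; P(s \mid \mathbf{y}, \theta) \tag{4.11}$$

$$= \arg\max_s\; \frac{P(\mathbf{y} \mid s, \theta)\,P(s)}{P(\mathbf{y} \mid \theta)} \tag{4.12}$$

$$= \arg\max_s\; P(\mathbf{y} \mid s, \theta) \tag{4.13}$$

$$= \arg\max_s \int_{\mathbf{x}} P(\mathbf{y}, \mathbf{x} \mid s, \theta)\, d\mathbf{x} \tag{4.14}$$

$$= \arg\max_s \int_{\mathbf{x}} P(\mathbf{y} \mid \mathbf{x}, s, \theta)\,P(\mathbf{x})\, d\mathbf{x}. \tag{4.15}$$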

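The second approach is to share the same output mapping between target locations and incorporate the effect of reach endpoint through the shared latent space. We can formalize this model as follows:

$$\mathbf{x} \mid s \sim \mathcal{N}(\mu_s, \Sigma_s) \tag{4.16}$$

$$y^i \mid \mathbf{x} \sim \mathrm{Poisson}\left(h(\mathbf{c}^{i\prime}\mathbf{x} + d^i)\,\Delta\right) \quad \text{for } i \in 1, \ldots, q. \tag{4.17}$$

To find $\hat{s}$ we then performed the operation:

$$\hat{s} = \arg\max_s \int_{\mathbf{x}} P(\mathbf{y} \mid \mathbf{x}, s, \theta)\,P(\mathbf{x} \mid s)\, d\mathbf{x}. \tag{4.18}$$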

The difference between these models is subtle but important. In Eq. 4.10, there is a separate set of $\mathbf{c}^i$ and $d^i$ variables for each target location. Essentially this generative model defines a different latent space for each reach endpoint. In Eq. 4.17, however, the $\mathbf{c}^i$ and $d^i$ variables are shared and the data for each endpoint are separated by their different means in the latent space, with $\Sigma_s = I$. We also tried Gaussian-based output distributions with models analogous to the ones above. We dubbed these “Factor Analysis with Gaussian Output” or FAGO for short. They are written as

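$$\mathbf{x} \sim \mathcal{N}(\mathbf{0}, I) \tag{4.19}$$

$$\mathbf{y} \mid \mathbf{x}, s \sim \mathcal{N}(C_s \mathbf{x}, R_s) \tag{4.20}$$

and

$$\mathbf{x} \mid s \sim \mathcal{N}(\mu_s, \Sigma_s) \tag{4.21}$$

$$\mathbf{y} \mid \mathbf{x} \sim \mathcal{N}(C \mathbf{x}, R), \tag{4.22}$$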

respectively. Note that the observations are no longer mean-centered about zero for the latter model. Rather, the mean observation vector for a particular reach endpoint is mapped from the underlying latent space mean (i.e., $C\mu_s$). An example of how these clusters might appear is shown in Fig. 4.2. We chose the number of latent dimensions to be 3 to allow for convenient plotting of the data.

Figure 4.2: Latent Space Example for FAGO. Each point corresponds to the inferred latent space variable x for a given trial. The coloring of the data points denotes the upcoming reach target. Points of similar color are clustered together since all of these trials correspond to the same reach target.
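For the shared-mapping Gaussian model (Eqs. 4.21 and 4.22), integrating out the latent state gives $\mathbf{y} \mid s \sim \mathcal{N}(C\mu_s,\; C\Sigma_s C' + R)$, so the decoding rule has a closed form. The sketch below applies that rule with a uniform prior over targets. The function name `fago_decode` and all parameter values are hypothetical, and the parameters are treated as already fitted.

```python
import numpy as np

def fago_decode(y, C, R_diag, mus, Sigmas):
    """Return argmax_s log P(y | s) for the shared-mapping model, uniform P(s)."""
    scores = []
    for mu, Sigma in zip(mus, Sigmas):
        mean = C @ mu                               # E[y | s] = C mu_s
        cov = C @ Sigma @ C.T + np.diag(R_diag)     # Cov[y | s] = C Sigma_s C' + R
        resid = y - mean
        _, logdet = np.linalg.slogdet(cov)
        scores.append(-0.5 * (logdet + resid @ np.linalg.solve(cov, resid)))
    return int(np.argmax(scores))

# Synthetic check with hypothetical parameters: 4 targets, 30 units, 3 latent dims.
rng = np.random.default_rng(0)
q, p = 30, 3
C = rng.normal(size=(q, p))
R_diag = np.ones(q)
mus = 4.0 * np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]], dtype=float)
Sigmas = [np.eye(p)] * len(mus)

n_correct, n_trials = 0, 200
for _ in range(n_trials):
    s = rng.integers(len(mus))
    x = mus[s] + rng.standard_normal(p)                    # x | s ~ N(mu_s, I)
    y = C @ x + np.sqrt(R_diag) * rng.standard_normal(q)   # y | x ~ N(Cx, R)
    n_correct += fago_decode(y, C, R_diag, mus, Sigmas) == s
accuracy = n_correct / n_trials
```

The separate-mapping variant (Eqs. 4.19 and 4.20) would use a per-target `C` and `R_diag` in place of the shared ones.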

A short derivation is provided for FAGO in Section 4.6. The derivation for FAPO is omitted as it is lengthy and does not provide any additional insights. While the details are similar to FAGO at a high level, the computations are significantly more laborious due to the need for approximations to non-linear functions during the course of the EM algorithm. Yu (2007) provides a description of the operational details.

4.3 Results and Discussion

4.3.1 Data Characterization

We first wanted to better characterize the empirical variance in our data.³ Primarily, we asked how much of the total variance is due to the underlying trial-to-trial variability and how much is due to the intrinsic noise properties of the output neurons. Mathematically, in the model described by Eqs. 4.1 and 4.2, the latent space variability manifests itself in the output space as CC' (shared variance). The independent variance is found in the matrix R. Once the model training is complete, CC' + R is the best fit covariance of the raw data. It does not necessarily match the empirical covariance of the data due to the reduced rank of C and the diagonal constraint on R. Let us assume, for example, that the shared variance is large compared to the independent variance. In this situation, an FA model may be better suited to describe a trial than the simple decoding models in Chapter 3. The simple decoding models ascribe all of the variance in spike counts to be independent along the output dimensions (neural units). However, as previously discussed, variations in spike counts from their mean values might actually be due to a change in some unobserved factors. The hope is that FA can identify these trial-by-trial variations, provide a richer description of the data, and allow for a more robust mechanism by which we can decode target endpoint. Understanding the relative proportion of shared variance to independent variance can help build intuition on how much improvement FA might be able to deliver when applied to our data. To this end, we started by segregating our data by reach target, and fit a separate model to each endpoint (Eqs. 4.19 and 4.20). We considered the spike counts in the window [150:350] ms after target presentation. As discussed at the end of Section 4.2.1, an appropriate p, the number of latent dimensions, must be chosen.
To assess this free parameter, we further split our data into two approximately equal halves, one to serve as a training set and the other as a test set. For each test trial, we computed the likelihood using the FA model built for that trial’s reach endpoint, $\{C_s, R_s\}$. We summed the test trials’ likelihoods to obtain the total test likelihood. The test likelihood as a function of p is shown in Fig. 4.3 with two curves, one each for monkeys G and H. The curves are normalized such that their peaks occur at 1. The results show that overfitting is an issue for even relatively small values of p. Why might this be? There are surely many independent factors that influence the upcoming reach: reach direction, distance, curvature, speed, force, etc. However, for our dataset, we may have limitations in our ability to resolve the latent space. One, our reaching task was highly stereotyped. There was low variance in the reaches within the subset of reaches to the same endpoint. Two, the intrinsic noise properties of our neurons may be large relative to the shared reach variability. Given a relatively small number of neurons (~100) and training trials for each reach target (~50),

³ Again, the data is simply the neural data binned into spike counts per sorted neural unit.

[Figure 4.3 plot: normalized test likelihood versus Number of Latent Dimensions (p), for p = 1 to 8; one curve each for Monkey G and Monkey H.]

Figure 4.3: Test likelihood reveals the ideal number of latent dimensions. As overfitting sets in, the test likelihood declines. Data from monkey G (red) and monkey H (blue) are plotted. Curves are self-normalized.

it is likely that we are unable to identify more hidden factors without misestimating the model parameters and overfitting. We were careful to choose p to be small for the remainder of our analyses, usually equal to 1 when building a model for a single reach target, and equal to 8 when building a multi-target model according to the description in Section 4.2.3. Having chosen the number of latent dimensions in our model, we returned to investigating the partitioning of the total variance between the shared and intrinsic processes. Again, we segregated the training trials by reach target (s) and fit the standard FA models (FAGO$_s$). For each FAGO$_s$, we obtained the intrinsic variance per neural unit ($R_s^i$). We only retained the neurons that were tuned for target location in this analysis, as per our standard tuning criteria (ANOVA; p < 0.05). Then, for each neuron-target pair, we also computed the total raw variance from the data alone (the empirical variance of that unit's spike counts for that target). We finally derived the fraction of the total variance attributable to the intrinsic variance and took the mean of this ratio across all neuron-target pairs. For monkey G (dataset G20040508; p = 2), the intrinsic variance contributed 85% of the total variance, on average. For monkey H (dataset H20041217; p = 2), the result was similar, with the intrinsic variance accounting for 89% of the total variance, on average. This indicates that there is measurable shared variance found by the FA model fit, but this quantity is relatively modest.⁴ We therefore inferred from the intrinsic-to-total variance ratio that there should be a small to moderate improvement in target decoding when using the FA-based target decoder for these datasets. This is indeed the case and will be shown later. Finally, using the same FA models as above, we computed the “intrinsic” Fano Factor (ratio of intrinsic variance, $R_s^i$, over the mean spike count for that neuron-target combination). This allowed us to compare the intrinsic Fano Factor (FF) against the theoretical FF of a Poisson noise distribution. This quantity is often of interest in the neuroscience community (Tolhurst et al. 1983; Gur et al. 1997; Bair and O’Keefe 1998; Averbeck and Lee 2003). The FF of any Poisson distribution should be 1 since the variance is equal to the mean. Figure 4.4 shows a histogram of the intrinsic FF for all neuron-target pairs as analyzed on two separate datasets. The red overlay corresponds to the raw FF that is computed on the data alone, making it clear that the intrinsic FF shifts considerably to the left when shared variance is taken into account. The average intrinsic FF was 0.97 while the average raw FF was 1.18 for monkey G.

⁴ It is important to note that we may certainly be underestimating the amount of shared variance in this system. If p were larger, there would be greater opportunity to assign more output variability to the shared variability of the system. But as we showed before, given our data limitations, it appears that we overfit for larger p. To find the true value of p, we may need a very large number of trials and neural units.
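The two computations described in this section, choosing p by held-out likelihood and measuring the intrinsic fraction of the total variance, can be sketched together. The code uses the textbook EM updates for factor analysis with diagonal R; it is an illustration on synthetic data, not the fitting code used for the reported numbers, and the function names, iteration count, and synthetic parameters are all assumptions.

```python
import numpy as np

def fit_fa(Y, p, n_iter=200, seed=0):
    """EM for y ~ N(mu + C x, R) with x ~ N(0, I) and diagonal R."""
    N, q = Y.shape
    mu = Y.mean(axis=0)
    Yc = Y - mu
    S = Yc.T @ Yc / N                                     # empirical covariance
    rng = np.random.default_rng(seed)
    C = rng.standard_normal((q, p))                       # random starting loadings
    R = np.diag(S).copy()                                 # start with all variance intrinsic
    for _ in range(n_iter):
        beta = C.T @ np.linalg.inv(C @ C.T + np.diag(R))  # E step: E[x|y] = beta (y - mu)
        Exx = np.eye(p) - beta @ C + beta @ S @ beta.T    # trial-averaged E[x x'|y]
        C = S @ beta.T @ np.linalg.inv(Exx)               # M step for C
        R = np.maximum(np.diag(S - C @ beta @ S), 1e-6)   # M step for diagonal R
    return mu, C, R

def avg_loglik(Y, mu, C, R):
    """Mean log likelihood per trial under N(mu, CC' + R)."""
    q = Y.shape[1]
    cov = C @ C.T + np.diag(R)
    D = Y - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum('ij,ij->i', D, np.linalg.solve(cov, D.T).T)
    return np.mean(-0.5 * (q * np.log(2 * np.pi) + logdet + maha))

# Synthetic single-target data: one shared factor, unit intrinsic variance.
rng = np.random.default_rng(1)
N, q = 2000, 10
x = rng.standard_normal(N)
Y = 5.0 + np.outer(x, np.ones(q)) + rng.standard_normal((N, q))
train, test = Y[: N // 2], Y[N // 2:]

test_ll = {p: avg_loglik(test, *fit_fa(train, p)) for p in (1, 2, 3)}
mu, C, R = fit_fa(train, 1)
intrinsic_fraction = R / train.var(axis=0)   # intrinsic share of total variance per unit
intrinsic_ff = R / train.mean(axis=0)        # intrinsic variance over mean (used on counts)
```

In this synthetic case the true intrinsic fraction is 0.5 for every unit. On real spike counts the same two ratios correspond to the intrinsic-to-total variance fraction and the intrinsic Fano Factor discussed above.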

[Figure 4.4 plots: histograms of Fano Factor; one panel each for Monkey G and Monkey H.]

Figure 4.4: Intrinsic Fano Factor. The distribution of intrinsic FF (blue histogram) was computed by taking the intrinsic variance over the mean after fitting a FA model. The distribution of FFs computed from the raw data is overlaid (red). A simulation was also performed so that the intrinsic FF distribution could be compared to the theoretical distribution (gray curve). a. Results from monkey G (dataset G20040508). b. Results from monkey H (dataset H20041217).

We then took random draws from Poisson distributions that had the same means as those measured for the neuron-target pairs in our data. The number of draws was equal to the number of trials per condition used to initially train the model. We then calculated the FF for this simulation, obtaining the “theoretical” FF adjusted for the limited number of samples (i.e., trials). This is the gray overlay in Fig. 4.4. The figure shows that the intrinsic FF is much closer to the theoretical FF than the raw value. Nonetheless, the intrinsic FF is not a perfect match to the theoretical FF. For monkey G, the intrinsic FF overrepresents values at both tails. For monkey H, the intrinsic FF distribution overrepresents mostly values greater than 1 (often known as “super-Poisson”). It is difficult to determine whether this mismatch is a true property of the data or simply an artifact of a poor FA model fit. Since we are primarily interested in improving the overall decode performance, we will soon discuss this data fitting issue in that context.
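The finite-sample simulation described above is straightforward to sketch. The means below are hypothetical stand-ins for the measured neuron-target means, and 50 trials per condition is assumed to match the training-set size mentioned elsewhere in the chapter.

```python
import numpy as np

def simulated_fano(means, n_trials, rng):
    """Fano Factors of Poisson draws, one per (neuron, target) mean."""
    counts = rng.poisson(means, size=(n_trials, len(means)))
    m = counts.mean(axis=0)
    v = counts.var(axis=0, ddof=1)           # unbiased sample variance
    ok = m > 0                               # FF is undefined when all counts are zero
    return v[ok] / m[ok]

rng = np.random.default_rng(0)
means = rng.uniform(1.0, 10.0, size=2000)    # hypothetical mean spike counts
ff = simulated_fano(means, n_trials=50, rng=rng)
ff_mean, ff_std = ff.mean(), ff.std()
```

Even though each underlying distribution has an FF of exactly 1, the sample FFs scatter noticeably around 1 with only 50 trials, which is why the comparison is made against this simulated spread rather than against a single value.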

4.3.2 Target Decoding

Given the encouraging results from our deconstruction of the variance into shared and intrinsic components, we next implemented target decoding using the FA framework. We first compared 4 different decoders. One pair consisted of the simple, independent-Gaussian (G) and -Poisson (P) models. The second pair consisted of the FA models that fit separate output mappings per target endpoint (Eqs. 4.19-4.20 and 4.9-4.10). We refer to these models as FAGOsep and FAPOsep, respectively. For all of the decode analyses, the window for counting spikes started 150 ms after target presentation (i.e., $T_{skip}$ = 150 ms). We initially computed decode accuracy for two separate plan lengths, $T_{int}$ = 150 ms and $T_{int}$ = 250 ms. The trials were first shuffled so as to remove any effects such as reduced attention or muscle fatigue that systematically progress over the course of the experiment. Then, we set aside 50 trials per condition to train the model. The number of latent dimensions was chosen to be p = 1 (we found that increasing p did not improve the overall performance). The fitted model was later used to decode the reach target for the remainder of trials, as previously described in Eq. 4.15. The average decode accuracies for each model are shown in Table 4.1.

Table 4.1: Factor analysis performance comparison. Decode accuracy was computed using data from both monkey G (dataset G20040508) and monkey H (dataset H20041217) for various models.

                            Decoding Models
Monkey  Spike window (ms)   G (%)   P (%)   FAGOsep (%)   FAPOsep (%)
H       [150:300]           82.0    87.1    90.3          87.2
H       [150:400]           87.6    91.5    96.1          96.1
G       [150:300]           88.2    91.5    93.8          91.0
G       [150:400]           90.9    93.8    95.6          95.1

There are two important points to note regarding the findings in Table 4.1. First, the performance improvement when using an FA style model is relatively small. The increase in decode accuracy of FAGOsep over the simple Poisson decoder was as little as 2.3% and only as high as 4.6%. The dataset for monkey H, with analysis window [150:400], is the most promising. The performance obtained with the FAGOsep decoder (96.1%) is statistically significantly different from that of the Poisson decoder (91.5%), as confirmed by checking the 95% confidence interval of each estimate. Secondly, the Poisson-based FA (FAPOsep) does not perform as well as the Gaussian-based version (FAGOsep).⁵ This is somewhat surprising since a Poisson distribution often better models neural data than a Gaussian (see Fig. 3.4), especially for small time windows. It appears that this difference vanishes if the Gaussian is fit to the square-root transformed data and a FA approach is employed. The performance difference between the two models is less noticeable for longer windows. The effects we saw could be due to a variety of reasons:

⁵ Note that the simple Poisson model easily outperformed the simple Gaussian partially because the data for the latter was not preprocessed with a square-root transform.

1. The Poisson fitting procedure includes several approximations, which could be resulting in a sub-optimal fit, even if the training likelihood increased with every EM iteration. Often the training likelihood would even start to drop after ~100 iterations, albeit slowly.

2. The output noise process may not be exactly Poisson as evidenced by Fig. 4.4, despite its wide popularity in the neuroscience community. Gaussian models may better capture the sub- and super-Poisson characteristics of the data.

3. The Poisson models are able to match the decode performance of the Gaussian models for longer windows. This might be due to the larger signal-to-noise ratio in this large-window situation. Since the variance is equal to the mean for a Poisson distribution, the signal-to-noise ratio increases as firing rates are higher. For smaller windows, the spike counts for the neural units can be rather low. In a window such as [150:200] ms after target presentation, the trial-by-trial spike counts for a unit are most often only 0, 1, or 2. This would make it difficult to determine the underlying mean and correlation structure, especially if there are only a small number of units and trials. We speculate that the sub-optimal EM approximations may be easily susceptible to erroneous model fitting in this sort of regime.
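The signal-to-noise argument in the third point can be made concrete with a short calculation. Taking the signal-to-noise ratio (SNR) to be the mean count divided by its standard deviation (one common definition, assumed here rather than taken from the text), a unit with firing rate $\lambda$ counted over a window of length $\Delta$ gives:

```latex
\mathrm{SNR}
  = \frac{E[y]}{\sqrt{\mathrm{Var}(y)}}
  = \frac{\lambda\Delta}{\sqrt{\lambda\Delta}}
  = \sqrt{\lambda\Delta},
\qquad y \sim \mathrm{Poisson}(\lambda\Delta).
```

For a hypothetical unit firing at 10 spikes/s, a 50 ms window gives an expected count of 0.5 and an SNR of about 0.71, whereas a 250 ms window gives an expected count of 2.5 and an SNR of about 1.58.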

Again, the above discussion covers FA models that employ separate output mappings for each target endpoint. We also examined the FA models that share a single combined output mapping for all of the target endpoints (Eqs. 4.21-4.22 and 4.16-4.17). These are the FAGOcmb and FAPOcmb models.

Before using these models to decode, we need to choose an appropriate value of the model parameter p, the number of latent dimensions. We had done this for the single-target FA model in Section 4.3.1. For FAGOcmb and FAPOcmb, since the model must share a single output mapping for all reach targets, there exists a slight complication as detailed below.

Intuition suggests that a well-fit model of the form in Eq. 4.22 should be such that the expected observation mean for a given target (i.e., $E(\mathbf{y} \mid s)$) closely matches the empirical mean of the data for that same target. The model states that the expected mean is $C\mu_s$, where $s$ is the target location of interest. The observations lie in a q-dimensional space while the vector $\mu_s$ is in a lower p-dimensional space, and C is clearly rank p. Thus, the vector $C\mu_s$ lies within a p-dimensional subspace spanned by the columns of C. If there are M total targets, the space of possible observation means lies in at most an (M − 1)-dimensional space. Hence, p must be at least (M − 1) if we are to ensure that the model can always capture the appropriate target-specific output means. Recall that for our prior analysis of the latent space, we segregated data by target location, fit the FA model, and then computed the test likelihood for increasing values of p.⁶ We had found the optimal p was either 1 or 2. Given this information, we posited that p = M (simply one more than p = (M − 1)) might be an ideal choice for FAGOcmb or FAPOcmb.

To test our intuition, we fit a FAGOcmb model for different values of p and decoded target endpoint on a separate set of trials. The results of this analysis using data from monkey H (dataset H20040928) are shown in Fig. 4.5. The spike count window was set to [150:400] ms and p was swept from 1 to 15. The plot indicates that performance saturates early. While some p < 8 might suffice for the number of latent dimensions, p = 8 appears to be a safe choice, as we had intuited. We set p = 8 for all further analyses with FAGOcmb and FAPOcmb.

⁶ Since the data was separated by target for that computation, the mean was fit separately and does not influence p.
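The dimension-counting argument above can be checked numerically: once the grand mean is removed, M target-specific mean vectors span at most M − 1 dimensions regardless of the number of units q. The values below are hypothetical; M = 8 is inferred from the choice p = M = 8 and q = 100 from the approximate unit count mentioned earlier.

```python
import numpy as np

rng = np.random.default_rng(0)
M, q = 8, 100                                  # assumed number of targets and units
means = rng.normal(size=(M, q))                # hypothetical per-target mean vectors
centered = means - means.mean(axis=0)          # remove the grand mean across targets
rank = np.linalg.matrix_rank(centered)         # dimension of the span of the target means

# A loading matrix C with M - 1 columns can therefore reproduce every centered mean.
U, sing, Vt = np.linalg.svd(centered, full_matrices=False)
C = Vt[: M - 1].T                              # (q, M-1) basis for the span
mu_latent = centered @ C                       # latent means mu_s, one row per target
recon_error = np.abs(mu_latent @ C.T - centered).max()
```

With generic means the rank equals M − 1, so fewer than M − 1 latent dimensions cannot place all target means correctly, and any additional dimensions are available for shared trial-by-trial variability.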

[Figure 4.5 plot: average decode accuracy versus number of latent dimensions p.]

Figure 4.5: Choosing the number of latent dimensions for FAGOcmb. We found the average decoder accuracy for each tested value of p. The performance saturates around p = 6.

Next, we compared FAGOcmb and FAPOcmb to their counterparts, FAGOsep and FAPOsep. Similar to Table 4.1, we chose the time period of [150:300] ms after the target presentation to calculate the spike counts. Table 4.2 shows the performance from the FAGOcmb and FAPOcmb techniques versus the FAGOsep and FAPOsep approaches. The most striking aspect of the comparison is that FAPOcmb does not suffer from the same performance degradation as FAPOsep. In FAPOcmb, a smaller number of parameters are fit jointly against all of the trials. This results in a simpler model that is not as susceptible to overfitting as FAPOsep. Additionally, we noted that the performance of FAGOcmb only slightly edges that of FAGOsep while being roughly equivalent to FAPOcmb. This trend continued over various window sizes and across two additional datasets (data not shown). Consequently, we chose to use FAGOcmb over the other alternatives due to its overall decode performance and speed of computation.

Table 4.2: Factor analysis performance comparison between separate output mapping models and combined output mapping models. Decode accuracy was computed using datasets from monkey G (dataset G20040508) and monkey H (dataset H20041217).

                            Decoding Models
Monkey  Spike window (ms)   FAGOsep (%)   FAPOsep (%)   FAGOcmb (%)   FAPOcmb (%)
H       [150:300]           90.3          87.2          91.9          92.2
G       [150:300]           93.8          91.0          94.5          93.9

4.3.3 Datasets with More Shared Variability

The datasets that we used so far contain highly stereotyped reaches and the timing of the trials was very regular. As such, it is perhaps not surprising that we were unable to benefit greatly from FA, a technique that identifies shared variability. The previous data characterization analyses showed that the shared variability accounted for approximately 15% of the total variability. Thus, the datasets G20040508 and H20041217 simply do not have a large amount of shared variability, and a significant portion of the decode error is due to the intrinsic variance of the neural units. Furthermore, the baseline performance (as computed from our simple Poisson-based decoder) was already sufficiently high. Hence, there was little room for improvement in the average decode accuracy when we employed the FA techniques.⁷ We attempted to locate a dataset that possessed trial-by-trial variability and one in which shared processes contribute heavily to the overall data variability. As chance would have it, we have precisely that dataset from our BCI experiments. In those experiments, we presented a mix of BCI trials (short trials, chained rapidly together) and standard reach trials. The BCI trials in G20040427 and H20040928 had total trial lengths of approximately 400 ms. For the real reaches, most trials had plan periods greater than 400 ms and we discarded any catch trials with timings shorter than this. Therefore, we could analyze neural activity up to ~400 ms after target presentation regardless of the trial type (BCI versus real reach). We know from other related studies (Kalmar et al. 2005; Gilja et al. 2005) that there can be substantial gain modulation as a chain of BCI trials progresses and that simply normalizing single-trial responses by the average firing rate across the array can improve decode performance. Therefore, we trained on a set of data that included both BCI trials and reach trials. This resulted in an ideal type of dataset.
The FA methods could potentially represent the gain modulation as an underlying factor and the target decoder could perhaps benefit from this more accurate model. Figure 4.6 shows a comparison between the simple Poisson-based decoder and the FAGOcmb decoder.⁸ The FAGOcmb model had 8 latent dimensions (p = 8). We have plotted the decode error so as to better illustrate the difference between the two methods. A number of window lengths were tested for each monkey. The performance differential between simple Poisson-based decoding and FAGOcmb decoding was appreciable. For monkey H, however, there was a less dramatic boost in performance for 75-100 ms windows. As stated before, we suspect that the signal-to-noise ratio is too low for the neural data when measured over these small window lengths. We do not have sufficient trials to counteract this phenomenon. For long window lengths, the performance improvement can be very dramatic (up to ~15%) in both monkeys. These BCI datasets have nicely illustrated the power of using FAGOcmb for situations where there is variability in the task itself.

⁷ We cannot simply drop units or reduce the training set size in order to bias ourselves into a lower performance regime. This is because adjusting these two parameters will then directly influence how much data we have to accurately fit the FAGOcmb model. Plus, this approach will not increase shared variability.

⁸ We did not analyze the data using the other FA decoders since we had already sufficiently determined that FAGOcmb is the best option from the family of decoders described in Section 4.2.3.

[Figure 4.6 plots: decode error versus Plan Window Length (ms); panels a and b.]

Figure 4.6: Comparison of simple Poisson-based decoder (black) with the FAGOcmb decoder (red). a. Monkey G (dataset G20040427). Models were trained on the first 75 trials per condition and tested on the remaining 76 trials per condition in the dataset. b. Monkey H (dataset H20040928). The training set consisted of 65 trials per condition and the test set had 67 trials per condition.

It is worthwhile to express the improvements in decode accuracy in terms of the ITRC (Information Transfer Rate Capacity) metric that we so vigorously espoused in Chapter 3. For these BCI datasets, the total ITRC would have increased by approximately 1-1.25 bps had we used FAGOcmb during real-time experiments. This constitutes an ITRC increase of 15-20%, which is more than the 8-15% increase in decode accuracy, since the performance increases in decode accuracy are amplified when expressed in terms of ITRC.

4.4 Summary

In this chapter, we have investigated the use of more sophisticated decode algorithms in the hopes that we can achieve higher prosthetic performance. While we were able to demonstrate significant breakthroughs in performance with the system previously outlined in Chapter 3, we hoped to extend these advancements even further here. Factor Analysis techniques were used to help better account for trial-by-trial variations in uncontrolled and unobserved aspects of the prosthetic task. Simple data characterization analyses showed that there was sufficient potential for performance improvement using these methods. We applied minor extensions to the conventional FA model and adapted it for the purpose of decoding target endpoint. We found that using an entirely separate model for each reach endpoint was not as effective as fitting a single model to the entire dataset. The latter strategy requires fewer model parameters and may be less prone to estimation error and overfitting. Surprisingly, the complicated extensions to support Poisson-distributed outputs were deemed unnecessary since the Gaussian-based models did equally well, and even better in some instances, when data were square-root transformed. This allowed us to dispense with FAPOcmb and avoid the lengthy compute times associated with fitting those models. The full utility of the FA methodology was demonstrated with our BCI datasets where the task design had different operating modes (BCI vs. reach trials). This resulted in much more shared variability and FAGOcmb was able to consistently and significantly outperform the conventional methods. For a clinical prosthetic setup, the situation of mixing BCI and reach trials would not be realistic since the patient would be paralyzed. However, even for a clinical BCI the set of actions available to the patient may be so heterogeneous that there may be underlying factors that significantly modulate the outputs, even though the factors are irrelevant to the task itself.
If this is the case, FA can be one tool by which the system designer can combat performance degradation. Finally, we chose to use a probabilistic framework and construct a generative model a priori that we felt reasonably describes how our neural data relates to the reaching task. A potential disadvantage, however, is that we must employ an unsupervised learning algorithm to optimally fit the model without assigning a cost for how well or how poorly the model can classify the data. The fact that our FA approach improves the performance, despite this drawback, indicates that we may be revealing something intrinsic about the system. On the other hand, another algorithmic strategy would be to train a classifier that expressly accounts for misclassifications during the fitting process. This is the approach taken by several standard algorithms in the field of machine learning, including neural network classifiers, support vector machines (SVMs), and Gaussian processes. Furthermore, there is also the possibility of using a hybrid between supervised and unsupervised methods. An eventual comparison between the FA approach and these other approaches would be fruitful and will help better frame the broader impact of what we have shown here.

4.5 Credits

This effort was greatly facilitated by initial computational work by Byron Yu and Maneesh Sahani. Mathematical techniques developed for other purposes were directly applicable to our decoding problem. Furthermore, there were many valuable conversations with Byron and Maneesh during the course of this project. Lastly, the rotation projects of Vikash Gilja and Rachel Kalmar were thought-provoking for the work here. Their studies explored the question of non-stationarities in the BCI data: Gilja et al. (2005) showed that a simple array-mean firing-rate normalization could improve decode performance and Kalmar et al. (2005) highlighted the gain modulation occurring within a BCI chain. This helped us identify the key datasets for Section 4.3.3. We also thank Dr. Stephen Ryu for performing the electrode implant operations for monkeys G and H, Missy Howard for surgical assistance and veterinary care, Dr. Nicho Hatsopoulos for surgical assistance (monkey G implant), and Afsheen Afshar for helping with animal training and data collection (monkey H). This study was supported by NDSEG Fellowships (GS, BMY), NSF Graduate Research Fellowships (GS, BMY), the Gatsby Charitable Foundation Unit (MS, BMY), and the following awards to KVS: a Burroughs Wellcome Fund Career Award in the Biomedical Sciences, the Stanford Center for Integrated Systems, the NSF Center for Neuromorphic Systems Engineering at Caltech, ONR, the Sloan Foundation, the Whitaker Foundation, and the Christopher Reeve Paralysis Foundation.


4.6 Appendix: Mathematical Derivations for FAGO

The generative model is

$$x \mid s \sim \mathcal{N}(\mu_s, \Sigma_s) \tag{4.23}$$

$$y \mid x \sim \mathcal{N}(Cx, R). \tag{4.24}$$

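The random variable $s$ is the mixture component indicator and has a discrete probability distribution over $\{1, \ldots, M\}$ (e.g., $P(s) = \pi_s$). Given $s$, the latent state vector, $x \in \mathbb{R}^{p \times 1}$, is Gaussian distributed with mean $\mu_s$ and covariance $\Sigma_s$. The output, $y \in \mathbb{R}^{q \times 1}$, is Gaussian distributed given $x$, with mean $Cx$ and covariance $R$.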

4.6.1 E Step

The E step of EM requires computing the expected log joint likelihood, $E[\log P(\{x\}, \{y\}, \{s\} \mid \theta)]$, over the posterior distribution of the hidden state vector, $P(\{x\} \mid \{y\}, \{s\}, \theta_k)$, where $\theta_k$ are the parameter estimates at the $k$th EM iteration. Since the observations are i.i.d., we can equivalently maximize the sum of the individual expected log joint likelihoods, $E[\log P(x_n, y_n, s_n \mid \theta)]$. The latent state and output observations are jointly Gaussian given $s$:

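$$P\left( \begin{bmatrix} y_n \\ x_n \end{bmatrix} \,\middle|\, s_n \right) = \mathcal{N}\left( \begin{bmatrix} C\mu_{s_n} \\ \mu_{s_n} \end{bmatrix}, \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix} \right) \tag{4.25}$$

$$= \mathcal{N}\left( \begin{bmatrix} C\mu_{s_n} \\ \mu_{s_n} \end{bmatrix}, \begin{bmatrix} C\Sigma_{s_n}C' + R & C\Sigma_{s_n} \\ \Sigma_{s_n}C' & \Sigma_{s_n} \end{bmatrix} \right) \tag{4.26}$$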

And, therefore, the posterior distribution of the hidden state can be written as

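$$P(x_n \mid y_n, s_n) = \mathcal{N}\left( \mu_{s_n} + \Sigma_{21}\Sigma_{11}^{-1}(y_n - C\mu_{s_n}),\; \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} \right) \tag{4.27}$$

$$= \mathcal{N}\left( \mu_{s_n} + \beta_{s_n}(y_n - C\mu_{s_n}),\; \Sigma_{s_n} - \beta_{s_n}C\Sigma_{s_n} \right), \tag{4.28}$$

where $\beta_{s_n} = \Sigma_{s_n}C'(R + C\Sigma_{s_n}C')^{-1}$. The inverse in $\beta_{s_n}$ can be computed efficiently using the matrix inversion lemma:

$$(R + C\Sigma_{s_n}C')^{-1} = R^{-1} - R^{-1}C\left(\Sigma_{s_n}^{-1} + C'R^{-1}C\right)^{-1}C'R^{-1}. \tag{4.29}$$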
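As a numerical illustration of Eq. 4.29 (a sketch only, not the code used in this work), the following Python/NumPy fragment compares the direct $q \times q$ inverse with the lemma form, which needs only a $p \times p$ inverse when $R$ is diagonal. The matrix sizes and random values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
q, p = 40, 3                               # hypothetical sizes: q outputs, p latent dimensions

C = rng.standard_normal((q, p))            # loading matrix
R = np.diag(rng.uniform(0.5, 2.0, q))      # diagonal noise covariance
A = rng.standard_normal((p, p))
Sigma = A @ A.T + np.eye(p)                # a valid (positive definite) latent covariance

# Direct route: invert the full q x q matrix.
direct = np.linalg.inv(R + C @ Sigma @ C.T)

# Lemma route (Eq. 4.29): R^{-1} is trivial for diagonal R,
# so the only general inverse left is p x p.
R_inv = np.diag(1.0 / np.diag(R))
inner = np.linalg.inv(np.linalg.inv(Sigma) + C.T @ R_inv @ C)
lemma = R_inv - R_inv @ C @ inner @ C.T @ R_inv

beta = Sigma @ C.T @ lemma                 # beta_{s_n} as defined after Eq. 4.28
max_abs_diff = np.max(np.abs(direct - lemma))
```

The saving matters when $q$ (the number of neural channels) is much larger than $p$ (the latent dimensionality), which is the regime factor analysis targets.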

For observation $n$, let $Q_n$ be the Gaussian posterior latent state distribution that has mean $\xi_n$


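and second moment $\Xi_n$:

$$\xi_n = E[x_n \mid y_n, s_n] = \mu_{s_n} + \beta_{s_n}(y_n - C\mu_{s_n}) \tag{4.30}$$

$$\begin{aligned}
\Xi_n &= E[x_n x_n' \mid y_n, s_n] \\
&= \operatorname{Var}(x_n \mid y_n, s_n) + E[x_n \mid y_n, s_n]\, E[x_n \mid y_n, s_n]' \\
&= \Sigma_{s_n} - \beta_{s_n}C\Sigma_{s_n} + \xi_n\xi_n'
\end{aligned} \tag{4.31}$$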

The expectation of the log joint likelihood for a given observation can be expressed as follows:

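$$\mathcal{E}_n = E_{Q_n}[\log P(x_n, y_n, s_n \mid \theta)] \tag{4.32}$$

$$= E_{Q_n}[\log P(y_n \mid x_n) + \log P(x_n \mid s_n) + \log P(s_n)] \tag{4.33}$$

$$\begin{aligned}
= E_{Q_n}\Big[ &-\tfrac{q}{2}\log(2\pi) - \tfrac{1}{2}\log|R| - \tfrac{1}{2}y_n'R^{-1}y_n + y_n'R^{-1}Cx_n - \tfrac{1}{2}x_n'C'R^{-1}Cx_n \\
&- \tfrac{p}{2}\log(2\pi) - \tfrac{1}{2}\log|\Sigma_{s_n}| - \tfrac{1}{2}x_n'\Sigma_{s_n}^{-1}x_n + \mu_{s_n}'\Sigma_{s_n}^{-1}x_n - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} \\
&+ \log P(s_n) \Big].
\end{aligned} \tag{4.34}$$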

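The terms that do not depend on $x_n$ or any component of $\theta$ can be grouped as a constant, $\mathcal{C}$, outside the expectation. Doing so, and simplifying further, we have

$$\begin{aligned}
\mathcal{E}_n &= E_{Q_n}\left[ y_n'R^{-1}Cx_n - \tfrac{1}{2}x_n'C'R^{-1}Cx_n + \mu_{s_n}'\Sigma_{s_n}^{-1}x_n - \tfrac{1}{2}x_n'\Sigma_{s_n}^{-1}x_n \right] \\
&\quad - \tfrac{1}{2}y_n'R^{-1}y_n - \tfrac{1}{2}\log|R| - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} - \tfrac{1}{2}\log|\Sigma_{s_n}| + \log P(s_n) + \mathcal{C} \\
&= y_n'R^{-1}C\,E_{Q_n}[x_n] - \tfrac{1}{2}\operatorname{Tr}\left(C'R^{-1}C\,E_{Q_n}[x_nx_n']\right) \\
&\quad + \mu_{s_n}'\Sigma_{s_n}^{-1}E_{Q_n}[x_n] - \tfrac{1}{2}\operatorname{Tr}\left(\Sigma_{s_n}^{-1}E_{Q_n}[x_nx_n']\right) \\
&\quad - \tfrac{1}{2}y_n'R^{-1}y_n - \tfrac{1}{2}\log|R| - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} - \tfrac{1}{2}\log|\Sigma_{s_n}| + \log P(s_n) + \mathcal{C} \\
&= y_n'R^{-1}C\xi_n - \tfrac{1}{2}\operatorname{Tr}\left(C'R^{-1}C\,\Xi_n\right) + \mu_{s_n}'\Sigma_{s_n}^{-1}\xi_n - \tfrac{1}{2}\operatorname{Tr}\left(\Sigma_{s_n}^{-1}\Xi_n\right) \\
&\quad - \tfrac{1}{2}y_n'R^{-1}y_n - \tfrac{1}{2}\log|R| - \tfrac{1}{2}\mu_{s_n}'\Sigma_{s_n}^{-1}\mu_{s_n} - \tfrac{1}{2}\log|\Sigma_{s_n}| + \log P(s_n) + \mathcal{C}.
\end{aligned} \tag{4.36}$$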


The expectation of the log joint likelihood over all of the $N$ observations is simply the sum of the individual $\mathcal{E}_n$ terms:

$$\mathcal{E} = E_Q[\log P(\{x\}, \{y\}, \{s\} \mid \theta)] = \sum_{n=1}^{N} \mathcal{E}_n.$$

4.6.2 M Step

The M step requires finding (learning) the $\theta_{k+1}$ that satisfies:

$$\theta_{k+1} = \underset{\theta}{\operatorname{argmax}}\; E_Q[\log P(\{x\}, \{y\}, \{s\} \mid \theta)]. \tag{4.37}$$

This can be achieved by differentiating $\mathcal{E}$ with respect to the parameters, $\theta$, as shown below. The indicator function, $I(s_n = s)$, will prove useful. Also, let $N_s = \sum_{n=1}^{N} I(s_n = s)$.

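• Prior probability of mixture component identification $s$:

$$\pi_s = \frac{1}{N}\sum_{n=1}^{N} I(s_n = s) \tag{4.38}$$

• State vector mean, for mixture component identification $s$:

$$\frac{\partial \mathcal{E}}{\partial \mu_s} = \sum_{n=1}^{N} I(s_n = s)\left[\Sigma_s^{-1}\xi_n - \Sigma_s^{-1}\mu_s\right] = 0$$

$$\mu_s = \frac{1}{N_s}\sum_{n=1}^{N} I(s_n = s)\,\xi_n \tag{4.39}$$

• State vector covariance, for mixture component identification $s$:

$$\begin{aligned}
\frac{\partial \mathcal{E}}{\partial \Sigma_s} &= \frac{\partial}{\partial \Sigma_s}\sum_{n=1}^{N} I(s_n = s)\left\{ -\tfrac{1}{2}\operatorname{Tr}\left(\Sigma_s^{-1}\Xi_n\right) + \mu_s'\Sigma_s^{-1}\xi_n - \tfrac{1}{2}\mu_s'\Sigma_s^{-1}\mu_s - \tfrac{1}{2}\log|\Sigma_s| \right\} \\
&= \sum_{n=1}^{N} I(s_n = s)\left( \Sigma_s^{-1}\left( \tfrac{1}{2}\Xi_n - \mu_s\xi_n' + \tfrac{1}{2}\mu_s\mu_s' \right)\Sigma_s^{-1} - \tfrac{1}{2}\Sigma_s^{-1} \right) = 0
\end{aligned}$$

$$\frac{N_s}{2}\Sigma_s^{-1} = \sum_{n=1}^{N} I(s_n = s)\,\Sigma_s^{-1}\left( \tfrac{1}{2}\Xi_n - \mu_s\xi_n' + \tfrac{1}{2}\mu_s\mu_s' \right)\Sigma_s^{-1}$$

$$\begin{aligned}
\Sigma_s &= \frac{1}{N_s}\sum_{n=1}^{N} I(s_n = s)\,\Xi_n - \frac{2}{N_s}\mu_s\sum_{n=1}^{N} I(s_n = s)\,\xi_n' + \frac{1}{N_s}\mu_s\mu_s'\sum_{n=1}^{N} I(s_n = s) \\
&= \frac{1}{N_s}\sum_{n=1}^{N} I(s_n = s)\,\Xi_n - \mu_s\mu_s'
\end{aligned} \tag{4.40}$$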


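• Loading matrix:

$$C^{\text{new}} = \left( \sum_{n=1}^{N} y_n\xi_n' \right)\left( \sum_{n=1}^{N} \Xi_n \right)^{-1} \tag{4.41}$$

• Noise matrix:

$$R^{\text{new}} = \frac{1}{N}\operatorname{diag}\left\{ \sum_{n=1}^{N}\left( y_ny_n' - C^{\text{new}}\xi_ny_n' \right) \right\} \tag{4.42}$$

where the diag operator sets all of the off-diagonal elements of a matrix to zero.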
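The E-step moments (Eqs. 4.30 and 4.31) and the M-step updates (Eqs. 4.38 to 4.42) can be sketched together in Python/NumPy as one EM iteration. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `em_iteration`, the array layouts, and the synthetic data are hypothetical, component labels run from 0 to M-1, and every component is assumed to have at least one observation.

```python
import numpy as np

def em_iteration(Y, s, mu, Sigma, C, R):
    """One EM pass. Y: (N, q) outputs; s: (N,) integer labels in 0..M-1;
    mu: (M, p); Sigma: (M, p, p); C: (q, p); R: (q, q) diagonal."""
    N, q = Y.shape
    M, p = mu.shape
    xi = np.zeros((N, p))       # posterior means, Eq. 4.30
    Xi = np.zeros((N, p, p))    # posterior second moments, Eq. 4.31
    for m in range(M):          # E step, one mixture component at a time
        idx = np.flatnonzero(s == m)
        beta = Sigma[m] @ C.T @ np.linalg.inv(R + C @ Sigma[m] @ C.T)
        post_cov = Sigma[m] - beta @ C @ Sigma[m]
        x = mu[m] + (Y[idx] - C @ mu[m]) @ beta.T
        xi[idx] = x
        Xi[idx] = post_cov + x[:, :, None] * x[:, None, :]
    # M step
    pi_new = np.bincount(s, minlength=M) / N                          # Eq. 4.38
    mu_new = np.stack([xi[s == m].mean(axis=0) for m in range(M)])    # Eq. 4.39
    Sigma_new = np.stack([Xi[s == m].mean(axis=0)
                          - np.outer(mu_new[m], mu_new[m])
                          for m in range(M)])                         # Eq. 4.40
    C_new = (Y.T @ xi) @ np.linalg.inv(Xi.sum(axis=0))                # Eq. 4.41
    R_new = np.diag(np.diag(Y.T @ Y - C_new @ (xi.T @ Y))) / N        # Eq. 4.42
    return pi_new, mu_new, Sigma_new, C_new, R_new

# Hypothetical synthetic inputs, only to exercise the function.
rng = np.random.default_rng(1)
N, q, p, M = 200, 6, 2, 2
s = rng.integers(0, M, size=N)
Y = rng.standard_normal((N, q))
mu0 = rng.standard_normal((M, p))
Sigma0 = np.stack([np.eye(p)] * M)
C0 = rng.standard_normal((q, p))
R0 = np.eye(q)
pi1, mu1, Sigma1, C1, R1 = em_iteration(Y, s, mu0, Sigma0, C0, R0)
```

In practice the iteration would be repeated until the likelihood stops improving, and the inverse inside the E step could use Eq. 4.29 instead of a direct $q \times q$ inversion.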

4.6.3 Inference

Once the model parameters have been chosen, the generative model can be used to make inferences on the training data or new observations. For the training data, the hidden state vector $x$ is the only variable that must be inferred. The posterior distribution of $x$ is a Gaussian, exactly as described previously. This yields a distribution $Q$ with mean $\mu_s + \beta_s(y_n - C\mu_s)$ and covariance $\Sigma_s - \beta_sC\Sigma_s$. Therefore, the maximum a posteriori estimate of $x$ is simply $\mu_s + \beta_s(y_n - C\mu_s)$.

When performing inference for a new observation, the mixture component identification, $s$, is now unknown. The posterior distributions of both $s$ and $x$, given the data, $y$, are of interest. The first of these distributions can be expressed as follows:

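$$P(s \mid y, \theta) \propto P(y \mid s, \theta)\,P(s \mid \theta)$$

$$\propto \frac{\pi_s}{|C\Sigma_sC' + R|^{1/2}}\exp\left[ -\tfrac{1}{2}(y - C\mu_s)'(C\Sigma_sC' + R)^{-1}(y - C\mu_s) \right]. \tag{4.43}$$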

To infer x given the data, the following derivation applies:

$$P(x \mid y, \theta) \propto \sum_{s=1}^{M} P(x \mid y, s, \theta)\,P(s \mid y, \theta), \tag{4.44}$$

where the first factor in the summation is the conditional Gaussian (see 4.28) and the second is a weighting as shown above. Simply put, the distribution of x given y (but not conditioned on s) is a mixture of Gaussians.
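The inference for a new observation can be sketched as follows. This is an illustrative Python/NumPy fragment, not the code used in this work: the function name `infer` and the toy parameters are hypothetical, components are indexed from 0, and the weights of Eq. 4.43 are evaluated in the log domain and normalized to sum to one.

```python
import numpy as np

def infer(y, pi, mu, Sigma, C, R):
    """Return P(s | y) (Eq. 4.43, normalized) and the mean of the
    mixture-of-Gaussians posterior over x (Eq. 4.44)."""
    M = len(pi)
    log_w = np.zeros(M)
    cond_means = []
    for m in range(M):
        S = C @ Sigma[m] @ C.T + R              # covariance of y given s = m
        resid = y - C @ mu[m]
        _, logdet = np.linalg.slogdet(S)
        log_w[m] = (np.log(pi[m]) - 0.5 * logdet
                    - 0.5 * resid @ np.linalg.solve(S, resid))
        beta = Sigma[m] @ C.T @ np.linalg.inv(S)
        cond_means.append(mu[m] + beta @ resid)  # conditional mean, Eq. 4.28
    log_w -= log_w.max()                         # stabilize before exponentiating
    post_s = np.exp(log_w) / np.exp(log_w).sum()
    x_mean = sum(w * cm for w, cm in zip(post_s, cond_means))
    return post_s, x_mean

# Hypothetical two-component model; y is placed exactly at the mean of component 1.
rng = np.random.default_rng(2)
C = rng.standard_normal((5, 2))
R = 0.5 * np.eye(5)
pi = np.array([0.5, 0.5])
mu = np.array([[0.0, 0.0], [3.0, 3.0]])
Sigma = np.stack([np.eye(2), np.eye(2)])
post_s, x_mean = infer(C @ mu[1], pi, mu, Sigma, C, R)
```

For a classification decision, one would pick the component with the largest entry of `post_s`.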

Chapter 5

HermesB

5.1 Overview

In the previous chapters, we have shown how chronically implanted electrode arrays have enabled a broad range of advances in basic electrophysiology and neural prostheses. Those successes motivate new experiments, particularly the development of prototype implantable prosthetic processors for continuous use in freely behaving subjects, both monkeys and humans. However, traditional experimental techniques require the subject to be restrained, limiting both the types and duration of experiments. In this chapter, we present a dual-channel, battery-powered neural recording system with an integrated 3-axis accelerometer for use with chronically implanted electrode arrays in freely behaving primates. The recording system, called HermesB, is self-contained, autonomous, programmable, and capable of recording broadband neural data (sampled at 30 kS/s) and acceleration data to a removable compact flash card for up to 48 hours. We have collected long-duration datasets with HermesB from an adult macaque monkey, which provide insight into timescales and free behaviors inaccessible under traditional experiments. Variations in action potential shape and RMS noise are observed across a range of timescales. The peak-to-peak voltage of action potentials varied by up to 30% over 24 hours, including step changes in waveform amplitude (up to 25%) coincident with high-acceleration movements of the head. These initial results suggest that spike-sorting algorithms can no longer assume stable neural signals and will need to transition to adaptive signal processing methodologies to maximize performance. During physically active periods (defined by the head-mounted accelerometer), we observed significantly reduced 5-25 Hz local field potential (LFP) power and increased firing rate variability.
Using a threshold fit to LFP power, 93% of 403 five-minute recording blocks were correctly classified as active or inactive, potentially providing an efficient tool for identifying different behavioral contexts in prosthetic applications. These results demonstrate the utility of HermesB, and motivate using this type of system to advance neural prosthetics and electrophysiological experiments.


5.2 Background

The development of chronically implantable electrode arrays for in vivo neural recording in primates (both monkeys and humans) has enabled a range of advances in neural prostheses (Serruya et al. 2002; Taylor et al. 2002; Carmena et al. 2003; Musallam et al. 2004; Santhanam et al. 2006b; Hochberg et al. 2006) and basic electrophysiology experiments (Maynard et al. 1997, 1999; Hatsopoulos et al. 2004). However, most current state-of-the-art experimental systems require the animal to be restrained, restricting both the types and duration of experiments. As a result, there is limited data available with which to characterize both the nature and content of neural recordings over the broader range of timescales and free behaviors relevant to future prosthetic and electrophysiology experiments. To make the transition to new experimental paradigms possible, continuous, long-duration, broadband (sampled at 30 kS/s) neural recordings from freely behaving subjects are needed. These datasets will enable validation of spike discrimination and decoding algorithm performance in freely behaving subjects, multi-day plasticity and learning experiments, determination of neural correlates of free behaviors, and direct measurement of the stability of neural recordings. Here, we present results, using data collected with HermesB, addressing the latter two questions to demonstrate the utility of long-duration recording from freely behaving subjects.

Recording stability is a critical issue for neural prosthetic systems. Here we define recording stability, or more specifically, recording instability, as the change in the gross presence or absence of neural signals on an electrode, time-varying fluctuations of the observed action potential shape, and time-varying fluctuations in the background noise process on an electrode. Neural recordings during any given session are considered to be quasi-stable; there is usually very little change in the number of neurons recorded and their action potential shapes during a several-hour recording session. However, recording instability has been observed between sessions, likely resulting from the subjects freely behaving in the housing room between sessions (Suner et al. 2005). Long-duration datasets will enable us to reconcile the current assumption of quasi-stable neural signals during a highly controlled experimental session with the variation in the neural signals observed between sessions.

Figure 5.1 summarizes the significant timescales in the life of a chronically implanted electrode array. We are normally only concerned with neural recording stability in the high-yield recording period during which most experiments are conducted (Schwartz 2004). Within this window, neural interface systems are potentially affected by recording instability at all three timescales (short, intermediate, and long). However, current experiments, with their discrete daily recording periods, are only able to characterize variations on timescales of less than a few hours and across days. Past studies have only characterized neural recording stability on short (seconds or minutes; Fee et al. 1996; Suner et al. 2005) and long timescales (days; Williams et al. 1999; Suner et al. 2005; Liu et al. 2006). Over very short timescales, variations in action potential waveform shape


[Figure 5.1 diagram labels: array recording lifetime (fade in, ~3 weeks; high yield, 6 months to 1 year; fade out); timescales (short, 1 ms to 1 min; intermediate, 1 min to 1 day; long, 1 day and longer); current stability characterization; newly available characterization.]

Figure 5.1: Summary of array lifetime and available data for recording from individual, identifiable neurons using a chronically implanted electrode array.

are a function of the short-term spiking frequency of a neuron (Fee et al. 1996); at high frequencies the waveform is typically broader (in time) and decreased in amplitude due to depletion of ion gradients in and around a highly active neuron. At longer timescales, the variation in spike waveform is not as systematic, potentially arising from a number of mechanisms such as neural plasticity, physical movement of the electrode relative to nearby neurons, chemical degradation of the electrode tip, or immunological reactions to the implant (Lewicki 1998; Schwartz 2004). Studying neural stability at intermediate timescales will enable characterization (along with existing short and long timescale data) of the full range of timescales relevant to a neural interface system and may also provide insight into long timescale phenomena.

Experimental protocols in which the subject is restrained limit the types of behaviors that can be observed. Long-duration datasets recorded during free behavior provide neural data associated with a broader range of behaviors than traditionally possible. To maximize system performance, prostheses must be sensitive to behavioral and neural changes across the day and must react robustly in the face of variable background conditions. For example, such systems should reliably detect different behavioral contexts, such as whether the user is awake or asleep, or intending to be active or not. If a neural prosthetic attempts to decode the user's intentions during sleep, it may waste battery power or cause undesired behaviors. Alternatively, if such a system does not reliably detect waking periods, the user may lose the ability to interact with the world. The ability to record neural activity across a variety of different behaviors and contexts will allow for characterization of the true neural environment in which chronic implantable systems will operate.

Long-duration datasets are of considerable interest for certain multi-day electrophysiology experiments. Chronically implanted electrode arrays can support multi-day learning or plasticity experiments. However, because the period between traditional daily recording periods is unobserved, there is no reliable method to track single neurons over multiple days. Recording systems for freely behaving subjects can allow researchers to record while the animal is in its home cage,


providing continuous monitoring of neurons identified during an active experiment. Without such monitoring, it is not possible to certify that the same neuron is being observed day-to-day and thereby reliably state that the adaptation is not the result of recording instability in the system.

Recording systems have been developed for freely behaving animals (Vyssotski et al. 2006; Mavoori et al. 2005; Obeid et al. 2004). However, these systems often have one or more of the following limitations: 1) they cannot sample at full broadband (30 kS/s), potentially missing relevant signal features, 2) their battery life or storage capacity is limited to a few hours or less for broadband recording, 3) they cannot switch recording parameters, such as input channel, autonomously, limiting the range of possible experiments, and 4) they are not designed or tested for portable use with primates.

Here, we describe the first generation of a portable recording system, dubbed HermesB as a moniker for “Hours of Electrophysiological Recordings in Monkey with an Extensible System, Version B.” HermesB addresses the limitations of previous systems by providing a full broadband, long-duration, autonomous recording platform for use with chronically implanted electrode arrays in primates. An extensible system, HermesB can easily evolve to include new components such as experimental analog front ends (e.g., Harrison et al. 2006), making HermesB a useful prototyping platform as well. Importantly, the system interfaces (although not exclusively so) with the popular 96-channel electrode array manufactured by Cyberkinetics Neurotechnology Systems, Inc. (CKI). This implant has been adopted by many electrophysiology research laboratories, is now FDA approved, and is in clinical trials with humans (Hochberg et al. 2006). Understanding the characteristics of this array and the stability of signals recorded from it can provide great benefit for translating the technology to the clinical setting.

To demonstrate the utility of HermesB, we present preliminary results derived from multi-day broadband recordings from a freely behaving macaque monkey, which provide insight into previously unobserved timescales and behavioral contexts. A macaque was chosen as it is generally accepted as the ideal animal model for researching neural prostheses for humans (Isaacs et al. 2000; Wessberg et al. 2000; Serruya et al. 2002; Taylor et al. 2002; Shenoy et al. 2003; Carmena et al. 2003; Musallam et al. 2004; Santhanam et al. 2006b). In particular, we present data quantifying the stability of neural recordings over timescales of 5 min to 54 hours. We address three aspects of recording stability identified by Lewicki (1998): the change in mean waveform shape over time, changes in the background noise process, and changes in the waveform shape due to electrode movement. We illustrate the ability to identify contextual periods in our long-duration neural recordings, paying specific attention to identifying and understanding systematic differences in firing rate and local field potential (LFP) during active and inactive periods.


5.3 Methods

5.3.1 System Description

HermesB is composed of three separate components, as shown in Fig. 5.2. First is the array connector, a custom-designed, low-profile 96-pin zero insertion force (ZIF) connector. Next, there is the analog signal conditioning pathway, consisting of a printed circuit board (PCB) with amplifiers and filters. Finally, there is the digital signal acquisition unit, comprised of a separate PCB that includes a microcontroller, accelerometer, and compact flash interface. Data is stored on a high-capacity non-volatile compact flash (CF) card, which is periodically removed and downloaded to a PC. The system is powered by a pair of high-efficiency, rechargeable cell phone batteries and is entirely housed in a protective and electrically shielded casing attached to the monkey’s skull. Table 5.1 summarizes system parameters.

[Figure 5.2 block diagram labels: Neuroport; Analog Board (8:1 multiplexers, high pass, low pass); Digital Board (ARM core microcontroller, ADC, accelerometer, compact flash).]

Figure 5.2: HermesB block diagram. The neuroport is a custom 96-channel zero insertion force connector which mates to the electrode array connector. The analog signal conditioning and digitization and storage are implemented on separate circuit boards to reduce noise and provide modularity.

HermesB is architected to be a flexible and extensible experimental platform. The modular construction allows new components, such as experimental analog front ends (Harrison et al. 2006) or neural decoding backends, to be incorporated into the system without extensive redesign. Additional ADC channels are available to support new analog data sources, such as chronically implanted electromyogram (EMG) electrodes. The commercial-off-the-shelf (COTS) CF interface leverages increasing Type I card capacity without redesign or remanufacturing.

Although capable of interfacing with any electrode array, the current HermesB was designed to work with the 96-channel chronic electrode array manufactured by CKI. The array is wired to a CerePort™ connector pedestal. A custom low-profile ZIF connector was developed to mate to the pedestal. The new connector is comprised of a mechanical component, which allows access to all 96 electrodes, and a PCB interface that provides access to a subset of 32 electrodes. Three different PCBs were manufactured and can be interchanged manually to switch between each bank of 32 electrodes.

The analog signal conditioning path is illustrated in the upper dashed box of Fig. 5.2. The


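Table 5.1: HermesB Parameters

| Category | Parameter | Value |
|---|---|---|
| Interface Capabilities | Simultaneous active channels | 2 |
| Interface Capabilities | Programmably accessible channels | 16 |
| Interface Capabilities | Connector accessible channels | 96 |
| Interface Capabilities | 3-axis accelerometer range | ±6 g |
| Interface Capabilities | Storage | currently 6 GB |
| Physical Parameters | Enclosure size | 60 x 70 x 45 mm |
| Physical Parameters | Enclosure mass | 127 g |
| Physical Parameters | Electronics mass including batteries | 77 g |
| Physical Parameters | Neuroport mass | 16 g |
| Physical Parameters | Grand total mass | 220 g |
| Signal Conditioning Parameters | High pass filter (-3 dB) | < 0.5 Hz |
| Signal Conditioning Parameters | Low pass filter (-3 dB) | 7.4 kHz |
| Signal Conditioning Parameters | Neural sampling rate | 30 kSamples/s |
| Signal Conditioning Parameters | Accelerometer sampling rate | 1 kSamples/s |
| Signal Conditioning Parameters | ADC precision | 12 bits |
| Battery Parameters | Battery capacity | 1600 mAh |
| Battery Parameters | Typical battery life at 67% recording duty cycle | 19 hrs |
| Measured Circuit Parameters | Input referred noise | 3.5 µV RMS |
| Measured Circuit Parameters | Input referred precision | 1 µV per LSB |
| Measured Circuit Parameters | Amplifier gain | ~600x |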


analog board has a 16-channel input connector and can be mechanically bridged to one of two outputs of the 32-channel HermesB ZIF connector. First, all 16 input channels undergo impedance conversion using a CMOS op-amp (Texas Instruments TLC2254) in a unity gain configuration. The desired channels are then digitally selected using two 8:1 analog multiplexers (Analog Devices ADG658). From here, two identical signal paths are provided to amplify and filter two of the 16 input channels. The selected signals are high-pass filtered to remove electrode DC bias, then amplified with a differential instrumentation amplifier (Texas Instruments INA121). Three path-matched references are provided (two reference signals and analog ground), selectable via a jumper. Each reference signal corresponds to a platinum-iridium reference wire that accompanies the electrode array and provides an electrical reference local to the implantation site. The amplified signal is further amplified and low-pass filtered (Texas Instruments OPA2344) before being passed to the digital board. The positive and negative voltage supplies are provided by the digital board.

The digital module is depicted in the lower dashed box of Fig. 5.2. An ARM microcontroller (Analog Devices ADUC2106) is responsible for system control, digitization of the neural and accelerometer signals, and management of the CF card. The analog signals are digitized by a 12-bit successive approximation ADC integrated into the microcontroller. Data packets are buffered using the internal memory of the microcontroller and written to the CF card. The 3-axis accelerometer (ST Microsystems STM9321) is mounted on the digital board to measure the subject’s head movement. The digital module includes the necessary positive and negative voltage regulators for both the digital circuitry and the analog module. The negative and positive supply voltages are provided by separate batteries (negative: Varta EasyPack, 43.5x35.4x5.8 mm, 14 grams, 3.7 V; positive: LG Chem ICP633450A1, 49.0x33.6x6.8 mm, 24.3 grams, 3.7 V).

The entire system, including batteries, is housed in a lightweight protective aluminum case, shown in Fig. 5.3a,b, secured with methyl methacrylate, which was in turn secured to the skull. The case encapsulates all of the electronics, batteries, and neuroport connector. The enclosure was sealed with a watertight gasket and was also electrically connected to the monkey by way of standard grounding hardware so as to provide electromagnetic (EM) shielding for the electronics contained inside. Figure 5.3c,d,e provides photographs of the connector, analog module, and digital module. Figure 5.3a includes a schematic of the tight packing of the components into the protective shell. We used non-conductive foam to fill any open space; this ensured only very little, if any, vibration inside the shell. Hence, we can safely state that our accelerometer records head motion, and not any residual board vibrations. Furthermore, the weight of our system (220 g grand total; Table 5.1) was light enough that no behavioral differences were observed in the animal, and the accelerometer data we collect represents natural behavior.

HermesB is controlled by custom firmware. The firmware includes a basic command interpreter that allows the user to interact with the system in real time when tethered to a portable laptop computer (via an RS232 serial port), as well as write simple sequencing programs for fully


[Figure 5.3 illustration labels: screw, protective shield, gasket, digital board, analog board, batteries, methyl methacrylate, silicone elastomer, chronic electrode array, white matter, accelerometer axes. Illustrations not to scale.]

Figure 5.3: HermesB components. a. Illustration of enclosure mounted on monkey’s head along with side profile showing the stack-up of the various components. The space labeled ICS denotes the intracranial space between the dura and skull. This space is larger in humans compared to monkeys and may be a source of greater recording variability when electrode-based systems transition to the clinical domain. b. Aluminum enclosure with centimeter ruler. c. Custom low-profile neuroport connector. d. Digital board. e. Analog board.

autonomous execution. A sample program is shown in Fig. 5.4. The system is highly configurable. Parameters such as neural sampling rate and accelerometer sampling rate can be initially set to balance sampling precision against data storage capacity. The experimenter can then specify a sequence of epochs, each either a data sampling period or quiescent sleep period, to balance between recording duration and battery lifetime.

5.3.2 Recordings and Analyses

Primary data for this report was collected from an adult, female macaque monkey (monkey D) freely moving in a home cage. All experiments and procedures were approved by the Stanford University Institutional Animal Care and Use Committee (IACUC). We performed a sterile surgery to implant a head restraint system. At this time, we also implanted a silicon 96-electrode array. The electrode array (Cyberkinetics, Foxborough, MA) was implanted in a region spanning the arm representation of the dorsal aspect of pre-motor cortex (PMd) and primary motor cortex (M1), as estimated visually from local anatomical landmarks. Surgical methods are very similar to those described in Hatsopoulos et al. (2004). HermesB was used to record starting in August 2005. A number of recording profiles were used. One profile consists of recording at a 67% duty cycle (5 minutes of recording followed by 2.5 minutes of sleep). Total experiment duration is approximately 54 hours, broken up into three 18-hour sessions. The recording-sleeping duty cycling is a compromise between memory capacity


    % Set up the sampling frequency to be 30 kHz
    % on the neural channels and sample the
    % accelerometer once every 30 neural samples.
    %
    % Initial 600 sec sleep period followed
    % by loop of 300 sec. of recording from
    % channels 4 & 6 and 150 sec. of sleep
    % and loop indefinitely.
    neuralfreq 30000     % Line 0
    accelperiod 30       % Line 1
    addsleep 600         % Line 2
    addsample 4 6 300    % Line 3
    addsleep 150         % Line 4
    addloop 3            % Line 5

Figure 5.4: Sample program for autonomous execution. The initial sleep period is added to allow the experimenters sufficient time to close up the protective enclosure before recording commences.
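To make the trade-off in the sample program concrete, the following Python sketch estimates the duty cycle and data volume it implies. It is an illustration only and is not the HermesB firmware: the command semantics (line numbers starting at 0, `addloop N` jumping back to line N indefinitely) are inferred from the figure comments, and the assumption that 12-bit samples are stored tightly packed with no packet overhead is hypothetical. The initial sleep period is ignored.

```python
PROGRAM = """
neuralfreq 30000
accelperiod 30
addsleep 600
addsample 4 6 300
addsleep 150
addloop 3
"""

def summarize(program, session_hours=18.0, bits_per_sample=12, accel_axes=3):
    """Estimate duty cycle, bytes per recorded second, and bytes per session.

    Assumptions (not from the firmware): samples are packed at
    bits_per_sample with no overhead, and the loop body runs for the
    whole session.
    """
    cmds = [line.split() for line in program.strip().splitlines()]
    neural_hz = int(next(c[1] for c in cmds if c[0] == "neuralfreq"))
    accel_period = int(next(c[1] for c in cmds if c[0] == "accelperiod"))
    loop_at = next(i for i, c in enumerate(cmds) if c[0] == "addloop")
    body = cmds[int(cmds[loop_at][1]):loop_at]        # commands repeated forever
    record_s = sum(int(c[-1]) for c in body if c[0] == "addsample")
    sleep_s = sum(int(c[1]) for c in body if c[0] == "addsleep")
    n_channels = max(len(c) - 2 for c in body if c[0] == "addsample")
    duty = record_s / (record_s + sleep_s)
    samples_per_s = n_channels * neural_hz + accel_axes * neural_hz / accel_period
    bytes_per_s = samples_per_s * bits_per_sample / 8
    session_bytes = session_hours * 3600 * duty * bytes_per_s
    return duty, bytes_per_s, session_bytes

duty, bytes_per_s, session_bytes = summarize(PROGRAM)
```

Under these assumptions the loop gives a duty cycle of two thirds and roughly 4.1 GB over an 18-hour session, which is of the same order as the ~4 GB per session reported for this recording profile.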

and battery life constraints.¹ Between sessions, the monkey was transferred from the home cage to the training chair to replace the battery and download the ~4 GB of recorded data. During these “pit stops,” recording was continued with a second, smaller CF card and a new battery to maintain dataset continuity. Other profiles include round-robin recording of 4 to 8 channels over a 24-hour schedule.

Two neural channels were recorded per dataset in full broadband (0.5 Hz to 7.5 kHz at 30 kSamples/s with 12-bit resolution), and a 3-axis accelerometer fixed to the monkey’s head was sampled (1 kSamples/s with 12-bit resolution) and stored to compact flash. Accelerometer data from each five-minute data block was used to label the blocks as either “active,” “inactive,” or “mixed.” Blocks in which the maximum accelerometer magnitude (MAM) was greater than 1.25 g were labeled active, blocks in which the MAM was less than 1.15 g were labeled inactive, and blocks that were within these bounds were labeled mixed. These thresholds were selected to roughly balance the number of active and inactive blocks to a ratio similar to that of day (lights on) versus night (lights off) blocks (as we expect low activity when the lights are off), while retaining a 0.1 g margin between classifications.

The recorded neural signals from each five-minute block were post-processed with the Sahani spike-sorting algorithm, which is an unsupervised clustering algorithm as described in Chapter 2 and by Sahani (1999), and further analyzed by Zumsteg et al. (2005). Spike times were identified using a threshold determined from data across the block (3σ with respect to the RMS noise estimate from filtered data). As described earlier in Chapter 2, a spike waveform, or snippet, comprised of a 32-sample window around the threshold event, was extracted and aligned to its center of mass. Snippets were projected into a 4-dimensional robust, noise-whitened principal components space (NWrPCA) and clustered using a maximum a posteriori (MAP) clustering technique.
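The accelerometer-based block labeling described above can be sketched in a few lines of Python. This is an illustration rather than the analysis code used here; in particular, it assumes that the accelerometer magnitude is the Euclidean norm of the three axes with gravity included (consistent with thresholds just above 1 g), and the function name `label_block` is hypothetical.

```python
import math

ACTIVE_G = 1.25     # MAM above this: "active" (threshold from the text)
INACTIVE_G = 1.15   # MAM below this: "inactive" (threshold from the text)

def label_block(samples):
    """Label one five-minute block from (ax, ay, az) samples in units of g.

    Assumption: magnitude is the Euclidean norm of the three axes,
    gravity included.
    """
    mam = max(math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in samples)
    if mam > ACTIVE_G:
        return "active"
    if mam < INACTIVE_G:
        return "inactive"
    return "mixed"

quiet = label_block([(0.0, 0.0, 1.0), (0.1, 0.0, 1.0)])
moving = label_block([(0.0, 0.0, 1.0), (0.8, 0.5, 1.0)])
between = label_block([(0.0, 0.0, 1.2)])
```

Because the label depends only on the block maximum, a single brief movement is enough to mark an otherwise quiet block as active, which is one reason a margin between the two thresholds is useful.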

¹ When recording continuously, the current memory capacity can be quickly exhausted. At very low duty cycles, the battery is discharged by the static power consumption before the CF card is full, despite sleeping the microcontroller in between recording periods.


Well-isolated neural units were identified and cross-referenced across blocks by hand. For LFP analyses, broadband data was filtered by applying Chebyshev Type I lowpass and bandpass filters with a passband ripple of 1 dB. Power spectral density estimates were calculated using the Welch periodogram method.

5.3.3 Recording Stability Analyses

To quantify the stability and consistency of waveforms recorded from our electrode array, we analyzed data from our long-duration recordings in several ways. First, snippets from the entire session were extracted using a 3σ threshold and projected into a single 2-dimensional principal components subspace. By graphing a 2D histogram of the snippets in this subspace, snippets with similar waveform shapes are grouped into distinct clusters. Movement of these clusters across the session indicates drift in the waveforms. The magnitude of the shifts was assessed by examining the actual waveform shapes over these periods of interest. Second, to observe more continuous shifts in waveform shape, we chose a feature of the average waveform shape, the peak-to-peak voltage ($V_{pp}$), and plotted this quantity over the course of the recording session. The $V_{pp}$ was determined on a block-by-block basis by using the Sahani algorithm per block, providing local estimates of the average waveform shapes. Lastly, to search for potentially abrupt changes in waveform shape, the neural recordings were analyzed in conjunction with the accelerometer data. An abrupt change in electrode array position in the cortex would presumably manifest itself as an abrupt change in waveform amplitude, as the neuron-electrode distance would change. If such changes do occur, we additionally presume they are correlated with high-acceleration events such as vigorous head movement. Therefore, we examined the neural recordings straddling high-acceleration events (>3 g threshold) and examined the $V_{pp}$ metric around these events. To help search for events of interest, we computed the local change of the $V_{pp}$ metric ($V_{pp}^{\text{after}}/V_{pp}^{\text{before}}$), constructed from 200 snippets before and 200 snippets after the acceleration event. This allowed us to narrow in on high-acceleration events that coincided with large shifts in action potential waveform shape.
Single neural units were used for these analyses to observe the recording stability from our chronic implant. One important concern is that if a unit is automatically identified by the spike sorter, large changes in the unit’s waveform shape could cause the unit to no longer be classified correctly, thereby obscuring the analyses. Thus, we separately examined the NWrPCA projections of the selected units to ensure that snippets were neither ignored nor improperly included. This was accomplished by ensuring that all units included in the aforementioned stability analyses were well-isolated, high-firing-rate single neurons, sufficiently distinct from other signals on their respective electrodes that reasonably large variations would not result in a high rate of misclassification.
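The local change in Vpp around an acceleration event can be sketched as follows. This is a minimal illustration, not the analysis code used for the results: it assumes that spike times are sorted, that each snippet is a list of voltage samples, and that the ratio is formed from the Vpp of the mean waveform on each side of the event. The function names are hypothetical.

```python
def mean_waveform(snippets):
    """Average a list of equal-length snippets sample by sample."""
    count = len(snippets)
    return [sum(s[i] for s in snippets) / count for i in range(len(snippets[0]))]


def peak_to_peak(waveform):
    """Peak-to-peak voltage (Vpp) of a single waveform."""
    return max(waveform) - min(waveform)


def local_vpp_ratio(snippets, spike_times, event_time, n=200):
    """Vpp of the mean of the n snippets after the event divided by
    Vpp of the mean of the n snippets before it.

    Assumes spike_times is sorted and aligned with snippets.
    """
    before = [s for s, t in zip(snippets, spike_times) if t < event_time][-n:]
    after = [s for s, t in zip(snippets, spike_times) if t >= event_time][:n]
    if not before or not after:
        raise ValueError("need spikes on both sides of the event")
    return peak_to_peak(mean_waveform(after)) / peak_to_peak(mean_waveform(before))
```

A ratio near 1 indicates little change across the event; values well above or below 1 flag events worth inspecting by hand.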


5.4 Results

5.4.1 System Verification

Figure 5.5 shows example data recorded from our animal subject freely moving in her home cage. The top traces, Fig. 5.5a, show the three-axis acceleration measurements of the monkey’s head over a 10 second period. This data segment was recorded in the early evening during a period in which the monkey was quite active. Figure 5.5b shows 100 ms of broadband neural data recorded from a single channel on the electrode array. The LFP (local field potential) is easily visible, as are a number of spikes “riding” on top of the LFP. Figure 5.5c shows the same data segment filtered with a 250 Hz high pass IIR filter, which is the same filter used when spike sorting for our other HermesB analyses.


Figure 5.5: Sample neural and accelerometer data recorded from a freely behaving monkey. a. Accelerometer channels, x (blue), y (green), and z (red). The DC levels on the channels are due to the particular orientation of the accelerometer with respect to Earth’s gravity vector. b. Unfiltered broadband neural data taken from the middle of the recording period. c. Filtered broadband neural data.

Datasets like that shown in Fig. 5.5 were used as part of a three-step verification process to ensure the accuracy of HermesB recordings. The steps were 1) measure HermesB circuit parameters, 2) compare recordings of the CKI Neural Simulator made with HermesB and our standard laboratory recording system (CKI Cerebus System), and 3) compare HermesB recordings of neural activity in a rhesus monkey to recordings made by the fixed laboratory system. The measured circuit parameters are summarized in Table 5.1. The input referred noise, measured with grounded inputs, is comparable to or better than current state-of-the-art commercial (CKI Cerebus System) and research systems (Harrison et al. 2006). The CKI Neural Simulator is a playback device that simulates 128 channels of neural signals at the amplitude of array recordings


(e.g., maximum of ~500 µV peak-to-peak) and similar output impedance to a standard electrode array. Figure 5.6a shows a side-by-side comparison of Neural Simulator recordings made with the CKI Cerebus system (left) and with HermesB (right). The three spike waveforms are clearly visible, with comparable levels of noise (measured as the spread of the curves) between the two systems. Figure 5.6b shows a similar comparison for a channel from the electrode array, recorded from a monkey sitting quietly in a primate chair. The figure shows the 10th-90th percentile in amplitude of action potential waveforms recorded from a single channel on the electrode array.

Figure 5.6: Comparison of snippets recorded with CKI Cerebus system (left) and HermesB (right). a. Snippets recorded from CKI Neural Simulator. b. Snippets from four neurons recorded from a single electrode channel in a monkey comfortably in a chair with head restrained. Snippets have been sorted and the 10th-90th percentile in amplitude indicated by the colored region for each waveform.

A five-minute recording was sorted using the Sahani algorithm, which classified the spikes as belonging to one of four separable units (indicated by different coloring). The spike snippets were projected into a lower dimensional subspace to verify that they originated from separable clusters (data not shown). The waveforms are very similar between the two systems, indicating that HermesB is comparable to current state-of-the-art commercial laboratory equipment. Furthermore, the ability of HermesB to distinguish between several units on a single electrode builds confidence that this apparatus can serve to address the scientific goals posed earlier.

5.4.2 Recording Stability

Figure 5.7 shows neural recordings made over the course of 48 hours in October 2005. Figure 5.7a shows a time series of NWrPCA cluster plots for five-minute data segments recorded at the times shown. Each cluster corresponds to a single neuron, and the movement (drift) of the relative distance between these clusters is readily seen by scanning across the snapshots. The drift of


the clusters in NWrPCA space reflects changes in spike waveform shape. Figure 5.7b shows action potential shapes (voltage vs. time) from the same recording period. The colored region indicates the 10th-90th percentile in amplitude. The lines of constant voltage provide a reference against which one can see the large changes in waveform amplitude. These changes in action potential shape have been previously observed across once-daily recordings (Suner et al. 2005). Here, preliminary results from these continuous neural recordings of a freely behaving primate indicate substantial variation in spike waveforms over intermediate timescales as well.

[Figure 5.7 panel labels. a: 17:44 Day 1, 20:24 Day 1, 01:44 Day 2, 07:04 Day 2, 12:24 Day 2, 17:44 Day 2, 23:04 Day 2, 01:44 Day 3. b: 17:44 Day 1, 01:44 Day 2, 07:04 Day 2, 23:04 Day 2; voltage axis labels 100 µV and -150 µV.]

Figure 5.7: Neural recordings over a period of 48 hours (dataset D20051008). a. Histogram of spike waveform projections into a fixed 2D NWrPCA space. PCA space determined using 20,000 snippets uniformly selected across the time period. Each plot is the projection of 5 minutes of data recorded from a single channel at the time shown. The green and blue circles denote identifiable single neurons that are analyzed in the bottom panel. b. Spike waveforms of two neurons for selected five-minute blocks. To better isolate the selected units, spike sorting was performed strictly within a block and irrespective of the data in other blocks. Colored region indicates 10th-90th percentile in amplitude. Horizontal lines indicate maximum and minimum voltage for each unit. Waveforms shown are recorded from a single channel using the same signal conditioning path. Note that between 17:44 (day 1) and 07:04 (day 2) Vpp, the peak-to-peak voltage, of the green waveform increases, while Vpp of the blue waveform decreases, showing that waveform changes cannot be attributed to fluctuations in the signal conditioning pathway (connectorization, amplifiers, ADC, battery power, etc.).

Figure 5.8 shows a more continuous representation of the waveform changes over time. Panel c shows the normalized peak-to-peak voltage for the neuron identified in panel a, recorded from a single channel over a 54-hour period. The normalized Vpp is simply the mean Vpp for each block, normalized by Vpp of the mean waveform for that neuron across the entire 54-hour dataset. Variability in waveform amplitude, up to 30% relative to the mean, is observed over a range of timescales. There is a clear variation on the order of a single block (5 minutes of recording with 2.5 minutes of sleep) as well as changes on the order of several blocks, and even several hours.
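The normalization described above can be written out in a few lines. The sketch below is illustrative only: it assumes each block is a list of sorted snippets for one neuron, each snippet a list of voltage samples, and the function names are hypothetical.

```python
def mean_waveform(snippets):
    """Average a list of equal-length snippets sample by sample."""
    count = len(snippets)
    return [sum(s[i] for s in snippets) / count for i in range(len(snippets[0]))]


def peak_to_peak(waveform):
    """Peak-to-peak voltage (Vpp) of a single waveform."""
    return max(waveform) - min(waveform)


def normalized_vpp(blocks):
    """Mean Vpp of each block divided by the Vpp of the mean waveform
    computed over every snippet in the dataset."""
    all_snippets = [s for block in blocks for s in block]
    reference = peak_to_peak(mean_waveform(all_snippets))
    return [
        sum(peak_to_peak(s) for s in block) / len(block) / reference
        for block in blocks
    ]
```

With this definition, a value of 1.3 for a block corresponds to a mean amplitude 30% above the session-wide reference.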



Figure 5.8: Variation in Vpp and RMS. a,b. Histogram of spike waveform projections into NWrPCA space from two different electrodes recorded for different 54-hour datasets (D20060302.ch2 & D20060225.ch1). Selected neurons are indicated by arrows. These neurons were well-isolated. c. Normalized Vpp of neuron #1 recorded over the 54-hour session. d. RMS noise of recorded channels over same period. e,f. Same two plots for the second dataset. The wide light gray regions indicate night, and the thin pink regions indicate “pit stops,” when the monkey was taken from the home cage and placed in a primate chair to service the recording equipment.

Figure 5.8d shows the RMS voltage of filtered neural recordings from three channels recorded over five-minute blocks. All spikes, identified with thresholding at 3σ of RMS noise, have been removed from the dataset prior to the RMS calculation shown. Without the spikes, the RMS value should offer a better measure of the true background noise process (Watkins et al. 2004). Even after removing identifiable spikes, the RMS noise is highly correlated with neural activity (as measured by mean firing rate). These variations (~5 µV) can partly result from distant spike activity (i.e., neural activity is sensed by the electrode, but the signal does not rise above the spike threshold because the spike amplitude is too small, or the neuron is too far away). Furthermore, depending on which data block is analyzed to set the threshold, there can be differences greater than 15 µV for a 3σ threshold. Figure 5.8e,f show similar results for the neuron identified in Fig. 5.8b which was recorded from a different electrode during a different 54-hour period. Similar characteristics have been observed for other channels (data not shown), indicating that the changes in waveform amplitude observed in Fig. 5.8c,e are not unique to those channels. The large change observed at 13:00 (day 1) in Fig. 5.8c is coincident with a vigorous head movement, and may have resulted from an abrupt movement of the array, a possibility discussed below. In our analysis of abrupt waveform changes, examination of recordings straddling high acceleration events shows, in nearly all cases, far smaller changes in waveform amplitude than those observed over the intermediate timescales of Fig. 5.8. For example, Fig. 5.9a,b show the local changes in Vpp (Vpp^after/Vpp^before) for all 3+ g acceleration events for the same two neurons in Fig. 5.8c,e. Over a recording period of ~50 hours for each session, there were ~1700 and ~800 high acceleration events for panels a and b, respectively. For nearly all events shown in Fig. 5.9a,b there is less than a 5% change in mean waveform amplitude straddling the acceleration event. There are, however, two events in Fig. 5.9b that show much larger changes (labeled event 1 and 2). For the first of these events, the NWrPCA projections of the before (blue) and after (green) snippets are shown in Fig. 5.10a. The significant change in



Figure 5.9: Variation in waveform amplitude straddling high acceleration events. a. Local change in mean waveform amplitude (Vpp^after/Vpp^before) for 200 snippets before and after 3+ g acceleration events (dataset D20060225.ch1u1). Note that the heights of the symbols do not correspond to the magnitudes of the acceleration events. b. Same as previous panel for dataset D20060302.ch2u1. Arrows in panel b indicate events of interest. Similar gray and pink shading as in Fig. 5.8.

waveform amplitude (1.25× increase) is clearly reflected in the NWrPCA projection. A second unit on this channel (the other cluster in the NWrPCA projection) shows a smaller change in amplitude (only a 1.1× increase; not shown) across the same acceleration event, suggesting that the observed variation does not result from changes in the signal conditioning pathway. For example, a common shift in signal gain would result in an equivalent waveform amplitude change for both units, which was not the case here.


Figure 5.10: Variation in waveform amplitude for events identified in Fig. 5.9b. a. NWrPCA projection of 200 before (blue) and 200 after (green) snippets straddling acceleration event overlaid on NWrPCA histogram for all snippets in a five-minute block. Dataset D20060302.ch2. b. Peak-to-peak voltage of mean waveform amplitude averaged over 200 spikes centered around time point shown, for the neuron of interest in panel a. The red vertical line marks the >3 g acceleration event. c,d. Similar plots for event 2 in the same dataset.

Figure 5.10b shows a 200 spike moving average of Vpp for the block in which the event was recorded. The close alignment between the acceleration event (indicated by the red vertical line) and the step change in waveform amplitude strongly suggests that the relationship between the


change in waveform amplitude and the high acceleration event is not coincidental. The profile is consistent with an abrupt change in array position. Well before and after the shift, the array was in a stable state, evidenced by the near constant waveform amplitude, while at the time of the large acceleration event there is a step change in the Vpp. Figure 5.10c,d show similar results for the second event in Fig. 5.9b.
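A moving average of per-spike Vpp and a simple search for the largest step can be sketched as below. This is only an illustration under stated assumptions: the input is a list of positive per-spike Vpp values in time order, the window handling at the edges is a choice made here, and `largest_step` is a hypothetical helper rather than the procedure used to produce Fig. 5.10.

```python
def moving_average(values, window=200):
    """Centered moving average; near the edges only available samples are used."""
    if window < 2:
        raise ValueError("window must be at least 2")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def largest_step(values, n=200):
    """Return (index, ratio) for the split point where the mean of the n
    values after the split differs most from the mean of the n values before.

    Assumes all values are positive (e.g., peak-to-peak voltages).
    """
    best_index, best_ratio = None, 1.0
    for i in range(n, len(values) - n + 1):
        before = sum(values[i - n:i]) / n
        after = sum(values[i:i + n]) / n
        ratio = after / before
        if best_index is None or abs(ratio - 1.0) > abs(best_ratio - 1.0):
            best_index, best_ratio = i, ratio
    return best_index, best_ratio
```

Comparing the time of the spike at the returned index with the time of the acceleration event gives a quick check on whether a step change and a head movement are closely aligned.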

Implications of Neural Recording Instability

These analyses of neural recording stability were our most novel investigations with HermesB. It is in this particular class of experiments that HermesB is most differentiated from other portable recording systems currently in use. What might be the cause of these variations in waveform amplitude? The step changes in waveform amplitude appear, in some cases, to result from abrupt shifts caused by head movement. For the non-abrupt variation in waveform shape and RMS noise, we believe a number of factors may play a significant role, including changes in the cortical environment in response to subject activity, such as “brain bounce,” changes in intracranial pressure (ICP), and other homeostatic factors. At short to intermediate timescales (i.e., longer than bursting periods), Lewicki (1998) suggests that array movement, or more specifically changes in the neuron-electrode distance, might play a role in waveform shape change. Fluctuations in the ICP could potentially move the cortical tissue relative to the array (or vice-versa). Confirming such a relationship is beyond the scope of this work, though it may be of interest in future studies. Since so few of the high acceleration events were coincident with large changes in waveform amplitude, there is the temptation to dismiss these events as rare and unimportant. However, a practical neural prosthesis will have to operate 24 hours a day and 7 days a week. As such, the prosthetic system must be able to recognize the 3-4 abrupt changes that might occur in a week, especially when such systems are eventually used for more ambulatory patients. In fact, in one stretch of ~84 hours of recording we found that there were many tens of events that showed >10% change in average waveform amplitude coincident with a >3 g acceleration measurement. Our results are only preliminary, however, and will require more datasets and more animals for comprehensive characterization.
Traditional experimental protocols that utilize discrete, daily recording periods have provided sparse information regarding neural recording stability. The day-to-day sampling restricts the potential characterization of variations to timescales of either minutes or days. It is important to note that similar variations were not observed within the hour-long broadband recordings described in Suner et al. (2005). However, those recordings were made under a more traditional experimental protocol in which a restrained monkey performed a repetitive reaching task. It is possible the more controlled and consistent environment of those recordings, in contrast to the animal freely behaving in the home cage, produces a more consistent cortical environment (e.g.,


less “brain bounce,” smaller changes in intracranial pressure, etc.) and thus reduced variation in waveform shape. We have shown examples from preliminary datasets of significant waveform shape and RMS noise variation at all three timescales. Both types of variation can have adverse effects on spike-sorting performance, either through the use of an inappropriate threshold or outright misclassification. The improved statistical characterization of the stability of neural recordings enabled by these new long duration datasets will allow the principled design and evaluation of sorting algorithms. Tolerance to some instabilities in neural recordings has already been incorporated into sorting algorithms. The short timescale variations in spike shape can be addressed by incorporating firing statistics into the spike-sorting algorithm (Pouzat et al. 2004) and changes in RMS voltage (from which the threshold is typically derived) can be addressed through adaptive thresholding (Watkins et al. 2004). Long term variation, however, may require periodic retraining of the spike-sorting parameters. With such readjustments, experimenters report the ability to track single neurons across months or even years (although experimenters cannot be sure the same neurons are being observed without truly constant tracking, a capability now available with HermesB). There does not appear to be a consensus on exactly what retraining period is required. Current experiments that use discrete daily recording periods naturally update once per day. The quality of the trained spike-sorting parameters is paramount. Poor sorting parameters, and thus poor sorting performance, will affect all aspects of neural prosthetic system performance. This was demonstrated in Chapter 2. This does not imply that systems should retrain arbitrarily often. Frequent retraining can have significant costs. For advanced spike-sorting algorithms (Sahani 1999), the training algorithm is computationally expensive.
Although our recent power feasibility study has shown that the power consumption of the algorithm in Sahani (1999) is small relative to real-time classification, it was assumed that retraining would be required only every 12 hours (Zumsteg et al. 2005). If a much shorter training period is required, the power consumption of training could quickly become significant. Sorting algorithms with an adaptive training approach that continuously integrates over an extended period, similar to the method proposed in Bar-Hillel et al. (2004), as opposed to discrete retraining, might be the best approach in light of the instability of neural recordings. A suitable adaptive algorithm would have an effective training interval short enough to track variations in waveform shape and background process, without the cost of traditional discrete retraining. The apparent sparsity of abrupt changes in waveform shape due to rapid array movement may mean that there are fewer problem scenarios in which abrupt retraining is required. Nonetheless, the presence of these abrupt changes in waveform shape does suggest that to maximize spike classification accuracy, any algorithm would benefit from the ability to initiate discrete retraining when step changes in the waveform shape are observed. As these chronic electrode arrays are implanted in amputees (rather than tetraplegics), the head will move substantially. It is worthwhile to note that the space between the brain and the dura is larger in humans than monkeys. Therefore “brain


bounce” and other non-stationarities may be much more of an issue. Systems like HermesB will be critical to characterize the recording stability and provide the test data for more robust, adaptive spike-sorting algorithms.
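To make the contrast between continuous adaptation and discrete retraining concrete, the toy sketch below tracks a single amplitude estimate with an exponentially weighted update and flags a discrete retrain when a step change is seen. It is not the method of Bar-Hillel et al. (2004) or Sahani (1999); the class name, the smoothing factor `alpha`, and the 10% step criterion are assumptions made purely for illustration.

```python
class AdaptiveAmplitudeTracker:
    """Toy tracker: slow drift is absorbed continuously, while a large
    jump triggers a (simulated) discrete retrain."""

    def __init__(self, initial_vpp, alpha=0.01, step_fraction=0.10):
        self.estimate = float(initial_vpp)   # current amplitude estimate (must be > 0)
        self.alpha = alpha                   # assumed smoothing factor
        self.step_fraction = step_fraction   # assumed step-change criterion

    def update(self, observed_vpp):
        """Fold in a new observation; return True if a retrain was triggered."""
        change = abs(observed_vpp - self.estimate) / self.estimate
        if change > self.step_fraction:
            # Step change: reset the estimate, standing in for full retraining.
            self.estimate = float(observed_vpp)
            return True
        # Gradual drift: exponentially weighted running update.
        self.estimate += self.alpha * (observed_vpp - self.estimate)
        return False
```

In a real sorter the tracked quantity would be the full set of cluster parameters rather than one amplitude, and the retraining cost discussed above would be incurred only when `update` returns True.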

5.4.3 Neural Correlates of Behavioral Contexts

Figure 5.11 shows data from two 54-hour recordings. For the first dataset (Fig. 5.11a,c-e), there were 438 data blocks. Active blocks, in which the monkey was putatively moving in its home cage, constituted 40% of the blocks while 52% of the blocks were inactive. From the accelerometer data (Fig. 5.11d), it is clear that the monkey was more physically active during the day, and as expected, firing rates tend to be higher during these periods. Note that LFP power was generally lower during these periods as shown in Fig. 5.11e. During the “pit stops” (battery swap periods) the monkey’s head was comfortably restrained in a fixed position (the time duration indicated by the pink bands); therefore, accelerometer magnitude remained flat at 1 g. Likewise, few movements were made and consequently firing rates were suppressed. These trends were consistent across two datasets collected from different electrodes and at different times. Neural activity recorded simultaneously from a second channel shows similar patterns (Fig. 5.11b,f-h).
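The accelerometer magnitude referred to above is the Euclidean norm of the three axis readings, which is why a stationary head reads 1 g (gravity alone). A minimal sketch follows, assuming samples are (x, y, z) tuples in units of g; the function names and the per-block summary are illustrative, and no particular rule for labeling blocks as active or inactive is implied.

```python
import math


def accel_magnitude(x, y, z):
    """Euclidean norm of a three-axis accelerometer sample, in g."""
    return math.sqrt(x * x + y * y + z * z)


def block_summary(samples):
    """Mean and maximum accelerometer magnitude over one recording block."""
    magnitudes = [accel_magnitude(x, y, z) for x, y, z in samples]
    return {
        "mean": sum(magnitudes) / len(magnitudes),
        "max": max(magnitudes),
    }
```

A block recorded with the head restrained would give a mean and maximum both close to 1 g, whereas vigorous movement raises the maximum well above that.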


Figure 5.11: Neural and accelerometer data recorded from a freely behaving monkey. a,b. Histogram of spike waveform projections into NWrPCA space from two different electrodes recorded for different 54-hour datasets (D20060302.ch2 & D20060225.ch1). Selected neurons are indicated by arrows. These neurons were well-isolated. c. Firing rate of the neuron shown in panel a calculated over a 1 second interval using a Hamming window. Red and blue data points were recorded in time periods labeled as “active” and “inactive,” respectively. Green data points were recorded during unlabeled periods. d. Accelerometer magnitude over the recording period downsampled to 100 Hz. e. LFP power per block, recorded from the same electrode, calculated by integrating the power over the 5-25 Hz frequency band. f-h. Same plots for second dataset. Similar gray and pink shading as in Fig. 5.8.

As shown in Fig. 5.12a, the mean LFP power differed between “active” and “inactive” periods in the 2-30 Hz and 50-100 Hz frequency bands. For the majority of this range the standard deviations are large relative to the difference in the mean; this relationship makes power modulation in these bands an unreliable classifier for per-block behavior (i.e., “active” vs. “inactive”). However, the 5-25 Hz band was well-separated, so the power in this range can be used to develop a reliable classifier. This differentiation in LFP power is consistent with previous results showing that 10-100 Hz LFP activity diminished during movement (Donoghue et al. 1998; Santhanam et al. 2003)


and increased during sleep (Destexhe et al. 1999).

[Figure 5.12 axis labels. a: Frequency (Hz). b: LFP power (µV^2).]

Figure 5.12: LFP analyses for dataset D20060302.ch2. a. Power spectral density (PSD) recorded during “active” (red) and “inactive” (blue) periods. The thin lines are the mean PSDs and the standard error of the mean is represented by their thickness. The thickness of the wider translucent lines represents the standard deviations. Each PSD is calculated over 5 minutes of data and their distributions were taken from data across the 54-hour dataset for neuron 1. b. Spectral power recorded during “active” (red) and “inactive” (blue) periods for the 5-25 Hz frequency band. The dotted line represents the learned classification threshold between “active” and “inactive” blocks.

Figure 5.12b plots 5-25 Hz LFP power versus MAM (maximum accelerometer magnitude) for each five-minute block. When we classified the activity level of blocks by thresholding LFP power at -56.5 dB, 93% (131/141) of “active” blocks and 92% (175/191) of “inactive” blocks were correctly classified. Results were similar for a second channel from a different session: 89% (150/169) of “active” blocks and 88% (81/92) of “inactive” blocks were correctly classified with a threshold of -57.1 dB. These results were obtained by picking the optimal linear classification boundary using the first 40 active and first 40 inactive blocks and testing on the remaining blocks. Head posting during “pit stops” can create confounds since the accelerometer was held in a fixed position even if the monkey was otherwise active during these periods. Hence, these periods were removed prior to the aforementioned analysis. A similar classification was not as successful when using the average firing rate over a five-minute block (data not shown). The mean and variance of the MAM increased as the firing rate increased, but the likelihood of a small MAM (i.e., an inactive period) remained relatively high even for high firing rates (data not shown). Recall that the electrode was implanted in a region spanning PMd and M1, which is strongly believed to be involved in the motor planning and execution of arm movements (Tanji and Evarts 1976; Weinrich and Wise 1982; Weinrich et al. 1984; Godschalk et al. 1985; Kurata 1989; Churchland et al. 2006b). If arm movements are made while the head position remains fixed, firing rates could increase without large acceleration events. Also, motor plans can be generated and subsequently canceled. Thus, absolute firing rate may not be the best proxy for activity level. Given that “active” and “inactive” periods tended to occur during day and night, respectively, the variations in firing rate and LFP might be explained, in part, by circadian rhythms (or direct


modulation by light level). One might hypothesize that 5-25 Hz LFP power is increased and firing rates are depressed in association with day-night cycles. However, for blocks within a single activity condition (either “active” or “inactive”), the differences between day and night for both LFP and firing rate were at least an order of magnitude smaller than the difference between “active” and “inactive” blocks during either time period. This suggests that circadian rhythms do not heavily influence these effects. As shown in Fig. 5.12, LFP is a promising proxy for activity level. Furthermore, LFP power measurement consumes less battery power than firing rate measurement (a low-power LFP power measurement circuit is described by Harrison et al. (2004)), potentially enabling a power-efficient “sleep” mode when the user is inactive. When LFP power falls below a defined threshold, indicating that the user is active, the prosthetic can switch out of this “sleep” mode. Furthermore, using LFP thresholding could help prevent undesired movements from the prosthetic system during “inactive” periods. In future studies we plan to examine subtler context changes. Some contexts may require fewer neurons for acceptable performance; under these conditions we can conserve power by disabling a subset of the neural channels. Under different contexts, users may require different sets of behavioral responses (such as discrete target selection vs. continuous motion) or the underlying dynamics of the observed cortical area may change drastically; we would like to respond to these concerns by switching the decoding model according to context. By identifying contexts and adjusting hardware configuration accordingly, it may be possible to boost performance in terms of power consumption and decoding accuracy. We were able to identify natural behavior across multiple days using accelerometer measurements and correlating these to neural recordings. 
Such an ability coupled with more advanced behavioral monitoring, such as chronically implanted EMG electrodes (Holdefer and Miller 2002; Morrow and Miller 2003) or motion tracking, can enable the exploration of questions that have been unapproachable until now. Mining large datasets to find the neural correlates of free behavior may help us to develop new controlled experiments; these datasets are also necessary for developing and testing neural prosthetic systems with the ability to operate autonomously over extended periods of time. Similar investigations with EMG recordings are already underway by other researchers and HermesB can serve as another tool in these types of experiments (Jackson et al. 2006a,c).
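The per-block LFP threshold classification described in this section (train on the first 40 blocks of each class, test on the remainder, and label blocks with lower 5-25 Hz power as “active”) can be sketched as follows. This is an illustrative reconstruction, not the original analysis code: the exhaustive search over midpoints is one simple way to pick a one-dimensional boundary, and the function names are hypothetical.

```python
def fit_threshold(active_db, inactive_db):
    """Choose the threshold (dB) with the fewest training errors.
    Blocks with power below the threshold are called "active"."""
    values = sorted(set(active_db) | set(inactive_db))
    if len(values) < 2:
        raise ValueError("need at least two distinct power values")
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def training_errors(threshold):
        missed_active = sum(p >= threshold for p in active_db)
        missed_inactive = sum(p < threshold for p in inactive_db)
        return missed_active + missed_inactive

    return min(candidates, key=training_errors)


def classify(power_db, threshold_db):
    """Label one block from its band power."""
    return "active" if power_db < threshold_db else "inactive"


def train_and_test(active_db, inactive_db, n_train=40):
    """Fit on the first n_train blocks of each class, test on the rest.
    Returns (threshold, fraction of active correct, fraction of inactive correct)."""
    threshold = fit_threshold(active_db[:n_train], inactive_db[:n_train])
    test_active = active_db[n_train:]
    test_inactive = inactive_db[n_train:]
    active_ok = sum(classify(p, threshold) == "active" for p in test_active)
    inactive_ok = sum(classify(p, threshold) == "inactive" for p in test_inactive)
    return threshold, active_ok / len(test_active), inactive_ok / len(test_inactive)
```

The same `classify` call could gate a prosthetic “sleep” mode, waking the system when a block is labeled “active.”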


5.5 Summary

HermesB is a new, self-contained, long duration, neural recording system for use with freely behaving primates. It records dual-channel broadband and 3-axis head acceleration data to a high density compact flash card. Controlled by simple sequencing programs written by the experimenter, HermesB can autonomously change recording channel and pause recording during the experiment. With a single battery charge, HermesB can record for up to 48 hours (at a low duty cycle). With short breaks to replace the batteries and compact flash card, HermesB can record nearly continuously for an indefinite period. The high quality of the broadband recordings, despite being in the electrically noisy environment of the home cage room (e.g., fluorescent lights), enables results from HermesB to be integrated into experiments using the traditional laboratory rig. There are a variety of applications for such a platform. For example, the long duration recordings, in concert with traditional experiments, enable important multi-day learning and plasticity experiments, an application not explored in detail in our experiments. Researchers can use HermesB to record during periods when the animal is outside the laboratory rig to provide continuous monitoring of significant neurons identified during active experiments. We have also already detailed how HermesB can be useful for investigating neural stability and correlating different free-behavioral contexts to neural activity. Recently, there have also been scientific reports involving the pairing of recording and stimulation. Using a portable system that is worn by the subject over the period of several days, it has been shown that subpopulations of neurons can be made to induce different behaviors, presumably due to a reshaping of neural connectivity (Jackson et al. 2006b). HermesB can also be extended to include stimulation and by doing so can hopefully assist in these types of experiments in the future. 
At present, HermesB is in active use supporting a number of experiments. There is also ongoing development to increase recording capabilities. As CF technology and battery energy density improve, recording duration will be expanded. Future generations of HermesB may also incorporate wireless telemetry, more simultaneous recording channels, EMG recording capabilities, and stimulation capabilities.


5.6 Credits

The work detailed in this chapter is in preparation for resubmission to a peer-reviewed journal (Santhanam et al. 2006a). It would not have been possible if not for the support from a number of individuals. Michael Linderman assisted with the analog front-end development from the very inception of this project and participated in testing the system, data collection, analyses for neural stability, and the writing of our co-first authored journal article manuscript. Vikash Gilja was instrumental in many aspects of the development, data collection, and analyses on behavioral correlates. Dr. Stephen Ryu was the primary surgeon for electrode implantation. Afsheen Afshar created mechanical drawings for the sealed aluminum enclosure that houses the electronics of HermesB. I was responsible for the initial experimental concept and was involved either directly or through a secondary capacity with all aspects of this work except electrode implantation. We also thank Shane Guillory of Intragraphix, who designed and laid out the analog module and laid out the digital module, Jim McCrae of JMC Design, Karlheinz Merkle at the Stanford Physics Machine Shop, Pascal Stang and Carter Dunn for their help designing and manufacturing HermesB, Mackenzie Risch for expert veterinary care, Dr. Aris Mendiola for medical consultation, and Sandra Eisensee for administrative assistance. This study was supported by NDSEG Fellowships (VG,MDL,GS), NSF Graduate Research Fellowships (GS), MARCO Center for Circuit & System Solutions (THM,MDL), Medical Scientist Training Program (AA), Bio-X fellowship (AA), Christopher Reeve Paralysis Foundation (SIR,KVS) and the following awards to KVS: NSF Center for Neuromorphic Systems Engineering at Caltech, ONR Adaptive Neural Systems, Whitaker Foundation, Center for Integrated Systems at Stanford, Sloan Foundation, and Burroughs Wellcome Fund Career Award in the Biomedical Sciences.

Chapter 6

Future Directions

The work presented heretofore represents valuable progress toward the realization of cortically-controlled prostheses. Furthermore, our research has helped promote several avenues of further investigation. For example, one specific study that was begun as a direct result of our BCI experiments was “optimal target placement.” In our original BCI experiments, the possible reach target locations were arranged in the patterns shown in Fig. 3.9. Splaying targets out angularly, as opposed to linearly, reflects the observation that most PMd neurons modulate their firing rate more for the upcoming reach direction than for the upcoming reach distance. However, we noticed that perturbing targets away from high-symmetry locations resulted in minor ITRC improvements, and this is reflected in our reported results from Chapter 3. The performance improvement was partially due to the tuning properties of the particular neural units recorded from our electrode array. Hence, a natural question is whether, given the particular neural units at hand, one can choose the possible target locations to optimize the single-trial accuracy and the ITRC. Cunningham et al. (2006a,b) investigated this on quasi-simulated neural data using sophisticated convex optimization techniques. Their work suggests that true optimization of target placement based on neural response functions could provide further performance improvements beyond what we have accomplished in Santhanam et al. (2006b). Also, a recent integration of this target optimization technique into a laboratory experiment has yielded encouraging results (~7% decoding accuracy improvement on an 8-target task). Further experiments are needed to verify this result more fully.

Prosthetic systems can also help explain how neural circuits function. This, in turn, can result in better prosthetic algorithms. For example, research aimed at designing higher performance systems can also help shed light on the fundamental speed of neural processing. By presenting stimuli at an increasingly faster rate, researchers can assess how well neural circuits can cope with a demand for greater speed. We have recently begun some initial investigations by employing a communication prosthesis with rapid target decodes. We noted that performance suffered as the number of decodes that had occurred in rapid succession increased, and some neurons seemed to gain-modulate their tuning curve peaks as a series of decodes progressed (Kalmar et al. 2005). While this may be evidence of a fundamental processing speed limit of pre-motor cortex, future experiments are needed to investigate this further. Classic confounds include potential differences in attention or reward expectancy as the number of successive decodes increases.

In the reverse direction, core neuroscience research in the motor regions of the brain will ultimately allow us to build better models for decoding neural signals for prosthetic systems. For example, in Chapter 4, we saw how applying a more appropriate model of neural activity in the planning region of the brain can help us develop more accurate Bayesian algorithms to decode the reach plan itself. Likewise, the dynamical models (HNLDS) detailed in Section A.3.4 have potential for improving system performance. HNLDS is an even more intricate framework by which we can model reach plans. There are efforts underway to verify HNLDS; once verified, these mathematical techniques can be used to improve prosthetic performance.

The future of cortically-controlled prostheses is bright. Continuing research in non-invasive technologies is encouraging and will result in practical systems that offer low surgical risk to disabled patients. The performance of EEG systems has improved such that researchers are considering using them for 2D motor control. Intra-cortical systems have been receiving attention in recent years, proof-of-concept devices have been demonstrated, and our research has helped measurably advance the field. Further research in this domain should yield even higher performance systems. Although numerous scientific and technical challenges remain to be solved (ranging from a better understanding of cortex to issues specific to the physical recording apparatus), we are optimistic that continued progress is likely. Further innovation will hopefully yield prostheses that can help debilitated patients interact with the world in effective ways.

Chapter 7

Publications

7.1 Journal Articles

Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (July 2006). A high-performance brain-computer interface. Nature, 442(7099), 195-198. doi:10.1038/nature04968.

Santhanam G, Linderman MD, Gilja V, Afshar A, Ryu SI, Meng TH, and Shenoy KV (December 2006). HermesB: A continuous neural recording system for freely behaving primates. In preparation for resubmission to IEEE Transactions on Biomedical Engineering.

Churchland MM, Santhanam G, and Shenoy KV (July 2006). Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach. Journal of Neurophysiology. doi:10.1152/jn.00307.2006.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (April 2006). Neural variability in premotor cortex provides a signature of motor preparation. Journal of Neuroscience, 26(14), 3697-3712. doi:10.1523/JNEUROSCI.3762-05.2006.

Yu BM, Afshar A, Santhanam G, Ryu SI, Shenoy KV, and Sahani M (January 2006). Extracting dynamical structure embedded in neural activity. In Y Weiss, B Schölkopf, and J Platt (Eds.) Advances in Neural Information Processing Systems 18, pages 1545-1552. MIT Press, Cambridge, MA.

Yu BM, Kemere C, Santhanam G, Afshar A, Ryu SI, Meng TH, Sahani M, and Shenoy KV (December 2006). Mixture of trajectory models for neural decoding of goal-directed movements. In preparation for resubmission to Journal of Neurophysiology Innovative Methodology.

Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (December 2006). A direct comparison of eye-centered and limb-centered reference frames for reach planning in the dorsal aspect of the premotor cortex. In preparation for resubmission to Journal of Neurophysiology.

Zumsteg ZS, Kemere C, O'Driscoll S, Santhanam G, Ahmed RE, Shenoy KV, and Meng TH (2005). Power feasibility of implantable digital spike sorting circuits for neural prosthetic systems. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(3), 272-279. doi:10.1109/TNSRE.2005.854307.

7.2 Conference Talks, Articles, Abstracts

7.2.1 2006

Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (October 2006). Influence of eye position on end-point decoding accuracy in dorsal-premotor cortex. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 148.8. Atlanta, GA. Poster presentation.

Chestek CA, Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (October 2006). The relationship between PMd neural activity and reaching behavior is stable in highly trained macaques. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 148.5. Atlanta, GA. Poster presentation.

Gilja V, Linderman MD, Santhanam G, Afshar A, Ryu SI, Meng TH, and Shenoy KV (October 2006). Multiday electrophysiological recordings from freely behaving primates using an autonomous, multi-channel neural system. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 148.19. Atlanta, GA. Poster presentation.

Linderman MD, Gilja V, Santhanam G, Afshar A, Ryu SI, Meng TH, and Shenoy KV (October 2006). Neural recording stability of chronic electrode arrays in freely behaving primates. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 13.7. Atlanta, GA. Slide presentation.

Kemere C, Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (October 2006). Hidden Markov models for spatial and temporal estimation for prosthetic control. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 256.17. Atlanta, GA. Poster presentation.

Shenoy KV, Santhanam G, Ryu SI, Afshar A, Yu BM, Gilja V, Linderman MD, Kalmar RS, Cunningham JP, Kemere CT, Batista AP, Churchland MM, and Meng TH (September 2006). Increasing the performance of cortically-controlled prostheses. In Proceedings of the 28th Annual International Conference of the IEEE EMBS. New York, NY. Invited talk.

Linderman MD, Gilja V, Santhanam G, Afshar A, Ryu SI, Meng TH, and Shenoy KV (September 2006). An autonomous, broadband, multi-channel neural recording system for freely behaving primates. In Proceedings of the 28th Annual International Conference of the IEEE EMBS, ThBP8.7. New York, NY. Poster presentation.

Gilja V, Linderman MD, Santhanam G, Afshar A, Ryu SI, Meng TH, and Shenoy KV (September 2006). Multiday electrophysiological recordings from freely behaving primates. In Proceedings of the 28th Annual International Conference of the IEEE EMBS, SaD08.3. New York, NY. Slide presentation.

Linderman MD, Gilja V, Santhanam G, Afshar A, Ryu SI, Meng TH, and Shenoy KV (September 2006). Neural recording stability of chronic electrode arrays in freely behaving primates. In Proceedings of the 28th Annual International Conference of the IEEE EMBS, SaD08.4. New York, NY. Slide presentation.

Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (May 2006). Heterogeneous reference frames for reaching in macaque PMd. In 16th Annual Meeting of the Neural Control Movement Society, F-12. Key Biscayne, FL. Poster presentation.

7.2.2 2005

Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (November 2005). Intra-cortical communication prosthesis design. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.19. Washington, DC. Poster presentation.

Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (March 2005). A high performance neurally-controlled cursor positioning system. In Proceedings of the 2nd International IEEE EMBS Conference on Neural Engineering, 5.1.2-6, pages 494-500. Arlington, VA. Slide presentation.

Afshar A, Achtman N, Santhanam G, Ryu SI, Yu BM, and Shenoy KV (November 2005). Free-paced target estimation in a delayed-reach task. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 401.13. Washington, DC. Poster presentation.

Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (November 2005). Heterogeneous coordinate frames for reaching in macaque PMd. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 363.12. Washington, DC. Slide presentation.

Gilja V, Kalmar RS, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (November 2005). Trial-by-trial mean normalization improves plan period reach target decoding. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.18. Washington, DC. Poster presentation.

Kalmar RS, Gilja V, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (November 2005). PMd delay activity during rapid sequential movement plans. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.17. Washington, DC. Poster presentation.

Sahani M, Yu BM, Afshar A, Santhanam G, Ryu SI, and Shenoy KV (November 2005). Extracting dynamical structure embedded in neural activity. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 689.14. Washington, DC. Poster presentation.

Yu BM, Kemere C, Santhanam G, Afshar A, Ryu SI, Meng TH, Sahani M, and Shenoy KV (November 2005). Mixture of trajectory models for neural decoding of goal-directed movements. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 520.18. Washington, DC. Poster presentation.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (April 2006). Motor preparation and settling activity in PMd. In 15th Annual Meeting of the Neural Control Movement Society, E-13. Key Biscayne, FL. Poster presentation.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (March 2005). Neural variability in premotor cortex provides a signature of motor preparation. In Computational and Systems Neuroscience 2005, 13, page 26. Salt Lake City, UT. Oral and poster presentation.

Yu BM, Afshar A, Santhanam G, Ryu SI, Shenoy KV, and Sahani M (March 2005). Extracting dynamical structure embedded in motor preparatory activity. In Computational and Systems Neuroscience 2005, 290, page 303. Salt Lake City, UT. Poster presentation.

Yu BM, Santhanam G, Ryu SI, and Shenoy KV (March 2005). Feedback-directed state transition for recursive Bayesian estimation of goal-directed trajectories. In Computational and Systems Neuroscience 2005, 291, page 304. Salt Lake City, UT. Poster presentation.

7.2.3 2004

Santhanam G, Ryu SI, Yu BM, and Shenoy KV (October 2004). High information transmission rates in a neural prosthetic system. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 263.2. San Diego, CA. Slide presentation.

Santhanam G, Sahani M, Ryu SI, and Shenoy KV (September 2004). An extensible infrastructure for fully automated spike sorting during online experiments. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4380-4384. San Francisco, CA. doi:10.1109/IEMBS.2004.1404219.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (December 2004). Settling recurrent networks underlie motor planning in the primate brain. In PS Churchland and T Sejnowski (Eds.) Neural Information Processing post-Conference Workshop - The Neurobiology of Planning and Deciding: Studies from Many Levels of Brain Organization. Whistler, BC, Canada. Invited talk.

Kemere C, Santhanam G, Ryu SI, Yu BM, Meng TH, and Shenoy KV (November 2004). Reconstruction of arm trajectories from plan and peri-movement motor cortical activity. In Neural Interfaces Workshop 2004. National Institutes of Health, Bethesda, MD. Poster presentation.

Batista AP, Yu BM, Santhanam G, Ryu SI, and Shenoy KV (October 2004). Coordinate frames for reaching in macaque dorsal premotor cortex (PMd). In Society for Neuroscience Abstract Viewer and Itinerary Planner, 191.7. San Diego, CA. Poster presentation.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (October 2004). Time-course of PMd processing predicts reaction time. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 603.5. San Diego, CA. Slide presentation.

Kemere CT, Santhanam G, Ryu SI, Yu BM, Meng TH, and Shenoy KV (October 2004). Reconstruction of arm trajectories from plan and peri-movement motor cortical activity. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 884.12. San Diego, CA. Poster presentation.

Ryu SI, Santhanam G, Yu BM, and Shenoy KV (October 2004). High speed neural prosthetic icon positioning. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 263.1. San Diego, CA. Slide presentation.

Yu BM, Ryu SI, Santhanam G, Churchland MM, and Shenoy KV (October 2004). Improving neural prosthetic system performance by combining plan and peri-movement activity. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 884.11. San Diego, CA. Poster presentation.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (October 2004). Role of movement preparation in movement generation. In R Shadmehr and E Todorov (Eds.) Advances in Computational Motor Control III, Symposium at the Society for Neuroscience Meeting. San Diego, CA. Contributed Talk.

Ryu SI, Santhanam G, Yu BM, and Shenoy KV (October 2004). The speed at which reach movement plans can be decoded from the cortex and its implications for high performance neural prosthetic arm systems. In 54th Annual Meeting Congress of Neurological Surgeons, 785. San Francisco, CA. Oral presentation.

Harrison RR, Santhanam G, and Shenoy KV (September 2004). Local field potential measurement with low-power analog integrated circuit. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4067-4070. San Francisco, CA.

Kemere C, Santhanam G, Yu BM, Ryu SI, Meng TH, and Shenoy KV (September 2004). Model-based decoding of reaching movements for prosthetic systems. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4524-4528. San Francisco, CA. doi:10.1109/IEMBS.2004.1404256.

Watkins PT, Santhanam G, Shenoy KV, and Harrison RR (September 2004). Validation of adaptive threshold spike detector for neural recording. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4079-4082. San Francisco, CA.

Yu BM, Ryu SI, Santhanam G, Churchland MM, and Shenoy KV (September 2004). Improving neural prosthetic system performance by combining plan and peri-movement activity. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4516-4519. San Francisco, CA. doi:10.1109/IEMBS.2004.1404254.

Zumsteg ZS, Ahmed RE, Santhanam G, Shenoy KV, and Meng TH (September 2004). Power feasibility of implantable digital spike-sorting circuits for neural prosthetic systems. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4237-4240. San Francisco, CA. doi:10.1109/IEMBS.2004.1404181.

7.2.4 2003

Santhanam G, Churchland MM, Sahani M, and Shenoy KV (November 2003). Local field potential activity varies with reach distance, direction, and speed in monkey pre-motor cortex. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 918.1. New Orleans, LA. Poster presentation.

Santhanam G and Shenoy KV (March 2003). Methods for estimating neural step sequences in neural prosthetic applications. In Proceedings of the 1st International IEEE EMBS Conference on Neural Engineering, 5.3.4-7, pages 344-347. Capri, Italy. Poster presentation.

Shenoy KV, Churchland MM, Santhanam G, Yu BM, and Ryu SI (September 2003). Influence of movement speed on plan activity in monkey pre-motor cortex and implications for high-performance neural prosthetic systems design. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, 6.1.1-3, pages 1897-1900. Cancun, Mexico. Invited talk.

7.2.5 2002

Kemere CT, Santhanam G, Yu BM, Shenoy KV, and Meng TH (October 2002). Decoding of plan and peri-movement neural signals in prosthetic systems. In IEEE Workshop on Signal Processing Systems (SIPS’02), pages 276-283. San Diego, CA.

Appendix A

Select Collaborations

This appendix will briefly outline some notable developments by other researchers in the lab. I provided a measurable amount of support for these projects. Although much of the work described below involves basic neuroscience, these results can help us better understand the brain’s motor systems and thereby improve neural prosthetic performance in the long term. For fuller descriptions of each study, please refer to the referenced literature.¹

¹ Most of the text that follows was taken verbatim from journal articles or manuscripts on which I share authorship, but on which I am not the primary author.

A.1 Speed Tuning in PMd

A.1.1 Motivation

In understanding how movements are prepared, it seems important that we determine which reference frames describe the neural responses at each temporal, anatomical, and functional stage. (By reference frame we simply mean a low-dimensional set of variables, spatial or otherwise, upon which neural activity is posited to depend in some straightforward fashion.) Such knowledge should also have immediate practical significance, given recent efforts to guide motor prostheses using preparatory activity (Musallam et al. 2004; Santhanam et al. 2006b; Shenoy et al. 2003).

It is often assumed that reach preparation occurs in a predominantly spatial reference frame (e.g., van Beers et al. 2004). In support, preparatory activity in PMd is tuned for target direction and distance (Kurata 1993; Messier and Kalaska 2000; Riehle and Requin 1989), and is more closely tethered to the visuo-spatial location of the target than to the direction of the reach (Shen and Alexander 1997). Recent work has asked whether the relevant spatial reference frame translates with the hand, eye, or both (Cisek and Kalaska 2002; Pesaran et al. 2006). Yet some results suggest that PMd/M1 preparatory activity might not obey a simple spatial reference frame. PMd activity can depend on factors other than target location, including the type of grasp (Godschalk et al. 1985), the required accuracy (Gomez et al. 2000), reach curvature (Hocherman and Wise 1991), and (to some degree) force (Riehle et al. 1994). Our goal was to determine whether preparatory activity in PMd and M1 reflects a non-spatial aspect of the upcoming reach: its speed, instructed by target color. This work has been published in a peer-reviewed journal (Churchland et al. 2006a).

A.1.2 Results

Two monkeys were trained to reach at different speeds, with green and red targets instructing “slow” and “fast” reaches. Monkeys performed the task well. Even “slow” reaches to green targets were fairly swift, with durations of 150-300 ms depending on target distance. “Fast” reaches to red targets were swifter still, with durations of 100-200 ms. Their success rates were high and would take practice for a human to equal.

We consider the 95 neurons for which we obtained a “direction” series (7 directions × 2 distances × 2 speeds). For each neuron and each condition (i.e., target-location / instructed-speed; 28 total conditions) we computed the mean delay-period firing rate. The mean number of trials/condition was 14. Figure A.1 plots the delay-period firing rate versus direction for several example neurons. Red and green traces correspond to red (fast) and green (slow) targets. Dashed and solid traces correspond to near (7 cm) and far (12 cm) targets.

The examples in Figure A.1 illustrate a number of features typical of recorded responses. First, delay-period activity often showed a large influence of instructed speed, in addition to the previously known influence of target direction and distance. Second, interactions between the effects of direction, distance and speed were common. For example, for the neuron shown in the bottom panels, speed had an effect primarily for near targets. Third, while direction tuning was typically robust, it was not always invariant. Preferred directions are similar (outer arcs whose arc lengths denote ±1 SE) but not identical across the different distances and instructed-speeds.

Our primary new finding is that the instructed speed has a large influence on delay-period responses. Of tuned neurons, 74% showed a significant main effect of speed, while 94% showed some effect (main or interaction) involving speed. Firing rates could be higher before instructed-fast reaches (e.g., A19, A29), or before instructed-slow reaches (e.g., A01, B114). Considering each direction/distance combination separately (a total of 95×7×2 comparisons), 61% (39%) of significant effects involved a preference for fast (slow) reaches. Thus, there was an overall tendency for the “fast” instructed-speed to evoke higher firing rates, but the opposite effect was not uncommon. It was also not uncommon for a neuron to prefer far targets and the slower instructed speed (e.g., A01, A06) or to prefer near targets and the faster instructed speed (e.g., A19).
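The per-condition averaging and the fast-versus-slow comparison described above can be sketched as follows. This is a minimal illustration on synthetic spike counts, not the analysis code used in the study: the array layout, the function names `condition_means` and `speed_preference`, the speed index convention, and the 0.5 s delay window are all assumptions made for the example, and the significance testing reported in the text is omitted.

```python
import numpy as np

# 7 directions x 2 distances x 2 instructed speeds = 28 conditions per neuron.
N_DIR, N_DIST, N_SPEED = 7, 2, 2


def condition_means(spike_counts, window_s):
    """Mean delay-period firing rate (spikes/s) for each condition.

    spike_counts: array of shape (N_DIR, N_DIST, N_SPEED, n_trials) holding
    the spike count in the delay window on each trial (hypothetical layout).
    window_s: duration of the delay window in seconds (assumed value).
    """
    rates = np.asarray(spike_counts, dtype=float) / window_s
    return rates.mean(axis=-1)  # average over trials


def speed_preference(mean_rates):
    """Sign of (fast - slow) mean rate for each direction/distance pair.

    Assumes speed index 0 = instructed-slow (green), 1 = instructed-fast (red).
    Returns +1 where fast evokes the higher rate, -1 where slow does, 0 if equal.
    """
    return np.sign(mean_rates[..., 1] - mean_rates[..., 0]).astype(int)


# Synthetic example for one neuron: 14 trials per condition.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=10, size=(N_DIR, N_DIST, N_SPEED, 14))
means = condition_means(counts, window_s=0.5)
prefs = speed_preference(means)
```

Repeating `speed_preference` over all 95 neurons would give the 95×7×2 comparisons mentioned above; deciding which of those differences are significant would require a statistical test that this sketch does not attempt.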

Figure A.1: Responses of twelve example neurons, illustrating the range of observed responses. Each subpanel shows a polar plot of delay-period firing rate versus target direction. Error bars on each symbol plot the SE across trials. Arcs at the outside of the plot show, for each condition, the preferred direction ±1 SE. The black circle at center shows baseline firing rate (mean over the 300 ms preceding target onset). Neuron identities are given at the top of each panel. Labels (in spikes/s) indicate the scale provided by the outer gray circle.

A.1.3 Significance

The current results demonstrate that delay-period activity robustly reflects non-spatial aspects of how the reach is to be executed. (By non-spatial we mean influenced by something other than the spatial location of the target/reach-trajectory. The influence of instructed speed could of course be due to different activation patterns of the muscles, which are certainly distributed in space.) While theoretical and behavioral studies have often assumed that motor planning is primarily spatial, finding non-spatial features represented in preparatory activity is not surprising. If preparatory activity is part of a causal chain that will eventually generate movement, then presumably all aspects of the movement must be reflected (at least implicitly) in that activity.

The discovered mapping from preparatory activity to behavior might not conform to any simple representational framework. This would be consistent with our experimental observations, which revealed considerable heterogeneity in tuning across neurons, and failed to reveal a simple set of parameters that yielded invariant tuning. Of course, we may simply not be plotting our data against the right movement parameters. Perhaps there is a straightforward relationship between PMd preparatory activity and pending muscle activity (certainly both show preferred direction, or PD, rotations). Still, it is important to at least consider the possibility that no fundamental reference frame exists: the lack of invariant tuning for any of the tested parameters, together with a high degree of heterogeneity across neurons, calls into question the idea that preparatory activity obeys any clear reference frame. This skepticism is also put forth in an independent focus piece (Cisek 2006) written in response to Churchland et al. (2006a).

A number of prior results also suggest the absence of a fundamental reference frame. The principal finding of Shen and Alexander (1997) was that delay-period direction tuning in PMd was more closely tied to the visual location of the target than to the direction of the actual impending reach. Yet, both clearly had an effect, arguing that the operative reference frame is neither extrinsic nor intrinsic. The findings of Scott and Kalaska (1997), Scott et al. (1997), and Kakei et al. (1999) make a similar point regarding movement-related activity. The PDs of M1 and PMd neurons rotated with arm posture, but not in ways adequately captured by either intrinsic or extrinsic reference frames. From a computational standpoint, such properties are not necessarily problematic, and may even confer advantages (Deneve et al. 2001; Pouget et al. 2002; Zipser and Andersen 1988).

Ultimately, these aforementioned results provide novel experimental data that should inform researchers in efforts to discover a unifying model of the motor system. Naturally, a better model of the motor system will promote higher performance neural prosthetic systems.


A.2 Reference Frames in PMd

A.2.1 Motivation

When we reach out to grasp an object we see, our brain must rapidly determine an appropriate pattern of muscular contractions that will bring the hand to the object. At the heart of visually-guided reaching is a reference frame transformation, from the initial retinal representation of an object’s location to the required pattern of muscular contractions. A network of cortical areas between the parietal and frontal lobes is thought to subserve the reference frame transformation for reaching (reviewed in Boussaoud and Bremmer 1999; Caminiti et al. 1996). An important node in this network is the dorsal aspect of the pre-motor cortex (PMd). This area receives input signals related to vision, limb posture, and motor planning from the parietal lobe (Johnson et al. 1996), and in turn, PMd projects both directly to the spinal cord (Dum and Strick 1991; Galea and Darian-Smith 1994), and also to the primary motor cortex (Matsumura and Kubota 1979), the cortical region thought to be chiefly involved in the control of reaching.

As we have already cited in the previous (and closely related) section, many studies have explored the role of PMd in the planning and performance of visually-guided reaches. An important open question is whether reach planning activity in PMd encodes reach goals in an eye-centered or a limb-centered reference frame. The medial bank of the intraparietal sulcus (MIP) projects monosynaptically to PMd (Tanne-Gariepy et al. 2002). Neurons in area MIP represent reach plans in eye-centered coordinates (Baker et al. 1999; Batista et al. 1999; Medendorp et al. 2003). Limb-centered reference frames for reaching have been reported in PMd (Caminiti et al. 1991; Cisek and Kalaska 2002). On the strength of this evidence, it appears a complete transformation from an eye-centered to a limb-centered reference frame might occur between MIP and PMd. In contrast, other reports have indicated that PMd neurons are influenced by the direction of gaze (Boussaoud et al. 1998; Boussaoud 1995) or sensory location of reach targets (Shen and Alexander 1997), suggesting that the transformation to a limb-centered reference frame may still be incomplete at the level of PMd. To explore this important unresolved issue, we sought to directly compare the relative degree of eye-centered spatial coding and limb-centered spatial coding in PMd neurons. This work has appeared as abstracts (Batista et al. 2004, 2005) and is currently in manuscript form (Batista et al. 2006).

A.2.2 Results

Two monkeys were trained over several months to perform a delayed-reach task as described in Chapter 3. Data were then collected while the monkeys performed a version of the delayed-reach task called the reference frame task. This task was designed to independently assess the effects on neural activity of target position relative to the eyes and the hand (see Batista et al. 1999). The reference frame task is a simple extension of the delayed-reach task in which the initial eye and hand position, and target location, are varied for each trial. Four different start conditions were used. In two of them, the eye position was the same, while the initial hand position differed. Thus, a given target was at the same location in eye-centered coordinates between the two conditions, but at different locations relative to the hand and arm. We call these the “eye-aligned configurations.” In the other two conditions, the initial hand position was the same, but the fixation point differed. This manipulation altered the locations of the targets in eye-centered coordinates, while maintaining them in hand-centered coordinates. We call these the “hand-aligned configurations.” For each of the four start configurations, reaches were instructed to targets at the same locations (relative to the screen). All targets were presented above the initial eye and hand position.

The design of the reference frame task allowed us to independently measure the effect of altering the position of the reach goal relative to the arm, or relative to the eyes. A neuron that is insensitive to the hand-centered location of the reach target should show a high degree of similarity between the firing rates observed in the two eye-aligned configurations. Such a similarity would rule out the possibility that the neuron uses a limb-centered reference frame for encoding reach goals, but it leaves open the possibility that the cell uses an eye-centered reference frame. Similarly, a neuron that is insensitive to the eye-centered location of the reach target would show a high degree of similarity between the firing rates measured in the two hand-aligned configurations. Such a similarity would allow us to rule out the possibility that the neuron uses an eye-centered reference frame, while still leaving open the possibility that the cell uses a limb-centered reference frame.
Our primary analysis in this study reflects this logic: for each PMd neuron, we attempted to independently rule out the possibilities that the cell uses an eye-centered or a limb-centered reference frame. Figure A.2 shows four neurons with very different reference frame properties. The 5x2 grids show the average firing rate during the delay period (computed over the 500 ms epoch extending from 250 ms after the appearance of the reach target) for each of the 10 target locations in the two eye-aligned and two hand-aligned configurations. Panel A.2a illustrates a hand-centered cell. For this neuron, the top two activity maps show a greater similarity than do the bottom two maps, indicating that the cell is relatively insensitive to the eye-centered location of the targets. Furthermore, the bottom two activity maps in each panel show that the response field of this neuron tends to move along with the hand; this cell encodes target locations using an extrinsic limb-centered reference frame (perhaps centered on the hand). Panel A.2b depicts a neuron that is eye-centered. The bottom two activity maps (where the hand position changes, but the eye position is the same) are more similar to each other than are the top two activity maps (where the hand position is the same, but the eye position changes). Hence, we posit that this cell encodes target locations in a retinotopic reference frame. Panels A.2a and A.2b illustrate that at least some PMd neurons employ a reference frame for reach planning that can be reasonably well characterized as eye-centered or hand-centered.

[Figure A.2 image not reproduced. Panels a, b, c, d; hand and eye icons mark the start configurations. Unit H20041231.40.2.]

Figure A.2: Each panel depicts one neuron. Within each panel are four activity maps, corresponding to the four different start configurations (indicated by hand and eye icons.) Each activity map shows averaged neural response during the delay epoch for reaches to the ten targets. White indicates the highest firing rate for that neuron, while black is the lowest.

However, many of the neurons we observed in PMd encode locations in more complex reference frames. For example, the neuron in Panel A.2c is more active when the eyes are directed to the left of the hand (the second and third activity maps). Furthermore, the response field of the cell moves such that it remains at a fixed location relative to the combined position of the eyes and hand. There were also a few neurons whose response fields did not move when either the direction of gaze or the initial hand position was varied. Panel A.2d is the clearest example of such a cell. This neuron is active preceding reaches to the rightward set of targets, no matter where on the retina these targets fall, or the trajectory of the reach needed to acquire them. Surprisingly, the neurons shown in Fig. A.2 were somewhat exceptional in our PMd population in that they appear to use a reference frame that can be described easily. Many other neurons resisted categorization. They exhibited complex spatial tuning, with no discernible regularities across the different configurations of the eyes and hand in which we tested the cells.

Finally, when considering the PMd population as a whole, we first compared the degree of influence on PMd neurons of the eye-centered location of the target and the hand-centered location of the target, using an ANOVA. Changes in the eye-centered location of the targets (induced by changing the starting eye position) significantly affected 58% of PMd neurons, almost as many as did changing the target location relative to the arm (by changing the starting hand posture; 76% of cells). We also compared the spatial coding schemes used by PMd neurons: do cells code reach goals in a more eye-centered or more hand-centered reference frame? We used a simple distance metric to compare the dissimilarity between the two eye-aligned configurations and the dissimilarity between the two hand-aligned configurations. Out of 79 neurons across two monkeys, 51 cells (65%) use a more limb-centered than eye-centered reference frame. This leaves 35% of these PMd neurons to be classified as apparently more eye-centered than hand-centered.
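
The population comparison described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used in the study: the text says only that "a simple distance metric" was used, so the Euclidean distance, the function names, and the toy activity maps below are all assumptions made for illustration.

```python
import math

def dissimilarity(map_a, map_b):
    """Euclidean distance between two activity maps (one mean delay-period
    rate per target). The Euclidean choice is an assumption."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(map_a, map_b)))

def classify_reference_frame(eye_aligned, hand_aligned):
    """eye_aligned: two maps with the same eye position, different hand position.
    hand_aligned: two maps with the same hand position, different eye position.
    """
    d_eye = dissimilarity(*eye_aligned)
    d_hand = dissimilarity(*hand_aligned)
    # Similar hand-aligned maps mean the cell ignores eye position, which
    # points toward limb-centered coding; ties fall to "eye-centered".
    return "more limb-centered" if d_hand < d_eye else "more eye-centered"

# Hypothetical firing rates (spikes/s) for the ten targets of one neuron.
eye_aligned = ([5, 8, 12, 8, 5, 4, 6, 9, 6, 4],
               [12, 8, 5, 4, 3, 9, 6, 4, 3, 2])
hand_aligned = ([5, 8, 12, 8, 5, 4, 6, 9, 6, 4],
                [5, 9, 12, 7, 5, 4, 6, 10, 6, 4])

print(classify_reference_frame(eye_aligned, hand_aligned))
```

In this toy case the maps change little when only the eyes move and change substantially when only the hand moves, so the cell is labeled more limb-centered.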

A.2.3 Significance

The chief finding of this study is that the eye-centered location of the reach goal strongly influences motor planning activity in the dorsal aspect of the pre-motor cortex (PMd). It is perhaps surprising to find a strong influence of the retinal location of the reach target this far along the pathway for the processing of visually-guided reaching. PMd projects to the spinal cord and to the primary motor cortex, and is intimately involved in controlling arm movements (Churchland and Shenoy 2006). Why should neurons this integral to motor planning still carry a signal of the target location in sensory coordinates? Two categories of explanation exist. It could be that the retinal information about reach endpoint is still important, even at the advanced stage of movement planning occupied by PMd; our finding of retinal location information in PMd might be evidence for a rich, multipotential spatial coding strategy in the area. Alternatively, this retinal information could be unimportant: a residue from the initial cortical representation of the reach endpoint. Perhaps all cortical output stages are influenced by the retinal locations of reach goals, with the final conversion to limb coordinates actualized only just prior to the reach itself (Zipser and Andersen 1988). Another possibility is that residual eye influences simply average away in the spinal cord or at the motoneurons, so that there is no need for cortex to fully eradicate the eye signals.

Nonetheless, the presence of eye-influenced neural activity in PMd (and perhaps even M1, were that question rigorously pursued) has large implications for neural prosthetic systems. These eye-position-related correlations may cause confounds in the decoding of motor intention. For example, in the BCI experiments of Chapter 3, we controlled for eye modulation by fixing the gaze for each trial. It is quite possible that we would not have achieved so high a performance had the animal been allowed to gaze freely during trials, as this could add unexplained “noise” to our neural models. Moreover, should eye position also influence peri-movement activity, which appears to be the case in our data (unpublished observations), then the current performance of peri-movement-based, continuous cursor prostheses may be artificially limited as well. Eventually, more robust systems will have to contend with free gaze.

A.3 Mechanisms of Motor Planning

A.3.1 Motivation

In the past two sections, we have described experiments that illustrate how the neurons in the motor planning region of the brain exhibit complex patterns. These neural responses do not fit into a simple framework. This raises the question of how we might try to understand the more mechanistic aspects of motor planning. One first step in this direction is to focus on the time course of motor preparation. Reaction times (RTs) (from the go cue until movement onset) are shorter when delays are longer, suggesting that some time-consuming preparatory process is given a head start by the delay (Riehle and Requin 1989; Crammond and Kalaska 2000). Thus the progress of a developing motor plan is likely reflected in the neural activity. Perhaps activity must rise above a threshold to trigger the movement, as seems likely for saccadic eye movements (Hanes and Schall 1996). An instructed delay could allow activity to approach threshold and shorten the subsequent RT. Supporting this “rise-to-threshold” hypothesis, higher firing rates are often associated with shorter RTs (Riehle and Requin 1993; Bastian et al. 2003), although Crammond and Kalaska (2000) found that peak firing rates following the go cue (when the movement is presumably triggered) were on average lower following an instructed delay.

An alternate hypothesis, illustrated in Fig. A.3, assumes that the movement produced is a function of the state of preparatory activity at the time some trigger is applied. For each possible movement, there would be an “optimal” subspace of firing rates, appropriate to generate a sufficiently accurate movement. Motor preparation might therefore be an optimization: bringing firing rates from their initial state to the appropriate subspace. Activity might drift somewhat while waiting to execute, but motor preparation would remain “complete” so long as firing rates remain within the optimal subspace. Is there evidence that the brain actively attempts to bring firing rates to that subspace? Is some penalty paid, perhaps a longer RT, if firing rates are elsewhere? We show that these questions can be addressed by measuring the variability of firing rates. This work has been published in a peer-reviewed journal (Churchland et al. 2006b).

A.3.2 Results

Many of our analyses rely on the measurement of neural variability, across trials of the same type, made as a function of time. A central assumption of this approach is that the measured variability is attributable to both cell-intrinsic variability in spike production and to “true” variability in the underlying firing rate on each trial. Our goal was to isolate the latter, as best as possible, by normalizing with respect to the estimated contribution of the former. To do so, we compute the variance of firing rate across trials and normalize by the mean firing rate, all as a function of time. We term the resulting measurement the normalized variance (NV). The logic behind this metric is

[Figure A.3 image not reproduced. Axes: firing rate of neuron 1, neuron 2, and neuron 3. Labels: left reach, right reach, trial 1, trial 2.]

Figure A.3: Illustration of the optimal-subspace hypothesis. The configuration of firing rates is represented in a state space, with the firing rate of each neuron contributing an axis, only three of which are drawn. For each possible movement, we hypothesize that there exists a subspace of states that are optimal in the sense that they will produce the desired result when the movement is triggered. Different movements will have different optimal subspaces (shaded areas). The goal of motor preparation would be to optimize the configuration of firing rates so that it lies within the optimal subspace for the desired movement. For different trials (arrows), this process may take place at different rates, along different paths, and from different starting points.

as follows. Intrinsic spiking variability is thought to be near Poisson for cortical neurons, so that its variance scales linearly with mean firing rate. Thus, if the measured across-trial variability were attributable solely to intrinsic spiking variability (i.e., the underlying firing rate were identical on each trial), the NV should be unity. In the presence of variability in underlying firing rate, the NV should be greater than unity. In particular, we were interested in whether variability in underlying firing rate declined during the course of the trial (see Fig. A.3), since the underlying firing rate is taken from an uncontrolled initial condition to a consistent pre-movement subspace. In this case, the NV should decline from above one to near one.

As predicted, Fig. A.4a shows that the NV (±SE computed across isolations/target locations) declined after target onset (see arrow), remained at a rough plateau during the delay, and fell again after the go cue. Figure A.4a includes three different datasets that had different delay period lengths. This general pattern was also consistent across other datasets and across different monkeys. The initial decline in the NV consumed 98-198 ms depending on the monkey and dataset. This is consistent with the idea that the magnitude of the NV indicates the approximate degree of motor preparation yet to be accomplished.

Admittedly, this interpretation rests on some assumptions. First, it assumes that the increasing consistency of firing rates with time reflects their increasing accuracy (i.e., their increasing tendency to occupy the optimal subspace, whose boundaries cannot be easily inferred using current methods). Second, it assumes that there is a limit on the rate at which firing rates approach their putatively optimal values, such that progress before the go cue shortens the subsequent RT. The first assumption is difficult to test directly. The second assumption can be tested directly, by comparing the rate of decline in the NV for trials with different delay durations. To do so, we used the experimental data that had three discrete delay-period durations (30, 130, and 230 ms, randomly interleaved). As previously stated, Fig. A.4a shows the NV computed for the three delays, aligned to the onset of the go cue. The rate of decline in the NV is similar for the three delay durations. As a consequence, at the time of the go cue the NV for the 230 ms delay has dropped to a plateau, while the NV for the 30 ms delay does not reach the same point until ~80 ms later, potentially explaining why mean RT is longer. Figure A.4b plots RT versus the NV at the time of the go cue for the three delays. The relationship increases monotonically. Thus, the height of the NV at the time of the go cue is predictive of RT, as would be expected if it reflected the average degree of motor preparation yet to be accomplished.
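
The logic of the NV can be illustrated with a small simulation. This is a sketch under stated assumptions, not the analysis pipeline of Churchland et al. (2006b): it works with spike counts in a single fixed window (so that Poisson spiking gives a ratio of one), and the trial count, mean count, and range of trial-to-trial rates are hypothetical values chosen for illustration.

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw one Poisson count with mean lam (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def normalized_variance(counts):
    """Across-trial variance of the spike counts divided by their mean."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean

rng = random.Random(0)
n_trials = 4000
mean_count = 5.0  # hypothetical: e.g. 50 spikes/s in a 100 ms window

# Case 1: identical underlying rate on every trial (spiking noise only).
fixed = [poisson_sample(rng, mean_count) for _ in range(n_trials)]

# Case 2: the underlying rate itself differs from trial to trial.
variable = [poisson_sample(rng, rng.uniform(2.0, 8.0)) for _ in range(n_trials)]

nv_fixed = normalized_variance(fixed)
nv_variable = normalized_variance(variable)
print(f"NV, fixed rate:    {nv_fixed:.2f}")     # expected near 1
print(f"NV, variable rate: {nv_variable:.2f}")  # expected above 1
```

With these toy settings, the second case has a theoretical ratio of (5 + 3) / 5 = 1.6, because the variance of the trial-to-trial mean (3 for a uniform range of width 6) adds to the Poisson variance. A decline of the NV toward one over the course of a trial would then correspond to the second case gradually turning into the first.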

[Figure A.4 image not reproduced. Recoverable labels: firing rate and NV traces for the 30 ms, 130 ms, and 230 ms delays; event markers "target" and "movement onset"; 100 ms scale bar; x-axes "NV at go cue" and "Change in rate (spikes/s) by go cue".]

Figure A.4: NV results of an experiment using three discrete delay-period durations: 30, 130 and 230 ms. Data are from one day’s recording using monkey G (39 isolations, 957 trials). a. Traces at top show the change in mean firing rate from baseline (SE), across all isolations and target locations. Traces below show the NV (SE). Analysis was performed with data aligned to the go cue. This means that for each delay duration, analysis was also aligned to target onset, although that occurred at different times prior to the go cue. b. Mean RT versus the NV, measured at the time of the go cue for the three delay-period durations. Bars show standard errors. c. Mean RT versus the change in firing rate from baseline, measured at the time of the go cue for the three delay-period durations. Black symbols plot the mean change averaged across all neurons and conditions. Gray symbols plot the same analysis but including only each neuron’s preferred condition. Note that the x-axis has been rescaled in the latter case.

In contrast, Fig. A.4c shows that there was no simple relationship between RT and mean firing rate at the time of the go cue. This was true whether we considered all conditions (black) or just preferred conditions (gray). Note that this would also have been true had we considered firing rate at some fixed time (e.g., 100 ms) after the go cue. At that point, the 30 ms delay (which produced the longest RTs) produced the highest firing rates (see Fig. A.4a, top). At no time after the go cue were firing rates highest for the 230 ms delay duration, although it produced the shortest RT.

A.3.3 Significance

The NV reveals a previously unknown degree of temporal structure in the variability of neural activity during a delayed reach task. Variability declines rather dramatically after target onset, and more modestly after the go cue. Because the NV is a measurement of across-trial firing-rate variability, the most natural interpretation is that there is a decline in the across-trial variability of the underlying firing rates. Alternately, the decline in the NV might reflect a change in the within-trial cell-intrinsic process of spike production (e.g., from cortex-like statistics to vestibular-afferent-like statistics). A number of controls (see Churchland et al. 2006b) exclude the most obvious ways this might happen (most trivially, with increasing firing rate), but it is difficult to completely exclude this possibility given extra-cellular recordings alone. Still, the proposal that cell-intrinsic spiking statistics change would be quite radical.

If a movement is in whole or in part a consequence of the preparatory activity present at the time it is triggered, then it would seem critical that such activity be optimized before triggering. We hypothesize that such optimization is the behaviorally-inferred process of motor preparation. Our experiments and analyses were designed to test two central predictions of this hypothesis. First, if the brain is actively “trying” to bring firing rates to a particular state, then this should produce a decline in variability. Second, if the brain can sense when preparatory activity is accurate, and if activity is on average roughly accurate, then RTs should be shortest when variability is low, that is, when firing rates are closest to their mean (we are not suggesting that the brain cares about variability per se, but rather that reduced variability is a correlate of increased accuracy). That these two predictions were borne out lends support to the optimal-subspace hypothesis.

Measurements of variability have been extensively employed in the analysis of neural data (Tolhurst et al. 1983; Gur et al. 1997; Bair and O’Keefe 1998; Averbeck and Lee 2003). Yet the present study is, to our knowledge, the first to use a measurement of variability in an attempt to track the time-course of internal processing (although this interpretation was anticipated by Horwitz and Newsome 2001). Given present results, it seems plausible that the measured increase in consistency reflects an increase in accuracy, that is, an increasing likelihood that firing rates have reached their appropriate values. This highlights an advantage of measuring firing rate variability. Even when little is known regarding the “representation” used by an area of interest (so that the experimenter cannot know which firing-rate vectors count as “accurate” or “appropriate”), an index of variability can potentially allow one to infer the time-course with which firing rates become accurate.

A.3.4 Beyond NV

The NV results suggest that the network underlying motor preparation exhibits rich dynamics. However, NV provides little insight into the course of motor planning on a single trial. A gradual fall in trial-to-trial variance might reflect a gradual convergence on each trial, or might reflect rapid transitions that occur at different times on different trials. All the NV tells us about the dynamic properties of the underlying network is the basic fact of convergence from uncontrolled initial conditions to a consistent pre-movement preparatory state. The structure of any underlying attractors and corresponding basins of attraction is unobserved.

To better understand the underlying mechanism of motor planning, one can adopt latent variable methods. These methods can identify a hidden dynamical system that summarizes and explains the simultaneously-recorded spike trains. The central idea is that the responses of different neurons reflect different views of a common dynamical process in the network, whose effective dimensionality is much smaller than the total number of neurons in the network. While the underlying state trajectory may be slightly different on each trial, the commonalities among these trajectories can be captured by the network’s parameters, which are shared across trials. These parameters define how the network evolves over time, as well as how the observed spike trains relate to the network’s state at each time point.

Recall that the NV results inform us that neural activity is initially variable across trials, but appears to settle during the delay period. A dynamical system model capable of expressing these types of behaviors of neural systems is a fully-connected recurrent network with Gaussian noise:

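$$\mathbf{x}_t \mid \mathbf{x}_{t-1} \sim \mathcal{N}\left(f(\mathbf{x}_{t-1}),\, Q\right) \qquad \text{(A.1)}$$

$$f(\mathbf{x}) = (1 - k)\,\mathbf{x} + k\, W\, g(\mathbf{x}),$$

where the state $\mathbf{x}_t \in \mathbb{R}^{p \times 1}$ is a vector of the node values in the recurrent network at time $t = 1, \ldots, T$, $k \in \mathbb{R}$ is related to the time constant of the network, $W \in \mathbb{R}^{p \times p}$ is a connection weight matrix, and $Q \in \mathbb{R}^{p \times p}$ is a covariance matrix. The function $f : \mathbb{R}^{p \times 1} \rightarrow \mathbb{R}^{p \times 1}$ defines the non-linear state dynamics and $g$ is a non-linear activation function that acts element-by-element on its vector argument. We took $g$ to be the error function defined by

$$\operatorname{erf}(z) = \frac{2}{\sqrt{\pi}} \int_0^z e^{-t^2}\, dt. \qquad \text{(A.2)}$$

We chose the error function because it made the fitting algorithm analytically tractable. The initial state is Gaussian-distributed:

$$\mathbf{x}_1 \sim \mathcal{N}\left(\mathbf{p}_1,\, V_1\right), \qquad \text{(A.3)}$$

where $\mathbf{p}_1 \in \mathbb{R}^{p \times 1}$ and $V_1 \in \mathbb{R}^{p \times p}$ are the mean vector and covariance matrix, respectively.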

The output distribution is a generalized linear model that defines the relationship between all nodes in the state $\mathbf{x}_t$ and the spike count $y_t^i \in \{0, 1, 2, \ldots\}$ of neuron $i = 1, \ldots, q$ in the $t$th time bin:

$$y_t^i \mid \mathbf{x}_t \sim \text{Poisson}\left(h\left(\mathbf{c}_i' \mathbf{x}_t + d_i\right) \Delta\right), \qquad \text{(A.4)}$$

where $\mathbf{c}_i \in \mathbb{R}^{p \times 1}$ and $d_i \in \mathbb{R}$ define a linear function of the state and $\Delta \in \mathbb{R}_+$ is the time bin width. For notational compactness, the spike counts for all $q$ simultaneously-recorded neurons are assembled into a $q \times 1$ vector $\mathbf{y}_t$, whose $i$th element is $y_t^i$. The link function $h : \mathbb{R} \rightarrow \mathbb{R}_+$ is chosen to be $h(z) = \log(1 + e^z)$ so as to ensure that the mean rate parameter of each Poisson distribution is non-negative.

The computational details and data simulations are omitted here; the interested reader is referred to Yu et al. (2006a) and Yu (2007). Applying this latent variable method to delayed-reach neural data, we can try to reveal the otherwise hidden, cognitive state of the monkey, while he is in the midst of planning a reach to the presented targets. Figure A.5 shows the means of the marginal state posteriors $P(\mathbf{x}_t \mid \{\mathbf{y}\}_1^T)$ (black traces) for 100 test trials based on the dynamical model with recurrent state dynamics; note that a separate trajectory is inferred for each trial. The blue and green dots correspond to 50 ms after target presentation and 50 ms after the go cue, respectively. Despite the trial-to-trial variability in the delay period neural responses, the state evolves along a characteristic path on each trial, presumably from an idle state to a fully formed reach plan. Even with the characteristic structure, however, the state trajectories are not all identical. This presumably reflects the fact that the motor planning process is internally regulated, and its time course may differ from trial to trial, even when the presented stimulus (in this case, the reach target) is identical.
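
To make the generative model concrete, the following sketch samples one trial from Eqs. (A.1) to (A.3) and the Poisson output distribution. It is only a forward simulation under assumed settings, not the fitting or inference procedure of Yu et al. (2006a): the parameter values, the diagonal (isotropic) choices for $Q$ and $V_1$, and all function names are hypothetical and chosen for illustration.

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw one Poisson count with mean lam (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def softplus(z):
    """Link h(z) = log(1 + e^z), written to avoid overflow for large |z|."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def dynamics(x, W, k):
    """f(x) = (1 - k) x + k W g(x), with g = erf applied element by element."""
    g = [math.erf(v) for v in x]
    return [(1.0 - k) * x[i] + k * sum(W[i][j] * g[j] for j in range(len(x)))
            for i in range(len(x))]

def simulate_trial(rng, W, k, noise_sd, p1, init_sd, C, d, delta, T):
    """Sample states x_1..x_T and spike-count vectors y_1..y_T for one trial."""
    # Initial state: Gaussian around p1 (V_1 assumed isotropic here).
    x = [rng.gauss(mu, init_sd) for mu in p1]
    states, counts = [], []
    for _ in range(T):
        states.append(x)
        # Rate of neuron i is h(c_i' x + d_i); the count is Poisson(rate * delta).
        rates = [softplus(sum(c * v for c, v in zip(c_i, x)) + d_i)
                 for c_i, d_i in zip(C, d)]
        counts.append([poisson_sample(rng, r * delta) for r in rates])
        # Next state: f(x) plus Gaussian noise (Q assumed isotropic here).
        x = [m + rng.gauss(0.0, noise_sd) for m in dynamics(x, W, k)]
    return states, counts

# Hypothetical settings: p = 2 latent dimensions, q = 3 neurons, 20 ms bins.
W = [[1.5, -0.5], [0.5, 1.5]]
C = [[5.0, 0.0], [0.0, 5.0], [3.0, 3.0]]
d = [20.0, 15.0, 10.0]
states, counts = simulate_trial(random.Random(1), W, k=0.1, noise_sd=0.05,
                                p1=[0.0, 0.0], init_sd=0.5, C=C, d=d,
                                delta=0.02, T=50)
print(counts[:3])
```

Running the simulation with different seeds gives a different state trajectory on each trial even though the parameters are shared, which is the property the model relies on to describe trial-to-trial differences in the time course of planning.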

[Figure A.5 image not reproduced. Legend: Target onset + 50 ms.]

Figure A.5: Inferred state trajectories (black) in latent x space for 100 test trials, based on the model with recurrent state dynamics. Dots indicate 50 ms after target onset (blue) and 50 ms after the go cue (green). The radius of the green dots is logarithmically-related to delay period duration (200, 750, or 1000 ms).

While these results are promising, they are still somewhat preliminary. The trajectories agree with intuition, but it is necessary to relate some aspect of the trajectory (e.g., closeness to the convergence region) with behavior (e.g., reaction time). Since we cannot measure directly the hidden cognitive state of the animal during the planning process, we must use indirect behavioral correlates to help confirm the validity of these inferred single-trial hidden trajectories. This is a subject of ongoing efforts in the Shenoy laboratory.

A.4 Mixture of Trajectory Models

A.4.1 Motivation

One of the key components of a prosthetic device is its decoding algorithm, which translates neural activity into arm reaches. Examples of decoding algorithms that translate neural activity around the time of the movement (termed peri-movement activity) into continuous arm trajectories include population vectors (Taylor et al. 2002) and linear filters (Serruya et al. 2002; Carmena et al. 2003). Both of these decoding algorithms assume a linear relationship between the neural activity and arm state. In general, the arm state may include, but is not limited to, arm position, velocity, and acceleration. While these linear decoding algorithms are effective, recursive Bayesian decoders have been shown to provide more accurate trajectory estimates (Brown et al. 1998; Brockwell et al. 2004; Wu et al. 2004, 2006). Recursive Bayesian decoders are based on the specification of a probabilistic model comprising (1) a trajectory model, which describes how the arm state changes from one time step to the next, and (2) an observation model, which describes how the observed neural activity relates to the time-evolving arm state. If the modeling assumptions are satisfied, then Bayesian estimation makes optimal use of the observed data, provides confidence regions for the arm state estimates, and allows for non-linear relationships between the neural activity and arm state.

The function of the trajectory model is to build into the recursive Bayesian decoder prior knowledge about the form of the reaches. The degree to which the trajectory model reflects the dynamics of the actual reaches directly affects the accuracy with which trajectories can be decoded from neural data. A commonly-used trajectory model is the random walk (Brown et al. 1998; Brockwell et al. 2004), which captures the fact that arm trajectories tend to be smooth. In other words, small changes in arm state from one time step to the next are more likely than large changes. An alternative trajectory model is based on linear dynamics perturbed by Gaussian noise, termed a linear-Gaussian model (Wu et al. 2004; Shoham et al. 2005; Wu et al. 2006).

It is often the case that there are a finite number of distinct objects that a disabled patient may wish to reach for in his/her workspace. Examples include reaching for the lighting, bed, or temperature controls; typing on a keyboard; or picking up the phone (see footnote 2). Natural reaching movements in such settings exhibit the following three properties. First, many, though clearly not all, reaching movements in the workspace will be directed to this set of discrete goals. Second, multiple reaches to the same goal are not all identical. For example, there may be variability in reach speed or curvature. Third, the trajectories generally start at rest, proceed out to the reach goal, and end at rest. Current trajectory models, such as the random walk or linear-Gaussian models, are limited in their ability to capture all three aforementioned properties. In particular, it is not possible to

2 See Hochberg et al. (2006) for additional descriptions and videos of a spinal-cord-injured patient operating a neural prosthesis.

specify multiple discrete reach goals at which the trajectories are likely to come to rest. Thus, we seek a trajectory model that better captures the dynamics of goal-directed reaches, which should in turn yield more accurate trajectory estimates.

In addition, on a given trial, there can be information available about the identity of the upcoming reach goal before the reach begins. For example, as we have discussed throughout, information about the goal of an upcoming reaching movement can often be deduced before the reach begins from neural activity related to motor preparation (delay activity). It should be possible to use this goal information, when available, to improve the accuracy of the decoded trajectory.

We present a mixture of trajectory models (MTM) framework that provides (1) a suitable trajectory model for goal-directed reaches, and (2) a principled way to incorporate information about the identity of the upcoming reach goal. This work has partially appeared in conferences and peer-reviewed journals (Kemere et al. 2003, 2004a,b) and the latest incarnation is presently in manuscript form (Yu et al. 2006b).

A.4.2 Methods

Ideally, we would like to construct a complete model of neural motor control that captures the hard, physical constraints of the limb, the soft constraints imposed by neural mechanisms, as well as the physical surroundings and context. One way to approximate such a complete model is to build a separate trajectory model for each group of movements with similar objectives. Here, we group the movements by reach goal. At the onset of a new movement the desired reach goal is unknown, or imperfectly known, and so the full trajectory model is composed of a mixture of the individual, goal-specific trajectory models. We develop here a recursive Bayesian decoder based on a mixture of trajectory models (MTM).

The decoding of a continuous arm trajectory involves finding the likely sequences of arm states corresponding to the observed neural activity. At each time step $t$, we seek to compute the distribution of the arm state $x_t$ given the peri-movement neural activity $y_1, y_2, \ldots, y_t$ (or $\{y\}_1^t$) observed up to that time. This distribution is $P(x_t \mid \{y\}_1^t)$ and is termed the state posterior. Here, $y_t$ is a vector of binned spike counts across the neural population at time step $t$, and $t = 1$ corresponds to the time at which we begin to decode movement.

If the desired reach goal $m^*$ is perfectly known before the reach begins, then we can compute the state posterior based on the individual trajectory model corresponding to that reach goal. This distribution is $P(x_t \mid \{y\}_1^t, m^*)$ and is termed the conditional state posterior. In general, the desired reach goal is unknown or imperfectly known, so we need to compute $P(x_t \mid \{y\}_1^t, m)$ for each $m \in \{1, \ldots, M\}$, where $M$ is the number of possible reach goals. To combine the $M$ conditional state posteriors, we can simply expand $P(x_t \mid \{y\}_1^t)$ by conditioning on the reach goal $m$:

$$P(x_t \mid \{y\}_1^t) = \sum_{m=1}^{M} P(x_t \mid \{y\}_1^t, m)\, P(m \mid \{y\}_1^t). \tag{A.5}$$


In other words, the state posterior is a weighted sum of the conditional state posteriors. The weights $P(m \mid \{y\}_1^t)$ represent the probability that the desired reach goal is $m$, given the observed spike counts up to time $t$. Bayes' rule can then be applied to these weights in Eq. (A.5), yielding the key equation for the MTM framework:

$$P(x_t \mid \{y\}_1^t) = \sum_{m=1}^{M} P(x_t \mid \{y\}_1^t, m)\, \frac{P(\{y\}_1^t \mid m)\, P(m)}{\sum_{m'=1}^{M} P(\{y\}_1^t \mid m')\, P(m')}. \tag{A.6}$$

The conditional state posteriors $P(x_t \mid \{y\}_1^t, m)$ and data likelihoods $P(\{y\}_1^t \mid m)$ in Eq. (A.6) can be computed or approximated using any of a number of different recursive Bayesian decoding techniques, including Bayes' filter (Brown et al. 1998), particle filters (Brockwell et al. 2004; Shoham et al. 2005), and Kalman filter variants (Wu et al. 2004, 2006). If available, information about the identity of the upcoming reach goal can be incorporated naturally into the MTM framework via $P(m)$ in Eq. (A.6). This information must be available before the reach begins and may differ from trial to trial. If no such information is available, a uniform distribution ($P(m) = 1/M$) can be used across all trials. Alternatively, we can use the maximum-likelihood methods described in Section 3.2.2 (see Eqs. (3.5) and (3.4)) to find $P(m)$ from the delay-period activity.
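The weights in Eq. (A.6) can be illustrated with a short sketch. It assumes that the per-goal log likelihoods $\log P(\{y\}_1^t \mid m)$ have already been produced by some recursive filter; the function name `goal_posterior` and the numerical values are hypothetical and chosen only for illustration. Working in log space is a common precaution against underflow, not something prescribed by the text above.

```python
import math

def goal_posterior(log_lik, prior):
    """Posterior goal probabilities P(m | y_1..t), i.e. the weights in Eq. (A.6).

    log_lik[m] holds log P(y_1..t | m) and prior[m] holds P(m).
    """
    # Log of the numerator P(y | m) P(m); goals with zero prior get -inf.
    log_joint = [
        ll + math.log(p) if p > 0.0 else -math.inf
        for ll, p in zip(log_lik, prior)
    ]
    peak = max(log_joint)
    # Subtracting the peak before exponentiating avoids numerical underflow.
    unnorm = [math.exp(v - peak) for v in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]

M = 8
uniform_prior = [1.0 / M] * M            # no delay-period information available
log_lik = [-120.0] * M                   # hypothetical log likelihoods
log_lik[2] = -120.0 + math.log(3.0)      # data are three times as likely under goal 2
weights = goal_posterior(log_lik, uniform_prior)
```

Replacing `uniform_prior` with goal probabilities decoded from delay activity is how prior goal information would enter this computation.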

MTM model

The particular probabilistic model explored in this work is

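$$x_t \mid x_{t-1}, m \sim \mathcal{N}(A_m x_{t-1} + b_m,\; Q_m) \tag{A.7}$$

$$x_1 \mid m \sim \mathcal{N}(\pi_m,\; V_m) \tag{A.8}$$

$$s^i_{t-\mathrm{lag}_i} \mid x_t \sim \mathrm{Poisson}\!\left(e^{c_i' x_t + d_i}\, \Delta\right), \tag{A.9}$$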

where $m \in \{1, \ldots, M\}$ indexes reach goal and $M$ is the number of reach goals. The dynamical arm state at time step $t \in \{1, \ldots, T\}$ is $x_t \in \mathbb{R}^{p \times 1}$, which includes position, velocity, and acceleration terms. The corresponding observation, $s^i_{t-\mathrm{lag}_i} \in \{0, 1, 2, \ldots\}$, is a peri-movement spike count for unit $i \in \{1, \ldots, q\}$ taken in a time bin of width $\Delta$, where $\mathrm{lag}_i$ is the time lag (in time steps) between the neural firing of the $i$th unit and the associated arm state. For notational convenience, the spike counts across the $q$ simultaneously recorded units are assembled into a $q \times 1$ vector $y_t$, whose $i$th element is $s^i_{t-\mathrm{lag}_i}$. This is the $y_t$ that appears in Eqs. (A.5) and (A.6). The parameters $A_m \in \mathbb{R}^{p \times p}$, $b_m \in \mathbb{R}^{p \times 1}$, $Q_m \in \mathbb{R}^{p \times p}$, $\pi_m \in \mathbb{R}^{p \times 1}$, $V_m \in \mathbb{R}^{p \times p}$, $\mathrm{lag}_i \in \mathbb{Z}$, $c_i \in \mathbb{R}^{p \times 1}$, and $d_i \in \mathbb{R}$ do not depend on time and are fit to training data. Equations (A.7) and (A.8) define the trajectory model, which describes how the arm state $x_t$ changes from one time step to the next. In this case, the full trajectory model is a mixture of standard linear-Gaussian trajectory models, each describing the trajectories toward a particular reach goal indexed by $m$.
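To make the generative reading of Eqs. (A.7) and (A.9) concrete, the sketch below draws one trajectory and the accompanying spike counts for a single reach goal. Everything numerical here is hypothetical: a two-dimensional state (position, velocity), two units, zero lags, and a diagonal $Q_m$ are simplifying assumptions made for illustration, and the helper names are not from the original work.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw a Poisson count with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def step_state(x_prev, A, b, q_std, rng):
    """One draw from Eq. (A.7), assuming a diagonal Q_m with standard deviations q_std."""
    return [
        sum(a * xj for a, xj in zip(row, x_prev)) + bi + rng.gauss(0.0, s)
        for row, bi, s in zip(A, b, q_std)
    ]

def spike_counts(x, C, d, delta, rng):
    """One draw from Eq. (A.9) for every unit; all lags are taken as zero here."""
    counts = []
    for c_i, d_i in zip(C, d):
        rate = math.exp(sum(c * xj for c, xj in zip(c_i, x)) + d_i)  # spikes per second
        counts.append(sample_poisson(rate * delta, rng))
    return counts

rng = random.Random(0)
# Hypothetical parameters for one goal m; the dynamics settle at position 0.1, velocity 0.
A = [[1.0, 0.01], [-0.5, 0.9]]
b = [0.0, 0.05]
q_std = [0.001, 0.01]
C = [[2.0, 1.0], [-1.0, 0.5]]    # "preferred state vectors" c_i for two units
d = [3.0, 2.5]
x = [0.0, 0.0]                   # stands in for a draw from Eq. (A.8)
trajectory, spikes = [], []
for _ in range(50):
    x = step_state(x, A, b, q_std, rng)
    trajectory.append(x)
    spikes.append(spike_counts(x, C, d, 0.01, rng))
```

Because $b_m$ and $A_m$ differ across goals, each mixture component can pull the state toward its own resting point, which is the property the mixture is meant to capture.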


Equation (A.9) defines the observation model, which describes how the recorded peri-movement spike counts $s^i_{t-\mathrm{lag}_i}$ relate to the arm state $x_t$. In Eq. (A.9), the linear mapping $c_i' x_t + d_i$ is a cosine tuning model (Georgopoulos et al. 1982), where $c_i$ is the "preferred state vector." This linear mapping is then passed through an exponential to ensure that the mean firing rate of the $i$th unit at time $t - \mathrm{lag}_i$, $e^{c_i' x_t + d_i}$, is non-negative. Note that, whereas each mixture component indexed by $m$ in the trajectory model (Eqs. (A.7) and (A.8)) can have different parameters leading to different arm state dynamics, the observation model (Eq. (A.9)) is the same for all $m$.

Arm trajectories can be decoded from neural activity by applying Bayes' rule to the statistical relationships in Eqs. (A.7)-(A.9). Having observed the neural data, we seek the likely sequences of arm states that could have led to those neural observations. When the trajectory and observation models are both linear-Gaussian, all of the relevant distributions are Gaussian and the appropriate integrals can be computed exactly. In this case, the solution is identical to applying the standard Kalman filter. For our model, however, the observation model in Eq. (A.9) has Poisson noise, so approximations are required to develop the appropriate estimation filter. These approximations are omitted here for the sake of brevity, but the interested reader can refer to Brown et al. (1998) and Yu et al. (2006b).
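For orientation, the generic recursive Bayesian filtering identities behind the conditional state posteriors and data likelihoods of Eq. (A.6) can be written out as follows. This is the textbook recursion under the model's assumption that $y_t$ depends only on $x_t$; it is not the specific Poisson approximation used in this work, which is left to the cited references.

```latex
\begin{align}
P(x_t \mid \{y\}_1^{t-1}, m) &= \int P(x_t \mid x_{t-1}, m)\,
    P(x_{t-1} \mid \{y\}_1^{t-1}, m)\, dx_{t-1}
    && \text{(one-step prediction, uses Eq. (A.7))} \\
P(x_t \mid \{y\}_1^{t}, m) &= \frac{P(y_t \mid x_t)\,
    P(x_t \mid \{y\}_1^{t-1}, m)}{P(y_t \mid \{y\}_1^{t-1}, m)}
    && \text{(measurement update, uses Eq. (A.9))} \\
P(y_t \mid \{y\}_1^{t-1}, m) &= \int P(y_t \mid x_t)\,
    P(x_t \mid \{y\}_1^{t-1}, m)\, dx_t
    && \text{(normalizer)} \\
P(\{y\}_1^{t} \mid m) &= \prod_{\tau=1}^{t} P(y_\tau \mid \{y\}_1^{\tau-1}, m)
    && \text{(data likelihood for goal } m\text{)}
\end{align}
```

The last line shows why the goal weights come at little extra cost: the normalizer of each measurement update is exactly the factor needed to accumulate the data likelihood for that goal.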

Random Walk Trajectory Model

For comparison, we also implemented the random walk trajectory model with Poisson observations presented by Brockwell et al. (2004):

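$$v_t - v_{t-1} = v_{t-1} - v_{t-2} + \epsilon_t \tag{A.10}$$

$$\begin{bmatrix} v_1 \\ v_0 \end{bmatrix} \sim \mathcal{N}(\pi,\; V) \tag{A.11}$$

$$s^i_{t-\mathrm{lag}_i} \mid \tilde{v}_t \sim \mathrm{Poisson}\!\left(e^{c_i' \tilde{v}_t + d_i}\, \Delta\right), \tag{A.12}$$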

where $\epsilon_t \sim \mathcal{N}(0, Q)$ in Eq. (A.10), $v_t \in \mathbb{R}^{p \times 1}$ is the arm velocity at time $t$, $\tilde{v}_t$ is defined to be $[v_t' \;\; \|v_t\|]'$ in Eq. (A.12), and $\|v_t\|$ is the arm speed at time $t$. As in Eq. (A.9), $s^i_{t-\mathrm{lag}_i}$ is the peri-movement spike count of the $i$th unit at time $t - \mathrm{lag}_i$, where $\mathrm{lag}_i$ is the time lag between the neural firing of unit $i \in \{1, \ldots, q\}$ and the associated arm velocity. Spike counts are taken in time bins of width $\Delta$. The parameters $Q \in \mathbb{R}^{p \times p}$, $\pi \in \mathbb{R}^{2p \times 1}$, $V \in \mathbb{R}^{2p \times 2p}$, $\mathrm{lag}_i \in \mathbb{Z}$, $c_i \in \mathbb{R}^{(p+1) \times 1}$, and $d_i \in \mathbb{R}$ are fit to training data, as described below. Note that the random walk trajectory model is a special case of the linear-Gaussian trajectory model with appropriately chosen parameters in Eqs. (A.7) and (A.8). Equations (A.10) and (A.11) define the random walk trajectory model that imposes smoothness in acceleration; Eq. (A.12) defines the Poisson observation model. To decode arm trajectories using this probabilistic model, we followed Brockwell et al. (2004) and implemented particle filtering with 2500 particles at each time step. This yielded a velocity estimate at each time step. To obtain


a single decoded position trajectory, the means of these velocity estimates were integrated over time. Because the arm state does not include positional variables in this model, we assumed the actual initial arm position was known. Thus, the decoder based on the random walk trajectory model was given a slight advantage over the other decoders.
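One way to see the random walk model as a linear-Gaussian special case is to stack two consecutive velocities into a single state, so that the second-order recursion of Eq. (A.10) becomes a first-order update of the form in Eq. (A.7) with $b = 0$. The sketch below checks this numerically and also shows the velocity integration step. The stacking convention, helper names, and numbers are assumptions made for illustration only.

```python
def random_walk_mean(v_prev, v_prev2):
    """Noise-free part of Eq. (A.10): v_t = 2 v_{t-1} - v_{t-2}."""
    return [2.0 * a - b for a, b in zip(v_prev, v_prev2)]

def companion_matrix(p):
    """A = [[2I, -I], [I, 0]] acting on the stacked state [v_t, v_{t-1}]."""
    A = [[0.0] * (2 * p) for _ in range(2 * p)]
    for k in range(p):
        A[k][k] = 2.0        # 2 * v_{t-1}
        A[k][p + k] = -1.0   # - v_{t-2}
        A[p + k][k] = 1.0    # copy v_{t-1} into the lower half
    return A

def matvec(A, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def integrate_velocity(start_pos, velocities, dt):
    """Integrate mean velocity estimates into a position trajectory."""
    positions, pos = [], list(start_pos)
    for v in velocities:
        pos = [pk + vk * dt for pk, vk in zip(pos, v)]
        positions.append(pos)
    return positions

p = 2
A = companion_matrix(p)
v1, v0 = [3.0, -1.0], [1.0, 0.5]          # hypothetical velocities
stacked_next = matvec(A, v1 + v0)          # mean of [v_2, v_1]
path = integrate_velocity([0.0, 0.0], [[1.0, 0.0], [1.0, 2.0]], 0.5)
```

In this stacked form the process noise enters only the upper half of the state, so the corresponding noise covariance is singular; that is a property of this particular construction.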

A.4.3 Results

Two monkeys were trained to perform a delayed-reach task as described in Chapter 3, which provides both plan and peri-movement activity. The reach goal was presented at one of eight possible radial locations (30, 70, 110, 150, 190, 230, 310, 350°) 10 cm away. We considered three trajectory models: a random walk model in acceleration (RWM, Eqs. (A.10) and (A.11)), a single linear-Gaussian trajectory model (STM, Eqs. (A.7) and (A.8) for the special case of M = 1), and a mixture of linear-Gaussian trajectory models (MTM, Eqs. (A.7) and (A.8)). Each of the trajectory models was fitted to the arm data with a time step of dt = 10 ms. For the STM and MTM, the following physical quantities were included in the arm state vector x_t: position, velocity, acceleration, position magnitude, and velocity magnitude. The parameters of all three trajectory models were fit using least squares. For the STM, a single linear-Gaussian trajectory model was shared across all goal locations. The STM is similar to the trajectory model used by Donoghue and colleagues (Wu et al. 2004, 2006), where it was applied to pursuit-tracking and "pinball" tasks. In contrast, for the MTM, a separate linear-Gaussian trajectory model was trained for each reach goal, based only on reaches to that goal. The trajectory model can be viewed, in the space of all possible trajectories, as a specification of which trajectories are more likely than others and by how much. This information is encoded in the parametric form of the trajectory model (e.g., random walk or linear-Gaussian), as well as in the fitted values of the model parameters. For each observation model (Eqs. (A.9) and (A.12)), we sought the optimal lag for each unit and the parameters {cj,


with those decoded using the STM (thick green) and MTM (thick orange). For the purposes of this plot, only the state elements corresponding to arm position are shown. The MTM decoded trajectory is a weighted sum of component trajectory estimates $E[x_t \mid \{y\}_1^t, m]$, one for each reach goal indexed by $m \in \{1, \ldots, 8\}$. The three component trajectory estimates with the largest weights for this trial are plotted in the upper subpanel (cyan, blue, magenta). The lower subpanel shows how the corresponding weights $P(m \mid \{y\}_1^t)$ evolved during the course of the trial. The values of these weights at time zero ($t = 0$) represent the probability that the upcoming reach goal is $m$, before any peri-movement neural activity had been observed. These initial weights are set from the plan activity, as previously discussed. As time proceeded, the weights were updated as more and more peri-movement activity was observed. The weight for the actual reach goal (cyan) in the lower subpanel was higher at every timepoint, with the clearest effect seen during the first 200 ms.3 The weighted sum of the eight component trajectory estimates (of which three are plotted in the upper-right panel), using the weights shown in the lower subpanel, yields the MTM decoded trajectory (thick orange, $E_{\mathrm{rms}}$: 7.4 mm) in the upper subpanel.
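The weighted-sum step described above can be sketched in a few lines. The component estimates and weights below are made-up numbers, the function names are hypothetical, and the error measure is assumed to be the root-mean-square Euclidean distance between actual and decoded positions over a trial, since its definition does not appear in this part of the text.

```python
import math

def mtm_estimate(weights, component_means):
    """Weighted sum over goals of the component estimates E[x_t | y_1..t, m]."""
    dim = len(component_means[0])
    return [
        sum(w * mean[k] for w, mean in zip(weights, component_means))
        for k in range(dim)
    ]

def rms_error(actual, decoded):
    """RMS Euclidean position error over a trial (assumed definition of E_rms)."""
    sq = [
        sum((a - d) ** 2 for a, d in zip(pa, pd))
        for pa, pd in zip(actual, decoded)
    ]
    return math.sqrt(sum(sq) / len(sq))

# Hypothetical position estimates (mm) from three goal-specific filters at one time step.
component_means = [[100.0, 0.0], [70.0, 70.0], [0.0, 100.0]]
weights = [0.7, 0.2, 0.1]                 # P(m | y_1..t) for those goals
estimate = mtm_estimate(weights, component_means)

# Early in a reach all components sit near the origin, so the weights barely matter.
early = mtm_estimate([0.1, 0.2, 0.7], [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]])
```

The second call illustrates why a misleading set of early weights has little effect on the decoded position while the component estimates are still close together.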

[Figure A.6 plot area. Upper panels (left and right), each titled "With delay activity": decoded trajectories, horizontal axis Horz pos (mm). Lower panels: MTM component weights versus Time (ms).]

Figure A.6: Two representative test trials in which the use of delay activity improved the MTM decoded trajectory. Upper panels: actual trajectory (thick black), STM decoded trajectory (thick green), MTM decoded trajectory with delay activity (thick orange). Lower panels: the three corresponding MTM component weights as they evolve during the trial. Time zero corresponds to 60 ms before movement onset (i.e., one time step before we begin to decode movement). For the left trial, $E_{\mathrm{rms}}$ was 17.4 and 7.4 mm for STM and MTM with delay activity, respectively. For the right trial, $E_{\mathrm{rms}}$ was 16.7 and 13.4 mm for STM and MTM with delay activity, respectively. (Experiment G20040508, trial IDs 686 and 676.)

Figure A.6b shows a different test trial. In this case, the dominant weight at t = 0 (blue) did not correspond to the actual reach goal (cyan). In other words, the delay activity incorrectly indicated the identity of the upcoming reach goal. However, as these weights were updated by the observation of peri-movement activity, this “error” was soon corrected (within approximately 80 ms). From that point on, the weight corresponding to the actual reach goal dominated. Despite

3 Without the delay-period activity, there was competition between the actual reach goal (cyan) and the neighboring goals. This particular scenario is not shown here.


this error at the beginning of the trial, the MTM decoded trajectory (thick orange, $E_{\mathrm{rms}}$: 13.4 mm) in the upper-right panel remained nearly identical to that in the upper-left panel. The reason is that the error occurred early in the trial, when all eight component trajectory estimates were still near the origin of the workspace; the weighted sum of these component estimates lies near the origin no matter how they are weighted.

Having demonstrated how the MTM framework produces trajectory estimates, we can quantify and compare the performance of decoders based on different trajectory models. Figure A.7 compares the trial-averaged decoding performance using the RWM, STM, MTM without delay activity (labeled MTMm, since only peri-movement activity is used), and MTM with delay activity (labeled MTMdm, since both delay and peri-movement activity are used). For each monkey, the trend was the same: $E_{\mathrm{rms}}$ decreased when going from RWM to STM, from STM to MTMm, and from MTMm to MTMdm (Wilcoxon paired-sample test, p < 0.01). The superior performance of the MTMm compared to the RWM and STM can be attributed to the fact that the MTM better captures the dynamics of goal-directed reaches. If delay activity is available, this additional source of information can be naturally incorporated in the MTM framework to further improve decoding performance (MTMdm). The RWM can be seen as a restricted form of the STM, which explains the higher $E_{\mathrm{rms}}$ of the RWM compared to the STM in Fig. A.7.

[Figure A.7 plot area. Panel a: Monkey G. Panel b: Monkey H. Each panel shows bars for RWM, STM, MTMm, and MTMdm.]

Figure A.7: $E_{\mathrm{rms}}$ (mean ± SE) comparison for decoders using the RWM, STM, MTM without delay activity (MTMm), and MTM with delay activity (MTMdm). a. Monkey G (98 units). b. Monkey H (99 units).

A.4.4 Significance

The mixture of trajectory models framework provides (1) a suitable trajectory model for goal-directed reaches, and (2) a principled way to incorporate information about the identity of the upcoming reach goal. In contrast to current trajectory models, a mixture of linear-Gaussian trajectory models (MTM) can capture the notion of goal-directed control, whereby trajectories start at rest, proceed out to one of M discrete reach goals, and end at rest. Because the MTM better describes the dynamics of goal-directed reaches, its decoded trajectories were on average more accurate than those based on the random walk and linear-Gaussian (STM) trajectory models.


As detailed in Chapter 1, the field of cortical prosthetics has largely been split based on which of the two types of information should be used: plan activity to decode the intended reach goal or peri-movement activity to decode the moment-by-moment details of a trajectory. By combining the two types of information, the MTM decoder can be viewed as a way to bridge differences in the design approach of cortical prosthetics. Also, the work outlined in Section A.1 and detailed in Churchland et al. (2006a) suggests that delay period activity can provide a probabilistic prior for peak movement speed as well. While devising a complete model of neural motor control would be ideal, the MTM framework provides an effective and general discrete approximation. In this work, we grouped trajectories by reach goal. In other contexts, the trajectories can be grouped by other criteria such as reach speed, reach curvature, etc. Extensions to this work include applying the MTM framework to settings with (1) novel reach goals, as well as (2) larger numbers of reach goals. We are also interested in extending the MTM framework from M discrete reach goals to a continuum of goal locations.

Bibliography

Afshar A, Achtman N, Santhanam G, Ryu SI, Yu BM, and Shenoy KV (2005). Free paced target estimation in a delayed-reach task. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 401.13. Washington, DC. Poster presentation.

Ashe J and Georgopoulos AP (1994). Movement parameters and neural activity in motor cortex and area 5. Cerebral Cortex, 4(6), 590-600.

Averbeck BB and Lee D (2003). Neural noise and movement-related codes in the macaque supplementary motor area. Journal of Neuroscience, 23(20), 7630-7641.

Bair W and O'Keefe LP (1998). The influence of fixational eye movements on the response of neurons in area MT of the macaque. Visual Neuroscience, 15(4), 779-786.

Baker JT, Donoghue JP, and Sanes JN (1999). Gaze direction modulates finger movement activation patterns in human cerebral cortex. Journal of Neuroscience, 19(22), 10044-10052.

Bar-Hillel A, Spiro A, and Stark E (2004). Spike sorting: Bayesian clustering of non-stationary data. In LK Saul, Y Weiss, and L Bottou (Eds.) Advances in Neural Information Processing Systems 17, pages 105-112. MIT Press, Cambridge, MA.

Bastian A, Schoner G, and Riehle A (2003). Preshaping and continuous evolution of motor cortical representations during movement preparation. European Journal of Neuroscience, 18(7), 2047-2058.

Batista AP, Buneo CA, Snyder LH, and Andersen RA (1999). Reach plans in eye-centered coordinates. Science, 285(5425), 257-260.

Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (2005). Heterogeneous coordinate frames for reaching in macaque PMd. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 363.12. Washington, DC. Slide presentation.

Batista AP, Yu BM, Santhanam G, Ryu SI, Afshar A, and Shenoy KV (2006). A direct comparison of eye-centered and limb-centered reference frames for reach planning in the dorsal aspect of the premotor cortex. In preparation for resubmission to Journal of Neurophysiology.

Batista AP, Yu BM, Santhanam G, Ryu SI, and Shenoy KV (2004). Coordinate frames for reaching in macaque dorsal premotor cortex (PMd). In Society for Neuroscience Abstract Viewer and Itinerary Planner, 191.7. San Diego, CA. Poster presentation.

Birbaumer N, Ghanayim N, Hinterberger T, Iversen I, Kotchoubey B, Kubler A, Perelmouter J, Taub E, and Flor H (1999). A spelling device for the paralysed. Nature, 398(6725), 297-298.

Boussaoud D (1995). Primate premotor cortex: modulation of preparatory neuronal activity by gaze angle. Journal of Neurophysiology, 73(2), 886-890.


Boussaoud D and Bremmer F (1999). Gaze effects in the cerebral cortex: reference frames for space coding and action. Experimental Brain Research, 128(1-2), 170-180.

Boussaoud D, Jouffrais C, and Bremmer F (1998). Eye position effects on the neuronal activity of dorsal premotor cortex in the macaque monkey. Journal of Neurophysiology, 80(3), 1132-1150.

Brockwell AE, Rojas AL, and Kass RE (2004). Recursive Bayesian decoding of motor cortical signals by particle filtering. Journal of Neurophysiology, 91(4), 1899-1907.

Brown EN, Frank LM, Tang D, Quirk MC, and Wilson MA (1998). A statistical paradigm for neural spike train decoding applied to position prediction from the ensemble firing patterns of rat hippocampal place cells. Journal of Neuroscience, 18(18), 7411-7425.

Caminiti R, Ferraina S, and Johnson PB (1996). The sources of visual information to the primate frontal lobe: a novel role for the superior parietal lobule. Cerebral Cortex, 6(3), 319-328.

Caminiti R, Johnson PB, Galli C, Ferraina S, and Burnod Y (1991). Making arm movements within different parts of space: the premotor and motor cortical representation of a coordinate system for reaching to visual targets. Journal of Neuroscience, 11(5), 1182-1197.

Carmena JM, Lebedev MA, Crist RE, O'Doherty JE, Santucci DM, Dimitrov DF, Patil PG, Henriquez CS, and Nicolelis MAL (2003). Learning to control a brain-machine interface for reaching and grasping by primates. PLoS Biology, 1(2), 193-208.

Chandrakasan AP, Sheng S, and Brodersen RW (1992). Low-power CMOS digital design. IEEE Journal of Solid-State Circuits, 27(4), 473-484.

Chapin JK, Moxon KA, Markowitz RS, and Nicolelis MAL (1999). Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex. Nature Neuroscience, 2(1), 664-670.

Churchland MM, Santhanam G, and Shenoy KV (2006a). Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach. Journal of Neurophysiology. doi:10.1152/jn.00307.2006.

Churchland MM and Shenoy KV (2006). Delay of movement caused by disruption of cortical preparatory activity. Journal of Neurophysiology. doi:10.1152/jn.00808.2006.

Churchland MM, Yu BM, Ryu SI, Santhanam G, and Shenoy KV (2006b). Neural variability in premotor cortex provides a signature of motor preparation. Journal of Neuroscience, 26(14), 3697-3712. doi:10.1523/JNEUROSCI.3762-05.2006.

Cisek P (2006). Preparing for speed. Focus on: "Preparatory activity in premotor and motor cortex reflects the speed of the upcoming reach". Journal of Neurophysiology. doi:10.1152/jn.00857.2006.

Cisek P and Kalaska JF (2002). Modest gaze-related discharge modulation in monkey dorsal premotor cortex during a reaching task performed with free fixation. Journal of Neurophysiology, 88(2), 1064-1072.

Cisek P and Kalaska JF (2004). Neural correlates of mental rehearsal in dorsal premotor cortex. Nature, 431(7011), 993-996.

Cover TM and Thomas JA (1991). Elements of Information Theory. John Wiley and Sons, Inc., New York, NY.


Crammond DJ and Kalaska JF (1995). Modulation of preparatory neuronal activity in dorsal premotor cortex due to stimulus-response compatibility. Journal of Neurophysiology, 71(3), 1281-1284.

Crammond DJ and Kalaska JF (2000). Prior information in motor and premotor cortex: activity during the delay period and effect on pre-movement activity. Journal of Neurophysiology, 84(2), 986-1005.

Cunningham JP, Yu BM, and Shenoy KV (2006a). Optimal target placement for neural communication prostheses. In Proceedings of the 28th Annual International Conference of the IEEE EMBS, FrBP10.3. New York, NY. Poster presentation.

Cunningham JP, Yu BM, and Shenoy KV (2006b). Optimal target placement for neural communication prostheses. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 256.21. Atlanta, GA.

Dayan P and Abbott LF (2001). Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge, MA.

Deneve A, Latham PE, and Pouget A (2001). Efficient computation and cue integration with noisy population codes. Nature Neuroscience, 4(8), 826-831.

Destexhe A, Contreras D, and Steriade M (1999). Spatiotemporal analysis of local field potentials and unit discharges in cat cerebral cortex during natural wake and sleep states. Journal of Neuroscience, 19(11), 4595-4608.

Donoghue JP, Sanes JN, Hatsopoulos NG, and Gaal G (1998). Neural discharge and local field potential oscillations in primate motor cortex during voluntary movements. Journal of Neurophysiology, 79(1), 159-173.

Dum RP and Strick PL (1991). The origin of corticospinal projections from the premotor areas in the frontal lobe. Journal of Neuroscience, 11(3), 667-689.

Everitt BS (1984). An Introduction to Latent Variable Models. Chapman and Hall, London.

Farwell LA and Donchin E (1988). Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalography Clinical Neurophysiology, 70(6), 510-523.

Fee MS, Mitra PP, and Kleinfeld D (1996). Variability of extracellular spike waveforms of cortical neurons. Journal of Neurophysiology, 76(6), 3823-3833.

Fetz EE (1969). Operant conditioning of cortical unit activity. Science, 163(870), 955-957.

Fetz EE and Baker MA (1973). Operantly conditioned patterns of precentral unit activity and correlated responses in adjacent cells and contralateral muscles. Journal of Neurophysiology, 36(2), 179-204.

Galea MP and Darian-Smith I (1994). Multiple corticospinal neuron populations in the macaque monkey are specified by their unique cortical origins, spinal terminations, and connections. Cerebral Cortex, 4(2), 166-194.

Georgopoulos AP, Kalaska JF, Caminiti R, and Massey JT (1982). On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. Journal of Neuroscience, 2(11), 1527-1537.

Georgopoulos AP, Schwartz AB, and Kettner RE (1986). Neuronal population coding of movement direction. Science, 233(4771), 1416-1419.


Ghahramani Z and Hinton G (1997). The EM algorithm for mixtures of factor analyzers. Technical Report CRG-TR-96-1.

Gilja V, Kalmar RS, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (2005). Trial-by-trial mean normalization improves plan period reach target decoding. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.18. Washington, DC. Poster presentation.

Godschalk M, Lemon RN, Kuypers HG, and van der Steen J (1985). The involvement of monkey premotor cortex neurones in preparation of visually cued arm movements. Behavioral Brain Research, 18(2), 143-157.

Golub G and Van Loan CF (1983). Matrix Computations. Johns Hopkins University Press, Baltimore, MD, 3rd edition.

Gomez JE, Fu Q, Flament D, and Ebner TJ (2000). Representation of accuracy in the dorsal premotor cortex. European Journal of Neuroscience, 12(10), 3748-3760.

Gur M, Beylin A, and Snodderly DM (1997). Response variability of neurons in primary visual cortex (V1) of alert monkeys. Journal of Neuroscience, 17(8), 2914-2920.

Hanes DP and Schall JD (1996). Neural control of voluntary movement initiation. Science, 274(5286), 427-430.

Harrison R, Watkins P, Kier R, Black D, Normann R, and Solzbacher F (2006). A low-power integrated circuit for a wireless 100 electrode neural recording system. In 2006 IEEE International Conference on Solid-State Circuits Digest of Technical Papers, pages 554-555.

Harrison RR (2003). A low-power integrated circuit for adaptive detection of action potentials in noisy signals. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, pages 3325-3328. Cancun, Mexico.

Harrison RR and Charles C (2003). A low-power low-noise CMOS amplifier for neural recording applications. IEEE Journal of Solid-State Circuits, 38(6), 958-965.

Harrison RR, Santhanam G, and Shenoy KV (2004). Local field potential measurement with low-power analog integrated circuit. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4067-4070. San Francisco, CA.

Hatsopoulos N, Joshi J, and O'Leary JG (2004). Decoding continuous and discrete motor behaviors using motor and premotor cortical ensembles. Journal of Neurophysiology, 92, 1165-1174.

Hinterberger T, Schmidt S, Neumann N, Mellinger J, Blankertz B, Curio G, and Birbaumer N (2004). Brain-computer communication and slow cortical potentials. IEEE Transactions on Biomedical Engineering, 51(6), 1011-1018.

Hochberg LR, Serruya MD, Friehs GM, Mukand JA, Saleh M, Caplan AH, Branner A, Chen D, Penn RD, and Donoghue JP (2006). Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature, 442(7099), 164-171.

Hocherman S and Wise SP (1991). Effects of hand movement path on motor cortical activity in awake, behaving rhesus monkeys. Experimental Brain Research, 83(January), 285-302.

Holdefer RN and Miller LE (2002). Primary motor cortical neurons encode functional muscle synergies. Experimental Brain Research, 146(2), 233-243.


Horiuchi T, Swindell T, Sander D, and Abshire P (2004). A low-power CMOS neural amplifier with amplitude measurements for spike sorting. In Proceedings of the 2004 IEEE International Symposium on Circuits and Systems (ISCAS '04), volume 4, pages 29-32. Vancouver, Canada.

Horwitz GD and Newsome WT (2001). Target selection for saccadic eye movements: direction-selective visual responses in the superior colliculus. Journal of Neurophysiology, 86(5), 2527-2542.

Humphrey DR, Schmidt EM, and Thompson WD (1970). Predicting measures of motor performance from multiple cortical spike trains. Science, 170(3959), 758-762.

Isaacs RE, Weber DJ, and Schwartz AB (2000). Work toward real-time control of a cortical neural prosthesis. IEEE Transactions on Rehabilitation Engineering, 8(2), 196-198.

Jackson A, Mavoori J, and Fetz EE (2006a). Correlations between the same motor cortex cells and arm muscles during a trained task, free behavior and natural sleep in the macaque monkey. Journal of Neurophysiology, Epub. doi:10.1152/jn.00710.2006.

Jackson A, Mavoori J, and Fetz EE (2006b). Long-term motor cortex plasticity induced by an electronic neural implant. Nature, 444(7115), 56-60.

Jackson A, Moritz CT, Mavoori J, Lucas TH, and Fetz EE (2006c). The neurochip BCI: towards a neural prosthesis for upper limb function. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2), 187-190.

Johnson PB, Ferraina S, Bianchi L, and Caminiti R (1996). Cortical networks for visual reaching: physiological and anatomical organization of frontal and parietal lobe arm regions. Cerebral Cortex, 6(2), 102-119.

Kakei S, Hoffman DS, and Strick PL (1999). Muscle movement representations in the primary motor cortex. Science, 285(5436), 2136-2139.

Kalaska JF and Crammond DJ (1995). Deciding not to GO: neuronal correlates of response selection in a GO/NOGO task in primate premotor and parietal cortex. Cerebral Cortex, 5(5), 410-428.

Kalmar RS, Gilja V, Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (2005). PMd delay activity during rapid sequential movement plans. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 519.17. Washington, DC. Poster presentation.

Kandel ER, Schwartz JH, and Jessell TM (2000). Principles of Neural Science. McGraw-Hill Medical, 4th edition.

Kemere C, Sahani M, and Meng TH (2003). Robust neural decoding of reaching movements for prosthetic systems. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, 6.4.2-3, pages 2079-2082. Cancun, Mexico.

Kemere C, Santhanam G, Yu BM, Ryu SI, Meng TH, and Shenoy KV (2004a). Model-based decoding of reaching movements for prosthetic systems. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4524-4528. San Francisco, CA. doi:10.1109/IEMBS.2004.1404256.

Kemere C, Shenoy KV, and Meng TH (2004b). Model-based neural decoding of reaching movements: a maximum likelihood approach. IEEE Transactions on Biomedical Engineering - Special Issue on Brain-Machine Interfaces, 51(6), 925-932.


K e m e r e CT, Santhanam G, Yu BM, Shenoy KV, and M eng TH (2002). Decoding of plan and peri-movement neural signals in prosthetic systems. In IEEE Workshop on Signal Processing Systems (SIPS’02), pages 276-283. San Diego, CA.

Kennedy P, Andreasen D, Ehirim P, King B, Kirby T, Mao H, and Moore M (2004). Using human extra-cortical local field potentials to control a switch. Journal of Neural Engineering, 1(2), 72-77.

Kennedy PR and Bakay RAE (1998). Restoration of neural output from a paralyzed patient by a direct brain connection. NeuroReport, 9(8), 1707-1711.

Kennedy PR, Bakay RAE, Moore MM, Adams K, and Goldwaithe J (2000). Direct control of a computer from the human central nervous system. IEEE Transactions on Rehabilitation Engineering, 8, 198-202.

Kurata K (1989). Distribution of neurons with set- and movement-related activity before hand and foot movements in the premotor cortex of rhesus monkeys. Experimental Brain Research, 77(2), 245-256.

Kurata K (1993). Premotor cortex of monkeys: set- and movement-related activity reflecting amplitude and direction of wrist movements. Journal of Neurophysiology, 69(1), 187-200.

Leuthardt EC, Schalk G, Wolpaw JR, Ojemann JG, and Moran DW (2004). A brain-computer interface using electrocorticographic signals in humans. Journal of Neural Engineering, 1(2), 63-71.

Lewicki MS (1998). A review of methods for spike sorting: the detection and classification of neural action potentials. Network: Computation in Neural Systems, 9(4), R53-R78.

Liu X, McCreery DB, Bullara LA, and Agnew WF (2006). Evaluation of the stability of intracortical microelectrode arrays. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(1), 91-100.

MacKay DJC (2003). Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge, UK.

Matsumura M and Kubota K (1979). Cortical projection to hand-arm motor area from post-arcuate area in macaque monkeys: a histological study of retrograde transport of horseradish peroxidase. Neuroscience Letters, 11(3), 241-246.

Mavoori J, Jackson A, Diorio C, and Fetz E (2005). An autonomous implantable computer for neural recording and stimulation in unrestrained primates. Journal of Neuroscience Methods, 148(1), 71-77.

Maynard EM, Hatsopoulos NG, Ojakangas CL, Acuna BD, Sanes JN, Normann RA, and Donoghue JP (1999). Neuronal interactions improve cortical population coding of movement direction. Journal of Neuroscience, 19(18), 8083-8093.

Maynard EM, Nordhausen CT, and Normann RA (1997). The Utah intracortical electrode array: a recording structure for potential brain-computer interfaces. Electroencephalography and Clinical Neurophysiology, 102(3), 228-239.

McFarland DJ, Sarnacki WA, and Wolpaw JR (2003). Brain-computer interface (BCI) operation: optimizing information transfer rates. Biological Psychology, 63(3), 237-251.

Medendorp WP, Goltz HC, Vilis T, and Crawford JD (2003). Gaze-centered updating of visual space in human parietal cortex. Journal of Neuroscience, 23(15), 6209-6214.

Meng TH, Hung AC, Tsern EK, and Gordon BM (1998). Low-power signal processing system design for wireless applications. IEEE Personal Communications, 5(3), 20-31.

Messier J and Kalaska JF (2000). Covariation of primate dorsal premotor cell activity with direction and amplitude during a memorized-delay reaching task. Journal of Neurophysiology, 84(1), 152-165.

Moran DW and Schwartz AB (1999a). Motor cortical activity during drawing movements: population representation during spiral tracing. Journal of Neurophysiology, 82(5), 2693-2704.

Moran DW and Schwartz AB (1999b). Motor cortical representation of speed and direction during reaching. Journal of Neurophysiology, 82(5), 2676-2692.

Morrow MM and Miller LE (2003). Prediction of muscle activity by populations of sequentially recorded primary motor cortex neurons. Journal of Neurophysiology, 89(4), 2279-2288.

Murmann B and Boser BE (2004). Digitally Assisted Pipeline ADCs. Kluwer Academic Publishers, The Netherlands.

Musallam S, Corneil BD, Greger B, Scherberger H, and Andersen RA (2004). Cognitive control signals for neural prosthetics. Science, 305(5681), 258-262.

Obeid I, Nicolelis ML, and Wolf PD (2004). A multichannel telemetry system for single unit neural recordings. Journal of Neuroscience Methods, 133(1-2), 33-38.

Olds J (1965). Operant conditioning of single unit responses. In 23rd International Congress of Physiological Sciences 1965, pages 372-380. Excerpta Medica Foundation, Tokyo, Japan.

Oweiss KG, Anderson DJ, and Papaefthymiou MM (2003). Optimizing signal coding in neural interface system-on-a-chip modules. In Proceedings of the 25th Annual International Conference of the IEEE EMBS, pages 2216-2219. Cancun, Mexico.

Patil PG, Carmena JM, Nicolelis MAL, and Turner DA (2004). Ensemble recordings of human subcortical neurons as a source of motor control signals for a brain-machine interface. Neurosurgery, 55(1), 27-38.

Pesaran B, Nelson M, and Andersen R (2006). Dorsal premotor neurons encode the relative position of the hand, eye, and goal during reach planning. Neuron, 51(1), 125-134.

Pouget A, Deneve S, and Duhamel JR (2002). A computational perspective on the neural basis of multisensory spatial representations. Nature Reviews Neuroscience, 3(9), 741-747.

Pouzat C, Delescluse M, Viot P, and Diebolt J (2004). Improved spike-sorting by modeling firing statistics and burst-dependent spike amplitude attenuation: a Markov chain Monte Carlo approach. Journal of Neurophysiology, 91(6), 2910-2928.

Riehle A, MacKay WA, and Requin J (1994). Are extent and force independent movement parameters? Preparation- and movement-related neuronal activity in the monkey cortex. Experimental Brain Research, 99(1), 56-74.

Riehle A and Requin J (1989). Monkey primary motor and premotor cortex: single-cell activity related to prior information about direction and extent of an intended movement. Journal of Neurophysiology, 61(3), 534-549.

Riehle A and Requin J (1993). The predictive value for performance speed of preparatory changes in neuronal activity of the monkey motor and premotor cortex. Behavioral Brain Research, 53(1-2), 35-49.

Roweis S and Ghahramani Z (1999). A unifying review of linear Gaussian models. Neural Computation, 11(2), 305-345.

Roweis ST (1998). EM algorithms for PCA and SPCA. In MI Jordan, MJ Kearns, and SA Solla (Eds.) Advances in Neural Information Processing Systems 10. MIT Press, Cambridge, MA.

Sahani M (1999). Latent Variable Models for Neural Data Analysis. Ph.D. thesis, Computational and Neural Systems, California Institute of Technology, Pasadena, CA.

Santhanam G, Churchland MM, Sahani M, and Shenoy KV (2003). Local field potential activity varies with reach distance, direction, and speed in monkey pre-motor cortex. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 918.1. New Orleans, LA. Poster presentation.

Santhanam G, Linderman MD, Gilja V, Afshar A, Ryu SI, Meng TH, and Shenoy KV (2006a). HermesB: A continuous neural recording system for freely behaving primates. In preparation for resubmission to IEEE Transactions on Biomedical Engineering.

Santhanam G, Ryu SI, Yu BM, Afshar A, and Shenoy KV (2006b). A high-performance brain-computer interface. Nature, 442(7099), 195-198. doi:10.1038/nature04968.

Santhanam G, Sahani M, Ryu SI, and Shenoy KV (2004). An extensible infrastructure for fully automated spike sorting during online experiments. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4380-4384. San Francisco, CA. doi:10.1109/IEMBS.2004.1404219.

Schmidt EM (1980). Single neuron recording from motor cortex as a possible source of signals for control of external devices. Annals of Biomedical Engineering, 8(4-6), 339-349.

Schwartz AB (1992). Motor cortical activity during drawing movements: single-unit activity during sinusoidal tracing. Journal of Neurophysiology, 68(2), 528-541.

Schwartz AB (1993). Motor cortical activity during drawing movements: population representation during sinusoidal tracing. Journal of Neurophysiology, 70(1), 28-36.

Schwartz AB (1994). Direct cortical representation of drawing. Science, 265(5171), 540-542.

Schwartz AB (2004). Cortical neural prosthetics. Annual Review of Neuroscience, 27, 487-507.

Scott D, Boser BE, and Pister KSJ (2003). An ultra low-energy ADC for Smart Dust. IEEE Journal of Solid-State Circuits, 38(7), 1123-1129.

Scott SH (2006). Neuroscience: converting thoughts into action. Nature, 442(7099), 141-142.

Scott SH and Kalaska JF (1997). Reaching movements with similar hand paths but different arm orientations. I. Activity of individual cells in motor cortex. Journal of Neurophysiology, 77(2), 826-852.

Scott SH, Sergio LE, and Kalaska JF (1997). Reaching movements with similar hand paths but different arm orientations. II. Activity of individual cells in dorsal premotor cortex and parietal area 5. Journal of Neurophysiology, 78(5), 2413-2426.

Seese TM, Harasaki H, Saidel GM, and Davies CR (1998). Characterization of tissue morphology, angiogenesis, and temperature in adaptive response of muscle tissue to chronic heating. Lab Investigation, 78(12), 1553-1562.

Serby H, Yom-Tov E, and Inbar GF (2005). An improved P300-based brain-computer interface. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(1), 89-98.

Serruya MD, Hatsopoulos NG, Paninski L, Fellows MR, and Donoghue JP (2002). Instant neural control of a movement signal. Nature, 416(6877), 141-142.

Shadlen MN and Newsome WT (1998). The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. Journal of Neuroscience, 18(10), 3870-3896.

Shannon CE (1948). A mathematical theory of communication. Bell System Technical Journal, 27, 379-423 and 623-656.

Shen L and Alexander GE (1997). Preferential representation of instructed target location versus limb trajectory in dorsal premotor area. Journal of Neurophysiology, 77(3), 1195-1212.

Shenoy KV, Meeker D, Cao S, Kureshi SA, Pesaran B, Mitra P, Buneo CA, Batista AP, Burdick JW, and Andersen RA (2003). Neural prosthetic control signals from plan activity. NeuroReport, 14(4), 591-596.

Shoham S, Fellows M, and Normann R (2003). Robust, automatic spike sorting using mixtures of multivariate t-distributions. Journal of Neuroscience Methods, 127(2), 111-122.

Shoham S, Paninski LM, Fellows MR, Hatsopoulos NG, Donoghue JP, and Normann RA (2005). Statistical encoding model for a primary motor cortical brain-machine interface. IEEE Transactions on Biomedical Engineering, 52(7), 1313-1322.

Smith AC and Brown EN (2003). Estimating a state-space model from point process observations. Neural Computation, 15(5), 965-991.

Spalding MC, Velliste M, Jarosiewicz B, and Schwartz A (2005). 3-D cortical control of an anthropomorphic robotic arm for reaching and retrieving. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 401.3. Washington, DC.

Suner S, Fellows MR, Vargas-Irwin C, Nakata GK, and Donoghue JP (2005). Reliability of signals from a chronically implanted, silicon-based electrode array in non-human primate primary motor cortex. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(4), 524-541.

Tanji J and Evarts EV (1976). Anticipatory activity of motor cortex neurons in relation to direction of an intended movement. Journal of Neurophysiology, 39(5), 1062-1068.

Tanne-Gariepy J, Rouiller EM, and Boussaoud D (2002). Parietal inputs to dorsal versus ventral premotor areas in the macaque monkey: evidence for largely segregated visuomotor pathways. Experimental Brain Research, 145(1), 91-103.

Taylor DM, Helms Tillery SI, and Schwartz AB (2002). Direct cortical control of 3D neuroprosthetic devices. Science, 296, 1829-1832.

Taylor DM, Helms Tillery SI, and Schwartz AB (2003). Information conveyed through brain-control: cursor vs. robot. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 11(2), 195-199.

Thacker NA and Bromiley PA (2001). The effects of a square root transform on a Poisson distributed quantity. Technical Report 2001-010.

Tkach DC, Reimer J, and Hatsopoulos NG (2005). A hybrid neuromotor brain-machine interface using trajectory and goal state control modes. In Society for Neuroscience Abstract Viewer and Itinerary Planner, 707.11. Washington, DC.

Tolhurst DJ, Movshon JA, and Dean AF (1983). The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Research, 23(8), 775-785.

Truccolo W, Eden UT, Fellows MR, Donoghue JP, and Brown EN (2004). A point process framework for relating neural spiking activity to spiking history, neural ensemble and extrinsic covariate effects. Journal of Neurophysiology, Epub. doi:10.1152/jn.00697.2004.

van Beers RJ, Haggard P, and Wolpert DM (2004). The role of execution noise in movement variability. Journal of Neurophysiology, 91(2), 1050-1063.

Vyssotski AL, Serkov AN, Itskov PM, Dell’Omo G, Latanov AV, Wolfer DP, and Lipp HP (2006). Miniature neurologgers for flying pigeons: multichannel EEG and action and field potentials in combination with GPS recording. Journal of Neurophysiology, 95(2), 1263-1273.

Watkins PT, Santhanam G, Shenoy KV, and Harrison RR (2004). Validation of adaptive threshold spike detector for neural recording. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4079-4082. San Francisco, CA.

Weinrich M and Wise SP (1982). The premotor cortex of the monkey. Journal of Neuroscience, 2(9), 1329-1345.

Weinrich M, Wise SP, and Mauritz KH (1984). A neurophysiological study of the premotor cortex in the rhesus monkey. Brain, 107(2), 385-414.

Wessberg J, Stambaugh CR, Kralik JD, Beck PD, Laubach M, Chapin JK, Kim J, Biggs SJ, Srinivasan MA, and Nicolelis MAL (2000). Real-time prediction of hand trajectory by ensembles of cortical neurons in primates. Nature, 408(6810), 361-365.

Williams JC, Rennaker RL, and Kipke DR (1999). Stability of chronic multichannel neural recordings: implications for a long term neural interface. Neurocomputing, 26-27, 1069-1076.

Wolpaw JR and McFarland DJ (2004). Control of a two-dimensional movement signal by a noninvasive brain-computer interface in humans. Proceedings of the National Academy of Sciences of the USA, 101(51), 17849-17854.

Wood F, Black MJ, Vargas-Irwin C, Fellows M, and Donoghue JP (2004). On the variability of manual spike sorting. IEEE Transactions on Biomedical Engineering, 51(6), 912-918.

Wu W, Black MJ, Mumford D, Gao Y, Bienenstock E, and Donoghue JP (2004). Modeling and decoding motor cortical activity using a switching Kalman filter. IEEE Transactions on Biomedical Engineering, 51(6), 933-942.

Wu W, Gao Y, Bienenstock E, Donoghue JP, and Black MJ (2006). Bayesian population decoding of motor cortical activity using a Kalman filter. Neural Computation, 18(1), 80-118.

Yu BM (2007). Neural Dynamics of Motor Preparation and Execution. Ph.D. thesis, Department of Electrical Engineering, Stanford University, Stanford, CA.

Yu BM, Afshar A, Santhanam G, Ryu SI, Shenoy KV, and Sahani M (2006a). Extracting dynamical structure embedded in neural activity. In Y Weiss, B Scholkopf, and J Platt (Eds.) Advances in Neural Information Processing Systems 18, pages 1545-1552. MIT Press, Cambridge, MA.

Yu BM, Kemere C, Santhanam G, Afshar A, Ryu SI, Meng TH, Sahani M, and Shenoy KV (2006b). Mixture of trajectory models for neural decoding of goal-directed movements. In preparation for resubmission to Journal of Neurophysiology Innovative Methodology.

Yu BM, Ryu SI, Santhanam G, Churchland MM, and Shenoy KV (2004). Improving neural prosthetic system performance by combining plan and peri-movement activity. In Proceedings of the 26th Annual International Conference of the IEEE EMBS, volume 6, pages 4516-4519. San Francisco, CA. doi:10.1109/IEMBS.2004.1404254.

Zhang K, Ginzburg I, McNaughton B, and Sejnowski TJ (1998). Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cells. Journal of Neurophysiology, 79(2), 1017-1044.

Zipser D and Andersen RA (1988). A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331(6158), 679-684.

Zumsteg ZS, Kemere C, O’Driscoll S, Santhanam G, Ahmed RE, Shenoy KV, and Meng TH (2005). Power feasibility of implantable digital spike sorting circuits for neural prosthetic systems. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 13(3), 272-279. doi:10.1109/TNSRE.2005.854307.
