ADAPTIVE CHIRPLET TRANSFORM FOR THE ANALYSIS

OF VISUAL EVOKED POTENTIALS

by

Jie Cui

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Institute of Biomaterials and Biomedical Engineering
University of Toronto

Copyright © 2006 by Jie Cui

Abstract

Adaptive Chirplet Transform for the Analysis of Visual Evoked Potentials
Doctor of Philosophy, 2006
Jie Cui
Institute of Biomaterials and Biomedical Engineering, University of Toronto

Visual evoked potentials (VEPs) are electrical signals measured on the surface of the scalp in response to rapid and repetitive visual stimuli. These signals possess complex time-frequency structures and are difficult to characterize with conventional methods. In this work, we propose a new approach based on the adaptive chirplet transform (ACT) to represent a complete VEP response from the transient to the steady-state portion.

Our implementation of the ACT involves both a windowed and a non-windowed approach. The non-windowed ACT employs a coarse-refinement algorithm to estimate multiple chirplets under low signal-to-noise ratio conditions. The method decomposes VEPs into chirplet basis functions with four adjustable parameters (i.e., time-spread, chirp rate, time-center and frequency-center). We show how these parameters can be used to separate the transient from the steady-state portions of the response, and that as few as three chirplets are needed to represent a complete VEP signal.

In the windowed ACT method, the signal is partitioned into equal-length non-overlapping segments before estimating one chirplet from each segment. The concept of the optimal window length with reference to the windowed ACT method is proposed and calculations are made in terms of the signal characteristics. Both the windowed and non-windowed methods reveal a similar pattern of the VEP response – that a short transient VEP precedes the steady-state VEP. It is shown, however, that the computational time of the windowed method is significantly lower. Finally, we demonstrate that the adaptive chirplet spectrogram (ACS) offers a clearer visualization of the time-frequency structure of the VEP signal than the conventional spectrogram, because the ACS avoids cross-term interference in the time-frequency plane. Possible applications of VEP chirplet analysis to on-line signal classification and the technical limitations of the ACT approach are also discussed.


Acknowledgements

“是故无贵无贱,无长无少,道之所存,师之所存也。”
– 《师说》• 韩愈

“Therefore, it does not matter whether a person is high or low in position, young or old in age. Where there is the doctrine, there is my teacher.”
– Han Yu (768-824)

I would like to express my gratitude to my supervisors Dr. Willy Wong and Dr. Hans Kunov for their guidance and support, and to my supervisory committee members, Dr. Milos R. Popovic and Dr. Steve Mann, for their continuing interest and suggestions in this work. I would also like to acknowledge the external examiners, Dr. Terrence Picton, Dr. Kenneth H. Norwich and Dr. William MacKay, from the University of Toronto, and the external appraiser, Dr. Rangaraj M. Rangayyan, from the University of Calgary.

Of equal importance has been the camaraderie of the associates and students here at the Sensory Communication Laboratory of the Institute of Biomaterials & Biomedical Engineering. I would like to thank Alberto Behar, Dr. Hilmi Dajani, Dr. Dave Purcell, Dr. Elad Sagi, Dr. Taha Jaffer, Ewen MacDonald, Kevin Cannons, Jason Lee, Graham Greenland, Gerry Fung, Jan Rubak, Fei Fan, and Hafiz Noordin for their friendship as well as teamwork.

Finally, I wish to thank and dedicate this thesis to the members of my family. Most treasured of all is my wife, Dinghui Wang, whose love and support kept me inspired. Words alone cannot express how grateful I am to my parents for their moral support and encouragement when I am on this never-ending road of pursuing the truth.

Richard Jie Cui
Toronto


Contents

Abstract
Acknowledgements
Contents
List of Tables
List of Figures
List of Abbreviations
Mathematical Notations

Chapter 1 Introduction
1.1 Motivation
1.2 Current Status of EP Signal Processing
1.3 Challenges
1.4 Objective and Hypothesis
1.5 Why Adopt Time-Frequency Analysis?
1.6 Thesis Outline

Chapter 2 MPLEM – An Adaptive Chirplet Transform
2.1 Introduction to the Gaussian Chirplet Transform
2.1.1 An overview of time-frequency analysis
2.1.1.1 The need for time-frequency analysis
2.1.1.2 Logon and the short-time Fourier transform
2.1.1.3 “Wavelet” and the wavelet transform
2.1.2 Chirplet and the Gaussian chirplet transform
2.2 The Adaptive Chirplet Transform (ACT)
2.2.1 The MP algorithm
2.2.2 The LEM algorithm
2.2.3 MPLEM algorithm of the ACT
2.2.3.1 Signal model and CRLBs of chirplet estimates
2.2.3.2 Numerical simulation
2.2.4 Measures for stopping criterion and compactness
2.3 The Windowed Adaptive Chirplet Transform
2.3.1 Computational method
2.3.2 Discussion
2.4 Summary

Chapter 3 Application of the Non-Windowed ACT
3.1 Experimental Method
3.1.1 Subjects
3.1.2 Visual stimulus
3.1.3 Apparatus and VEP recording
3.2 VEP Data Processing Results
3.2.1 Number of chirplets in the dictionary
3.2.2 Chirplet estimation
3.2.3 Visualization
3.3 Discussion
3.3.1 Model validation
3.3.2 Compactness comparison
3.3.3 Visualization effect
3.3.4 Separation of tVEP and ssVEP
3.4 Conclusions

Chapter 4 Application of the Windowed ACT
4.1 Introduction
4.2 Optimal Window Length
4.3 VEP Data Processing Results
4.3.1 Number of chirplets in dictionary
4.3.2 Chirplet estimation
4.3.3 Visualization
4.3.4 Statistical information
4.4 Discussion
4.5 Summary

Chapter 5 Discussions
5.1 Applications of VEP Analysis with ACT
5.2 Technical Limitations

Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
6.2.1 CRLB for multiple-chirplet estimation
6.2.2 Signal classifier based on chirplet features
6.2.3 Beyond EP applications

References

Appendices
Appendix A. Energy Conservation of Decomposition
Appendix B. CRLBs of Chirplet Estimates
Appendix C. Human Experimentation Protocol


List of Tables

Table I. Construction of the discrete chirplet dictionary
Table II. CRLBs of the estimates
Table III. Components of the synthetic signal
Table IV. Ten chirplets estimated from signal D1
Table V. The coherent coefficients cc_n of the decomposed chirplets
Table VI. ERs calculated from the decomposition using chirplets, ER_C
Table VII. ERs calculated from the decomposition using Gabor logons, ER_G
Table VIII. Statistics of the parameters of transient and steady-state VEPs
Table IX. Estimated signal-to-noise ratio (SNR)
Table X. Estimated duration of tVEPs
Table XI. Twelve chirplets estimated from signal D1
Table XII. Standard deviation (STD) values of the parameter
Table XIII. Time differences between maxima of matched filter outputs


List of Figures

Fig. 1. An example of VEP response to repetitive visual stimulation.
Fig. 2. Schematic diagram of EP representation for different repetition rate (R) of sensory stimulation.
Fig. 3. A schematic diagram of the pathways of the human visual system.
Fig. 4. Time-frequency characteristics of a typical VEP signal (D1).
Fig. 5. Relationship between wave (pure tone) and “wavelet” in terms of time series and spectrogram.
Fig. 6. Logon representation of the set of Gabor bases corresponding to STFT.
Fig. 7. Logon representation of the set of Gabor bases corresponding to the wavelet transform.
Fig. 8. Relationship between chirp and chirplet in terms of time series and spectrogram.
Fig. 9. Wave, “wavelet,” chirp and chirplet revisited.
Fig. 10. Time-frequency distributions of Gaussian chirplets.
Fig. 11. Chirplet representation of a set of basis functions corresponding to a GCT.
Fig. 12. Time-frequency plots of a signal with sinusoidal frequency modulation.
Fig. 13. Comparison of signal decompositions.
Fig. 14. An example of LEM.
Fig. 15. Flowchart of the MPLEM adaptive chirplet decomposition algorithm.
Fig. 16. CRLBs of the estimates of a single chirplet in noise.
Fig. 17. Simulation results of estimating a single chirplet in various levels of noise.
Fig. 18. Simulation results of multiple-chirplet decomposition of a synthetic signal.
Fig. 19. Time-frequency plot of the signal with a sinusoidal frequency modulation.
Fig. 20. The matrix of moving bars (MMB).
Fig. 21. Mean and STD of the coherent coefficients.
Fig. 22. Time-frequency structures of signal D1.
Fig. 23. Time-frequency plots of the signals D2 – D5.
Fig. 24. The residual signals (D1 decomposition) and their PSDs.
Fig. 25. Whiteness measures of all five signals.
Fig. 26. The signal D1 and the reconstructed signals.
Fig. 27. Comparison of the decompositions of signal D1.
Fig. 28. Difference of the energy ratios.
Fig. 29. Visualization of different time-frequency representations of signal D1.
Fig. 30. Reconstructed signals and the separated tVEPs and ssVEPs.
Fig. 31. Schematic flowchart of a proposed real-time system.
Fig. 32. Spectrogram and windowed Fourier ridges.
Fig. 33. Time duration of tVEP.
Fig. 34. Time-frequency structures of signals D1 – D5.
Fig. 35. Averaged parameters of the atoms A1 – A12.
Fig. 36. Results of signal D1 using different window length.
Fig. 37. Window shift.
Fig. 38. Matched filter outputs.
Fig. 39. Time cost of chirplet estimation for different window lengths.
Fig. 40. Chirplet representations of bio-acoustical signals.


List of Abbreviations

ACS: Adaptive chirplet spectrogram
ACT: Adaptive chirplet transform
BCI: Brain-computer interface
cc: Coherent coefficient
CGWN: Complex Gaussian white noise
CPU: Central processing unit
CRLB: Cramér-Rao lower bound
EEG: Electroencephalograph or electroencephalogram
ENR: Energy-to-noise ratio
EP: Evoked potential
ER: Energy ratio
FFT: Fast Fourier transform
FIR: Finite impulse response
fMRI: Functional magnetic resonance imaging
FSR: Frequency sweeping response
GCT: Gaussian chirplet transform
GPU: Graphics processing unit
GWN: Gaussian white noise
HMM: Hidden Markov model
LEM: Logon expectation maximization
MEG: Magnetoencephalography
MLE: Maximum likelihood estimation
MMB: Matrix of moving bars
MP: Matching pursuit
MPLEM: Matching pursuit and logon expectation maximization
NIB: Narrow instantaneous bandwidth
PET: Positron emission tomography
PSD: Power spectral density
SNR: Signal-to-noise ratio
ssVEP: Steady-state visual evoked potential
STD: Standard deviation
STFT: Short-time Fourier transform
tVEP: Transient visual evoked potential
VEP: Visual evoked potential
WVD: Wigner-Ville distribution


Mathematical Notations

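$\mathbb{R}$: Set of real numbers
$\mathbb{R}^2$: Cartesian product of two sets of real numbers
$\hat{f}(\omega)$: The Fourier transform of the function $f(t)$
$\langle f, g \rangle$: Inner product, $\langle f, g \rangle = \int_{-\infty}^{+\infty} f(t)\, g^{*}(t)\, dt$
$\|f\|_p$: p-norm, $\|f\|_p = \left( \int_{-\infty}^{+\infty} |f(t)|^{p}\, dt \right)^{1/p}$
$\|f\|$: 2-norm, $\|f\| = \|f\|_2 = \sqrt{\langle f, f \rangle}$
$L^2(\mathbb{R})$: Space of finite energy functions, $\left\{ f : \int_{-\infty}^{+\infty} |f(t)|^{2}\, dt < +\infty \right\}$
$t_c$: Time-center
$f_c$ or $\omega_c$: Frequency-center
$\Delta_t$: Time-spread
$\Delta_\omega$: Frequency-spread
$c$: Chirp rate
$I$: Set of parameters of a time-frequency atom
$L$: Segment length
$M$: Number of segments
$\mathbb{N}$: Set of natural numbers
$N$: Signal size
$\mathbb{Z}$: Set of integers
$\Gamma$: Set of chirplet parameters in a dictionary
$Sf(t_c, \omega_c)$: Coefficients of the short-time Fourier transform
$|Sf(t_c, \omega_c)|^2$: Spectrogram
$Wf(t_c, \Delta_t)$: Coefficients of the wavelet transform
$|Wf(t_c, \Delta_t)|^2$: Scalogram
$W_V f(t, \omega)$: Coefficients of the Wigner-Ville distribution
$\lfloor x \rfloor$: The largest integer not greater than $x$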


Chapter 1 Introduction

In this chapter we briefly review the concept and applications of visual evoked potentials (VEPs). Next we show that traditional methods of signal analysis are not well suited to characterize the complex time-frequency structure of VEPs. This then motivates us to employ the adaptive chirplet transform in our work. After we describe the challenges of the problem, the objective is presented and the main hypothesis is stated, that VEP signals can be described as a sum of chirplet basis functions. Finally, we justify the adoption of the time-frequency approach by investigating the characteristics of VEP responses.

1.1 Motivation

This thesis pioneers an application of a newly emerging method of time-frequency analysis – the adaptive chirplet transform (ACT¹) – to the field of biomedical signal processing. The aim is to propose, develop and apply two new methods, namely the non-windowed ACT and the windowed ACT, for time-frequency analysis of VEP signals. The goal of my approach is to characterize the time-dependent behavior of the VEP from its initial transient portion to the steady-state portion by a series of time-frequency atoms, or chirplet basis functions.

¹ The definitions and discussions of ACT and chirplet basis functions, or chirplets, are given in Chapter 2.

VEPs are surface electrical potentials measured from the scalp in response to a visual stimulus. It is believed that they are generated from the visual cortex and/or the peripheral neural pathways leading to the cortex, and are time-locked to the visual stimulus [1]. VEPs have prominent clinical significance and can help diagnose sensory dysfunctions. They have, so far, mainly been employed in tests for the integrity of the visual pathway. They have also been used as a supplement to other techniques in research into specific clinical conditions [2-4].

Widespread acceptance of VEP recording in both academic research and clinical practice is largely due to the fact that VEP recording, like other evoked potential (EP) recording, is currently the only feasible manner by which electrical activity within the visual pathways of an intact human brain can be estimated on a short time scale. Although other noninvasive methods (e.g., magnetoencephalography (MEG), positron emission tomography (PET) and functional magnetic resonance imaging (fMRI)) are available, they are technically demanding and expensive. Consequently, the threshold for their clinical use is presumably higher. Moreover, these methods usually have a large time constant of measurement and therefore do not offer a time resolution comparable to that of EP recordings.

It has been shown that if the repetition rate of visual stimuli is sufficiently high (usually, above 6 times per second [1]), responses begin to merge and the shape of the resulting VEP becomes periodic. These responses are usually referred to as steady-state visual evoked potentials (ssVEPs) [1,2]. A variety of clinical applications require the detection of ssVEPs, including, for instance, evaluation of optic nerve function [3,5], objective estimation of visual acuity in infants or adults who are unable to provide reliable verbal responses [6,7], detection of abnormal mental states (e.g., hysteria), and assessment of delayed neurological maturation [3,5]. More recently, ssVEPs have also found applications in interface design (e.g., see references [8-10]).

[Figure: VEP trace annotated with the buildup, transient-state (tVEP) and steady-state (ssVEP) portions.]

Fig. 1. An example of VEP response to repetitive visual stimulation ([2]).

It is worth noting that an idealized ssVEP response is merely an assumption. That is, it is defined as a repetitive signal whose constituent discrete frequency components remain constant in amplitude and phase over an infinitely long time period [2]. In practice, however, VEPs to rhythmical stimulation never completely fulfill this definition. Nevertheless, in many experimental situations, VEPs to repetitive stimulation correspond reasonably closely to the ideal steady-state response, since the amplitudes and phases of the constituent frequency components remain constant with time, in principle. Therefore, with the assumption of steady-state, VEPs are usually modeled as a linear combination of a fundamental (usually, the stimulator) frequency and its higher harmonics. The detection task then is reduced to finding these periodic components embedded in the background spontaneous EEG (e.g., see references [11-14]).
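The harmonic model described above can be illustrated with a short sketch. It is not the detection method of references [11-14]; it simply projects a record onto a cosine and a sine at each harmonic of the stimulation frequency. The function name `harmonic_fit`, the 8 Hz fundamental, the 256 Hz sampling rate and the white Gaussian noise standing in for background EEG are all assumptions made for illustration.

```python
import math
import random


def harmonic_fit(signal, fs, f0, n_harmonics):
    """Estimate (frequency, amplitude, phase) for each harmonic k*f0.

    Each harmonic is found by projecting the record onto a cosine and a
    sine at that frequency; this is exact only when the record spans an
    integer number of periods of f0 (an assumption of this sketch).
    """
    n = len(signal)
    estimates = []
    for k in range(1, n_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / fs  # radians per sample
        a = 2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(signal))
        b = 2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(signal))
        # Model: x[i] ~ amplitude * cos(w*i + phase)
        estimates.append((k * f0, math.hypot(a, b), math.atan2(-b, a)))
    return estimates


# Hypothetical example: 8 Hz fundamental plus its second harmonic, 2 s at 256 Hz.
fs, f0, n = 256.0, 8.0, 512
clean = [
    1.0 * math.cos(2 * math.pi * f0 * i / fs + 0.3)
    + 0.4 * math.cos(2 * math.pi * 2 * f0 * i / fs - 1.0)
    for i in range(n)
]
rng = random.Random(0)  # fixed seed so the example is repeatable
noisy = [x + rng.gauss(0.0, 0.5) for x in clean]  # white noise is an assumption

clean_fit = harmonic_fit(clean, fs, f0, 2)
noisy_fit = harmonic_fit(noisy, fs, f0, 2)
```

A fit of this kind yields one amplitude and one phase per harmonic for the whole record, so by construction it cannot describe components that change during the recording.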


However, this model is not always sufficient for the description of steady-state EPs. Firstly, the model usually assumes that the frequency components of a steady-state response only consist of the fundamental frequency and its higher harmonics. Although this assumption works well in some cases, signal energy can be distributed at other frequencies, which might be due to the nonlinear nature of the visual system [15]. Therefore, besides the harmonics, other frequency components may also exist in the response.

Secondly, relying on this model, an experimenter usually ignores the information contained in the transient VEP (tVEP²) that appears immediately following the onset of the visual stimulus. Previous studies [2,16] have shown that time is required for the formation of the steady-state response. In Fig. 1, we show an example of a VEP recording elicited by flicker stimulation. In general, two stages of the response may be observed from the figure – (1) a transient buildup portion preceding (2) the steady-state portion. A number of new applications require fast estimation and detection of VEPs (e.g., the design of brain-computer interfaces (BCIs) [8-10,17]). This has prompted us to find new ways to characterize the transient portion of the response.

Finally, it has been demonstrated that the variability in the mental state of a subject may perturb the steady-state components of a VEP response (perhaps due to a lack of concentration, tiredness or accommodation) [18]. Even if the signal occurs at known frequencies, its amplitude and phase are usually unknown and may be nonstationary [15,19,20]. Therefore, a linear expansion from a Fourier basis may not be the optimal way to represent the entire signal.

² It should be emphasized that the term ‘transient VEP’, or tVEP, as used here is conceptually different from its use in the traditional electrophysiological literature. There, it usually refers to an experimental paradigm where the potentials are evoked by visual stimuli spaced sufficiently widely that the visual system can be regarded as returning to a state of rest between successive stimuli. In this thesis, however, tVEP refers to the portion of the signal that precedes the formation of the ssVEP. It is usually different from the VEP evoked with low stimulus rates.


1.2 Current Status of EP Signal Processing

Traditionally, techniques for characterizing sensory evoked potentials (including VEPs) may be broadly categorized into two groups according to the repetition rate of sensory stimuli: (1) a succession of positive and negative deflections of varying amplitude and latency for representing transient EPs in the time domain [21], and (2) the amplitude and phase as a function of frequency for describing steady-state EPs in the frequency domain [22,23]. The former method is usually employed when the repetition rate is lower than a certain value (e.g., six times per second for flickering visual stimuli), while the latter is used when the repetition rate is higher. Fig. 2 shows a schematic illustration. Both methods have been extensively applied to the analysis of the EP (e.g., see references [1-3,24]; for a comprehensive review cf. [5] Part VI, pp. 229-276 and [12]).

[Figure: repetition-rate axis R with the critical rate R_c marked by ▲; the time-domain and time-frequency representations lie on the side R < R_c, and the frequency-domain representation on the side R ≥ R_c.]

Fig. 2. Schematic diagram of EP representation for different repetition rate (R) of sensory stimulation. Symbol ▲ denotes the critical rate (R_c). EPs induced by the stimulation with higher rate (R ≥ R_c) are conventionally characterized in the frequency domain. EPs induced by the stimulation with lower rate (R < R_c) are characterized in the time and, more recently, the joint time-frequency domain.

The value of time-frequency representations has recently been recognized in the analysis of transient responses. The classical methods of frequency analysis often yield good results when the assumption of stationarity or quasi-stationarity is satisfied. In practice, however, the assumption of stationarity is often not justified and nonstationary approaches are usually required. Another major limitation of Fourier-based frequency analysis is its inability to provide information about the time course of the frequency content of the signal. The joint time-frequency representations map a one-dimensional signal into a two-dimensional function of time and frequency. The time-frequency plane gives an indication of which spectral components are present at which time instants. The methods of time-frequency analysis have recently been applied to various problems of biomedical signal processing, and it has been shown that many of these problems may benefit from time-frequency analysis, such as QRS detection in ECG analysis, EMG description, and tracking of rapid dynamic changes in seizure EEG [25]. Regarding the specific applications in EP analysis, three categories may be identified: applications of the STFT, the wavelet transform and the adaptive Gabor logon transform.

Transient EPs are typical nonstationary signals. The pattern of the EP may change during the experiment depending on a variety of factors. Time-dependent variation in the response characteristics of EPs has been investigated with the STFT approach. Norcia et al. [26] applied it to reveal the changes of frequency components during the course of the stimulation. In their study, VEPs to the transient presentation of sinusoidal luminance gratings in the range of 0.5 – 8 c/deg were recorded. The stimulation began with a temporal raised-cosine waveform to modulate the rate of contrast appearance. Rise-times to the full contrast value of 30% ranged from 4 to 200 ms. After the grating reached full contrast, it was held at that level for 500 ms and then ramped off with a 500 ms temporal cosine waveform. The spectrogram was then obtained by applying a 550 ms Hamming window which was progressively shifted through the record in 20 ms steps.
The peaks were selected from the spectrogram using a bandwidth criterion, independent of their amplitude. The VEP was thus seen to consist of two prominent formants (designated f1 and f2) arising at different times after stimulus onset. The components occurred at temporal frequencies below the alpha band, with the f1 frequency being roughly half that of the f2 frequency. The f1 component was largest at low spatial frequencies, with f2 becoming progressively dominant as spatial frequency was increased. Such information cannot be obtained from a Fourier-based spectrum alone.

The STFT was adopted by Peachey et al. [27] to investigate short-term changes of VEPs before and after the onset of stimulation. They examined how the response characteristics of the VEPs varied during the course of trials. The stimulus was a sinusoidal contrast reversal grating with various spatial frequencies. The time-course of each trial consisted of a 9.04 s initial adaptation phase and a 20.34 s stimulation phase. A window of 2.26 s was used to calculate amplitude and phase values from each of the segments. It was found that responses varied depending on the spatial frequencies. At low spatial frequencies (0.77 or 1.55 c/deg), VEP amplitude remained stable throughout the trial. At middle frequency (3.1 c/deg), VEP amplitude increased gradually to a stable value in 6-12 s. The amplitude changes were more complex at high frequencies (6.2 or 12.4 c/deg), first increasing and then decreasing dramatically.

These studies have shown that the STFT is a promising technique for characterizing the time-dependent behavior of EP signals. However, since a fixed-width window is usually used in the process, the efficiency of STFT analysis has been questioned. Deweerd and Kap [28] pointed out that waves of relatively higher frequency and shorter duration were largely responsible for the early components, while the later ones are of lower frequency and longer duration in a typical transient EP (e.g., Fig. 5a-c in [28]).
Therefore, a window with short time-duration but wide frequency band is desired for accurately characterizing the early components, and a window with long time-duration but narrow frequency band for characterizing the later portion of an EP. The wavelet transform provides a tool for such an analysis, because the ratio between the central frequency of a wavelet and its bandwidth is constant (the constant-Q property).

Schiff et al. [29] compared the results obtained by the STFT and the wavelet transform for EP analysis. Their goal was to extract features of EPs from background activity. For the wavelet analysis, they employed two types of wavelets: the Mexican hat and a B-Spline wavelet. They showed by simulation that the accuracy of feature extraction appeared to be enhanced by using wavelet analysis. A fast computational method was further developed for the B-Spline wavelet transform. The results from real signals showed that the analysis by the fast computation did not impair the accuracy of feature extraction.

The feasibility of using wavelets to extract features of EPs was also investigated by Trejo [30] for human performance monitoring. The study focused on a traditional linear regression model for predicting a composite measure of human signal detection performance. However, the coefficients of the model were selected from the coefficients of the discrete wavelet analysis of EPs. A comparison to feature extraction by principal component analysis was given. They showed that fewer wavelet coefficients were needed to achieve a comparable performance. Furthermore, the results suggested that wavelets represent the EPs efficiently and extract behaviorally important features for use in linear regression.

Advances in techniques of fast computation and optimal wavelet selection for EP analysis have also been proposed. A fast wavelet transform for general EEG analysis was proposed by Zhang et al. [31]. However, Bertrand et al. [32] considered the problem of fast wavelet analysis of transient EPs of finite duration.
They argued that Mallat’s [33] recursive algorithm for the fast wavelet transform was generally more efficient when the signal is long (typically > 1024 samples), but for a typical EP record of less than 1024 samples, the algorithm was not adequately efficient. They thus proposed a fully discrete procedure for the fast wavelet transform, dealing with short duration digital signals. An orthogonal discrete wavelet transform and its inverse transform based on Meyer’s wavelets were proposed.

Other studies indicated that an appropriate choice of the mother wavelet for EP analysis should be considered. Indeed, the possibility of choosing the wavelet function to be compared with the signal is one of the main advantages of wavelets over the STFT. Quiroga et al. [34] proposed quadratic B-Splines as the mother functions for studying alpha responses in pattern VEPs and gamma responses to bimodal (auditory and visual) stimulation. The major reason for such a choice was that the wavelets were similar to the EP responses. However, no actual comparisons to other wavelets were made to justify their claim.

Wavelet analysis of EPs has been found useful in clinical practice as well. Thakor et al. [35,36] employed multiresolution wavelet analysis to monitor the shape changes of EP responses to cerebral hypoxia. Results obtained by the wavelet analysis were compared with the conventional STFT of the same signal. In particular, they found that two characteristics appear to be of diagnostic value: the detail component of the analysis d4, representing the fine features at high frequencies, displayed an early and more rapid decline in response to hypoxic injury, while the coarse component c4, representing an approximation of the EP signal, showed an earlier recovery upon reoxygenation. These studies demonstrate the capability of wavelet analysis for the characterization and feature extraction of EP signals.

Recently, a more flexible method based on adaptive Gabor logons [37] was proposed to extract features of EPs. The procedure of the adaptive method is signal dependent and thus leads to a very compact representation of the signal.
Based on non-orthogonal Gabor logons, Brown et al. [38] adaptively approximated the EPs to briefly presented visual stimuli. The estimated logons were then used as the feature vectors for the post-processing classifier. They demonstrated that only five Gabor logons were needed to fit the EP data sufficiently and showed that five features provide the best classification results in an examination. The application of the adaptive Gabor logons to detect sleep spindles in general sleep EEG has been extensively investigated by Durka et al. [39-42]. One prominent advantage of this approach is the flexibility of the basis functions. Unlike wavelets, which are restricted by the constant-Q property, logons are free to translate and scale in the time-frequency plane. This property makes it possible to efficiently represent various time-frequency structures in a coherent manner.

Indeed, we are inspired by the flexibility of the adaptive approach and have developed a method to represent both the transient and the steady-state portions of VEPs. Apart from our own work [43,44], however, we are not aware of other studies (Fig. 2) that take the time-frequency approach to the analysis of the time-course of VEPs from the transient to the steady-state portion.
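The sliding-window spectrogram reviewed in this section (for example, a 550 ms Hamming window shifted in 20 ms steps) can be sketched as follows. This is a minimal illustration, not a reconstruction of the analysis in [26]: the 200 Hz sampling rate, the two-tone test signal and the function names are assumptions, and a plain DFT is used instead of an FFT to keep the code self-contained.

```python
import cmath
import math


def hamming(length):
    """Hamming window of the given length (in samples)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]


def spectrogram(signal, fs, win_s, step_s):
    """Squared-magnitude STFT with a sliding Hamming window.

    Returns frame-centre times (s), bin frequencies (Hz) and one power
    spectrum per frame. A plain DFT is used for clarity, not speed.
    """
    win_len = int(round(win_s * fs))
    step = max(1, int(round(step_s * fs)))
    window = hamming(win_len)
    n_bins = win_len // 2 + 1
    times, frames = [], []
    for start in range(0, len(signal) - win_len + 1, step):
        seg = [signal[start + i] * window[i] for i in range(win_len)]
        frames.append([
            abs(sum(seg[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                    for i in range(win_len))) ** 2
            for k in range(n_bins)
        ])
        times.append((start + win_len / 2.0) / fs)
    freqs = [k * fs / win_len for k in range(n_bins)]
    return times, freqs, frames


def peak_frequency(freqs, power):
    """Frequency of the strongest bin in one frame."""
    return freqs[max(range(len(power)), key=power.__getitem__)]


# Hypothetical test signal: 6 Hz for the first 1.5 s, then 12 Hz, at fs = 200 Hz.
fs = 200.0
signal = [math.sin(2 * math.pi * (6.0 if i < 300 else 12.0) * i / fs)
          for i in range(600)]
times, freqs, frames = spectrogram(signal, fs, win_s=0.55, step_s=0.02)
early = peak_frequency(freqs, frames[0])   # frame inside the 6 Hz part
late = peak_frequency(freqs, frames[-1])   # frame inside the 12 Hz part
```

With a 550 ms window the bin spacing is about 1.8 Hz, so the peak locations are only accurate to within roughly one bin. This is the fixed time-frequency trade-off of the STFT that motivates the wavelet and adaptive approaches discussed above.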

1.3 Challenges

In this thesis, we present new techniques based upon the adaptive chirplet transform (ACT) to fulfill this purpose. Our approaches (described in Chapter 2) provide a unified representation of the entire VEP response. A unified approach can be useful in many types of physiological and electrophysiological studies including those investigating the time course of visual cortical activities [27,45-47]. Moreover, the chirplet representation may improve and provide an alternative method for VEP estimation in clinical practice. Due to the delay in the formation of ssVEP, an effective representation of tVEP will benefit applications that require fast detection of VEP responses.

10 Chapter 1 Introduction

Clearly, the goal of the study cannot be achieved without solving some critical problems. In our opinion, the major difficulties and challenges of this research lie in the following aspects:

• Choice of analysis method. A number of competing techniques for time-frequency analysis of EPs have been proposed [24,26,30,35,36]. There is, however, no consensus as to the criteria for choosing these methods for different applications. The challenge here is to choose a proper technique suitable for solving the problem at hand. This issue will be presented and discussed in Chapter 2 in detail.

• Estimation of weak signal. It is well known that the signal-to-noise ratio (SNR) of time-locked EEG phenomena (such as the VEP signals discussed here) is very low (usually below -10 dB) [2]. Noisy signals inevitably incur a large estimation variance and thus degrade the quality of the results. In this thesis, we propose a new technique to deal with the low-SNR situation by improving the filtering technique of the algorithm. Note that, although averaged signals are employed in preprocessing the data, the number of trials needed for averaging is greatly reduced.

• Reduction of computational time. A technical limitation in the application of chirplet analysis is the computational time required to estimate the chirplet parameters. Depending on signal size, chirplet analysis can be computationally extremely demanding. However, there are some instances, particularly in some clinical situations and in signal processing for BCI, where it is imperative that processing time be minimized. For these cases, a real challenge is to reduce the calculation time so that chirplet analysis can be performed in real time. In this thesis, we propose a promising windowing technique to cut down the computational time. The details of the proposed method, its application and the measurement of time cost are presented in Chapter 2, Chapter 4 and Chapter 5, respectively.

There are, of course, other challenges in VEP analysis from the perspective of physiology and neurophysiology that are also highly interesting and important for understanding the underlying mechanism of a VEP response. However, this thesis is mainly intended to explore VEP analysis from the point of view of signal processing, and hence these questions are not the focus of this work.

1.4 Objective and Hypothesis

The objective of the thesis is to demonstrate the feasibility of applying the ACT methods to represent both the transient portion (tVEP) and the steady-state portion (ssVEP) of the VEP to repetitive visual stimulation. Specifically, we will describe the applications of the non-windowed and the windowed ACT methods to the analysis of VEPs. Recently, the matching pursuit (MP) algorithm with Gabor logons (non-chirping time-frequency atoms introduced in Chapter 2) has demonstrated good performance in various biomedical applications (e.g., [39,40,48]). As a natural extension to this method, the non-windowed ACT method is introduced with an improved estimation of signals under low-SNR conditions. Another important advantage of our method is that it yields a compact representation of VEPs. However, the main problem of the non-windowed method is the excessive computational time in processing data of large size. To reduce the time cost, we have further developed the windowed ACT method as an intermediate step towards real-time processing. In this method, VEPs are partitioned into equal-length non-overlapping segments. Then, chirplet analysis is applied to each short segment. Only the ‘strongest’ chirplet is extracted from each segment. In this way, the computation time can be greatly reduced.

The above approaches assume two chirplet models of VEPs by decomposing the signals into a series of chirplets. First, in the non-windowed approach, a signal is modeled as a weighted sum of different chirplets. Second, in the windowed approach, it is modeled as a sequence of piecewise linear chirps. Note, however, that when applying the ACT to a given data segment, the windowed method is, in fact, a special case of the non-windowed one, as the number of chirplets to be estimated is just one. More clearly, the following working hypothesis underlies the analysis of VEPs using the ACT:

The signals of visual evoked potentials to repetitive stimulation can be modeled as a linear sum of Gaussian chirplets embedded in additive Gaussian white noise.

An immediate question concerns the physiological basis of the analysis. It is true that the criteria for selecting the chirplet basis function are basically mathematical rather than physiological (see Chapter 2). Although there is no adequate evidence, to our knowledge, that an analysis of VEPs into chirplet components is physiologically meaningful in itself, different components at the transient and steady-state portions do indicate different physiological properties. The reason why the choice has been somewhat arbitrary on physiological grounds is, in general, that nobody has yet been able to suggest elementary functions which have a demonstrable physiological basis for EP signals. Nevertheless, the analysis of the VEP into basis functions selected on mathematical grounds can provide us with powerful, immediately available techniques. Many studies initiated on such a basis have already led to a number of physiological insights [24,49-51]. The chirplet functions may lead to future analytical techniques whose bases are more closely related to physiology.


1.5 Why Adopt Time-Frequency Analysis?

The chirplet transform can be regarded as a method of time-frequency analysis. In this section, we attempt to justify the adoption of the time-frequency approach by examining the characteristics of VEP signals in relation to concepts of signal analysis. We begin with a brief description of the neurophysiological background of the human visual system. Then, we wish to argue that the observed time-frequency characteristics of VEPs might be caused by the superposition of signals from different VEP generator sites in the visual system. These characteristics are, in fact, the major motivation for using methods of time-frequency analysis.


Fig. 3. A schematic diagram of the pathways of the human visual system consisting of the geniculostriate system and the tectal system (adapted from [52]).

As shown in Fig. 3, light-evoked physiological signals exit the eye at the optic disk along the optic nerves. The two optic nerves from both eyes come together at the x-shaped optic chiasm. The major termination of the optic tract nerve fibers is the lateral geniculate nucleus of the thalamus. Neurons in both lateral geniculate bodies, in turn, project to the striate cortex of the primary visual cortex in the occipital lobe at the back of the head. This geniculostriate system is the dominant neural pathway of vision for humans [52,53]. A major feature of the retinocortical projection is that approximately one half of the optic nerve fibers cross to the opposite side of the brain: fibers from the temporal retina project to cortex on the same side of the head, whereas fibers from the nasal retina cross at the optic chiasm and project to the opposite side of the head. The minor termination of the optic fibers is the superior colliculus of the midbrain. Neural fibers from the superior colliculus activate motor pathways leading to eye and body movements. This neural pathway forms the tectal system [52,53].

It is known that the visual cortex consists of a large number of distinct visual areas, most of which contain topographically organized representations of the visual field [53]. The striate cortex is also known as area 17. Neurons in area 17 project to other areas of the occipital lobe [52]. Cortical neurons in these areas are thus stimulated indirectly by light on the retina.

In our project, the VEPs have been recorded on the scalp directly above the primary visual cortex (area 17, see Section 3.1). Recall that a VEP is recorded in response to repetitive visual stimuli. In general, waves of relatively higher frequency and shorter duration are responsible for the early components of the response, whereas the later ones are of lower frequency and longer duration; the latter are close to sinusoids (see Fig. 4). It is of strong interest to know the VEP generator sites for the complex recorded signals, since, if we knew these locations for the given stimulus modality, we could provide basic vision researchers with a means of associating specific sensory processes with known brain sites.
Furthermore, by investigating electrical activities of these known sites, we could provide engineering researchers with a solid electro-physiological basis for analyzing gross scalp EP recordings into components. Some studies (e.g., [54-56]) suggested that the initial higher-frequency nonstationary responses, similar to responses to pattern-onset stimulation, might be generated in area 17 of the striate cortex, whereas the later lower-frequency ones might originate in both areas 17 and 18. With the continuance of repetitive stimuli, responses from areas 17, 18 and perhaps other cortical areas might be superimposed and merged together to form a periodic wave of response, or the ssVEP [1,2]. Unfortunately, specific locations of generators are largely unknown and quite controversial in the literature of vision research. The difficulty in locating the generator sites makes the problem of decomposing VEPs into physiologically meaningful components a difficult task. Nevertheless, we wish to show next that the basic structure of the VEP motivates us to adopt a method of analysis that decomposes it into a series of time-frequency components. It is our hope that these components can provide insights for understanding the underlying physiological mechanism.

A typical VEP response is shown in Fig. 4, where the signal between approximately 1.4 s and 2 s is enlarged in the 2nd panel labeled as ‘A’, and the signal between approximately 3.5 s and 4.5 s is shown in the 3rd panel labeled as ‘B’. Two major characteristics of the signal can be observed by comparing the signal and the reference sinusoid (shown as a superimposed dotted line): (1) the variation of the amplitude of the signal in interval A is generally larger than that in interval B, and (2) in Panel A, the signal shows an obvious frequency change with time, whereas the fundamental frequency of the signal in Panel B is relatively stable. We have found similar characteristics in other VEP signals. Since the stimulus began at 1 s, the signal in Panel A is largely indicative of the “early components” in the evoked potential. The signal in Panel B is believed to be the “steady-state” response.

[Figure 4: three stacked panels. Top: signal D1 over 0 to 5 s, with intervals A and B marked. Middle: interval A (1.4 to 2 s). Bottom: interval B (3.5 to 4.5 s). Horizontal axis: Time (s). Vertical axis: Amplitude (arbitrary unit), −20 to 20.]

Fig. 4. Time-frequency characteristics of a typical VEP signal (D1). The original signal (solid line) is shown in the first panel, which is superimposed by a sinusoid with fixed frequency and amplitude (dotted line). The vertical dotted line at 1 s indicates the onset of the visual stimulus. The intervals A and B are enlarged in the 2nd and 3rd panel, respectively.

In view of the transient structure of evoked potentials, it cannot be expected that time-invariant descriptors such as the power density spectrum based on the Fourier transform will provide an adequate frequency representation. The reason for this is that the spectrum is an averaged energy measure (as a function of frequency) over the entire observation interval. Even if a signal is present during only a part of the observation (e.g., Fig. 4, interval A), its energy is considered to be smeared out over the entire interval, which may lead to misleading results. Consequently, a time-varying spectral description (or time-frequency representation) of the energy distribution is desired. Moreover, an adequate description of VEPs requires variable, rather than fixed, widths of time windows. The purpose of the next chapter is to propose a means for obtaining such a representation by using the ACT.


1.6 Thesis Outline

The rest of the thesis is arranged as follows. In Chapter 2 we introduce the non-windowed and windowed ACT methods. We review the application of time-frequency analysis to EP analysis and discuss the reasons for choosing the ACT in this study. We will show how the algorithms can be improved to increase SNR and to reduce computational time. We also estimate the lower bound of the estimates, the Cramér-Rao lower bound (CRLB), in this chapter. In Chapter 3, we illustrate the experimental setup and data acquisition procedure to record VEPs, and present the results of applying the non-windowed ACT method to the data. Special attention is given to the stopping criterion in signal decomposition and to the method for discriminating between tVEP and ssVEP using the estimated parameters of chirplets. The application of the windowed ACT method is given in Chapter 4. From the statistical information contained in the estimated chirplets, we demonstrate that the windowed approach generally captures the same patterns of VEP response as those revealed by the non-windowed approach, but requires less time for calculation. We discuss in detail some applications of VEP chirplet analysis and the technical limitations of the proposed methods in Chapter 5. We conclude this work in Chapter 6. Some possible directions of future research are proposed at the end.


Chapter 2 MPLEM – An Adaptive Chirplet Transform

In this chapter, we develop the methods of a non-windowed and a windowed adaptive chirplet transform (ACT). We first present a review of time-frequency analysis. The Gaussian chirplet transform is then introduced as a newly emerged branch in this field. Subsequently, we introduce the method to adaptively approximate a signal by using chirplet functions. We present a new algorithm, the MPLEM algorithm, for the non-windowed ACT and provide a performance analysis, including the Cramér-Rao lower bound of the chirplet estimates. Finally, we modify the non-windowed approach one step further to establish the windowed ACT.

2.1 Introduction to the Gaussian Chirplet Transform

Chirping phenomena, i.e., time-varying swept-frequency waves, exist in many natural signals. One may find them, for instance, in bird whistles [57], in bat echo-location signals, in human voices, in seismic signals [58], in impulsive signals dispersed by the ionosphere [59] and in EEG signals [44]. Therefore, it is desirable to have a method to approximate these signals in terms of a weighted sum of chirp functions. The weights or chirplet coefficients are obtained from a chirplet transform. Motivated by a discovery in detecting a small piece of ice floating in an ocean environment using Doppler radar, Mann and Haykin [57,60-63] formulated the chirplet transform as a generalization of the wavelet transform in the early 1990s. Particular interest is directed to the Gaussian chirplet, implemented by successively applying scaling, chirping (time and frequency shear), time-shift and frequency-shift operators to a Gabor logon function (introduced later) [64], because it has the highest joint time-frequency resolution, and it is the only function whose Wigner-Ville distribution (WVD) is non-negative [57,65]. The common use of the Gaussian chirplet transform (GCT) is also due to its relative simplicity of mathematical manipulation. For these reasons, the GCT plays a unique role in the area of time-frequency analysis. In this section we will introduce some transforms useful for time-frequency analysis in biomedical applications, beginning with the classical Fourier transform and continuing onwards to the GCT.

2.1.1 An overview of time-frequency analysis

2.1.1.1 The need for time-frequency analysis

There are two classical methods of signal analysis. One is the description of the signal as a function of time; the other is frequency analysis with the Fourier transform. There are four main reasons for performing frequency analysis [66]. First, we can learn something about the source of a waveform by analyzing it into its spectral components. For example, we can learn about the composition of blood, synovial fluids and other body fluids from their infrared spectra [67]. Second, the propagation of waves through a medium generally depends on frequency. For example, waves of different frequencies propagate with different velocity and attenuation. Third, frequency analysis often simplifies our understanding of the waveform. In general, many complicated signals are really a simple superposition of sine waves, which is easier to understand and characterize. Finally, Fourier analysis is a powerful mathematical tool useful for obtaining the solutions of ordinary and partial differential equations.

Mathematically, the Fourier transform of the signal $f(t)$ (i.e., the waveform under analysis) and the inverse Fourier transform are defined as

\[ \hat{f}(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-j\omega t}\, dt \tag{2.1} \]

and

\[ f(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \hat{f}(\omega)\, e^{j\omega t}\, d\omega \tag{2.2} \]

(e.g., see reference [68]). Underlying the Fourier transform is the notion of sinusoidal waves: the Fourier coefficient is obtained in (2.1) by correlating $f(t)$ with a sinusoidal wave

\[ \psi(t) = \exp(j\omega t), \tag{2.3} \]

where $\omega$ is the frequency of $\psi(t)$ in radians per second; from (2.2), the signal is expanded in terms of sinusoidal waves of different frequencies. Note that the sinusoid in (2.3) is a complex signal (see Fig. 5 for an example). It has both a real and an imaginary part. Although the Fourier transform has been shown to be useful in a wide variety of applications of signal processing, it cannot provide information about localized frequency changes (in a time interval). The reason is that the wave $\psi(t)$ in (2.3) covers an infinite interval of time, and thus $\hat{f}(\omega)$ depends on the values of $f(t)$ over the entire time domain. Therefore, no time information is provided in the Fourier transform.
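The loss of time information can be illustrated with a small numerical sketch. It is not part of the original analysis: the sampling rate, the two test frequencies and all variable names are assumptions chosen only for illustration. Two signals contain the same two tones in opposite order, yet their magnitude spectra coincide.

```python
import numpy as np

fs = 256.0                          # sampling rate in Hz (assumed for illustration)
t = np.arange(0, 2.0, 1.0 / fs)     # 2 s of samples
half = t.size // 2

# Signal A: a 10 Hz tone during the first second, a 20 Hz tone during the second.
sig_a = np.where(t < 1.0, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 20 * t))
# Signal B: the same two halves in reverse order (a circular shift of signal A).
sig_b = np.concatenate([sig_a[half:], sig_a[:half]])

# Magnitude spectra obtained with the discrete Fourier transform.
mag_a = np.abs(np.fft.rfft(sig_a))
mag_b = np.abs(np.fft.rfft(sig_b))

# The magnitudes agree although the time courses differ.
spectra_match = np.allclose(mag_a, mag_b, atol=1e-8)
print("same magnitude spectrum:", spectra_match)
```

The order in which the tones occur is carried only by the phase of the transform, which is why a magnitude spectrum alone cannot tell the two signals apart.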

[Figure 5: four panels. Top row: wave time series and “wavelet” time series (real and imaginary parts). Bottom row: wave spectrogram and “wavelet” spectrogram (horizontal axis: Time; vertical axis: Frequency, −0.5 to 0.5).]

Fig. 5. Relationship between wave and “wavelet” in terms of time series and spectrogram (magnitude time-frequency distribution). In general, each of these two functions may be regarded as a chirplet. For example, the wave is a special case of a chirplet where the chirp rate is zero and the time-spread is arbitrarily large. Note the Gaussian envelope of the wavelet time series. Figure adapted from [60].

Two broad mechanisms are usually involved in producing time-varying spectra [66]. One is that the propagation of waves in a medium is frequency dependent. The other is that the source of production depends on physical parameters that may change in time. That is the case with human speech, for instance. As we speak we are continually changing the physical shape of our tongue, mouth, etc., so as to produce a signal whose frequency changes in time. Signals with time-varying spectra are known as non-stationary signals, i.e., signals that lack global stationarity. It is well known that most biomedical signals (including EPs) are non-stationary in nature and have highly complex time-frequency characteristics [25,69]. Energy distributions of non-stationary signals cannot be analyzed using classical power spectrum methods based on the Fourier transform. These phenomena demand a more efficient way of signal analysis that characterizes a signal locally and simultaneously in both the time and the frequency domain. Next, we will review several transforms for joint time-frequency analysis.

2.1.1.2 Logon and the short-time Fourier transform

Motivated by quantum mechanics, in 1946 the physicist Gabor [64] defined elementary time-frequency signals as waveforms that have a minimum spread, governed by Heisenberg’s uncertainty principle, in a time-frequency plane. He called these elementary signals “logons”. To measure time-frequency “information” content, he proposed a decomposition of a signal over a set of logons. This decomposition is now known as the short-time Fourier transform (STFT)³. By showing that such decompositions are closely related to our sensitivity to sounds, and that they exhibit important structures in speech and music recordings, Gabor demonstrated the importance of localized time-frequency signal processing. A Gabor logon⁴ is the modulation product of a harmonic oscillation of any frequency with a Gaussian window function. More specifically, let a window function $g(t)$ be real, symmetric, $g(t) = g(-t)$, and unitary, $\|g\| = 1$. A possible candidate is the Gaussian function

\[ g(t) = \pi^{-1/4} \exp\left(-t^2/2\right). \tag{2.4} \]

We then translate it by $t_c$ and modulate it by frequency $\omega_c$ to construct a logon

\[ g_{t_c,\omega_c}(t) = g(t - t_c) \exp(j\omega_c t), \tag{2.5} \]

so that the energy of the logon is located in the neighborhood of $(t_c, \omega_c) \in \mathbb{R}^2$, where $t_c$ and $\omega_c$ (or $f_c = \omega_c/2\pi$ in Hz) are, respectively, the time-center and the frequency-center of the logon. In a loose sense, a logon may be regarded as “a portion of a wave”. We follow the terminology used in [57,60,62] to refer to a small portion of a wave as a “wavelet”⁵. An example of a “wavelet” and the relationship between wave and “wavelet” are shown in Fig. 5.

³ In his 1946 paper, Gabor used a coherent family of Gaussian functions to “window” the data. The resulting transform is often referred to as the Gabor transform in the literature. The prevalent STFT, also called the Windowed Fourier Transform, can be viewed as a generalized version of the Gabor transform because the window function involved is no longer confined to a Gaussian function.

⁴ An elementary function of Gabor logon is sometimes referred to as a time-frequency atom in the literature of signal processing and applied mathematics (e.g., see references [33,37,65,70]).
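As a minimal sketch of the logon construction in (2.4) and (2.5), the following snippet builds one logon on a sampled time axis and checks that its energy is one and that it is centered at $t_c$. The function name `gabor_logon`, the grid and the parameter values are illustrative assumptions, not part of the thesis.

```python
import numpy as np

def gabor_logon(t, t_c, omega_c):
    """Logon of (2.5): the unit Gaussian (2.4) shifted to t_c and modulated at omega_c (rad/s)."""
    window = np.pi ** -0.25 * np.exp(-0.5 * (t - t_c) ** 2)
    return window * np.exp(1j * omega_c * t)

# Dense time grid, wide enough that the Gaussian tails are negligible.
t = np.linspace(-20.0, 20.0, 8001)
step = t[1] - t[0]

# Arbitrary example: time-center 3 s, frequency-center 1.5 Hz.
logon = gabor_logon(t, t_c=3.0, omega_c=2 * np.pi * 1.5)

# Riemann-sum approximations of the energy and of the time-center.
energy = np.sum(np.abs(logon) ** 2) * step
time_center = np.sum(t * np.abs(logon) ** 2) * step / energy
print(energy, time_center)
```

Translation and modulation change where the logon sits in the time-frequency plane but leave its norm unchanged, which is what the energy check reflects.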

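[Figure 6: time-frequency plane (horizontal axis: t; vertical axis: ω) showing two logons with time-centers t_c1, t_c2, frequency-centers ω_c1, ω_c2, time-spreads Δ_t1, Δ_t2 and frequency-spreads Δ_ω1, Δ_ω2.]

Fig. 6. Logon representation of the set of Gabor bases corresponding to STFT. Given a window, the time-spread and frequency-spread of the logons are invariant ($\Delta_{t_1} = \Delta_{t_2}$, $\Delta_{\omega_1} = \Delta_{\omega_2}$), and hence the time-frequency resolution of the analysis is fixed. Figure adapted from [60].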

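The STFT of a finite energy signal $f(t) \in L^2(\mathbb{R})$ may be defined as an inner product between the signal and logons:

\[ Sf(t_c,\omega_c) = \langle f, g_{t_c,\omega_c} \rangle = \int_{-\infty}^{+\infty} f(t)\, g^{*}_{t_c,\omega_c}(t)\, dt = \int_{-\infty}^{+\infty} \left[ f(t)\, g(t - t_c) \right] e^{-j\omega_c t}\, dt, \tag{2.6} \]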

⁵ The term “wavelet” will appear in quotes when it is used in a less restrictive sense: any ephemeral burst of energy, finite in physical support (which includes bases in both the Weyl-Heisenberg and the affine spaces) [60].

where ‘*’ denotes the complex conjugate operation. Clearly, from the definitions of Fourier transforms, the STFT may be viewed as the Fourier transform of the windowed signal $f(t)\, g(t - t_c)$. However, the definition using the concept of the inner product can be easily generalized to other linear time-frequency transforms. From (2.6), one can define an energy density (magnitude time-frequency distribution) $|Sf(t_c,\omega_c)|^2$ called the spectrogram. A spectrogram measures the energy of $f(t)$ in the neighborhood of $(t_c,\omega_c)$ in the time-frequency plane, if the energy of the logon $g_{t_c,\omega_c}$ is negligible outside the area of the neighborhood. Therefore, the resolution of the STFT depends on the spread of the logon. The effective time-spread of a logon is defined by [64]

\[ \Delta_t = \frac{1}{\| g_{t_c,\omega_c} \|} \left\{ \int_{-\infty}^{+\infty} (t - t_c)^2 \left| g_{t_c,\omega_c}(t) \right|^2 dt \right\}^{1/2} = \left\{ \int_{-\infty}^{+\infty} t^2 \left| g(t) \right|^2 dt \right\}^{1/2}, \tag{2.7} \]

and the effective frequency-spread is given by

\[ \Delta_\omega = \frac{1}{\| g_{t_c,\omega_c} \|} \left\{ \frac{1}{2\pi} \int_{-\infty}^{+\infty} (\omega - \omega_c)^2 \left| \hat{g}_{t_c,\omega_c}(\omega) \right|^2 d\omega \right\}^{1/2} = \left\{ \frac{1}{2\pi} \int_{-\infty}^{+\infty} \omega^2 \left| \hat{g}(\omega) \right|^2 d\omega \right\}^{1/2}, \tag{2.8} \]

where $\hat{g}$ is the Fourier transform of $g$. Gabor emphasized the use of a Gaussian window $g(t)$, since it minimizes the uncertainty product $\Delta_t \Delta_\omega$. Gabor logon bases cover the time-frequency plane (see Fig. 6), and we can trade frequency resolution for improved temporal resolution, or vice versa. However, both time-spread and frequency-spread are independent of time and frequency. Therefore, given a window $g(t)$, $\Delta_t$ and $\Delta_\omega$ are invariant and the resolution of the analysis is fixed (Fig. 6). This is actually a major deficiency of the STFT, because a signal that is either much shorter or much longer than the time-spread of the specified logon will not be effectively analyzed. Since VEP signals contain both short-duration transient structures and relatively long-term steady-state structures, the STFT representation is unable to characterize these features efficiently.
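The trade-off between time-spread and frequency-spread can be checked numerically. The sketch below evaluates the integrals in (2.7) and (2.8) for Gaussian windows of several widths; the width parameter `scale`, the integration grids and the closed-form Fourier transform used for the window are assumptions introduced here for illustration. Widening the window increases the time-spread and reduces the frequency-spread, while their product stays at the minimum value of 1/2.

```python
import numpy as np

def spreads(scale):
    """Effective spreads (2.7)-(2.8) of the unit-norm Gaussian window g(t/scale)/sqrt(scale)."""
    t = np.linspace(-60.0, 60.0, 120001)   # time grid (s)
    w = np.linspace(-60.0, 60.0, 120001)   # angular-frequency grid (rad/s)
    g = np.pi ** -0.25 / np.sqrt(scale) * np.exp(-0.5 * (t / scale) ** 2)
    # Closed-form Fourier transform of the scaled window under convention (2.1).
    g_hat = (4 * np.pi) ** 0.25 * np.sqrt(scale) * np.exp(-0.5 * (scale * w) ** 2)
    delta_t = np.sqrt(np.sum(t ** 2 * g ** 2) * (t[1] - t[0]))
    delta_w = np.sqrt(np.sum(w ** 2 * g_hat ** 2) * (w[1] - w[0]) / (2 * np.pi))
    return delta_t, delta_w

products = []
for scale in (0.5, 1.0, 2.0):
    delta_t, delta_w = spreads(scale)
    products.append(delta_t * delta_w)
    print(f"scale={scale}: delta_t={delta_t:.4f}, delta_w={delta_w:.4f}, "
          f"product={delta_t * delta_w:.4f}")
```

Once a window width is chosen, the same pair of spreads applies at every point of the time-frequency plane, which is the fixed-resolution limitation described above.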

2.1.1.3 “Wavelet” and the wavelet transform

The wavelet transform has been recently proposed to partially overcome the problem of the fixed resolution of the STFT [33,70,71]. It introduces a family of time-frequency elementary signals called wavelets that have variable effective spreads at different locations in the time-frequency plane. In this thesis, we follow the development of a “generalized logon” proposed in [60]. That is, we will denote an arbitrary piece of sinusoid resulting from a windowed pure tone as a “wavelet”⁶, with the only constraint being that the window function is a Gaussian function and hence each “wavelet” is a Gabor logon. A family of “wavelet” basis functions can then be derived from a mother “wavelet” by applying to it the two operations of scale (or time-spread) and time translation. For example, a mother Gabor “wavelet” (or Morlet wavelet [70])

\[ \psi(t) = \pi^{-1/4} \exp\left(-\frac{t^2}{2}\right) \exp(j\eta t) \tag{2.9} \]

can be found by setting the parameters of the logon in (2.5) to $t_c = 0$ and $\omega_c = \eta$. Thus, the time-center of the mother “wavelet” is zero and the frequency-center⁷ is $\eta$. Then, a family of “wavelets” can be constructed by successively applying the two operations of scale $\Delta_t$ and translation $t_c$ to (2.9):

\[ g_{t_c,\Delta_t}(t) = \frac{1}{\sqrt{\Delta_t}}\, \psi\!\left(\frac{t - t_c}{\Delta_t}\right). \tag{2.10} \]

Similar to the definition of the STFT, the wavelet transform may be defined as the inner product between the signal and the “wavelets” in (2.10):

\[ Wf(t_c,\Delta_t) = \langle f, g_{t_c,\Delta_t} \rangle. \tag{2.11} \]

⁶ In the past, the term “wavelet” often denoted any basis function which acted as a bandpass filter. More recently, however, the term wavelet has been used in a more restrictive sense to denote a basis function of constant shape (from an affine group of translations and dilations of one mother wavelet) [60].

⁷ This is due to $\hat{\psi}(\omega) = (4\pi)^{1/4} \exp\left[-(\omega - \eta)^2/2\right]$.

[Figure 7: time-frequency plane (horizontal axis: t; vertical axis: ω) showing two “wavelets” with time-centers t_c1, t_c2, frequency-centers ω_c1, ω_c2, time-spreads Δ_t1, Δ_t2 and frequency-spreads Δ_ω1, Δ_ω2.]

Fig. 7. Logon representation of the set of Gabor bases corresponding to the wavelet transform. The frequency-center of a “wavelet” can be adjusted by applying different scales. Note the property of constant-Q, i.e., $\Delta_{t_1}\Delta_{\omega_1} = \Delta_{t_2}\Delta_{\omega_2}$. The time-spread is narrower and the frequency-spread is wider when a “wavelet” is in a higher frequency area of the plane than when it is in a lower frequency area. Figure adapted from [60].

The energy distribution $|Wf(t_c,\Delta_t)|^2$ is defined as the scalogram. The time and frequency resolution of the analysis is, therefore, defined by the effective spread of the “wavelet” in (2.10). Note that the time-center and frequency-center⁸ of the wavelet in (2.10) are $t_c$ and $\omega_c = \eta/\Delta_t$, respectively. By applying the same definitions in (2.7) and (2.8), it is easy to find that the time-spread is $\Delta_t$ and the frequency-spread is $\Delta_\omega = 1/(2\Delta_t)$. Consequently, the ratio of the frequency-center to the effective frequency-spread is a constant, i.e., $\omega_c/\Delta_\omega = 2\eta$. Therefore, the “wavelet” acquires a narrower time-spread and a wider frequency-spread in a higher frequency area than in a lower frequency area of the time-frequency plane (Fig. 7). This is the reason why the wavelet transform is particularly suitable for analyzing signals with discontinuities or abrupt changes. However, this property also means that the wavelet transform does not provide precise estimates of time-frequency structures such as low-frequency components of short duration or narrow-band, high-frequency components. Nor is it efficient for representing a chirp signal.

⁸ This is due to $\hat{g}_{t_c,\Delta_t}(\omega) = \sqrt{\Delta_t}\, \exp(-j\omega t_c)\, \hat{\psi}(\Delta_t \omega) = (4\pi)^{1/4} \sqrt{\Delta_t}\, \exp(-j\omega t_c) \exp\left[-\Delta_t^2 (\omega - \eta/\Delta_t)^2/2\right]$.
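The constant-Q behaviour can be illustrated with a short numerical sketch. It assumes the closed-form spectrum of the scaled mother “wavelet” obtained from (2.9) and (2.10); the value of `ETA`, the set of scales and the function name are arbitrary illustrative choices. For each scale the code estimates the frequency-center and the frequency-spread from the power spectrum and confirms that their ratio does not depend on the scale.

```python
import numpy as np

ETA = 5.0  # frequency-center of the mother "wavelet" in rad/s (arbitrary choice)

def center_and_spread(delta_t):
    """Frequency-center and frequency-spread of the scaled "wavelet" (2.10)."""
    w = np.linspace(-200.0, 200.0, 400001)
    step = w[1] - w[0]
    # Power spectrum |g_hat|^2 of the scaled "wavelet" (closed form, assumed here).
    power = delta_t * np.sqrt(4 * np.pi) * np.exp(-(delta_t * w - ETA) ** 2)
    energy = np.sum(power) * step
    center = np.sum(w * power) * step / energy
    spread = np.sqrt(np.sum((w - center) ** 2 * power) * step / energy)
    return center, spread

centers, ratios = [], []
for delta_t in (0.25, 0.5, 1.0, 2.0):
    center, spread = center_and_spread(delta_t)
    centers.append(center)
    ratios.append(center / spread)
    print(f"delta_t={delta_t}: center={center:.3f} rad/s, spread={spread:.3f} rad/s")

# The ratio of frequency-center to frequency-spread is the same at every scale.
print("constant ratio:", np.allclose(ratios, ratios[0]))
```

Halving the scale doubles both the frequency-center and the frequency-spread, so a narrow-band component at high frequency cannot be matched by any member of the family.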

[Figure 8: four columns (upchirp, upchirplet, downchirp, downchirplet). Top row: time series (real and imaginary parts). Bottom row: time-frequency (TF) distributions (horizontal axis: Time; vertical axis: Frequency, −0.5 to +0.5).]

Fig. 8. Relationship between chirp and chirplet in terms of time series and spectrogram. Note that both upchirp (positive chirp rate) and downchirp (negative chirp rate) are shown. Figure adapted from [60].

Recall (Section 1.1) that a frequency-changing transient portion of the VEP precedes the steady-state portion. We therefore speculate that chirping components exist in the tVEP, and an analysis method with basis functions that can characterize chirp-like signals is preferable. For this reason, we will describe the chirplet transform next.


Fig. 9. Wave, “wavelet,” chirp and chirplet revisited. The x axis corresponds to the real value of the function and the y axis to the imaginary value. Although the functions are continuous, a coarse sampling is used to enhance the 3-D appearance. Each sample is rendered as a particle in (x, y, t). WAVE – The wave appears as 3-D helix. The angle of rotation between each sample and the next is constant, hence the frequency is constant. WAVELET – The “wavelet” is a windowed wave, where the reduction in amplitude is observed as decay toward the t axis. CHIRP – The chirp is characterized by a linearly increasing angle of rotation between one sample and the next. CHIRPLET – The chirplet is characterized by the same linearly increasing angle of rotation but first with a growing and then with a decaying amplitude. Figure reproduced from [57] with permission.

2.1.2 Chirplet and the Gaussian chirplet transform

In order to overcome these difficulties and especially to acquire an efficient representation of chirp signals, the elementary signal of the logon is now allowed an additional degree of freedom in the time-frequency plane. We have pointed out that by applying scale and translation operations to a logon defined in (2.5) a “wavelet” basis function can be constructed. A further step is to allow the “wavelet” to rotate (the “chirping operation” introduced later) in the time-frequency plane. This is equivalent to obtaining a piece of a chirp signal by using a Gaussian window, and therefore the resultant function has been coined a “chirplet” [60]. Fig. 8 shows some examples of chirps and chirplets, including an upchirp and an upchirplet with a positive chirp rate, and a downchirp and a downchirplet with a negative chirp rate. Referring to Fig. 5, we can see that the relationship of a chirplet to a chirp is analogous to that of a “wavelet” to a wave. Fig. 9 shows a 3-D appearance of these waveforms. The chirplets in Fig. 8 and Fig. 9 were derived from a single Gaussian window by applying simple mathematical operations to that window. More precisely, the chirplet bases for a Gaussian chirplet transform (GCT) can be derived from the unitary Gaussian function in (2.4) through four basic operations:

(1) scaling: $(\pi\Delta_t^2)^{-1/4} \exp\left[-(t/\Delta_t)^2/2\right]$,
(2) chirping: $\pi^{-1/4} \exp\left(-t^2/2\right) \exp\left(jct^2\right)$,
(3) time-shift: $\pi^{-1/4} \exp\left[-(t - t_c)^2/2\right]$,
(4) frequency-shift: $\pi^{-1/4} \exp\left(-t^2/2\right) \exp\left(j\omega_c t\right)$.

For each operation, there is a corresponding transform for the Wigner-Ville distribution⁹ (WVD) [66]. The WVD is a time-frequency energy density computed by taking the correlation of the signal $f(t)$ with time and frequency translations of itself:

\[ W_V f(t,\omega) = \int_{-\infty}^{+\infty} f\!\left(t + \frac{\tau}{2}\right) f^{*}\!\left(t - \frac{\tau}{2}\right) e^{-j\tau\omega}\, d\tau. \tag{2.12} \]

Later we will use the WVD to visualize the results of chirplet estimation. A sequential application of these operations leads to a family of wave packets with four adjustable parameters [57] called Gaussian chirplets:

\[ g_{t_c,\omega_c,c,\Delta_t}(t) = \frac{1}{\sqrt{\sqrt{\pi}\,\Delta_t}} \exp\left\{ -\frac{1}{2}\left(\frac{t - t_c}{\Delta_t}\right)^2 + j\left[ c\,(t - t_c) + \omega_c \right](t - t_c) \right\}. \tag{2.13} \]

⁹ In 1948 Ville [72] introduced into signal analysis the Wigner distribution, which had been studied by Wigner in a 1932 article on quantum thermodynamics [73]. The Wigner-Ville distribution plays an important role in the theory of time-frequency analysis. It has been proved that the spectrogram and the scalogram can be derived by time-frequency averaging of the WVD [66].
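The Gaussian chirplet in (2.13) can be sketched in a few lines of code. The function name `gaussian_chirplet`, the sampling grid and the parameter values below are illustrative assumptions. The snippet checks that the chirplet has unit energy and that its instantaneous frequency, the time derivative of the phase in (2.13), changes linearly as $\omega_c + 2c(t - t_c)$.

```python
import numpy as np

def gaussian_chirplet(t, t_c, omega_c, c, delta_t):
    """Gaussian chirplet of (2.13) evaluated on the time grid t."""
    u = t - t_c
    amplitude = 1.0 / np.sqrt(np.sqrt(np.pi) * delta_t)
    envelope = np.exp(-0.5 * (u / delta_t) ** 2)
    phase = (c * u + omega_c) * u
    return amplitude * envelope * np.exp(1j * phase)

t = np.linspace(-10.0, 10.0, 20001)
step = t[1] - t[0]

# Arbitrary example parameters: an upchirplet centered at 1 s and 8 rad/s.
chirplet = gaussian_chirplet(t, t_c=1.0, omega_c=8.0, c=2.0, delta_t=0.7)

# Energy by Riemann sum; expected to be close to one.
energy = np.sum(np.abs(chirplet) ** 2) * step

# Instantaneous frequency as the numerical derivative of the unwrapped phase.
inst_freq = np.gradient(np.unwrap(np.angle(chirplet)), step)
i_center = np.argmin(np.abs(t - 1.0))
i_later = np.argmin(np.abs(t - 2.0))
print(energy, inst_freq[i_center], inst_freq[i_later])
```

Setting `c=0.0` reduces the chirplet to a scaled Gabor logon with a constant instantaneous frequency.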


[Figure 10 plots: (A) 3-D surface; (B) contour panels (1) Unit Gaussian, (2) Scaling, (3) Chirping, (4) Time- and frequency-shift, each with axes t (horizontal) and ω (vertical) spanning −3 to 3.]

Fig. 10. Time-frequency distributions of Gaussian chirplets. (A) 3-D visualization of the WVD of the unitary Gaussian function; (B) (1) WVD contour of the unitary Gaussian, (2) effect of scaling, (3) effect of chirping, (4) effect of time-shift and frequency-shift.

[Figure 11 plot: time-frequency plane with axes t (horizontal) and ω (vertical).]

Fig. 11. Chirplet representation of a set of basis functions corresponding to a GCT.

The effects of these operations on the WVD of a chirplet are shown in Fig. 10. As mentioned earlier, the chirplets can be regarded as a natural extension to “wavelets” (2.10) by applying an additional chirping operation. Chirplets correspond to slanted logons which can fill the time-frequency plane completely.


Fig. 11 shows an instance of tiling the time-frequency plane with chirplets. Of course, the Gabor logons and “wavelets” are just special cases of chirplets in which the chirp rate is zero (the beginning instantaneous frequency equals the ending instantaneous frequency). Finally, the Gaussian chirplet transform (GCT) of a signal is defined as the inner product between the signal and the chirplets given in (2.13) [57,60,65]:

$$a_{t_c,\omega_c,c,\Delta_t} = \left\langle f, g_{t_c,\omega_c,c,\Delta_t} \right\rangle = \int_{-\infty}^{+\infty} f(t)\, g^*_{t_c,\omega_c,c,\Delta_t}(t)\, dt. \tag{2.14}$$

The coefficients $a_{t_c,\omega_c,c,\Delta_t}$ reflect the similarity between the local structures of the signal and the chirplets. They represent the signal’s energy content in the time-frequency regions specified by the chirplets. The absolute value of a coefficient is the amplitude of the projection. To simplify the notation used later, the set of chirplet parameters is denoted by a continuous index set $I = (t_c, \omega_c, c, \Delta_t)$.

2.2 The Adaptive Chirplet Transform (ACT)

Similar to the Fourier transform (2.1), which correlates a signal with sinusoids, the GCT given in (2.14) correlates a signal with chirplets. In this section, however, we wish to find a relation that approximates a signal by a superposition of chirplet waves, in the same way that the inverse Fourier transform represents a signal by a superposition of sinusoidal waves. This question is important both theoretically and practically. The “inverse GCT”, if it exists, will complete the theory of the GCT. In addition, with the help of the inverse GCT, we can process the signal in the “chirplet domain” (i.e., manipulate the chirplet coefficients) and then convert the results from the “chirplet domain” to the time domain. Unfortunately, there is as yet no widely accepted theory of the inverse transform for chirplet analysis. It should be noted that the GCT was originally introduced in 1991 as an analytic tool rather than a synthetic one [60]. Nevertheless, several adaptive approaches have been proposed to approximate a signal by a linear sum of chirplet waves. Roughly, two categories of strategies may be identified.

One strategy uses a sequential estimation scheme. A single chirplet is estimated at each iterative step and subsequently subtracted from the signal. A new chirplet is then estimated from the residual signal. In this way, a sequence of chirplets is obtained and the original signal is approximated by a linear sum of the estimated chirplets. A representative of this approach is the Matching Pursuit (MP) algorithm [37,65,74].

The other strategy employs a coarse-refinement scheme to estimate a group of chirplets simultaneously. In this approach an initial estimation is carried out first to obtain the number of chirplets to be refined and their rough positions in the time-frequency plane. Subsequently, a refinement process is employed to progressively “move” these chirplets close to the true time-frequency structures of the signal. Finally, the original signal is approximated by a sum of the refined chirplets. One approach to the coarse estimation, proposed in [74,75], is to estimate three of the four parameters by assuming that the chirp rate is zero, and then to refine them with a “zooming algorithm” that uses the spectral information. In [76] the author proposed a different refinement approach called the “ridge pursuit algorithm”. An important method of adaptive refinement is the Logon Expectation and Maximization (LEM) algorithm, which was first proposed in [62,63] and later re-proposed in [77]. In this method the initial guesses of the chirplets are drawn to the signal’s structure by using the EM algorithm iteratively.
It has been demonstrated that the EM algorithm can be used to improve the quality of signal estimation under low-SNR conditions [77]. This is of relevance to our problem since VEP signals are often embedded in a high level of noise (SNR around 0 dB for our recordings, see Chapter 4). However, the initial estimates of the LEM algorithm, usually obtained by heuristic guesses, are often unsatisfactory. Therefore, in this thesis we propose a new adaptive method that combines the MP algorithm with the LEM algorithm. The adaptive nature of our algorithm lies in two aspects: (1) the initial coarse estimation of the chirplets by the MP algorithm, followed by (2) the adaptive refinement of the initial estimates by the LEM algorithm. We will denote this algorithm as MPLEM. In the rest of this section, we describe this algorithm after introducing the MP and LEM algorithms.

2.2.1 The MP algorithm

The MP algorithm was first proposed in [37,74] to adaptively decompose a signal into optimally matched Gabor logons (2.5), and later was extended to chirplets (2.13) [65].

Suppose that we have selected a chirplet $g_{I_0}(t)$ from a predefined family of chirplets, where $I_0$ is the parameter set of the first chirplet. An arbitrary signal $f(t) \in L^2(\mathbb{R})$ can then be decomposed into two parts:

$$f(t) = a_{I_0} g_{I_0}(t) + R^1 f(t) = f_0(t) + R^1 f(t), \tag{2.15}$$

where $f_0(t)$ is denoted as the 1st-order approximation of the signal and $R^1 f(t)$ is the residual after approximating $f(t)$ in the direction of $g_{I_0}(t)$. If we define $R^0 f(t) = f(t)$, the decomposition can then be repeated on the residual:

$$R^n f(t) = a_{I_n} g_{I_n}(t) + R^{n+1} f(t), \quad n = 0, 1, 2, \ldots \tag{2.16}$$

Suppose we carry the decomposition up to the order P , that is, if we decompose the signal into the concatenated sum, from (2.16) we get


$$\begin{aligned} f(t) &= \left[R^0 f(t) - R^1 f(t)\right] + \cdots + \left[R^{P-1} f(t) - R^P f(t)\right] + R^P f(t) \\ &= \sum_{n=0}^{P-1} \left[R^n f(t) - R^{n+1} f(t)\right] + R^P f(t) \\ &= \sum_{n=0}^{P-1} a_{I_n} g_{I_n}(t) + R^P f(t) \\ &= f_{P-1}(t) + R^P f(t), \end{aligned} \tag{2.17}$$

where $f_{P-1}(t)$ is the Pth-order approximation. It should be noted that the energy of a signal is conserved during the decomposition (proof in Appendix A):

$$\|f\|^2 = \sum_{n=0}^{P-1} \left|a_{I_n}\right|^2 + \left\|R^P f\right\|^2, \tag{2.18}$$

and that the signal can be completely reconstructed from (2.17). Consequently, there is no information loss in the procedure, and the signal $f(t)$ is fully characterized by the sequence $(a_{I_n}, I_n)_{n \in \mathbb{N}}$, which describes the behavior of the signal in the time-frequency plane.

However, two crucial issues remain unsolved: (1) the selection of the parameter sets $(I_n)_{n \in \mathbb{N}}$, or equivalently how to select $g_{I_n}$ from a predefined set of chirplets so that the coefficients $a_{I_n}$ can be calculated, and (2) the definition of the set of chirplets that play the role of basis functions in the decomposition. The latter issue relates to the construction of a set of functions known as the dictionary, which is described later. For the present, let us assume that the dictionary has been constructed and that the problem is to select a series of chirplets $g_{I_n}$, where $n \in [0, P-1]$, from the dictionary. An optimal selection would choose P chirplets from the dictionary concurrently so as to minimize the difference $\left\|f - \sum_{n=0}^{P-1} a_{I_n} g_{I_n}\right\|$ globally. Unfortunately, this approach is generally considered impractical because it is an NP-hard problem [37]; no polynomial-time algorithm is known for solving it. Consequently, the MP algorithm was proposed to extract one chirplet from the signal at each step. More specifically, referring to (2.15)-(2.17), the optimal match for the residual at the nth step is searched for by projecting $R^n f(t)$ onto each chirplet in the dictionary. The optimal chirplet is obtained when the amplitude of the projection (2.14) reaches its maximum, i.e.,

$$I_n = \arg\max_{I \in \Gamma} \left|\left\langle R^n f, g_I \right\rangle\right|, \tag{2.19}$$

where $\Gamma$ is the set of the parameters of all chirplets in the dictionary. Then $a_{I_n}$ and $R^{n+1} f(t)$ are obtained from (2.14) and (2.16), respectively. We emphasize that the adaptive nature of the algorithm comes from the optimal selection of the basis functions (i.e., chirplets). The parameters of the functions are predefined in the dictionary, which differs from the approach of adaptive filtering, where the parameters are varied on a sample-by-sample basis [78].

We now discuss the construction of the chirplet dictionary. A dictionary is a repertoire of chirplet basis functions selected to cover the entire time-frequency plane efficiently [37,65]. An insufficient number of basis functions may make it impossible to estimate some of the time-frequency structures present in the signal. On the other hand, an excessive number of chirplets is not only unnecessary but also computationally expensive. This problem can be solved through “frame theory” [33], which, in principle, determines the desired bases through an appropriate discretization of their parameters. We follow the method proposed by Bultan [65] to discretize the four-parameter space of chirplets. Here, we only summarize the results in Table I. For a signal of size N, the number of decomposition levels D is determined from N and the radix a. The first level in the decomposition is denoted as Level Zero. Next, the scale index k and the angle index m are calculated, from which the discrete chirp rate $c$ and time-spread $\Delta_t$ are found. The time-center $t_c$ and frequency-center $\omega_c$ are directly determined by the signal size N.


TABLE I
CONSTRUCTION OF THE DISCRETE CHIRPLET DICTIONARY

Symbol | Value | Description
$N$ | | Signal size (number of samples)
$i_0$ | typically 1 | The first level to chirp/rotate logons
$a$ | typically 2 | Radix of scales
$\gamma$ | $\gamma \in [0, N)$, $\gamma \in \mathbb{Z}$ | Signal range
$T$ | $N$ | Normalized time range
$F$ | $2\pi$ | Normalized frequency range
$D$ | $\left\lfloor \frac{1}{2}\log_a N \right\rfloor$ | Number of levels of decomposition
$k$ | $k \in [0, D - i_0)$, $k \in \mathbb{Z}$ | Scale (time-spread) index
$m_k$ | $4a^{2k} - 1$ | Number of chirplets at each scale
$M$ | $N^2\left(i_0 + \sum_k m_k\right)$ | The total number of chirplets in the dictionary
$m$ | $m \in [0, m_k]$, $m \in \mathbb{Z}$ | Chirp rate/rotational angle index
$\alpha_m$ | $\arctan\left(m / a^{2k}\right)$ | Discretized angle for each scale
$t_c$ | $t_c \in \gamma T / N = \gamma$ | Discrete time-center of chirplets
$\omega_c$ | $\omega_c \in \gamma F / N = 2\pi\gamma / N$ | Discrete frequency-center of chirplets
$c$ | $\frac{F}{T}\tan(\alpha_m) = \frac{F\,m}{T\,\Delta_t^2}$ | Discrete chirp rate
$\Delta_t$ | $a^k$ | Discrete time-spread
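The counts in Table I can be tallied with a few lines of code. This is only a sketch of one reading of the table: the function name `dictionary_size` is hypothetical, the level count is computed with integer arithmetic as the largest D such that $a^{2D} \le N$ (equivalent to the floor expression for D), and the scale index is assumed to run over $k = 0, \ldots, D - i_0 - 1$.

```python
def dictionary_size(n, a=2, i0=1):
    """Tally the quantities D, m_k, i0 + sum(m_k) and M as read from Table I.

    n  : signal size N
    a  : radix of scales
    i0 : first level at which logons are chirped/rotated
    """
    # D = floor(0.5 * log_a(N)), computed without floating point.
    d = 0
    while a ** (2 * (d + 1)) <= n:
        d += 1
    # m_k = 4 a^(2k) - 1 for each scale index k in [0, D - i0).
    per_scale = [4 * a ** (2 * k) - 1 for k in range(d - i0)]
    # Number of stored kernels, then total chirplets over N^2 positions.
    kernels = i0 + sum(per_scale)
    total = n * n * kernels
    return d, per_scale, kernels, total


# Example: N = 1024 with the typical radix and first rotation level.
levels, chirplets_per_scale, n_kernels, n_chirplets = dictionary_size(1024)
```

With these assumptions the example gives five levels, which matches the case D = 5 and i0 = 1 discussed in the text; the gap between the number of kernels and the total number of chirplets indicates how much memory is saved by storing kernels only.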

The parameter $i_0$ indicates the first level at which a logon is rotated. The reason for introducing $i_0$ may be understood in this way: a chirplet is close to the unitary Gabor logon if its time-spread is close to one, so there is little to be gained by rotating it. The parameter $i_0$ can therefore be used to avoid rotating small-scale chirplets (time-spread close to one). For example, for a typical set of chirplets with D = 5 and $i_0$ = 1, the number of levels in the decomposition is 5, but the first level at which the chirplets are rotated is Level One (the 2nd level). Note also that a chirplet with a specified scale and chirp rate can be moved to $N^2$ positions in the time-frequency plane. Therefore, the total number of chirplets is the product of $N^2$ and the sum of the number of chirplets at each level. However, the number of chirplets stored in memory can be greatly reduced at the expense of additional time-shift and FFT operations. Indeed, substituting (2.13) into (2.14), we have

$$a_I = \int_{-\infty}^{+\infty} \left[f(t)\, \phi^*_{\Delta_t,c}(t - t_c)\right] e^{-j\omega_c t}\, dt, \tag{2.20}$$

where the kernel function

$$\phi_{\Delta_t,c}(t) = \frac{1}{\sqrt{\sqrt{\pi}\,\Delta_t}} \exp\left[-\frac{1}{2}\left(\frac{t}{\Delta_t}\right)^2 + jct^2\right] \tag{2.21}$$

is parameterized by the discrete time-spread $\Delta_t$ and chirp rate $c$. Because only the magnitude of the coefficient enters (2.19), the constant phase factor $\exp(j\omega_c t_c)$ has been dropped from (2.20) for the sake of simplicity. We simply store the kernel $\phi_{\Delta_t,c}(t)$ in memory, and then calculate $a_I$ as shown by (2.20). The number of kernels in the dictionary is determined by $i_0 + \sum_k m_k$.
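The evaluation of (2.20) from a single stored kernel can be sketched as follows. A plain discrete Fourier transform stands in for the FFT to keep the example dependency-free, and the names `kernel` and `coefficient_magnitudes`, the signal length and the test parameters are illustrative assumptions.

```python
import cmath
import math


def kernel(u, delta_t, c):
    """Kernel of (2.21) evaluated at the offset u = t - t_c."""
    scale = 1.0 / math.sqrt(math.sqrt(math.pi) * delta_t)
    return scale * math.exp(-0.5 * (u / delta_t) ** 2) * cmath.exp(1j * c * u * u)


def coefficient_magnitudes(f, delta_t, c):
    """Return |a_I| for every integer time-center and every frequency bin."""
    n = len(f)
    table = []
    for t_c in range(n):
        # Time-shift the conjugated kernel and multiply it with the signal.
        windowed = [f[t] * kernel(t - t_c, delta_t, c).conjugate() for t in range(n)]
        # DFT over t; bin k corresponds to omega_c = 2*pi*k/n.
        row = []
        for k in range(n):
            acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            row.append(abs(acc))
        table.append(row)
    return table


# Hypothetical test signal: one chirplet centred at sample 12, frequency bin 5.
N = 32
TRUE_TC, TRUE_BIN, TRUE_C, TRUE_DT = 12, 5, 0.02, 2.0
omega_c = 2.0 * math.pi * TRUE_BIN / N
signal = [kernel(t - TRUE_TC, TRUE_DT, TRUE_C) * cmath.exp(1j * omega_c * (t - TRUE_TC))
          for t in range(N)]
table = coefficient_magnitudes(signal, TRUE_DT, TRUE_C)
peak = max(((t_c, k) for t_c in range(N) for k in range(N)),
           key=lambda p: table[p[0]][p[1]])
```

One kernel per pair ($\Delta_t$, $c$) is enough to scan all time-centers and frequency-centers, which is the memory saving described above; the largest entry of `table` marks the time-center and frequency bin of the test chirplet.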

The discrete parameters $I_n$ of the chirplet that are adaptively obtained through (2.19) at step n can be further refined locally by the Newton-Raphson method; other optimization methods have also been proposed [79]. The Newton-Raphson method is chosen in this work because it is simple and fast.
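The greedy loop formed by (2.16) and (2.19) can be sketched as below, without the Newton-Raphson refinement. The small parameter grid built by `make_dictionary` is a hypothetical stand-in for the dictionary of Table I, and the atoms are rescaled to unit discrete energy so that the energy relation (2.18) holds exactly for sampled signals.

```python
import cmath
import math


def atom(n, t_c, omega_c, c, delta_t):
    """Sampled Gaussian chirplet (2.13), rescaled to unit discrete energy."""
    g = [math.exp(-0.5 * ((t - t_c) / delta_t) ** 2)
         * cmath.exp(1j * (c * (t - t_c) + omega_c) * (t - t_c)) for t in range(n)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in g))
    return [v / norm for v in g]


def inner(f, g):
    """Discrete inner product <f, g> = sum of f[t] * conj(g[t])."""
    return sum(x * y.conjugate() for x, y in zip(f, g))


def energy(f):
    return sum(abs(x) ** 2 for x in f)


def make_dictionary(n):
    """Small illustrative grid of chirplets (an assumption, not Table I)."""
    d = {}
    for t_c in range(0, n, 4):
        for k in range(8):
            for c in (-0.05, 0.0, 0.05):
                for delta_t in (2.0, 4.0):
                    d[(t_c, k, c, delta_t)] = atom(n, t_c, 2.0 * math.pi * k / 16, c, delta_t)
    return d


def matching_pursuit(f, dictionary, n_atoms):
    """Pick argmax |<R^n f, g_I>| as in (2.19), then update the residual (2.16)."""
    residual = list(f)
    picks = []
    for _ in range(n_atoms):
        best_key, best_a = None, 0j
        for key, g in dictionary.items():
            a = inner(residual, g)
            if best_key is None or abs(a) > abs(best_a):
                best_key, best_a = key, a
        g = dictionary[best_key]
        residual = [r - best_a * v for r, v in zip(residual, g)]
        picks.append((best_key, best_a))
    return picks, residual


N = 32
dictionary = make_dictionary(N)
# Hypothetical test signal: a weighted sum of two chirplets.
f = [2.0 * x + 1.0j * y
     for x, y in zip(atom(N, 8, 2.0 * math.pi * 2 / 16, 0.05, 2.0),
                     atom(N, 24, 2.0 * math.pi * 6 / 16, -0.05, 2.0))]
picks, residual = matching_pursuit(f, dictionary, 3)
extracted = sum(abs(a) ** 2 for _, a in picks)
```

Because each selected atom is orthogonal to the residual it leaves behind, the sum of `extracted` and the residual energy reproduces the signal energy, as stated in (2.18).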

The time-frequency structures of any $f(t)$ obtained from its decomposition within the dictionary may be visualized with the adaptive chirplet spectrogram (ACS) proposed by Bultan [65]. The ACS is a weighted sum of the WVD of each selected chirplet:

$$E f(t,\omega) = \sum_{n=0}^{P-1} \left|a_{I_n}\right|^2 W_V g_{I_n}(t,\omega), \tag{2.22}$$

where $W_V g_{I_n}(t,\omega)$ is the WVD of the Gaussian chirplet $g_{I_n}(t)$. One can interpret $E f(t,\omega)$ as an energy density of $f(t)$ in the time-frequency plane $(t,\omega)$. Unlike the WVD of $f(t)$, it does not include cross terms. Due to the Gaussian envelope of $g_{I_n}(t)$, $W_V g_{I_n}(t,\omega)$ is positive everywhere in the time-frequency plane (see footnote 10) [66]. The ACS provides a clear picture of a signal’s time-frequency structures.

Footnote 10: $W_V g_I(t,\omega) = 2\exp\left\{-\frac{(t-t_c)^2}{\Delta_t^2} - \Delta_t^2\left[(\omega-\omega_c) - 2c(t-t_c)\right]^2\right\}$.

Before providing examples to illustrate the algorithm, we point out the following reasons for adopting the MP algorithm. First of all, the MP algorithm is relatively simple. By straightforwardly applying (2.19) and (2.16) at each iterative step, it can be shown [37] that the algorithm converges, i.e., the residual energy $\left\|R^P f(t)\right\|^2$ tends to zero as P increases. This means that all of the signal energy can be extracted. Secondly, the MP approach has been established on a solid theoretical foundation [37,65,80-82]. Mallat and Zhang [37] provided the fundamental ideas, including the proof of the completeness and convergence of the decomposition. They pointed out the relation between the MP algorithm and the ideas of signal quantization and projection pursuit [80]. Davis et al. [81] further showed that the residual converges to a chaotic attractor of a process called “dictionary noise”. Bultan [65] extended the algorithm to include chirplets and proved the completeness of the chirplet dictionary. Finally, previous studies have demonstrated the success of applying the MP algorithm to biomedical problems, including temporomandibular joint sound analysis [38], sleep EEG analysis [39,40] and motion analysis of patients with Parkinson’s disease [48]. However, all of the above are non-chirplet applications. Our approach of adopting the MP algorithm with chirplets may be regarded as one further step in the direction of adaptive biomedical signal processing.

Next, two examples are presented to illustrate the essence and efficiency of the ACT in signal decomposition. For the first example, Fig. 12 shows the energy density in the time-frequency plane of a signal $\exp(-j\alpha\cos\omega_{\alpha} t)$ with sinusoidal frequency modulation. The signal is contaminated by a small amount of Gaussian white noise (GWN). The densities represented by the spectrogram $\left|S f(t,\omega)\right|^2$ and by the ACS $E f(t,\omega)$ are shown in Fig. 12(A) and (B), respectively. It can be seen that while the spectrogram consists of hundreds of points calculated by the STFT, the ACS consists of only five chirplets. In fact, the essence of the ACT approach is to approximate the signal’s energy curve in the time-frequency plane by straight lines with arbitrary slopes (a first-order approximation). Compared with a zero-order approximation, the first-order one is more suitable for compactly characterizing chirp-like signals. This is manifested more clearly in the next example.
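The ACS of (2.22) can be evaluated directly from the closed-form WVD of a Gaussian chirplet, $W_V g_I(t,\omega) = 2\exp\{-(t-t_c)^2/\Delta_t^2 - \Delta_t^2[(\omega-\omega_c) - 2c(t-t_c)]^2\}$. The sketch below assumes that each estimated chirplet is stored as a tuple `(a, t_c, omega_c, c, delta_t)`; the function names and the example values are hypothetical.

```python
import math


def wvd_chirplet(t, omega, t_c, omega_c, c, delta_t):
    """Closed-form WVD of a unit-energy Gaussian chirplet."""
    u = t - t_c
    v = omega - omega_c - 2.0 * c * u
    return 2.0 * math.exp(-(u / delta_t) ** 2 - (delta_t * v) ** 2)


def acs(t, omega, chirplets):
    """ACS (2.22): sum of |a|^2 times the WVD of each chirplet.

    chirplets is a list of tuples (a, t_c, omega_c, c, delta_t).
    """
    return sum(abs(a) ** 2 * wvd_chirplet(t, omega, *params)
               for a, *params in chirplets)


# Hypothetical single upchirplet with complex weight 0.5 - 0.5j.
example = [(0.5 - 0.5j, 10.0, 1.0, 0.05, 3.0)]
peak_value = acs(10.0, 1.0, example)
# Point on the ridge omega = omega_c + 2 c (t - t_c), three samples later.
ridge_value = acs(13.0, 1.0 + 2.0 * 0.05 * 3.0, example)
```

The density of each chirplet is concentrated along the straight line $\omega = \omega_c + 2c(t - t_c)$, which is the first-order approximation of the energy curve mentioned above; no cross terms arise because the WVDs are summed rather than computed from the summed signal.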

[Figure 12 plots: (A) Spectrogram and (B) ACS, each with time t on the horizontal axis and frequency ω on the vertical axis.]

Fig. 12. Time-frequency plots of a signal with sinusoidal frequency modulation. (A) The spectrogram of the signal; (B) the ACS of the signal.

In the second example, we decomposed a simulated complex signal into a number of time-frequency logons with two algorithms based on the MP method. The only difference between the two algorithms is the type of logon in the dictionary: one uses chirplets and the other uses Gabor logons. The results are shown in Fig. 13, where panel (A) shows the waveforms of the original signal and the reconstructed ones, panel (B) shows the various signal structures included in the simulated signal, panel (C) shows the time-frequency plot of the decomposition using the MP algorithm with Gabor logons, and panel (D) is the ACS of the decomposition.

[Figure 13 plots: (A) waveforms labelled “Original signal”, “Reconstructed from 7 chirplets” and “Reconstructed from 26 Gabor logons”; (B) signal structures I to IV; (C) and (D) time-frequency plots with axes t and ω.]

Fig. 13. Comparison of signal decompositions. (A) The waveforms of signals, including the original simulated signal, the reconstructed signal from seven estimated chirplets and the reconstructed one from 26 estimated Gabor logons. (B) The signal structures of the simulation. The simulated signal is a direct sum of the signals I-IV, where A = one period of a sinusoid, B = one period of a saw-tooth wave, C and D = sinusoids modulated by a Gaussian, E = delta function, F = sinusoid and G = Gaussian chirplet. (C) The adaptive spectrogram shows the results of decomposition using the MP algorithm with Gabor logons. (D) The ACS shows the results using the MP algorithm with chirplets.

The implementation of the MP algorithm with Gabor logons was obtained from the website cited in [41]. We follow the example given by Durka and Blinowska [40], where the utility of Gabor logons was demonstrated. Here we extend the demonstration to include a chirping component, which clearly cannot be represented compactly by Gabor logons, as seen in Fig. 13. The compactness of the two representations will be compared further in Section 3.3.

Fig. 14. An example of LEM. The target distributions are displayed by rectangles. (a) Initial guess of the time-frequency distributions displayed by circles; (b) the time-frequency distributions after a few iterations: the centers translate and dilate (while preserving constant area) to match the target distributions. (c) Corresponding time domain representation: Top subplot shows time series of initial guess corresponding to the time-frequency distribution in (a). Bottom subplot (corresponding to the time-frequency distribution in (b)) shows time series after running LEM. Figure reproduced from [62] with permission.

2.2.2 The LEM algorithm

The Logon Expectation and Maximization (LEM) algorithm was initially proposed by Mann and Haykin [62,63]. The principal idea involves two major steps: (1) several initial guesses of the major time-frequency structures of the signal, called “logon centers”, are arbitrarily placed; (2) the centers are then drawn toward the signal’s structures by using the EM algorithm iteratively. Note that this is an expectation and maximization in the time-frequency plane. It differs from the traditional EM method in that the traditional one is applied to a signal in the time domain only (or to a spectrum in the frequency domain only), whereas LEM is applied to a logon in the joint time-frequency domain. An example of the LEM algorithm is shown in Fig. 14.

Next, we present our EM steps for chirplet estimation. Instead of adopting the mathematical notation of quantum mechanics used in [62,63], we express the EM algorithm with the conventional notation of maximum likelihood estimation (MLE) [11]. Following the derivation provided by Kay [11] (Vol. I, Appendix 7C, see also [77]), we can specify the EM steps for chirplet estimation as follows.

E Step:

$$e = f(t) - \sum_{n=0}^{P-1} a_{I_n} g_{I_n}(t), \tag{2.23}$$

$$y_k(t) = a_{I_k} g_{I_k}(t) + \frac{1}{P}\, e. \tag{2.24}$$

M Step:

$$I_k = \arg\max_{I \in \Gamma} \left|\left\langle y_k, g_I \right\rangle\right|, \tag{2.25}$$

for $k = 0, 1, \ldots, P-1$. The values of $a_{I_k}$ are obtained by the GCT (2.14). Note that a model of the signal embedded in additive white Gaussian noise (WGN) is assumed, which we discuss in detail in Section 2.2.3.1. In the E Step, the error between the signal $f(t)$ and the linear sum of the P estimated chirplets $g_{I_n}$ is calculated first. Then the complete data $y_k$ are adjusted according to the error. In the M Step, the set of four parameters of each chirplet is re-estimated by the MP algorithm. The corresponding complex amplitude $a_{I_k}$ (chirplet coefficient) is found directly by the GCT. The LEM algorithm proposed here differs slightly from Mann and Haykin’s in that the M Step employs the MP method to readjust all four parameters of each chirplet. Their method did not use the MP method and did not re-estimate the chirp rate in each iterative step.
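The E and M steps of (2.23)-(2.25) can be sketched as below. The parameter grid in `make_dictionary`, the starting guesses and the noise-free two-chirplet test signal are hypothetical choices for illustration; the M step here is an exhaustive search over the grid, without any local refinement.

```python
import cmath
import math


def atom(n, t_c, omega_c, c, delta_t):
    """Sampled Gaussian chirplet (2.13), rescaled to unit discrete energy."""
    g = [math.exp(-0.5 * ((t - t_c) / delta_t) ** 2)
         * cmath.exp(1j * (c * (t - t_c) + omega_c) * (t - t_c)) for t in range(n)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in g))
    return [v / norm for v in g]


def inner(f, g):
    return sum(x * y.conjugate() for x, y in zip(f, g))


def make_dictionary(n):
    """Small illustrative grid of chirplets (an assumption, not Table I)."""
    d = {}
    for t_c in range(0, n, 4):
        for k in range(8):
            for c in (-0.05, 0.0, 0.05):
                for delta_t in (2.0, 4.0):
                    d[(t_c, k, c, delta_t)] = atom(n, t_c, 2.0 * math.pi * k / 16, c, delta_t)
    return d


def best_match(y, dictionary):
    """M step (2.25): parameters maximizing |<y, g_I>| and the coefficient (2.14)."""
    key = max(dictionary, key=lambda k: abs(inner(y, dictionary[k])))
    return key, inner(y, dictionary[key])


def lem(f, dictionary, start_keys, n_iter):
    """Refine P chirplets; also return the error energy before each iteration."""
    comps = [(k, inner(f, dictionary[k])) for k in start_keys]
    p = len(comps)
    history = []
    for _ in range(n_iter):
        model = [sum(a * dictionary[k][t] for k, a in comps) for t in range(len(f))]
        e = [x - m for x, m in zip(f, model)]                          # (2.23)
        history.append(sum(abs(v) ** 2 for v in e))
        refined = []
        for k, a in comps:
            y = [a * g + v / p for g, v in zip(dictionary[k], e)]      # (2.24)
            refined.append(best_match(y, dictionary))                  # (2.25)
        comps = refined
    return comps, history


N = 32
dictionary = make_dictionary(N)
f = [2.0 * x + 1.5j * y for x, y in zip(dictionary[(8, 2, 0.05, 2.0)],
                                        dictionary[(24, 6, -0.05, 2.0)])]
# Deliberately poor starting guesses taken from the grid.
comps, history = lem(f, dictionary, [(12, 3, 0.0, 4.0), (20, 5, 0.0, 4.0)], 6)
```

In this setting the error energy recorded in `history` cannot increase from one iteration to the next, because each M step finds the best fit to its own complete data $y_k$ and the new error is the sum of the P individual fitting errors.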


2.2.3 MPLEM algorithm of the ACT

Our approach is similar to some previously proposed methods [65,74] and can be regarded as a natural extension of the MP algorithm with Gabor logons [37]. However, it is important to note that the methods proposed in [65,74] generally assume a signal model without noise, and hence their effectiveness at low SNR has not been rigorously demonstrated. This is of relevance to our problem, since VEP signals are often obtained under low-SNR conditions. To overcome this problem, our approach to multiple-chirplet decomposition involves two crucial procedures in each iterative step: (1) initial coarse estimates of the chirplets are obtained by using the MP algorithm, and (2) the estimates undergo progressive refinement with the LEM algorithm. The implementation of the LEM part generally follows the methods proposed in [62,63,77].

The implementation of the MPLEM algorithm is illustrated by the flowchart in Fig. 15. The initialization stage of the algorithm consists of the construction of the discrete chirplet dictionary (Table I) and of the initial residual, $R^0 f = f$ (i.e., the original signal itself). When we already have $P \ge 1$ chirplets, the new (P+1)th chirplet is estimated according to (2.19) with the MP algorithm, $I_P = \arg\max_{I \in \Gamma} \left|\left\langle R^P f, g_I \right\rangle\right|$, and then locally refined with the Newton-Raphson method. Subsequently, all P+1 chirplets are progressively refined with the LEM algorithm. The LEM algorithm may be repeated several times until either the error in (2.23) is below a threshold ($\|e\| \le \delta_1$) or the number of iterations exceeds a predefined value. To prepare for the next iteration, we refresh the residual $R^P f$ according to (2.17) and increase the index P by one. The number of chirplets extracted from $f(t)$ is decided by the stopping criterion (e.g., the coherent coefficient falling below a threshold, $cc < \delta_2$), which is an important issue in practice and is discussed in Section 2.2.4.

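[Figure 15 flowchart, in reading order:
(1) START.
(2) Construct the chirplet dictionary.
(3) Set P = 0 and initialize the residue: $R^0 f = f$.
(4) Single chirplet estimation: obtain coarse estimates of one chirplet by applying the MP algorithm to $R^P f$; locally refine the estimates $I_P$ by the Newton-Raphson method.
(5) Multiple chirplets refinement. E-step: error $e = f - \sum_{n=0}^{P-1} a_{I_n} g_{I_n}$ and complete data $y_k = a_{I_k} g_{I_k} + \frac{1}{P} e$, for $k = 0, 1, \ldots, P-1$. M-step: refine $a_{I_k}$, $g_{I_k}$ of each $y_k$, for $k = 0, 1, \ldots, P-1$.
(6) Test $\|e\| \le \delta_1$: if no, repeat the refinement in (5); if yes, calculate the residue $R^P f$ and move to the next chirplet, $P = P + 1$.
(7) Test $cc < \delta_2$: if no, return to the single chirplet estimation in (4); if yes, END.]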

Fig. 15. Flowchart of the MPLEM adaptive chirplet decomposition algorithm.
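The loop of Fig. 15 can be sketched in compact form as follows. This is a simplified illustration under stated assumptions: the Newton-Raphson refinement is omitted, the two stopping tests are replaced by fixed iteration counts (`n_chirplets`, `n_lem`), and the parameter grid, noise level and test signal are hypothetical.

```python
import cmath
import math
import random


def atom(n, t_c, omega_c, c, delta_t):
    """Sampled Gaussian chirplet (2.13), rescaled to unit discrete energy."""
    g = [math.exp(-0.5 * ((t - t_c) / delta_t) ** 2)
         * cmath.exp(1j * (c * (t - t_c) + omega_c) * (t - t_c)) for t in range(n)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in g))
    return [v / norm for v in g]


def inner(f, g):
    return sum(x * y.conjugate() for x, y in zip(f, g))


def energy(f):
    return sum(abs(x) ** 2 for x in f)


def make_dictionary(n):
    """Small illustrative grid of chirplets (an assumption, not Table I)."""
    d = {}
    for t_c in range(0, n, 4):
        for k in range(8):
            for c in (-0.05, 0.0, 0.05):
                for delta_t in (2.0, 4.0):
                    d[(t_c, k, c, delta_t)] = atom(n, t_c, 2.0 * math.pi * k / 16, c, delta_t)
    return d


def best_match(y, dictionary):
    """Grid search of (2.19)/(2.25) plus the coefficient (2.14)."""
    key = max(dictionary, key=lambda k: abs(inner(y, dictionary[k])))
    return key, inner(y, dictionary[key])


def error(f, comps, dictionary):
    """Signal minus the sum of the current chirplets, as in (2.23)."""
    e = list(f)
    for key, a in comps:
        e = [x - a * g for x, g in zip(e, dictionary[key])]
    return e


def mplem(f, dictionary, n_chirplets, n_lem):
    comps, residual, energies = [], list(f), []
    for _step in range(n_chirplets):
        comps.append(best_match(residual, dictionary))       # MP coarse estimate
        for _it in range(n_lem):                              # LEM refinement
            e = error(f, comps, dictionary)
            p = len(comps)
            comps = [best_match([a * g + v / p for g, v in zip(dictionary[k], e)],
                                dictionary) for k, a in comps]
        residual = error(f, comps, dictionary)                # refreshed residual
        energies.append(energy(residual))
    return comps, energies


N = 32
rng = random.Random(0)
dictionary = make_dictionary(N)
f = [2.0 * x + 1.5j * y + complex(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1))
     for x, y in zip(dictionary[(8, 2, 0.05, 2.0)], dictionary[(24, 6, -0.05, 2.0)])]
comps, energies = mplem(f, dictionary, n_chirplets=2, n_lem=3)
```

In this sketch the residual energy cannot grow when a chirplet is added, since the MP step removes a projection from the residual and the subsequent LEM iterations do not increase the error. A practical implementation would replace the fixed counts with the thresholds $\delta_1$ and $\delta_2$ of the flowchart.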

It should be noted that the proposed MPLEM algorithm differs from the LEM algorithm proposed by Mann and Haykin [62,63] and by O’Neill and Flandrin [77] in that we include an initialization step (the MP algorithm). It is known that a good initial guess is crucial for the algorithm to converge to the true values of the parameters. A poor guess will usually not lead to the solution with the minimum global error, and can sometimes prevent the algorithm from converging at all. Indeed, the approaches proposed in [62,63] may suffer from extremely slow convergence due to poor initial values of the centers. Hence, an alternative algorithm was proposed in [62,63] to partially solve this problem by increasing the robustness of the first few iterative steps. However, it is still a heuristic approach in principle. By adopting the MP algorithm, we provide a systematic way of searching for initial values for the iterative steps, i.e., the coarse-estimation step in our approach. The search space effectively covers the entire time-frequency space of the signal according to the “frame theory” of signal analysis [65,70]. The resultant values are presumably better than those arbitrarily guessed in the LEM algorithm. Moreover, a subtle advantage of using the MP algorithm is that the reconstruction of the signal is guaranteed, while no such discussion was provided by Mann and Haykin [62,63]. However, the high computational cost of the MP algorithm may prohibit its use in real-time applications. Next, we present a discussion of the signal model and a numerical simulation to validate the MPLEM algorithm.

2.2.3.1 Signal model and CRLBs of chirplet estimates

A challenge in VEP analysis is the low-SNR of the measured signals. Therefore, an analysis problem becomes, in principle, an estimation problem. To apply the chirplet analysis, we propose a signal model: a VEP signal can be represented by a summation of chirplets and the signal is embedded in GWN (Section 1.4). Mathematically, this model may be formulated as

$$f(t) = \sum_{n=0}^{P-1} a_{I_n} g_{I_n}(t) + W(t), \tag{2.26}$$

where $a_{I_n}$, defined in (2.14), is the complex weight of the selected chirplet $g_{I_n}(t)$, P is the number of chirplets, and $W(t)$ is a process of complex Gaussian white noise (CGWN) with variance $\sigma^2$. $\mathrm{Re}\{W(t)\}$ and $\mathrm{Im}\{W(t)\}$ are uncorrelated and each follows the normal distribution $N(0, \sigma^2)$. The complex signal model is introduced to simplify the mathematical expressions. There are six parameters to be estimated for each chirplet in (2.26): four of them, $I_n = (t_c, \omega_c, c, \Delta_t)_n$, govern the chirplet $g_{I_n}(t)$, and the other two are the amplitude $A_{I_n}$ and the phase $\phi_{I_n}$ that determine the coefficient $a_{I_n} = A_{I_n} e^{-j\phi_{I_n}}$. Thus, there are 6P+1 parameters to be estimated (including $\sigma^2$).

Maximum likelihood estimation (MLE) has been chosen to estimate the parameters, as recent studies (e.g., [77]) suggest that the MLE can provide reliable chirplet estimation at low SNR (note that the implementation of the MLE in our algorithm is the LEM algorithm). Since the MLE is asymptotically unbiased, its performance depends mainly on the variance of the estimates. We therefore discuss the optimal bound of the variance of the chirplet estimates, the Cramér-Rao lower bound (CRLB) [11]. The CRLB provides a benchmark against which we can compare the performance of any unbiased estimator [11].

Let us consider the simplest case, in which just one chirplet is embedded in the noise (i.e., P = 1 in (2.26)). The signal model is then:

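$$f(t) = a\, g(t) + W(t) \tag{2.27}$$

$$a = A e^{-j\phi} \tag{2.28}$$

$$\varphi(t) = \left[c(t - t_c) + \omega_c\right](t - t_c) \tag{2.29}$$

$$g(t) = \frac{1}{\sqrt{\sqrt{\pi}\,\Delta_t}} \exp\left[-\frac{1}{2}\left(\frac{t - t_c}{\Delta_t}\right)^2\right] \exp\left[j\varphi(t)\right] \tag{2.30}$$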


where the subscript $I_0$ is omitted without risk of confusion. From (2.19), the four estimates of the chirplet parameters are

$$\left[\hat{t}_c, \hat{\omega}_c, \hat{c}, \hat{\Delta}_t\right]^T = \arg\max_{t_c,\omega_c,c,\Delta_t} \left|\left\langle f, g_I \right\rangle\right|^2, \tag{2.31}$$

where $g_I$ is a chirplet defined in the dictionary. Subsequently, the estimated chirplet $\hat{g}(t)$ can be found by substituting (2.31) into (2.30). The other estimates are found by

$$\hat{A} = |\hat{z}| \tag{2.32}$$

$$\hat{\phi} = -\angle\hat{z} \tag{2.33}$$

$$\hat{\sigma}^2 = \frac{\|f\|^2 - |\hat{z}|^2}{2N} \tag{2.34}$$

where N is the size of $f(t)$ and $\hat{z} = \langle f, \hat{g} \rangle$. The performance of the estimators (2.31)-(2.34) may be quantitatively measured by finding their CRLBs. Fortunately, the bounds can be found in closed form after a tedious, though straightforward, derivation. We present the results in Table II and leave the derivation to Appendix B.

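TABLE II
CRLBS OF THE ESTIMATES

Estimate $\hat{\theta}$ | CRLB†
$\hat{t}_c$ | $\Delta_t^2 / \xi$
$\hat{\omega}_c$ | $(1 + 4c^2\Delta_t^4) / (\xi\Delta_t^2)$
$\hat{c}$ | $4 / (\xi\Delta_t^4)$
$\hat{\Delta}_t$ | $\Delta_t^2 / (2\xi)$
$\hat{\phi}$ | $(3 + 4\omega_c^2\Delta_t^2) / (4\xi)$
$\hat{A}$ | $\sigma^2$
$\hat{\sigma}^2$ | $\sigma^4 / N$

† $\xi = A^2 / (2\sigma^2)$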

The ratio $\xi$ is defined as the energy-to-noise ratio (ENR). Except for the estimates $\hat{A}$ and $\hat{\sigma}^2$, the CRLBs of the estimates are inversely proportional to the ENR. Since the chirplet is well within the recording interval (see Appendix B), none of the bounds except that of the noise variance is a function of the signal size N. The curves of the bound functions are shown in Fig. 16. The simple functions related to $\hat{A}$ and $\hat{\sigma}^2$ are not included.
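The closed-form bounds of Table II are simple to evaluate numerically. The sketch below transcribes them as read from the table; the function name `crlb`, the dictionary keys and the example values are illustrative assumptions.

```python
import math


def crlb(amplitude, sigma2, n, omega_c, c, delta_t):
    """Evaluate the bounds of Table II for a single chirplet in CGWN.

    amplitude : true amplitude A
    sigma2    : noise variance sigma^2 of each of the real and imaginary parts
    n         : signal size N
    """
    xi = amplitude ** 2 / (2.0 * sigma2)  # energy-to-noise ratio (ENR)
    return {
        "t_c": delta_t ** 2 / xi,
        "omega_c": (1.0 + 4.0 * c ** 2 * delta_t ** 4) / (xi * delta_t ** 2),
        "c": 4.0 / (xi * delta_t ** 4),
        "delta_t": delta_t ** 2 / (2.0 * xi),
        "phi": (3.0 + 4.0 * omega_c ** 2 * delta_t ** 2) / (4.0 * xi),
        "A": sigma2,
        "sigma2": sigma2 ** 2 / n,
    }


# Example with unit ENR (A = 1, sigma^2 = 0.5), as assumed for Fig. 16.
bounds = crlb(amplitude=1.0, sigma2=0.5, n=100, omega_c=math.pi / 2, c=0.1, delta_t=2.0)
# Halving the time-spread raises the chirp-rate bound by a factor of 16.
narrow = crlb(amplitude=1.0, sigma2=0.5, n=100, omega_c=math.pi / 2, c=0.1, delta_t=1.0)
```

The comparison between `bounds` and `narrow` shows the trade-off discussed in this section: a shorter chirplet tightens the bounds of the time-center and time-spread but loosens the bound of the chirp rate.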

[Figure 16 plots: (A) CRLBs of $\hat{t}_c$, $\hat{c}$ and $\hat{\Delta}_t$ on a logarithmic scale versus the true time-spread $\Delta_t$ (0 to 10); (B) CRLB of $\hat{\omega}_c$, $(1 + 4c^2\Delta_t^4)/\Delta_t^2$, versus the true chirp rate $c$ and the true time-spread $\Delta_t$; (C) CRLB of $\hat{\phi}$, $(3 + 4\omega_c^2\Delta_t^2)/4$, versus the true frequency-center $\omega_c$ and the true time-spread $\Delta_t$.]

Fig. 16. CRLBs of the estimates of a single chirplet in noise. (A) The bounds of $\hat{t}_c$, $\hat{c}$ and $\hat{\Delta}_t$. (B) The bound of $\hat{\omega}_c$. (C) The bound of $\hat{\phi}$. Note the symmetric property for negative $c$ and negative $\omega_c$ in (B) and (C), respectively. The figures are calculated when the ENR $\xi$ is assumed to be one.

We can see that the bounds of $\hat{\Delta}_t$ and $\hat{t}_c$ increase with the true time-spread. The bound of the chirp rate shoots up sharply when $\Delta_t$ approaches zero, which indicates that it will be difficult to estimate the chirp rate when $\Delta_t$ is small. The behaviors of the bounds related to $\hat{\omega}_c$ and $\hat{\phi}$ are more complicated. For a small chirp rate $c$, the variance of the estimate $\hat{\omega}_c$ decreases as $\Delta_t$ increases. For large $c$, however, either a large or a small $\Delta_t$ will increase the bound of $\hat{\omega}_c$. These findings provide some measure of confidence in chirplet estimation. For instance, if we know that a chirplet’s time-spread is small, we can have good confidence in the estimate of its time-center.

2.2.3.2 Numerical simulation

The purpose of the simulation is to validate the algorithm in two respects. The first is to investigate the behavior of the algorithm in estimating a single chirplet in various levels of noise, which is done by comparing the variances of the estimates with the CRLBs. The second is to demonstrate, via an example, the efficiency of the LEM algorithm in estimating multiple chirplets under low-SNR conditions.

We have already provided the CRLBs of the seven chirplet estimators in Table II for the signal model in (2.27). The parameters of the model for simulation are $N = 100$ (signal size), $A = 1$, $\phi = 0$, $t_c = N/2$, $\omega_c = \pi/2$, $c = \pi/N$ and $\Delta_t = 2$. The values of the variance of the CGWN, $\sigma^2$, are chosen in such a way that the SNR of the simulated signal ranges from -20 dB to 20 dB. The SNR is defined as:

$$\mathrm{SNR} = 10\log_{10}\frac{\xi}{N-1} = 10\log_{10}\frac{A^2}{2(N-1)\sigma^2} \ \text{(dB)}. \tag{2.35}$$
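A sketch of how one simulated trial might be set up is given below: (2.35) is inverted to obtain the noise variance for a target SNR, a noisy chirplet is generated according to (2.27), and the estimates (2.32)-(2.34) are computed. For simplicity the four chirplet parameters are assumed to be known, so that $\hat{g} = g$; the function names, the phase value and the seeded random generator are illustrative assumptions.

```python
import cmath
import math
import random


def sigma2_for_snr(snr_db, amplitude, n):
    """Invert (2.35): SNR = 10 log10(A^2 / (2 (N - 1) sigma^2))."""
    return amplitude ** 2 / (2.0 * (n - 1) * 10.0 ** (snr_db / 10.0))


def chirplet(n, t_c, omega_c, c, delta_t):
    """Samples of the chirplet (2.30) at t = 0, 1, ..., n - 1."""
    scale = 1.0 / math.sqrt(math.sqrt(math.pi) * delta_t)
    return [scale * math.exp(-0.5 * ((t - t_c) / delta_t) ** 2)
            * cmath.exp(1j * (c * (t - t_c) + omega_c) * (t - t_c)) for t in range(n)]


def simulate(snr_db, amplitude, phi, g, rng):
    """Signal model (2.27): a * g(t) plus complex Gaussian white noise."""
    sigma = math.sqrt(sigma2_for_snr(snr_db, amplitude, len(g)))
    a = amplitude * cmath.exp(-1j * phi)
    return [a * v + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for v in g]


def estimate(f, g_hat):
    """Estimates (2.32)-(2.34) of amplitude, phase and noise variance."""
    z = sum(x * y.conjugate() for x, y in zip(f, g_hat))
    a_hat = abs(z)
    phi_hat = -cmath.phase(z)
    sigma2_hat = (sum(abs(x) ** 2 for x in f) - abs(z) ** 2) / (2.0 * len(f))
    return a_hat, phi_hat, sigma2_hat


# Simulation parameters of this section, except for the hypothetical phase 0.3.
N = 100
g = chirplet(N, t_c=N / 2, omega_c=math.pi / 2, c=math.pi / N, delta_t=2.0)
rng = random.Random(0)
noisy = simulate(snr_db=20.0, amplitude=1.0, phi=0.3, g=g, rng=rng)
a_hat, phi_hat, sigma2_hat = estimate(noisy, g)
```

Repeating `simulate` and `estimate` over many noise realizations at each SNR, and collecting the sample mean and variance of the estimates, gives the kind of comparison with the CRLBs described in this section.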


[Figure 17 plots: seven panels, (A) $\hat{A}$, (B) $\hat{\phi}$, (C) $\hat{t}_c$, (D) $\hat{\omega}_c$, (E) $\hat{c}$, (F) $\hat{\Delta}_t$ and (G) $\hat{\sigma}^2$, each showing the mean value of the estimate (left axis) and its variance (right axis, logarithmic) versus SNR from −20 dB to 20 dB. Legend: mean value, true value, variance, CRLB.]

Fig. 17. Simulation results of estimating a single chirplet in various levels of noise. In each figure the x-axis is the SNR in dB, the y-axis of the mean value of the parameter is on the left side, and the y-axis of the variance is on the right side. The seven parameters are (A) amplitude, (B) phase, (C) time-center, (D) frequency-center, (E) chirp rate, (F) time-spread, and (G) the variance of the noise.


The results are shown in Fig. 17, where the mean and variance of the estimated parameters are compared with the true values and the CRLBs, respectively. The seven parameters listed in Table II are shown as functions of the SNR. In each panel, the lines with white dots represent the mean value of the parameter: the dotted line shows the true value and the solid line shows the mean of the measurements. The lines with black dots represent the variance: the dotted line shows the CRLB and the solid line shows the variance of the measurements.

The results show that, in general, the mean values of all estimates are equal to the true values for SNRs above 0 dB. For lower SNRs, the mean values deviate significantly from the true values, resulting in biased estimates. A similar pattern can be observed for the variance: the variance of the estimates closely follows the CRLBs until the SNR falls below 0 dB. Recall that an estimator is said to be optimal if its variance is equal to the CRLB. The simulation results show that all chirplet estimators are no longer optimal if the SNR is below 0 dB, except the estimator of the noise variance $\sigma^2$, whose mean and variance follow the theoretical values closely over the whole range of noise levels. Note that the signal size in the simulation was set at 100 points. Other studies have shown that the performance of the estimators can be improved by using a longer signal at the same SNR level [83].

To evaluate the effectiveness of the LEM algorithm for multiple-chirplet decomposition, we used a synthetic signal consisting of three chirplets (specified in Table III). The signal size was N = 100 and the SNR¹¹ was set at 0 dB.

11 In the case of multiple chirplets, the SNR is defined as $10\log_{10}\frac{m}{(N-1)\sigma^2}$ in dB, where m is the number of chirplets and N is the signal size.


TABLE III
COMPONENTS OF THE SYNTHETIC SIGNAL

Chirplets  $\hat{t}_c$ (pts)  $\hat{\omega}_c$ (rad/pts)  $\hat{c}$ (rad/pts)  $\hat{\Delta}_t$ (pts)  $\hat{\phi}$ (rad)  $\hat{A}$
c1         50.00              1.57                        0.00                 1.00                    0.00                1.00
c2         50.00              1.26                        0.03                 12.50                   0.00                1.00
c3         50.00              2.20                        0.00                 12.50                   0.00                1.00

pts = points

[Figure 18: four time-frequency panels (A)–(D), each with time on the x-axis and frequency on the y-axis; the three components are labeled c1, c2 and c3.]

Fig. 18. Simulation results of multiple-chirplet decomposition of a synthetic signal. (A) WVD of the signal. (B) WVD of the signal in noise. (C) ACS of the estimated chirplets without using the EM algorithm. (D) ACS of the estimated chirplets by using the LEM algorithm.

In Fig. 18, we show four different time-frequency plots of the signals. The WVD of the synthetic signal without and with noise is shown in (A) and (B), respectively. The signal structures are not easily discerned by visual inspection, especially for the noisy signal. Next, we show in (C) the result of applying the MP algorithm alone to the signal (without the LEM refinement). Errors in the estimates are clearly observable. Finally, we show in (D) the result after applying the MPLEM algorithm. The estimated time-frequency structures are displayed with the ACS. The parameters of the three chirplets are now correctly estimated. This example illustrates the significance of the LEM refinement process.

2.2.4 Measures for stopping criterion and compactness

A critical point of the analysis is determining the number of chirplets required to adequately represent the VEP signals. We did not attempt to predefine this number, but rather opted to continue the decomposition until a specific stopping criterion is satisfied. One stopping criterion uses the coherent coefficient (cc) of each extracted chirplet, defined as the ratio of the energy of the projection to the energy of the residue,

$$cc_n = \frac{|a_{I_n}|^2}{\|R^n f\|^2}, \quad n = 0, \dots, P-1, \tag{2.36}$$

where $|a_{I_n}|^2$ is the energy of the projection and $\|R^n f\|^2$ is the energy of the residue $R^n f$. This measure was originally proposed by Mallat and Zhang [37] to characterize the coherence of a signal with respect to the dictionary of functions. In other words, the chirplets in the dictionary serve as a “vocabulary to interpret a given sentence”. The more coherent a signal is with respect to the dictionary, the larger the cc values. Therefore, a small cc value indicates low correlation between the signal and the dictionary. A threshold on the cc value can thus be chosen as a stopping criterion. In Chapter 3, we specify a criterion using the cc values of the estimated chirplets. Another measure can be defined in terms of the energy ratio (ER), which is the ratio of the energy of the residual to that of the original signal:

$$\mathrm{ER}_n = \frac{\|R^n f\|^2}{\|f\|^2}. \tag{2.37}$$

The lower the ER value, the less energy remains in the residual. We will employ this measure to compare the compactness of the chirplet representation with that of the Gabor logon representation in Chapter 3.
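The two measures can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: it takes a fixed, ordered list of unit-energy atoms instead of searching a chirplet dictionary, it assumes the convention $R^0 f = f$ (so that $\mathrm{ER}_0 = 1$), and the function names are hypothetical.

```python
def energy(x):
    """Energy of a real or complex sequence: the sum of |x[k]|^2."""
    return sum(abs(v) ** 2 for v in x)


def inner(x, g):
    """Inner product <x, g>, with complex conjugation applied to the atom g."""
    return sum(xv * complex(gv).conjugate() for xv, gv in zip(x, g))


def pursuit_measures(signal, atoms):
    """Project the residue onto each unit-energy atom in turn.

    Returns a list of (cc_n, ER_n) pairs, both computed from the residue
    R^n f that exists before the n-th atom is subtracted.
    """
    residual = list(signal)
    total = energy(signal)
    measures = []
    for g in atoms:
        res_energy = energy(residual)
        if res_energy == 0.0:
            break  # nothing left to decompose
        a = inner(residual, g)            # projection coefficient
        cc = abs(a) ** 2 / res_energy     # coherent coefficient
        er = res_energy / total           # energy ratio
        measures.append((cc, er))
        residual = [r - a * gv for r, gv in zip(residual, g)]
    return measures


# Toy example with two orthonormal "atoms" (not chirplets).
print(pursuit_measures([3.0, 4.0, 0.0], [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]))
```

In the toy example the first atom captures 16 of the 25 units of energy (cc = 0.64), and the second captures all that remains (cc = 1, with ER = 0.36 before it is removed).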

2.3 The Windowed Adaptive Chirplet Transform

In the preceding section, we described the principles of the MPLEM algorithm for estimating multiple chirplets based on the signal model shown in (2.26). Since the entire signal $f(t)$ is analyzed directly, we term the method the non-windowed ACT to distinguish it from the windowed ACT described next. We modify the non-windowed ACT one step further by segmenting the signal in a preprocessing procedure. That is, the signal is partitioned into non-overlapping, equal-length segments, using rectangular truncation to obtain an integral number of segments. One chirplet is then estimated from each segment, and hence the entire signal is approximated by a sequence of chirplets.

Segment-based methods are common practice in signal processing. They are generally used either to increase the speed of computation or to reduce the dimensionality of the feature space in classification problems. For example, to compute the output of a linear time-invariant system with a very long input signal, block convolution is generally used to avoid a large delay in processing (e.g., see reference [84]). In this technique, the signal is segmented into equal-length sections that are convolved with the finite-length impulse response of the system. The filtered sections are then fitted together in an appropriate way. In speech recognition, to avoid a high-order feature set (i.e., a large number of features), it is common practice to divide the speech signal and use a low-order set to represent each individual segment (e.g., see reference [85]). The resulting sequences of a small number of features can subsequently be input to a dynamic model, such as a hidden Markov model (HMM), to characterize the time-varying properties of the signals.

A major motivation for developing the windowed ACT is to reduce computational time. Most of the computational time is spent in calculating the inner product of (2.19) when selecting the optimal chirplet. The complexity of this operation is $O(N^2 \log_2 N)$, where N is the signal size [65]. That is, if $C(N)$ is the number of operations, then $C(N) = k_1 N^2 \log_2 N$, where $k_1$ is a constant. Suppose the signal has been partitioned into M segments, each of length L. The number of operations of the windowed ACT will be $C_w(N) = k_2 L^2 \log_2 L$, where $k_2 = k_1 M$. The computational complexity is thus reduced to $O(L^2 \log_2 L)$. Since L is usually much smaller than N, a significant amount of computing time can be saved.

In our approach, each data segment is represented by a single chirplet. This is equivalent to modeling the data segment as a linear chirp with a Gaussian envelope. Linear-chirp modeling has been previously studied in the approximation of narrow-band signals [86]. Quite recently, Ainsleigh et al. [83] extended this idea to approximate a narrow instantaneous bandwidth (NIB) signal by partitioning it and then approximating each segment using a linear chirp. Our method of approximating signals with piecewise Gaussian chirplets was developed independently in a different context (i.e., in VEP analysis), but the underlying mechanisms are similar. Note that the total bandwidth of an NIB signal is not necessarily narrow. That is, although the energy of the signal may cover a wide range of frequencies, at any given time instant it may cover only a small frequency interval; this is what is referred to as NIB.
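As a rough worked example of the saving, the two operation counts can be compared directly. The sketch below reads the counts as $C(N) = k_1 N^2 \log_2 N$ and $C_w(N) = k_1 M L^2 \log_2 L$, uses $N = LM$, and plugs in illustrative values ($N = 1200$, $L = 100$, $M = 12$) that are assumptions chosen for the arithmetic rather than settings reported here.

$$\frac{C(N)}{C_w(N)} = \frac{k_1 N^2 \log_2 N}{k_1 M L^2 \log_2 L} = \frac{M \log_2 N}{\log_2 L}, \qquad \frac{12 \log_2 1200}{\log_2 100} \approx \frac{12 \times 10.23}{6.64} \approx 18.5 .$$

Under these assumptions the windowed method would need roughly one eighteenth of the operations of the non-windowed method; the actual saving depends on the constants and on the chosen segment length.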


2.3.1 Computational method

Suppose that the size of a signal is N. We partition it into M segments, each of size L, i.e., $N = L \times M$. The signal of the k-th segment $s_k$ may be modeled as

$$s_k(t_k) = a_{I_k} g_{I_k}(t_k) + w_k(t_k), \quad \text{for } k = 0, 1, \dots, M-1. \tag{2.38}$$

Here $t_k = kL, kL+1, \dots, (k+1)L-1$ are the sample times, $w_k$ is Gaussian white noise with unknown variance $\sigma_k^2$, and $a_{I_k}$ and $g_{I_k}$ are the amplitude and the chirplet that fit the data in the k-th segment. The computational procedure is straightforward: after partitioning the signal into M segments, we estimate one chirplet from each segment using the MP algorithm. This is exactly the “single chirplet estimation” step of the algorithm shown in Fig. 15 (p. 45). Because only one chirplet is estimated, the other step, “multiple chirplets refinement”, is unnecessary. The signal is then represented by a sequence of parameter sets $(a_{I_k}, I_k)$, for $k = 0, 1, \dots, M-1$.
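The partitioning step can be sketched as follows. This is a minimal illustration, not the implementation used in this work: `estimate_chirplet` is a hypothetical callable standing in for the single-chirplet MP step, and the `peak_estimator` passed to it below is a placeholder that does not fit a chirplet at all.

```python
def partition(signal, seg_len):
    """Split a sequence into non-overlapping segments of seg_len samples.

    Samples beyond the last full segment are dropped (rectangular truncation).
    """
    n_seg = len(signal) // seg_len
    return [signal[k * seg_len:(k + 1) * seg_len] for k in range(n_seg)]


def windowed_act(signal, seg_len, estimate_chirplet):
    """Return one parameter set per segment.

    estimate_chirplet is a hypothetical callable that takes one segment and
    returns the parameters of the single atom fitted to it.
    """
    return [estimate_chirplet(seg) for seg in partition(signal, seg_len)]


def peak_estimator(segment):
    """Placeholder estimator for illustration only: it reports the largest
    sample of the segment instead of fitting a chirplet."""
    idx = max(range(len(segment)), key=lambda i: abs(segment[i]))
    return {"peak_index": idx, "peak_value": segment[idx]}


signal = [float((i * 7) % 13) for i in range(1250)]
params = windowed_act(signal, 100, peak_estimator)
print(len(params))  # 12 full segments; the last 50 samples are truncated
```

Passing the estimator as an argument keeps the segmentation logic independent of how the chirplet in each segment is actually estimated.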

2.3.2 Discussion

With this method a signal is essentially approximated by a series of chirplets. One important question is: can a broadband signal be sufficiently represented by a sequence of narrowband components? The justification for the windowed approach is that an adequate representation can be obtained if the signal is an NIB signal. Other signals may not suit the windowed approach. An example is presented next to illustrate this idea.

A good example of an NIB signal is the one with sinusoidal frequency modulation shown in Fig. 12 (p. 40). We partition the signal and apply the windowed ACT technique to approximate each segment with one chirplet. The result is shown in Fig. 19. It is interesting to compare this result with that shown in Fig. 12(B), where five chirplets were obtained with the non-windowed ACT. We can see that both results yield an adequate approximation to the spectrogram in Fig. 12(A). However, the time cost of the windowed ACT is much lower than that of the non-windowed ACT. On the other hand, the synthetic signal specified in Table III is not an NIB signal, and therefore the windowed ACT will not be able to achieve a sufficient approximation to the structures shown in Fig. 18(D).

[Figure 19: time-frequency plot with time t on the x-axis and frequency ω on the y-axis; six chirplet components labeled ①–⑥.]

Fig. 19. Time-frequency plot of the signal with sinusoidal frequency modulation whose spectrogram is shown in Fig. 12(A). The plot is obtained with the windowed ACT method. The signal has been divided into six segments and each segment is approximated by one chirplet. The displayed WVDs of the chirplets are denoted by the components ①-⑥.

Another important question concerns the optimal selection of the window length. The model in (2.38) is, in fact, identical to the one-chirplet model stated in (2.27)-(2.30), so the performance analysis of that model (described in Section 2.2.3.1, with the simulation in Section 2.2.3.2) can be applied directly here. On the one hand, a lower bound on the length is set by the SNR of the signal. We see from Fig. 17 that if the SNR is fixed at 0 dB, the signal size should not be less than 100 points. Otherwise, the variance of the estimates will deviate significantly from the CRLBs and the estimators will no longer be optimal. On the other hand, an upper limit is imposed by the desired “resolution” of analysis. That is, the segment must be short enough for the chirplet features to adequately “sample” the time-evolution of the signal spectrum. Thus, there is a tradeoff. Short segments give higher time resolution, but lead to large estimator variance. Longer segments give smaller estimator variance, but introduce errors due to model mismatch. An automatic algorithm for adaptively determining the optimal length would be an interesting topic for future research. The specific procedure of window length selection for VEP analysis is detailed in Chapter 4.

By partitioning a signal into small segments, we are able to represent the temporal evolution of signal characteristics by a small number of features. This approach avoids the high-order set of “global” features obtained by the non-windowed ACT. The computational time for a classifier to make a decision is presumably reduced, as the number of features has been cut down. As we have already pointed out, the computational time of the windowed ACT can be greatly curtailed since the order of complexity is reduced to $O(L^2 \log_2 L)$, where L is usually much smaller than N. This makes real-time processing using the chirplet transform potentially possible.

Finally, it is worth noting that other efforts have also been made to lower the computational cost of the chirplet transform. One exciting improvement exploits modern computer hardware. In collaboration with the Laboratory of Wearable Computer of the University of Toronto [87], it has been successfully shown that the speed of computation can be significantly increased by using a cutting-edge graphics processing unit (GPU). For instance, it has been demonstrated that for a 64-point chirplet transform, the GeForce 6900 FX™ provided at least a 20-fold speedup over the Pentium IV CPU [87].


2.4 Summary

VEP responses to rapidly repetitive visual stimuli consist of early transient and later steady-state components. These time-varying characteristics require a time-frequency description. We pointed out that the continual pursuit of optimal representations of complex time-frequency structures led to the emergence of the chirplet transform. An arbitrary signal can then be completely represented by a series of chirplet functions. For the non-windowed ACT method, the chirplets are estimated with the MPLEM algorithm, which consists of an initial coarse estimation with the MP algorithm and a refinement with the LEM algorithm. The results are visualized with the ACS.

We have proposed a VEP signal model that takes into account the low-SNR condition in VEP recordings. The lower bounds of the variance (the CRLBs) of the chirplet estimators were given to evaluate the estimators’ performance under various SNR conditions. We carried out simulations to validate the algorithm and concluded that the minimum length of a signal should be chosen according to its SNR level.

The windowed ACT has been proposed as a crucial step toward real-time applications. We showed that the computational cost could be significantly reduced by the windowing technique. We also illustrated that high-dimensional global features could be avoided by using time-dependent, low-dimensional local features. All of the above are necessary for fast computing when a minimum time of analysis and classification is imperative.


Chapter 3 Application of the Non-Windowed ACT¹²

This chapter is devoted to the results and analysis of the application of the non-windowed ACT method to VEPs. We will begin with the description of the experimental setup and data acquisition, followed by the results and discussion. Special attention is given to the validation of the signal model, compactness comparison, visualization effect of different time-frequency representations, and the criterion for separating tVEPs and ssVEPs. We summarize the basic conclusions of this chapter in the last section.

3.1 Experimental Method

We describe the method of VEP data acquisition in this section. However, the reader should note that the same data set was used in the work described in the next chapter, where the application of the windowed ACT technique is presented.

12 Partial materials in this chapter are contained in the article: “The adaptive chirplet transform and visual evoked potentials” by J. Cui and W. Wong, IEEE Transactions on Biomedical Engineering, Vol. 53, No. 7, pp. 1378-1384, July 2006.


3.1.1 Subjects

Five healthy subjects (four males and one female, aged 22 to 35 years) participated in this study. They all had normal or corrected-to-normal vision with no known impairment of their visual systems. All subjects were trained before the start of the session to ensure that they understood the task. They also provided informed consent. The Health Sciences Research Ethics Board of the University of Toronto approved the study (Protocol reference № 12611, Appendix C).

3.1.2 Visual stimulus

The patterned visual stimulus was a matrix of moving bars, denoted as MMB (Fig. 20), which essentially follows the pattern adopted by Pei et al. [88]. With a static cross at the center, the display consisted of a series of 16 horizontal-vertical bar pairs. Each bar was 1.2° high and 0.2° wide. The pairs were distributed evenly across a display subtending a 10.5° visual angle, with a contrast of 40%. The display had an approximate mean luminance of 50 cd/m².

Fig. 20. The matrix of moving bars (MMB).

A single trial consisted of 5 s of data. Following a one-second pre-stimulation period, the vertical bars were made to oscillate in the horizontal direction with a temporal frequency of 3.0±0.1 Hz for 4 s, while the 16 horizontal bars were static and served as the positional reference for the vertical bars. The onset of the stimulus was at the end of the first second. The relative amplitude of the motion was 40% of the horizontal bar length. The subjects were instructed to pay attention to the moving bars on the screen.

Each subject sat for a total of 50 trials. The subject initiated each trial by pressing a button when ready. The MMB pattern (Fig. 20) was present on the screen before the subject initiated a trial. The EEG recording began at the moment the subject pressed the button. The vertical bars started moving one second later. The recording stopped 5 s after the subject pressed the button, but the vertical bars continued moving for one extra second and then stopped (6 s after the button was pressed). The MMB was then shown on the screen. There was no recording of the EEG until the subject pressed the button again.

This type of stimulation is different from conventional patterned stimuli such as a checkerboard or grating. In the work of Pei et al. [88], the MMB stimulation was employed to show a correlation between the temporal frequency of the moving bars and the frequency components of the evoked potentials, which has been confirmed by our own work [89]. The work of Pei et al., however, focused on the analysis of ssVEPs without taking into account the transient portion of a VEP response. In this work, we are interested in investigating the time-dependent behavior of the VEP from its initial transient portion to the steady-state portion.

One potential problem associated with this type of stimulus is the artifacts that could be induced by involuntary pendular eye movements. To suppress this effect, subjects were instructed to maintain fixation on the central cross. They were also asked to refrain from blinking during each trial and to keep their facial muscles relaxed. Existing evidence shows that the electrical artifacts associated with eye movements are very small for occipital electrodes [5] (pp. 384-389), e.g., an electrode at Oz of the International 10-20 System.


3.1.3 Apparatus and VEP recording

All stimuli were produced on an LCD monitor (ViewSonic ViewPanel VP150m) using a high-resolution graphics board (Nvidia GeForce4, 1024×768 pixels, 75 Hz). The subjects viewed the monitor binocularly from a distance of 50 cm in a darkened room. VEPs were recorded via gold-cup electrodes applied to the scalp with TEN20™ electrolyte paste following skin preparation using NU-PREP™ abrasive gel. The active electrode was placed on the scalp at the Oz position, while the electrode on the right ear lobe served as the reference. The third electrode, placed on the left ear lobe, acted as ground. The contact impedance of the three electrodes was below 10 kΩ for all subjects. The EEG signal was amplified (by a Grass CP511 AC amplifier with a gain of 1000), bandpass filtered (0.01 Hz – 300 Hz), passed through an A/D converter (NI PCI-4451, 16-bit, sampled at 1 kHz) and then streamed to the hard disk. At the start of the post-recording analysis (carried out off-line), the data were further lowpass filtered at 40 Hz (-3 dB cutoff) with a digital filter and then re-sampled at 240 Hz. An averaged signal was obtained from the 50 trials of each subject. In total, we obtained five averaged signals (denoted as D1 – D5) from the five subjects.
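The sample-rate bookkeeping and the trial averaging described above can be sketched as follows. This is only an illustration with hypothetical function names: it computes the integer resampling factors from 1 kHz to 240 Hz and averages equal-length trials, and it deliberately leaves out the 40 Hz lowpass filter and the resampler itself (a polyphase resampler such as `scipy.signal.resample_poly` could apply the factors, but is not used here).

```python
from fractions import Fraction


def resample_ratio(fs_in, fs_out):
    """Integer factors (up, down) such that fs_out = fs_in * up / down."""
    ratio = Fraction(fs_out, fs_in)
    return ratio.numerator, ratio.denominator


def ensemble_average(trials):
    """Sample-by-sample mean over a list of equal-length trials."""
    n_trials = len(trials)
    return [sum(samples) / n_trials for samples in zip(*trials)]


up, down = resample_ratio(1000, 240)   # 1 kHz recording, 240 Hz analysis rate
samples_per_trial = 5 * 240            # 5 s trials after re-sampling
print(up, down, samples_per_trial)     # 6 25 1200

# Two toy "trials" of two samples each, to show the averaging.
print(ensemble_average([[1.0, 2.0], [3.0, 4.0]]))  # [2.0, 3.0]
```

Averaging across trials is what produces the single signal per subject (D1 – D5); the sketch assumes every trial has already been filtered, re-sampled and aligned to the button press.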

3.2 VEP Data Processing Results

3.2.1 Number of chirplets in the dictionary

According to the algorithm illustrated in Fig. 15, the first step in the process of decomposition is to construct the discrete chirplet dictionary. The construction method has already been given in Table I (p. 37), from which we know that the number of chirplets in the dictionary will be decided by the signal size N, the radix of scale a, and the first level of chirping operation (rotation) i0.


TABLE IV
TEN CHIRPLETS ESTIMATED FROM SIGNAL D1 †

Chirplets  $\hat{a}$  $\hat{\phi}$ (rad)  $\hat{t}_c$ (s)  $\hat{f}_c$ (Hz)  $\hat{c}$ (Hz/s)  $\hat{\Delta}_t$ (s)
c1         198.97     -2.18               3.56             5.59              -0.01             1.02
c2         174.84     2.50                1.82             6.00              -1.35             0.26
c3         131.53     1.40                1.47             7.75              -50.07            0.05
c4         94.40      1.10                2.02             15.16             17.79             0.41
c5         88.66      0.32                1.25             18.26             1.10              0.69
c6         74.14      -2.23               4.80             8.38              -8.30             0.20
c7         85.68      1.25                3.89             11.30             0.12              0.76
c8         75.19      0.18                1.70             26.33             -0.42             1.01
c9         66.76      -0.78               0.24             10.88             3.04              0.16
c10        67.07      -2.77               1.02             34.37             2.05              0.78

† The ten chirplets are labeled as c1 – c10. The estimated parameters are in reference to equations (2.14) and (2.19).

TABLE V
THE COHERENT COEFFICIENTS $cc_n$ OF THE DECOMPOSED CHIRPLETS

Chirplets  D1    D2    D3    D4    D5    Mean  STD
c1         0.17  0.40  0.22  0.41  0.35  0.31  0.11
c2         0.17  0.27  0.19  0.34  0.37  0.27  0.09
c3         0.11  0.11  0.15  0.22  0.15  0.15  0.05
c4         0.07  0.08  0.12  0.11  0.07  0.09  0.03
c5         0.06  0.06  0.11  0.09  0.10  0.08  0.02
c6         0.05  0.05  0.08  0.10  0.08  0.07  0.02
c7         0.07  0.07  0.09  0.07  0.06  0.07  0.01
c8         0.05  0.04  0.06  0.06  0.06  0.06  0.01
c9         0.05  0.05  0.05  0.05  0.09  0.06  0.02
c10        0.05  0.02  0.04  0.04  0.06  0.04  0.01

Given the 240 Hz sampling rate and 5 s duration of the signal, the signal size is N = 1200. We chose the typical values of a = 2 and $i_0 = 1$, and found that the number of levels of decomposition is D = 5. Because the first level of applying the chirping operation was Level One, there was only one chirplet at Level Zero (see p. 28). The numbers of chirplets at the other four levels were given by $m_k$, for k = 0, 1, 2, 3, as 3, 15, 63 and 255, respectively. Therefore, the number of chirplets stored in memory was $i_0 + \sum_{k=0}^{3} m_k = 337$.
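The count can be checked with a few lines of arithmetic. The sketch below simply adds the numbers quoted above; the closed form $4^{k+1} - 1$ in the comment is only a pattern that happens to reproduce the four listed values and is not taken from the construction rule in Table I.

```python
i0 = 1                  # first level at which the chirping operation is applied
m = [3, 15, 63, 255]    # chirplets at the four chirped levels, k = 0..3

# Observed pattern only (an assumption, not the rule from Table I):
# each listed count equals 4**(k + 1) - 1.
assert m == [4 ** (k + 1) - 1 for k in range(4)]

total = i0 + sum(m)
print(total)  # 337
```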

3.2.2 Chirplet estimation

How many chirplets are needed to represent a VEP signal adequately? In order to answer this question properly, ten chirplets were first estimated from each signal with the algorithm described in Fig. 15. This number of chirplets was believed to be sufficient as the residual signal was close to GWN (see the discussion in Section 3.3).

Fig. 21. Mean and STD of the coherent coefficients.

The ten chirplets estimated from D1 are summarized in Table IV as an example. The coherent coefficients $cc_n$ (defined by (2.36)) of all the estimated chirplets are summarized in Table V. In general, the chirplets extracted first have higher amplitudes and higher cc values. In particular, we found that the amplitudes of the first three chirplets are significantly higher than those of the remaining chirplets. The first chirplet, c1, represents the steady-state component of the VEP signal, as it has a long time-spread, $\hat{\Delta}_t$, and a near-zero chirp rate. The other two chirplets, c2 and c3, have negative chirp rates, indicating that their instantaneous frequency decreases with time.

[Figure 22: panels (A) and (B); x-axis: Time (s), 0.0–5.0; y-axis: Frequency (Hz), 0–60. Chirplets c1–c10 are labeled in (A) and c1–c3 in (B).]

Fig. 22. Time-frequency structures of signal D1. (A) The adaptive chirplet spectrogram (ACS) of the 10 chirplets with the reconstructed signal (below) and its spectrum (left). (B) The ACS of chirplets c1 – c3 as a VEP representation.


From Table IV, we note that the frequency center of chirplet c1 is close to the second harmonic (6 Hz) of the stimulus frequency (3 Hz). This indicates that the VEPs were a response to the changes in the direction of bar movement rather than to a specific direction.

As indicated in Section 2.2.4, an important issue in the decomposition procedure is the choice of the stopping criterion. To facilitate this choice, the mean and STD of the cc values in Table V are shown graphically in Fig. 21. It can be seen that the values decrease with each successive step¹³. Typically, the cc values of the first three chirplets exceed 0.10, indicating that, with a cutoff at 0.10, the first three or four chirplets are usually sufficient to represent VEP signals (cf. Fig. 22 and Fig. 23). Recall that a cc value measures the correlation between the estimated signals and a given dictionary. Therefore, we may interpret the estimated chirplets whose cc values are 0.10 or higher as containing more “meaningful” information, with respect to the dictionary, than those with lower cc values. This cc value provides a reasonable stopping criterion for our data.
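The cutoff can be applied mechanically to the values in Table V, as sketched below. The rule coded here, namely stopping at the first chirplet whose cc falls below 0.10, is an assumption made for illustration, and the function name is hypothetical.

```python
CC_THRESHOLD = 0.10

# Coherent coefficients of c1..c10, copied from Table V.
cc_table = {
    "D1": [0.17, 0.17, 0.11, 0.07, 0.06, 0.05, 0.07, 0.05, 0.05, 0.05],
    "D2": [0.40, 0.27, 0.11, 0.08, 0.06, 0.05, 0.07, 0.04, 0.05, 0.02],
    "D3": [0.22, 0.19, 0.15, 0.12, 0.11, 0.08, 0.09, 0.06, 0.05, 0.04],
    "D4": [0.41, 0.34, 0.22, 0.11, 0.09, 0.10, 0.07, 0.06, 0.05, 0.04],
    "D5": [0.35, 0.37, 0.15, 0.07, 0.10, 0.08, 0.06, 0.06, 0.09, 0.06],
    "Mean": [0.31, 0.27, 0.15, 0.09, 0.08, 0.07, 0.07, 0.06, 0.06, 0.04],
}


def chirplets_to_keep(cc_values, threshold=CC_THRESHOLD):
    """Count the leading chirplets whose cc is at or above the threshold,
    stopping at the first one that falls below it (assumed rule)."""
    count = 0
    for cc in cc_values:
        if cc < threshold:
            break
        count += 1
    return count


for name, values in cc_table.items():
    print(name, chirplets_to_keep(values))
```

Applied to the mean column, this rule keeps three chirplets; individual signals can keep more (for example, D3 keeps five under this rule).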

3.2.3 Visualization

With the help of the ACS defined in (2.22), we can now visualize the time-frequency structures estimated from the signals D1 – D5. Fig. 22 shows the results in Table IV using the ACS. Directly below the adaptive spectrogram in Panel (A) is the signal reconstructed from the ten chirplets. The spectrum of the reconstructed signal is shown on the left. The vertical dotted lines at 1.0 s represent the onset of the visual stimuli. Chirplets c1 – c3 are shown separately in Panel (B). These three chirplets, whose cc values are above 0.10, give a typical representation of VEPs. Again, the signal reconstructed from these three chirplets is shown below and its spectrum is presented on the left. It can be seen that the reconstructed signal provides a “less noisy” waveform than the one in Panel (A). In addition, the signal may be partitioned into its transient and steady-state components by reconstructing the steady-state component from chirplet c1 and the transient component from chirplets c2 and c3.

13 Note that, since only one chirplet is estimated in each iterative step, the indices of the steps are identical to the indices of the chirplets.

[Figure 23: four panels for D2, D3, D4 and D5; x-axis: Time (s), 0.0–5.0; y-axis: Frequency (Hz), 0–60.]

Fig. 23. Time-frequency plots of the signals D2 – D5. The time-frequency structures are represented by chirplets whose cc values are all above 0.10. The reconstructed signals from the chirplets are shown immediately below the spectrogram and the spectra of the reconstructed signals are shown on the left.

To verify the repeatability of this observation, the results for the other subjects are reported in Fig. 23. Only the estimated chirplets with cc values above 0.10 are shown with the ACS. The results for D2 – D5 generally mirror the results shown for signal D1. That is, a couple of chirplets with wider bandwidths and shorter time-spreads represent the early portion of a VEP response (the transient components), while the chirplets with a large time-spread and near-zero chirp rate represent the later portion of the response (the steady-state component). We discuss this further in Section 3.3.4.

3.3 Discussion

An important question is whether a true VEP signal can be sufficiently estimated via the ACT procedure. We attempt to answer this question in the discussion of model validation by testing the whiteness of the residual signals. We show the superiority of the ACT over the method using Gabor logons by comparing the compactness of decomposition between these two methods. We will also discuss the visualization effect by using the ACS. One prominent advantage of chirplet representation is that we may now have the capability to distinguish the transient VEP portion from the steady-state portion.

[Figure 24: (A) Residual signals at n = 1 (R1), n = 3 (R3) and n = 10 (R10), plotted against Time (s). (B) PSD (dB/Hz) of R1, R3 and R10, plotted against Frequency (Hz), 0–40 Hz.]

Fig. 24. The residual signals (D1 decomposition) and their PSDs. (A) Waveforms of three residual signals that are obtained after the 1st iteration (R1), the 3rd iteration (R3) and the 10th iteration (R10), respectively. (B) PSDs of the residual signals (estimated by the Yule-Walker AR method with the order p = 10 [90]).


3.3.1 Model validation

Here, we discuss the validation of the signal model. The model has been described in Section 1.4, and its mathematical expression is given in (2.26). It basically states that the VEP signal, represented by a sum of chirplets, is embedded in additive Gaussian white noise (GWN). Therefore, the model is valid if the residual signal can be shown to be white after a certain number of iterations.

The above model makes the assumption that the background noise is Gaussian, white, and additive. There is some evidence that the assumption is not entirely valid [91]. Nevertheless, the model and simplifying assumptions provide some useful insights into the convergence of averaged EP data. It should be noted that within the small frequency range concerned in our study (0.01 Hz – 40 Hz) the spectrum of the VEP residual after ten iterative steps (R10) is approximately flat (Fig. 24(B)).

The validity of the model may be verified by checking whether or not the power spectral density (PSD) of a residual sequence is constant; in other words, we check for whiteness. If the true VEP signal is sufficiently extracted from the measured data, we expect the “color” of the residual signal to be white according to our model. As an example, Fig. 24 shows the residual signals and the corresponding PSDs. The residues were obtained at different steps of the iteration when the signal D1 was decomposed. It can be seen that the residual sequences become more like white noise as the iteration proceeds. This observation is consistent with the changes in the PSDs shown in Fig. 24(B), since the spectral curves gradually approach a constant level. The figure illustrates qualitatively the trend toward whiteness of the residues as an increasing number of chirplets is extracted from the original signal.

To verify this observation quantitatively, we have to resort to a statistical measure of whiteness. We adopted the method recently proposed by Drouiche [92], which is simple and fast to use as it involves the periodogram to estimate the PSD. The Drouiche statistic outperforms other traditional statistics (e.g., the Portmanteau and Fisher statistics [93]) for testing whiteness. Only the results in [92] that are relevant to our problem are presented here. Specifically, to test the hypotheses:

$$
\begin{cases}
H_0:\; P_x(\omega) = \text{const.} \\
H_1:\; P_x(\omega) \neq \text{const.}
\end{cases}
\tag{3.1}
$$

where $P_x(\omega)$ is the PSD of a stochastic sequence and const. denotes a positive constant, we construct the estimated quantity

$$
\hat{W}_N = \log\left(\frac{1}{2\pi}\int_{-\pi}^{\pi} I_N(\omega)\,d\omega\right) - \frac{1}{2\pi}\int_{-\pi}^{\pi} \log I_N(\omega)\,d\omega - \gamma.
\tag{3.2}
$$

Here, $\gamma$ denotes the Euler constant (0.57721) and $P_x(\omega)$ is estimated by the periodogram

$$
I_N(\omega) = \frac{1}{2\pi N}\left|\sum_{k=0}^{N-1} x_k e^{-jk\omega}\right|^2,
\tag{3.3}
$$

where $x_k$, $k = 0, 1, \ldots, N-1$, is the residual, and $N$ is the sample size. The estimate $\hat{W}_N$ can be considered as a “distance” to whiteness. It can be shown that under $H_0$, $\hat{W}_N$ follows a normal distribution [92]:

$$
U = \sqrt{\frac{N}{\pi^2/6 - 1}}\;\hat{W}_N \sim \mathcal{N}(0, 1).
\tag{3.4}
$$

Therefore, we test against the null hypothesis ($H_0$) that the residue is white. If $U$ is larger than the critical value $t_\alpha$ calculated according to a predefined significance level $\alpha$, then the residue is not white ($H_1$). Typically, $\alpha$ is set at 5%. The value $t_\alpha$ in turn is found by solving the probability equation

$$
P_{H_0}\left(U \geq t_\alpha\right) = \alpha.
\tag{3.5}
$$
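The test in (3.1) – (3.5) can be sketched numerically as follows. This is a minimal illustration rather than the implementation used in this work: the integrals in (3.2) are approximated by averages over the Fourier frequencies, the zero-frequency and Nyquist ordinates are skipped (an assumption made here to keep the logarithm finite), and the function names are hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

EULER_GAMMA = 0.57721  # Euler constant, as quoted in the text


def whiteness_measure(x):
    """Return U of (3.4) for a real sequence x (discretized sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Periodogram (3.3) evaluated at the Fourier frequencies 2*pi*m/n.
    periodogram = np.abs(np.fft.rfft(x)) ** 2 / (2.0 * math.pi * n)
    # Assumption: drop the zero-frequency and Nyquist ordinates and use the
    # remaining positive frequencies (the periodogram of real data is even).
    ordinates = periodogram[1:(n - 1) // 2 + 1]
    # Discretized (3.2): the integrals over [-pi, pi] become averages.
    w_hat = math.log(ordinates.mean()) - np.log(ordinates).mean() - EULER_GAMMA
    # Normalization of (3.4).
    return math.sqrt(n / (math.pi ** 2 / 6.0 - 1.0)) * w_hat


def critical_value(alpha=0.05):
    """Solve (3.5) for t_alpha under the standard normal law of (3.4)."""
    return NormalDist().inv_cdf(1.0 - alpha)


def is_white(x, alpha=0.05):
    """Return True if H0 (white residual) is not rejected at level alpha."""
    return whiteness_measure(x) < critical_value(alpha)
```

For a residual sequence `r`, `is_white(r)` would then report whether the whiteness hypothesis is retained at the 5% level; `critical_value(0.05)` evaluates to approximately 1.645, in line with the value 1.65 used in the text.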

Fig. 25. Whiteness measures for all five signals (D1 – D5) over iterations 1 – 10. (A) The values of the whiteness measure of the residual signals at each iteration. The y-axis of the measure U is shown on the left and its corresponding significance level α (%) is shown on the right; the α = 5% level is marked. (B) The mean and STD of the measures.


The values of the whiteness measure (see footnote 14) are summarized in Fig. 25. In panel (A), the results for all five signals are drawn. In general, they decrease as the number of iterations increases. All the measures are lower than the critical value t0.05 = 1.65, which corresponds to the significance level α = 0.05, after the seventh step. The result indicates that the residue is close to white noise after six or seven chirplets are extracted from the original signal. In panel (B), the mean and STD of the measures are presented. It can be seen that the curve of mean values crosses the 5% level line between the fifth and sixth steps, and is significantly below it after the seventh step. From these results, we conclude that given the significance level α = 0.05, the residual signal after seven iterations is white (H0 accepted). With the assumption of ergodicity, it is easy to verify that the residues are Gaussian as well as white. Hence, we conclude that the signal model (2.26) is valid for VEP analysis.

Fig. 26. The signal D1 and the reconstructed signals. From top to bottom: signal D1, the signal reconstructed from chirplets c1 – c3, and the signal reconstructed from chirplets c1 – c7 (amplitude versus time in s).

14 Note that, due to the low-pass filtering at 40 Hz, we need to downsample the residual from 240 Hz to 80 Hz before applying the whiteness statistic [84].

Fig. 27. Comparison of the decompositions of signal D1. (A) The time-frequency structures of the 10 Gabor logons (labeled g1 – g10). (B) The time-frequency structures of the 10 chirplets (labeled c1 – c10; same as in Fig. 22(B)). (C) The original signal and the signals reconstructed with 10 chirplets and with 10 Gabor logons.

It is interesting to compare the cc values shown in Fig. 21 with the measures of whiteness shown in Fig. 25. Fig. 21 shows that the first three chirplets have generally higher cc values (> 0.10) than the other estimated ones. By comparison with Fig. 25, it follows that the next four chirplets are likely to be VEP signals as well, but the cc values of these components are generally lower. That is, the signals represented by these four later chirplets are less coherent with respect to the given dictionary. Following the above discussion, we see that the VEP signal can be completely represented by seven chirplets. However, it is sufficient to represent the signal by the first three chirplets only, because they characterize the major time-frequency variations of the VEP responses. The estimated chirplets with cc above 0.10 may be thought of as the major components of the VEP with respect to the given dictionary. For comparison, the signal D1 and the corresponding reconstructed signals from the first three chirplets and seven chirplets are shown in Fig. 26.

3.3.2 Compactness comparison

We have developed the non-windowed ACT method as an extension to the adaptive transform with Gabor logons. Earlier in the thesis (Section 1.4 and Section 2.2.4), we pointed out that a main feature in our approach is that a VEP signal can be more compactly represented by chirplets than by Gabor logons. In this section we will first provide an example to illustrate qualitatively the compactness of the representations, and then evaluate it quantitatively by using ER measures.

Fig. 27 shows the time-frequency structures of signal D1. The 10 atoms (see footnote 15) estimated by the MP algorithm with Gabor logons and with chirplets are shown in panels (A) and (B), respectively. It appears by comparison that the signal structure depicted by the atoms g6, g4, the early portion of g5 and g2 is captured by the chirplet c3. The structure depicted by g2, the early portion of g1 and g3 is represented by the chirplet c2. Atom g1 and chirplet c1 describe essentially the same structure of the signal, i.e., the steady-state component. Therefore, the three chirplets c1 – c3 seem to provide a more compact representation of the same time-frequency structures than the six Gabor logons do. In addition, the sparse chirplet representation provides a clearer picture of the trend of frequency changes than that provided by the Gabor logons.

15 We use here loosely the term ‘atom’ to refer to the basis function involved in decomposition, i.e., either a chirplet or a Gabor logon.

TABLE VI
ER VALUES CALCULATED FROM THE DECOMPOSITION USING CHIRPLETS (ERC)

Iterations   D1     D2     D3     D4     D5     Mean   STD
1            0.78   0.55   0.78   0.50   0.84   0.69   0.15
2            0.63   0.41   0.63   0.37   0.50   0.51   0.12
3            0.55   0.36   0.58   0.28   0.46   0.45   0.12
4            0.53   0.34   0.50   0.25   0.44   0.41   0.11
5            0.49   0.32   0.44   0.23   0.39   0.37   0.10
6            0.48   0.30   0.41   0.21   0.35   0.35   0.10
7            0.44   0.28   0.38   0.19   0.33   0.32   0.09
8            0.42   0.27   0.35   0.18   0.31   0.30   0.09
9            0.40   0.25   0.33   0.17   0.28   0.29   0.09
10           0.38   0.24   0.32   0.16   0.26   0.27   0.08

TABLE VII
ER VALUES CALCULATED FROM THE DECOMPOSITION USING GABOR LOGONS (ERG)

Iterations   D1     D2     D3     D4     D5     Mean   STD
1            0.76   0.57   0.79   0.50   0.77   0.68   0.13
2            0.67   0.49   0.71   0.43   0.76   0.61   0.15
3            0.63   0.47   0.62   0.40   0.73   0.57   0.13
4            0.60   0.46   0.58   0.40   0.71   0.54   0.13
5            0.58   0.45   0.55   0.38   0.70   0.52   0.13
6            0.56   0.45   0.53   0.35   0.68   0.51   0.13
7            0.55   0.43   0.50   0.34   0.66   0.50   0.13
8            0.54   0.43   0.49   0.33   0.65   0.49   0.12
9            0.52   0.43   0.47   0.32   0.63   0.47   0.11
10           0.51   0.42   0.46   0.28   0.62   0.46   0.12

The waveforms of the original signal and the reconstructed signals are shown in panel (C). It is interesting to analyze the similarity among these signals by calculating their correlation coefficients. We found that the correlation between the original signal and the reconstructed signal from the 10 Gabor logons is 0.7, whereas that between the original signal and the reconstructed signal from the 10 chirplets is 0.8. Although it is only marginally higher, the value indicates that the chirplet representation is more efficient than the Gabor logon representation.

A quantitative way of evaluating the compactness is to calculate the ER values defined in (2.37). Recall that an ER value shows the percentage of the signal energy that has not been extracted (or estimated). Since only one atom is estimated at each iterative step, for a given iteration number, the lower the ER value is, the more compact the corresponding representation. The values calculated from the chirplet decomposition (ERC) and the Gabor logon decomposition (ERG) are shown in Table VI and Table VII, respectively.

Fig. 28. Difference of the energy ratios between the Gabor logon representation and the chirplet representation. Mean and STD of the difference are shown in the figure.

The difference in the energy extraction between the two techniques was evaluated by calculating ERG − ERC and its associated statistics. Fig. 28 clearly shows that, except for the very first step, the chirplet decomposition extracts more energy than the Gabor logon decomposition at each iteration. To understand why the first step shows no statistical difference, we recall that the first atom extracted is the one that represents the steady-state portion of VEP signals. Thus it is no surprise that both methods yield essentially the same result.
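The comparison of ERG and ERC can be retraced from the tabulated values. The following sketch copies the per-signal entries of Table VI and Table VII and computes the mean and STD of ERG − ERC at each iteration. The use of the sample STD (`ddof=1`) is an assumption; it is chosen because it reproduces the STD column of Table VI at the first iteration (0.15). Whether Fig. 28 was produced with exactly this estimator is not stated in the text.

```python
import numpy as np

# Rows: iterations 1-10; columns: signals D1-D5 (copied from Table VI).
ER_C = np.array([
    [0.78, 0.55, 0.78, 0.50, 0.84],
    [0.63, 0.41, 0.63, 0.37, 0.50],
    [0.55, 0.36, 0.58, 0.28, 0.46],
    [0.53, 0.34, 0.50, 0.25, 0.44],
    [0.49, 0.32, 0.44, 0.23, 0.39],
    [0.48, 0.30, 0.41, 0.21, 0.35],
    [0.44, 0.28, 0.38, 0.19, 0.33],
    [0.42, 0.27, 0.35, 0.18, 0.31],
    [0.40, 0.25, 0.33, 0.17, 0.28],
    [0.38, 0.24, 0.32, 0.16, 0.26],
])

# Same layout (copied from Table VII).
ER_G = np.array([
    [0.76, 0.57, 0.79, 0.50, 0.77],
    [0.67, 0.49, 0.71, 0.43, 0.76],
    [0.63, 0.47, 0.62, 0.40, 0.73],
    [0.60, 0.46, 0.58, 0.40, 0.71],
    [0.58, 0.45, 0.55, 0.38, 0.70],
    [0.56, 0.45, 0.53, 0.35, 0.68],
    [0.55, 0.43, 0.50, 0.34, 0.66],
    [0.54, 0.43, 0.49, 0.33, 0.65],
    [0.52, 0.43, 0.47, 0.32, 0.63],
    [0.51, 0.42, 0.46, 0.28, 0.62],
])

diff = ER_G - ER_C                     # positive: chirplets extracted more energy
mean_diff = diff.mean(axis=1)          # mean over the five signals
std_diff = diff.std(axis=1, ddof=1)    # sample STD (assumption, see lead-in)

for iteration, (m, s) in enumerate(zip(mean_diff, std_diff), start=1):
    print(f"iteration {iteration:2d}: ERG - ERC = {m:+.3f} +/- {s:.3f}")
```

With these values the difference is close to zero at the first iteration and positive for every signal from the second iteration onwards, which agrees with the description of Fig. 28.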

Fig. 29. Visualization of different time-frequency representations of signal D1 (frequency in Hz versus time in s). (A) Spectrogram (with 11-point Gaussian window); (B) Wigner-Ville distribution; the two arrows show the approximate duration of the VEP response. (C) Selected Gabor logons; (D) The ACS of the selected chirplets, repeated here from panel (B) of Fig. 22 for convenience. The dotted lines at the first second indicate the onset of the visual stimulus.

3.3.3 Visualization effect

For almost all biomedical applications, an important result is the visualization of a signal’s structure in the time-frequency plane. Usually, a human observer (e.g., a clinician) will inspect the visualized results to acquire information that may be used later in classification and diagnosis. In our opinion, whether a specific way of visualization will help an observer gain more information is an empirical question, which requires an empirical check in every new situation. A thorough discussion of the visualization effect will certainly involve psychological or psychophysical analysis and is clearly beyond the scope of this thesis. Nevertheless, we present here a direct comparison of different visualization methods of time-frequency analysis.

In Fig. 29 we show four methods of displaying the time-frequency structures of the signal D1, i.e., (A) the spectrogram, (B) the WVD, (C) the Gabor logons and (D) the ACS of chirplets. As is well known, spectrograms based upon the STFT invariably involve smoothing of some sort, and yield an overall lower resolution picture [66]. Although the spectrogram can show some of the salient time-frequency structures of the VEP response, most of the detail is lost due to smearing. In Fig. 29(A), for instance, a major structure with relatively constant frequency below 10 Hz is observable. However, it is difficult to determine more precisely the value of the constant frequency and the details of the frequency variation at the beginning of the response. The WVD of the signal in (B), on the other hand, provides results with much higher time-frequency resolution. Unfortunately, the WVD suffers from interference (cross-terms) arising from interactions between the signal components themselves, and between signal components and noise. The resulting visualization is fuzzier and hence confusing for an analyst.

In contrast, the visualizations resulting from the adaptive techniques using (C) Gabor logons and (D) chirplets appear more appropriate for visual inspection. This is because each individual atom is characterized by its WVD while interference between atoms is avoided (as detailed in Section 2.2). Consequently, the time-frequency structures are shown with higher resolution and without cross-terms.
The visualization of the Gabor logons and the ACS of the chirplets have already been shown in (A) and (B) in Fig. 27. In Fig. 29, only the selected chirplets (i.e., c1, c2 and c3) are shown in (D), and the corresponding Gabor logons (i.e., g1, g2, g3, g4 and g6) are shown in (C). The ACS in (D) is more compact than the representation in (C). Also, the trend of frequency variation is more easily distinguishable in (D) than in (C). Furthermore, the estimated parameters obtained from the decomposition analysis (e.g., Table IV) provide detailed information about the local time-frequency structures of the signal, which is not easily obtainable from the conventional spectrogram and WVD on their own. From these observations, we conclude that the ACS appears to be more suitable for VEP visualization. Next, we show that the chirplet information is useful in separating the transient VEP from the steady-state VEP. This capability, however, is currently not available from the adaptive transform with Gabor logons.

3.3.4 Separation of tVEP and ssVEP

The chirplet functions are expected to assist in gaining insight into the underlying physiological mechanism. Besides better visualization by the ACS, one prominent advantage of using chirplet representation is that we are able to find a criterion for separating the tVEPs and the ssVEPs. From the results shown in Fig. 22 and Fig. 23 (on p. 67 and p. 69), we can see a general pattern of the estimated chirplets (with cc ≥ 0.10). That is, a couple of chirplets with wider bandwidth and shorter time-spread represent the early portion of a VEP response (the transient components), while the chirplets with large time-spread and near-zero chirp rate represent the later portion of the response (the steady-state components). It is easy to separate the tVEPs and ssVEPs by visual inspection. We show the corresponding reconstructed waveforms in Fig. 30.

Fig. 30. Reconstructed signals and the separated tVEPs and ssVEPs. (A) Signal D1; (B) Signal D2; (C) Signal D3; (D) Signal D4; (E) Signal D5. Each panel shows, from top to bottom, the reconstructed signal, the tVEP and the ssVEP (amplitude versus time in s). Signals D1 – D5 are averaged signals from subjects 1 – 5.

To find an objective criterion of separation, we separated the transient and steady-state portions manually. We then summarized the statistics of the estimated parameters in Table VIII. From the table, we confirm our observation that the values of the time-spread and the time-center are significantly different for the transient and steady-state components. ssVEPs have a much wider time-spread (typically, $\hat{\Delta}_t \geq 1$ s) and a later time of appearance (typically, $\hat{t}_c \geq 3.21$ s). Moreover, both the chirp rate and its variance are seen to be close to zero at steady-state. We therefore propose the following criteria to classify the components: a chirplet describes the steady-state component if (1) its time-spread is wider than 1 s, (2) its time-center is later than 2.21 s after the stimulus onset (i.e., later than 3.21 s in the record, since the stimulus onset is at 1 s), and (3) the absolute value of its chirp rate is less than 0.03 Hz/s. The remaining chirplets (with cc ≥ 0.10) are then classified as transient components.
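The three criteria can be written as a small decision rule. The sketch below is only an illustration of the rule as stated: the `Chirplet` container, its field names and the label "discarded" for chirplets below the cc threshold are hypothetical, and the time-center is assumed to be measured from the start of the record, with the stimulus onset at 1 s.

```python
from dataclasses import dataclass

STIMULUS_ONSET_S = 1.0  # stimulus onset within the record, in seconds


@dataclass
class Chirplet:
    """Hypothetical container for the estimated chirplet parameters."""
    time_center: float   # t_c in s, measured from the start of the record
    freq_center: float   # f_c in Hz
    chirp_rate: float    # c in Hz/s
    time_spread: float   # Delta_t in s
    cc: float            # cc value of the chirplet


def classify(chirplet, cc_min=0.10):
    """Label a chirplet as 'ssVEP', 'tVEP' or 'discarded' (cc below cc_min)."""
    if chirplet.cc < cc_min:
        return "discarded"
    steady = (
        chirplet.time_spread > 1.0                                # criterion (1)
        and chirplet.time_center - STIMULUS_ONSET_S > 2.21        # criterion (2)
        and abs(chirplet.chirp_rate) < 0.03                       # criterion (3)
    )
    return "ssVEP" if steady else "tVEP"
```

For instance, a chirplet with the mean ssVEP parameters of Table VIII (time-center 3.31 s, chirp rate −0.01 Hz/s, time-spread 1.14 s) satisfies all three criteria, whereas one with the mean tVEP parameters fails each of them.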

TABLE VIII
STATISTICS OF THE PARAMETERS OF TRANSIENT AND STEADY-STATE VEPS

Parameter                              tVEP              ssVEP
Time-center $\hat{t}_c$ (s)            1.69 ± 0.14       3.31 ± 0.20
Frequency-center $\hat{f}_c$ (Hz)      6.34 ± 1.35       5.74 ± 0.16
Chirp rate $\hat{c}$ (Hz/s)            -13.45 ± 15.98    -0.01 ± 0.02
Time-spread $\hat{\Delta}_t$ (s)       0.12 ± 0.08       1.14 ± 0.14

It should be noted that the averaged frequency-center of the steady-state VEP (5.74 Hz) is not exactly the 2nd harmonic (6 Hz) of the stimulus frequency (3 Hz). This might be due to the nonlinearity of the visual system, which might induce frequency components other than the harmonics of the stimulus frequency.

3.4 Conclusions

In this chapter, we have presented and discussed the results of applying the non-windowed ACT method to VEP estimation.


In this method, detailed information on the estimated chirplet parameters can be provided in each iterative step. The chirplet representation provides researchers with new computed parameters that characterize the VEP signals in a different way as compared to, for instance, the STFT. The chirp rate is the representative parameter that provides information on the rate of change of the instantaneous frequencies. This quantity is usually difficult to estimate by competing techniques. Additionally, conventional parameters such as amplitudes and latencies can be readily retrieved from the reconstructed signals. To find a stopping criterion of decomposition, the cc values are calculated for each step. According to the statistics of our data, we propose to stop the iterative procedure when the cc value is below 0.10. In general, we conclude from our data that a 1200-point (sampling at 240 Hz) response with duration of five seconds can be well represented by as few as three chirplets: typically, two chirplets represent the tVEP component and one chirplet represents the ssVEP component. Whether a VEP signal can be sufficiently estimated by the ACT method is an important question. We answered this question by testing the whiteness of the residual signals defined in the signal model (2.26). With the help of the Drouiche measure, we concluded from the results that, generally, the VEP signal can be completely estimated by seven chirplets. Interestingly, the first three chirplets with cc values above 0.10 may be thought of as the major components, since they reflect the major characteristic time-frequency variations of a VEP response. Although the MP algorithm with Gabor logons can be used to estimate the VEP as well, we have demonstrated that the chirplet representations are more compact than the Gabor logon representations. 
One consequence of this sparse representation is that it can provide a clearer visualization of the time-frequency structures of a VEP signal, especially in its transient portion. We reached this conclusion by a direct comparison of the spectrogram, the WVD, the visualization of the Gabor logons, and the ACS of the same VEP signal. Finally, one prominent advantage of the chirplet representation is that it is capable of characterizing the entire signal from its transient portion to the steady-state portion. We have proposed criteria to separate the tVEPs and ssVEPs. Consequently, they can be readily reconstructed from the corresponding chirplets, as shown in Fig. 30. This is perhaps the first study, to our knowledge, that successfully isolates the transition of the VEP response on a time scale of around one second. Thus, our method has the potential to help investigate the initial stage of a VEP response, which is generally difficult to do by alternative methods.


Chapter 4 Application of the Windowed ACT 16

In this chapter, we present the results of applying the technique of the windowed ACT to VEP signals. The important question regarding the selection of the window length is discussed in detail. An optimal size has been proposed according to the characteristics of the VEP recordings. Subsequently, the chirplets estimated with the window method are described, the time-frequency structures are visualized by using the ACS, and the statistical information of the estimates is explained. We further justify the effectiveness of the proposed window length by comparing the results using various lengths and positions of windows.

4.1 Introduction

In applying the windowed ACT technique, a given VEP recording (i.e., signals D1 – D5) was partitioned into non-overlapping and equal-length segments. As pointed out in Section 2.3, this segment-based method is employed in order to increase the speed of computation and reduce the dimension of the feature space for the signal classifier. In fact, the window technique has been developed to overcome some technical limitations of the non-windowed ACT with the ultimate goal of real-time applications. A discussion of the technical limitations of the non-windowed ACT, as well as those of the windowed ACT, however, will be postponed until the next chapter. In the rest of this chapter, we focus on the implementation of the window technique for VEP analysis.

16 A preliminary version of this work was presented at the 26th Annual International Conference of the IEEE EMBS, San Francisco, CA, USA, Sept. 1-5, 2004 [43]. Parts of the work presented in this chapter were also published in the article: “Time-frequency analysis of visual evoked potentials using chirplet transform” by J. Cui, W. Wong and S. Mann, in Electronics Letters, vol. 41, pp. 217-218, 2005 [44].

It should be noted that in a real-time application the processes of data collection and chirplet estimation are carried out in parallel. The flowchart of a possible implementation is proposed in Fig. 31. It consists of two blocks of processes, i.e., the hardware process for data collection, and the software process for chirplet estimation and other possible processing steps. The two blocks communicate via a one-way trigger signal sent by the data acquisition board to indicate whether a segment of L-point (VEP) data is ready. If so, the data will be moved into system memory to begin the estimation process. Clearly, to keep the system working in real-time, the time for one chirplet estimation (ts) must be less than the time for collecting a segment of L samples (th) (assuming a single trial EEG measurement without signal averaging); the latter is usually proportional to L, given a fixed sampling frequency. Since the time for chirplet estimation is proportional to $L^2 \log L$, we can see that ts should quickly surpass th with increasing L. This will impose an upper bound on L. However, this limit is usually unrelated to the signal per se and is mainly decided by the properties of the hardware system. We are not going to discuss the hardware limitation further, but shift the focus to other conditions that specify the bounds on the window length L, which are imposed by the characteristics of the VEP signals under analysis.
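The hardware-related upper bound on L can be illustrated with a small calculation. The sketch below assumes that the estimation time is ts = a·L²·log L, with a machine-dependent constant a, and that the collection time is th = L/fs. The constant `cost_const_s`, the use of the natural logarithm, and the function name are all hypothetical; real values would have to be measured on the target system.

```python
import math


def max_realtime_length(cost_const_s, fs_hz, l_max=100_000):
    """Largest L with a * L**2 * log(L) < L / fs, or 0 if no L >= 2 qualifies."""
    best = 0
    for length in range(2, l_max + 1):
        t_s = cost_const_s * length ** 2 * math.log(length)  # estimation time
        t_h = length / fs_hz                                   # collection time
        if t_s < t_h:
            best = length
        else:
            break  # t_s / t_h = a * fs * L * log(L) only grows with L
    return best


if __name__ == "__main__":
    # Hypothetical cost constant of 0.1 microseconds; sampling at 240 Hz.
    print(max_realtime_length(cost_const_s=1e-7, fs_hz=240))
```

Because the ratio ts/th grows as L·log L, the search can stop at the first length that violates the real-time condition.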

Fig. 31. Schematic flowchart of a proposed real-time system. The block on the left depicts the major processes involved in the operations performed in the data acquisition hardware. It contains two data buffers, each L points long. As soon as the input buffer is full, the data will be moved to the reading buffer and a trigger signal is sent out. The trigger signal could be either hardware or software oriented, depending on the configuration. Concurrently, the next segment of L data points will be collected in the input buffer. Meanwhile, the software process, illustrated in the block on the right, is in an idle state, waiting for the trigger signal to indicate whether the data are ready. Once ready, the data will be retrieved into system memory and subsequently the estimation of one chirplet will be carried out, usually by a CPU. Both the hardware and software processes will continue until the stop condition is met. Note that they are carried out in parallel, but communicate via the trigger signal denoted by the dashed line. Note also that a single-trial EEG measurement is assumed.

TABLE IX
ESTIMATED SIGNAL-TO-NOISE RATIO (SNR)

Signals     D1     D2      D3     D4      D5     Mean   STD
SNR (dB)    0.47   -0.45   1.52   -0.37   1.60   0.55   0.98

4.2 Optimal Window Length

From the discussion in Section 2.3.2, we note that the window length L of partitioning is related to the characteristics of the VEP recordings – the lower bound is placed by the SNR condition, while the upper bound is mainly limited by the desired resolution of the analysis. As we will see later, the resolution is in turn determined by the time duration of the transient VEP. In the following, we will specify the choice of the window length for VEP analysis.

We first examine the SNR level of the signals D1 – D5 to decide the lower bound. The SNR of the recorded signals was estimated according to the results of the non-windowed ACT method. From the analysis of model validation in Section 3.3.1, we know that, in general, a VEP signal can be completely “extracted” from the original data after seven iterations and the remaining signal is close to a GWN process. Therefore, if we assume that the signal reconstructed from the seven chirplets is the true signal, the noise is then found by subtracting it from the original data. Consequently, the SNR can be calculated from the power ratio of these two signals. We estimated the SNR values of all five recorded signals D1 – D5, as listed in Table IX. The average SNR is 0.55 ± 0.98 dB, slightly above 0 dB. The estimated level of SNR helps determine the lower limit of the window length. We exploit the simulation results of the CRLB shown in Fig. 17 (Section 2.2.3.2). That is, if the SNR is around 0 dB, to keep the estimator optimal, the signal size should not be less than N = 100. Therefore, we prefer a segment size, L, around 100 points (about 417 ms at 240 Hz) or longer. Note that in the following estimation the SNR in each segment is assumed to be 0 dB as well.
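The SNR estimate described above amounts to a power ratio between the reconstructed signal and the remainder. A minimal sketch, assuming the measured data and the seven-chirplet reconstruction are available as equal-length arrays (the function and argument names are hypothetical):

```python
import numpy as np


def estimate_snr_db(measured, reconstructed):
    """SNR in dB, treating the reconstruction as the true signal."""
    measured = np.asarray(measured, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    noise = measured - reconstructed          # residual taken as the noise
    signal_power = np.mean(reconstructed ** 2)
    noise_power = np.mean(noise ** 2)
    return 10.0 * np.log10(signal_power / noise_power)
```

With this definition, a value near 0 dB, as in Table IX, means that the reconstructed signal and the residual carry roughly equal power.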

Fig. 32. Spectrogram and windowed Fourier ridges (frequency in Hz versus time in s). The upper panel shows an enlarged area (0 – 20 Hz) of the spectrogram of signal D1 (using the STFT with a Gaussian window of 0.416 s, or 100 points). The signal’s instantaneous frequencies may be estimated from the windowed Fourier ridges of the spectrogram [33], which are shown in the lower panel. The dotted line at 1 s marks the onset of the stimulus; the line at 2 s denotes an approximate time when the signal becomes steady.

Although a longer segment is usually expected to improve the quality of the estimates, it may limit the estimator’s capability to expose sufficiently the signal’s time-dependent behavior. A segment must be short enough to sample adequately the energy density curves of a signal in the time-frequency plane. As for a VEP response, since most of the variations occur in the transients, a choice of the upper bound will be mainly influenced by the time duration of the transient portion of the signal, i.e., tVEP. Next, we give an intuitive illustration of the duration of the transient portion by using windowed Fourier ridges [33].

The spectrogram of signal D1 is shown in the upper panel of Fig. 32. Its corresponding windowed Fourier ridges, displayed in the lower panel, show a rough picture of the trend of the instantaneous frequencies from the transient portion to the steady-state portion. It can be seen that the signal becomes approximately steady after about 2 s, because from then on the instantaneous frequency of the major ridge stays close to 6 Hz. Note, however, that some short-duration ridges occur between about 2.5 s and 3 s and between 4.0 s and 5 s. They represent the noise in the signal. Recall that the visual stimulus onset is at one second, so that the duration of the transient portion is around 1 s. In order to reveal sufficiently the time-dependent behavior of the signal in this portion, a window length shorter than half of the duration is usually employed. We thus expect a window length of less than 500 ms to maintain sufficient resolution.

Fig. 33. Time duration of tVEP (amplitude versus time in s). The envelope of the reconstructed tVEP (of signal D1) is shown by the dot-dash lines. Its maximum is denoted by A (34.0 in this example). The effective duration of the tVEP is illustrated by a rectangular window. Its starting time and ending time are the instants when the first and last values on the envelope are greater than A/100 (or 0.340).

In practice, a systematic way of acquiring a more accurate estimate of the duration is preferred. The effective time-spread of tVEP summarized in Table VIII might be a choice. However, the measure of time-spread is calculated from the energy distribution of a signal (defined in (2.7)), which is different from the way we define the length of a window. To obtain a comparable measurement, we propose the following method (referring to Fig. 33):

1. Reconstruct the tVEP signal from the chirplets classified according to Table VIII.

2. Find the envelope of the tVEP, which is defined as the square root of the sum of the squared values of the signal and its Hilbert transform, i.e., $\sqrt{s^2(t) + \left(H[s(t)]\right)^2}$, where $s(t)$ is the real tVEP signal and $H[s(t)]$ is its Hilbert transform [66].

3. Find the maximum of the envelope, A > 0, and set the threshold at A/100 (see footnote 17).

4. Finally, define the duration as the time interval between the instants at which the first point and the last point on the envelope are above the threshold.
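Steps 2 to 4 of the procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `analytic_signal` and `tvep_duration` are hypothetical, the Hilbert transform is obtained from a direct DFT (an FFT routine would normally be used), and the test signal is a synthetic Gaussian-enveloped cosine standing in for a reconstructed tVEP, not data from this work.

```python
import cmath
import math


def analytic_signal(x):
    """Return the analytic signal x + j*H[x] using a direct O(N^2) DFT.

    A direct DFT keeps the sketch dependency-free; it is only practical
    for short records.
    """
    n = len(x)
    # N-th roots of unity: w[i] = exp(-2j*pi*i/n).
    w = [cmath.exp(-2j * math.pi * i / n) for i in range(n)]
    spectrum = [sum(x[k] * w[(m * k) % n] for k in range(n)) for m in range(n)]
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies.
    for m in range(1, n):
        if m < n / 2:
            spectrum[m] *= 2.0
        elif m > n / 2:
            spectrum[m] = 0.0
    # The inverse DFT uses the conjugate roots of unity.
    return [
        sum(spectrum[m] * w[(m * k) % n].conjugate() for m in range(n)) / n
        for k in range(n)
    ]


def tvep_duration(signal, fs, fraction=0.01):
    """Envelope, threshold at fraction * A, then first-to-last crossing."""
    envelope = [abs(z) for z in analytic_signal(signal)]
    peak = max(envelope)               # A in the text
    threshold = fraction * peak        # A/100 by default
    above = [i for i, e in enumerate(envelope) if e > threshold]
    first, last = above[0], above[-1]
    return {
        "peak": peak,
        "start_s": first / fs,
        "end_s": last / fs,
        "duration_points": last - first + 1,
        "duration_s": (last - first) / fs,
    }


# Synthetic stand-in for a reconstructed tVEP (NOT thesis data):
# a Gaussian-enveloped 10 Hz cosine with peak 34, sampled at 240 Hz.
fs = 240.0
demo = [
    34.0 * math.exp(-0.5 * ((i / fs - 1.0) / 0.1) ** 2)
    * math.cos(2 * math.pi * 10.0 * (i / fs - 1.0))
    for i in range(480)
]
result = tvep_duration(demo, fs)
```

Because the threshold is a fixed fraction of the peak, the measured duration scales with the width of the envelope rather than with its height, which is what makes it comparable to a window length.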

TABLE X
ESTIMATED DURATION OF tVEPs

Signals                   D1     D2     D3     D4     D5     Mean   STD
tVEP duration (ms)        1471   1100   1271   1358   1113   1263   159
tVEP duration (points)    353    264    305    326    267    303    38

According to this procedure, we calculated the measures from all five signals, as listed in Table X (in both units of “ms” and “points”, sampling at 240 Hz). It can be seen that the average duration of tVEPs is around 300 points. Thus the segment size should be smaller than 150 points. Considering the lower limit imposed by the SNR condition, we choose 100 points (416.7 ms) as the window length for partitioning the VEP.
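The arithmetic behind this choice can be written out explicitly. The symbols $T_{\mathrm{tVEP}}$, $L_{\max}$, $L$ and $f_s$ are introduced here only for illustration; the factor of one half is the heuristic stated earlier in this section, not a derived bound.

$$L_{\max} \approx \tfrac{1}{2}\,T_{\mathrm{tVEP}} = \tfrac{1}{2} \times 303 \approx 150 \ \text{points}, \qquad \frac{L}{f_s} = \frac{100 \ \text{points}}{240 \ \text{Hz}} \approx 416.7 \ \text{ms} < 500 \ \text{ms}.$$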

17 It should be pointed out that the choice of this threshold is somewhat arbitrary, reflecting a subjective view of the level of amplitude above which the data can be regarded as ‘signal’. In particular, it may depend very much on the specific type of EPs recorded in different experimental paradigms.

4.3 VEP Data Processing Results

4.3.1 Number of chirplets in dictionary

The acquisition method and VEP recordings under analysis, i.e., D1 – D5, have already been introduced in Section 3.1. The number of atoms contained in the dictionary is different, however. Since the segment size is L = 100 points, we find from Table I that the number of levels of decomposition should be D = 3, given typical values a = 2 and $i_0 = 1$. Therefore, the size of the dictionary is only $i_0 + \sum_{k=0}^{2} m_k = 82$ chirplets, instead of the 337 chirplets used in the non-windowed ACT.

At each level, the discrete time-spread, Δt, and the discrete chirp rate are calculated according to Table I to complete the construction of the dictionary.

TABLE XI
TWELVE CHIRPLETS ESTIMATED FROM SIGNAL D1 †

Chirplets   Range (s)      â        φ̂ (rad)   t̂c (s)   f̂c (Hz)   ĉ (Hz/s)   Δ̂t (s)
A1          0.00 – 0.42    70.02    -0.19     0.27     30.38     -8.15      0.08
A2          0.42 – 0.83    42.87    -0.23     0.66     17.05     1.28       0.15
A3          0.84 – 1.25    57.71    0.90      1.04     35.07     1.39       0.12
A4          1.25 – 1.67    168.06   1.21      1.48     7.30      -25.77     0.04
A5          1.67 – 2.08    157.46   -2.07     1.88     5.30      -4.58      0.14
A6          2.09 – 2.50    91.03    -1.88     2.25     5.82      -0.88      0.12
A7          2.50 – 2.92    82.74    3.05      2.71     5.35      -0.92      0.12
A8          2.92 – 3.33    76.47    1.43      3.14     6.82      -2.75      0.13
A9          3.34 – 3.75    91.55    1.52      3.60     5.33      -2.26      0.11
A10         3.75 – 4.17    82.09    -3.01     4.00     5.28      -5.97      0.14
A11         4.17 – 4.58    74.83    -2.99     4.36     5.56      -3.04      0.13
A12         4.59 – 5.00    102.17   1.89      4.88     6.26      6.47       0.10

† The 12 chirplets estimated by the windowed ACT are labeled as A1 – A12. The estimated parameters are in reference to equations (2.14) and (2.19).

4.3.2 Chirplet estimation

One chirplet was estimated from each segment. Since the window length is 100 points, each 1200-point signal was divided into M = 12 equal-length segments, resulting in 12 estimated chirplets. Detailed information regarding the complex amplitude and the four parameters of each estimated chirplet can then be obtained. As an example, we have provided the amplitudes and parameters of the 12 chirplets estimated from signal D1 in Table XI, as well as the time-ranges of the 12 segments. Note that the displayed time-center of each chirplet has already been adjusted by adding the start time of the window to the estimated time-center. We can see that the frequency-centers of the first three chirplets A1 – A3 are significantly higher than those of the remaining chirplets. The signal represented by these three chirplets is usually spontaneous EEG activity. This is followed by two chirplets, A4 and A5, with relatively higher amplitude.

Specifically, we note that A4 possesses a small time-spread (0.04 s) and a large negative chirp rate (-25.77 Hz/s), indicating a rapid decrease of instantaneous frequency. This observation is consistent with the results shown in Table IV obtained by the non-windowed ACT. That is, both tables show that a transient portion precedes a steady-state portion of VEPs. Chirplets A4 and A5 appear to represent the transient VEP portion, as the time interval covered by them is close to the interval of tVEP shown in Table VIII. The rest of the signal is represented by the chirplets A6 – A12 with relatively stable amplitudes, frequency-centers, chirp rates and time-spreads, indicating the steady-state portion of the VEP response. We will confirm our observation with the statistical information of the estimates. Note that the time-centers are usually not at the centers of the corresponding segments, due to the adaptive procedure of chirplet selection.
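The partitioning step and the time-center adjustment described at the start of this subsection can be sketched as follows. The helper names `partition` and `to_global_time` are hypothetical, the signal is a zero-valued placeholder, and the local time-center of 0.23 s is an invented illustration value; sample indices are counted from zero here, so segment boundaries may differ by one sample from those listed in Table XI.

```python
def partition(signal, window_len):
    """Split a signal into equal-length, non-overlapping segments.

    Trailing samples that do not fill a whole window are dropped.
    Returns a list of (start_index, samples) pairs.
    """
    n_segments = len(signal) // window_len
    return [
        (k * window_len, signal[k * window_len:(k + 1) * window_len])
        for k in range(n_segments)
    ]


def to_global_time(local_tc, start_index, fs):
    """Shift a time-center measured from the segment start onto the global axis."""
    return start_index / fs + local_tc


fs = 240.0                    # sampling rate stated in the text (Hz)
signal = [0.0] * 1200         # placeholder for a 1200-point recording
segments = partition(signal, window_len=100)

# Hypothetical local time-center of 0.23 s inside the fourth segment.
start_index = segments[3][0]
global_tc = to_global_time(0.23, start_index, fs)
```

Keeping the start index alongside each segment is a small design choice that makes the later shift of the estimated time-center a one-line operation.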

4.3.3 Visualization

Similar to the application of the non-windowed ACT, the results of the windowed ACT can also be visualized by using the ACS technique. The resulting ACS, however, is not constructed by superimposing the WVDs of the different chirplets, but is a sequential display of the WVDs along with their corresponding segments.

[Figure 34 plots: adaptive chirplet spectrograms of signals D1 – D5, frequency (0 – 60 Hz) versus time (0.0 – 5.0 s), with atoms A1 – A12 labeled in panel D1; to the right of panel D1, the original signal D1 and the signal reconstructed from A1 to A12, amplitude versus time (s).]

Fig. 34. Time-frequency structures of signals D1 – D5. For each signal, 12 chirplet atoms have been estimated, which are labeled as A1 – A12 (shown in panel D1). As an example, the original signal D1 and the reconstructed signal are shown on the right.

The decomposition results of all five signals D1 – D5 are shown in Fig. 34, where the chirplets are labeled as A1 – A12 (shown in panel D1). As an example, the reconstructed signal, together with the original signal D1, is shown on the right of panel D1. The time-frequency structures of the other signals are presented below. In general, the results follow a similar pattern; that is, the signal before 1.25 s is represented by three atoms, A1 – A3, with lower amplitudes and higher frequency-centers. They are followed by two atoms with relatively stronger amplitudes and shorter time-spreads; their chirp rates are usually negative, indicating a downward transition of the instantaneous frequencies. The rest of the signal is represented by a series of seven atoms denoted by A6 – A12. Most of them have wide time-spreads and small chirp rates. Their frequency-centers are close to 6 Hz.

It is worth noting that the reconstructed signal from A1 – A3 (e.g., in the case of signal D1) presents unusual waveforms that seem quite different from the original spontaneous EEG signals. It appears that the signals before 1.25 s are not well represented. This is mainly due to the fact that the SNR in this interval is low, and thus the assumption of 0 dB SNR (Section 4.2) does not adequately hold, leading to a large variance of the estimates. Nevertheless, we believe that the estimates of high frequency-centers and low amplitudes reflect the characteristics of the spontaneous EEG preceding the VEP response.

4.3.4 Statistical information

Previous observations regarding the VEP variation revealed by the windowed ACT (Section 4.3.2) can be further confirmed by the statistical information of the estimates. The averaged parameters of chirp rate (c), frequency-center (fc), amplitude (a) and time-spread (Δt) of the estimated chirplets A1 – A12 are shown in Fig. 35. For the sake of clarity, the standard deviation (STD) values of the estimates are not shown in the figure, but summarized in Table XII instead.

[Figure 35 plots: (A) chirp rate (Hz/s) and frequency center (Hz) versus the sequence of atoms A1 – A12; (B) amplitude and time-spread (ms) versus the sequence of atoms A1 – A12.]

Fig. 35. Averaged parameters of the atoms A1 – A12. (A) Averaged chirp rates and frequency-centers. (B) Averaged amplitudes and time-spreads.

In panel (A), we show the averaged chirp rates and frequency-centers of the estimated chirplets. A clear transition of central frequencies can be observed between A3 and A4. In particular, the chirp rate of A4 is a large negative value, indicating a sharp decrease in instantaneous frequency. The frequency-centers of the rest of the chirplets are close to 6 Hz and their chirp rates are close to zero. We also show the time-dependent variation of the averaged amplitudes and time-spreads of the chirplets in panel (B). A prominent characteristic is that A4 possesses significantly shorter time-spread and higher amplitude. The chirplet A5 has higher amplitude as well. It appears that A4 and A5 represent the transient VEP. The rest of the chirplets, A6 – A12, which have relatively stable amplitudes and chirp rates, represent the steady-state portions of the signals.

TABLE XII
STANDARD DEVIATION (STD) VALUES OF THE PARAMETER ESTIMATES OF CHIRPLETS, INCLUDING CHIRP RATE (c), FREQUENCY-CENTER (fc), AMPLITUDE (a) AND TIME-SPREAD (Δt)

Atoms   STD(c) (Hz/s)   STD(fc) (Hz)   STD(a)   STD(Δt) (ms)
A1      24.66           8.14           16.76    30.07
A2      17.66           8.96           12.31    56.25
A3      18.93           12.18          15.06    24.99
A4      11.58           0.79           23.90    10.09
A5      3.12            0.69           30.47    28.54
A6      1.69            0.35           11.58    16.16
A7      2.83            0.62           11.78    27.92
A8      3.40            0.80           12.39    11.87
A9      1.32            0.29           12.17    24.45
A10     3.03            0.63           7.71     22.28
A11     1.67            0.29           5.76     16.99
A12     2.24            0.43           20.73    20.23

In general, we conclude from the results above that the chirplets estimated by the windowed ACT reveal a pattern of VEP response similar to that shown by the non-windowed method. That is, a short transient tVEP is followed by a long steady-state ssVEP. However, the time-cost of the windowed method is significantly lower. The comparison of computational time will be detailed in the next chapter.

4.4 Discussion

Two questions deserve further discussion. One is the effectiveness of different window lengths. We have shown that a window length of 0.42 s (100 points) can be used to adequately characterize a VEP response from its initial transient portion to the steady-state portion. We are going to empirically verify its effectiveness by comparing the results due to different partition lengths. The other is the variation of the estimates caused by the time shift of window positions. In practice, the EEG signals are usually monitored continuously. Because the arrival time of a VEP response is usually unpredictable, the relative window positions are not necessarily the same as those shown in Fig. 34. We will compare the results due to different window positions.

[Figure 36 plots: adaptive chirplet spectrograms of signal D1 in panels (A) and (B), frequency (0 – 60 Hz) versus time (0.0 – 5.0 s).]

Fig. 36. Results of signal D1 using different window lengths. (A) The window length was 0.21 s (50 points). (B) The window length was 0.83 s (200 points). The two dotted lines indicate the position of the window between 0.84 s and 1.67 s.

In Fig. 36 we show the visualization of the estimates of signal D1 resulting from different window lengths. Panel (A) is the result when a window length of 0.21 s, half of the proposed size, was adopted. By reducing the window length, the time position of a VEP local structure is more accurately displayed. However, it is difficult to acquire an effective separation of the spontaneous EEG, the tVEP portion and the ssVEP portion. As an example, we know that the signal between 1 s and 2 s is essentially the transient VEP, but the pattern of the chirplets in this interval is generally not distinguishable from that after 2 s, where the steady-state VEP is expected. This is likely caused by the increase of the variance of the estimates, because of the reduction of the observable data in each segment.

On the other hand, a longer window leads to a lower variance of the estimates, as shown in panel (B), where a window length of 0.83 s was employed. The visualization of the time-frequency structures of signal D1 is clearer than that in Fig. 34(D1), as it consists of a clear transient portion (indicated by two dotted lines) and the steady-state portion represented by three chirplets. We may thus prefer a 0.83 s window to the proposed 0.42 s window. However, this observation may be misleading, since the major portion of the transient VEP happened to be covered by the window between the dotted lines (0.84 s – 1.67 s).

This point may be seen more clearly when the positions of the windows are shifted. The time-frequency structures shown in panel (A) of Fig. 37 were obtained by using the window length of 0.42 s, but the window positions were shifted by half the window length, i.e., 0.21 s, as compared to the case in Fig. 34(D1). We can see clearly that two chirplets represent the tVEP between 1.0 s and 2.0 s, followed by a series of chirplets representing the ssVEP. On the other hand, panel (B) shows the results when the window length is 0.83 s, but the window positions were shifted by 0.42 s as compared to the case in Fig. 36(B). Although the steady-state portion is still clearly represented, the transient portion is not apparent. This is because the transient portion of the VEP response, in this case, has been divided into two adjacent windows and its energy is not dominant in either window. This indicates that a long window has the potential risk of inadequately “sampling” the tVEPs. From the above discussion, we conclude that the window length of 0.42 s (100 points) is effective for VEP analysis by the windowed ACT method.

[Figure 37 plots: adaptive chirplet spectrograms of signal D1 with shifted window positions in panels (A) and (B), frequency (0 – 60 Hz) versus time (0.0 – 5.0 s).]

Fig. 37. Effect of window shift. Chirplets estimated from signal D1 using window length 0.42 s (100 points) in (A) or using window length 0.83 s (200 points) in (B). The positions of the windows were shifted by half the window size as compared to the positions of the windows used to obtain the results in Fig. 34(D1) or in Fig. 36(B), respectively.

4.5 Summary

Segmentation-based computational techniques are desirable not only for processing non-stationary signals, but also for the purpose of long-term continuous monitoring of signals. In this chapter, the technique of the windowed ACT was developed by partitioning a given VEP signal into fixed-length and non-overlapping segments. The algorithm lends itself to a possible real-time implementation on a digital signal processor.

An important question arose regarding the length of the analysis window. We have discussed in detail the selection of the optimal length based on the characteristics of the VEP recordings. In particular, we pointed out that the lower bound was imposed by the SNR condition of the signal, while the upper bound was placed by the time duration of the transient portion of the VEP response. Among the lengths we compared, the proposed window length of 0.42 s (100 points) gave the most effective partition for our data, and a window of around 0.42 s appears to be a suitable candidate for the windowed ACT in VEP analysis. The results of applying the windowed ACT to the VEP signals have been presented. The number of chirplets required in the dictionary has been significantly reduced. The detailed information of the parameters of each estimated chirplet and their visualization by the ACS were provided. We can observe from the statistical information of the estimates that the chirplets can adequately characterize the VEPs from their transient portion to the steady-state portion. It should be mentioned, finally, that the windowing approach has additional potential to facilitate further signal processing, where a set of time-varying and low-dimensional local features is preferred, instead of the high-dimensional global features estimated by the non-windowed ACT.

Chapter 5 Discussions

The features and merits of the proposed non-windowed and the windowed ACT methods have been presented in the previous chapters. In this chapter, we discuss some potential applications of VEP chirplet analysis. We also discuss the technical limitations of the two methods.

5.1 Applications of VEP Analysis with ACT

It has been shown [94] that the more we know about a weak signal, the more readily we can detect it under low-SNR conditions. By taking advantage of the results we obtained in this thesis, we may now have a more accurate picture of the “true” VEP signal. We may apply this knowledge to a number of VEP-related applications. One such application would be on-line signal detection and classification for a brain-computer interface (BCI) system using VEPs [8,9]. In a BCI, a classified signal indicates a mental command sent out via EEG signals. This system may be regarded as a communication channel, through which a voluntary intention of a subject is transmitted to a computer. An important characteristic of a communication channel is the channel capacity, which is the number of bits of information transferred in one second [95], or equivalently measured as the “correct rate per trial” in a BCI [10]. At present, however, the clinical use of BCIs is severely limited by low channel capacity. Two possible reasons for this are a low rate of correct classification and/or the long time required for the data acquisition in each trial [96,97].

It is known, in general, that the more information we have about the signal to be detected, the higher the rate of correct classification we can achieve. Usually, a shorter observational interval (i.e., less data) is required under this condition. Reducing the observational interval is one factor that can increase channel capacity. One possible way to reduce this time interval is to obtain adequate data of the signal at an earlier time. For VEP detection relying on the steady-state signal model, a detector will have to wait until the ssVEP data have been collected before proceeding to classification. If we can detect a VEP during its transient portion, processing time can be reduced by saving the time for collecting the ssVEP data. In other words, the tVEP data alone will be adequate for a detector to claim a detection of the complete VEP.

One method of quantifying detector efficiency is to examine the so-called ‘arrival time’ of a signal. Arrival time is defined as the time instant when the target signal is detected [11]. In this section, we discuss some preliminary results of applying the chirplet representation to estimate the arrival time of VEPs. In particular, we attempt to show that one advantage of using tVEP in the detection process is that the arrival of a VEP can be detected earlier than by having to wait for the ssVEP.

It is well known that when the complete information about the signal to be detected is known, the optimal detector (in the Neyman-Pearson sense) is the likelihood ratio test, which is usually implemented by a matched filter [98]. A matched filter may be viewed as a special case of a finite impulse response (FIR) filter.
Specifically, if $h[n]$ is the impulse response of the FIR filter, then the output at time $n$, for $n \ge 0$, is

$$y[n] = \sum_{k=0}^{n} x[k]\,h[n-k] \tag{5.1}$$

[Figure 38 plots: (A) templates of the matched filters, amplitude versus time (0.0 – 5.0 s), with the templates $s_{ss}$ and $s_t$ marked; (B) normalized matched filter outputs versus time (0.0 – 5.0 s), with the instants $t_a$ and $t_b$ marked.]

Fig. 38. Matched filter outputs. (A) Template signals for matched filters are in solid lines. The time duration of a template, indicated by a rectangle in dash-dotted line, is 0.416 s (100 points). Shown on the left is the “steady-state template” $s_{ss}$ (a piece of sinusoid at 6 Hz) for detecting ssVEP. The “transient template” $s_t$ (see text) for detecting tVEP is shown on the right. A VEP signal (reconstructed from c1 – c7 of signal D1 in this figure) is shown by the dotted line. The matched filter process is carried out as the template is swept across the signal (in the direction shown by the arrow) and the outputs are calculated by (5.4). (B) The outputs of matched filtering with the transient template are shown by the solid line and the outputs with the steady-state template are shown by the dash-dotted line. The time when the former reaches the maximum is denoted as ‘ta’ and that for the latter as ‘tb’. The maximum of the outputs of matched filtering with $s_t$ has been normalized to one.

where the input $x[n]$ and the impulse response of the filter $h[n]$ are nonzero only over the interval $[0, N-1]$. In a matched filter the output $y[n]$ is obtained as a convolution of the input with the impulse response, where the impulse response is a time-reversed version of the target signal $s[n]$ (signal “template”),

$$y[n] = \sum_{k=0}^{n} x[k]\,s[N-1-(n-k)]. \tag{5.2}$$

The output at time $n = N-1$ is

$$y[N-1] = \sum_{k=0}^{N-1} x[k]\,s[k]. \tag{5.3}$$

The matched filter thus performs a correlation between the signal and the template. It has many applications in biomedical signal processing [24]. When the arrival time $t_0$ of a signal is unknown, its estimate $\hat{t}_0$ may be found by locating the maximum of the outputs of the matched filter [11]. That is, $\hat{t}_0$ is found by (see footnote 18)

$$\hat{t}_0 = \arg\max_{t_0} y[t_0] = \arg\max_{t_0} \sum_{n=t_0}^{t_0+M-1} x[n]\,s[n-t_0] \tag{5.4}$$

i.e., maximizing the outputs $y[t_0]$ over all possible $t_0$, where $s[n]$ is the target signal with a duration of M points (nonzero over the interval $[0, M-1]$) and $n \in [0, N-1]$ is the observation interval of $x[n]$. Matched filtering is carried out by sweeping the template across the signal from left to right and the outputs are calculated by (5.3) at each time instant (see Fig. 38).
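The sweep described by (5.4) can be sketched in plain Python. This is a toy illustration under stated assumptions: the function names `matched_filter_outputs` and `estimate_arrival` are hypothetical, the template is a noise-free 100-point sine at 6 Hz sampled at 240 Hz, and it is embedded in an otherwise silent record rather than in VEP data.

```python
import math


def matched_filter_outputs(x, s):
    """Correlate template s (M points) with x (N points) at every valid shift t0.

    outputs[t0] = sum of x[n] * s[n - t0] for n = t0 .. t0 + M - 1.
    """
    m = len(s)
    return [
        sum(x[t0 + i] * s[i] for i in range(m))
        for t0 in range(len(x) - m + 1)
    ]


def estimate_arrival(x, s):
    """Return the shift t0 (in samples) that maximizes the matched filter output."""
    outputs = matched_filter_outputs(x, s)
    return max(range(len(outputs)), key=outputs.__getitem__)


# Toy example (not thesis data): a 100-point, 6 Hz sinusoidal template
# sampled at 240 Hz, embedded at sample 300 of an otherwise silent record.
fs = 240.0
template = [math.sin(2 * math.pi * 6.0 * i / fs) for i in range(100)]
true_t0 = 300
x = [0.0] * 1200
for i, v in enumerate(template):
    x[true_t0 + i] = v

t0_hat = estimate_arrival(x, template)
arrival_s = t0_hat / fs
```

In this noise-free setting the maximum falls exactly at the embedding position, since the correlation of the template with itself is largest at zero lag. With noisy data the location of the maximum becomes a random quantity.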

TABLE XIII
TIME DIFFERENCES BETWEEN MAXIMA OF MATCHED FILTER OUTPUTS FOR STEADY-STATE AND TRANSIENT TEMPLATES

Signals         D1      D2      D3      D4      D5      Mean    STD
tb − ta (ms)    450.0   145.8   287.5   195.8   137.5   243.3   116.3
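The summary columns of Table XIII can be recomputed from the per-signal values with the standard library. In this short check the reported STD of 116.3 ms agrees with the population formula (division by n); the sample formula (division by n − 1) gives a larger value of about 130 ms. The variable names are illustrative only.

```python
import statistics

# Per-signal differences t_b - t_a (ms) for D1 - D5, copied from Table XIII.
diff_ms = [450.0, 145.8, 287.5, 195.8, 137.5]

mean_ms = statistics.mean(diff_ms)
std_population_ms = statistics.pstdev(diff_ms)   # divides by n
std_sample_ms = statistics.stdev(diff_ms)        # divides by n - 1
```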

The estimates of the arrival time of VEPs will depend on the type of target signal $s[n]$, or equivalently the template of the matched filter. For the sake of simplicity, we consider only two choices here. One is to employ the conventional model of steady-state EP and define a “steady-state template” $s_{ss}[n]$ as a segment of a sinusoid. The other is to take advantage of the information regarding tVEP obtained through previous analysis to form an alternative template. The “transient template” $s_t[n]$ is found by reconstructing the chirplets obtained from the tVEP. Since the transient appears earlier than the steady-state portion, we expect an earlier detection of tVEP with $s_t[n]$.

18 We employed the definition of arrival time (5.4) in [11], Vol. I, p.192.

The process of matched filtering is illustrated in Fig. 38. The duration of the templates is set at 0.416 s (100 points), which is comparable to the optimal window length in the windowed ACT (discussed in Section 4.2). $s_{ss}[n]$ is a sinusoid at 6 Hz with unit amplitude and zero phase. $s_t[n]$ is a chirplet whose frequency-center $f_c$, chirp rate $c$ and time-spread $\Delta t$ are set to the mean values of tVEP in Table VIII. Its time-center $t_c$ is set at the center of the segment, the amplitude is unity and the phase is zero. Both $s_{ss}[n]$ and $s_t[n]$ are shown in Panel A of Fig. 38. Note that they have been exaggerated to give a clear visualization. In this preliminary study, the VEP signal $x[n]$ provides a “complete representation” of the VEP recordings (see Fig. 26 in Section 3.3.1).

For instance, $x[n]$ of signal D1 in Fig. 26 is shown by the dotted line in Panel A of Fig. 38. The resulting outputs are shown in Panel B and the time locations of the maxima are denoted as ‘ta’ and ‘tb’ when $s_t[n]$ and $s_{ss}[n]$ were adopted, respectively. The time differences tb – ta for signals D1 – D5 are listed in Table XIII. We can see that, for our data, ta is always earlier than tb, indicating that the tVEP signal can be detected earlier if we use a chirplet model, rather than the usual sinusoid model. The average gain of time is about 240 ms.

The results discussed above show the potential of earlier detection of VEP by exploiting the information regarding tVEP. It should be noted, however, that the influence of noise has not been considered here, which will degrade the detector performance. In fact, our preliminary results indicated that at SNR = 0 dB (the SNR level for our VEP data) ta and tb were not statistically different. The performance of the detector will be further evaluated in future studies.

5.2 Technical Limitations

Two ACT techniques were developed to characterize the VEP responses to periodically repeated visual stimulation. The results show that the chirplets estimated by both methods can successfully represent the initial tVEP and the later ssVEP of a VEP response. In this section, however, we discuss the difference between the two techniques by emphasizing their technical limitations. Possible solutions to these limitations are also suggested.

One major limitation of the non-windowed method comes from its difficulty in analyzing long-lasting nonstationary signals whose time-varying behavior continues over an indefinite time period. Although, in principle, one can always apply the non-windowed method after all the data of interest have been obtained, in practice this approach may incur an unfavorably long processing time. The excessive time cost has three main causes. First, the procedure of data acquisition may be excessively long if the observation interval covers a long time period. Second, the time cost of data analysis may be high because of the large data size, N. As we have already pointed out, the time of chirplet estimation is proportional to $N^2 \log N$. Therefore, the computation time will increase rapidly with the data size. In addition, computational resources could be a problem in this case. For example, the number of discrete chirplets in a dictionary is a function of the sample size as well, increasing with N. The capacity of system memory may not be large enough to store the dictionary for large N. In such cases, the algorithm performing data analysis could work off-line. Finally, the low SNR of the signals requires averaging over multiple signal trials. However, if the SNR of the signals can be improved to 0 dB (or higher), the averaging can be avoided for our algorithm. For our data, the SNR of a single trial was below -10 dB, which demanded averaging over 50 single trials to increase the SNR. However, in some situations (e.g., a brain-computer interface) the algorithm should work on-line to process the data in real time. The non-windowed ACT approach usually cannot fulfill this purpose. The windowed approach could be a solution and we will provide information on computational time at the end of this section.

Another limitation of the non-windowed method lies in representing ssVEP with long duration. We have seen from Table VIII (p. 83) that the steady-state components of a VEP response are usually represented by chirplets with large time-spreads and near-zero chirp rates. Recall that the envelope of a chirplet is a Gaussian function, which results in a bell-shaped amplitude of the reconstructed steady-state portion. The amplitude diminishes to a small value at both ends, as is clearly shown in Fig. 30 (p. 82). In particular, the amplitude of all reconstructed VEP signals decreases after about 4 seconds, suggesting a steady-state component with decreasing amplitude. This suggestion, however, could be misleading, since the measured VEP signal, for instance, signal D1 (e.g. in Fig. 26), does not appear to have vanishing amplitude after 4 seconds. Arguably, the trend of amplitude reduction might be concealed by noise in the measurement process. We believe, however, that the amplitude reduction in the reconstructed signal is more likely an artifact due to an intrinsic property of the Gaussian chirplets, i.e., the Gaussian envelope. There are a number of ways to resolve this shortcoming, including increasing the number of chirplets used to construct the steady-state portion of the signal or including Fourier components (long-duration sinusoids) in the dictionary, as suggested by Mallat and Zhang [37].
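Returning to the $N^2 \log N$ scaling of the estimation time noted above, a back-of-the-envelope comparison shows why shorter records help. The sketch below is an assumption-laden illustration, not a timing result: `relative_cost` is a hypothetical helper, the unknown constant factor is taken to be the same in both cases, and differences in dictionary size and in the number of chirplets estimated per record are ignored.

```python
import math


def relative_cost(n):
    """Cost model from the text, up to an unknown constant: N^2 * log(N)."""
    return n * n * math.log(n)


full_record = relative_cost(1200)        # one pass over all 1200 points
windowed = 12 * relative_cost(100)       # twelve 100-point segments
ratio = full_record / windowed
```

Under this simplified model the ratio equals 12 × log(1200)/log(100), which does not depend on the base of the logarithm.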
Another possible artifact caused by the non-windowed approach is the “non-causal” effect; that is, it seems that the response occurs before the onset of stimulation. Again, in Fig. 30, we note that some small but observable reconstructed signals appear before the onset of the visual stimulus at 1 s. However, it is not clear whether these results imply an anticipative response system of the subject. It is more likely that they are artifacts mainly attributed to the chirplets representing the ssVEPs. This is because these chirplets represent well the narrow-band steady-state signals, but they have less certainty of the signals’ time position, reflected by their large time-spreads.

The proposed windowed ACT method could be a possible solution to these problems. By using an appropriate window size, the computational complexity can be significantly reduced because of the small sample size (Section 2.1); and because the steady-state component is separated into segments with short durations, each segment can be represented by a chirplet whose Gaussian envelope has no significant change within this duration. Therefore, the reconstructed signal has relatively constant amplitude; see, for example, the reconstructed signal D1 in Fig. 34. In addition, the estimated chirplets are well confined within the duration of the segments.

Note that the windowed ACT has been proposed as a middle step toward the chirplet analysis in real time (Section 4.1). As mentioned earlier, signal averaging could not be avoided due to the low SNR of a single trial of data, although the computational complexity of the windowed ACT has been significantly reduced. If the SNR of a single trial can be improved (> 0 dB) by some filtering mechanism in real time, signal averaging can be avoided. This should be further investigated in future work regarding real-time implementation of the ACT algorithms.

However, the windowed approach also has several technical limitations. Firstly, since the chirplets estimated from different segments are independent, the reconstructed signals are usually not continuous at the boundaries of the segments. However, there is no obvious reason to expect discontinuity in a VEP signal.
This problem could be solved by imposing additional constraints on the estimation to minimize the discontinuity of the reconstructed signals across the segment boundaries.

Secondly, each segment of data is represented by a single chirplet, namely the one that best fits the data in the sense that its coefficient attains the highest amplitude. To represent the signal sufficiently, we have pointed out (in Section 2.3.2) that the signal under analysis should satisfy the NIB condition; that is, the energy distribution of the signal in the time-frequency plane is dominated by one trajectory of the energy curve. Therefore, there is a risk of poor representation if the NIB condition is not met. For example, speech signals characterized by several formants (frequency regions of relatively high intensity in the spectrum) will probably not be well estimated. For our data, however, the results show that this condition is closely satisfied by VEP signals.

Another assumption in applying the windowed approach is the SNR condition that determines the bounds of the window size. For the employed window size of 100 points, the SNR should be above 0 dB (Section 4.2). This assumption can be justified for those segments in which the true VEP is present. However, the degradation of the estimator due to the low SNR of the data before the start of visual stimulation will result in a large variance of the estimates. As was discussed in Section 4.3.3 and shown in Fig. 34, the signal before 1.25 s appears not to be satisfactorily represented.

Finally, we should point out, from a signal processing standpoint, that an important condition in deriving the CRLBs of the single-chirplet model (2.27)-(2.30) is that the VEP signal must be well within the observational interval (i.e., a segment of data in the windowed ACT method). With this condition, we can derive the closed-form CRLBs (Appendix B) and hence determine the relationship between the optimal data size and the SNR level (see Fig. 17).
This condition, however, is an approximation to real situations, especially in estimating the steady-state component of a VEP response. Since an ssVEP usually covers a long period of time, a segment of data can only reflect partial information about the signal. To our knowledge, little work has been done to investigate to what extent the bounds degrade as more and more of the signal is unobserved; this is a possible area of future work. Nevertheless, our results show that the series of chirplets estimated from short windowed segments can generally characterize the steady-state VEP portions, as well as the transient portions.
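The partitioning step and the boundary discontinuity discussed above can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in this work: the function names `partition` and `boundary_jumps` are hypothetical, and dropping trailing samples that do not fill a whole window is an assumption made only for this illustration.

```python
def partition(signal, window_len):
    """Split a sequence into equal-length, non-overlapping segments.

    Trailing samples that do not fill a whole window are dropped
    (an assumption of this sketch, not a rule stated in the text).
    """
    n_segments = len(signal) // window_len
    return [signal[i * window_len:(i + 1) * window_len]
            for i in range(n_segments)]


def boundary_jumps(segments):
    """Absolute jump between the last sample of each reconstructed
    segment and the first sample of the following segment."""
    return [abs(nxt[0] - cur[-1])
            for cur, nxt in zip(segments, segments[1:])]


# Usage: a 1200-point record split into 100-point windows.
record = [0.0] * 1200
segments = partition(record, 100)
jumps = boundary_jumps(segments)
```

A constraint on the estimation of the kind suggested above would aim to keep the values returned by `boundary_jumps` small when it is applied to the per-segment reconstructions.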

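[Figure 39: double-logarithmic plot of time per chirplet (s) against window length (points). Labelled data points (window length, average time): 60 points, 5.9 s; 120 points, 13.6 s; 300 points, 50.1 s; 600 points, 143.6 s; 1200 points, 449.4 s.]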

Fig. 39. Time cost of chirplet estimation for different window lengths. The results are shown in a double log-axis plot. For each testing point, two values are displayed. The bottom one is the window length in points and the upper one is the average time cost in seconds.

We now complete this section by summarizing the time cost of chirplet estimation in Fig. 39. The time was measured for calculating one chirplet under different window lengths. Note that, if just a single chirplet is estimated, the non-windowed approach is equivalent to the windowed ACT method with a window length of 1200 points (i.e., the entire signal). The measurements were conducted on a Pentium IV (3.0 GHz) Windows PC equipped with 1 GB of random access memory (RAM). The algorithm was coded in MATLAB® 7.0. The signal D1 was employed as the test signal. The non-windowed ACT and the windowed ACT with segment sizes of 600, 300, 120 and 60 points were applied to the signal. Timing information was collected with the MATLAB profiler, and the average time cost for one chirplet was then calculated. The results are shown in Fig. 39. As expected, the trend on the double-logarithmic axes is close to a straight line. It took about 10 s to estimate a 100-point chirplet. The computational time decreases significantly as the window length decreases.
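A straight line on double-logarithmic axes corresponds to a power law, t ≈ k·n^p. The following Python sketch fits such a law by least squares to the five values labelled in Fig. 39. The pairing of window lengths with times is read from the figure labels and is an assumption, and the names `fit_power_law` and `predicted_time` are hypothetical; the sketch is not part of the original timing procedure.

```python
import math

# (window length in points, average time per chirplet in seconds),
# read from the labels of Fig. 39; the pairing is an assumption.
timings = [(60, 5.9), (120, 13.6), (300, 50.1), (600, 143.6), (1200, 449.4)]


def fit_power_law(points):
    """Least-squares fit of log(t) = log(k) + p * log(n); returns (k, p)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope of the log-log line
    k = math.exp(mean_y - p * mean_x)   # intercept, back in linear units
    return k, p


k, p = fit_power_law(timings)


def predicted_time(window_len):
    """Time per chirplet predicted by the fitted power law."""
    return k * window_len ** p


print(f"exponent p = {p:.2f}, predicted time at 100 points = {predicted_time(100):.1f} s")
```

Under these assumptions, the fitted exponent quantifies how steeply the cost grows with window length, and `predicted_time(100)` can be compared with the figure of about 10 s quoted above.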


Chapter 6 Conclusions and Future Work

We first present the conclusions of this work. As a general tool for time-frequency analysis, the ACT method is not confined to the field of EP signal analysis. Other areas where adaptive chirplet analysis might be suitable are suggested at the end.

6.1 Conclusions

In this thesis, we have applied the adaptive chirplet transform (ACT), in both its non-windowed and windowed implementations, to estimate the visual evoked responses to repetitive visual stimulation. The ACT partially overcomes the disadvantages of conventional time-frequency methods and provides a parsimonious and high-resolution representation of VEP signals. The Cramér-Rao lower bounds (CRLBs) of the estimates of a single chirplet in additive Gaussian white noise (GWN) were found.

The non-windowed ACT was implemented by an iterative coarse-refinement algorithm. Through numerical simulations, we concluded that the algorithm was suitable for multi-chirplet estimation under low signal-to-noise ratio (SNR) conditions. In the application of the non-windowed ACT to VEPs, a signal model consisting of multiple chirplets embedded in additive GWN was assumed. We validated the model by testing the whiteness of the residual signals and concluded that a complete VEP signal can be estimated by as few as seven iterations of our algorithm. However, statistical analysis showed that a coherent coefficient (cc) value of 0.10 is a suitable stopping criterion for our data, because the chirplets with higher cc values reflect the major time-frequency variations in the signals. We also showed that as few as three chirplets are required to represent a complete VEP response. A prominent advantage of the chirplet representation is the capability of separating the transient portion (tVEP) from the steady-state portion (ssVEP). The criteria for separation were summarized in Table VIII. By measuring the energy ratio (ER) values, we showed that the chirplet representation is more compact than the Gabor logon representation. The adaptive chirplet spectrogram (ACS) provided a clearer visualization of the results of estimation than conventional methods.

The windowed ACT was developed for the purpose of reducing time cost and with an ultimate goal of real-time application. In this method, the data were windowed into equal-length non-overlapping segments. The selection of the optimal window length was discussed in detail. After analyzing the SNR level of the VEP recordings and the effective time-spread of tVEPs, we proposed a window length of 0.416 s (100 points) as being sufficient in terms of time resolution. This result was further verified by using different window lengths and different window positions. We concluded from the statistical analysis that the windowed method could adequately characterize the entire VEP response from tVEP to ssVEP, and reveal a pattern similar to that found by the non-windowed ACT. However, the time cost of the windowed method is significantly lower than that of the non-windowed method.

We emphasize that both the non-windowed and windowed approaches provide a unified representation of the complete VEP response (both transient and steady-state VEPs). We have not seen such a feature in competing techniques. A unified approach can assist many types of physiological and electrophysiological studies [27,45-47]. However, technical limitations exist for both methods, and one should be aware of them in order to interpret the estimates properly. We believe that the ACT method will be especially suitable for characterizing those biomedical signals that consist of complicated time-frequency components.

6.2 Future Work

6.2.1 CRLB for multiple-chirplet estimation

Little work has been done to find the CRLBs for multi-chirplet estimation in the case of the non-windowed ACT. Whether a closed-form expression of the CRLBs exists for multiple chirplets is largely unknown. The CRLBs for multiple chirplets are of particular interest because they should be useful for the analysis of signals consisting of more than one chirplet component. In general, the CRLB for the single-chirplet case should be valid as long as the chirplets are well separated in the time-frequency plane. When two or more components become close, we have to factor into the bounds the ambiguity in assigning energy to one or the other chirplet function. However, our preliminary investigation indicated that this would be complicated. The work done by Rife and Boorstyn [99], regarding the CRLB for multiple sinusoids, may be a good starting point.

6.2.2 Signal classifier based on chirplet features

With the windowed ACT method, signals are partitioned into non-overlapping segments, and each segment is modeled as a Gaussian chirplet. The amplitude, phase and the four parameters of the chirplet are estimated in each segment. These parameters may be used as features that adequately represent each segment. Subsequently, a time-varying probability density function (PDF) model can be used to track the evolution of the features through time. This approach avoids classifying based solely on a set of “global” features. A signal classifier based on this approach may be implemented in real-time and would be useful for fast signal identification. Techniques that adaptively partition signals according to the signal statistics may also be employed to select local features in an optimal sense [100].
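As a purely illustrative sketch of this idea, the Python fragment below stores the six estimated quantities of each segment in a feature record and tracks them with an exponentially weighted Gaussian (a running mean and variance per feature). The exponentially weighted update is only one possible choice of time-varying PDF model and is an assumption of this sketch; the names `ChirpletFeatures`, `RunningGaussian` and `forgetting` are hypothetical.

```python
from dataclasses import dataclass, astuple


@dataclass
class ChirpletFeatures:
    """Features estimated from one segment (one chirplet per segment)."""
    amplitude: float
    phase: float
    time_center: float
    freq_center: float
    chirp_rate: float
    time_spread: float


class RunningGaussian:
    """Exponentially weighted mean and variance of each feature.

    This is an illustrative stand-in for a time-varying PDF model;
    `forgetting` (0 < forgetting < 1) is a hypothetical tuning parameter.
    """

    def __init__(self, dim, forgetting=0.9):
        self.mean = [0.0] * dim
        self.var = [1.0] * dim
        self.forgetting = forgetting
        self.initialized = False

    def update(self, x):
        if not self.initialized:
            # The first observation sets the mean; the variance keeps its prior.
            self.mean = list(x)
            self.initialized = True
            return
        lam = self.forgetting
        for i, xi in enumerate(x):
            d = xi - self.mean[i]
            self.mean[i] += (1.0 - lam) * d
            self.var[i] = lam * (self.var[i] + (1.0 - lam) * d * d)


# Usage with made-up feature values for three consecutive segments.
segment_features = [
    ChirpletFeatures(1.0, 0.0, 50.0, 0.20, 0.001, 30.0),
    ChirpletFeatures(1.1, 0.1, 150.0, 0.21, 0.000, 40.0),
    ChirpletFeatures(0.9, 0.2, 250.0, 0.20, 0.000, 45.0),
]
tracker = RunningGaussian(dim=6)
for features in segment_features:
    tracker.update(astuple(features))
```

A classifier could then compare the features of a new segment with the tracked mean and variance, rather than with a single set of global features.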

6.2.3 Beyond EP applications

As a new tool of time-frequency analysis, the ACT has potential applications in the analysis of a variety of signals that involve chirping components. Besides EEG signals, the chirping phenomenon exists in many other natural signals. Two examples of chirplet representations of bio-acoustical signals are shown in Fig. 40. The signals are a bat echo signal and a Robin whistle, which are in the ultrasonic range and audible frequency range, respectively. The chirplets were estimated with the non-windowed ACT method and visualized with the ACS method proposed in Chapter 2. These examples demonstrate again the value of the compact representations using chirplets. Signal information is diluted less and packed into a few coefficients of high energy. Therefore, signal compression is a potential application. Besides this advantage, the chirplet transform may serve as an alternative filtering method to conventional Fourier-based techniques. For example, we can remove or enhance the signal parts represented by chirplets corresponding to some ranges of the time-shift, frequency-shift, chirp rate and time-spread. The ACS provides a clear picture of a signal’s energy content, and thus captures the “signature” of the signal in the time-frequency plane, which should be especially useful in pattern recognition problems.
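The filtering idea can be sketched in Python as follows. The chirplet is evaluated with the Gaussian chirplet form that appears in (A.1), using sample-based units for simplicity; the class `Chirplet`, the functions `synthesize` and `keep`, and the numerical values are hypothetical and serve only as an illustration under these assumptions.

```python
import cmath
import math
from dataclasses import dataclass


@dataclass
class Chirplet:
    coeff: complex   # complex coefficient of the chirplet
    t_c: float       # time-center (samples)
    w_c: float       # frequency-center (rad/sample)
    c: float         # chirp rate (rad/sample^2)
    d_t: float       # time-spread (samples)


def chirplet_value(ch, t):
    """Gaussian chirplet of the form used in (A.1), evaluated at sample t."""
    tau = t - ch.t_c
    envelope = (math.exp(-0.5 * (tau / ch.d_t) ** 2)
                / math.sqrt(math.sqrt(math.pi) * ch.d_t))
    return envelope * cmath.exp(1j * (ch.c * tau + ch.w_c) * tau)


def synthesize(chirplets, n_samples):
    """Reconstruct a signal as the weighted sum of the given chirplets."""
    return [sum(ch.coeff * chirplet_value(ch, t) for ch in chirplets)
            for t in range(n_samples)]


def keep(chirplets, min_spread=0.0, max_spread=math.inf):
    """Retain only the chirplets whose time-spread lies in the given range."""
    return [ch for ch in chirplets if min_spread <= ch.d_t <= max_spread]


# Usage: remove a short transient-like chirplet and keep a long one.
estimated = [
    Chirplet(coeff=1.0, t_c=300.0, w_c=0.3, c=0.001, d_t=20.0),
    Chirplet(coeff=1.0, t_c=800.0, w_c=0.2, c=0.0, d_t=300.0),
]
long_only = keep(estimated, min_spread=100.0)
filtered = synthesize(long_only, 1200)
```

The same selection could be made on the time-center, frequency-center or chirp rate by changing the predicate in `keep`.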

[Figure 40: panels (A1) “Bat echo-location signal”, amplitude versus time (ms); (A2) “American Robin bird chirp signal”, amplitude versus time (s); (B1) and (B2), frequency (kHz) versus time (ms); (C1) and (C2), frequency (kHz) versus time (s).]

Fig. 40. Chirplet representations of bio-acoustical signals. Time domain signals of (A1) large brown bat sound (sampled at 0.14 MHz) and (A2) American Robin chirp (sampled at 8 kHz); Time-frequency representations of bat sound, including (B1) the spectrogram (calculated with a 0.45 ms Gaussian window) and (B2) the ACS (represented by five chirplets); and Time-frequency representations of bird sound, including (C1) the spectrogram (with an 8 ms Gaussian window) and (C2) the ACS (represented by 20 chirplets), respectively.


References

[1] D. Regan, Evoked potentials in psychology, sensory physiology and clinical medicine, London, UK: Chapman and Hall, 1972.

[2] D. Regan, Human brain electrophysiology: evoked potentials and evoked magnetic fields in science and medicine, New York: Elsevier, 1989.

[3] A.M. Halliday, Evoked potentials in clinical testing, 2nd ed., Edinburgh, UK: Churchill Livingstone, 1992.

[4] J.E. Desmedt, Visual evoked potentials, New York, USA: Elsevier Science Publishers B.V. (Biomedical Division), 1990.

[5] J.R. Heckenlively and G.B. Arden, Principles and practice of clinical electrophysiology of vision, Mosby Year Book, 1991.

[6] A.M. Norcia and C.W. Tyler, “Spatial-frequency sweep VEP - visual-acuity during the 1st year of life,” Vision Research, vol. 25, pp. 1399-1408, 1985.

[7] Y. Tang and A.M. Norcia, “Application of adaptive filtering to steady-state evoked response,” Medical & Biological Engineering & Computing, vol. 33, pp. 391-395, 1995.

[8] M. Cheng, X.R. Gao, S.G. Gao, and D.F. Xu, “Design and implementation of a brain-computer interface with high transfer rates,” IEEE Transactions on Biomedical Engineering, vol. 49, pp. 1181-1186, 2002.

[9] M. Middendorf, G. McMillan, G. Calhoun, and K.S. Jones, “Brain-computer interfaces based on the steady-state visual-evoked response,” IEEE Transactions on Rehabilitation Engineering, vol. 8, pp. 211-214, 2000.

[10] J.R. Wolpaw, N. Birbaumer, D.J. McFarland, G. Pfurtscheller, and T.M. Vaughan, “Brain-computer interfaces for communication and control,” Clinical Neurophysiology, vol. 113, pp. 767-791, June 2002.

[11] S.M. Kay, Fundamentals of statistical signal processing: Estimation and detection theory, Englewood Cliffs, N.J.: Prentice-Hall PTR, 1993.

[12] J.I. Aunon, C.D. McGillem, and D.G. Childers, “Signal processing in evoked-potential research - averaging and modeling,” CRC Critical Reviews in Bioengineering, vol. 5, pp. 323-367, 1981.

[13] F. Coelho, D. Simpson, and A. Infantosi, “Testing recruitment in the EEG under repetitive photo stimulation using frequency-domain approaches,” in Proceedings of IEEE Engineering in Medicine and Biology, Montreal, QC, Canada, pp. 905-906, 1995.

[14] A.P. Liavas, G.V. Moustakides, G. Henning, E.Z. Psarakis, and P. Husar, “A periodogram-based method for the detection of steady-state visually evoked potentials,” IEEE Transactions on Biomedical Engineering, vol. 45, pp. 242-248, 1998.

[15] J. Mast and J.D. Victor, “Fluctuations of steady-state VEPs - Interaction of driven evoked-potentials and the EEG,” Electroencephalography and Clinical Neurophysiology, vol. 78, pp. 389-401, 1991.

[16] L.H. Van Der Tweel, “Relation between psychophysics and electrophysiology of flicker,” Documenta Ophthalmologica, vol. 18, pp. 287-304, 1964.

[17] J.R. Wolpaw, N. Birbaumer, W.J. Heetderks, D.J. McFarland, P.H. Peckham, G. Schalk, E. Donchin, L.A. Quatrano, C.J. Robinson, and T.M. Vaughan, “Brain-computer interface technology: A review of the first international meeting,” IEEE Transactions on Rehabilitation Engineering, vol. 8, pp. 164-173, June 2000.

[18] F. Di Russo and D. Spinelli, “Effects of sustained, voluntary attention on amplitude and latency of steady-state visual evoked potential: a costs and benefits analysis,” Clinical Neurophysiology, vol. 113, pp. 1771-1777, 2002.

[19] L.H. Van Der Tweel and H.F.E. Verduyn Lunel, “Human visual responses to sinusoidally modulated light,” Electroencephalography and Clinical Neurophysiology, vol. 18, pp. 587-598, 1965.

[20] S. Tobimatsu, H. Tomoda, and M. Kato, “Normal variability of the amplitude and phase of steady-state VEPs,” Evoked Potentials-Electroencephalography and Clinical Neurophysiology, vol. 100, pp. 171-176, 1996.

[21] G.F.A. Harding, “The visual evoked response,” in Advances in Ophthalmology, M. J. Roper-Hall, Ed. Basel: S. Karger AG, 1974, pp. 2-28.

[22] S.J. Fricker, “Narrow-band filter techniques for detection and measurement of evoked responses,” Electroencephalography and Clinical Neurophysiology, vol. 14, pp. 411-421, 1962.

[23] D. Regan, “Some characteristics of average steady-state and transient responses evoked by modulated light,” Electroencephalography and Clinical Neurophysiology, vol. 20, pp. 238-&, 1966.

[24] R.M. Rangayyan, Biomedical signal analysis: a case-study approach, New York, USA: IEEE Press, 2002.


[25] M. Akay and IEEE Engineering in Medicine and Biology Society, Time-frequency and wavelets in biomedical signal processing, New York: IEEE Press, 1998.

[26] A.M. Norcia, T. Sato, P. Shinn, and J. Mertus, “Methods for the identification of evoked-response components in the frequency and combined time frequency domains,” Electroencephalography and Clinical Neurophysiology, vol. 65, pp. 212-226, 1986.

[27] N.S. Peachey, P.J. Demarco, R. Ubilluz, and W. Yee, “Short-term changes in the response characteristics of the human visual-evoked potential,” Vision Research, vol. 34, pp. 2823-2831, 1994.

[28] J.P.C. Deweerd and J.I. Kap, “Spectro-temporal representations and time-varying spectra of evoked-potentials - A methodological investigation,” Biological Cybernetics, vol. 41, pp. 101-117, 1981.

[29] S.J. Schiff, A. Aldroubi, M. Unser, and S. Sato, “Fast wavelet transformation of EEG,” Electroencephalography and Clinical Neurophysiology, vol. 91, pp. 442-455, 1994.

[30] L.J. Trejo and M.J. Shensa, “Feature extraction of event-related potentials using wavelets: An application to human performance monitoring,” Brain and Language, vol. 66, pp. 89-107, 1999.

[31] Z. Zhang, H. Kawabata, and Z.-Q. Liu, “Electroencephalogram analysis using fast wavelet transform,” Computers in Biology and Medicine, pp. 429-440, 2001.

[32] O. Bertrand, J. Bohorquez, and J. Pernier, “Time-frequency digital filtering based on an invertible wavelet transform - An application to evoked-potentials,” IEEE Transactions on Biomedical Engineering, vol. 41, pp. 77-88, 1994.


[33] S.G. Mallat, A wavelet tour of signal processing, 2nd ed., San Diego: Academic Press, 1999.

[34] R.Q. Quiroga, O.W. Sakowitz, E. Basar, and M. Schurmann, “Wavelet transform in the analysis of the frequency composition of evoked potentials,” Brain Research Protocols, pp. 16-24, 2001.

[35] N.V. Thakor, X.R. Guo, C.A. Vaz, P. Laguna, R. Jane, P. Caminal, H. Rix, and D.F. Hanley, “Orthonormal (Fourier and Walsh) models of time-varying evoked-potentials in neurological injury,” IEEE Transactions on Biomedical Engineering, vol. 40, pp. 213-221, 1993.

[36] N.V. Thakor, X.R. Guo, Y.C. Sun, and D.F. Hanley, “Multiresolution wavelet analysis of evoked-potentials,” IEEE Transactions on Biomedical Engineering, vol. 40, pp. 1085-1094, 1993.

[37] S.G. Mallat and Z. Zhang, “Matching pursuit with time-frequency dictionaries,” IEEE Transactions on Signal Processing, vol. 41, pp. 3397-3415, 1993.

[38] M.L. Brown, W.J. Williams, and A.O. Hero, “Non-orthogonal Gabor representations of biological signals,” pp. 305-308, 1994.

[39] K.J. Blinowska and P.J. Durka, “Unbiased high resolution method of EEG analysis in time-frequency space,” Acta Neurobiologiae Experimentalis, vol. 61, pp. 157-174, 2001.

[40] P.J. Durka and K.J. Blinowska, “Analysis of EEG transients by means of matching pursuit,” Annals of Biomedical Engineering, vol. 23, pp. 608-611, Sept. 1995.

[41] P.J. Durka, “Matching pursuit with stochastic time-frequency dictionary”. http://brain.fuw.edu.pl/~durka/software/mp/. 2005.

[42] P.J. Durka, “From wavelets to adaptive approximations: time-frequency parametrization of EEG,” Biomed. Eng Online., vol. 2, pp. 1, Jan. 2003.


[43] J. Cui, W. Wong, and S. Mann, “Time-frequency analysis of visual evoked potentials by means of matching pursuit with chirplet atoms,” in IEEE Annual International Conference of Engineering in Medicine and Biology Society, San Francisco, CA, pp. 267-270, 2004.

[44] J. Cui, W. Wong, and S. Mann, “Time-frequency analysis of visual evoked potentials using chirplet transform,” Electronics Letters, vol. 41, pp. 217-218, 2005.

[45] W.A. Ho and M.A. Berkley, “Evoked-potential estimates of the time course of adaptation and recovery to counterphase gratings,” Vision Research, vol. 28, pp. 1287-1296, 1988.

[46] D.Y. Xin, W. Seiple, K. Holopigian, and M.J. Kupersmith, “Visual-evoked potentials following abrupt contrast changes,” Vision Research, vol. 34, pp. 2813-2821, 1994.

[47] C. Janz, S.P. Heinrich, J. Kornmayer, M. Bach, and J. Hennig, “Coupling of neural activity and BOLD fMRI response: New insights by combination of fMRI and VEP experiments in transition from single events to continuous stimulation,” Magnetic Resonance in Medicine, vol. 46, pp. 482-486, 2001.

[48] M. Sekine, M. Akay, T. Tamura, Y. Higashi, and T. Fujimoto, “Investigating body motion patterns in patients with Parkinson's disease using matching pursuit algorithm,” Medical & Biological Engineering & Computing, vol. 42, pp. 30-36, Jan. 2004.

[49] A. Cohen, Biomedical signal processing, Boca Raton, FL: CRC Press, 1986.

[50] M. Akay, Biomedical signal processing, San Diego, CA: Academic Press, 1994.


[51] E.N. Bruce, Biomedical signal processing and signal modeling, New York: John Wiley, 2001.

[52] S.I. Fox, Human physiology, 7th ed., Boston, MA: McGraw-Hill Companies, Inc, 2002.

[53] P.A. Buser and M. Imbert, Vision, Cambridge, Mass: MIT Press, 1992.

[54] H. Ikeda, H. Nishijo, K. Miyamoto, R. Tamura, S. Endo, and T. Ono, “Generators of visual evoked potentials investigated by dipole tracing in the human occipital cortex,” Neuroscience, vol. 84, pp. 723-739, 1998.

[55] S. Arroyo, R.P. Lesser, W.T. Poon, W.R.S. Webber, and B. Gordon, “Neuronal generators of visual evoked potentials in humans: Visual processing in the human cortex,” Epilepsia, vol. 38, pp. 600-610, 1997.

[56] D.A. Jeffreys and J.G. Axford, “Source locations of pattern-specific components of human visual evoked-potentials .1. Component of striate cortical origin,” Experimental Brain Research, vol. 16, pp. 1-21, 1972.

[57] S. Mann and S. Haykin, “The chirplet transform - physical considerations,” IEEE Transactions on Signal Processing, vol. 43, pp. 2745-2761, Nov. 1995.

[58] B. Boashash and H.J. Whitehouse, “Seismic applications of the Wigner-Ville distribution,” in Proceedings of IEEE International Conference on Circuits System, pp. 34-37, 1986.

[59] S. Qian, M.E. Dunham, and M.J. Freeman, “Transionospheric signal recognition by joint time-frequency representation,” Radio Science, vol. 30, pp. 1817-1829, Nov. 1995.

[60] S. Mann and S. Haykin, “The chirplet transform: A generalization of Gabor's logon transform,” in Vision Interface, Calgary, Canada, pp. 205-212, 1991.


[61] S. Mann and S. Haykin, “Chirplets and warblets - Novel time-frequency methods,” Electronics Letters, vol. 28, pp. 114-116, Jan. 1992.

[62] S. Mann and S. Haykin, “The adaptive chirplet - an adaptive generalized wavelet-like transform,” in Adaptive Signal Processing, S. Haykin, Ed. Bellingham: SPIE - Int Soc Optical Engineering, 1991, pp. 402-413.

[63] S. Mann and S. Haykin, “Adaptive chirplet transform - An adaptive generalization of the wavelet transform,” Optical Engineering, vol. 31, pp. 1243-1256, June 1992.

[64] D. Gabor, “Theory of communication,” Journal of IEE, vol. 93, pp. 429-457, Nov. 1946.

[65] A. Bultan, “A four-parameter atomic decomposition of chirplets,” IEEE Transactions on Signal Processing, vol. 47, pp. 731-745, Mar. 1999.

[66] L. Cohen, Time-frequency analysis, Englewood Cliffs, NJ: Prentice Hall PTR, 1995.

[67] J. Cui, J. Loewy, and E.J. Kendall, “Automated search for arthritic patterns in infrared spectra of synovial fluid using adaptive wavelets and fuzzy C-means analysis,” IEEE Transactions on Biomedical Engineering, vol. 53, pp. 800-809, 2006.

[68] R.R. Goldberg, Fourier transforms, Cambridge, UK: Cambridge University Press, 1970.

[69] Z.Y. Lin and J.D. Chen, “Advances in time-frequency analysis of biomedical signals,” Critical Reviews in Biomedical Engineering, vol. 24, pp. 1-72, 1996.

[70] I. Daubechies, Conference Board of the Mathematical Sciences, and National Science Foundation (U.S.), Ten lectures on wavelets, Philadelphia, PA: Society for Industrial and Applied Mathematics, 1992.


[71] C.K. Chui, An introduction to wavelets, San Diego, CA: Academic Press, 1992.

[72] J. Ville, “Theorie et applications de la notion de signal analytique,” Cables Et Transmissions, vol. 2A, pp. 61-74, 1948.

[73] E.P. Wigner, “On the quantum correction for thermodynamic equilibrium,” Physical Review, vol. 40, pp. 749-759, 1932.

[74] S. Qian and D.P. Chen, “Signal representation using adaptive normalized Gaussian functions,” Signal Processing, vol. 36, pp. 1-11, 1994.

[75] S. Qian, D.P. Chen, and Q.Y. Yin, “Adaptive chirplet based signal approximation,” Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-6, pp. 1781-1784, 1998.

[76] R. Gribonval, “Fast matching pursuit with a multiscale dictionary of Gaussian chirps,” IEEE Transactions on Signal Processing, vol. 49, pp. 994-1001, 2001.

[77] J.C. O'Neill and P. Flandrin, “Chirp hunting,” Proceedings of the IEEE-SP International Symposium on Time-Frequency and Time-Scale Analysis, pp. 425-428, 1998.

[78] S.J. Orfanidis, Optimum signal processing: an introduction, 2nd ed., New York: Macmillan, 1988.

[79] Q.Y. Yin, S. Qian, and A.G. Feng, “A fast refinement for adaptive Gaussian chirplet decomposition,” IEEE Transactions on Signal Processing, vol. 50, pp. 1298-1306, June 2002.

[80] P.J. Huber, “Projection pursuit,” Annals of Statistics, vol. 13, pp. 435-475, 1985.


[81] G. Davis, S. Mallat, and M. Avellaneda, “Adaptive greedy approximations,” Constructive Approximation, vol. 13, pp. 57-98, 1997.

[82] G. Davis, S. Mallat, and Z.F. Zhang, “Adaptive Time-Frequency Decompositions,” Optical Engineering, vol. 33, pp. 2183-2191, 1994.

[83] P.L. Ainsleigh, S.G. Greineder, and N. Kehtarnavaz, “Classification of nonstationary narrowband signals using segmented chirp features and hidden Gauss-Markov models,” IEEE Transactions on Signal Processing, vol. 53, pp. 147-157, Jan. 2005.

[84] A.V. Oppenheim, R.W. Schafer, and J.R. Buck, Discrete-time signal processing, 2nd ed., Singapore: Pearson Education Pte. Ltd., 1999.

[85] L.R. Rabiner, Fundamentals of speech recognition, Englewood Cliffs, N.J.: PTR Prentice Hall, 1993.

[86] J. Wolcin, “Maximum a posteriori estimation of narrowband signals,” Journal of the Acoustical Society of America, vol. 68, no. 1, pp. 174-178, 1980.

[87] J. Fung, J. Cui, S. Mann, and W. Wong, “Fast computation of chirplet transform using graphics processing unit,” Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada, 2005. (unpublished work)

[88] F. Pei, M.W. Pettet, and A.M. Norcia, “Neural correlates of object-based attention,” Journal of Vision, vol. 2, pp. 588-596, 2002.

[89] J. Cui, “Ph.D. thesis proposal: A visual evoked potential based brain-computer system using peripheral vision,” Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, ON, Canada, Mar. 2003. (unpublished work)

[90] M.H. Hayes, Statistical digital signal processing and modeling, New York: Wiley, 1996.


[91] C.D. McGillem, J.I. Aunon, and K.B. Yu, “Signals and noise in evoked brain potentials,” IEEE Transactions on Biomedical Engineering, vol. 32, pp. 1012-1016, 1985.

[92] K. Drouiche, “A new test for whiteness,” IEEE Transactions on Signal Processing, vol. 48, pp. 1864-1871, July 2000.

[93] M.G. Kendall, A. Stuart, J.K. Ord, and A. O'Hagan, Kendall's advanced theory of statistics, 2nd ed., London: Edward Arnold, 2004.

[94] J.V. Candy, Model-based signal processing, Hoboken, NJ: IEEE Press, 2006.

[95] T.M. Cover and J.A. Thomas, Elements of information theory, New York: John Wiley & Sons, Inc., 1991.

[96] G. Dornhege, B. Blankertz, G. Curio, and K.R. Muller, “Boosting bit rates in noninvasive EEG single-trial classifications by feature combination and multiclass paradigms,” IEEE Transactions on Biomedical Engineering, vol. 51, pp. 993-1002, June 2004.

[97] B. Obermaier, C. Neuper, C. Guger, and G. Pfurtscheller, “Information transfer rate in a five-classes brain-computer interface,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 9, pp. 283-288, Sept. 2001.

[98] H.L. Van Trees, Detection, estimation, and modulation theory, New York: Wiley, 2001.

[99] D.C. Rife and R.R. Boorstyn, “Multiple tone parameter-estimation from discrete-time observations,” Bell System Technical Journal, vol. 55, pp. 1389-1410, 1976.

[100] U. Appel and A. Vonbrandt, “A comparative-study of 3 sequential time-series segmentation algorithms,” Signal Processing, vol. 6, pp. 45-60, 1984.


Appendices

Appendix A. Energy Conservation of Decomposition

This section proves the energy conservation of the decomposition stated in (2.18). We adapt a proof similar to that for decomposition using Gabor logons, as given by Mallat and Zhang [37], to the case of Gaussian chirplets. Firstly, we show that the chirplet defined in (2.13) is unitary, i.e., $\|g_I\|^2 = 1$. We then prove (2.18) by mathematical induction. Indeed, by definition,

\[
\begin{aligned}
\|g_I\|^2 &= \langle g_I, g_I \rangle \\
&= \int_{-\infty}^{+\infty} \left| \frac{1}{\sqrt{\sqrt{\pi}\,\Delta_t}}\, e^{-\frac{1}{2}\left(\frac{t-t_c}{\Delta_t}\right)^2} \cdot e^{\,j\left[c(t-t_c)+\omega_c\right](t-t_c)} \right|^2 dt \\
&= \frac{1}{\sqrt{\pi}\,\Delta_t} \int_{-\infty}^{+\infty} e^{-\left(\frac{t-t_c}{\Delta_t}\right)^2} dt \\
&= 1.
\end{aligned}
\tag{A.1}
\]

Now, at $n = 0$, by using the fact that $g_{I_0}$ is orthogonal to $Rf$, that is,

\[
\begin{aligned}
\langle Rf, g_{I_0} \rangle &= \langle f - a_{I_0} g_{I_0},\, g_{I_0} \rangle \\
&= \langle f, g_{I_0} \rangle - a_{I_0} \langle g_{I_0}, g_{I_0} \rangle = \langle f, g_{I_0} \rangle - a_{I_0} \\
&= 0,
\end{aligned}
\tag{A.2}
\]

we have

\[
\begin{aligned}
\|f\|^2 &= \langle f, f \rangle = \langle a_{I_0} g_{I_0} + Rf,\; a_{I_0} g_{I_0} + Rf \rangle \\
&= \langle a_{I_0} g_{I_0}, a_{I_0} g_{I_0} \rangle + \langle Rf, Rf \rangle + \langle a_{I_0} g_{I_0}, Rf \rangle + \langle Rf, a_{I_0} g_{I_0} \rangle \\
&= |a_{I_0}|^2 + \|Rf\|^2 .
\end{aligned}
\tag{A.3}
\]

Therefore, (2.18) holds at $n = 0$. Clearly, (A.2) can be generalized to show that $g_{I_P}$ is orthogonal to $R^{P+1}f$,

\[
\langle R^{P+1} f, g_{I_P} \rangle = 0, \qquad P = 0, 1, 2, \ldots
\tag{A.4}
\]

(2.16) and (A.4) yield energy conservation at each step,

\[
\|R^{P} f\|^2 = |a_{I_P}|^2 + \|R^{P+1} f\|^2 .
\tag{A.5}
\]

Suppose (2.18) also holds at $n = P - 1$, i.e.,

\[
\begin{aligned}
\|f\|^2 &= \sum_{n=0}^{P-1} |a_{I_n}|^2 + \|R^{P} f\|^2 \\
&= \sum_{n=0}^{P-1} |a_{I_n}|^2 + |a_{I_P}|^2 + \|R^{P+1} f\|^2 \\
&= \sum_{n=0}^{P} |a_{I_n}|^2 + \|R^{P+1} f\|^2 .
\end{aligned}
\tag{A.6}
\]

We have employed (A.5) in the derivation. (A.6) shows that (2.18) holds at $n = P$. We thus conclude that the energy conservation of the decomposition is valid for any natural number $P = 0, 1, 2, \ldots$

The behavior of the residue $R^{P}f$ (as $P$ increases) is an interesting and important issue. For example, this information will help set the threshold used to discriminate signal from noise. It has been shown that, for decomposition using Gabor logons, the residues converge to a chaotic attractor of a stochastic process called “dictionary noise” [81]. However, the behavior of the residue when using chirplets remains largely unknown. Although a similar property is expected, it needs to be studied more precisely in the case of chirplet decomposition.
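The energy conservation can also be checked numerically. The Python sketch below runs a few greedy pursuit steps over a small dictionary of sampled Gaussian chirplets and compares the signal energy with the sum of the squared coefficient magnitudes plus the residual energy. It is an illustration only: the dictionary, the test signal and the function names are hypothetical, and each sampled chirplet is renormalised to unit discrete energy, which is an assumption of this sketch rather than a step described in the text.

```python
import cmath
import math
import random


def chirplet(n, t_c, w_c, c, d_t):
    """Sampled Gaussian chirplet, renormalised to unit discrete energy."""
    g = []
    for t in range(n):
        tau = t - t_c
        g.append(math.exp(-0.5 * (tau / d_t) ** 2)
                 * cmath.exp(1j * (c * tau + w_c) * tau))
    norm = math.sqrt(sum(abs(v) ** 2 for v in g))
    return [v / norm for v in g]


def inner(x, y):
    """Inner product <x, y> = sum over t of x(t) * conj(y(t))."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def energy(x):
    return sum(abs(v) ** 2 for v in x)


def pursuit(f, dictionary, n_steps):
    """Greedy decomposition: at each step, project the residual onto the
    dictionary atom with the largest coefficient magnitude."""
    residual = list(f)
    coeffs = []
    for _ in range(n_steps):
        best = max(dictionary, key=lambda g: abs(inner(residual, g)))
        a = inner(residual, best)
        residual = [r - a * v for r, v in zip(residual, best)]
        coeffs.append(a)
    return coeffs, residual


random.seed(0)
n = 256
dictionary = [
    chirplet(n, 64.0, 0.3, 0.001, 10.0),
    chirplet(n, 128.0, 0.5, 0.0, 40.0),
    chirplet(n, 190.0, 0.2, -0.002, 15.0),
    chirplet(n, 128.0, 1.0, 0.0005, 25.0),
]
noise = [complex(random.gauss(0.0, 0.1), random.gauss(0.0, 0.1)) for _ in range(n)]
f = [2.0 * g0 + w for g0, w in zip(dictionary[0], noise)]

coeffs, residual = pursuit(f, dictionary, 3)
lhs = energy(f)
rhs = sum(abs(a) ** 2 for a in coeffs) + energy(residual)
print(lhs, rhs)
```

The two printed values should agree to within floating-point rounding, since each step removes exactly the projection of the residual onto a unit-energy atom.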

131 Appendices

Appendix B. CRLBs of Chirplet Estimates

We provide in this section a straightforward approach to the calculation of the CRLBs of the estimates of the signal model described in (2.27)-(2.30). The specific process is formulated according to the theorem given by Kay [11]. Some approximations are provided at first and then the CRLBs are found as the principal diagonal of the inverse Fisher information matrix. We will use the following approximations, which are obtained from the moments of the normal distribution and valid as long as the signal is well within the recording interval and adequately sampled.

2 NN−−11l ⎡ ⎤ l 2 ()tt− ⎛⎞tt− ()()ttgtt−−= c exp ⎢−⎜ c ⎟ ⎥ ∑∑cc πΔΔ⎢ ⎜ ⎟⎟⎥ tt==00tt⎣⎢ ⎝⎠⎦⎥ ⎛⎞l + 1 ⎡⎤11+ ()−Γl ⎜ ⎟ ⎢⎥⎣⎦⎜⎝⎠⎟ ≈Δ2 l (A.7) 2 π t ⎪⎧1,l = 0 ⎪ ⎪0,l = 1, 3 ⎪ = ⎨⎪ ⎪ 2 ⎪Δt /2,l = 2 ⎪ ⎪3/4,4,Δ4 l = ⎩⎪ t

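\[
\sum_{t=0}^{N-1} \frac{\partial |g(t)|^2}{\partial t_c}
= \sum_{t=0}^{N-1} 2\,\mathrm{Re}\left\{ g(t) \frac{\partial g^*(t)}{\partial t_c} \right\}
= 2\,\mathrm{Re}\left\langle \mathbf{g}, \frac{\partial \mathbf{g}}{\partial t_c} \right\rangle \approx 0,
\tag{A.8}
\]

\[
\sum_{t=0}^{N-1} \frac{\partial |g(t)|^2}{\partial \Delta_t}
= 2\,\mathrm{Re}\left\langle \mathbf{g}, \frac{\partial \mathbf{g}}{\partial \Delta_t} \right\rangle \approx 0,
\tag{A.9}
\]

where $\mathbf{g} = \{g(t)\}_{t=0}^{N-1}$. Also, we have some results of expectations:

\[
E[f(t)] = a\,g(t), \qquad E[\mathbf{f}] = a\,\mathbf{g}
\tag{A.10}
\]

and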


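\[
E\left[ |f(t) - a\,g(t)|^2 \right] = 2\sigma^2, \qquad
E\left[ \|\mathbf{f} - a\,\mathbf{g}\|^2 \right] = 2N\sigma^2,
\tag{A.11}
\]

where $E[\cdot]$ denotes the expected value. The seven deterministic but unknown parameters are

\[
\theta = \{\sigma^2, A, \Delta_t, c, t_c, \omega_c, \phi\},
\tag{A.12}
\]

and the log likelihood function is

\[
\begin{aligned}
l(\mathbf{f}; \theta) &= \log p(\mathbf{f}; \theta) \\
&= -N \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{t=0}^{N-1} |f(t) - a\,g(t)|^2 \\
&= -N \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \|\mathbf{f}\|^2 - \frac{A^2}{2\sigma^2} + \frac{1}{\sigma^2} \mathrm{Re}\langle \mathbf{f}, a\,\mathbf{g} \rangle,
\end{aligned}
\tag{A.13}
\]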

where $\mathbf{f} = \{f(t)\}_{t=0}^{N-1}$. The Fisher information matrix $I(\theta)$ is given as

\[
[I(\theta)]_{ij} = -E\left[ \frac{\partial^2 l(\mathbf{f}; \theta)}{\partial\theta_i \, \partial\theta_j} \right], \qquad i, j = 1, 2, \ldots, 7.
\tag{A.14}
\]

Note that $I(\theta)$ is symmetric positive definite, and hence we need to calculate only the upper triangle of the matrix. With the help of (A.7)-(A.11), the Fisher information matrix is found as


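\[
I(\theta) = \frac{A^2}{\sigma^2} \times
\begin{bmatrix}
\dfrac{N}{A^2\sigma^2} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \dfrac{1}{A^2} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \dfrac{1}{2\Delta_t^2} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \dfrac{3\Delta_t^4}{16} & -\dfrac{\Delta_t^2 \omega_c}{4} & 0 & \dfrac{\Delta_t^2}{4} \\
0 & 0 & 0 & -\dfrac{\Delta_t^2 \omega_c}{4} & \dfrac{1 + 2\Delta_t^2 \omega_c^2 + c^2 \Delta_t^4}{2\Delta_t^2} & -\dfrac{c\,\Delta_t^2}{2} & -\omega_c \\
0 & 0 & 0 & 0 & -\dfrac{c\,\Delta_t^2}{2} & \dfrac{\Delta_t^2}{2} & 0 \\
0 & 0 & 0 & \dfrac{\Delta_t^2}{4} & -\omega_c & 0 & 1
\end{bmatrix}.
\tag{A.15}
\]

The inverse of $I(\theta)$ can be readily found to get the CRLBs of the estimates of $\theta$, i.e., $[I^{-1}(\theta)]_{ii}$, $i = 1, 2, \ldots, 7$, which have been summarized in Table II.¹⁹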

¹⁹ We recently became aware of similar results developed independently by Ainsleigh et al. [83], but no detailed derivations were presented there.
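The step from the Fisher information matrix to the CRLBs, taking the diagonal of the inverse, can be sketched numerically. The code below is an illustration only: the matrix entries are an assumption, obtained by applying the moment approximations (A.7) to a unit-energy Gaussian chirplet with phase $c(t-t_c)^2/2 + \omega_c(t-t_c) + \phi$, with the parameter order $(\sigma^2, A, \Delta_t, c, t_c, \omega_c, \phi)$; they should be checked against the typeset (A.15). The function names and the numerical values are hypothetical.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fisher_matrix(sigma2, A, dt, c, wc, N):
    """Assumed Fisher matrix for theta = (sigma^2, A, dt, c, t_c, w_c, phi)."""
    d2 = dt * dt
    k = A * A / sigma2  # common prefactor A^2 / sigma^2
    core = [
        [N / (A * A * sigma2), 0, 0, 0, 0, 0, 0],
        [0, 1 / (A * A), 0, 0, 0, 0, 0],
        [0, 0, 1 / (2 * d2), 0, 0, 0, 0],
        [0, 0, 0, 3 * d2 * d2 / 16, -d2 * wc / 4, 0, d2 / 4],
        [0, 0, 0, -d2 * wc / 4,
         (1 + 2 * d2 * wc * wc + c * c * d2 * d2) / (2 * d2), -c * d2 / 2, -wc],
        [0, 0, 0, 0, -c * d2 / 2, d2 / 2, 0],
        [0, 0, 0, d2 / 4, -wc, 0, 1],
    ]
    return [[k * x for x in row] for row in core]


def crlbs(fim):
    """CRLBs are the principal diagonal of the inverse Fisher matrix."""
    inv = invert(fim)
    return [inv[i][i] for i in range(len(inv))]


# Hypothetical parameter values, for illustration only.
sigma2, A, dt, c, wc, N = 0.5, 2.0, 10.0, 0.01, 0.3, 256
bounds = crlbs(fisher_matrix(sigma2, A, dt, c, wc, N))
for name, b in zip(["sigma^2", "A", "dt", "c", "t_c", "w_c", "phi"], bounds):
    print(name, b)
```

Because the first three parameters decouple from the rest in this matrix, their bounds are simply the reciprocals of the corresponding diagonal entries, while the bounds for $c$, $t_c$, $\omega_c$ and $\phi$ require the inverse of the coupled 4-by-4 block.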

Appendix C. Human Experimentation Protocol

University of Toronto
RESEARCH SERVICES – ETHICS REVIEW UNIT

Human Experimentation Protocol: Visual Evoked Potential Measurement

ETHICS PROTOCOL SUBMISSION

Date: October 11, 2004

Submitted by: Jie Cui 416-978-6170 Fax: 416-978-4317 Email: [email protected]

Willy Wong 416-978-8734 Fax: 416-946-8734 Email: [email protected]

Hans Kunov 416-978-6712 Fax: 416-978-4317 Email: [email protected]

Institute of Biomaterials and Biomedical Engineering Rosebrugh Building University of Toronto

Submitted to: Ethics Review Unit University of Toronto Research Services Simcoe Hall, University of Toronto

Expedited Review

The work described in this protocol represents minimal risk, and we respectfully request expedited review. No invasive method is employed. All spontaneous and evoked potential signals are obtained with standard scalp surface electrodes and amplified through a high-performance AC amplifier that satisfies the EMC standards and directives for medical device safety. There are no known risks or harmful effects from this research.


1. Background and Objectives

Background

The Sensory Communication Group in the Institute of Biomaterials and Biomedical Engineering has been engaged in research on sensory systems and in the development of related instrumentation for many years. We plan to continue this activity. The research work involves the design and testing of a brain-computer interface using visual evoked potentials. A brain-computer interface is an alternative communication and control channel that does not use the brain’s normal neuromuscular output pathways. Previous studies have shown that the amplitudes of specific frequency components of visual evoked potentials can be consciously modified by the subject’s attention when the eyes are exposed to periodic stimulation. By continuously monitoring the amplitude of a specific frequency component of the visual evoked potentials, a computer can be used to test whether or not a user is paying attention to the visual stimulation. Paying attention may represent a user’s command to the environment, such as turning on a light. In this way, a user can transmit his/her intention without using his/her normal neuromuscular output pathways. The immediate goal of this system is to provide patients who suffer from severe neuromuscular disabilities, and who cannot adopt a traditional augmentative treatment, with an alternative communication and control channel.

Objectives

- To record visual evoked potentials elicited by repetitive visual stimuli using scalp surface electrodes.
- To extract the spectral patterns of visual evoked potentials under different mental states (attended or unattended stimulus).
- To observe the dynamic build-up of steady-state visual evoked potentials.
- To validate theories on communication channels using brain waves.

2. Research Methodology

Instrumentation

All the instrumentation used is standard electroencephalographic (EEG) equipment identical to that normally used in the clinic, which is designed to meet or exceed applicable standards. Great care is taken to ensure that all instruments are calibrated and functioning before each test. The equipment involved is listed in the following table.

Type            Description                                Quantity
EIM 105-30      Bio-potential Electrode Impedance Meter    1
445-10G-48TP    EEG Gold Plated Electrodes                 3
RPS312          Grass Regulated Power Supply               1
CP511           Grass High Performance AC Preamplifier     1
PCI-4451        NI Data Acquisition Card                   1
VP150m          ViewSonic LCD Monitor                      1
PC              Computer and Software                      1

Stimulus

The visual stimulus or display is generated using a conventional LCD or CRT monitor connected to a desktop personal computer. The stimuli that will be presented to subjects are periodic signals with repetition frequencies ranging from 0.1 Hz to 50 Hz. The display of the stimuli will subtend a visual angle of about 10.5° with a contrast of around 0.4. The subjects will view the display binocularly from a distance of 50 cm in a darkened room. The screen will be masked to reveal a 10.5° square field. The display has an approximate mean luminance of 50 cd/m². The luminance and the brightness of all stimuli are in normal ranges. In no instance will extreme exposure to light be attempted. The specific stimuli presented to the subjects will include one or more of the following types:

i. Matrix of Moving Bars (MMB). It consists of 16 crosses spread across a 10.5° × 10.5° display. Each bar is 1.2° high and 0.2° wide. The center-to-center spacing of the crosses is 2.5°. The 16 horizontal bars form the horizontal bar matrix, and the 16 vertical bars form the vertical bar matrix (see Fig. 1(a)).
ii. Flickering. The size of the display is the same as for MMB. It consists of a white square and a black square, which will be displayed in sequence. The white square is displayed for 50% of each period, and the black square for the other 50%.
iii. Pattern reversal. The checker-board reversal pattern is frequently used in clinical examination (see Fig. 1(b)). The size of the display is the same as for MMB. The size of each check is 0.4°.
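The visual angles quoted for the stimuli can be converted to physical sizes on the screen for the stated viewing distance of 50 cm. The sketch below uses the standard relation size = 2 d tan(θ/2) for an object centered on the line of sight; the function name is a hypothetical helper and is not part of the protocol.

```python
import math


def stimulus_size_cm(angle_deg, distance_cm):
    """Physical extent that subtends `angle_deg` at a viewing distance `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


VIEWING_DISTANCE_CM = 50.0  # from the protocol

field = stimulus_size_cm(10.5, VIEWING_DISTANCE_CM)      # masked square field
bar_height = stimulus_size_cm(1.2, VIEWING_DISTANCE_CM)  # one MMB bar
check = stimulus_size_cm(0.4, VIEWING_DISTANCE_CM)       # one checker-board check
print(round(field, 2), round(bar_height, 2), round(check, 2))
```

Under these assumptions the 10.5° field corresponds to a square of roughly 9.2 cm on a side.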


Fig. 1. Patterns of visual stimuli. (a) Matrix of Moving Bars; (b) Checker-board Reversal.

Recording and Analysis

For each pattern of stimulus described above, we will collect data under two different conditions of attention: attending to the stimulus or not attending to it. Before each trial starts, the subject is instructed whether or not to attend to the stimulus. Each subject will first be trained with about 4 to 6 trials before the session starts, to be sure that he/she understands the task and can perform it correctly. The data are collected from each subject for a total of 50 single trials, with each trial lasting 6 seconds. Visual evoked potentials are recorded with a standard set of three Grass™ gold-cup surface electrodes. The active electrode is placed at the Oz position (3 cm above the inion) designated in the international 10-20 EEG system, and the reference electrode is placed on the right ear lobe. The third electrode is placed on the left ear lobe and connected to the ground line of the amplifier. After amplification and filtering, the signals are sampled at 250 Hz and A/D converted. The digitized data are stored on the computer and then go through the signal processing procedure to extract spectral information using standard methods such as the short-time Fourier transform (STFT).

Subject Response

All responses of the subject will be physical ones, measured by the instrument. No psychological and/or psychophysical response will be measured.
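The recording parameters above (250 Hz sampling, 6-second trials) fix the trial length at 1500 samples. As an illustration of monitoring the amplitude of one frequency component over time, the sketch below evaluates a single-frequency discrete Fourier sum in consecutive one-second windows. It is a simplified stand-in for a full STFT, and the 10 Hz test signal, its amplitude, and the window length are hypothetical choices rather than values from the protocol.

```python
import cmath
import math

FS = 250.0         # sampling rate in Hz (from the protocol)
TRIAL_SECONDS = 6  # trial length in seconds (from the protocol)


def component_amplitude(samples, freq_hz, fs=FS):
    """Amplitude of the `freq_hz` component, estimated with a single-bin DFT."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * k / fs)
              for k, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def short_time_amplitude(samples, freq_hz, window, fs=FS):
    """Amplitude of one frequency in consecutive non-overlapping windows."""
    return [component_amplitude(samples[i:i + window], freq_hz, fs)
            for i in range(0, len(samples) - window + 1, window)]


n_samples = int(FS * TRIAL_SECONDS)  # 1500 samples per trial

# Synthetic stand-in for a response: silent for 2 s, then a 10 Hz sinusoid
# of amplitude 3 (hypothetical values, no noise).
trial = [0.0 if k < 500 else 3.0 * math.sin(2 * math.pi * 10.0 * k / FS)
         for k in range(n_samples)]

amplitudes = short_time_amplitude(trial, 10.0, window=250)
print(n_samples, [round(a, 3) for a in amplitudes])
```

With real recordings the window length trades time resolution against frequency resolution, which is the limitation of fixed-window methods that motivates the adaptive approach developed in this work.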

3. Participants Selection


Subjects will often be the researchers themselves, or fellow students and staff in the laboratory. As subjects we will select both normal people and people with diagnosed visual and neuromuscular pathologies. Each subject will be given a full explanation of the purpose of the experiment by a principal investigator or one of their assistants.

Instructions to participants

Specifically, before any test is performed, the subject or his/her guardian will be given the following written information:
i. The title of the project
ii. Identification of the investigators
iii. A brief but complete description, in lay language, of the purpose of the experiment and the experimental procedure
iv. A statement of all known side effects with an estimate of the probability of their occurrence
v. Assurance that the identity of the subject will be kept confidential and a description of how this will be accomplished
vi. A statement of the total amount of time that will be required of a subject (beyond that needed for treatment in the case of patients)
vii. Details of monetary compensation to be offered to the subjects
viii. An offer to answer any inquiries concerning the procedure to ensure that it is fully understood by the subject and/or guardian
ix. An unambiguous statement that the subject may decline to enter or withdraw from the study at any time and, in the case of patients, without consequence to continuing medical care
x. Signature of the subject or guardian consenting to participate in the research project and acknowledging receipt of a copy of the consent form including all attachments
xi. Signature of a witness

Sample Size

For engineering development purposes, the sample size is often one, typically the researcher him/herself. For research and verification purposes, the number of subjects will be between five and twenty.

Inclusion/Exclusion criteria


Participants must be 18 years of age or older. Minors will NOT be included in this study at the current stage.

4. Recruitment

Participants will be recruited from graduate students in the Institute of Biomaterials and Biomedical Engineering, or through posters and advertisements on the University campus. The principal investigators may have a collegial relationship with the participants.

5. Risks and benefits

Risks

There are no known risks from these experiments. The risks to the subjects are no different from those to which they are exposed during a normal clinical electrophysiological examination of vision. The experiments are non-invasive. The following is a list of potential risks and how they have been eliminated:

Risk                                       Safeguards
Electric shock                             The power supply and EEG pre-amplifier meet the safety standards and directives (EN60601-1, EN50082-1, EN55011, 93/42/EEC, 89/336/EEC) for medical equipment.
Infection from scalp surface electrodes    The skin area contacting the electrodes will be prepared with standard skin preparation gels and conductive paste. After the experiment, the area will be promptly cleaned with warm water.
Light exposure                             The brightness of the monitor displaying the visual stimulus is limited to the normal range.
Epilepsy                                   Potential subjects with a history of seizures or related disorders will be excluded.

Benefits

There are no direct benefits to the subjects or patients themselves, except for a modest monetary compensation (about $10 per hour).


There are considerable benefits from the experiments to the University and to the research effort of the investigators:
- The experiment will provide valuable insight into the dynamic process of the visual system by analyzing visual evoked potentials.
- Through the work, we will be able to develop more efficient methods for implementing a brain-computer interface. The aim of the system is to provide patients suffering from severe neuromuscular disabilities with an alternative means of communication and control.

6. Privacy and confidentiality

All data collected will be kept confidential, and all records used will be coded so that identification of individuals is not possible. The data will be kept in a locked office and will eventually be securely destroyed. Access to the data will be limited to members of the research team.

7. Compensation

Participants will be reimbursed about $10 per hour or fraction thereof. There are no other benefits.

8. Conflicts of interest

There is no identified conflict of interest at present.

9. Informed Consent Process

Free and informed consent will be obtained in writing. No manipulation or undue influence of participants will be permitted. Participants will be fully informed about the research program, its purpose, the identity of the researcher, and the duration and nature of participation. Potential harms and benefits will be explained. It will be made clear to the participant that he/she can withdraw from the research at any time without consequences. The consent form is attached below.

10. Scholarly review

N/A


11. Additional ethics reviews

N/A

12. Contracts

N/A

13. Clinical Trials

N/A


Consent Form

(Departmental letterhead)
Institute of Biomaterials and Biomedical Engineering, University of Toronto
Rosebrugh Building, 4 Taddle Creek Road, Toronto, Ontario M5S 3G9

Consent Form

Title of research project: Visual evoked potential based brain-computer communication interface

Investigators:
Principal Investigator: Mr. Jie Cui, Phone: 416-978-6170 Email: [email protected]
Supervisors: Dr. Willy Wong, Phone: 416-978-8734 Email: [email protected]
Dr. Hans Kunov, Phone: 416-978-6712 Email: [email protected]
Please feel free to contact the persons above if any questions or problems arise.

Sponsor or funding: Currently, there is no sponsor or funding for this study.

Background & purpose of research: This experiment is part of Jie Cui’s Ph.D. thesis. The purpose of the study is to gather data on the eye and brain by recording electrical signals on the surface of the scalp. The results will be used to explore the possibility of using brain waves to communicate with a computer. The immediate goal of this study is to help those patients suffering from severe disabilities to control their environment.


Eligibility: To participate in this study you must be 18 years old or older.

Procedures: All experiments will take place in the Institute of Biomaterials and Biomedical Engineering, Rosebrugh Building, 4 Taddle Creek Road, University of Toronto. You will be asked to look at one or all of three different patterns displayed on a computer monitor. You will be instructed either to attend or not to attend to the patterns presented to you. Each trial will last 8 seconds. A total of 50 trials will be recorded for one pattern. The total time required will be around 40 minutes. You will be asked to remain quiet during the experiment. Two of the three EEG electrodes will be placed on your right and left ear lobes. The third will be placed on the back of your head. All the visual patterns will be explained by the investigator before the experiment. The procedure will be demonstrated to you by the investigator in the training session before the experiment. The training session will last about 10 minutes. It is important that you understand the patterns and the procedure before beginning the experiment.

Voluntary participation & early withdrawal: It is understood that participation in this study is voluntary. You are assured that you may decline to enter, and that you may withdraw from the study at any time without any consequences whatsoever.

Early termination: You will be informed if the investigators determine circumstances which may require the termination of your involvement, or the study. The relevant information should be disclosed to you.

Risks & benefits: You understand that there are no known side effects to the procedure. You also understand that you will not benefit directly from your participation in this study, aside from your monetary compensation of $10.00 per hour or part thereof.


Privacy & confidentiality: Your identity will be kept confidential. All records bearing your name will be kept in a locked file by the principal investigator and/or his supervisors, and will eventually be securely destroyed. Access to the data will be limited to members of the research team. In any published reports of this work, you will be identified only by a coded number. However, you understand that confidentiality can only be guaranteed to the extent permitted by law.

Publication of research findings: The data obtained from the experiments will be published in an aggregate way. No quotations or personal opinions or ideas of the participants will be included in publications.

Possible commercialization of findings: No questions soliciting creative ideas with commercial potential will be asked in the experiments. There is no intention of commercializing any findings at the current stage.

New findings: If anything comes to light during the course of this research which may influence your decision to continue, you will be duly notified.

Compensation: There is no cost to a participant except the time required for the experiment. There is no payment for parking, travel time or inconvenience, aside from the monetary compensation stated in the ‘Risks & benefits’ section.

Right of subjects: You waive no legal rights by participating in this study.

Dissemination of findings:


You may request a copy of the final report or any published documents.

You have discussed with the investigator(s) all questions that you have regarding the test, and have received satisfactory answers to all of them. You have been given a copy of this informed consent form, which you have read and understood and signed, to keep for your own records.

Signature: ______   Printed Name: ______   Date: ______
