Detection and Reconstruction of Random Telegraph Signals Using Machine Learning

Ben Hendrickson, Ralf Widenhorn, Paul R. DeStefano and Erik Bodegom

Abstract — Random Telegraph Signal (RTS) noise is classified as stochastic discrete jumps in what would otherwise be a constant electrical signal. This phenomenon has been known and studied in semiconductor devices for decades. More recently it has been shown that a new flavor of RTS occurs in the photosensitive area of silicon image sensors. Here, a method of RTS signal detection and reconstruction is presented. This method is built on machine learning techniques for classification and denoising of one-dimensional time series. The model is trained on a simulated dataset in order to provide certainty in the fidelity of corresponding 'clean' and 'noisy' signals, and tested on a similar set of signals. In addition, a set of real signals collected from a CCD sensor was used to provide a qualitative description of the model's efficacy.

Keywords—machine learning, random telegraph signal, convolutional neural network, denoising autoencoder, signal reconstruction

I. INTRODUCTION

A. Current state of art for RTS analysis

RTS noise is a well-studied phenomenon, typically the consequence of radiation damage in silicon devices [1-5]. A common manifestation of it comes in the form of discrete changes in the generation rate of leakage current, known in image sensors as dark current [6]. RTS noise in image sensors has been previously analyzed in a number of ways. Usually a lengthy time series is created for pixels of interest by collecting many (a few thousand) frames at regular intervals. This series is then analyzed to identify RTS behavior and extract characteristics of interest. A variety of strategies have been used to analyze RTS pixels, including visual inspection [7], histograms of the time series [8], and signal reconstruction by wavelet denoising [9]. The most popular strategy over the last decade was created by Goiffon et al. [10] and aims to detect and reconstruct an RTS signal via convolutional filtering. There, a step-shaped filter is convolved along a signal to suppress Gaussian noise and produce large spikes where RTS transitions occur. Mean signal values are collected between spike locations to reconstruct the signal without the white noise. In this paper we present a novel approach that takes advantage of recent progress in machine learning techniques.

B. Math of RTS noise and Reconstruction Goals

RTS is defined by stochastic transitions between one of two states, defined here as state 0 and state 1, the low and high states respectively. Represented mathematically, the state s at some given time t is either s(t) = 0 or s(t) = 1. It has been suggested that there may be RTS centers with more than two discrete levels [11], but we will not address them here. An RTS signal collected from image sensor dark frames will have two noise contributors, the RTS transitions and Gaussian white noise. Since these noise sources are assumed independent from one another in our model, their respective variances add together to determine the total noise of the signal such that:

σ_SIG² = σ_Gaussian² + σ_RTS²

The magnitude of the signal at some time t is written as:

x(t) = x₀ + ε(t) + A_RTS · s(t)

where x₀ is the dark current baseline, ε(t) is the Gaussian dark current noise contribution at time t, s(t) is the RTS state at time t, and A_RTS is the RTS amplitude.

In general, a single pixel can have many RTS defect centers. In this case the signal at some time t is described by:

x(t) = x₀ + ε(t) + Σ_{i=1}^{n} A_{RTS,i} · s_i(t)

where n is the number of RTS defect centers located in the pixel. For simplicity, this paper will assume that if a pixel displays RTS behavior, the number of RTS defect centers in that pixel is always equal to one. This is indeed the case for the majority of pixels that have RTS noise, except in cases of very high irradiation levels [12].

C. Machine Learning Classification

The goal in building a classification model is to take a set of data made of many categories and accurately separate it into its different types. This classification model was trained to differentiate RTS signals from non-RTS signals. A signal is represented as a vector and passed through various layers of operators or functions to produce, in this case, a single output (zero for RTS pixels or one for non-RTS pixels). This is similar to the way that image classification is performed, and similar to machine learning classification methods previously used for one-dimensional digital signals [13-17]. A typical convolutional classification model [18] will include convolutional, pooling, dropout, and fully-connected layers; here, each is addressed in turn.

1) Convolutional Layers

Convolutional layers apply filters to extract prominent features that are representative of distinctive characteristics, such as RTS transitions. As the signal is passed forward through the network, each neuron (or filter, or kernel) is convolved with the signal, creating a feature map that is the same size as the input [19]. Finally, an activation function is applied to each filter. This function ensures that each convolution is, in the end, a non-linear operation. The activation function used here is the rectified linear unit (ReLU) function [20], which returns zero for negative inputs, and the input value itself for positives [21]:

f(x)_ReLU = max{0, x}

Convolutional layers that are stacked after the initial layer operate upon the feature maps produced by the previous layers. The shapes of the filters, or weights of the neurons, are continuously changed during the training process by back-propagation, to be discussed later.

2) Pooling Layers

Pooling layers reduce the dimensionality of the vector by down-sampling the feature maps. Pooling layers typically appear directly following a convolutional layer. While there are a variety of pooling techniques, our classification scheme uses "max-pooling." Essentially, max-pooling is a form of compression that inspects a section of a feature map, say elements 7, 8, and 9, finds the largest value amongst the three, and tosses the other two values out. Pooling not only eases the computational stress of training a model by reducing the number of parameters, but also provides spatial invariance of important features [22].

3) Dropout Layers

Dropout layers turn off a percentage of neurons, or filters, during training. This prevents filters from becoming dependent on the presence of neighboring filters to optimize the model. This interdependence leads to overfitting. An overfit model will perform very well on the data it is trained on, but will perform poorly on data in general [23].

4) Fully Connected Layers

The final layer in a classification model is a fully connected layer. Each neuron in this layer, as the name suggests, is connected to every output from the previous layer. This layer forms a vector where each element represents a confidence score corresponding to a distinct class. This model has a final layer of size two, where one neuron represents the confidence of a signal containing RTS noise, and the other the confidence of a signal not containing RTS.

D. Classifier Training

When the model is first initialized for training, the coefficients of each filter, or the shape of each filter, are randomized. Then, one by one, members of the training set are passed through the network and assigned a confidence of RTS versus non-RTS. Because this is supervised training, the confidence score is checked against the given label for the signal, again 0 for RTS and 1 for non-RTS signals. The error of the confidence score is calculated by using the binary cross-entropy loss function, defined below, and improved by updating the filter and activation weights by means of back-propagation [24].

1) Binary Cross Entropy

The loss function used for classification is binary cross entropy, E. Here, t_i is the target label, 0 for RTS, 1 for non-RTS. y_i is the probability of the signal being non-RTS according to the model. Notice that if the target and the probability are close to one another, the error is close to zero [25].

E = − Σ_{i=1}^{n} [ t_i log(y_i) + (1 − t_i) log(1 − y_i) ]

E. Denoising Autoencoder

Once the signal is run through the classification model, and if it is determined to have RTS transitions, the signal has its white noise component suppressed by means of a denoising autoencoder (DAE). The autoencoder shares some features of the classifier, e.g., convolutional layers, pooling layers, etc. Rather than attempting to identify the kind of signal (RTS vs. non-RTS), it takes the 'incorrect' signal as an input and attempts to return the 'correct' one. In this case, the autoencoder takes an RTS signal with Gaussian noise, and returns a signal with suppressed noise.

In practice, a clean RTS signal, x, is created by simulation. Then, Gaussian noise is added on top to produce signal x̃. This signal is then encoded by running it through convolutional and pooling layers to extract pertinent features and compress it. The now encoded signal, or rather feature map, is then decoded by again running it through convolutional layers, but now using up-sampling rather than pooling. The up-sampling returns the signal to its original size by adding elements with value equal to zero. Adding these zeros forces the autoencoder to learn the relationships between non-zero values. Finally, the signal is passed through a fully connected layer that produces a denoised reconstruction of the input signal x̂, as seen in figure 1. Just like with the classifier, the result is measured against the ground truth, or in this case the original clean signal x, by again using a loss function. For the autoencoder the loss function is a simple mean squared error comparison between each element of the clean signal x and the denoised x̂ [26-30].
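The layer mechanics described above can be made concrete with a minimal NumPy sketch (not the trained model): a 'same'-size 1-D convolution, the ReLU activation, and non-overlapping max-pooling. The kernel values here are arbitrary placeholders standing in for learned weights.

```python
import numpy as np

def conv1d_same(signal, kernel):
    """'Same'-size 1-D convolution: the feature map matches the input length."""
    return np.convolve(signal, kernel, mode="same")

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity for positives."""
    return np.maximum(0, x)

def max_pool(x, size=3):
    """Non-overlapping max-pooling: keep only the largest value in each window."""
    trimmed = x[: len(x) // size * size]       # drop any ragged tail
    return trimmed.reshape(-1, size).max(axis=1)

# A toy step edge (an idealized RTS transition) and an edge-sensitive kernel.
signal = np.array([0., 0., 0., 0., 1., 1., 1., 1.])
kernel = np.array([1., 0., -1.])               # np.convolve flips the kernel,
                                               # so this responds at the rising edge
feature_map = relu(conv1d_same(signal, kernel))
pooled = max_pool(feature_map, size=2)
```

The ReLU keeps only the positive edge response; pooling then halves the feature map while preserving the spike that marks the transition.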

Figure 2: Topology of the RTS classification model

Figure 1: An RTS signal before (blue) and after (orange) passing through the AE. A significant increase in signal to noise is obvious.

II. MODEL TOPOLOGY AND ALGORITHM METHODOLOGY

This RTS signal detection and reconstruction scheme was developed in Python and MATLAB using the concepts outlined in the previous section. All machine learning modeling was performed in Python, while the data preparation and reconstruction finalization were performed in MATLAB. This section outlines the specific choices made with respect to model architecture and signal processing to carry out the goal of accurate detection and reconstruction.

A. Classifier Summary

The classification network, shown in figure 2, was developed in Python using Keras [31] as a wrapper over TensorFlow [32]. Its layers are structured as: Conv(32) → Pool(3) → Drop(0.5) → Conv(64) → Pool(3) → Drop(0.5) → Conv(128) → Pool() → Drop(0.5) → Fully Connected(1). The convolutional layers have 32, 64, and 128 filters respectively, with the size of each filter set to 12. Each uses the ReLU activation function. The first two pooling layers take the maximum value, while the last takes an average. The dropout rate is set to 50%. The final layer uses the sigmoid activation function. Training was carried out over five epochs.

B. Autoencoder Summary

The denoising autoencoder model, shown in figure 3, was likewise built in Python using Keras as a wrapper over TensorFlow. Its layers are structured as: Conv(64) → Pool(3) → Conv(32) → Pool(3) → Conv(32) → Upsample(3) → Conv(64) → Upsample(3) → Fully Connected(1500). The convolutional layers have 64, 32, 32, and 64 filters respectively, while the size of each filter is again set to 12. Each uses the rectified linear unit activation function. The final fully connected layer uses a linear activation function. Training was carried out over five epochs.
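The classifier topology above can be sketched in Keras roughly as follows. This is a reconstruction of the stated layer list, not the authors' exact code; the input length of 1500 samples and the size-3 final pooling window are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the stated classifier topology; input length 1500 is an assumption.
model = keras.Sequential([
    layers.Input(shape=(1500, 1)),
    layers.Conv1D(32, 12, padding="same", activation="relu"),
    layers.MaxPooling1D(3),
    layers.Dropout(0.5),
    layers.Conv1D(64, 12, padding="same", activation="relu"),
    layers.MaxPooling1D(3),
    layers.Dropout(0.5),
    layers.Conv1D(128, 12, padding="same", activation="relu"),
    layers.AveragePooling1D(3),        # the last pooling layer takes an average
    layers.Dropout(0.5),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),  # confidence: 0 = RTS, 1 = non-RTS
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```

The single sigmoid output pairs naturally with the binary cross-entropy loss used for training.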

Figure 3: Topology of the denoising autoencoder.
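The topology in figure 3 can be sketched along the same lines. Note that Keras's `UpSampling1D` repeats values rather than inserting zeros, so it is only a close stand-in for the zero-insertion up-sampling described earlier; the input length of 1500 and the 'same' padding are likewise assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Sketch of the stated autoencoder topology, not the authors' exact code.
autoencoder = keras.Sequential([
    layers.Input(shape=(1500, 1)),
    layers.Conv1D(64, 12, padding="same", activation="relu"),
    layers.MaxPooling1D(3),
    layers.Conv1D(32, 12, padding="same", activation="relu"),
    layers.MaxPooling1D(3, padding="same"),
    layers.Conv1D(32, 12, padding="same", activation="relu"),
    layers.UpSampling1D(3),
    layers.Conv1D(64, 12, padding="same", activation="relu"),
    layers.UpSampling1D(3),
    layers.Flatten(),
    layers.Dense(1500, activation="linear"),  # denoised reconstruction
])
autoencoder.compile(optimizer="adam", loss="mse")  # mean squared error vs. clean x
```

The final fully connected layer restores the exact 1500-sample length regardless of rounding in the pooling and up-sampling stages.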

C. Training Considerations

One of the more problematic aspects of DC-RTS noise is that there are no well-defined limits on amplitude or state lifetime. If either of these key characteristics is sufficiently small, it is difficult, even to the human eye, to distinguish whether or not a signal has RTS transitions, let alone attempt to reconstruct it without Gaussian noise. It then becomes necessary to create a training set with realistic RTS signals that feature a wide variety of amplitudes and state lifetimes. Simulated signals and noise augmentation have been used previously for training networks related to a variety of applications [33-37]. The training set created here has amplitudes from 1 to 450 arbitrary units (AU), and state lifetimes from 1 to 300 samples, as shown in figure 4. Transitions between RTS states are determined by a decaying exponential probability so that they remain stochastic, but average out to the appropriate state lifetime. Lifetimes for the high and low states were set equal to each other for all RTS signals.

Each signal then has Gaussian noise added, with a standard deviation of 75 (arbitrary units), as shown in figure 5. A new quantity is defined as an approximation of signal to noise, which is simply

SNR_RTS = A_RTS / σ_Gaussian

where A_RTS is the RTS amplitude and σ_Gaussian is the Gaussian noise standard deviation. The range of SNR_RTS for the training dataset spans from 1/75 to 6.

In addition to the RTS data set, a collection of non-RTS signals, Gaussian noise only, was produced. These are, of course, used to train the classifier to separate RTS from non-RTS signals. In total 180,000 signals were created, 90,000 with only Gaussian noise and 90,000 with Gaussian noise and RTS transitions featuring a variety of amplitudes and state lifetimes.

Figure 5: A simulated RTS signal before and after adding Gaussian noise
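A training signal of the kind shown in figure 5 can be generated with a short routine like the one below: exponentially distributed dwell times give stochastic transitions that average out to the chosen state lifetime, and Gaussian noise of σ = 75 AU is added on top. This is a sketch of the described procedure, not the authors' generator; the default parameter values are illustrative.

```python
import numpy as np

def simulate_rts(n_samples=1500, amplitude=300.0, lifetime=50.0,
                 sigma=75.0, baseline=0.0, rng=None):
    """Simulate a two-level RTS signal, returning (clean, noisy) arrays.

    Dwell times in each state are drawn from an exponential distribution
    whose mean is `lifetime` (equal for the high and low states).
    """
    rng = np.random.default_rng() if rng is None else rng
    state = np.empty(n_samples)
    i, s = 0, rng.integers(2)                  # random starting state
    while i < n_samples:
        dwell = max(1, int(rng.exponential(lifetime)))
        state[i:i + dwell] = s
        i += dwell
        s = 1 - s                              # toggle between state 0 and 1
    clean = baseline + amplitude * state
    noisy = clean + rng.normal(0.0, sigma, n_samples)
    return clean, noisy

clean, noisy = simulate_rts(rng=np.random.default_rng(0))
```

With amplitude 300 and σ = 75, this example sits at SNR_RTS = 4, comfortably inside the 1/75 to 6 training range.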

Figure 4: The structure of the training set

Finally, before training the models, the signals must be scaled, so that the shape of the signal, not its magnitude, determines the weights of the filters. It was determined that each signal should lie between zero and one, so each signal x is shifted by a value just below its minimum to create x_s:

x_s = x − s ;  s = 0.99 · min(x)

x_s is then divided by a value just above its maximum to create x_sd:

x_sd = x_s / d ;  d = 1.01 · max(x_s)

Since the model is trained on scaled signals, any real data processed by it must undergo the same scaling. In order to ensure the mean signal values remain unchanged, this scaling must be reversible, so a key is maintained that records s and d for each signal x.

Figure 7: The final reconstruction of an RTS signal. There are a total of 2 values for the entire signal, zero Gaussian noise.
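The scaling described above and its inversion reduce to a few lines; the function names here are illustrative.

```python
import numpy as np

def scale(x):
    """Map signal x into (0, 1), returning the (s, d) key needed to invert."""
    s = 0.99 * x.min()
    xs = x - s
    d = 1.01 * xs.max()
    return xs / d, (s, d)

def unscale(x_sd, key):
    """Recover the original signal from the scaled one using the stored key."""
    s, d = key
    return x_sd * d + s

x = np.array([120., 340., 95., 410.])   # example dark-current samples (AU)
x_sd, key = scale(x)
```

Keeping the (s, d) key per signal is what lets mean signal values, and hence amplitudes, be restored exactly after reconstruction.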

D. Gaussian fit level finding

Recall the total noise of an RTS signal is defined as:

σ_SIG² = σ_Gaussian² + σ_RTS²

which shows the Gaussian noise and RTS noise are uncorrelated to one another. Therefore, a histogram of an RTS signal, before and after the autoencoder denoising, will be composed of the sum of Gaussian peaks, one for each level. The reconstruction of an RTS signal is completed by taking a histogram of the autoencoder result, and fitting it [38] as a sum of two Gaussians, as shown in figure 6. The new clean signal, figure 7, is created by snapping each element of the autoencoder output to whichever peak value from the fitted histogram (see figure 6) is closest to that element. From here the RTS amplitudes and state lifetimes are simply collected.

Figure 6: The fitted histogram of the autoencoder results. The final reconstruction uses the values where peaks occur.

III. RESULTS AND DISCUSSION

A. Simulated Dataset

In order to measure the efficacy of the classification model, a validation test was carried out on two additional sets of simulated signals. This test inputs a sample signal to the model, and records the number of correct and incorrect inferences. Like the training sets, each is composed of 90,000 signals. The set of RTS signals has state lifetimes that span from 1 to 300 frames, and amplitudes such that the SNR_RTS runs from 1/75 to 6. The set of non-RTS signals has various Gaussian noise levels. All signals are scaled as described above before running them through the algorithm.

The algorithm detected 83.5% of the RTS signals, and recorded zero false positives from the non-RTS test set. It works remarkably well on RTS signals that have SNR_RTS > 1.5 and lifetimes longer than about 20 frames, as seen in figure 8.

Figure 8: The RTS detection map. Black areas are where the detection model failed.
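The level-finding and snapping steps can be sketched as follows. For brevity this stand-in estimates the two levels from the two tallest, well-separated histogram peaks rather than from a fitted sum of two Gaussians, which is enough to illustrate the snapping operation.

```python
import numpy as np

def snap_to_levels(denoised, bins=100):
    """Snap each sample to the nearer of two histogram peak locations.

    A simplified stand-in for the two-Gaussian fit: the levels are taken as
    the centers of the two most populated, well-separated histogram bins.
    """
    counts, edges = np.histogram(denoised, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    first = centers[np.argmax(counts)]
    # second peak: the tallest bin far enough away from the first
    far = np.abs(centers - first) > (edges[-1] - edges[0]) / 4
    second = centers[far][np.argmax(counts[far])]
    levels = np.sort([first, second])
    # snap each element to whichever level is closer
    return levels[np.argmin(np.abs(denoised[:, None] - levels), axis=1)]

# Toy "denoised" signal: two levels near 0 and 10 with small residual noise
rng = np.random.default_rng(1)
sig = np.where(rng.random(2000) < 0.5, 0.0, 10.0) + rng.normal(0, 0.5, 2000)
snapped = snap_to_levels(sig)
```

The snapped output contains exactly two values, from which the RTS amplitude (level separation) and state lifetimes (run lengths) follow directly.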

Each signal that passed detection was then scored on the quality of reconstruction by means of the sample correlation coefficient. This is a great advantage of testing on a simulated data set since each reconstruction can be directly compared to the original clean signal. The sample correlation coefficient is calculated as

C_xy = Σ(x_i − x̄)(y_i − ȳ) / ( √Σ(x_i − x̄)² · √Σ(y_i − ȳ)² )

where C_xy is the correlation coefficient or score, x_i is the i-th element of the reconstructed signal, x̄ is the mean of the reconstructed signal, y_i is the i-th element of the original clean signal, and ȳ is the mean of the original clean signal. The coefficient lies between −1 and 1, where −1 is perfectly anticorrelated, 0 is uncorrelated, and 1 is perfectly correlated. In practice negative scores are possible, but exceedingly rare. Nearly all RTS signal detections resulted in a highly accurate reconstruction, as seen in figures 9 and 10. The mean correlation score for detected RTS signals is 0.978 [39].

Figure 10: Distribution of Correlation Coefficients. The table of correlation score counts highlights the quality of reconstruction for a great majority of pixels.

B. RTS Image Sensor Data

In addition to testing on simulated signals, the algorithm was used to detect and reconstruct RTS signals collected from a CCD image sensor, shown in figure 11. The sensor is a SITe SI-033AF front side illuminated 1 mega-pixel (1024×1024) CCD [40]. Frames were taken in dark conditions with a 10 second integration time at 305 K. Qualitatively, the algorithm performed well, but was prone to false positive detections triggered by cosmic ray events, which show up as a large spike of dark current in a single frame. Additionally, low frequency noise, caused by temperature fluctuations, hampered the performance of the algorithm. This was mitigated by subtracting the median value across the sensor from each frame. The RTS amplitudes and state lifetimes are unaffected, so none of the key information is lost.

Figure 9: The Sample Correlation Coefficient map. For the vast majority of signals that passed detection, reconstruction is near perfect.
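The reconstruction score used here is the standard Pearson sample correlation, which in NumPy is a single call; the test vectors below are toy examples.

```python
import numpy as np

def reconstruction_score(reconstructed, clean):
    """Sample correlation coefficient C_xy between reconstruction and truth."""
    return np.corrcoef(reconstructed, clean)[0, 1]

clean = np.array([0., 0., 1., 1., 0., 0., 1., 1.])   # toy ground-truth RTS states
good  = clean.copy()                                  # a perfect reconstruction
```

Because the score is invariant to offset and positive scaling, it grades the recovered shape of the signal, which is exactly what matters after the reversible scaling step.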

Figure 11: Reconstructions of four randomly selected RTS pixels.
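The thermal-drift mitigation described for the sensor data (subtracting each frame's sensor-wide median) is a one-liner per frame. The sketch below assumes a hypothetical `frames` stack of shape (n_frames, height, width).

```python
import numpy as np

def remove_common_mode(frames):
    """Subtract each frame's sensor-wide median to suppress slow thermal drift.

    Per-pixel RTS amplitudes and state lifetimes are unaffected, since the
    same offset is removed from every pixel of a given frame.
    """
    medians = np.median(frames, axis=(1, 2), keepdims=True)
    return frames - medians

# Toy stack: 4 frames of a 3x3 sensor with a drifting common offset
rng = np.random.default_rng(2)
frames = rng.normal(100, 1, (4, 3, 3)) + np.arange(4)[:, None, None] * 5
corrected = remove_common_mode(frames)
```

The median, unlike the mean, is robust to the cosmic-ray spikes that affect isolated pixels in single frames.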

The algorithm performed well on image sensor data. This is remarkable, considering that it was trained completely on simulated data sets. There is no real characteristic shape of an RTS signal; the amplitudes can be essentially any size and the lifetimes are completely unbounded. It was expected, perhaps naively, that this stochastic nature of RTS signals would cause some issues, but none have arisen yet. Given the success of the technique, it is safe to expect similar strategies will begin to replace traditional methods of signal processing at an accelerated rate. Additionally, this direct time series reconstruction may work well as a supplementary technique in situations involving audio classification, like speech recognition and bird identification, where spectrograms are the primary source of feature extraction. Creating realistic simulated representations of time series requires careful thought, but is simple to automate once key characteristics are recognized.

IV. CONCLUSION

A machine learning based algorithm is presented for the reconstruction and analysis of RTS signals. The algorithm uses a convolutional classifier for the identification of RTS transitions, and a convolutional denoising autoencoder to increase the RTS signal to noise. A histogram of the decoded signal values is taken, and fit to the sum of two Gaussians. Finally, the signal is reconstructed by snapping each value of the decoded signal to the nearest peak location.

The algorithm was shown to be very successful quantitatively by running it on a set of simulated RTS and non-RTS signals. It detected over 83% of the RTS signals, and only consistently failed on signals with SNR_RTS < 1. Reconstruction for signals that passed detection is exceptional, reaching an almost perfect correlation coefficient of 0.99 or greater for nearly 50,000 signals.

Qualitatively, the algorithm performed well on a set of data collected by taking dark frames with a CCD image sensor. Additional steps need to be taken to address issues stemming from cosmic rays and thermal fluctuations, but once these are mitigated, near perfect reconstruction of an RTS signal is expected.

V. CITATIONS

[1] P. L. Leonard and S. V. Jaskolski, "An investigation into the origin and nature of "popcorn noise"," Proceedings of the IEEE, vol. 57, no. 10, pp. 1786-1788, Oct. 1969.

[2] J. Bogaerts, B. Dierickx and R. Mertens, "Random telegraph signals in a radiation-hardened CMOS active pixel sensor," IEEE Trans. Nucl. Sci., vol. 49, no. 1, pp. 249-257, Feb. 2002.

[3] W. H. Chard and P. K. Chaudhari, "Characteristics of burst noise," Proc. IEEE, vol. 53, p. 652, 1965.

[4] E. Simoen, B. Dierickx, C. L. Claeys, and G. J. Declerck, "Explaining the amplitude of RTS noise in submicrometer MOSFETs," IEEE Trans. Electron Devices, vol. 39, p. 422, Feb. 1992.

[5] V. Goiffon, P. Magnan, P. Martin-Gonthier, C. Virmontois, and M. Gaillardin, "New source of random telegraph signal in CMOS image sensors," presented at the International Image Sensor Workshop, Hokkaido, Japan, 2011.

[6] K. Ackerson et al., "Characterization of "blinking pixels" in CMOS image sensors," 2008 IEEE/SEMI Advanced Semiconductor Manufacturing Conference, Cambridge, MA, 2008, pp. 255-258.

[7] I. H. Hopkins and G. R. Hopkinson, "Further measurements of random telegraph signals in proton irradiated CCDs," IEEE Transactions on Nuclear Science, vol. 42, no. 6, pp. 2074-2081, Dec. 1995.

[8] D. R. Smith, A. D. Holland, and I. B. Hutchinson, "Random telegraph signals in charge coupled devices," Nucl. Instrum. Meth. A, vol. 530, no. 3, pp. 521–535, Sep. 2004.

[9] B. Hendrickson, R. Widenhorn, M. Blouke, D. Heidtmann, and E. Bodegom, "RTS Noise in CMOS Image Sensors Irradiated with High Energy Photons," submitted for publication.

[10] V. Goiffon, G. R. Hopkinson, P. Magnan, F. Bernard, G. Rolland and O. Saint-Pe, "Multilevel RTS in Proton Irradiated CMOS Image Sensors Manufactured in a Deep Submicron Technology," IEEE Transactions on Nuclear Science, vol. 56, no. 4, pp. 2132-2141, Aug. 2009.

[11] A. M. Chugg, R. Jones, M. J. Moutrie, J. R. Armstrong, D. B. S. King, and N. Moreau, "Single particle dark current spikes induced in CCDs by high energy neutrons," IEEE Trans. Nucl. Sci., vol. 50, no. 6, pp. 2011–2017, Dec. 2003.

[12] E. Martin, T. Nuns, C. Virmontois, J. David and O. Gilard, "Proton and γ-Rays Irradiation-Induced Dark Current Random Telegraph Signal in a 0.18-μm CMOS Image Sensor," IEEE Transactions on Nuclear Science, vol. 60, no. 4, pp. 2503-2510, Aug. 2013.

[13] B. Zhao, S. Xiao, H. Lu and J. Liu, "Waveforms classification based on convolutional neural networks," 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, 2017, pp. 162-165.

[14] T. Ölmez and Z. Dokur, "Classification of heart sounds using an artificial neural network," Pattern Recognition Letters, vol. 24, no. 1–3, pp. 617-629, 2003.

[15] K. Basu, V. Debusschere, A. Douzal-Chouakria, and S. Bacha, "Time series distance-based methods for non-intrusive load monitoring in residential buildings," Energy and Buildings, vol. 96, pp. 109-117, 2015.

[16] Y. Chen, G. Zhang, M. Bai, S. Zu, Z. Guan, and M. Zhang, "Automatic Waveform Classification and Arrival Picking Based on Convolutional Neural Network," Earth and Space Science, vol. 6, 2019.

[17] Z. Xiong, M. K. Stiles, and J. Zhao, "Robust ECG Signal Classification for Detection of Atrial Fibrillation Using a Novel Neural Network," Computing in Cardiology, vol. 44, 2017.

[18] T. Guo, J. Dong, H. Li and Y. Gao, "Simple convolutional neural network on image classification," 2017 IEEE 2nd International Conference on Big Data Analysis (ICBDA), Beijing, 2017, pp. 721-724.

[19] J. Li, Y. Si, L. Lang, L. Liu, and T. Xu, "A Spatial Pyramid Pooling-Based Deep Convolutional Neural Network for the Classification of Electrocardiogram Beats," Applied Sciences, vol. 8, 1590, 2018.

[20] Z. Hu, Y. Li and Z. Yang, "Improving Convolutional Neural Network Using Pseudo Derivative ReLU," 2018 5th International Conference on Systems and Informatics (ICSAI), Nanjing, 2018, pp. 283-287.

[21] M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in ECCV, Lecture Notes in Computer Science, vol. 8689, Springer, 2014, pp. 818–833.

[22] D. Ciresan, U. Meier, J. Masci, L. Gambardella, and J. Schmidhuber, "Flexible, High Performance Convolutional Neural Networks for Image Classification," International Joint Conference on Artificial Intelligence, 2011.

[23] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," CoRR abs/1207.0580, 2012.

[24] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.

[25] P. Sadowski, "Notes on backpropagation," 2016. [Online]. Available: https://www.ics.uci.edu/

[26] X. Wu, G. Jiang, X. Wang, P. Xie and X. Li, "A Multi-Level-Denoising Autoencoder Approach for Wind Turbine Fault Detection," IEEE Access.

[27] L. Gondara, "Medical Image Denoising Using Convolutional Denoising Autoencoders," 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), Barcelona, 2016, pp. 241-246.

[28] P. Vincent, H. Larochelle, I. Lajoie, Y. Bengio, and P.-A. Manzagol, "Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion," in Proceedings of the 27th International Conference on Machine Learning, ACM, 2010, pp. 3371–3408.

[29] G. Hinton et al., "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups," IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, Nov. 2012.

[30] X. Ye, L. Wang, H. Xing and L. Huang, "Denoising hybrid noises in image with stacked autoencoder," 2015 IEEE International Conference on Information and Automation, Lijiang, 2015, pp. 2720-2724.

[31] F. Chollet et al., Keras, www.keras.io, 2015.

[32] M. Abadi et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.

[33] M. Carrillo et al., "Time series analysis of gravitational wave signals using neural networks," J. Phys.: Conf. Ser., vol. 654, 012001, 2015.

[34] M. Takadoya, M. Notake, M. Kitahara, J. D. Achenbach, Q. C. Guo and M. L. Peterson, "Crack-Depth Determination By A Neural Network With A Synthetic Training Data Set," Review of Progress in Quantitative Nondestructive Evaluation, vol. 12, pp. 803-810, 1993.

[35] N. J. Rodriguez-Fernandez, P. Richaume, Y. H. Kerr, F. Aires, C. Prigent and J. Wigneron, "Global retrieval of soil moisture using neural networks trained with synthetic radiometric data," 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, 2017, pp. 1581-1584.

[36] T. A. Le, A. G. Baydin, R. Zinkov and F. Wood, "Using synthetic data to train neural networks is model-based reasoning," 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, 2017, pp. 3514-3521.

[37] A. Witmer and B. Bhanu, "HESCNET: A Synthetically Pre-Trained Convolutional Neural Network for Human Embryonic Stem Cell Colony Classification," 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, 2018, pp. 2441-2445.

[38] T.C. O’Haver, Pragmatic Introduction to Signal Processing 2019: Applications in scientific measurement. Kindle Direct Publishing, 2019, p. 340.

[39] N. Bershad and A. Rockmore, "On estimating signal-to-noise ratio using the sample correlation coefficient (Corresp.)," in IEEE Transactions on Information Theory, vol. 20, no. 1, pp. 112-113, January 1974.

[40] SITe SI03xA 24 µm Charge-Coupled Device Family, SITe, Tigard, OR, http://www.not.iac.es/instruments/detectors/CCD1/S103xA_family.pdf, 2003.