
Analysis of an unknown tracking algorithm for rats: Project Report, MVE385

Axel Nathanson, Sofia Cvetkovic Destouni, Rickard Karlsson
[email protected], [email protected], [email protected]

January 20, 2021

Supervisors, IRLAB: Fredrik Wallner, Peder Svensson, Susanna Waters. Supervisors, Smartr: Mattias Sundén, Erik Lorentzen. External advisor: Per Enflo

Abstract

In this study we attempt to model and recreate the unknown processing of a black-box analyser unit used to track the behavioural patterns of experimental animals. In total, we evaluate the following seven approaches: the Kalman filter, rolling mean, exponential smoothing, B-spline smoothing, and three custom algorithms created during the project, named Rolling Unique Mean (RUM), Slider and Enflo's algorithm. None of the evaluated algorithms was definitively proven to be used by the analyser unit; however, the RUM algorithm showed promising characteristics similar to those of the unit. A limitation of this study is that the input to the analyser unit is also unknown, so an artificial substitute had to be computed, which was not able to represent outlying data samples. A continuation of this study would be to improve the artificial data generation and to keep developing RUM or similarly structured algorithms.


Contents

1 Introduction
2 Setup and Notation
3 Theory
  3.1 Algorithms
  3.2 Hyperparameter Optimisation
4 Methods
  4.1 Experimental Setup
  4.2 Evaluation
  4.3 Optimising Hyperparameters
  4.4 Code Implementation
5 Results
  5.1 Statistical analysis
  5.2 Evaluation of the algorithms
6 Discussion
  6.1 Sources of Errors
  6.2 Conclusion
A Results from Hyperparameter Optimisation
B Feature Plots
C PCA Plots

1 Introduction

A possible approach when screening novel pharmaceuticals is to investigate treatment effects on a system level, without necessarily considering the specific underlying biological mechanisms triggered by the chemical compound of the treatments. A crucial component in this screening process is the analysis of behavioural patterns of experimental animals, since this can give key insights about changes in the brain. IRLAB Therapeutics currently uses two different systems to track the experimental animals: one signal is generated by a high-resolution video camera, while another, lower-resolution signal is generated by a photo-beam crossing system and fed into a unit which we denote the analyser unit. However, the analyser unit processes its input with some unknown black-box function. The goal of this project is therefore to investigate this black-box function and create a model that can reproduce the same output. The project can be divided into two main parts: firstly, an analysis of the camera and analyser unit output, and secondly, the evaluation of different filtering and smoothing algorithms as possible candidates for the black-box function mentioned above.

2 Setup and Notation

In this project we have worked with data generated from 18 repeated experiments tracking rats' movements. The experiments were tracked by the photo-beam crossing system as well as video cameras. Here we provide notation for the three basic signals generated from the experiments; this notation will be used systematically throughout the report.

• By original we refer to the trajectory data obtained from the beam-crossing system and processed by the analyser unit that we are trying to understand.
• By camera we refer to the trajectory data from the camera recordings, where each data point is the calculated centre of mass of the rat, computed with computer vision algorithms.
• By virtual beams we refer to data computed from the camera data and a virtual beam-crossing grid, trying to recreate the input to the analyser unit.

These three data signals were provided to us by IRLAB; how they are generated is explained further in Section 4.1. The most important thing to remember is that the original data is the output of the analyser unit, while virtual beams is an artificial recreation of the input to the unit.

3 Theory

In this section we present the algorithms that have been investigated as potential candidates for modelling the black-box analyser. The selection of algorithms is based on discussions with IRLAB and Smartr, in combination with what kinds of algorithms are commonly used in signal processing. Moreover, each algorithm has a set of hyperparameters, which is why we discuss hyperparameter optimisation later in this section.

3.1 Algorithms

In total, we investigate seven algorithms. Firstly, a set of common smoothing algorithms: B-spline smoothing, exponential smoothing and rolling mean. Moreover, we investigate an implementation of the classic Kalman filter. Lastly, we investigate a set of custom algorithms which have been named Enflo's algorithm¹, Slider, and Rolling Unique Mean (RUM). In this section, the input trajectory to which we apply the algorithms will be denoted T(t), and the output will be denoted T̂(t). A list of all algorithms and their hyperparameters can be found in Table 1.

Algorithm              Parameters
B-spline smoothing     d, S
Exponential smoothing  α
Rolling mean           w
Kalman filter          a, σ_Q, σ_R
Enflo                  w, r
Slider algorithm       M
Rolling unique mean    n

Table 1: Overview of the algorithms and their parameters that we optimise.

B-spline

We utilise the scipy function splrep, which finds a B-spline representation of a given one-dimensional curve. A spline function is a piece-wise polynomial function whose pieces meet in so-called knots. In our case, the knots are the points of our time series. The two parameters that we can control are the degree of the polynomial, d, and a smoothing factor S. The smoothing factor defines the trade-off between closeness to T(t) and smoothness of fit, where larger S means smoother results. One problem with this approach is that it cannot be performed on streaming data, but only once the tracking has terminated. This clearly indicates that the analyser unit does not use this function, but it is kept to see if it could be a viable post-processing algorithm for recreating trajectories similar to those of the analyser unit.
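To make the procedure concrete, the following is a minimal sketch of this smoothing step, assuming a one-dimensional coordinate series sampled at integer frame indices; the default parameter values are the optimised ones reported in Appendix A.

```python
# Minimal sketch of the B-spline smoothing step for one coordinate series.
import numpy as np
from scipy.interpolate import splrep, splev

def bspline_smooth(x, d=1, S=20232.4074):
    """Fit a smoothing B-spline to a coordinate series and evaluate it
    back at the original time points (an offline, whole-trajectory pass)."""
    t = np.arange(len(x))
    tck = splrep(t, x, k=d, s=S)   # B-spline representation (knots, coefficients, degree)
    return splev(t, tck)           # evaluate the spline at each frame
```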

Exponential Smoothing

Exponential smoothing computes a moving average where the importance of previous observations in the input trajectory decays exponentially. The calculation is simple since the algorithm only needs to keep track of two values at any point in time. It is described by the following equations:

T̂(0) = T(0)
T̂(t + 1) = α·T(t) + (1 − α)·T̂(t)     (1)

For this algorithm, we can optimise the parameter α, which determines how much weight we put on the current observation.

¹This is named after Per Enflo, its creator.
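A minimal sketch of the recursion in equation (1), applied to one coordinate series; α = 0.6 is the value reported in Appendix A.

```python
# Sketch of the exponential smoothing recursion in equation (1).
import numpy as np

def exponential_smoothing(T, alpha=0.6):
    T_hat = np.empty(len(T), dtype=float)
    T_hat[0] = T[0]
    for t in range(len(T) - 1):
        # T_hat(t+1) = alpha * T(t) + (1 - alpha) * T_hat(t), as in equation (1)
        T_hat[t + 1] = alpha * T[t] + (1 - alpha) * T_hat[t]
    return T_hat
```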

Rolling Mean

The rolling mean algorithm computes a different moving average compared to exponential smoothing. It utilises a window of size w which is convolved over the input trajectory, computing the mean of the values within the window. It is computed with the following formula:

T̂(t) = 0,   t = 0, …, w − 1
T̂(w) = (1/w) ∑_{i=1}^{w} T(i)
T̂(t) = T̂(t − 1) + (1/w)·(T(t) − T̂(t − 1)),   t = w + 1, …, L

where L is the length of the trajectory T. The only parameter for the rolling mean algorithm is the window size w, which is the number of values taken into account during the calculations.
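The formula translates directly into code; the following sketch assumes a one-dimensional coordinate series and uses the window size w = 9 from Appendix A.

```python
# Sketch of the rolling-mean update as written above: a warm-up mean over
# the first w samples followed by the incremental update.
import numpy as np

def rolling_mean(T, w=9):
    T_hat = np.zeros(len(T), dtype=float)   # T_hat(t) = 0 for t < w
    T_hat[w] = np.mean(T[:w])               # initial window mean
    for t in range(w + 1, len(T)):
        T_hat[t] = T_hat[t - 1] + (T[t] - T_hat[t - 1]) / w
    return T_hat
```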

Kalman Filter

The Kalman filter is a well-known recursive algorithm for estimating the state of a system from a series of noisy observations. The filter is widely used in many technological fields, which is why it is also included in this study [1]. In short, the Kalman filter is the optimal filter for the linear Gaussian state-space model with respect to the minimum mean-square error of the state estimate. The parameters we optimise are the step size of our numerical differentiation, a, the noise in the model, σ_Q, and the noise in the measurements, σ_R.
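The report does not specify the exact state-space model, so the following sketch is illustrative only: it assumes a constant-velocity model per coordinate with position as the only measured quantity, and uses the parameter values from Appendix A.

```python
# Illustrative sketch of a Kalman filter for one coordinate series.
# The constant-velocity model and the noise covariances are assumptions;
# a, sigma_Q and sigma_R default to the values reported in Appendix A.
import numpy as np

def kalman_filter_1d(z, a=4.0, sigma_Q=1.3215, sigma_R=4.7751):
    F = np.array([[1.0, a], [0.0, 1.0]])   # state transition (position, velocity)
    H = np.array([[1.0, 0.0]])             # we observe position only
    Q = sigma_Q**2 * np.eye(2)             # process noise covariance (assumed isotropic)
    R = np.array([[sigma_R**2]])           # measurement noise covariance
    x = np.array([z[0], 0.0])              # initial state estimate
    P = np.eye(2)                          # initial state covariance
    out = np.empty(len(z))
    for t, zt in enumerate(z):
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.array([zt]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        out[t] = x[0]                      # filtered position estimate
    return out
```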

Enflo Algorithm

The Enflo algorithm combines the computation of a mean over several time steps with a numerical derivative. We define the window for the mean and derivative as w.

T̂(t) = 0,   t = 0, …, w − 1
T̂(w) = T(w)
D(t) = (T(t + w) − T(t − w)) / (2w)     (2)
T̂(t) = r·T(t) + (1 − r)·(T̂(t − 1) + D(t − 1)),   t = w + 1, …, L − w

where L is the length of T and r is a constant between 0 and 1. The T̂ resulting from this algorithm is 2w samples shorter than T(t), which needs to be taken into account when comparing results. There are two hyperparameters for the Enflo algorithm: the window size w and the factor r, which weighs the importance of the current time step.
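A runnable sketch of equation (2) for one coordinate series, with the optimised values w = 20 and r = 0.2061 from Appendix A; note that the returned trajectory is 2w samples shorter than the input, as discussed above.

```python
# Sketch of the Enflo algorithm in equation (2) for one coordinate series.
import numpy as np

def enflo(T, w=20, r=0.2061):
    L = len(T)
    D = np.zeros(L)
    T_hat = np.zeros(L)
    T_hat[w] = T[w]
    for t in range(w, L - w):
        D[t] = (T[t + w] - T[t - w]) / (2 * w)   # central-difference derivative
    for t in range(w + 1, L - w):
        T_hat[t] = r * T[t] + (1 - r) * (T_hat[t - 1] + D[t - 1])
    return T_hat[w:L - w]                        # output is 2w samples shorter
```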

Slider Algorithm

The Slider algorithm is a custom algorithm that accumulates the change in both the X and Y coordinates. If the cumulative change is large enough, it is registered and added to the output trajectory. The algorithm is expressed as pseudocode in Algorithm 1. Although optional, we apply a smoothing algorithm to the trajectory T before applying the Slider algorithm, since the Slider algorithm does not perform any averaging; in our experiments, we use exponential smoothing for this. There is one hyperparameter, M, which sets the threshold for the cumulative change in either the X- or Y-coordinate.

Algorithm 1 Slider algorithm

Require: Trajectory T = (T_X, T_Y) in X- and Y-coordinates
  Initialise empty vector T̂ with same shape as T
  X_cum ← 0
  Y_cum ← 0
  T̂_X(0) ← T_X(0)
  T̂_Y(0) ← T_Y(0)
  for t = 1, 2, …, length(T) do
      T̂_X(t) ← T̂_X(t − 1)
      T̂_Y(t) ← T̂_Y(t − 1)
      X_cum ← X_cum + (T_X(t) − T_X(t − 1))
      Y_cum ← Y_cum + (T_Y(t) − T_Y(t − 1))
      if (|X_cum| > 0 and |Y_cum| > 0) or |X_cum| + |Y_cum| > M then
          T̂_X(t) ← T̂_X(t) + X_cum
          T̂_Y(t) ← T̂_Y(t) + Y_cum
          X_cum ← 0
          Y_cum ← 0
      end if
  end for
  return (T̂_X, T̂_Y)
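A runnable sketch of Algorithm 1, assuming the trajectory is an (L, 2) array of X- and Y-coordinates; M = 5 is the optimised threshold from Appendix A.

```python
# Sketch of Algorithm 1 (Slider): the output holds its last value until the
# accumulated change triggers an update, at which point the accumulators reset.
import numpy as np

def slider(T, M=5.0):
    T_hat = np.empty_like(T, dtype=float)
    T_hat[0] = T[0]
    x_cum = y_cum = 0.0
    for t in range(1, len(T)):
        T_hat[t] = T_hat[t - 1]                       # hold previous output
        x_cum += T[t, 0] - T[t - 1, 0]
        y_cum += T[t, 1] - T[t - 1, 1]
        moved_both = abs(x_cum) > 0 and abs(y_cum) > 0
        if moved_both or abs(x_cum) + abs(y_cum) > M:
            T_hat[t, 0] += x_cum
            T_hat[t, 1] += y_cum
            x_cum = y_cum = 0.0
    return T_hat
```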

Rolling Unique Mean

Rolling Unique Mean (RUM) is an extension of the rolling mean algorithm with a special update rule. Instead of calculating the mean of the last w values, we use a queue of length n and calculate the mean of its contents. We only add an element to the queue if the new value differs from the previous one; this way, the same element is not added to the queue multiple times in a row. Pseudocode for RUM is found in Algorithm 2.

Algorithm 2 Rolling Unique Mean

Require: Trajectory T
  Initialise empty vector T̂ with same shape as T
  Initialise empty queue Q
  T̂(0) ← T(0)
  for t = 1, 2, …, length(T) do        ▷ performed separately for each coordinate of T
      if T(t) ≠ T(t − 1) then
          Enqueue T(t) to Q
      end if
      if length(Q) > n then
          Dequeue the oldest element in Q
      end if
      T̂(t) ← mean(Q)
  end for
  return T̂
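A runnable sketch of Algorithm 2 for one coordinate series, using a double-ended queue with a maximum length so that the oldest element is dropped automatically; n = 4 is the queue length reported in Appendix A. Seeding the queue with the first sample is our own guard against taking the mean of an empty queue.

```python
# Sketch of Algorithm 2 (Rolling Unique Mean) for one coordinate series.
from collections import deque
import numpy as np

def rolling_unique_mean(T, n=4):
    T_hat = np.empty(len(T), dtype=float)
    T_hat[0] = T[0]
    Q = deque([T[0]], maxlen=n)   # maxlen drops the oldest element on overflow
    for t in range(1, len(T)):
        if T[t] != T[t - 1]:      # only enqueue values that changed
            Q.append(T[t])
        T_hat[t] = np.mean(Q)
    return T_hat
```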

3.2 Hyperparameter Optimisation

To summarise the number of parameters for each algorithm, we have gathered them in Table 1. Since the algorithms that we investigate have a set of hyperparameters, which we call θ, we select their values based on the data that we observe. This is performed by applying an optimisation algorithm which explores parameter values and picks the best values found given a loss function L(f(x; θ), y), where f(x; θ) is one of our algorithms with input x, and y is what the output of f(x; θ) is compared to. More formally, we are solving the following optimisation problem:

θ* = argmin_θ L(f(x; θ), y)   subject to θ ∈ P     (3)

where P is the set of allowed values for the hyperparameters and L(f(x; θ), y) is a loss function chosen by us. The more practical details of the hyperparameter optimisation are explained in Section 4.3.

4 Methods

In this section, we describe the practical details of this project. More specifically, we explain the experimental data and setup, and how we evaluate the algorithms mentioned in the theory section.

4.1 Experimental Setup

In this study we look at the tracking of rats in a box. The box is 42 centimetres wide and divided into an integer-valued coordinate grid of 128×128 positions, so that each step corresponds to 42/128 ≈ 0.328 cm. The steps in the grid are defined by IRLAB. Additionally, the minimum and maximum observed coordinate values are 8 and 120 respectively, since a rat's centre of mass can only get so close to the walls. In each experiment, the rat's movement is tracked for one hour, with its position measured 25 times per second and rounded to integers. The rat is tracked with both the analyser unit and a top-down

camera which films the interior of the box during the entire experiment. The analyser unit uses a grid of laser beams to calculate the rat's position based on which beams the rat breaks. In total, we have data on 18 recordings of different rats, numbered from 1 to 20, where numbers 5 and 13 are excluded.

To investigate the signal processing inside the analyser unit, we need to know its input. However, we do not have its exact input, which is why we instead simulate it by recreating it artificially, resulting in the data set we call "virtual beams". This is done with the help of the camera recording: by tracking a bounding box around the rat's body with the camera, it is possible to compute which laser beams it would break, and thus get an estimate of the analyser unit's input; a simplified sketch of this idea is given below. There are however some shortcomings to this approach; for example, we cannot recreate disturbances from the rat's tail.

In Figure 1 we see an example of how part of a trajectory can look. The left plot shows all three data sets, original, camera and virtual beams, recorded during the same time period. Here we can see the challenge of using virtual beams to recreate the original output. We see that the two presented algorithms, RUM and Slider, in some ways assimilate the characteristics of the original trajectory much better than their input data, yet they are fundamentally limited by positional differences between their input and what we are trying to recreate. Not knowing the real input to the analyser unit is one of the bigger sources of error in this project.
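As an illustration of the idea, the following is a highly simplified, hypothetical sketch of deriving a position estimate from the beams broken by a bounding box along one axis; the report does not specify the actual procedure, so the beam layout, the bounding-box format and the centre estimate are all assumptions.

```python
# Hypothetical sketch of the virtual beams idea: which beams would a
# bounding box break along one axis, and where is their centre?
import numpy as np

def virtual_beam_position(bbox, beam_positions):
    """bbox: (low, high) extent of the rat's bounding box along one axis.
    beam_positions: 1-D array of beam coordinates along the same axis."""
    low, high = bbox
    broken = beam_positions[(beam_positions >= low) & (beam_positions <= high)]
    if len(broken) == 0:
        return None                      # no beam broken along this axis
    return int(round(broken.mean()))     # centre of broken beams, on the integer grid
```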

Figure 1: A small time window (minute 30.25 to 30.5) from rat number 1, shown for some of our data. Both plots show the original trajectory in yellow: to the left together with the camera and virtual beams data, and to the right together with the Slider and RUM algorithms applied to the virtual beams data.

4.2 Evaluation

We cannot simply check how closely we recreate the original trajectories, because of the discrepancies seen above. We therefore need other metrics. The main approach, as used by IRLAB, is to calculate a set of features that summarise the characteristics of each trajectory. We calculate 9 different features (with the notation used in parentheses):

• Acceleration (Acc)

• Total distance (Di)
• Velocity (Vel)
• Meander (Me)
• Meander per distance (Mem)
• Fraction of time spent in the middle of the box (Mi)
• Fraction of time in movement (Mo)
• Stands still in the middle of the box (Stm)
• Stands still (St)

These features are calculated for every quarter of the recorded hour, and also for 7 different sampling frequencies specified by IRLAB. This results in 9 · 4 · 7 = 252 different values calculated for each rat.³ These features can be compared directly, but since they are of such high dimension it is also convenient to compress the information using principal component analysis (PCA)⁴, as sketched below. This low-dimensional representation can more easily be used to compare parts of the variation in the features between multiple trajectories. Beyond using the features, we also investigate the data with the help of various visualisations and statistical tools such as histograms and heatmaps.
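A small sketch of the compression step, assuming the features have been collected into an array with one row per trajectory and 252 columns; the standardisation step is our assumption, as the report only states that the scikit-learn implementation of PCA is used.

```python
# Sketch of compressing the 252-dimensional feature vectors with PCA.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# `features` stands in for the (n_trajectories, 252) feature array described
# above; random data here only so the sketch runs.
features = np.random.rand(16, 252)

X = StandardScaler().fit_transform(features)   # put features on comparable scales
pca = PCA(n_components=2)
scores = pca.fit_transform(X)                  # 2-D representation used for the plots
print(pca.explained_variance_ratio_)           # variance explained per component
```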

4.3 Optimising Hyperparameters

For the optimisation problem presented in equation (3), we use the following loss function:

L(f(x; θ), y) = ‖f(x; θ) − y‖₂ + ∑_{i=1}^{4} q_i ‖F_i(f(x; θ)) − F_i(y)‖₂     (4)

where F_i(·) is a set of characteristic features from the i:th quarter of the trajectory and q_i is the multiplicative weight for the i:th quarter. The first term in the loss function is the mean-square distance between the original data y and the output of an algorithm f applied to the virtual beams data x. The second term is the mean-square difference between the feature values of the trajectories y and f(x). We do not optimise with respect to all features, in order to reduce the complexity of the loss; thus F_i contains only Mem, Acc, St and Di.

To find suitable hyperparameters θ, we solved the optimisation problem using HyperOpt⁵, which is a well-known method for hyperparameter tuning [2]. We used 50 evaluations in total per algorithm and a tree of Parzen estimators to suggest new parameter values during optimisation. To evaluate the optimised hyperparameters of our algorithms, we split the dataset of rat trajectories into a training dataset, which is used to tune the parameters, and a validation dataset, which can be used to test whether the chosen hyperparameters generalise well to unseen trajectories. The training set and validation set contain rats numbered (1, 2, 3, 4, 6, 7, 8, 10, 11, 12) and (14, 15, 16, 18, 19, 20), respectively.⁶

³IRLAB also calculates 2 features corresponding to vertical movement, but the rats' vertical movement is not available for the camera data, which is why these are left out.
⁴We use the scikit-learn implementation of PCA.
⁵Code available at https://github.com/hyperopt/hyperopt
⁶Rats 9 and 17 were removed due to complications when calculating features.

Although the hyperparameter tuning cannot give an optimal set of parameters, both due to computational limitations and due to its dependence on our subjective choice of loss function, the hyperparameter optimisation is a helpful guide for picking parameter values. However, for some of the algorithms the parameter values were chosen from experience rather than from the output of the optimisation. For example, for RUM we chose n = 4 based on observations made while constructing the algorithm, rather than optimising it. The best hyperparameters found, as well as the values for the weights q, are listed in Appendix A.
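A minimal sketch of the tuning loop with HyperOpt, shown here for the α parameter of exponential smoothing; the signals and the loss are stand-ins (the full loss in equation (4) also includes the feature terms), while the 50 evaluations and the tree of Parzen estimators match the setup described above.

```python
# Sketch of the HyperOpt tuning loop with a simplified stand-in loss.
import numpy as np
from hyperopt import fmin, tpe, hp, Trials

# Placeholder signals; in the project these are the virtual beams input
# and the analyser unit (original) output.
virtual_beams = np.random.rand(1000)
original = np.random.rand(1000)

def exp_smooth(T, alpha):
    T_hat = np.empty(len(T))
    T_hat[0] = T[0]
    for t in range(len(T) - 1):
        T_hat[t + 1] = alpha * T[t] + (1 - alpha) * T_hat[t]
    return T_hat

def loss(output, target):
    # Stand-in for equation (4); only the trajectory term is included here.
    return float(np.linalg.norm(output - target))

space = {"alpha": hp.uniform("alpha", 0.0, 1.0)}   # hypothetical search range

def objective(params):
    return loss(exp_smooth(virtual_beams, params["alpha"]), original)

best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=Trials())
print(best)
```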

4.4 Code Implementation

All algorithms have been implemented in Python and are available in a GitHub repository.⁷

⁷Email Rickard Karlsson for access; contact information is available at the top of the report.

Figure 2: Resulting trajectories for rat 1 from our seven algorithms, next to the virtual beams, original and camera trajectories.

5 Results

We start by presenting discoveries from the analysis of the original data, and how they indicate which methods are more likely to be useful. We then follow up with a performance analysis of the different algorithms covered in this project, applied to the artificial input data. Before presenting the statistical analysis, we share an overview of how the trajectories look for a single rat. These trajectories are either given by the three different tracking methods or by one of the algorithms applied to the virtual beams data signal; they are seen in Figure 2.

5.1 Statistical analysis

One of the first things we observed was that the different tracking methods result in significantly different distributions of the rat's position; this is best observed in the heatmaps presented in Figure 3. The original data shows a mosaic-looking distribution, the camera data shows smoother transitions, and in the virtual beams data we clearly see the grid-like appearance of the virtual beam positions. It can also be observed that the rats spend the majority of their time along the edges and avoid the middle. In the presented heatmaps only one rat is shown as an example, but the same behaviour is displayed for every rat. Since the scale is logarithmic, it can also be observed that there are a few positions in which the rat spends the majority of its time. The difference between the white dots and the dark areas is of the order 10⁴, where one experiment is ∼9 · 10⁴ frames. This indicates that the rats lie down or stop after a while of running around.

Figure 3: The grid represents the positions of rat 1 during the trial. The value in each point represents how many frames were spent in that point. Note that the scale is logarithmic, so the amount of time spent along the edges is up to 10⁴ times greater than in the middle.

Data type   Total number of stops in all experiments   Standing still [% of total frames]
Original    22 607                                     98.1%
Camera      65 249                                     94.5%

Table 2: Summary of the amount of time the rats stand still and the number of times they stop, for the two different data collection methods.

A difference can be observed in the amount of time the rats stand still between the two data collection methods, as presented in Table 2. A stop is defined here as no change in position between two frames, where there has been movement between the two previous frames (a small sketch of this count is given below). It is perhaps not surprising that there are many stops, since we measure the position of the rats every 1/25 second and register the positions as integers only, i.e. the resolution is relatively low. It is however interesting that the camera data registers more stops in total, but significantly less time standing still. A possible explanation is that the camera can track the rat's position more precisely, since it is not dependent on a grid; it therefore updates more often, resulting in more frequent but shorter stops between updates. Another explanation is that the analyser unit may use some threshold or averaging technique, resulting in fewer total movements. This is one of the realisations that led up to the RUM and Slider algorithms. These discrepancies between the two data collection methods led us to look closer at how the step sizes differ between them. This can be observed in Figures 4 and 5, which show how the rats move in one direction when the other is still (Figure 4) and how they move in one direction when also moving in the other (Figure 5).
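A small sketch of the stop count in Table 2 under the definition above, assuming an (L, 2) integer trajectory array.

```python
# Sketch of the stop count: still between two frames, but moving between
# the two frames before that.
import numpy as np

def count_stops(T):
    moved = np.any(np.diff(T, axis=0) != 0, axis=1)   # moved[t]: movement t -> t+1
    return int(np.sum(~moved[1:] & moved[:-1]))       # still now, moving just before
```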

Figure 4: The distribution of step lengths recorded for the rats in one direction (x or y) when standing still in the other. For example, if there is no change in the x-position between two frames, we register the change in the y-position in this graph. We do the same for both dimensions. This way we can see how a rat's movement in one direction depends on the other. Note that the y-axis is log-scaled.

What is interesting is how the rat moves in the original data in the two figures. We can see that for all data types, no movement is the most common. However, in the original data, if there is movement in one direction (Figure 5), it is as common that the rat also moves in the other direction as that it stands still. This pattern cannot be seen in any of the other data types. It is also interesting to note that the step lengths have a similar distribution for the camera and virtual beams data, regardless of which graph you observe. For the analyser unit, however, the step length actually seems to depend on whether the rat is moving in both directions or just in one: in Figure 4 we can see that if a rat is standing still in one coordinate, its most common step length when moving is actually 2 rather than 1. An easy explanation for this is simply that the distances between the beams that break are larger, while a camera can detect shorter movements. Moreover, it is worth mentioning that the virtual beams data seems to show steps of both length 3 and 4, but this is a result of rounding errors rather than real movement. Since the figures have logarithmic scales, even if we add the number of steps of length 3 and 4 into one bar, it will still not be more frequent than standing still.

Figure 5: The distribution of step lengths recorded for the rats in one direction (x or y) when moving in the other. For example, if there is a change in the x-position between two frames, we register the change in the y-position in this graph. We do the same for both dimensions. This way we can see how a rat's movement in one direction depends on the other. Note that the y-axis is log-scaled.

As a next step, a combination of Figures 4 and 5 was computed, where we look at how all steps from the experiments are distributed; this is presented in Figure 6. Here we see a clearer illustration of what we mentioned earlier. The original data has a more even (logarithmic) distribution of the different step lengths, even for longer steps, while the camera data favours shorter ones and does not display steps longer than 6 more than 50 times out of over 1.6 million frames. This clearly illustrates the difference between the two data collection methods, and these are characteristics we need to try to recreate with our algorithms to be able to generate similar feature values or, in the best case, totally recreate the algorithm that the analyser unit uses.

Figure 6: The distribution of step lengths recorded for all rats in all experiments. Note that the y-axis is log-scaled.

During the statistical analysis of the data, we also investigated whether the error varies with the rat's position in the box. In this case, the error is measured between the camera data and the original data, or between the virtual beams data and the original data. It was computed as the root mean square error (RMSE) and sorted into bins with respect to the X- and Y-coordinates of the original data. However, we know that the rats are positioned near the edges of the box most of the time, resulting in an unbalanced number of samples in each bin. To partly circumvent this, we look at the distribution of the errors in each bin with box plots, as seen in Figures 7 and 8. They show that the median (orange line) and the interquartile ranges are mostly the same for each bin. This may indicate that the error does not directly depend on the rat's position in the box.

Figure 7: Distribution of error differences (RMSE) between camera data and original data with respect to bins with different X- and Y-coordinates.

Figure 8: Distribution of error differences (RMSE) between virtual beams data and original data with respect to bins with different X- and Y-coordinates.

5.2 Evaluation of the algorithms

From the statistical analysis we have already noticed some differences between the various trajectories generated by the three tracking methods, as well as by the algorithms that we investigate. In this section, we focus on answering the question of whether we can model the unknown signal processing which generates the original data signal from the analyser unit. Thus, we evaluate the seven algorithms by comparing their outputs when applied to the virtual beams signal, as described in section 4.2. Remember that the virtual beams signal is what we use to artificially simulate the input to the analyser unit.

A first visualisation performed for every algorithm is to compare the movement distributions between the models, which is most easily done with heatmaps of the rat movement. In Figure 9 we can see how the different algorithms perform compared to the original and camera data. It can be observed that the smoothing algorithms retain the grid-like structure of the virtual beams data seen in Figure 3. We can also see that the B-spline algorithm seems to be smoother than the original. However, the RUM and Kalman algorithms show a distribution whose mosaic appearance resembles the original data. The Slider algorithm seems to be a mix of the grid feature and some mosaic features. Of all the algorithms, RUM's distribution most closely resembles the original.

Figure 9: Heatmaps showing the trajectory of rat number 1, filtered with the seven algorithms, with the camera and original data as comparisons.

As a next step, we make a similar analysis as earlier of the distribution of the step lengths generated with the different algorithms. As seen in Figure 6, the original data has a rather unique distribution compared to the camera and virtual beams data. In Figure 10 we see the step distribution in each dimension for all algorithms. Here we can clearly see that Kalman and Enflo are more similar to the camera data distribution, with a focus on shorter steps rather than the original data's more even distribution. The Slider algorithm shows a very distinctive appearance, while RUM and B-spline in some parts seem to recreate the original best, although far from exactly. It is again worth noting how even the distribution of the original data seems to be.

We do not expect the algorithms to recreate the distributions presented in Figures 9 and 10 perfectly since, as stated before, we do not have the exact same input. Still, we hope to generate trajectories with similar characteristics. Thus, based on these plots we may make a qualified guess that the analyser unit is not solely using a rolling mean, a Kalman filter or the Enflo algorithm. It does not seem probable that Slider or exponential smoothing is the correct algorithm either; however, they perform relatively well in recreating the features calculated for the trajectories. As mentioned earlier, we know the box cannot be using the B-spline algorithm, since it is an offline algorithm. But it still shows potential as a possible post-processing model to recreate the feature values that interest us, so we keep it for further comparisons.

Figure 10: The step distributions of the resulting trajectories of all algorithms, with the original and camera data as reference. The algorithms have all been applied to the virtual beams data set.

Next we compare the performance of our algorithms by feature comparison. For each of the features defined in section 4.2 (Acc, Di, Vel, Me, Mem, Mi, Mo, Stm and St), we compare the original data set to the different algorithms, as well as to the virtual beams data set. Comparisons of all features can be found in Appendix B. In Figure 11 we show the comparison of the acceleration feature as a demonstration.

Figure 11: Comparison plots for the acceleration of the camera and virtual beams trajectories, as well as the seven implemented trajectory algorithms, on the y-axis, compared to that of the original on the x-axis. The colours in each plot distinguish the four quarters of the recorded hour of an experiment. Each plot includes feature data for all 18 rats.

For the acceleration feature, Figure 11 shows that B-spline and the Slider algorithm come closest to the original data for all quarters. We see that exponential smoothing also performs relatively well, whilst the others diverge more. The other algorithms, as well as the camera data, show generally lower values of acceleration, whilst the virtual beams data shows higher values. Performing a similar analysis on the rest of the feature plots, we conclude that we cannot pick a single algorithm that consistently performs better than the rest for all features. The following conclusions are made for the rest of the features:

- For velocity (Vel) and total distance (Di), the Enflo algorithm performed best, along with the camera data. Exponential smoothing and rolling mean also did well.
- For meander (Me), rolling mean, exponential smoothing and the Kalman filter performed particularly well.
- For meander per distance (Mem), there is high dispersion for all data sets compared to the original, but B-spline performed somewhat better than the rest.

- For the stand-still features (Stm and St), the B-spline and RUM algorithms perform best. The rest of the algorithms significantly worsen these features compared to their input, that is, the virtual beams data.
- For the fraction of time in movement (Mo), all algorithms perform well, with the B-spline and RUM algorithms performing even better than the rest.
- For the fraction of time in the middle (Mi), all the scatter plots are similarly dispersed, meaning that no algorithm managed to affect this feature in any significant way.

Finally, based on the previous results we decided to compare a narrower set of algorithms with the original data using a PCA plot. In Figure 12, we see a comparison of B-spline smoothing, exponential smoothing, Slider and RUM when applied to the virtual beams data for our set of validation rats. The first two principal components in this analysis explain 43.8% and 20.5% of the variance in the data, respectively. This gives an indication of how similar the characteristic features of the algorithms and the original data are. PCA plots with the other rats and algorithms can be found in Appendix C.

Based on Figure 12, the following observations are made. We see that RUM (star) is similar to B-spline (cross). It also appears that RUM and B-spline are usually the closest to the original (circle). Both exponential smoothing (diamond) and Slider (square) are sometimes close to the original as well, but less so than RUM and B-spline. Thus, no algorithm matches the original data perfectly, which does not come as a surprise given our previous results. Still, we see that RUM and B-spline do a good job of imitating parts of it.

Figure 12: The resulting PCA plot comparing trajectories of the validation rats. In the plot we have included B-spline smoothing, exponential smoothing, rolling unique mean and Slider, which are compared to the original data. The percentage of variance explained by the first and second principal components is 43.8% and 20.5%, respectively.

6 Discussion

This has been an exploratory study of how the analyser unit filters the data. We have performed a statistical analysis of the data generated while tracking the rats during the experiments, and also assessed seven different algorithms that could possibly model the black-box analyser unit. Some were more promising than others, but we have not been able to determine with certainty the algorithm used by the analyser unit. Still, the best candidates we evaluated were rolling unique mean (RUM) and B-spline.

6.1 Sources of Errors The road to original data, contains in fact at least two unknown components. First there is the data from the beam crossing system which has been translated to trajectory data. We simulate this component when we create the virtual beams data, upon which we then assume that some further unknown algorithm has been applied by the analyser. But these two components are indistinguish- able from outside of the black box, which makes it hard if not impossible to identify the source of specific errors as shortcomings of the way we compute virtual beams, or of the way we imitate the analyser. Simulating optical beams with the virtual beams in the camera data causes further error factors. 21

Since the laser beams are placed at various heights in the rat's box, the virtual beams setup attempts to imitate a method that uses three-dimensional information using only two-dimensional camera data. This means that any use of height information in the original setup is disregarded in the virtual beams setup. The original setup is sensitive to any occurrence that breaks an optical beam, while the virtual beams setup is limited to breaks within the four corners of the detected bounding box. An example of occurrences missed by the virtual beams setup are the sudden occasional "jumps" detected in original data trajectories. These are suspected to be the tail of a rat lifting and breaking some beam outside of its bounding box, which is not detected in the virtual beams data and impossible to recreate with our setup.

The virtual beams data relies on video recordings, while the original data does not. Recording effects and the projection of the beams onto the box are therefore error factors. An example is a small but present fish-eye effect in the footage, which makes the virtual beams unable to reach all the way to the edges of the image. This makes subtle movements along the edges undetectable in the virtual beams data. Other camera-induced error factors include the positioning of the camera, as well as potential movement of the camera both during and between experiments. However, these error factors can quite easily be minimised; moreover, their effects might prove static and systematic enough that no correction is necessary.

A later addition to the project was artificial data created by moving blocks around in the experimental setup. This showed promise for recreating more controlled input data, which could be used to model the analyser unit without the larger jumps and similar disturbances. Because of time limitations, no extensive analysis of this data has been performed as part of this project, but a natural next step would be to create artificial input data more closely matching that of the analyser unit, and in doing so more easily assess the performance of different filtering algorithms.

6.2 Conclusion

As part of this project several algorithms have been evaluated. We have not been able to identify one algorithm that we believe is the true one used by the analyser unit. We have however been able to eliminate several well-known algorithms that were potential candidates in the beginning. What seems to be true about the analyser is that it probably keeps things simple, which supports the belief that algorithms similar to Slider and RUM are a more probable way forward than the more well-known filtering algorithms. Lastly, the B-spline's good performance, even though it is probably not the correct algorithm, could be a testament to the flexibility and versatility of fitting multiple low-degree polynomials to a time series. There is merit to it as a post-processing method, but not as the black-box function. However, one has to be very wary of overfitting if it is used as a post-processing method, so the hyperparameters need to be optimised with more data points.

As a final thought to keep in mind while moving forward, we want to paraphrase Per Enflo from one of our meetings: we should stop trying to figure out how clever the box is, and instead try to understand how stupid it is.

References

[1] Vikram Krishnamurthy. Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing. Cambridge University Press, 2016.

[2] James Bergstra, Daniel Yamins, and David Cox. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on Machine Learning, volume 28 of Proceedings of Machine Learning Research, pages 115–123. PMLR, 2013.

A Results from Hyperparameter Optimisation

The weights q_i for i = 1, …, 4 in the loss function, see equation (4), were chosen to make the first and last quarters more important, since we noticed that these usually had more deviations in the feature plots. Thus, we selected the weights as follows:

Weight   q_1   q_2    q_3    q_4
Value    1     0.25   0.25   1

Table 3: Multiplicative weights for the different quarters in the loss function, see equation (4).

In the following subsections, we present the parameter values that we use for each algorithm, together with brief explanations.

Exponential smoothing

A suitable value for the weighting coefficient in equation (1) was found to be α = 0.6. This was decided with the help of inspecting the distribution of step lengths, seen in Figure 10.

B-spline smoothing

The parameter values for the B-spline smoothing were found with the help of the optimisation procedure. The degree of the polynomial was set to d = 1, while the smoothing factor was S = 20232.4074. It is a bit surprising that the degree of the polynomial was found to be one, but we could not come up with a better alternative ourselves, which is why we decided to go with this.

Enflo algorithm

The parameter values for the Enflo algorithm were found with the help of the optimisation procedure. The window size was found to be w = 20, while the resulting weighting factor was r = 0.2061; see equation (2) for how they relate to the algorithm.

Kalman filter

For the Kalman filter we found the following values with the help of the optimisation procedure: step length a = 4, as well as noise values σ_Q = 1.3215 and σ_R = 4.7751.

Rolling mean

The window size w for the rolling mean was decided with the help of inspecting the distribution of step lengths, seen in Figure 10. In the end, we selected a value of w = 9.

Rolling unique mean (RUM)

The queue length n for RUM was decided with the help of inspecting the distribution of step lengths and the movement heatmaps, seen in Figures 10 and 9. In the end, we selected a value of n = 4.

Slider algorithm

The move threshold for the Slider algorithm was set to M = 5, which was decided using the optimisation procedure.

B Feature Plots

Figure 13: Acceleration

Figure 14: Velocity

Figure 15: Meander, a measure of turns

Figure 16: Meander divided by distance

Figure 17: Stand stills as short as 2 frames long

Figure 18: Stand stills in the middle

Figure 19: Total distance

Figure 20: Fraction of time in movement

Figure 21: Fraction of time spent in the "middle" of the box

C PCA Plots

In Figures 22 and 23, PCA plots of the calculated features are presented for both the training and validation sets of rats, for all algorithms and data sets.

Figure 22: PCA plots for the optimised algorithms, where their input is the virtual beams signal and their output is compared to the (original) analyser unit signal. The rats shown are from the training set. The percentage of variance explained by the first and second principal components is 36% and 22%, respectively.

Figure 23: PCA plots for the optimised algorithms, where their input is the virtual beams signal and their output is compared to the (original) analyser unit signal. The rats shown are from the validation set. The percentage of variance explained by the first and second principal components is 41% and 21%, respectively.