
Analysis of an unknown tracking algorithm for rats: Project Report, MVE385

Axel Nathanson, Sofia Cvetkovic Destouni, Rickard Karlsson
[email protected], [email protected], [email protected]

January 20, 2021

Supervisors, IRLAB: Fredrik Wallner, Peder Svensson, Susanna Waters. Supervisors, Smartr: Mattias Sundén, Erik Lorentzen. External advisor: Per Enflo

Abstract

In this study we attempt to model and recreate the unknown processing of a black-box analyser unit used to track the behavioural patterns of experimental animals. In total, we evaluate the following seven approaches: the Kalman filter, rolling mean, exponential smoothing, B-spline smoothing, and three custom algorithms created during the project, named Rolling Unique Mean (RUM), Slider and Enflo's algorithm. None of the evaluated algorithms was definitively proven to be used by the analyser unit; however, the RUM algorithm showed promising characteristics similar to those of the unit. A limitation of this study is that the input to the analyser unit is also unknown, so an artificial substitute had to be computed, which was not able to represent outlying data samples. A continuation of this study would be to improve the artificial data generation and to keep developing RUM or similarly structured algorithms.


Contents

1 Introduction
2 Setup and Notation
3 Theory
  3.1 Algorithms
  3.2 Hyperparameter Optimisation
4 Methods
  4.1 Experimental Setup
  4.2 Evaluation
  4.3 Optimising Hyperparameters
  4.4 Code Implementation
5 Results
  5.1 Statistical analysis
  5.2 Evaluation of the algorithms
6 Discussion
  6.1 Sources of Errors
  6.2 Conclusion
A Results from Hyperparameter Optimisation
B Feature Plots
C PCA Plots

1 Introduction

A possible approach when screening novel pharmaceuticals is to investigate treatment effects on a system level, without necessarily considering the specific underlying biological mechanisms triggered by the chemical compound of the treatments. A crucial component in this screening process is the analysis of behavioural patterns of experimental animals, since this can give key insights about changes in the brain. IRLAB Therapeutics currently uses two different systems to track the experimental animals: one signal is generated by a high-resolution video camera, while another, lower-resolution signal is generated by a photo-beam crossing system and fed into a unit which we denote the analyser unit. However, the analyser unit processes its input with some unknown black-box function. The goal of this project is therefore to investigate this black-box function and create a model that can reproduce the same output. The project can be divided into two main parts: firstly, an analysis of the camera and analyser unit output, and secondly, the evaluation of different filtering and smoothing algorithms as possible candidates for the black-box function mentioned above.

2 Setup and Notation

In this project we have worked with data generated from 18 repeated experiments tracking rats' movements. The experiments were tracked by the photo-beam crossing system as well as video cameras. Here we provide notation for the three basic signals generated from the experiments; this notation will be used systematically throughout the report.

• By original we refer to the trajectory data obtained from the beam-crossing system and processed by the analyser unit that we are trying to understand.
• By camera we refer to the trajectory data from the camera recordings, where each data point is the calculated centre of mass of the rat, computed with computer vision algorithms.
• By virtual beams we refer to data computed from the camera data and a virtual beam-crossing grid, trying to recreate the input to the analyser unit.

These three data signals were provided to us by IRLAB; how they are generated is explained further in Section 4.1. The most important thing to remember is that the original data is the output of the analyser unit, while virtual beams is an artificial recreation of the input to the unit.

3 Theory

In this section we present the algorithms that have been investigated as potential candidates for modelling the black-box analyser. The selection of algorithms is based on discussions with IRLAB and Smartr, in combination with what kinds of algorithms are commonly used in signal processing. Moreover, each algorithm has a set of hyperparameters, which is why we discuss hyperparameter optimisation later in this section.

3.1 Algorithms

In total, we investigate seven algorithms. Firstly, a set of common smoothing algorithms: B-spline smoothing, exponential smoothing and rolling mean. Moreover, we investigate an implementation of the classic Kalman filter. Lastly, we investigate a set of custom algorithms which have been named Enflo's algorithm¹, Slider, and Rolling Unique Mean (RUM). In this section, the input trajectory to which we apply the algorithms will be denoted T(t), and the output will be denoted T̂(t). A list of all algorithms and their hyperparameters can be found in Table 1.

Algorithm              Parameters
B-spline smoothing     d, S
Exponential smoothing  α
Rolling mean           w
Kalman filter          a, σ_Q, σ_R
Enflo                  w, r
Slider algorithm       M
Rolling unique mean    n

Table 1: Overview of the algorithms and their parameters that we optimise.

B-spline

We utilise the scipy function splrep, which finds a B-spline representation of a given one-dimensional curve. A spline function is a piece-wise polynomial function whose pieces meet in so-called knots. In our case, the knots are the points of our time series. The two parameters that we can control are the degree of the polynomial, d, and a smoothing factor S. The smoothing factor defines the trade-off between closeness to T(t) and smoothness of fit, where larger S means smoother results. One problem with this approach is that it cannot be performed on streaming data, but only once the tracking has terminated. This clearly indicates that the analyser unit does not use this function, but it is kept to see if it could be a viable post-processing algorithm for recreating trajectories similar to those of the analyser unit.
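To make the procedure concrete, the following is a minimal sketch of this smoothing step, assuming a one-dimensional coordinate series sampled at integer frame indices; the default parameter values are the optimised ones reported in Appendix A.

```python
# Minimal sketch of the B-spline smoothing step for one coordinate series.
import numpy as np
from scipy.interpolate import splrep, splev

def bspline_smooth(x, d=1, S=20232.4074):
    """Fit a smoothing B-spline to a coordinate series and evaluate it
    back at the original time points (an offline, whole-trajectory pass)."""
    t = np.arange(len(x))
    tck = splrep(t, x, k=d, s=S)   # B-spline representation (knots, coefficients, degree)
    return splev(t, tck)           # evaluate the spline at each frame
```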

Exponential Smoothing

Exponential smoothing computes a moving average where the importance of previous observations in the input trajectory decays exponentially. The calculation is simple since the algorithm only needs to keep track of two values at any point in time. It is described by the following equations:

T̂(0) = T(0)
T̂(t + 1) = α·T(t) + (1 − α)·T̂(t)     (1)

For this algorithm, we can optimise the parameter α, which determines how much weight we put on the current observation.

¹This is named after Per Enflo, its creator.
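A minimal sketch of the recursion in equation (1), applied to one coordinate series; α = 0.6 is the value reported in Appendix A.

```python
# Sketch of the exponential smoothing recursion in equation (1).
import numpy as np

def exponential_smoothing(T, alpha=0.6):
    T_hat = np.empty(len(T), dtype=float)
    T_hat[0] = T[0]
    for t in range(len(T) - 1):
        # T_hat(t+1) = alpha * T(t) + (1 - alpha) * T_hat(t), as in equation (1)
        T_hat[t + 1] = alpha * T[t] + (1 - alpha) * T_hat[t]
    return T_hat
```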

Rolling Mean

The rolling mean algorithm computes a different moving average compared to exponential smoothing. It utilises a window of size w which is convolved over the input trajectory, computing the mean of the values within the window. It is computed with the following formula:

T̂(t) = 0,   t = 0, …, w − 1
T̂(w) = (1/w) ∑_{i=1}^{w} T(i)
T̂(t) = T̂(t − 1) + (1/w)·(T(t) − T̂(t − 1)),   t = w + 1, …, L

where L is the length of the trajectory T. The only parameter for the rolling mean algorithm is the window size w, which is the number of values taken into account during the calculations.
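The formula translates directly into code; the following sketch assumes a one-dimensional coordinate series and uses the window size w = 9 from Appendix A.

```python
# Sketch of the rolling-mean update as written above: a warm-up mean over
# the first w samples followed by the incremental update.
import numpy as np

def rolling_mean(T, w=9):
    T_hat = np.zeros(len(T), dtype=float)   # T_hat(t) = 0 for t < w
    T_hat[w] = np.mean(T[:w])               # initial window mean
    for t in range(w + 1, len(T)):
        T_hat[t] = T_hat[t - 1] + (T[t] - T_hat[t - 1]) / w
    return T_hat
```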

Kalman Filter

The Kalman filter is a well-known recursive algorithm for estimating the state of a system from a series of noisy observations. The filter is widely used in many technological fields, which is why it is also included in this study [1]. In short, the Kalman filter is the optimal filter for the linear Gaussian state-space model with respect to the minimum mean-square error of the state estimate. The parameters we optimise are the step size of our numerical differentiation, a, the noise in the model, σ_Q, and the noise in the measurements, σ_R.
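The report does not specify the exact state-space model, so the following sketch is illustrative only: it assumes a constant-velocity model per coordinate with position as the only measured quantity, and uses the parameter values from Appendix A.

```python
# Illustrative sketch of a Kalman filter for one coordinate series.
# The constant-velocity model and the noise covariances are assumptions;
# a, sigma_Q and sigma_R default to the values reported in Appendix A.
import numpy as np

def kalman_filter_1d(z, a=4.0, sigma_Q=1.3215, sigma_R=4.7751):
    F = np.array([[1.0, a], [0.0, 1.0]])   # state transition (position, velocity)
    H = np.array([[1.0, 0.0]])             # we observe position only
    Q = sigma_Q**2 * np.eye(2)             # process noise covariance (assumed isotropic)
    R = np.array([[sigma_R**2]])           # measurement noise covariance
    x = np.array([z[0], 0.0])              # initial state estimate
    P = np.eye(2)                          # initial state covariance
    out = np.empty(len(z))
    for t, zt in enumerate(z):
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        x = x + K @ (np.array([zt]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        out[t] = x[0]                      # filtered position estimate
    return out
```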

Enflo Algorithm

The Enflo algorithm combines the computation of a mean over several time steps with a numerical derivative. We define the window for the mean and derivative as w.

T̂(t) = 0,   t = 0, …, w − 1
T̂(w) = T(w)
D(t) = (T(t + w) − T(t − w)) / (2w)     (2)
T̂(t) = r·T(t) + (1 − r)·(T̂(t − 1) + D(t − 1)),   t = w + 1, …, L − w

where L is the length of T and r is a constant between 0 and 1. The T̂ resulting from this algorithm is 2w samples shorter than T(t), which needs to be taken into account when comparing results. There are two hyperparameters for the Enflo algorithm: the window size w and the factor r, which weighs the importance of the current time step.
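A runnable sketch of equation (2) for one coordinate series, with the optimised values w = 20 and r = 0.2061 from Appendix A; note that the returned trajectory is 2w samples shorter than the input, as discussed above.

```python
# Sketch of the Enflo algorithm in equation (2) for one coordinate series.
import numpy as np

def enflo(T, w=20, r=0.2061):
    L = len(T)
    D = np.zeros(L)
    T_hat = np.zeros(L)
    T_hat[w] = T[w]
    for t in range(w, L - w):
        D[t] = (T[t + w] - T[t - w]) / (2 * w)   # central-difference derivative
    for t in range(w + 1, L - w):
        T_hat[t] = r * T[t] + (1 - r) * (T_hat[t - 1] + D[t - 1])
    return T_hat[w:L - w]                        # output is 2w samples shorter
```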

Slider Algorithm

The Slider algorithm is a custom algorithm that accumulates the change in both the X and Y coordinates. If the cumulative change is large enough, it is registered and added to the output trajectory. The algorithm is expressed as pseudocode in Algorithm 1. Although optional, we apply a smoothing algorithm to the trajectory T before applying the Slider algorithm, since the Slider algorithm does not perform any averaging; in our experiments, we use exponential smoothing for this. There is one hyperparameter, M, which sets the threshold for the cumulative change in either the X- or Y-coordinate.

Algorithm 1 Slider algorithm

Require: Trajectory T = (T_X, T_Y) in X- and Y-coordinates
  Initialise empty vector T̂ with same shape as T
  X_cum ← 0
  Y_cum ← 0
  T̂_X(0) ← T_X(0)
  T̂_Y(0) ← T_Y(0)
  for t = 1, 2, …, length(T) do
      T̂_X(t) ← T̂_X(t − 1)
      T̂_Y(t) ← T̂_Y(t − 1)
      X_cum ← X_cum + (T_X(t) − T_X(t − 1))
      Y_cum ← Y_cum + (T_Y(t) − T_Y(t − 1))
      if (|X_cum| > 0 and |Y_cum| > 0) or |X_cum| + |Y_cum| > M then
          T̂_X(t) ← T̂_X(t) + X_cum
          T̂_Y(t) ← T̂_Y(t) + Y_cum
          X_cum ← 0
          Y_cum ← 0
      end if
  end for
  return (T̂_X, T̂_Y)
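A runnable sketch of Algorithm 1, assuming the trajectory is an (L, 2) array of X- and Y-coordinates; M = 5 is the optimised threshold from Appendix A.

```python
# Sketch of Algorithm 1 (Slider): the output holds its last value until the
# accumulated change triggers an update, at which point the accumulators reset.
import numpy as np

def slider(T, M=5.0):
    T_hat = np.empty_like(T, dtype=float)
    T_hat[0] = T[0]
    x_cum = y_cum = 0.0
    for t in range(1, len(T)):
        T_hat[t] = T_hat[t - 1]                       # hold previous output
        x_cum += T[t, 0] - T[t - 1, 0]
        y_cum += T[t, 1] - T[t - 1, 1]
        moved_both = abs(x_cum) > 0 and abs(y_cum) > 0
        if moved_both or abs(x_cum) + abs(y_cum) > M:
            T_hat[t, 0] += x_cum
            T_hat[t, 1] += y_cum
            x_cum = y_cum = 0.0
    return T_hat
```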

Rolling Unique Mean

Rolling Unique Mean (RUM) is an extension of the rolling mean algorithm with a special update rule. Instead of calculating the mean of the last w values, we use a queue of length n and calculate the mean of its contents. We only add an element to the queue if the new value differs from the previous one; this way, the same element is not added to the queue multiple times in a row. Pseudocode for RUM is found in Algorithm 2.

Algorithm 2 Rolling Unique Mean

Require: Trajectory T
  Initialise empty vector T̂ with same shape as T
  Initialise empty queue Q
  T̂(0) ← T(0)
  for t = 1, 2, …, length(T) do        ▷ performed separately for each coordinate of T
      if T(t) ≠ T(t − 1) then
          Enqueue T(t) to Q
      end if
      if length(Q) > n then
          Dequeue the oldest element in Q
      end if
      T̂(t) ← mean(Q)
  end for
  return T̂
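A runnable sketch of Algorithm 2 for one coordinate series, using a double-ended queue with a maximum length so that the oldest element is dropped automatically; n = 4 is the queue length reported in Appendix A. Seeding the queue with the first sample is our own guard against taking the mean of an empty queue.

```python
# Sketch of Algorithm 2 (Rolling Unique Mean) for one coordinate series.
from collections import deque
import numpy as np

def rolling_unique_mean(T, n=4):
    T_hat = np.empty(len(T), dtype=float)
    T_hat[0] = T[0]
    Q = deque([T[0]], maxlen=n)   # maxlen drops the oldest element on overflow
    for t in range(1, len(T)):
        if T[t] != T[t - 1]:      # only enqueue values that changed
            Q.append(T[t])
        T_hat[t] = np.mean(Q)
    return T_hat
```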

3.2 Hyperparameter Optimisation

To summarise the number of parameters for each algorithm, we have gathered them in Table 1. Since the algorithms that we investigate have a set of hyperparameters, which we call θ, we select their values based on the data that we observe. This is performed by applying an optimisation algorithm which explores parameter values and picks the best values found given a loss function L(f(x; θ), y), where f(x; θ) is one of our algorithms with input x, and y is what the output of f(x; θ) is compared to. More formally, we are solving the following optimisation problem:

θ* = argmin_θ L(f(x; θ), y)   subject to θ ∈ P     (3)

where P is the set of allowed values for the hyperparameters and L(f(x; θ), y) is a loss function chosen by us. The more practical details of the hyperparameter optimisation are explained in Section 4.3.

4 Methods

In this section, we describe the practical details of this project. More specifically, we explain the experimental data and setup, and how we evaluate the algorithms mentioned in the theory section.

4.1 Experimental Setup

In this study we look at the tracking of rats in a box. The box is 42 centimetres wide and divided into an integer-valued coordinate grid of 128×128 positions, so that each step corresponds to 42/128 ≈ 0.328 cm. The steps in the grid are defined by IRLAB. Additionally, the minimum and maximum observed coordinate values are 8 and 120 respectively, since a rat's centre of mass can only get so close to the walls. In each experiment, the rat's movement is tracked for one hour, with its position measured 25 times per second and rounded to integers. The rat is tracked with both the analyser unit and a top-down

camera which films the interior of the box during the entire experiment. The analyser unit uses a grid of laser beams to calculate the rat's position based on which beams the rat breaks. In total, we have data on 18 recordings of different rats, numbered from 1 to 20, where numbers 5 and 13 are excluded.

To investigate the signal processing inside the analyser unit, we need to know its input. However, we do not have its exact input, which is why we instead simulate it by recreating it artificially, resulting in the data set we call "virtual beams". This is done with the help of the camera recording: by tracking a bounding box around the rat's body with the camera, it is possible to compute which laser beams it would break, and thus get an estimate of the analyser unit's input; a simplified sketch of this idea is given below. There are however some shortcomings to this approach; for example, we cannot recreate disturbances from the rat's tail.

In Figure 1 we see an example of how part of a trajectory can look. The left plot shows all three data sets, original, camera and virtual beams, recorded during the same time period. Here we can see the challenge of using virtual beams to recreate the original output. We see that the two presented algorithms, RUM and Slider, in some ways assimilate the characteristics of the original trajectory much better than their input data, yet they are fundamentally limited by positional differences between their input and what we are trying to recreate. Not knowing the real input to the analyser unit is one of the bigger sources of error in this project.
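As an illustration of the idea, the following is a highly simplified, hypothetical sketch of deriving a position estimate from the beams broken by a bounding box along one axis; the report does not specify the actual procedure, so the beam layout, the bounding-box format and the centre estimate are all assumptions.

```python
# Hypothetical sketch of the virtual beams idea: which beams would a
# bounding box break along one axis, and where is their centre?
import numpy as np

def virtual_beam_position(bbox, beam_positions):
    """bbox: (low, high) extent of the rat's bounding box along one axis.
    beam_positions: 1-D array of beam coordinates along the same axis."""
    low, high = bbox
    broken = beam_positions[(beam_positions >= low) & (beam_positions <= high)]
    if len(broken) == 0:
        return None                      # no beam broken along this axis
    return int(round(broken.mean()))     # centre of broken beams, on the integer grid
```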

Figure 1: A small time window (minute 30.25 to 30.5) from rat number 1, shown for some of our data. Both plots show the original trajectory in yellow: to the left together with the camera and virtual beams data, and to the right together with the Slider and RUM algorithms applied to the virtual beams data.

4.2 Evaluation

We cannot simply check how closely we recreate the original trajectories, because of the discrepancies seen above. We therefore need other metrics. The main approach, as used by IRLAB, is to calculate a set of features that summarise the characteristics of each trajectory. We calculate 9 different features (with the notation used in parentheses):

• Acceleration (Acc)

• Total distance (Di)
• Velocity (Vel)
• Meander (Me)
• Meander per distance (Mem)
• Fraction of time spent in the middle of the box (Mi)
• Fraction of time in movement (Mo)
• Stands still in the middle of the box (Stm)
• Stands still (St)

These features are calculated for every quarter of the recorded hour, and also for 7 different sampling frequencies specified by IRLAB. This results in 9 · 4 · 7 = 252 different values calculated for each rat.³ These features can be compared directly, but since they are of such high dimension it is also convenient to compress the information using principal component analysis (PCA)⁴, as sketched below. This low-dimensional representation can more easily be used to compare parts of the variation in the features between multiple trajectories. Beyond using the features, we also investigate the data with the help of various visualisations and statistical tools such as histograms and heatmaps.
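A small sketch of the compression step, assuming the features have been collected into an array with one row per trajectory and 252 columns; the standardisation step is our assumption, as the report only states that the scikit-learn implementation of PCA is used.

```python
# Sketch of compressing the 252-dimensional feature vectors with PCA.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# `features` stands in for the (n_trajectories, 252) feature array described
# above; random data here only so the sketch runs.
features = np.random.rand(16, 252)

X = StandardScaler().fit_transform(features)   # put features on comparable scales
pca = PCA(n_components=2)
scores = pca.fit_transform(X)                  # 2-D representation used for the plots
print(pca.explained_variance_ratio_)           # variance explained per component
```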

4.3 Optimising Hyperparameters

For the optimisation problem presented in equation (3), we use the following loss function:

L(f(x; θ), y) = ‖f(x; θ) − y‖₂ + ∑_{i=1}^{4} q_i ‖F_i(f(x; θ)) − F_i(y)‖₂     (4)

where F_i(·) is a set of characteristic features from the i:th quarter of the trajectory and q_i is the multiplicative weight for the i:th quarter. The first term in the loss function is the mean-square distance between the original data y and the output of an algorithm f applied to the virtual beams data x. The second term is the mean-square difference between the feature values of the trajectories y and f(x). We do not optimise with respect to all features, in order to reduce the complexity of the loss; thus F_i contains only Mem, Acc, St and Di.

To find suitable hyperparameters θ, we solved the optimisation problem using HyperOpt⁵, which is a well-known method for hyperparameter tuning [2]. We used 50 evaluations in total per algorithm and a tree of Parzen estimators to suggest new parameter values during optimisation. To evaluate the optimised hyperparameters of our algorithms, we split the dataset of rat trajectories into a training dataset, which is used to tune the parameters, and a validation dataset, which can be used to test whether the chosen hyperparameters generalise well to unseen trajectories. The training set and validation set contain rats numbered (1, 2, 3, 4, 6, 7, 8, 10, 11, 12) and (14, 15, 16, 18, 19, 20), respectively.⁶

³IRLAB also calculates 2 features corresponding to vertical movement, but the rats' vertical movement is not available for the camera data, which is why these are left out.
⁴We use the scikit-learn implementation of PCA.
⁵Code available at https://github.com/hyperopt/hyperopt
⁶Rats 9 and 17 were removed due to complications when calculating features.

Although the hyperparameter tuning cannot give an optimal set of parameters, both due to computational limitations and due to its dependence on our subjective choice of loss function, the hyperparameter optimisation is a helpful guide for picking parameter values. However, for some of the algorithms the parameter values were chosen from experience rather than from the output of the optimisation. For example, for RUM we chose n = 4 based on observations made while constructing the algorithm, rather than optimising it. The best hyperparameters found, as well as the values for the weights q, are listed in Appendix A.
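A minimal sketch of the tuning loop with HyperOpt, shown here for the α parameter of exponential smoothing; the signals and the loss are stand-ins (the full loss in equation (4) also includes the feature terms), while the 50 evaluations and the tree of Parzen estimators match the setup described above.

```python
# Sketch of the HyperOpt tuning loop with a simplified stand-in loss.
import numpy as np
from hyperopt import fmin, tpe, hp, Trials

# Placeholder signals; in the project these are the virtual beams input
# and the analyser unit (original) output.
virtual_beams = np.random.rand(1000)
original = np.random.rand(1000)

def exp_smooth(T, alpha):
    T_hat = np.empty(len(T))
    T_hat[0] = T[0]
    for t in range(len(T) - 1):
        T_hat[t + 1] = alpha * T[t] + (1 - alpha) * T_hat[t]
    return T_hat

def loss(output, target):
    # Stand-in for equation (4); only the trajectory term is included here.
    return float(np.linalg.norm(output - target))

space = {"alpha": hp.uniform("alpha", 0.0, 1.0)}   # hypothetical search range

def objective(params):
    return loss(exp_smooth(virtual_beams, params["alpha"]), original)

best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=Trials())
print(best)
```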

4.4 Code Implementation

All algorithms have been implemented in Python and are available in a GitHub repository.⁷

⁷Email Rickard Karlsson for access; contact information is available at the top of the report.

Figure 2: Resulting trajectories for rat 1 from our seven algorithms, next to the virtual beams, original and camera trajectories.

5 Results

We start by presenting discoveries from the analysis of the original data, and how they indicate which methods are more likely to be useful. We then follow up with a performance analysis of the different algorithms covered in this project, applied to the artificial input data. Before presenting the statistical analysis, we share an overview of how the trajectories look for a single rat. These trajectories are either given by the three different tracking methods or by one of the algorithms applied to the virtual beams data signal; they are seen in Figure 2.

5.1 Statistical analysis

One of the first things we observed was that the different tracking methods result in significantly different distributions of the rat's position; this is best observed in the heatmaps presented in Figure 3. The original data shows a mosaic-looking distribution, the camera data shows smoother transitions, and in the virtual beams data we clearly see the grid-like appearance of the virtual beam positions. It can also be observed that the rats spend the majority of their time along the edges and avoid the middle. In the presented heatmaps only one rat is shown as an example, but the same behaviour is displayed for every rat. Since the scale is logarithmic, it can also be observed that there are a few positions in which the rat spends the majority of its time. The difference between the white dots and the dark areas is of the order 10⁴, where one experiment is ∼9 · 10⁴ frames. This indicates that the rats lie down or stop after a while of running around.

Figure 3: The grid represents the positions of rat 1 during the trial. The value in each point represents how many frames were spent in that point. Note that the scale is logarithmic, so the amount of time spent along the edges is up to 10⁴ times greater than in the middle.

Data type   Total number of stops in all experiments   Standing still [% of total frames]
Original    22 607                                     98.1%
Camera      65 249                                     94.5%

Table 2: Summary of the amount of time the rats stand still and the number of times they stop, for the two different data collection methods.

A difference can be observed in the amount of time the rats stand still between the two data collection methods, as presented in Table 2. A stop is defined here as no change in position between two frames, where there has been movement between the two previous frames (a small sketch of this count is given below). It is perhaps not surprising that there are many stops, since we measure the position of the rats every 1/25 second and register the positions as integers only, i.e. the resolution is relatively low. It is however interesting that the camera data registers more stops in total, but significantly less time standing still. A possible explanation is that the camera can track the rat's position more precisely, since it is not dependent on a grid; it therefore updates more often, resulting in more frequent but shorter stops between updates. Another explanation is that the analyser unit may use some threshold or averaging technique, resulting in fewer total movements. This is one of the realisations that led up to the RUM and Slider algorithms. These discrepancies between the two data collection methods led us to look closer at how the step sizes differ between them. This can be observed in Figures 4 and 5, which show how the rats move in one direction when the other is still (Figure 4) and how they move in one direction when also moving in the other (Figure 5).
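A small sketch of the stop count in Table 2 under the definition above, assuming an (L, 2) integer trajectory array.

```python
# Sketch of the stop count: still between two frames, but moving between
# the two frames before that.
import numpy as np

def count_stops(T):
    moved = np.any(np.diff(T, axis=0) != 0, axis=1)   # moved[t]: movement t -> t+1
    return int(np.sum(~moved[1:] & moved[:-1]))       # still now, moving just before
```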

Figure 4: The distribution of step lengths recorded for the rats in one direction (x or y) when standing still in the other. For example, if there is no change in the x-position between two frames, we register the change in the y-position in this graph. We do the same for both dimensions. This way we can see how a rat's movement in one direction depends on the other. Note that the y-axis is log-scaled.

What is interesting is how the rat moves in the original data in the two figures. We can see that for all data types, no movement is the most common. However, in the original data, if there is movement in one direction (Figure 5), it is as common that the rat also moves in the other direction as that it stands still. This pattern cannot be seen in any of the other data types. It is also interesting to note that the step lengths have a similar distribution for the camera and virtual beams data, regardless of which graph you observe. For the analyser unit, however, the step length actually seems to depend on whether the rat is moving in both directions or just in one: in Figure 4 we can see that if a rat is standing still in one coordinate, its most common step length when moving is actually 2 rather than 1. An easy explanation for this is simply that the distances between the beams that break are larger, while a camera can detect shorter movements. Moreover, it is worth mentioning that the virtual beams data seems to show steps of both length 3 and 4, but this is a result of rounding errors rather than real movement. Since the figures have logarithmic scales, even if we add the number of steps of length 3 and 4 into one bar, it will still not be more frequent than standing still.

Figure 5: The distribution of step lengths recorded for the rats in one direction (x or y) when moving in the other. For example, if there is a change in the x-position between two frames, we register the change in the y-position in this graph. We do the same for both dimensions. This way we can see how a rat's movement in one direction depends on the other. Note that the y-axis is log-scaled.

As a next step, a combination of Figures 4 and 5 was computed, where we look at how all steps from the experiments are distributed; this is presented in Figure 6. Here we see a clearer illustration of what we mentioned earlier. The original data has a more even (logarithmic) distribution of the different step lengths, even for longer steps, while the camera data favours shorter ones and does not display steps longer than 6 more than 50 times out of over 1.6 million frames. This clearly illustrates the difference between the two data collection methods, and these are characteristics we need to try to recreate with our algorithms to be able to generate similar feature values or, in the best case, totally recreate the algorithm that the analyser unit uses.

Figure 6: The distribution of step lengths recorded for all rats in all experiments. Note that the y-axis is log-scaled.

During the statistical analysis of the data, we also investigated whether the error varies with the rat's position in the box. In this case, the error is measured between the camera data and the original data, or between the virtual beams data and the original data. It was computed as the root mean square error (RMSE) and sorted into bins with respect to the X- and Y-coordinates of the original data. However, we know that the rats are positioned near the edges of the box most of the time, resulting in an unbalanced number of samples in each bin. To partly circumvent this, we look at the distribution of the errors in each bin with box plots, as seen in Figures 7 and 8. They show that the median (orange line) and the interquartile ranges are mostly the same for each bin. This may indicate that the error does not directly depend on the rat's position in the box.

Figure 7: Distribution of error differences (RMSE) between camera data and original data with respect to bins with different X- and Y-coordinates.

Figure 8: Distribution of error differences (RMSE) between virtual beams data and original data with respect to bins with different X- and Y-coordinates.

5.2 Evaluation of the algorithms

From the statistical analysis we have already noticed some differences between the various trajectories generated by the three tracking methods, as well as by the algorithms that we investigate. In this section, we focus on answering the question of whether we can model the unknown signal processing which generates the original data signal from the analyser unit. Thus, we evaluate the seven algorithms by comparing their outputs when applied to the virtual beams signal, as described in section 4.2. Remember that the virtual beams signal is what we use to artificially simulate the input to the analyser unit.

A first visualisation performed for every algorithm is to compare the movement distributions between the models, which is most easily done with heatmaps of the rat movement. In Figure 9 we can see how the different algorithms perform compared to the original and camera data. It can be observed that the smoothing algorithms retain the grid-like structure of the virtual beams data seen in Figure 3. We can also see that the B-spline algorithm seems to be smoother than the original. However, the RUM and Kalman algorithms show a distribution whose mosaic appearance resembles the original data. The Slider algorithm seems to be a mix of the grid feature and some mosaic features. Of all the algorithms, RUM's distribution most closely resembles the original.

Figure 9: Heatmaps showing the trajectory of rat number 1, filtered with the seven algorithms, with the camera and original data as comparisons.

As a next step, we make a similar analysis as earlier of the distribution of the step lengths generated with the different algorithms. As seen in Figure 6, the original data has a rather unique distribution compared to the camera and virtual beams data. In Figure 10 we see the step distribution in each dimension for all algorithms. Here we can clearly see that Kalman and Enflo are more similar to the camera data distribution, with a focus on shorter steps rather than the original data's more even distribution. The Slider algorithm shows a very distinctive appearance, while RUM and B-spline in some parts seem to recreate the original best, although far from exactly. It is again worth noting how even the distribution of the original data seems to be.

We do not expect the algorithms to recreate the distributions presented in Figures 9 and 10 perfectly since, as stated before, we do not have the exact same input. Still, we hope to generate trajectories with similar characteristics. Thus, based on these plots we may make a qualified guess that the analyser unit is not solely using a rolling mean, a Kalman filter or the Enflo algorithm. It does not seem probable that Slider or exponential smoothing is the correct algorithm either; however, they perform relatively well in recreating the features calculated for the trajectories. As mentioned earlier, we know the box cannot be using the B-spline algorithm, since it is an offline algorithm. But it still shows potential as a possible post-processing model to recreate the feature values that interest us, so we keep it for further comparisons.

Figure 10: The step distributions of the resulting trajectories of all algorithms, with the original and camera data as reference. The algorithms have all been applied to the virtual beams data set.

Next we compare the performance of our algorithms by feature comparison. For each of the features defined in section 4.2 (Acc, Di, Vel, Me, Mem, Mi, Mo, Stm and St), we compare the original data set to the different algorithms, as well as to the virtual beams data set. Comparisons of all features can be found in Appendix B. In Figure 11 we show the comparison of the acceleration feature as a demonstration.

Figure 11: Comparison plots for the acceleration of the camera and virtual beams trajectories, as well as the seven implemented trajectory algorithms, on the y-axis, compared to that of the original on the x-axis. The colours in each plot distinguish the four quarters of the recorded hour of an experiment. Each plot includes feature data for all 18 rats.

For the acceleration feature, Figure 11 shows that B-spline and the Slider algorithm come closest to the original data for all quarters. We see that exponential smoothing also performs relatively well, whilst the others diverge more. The other algorithms, as well as the camera data, show generally lower values of acceleration, whilst the virtual beams data shows higher values. Performing a similar analysis on the rest of the feature plots, we conclude that we cannot pick a single algorithm that consistently performs better than the rest for all features. The following conclusions are made for the rest of the features:

- For velocity (Vel) and total distance (Di), the Enflo algorithm performed best, along with the camera data. Exponential smoothing and rolling mean also did well.
- For meander (Me), rolling mean, exponential smoothing and the Kalman filter performed particularly well.
- For meander per distance (Mem), there is high dispersion for all data sets compared to the original, but B-spline performed somewhat better than the rest.

- For the stand-still features (Stm and St), the B-spline and RUM algorithms perform best. The rest of the algorithms significantly worsen these features compared to their input, that is, the virtual beams data.
- For the fraction of time in movement (Mo), all algorithms perform well, with the B-spline and RUM algorithms performing even better than the rest.
- For the fraction of time in the middle (Mi), all the scatter plots are similarly dispersed, meaning that no algorithm managed to affect this feature in any significant way.

Finally, based on the previous results we decided to compare a narrower set of algorithms with the original data using a PCA plot. In Figure 12, we see a comparison of B-spline smoothing, exponential smoothing, Slider and RUM when applied to the virtual beams data for our set of validation rats. The first two principal components in this analysis explain 43.8% and 20.5% of the variance in the data, respectively. This gives an indication of how similar the characteristic features of the algorithms and the original data are. PCA plots with the other rats and algorithms can be found in Appendix C.

Based on Figure 12, the following observations are made. We see that RUM (star) is similar to B-spline (cross). It also appears that RUM and B-spline are usually the closest to the original (circle). Both exponential smoothing (diamond) and Slider (square) are sometimes close to the original as well, but less so than RUM and B-spline. Thus, no algorithm matches the original data perfectly, which does not come as a surprise given our previous results. Still, we see that RUM and B-spline do a good job of imitating parts of it.

Figure 12: The resulting PCA plot comparing trajectories of the validation rats. In the plot we have included B-spline smoothing, exponential smoothing, rolling unique mean and Slider, which are compared to the original data. The percentage of variance explained by the first and second principal components is 43.8% and 20.5%, respectively.

6 Discussion

This has been an exploratory study of how the analyser unit filters the data. We have performed a statistical analysis of the data generated while tracking the rats during the experiments, and also assessed seven different algorithms that could possibly model the black-box analyser unit. Some were more promising than others, but we have not been able to determine with certainty the algorithm used by the analyser unit. Still, the best candidates we evaluated were rolling unique mean (RUM) and B-spline.

6.1 Sources of Errors The road to original data, contains in fact at least two unknown components. First there is the data from the beam crossing system which has been translated to trajectory data. We simulate this component when we create the virtual beams data, upon which we then assume that some further unknown algorithm has been applied by the analyser. But these two components are indistinguish- able from outside of the black box, which makes it hard if not impossible to identify the source of specific errors as shortcomings of the way we compute virtual beams, or of the way we imitate the analyser. Simulating optical beams with the virtual beams in the camera data causes further error factors. 21

Since the laser beams are placed at various heights in the rat's box, the virtual beams setup attempts to imitate a method that uses three-dimensional information using only two-dimensional camera data. This means that any use of height information in the original setup is disregarded in the virtual beams setup. The original setup is sensitive to any occurrence that breaks an optical beam, while the virtual beams setup is limited to breaks within the four corners of the detected bounding box. An example of occurrences missed by the virtual beams setup are the sudden occasional "jumps" detected in original data trajectories. These are suspected to be the tail of a rat lifting and breaking some beam outside of its bounding box, which is not detected in the virtual beams data and impossible to recreate with our setup.

The virtual beams data relies on video recordings, while the original data does not. Recording effects and the projection of the beams onto the box are therefore error factors. An example is a small but present fish-eye effect in the footage, which makes the virtual beams unable to reach all the way to the edges of the image. This makes subtle movements along the edges undetectable in the virtual beams data. Other camera-induced error factors include the positioning of the camera, as well as potential movement of the camera both during and between experiments. However, these error factors can quite easily be minimised; moreover, their effects might prove static and systematic enough that no correction is necessary.

A later addition to the project was artificial data created by moving blocks around in the experimental setup. This showed promise for recreating more controlled input data, which could be used to model the analyser unit without the larger jumps and similar disturbances. Because of time limitations, no extensive analysis of this data has been performed as part of this project, but a natural next step would be to create artificial input data more closely matching that of the analyser unit, and in doing so more easily assess the performance of different filtering algorithms.

6.2 Conclusion

As part of this project several algorithms have been evaluated. We have not been able to identify one algorithm that we believe is the true one used by the analyser unit. We have however been able to eliminate several well-known algorithms that were potential candidates in the beginning. What seems to be true about the analyser is that it probably keeps things simple, which supports the belief that algorithms similar to Slider and RUM are a more probable way forward than the more well-known filtering algorithms. Lastly, the B-spline's good performance, even though it is probably not the correct algorithm, could be a testament to the flexibility and versatility of fitting multiple low-degree polynomials to a time series. There is merit to it as a post-processing method, but not as the black-box function. However, one has to be very wary of overfitting if it is used as a post-processing method, so the hyperparameters need to be optimised with more data points.

As a final thought to keep in mind while moving forward, we want to paraphrase Per Enflo from one of our meetings: we should stop trying to figure out how clever the box is, and instead try to understand how stupid it is.

References

[1] Vikram Krishnamurthy. Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing. Cambridge University Press, 2016.

[2] James Bergstra, Daniel Yamins, and David Cox. Making a Science of Model Search: Hyperparameter Optimization in Hundreds of Dimensions for Vision Architectures. In Proceedings of the 30th International Conference on Machine Learning, volume 28 of Proceedings of Machine Learning Research, pages 115–123. PMLR, 2013.

A Results from Hyperparameter Optimisation

The weights q_i for i = 1, …, 4 in the loss function, see equation (4), were chosen to make the first and last quarters more important, since we noticed that these usually had more deviations in the feature plots. Thus, we selected the weights as follows:

Weight   q_1   q_2    q_3    q_4
Value    1     0.25   0.25   1

Table 3: Multiplicative weights for the different quarters in the loss function, see equation (4).

In the following subsections, we present the parameter values that we use for each algorithm, together with brief explanations.

Exponential smoothing

A suitable value for the weighting coefficient in equation (1) was found to be α = 0.6. This was decided with the help of inspecting the distribution of step lengths, seen in Figure 10.

B-spline smoothing

The parameter values for the B-spline smoothing were found with the help of the optimisation procedure. The degree of the polynomial was set to d = 1, while the smoothing factor was S = 20232.4074. It is a bit surprising that the degree of the polynomial was found to be one, but we could not come up with a better alternative ourselves, which is why we decided to go with this.

Enflo algorithm

The parameter values for the Enflo algorithm were found with the help of the optimisation procedure. The window size was found to be w = 20, while the resulting weighting factor was r = 0.2061; see equation (2) for how they relate to the algorithm.

Kalman filter

For the Kalman filter we found the following values with the help of the optimisation procedure: step length a = 4, as well as noise values σ_Q = 1.3215 and σ_R = 4.7751.

Rolling mean

The window size w for the rolling mean was decided with the help of inspecting the distribution of step lengths, seen in Figure 10. In the end, we selected a value of w = 9.

Rolling unique mean (RUM)

The queue length n for RUM was decided with the help of inspecting the distribution of step lengths and the movement heatmaps, seen in Figures 10 and 9. In the end, we selected a value of n = 4.

Slider algorithm

The move threshold for the Slider algorithm was set to M = 5, which was decided using the optimisation procedure.

B Feature Plots

Figure 13: Acceleration

Figure 14: Velocity

Figure 15: Meander, a measure of turns

Figure 16: Meander divided by distance

Figure 17: Stand stills as short as 2 frames long

Figure 18: Stand stills in the middle

Figure 19: Total distance

Figure 20: Fraction of time in movement

Figure 21: Fraction of time spent in the "middle" of the box

C PCA Plots

In Figures 22 and 23, PCA plots of the calculated features are presented for both the training and validation sets of rats, for all algorithms and data sets.

Figure 22: PCA plots for the optimised algorithms, where their input is the virtual beams signal and their output is compared to the (original) analyser unit signal. The rats shown are from the training set. The percentage of variance explained by the first and second principal components is 36% and 22%, respectively.

Figure 23: PCA plots for the optimised algorithms, where their input is the virtual beams signal and their output is compared to the (original) analyser unit signal. The rats shown are from the validation set. The percentage of variance explained by the first and second principal components is 41% and 21%, respectively.