Efficient Data Structures for Large Scale Tracking

R. O. Lane, QinetiQ, Malvern, UK
M. Briers, QinetiQ, Malvern, UK
T. M. Cooper, Igence Radar, Malvern, UK
S. R. Maskell, Liverpool University, Liverpool, UK

Abstract—This paper describes a set of data structures that enables the tracking of large scale data sets. Although well-known procedures exist for speeding up tracking performance, such as gating, they are typically not sufficient for situations where it is required to simultaneously track tens or hundreds of thousands of targets, where even the gating calculations themselves take a significant proportion of time. We describe dynamic spatiotemporal binary tree-based structures, a box forest and cone forest, for storing measurements and tracks, and a string trie for text information. Efficient pruning of the structures allows for a vast reduction in the number of gating calculations. Performance of a real-time tracking algorithm that uses the data structures is demonstrated on a real-world maritime data set of more than 100,000 targets.

Keywords—box forest; cone forest; data association; string trie; ubiquitous sensing

I. INTRODUCTION

Over the last few decades a variety of techniques for tracking multiple targets with multiple sensors has been proposed. Three of the main tasks involved in a tracking system are prediction of target states at measurement times, data association (the assignment of measurements to tracks), and filtering (updating tracks that have measurements assigned to them). In a basic implementation of a tracking algorithm, the state of every track must be predicted before data association can take place. During association, the computational load is usually reduced using a fine gating procedure whereby measurements that are beyond a threshold distance from a track are immediately rejected as not belonging to that track. However, for asynchronous sensor reports the number of prediction steps increases linearly with the number of targets, and the number of fine gating calculations is proportional to the square of the number of targets, assuming the number of measurements is proportional to the number of targets. With the advent of ubiquitous sensing and large scale data sets of thousands of targets, standard tracking algorithms become computationally infeasible.

The work presented in this paper has been motivated by the need to track large numbers of ships. Maritime domain awareness is recognized internationally as a vital component in the fight against illegal activity such as terrorism, smuggling and piracy. The amount of information available to this task has dramatically increased with the introduction of the automatic identification system (AIS). Ships with AIS transponders transmit their location, course, speed, and other details. However, this information needs to be validated against non-cooperative sensors such as radar. The data fusion task is computationally complex as the world’s shipping consists of more than 100,000 vessels. To enable real-time tracking of this large scale data set a suite of efficient data structures has been developed and implemented.

The remainder of this paper is organized as follows. Section II gives an overview of related work aimed at speeding up tracking algorithms. Section III describes a set of efficient data structures. Section IV explains how these structures can be applied to the tracking problem. Finally, Section V provides concluding remarks.

II. RELATED WORK

A standard assumption in the tracking literature, known as the mutual exclusion constraint, is that each measurement can only be assigned to one track and each track can have at most one measurement per scan assigned to it. Joint probabilistic data association (JPDA) examines all possible associations that satisfy the above constraint and assigns a probability to each association. The target state can then be modeled by a Gaussian mixture model with a certain number of components for Kalman filter processing [1]. Standard JPDA is generally considered to be computationally prohibitive for more than a few targets. However, the efficient hypothesis management (EHM) algorithm exploits redundancy in the calculations to produce a mathematically equivalent answer with a substantial computational saving, allowing hundreds of targets to be tracked with this approach [2]. An alternative efficient implementation of JPDA is mentioned in [3]. A simpler alternative to JPDA is the global nearest neighbor (GNN) algorithm, which picks the best single joint assignment of measurements to tracks that satisfies the constraints. GNN is faster than JPDA but can have difficulties recovering if an incorrect assignment is made at a particular point in time. A potential difficulty with these and other data association algorithms is the track swapping problem, where measurements from nearby targets are assigned to the wrong track. A more accurate multiple target tracker that considers the dependence between target states caused by unknown measurement-to-target associations, and is implemented efficiently for large numbers of targets, is described in [4].

A number of algorithms have been proposed to further improve tracking efficiency. A coarse gating procedure is described in [5]. A track search area is defined as the set of positions that can be reached assuming a certain maximum speed, taking into account the effect of state and measurement uncertainty. Tracks with no measurements in the search area do not have to perform state predictions or be included in the data association process. Tracks with at least one measurement in the search area undergo the fine gating procedure as normal. Since coarse gating is much faster than fine gating, the computation time is reduced. An algorithm in [6] stores measurements from each scan in a kd-tree structure. The structure can then be queried to quickly retrieve measurements near the predicted position of a track before fine gating. Reference [7] processes all data in one batch to group measurements into potential tracks without forcing the mutual exclusion constraint. One kd-tree is constructed per time step and the trees are used to prune measurement sets that do not conform to the motion model. Note that since batch processing is used, prediction calculations do not need to be carried out. An advance in the algorithm is reported in [8]. Cluster-based data association approaches that are linear in the number of detections and tracked objects are described in [9]. Several distributed tracking algorithms have been proposed. For example, [10] uses consensus filters for fusion of the sensor data and covariance information, [11] uses the drain-and-balance method for constructing efficient tracking hierarchies, and [12] describes a hierarchy where lower tracking nodes in a tree are only activated when instructed by higher nodes. The aim of these algorithms is to distribute the computing among several processing nodes to reduce the per-node computational requirement.

Although the above approaches are an advance on standard tracking practice, each of them has some shortcoming that has prevented general application to very large scale real-time tracking problems. For example, although some efficient data structures have been applied to problems relating to tracking ballistic objects in the atmosphere [6] and asteroids [7], the trajectories of these objects are near-deterministic. The novelty of this paper is to describe a combination of data structures that remains efficient while considering large numbers of non-deterministic trajectories.

III. EFFICIENT DATA STRUCTURES

Here we describe in detail a number of data structures that will be used later in the paper. Some of these may be familiar to readers with a computer science background but are not necessarily commonly used in the tracking community. Properties of the structures are presented for tracking purposes; other applications may require modified versions.

• Delete: When a leaf node is deleted from the tree, its sibling is promoted to its parent’s position to maintain the full binary structure.

Note that there is no method for adding a single leaf to the tree; all data must be loaded in one batch. The rationale for this is explained in the following section. Efficient searching for a data point can be carried out. The combination of the binary structure, summary data at internal nodes, and pruning function enables searching for a data point in O(log n) time, as opposed to exhaustive searching, which is O(n).

B. Binary Log Forest

For real-time performance, the data structure used to store information must be able to be queried at a faster rate on average than the arrival of new queries. Although a standard binary tree is able to execute queries quickly for data loaded in a single batch, its performance degrades with the insertion and deletion of points over time. When the tree is loaded in a batch, all leaves exist in either the bottom or second-to-bottom layer of the tree. If a single point is added or deleted, this property no longer necessarily holds. If many points are added in one part of the tree, then the depth of the leaves in that region is much greater than the depth of leaves in other parts of the tree. This unbalanced tree structure can result in a much longer search time than the optimum for the total number of data points in the tree. To re-balance the tree it would be necessary to re-organize large parts of the tree. The quality of most indexing structures, such as kd-trees or K-D-B-trees, deteriorates when a large number of updates are performed on them [13].

To avoid these problems we use the logarithmic method to construct a binary log forest [13]. The forest consists of a finite sequence of unconnected binary trees. The ith tree contains at most 2^(i-1) leaves. When a data point is deleted from the forest, it is simply deleted from whichever tree it is contained in.

Fig. 1. First three trees of a binary log forest.
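The forest layout and deletion rule described above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: each batch-loaded binary tree is stood in for by a sorted Python list searched with `bisect`, the points are plain comparable values rather than spatiotemporal data, and the insertion rule is the standard logarithmic method (merge the new point with every tree up to the first empty slot and rebuild that slot in one batch). The class name `LogForest` and its methods are hypothetical.

```python
from bisect import bisect_left


class LogForest:
    """Sketch of a logarithmic-method forest (illustrative only).

    slots[k] stands in for tree number k + 1 and holds at most 2**k points.
    Each slot is a sorted list, used here as a simple substitute for a
    batch-built binary tree.
    """

    def __init__(self):
        self.slots = []

    def insert(self, point):
        # Assumed rule (standard logarithmic method): carry the new point
        # together with the contents of every occupied slot until an empty
        # slot is found, then rebuild that slot in a single batch.
        carry = [point]
        for k, slot in enumerate(self.slots):
            if not slot:
                self.slots[k] = sorted(carry)
                return
            carry.extend(slot)
            self.slots[k] = []
        # Every existing slot was occupied: open a new, larger slot.
        self.slots.append(sorted(carry))

    def delete(self, point):
        # As in the text: remove the point from whichever tree holds it.
        for slot in self.slots:
            i = bisect_left(slot, point)
            if i < len(slot) and slot[i] == point:
                slot.pop(i)
                return True
        return False

    def contains(self, point):
        # Query every slot; each lookup is a binary search.
        for slot in self.slots:
            i = bisect_left(slot, point)
            if i < len(slot) and slot[i] == point:
                return True
        return False

    def sizes(self):
        # Number of points currently stored in each slot.
        return [len(slot) for slot in self.slots]


forest = LogForest()
for p in range(5):
    forest.insert(p)
```

After five insertions the occupied slots mirror the binary representation of 5, so no single tree is ever modified point by point on insertion; it is only rebuilt in a batch. Deleting a point leaves the other trees untouched, which is why a tree may hold fewer points than its capacity.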
