Upstream Data Fusion: History, Technical Overview, and Applications to Critical Challenges

Upstream Data Fusion: History, Technical Overview, and Applications to Critical Challenges Andrew J. Newman and Glenn E. Mitzel pstream data fusion (UDF) refers to the processing, exploitation, and fusion of sensor data as closely to the raw sensor data feed as possible. Upstream processing minimizes information loss that can result from data reduction methods that current legacy systems use to process sensor data; in addition, upstream processing enhances the ability to exploit complementary attributes of different data sources. Since the early 2000s, APL has led a team that pioneered development of UDF techniques. The most mature application is the Air Force Dynamic Time Critical Warfighting Capability program, which fuses a variety of sensor inputs to detect, locate, classify, and report on a specific set of high-value, time-sensitive relocatable ground targets in a tactically actionable time frame. During the late 2000s, APL began expanding the application of UDF techniques to new domains such as space, maritime, and irregu- lar warfare, demonstrating significant improvements in detection versus false-alarm performance, tracking and classification accuracy, reporting latency, and production of actionable intelligence from previously unused or corrupted data. This article introduces the concept, principles, and applicability of UDF, providing a historical account of its development, details on the primary technical elements, and an overview of the challenges to which APL is applying this technology. INTRODUCTION Sensor and Data Fusion The term “sensor and data fusion” refers to tech- duce new information and inferences and achieve more niques that combine data from multiple sources to pro- complete, clear, precise, accurate, and timely estimates JOHNS HOPKINS APL TECHNICAL DIGEST, VOLUME 31, NUMBER 3 (2013) 215 A. J. NEWMAN AND G. E. MITZEL of the unknown quantities than could be achieved by Upstream Data Fusion Concept the use of a single source alone. Fusion of data from “Upstream data fusion” (UDF) refers to the process- multiple sensors provides several advantages over deriv- ing, exploitation, and fusion of sensor data as closely to ing estimates from a single source (see, e.g., Refs. 1 and the raw sensor data feed as possible within the limits 2). First, a statistical advantage is gained by combining imposed by technical feasibility and operational practi- independent observations. Second, the relative posi- cality. Upstream processing minimizes the information tion or motion of sensors can be exploited to provide loss that can result from the data reduction methods a geometric advantage in estimating kinematic states used by current legacy systems that process data from a and other attributes of an observed object. Third, the particular single source; in addition, upstream process- relative strengths and weaknesses of dissimilar data ing enhances the ability of the fusion process to exploit types can be, respectively, magnified and mitigated by the complementary attributes of different data sources. combining them judiciously. Fusing different data types The UDF process taps data at an appropriate point in from multiple sensor sources, in particular from different the processing chain near the sensor source (chosen to sensing phenomenologies (also referred to as modali- acquire the desired information content within engi- ties), broadens the number of physical observables avail- neering feasibility and without disrupting current opera- able to the fusion process, which results in significant tional processes) and bypasses the data reduction and improvements in detection, tracking, and classification detection thresholding steps inherent to a traditional performance as well as resistance to countermeasures single-sensor processing approach. It then exploits the and changing conditions. upstream data by tuning detection sensitivity to respond The U.S. military has acquired and currently oper- to faint signatures and discriminating between the true ates a diverse ensemble of intelligence, surveillance, and targets and the consequent large number of false can- reconnaissance (ISR) assets but tasks and exploits them didates by fusing data across complementary sensor in self-contained enterprises, often referred to as “stove- phenomenologies, diverse view geometries, and differ- pipes,” using a combination of automated and manual ent times. In addition, processing of raw (or nearly raw) processes that are highly specialized to a particular ISR sensor data allows the UDF algorithms to extract and asset, data type, application, or military domain and do exploit measurement data (e.g., target position) and not systematically interact with other such stovepipes. associated uncertainties (e.g., error statistics such as vari- This approach often fails to fully exploit complemen- ance or covariance) with the highest possible precision, tary capabilities and opportunities for collaboration as well as to exploit attribute data that are not normally (see, e.g., Refs. 3–6). The sensors provide an enormous reported in traditional processing chains. Applying UDF volume of data but often do not support precision to operational problems in the ground, maritime, and engagements of targets of interest because of deficien- space domains has demonstrated significant improve- cies in fusion and exploitation of the available data. ments in detection versus false-alarm performance, The process of tasking, collecting, processing, exploit- tracking and classification accuracy, reporting latency, ing, and disseminating ISR data is generally divorced and production of actionable intelligence from previ- from the rapidly evolving tactical picture and the needs ously unused or corrupted data. The benefits of UDF are of the end user (e.g., theater commander or exploita- more fully described in the UDF Benefits section. tion system operator). Improved sensor and data fusion A typical legacy downstream fusion process captures capabilities are needed to satisfy the warfighter’s accu- and fuses post-detection data from the available sources. racy, persistence, and timeliness requirements (see, e.g., Each of the individual sensor systems applies its own Ref. 7). processing, data reduction, and detection threshold- In general, the ability of a multiple-sensor ensemble ing to produce a set of candidate targets (e.g., ground, to provide persistent coverage on all targets, with high maritime, or space targets) in its data stream. In particu- accuracy, high detection probability, and low false- lar, each individual system is tuned to optimize its own alarm rate, will be constrained by limits on the number intrinsic performance, meaning that detection thresh- and diversity of assets, platform speed, maneuverabil- olds are set to relatively high levels to maintain a low ity, view angle, sensor coverage and update rates, sensor false-alarm rate while reducing the data processing load. resolution, and mismatch of target and sensor phe- This reduces the probability of detecting targets whose nomenologies. Coverage gaps, missed detections, false signatures may be only faintly observed in the sensor alarms, errors in the sensor support data, and classifier data (i.e., below the defined threshold). Moreover, data confusion will cause geo-location and classification that are identified as “bad” or “corrupted” are usually errors, incorrect associations, track loss or discontinui- discarded because they may not be reliable enough to ties, and spurious tracks, resulting in an ambiguous and support decision making within that individual process- unreliable tactical ground picture. Fusion and exploi- ing chain, even though these data can often reinforce tation systems must be robust enough to address these (or contradict) the information accrued from other realistic conditions. sensors. The performance of a downstream fusion pro- 216 JOHNS HOPKINS APL TECHNICAL DIGEST, VOLUME 31, NUMBER 3 (2013) UPSTREAM DATA FUSION: HISTORY, TECHNICAL OVERVIEW, AND APPLICATIONS cess is therefore inherently limited by the reduced set dissimilar sensor modalities (intercepted signals, images, of thresholded data and the resulting limited number of conventional radar, and ground moving target indica- candidate detections that it receives. tor radar) supplying pieces of complementary upstream Figure 1 illustrates the concept of tapping and exploit- information that are traditionally exploited separately ing upstream sensor data to produce information that is and combined using only products intended for end-user not available through current means of processing single- consumption (if at all). sensor data. The UDF technique captures the raw or partially processed upstream data before decisions are UDF Principles made (i.e., before detection thresholds are applied) and performs an efficient multiple-level screening process UDF is based on the principles of efficient infor- to search the upstream data for candidate detections. mation processing and rigorous model-based evidence Screening thresholds are set very low to ensure that data accrual for multiple, possibly highly dissimilar, data that would otherwise be rejected are considered in the types. Efficiency is a key tenet because of the need to fusion process. This increases the probability of detec- extract all useful information from a potentially over- whelming volume of source data while maintaining tion of actual targets but also increases the number of computational tractability at achievable data transmis- false alarms

Upstream Data Fusion: History, Technical Overview, and Applications to Critical Challenges

The Application of Data Fusion Method in the Analysis of Ocean and Meteorology Observation Data

Causal Inference and Data Fusion in Econometrics∗

Data Fusion – Resolving Data Conflicts for Integration

Data Mining and Fusion Techniques for Wsns As a Source of the Big Data Mohamed Mostafa Fouada,B,E,F, Nour E

Data Fusion and Modeling for Construction Management Knowledge Discovery

Data Fusion by Matrix Factorization

An Overview of Iot Sensor Data Processing, Fusion, and Analysis Techniques

Information Acquisition in Data Fusion Systems

CS 520 Data Integration, Warehousing, and Provenance

Cheminformatics and the Semantic Web: Adding Value with Linked Data and Enhanced Provenance

Information Quality Assessment for Data Fusion Systems

The Use of Data Fusion on Multiple Substructural Analysis Based GA Runs