INSTITUTE FOR DEFENSE ANALYSES

Test & Evaluation of AI-enabled and Autonomous Systems: A Literature Review

Heather M. Wojton, Project Leader

Daniel J. Porter
John W. Dennis

September 2020

This publication has not been approved by the sponsor for distribution and release. Reproduction or use of this material is not authorized without prior permission from the responsible IDA Division Director.

IDA Document NS-D-14331

Log: H 2020-000326

INSTITUTE FOR DEFENSE ANALYSES 4850 Mark Center Drive Alexandria, Virginia 22311-1882

The Institute for Defense Analyses is a nonprofit corporation that operates three Federally Funded Research and Development Centers. Its mission is to answer the most challenging U.S. security and science policy questions with objective analysis, leveraging extraordinary scientific, technical, and analytic expertise.

About This Publication This work was conducted by the Institute for Defense Analyses (IDA) under contract HQ0034-19-D-0001, Task 229990, “Test Science,” for the Office of the Director, Operational Test and Evaluation. The views, opinions, and findings should not be construed as representing the official position of either the Department of Defense or the sponsoring organization.

Acknowledgments The IDA Technical Review Committee was chaired by Mr. Robert R. Soule and consisted of Rachel A. Haga, John T. Haman, Mark R. Herrera, and Brian D. Vickers from the Operational Evaluation Division, and Nicholas J. Kaminski from the Science & Technology Division.

For more information: Heather M. Wojton, Project Leader [email protected], (703) 845-6811

Robert R. Soule, Director, Operational Evaluation Division [email protected] • (703) 845-2482

Copyright Notice © 2020 Institute for Defense Analyses 4850 Mark Center Drive, Alexandria, Virginia 22311-1882 • (703) 845-2000

This material may be reproduced by or for the U.S. Government pursuant to the copyright license under the clause at DFARS 252.227-7013 [Feb. 2014].

Executive Summary

This paper summarizes a subset of the literature regarding the challenges to and recommendations for the test, evaluation, verification, and validation (TEV&V) of autonomous military systems. This literature review is meant for informational purposes only and does not make any recommendations of its own.

A synthesis of the literature identified the following categories of TEV&V challenges:

1. Problems arising from the complexity of autonomous systems;
2. Challenges imposed by the structure of the current acquisition system;
3. Lack of methods, tools, and infrastructure for testing;
4. Novel safety and security issues;
5. A lack of consensus on policy, standards, and metrics;
6. Issues around how to integrate humans into the operation and testing of these systems.

Recommendations for how to test autonomous military systems can be sorted into five broad groups:

1. Use certain processes for writing requirements, or for designing and developing systems;
2. Make targeted investments to develop methods or tools, improve our test infrastructure, or enhance our workforce's AI skillsets;
3. Use specific proposed test frameworks;
4. Employ novel methods for system safety or cybersecurity;
5. Adopt specific proposed policies, standards, or metrics.


Table of Contents

Introduction
T&E Challenges for Autonomy
    Challenge #1: System Complexity
        Task Complexity
        The State-Space Explosion
        Stochastic/Non-Deterministic/Chaotic Processes
        Learning Systems
        Multi-Agent/Component Evaluation
        Novelty
        System Opacity
    Challenge #2: Acquisition System Limitations
        Requirements
        Processes
        Stovepiping
    Challenge #3: Lack of Methods, Tools, or Infrastructure
        Method Needs
        Scalability
        Instrumentation
        Simulation
        Ranges and Infrastructure
        Personnel
    Challenge #4: Safety & Security
        Safety
        Security
    Challenge #5: Lack of Policy, Standards, or Metrics
        Policy
        Metrics
        Standards
    Challenge #6: Human-System Interaction
        Trust
        Teaming
        Human Judgment & Control
    Summary of Challenges
T&E Recommendations for Autonomy
    Recommendation #1: Requirements, Design, & Development Pipeline
        Requirements
        Assurance Aiding Designs
        Safety Middleware
        Built-in Transparency and Traceability
        Explainability
        “Common, Open Architectures with Reusable Modules”
        Modularity
        Open Architectures
        Capability Reuse
        Design & Development Processes
        Coordinate, Integrate, & Extend Activities
    Recommendation #2: Methods, Tools, Infrastructure, & Workforce
        Leverage Existing Methods
        Performance Testing Methods
        Prioritizing Test Points
        Developing Methods
        Develop Test Tools
        Develop Test Infrastructure
        Establish Partnerships
        Personnel
    Recommendation #3: Specific Test Strategies
        Develop a Strategy Based on Existing Practices
        Automate Testing
        Low Fidelity Coverage, High Fidelity Validation
        Build a Body of Evidence
        Run Time Assurance
    Recommendation #4: Provide Risk Assurance for Safety and Security
        Safety
        Security
    Recommendation #5: Adoption of Policies, Standards, and Metrics
        Creation of a Common T&E Framework
        Develop Standards
        Develop Metrics
    Conclusions
References
Appendix A: Summaries of Individual Studies
Appendix B: Challenge & Recommendation Summary Tables

Introduction

Over the past decade, advancements in computing have led to the proliferation of artificial intelligence (AI)-enabled capabilities for industrial, civilian, and academic applications (e.g., Gil & Selman, 2019; Narla, Kuprel, Sarin, Novoa, & Ko, 2018; Silver et al., 2016; Templeton, 2019). Systems enabled by AI often behave autonomously in some sense: they may take over decisions traditionally made by humans or perform tasks with less supervision. However, a vacuum robot, a high-frequency stock trading system, or even an autonomous car making a bad choice is relatively recoverable with corrective action compared to a wrong decision during armed conflict. Military systems will have most of the same challenges as their civilian equivalents, but more often will operate in less structured environments, with shorter required reaction times, and in the context of an adversary actively seeking to exploit mistakes. AI-enabled and autonomous military systems will require robust testing to provide assurance that undesirable outcomes such as fratricide, collateral damage, and poor mission performance are unlikely and within acceptable risk parameters.

To confidently field autonomous military systems (AMSs),1 we must trust that they will make appropriate decisions for both the foreseeable problems for which they were designed and the unforeseeable situations to which they must adapt. In short, these systems must be proficient, flexible, and trustworthy.2 When AMSs are meant to operate in narrowly defined situations (e.g., asking a “smart” mine to explode when a certain pressure is applied for a particular duration at a given time of day), it is much easier to provide assurance that the system will behave as desired. The number of relevantly different situations it can encounter and its behavioral repertoire in response (i.e., the state-space of its decision-making) are both limited. Expanding that state-space makes assurance more difficult. For example, an autonomous base defense system meant to respond to any possible threats with appropriate force in line with the current rules of engagement (ROEs) can be expected to encounter a wider variety of situations, both designed-for and unforeseeable. To function appropriately in this situation requires more flexibility, which in turn requires more proficiency from the system and more trust from the humans who allow it to operate. The interaction of these needs is a core driver of many T&E difficulties for these systems.

AI-enabled technologies introduce a host of challenges to the process for testing and evaluating acquisition programs within the Department of Defense (DoD). First, the sheer technical complexity and novelty of these systems can be difficult to navigate. Additionally, the DoD acquisition process is optimized based on assumptions that may no longer hold true with autonomy (Tate & Sparrow, 2018). For example, separating contractor, developmental, and operational testing assumes we have discrete, relatively linear stages of development that lead to a “production representative” version of the system. This may not be true with AMSs, especially if they continue to learn throughout their lifecycles. Additionally, writing requirements before we have a system assumes we understand how it will be used in advance. Because the AMS’s proficiency, flexibility, and trustworthiness will evolve over time and can affect how humans use or interact with the system, Concepts of Operations (CONOPS) and tactics, techniques, and procedures (TTPs) will need to be co-developed with the system to a greater extent than with standard systems (Haugh, Sparrow, & Tate, 2018; Hill & Thompson, 2016; Porter, McAnally, Bieber, & Wojton, 2020; Zacharias, 2019b).

1 We limit ourselves to the consideration of military systems in this document. There are many challenges unique to national security, and many civilian recommendations do not transfer to the military domain.

2 See Zacharias (2019a) Autonomous Horizons: The Way Forward for an in-depth discussion of the properties of proficiency, trust, and flexibility along with how they interact to produce useful autonomous behavior.

Even if DoD acquisition processes are updated, however, the specific methods, tools, and infrastructure that working-level DoD employees use for test and evaluation (T&E) will still fail to provide assurance that systems will perform as expected. Development and design efforts which embrace testing—building in testability through internal instrumentation; enhancing the transparency, traceability, or explainability of the software; and executing good governance and validation of training and other data—can improve development processes while also paving the way for T&E, but they are not universally adopted. Furthermore, the policies and standards that would help programs overcome all these challenges are either inadequate or non-existent.

Recently, a number of T&E groups have begun identifying these challenges and proposing possible solutions to overcome them. This paper summarizes a portion of that literature. In Appendix A, we provide summaries of the individual studies, with specific extracts of the challenges or recommendations their authors call out. For Appendix B, we created tables that categorize these challenges and recommendations, aggregating which authors advocate for these different viewpoints. The main text of this paper explores these tables at a high level. Some perspectives are overrepresented by the processes used to acquire articles and presentations, and readers should not assume that the number of different cited authors equates to the weight of support for a perspective. Readers should also note that although the challenge and recommendation sections are often structured in parallel, they are organized by topic, and there is not a one-to-one mapping between challenges identified and recommendations made. By completing this review, we hope to bring this conversation to a broader community and identify gaps to fill in with future work. However, before proceeding to these challenges and recommendations, it is important to clarify what is meant when we discuss autonomy in military systems.

What is Autonomy?

Definitions of autonomy abound, and some are less useful to DoD than others. Many incorporate concepts of independence, acting without external control or oversight, or separation from other entities (e.g., Oxford English Dictionary, 2020). However, the assumption that any participant will operate without control or oversight, even a human warfighter, is anathema to DoD policy and thinking on command and control (C2). We do not want our autonomous systems to have complete freedom to select courses of action, but rather to have some constrained freedom within their assigned tasks.


As with our warfighters, we likely want to have a C2 or principal-agent relationship with our autonomous systems. We will want to:

1. Specify the goals or objectives of the specific task and/or the overall mission, and possibly the larger reasons for those goals, like the commander’s intent (i.e., what to do and why);
2. Specify the constraints associated with the task, such as Rules of Engagement (ROEs, i.e., what not to do); and
3. Not specify the methods to use or give explicit contingencies for every situation, like reacting to the adversary’s response (i.e., how to do the task).

Whether a system is empowered to make these ‘how’ decisions for a task is how this paper will differentiate autonomous from non-autonomous systems.
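To make this distinction concrete, the sketch below shows one hypothetical way such a tasking could be encoded: the objective, commander’s intent, and constraints are specified, while the “how” is deliberately left to the system. The class and field names are illustrative assumptions, not a format drawn from the literature reviewed here.

```python
# An illustrative (hypothetical) encoding of the "what / what not / why" of a
# tasking, leaving the "how" to the autonomous system.
from dataclasses import dataclass, field

@dataclass
class Tasking:
    objective: str                                     # what to do
    commander_intent: str                              # why it matters
    constraints: list = field(default_factory=list)    # what not to do (e.g., ROE)
    # Note: no field prescribes methods or contingencies -- choosing the "how"
    # is what makes the system autonomous under this paper's definition.

task = Tasking(
    objective="Maintain surveillance of the named area of interest until relieved",
    commander_intent="Provide early warning of force buildup near the crossing",
    constraints=[
        "Do not cross the boundary line",
        "Do not engage; report and track only",
        "Maintain the assigned emissions control state",
    ],
)
print(task)
```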

Making useful, desirable choices for how, within the constraints of what, what not, and why, presumes some level of intelligence. Because these are machines, this presumes the presence of some level of AI. The need for AI to enable useful autonomy for non-trivial tasks likely explains why AI and autonomy are often conflated. In this document, we will refer to autonomy as what the system does in its operational environment, and AI as the “under-the-hood” enabler of meaningful interactions with that environment.


T&E Challenges for Autonomy

Challenge #1: System Complexity

The sheer complexity of AMS is one of the primary challenges not only for design, but also for testing. These systems perform difficult, complicated, and consequential tasks in unpredictable environments, and they may make non-deterministic, dynamic responses to those environments. This combination of complexity and stochasticity leads to a “state-space explosion,” radically increasing the number of scenarios across which we must evaluate system performance. AMS will also likely involve learning-enabled components (LECs). LECs further exacerbate the state-space explosion and introduce additional capabilities to evaluate. AMSs will not operate in isolation, and testers must assess whether they can effectively interact with other systems, adding the potential for emergent behavior on top of standard interoperability concerns. The novelty of these systems compounds the above problems. Though we have designed and fielded systems of considerable complexity before, they emerged from domains with strong theoretical underpinnings and a long history of hard-won empirical knowledge. Finally, exacerbating all of these challenges is the issue of transparency. These systems are not only complex, but often that complexity is hidden within a ‘black box’, making them complex to an unknown degree.

Task Complexity

The complexity of system tasks makes them difficult to evaluate. First, testers need to understand what tasks a system can do in order to define the space of possibility from which to sample test points (Hernández-Orallo, 2016). This is a hard enough challenge by itself (Baker et al., 2019), but to evaluate AMS behaviors, the T&E community must have definitions of not just what the system could do but of what it should do in those situations, which is even harder (Deonandan, Valerdi, Lane, & Macias, 2010; Porter et al., 2020; Roske, Kohlberg, & Wagner, 2012) because real-world tasks often have multiple viable solutions (Laverghetta, Leathrum, & Gonda, 2018). Furthermore, they would need to quantitatively define those outcomes (Roske et al., 2012), as qualitative evaluation quickly becomes intractable (Baker et al., 2019). On top of this, the more autonomy a system is given, the more flexible we expect its behavior to be, increasing both the number of possible behaviors and the variety of ‘appropriate’ ones (Ilachinski, 2017; Zacharias, 2019a). There is also an innate tension between evaluating a system’s capacity to react appropriately to unexpected and/or unforeseeable situations and the testers’ desire to pre-plan, select test conditions with forethought, and control the environment. As the required level of flexibility grows, it becomes increasingly difficult to demonstrate that an AMS has sufficient proficiency across the range of situations (Micskei, Szatmári, Oláh, & Majzik, 2012; Zacharias, 2019a). Current DoD T&E processes need to be improved to handle unscripted, stochastic, or dynamic testing well (Ahner, Parson, Thompson, & Rowell, 2018; Lenzi, Bachrach, & Manikonda, 2010; Macias, 2008).


The State-Space Explosion

There is strong consensus that the state-space explosion resulting from the interaction of tasks’ and systems’ growing complexity will make it impossible, under any realistic assumptions, to exhaustively test all scenarios (Ahner & Parson, 2016; Ahner et al., 2018; Defense Science Board, 2012, 2016; Deonandan et al., 2010; Giampapa, 2013; Goerger, 2004; Haugh et al., 2018; Ilachinski, 2017; Menzies & Pecheur, 2005; Micskei et al., 2012; Porter et al., 2020; Porter et al., 2018; Sparrow, Tate, Biddle, Kaminski, & Madhavan, 2018; Wegener & Bühler, 2004; Zacharias, 2019a, 2019b). As we add conditions, factors, or interactions which must be tested, this state-space grows exponentially (Ahner & Parson, 2016), and even the use of modeling and simulation (M&S) cannot enable us to cover it comprehensively (Defense Science Board, 2012), at least not before the heat death of the universe (Zacharias, 2019b). These systems will require a different approach to assurance (Tate & Sparrow, 2019).
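A back-of-the-envelope calculation illustrates the exponential growth. The factors and level counts below are hypothetical; the point is only how quickly a full-factorial test space outruns any realistic test program.

```python
# Illustrative only: how fast the scenario space grows as factors are added.
# Factor names and level counts are hypothetical.
from math import prod

factors = {
    "terrain": 6,
    "weather": 5,
    "threat_type": 8,
    "threat_count": 10,
    "rules_of_engagement": 4,
    "sensor_degradation": 5,
    "time_of_day": 4,
}

full_factorial = prod(factors.values())
print(f"Full-factorial scenario count: {full_factorial:,}")          # 192,000
print(f"With one more 10-level factor: {full_factorial * 10:,}")     # 1,920,000
```

Each added factor multiplies, rather than adds to, the number of scenarios, which is why exhaustive coverage is not a viable strategy.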

Stochastic/Non-Deterministic/Chaotic Processes

The stochasticity of the environments and systems also hinders using mathematical or formal proofs to provide assurance. The environments themselves are unstructured and unpredictable (Defense Science Board, 2012), making it difficult to specify all states for these formal methods (Goerger, 2004). The existence of other decision agents (human or artificial) in an environment can also dynamically change it through their actions, making it even more volatile and difficult to specify (Lenzi et al., 2010; Menzies & Pecheur, 2005; Micskei et al., 2012). Furthermore, systems might have more than one possible response to the same situation (Ahner & Parson, 2016) or otherwise incorporate non-deterministic/stochastic elements such as true or hardware random number generation to select between behaviors (Hernández-Orallo, 2016; Laverghetta et al., 2018), making them difficult to model (Goerger, 2004; Hernández-Orallo, 2016). Even though AI-enabled systems can be deterministic at a very detailed level (e.g., feeding the exact same pixel inputs produces the same classification), or might be formally chaotic, this determinism is usually below the level at which humans would consider situations to be equivalent (i.e., rotating an image by five degrees changes the pixels, but a human would consider identifying each image to be the same task), and these exact states may never occur again in the system’s lifetime. It may be more useful to model these systems as stochastic even if they technically are not (Porter et al., 2020). When it comes to unpredictable environments and systems sensitive to perturbation, our standard T&E methodologies will probably not be sufficient (Subbu, Visnevski, & Djang, 2009).
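The sketch below illustrates, with a toy stand-in for a perception pipeline, why it can be useful to model a technically deterministic system as stochastic: across perturbations a human would treat as the same situation, the system’s behavior is best summarized as a probability distribution over responses. The threshold classifier and perturbation model are assumptions for illustration only.

```python
# Minimal sketch: a system can be deterministic on exact inputs yet look
# stochastic at the level of "situations a human would call equivalent."
import random
from collections import Counter

def classify(sensor_reading: float) -> str:
    """Deterministic for identical inputs: a simple threshold decision."""
    return "threat" if sensor_reading > 0.5 else "no_threat"

def equivalent_situations(nominal: float, n: int, rng: random.Random):
    """Perturbations a human would treat as 'the same situation' (assumed model)."""
    return (nominal + rng.gauss(0, 0.05) for _ in range(n))

n, rng = 10_000, random.Random(1)
outcomes = Counter(classify(x) for x in equivalent_situations(0.52, n, rng))
print({label: round(count / n, 2) for label, count in outcomes.items()})
# Roughly {'threat': 0.65, 'no_threat': 0.35}: even though the code is
# deterministic, its behavior at the situation level is best described as a
# probability of each response.
```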

Learning Systems

LECs introduce a number of challenges to T&E. Problems in training data—whether caused by adversarial interference (Qiu, Liu, Zhou, & Wu, 2019; Streilein et al., 2019) or shortcomings in the collection, labeling, or use of data (Gil & Selman, 2019)—will often lead to problems in behavior. T&E must confirm the integrity and operational validity of the data used to train a system, but there are not yet standard processes for doing so (Haugh et al., 2018; Qiu et al., 2019; Streilein et al., 2019). This is true whether the system is static after development or continues to learn. However, systems that continue to learn pose greater challenges, because they can change the state-space of their decision-making, expanding to new possibilities or creating novel responses to old ones. This is problematic for testing because previously observed performance can quickly stop being representative of the system’s behavior and could demand recertification or regression testing (Ahner & Parson, 2016; Haugh et al., 2018; Micskei et al., 2012). Although some might infer that these systems will only improve from this process, and therefore not need recertification, learning in an ML context should be read simply as ‘change.’ Systems can learn negative behaviors (Haugh et al., 2018; Ilachinski, 2017) or experience catastrophic forgetting (Kirkpatrick et al., 2016), and this might happen over a longer period than is typical for testing a system (Ahner & Parson, 2016). Standard T&E processes are not suited for dynamic or learning-enabled systems (Visnevski & Castillo-Effen, 2010).
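One minimal form such regression testing could take is sketched below: a frozen evaluation suite is re-run after each model update, and every behavioral change is flagged for review rather than relying on aggregate accuracy alone. The models and suite here are toy placeholders, not a method prescribed by the cited authors.

```python
# Sketch of behavioral regression testing for a learning-enabled component:
# re-run a frozen evaluation suite after each update and flag any change in
# behavior, not just drops in aggregate accuracy.
from typing import Callable, Iterable

def regression_report(old_model: Callable, new_model: Callable,
                      eval_suite: Iterable[tuple]) -> dict:
    changed, regressed, improved = [], [], []
    for case_id, inputs, expected in eval_suite:
        old_out, new_out = old_model(inputs), new_model(inputs)
        if new_out != old_out:
            changed.append(case_id)
            if old_out == expected and new_out != expected:
                regressed.append(case_id)     # learned a worse behavior
            elif old_out != expected and new_out == expected:
                improved.append(case_id)
    return {"changed": changed, "regressed": regressed, "improved": improved}

# Usage with toy stand-ins for the pre- and post-update models:
suite = [(i, i, "even" if i % 2 == 0 else "odd") for i in range(10)]
old = lambda x: "even" if x % 2 == 0 else "odd"
new = lambda x: "even" if x % 4 != 3 else "odd"   # an update that changed behavior
print(regression_report(old, new, suite))
```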

Multi-Agent/Component Evaluation

Another major consensus is that T&E capabilities will be challenged by the emergent properties and interoperability issues resulting from interactions between multiple autonomous entities. The interaction of decision-making entities, either within a system (e.g., AI modules) or between discrete artificial or biological agents, will exacerbate the state-space explosion and stochasticity problems by introducing the possibility of emergent behavior (Ahner & Parson, 2016; Baker et al., 2019; Defense Science Board, 2016; Ferreira, Faezipour, & Corley, 2013; Greer, 2013; Haugh et al., 2018; Lenzi et al., 2010; Luna, Lopes, Tao, Zapata, & Pineda, 2013; Mueller, Hoffman, Clancey, Emrey, & Klein, 2019). When these entities interact, they can produce behavior or capabilities that could not be predicted or enabled by any single entity alone, the definition of emergence (O'Connor & Wong, 2012). These emergent properties are not always undesirable (e.g., teaming or synergy; Ferreira et al., 2013), and so T&E will need to demonstrate that systems’ desirable emergent behaviors are reliable while undesirable ones are unlikely (Porter et al., 2020). Further, AMSs also need to interact with our current legacy systems, which requires interoperability compliance testing (Visnevski, 2008), and they will eventually be the legacy systems, creating the potential for ever-evolving emergence and interoperability requirements (Deonandan et al., 2010; Harikumar & Chan, 2019).

System Opacity

Finally, all these issues of complexity are compounded by a lack of transparency in AI-enabled systems. As discussed earlier, exhaustive testing is simply not possible for AMS, so we must rely on inference from a more limited set of test points. Agents can make a correct decision but do it for the wrong reason (e.g., decide not to fire because it did not see any object, not because it made a correct identification). As the example illustrates, why an AMS makes a decision is more important for generalizing behavior than what decision it makes, but testers fundamentally cannot access the “why” of a black-box system (Porter et al., 2020). To provide assurance for AMS, we will need to understand its decision-making process (Ahner & Parson, 2016; Arnold & Scheutz, 2018; Giampapa, 2013; Gunning, 2017; Haugh et al., 2018; Subbu et al., 2009), but methods to do so are still in their infancy (Gunning, 2019; Tate & Sparrow, 2019). This transparency challenge is compounded at the system and environmental level: we rarely have the capability to capture for comparison both how the system represented a situation and the ground truth in the environment (Ahner & Parson, 2016; Ahner et al., 2018; Haugh et al., 2018; Roske et al., 2012; Zhou & Sun, 2019). When systems are black boxes, it is extremely difficult to know their state-space, how stochastic they are, how sensitive they are to perturbations, how or what they learn, and what happens when they interact with other agents.

Novelty

The novelty of these systems is a challenge in and of itself. DoD will need to integrate separate sets of existing knowledge about challenges and solutions for testing complex physical and cyber systems (Eaton et al., 2017; Harikumar & Chan, 2019; Ilachinski, 2017; Luna et al., 2013; Woods & Dekker, 2000). Many of those lessons were learned through slow experience, or rely on historical empirical data for comparison, both of which we do not yet have for AMS. Furthermore, we must recognize that many challenges are simply ‘unknown unknowns’ at this time (Deonandan et al., 2010; Luna et al., 2013; Scheidt, 2017). Lack of experience also hinders our ability to know when it is or is not safe to reuse evidence, either within or across programs (Ahner & Parson, 2016; Deonandan et al., 2010; Durst, 2019; Giampapa, 2013; Lede, 2019).

Challenge #2: Acquisition System Limitations

AMS will compound the existing shortcomings of the DoD acquisition system while simultaneously introducing new problems. The requirements process is ill-suited to adaptive, fast-paced technology; the processes for designing and running tests may not apply well to AI; and the stovepiping that is standard in acquisition will exacerbate other challenges.

Requirements

DoD has historically struggled to write operationally relevant system requirements that are concrete and testable while also being acceptable to contractors. While this fight is not new, its urgency is. The defense community has relied on the flexibility of human decision-making to overcome shortcomings in requirements and design, whether through workarounds (Endsley, 2015), or by changing tactics as adversaries learn to exploit those shortcomings (Zacharias, 2019a). However, the point of autonomy is to reduce the human element, and therefore it removes this plasticity. This flexibility needs to be designed into AMSs themselves, and so it must be a system requirement. The current rigidity—in both the specifications themselves and the process by which they are created—will make this difficult (Ahner & Parson, 2016; Deonandan et al., 2010; Lede, 2019; Luna et al., 2013; McLean, Bertram, Hoke, Rediger, & Skarphol, 2016). Technical specifications like bit/sec or latency, while necessary, are not sufficient for these systems to be operationally successful (Ahner & Parson, 2016; Durst & Gray, 2014; Kapinski, Deshmukh, Jin, Ito, & Butts, 2016; Micskei et al., 2012; Schultz, Grefenstette, & Jong, 1993; Visnevski & Castillo-Effen, 2010; Zhou & Sun, 2019). The acquisition community needs, but does not have, a process for writing operationally relevant, mission-focused requirements that are also testable, verifiable hypotheses (Durst & Gray, 2014; Hess & Valerdi, 2010; Lede, 2019; Micskei et al., 2012; Zhou & Sun, 2019). Furthermore, AMSs will introduce the need for new types of requirements that will be particularly difficult to define, such as for legal, moral, and ethical (LME) behavior (Hill & Thompson, 2016; Roske et al., 2012; Scheidt, 2017; US Department of Defense, 2019).


Processes

AMSs will challenge the T&E community’s processes for test planning and execution. Currently, tests are often designed years in advance—a practice already the target of criticism for its lack of agility—and what is cumbersome for static systems will be unacceptable for dynamic ones (Macias, 2008). DoD acquisition assumes that experimentation is over once a contract is awarded (Tate & Sparrow, 2018), but testers will be learning about their systems as they proceed, and if LECs are involved, the system itself will be changing. Rigid and slow planning, coordination, and approval steps will prevent test planners from adapting efficiently as information emerges (Ahner & Parson, 2016; Hess & Valerdi, 2010; Ilachinski, 2017; Macias, 2008; McLean et al., 2016; Tate & Sparrow, 2018). Because metrics and evaluation strategies are known to all stakeholders—sometimes even years in advance—we run an increased risk that developers3 or the AI itself will be able to use this slow timescale to “game” the test, passing the letter but not the spirit of the evaluation (Arnold & Scheutz, 2018; Hernández-Orallo, 2016; Visnevski & Castillo-Effen, 2010). The history of AI is rife with such examples of “reward hacking” (Gray, 2015; Johnson, 1984; Lenat, 1983; Narla et al., 2018), but these are most often discovered post-hoc, and it may not even be obvious when this has occurred. However, the solution to issues surrounding insufficiently agile processes cannot be to abandon rigorous evaluations of test adequacy. Fast tests that do not enable evaluation of performance do not provide value, and an agile but inadequate test is still inadequate (Porter et al., 2018).

Stovepiping

There is consensus that the stovepiping among different arms of the T&E community—e.g., contractor testing (CT), developmental testing (DT), and operational testing (OT)—will create difficulties for testing AMS. The most basic issue is the difficulty of tracing and transferring data, evidence, and knowledge across these different activities (Zacharias, 2019b). This is true for any handoff, but because AMS will likely be developed iteratively and sequentially, with capabilities maturing at different rates as breakthroughs are made, components may cross what would traditionally be the threshold for the next stage of testing at different times (McLean et al., 2016). Furthermore, effective development of AI capabilities will require injecting operational realism much earlier in the development process (Porter et al., 2018; Visnevski, 2008), as well as continuing it much later: systems that continue to evolve over their lifetimes need to be tested over their lifetimes, from the cradle to the grave. Tester participation would therefore be needed throughout the entire system lifecycle (Ahner & Parson, 2016; Porter et al., 2018; Sparrow et al., 2018; Tate & Sparrow, 2019; Visnevski, 2008). Maintaining the traditional distinction between CT, DT, and OT will likely be unsustainable for AMS (Haugh et al., 2018; Hess & Valerdi, 2010; McLean et al., 2016; Porter et al., 2018; Visnevski, 2008). Leaving them separate will make coordinating activities across the different sites and workforces too difficult (Deonandan et al., 2010; Haugh et al., 2018; Hernández-Orallo, 2016; McLean et al., 2016; Ring, 2009), result in duplicated efforts (Hernández-Orallo, 2016; Porter et al., 2020; Porter et al., 2018), and create numerous opportunities for conflicts over resources such as range time, compute cycles, talent, and funding (Ahner & Parson, 2016; Goerger, 2004; Macias, 2008; McLean et al., 2016; Visnevski & Castillo-Effen, 2010; Zhou & Sun, 2019).

3 This does not necessarily imply nefarious intent on anyone’s part; perfectly well-meaning developers can overoptimize their systems to these metrics, but the result is similarly problematic for evaluating system adequacy.

Challenge #3: Lack of Methods, Tools, or Infrastructure

In addition to the larger structural challenges AMS presents to the DoD acquisition pipeline, at a working level, testers do not have what they need to do their jobs. Many of the methods and tools used for test planning, execution, and analysis require adaptation for AMS, and the workarounds that have been proposed may not be scalable. Even with these methods, testers will lack the data to feed these techniques because systems are currently not instrumented as needed. Furthermore, the physical and digital venues where systems would produce these data are currently insufficient. Finally, it is unclear whether the T&E workforce has the capacity and skillsets demanded by the complexity of AMS.

Method Needs

Many of the methods and tools T&E employs today will require changes to work for AMS. DoD testing currently relies on making inferences between sparse and aliased test points, but this assumes we understand the causality of the underlying processes, which will not be the case for black-box learning systems (Porter, 2019). We will need methods for generalizing system performance and replicating test results for black-box systems (Air Force Scientific Advisory Board, 2017; Laverghetta et al., 2018; Lennon & Davis, 2018). Having a model of system decision-making will enable generalization (Porter et al., 2020), but we will need rigorous, replicable methods to evaluate the adequacy of the system’s reasoning or “thought” process (Ahner & Parson, 2016; Air Force Scientific Advisory Board, 2017). Traditional formal methods for providing logical or mathematical proof of model adequacy will be insufficient for this and many other AMS purposes (Defense Science Board, 2016; Giampapa, 2013; Goerger, 2004; Kapinski et al., 2016; Menzies & Pecheur, 2005; Micskei et al., 2012; Office of the US Air Force Chief Scientist, 2011).

AMS will require different methods for the verification, validation, and accreditation (VV&A) of M&S, and this may be especially difficult for a system of systems (SoS; Goerger, 2004; Ilachinski, 2017; Lennon & Davis, 2018; Menzies & Pecheur, 2005; Office of the US Air Force Chief Scientist, 2011; Schultz et al., 1993). For example, compositional verification is one method used to get around the combinatorial intractability of complex systems. In this method, individual components are verified, and if they all work on their own, they are assumed to work together. However, this method is intended for deterministic systems, and some argue we will need to adapt it for AMS (Durst & Gray, 2014; Lede, 2019; Luna et al., 2013; Office of the US Air Force Chief Scientist, 2011). Others argue that while compositional verification is a potentially useful step, it is not sufficient alone (Scheidt, 2017). For example, because neural networks produce approximate solutions (Funahashi, 1989), they will almost invariably produce errors. Solutions from decision-making systems of many types may also be fuzzy (e.g., an 84 percent chance this image is a turtle), rather than discrete. These errors and uncertainties can compound and cascade through a network of modules or agents, invalidating a core assumption of compositional verification. Furthermore, testers are estimating module performance, not perfectly measuring it, and so there will be measurement uncertainty as well. Some recommend that all three sources of uncertainty—errors, fuzziness, and estimation—should be propagated when evaluating downstream modules in integrative verification (Porter & Wojton, 2020), but methods for this are still under development (Porter & Wojton, 2020; Stracuzzi et al., 2020).
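As a rough illustration of what propagating these uncertainty sources might look like, the Monte Carlo sketch below pushes module error rates (with their estimation uncertainty) and an assumed error cascade through a two-module detector-classifier pipeline. The numbers, the Beta-posterior treatment, and the cascade penalty are all assumptions for illustration, not the specific method of the cited authors.

```python
# Hedged sketch: Monte Carlo propagation of module errors, estimation
# uncertainty, and an assumed error cascade through a two-module pipeline.
import random

rng = random.Random(0)

# Hypothetical per-module test results: errors observed / test trials run.
detector = {"errors": 6, "trials": 200}     # point estimate: 3% error
classifier = {"errors": 12, "trials": 150}  # point estimate: 8% error

def sampled_error_rate(module):
    """Draw a plausible true error rate from a Beta posterior, reflecting that
    finite testing only estimates the rate."""
    return rng.betavariate(module["errors"] + 1,
                           module["trials"] - module["errors"] + 1)

def final_output_error_rate(n_draws=50_000):
    wrong = 0
    for _ in range(n_draws):
        p_det = sampled_error_rate(detector)
        p_cls = sampled_error_rate(classifier)
        det_wrong = rng.random() < p_det
        # Assumed cascade: a bad detection sharply degrades the classifier.
        p_cls_here = min(1.0, p_cls + 0.5) if det_wrong else p_cls
        wrong += rng.random() < p_cls_here
    return wrong / n_draws

print(f"Estimated end-to-end error rate: {final_output_error_rate():.3f}")
# Verifying the classifier in isolation suggests ~0.08; propagating upstream
# errors and estimation uncertainty pushes the end-to-end figure closer to ~0.10.
```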

Methods for disentangling the contributions of different modules or agents to mission success or failure will be particularly critical in AMS (Ahner & Parson, 2016; Baker et al., 2019; Goerger, 2004; Greer, 2013; Harikumar & Chan, 2019; Haugh et al., 2018; Porter et al., 2020; Roske et al., 2012; Sparrow et al., 2018; Visnevski, 2008). This may be as simple as root-cause analysis in any system (Greer, 2013; Haugh et al., 2018; Roske et al., 2012), but it will also extend to teaming with humans or other systems, which is particularly difficult (Goerger, 2004; Porter et al., 2020). For example, if a football pass is incomplete, is it the quarterback’s fault or the receiver’s? The search for causation must often extend beyond the immediate pair of interest—for example, perhaps there was pressure on the quarterback from a collapsed offensive line. If the only data we record for a team are their shared outcomes (e.g., completed passes), we could need massive amounts of data to make distinct performance attributions.
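The toy sketch below illustrates the data problem: if only the shared outcome is logged, no attribution is possible, whereas per-component logs allow even simple conditional comparisons (or more formal attribution methods) to separate the contributors. The reliabilities and the pass/catch framing are hypothetical.

```python
# Toy sketch of why component-level data matter for attribution.
import random

rng = random.Random(2)
records = []
for _ in range(5_000):
    qb_on_target = rng.random() < 0.90          # hypothetical component reliabilities
    receiver_in_position = rng.random() < 0.75
    complete = qb_on_target and receiver_in_position
    records.append((qb_on_target, receiver_in_position, complete))

overall = sum(r[2] for r in records) / len(records)
print(f"Shared outcome only: completion rate {overall:.2f} (no attribution possible)")

def completion_given(index, value):
    subset = [r for r in records if r[index] == value]
    return sum(r[2] for r in subset) / len(subset)

print(f"Completion when QB on target:      {completion_given(0, True):.2f}")
print(f"Completion when QB off target:     {completion_given(0, False):.2f}")
print(f"Completion when receiver in place: {completion_given(1, True):.2f}")
print(f"Completion when receiver not:      {completion_given(1, False):.2f}")
```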

The increased data demands of AMS will push our methods for test efficiency to the limit. We will need better methods for prioritizing our test points (Ahner & Parson, 2016; Deonandan et al., 2010; Haugh et al., 2018; Hernández-Orallo, 2016; Hess & Valerdi, 2010; Mullins et al., 2017; Sparrow et al., 2018), better ways of understanding the adequacy of our coverage (Ahner & Parson, 2016; Micskei et al., 2012), and better ways of sequentially designing tests to handle exploratory testing of adaptive systems (Ahner & Parson, 2016; Porter et al., 2018; Porter & Wojton, 2020; Simpson, 2020). Compounding this test point prioritization challenge is the need to—potentially continuously—perform regression testing on systems that evolve over time. Testers need methods to help identify when and in what ways regression testing is necessary (Ahner & Parson, 2016; Defense Science Board, 2016; Deputy Secretary of Defense, 2012; Haugh et al., 2018; Ilachinski, 2017; Luna et al., 2013; McLean et al., 2016). For autonomous weapons systems specifically, DoD Directive 3000.09 assigns responsibility for setting these regression testing standards to the Director, Operational Test & Evaluation (DOT&E; Deputy Secretary of Defense, 2012).
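As one concrete, if simplistic, example of quantifying coverage adequacy, the sketch below measures what fraction of all two-factor combinations a candidate test set exercises; combinatorial and space-filling designs aim to drive such measures up with far fewer runs than a full factorial. The factors, levels, and candidate tests are hypothetical.

```python
# Sketch of a simple coverage-adequacy measure: fraction of all two-factor
# (pairwise) combinations exercised by a candidate test set.
from itertools import combinations, product

factors = {
    "weather": ["clear", "rain", "fog"],
    "threat": ["none", "single", "swarm"],
    "comms": ["full", "degraded", "denied"],
    "lighting": ["day", "night"],
}

candidate_tests = [
    {"weather": "clear", "threat": "single", "comms": "full",     "lighting": "day"},
    {"weather": "rain",  "threat": "swarm",  "comms": "degraded", "lighting": "night"},
    {"weather": "fog",   "threat": "none",   "comms": "denied",   "lighting": "day"},
    {"weather": "clear", "threat": "swarm",  "comms": "denied",   "lighting": "night"},
]

names = list(factors)
all_pairs = set()
for f1, f2 in combinations(names, 2):
    all_pairs.update((f1, a, f2, b) for a, b in product(factors[f1], factors[f2]))

covered = set()
for test in candidate_tests:
    for f1, f2 in combinations(names, 2):
        covered.add((f1, test[f1], f2, test[f2]))

print(f"Pairwise coverage: {len(covered)}/{len(all_pairs)} "
      f"({len(covered) / len(all_pairs):.0%})")
```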

Scalability

In some cases, the challenge is not that the methods do not exist, but that the solutions may not be scalable. For instance, formal methods may not scale even where they are applicable (Haugh et al., 2018; Tate & Sparrow, 2019). Using incredibly high-fidelity simulations can get around some VV&A issues for the environment, but these require deep investment in computation resources and may run more slowly than real time (Brabbs, Lohrer, Kwashnak, Bounker, & Brudnak, 2019; Durst, 2019; Kapinski et al., 2016; Kwashnak, 2019; Lenzi et al., 2010; Mullins et al., 2017; Sparrow et al., 2018; Subbu et al., 2009). Other hard problems can be solved with flexible human reasoning, but the scalability challenges are inherent to manual testing or subject matter expert (SME) evaluations (Goerger, 2004; Hernández-Orallo, 2016; Kapinski et al., 2016; Laverghetta et al., 2018; Schultz et al., 1993; Visnevski & Castillo-Effen, 2010; Wegener & Bühler, 2004). Multi-agent testing exacerbates these scalability issues (Baker et al., 2019; Giampapa, 2013; Lenzi et al., 2010; Luna et al., 2013).


Instrumentation

Building automatic embedded instrumentation into the decision software of AMSs could mitigate some scalability challenges, but technical hurdles to implementation remain, and it is not a universal practice (Sparrow et al., 2018). Furthermore, even when such instrumentation exists, it is not always available to testers due to proprietary concerns (Porter et al., 2020). Testers need this instrumentation in order to trace the causes of system behavior (Ahner & Parson, 2016; Defense Science Board, 2016; Haugh et al., 2018; Porter et al., 2020; Sparrow et al., 2018) and to capture environmental conditions in order to replicate problems or findings (Ahner & Parson, 2016; Defense Science Board, 2016; Goerger, 2004; Laverghetta et al., 2018; Visnevski, 2008). The lack of instrumentation intersects most other challenges to providing assurance for AMS, and its criticality may mean testability needs to be a requirement and integrated directly into the system, not an “orange-wire” test harness tacked on solely for test (Ahner et al., 2018; Haugh et al., 2018; Porter et al., 2018).
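A minimal sketch of what built-in decision instrumentation could look like is shown below: every decision call is logged with its inputs, chosen action, exposed internal evidence, and a placeholder for ground truth to be filled in from range data. The wrapper, field names, and example decision function are assumptions for illustration.

```python
# Hedged sketch of embedded decision logging for traceability and replication.
import json, time, uuid

class DecisionLogger:
    def __init__(self, decide_fn, log_path="decisions.jsonl"):
        self.decide_fn = decide_fn
        self.log_path = log_path

    def __call__(self, observation, **context):
        result = self.decide_fn(observation, **context)
        record = {
            "event_id": str(uuid.uuid4()),
            "timestamp": time.time(),
            "observation": observation,                    # what the system perceived
            "context": context,                            # mode, ROE setting, etc.
            "action": result.get("action"),
            "internal_evidence": result.get("evidence"),   # e.g., scores, candidate ranking
            "ground_truth": None,                          # filled in post-hoc from range data
        }
        with open(self.log_path, "a") as f:
            f.write(json.dumps(record) + "\n")
        return result

# Hypothetical decision function and usage:
def decide(observation, **context):
    return {"action": "hold_fire", "evidence": {"track_confidence": 0.42}}

logged_decide = DecisionLogger(decide)
logged_decide({"track_id": 7, "speed_mps": 12.0}, mode="patrol")
```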

Simulation

Though many high-level recommendations for AMSs turn to M&S as the solution, many working-level challenges must be overcome to enable it. Foremost among these is that of defining the necessary fidelity or resolution of the simulation. Models are necessarily imperfect, but should be useful (Box, 1976, 1979). How much detail is sufficient to be useful for AMSs is unknown at this time (Ahner & Parson, 2016; Caseley, 2018; Goerger, 2004; Sparrow et al., 2018). However, the fundamental tradeoff between resolution and computational power required means playing it safe and using more detail than necessary can impose significant monetary, temporal, and productivity costs (Brabbs et al., 2019; Durst, 2019; Gil & Selman, 2019; Kwashnak, 2019; Sparrow et al., 2018). On the other end, oversimplifying can prevent the simulation from validly recreating system behavior (Goerger, 2004; Greer, 2013; Laverghetta et al., 2018; Lenzi et al., 2010). The optimal balance of these factors will likely be idiosyncratic to individual programs and dependent on the system’s maturity (Porter et al., 2020). Furthermore, testers will face the technical challenge of how to feed these systems valid inputs (Ahner & Parson, 2016; Goerger, 2004; Laverghetta et al., 2018), especially for monolithic, sub-symbolic systems like neural networks (Porter, 2020b).

Ranges and Infrastructure

T&E will have a critical need for improved physical and digital ranges on which to develop and test AMSs (Ahner & Parson, 2016; Gil & Selman, 2019; Lennon & Davis, 2018; H. Miller, 2019; Tate & Sparrow, 2019; Zacharias, 2019a, 2019b). Our physical ranges are insufficiently instrumented to reproduce or deconstruct observed system behaviors (Ahner & Parson, 2016; H. Miller, 2019), and in many cases the necessary digital testbeds do not yet exist or have insufficient capacity (Gil & Selman, 2019; Lennon & Davis, 2018). The lack of a common architecture will hinder many of these efforts (Zacharias, 2019b), particularly when we try to test systems together (Porter et al., 2020). There is as yet little centralized effort to develop these capabilities (Gil & Selman, 2019), and nascent organizations such as the Joint Artificial Intelligence Center (JAIC) for DoD are at the moment more limited in scope than would be recommended for centralization (Trent, 2019).


Personnel

The need for all of these activities raises the question of who will actually do the work. AI work is inherently transdisciplinary and requires experts in many different domains, essentially all of which are in high demand and short supply, especially within DoD (Macias, 2008; Zacharias, 2019a). The T&E workforce already struggles to attract, train, and retain talented people with the right skillsets, and AI will make existing T&E tasks more complicated, demanding a more sophisticated understanding of the domains we already work in, while simultaneously expanding the types of knowledge our workforce must obtain (Lennon & Davis, 2018; Macias, 2008). In many ways, our tests are only as good as our testers (Wegener & Bühler, 2004). This implies several organizational and personnel challenges, including the identification and development of required skillsets within existing personnel, recruitment of qualified candidates, and certification of T&E methods and practitioners (Ahner & Parson, 2016; Haugh et al., 2018; Roske et al., 2012). Methodological challenges exist as well: inadequate or inefficient traditional test methods may overwhelm our workforce capacity if applied to AMS (Air Force Scientific Advisory Board, 2017).

Challenge #4: Safety & Security

The inherent nature of autonomy implies that use of an autonomous system is often associated with elevated risk, indicating the need to ensure the system is operating within intended bounds and will operate safely in unforeseen circumstances. Further, advances in exploitation and the need to protect the decision model from adversaries can result in complex testing scenarios, presenting challenges for T&E regarding the assurance of safety and security.

Safety

Autonomous systems are often associated with elevated risk. In particular, autonomous systems may be more likely to fail, and the consequences of failure may be worse than with a manned system, because the system's complexity and brittleness can cause failures from which a human could recover to cascade into larger, more consequential problems (Haugh et al., 2018; McLean et al., 2016; Micskei et al., 2012). Autonomous systems impose a need for high degrees of confidence in safety (Laverghetta et al., 2018), so extra safety precautions should be taken during DT and OT (Haugh et al., 2018). Testing must be ongoing, flexible, and broadly protective against these risks (Arnold & Scheutz, 2018; Micskei et al., 2012), and so assurance of safety can contribute significantly to the cost of the system (Deonandan et al., 2010). In some cases, it may simply be impossible or infeasible to obtain approval from the regulatory body to operate certain tests (McLean et al., 2016), which hinders our ability to execute performance testing. Further, the complexity involved in adequately testing autonomous systems generates additional safety considerations that must be addressed when designing the test environment (Ahner & Parson, 2016), a complexity compounded by the lack of continuous human-in-the-loop control (McLean et al., 2016). Test, evaluation, verification and validation (TEV&V) often relies on human operators to compensate for brittleness (Lennon & Davis, 2018; R&E, 2015) despite such human operators not always being in the loop or being able to react if they are (McLean et al., 2016). TEV&V of AMS will require run-time analysis and some level of self-monitoring, as well as significant prior consideration to address contingencies, and a design that provides some recovery mechanism, or that can at least identify and constrain harmful emergent behavior and non-generalizable behavior (Ahner & Parson, 2016; Arnold & Scheutz, 2018; Defense Science Board, 2012; Harikumar & Chan, 2019; Laverghetta et al., 2018; Lede, 2019; Lennon & Davis, 2018; Menzies & Pecheur, 2005; R&E, 2015).

Security

The development of autonomous systems brings new challenges to both security and testing. Advances in exploitation have introduced and will continue to introduce a variety of methods to disrupt or alter the desired behavior of autonomous systems (Ahner & Parson, 2016; Eaton et al., 2017; Goodfellow, Shlens, & Szegedy, 2015; Haugh et al., 2018; Qiu et al., 2019; Streilein et al., 2019). This implies complexities in testing; for example, one must test not only the robustness of the autonomy engine against spoofing of its sensors, but also whether the autonomy engine can recognize that its perceived world view has been altered (Eaton et al., 2017). Adversaries will be looking to break or reverse engineer our AMSs' decision models, and thus protecting these models from capture or exfiltration (to make adversarial attacks harder to generate) as well as from adversarial attacks themselves will be critical for security; however, demonstrating these protective capabilities presents unique challenges for TEV&V (Qiu et al., 2019; Streilein et al., 2019). Inevitably, attackers will find methods to disrupt the algorithms driving the autonomous system, and testing must take into account the likelihood of such attacks (Haugh et al., 2018).

Challenge #5: Lack of Policy, Standards, or Metrics

Adequate TEV&V of AMS relies on metrics and standards for testing as well as policies that support TEV&V frameworks. Unfortunately, however, metrics, standards, and policy appear to be lacking in many key areas; participants in TEV&V have had difficulty defining and measuring elements of performance, have applied procedures inconsistently, and have followed ad-hoc procedures.

Policy

There appears to be no common coherent theoretical framework for T&E with regard to autonomous systems (Durst, 2019; Ilachinski, 2017); instead, there is much disaggregation, with many ad-hoc procedures, bad habits, and loopholes regarding what is being measured and how (Hernández-Orallo, 2016). One reason for this is the ambiguity inherent in conceptualizing a framework for understanding the components underlying the processes that need to be modeled and tested for an unmanned autonomous system (UAS) (Ring, 2009). Regardless of the reason, a generalized framework supporting the design and development of TEV&V approaches could both improve T&E and reduce life cycle costs (Lenzi et al., 2010). Further, policies should support (1) recurring, periodic assessment, (2) recertification, and (3) T&E after regulation changes (Ahner & Parson, 2016).

Metrics

Quantifying the success of an algorithm’s decision-making process poses an ongoing challenge for T&E (Ahner & Parson, 2016), but such quantification is necessary to assess the appropriate level of confidence in an algorithm (Roske et al., 2012). Determining the success of a mission is difficult without appropriate metrics in place to evaluate performance (Deonandan et al., 2010). The lack of formal requirements, along with ambiguities in how failures are defined in developmental testing, leads to a reliance on “I know it when I see it” criteria (Hess & Valerdi, 2010) and a non-replicability issue that makes evaluation of testing results difficult (Laverghetta et al., 2018). This issue extends to measurements of trust, intent, system learning, perception, and reasoning (Ahner & Parson, 2016). Testing the range of system autonomy will likely require application-dependent trust metrics (Air Force Scientific Advisory Board, 2017); however, trust is not an innate characteristic of a system and is difficult to measure. This has led to testing systems in closed, scripted environments without any formal methodology for obtaining trust in the outputs or even defining trust (Durst, 2019; Ilachinski, 2017). These issues are further complicated when testing an Unmanned Autonomous System of Systems (UASoS) (Deonandan et al., 2010). Metrics for testing perceptual accuracy need to be able to distinguish between sensor performance and system inference based on the sensors (Giampapa, 2013; Harikumar & Chan, 2019; Roske et al., 2012). Quantification of system learning is also inherently difficult (Ahner & Parson, 2016). In general, a growing body of literature indicates that traditional metrics are insufficient for TEV&V of autonomous systems (Eaton et al., 2017; Hernández-Orallo, 2016; Ilachinski, 2017; Laverghetta et al., 2018; Macias, 2008; Mueller et al., 2019; Roske et al., 2012; Visnevski & Castillo-Effen, 2010); for example, the probability of hit or kill is frequently measured, but there is rarely (if ever) a question of whether something should have been a target in the first place.

Standards

There seems to be a consensus in the literature that there is a lack of standards for TEV&V of autonomous systems. Some authors discuss a lack of benchmarks relevant to the DoD (Hernández-Orallo, 2016; Hess & Valerdi, 2010; Mueller et al., 2019; Roske et al., 2012). Many existing benchmarks, while relevant for comparing one algorithm to another, do not necessarily carry over to system evaluation (Mueller et al., 2019). Further, there appear to be few if any widely used common architectures; however, some standards have been proposed for unmanned ground vehicles (UGVs) and adopted by the international community (Durst & Gray, 2014). The development and adoption of standards for a common modeling framework has also proved challenging (Durst & Gray, 2014; Goerger, 2004; Lennon & Davis, 2018; R&E, 2015; Ring, 2009; Visnevski, 2008). In particular, Lennon and Davis (2018) note the lack of modeling, design, and interface standards, and R&E (2015) notes that a standardized modeling framework spanning the lifecycle for autonomous systems does not exist. Further, non-standard criteria contribute to an inconsistently applied validation process (Goerger, 2004), the ambiguity in UAS modeling framework conceptualization creates problems for development (Ring, 2009), and the lack of centralization and non-compliance with standards can impede the ability of users to monitor complex systems (Visnevski, 2008).

Challenge #6: Human-System Interaction

The nature of autonomous systems implies that they can accomplish at least some of their assigned tasks without human direction, but this does not imply that they must never interact with humans. Indeed, AMSs often require some level of human interaction with their task, whether the human is in-the-loop (offloading some of their cognition to the AMS but still performing the task), on-the-loop (allowing the system to perform the tasks while monitoring its performance), or merely initiates-the-loop (giving initial orders but thereafter letting it act with autonomy). These different types of human interaction may require different test methods than those typically executed in the DoD. Furthermore, effective system employment will require that operators appropriately trust their systems. If we expect operators to have some kind of C2 relationship with their autonomous systems, that relationship needs to be tested.

Trust

It is common for DoD draft documents to recommend design, development, and test practices to increase trust in AMS—however, there is consensus among experts that this is the wrong way to frame the problem. What we need is for operators to appropriately calibrate their trust—to know when, where, and the extent to which they can or cannot rely on the system (Gunning, 2017; Haugh et al., 2018; Hoff & Bashir, 2014; Lee & See, 2004; Porter et al., 2020; Wojton, Porter, & Lane, 2020). While conversations about trust often suffer from the lack of a common definition (Durst, 2019), here we refer to trust in autonomy as the belief that a system will help accomplish one’s objectives in vulnerable or uncertain situations (Wojton et al., 2020). This belief is a psychological state that is distinct from the system-level trait of trustworthiness (Porter et al., 2020). Design and development practices should seek to maximize trustworthiness; the operator-level goal should be appropriate trust (Tate, 2020).

Both over- and under-trust are potentially problematic with autonomous systems (Lee & See, 2004): in the field, both relying on the system where one should not and ignoring the system where it could provide benefit can cause undesirable outcomes. Operators need to know the extent to which they can or cannot trust the system under different conditions. T&E of system performance is where we obtain the information to provide this conditional understanding, but testers must also assess whether training and experience with the system actually create this appropriate calibration of trust (Porter et al., 2020). In order to achieve this it may be necessary for operators to have a valid mental model of how the system makes its decisions (Endsley, 2019; Gunning, 2017), which will usually require testers to discover what that valid model is (Porter et al., 2020).
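One simple way to examine calibration, sketched below with entirely synthetic data, is to compare how often operators relied on the system with how often the system was actually correct under each test condition; large gaps in either direction flag over- or under-trust worth investigating. The conditions, counts, and the 10-point gap threshold are hypothetical.

```python
# Toy sketch of a trust-calibration check: compare operator reliance with
# actual system reliability, by condition. All data are synthetic.
trials = [
    # (condition, operator_relied_on_system, system_was_correct)
    ("clear_day", True, True), ("clear_day", True, True), ("clear_day", True, False),
    ("clear_day", True, True), ("clear_day", False, True),
    ("fog", True, False), ("fog", True, False), ("fog", True, True),
    ("fog", True, False), ("fog", False, False),
]

by_condition = {}
for condition, relied, correct in trials:
    stats = by_condition.setdefault(condition, {"relied": 0, "correct": 0, "n": 0})
    stats["n"] += 1
    stats["relied"] += relied
    stats["correct"] += correct

for condition, s in by_condition.items():
    reliance = s["relied"] / s["n"]
    reliability = s["correct"] / s["n"]
    if reliance > reliability + 0.1:
        label = "possible over-trust"
    elif reliance < reliability - 0.1:
        label = "possible under-trust"
    else:
        label = "roughly calibrated"
    print(f"{condition:>9}: relied {reliance:.0%}, correct {reliability:.0%} -> {label}")
```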

Teaming

Although human-machine teaming (HMT) is a topic rapidly growing in popularity, a large component of this conversation may simply be a buzzword rebranding of traditional human-system integration (HSI). Testers need to be careful when assessing whether their AMS truly involves HMT, because the interaction of humans and systems pursuing common higher-level goals and coordinating their actions (i.e., teaming) introduces novel testing challenges that require novel methodologies beyond normal HSI (Porter et al., 2020). For example, effective teaming is enabled by a shared understanding of goals, situational awareness, and each other’s actions, but measuring the extent of this alignment between humans and machines is non-trivial (Ahner & Parson, 2016; Haugh et al., 2018; Ilachinski, 2017; Lennon & Davis, 2018; Porter et al., 2020). Additionally, more work is needed to develop metrics for team performance (Ahner & Parson, 2016; Deonandan et al., 2010; Greer, 2013; Haugh et al., 2018; Ilachinski, 2017) as well as methods for disentangling individual contributions that support or detract from this performance. When discussing emergent behavior, authors are typically concerned with the systems themselves, but testers also need to be vigilant about how introducing AMS can create undesirable emergent human behavior (Harikumar & Chan, 2019; Ilachinski, 2017; Porter et al., 2020).

Human Judgment & Control

When a system’s CONOPS calls for a human to have a C2 relationship with an AMS (e.g., in-, on-, or initiating-the-loop), testers must assess whether the human can appropriately or meaningfully influence the AMS. Although there is a great deal of debate over whether the quality these relationships should possess is meaningful human control or appropriate human judgment (Deputy Secretary of Defense, 2012; Horowitz & Scharre, 2015; Santoni de Sio & van den Hoven, 2018), and these arguments can become heated (Cook, 2019), from a testing standpoint, these distinctions are less relevant. Testing should assess whether the system and interactions with it (e.g., TTPs) actually accomplish the intent of the CONOPS. There are many reasons why a human might technically have oversight, control, or assigned judgment over its use without that being appropriate or meaningful. For example, it is bad engineering to rely on humans in long-term vigilance tasks, where they often grow complacent or inattentive. When this happens, their supervision of that task would no longer be meaningful (Ahner & Parson, 2016; Caseley, 2018; Porter, 2020a), as was seen in the fatal Uber crash in Arizona (National Transportation Safety Board, 2019). Similarly, if systems’ decision cycles occur faster than human reaction times, a human supervisor would not be able to meaningfully intervene in the event of errors (Arnold & Scheutz, 2018; Caseley, 2018; McLean et al., 2016). Often the inability to exert appropriate judgment will result from the human’s inability to predict the system’s behavior (Ahner & Parson, 2016; Arnold & Scheutz, 2018; Ferreira et al., 2013; Gunning, 2017; Ilachinski, 2017; Subbu et al., 2009), and testers will need to examine this. Finally, humans are not always aware of how information affects their decision making (Nisbett & Wilson, 1977), and if this happens with systems such as tactical response recommenders or automatic target recognition, it calls into question whether operators can still exert appropriate human judgment over the use of force. Testers would need to assess if and where these “cognitive prosthetics” negatively affect human decision-making (Ghassemi, 2020; Porter, 2020a).

Summary of Challenges The complexity of AMS is the root of many of our other challenges. The state-spaces and stochasticity of these systems mean there is simply too much to test exhaustively or with our traditional methods. Other aspects of complexity and system novelty prevent us from knowing exactly what or how to test. This implies that both development and testing will need to be more experimental in character, much as they were in the early days of what are now well-understood systems (e.g., airplanes and nuclear reactions), with much being learned along the way. However, this kind of experimentation faces two major challenges. First, the rigidity, sluggishness, and firewalls in the DoD acquisition process prevent adaptive, rapidly responsive, sequential testing. Second, even if DoD processes could be changed, at a technical level we lack the methods, tools, infrastructure, and personnel to effectively prosecute this testing. This exacerbates the problem of attaining enough assurance to be confident even that the testing itself will be safe to perform. Given the scarcity of expertise in the AI arena, especially within the DoD, having policies, metrics, and standards to draw on could accelerate individual experimentation efforts; unfortunately, by and

16

large these are not currently in place, and those that are present are subject to change. Any one of these problems would be difficult on its own, and the combined challenges are daunting. Fortunately, however, there may be solutions to most of these issues—at least enough to get started. In the next section, we summarize the recommendations various authors have made to enable the T&E of AMS.

17

T&E Recommendations for Autonomy

A common call-to-arms at AI-related conferences, meetings, and working groups is that "it is time to stop admiring the problem and move towards concrete, implementable solutions." However, anyone who participates in these same events has likely observed concrete, implementable solutions being attacked without a counter-proposal, and so conversations rarely progress past problem portraiture. While it is easier to cast stones than build houses, criticism alone will not save us. A meta-recommendation, then, is that criticisms of proposed solutions should themselves be solution-oriented: they should either propose a different solution to the same challenge or recommend a way to overcome the shortcoming they point out (Porter et al., 2020). As a community, we testers need to be better at moving forward. The acquisition of AMS does and will proceed unabated despite the lack of readiness of TEV&V, and we cannot wait for perfect solutions. The recommendations described in this section are imperfect, but that does not mean they should be abandoned; instead we should ask how they can be improved. At the same time, the authors did not include only recommendations they agreed with, so readers should not necessarily interpret inclusion as endorsement.

Recommendation #1: Requirements, Design, & Development Pipeline Ultimately, the goal of TEV&V is to provide assurance that the system will be adequate for the operational uses and challenges it will encounter. A number of activities that occur before formal testing can lay the groundwork for providing that assurance, and may even contribute directly to our confidence in the system. Getting the requirements right will be critical, and we may need to consider alternative methods for evaluating whether they have been met. A number of design choices are available that enhance the system's testability and help us to successfully use these alternative methods. Parallel to those design choices, there are development practices that integrate system testing more tightly with the development processes. All three of these recommendations will likely require much more significant coordination than is currently the norm between testing and development activities—while still maintaining the ability of oversight activities to provide independent evaluations.

Requirements Part of minimizing the risks of fielding unworthy AMS will be writing requirements that allow us to reject such systems. Traditional requirements typically include technical specifications that, while important, do not define operational success. In a manned system, much of the operational success comes from how the operator wields these technical capabilities, and so it does not need to be a system requirement. However, autonomous systems must in some sense wield themselves, so operational success needs to be built into these systems, which then implies that operational success should itself be a system requirement (Porter et al., 2020). Defining these requirements may be particularly difficult for ethically-related behaviors, and the generation process for these requirements should additionally include SMEs such as lawyers and ethicists (Porter, 2020a). Additionally, although these systems are meant to be autonomous, requirements for human-system integration may need to be emphasized more than ever (M. J. Miller, McGuire,

18

& Feigh, 2017). Furthermore, some recommend that DoD should write all requirements, both operational and technical, in the form of testable, verifiable hypotheses, which will likely require alterations to the requirements generation process (Ahner & Parson, 2016; Lede, 2019; Lennon & Davis, 2018).4 The general challenges of AMS may mean requirements are difficult to verify through purely quantitative or formal methods. Some are therefore recommending that we adopt a more holistic approach of “assurance arguments” that integrates multiple methodologies and sources of confidence—including less traditional ones—to evaluate whether requirements have been met (Lede, 2019; Lennon & Davis, 2018; Scheidt, 2017; Sparrow et al., 2018).

Assurance Aiding Designs There are many design choices for AMSs that will make it easier to validate their safety and performance. Design choices could make them easier to test, for example by providing mechanisms for better data collection on the system, or by making their performance envelopes clearer by means such as explicitly bounding the systems’ behaviors.

Safety Middleware Some are recommending (Arnold & Scheutz, 2018) or even implementing (Thuloweit, 2019) the use of safety middleware to bound the behavior of the more complex and unverified AMS software. The concept is that, by providing a middle-tier supervisory system that can supersede the true system's decisions under certain explicit, well-defined conditions, the argument that Behavior X cannot occur under Condition Y becomes dramatically easier to verify. For example, something as simple as geofencing could activate this supervisory system to prevent crashes (intentional or accidental) near critical installations or prevent unintentional weapons discharge at test sites.
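
To make the concept concrete, the sketch below illustrates one way a supervisory middleware layer might intercept and override commands. It is a minimal, illustrative sketch only: the Command structure, the geofence parameters, and the override rule are hypothetical and are not drawn from any cited implementation.

    # Minimal sketch of supervisory safety middleware (illustrative only).
    from dataclasses import dataclass
    import math

    @dataclass
    class Command:
        x: float                      # commanded position (arbitrary units)
        y: float
        weapons_release: bool = False

    KEEP_OUT_CENTER = (0.0, 0.0)      # assumed location of a critical installation
    KEEP_OUT_RADIUS = 5.0             # assumed keep-out radius

    def violates_geofence(cmd: Command) -> bool:
        dx, dy = cmd.x - KEEP_OUT_CENTER[0], cmd.y - KEEP_OUT_CENTER[1]
        return math.hypot(dx, dy) < KEEP_OUT_RADIUS

    def supervise(cmd: Command, safe_fallback: Command) -> Command:
        # Defer to the complex system unless an explicit, well-defined
        # condition is violated; then substitute a verified fallback command.
        if violates_geofence(cmd) or cmd.weapons_release:
            return safe_fallback
        return cmd

    # A proposed move into the keep-out zone is replaced by the fallback command.
    proposed = Command(x=1.0, y=2.0)
    print(supervise(proposed, safe_fallback=Command(x=10.0, y=10.0)))

Because the override logic is small and explicit, a claim such as "the system never releases a weapon and never enters the keep-out zone while the middleware is active" can be argued from a few lines of code rather than from the behavior of the full decision engine.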

While these safety middleware systems could provide great risk-reduction value, developers and testers should avoid viewing them as a panacea, especially for actual operational control. If these simple middleware systems were sufficient to guarantee correct behavior across the AMS's entire operational space, then we would simply be using them instead of the more complex system. Additionally, there are control and perception problems for which these simpler systems are not well-suited, or for which they would create other problems. For example, success has historically been elusive for simple, explicit attempts to implement machine perception (e.g., computer vision; Voulodimos, Doulamis, Doulamis, & Protopapadakis, 2018).

The benefits of safety middleware are much clearer for T&E than for actual operations. Explicit rules in middleware would be more vulnerable to both cyber and tactical exploitation. For example, if the middleware prevented the system from targeting anyone with a reverse-faced

4 Some argue that it is already standard practice to write testable requirements, but that the associated standards of proof are too demanding to be practical. For example, a requirement for a probability of hit of 50% is theoretically testable and verifiable, but in reality it takes 96 shots to determine a 95% confidence interval with at most a 10% margin of error. This might be prohibitively expensive for live testing (Haman, 2020).

19

American flag on their right shoulder, it would provide incentive for adversaries to engage in perfidy and wear this patch. Trying to combat these issues by adding more branching logic or competing supervisory controllers quickly spirals into a “quis custodiet”5 problem (Porter et al., 2020). If these middleware systems are intended for operational use, testers must ensure they are an explicit component of the evaluation.

However, while supervisory middleware can help make tests safer, the T&E community should keep in mind that its use may restrict testing of real parts of the operational space. If the middleware always kicks in under its specified conditions, then it becomes difficult to know what the more complex system would have done under those conditions. For example, if the middleware is something like an automatic ground collision avoidance system, it is harder to tell whether a drone's high-speed dive was (or the drone believed it was) recoverable and/or related to a tactical decision.

Built-in Transparency and Traceability If systems are recording data about their own decisions and internal processing, then stakeholders, including developers, testers, and even users, can gain more transparency into the system. From a TEV&V perspective, this instrumentation could be combined with safety middleware or disabled functionality to execute what some call “shadow testing,” where the complex system makes decisions about what it would do in the current situation without being allowed to implement or execute those actions (Templeton, 2019).
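
To illustrate the shadow-testing concept, the sketch below queries a stand-in "complex policy" for what it would have done while a simpler, already-trusted controller actually acts; the controller, the policy, and the logged fields are all hypothetical placeholders.

    # Sketch of shadow testing: the complex policy is queried but never executed;
    # its hypothetical decisions are logged for later comparison and evaluation.
    import json

    def trusted_controller(state):
        return "hold_position"                      # certified, already-fielded behavior

    def complex_policy(state):
        # Stand-in for the unverified AI component under shadow test.
        return "advance" if state["threat_level"] < 0.5 else "withdraw"

    shadow_log = []

    def step(state):
        executed = trusted_controller(state)        # this action actually happens
        shadowed = complex_policy(state)            # this one is only recorded
        shadow_log.append({"state": state, "executed": executed, "shadow": shadowed})
        return executed

    for threat in (0.1, 0.4, 0.9):
        step({"threat_level": threat})

    # The log becomes test evidence: where would the complex system have diverged?
    print(json.dumps(shadow_log, indent=2))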

Shadow testing, combined with a strategy of graded autonomy (slowly stepping up the permitted risks of unsupervised tasks, as with medical residents) and limited capability fielding (only initially certifying and enabling a subset of existing capabilities for fielding) could allow the services to get at least some useful functionality into warfighters' hands while continuing the T&E process for features with a higher evidentiary burden (Porter et al., 2020). Once fielded, shadow testing on disabled capabilities could harvest data from actual operations and exercises to evaluate higher-risk tasks under realistic conditions without accepting that higher risk. However, the obvious critical requirement underlying this is internal instrumentation.

Many point to a need for systems to be instrumented such that the internal causes of the system's decisions are traceable, and assert that this instrumentation needs to be implemented throughout the entire lifecycle of the AMS (Ahner et al., 2018; Haugh et al., 2018; Luna et al., 2013; Porter et al., 2020; Sparrow et al., 2018; Zacharias, 2019b). Typically instrumentation is done as an "orange wire" harness solely for TEV&V activities and only covers hardware and sometimes elements of traditional software. While this traditional instrumentation is necessary, programs also need to include "cognitive instrumentation" on the internal decision processes (Haugh et al., 2018). This cognitive instrumentation could serve a host of activities, including fault diagnosis, live behavioral health monitoring, status reporting and explainability to operators, intrusion detection, model induction, and data harvesting for retraining; it could even be reversed to allow

5 “Quis custodiet ipsos custodes?” Latin for “Who will guard the guards themselves?”

20

injection for live, virtual, and constructive (LVC) events (Haugh et al., 2018; Porter et al., 2020). To ensure that this instrumentation is available from cradle to grave, it may be advisable to have it be an explicit program requirement for all AMS (Ahner et al., 2018; Porter et al., 2020).

Explainability Several DoD policies require that AMSs be at least transparent to our warfighters (Defense Innovation Board, 2019; Deputy Secretary of Defense, 2012; Lopez, 2020); others call for a further step and advocate that they be explainable as well (Gunning & Aha, 2019; Haugh et al., 2018; Mueller et al., 2019). Though there are many different definitions of these terms, here we use transparent to indicate that a system’s current operating state can be known and its behavior reasonably predicted from that state, while we use explainable to indicate that a system is capable of providing the reason for a behavior post-hoc. Though most focus on the value to operators of having explainability, it also provides a powerful tool for testers (Haugh et al., 2018). Why a system makes a decision is much more valuable for generalizing its behavior than repeated measurements of what it did (Porter et al., 2020), especially for legal, moral, and ethical behavior (Porter, 2020a), and a system that can explain its decisions can provide this “why” to testers. Functionally, a system that can explain itself is one that can trace its own decisions, and so explainability will be critically enabled by traceability (and in turn by the recommendations that support that feature).

"Common, Open Architectures with Reusable Modules" In the discussions around AMS, both recommendations for common, open, modular architectures to support reusability (Air Force Scientific Advisory Board, 2017; Caseley, 2018; Defense Science Board, 2016; Zacharias, 2019a) and criticisms of those recommendations (Atherton, 2019) are common. While these features of AMS design are mutually reinforcing and are often called for together (and thus conflated), the benefits and criticisms of each are distinct and worth examining separately.

Modularity In this review, we use modular to describe a feature of system architecture where a sub-system or component (i.e., a module) of the larger architecture can take certain inputs and, through its information processing or service, produce new outputs to be passed on to other modules. This contrasts with monolithic systems at the other end of the spectrum, where there is an absence of architecture and no structure or sequence to inputs and outputs (Porter et al., 2020). While modern development practice for ordinary software virtually guarantees some level of modularity, AI-enabling software may be much more monolithic, or entirely so, particularly when it is powered by neural networks trained with supervised, unsupervised, or reinforcement learning (OpenAI, 2018). However, many recommend that AI-enabling software be constructed around modularized cognitive architectures instead (Achler, 2013; Anandkumar, 2020; Chella, Cossentino, Gaglio, & Seidita, 2012; Krichmar, 2012; Kurup & Lebiere, 2012; Laird, 2012; Polk & Seifert, 2002; Samsonovich, 2012; Sandamirskaya & Burtsev, 2015; Simen & Polk, 2010). The extent of modularity is a gradient, not a binary attribute, and the level of granularity at which a system is modularized will likely be a program-specific choice (Defense Science Board, 2016; Laird, 2012; Porter et al., 2020).
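
As a toy illustration of what modular seams offer a tester, the sketch below strings three hypothetical modules (perception, planning, control) together through explicit inputs and outputs; the module names and interfaces are invented for illustration and do not come from the cited architectures.

    # Sketch of a modular pipeline: each module consumes defined inputs and produces
    # defined outputs, creating natural seams for instrumentation and per-module testing.
    from typing import Dict, List

    def perception(raw_sensor: List[float]) -> Dict:
        # Module 1: turn raw measurements into a simple track estimate.
        return {"range": min(raw_sensor), "contact": len(raw_sensor) > 0}

    def planner(track: Dict) -> Dict:
        # Module 2: choose a maneuver from the perceived situation.
        action = "evade" if track["contact"] and track["range"] < 100 else "continue"
        return {"action": action}

    def controller(plan: Dict) -> str:
        # Module 3: translate the plan into an actuator command.
        return f"cmd:{plan['action']}"

    # Because the seams are explicit, a test harness can log, verify, or substitute
    # any intermediate product rather than treating the whole chain as a black box.
    print(controller(planner(perception([250.0, 90.0, 400.0]))))   # cmd:evade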

21

Some recommend modularization primarily to address development or performance concerns (Air Force Scientific Advisory Board, 2017; Defense Science Board, 2016), but there are significant T&E benefits as well. First, modularization breaks down the system and its decisions and tasks into greater and more explicit granularity (Roske et al., 2012), which makes it easier to (1) write verifiable requirements for the system’s capabilities (Scheidt, 2017), (2) design tests that focus on those tasks, and (3) create the hooks both for cognitive instrumentation and supervisory middleware (Porter et al., 2020). However, this can create the possibility of emergent effects internal to the system, and so intra-system emergence needs to be tested as well (Porter et al., 2020).

Modularity remains fiercely debated outside of T&E, however. Some critics contend that trying to modularize the AI controlling our autonomous systems creates insurmountable interoperability challenges (Atherton, 2019). Others criticize the recommendation by pointing to a general tradeoff between modularization and the ability to optimize the system end-to-end, which could translate into an operational performance penalty. However, more recent work in industry suggests this might be a false tradeoff: complex real-world tasks (i.e., the ones that matter with AMS) can benefit more from generalization than optimization, and bio-plausible architectures of neural network modules improved performance across the breadth of an operational space compared to more monolithic designs (Anandkumar, 2020). Optimizing to a small set of training data (i.e., overfitting a model) can also create performance problems, and modularity can help provide some robustness against this (Porter et al., 2020).

Open Architectures In addition to recommending that systems have some kind of modular architecture, numerous groups are recommending that the specific system implementations be based on open, non-proprietary architectures that could be reused across multiple systems (Air Force Scientific Advisory Board, 2017; Eaton et al., 2017; Lenzi et al., 2010). Though most hope for operational interoperability and cost-savings from reusability with these open architectures (Defense Science Board, 2016; Zacharias, 2019a), or for protection against individual vendors failing at performance requirements or otherwise choosing to end participation (e.g., Google and Project Maven; Mitchell, 2019), there has also been some discussion of the T&E benefits. Open architectures help ensure some level of modularity and commonality for cognitive instrumentation and supervisory middleware, which will be helpful for T&E interoperability. Test interoperability will be especially desirable when testing systems-of-systems (Eaton et al., 2017; Lenzi et al., 2010) or looking for emergent behavior between systems not only in live but especially in simulated environments (Porter et al., 2020).

Capability Reuse The topic of reusability is extremely controversial. From a T&E perspective, reuse reduces the amount of evidence that needs to be collected to make a credible assurance argument. There are calls to make risk-based certification and reuse of capability modules the standard model for AMS (e.g., Caseley, 2018). On the other end of the spectrum, some think the DoD should not even pursue modularity at all (e.g., Atherton, 2019). Some view reusability as desirable but unachievably aspirational for the moment, as even usability is a yet unsolved problem (Sparrow,

22

2020), or think that certifications will need to be case specific, and so the T&E savings may be overstated (Sparrow et al., 2018). Others point out that if the way a system makes decisions is not sufficiently robust to transfer to related problems, even if only as a starting point for transfer learning, then it is unlikely to be sufficiently robust across its own complex operational space in the first place. Some level of reusability will be a feature of robust, useful systems (Porter et al., 2018). Furthermore, AMS will have a higher evidentiary burden than standard systems, and boutique development solutions alongside custom T&E may not be viable from an enterprise cost perspective, making reusability potentially more necessary for AMS than for standard systems (Porter et al., 2018).

Design & Development Processes Some design and development practices are better complements to testing than others. Designers who conduct a priori risk analyses to inform their designs, such as considering the type and likelihood of different destabilizing events and the advantages and risks of introducing autonomy there (Harikumar & Chan, 2019), are more likely to explicitly address those concerns (Ferreira et al., 2013) and to prioritize test points surrounding those risks (Deonandan et al., 2010). Conducting these risk analyses can improve testing and even become part of an assurance argument themselves. Other common process recommendations are to adopt or develop paradigms that closely integrate testing early on and throughout the development process, such as Model- Based Design (Ahner & Parson, 2016; Kapinski et al., 2016), statistical engineering (Ahner & Parson, 2016), or iterative or cyclic paradigms like Agile or DevSecOps (Air Force Scientific Advisory Board, 2017; Defense Science Board, 2016). Some recommend the use of digital twins—simulated copies of the cyber-physical system—operating in parallel to the real system for run-time assessment throughout the system’s lifecycle (Arnold & Scheutz, 2018), something which could be particularly useful for shadow testing and discovering performance deltas between versions when embedded in the same chassis.

Coordinate, Integrate, & Extend Activities It has been noted both informally and formally that DoD acquisition processes are ill-suited to implementing the development paradigms above. Firewalls—implemented by both law and custom—between contractor testing (CT), developmental testing (DT), and operational testing (OT) activities enforce a waterfall paradigm6 across the system’s lifecycle. Many attempts to adopt something like an Agile process merely result in “Agile BS,” where the go fast principle is adopted, but the integration of testing and willingness to move backwards and correct issues discovered by it are not (Defense Innovation Board, 2018). In light of these issues, there have been numerous calls to desegregate the stovepipes of CT, DT, and OT to some extent (Ahner & Parson, 2016; Air Force Scientific Advisory Board, 2017; Deonandan et al., 2010; McLean et al., 2016; Porter et al., 2018; Roske et al., 2012). However, some of these firewalls exist for important reasons, and so methods that still preserve the independence of these organizations, such as the

6 A category of linear development processes in which a project is divided into stages, and each stage completes its work and then provides its products to the next stage as a deliverable.

23

creation of cross-functional teams, may be preferable (Porter et al., 2018), because independent oversight is a likely feature of ethical system development (Porter, 2020a).

Desegregating these and later activities will likely be a necessary step to enable other recommendations. For example, some recommend earlier user involvement (Defense Science Board, 2012; Gunning, 2017; Mueller et al., 2019) or point out that effective development will require earlier operational realism in testing (Defense Science Board, 2012; Porter et al., 2018). However, access to the necessary resources and expertise is typically downstream from the time at which these processes need to occur. Furthermore, testing does not just have to shift left—it may also have to shift right and continue through post-fielding activities: many are recommending that the TEV&V process should not end when systems are fielded, but has to be a cradle-to-grave process, especially for learning systems that continue to adapt (Ahner & Parson, 2016; Defense Science Board, 2016; Deonandan et al., 2010; Ilachinski, 2017; Porter et al., 2020). This would require a closer association between test organizations and the Services. Finally, there is strong consensus that we should be using iterative, adaptive, or sequential methodologies for development and testing, where the path forward cannot and should not be charted years in advance, but rather needs to react to the shifting realities on the ground and even return to earlier stages when necessary (Ahner & Parson, 2016; Defense Science Board, 2016; Gunning, 2017; McLean et al., 2016; Micskei et al., 2012; Mueller et al., 2019; Porter et al., 2020; Simpson, 2020; Sparrow et al., 2018; Visnevski & Castillo-Effen, 2010). A firewalled waterfall approach cannot support these needs, and so recommendations to reform that approach typically follow.

Recommendation #2: Methods, Tools, Infrastructure, & Workforce Authors frequently made recommendations alongside their observations that testers do not have what they need to effectively execute their duties. In most cases, authors call for investment or research in particular domains. Others make more specific suggestions for techniques that could be beneficial at least as starting points.

Leverage Existing Methods Despite a common acknowledgement that our current methods are insufficient for the challenges of AMS TEV&V, few imply that they are without any value or that we need to start completely from scratch. For some uses, the current methods are adequate; for others, methodological solutions to AI challenges already exist in other fields but are simply unknown or unused in DoD. Wherever possible, we should adopt those techniques when they are adequate, adapt them when they are not, and only invent new ones when neither is possible (Porter et al., 2018). Many of the challenges of AMS will be the same as those we have already solved for other complex systems, and those solutions can be lifted directly (Caseley, 2018; Roske et al., 2012). For instance, although formal methods are difficult to apply to many components of AMS, there will be times when testers can use them to significantly bolster an assurance argument (Air Force Scientific Advisory Board, 2017; Haugh et al., 2018; Luna et al., 2013; Tate & Sparrow, 2019). Though some critical statistical methods used in test planning require adaptation or invention, many of the techniques in testers' toolboxes will continue to be relevant for AMS (Porter et al., 2020; Porter et al., 2018; Roske et al., 2012).

24

Performance Testing Methods Some authors have made suggestions for specific techniques that could help characterize the performance of AMS or their subcomponents. Some recommend using compositional verification on individual modules, for example (Ahner & Parson, 2016; Haugh et al., 2018). Others point out that while compositional verification is a useful starting point, integrative verification will be necessary for non-deterministic (or just more generally unpredictable) networks of modules (Durst, 2019; Harikumar & Chan, 2019). For example, modules can be tested in successive forward and backward cascades, and a Bayesian paradigm could allow statistical and module uncertainty to be propagated when evaluating these module chains (Porter et al., 2020).
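
One simple way the propagation idea could be realized is with Monte Carlo sampling over each module's posterior success rate, as in the sketch below. The observed counts, the uniform priors, and the assumption that module failures are independent are illustrative simplifications, not the cited authors' method.

    # Illustrative Monte Carlo propagation of per-module uncertainty to the chain level.
    # Each module's success rate gets a Beta(successes + 1, failures + 1) posterior.
    import numpy as np

    rng = np.random.default_rng(1)

    # (successes, failures) observed for each module in the chain (made-up data)
    module_results = [(48, 2), (90, 10), (29, 1)]

    n_draws = 100_000
    chain_success = np.ones(n_draws)
    for s, f in module_results:
        chain_success *= rng.beta(s + 1, f + 1, size=n_draws)   # assumes independence

    print("posterior mean of end-to-end success:", chain_success.mean().round(3))
    print("90% credible interval:", np.percentile(chain_success, [5, 95]).round(3))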

To characterize that performance, authors have recommended a variety of different outcome metrics that, taken together, could be mutually reinforcing. Because of the focus on narrow applications of AI, most of the metrics used today are domain- or task-specific outcomes, and some recommend that collection of these continue (Baker et al., 2019). However, as we move towards more complex problems that might require the accomplishment of many subtasks, authors also recommend that domain-general skills might make for stronger generalization across these broader problems (Baker et al., 2019; Durst, 2019; Harikumar & Chan, 2019; Hernández-Orallo, 2016). For a human analogue of a domain-general skill, we might measure the speed at which people can acquire new information and their ability to multi-task in order to judge how well they will perform in a demanding, dynamic task environment, rather than their ability to do all the specific tasks that might arise. These domain-general skills might be hard to conceive or measure, and so others have suggested that domain-specific skill growth trajectories could be characterized as well (Ahner & Parson, 2016; Baker et al., 2019).

Prioritizing Test Points There is consensus that, with limited test resources spread across large state-spaces requiring coverage, we will need to prioritize our test points and maximize their efficiency, and some authors have made more specific recommendations to that effect. For example, one could optimize to get coverage of risky conditions identified a priori (Deonandan et al., 2010), or more generically, optimize based on the resources the test plan would save (Luna et al., 2013). Recommendations often focus on adopting or adapting some kind of optimal Design of Experiments (DOE; Air Force Scientific Advisory Board, 2017; Haugh et al., 2018; Hernández-Orallo, 2016; Hess & Valerdi, 2010; Porter et al., 2020; Roske et al., 2012), or indicate that testers should choose points adaptively or sequentially based on what has been learned to date (Hess & Valerdi, 2010; H. Miller, 2019; Porter et al., 2020; Visnevski & Castillo-Effen, 2010). Some work has been completed integrating these suggestions for the statistical implementation of sequential DOEs for autonomous systems (Porter et al., 2020; Simpson, 2020).
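
As a toy example of sequential point selection, the sketch below starts from a single point and greedily adds whichever candidate is farthest from everything tested so far (a maximin space-filling heuristic); actual sequential DOE methods for AMS would add model-based and risk-based selection criteria on top of this.

    # Toy sequential design: greedily add the candidate farthest from all tested points.
    import numpy as np

    rng = np.random.default_rng(0)
    candidates = rng.uniform(0, 1, size=(500, 2))    # discretized two-factor test space
    tested = [candidates[0]]                         # arbitrary starting test point

    def next_point(tested, candidates):
        dists = np.linalg.norm(
            candidates[:, None, :] - np.array(tested)[None, :, :], axis=2
        ).min(axis=1)
        return candidates[np.argmax(dists)]          # the most "unexplored" candidate

    for _ in range(9):
        tested.append(next_point(tested, candidates))

    print(np.round(np.array(tested), 2))             # ten well-spread test points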

Developing Methods Where the current methodological state-of-the-art will be insufficient for AMS, the cutting edge needs to be advanced. The problem area that method development recommendations most commonly target has been choosing or defining what the test dimensions or conditions should be, for example the test factors around which a DOE is run, and selecting test scenarios or cases (Ahner & Parson, 2016; Arnold & Scheutz, 2018; Defense Science Board, 2012; Hernández-Orallo, 2016;

25

Lenzi et al., 2010; Scheidt, 2017; Zhou & Sun, 2019). Some suggest that adapting our current DOE methods or inventing new ones should be the focus of this research (Ahner & Parson, 2016; Air Force Scientific Advisory Board, 2017; Porter et al., 2020; Simpson, 2020).

Beyond picking test dimensions or points, several authors point to the need for both broader experimental paradigms and more specific techniques for demystifying AMSs' internal decision models (Ahner & Parson, 2016; Gunning, 2017; Porter et al., 2020). For example, some have more generally recommended the use of symbolic meta-models (Alaa & Schaar, 2019) or visualization techniques (e.g., Heinrich, Zschech, Skouti, Griebenow, & Riechert, 2019), but their applicability and scalability for AMS remain untested. In many cases, these demystification techniques produce more abstract versions of the system's decision making. Relatedly, testers need to develop methods for determining the appropriate abstraction of M&S (Ahner & Parson, 2016), which likely will require some level of demystification (Porter et al., 2020).
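
One common demystification tactic is to fit an interpretable surrogate to a black-box decision component, as sketched below. The "black box" here is a hand-built stand-in, and a shallow decision tree is only one of many possible surrogate forms; whether such surrogates scale to real AMS decision engines is exactly the open question noted above.

    # Fit an interpretable surrogate (a shallow decision tree) to a black-box policy.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    def black_box_policy(x):
        # Stand-in for an opaque decision engine: engage if close and hostile.
        return ((x[:, 0] < 0.4) & (x[:, 1] > 0.6)).astype(int)

    rng = np.random.default_rng(2)
    X = rng.uniform(0, 1, size=(2000, 2))        # features: range, hostility score
    y = black_box_policy(X)

    surrogate = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(export_text(surrogate, feature_names=["range", "hostility"]))
    print("surrogate agreement with black box:", surrogate.score(X, y))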

AMS will produce massive quantities of data, especially if the system has any kind of cognitive instrumentation. Aside from the enterprise infrastructure these data flows demand, we will also need methods and procedures for managing, handling, and analyzing these data at the local T&E level (Ahner & Parson, 2016; Visnevski, 2008). While "big data" is not a problem unique to AI or the DoD, the Department has historically struggled, and continues to struggle, with this challenge, and commercial solutions do not always transfer to military problems (Avery & Simpson, 2019).

Develop Test Tools Many are calling for tools that will help automate testing, either to evaluate performance automatically (Micskei et al., 2012; Wegener & Bühler, 2004; Zhou & Sun, 2019) or to explore the operational space (Air Force Scientific Advisory Board, 2017), both of which are likely necessary to implement the DevSecOps framework the JAIC intends to pursue (Pinelis, 2020; Trent, 2019). Several groups recommend that rather than trying to invent these tools from scratch for AI, we should continue to iterate and extend the automated test tools we already have (Air Force Scientific Advisory Board, 2017; H. Miller, 2019; Scheidt, 2017; Streilein et al., 2019).

Develop Test Infrastructure Many recommend significant investment in improving our test assets and ranges. Often cited is the need to develop digital, M&S, or LVC test beds (Ahner & Parson, 2016; Air Force Scientific Advisory Board, 2017; Eaton et al., 2017; Gil & Selman, 2019; Haugh et al., 2018; McLean et al., 2016; Micskei et al., 2012; Porter et al., 2020; Streilein et al., 2019). Others recommend improving our physical ranges via improved instrumentation (Defense Science Board, 2012) or range realism (Defense Science Board, 2012; Gil & Selman, 2019).

Establish Partnerships Some recommend centralizing the research and development of these methods, tools, and infrastructure (Gil & Selman, 2019; Hernández-Orallo, 2016). Even if research is not centralized, establishing partnerships between stakeholders and sectors could help combat stovepiping and the scattering of talent across groups. There are many different sources of expertise across industry,

26

academia, and the national security community; tighter relationships between these sectors could enable cross-pollination of ideas (Air Force Scientific Advisory Board, 2017; Gil & Selman, 2019), and a central organization or activity dedicated to sharing AMS test techniques and lessons would help ensure this knowledge is disseminated, or at least available, to testers who need it (Gil & Selman, 2019; Hernández-Orallo, 2016).

Personnel In order to actually execute the more sophisticated test and analysis techniques AMS will require, DoD will need to improve the skillsets of its workforce. Though the approaches are not mutually exclusive, some recommendations focus on creating pipelines for the needed skillsets from universities to the government (Gil & Selman, 2019), whereas others emphasize retraining current employees (Ring, 2009), for example by accrediting them in AI skillsets through continuing education units (Goerger, 2004).

Recommendation #3: Specific Test Strategies Some authors, instead of speaking generally about the TEV&V of AMS, have proposed specific testing frameworks. Though there has been a great deal of variation in their implementations and intervention points in the system's lifecycle, as well as many more individual recommendations we do not cover here, a number of themes emerge from these frameworks. In this section we cover the overlap among these proposals for how to test AMS. Note that while our descriptions here are somewhat generic, the actual strategies often get into implementation-level details.

Develop a Strategy Based on Existing Practices Many, including those who make specific proposals of their own, recommend that test strategies ground themselves in some existing field or practice as a starting point. These include suggestions that the behavioral sciences could offer useful insights into how to examine systems that will make their own decisions and perform their own behaviors (Durst, 2019; Goerger, 2004; Gunning, 2017; Porter et al., 2018); that many of the problems in AI overlap those in any complex software, and so we should start there (Caseley, 2018; Durst & Gray, 2014; Harikumar & Chan, 2019; Ilachinski, 2017; Roske et al., 2012), and that our current T&E practices should be advanced, not abandoned (Defense Science Board, 2016; Durst, 2019; Roske et al., 2012).

Automate Testing Because of the consensus that the huge state-spaces in which AMS will operate will challenge our ability to test, many are also seeking ways to improve efficiency. A core suggestion uniting many testing frameworks is to significantly increase reliance on automated testing. Instead of having test planners run their own DOEs, for example, automated test tools could allow a program to automatically select the next test points in a sequential manner based on pre-specified criteria, allowing much more efficient coverage of the state-space (Haugh et al., 2018; Micskei et al., 2012; Visnevski & Castillo-Effen, 2010; Wegener & Bühler, 2004; Zhou & Sun, 2019). Test tools could also execute some level of automated performance evaluation on test outcomes (Micskei et al., 2012; Wegener & Bühler, 2004; Zhou & Sun, 2019), including training evaluation

27

systems with machine learning to analyze AI performance (Micskei et al., 2012). Critical to enabling this automatic evaluation would be creating “test oracles” that know what the correct outcome should be for a given situation, and the problems where this is hard vastly outnumber the problems where it is easy (Sparrow et al., 2018). When combined, automated DOE and performance evaluation can permit automatic testing. However, if this is done via live test, where most of the budget is driven by logistics and physical execution, then the cost and time savings are likely minimal, and so many also recommend that testers rely heavily on simulation (Air Force Scientific Advisory Board, 2017; Defense Science Board, 2012; Micskei et al., 2012; Visnevski & Castillo-Effen, 2010; Wegener & Bühler, 2004).7
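
For the comparatively rare cases where correct behavior can be written down as explicit properties, an automated oracle can be as simple as the sketch below; the separation and geofence invariants are assumed for illustration, and most operational behaviors will not reduce to checks this clean.

    # Oracle for automated evaluation: check explicit invariants over a simulated run.
    MIN_SEPARATION = 50.0   # assumed safety requirement (meters)

    def oracle(trajectory):
        # Return (passed, violating_timesteps) for a list of per-timestep records.
        violations = [
            t for t, rec in enumerate(trajectory)
            if rec["separation"] < MIN_SEPARATION or rec["outside_geofence"]
        ]
        return len(violations) == 0, violations

    run = [
        {"separation": 120.0, "outside_geofence": False},
        {"separation": 45.0,  "outside_geofence": False},   # violates separation
        {"separation": 200.0, "outside_geofence": True},    # violates geofence
    ]
    print(oracle(run))   # (False, [1, 2])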

Low Fidelity Coverage, High Fidelity Validation The extensive use of simulation raises the question of the nature of those simulations. There is a fundamental tradeoff between simulation fidelity and run speed. If the goal is to cover as much of the state-space as possible with these techniques, then low fidelity (LoFi) simulations would maximize that capability. However, LoFi simulations give much less credible assurance and are more likely to miss complex but dangerous interactions than high fidelity (HiFi) simulations. The compromise that most frameworks propose is to use both LoFi and HiFi simulations for their different strengths: achieve broad, efficient coverage with LoFi techniques to look for zones of interest, while using HiFi sims or live testing to follow up on those zones, validate the LoFi results, and/or assess critical test points. While this is hardly a novel strategy overall, there are different recommendations for its specific implementation in AMS.

There are different forms the LoFi simulation could take. For example, testers can reduce the fidelity of the environment and/or the AMS software itself. Some recommend using some simplification or abstraction of the system’s decision model for LoFi testing (Giampapa, 2013; Greer, 2013; Haugh et al., 2018; Micskei et al., 2012), or the use of agent-based modeling (Defense Science Board, 2016; Greer, 2013; Ilachinski, 2017). Others recommend something closer to software-in-the-loop testing where the actual decision software is embedded in a LoFi representation of the environment (Sparrow, 2020).

Once a LoFi simulation method is selected, testers also need to pick ‘zones of interest’ for additional testing, especially in automated test paradigms. Many have recommended that performance failures be the primary driver of follow-on testing (Deonandan et al., 2010; Giampapa, 2013; Greer, 2013; Haugh et al., 2018; Luna et al., 2013; Schultz et al., 1993; Subbu et al., 2009; Visnevski & Castillo-Effen, 2010; Wegener & Bühler, 2004). Others suggest that rather than simply finding failures, edge cases that define where performance boundaries change would provide the best value for additional investigation (Durst, 2019; Haugh et al., 2018). Some focus on the idea of unpredictability. For example, some recommend a test replicate paradigm where retested points that produce inconsistent results are the most interesting (Harikumar & Chan,

7 Sequential or integrated testing can also simply be more difficult to plan. Decision-makers and program planners are less likely to agree to a test that costs an indeterminate amount of money, even if that unknown amount is likely to be smaller at the end of the day (Haman, 2020).

28

2019; Zhou & Sun, 2019). This retest paradigm assumes the same simulation is used for both the original run and the retest, whereas others recommend multi-method validation paradigms where zones of interest are those where one LoFi method cannot predict another (Porter et al., 2020).
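
As a toy version of the multi-method idea, the sketch below runs two cheap stand-in models over a coarse grid of conditions and flags the cells where they disagree as candidates for higher-fidelity follow-up; both "simulations" are trivial placeholders.

    # Flag zones of interest where two low-fidelity models disagree.
    import numpy as np

    def lofi_model_a(x, y):
        return (x + y > 1.0).astype(int)                              # stand-in LoFi prediction

    def lofi_model_b(x, y):
        return ((x + y > 1.0) ^ (np.abs(x - y) < 0.1)).astype(int)    # slightly different model

    grid = np.linspace(0, 1, 21)
    X, Y = np.meshgrid(grid, grid)
    disagree = lofi_model_a(X, Y) != lofi_model_b(X, Y)

    zones = np.column_stack([X[disagree], Y[disagree]])    # conditions to rerun in HiFi
    print(f"{len(zones)} of {X.size} grid cells flagged for HiFi follow-up")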

Once these zones of interest are defined, testers can move through progressively more sophisticated tests, such as higher resolution simulations, software- and hardware-in-the-loop tests, LVC activities, and fully live testing (Defense Science Board, 2016; Laverghetta et al., 2018; Visnevski & Castillo-Effen, 2010). Some recommend that rather than a strictly linear progression of increasing resolution, these can be iterative, cyclic processes that return to each other to build up and out (Porter et al., 2020).

Build a Body of Evidence Embedded in some of these frameworks is an assumption that AMS will demand a greater burden of evidence that would be difficult to meet with the amount of testing conducted at any one stage of the system’s lifecycle. Instead, we need to be trying to build a body of evidence or construct an assurance argument that draws on lines of evidence traditionally not included in those individual stages (i.e., DT or OT; Porter et al., 2020; Sparrow et al., 2018). That evidence will need to be accumulated over time (Ahner & Parson, 2016; Laverghetta et al., 2018; Lede, 2019; Lennon & Davis, 2018) and shared and reused among stakeholders (Ahner & Parson, 2016; Defense Science Board, 2016; Lede, 2019; Lennon & Davis, 2018).

Run Time Assurance A completely separate strategy largely abandons the traditional structure of pre-fielding performance and safety testing. The argument is that because these state-spaces are so large, it will not be possible to meaningfully cover them. Instead of trying to anticipate and test these scenarios, the focus of assurance should be at runtime—on having systems that can recognize for themselves through self-monitoring when they are under cyberattack or in situations outside of their training bounds. Functionally, these are safety middleware on steroids, such as having an entire second gapped system evaluating the ethics of the first system’s choices (Arnold & Scheutz, 2018). The recommendation is that the focus of TEV&V should not be on the system itself, but rather perhaps on the efficacy of the runtime monitors (Ahner & Parson, 2016; Arnold & Scheutz, 2018; Lede, 2019; Lennon & Davis, 2018). This is not necessarily an overall easier challenge, but changes the focus of factors, measures, and test structures (Tate & Sparrow, 2019).

Recommendation #4: Provide Risk Assurance for Safety and Security The complexity associated with AMS and advances in adversarial methodology, paired with the dire consequences of failure, indicate that risk assurance is critical to ensure trust in the safe and secure operation of AMS. Recommendations for T&E must be able to broadly and consistently address system brittleness, system control, and vulnerabilities to adversarial exploitation.

Safety Since system complexity, brittleness, and lack of continuous human-in-the-loop control can result in failures in AMS with dire consequences, recommendations for safety must address

29

these concerns in a consistent and robust manner. Considerations include run-time assurance monitoring and real-time, continuous testing to identify unacceptable behavior (Eaton et al., 2017; Haugh et al., 2018). In particular, detection of unacceptable behavior should trigger the system to revert to a default or more deterministic mode that is less likely to violate behavioral boundaries (Haugh et al., 2018). This concept is similar to including a human in the loop to monitor and correct the behavior of the system, but replaces the human with another system. This recommendation leads to new testing challenges involving continuous monitoring, detection of behavioral violations, and the safe implementation of a behavior switching mechanism (Haugh et al., 2018).
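
A minimal sketch of the monitor-and-revert concept appears below; the behavioral boundary (a speed limit) and the deterministic fallback are placeholders for whatever bounded, verified behavior a real program would certify.

    # Runtime monitor: revert to a deterministic default mode on a behavioral violation.
    SPEED_LIMIT = 30.0   # assumed behavioral boundary (m/s)

    def deterministic_mode(state):
        return {"speed": min(state["speed"], SPEED_LIMIT), "mode": "default"}

    def complex_mode(state):
        return {"speed": state["requested_speed"], "mode": "autonomous"}

    def monitored_step(state):
        proposed = complex_mode(state)
        if proposed["speed"] > SPEED_LIMIT:      # unacceptable behavior detected
            return deterministic_mode(state)     # switch to the bounded default mode
        return proposed

    print(monitored_step({"speed": 25.0, "requested_speed": 45.0}))
    # {'speed': 25.0, 'mode': 'default'}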

Security Continual and rapid development of the technology underlying AMS, paired with equally rapid developments in exploitation of that technology, indicates that research should continually explore the cyber implications of autonomous systems (Ilachinski, 2017), and policies should be adopted that facilitate cyber T&E throughout the lifecycle of the program (Air Force Scientific Advisory Board, 2017). This should include the use of red teaming, testing across the lifecycle of the program, and evaluation of the system's response to input generated through techniques like adversarial machine learning. While some models underlying AMS decision engines lack the capacity to resist adversarial perturbation (Goodfellow et al., 2015), many methods exist to generally make the decision engine more robust against adversarial perturbation (Qiu et al., 2019). Further, experimentation can be used to quantify the vulnerability of AMS to adversarial methods (Streilein et al., 2019), and adversarial testing methods can be used to quickly identify vulnerabilities in the decision engine (Haugh et al., 2018). Red teaming can be used to augment conventional DOE methods and evaluate systems under development (Haugh et al., 2018; Streilein et al., 2019). In particular, red teaming can be used in testing to identify cases that break the AMS (Haugh et al., 2018), and to test the ability of systems to counter adversarial attacks (Streilein et al., 2019).
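
As a concrete example of adversarial input generation, the sketch below applies the fast gradient sign method of Goodfellow et al. (2015) to a hand-built logistic model so that the gradient is available analytically; probing a real AMS decision engine would require its actual gradients or black-box approximations, and the weights and inputs here are arbitrary.

    # Fast gradient sign method (FGSM): x_adv = x + eps * sign(dLoss/dx).
    import numpy as np

    w = np.array([2.0, -3.0, 1.5])        # weights of a stand-in decision model
    b = 0.1

    def predict(x):
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))     # probability of class 1

    x = np.array([0.5, 0.2, -0.1])        # benign input, scored as class 1
    y = 1.0                               # true label
    eps = 0.15                            # perturbation budget

    # For cross-entropy loss with a logistic model, dLoss/dx = (p - y) * w.
    grad_x = (predict(x) - y) * w
    x_adv = x + eps * np.sign(grad_x)

    print("clean score:      ", round(float(predict(x)), 3))      # about 0.59
    print("adversarial score:", round(float(predict(x_adv)), 3))  # drops below 0.5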

Recommendation #5: Adoption of Policies, Standards, and Metrics Given the documented discussions regarding the lack of policy, standards, and metrics, it seems natural to recommend the development of such items. Indeed, overcoming many of the challenges associated with T&E of AMS, such as ad-hoc and inconsistently applied procedures, will require the adoption of appropriate policies, standards, and metrics that can facilitate a common T&E framework. Experts broadly call for the creation of a common framework built upon relevant standards and utilizing metrics that appropriately quantify relevant factors. As is apparent in this review, the tasks embodied in these recommendations require careful thought and innovation.

Creation of a Common T&E Framework Many experts call for the creation of a common, operationally meaningful, and understandable framework for T&E of autonomous systems (Defense Science Board, 2016; Ilachinski, 2017; Macias, 2008; Ring, 2009). However, given the complexity involved in testing autonomous systems, the creation of a common framework is easier discussed than implemented.

30

Some recommend using previously established frameworks as a guide (Durst, 2019), and some, such as Harikumar and Chan (2019), attempt to provide a guide. A wide variety of specific proposals exist, often focusing on specific cases (Eaton et al., 2017; Giampapa, 2013; Hess & Valerdi, 2010; Lenzi et al., 2010; Luna et al., 2013; McLean et al., 2016; Micskei et al., 2012; Porter et al., 2020; Scheidt, 2017; Visnevski & Castillo-Effen, 2010), but underlying each proposal is the principal need for a common T&E framework.

Develop Standards Development of a common T&E framework requires creation and adoption of relevant standards. Recommendations acknowledge the need for standards, guidelines, and practices for continuous V&V for autonomous systems (Defense Science Board, 2016), accrediting autonomous systems (Ilachinski, 2017), data handling (Visnevski, 2008), system of systems (SoS) architecture (Luna et al., 2013), and even standard labeling (Ring, 2009), among other needs, because standards aid in effective decision making (Roske et al., 2012). In particular, the Defense Science Board (2016) recommends that (the now defunct) USD(AT&L) require a consistent and comprehensive M&S strategy throughout the lifecycle of the system. Further, standardized architecture can help testers and developers envision structural choices (Ring, 2009), and standardization of SoS architecture will enable replacement of constituent systems with minimal disruption to system interfaces (Luna et al., 2013). Standardized definitions are also recommended (Ferreira et al., 2013; Ring, 2009) as they facilitate common understanding (Ring, 2009) and cross-program comparison (Deonandan et al., 2010).

Develop Metrics Experts broadly recommend the development of measures of effectiveness and performance to evaluate success and enable effective decision-making (Ilachinski, 2017; Ring, 2009; Roske et al., 2012; Streilein et al., 2019), noting that metrics can address many challenges (Harikumar & Chan, 2019; Scheidt, 2017). In general, metrics must address coverage adequacy (Ahner & Parson, 2016) and embody standardized language (Ferreira et al., 2013). Quantification of risks can help prioritize elements in the testing process (Deonandan et al., 2010), and the use of methods that consider the trade-offs involved in the allocation of testing activity can increase the agility and decrease the duration of testing (Air Force Scientific Advisory Board, 2017). For example, sequential analysis methods, such as using currently known information to inform when to stop testing (Hess & Valerdi, 2010) can reduce the required time to test without sacrificing the quality of the test.8 In many cases, metrics can be based on tasks that must be completed in order to establish a measure of effectiveness (Visnevski & Castillo-Effen, 2010). Given the nature of autonomous systems and the challenges they pose, metrics must also be able to provide insight into topics that are often difficult to quantify, such as trust and the characterization of learning (Ahner & Parson, 2016; Defense Science Board, 2012; Harikumar & Chan, 2019; Porter et al., 2020; Tate, Grier, Martin, Moses, & Sparrow, 2016; Wojton et al., 2020).
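
As one classical example of a sequential method, the sketch below implements Wald's sequential probability ratio test for a success probability, stopping as soon as the accumulated evidence crosses a decision bound; the hypothesized rates and error levels are arbitrary choices for illustration.

    # Wald's SPRT for a Bernoulli success rate: stop when the evidence crosses a bound.
    import math

    p0, p1 = 0.70, 0.90          # unacceptable vs. required success probability
    alpha, beta = 0.10, 0.10     # allowed type I / type II error rates

    upper = math.log((1 - beta) / alpha)   # accept H1: system meets the requirement
    lower = math.log(beta / (1 - alpha))   # accept H0: system falls short

    def sprt(outcomes):
        llr = 0.0
        for n, success in enumerate(outcomes, start=1):
            llr += math.log(p1 / p0) if success else math.log((1 - p1) / (1 - p0))
            if llr >= upper:
                return n, "stop: meets requirement"
            if llr <= lower:
                return n, "stop: fails requirement"
        return len(outcomes), "continue testing"

    print(sprt([1] * 12))   # stops after 9 consecutive successes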

8 However, there is no free lunch, and going from a fixed sample size to a random sample size typically results in some loss of information or ability to control error rates (Haman, 2020).

31

Conclusions

The advent of AI capabilities in military systems will put serious strain on the standard processes, methods, and procedures DoD employs to evaluate acquisition programs. The complexity of these systems opens up vast state-spaces that testers need to explore. This expansion of required test points will likely require some combination of two general approaches, both of which require methodological research. First, we need to increase the evidentiary value each test point provides. Part of this can come from sequential, adaptive design of experiments to maximize the information test points provide, and part of it can come from improving our ability to make inferences between and beyond our test points. Both of these would enable us to achieve the same level of confidence with smaller overall tests. In parallel, we need to increase the number of test points we can collect for a given amount of resources. Numerous authors recommend leveraging cheaper, lower-fidelity methods to cover this space while simultaneously developing new techniques that improve our ability to validate these models or simulations. All of this raises the problem that agile, learn-and-plan-as-you-go testing does not fit well into the current acquisition paradigm, a reality that has generated many recommendations that this process needs to be reformed. Finally, adopting common policies, technical standards, or even system architectures across mission domains could help programs that are struggling to figure it out on their own without adequate analytic methods and AI talent in the workforce.

The most consistent recommendation theme across all authors was identifying the need for research. Significant gaps remain in our ability to evaluate AMS, and we will need to bridge them. Some problems can likely be addressed through basic academic research or partially applied science and technology outreach. Others may simply require hard-won trial-and-error lessons as we attempt to field these systems. It will be incumbent upon our T&E communities to address what issues we can before the problems they produce permit inadequate systems to slip through evaluation and fail in the field.

Our own organization, IDA,9 will attempt to contribute to this gap-filling effort. To begin, we will be prioritizing methods advancement research where we can balance the importance and tractability of the problems. While there are many hard-to-solve challenges that will require significant lead time to address, and investment in these areas should start immediately, this investment may be better suited to other organizations and activities. IDA’s immediate work will focus on creating better DOE methods for sequential experimentation, a framework for evaluating the performance of human-machine teams, and statistical methods for combining evaluations of independently tested modules.

In the spirit of desegregating the T&E communities and coordinating effort, we also recommend that these research efforts be shared through a centralized activity. There are

9 NOTE: The following paragraphs are intended to be replaced with DOT&E priorities and relabeled as DOT&E. We are providing draft input so they have something, but ultimately this section will probably be rewritten by them.

32

numerous bubbles of researchers working on different aspects of these problems, but these groups often do not have strong connections with each other. A more thorough effort to find and connect these research efforts could have great value for knowledge sharing. The JAIC, the Autonomy Community of Interest, and the Defense and Aerospace Test and Analysis Workshop (DATAWorks) could be useful places to begin.

Finally, there are numerous policy decisions regarding T&E that need to be made in the near future, and if there is any hope for adopting authors’ recommendation to integrate test activities early and often in system lifecycles, these policies should probably be coordinated between the different stakeholders. For example, DoD Directive 3000.09 on Autonomy in Weapon Systems assigns OT test adequacy standards to DOT&E, and V&V standards to (the now defunct) AT&L. If DoD moves away from the waterfall approach to system development and testing, having different and possibly conflicting standards would make integrated testing particularly difficult. We recommend that DOT&E, DDT&E, and the JAIC T&E branch coordinate their policies on AI TEV&V.

Thinkers in the DoD have been writing about our autonomous horizons for over a decade now (Endsley, 2015; Office of the US Air Force Chief Scientist, 2011; Zacharias, 2019a), but these horizons are fast approaching. As the DoD accelerates the adoption of AI, the T&E community must also accelerate the development of assurance methods for AI-enabled systems. This document provides an overview of the thinking to date on the challenges these systems impose, as well as possible solutions. We hope our work will help our community be prepared when the DoD arrives at those AI horizons.

33

References

Achler, T. (2013). Neural networks that perform recognition using generative error may help fill the "Neuro-Symbolic Gap". Biologically Inspired Cognitive Architectures, 3, 6-12. doi:10.1016/j.bica.2012.10.001
Ahner, D. K., & Parson, C. R. (2016). Workshop report: Test and evaluation of autonomous systems. STAT Center of Excellence.
Ahner, D. K., Parson, C. R., Thompson, J. L., & Rowell, W. F. (2018). Overcoming the challenges in test and evaluation of autonomous robotic systems. The ITEA Journal of Test and Evaluation, 39, 86-94.
Air Force Scientific Advisory Board. (2017). Adapting Air Force test and evaluation to emerging system needs. Retrieved from United States Air Force Scientific Advisory Board.
Alaa, A. M., & Schaar, M. v. d. (2019). Demystifying black-box models with symbolic metamodels. Paper presented at the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.
Anandkumar, A. (2020). How to create generalizable AI? GTC 2020.
Arnold, T., & Scheutz, M. (2018). The "big red button" is too late: an alternative model for the ethical evaluation of AI systems. Ethics and Information Technology, 20(1), 59-69. doi:10.1007/s10676-018-9447-7
Atherton, K. D. (2019). Can the Army perfect an AI strategy for a fast and deadly future? AUSA. Retrieved from https://www.c4isrnet.com/artificial-intelligence/2019/10/15/can-the-army-perfect-an-ai-strategy-for-a-fast-and-deadly-future/
Avery, M. R., & Simpson, J. R. (2019). "How much testing is enough?" 25 years later. Retrieved from individual release request.
Baker, B., Kanitscheider, I., Markov, T., Wu, Y., Powell, G., McGrew, B., & Mordatch, I. (2019). Emergent tool use from multi-agent autocurricula. ArXiv e-prints.
Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71(356), 791-799.
Box, G. E. P. (1979). Robustness in the strategy of scientific model building. In R. L. Launer & G. N. Wilkinson (Eds.), Robustness in statistics. New York: Academic Press.
Brabbs, J., Lohrer, S., Kwashnak, P., Bounker, P., & Brudnak, M. (2019). M&S as the key enabler for autonomy development, acquisition and testing. Paper presented at the 2019 NDIA Ground Vehicle Systems Engineering and Technology Symposium.
Caseley, P. (2018). Human-machine trust: Risk-based assurance and licensing of autonomous systems. Paper presented at the SCI-313 Specialist Meeting Report.
Chella, A., Cossentino, M., Gaglio, S., & Seidita, V. (2012). A general theoretical framework for designing cognitive architectures: Hybrid and meta-level architectures for BICA. Biologically Inspired Cognitive Architectures, 2, 100-108. doi:10.1016/j.bica.2012.07.002
Cook, A. (2019). Taming killer robots. The JAG School Papers: Air University Press.
Defense Innovation Board. (2018). DIB guide: Detecting agile BS. Retrieved from https://media.defense.gov/2018/Oct/09/2002049591/-1/-1/0/DIB_DETECTING_AGILE_BS_2018.10.05.PDF
Defense Innovation Board. (2019). AI principles: Recommendations on the ethical use of artificial intelligence by the Department of Defense.
Defense Science Board. (2012). The Role of Autonomy in DoD Systems. Washington, DC.
Defense Science Board. (2016). Summer Study on Autonomy. Washington, D.C.


Deonandan, I., Valerdi, R., Lane, J. A., & Macias, F. (2010). Cost and risk considerations for test and evaluation of unmanned and autonomous systems of systems. Paper presented at the 5th International Conference on System of Systems Engineering, Loughborough, UK.
Autonomy in Weapons Systems, 3000.09 C.F.R. (2012).
Durst, P. J. (2019). The Reference Autonomous Mobility Model: A framework for predicting autonomous unmanned ground vehicle performance. (Doctor of Philosophy in Computational Engineering), Mississippi State University, ProQuest LLC.
Durst, P. J., & Gray, W. (2014). Levels of autonomy and autonomous system performance assessment for intelligent unmanned systems. (ERDC/GSL SR-14-1). US Army Engineer Research and Development Center (ERDC).
Eaton, C. M., Chong, E. K. P., & Maciejewski, A. A. (2017). Services-Based Testing of Autonomy (SBTA). The ITEA Journal of Test and Evaluation, 38, 40-48.
Endsley, M. R. (2015). Autonomous horizons: System autonomy in the Air Force – A path to the future. Maxwell AFB, AL: Air University Press.
Endsley, M. R. (2019). BOHSI Panel: Explainable AI, System Transparency, and Human Machine Teaming. Paper presented at the 63rd International Annual Meeting of the Human Factors & Ergonomics Society, Seattle, Washington.
Ferreira, S., Faezipour, M., & Corley, H. W. (2013). Defining and addressing the risk of undesirable emergent properties. Paper presented at the 2013 IEEE International Systems Conference (SysCon), Orlando, FL.
Funahashi, K. I. (1989). On the approximate realization of continuous mappings by neural networks. Neural Networks, 2(3), 183-192.
Ghassemi, M. (2020). The false hope of explainable machine learning in healthcare. Paper presented at the Joint Statistical Meeting 2020, COVID Virtual. https://amstat-jsm.conferencecontent.net/session/219272
Giampapa, J. A. (2013). Test and evaluation of autonomous multi-robot systems. Pittsburgh, PA: Software Engineering Institute.
Gil, Y., & Selman, B. (2019). A 20-year community roadmap for artificial intelligence research in the US. Computing Community Consortium (CCC) and Association for the Advancement of Artificial Intelligence (AAAI).
Goerger, S. R. (2004). Validating human behavioral models for combat simulations using techniques for the evaluation of human performance. Paper presented at the SCSC '03.
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. ArXiv e-prints.
Gray, C. (2015, May 19). When the only winning move is not to play. Retrieved from https://conradthegray.com/blog/when-the-only-winning-move-is-not-to-play/
Greer, K. (2013). A metric for modelling and measuring complex behavioural systems. IOSR Journal of Engineering (IOSRJEN), 3(11).
Gunning, D. (2017). Explainable Artificial Intelligence program update. DARPA.
Gunning, D. (2019). Explainable Artificial Intelligence. Paper presented at the Board of Human-Systems Integration: Explainable AI Frontiers: Human-Systems Integration Challenges and Opportunities, Washington, D.C.
Gunning, D., & Aha, D. W. (2019). DARPA's Explainable Artificial Intelligence (XAI) Program. AI Magazine, 40(2), 44-58. doi:10.1609/aimag.v40i2.2850
Haman, J. T. (2020, September 08).
Harikumar, J., & Chan, P. (2019). Developing knowledge and understanding for autonomous systems for analysis and assessment events and campaigns. (ARL-TR-8649).


Haugh, B. A., Sparrow, D. A., & Tate, D. M. (2018). The status of test, evaluation, verification, and validation (TEV&V) of autonomous systems. Alexandria, VA: Institute for Defense Analyses.
Heinrich, K., Zschech, P., Skouti, T., Griebenow, J., & Riechert, S. (2019). Demystifying the black box: A classification scheme for interpretation and visualization of deep intelligent systems. Paper presented at the AMCIS.
Hernández-Orallo, J. (2016). Evaluation in artificial intelligence: from task-oriented to ability-oriented measurement. Artificial Intelligence Review, 48(3), 397-447. doi:10.1007/s10462-016-9505-7
Hess, J. T., & Valerdi, R. (2010). Test and evaluation of a SoS using a prescriptive and adaptive testing framework. Paper presented at the 2010 5th International Conference on System of Systems Engineering, Loughborough, UK.
Hill, A., & Thompson, G. (2016). Five giant leaps for robotkind: Expanding the possible in autonomous weapons. War on the Rocks. Retrieved from https://warontherocks.com/2016/12/five-giant-leaps-for-robotkind-expanding-the-possible-in-autonomous-weapons/
Hoff, K. A., & Bashir, M. (2014). Trust in automation: Integrating empirical evidence on factors that influence trust. Human Factors: The Journal of the Human Factors and Ergonomics Society, 0018720814547570.
Horowitz, M., & Scharre, P. (2015). Meaningful human control in weapon systems: A primer. Center for a New American Security.
Ilachinski, A. (2017). AI, robots, and swarms issues: Questions and recommended studies. Arlington, VA: Center for Naval Analyses.
Johnson, G. (1984). Eurisko, the computer with a mind of its own. The APF Reporter. Retrieved from https://aliciapatterson.org/stories/eurisko-computer-mind-its-own
Kapinski, J., Deshmukh, J., Jin, X., Ito, H., & Butts, K. (2016). Simulation-based approaches for the verification of embedded control systems. IEEE Control Systems Magazine, 45-64.
Kirkpatrick, J., Pascanu, R., Rabinowitz, N., Veness, J., Desjardins, G., Rusu, A., . . . Hadsell, R. (2016). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 114. doi:10.1073/pnas.1611835114
Krichmar, J. L. (2012). Design principles for biologically inspired cognitive robotics. Biologically Inspired Cognitive Architectures, 1, 73-81. doi:10.1016/j.bica.2012.04.003
Kurup, U., & Lebiere, C. (2012). What can cognitive architectures do for robotics? Biologically Inspired Cognitive Architectures, 2, 88-99. doi:10.1016/j.bica.2012.07.004
Kwashnak, P. (2019). Autonomous Systems Test Capability (ASTC) overview. Paper presented at the Workshop on Test and Evaluation of Artificial Intelligence Enabled Systems, Aberdeen Proving Grounds, Maryland.
Laird, J. E. (2012). The SOAR cognitive architecture. Cambridge, Massachusetts: The MIT Press.
Laverghetta, T. J., Leathrum, J. F., & Gonda, N. (2018). Integrating virtual and augmented reality based testing into the development of autonomous vehicles. Paper presented at the MODSIM World 2018, Norfolk, VA.
Lede, J. (2019). Autonomy overview. Paper presented at the US-Japan Service to Service Dialogue.
Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46(1), 50-80. doi:10.1518/hfes.46.1.50_30392
Lenat, D. B. (1983). EURISKO: a program that learns new heuristics and domain concepts: the nature of heuristics III: program design and results. Artificial Intelligence, 21(1-2), 61-98.
Lennon, C., & Davis, E. (2018). Autonomy Community of Interest: Test & Evaluation, Verification & Validation Working Group US-UK TEM.


Lenzi, N., Bachrach, B., & Manikonda, V. (2010). DCF® a JAUS and TENA compliant agent-based framework for UAS performance evaluation. Paper presented at the PerMIS '10 Proceedings of the 10th Performance Metrics for Intelligent Systems Workshop, Baltimore, MD.
Lopez, C. T. (2020). DOD adopts 5 principles of artificial intelligence ethics. DOD News. Retrieved from https://www.defense.gov/Explore/News/Article/Article/2094085/dod-adopts-5-principles-of-artificial-intelligence-ethics/
Luna, S., Lopes, A., Tao, H. Y. S., Zapata, F., & Pineda, R. (2013). Integration, verification, validation, test, and evaluation (IVVT&E) framework for system of systems (SoS). Procedia Computer Science, 20, 298-305. doi:10.1016/j.procs.2013.09.276
Macias, F. (2008). The Test and Evaluation of Unmanned and Autonomous Systems. ITEA Journal, 29, 388-395.
McLean, A. L., Bertram, J. R., Hoke, J. A., Rediger, S. S., & Skarphol, J. C. (2016). LVC-enabled testbed for autonomous system testing. Rockwell Collins. Retrieved from https://insights.rockwellcollins.com/2016/10/31/lvc-enabled-testbed-for-autonomous-system-testing/
Menzies, T., & Pecheur, C. (2005). Verification and validation of artificial intelligence. Paper presented at the Advances in Computers, Amsterdam.
Micskei, Z., Szatmári, Z., Oláh, J., & Majzik, I. (2012). A concept for testing robustness and safety of the context-aware behaviour of autonomous systems. Berlin, Heidelberg.
Miller, H. (2019). Report on test infrastructure for emerging technology.
Miller, M. J., McGuire, K. M., & Feigh, K. M. (2017). Decision support system requirements definition for human extravehicular activity based on cognitive work analysis. Journal of Cognitive Engineering and Decision Making, 11(2), 136-165. doi:10.1177/1555343416672112
Mitchell, B. (2019). Google's departure from Project Maven was a 'little bit of a canary in a coal mine'. FedScoop. Retrieved from https://www.fedscoop.com/google-project-maven-canary-coal-mine/
Mueller, S. T., Hoffman, R. R., Clancey, W., Emrey, A., & Klein, G. (2019). Explanation in human-AI systems: A literature meta-review synopsis of key ideas and publications and bibliography for explainable AI.
Mullins, G. E., Stankiewicz, P. G., Hawthorne, R. C., Appler, J. D., Biggins, M. H., Chiou, K., . . . Watkins, A. S. (2017). Delivering test and evaluation tools for autonomous unmanned vehicles to the fleet. Johns Hopkins APL Technical Digest, 33(4), 279-288.
Narla, A., Kuprel, B., Sarin, K., Novoa, R., & Ko, J. (2018). Automated classification of skin lesions: From pixels to practice. Journal of Investigative Dermatology, 138(10), 2108-2110. doi:10.1016/j.jid.2018.06.175
National Transportation Safety Board. (2019). Collision between vehicle controlled by developmental automated driving system and pedestrian, Tempe, Arizona, March 18, 2018. (NTSB/HAR-19/03 or PB2019-101402).
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84(3), 231-259.
O'Connor, T., & Wong, H. Y. (2012). Emergent properties. Stanford Encyclopedia of Philosophy (Spring 2012). Retrieved from https://plato.stanford.edu/archives/spr2012/entries/properties-emergent/
Office of the US Air Force Chief Scientist. (2011). Technology horizons: A vision for Air Force science and technology 2010-2030. Maxwell AFB, AL: Air University Press.
OpenAI. (2018). OpenAI Five. Retrieved from https://openai.com/blog/openai-five/
Oxford English Dictionary (Ed.). (2020). The Oxford English Dictionary.
Pinelis, Y. K. (2020). AI test and evaluation framework. Paper presented at the Joint Statistical Meeting 2020, COVID Virtual.


Polk, T., & Seifert, C. M. (2002). Cognitive modeling. Cambridge, Massachusetts: The MIT Press.
Porter, D. J. (2019). Demystifying the black box - A test strategy for autonomy. Paper presented at the DATAWorks 2019, Springfield, VA.
Porter, D. J. (2020a). Briefing to the URSA Legal, Moral, & Ethical Working Group: Assuring ethical behavior with AI-enhanced capabilities. Retrieved from individual release request.
Porter, D. J. (2020b). A HellerVVA problem: The Catch-22 for simulated testing of fully autonomous systems. Paper presented at the DATAWorks 2020, COVID Virtual.
Porter, D. J., McAnally, M., Bieber, C., & Wojton, H. M. (2020). Trustworthy autonomy: A roadmap to assurance - Part 1: System effectiveness. Retrieved from https://www.ida.org/research-and-publications/publications/all/t/tr/trustworthy-autonomy-a-roadmap-to-assurance-part-1-system-effectiveness
Porter, D. J., Pinelis, Y. K., Bieber, C. M., Wojton, H. M., McAnally, M. O., & Freeman, L. J. (2018). Operational testing of systems with autonomy. Retrieved from individual release request.
Porter, D. J., & Wojton, H. M. (2020). Briefing to the Air Force Scientific Advisory Board: T&E contributions to avoiding unintended behaviors in autonomous systems. Retrieved from individual release request.
Qiu, S., Liu, Q., Zhou, S., & Wu, C. (2019). Review of artificial intelligence adversarial attack and defense technologies. Applied Sciences, 9(5), 909.
R&E, A. (2015). Technology Investment Strategy.
Ring, J. (2009). Evolving an autonomous test and evaluation enterprise.
Roske, V. P., Kohlberg, I., & Wagner, R. (2012). Autonomous systems: Challenges to test and evaluation. Paper presented at the National Defense Industrial Association Test & Evaluation Conference.
Samsonovich, A. V. (2012). On a roadmap for the BICA Challenge. Biologically Inspired Cognitive Architectures, 1, 100-107. doi:10.1016/j.bica.2012.05.002
Sandamirskaya, Y., & Burtsev, M. (2015). NARLE: Neurocognitive architecture for the autonomous task recognition, learning, and execution. Biologically Inspired Cognitive Architectures, 13, 91-104. doi:10.1016/j.bica.2015.06.007
Santoni de Sio, F., & van den Hoven, J. (2018). Meaningful human control over autonomous systems: A philosophical account. Frontiers in Robotics and AI, 5. doi:10.3389/frobt.2018.00015
Scheidt, D. (2017). NAVAIR Autonomy TEVV Study. Weather Gage Technologies, LLC.
Schultz, A. C., Grefenstette, J. J., & Jong, K. A. D. (1993). Test and evaluation by genetic algorithms. IEEE Expert, 8, 9-14.
Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., . . . Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484-489. doi:10.1038/nature16961
Simen, P., & Polk, T. (2010). A symbolic/subsymbolic interface protocol for cognitive modeling. Logic Journal of the IGPL, 18(5), 705-761. doi:10.1093/jigpal/jzp046
Simpson, J. R. (2020). Sequential testing and simulation validation for autonomous systems. Paper presented at the DATAWorks 2020, COVID Virtual.
Sparrow, D. A. (2020).
Sparrow, D. A., Tate, D. M., Biddle, J. C., Kaminski, N. J., & Madhavan, P. (2018). Assessing the quality of decision-making by autonomous systems.
Stracuzzi, D. J., Chen, M., Darling, M., Jones, T., Peterson, M., & Vollmer, C. (2020). The role of uncertainty quantification in machine learning. Paper presented at the DATAWorks 2020, COVID Virtual.
Streilein, W., Thornton, J., Malyska, N., Bernays, J., Mersereau, B., Roeser, C., . . . Zipkin, J. (2019). Future NMI analysis study: Counter-AI. Massachusetts Institute of Technology.
Subbu, R., Visnevski, N. A., & Djang, P. (2009). Evolutionary framework for test of autonomous systems. Paper presented at the PerMIS'09, Gaithersburg, MD, USA.


Tate, D. (2020, July 07).
Tate, D., Grier, R. A., Martin, C. A., Moses, F. L., & Sparrow, D. (2016). A framework for evidence-based licensure of adaptive autonomous systems.
Tate, D., & Sparrow, D. (2018). Acquisition challenges of autonomous systems. Paper presented at the Proceedings of the 15th Annual Acquisition Research Symposium, Monterey, CA.
Tate, D., & Sparrow, D. (2019). Lesson 2: How do autonomous capabilities affect T&E? Autonomous Systems Test & Evaluation - CLE 002: Defense Acquisition University.
Templeton, B. (2019). Tesla's "Shadow" testing offers a useful advantage on the biggest problem in robocars. Forbes. Retrieved from https://www.forbes.com/sites/bradtempleton/2019/04/29/teslas-shadow-testing-offers-a-useful-advantage-on-the-biggest-problem-in-robocars/#2349d1943c06
Thuloweit, K. (2019). Emerging Technologies CTF conducts first autonomous flight test. Retrieved from https://www.af.mil/News/Article-Display/Article/1778358/emerging-technologies-ctf-conducts-first-autonomous-flight-test/
Trent, S. (2019). The Joint Artificial Intelligence Center: Transforming the DoD with human-centered technology. Paper presented at the 63rd International Annual Meeting of the Human Factors & Ergonomics Society, Seattle, Washington.
US Department of Defense. (2019). Summary of the 2018 Department of Defense Artificial Intelligence Strategy: Harnessing AI to Advance Our Security and Prosperity.
Visnevski, N. A. (2008). Embedded Instrumentation Systems Architecture. Paper presented at the IEEE International Instrumentation and Measurement Technology Conference, Victoria, Vancouver Island, Canada.
Visnevski, N. A., & Castillo-Effen, M. (2010). Evolutionary computing for mission-based test and evaluation of unmanned autonomous systems. Paper presented at the IEEE Aerospace Conference.
Voulodimos, A., Doulamis, N., Doulamis, A., & Protopapadakis, E. (2018). Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience, 2018, 7068349. doi:10.1155/2018/7068349
Wegener, J., & Bühler, O. (2004). Evaluation of different fitness functions for the evolutionary testing of an autonomous parking system. Paper presented at the Genetic and Evolutionary Computation Conference.
Wojton, H. M., Porter, D. J., & Lane, S. T. (2020). Initial validation of the Trust of Automated Systems Test (TOAST). Social Psychology.
Woods, D., & Dekker, S. (2000). Anticipating the effects of technological change: A new era of dynamics for human factors. Theoretical Issues in Ergonomics Science, 1(3), 272-282. doi:10.1080/14639220110037452
Zacharias, G. L. (2019a). Autonomous horizons: The way forward. Maxwell AFB, Alabama: Air University Press.
Zacharias, G. L. (2019b). Emerging technologies: Test and evaluation implications. Paper presented at the DATAWorks 2019, Springfield, Virginia.
Zhou, Z., & Sun, L. (2019). Metamorphic testing of driverless cars. Communications of the ACM, 63(3), 61-67.


Appendix A: Summaries of Individual Studies

US Air Force Scientific Advisory Board (2017). Adapting Air Force test and evaluation to emerging system needs, SAB-TR-17-03.

This study focuses on five operationally-oriented challenge problems, all of which are associated with AS discovery, improvisation, and “initiative”, a particular problem area for T&E:

 Challenges
   • "Testing and Evaluation methods are needed to quantify trust in autonomous and self-learning systems
      ⇒ "Bounding system behavior is insufficient to quantify trust
      ⇒ "Traditional test methods will overwhelm T&E capacity
      ⇒ "No current methods exist for understanding decision logic
      ⇒ "Range of system autonomy … will likely require application-dependent trust metrics
   • "Implementation of Deep-learning systems
      ⇒ "Test methods do not yet exist for deep-learning systems
      ⇒ "Limits AF ability to field systems"

 Recommendations
   • [R]equire an aggressive M&S-based approach
      ⇒ To fully integrate early program, formal development and operational test planning to guide all virtual, ground, and [Open Air] testing.
   • [A]dhere to rigorous principles and best practices that enable software design-for-test.
      ⇒ Strategies include developing and enforcing standards that promote software modularity (e.g., between critical and non-critical components) and the use of open architectures; formal and stochastic software V&V including correct-by-design software; and agile/DevOps software development strategies, where appropriate.
   • [A]dopt a policy to conduct cyber T&E throughout a program's lifecycle….
      ⇒ The lifelong testing would be greatly aided by virtual testing using the National Cyber Range (NCR) or a similar capability, but that requires an increase in the investment of the effort to purchase the avionics and then develop models that accurately reflect the possible cyber vulnerabilities. The NCR is more focused at the enterprise level, and virtualization would enable full-scale cyber testing at the avionics/embedded software levels and for systems-of-systems.


   • [Apply a] comprehensive data analytics approach to T&E analysis and planning needs.
      ⇒ Specifically, next-generation DOE techniques should be developed and implemented for efficient test design in the case of autonomous and networked-autonomous systems.
      ⇒ [E]xtending current automatic software test tools into the realm of autonomous test tools that decide when they have accomplished enough testing in a certain area and turn their attention to other high-need regions offer great potential to increase the accuracy and agility, while decreasing cycle times, of our current software practices.
   • [C]ontinue to partner with academia, industry, and operators
      ⇒ To investigate and develop the T&E methods needed across the gamut of autonomous systems (i.e., autonomous, networked autonomous, learning systems).


Ahner, D., & Parson, C. (2016). Workshop Report: Test and Evaluation of Autonomous Systems. STAT Center of Excellence, Wright-Patterson AFB, OH.

The primary challenge in testing autonomous systems is the broad scale and complexity of the systems, missions, and conditions. This is best addressed by breaking down the requirement, system, and/or mission into smaller pieces, which can then be readily translated into rigorously quantifiable statistical designs.

 Challenges
   • Requirements and Measures
      ⇒ "When quantifying the performance of autonomous systems, many difficulties arise. T&E must address the inputs, internal processes and states, and outputs for all autonomous functions employed by the system (e.g., perception, reasoning, learning, decisions, and behavior) as well as overall system performance to inform acquisition decisions and operational risk."
      ⇒ "Each of these elements must be understood in order to quantify performance of the decision engine, to inform acquisition decisions and operational risk, and ultimately to build trust from the warfighter that the system will perform as intended. This situation presents challenges to defining requirements in ways that are clear and testable, including in a well-defined and comprehensive operational mission environment. It also presents challenges to developing metrics to adequately test and evaluate these systems."
      ◊ How to quantify success of decision making
      ◊ How to measure perception, reasoning, and learning
      ◊ Metrics for: Trust, Intent, System learning, Perception, Reasoning, Distributed perception, Distributed decision-making
   • Personnel, Test Infrastructure, Knowledge/Methods

      ◊ Developing Skills and Recruiting
         ⇒ "A need to identify and develop the necessary skill sets within existing personnel, and recruit qualified candidates to become experts in test and evaluation of autonomous systems."
         ⇒ "This challenge has two key elements: training the existing and future workforce and developing new T&E methodologies relevant to autonomous systems."
      ◊ "Identification and development of correct range requirements and instrumentation needed to perform valid test and evaluation of autonomous systems."
      ◊ Knowledge/Methods
         ⇒ Sequential progressive testing will likely be critical, and developing methods which allow for real-time test planning and conduct are needed.
         ⇒ The need for range world models. These models will likely need to be both physical and simulated, and should account for the unique aspects that autonomous systems will bring to the test community.
         ⇒ Issues associated with the areas of repeatability and the desire to quantifiably bound the performance envelope while testing to the edges of those boundaries, when dealing with an on-line learning AS.
   • Design for Test[ability]
      ◊ The test API should be (see the interface sketch at the end of this challenge list)
         ⇒ Domain agnostic, with sufficient extensibility to accommodate domain specific attributes
         ⇒ Able to precisely stimulate the system to simulate specific input conditions
         ⇒ Able to extract sufficient data to provide insight or introspection of internal states during specified trigger events or time stamps
         ⇒ Able to perform the stimulation and data extraction during M&S events and during live test events
      ◊ The system's world models and experiences will need to be
         ⇒ discoverable prior to testing in order to baseline the system's level of intelligence.
   • Test Adequacy & Integration
      ◊ How is the acquisition process affected, and how do we adapt it?
      ◊ What new scientific approaches for test design need to be developed to ensure adequacy?
      ◊ What safety and security considerations need to be adequately tested?
      ◊ What safety considerations need to be adequately addressed in design of the test environment?
      ◊ How will agile or elastic test planning and test conduct be performed to accommodate dynamic test conditions?
   • Testing Continuum
      ◊ Resource conflicts (ranges, personnel, funding)
      ◊ Need flexible concept of 'required' performance
      ◊ Next test depends on result of previous
      ◊ Not aligned with current requirements and processes
      ◊ Need to embed testers from day 1, with test support at development
      ◊ Test to evaluate utility, not rigid requirements
   • Safety / Cyber Security for Autonomous Systems
      ◊ Inability to test all cases – infinite factor space further complicated by a potentially infinite decision space
      ◊ System may be changing continuously as knowledge is gained and decisions are made
      ◊ How to tell "good" perception/reasoning from bad
      ◊ Awkward time scales: Cyber may be too fast, but long endurance tests will likely be too slow
      ◊ Potentially exploitable algorithms (decision processes) by adversaries


      ◊ Manage, mitigate, and define risks
      ◊ Real-time prediction
      ◊ "Negative learning" may not surface during any feasible test or acquisition cycles
   • Testing of Human System Teaming
      ◊ Characterize and measure performance of human-machine partnership
      ◊ Measure shared situational awareness
      ◊ Testing human-machine trade space
      ◊ Testing heterogeneous autonomous systems
      ◊ Measuring difference between human intent and machine actions
      ◊ Creating a normalized model of the human
   • Post Acceptance Testing
      ◊ Recurring, periodic assessment of compliance with existing, newly introduced, or learned capabilities, rules, or constraints
      ◊ Assess value, robustness of learning (guard against negative learning or brittleness)
      ◊ Will require some level of self-monitoring
         ⇒ Prevent negative responses from developing
         ⇒ Last acceptable certified level reversion capability
         ⇒ What is trigger that requires more testing?
      ◊ Assess autonomous system adaptation to an aging physical system
      ◊ Feedback & assessment of updates
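As a rough illustration of the "design for test" API properties called out above (domain-agnostic stimulation, introspection of internal state at trigger events, identical use in M&S and live events, and a discoverable world model), the following is a minimal sketch. The interface and method names are our own invention for illustration; the workshop does not prescribe any particular interface.

# A hypothetical sketch of test hooks matching the API properties listed above.
# All names are illustrative; no program or standard defines this exact interface.
from abc import ABC, abstractmethod
from typing import Any, Dict


class AutonomyTestInterface(ABC):
    """Domain-agnostic hooks a system under test could expose to testers."""

    @abstractmethod
    def stimulate(self, condition: Dict[str, Any]) -> None:
        """Inject a precisely specified input condition (sensor feed, message, fault),
        whether the system is running in simulation or on a live range."""

    @abstractmethod
    def snapshot(self, trigger: str) -> Dict[str, Any]:
        """Extract internal state (perception, world model, decision trace) when the
        named trigger event fires or at a given time stamp."""

    @abstractmethod
    def describe_world_model(self) -> Dict[str, Any]:
        """Expose the system's world model and accumulated experience so testers can
        baseline its level of intelligence before testing begins."""

A concrete system under test would implement these hooks once and reuse them across modeling-and-simulation events and live range events, which is the portability the workshop's API properties call for.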

 Recommendations
   ◊ Improved statistical engineering methods
      ⇒ Improved statistical engineering methods are needed to support both developmental and operational testing of autonomous systems to address systems interacting with a dynamic environment in a non-deterministic manner. Improvement of these methods is a component of a larger need; these adaptive autonomous systems will require more stringent adherence to systems engineering principles throughout development.
   ◊ Processes and methods need to be developed to address Inputs – Process – Outputs of autonomous systems and human-machine element interaction and roles within the process.
   ◊ A test and evaluation continuum paradigm
      ⇒ A test and evaluation continuum paradigm must be developed and adopted that requires testing start early and a more sequential progressive approach is taken that includes development and implementation of a comprehensive M&S strategy across the life cycle. There is no "test phase" with a beginning or end; it extends throughout the life cycle of the system.
   ◊ Measures must be developed to address state space [coverage] adequacy, trust, and human-machine interaction.


   ◊ Design of experiment methods must be developed for defining test cases and expected results that overcome the difficulty of enumerating all conditions and non-deterministic responses that autonomy will generate in response to complex environments. (A simple illustration follows this list.)
   ◊ Models and live virtual constructive (LVC) test beds are needed that support robust testing while minimizing risk and cost.
   ◊ Development of techniques that capture learning growth, possibly similar to reliability growth models, is needed.
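To make the enumeration problem behind the design-of-experiments recommendation concrete, the sketch below (not drawn from the workshop report) counts a full-factorial test matrix for a handful of hypothetical factors and then draws a much smaller space-filling sample using SciPy's Latin hypercube sampler. All factor names, levels, and run counts are illustrative assumptions.

# A minimal sketch of full-factorial growth versus a space-filling test design.
import itertools

from scipy.stats import qmc  # requires SciPy >= 1.7

# Hypothetical operating conditions for an autonomous ground vehicle
factors = {
    "weather": ["clear", "rain", "fog", "snow"],
    "lighting": ["day", "dusk", "night"],
    "terrain": ["paved", "gravel", "mud", "sand"],
    "traffic": ["none", "light", "heavy"],
    "adversary": ["passive", "evasive", "aggressive"],
}

# Full-factorial enumeration grows multiplicatively with every factor added.
full_factorial = list(itertools.product(*factors.values()))
print(f"Full factorial: {len(full_factorial)} conditions")  # 4 * 3 * 4 * 3 * 3 = 432

# A Latin hypercube spreads a much smaller budget of runs across the factor space.
sampler = qmc.LatinHypercube(d=len(factors), seed=1)
for point in sampler.random(n=20):
    run = {
        name: levels[min(int(u * len(levels)), len(levels) - 1)]
        for u, (name, levels) in zip(point, factors.items())
    }
    print(run)

In practice the design would also be shaped by sequential test results and operational priorities, consistent with the continuum-of-testing recommendation above.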


Arnold, T., & Scheutz, M. (2018). “The ‘big red button’ is too late: an alternative model for the ethical evaluation of AI systems.” Ethics and Information Technology, 20, 59-69. https://doi.org/10.1007/s10676-018-9447-7

“As a way to address both ominous and ordinary threats of artificial intelligence (AI), researchers have started proposing ways to stop an AI system before it has a chance to escape outside control and cause harm. A so-called “big red button” would enable human operators to interrupt or divert a system while preventing the system from learning that such an intervention is a threat. … In this paper, [the authors] describe the demands that recent big red button proposals have not addressed, and [they] offer a preliminary model of an approach that could better meet them.” “The twofold aim of this paper is (1) to show how the big red button fails to address larger practical questions about AI safety, and (2) to sketch a technical approach based on an “ethical core” in the system that addresses several important aspects of such broader safety considerations.”

 Challenges
   • 1. Testing an AI system for ethical concerns and shutting down the system as a result of ethical concerns
      ◊ Use of a 'big red button' could come too late.
         ⇒ "Though an emergency shutdown button for AI makes some intuitive sense, the general premise for such a device is to intervene at the point when a system has, in effect, already "gone rogue.""
      ◊ The system may be able to prevent itself from being shut down
         ⇒ "Aside from needing not to be late, this type of intervention is supposed to fend off a worry with reward-based learning systems: what happens when such a system finds a way to manipulate its reward function or figures out a way to maximize rewards by preventing itself from being shut off?"
      ◊ Anticipating and projecting actions in view of ethical principles
         ⇒ "If the moment of crisis for a system that has gone rogue gives too short notice for its violation, some type of anticipatory and pre-emptive test is needed."
      ◊ Keeping actions and principles free from manipulation
         ⇒ "the fidelity of a test hinges on the system not being adjusted (by itself or others) for the specific purpose of meeting test requirements. Thus, the means of examination will have to be opaque to the system itself." Also, "a system's internal means of deceit may be the problem."
      ◊ Continuing diagnostics and "opaque" self-examination
         ⇒ "Given uncertainty around the scope and impact of a system's actions, especially ones far from obviously fraught with danger, a test must be ongoing and vigilant, and must be kept out of the system's purview."
   • 2. Testing a system in a simulated environment


      ◊ Synchronizing the simulation used for testing with the real world
      ◊ The system may need to act in the real world while it is being tested in simulation
      ◊ Preventing the system from learning that it is being tested

 Recommendations
   • An ethical core (EC) built into the AI system's architecture that is opaque to the AI and forms a buffer between the AI's actions and the real world
      ⇒ "The overall idea is to utilize a system architecture where an "ethical core/testing layer" is introduced below a virtual machine layer where different operating systems can run in virtual machines, which, in turn, run the AI system's control architecture (see Fig. 1 in the source document). The ethical layer runs on top of a minimal operating system and is comprised of all functionality necessary to perform the ethical testing. By situating this layer outside the VM on which the AI system is running, the AI system has no way of knowing whether there is such a layer, nor can it access any of that layer's states. The idea here is to remove any possibility for the AI system to directly manipulate the testing routine, regardless of how advanced its learning algorithms and its potentially malicious intent might be. In addition, the proposed architectural configuration utilizing virtual machines does not require a redesign of the AI system being analyzed; rather, the AI system can run unaltered in the VM on top of the ethical core."
      ◊ Continual, ongoing self-evaluation and testing
         ⇒ "A better approach would be to make ongoing self-evaluation and testing an integral part of a system's operation, diagnose how the system is in error and to prevent chaos and risk before they start." "A continual testing approach will be far more effective, responsive, and vigilant toward a system's learning and action in the world than an emergency button which one might not get to push in time."
      ◊ "This EC would be kept opaque to the system itself."
      ◊ A simulation environment
         ⇒ "… realistic enough for the agent to be unable to tell whether its sensory information is coming from the real world or the simulated environment"
      ◊ An ethical scenario generator
         ⇒ "… that can generate a large number of ethically challenged scenarios together with solutions using the real-world scenarios the agent is familiar with"
      ◊ A minimally operational shut-down system
         ⇒ "…that can read sensory information and operate the actuators in a way that will bring the physical system into an orderly state where the system can be turned off"
      ◊ Connections to the system's sensors and actuators
         ⇒ "that can be intercepted and re-routed in the VM layer and interfaces to the VM layer that allow for saving, cloning, starting and stopping virtual machines"


      ◊ Ethical scenario-generation mechanism paired with a simulation environment for testing
         ⇒ "We argue for an ethical core (EC) that consists of a scenario-generation mechanism and a simulation environment that are used to test a system's decisions in simulated worlds, rather than the real world. This EC would be kept opaque to the system itself."
      ◊ Testing a cloned version of the system in a parallel environment
         ⇒ "the EC instructs the VM layer to checkpoint the AI/architecture layer and build a VM image of the system. The image is then saved and cloned, and the cloned image SV is then started in a new VM with sensors and actuators connected to the simulated environment. System S continues to operate normally in the meantime without having been able to notice that it was cloned."


Baker, B., Kanitscheider, I., Markov, T., Wu, Y., Powell, G., McGrew, B., & Mordatch, I. (2019). Emergent tool use from multi-agent autocurricula. OpenAI white paper.

This study focused on novel methods of reinforcement learning to develop game playing agents (hide-and-seek), but in doing so created methods for quantitatively evaluating agent performance of potential use to T&E. To overcome the problem of metric selection, they recommend decomposing missions into domain-specific sub-skills and developing quantitative metrics for those skills, as well as measuring speed of relearning in related tasks or environments to help quantify generalizability across potential operational environments.

 Challenges
   • Task Complexity and Evaluation Scalability
      ⇒ "However, as environments increase in scale and multi-agent autocurricula become more open-ended, evaluating progress by qualitative observation will become intractable."
   • Emergent Behavior
      ⇒ "Co-adaptation between agents and environments can also give rise to emergent complexity"
   • Metric Selection
      ⇒ "…the objective being optimized does not directly incentivize the learned behavior, making evaluation of those behaviors nontrivial. Tracking reward is an insufficient evaluation metric in multi-agent settings, as it can be ambiguous in indicating whether agents are improving evenly or have stagnated. Metrics like ELO or Trueskill can more reliably measure whether performance is improving relative to previous policy versions or other policies in a population; however, these metrics still do not give insight into whether improved performance stems from new adaptations or improving previously learned skills." (A simple Elo-style illustration follows below.)
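The relative-skill metrics named in the quote above (Elo, TrueSkill) are inexpensive to compute from head-to-head evaluation outcomes. The sketch below is a minimal, standard Elo-style update for comparing successive policy checkpoints; the K-factor, starting ratings, and match results are invented for illustration and are not taken from Baker et al.

# A minimal Elo-style rating update for comparing policy checkpoints.
def elo_update(rating_a, rating_b, score_a, k=32):
    """Update two agents' ratings after a head-to-head evaluation.

    score_a is 1.0 if agent A won, 0.5 for a draw, 0.0 if A lost.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta


# Evaluate a new policy checkpoint against earlier checkpoints in the population
ratings = {"policy_v1": 1000.0, "policy_v2": 1000.0, "policy_v3": 1000.0}
match_results = [
    ("policy_v3", "policy_v1", 1.0),  # v3 beat v1
    ("policy_v3", "policy_v2", 0.5),  # draw
    ("policy_v2", "policy_v1", 1.0),  # v2 beat v1
]

for a, b, score in match_results:
    ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], score)

print(ratings)  # rising ratings for later versions indicate relative improvement

A rising rating for newer checkpoints indicates improvement relative to the population, although, as the authors caution, it does not reveal whether the gain comes from new skills or from refining previously learned ones.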

 Recommendations
   • Select operationally relevant skills (low- and high-level) to quantitatively test performance against
      ⇒ "We therefore propose a suite of targeted intelligence tests to measure capabilities in our environment that we believe our agents may eventually learn, e.g., object permanence, navigation, and construction."
      ⇒ "proposed framework for evaluating agents in open-ended environments as well as a suite of targeted intelligence tests for our domain"
      ⇒ "…we then propose a suite of domain-specific intelligence tests to quantitatively measure and compare agent capabilities."
   • Evaluate transfer to related tasks or environments


      ⇒ "We propose to use transfer to a suite of domain-specific tasks in order to assess agent capabilities. To this end, we have created 5 benchmark intelligence tests that include both supervised and reinforcement learning tasks. The tests use the same action space, observation space, and types of objects as in the hide-and-seek environment."


Caseley, P. (2018, December). Human-machine trust: Risk-based assurance and licensing of autonomous systems. SCI-313 Specialist Meeting Report.

This meeting report contains sparse summaries of many presenters and ideas. Given the lack of detail and the diversity of sources, it is difficult to identify a unified theme. However, the studies focus on the problem of human-machine interactions, and recommend a licensing-based assurance case for specific autonomous capabilities. Whether this is meant to apply at the process level (e.g., a licensing test for self-driving vehicles, another for face recognizers, and so on), at the system level (e.g., Expedient Leader-Follower has its license and can be given/sold to Britain), or both is unclear.

 Challenges
   • Complexity
   • Simulation Fidelity
      ⇒ "The assurance from simulating autonomous behaviors and the requisite fidelity of simulations needs further research."
   • Human supervision does not fully mitigate autonomy errors
      ⇒ "In other words, humans are poor supervisory controllers."

 Recommendations
   • License autonomous capabilities
      ⇒ "Licensing is a means of introducing a new capability; it could be specific to the autonomy-only element."
      ⇒ "Licensing would need an assurance process that provides evidence that the autonomy element is sufficient."
      ⇒ "Licensing would need an authority to issue the license and "police" the assurance processes that gives confidence in the autonomy."
      ⇒ "The assurance process does not need to be common between licensing agencies but would be specific to the autonomy-human."
      ⇒ "The license would define the capability of the necessary assurance of the autonomy-human team elements, the license would be the limitation of the capability (not the whole system) – a bit like the weapon and the platform."
      ⇒ "Licenses may be time limited to allow for changes in the environment and then re-issue of license indicating that the authority is satisfied that the human-autonomy can cope with the evolved environment."
   • Reuse existing methods where possible
      ⇒ "Similarity between Unmanned and Manned should not be ignored when considering assurance challenges."
      ⇒ "The structure and forms of assurance argument and evidence may differ but the principles of assurance cases still hold."


      ⇒ "Challenges relating to processes, methods, tools, etc. for assuring dependable systems are largely solved. AS are not necessarily more complex than traditional dependable systems. We already have state-space explosion, unpredictable environments, emergent behavior, HMIs, etc."
   • Have independent specialists
      ⇒ "SCI-313 reviewed whether autonomy or systems containing autonomy have a route to military capability through use of licensing – the result was licensing looks feasible and has some benefits of establishing trust but would need the support of a specialist board/authority to assess and determine and endorse the licensing bounds of the HMT."


Office of the Assistant Secretary of Defense For Research & Engineering. (2015). Autonomy Community of Interest (COI) Test and Evaluation, Verification and Validation (TEVV) Working Group Technology Investment Strategy 2015‐2018.

The gaps in TEVV for autonomous systems stem from a number of challenges that testing has not traditionally had to contend with, including the state-space explosion, systems that exhibit emergent behaviors, and the complexities of human-machine teaming. Autonomous TEVV will require new methods and tools that account for the complexities of engineering machine learning systems with non-deterministic behaviors and allow new information learned through test and evaluation to shape requirements analysis. The paper presents this concept as a flattening of the "Classic V" systems engineering development cycle model.

 Challenges
   • State space explosion
      ◊ Increasing levels of autonomy make full testing of autonomous systems increasingly infeasible; testing cannot account for all possible decision spaces and states
         ⇒ "Autonomous systems are characteristically adaptive, intelligent, and/or may incorporate learning. For this reason, the algorithmic decision space is either non-deterministic, i.e., the output cannot be predicted due to multiple possible outcomes for each input, or is intractably complex. Because of its size, this space cannot be exhaustively searched, examined, or tested; it grows exponentially as all known conditions, factors, and interactions expand. Therefore there are currently no established metrics to determine various aspects of success or comparison to a baseline state enumerated."
   • Unpredictable environments
      ◊ The unpredictability of environments exacerbates the state space problem
         ⇒ "The power of autonomous agents is the ability to perform in unknown, untested environmental conditions. Examples of environmental "stimuli" include actors capable of making their own decisions in response to autonomous system actions; producing a cognitive feedback loop that explodes the state space. Additionally, autonomous decisions are not necessarily memoryless and the state space is not just the intractably complex in the current situation, but also in the multiplicity of situations over time. Currently fielded systems have very limited robustness to dynamic / changing environmental conditions. Adaptive autonomous algorithms have the potential to overcome current automated system brittleness in future dynamic, complex, and/or contested environments. However, this performance increase comes with the price of assuring correct behavior in a countless number of environmental conditions. This exacerbates the state-space explosion problem."
   • Emergent behaviors


      ◊ "Interactions between systems and system factors may induce unintended consequences."
         ⇒ "Interactions between systems and system factors may induce unintended consequences. With complex, adaptive systems, how can all interactions between systems are captured sufficiently to understand all intended and unintended consequences? How can autonomous design approaches identify or constrain potentially harmful emergent behavior both at design time and at run time? What limitations are there with the current Design of Experiments approach to test vector generation when considering adaptive decision-making in both discrete decision logic and continuous variables in an unpredictable environment? Since emergent behavior can be produced by interactions between small, seemingly insignificant factors how can we provide test environments or test oracles that are of sufficient fidelity to examine and capture emergent behavior (in M&S, T&E, and continuous operations or run time testing)?"
   • Human-machine teaming/communication
      ◊ It is difficult to verify trust in autonomous systems through M&S and T&E as is possible in other older systems
         ⇒ "Handoff, communication, and interplay between operator and autonomy become a critical component to the trust and effectiveness of an autonomous system. Current certification processes eliminate the need for "trust" through exhaustive Modeling and Simulation (M&S) and T&E to exercise all possible operational vignettes. When this is not possible at design time, how can trust in the system be ensured, what factors need to be addressed, and how can transparency and human-machine system requirements for the autonomy be defined?"
   • Gaps in ATEVV
      ◊ Lack of verifiable autonomous system requirements (around CONOPs, MOEs, performance measures, metrics in general)
         ⇒ "Currently, there is a lack of common, clear, and consistent requirements for systems that include autonomous requirements, especially with respect to environmental assumptions, Concept of Operation (CONOPS), interoperability, and communication. There is also a lack of clearly defined Measures of Effectiveness (MOEs), performance measures, and other metrics."
      ◊ Lack of modeling, design, interface standards
         ⇒ "Currently, no standardized modeling frameworks exist for autonomous systems that span the whole system lifecycle (R&D through T&E). Therefore, a gap exists in traceability between capabilities implemented in conventional systems as well as adaptive, nonlinear, stochastic, and/or learning systems and the requirements they are designed to meet. This results in a need to integrate models that are both heterogeneous and composable in nature and existing at different levels of abstraction, including requirements, architecture models, physics-based models, cognitive models, test range/environment models, etc."


      ◊ Lack of AT&E capabilities (test beds, skillsets, ranges – rework and redesign because of technology are also problematic)
         ⇒ "As stated before, there is a current gap in T&E ranges, test beds, and skillsets for handling dynamic learning/adaptive systems. The complexity of these systems results in an inability to test under all known conditions, difficulties in objectively measuring risk, and an ever-increasing cost of rework/redesign due to errors found in late developmental and operational testing. Furthermore, the lack of formalized requirements and system models makes test-based instrumentation of model abstractions increasingly difficult. This limits design-for-test capabilities, including tests to evaluate human-autonomy interactions."
      ◊ Lack of human operator reliance to compensate for brittleness
         ⇒ "Currently, the burden of decision making under uncertainty is placed solely on human operators. Certification, acceptance, and risk mitigation often assume the human operator can compensate for the brittleness currently found in manned, remotely piloted, or tele-operated systems. However, as systems move from relatively predictable automated behaviors to more unpredictable and complex autonomous behaviors, and as autonomous systems operate in denied environments in which they interact with human intermittently, it will become increasingly difficult for human operators to understand and respond appropriately to decisions made by the system."
      ◊ Lack of runtime V&V during deployed autonomy operations
         ⇒ As stated earlier, current automated systems rely on human oversight to guarantee safe and correct operation, with the human operator acting as the ultimate monitor, kill switch, and recovery mechanism for brittle automation. However, as systems incorporate higher levels of autonomy, it will no longer be feasible or safe to rely solely on human operators for system monitoring and recovery.
      ◊ Lack of evidence re-use for V&V

 Recommendations
   • Adopt software engineering "V" development model
      ◊ Use the V model to incorporate V&V throughout the design process
         ⇒ "AFRL researchers have proposed that the classic "V" used to describe the software development process be modified to incorporate verification activities throughout the development cycle (see Figures 1 & 2). Figure 2 shows a conceptual Autonomy Community of Interest TEVV Process Model that integrates development and V&V, with V&V activities occur during and between each major development activity. With this process model, we endeavor to "flatten" the systems engineering "V" through the incremental and compositional assurances (or arguments) of safety, security, performance, and risk."
   • Aim to meet ATEVV goals laid out by ASD (R&E) Autonomy COI


      ◊ TEVV Goal 1: Methods & Tools Assisting in Requirements Development and Analysis (a minimal illustration appears at the end of this entry)
         ⇒ "This goal focuses on increasing the fidelity and correctness of autonomous system requirements by developing methods and tools to enable the generation of requirements that are, where possible, mathematically expressible, analyzable, and automatically traceable to different levels (or abstractions) of autonomous system design."
         ⇒ "Formalized requirements enable automatic test generation and traceability to low-level designs, but note that TEVV representatives must be involved early in the requirements development process. Specific requirements and requirement templates must be constructed that articulate how autonomous vehicles need to perform in unknown and untested environmental conditions that can induce unintended consequences."
      ◊ TEVV Goal 2: Evidence-Based Design and Implementation
         ⇒ "Methods and tools need to be developed at every level of design from architecture definition to modeling abstractions to software generation / hardware fabrication, enabling the compositional verification of the progressive design process, thereby increasing test and evaluation efficiency."
      ◊ TEVV Goal 3: Cumulative Evidence through RDT&E, DT, & OT
         ⇒ "Methods must be developed to record, aggregate, leverage, and reuse M&S and T&E results throughout the system's engineering lifecycle; from requirements to model-based designs, to live virtual construction experimentation, to open-range testing."
      ◊ TEVV Goal 4: Run Time Behavior Prediction and Recovery
         ⇒ "For highly complex autonomous systems, an alternate method leveraging a run-time architecture must be developed that can provably constrain the system to a set of allowable, predictable, and recoverable behaviors, shifting the analysis/test burden to a simpler, more deterministic run-time assurance mechanism."
      ◊ TEVV Goal 5: Assurance Arguments for Autonomous Systems
         ⇒ "Not only do multiple new TEVV methods need to be employed to enable the fielding of autonomous systems, a new research area needs to be investigated in formally articulating and verifying that the assurance argument itself is valid."
         ⇒ "Additionally, standard autonomy argument templates must be developed, enabling the reuse of explicit arguments of risk, performance, and safety, closely tied to autonomy requirements and TEVV practices which, if performed, provide an acceptable collection of evidence for an autonomous system."
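To make TEVV Goal 1 concrete, the sketch below expresses two hypothetical requirements as machine-checkable predicates and evaluates them automatically over a logged trajectory. The requirement names, thresholds, and state fields are our own illustrative assumptions; the investment strategy does not prescribe this particular representation.

# A minimal sketch of "mathematically expressible" requirements checked automatically
# against a simulated or recorded trace. All requirements and values are hypothetical.
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class State:
    t: float                               # seconds
    altitude_m: float
    dist_to_nearest_noncombatant_m: float


# Each requirement is a named predicate that must hold at every time step.
Requirement = Callable[[State], bool]

REQUIREMENTS: Dict[str, Requirement] = {
    "REQ-001 maintain altitude above 150 m": lambda s: s.altitude_m > 150.0,
    "REQ-002 keep 500 m standoff from noncombatants": (
        lambda s: s.dist_to_nearest_noncombatant_m > 500.0
    ),
}


def check_trace(trace: List[State]) -> Dict[str, Optional[float]]:
    """Return, for each requirement, the time of the first violation (None if satisfied)."""
    results = {}
    for name, pred in REQUIREMENTS.items():
        results[name] = next((s.t for s in trace if not pred(s)), None)
    return results


# Example: a short trace with one violation of REQ-001 at t = 2.0
trace = [State(0.0, 300.0, 900.0), State(1.0, 200.0, 700.0), State(2.0, 140.0, 650.0)]
print(check_trace(trace))

Because the requirements are executable, the same predicates can serve as test oracles in M&S runs, developmental testing, and operational testing, supporting the traceability and evidence reuse the other goals describe.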


Deonandan, I., Valerdi, R., Lane, J. A., & Macias, F. (2010). Cost and risk considerations for test and evaluation of unmanned and autonomous systems of systems. 2010 5th International Conference on System of Systems Engineering, Loughborough, 2010, 1-6.

This article focuses on how the challenges of testing AI/AS will affect budgeting, and proposes a risk-based, parametric cost-estimation procedure to help with choosing, scoping, and planning test events.

 Challenges
   • System of System Level Risks
      ◊ System Complexity
         ⇒ "Many factors can increase the integration complexity of the SoS including the number of systems to be integrated, number of interfaces involved and technology maturity of the SoS. Many UASoS have never even existed in the past making it very difficult to predict any emergent properties. A UASoS requires the ability for manned and unmanned systems to co-operate with each other to fulfill its purpose. In addition, the number of requirements of the SoS is a key driver of risk, as well as changes in requirements throughout SoS development and operation."
      ◊ Lack of CONOPs & Metrics
         ⇒ "Many times it is unclear what the SoS needs to do in order to fulfill its mission and without the appropriate metrics to evaluate the performance of the UASoS, it is difficult to determine whether the mission is successful or not."
      ◊ Assuring Interoperability
         ⇒ "individual systems within a SoS may have varying levels of maturity and may enter the SoS at different stages of the SoS lifecycle. Ensuring that these systems can still work together and merging newer more advanced technologies with more traditional technologies can present a significant challenge to development and validation of the SoS."
   • Testing and Network Risks
      ◊ State-space
         ⇒ "Unmanageable combinatorial problems can result when a large number of tests need to be performed on a large number of systems, and especially in the DoD, there is need to prioritize tests to ensure the systems meet schedule requirements."
      ◊ Prioritizing test events/points
         ⇒ "The type of test and amount of each type of test to be performed will also be a driver of costs. For example, field tests require considerable resources, labor, and scheduling, and is significantly more costly than a simulated test which can be done in a virtual environment."
      ◊ Coordinating T&E efforts


         ⇒ "Multisite coordination for testing also becomes an issue especially when multiple stakeholders are involved and individual systems are located in many different places."
      ◊ Emergent Properties
         ⇒ "When systems are integrated, it is difficult to predict how the test process needs to adapt to account for emergent properties…"
         ⇒ "T&E contributes significantly to the cost of the system especially given the risks and uncertainties associated with UASoS."
   • Budget planning challenges
      ◊ AS will need more testing than traditional systems
         ⇒ "previous studies on systems engineering cost models have shown that developers are so convinced that T&E is such a small proportion of the total life cycle cost, … However, further analysis of T&E in the SoS environment … are leading experts to re-evaluate these ideas."
      ◊ AS haven't been built before
         ⇒ "The budget, both in terms of cost and effort, is currently determined based on similar projects that have been conducted in the past, coupled with extrapolations to account for the new system under test. However, UASoS do not have a significant history, but are in such high demand that there is the need to understand how much effort is required for testing."

 Recommendations
   • Move to a continuum of testing
      ⇒ "There needs to be a move away from traditional boundary lines placed between developmental testing (DT) and operational testing (OT). Currently, developmental testing entails identifying the technical capabilities and limitations, assessing the safety, identifying and describing technical risk, assessing maturity, and testing under controlled conditions. Operational testing is a fielding activity, measuring the capability of the system to perform in the field. On a SoS level, especially with a UASoS, both these spectrums of testing are required simultaneously and even existing programs currently do joint integration testing emphasizing the need for the involvement of both operational and developmental testing throughout the life cycle of the UASoS."
   • Simulate where possible
      ⇒ "While it is impossible to eliminate all risks through computer simulations; the more failure scenarios that can be predicted and tested in a simulated environment, the less costly it will be during the fielding process, especially in the case of communication failures and loss of equipment."
   • Plan tests by estimating costs and risk through modeling (an illustrative calculation follows this list)
      ⇒ By using a risk based testing approach, we identified the risks that need to be mitigated, and what priorities need to be made in the testing process based on these risks. We will combine these results into a cost model, which we will use to estimate the amount of test effort required for a given level of confidence.
   ◊ Create a common language for AI cost/risk drivers to enable cross-program comparison
      ⇒ "One of the important elements is ensuring that the definitions of the drivers are consistent so they can be rated from similar perspectives. We presented the prioritization of both technical and organizational cost drivers."
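As a purely hypothetical illustration of the risk-weighted parametric estimating this recommendation points toward, the sketch below rolls up test effort across event types. The event types, counts, unit costs, and risk multipliers are invented and are not the authors' model.

# A toy risk-weighted parametric estimate of test effort (all numbers illustrative).
test_events = [
    # (event type, number of events, unit cost in $K, risk multiplier)
    ("simulation run",          400,   0.5, 1.0),
    ("hardware-in-the-loop",     40,  10.0, 1.2),
    ("closed-range field test",  12,  80.0, 1.5),
    ("open-air SoS field test",   3, 400.0, 2.0),
]

total = sum(n * cost * risk for _, n, cost, risk in test_events)
for name, n, cost, risk in test_events:
    print(f"{name:25s} {n:4d} events  ~${n * cost * risk:8.1f}K (risk x{risk})")
print(f"{'Estimated total':25s}              ~${total:8.1f}K")

Higher-risk drivers inflate the estimate for the event types they affect, which is what lets such a model inform how to choose and scope test events under a fixed budget.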

Department of Defense (2019). DoD Digital Modernization Strategy: DoD Information Resource Management Strategic Plan FY19-23.

This document covers the DoD Digital Modernization Strategy, which addresses a range of issues beyond artificial intelligence, including cloud computing, innovation, cybersecurity, and infrastructure modernization; AI (pursued chiefly through the JAIC) is one aspect of the broader strategy. Objective 1 of the strategy's first goal ("Innovate for Competitive Advantage") is to "Establish the Joint Artificial Intelligence Center (JAIC) to Accelerate Adoption and Integration of AI-Enabled Capabilities to Achieve Mission Impact at Scale." AI test and evaluation is not explicitly referenced, but challenges can be inferred from the Strategy Elements for Objective 1 (establishing the JAIC).

 Challenges
   • Incorporate oversight, ethics, and safety
      ◊ "Strategy Element #4: Lead DoD in AI Planning, Policy, Oversight, Ethics, and Safety"
         ⇒ "[The JAIC] will operationalize AI capabilities by overcoming policy, technical, and financial roadblocks that prevent enterprise-level deployment, and by leveraging the capabilities established through DISN modernization (as described in Goal 1, Objective 8)."
   • Assure/certify AI algorithms, data, and models
      ◊ "Strategy Element #6: Assure/Certify the AI Algorithms, Data, and Models Developed for JAIC Implementations"
         ⇒ "[The JAIC] will operationalize AI capabilities by overcoming policy, technical, and financial roadblocks that prevent enterprise-level deployment, and by leveraging the capabilities established through DISN modernization (as described in Goal 1, Objective 8)."


Defense Science Board (2012). The Role of Autonomy in DoD Systems. 2012 DSB Autonomy Study: Office of the Under Secretary of Defense for Acquisition, Technology and Logistics, Washington, DC, 2012.

“The Under Secretary of Defense for Acquisition, Technology and Logistics (USD (AT&L)) should create developmental and operational test and evaluation (T&E) techniques that focus on the unique challenges of autonomy (to include developing operational training techniques that explicitly build trust in autonomous systems).”

 Challenges ◊ Nondeterministic performance and decision making by the T&E community ◊ Highly dynamic, unstructured operating environments and computational tractability ◊ Input/state space size and tests do not provide the needed levels of assurance ⇒ “[T]here is a need for acceptance of nondeterministic performance and decision making by the test and evaluation community. Unmanned systems will operate in highly dynamic, unstructured environments, for which there are not computationally tractable approaches to comprehensively validate performance. Formal methods for finite-state systems based on abstraction and model-based checking do not extend to such systems, probabilistic or statistical tests do not provide the needed levels of assurance, and the set of possible inputs is far too large. Both run-time and quantum [?] verification and validation (V&V) approaches may prove to be viable alternatives. Run-time approaches insert a monitor/checker and simpler verifiable backup controller in the loop to monitor system state during run time and check against acceptable limits, and then switch to a simpler backup controller (verifiable by traditional finite-state methods) if the state exceeds limits.”
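The run-time V&V pattern quoted above can be made concrete with a short sketch. The example below is illustrative only (the state variables, limits, and controller behaviors are invented): a monitor checks the observed state against acceptable limits on every cycle and hands control to a simpler, separately verifiable backup controller whenever the limits are exceeded.

```python
# Illustrative run-time assurance loop: monitor state, switch to a simple,
# verifiable backup controller when limits are exceeded. All limits and
# controller logic are hypothetical placeholders.

SPEED_LIMIT = 10.0      # assumed acceptable bound on speed (m/s)
OFFSET_LIMIT = 2.0      # assumed acceptable bound on lateral offset (m)

def complex_autonomy(state):
    # Stand-in for the learned / non-deterministic controller under test.
    return {"throttle": 0.9, "steer": 0.3}

def backup_controller(state):
    # Simple, bounded behavior that could be verified with finite-state methods:
    # slow down and hold the current heading.
    return {"throttle": 0.0, "steer": 0.0}

def within_limits(state) -> bool:
    return abs(state["speed"]) <= SPEED_LIMIT and abs(state["offset"]) <= OFFSET_LIMIT

def control_step(state):
    # Run-time monitor: only pass through the complex controller's command
    # while the observed state stays inside the acceptable envelope.
    if within_limits(state):
        return complex_autonomy(state)
    return backup_controller(state)

if __name__ == "__main__":
    for state in [{"speed": 5.0, "offset": 0.5}, {"speed": 12.5, "offset": 0.4}]:
        print(state, "->", control_step(state))
```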

 Recommendations ◊ “Creating techniques for coping with the difficulty of defining test cases and expected results for systems that operate in complex environments and do not generate deterministic responses.” ◊ “Measuring trust that an autonomous system will interact with its human supervisor as intended.” ◊ “Developing approaches that make the basis of autonomous system decisions more apparent to its users.” ◊ “Advancing technologies for creating and characterizing realistic operational test environments for autonomous systems.” ◊ “Leveraging the benefits of robust simulation to create meaningful test environments.”

◊ “[E]stablish a research program to create the technologies needed for developmental and operational test and evaluation (T&E) that address the unique challenges of autonomy. Among the topics that this research should address are: ⇒ “Techniques for defining test cases and expected results that overcome the difficulty of enumerating all conditions and non-deterministic responses that autonomy will generate in response to complex environments, ⇒ “Methods and metrics for confirming that an autonomous system will perform or interact with its human supervisor as intended and for measuring the user’s trust in the system, ⇒ “Interfaces that make the basis of autonomous system decisions more apparent to its users, ⇒ “Test environments that include direct and indirect users at all echelons, as appropriate for an intended capability and ⇒ “Robust simulation to create meaningful test environments.” ◊ “[S]tructure autonomous systems acquisition programs to separate the autonomy software from the vehicle platform.”


Defense Science Board (2016). Office of the Under Secretary of Defense for Acquisition, Technology and Logistics, Report of the Defense Science Board Summer Study on Autonomy, Washington DC, 2016.

Recommendations fall into two broad categories: operations-oriented and development-process-oriented. The process-oriented recommendations address challenges in mitigating increasing cyber vulnerabilities and in creating new T&E and M&S paradigms applicable to autonomous systems (ASs). Issues and recommendations in AS T&E include:

 Challenges • Learning Systems ⇒ Dealing with learning systems is more difficult. It necessitates continuous testing, both via M&S and live testing, throughout the life cycle. Records need to be kept of the environment under which the behavior was captured, as well as, if possible, the reasoning the system went through to arrive at that behavior. • Emergent Behavior ⇒ Dealing with “emergent behavior” is also difficult. Conduct research on how to handle it.

 Recommendations • [E]stablish a new T&E paradigm for testing software that learns and adapts ⇒ “The Director of Operational Test and Evaluation (DOT&E), in conjunction with the Office of Developmental Test and Evaluation (DT&E), should establish a new T&E paradigm for testing software that learns and adapts. Considerations include: ⇒ “Opportunities to adopt or adapt commercial best practices in T&E of learning systems ⇒ “Experimentation and development of new methods and means for testing software that learns, such that the “correct answer” changes with context, experience, and new data” • [E]stablish a new paradigm for T&E of autonomous systems that encompasses the entire system lifecycle. ⇒ “The DoD test and evaluation community should establish a new paradigm for T&E of autonomous systems that encompasses the entire system lifecycle. Considerations include: ⇒ “Make extensive use of live and synthetic environments for evaluating and qualifying transition from development to fielded systems ⇒ “Establish standards and guidelines for continuous verification and validation (V&V) for autonomous systems • USD(AT&L) should require the acquisition community to establish and implement a consistent and comprehensive M&S strategy throughout the lifecycle of the system

• Expand the use of M&S to do automated testing of “thousands of test cases,” then do follow-up “real world” testing to ensure that the M&S results match the real-world cases. • Do iterative modeling and testing early (build, model, test, modify); keep the data and models throughout development. Embrace the DevSecOps software practices of the commercial world.

Durst, P. J. (2019). The Reference Autonomous Mobility Model: A framework for predicting autonomous unmanned ground vehicle performance. Dissertation, Department of Computational Engineering, Mississippi State University.

This dissertation presents a framework for assessing autonomous vehicles through the lens of unmanned ground vehicles (UGVs), called the Reference Autonomous Mobility Model (RAMM), in search of a methodology to verify and validate “simulations of complex, intelligent, and autonomous systems”. The RAMM is intended to be a “mission-level” mobility model for autonomous ground vehicles. The model is physics-based, building on previously developed physics-based models for ground systems. By integrating all subsystems into a single physics-based model, the validation and verification (V&V) of autonomous systems as a whole may be possible. This follows a “layered” approach of testing models of individual sensors and the environment, comparing sensor models to the “real world,” and managing differences between the simulation and the environment. Further, the author tests GPS algorithms, lane-detection algorithms, and a convolutional neural network (CNN) using the proposed RAMM and route, while incorporating LIDAR, GPS, and IMU sensor data into a “fuzzy” path-planning simulation.

 Challenges • Practicality of validation & verification of autonomous system simulations ◊ “Lack of a theoretical framework for V&V of simulations for predicting the behaviors of autonomous robotic systems.” ⇒ A variety of frameworks and methods exist, and some attempt to grapple with quantifying the uncertainty in a simulation that autonomous systems could introduce ⇒ Open-ended physics models are more difficult to control and produce outputs that are themselves too open-ended to evaluate ⇒ Trust in autonomous systems is difficult to measure, with “no firmly established formal methodology for obtaining trust in the outputs of these simulations or even for defining what ‘trust’ means.” • V&V of full systems is difficult in separate tests ◊ Each model depends on different rules and assumptions, and multiple models usually work together ◊ Potential for already-validated models to produce invalid results ◊ Current simulation models, for example USARSim, are difficult to validate


 Recommendations • I – Employ “layered” V&V ⇒ Test components in separate models, analyze outputs in their unique context under extreme conditions; V&V the environment model as well ⇒ Check each model’s ability to make judgements about its environment ⇒ The author presents an example using the Virtual Autonomous Navigation Environment (VANE) tool, in which a sign-detection algorithm detects a stop sign in a VANE image and in a similar real-world image. The author compares the detection rates for the two images to validate the algorithm • II – Combine V&V’d models into a single, physics-based model ⇒ Must produce “closed-loop” simulation results that may take thousands of iterations ⇒ Evaluate all components on a “gross level” once combined ⇒ Visual framework on pages 75 and 76 • III – Use previously established frameworks as a guide ⇒ Suggests use of Sargent’s definitions of V&V, Balci’s comparisons of graphical outputs of a model, and extreme condition testing ⇒ The author’s RAMM builds on the “conceptual design” of the NATO Reference Mobility Model (NRMM), a tool built for ground vehicle acquisition but not for autonomous vehicle testing, and the VANE tool, used for testing sensor algorithms • IV – Suggests looking to fields of human behavioral modeling for models to develop autonomous systems models ⇒ Fields such as economics and tools such as fuzzy neural nets and finite state machines can aid in crafting decision spaces
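The first recommendation above, validating a sensor-algorithm model by comparing detection rates on simulated versus real imagery, can be quantified with a simple statistical check. The sketch below is not Durst's procedure; it is a generic two-proportion comparison on invented detection counts, shown only to indicate one way the simulated-versus-real comparison could be scored.

```python
# Generic two-proportion comparison of detection rates (invented counts).
import math

def detection_rate_comparison(hits_sim, n_sim, hits_real, n_real):
    """Two-proportion z-test: are simulated and real detection rates consistent?"""
    p_sim, p_real = hits_sim / n_sim, hits_real / n_real
    p_pool = (hits_sim + hits_real) / (n_sim + n_real)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_sim + 1 / n_real))
    z = (p_sim - p_real) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return p_sim, p_real, z, p_value

# Invented example: stop-sign detections in 200 simulated and 180 real images.
p_sim, p_real, z, p = detection_rate_comparison(hits_sim=176, n_sim=200,
                                                hits_real=153, n_real=180)
print(f"simulated rate={p_sim:.2f}, real rate={p_real:.2f}, z={z:.2f}, p={p:.3f}")
```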


Eaton, C., Chong, E., & Maciejewski, A. (2017). Services-based testing of autonomy (SBTA). The ITEA Journal of Test and Evaluation, 38, 40-48.

The test and evaluation (T&E) of autonomous systems in a way that adequately supports the verification and validation (V&V) process is a significant challenge facing the test community. The ability to quickly and reliably test autonomy is necessary to provide a consistent T&E, V&V (TEVV) capability. A safe, efficient, and cost-effective test capability, regardless of autonomy or sensor capability, is required. Autonomy and sensor capabilities, referred to as services, can be integrated easily into small Unmanned Aircraft Systems (sUAS) of differing capabilities and complexities. An integrated open-source architecture, for both software and hardware, implemented on multiple sUAS of varying capabilities can provide a robust test capability for emerging autonomous behaviors. The inclusion of a run time assurance (RTA) common safety watchdog and a Live-Virtual-Constructive (LVC) capability provides a consistent, robust, and safe test capability/environment. The use of an open software and hardware architecture ensures cross-platform viability. These features will allow test teams to focus on the newly incorporated autonomy and sensor services, not on other ancillary capabilities and systems on the test vehicle. Testing of the services in this manner will enable a common TEVV approach, regardless of final platform integration, while decreasing risk and accelerating the availability of autonomy services. Services-Based Testing of Autonomy (SBTA) provides a cost-effective and focused capability to test autonomous services, whether software, hardware, or both.

 Challenges • Integrating autonomy into existing platforms: ⇒ Integrating autonomy into existing platforms can be expensive and time-consuming. Providing a means to perform early testing of autonomy services is, therefore, critical. • Potential for spoofing sensors ⇒ The intersection between the cyber and autonomy worlds raises yet another source of TEVV difficulty: can the perception sensors feeding the autonomy engine be spoofed (GPS, signature imagery, etc.), and can the autonomy engine understand that the “perceived world view” has been hacked? • Testing complex cyber-physical systems ⇒ Complex cyber-physical systems, such as autonomous UASs, are difficult to test with the traditional methods currently used for standard UASs and manned aircraft.

 Recommendations • Service-Based Testing of Autonomy ⇒ Providing a robust, reliable, and reusable approach to testing autonomy is key; this is why the SBTA method has been developed. This approach will provide a heterogeneous fleet of UASs that are simple and cost-effective to modify and operate. These vehicles will be modified to have an open system architecture that will enable easy reconfiguration and installation of autonomy or other services that will enable testing. The approach is adaptable to different vehicles across a large operational envelope.


Ferreira, S., Faezipour, M., & Corley, H. W. (2013). Defining and addressing the risk of undesirable emergent properties. 2013 IEEE International Systems Conference (SysCon), Orlando, FL, 836-840.

The authors discuss the difference, in their view, between Desirable Emergent Properties (DEPs) and Undesirable Emergent Properties (UEPs) in an autonomous system, and the possibility and process of managing UEPs as systems engineering risks. DEPs “are engineered into human-made systems by understanding and developing systems that meet an identified set of stakeholder requirements. DEPs are wanted by stakeholders,” while UEPs “encompass behaviors or conditions that can cause unwelcome, detrimental or counterproductive functions or circumstances. UEPs can impair the ability of a complex system to meet its stakeholders' desired system objectives.”

 Challenges • Managing emergent properties/UEPs in system designs ◊ Various sources and causes of UEPs ⇒ “EPs can develop as a result of internal system element interfaces and the system's external interfaces with other systems throughout the system's lifecycle from its development through its disposal. EPs can arise in a system due to the environmental conditions in which the system is developed, tested, operated, and supported. The human interface with the system can also introduce EPs.” ⇒ “If the system is used in ways not originally intended, EPs may also result.” ⇒ “EPs may be expected, where the property is deliberately engineered into a system or where there is already awareness that the property will be present given the selection and arrangement of particular components. EPs may also be unexpected, where the property is unforeseen and has not been forecast to occur.” ⇒ “While it may be possible to predict UEPs in systems, in other cases, forecasting all UEPs may not be possible. Many UEPs may be surprises due to the lack of knowledge and pre-existing awareness of patterns that exists with interactions between specific elements of a system, other systems, humans, and environments, especially if the system is new.” ◊ Systems engineering does not often consider UEPs in the development lifecycle ⇒ “Limited attention is normally paid to considering UEPs during the risk management process.” ⇒ “One would expect that the more complex a system and its interfaces with various other systems, humans, and environments, the greater the quantity of

interactions with various elements that could cause a greater potential for emergent properties.”

 Recommendations • I - Create a taxonomy of DEPs and UEPs ⇒ “The taxonomy was updated to use ‘expected’ and ‘unexpected’ as opposed to the use of the words ‘planned’ and ‘unplanned’. This taxonomy is helpful in understanding how to characterize EPs”. ⇒ Separates “Emergent Properties” into “Expected” and “Unexpected”, then each into “Desirable” and “Undesirable”. • II - Observe steps in development process to analyze and plan for risks ⇒ “Risk Identification”, “Risk Analysis”, “Risk Mitigation Planning”, “Risk Mitigation Plan Implementation”, and “Risk Tracking.” • III - View UEPs as systems engineering risks ⇒ “Given the potential for negative consequences, it is imperative to consider UEPs as risks. The potential for UEPs should be considered over the entire system lifecycle if they cannot be eliminated without compromising the DEPs.” ⇒ “The benefit of adding [UEPs as a risk in current frameworks like INCOSE] is that it can act as a trigger to systems engineers that the risks of UEPs may be associated to a system and its interactions with humans, other systems, and environments, especially where these may not have been considered as risks before.” ⇒ “System engineering may include trading off and balancing the expected DEPs and UEPs in a system in order to achieve an optimum solution considering particular architecture choices.” ⇒ “If particular UEPs are expected, a risk manager can prioritize the potential for various UEPs and develop a plan to manage them. They can be added to the system risk checklist if not already there. On the other hand, the possibility of unexpected UEPs requires vigilance on the part of the risk manager and contingent emergency planning when these EPs occur.” ⇒ “In this latter case of "unknown unknowns", it is still possible to identify that risk does exist because the system or modifications to the system may be novel, where it is new development and there is currently a lack of experience and a lack of knowledge of potential UEPs.” • IV - Delineate between predicting and detecting emergent properties ⇒ “…make a distinction between predicting and detecting emergent properties. Prediction, or forecasting, that an emergent property may exist, is performed before an emergent property occurs. Detection is the observation that an emergent property has occurred.” ⇒ “…it should become easier to predict emergent properties as a system evolves over its lifecycle. This is primarily due to the additional information available about the system as well as information about its interfaces with other systems,
humans, and the environments that is defined as development progresses and as the system is physically implemented.” ⇒ “Existing literature defines factors which can be used to assess emergence. Collier and Muller discuss cohesion as an important characteristic related to emergence. Gore and Reynolds present reproducibility of behaviors, predictability, and temporal aspects in an emergence taxonomy. Vinerbi, Bondavalli, and Lollini present sources of emergence to include components, interactions, and context. Ferreira and Tejeda propose additional factors which could be used to predict emergent properties.”
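The taxonomy and risk-management steps above map naturally onto a simple data structure. The sketch below is only an illustration of that mapping (the field names, example entries, and screening threshold are invented): expected/unexpected and desirable/undesirable classifications tag entries in a notional system risk register.

```python
# Hypothetical encoding of the emergent-property (EP) taxonomy as risk-register
# entries, following the expected/unexpected x desirable/undesirable split
# summarized above. All example entries and thresholds are invented.
from enum import Enum
from dataclasses import dataclass

class Expectation(Enum):
    EXPECTED = "expected"
    UNEXPECTED = "unexpected"

class Desirability(Enum):
    DESIRABLE = "DEP"
    UNDESIRABLE = "UEP"

@dataclass
class EmergentPropertyRisk:
    description: str
    expectation: Expectation
    desirability: Desirability
    likelihood: int   # 1..5
    consequence: int  # 1..5

    def needs_mitigation_plan(self) -> bool:
        # Treat any undesirable EP above an (arbitrary) exposure threshold as a
        # tracked systems-engineering risk.
        return (self.desirability is Desirability.UNDESIRABLE
                and self.likelihood * self.consequence >= 6)

register = [
    EmergentPropertyRisk("Swarm converges on efficient search pattern",
                         Expectation.EXPECTED, Desirability.DESIRABLE, 4, 1),
    EmergentPropertyRisk("Oscillatory behavior when two controllers interact",
                         Expectation.UNEXPECTED, Desirability.UNDESIRABLE, 2, 4),
]

for risk in register:
    action = "mitigate" if risk.needs_mitigation_plan() else "track"
    print(f"{risk.description} -> {action}")
```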


Giampapa, J. (2013, October 30). Test and evaluation of autonomous multi-robot systems. Software Engineering Institute, Carnegie Mellon University. Presentation.

This is a presentation focusing on the assurance and test and evaluation of autonomous agents that are considered entities in a cyber-physical system and members of a socio-technical system. The author claims that “cost-effective quantifiable assurance techniques for individual and coordinated robots are possible.” In particular, the author discusses two complementary techniques: probabilistic model checking and reliability analysis. He states that “the preliminary results are encouraging,” and “more research is required to evaluate potential for reuse and shortened assurance processes [and] to account for more coordination phenomena.”

 Challenges • 1. Assuring behavior of autonomous multi-robot systems ◊ Too difficult, time-consuming, labor-intensive ◊ Idiosyncratic to the robot, mission and operating context ◊ Not generalizable, reusable or cost-effective • 2. Evaluation of a property when the state space is large and the state and/or property are stochastic ⇒ Classical Model Checking has the disadvantages of state space explosion and atomistic representation of the state space, and when both the state space and/or property are stochastic, it is difficult to enumerate all states and properties without abstraction. • 3. Understanding Behavioral Performance ◊ Understand the atomistic behavioral performance characteristics ◊ Relate these characteristics to each other ◊ Characterize effects of using multiple robots ◊ Predict and validate

 Recommendations • Probabilistic Model Checking ⇒ Provides ways for expressing the state and property for stochastic systems and properties, and accommodates “rules of combination to reduce [the] state space.” • Behavioral Reliability Analysis ⇒ The technique that best predicts multi-robot coordinated performance ⇒ Analyzing the data allowed [them] to hypothesize root causes for misbehaviors ⇒ Performance expectations were revised once [they] understood how the metrics behave
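To make the probabilistic model checking idea concrete, the toy example below computes one quantitative property, the probability of eventually reaching a failure state, for a small discrete-time Markov chain. The chain, its states, and its transition probabilities are invented; this is a generic sketch of the kind of property such techniques evaluate, not Giampapa's models.

```python
# Toy probabilistic model check: probability of eventually reaching the
# "failed" state in a small discrete-time Markov chain (all numbers invented).
import numpy as np

# Transient states: 0 = nominal, 1 = degraded.
# Absorbing states: "complete" and "failed".
Q = np.array([[0.80, 0.10],    # nominal  -> nominal, degraded
              [0.30, 0.50]])   # degraded -> nominal, degraded
R = np.array([[0.09, 0.01],    # nominal  -> complete, failed
              [0.10, 0.10]])   # degraded -> complete, failed

# Standard absorbing-chain result: B = (I - Q)^-1 R gives, for each transient
# start state, the probability of ending in each absorbing state.
B = np.linalg.solve(np.eye(2) - Q, R)
print(f"P(eventually failed | start nominal)  = {B[0, 1]:.3f}")
print(f"P(eventually failed | start degraded) = {B[1, 1]:.3f}")
```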


Gil, Y. & Selman, B. (2019). A 20-year community roadmap for artificial intelligence research in the US. Computing Community Consortium (CCC) and Association for the Advancement of Artificial Intelligence (AAAI). Released August 6, 2019.

This material does not focus on the challenges of testing AI or autonomy, but rather on the research, development, and cultural hurdles to achieving meaningful and useful AI. However, the authors make two cross-cutting recommendations that apply to the T&E community as well: invest in developing our infrastructure and workforce, as neither is yet sufficient for the challenges of AI.

 Challenges • Task Relevance/Fidelity ⇒ “AI testbeds must strive to balance tractability and real-world relevance. Many researchers choose to study simplified tasks in closed domains, rather than open-ended real-world problems, because toy tasks are more tractable for today’s methods.” • Lack of Research and Testing Infrastructure ⇒ “There are no existing examples of mission-centered AI laboratories. Outside of AI, there are many such experimental laboratories in other sciences that could serve as models. The SLAC National Accelerator Laboratory, founded in 1962 as the Stanford Linear Accelerator Center, has led to four Nobel-winning results in particle physics to date.”

 Recommendations • Develop testbeds with care ⇒ “In order to strike this balance between research tractability and real-world relevance, testbeds need to be carefully and iteratively crafted in a tight feedback loop between testbed designers and AI researchers.” • Invest in testbeds and research centers ⇒ “Each MAIL would fund AI research in the range of $100M/year for several decades, with a separate budget for operations and support of the center. This would be a reasonable period of time to see significant returns on investment and transformative research in the target areas. With this level of funding, a MAIL could support an ecosystem of roughly 50 permanent AI researchers, 50 visitors from the National AI Research Centers at any given time, 100-200 AI engineers, and 100 domain experts and staff focused on supporting AI research. MAILs should be run by people with substantial AI experience and credentials in order to bring together the research community and ensure that research quality remains a priority.” • “I — Create and Operate a National AI Infrastructure to serve academia, industry, and government through four interlocking capabilities:

◊ “Open AI platforms and resources: ⇒ “a vast interlinked distributed collection of “AI-ready” resources (such as curated high quality datasets, software, knowledge repositories, testbeds for personal assistants and robotics environments) contributed by and available to the academic research community, as well as to industry and government.” ◊ “Sustained community-driven AI challenges: ⇒ “organized sequences of challenges that build on one another, posed by AI and domain experts to drive research in key areas, building upon—and adding to— the shared resources in the Open AI Platforms and Facilities.” ◊ “National AI Research Centers: ⇒ “multi-university centers with affiliated institutions, focused on pivotal areas of long-term AI research (e.g., integrated intelligence, trust, and responsibility), with decade-long funding to support on the order of 100 faculty, 200 AI engineers, 500 students, and necessary computing infrastructure. These centers would offer rich training for students at all levels. Visiting fellows from academia, industry, and government will enable cross-cutting research and technology transition.” ◊ “Mission-Driven AI Laboratories: ⇒ “living laboratories for AI development in targeted areas of great potential for societal impact. These would be “AI-ready” facilities, designed to allow AI researchers to access unique data and expertise, such as AI-ready hospitals, AI- ready homes, or AI-ready schools. They would work closely with the National AI Research Centers to provide requirements, facilitate applied research, and transition research results. These laboratories would be crucial for R&D, dissemination, and workforce training. They would have decade-long funding to support on the order of 50 permanent AI researchers, 50 visitors from AI Research Centers, 100-200 AI engineers and technicians, and 100 domain experts and staff.” • “II — Re-conceptualize and Train an All-Encompassing AI Workforce, building upon the National AI Infrastructure listed above to: ◊ “Develop AI Curricula at All Levels: ⇒ “guidelines should be developed for curricula that encourage early and ongoing interest in and understanding of AI, beginning in K-12 and extending through graduate courses and professional programs.” ◊ “Create Recruitment and Retention Programs for Advanced AI Degrees: ⇒ “including grants for talented students to obtain advanced graduate degrees, retention programs for doctoral-level researchers, and additional resources to support and enfranchise AI teaching faculty.” ◊ “Engage Underrepresented and Underprivileged Groups: ⇒ “programs to bring the best talent into the AI research effort.” ◊ “Incentivize Emerging Interdisciplinary AI Areas:

⇒ “initiatives to encourage students and the research community to work in interdisciplinary AI studies—e.g., AI safety engineering, as well as analysis of the impact of AI on society—will ensure a workforce and a research ecosystem that understands the full context for AI solutions.” ◊ “Highlight AI Ethics and Policy: ⇒ “including the importance of the area of AI ethics and policy, and the imperative of incorporating ethics and related responsibility principles as central elements in the design and operation of AI systems.” ◊ “Address AI and the Future of Work: ⇒ “these challenges are at the intersection of AI with other disciplines such as economics, public policy, and education. It is important to teach students how to think through the ethical and social implications of their work.” ◊ “Train Highly Skilled AI Engineers and Technicians: ⇒ “support and build upon the National AI Infrastructure to grow the AI pipeline through community colleges, workforce retraining programs, certificate programs, and online degrees.”


Goerger, S. (2004). Validating human behavioral models for combat simulations using techniques for the evaluation of human performance. Technical Report. Naval Postgraduate School, MOVES Institute. Monterey, CA.

“Prior to their use in simulations and analytical studies, DoD models are required to undergo the verification, validation, and accreditation (VV&A) process in an attempt to establish an acceptable level of credibility. In general, the human behavioral model validation process, as outlined by the Defense Modeling and Simulation Office (DMSO), is not extendable to meet requirements for validating the varied and complex behavioral models in use or under development for DoD simulations. This paper reviews several issues with validating human behavior representation (HBR) and identifies potential practices for enhancing the validation process for current and future human behavioral models for use in or application to combat simulations.”

 Challenges • 1. Complexity of Human Behavioral Models ⇒ “With physics based models, there are established procedures for performing VV&A that allow developers and users to understand the strengths and limitations of a model. For cognitive models, the procedures are not as well established and are often limited in their execution and in the information they provide. Understanding the human thought and decision making processes is complex and evolving.” • 2. DMSO Identified Validation Issues: ◊ Large number of possible actions ⇒ “The first is that for even simple human behaviors, the set of possible actions is normally very large. This makes it difficult to ensure examination of all viable solutions.” ◊ Non-linearity of the constrained space of consideration ⇒ “The nonlinearity of the space prevents a simple causal relationship to be drawn between situational parameters and resulting actions.” ◊ Stochastic features in models ⇒ “This “unpredictable” characteristic, unless it can be forced to be deterministic, often makes repeatability impossible for a model therefore making model validation more difficult, and frequently impossible.” ◊ Chaotic behavior ⇒ “Chaotic behavior exhibited by behavior models that are sensitive to initial and boundary conditions. Models with such issues are limited to the breadth of their validation and to the set of scenarios where they exhibit stable behavior.” • 3. Referent bias ⇒ “Because of the vast spectrum of potential situations and human responses, the identification and collection of referent are often limited.”

⇒ “The validation process is inconsistently applied because it is performed by multiple V&V agencies with non-standard criteria or non-uniform referent.” ⇒ “In essence, one must find results from other valid HBR models or build and validate another cognitive model to provide referent for validation of a new cognitive model. This dependence on other models makes validation using psychological and physiological correspondences tenuous at best.” • 4. Model Representation ⇒ “Problems occur when a model is fed unique/new inputs for which real world outputs have not been recorded. In these situations, it is not clear if the model’s results adequately represent probable or possible actions.” • 5. Use of Subject Matter Experts ⇒ “The use of SMEs to evaluate the results of a simulation is analogous to the use of introspection.” ⇒ “According to a meeting of validation experts at Foundations ’02, there are at least three major issues with the use of SME: perspective, performance, and perception.” ◊ Perspective ◊ Performance ◊ Perception • 6. Limitations with face validation of overt behaviors ⇒ “using results based, overt behavior validation of HBR systems often fails to capture the flexibility of the model. This method of validation also falls short of covering the dynamic problem space in which such a model could be asked to operate.” • 7. Cost ⇒ “validation is routinely left to the end of the model development process and limited to the remaining funds and time available.”

 Recommendations • Improve Quality of Subject Matter Experts ◊ “A set of standards for identifying and accrediting SMEs” ⇒ “Selecting and certifying SMEs would ensure a minimum set of standards for SMEs, provide validation agents with a pool of potential SMEs, and increase the credibility of SMEs.” ◊ “Training SMEs to help provide them with a set of skills to help them focus their validation efforts” ⇒ “Along with certification is the requirement for a training program to ensure potential SMEs can gain the necessary knowledge of models and simulations and the validation process so they can prepare to complete a certification process. … To help limit the bias of SMEs, they should 1) be familiar with the validation process and different validation techniques, 2) have at least a basic understanding of the different types of simulations and their purposes, and 3)

be exposed to different types of data displays to help them prepare for the potential systems to which they could be exposed and help reduce misconceptions of simulation capabilities and intent.” • Cognitive Task Analysis ⇒ “Cognitive Task Analysis (CTA) is an extensive/detailed look at tasks and subtasks performed by a person to achieve a goal. … Such an analysis could be used as bases for collecting the referent used for the development and validation of HBR.” • Human Performance Evaluation ⇒ “Based on the model representation used and the level of validation one attempts to accomplish, HBR models processes and results could be categorized and evaluated based on one of three domains: psychomotor, cognitive, and affective. Within these categories, there are levels of complexity that can be discovered and evaluated based on the types of actions and responses a model portrays.” ⇒ “Using CTA and human performance evaluation techniques would help model developers collect referent and validation agents develop questions to focus SME efforts.”
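One of the validation difficulties noted above is that stochastic behavioral models resist simple repeatability checks. The generic sketch below (the stand-in model, referent data, and acceptance criterion are all invented) shows one common workaround: fix the random seed so replications are reproducible, generate many runs, and compare the distribution of a model output against referent observations with a standard two-sample test.

```python
# Generic sketch: seeded replications of a stochastic behavior model compared
# against referent data with a two-sample KS test. The model, referent, and
# the 0.05 criterion are invented placeholders.
import numpy as np
from scipy.stats import ks_2samp

def behavior_model(rng):
    # Stand-in stochastic human-behavior model: time (s) to react to a cue.
    return rng.lognormal(mean=0.5, sigma=0.3)

rng = np.random.default_rng(seed=42)          # fixed seed -> repeatable runs
model_samples = np.array([behavior_model(rng) for _ in range(500)])

# Referent: observed reaction times (synthesized here for the example).
referent = np.random.default_rng(seed=7).lognormal(0.55, 0.3, size=120)

stat, p_value = ks_2samp(model_samples, referent)
print(f"KS statistic={stat:.3f}, p={p_value:.3f}")
print("consistent with referent" if p_value > 0.05 else "distributions differ")
```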


Goodfellow, I., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. ICLR.

While Goodfellow et al. (2015) is not a paper directly about T&E, the techniques described will be critical for providing assurance for sub-symbolic-AI systems, and the test recommendations are implicit. The authors describe both a technique for more efficiently generating adversarial examples, and use this to better explore the underlying nature of adversarial perturbations. They recommend that this technique be used to train networks until improvement against adversarial examples asymptotes—not just until real task performance levels off. The implicit test recommendation is that adversarial and real examples should be provided to the system to test its robustness against these techniques.

 Challenges • Sub-symbolic machine learning systems are vulnerable to adversarial disruption ⇒ “These results suggest that classifiers based on modern machine learning techniques, even those that obtain excellent performance on the test set, are not learning the true underlying concepts that determine the correct output label. Instead, these algorithms have built a Potemkin village that works well on naturally occurring data, but is exposed as a fake when one visits points in space that do not have high probability in the data distribution.” ⇒ “…adversarial perturbations generalize across different clean examples.”

 Recommendations • Deep nets should be trained and tested against adversarial examples ⇒ “Linear models lack the capacity to resist adversarial perturbation; only structures with a hidden layer (where the universal approximator theorem applies) should be trained to resist adversarial perturbation.”
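The efficient generation technique described in the paper is the fast gradient sign method (FGSM), which perturbs an input in the direction of the sign of the loss gradient. A minimal PyTorch sketch is shown below; the model, data, and epsilon value are placeholders, and in practice such examples would be folded into both training batches and test sets, per the recommendation above.

```python
# Minimal FGSM sketch (model, inputs, labels, and epsilon are placeholders).
import torch
import torch.nn.functional as F

def fgsm_examples(model, x, y, epsilon=0.1):
    """Return adversarially perturbed copies of x: x + eps * sign(grad_x loss)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()

if __name__ == "__main__":
    # Toy stand-in classifier and data, just to make the sketch runnable.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
    x = torch.rand(8, 1, 28, 28)
    y = torch.randint(0, 10, (8,))
    x_adv = fgsm_examples(model, x, y)
    clean_acc = (model(x).argmax(1) == y).float().mean().item()
    adv_acc = (model(x_adv).argmax(1) == y).float().mean().item()
    print(f"accuracy on clean={clean_acc:.2f}, on adversarial={adv_acc:.2f}")
```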


Greer, K. (2013). A metric for modeling and measuring complex behavioral systems. IOSR Journal of Engineering, 3(11), 19-28.

This paper proposes a method for providing what is essentially a first-pass evaluation of multi-agent stigmergic systems—systems that are individually simple but collectively can solve complex problems, e.g., swarms. The author proposes using a series of equations and behavioral scripts to estimate—prior to simulation—whether a multi-agent system will produce desired behavior. The proposal calls for manually crafting the behavioral script and assuming individual agent capabilities. These are activities already done for agent-based simulation (ABS), which is a common recommendation for space coverage prior to higher-fidelity multi-agent simulation (MAS). The gap this method fills is providing a rough estimate of an ABS prior to expending computational resources on executing the simulation.

 Challenges • Complex environments ⇒ “…an autonomous environment, where the agents can perform a number of different, but relatively simple acts, without the help of a guiding system. In this environment it is difficult to control or fully predict the outcome of the agent activities…” • Achieving a sufficient level of modeling fidelity ⇒ “Co-adaptation between agents and environments can also give rise to emergent complexity” ⇒ “The problem is if a system … contains a large number of agents … it would be helpful to know … if the system will in fact be able to carry out the required tasks. This would save time modelling the problem, if the agents are discovered to be too simple and also with evaluating the result of any simulation.” • Assigning credit or blame in a multi-agent system of systems ⇒ “If something goes wrong with the process, then it might be difficult to identify the source of the problem, is it agent1 and behavior x or agent2 and behavior y?”

 Recommendations • Use equations and behavioral scripts to estimate agent success ⇒ “…this metric will help give an initial indication as to how suitable the agents would be for solving the problem. The system is modelled as a script, or behavioral ontology with a number of variables to represent each of the behavior attributes. The set of equations can be used both for modeling and as part of the simulation evaluation.”


Gunning, D. (2017, November). DARPA XAI Program Update.

This briefing is used as a stand-in for multiple program updates given on the XAI program since 2017. Relevant to T&E, the methods to measure explanation effectiveness and discover models from black boxes are not well-developed, and the XAI program is setting out to create them. However, while progress is being made, researchers are reporting that by using model induction, they are discovering that black boxes have often developed idiosyncratic/overfit/ungeneralizable solutions.

 Challenges • Opaque Decision Making ⇒ “Explainable AI will be essential if users are to understand, appropriately trust, and effectively manage this incoming generation of artificially intelligent partners.”

 Recommendations • Model Induction ⇒ “[Develop] techniques to infer an explainable model from any model as a black box.” • Measure Explanation Effectiveness ◊ Borrow from psychology ⇒ Explanation Framework that involves tracking decisions based on explanation feedback to the user. ⇒ Categorization of measures include user satisfaction, mental models, task performance, trust assessment, and correct-ability.
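One widely used instance of the model-induction idea, fitting a simpler explainable model to a black box's input/output behavior, can be sketched in a few lines. The example below is a generic global-surrogate illustration with invented data and models; it is not the specific technique developed under the XAI program.

```python
# Generic "model induction" sketch: fit an interpretable surrogate (a shallow
# decision tree) to a black-box model's predictions. Data and models invented.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # hidden "true" rule

black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Induce an explainable model from the black box's *predictions*, not the labels.
surrogate = DecisionTreeClassifier(max_depth=2, random_state=0)
surrogate.fit(X, black_box.predict(X))

fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity to black box: {fidelity:.2f}")
print(export_text(surrogate, feature_names=[f"x{i}" for i in range(4)]))
```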


Harikumar, J., and Chan, P. (2019). Developing knowledge and understanding for autonomous systems for analysis and assessment events and campaigns. CCDC Data & Analysis Center, Army Research Lab.

The authors discuss “analysis and assessment” (A&A) as part of the T&E of autonomous systems. The authors present challenges that autonomous systems pose for analysis and assessment and pose a list of 13 questions to aid in the A&A of autonomous systems. Within the definition of autonomy used for this article, systems are differentiated as automated systems, autonomous systems, unmanned systems, adaptive systems, and intelligent systems. Challenges in this area are classified as direct, indirect, distributed across multiple systems based on hierarchy and organization, and human-autonomy. A method of viewing autonomous systems through the same three basic subsystems for testing is proposed (sensors, logic circuits/AI, and actuators), suggesting testing each separately before testing the whole system. Finally, the authors present exemplar measures and metrics to analyze performance while reviewing previous efforts at building frameworks for measures and metrics in autonomous systems.

 Challenges • Direct challenges ◊ Emergent Behavior and Recovery ⇒ “If the autonomous system exhibits emergent behavior that is within ‘normal’ acceptable operational range for the given mission, adapts to the ‘right’ (local, global) parameter value sets to complete mission goals, has an acceptable self-determination of normal state, and can recover to normal state within a reasonable amount of time.” • Indirect Challenges ◊ “Autonomous system use in new or scaled missions ◊ Autonomous system integration with other homogeneous systems in larger missions.” • Distributed Challenges, in a cluster of multiple autonomous systems ◊ “’Social’ adaptation ◊ Hierarchical behavior ◊ ‘Friend or foe’ determination ◊ Detection of abnormal behavior of other elements” • Human-Autonomy Challenges ◊ Trust and Evaluation of Trust ◊ Human Response to autonomous system behavior ◊ Complexity ⇒ These challenges involve, “human trust of the system decision making, human response to the autonomous system behavior, and challenges with evaluation of this trust,” and the increasing complexity of these challenges with multiple, varied systems.


 Recommendations • I – Answer questions about the framework to be used for A&A and the environment/uncertainty within which an autonomous system will operate ◊ Questions listed serve to guide the development of autonomous systems A&A, for example: ⇒ “Q3: How does one quantify the success of decision making?” ⇒ “Q8: How do we quantify the level of distraction to the Warfighter using the autonomous system?” ⇒ “Q9: What is the differential advantage in using an autonomous system (comparative analysis of survivability benefits gained about new vulnerabilities introduced by the autonomous system)?” ⇒ “Q13: What is the expected frequency and magnitude of abnormal, destabilizing, or ‘outlier’ events?” • II – Separate subsystems and test them individually before a combined test ◊ Separate systems into sensors, logic/AI, actuators ⇒ Individual bench testing of sensors; seeing their adequacy for an autonomous system’s “mission needs.” ⇒ “Ontological testing” for “discrete” systems, and “continuous systems” during training. ⇒ Bench testing of actuators. ◊ Test each system’s inputs into each other • III – Develop measures and metrics of evaluation to address aforementioned challenges ◊ Find measures to analyze a system’s change in environment ⇒ E.g., “Number of times the system changed its behavior because of a change in environment”, “Number of times the system behavior change was restricted by outside control.” ◊ Find measures of “Autonomous System Intelligence” ⇒ E.g., “Could the system perform its tasks in unstructured environment (Yes/No)”, “Number of steps the autonomous system took to complete the task.” ◊ Find measures of system capability ⇒ E.g., “Analyze the choice set made by the autonomous system prior to the autonomous system response.” ◊ Other examples, e.g. “Learning rate” with a metric being the “Weighted sum of learning from saved historic data and temporal update rate”. • IV – Recommends building on previous autonomous system framework development efforts ◊ Examples include Autonomy Levels for Unmanned Systems (ALFUS) and Performance Measures Framework for Unmanned Systems (PerMFUS).
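Several of the exemplar measures above (counts of behavior changes, counts of externally restricted changes, or a weighted learning-rate roll-up) reduce to simple aggregations over a run log. The sketch below uses an invented log format and invented weights purely to show how such measures might be computed; it is not the authors' framework.

```python
# Hypothetical computation of a few exemplar autonomy measures from a run log.
# The log schema, events, and weights are invented for illustration.
events = [
    {"t": 1, "type": "behavior_change", "cause": "environment"},
    {"t": 3, "type": "behavior_change", "cause": "operator_override"},
    {"t": 5, "type": "task_step"},
    {"t": 6, "type": "behavior_change", "cause": "environment"},
    {"t": 7, "type": "task_complete", "steps": 12},
]

env_changes = sum(1 for e in events
                  if e["type"] == "behavior_change" and e["cause"] == "environment")
restricted = sum(1 for e in events
                 if e["type"] == "behavior_change" and e["cause"] == "operator_override")
steps_to_complete = next(e["steps"] for e in events if e["type"] == "task_complete")

# Invented "learning rate" style roll-up: weighted sum of two notional terms.
historic_learning, temporal_update = 0.6, 0.3      # placeholder component scores
learning_rate = 0.7 * historic_learning + 0.3 * temporal_update

print(f"behavior changes due to environment: {env_changes}")
print(f"behavior changes restricted by outside control: {restricted}")
print(f"steps to complete task: {steps_to_complete}")
print(f"weighted learning-rate metric: {learning_rate:.2f}")
```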


Haugh, B., Sparrow, D., & Tate, D. (2018). The status of Test, Evaluation, Verification, and Validation (TEV&V) of autonomous systems, P-9292. Institute for Defense Analyses, Alexandria, VA.

“This memorandum enumerates TEV&V challenges that have been identified by the commercial, academic, and government autonomy research communities; describes the focus of current academic research programs; and highlights areas where current research leaves unaddressed gaps in capability.”

 Challenges • Instrumenting machine thinking: ⇒ Providing adequate internal “instrumentation” to identify where errors are: at the conventional locations in cyber-physical systems (sensor, software/hardware, effector), or at more subtle functional areas of perception/cognition/effectuation. • Linking system performance to autonomous behaviors: ⇒ Understanding how component functions (or failures) contribute to overall behavior (or misbehavior). • Comparing AI models to reality: ⇒ Understanding how closely the AS’s internal “mental model(s)” correspond with reality, for model-based ASs. • CONOPS and training as design features: ⇒ Recognizing that ASs working with other ASs or humans will, because of their flexibility and potential initiative, likely serve to drive team behaviors and overall CONOPS, in manners unforeseen at the time of design. “This will pose organizational and personnel challenges to T&E, as well as methodological challenges.” • Human trust: ⇒ Ensuring appropriate multi-dimensional levels of human trust of the system [and by the system of the human]. • Elevated safety concerns and asymmetric hazard: ⇒ Because the cost of AS errors is potentially high (unlike Go-playing algorithms), and because the potential for errors will increase with system complexity and degrees of decisional freedom not normally given to conventional military systems, the expected loss (the product of likelihood and cost) is elevated, and extra care will need to be taken during DT and OT. • Exploitable vulnerabilities: ⇒ ASs may have expanded “attack surfaces” not normally present in conventional cyber-physical systems, for example perceptual misapprehensions caused by “poisoned” training data, or simple but unanticipated CCD activities taken by a target. Active counter-AI activities are currently underway to identify such vulnerabilities.

 Recommendations • Use formal methods for AS verification: ⇒ Formal methods have been used for expensive one-time development efforts in fairly predictable environments (e.g., space probes; see T. Menzies and C Pecheur, 2004), but it is unlikely they will be scalable to the potential environment (natural and adversarial) a military system is likely to encounter (“state space explosion”). “Use these methods where possible.” • Cognitive instrumentation: ◊ Software Instrumentation ⇒ Design the architecture and the code to provide “software instrumentation” of the AS functions and operations. This will not only serve the purpose of providing the AS with SA of its own functioning and system health, but it can provide a basis for explaining the basis of its perceptions/decisions/actions to teammates (both machines and humans). • Explainable AI: ◊ Ability to explain perceptions/decisions/actions ⇒ Provide an ability to explain perceptions/decisions/actions to developers, testers, and operators, to support diagnosis/debugging, effective deployment options (CONOPS development), and determining limits of dependable performance within a multi-dimensional envelope. ◊ Trust:

⇒ Support appropriate levels of trust by developers/testers/operators, via explaining an AS’s behaviors. • Adversarial testing: ⇒ Red team the subject AS using humans and/or other ASs, to look for “corner cases” where AS behaviors break down. Use to augment conventional DOE methods. • Run-time monitoring: ⇒ Incorporate a simpler run-time supervisor of AS behaviors so that when unacceptable behavior is identified, the AS can switch to a “back-up” mode which is more deterministic, and less likely to violate behavioral boundaries. Of course, building the monitor may entail its own issues, with regard to continuous monitoring, detection of impending or ongoing unacceptable behavior, and graceful switchover or takeover of ongoing AS functions. • Resources and Tool Development: ◊ Data: ⇒ Because the most spectacular recent advances in machine perception have been with deep learning over massive datasets, we need to pay attention to the veracity of those datasets. Data V&V is called for.
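The final point, that training data needs its own V&V, can be illustrated with a few routine checks. The sketch below is generic and uses an invented dataset: it screens for missing values, duplicate records, label imbalance, and values outside an assumed sensor range, the kinds of defects (accidental or adversarial) that data verification would be expected to flag before training.

```python
# Generic training-data V&V sketch: basic integrity checks on an invented dataset.
import pandas as pd

df = pd.DataFrame({
    "range_m":  [120.0, 95.5, None, 4000.0, 101.2, 95.5],
    "bearing":  [10.1, 35.2, 22.0, 18.4, 359.0, 35.2],
    "label":    ["vehicle", "vehicle", "person", "vehicle", "vehicle", "vehicle"],
})

report = {
    "rows": len(df),
    "missing_values": int(df.isna().sum().sum()),
    "duplicate_rows": int(df.duplicated().sum()),
    "label_counts": df["label"].value_counts().to_dict(),
    # Flag values outside an assumed valid sensor range of 0-1000 m (invented bound).
    "range_out_of_bounds": int((~df["range_m"].dropna().between(0, 1000)).sum()),
}

for key, value in report.items():
    print(f"{key}: {value}")
```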


Hernandez-Orallo, J. (2017). Evaluation in artificial intelligence: From task-oriented to ability-oriented measurement. Artificial Intelligence Review, 48, 397-447.

This article presents a lengthy review of evaluation methods for AI. Broadly, the author breaks down methods into two categories: task-driven and ability-driven evaluation. Task-driven evaluation has been the dominant approach in the field for decades, with three main types: human discrimination, benchmarking, and peer competition/relative performance. While useful so far, the author argues these are hitting their limit, as they apply best to constrained, well-defined tasks. As we move toward systems that must perform many different and more complex tasks, trying to evaluate performance on all tasks across the varied conditions they will encounter quickly becomes intractable. Instead, he advocates for ability-oriented evaluation: try to quantify general or holistic skill capability rather than specific task performance. The author acknowledges this approach is much less mature, but identifies it as the necessary solution to the challenge of massively growing problem spaces.

 Challenges • Conflation of evaluating weak vs. soft AI ⇒ “McCarthy’s pristine definition of AI sets this unambiguously: “[AI is] the science and engineering of making intelligent machines” (McCarthy 2007). As a consequence, AI evaluation should focus on evaluating the intelligence of the artefacts it builds. However, as we will further discuss below, ‘intelligence tests’ (of whatever kind) are not the everyday evaluation approach for AI. The explanation for this is that most AI research is better identified by Minsky’s more pragmatic definition: “[AI is] the science of making machines capable of performing tasks that would require intelligence if done by [humans]” (Minsky (1968), p. v). As a result, AI evaluation focuses on checking whether machines do these tasks well.” ⇒ “The intention of stressing this duality is that this should necessarily pervade the evaluation procedures in AI. Specialised AI systems should require a task- oriented evaluation, while general-purpose AI systems (also known as AGI systems, from the term ‘artificial general intelligence’) should require an ability-oriented evaluation. In practice, however, we see that some general- purpose AI systems are evaluated with a narrow set of tasks.” ⇒ “[For task-based evaluation], we will identify several problems, most of them derived from the confusion of a task definition with its evaluation.” • Inconsistency in evaluation approach ⇒ “However, there is still a great deal of disaggregation, many ad-hoc procedures, bad habits and loopholes about what is being measured and how it is being measured.”

⇒ “evaluation efforts are extremely scattered across AI disciplines, and it is quite common to find duplicated efforts, for which solutions have to be found again and again. Apparently, there is limited exchange of experiences between them, just because the domains are different.” • Task-based Evaluation of AI Challenges ◊ Selecting Test Cases ⇒ “An appropriate sampling procedure from the class of problems defining the task is not always easy.” ◊ Non-determinism ⇒ “As AI systems become more sophisticated, white-box assessment becomes more difficult, if not impossible, because the unpredictability of complex systems. Many AI systems incorporate many different techniques and have stochastic behaviours.” ◊ Evaluation by Human Discrimination ⇒ “If done properly, it may take too much time.” ◊ Evaluation through problem benchmarks ⇒ “However, as the public access to the benchmark before the evaluation can lead to “evaluation overfitting”, there have also been some occasional evaluations in AI following a “secret generalized methodology.” ◊ Evaluation through peer confrontation ⇒ “…we evaluate a system by letting it compete against another system. … The results of each match (possibly repeated with the same peer) may serve as an estimation of which of the two systems is best (and how much). Nonetheless, the main problem about this approach is that the results are relative to the opponents.” • Ability-Based Evaluation of AI Challenges ◊ Inapplicability of current popular methods ⇒ “The three types of evaluation seen for task-oriented evaluation are not directly applicable, as we now do not want to evaluate systems for what they do but for what they are able to (learn to) do.”

 Recommendations • Switch to behavioral evaluation from algorithmic evaluation. ⇒ “…we will argue that white-box evaluation (by algorithm inspection) is becoming less predominant in AI, and we will devote the rest of the paper to blackbox evaluation (by behaviour). We will distinguish three types of behavioural evaluation: by human discrimination (performing a comparison against or by humans), problem benchmarks (a repository or generator of problems) and by peer confrontation (1-vs-1 or multi-agent ‘matches’).” • Select representative metrics and test cases ◊ Use large and/or unknown metrics to prevent overfitting

⇒ “To avoid or reduce this problem, it is much better if M is very large or infinite, or at least the problems are not disclosed until evaluation time” ◊ Use problem generators for where possible ⇒ “Generators can be based on the use of some prototypes with parameter variations or distortions. These prototypes can be “based on reality”, so that the generator “takes as input a real domain, analyses it automatically and generates deformations […] that follow certain high-level characteristics” ⇒ “However, it is not always easy to generate a large M of realistic problems (e.g., in a car driving domain).” ◊ Diversely and randomly sample from possible test cases ⇒ “Random sampling using p seems to be a more reasonable alternative. … The idea is to sample in such a way that the diversity of the selection is increased.” ⇒ Although there are many ways of obtaining a ‘diverse’ sample, we just highlight two main approaches that can be useful for AI evaluation … Information-driven sampling: assume that we have a similarity function sim(μ1,μ2), which indicates how similar (or correlated) exercises μ1 and μ2 in M are. In this case, we need to sample on M such that the accumulated mass on p is high and that diversity is also high. … Difficulty-driven sampling … The idea to optimize the evaluation is to choose a range of difficulties for which the evaluation results may be informative.” ⇒ “Item response theory (IRT) (Embretson and Reise 2000) estimates mathematical models to infer the associated probability and informativeness estimations for each item.” • Increase Cooperation/Centralize Knowledge Sharing ⇒ “It is then very useful to look at some organisations that can serve to centralise and exchange insights and lessons-learnt across several domains, apart from identifying new needs for benchmarks and competitions.” • Overall Recs ⇒ “Given the caveats mentioned above about AI evaluation, we now enumerate a series of generic guidelines for AI practitioners willing to create or improve a benchmark or competition: ⇒ “The definition of Ω, the set of possible systems that can be evaluated (or that can be opponents in peer confrontation evaluation), must be clarified from the beginning, as well as whether the AI systems are fully autonomous or require the integration and fine tuning of AI researchers. If humans are considered, the way in which they are admitted and how they are instructed must be defined. The more general Ω is the less we can assume about the evaluation process. If Ω is heterogeneous (e.g., a universal test), different interfaces must be considered. ⇒ “The definition of M, the set of possible tasks, and its associated distribution p configure what we are measuring. This can be built from a set of problems or using a generator. This pair _M, p_ has to be representative of a task (in task-

oriented evaluation) or an ability (in ability-oriented evaluation). If it is a peer confrontation evaluation, M will be enlarged with as many combinations between game (environment) and agents in Ω as are possible. The distribution p will be updated accordingly. ⇒ “The definition of R and its aggregation Φ must ensure that the values R(μ) for all μ ∈ M are going to be commensurate and that the aggregation is bounded. An analysis about expected measurement error is useful at this point. The robustness of R depending on the length of time left for each episode will indicate whether repetitions are needed to reduce the measurement error given by R(μ). ⇒ “As much as possible, the similarity between tasks or a set of features describing them should be identified. An intrinsic difficulty function (even if approximate) is always very useful. Showing the distribution of difficulty for M can be highly informative. If difficulty is available, item response curves could be prepared. ⇒ “The sampling method must be as efficient as possible, by using, e.g., an information driven sampling or a range of difficulties if we have a non-adaptive evaluation. For the peer-confrontation evaluation, the arrangement of matches can be designed beforehand if the evaluation is not adaptive. Similarly, the procedure for an adaptive evaluation must also be carefully designed to ensure measurement robustness. Simulations can be useful to estimate this. ⇒ “Information about how the evaluation is performed (including R, Φ and some illustrative problems) can be disclosed to the systems that are being evaluated (or to their designers). However, Ω, M and p should not be disclosed. If possible, the problems should not be disclosed after the evaluation either, as keeping them secret makes it possible to compare with the same problems for different subjects or at different times (e.g., we can evaluate progress of a system or a discipline during a period). ⇒ “After the evaluation, results must be analysed beyond the mere calculation of the aggregated results. Item response functions and agent response functions (Hernández-Orallo et al. 2014) can be constructed empirically from the results and compared with the theoretical functions or any other information about Ω and M. Discrepancies or anomalies may suggest that the evaluation scheme has to be revised. Results of the evaluation must become public at the highest possible detail, so they can be analysed and compared by other researchers and participants (following, e.g., the notion of ‘experiment database’ (Vanschoren et al. 2012), such as in the machine learning community).
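Two of the sampling ideas above, difficulty-driven sampling and item response theory (IRT), are straightforward to illustrate. The sketch below is generic (the item pool, difficulties, and ability value are invented): it uses a one-parameter logistic item response function, P(correct | theta, b) = 1 / (1 + exp(-(theta - b))), and draws a test set from a chosen difficulty band rather than uniformly at random.

```python
# Illustrative difficulty-driven sampling with a 1-parameter logistic IRT model.
# The item pool, difficulties, and target difficulty band are invented.
import numpy as np

rng = np.random.default_rng(0)
item_difficulty = rng.normal(loc=0.0, scale=1.5, size=200)   # difficulty b per item

def p_correct(theta, b):
    """1PL (Rasch-style) item response function."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

# Difficulty-driven sampling: pick items whose difficulty falls in the band
# where results are most informative for the ability range of interest.
band_low, band_high = -0.5, 1.5
candidates = np.flatnonzero((item_difficulty >= band_low) & (item_difficulty <= band_high))
test_items = rng.choice(candidates, size=min(30, candidates.size), replace=False)

theta_system = 0.8   # notional ability estimate of the AI system under test
expected_score = p_correct(theta_system, item_difficulty[test_items]).mean()
print(f"selected {test_items.size} items; expected proportion correct = {expected_score:.2f}")
```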

Hess, J., & Valerdi, R. (2010). Test and evaluation of a SoS using a prescriptive and adaptive testing framework. 2010 5th International Conference on System of Systems Engineering, Loughborough, 2010, pp. 1-6.

“Testers need the ability to adapt test planning on the order of days and weeks. PATFrame will use its reasoning engine to prescribe the most effective strategies for the situation at hand. Strategies in this context include methods of experimental designs, test schedules and resource allocation. By facilitating rapid planning and replanning, the PATFrame reasoning engine will enable users to use information learned during the test process to improve the effectiveness of their own testing rather than simply follow a preset schedule. This capability is particularly attractive in the domain of Systems of Systems testing because the complexity of test planning and scheduling make frequent re-planning by hand infeasible.” “Our proposed method, [PATFrame], addresses a number of important SoS testing challenges by enabling users to adapt their test plans in light of newly learned information. The ability to re-plan test programs has already shown benefits through exploratory testing, and we believe that our algorithm-driven method for adaptation may provide similar results.”

 Challenges • 1. Time required for testing ◊ How to test effectively in a compressed schedule? ⇒ “Testers need the ability to adapt test planning on the order of days and weeks.” • 2. Complexity, Uncertainty in planning, and Adaptability of Testing ⇒ “test organizations and testers in general are asked to make decisions under a great deal of uncertainty.” ⇒ “A well planned test schedule for a major system can be “overcome by events” (OBE) at any time, and we have no way of knowing when or how this will be.” ◊ How much testing is enough? ◊ How to prioritize tests? • 3. Formal testing criteria for SoS testing ⇒ “there are virtually no formal SOS requirements to test against in Developmental Testing, meaning we often rely on Stewart-esque "I know it when I see it" criteria to define SOS failures.”

 Recommendations • Prescriptive and Adaptive Test Framework (PATFrame) ⇒ “By facilitating rapid planning and replanning, the PATFrame reasoning engine will enable users to use information learned during the test process to improve the effectiveness of their own testing rather than simply follow a preset schedule.” ◊ Design of Experiments

⇒ “Design of Experiments is a set of methods for efficiently gathering information. The simplest incarnations of DOE facilitate planning tests of a system by taking into account information already known about dependent variables and their effects (including interactions with one another) on the independent variables being measured.” ◊ Defect Modeling ⇒ IBM proposed using information regarding tracked defects “to project the number of remaining defects, and the rate at which they will be discovered. This method is used to answer the question … “When am I done testing?”.” ◊ Exploratory Testing ⇒ “ET, a strategy used in software testing, leverages human judgment to improve a test program “on-the-fly”. As opposed to scripted testing, ET encourages the tester to use information learned from their previous tests to select the next test case.”
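The defect-modeling idea quoted above ("When am I done testing?") can be illustrated with a small curve-fitting sketch. This is a hypothetical example, not IBM's or PATFrame's actual model; the exponential form, the grid-search fit, and the toy data are assumptions made for illustration.

```python
# Illustrative sketch (not IBM's actual model): fit a simple exponential
# defect-discovery curve N(t) = a * (1 - exp(-b t)) to cumulative defect counts
# and project how many defects remain, i.e., "when am I done testing?"
import math

def fit_exponential(ts, cum_defects, a_grid, b_grid):
    """Brute-force least-squares fit over a small grid (keeps the sketch
    dependency-free; a real analysis would use a proper optimizer)."""
    best = None
    for a in a_grid:
        for b in b_grid:
            sse = sum((c - a * (1 - math.exp(-b * t))) ** 2
                      for t, c in zip(ts, cum_defects))
            if best is None or sse < best[0]:
                best = (sse, a, b)
    return best[1], best[2]

# Toy data: cumulative defects found by test week.
weeks   = [1, 2, 3, 4, 5, 6]
defects = [12, 21, 27, 31, 33, 35]
a, b = fit_exponential(weeks, defects,
                       a_grid=[30 + i for i in range(21)],
                       b_grid=[0.1 * i for i in range(1, 21)])
print(f"projected total defects ~ {a:.0f}, remaining ~ {a - defects[-1]:.0f}")
```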

Hill, & Thompson, G. (2016, December 28). Five giant leaps for robotkind: Expanding the possible in autonomous weapons. War on the Rocks. https://warontherocks.com/2016/12/five-giant-leaps-for-robotkind-expanding-the-possible-in-autonomous-weapons/

This study focuses on five operationally-oriented challenge problems, all of which are associated with AS discovery, improvisation, and “initiative”, a particular problem area for T&E:

 Challenges • Hostage rescue using discriminating lethality • Deployment under disrupted/degraded comms • On the spot improvisation of materiel solutions • Adaptation for discovery of new tactics/TTPs • Evoking “disciplined initiative” when original plan is failing/illegal/immoral

Ilachinski, A. (2017). AI, robots, and swarms: Issues, questions, and recommended studies. Center for Naval Analyses, white paper.

This white paper presents a high-level overview of the challenges of acquiring AI-enabled systems, with a strong focus on certification/T&E issues. The author argues that the DoD acquisition apparatus is completely unprepared for the complexity, fast development, and non-static nature of these systems, and that massive change is needed. However, the author provides a mix of mid-level recommendations (e.g., use non-traditional modeling techniques like agent-based modeling) and calls for new research areas (e.g., determining what cybersecurity for AI/AS will look like and how to do it).

 Challenges • “Devil is in the details” research hurdles ⇒ “Developers of autonomous systems must confront many of the same fundamental problems that the academic and commercial AI and robotic research communities have struggled for decades to “solve.” To survive and successfully perform missions, autonomous systems must be able to sense, perceive, detect, identify, classify, plan for, decide on, and respond to a diverse set of threats in complex and uncertain environments. While aspects of all these “problems” have been solved to varying degrees, there is, as yet, no system that fully encompasses all of these features.” • Complex and uncertain environments ⇒ “Autonomous systems must be able to operate in complex possibly, a priori unknown environments that possess a large number of potential states that cannot all be pre-specified or be exhaustively examined or tested. Systems must be able to assimilate, respond to, and adapt to dynamic conditions that were not considered during their design. This “scaling” problem i.e., being able to design systems that are developed and tested in static and structured environments, and then have them perform as required in dynamic and unstructured environments is highly nontrivial.” • Emergent Behavior ⇒ “Emergent behavior: For an autonomous system to be able to adapt to changing environmental conditions, it must have a built-in capacity to learn, and to do so without human supervision. It may be difficult to predict, and be able to account for a priori unanticipated, emergent behavior (a virtual certainty in sufficiently “complex” systems-of-systems dynamical systems). • Human-machine interactions ⇒ “Human-machine interactions/I: The operational effectiveness of autonomous systems will depend on the dynamic interplay between the human operator and the machine(s) in a given environment, and on how the system responds, in real time, to changing operational objectives, in concert with the human’s own

adaptation to dynamic contexts. The innate unpredictability of the human component in human-machine collaborative performance only exacerbates the other challenges identified on this list. ⇒ “Human-machine interactions/II: The interface between human operators and autonomous systems will likely include a diverse space of tools that include visual, aural, and tactile components. In all cases, there is the challenge of translating human goals into computer instructions (e.g., “solving” a longstanding “AI problem” of natural language processing), as well as that of depicting the machine’s “decision space” in a form that is understandable by the human operator (e.g., allowing the operator to answer the question, “Why did the system choose to take action X?”).” • Non-agility of DoD Acquisition ⇒ “By way of comparison, note that within roughly this same interval of time, the commercial AI research community has gone from just experimenting with (prototypes of dedicated hardware-assisted) deep learning techniques, to beating the world champion in Go (along with achieving many other major breakthroughs).” ⇒ “Of course, DoD acquisition challenges, particularly for weapons systems that include a heavy coupling between hardware and software, have been known for decades. However, despite numerous attempts by various stakeholders to address these challenges, the generic acquisition process (at least on the traditional institutional level) remains effectively unchanged.” • Complexity of the state space ⇒ “…it is impossible to conduct an exhaustive search of the vast space of possible system “states” for autonomous systems…” • Complexity of the environment ⇒ “the behavior of an autonomous system cannot be specified—much less tested and certified—in situ, but must be tested in concert with interaction with a dynamic environment … rendering [the problem] combinatorially intractable” • Unpredictability ⇒ “to the extent that autonomous systems are inherently complex adaptive systems, novel or unexpected behavior can be expected to arise naturally and unpredictably in certain dynamic situations; existing T&E/V&V practices do not have the requisite fidelity to deal with emergent behavior.” ⇒ “Gap 2: An underappreciation of the unpredictable nature of autonomous systems, particularly when operating in dynamic environment, and in concert with other autonomous systems. Existing T&E/V&V practices accommodate neither the basic properties of autonomous systems, as expected by AI and indicated by decades of deep fundamental research into the behavior of complex adaptive systems, nor the requirements they must meet, as weapon systems (as spelled out by DoD Directive 3000.09).” • Trust

⇒ “existing T&E/VV&A practice is limited to testing systems in closed, scripted environments, since “trust” is not an innate trait of a system” • Experience and/or learning ⇒ “to be more effective, autonomous systems may be endowed with the ability to accrue information and learn from experience. But such a capability cannot be certified monolithically, during one “check the box” period of time. Rather, it requires periodic retesting and recertification, the periodicity of which is necessarily a function of the system’s history and mission experience. Existing T&E/V&V practices are wholly inadequate to address these issues.” • Lack of a coherent theoretical framework for AI/autonomy ⇒ “Gap 3: A lack of a universally agreed upon conceptual framework for autonomy that can be used both to anchor theoretical discussions and to serve as a frame-of-reference for understanding how theory, design, implementation, testing, and operations are all interrelated.”

 Recommendations • Incorporate T&E throughout acquisition cycle ⇒ “Step 3: Moving design, development, testing, and accreditation through the DoD acquisition process (and accommodating autonomy’s unique set of technical challenges while doing so).” • Develop an operationally meaningful conceptual framework for autonomy. ⇒ “For example, build on lessons learned from the National Institute of Standards and Technology’s (NIST’s) stalled evolution of its ALFUS (Autonomy Levels for Unmanned Systems) framework, and develop the skeleton of an idea proposed by DoD’s Defense Science Board’s 2012 report on autonomy.” • Develop measures of effectiveness (MOEs) and measures of performance (MOPs) for autonomous systems. ⇒ “Develop a methodology by which the effectiveness of autonomous systems can be measured at all levels (e.g., developers, program managers, decision-makers, and warfighters) and across all required functions, missions, and tasks (e.g., coordination, mission tasking, training, survivability, situation awareness, and workload).” • Use nontraditional modeling and simulation (M&S) techniques to help mitigate AI/autonomy-related dimensions of uncertainty. ⇒ “…M&S is moving away from “simulations as distillations” of real systems … to “simulation-based rules and algorithms as descriptions” of real (i.e., engineered) robots and behaviors. … For example, while “swarm engineering” methods exist to facilitate the unique design requirements of robotic swarms, no general method exists that maps individual rules to (desired) group behavior. Multi-agent based modeling techniques are particularly well suited for developing these rules, and, more generally, for studying the kinds of self-

organized emergent behaviors expected to arise in coupled autonomous systems.” • Develop new T&E/V&V standards and practices appropriate for the unique challenges of accrediting autonomous systems. ⇒ “For example, help ameliorate basic gaps in testing in terms of accommodating complexity, uncertainty, and subjective decision environments, by appealing to and exploiting lessons learned from the development and accreditation practices established by the complex system theory and multi-agent-based modeling research communities.” • Explore basic human-machine collaboration and interaction issues. ⇒ “As autonomy increases, human operators will be concerned less with the manual control of a vehicle, and more with controlling swarms and directing the overall mission: “What are the operator’s informational needs (and workload limitations) for controlling multiple autonomous vehicles?” “How do humans keep pace with an accelerating pace of autonomy-driven operations?” “What kinds of command-and-control relationships are best for human- machine collaboration?” “How are human and autonomous-system decision- making practices optimally integrated?” and “What data practices are key to developing shared situation awareness?” • Explore the cyber implications of autonomous systems. ⇒ “Explore what new features increased AI-driven autonomy brings to the general risk assessment of increasingly autonomous unmanned systems. On one hand, autonomy may potentially reduce a force’s overall vulnerability to jamming or cyber hacking. … On the other hand, autonomy itself may also be more, not less, vulnerable to a cyber intrusion.”
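As a concrete illustration of the multi-agent-based modeling the author recommends, the sketch below applies two simple local rules and lets group-level spacing emerge from their interaction. It is an assumed toy example, not taken from the CNA paper; the Agent class, the rule parameters, and the update loop are invented for the illustration.

```python
# Illustrative sketch (assumed, not from the CNA paper): a minimal multi-agent
# model of the kind suggested for studying self-organized swarm behavior --
# each agent follows simple local rules (cohesion + separation), and
# group-level behavior emerges from their interaction.
import random

class Agent:
    def __init__(self, x, y):
        self.x, self.y = x, y

def step(agents, cohesion=0.01, separation=1.0):
    cx = sum(a.x for a in agents) / len(agents)
    cy = sum(a.y for a in agents) / len(agents)
    for a in agents:
        # local rule 1: drift toward the group centroid
        a.x += cohesion * (cx - a.x)
        a.y += cohesion * (cy - a.y)
        # local rule 2: push away from any neighbor that is too close
        for b in agents:
            if b is not a and abs(a.x - b.x) + abs(a.y - b.y) < separation:
                a.x += 0.1 * (a.x - b.x)
                a.y += 0.1 * (a.y - b.y)

agents = [Agent(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(20)]
for _ in range(100):
    step(agents)
# Group spacing is an emergent property of the local rules above.
print(min(abs(a.x - b.x) + abs(a.y - b.y)
          for a in agents for b in agents if a is not b))
```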

Kapinski, J., Deshmukh, J., Jin, X., Ito, H., & Butts, K. (2016). Simulation-based approaches for the verification of embedded control systems. IEEE Control Systems Magazine, 45-64.

This paper reviews methods for providing assurance about complex (but symbolically coded) software controlling physically embedded systems. They identify the need to move through the trade-space of providing more approximate answers with more scalable techniques as a key problem in complex autonomous software. They recommend model-based development, where developers and/or testers create models at different levels of approximation depending on system maturity. The paper reviews a number of concrete techniques and how to apply them to different stages of development.

 Challenges • Software Complexity ⇒ “Increased autonomy is often achieved by using advanced algorithms that increase the complexity of the control software.” ⇒ “…manually generating code in a monolithic manner and then validating the system design with experimental tests. This approach is expensive and difficult to manage for complex systems…” • Scalability vs. Approximation ⇒ “Approaches can broadly be classified in terms of how well they account for the possible behaviors of the model and how well they scale. Some techniques, such as model checking, can provide formal guarantees of correctness for all behaviors of software systems, but these do not scale well for many industrial embedded control systems. Simulation, on the other hand, can be applied to models of any scale but only provides an approximation of behavior for a discrete set of operating conditions.” • System Requirements/Specification ⇒ “All testing and verification approaches rely on some form of requirements, either formal or informal, but the process of creating correct and useful requirements is an often underappreciated activity. Care should be taken to create requirements that accurately reflect the intended behavior of the system.”

 Recommendations • Use model-based development (MBD) with levels of modeling fidelity/specificity ⇒ The paper lays out the path requirements -> control design model -> specification model -> code -> platform hardware -> platform + plant in a “V” shape, with the first three stages on the left-hand side corresponding to the earlier phase and the last three on the right-hand side corresponding to the later phase. ⇒ “Focus on control algorithms, high-level requirements, easier and cheaper to debug and repair code” in the earlier phase. “Focus on control implementations,

real-time/platform-aware requirements, harder and more expensive to debug and repair” in the later phase. ⇒ Additionally, requirements inform platform + plant directly, the control design model informs platform hardware directly, and the specification model informs code directly. ⇒ “The earlier stages of the development are associated with the left side of the design V. Identifying a problem with the control design at these early stages results in less expensive rework than if the problem is identified later in the development process.”
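To make the simulation-based end of the scalability-versus-approximation trade concrete, the sketch below checks one requirement against many randomly sampled operating conditions of a toy closed-loop model. It is an assumed illustration, not a method from the paper; the plant model, the settling requirement, and the sampling ranges are all invented.

```python
# Illustrative sketch (assumed, not from the paper): simulation-based checking
# of a requirement over randomly sampled operating conditions. The "model" is a
# toy first-order controller; the requirement is "settle within 2% of the
# setpoint by t = 5 s". Falsifying conditions are collected rather than proving
# correctness for all behaviors -- the approximation/scalability trade above.
import random

def simulate(setpoint, gain, dt=0.01, horizon=5.0):
    x = 0.0
    t = 0.0
    while t < horizon:
        x += dt * gain * (setpoint - x)   # simple proportional tracking
        t += dt
    return x

def requirement_met(setpoint, final_x, tol=0.02):
    return abs(final_x - setpoint) <= tol * max(abs(setpoint), 1.0)

failures = []
for _ in range(1000):
    sp = random.uniform(-10.0, 10.0)        # sampled operating condition
    g = random.uniform(0.1, 2.0)            # sampled plant/controller gain
    if not requirement_met(sp, simulate(sp, g)):
        failures.append((sp, g))
print(f"{len(failures)} falsifying condition(s) found out of 1000 runs")
```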

Laverghetta, T., Leathrum, J., & Gonda, N. (2019). Integrating virtual and augmented reality based testing into the development of autonomous vehicles. Old Dominion University, Norfolk VA, MODSIM World 2019.

There are multiple challenges to the development of autonomous systems. These challenges make testing difficult and raise the bar for the testing needed to ensure safety. To test systems more efficiently, the authors propose integrating virtual reality and augmented reality: real-world tests can be augmented with virtual features, and virtual tests can be augmented with real stimuli. In this way the system can be tested safely and early in a virtual environment, with features of reality added gradually to increase the fidelity of the virtual environment, concluding with real-world testing. Using a software framework, the autonomous system experiences a seamless transition between virtual and real-world testing, unaware of whether it is being tested in a virtual or real environment.

 Challenges • Traditional testing paradigms cannot be relied on for autonomous systems ◊ It may not be possible to understand the reasoning behind an AI’s decision ⇒ “… it may not be possible to determine why the software made a decision.” ◊ It may be difficult to test subsystems in isolation from the greater system and the environment ⇒ “But it may be difficult to test subcomponents of the system in the absence of the complete system and the ability to perceive the environment” ◊ Fully autonomous systems do not have humans to recognize and address unintended behavior ⇒ “First, in fully autonomous vehicles, there is no human backup to address faults, malfunctions, and unexpected operating conditions.” ⇒ “Thus, the autonomy software must have significant additional complexity to address all potential contingencies, making testing more difficult” • Decision making by autonomous systems is often non-deterministic ◊ Probabilistic sampling and distributions are used in decision making ⇒ “Second, autonomous software often utilizes non-deterministic components and statistical algorithms” ⇒ “This makes it difficult to evaluate the results of testing because there is no uniquely correct result for a given test scenario and the tests are non-repeatable” • High standards for safety ◊ Autonomous systems need high degrees of confidence in safety to be operated fully autonomously. ⇒ “A failure of the software could result in the destruction of property and loss of life.”

⇒ “Such vehicle testing is time consuming and expensive; often it simply is not feasible to conduct enough tests with the physical vehicle to ensure desired safety levels”

 Recommendations • Test subsystems in isolation using a black box simulation/model ⇒ “[Difficulty in testing isolated subsystems] presents the opportunity for system and environment simulations to drive the black box testing, providing the stimuli to the subcomponents to allow the observation of their behavior” • Transitioning Testing from VR and AR into Real World Testing ◊ Test early in the development lifecycle, gradually increasing in resolution ⇒ “It also presents the opportunity to test early in the design and development system, known to reduce development costs.” ◊ Use of a software framework/virtualization to seamlessly transition between simulated and real world (see the sketch after this entry). ⇒ “The software under test should be oblivious of whether operating in a virtual/test environment or in physical operating conditions.” ◊ Gradually introduce real sensor input/output into the virtual test. ⇒ “… slowly integrating actual computational, sensing, and motion capabilities as the hardware becomes available.”
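One way to realize the "oblivious" software-under-test idea is a common sensor interface with interchangeable virtual and hardware implementations. The sketch below is an assumed illustration, not the authors' framework; the RangeSensor interface, class names, and threshold are invented.

```python
# Illustrative sketch (assumed names): keeping the software under test
# "oblivious" to whether it is driven by a simulator or by real hardware --
# both environments implement the same sensor interface, so the same autonomy
# code runs unchanged as virtual stimuli are gradually replaced by real ones.
from abc import ABC, abstractmethod

class RangeSensor(ABC):
    @abstractmethod
    def distance_m(self) -> float: ...

class SimulatedRangeSensor(RangeSensor):
    def __init__(self, world_model):
        self.world = world_model
    def distance_m(self) -> float:
        return self.world.nearest_obstacle_m()     # synthetic stimulus

class HardwareRangeSensor(RangeSensor):
    def __init__(self, driver):
        self.driver = driver
    def distance_m(self) -> float:
        return self.driver.read()                  # real stimulus

def autonomy_step(sensor: RangeSensor) -> str:
    """The software under test: identical in virtual and live testing."""
    return "BRAKE" if sensor.distance_m() < 5.0 else "CRUISE"

# Early testing: purely virtual world model.
class ToyWorld:
    def nearest_obstacle_m(self):
        return 3.2
print(autonomy_step(SimulatedRangeSensor(ToyWorld())))   # -> BRAKE
```

Because the autonomy code depends only on the interface, individual virtual stimuli can be swapped for real sensors one at a time as hardware matures.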

Lede, J. (2019, April 5). Autonomy overview. US-Japan Service to Service Dialogue.

This briefing is a short recap of the challenges and recommendations made by the Autonomy CoI groups, e.g., Ahner and/or Lennon. The author calls for testable requirements, new analysis tools for compositional verification, re-use and relicensing of previously approved capabilities, run-time monitors as a prime driver of assurance, and assurance cases.

 Challenges • “Requirements that are mathematically expressible, analyzable, and automatically traceable to different levels of autonomous systems” ⇒ “Dynamic requirements generation & feedback” ⇒ “Design time and run time transparency” • “Methods and tools enabling the compositional verification of the progressive design process” ⇒ “Trust/transparency in design” ⇒ “’Correct by construction’ synthesis” • “Systems that are ‘licensed’ to perform functions in limited cases” ⇒ “Applicability of Learning Algorithms” ⇒ “Pedigree-Based Licensure” • System constrained by set of allowable, predictable, and recoverable behaviors, shifting analysis/test burden to more deterministic run-time assurance mechanism” ⇒ “Run time analysis prediction” ⇒ “Transparency models for past performance and future behaviors” • “Argument based notations, structures and semantics of arguments, implicitly tied to requirements”

 Recommendations • “Methods, Metrics, and Tools Assisting in Requirements Development and Analysis” ⇒ “Precise, structured standards to automate requirement evaluation for testability, traceability, and consistency” • “Evidence-Based Design and Implementation” ⇒ “Assurance of appropriate decisions with traceable evidence at every level to reduce the T&E burden” • “Cumulative Evidence through Research, Development, and Operational Testing” ⇒ Progressive sequential modeling, simulation, test, and evaluation to record, aggregate, leverage, and reuse M&S/T&E results throughout engineering lifecycle” • “Run-time Behavior Prediction and Recovery”

⇒ “Real time monitoring, just-in-time prediction, and mitigation of undesired decisions and behaviors” • “Assurance Arguments for Autonomous Systems” ⇒ “Reusable assurance case-based on previously evidenced ‘building blocks’”
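A minimal sketch of the run-time assurance pattern implied by "run-time behavior prediction and recovery" follows. It is an assumed illustration, not from the briefing; the controllers, the command envelope, and the monitor rule are invented stand-ins.

```python
# Illustrative sketch (assumed, not from the briefing): a run-time assurance
# wrapper -- a simple monitor checks the primary (possibly learned) controller's
# command against an allowable envelope and falls back to a trusted recovery
# action, shifting part of the assurance burden from design-time test to run time.
def primary_controller(state):
    # stand-in for a complex or learned policy
    return 4.0 * state["error"]

def recovery_controller(state):
    # simple, separately verified fallback
    return max(min(1.0 * state["error"], 1.0), -1.0)

def monitored_step(state, cmd_limit=2.0):
    cmd = primary_controller(state)
    if abs(cmd) > cmd_limit:                 # predicted to leave the safe envelope
        return recovery_controller(state), "RECOVERY"
    return cmd, "PRIMARY"

print(monitored_step({"error": 0.3}))   # within envelope -> primary command
print(monitored_step({"error": 2.0}))   # out of envelope -> recovery command
```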

Lennon, C., & Davis, E. (2018, September). Autonomy Community of Interest: Test & Evaluation, Verification & Validation Working Group. US-UK Test & Evaluation Meeting.

This briefing is a short recap of the challenges and recommendations made by the Autonomy CoI groups, e.g., Ahner and/or Lennon (see also Lede, 2019). The authors call for testable requirements, new analysis tools for compositional verification, re-use and relicensing of previously approved capabilities, run-time monitors as a prime driver of assurance, and assurance cases.

 Challenges • “State-Space Explosion” ⇒ “Algorithm state space cannot be exhaustively searched, and lack of understanding of that space limits simplifying assumptions” • “Unpredictable Environments” ⇒ “Lack of understanding of how system will react to its environment” • “Emergent Behavior” ⇒ “Interactions between complex adaptive systems; how to constrain behavior at design time & run time” • “Human-Machine Communication” ⇒ “Designing transparency and human-system requirements” • Gaps for ATEVV ◊ “Lack of Verifiable Autonomous System Requirements” ⇒ “Requirements with assumptions; measures of effectiveness” ◊ “Lack of Modeling, Design & Interface Standards” ⇒ “No standardized modeling framework spanning system lifecycle creates lack of traceability between capabilities and requirements” ◊ “Lack of Autonomy T&E capabilities” ⇒ “Ranges, test beds, skill sets” ◊ “Lack of human operator reliance to compensate for brittleness” ⇒ “Human machine interface; human performance & training” ◊ “Lack of Run Time V&V for Deployed Autonomy” ⇒ “Cannot always rely on human supervision; need bounding of behavior” ◊ “Lack of Evidence Re-use for V&V” ⇒ Unsustainable T&E for complex autonomy”

 Recommendations • Accumulate evidence to provide assurance across system lifecycle ◊ Methods & Tools for Requirement Development & Analysis ⇒ Precise, structured standards to automate requirement evaluation for testability, traceability & de-confliction ◊ Evidence-Based Design and Implementation ⇒ Assurance of appropriate decisions with traceable evidence ◊ Cumulative Evidence through RDT&E, DT, OT ⇒ Progressive modeling, simulation, test & evaluation ◊ Run Time Behavior Prediction & Recovery ⇒ Just in time prediction & mitigation of undesired behaviors ◊ Assurance Arguments for Autonomous Systems ⇒ Reusable assurance case based on previous evidence

Lenzi, N., Bachrach, B., & Manikonda, V. (2010, September). DCF – A JAUS and TENA compliant agent-based framework for UAS performance evaluation. PerMIS ’10: Proceedings of the 10th Performance Metrics for Intelligent Systems Workshop, September 2010, 119–126.

This paper describes an implemented solution to the challenge of inadequate UAS simulations, which are often built without sufficient attention to simulation fidelity, interoperability, or the presence of other agents. The authors propose using common architectures (e.g., LVC) to support simulated performance testing of these systems and discuss their specific implementation.

 Challenges • Need a framework ⇒ “To reduce life cycle costs and improve T&E, there is increasing need for a generalized framework that can support the design and development of T&E approaches for multi-UAS teams and validate the feasibility of the concepts, architectures and algorithms. • Emergent behavior/non-deterministic ⇒ “This challenge is most significant in the cognitive/social domains, where the development of test approaches and methodologies are difficult because of the emergent nature of behaviors in response to dynamic changes in the battlespace.” • Inadequacy of current simulations ⇒ “Current simulations rarely capture the complexity of real world effects” ⇒ “…very often high fidelity simulations do not scale as the number of UAS increases. On the other extreme, directly implementing hardware platforms without high resolution simulations to help refine the design induces significant risk”

 Recommendations • Create scalable, human integrative simulation environments using common architectures ⇒ “…To address this need, IAI has developed a JAUS and TENA compliant Integrated Agent-based T&E Framework for Teams of Unmanned Autonomous Systems developed. IAI has also developed the Vignette Editor which fulfills a variety of functions from visualizing the state of the UAS team, creating T&E scenarios and monitoring the UAS team performance.”

Luna, S., Lopes, A., Yan See Tao, H., Zapata, F., & Paneta, R. (2013). Integration, Verification, Validation, Test, and Evaluation (IVVT&E) framework for system of systems (SoS). Procedia Computer Science, 20, 298-305.

“Current IVVT&E methodologies focus on the constituent-system levels rather than the testing strategies at the SoS level. This paper proposes an IVVT&E framework for a SoS and utilizes the communications platform of an UAS as an example. The proposed framework allows the early identification of possible systems evolution and knowledge emergence during the system design phase. In addition, the proposed IVVT&E framework is the result of the conjunction of several well-known methodologies, such as DSM, graph theory, and combinatorial/pair-wise testing, which are systematically integrated with a proposed methodology to optimize the IVVT&E activities of a SoS.”

 Challenges • Complexity ⇒ “Typically, as the number of systems, interconnections, and interface protocols grows over time, the system complexity increases and the resulting SoS becomes difficult to maintain.” • Constituent level-testing vs. SoS level testing ⇒ “In the current IVVT&E phases, constituent system-level regression testing needs to be expanded to the SoS level to ensure that SoS is not affected by the constituent system upgrades or changes. Integration may happen asynchronously in constituent systems but networked facilities for integration are now commonly utilized for SoS IVVT&E activities [ODUSD, 2008]. Current research concentrates on methodologies for the SoS capabilities IVVT&E after the constituent systems are upgraded and delivered to integrate with the overall SoS.” • Scalability of the testing methodologies for the SoS ⇒ “The scalability of the testing methodologies for the SoS is a major concern, in particular when large numbers of systems are involved and IVVT&E may be too costly or time-consuming to implement within a limited period of time.” • Uncertainty ⇒ “Macias (2008) acknowledged that T&E must be prepared to handle both certain and uncertain test requirements and would require new tools and methods to address SoS in action-based environments for uncertain operating scenarios.” ⇒ “Uncertainty is ubiquitous in all the stages of the SoS development and life cycle if any emergence is allowed in known scenarios and/or if the SoS is operating in unknown scenarios. But, design and development in uncertain/unknown/unexpected operational environments present tremendous

challenges since information about uncertainty will never be complete or accurate.”

 Recommendations ⇒ As an example, the authors apply the recommended framework below to UAS • Systems of Systems (SoS) ◊ Architectural Frameworks ⇒ “The evolutionary IVVT&E framework for SoS would require adapting current framework to enable early identification of possible evolution and knowledge emergence during system design, to allow space for improvement on the requirements to respond to evolution and test for uncertainty. Formal IVVT&E models benefit system design in two ways: (1) Concepts and algorithms can facilitate the communication of ideas in a methodical way, and (2) Formal methodologies allow system developers to analyze properties of a design by building design logic into requirements and using formal models for synthesis of architecture-level representations [Selberg and Austin, 2008].” ◊ Department of Defense Architecture Framework (DoDAF) ⇒ “DoDAF is an architecture framework based on well-understood requirements, ensuring that all requirements are taken into consideration, and making the requirements traceable. This implies that all system variables are well known and with the expected system behavior. However, modern SoS may generate new knowledge or behave in unexpected ways (knowledge emergence, behavior emergence) that renders the SoS unpredictable for untested scenarios and thus, there is a need for IVVT&E methodologies [that deal] with knowledge and behavior emergence.” ◊ Enablers, Controllers, and Inputs/Outputs ⇒ “Enablers allow the realization of required SoS processes. Controllers assist in the appropriate process identification to ensure that all required processes perform their capabilities. Inputs/outputs monitoring keep track of the SoS capability and helps in understanding the system behavior, both in the constituent system and the SoS as a whole.” ◊ SoS interfaces ⇒ “Architecting the SoS around a set of standards will enable constituent systems to be replaced with minimal rearchitecting of system interfaces” ⇒ “The presence of interface layers in the architectural framework greatly assists in facilitating the integration of newly evolved systems or modeling the effects of knowledge emergence and operational uncertainties within existing SoS. Any SoS can be decomposed into architectural layers and can assist in identifying the different interface layers required for facilitating the test and evaluation of SoS.” • Design Structure Matrix (DSM)

⇒ “The DSM implementation re-organizes iteration loops based on sequencing and clustering of the processes. This improved method shows the impact of the reduced time for IVVT&E activities with early detection of design failures [Levardy and Browning, 2005].” • Network Graphs ⇒ “Graph theory allows the pictorial representation of the minimum links necessary to test the SoS. … This methodology will provide the optimized testing path through a determined region.” • Testing Strategies ⇒ “Addressing [uncertainty] challenges requires an efficient testing strategy, where once the formal model has been developed, an optimal set of combinatorial test scenarios is designed for the formal model to be evaluated [Zapata et al., 2013].” ◊ Combinatorial/Pair-wise Testing ⇒ “Pair-wise testing is a combinatorial technique that has an exponential nature for building test suites.”
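The combinatorial/pair-wise testing strategy cited above can be illustrated with a small greedy generator. This is an assumed sketch, not the authors' tool; the parameter names and values are invented, and production work would normally use an established covering-array tool.

```python
# Illustrative sketch (assumed, not the authors' tool): a small greedy pair-wise
# (2-way combinatorial) test-suite generator. It covers every pair of parameter
# values with far fewer cases than the full cross product, which is the point
# of the combinatorial testing strategy cited above.
from itertools import combinations, product

def pairwise_suite(parameters):
    names = list(parameters)
    uncovered = set()
    for (n1, n2) in combinations(names, 2):
        for v1 in parameters[n1]:
            for v2 in parameters[n2]:
                uncovered.add(((n1, v1), (n2, v2)))
    suite = []
    while uncovered:
        # greedily pick the candidate test that covers the most uncovered pairs
        best, best_pairs = None, set()
        for values in product(*(parameters[n] for n in names)):
            case = dict(zip(names, values))
            covered = {p for p in uncovered
                       if case[p[0][0]] == p[0][1] and case[p[1][0]] == p[1][1]}
            if len(covered) > len(best_pairs):
                best, best_pairs = case, covered
        suite.append(best)
        uncovered -= best_pairs
    return suite

params = {"datalink": ["SATCOM", "LOS"],
          "weather":  ["clear", "rain", "fog"],
          "payload":  ["EO", "IR"]}
suite = pairwise_suite(params)
print(len(suite), "tests instead of", 2 * 3 * 2)   # typically 6-7 vs 12
```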

Macias, F. (2008). The test and evaluation of unmanned and autonomous systems. ITEA Journal, 29, 388-395.

“Current Department of Defense test and evaluation capabilities and methodologies are insufficient to address testing of weapon systems operating in non-deterministic and unscripted modes characteristic of the unmanned and autonomous system. Task complexity and adaptability to the environment are critical for evaluation of unmanned and autonomous performance. This represents a new challenge for the test and evaluation community. Verification of system performance and interactions will require the tester to understand the nuances of multiple technical domains…” “Current T&E techniques are suitable for systems with tightly coupled tethered operations. As we approach infinitesimally close to fully autonomous systems over a 30-year horizon, testing becomes enormously more difficult. Therefore, to address these and other testing limitations, the Test Resource Management Center established the Unmanned and Autonomous System Test (UAST) focus group…”

 Challenges • Complexity ⇒ “In the ordered world of T&E, we have well developed guidance for simple and complicated systems but not for complex systems. As UASs trend to more complex and chaotic dynamics we must relax our strategies to enable us to establish emergent practices for complex systems and novel practices for chaotic systems deployment.” • Speed of testing turn-around ⇒ “Traditional T&E is limited to single system focus with life cycle development numbered in years.” • Cost/Overhead associated with testing ⇒ “Test, as it is practiced today, has huge overhead and is highly optimized for yesterday’s problems.”

 Recommendations • Development of a UAST framework that supports ◊ Early tester participation, ◊ Multi-level assessment ⇒ “Monitoring, assessment, and response occur at multiple levels.” ◊ Plan-based assessment ⇒ “Monitoring is triggered by an assessment of dependencies and constraints on plan execution.” ◊ Capability-based assessment ⇒ “Ongoing assessment of vehicle mission-related capabilities is based on subsystem and environment status.”

◊ Predictive assessment ⇒ “Monitoring and assessment anticipate future events or conditions.” ◊ Team-based assessment ⇒ “Assessment occurs not just of individual vehicles, but at the team level as well.”

McLean, A., Bertram, J., Holke, J., Rediger, S., & Skarphol, J. (2018). LVC enabled testbed for autonomous system testing. Rockwell Collins & Advanced Technology Center, white paper.

This white paper identifies the challenge of safely testing autonomous systems in light of emerging regulation, lack of control/safety in live testing, and increasingly complex systems with ever-evolving hardware and software requirements and capabilities. The authors recommend the use of simulated LVC environments to help make the jump from early bench testing to acquiring safety releases for live test.

 Challenges • “Regulatory Restrictions” ⇒ “Depending on the nature of the autonomous operation and the particular aircraft, it may be simply impossible to obtain permission to perform the flight from the regulatory body, or the process to obtain the approval may approach or exceed the duration of the project.” • “Flight Test Complexity” ⇒ “Although remote vehicle operators, through backup control systems, may be able to mitigate certain risks during test flights, the lack of continuous human-in-the-loop control adds new wrinkles to an already complex test environment.” ⇒ “Exceedingly long hours put in by engineering to deal with last minute issues may result in significant budget overruns.” • “Computational Resources” ⇒ “…hardware target will change through the development and integration phases” ⇒ “It may be months before this hardware is completed, pushing risk into the integration and flight test phases” ⇒ “Any T&E approach for autonomous systems must be tolerant of late arrival of hardware without completely disrupting the development and integration cycle” • “Lack of Human-in-the-loop” ⇒ “In autonomous systems, the human may not have the needed situational awareness to assume control quickly enough, making it much harder to safely recover the vehicle should the autonomous behavior not function correctly.” • “Iterative Development”

⇒ “Traditional processes bring a set of integrated capabilities through test and evaluation together. The natural outcome of this approach is that modifications to the system, except to correct deficiencies, is discouraged. For autonomous systems, innovation is occurring at a rate that precludes a waterfall T&E approach. A more appropriate T&E approach for autonomous systems would allow iterative integration as innovations occur, and would avoid the overhead typically imposed by the need for an exhaustive set of regression tests.”

 Recommendations • “Use LVC as a risk-reducing transition element from early dev to OT” ⇒ “Autonomous aircraft development demands a testbed capable of providing a smooth transition through two critical phases: (1) moving from the development environment to the integration environment, and (2) moving from the integration environment to the operational test or demonstration platform. The testbed environment we developed is a single integrated set of reconfigurable LVC-enabled resources that host the entire spectrum of testing, from development and integration tests through human-factors evaluations and final operational flight tests.” ⇒ “This can be extended to larger “systems of systems” by using a mixture of live and simulation pieces.”

Menzies, T., & Pecheur, C. (2004, July 12). Verification and Validation and Artificial Intelligence. Preprint submitted to Elsevier Science.

The authors describe their experience with the NASA Remote Agent Experiment (RAX) system, which autonomously controlled the Deep Space One probe, roughly 60 million miles from Earth, over a two-day period. On that basis, they conclude that AI systems can be:

 Challenges • Highly complex software systems ◊ Traditional testing ⇒ This is inadequate for AS’s because of the large number of “situations” encountered (state space explosion) ◊ Run-time monitoring ⇒ Using much interstitial code to detect “error” conditions (out of bounds,…) and provide a recovery mechanism. ⇒ But it requires anticipating a myriad of potential error situations ◊ Static analysis ⇒ Checking out the source code statically without actually executing it. Great for finding basic coding errors ⇒ But clearly suffers from the large problem space imposed by varying environmental variables outside of the agent ◊ Model checking ⇒ Provides checking by exhaustively exploring all reachable states of the software, or model of the software. They have used it on parts of the RAX system ⇒ While running model checking software may be fast, building the model can be time-consuming (and introduce its own errors) ⇒ But it “requires that this state space be finite and tractable”, not feasible with a state space explosion imposed by the environment ◊ Theorem proving • [Challenges] of a declarative and knowledge based nature ⇒ They contrast procedural approaches (e.g., a C program, which specifies an order to code actions) with declarative or rule based approaches (e.g., an expert system which specifies relations and if-then rules) ⇒ There are additional tools for procedural systems V&V • [Challenges] of non-deterministic and adaptive nature ◊ External non-determinism ⇒ Again, this is due to uncertainties in the situations that may be imposed by the external world of the agent (to which it is reacting and upon which it acts)

◊ Internal non-determinism: ⇒ Here the system may make random choices in certain situations; its internal behavior may change because of timing and concurrency issues (multiple threads racing, for instance); or its behavior may change over time through adaptation or learning
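The model-checking approach discussed above, exhaustive exploration of all reachable states, can be shown on a deliberately tiny model. The sketch below is an assumed illustration, not the tooling used on RAX; the toy transition system and safety property are invented, and the state-space explosion appears as soon as more variables are added.

```python
# Illustrative sketch (assumed, not the RAX tooling): the core idea behind
# explicit-state model checking -- exhaustively explore every reachable state
# of a small finite model and check a safety property in each one.
from collections import deque

# Toy model: a mode switch and a bounded resource counter.
def successors(state):
    mode, resource = state
    nxt = []
    if mode == "IDLE":
        nxt.append(("ACTIVE", resource))
    if mode == "ACTIVE":
        if resource > 0:
            nxt.append(("ACTIVE", resource - 1))   # consume
        nxt.append(("IDLE", min(resource + 1, 3))) # recharge, capped at 3
    return nxt

def safety_ok(state):
    mode, resource = state
    return not (mode == "ACTIVE" and resource < 0)  # never run negative

initial = ("IDLE", 2)
seen, frontier, violations = {initial}, deque([initial]), []
while frontier:
    s = frontier.popleft()
    if not safety_ok(s):
        violations.append(s)
    for t in successors(s):
        if t not in seen:
            seen.add(t)
            frontier.append(t)
print(f"explored {len(seen)} states, {len(violations)} violations")
```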

Micskei, Z., Szatmari, Z., Olah, J., & Majzik, I. (2012). A concept for testing robustness and safety of the context-aware behaviour of autonomous systems. KES-AMSTA 2012, 504-513.

Micskei et al. (2012) note that, due to the complexity of even relatively unsophisticated autonomous tasks, formally defining requirements and selecting test cases related to those requirements will be daunting. They argue that the size of the resulting search space demands that testing be at least partially automated. Their recommendation can be summarized as: formally define the critical environmental inputs and features, select techniques that allow scenarios to be automatically generated and evaluated based on these features, and ensure that the system’s representations and processing of these inputs can be captured during (typically simulated) test.

 Challenges • Even simple tasks are complex ⇒ “Even if the task of a robot is relatively simple, e.g., to pick up garbage from the ground, it should be able to differentiate and recognize numerous types of objects and be prepared to take into account the unexpected movements of humans. Thus, it shall be robust in order to be capable of handling unforeseen situations and safe to avoid harmful effects with respect to humans.” ⇒ “First, the behaviour is highly context-aware: the actual behaviour of an AS depends not only on the events it receives, but also on the perceived state of the environment (that is typically stored as a context model in the AS). Second, the context is complex and there are a large number of possible situations: in real physical world the number and types of potential context objects, attributes and interactions that need to be specified can be large. Third, adaptation to evolving context is required: as most of the autonomous systems contain some kind of learning and reasoning capabilities, their behaviour can change in time based on feedback from the evolving environment.” ⇒ “Lack of easy-to-use mechanisms to express and formalize context-aware behaviour (although these mechanisms are a prerequisite of requirements-based automated test generation and test evaluation). Most noticeably, in existing standard test description languages there is no support to express changes in the context.” • Requirements Specification ⇒ “These characteristics have also consequences on the specification of the requirements to be tested. Full behaviour specification can be impractical due to the complexity of the behaviour and the diversity of the system environments, and requirements should include the evolution of the environment.” • Test Case Selection ⇒ “Ad-hoc testing of stressful conditions and extreme situations: Previous research focused first of all on producing high fidelity simulators or executing

excessive field testing for the verification of AS. There exist methods for testing the physical aspects; however, not all behavioural aspects are well- covered.” • Lack of Metrics ⇒ Lack of precise and objective test coverage metrics that can characterize the thoroughness of the testing process. On the basis of context models we defined precise coverage metrics, especially robustness related metrics that refer to constraints and conditions (to be violated), and combinations of context elements and fragments.

 Recommendations ⇒ Figure 1 provides an overview of the automated testing approach described below. Testing scenarios are informed from context and action models. The context model together with the scenario generate test data which are used in execution of the test. A test oracle, described below, is generated from the scenario and the action model and is used to evaluate the test. • Capture System-level Data During Test ◊ Formally define test data requirements ⇒ “Test data. The input to the system under test should be specified. As described previously, in case of autonomous systems this should include (i) the initial state of the context of the system, (ii) its evolution in time, and (iii) the messages and commands received by the SUT. These concepts are captured in a context model. Outputs of the SUT are included in a separate action model.” ◊ Automate generation and recording of test data ⇒ “We proposed an automated test data generation approach, which uses search- based techniques. The key ingredients for the application of a search-based technique are the representation of the potential solutions and the definition of a fitness function.” ⇒ “First, so called abstract test data are generated and then a postprocessing step produces concrete test data in a format dependent on the simulator. This way the formalized requirements use only general, abstract concepts and relationships (e.g., the robot is near to something). These are replaced in the postprocessing step with compatible types defined in the simulator and the exact parameters (e.g., physical coordinates) are assigned.” ◊ Automate evaluation of test cases ⇒ “After test data are generated, the following steps have to be executed. The simulator is fed with each generated test data (which describe the environment of the SUT) and then it processes the dynamic part of the test data (i.e., the evolution of the context, sending and receiving of messages etc.).” • Create a “test oracle” tool that can automatically code desired outcomes for test events ⇒ “The responsibility of the test oracle is to evaluate the test outcome, the actions and output messages of the system. As specifying the exact outcome of every

situation could be infeasible, a lightweight approach is used. The requirements are expressed as scenarios and are checked for every executed test (to detect potential safety and robustness failures).” • Have a “context model”—a structured process for describing test data ⇒ “It consists of two parts. The static part represents the environment objects and their attributes in a type hierarchy (in the vacuum cleaner example it includes concepts like room, furniture inside a room, humans or animals). The dynamic part contains events as distinguished elements to represent changes with regard to objects (i.e., an object appears, disappears) and their relations and properties (e.g., a relation is formed or a property is transformed). The events have attributes and specific relations to the static objects depending on the type of the event.” ⇒ “Several modelling languages exist to express such models” (pg. 508)
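The search-based test data generation described above can be sketched with a simple random search over abstract scenarios. This is an assumed illustration, not the authors' implementation; the scenario fields, the fitness definition, and the search budget are invented.

```python
# Illustrative sketch (assumed, not the authors' tool chain): search-based
# generation of abstract test data. Candidate scenarios are scored with a
# fitness function that rewards stressful situations (here: an obstacle
# appearing close to the planned path late in the run), and a simple random
# search keeps the best scenario found.
import random

def fitness(scenario):
    """Higher = more stressful for the system under test (toy definition)."""
    closeness = max(0.0, 10.0 - scenario["obstacle_distance_m"])
    lateness = scenario["appearance_time_s"] / 30.0
    return closeness + 5.0 * lateness

def random_scenario(rng):
    return {"obstacle_distance_m": rng.uniform(0.5, 50.0),
            "appearance_time_s":  rng.uniform(0.0, 30.0)}

rng = random.Random(1)
best = max((random_scenario(rng) for _ in range(500)), key=fitness)
# A post-processing step (as described above) would map this abstract scenario
# onto concrete simulator coordinates before execution and oracle evaluation.
print(best, round(fitness(best), 2))
```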

Mueller, S., Hoffman, R., Clancey, W., & Emrey, A. (2019, February). Explanation in Human-AI systems: A literature meta review, synopsis of key ideas and publications, and bibliography for Explainable AI. DARPA XAI Program Task Area 2, deliverable February 2019.

This review paper was a deliverable for the DARPA XAI program. It recounts the history and current state-of-the-art techniques in providing explainability for AI-enabled systems. Though it does not directly address T&E, it was included as it will be helpful for discussing gaps in explainability evaluation and for general background as well.

 Challenges • Existing AI/ML benchmarks won’t help with XAI ⇒ “It is essential that we as a community respect the time and effort to conduct evaluations.” The context of this statement is that the computer science community has developed many benchmarks that can be used relatively easily to determine whether one algorithm is better than another; but since explanations are intended for humans, behavioral science concepts are needed to evaluate them properly. • Achieving sufficient modeling fidelity ⇒ “Co-adaptation between agents and environments can also give rise to emergent complexity” ⇒ “The problem is if a system … contains a large number of agents … it would be helpful to know … if the system will in fact be able to carry out the required tasks. This would save time modelling the problem, if the agents are discovered to be too simple and also with evaluating the result of any simulation.” • Assigning credit or blame in multi-agent systems of systems ⇒ “If something goes wrong with the process, then it might be difficult to identify the source of the problem, is it agent1 and behavior x or agent2 and behavior y?”

 Recommendations • Ensure evaluators consider the following ⇒ global versus local explanations ⇒ the need to evaluate the performance of the human-machine work system (and not just the performance of the AI or the performance of the users). ⇒ that the experiment procedures tacitly impose on the user the burden of self-explanation. • Evaluate explainability with a general model/framework across systems ⇒ This framework can be described by a process and measures. ⇒ The process involves a user receiving an explanation generated by the XAI system. The explanation revises the user’s mental model, which in turn enables better performance.

⇒ The measures describe the assessment mechanisms at each stage of the process. The explanation is assessed by “goodness” criteria and tests of satisfaction, the user’s mental model is assessed by a test of comprehension, and “better” performance is measured by a test of performance.
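One way to keep the three kinds of measures in the framework above tied together is to record them per participant in a single structure. The sketch below is an assumed illustration, not a DARPA XAI instrument; the field names and scales are invented.

```python
# Illustrative sketch (assumed structure, not the XAI program's instruments):
# the measurement framework above evaluated per participant -- an explanation
# satisfaction/"goodness" rating, a mental-model comprehension score, and a
# downstream task-performance score, kept together so the human-machine work
# system (not just the algorithm) is what gets evaluated.
from dataclasses import dataclass
from statistics import mean

@dataclass
class ParticipantRecord:
    satisfaction: float    # rating of the explanation (e.g., 1-7 scale)
    comprehension: float   # fraction correct on a mental-model test
    performance: float     # task score achieved with the system

def summarize(records):
    return {"mean_satisfaction": mean(r.satisfaction for r in records),
            "mean_comprehension": mean(r.comprehension for r in records),
            "mean_performance": mean(r.performance for r in records)}

pilot = [ParticipantRecord(5.0, 0.8, 0.9),
         ParticipantRecord(3.0, 0.5, 0.6),
         ParticipantRecord(6.0, 0.9, 0.8)]
print(summarize(pilot))
```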

Office of the US Air Force Chief Scientist (2011). Technology horizons: A vision for Air Force science and technology 2010-2030. Air University Press, Maxwell AFB, AL.

 Challenges • V&V Issues ◊ Need for effective V&V that enables “trust in autonomy”. ◊ Verifiability and Certifiability ⇒ Because of the astronomically large state space of these systems, “[development] of such systems is thus inherently unverifiable by today’s methods, and as a result their operation – in all but comparatively trivial applications – is uncertifiable.” ⇒ Or, equivalently: “It is possible to develop systems having high levels of autonomy, but it is the lack of suitable V&V methods that prevents all but relatively low levels of autonomy from being certified for use.” ◊ Highly adaptable, autonomous control systems ⇒ “Emphasis [should be] on composability via system architectures based on fractionation and redundancy. This involves advancing methods for collaborative control and adaptive autonomous mission planning, as well as V&V of highly adaptable, autonomous control systems.”

Durst, P. J., & Gray, W. (2014). Levels of autonomy and autonomous system performance assessment for intelligent unmanned systems. U.S. Army Corps of Engineers Engineer Research and Development Center Geotechnical and Structures Laboratory.

A review of attempts to develop a framework for levels of autonomy, surveying where standards have formed for unmanned systems (UMSs) and recommending where improvements are needed, viewed through the lens of unmanned ground vehicles (UGVs) and ground-vehicle T&E. The authors’ goal is to review attempts to quantify autonomy levels (ALs) and the frameworks proposed for them, in order to move toward a more broadly acceptable framework for describing levels of autonomy.

 Challenges • Standards for unmanned systems (UMS) have yet to be accepted across the T&E community ⇒ “Several standards have been proposed for UGVs, some of which have been adopted by the international community. These standards relate primarily to UGV software architecture and messaging formats with the goal of enabling interoperability.” • Component-level testing is “lacking” ⇒ Testing of sensors, the robotic platform’s capabilities, human/machine interaction, and software is underdeveloped. Some modified legacy tests for components and for UMS algorithms do exist, but the mission performance measures proposed by NIST and the American Society for Testing and Materials (ASTM) have not been widely adopted. • Even current frameworks do not lend themselves to mathematical rigor ⇒ The Autonomy Levels for Unmanned Systems (ALFUS) framework cannot combine its metrics into levels of autonomy mathematically or in a standardized way, nor does it account for mission performance; it is “too vague and too complex”.

 Recommendations • I - Use pre-existing frameworks as a guide ⇒ Current frameworks for interoperability include: ⇒ Joint Architecture for Unmanned Systems (JAUS) ⇒ 4D/RCS reference architecture ⇒ NATO STANAG for Unmanned Aerial Vehicles (UAVs) ⇒ ASTM Committee F41 for Unmanned Maritime Vehicles (UMVs) • II - Presents current methods of discussing levels of autonomy for consideration ⇒ Autonomy Levels for Unmanned Systems (ALFUS) framework offers model for understanding potential metrics

⇒ Using the Contextual Autonomy Capability (CAC), metrics can be developed along the axes of Mission Complexity, Environmental Complexity, and Human Independence, but a single level of autonomy cannot be produced. ⇒ ALFUS has a summary 0-10 scale of autonomy levels on a model that displays an “autonomy trend” between “Remote control” at 0 and “full intelligent autonomy” at 10. (i) At 0: “Remote control of UMS wherein the human operator, without benefit of video or other sensory feedback, directly controls the actuators of the UMS on a continuous basis, from a location off the vehicle and via a tethered or radio linked control device using visual line-of-sight cues.” (Huang 2004) (ii) At 10: “Completes all assigned missions with highest complexity; understands, adapts to, and maximizes benefit/value/efficiency while minimizing costs/risks on the broadest scope environmental and operational changes; capable of total independence from operator intervention.” (ALFUS Framework 2005) ⇒ Not a method lending itself to standardization, as it changes metric-by-metric and does not “decompose tasks in a commonly agreed-upon, standard way” • III - Proposes using generic, high-level UMS architectures, based on “context” and “non-context” ◊ One contextual model developed by the U.S. Army Engineer Research and Development Center (ERDC) splits UMS architecture into basic layers: “Information acquisition”, “Information analysis”, “Decision and action selection”, “Action implementation” ⇒ The model is not perfect, however. “There are, of course, many exceptions that do not fit perfectly within this framework. For most robots, there is not such a clear delineation between each level of the architecture. Often, perception, modeling, planning, and execution all happen simultaneously.”

performed, and the results of these component-level tests would then be combined to provide the final, single number autonomous potential.” ⇒ Does not provide a mission or environment-specific analysis.
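The NCAP-style idea described above, assigning a coarse autonomy level by the highest architecture layer the system performs without the human, can be restated as a toy lookup. This is an assumed illustration, not the NCAP or ERDC definition itself; the layer names follow the summary above and the 0-3 mapping is a simplification.

```python
# Illustrative sketch (a toy restatement, not the NCAP standard itself): mapping
# the highest architecture layer performed on board -- rather than by the human
# operator -- to a coarse 0-3 autonomy level, as described in the summary above.
LAYERS = ["information acquisition",        # perception only
          "information analysis",           # world model / knowledge base
          "decision and action selection",  # planning
          "action implementation"]          # executing its own plan

def ncap_style_level(highest_onboard_layer: str) -> int:
    """0 = teleoperated (perception only) ... 3 = plans and acts on its own."""
    return LAYERS.index(highest_onboard_layer)

print(ncap_style_level("information acquisition"))        # 0: no autonomy
print(ncap_style_level("information analysis"))           # 1: semi-autonomous
print(ncap_style_level("decision and action selection"))  # 2: autonomous planning
print(ncap_style_level("action implementation"))          # 3: full execution
```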

Qiu, S., Liu, Q., Zhou, S., & Wu, C. (2019). Review of artificial intelligence adversarial attack and defense technologies. Applied Sciences, 9, 909. doi:10.3390/app9050909

While this paper does not directly address T&E, it has clear implications for it. Artificial intelligence systems are vulnerable to adversarial attacks, which limit the application of artificial intelligence technologies in security-critical fields. Issues resulting from adversarial attacks include confidence reduction, misclassification, targeted misclassification, and source/target misclassification. Adversarial attacks can be mounted in the training, testing, and deployment stages, and the issues they cause manifest in the testing and deployment stages. Improving the robustness of AI systems against adversarial attacks therefore plays an increasingly important role in the further development of AI. This paper aims to comprehensively summarize the latest research progress on adversarial attack and defense technologies in deep learning.

 Challenges • Training Stage Adversarial Attacks ◊ Modifying Training Dataset ◊ Label Manipulation ◊ Input Feature Manipulation • Testing Stage Adversarial Attacks ◊ White-Box Attacks ⇒ Adversaries have access to the parameters, algorithms, and structure of the target model and utilize this knowledge to construct adversarial samples to carry out attacks (a minimal illustrative sketch follows this list). ◊ Black-Box Attacks ⇒ Adversaries do not have access to information regarding the target model, but they can train a substitute model by querying the target model and then exploit the transferability of adversarial samples, or they can use a model inversion method. • Deployment Stage Attacks ⇒ The authors omit discussion of this class of attacks, as it is similar to the class of attacks occurring in the testing stage.
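To make the white-box case concrete, the following minimal sketch applies the fast gradient sign method (FGSM) idea, one of the gradient-based attack families the survey covers, to a toy logistic-regression model. The sketch is our illustration, not the paper's; the model, its parameters, and the perturbation budget are all invented for the example.

    import numpy as np

    # Minimal FGSM-style white-box perturbation against a toy logistic-regression
    # classifier (illustrative only; the surveyed attacks target deep networks).
    rng = np.random.default_rng(0)
    w, b = rng.normal(size=8), 0.1        # model parameters known to the adversary
    x = rng.normal(size=8)                # a single input sample
    y = 1.0                               # its true label

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def loss_grad_wrt_input(x, y, w, b):
        # For binary cross-entropy with p = sigmoid(w.x + b), d(loss)/dx = (p - y) * w
        p = sigmoid(w @ x + b)
        return (p - y) * w

    eps = 0.1                             # assumed perturbation budget
    x_adv = x + eps * np.sign(loss_grad_wrt_input(x, y, w, b))

    print("clean score:", sigmoid(w @ x + b))
    print("adversarial score:", sigmoid(w @ x_adv + b))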

 Recommendations • Modifying Data ◊ Adversarial Training ⇒ Introducing attack samples into training to improve the robustness of predictions. ◊ Gradient Hiding ⇒ Hides information regarding the model gradient ◊ Blocking Transferability


⇒ Implemented by perturbing inputs or training on adversarial samples by providing the classifier a null label for such inputs. ◊ Data Compression ⇒ Compression of the input can reduce the effect of a disturbance, but can also reduce the accuracy of the model ◊ Data Randomization ⇒ “random resizing adversarial samples can reduce the effectiveness of adversarial samples. Similarly, adding some random textures to the adversarial samples can also reduce their deception to the network model. [sic.]” • Modifying Models ◊ Regularization ⇒ Penalize the estimation in order to reduce the model complexity ◊ Defensive Distillation ⇒ “produces a model with a smoother output surface and less sensitivity to disturbance to improve the robustness of the model.” ◊ Feature Squeezing ⇒ “reduce the complexity of the data representation, thereby reducing the adversarial interference due to low sensitivity. … Although this technique can effectively prevent adversarial attacks, it also reduces the accuracy of the classification of real samples.” ◊ Using a Deep Contractive Network/noise reduction ⇒ “uses noise reduction automatic encoder to reduce the adversarial noise.” ◊ Mask Defense ⇒ “insert a mask layer before processing the classified network model. This mask layer trained the original images and corresponding adversarial samples and encoded the differences between these images and the output features of the previous network model layer. It is generally believed that the most important weight in the additional layer corresponds to the most sensitive feature in the network. Therefore, in the final classification, these features are masked by forcing the additional layers with a primary weight of zero. In this way, the deviation of classification results caused by adversarial samples can be shielded.” ◊ Use of Parseval Networks/hierarchical regularization ⇒ “Cisse et al. proposed a network called Parseval as a defensive method against adversarial attacks. This network adopts hierarchical regularization by controlling the global Lipschitz constant of the network. Considering the network can be viewed as a combination of functions at each layer, it is possible to have robust positive ions for small input perturbations by maintaining a small Lipschitz constant for these functions, they proposed to control the spectral norm of the network weight matrix by parameterizing the spectral norm of the network weight matrix through Parseval tight frames, so it was called “Parseval” network. [sic.]”


• Using Auxiliary Tools ◊ Defense Generative Adversarial Networks ⇒ Utilizes a generative adversarial network to project input images onto the range of the generator prior to feeding the image to the classifier. ◊ MagNet Detector ⇒ Used to identify legal and adversarial samples ◊ High-Level Representation Guided Denoiser ⇒ Used to design a robust target model. (A hedged sketch of one of the simpler defenses above, feature squeezing, follows this list.)
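As a hedged illustration of the feature-squeezing defense summarized above, the sketch below compares a model's output on a raw input with its output on a bit-depth-reduced copy and flags large disagreements as possibly adversarial. The model is a toy placeholder, and the bit depth and threshold are assumptions, not values from the paper.

    import numpy as np

    def squeeze_bit_depth(x, bits=4):
        # Reduce input precision (x assumed in [0, 1]) to 2**bits levels.
        levels = 2 ** bits - 1
        return np.round(x * levels) / levels

    def looks_adversarial(x, model_predict, threshold=0.3, bits=4):
        # Large disagreement between raw and squeezed predictions is suspicious.
        p_raw = model_predict(x)
        p_squeezed = model_predict(squeeze_bit_depth(x, bits))
        return bool(np.abs(p_raw - p_squeezed).max() > threshold)

    # Toy stand-in classifier: class scores depend on mean pixel intensity.
    toy_model = lambda x: np.array([x.mean(), 1.0 - x.mean()])
    x = np.random.default_rng(1).random((8, 8))
    print(looks_adversarial(x, toy_model))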

Ring, J. (2009). Evolving an autonomous test and evaluation enterprise. OntoPilot LLC.

The presenter argues for an evolving test and evaluation (T&E) enterprise for unmanned aerial systems (UAS) that manages stakeholder expectations throughout the T&E process. The presenter provides graphics and frameworks for observing systemic issues in the T&E enterprise, for showing how an intelligent system fits into the decision-making apparatus of a battlefield, and for identifying transfers of knowledge and sources of capabilities in a T&E enterprise. The presenter then explains measures of effectiveness for a T&E enterprise, a framework for understanding the parts of a UAS, and how best to initiate a T&E enterprise so that a structure emerges organically to promote better stakeholder service and synergy.

 Challenges • Managing stakeholder expectations ⇒ The T&E of UAS is highly complex, with ambiguous and widely varying challenges, and stakeholders are not always positioned in the loop to have full knowledge of the process. There are many factors to consider in the highly complex picture of a battlefield into which intelligent systems must fit. • Conceptualizing a framework to understand the components of a UAS ⇒ Modeling of processes can be ambiguous, creating problems for development and for communication to stakeholders • Creating the right conditions for a T&E enterprise to meet all expectations ⇒ DoD processes are decentralized and non-standard

 Recommendations • I – Allow the enterprise structure to emerge organically ⇒ A variety of scenarios can emerge in the development of systems, so the structure that fits the enterprise the most should be adapted to the system that is being developed ⇒ Enterprise capabilities should be ensured through “Situation Assessments”, resource “Co-Alignment”, “Staff Capability Development”, and making sure others are in place • II – Develop a common, understandable systems framework


⇒ Use standard labels to show problem spaces and the systems, actions, and dependencies that exist within them ⇒ Approach problems through a model-based program for the system and problem space, as well as an enterprise architecture to show potential structural choices • III – Have metrics to evaluate enterprise success ⇒ Be aware of enterprise measures, such as Productivity, Innovation, Cost of Quality, Learning Curve, Work Climate Surveys, and Benchmarking, as well as personnel measures such as Ambiguity, Apathy, and Rumors • IV – Put the proper motivating factors in place so that the enterprise emerges as well as possible ⇒ Consider factors of motivation and potential scenarios for the development of an enterprise

Roske, V., Kohlberg, I., & Wagner, R. (2012, March 15). Autonomous systems challenges to test and evaluation. National Defense Industrial Association, Test and Evaluation Conference, 12-15 March 2012.

This NDIA presentation looks at AS testability and identifies characteristics, metrics, and standards. The authors recommend that testing be fractionated across perception, reasoning, and execution (see/think/do).

 Challenges • Challenges for System Designers (and T&E): ◊ Establishing which Characteristics to observe ⇒ “Environmental characteristics germane to the system’s objectives ⇒ Includes characteristics of objectives, of threats, of location, of neutrals, of the system itself, of many other germane entities” ◊ Establishing Metrics for each characteristic ⇒ “What essentially describes (measures) the characteristics? ⇒ Tilt or height of a wall, GPS coordinates, motion of a human” ◊ Establishing Standards for the Metrics ⇒ How “collapsed” (short or leaning) does a wall need to be to be “destroyed” ⇒ To stimulate action (coordinates of “here” VS of the “destination”) ⇒ To know when to STOP or not take action • T&E Challenges for perception function ◊ Need to separate out evaluating sensor performance from evaluating system inferences from those sensors

• T&E Challenges for reasoning function


◊ Informing a confidence in an algorithm’s decision making performance ◊ Demands (professional/ moral/ legal) for ensuring adequate T&E to avoid unacceptable consequences from system behavior ⇒ Establishing Certifications for Autonomous System T&E methods and practitioners ⇒ T&E of Decision Making Algorithms in a system context

 Recommendations • To ensure Testability: (what to measure to establish performance) ◊ Requires a new System Design discipline and an early collaboration with T&E ⇒ Establishing System Boundaries between Perception, Decision Making and Execution Functions ⇒ Incorporating decision algorithm performance in system control design ⇒ Producing Characteristics, Metrics and Standards for effective decision making • To ensure adequate Testing (to inform confidence in the measured performance) ⇒ Requires a new, scientifically rigorous foundation for planning T&E programs for autonomous systems, merging Control Theory, Complexity Science, and Design of Experiments
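As a hedged illustration of the characteristic-metric-standard chain the presenters call for, the sketch below encodes one such criterion in a small data structure; the field names and the collapsed-wall threshold are our assumptions, not the presenters'.

    from dataclasses import dataclass
    from typing import Callable

    # Hypothetical encoding of a Characteristic -> Metric -> Standard chain;
    # the "collapsed wall" case paraphrases the presentation's example.
    @dataclass
    class TestCriterion:
        characteristic: str                  # what to observe
        metric: Callable[[dict], float]      # how to measure it
        standard: Callable[[float], bool]    # when the measurement counts as success

    wall_destroyed = TestCriterion(
        characteristic="wall tilt after engagement",
        metric=lambda obs: obs["tilt_degrees"],
        standard=lambda tilt: tilt >= 45.0,  # assumed threshold for "destroyed"
    )

    observation = {"tilt_degrees": 60.0}
    print(wall_destroyed.standard(wall_destroyed.metric(observation)))  # True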


Scheidt, D. (2017). “NAVAIR Autonomy TEVV Study.” Weather Gage Technologies, LLC.

As NAVAIR seeks to develop an in-house autonomy test, evaluation, verification and validation (ATEVV) capability, it needs to develop its own relevant ATEVV policies and practices, tool suites, and skills and expertise. This presentation lays out an 8-step ATEVV process intended to mitigate the verification challenges posed by multiple algorithms and subcomponents in an autonomous system of systems. A detailed breakdown of the process appears below under “Recommendations.”

 Challenges • Complexity in systems of systems ◊ Multiple algorithms within autonomous systems carrying out different functions ⇒ “Independent validation of these algorithms is insufficient. An assurance argument that guarantees that the properties that emerge from the interactions between the algorithms are satisfactory is necessary.” • It’s unclear how different elements of testing and verification relate to each other ◊ How do various tests of different components sum to a satisfactory verification/proof of the safety case? ⇒ “How do these relate to each other? The answer is that we don’t really know, and we’re not really likely to know until somebody (NAVAIR) attempts an end-to-end ATEVV.” • Future unknowns ◊ Policy, legal, and ethical questions ⇒ What are the limits of ATEVV as we now understand them? What are the ethical and legal ramifications associated with our ability to understand and assure the performance of autonomous systems? ⇒ What should a policy look like that appropriately divides the authority and responsibility for autonomous unmanned vehicle operations among: Operator; Commander; Peers (i.e., combatants w/o control over UAS); Acquisition Community; Test Community; and Development Contractor?

 Recommendations • Create an 8 step process to develop ATEVV process for NAVAIR ◊ 1. Scope the problem ⇒ “What is the “autonomous” about the system? What kinds of decisions are made by the autonomous system: perception, localization, path planning, learning of target classification, learning target behaviors, learned control policies and so on


⇒ “What is the operating environment like? Static or Dynamic? Simple or complex? Includes other decision-makers, are they friendly, neutral or adversarial? ⇒ “What are the possible interactions between the autonomy and the outside world? ⇒ “Anticipated insight and product – Identifying gaps in terms, ontologies and languages that are required to define operational domain and conditions.” ◊ 2. Define the requirements ⇒ “The first product of the autonomy requirements process is a detailed specification that defines requirements for the entire system, which include the physical plant as well as the autonomous decision-making apparatus.” ◊ 3. Define the metrics, units of measure ⇒ “The second product of the autonomy requirements process is the development of the cognitive specifications which separates out the requirements that are specific to the decision-making process. ◊ 4. Cross matrix the scope and the requirements ⇒ “The third product of an autonomous system requirements analysis is the Adaptability Matrix, which identifies change that must be managed by the autonomous system. ⇒ “The fourth product of the autonomy requirements process is the Cognitive Decomposition which breaks down decision-making requirements into sub-requirements for each component within the cognitive architecture. ⇒ The final product of the requirements process is a cognitive process model that illustrates the relationships between cognitive components and a dependency matrix that enumerates the required quality of cognitive inputs and the produced quality of cognitive outputs. By explicitly enumerating cognitive component dependencies we may produce assurance traceability from disparate sources. ◊ 5. Identify and access tools and methods ⇒ “For each tool/method identify – Scope of the tool, what decisions it can be used to test, class of vehicle; Assumptions/inputs required by the tool; Products/outputs by the tool and the assurance/invariants represented by those products” ◊ 6. Define the assurance process ⇒ “After defining the requirements in detail, and identifying suitable ATEVV tools, an Assurance Plan defines the assurance methods that will be used to engender trust of each cognitive component.” ◊ 7. Map existing tools to the process and identify gaps ⇒ “Select existing tools and techniques for use in ATEVV for our target system. Define a subset of requirements we expect to validate (note that a full validation is likely to be prohibitively expensive).


⇒ “From the sets of tools identified produce a 4D mapping of Scope x Requirements x Cognitive Element x Tool/method in which each cell is (roughly) formulated as a Hoare triplet; ⇒ “Preconditions method Postcondition ⇒ “This provides us with a formalism that can be used to combine TEVV product results into an assurance argument. Note that it is more important to create an end-to-end chain that spans the process than it is to comprehensively address all requirements.” ◊ 8. Go experimenting ⇒ “Conduct a deep experimentation program in which all phases of the ATEVV process are used to produce a limited assurance argument. Did the tools produce the expected products? If not, why? Were the products sufficient to support the argument envisioned? If not, why?”
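To picture the 4D mapping in step 7, the hedged sketch below keys Hoare-style (precondition, method, postcondition) triples by scope, requirement, cognitive element, and tool, and checks that they chain into an end-to-end argument. The cell contents and the consistency check are invented for illustration and are not from the presentation.

    # Hypothetical Scope x Requirement x Cognitive Element x Tool mapping in which
    # each cell holds a Hoare-style (precondition, method, postcondition) triple.
    assurance_map = {
        ("static env", "detect obstacle", "perception", "sensor bench test"):
            ("calibrated camera", "bench characterization", "detections within 0.5 m"),
        ("static env", "avoid obstacle", "planning", "simulation campaign"):
            ("detections within 0.5 m", "Monte Carlo simulation", "no planned collisions"),
        ("static env", "avoid obstacle", "execution", "closed-course trial"):
            ("no planned collisions", "live tracked runs", "no observed collisions"),
    }

    def chain_is_consistent(cells):
        # An end-to-end argument requires each postcondition to satisfy the next
        # cell's precondition; here "satisfy" is simplified to string equality.
        triples = [assurance_map[c] for c in cells]
        return all(post == nxt_pre
                   for (_, _, post), (nxt_pre, _, _) in zip(triples, triples[1:]))

    print(chain_is_consistent(list(assurance_map)))  # True for this toy chain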


Schultz, A., Grefenstette, J., & De Jong, K. (1993). Test and evaluation by genetic algorithms. Genetic Algorithms, 9-13.

The authors used a genetic algorithm to test an autonomous system in simulation. The AS in this paper was a controller for a flight simulator. The goal of the genetic algorithm was to find as many fault scenarios as possible by altering the combination of initial conditions. The authors discuss several ways to evaluate the fault scenarios found. The first is to measure the difference between the controller's actual performance in a scenario and a perfect response in the same scenario. The second is to measure fitness based on the likelihood and severity of the fault conditions. The third is to reward the genetic algorithm's optimization function highly for fault scenarios that lie on the boundaries of the controller's performance space. The fourth is a loosely defined 'interesting scenarios' evaluation.

 Challenges • Traditional V&V of specifications is insufficient ⇒ “Validation and verification are not enough: the controller might perform as specified, but the specifications could be incorrect.” ⇒ “traditional controller tests are labor intensive and time consuming.”

 Recommendations • Use machine learning to automate the process ⇒ “…subject a controller to an adaptively chosen set of fault scenarios in a vehicle simulator, and then use a genetic algorithm to search for fault combinations that produce noteworthy actions in the controller.” ⇒ “…we must explicitly define an evaluation function that can measure the fitness of each scenario. This can be difficult because evaluation criteria are often based on informal judgments.” ⇒ “To search for fault scenarios, we use a class of learning systems called genetic algorithms”
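A minimal sketch of the approach, assuming a stand-in simulator: evolve combinations of fault/initial conditions and reward scenarios whose simulated controller response deviates most from an ideal response (the first of the four fitness ideas above). The simulator, encoding, and parameters are ours, not the authors' flight-simulator setup.

    import random

    random.seed(0)

    def controller_deviation(scenario):
        # Stand-in for running the vehicle simulator and comparing the controller's
        # actual trajectory with a perfect response; here just a synthetic function.
        return abs(sum(p * (i + 1) for i, p in enumerate(scenario)) - 2.0)

    def mutate(scenario, rate=0.3):
        return [p + random.gauss(0, 0.2) if random.random() < rate else p
                for p in scenario]

    def crossover(a, b):
        cut = random.randrange(1, len(a))
        return a[:cut] + b[cut:]

    # Each scenario is a vector of fault/initial-condition parameters.
    population = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(20)]
    for _ in range(50):
        population.sort(key=controller_deviation, reverse=True)  # most severe first
        parents = population[:10]
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(10)]
        population = parents + children

    population.sort(key=controller_deviation, reverse=True)
    print("most fault-revealing scenario:", [round(p, 2) for p in population[0]])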


Streilein, W., Thornton, J., Malyska, N., Bernays, J., Mersereau, B., Roeser, C., Mohindra, S., Shah, D., & Zipkin, J. (2019, July 15). Future NMI Study: Counter-AI. Massachusetts Institute of Technology Lincoln Laboratory.

Machine learning AI capabilities are vulnerable to adversarial counter-AI attacks. These attacks can have a substantial effect on the performance of AI systems. These attacks may not need significant access to the AI model, as attacks can be developed on proxy models and deployed against AI systems. ML systems need to be made robust against adversarial attacks during development. Testing AI systems against a counter-AI red team would develop this robustness. Having an AI red team would help to mitigate these challenges. It would be a centralized location for subject-matter experts. The red team would work both on evaluating AI systems in development as well as supporting their own red team infrastructure. An important element of this infrastructure will be the testbed framework for evaluating AI systems. Integrating this AI red team into the AI T&E lifecycle would support the development of robust AI systems hardened to adversarial attack.

 Challenges • AI capabilities provide new and unintuitive vulnerabilities to attack ◊ Computer vision works very differently from human vision and can be attacked in unintuitive ways ⇒ “For example, applying simple digital transformations to input images can drastically alter the resultant output of AI unequipped to handle deviations from the expected input. ◊ Poisoning attacks target models that train on observations ⇒ “This type of attack is common in systems that must rely on observations in the operational domain for its training data.” ◊ Evasion attacks push classifier algorithms into an erroneous class ⇒ “… evasion attack is the most common adversarial AI use case.” ⇒ “The most threatening evasion attacks utilize adversarial inputs that are imperceptible to casual human observers and resilient to pre-algorithm data conditioning.” ◊ Model inversion attacks can extract private/sensitive information from the algorithm ⇒ “Training data privacy is especially important when the data involved are sensitive or strictly regulated…” • Adversarial attacks are not especially difficult to mount ◊ Packages/toolkits exist for adversarial attacks ⇒ “Numerous toolkits exist to produce adversarial input, and the brittleness and low explainability of most AI algorithms leads to regrets when modifying implementations for security purposes.”


◊ Securing algorithms can be detrimental to performance ⇒ “Training-based approaches … have the clear disadvantage of reducing response accuracy.”

 Recommendations • Defenses have been developed against adversarial attack ◊ Algorithmic and non-algorithmic approaches are available ⇒ “… measure prediction consistency across multiple trained classifiers … input transformations or feature squeezing to reduce input perturbations.” ◊ Use of multiple defense approaches may be needed ⇒ “The strongest defenses will use a combination of algorithmic and system-level techniques integrated throughout the AI-supported system to detect and repel adversarial attack.” • Vulnerability of AI systems to adversarial countermeasures can be quantified through direct testing and experimentation ◊ Use of metrics to quantify impact of adversarial attack and test over multiple datasets ⇒ “… it would be helpful to assess the impact of different attack scenarios by running experiments over entire data sets and deriving quantitative metrics of effectiveness.“ • Use of red team group for countering adversarial attacks on AI systems ◊ Use of AI red teaming can evaluate systems under development ⇒ “… the Red Team assessments will provide a critical evaluation of systems under development…” ⇒ “… leading to systems that are pre-hardened to be robust at deployment.” ◊ Red team can be subject-matter experts ⇒ “This component of the team will be able to stay abreast of advances in the AI community in order to ensure the work on the team is relevant. ◊ Red team used for both evaluation and red team infrastructure support ⇒ “… those that occur on an on-demand basis as developed AI capabilities are evaluated … and activities that support persistent Red Team infrastructure development.” ⇒ “[Evaluation] will largely take place using software and rely upon blue team- provided code and mission data.” ⇒ “… it will be essential that the Red Team monitors ongoing developments in both AI development and counter-AI capabilities” ◊ Heterogeneous team of government personnel and FFRDC/UARC ⇒ “… to use a combination of government personnel to guide the Red Team activities, and FFRDC or UARC support to supply the technical expertise” • Assembly and use of testbed framework


⇒ “… one of the most important near-term actions is assembling the first version of the testbed infrastructure, … the AI Countermeasures Evaluation framework” ◊ Several options for physical infrastructure ⇒ “… including commercial cloud infrastructure, a managed configuration of cluster compute assets, or integration with the larger JAIC common infrastructure.” ⇒ “Note that many of these components will not have to be developed from scratch, but can make use of existing open source libraries and ongoing DoD and IC-sponsored efforts”
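As a hedged illustration of the suggestion above to run attack scenarios over entire data sets and derive quantitative metrics, the sketch below measures accuracy under a few toy "attack" transformations; the data, model, and transformations are placeholders, not the Lincoln Laboratory testbed.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    model = lambda X: (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in classifier

    # Toy "attack scenarios": transformations applied to the whole data set.
    attack_scenarios = {
        "clean":        lambda X: X,
        "small noise":  lambda X: X + rng.normal(scale=0.5, size=X.shape),
        "biased shift": lambda X: X - 0.8,
    }

    for name, attack in attack_scenarios.items():
        accuracy = (model(attack(X)) == y).mean()
        print(f"{name:>12}: accuracy = {accuracy:.2f}")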


Subbu, R., Visnevski, N., & Djang, P. (2009). Evolutionary framework for test of autonomous systems. Association for Computing Machinery PerMIS’09, September 21-23, 2009, Gaithersburg, MD. p. 93-98.

“Autonomous systems of the future will need to be tested so their mission capabilities and robustness are predictable to the warfighter. The principal challenge therefore is the set of test strategies for these future autonomous systems. The goal of the test community is that these autonomous systems be broadly accepted to seamlessly operate either independently or as part of a human-in-the-loop system. [The authors’] goal is to develop an efficient intelligent test process that will enable the rapid introduction of autonomous systems on the battlefield. [They] propose a novel war game simulation-based multi-objective evolutionary test framework that combines the elements of testing an autonomous system’s mission execution capabilities as a function of its innate capabilities and evolutionary computation.”

 Challenges • The set of scalable test strategies for these future autonomous systems ⇒ “Current DoD test and evaluation capabilities and methodologies while sufficient for tightly tethered human-in-the-loop systems are insufficient for the mission certification of complex autonomous systems operating in non- deterministic environments. Autonomous systems of the future will need to be tested so their mission capabilities, robustness, and failure modes are predictable to the warfighter. The principal challenge therefore is the set of scalable test strategies for these future autonomous systems.”

 Recommendations • War game simulation-based test framework for failure identification ⇒ “We propose a novel war game simulation-based test framework that utilizes evolutionary algorithms for identifying the mission failure modes. While the traditional application of evolutionary methods is for efficient synthesis or design, we propose the use of these methods for the efficient identification of failure scenarios from a mission satisfaction perspective.”


Alan, S. (2015, May). Technology Investment Strategy 2015-2018. DOD R&E Autonomy Community of Interest (COI), T&E V&V (TEVV) Working Group.

 Challenges • Issues caused by Autonomous Systems themselves ◊ State-Space Explosion ⇒ “Autonomous systems are characteristically adaptive, intelligent, and/or may incorporate learning. For this reason, the algorithmic decision space is either non-deterministic, i.e., the output cannot be predicted due to multiple possible outcomes for each input, or is intractably complex. Because of its size, this space cannot be exhaustively searched, examined, or tested; it grows exponentially as all known conditions, factors, and interactions expand. Therefore there are currently no established metrics to determine various aspects of success or comparison to a baseline state enumerated.” ◊ Unpredictable Environments ⇒ “The power of autonomous agents is the ability to perform in unknown, untested environmental conditions. Examples of environmental “stimuli” include actors capable of making their own decisions in response to autonomous system actions; producing a cognitive feedback loop that explodes the state space. Additionally, autonomous decisions are not necessarily memoryless and the state space is not just the intractably complex in the current situation, but also in the multiplicity of situations over time. Currently fielded systems have very limited robustness to dynamic / changing environmental conditions. Adaptive autonomous algorithms have the potential to overcome current automated system brittleness in future dynamic, complex, and/or contested environments. However, this performance increase comes with the price of assuring correct behavior in a countless number of environmental conditions. This exacerbates the state-space explosion problem.” ◊ Emergent Behavior ⇒ “Interactions between systems and system factors may induce unintended consequences. With complex, adaptive systems, how can all interactions between systems are captured sufficiently to understand all intended and unintended consequences? How can autonomous design approaches identify or constrain potentially harmful emergent behavior both at design time and at run time? What limitations are there with the current Design of Experiments approach to test vector generation when considering adaptive decision-making in both discrete decision logic and continuous variables in an unpredictable environment? Since emergent behavior can be produced by interactions between small, seemingly insignificant factors how can we provide test environments or test oracles that are of sufficient fidelity to examine and


capture emergent behavior (in M&S, T&E, and continuous operations or run time testing)? ◊ Human-Machine Communication ⇒ “Handoff, communication, and interplay between operator and autonomy become a critical component to the trust and effectiveness of an autonomous system. Current certification processes eliminate the need for “trust” through exhaustive Modeling and Simulation (M&S) and T&E to exercise all possible operational vignettes. When this is not possible at design time, how can trust in the system be ensured, what factors need to be addressed, and how can transparency and human-machine system requirements for the autonomy be defined?” • Capability Gaps in AS V&V ◊ Lack of Verifiable AS Requirements ⇒ Currently, there is a lack of common, clear, and consistent requirements for systems that include autonomous requirements, especially with respect to environmental assumptions, Concept of Operation (CONOPS), interoperability, and communication. There is also a lack of clearly defined Measures of Effectiveness (MOEs), performance measures, and other metrics. ⇒ Furthermore, there are deficiencies in ensuring the traceability of requirements to implementation (e.g., manufacturing or compiling) both manually and automatically. Automatic requirements extraction and validation is needed for future learning / adaptive systems. Finally, current autonomous systems requirements are not written or analyzed to ensure that they are verifiable. ◊ Lack of Modeling, Design, and Interface Standards ⇒ Currently, no standardized modeling frameworks exist for autonomous systems that span the whole system lifecycle (R&D through T&E). Therefore, a gap exists in traceability between capabilities implemented in conventional systems as well as adaptive, nonlinear, stochastic, and/or learning systems and the requirements they are designed to meet. This results in a need to integrate models that are both heterogeneous and composable in nature and existing at different levels of abstraction, including requirements, architecture models, physics-based models, cognitive models, test range/environment models, etc. ◊ Lack of Autonomous Test and Evaluation Capabilities ⇒ There is a current gap in T&E ranges, test beds, and skillsets for handling dynamic learning / adaptive systems. The complexity of these systems results in an inability to test under all known conditions, difficulties in objectively measuring risk, and an ever-increasing cost of rework / redesign due to errors found in late developmental and operational testing. Furthermore, the lack of formalized requirements and system models makes test-based instrumentation of model abstractions increasingly difficult. This limits design-for-test capabilities, including tests to evaluate human-autonomy interactions. ◊ Lack of Human Operator Reliance to Compensate for Brittleness


⇒ Currently, the burden of decision making under uncertainty is placed solely on human operators. Certification, acceptance, and risk mitigation often assume the human operator can compensate for the brittleness currently found in manned, remotely piloted, or tele-operated systems. However, as systems move from relatively predictable automated behaviors to more unpredictable and complex autonomous behaviors, and as autonomous systems operate in denied environments in which they interact with human intermittently, it will become increasingly difficult for human operators to understand and respond appropriately to decisions made by the system. Thus, V&V of autonomous software should also take into account factors relating to human-machine interfaces, human performance characteristics, requirements for human operator training, etc. ◊ Lack of Run Time V&V during Deployed Autonomy Operations: ⇒ Current automated systems rely on human oversight to guarantee safe and correct operation, with the human operator acting as the ultimate monitor, kill switch, and recovery mechanism for brittle automation. However, as systems incorporate higher levels of autonomy, it will no longer be feasible or safe to rely solely on human operators for system monitoring and recovery. Therefore, “sandboxing,” bounding, or encapsulation of learning / adaptive systems must be developed and supported in the V&V process so that higher levels of autonomy can be deployed in operational environments without full, exhaustive testing. ◊ Lack of Evidence Re-use for V&V: ⇒ Results from TEVV do not, by themselves, accept operational risk, imply certification, or give authority to operate. However, TEVV results provide the collected body of evidence that is presented to a certification board, and ultimately the milestone decision authority (MDA), to determine an acceptable level of safety, security, performance, and risk for that specific platform. However, current assurances (arguments of safety and security that a system falls within an acceptable level of risk) are predominately manual, subjective, and often not reusable.

 Recommendations • 1. Methods & Tools Assisting in Requirements Development and Analysis ⇒ Precise, structured standards to automate requirement evaluation for testability, traceability, and de-confliction ⇒ This goal focuses on increasing the fidelity and correctness of autonomous system requirements by developing methods and tools to enable the generation of requirements that are, where possible, mathematically expressible, analyzable, and automatically traceable to different levels (or abstractions) of autonomous system design.


⇒ Formalized requirements enable automatic test generation and traceability to low-level designs, but note that TEVV representatives must be involved early in the requirements development process. • 2. Evidence-Based Design and Implementation ⇒ Assurance of appropriate decisions with traceable evidence at every level of design to reduce the current T&E burden ⇒ Methods and tools need to be developed at every level of design from architecture definition to modeling abstractions to software generation / hardware fabrication, enabling the compositional verification of the progressive design process, thereby increasing test and evaluation efficiency. ⇒ Quoting Tech Horizons: “Emphasis is on composability via system architectures based on fractionation and redundancy. This involves advancing methods for collaborative control and adaptive autonomous mission planning, as well as V&V of highly adaptable, autonomous control systems.” • 3. Cumulative Evidence through RDT&E, DT, & OT ⇒ Progressive sequential modeling, simulation, test and evaluation. ⇒ Methods must be developed to record, aggregate, leverage, and reuse M&S and T&E results throughout the system’s engineering lifecycle; from requirements to model-based designs, to live virtual constructive experimentation, to open-range testing. [We need] standardized data formats and Measures of Performance (MOPs) to encapsulate experimental results performed in early research and development, ultimately reducing the factor space in final operational tests. ⇒ Additionally, statistics-based design of experiments [DOE] methods currently lack the mathematical constructs capable of designing affordable test matrices for non-deterministic autonomous software. [New methods are needed] • 4. Run Time Behavior Prediction and Recovery ⇒ Real-time monitoring, just-in-time prediction and mitigation of undesired decisions and behaviors ⇒ For highly complex autonomous systems, pre-fielding testing may not be enough. An alternate method leveraging a run-time architecture must be developed that can provably constrain the system to a set of allowable, predictable, and recoverable behaviors, shifting the analysis/test burden to a simpler, more deterministic run-time assurance mechanism (a minimal sketch of such a monitor appears after this list). • 5. Assurance Arguments for Autonomous Systems ⇒ Reusable assurance case based on previous evidence “building blocks” ⇒ [N]ot only do multiple new TEVV methods need to be employed to enable the fielding of autonomous systems, a new research area needs to be investigated in formally articulating and verifying that the assurance argument itself is valid. ⇒ [A] structured argument-based approach must be developed in coordination with and as an integral part of the Test and Evaluation Plan (TEP) and the Test and Evaluation Master Plan (TEMP), providing a claim of how the V&V


activities will endeavor to quantify risks and mitigation strategies to inform risk-acceptance decisions. • An engineering framework is presented ⇒ Requirements definition -> requirements analysis -> logical analysis -> design solution -> implementation -> integration -> verification -> validation -> transition ⇒ Requirements definition <-> transition via operational need and measures of effectiveness ⇒ Requirements analysis <-> validation via system and measures of performance ⇒ Logical analysis <-> verification via allocated functions and performance requirements ⇒ Design solution <-> integration via component interface and definition
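Recommendation 4 above calls for a run-time assurance mechanism that constrains the system to allowable, predictable, and recoverable behaviors. The hedged sketch below shows the basic shape of such a monitor: an envelope check with a simple fallback action. The envelope, commands, and recovery behavior are invented for illustration.

    # Toy run-time monitor in the spirit of run-time assurance: accept the complex
    # autonomy's command only if it stays inside a simple, analyzable safe envelope.
    SAFE_SPEED = 5.0            # assumed maximum safe speed (m/s)
    SAFE_ZONE = (0.0, 100.0)    # assumed allowed position band (m)

    def within_envelope(state, command):
        predicted_pos = state["pos"] + command["speed"] * 1.0   # 1 s lookahead
        return (abs(command["speed"]) <= SAFE_SPEED
                and SAFE_ZONE[0] <= predicted_pos <= SAFE_ZONE[1])

    def run_time_assured(state, autonomy_command):
        if within_envelope(state, autonomy_command):
            return autonomy_command          # trust the complex autonomy
        return {"speed": 0.0}                # simple, verifiable recovery action

    state = {"pos": 98.0}
    print(run_time_assured(state, {"speed": 4.0}))   # rejected: would leave the zone
    print(run_time_assured(state, {"speed": 1.0}))   # accepted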


Visnevski, N. (2008). Embedded instrumentation systems architecture. Embedded Systems Laboratory, Niskayuna, NY.

EISA (Embedded Instrumentation Systems Architecture) is a control architecture for integrating embedded test equipment and synthetic (abstract/virtual) instruments into a centralized monitoring unit. Complex systems can involve a great number of measurement sensors and smart systems which may not be well integrated. This can greatly impede users’ ability to control and monitor the status of the system overall, where comparisons between different subsystems are required. Many of these subsystems may be legacy systems with little flexibility and little compliance to standards. Use of wrapper functions to build compliance to standards and centralization can greatly improve users’ abilities to coordinate and compare different subsystems.

 Challenges • T&E, V&V over entire lifecycle ⇒ “[The DOD’s] needs include developmental, operational, and continuous T&E of military weapons and equipment to ensure their operational readiness both at the test ranges and over the entire lifecycle of the assets.” • Systems may use legacy sensors not compliant with current standards (like IEEE 1451) ⇒ “the job of these nodes is to aggregate data from a variety of embedded legacy of smart sensors…” ⇒ “Legacy software often lacks flexibility to program complex equations …” • Aggregate data from a large network of heterogeneous sensors ⇒ “…data aggregation from a large network of heterogeneous sensors in a time synchronized and correlated fashion.” ⇒ “There are approximately 175 sensors in [the demonstrative implementation]. The complexity of the data acquisition system is the cause of several challenges for the MEPS testers.” ⇒ “… it is difficult to reconstruct event data at the time of the fault from different instruments because the data is not synchronized to a global time stamp.” ⇒ “Each instrument requires a separate calibration and configuration process…” ⇒ “Each instrument utilizes its own user authentication and data control process. Developing a comprehensive data security process is difficult.” ⇒ “… separate data acquisition systems produced data at different data rates that were not correlated, not time synchronized, were stored in different databases, and posed challenges for post-processing.”

 Recommendations • Use of virtual sensors where physical sensors cannot exist ⇒ “[Virtual sensors] are virtual data collection points for which physical sensor (sic) does not exist.” • Standardization and centralization


◊ Using standardized (e.g. IEEE 1451 compliance) wrappers ⇒ “This enabled continuous and integrated data and metadata aggregation for the entire test system. The test data was automatically time synchronized and stored in a single database, greatly simplifying post-test analysis.”
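A hedged sketch of the wrapper-and-centralize idea: legacy sensors with incompatible interfaces are wrapped so that every reading carries a common schema and a global time stamp before landing in a single store. The class names and fields are ours for illustration and do not reproduce the EISA or IEEE 1451 schemas.

    import time

    class LegacyTempSensor:
        # Legacy interface: returns a bare value with no metadata or time stamp.
        def poll(self):
            return 21.5

    class WrappedSensor:
        # Standardizing wrapper: adds an ID, units, and a global time stamp.
        def __init__(self, sensor_id, legacy_sensor, units):
            self.sensor_id, self.legacy, self.units = sensor_id, legacy_sensor, units

        def read(self):
            return {
                "sensor_id": self.sensor_id,
                "value": self.legacy.poll(),
                "units": self.units,
                "timestamp": time.time(),   # single clock enables correlation
            }

    central_store = []
    for sensor in [WrappedSensor("temp-01", LegacyTempSensor(), "degC")]:
        central_store.append(sensor.read())
    print(central_store)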


Visnevski, N. A., & Castillo-Effen, M. (2010). Evolutionary computing for mission-based test and evaluation of unmanned autonomous systems. 2010 IEEE Aerospace Conference, Big Sky, MT, 1-10.

The authors suggest a dynamic system for the test and evaluation of autonomous systems, in which the test ‘evolves’ along with the algorithm in order to find new exploits and points of failure in a system under test, a “system-test co-evolution”. With so many variables that could change in a real operational environment, the authors suggest viewing these challenges as a Design of Experiments (DoE) problem and recommend “co-evolution” as a principle to be used in modeling and simulation (M&S) in order to test systems with high degrees of complexity. Traditional models of testing, even the stochastic-optimization approaches the authors consider the best among current methods, do not allow changes in the system being tested to take place. The authors provide a framework for the components of a co-evolutionary M&S environment and present a case study using a 3D Computer-Generated Forces (CGF) model from MÄK Technologies.

 Challenges • UAS designs may be able to “outsmart” tests that are static, and after repeat tests be able to find ways to exploit them in unintended ways ⇒ “This idea is rooted in cognitive psychology and is based on the fact that human subjects can understand and “outsmart” fixed tests that are presented to them repeatedly. Therefore, a more refined evaluation strategy includes testing infrastructure that recognizes the fact that test subjects can evolve to “outsmart” the test.” • Testing unmanned systems can present high costs in live environments ⇒ “One of the serious stumbling blocks to development of sophisticated T&E methodologies for UAS is the scarcity and the high cost of test subjects.” (1) ⇒ “This [live] type of simulation is the most demanding in terms of resources, range safety, instrumentation, etc. Although the simulation results may be considered realistic because they involve the actual physical system, the simulated operations are designed with many constraints.” • Traditional Measures of Effectiveness of a system (MoE) may not match with Measures of Performance (MoP), an example being the Predator UAS ⇒ “Real-life cases have demonstrated that the traditional approach to T&E based on the verification of performance requirements do not work properly for complex systems and that a paradigm shift is necessary. An example that is often cited to support this notion is the Predator MQ-1 UAS, which failed operational T&E but proved extremely useful on the battlefield.” • Current methods of stochastic testing allow changes in the test environment only


⇒ Though the authors recommend stochastic optimization as the best technique currently to test UAS systems, specifically evolutionary programming and evolutionary multi-objective optimization, they take issue with only allowing the test to evolve and not the system being tested as well, “…the static evolutionary model assumes dealing with fixed assets of systems under test that do not change over the course of the test.”

 Recommendations • I - Stochastic optimization works best among current methods ⇒ Stochastic optimization is the best technique currently to test UAS systems, specifically evolutionary programming and evolutionary multi-objective optimization. ⇒ “Among those, evolutionary programming and evolutionary multi-objective optimization seem particularly appropriate from both the technical as well as the intuitive/esthetic standpoints. From the technical standpoint evolutionary computing has been shown to be fairly adaptable to imprecisions and fuzziness in the inputs and in the problem formulation methods. From an esthetic standpoint one might argue that evolutionary computing approach to the problem of test planning is intuitively elegant in a sense that it supports the paradigm of test plans “evolving”, over the course of multiple experiments, to unveil more and more hidden and non-obvious limitations of systems under test.” • II - Co-evolving tests in a simulated environment offer greater variability and efficiency ⇒ “Unlike the static evolutionary model, ‘system-test co-evolution’ model assumes that test involves variability in both the extrinsic and the intrinsic parameters. This means that the systems under test are allowed to evolve with the test.” ⇒ “The ‘system-test co-evolution’ model assumes that test involves variability in both the extrinsic and the intrinsic parameters.” ⇒ “…two kinds of independent variables that may be selected in the test synthesis process – the intrinsic and the extrinsic ones. The former have to do with the variability in capabilities of systems under test, and the latter, with the variability in the environment and mission parameters used in the conducted test.” ⇒ Can bring down costs of testing UAS in general using modeling and simulation (M&S) by offering a wider variety of scenarios, “Although the simulation results may be considered realistic because they involve the actual physical system, the simulated operations are designed with many constraints.” ⇒ Achieves “a minimal set of experiments which yields the maximum information with respect to the hypothesis that need to be tested” • III - Use the Mission and Means Framework (M&M)


⇒ May contribute to a “hierarchical relationship between mission effectiveness, tasks, capabilities and system components”. ⇒ Uses tasks to be completed as a basis for metrics, i.e., “The RoboCup Virtual Rescue competition represents a good test-case [of M&M], where mission-based capability driven UAST could be applied. Although the scenario is simulated, metrics such as number of victims found within certain time or energy constraints are related directly to measures of effectiveness. The mission is composed of a number of tasks such as: exploring, searching for victims, reporting victims, etc. Similarly, tasks may be performed only if certain capabilities are present.” • IV - Use co-evolving tests to test complex systems in unknown environments and for test planning ⇒ Co-evolving tests can adapt to changes in the system under test and create simulations which can test highly complex systems. “The most complex tests at the ‘Systems of Systems’ (SoS) level are basically impossible to be executed as live simulations.” ⇒ Co-evolving tests “will allow not only finding challenging test scenarios that uncover system design limitations, but also providing hints to UAS developers on how their systems can be modified to overcome these limitations.” ⇒ Further, “it will account for the growing machine-cognitive aspect of UAS. It is our belief that eventually we will see machines that can learn and display sophisticated cognitive capabilities. This may enable them to ‘figure out’ the test as it is being conducted and ‘outsmart’ it.” ⇒ “To maximize the efficiency of physical tests in terms of time and resources, test planning is crucial. The only way to avoid the curse of dimensionality in designing experiments with a large number of independent variables is by including as much knowledge of the process as possible. This is where modeling and simulation is most valuable. M&S can be seen as the main vehicle for incorporating knowledge about the system.” • V - Develop an M&S tool for “system-test co-evolution” ◊ The recommended tool should have three components: ⇒ Scenario Generator: “accesses libraries of models, which should be verified, validated, and accredited. There are two libraries of such models: one library with models of UAS and models of other entities relevant to the missions being simulated, and another library with models of the environment. Experiments generated by the search engine altogether with models are used to generate so-called scenarios” ⇒ Effectiveness Evaluator: “uses metrics defined by the testers to evaluate the probability of mission success/failure. Since the simulation engine is stochastic, the outcomes for a certain experiment may vary for different iterations.”


⇒ Search Engine: “the evolutionary computation based search engine may be considered the core of the automated test planner. The search may start from a set of randomized experiments. The main function of the search engine is to generate a new set of experiments using previous experiments and their outcomes.” ⇒ “The overall function of the three components is to generate scenarios with increasing difficulty for the systems under test. Hence, only extrinsic independent variables may be manipulated.”
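A minimal sketch of the three-component loop, assuming toy stand-ins for the simulation engine and metrics: the search engine evolves only the extrinsic scenario variables, the scenario generator instantiates them, and the effectiveness evaluator scores mission success. The variable names, the mission model, and the evolution scheme are our assumptions, not the authors' tool.

    import random

    random.seed(3)

    def run_mission(scenario):
        # Stand-in for the stochastic simulation engine plus effectiveness
        # evaluator: success rate falls as wind and clutter grow.
        difficulty = 0.1 * scenario["wind"] + 0.05 * scenario["clutter"]
        return sum(random.random() > difficulty for _ in range(20)) / 20

    def evolve(population, n_keep=5):
        # Search engine: keep the scenarios the system handles worst, then mutate.
        population.sort(key=run_mission)            # lowest success rate first
        keep = population[:n_keep]
        children = [{"wind": max(0.0, s["wind"] + random.gauss(0, 1)),
                     "clutter": max(0.0, s["clutter"] + random.gauss(0, 1))}
                    for s in keep]
        return keep + children

    # Only extrinsic variables (environment/mission parameters) are evolved here.
    population = [{"wind": random.uniform(0, 5), "clutter": random.uniform(0, 5)}
                  for _ in range(10)]
    for _ in range(10):
        population = evolve(population)
    population.sort(key=run_mission)
    print("hardest scenario found:", population[0])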


Wegener, J., & Bühler, O. (2004). Evaluation of Different Fitness Functions for the Evolutionary Testing of an Autonomous Parking System. In: Deb K. (eds). Genetic and Evolutionary Computation – GECCO 2004. GECCO 2004. Lecture Notes in Computer Science, vol 3103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24855-2_160

“The method of evolutionary functional testing allows for the automation of testing by transforming the test case design into an optimization problem. For this aim it is necessary to define a suitable fitness function. In this paper two different fitness functions are compared for the testing of an autonomous parking system.”

 Challenges • Manual testing can limit the number and quality of test cases ⇒ “Electronic control units (ECUs) in cars take over more and more complex tasks. … For such applications errors in the ECU's software can result in high costs. Therefore, the aim is to find as many errors as possible by testing the systems before they are released. In practice, dynamic testing is the analytical quality assurance method most commonly used. Usually, a complete test is infeasible, because of the huge number of possible input situations. … In most cases, test case design is performed manually, requiring a considerable part of the project’s resources.” • Available tests are not easy to automate ⇒ “One important weakness of the testing methods available is that they cannot be automated straightforwardly. Manual test case design, however, is time-intensive and error-prone. The test quality depends on the performance of the tester.” • Evolutionary Testing requires an appropriate fitness function ⇒ “The method of evolutionary functional testing allows for the automation of testing by transforming the test case design into an optimization problem. For this aim it is necessary to define a suitable fitness function.”

 Recommendations • Automation, specifically with evolutionary functional testing ⇒ “The method of evolutionary functional testing facilitates the generation of test cases in order to detect functional errors during a directed search. The method of evolutionary functional testing transforms the test case design process into an optimization problem. Automated test result evaluation is a prerequisite for this process. The evaluation is done by means of the fitness function, which assigns a numerical quality value to a test result.” • Tests should be systematic and extensively automatable ⇒ “In order to increase the effectiveness and efficiency of the test and thus to reduce the overall development and maintenance costs for systems, a test should


be systematic and extensively automatable. Both objectives are addressed by the method of evolutionary testing.” • Implied that the most appropriate fitness function is task/test specific. For the use case of the automated parking system, a fitness function based on an area criterion outperformed a fitness function based on a distance criterion. ⇒ “This paper evaluates two different approaches to the definition of fitness functions for the functional testing of an autonomous parking system. … The results show that of both the criteria proposed in this paper, the area criterion can identify critical parking maneuvers better than the distance criterion introduced in. The area criterion provides a more efficient method of error detection in the parking system.”
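A loosely inspired sketch of the two kinds of fitness function compared in the paper: a distance criterion based on the closest approach to a collision region, versus an area-style criterion that also accumulates how long the trajectory stays close. The definitions below are simplified stand-ins, not the paper's exact formulations; lower values indicate a more critical parking maneuver.

    import math

    OBSTACLE = (5.0, 0.0)   # assumed corner of the parking-space boundary

    def distance_fitness(trajectory):
        # Distance criterion: closest approach to the collision region.
        return min(math.dist(p, OBSTACLE) for p in trajectory)

    def area_fitness(trajectory, radius=2.0):
        # Area-style criterion: accumulate (radius - distance) over every point
        # inside the critical radius, approximating an "area of closeness".
        return -sum(max(0.0, radius - math.dist(p, OBSTACLE)) for p in trajectory)

    trajectory = [(step / 10.0, 1.0 - step / 50.0) for step in range(80)]
    print("distance criterion:", round(distance_fitness(trajectory), 3))
    print("area-style criterion:", round(area_fitness(trajectory), 3))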


Zhou, Z., & Sun, L. (2019). Metamorphic testing of driverless cars. Communications of the ACM, 63(3), 61-67.

The authors address a major challenge in autonomous systems: providing assurance that bad behaviors will not happen. Conventionally, this requires testing large numbers of cases for which the correct answer is known. To combat this problem, the authors recommend the use of metamorphic testing with fuzzed inputs. For problematic behaviors in AS, this boils down to looking for inconsistent behavior across what should be functionally identical problems (e.g., selecting a steering direction under different lighting conditions). Rather than testing whether the system always selects the correct steering angle, which requires knowing the correct angle in every situation, tests can cover the space more efficiently by looking for cases where the system gives inconsistent answers to fuzzed scenarios, a check that can be automated.

 Challenges • The Oracle Problem ⇒ “Software testing is, however, fundamentally challenged by the “oracle problem.” An oracle is a mechanism testers use to determine whether the outcomes of test-case executions are correct. Most software testing techniques assume an oracle exists. However, this assumption does not always hold when testing complex applications.” • System Requirements/Specifications ⇒ “…difficulty of creating detailed system specifications against which the autonomous car’s behavior can be checked, as it essentially involves recreating the logic of a human driver’s decision making.” • Proving the Null Hypothesis ⇒ “…negative testing serves to ensure the program does not do what it is not supposed to do when the input is unexpected, normally involving random factors or events. Resource constraints and deadline pressures often result in development organizations skipping negative testing, potentially allowing safety and security issues to persist into the released software.” ⇒ “To a certain degree, tools called “fuzzers” could help perform this kind of negative software testing. During “fuzzing,” or “fuzz testing,” the fuzzer generates a random or semi-random input and feeds it into the system under test, hoping to crash the system or cause it to misbehave. However, the oracle problem makes verification of the fuzz test results (outputs for millions of random inputs) extremely difficult, if not impossible.”

 Recommendations • Use “Metamorphic Testing” & Fuzzing Together ⇒ “Our testing method: MT in combination with fuzzing.”


⇒ “Metamorphic testing (MT) is a property-based software-testing technique that can effectively address two fundamental problems in software testing: the oracle problem and the automated test-case-generation problem. The main difference between MT and other testing techniques is that the former does not focus on the verification of each individual output of the software under test and can thus be performed in the absence of an oracle. MT checks the relations among the inputs and outputs of multiple executions of the software. Such relations are called “metamorphic relations” (MRs) and are necessary properties of the intended program’s functionality.” ⇒ “Their MRs required the drone should have consistent behavior, while finding that in some situations the drone behaved inconsistently, revealing multiple software defects. For example, one of the bugs was in the sense-and-avoid algorithm, making the algorithm sensitive to certain numerical values and hence misbehavior under certain conditions, causing the drone to crash.”
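A hedged sketch of one metamorphic relation of the kind described above: a steering model's predictions on an image and on a darkened copy of the same image should agree within a tolerance, so no ground-truth steering angle is needed. The toy model, transformation, and tolerance are our assumptions standing in for a real perception stack.

    import numpy as np

    rng = np.random.default_rng(4)

    def toy_steering_model(image):
        # Placeholder perception stack: steers toward the brighter half of the image.
        half = image.shape[1] // 2
        left, right = image[:, :half].mean(), image[:, half:].mean()
        return float(np.clip((right - left) * 90.0, -30.0, 30.0))   # degrees

    def violates_mr(image, tolerance_deg=2.0, darken=0.6):
        # Metamorphic relation: darkening the scene should not change the steering
        # decision by more than the tolerance.
        original = toy_steering_model(image)
        transformed = toy_steering_model(image * darken)
        return abs(original - transformed) > tolerance_deg

    violations = sum(violates_mr(rng.random((32, 64))) for _ in range(100))
    print(f"{violations} metamorphic-relation violations in 100 fuzzed images")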


Appendix B: Challenge & Recommendation Summary Tables

The following tables summarize the challenges and recommendations discussed in this paper.


Table 1: Challenges from System Complexity
Sub-Category | Detailed Challenge | Supporting Articles
Task Complexity | Defining System Tasks | Baker et al. (2019), Deonandan et al. (2010), Hernandez-Orallo (2017), Porter et al. (2020), Roske et al. (2012)
Task Complexity | Dynamic or Flexible Tasking | Ahner & Parson (2016), Ahner et al. (2018), Baker et al. (2019), Goerger (2004), Harikumar & Chan (2019), Haugh et al. (2018), Hill & Thompson (2016), Ilachinski (2017), Laverghetta et al. (2019), Lenzi et al. (2010), Luna et al. (2013), Macias (2008), Micskei et al. (2012), USAF Chief Scientist (2011), Zacharias (2019a)
State-Space Explosion | Cannot Test Exhaustively | Baker et al. (2019), Ahner & Parson (2016), Deonandan et al. (2010), Defense Science Board (2012), Giampapa (2013), Goerger (2004), Haugh et al. (2018), Ilachinski (2017), Lennon & Davis (2018), Menzies & Pecheur (2004), Micskei et al. (2012), Porter et al. (2018), Porter et al. (2020), Sparrow et al. (2018), Tate & Sparrow (2019), USAF Chief Scientist (2011), Wegener & Buhler (2004), Zacharias (2019b)
Stochasticity | Environments are Unpredictable | Ahner & Parson (2016), Defense Science Board (2012), Greer (2013), Ilachinski (2017), Lennon & Davis (2018), Lenzi et al. (2010), Menzies & Pecheur (2004), Micskei et al. (2012), Ring (2009), Subbu et al. (2009)
Stochasticity | System Responses are Non-Deterministic | Ahner & Parson (2016), Goerger (2004), Greer (2013), Hernandez-Orallo (2017), Ilachinski (2017), Laverghetta et al. (2019), Macias (2008), Menzies & Pecheur (2004), Micskei et al. (2012), Subbu et al. (2009)
Learning Enabled Systems | Evaluating How, Not What, System Learns | Ahner & Parson (2016), Hernandez-Orallo (2017), Ilachinski (2017)
Learning Enabled Systems | Tested System Stops Being Representative | Ahner & Parson (2016), Haugh et al. (2018), Menzies & Pecheur (2004), Micskei et al. (2012), Visnevski & Castillo-Effen (2010)
Learning Enabled Systems | Negative Learning | Ahner & Parson (2016), Haugh et al. (2018), Ilachinski (2017)
Learning Enabled Systems | Misaligned Testing vs. Learning Timescales | Ahner & Parson (2016)
Learning Enabled Systems | Data Bias | Haugh et al. (2018)
Learning Enabled Systems | Data Poisoning | Haugh et al. (2018), Qiu et al. (2019), Streilein et al. (2019)
Learning Enabled Systems | Training Data Operational Representativeness | Haugh et al. (2018)
Multiple Agents or Components | Emergent Behavior | Baker et al. (2019), Ahner & Parson (2016), Deonandan et al. (2010), Defense Science Board (2016), Ferreira et al. (2013), Greer (2013), Harikumar & Chan (2019), Haugh et al. (2018), Ilachinski (2017), Lennon & Davis (2018), Lenzi et al. (2010), Luna et al. (2013), Mueller et al. (2019), Scheidt (2017)
Multiple Agents or Components | Legacy Interoperability | Deonandan et al. (2010), Eaton et al. (2017), Harikumar & Chan (2019), Visnevski (2008)
Multiple Agents or Components | Integrating Test Challenges of Cyber & Physical Systems | Eaton et al. (2017), Harikumar & Chan (2019), Ilachinski (2017), Luna et al. (2013)
Novelty | Unknown Unknowns | Deonandan et al. (2010), Harikumar & Chan (2019), Luna et al. (2013), Scheidt (2017)
Novelty | Ability to Reuse Evidence | Ahner & Parson (2016), Deonandan et al. (2010), Durst (2019), Giampapa (2013), Lede (2019), Lennon & Davis (2018)
Novelty | Coevolved Capabilities | Haugh et al. (2018), Hill & Thompson (2016)
Transparency | Need to Understand System Decision Making | Ahner & Parson (2016), Arnold & Scheutz (2018), Giampapa (2013), Gunning (2017), Haugh et al. (2018), Lede (2019), Porter et al. (2020), Subbu et al. (2009), Tate & Sparrow (2020)
Transparency | Knowing Ground Truth and/or System Belief | Ahner & Parson (2016), Haugh et al. (2018), Roske et al. (2012), Zhou & Sun (2019)


Table 2: Acquisition System Limitations
(Sub-Category / Detailed Challenge: Supporting Articles)

Requirements
  Operational Relevance: Ahner & Parson (2016), Kapinski et al. (2016), Micskei et al. (2012), Durst & Gray (2014), Schultz et al. (1993), Visnevski & Castillo-Effen (2010), Zhou & Sun (2019)
  Unverifiable Requirements: Hess & Valerdi (2010), Lede (2019), Micskei et al. (2012), Durst & Gray (2014), Zhou & Sun (2019)
  Rigidity of Requirements: Ahner & Parson (2016), Deonandan et al. (2010), Lede (2019), Luna et al. (2013), McLean et al. (2018)
  Defining LME Behavior: DoD (2019), Hill & Thompson (2016), Roske et al. (2012), Scheidt (2017)

Processes
  Cumbersome, Slow, or Non-adaptive: Ahner & Parson (2016), Hess & Valerdi (2010), Ilachinski (2017), Macias (2008), McLean et al. (2018), Tate & Sparrow (2018)
  Determining AI Test Adequacy: Hess & Valerdi (2010), Porter et al. (2018)
  "Gameable" Tests: Arnold & Scheutz (2018), Hernandez-Orallo (2017), Visnevski & Castillo-Effen (2010)

Stovepiping
  Effort Coordination: Deonandan et al. (2010), Haugh et al. (2018), Hernandez-Orallo (2017), McLean et al. (2018), Porter et al. (2018), Porter et al. (2020), Ring (2009)
  Cradle-to-Grave Tester Participation Needed: Ahner & Parson (2016), Visnevski (2008), Porter et al. (2018)
  Need Iterative Testing or Continuum of Testing: Haugh et al. (2018), Hess & Valerdi (2010), McLean et al. (2018), Porter et al. (2018), Visnevski (2008)
  Resourcing Conflicts: Ahner & Parson (2016), Goerger (2004), Macias (2008), McLean et al. (2018), Visnevski & Castillo-Effen (2010), Zhou & Sun (2019)


Table 3: Lack of Methods, Tools, or Infrastructure
(Sub-Category / Detailed Challenge: Supporting Articles)

Method Needs
  Black Box Testing and Generalization: AFSAB (2017), Laverghetta et al. (2019), Lennon & Davis (2018), Porter (2019)
  Evaluating Decision Model Adequacy: AFSAB (2017), Ahner & Parson (2016), Porter et al. (2020)
  Formal Methods are Inadequate: Defense Science Board (2012), Giampapa (2013), Goerger (2004), Kapinski et al. (2016), Menzies & Pecheur (2004), Micskei et al. (2012), USAF Chief Scientist (2011)
  V&V Processes are Inadequate: Goerger (2004), Ilachinski (2017), Lennon & Davis (2018), Menzies & Pecheur (2004), USAF Chief Scientist (2011), Schultz et al. (1993)
  Multi-Causal or Multi-Agent Performance Attribution: Ahner & Parson (2016), Baker et al. (2019), Goerger (2004), Greer (2013), Harikumar & Chan (2019), Haugh et al. (2018), Porter et al. (2020), Roske et al. (2012), Sparrow et al. (2018), Visnevski (2008)
  Test Point Prioritization or Sequential Test: Ahner & Parson (2016), Deonandan et al. (2010), Hernandez-Orallo (2017), Hess & Valerdi (2010), Micskei et al. (2012), Miller (2019), Mullins et al. (2017), Porter et al. (2018), Porter et al. (2020), Simpson (2020), Sparrow et al. (2018)
  Efficient Regression Testing: Ahner & Parson (2016), Defense Science Board (2016), Haugh et al. (2018), Ilachinski (2017), Luna et al. (2013), McLean et al. (2018)
  Compositional Verification: Lede (2019), Luna et al. (2013), USAF Chief Scientist (2011), Durst & Gray (2014)
  Integrative Verification: Scheidt (2017), Porter et al. (2020), Porter & Wojton (2020), Stracuzzi et al. (2020)

Scalability
  High-Fidelity Simulations: Brabbs et al. (2019), Durst (2019), Kwashnak (2019), Lenzi et al. (2010), Mullins et al. (2017), Sparrow et al. (2018), Subbu et al. (2009)
  Subjective SME Evaluations: Goerger (2004), Hernandez-Orallo (2017)
  Manual Testing: Kapinski et al. (2016), Laverghetta et al. (2019), Schultz et al. (1993), Visnevski & Castillo-Effen (2010), Wegener & Buhler (2004)
  Multi-Agent Testing: Baker et al. (2019), Giampapa (2013), Lenzi et al. (2010), Luna et al. (2013)

Instrumentation
  Need Decision Traceability: Ahner & Parson (2016), Defense Science Board (2016), Haugh et al. (2018), Porter et al. (2020), Sparrow et al. (2018)
  Test Event Replication: Ahner & Parson (2016), Defense Science Board (2016), Goerger (2004), Laverghetta et al. (2019), Visnevski (2008)

Simulation
  Defining Necessary Fidelity/Resolution: Brabbs et al. (2019), Caseley (2018), Ahner & Parson (2016), Durst (2019), Gil & Selman (2019), Greer (2013), Kapinski et al. (2016), Kwashnak (2019), Laverghetta et al. (2019), Lenzi et al. (2010), Mueller et al. (2019), Porter et al. (2020), Sparrow et al. (2018), Visnevski & Castillo-Effen (2010)
  Extremely Difficult to Model Sufficiently: Ahner & Parson (2016), Goerger (2004)
  Injecting Valid Inputs: Ahner & Parson (2016), Goerger (2004), Laverghetta et al. (2019), Porter (2020b)

Ranges or Infrastructure
  Inadequate Range Instrumentation for AI: Ahner & Parson (2016), Lennon & Davis (2018), Miller (2019), Tate & Sparrow (2019), Zacharias (2019a), Zacharias (2019b)
  Need Testbeds for AI: Lennon & Davis (2018)
  No National Lab for AI: Gil & Selman (2019)

Personnel
  Identification, Recruitment, & Retention of AI Skillsets: Ahner & Parson (2016), Lennon & Davis (2018), Macias (2008), Roske et al. (2012), Wegener & Buhler (2004), Zacharias (2019a)
  Workforce Capacity: AFSAB (2017), Haugh et al. (2018)


Table 4: Safety & Security Issues
(Sub-Category / Detailed Challenge: Supporting Articles)

Safety
  System Decision Brittleness: Ahner & Parson (2016), Lennon & Davis (2018)
  Elevated Risk with Autonomy: Arnold & Scheutz (2018), Deonandan et al. (2010), Haugh et al. (2018), Laverghetta et al. (2019), McLean et al. (2018), Micskei et al. (2012)
  Obtaining Safety Releases: McLean et al. (2018)
  Ensuring Range Safety: Ahner & Parson (2016), McLean et al. (2018)
  Need Run-time Assurance/Self-Monitoring: Ahner & Parson (2016), Arnold & Scheutz (2018), Defense Science Board (2012), Harikumar & Chan (2019), Laverghetta et al. (2019), Lede (2019), Lennon & Davis (2018), Menzies & Pecheur (2004)

Security
  Novel Cyber Exploitation / Adversarial Machine Learning: Ahner & Parson (2016), Eaton et al. (2017), Goodfellow et al. (2015), Haugh et al. (2018), Qiu et al. (2019), Streilein et al. (2019)
  Protecting System Model/Software: Qiu et al. (2019), Streilein et al. (2019)
  Tactical Exploitability: Haugh et al. (2018)


Table 5: Lack of Policy, Metrics, & Standards
(Sub-Category / Detailed Challenge: Supporting Articles)

Policy
  Recertification After Regulations Change: Ahner & Parson (2016)

Metrics
  Traditional Metrics Are Insufficient: Eaton et al. (2017), Hernandez-Orallo (2017), Ilachinski (2017), Laverghetta et al. (2019), Macias (2008), Mueller et al. (2019), Roske et al. (2012), Visnevski & Castillo-Effen (2010)
  Multi-Agent/SoS Performance & Interoperability: Deonandan et al. (2010)
  Good Decisions: Ahner & Parson (2016), Hess & Valerdi (2010), Laverghetta et al. (2019), Roske et al. (2012)
  Measuring Trust: AFSAB (2017), Ahner & Parson (2016), Durst (2019), Ilachinski (2017)
  Perceptual Accuracy Metrics: Ahner & Parson (2016), Giampapa (2013), Harikumar & Chan (2019), Roske et al. (2012)
  System Learning: Ahner & Parson (2016)

Standards
  No Common Modeling Framework/Standards: Ahner & Parson (2016), Goerger (2004), Lennon & Davis (2018), Durst & Gray (2014), Ring (2009), Visnevski (2008)
  Lack of Widely Used Common Architectures: Durst & Gray (2014)
  Lack Benchmarks Relevant to DoD: Hernandez-Orallo (2017), Hess & Valerdi (2010), Mueller et al. (2019), Roske et al. (2012)


Table 6: HSI Challenges
(Sub-Category / Detailed Challenge: Supporting Articles)

Trust
  Ensuring Appropriately Calibrated Trust: Gunning (2017), Haugh et al. (2018), Porter et al. (2020), Wojton et al. (2020)
  Lack of Common Definition: Durst (2019)

Teaming
  Assessing Alignment of Humans & Machines (e.g., goals, SA): Ahner & Parson (2016), Endsley (2019), Haugh et al. (2018), Ilachinski (2017), Lennon & Davis (2018)
  Emergent Human Behavior: Harikumar & Chan (2019), Ilachinski (2017), Porter et al. (2020)
  Measuring Team Performance: Ahner & Parson (2016), Deonandan et al. (2010), Greer (2013), Haugh et al. (2018), Ilachinski (2017), Porter et al. (2020)

Meaningful Human Control
  Complacency or Inattention: Caseley (2018), Ahner & Parson (2016), Porter (2020a)
  Inability to Predict Performance: Arnold & Scheutz (2018), Ahner & Parson (2016), Ferreira et al. (2013), Gunning (2017), Ilachinski (2017), Subbu et al. (2009)
  Operations Exceed Reaction Time: Arnold & Scheutz (2018), Caseley (2018), McLean et al. (2018), Porter (2020a)


Table 7: Requirements, Design, & Development Pipeline
(Sub-Category / Detailed Recommendation: Supporting Articles)

Requirements
  Testable Requirements: Ahner & Parson (2016), Lede (2019), Lennon & Davis (2018)
  Methods for Writing Requirements: Ahner & Parson (2016), Lede (2019), Lennon & Davis (2018)
  Address with Assurance Argument: Ahner & Parson (2016), Lede (2019), Lennon & Davis (2018), Scheidt (2017), Sparrow et al. (2018)

Assurance Aiding Designs
  Safety/Control Middleware: Arnold & Scheutz (2018), Thuloweit (2019)
  Built-in Transparency and Traceability: Defense Science Board (2012), Gunning (2017), Haugh et al. (2018), Luna et al. (2013), Micskei et al. (2012), Porter et al. (2020), Sparrow et al. (2018), Zacharias (2019b)
  Explainable AI: Gunning & Aha (2019), Haugh et al. (2018), Mueller (2019), Porter et al. (2020), Porter (2020a)
  Modularity: AFSAB (2017), Roske et al. (2012), Porter et al. (2020), Scheidt (2017), Atherton (2019)
  Open Architectures: AFSAB (2017), Eaton et al. (2017), Lenzi et al. (2010), Durst & Gray (2014), Porter et al. (2020), Zacharias (2019a)
  Reuse Certified Capabilities: Caseley (2018)
  Blind Evaluation Metrics: Hernandez-Orallo (2017)
  A Priori Risk Analyses: Deonandan et al. (2010), Ferreira et al. (2013), Harikumar & Chan (2019)

Design and Development Processes
  Iterative or Cyclic Development Paradigms: AFSAB (2017), Defense Science Board (2016)
  Model-Based Design: Ahner & Parson (2016), Kapinski et al. (2016)
  Statistical Engineering: Ahner & Parson (2016)
  Digital Twins: Arnold & Scheutz (2018)

Coordinate, Integrate, Extend Activities
  Desegregate CT, DT, & OT: AFSAB (2017), Ahner & Parson (2016), Deonandan et al. (2010), McLean et al. (2018), Porter et al. (2018), Roske et al. (2012)
  Early Operator Involvement: Defense Science Board (2012), Gunning (2017), Mueller (2019)
  Earlier Realistic Testing: Defense Science Board (2012)
  Extend Test Throughout Lifecycle: Ahner & Parson (2016), Deonandan et al. (2010), Defense Science Board (2016), Ilachinski (2017), Porter et al. (2020)
  Iterative, Adaptive, or Sequential Development and Test: Ahner & Parson (2016), Defense Science Board (2016), Gunning (2017), McLean et al. (2018), Micskei et al. (2012), Mueller (2019), Porter et al. (2020), Simpson (2020), Sparrow et al. (2018), Visnevski & Castillo-Effen (2010)


Table 8: Improving Methods, Tools, Infrastructure, & Workforce
(Sub-Category / Detailed Recommendation: Supporting Articles)

Leverage Existing Methods
  Formal Methods: AFSAB (2017), Haugh et al. (2018), Luna et al. (2013)
  Complex Systems: Caseley (2018), Roske et al. (2012)
  Statistical Methods: Roske et al. (2012)

Performance Testing Methods
  Compositional Verification: Ahner & Parson (2016), Durst (2019), Harikumar & Chan (2019), Laverghetta et al. (2019)
  Integrative Verification: Durst (2019), Harikumar & Chan (2019), Porter et al. (2020)
  Task-Specific Indicators: Baker et al. (2019)
  Domain-General Indicators: Baker et al. (2019), Durst (2019), Harikumar & Chan (2019), Hernandez-Orallo (2017)
  Skill Growth/Transfer: Ahner & Parson (2016), Baker et al. (2019)

Prioritize Test Points
  By A Priori Risk: Deonandan et al. (2010)
  Using DOE: AFSAB (2017), Haugh et al. (2018), Hernandez-Orallo (2017), Hess & Valerdi (2010), Porter et al. (2020), Roske et al. (2012)
  Adaptively/Sequentially: Hess & Valerdi (2010), Miller (2019), Porter et al. (2020), Simpson (2020), Visnevski & Castillo-Effen (2010)
  Generic Efficiency: Luna et al. (2013)

Develop Methods
  Generic Call for Research: AFSAB (2017)
  Choosing/Defining Test Conditions: Ahner & Parson (2016), Arnold & Scheutz (2018), Defense Science Board (2012), Hernandez-Orallo (2017), Lenzi et al. (2010), Scheidt (2017), Zhou & Sun (2019)
  DOE Extensions: AFSAB (2017), Ahner & Parson (2016)
  Demystification of Decision Making: Ahner & Parson (2016), Gunning (2017)
  Improved M&S: Ahner & Parson (2016)
  Data Management: Ahner & Parson (2016), Visnevski (2008)

Develop Testing Tools
  Automate Performance Evaluation: Micskei et al. (2012), Wegener & Buhler (2004), Zhou & Sun (2019)
  Automate Space Exploration: AFSAB (2017)
  Iterate or Extend Tools: AFSAB (2017), Scheidt (2017), Miller Report, Streilein et al. (2019)

Develop Infrastructure
  Digital or LVC Testbeds: AFSAB (2017), Ahner & Parson (2016), Eaton et al. (2017), Gil & Selman (2019), Haugh et al. (2018), McLean et al. (2018), Micskei et al. (2012), Porter et al. (2020), Streilein et al. (2019)
  Range Realism: Defense Science Board (2012), Gil & Selman (2019)
  Range Instrumentation: Defense Science Board (2012)
  Centralize AI Research: Gil & Selman (2019), Hernandez-Orallo (2017)
  Establish Industry-Academy-Defense Partnerships: AFSAB (2017), Gil & Selman (2019)
  Knowledge-Sharing: Gil & Selman (2019), Hernandez-Orallo (2017)

Personnel
  Create AI Talent Pipeline: Gil & Selman (2019)
  Train Current Workforce: Goerger (2004), Ring (2009)
  Accreditation Practices: Goerger (2004)
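
Table 8's "Prioritize Test Points" rows recommend design of experiments and adaptive or sequential selection of test conditions. A minimal sketch of the sequential idea follows; the run_trial stub, the two notional factors, and the maximin distance heuristic are illustrative assumptions, not methods prescribed by the cited articles.

    import random

    def run_trial(condition):
        """Stand-in for an actual test event; returns a noisy performance score."""
        speed, clutter = condition
        return 1.0 - 0.1 * clutter + 0.05 * speed + random.gauss(0, 0.05)

    # Candidate test conditions: (speed level, clutter level) on a coarse grid.
    candidates = [(s, c) for s in range(5) for c in range(5)]

    def distance(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    # Seed the design with one arbitrary starting condition.
    tested = [candidates[0]]
    results = [run_trial(candidates[0])]

    budget = 8
    for _ in range(budget):
        # Space-filling heuristic: pick the untested condition farthest from
        # everything already tested (the largest remaining coverage gap).
        untested = [c for c in candidates if c not in tested]
        next_point = max(untested, key=lambda c: min(distance(c, t) for t in tested))
        tested.append(next_point)
        results.append(run_trial(next_point))

    for condition, score in zip(tested, results):
        print(condition, round(score, 3))

In a real program the distance heuristic would typically be replaced with a model-based criterion (for example, the prediction variance of a fitted surrogate), but the loop is the same: run a point, update what is known, and choose the next most informative condition.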


Table 9: Test Strategies
(Sub-Category / Detailed Recommendation: Supporting Articles)

Develop a Strategy Based on Existing Practices
  Inspiration: Current Practices: Defense Science Board (2016), Durst (2019), Roske et al. (2012)
  Inspiration: Behavioral Science: Durst (2019), Goerger (2004), Gunning (2017), Ilachinski (2017), Porter et al. (2018)
  Inspiration: Complex Software: Caseley (2018), Harikumar & Chan (2019), Ilachinski (2017), Durst & Gray (2014), Roske et al. (2012)

Automate Testing
  Automate Performance Evaluation: Micskei et al. (2012), Wegener & Buhler (2004), Zhou & Sun (2019)
  Automate Space Search: Haugh et al. (2018), Micskei et al. (2012), Schultz et al. (1993), Visnevski & Castillo-Effen (2010), Wegener & Buhler (2004), Zhou & Sun (2019)
  Use ML to Analyze AI: Micskei et al. (2012)

Coverage in Low Fidelity, Validate in High Fidelity
  Extensive Simulation: AFSAB (2017), Defense Science Board (2012), Haugh et al. (2018), Micskei et al. (2012), Visnevski & Castillo-Effen (2010), Wegener & Buhler (2004)
  LoFi: Abstracted System Model: Giampapa (2013), Greer (2013), Haugh et al. (2018), Micskei et al. (2012)
  LoFi: Agent-Based Modeling: Defense Science Board (2016), Greer (2013), Ilachinski (2017)
  LoFi: Look for Failures: Deonandan et al. (2010), Giampapa (2013), Greer (2013), Haugh et al. (2018), Luna et al. (2013), Schultz et al. (1993), Subbu et al. (2009), Visnevski & Castillo-Effen (2010), Wegener & Buhler (2004)
  LoFi: Look for Inconsistency: Harikumar & Chan (2019), Zhou & Sun (2019)
  LoFi: Look for Edge Cases: Durst (2019), Haugh et al. (2018)
  HiFi: Realistic Simulation: Defense Science Board (2016), Laverghetta et al. (2019), Visnevski & Castillo-Effen (2010)
  HiFi: Live Testing: Defense Science Board (2016), Laverghetta et al. (2019), Visnevski & Castillo-Effen (2010)

Build Body of Evidence
  Accumulate Over Time: Ahner & Parson (2016), Laverghetta et al. (2019), Lede (2019), Lennon & Davis (2018)
  Reuse Existing Evidence: Ahner & Parson (2016), Defense Science Board (2016), Lede (2019), Lennon & Davis (2018)

Run Time Assurance
  Focus: Certify Run Time Monitor: Arnold & Scheutz (2018), Ahner & Parson (2016), Lede (2019), Lennon & Davis (2018)
  Gapped Middleware: Arnold & Scheutz (2018)
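
Table 9's run-time assurance rows, and the "Focus: Certify Run Time Monitor" entry in particular, rest on the idea that a small monitor can be certified even when the full autonomy stack cannot. The sketch below shows that pattern under assumed, simplified interfaces; the state fields, the speed and obstacle limits, and the fallback behavior are all hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Action:
        speed: float      # commanded speed (m/s)
        heading: float    # commanded heading (degrees)

    def propose_action(state):
        """Stand-in for a complex, hard-to-verify autonomy stack."""
        return Action(speed=state["target_speed"], heading=state["target_heading"])

    # The run-time assurance monitor is intentionally simple so it can be
    # exhaustively tested and certified on its own.
    MAX_SAFE_SPEED = 10.0

    def safe(action, state):
        return action.speed <= MAX_SAFE_SPEED and state["obstacle_range"] > 5.0

    def fallback(state):
        """Certified recovery behavior: slow down and hold the current heading."""
        return Action(speed=0.0, heading=state["current_heading"])

    def supervised_step(state):
        proposed = propose_action(state)
        return proposed if safe(proposed, state) else fallback(state)

    state = {"target_speed": 14.0, "target_heading": 90.0,
             "current_heading": 85.0, "obstacle_range": 12.0}
    print(supervised_step(state))   # fallback fires because 14.0 exceeds MAX_SAFE_SPEED

The monitor's value for T&E is that the safety argument depends only on the simple checks in safe and fallback, which can be exercised exhaustively, rather than on the unverified planner.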


Table 10: Provide Risk Assurance for Safety and Security
(Sub-Category / Detailed Recommendation: Supporting Articles)

Safety
  Safety Middleware: Eaton et al. (2017), Haugh et al. (2018)
  Basic Research: Ilachinski (2017)

Security
  Train and Test Against Adversarial Examples: Goodfellow et al. (2015), Haugh et al. (2018), Qiu et al. (2019), Streilein et al. (2019)
  Red Teaming: Haugh et al. (2018), Streilein et al. (2019)
  Cyber Testing Across Lifecycle: AFSAB (2017), Streilein et al. (2019)
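
The "Train and Test Against Adversarial Examples" row cites Goodfellow et al. (2015), which introduced the fast gradient sign method. The sketch below applies that method to a toy, hand-coded logistic-regression classifier so the example stays self-contained; the weights, the input, and the perturbation size are made up for illustration and say nothing about any fielded system.

    import numpy as np

    # Toy logistic-regression "model" with fixed, made-up weights.
    w = np.array([1.5, -2.0, 0.5])
    b = 0.1

    def predict_proba(x):
        """Probability that x belongs to class 1."""
        return 1.0 / (1.0 + np.exp(-(w @ x + b)))

    def fgsm(x, y_true, eps):
        """Fast gradient sign method: step each feature in the direction
        that increases the loss for the true label y_true (0 or 1)."""
        p = predict_proba(x)
        grad_x = (p - y_true) * w   # gradient of binary cross-entropy w.r.t. x
        return x + eps * np.sign(grad_x)

    x = np.array([0.2, -0.4, 1.0])        # clean input, true class 1
    x_adv = fgsm(x, y_true=1, eps=0.5)    # large eps for a clear effect on a toy problem

    print("clean prediction:      ", round(float(predict_proba(x)), 3))      # well above 0.5
    print("adversarial prediction:", round(float(predict_proba(x_adv)), 3))  # pushed below 0.5

Testing against such perturbed inputs, and including them in training, is the basic adversarial-robustness practice the security rows point to; production-scale versions apply the same gradient idea inside a deep learning framework.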


Table 11: Adopt Policies, Standards, and Metrics
(Sub-Category / Detailed Recommendation: Supporting Articles)

Create a Common T&E Framework
  Call for a Framework: Defense Science Board (2016), Ilachinski (2017), Macias (2008), Ring (2009)
  Specific Proposal: Durst (2019), Eaton et al. (2017), Giampapa (2013), Harikumar & Chan (2019), Hess & Valerdi (2010), Lenzi et al. (2010), Luna et al. (2013), McLean et al. (2018), Micskei et al. (2012), Porter et al. (2020), Scheidt (2017), Visnevski & Castillo-Effen (2010)

Develop Standards
  Standards: Defense Science Board (2016), Ilachinski (2017), Lenzi et al. (2010), Luna et al. (2013), Visnevski (2008)
  M&S Strategies: Defense Science Board (2016)
  Architectures: Lenzi et al. (2010), Luna et al. (2013), Ring (2009)
  Definitions: Deonandan et al. (2010), Ferreira et al. (2013)

Develop Metrics
  Generic Call for Metrics: Harikumar & Chan (2019), Ilachinski (2017), Ring (2009), Roske et al. (2012), Scheidt (2017)
  Create AI Metric Development Methodology: Visnevski & Castillo-Effen (2010)
  Coverage Adequacy: AFSAB (2017), Ahner & Parson (2016), Deonandan et al. (2010), Hess & Valerdi (2010), Micskei et al. (2012)
  Define (Un)Desirable & (Un)Expected Behavior: Ferreira et al. (2013)
  Trust: Ahner & Parson (2016), Defense Science Board (2012), Porter et al. (2020), Tate et al. (2016), Wojton et al. (2020)
  Characterize Learning: Ahner & Parson (2016), Defense Science Board (2016), Harikumar & Chan (2019)
  Cyber Metrics: Streilein et al. (2019)


REPORT DOCUMENTATION PAGE (Form Approved, OMB No. 0704-0188)

Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing this collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Department of Defense, Washington Headquarters Services, Directorate for Information Operations and Reports (0704-0188), 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302. Respondents should be aware that notwithstanding any other provision of law, no person shall be subject to any penalty for failing to comply with a collection of information if it does not display a currently valid OMB control number. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS.

1. REPORT DATE (DD-MM-YYYY): 09-2020
2. REPORT TYPE: IDA Publication
3. DATES COVERED (From - To):
4. TITLE AND SUBTITLE: Test & Evaluation of AI-Enabled and Autonomous Systems: A Literature Review
5a. CONTRACT NUMBER: HQ0034-19-D-0001
5b. GRANT NUMBER: ______
5c. PROGRAM ELEMENT NUMBER: ______
5d. PROJECT NUMBER: BD-09-2299
5e. TASK NUMBER: 229990
5f. WORK UNIT NUMBER: ______
6. AUTHOR(S): Daniel J. Porter (OED); John W. Dennis (SFRD)
7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES): Institute for Defense Analyses, 4850 Mark Center Drive, Alexandria, Virginia 22311-1882
8. PERFORMING ORGANIZATION REPORT NUMBER: NS-D-14331; Log: H 2020-000326
9. SPONSORING / MONITORING AGENCY NAME(S) AND ADDRESS(ES): Director, Operational Test and Evaluation, 1700 Defense Pentagon, Washington, DC 20301-1882
10. SPONSOR/MONITOR'S ACRONYM(S): DOT&E
11. SPONSOR/MONITOR'S REPORT NUMBER:
12. DISTRIBUTION / AVAILABILITY STATEMENT: This publication has not been approved by the sponsor for distribution and release. Reproduction or use of this material is not authorized without prior permission from the responsible IDA Division Director.
13. SUPPLEMENTARY NOTES: Project Leader, Heather Wojton
14. ABSTRACT: A growing body of work has explored, at different combinations of depth and breadth, the issues of or recommendations for the test and evaluation of autonomous military systems. The current effort summarizes a portion of this literature. In Appendix A, we provide summaries of the individual studies, with specific extractions of the challenges or recommendations their authors call out. For Appendix B, we created tables that categorize these challenges and recommendations, aggregating which authors advocate for these different viewpoints. The main text of this paper explores these tables at a high level. Some perspectives are overrepresented as a result of the processes used to acquire articles and presentations, and readers should not assume that the number of different cited authors equates to the weight of support for a perspective.
15. SUBJECT TERMS: test, evaluation, verification, and validation (TEV&V); Artificial Intelligence (AI); Joint Artificial Intelligence (AI) Center (JAIC); Artificial Intelligence Enhanced Autonomous Capabilities; autonomy framework
16. SECURITY CLASSIFICATION OF: a. REPORT: Unclassified; b. ABSTRACT: Unclassified; c. THIS PAGE: Unclassified
17. LIMITATION OF ABSTRACT: Unlimited
18. NUMBER OF PAGES: 175
19a. NAME OF RESPONSIBLE PERSON: Heather Wojton (OED)
19b. TELEPHONE NUMBER (include area code): (703) 845-6811

Standard Form 298 (Rev. 8-98) Prescribed by ANSI std. Z39.18