
Deep Models and Artificial Intelligence for Military Applications: AAAI Technical Report FS-17-03

A Framework Using Machine Vision and Deep Reinforcement Learning for Self-Learning Moving Objects in a Virtual Environment

Richard Wu, Ying Zhao, Alan Clarke, Anthony Kendall
[email protected], [email protected], [email protected], [email protected]
Naval Postgraduate School, CA, USA

Abstract

In recent artificial intelligence (AI) research, convolutional neural networks (CNNs) can create artificial agents capable of self-learning. Self-learning autonomous moving objects use machine vision techniques based on processing and recognizing objects in digital images. Deep reinforcement learning (Deep-RL) is then applied to understand and learn intelligent actions and controls. The objective of our research is to study methods and designs for implementing machine vision and deep machine learning algorithms in a virtual world (e.g., a computer game) so that moving objects (e.g., vehicles or aircraft) can improve their navigation and threat detection in real life. In this paper, we create a framework for generating and using data from computer games in CNNs and Deep-RL to perform intelligent actions. We show the initial results of applying the framework and identify various military applications that may benefit from this research.

Introduction

Training a computer to identify adversary weapons or hostile vehicles in complex backgrounds, as one application of machine vision, would bring value to the military, especially in real-time and post-mission processing. Real-time applications include assisting human sensor operators in analyzing footage or scanning for threat indications, missile launches, or navigation cues when the sensor is not actively being monitored by an operator. Combining the visual scene (optical flow) with other sensor data (position, pose, control inputs) might provide valuable insight for training autonomous systems to identify, emulate, and augment human performance. The first step is to filter the large volumes of data collected from electro-optical (EO) sensors, which reduces the load to a human-manageable level for fast analysis. The second step is to analyze the filtered data sets to perform interesting discovery. For example, intelligent machine vision analysis can be used to find crash wreckage in the open ocean, or to spot deviations in the behavior of flocking birds that could indicate disturbances caused by troop movements.

Although most autonomous aerial navigation techniques use a combination of global positioning system (GPS) and inertial navigation system (INS) information, they cannot perform the interesting discovery described above. New sensor technologies such as LIDAR-based sensors, which can detect and avoid obstacles, evaluate landing zones, and plan routes to perform tasks (AACUS 2017), can only generate more image data but cannot perform behavior-based discovery.

Machine vision with machine learning implemented in moving objects such as cars, helicopters, drones, and other unmanned aerial vehicles (UAVs) has the potential to overcome the limitations of INS, GPS, and sensors: it not only improves orienting and sensing the environment, but also performs advanced self-learning and discovery to accomplish tasks such as monitoring surroundings, detecting objects, mapping terrain, and autonomous control.

There are three types of machine learning:

• Supervised learning: requires training data that correlates input (e.g., images) and output (e.g., object classes) so that the system can predict the output for new data. The convolutional neural network (CNN) method used in machine vision is a supervised learning method.

• Reinforcement learning: requires feedback reward data from the environment, gathered through exploration, to modify the parameters of a system so that it learns to reinforce the right actions and avoid the wrong ones. Reinforcement learning is important for self-learning, real-time learning, and adaptation in large-scale applications (DeepMind 2017).

• Unsupervised learning: requires only input data to discover patterns and anomalies. Combined with machine vision, it can be used to spot deviating behavior such as the bird deviations described above.

Machine vision and machine learning are important methods and components in modern AI designs and systems. When they are implemented in military sensors, they have the potential to increase the value of those sensors, for example by having the sensors work toward understanding and learning the environment instead of just collecting data. Commercial companies have already been developing machine vision and machine learning applications supported by sensor footage libraries (FLIR 2017). For example, traffic camera systems can use machine learning to improve their ability to detect and discern people riding bicycles versus driving cars. Military applications such as automatic target detection could reduce operator workload and improve detection probabilities. Other applications related to the military mission include vehicle-mounted cameras that assess features in the field of view to identify weapons, weapon signatures, or other items of interest. Machine vision and machine learning would be useful for such military applications as:

• Performing AI-guided human attention management: in a single-seat aircraft, the sensor can operate while the pilot is flying the aircraft or doing other tasks in real time.

• Optimizing human-supervisor-to-remote-systems ratios: a single operator monitors multiple cameras in a security role in real time.

• Automating analysis/flagging of post-mission data by providing human analysts with highlights.

• Distributing tasks of persistent multi-constraint correlation and processing with AI assistance.

• Cross-correlating target ID with other (e.g., electromagnetic) sensors to increase ID confidence.

• Recognizing landscape/terrain and sidereal scene features to validate navigation in GPS-denied environments.

• Performing critical intervention: recognizing hazard conditions and imminent collisions by optical flow.

• Identifying hand-held weapons at an airport checkpoint, or large vehicles such as commercial aircraft.

Another form of data analysis is to discover patterns over time in a scene, such as identifying 1) people moving much more quickly than usual (e.g., running away from something); 2) people meeting at times when they don't usually meet; or 3) places devoid of people that are normally occupied. Such analysis is outside the realm of purely optical techniques and lies more in the data realm.

The machine vision applications discussed above require large amounts of training data (e.g., images and their classes) due to the nature of supervised machine learning. Although such training data is difficult and expensive to generate in reality, previous research demonstrated that it is much easier to obtain from a virtual environment such as a video game (Knight 2016) (Shafaei 2016).

Overview of Methods and Concepts

The objective of our research is to study methods and designs for implementing machine vision and deep machine learning algorithms in a virtual world (e.g., a computer game) so that moving objects (e.g., vehicles or aircraft) can improve their navigation and threat detection in real life. To provide context for the methods and framework used, we give an overview of important concepts in machine vision, machine learning, and artificial intelligence.

Machine Vision Using Convolutional Neural Networks (CNNs)

An artificial neural network is a computer system that models the structure and functions of the brain and can be used for a machine vision task such as recognizing the object class of an input image. A CNN has groups of interconnected nodes organized into multiple layers, as shown in Fig. 1. The input layer is known as the convolutional layer, in which digital images are preprocessed and many feature detectors are applied. Multiple convolutional layers may be added to the network to improve its training and learning rate. The second layer is the pooling layer, which ensures spatial invariance in the images. In other words, if an object in the image tilts, twists, or is oriented differently, the feature detector will still be able to recognize it. The third layer is the flattening layer, where the nodes are organized into a column of values for the final layer. The final layer is the fully connected layer, where all features are processed through the network and an image class is determined (Eremenko 2017). To summarize, a CNN will recognize images and classify them. Such a neural network is trained through forward and backward propagation algorithms, which adjust the weights assigned to each connection. By executing many iterations of this process, a CNN improves its performance and learns from the data it is trained on.

As shown in Fig. 2, the learning cycle of RL is: 1) an agent observes some information (a state, e.g., images taken from the first-person view of a driving car); 2) the agent takes an action (e.g., turning left, turning right, or going straight) based on its current state and internal knowledge models; 3) the environment then gives an immediate reward or penalty (e.g., continued driving or a crash). Reinforcement learning involves algorithms to learn, update or modify
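The CNN layer sequence described above (convolution, pooling, flattening, fully connected) can be sketched as a minimal NumPy forward pass. This is an illustrative sketch only: the 8x8 image, the two 3x3 feature detectors, and the three output classes are assumptions for demonstration, not parameters from the paper.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid 2-D cross-correlation: slide a feature detector over the image."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Max pooling: keep the strongest response per patch (spatial invariance)."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cnn_forward(image, kernels, weights, bias):
    """Convolution -> ReLU -> pooling -> flattening -> fully connected layer."""
    maps = [np.maximum(convolve2d(image, k), 0) for k in kernels]  # conv + ReLU
    pooled = [max_pool(m) for m in maps]                           # pooling layer
    flat = np.concatenate([p.ravel() for p in pooled])             # flattening layer
    return softmax(weights @ flat + bias)                          # fully connected

rng = np.random.default_rng(0)
image = rng.random((8, 8))                # toy 8x8 grayscale input (assumed)
kernels = rng.standard_normal((2, 3, 3))  # two 3x3 feature detectors (assumed)
# conv output 6x6 -> pooled 3x3 -> flattened length 2 * 9 = 18; three classes
weights = rng.standard_normal((3, 18))
bias = np.zeros(3)

probs = cnn_forward(image, kernels, weights, bias)
print(probs.shape)  # one probability per image class
```

In a trained network the kernel and weight values would come from forward and backward propagation rather than random initialization; the sketch only shows how an image flows through the four layer types to a class distribution.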
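The training procedure mentioned above, forward and backward propagation adjusting connection weights over many iterations, can be sketched on a single fully connected layer with a sigmoid output. The two toy 2-D clusters and the learning rate are illustrative assumptions; a real CNN applies the same gradient-descent idea to every layer.

```python
import numpy as np

# Two well-separated toy classes in 2-D (assumed data, 20 points each).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1.0, 0.3, (20, 2)),   # class 0 cluster
               rng.normal(+1.0, 0.3, (20, 2))])  # class 1 cluster
y = np.array([0.0] * 20 + [1.0] * 20)

w = np.zeros(2)   # connection weights, adjusted during training
b = 0.0
lr = 0.5          # assumed learning rate

def forward(X, w, b):
    """Forward propagation: weighted sum pushed through a sigmoid."""
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

for _ in range(200):
    p = forward(X, w, b)
    err = p - y                         # backward propagation: prediction error
    w -= lr * (X.T @ err) / len(y)      # gradient step adjusts each weight
    b -= lr * err.mean()

accuracy = ((forward(X, w, b) > 0.5) == (y == 1)).mean()
print(accuracy)
```

Each iteration nudges the weights in the direction that reduces the prediction error, which is the same mechanism a CNN uses to improve its performance on the data it is trained on.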
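The observe-act-reward cycle of Fig. 2 can be sketched as tabular Q-learning on a toy driving task. The three-lane road, the reward values, and the learning-rate settings are all illustrative assumptions standing in for the first-person driving example in the text.

```python
import random

# Toy driving task: the agent holds one of three lane positions (states 0-2);
# actions are turn left (-1), go straight (0), turn right (+1).
# The center lane (state 1) is safe; leaving the road is a crash.
ACTIONS = (-1, 0, 1)

def step(state, action):
    """Environment: returns (next_state, reward)."""
    nxt = state + action
    if nxt < 0 or nxt > 2:      # drove off the road: crash penalty, reset
        return 1, -10.0
    return nxt, (1.0 if nxt == 1 else -1.0)

def train(steps=3000, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}  # knowledge model
    state = 1
    for _ in range(steps):
        # 1) observe the state; 2) choose an action (epsilon-greedy exploration)
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        # 3) the environment returns a reward or penalty; update the Q-table
        nxt, reward = step(state, action)
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = nxt
    return q

q = train()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(3)}
print(policy)
```

After enough cycles of exploration and reward feedback, the greedy policy steers back toward the center lane from either side and holds it, reinforcing the right actions and avoiding the crash.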