The Promise of Artificial Intelligence – Again!

David Gunning
Information Innovation Office (I2O)
Defense Advanced Research Projects Agency (DARPA)

MIT Washington Seminar Series: Artificial Intelligence and Machine Learning

Tuesday, October 10, 2017

Approved for Public Release, Distribution Unlimited

AI in the News

• Autonomous Vehicles: >6000 miles without operator intervention
• Image Understanding: Facebook has 98% accuracy
• Language Translation: Buds real-time translation

Image credits: adapted from https://www.cbsnews.com ©2014 CBS Interactive Inc.; www.engadget.com ©2017 Oath Tech Network Aol Tech; http://www.roboticstrends.com ©2017 Robotics Trends

Commercial R&D

Facebook, Apple, Google, Amazon, Startups

Source: Defense Innovation Unit Experimental (DIUx)

Global Interest in AI

“Artificial intelligence is the future, not only for Russia, but for all humankind. Whoever becomes the leader in this sphere will become the ruler of the world.”
Vladimir Putin, 4 SEP 2017

[Chart: Research papers published on deep learning (2012-2016), China and U.S.]

[Images: Russia, China]

Image credits: ©2017 Newsline, https://newsline.com; ©2017 MIT Technology Review, https://www.technologyreview.com

Three Waves of AI

1st Wave: DESCRIBE (Symbolic Reasoning)
• Engineers create sets of logic rules to represent knowledge in limited domains
• Reasoning over narrowly defined problems
• No learning capability and poor handling of uncertainty

2nd Wave: PREDICT (Statistical Learning)
• Engineers create statistical models for specific problem domains and train them on big data
• Nuanced classification and prediction capabilities
• No contextual capability and minimal reasoning ability

3rd Wave: EXPLAIN (Contextual Adaptation)
• Engineers create systems that construct explanatory models for classes of real world phenomena
• Natural communication among machines and people
• Systems learn and reason as they encounter new tasks and situations

DARPA Contributions to AI

[Timeline figure: DARPA programs by decade (1960s, 1970s, 1980s, 1990s, 2000s, 2010s), arranged in three rows labeled 1st Wave, 2nd Wave, and 3rd Wave. The labels recoverable from the figure are listed below; their exact placement on the timeline is not recoverable.]

3rd Wave: L2M, XAI, CwC

Other program and milestone labels:
• Project MAC: MIT, Stanford, & CMU AI Labs; Chess, Theorem Proving, General Problem Solving; Shakey the robot
• DENDRAL, MYCIN
• EXPERT SYSTEMS
• Speech Understanding Research (SUR): First use of HMM
• Image Understanding
• ANN
• Autonomous Land Vehicle
• Pilot’s Associate
• Fleet Command Center Management
• ARPI Planning Initiative
• DART
• Knowledge Sharing
• Knowledge Base Programs
• TIPSTER, TREC & MUC: Birth of data-driven, Statistical Language Understanding
• TIDES, GALE: Speech & Language Understanding
• BOLT/DEFT/RATS: Language Understanding
• DAML & Semantic Web
• PAL
• Machine Learning Initiative
• Bio-Inspired Cog. Arch.
• Link Discovery
• Command Post of the Future
• DARPA Grand Challenges
• DARPA Robotics Challenges
• Mind’s Eye
• Machine Reading
• Deep Learning
• PPAML
• Big Mechanism
• D3M
• I2O Data Analytics: xDATA, QCR
• HACMS

People and offices named along the DARPA row: J. Licklider & the Original IPTO; Bob Taylor; Larry Roberts; Bob Kahn & Strategic Computing; Steve Cross; ISO/ITO; SISTO; Tony Tether & IPTO; I2O

The 1st Wave of AI

Symbolic Reasoning

Image credits: © 1997-2017 Intuit, Inc., https://turbotax.intuit.com; © 2015 MJC2, www.mjc2.com; Copyright © 2005-2017 GlobalResearch.ca, https://www.globalresearch.ca

Engineers create sets of rules to represent knowledge in well-defined domains
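The idea of hand-written rules over a well-defined domain can be sketched as a tiny forward-chaining engine. This is a minimal illustration only: the rule and fact names below are hypothetical, are not real tax rules, and are not taken from any system named in this talk.

```python
def forward_chain(facts, rules):
    """Apply if-then rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules written by an engineer: (conditions, conclusion).
RULES = [
    (("has_dependent_child", "income_below_threshold"), "eligible_for_credit"),
    (("eligible_for_credit",), "attach_credit_form"),
]

derived = forward_chain({"has_dependent_child", "income_below_threshold"}, RULES)
print(sorted(derived))
```

The engine can only conclude what its rules spell out, which matches the first-wave profile on the "Three Waves of AI" slide: reasoning over narrowly defined problems, with no learning capability.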

Approved for Public Release, Distribution Unlimited 7 Founders of the 1st Wave

John McCarthy Allen Newell and Herbert Simon Marvin Minsky Stanford Carnegie Mellon MIT Inc.Inc Inc. Inc. IncInc. / © / © / © ACM, ACM,2017 , 2017 , / © ACM,2017 , / © ACM,2017 , / © ACM,2017 , ://awards.acm.org ://awards.acm.org ://awards.acm.org ://awards.acm.org ://awards.acm.org http http http http http

Mathematical Logic Production Rules Frames Society of Mind

Approved for Public Release, Distribution Unlimited 8 Expert Systems (1970s)

DEC R1 Commercially-successful rule- based AI “expert system” ://www.shortliffe.net/ http

Approved for Public Release, Distribution Unlimited 9 Strategic Computing Initiative (1980s)

Autonomous Land Vehicle Pilot’s Associate

http://www.tested.com/

Approved for Public Release, Distribution Unlimited 10 Cyc Knowledge Base (KB) (1985 - Today)

The Cyc KB is a formalized The Cyc KB contains: representation of a vast • 500K terms quantity of fundamental • 17K relations human knowledge • 7M assertions

https://www.cs.us.es/

Approved for Public Release, Distribution Unlimited 11 The 2nd Wave of AI

Statistical Learning / ://www.cbsnews.com thrilllist.com CBS Interactive CBS Inc. Source: ©2014 ©2014 Adapted/https

Engineers create statistical models for specific problem domains and train them on big data

Approved for Public Release, Distribution Unlimited 12 TIPSTER Text Understanding Program (1990s)

Text Retrieval Evaluation Conference The TREC Corpus (TREC) ://www.cs.bilkent.edu.tr/ http http://slideplayer.com/

Approved for Public Release, Distribution Unlimited 13 Statistical Language Learning (1990s)

• Statistical language processing • Word tagging • Parsing with probabilistic grammars • Grammar induction • Syntactic disambiguation • Semantic word classes • Word-sense disambiguation 2017 The 2017 The MIT Press / © ://mitpress.mit.edu https

Approved for Public Release, Distribution Unlimited 14 DARPA AI Under Tony Tether (2000s)

Cognitive Computing DARPA Grand Challenges DARPA Autonomous Vehicle Grand Challenge 140 miles of dirt tracks in California and Nevada

Approved for Public Release, Distribution Unlimited 15 Key Enablers of 2nd Wave AI

Better machine-learning algorithms Big data and cheap storage

Power-efficient processing Extensive industrial opportunities

https://quid.com/insights/the-future-of-artificial-intelligence

Patents in Machine Learning 2004 - 2013 Approved for Public Release, Distribution Unlimited 16 Machine Learning Techniques (2000s to Today)

Approved for Public Release, Distribution Unlimited 17 Deep Learning Breakthrough (2012)

ImageNet

© 1987 – 2017 Neural Information Processing Systems Foundation, Inc. Krizhevsky, A., Sutskever, I. & Hinton, G. ImageNet classification with deep convolutional neural networks. In Proc. Advances in Neural Information Processing Systems 25 1090–1098 (2012).

This report was a breakthrough that used convolutional nets to almost halve the error rate for object recognition, and precipitated the rapid adoption of deep learning by the computer vision community

Image credit: http://cs.stanford.edu/

Approved for Public Release, Distribution Unlimited 18 Deep Learning Architecture

Each “feature map” performs a local Fully-connected layers analysis over the whole input space perform global analysis

convolutions subsampling

0 1 approx. 30,000

8 cells in total for 9 >99.5% accuracy input 20 feature maps 12x12 1000 each 24x24 feature maps 28x28 fully convolutions subsampling connected

Machine-learning “programmers” design the network structure with experience and by trial and error
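The layer sizes on this slide follow from simple arithmetic, sketched below. The 28x28 input, the 20 feature maps, and the 24x24 and 12x12 sizes come from the slide; the 5x5 kernel (stride 1, no padding) and the 2x2 subsampling window are assumptions chosen because they reproduce those sizes, and the function names are illustrative.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Width (or height) of a feature map after a square convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def subsample_output_size(input_size, window):
    """Width (or height) after non-overlapping subsampling (pooling)."""
    return input_size // window


INPUT_SIZE = 28        # from the slide
NUM_FEATURE_MAPS = 20  # from the slide
ASSUMED_KERNEL = 5     # assumption: 28 - 5 + 1 = 24
ASSUMED_WINDOW = 2     # assumption: 24 / 2 = 12

conv_size = conv_output_size(INPUT_SIZE, ASSUMED_KERNEL)
pooled_size = subsample_output_size(conv_size, ASSUMED_WINDOW)

# Number of cells in each stage: maps x height x width.
conv_cells = NUM_FEATURE_MAPS * conv_size * conv_size
pooled_cells = NUM_FEATURE_MAPS * pooled_size * pooled_size

print(f"after convolutions: {NUM_FEATURE_MAPS} maps of {conv_size}x{conv_size} ({conv_cells} cells)")
print(f"after subsampling:  {NUM_FEATURE_MAPS} maps of {pooled_size}x{pooled_size} ({pooled_cells} cells)")
```

Choices such as the kernel size and the number of feature maps are exactly the structural decisions that the slide says are made with experience and by trial and error.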

Approved for Public Release, Distribution Unlimited 19 Neural Nets are Trained with Data

Computed outputs

Data inputs non-linear cell weights cell inputs function (learned) (from previous layer) = POS (SUMPRODUCT( W1:W16, V1:V16))
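The spreadsheet-style cell formula POS(SUMPRODUCT(W1:W16, V1:V16)) can be written out as a short function. This sketch reads POS as "keep the positive part" (a rectifier), which is an assumption about the slide's notation; the example weights and inputs are made up for illustration.

```python
def pos(x):
    """Non-linear function: pass positive values through, clamp the rest to zero."""
    return x if x > 0 else 0.0


def cell_output(weights, inputs):
    """One cell: weighted sum of the inputs from the previous layer, then POS."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    return pos(sum(w * v for w, v in zip(weights, inputs)))


# Sixteen illustrative weights (W1:W16) and inputs (V1:V16).
weights = [0.5, -0.25, 0.125, 1.0] * 4
inputs = [1.0, 2.0, 4.0, 0.5] * 4

activation = cell_output(weights, inputs)
print(activation)
```

Training adjusts only the weights; the inputs are whatever the previous layer (or the data) provides.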

Approved for Public Release, Distribution Unlimited 20 Image Captioning

Yann LeCun, Yoshua Bengio, & Geoffrey Hinton (2015). Deep Learning, Nature, Vol. 521, (pp. 436‐444).

Image credit: http://www.nature.com/ © 2015 Macmillan Publishers Limited

Approved for Public Release, Distribution Unlimited 21 Classification of Skin Cancer

Example images processed by the CNN

The deep learning CNN outperformed the average of the dermatologists at skin cancer classification

Dermatologist-level classification of skin cancer with deep neural networks, A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau & S. Thrun - Nature, February 2017 http://www.nature.com/ © 2015 Macmillan Publishers Limited

Approved for Public Release, Distribution Unlimited 22 AlphaGo © 2016 Macmillan2016Publishers Limited © ://www.nature.com / http

Approved for Public Release, Distribution Unlimited 23 DoD Applications

Data Analytics and Machine Learning Anti-submarine warfare Continuous for Intelligence Analysis Trail Unmanned Vessel (ACTUV) ://geimint.blogspot.com/ Google http

Approved for Public Release, Distribution Unlimited 24 Challenges with 2nd Wave AI

“Panda” <1% targeted “Gibbon” distortion (99.3% confidence)

+ = www..org www.tensorflow.org

Manifold separation process can be exploited
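How a distortion of under 1% per pixel can flip a label is easier to see on a toy model. The sketch below is an illustrative assumption, not the network or the method behind the slide's panda/gibbon image: a hypothetical linear scorer stands in for the classifier, and every pixel is nudged by a small fixed amount in the direction that lowers the "panda" score.

```python
def score(weights, pixels):
    """Linear score; positive means 'panda', otherwise 'gibbon'."""
    return sum(w * p for w, p in zip(weights, pixels))


def classify(weights, pixels):
    return "panda" if score(weights, pixels) > 0 else "gibbon"


def targeted_distortion(weights, pixels, epsilon):
    """Shift every pixel by at most epsilon in the direction that lowers the score."""
    def sign(w):
        return (w > 0) - (w < 0)
    return [p - epsilon * sign(w) for w, p in zip(weights, pixels)]


# Hypothetical 1000-pixel image on a 0..1 scale and made-up classifier weights.
NUM_PIXELS = 1000
weights = [1.0 if i % 2 == 0 else -1.0 for i in range(NUM_PIXELS)]
pixels = [0.505 if w > 0 else 0.5 for w in weights]

EPSILON = 0.008  # each pixel moves by less than 1% of the 0..1 range
distorted = targeted_distortion(weights, pixels, EPSILON)

print(classify(weights, pixels))     # label for the original image
print(classify(weights, distorted))  # label after the targeted distortion
```

No single pixel changes noticeably, but the many small changes all push the score the same way, so the decision crosses the boundary.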

Approved for Public Release, Distribution Unlimited 25 Challenges with 2nd Wave AI

Statistically impressive, but individually unreliable Fei - Fei , Li Li , Karpathy a young boy is holding a

Credit:Andrej baseball bat

Approved for Public Release, Distribution Unlimited 26 Andrew Ng on the State of AI

When asked what can today’s AI systems do, Andrew responds: “Anything a human can do in less than a second”

Today’s AI systems, especially with deep learning, are solving the machine perception problem

Image credit: © 2017 Coursera Inc., https://www.coursera.org

Approved for Public Release, Distribution Unlimited 27 The 3rd Wave of AI

perceive abstract

explainable model

learn reason

Approved for Public Release, Distribution Unlimited 28 The Need for Explainable AI

AI System Watson AlphaGo User

©IBM ©Marcin Bajer/Flickr • We are entering a new • Why did you do that? age of AI applications Sensemaking Operations • Why not something else? • Machine learning is the • When do you succeed? core technology • When do you fail? • Machine learning • When can I trust you? models are opaque, • How do I correct an error? non-intuitive, and difficult for people to ©NASA.gov ©Eric Keenan, U.S. Marine Corps understand

• The current generation of AI systems offer tremendous benefits, but their effectiveness will be limited by the machine’s inability to explain its decisions and actions to users • Explainable AI will be essential if users are to understand, appropriately trust, and effectively manage this incoming generation of artificially intelligent partners

Approved for Public Release, Distribution Unlimited 29 What Are We Trying To Do? Spin West ©

Today South • Why did you do that? • Why not something else? Learning This is a • When do you succeed? cat of Toronto Process (p = .93) • When do you fail? • When can I trust you? User • How do I correct an error? ©University Learned Output with Training Data Function a Task Spin West ©

Tomorrow South • I understand why This is a cat • I understand why not It has fur, • I know when you’ll New whiskers, claws & Learning this feature: succeed of Toronto Process • I know when you’ll fail • I know when to trust you

©University User Explainable Explanation • I know why you erred with Training Data Model Interface a Task

Approved for Public Release, Distribution Unlimited 30 XAI Challenge Problems

Learn a model to Explain decisions, Use the explanation perform the task actions to the user to perform a task

An analyst is Data 2 trucks performing a loading activity Recommend looking for items Analytics Explainable Explanation of interest in Model Interface Classification Explanation massive multimedia data Learning Task ©Air Force Research Lab ©Getty Images Multimedia Data sets Classifies items of Explains why/why not Analyst decides interest in large data for recommended which items to set items report, pursue An operator is directing Autonomy Actions Explainable Explanation autonomous Model Interface Reinforcement Explanation systems to Learning Task ©ArduPikot.org accomplish a ArduPilot & SITL Simulation ©US Army series of missions

Learns decision Explains behavior in Operator decides policies for simulated an after-action which future tasks to missions review delegate Approved for Public Release, Distribution Unlimited 31 Performance vs. Explainability

New Approach: Create a suite of machine learning techniques that produce more explainable models, while maintaining a high level of learning performance

[Figure: Learning Techniques (today), plotted by Learning Performance against Explainability (notional). Techniques shown: Neural Nets, Deep Learning, Graphical Models, Bayesian Belief Nets, SRL, CRFs, HBNs, AOGs, MLNs, Markov Models, Ensemble Methods, Random Forests, Statistical Models, SVMs, Decision Trees]

Approved for Public Release, Distribution Unlimited 32 Performance vs. Explainability

New Learning Techniques (today) Explainability Approach (notional) Create a suite of Neural Nets machine learning Graphical Models techniques that Deep produce more Learning Ensemble Bayesian Methods explainable models, Belief Nets while maintaining a SRL Random high level of CRFs HBNs Forests AOGs learning Statistical MLNs Models Decision LearningPerformance performance Markov Trees SVMs Models Explainability

• Deep Explanation: Modified deep learning techniques to learn explainable features
• Interpretable Models: Techniques to learn more structured, interpretable, causal models
• Model Induction: Techniques to infer an explainable model from any model as a black box
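Model induction, the third approach, can be sketched in a few lines: query an opaque model on chosen inputs and summarize its behavior in a simpler, readable form. The code below is a minimal sketch under stated assumptions, not a method from any XAI performer. The black-box function and the "wheels" feature are hypothetical, and "fur" and "whiskers" are borrowed from the cat example in this talk. The induced "model" here is just a table of average effects per feature.

```python
import math
from itertools import product

FEATURES = ("fur", "whiskers", "wheels")


def black_box(example):
    """Hypothetical opaque model: returns P(cat) for a tuple of 0/1 features."""
    fur, whiskers, wheels = example
    z = 2.0 * fur + 1.5 * whiskers - 4.0 * wheels - 1.0
    return 1.0 / (1.0 + math.exp(-z))


def induce_main_effects(model, feature_names):
    """Probe the model on every 0/1 input and report, for each feature,
    the average output with the feature on minus the average with it off."""
    all_inputs = list(product((0, 1), repeat=len(feature_names)))
    effects = {}
    for index, name in enumerate(feature_names):
        on = [model(x) for x in all_inputs if x[index] == 1]
        off = [model(x) for x in all_inputs if x[index] == 0]
        effects[name] = sum(on) / len(on) - sum(off) / len(off)
    return effects


effects = induce_main_effects(black_box, FEATURES)
for name, effect in sorted(effects.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {effect:+.2f}")
```

The probe never looks inside `black_box`; it only calls it, which is what treating a model "as a black box" means. Real systems would need sampling rather than exhaustive enumeration once the input space grows.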

Approved for Public Release, Distribution Unlimited 33 Approaches to Deep Explanation (Berkeley, SRI/Toronto, BBN/MIT) Attention Mechanisms Modular Networks Top-down Caption Saliency Neural Module Networks [Ramanishka, et al. CVPR17] [Andreas, et al. CVPR16, EMNLP16], [Hu, et al. CVPR17]

Feature Identification Learn to Explain

Approved for Public Release, Distribution Unlimited 34 Causal Model Induction (Charles River Analytics)

Training Explanatory Concepts Causal Probabilistic Explanatory Machine Learning Programming Dictionary Technique Framework Parameterization Causal Model Template

Machine Learned Causal Model Test Data Learning Causal Model System Learner Technique

Causal Model Induction: Experiment with the learned model (as a gray box) to learn an explainable, causal, probabilistic programming model

Approved for Public Release, Distribution Unlimited 35 Explanation by Selection of Teaching Examples (Rutgers University)

TRAINING DATA

brow mouth This face is Angry lowered nostrils raised flared because it is similar to these chin pushed examples out/up lips thinned/ cheekbone pushed out s raised EXPLAINABLE CLASSIFICATION MODEL and dissimilar to these examples

Bayesian teaching for optimal selection of examples for machine explanation

Approved for Public Release, Distribution Unlimited 36 Remaining Challenges

Learning • Unsupervised learning • One-shot learning • Lifelong learning • Learning from instruction Understanding • Explanation • Representation and abstraction Human Level AI Human-like cognition • Planning and action • Meta-reasoning • Commonsense https://www.huffingtonpost.com/ ©2017 Oath Inc.

Approved for Public Release, Distribution Unlimited 37 Commonsense Knowledge and Reasoning

Commonsense Knowledge • Time, space, general concepts • Naïve Physics • Naïve Psychology • Commonly known facts Commonsense Reasoning • Temporal reasoning • Visio-spatial reasoning • Reasoning about actions/change • Qualitative reasoning • Mental simulation

Credit: Peter Crowther Associates https://cacm.acm.org/ ©2017 by the ACM

www.darpa.mil