Active Learning

AI, Machine Learning and Language Technologies: What’s Next?
Army Mad Scientists Program
March 7, 2017
Jaime Carbonell and colleagues
School of Computer Science, Carnegie Mellon University
www.cs.cmu.edu/~jgc

AI Touches Virtually All Areas of Computer Science
[Figure: Artificial Intelligence and Machine Learning at the center of Computer Science, surrounded by Systems + Theory, Lang Tech, Entertain Tech, Fine Arts, Comp Bio, Sciences, Human-Comp Interaction, Robotics, Humanities, and Engineering.]

3/6/2017, Jaime G. Carbonell, Language Technologies Institute

AI is Becoming Central to the World Economy (Davos 2016)
“The fourth Industrial Revolution” is characterized by:
  - “Ubiquitous and mobile internet”,
  - “Smaller and more powerful sensors”,
  - “Artificial Intelligence”, and
  - “Machine Learning”
-- Prof. Klaus Schwab, founder of the Davos World Economic Forum, 2016

Key Components of AI
• Automated Perception
  - Vision, sonar, lidar, haptics, …
• Robotic Action
  - Locomotion, manipulation, …
• Deep Reasoning
  - Planning, goal-oriented behavior, projection, …
• Language Technologies
  - Language, speech, dialog, social nets, …
• Machine Learning (today’s main focus)
  - Adaptation, reflection, knowledge acquisition, … (my research)
• Big Data

How Big is Big?
Dimensions of Big Data Analytics
• LARGE-SCALE: TERABYTES, PETABYTES, EXABYTES
  - Billions++ of entries: terabytes/petabytes of data
• HIGH-COMPLEXITY
  - Trillions of potential relations among entries (graphs)
• HIGH-DIMENSIONAL
  - Millions of attributes per entry (but typically sparse encoding)

The Big-Data “Stack”
[Figure: three layers, top to bottom. Analytics Algorithms (Machine Learning, Artificial Intelligence); Big-Data Architecture (Hadoop/H-Table, Asynch/Pegasus); Big-Data “Plumbing” (Cloud/Storage, Resource Allocator). Surrounding elements: Sensors; Knowledge base; Historical & Normative Data; Alerts, Visualization.]

Trends in Machine Learning
• “Deep” Learning (DNNs): vision, speech, NLP
• Reinforcement Learning: robotics
• Large-margin methods (SVM): classification
• Graphical models: strong priors, domain knowledge
How to cope with knowledge sparsity?
• (Pro)Active learning: optimizing external help
• Transfer/Multitask learning: related new domains
• Explainable AI (to engage SMEs, users)
• (Pro)Active teaching: …coming next?

Machine Learning in a Nutshell
• Training data: [formula not preserved]
  - Special case: [formula not preserved]
• Functional space: [formula not preserved]
• Fitness criterion: [formula not preserved]
  - a.k.a. loss function
• Active learning sampling strategy: [formula not preserved]

Why is Active Learning Important?
• Labeled data volumes << unlabeled data volumes
  - 1.2% of all proteins have known structures
  - < .01% of all galaxies in the Sloan Sky Survey have consensus type labels
  - < .0001% of all web pages have topic labels
  - << 1E-10% of all internet sessions are labeled as to fraudulence (malware, etc.)
  - < .0001% of all financial transactions are investigated w.r.t. fraudulence
• If labeling is costly or limited, select the instances with maximal impact for learning
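The idea of selecting the instances with maximal impact can be illustrated with pool-based uncertainty sampling, one common active learning strategy. The sketch below is a minimal toy, not the method from the talk: it assumes a 1-D pool, a simple threshold classifier, and a hypothetical labeler `hidden_oracle` whose true boundary (0.62) is invented for the example. All function names are illustrative.

```python
import random

def fit_threshold(labeled):
    """Estimate a 1-D decision threshold from (x, y) pairs with y in {0, 1}."""
    neg = [x for x, y in labeled if y == 0]
    pos = [x for x, y in labeled if y == 1]
    # Midpoint between the largest known negative and the smallest known positive.
    return (max(neg) + min(pos)) / 2.0

def uncertainty_sampling(pool, oracle, seed_points, budget):
    """Pool-based active learning: repeatedly query the point nearest the boundary."""
    labeled = [(x, oracle(x)) for x in seed_points]
    unlabeled = [x for x in pool if x not in seed_points]
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # The least certain instance is the one closest to the current threshold.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))
    return fit_threshold(labeled), labeled

def hidden_oracle(x):
    # Hypothetical labeler; the true boundary (0.62) is unknown to the learner.
    return 1 if x >= 0.62 else 0

rng = random.Random(0)
pool = sorted(rng.random() for _ in range(1000))
seeds = [pool[0], pool[-1]]  # one example from each end of the pool
threshold, labeled = uncertainty_sampling(pool, hidden_oracle, seeds, budget=12)
print(f"labels used: {len(labeled)}, estimated threshold: {threshold:.3f}")
```

With 14 labels out of 1000 candidates, the learner narrows in on the boundary because each query is spent where the current model is least sure, rather than on points whose label is already predictable.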
Strategy Selection: A Surprise
There is No Universal Optimum
• Optimal operating range for AL sampling strategies differs
• How to get the best of both worlds?
• (Hint: ensemble methods)

How does DUAL do better?
• Runs DWUS until it estimates a cross-over
• Monitors the change in expected error at each iteration to detect when it is stuck in local minima
• DUAL uses a mixture model after the cross-over (saturation) point
• Our goal should be to minimize the expected future error
  - If we knew the future error of Uncertainty Sampling (US) to be zero, then we’d force [formula not preserved]
  - But in practice, we do not know it

Cost varies non-uniformly
[Plot not preserved; annotated as statistically significant (p < 0.01).]

Active Learning is Awesome, but … is it Enough?
[Figure: Traditional Active Learning assumes a single perfect source and a fixed labeling cost. Going beyond it, Proactive Learning covers multiple sources (differing expertise, answer reluctance, labeling noise) and a varying-cost model (task difficulty, expertise level, ambiguity), either fixed over time or time-varying. Venues cited on the slide: CIKM ‘08, KDD ‘09, JMLR ‘09, SDM_sub ‘10.]

Active vs Proactive Learning
Aspect | Active Learning | Proactive Learning
Number of oracles | Individual (only one) | Multiple, with different capabilities, costs and areas of expertise
Reliability | Infallible (100% right) | Variable across oracles and queries, depending on difficulty, expertise, …
Reluctance | Indefatigable (always answers) | Variable across oracles and queries, depending on workload, certainty, …
Cost per query | Invariant (free or constant) | Variable across oracles and queries, depending on workload, difficulty, …
Note: “Oracle” ∈ {expert, experiment, computation, …}

Does Tracking Predictor Accuracy Actually Help in Proactive Learning? (SDM ‘10)
[Plot not preserved.]
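The active vs proactive comparison above says that oracles differ in reliability, reluctance, and cost. One way to act on that is to score every (instance, oracle) pair and query the best affordable one. The sketch below is only an illustration under stated assumptions: the utility formula (value times answer probability times correctness probability, per unit cost) is a simple stand-in chosen for this example, not necessarily the formulation in the cited papers; the per-oracle constants ignore the per-query variation the slide mentions; and `Oracle`, `expected_utility`, `choose_query`, and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Oracle:
    name: str
    p_answer: float   # probability the oracle responds at all (reluctance)
    p_correct: float  # probability a given answer is right (reliability)
    cost: float       # price charged per query

def expected_utility(value, oracle):
    """Value of a label, discounted by reluctance and noise, per unit cost."""
    return value * oracle.p_answer * oracle.p_correct / oracle.cost

def choose_query(instance_values, oracles, budget):
    """Return the affordable (instance, oracle name) pair with the highest utility."""
    candidates = [
        (expected_utility(value, oracle), instance, oracle.name)
        for instance, value in instance_values.items()
        for oracle in oracles
        if oracle.cost <= budget
    ]
    if not candidates:
        return None  # nothing is affordable with the remaining budget
    _, instance, oracle_name = max(candidates)
    return instance, oracle_name

# Hypothetical oracles: a reliable but reluctant, expensive expert
# and a cheap, responsive, noisier crowd worker.
oracles = [
    Oracle("expert", p_answer=0.6, p_correct=0.95, cost=5.0),
    Oracle("crowd", p_answer=0.95, p_correct=0.7, cost=1.0),
]
# Hypothetical estimates of how much labeling each instance would help.
values = {"x1": 0.9, "x2": 0.4}

print(choose_query(values, oracles, budget=10.0))
print(choose_query(values, oracles, budget=0.5))
```

In this toy setting the cheap oracle wins despite its noise because its cost advantage outweighs the expert's accuracy; changing the costs or probabilities shifts the choice, which is the trade-off proactive learning is meant to manage.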
Active Learning for MT
[Figure: an Active Learner samples sentences (S) from a Source Language Corpus; an Expert Translator turns the sampled corpus into a parallel corpus (S,T); the Trainer builds a Model for the MT System.]

Active Crowd Translation
[Figure: the ACT Framework. Sentence Selection draws sentences (S) from a Source Language Corpus; crowd workers return translations (S,T_1), (S,T_2), …, (S,T_n); Translation Selection feeds the Trainer, which builds a Model for the MT System.]

Active Learning Strategy: Diminishing Density Weighted Diversity Sampling
[Formula not preserved.]
Experiments:
  - Language pair: Spanish-English
  - Iterations: 20
  - Batch size: 1000 sentences each
  - Translation: Moses Phrase SMT
  - Development set: 343 sentences
  - Test set: 506 sentences
Graph: X: Performance (BLEU), Y: Data (thousand words)

Translation Selection from AMT
• Crowds beat experts
• Translator Reliability: [formula not preserved]
• Translation Selection: [formula not preserved]

MT via LSTM (DNNs + Sequence)
[Figure: sequence-to-sequence example for the sentence “I'd like a beer” with STOP tokens and an attention history.]

Used Deep Learning (LDSTA) model trained on Yahoo! Answers data to match questions with answer-bearing sentences

Transfer/Multi-Task Learning
• Basic idea: map invariant properties from similar, previously learned tasks
• Challenges: What to retain? How to modify?
• History:
  - Transformation/Derivational Analogy (1980s)
  - Case-Based Reasoning (1980s-1990s)
  - “Modern” Transfer/Multi-Task (2000s)
• New focus: beyond transferring priors & features
  - Regularizers to maximize transfer
  - Structural biases

Host-pathogen interactions: The Multitask Landscape
[Figure: pathogens and hosts whose homologous proteins are due to common ancestors. Pathogens: Bacteria (Firmicutes: B. anthracis; Enterobacteria: Y. pestis, S. typhi) and Protists. Hosts: Vertebrates (H. sapiens, M. musculus) and Plants (A. thaliana).]

Common Biological Pathways
[Figure: the “Glucose Transport Pathway”.]

Multi-task Objective
For m tasks with parameters [symbols not preserved]:
1. Minimize empirical error
2. Enforce commonality hypothesis
3. Prevent overfitting
Objective terms: empirical loss, pathway regularizer, L2 regularizer [formula not preserved]

Boeing-CMU Aerospace ML/Analytics Lab
[Photo: just after takeoff, Dreamliner maiden flight, 15-December-2009.]

F/A-18 Maintenance Decision Support
Past: Reactive, Improve flight readiness
• Computer-assisted diagnoses of F/A-18 troubles
  - Statistical learning problem
• Computer-assisted expert finding
  - Statistical recommendation (“collaborative filtering”) problem
• Computer-assisted resolution recommendation
  - Information retrieval problem
• From research prototypes to operational systems
  - Software engineering problem (not done by CMU)

Information flow
[Figure: aircraft trouble reports, classifiers, recommender system, search engine.]

What’s Next for AI/ML?
• Reflection
  - Agent that analyzes its own failures
  - Knows what it does not know but needs to know
  - Learns trust (teachers, sources, observations)
• Curiosity
  - Never idles, but dreams “what if”
  - Runs internal experiments when inactive
• Teamwork
  - Shares knowledge proactively
  - Changes roles as needed
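The Multi-task Objective slide names three terms (empirical loss, pathway regularizer, L2 regularizer) but its formula is not preserved. One generic way to write an objective of that shape is sketched below. Every symbol is an assumption introduced for illustration: $w_t$ are the parameters of task $t$, $n_t$ its number of training examples, $\ell$ a loss function, $R_{\text{path}}$ an unspecified penalty on disagreement between tasks that share pathways, and $\lambda_1, \lambda_2$ trade-off weights. It should not be read as the formula from the talk.

```latex
\min_{w_1,\dots,w_m}\;
\underbrace{\sum_{t=1}^{m} \frac{1}{n_t} \sum_{i=1}^{n_t}
  \ell\bigl(y_{ti},\, f(x_{ti};\, w_t)\bigr)}_{\text{1. empirical loss}}
\;+\;
\lambda_1 \underbrace{\sum_{1 \le s < t \le m}
  R_{\text{path}}(w_s, w_t)}_{\text{2. pathway regularizer}}
\;+\;
\lambda_2 \underbrace{\sum_{t=1}^{m} \lVert w_t \rVert_2^2}_{\text{3. L2 regularizer}}
```

The three terms line up with the three numbered goals: fit each task's data, encourage tasks to agree where the commonality hypothesis says they should, and keep the parameters small to prevent overfitting.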
LTI COG (aka Agent) Architecture
[Figure not preserved.]

What’s Next (take 2)
• Safe AI vs Wild AI
  - Safe = guarantees, constraints, transparency
  - Wild = adaptability, curiosity, exploration
• Safety
  - Universal “undo” button for autonomy
  - Explain why it is recommending decisions
  - Best for non-time critical tasks
• AI in the wild
  - Full autonomy no ironclad
