Song Cornellgrad 0058F 12342.pdf (2.229 MB)

Total Pages: 16

File Type: pdf, Size: 1020 KB

MEASURING THE UNMEASURED: NEW THREATS TO MACHINE LEARNING SYSTEMS

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Congzheng Song
December 2020

© 2020 Congzheng Song
ALL RIGHTS RESERVED

MEASURING THE UNMEASURED: NEW THREATS TO MACHINE LEARNING SYSTEMS
Congzheng Song, Ph.D.
Cornell University 2020

Machine learning (ML) is at the core of many Internet services and applications. Practitioners evaluate ML models by accuracy metrics, which measure a model's predictive power on unseen future data. At the same time, as ML systems become more personalized and more important in decision-making, malicious adversaries have an incentive to interfere with the ML environment for purposes such as extracting information about sensitive training data or inducing desired behavior in a model's output. However, none of these security and privacy threats are captured by accuracy, and it is unclear to what extent current ML systems could go wrong.

In this dissertation, we identify and quantify a number of threats to ML systems that are not measured by conventional performance metrics: (1) we consider privacy threats at training time, showing that an adversary who supplies malicious training code can force an ML model to intentionally "memorize" sensitive training data, and can later extract the memorized information from the model; (2) motivated by data-protection regulations, we identify a compliance issue in which personal information might be collected for training ML models without consent, and we design practical auditing techniques for detecting such unauthorized data collection; (3) we study the overlearning phenomenon in deep learning models, whose internal representations reveal sensitive, uncorrelated information, and discuss its implications for privacy leakage and regulatory compliance; and (4) we demonstrate a security vulnerability in ML models for analyzing text semantic similarity, proposing attacks that generate texts which are semantically unrelated yet judged similar by these models.

The goal of this dissertation is to give ML practitioners ways to measure risks in ML models through threat modeling. We hope that our proposed attacks provide insights for better mitigation methods, and we advocate that the ML community consider all aspects, not only accuracy, when designing new learning algorithms and building new ML systems.

BIOGRAPHICAL SKETCH

Congzheng Song was born in Changsha, China. He earned a B.S. degree summa cum laude in Computer Science from Emory University. In 2016, he entered Cornell University to pursue a Ph.D. in Computer Science, advised by Prof. Vitaly Shmatikov at the Cornell Tech campus in New York City. His doctoral research focused on identifying and quantifying security and privacy issues in machine learning. He was a doctoral fellow at Cornell Tech's Digital Life Initiative in 2020. During his Ph.D. studies, he interned at Amazon, Google, and Petuum Inc. for industrial research.

To my parents.

ACKNOWLEDGEMENTS

I am extremely fortunate to have Vitaly Shmatikov as my Ph.D. advisor. He supported me in any way he could, and I learned more than I could have hoped for from him, from formulating and refining research ideas to writing and presenting outcomes.
His passion and wisdom guided me through many difficult times and made my Ph.D. research very productive. I owe Vitaly deeply, and this dissertation would not have been possible without his help.

I am very grateful to the rest of my thesis committee: Thomas Ristenpart and Helen Nissenbaum. Tom introduced me to security and privacy research problems in machine learning when I first came to Cornell Tech, and these problems became the focus of this dissertation. I was a doctoral fellow at the Digital Life Initiative (DLI) founded by Helen. From Helen and the DLI team, I learned about the societal aspects of technology and about how people outside our field view our work from different perspectives. I also want to thank Tom and Helen for their valuable feedback on this dissertation.

I want to acknowledge my collaborators and co-authors: Emiliano De Cristofaro, Luca Melis, Roei Schuster, Vitaly Shmatikov, Reza Shokri, Marco Stronati, Eran Tromer, Ananth Raghunathan, Thomas Ristenpart, and Alexander M. Rush. They are incredible researchers in this field, and I benefited greatly from their profound minds.

I would like to express my gratitude to my mentors and colleagues at Amazon, Google, and Petuum Inc. during my internships. From them, I learned how security and privacy research is deployed in practice and about the differences between academic research and the real world.

Last but not least, I want to thank all my friends and family members in the U.S. and China for their endless support throughout my entire life.

TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Introduction
  1.1 Thesis Contribution
  1.2 Thesis Structure

2 Background
  2.1 Machine Learning Preliminaries
    2.1.1 Supervised learning
    2.1.2 Linear models
    2.1.3 Deep learning models
  2.2 Machine Learning Pipeline
    2.2.1 Data collection
    2.2.2 Training ML models
    2.2.3 Deploying ML models
  2.3 Memorization in ML
    2.3.1 Membership Inference Attacks
  2.4 Privacy-preserving Techniques
    2.4.1 Differential privacy
    2.4.2 Secure ML environment
    2.4.3 Model partitioning

3 Intentional Memorization with Untrusted Training Code
  3.1 Threat Model
  3.2 White-box Attacks
    3.2.1 LSB Encoding
    3.2.2 Correlated Value Encoding
    3.2.3 Sign Encoding
  3.3 Black-box Attacks
    3.3.1 Abusing Model Capacity
    3.3.2 Synthesizing Malicious Augmented Data
    3.3.3 Why Capacity Abuse Works
  3.4 Experiments
    3.4.1 Datasets and Tasks
    3.4.2 ML Models
    3.4.3 Evaluation Metrics
    3.4.4 LSB Encoding Attack
    3.4.5 Correlated Value Encoding Attack
    3.4.6 Sign Encoding Attack
    3.4.7 Capacity Abuse Attack
  3.5 Countermeasures
  3.6 Related Work
  3.7 Conclusion

4 Auditing Data Provenance in Text-generation Models
  4.1 Text-generation Models
  4.2 Auditing text-generation models
  4.3 Experiments
    4.3.1 Datasets
    4.3.2 ML Models
    4.3.3 Hyper-parameters
    4.3.4 Performance of target models
    4.3.5 Performance of auditing
  4.4 Memorization in text-generation models
  4.5 Limitations of auditing
  4.6 Related work
  4.7 Conclusion

5 Overlearning Reveals Sensitive Attributes
  5.1 Censoring Representation Preliminaries
  5.2 Exploiting Overlearning
    5.2.1 Inferring sensitive attributes from representation
    5.2.2 Re-purposing models to predict sensitive attributes
  5.3 Experimental Results
    5.3.1 Datasets, tasks, and models
    5.3.2 Inferring sensitive attributes from representations
    5.3.3 Re-purposing models to predict sensitive attributes
    5.3.4 When, where, and why overlearning happens
  5.4 Related Work
  5.5 Conclusions

6 Adversarial Semantic Collisions
  6.1 Threat Model
  6.2 Generating Adversarial Semantic Collisions
    6.2.1 Aggressive Collisions
    6.2.2 Constrained Collisions
    6.2.3 Regularized Aggressive Collisions
    6.2.4 Natural Collisions
  6.3 Experiments
    6.3.1 Tasks and Models
    6.3.2 Attack Results
    6.3.3 Evaluating Unrelatedness
    6.3.4 Transferability of Collisions
  6.4 Mitigation
  6.5 Related Work
  6.6 Conclusion

7 Conclusion

A Chapter 6 of appendix

LIST OF TABLES

3.1 Summary of datasets and models. n is the size of the training dataset, d is the number of input dimensions. RES stands for Residual Network, CNN for Convolutional Neural Network. For FaceScrub, we use the gender classification task (G) and the face recognition task (F).
3.2 Results of the LSB encoding attack. Here f is the model used, b is the maximum number of lower bits used beyond which accuracy drops significantly, and δ is the difference from the baseline test accuracy.
3.3 Results of the correlated value encoding attack on image data. Here λc is the coefficient for the correlation term in the objective function and δ is the difference from the baseline test accuracy. For image data, decode MAPE is the mean absolute pixel error.
3.4 Results of the correlated value encoding attack on text data. τ is the decoding threshold for the correlation value. Pre is precision, Rec is recall, and Sim is cosine similarity.
3.5 Results of the sign encoding attack on image data. Here λs is the coefficient for the correlation term in the objective function.
3.6 Results of the sign encoding attack on text data.
3.7 Decoded text examples from all attacks applied to LR models trained on the IMDB dataset.
3.8 Results of the capacity abuse attack on image data. Here m is the number of synthesized inputs and m/n is the ratio of synthesized data to training data.
3.9 Results of the capacity abuse attack on text data.
3.10 Results of the capacity abuse attack on text datasets using a public auxiliary vocabulary.
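Table 3.2 characterizes the LSB encoding attack by the number of low-order parameter bits b that can be overwritten before test accuracy degrades. To make the intuition concrete, the following is a minimal sketch of such an encoder, assuming float32 parameters; the function name lsb_encode and its interface are illustrative, not the dissertation's actual implementation.

    import numpy as np

    def lsb_encode(params: np.ndarray, secret: bytes, b: int) -> np.ndarray:
        """Hide `secret` in the b least-significant bits of each float32 parameter."""
        bits = np.unpackbits(np.frombuffer(secret, dtype=np.uint8))  # MSB-first bit stream
        flat = params.astype(np.float32).ravel().copy()
        ints = flat.view(np.uint32)                      # reinterpret each float's raw bits
        keep = np.uint32(~((1 << b) - 1) & 0xFFFFFFFF)   # mask that clears the low b bits
        n = min(len(ints), -(-len(bits) // b))           # parameters needed: ceil(len(bits) / b)
        for i in range(n):
            chunk = bits[i * b:(i + 1) * b]
            value = 0
            for bit in chunk:
                value = (value << 1) | int(bit)
            value <<= b - len(chunk)                     # left-align a trailing partial chunk
            ints[i] = (ints[i] & keep) | np.uint32(value)
        return ints.view(np.float32).reshape(params.shape)

    # Example: with small b, the perturbation to the weights is negligible.
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(4, 8)).astype(np.float32)
    stego = lsb_encode(weights, b"secret", b=8)
    print(np.abs(stego - weights).max())

Decoding simply reads the same b low-order bits back out of each parameter, which is why this attack assumes white-box access to the trained model.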