Deepfakes & Disinformation

16 pages, PDF, 1020 KB

Deepfakes & Disinformation
Agnieszka M. Walorska
ANALYSIS

IMPRINT
Publisher: Friedrich Naumann Foundation for Freedom, Karl-Marx-Straße 2, 14482 Potsdam, Germany — /freiheit.org /FriedrichNaumannStiftungFreiheit /FNFreiheit
Author: Agnieszka M. Walorska
Editors: International Department, Global Themes Unit, Friedrich Naumann Foundation for Freedom
Concept and layout: TroNa GmbH
Contact: Phone +49 (0)30 2201 2634, Fax +49 (0)30 6908 8102, Email [email protected]
As of: May 2020
Photo credits: Photomontages © Unsplash.de, © freepik.de; p. 30 © AdobeStock. Screenshots: p. 16 © https://youtu.be/mSaIrz8lM1U; p. 18 © deepnude.to / Agnieszka M. Walorska; p. 19 © thispersondoesnotexist.com; p. 19 © linkedin.com; p. 19 © talktotransformer.com; p. 25 © gltr.io; p. 26 © twitter.com. All other photos © Friedrich Naumann Foundation for Freedom (Germany); p. 31 © Agnieszka M. Walorska.
Notes on using this publication: This publication is an information service of the Friedrich Naumann Foundation for Freedom. The publication is available free of charge and not for sale. It may not be used by parties or election workers for the purpose of election campaigning (Bundestag, regional and local elections and elections to the European Parliament).
Licence: Creative Commons (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0

CONTENTS
EXECUTIVE SUMMARY 6
GLOSSARY 8
1.0 STATE OF DEVELOPMENT: ARTIFICIAL INTELLIGENCE AND ITS ROLE IN DISINFORMATION 12
2.0 CHEAPFAKES & DEEPFAKES: TECHNOLOGICAL POSSIBILITIES FOR THE MANIPULATION OF TEXT, IMAGES, AUDIO AND VIDEO 14
2.1 DEEPFAKES VS CHEAPFAKES 15
2.2 EXAMPLES OF APPLICATION 16
    MANIPULATION OF MOVEMENT PATTERNS 16
    VOICE AND FACIAL EXPRESSIONS 17
    IMAGE MANIPULATION: DEEPNUDE AND ARTIFICIAL FACES 18
    AI-GENERATED TEXTS 19
3.0 DISSEMINATION & CONSEQUENCES: HOW DANGEROUS ARE DEEPFAKES IN REALITY? 20
3.1 DISSEMINATION 20
3.2 CONSEQUENCES 21
3.3 ARE THERE ANY EXAMPLES OF POSITIVE APPLICATIONS OF DEEPFAKES? 22
4.0 HOW CAN WE FACE THE CHALLENGES ASSOCIATED WITH DEEPFAKES? 24
4.1 TECHNOLOGICAL SOLUTIONS FOR IDENTIFYING AND COMBATING DEEPFAKES 24
4.2 SELF-REGULATION ATTEMPTS BY SOCIAL MEDIA PLATFORMS 26
4.3 REGULATION ATTEMPTS BY LEGISLATORS 28
4.4 THE RESPONSIBILITY OF THE INDIVIDUAL – CRITICAL THINKING AND MEDIA LITERACY 29
5.0 WHAT’S NEXT? 30

EXECUTIVE SUMMARY

Applications of Artificial Intelligence (AI) are playing an increasing role in our society – but the new possibilities of this technology come hand in hand with new risks. One such risk is misuse of the technology to deliberately disseminate false information. Although politically motivated dissemination of disinformation is certainly not a new phenomenon, technological progress has made the creation and distribution of manipulated content much easier and more efficient than ever before. With the use of AI algorithms, videos can now be falsified quickly and relatively cheaply (“deepfakes”) without requiring any specialised knowledge.

The discourse on this topic has primarily focused on the potential use of deepfakes in election campaigns, but this type of video only makes up a small fraction of all such manipulations: in 96% of cases, deepfakes were used to create pornographic films featuring prominent women. Women from outside of the public sphere may also find themselves as the involuntary star of this kind of manipulated video (deepfake revenge pornography). Additionally, applications such as DeepNude allow static images to be converted into deceptively real nude images. Unsurprisingly, these applications only work with images of female bodies.

But visual content is not the only type of content that can be manipulated or produced algorithmically. AI-generated voices have already been successfully used to conduct fraud, resulting in high financial damages, and GPT-2 can generate texts that invent arbitrary facts and citations.

What is the best way to tackle these challenges? Companies and research institutes have already invested heavily in technological solutions to identify AI-generated videos. The benefit of these investments is typically short-lived: deepfake developers respond to technological identification solutions with more sophisticated methods – a classical example of an arms race. For this reason, platforms that distribute manipulated content must be held more accountable. Facebook and Twitter have now self-imposed rules for handling manipulated content, but these rules are not uniform, and it is not desirable to leave it to private companies to define what “freedom of expression” entails.

The German federal government is clearly unprepared for the topic of “applications of AI-manipulated content for purposes of disinformation”, as shown by the brief parliamentary inquiry submitted by the FDP parliamentary group in December 2019. There is no clearly defined responsibility within the government for the issue and no specific legislation. So far, only “general and abstract rules” have been applied. The replies given by the federal government do not suggest any concrete strategy nor any intentions of investing in order to be better equipped to deal with this issue. In general, the existing regulatory attempts at the German and European level do not appear sufficient to curb the problem of AI-based disinformation. But this does not necessarily have to be the case. Some US states have already passed laws against both non-consensual deepfake pornography and the use of this technology to influence voters.

Accordingly, legislators should create clear guidelines for digital platforms to handle deepfakes in particular, and disinformation in general, in a uniform manner. Measures can range from labelling manipulated content as such and limiting its distribution (excluding it from recommendation algorithms) to deleting it. Promoting media literacy should also be made a priority for all citizens, regardless of age. It is important to raise awareness of the existence of deepfakes among the general public and develop the ability of individuals to analyse audiovisual content – even though it is becoming increasingly difficult to identify fakes. In this regard, it is well worth taking note of the approach taken by the Nordic countries, especially Finland, whose population was found to be the most resilient to disinformation.

Still, there is one thing that we should not do: give in to the temptation of banning deepfakes completely. Like any technology, deepfakes do open up a wealth of interesting possibilities – including for education, film and satire – despite their risks.
GLOSSARY

Artificial General Intelligence / Strong AI
The concept of strong AI or AGI refers to a computer system that masters a wide range of different tasks and thereby achieves a human-like level of intelligence. Currently, no such AI application exists. For instance, no single system is currently able to recognise cancer, play chess and drive a car, even though there are specialised systems that can perform each task separately. Multiple research institutes and companies are currently working on strong AI, but there is no consensus on whether it can be achieved, and, if so, when.

Cheapfakes / Shallowfakes
In contrast to deepfakes, shallowfakes are image, audio or video manipulations created with relatively simple technologies. Examples include reducing the speed of an audio recording or displaying content in a modified context.

Big Tech
The term “Big Tech” is used in the media to collectively refer to a group of dominant companies in the IT industry. It is often used interchangeably with “GAFA” or “the Big Four” for Google, Apple, Facebook, and Amazon (or “GAFAM” if Microsoft is included). For the Chinese big tech companies, the abbreviation BATX is used, for Baidu, Alibaba, Tencent, and Xiaomi.

DARPA
The Defense Advanced Research Projects Agency is part of the US Department of Defense, entrusted with the task of researching and funding groundbreaking military technologies. In the past, projects funded by DARPA have resulted in major technologies that are also used in non-military applications, including the internet, machine translation and self-driving vehicles.

Deepfake
Deepfakes (a portmanteau of deep learning and fake) are the product of two AI algorithms working together in a so-called Generative Adversarial Network (GAN). GANs are best described as a way to algorithmically generate new types of data from existing datasets. For example, a GAN could analyse thousands of pictures of Donald Trump and then generate a new picture that is similar to the analysed images but not an exact copy of any of them. This technology can be applied to various types of content – images, moving images, sound, and text. The term deepfake is primarily used for audio and video content.

Deep Porn
Deep porn refers to the use of deep learning methods to generate artificial pornographic images.

Generative Adversarial Network
Generative adversarial networks are algorithmic architectures based on a pair of two neural networks, namely one generative network and one discriminatory network. The two networks compete against one another (the generative network generates data and the discriminatory network falsifies
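The adversarial game between generator and discriminator can be sketched numerically. The toy script below is an illustration of the GAN principle only, not code from this publication: a one-parameter generator tries to mimic samples from a normal distribution while a logistic-regression discriminator tries to tell real from generated samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Real data: samples from N(3, 0.5). The generator maps noise z ~ N(0, 1)
# through g(z) = a*z + b and must learn to mimic the real distribution.
a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator: D(x) = sigmoid(w*x + c)
lr = 0.01

for step in range(2000):
    real = rng.normal(3.0, 0.5, 64)
    z = rng.normal(0.0, 1.0, 64)
    fake = a * z + b

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean((d_real - 1) * real) + np.mean(d_fake * fake)
    grad_c = np.mean(d_real - 1) + np.mean(d_fake)
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator step: push D(fake) toward 1 (fool the discriminator).
    d_fake = sigmoid(w * fake + c)
    g_signal = (d_fake - 1) * w      # gradient of -log D(fake) w.r.t. fake
    a -= lr * np.mean(g_signal * z)
    b -= lr * np.mean(g_signal)

samples = a * rng.normal(0.0, 1.0, 1000) + b
print(round(float(samples.mean()), 2))  # the generated mean drifts toward the real mean of 3.0
```

The same tug-of-war, with deep networks in place of the two scalar models, is what produces deepfake images and video.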
Recommended publications
  • Artificial Intelligence in Health Care: the Hope, the Hype, the Promise, the Peril
    Artificial Intelligence in Health Care: The Hope, the Hype, the Promise, the Peril. Michael Matheny, Sonoo Thadaney Israni, Mahnoor Ahmed, and Danielle Whicher, Editors. Washington, DC: NAM.EDU. Prepublication copy – uncorrected proofs. National Academy of Medicine, 500 Fifth Street, NW, Washington, DC 20001. NOTICE: This publication has undergone peer review according to procedures established by the National Academy of Medicine (NAM). Publication by the NAM signifies that it is the product of a carefully considered process and is a contribution worthy of public attention, but does not constitute endorsement of conclusions and recommendations by the NAM. The views presented in this publication are those of individual contributors and do not represent formal consensus positions of the authors’ organizations; the NAM; or the National Academies of Sciences, Engineering, and Medicine. Library of Congress Cataloging-in-Publication Data to come. Copyright 2019 by the National Academy of Sciences. All rights reserved. Printed in the United States of America. Suggested citation: Matheny, M., S. Thadaney Israni, M. Ahmed, and D. Whicher, Editors. 2019. Artificial Intelligence in Health Care: The Hope, the Hype, the Promise, the Peril. NAM Special Publication. Washington, DC: National Academy of Medicine. “Knowing is not enough; we must apply. Willing is not enough; we must do.” –Goethe. ABOUT THE NATIONAL ACADEMY OF MEDICINE: The National Academy of Medicine is one of three Academies constituting the National Academies of Sciences, Engineering, and Medicine (the National Academies). The National Academies provide independent, objective analysis and advice to the nation and conduct other activities to solve complex problems and inform public policy decisions.
  • Real Vs Fake Faces: Deepfakes and Face Morphing
    Graduate Theses, Dissertations, and Problem Reports, 2021. Real vs Fake Faces: DeepFakes and Face Morphing. Jacob L. Dameron, WVU, [email protected]. Follow this and additional works at: https://researchrepository.wvu.edu/etd. Part of the Signal Processing Commons. Recommended Citation: Dameron, Jacob L., "Real vs Fake Faces: DeepFakes and Face Morphing" (2021). Graduate Theses, Dissertations, and Problem Reports. 8059. https://researchrepository.wvu.edu/etd/8059. This Thesis is protected by copyright and/or related rights. It has been brought to you by The Research Repository @ WVU with permission from the rights-holder(s). You are free to use this Thesis in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you must obtain permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license in the record and/or on the work itself. This Thesis has been accepted for inclusion in the WVU Graduate Theses, Dissertations, and Problem Reports collection by an authorized administrator of The Research Repository @ WVU. For more information, please contact [email protected]. Real vs Fake Faces: DeepFakes and Face Morphing. Jacob Dameron. Thesis submitted to the Benjamin M. Statler College of Engineering and Mineral Resources at West Virginia University in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. Xin Li, Ph.D., Chair; Natalia Schmid, Ph.D.; Matthew Valenti, Ph.D. Lane Department of Computer Science and Electrical Engineering, Morgantown, West Virginia, 2021. Keywords: DeepFakes, Face Morphing, Face Recognition, Facial Action Units, Generative Adversarial Networks, Image Processing, Classification.
  • Synthetic Video Generation
    Synthetic Video Generation: Why seeing should not always be believing! Alex Adam. (Image source: https://www.pocket-lint.com/apps/news/adobe/140252-30-famous-photoshopped-and-doctored-images-from-across-the-ages)

    Image tampering: Historically, manipulated imagery has deceived people, and off-the-shelf software (e.g. Photoshop) exists to do this now. It has become standard in tabloids and social media, and the public have become somewhat numb to it – it’s no longer as impactful or shocking.

    How does machine learning fit in? The advent of machine learning has made image manipulation even easier, and video manipulation is now also tractable with enough data and compute. Good synthetic videos can be made using a gaming computer in a bedroom – and the public are largely unaware of this and the danger it poses!

    Part I: Faceswap. In 2017, a reddit user (/u/deepfakes) posted Python code that uses machine learning to swap faces in images and video. ‘Deepfake’ content flooded reddit, YouTube and adult websites. Reddit has since banned this content (but variants of the code are open source: https://github.com/deepfakes/faceswap). Autoencoder concepts underlie most ‘Deepfake’ methods. (Faceswap algorithm and inference diagrams: https://medium.com/@jonathan_hui/how-deep-learning-fakes-videos-deepfakes-and-how-to-detect-it-c0b50fbf7cb9) The Faceswap model is an autoencoder.
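The shared-encoder/two-decoder idea behind faceswap can be illustrated with plain linear algebra. In the toy below, everything (the 16-dimensional "faces", the identity templates, the expression mixing matrix) is invented for illustration; real tools learn the encoder and both decoders jointly with deep networks. One encoder strips identity and keeps expression, while per-identity decoders restore identity, so decoding A's code with B's decoder performs the swap.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "faces": 16-dim vectors built from an identity template plus a
# low-dimensional "expression" component that both identities share.
template_a = rng.normal(0, 1, 16)
template_b = rng.normal(0, 1, 16)
mix = rng.normal(0, 1, (4, 16))            # how expressions show up in a face
expr_a = 0.2 * rng.normal(0, 1, (200, 4))
expr_b = 0.2 * rng.normal(0, 1, (200, 4))
faces_a = template_a + expr_a @ mix
faces_b = template_b + expr_b @ mix

# Shared encoder: one linear map, fit on both identities, that strips
# identity and keeps the expression code.
faces = np.vstack([faces_a, faces_b])
codes = np.vstack([expr_a, expr_b])
encoder, *_ = np.linalg.lstsq(faces, codes, rcond=None)

def with_bias(x):
    return np.hstack([x, np.ones((x.shape[0], 1))])

# Per-identity decoders: affine maps from expression code back to a face.
dec_a, *_ = np.linalg.lstsq(with_bias(faces_a @ encoder), faces_a, rcond=None)
dec_b, *_ = np.linalg.lstsq(with_bias(faces_b @ encoder), faces_b, rcond=None)

# The swap: encode a face of identity A, decode it with B's decoder.
code = faces_a[:1] @ encoder
swapped = (with_bias(code) @ dec_b)[0]

# The result keeps A's expression but looks like identity B.
print(np.linalg.norm(swapped - template_b) < np.linalg.norm(swapped - template_a))
```

Replacing the linear maps with convolutional encoder/decoder networks trained on face crops gives the structure of the original faceswap code.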
  • Exposing Deepfake Videos by Detecting Face Warping Artifacts
    Exposing DeepFake Videos By Detecting Face Warping Artifacts. Yuezun Li, Siwei Lyu. Computer Science Department, University at Albany, State University of New York, USA.

    Abstract: In this work, we describe a new deep learning based method that can effectively distinguish AI-generated fake videos (referred to as DeepFake videos hereafter) from real videos. Our method is based on the observations that the current DeepFake algorithm can only generate images of limited resolutions, which need to be further warped to match the original faces in the source video. Such transforms leave distinctive artifacts in the resulting DeepFake videos, and we show that they can be effectively captured by convolutional neural networks (CNNs). Compared to previous methods which use a large amount of real and DeepFake generated images to train a CNN classifier, our method does not need DeepFake generated images as negative training examples, since we target the artifacts in affine face warping as the distinctive feature to distinguish real and fake.

    …sibility to large-volume training data and high-throughput computing power, but more to the growth of machine learning and computer vision techniques that eliminate the need for manual editing steps. In particular, a new vein of AI-based fake video generation methods known as DeepFake has attracted a lot of attention recently. It takes as input a video of a specific individual (‘target’), and outputs another video with the target’s faces replaced with those of another individual (‘source’). The backbone of DeepFake are deep neural networks trained on face images to automatically map the facial expressions of the source to the target. With proper post-processing, the resulting videos can achieve a high level of realism.
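The warping cue this paper exploits can be reproduced in a few lines: generate a "face" region at low resolution, upscale it back into the frame, and the region loses high-frequency energy relative to the untouched surroundings. The sketch below is an invented numerical illustration of that statistical artifact, not the authors' CNN detector.

```python
import numpy as np

rng = np.random.default_rng(2)

def resize_nearest(img, h, w):
    """Nearest-neighbour resize of a 2-D array."""
    rows = np.arange(h) * img.shape[0] // h
    cols = np.arange(w) * img.shape[1] // w
    return img[rows][:, cols]

# A stand-in "frame": random high-frequency texture.
frame = rng.normal(0, 1, (64, 64))

# Simulate the DeepFake pipeline on the central "face" region only:
# synthesize at low resolution, then upscale back to frame resolution.
fake = frame.copy()
face = frame[16:48, 16:48]
fake[16:48, 16:48] = resize_nearest(resize_nearest(face, 16, 16), 32, 32)

def grad_energy(img):
    """Mean squared horizontal pixel difference: a crude high-frequency measure."""
    return float(np.mean(np.diff(img, axis=1) ** 2))

inside = grad_energy(fake[16:48, 16:48])   # warped face region
outside = grad_energy(fake[:16, :])        # untouched background
print(inside < outside)                    # the warped region is smoother
```

A classifier can learn exactly this kind of local statistic; the paper trains a CNN to pick it up from affine-warped face regions instead of hand-crafting the measure.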
  • JCS Deepfake
    Don’t Believe Your Eyes (Or Ears): The Weaponization of Artificial Intelligence, Machine Learning, and Deepfakes. Joe Littell.

    Agenda: Introduction; What is A.I.?; What is a DeepFake?; How is a DeepFake created?; Visual Manipulation; Audio Manipulation; Forgery; Data Poisoning; Conclusion; Questions.

    Introduction: Deniss Metsavas, an Estonian soldier convicted of spying for Russia’s military intelligence service after being framed for a rape in Russia. (Picture from Daniel Lombroso / The Atlantic)

    What is A.I.? …and what is it not? Machine (or statistical) learning (ML) is a subset of AI. ML works through the probability of a new event happening based on previously gained knowledge (scalable pattern recognition). ML can be supervised, requiring human input into the data, or unsupervised, requiring no input to the raw data.

    What is a Deepfake? Deepfake is a mash-up of the words for deep learning, meaning machine learning using a neural network, and fake images/video/audio. The term is taken from a Reddit user name who utilized the faceswap app for his own ‘productions.’ Deepfakes are created by the use of two machine learning algorithms: Generative Adversarial Networks and auto-encoders. They became known for their use in underground pornography placing celebrity faces in highly explicit videos.

    How is a Deepfake created? Deepfakes are generated using Generative Adversarial Networks and auto-encoders. These algorithms work through the use of competing systems, where one creates a fake piece of data and the other is trained to determine if that data is fake or not. Think of it like a counterfeiter and a police officer.
  • Deepfakes 2020 the Tipping Point, Sentinel
    SENTINEL – DEEPFAKES 2020: THE TIPPING POINT. The Current Threat Landscape, its Impact on the U.S. 2020 Elections, and the Coming of AI-Generated Events at Scale.

    About Sentinel: Sentinel works with governments, international media outlets and defense agencies to help protect democracies from disinformation campaigns, synthetic media and information operations by developing a state-of-the-art AI detection platform. Headquartered in Tallinn, Estonia, the company was founded by ex-NATO AI and cybersecurity experts, and is backed by world-class investors including Jaan Tallinn (co-founder of Skype and early investor in DeepMind) and Taavet Hinrikus (co-founder of TransferWise). Our vision is to become the trust layer for the Internet by verifying the entire critical information supply chain and safeguard 1 billion people from information warfare.

    Acknowledgements: We would like to thank our investors, partners, and advisors who have helped us throughout our journey and share our vision to build a trust layer for the internet. Special thanks to Mikk Vainik of the Republic of Estonia’s Ministry of Economic Affairs and Communications, Elis Tootsman of Accelerate Estonia, and Dr. Adrian Venables of TalTech for your feedback and support, as well as to Jaan Tallinn, Taavet Hinrikus, Ragnar Sass, United Angels VC, Martin Henk, and everyone else who has made this report possible. Johannes Tammekänd, CEO & Co-Founder. © 2020 Sentinel. Contact: [email protected]. Authors: Johannes Tammekänd, John Thomas, and Kristjan Peterson. Cite: Deepfakes 2020: The Tipping Point, Johannes Tammekänd, John Thomas, and Kristjan Peterson, October 2020.

    Executive Summary: “There are but two powers in the world, the sword and the mind.
  • CNN-Generated Images Are Surprisingly Easy to Spot... for Now
    CNN-generated images are surprisingly easy to spot... for now. Sheng-Yu Wang (UC Berkeley), Oliver Wang (Adobe Research), Richard Zhang (Adobe Research), Andrew Owens (UC Berkeley, University of Michigan), Alexei A. Efros (UC Berkeley).

    Figure 1 (synthetic vs real examples from ProGAN [19], StyleGAN [20], BigGAN [7], CycleGAN [48], StarGAN [10], GauGAN [29], CRN [9], IMLE [23], SITD [8], super-resolution [13], and Deepfakes [33]): Are CNN-generated images hard to distinguish from real images? We show that a classifier trained to detect images generated by only one CNN (ProGAN, far left) can detect those generated by many other models (remaining columns). Our code and models are available at https://peterwang512.github.io/CNNDetection/.

    Abstract: In this work we ask whether it is possible to create a “universal” detector for telling apart real images from these generated by a CNN, regardless of architecture or dataset used. To test this, we collect a dataset consisting of fake images generated by 11 different CNN-based image generator models, chosen to span the space of commonly used architectures today (ProGAN, StyleGAN, BigGAN, …).

    …are fake [14]. This issue has started to play a significant role in global politics; in one case a video of the president of Gabon that was claimed by opposition to be fake was one factor leading to a failed coup d’état. Much of this concern has been directed at specific manipulation techniques, such as “deepfake”-style face replacement [2], and photorealistic synthetic humans [20]. However, these methods represent only two instances of a broader set of techniques: image synthesis via convolutional neural networks (CNNs).
  • OC-FakeDect: Classifying Deepfakes Using One-Class Variational Autoencoder
    OC-FakeDect: Classifying Deepfakes Using One-class Variational Autoencoder. Hasam Khalid, Simon S. Woo. Computer Science and Engineering Department, Sungkyunkwan University, South Korea. [email protected], [email protected].

    Abstract: An image forgery method called Deepfakes can cause security and privacy issues by changing the identity of a person in a photo through the replacement of his/her face with a computer-generated image or another person’s face. Therefore, a new challenge of detecting Deepfakes arises to protect individuals from potential misuses. Many researchers have proposed various binary-classification based detection approaches to detect deepfakes. However, binary-classification based methods generally require a large amount of both real and fake face images for training, and it is challenging to collect sufficient fake images data in advance.

    …single facial image to create fake images or videos. One popular example is the Deepfakes of former U.S. President, Barack Obama, generated as part of a research [29] focusing on the synthesis of a high-quality video, featuring Barack Obama speaking with accurate lip sync, composited into a target video clip. Therefore, the ability to easily forge videos raises serious security and privacy concerns: imagine hackers that can use deepfakes to present a forged video of an eminent person to send out false and potentially dangerous messages to the public. Nowadays, fake news has become an issue as well, due to the spread of misleading information via traditional news media or online social media, and Deepfake videos can be combined to create arbitrary
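The one-class idea above – train only on real faces and flag anything the model reconstructs poorly – can be illustrated with a much simpler stand-in for the paper's variational autoencoder. The sketch below uses PCA reconstruction error instead of the authors' OC-FakeDect model, and all data is synthetic; it shows why no fake examples are needed at training time.

```python
import numpy as np

rng = np.random.default_rng(3)

# "Real" samples live near a 3-dimensional subspace of R^20;
# "fake" samples do not share that structure.
basis = rng.normal(0, 1, (3, 20))
real_train = rng.normal(0, 1, (500, 3)) @ basis + 0.05 * rng.normal(0, 1, (500, 20))
real_test = rng.normal(0, 1, (50, 3)) @ basis + 0.05 * rng.normal(0, 1, (50, 20))
fakes = rng.normal(0, 1, (50, 20)) * real_train.std()

# One-class training: fit principal components to real data only.
mean = real_train.mean(axis=0)
_, _, vt = np.linalg.svd(real_train - mean, full_matrices=False)
components = vt[:3]                       # top-3 principal directions

def recon_error(x):
    centered = x - mean
    recon = centered @ components.T @ components
    return np.linalg.norm(centered - recon, axis=1)

# Threshold chosen from real data alone, as in one-class detection.
threshold = np.percentile(recon_error(real_train), 99)
flagged = recon_error(fakes) > threshold
print(f"{flagged.mean():.0%} of fakes flagged")
```

A one-class VAE plays the same role as the PCA projection here: it compresses only what real faces have in common, so anything off that manifold reconstructs badly and stands out.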
  • Risk Assessment and Knowledge Management of Ethical Hacking In
    Technoethics and Sensemaking: Risk Assessment and Knowledge Management of Ethical Hacking in a Sociotechnical Society by Baha Abu-Shaqra A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electronic Business Faculty of Engineering University of Ottawa © Baha Abu-Shaqra, Ottawa, Canada, 2020 ii Abstract Cyber attacks by domestic and foreign threat actors are increasing in frequency and sophistication. Cyber adversaries exploit a cybersecurity skill/knowledge gap and an open society, undermining the information security/privacy of citizens and businesses and eroding trust in governments, thus threatening social and political stability. The use of open digital hacking technologies in ethical hacking in higher education and within broader society raises ethical, technical, social, and political challenges for liberal democracies. Programs teaching ethical hacking in higher education are steadily growing but there is a concern that teaching students hacking skills increases crime risk to society by drawing students toward criminal acts. A cybersecurity skill gap undermines the security/viability of business and government institutions. The thesis presents an examination of opportunities and risks involved in using AI powered intelligence gathering/surveillance technologies in ethical hacking teaching practices in Canada. Taking a qualitative exploratory case study approach, technoethical inquiry theory (Bunge-Luppicini) and Weick’s sensemaking model were applied as a sociotechnical theory
  • Game Playing with Deep Q-Learning Using OpenAI Gym
    Game Playing with Deep Q-Learning using OpenAI Gym. Robert Chuchro ([email protected]), Deepak Gupta ([email protected]).

    Abstract: Historically, designing game players requires domain-specific knowledge of the particular game to be integrated into the model for the game playing program. This leads to a program that can only learn to play a single particular game successfully. In this project, we explore the use of general AI techniques, namely reinforcement learning and neural networks, in order to architect a model which can be trained on more than one game. Our goal is to progress from a simple convolutional neural network with a couple of fully connected layers, to experimenting with more complex additions, such as deeper layers or recurrent neural networks.

    1. Introduction: Game playing has recently emerged as a popular play… …sociated with performing a particular action in a given state. This information is fundamental to any reinforcement learning problem. The input to our model will be a sequence of pixel images as arrays (Width x Height x 3) generated by a particular OpenAI Gym environment. We then use a Deep Q-Network to output an action from the action space of the game. The output that the model will learn is an action from the environment’s action space in order to maximize future reward from a given state. In this paper, we explore using a neural network with multiple convolutional layers as our model.

    2. Related Work: The best known success story of classical reinforcement learning is TD-gammon, a backgammon playing program which learned entirely by reinforcement learning [6]. TD-gammon used a model-free reinforcement learning algorithm similar to Q-learning.
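The Q-learning update at the heart of a Deep Q-Network can be seen in miniature without any neural network. The sketch below is an illustrative toy, not the authors' code: the five-state corridor environment is invented for the example, and the Q-function is a plain table rather than a network over pixels.

```python
import numpy as np

rng = np.random.default_rng(4)

# A 5-state corridor: the agent starts at state 0 and earns reward 1
# for reaching state 4 (terminal). Actions: 0 = left, 1 = right.
n_states, n_actions = 5, 2
q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.3

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == n_states - 1)

for episode in range(500):
    s = 0
    for _ in range(100):                      # cap episode length
        if rng.random() < eps:                # epsilon-greedy exploration
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmax(q[s]))
        s2, r = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s', a').
        q[s, a] += alpha * (r + gamma * q[s2].max() - q[s, a])
        s = s2
        if s == n_states - 1:
            break

# Greedy policy for the non-terminal states; 1 means "move right".
policy = [int(np.argmax(q[s])) for s in range(n_states - 1)]
print(policy)  # after training, every state should prefer "right"
```

A DQN replaces the table `q` with a convolutional network over stacked pixel frames, but the target `r + gamma * max Q(s', a')` in the update is exactly the same quantity.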
  • Reinforcement Learning with TensorFlow & OpenAI Gym
    Lecture 1: Introduction. Reinforcement Learning with TensorFlow & OpenAI Gym. Sung Kim <[email protected]>.

    Nature of learning: We learn from past experiences. When an infant plays, waves its arms, or looks about, it has no explicit teacher, but it does have direct interaction with its environment. Years of positive compliments as well as negative criticism have all helped shape who we are today. Reinforcement learning is a computational approach to learning from interaction. (References: Richard Sutton and Andrew Barto, Reinforcement Learning: An Introduction; Nishant Shukla, Machine Learning with TensorFlow; https://www.cs.utexas.edu/~eladlieb/RLRG.html; Machine Learning, Tom Mitchell, 1997.)

    Atari Breakout and other Atari games (2013, 2015) – Nature: “Human-level control through deep reinforcement learning”, http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html (figure courtesy of Mnih et al., Nature, 26 Feb. 2015); https://deepmind.com/blog/deep-reinforcement-learning/; https://deepmind.com/applied/deepmind-for-google/.

    Reinforcement learning applications: Robotics (torque at joints); business operations (inventory management: how much inventory or spare parts to purchase; resource allocation: e.g. in a call center, who to service first); finance (investment decisions, portfolio design); e-commerce/media (what content to present to users, using click-through / visit time as reward; what ads to present to users, avoiding ad fatigue).

    Audience: Those who want to understand basic reinforcement learning (RL), including those with no or weak math/computer science background (the core update is simply Q(s,a) = r + max Q(s′,a′)); those who want to use RL as a black box with basic understanding; those who want to use TensorFlow and Python (optional labs). Schedule: 1.
  • “Latest snapshot” – to modify an algo so it does produce multiple snapshots, find the following line (which is present in all of the algorithms)
    Spinning Up Documentation, Release. Joshua Achiam, Feb 07, 2020.

    User Documentation
    1 Introduction 3
      1.1 What This Is 3
      1.2 Why We Built This 4
      1.3 How This Serves Our Mission 4
      1.4 Code Design Philosophy 5
      1.5 Long-Term Support and Support History 5
    2 Installation 7
      2.1 Installing Python 8
      2.2 Installing OpenMPI 8
      2.3 Installing Spinning Up 8
      2.4 Check Your Install 9
      2.5 Installing MuJoCo (Optional) 9
    3 Algorithms 11
      3.1 What’s Included 11
      3.2 Why These Algorithms? 12
      3.3 Code Format 12
    4 Running Experiments 15
      4.1 Launching from the Command Line 16
      4.2 Launching from Scripts 20
    5 Experiment Outputs 23
      5.1 Algorithm Outputs 24
      5.2 Save Directory Location 26
      5.3 Loading and Running Trained Policies 26
    6 Plotting Results 29
    7 Part 1: Key Concepts in RL 31
      7.1 What Can RL Do? 31
      7.2