KEYNOTE: Deep Learning Toolbox AI 2020

Total pages: 16

File type: PDF, size: 1020 KB

KEYNOTE SPEAKER: "Deep Learning Toolbox en el 2020"
Presented by: Marc Torrent, Director, CIDAI (cidai.eu | @CIDAI_eu)

KEYNOTE SPEAKER: Oriol Vinyals, Research Director in Deep Learning, Google DeepMind (research.google/people/OriolVinyals/ | @OriolVinyalsML)

The Deep Learning Toolbox in 2020. AI & Big Data Barcelona (online), October 2020.

Artificial intelligence: the grand project to build non-human intelligence. Machine learning: machines that learn to be smarter. Deep learning, together with supervised and reinforcement learning, sits inside machine learning, which in turn sits inside artificial intelligence.

Lots of attention:
● Startups / VCs
● Facebook / Google / Amazon / Apple
● Open source efforts
● Press (noise...)
● Universities

ML is done in many places: TensorFlow GitHub stars by GitHub user profiles with public locations. Source: http://jrvis.com/red-dwarf/?user=tensorflow&repo=tensorflow

Datacenter Revolution. Picture credit: http://americanhistory.si.edu/exhibitions/preview-case-american-enterprise

General Artificial Intelligence

Hardware Revolution. Tensor Processing Unit v2:
● 180 teraflops of computation, 64 GB of HBM memory, 2400 GB/s memory bandwidth
● Designed to be connected together into larger configurations

TPU Pod: 64 second-generation TPUs, 11.5 petaflops. For comparison, the #10 supercomputer in the world has 4 terabytes of memory and an Rpeak of 11 petaflops.*

Data Revolution. Datasets / environments: http://www.spacemachine.net/views/2016/3/datasets-over-algorithms

Data modalities are increasingly diverse. “[...]
The alternative approach, which they thought was crazy, was to forget logic and try and understand how networks of brain cells learn things. Curiously, two people who rejected the logic-based approach to AI were Turing and von Neumann. [...] now neural networks are everywhere and the crazy approach is winning.” (G. Hinton)

Modalities: words and letters, speech, videos, images, programs, graphs.

Software Revolution: big tech companies have open sourced most ML tools!

Research Revolution: the life of a researcher.

The Deep Learning Toolbox: Zooming Out. Platforms, frameworks, datasets.

Summary:
● Datacenter Revolution
● Hardware Revolution
● Data Revolution
● Software Revolution
● Research Revolution

Algorithms Revisited: back to the '50s. ConvNets.

A Learning Algorithm
Given training examples, "(input, output)" pairs:
While not done:
    Pick a training example (x, y)
    Run the neural network on x
    Compare the actual output to y
    Adjust the parameters to reduce the error (the "loss")

The Deep Learning Toolbox: Zooming In. Feed-forward models, sequence prediction, Seq2Seq, attention and pointers, read/write memories, temporal hierarchies, key/value memories, graph neural networks, recurrent architectures. Figure credits: Jeff Dean, Chris Olah, Santoro et al. 2016, Koutnik et al. 2014, van den Oord et al. 2016, Miller et al. 2016, Vinyals et al. 2016, Vaswani et al. 2017.

Functions a deep neural network can learn (input -> output):
● Feed-forward models: pixels -> "lion"
● Sequence prediction: audio -> "How cold is it outside?"
● Seq2Seq: "Hello, how are you?" -> "Bonjour, comment allez-vous?"
● Pixels -> "A blue and yellow train travelling down the tracks"

Image classification errors: 2011 systems, 26%; humans, 5%;
2016 systems, 3% errors (vs. 26% in 2011 and 5% for humans).

Impact @ Google and Beyond
Growing use of deep learning at Google. "Google will soon be a big LSTM." (Jürgen Schmidhuber)
Across many products/areas: Android, apps, drug discovery, Gmail, image understanding, Maps, Photos, robotics research, Search, speech, Translate, YouTube, and many others. Products using machine learning: machine learning is enabling apps that see, hear, and understand. https://waymo.com/tech/

Diabetic retinopathy: grading fundus images (no DR, mild DR, moderate DR, severe DR, proliferative DR; grades 1-5; hemorrhages distinguish healthy from diseased). F-score: 0.95 for the algorithm vs. 0.91 for the median ophthalmologist.
"The study by Gulshan and colleagues truly represents the brave new world in medicine." (Dr. Andrew Beam, Dr. Isaac Kohane, Harvard Medical School)
"Google just published this paper in JAMA (impact factor 37) [...] It actually lives up to the hype." (Dr. Luke Oakden-Rayner, University of Adelaide)

Discovering an eighth planet around a distant star with NASA data:
www.nasa.gov/press-release/artificial-intelligence-nasa-data-used-to-discover-eighth-planet-circling-distant-star
Blog: www.blog.google/topics/machine-learning/hunting-planets-machine-learning/
Paper: [Shallue & Vanderburg], www.cfa.harvard.edu/~avanderb/kepler90i.pdf

From research to production, examples:
● Google Translate: Seq2Seq machine translation.
● WaveNet (Google voice): temporal hierarchies for text to speech. van den Oord, A., et al., "WaveNet: A Generative Model for Raw Audio," arXiv (2016).
● Feed-forward generative models: which of these are not real? Brock, A., et al., "Large Scale GAN Training for High Fidelity Natural Image Synthesis," ICLR (2019); Razavi, A., et al., "Generating Diverse High-Fidelity Images with VQ-VAE-2," NeurIPS (2019).
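The "A Learning Algorithm" loop described above (pick an example, run the network, compare the output, adjust the parameters) can be sketched in a few lines. The snippet below is a deliberately tiny stand-in, a one-parameter linear "network" with squared-error loss, invented for illustration; it is not code from the talk.

```python
import random

random.seed(0)

# Toy training set of (input, output) pairs drawn from y = 3x;
# the loop should recover the slope w = 3.
examples = [(x, 3.0 * x) for x in range(1, 6)]

w = 0.0    # the single model parameter
lr = 0.01  # learning rate

for step in range(2000):
    x, y = random.choice(examples)  # pick a training example (x, y)
    pred = w * x                    # run the "network" on x
    error = pred - y                # compare the actual output to y
    w -= lr * error * x             # adjust parameters to reduce the loss

print(round(w, 2))  # ~3.0
```

The update rule is the gradient of the squared error (pred - y)**2 / 2 with respect to w, which is exactly the "adjust parameters to reduce the error" step that backpropagation performs layer by layer in a real network.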
Frontiers and Challenges

Neural Chatbots: Conditional Language [Vinyals, Le, ICML DL Workshop 2015]
Human: what do you think about tesla ?
Machine: he ’s a good conductor .
Human: what do you think about messi ?
Machine: he ’s a great player .
Human: what do you think about cleopatra ?
Machine: oh , she ’s very regal .
Human: who is skywalker ?
Machine: he is a hero .
Human: is sky blue or black ?
Machine: blue .
Human: does a cat have a tail ?
Machine: yes .
Human: does a cat have a wing ?
Machine: no

Scale of language models (recurrent architectures; sequence prediction; key/value memories):
● Movie subtitles: 900M
● LM1B: 1B
● GPT-2: 40B
● GPT-3: 500B

Evolution of language modeling:
● Shannon, 1951 (samples from the SLP book, 2000), 3-gram. Sample: They also point to ninety nine point six billion dollars from two hundred four oh six three percent of the rates of interest stores as Mexico and Brazil on market conditions
● Sutskever et al., 2011, RNNs. Sample: while he was giving attention to the second advantage of school building a 2-for-2 stool killed by the Cultures saddled with a halfsuit defending the Bharatiya Fernall ’s office
● Jozefowicz et al., 2016, big LSTMs. Sample: With even more new technologies coming onto the market quickly during the past three years , an increasing number of companies now must tackle the ever-changing and ever-changing environmental challenges online .
● Liu et al., 2018, Transformer. Prompt: [==wings over kansas] Sample: ==wings over kansas is a 2010 dhamma feature film written and directed by brian ig ariyoshi . it premiered on march 17, 2010 the film tells the story of three americans who bravely achieved a victory without expected daknfi . ==Wings Over Kansas Plot the story begins with the faltering success of egypt 's hungry dakfunctionality when he loses his lives around the time when the embarked white - collar daughters begin their father 's cabin . the rest of the campaign ( coming to town ) gives dakhandles [...]
● Radford and Wu et al., 2019, big Transformer. Prompt: [In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English.] Sample: The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science. Now, after almost two centuries, the mystery of what sparked this odd phenomenon is finally solved. Dr. Jorge Perez, an evolutionary biologist from the University of La Paz, and several companions, were exploring the Andes Mountains when they found a small valley, with no other animals or humans. Perez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow. Perez and the others then ventured further into the valley. “By the time we reached the top of one peak, the water looked blue, with some crystals on top,” said Perez. Perez and his friends were astonished to see the unicorn herd. These creatures could be seen from the air without having to move too much to see them – they were so close they could touch their horns. [...]

GPTX:
● Transformer-based
● GPT-2: 1.5 billion parameters, trained on 40 billion words
● GPT-3: 175 billion parameters, trained on 500 billion words
● Adapts to the style and content of arbitrary conditioning input
Radford et al. (2019), https://openai.com/blog/better-language-models/#sample1; Brown & Mann & Ryder & Subbiah et al. (2020)
@shariffshameen losslesshq.com @mattshummer_ @sh_reya

Challenge: One-Shot Learning
● Humans have a capacity for very rapid assimilation of data (one-/few-shot learning). Lake et al., 2013, 2015.

Challenge: Adversarial Examples
A clean image classified "hamster", plus a crafted adversarial perturbation, yields an adversarial image that the same image classifier labels "airplane". [ Intriguing properties
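The hamster-to-airplane demonstration above relies on crafting a perturbation from the classifier's own gradients. The sketch below illustrates the idea with the fast gradient sign method on a hypothetical four-pixel logistic classifier, so the gradient can be written out by hand; it is an illustration of the general technique, not the cited experiment, and all weights and pixel values are invented.

```python
import math

# Toy "image": four pixel intensities; toy logistic classifier with
# fixed, hand-picked weights (all values here are invented).
w = [1.0, -2.0, 0.5, 3.0]
b = 0.1

def predict(x):
    """P(true class) under the logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, eps):
    """Fast gradient sign method: nudge every pixel by eps in the
    direction that increases the loss for the true label y."""
    p = predict(x)
    grad = [(p - y) * wi for wi in w]  # d(cross-entropy)/d(pixel)
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, grad)]

clean = [0.2, 0.1, 0.5, 0.9]
adv = fgsm(clean, y=1, eps=0.25)
print(predict(clean), predict(adv))  # confidence drops on the adversarial input
```

The perturbation is small and uniform per pixel, yet it is aimed exactly along the loss gradient, which is why confidence collapses far faster than it would under random noise of the same size.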
Recommended publications
  • XingGAN for Person Image Generation
    XingGAN for Person Image Generation. Hao Tang (1,2), Song Bai (2), Li Zhang (2), Philip H.S. Torr (2), and Nicu Sebe (1,3). 1: University of Trento ([email protected]); 2: University of Oxford; 3: Huawei Research Ireland.
    Abstract. We propose a novel Generative Adversarial Network (XingGAN or CrossingGAN) for person image generation tasks, i.e., translating the pose of a given person to a desired one. The proposed Xing generator consists of two generation branches that model the person's appearance and shape information, respectively. Moreover, we propose two novel blocks to effectively transfer and update the person's shape and appearance embeddings in a crossing way to mutually improve each other, which has not been considered by any other existing GAN-based image generation work. Extensive experiments on two challenging datasets, i.e., Market-1501 and DeepFashion, demonstrate that the proposed XingGAN advances the state-of-the-art performance both in terms of objective quantitative scores and subjective visual realness. The source code and trained models are available at https://github.com/Ha0Tang/XingGAN. Keywords: Generative Adversarial Networks (GANs), Person Image Generation, Appearance Cues, Shape Cues.
    1 Introduction. The problem of person image generation aims to generate photo-realistic person images conditioned on an input person image and several desired poses. This task has a wide range of applications such as person image/video generation [41,9,2,11,19] and person re-identification [45,28]. Existing methods such as [21,22,31,45,35] have achieved promising performance on this challenging task.
  • Artificial Intelligence: with Great Power Comes Great Responsibility
    ARTIFICIAL INTELLIGENCE: WITH GREAT POWER COMES GREAT RESPONSIBILITY JOINT HEARING BEFORE THE SUBCOMMITTEE ON RESEARCH AND TECHNOLOGY & SUBCOMMITTEE ON ENERGY COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY HOUSE OF REPRESENTATIVES ONE HUNDRED FIFTEENTH CONGRESS SECOND SESSION JUNE 26, 2018 Serial No. 115–67 Printed for the use of the Committee on Science, Space, and Technology ( Available via the World Wide Web: http://science.house.gov U.S. GOVERNMENT PUBLISHING OFFICE 30–877PDF WASHINGTON : 2018 COMMITTEE ON SCIENCE, SPACE, AND TECHNOLOGY HON. LAMAR S. SMITH, Texas, Chair FRANK D. LUCAS, Oklahoma EDDIE BERNICE JOHNSON, Texas DANA ROHRABACHER, California ZOE LOFGREN, California MO BROOKS, Alabama DANIEL LIPINSKI, Illinois RANDY HULTGREN, Illinois SUZANNE BONAMICI, Oregon BILL POSEY, Florida AMI BERA, California THOMAS MASSIE, Kentucky ELIZABETH H. ESTY, Connecticut RANDY K. WEBER, Texas MARC A. VEASEY, Texas STEPHEN KNIGHT, California DONALD S. BEYER, JR., Virginia BRIAN BABIN, Texas JACKY ROSEN, Nevada BARBARA COMSTOCK, Virginia CONOR LAMB, Pennsylvania BARRY LOUDERMILK, Georgia JERRY MCNERNEY, California RALPH LEE ABRAHAM, Louisiana ED PERLMUTTER, Colorado GARY PALMER, Alabama PAUL TONKO, New York DANIEL WEBSTER, Florida BILL FOSTER, Illinois ANDY BIGGS, Arizona MARK TAKANO, California ROGER W. MARSHALL, Kansas COLLEEN HANABUSA, Hawaii NEAL P. DUNN, Florida CHARLIE CRIST, Florida CLAY HIGGINS, Louisiana RALPH NORMAN, South Carolina DEBBIE LESKO, Arizona SUBCOMMITTEE ON RESEARCH AND TECHNOLOGY HON. BARBARA COMSTOCK, Virginia, Chair FRANK D. LUCAS, Oklahoma DANIEL LIPINSKI, Illinois RANDY HULTGREN, Illinois ELIZABETH H. ESTY, Connecticut STEPHEN KNIGHT, California JACKY ROSEN, Nevada BARRY LOUDERMILK, Georgia SUZANNE BONAMICI, Oregon DANIEL WEBSTER, Florida AMI BERA, California ROGER W. MARSHALL, Kansas DONALD S. BEYER, JR., Virginia DEBBIE LESKO, Arizona EDDIE BERNICE JOHNSON, Texas LAMAR S.
  • Video and Audio Deepfakes: What Lawyers Need to Know by Sharon D
    Video and Audio Deepfakes: What Lawyers Need to Know by Sharon D. Nelson, Esq., and John W. Simek © 2020 Sensei Enterprises, Inc. If some nefarious person has decent photos of your face, you too (like so many unfortunate Hollywood celebrities) could appear to be the star of a pornographic video. If someone has recordings of your voice (from your website videos, CLEs you have presented, speeches you’ve given, etc.), they can do a remarkably good job of simulating your spoken words and, just as an example, call your office manager and authorize a wire transfer – something the office manager may be willing to do because of “recognizing” your voice. Unnerving? Yes, but it is the reality of today. And if you don’t believe how “white hot” deepfakes are, just put a Google alert on that word and you’ll be amazed at the volume of daily results. Political and Legal Implications We have already seen deepfakes used in the political arena (the “drunk” Nancy Pelosi deepfake, a reference to which was tweeted by the president), and many commentators worry that deepfake videos will ramp up for the 2020 election. Some of them, including the Pelosi video, are referred to as “cheapfakes” because they are so poorly done (basically running the video at 75 percent speed to simulate drunkenness), but that really doesn’t matter if large numbers of voters believe it’s real. And the days when you could tell a deepfake video by the fact that the person didn’t blink are rapidly vanishing as the algorithms have gotten smarter.
  • Neural Rendering and Reenactment of Human Actor Videos
    Neural Rendering and Reenactment of Human Actor Videos. LINGJIE LIU, University of Hong Kong, Max Planck Institute for Informatics; WEIPENG XU, Max Planck Institute for Informatics; MICHAEL ZOLLHÖFER, Stanford University, Max Planck Institute for Informatics; HYEONGWOO KIM, FLORIAN BERNARD, and MARC HABERMANN, Max Planck Institute for Informatics; WENPING WANG, University of Hong Kong; CHRISTIAN THEOBALT, Max Planck Institute for Informatics.
    Fig. 1. We propose a novel learning-based approach for the animation and reenactment of human actor videos. The top row shows some frames of the video from which the source motion is extracted, and the bottom row shows the corresponding synthesized target person imagery reenacting the source motion.
    We propose a method for generating video-realistic animations of real humans under user control. In contrast to conventional human character rendering, we do not require the availability of a production-quality photo-realistic 3D model of the human, but instead rely on a video sequence in conjunction with a (medium-quality) controllable 3D template model of the person. With that, our approach significantly reduces production cost compared to conventional rendering approaches based on production-quality 3D models, and can also be used to realistically edit existing videos. [...] images are then used to train a conditional generative adversarial network that translates synthetic images of the 3D model into realistic imagery of the human. We evaluate our method for the reenactment of another person that is tracked in order to obtain the motion data, and show video results generated from artist-designed skeleton motion. Our results outperform the state-of-the-art in learning-based human image synthesis. CCS Concepts: • Computing methodologies → Computer graphics;
  • Complement the Broken Pose in Human Image Synthesis
    Focus and Retain: Complement the Broken Pose in Human Image Synthesis. Pu Ge†, Qiushi Huang†, Wei Xiang‡, Xue Jing, Yule Li, Yiyong Li, Zhun Sun. Bigo Technology PTE. LTD, Singapore. {gepu,huangqiushi,xiangwei1}@bigo.sg
    Abstract. Given a target pose, how to generate an image of a specific style with that target pose remains an ill-posed and thus complicated problem. Most recent works treat the human pose synthesis tasks as an image spatial transformation problem using flow warping techniques. However, we observe that, due to the inherent ill-posed nature of many complicated human poses, former methods fail to generate body parts. To tackle this problem, we propose a feature-level flow attention module and an Enhancer Network. The flow attention module produces a flow attention mask to guide the combination of the flow-warped features and the structural pose features. Then, we apply the Enhancer Network to refine the coarse image by injecting the pose information. We present our experimental evaluation both qualitatively and quantitatively on the DeepFashion, Market-1501, and YouTube dance datasets. Quantitative results show that our method achieves an FID of 12.995 on DeepFashion, 25.459 on Market-1501, and 14.516 on the YouTube dance dataset, outperforming state-of-the-art methods including Guide-Pixe2Pixe, Global-Flow-Local-Attn, and CocosNet.
    1. Introduction
    Figure 1: (a) The inputs of the task: a reference image with a target pose. (b) The outputs of the task: a generated human in the target pose. From left to right: the ground truth, results [...]
    Conditional image generation and synthesis has become a popular computer vision task in recent years [29].
  • Fault Tolerance and Re-Training Analysis on Neural Networks
    FAULT TOLERANCE AND RE-TRAINING ANALYSIS ON NEURAL NETWORKS by ABHINAV KURIAN GEORGE, B.Tech Electronics and Communication Engineering, Amrita Vishwa Vidhyapeetham, Kerala, 2012. A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science, Computer Engineering, College of Engineering and Applied Science, University of Cincinnati, Ohio, 2019. Thesis Committee: Chair: Wen-Ben Jone, Ph.D.; Member: Carla Purdy, Ph.D.; Member: Ranganadha Vemuri, Ph.D.
    ABSTRACT. In the current age of big data, artificial intelligence and machine learning technologies have gained much popularity. Due to the increasing demand for such applications, neural networks are being targeted toward hardware solutions. Owing to the shrinking feature size, the number of physical defects is on the rise. This growing number of defects is preventing designers from realizing the full potential of the on-chip design. The challenge now is not only to find solutions that balance high performance and energy efficiency but also to achieve fault tolerance of a computational model. Neural computing, due to its inherent fault-tolerant capabilities, can provide promising solutions to this issue. The primary focus of this thesis is to gain a deeper understanding of fault tolerance in neural network hardware. As a part of this work, we present a comprehensive analysis of fault tolerance by exploring the effects of faults on popular neural models: the multi-layer perceptron model and the convolutional neural network. We built the models based on conventional 64-bit floating point representation. In addition to this, we also explore the recent 8-bit integer quantized representation. A fault injector model is designed to inject stuck-at faults at random locations in the network.
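Stuck-at faults of the kind the thesis injects can be prototyped in software by pinning one bit of a weight's binary representation; below is a minimal sketch for the 64-bit floating point representation discussed above. The helper name and example values are mine, not the thesis code.

```python
import struct

def stuck_at(value, bit, level):
    """Pin one bit (0..63) of a float64 weight to the given level (0 or 1)."""
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    raw = raw | (1 << bit) if level else raw & ~(1 << bit)
    (faulty,) = struct.unpack("<d", struct.pack("<Q", raw))
    return faulty

weight = 0.75
print(stuck_at(weight, 50, 1))  # fault in the mantissa: a small perturbation
print(stuck_at(weight, 62, 1))  # fault in the exponent: a catastrophic change
```

Sweeping such a fault over random bit positions in a model's weights, then re-measuring accuracy, is the general shape of the analysis the abstract describes: mantissa faults usually degrade gracefully, while high exponent bits do not.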
  • Design Perspectives on Delivery Drones
    C O R P O R A T I O N Design Perspectives on Delivery Drones Jia Xu For more information on this publication, visit www.rand.org/t/RR1718z2 Published by the RAND Corporation, Santa Monica, Calif. © Copyright 2017 RAND Corporation R® is a registered trademark. Limited Print and Electronic Distribution Rights This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited. Permission is given to duplicate this document for personal use only, as long as it is unaltered and complete. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial use. For information on reprint and linking permissions, please visit www.rand.org/pubs/permissions. The RAND Corporation is a research organization that develops solutions to public policy challenges to help make communities throughout the world safer and more secure, healthier and more prosperous. RAND is nonprofit, nonpartisan, and committed to the public interest. RAND’s publications do not necessarily reflect the opinions of its research clients and sponsors. Support RAND Make a tax-deductible charitable contribution at www.rand.org/giving/contribute www.rand.org Preface Delivery drones may become widespread over the next five to ten years, particularly for what is known as the “last-mile” logistics of small, light items. Companies such as Amazon, Google, the United Parcel Service (UPS), DHL, and Alibaba have been running high-profile experiments testing drone delivery systems, and the development of such systems reached a milestone when the first commercial drone delivery approved by the Federal Aviation Administration took place on July 17, 2015.
  • In-Datacenter Performance Analysis of a Tensor Processing Unit
    In-Datacenter Performance Analysis of a Tensor Processing Unit. Presented by Josh Fried.
    Background: Machine Learning. Neural networks: ● multi-layer perceptrons ● recurrent neural networks (mostly LSTMs) ● convolutional neural networks. Synapse: each edge has a weight. Neuron: each node sums its weighted inputs and applies a non-linear activation function to the sum. Propagating inputs through a layer of the NN is a matrix multiplication followed by an activation.
    Two phases: ● Training (offline): relaxed deadlines; large batches to amortize the cost of loading weights from DRAM; well suited to GPUs; usually uses floating point. ● Inference (online): strict deadlines, 7-10 ms at Google for some workloads, which limits the possibility for batching; Facebook uses CPUs for inference (last class); can use lower-precision integers (faster/smaller/more efficient).
    ML Workloads @ Google. 90% of ML workload time at Google is spent on MLPs and LSTMs, despite the broader focus on CNNs: RankBrain (search), Inception (image classification), Google Translate, AlphaGo (and others).
    Background: Hardware Trends. End of Moore's Law and Dennard scaling: ● Moore: transistor density doubles every two years. ● Dennard: power stays proportional to chip area as transistors shrink. Machine learning is causing huge growth in demand for compute: ● 2006: excess CPU capacity in datacenters is enough. ● 2013: a projected 3 minutes per day per user of speech recognition would require doubling datacenter compute capacity!
    Google's answer: a custom ASIC. Goal: build a chip that improves cost-performance for NN inference. What are the main costs? Capital costs and operational costs (the power bill!).
    TPU (v1) Design Goals. Short design-deployment cycle: ~15 months! Plugs into a PCIe slot on existing servers. Accelerates matrix multiplication operations. Uses 8-bit integer operations instead of floating point. How does the TPU work? CISC instructions, issued by host.
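The notes above say inference can use lower-precision 8-bit integers. A common way to get there is affine quantization, which maps a float range onto the int8 range via a scale and zero point; the sketch below shows that generic scheme and is not the TPU's actual implementation.

```python
def quantize(values, num_bits=8):
    """Affine quantization of a list of floats to signed num_bits integers."""
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against a constant tensor
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.25, 0.0, 0.5, 1.5]  # invented example values
q, s, z = quantize(weights)
restored = dequantize(q, s, z)
print(max(abs(a - b) for a, b in zip(weights, restored)))  # error bounded by s / 2
```

Matrix multiplies can then run entirely in int8 with int32 accumulation, which is the faster/smaller/more efficient trade-off the slide alludes to; the rounding error per value is at most half the scale.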
  • Abstractions for Programming Graphics Processors in High-Level Programming Languages
    Abstracties voor het programmeren van grafische processoren in hoogniveau-programmeertalen
    Abstractions for Programming Graphics Processors in High-Level Programming Languages
    Tim Besard. Supervisor: Prof. Dr. Ir. B. De Sutter. Dissertation submitted to obtain the degree of Doctor of Engineering Science: Computer Science. Department of Electronics and Information Systems. Chair: Prof. Dr. Ir. K. De Bosschere. Faculty of Engineering and Architecture. Academic year 2018-2019. ISBN 978-94-6355-244-8. NUR 980. Legal deposit: D/2019/10.500/52.
    Examination Committee: Prof. Filip De Turck, chair (Department of Information Technology, Faculty of Engineering and Architecture, Ghent University); Prof. Koen De Bosschere, secretary (Department of Electronics and Information Systems, Faculty of Engineering and Architecture, Ghent University); Prof. Bjorn De Sutter, supervisor (Department of Electronics and Information Systems, Faculty of Engineering and Architecture, Ghent University); Prof. Jutho Haegeman (Department of Physics and Astronomy, Faculty of Sciences, Ghent University); Prof. Jan Lemeire (Department of Electronics and Informatics, Faculty of Engineering, Vrije Universiteit Brussel); Prof. Christophe Dubach (School of Informatics, College of Science & Engineering, The University of Edinburgh); Prof. Alan Edelman (Computer Science & Artificial Intelligence Laboratory, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology).
    Acknowledgments. I honestly didn't know what I was getting into when, in 2012, I went for a chat about a PhD in the catacombs of the Technicum. Whether I had ever worked with LLVM. Many years later, I now work at a desk that actually gets daylight, and the end of this journey is finally in sight. And about time too, so I am told, after 7 years.
  • Unpaired Pose Guided Human Image Generation
    Research Collection. Conference Paper. Unpaired Pose Guided Human Image Generation. Author(s): Chen, Xu; Song, Jie; Hilliges, Otmar. Publication Date: 2019. Permanent Link: https://doi.org/10.3929/ethz-b-000396290. Rights / License: In Copyright - Non-Commercial Use Permitted. This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use. ETH Library.
    Unpaired Pose Guided Human Image Generation. Xu Chen, Jie Song, Otmar Hilliges. AIT Lab, ETH Zurich. {xuchen,jsong,otmarh}@inf.ethz.ch
    Abstract. This paper studies the task of full generative modelling of realistic images of humans, guided only by a coarse sketch of the pose, while providing control over the specific instance or type of outfit worn by the user. This is a difficult problem because input and output domain are very different and direct image-to-image translation becomes infeasible. We propose an end-to-end trainable network under the generative adversarial framework, that provides detailed control over the final appearance while not requiring paired training data and hence allows us to forgo the challenging problem of fitting 3D poses to 2D images. The model allows to generate novel samples conditioned on either an image taken from the target domain or a class label indicating the style of clothing (e.g., t-shirt). We thoroughly evaluate the architecture and the contributions of the individual components experimentally. Finally, we show in a large scale perceptual study that our approach can generate realistic looking images and that participants struggle in detecting fake images versus real samples, especially if faces are blurred.
    Figure 1: Generating humans in clothing: Our network takes a pose sketch as input and generates realistic images, [...]
  • P1360R0: Towards Machine Learning for C++: Study Group 19
    P1360R0: Towards Machine Learning for C++: Study Group 19. Date: 2018-11-26 (post-San Diego mailing), 10 AM ET. Project: ISO JTC1/SC22/WG21: Programming Language C++. Audience: SG19, WG21. Authors: Michael Wong (Codeplay), Vincent Reverdy (University of Illinois at Urbana-Champaign, Paris Observatory), Robert Douglas (Epsilon), Emad Barsoum (Microsoft), Sarthak Pati (University of Pennsylvania), Peter Goldsborough (Facebook), Franke Seide (MS). Contributors' emails: [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected]. Reply to: [email protected]
    Contents: Introduction; Motivation; Scope; Meeting frequency and means; Outreach to ML/AI/Data Science community; Liaison with other groups; Future meetings; Conclusion; Acknowledgements; References.
    Introduction. This paper proposes a WG21 SG for Machine Learning with the goal of: ● Making Machine Learning a first-class citizen in ISO C++. It is the collaboration of a number of key industry, academic, and research groups, through several connections in CPPCON BoF[reference], LLVM 2018 discussions, and the C++ San Diego meeting. The intention is to support such an SG, and describe the scope of such an SG. This is in terms of potential work resulting in papers submitted for future C++ Standards, or collaboration with other SGs. We will also propose ongoing teleconferences, meeting frequency and locations, as well as outreach to ML data scientists, conferences, and liaison with other Machine Learning groups such as at Khronos and ISO. As of the San Diego meeting, this group has been officially created as SG19, and we will begin teleconferences immediately, after the US Thanksgiving, and after NIPS.
  • Towards Incremental Agent Enhancement for Evolving Games
    Evaluating Reinforcement Learning Algorithms For Evolving Military Games. James Chao*, Jonathan Sato*, Crisrael Lucero, Doug S. Lange. Naval Information Warfare Center Pacific. *Equal Contribution. ffi[email protected]
    Abstract. In this paper, we evaluate reinforcement learning algorithms for military board games. Currently, machine learning approaches to most games assume certain aspects of the game remain static. This methodology results in a lack of algorithm robustness and a drastic drop in performance upon changing in-game mechanics. To this end, we will evaluate general game playing (Diego Perez-Liebana 2018) AI algorithms on evolving military games.
    Introduction. AlphaZero (Silver et al. 2017a) described an approach that trained an AI agent through self-play to achieve super-human performance. [...] games in 2013 (Mnih et al. 2013), Google DeepMind developed AlphaGo (Silver et al. 2016) that defeated world champion Lee Sedol in the game of Go using supervised learning and reinforcement learning. One year later, AlphaGo Zero (Silver et al. 2017b) was able to defeat AlphaGo with no human knowledge and pure reinforcement learning. Soon after, AlphaZero (Silver et al. 2017a) generalized AlphaGo Zero to be able to play more games including Chess, Shogi, and Go, creating a more generalized AI to apply to different problems. In 2018, OpenAI Five used five Long Short-term Memory (Hochreiter and Schmidhuber 1997) neural networks and a Proximal Policy Optimization (Schulman et al. 2017) method to defeat a professional DotA team, each LSTM acting as a player in a team to collaborate and achieve a common goal. AlphaStar used a transformer (Vaswani et [...]
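The systems surveyed above all rest on learning value estimates from experience of play. As a concrete, deliberately tiny illustration of that underlying mechanic, here is tabular Q-learning on an invented five-state corridor game; none of the cited systems are this small, and the environment and all names here are mine.

```python
import random

# Invented toy game: a 1-D corridor of states 0..4; the agent starts at 0
# and receives reward 1.0 only upon reaching the goal state 4.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, 1)  # move left / move right

def step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
rng = random.Random(0)
alpha, gamma, eps = 0.5, 0.9, 0.2  # learning rate, discount, exploration

for episode in range(500):
    s, done = 0, False
    while not done:
        if rng.random() < eps:                        # explore
            a = rng.choice(ACTIONS)
        else:                                         # exploit
            a = max(ACTIONS, key=lambda act: q[(s, act)])
        nxt, r, done = step(s, a)
        best_next = 0.0 if done else max(q[(nxt, act)] for act in ACTIONS)
        q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])  # TD update
        s = nxt

policy = [max(ACTIONS, key=lambda act: q[(st, act)]) for st in range(GOAL)]
print(policy)  # the greedy policy should move right in every non-goal state
```

If the game's mechanics were changed mid-training, the learned table would be partly invalidated, which is exactly the robustness problem with static-game assumptions that the abstract raises.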