Arxiv:1611.08903V1 [Cs.LG] 27 Nov 2016 20 Percent Increases in Cash Collections [20]
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Elements of DSAI: Game Tree Search, Learning Architectures
Introduction Games Game Search Evaluation Fns AlphaGo/Zero Summary Quiz References Elements of DSAI AlphaGo Part 2: Game Tree Search, Learning Architectures Search & Learn: A Recipe for AI Action Decisions J¨orgHoffmann Winter Term 2019/20 Hoffmann Elements of DSAI Game Tree Search, Learning Architectures 1/49 Introduction Games Game Search Evaluation Fns AlphaGo/Zero Summary Quiz References Competitive Agents? Quote AI Introduction: \Single agent vs. multi-agent: One agent or several? Competitive or collaborative?" ! Single agent! Several koalas, several gorillas trying to beat these up. BUT there is only a single acting entity { one player decides which moves to take (who gets into the boat). Hoffmann Elements of DSAI Game Tree Search, Learning Architectures 3/49 Introduction Games Game Search Evaluation Fns AlphaGo/Zero Summary Quiz References Competitive Agents! Quote AI Introduction: \Single agent vs. multi-agent: One agent or several? Competitive or collaborative?" ! Multi-agent competitive! TWO players deciding which moves to take. Conflicting interests. Hoffmann Elements of DSAI Game Tree Search, Learning Architectures 4/49 Introduction Games Game Search Evaluation Fns AlphaGo/Zero Summary Quiz References Agenda: Game Search, AlphaGo Architecture Games: What is that? ! Game categories, game solutions. Game Search: How to solve a game? ! Searching the game tree. Evaluation Functions: How to evaluate a game position? ! Heuristic functions for games. AlphaGo: How does it work? ! Overview of AlphaGo architecture, and changes in Alpha(Go) Zero. Hoffmann Elements of DSAI Game Tree Search, Learning Architectures 5/49 Introduction Games Game Search Evaluation Fns AlphaGo/Zero Summary Quiz References Positioning in the DSAI Phase Model Hoffmann Elements of DSAI Game Tree Search, Learning Architectures 6/49 Introduction Games Game Search Evaluation Fns AlphaGo/Zero Summary Quiz References Which Games? ! No chance element. -
ELF: an Extensive, Lightweight and Flexible Research Platform for Real-Time Strategy Games
ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games Yuandong Tian1 Qucheng Gong1 Wenling Shang2 Yuxin Wu1 C. Lawrence Zitnick1 1Facebook AI Research 2Oculus 1fyuandong, qucheng, yuxinwu, [email protected] [email protected] Abstract In this paper, we propose ELF, an Extensive, Lightweight and Flexible platform for fundamental reinforcement learning research. Using ELF, we implement a highly customizable real-time strategy (RTS) engine with three game environ- ments (Mini-RTS, Capture the Flag and Tower Defense). Mini-RTS, as a minia- ture version of StarCraft, captures key game dynamics and runs at 40K frame- per-second (FPS) per core on a laptop. When coupled with modern reinforcement learning methods, the system can train a full-game bot against built-in AIs end- to-end in one day with 6 CPUs and 1 GPU. In addition, our platform is flexible in terms of environment-agent communication topologies, choices of RL methods, changes in game parameters, and can host existing C/C++-based game environ- ments like ALE [4]. Using ELF, we thoroughly explore training parameters and show that a network with Leaky ReLU [17] and Batch Normalization [11] cou- pled with long-horizon training and progressive curriculum beats the rule-based built-in AI more than 70% of the time in the full game of Mini-RTS. Strong per- formance is also achieved on the other two games. In game replays, we show our agents learn interesting strategies. ELF, along with its RL platform, is open sourced at https://github.com/facebookresearch/ELF. 1 Introduction Game environments are commonly used for research in Reinforcement Learning (RL), i.e. -
Improved Policy Networks for Computer Go
Improved Policy Networks for Computer Go Tristan Cazenave Universite´ Paris-Dauphine, PSL Research University, CNRS, LAMSADE, PARIS, FRANCE Abstract. Golois uses residual policy networks to play Go. Two improvements to these residual policy networks are proposed and tested. The first one is to use three output planes. The second one is to add Spatial Batch Normalization. 1 Introduction Deep Learning for the game of Go with convolutional neural networks has been ad- dressed by [2]. It has been further improved using larger networks [7, 10]. AlphaGo [9] combines Monte Carlo Tree Search with a policy and a value network. Residual Networks improve the training of very deep networks [4]. These networks can gain accuracy from considerably increased depth. On the ImageNet dataset a 152 layers networks achieves 3.57% error. It won the 1st place on the ILSVRC 2015 classifi- cation task. The principle of residual nets is to add the input of the layer to the output of each layer. With this simple modification training is faster and enables deeper networks. Residual networks were recently successfully adapted to computer Go [1]. As a follow up to this paper, we propose improvements to residual networks for computer Go. The second section details different proposed improvements to policy networks for computer Go, the third section gives experimental results, and the last section con- cludes. 2 Proposed Improvements We propose two improvements for policy networks. The first improvement is to use multiple output planes as in DarkForest. The second improvement is to use Spatial Batch Normalization. 2.1 Multiple Output Planes In DarkForest [10] training with multiple output planes containing the next three moves to play has been shown to improve the level of play of a usual policy network with 13 layers. -
RV - RA - RM Realidades Virtuales, Aumentadas Y Mixtas Índice General
RV - RA - RM Realidades virtuales, aumentadas y mixtas Índice general 1 Definiciones 1 1.1 Realidad virtual ............................................ 1 1.1.1 Virtualidad .......................................... 1 1.1.2 Relación realidad/irrealidad ................................. 1 1.1.3 Inmersión y navegación .................................... 2 1.1.4 Usos ............................................. 2 1.1.5 Productos ........................................... 3 1.1.6 Técnicas de realidad virtual ................................. 6 1.1.7 Problemas de la realidad virtual ............................... 7 1.1.8 Tecnoética realidad virtual .................................. 8 1.1.9 Véase también ........................................ 9 1.1.10 Referencias .......................................... 9 1.1.11 Enlaces externos ....................................... 10 1.1.12 Bibliografía .......................................... 10 1.2 Realidad aumentada .......................................... 10 1.2.1 Definiciones ......................................... 11 1.2.2 Cronología .......................................... 11 1.2.3 Tecnología .......................................... 12 1.2.4 Técnicas de visualización ................................... 13 1.2.5 Elementos de la realidad aumentada ............................. 13 1.2.6 Niveles ............................................ 14 1.2.7 Aplicaciones ......................................... 14 1.2.8 Aplicaciones futuras ..................................... 16 1.2.9 Literatura -
Unsupervised State Representation Learning in Atari
Unsupervised State Representation Learning in Atari Ankesh Anand⇤ Evan Racah⇤ Sherjil Ozair⇤ Mila, Université de Montréal Mila, Université de Montréal Mila, Université de Montréal Microsoft Research Yoshua Bengio Marc-Alexandre Côté R Devon Hjelm Mila, Université de Montréal Microsoft Research Microsoft Research Mila, Université de Montréal Abstract State representation learning, or the ability to capture latent generative factors of an environment, is crucial for building intelligent agents that can perform a wide variety of tasks. Learning such representations without supervision from rewards is a challenging open problem. We introduce a method that learns state representations by maximizing mutual information across spatially and tem- porally distinct features of a neural encoder of the observations. We also in- troduce a new benchmark based on Atari 2600 games where we evaluate rep- resentations based on how well they capture the ground truth state variables. We believe this new framework for evaluating representation learning models will be crucial for future representation learning research. Finally, we com- pare our technique with other state-of-the-art generative and contrastive repre- sentation learning methods. The code associated with this work is available at https://github.com/mila-iqia/atari-representation-learning 1 Introduction The ability to perceive and represent visual sensory data into useful and concise descriptions is con- sidered a fundamental cognitive capability in humans [1, 2], and thus crucial for building intelligent agents [3]. Representations that concisely capture the true state of the environment should empower agents to effectively transfer knowledge across different tasks in the environment, and enable learning with fewer interactions [4]. -
Spectrally-Based Audio Distances Are Bad at Pitch
I’m Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch Joseph Turian Max Henry [email protected] [email protected] Abstract Growing research demonstrates that synthetic failure modes imply poor generaliza- tion. We compare commonly used audio-to-audio losses on a synthetic benchmark, measuring the pitch distance between two stationary sinusoids. The results are sur- prising: many have poor sense of pitch direction. These shortcomings are exposed using simple rank assumptions. Our task is trivial for humans but difficult for these audio distances, suggesting significant progress can be made in self-supervised audio learning by improving current losses. 1 Introduction Rather than a physical quantity contained in the signal, pitch is a percept that is strongly correlated with the fundamental frequency of a sound. While physically elusive, pitch is a natural dimension to consider when comparing sounds. Children as young as 8 months old are sensitive to melodic contour [65], implying that an early and innate sense of relative pitch is important to parsing our auditory experience. This basic ability is distinct from the acute pitch sensitivity of professional musicians, which improves as a function of musical training [36]. Pitch is important to both music and speech signals. Typically, speech pitch is more narrowly circumscribed than music, having a range of 70–1000Hz, compared to music which spans roughly the range of a grand piano, 30–4000Hz [31]. Pitch in speech signals also fluctuates more rapidly [6]. In tonal languages, pitch has a direct influence on the meaning of words [8, 73, 76]. For non-tone languages such as English, pitch is used to infer (supra-segmental) prosodic and speaker-dependent features [19, 21, 76]. -
Augmented Reality Applied Tolanguage Translation
Ana Rita de Tróia Salvado Licenciado em Ciências da Engenharia Electrotécnica e de Computadores Augmented Reality Applied to Language Translation Dissertação para obtenção do Grau de Mestre em Engenharia Electrotécnica e de Computadores Orientador : Prof. Dr. José António Barata de Oliveira, Prof. Auxiliar, Universidade Nova de Lisboa Júri: Presidente: Doutor João Paulo Branquinho Pimentão, FCT/UNL Arguente: Doutor Tiago Oliveira Machado de Figueiredo Cardoso, FCT/UNL Vogal: Doutor José António Barata de Oliveira, FCT/UNL September, 2015 iii Augmented Reality Applied to Language Translation Copyright c Ana Rita de Tróia Salvado, Faculdade de Ciências e Tecnologia, Universi- dade Nova de Lisboa A Faculdade de Ciências e Tecnologia e a Universidade Nova de Lisboa têm o direito, perpétuo e sem limites geográficos, de arquivar e publicar esta dissertação através de ex- emplares impressos reproduzidos em papel ou de forma digital, ou por qualquer outro meio conhecido ou que venha a ser inventado, e de a divulgar através de repositórios científicos e de admitir a sua cópia e distribuição com objectivos educacionais ou de in- vestigação, não comerciais, desde que seja dado crédito ao autor e editor. iv To my beloved family... vi Acknowledgements "Coming together is a beginning; keeping together is progress; working together is success." - Henry Ford. Life can only be truly enjoyed when people get together to create and share moments and memories. Greatness can be easily achieved by working together and being sup- ported by others. For this reason, I would like to save a special place in this work to thank people who were there and supported me during all this learning process. -
Echtzeitübersetzung Dank Augmented Reality Und Spracherkennung Eine Untersuchung Am Beispiel Der Google Translate-App
ECHTZEITÜBERSETZUNG DANK AUGMENTED REALITY UND SPRACHERKENNUNG EINE UNTERSUCHUNG AM BEISPIEL DER GOOGLE TRANSLATE-APP Julia Herb, BA Matrikelnummer: 01216413 MASTERARBEIT eingereicht im Rahmen des Masterstudiums Translationswissenschaft Spezialisierung: Fachkommunikation an der Leopold-Franzens-Universität Innsbruck Philologisch-Kulturwissenschaftliche Fakultät Institut für Translationswissenschaft betreut von: ao. Univ.-Prof. Mag. Dr. Peter Sandrini Innsbruck, am 22.07.2019 1 Inhaltsverzeichnis Abstract ........................................................................................................................................................... 1 Abbildungsverzeichnis .................................................................................................................................... 2 1. Aktualität des Themas ............................................................................................................................. 4 2. Augmented Reality: die neue Form der Multimedialität ........................................................................ 7 a. Definition und Abgrenzung ................................................................................................................ 7 b. Anwendungsbereiche der AR-Technologie ...................................................................................... 11 3. Verbmobil: neue Dimensionen der maschinellen Übersetzung dank Spracherkennung ....................... 13 a. Besonderheiten eines Echtzeit-Übersetzungs-Projekts .................................................................... -
Arxiv:2008.01352V3 [Cs.LG] 23 Mar 2021 Spatiotemporal Disentanglement in the PDE Formalism
Published as a conference paper at ICLR 2021 PDE-DRIVEN SPATIOTEMPORAL DISENTANGLEMENT Jérémie Donà,y∗ Jean-Yves Franceschi,y∗ Sylvain Lampriery & Patrick Gallinariyz ySorbonne Université, CNRS, LIP6, F-75005 Paris, France zCriteo AI Lab, Paris, France [email protected] ABSTRACT A recent line of work in the machine learning community addresses the problem of predicting high-dimensional spatiotemporal phenomena by leveraging specific tools from the differential equations theory. Following this direction, we propose in this article a novel and general paradigm for this task based on a resolution method for partial differential equations: the separation of variables. This inspiration allows us to introduce a dynamical interpretation of spatiotemporal disentanglement. It induces a principled model based on learning disentangled spatial and temporal representations of a phenomenon to accurately predict future observations. We experimentally demonstrate the performance and broad applicability of our method against prior state-of-the-art models on physical and synthetic video datasets. 1 INTRODUCTION The interest of the machine learning community in physical phenomena has substantially grown for the last few years (Shi et al., 2015; Long et al., 2018; Greydanus et al., 2019). In particular, an increasing amount of works studies the challenging problem of modeling the evolution of dynamical systems, with applications in sensible domains like climate or health science, making the understanding of physical phenomena a key challenge in machine learning. To this end, the community has successfully leveraged the formalism of dynamical systems and their associated differential formulation as powerful tools to specifically design efficient prediction models. In this work, we aim at studying this prediction problem with a principled and general approach, through the prism of Partial Differential Equations (PDEs), with a focus on learning spatiotemporal disentangled representations. -
Large Margin Deep Networks for Classification
Large Margin Deep Networks for Classification Gamaleldin F. Elsayed ∗ Dilip Krishnan Hossein Mobahi Kevin Regan Google Research Google Research Google Research Google Research Samy Bengio Google Research {gamaleldin, dilipkay, hmobahi, kevinregan, bengio}@google.com Abstract We present a formulation of deep learning that aims at producing a large margin classifier. The notion of margin, minimum distance to a decision boundary, has served as the foundation of several theoretically profound and empirically suc- cessful results for both classification and regression tasks. However, most large margin algorithms are applicable only to shallow models with a preset feature representation; and conventional margin methods for neural networks only enforce margin at the output layer. Such methods are therefore not well suited for deep networks. In this work, we propose a novel loss function to impose a margin on any chosen set of layers of a deep network (including input and hidden layers). Our formulation allows choosing any lp norm (p ≥ 1) on the metric measuring the margin. We demonstrate that the decision boundary obtained by our loss has nice properties compared to standard classification loss functions. Specifically, we show improved empirical results on the MNIST, CIFAR-10 and ImageNet datasets on multiple tasks: generalization from small training sets, corrupted labels, and robustness against adversarial perturbations. The resulting loss is general and complementary to existing data augmentation (such as random/adversarial input transform) and regularization techniques such as weight decay, dropout, and batch norm. 2 1 Introduction The large margin principle has played a key role in the course of machine learning history, producing remarkable theoretical and empirical results for classification (Vapnik, 1995) and regression problems (Drucker et al., 1997). -
Machine Learning Is Going Mobile
Machine learning is going mobile By David Schatsky ACHINE learning—the process by which Signals Mcomputers can get better at perform- ing tasks through exposure to data, rather • Google has introduced language translation than through explicit programming—requires software, using small neural networks opti- massive computational power, the kind usu- mized for mobile phones, which can per- ally found in clusters of energy-guzzling, form well without an Internet connection.1 cloud-based computer servers outfitted with specialized processors. But an emerging trend • Lenovo announced a mobile phone that promises to bring the power of machine learn- uses multiple sensors, high-speed image ing to mobile devices that may lack or have processing hardware, and specialized only intermittent online connectivity. This will Google software to support capabili- give rise to machines that sense, perceive, learn ties such as indoor wayfinding, precision from, and respond to their environment and measuring, and augmented reality even their users, enabling the emergence of new when offline.2 product categories, reshaping how businesses engage with customers, and transforming how • NVIDIA, a maker of graphics processing work gets done across industries. technology, introduced an embeddable module for computer vision applications Machine learning is going mobile in devices such as drones and autono- systems run in the cloud on powerful serv- mous vehicles that the company says ers, processing data such as digitized voice or consumes one-tenth the power of a photos that users upload. competing offering.3 Until recently, a typical smartphone lacked the power to perform such tasks without con- • Qualcomm introduced a new processor and necting to the cloud, except in limited ways. -
Understanding & Generalizing Alphago Zero
Under review as a conference paper at ICLR 2019 UNDERSTANDING &GENERALIZING ALPHAGO ZERO Anonymous authors Paper under double-blind review ABSTRACT AlphaGo Zero (AGZ) (Silver et al., 2017b) introduced a new tabula rasa rein- forcement learning algorithm that has achieved superhuman performance in the games of Go, Chess, and Shogi with no prior knowledge other than the rules of the game. This success naturally begs the question whether it is possible to develop similar high-performance reinforcement learning algorithms for generic sequential decision-making problems (beyond two-player games), using only the constraints of the environment as the “rules.” To address this challenge, we start by taking steps towards developing a formal understanding of AGZ. AGZ includes two key innovations: (1) it learns a policy (represented as a neural network) using super- vised learning with cross-entropy loss from samples generated via Monte-Carlo Tree Search (MCTS); (2) it uses self-play to learn without training data. We argue that the self-play in AGZ corresponds to learning a Nash equilibrium for the two-player game; and the supervised learning with MCTS is attempting to learn the policy corresponding to the Nash equilibrium, by establishing a novel bound on the difference between the expected return achieved by two policies in terms of the expected KL divergence (cross-entropy) of their induced distributions. To extend AGZ to generic sequential decision-making problems, we introduce a robust MDP framework, in which the agent and nature effectively play a zero-sum game: the agent aims to take actions to maximize reward while nature seeks state transitions, subject to the constraints of that environment, that minimize the agent’s reward.