
International Journal of Research Trends in Computer Science & Information Technology (IJRTCSIT) [ISSN: 2455-6513], Volume 6, Issue 2, December 2020

A review on GPT-3 - An AI revolution from OpenAI

Naman Mishra, Priyank Singhal, Shakti Kundu
Teerthanker Mahaveer University, Moradabad, UP, India
[email protected], [email protected], [email protected]

Abstract— When we talk about AI we think of machines that have the ability to think and to make decisions on their own without any human interference. To do so, one thing has been extremely important: how well a machine is able to process and generate language. Until recently we relied on small-scale statistical models or formalized grammar systems. Over the course of the last few years we have seen better models come out, be it ELMo, BERT or the GPT-n series by OpenAI. In this paper we look at large-scale statistical models, with a focus on GPT-3, currently the largest model in the world, and try to understand their impact on the world of AI.

Keywords— NLP, AI, GPT-3, OpenAI, Turing-NLG

I. INTRODUCTION

Artificial intelligence is what helps machines learn over time and adjust to new inputs without human interference. John McCarthy defined Artificial Intelligence as "the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable." Simply stated, AI refers to man's pursuit to build machines that can reason, learn and act intelligently.

Some of Artificial Intelligence's recent achievements show how far the field has come: some stand out as great achievements in their own right, and others have injected themselves very smoothly into our daily regime. In between these extremes, AI programs have become vital tools in science and commerce. Even though not every AI achievement makes it to mainstream media, there have been some amazing developments, and they have in some small way become a part of how technology works today. Voice assistants like Amazon Alexa or Google Home are the best examples of AI-backed products around us.

The world of Artificial Intelligence is changing rapidly; every year we are seeing great developments in this sector. Some major events of the past five years were Google DeepMind's AI learning to walk, or Google's AlphaGo beating a grandmaster in 2017 at the Chinese game of Go, considered one of the most complicated games in the world. Great advancements are being made, but even so it would be fair to say we are still pretty much in the early stages of Artificial Intelligence.

II. Background

There has been curiosity as to whether humans will even be writing code in the future[1]. Having a good language model would be the first step in the direction of no-code. Here we look into recent developments in Artificial Intelligence, mainly in the Natural Language Processing domain, starting with the background on some major breakthrough models: ELMo, BERT, OpenAI GPT-2 and Microsoft Research's Turing-NLG.

Embeddings from Language Models, popularly known as ELMo, uses language models (LMs), as the name suggests, to create deeply contextualized word embeddings. ELMo uses a bidirectional language model (biLM), pre-trained on a large text corpus, to model both complex characteristics of word use (e.g., syntax and semantics) and how those uses vary across linguistic contexts (i.e., to model polysemy). The biLM captures context-dependent aspects of word meaning.
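To make "contextual" concrete, here is a minimal sketch that loads a pre-trained ELMo biLM and embeds two sentences in which the same word means different things. The TensorFlow Hub module URL, signature name and output key below are this sketch's assumptions, not something the paper prescribes, and may differ across module versions.

# Minimal sketch: contextual embeddings from a pre-trained ELMo biLM,
# assuming TensorFlow 2.x, tensorflow_hub, and the TF-Hub ELMo module.
import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.load("https://tfhub.dev/google/elmo/3")

sentences = tf.constant([
    "The bank raised interest rates.",  # "bank" = financial institution
    "They sat on the river bank.",      # "bank" = edge of a river
])

# Unlike static word vectors, the vector ELMo assigns to "bank" differs
# between the two sentences, because the biLM reads the whole sentence
# in both directions before emitting an embedding.
embeddings = elmo.signatures["default"](sentences)["elmo"]
print(embeddings.shape)  # (2, max_tokens, 1024)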
[Fig. 1: A comparison of the BERT, GPT and ELMo models]

Bidirectional Encoder Representations from Transformers, or BERT, by Google Research is, as the name makes apparent, based on bidirectional representations learned from unlabeled text by jointly conditioning on both left and right context in all layers. Because of this, BERT is often hailed as one of the most exciting developments in recent years. BERT makes use of the Transformer, an attention mechanism that learns contextual relations between words (or sub-words) in a text. In its vanilla form, the Transformer includes two separate mechanisms: an encoder that reads the text input and a decoder that produces a prediction for the task.

OpenAI GPT-2 is the second version in the GPT-n series from the California-based artificial intelligence research laboratory OpenAI. "GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text." The approach pairs generative pre-training of a language model on a diverse corpus of unlabelled text with discriminative fine-tuning on each specific task[8]. A complete version of GPT-2 was released in November 2019.
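Both of these models have public checkpoints, so their training objectives can be exercised directly. First, a small sketch of BERT's bidirectional conditioning: predicting a masked word from the context on both sides. The Hugging Face transformers library is this sketch's choice of toolkit, not the paper's.

# Sketch: masked-word prediction with a pre-trained BERT checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT scores candidate fillers using the words on both the left
# and the right of the [MASK] token.
for candidate in fill_mask("The doctor prescribed a [MASK] for the infection."):
    print(candidate["token_str"], round(candidate["score"], 3))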
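And a matching sketch of GPT-2's next-word objective, sampling a continuation token by token (the smallest public checkpoint is used here to keep the example light):

# Sketch: open-ended generation with the public GPT-2 weights.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Artificial intelligence will", return_tensors="pt")
# Generation repeatedly applies the training objective: predict the
# next word given all previous words, append it, and repeat.
outputs = model.generate(
    **inputs,
    max_length=40,
    do_sample=True,
    top_k=50,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))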
Unveiled only this year, in February, Turing-NLG is a Transformer-based generative language model with the ability to complete open-ended textual tasks. Not only can it finish open-ended conversations, it can also give direct and exact answers to questions. Considered most useful is that, given a myriad of sentences or large documents, it can generate summaries of them. At the time of launch it was the largest language model, with 17.5 billion learned parameters.

Generative models like Turing-NLG are vital for Natural Language Processing tasks where the objective is to respond as directly, accurately and fluently as a human being would in a given scenario. Previously, many models for question answering and summarisation relied on extracting existing content from documents that might serve as a stand-in answer or "summary", but more often than not the results seemed unusual and unclear. Turing-NLG, by contrast, can answer questions, reply to emails or summarize in a natural way, as a human would. Turing-NLG was the largest model ever published, with its 17.5 billion parameters, until it was dethroned by OpenAI's latest GPT-3 model, which was trained with 175 billion ML parameters[13]. We discuss GPT-3 in detail in the next section.

III. Discussion

Generative Pre-trained Transformer 3 (GPT-3) is part of the GPT-n series developed by OpenAI. GPT-3, in simple words, can be described as an autoregressive language model that uses deep learning to generate human-like text. GPT-3 is a much larger neural network in terms of parameters than its predecessor GPT-2: 175 billion ML parameters against GPT-2's 1.5 billion, trained on hundreds of billions of words. It is currently the largest Natural Language Processing transformer, taking the crown from Microsoft's Turing-NLG with its 17 billion parameters[3].

Basically, GPT-3 lets humans communicate with machines in simple English rather than having to learn complicated programming languages, by which we mean that just by describing in English what we want the machine to do, we can get it to do that task, be it writing code, completing our sentences, creating a simple website, writing an article or generating images. Since the launch, researchers have been coming up with unique ways the model can be used. One Twitter user was able to interview Albert Einstein using the model[12].

As of July 2020, GPT-3 is made available to researchers and other interested people via a private beta program. It is currently being offered as an API accessed through the cloud (a sketch of a beta-era API call is given at the end of this section), and so far those who got their hands on it have built some very interesting products on top of GPT-3's capabilities, such as search engines, medical question-answering systems and much more.

4. Arram Sabeti used GPT-3 to generate poems about Elon Musk the way they would have been written by Dr. Seuss.
5. Bemmu Sepponen generated an entire presentation using the GPT-3 model.

These were just a few of the hundreds of ways people have been able to implement GPT-3 in a useful way[6]. Newer use-case scenarios keep appearing, like being able to generate an image just by describing it via text. But there has been some scepticism regarding the model too; in Section V we will see some of the issues that have been raised.
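As promised above, here is a sketch of how a beta participant could call the hosted model. It follows the 2020 beta-era Completions interface of OpenAI's Python client; the key, engine name and prompt are illustrative, and the interface may change as the program evolves.

# Sketch: calling the GPT-3 private-beta API (2020-era interface).
import openai

openai.api_key = "YOUR_API_KEY"  # issued to private-beta participants

response = openai.Completion.create(
    engine="davinci",  # the largest GPT-3 engine offered in the beta
    prompt="Write a one-sentence description of a medical question-answering service:",
    max_tokens=40,
    temperature=0.7,
)
print(response.choices[0].text.strip())

Note that the task is specified entirely in plain English inside the prompt; no task-specific fine-tuning is involved.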
IV. How it works?

GPT-3 stands for "generative pre-training", and the three specifies the third version. It is generative because, as opposed to other neural networks that give a numeric score or a yes-or-no answer, GPT-3 can generate long sequences of original text as its output[7]. It is pre-trained in the sense that it has not been built with any domain knowledge, even though it can complete domain-specific tasks, such as foreign-language translation. "A language model, in the case of GPT-3, is a program that calculates how likely one word is to appear in a text given the other words in the text."
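Because GPT-3's weights are not public, that definition is easiest to illustrate with its predecessor GPT-2, whose weights are; the mechanics are identical. A minimal sketch, again assuming the Hugging Face transformers library:

# Sketch: "how likely is this word, given the other words?"
# GPT-2 stands in for GPT-3, whose weights are not public.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

context = "The capital of France is"
inputs = tokenizer(context, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]

# Softmax turns the raw scores into a probability for every word
# in the vocabulary, conditioned on the context so far.
probs = torch.softmax(next_token_logits, dim=-1)
paris_id = tokenizer.encode(" Paris")[0]
print(f"P(' Paris' | '{context}') = {probs[paris_id]:.4f}")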
V. Problems with GPT-3

Ever since GPT-3 was offered for beta testing it has created a lot of buzz among people and the media, with people discovering its capabilities and finding out all that can be done with the model. But as goes with all technology, with the good come some issues: concerns such as the AI lacking common sense[4] were raised by the media.