
International Journal of Research Trends in Computer Science & Information Technology (IJRTCSIT) [ISSN: 2455-6513] Volume 6, Issue 2, December 2020

A review on GPT-3: An AI revolution from OpenAI

Naman Mishra¹, Priyank Singhal², Shakti Kundu³
Teerthanker Mahaveer University, Moradabad, UP, India

[email protected], [email protected], [email protected]

Abstract— When we talk about AI, we think of machines that have the ability to think and to make decisions on their own without any human interference. To do so, one thing has been extremely important: how well a machine is able to process and generate language. Until recently we relied on small-scale statistical models or formalized grammar systems. Over the course of the last few years we have seen better models come out, be it ELMo, BERT or the GPT-n series by OpenAI. In this paper we look at large-scale statistical models, with a focus on GPT-3, currently the largest model in the world, and try to understand their impact on the world of AI.

Keywords— NLP, AI, GPT-3, OpenAI, Turing-NLG

I. INTRODUCTION

Artificial Intelligence is what helps machines learn over time and adjust to new inputs without human interference. John McCarthy defined Artificial Intelligence as, “It is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.” Simply stated, we can say AI refers to man’s pursuit to build machines that can reason, learn and act intelligently.

Some of Artificial Intelligence’s recent achievements show how far AI has come: some have made mainstream headlines with their great achievements, and others have injected themselves very smoothly into our daily regime. In between these extremes, programs of AI have become vital tools in the fields of science and commerce. Even though not every AI achievement makes it to mainstream media, there have been some amazing developments in AI, and they have in some small way become a part of how technology works today. Voice assistants like Amazon Alexa or Google Home are the best examples of products backed by AI around us.

The world of Artificial Intelligence is changing rapidly; every year we are seeing great developments in this sector. Some major events in the past five years were Google’s AI learning to walk, or when, in 2017, Google’s AlphaGo AI was able to beat the grandmaster in the Chinese game of Go, considered one of the most complicated games in the world. We are seeing great advancements in AI, but even though great work is being done, it would be fair to say we are still pretty much in the early stages of Artificial Intelligence.

II. Background

There has been curiosity as to whether humans will even be writing code in the future[1]. Having a good language model would be the first step in the direction of no code. We look into the recent developments in the world of Artificial Intelligence, mainly in the Natural Language Processing domain, and we will start with the background on some major breakthrough models: ELMo, BERT, OpenAI GPT-2, and Microsoft Research’s Turing-NLG.


Embeddings from Language Models, popularly known as ELMo, uses, as suggested by the name, language models (LMs) to create deeply contextualized word embeddings. ELMo uses a bidirectional language model (biLM), pre-trained on a large text corpus, to learn both complex characteristics of word use (e.g., syntax and semantics) and how these uses vary across linguistic contexts (i.e., to model polysemy). The biLM captures context-dependent aspects of word meaning.

Fig 1: A comparison of BERT, GPT and ELMo models

Bidirectional Encoder Representations from Transformers, or BERT, by Google Research is, as apparent from the name, based on bidirectional representations learned from unlabeled text by jointly conditioning on both left and right context in all layers. Because of this, BERT is often hailed as one of the most exciting NLP developments in years. BERT makes use of the Transformer, an attention mechanism that learns contextual relations between words (or sub-words) in a text. In its vanilla form, the Transformer includes two separate mechanisms: an encoder that reads the text input and a decoder that produces a prediction for the task.

The OpenAI GPT-2 is the second version in the GPT-n series from OpenAI, the California-based artificial intelligence research laboratory. “GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text.” It was built with generative pre-training of a language model on a diverse corpus of unlabelled text, followed by discriminative fine-tuning on each specific task[8]. A complete version of GPT-2 was released in November 2019.
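To make these training objectives concrete, the minimal sketch below shows BERT’s fill-in-the-blank (masked word) prediction, which uses context on both sides of the gap; GPT-2’s left-to-right objective is demonstrated in Section IV. The Hugging Face transformers toolkit used here is our own assumption, since none of these papers prescribe a particular library.

```python
# A minimal sketch of BERT's masked-word (fill-mask) objective, using the
# Hugging Face transformers library as a stand-in toolkit (an assumption;
# the models discussed here are not tied to any particular library).
from transformers import pipeline

# BERT predicts the masked word from BOTH left and right context,
# unlike a purely left-to-right model such as GPT-2.
unmasker = pipeline("fill-mask", model="bert-base-uncased")

for prediction in unmasker("The lawyer wore a [MASK] to court."):
    print(f"{prediction['token_str']:>12}  score={prediction['score']:.4f}")
```

Running this prints the model’s top candidates for the masked slot together with their probabilities, illustrating how the bidirectional representation resolves a word from its full sentence context.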
Unveiled in February this year, Turing-NLG is a Transformer-based generative language model from Microsoft Research with the ability to complete open-ended textual tasks. Not only can it finish open-ended conversations, it can also give direct and exact answers to questions. Its most useful capability is that, given a myriad of sentences or large documents, it can generate their summaries. At the time of launch this was the largest language model, with 17 billion learning parameters.

Generative models like Turing-NLG are vital for Natural Language Processing tasks, as the objective is to respond in a direct and accurate way, and as fluently as a human being would, in a given scenario. Previously, many models for question answering and summarisation would rely on extracting existing content from documents that could serve as a stand-in answer or “summary”, but more often than not the results would seem unnatural and unclear. Turing-NLG, in contrast, can answer questions, reply to emails or summarize in a natural way, as a human would. Turing-NLG was the largest model ever published, with 17 billion parameters, but it was dethroned by OpenAI’s latest GPT-3 model, which was trained with 175 billion ML parameters[13]. We will discuss GPT-3 in detail in the next section.

III. Discussion

Generative Pre-trained Transformer 3 (GPT-3) is part of the GPT-n series developed by OpenAI. In simple words, GPT-3 can be described as an autoregressive language model that uses deep learning to generate human-like text. GPT-3 is a much larger neural network in terms of parameters than its predecessor: GPT-2 had 1.5 billion parameters against the 175 billion ML parameters in GPT-3, and it was trained on hundreds of billions of words. It is currently the largest Natural Language Processing transformer, taking the crown from Microsoft’s Turing-NLG, which had 17 billion parameters[3].

Basically, GPT-3 lets humans communicate with machines in simple English rather than having to learn complicated programming languages. By just describing in English what we want the machine to do, we can get it to do that task, be it writing code, completing our sentences, creating a simple website, writing an article or generating images. Since the launch, researchers have been coming up with unique ways the model can be used. One user was even able to “interview” Albert Einstein using the model[12].

As of July 2020, GPT-3 is made available to researchers and interested people via a private beta program. It is currently offered as an API which can be accessed through the cloud, and those who have got their hands on it so far have made some very interesting products that use the capabilities of GPT-3, such as search engines, medical systems and much more.
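As a concrete illustration, a call to the beta API might look like the sketch below. It assumes the openai Python client that accompanied the beta; the engine name, prompt and parameter values are illustrative assumptions, not details taken from this paper.

```python
# A hedged sketch of calling the GPT-3 private-beta API via the openai
# Python client (circa 2020). The engine name and parameters are
# illustrative assumptions, not values from the paper.
import openai

openai.api_key = "YOUR_API_KEY"  # key issued to beta participants

response = openai.Completion.create(
    engine="davinci",            # the largest GPT-3 engine offered in the beta
    prompt="Translate this English description into a SQL query:\n"
           "Count the number of users in each country.\n\nSQL:",
    max_tokens=64,
    temperature=0.3,             # low temperature for more deterministic output
)
print(response.choices[0].text.strip())
```

This plain-English-in, code-out interaction is exactly the “describe what you want” workflow discussed above.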


IV. How it works?

GPT stands for “generative pre-training,” and the 3 specifies the third version. It is generative because, as opposed to other neural networks that output a numeric score or a yes-or-no answer, GPT-3 can generate long sequences of original text as its output[7]. It is pre-trained in the sense that it has not been built with any domain knowledge, even though it can complete domain-specific tasks such as foreign-language translation.

“A language model, in the case of GPT-3, is a program that calculates how likely one word is to appear in a text given the other words in the text. That is what is known as the conditional probability of words.”

For example, in the sentence “I wanted to go for a jog, so I went to look for my ____”, the blank in theory can be filled with any word, but in this particular case the words “running shoes” would score higher than, say, “sandwich”. As the model is trained, i.e., made as accurate as possible across billions and billions of words based on calculations of conditional probability, it learns to correctly predict which word comes next when given an initial word or string. This act of prediction is what we refer to as inference in machine learning.
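Since GPT-3’s weights are not publicly available, the following sketch uses the openly released GPT-2 (again assuming the Hugging Face transformers toolkit) to make conditional probability concrete, scoring candidate next words for the running-shoes sentence above:

```python
# A minimal sketch of next-word conditional probability. GPT-3's weights
# are not public, so the openly released GPT-2 stands in; the idea is
# identical: P(next word | all previous words).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "I wanted to go for a jog, so I went to look for my"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits        # shape: (1, seq_len, vocab_size)

# Probability distribution over the next token, conditioned on the prompt.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

for word in [" running", " shoes", " sandwich"]:
    token_id = tokenizer.encode(word)[0]   # first sub-word token of the candidate
    print(f"P({word.strip()!r} | prompt) = {next_token_probs[token_id]:.6f}")
```

A well-trained model assigns far more probability mass to “running” or “shoes” than to “sandwich” after this prompt; scaled up to 175 billion parameters and hundreds of billions of words, this same inference step is what GPT-3 performs.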

Some examples of what people were able to do with GPT-3:
1. Andrew Mayne came up with a model which could summarize movies in the form of emojis using GPT-3.
2. Faraaz Nishtar was able to use GPT-3 in a way that it could write SQL queries from simple text.
3. Twitter user Paras Chopra used GPT-3 to make a search engine which, for any query, could return the exact answer and a corresponding URL.
4. Arram Sabeti used GPT-3 to generate poems in the style of Dr. Seuss.
5. Bemmu Sepponen generated an entire presentation using the GPT-3 model.

These were just a few of the hundreds of ways people have been able to implement GPT-3 in a useful way[6]. Newer use-case scenarios keep coming, like being able to generate an image just by describing it via text. But there has been some scepticism regarding the model too; in the next section we will see some of the issues that have been raised.

V. Problems with GPT-3

Ever since GPT-3 was offered for beta testing, it has created a lot of buzz among people and the media with its capabilities, as people keep finding out all that can be done with this model. As goes with all tech, with all the good come some issues, like the AI lacking common sense[4], which were raised by the media and the community as they looked deeper into the model and its workings. Below are some issues and concerns with GPT-3.

Fig 02: Text generated by GPT-3 model

• Lacking semantic understanding: Many have raised the issue that GPT-3 can at times produce odd answers when completing a prompt. This text, generated for an MIT Technology Review piece, is the best example (the text in bold is generated by GPT-3):

“You are a defense lawyer and you have to go to court today. Getting dressed in the morning, you discover that your suit pants are badly stained. However, your bathing suit is clean and very stylish. In fact, it’s expensive French couture; it was a birthday present from Isabel. You decide that you should wear the bathing suit to court. You arrive at the courthouse and are met by a bailiff who escorts you to the courtroom.”

Here the mention of a “clean bathing suit” led GPT-3 to believe a bathing suit would be a viable option for a lawyer to wear to court[5]. The reviewers tried a number of examples and found that GPT-3 was often producing text that made no sense. This may be mainly due to the fact that GPT-3 has a tunnel-vision understanding of how words relate to one another; it does not derive any real meaning from all those words.

Fig 03: Biased text generated via GPT-3

• Bias in generated text: Some researchers found that even though the GPT-3 model works amazingly at generating text, it is also racially biased[2]. When it was prompted to write tweets from a single word, it generated some racially insensitive and sexist text[11]. This is because it has been trained on the world wide web and represents the views of the people on it, whether they are racist, sexist or insensitive.

The major concern is misinformation and fake news articles. Some other concerns are the risk of a rise in spam and phishing. It is also a possibility that the model will lead to students letting the AI write essays and articles for them, making just a few tweaks before submitting them to their professors, which under normal circumstances would come under plagiarism. But despite these concerns, GPT-3 is still the best model we have at the moment and will hopefully be important in major advancements in the near future[9][10].

VI. Conclusion

We are truly living in a stage of AI where each year we see a decade’s worth of advancements in the sector; every little piece of tech around us is now influenced by some sort of machine learning, be it a smart bulb which can change colour according to the needs of a room or voice assistants which can keep up with and distinguish between our voices. GPT-3 is a model which can, based on textual input, write code, create a website, complete conversations, write articles and even produce images. A language model like GPT-3 is truly a breakthrough, not mainly because it can do something which was not possible before, but because it can do so many different things under the same algorithm. GPT-3 has great potential, but it has also raised some issues which will have to be taken into consideration before going overboard with praise. In short, GPT-3 has some flaws, but it is still the most exciting thing to happen in the world of AI for a while, and in a lot of ways it seems like a step in the correct direction. We are still far from Artificial General Intelligence, but models like GPT-3 are a great pathway to reaching there.

References

[1] Will humans even write code in 2040 and what would that mean for extreme heterogeneity in computing? https://arxiv.org/pdf/1712.00676.pdf
[2] The Radicalization Risks of GPT-3 and Advanced Neural Language Models. https://arxiv.org/abs/2009.06807
[3] GPT-3 AI language tool calls for cautious optimism. https://www.emerald.com/insight/content/doi/10.1108/OXAN-DB256373/full/html
[4] Can GPT-3 Pass a Writer’s Turing Test? https://culturalanalytics.org/article/17212.pdf
[5] GPT-3, Bloviator: OpenAI’s language generator has no idea what it’s talking about. https://www.technologyreview.com/2020/08/22/1007539/gpt3--language-generator-artificial-intelligence-ai-opinion/


[6] Crazy GPT-3 Use Cases. https://medium.com/towards-artificial-intelligence/crazy-gpt-3-use-cases-232c22142044
[7] What is GPT-3? Everything your business needs to know about OpenAI’s breakthrough AI language program. https://www.zdnet.com/article/what-is-gpt-3-everything-business-needs-to-know-about-openais-breakthrough-ai-language-program/
[8] Will The Latest AI Kill Coding? https://towardsdatascience.com/will-gpt-3-kill-coding-630e4518c04d
[9] OpenAI’s new language generator GPT-3 is shockingly good—and completely mindless. https://www.technologyreview.com/2020/07/20/1005454/openai-machine-learning-language-generator-gpt-3-nlp/
[10] GPT-3 Has Its Breakthroughs As Well As Flaws. https://www.analyticsinsight.net/gpt-3-breakthroughs-well-flaws/
[11] Your favorite A.I. language tool is toxic. https://fortune.com/2020/09/29/artificial-intelligence-openai-gpt3-toxic/
[12] Interview with Albert Einstein. https://twitter.com/maraoz/status/1285466920268029952?s=20, accessed 30/10/2020.
[13] What Can You Do with the OpenAI GPT-3 Language Model? https://blog.exxactcorp.com/what-can-you-do-with-the-openai-gpt-3-language-model/
