Autogenerative Networks Oscar Chang Submitted in Partial

Total Page:16

File Type:pdf, Size:1020Kb

Autogenerative Networks Oscar Chang Submitted in Partial Autogenerative Networks Oscar Chang Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the Graduate School of Arts and Sciences COLUMBIA UNIVERSITY 2021 © 2021 Oscar Chang All Rights Reserved Abstract Autogenerative Networks Oscar Chang Artificial intelligence powered by deep neural networks has seen tremendous improvements in the last decade, achieving superhuman performance on a diverse range of tasks. Many worry that it can one day develop the ability to recursively self-improve itself, leading to an intelligence explosion known as the Singularity. Autogenerative networks, or neural networks generating neural networks, is one major plausible pathway towards realizing this possibility. The object of this thesis is to study various challenges and applications of small-scale autogenerative networks in domains such as artificial life, reinforcement learning, neural network initialization and optimization, gradient-based meta-learning, and logical networks. Chapters 2 and 3 describe novel mechanisms for generating neural network weights and embeddings. Chapters 4 and 5 identify problems and propose solutions to fix optimization difficulties in differentiable mechanisms of neural network generation known as Hypernetworks. Chapters 6 and 7 study implicit models of network generation like backpropagating through gradient descent itself and integrating discrete solvers into continuous functions. Together, the chapters in this thesis contribute novel proposals for non-differentiable neural network generation mechanisms, significant improvements to existing differentiable network generation mechanisms, and an assimilation of different learning paradigms in autogenerative networks. Table of Contents Acknowledgments ........................................ 1 Motivation ............................................ 1 Chapter 1: Overview ...................................... 5 1.1 Preliminaries . 5 1.1.1 Limits of Recursive Computation . 5 1.1.2 Importance of Good Representations . 6 1.2 Overview of Common Deep Learning Methods . 8 1.3 Related Work . 11 1.3.1 Early Prior Work (Pre-2000s) . 11 1.3.2 Recent Prior Work . 11 1.4 Summary of Our Contributions . 14 1.5 Publications . 18 Chapter 2: Neural Network Quine ............................... 20 2.1 Introduction . 20 2.1.1 Motivations . 21 2.1.2 Related Work . 22 2.2 Building the Network . 23 i 2.2.1 How can a neural network refer to itself? . 23 2.2.2 Vanilla Quine . 23 2.2.3 Auxiliary Quine . 26 2.3 Training the Network . 27 2.3.1 Network Architecture . 27 2.3.2 How do we train a neural network quine? . 27 2.4 Results and Discussion . 29 2.4.1 Vanilla Quine . 29 2.4.2 Is this a quine? . 31 2.4.3 Hill-climbing . 31 2.4.4 Generational Replication . 32 2.4.5 Auxiliary Quine . 33 2.5 Conclusion . 35 Chapter 3: Agent Embeddings ................................. 37 3.1 Introduction . 37 3.1.1 Our Contribution . 38 3.2 Related Work . 39 3.2.1 Interpretability . 39 3.2.2 Generative Modeling . 40 3.2.3 Meta-Learning . 40 3.2.4 Bayesian Neural Networks . 41 3.3 Learning Agent Embeddings for Cart-Pole . 41 ii 3.3.1 Supervised Generation . 41 3.3.2 Cart-Pole . 41 3.3.3 CartPoleNet . 42 3.3.4 CartPoleGen . 43 3.3.5 Sampling from CartPoleGen . 44 3.4 Experimental Results and Discussion . 46 3.4.1 Convergent Learning . 46 3.4.2 Exploring the Latent Space . 49 3.4.3 Repairing Missing Weights . 51 3.5 Limitations of Supervised Generation . 53 3.5.1 High Sample Complexity . 53 3.5.2 Subpar Model Performance . 54 3.5.3 Scaling Issues . 54 3.6 Potential Applications for AI . 55 3.7 Conclusion . 56 Chapter 4: Hypernetwork Initialization ............................ 57 4.1 Introduction . 57 4.2 Preliminaries . 59 4.2.1 Ricci Calculus . 59 4.2.2 Xavier Initialization . 59 4.2.3 Kaiming Initialization . 60 4.3 Review of Current Methods . 61 iii 4.4 Hyperfan Initialization . 62 4.4.1 Hyperfan-in . 63 4.4.2 Hyperfan-out . 64 4.5 Experiments . 65 4.5.1 Feedforward Networks on MNIST . 66 4.5.2 Continual Learning on Regression Tasks . 67 4.5.3 Convolutional Networks on CIFAR-10 . 68 4.5.4 Bayesian Neural Networks on ImageNet . 69 4.6 Conclusion . 70 Chapter 5: Hypernetwork Optimization ............................ 72 5.1 Introduction . 72 5.2 Catalog of hypergenerative Networks . 72 5.3 Stability under Hypergeneration . 75 5.4 Experiments . 75 5.5 Conclusion . 76 Chapter 6: Gradient-Based Meta-Learning .......................... 77 6.1 Introduction . 77 6.2 Review of Gradient-Based Meta-Learning . 80 6.2.1 MAML . 81 6.2.2 Meta-SGD . 82 6.2.3 MAML++ . 82 6.2.4 Regularization Methods . 83 iv 6.3 Insights from Multi-Task Learning . 84 6.3.1 Multi-Task Learning Regularizes Meta-Learning . 84 6.3.2 Meta-Learning Complements Multi-Task Learning . 84 6.3.3 Applying Multi-Task Learning Asynchronously . 85 6.4 Gradient Sharing . 86 6.5 Experimental Results and Discussions . 88 6.5.1 Acceleration of Meta-Training . 89 6.5.2 Bigger Inner Loop Learning Rates . 89 6.5.3 Comparable Meta-Test Performance . 90 6.5.4 Evolution of m and , through Meta-Training . 91 6.6 Conclusion . 92 Chapter 7: Logical Networks ................................. 95 7.1 Introduction . 95 7.1.1 Our Contribution . 98 7.2 Background . 100 7.2.1 SATNet . 100 7.2.2 Visual Sudoku . 100 7.3 SATNet Fails at Symbol Grounding . 101 7.3.1 The Absence of Output Masking . 101 7.3.2 Visual Sudoku . 103 7.4 MNIST Mapping Problem . 104 7.4.1 Configuring SATNet Properly . 105 v 7.5 Conclusion . ..
Recommended publications
  • Data Sanity Check for Deep Learning Systems Via Learnt Assertions
    Data Sanity Check for Deep Learning Systems via Learnt Assertions Haochuan Lu∗y, Huanlin Xu∗y, Nana Liu∗y, Yangfan Zhou∗y, Xin Wang∗y ∗School of Computer Science, Fudan University, Shanghai, China yShanghai Key Laboratory of Intelligent Information Processing, Shanghai, China Abstract—Reliability is a critical consideration to DL- based deviations in a data flow perspective. Invalid input cases are systems. But the statistical nature of DL makes it quite vulnerable thus identified effectively. to invalid inputs, i.e., those cases that are not considered in We summarize the contributions of this paper as follows. the training phase of a DL model. This paper proposes to perform data sanity check to identify invalid inputs, so as to • We approach reliability enhancement of DL systems via enhance the reliability of DL-based systems. We design and data sanity check. We proposed a tool, namely SaneDL, implement a tool to detect behavior deviation of a DL model to perform data sanity check for DL-based systems. when processing an input case. This tool extracts the data flow SaneDL provides assertion mechanism to detect behavior footprints and conducts an assertion-based validation mechanism. The assertions are built automatically, which are specifically- deviation of DL model. To our knowledge, SaneDL is the tailored for DL model data flow analysis. Our experiments first assertion-based tool that can automatically detects conducted with real-world scenarios demonstrate that such an invalid input cases for DL systems. Our work can shed assertion-based data sanity check mechanism is effective in light to other practices in improving DL reliability.
    [Show full text]
  • Caradoc of the North Wind Free
    FREE CARADOC OF THE NORTH WIND PDF Allan Frewin Jones | 368 pages | 05 Apr 2012 | Hachette Children's Group | 9780340999417 | English | London, United Kingdom CARADOC OF THE NORTH WIND PDF As the war. Disaster strikes, and a valued friend suffers Caradoc of the North Wind devastating injury. Branwen sets off on a heroic journey towards destiny in an epic adventure of lovewar and revenge. Join Charlotte and Mia in this brilliant adventure full of princess sparkle and Christmas excitement! Chi ama i libri sceglie Kobo e inMondadori. The description is beautiful, but at times a bit too much, and sometimes at its worst the writing is hard to comprehend completely clearly. I find myself hoping vehemently for another book. It definitely allows the I read this in Caradoc of the North Wind sitting and could not put it down. Fair Wind to Widdershins. This author has published under several versions of his name, including Allan Jones, Frewin Jones, and A. Write a product review. Now he has stolen the feathers Caradoc of the North Wind Doodle, the weather-vane cockerel in charge of the weather. Jun 29, Katie rated it really liked it. Is the other warrior child, Arthur?? More than I thought I would, even. I really cafadoc want to know more, and off author is one that can really take you places. Join us by creating an account and start getting the best experience from our website! Jules Ember was raised hearing legends of wjnd ancient magic of the wicked Alchemist and the good Sorceress. Delivery and Returns see our delivery rates and policies thinking of returning an item? Mar 24, Valentina rated it really liked it.
    [Show full text]
  • Faithful Saliency Maps: Explaining Neural Networks by Augmenting "Competition for Pixels"
    Faithful Saliency Maps: Explaining Neural Networks by Augmenting "Competition for Pixels" The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters Citation Görns, Jorma Peer. 2020. Faithful Saliency Maps: Explaining Neural Networks by Augmenting "Competition for Pixels". Bachelor's thesis, Harvard College. Citable link https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37364724 Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http:// nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA Harvard University Senior Thesis Faithful Saliency Maps: Explaining Neural Networks by Augmenting "Competition for Pixels" Jorma Peer G¨orns Applied Mathematics supervised by Professor Himabindu Lakkaraju Cambridge, MA April 2020 Abstract For certain machine-learning models such as image classifiers, saliency methods promise to answer a crucial question: At the pixel level, where does the model look to classify a given image? If existing methods truthfully answer this question, they can bring some level of interpretability to an area of machine learning where it has been inexcusably absent: namely, to image-classifying neural networks, usually considered some of the most "black- box" classifiers. A multitude of different saliency methods has been developed over the last few years|recently, however, Adebayo et al. [1] revealed that many of them fail so- called "sanity checks": That is, these methods act as mere edge detectors of the input image, outputting the same convincing-looking saliency map completely independently of the model under investigation! Not only do they not illuminate the inner workings of the model at hand, but they may actually deceive the model investigator into believing that the model is working as it should.
    [Show full text]
  • Sys/Sys/Bio.H 1 1 /* 65 Void *Bio Caller2; /* Private Use by the Caller
    04/21/04 16:54:59 sys/sys/bio.h 1 1 /* 65 void *bio_caller2; /* Private use by the caller. */ 2 * Copyright (c) 1982, 1986, 1989, 1993 66 TAILQ_ENTRY(bio) bio_queue; /* Disksort queue. */ 3 * The Regents of the University of California. All rights reserved. 67 const char *bio_attribute; /* Attribute for BIO_[GS]ETATTR */ 4 * (c) UNIX System Laboratories, Inc. 68 struct g_consumer *bio_from; /* GEOM linkage */ 5 * All or some portions of this file are derived from material licensed 69 struct g_provider *bio_to; /* GEOM linkage */ 6 * to the University of California by American Telephone and Telegraph 70 off_t bio_length; /* Like bio_bcount */ 7 * Co. or Unix System Laboratories, Inc. and are reproduced herein with 71 off_t bio_completed; /* Inverse of bio_resid */ 8 * the permission of UNIX System Laboratories, Inc. 72 u_int bio_children; /* Number of spawned bios */ 9 * 73 u_int bio_inbed; /* Children safely home by now */ 10 * Redistribution and use in source and binary forms, with or without 74 struct bio *bio_parent; /* Pointer to parent */ 11 * modification, are permitted provided that the following conditions 75 struct bintime bio_t0; /* Time request started */ 12 * are met: 76 13 * 1. Redistributions of source code must retain the above copyright 77 bio_task_t *bio_task; /* Task_queue handler */ 14 * notice, this list of conditions and the following disclaimer. 78 void *bio_task_arg; /* Argument to above */ 15 * 2. Redistributions in binary form must reproduce the above copyright 79 16 * notice, this list of conditions and the following disclaimer in the 80 /* XXX: these go away when bio chaining is introduced */ 17 * documentation and/or other materials provided with the distribution. 81 daddr_t bio_pblkno; /* physical block number */ 18 * 3.
    [Show full text]
  • Hello, World! Free
    FREE HELLO, WORLD! PDF Disney Book Group | 14 pages | 16 Aug 2011 | Disney Press | 9781423141402 | English | New York, NY, United States "Hello, World!" program - Wikipedia Learn Data Science by completing interactive coding challenges and watching videos by expert instructors. Start Now! Python Hello a very simple language, and has a very straightforward syntax. It encourages programmers to program without boilerplate prepared code. The simplest directive in Python is the "print" directive - it simply prints out a line and also includes a newline, unlike in C. There Hello two major Python versions, Python 2 and Python 3. Python 2 and 3 are quite different. This tutorial uses Python 3, because it more semantically correct and supports newer features. For example, World! difference between Python 2 and 3 is the print statement. In Hello 2, World! "print" statement is not a function, and therefore it is invoked without parentheses. However, in Python World!, it World! a function, and must be invoked with parentheses. Python uses indentation for World!, instead of curly braces. Both tabs and spaces are supported, but the standard indentation requires standard Python code to use four spaces. For example:. This site is generously supported by DataCamp. Join over a million other learners and get started learning Python for data science today! Hello, World! To print a string in Python 3, just write: print "This line will be printed. Hello Tutorial. Read our Terms of Use and Privacy Policy. Hello, World! - Learn Python - Free Interactive Python Tutorial A "Hello, World! Such a Hello is very simple in most programming World!and World! often used to illustrate the basic syntax of a programming language.
    [Show full text]
  • Bugdoc: Algorithms to Debug Computational Processes
    BugDoc: Algorithms to Debug Computational Processes Raoni Lourenço Juliana Freire Dennis Shasha New York University New York University New York University [email protected] [email protected] [email protected] Abstract Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quanti- tative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or produce incorrect results. Inferring the root cause(s) of such failures is challenging, usually requiring time and much human thought, while still being error-prone. We propose a new approach that makes use of iteration and Figure 1: Machine learning pipeline and its prove- provenance to automatically infer the root causes and de- nance. A data scientist can explore different input rive succinct explanations of failures. Through a detailed datasets and classifier estimators to identify a suitable experimental evaluation, we assess the cost, precision, and solution for a classification problem. recall of our approach compared to the state of the art. Our Example: Exploring Supernovas. In an astronomy experiment, experimental data and processing software is available for some visualizations of supernovas presented unusual arti- use, reproducibility, and enhancement. facts that could have indicated a discovery. The experimental CCS Concepts analysis consisted of multiple pipelines run at different sites, • Information systems → Data provenance. including data collection at the telescope site, data processing at a high-performance computing facility, and data analysis 1 Introduction run on the physicist’s desktop. After spending substantial Computational pipelines are widely used in many do- time trying to verify the results, the physicists found that mains, from astrophysics and biology to enterprise analytics.
    [Show full text]
  • Opengl Performer™ Programmer's Guide
    OpenGL Performer™ Programmer’s Guide 007-1680-060 CONTRIBUTORS Written by George Eckel and Ken Jones Edited by Rick Thompson and Susan Wilkening Illustrated by Chrystie Danzer and Chris Wengelski Production by Adrian Daley and Karen Jacobson Engineering contributions by Angus Dorbie, Tom Flynn, Yair Kurzion, Radomir Mech, Alexandre Naaman, Marcin Romaszewicz, Allan Schaffer, and Jenny Zhao COPYRIGHT © 1997, 2000 Silicon Graphics, Inc. All rights reserved; provided portions may be copyright in third parties, as indicated elsewhere herein. No permission is granted to copy, distribute, or create derivative works from the contents of this electronic documentation in any manner, in whole or in part, without the prior written permission of Silicon Graphics, Inc. LIMITED RIGHTS LEGEND The electronic (software) version of this document was developed at private expense; if acquired under an agreement with the USA government or any contractor thereto, it is acquired as "commercial computer software" subject to the provisions of its applicable license agreement, as specified in (a) 48 CFR 12.212 of the FAR; or, if acquired for Department of Defense units, (b) 48 CFR 227-7202 of the DoD FAR Supplement; or sections succeeding thereto. Contractor/manufacturer is Silicon Graphics, Inc., 1600 Amphitheatre Pkwy 2E, Mountain View, CA 94043-1351. TRADEMARKS AND ATTRIBUTIONS Silicon Graphics,IRIS, IRIS Indigo, IRIX, ImageVision Library, Indigo, Indy, InfiniteReality, Onyx, OpenGL, are registered trademarks, SGI, and CASEVision, Crimson, Elan Graphics, IRIS Geometry Pipeline, IRIS GL, IRIS Graphics Library, IRIS InSight, IRIS Inventor, Indigo Elan, Indigo2, InfiniteReality2, OpenGL Performer, Personal IRIS, POWER Series, Performance Co-Pilot, RealityEngine, RealityEngine2, SGI logo, and Showcase are trademarks of Silicon Graphics, Inc.
    [Show full text]
  • CS107, Lecture 16 Wrap-Up / What’S Next?
    CS107, Lecture 16 Wrap-Up / What’s Next? This document is copyright (C) Stanford Computer Science and Nick Troccoli, licensed under Creative Commons Attribution 2.5 License. All rights reserved. Based on slides created by Marty Stepp, Cynthia Lee, Chris Gregg, and others. 1 Plan For Today • Recap: Where We’ve Been • Larger Applications • What’s Next? • Q&A 2 Plan For Today • Recap: Where We’ve Been • Larger Applications • What’s Next? • Q&A 3 We’ve covered a lot in just 10 weeks! Let’s take a look back. 4 Our CS107 Journey Arrays Bits and and Heap Bytes Pointers Generics Allocators C Strings Stack and Assembly Heap 5 Course Overview 1. Bits and Bytes - How can a computer represent integer numbers? 2. Chars and C-Strings - How can a computer represent and manipulate more complex data like text? 3. Pointers, Stack and Heap – How can we effectively manage all types of memory in our programs? 4. Generics - How can we use our knowledge of memory and data representation to write code that works with any data type? 5. Assembly - How does a computer interpret and execute C programs? 6. Heap Allocators - How do core memory-allocation operations like malloc and free work? 6 First Day /* * hello.c * This program prints a welcome message * to the user. */ #include <stdio.h> // for printf int main(int argc, char *argv[]) { printf("Hello, world!\n"); return 0; } 7 First Day • The command-line is a text-based interface to navigate a computer, instead of a Graphical User Interface (GUI). Graphical User Interface Text-based interface 8 Bits And Bytes Key Question: How can a computer represent integer numbers? 9 Bits And Bytes Why does this matter? • Limitations of representation and arithmetic impact programs! • We can also efficiently manipulate data using bits.
    [Show full text]
  • Extracting Detailed Tobacco Exposure from the Electronic Health Record
    Extracting Detailed Tobacco Exposure From The Electronic Health Record By Travis John Osterman Thesis Submitted to the Faculty of the Graduate School of Vanderbilt University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in Biomedical Informatics August 11, 2017 Nashville, Tennessee Approved Josh Denny, M.D., M.S. Mia Levy, M.D., Ph.D. Pierre Massion, M. D. i To Laura, Owen and Gavin. Thank you for your patience, encouragement, and support. ii ACKNOWLEDGEMENTS I would like to thank the National Library of Medicine and the Conquer Cancer Foundation for supporting my biomedical informatics training and research (LM007450, R01- LM010685). My appreciation and thanks also go to the Department of Biomedical Informatics faculty and students for their support. Specifically, I thank Cindy Gadd and Rischelle Jenkins for creating and maintaining a rigorous and enlightening curriculum and to my master’s committee Josh Denny, Mia Levy, and Pierre Massion, for supporting my education and broader career development. In addition, I would like to thank Wei-Qi Wei, Dara Mize, Julie Wu, and Lisa Bastarche who directly contributed to this work. Finally, I would like to thank my parents, wife, and children for supporting me through this program. iii TABLE OF CONTENTS Page DEDICATION ................................................................................................................................ ii ACKNOWLEDGEMENTS ..........................................................................................................
    [Show full text]
  • IRIS Performer™ Programmer's Guide
    IRIS Performer™ Programmer’s Guide Document Number 007-1680-040 CONTRIBUTORS Written by George Eckel Edited by Steven Hiatt Illustrated by Dany Galgani Production by Derrald Vogt and Linda Rae Sande Engineering contributions by Sharon Clay, Brad Grantham, Don Hatch, Jim Helman, Michael Jones, Martin McDonald, John Rohlf, Allan Schaffer, Chris Tanner, Jenny Zhao, Yair Kurzion, and Tom McReynolds © Copyright 1995 -1997 Silicon Graphics, Inc.— All Rights Reserved This document contains proprietary and confidential information of Silicon Graphics, Inc. The contents of this document may not be disclosed to third parties, copied, or duplicated in any form, in whole or in part, without the prior written permission of Silicon Graphics, Inc. RESTRICTED RIGHTS LEGEND Use, duplication, or disclosure of the technical data contained in this document by the Government is subject to restrictions as set forth in subdivision (c) (1) (ii) of the Rights in Technical Data and Computer Software clause at DFARS 52.227-7013 and/or in similar or successor clauses in the FAR, or in the DOD or NASA FAR Supplement. Unpublished rights reserved under the Copyright Laws of the United States. Contractor/manufacturer is Silicon Graphics, Inc., 2011 N. Shoreline Blvd., Mountain View, CA 94039-7311. Indigo, Indy, IRIS, IRIS Indigo, ImageVision Library, Onyx, OpenGL, Silicon Graphics, and the Silicon Graphics logo are registered trademarks and Crimson, Elan Graphics, Geometry Pipeline, Indigo Elan, Indigo2, IRIS GL, IRIS Graphics Library, IRIS InSight, IRIS Inventor, IRIS Performer, IRIX, Personal IRIS, POWER Series, Performance Co-Pilot, RealityEngine, RealityEngine2, and Showcase are trademarks of Silicon Graphics, Inc. AutoCAD is a registered trademark of Autodesk, Inc.
    [Show full text]
  • Programming in Standard ML
    Programming in Standard ML (DRAFT:VERSION 1.2 OF 11.02.11.) Robert Harper Carnegie Mellon University Spring Semester, 2011 Copyright c 2011 by Robert Harper. All Rights Reserved. This work is licensed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/3.0/us/, or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. Preface This book is an introduction to programming with the Standard ML pro- gramming language. It began life as a set of lecture notes for Computer Science 15–212: Principles of Programming, the second semester of the in- troductory sequence in the undergraduate computer science curriculum at Carnegie Mellon University. It has subsequently been used in many other courses at Carnegie Mellon, and at a number of universities around the world. It is intended to supersede my Introduction to Standard ML, which has been widely circulated over the last ten years. Standard ML is a formally defined programming language. The Defi- nition of Standard ML (Revised) is the formal definition of the language. It is supplemented by the Standard ML Basis Library, which defines a com- mon basis of types that are shared by all implementations of the language. Commentary on Standard ML discusses some of the decisions that went into the design of the first version of the language. There are several implementations of Standard ML available for a wide variety of hardware and software platforms. The best-known compilers are Standard ML of New Jersey, MLton, Moscow ML, MLKit, and PolyML.
    [Show full text]
  • Classification of Computer Programs in the Scratch Online Community Lena Abdalla
    Classification of Computer Programs in the Scratch Online Community by Lena Abdalla Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2020 © Massachusetts Institute of Technology 2020. All rights reserved. Author................................................................ Department of Electrical Engineering and Computer Science January 27, 2020 Certified by. Andrew Sliwinski Research Affiliate Thesis Supervisor Accepted by . Katrina LaCurts Chair, Master of Engineering Thesis Committee 2 Classification of Computer Programs in the Scratch Online Community by Lena Abdalla Submitted to the Department of Electrical Engineering and Computer Science on January 27, 2020, in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science Abstract Scratch is a graphical programming platform that empowers children to create com- puter programs and realize their ideas. Although the Scratch online community is filled with a variety of diverse projects, many of these projects also share similarities. For example, they tend to fall into certain categories, including games, animations, stories, and more. Throughout this thesis, I describe the application of Natural Language Processing (NLP) techniques to vectorize and classify Scratch projects by type. This effort included constructing a labeled dataset of 873 Scratch projects and their corresponding types, to be used for training a supervised classifier model. This dataset was constructed through a collective process of consensus-based annotation by experts. To realize the goal of classifying Scratch projects by type, I first train an unsupervised model of meaningful vector representations for Scratch blocks based on the composition of 500,000 projects.
    [Show full text]