Rosetta Code: Improv in Any Language

Piotr Mirowski¹, Kory Mathewson¹, Boyd Branch¹˒², Thomas Winters¹˒³, Ben Verhoeven¹˒⁴, Jenny Elfving¹
¹ Improbotics (https://improbotics.org)
² University of Kent, United Kingdom
³ KU Leuven, Dept. of Computer Science; Leuven.AI, Belgium
⁴ ERLNMYR, Belgium

Proceedings of the 11th International Conference on Computational Creativity (ICCC'20), ISBN: 978-989-54160-2-8

Abstract

Rosetta Code provides improv theatre performers with artificial intelligence (AI)-based technology to perform shows understandable across many different languages. We combine speech recognition, improv chatbots and language translation tools to enable improvisers to communicate with each other while being understood (or comically misunderstood) by multilingual audiences. We describe the technology underlying Rosetta Code, detailing the speech recognition, machine translation, text generation and text-to-speech subsystems. We then describe scene structures that feature the system in performances in multilingual shows (9 languages). We provide evaluative feedback from performers, audiences, and critics. From this feedback, we draw analogies between surrealism, absurdism, and multilingual AI improv. Rosetta Code creates a new form of language-based absurdist improv. The performance remains ephemeral and performers of different languages can express themselves and their culture while accommodating the linguistic diversity of audiences.

Figure 1: Example of a performed Telephone Game. Performers are aligned and one whispers to their partner on the right a phrase in a foreign language (here, in Swedish), which is then repeated to the following performer, until the last utterance is voiced into automated speech recognition and translation to show how information is lost.

Introduction

Theatre is one of the most important tools we have for sharing experiences and building cross-cultural understanding. Moreover, theatre performers and audiences who speak different languages are more connected than ever, thanks to increasing ease of communication, dissemination of culture, translation, travel, and improvements in remote performance capabilities. In particular, improvised theatre (improv) is well positioned to connect culture given its universality, accessibility, and low barriers to entry: improvisation techniques can be readily understood and internalized, and in a short time, individuals from diverse cultures empathize with each other while performing scenes together, with deep characters, relationships, settings, motivations, and subtext. Improv serves as a microcosm of cultural communication; it is “the theatre of the people” in moment (Boal 2006). Improv is therefore an ideal test-bed to explore broad cultural and communication questions (Mathewson 2019).

Improv is also a paradoxical cultural artifact. On one hand, improv is ubiquitous and conveys universal messages about the human condition and the vagaries of life. On the other hand, as a highly linguistic art form, improv is nearly impossible to understand if you do not know the language in which it is performed. Given that improv is based on the connection between the audience and the performers, watching improv in a foreign language severely limits this link. This contrasts with scripted theatre, which has been salvaged from monolingual oblivion: Sophocles, Shakespeare, and Sartre continue to be translated into many different languages, reinterpreted, and enjoyed by audiences around the world. Improv has not had such an opportunity, and performance groups are bound to remain local or switch to English as a lingua franca when performing internationally.

The art of improvisation is derived from the connections between performative layers, both between the performers, and between the performers and the audience. Improv embraces the audience to create collaboratively together. In this way improvisation is a democratic narrative, and the potential impacts of improvised theatre between performers and audiences of different cultures and languages are significant. Most international improvisational collaboration is English-based, but many regional festivals take place in the languages of the host region. These performances exclude audiences without knowledge of the performance language, and limit the contributions of improvisors who do not speak the language. Without translation, improvisation misses out on important voices due to language limitations.

How can we create conditions so that improvisors from different cultures can improvise together in their own language? How can audiences understand performers using different languages? How might we grow our cultural communication and empathy while only being able to speak one language? Rosetta Code answers these questions, and gives theatrical improv a suite of software, scenes, and show structures from which to advance and expand.

The Methods section describes the technical details of the system, the challenges associated with improv in any language, and how we used our system in the context of theatrical improvisation. In section Rosetta Code on Stage we provide results of using the system in three shows using nine languages. We also present evaluative feedback from performers, audiences, and critics. In section Related Work, Historical Context and Discussion we situate Rosetta Code at the intersection of improvisational theatre and language, present an exploration of the cultural importance of multi-lingual artistic performance, and provide several directions for future work.

Figure 2: Illustration of the visual interface used in Rosetta Code as seen by the audience. The top part displays the choice of the language used for speech recognition and the latest recognised sentence (here French). The bottom part shows the target language for machine translation (here Polish), as well as the last few translation results. The buttons and input box (top) enable overriding speech recognition and activating / deactivating text-to-speech.

Methods

Artificial intelligence-based improvisation is an art form, where a robot and/or AI is used on stage as an improv stage partner (Bruce et al. 2000; Mathewson and Mirowski 2017a; 2017b; Mirowski and Mathewson 2019; Jacob et al. 2019; Winters and Mathewson 2019; Liu et al. 2019; Mathewson 2019). That robot relies on a generative language model to produce lines or actions in response to context, and can in itself be seen as a computationally creative system. A variation of that format, Improbotics, designed in 2016, consists in letting human actors enunciate the lines: the chatbot effectively whispers lines into the ears of human improvisers, who are only allowed to repeat exactly those lines, but are otherwise free to express themselves with a full vocabulary of physicality and emotions (Mathewson and Mirowski 2018). We have adopted this configuration for the Rosetta Code show.

The core idea of Rosetta Code is to build on existing, state-of-the-art language technology (Section: Technology Overview) to enable a palette of improvisational games (Section: Improv Games). Rosetta Code thereby allows improvisors speaking different languages to perform multilingual improv theatre together on stage.

Technology Overview

The technical setup used in this project consists of several elements that can be seen as independent building blocks, each corresponding to a piece of equipment or to an Application Programming Interface (API).

• Speech recognition (e.g. Google Speech-to-text API [1] or Web Speech API [2]), which works in multiple languages [3], running in a browser application. In order to successfully capture the improviser's voice while occluding ambient noise and other performers' voices, we rely on handheld dynamic vocal microphones (with an on-off button that can be triggered by the user), connected to the computer via an analog-to-digital audio interface.
• An instantaneous translation system, e.g. Google Translate API [4], is used as a communication channel to convert recognised speech from one language into another.
• A surtitle visualization interface (Figure 2) that enables the audience to follow the conversation using instantaneous translation, and allows improvisors to modify translation language settings.
• Text-to-speech synthesis API to automatically voice translations.
• In-ear headphone interfaces (or earpieces) enable individual performers to listen to audio translation while still being able to follow other conversations. Our setup to transmit sound from the computer to the improviser relies on FM radio transmitters that can multicast information to multiple FM radio receivers worn by several improvisers.

Improvisational Chatbot System

To respond meaningfully to human improvisor input utterances, the AI improv system works by using a statistical language model to generate sentences in continuation of some context presented as text. Previous versions of AI improvisation were built upon the neural network sequence-to-sequence architecture (Sutskever, Vinyals, and Le 2014) trained on a pseudo-translation task from the context into the generated output (Vinyals and Le 2015). For Rosetta Code, we rely on the GPT-2 neural network transformer architecture (Radford et al. 2019), trained on a large corpus of web pages, which we fine-tuned on the OpenSubtitles [5] corpus of film subtitles (Tiedemann 2009).

[1] https://cloud.google.com/speech-to-text
[2] https://google.com/intl/en/chrome/demos/speech.html
[4] https://cloud.google.com/translate/docs
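The building blocks listed under Technology Overview (speech recognition, translation, surtitles, text-to-speech) can be read as a chain of interchangeable stages. The following is a minimal sketch of that chain, not the authors' implementation: the names `Pipeline`, `Surtitles`, and the `fake_*` functions are hypothetical, and the stubs merely stand in for the cloud speech, translation, and speech-synthesis services, which are not called here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Each stage is a plain callable so that a real service could be swapped in.
Recognizer = Callable[[bytes, str], str]       # (audio, source_lang) -> text
Translator = Callable[[str, str, str], str]    # (text, source_lang, target_lang) -> text
Synthesizer = Callable[[str, str], bytes]      # (text, lang) -> audio


@dataclass
class Surtitles:
    """Holds the latest recognised sentence and the last few translations."""
    max_lines: int = 3
    recognised: str = ""
    translations: List[str] = field(default_factory=list)

    def update(self, recognised: str, translated: str) -> None:
        self.recognised = recognised
        # Keep only the most recent max_lines translations.
        self.translations = (self.translations + [translated])[-self.max_lines:]


@dataclass
class Pipeline:
    recognize: Recognizer
    translate: Translator
    synthesize: Synthesizer
    source_lang: str
    target_lang: str
    tts_enabled: bool = True
    surtitles: Surtitles = field(default_factory=Surtitles)

    def process(self, audio: bytes, override_text: Optional[str] = None) -> Optional[bytes]:
        # Typed text, when given, overrides speech recognition.
        if override_text is not None:
            text = override_text
        else:
            text = self.recognize(audio, self.source_lang)
        translated = self.translate(text, self.source_lang, self.target_lang)
        self.surtitles.update(text, translated)
        # Earpiece audio is produced only while text-to-speech is active.
        if not self.tts_enabled:
            return None
        return self.synthesize(translated, self.target_lang)


# --- Hypothetical stand-ins for the external services (assumptions) ---
TOY_DICTIONARY: Dict[Tuple[str, str, str], str] = {
    ("fr", "pl", "bonjour"): "dzień dobry",
}


def fake_recognize(audio: bytes, lang: str) -> str:
    # Pretend the audio bytes already contain the spoken text.
    return audio.decode("utf-8")


def fake_translate(text: str, src: str, dst: str) -> str:
    # Look up a toy dictionary; otherwise tag the text with the language pair.
    return TOY_DICTIONARY.get((src, dst, text.lower()), f"[{src}->{dst}] {text}")


def fake_synthesize(text: str, lang: str) -> bytes:
    # Pretend the encoded text is the synthesized audio.
    return text.encode("utf-8")


if __name__ == "__main__":
    show = Pipeline(fake_recognize, fake_translate, fake_synthesize, "fr", "pl")
    show.process("Bonjour".encode("utf-8"))
    print(show.surtitles.recognised, "|", show.surtitles.translations)
```

Keeping each stage behind a plain function signature mirrors the paper's description of the elements as independent building blocks: a stage can be replaced, or text-to-speech switched off, without touching the others.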