Thorn—Robust, Concurrent, Extensible Scripting on the JVM
Total Page:16
File Type:pdf, Size:1020Kb
Thorn—Robust, Concurrent, Extensible Scripting on the JVM Bard Bloom1, John Field1, Nathaniel Nystrom2∗, Johan Ostlund¨ 3, Gregor Richards3, Rok Strnisaˇ 4, Jan Vitek3, Tobias Wrigstad5† 1 IBM Research 2 University of Texas at Arlington 3 Purdue University 4 University of Cambridge 5 Stockholm University Abstract 1. Introduction Scripting languages enjoy great popularity due to their sup- Scripting languages are lightweight, dynamic programming port for rapid and exploratory development. They typically languages designed to maximize short-term programmer have lightweight syntax, weak data privacy, dynamic typing, productivity by offering lightweight syntax, weak data en- powerful aggregate data types, and allow execution of the capsulation, dynamic typing, powerful aggregate data types, completed parts of incomplete programs. The price of these and the ability to execute the completed parts of incomplete features comes later in the software life cycle. Scripts are programs. Important modern scripting languages include hard to evolve and compose, and often slow. An additional Perl, Python, PHP, JavaScript, and Ruby, plus languages weakness of most scripting languages is lack of support for like Scheme that are not originally scripting languages but concurrency—though concurrency is required for scalability have been adapted for it. Many of these languages were orig- and interacting with remote services. This paper reports on inally developed for specialized domains (e.g., web servers the design and implementation of Thorn, a novel program- or clients), but are increasingly being used more broadly. ming language targeting the JVM. Our principal contribu- The rising popularity of scripting languages can be at- tions are a careful selection of features that support the evo- tributed to a number of key design choices. Scripting lan- lution of scripts into industrial grade programs—e.g., an ex- guages’ pragmatic view of a program allows execution of pressive module system, an optional type annotation facility completed sections of partially written programs. This fa- for declarations, and support for concurrency based on mes- cilitates an agile and iterative development style—“at every sage passing between lightweight, isolated processes. On the step of the way a working piece of software” [9]. Execu- implementation side, Thorn has been designed to accommo- tion of partial programs allows instant unit-testing, interac- date the evolution of the language itself through a compiler tive experimentation, and even demoing of software at all plugin mechanism and target the Java virtual machine. times. Powerful and flexible aggregate data types and dy- namic typing allow interim solutions that can be revisited Categories and Subject Descriptors D.3.2 [Programming later, once the understanding of the system is deep enough Languages]: Language Classifications—Concurrent, dis- to make a more permanent choice. Scripting languages focus tributed, and parallel languages; Object-oriented languages; on programmer productivity early in the software life cycle. D.3.3 [Programming Languages]: Language Constructs and For example, studies show a factor 3–60 reduced effort and Features—Concurrent programming structures; Modules, 2–50 reduced code for Tcl over Java, C and C++ [38, 39]. packages; Classes and objects; Data types and structures; However, when the exploratory phase is over and require- D.3.4 [Programming Languages]: Processors—Compilers ments have stabilized, scripting languages become less ap- General Terms Design pealing. The compromises made to optimize development Keywords Actors, Pattern matching, Scripting time make it harder to reason about correctness, harder to do semantic-preserving refactorings, and in many cases harder ∗ Work done while the author was affiliated with IBM Research. to optimize execution speed. Even though scripts are suc- † Work done while the author was affiliated with Purdue University. cinct, the lack of type information makes the code harder to navigate. An additional shortcoming of most scripting lan- guages is lack of first-class support for concurrency. Concur- Permission to make digital or hard copies of all or part of this work for personal or rency is no longer just the province of specialized software classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation such as high-performance scientific algorithms—it is ubiqui- on the first page. To copy otherwise, to republish, to post on servers or to redistribute tous, whether driven by the need to exploit multicore archi- to lists, requires prior specific permission and/or a fee. tectures or the need to interact asynchronously with services OOPSLA 2009, October 25–29, 2009, Orlando, Florida, USA. Copyright c 2009 ACM 978-1-60558-734-9/09/10. $10.00 on the Internet. Current scripting languages, when they sup- port concurrency at all, do so through complex and fragile Modules: An expressive module system makes it easy to libraries and callback patterns [20], which combine browser- wrap scripts as reusable components. and server-side scripts written in different languages and are Constraints: Optional constraint annotations on declara- notoriously troublesome. tions permit static or dynamic program checking, and fa- Currently, the weaknesses of scripting languages are cilitate optimizations. largely dealt with by rewriting scripts in either less brittle or more efficient languages. This is costly and often error- Extensibility: An extensible compiler infrastructure writ- prone, due to semantic differences between the scripting lan- ten in Thorn itself allows for language experimentation guage and the new target language. Sometimes parts of the and development of domain specific variants. program are reimplemented in C to optimize a particularly It is important to note what Thorn does not support. Threads: intensive computation. Bridging from a high-level scripting There is no shared memory concurrency in Thorn, hence language to C introduces new opportunities for errors and in- data races are impossible and the usual problems associated creases the number of languages a programmer must know with locks and other forms of shared memory synchroniza- to maintain the system. tion are avoided. Dynamic loading: Thorn does not support dynamic loading of code. This facilitates static optimization, Design Principles. This paper reports on the design and enhances security, and limits the propagation of failures. the implementation status of Thorn, a novel programming However, Thorn does support dynamic creation of compo- language running on the Java Virtual Machine (JVM). Our nents, which can be used in many applications that currently principal design contributions are a careful selection of fea- require dynamic loading. Introspection: Unlike many script- tures that support the evolution of scripts into robust indus- ing languages, Thorn does not support aggressively intro- trial grade programs, along with support for a simple but spective features. Unsafe methods: Thorn can access Java li- powerful concurrency model. By “robust”, we mean that braries through an interoperability API, but there is no access Thorn encourages certain good software engineering prac- to the bare machine through unsafe methods as in C# [22]. tices, by making them convenient. The most casual use of Java-style interfaces: Their function is subsumed by mul- classes will result in strong encapsulation of instance fields. tiple inheritance. Java-style inner classes: These are com- Pattern matching is convenient and pervasive—and provides plex and difficult to understand, and are largely subsumed some of the type information which is lost in most dynami- by simpler constructs such as first-class functions. Implicit cally typed languages. Code can be encapsulated into mod- coercions: Unlike some scripting languages, Thorn does not ules. Components provide fault boundaries for distributed support implicit coercions, e.g., interpreting a string "17" as and concurrent computing. an integer 17 in contexts requiring an integer. Thorn has been designed to strike a balance between Targeted Domains. Thorn is aimed at domains includ- dynamicity and safety. It does not support the full range of ing client-server programming for mobile or web applica- dynamic features commonly found in scripting languages, tions, embedded event-driven applications, and distributed though enough of them for rapid prototyping. It is static web services. The relevant characteristics include a need for enough to facilitate reasoning and static analyses. Thorn rapid prototyping, for concurrency and, as applications ma- runs on the JVM, allowing it to execute on a wide range of ture, for packaging scripts into modules for reuse, deploy- operating systems and letting it use lower-level Java libraries ment, and static analysis. Thorn plugins provides language for the Thorn runtime and system services. Thorn has the support for specialized syntax and common patterns, e.g., following key features: web scripting or streaming. They make it possible to adapt Scripty: Thorn is dynamically-typed with lightweight syn- Thorn to specialized domains and allow more advanced op- tax and includes powerful aggregate data types and first- timizations and better error handling than with macros. class functions. Implementation Status and Availability. Thorn is joint Object-oriented: Thorn includes a simple class-based project between IBM Research