Factor: a Dynamic Stack-Based Programming Language
Total Page:16
File Type:pdf, Size:1020Kb
Factor: A Dynamic Stack-based Programming Language Slava Pestov Daniel Ehrenberg Joe Groff Stack Effects LLC Carleton College Durian Software [email protected] [email protected] [email protected] Abstract high-performance implementation and interactive develop- Factor is a new dynamic object-oriented programming lan- ment environment make it notable. guage. It began as an embedded scripting language and Factor programs look very different from programs in evolved to a mature application development language. The most other programming languages. At the most basic level, language has a simple execution model and is based on the function calls and arithmetic use postfix syntax, rather than manipulation of data on a stack. An advanced metaprogram- prefix or infix as in most programming languages. Factor ming system provides means for easily extending the lan- provides local variables, but they are used in only a small guage. Thus, Factor allows programmers to use the right minority of procedures because its language features allow features for their problem domain. The Factor implemen- most code to be comfortably written in a point-free style. tation is self-hosting, featuring an interactive development Factor is an object-oriented language with an object sys- environment and an optimizing compiler. In this paper, the tem centered around CLOS-inspired generic functions in language and its implementation are presented. place of traditional message passing. To make Factor suit- able for development of larger applications, it has a robust Categories and Subject Descriptors D.3.3 [Programming module system. Factor’s metaprogramming system allows Languages]: Language Constructs and Features; D.3.4 for arbitrary extension of syntax and for compile-time com- [Programming Languages]: Processors — Compilers, Op- putation. Factor allows the clean integration of high-level timization and low-level code with extensive support for calling li- braries in other languages and for efficient manipulation of General Terms Languages, Design, Performance binary data. Keywords Factor, dynamic languages, stack-based lan- Factor has an advanced, high-performance implementa- guages tion. We believe good support for interactive development is invaluable, and for this reason Factor allows programmers to 1. Introduction test and reload code as it runs. Our ahead-of-time optimiz- ing compiler and efficient runtime system can remove much Factor is a dynamic stack-based programming language. It of the overhead of high-level language features. Together, was originally conceived as an experiment to create a stack- these features make Factor a useful language for writing both based language practical for modern programming tasks. quick scripts and large programs in a high-level way. It was inspired by earlier stack-based languages like Forth Factor is an open-source project and the product of [33] and Joy [44]. The stack-based model allows for con- many contributors. It is available for free download from cise and flexible syntax with a high degree of factoring and http://factorcode.org. code reuse. Driven by the needs of its users, Factor gradu- This paper contributes the following: ally evolved from this base into a dynamic, object-oriented programming language. Although there is little truly novel to the Factor language or implementation, Factor’s combi- nation of the stack-based paradigm with functional, object- • Abstractions and checking for managing the flow of data oriented, and low-level programming features alongside its in stack-based languages (Section 2.1) • A CLOS- and Dylan-inspired object system, featuring generic functions built upon a metaobject protocol and a flexible type system (Section 2.2) Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed • An expressive and easy-to-use system for staged metapro- for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute gramming (Section 2.3) to lists, requires prior specific permission and/or a fee. • DLS 2010, October 18, 2010, Reno/Tahoe, Nevada, USA. The design of a foreign function interface and low-level Copyright c 2010 ACM 978-1-4503-0405-4/10/10. $10.00 capabilities in a dynamic language (Section 2.4) 43 "data.txt" utf8 file-lines On the second line, the standard library word head pops 10 head two objects from the stack, the array of lines of text result- ing from file-lines and the integer 10, and pushes a new Figure 1: Retrieving the first ten lines of a file array containing a fixed number of elements from the be- ginning of the input array. The final result of the example is an array containing the first ten lines of text from the file • The design of an effective optimizing compiler for a "data.txt" as strings. dynamic language (Section 3) The two lines of code in Figure 1 can be understood • A case study in the evolution of a dynamic language either as separate programs or concatenated to form a single (Secton 4) program. The latter case has the effect of first running the Much of Factor’s features and implementation are not rad- first program and then the second. Note that in Factor source ically different from previous dynamic languages. It is the code, newlines between words are interpreted the same as combination of these features into a useful system that make spaces. Factor notable. 2.1.2 Higher-order Programming 2. Language Design Factor supports higher-order functions (functions which take Factor combines features from existing languages with new functions as arguments). In the Joy tradition, we refer to innovations. We focus here on the prominent unique aspects higher-order functions as combinators and to anonymous of Factor’s design: our contributions to the stack-based lan- functions as quotations. Quotation literals are pushed on the guage paradigm, the module and object systems, tools for stack by surrounding a series of tokens with [ and ], which metaprogramming, and low-level binary data manipulation delays the evaluation of the surrounded code and stores it support. in a quotation object. Combinators are invoked like any other word, with quotation objects as parameters. In Factor, 2.1 Stack-based Programming Language all control flow is expressed using combinators, including In Factor, as in Forth and Joy [44], function parameters are common branching and looping constructs usually given passed by pushing them on an operand stack prior to per- special syntax in other languages. Some examples: forming the function call. We introduce two original contri- • if is a combinator taking three inputs from the stack: a butions: a set of combinators which replace the stack shuf- boolean value, a “then” branch quotation, and an “else” fling words found in other stack languages, and a syntax for branch quotation. For example, partial application. 2 even? [ "OK" ] [ "Cosmic rays detected" ] 2.1.1 Postfix Syntax if tests whether the value 2 is even, placing the string "OK" In stack-based languages, a function call is written as a sin- on the stack if so or "Cosmic rays detected" if not. gle token, which leads to the term “word” being used in • place of “function”, in the Forth tradition. This contrasts The each standard library word implements what other with mainstream programming languages, in which func- languages call a “for-each” loop. It iterates in order over tion call syntax combines the function name with a list of the elements of an array (or other sequence, see 2.2.2), parameters. Symbols that traditionally behave as operators invoking a quotation with each element as the input pa- with infix syntax in other languages, such as +, -, *, and /, rameter on each iteration. For example, { } are normal postfix words in Factor and receive no special "veni" "vidi" "vici" [ print ] each syntactic treatment. Languages which use an operand stack will print the strings "veni", "vidi", and "vici" to the { } in this manner have been called concatenative, because they console in order. (The and delimiters create a literal have the property that programs are created by “concatenat- array object in the same manner [ and ] create a quota- ing” (or “composing”) smaller programs. In this way, words tion.) can be seen as functions which take and return a stack [30]. • reduce is a variation of each that accumulates a value Literals in a stack language, such as "data.txt" and 10 between iterations. From the top of the stack, it takes an in Figure 1, can be thought of as functions that push them- array, an initial accumulator value, and a quotation. On selves on the stack, making the values available to subse- each iteration, it passes to the provided quotation the next quent word calls which pop them off the stack. Some words, item in the array along with the accumulated value thus such as utf8, can also behave as literal symbols that push far. For example, themselves on the stack. file-lines consumes two objects { 1234} 0 [ + ] reduce from the stack, the filename string "data.txt" and the en- will sum the numbers 1, 2, 3, and 4, and coding symbol utf8, and pushes back an array of strings, { 1234} 1 [ * ] reduce the contents of the specified file broken into lines of text. will multiply them. 44 [ "#" head? not ] filter : tail-factorial ( accumulator n -- n! ) [ string>number ] map dup 0 = 0 [ + ] reduce [ drop ] [[*][1-]bitail-factorial ] Figure 2: Summing the numerical value of array elements if ; not beginning with # : factorial ( n -- n! ) 1 swap (factorial) ; • map iterates over its input array like each but addition- ally collects the output value from each invocation of the quotation into a new array.