Interleaving with Coroutines: a Practical Approach for Robust Index Joins

Total Page:16

File Type:pdf, Size:1020Kb

Interleaving with Coroutines: a Practical Approach for Robust Index Joins Interleaving with Coroutines: A Practical Approach for Robust Index Joins ⋆ ⋆ Georgios Psaropoulos † Thomas Legler† Norman May† Anastasia Ailamaki ‡ ⋆ EPFL, Lausanne, Switzerland †SAP SE, Walldorf, Germany ‡RAW Labs SA first-name.last-name @epfl.ch first-name.last-name @sap.com { } { } ABSTRACT 18 Index join performance is determined by the efficiency of the 16 Main lookup operation on the involved index. Although database 14 Main-Interleaved indexes are highly optimized to leverage processor caches, 12 main memory accesses inevitably increase lookup runtime 10 when the index outsizes the last-level cache; hence, index 8 join performance drops. Still, robust index join performance 6 becomes possible with instruction stream interleaving:given 4 agroupoflookups,wecanhidecachemissesinonelookup 2 with instructions from other lookups by switching among Query response time0 (in ms) their respective instruction streams upon a cache miss. 1MB 2MB 4MB 8MB 16MB32MB64MB128MB256MB512MB1GB 2GB In this paper, we propose interleaving with coroutines for any type of index join. We showcase our proposal on SAP Data size (log scale) HANA by implementing binary search and CSB+-tree tra- Figure 1: Response time of an IN-predicate query with versal for an instance of index join related to dictionary com- 10K INTEGER values. Main memory accesses hinder se- pression. Coroutine implementations not only perform simi- quential execution when the dictionary is larger than the larly to prior interleaving techniques, but also resemble the cache (25 MB); interleaved execution is affected much less. original code closely, while supporting both interleaved and non-interleaved execution. Thus, we claim that coroutines make interleaving practical for use in real DBMS codebases. of an index join that we use to propose a practical technique PVLDB Reference Format: that significantly enhances index join performance by hiding Georgios Psaropoulos, Thomas Legler, Norman May, and Ana- the cost of main memory accesses. stasia Ailamaki. Interleaving with Coroutines: A Practical Ap- Like all index lookups, dictionary lookups become dispro- proach for Robust Index Joins. PVLDB,11(2):xxxx-yyyy,2017.230 - 242 portionally expensive when the dictionary outsizes the last DOI: https://doi.org/10.14778/3149193.3149202 level cache of the processor. Figure 1 illustrates this dispro- portionality in the runtime of a query with an IN predicate, 1. INTRODUCTION executed on the SAP HANA column store [9]. For the size When choosing the physical operator for an equi-join be- range 1MB–2GB, we observe a significant runtime increase tween two relations, A and B, a query optimizer checks if when the dictionary outgrows the last level cache (25MB). either has an index on the join attribute. Such an indexed The increase is caused by main memory accesses (details in relation, e.g., A, can be used for an index join, which scans Section 2), a known problem for index joins [32] and main B, looking up A’s index to retrieve the matching records. memory database systems in general [5, 20]. In main memory column stores that employ dictionary Traditional tenets for dealing with main memory accesses encoding [9, 17, 19, 26, 27], we encounter relations that are prescribe the elimination of excessive indirection and non- always indexed: the dictionaries. Each dictionary holds the sequential memory access patterns. Given that dictionary mapping between a value and its encoding, and every query implementations are cache-conscious data structures, we can for a list of values requires a sequence of lookups on the assume any effected main memory accesses to be essential dictionary. In this paper, we consider these lookups a case and thus unavoidable. Still, we can hide the latency of main memory accesses by providing the processor with enough Permission to make digital or hard copies of all or part of this work for independent instructions to execute while fetching data. In personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies our case, lookups are independent from each other, allowing bear this notice and the full citation on the first page. To copy otherwise, to us to execute them concurrently in a time-sharing fashion: republish, to post on servers or to redistribute to lists, requires prior specific we can interleave the instruction streams of several lookups permission and/or a fee. Articles from this volume were invited to present so that, when a cache miss occurs in one instruction stream, their results at The 44th International Conference on Very Large Data Bases, execution continues in another. Hence, the processor keeps August 2018, Rio de Janeiro, Brazil. executing instructions without having to wait for data. Proceedings of the VLDB Endowment, Vol. 11, No. 2 Copyright 2017 VLDB Endowment 2150-8097/17/10... $ 10.00. Prior works propose two forms of such instruction stream DOI: https://doi.org/10.14778/3149193.3149202 interleaving (ISI):static,likegroup prefetching (GP) [6] and 230 software pipelined prefetching (SPP) [6], and dynamic, like 2.1 Dictionaries in Column Stores the state-of-the-art asynchronous memory access chaining Dictionary encoding is a common compression method in (AMAC) [15]. Static interleaving has negligible overhead main-memory column stores, e.g. [9, 17, 19, 26, 27]. It maps for instruction streams with identical control flow, whereas the value domain of one or more columns to a contiguous dynamic interleaving efficiently supports a wider range of integer range [7, 9, 10, 12, 18, 20]. This mapping replaces use cases, allowing instruction streams to diverge, e.g., with column values with unique integer codes and is stored in a early returns. Both approaches require to rewrite code either separate data structure, the dictionary,whichsupportstwo as a group or a pipeline in the static case, or a state ma- access methods: chine in the dynamic case (see Section 3). The interleaved extract returns the value for a code. code ends up in the database codebase alongside the origi- • locate returns the code for a value that exists in the nal, non-interleaved implementation, implying functionality • dictionary, or a special code that denotes absence. duplication, as well as additional testing and maintenance The resulting vector of codes and the dictionary consti- costs. As a result, developers can be reluctant to use inter- tute the encoded column representation. The code vector leaved execution in production. is usually smaller than the original column, reflecting the In this work, we propose interleaving with coroutines, i.e., smaller representation of codes, whereas the dictionary size functions that can suspend their execution and resume at a is determined by the value domain, which can comprise from later point. Coroutines inherently support interleaved exe- few to billions of distinct values as encountered by database cution, while they can also run in non-interleaved mode. vendors in customer workloads [25]. Atechnicalspecification[3]forthepopularC++ language In this work, we use the column store of SAP HANA, introduces coroutine support at the language level: the pro- which has two parts for each column: the read-optimized grammer simply inserts suspension statements into the code, Main,againstwhichthequeryinFigure1wasrun,andthe and the compiler automatically handles the state that has update-friendly Delta.AMain dictionary is a sorted array to be maintained between suspension and resumption (see of the domain values, and the array positions correspond to Section 4). With this soon-to-be-standard language support, codes, similarly to [9, 17, 27]. Hence, extract is a simple ar- we implement interleaving with comparable performance to ray lookup, whereas locate is a binary search on the array the prior proposals, supporting both interleaved and non- contents for the appropriate array position. On the other interleaved execution through a single implementation that hand, Delta dictionaries are implemented as unsorted ar- closely resembles the original code (see Section 5). rays indexed by a cache-conscious B+-tree (CSB+-tree) [28]; Concretely, we make the following contributions: extract is again an array lookup, but locate is now an in- A technique for instruction stream interleaving based + • dex lookup on the CSB -tree. on coroutines. We exhibit our technique in index joins, Sorted or not, a dictionary array can be considered a rela- + where we use sorted arrays and CSB -trees as indexes, tion D(code, value)thatisindexedonbothattributes:codes describing how it can be applied to any type of pointer- are encoded as array indices, whereas values can be retrieved based index structure. through binary search or index lookup, respectively in the Acomparisonwithgroup prefetching (GP) and asyn- • sorted and the unsorted case; here, we focus on the value chronous memory access chaining (AMAC).Sinceco- index. Since a sequence of values is also a relation S(value), routines are a dynamic interleaving approach, they are every value lookup from a column involves a join S ◃▹ D, equivalent to AMAC in terms of applicable use cases which is performed as an index join when S << D .Such and performance without the need for an explicit state joins dominate the IN-predicate queries that| we| discuss| | next. machine. Coroutine code closely resembles the original implementation and can be used in both interleaved 2.2 IN Predicates and Their Performance and non-interleaved execution, relying on the compi- In this paper, we study the problem of random memory ler for state management and optimization. accesses in queries with IN predicates [23], yet our analysis An optimized implementation for IN-predicate queries and the coroutine-based technique we propose apply to any • with predictable runtime, proportional to the dictio- index join that involves pointer-based data structures like nary size. This implementation is part of a prototype hash tables or B+-trees, or algorithms involving chains of based on SAP HANA, and targets both sorted and non-sequential memory accesses like binary search.
Recommended publications
  • The Early History of Smalltalk
    The Early History of Smalltalk http://www.accesscom.com/~darius/EarlyHistoryS... The Early History of Smalltalk Alan C. Kay Apple Computer [email protected]# Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. HOPL-II/4/93/MA, USA © 1993 ACM 0-89791-571-2/93/0004/0069...$1.50 Abstract Most ideas come from previous ideas. The sixties, particularly in the ARPA community, gave rise to a host of notions about "human-computer symbiosis" through interactive time-shared computers, graphics screens and pointing devices. Advanced computer languages were invented to simulate complex systems such as oil refineries and semi-intelligent behavior. The soon-to- follow paradigm shift of modern personal computing, overlapping window interfaces, and object-oriented design came from seeing the work of the sixties as something more than a "better old thing." This is, more than a better way: to do mainframe computing; for end-users to invoke functionality; to make data structures more abstract. Instead the promise of exponential growth in computing/$/volume demanded that the sixties be regarded as "almost a new thing" and to find out what the actual "new things" might be. For example, one would compute with a handheld "Dynabook" in a way that would not be possible on a shared mainframe; millions of potential users meant that the user interface would have to become a learning environment along the lines of Montessori and Bruner; and needs for large scope, reduction in complexity, and end-user literacy would require that data and control structures be done away with in favor of a more biological scheme of protected universal cells interacting only through messages that could mimic any desired behavior.
    [Show full text]
  • Object-Oriented Programming in the Beta Programming Language
    OBJECT-ORIENTED PROGRAMMING IN THE BETA PROGRAMMING LANGUAGE OLE LEHRMANN MADSEN Aarhus University BIRGER MØLLER-PEDERSEN Ericsson Research, Applied Research Center, Oslo KRISTEN NYGAARD University of Oslo Copyright c 1993 by Ole Lehrmann Madsen, Birger Møller-Pedersen, and Kristen Nygaard. All rights reserved. No part of this book may be copied or distributed without the prior written permission of the authors Preface This is a book on object-oriented programming and the BETA programming language. Object-oriented programming originated with the Simula languages developed at the Norwegian Computing Center, Oslo, in the 1960s. The first Simula language, Simula I, was intended for writing simulation programs. Si- mula I was later used as a basis for defining a general purpose programming language, Simula 67. In addition to being a programming language, Simula1 was also designed as a language for describing and communicating about sys- tems in general. Simula has been used by a relatively small community for many years, although it has had a major impact on research in computer sci- ence. The real breakthrough for object-oriented programming came with the development of Smalltalk. Since then, a large number of programming lan- guages based on Simula concepts have appeared. C++ is the language that has had the greatest influence on the use of object-oriented programming in industry. Object-oriented programming has also been the subject of intensive research, resulting in a large number of important contributions. The authors of this book, together with Bent Bruun Kristensen, have been involved in the BETA project since 1975, the aim of which is to develop con- cepts, constructs and tools for programming.
    [Show full text]
  • CPS323 Lecture: Concurrency Materials: 1. Coroutine Projectables
    CPS323 Lecture: Concurrency Materials: 1. Coroutine Projectables 2. Handout for Ada Concurrency + Executable form (simple_tasking.adb, producer_and_consumer.adb) I. Introduction - ------------ A. Most of our attention throughout the CS curricum has been focussed on _sequential_ programming, in which a program does exactly one thing at a time, following a predictable order of steps. In particular, we expect a given program will always follow the exact same sequence of steps when run on a given set of data. B. An alternative is to think in terms of _concurrent_ programming, in which a "program" (at least in principle) does multiple things at the same time, following an order of steps that is not predictable and may differ between runs even if the same program is run on exactly the same data. C. Concurrency has long been an interest in CS, but has become of particular interest as we move into a situation in which raw hardware speeds have basically plateaued and multicore processors are coming to dominate the hardware scene. There are at least three reasons why this topic is of great interest. 1. Achieving maximum performance using technologies such as multicore processors. 2. Systems that inherently require multiple processors working together cooperatively on a single task - e.g. distributed or embedded systems. 3. Some problems are more naturally addressed by thinking in terms of multiple concurrrent components, rather than a single monolithic program. D. There are two broad approaches that might be taken to concurrency: 1. Make use of system libraries, without directly supporting parallelism in the language. (The approach taken by most languages - e.g.
    [Show full text]
  • A Low-Cost Implementation of Coroutines for C
    University of Wollongong Research Online Department of Computing Science Working Faculty of Engineering and Information Paper Series Sciences 1983 A low-cost implementation of coroutines for C Paul A. Bailes University of Wollongong Follow this and additional works at: https://ro.uow.edu.au/compsciwp Recommended Citation Bailes, Paul A., A low-cost implementation of coroutines for C, Department of Computing Science, University of Wollongong, Working Paper 83-9, 1983, 24p. https://ro.uow.edu.au/compsciwp/76 Research Online is the open access institutional repository for the University of Wollongong. For further information contact the UOW Library: [email protected] A LOW-COST IMPLEMENTATION OF COROUTINES FOR C Paul A. Bailes Department of Computing Science University of Wollongong Preprint No. 83-9 November 16, 1983 P.O. Box 1144, WOLLONGONG N.S.W. 2500, AUSTRALIA tel (042)-282-981 telex AA29022 A Low-Cost Implementation of Coroutines for C Paul A. Bailes Department of Computing Science University of Wollongong Wollongong N.S.W. 2500 Australia ABSTRACT We identify a set of primitive operations supporting coroutines, and demonstrate their usefulness. We then address their implementation in C accord­ ing to a set of criteria aimed at maintaining simplicity, and achieve a satisfactory compromise between it and effectiveness. Our package for the PDP-II under UNIXt allows users of coroutines in C programs to gain access to the primitives via an included definitions file and an object library; no penalty is imposed upon non-coroutine users. October 6, 1983 tUNIX is a Trademark of Bell Laboratories. A Low-Cost Implementation of Coroutines for C Paul A.
    [Show full text]
  • Kotlin Coroutines Deep Dive Into Bytecode #Whoami
    Kotlin Coroutines Deep Dive into Bytecode #whoami ● Kotlin compiler engineer @ JetBrains ● Mostly working on JVM back-end ● Responsible for (almost) all bugs in coroutines code Agenda I will talk about ● State machines ● Continuations ● Suspend and resume I won’t talk about ● Structured concurrency and cancellation ● async vs launch vs withContext ● and other library related stuff Agenda I will talk about Beware, there will be code. A lot of code. ● State machines ● Continuations ● Suspend and resume I won’t talk about ● Structured concurrency and cancellation ● async vs launch vs withContext ● and other library related stuff Why coroutines ● No dependency on a particular implementation of Futures or other such rich library; ● Cover equally the "async/await" use case and "generator blocks"; ● Make it possible to utilize Kotlin coroutines as wrappers for different existing asynchronous APIs (such as Java NIO, different implementations of Futures, etc). via coroutines KEEP Getting Lazy With Kotlin Pythagorean Triples fun printPythagoreanTriples() { for (i in 1 until 100) { for (j in 1 until i) { for (k in i until 100) { if (i * i + j * j < k * k) { break } if (i * i + j * j == k * k) { println("$i^2 + $j^2 == $k^2") } } } } } Pythagorean Triples fun printPythagoreanTriples() { for (i in 1 until 100) { for (j in 1 until i) { for (k in i until 100) { if (i * i + j * j < k * k) { break } if (i * i + j * j == k * k) { println("$i^2 + $j^2 == $k^2") } } } } } Pythagorean Triples fun printPythagoreanTriples() { for (i in 1 until 100) { for (j
    [Show full text]
  • Csc 520 Principles of Programming Languages Coroutines Coroutines
    Coroutines Coroutines are supported by Simula and Modula-2. CSc 520 They are similar to Java’s threads, except the programmer has to explicitly transfer control from one Principles of Programming execution context to another. Languages Thus, like threads several coroutines can exist simultaneously but unlike threads there is no central 36: Procedures — Coroutines scheduler that decides which coroutine should run next. Christian Collberg A coroutine is represented by a closure. [email protected] A special operation transfer(C) shifts control to the coroutine C, at the location where C last left off. Department of Computer Science University of Arizona Copyright c 2005 Christian Collberg 520—Spring 2005—36 [1] 520—Spring 2005—36 [2] Coroutines. Coroutines. The next slide shows an example from Scott where two var us, cfs: coroutine; coroutines execute “concurrently”, by explicitly transferring control between each other. coroutine update_screen() { ... In the example one coroutine displays a moving detach screen-saver, the other walks the file-system to check loop { for corrupt files. ... transfer(cfs) ... } } coroutine check_file_system() { ... } main () { ... } 520—Spring 2005—36 [3] 520—Spring 2005—36 [4] Coroutines. Coroutines in Modula-2 coroutine check_file_system() { Modula-2's system module provides two functions to create and transfer ... between coroutines: detach for all files do { PROCEDURE NEWPROCESS( proc: PROC; (* The procedure *) ... transfer(cfs) addr: ADDRESS; (* The stack *) ... transfer(cfs) size: CARDINAL; (* The stack size *) ... transfer(cfs) ... VAR new: ADDRESS); (* The coroutine *) } PROCEDURE TRANSFER( } VAR source: ADDRESS; (* Current coroutine *) VAR destination: ADDRESS); (* New coroutine *) main () { us := new update_screen(); The ®rst time TRANSFER is called source will be instantiated to cfs := new check_file_system(); the main (outermost) coroutine.
    [Show full text]
  • Practical Perl Tools Parallel Asynchronicity, Part 2
    COLUMNS Practical Perl Tools Parallel Asynchronicity, Part 2 DAVID N. BLANK-EDELMAN David Blank-Edelman is elcome back for Part 2 of this mini-series on modern methods the Technical Evangelist at for doing multiple things at once in Perl. In the first part (which I Apcera (the comments/ highly recommend that you read first before reading this sequel), views here are David’s alone W and do not represent Apcera/ we talked about some of the simpler methods for multi-tasking like the use Ericsson) . He has spent close to thirty years of fork() and Parallel::ForkManager. We had just begun to explore using the in the systems administration/DevOps/SRE Coro module as the basis for using threads in Perl (since the native stuff is so field in large multiplatform environments ucky) when we ran out of time. Let’s pick up the story roughly where we left it. including Brandeis University, Cambridge Technology Group, MIT Media Laboratory, Coro and Northeastern University. He is the author As a quick reminder, Coro offers an implementation of coroutines that we are going to use as of the O’Reilly Otter book Automating System just a pleasant implementation of cooperative threading (see the previous column for more Administration with Perl and is a frequent invited picayune details around the definition of a coroutine). By cooperative, we mean that each speaker/organizer for conferences in the field. thread gets queued up to run, but can only do so after another thread has explicitly signaled David is honored to serve on the USENIX that it is ready to cede its time in the interpreter.
    [Show full text]
  • Coroutines and Generators Chapter Three
    Coroutines and Generators Coroutines and Generators Chapter Three 3.1 Chapter Overview This chapter discusses two special types of program units known as coroutines and generators. A coroutine is similar to a procedure and a generator is similar to a function. The principle difference between these program units and procedures/functions is that the call/return mechanism is different. Coroutines are especially useful for multiplayer games and other program flow where sections of code "take turns" execut- ing. Generators, as their name implies, are useful for generating a sequence of values; in many respects generators are quite similar to HLA’s iterators without all the restrictions of the iterators (i.e., you can only use iterators within a FOREACH loop). 3.2 Coroutines A common programming paradigm is for two sections of code to swap control of the CPU back and forth while executing. There are two ways to achieve this: preemptive and cooperative. In a preemptive sys- tem, two or more processes or threads take turns executing with the task switch occurring independently of the executing code. A later volume in this text will consider preemptive multitasking, where the Operating System takes responsibility for interrupting one task and transferring control to some other task. In this chapter we’ll take a look at coroutines that explicitly transfer control to another section of the code1 When discussing coroutines, it is instructive to review how HLA’s iterators work, since there is a strong correspondence between iterators and coroutines. Like iterators, there are four types of entries and returns associated with a coroutine: • Initial entry.
    [Show full text]
  • Deriving Type Systems and Implementations for Coroutines
    Deriving Type Systems and Implementations for Coroutines Konrad Anton and Peter Thiemann Institut f¨urInformatik, Universit¨atFreiburg {anton,thiemann}@informatik.uni-freiburg.de Abstract. Starting from reduction semantics for several styles of corou- tines from the literature, we apply Danvy’s method to obtain equivalent functional implementations (definitional interpreters) for them. By ap- plying existing type systems for programs with continuations, we obtain sound type systems for coroutines through the translation. The resulting type systems are similar to earlier hand-crafted ones. As a side product, we obtain implementations for these styles of coroutines in OCaml. 1 Introduction Coroutines are an old programming construct, dating back to the 1960s [5]. They have been neglected for a while, but are currently enjoying a renaissance (e.g. in Lua [12]), sometimes in the limited form of generators (Python [20], C# [17]) and sometimes under different names (e.g., fibers in .NET). Type systems for coroutines have not been considered in the past. Coroutines without parameters and return values (Simula, Modula 2 [21]), coroutine opera- tions whose effects are tied to the static structure of the program (ACL, [16]), or coroutines lexically limited to the body of one function (C#), could be integrated into said languages without special support in the type system. In an earlier paper [1], we developed the first type system for a simply-typed λ-calculus with coroutines. This development was done in an ad-hoc style for a feature-rich calculus and resulted in a type and effect system with a simple notion of subeffecting to capture the control effects of coroutine operations.
    [Show full text]
  • CS164: Introduction to Programming Languages and Compilers, Spring 2012 Thibaud Hottelier UC Berkeley
    Lecture 5 Implementing Coroutines compile AST to bytecode, bytecode interpreter Ras Bodik Hack Your Language! Shaon Barman CS164: Introduction to Programming Thibaud Hottelier Languages and Compilers, Spring 2012 UC Berkeley 1 What you will learn today Implement Asymmetric Coroutines – why a recursive interpreter with implicit stack won’t do Compiling AST to bytecode - this is your first compiler; compiles AST to “flat” code Bytecode Interpreter - bytecode can be interpreted without recursion - hence no need to keep interpreter state on the call stack 2 PA2 PA2 was released today, due Sunday – bytecode interpreter of coroutines – after PA2, you will be able to implement iterators – in PA3, you will build Prolog on top of your coroutines – extra credit: cool use of coroutines (see L4 reading) HW3 has been assigned: cool uses of coroutines This homework is not graded. While it is optional, you are expected to know the homework material: 1) lazy list concatenation 2) regexes Solve at least the lazy list problem before you start on PA2 3 Code 4 Cal Hackathon Friday, February 3, 9:00pm to Saturday, February 4 5:00pm Soda Hall Wozniak Lounge Grand Prize: $1,500!!! Register @ http://code4cal.eventbrite.com/ More information about the event at http://stc.berkeley.edu. The Student Technology Council (STC), an advisory council for the UC Berkeley CIO, Shel Waggener, is hosting Code 4 Cal, a hackathon for students to create innovative, sustainable, and most of all, useful applications or widgets for Cal students. Student-developed app could be adopted by the University! Cash prizes start at $1,500. 4 Review of L4: Why Coroutines Loop calls iterator function to get the next item eg the next token from the input The iterator maintains state between two such calls eg pointer to the input stream.
    [Show full text]
  • Specialising Dynamic Techniques for Implementing the Ruby Programming Language
    SPECIALISING DYNAMIC TECHNIQUES FOR IMPLEMENTING THE RUBY PROGRAMMING LANGUAGE A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Engineering and Physical Sciences 2015 By Chris Seaton School of Computer Science This published copy of the thesis contains a couple of minor typographical corrections from the version deposited in the University of Manchester Library. [email protected] chrisseaton.com/phd 2 Contents List of Listings7 List of Tables9 List of Figures 11 Abstract 15 Declaration 17 Copyright 19 Acknowledgements 21 1 Introduction 23 1.1 Dynamic Programming Languages.................. 23 1.2 Idiomatic Ruby............................ 25 1.3 Research Questions.......................... 27 1.4 Implementation Work......................... 27 1.5 Contributions............................. 28 1.6 Publications.............................. 29 1.7 Thesis Structure............................ 31 2 Characteristics of Dynamic Languages 35 2.1 Ruby.................................. 35 2.2 Ruby on Rails............................. 36 2.3 Case Study: Idiomatic Ruby..................... 37 2.4 Summary............................... 49 3 3 Implementation of Dynamic Languages 51 3.1 Foundational Techniques....................... 51 3.2 Applied Techniques.......................... 59 3.3 Implementations of Ruby....................... 65 3.4 Parallelism and Concurrency..................... 72 3.5 Summary............................... 73 4 Evaluation Methodology 75 4.1 Evaluation Philosophy
    [Show full text]
  • Language Design and Implementation Using Ruby and the Interpreter Pattern
    Language Design and Implementation using Ruby and the Interpreter Pattern Ariel Ortiz Information Technology Department Tecnológico de Monterrey, Campus Estado de México Atizapán de Zaragoza, Estado de México, Mexico. 52926 [email protected] ABSTRACT to explore different language concepts and programming styles. In this paper, the S-expression Interpreter Framework (SIF) is Section 4 describes briefly how the SIF has been used in class. presented as a tool for teaching language design and The conclusions are found in section 5. implementation. The SIF is based on the interpreter design pattern and is written in the Ruby programming language. Its core is quite 1.1 S-Expressions S-expressions (symbolic expressions) are a parenthesized prefix small, but it can be easily extended by adding primitive notation, mainly used in the Lisp family languages. They were procedures and special forms. The SIF can be used to demonstrate selected as the framework’s source language notation because advanced language concepts (variable scopes, continuations, etc.) they offer important advantages: as well as different programming styles (functional, imperative, and object oriented). • Both code and data can be easily represented. Adding a new language construct is pretty straightforward. • Categories and Subject Descriptors The number of program elements is minimal, thus a parser is − relatively easy to write. D.3.4 [Programming Languages]: Processors interpreters, • run-time environments. The arity of operators doesn’t need to be fixed, so, for example, the + operator can be defined to accept any number of arguments. General Terms • There are no problems regarding operator precedence or Design, Languages. associativity, because every operator has its own pair of parentheses.
    [Show full text]