Obliv-C: A Language for Extensible Data-Oblivious Computation


Samee Zahur and David Evans
University of Virginia

Abstract

Many techniques for secure or private execution depend on executing programs in a data-oblivious way, where the same instructions execute independently of the private inputs, which are kept in encrypted form throughout the computation. Designers of such computations today must either put substantial effort into constructing a circuit representation of their algorithm, or use a high-level language and lose the opportunity to make important optimizations or experiment with protocol variations. We show how extensibility can be improved by judiciously exposing the nature of data-oblivious computation. We introduce a new language that allows application developers to program secure computations without being experts in cryptography, while enabling programmers to create abstractions such as oblivious RAM and width-limited integers, or even new protocols, without needing to modify the compiler. This paper explains the key language features that safely enable such extensibility and describes the simple implementation approach we use to ensure security properties are preserved.

1. Introduction

A protocol for secure computation allows two or more parties to collaboratively perform some computation without revealing their own inputs. There are many generic protocols for secure computation, which can perform arbitrary computation on encrypted data [8, 18, 24, 34]. The way these generic protocols work is that the entire computation is first converted into a data-oblivious representation, where the control flow of the program does not depend on the secret program inputs in any way. Such a program can be executed on encrypted data without leaking any information about intermediate results, since the control flow is the same for all executions and does not depend on the data.

A common data-oblivious program representation is a Boolean logic circuit: every logic gate (e.g., AND, OR) is specified before the secret inputs are even known. Another popular representation uses addition or multiplication gates that operate directly on finite field elements (instead of just Boolean values). Given a circuit that describes the desired computation, the protocol specifies how to execute the circuit without revealing any inputs or intermediate results.

While many previous languages and frameworks for secure computation have been developed (see Section 7), none are sufficiently expressive to allow programmers to implement even simple library abstractions. The reason is that these languages have been designed to provide traditional programming abstractions that hide the data-oblivious nature of secure computation from the programmer. Our approach provides high-level programming abstractions while exposing the essential data-oblivious nature of such computations.

Motivating Example. Consider this simple C example of a dynamically resized array:

    DynVec *vec = dynVecNew();
    for (i = 0; i < n; i++) {
      if (cond) { dynVecAppend(vec, x); }
      ...

Implementing a library like this for standard computation is trivial. The DynVec object just needs to keep track of the current size of the vector, and resize an internal buffer when more space is needed to complete an operation.

Writing something similar for a data-oblivious computation requires the compiler to implement an append under an unknown condition: the internal memory buffer must be resized regardless of the now unknown semantic value of cond, whereas the value of x should be appended into that buffer (which is now encrypted) using a conditional write that depends on the value of cond specified outside of the function.

This problem is exacerbated for more complex library abstractions. For example, an ORAM structure allows random access to a memory bank without revealing anything about the access pattern. On every read or write operation it needs to do things like network transfers, pseudo-random shuffling, and cryptographic operations. Defining a simple oramWrite() function is problematic if we want to allow it to be called from inside a conditional block: the function needs to specify a whole series of operations, some of which need to be done conditionally while others are done unconditionally. Indeed, it is not clear how a traditional programming language could even be adapted to express the situations that commonly arise in data-oblivious computation.

Contributions. We show how a language can be designed to support extensible secure programming by introducing control structures that expose the data-oblivious nature of secure computation. To make it easier for programmers to develop and reason about data-oblivious programs, we provide a type system that incorporates oblivious data.

Our Obliv-C language is a strict extension of C that supports all C features (including struct, typedef, pointers, recursive calls, and indirect function calls), along with new data types and control structures to support data-oblivious programs. Section 2 introduces our language and describes how its language constructs and type system support data-oblivious computation.

We describe the architecture of our Obliv-C compiler in Section 6, showing that our language can be implemented on top of a traditional language and in a way that provides high confidence that security properties of the underlying protocol are preserved.

Obliv-C is designed to enable practitioners to more easily develop scalable secure protocols, and to allow researchers to easily implement and test new features or techniques by simply writing new libraries rather than having to modify or build a new compiler. To demonstrate how our approach supports exploration at many levels, Section 4 shows how Obliv-C could be used to easily implement various library-based features including range-tracked integers, ORAM, and multi-threading that could not be done with existing languages, and Section 5 shows how Obliv-C supports experimentation with protocols.

2. Obliv-C

Obliv-C is a strict extension of C that provides data-oblivious programming constructs. Next, we provide an overview of the design and philosophy behind the language. Section 2.2 presents a concrete example of an Obliv-C program. We provide details on the type system in Section 3. Our implementation compiles an Obliv-C program into standard C, as described in Section 6.

2.1. Overview

Obliv-C is designed to guarantee that all security properties provided by the underlying protocol are maintained, while exposing aspects of data-oblivious computation to the programmer. Our design emphasizes safety, guaranteeing that no information can be leaked by program executions (assuming the underlying protocol is secure) while giving programmers enough control (including the ability to circumvent type rules) to do things that would not be possible with other high-level languages.

The main construct we introduce is an oblivious conditional. For example, consider the following statement where x and y are secret data:

    obliv if (x > y) x = y;

Since the truth value of the x > y condition will not be known even at runtime, this code cannot be executed normally. Instead, every assignment inside the if statement will have to use "multiplexer" circuits in much the same way Boolean logic circuits use multiplexers to choose between two different values. We could translate this code into something like:

    cond = (x > y); // 0 or 1
    x = x + cond * (y - x);

This removes any explicit control flow dependency on unknown values by using conditional assignments.

Obliv-C extends C in the following ways:

• Every basic data type (e.g., int, char, etc.) has an obliv-qualified counterpart (e.g., obliv int, obliv char, etc.) which is represented using an encrypted value.
• Every if statement with a condition that depends on obliv-qualified data is explicitly indicated as obliv if. An obliv if statement executes in a way that prevents control dependencies from leaking the condition value.
• Type rules related to obliv if are enforced across function boundaries at compile time by using two different function families: ones that can be invoked from inside obliv if, and ones that cannot.
• Special unconditional segments allow library writers to perform actions unconditionally, which allows them to write various library abstractions. These segments escape the type system, but do not risk any information leak, just the possibility that a program does not mean what the programmer intended.

Next, we walk through a simple example illustrating the general structure of Obliv-C programs and how the programmer uses it.

2.2. Millionaires' Problem

Figure 1 shows an Obliv-C implementation of Yao's classic millionaires' problem [34]. It simply outputs which of two integers is greater (purportedly, to enable two millionaires to decide who should pay for dinner without disclosing their actual wealth).

Figure 1 (fragment):

    typedef struct {
      int myinput;
      bool result;
    } ProtocolIO;

    void millionaire(void *args);

[...] reveal function only succeeds if both parties provide consistent parameters to the function (e.g., it will fail if they provide different values for src or p).

To run the program, both files in Figure 1 are compiled with the oblivcc command provided by our tool. It is a simple wrapper that provides a familiar command-line interface. It preprocesses any input file with an ".oc" extension into a plain C file, passes it on to gcc, and links with the additional runtime libraries required for Obliv-C code. Once compiled, the two parties simply execute the program with appropriate inputs like any other program: the end user does not need to know about Obliv-C or even need to install it separately.

When the program executes, both parties (in this protocol, although our design can support any number of parties) execute the same program. By convention, we will call them Alice (Party 1) and Bob (Party 2). The a, b, and res variables are declared using the obliv keyword to indicate that their values may depend on secret inputs.