Obliv-C: A Language for Extensible Data-Oblivious Computation

Samee Zahur and David Evans
University of Virginia

Abstract

Many techniques for secure or private execution depend on executing programs in a data-oblivious way, where the same instructions execute independent of the private inputs, which are kept in encrypted form throughout the computation. Designers of such computations today must either put substantial effort into constructing a circuit representation of their algorithm, or use a high-level language and lose the opportunity to make important optimizations or experiment with protocol variations. We show how extensibility can be improved by judiciously exposing the nature of data-oblivious computation. We introduce a new language that allows application developers to program secure computations without being experts in cryptography, while enabling programmers to create abstractions such as oblivious RAM and width-limited integers, or even new protocols, without needing to modify the compiler. This paper explains the key language features that safely enable such extensibility and describes the simple implementation approach we use to ensure security properties are preserved.

1. Introduction

A protocol for secure computation allows two or more parties to collaboratively perform some computation without revealing their own inputs. There are many generic protocols for secure computation, which can perform arbitrary computation on encrypted data [8, 18, 24, 34]. These generic protocols work by first converting the entire computation into a data-oblivious representation, where the control flow of the program does not depend on the secret program inputs in any way. Such a program can be executed on encrypted data without leaking any information about intermediate results, since the control flow is the same for all executions and does not depend on the data.

A common data-oblivious program representation is a Boolean logic circuit: every logic gate (e.g., AND, OR) is specified before the secret inputs are even known. Another popular representation uses addition or multiplication gates that operate directly on finite field elements (instead of just Boolean values). Given a circuit that describes the desired computation, the protocol specifies how to execute the circuit without revealing any inputs or intermediate results.

While many previous languages and frameworks for secure computation have been developed (see Section 7), none are sufficiently expressive to allow programmers to implement even simple library abstractions. The reason is that these languages have been designed to provide traditional programming abstractions that hide the data-oblivious nature of secure computation from the programmer. Our approach provides high-level programming abstractions while exposing the essential data-oblivious nature of such computations.

Motivating Example. Consider this simple C example of a dynamically resized array:

    DynVec *vec = dynVecNew();
    for (i = 0; i < n; i++) {
      if (cond) { dynVecAppend(vec, x); }
      ...

Implementing a library like this for standard computation is trivial. The DynVec object just needs to keep track of the current size of the vector, and resize an internal buffer when more space is needed to complete an operation.

Writing something similar for a data-oblivious computation requires the compiler to implement an append under an unknown condition: the internal memory buffer must be resized regardless of the now unknown semantic value of cond, whereas the value of x should be appended into that buffer (which is now encrypted) using a conditional write that depends on the value of cond specified outside of the function.

This problem is exacerbated for more complex library abstractions. For example, consider an ORAM structure that allows random access to a memory bank without revealing anything about the access pattern. On every read or write operation it needs to do things like network transfers, pseudo-random shuffling, and cryptographic operations. Defining a simple oramWrite() function is problematic if we want to allow it to be called from inside a conditional block: the function needs to specify a whole series of operations, some of which need to be done conditionally while others are done unconditionally. Indeed, it is not clear how a traditional programming language could even be adapted to express the situations that commonly arise in data-oblivious computation.

Contributions. We show how a language can be designed to support extensible secure programming by introducing control structures that expose the data-oblivious nature of secure computation. To make it easier for programmers to develop and reason about data-oblivious programs, we provide a type system that incorporates oblivious data.

Our Obliv-C language is a strict extension of C that supports all C features (including struct, typedef, pointers, recursive calls, and indirect function calls), along with new data types and control structures to support data-oblivious programs. Section 2 introduces our language and describes how its language constructs and type system support data-oblivious computation.

We describe the architecture of our Obliv-C compiler in Section 6, showing that our language can be implemented on top of a traditional language and in a way that provides high confidence that security properties of the underlying protocol are preserved.

Obliv-C is designed to enable practitioners to more easily develop scalable secure protocols, and to allow researchers to easily implement and test new features or techniques by simply writing new libraries rather than having to modify or build a new compiler. To demonstrate how our approach supports exploration at many levels, Section 4 shows how Obliv-C could be used to easily implement various library-based features, including range-tracked integers, ORAM, and multi-threading, that could not be done with existing languages, and Section 5 shows how Obliv-C supports experimentation with protocols.

2. Obliv-C

Obliv-C is a strict extension of C that provides data-oblivious programming constructs. Next, we provide an overview of the design and philosophy behind the language. Section 2.2 presents a concrete example of an Obliv-C program. We provide details on the type system in Section 3. Our implementation compiles an Obliv-C program into standard C, as described in Section 6.

2.1. Overview

Obliv-C is designed to guarantee that all security properties provided by the underlying protocol are maintained, while exposing aspects of data-oblivious computation to the programmer. Our design emphasizes safety, guaranteeing that no information can be leaked by program executions (assuming the underlying protocol is secure), while giving programmers enough control (including the ability to circumvent type rules) to do things that would not be possible with other high-level languages.

The main construct we introduce is an oblivious conditional. For example, consider the following statement where x and y are secret data:

    obliv if (x > y) x = y;

Since the truth value of the x > y condition will not be known even at runtime, this code cannot be executed normally. Instead, every assignment inside the if statement will have to use “multiplexer” circuits in much the same way Boolean logic circuits use multiplexers to choose between two different values. We could translate this code into something like:

    cond = (x > y); // 0 or 1
    x = x + cond * (y - x);

This removes any explicit control flow dependency on unknown values by using conditional assignments.

Obliv-C extends C in the following ways:

• Every basic data type (e.g., int, char, etc.) has an obliv-qualified counterpart (e.g., obliv int, obliv char, etc.) which is represented using an encrypted value.
• Every if statement with a condition that depends on obliv-qualified data is explicitly indicated as obliv if. An obliv if statement executes in a way that prevents control dependencies from leaking the condition value.
• Type rules related to obliv if are enforced across function boundaries at compile time by using two different function families: ones that can be invoked from inside obliv if, and ones that cannot.
• Special unconditional segments allow library writers to perform actions unconditionally, which allows them to write various library abstractions. These segments escape the type system, but do not risk any information leak, just the possibility that a program does not mean what the programmer intended.

Next, we walk through a simple example illustrating the general structure of Obliv-C programs and how the programmer uses it.

2.2. Millionaires’ Problem

Figure 1 shows an Obliv-C implementation of Yao’s classic millionaires’ problem [34]. It simply outputs which of two integers is greater (purportedly, to enable two millionaires to decide who should pay for dinner without disclosing their actual wealth).

    typedef struct {
      int myinput;
      bool result;
    } ProtocolIO;

    void millionaire(void *args);

When the program executes, both parties (two in this protocol, although our design can support any number of parties) execute the same program. By convention, we will call them Alice (Party 1) and Bob (Party 2). The a, b, and res variables are declared using the obliv keyword to indicate that their values may depend on secret inputs.

[...] reveal function only succeeds if both parties provide consistent parameters to the function (e.g., it will fail if they provide different values for src or p).

To run the program, both the files in Figure 1 are compiled with the oblivcc command provided by our tool. It is a simple wrapper that provides a familiar command-line interface. It preprocesses any input file with an “.oc” extension to a plain C file before passing it on to gcc, and links with additional runtime libraries required for Obliv-C code. Once compiled, the two parties simply execute the program with appropriate inputs like any other program: the end user does not need to know about Obliv-C or even need to install it separately.
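To make the multiplexer translation of Section 2.1 and the conditional append from the motivating example more concrete, the following is a minimal sketch in plain C, not Obliv-C. It assumes unencrypted 32-bit values standing in for obliv data, and the names mux_u32, DynVecSketch, and dyn_vec_append_cond are hypothetical and do not come from the Obliv-C library. It only illustrates the split between the unconditional part of an append (growing the buffer) and the conditional part (writing the value); an ordinary C compiler gives no guarantee that the generated machine code is free of data-dependent branches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Branch-free select: returns y when cond == 1 and x when cond == 0.
 * Same form as x = x + cond * (y - x); unsigned arithmetic wraps,
 * so the subtraction cannot overflow. */
static uint32_t mux_u32(uint32_t cond, uint32_t x, uint32_t y) {
    assert(cond <= 1);              /* cond must be a 0/1 flag */
    return x + cond * (y - x);
}

/* Hypothetical stand-in for the DynVec of the motivating example. */
typedef struct {
    uint32_t *slots;   /* buffer; grows by one slot per append call    */
    size_t    nslots;  /* public: number of append calls made so far   */
    uint32_t  len;     /* would be secret: appends whose cond was 1    */
} DynVecSketch;

/* Append x only if cond == 1, without branching on cond or len.
 * Returns 0 on success and -1 if allocation fails. */
static int dyn_vec_append_cond(DynVecSketch *v, uint32_t cond, uint32_t x) {
    /* Unconditional part: the buffer grows whatever cond is. */
    uint32_t *p = realloc(v->slots, (v->nslots + 1) * sizeof *p);
    if (p == NULL) return -1;
    v->slots = p;
    v->slots[v->nslots] = 0;
    v->nslots += 1;

    /* Conditional part: every slot is touched, and only the slot at
     * index len actually changes, and only when cond == 1. */
    for (size_t i = 0; i < v->nslots; i++) {
        uint32_t hit = cond & (uint32_t)(i == v->len);
        v->slots[i] = mux_u32(hit, v->slots[i], x);
    }
    v->len += cond;                 /* conditional increment, no branch */
    return 0;
}
```

Starting from a zero-initialized DynVecSketch, three calls with cond values 1, 0, 1 leave three allocated slots but a logical length of two. The number of calls is visible to an observer, while the number of successful appends is carried only in len, which in an actual oblivious program would be an encrypted value.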
