Lightweight Affinity and Object Capabilities in Scala
Total Page:16
File Type:pdf, Size:1020Kb
http://www.diva-portal.org Postprint This is the accepted version of a paper presented at ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). Citation for the original published paper: Haller, P., Loiko, A. (2016) LaCasa: Lightweight Affinity and Object Capabilities in Scala. In: Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (pp. 272-291). Association for Computing Machinery (ACM) https://doi.org/10.1145/3022671.2984042 N.B. When citing this work, cite the original published paper. © Author | ACM 2016. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications, http://dx.doi.org/10.1145/3022671.2984042. Permanent link to this version: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-197902 LACASA: Lightweight Affinity and Object Capabilities in Scala Philipp Haller Alex Loiko KTH Royal Institute of Technology, Sweden Google, Sweden ∗ [email protected] [email protected] Abstract difficulty of reasoning about program behavior and software Aliasing is a known source of challenges in the context of architecture [3], and it can introduce data races in concur- imperative object-oriented languages, which have led to im- rent programs. These observations have informed the devel- portant advances in type systems for aliasing control. How- opment of a number of type disciplines aimed at providing ever, their large-scale adoption has turned out to be a surpris- static aliasing properties, such as linear types [33, 51, 64], ingly difficult challenge. While new language designs show region inference [62, 63], unique references [14, 18, 41, 48], promise, they do not address the need of aliasing control in and ownership types [21, 50]. existing languages. While there have been important advances in the flexibil- This paper presents a new approach to isolation and ity and expressiveness of type systems for aliasing control, uniqueness in an existing, widely-used language, Scala. The large-scale adoption has been shown to be a much greater approach is unique in the way it addresses some of the most challenge than anticipated. Recent efforts in the context of important obstacles to the adoption of type system exten- new language designs like Rust [7] are promising, but they sions for aliasing control. First, adaptation of existing code do not address the increasing need for aliasing control in ex- requires only a minimal set of annotations. Only a single isting, widely-used languages. bit of information is required per class. Surprisingly, the One of the most important obstacles to the adoption of paper shows that this information can be provided by the a type system extension in a widely-used language with a object-capability discipline, widely-used in program secu- large ecosystem is the adaptation of existing code, including rity. We formalize our approach as a type system and prove third-party libraries. Typically, adaptation consists of adding key soundness theorems. The type system is implemented (type) annotations required by the type system extension. for the full Scala language, providing, for the first time, a With a large ecosystem of existing libraries, this may be pro- sound integration with Scala’s local type inference. Finally, hibitively expensive even for simple annotations. A second, we empirically evaluate the conformity of existing Scala and almost equally critical obstacle is robust support for the open-source code on a corpus of over 75,000 LOC. entirety of an existing language’s type system in a way that satisfies requirements for backward compatibility. Categories and Subject Descriptors D.3.3 [Programming This paper presents a new approach to integrating a flex- Languages]: Language Constructs and Features; F.3.3 [Log- ible type system for isolation and uniqueness into an exist- ics and Meanings of Programs]: Studies of Program Con- ing, full-featured language, Scala. Our approach minimizes structs the annotations necessary for reusing existing code in a con- Keywords Aliasing, uniqueness, object capabilities, Scala text where isolation and uniqueness is required. In the pre- sented system, a single bit of information is enough to de- 1. Introduction cide whether an existing class supports isolation and unique- Uncontrolled aliasing in imperative object-oriented lan- ness. A key insight of our approach is that this single bit guages introduces a variety of challenges in large-scale soft- of information is provided by the object-capability disci- ware development. Among others, aliasing can increase the pline [27, 46]. The object capability model is an established methodology in the context of program security, and has ∗ Work done while at KTH Royal Institute of Technology, Sweden. been proven in large-scale industrial use for secure sandbox- ing of JavaScript applications [23, 47, 56]. This paper makes the following contributions: This is the author’s version of the work. It is posted here for your personal use. Not for • We present a new approach to separation and unique- redistribution. The definitive version was published in the following publication: ness which aims to minimize the annotations necessary to OOPSLA’16, November 2–4, 2016, Amsterdam, Netherlands c 2016 ACM. 978-1-4503-4444-9/16/11... reuse existing code (Section 2). In our system, reusabil- http://dx.doi.org/10.1145/2983990.2984042 ity is based on the object capability model. Thus, when 272 annotating existing code bases, only a single bit of infor- 1 class ActorA extends Actor[Any] { mation is required per class. 2 override def receive(msg: Any): Unit = msg match { • We formalize our approach in the context of two object- 3 case s: Start => 4 val newMsg = new Message oriented core languages (Section 3). The first core lan- 5 newMsg.arr = Array(1, 2, 3, 4) guage formalizes a type-based notion of object capabili- 6 s.next.send(newMsg) ties. The second core language additionally provides ex- 7 newMsg.arr(2) = 33 ternal uniqueness via flow-insensitive permissions. 8 // ... 9 case other => // ... • We provide complete soundness proofs, formally estab- 10 } lishing heap separation and uniqueness invariants for our 11 } 12 class ActorB extends Actor[Message] { two core languages (Section 4). Moreover, a concurrent 13 override def receive(msg: Message): Unit = { extension enables the statement of an isolation theorem 14 println(msg.arr.mkString(",")) for processes in the presence of a shared heap and ef- 15 } ficient, by-reference message passing (Section 4.1). We 16 } 17 class Message { have also mechanized the operational semantics and type 18 var arr: Array[Int] = _ system of the first core language in Coq (see the compan- 19 def leak(): Unit = { ion technical report [37]). 20 SomeObject.fld = arr 21 } • We implement our approach for the full Scala language 22 } as a compiler plugin (Section 5).1 To our knowledge, our 23 object SomeObject { implementation of (external) uniqueness is the first to in- 24 var fld: Array[Int] = _ 25 } tegrate soundly with local type inference in Scala. More- 26 class Start { over, the implementation leverages a unique combination 27 var next: ActorRef[Message] = _ of previous proposals for (a) implicit parameters [24, 25], 28 } and (b) closures with capture control [31, 45]. Figure 1. Two communicating actors in Scala. • We empirically evaluate the conformity of existing Scala classes to the object capability model on a corpus of over poorly understood. Second, LACASA fundamentally simpli- 75,000 LOC of popular open-source projects (Section 6). fies type checking: as long as a class conforms to the object- Results show that between 21% and 79% of the classes capability model, LACASA’s constructs enable isolation and of a project adhere to a strict object capability discipline. uniqueness for instances of the class. This has two important In the following we discuss the most closely related work, consequences: (a) a minimal set of additional annotations (a and defer a discussion of other related work to Section 7. In single bit of information per class) enables reusing existing Section 8 we conclude. code, and (b) type checking reusable class declarations is simple and well-understood, following the object-capability Selected Related Work. Most closely related to our system discipline, which we adapt for Scala. are approaches based on permissions or capabilities. Of par- ticular relevance is previous work on capabilities for unique- 2. Overview ness in Scala by Haller and Odersky [39] (“Cap4S”). While this prior work shares our high-level goal of lightweight We proceed with an informal overview of LACASA: its pro- unique references in Scala, the two approaches are signif- gramming model in Scala, and its type system. icantly different, with important consequences concerning A First Example. Consider the case of asynchronous com- soundness, robustness, and compatibility. First, Cap4S is munication between two concurrent processes. This style of based on flow-sensitive capabilities which are modeled using concurrency is well-supported by the actor model [2, 40] Scala’s annotations, similar to the use of extended type anno- for which multiple implementations exist for Scala [36, 38, tations in Java 8 for pluggable type systems [29]. However, 43]. Figure 1 shows the definition of two actor classes.2 the interaction between Scala’s local type inference [53] and The behavior of each actor is implemented by overriding annotation propagation has been shown to be a source of un- the receive method inherited from superclass Actor. The soundness and implementation complexities for such plug- receive method is invoked by the actor runtime system gable type systems [58]; these challenges are exacerbated whenever an actor is ready to process an incoming message.