Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts
Total Page:16
File Type:pdf, Size:1020Kb
Online Detection of Effectively Callback Free Objects with Applications to Smart Contracts SHELLY GROSSMAN, Tel Aviv University, Israel ITTAI ABRAHAM, VMware Research, USA GUY GOLAN-GUETA, VMware Research, USA YAN MICHALEVSKY, Stanford University, USA NOAM RINETZKY, Tel Aviv University, Israel MOOLY SAGIV, Tel Aviv University, Israel and VMware Research, USA YONI ZOHAR, Tel Aviv University, Israel Callbacks are essential in many programming environments, but drastically complicate program understanding and reasoning because they allow to mutate object’s local states by external objects in unexpected fashions, thus breaking modularity. The famous DAO bug in the cryptocurrency framework Ethereum, employed callbacks to steal $150M. We define the notion of Effectively Callback Free (ECF) objects in order toallow callbacks without preventing modular reasoning. An object is ECF in a given execution trace if there exists an equivalent execution trace without callbacks to this object. An object is ECF if it is ECF in every possible execution trace. We study the decidability of dynamically checking ECF in a given execution trace and statically checking if an object is ECF. We also show that dynamically checking ECF in Ethereum is feasible and can be done online. By running the history of all execution traces in Ethereum, we were able to verify that virtually all existing contract executions, excluding these of the DAO or of contracts with similar known vulnerabilities, are ECF. Finally, we show that ECF, whether it is verified dynamically or statically, enables modular reasoning about objects with encapsulated state. CCS Concepts: • Theory of computation → Program analysis; • Software and its engineering → Dynamic analysis; Additional Key Words and Phrases: Program analysis, Modular reasoning, Smart contracts 1 INTRODUCTION The theme of this paper is enabling modular reasoning about the correctness of objects with encapsulated state. This is inspired by platforms like Ethereum [Wood 2016] that facilitate execution of Smart Contracts [Szabo 1997] on top of a blockchain-based distributed ledger [Nakamoto 2008]. A key property in Ethereum Smart Contracts is the lack of global mutable shared state, in contrast to common standard programming environments such as C and Java. A smart contract is analogous to an object with encapsulated state. However, the Ethereum blockchain, and many other dynamic environments, implement event- driven programming using callbacks. These callbacks are necessary for functionality, but can compromise security. For example, the famous bug in the DAO contract exploited callbacks to steal $150M [Daian 2016]. Indeed, callbacks may break modularity which is essential for good programming style and extendibility. In the context of Blockchain, modularity is even more important since contracts are contributed by different sources, some of which may be malicious. Accordingly, the bug intheDAO allowed an adversarially crafted contract to mutate the DAO’s state by calling back to it. The DAO contract, that implemented a crowd-funding platform, was attacked by a ‘callback loop-hole’ (to be precisely described below). This attack, the recovery from which required a controversial hard-fork of the blockchain,1 exhibits a vulnerability that is peculiar to decentralized 1A hard fork can be thought of as taking an agreed history of transactions, and manually change it. consensus systems, like Ethereum: in such systems, a buggy contract cannot be updated or fixed (except for extreme measures like hard-forking), which makes validation and verification of smart contracts of even greater importance for this application. Effectively Callback-Free Objects. We identify a natural generic correctness criteria for objects which enables modular reasoning in environments with local-only mutable states, and expect most correct objects to satisfy this requirement. Informally, if an object o calls another object o0, and the execution of o0 calls o again, this second call to o is defined as a callback. The main idea isto allow callbacks in o only when they cannot affect the serial non-interruptible behavior of o. Thus, such callbacks can be considered harmless and do not affect the set of local reachable states ofthe object o. In particular, the behavior of such objects is independent of the client environments and of other objects. It is possible to reproduce all behaviors of the object using a most general client and without analyzing external objects. We say that an execution is Dynamically Effectively Callback Free (dECF) when there exists “an equivalent” execution without callbacks which starts in the same state and reaches the same final state. By equivalent, we refer to the behavior of a particular object as an external observer may perceive. We say that an object is Statically Effectively Callback Free (sECF) when all its possible executions are dynamically ECF. We do not distinguish between dynamic and static ECF when the context is clear. Both definitions are useful. Dynamic ECF in particular is applicable tothe blockchain environment, since static ECF is undecidable in the general case. We ran experiments on Ethereum, proving that checking dynamic ECF is inexpensive, and thus can be done efficiently in-vivo. This, combined with Ethereum’s built-in rollback feature, would have allowed to prevent the DAO bug from occurring, without invalidating legitimate executions. (In fact, we found just one such legitimate non-ECF contract, discussed in Section9). We show that the vulnerable DAO contract is non-ECF while no non-ECF executions are detected after applying the suggested corrections to it. Notice that the ECF notion is similar to the notion of atomic transactions in concurrent systems. Indeed, despite the fact that contract languages do not usually support concurrency, modularity and callbacks require similar kind of reasoning. The ECF property’s usefulness is not limited to bug-finding; once ECF is established, it canbe served to simplify reasoning on the object in isolation of other objects: We show that the set of reachable local states in ECF objects can be determined without considering the code of other objects and thus enable modular reasoning. This modular reasoning can be performed automatically using abstract interpretation e.g., as suggested in Logozzo[2009] or by using deductive verification which is supported by Dafny [Leino 2010]. We demonstrate this by verifying an interesting invariant of the DAO contract. (See Section2). Online Detection of ECF executions. A naïve detection of dECF may be costly because of the need to enumerate subexecution traces. Therefore, we develop an effective polynomial online algorithm for checking if an execution is ECF. The main idea is to detect conflicting memory accesses and utilize commutativity in an effective manner. We integrated the algorithm into the Ethereum Virtual Machine (EVM) [Wood 2016]. We ran the algorithm on all executions kept in the Ethereum blockchain until 23 June 2017, and demonstrate that: (i) the vulnerable DAO contract and other buggy contracts are non-ECF. (ii) very few correct contracts are non-ECF, (iii) callbacks are not esoteric and are used in many contracts, and (iv) the runtime overhead of our implementation is negligible and thus can be integrated as an online check. This online detection can thus be used to prevent incidents like the theft from the DAO at the cost of slightly more restricted form of programming. As far as we are aware, our tool is also the most precise and effective tool for finding such vulnerable behaviors due to callbacks. We compared it to the Oyente tool [Luu et al. 2016; Melonport 2017], by giving it both ECF and non-ECF contracts based on the DAO object (Figure1). We found that it has false positives, as it detects a ‘reentrancy bug’ (the common name of the DAO vulnerability in the blockchain community) for any one of the fixes that render our example contract ECF. Decidability of sECF for objects. We also consider the problem of checking sECF algorithmically. Obviously, since modern contract languages, such as Solidity [Ethereum Foundation 2017b], are Turing complete, checking if a contract is ECF is undecidable. We show that checking that a contract is sECF in a language with finite local states is decidable. This is interesting since many contracts only use small local states or maps with uniform data independent accesses. Technically, this result is non-trivial since the nesting of contract calls is unbounded, and since ECF requires reasoning about permutations of nested invocations. The reason for the decidability is that non-ECF executions which occur in high depth of nesting must also occur in depth 2. Main Results. Our results can be summarized as follows: (1) We define a general safety property, called ECF, for objects (sECF) and executions (dECF). Our definition is inspired by the Blockchain environment but it may also be useful forother environments with encapsulated states, such as Microservices. (2) We show that objects with encapsulated data, under the assumption that they satisfy ECF, can be verified using modular reasoning in a sound manner. (3) A stronger notion of ECF, based on conflict-equivalence, enabling efficient verification of dECF in real-life environments, and for which sECF is decidable for programs with finite state and unbounded stack. (4) A polynomial time and space algorithm for online checking of dECF and prototype imple- mentation of it as a dynamic monitor of dECF, built on top of an Ethereum