Combining Source and Target Level Cost Analyses for OCaml Programs
Stefan K. Muller (Carnegie Mellon University)
Jan Hoffmann (Carnegie Mellon University)

PL’19, January 01–03, 2018, New York, NY, USA 2019.

Abstract

Deriving resource bounds for programs is a well-studied problem in the programming languages community. For compiled languages, there is a tradeoff in deciding when during compilation to perform such a resource analysis. Analyses at the level of machine code can precisely bound the wall-clock execution time of programs, but often require manual annotations describing loop bounds and memory access patterns. Analyses that run on source code can more effectively derive such information from types and other source-level features, but produce abstract bounds that do not directly reflect the execution time of the compiled machine code.

This paper proposes and compares different techniques for combining source-level and target-level resource analyses in the context of the functional programming language OCaml. The source-level analysis is performed by Resource Aware ML (RaML), which automatically derives bounds on the costs of source-level operations. By relating these high-level operations to the low-level code, these bounds can be composed with results from target-level tools. We first apply this idea to the OCaml bytecode compiler and derive bounds on the number of virtual machine instructions executed by the compiled code. Next, we target OCaml’s native compiler for ARM and combine the analysis with an off-the-shelf worst-case execution time (WCET) tool that provides clock-cycle bounds for basic blocks. In this way, we derive clock-cycle bounds for a specific ARM processor. An experimental evaluation analyzes and compares the performance of the different approaches and shows that combined resource analyses can provide developers with useful performance information.

1 Introduction

The programming languages community has extensively studied the problem of statically analyzing the resource consumption of programs. The developed techniques range from fully automatic techniques based on static analysis and automated recurrence solving [2, 11, 25, 38, 51, 54], to semi-automatic techniques that check user-annotated bounds [19, 53], to manual reasoning systems that are integrated with type systems and program logics [15, 20, 21, 40]. Static resource analysis has interesting applications that include prevention of side channels [46], finding performance bugs and algorithmic complexity vulnerabilities [49], and bounding gas usage in smart contracts [24]. More generally, it is an appealing idea to provide programmers with immediate feedback about the efficiency of their code at design time.

When designing a resource analysis for a compiled higher-level language, there is a tension between working on the source code, the target code, or an intermediate representation. Advantages of analyzing the source code include more effective interaction with the programmer and more opportunities for automation, since the control flow and type information are readily available. The advantage of analyzing the target code is that analysis results apply to the code that is eventually executed. Many of the tools developed in the programming languages community operate on the source level and derive upper bounds on a high-level notion of cost, like the number of loop iterations or user-defined cost metrics [25, 40, 53]. In the embedded systems community, the focus is on tools that operate on machine code and derive bounds that apply to concrete hardware [9, 55].

In this paper, we study the integration of source-level and target-level resource analyses for OCaml programs. We build on Resource Aware ML (RaML) [29, 30], a source-level resource analysis tool for OCaml programs that is based on automatic amortized resource analysis (AARA) [33, 36]. AARA systematically annotates types with potential functions that map values of the type to a non-negative number. A type derivation can be seen as a proof that the initial potential is sufficient to cover the cost of an execution. Advantages of AARA include compositionality and efficient inference of potential functions, and thus resource bounds, using linear programming, even if potential functions are polynomial [28]. RaML can derive bounds for user-defined metrics that assign a constant cost to an evaluation step in the dynamic semantics.

Our approaches to integrating source-level and target-level analyses broadly follow the idea of using RaML to derive resource usage bounds that are parametric in the resource usages of basic blocks, and then composing these results with a lower-level analysis that operates on each basic block. Implementing these approaches in practice requires a technical extension of RaML: we extend RaML to enable bound inference for cost metrics that contain symbolic expressions. Instead of specifying cost 8128 at a certain spot in the program, it is now possible to specify a cost expression such as 8128a + 9b, where a and b are symbolic constants. RaML will then derive a bound that is a function of both the arguments and the constants a and b. In the context of this paper, symbolic resource analysis can be used to devise resource metrics that are parametrized by the costs of basic blocks. To this end, we automatically annotate the source program with cost annotations that correspond to beginnings of basic blocks in the compiled code. Each cost annotation is labeled with a fresh symbol that corresponds to the as-yet-unknown cost of the corresponding basic block. A simple translation validation procedure ensures that every block has been labeled with at least one cost annotation. At the target level, we can now analyze the cost of individual basic blocks and substitute the results for the corresponding symbol in the high-level bound.

Our third contribution is the implementation of the described technique for the OCaml bytecode and native-code compilers. For the OCaml bytecode compiler, we associate the symbolic constants with the number of bytecode instructions in their respective basic block. In this way, we derive symbolic bounds on the number of bytecode instructions that are executed by a function. For the OCaml native-code compiler, we use AbsInt’s worst-case execution time (WCET) analysis tool aiT to derive clock-cycle bounds for each basic block for the ARM Cortex-R5 platform. Together with the source-level bounds, this yields symbolic clock-cycle bounds for the compiled machine code. In many cases, aiT cannot automatically derive loop and recursion bounds. So a final combination of source-level and target-level analysis that we explore is to use the basic block analysis performed by RaML to derive aiT control-flow annotations for specific input sizes.

Our technique for connecting a high-level cost model with compiled code is similar to existing techniques that have been implemented in the context of verified C compilers [5, 15] (see Section 6). The novelty of our work is that we implemented the technique for a functional language and an existing optimizing compiler, support higher-order functions, combine compilation with AARA, and support OCaml-specific features such as an argument stack for avoiding the creation of function closures.

We have evaluated our techniques on several OCaml programs and found them to be both practical and reasonably precise. For example, our bytecode analysis generates asymptotically tight bounds on instruction counts for all of the example programs, and exact bounds for several of them. In addition, for several of our example programs, the control-flow annotations derived by our analysis result in WCET cycle counts that are identical to results from hand-written annotations. Hand annotations require manual reasoning about the recursive structure of the program (which is labor-intensive and error-prone) in addition to the effort of manually inserting the annotations.

the OCaml bytecode compiler and evaluates its effectiveness with experiments. In Section 5, we study the combination with WCET analysis and the OCaml native compiler and report the findings from the respective experiments. Finally, we discuss related work (Section 6) and conclude.

    let rec fold f b l =
      match l with
      | [] -> b
      | x::xs -> f (fold f b xs) x

    let countsum1 l =
      let count = fold (fun c _ -> c + 1) 0 in
      let sum = fold (fun s n -> s + n) 0 in
      (count l, sum l)

    let countsum2 l =
      fold (fun (count, sum) n ->
          (count + 1, sum + n))
        (0, 0)
        l

Figure 1. Two implementations of the countsum function

2 Symbolic Resource Analysis

The first ingredient for connecting the source-level resource analysis with compiled code is an extension of RaML we call symbolic resource analysis. Before describing the technique, we present an overview of symbolic resource analysis and its applications through an example. Consider the two OCaml functions in Figure 1, defined using the auxiliary function fold. Both take as an argument an integer list and return a pair of the count and the sum of the elements. The first function, countsum1, makes two passes over the list, counting the elements, then summing them, and finally returns a pair. The second, countsum2, computes both results in one pass. RaML allows us to compare the two implementations based on how many list operations they perform by instrumenting fold with a “tick” annotation indicating that it performs one list operation (a pattern match).

    let rec fold f b l =
      (Raml.tick (1.0);
       match l with
       | [] -> b
       | x::xs -> f (fold f b xs) x)

When the code is analyzed with this version of fold,