A Type Theory for Incremental Computational Complexity with Control Flow Changes
Total Page:16
File Type:pdf, Size:1020Kb
A Type Theory for Incremental Computational Complexity with Control Flow Changes Ezgi Çiçek Zoe Paraskevopoulou Deepak Garg MPI-SWS, Germany Princeton University, USA MPI-SWS, Germany [email protected] [email protected] [email protected] Abstract self-adjusting computation [3, 4], by storing intermediate results in Incremental computation aims to speed up re-runs of a program a trace in the first run, it is possible to re-execute only those parts after its inputs have been modified slightly. It works by recording a that depend on the input changes during the incremental run, and trace of the program’s first run and propagating changes through the to reuse the parts that didn’t change free of cost. trace in incremental runs, trying to re-use as much of the original Although previous work has investigated incremental computa- trace as possible. The recent work CostIt is a type and effect system tion from different aspects (demand-driven settings [18], compiler- to establish the time complexity of incremental runs of a program, driven automatic incrementalization [8], etc.), until recently, the as a function of input changes. However, CostIt is limited in two programmer had to reason about the asymptotic time complexity of ways. First, it prohibits input changes that influence control flow. incremental execution, the dynamic stability, by direct analysis of This makes it impossible to type programs that, for instance, branch the cost semantics of programs [23]. Such reasoning is usually dif- on inputs that may change. Second, the soundness of CostIt is ficult as the programmer has to reason about two executions—the proved relative to an abstract cost semantics, but it is unclear how first run and the incremental run—and their relation (dynamic sta- the semantics can be realized. bility is inherently a relational property of two runs of a program). In this paper, we address both these limitations. We present Moreover, dynamic stability analysis heavily relies on a variety of DuCostIt, a re-design of CostIt, that combines reasoning about parameters such as the underlying incremental execution technique, costs of change propagation and costs of from-scratch evaluation. the input size, the nature of the input change, etc. Although many The latter lifts the restriction on control flow changes. To obtain the specific benchmark programs have been analyzed manually, estab- type system, we refine Flow Caml, a type system for information lishing the dynamic stability of a program can be both difficult and flow analysis, with cost effects. Additionally, we inherit from CostIt tedious. index refinements to track data structure sizes and a co-monadic In our prior work, CostIt, we took the first steps towards ad- type. Using a combination of binary and unary step-indexed logical dressing this problem by providing a refinement type and effect relations, we prove DuCostIt’s cost analysis sound relative to not system for establishing upper bounds on the asymptotic update only an abstract cost semantics, but also a concrete semantics, complexities of incremental programs [11]. This approach is attrac- which is obtained by translation to an ML-like language. tive because programmers can reason about the dynamic stability of their programs without worrying about the semantics of traces and Categories and Subject Descriptors F.3.1 [Logics and meanings incremental computation algorithms, from which the type system of programs]: Specifying and verifying and reasoning about pro- abstracts away. Furthermore, the analysis is compositional: Large grams; F.3.2 [Logics and meanings of programs]: Semantics of pro- programs are analyzed by composing the results of the analysis of gramming languages subprograms. CostIt can establish precise bounds on the dynamic General Terms Verification stability of many examples, including list programs like map, ap- pend and reverse, matrix programs like dot products and multipli- Keywords Complexity analysis, incremental computation, type cation and, divide-and-conquer algorithms like balanced list folds. and effect systems However, CostIt suffers from two serious limitations. First, Cos- tIt assumes that changes to inputs do not change control flow— 1. Introduction closures executed in the incremental run must match those executed Programs are often optimized under the implicit assumption that in the first run. The type system imposes stringent restrictions to en- they will execute only once. However, in practice, many programs sure this and cannot analyze many programs. For instance, CostIt ’s are executed again and again on slightly different inputs: spread- analysis of merge sort has to assume that the merge function, which sheets compute the same formulas with modifications to some of merges two sorted sub-lists, has been analyzed by external means, their cells, search engines periodically crawl the web and software since this function’s control flow depends on the values in the in- build processes respond to small source code changes. In such set- put lists. Second, the soundness of CostIt is established relative to tings, it is not enough to design a program that is efficient for the an abstract change propagation semantics based on previous work first (from-scratch) execution; the program must also be efficient on self-adjusting computation [3, 4], but beyond empirical analysis for the subsequent incremental executions (ideally much more ef- for specific programs, there is no evidence that the semantics are ficient than the from-scratch execution). Incremental computation realizable. is a promising approach to this problem that aims to design soft- In this paper, we address both these limitations. To address the ware that can automatically and efficiently respond to changing in- first limitation, we re-design the type system, properly account- puts. The potential for efficient incremental updates comes from ing for the fact that during incremental run, some closures, which the fact that, in practice, large parts of the computation repeat be- were not executed during the first run, may have to be evaluated tween the first and the incremental run. As shown by prior work on from scratch. Accordingly, our type system, called DuCostIt, has two typing judgments—one counts costs of incremental updates that the costs estimated by our type system can be realized algo- (change propagation) and the other counts costs of from-scratch rithmically. The reader may wonder why we present the abstract evaluation. Switches between the two modes are mediated by type semantics when we also have a concrete semantics. The answer is refinements. To address the second limitation, we show that our that the abstract semantics are very easy to understand and they language, a λ-calculus with lists, can be translated (type-directed) specify what the concrete semantics must do. to a low-level language similar to ML, preserving both incremental To prove soundness of the type system with respect to the two and from-scratch costs estimated by the type system. This transla- semantics, we develop logical relation models of types and typing tion significantly improves upon existing work [9], which provides judgments. These models are interesting in themselves: They com- a related translation but only shows that the translation preserves bine a binary relation for the judgment `S e : τ j κ with a unary the cost of the first run (the more important cost here is the cost relation for the judgment `C e : τ j κ, with an interesting interac- of the incremental run). We briefly summarize the key insights in tion in the step-indices. DuCostIt’s design. In summary, we make the following contributions: First, dynamic stability is a function of input changes, so to • analyze dynamic stability precisely, the type system must track We develop a type system, DuCostIt, for dynamic stability that which values may change. For this, we use type refinements that combines analyses of costs of incremental update and of from- trace lineage to types for information flow analysis [26]: The type scratch evaluation. The type system combines index refine- ments, changeability refinements, co-monadic reasoning and (A)S contains values of type A that cannot change structurally, two kinds of cost effects. Our type system significantly extends while (A)C contains values of type A that may change arbitrarily. prior work. (Section 3) Second, dynamic stability depends on sizes of input data structures like lists. To track these sizes, we use index refinements in the style • We show that the type system can precisely type several inter- of DML [28] and DFuzz [15]. esting examples. (Section 2) Third, like CostIt, DuCostIt’s type system treats costs as an • We develop an abstract cost semantics and a concrete cost se- effect on the typing judgment. However, unlike CostIt, where the mantics and prove soundness with respect to both using models only possible effect is the cost of incremental update, in DuCostIt that mix binary and unary step-indexed logical relations. The there are two possible costs, which are manifest in two different soundness with respect to the concrete cost semantics is com- typing judgments. The judgment `S e : τ j κ means that e (of type pletely new and covers a gap in prior work. (Sections 4 and 5) τ) has incremental update cost at most κ, while `C e : τ j κ means that e’s from-scratch execution cost is at most κ. For example, if Omitted inference rules and proofs of theorems are included in x :(real)S, i.e., x is a real number that cannot change, then an appendix available from the authors’ webpages [1]. ` x + 1 : (real)S j 0 (since x cannot change, there is nothing S Implementation We have also designed and implemented an al- to incrementally update in x + 1, so the cost is zero), but ` C gorithmic version of DuCostIt’s type system.