Related work

Yannick Forster, Fabian Kunze, Gert Smolka Saarland University March 16, 2018

1 Call-by-value lambda-calculus

• “The mechanical evaluation of expressions”, Landin (1964) [24]: Landin introduces the SECD machine, named after its four components: stack, environment, control and dump. He does not give correctness proofs. He uses closures and mentions the idea of representing composite objects by an address and sharing.

• “Call-by-Name, Call-by-Value and the λ-Calculus”, Plotkin (1975) [31]: First introduction of the call-by-value lambda-calculus. Although the equivalence relation is strong, reduction is weak, i.e. it does not apply under binders. Plotkin treats variables (and constants) as values, but his machine only talks about closed terms (and he has to do this, because he substitutes variables one after the other to transform closures to terms). He gives a detailed, top-down correctness proof of an SECD machine similar to Landin’s. He uses an inductive, step-indexed predicate (starting at 1) for the evaluation of lambda terms. He shows that the machine terminates representing the right value iff the corresponding term evaluates to that value. He first shows the usual big-step, top-down induction, e.g. that the machine evaluation of every closure representing a normalizing term eventually puts the right value closure on the argument stack. He then shows that divergence propagates top-down: if the term does not evaluate in less than t steps, then the machine does not succeed in t steps. He does not prove his substitution lemma. His closures are defined to be closed. In contrast to Landin, his closures can contain arbitrary terms, not only abstractions. Plotkin does not use inductive predicates. He also gives a CPS transformation of call-by-value into call-by-name and vice versa.

• “The weak lambda calculus as a reasonable machine”, Dal Lago and Martini (2008) [12]: Simulation of the call-by-value lambda-calculus on Turing machines using de Bruijn indices and closures. Variables are values. They don’t define their substitution, but the implementation they give in Turing machines is simple substitution, and their results only hold if the substitution used on terms is simple substitution. They describe their Turing machine in plain text and do not formally define it, but they give the alphabet and the tapes and describe their approach in such detail that the transition relation and state space are implicitly clear. Their implementation encodes de Bruijn terms in Polish notation, using a binary encoding for variables. To reduce a term, they first parse for redexes in the string encoding the whole term and then substitute all bound variables in the body with the arguments. To achieve all this, they use a stack of position symbols (left-of-application, right-of-application and under-abstraction) to keep track of the position in the term while parsing, as this is needed, e.g., to see where a lambda, and thus a non-applicative context, ends. They then claim that evaluation (with the right result) propagates top-down (where the machine steps are polynomially bounded in the cost measure of the lambda reduction). They don’t prove this, but say that it is “clear from the definition”.

• “Weak Call-by-Value Lambda Calculus as a Model of Computation in Coq”, Forster and Smolka (2017) [18]: Formalisation in Coq of the call-by-value lambda calculus as a model of computation using de Bruijn indices. Only abstractions are values. First reproduction of some of Plotkin’s results in a formal setting (e.g. unique normal forms).
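The four SECD components named above can be made concrete with a small sketch. It is not Landin’s or Plotkin’s exact formulation: it assumes closed terms with de Bruijn indices, closures only for abstractions, and left-to-right evaluation of applications. The names run_secd, AP and max_steps are hypothetical.

```python
# Minimal SECD-style machine (illustrative sketch, not Landin's or Plotkin's
# exact rules). Terms use de Bruijn indices and are assumed to be closed.

AP = "ap"  # control marker: apply the two topmost stack entries


def var(n):
    return ("var", n)


def lam(body):
    return ("lam", body)


def app(s, t):
    return ("app", s, t)


def run_secd(term, max_steps=10_000):
    """Run the machine; return the resulting closure or raise RuntimeError."""
    # stack: values; env: tuple of values; control: work list (top at the end);
    # dump: saved (stack, env, control) triples of suspended callers.
    stack, env, control, dump = [], (), [term], []
    for _ in range(max_steps):
        if not control:
            if not dump:
                return stack[-1]            # nothing left to do
            result = stack[-1]
            stack, env, control = dump.pop()  # resume the caller
            stack.append(result)
            continue
        head = control.pop()
        if head == AP:
            arg = stack.pop()
            _, body, closure_env = stack.pop()
            dump.append((stack, env, control))
            stack, env, control = [], (arg,) + closure_env, [body]
        elif head[0] == "var":
            stack.append(env[head[1]])
        elif head[0] == "lam":
            stack.append(("clo", head[1], env))
        else:  # application: evaluate function, then argument, then apply
            control.extend([AP, head[2], head[1]])
    raise RuntimeError("step budget exhausted (the term may diverge)")
```

The step budget is a design choice for the sketch: it makes divergence observable as an exception, loosely mirroring the step-indexed view of evaluation mentioned above.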

2 Stack machines

• “A call-by-name lambda-calculus machine”, Krivine (2007) [20]: Unpublished for 25 years (i.e. conceived in 1982). First SECD machine for the call-by-name lambda-calculus using an explicit instruction pointer. Krivine uses a pair of de Bruijn indices to point to an environment on the stack and to a term in this environment. He gives detailed correctness proofs and treats a call-by-name version of call/cc.

• “A generalization of jumps and labels”, Landin (1998) [22]: Landin gives the semantics of a λ-calculus with his J operator, introduced in [23], in terms of an SECD machine.

• “Control Operators, the SECD-machine, and the λ-calculus”, Felleisen and Friedman (1986) [16]: In [17], Felleisen, Friedman, Kohlbecker, and Duba give an equational system for a (named) call-by-value λ-calculus with a J operator. This paper re-explains this construction in a more general way. They first give a CEK machine incorporating the J operator (CEK for control, environment, continuation) and then transform this machine into a rewriting system, from which they derive rules for the calculus. The paper contains detailed proofs.

• “Functional runtime systems within the lambda-sigma calculus”, Hardin, Maranget, and Pagano (1998) [19]: Hardin, Maranget, and Pagano present correctness proofs for four machines executing the λσw-calculus [1, 11]: Krivine’s machine, the SECD, Cardelli’s FAM [8], and the CAM [10]. They give a unified framework applying to all four verifications, similar to our refinements of L, just for the λσw-calculus. In contrast to them, our refinement relations do not include a compilation function.

• “A functional correspondence between evaluators and abstract machines”, Ager, Biernacki, Danvy, and Midtgaard (2003) [3]: This paper presents a two-way correspondence between functional evaluators and abstract machines based on two-way derivations. The steps involved are closure conversion, CPS transformation and defunctionalisation. They give evaluators for Krivine’s machine, Landin’s SECD machine, and Felleisen and Friedman’s CEK machine, and construct those machines from evaluators in a general way. They call the mentioned machines “abstract machines”, in contrast to “virtual machines”, which use a compiled term representation. For virtual machines, they consider the CAM (categorical abstract machine) by Cousineau, Curien, and Mauny [10].

• “A rational deconstruction of Landin’s SECD machine”, Danvy (2004) [13]: Danvy deconstructs Landin’s SECD machine into an evaluator for applicative expressions, and then reconstructs several evaluators into SECD-like machines, including call-by-name, call-by-need and tail-recursive versions. His SC-machine is similar to our stack machine. He mostly uses explicit names, but also mentions a machine using de Bruijn indices and a machine using an instruction set, neither of which is shown.

• “A Rational Deconstruction of Landin’s SECD Machine with the J Operator”, Millikin and Danvy (2008) [29]: Follow-up of [13] including Landin’s J operator.

• “Lambda and pi calculi, CAM and SECD machines”, Vasconcelos (2005) [36]: Journal version of [35]. Vasconcelos encodes both the call-by-value λ-calculus and an SECD machine into the π-calculus. He introduces a notion of composable correspondence between rewriting systems that is similar to our notion of refinement. The difference is that its direction is inverted and that, in the third part of the definition, he allows B to use the reflexive-transitive closure of the step relation, where we only use the reflexive closure. In our terms, this means he almost proves that the call-by-value λ-calculus refines the π-calculus. His SECD machine is the machine introduced by Plotkin. Vasconcelos does not give an explicit correctness proof for the SECD machine and only refers to Plotkin.

• “Proof of translation in natural semantics”, Despeyroux (1986) [15]: Despeyroux gives a detailed correctness proof of a CAM machine for Mini-ML.
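As a point of comparison for the call-by-name machines listed above, a Krivine-style machine can be sketched in a few lines. This is a simplification under stated assumptions: it uses a single de Bruijn index per variable rather than the pair of indices described for [20], it handles closed terms only, and run_krivine and max_steps are hypothetical names.

```python
# Krivine-style call-by-name machine (simplified sketch; single de Bruijn
# indices, closed terms only). A closure is a pair (term, environment).


def var(n):
    return ("var", n)


def lam(body):
    return ("lam", body)


def app(s, t):
    return ("app", s, t)


def run_krivine(term, max_steps=10_000):
    """Reduce to weak head normal form; return the final (term, env) closure."""
    env, stack = (), []  # stack holds unevaluated argument closures
    for _ in range(max_steps):
        tag = term[0]
        if tag == "app":
            # Do not evaluate the argument: push it as a closure.
            stack.append((term[2], env))
            term = term[1]
        elif tag == "lam":
            if not stack:
                return (term, env)  # abstraction with no pending argument
            env = (stack.pop(),) + env
            term = term[1]
        else:  # variable: continue with the closure bound to it
            term, env = env[term[1]]
    raise RuntimeError("step budget exhausted (the term may diverge)")
```

Because arguments are pushed unevaluated, an application such as K I Ω returns I even though Ω diverges, which a call-by-value machine would not do.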

2.1 Formalised work

• “Compilation to the Modern SECD”, Leroy (2016) [26]: Leroy gives a very concise formalisation of a call-by-value λ-calculus with constants using de Bruijn indices and an SECD machine using closures and an instruction set. He proves that big-step evaluation using environments in the λ-calculus implies big-step evaluation of the machine and that a coinductively defined notion of infinite evaluation in the λ-calculus yields an infinite path of the machine. We conjecture that his notion of infinite evaluation is strictly stronger than non-termination. He does not consider small-step semantics or heaps.

• “Generalized Refocusing: From Hybrid Strategies to Abstract Machines”, Biernacka, Charatonik, and Zielinska (2017) [4]: Follow-up to [33]. The authors build on [14] and present a format for specifying reduction semantics and a general technique for transforming such a semantics into an abstract machine (called refocusing). They formalise the general refocusing procedure in Coq and prove the correctness of the resulting abstract machines. They give several case studies, including machines for call-by-name and call-by-value (about 400 lines are needed to instantiate the general procedure). They have a notion of tracing for two abstract rewriting systems which can be seen as a version of our τβ-refinement.

• “The tail-recursive SECD machine”, Ramsdell (1999) [32]: Ramsdell discusses and compares correctness proofs for the SECD machine, a tail-recursive SECD machine and Felleisen and Friedman’s CEK machine in the Boyer-Moore theorem prover. He only explains the structure of the proof and does not state intermediate lemmas. He uses de Bruijn indices and a non-capturing single-point substitution operation. He does not use a heap.

• “From mathematics to abstract machine: a formal derivation of an executable Krivine machine”, Swierstra (2012) [34]: Swierstra formalises Biernacka and Danvy’s [5] derivation of the Krivine machine for a call-by-name λ-calculus in Agda. He uses a well-typed, well-scoped term type.

• “Proving correctness of the translation from mini-ml to the CAM with the Coq proof development system”, Boutin (1995) [6]: In French. Gives a formalisation of Mini-ML with explicit names and a big-step semantics using environments. He partially mechanises the proof from [15].

• “The adequacy of Launchbury’s natural semantics for lazy evaluation”, Breitner (2018) [7]: Breitner formalises a denotational semantics for the call-by-name λ-calculus using a very abstract notion of heaps following Launchbury’s “A natural semantics for lazy evaluation” [25]. He uses the theorem prover Isabelle with the Nominal package. He does not consider an explicit abstract machine, explicit heaps or code.

• “Distilling abstract machines”, Accattoli, Barenbaum, and Mazza (2014) [2]: This paper derives abstract machines and verifies them in a uniform way, called distilleries, for call-by-name, call-by-value and call-by-need linear substitution calculi (LSC). LSC are similar to explicit substitution calculi, e.g. ‘closures on term level’. They derive several machines. CBName: Krivine’s machine. CBValue: CEK, Leroy’s single-stack machine, split CEK (different stacks for function-arguments and argument-arguments) and MAM (Milner abstract machine, although the name seems to have been coined by the authors), a machine with ‘only one global environment and just one global closure, which sometimes called heap and dump’. CBNeed: MAD and Merged MAD (‘Cregut’s lazy KAM’), extensions of their MAM, and Pointing MAD, a MAD with only one global environment for both dump and heap. They don’t have code, but operate on terms. They don’t need substitution and use alpha equivalence on terms, but not on machine states. They only consider closed terms. They introduce a notion of distilleries for deterministic machines, which are bottom-up simulations with silent steps on machines and a compatible equivalence relation on terms that e.g. includes α-equivalence and silent steps of the LSC. They have distillations (decompilation). They show that each term-reduction corresponds to some machine-reduction. Assuming termination of silent machine steps and propagation of reducibility from terms to machines, they show that term-reduction steps propagate top-down. They also show that the internal steps are bounded in some sense by the number of external steps and the initial term size, which extends the results of [12].
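Several entries above distinguish machines that operate on terms from machines that run compiled code. The following sketch illustrates the second kind: terms are compiled to a small instruction set and executed with closures over code. It is written in the spirit of a modern SECD but is not taken from any of the cited developments; the instruction names ACCESS, CLOSURE, APPLY and RETURN and the functions compile_term and run_code are assumptions made for illustration.

```python
# Compilation to a small instruction set plus a closure-based machine
# (illustrative sketch; instruction names are assumptions). Terms use
# de Bruijn indices and are assumed to be closed.


def var(n):
    return ("var", n)


def lam(body):
    return ("lam", body)


def app(s, t):
    return ("app", s, t)


def compile_term(t):
    """Translate a term into a list of instructions."""
    if t[0] == "var":
        return [("ACCESS", t[1])]
    if t[0] == "lam":
        return [("CLOSURE", compile_term(t[1]) + [("RETURN",)])]
    if t[0] == "app":
        return compile_term(t[1]) + compile_term(t[2]) + [("APPLY",)]
    raise ValueError("unknown term")


def run_code(code, max_steps=10_000):
    """Execute compiled code; return the value left on top of the stack."""
    pc, env, stack = 0, (), []
    for _ in range(max_steps):
        if pc == len(code):
            return stack[-1]
        instr = code[pc]
        pc += 1
        if instr[0] == "ACCESS":
            stack.append(env[instr[1]])
        elif instr[0] == "CLOSURE":
            stack.append(("clo", instr[1], env))
        elif instr[0] == "APPLY":
            arg = stack.pop()
            _, body, closure_env = stack.pop()
            stack.append(("ret", code, pc, env))  # return frame
            code, pc, env = body, 0, (arg,) + closure_env
        else:  # RETURN: pop the result, restore the caller from its frame
            result = stack.pop()
            _, code, pc, env = stack.pop()
            stack.append(result)
    raise RuntimeError("step budget exhausted (the term may diverge)")
```

In this sketch the return frames live on the same stack as the values, so no separate dump is needed; whether a given cited machine makes the same choice should be checked against the respective source.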

3 Other formalisations

• “Formal verification of a realistic compiler”, Leroy (2009) [27]: Leroy reports on CompCert, a compiler from a large subset of C to PowerPC assembly, written and verified in Coq.

• “A certified type-preserving compiler from lambda calculus to assembly language”, Chlipala (2007) [9]: Chlipala formalises the simply typed λ-calculus with natural numbers in Coq using de Bruijn indices and gives a denotational semantics. He then gives several typed intermediate languages, including closures and linear code, before arriving at an assembly language.

• “Pilsner: a compositionally verified compiler for a higher-order imperative language”, Neis et al. (2015) [30].

• “CakeML: a verified implementation of ML”, Kumar, Myreen, Norrish, and Owens (2014) [21]: The authors report on their implementation of CakeML, a compiler verified in HOL4 from a large subset of Standard ML to x86-64 bytecode. The compiler takes the form of a read-eval-print loop and uses multiple intermediate languages. They use the standard environment semantics for Standard ML.

• “Using Coq’s evaluation mechanisms in anger”, Leroy (2015) [28]: Coq has several strategies to evaluate terms, for instance compute, which uses an interpreter, and native_compute, which compiles terms to OCaml code that is then compiled to native code. The tactic vm_compute uses an abstract machine for the programming language underlying Coq. In the blog post, Leroy explains how he used these built-in reductions to execute CompCert from within Coq.

4 More

• Mention vm_compute.

• According to Leroy (https://xavierleroy.org/talks/compilation-agay.pdf, slide 11), Landin invented closures.

• Leroy, slide 24: “Modern implementations use one-block closures with minimal environments”.

References

[1] Martin Abadi et al. “Explicit substitutions”. In: Journal of functional programming 1.4 (1991), pp. 375–416.
[2] Beniamino Accattoli, Pablo Barenbaum, and Damiano Mazza. “Distilling abstract machines”. In: ICFP. 2014.
[3] Mads Sig Ager et al. “A functional correspondence between evaluators and abstract machines”. In: Proceedings of the 5th ACM SIGPLAN international conference on Principles and practice of declarative programming. ACM. 2003, pp. 8–19.
[4] Małgorzata Biernacka, Witold Charatonik, and Klara Zielinska. “Generalized Refocusing: From Hybrid Strategies to Abstract Machines”. In: LIPIcs-Leibniz International Proceedings in Informatics. Vol. 84. Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. 2017.
[5] Małgorzata Biernacka and Olivier Danvy. “A concrete framework for environment machines”. In: ACM Transactions on Computational Logic (TOCL) 9.1 (2007), p. 6.
[6] Samuel Boutin. “Proving correctness of the translation from mini-ml to the CAM with the Coq proof development system”. PhD thesis. INRIA, 1995.
[7] Joachim Breitner. “The adequacy of Launchbury’s natural semantics for lazy evaluation”. In: Journal of 28 (2018).
[8] Luca Cardelli. “Compiling a functional language”. In: Proceedings of the 1984 ACM Symposium on LISP and functional programming. ACM. 1984, pp. 208–217.
[9] Adam Chlipala. “A certified type-preserving compiler from lambda calculus to assembly language”. In: ACM Sigplan Notices. Vol. 42. 6. ACM. 2007, pp. 54–65.
[10] Guy Cousineau, P-L Curien, and Michel Mauny. “The categorical abstract machine”. In: Science of computer programming 8.2 (1987), pp. 173–202.
[11] Pierre-Louis Curien, Thérèse Hardin, and Jean-Jacques Lévy. “Confluence properties of weak and strong calculi of explicit substitutions”. In: Journal of the ACM (JACM) 43.2 (1996), pp. 362–397.
[12] Ugo Dal Lago and Simone Martini. “The weak lambda calculus as a reasonable machine”. In: Theor. Comput. Sci. 398.1-3 (2008), pp. 32–50.
[13] Olivier Danvy. “A rational deconstruction of Landin’s SECD machine”. In: Symposium on Implementation and Application of Functional Languages. Springer. 2004, pp. 52–71.
[14] Olivier Danvy and Lasse R Nielsen. “Refocusing in reduction semantics”. In: BRICS Report Series 11.26 (2004).
[15] Joëlle Despeyroux. “Proof of translation in natural semantics”. PhD thesis. INRIA, 1986.
[16] Matthias Felleisen and Daniel P Friedman. Control Operators, the SECD-machine, and the λ-calculus. Indiana University, Computer Science Department, 1986.
[17] Matthias Felleisen et al. “Reasoning with continuations”. In: LICS. Vol. 86. 1986, pp. 131–141.
[18] Yannick Forster and Gert Smolka. “Weak Call-by-Value Lambda Calculus as a Model of Computation in Coq”. In: ITP 2017. Springer, LNCS 10499, 2017, pp. 189–206.
[19] Thérèse Hardin, Luc Maranget, and Bruno Pagano. “Functional runtime systems within the lambda-sigma calculus”. In: Journal of Functional Programming 8.2 (1998), pp. 131–176.
[20] Jean-Louis Krivine. “A call-by-name lambda-calculus machine”. In: Higher-order and symbolic computation 20.3 (2007), pp. 199–207.
[21] Ramana Kumar et al. “CakeML: a verified implementation of ML”. In: ACM SIGPLAN Notices. Vol. 49. 1. ACM. 2014, pp. 179–191.
[22] Peter J Landin. “A generalization of jumps and labels”. In: Higher-Order and Symbolic Computation 11.2 (1998), pp. 125–143.
[23] Peter J Landin. “Correspondence between ALGOL 60 and Church’s Lambda-notation: part I”. In: Communications of the ACM 8.2 (1965), pp. 89–101.
[24] Peter J Landin. “The mechanical evaluation of expressions”. In: The Computer Journal 6.4 (1964), pp. 308–320.
[25] John Launchbury. “A natural semantics for lazy evaluation”. In: Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. ACM. 1993, pp. 144–154.
[26] Xavier Leroy. Compilation to the Modern SECD. Coq sources for the lecture Functional programming and type systems, MPRI course 2-4, https://xavierleroy.org/mpri/2-4/. 2016.
[27] Xavier Leroy. “Formal verification of a realistic compiler”. In: Communications of the ACM 52.7 (2009), pp. 107–115.
[28] Xavier Leroy. Using Coq’s evaluation mechanisms in anger. Blog post, http://gallium.inria.fr/blog/coq-eval/. 2015.
[29] Kevin Millikin and Olivier Danvy. “A Rational Deconstruction of Landin’s SECD Machine with the J Operator”. In: Logical Methods in Computer Science 4 (2008).
[30] Georg Neis et al. “Pilsner: a compositionally verified compiler for a higher-order imperative language”. In: ACM Sigplan Notices. Vol. 50. 9. ACM. 2015, pp. 166–178.
[31] Gordon D. Plotkin. “Call-by-Name, Call-by-Value and the λ-Calculus”. In: Theor. Comput. Sci. 1.2 (1975), pp. 125–159.
[32] John D Ramsdell. “The tail-recursive SECD machine”. In: Journal of Automated Reasoning 23.1 (1999), pp. 43–62.
[33] Filip Sieczkowski, Małgorzata Biernacka, and Dariusz Biernacki. “Automating derivations of abstract machines from reduction semantics”. In: Symposium on Implementation and Application of Functional Languages. Springer. 2010, pp. 72–88.
[34] Wouter Swierstra. “From mathematics to abstract machine: a formal derivation of an executable Krivine machine”. In: arXiv preprint arXiv:1202.2924 (2012).
[35] Vasco T Vasconcelos. “The call-by-value λ-calculus, the SECD machine, and the π-calculus”. In: (2000).
[36] Vasco Thudichum Vasconcelos. “Lambda and pi calculi, CAM and SECD machines”. In: Journal of Functional Programming 15.1 (2005), pp. 101–127.
