Neural Theorem Proving in Lean Using Proof Artifact Co-Training and Language Models

Jason Rute

Jesse Michael Han (Univ. of Pittsburgh)
Jason Rute (CIBO Technologies)
Yuhuai Tony Wu (Univ. of Toronto)
Edward Ayers (Univ. of Cambridge)
Stanislas Polu (OpenAI)

With thanks to the N2Formal team at Google AI

The Lean Theorem Prover

Why formal theorem proving?
● Mechanically check mathematical proofs
● Digitize mathematics
● Unify and archive mathematical knowledge
● Prove correctness of software and hardware
● Make mathematics accessible to computers in a new way

Why Lean?
Popular and newsworthy
Extensive and Growing MathLib Library
Perfectoid Spaces, Con(ZFC - CH)
Easy to learn and use
Great tools and customization (meta)
Active user base and supportive community
IMO Grand Challenge (In Lean 4)

Lean versions
Version | Maintainer | Github Repo | Website | mathlib | Lean GPT-f
Lean 3.4.2 | Microsoft Research | leanprover/lean (archived) | leanprover.github.io | No | No
Lean 3.27.0c | Lean Community | leanprover-community/lean | leanprover-community.github.io | Yes | Yes
Lean 4.0.0-m1 | Microsoft Research | leanprover/lean4 | leanprover.github.io | No | No

For all Lean (any version), mathlib, and Lean GPT-f questions: https://leanprover.zulipchat.com

Demo of Lean and gptf

Autoregressive Language Modeling
Next word (token) prediction
Transformers (GPT-2, GPT-3, etc)
Prompt and completion

Today, there will be a talk at the New Technologies in Mathematics Seminar on "Neural Theorem Proving in Lean using Proof Artifact Co-training and Language Models". The talk will be delivered by Pascal Dascouet, Assistant Professor at the French Mathematics Centre (CNRS) and the director of the TMEM Group. The talk will explore how applications of machine learning may help in the "proof exploration of elegant theorems", including foundations, differential equations, topology and group theory.
Example from Talk to Transformer (https://app.inferkit.com/demo)

Seq-to-seq modeling with autoregressive LMs
Tactic state:
    p q : Prop,
    h : p ∧ q
    ⊢ q ∧ p
Tactic:
    cases h with hp hq

Training Example:
    GOAL p q : Prop, ⇥ h : p ∧ q ⇥ ⊢ q ∧ p PROOFSTEP cases h with hp hq

Inference Example: Repeatedly sample next token† from distribution.
    GOAL a b : Prop, ⇥ h : a ∧ b ⇥ ⊢ b ∧ a PROOFSTEP
    Next-token probabilities: cases 0.81, apply 0.10, rcases 0.04
    GOAL a b : Prop, ⇥ h : a ∧ b ⇥ ⊢ b ∧ a PROOFSTEP cases h with ha hb

† Tokens are generated via byte pair encoding. They may not be whole words.

Extracting Proof Data from Lean
LeanStep datasets

Proof modes
Tactic proof:
    lemma and_swap : p ∧ q → q ∧ p :=
    begin
      intro h,
      cases h with hp hq,
      constructor, exact hq, exact hp
    end
Term proof:
    lemma and_swap : p ∧ q → q ∧ p :=
    λ (h : p ∧ q), ⟨h.right, h.left⟩

LeanStep: Tactic proofs
Tactic proof dataset (as needed for Lean GPT-f)
Tactic proof:
    lemma and_swap : p ∧ q → q ∧ p :=
    begin
      intro h,
      cases h with hp hq,
      constructor, exact hq, exact hp
    end
Tactic state before cases h with hp hq:
    p q : Prop,
    h : p ∧ q
    ⊢ q ∧ p
LeanStep Dataset
● Human-written tactic command (text)
● Hypotheses and goals (text)
● Declaration name
● ~140k human-written goal-tactic pairs
● ~19k tactic-proved theorems from mathlib and lean core.
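
The GOAL/PROOFSTEP format on the seq-to-seq slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the separator is copied from the slide as the literal symbol ⇥ (the actual encoding used for line breaks in the LeanStep data is not given in the talk).

```python
# Minimal sketch of the GOAL ... PROOFSTEP ... text format from the slide.
# Assumption: lines of the tactic state are joined with the " ⇥ " marker
# exactly as printed on the slide; the real dataset encoding may differ.
SEP = " ⇥ "


def format_prompt(tactic_state: str) -> str:
    """Build the prompt given to the language model at inference time."""
    lines = [line.strip() for line in tactic_state.strip().splitlines()]
    return "GOAL " + SEP.join(lines) + " PROOFSTEP "


def format_training_example(tactic_state: str, tactic: str) -> str:
    """Build one training string: the prompt followed by the human-written tactic."""
    return format_prompt(tactic_state) + tactic


state = "p q : Prop,\nh : p ∧ q\n⊢ q ∧ p"
print(format_training_example(state, "cases h with hp hq"))
# GOAL p q : Prop, ⇥ h : p ∧ q ⇥ ⊢ q ∧ p PROOFSTEP cases h with hp hq
```

At inference time only `format_prompt` would be used, and the model's sampled completion is read back as the tactic to try.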
Even more tactic information available
Tactic proof:
    lemma and_swap : p ∧ q → q ∧ p :=
    begin
      intro h,
      cases h with hp hq,
      constructor, exact hq, exact hp
    end
Tactic state before cases h with hp hq:
    p q : Prop,
    h : p ∧ q
    ⊢ q ∧ p
Data to extract
● Tactic command and position
● Hypotheses and goals
● Tactic name
● Tactic arguments
● Full abstract syntax tree of the proof
● Declaration name
● Hidden tactic state information:
  ○ Open namespaces
  ○ Environment
  ○ Metavariables
  ○ Other hidden information

    tactic.interactive.cases (none, ``(h)) [`hp, `hq]

LeanStep: Term proofs
Lean stores term proofs for all theorems
    #print of_iff_true
    theorem of_iff_true : ∀ {a : Prop}, (a ↔ true) → a :=
    λ {a : Prop} (h : a ↔ true), iff.mp (iff.symm h) trivial

Generate datasets by adding holes to proof term
Proof term with hole | Hypotheses and goal (tactic state) | Term that fulfills the goal
_ | ⊢ ∀ {a : Prop}, (a ↔ true) → a | λ {a : Prop} (h : a ↔ true), iff.mp (iff.symm h) trivial
λ {a : Prop}, _ | a : Prop ⊢ (a ↔ true) → a | λ (h : a ↔ true), iff.mp (iff.symm h) trivial
λ {a : Prop} (h : a ↔ true), _ | a : Prop, h : a ↔ true ⊢ a | iff.mp (iff.symm h) trivial
λ {a : Prop} (h : a ↔ true), iff.mp _ trivial | a : Prop, h : a ↔ true ⊢ true ↔ a | iff.symm h
λ {a : Prop} (h : a ↔ true), iff.mp (iff.symm _) trivial | a : Prop, h : a ↔ true ⊢ a ↔ true | h
λ {a : Prop} (h : a ↔ true), iff.mp (iff.symm h) _ | a : Prop, h : a ↔ true ⊢ true | trivial

Mix1: Derived tactic steps from term proof data
Proof term with hole | Hypotheses and goal (tactic state) | Term that fulfills the goal
λ {a : Prop} (h : a ↔ true), iff.mp _ trivial | a : Prop, h : a ↔ true ⊢ true ↔ a | iff.symm h

● Proof term prediction: Predict masked proof term from hyps. and goal. Treat as exact tactic.
    Input: a : Prop, h : a ↔ true ⊢ true ↔ a
    Target: exact (iff.symm h)
● Next lemma prediction: Predict outer-most lemma in masked proof term. Treat as apply tactic.
    Input: a : Prop, h : a ↔ true ⊢ true ↔ a
    Target: apply (iff.symm)

Mix2: Fill-in-the-blank tasks from term proof data
Proof term with hole | Hypotheses and goal (tactic state) | Term that fulfills the goal
λ {a : Prop} (h : a ↔ true), iff.mp _ trivial | a : Prop, h : a ↔ true ⊢ true ↔ a | iff.symm h

● Skip proof: Predict masked term from partial proof (c.f. N2Formal's "skip tree task")
    Input: λ {a : Prop} (h : a ↔ true), iff.mp _ trivial
    Target: iff.symm h
● Type prediction: Predict type (i.e. the goal) of masked out term from partial proof.
    Input: λ {a : Prop} (h : a ↔ true), iff.mp _ trivial
    Target: true ↔ a

Mix2: Classification tasks from term proof data
Proof term with hole | Hypotheses and goal (tactic state) | Term that fulfills the goal
λ {a : Prop} (h : a ↔ true), iff.mp _ trivial | a : Prop, h : a ↔ true ⊢ true ↔ a | iff.symm h
λ {a : Prop} (h : a ↔ true), iff.mp (iff.symm h) _ | a : Prop, h : a ↔ true ⊢ true | trivial

● Premise classification: Predict if theorem in library is used in proof of goal
    Tactic state: a : Prop, h : a ↔ true ⊢ true ↔ a; candidate premise: iff.symm
    Tactic state: a : Prop, h : a ↔ true ⊢ true ↔ a; candidate premise: trivial
● Local context classification: Predict which local variables are used in the proof.
    Tactic state: a : Prop, h : a ↔ true ⊢ true ↔ a; variables used: a, h
    Tactic state: a : Prop, h : a ↔ true ⊢ true (no variables listed)

Mix2: Elaboration tasks from theorem proving
● Proof term elaboration: Predict fully elaborated proof from pretty printed proof
    Input: λ {a : Prop} (h : a ↔ true), h.symm.mp trivial
    Target: λ {a : Prop} (h : iff a true), @iff.mp true a (@iff.symm a true h) trivial
● Tactic state elaboration: Predict fully elaborated tactic state from pretty printed state
    Input: a : Prop, h : a ↔ true ⊢ true ↔ a
    Target: a : Prop, h : iff a true ⊢ iff true a

Mix2: Naming tasks from proof term data
● Theorem naming: Predict name of theorem from its type (theorem statement)
    Input: ∀ {a : Prop}, (a ↔ true) → a
    Target: of_iff_true

Language Model Training objectives
A theorem proving AI environment

Proof search and evaluation
Train a model on LeanStep tactic proof dataset
    Tactic state: p q : Prop, h : p ∧ q ⊢ q ∧ p
    Predicted tactic: cases h with hp hq
Breadth-first tree search (implemented with Lean metaprogramming)
    [Diagram: search tree of tactic states (⊢) ending in "no goals!"]
Incorporate into Lean testing environment (implemented with Lean metaprogramming)
    Before cases h with hp hq: p q : Prop, h : p ∧ q ⊢ q ∧ p
    After cases h with hp hq: p q : Prop, hp : p, hq : q ⊢ q ∧ p

Breadth-first proof search
    a b : ℕ
    h : a.succ < b
    ⊢ a ≤ b ∧ ¬b ≤ a
query N tactic commands from model
    exact ⟨le_of_lt h, not_le_of_lt h⟩
    split
    cases le_total a b
    ...
    [Diagram: one candidate is marked "failed"; the others lead to new tactic states (⊢), and the search ends at "no goals!"]
Perform breadth-first search
● Up to a fixed depth D
● Restricting max size Q of the queue

Results

Lean GPT-f language model
● Based on MetaMath GPT-f model
● Decoder-only Transformer similar to GPT-3
● 837M Trainable Parameters
● Pretrained on
  ○ CommonCrawl
  ○ WebMath (Github, arXiv Math, Math StackExchange)

Training and Evaluation
Evaluation
● Evaluate model on test theorems
● Use breadth-first proof search
Training
● Split all data by (hash of) theorem name:
  ○ train (80%)
  ○ validate (5%)
  ○ test (15%)
● Proof-artifact co-training (PACT).
Co-train transformer using all of:
  ○ Tactic data
  ○ Mix1 (next lemma and proof term prediction)
  ○ Mix2 (all other tasks)

Results and Co-training vs Pre-training Ablation

Results by modules

Examples and Testimonials
lie_algebra.morphism.map_bot_iff
Human-written proof | Lean GPT-f proof

Thank You!
Paper on arXiv: Proof Artifact Co-training for Theorem Proving with Language Models
gptf tactic is available at https://github.com/jesse-michael-han/lean-gptf
Contact us for more about the Lean datasets

Appendix
LeanStep tactic proof dataset
● ~140k human-written goal-tactic pairs
● Spanning ~19k tactic-proved theorems from mathlib and lean core.
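
The breadth-first proof search described in the talk (query N tactic commands from the model, search up to depth D, cap the queue at size Q) can be sketched in Python as follows. This is not the authors' implementation, which the slides say is written with Lean metaprogramming. The function name, the callback signatures, and the toy lookup table standing in for Lean are all assumptions made for illustration, and a tactic state is simplified to a single string.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Optional, Tuple


def bfs_proof_search(
    initial_state: str,
    suggest: Callable[[str], Iterable[str]],            # stand-in for the language model
    apply_tactic: Callable[[str, str], Optional[str]],  # stand-in for Lean; None means the tactic failed
    is_solved: Callable[[str], bool],
    max_depth: int,   # D on the slide
    max_queue: int,   # Q on the slide
) -> Optional[List[str]]:
    """Return a list of tactics that closes the goal, or None if the search fails."""
    queue: Deque[Tuple[str, List[str]]] = deque([(initial_state, [])])
    while queue:
        state, proof = queue.popleft()  # FIFO order makes the search breadth-first
        if is_solved(state):
            return proof
        if len(proof) >= max_depth:     # do not expand beyond depth D
            continue
        for tactic in suggest(state):   # the N candidates queried from the model
            next_state = apply_tactic(state, tactic)
            if next_state is None:      # candidate failed, drop it
                continue
            if is_solved(next_state):
                return proof + [tactic]
            if len(queue) < max_queue:  # keep at most Q open states
                queue.append((next_state, proof + [tactic]))
    return None


# Hypothetical toy environment: a lookup table plays the role of the prover.
TOY_STEPS = {
    ("h : p ∧ q ⊢ q ∧ p", "cases h with hp hq"): "hp : p, hq : q ⊢ q ∧ p",
    ("hp : p, hq : q ⊢ q ∧ p", "exact ⟨hq, hp⟩"): "no goals",
}
CANDIDATES = ["apply foo", "cases h with hp hq", "exact ⟨hq, hp⟩"]

proof = bfs_proof_search(
    "h : p ∧ q ⊢ q ∧ p",
    suggest=lambda state: CANDIDATES,
    apply_tactic=lambda state, tactic: TOY_STEPS.get((state, tactic)),
    is_solved=lambda state: state == "no goals",
    max_depth=4,
    max_queue=16,
)
print(proof)
```

In the toy run, candidates that are missing from the lookup table are treated as failed tactics, mirroring the "failed" branch in the slide's diagram. A real setup would replace the three callbacks with model sampling and calls into Lean.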