Welcome to the 2019 DeepSpec Workshop! The Science of Deep Specification

Benjamin C. Pierce University of Pennsylvania

DeepSpec Workshop @ PLDI June, 2019 “We can’t build that works!” Or…? How did that happen? • Better programming languages • Powerful mechanisms for abstraction and modularity • Better software development methodology • Agile workflows, unit testing, … • Stable platforms and frameworks • Posix, Win32, Android, iOS, apache, DOM/JS, … Are we done?

Nope What about secure software? Grounds for hope… • Better programming languages • Basic safety guarantees built in • Better understanding of risks and vulnerabilities • Better system architectures for security • Separation kernels, hypervisors, sandboxing, TPMs, … • Success stories of formal specification and machine-checked verification of critical software at scale • Not a panacea (side channels, etc.) • But a big step in the right direction! design logical specification Theorem compiles_correctly := Formal ∀(e : exp), execute [] (compile e) = [eval e]. Live Rich executable specification

Definition compiles_correctly (e : exp) : Bool := Formal eq (execute [] (compile e)) [eval e]. QuickCheck compiles_correctly. Live Rich unit tests

proof Example e3 : Formal testing assert (compiles_correctly (Plus (Num 2) (Num 2))). Live Example e4 : assert (compiles_correctly (Plus (Num 5) (Num 3))). Rich /✘ randomized code unit testing informal specification Fixpoint compile (e : exp) : list instr := Formal ✘ match e with Lorem ipsum dolor sit amet, consectetur adipiscing | Num n => [PUSH n] elit, sed do eiusmod tempor incididunt ut labore et Live ✘ | Plus e1 e2 => compile e1 ++ compile e2 ++ [PLUS] dolore magna aliqua. Ut enim ad minim veniam, quis | Minus e1 e2 => compile e1 ++ compile e2 ++ [MINUS] thinking | Mult e1 e2 => compile e1 ++ compile e2 ++ [MULT] nostrud exercitation ullamco laboris nisi ut aliquip ex Rich end. ea commodo consequat. … Are logical specifications practical? • Accepts most of ISO C 99 • Produces machine code for PowerPC, ARM, (32-bit), and RISC-V architectures • 90% of the performance of GCC (v4, opt. level 1) Verification really works!

Regehr’s Csmith project used random testing to assess all popular C compilers, and reported: “The striking thing about our CompCert results is that the middle- John Regehr Univ. of Utah end bugs we found in all other compilers are absent. As of early 2011, the under-development version of CompCert is the only compiler we have tested for which Csmith cannot find wrong-code errors. This is not for lack of trying: we have devoted about six CPU-years to the task. The apparent unbreakability of CompCert supports a strong argument that developing compiler optimizations within a proof framework, where safety checks are explicit and machine- checked, has tangible benefits for compiler users.” • Real-world operating-system kernel

• With an end-to-end proof of implementation correctness and security enforcement

• Verified down to machine code And many, many more! • Bedrock system • IronClad Apps • Ur/Web compiler • Full-scale formal specifications of critical system interfaces CompCert TSO compiler • • X86 instruction set • CompCert static analysis tools • TCP protocol suite • Posix file system interface Jitk and Data6 verified filesystems • • Weak memory consistency models for • Fscq file system from MIT x86, ARM, PowerPC • ISO C / C++ concurrency • Verdi distributed system framework • Elf loader format • C language (Cerberus – also see • Testable formal spec for AutoSAR Krebbers, K semantics, …) • CakeML compiler • Vellvm: Verified LLVM optimizations Verified Textbooks!

Isabelle

Coq

SoftwareFoundations.org … and several others! Why now?

Urgent need for increased confidence + Diminishing value of “paper proofs” + Progress on enabling technologies Enabling Technologies

• Logics • Concurrent separation logic, … • Proof assistants • , Isabelle, ACL2, Twelf, HOL-light, … Testing tools and methodologies • QuickCheck • QuickCheck, QuickChick, … • DSLs for writing specifications • OTT, Lem, Redex, … • Languages with integrated specifications • Dafny, Boogie, JML, F*, Liquid Types, Verilog PSL, Dependent Haskell, ... Enabling Technologies So are we done?

Nope. Lessons from CompCert

OS client interface Program Logic

CertiKOS Verifiable C hypervisor kernel System

Shao C language C language Appel

C language

CompCert Compiler

PowerPC ISA Leroy

PowerPC ISA

IBM’s CPU Sewell Transistors Lessons from CompCert

OS client interface Program Logic

CertiKOS Verifiable C hypervisor kernel System

Shao C language C language Appel

C language

CompCert Compiler

PowerPC ISA Leroy

PowerPC ISA

IBM’s CPU Sewell Transistors Lessons from CompCert

OS client interface Program Logic

CertiKOS Verifiable C hypervisor kernel System

Shao C language C language Appel

C language

CompCert Compiler

PowerPC ISA Leroy

PowerPC ISA

IBM’s CPU Sewell Transistors Lessons from seL4

• Original specification and correctness proof for seL4 kernel took ~20 person years • Later, the same team added a tool for setting up secure system configurations • where processes at different security levels were guaranteed not to interfere • Proving correctness of this tool took ~4 person years, of which 1.5 years were devoted to upgrading the kernel specification (and proof) to eliminate unwanted nondeterminism Two-sided specifications

Verified components Two-sided must connect at specifications specification boundaries “Deep” specifications:

Formal mathematically rigorous

Rich precisely expressing intended behavior of complex software automatically checked against Live actual code (not just a model) exercised by both “implementors” Two-sided and “clients” The Science of Deep Specification

Andrew Appel Steve Zdancewic Princeton University of Pennsylvania

Lennart Beringer Adam Chlipala Yours truly Zhong Shao Stephanie Weirich Princeton MIT University of Pennsylvania Yale University of Pennsylvania And more importantly… Andres Erbsen Leonidas Lampropoulos Richard Zhang Antal Spector-Zabusky Li-yao Xia Ronghui Gu Antoine Voizard Lionel Rieg Samuel Gruetter Benjamin Sherman Lucas Paul Santiago Cuellar Christine Rizkallah Matthew Weaver Unsung Lee David Costanzo Mengqi Liu Vilhelm Sjöberg David Kaloper Meršinjak Mirai Ikebuchi William Mansky Dmitri Garbuzov Murali Vijayaraghavan Wolf Honore Hernán Vanzetto Nick Giannarakis Xiongnan (Newman) Wu Jade Philipoom Olivier Savary Belanger Yao Li Jason Gross Pedro Henrique Avezedo de Yishuai Li Ji-Yong Shin Amorim Yuanfeng Peng Jieung Kim Paul He Yuting Wang Joachim Breitner Pierre Wilke Zoe Paraskevopoulou Joonwon Choi Qinxiang Cao Joshua Lockerman Quentin Carbonneaux Jérémie Koenig Goal: Move from point success stories to sustainable engineering practice at industrially relevant scale Andrew Lennart Appel Beringer

Demo projects: Program logic for proving correctness of crypto primitives, (concurrent) C programs “mailbox” Proof automation tools communication system, for applying the program logic garbage collector for Certicoq

B+ Trees DBMS

web server

30 Operating System Zhong Shao

Demo project: “Certified Abstraction Layers” Certified Kit Operating System (CertiKOS) a new refinement-based methodology for software Configurable as supervisor or hypervisor correctness proofs Runs on Intel, AMD, ARM platforms . . . of programs with low-level concerns such as interrupts, . . . multicore virtual-memory mapping, scheduling, . . . Hosts Linux (or other client/guest software)

31 LLVM compiler intermediate language Steve Zdancewic

Vellvm: Formal specification Demo project: of LLVM; proofs of correctness of LLVM Use as basis for compiler phases testing correctness of GHC, using QuickChick widely used compiler framework based on Static Single-Assignment (SSA)

32 Haskell Language

Specification Stephanie Weirich Haskell: widely used pure functional Haskell Core Spec: Demo projects: programming language with lazy evaluation Formal specification Prove correctness of semantics of the of some GHC phases Haskell core language using hs-to-coq Haskell Core: near-source-level Use as basis for intermediate language testing correctness inside GHC compiler of GHC, using QuickChick

33 Verified Compiler for

Coq programs Andrew Appel Greg Morrisett

CertiCoq: Demo projects: Gallina: A verified compiler functional programming for Gallina Resolution theorem language inside Coq prover for Separation Logic Write your software as a “Extraction:” pure functional program in Coq, Translate Gallina prove its correctness using Coq, (?) CompCert to ML, compile with use CertiCoq to compile Ocaml compiler to efficient machine code (?) database query optimization Extraction is quite good, but it’s not verified (?) parts of web server correct 34 Verified processor

design Adam Chlipala Old way: New way: Write reference manual for ISA Formal specification of ISA Prove correctness of Bluespec program Write RTL program in VHDL Write RTL program

Decent formal in Bluespec Prove correctness tools exist for Compile VHDL of Bluespec compiler verifying this into transistors part Compile Bluespec into VHDL

Compile VHDL Use existing tools to verify correctness into transistors

Demo project: specification / verification of RISC-V processor implementation

35 Specification- based

random testing Benjamin Pierce Old way: Fuzz testing QuickChick: Semantic fuzz testing based Recent ways: on conformance to formal Semantic fuzz testing specification in Coq Tool: QuickCheck, for Haskell and Erlang; fuzzes over (tree) data structures, automatically reduces bugs found into minimal input Demo projects: cases Apache web server DeepSpec web server

Haskell compiler 36 Verification of cryptographic primitives Lennart Adam Beringer Chlipala

Demo projects: these crypto applications High-level cryptographic serve as demo projects for several of our specs (“pseudorandom function, other tools: cryptographic advantage”), Message authentication, Random number generation

High-level functional specs (elliptic curves in finite fields) Fiat

Low-level functional specs (multibit carry)

Efficient imperative implementations

37 Andrew Lennart Appel Beringer

Demo projects: Program logic for proving correctness of crypto primitives, (concurrent) C programs “mailbox” Proof automation tools communication system for applying the program garbage collector logic for Certicoq B+ Trees DBMS

web server

38 Goal: Rich, formal, live, 2-sided specs

39 Application demo?

Elections Web Server IoT device C programming C programming Hypervisor C programming Database programmingFunctional Database

Operating system Operating system Operating system

40 Application demos!

Web Server Self-driving car C programming File systemCrypto

C programming C Hypervisor

Operating system

41 DeepWeb

A web server built on DeepSpec Many parts One whole Challenges

• Extreme vertical integration • Make progress through a sequence of “integration experiments” • Multiple levels and styles of specifications • Need a “lingua franca” for writing a variety of specs • à Interaction trees • Combining testing and verification • Reasoning about server behavior “modulo the network” Challenge: Vertical Integration Executable high-level specification of HTTP(S) spec HTTP(S) protocols and web services Web server

POSIX API System call interface specification = OS RISC-V ISA Instruction-set specification

RISC-V

Transistors RTL-level description of circuit behaviors Goal: A “single QED” encompassing the whole stack Executable high-level specification of HTTP(S) spec HTTP(S) protocols and web services Functional program with same observable Low-level functional spec behavior as C web server Web server System call interface specification POSIX API (separation logic Hoare triples)

POSIX API System call interface specification (CertiKOS “layer interface”) OS Instruction-set specification RISC-V ISA (assembly level, structured memory model) Instruction-set specification RISC-V ISA (machine-code level, flat memory model) RISC-V

Transistors RTL-level description of circuit behaviors Challenge: Disparate Specification Styles Too many metalanguages!

• Network-level HTTP spec • Nondeterministic “model implementation” (functional program) • Client-side acceptance tester (functional program) • Web server implementation • CompCert “observation traces” • VST C verification tool • Hoare triples in separation logic • CertiKOS • “Layer interfaces” Interaction trees Reasoning “Modulo the Network” Swap server specification

Cat Bat Client 1 Cat Bat Dog Elk Dog Client 2 Cat

Server Elk Dog Client 3

52 Swap server: in the real world

Cat • Messages on different connections can be Dog reordered Bat Clients • Messages can be Server Elk / delayed indefinitely Tester Dog

53 Network refinement

Specification Observable behavior by clients Network semantics network-refines ∪I

Implementation Observable behavior by clients Network semantics

Adaptation of Observational refinement/Linearizability Challenge: Testable High-Level Specifications Where we have to stand for testing Because (1) we want to test our C code and (2) the What we have tester also needs to work with stock web servers

Specification Observable behavior by clients Network semantics network-refines ∪I

Implementation Observable behavior by clients Network semantics Automatic Specification derivation Tester (“model implementation”) (“acceptance test”)

Main challenge: nondeterminism • introduced by the network • … or present in the original spec 59 Final theorem

• “If you put these bits (produced by compiling CertiKOS and the web server using CompCert) into a memory connected to this connection of transistors (produced by compiling a RISC-V implementation using Kami), the behaviors of the resulting system will network refine the behaviors describe ed by the model implementation.” Progress

• Vertical integration • See CPP 2019 paper about testing and VST verification of a “swap server” • Interaction trees • https://github.com/DeepSpec/InteractionTrees • See talks by Steve Zdancewic today and by Yann Régis-Giannis and Gil Hur tomorrow • Connecting VST and CertiKOS • See talk by William Mansky today • Connecting CertiKOS and Risc-V • Ongoing work at Yale and MIT on a “flat memory” semantics for CompCert The demo is not the (only) scientific result!

DeepSpec is not “build a verified stack”

62 DeepSpec is . . . a coherent collection of tools and techniques . . .

COMPCERT

…that can be connected, combined, and configured to allow users to build and foundationally verify high assurance, functionally correct software and hardware.

63 DeeoSpec Workshop Overview Welcome and overview

Compiler Verification

Modular Reasoning

Interaction Trees and Algebraic Effects I Interaction Trees and Algebraic Effects II

Hardware / Software Interface Specifications

Verifying all the things

Coinduction and testing Thank you! Join us! (any (more) questions?) Teaching materials Technical workshops (like this one :-) Summer schools visitors program PhD and postdoc positions

Visit deepspec.org to see what’s happening and join our mailing list