Creating High-Performance, Statically Type-Safe Network Applications
Total Page:16
File Type:pdf, Size:1020Kb
UCAM-CL-TR-775 Technical Report ISSN 1476-2986 Number 775 Computer Laboratory Creating high-performance, statically type-safe network applications Anil Madhavapeddy March 2010 15 JJ Thomson Avenue Cambridge CB3 0FD United Kingdom phone +44 1223 763500 http://www.cl.cam.ac.uk/ c 2010 Anil Madhavapeddy This technical report is based on a dissertation submitted April 2006 by the author for the degree of Doctor of Philosophy to the University of Cambridge, Robinson College. Technical reports published by the University of Cambridge Computer Laboratory are freely available via the Internet: http://www.cl.cam.ac.uk/techreports/ ISSN 1476-2986 Abstract A typical Internet server finds itself in the middle of a virtual battleground, under constant threat from worms, viruses and other malware seeking to subvert the original intentions of the programmer. In particular, critical Internet servers such as OpenSSH, BIND and Sendmail have had numerous security issues ranging from low-level buffer overflows to subtle protocol logic errors. These problems have cost billions of dollars as the growth of the Internet exposes increasing numbers of computers to electronic malware. Despite the decades of research on techniques such as model-checking, type-safety and other forms of formal analysis, the vast majority of server implementations continue to be written unsafely and informally in C/C++. In this dissertation we propose an architecture for constructing new implementations of stan- dard Internet protocols which integrates mature formal methods not currently used in deployed servers: (i) static type systems from the ML family of functional languages; (ii) model checking to verify safety properties exhaustively about aspects of the servers; and (iii) generative meta- programming to express high-level constraints for the domain-specific tasks of packet parsing and constructing non-deterministic state machines. Our architecture—dubbed MELANGE—is based on Objective Caml and contributes two domain-specific languages: (i) the Meta Packet Language (MPL), a data description language used to describe the wire format of a protocol and output statically type-safe code to handle network traffic using high-level functional data structures; and (ii) the Statecall Policy Language (SPL) for constructing non-deterministic finite state automata which are embedded into applications and dynamically enforced, or translated into PROMELA and statically model-checked. Our research emphasises the importance of delivering efficient, portable code which is feasi- ble to deploy across the Internet. We implemented two complex protocols—SSH and DNS—to verify our claims, and our evaluation shows that they perform faster than their standard coun- terparts OpenSSH and BIND, in addition to providing static guarantees against some classes of errors that are currently a major source of security problems. 3 CONTENTS Contents 1 Introduction 7 1.1 Internet Growth . .7 1.1.1 Security and Reliability Concerns . .8 1.1.2 Firewalls Prove Insufficient . .8 1.1.3 The Internet Server Monoculture . .9 1.2 Motivation for Rewriting Internet Servers . 10 1.3 Contributions . 11 2 Background 13 2.1 Internet Security . 13 2.1.1 History . 13 2.1.2 Language Issues . 15 2.1.3 The Rise of the Worm . 16 2.1.4 Defences Against Internet Attacks . 17 2.2 Functional Programming . 20 2.2.1 History . 20 2.2.2 Type Systems . 22 2.2.3 Features . 23 2.2.4 Evolution . 26 2.3 Objective Caml . 26 2.3.1 Strong Abstraction . 27 2.3.2 Polymorphic Variants . 29 2.3.3 Mutable Data and References . 30 2.3.4 Bounds Checking . 31 2.4 Model Checking . 32 2.4.1 SPIN and PROMELA ........................... 33 2.4.2 System Verification using SPIN ..................... 35 2.4.3 Model Creation and Extraction . 37 2.5 Summary . 38 3 Related Work 39 3.1 Control Plane . 40 3.1.1 Formal Models of Concurrency . 40 3.1.2 Model Extraction . 41 4 CONTENTS 3.1.3 Dynamic Enforcement and Instrumentation . 43 3.2 Data Plane . 44 3.2.1 Data Description Languages . 44 3.2.2 Active Networks . 45 3.2.3 The View-Update Problem . 46 3.3 General Purpose Languages . 47 3.3.1 Software Engineering . 47 3.3.2 Meta-Programming . 48 3.3.3 Functional Languages for Networking . 48 3.4 Summary . 50 4 Architecture 51 4.1 Goals . 51 4.1.1 Data Abstractions . 51 4.1.2 Language Support . 53 4.2 The MELANGE Architecture . 54 4.2.1 Meta Packet Language (MPL) . 56 4.2.2 Statecall Specification Language (SPL) . 57 4.3 Threat Model . 59 4.4 Summary . 61 5 Meta Packet Language 62 5.1 Language . 63 5.1.1 Parsing IPv4: An Example . 63 5.1.2 Theoretical Space . 68 5.1.3 Syntax . 70 5.1.4 Semantics . 70 5.2 Basis Library . 73 5.2.1 Packet Environments . 73 5.2.2 Basic Types . 75 5.2.3 Custom Types . 77 5.3 OCaml Interface . 77 5.3.1 Packet Sinks . 79 5.3.2 Packet Sources . 79 5.3.3 Packet Proxies . 80 5.4 Evaluation . 81 5.4.1 Experimental Setup . 81 5.4.2 Experiments and Results . 83 5.5 Discussion . 85 5.6 Summary . 86 6 Statecall Policy Language 87 6.1 Statecall Policy Language . 89 6.1.1 A Case Study using ping ......................... 89 6.1.2 Syntax . 91 6.1.3 Typing Rules . 91 6.2 Intermediate Representation . 94 5 CONTENTS 6.2.1 Control Flow Automaton . 96 6.2.2 Multiple Automata . 97 6.2.3 Optimisation . 99 6.3 Outputs . 101 6.3.1 OCaml . 102 6.3.2 PROMELA ................................. 105 6.3.3 HTML and Javascript . 106 6.4 Summary . 107 7 Case Studies 109 7.1 Secure Shell (SSH) . 110 7.1.1 Performance . 112 7.1.2 SSH Packet Format . 115 7.1.3 SSH State Machines . 116 7.1.4 AJAX Debugger . 118 7.1.5 Model Checking . 118 7.2 Domain Name System . 122 7.2.1 DNS Packet Format . 122 7.2.2 An Authoritative Deens Server . 124 7.2.3 Performance . 125 7.3 Code Size . 127 7.4 Summary . 128 8 Conclusions 129 8.1 Future Work . 131 A Sample Application: ping 156 B MPL User Manual 160 B.1 Well-Formed Specifications . 160 B.2 Semantics . 163 C MPL Protocol Listings 165 C.1 Ethernet . 165 C.2 IPv4 . 165 C.3 ICMP . ..