Type Systems and Programming

D. Renault, ENSEIRB-Matmeca
Mar. 24th 2021, v.1.4.4

Introduction

What's a programming language?

The Ackermann function, written in C:

    int ackermann(int m, int n) {
      if (!m) return n + 1;
      if (!n) return ackermann(m-1, 1);
      return ackermann(m-1, ackermann(m, n-1));
    }

and in an APL-style notation:

    ackermann ← {
      0=1⊃⍵: 1+2⊃⍵
      0=2⊃⍵: ∇(¯1+1⊃⍵) 1
      ∇(¯1+1⊃⍵),∇(1⊃⍵),¯1+2⊃⍵
    }

A programming language is a complex and expressive tool for the representation of computations.

Focus on the problem of the verification of these computations. What properties can one expect to be enforceable?

  • Termination properties: is it possible to be perfectly certain that a given program evaluates in a finite number of steps?
  • Correctness properties: is it possible to be perfectly certain that a program never ends up in an uncontrolled error state?

And more pragmatically, checking for the presence or absence of: null pointer exceptions, invalid file descriptors, indices out of array bounds, divisions by zero.

How is it possible to enforce some of these properties?

⇒ Different families of methods, spread along the development cycle.

  Development cycle: Requirements, Model, Architecture, Implementation, Testing, Maintenance.

  Families of methods:
  • Formal verification: deductive methods, model checking, ...
  • Static analysis: lexical analysis, type systems, ...
  • Runtime verification: testing, monitoring, ...

⇒ Each family possesses different characteristics:

  • Compile-time or runtime
  • Automatic or assisted
  • Decidable (with what complexity?) or semi-decidable

Type systems (informal description): a family of tractable methods, considering programs on a syntactic level, verifying some properties of their behaviors.

General tactics:

  • Classify the expressions occurring inside a program into types.
  • Verify that the combination of these types in the program respects a set of coherence rules.

Example: locomotive + flower

Programming languages and type systems studied in this course:

  • OCaml (4.09): caml.inria.fr
  • Haskell (ghc-8.10): haskell.org/ghc
  • LiquidHaskell (0.8-git): ucsd-progsys.github.io/liquidhaskell-blog
  • Scala (2.12): scala-lang.org

And their influence on mainstream languages: Java 8-15, C++ 14-20, C# 5-9.

Some references

  • Pierce, B. C. Types and Programming Languages. MIT Press, 2002.
  • Bruce, K. B. Foundations of Object-oriented Languages: Types and Semantics. MIT Press, 2002.
  • Hindley, J. R. Basic Simple Type Theory. Cambridge University Press, 1997.
  • Wadler, P. Propositions as Types. Communications of the ACM, 2015.

Overview

  1 Simple lambda-calculus
      Propositional logic
      Untyped lambda calculus
      Simply typed lambda calculus
      Type checking and inference
      Curry-Howard correspondence
  2 Polymorphism

Definition (Minimal intuitionistic logic)

The minimal intuitionistic logic is the set of all formulae $P, Q, \ldots$ constructed from:

  • an infinite set of atomic formulae denoted as variables $\alpha, \beta, \ldots$;
  • if $P, Q$ are two formulae, then $P \Rightarrow Q$ is also a formula.

[Figure: an example formula drawn as a tree, with $\Rightarrow$ at the inner nodes and the atoms $\alpha$, $\beta$, $\delta$ at the leaves.]
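
The inductive definition of formulae above can be mirrored by a small data type. The following is only a minimal sketch in Python, not something taken from the course: the names `Atom`, `Implies` and `show` are hypothetical, and atoms are written with plain ASCII letters.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    """An atomic formula, identified by its name (hypothetical encoding)."""
    name: str


@dataclass(frozen=True)
class Implies:
    """The formula `left => right`, built from two existing formulae."""
    left: "Formula"
    right: "Formula"


# A formula is either an atom or an implication between two formulae.
Formula = Union[Atom, Implies]


def show(f: Formula) -> str:
    """Print a formula, treating => as right-associative."""
    if isinstance(f, Atom):
        return f.name
    left = show(f.left)
    if isinstance(f.left, Implies):
        # Parentheses are only needed when an implication sits on the left.
        left = f"({left})"
    return f"{left} => {show(f.right)}"


a, b, d = Atom("a"), Atom("b"), Atom("d")
example = Implies(Implies(a, b), Implies(a, d))
print(show(example))  # (a => b) => a => d
```

Frozen dataclasses give structural equality for free, so two formulae built the same way compare equal, which is what a rule checker needs when matching premises against conclusions.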

Definition (Sequent)

A sequent is an assertion $\Gamma \vdash \alpha$, where:

  • $\Gamma$ is a possibly empty sequence of formulae called the antecedents, and
  • $\alpha$ is a formula called the consequent.

Writing $\Gamma, P \vdash Q$ means that the antecedents consist of a list of formulae $\Gamma$ along with a specific formula $P$.

Definition (Derivation tree)

A derivation tree (or proof tree) is a tree whose nodes are syntactically coherent with a finite set of inference rules. In propositional logic, these rules are the following:

\[
\dfrac{}{P \vdash P}\ [\mathrm{ax}]
\qquad
\dfrac{\Gamma, P \vdash Q}{\Gamma \vdash P \Rightarrow Q}\ [\Rightarrow_i]
\qquad
\dfrac{\Gamma \vdash P \qquad \Gamma \vdash P \Rightarrow Q}{\Gamma \vdash Q}\ [\Rightarrow_e]
\]

Each inference rule possesses a name indicating its role, most of the time the introduction (I) or the elimination (E) of a logical operator.

Frege's theorem

\[
(R \Rightarrow (S \Rightarrow T)) \Rightarrow ((R \Rightarrow S) \Rightarrow (R \Rightarrow T))
\]

Proof as a derivation tree, with $\Gamma ::= \{(R \Rightarrow (S \Rightarrow T)), (R \Rightarrow S), R\}$:

\[
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{\Gamma \vdash R \qquad \Gamma \vdash R \Rightarrow (S \Rightarrow T)}{\Gamma \vdash S \Rightarrow T}
        \qquad
        \dfrac{\Gamma \vdash R \qquad \Gamma \vdash R \Rightarrow S}{\Gamma \vdash S}
      }{\Gamma \vdash T}
    }{(R \Rightarrow (S \Rightarrow T)), (R \Rightarrow S) \vdash (R \Rightarrow T)}
  }{(R \Rightarrow (S \Rightarrow T)) \vdash (R \Rightarrow S) \Rightarrow (R \Rightarrow T)}
}{\vdash (R \Rightarrow (S \Rightarrow T)) \Rightarrow ((R \Rightarrow S) \Rightarrow (R \Rightarrow T))}
\]

Summary on propositional logic

The model of propositional logic offers:

  • a language describing a family of objects inductively, and
  • a system for defining a subset of this family respecting local rules.

The difficulty lies in constructing a proof (here a derivation tree) that establishes the validity of a proposition.

In the following, we construct an equivalent model for a programming language: the untyped λ-calculus.
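
To make the summary concrete, here is a hedged sketch of a checker that verifies whether a derivation tree respects the three rules [ax], [⇒i] and [⇒e], applied to the proof of Frege's theorem. It is an illustration under stated assumptions, not the course's implementation: all names (`Atom`, `Implies`, `Sequent`, `Node`, `check`) are hypothetical, contexts are compared as sets (ignoring order and duplicates), and [ax] is accepted whenever the consequent occurs among the antecedents, as the leaves of the proof tree require.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Implies:
    left: "Formula"
    right: "Formula"


Formula = Union[Atom, Implies]


@dataclass(frozen=True)
class Sequent:
    context: Tuple[Formula, ...]  # the antecedents (Gamma)
    goal: Formula                 # the consequent


@dataclass(frozen=True)
class Node:
    rule: str                     # "ax", "=>i" or "=>e"
    sequent: Sequent
    premises: Tuple["Node", ...] = ()


def check(node: Node) -> bool:
    """Return True if every node of the tree is an instance of its rule."""
    ctx, goal = set(node.sequent.context), node.sequent.goal
    if not all(check(p) for p in node.premises):
        return False
    if node.rule == "ax":
        # Assumption: the axiom holds when the goal is one of the antecedents.
        return not node.premises and goal in ctx
    if node.rule == "=>i":
        if len(node.premises) != 1 or not isinstance(goal, Implies):
            return False
        p = node.premises[0].sequent
        return p.goal == goal.right and set(p.context) == ctx | {goal.left}
    if node.rule == "=>e":
        if len(node.premises) != 2:
            return False
        minor, major = (p.sequent for p in node.premises)
        return (set(minor.context) == ctx and set(major.context) == ctx
                and major.goal == Implies(minor.goal, goal))
    return False


R, S, T = Atom("R"), Atom("S"), Atom("T")
rst, rs = Implies(R, Implies(S, T)), Implies(R, S)
gamma = (rst, rs, R)


def ax(f: Formula) -> Node:
    """A leaf of the proof: an axiom in the context gamma."""
    return Node("ax", Sequent(gamma, f))


s_to_t = Node("=>e", Sequent(gamma, Implies(S, T)), (ax(R), ax(rst)))
s = Node("=>e", Sequent(gamma, S), (ax(R), ax(rs)))
t = Node("=>e", Sequent(gamma, T), (s, s_to_t))
step1 = Node("=>i", Sequent((rst, rs), Implies(R, T)), (t,))
step2 = Node("=>i", Sequent((rst,), Implies(rs, Implies(R, T))), (step1,))
root = Node("=>i", Sequent((), Implies(rst, Implies(rs, Implies(R, T)))), (step2,))

print(check(root))  # True
```

Checking a given tree is a purely local and mechanical task; the difficulty mentioned above lies in finding the tree in the first place, which this sketch does not attempt.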