Satisfiability Modulo Theory


Satisfiability modulo theory
David Monniaux
CNRS / VERIMAG
October 2017

Schedule
- Introduction
- Propositional logic
- First-order logic
- Applications
- Beyond DPLL(T)
- Quantifier elimination
- Interpolants

Who I am
David Monniaux, senior researcher (directeur de recherche) at CNRS (the French National Center for Scientific Research), working at VERIMAG in Grenoble.

Who we are
VERIMAG is a joint research unit of:
- CNRS (9 permanent researchers)
- Université Grenoble Alpes (15 faculty)
- Grenoble-INP (8 faculty)

What we do
Broadly speaking: methods for designing and verifying safety-critical embedded systems.
- formal methods
- static analysis
- proof assistants
- decision procedures
- exact analysis
- modeling of hardware/software system platforms (caches, networks on chip, etc.)
- hybrid systems

Program verification
A program is a set of instructions, either in a "toy" programming language or formalism (Turing machines, λ-calculus, the "while" language) or in a real programming language. A program has a set of behaviors (its semantics); the goal is to prove that these behaviors are included in the acceptable behaviors (the specification).

Difficulties
Defining the semantics of a real language (C, C++, …) is very difficult; adding parallelism etc. makes it nearly impossible. Specifications are written by humans and may themselves be buggy. I will concentrate on the proving phase.
Objectives
In order of increasing ambition:
- Advanced testing: find execution traces leading to violations, more efficiently than by hand or at random.
- Assisted proof: help the user prove the program.
- Automated proof: prove the program automatically.

Quantifier-free propositional logic
Connectives: ∧ (and), ∨ (or), ¬ (not; ¬x is also written x̄), possibly extended with ⊕ (exclusive or). Constants: t = "true", f = "false". A formula with variables: (a ∨ b) ∧ (c ∨ ¬a). An assignment gives a value to all variables; a satisfying assignment, or model, makes the formula evaluate to t.

Disjunctive normal form
A DNF formula is a disjunction of conjunctions of literals (a literal is a variable or the negation of a variable), obtained by distributivity:
(a ∨ b) ∧ c → (a ∧ c) ∨ (b ∧ c)
Inconvenience? Exponential size in the input.

Conjunctive normal form
A disjunction of literals is a clause; a CNF formula is a conjunction of clauses, obtained by distributivity:
(a ∧ b) ∨ c → (a ∨ c) ∧ (b ∨ c)

Negation normal form
Push negations to the leaves of the syntax tree using De Morgan's laws:
¬(a ∨ b) → (¬a) ∧ (¬b)
¬(a ∧ b) → (¬a) ∨ (¬b)
This makes the formula "monotone" with respect to the ordering f < t.
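As an illustration (not from the slides), the NNF transformation can be sketched on a tiny formula AST; the tuple representation used here is an assumption for the example.

```python
# A minimal sketch of negation normal form: push negations to the
# leaves with De Morgan's laws. The tuple-based AST is made up for
# illustration: ('var', name), ('not', g), ('and', g, h), ('or', g, h).

def nnf(f):
    op = f[0]
    if op == 'var':
        return f
    if op in ('and', 'or'):
        return (op, nnf(f[1]), nnf(f[2]))
    # op == 'not': inspect the negated subformula
    g = f[1]
    if g[0] == 'var':
        return f                      # literal: the negation stays at the leaf
    if g[0] == 'not':
        return nnf(g[1])              # eliminate double negation
    if g[0] == 'and':                 # ¬(a ∧ b) → (¬a) ∨ (¬b)
        return ('or', nnf(('not', g[1])), nnf(('not', g[2])))
    if g[0] == 'or':                  # ¬(a ∨ b) → (¬a) ∧ (¬b)
        return ('and', nnf(('not', g[1])), nnf(('not', g[2])))

# ¬(a ∨ b) becomes (¬a) ∧ (¬b)
print(nnf(('not', ('or', ('var', 'a'), ('var', 'b')))))
# → ('and', ('not', ('var', 'a')), ('not', ('var', 'b')))
```

Note that the result contains negations only directly above variables, which is what makes the formula monotone in its literals.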
Applications
Verification of combinatorial circuits; equivalence of two circuits.

The problem
Representing compactly a set of Boolean states. A set of vectors of n Booleans is a function from {0, 1}ⁿ into {0, 1}. Example: {(0, 0, 0), (1, 1, 0)} is represented by the function mapping (0, 0, 0) ↦ 1, (1, 1, 0) ↦ 1, and everything else to 0.

Expanded BDD
BDD = binary decision diagram. Given the ordered Boolean variables (a, b, c), represent (a ∧ c) ∨ (b ∧ c).
[Slide figure: the expanded decision tree, branching on a, then b, then c, with 0/1 leaves.]

Removing useless nodes
It is silly to keep two identical subtrees.
[Slide figures: the same tree with identical subtrees marked, then progressively compressed.]

Reduced BDD
Idea: turn the original tree into a DAG with maximal sharing; two different but isomorphic subtrees are never created. Canonicity: a given function is always encoded by the same DAG.

Implementation: hash-consing
An important implementation technique that you may use in other contexts. "Consing" comes from "constructor" (cf. Lisp's cons).
"In computer science, particularly in functional programming, hash consing is a technique used to share values that are structurally equal. […] An interesting property of hash consing is that two structures can be tested for equality in constant time, which in turn can improve efficiency of divide and conquer algorithms when data sets contain overlapping blocks." — https://en.wikipedia.org/wiki/Hash_consing

Keep a hash table of all nodes created, with a hash code H(x) that can be computed quickly. For a node (v, b0, b1), compute H from v and the unique identifiers of b0 and b1; a unique identifier is an address (if objects do not move) or a serial number. If an object matching (v, b0, b1) already exists in the table, return it. How to collect garbage nodes?
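The unique-table logic just described can be sketched as follows (an illustrative Python sketch with made-up names, not the slides' code):

```python
# Sketch of hash-consing for reduced BDD nodes (illustrative names).
# A node is a tuple (var, low, high); ZERO and ONE are the terminals.
# The unique table guarantees that structurally equal nodes are the
# same object, so equality tests are identity (pointer) comparisons.

_unique = {}                           # (var, id(low), id(high)) -> node

def mk(var, low, high):
    if low is high:                    # useless test: both branches identical
        return low
    key = (var, id(low), id(high))     # hash from the children's unique ids
    node = _unique.get(key)
    if node is None:                   # not seen before: create and register
        node = (var, low, high)
        _unique[key] = node
    return node                        # otherwise reuse the existing node

ZERO, ONE = ('0',), ('1',)             # terminals as distinct singletons

a1 = mk('a', ZERO, ONE)
a2 = mk('a', ZERO, ONE)
print(a1 is a2)                        # maximal sharing → True
```

In this toy version the table keeps every node alive forever; a production implementation uses weak references so the table does not prevent garbage collection, which is exactly the point addressed next.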
Garbage collection in hash consing
Collecting unreachable nodes requires weak pointers: the pointers from the hash table should be ignored by the GC when it computes reachable objects. Examples: Java's WeakHashMap, OCaml's Weak module. (Another use of weak pointers: caching recent computations.)

Hash-consing is magical
It ensures:
- maximal sharing: never two identical objects at two distinct locations in memory;
- an ultra-fast equality test: comparing pointers (or unique identifiers) suffices.
And once we have it, BDDs are easy.

BDD operations
Once a variable ordering is chosen:
- create the constant BDDs f and t (1-node constants);
- create the BDD for v, for v any variable;
- implement the operations ∧, ∨, etc.

Binary BDD operations
Operations ∧, ∨: recursive descent on both subtrees, with memoizing:
- store the values f(a, b) already computed in a hash table;
- index that table by the unique identifiers of a and b.
Complexity with and without dynamic programming?
- Without dynamic programming, the DAG unfolds into a tree ⇒ exponential.
- With dynamic programming, O(|a|·|b|), where |x| is the size of the DAG x.

Fixed point solving
B0 = initial states
B1 = B0 + states reachable in 1 step from B0
B2 = B1 + states reachable in 1 step from B1
B3 = B2 + states reachable in 1 step from B2
…
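Concretely, this iteration can be sketched with explicit sets of states (a toy sketch: the slides represent Bk and the transition relation τ as BDDs, and the example system here is made up):

```python
# Sketch of the reachability fixed point with explicit sets.
# A real symbolic model checker would encode the state sets Bk and the
# transition relation tau as BDDs; here tau is a set of (state, successor)
# pairs and states are plain integers.

def reachable(initial, tau):
    b = set(initial)
    while True:
        step = {s2 for (s1, s2) in tau if s1 in b}   # one-step image of b
        nxt = b | step                               # Bk+1 = Bk + successors
        if nxt == b:                                 # fixed point reached
            return b
        b = nxt

# Hypothetical 4-state system: 0 -> 1 -> 2, and 3 only loops on itself.
tau = {(0, 1), (1, 2), (3, 3)}
print(reachable({0}, tau))                           # {0, 1, 2}
```

Each round adds the one-step successors of the current set, matching the sequence B0, B1, B2, … above.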
This sequence converges in finite time to R, the set of reachable states.

Iteration sequence
Bk+1(x1, …, xn) = ∃x′1 … x′n. Bk(x′1, …, x′n) ∧ τ(x′1, …, x′n; x1, …, xn)
This needs:
- variable renaming (easy);
- constructing the BDD for the transition relation τ;
- ∃-elimination for n variables;
- conjunction.

Tools
E.g. NuSMV, or NuXMV in BDD mode.

The SAT problem
Given a propositional formula, decide whether it is satisfiable (and give a model if so). Satisfiable = "it has a model" = "it has a satisfying assignment". Variants:
- "SAT": arbitrary formula;
- "CNF-SAT": formula in CNF;
- "3CNF-SAT": formula in CNF, 3 literals per clause.

Exercise
Show that one can convert SAT into 3CNF-SAT with time and output linear in the size of the input.

Tseitin encoding
Gregory S. Tseytin, On the complexity of derivation in propositional calculus. http://www.decision-procedures.org/handouts/Tseitin70.pdf
Can one transform a formula into another formula in CNF, of linear size, preserving solutions? Distributivity alone blows up:
(a ∨ b) ∧ (c ∨ d) → (a ∧ c) ∨ (a ∧ d) ∨ (b ∧ c) ∨ (b ∧ d)
The idea is to add extra variables. For
((a ∧ ¬b ∧ ¬c) ∨ (b ∧ c ∧ ¬d)) ∧ (¬b ∨ ¬c),
assign a propositional variable to each sub-formula:
e ≡ a ∧ ¬b ∧ ¬c
f ≡ b ∧ c ∧ ¬d
g ≡ e ∨ f
h ≡ ¬b ∨ ¬c
φ ≡ g ∧ h
These definitions are turned into clauses:
¬e ∨ a,  ¬e ∨ ¬b,  ¬e ∨ ¬c,  ¬a ∨ b ∨ c ∨ e
¬f ∨ b,  ¬f ∨ c,  ¬f ∨ ¬d,  ¬b ∨ ¬c ∨ d ∨ f
¬e ∨ g,  ¬f ∨ g,  ¬g ∨ e ∨ f
b ∨ h,  c ∨ h,  ¬h ∨ ¬b ∨ ¬c
¬φ ∨ g,  ¬φ ∨ h,  ¬g ∨ ¬h ∨ φ,  φ

DIMACS format
c start with comments
p cnf 5 3
1 -5 4 0
-1 5 3 4 0
-3 -4 0
Here 5 is the number of variables and 3 the number of clauses. Within a clause, -1 is variable 1 negated, 5 is variable 5, and 0 marks the end of the clause.

DPLL
Each clause acts as a propagator; e.g.
assuming a and ¬b, the clause ¬a ∨ b ∨ c yields c. This is Boolean constraint propagation, aka unit propagation: propagate as much as possible; once the value of a variable is known, use it everywhere.

DPLL: branching
If unit propagation is insufficient either to find a satisfying assignment or to find an unsatisfiable clause (all literals forced to false), then pick a variable and explore a search subtree for each polarity of that variable.

Example
Take the clauses from the Tseitin encoding above:
¬e ∨ a,  ¬e ∨ ¬b,  ¬e ∨ ¬c,  ¬a ∨ b ∨ c ∨ e
¬f ∨ b,  ¬f ∨ c,  ¬f ∨ ¬d,  ¬b ∨ ¬c ∨ d ∨ f
¬e ∨ g,  ¬f ∨ g,  ¬g ∨ e ∨ f
b ∨ h,  c ∨ h,  ¬h ∨ ¬b ∨ ¬c
¬φ ∨ g,  ¬φ ∨ h,  ¬g ∨ ¬h ∨ φ,  φ
From the unit clause φ: ¬φ ∨ g yields g, ¬φ ∨ h yields h, and ¬g ∨ ¬h ∨ φ is removed (satisfied). Now g and h are t: ¬e ∨ g, ¬f ∨ g, b ∨ h and c ∨ h are removed; ¬g ∨ e ∨ f yields e ∨ f; ¬h ∨ ¬b ∨ ¬c yields ¬b ∨ ¬c.

CDCL: clause learning
A DPLL branch gets closed by a contradiction: some literal gets forced to both t and f.
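Unit propagation plus branching, as described above, can be sketched over clause sets (an illustrative toy, nothing like an industrial CDCL solver); literals are nonzero integers as in the DIMACS format:

```python
# Toy DPLL sketch: unit propagation plus branching.
# Clauses use DIMACS-style literals: 3 means variable 3, -3 its negation.
# Illustrative code, not from the slides.

def unit_propagate(clauses, assignment):
    """Repeatedly assign the literals forced by unit clauses."""
    clauses = list(clauses)
    changed = True
    while changed:
        changed = False
        remaining = []
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue                      # clause already satisfied: drop it
            rest = [lit for lit in clause if -lit not in assignment]
            if not rest:
                return None, None             # conflict: every literal is false
            if len(rest) == 1:
                assignment.add(rest[0])       # unit clause forces its literal
                changed = True
            else:
                remaining.append(rest)
        clauses = remaining
    return clauses, assignment

def dpll(clauses, assignment=None):
    assignment = set() if assignment is None else set(assignment)
    clauses, assignment = unit_propagate(clauses, assignment)
    if clauses is None:
        return None                           # this branch is contradictory
    if not clauses:
        return assignment                     # all clauses satisfied: a model
    lit = clauses[0][0]                       # branch on some remaining variable
    for choice in (lit, -lit):                # try both polarities
        result = dpll(clauses, assignment | {choice})
        if result is not None:
            return result
    return None                               # both subtrees closed: UNSAT

# (a ∨ b) ∧ (¬a ∨ c) ∧ (¬c), with 1 = a, 2 = b, 3 = c
print(dpll([[1, 2], [-1, 3], [-3]]))
```

On the example, propagation from the unit clause ¬c forces ¬a and then b without any branching, mirroring the hand simulation above.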