Generation of Uniformly-Random Graphs


Department of Computer Science

Submitted in part fulfilment for the degree of BSc.

Generation of Uniformly-Random Graphs

Mark Henrick

29th of April 2019

Supervisor: Detlef Plump

Contents

Executive Summary
  0.1 Ethics
1 Introduction
2 Literature Review
  2.1 Preliminaries and Notation
  2.2 Generation of Strings u.a.r.
  2.3 Mairson’s Methods
    2.3.1 A Space-Time Tradeoff
    2.3.2 Example: Balanced Brackets
  2.4 Ambiguous String Grammars
  2.5 Hypergraphs and Hyperedge-Replacement Grammars
    2.5.1 Informal Overview
    2.5.2 Formal Overview
3 Adaptation of String Algorithms to HRGs
  3.1 Prior Work
  3.2 Substrings and Concatenation for Graphs
  3.3 Length and Size
  3.4 Normal Form
    3.4.1 The Modified Mairson Algorithm
  3.5 Embedding of String Grammars in HRGs
4 Implementation
  4.1 Language Choice
  4.2 User Interface
    4.2.1 Deduplication
5 Evaluation
  5.1 Grammars
    5.1.1 The Palindrome Grammar
    5.1.2 The a∗|bbb Grammars
    5.1.3 The “AB” Grammar
  5.2 Correctness
    5.2.1 Ambiguity
  5.3 Performance
    5.3.1 Methodology
    5.3.2 Results
6 Conclusion
  6.1 Opportunities for Further Work

Executive Summary

This project aims to produce a program which accepts as input an unambiguous context-free hyperedge replacement grammar and produces random hypergraphs of a given size¹. The important property is that these hypergraphs should be generated uniformly at random (u.a.r.), meaning that every hypergraph of the specified size that the grammar produces should be generated with equal probability. A primary use case of this software is to produce random inputs for testing graph algorithms; therefore, the generator should write its results in a computer-readable format suitable for use in other programs.
The only similar programs found in the literature review are past BSc projects, which use slightly different algorithms or software platforms from this project. In chapter 1 further background on the state of the art is given. Chapter 2 explores the existing string algorithms and explains hypergraphs and hyperedge replacement grammars (HRGs). Chapter 3 details the adaptation of a string algorithm by Harry Mairson [1] to HRGs. While the original algorithm required the input to be in Chomsky normal form, this restriction is partially relaxed in the process of adapting the algorithm to account for the properties of hypergraphs. In chapter 4 the implementation is covered, including software platform choice and a discussion of how to remove many duplicate graphs from the output in an efficient manner. The algorithm is implemented in Java [2], and results are rendered in JSON [3], a widely-supported data interchange format. The evaluation of the program is detailed in chapter 5. The program produces graphs with the expected distribution, but unfortunately is found to be generally slower than a program produced by Jake Coxon [4], while performing faster than one produced by Carla Lawrence [5]. The report is concluded in chapter 6, which details some areas for further work.

¹ The size of a hypergraph is the sum of the numbers of hyperedges and nodes.

0.1 Ethics

As this project is purely adapting and implementing a rather abstract mathematical algorithm, there are no direct ethical implications. As usual, academic integrity must be maintained, and is of heightened importance due to the existence of similar student projects in the area.

1 Introduction

Graphs are one of the most ubiquitous and versatile data structures in computer science and discrete mathematics.
As they can be used to model a large number of problems, there is considerable interest in the manipulation of graphs, resulting in programming languages designed specifically for that purpose, such as GP2 [6]. A concern for all software development is testing, which can take the form of formal verification, hand-written assertions, or generating random test cases that are checked for certain invariants. The aim of this project is to develop a method which can be used to generate graphs derived from a specific grammar uniformly at random (u.a.r.). These outputs can then be used as random inputs to a graph algorithm, allowing semi-automatic testing. Random graph generators do exist, such as Stanford GraphBase [7]; however, these are considerably less powerful than what is needed for this project. The existing generators primarily generate “ordinary” graphs (whereas we will be generating hypergraphs) and rarely give much control over the “shape” of the graph. We will be generating graphs from hyperedge-replacement grammars, which allow powerful specification of graph languages. Prior work in the area of uniform generation from grammars has primarily focused on strings, for which it is mostly a solved problem, as detailed in section 2.2; however, work extending this to graphs is limited, as detailed in section 3.1. This project aims to add to this prior work with a new hypergraph generator for the Java platform, using a variant of an existing string algorithm by Mairson (detailed in section 2.3), which is faster than a similar program written by Lawrence (see again section 3.1).

2 Literature Review

2.1 Preliminaries and Notation

This report presumes rudimentary knowledge of context-free string grammars. We will define a context-free grammar (CFG) for strings as $G = (N, \Sigma, P, S)$ where $N$ is a set of nonterminals (variables), $\Sigma$ is the terminal alphabet (disjoint from $N$), $P \subseteq N \times (N \cup \Sigma)^*$ is the set of productions (rules) and $S \in N$ is the start symbol.
$L$ is the language generated by the grammar and we write $L_\ell$ for the sublanguage restricted to a specific string length, $L_\ell = L \cap \Sigma^\ell$. Note that while $L$ may be (countably) infinite, $L_\ell$ is finite with cardinality at most $|\Sigma|^\ell$. We will write terminals as lowercase and nonterminals as capitals. $\Rightarrow$ denotes direct derivation and $\Rightarrow^*$ means derivation by any number of steps. $\varepsilon$ denotes the empty string, and $\ell$ will be used throughout to denote the length of the string, or size of the hypergraph, which we wish to generate.

2.2 Generation of Strings u.a.r.

The problem of generating strings uniformly at random (u.a.r.) from a context-free grammar (CFG) has received substantial attention. We formally specify the problem as follows: given inputs of a CFG $G$ and a length $\ell > 0$, describe an algorithm to select a string from $L_\ell$ with probability $1/|L_\ell|$. Methods based on choosing available productions u.a.r. will not work, as strings with shorter derivations will be more likely to be generated. Intuitively one can think of a total language tree: this approach would only work if it were perfectly balanced. Hickey and Cohen [8] present two algorithms for unambiguous grammars. This work is improved upon by Mairson [1], who presents two algorithms that give a tradeoff between linear time via use of a quadratic-size data structure versus quadratic time and linear space.

2.3 Mairson’s Methods

Mairson presumes an unambiguous grammar $G$ that is in Chomsky normal form (CNF). This means any production is of the form $A \to BC$ (which I will call “binary productions”), $A \to x$ (which I will call “terminal productions”), or $S \to \varepsilon$ (which I will call “the empty production”). There is a well-known terminating algorithm to convert any CFG to this form. Generation is considered with regard to a certain starting symbol, which may not be the “global” $S$ of the grammar; we will refer to it as $I$ for “initial”. If $\ell < 2$, we simply choose a random production $I \to x$ where $|x| = \ell$ and return $x$ (in practice the only possibilities for $\ell = 0$ are $S \to \varepsilon$ or failure). This leaves the case of $\ell > 1$.

Mairson defines the “potential”¹ of a symbol for a given length, denoted as $\|A\|_\ell$, as the number of strings of length $\ell$ that can be derived from $A$ in any number of steps. This can be defined as the number of strings that can be directly derived from $A$ (which will all be of length 1), plus the potential of each production with an LHS of $A$. The production potential $\|A \to BC\|_\ell$ is the number of strings of length $\ell$ with a derivation starting with $A \to BC$. To generate such a string we have a choice of where to split the length between the string generated from $B$ and that from $C$, as long as the lengths are positive and sum to $\ell$. In other words,

$$\|A \to BC\|_\ell = \sum_{0<k<\ell} \|B\|_k \cdot \|C\|_{\ell-k}.$$

Now we choose a production $I \to BC$ with probability

$$\frac{\|I \to BC\|_\ell}{\|I\|_\ell}$$

and the split $0 < k < \ell$ with probability

$$\frac{\|B\|_k \cdot \|C\|_{\ell-k}}{\|I \to BC\|_\ell},$$

then recurse on $(B, k)$ and $(C, \ell - k)$, and return the concatenated result. The algorithm fails exactly when $\|I\|_\ell = 0$. If this check passes, every selection and recursive call that the algorithm makes is guaranteed to succeed. The potentials can be calculated efficiently using dynamic programming (algorithm 1).

¹ Terminology mine; Mairson does not give it a name.

    Initialise all cells for all nonterminals to 0
    foreach terminal production $A \to x \in P$ do
        $A[1] \leftarrow A[1] + 1$
    for $i \leftarrow 2$ to $\ell$ do
        foreach binary production $A \to BC \in P$ do
            $A[i] \leftarrow A[i] + \sum_{0<k<i} B[k] \cdot C[i-k]$
    $\|A\|_\ell$ can now be found at $A[\ell]$

Algorithm 1: Mairson’s algorithm for calculation of potentials

2.3.1 A Space-Time Tradeoff

The preprocessing algorithm that was just discussed produces a data structure of size $O(\ell)$ for a constant grammar; however, the generation of a word with use of this data structure has a time complexity quadratic in the length of the string. Mairson also offers a method to produce an auxiliary data structure with size quadratic in $\ell$ which can be later used for linear-time string generation.
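
The potential table and the selection step described in section 2.3 can be illustrated with a minimal Java sketch. It is not the project's implementation. The class name `MairsonSketch`, the string encoding of productions, and the example grammar (an unambiguous CNF grammar for non-empty balanced brackets) are all assumptions made for illustration. The sketch draws one index uniformly from $[0, \|I\|_\ell)$ and walks the (production, split) pairs by weight, which should be equivalent to choosing the production first and the split second with the probabilities given above. Counts are held in `long`, so a real implementation would likely need `BigInteger` for larger lengths.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

/** Sketch of Mairson-style uniform string generation; the grammar below is an assumption. */
class MairsonSketch {
    // Hypothetical CNF grammar for non-empty balanced brackets (not taken from the report):
    // S -> LR | LA | PS | QS,  A -> SR,  P -> LR,  Q -> LA,  L -> '(',  R -> ')'
    // Each binary production "XYZ" encodes X -> YZ; each terminal production "Xc" encodes X -> c.
    static final String[] BINARY = {"SLR", "SLA", "SPS", "SQS", "ASR", "PLR", "QLA"};
    static final String[] TERMINAL = {"L(", "R)"};

    /** Algorithm 1: result.get(A)[i] is the number of strings of length i derivable from A. */
    static Map<Character, long[]> potentials(int length) {
        Map<Character, long[]> table = new HashMap<>();
        int size = Math.max(length, 1) + 1;
        // Initialise all cells for all nonterminals to 0.
        for (String p : BINARY) {
            for (char symbol : p.toCharArray()) {
                table.putIfAbsent(symbol, new long[size]);
            }
        }
        // Each terminal production contributes one string of length 1.
        for (String p : TERMINAL) {
            table.putIfAbsent(p.charAt(0), new long[size]);
            table.get(p.charAt(0))[1] += 1;
        }
        // Longer lengths: sum over every split point of every binary production.
        for (int i = 2; i <= length; i++) {
            for (String p : BINARY) {
                long[] lhs = table.get(p.charAt(0));
                long[] left = table.get(p.charAt(1));
                long[] right = table.get(p.charAt(2));
                for (int k = 1; k < i; k++) {
                    lhs[i] += left[k] * right[i - k];
                }
            }
        }
        return table;
    }

    /** Returns a string of the given length derived from symbol, chosen uniformly at random. */
    static String sample(char symbol, int length, Map<Character, long[]> table) {
        long total = table.get(symbol)[length];
        if (total == 0) {
            throw new IllegalArgumentException("no string of length " + length + " from " + symbol);
        }
        // r indexes one derivation among the `total` equally likely ones.
        long r = ThreadLocalRandom.current().nextLong(total);
        if (length == 1) {
            for (String p : TERMINAL) {
                if (p.charAt(0) == symbol && r-- == 0) {
                    return p.substring(1);
                }
            }
        }
        for (String p : BINARY) {
            if (p.charAt(0) != symbol) {
                continue;
            }
            for (int k = 1; k < length; k++) {
                long weight = table.get(p.charAt(1))[k] * table.get(p.charAt(2))[length - k];
                if (r < weight) {
                    return sample(p.charAt(1), k, table) + sample(p.charAt(2), length - k, table);
                }
                r -= weight;
            }
        }
        throw new IllegalStateException("unreachable when the table matches the grammar");
    }

    public static void main(String[] args) {
        Map<Character, long[]> table = potentials(6);
        System.out.println(table.get('S')[6] + " strings, e.g. " + sample('S', 6, table));
    }
}
```

For this example grammar the potentials of `S` at even lengths are the Catalan numbers, which gives a simple sanity check on the table. Each call to `sample` scans all splits of all productions for the current symbol, which corresponds to the quadratic-time, linear-space side of the tradeoff discussed in section 2.3.1.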