Automated Black Box Generation of Structured Inputs for Use in Software Testing

University of California
Santa Barbara

Automated Black Box Generation of Structured Inputs for Use in Software Testing

A dissertation submitted in partial satisfaction of the requirements for the degree
Doctor of Philosophy in Computer Science
by
Kyle Thomas Dewey

Committee in charge:
Professor Ben Hardekopf, Chair
Professor Chandra Krintz
Professor Tevfik Bultan

September 2017

The Dissertation of Kyle Thomas Dewey is approved.
Professor Chandra Krintz
Professor Tevfik Bultan
Professor Ben Hardekopf, Committee Chair
June 2017

Automated Black Box Generation of Structured Inputs for Use in Software Testing
Copyright © 2017 by Kyle Thomas Dewey

Dedicated to Melanie, whose love and support has exceeded far beyond anything I could imagine.

Acknowledgements

In many ways, getting to this point was a team effort, and it would not have been possible if not for the vast amount of help I received along the way. Here I point out specific people who have been instrumental in this process, for whom I am immensely thankful.

• Ben Hardekopf: For being an all-around excellent adviser and teaching me how to do research in the first place. Your patience speaks tomes.
• Tevfik Bultan: For first exposing me to research in software engineering, and plenty of related critical guidance.
• Chandra Krintz: For plenty of crucial guidance early on, and convincing me being here was a good idea.
• Phill Conrad: For providing tons of mentoring both for teaching and for CS education research.
• Ben Wiedermann: For teaching-related mentoring, along with giving CLP-based testing its first practical application.
• Vineeth Kashyap: For being my first line of defense against having no idea what was going on in my first two years.
• Madhukar Kedlaya: For being my first line of defense against having no idea what was going on in my third year.
• Lawton Nichols: For first convincing me that I might actually have an idea of what’s going on.
• Niko Matsakis, along with the entire Rust development team: For answering barrage after barrage of questions about Rust. We could not have fuzzed Rust without this help.
• Robert Rothman: For being brutally honest with me during my undergrad, and letting me know my place was in Computer Science.
• Michael Christensen, Mehmet Emre, Miroslav (Mika) Gavrilov: For being wonderful fellow Ph.D. compatriots. You have made this a fun-filled experience. Case in point: Enter Sandman, Sad but True, Holier Than Thou, The Unforgiven, Wherever I May Roam, Don’t Tread on Me, Through the Never, Nothing Else Matters, Of Wolf and Man, The God That Failed, My Friend of Misery, The Struggle Within
• Jared Roesch, Elena Morozova, Dylan Lynch, Ethan Kuefner, Berkeley Churchill, Davina Zamanzadeh, Dianne Wagner, Ben Campbell: The undergraduate students whom I worked closely with for multiple quarters, if not years. Beyond your research contributions, you all collectively helped me learn how to advise students.
• The PL Lab: All members not already mentioned, past and present.
• My Parents: For remaining sane and supportive when their only child suddenly decided to move to California.

I apologize if I missed any names; it is not intentional. My memory resembles that of a goldfish.

Curriculum Vitæ
Kyle Thomas Dewey

Education
2011 - 2017  Ph.D. in Computer Science, University of California, Santa Barbara.
2007 - 2011  M.S. in Bioinformatics, Rochester Institute of Technology, Rochester, NY.
2007 - 2011  B.S. in Bioinformatics, Rochester Institute of Technology, Rochester, NY.
Courses Taught at the University of California, Santa Barbara
Spring 2017  CS 162: Programming Languages
Winter 2017  CS 162: Programming Languages
Summer 2016  CS 56: Advanced Applications Programming (co-instructor)
Winter 2016  CS 64: Computer Organization and Design Logic
Fall 2015  CS 64: Computer Organization and Design Logic
Winter 2015  CS 162: Programming Languages
Summer 2014  CS 24: Problem Solving with Computers II
Summer 2012  CS 16: Problem Solving with Computers I

Courses TA’d at the University of California, Santa Barbara
Spring 2014  CS 162: Programming Languages
Winter 2014  CS 162: Programming Languages
Spring 2013  CS 162: Programming Languages
Winter 2013  CS 162: Programming Languages
Spring 2012  CS 189B: Capstone Project, Part B
Winter 2012  CS 189A: Capstone Project, Part A
Fall 2011  CS 170: Operating Systems

Awards and Professional Service
June 2016  Student Volunteer for PLDI’16
Winter 2014  Outstanding Teaching Assistant for CS 162: Programming Languages

Peer-Reviewed Publications
• Evaluating Test Suite Effectiveness and Assessing Student Code via Constraint Logic Programming. Kyle Dewey, Phill Conrad, Michelle Craig, Elena Morozova. Conference on Innovation and Technology in Computer Science Education (ITiCSE), 2017
• Fuzzing the Rust Typechecker Using CLP. Kyle Dewey, Jared Roesch, Ben Hardekopf. Conference on Automated Software Engineering (ASE), 2015
• Automated Data Structure Generation: Refuting Common Wisdom. Kyle Dewey, Lawton Nichols, Ben Hardekopf. International Conference on Software Engineering (ICSE), 2015
• A Parallel Abstract Interpreter for JavaScript. Kyle Dewey, Vineeth Kashyap, Ben Hardekopf. Symposium on Code Generation and Optimization (CGO), 2015
• Language Fuzzing Using Constraint Logic Programming. Kyle Dewey, Jared Roesch, Ben Hardekopf. Conference on Automated Software Engineering (ASE), 2014
• JSAI: A Static Analysis Platform for JavaScript. Vineeth Kashyap, Kyle Dewey, Ethan A. Kuefner, John Wagner, Kevin Gibbons, John Sarracino, Ben Wiedermann, Ben Hardekopf. Symposium on Foundations of Software Engineering (FSE), 2014

Abstract

Automated Black Box Generation of Structured Inputs for Use in Software Testing
by
Kyle Thomas Dewey

A common problem in automated software testing is the need to generate many inputs with complex structure in a black-box fashion. For example, a library for manipulating red-black trees may require that inputs are themselves valid red-black trees, meaning anything invalid is not suitable for testing. As another example, in order to test code generation in a compiler, it is necessary to use input programs which are both syntactically valid and well-typed. Despite the importance of this problem, we observe that existing solutions are few in number and have severe drawbacks, including unreasonably slow performance and a lack of generality to testing different systems.

This thesis presents a solution to this problem of black-box structured input generation. I observe that test inputs can be described as solutions to systems of logical constraints, and that more expressive constraints can lead to more complex tests. In order to test effectively and generate many tests, we need high-performance constraint solvers capable of finding many solutions to these constraints. I observe that constraint logic programming (CLP) offers an expressive constraint language paired with a high-performance constraint solver, and thus serves as a potential solution to this problem. Via a series of case studies, I have found that CLP (1) is applicable to testing a wide variety of systems; (2) can scale to more complex constraints than ever previously described; and (3) is often orders of magnitude faster than competing solutions. These case studies have also exposed dozens of bugs in high-profile software, including the Rust compiler and the Z3 SMT solver.
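To make the abstract's central observation concrete, the following is a minimal sketch of treating test inputs as solutions to constraints, using the red-black tree example. It is written in plain Python rather than CLP and is not the dissertation's implementation; the tuple representation and the names rb_trees and check are assumptions made purely for illustration. The generator builds the red-black invariants into the search, so every tree it yields is valid by construction, and an independent checker confirms this.

```python
from typing import Iterator, Optional, Tuple

# Hypothetical representation: None is a (black) leaf,
# otherwise a node is the tuple (color, left, key, right).
Tree = Optional[Tuple[str, "Tree", int, "Tree"]]


def rb_trees(lo: int, hi: int, black_height: int,
             allow_red: bool = True) -> Iterator[Tree]:
    """Yield every red-black subtree with integer keys in lo..hi (inclusive)
    whose root-to-leaf paths each contain exactly black_height black nodes.
    allow_red is False when the parent is red (or when building the root)."""
    if black_height == 0:
        yield None  # the empty subtree satisfies every constraint
    for key in range(lo, hi + 1):
        if black_height > 0:
            # Black node: children have one less black height, any color.
            for left in rb_trees(lo, key - 1, black_height - 1, True):
                for right in rb_trees(key + 1, hi, black_height - 1, True):
                    yield ("black", left, key, right)
        if allow_red:
            # Red node: same black height, but children must not be red.
            for left in rb_trees(lo, key - 1, black_height, False):
                for right in rb_trees(key + 1, hi, black_height, False):
                    yield ("red", left, key, right)


def check(tree: Tree, lo: int, hi: int,
          parent_red: bool = False) -> Optional[int]:
    """Independent validity check: return the black height of a valid
    subtree, or None if the ordering, color, or balance constraint fails."""
    if tree is None:
        return 0
    color, left, key, right = tree
    if not lo <= key <= hi:
        return None  # search-tree ordering violated
    if color == "red" and parent_red:
        return None  # a red node may not have a red child
    is_red = color == "red"
    left_height = check(left, lo, key - 1, is_red)
    right_height = check(right, key + 1, hi, is_red)
    if left_height is None or left_height != right_height:
        return None  # black heights must agree on both sides
    return left_height + (0 if is_red else 1)


if __name__ == "__main__":
    # Root must be black, so red is disallowed at the top level.
    trees = list(rb_trees(1, 3, 1, allow_red=False))
    print(len(trees), all(check(t, 1, 3) == 1 for t in trees))
```

With keys drawn from 1..3 and black height 1, this enumeration yields ten distinct trees, each of which could serve as a test input for a red-black tree library. A CLP formulation would state the same invariants declaratively and leave the search to the solver, which is the approach the abstract argues for.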
Contents

Curriculum Vitae
Abstract
List of Figures
List of Tables

1 Introduction and Motivation
  1.1 What is Black-Box Fuzzing?
  1.2 Why (Not) Black-Box Fuzzing?
  1.3 Related Work
  1.4 Problem with Black-Box Fuzzing: Highly Structured Input Generation
  1.5 Key Insights
  1.6 Potential Solutions
  1.7 Overarching Thesis

2 Case Study: Generating Interesting JavaScript Programs
  2.1 Introduction
  2.2 CLP for Program Generation
  2.3 Generating JavaScript
  2.4 Evaluation
  2.5 Conclusions

3 Case Study: Generating Complex Data Structures
  3.1 Introduction
  3.2 Example
  3.3 CLP Compared to Other Data Structure Generators
  3.4 Data Structures and Properties
  3.5 Evaluation
  3.6 Conclusions

4 Case Study: Type-Based Fuzzing of the Rust Compiler
  4.1 Introduction
  4.2 Generating Well-Typed Programs
  4.3 Finding Typechecker Bugs
  4.4 Testing the Rust Typechecker
  4.5 Evaluation
  4.6 Conclusions

5 Case Study: Semantics-Based Fuzzing of SMT Solvers
  5.1 Introduction
  5.2 Generating Satisfiable Formulas
  5.3 Generating Unsatisfiable Formulas
  5.4 Application to Bitvectors and Floating Point
  5.5 Evaluation
  5.6 Discussion
  5.7 Conclusions

6 Case Study: Intelligent Fuzzing of Student Tokenizers and Parsers
  6.1 Introduction
  6.2 Testing Tokenizers, Parsers, and Arithmetic Evaluators With CLP
  6.3 Student Programming Assignment
  6.4 Evaluation
  6.5 Conclusions

7 Case Study: Generating Polymorphic Programs for Testing Student Typecheckers
  7.1 Introduction
  7.2 SimpleScala Language
  7.3 A Naive CLP-Based Generator for Well-Typed SimpleScala Programs
  7.4 Optimizing the Naive CLP-Based Generator for Well-Typed SimpleScala Programs
  7.5 Results
  7.6 Conclusion

8 Improving CLP for Testing: Typed-Prolog
  8.1 Introduction and Motivation
  8.2 Related Work
  8.3 Problems with CLP for Test Case Generation
  8.4 Type System
  8.5 Compiling Higher-Order Relations
  8.6 Module System
  8.7 Results and Discussion
  8.8 Conclusions

9 Improving CLP for Testing: Bounding and Search-Oriented Metainterpreter
  9.1 Introduction and Motivation
  9.2 Related Work
  9.3 Background on CLP Metainterpreters
  9.4 A Metainterpreter for CLP that Parameterizes Search and Bounding
  9.5 Composing Search and Bounding Strategies
  9.6 Results and Discussion
  9.7 Conclusions and Future Work

10 Conclusions and Future Work

A CLP Preliminaries
  A.1 CLP Background