Control-Flow Analysis of Functional Programs
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Catamorphic Approach to Program Analysis
Catamorphic Approach to Program Analysis Mizuhito OGAWA1,2, Zhenjiang HU1,2, Isao SASANO1, Masato TAKEICHI1 1 The University of Tokyo 2 PRESTO 21, Japan Science and Technology Corporation {mizuhito,hu,sasano,takeichi}@ipl.t.u-tokyo.ac.jp Abstract. This paper proposes a new framework for program analysis that regards them as maximum marking problems: mark the codes of a program under a certain condition such that the number of marked nodes is maximum. We show that if one can develop an efficient checking program (in terms of a finite catamorphism) whether marked programs are correctly marked, then an efficient and incremental marking algorithm to analyze control flow of structured programs (say in C or Java with limited numbers of gotos) is obtained for free. To describe catamorphisms on control flow graphs, we present an algebraic construction, called SP Term, for graphs with bounded tree width. The transformation to SP Terms is done efficiently in linear time. We demonstrate that our framework is powerful to describe various kinds of control flow analyses. Especially, for analyses in which some optimality is required (or expected), our approach has a big advantage. Keywords: Catamorphism, Control Flow Graphs, Control Flow Analysis, Functional Programming, Linear Time Algorithm, Program Optimization, Bounded Tree Width 1 Introduction While program transformation plays an important role in optimizing compiler and software engi- neering, implementing program transformation requires various kinds of program analyses to extract program fragments satisfying a certain property; these fragments will then be replaced by equivalent but more efficient constructs. As a simple example, consider the program in Fig. -
7. Control Flow First?
Copyright (C) R.A. van Engelen, FSU Department of Computer Science, 2000-2004 Ordering Program Execution: What is Done 7. Control Flow First? Overview Categories for specifying ordering in programming languages: Expressions 1. Sequencing: the execution of statements and evaluation of Evaluation order expressions is usually in the order in which they appear in a Assignments program text Structured and unstructured flow constructs 2. Selection (or alternation): a run-time condition determines the Goto's choice among two or more statements or expressions Sequencing 3. Iteration: a statement is repeated a number of times or until a Selection run-time condition is met Iteration and iterators 4. Procedural abstraction: subroutines encapsulate collections of Recursion statements and subroutine calls can be treated as single Nondeterminacy statements 5. Recursion: subroutines which call themselves directly or indirectly to solve a problem, where the problem is typically defined in terms of simpler versions of itself 6. Concurrency: two or more program fragments executed in parallel, either on separate processors or interleaved on a single processor Note: Study Chapter 6 of the textbook except Section 7. Nondeterminacy: the execution order among alternative 6.6.2. constructs is deliberately left unspecified, indicating that any alternative will lead to a correct result Expression Syntax Expression Evaluation Ordering: Precedence An expression consists of and Associativity An atomic object, e.g. number or variable The use of infix, prefix, and postfix notation leads to ambiguity An operator applied to a collection of operands (or as to what is an operand of what arguments) which are expressions Fortran example: a+b*c**d**e/f Common syntactic forms for operators: The choice among alternative evaluation orders depends on Function call notation, e.g. -
A Survey of Hardware-Based Control Flow Integrity (CFI)
A survey of Hardware-based Control Flow Integrity (CFI) RUAN DE CLERCQ∗ and INGRID VERBAUWHEDE, KU Leuven Control Flow Integrity (CFI) is a computer security technique that detects runtime attacks by monitoring a program’s branching behavior. This work presents a detailed analysis of the security policies enforced by 21 recent hardware-based CFI architectures. The goal is to evaluate the security, limitations, hardware cost, performance, and practicality of using these policies. We show that many architectures are not suitable for widespread adoption, since they have practical issues, such as relying on accurate control flow model (which is difficult to obtain) or they implement policies which provide only limited security. CCS Concepts: • Security and privacy → Hardware-based security protocols; Information flow control; • General and reference → Surveys and overviews; Additional Key Words and Phrases: control-flow integrity, control-flow hijacking, return oriented programming, shadow stack ACM Reference format: Ruan de Clercq and Ingrid Verbauwhede. YYYY. A survey of Hardware-based Control Flow Integrity (CFI). ACM Comput. Surv. V, N, Article A (January YYYY), 27 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn 1 INTRODUCTION Today, a lot of software is written in memory unsafe languages, such as C and C++, which introduces memory corruption bugs. This makes software vulnerable to attack, since attackers exploit these bugs to make the software misbehave. Modern Operating Systems (OSs) and microprocessors are equipped with security mechanisms to protect against some classes of attacks. However, these mechanisms cannot defend against all attack classes. In particular, Code Reuse Attacks (CRAs), which re-uses pre-existing software for malicious purposes, is an important threat that is difficult to protect against. -
Control-Flow Analysis of Functional Programs
Control-flow analysis of functional programs JAN MIDTGAARD Department of Computer Science, Aarhus University We present a survey of control-flow analysis of functional programs, which has been the subject of extensive investigation throughout the past 30 years. Analyses of the control flow of functional programs have been formulated in multiple settings and have led to many different approximations, starting with the seminal works of Jones, Shivers, and Sestoft. In this paper, we survey control-flow analysis of functional programs by structuring the multitude of formulations and approximations and comparing them. Categories and Subject Descriptors: D.3.2 [Programming Languages]: Language Classifica- tions—Applicative languages; F.3.1 [Logics and Meanings of Programs]: Specifying and Ver- ifying and Reasoning about Programs General Terms: Languages, Theory, Verification Additional Key Words and Phrases: Control-flow analysis, higher-order functions 1. INTRODUCTION Since the introduction of high-level languages and compilers, much work has been devoted to approximating, at compile time, which values the variables of a given program may denote at run time. The problem has been named data-flow analysis or just flow analysis. In a language without higher-order functions, the operator of a function call is apparent from the text of the program: it is a lexically visible identifier and therefore the called function is available at compile time. One can thus base an analysis for such a language on the textual structure of the program, since it determines the exact control flow of the program, e.g., as a flow chart. On the other hand, in a language with higher-order functions, the operator of a function call may not be apparent from the text of the program: it can be the result of a computation and therefore the called function may not be available until run time. -
Incrementalized Pointer and Escape Analysis Frédéric Vivien, Martin Rinard
Incrementalized Pointer and Escape Analysis Frédéric Vivien, Martin Rinard To cite this version: Frédéric Vivien, Martin Rinard. Incrementalized Pointer and Escape Analysis. PLDI ’01 Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation, Jun 2001, Snowbird, United States. pp.35–46, 10.1145/381694.378804. hal-00808284 HAL Id: hal-00808284 https://hal.inria.fr/hal-00808284 Submitted on 14 Oct 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Incrementalized Pointer and Escape Analysis∗ Fred´ eric´ Vivien Martin Rinard ICPS/LSIIT Laboratory for Computer Science Universite´ Louis Pasteur Massachusetts Institute of Technology Strasbourg, France Cambridge, MA 02139 [email protected]strasbg.fr [email protected] ABSTRACT to those parts of the program that offer the most attrac- We present a new pointer and escape analysis. Instead of tive return (in terms of optimization opportunities) on the analyzing the whole program, the algorithm incrementally invested resources. Our experimental results indicate that analyzes only those parts of the program that may deliver this approach usually delivers almost all of the benefit of the useful results. An analysis policy monitors the analysis re- whole-program analysis, but at a fraction of the cost. -
A Nominal Theory of Objects with Dependent Types
A Nominal Theory of Objects with Dependent Types Martin Odersky, Vincent Cremet, Christine R¨ockl, Matthias Zenger Ecole´ Polytechnique F´ed´eralede Lausanne INR Ecublens 1015 Lausanne, Switzerland Technical Report IC/2002/070 Abstract Simula 67 [DMN70], whereas virtual or abstract types are present in BETA [MMPN93], as well as more recently in We design and study νObj, a calculus and dependent type gbeta [Ern99], Rune [Tor02] and Scala [Ode02]. An es- system for objects and classes which can have types as mem- sential ingredient of these systems are objects with type bers. Type members can be aliases, abstract types, or new members. There is currently much work that explores types. The type system can model the essential concepts the uses of this concept in object-oriented programming of Java’s inner classes as well as virtual types and family [SB98, TT99, Ern01, Ost02]. But its type theoretic foun- polymorphism found in BETA or gbeta. It can also model dations are just beginning to be investigated. most concepts of SML-style module systems, including shar- As is the case for modules, dependent types are a promis- ing constraints and higher-order functors, but excluding ap- ing candidate for a foundation of objects with type mem- plicative functors. The type system can thus be used as a bers. Dependent products can be used to represent functors basis for unifying concepts that so far existed in parallel in in SML module systems as well as classes in object sys- advanced object systems and in module systems. The paper tems with virtual types [IP02]. -
Generalized Escape Analysis and Lightweight Continuations
Generalized Escape Analysis and Lightweight Continuations Abstract Might and Shivers have published two distinct analyses for rea- ∆CFA and ΓCFA are two analysis techniques for higher-order soning about programs written in higher-order languages that pro- functional languages (Might and Shivers 2006b,a). The former uses vide first-class continuations, ∆CFA and ΓCFA (Might and Shivers an enrichment of Harrison’s procedure strings (Harrison 1989) to 2006a,b). ∆CFA is an abstract interpretation based on a variation reason about control-, environment- and data-flow in a program of procedure strings as developed by Sharir and Pnueli (Sharir and represented in CPS form; the latter tracks the degree of approxi- Pnueli 1981), and, most proximately, Harrison (Harrison 1989). mation inflicted by a flow analysis on environment structure, per- Might and Shivers have applied ∆CFA to problems that require mitting the analysis to exploit extra precision implicit in the flow reasoning about the environment structure relating two code points lattice, which improves not only the precision of the analysis, but in a program. ΓCFA is a general tool for “sharpening” the precision often its run-time as well. of an abstract interpretation by tracking the amount of abstraction In this paper, we integrate these two mechanisms, showing how the analysis has introduced for the specific abstract interpretation ΓCFA’s abstract counting and collection yields extra precision in being performed. This permits the analysis to opportunistically ex- ∆CFA’s frame-string model, exploiting the group-theoretic struc- ploit cases where the abstraction has introduced no true approxi- ture of frame strings. mation. -
Escape from Escape Analysis of Golang
Escape from Escape Analysis of Golang Cong Wang Mingrui Zhang Yu Jiang∗ KLISS, BNRist, School of Software, KLISS, BNRist, School of Software, KLISS, BNRist, School of Software, Tsinghua University Tsinghua University Tsinghua University Beijing, China Beijing, China Beijing, China Huafeng Zhang Zhenchang Xing Ming Gu Compiler and Programming College of Engineering and Computer KLISS, BNRist, School of Software, Language Lab, Huawei Technologies Science, ANU Tsinghua University Hangzhou, China Canberra, Australia Beijing, China ABSTRACT KEYWORDS Escape analysis is widely used to determine the scope of variables, escape analysis, memory optimization, code generation, go pro- and is an effective way to optimize memory usage. However, the gramming language escape analysis algorithm can hardly reach 100% accurate, mistakes ACM Reference Format: of which can lead to a waste of heap memory. It is challenging to Cong Wang, Mingrui Zhang, Yu Jiang, Huafeng Zhang, Zhenchang Xing, ensure the correctness of programs for memory optimization. and Ming Gu. 2020. Escape from Escape Analysis of Golang. In Software In this paper, we propose an escape analysis optimization ap- Engineering in Practice (ICSE-SEIP ’20), May 23–29, 2020, Seoul, Republic of proach for Go programming language (Golang), aiming to save Korea. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3377813. heap memory usage of programs. First, we compile the source code 3381368 to capture information of escaped variables. Then, we change the code so that some of these variables can bypass Golang’s escape 1 INTRODUCTION analysis mechanism, thereby saving heap memory usage and reduc- Memory management is very important for software engineering. -
Escape Analysis for Java
Escape Analysis for Java Jong-Deok Choi Manish Gupta Mauricio Serrano Vugranam C. Sreedhar Sam Midkiff IBM T. J. Watson Research Center P. O. Box 218, Yorktown Heights, NY 10598 jdchoi, mgupta, mserrano, vugranam, smidkiff ¡ @us.ibm.com ¢¤£¦¥¨§ © ¦ § § © § This paper presents a simple and efficient data flow algorithm Java continues to gain importance as a language for general- for escape analysis of objects in Java programs to determine purpose computing and for server applications. Performance (i) if an object can be allocated on the stack; (ii) if an object is an important issue in these application environments. In is accessed only by a single thread during its lifetime, so that Java, each object is allocated on the heap and can be deallo- synchronization operations on that object can be removed. We cated only by garbage collection. Each object has a lock asso- introduce a new program abstraction for escape analysis, the ciated with it, which is used to ensure mutual exclusion when connection graph, that is used to establish reachability rela- a synchronized method or statement is invoked on the object. tionships between objects and object references. We show that Both heap allocation and synchronization on locks incur per- the connection graph can be summarized for each method such formance overhead. In this paper, we present escape analy- that the same summary information may be used effectively in sis in the context of Java for determining whether an object (1) different calling contexts. We present an interprocedural al- may escape the method (i.e., is not local to the method) that gorithm that uses the above property to efficiently compute created the object, and (2) may escape the thread that created the connection graph and identify the non-escaping objects for the object (i.e., other threads may access the object). -
Making Collection Operations Optimal with Aggressive JIT Compilation
Making Collection Operations Optimal with Aggressive JIT Compilation Aleksandar Prokopec David Leopoldseder Oracle Labs Johannes Kepler University, Linz Switzerland Austria [email protected] [email protected] Gilles Duboscq Thomas Würthinger Oracle Labs Oracle Labs Switzerland Switzerland [email protected] [email protected] Abstract 1 Introduction Functional collection combinators are a neat and widely ac- Compilers for most high-level languages, such as Scala, trans- cepted data processing abstraction. However, their generic form the source program into an intermediate representa- nature results in high abstraction overheads – Scala collec- tion. This intermediate program representation is executed tions are known to be notoriously slow for typical tasks. We by the runtime system. Typically, the runtime contains a show that proper optimizations in a JIT compiler can widely just-in-time (JIT) compiler, which compiles the intermediate eliminate overheads imposed by these abstractions. Using representation into machine code. Although JIT compilers the open-source Graal JIT compiler, we achieve speedups of focus on optimizations, there are generally no guarantees up to 20× on collection workloads compared to the standard about the program patterns that result in fast machine code HotSpot C2 compiler. Consequently, a sufficiently aggressive – most optimizations emerge organically, as a response to JIT compiler allows the language compiler, such as Scalac, a particular programming paradigm in the high-level lan- to focus on other concerns. guage. Scala’s functional-style collection combinators are a In this paper, we show how optimizations, such as inlin- paradigm that the JVM runtime was unprepared for. ing, polymorphic inlining, and partial escape analysis, are Historically, Scala collections [Odersky and Moors 2009; combined in Graal to produce collections code that is optimal Prokopec et al. -
Obfuscating C++ Programs Via Control Flow Flattening
Annales Univ. Sci. Budapest., Sect. Comp. 30 (2009) 3-19 OBFUSCATING C++ PROGRAMS VIA CONTROL FLOW FLATTENING T. L¶aszl¶oand A.¶ Kiss (Szeged, Hungary) Abstract. Protecting a software from unauthorized access is an ever de- manding task. Thus, in this paper, we focus on the protection of source code by means of obfuscation and discuss the adaptation of a control flow transformation technique called control flow flattening to the C++ lan- guage. In addition to the problems of adaptation and the solutions pro- posed for them, a formal algorithm of the technique is given as well. A prototype implementation of the algorithm presents that the complexity of a program can show an increase as high as 5-fold due to the obfuscation. 1. Introduction Protecting a software from unauthorized access is an ever demanding task. Unfortunately, it is impossible to guarantee complete safety, since with enough time given, there is no unbreakable code. Thus, the goal is usually to make the job of the attacker as di±cult as possible. Systems can be protected at several levels, e.g., hardware, operating system or source code. In this paper, we focus on the protection of source code by means of obfuscation. Several code obfuscation techniques exist. Their common feature is that they change programs to make their comprehension di±cult, while keep- ing their original behaviour. The simplest technique is layout transformation [1], which scrambles identi¯ers in the code, removes comments and debug informa- tion. Another technique is data obfuscation [2], which changes data structures, 4 T. L¶aszl¶oand A.¶ Kiss e.g., by changing variable visibilities or by reordering and restructuring arrays. -
The Power of Interoperability: Why Objects Are Inevitable
The Power of Interoperability: Why Objects Are Inevitable Jonathan Aldrich Carnegie Mellon University Pittsburgh, PA, USA [email protected] Abstract 1. Introduction Three years ago in this venue, Cook argued that in Object-oriented programming has been highly suc- their essence, objects are what Reynolds called proce- cessful in practice, and has arguably become the dom- dural data structures. His observation raises a natural inant programming paradigm for writing applications question: if procedural data structures are the essence software in industry. This success can be documented of objects, has this contributed to the empirical success in many ways. For example, of the top ten program- of objects, and if so, how? ming languages at the LangPop.com index, six are pri- This essay attempts to answer that question. After marily object-oriented, and an additional two (PHP reviewing Cook’s definition, I propose the term ser- and Perl) have object-oriented features.1 The equiva- vice abstractions to capture the essential nature of ob- lent numbers for the top ten languages in the TIOBE in- jects. This terminology emphasizes, following Kay, that dex are six and three.2 SourceForge’s most popular lan- objects are not primarily about representing and ma- guages are Java and C++;3 GitHub’s are JavaScript and nipulating data, but are more about providing ser- Ruby.4 Furthermore, objects’ influence is not limited vices in support of higher-level goals. Using examples to object-oriented languages; Cook [8] argues that Mi- taken from object-oriented frameworks, I illustrate the crosoft’s Component Object Model (COM), which has unique design leverage that service abstractions pro- a C language interface, is “one of the most pure object- vide: the ability to define abstractions that can be ex- oriented programming models yet defined.” Academ- tended, and whose extensions are interoperable in a ically, object-oriented programming is a primary focus first-class way.