The Semantics of Scheme Control-Flow Analysis

The Semantics of Scheme Control-Flow Analysis Olin Shivers School of Computer Science Carnegie Mellon Pittsburgh, Pennsylvania 15213 OlinShiverscscmuedu A bstract Nonstandard abstract semantic interpretations Non-standard abstract semantic interpretation is an elegant method This is a follow-on to my1988PLDIpaper, ªControl-Flow Analysis for formally describing program analyses. Suppose we have a inSchemeº[9]. Iusethe methodofabstractsemanticinterpretations S programming language L with a denotational semantics , and we to explicate the control-¯ow analysis technique presented in that wish to determine some property X at compile time. Our ®rst step is paper. S L to developan alternate semantics X for that precisely expresses I begin with a denotational semantics for CPS Scheme. I then S property X . That is, whereas semantics might say the meaning present an alternate semantics that precisely expresses the control- of a program is a function ªcomputingº the program's result value ¯ow analysis problem. I abstract this semantics in a natural way, S given its inputs, semantics X would say the meaning ofa program arriving at two different semantic interpretations giving approxi- is a function ªcomputingº the property X on its corresponding mate solutions to the ¯ow analysis problem, each computable at inputs. compile time. The development of the ®nal abstract semantics pro- S X is a precise de®nition of the property we wish to determine, vides a clear, formal description of the analysis technique presented but its precision typically implies that it cannot be computed at in ªControl-Flow Analysis in Scheme.º compile time. It might be uncomputable; it might depend on the S runtime inputs. The second step, then, is to abstract X to a ^ S X Intro duction new semantics, which trades off accuracy for compile-time computability. This sort of approximation is a typical program- Controlow analysis analysis tradeoff Ð the real answers we seek are uncomputable,so we settle for computable, conservative approximations to them. Schemecontrol-¯owanalysis(CFA)isausefultechniquefor compile- For example, allow property X to be the setof all ªuselessvari- time analysis of the control-¯ow structure of Scheme programs (or, ablesºin a program, where a uselessvariableis one referenced only more generally, programs written in languages allowing ®rst-class to computevaluesbound to other uselessvariables. Suchvariables, functions). In a previouspaper[9], I introducedthe technique,gave and the computationsreferencingthem, can then beeliminated from S an algorithm for it, and demonstrated two example optimisations the program without altering its result. Our alternate semantics X P (induction-variable elimination and useless-variable elimination) would map a program P to a function that ªcomputesº 's useless- that could be achieved with the results of the analysis. Other use- variable set. This semantics would probably be uncomputable, ful applications of control-¯ow analysis are type recovery [11], and depending on perfect knowledge of the control-¯ow behavior of P . copy, constantandlambda propagation[13]. The fundamentalideas ^ S A useful, conservative abstraction X would be one that occasion- of control-¯ow analysis have also been utilised in other work on ally misses a truely useless variable, but never includes a useful functional programming languages [8, 2]. variable in its result set. The basic technique for performing Scheme control-¯ow anal- The method of non-standard abstract semantic interpretation ysis consists of translating the Scheme program into a simple inter- has several bene®ts. Since the analysis is expressed in termsofa mediate representation: continuation-passing style (CPS) Scheme formal semantics, it is possible to prove important properties about with a primitive functional conditional operator and all side-effects the analysis. In particular, we can prove that the abstract semantics to variables converted into side-effects to data-structures. After ^ S S X is computable, and safe with respect to X . Further, due to its the CPS conversion, all transfers of control in the program Ð se- formal nature, and because of its relation to the standard semantics quencing, iteration, conditional transfers, procedure call/return Ð of a programming language, the simple expression of an analysis are represented as tail-recursive procedure calls. Thus the problem in terms of abstract semantic interpretations helps clarify it. The of determining the control-¯ow structure of the program reduces to abstract semantic interpretation method of program analysis has the problem of determining for each call site the set of all lambda been applied to an array of program analyses [3, 5, 12, 4, 11]. expressionsin the program that could be branched to from that call In this paper, I will explicate Scheme control-¯ow analysis site. using this framework. I will show a series of semantics for CPS Scheme, beginning with the standard semantics, evolving through To appear at the First ACM SIGPLAN and IFIP Symposium exactcontrol analysis,andendingupwith two different computable on Partial Evaluation and Semantics-Based Program Manipu- abstractions (with different cost/precision tradeoffs). lation, June 1991, Yale University, New Haven, Conn. Also available as Technical Report CMU-CS-91-119, CMU School of Computer Science. 2 CFA Scheme Semantics PR ::= LAM v v c [v c ] i LAM ::= 1 . n VAR CALL f a a [f a ] i CALL ::= 1 . n FUN ARG l c [f l c ] letrec f i 1 1 . i VAR LAM CALL + ARG ::= LAM + VAR CONST + FUN ::= LAM + VAR PRIM x z foo g VAR ::= f . f g CONST ::= f . if g PRIM ::= f . Figure 1: CPS Scheme Syntax Notation would like to limit the arguments we are willing to circularly close to be lambda expressions. By elevating this restriction to the syn- D D is used to indicate all vectors of ®nite length over the set . tactic level, we will simplify the semantics equations, and sidestep a b c d Functions are updated with brackets: e is the func- the Y operator. There is no syntax for assigning variables in this b c d tion mapping a to , to , and everywhereelse identicalto function language. If we wish to allow side effects, we can introduce ap- e. This notation is extended by taking an update standing by itself propriate primops to create and side-effect mutable data structures; to imply an update to the appropriate bottom function ; hence[] assignments to variables in the source language can be converted is equivalent to . Unused function variables are, by convention, into equivalent data structure side effects during the CPS conver- x e D P (D ) subscripted with , e.g., x . The power set of is . Vectors sion [6]. Finally, we'll assume that all variables are uniquein a a p z i are written h . Lambda functions are sometimes written with program Ð that is, no identi®er is bound by more than one lambda ha bi a vector-destructuring syntax: the function exp takesa two expression. element vector as its single argument, binding a to the ®rst element, It bearsemphasizingthatthis ratherminimal languageis a useful i v v i and b to the second. The th element of vector is written . intermediate representation for compiling higher-order languages Functions with power-set ranges can be joined with the t operator: such as Scheme. Variants of CPS Scheme have been used in sev- t g = x (f x) (g x) + f . The ªpredomainº operator is usedto eral Scheme and ML compilers. A full discussion of the many + B construct the disjoint union of two sets: A . This operator does advantages of CPS-based intermediate representations, however, is not introduce a new bottom element, and so the result object is just beyondthe scope of this paper [1, 6, 15]. a set, not a domain; following Reynolds [7], I attempt to introduce CPS Scheme has a very simple semantics. The semantic do- domains only where necessaryin semantic constructions, avoiding mains and functions are given in ®gure 2. There is a set of basic ªspurious values.º values,Bas, whichconsistsofthe integersandaspecialfalse value(I will follow traditional Lisp practice in assuming no special boolean CPS Scheme type; anything not false is a true value). The value set D consists of the basicvaluesandCPS Schemefunctions. CPSSchemefunctions The practice of converting programs into continuation-passingstyle are representedasfunctionsfrom vectors ofvaluesto the answerset. (CPS) as an intermediate representation for compilation has been The domain of answers Ans is the value set D plus a special error discussed in several papers [15, 6, 1]. CPS can be summarised element denoting a run-time error, and a bottom element denoting by stating that function calls are one-way transfers Ð they do not non-termination1. An environment is a function from variables to return. Soafunctioncallcanbeviewedasa GOTO thatpassesvalues. values. In this section, we will de®ne a very simple language, called CPS Note that an environment only maps to values in D Ð it will Scheme, for expressing programs in written in this style. never map a variable to or error. This is one of the happy The syntax of CPS Scheme is shown in ®gure 1. A program consequencesof CPS conversion. Also note the absence of a store is a single lambda expression. Lambda expressions bind variables in this semantics. Side effects have been dropped completely from v v 1 . n ; the body of a lambda expression must be a single call this semantics to simplify the presentation; they are not dif®cult expression. There are two kinds of call expressions. A simple to reinstate [10] once the basic methods outlined in this paper are call expression is a function applied to a series of arguments. The understood. function expressionmay only be a variable, a lambda or a primitive PR maps a program to its result. It simply calls the A function operation (primop). An argument expression may only be a vari- to close its lambda intheemptyenvironment[ ], andcallstheresult able, a lambda,ora constant.

The Semantics of Scheme Control-Flow Analysis

7. Control Flow First?

A Survey of Hardware-Based Control Flow Integrity (CFI)

Control-Flow Analysis of Functional Programs

Obfuscating C++ Programs Via Control Flow Flattening

Control Flow Statements

Control Flow Statements in Java Tutorials Point

Control Flow Analysis

Code Based Analysis of Object-Oriented Systems Using Extended Control Flow Graph

Computer Science 2. Conditionals and Loops

Statement-Level Control Structures

Stimulus Structures and Mental Representations in Expert Comprehension of Computer Programs By: Nancy Pennington Presented By: Michael John Decker Author & Journal

Control Flow