A Transformation-Based Implementation of Lightweight Nested Functions

Vol. 47 No. SIG 0(PRO 29) IPSJ Transactions on Programming June 2006 A Transformation-Based Implementation of Lightweight Nested Functions Tasuku Hiraishi,y Masahiro Yasugiy and Taiichi Yuasay The SC language system has been developed to provide a transformation-based language extension scheme for SC languages (extended/plain C languages with an S-expression based syntax). Using this system, many ﬂexible extensions to the C language can be implemented by transformation rules over S-expressions at low cost mainly because of the pre-existing Common Lisp capabilities for manipulating S-expressions. This paper presents the LW-SC (LightWeight-SC) language as an important application of this system, which features nested functions (i.e., a function de¯ned inside another function). Without returning from a function, the function can manipulate its caller's local variables (or local variables of its indirect callers) by indirectly calling a nested function of its (indirect) caller. Thus, many high-level services with \stack walk" can be easily and elegantly implemented by using LW-SC as an intermediate language. Moreover, such services can be e±ciently implemented because we design and implement LW-SC to provide \lightweight" nested functions by aggressively reducing the costs of creating and maintaining nested functions. The GNU C compiler also provides nested functions as an extension to C, but our sophisticated translator to standard C is more portable and e±cient for occasional \stack walk". port nested functions. A nested function is a 1. Introduction function de¯ned inside another function. With- The C language is often indispensable for de- out returning from a function, the function can veloping practical systems. Furthermore, ex- manipulate its caller's local variables (or local tended C languages are sometimes suitable for variables of its indirect callers) sleeping in the elegant and e±cient development. We can execution stack by indirectly calling a nested implement language extension by modifying a function of its (indirect) caller. C compiler, but sometimes we can do it by This paper presents the implementation of translating an extended C program into C. We an extended SC language, named LW-SC have developed the SC language system8),10) to (LightWeight SC), which features nested func- help such transformation-based language exten- tions☆. Many high-level services with \stack sions. SC languages are extended/plain C lan- walk" mentioned above can be easily and el- guages with an S-expression based syntax and egantly implemented by using LW-SC as an the extensions are implemented by transforma- intermediate language. Moreover, such ser- tion rules over S-expressions. Thus we can re- vices can be e±ciently implemented because duce implementation costs mainly because we we design and implement LW-SC to provide can easily manipulate S-expressions using Lisp. \lightweight" nested functions by aggressively The fact that C has low-level operations (e.g., reducing the costs of creating and maintaining pointer operations) enables us to implement nested functions. Though the GNU C com- many ﬂexible extensions using the SC language piler15) (GCC) also provides nested functions system. But without taking \memory" ad- as an extension to C, our sophisticated transla- dresses, C lacks an ability to access variables tor to standard C is more portable and e±cient sleeping in the execution stack, which is re- for occasional \stack walk". quired to implement high-level services with Note that, though this paper presents an im- \stack walk" such as capturing a stack state for plementation using the SC language system, check-pointing and scanning roots for copying our technique is not limited to it. GC (Garbage Collection). A possible solution to this problem is to sup- ☆ We have previously reported the implementation of y Department of Communications and Computer En- LW-SC as an example of a language extension us- gineering, Graduate School of Informatics, Kyoto ing the SC language system.10) This paper discusses University further details about LW-SC itself. 1 2 IPSJ Transactions on Programming June 2006 extended SC-B code transformation rule-set B transformation rule-set A SC preprocessor SC translator SC preprocessor SC translator extended SC-A code C code extended SC-C code SC preprocessor SC translator SC preprocessor SC compiler C compiler transformation rule-set C SC-0 code executable ¯le Fig. 1 Code translation phases in the SC language system. evaluated as a defmacro form of Com- 2. The SC Language System mon Lisp to de¯ne an SC macro. After This section explains the SC language system the de¯nition, every list in the form of for giving the speci¯cation and an implementa- (macro-name ¢ ¢ ¢) is replaced with the re- tion of LW-SC. More details are available in our sult of the application of Common Lisp's past paper.8),10) macroexpand-1 function to the list. The 2.1 Overview algorithm to expand nested macro applica- The SC language system, implemented in tions complies with the standard C speci¯- Common Lisp, deals with the following S- cation. expression-based languages: ² (%defconstant macro-name S-expression) ² SC-0, the base SC language, and de¯nes an SC macro in the same way as ² extended SC languages, a %defmacro directive, except that every and consists of the following three kinds of mod- symbol which eqs macro-name is replaced ules: with S-expression after the de¯nition. ² The SC preprocessor | includes SC ¯les ² (%undef macro-name) and handles macro de¯nitions and expan- unde¯nes the speci¯ed macro de¯ned by sions, %defmacros or %defconstants. ² The SC translator | interprets transfor- ² (%ifdef symbol list 1 list 2) mation rules for translating SC code into (%ifndef symbol list 1 list 2) another SC, and If the macro speci¯ed by symbol is de¯ned, ² The SC compiler | compiles SC-0 code list 1 is spliced there. Otherwise list 2 is into C. spliced. Fig. 1 shows code translation phases in the ² (%if S-expression list 1 list 2) SC language system. Extended SC code is S-expression is macro-expanded, then the translated into SC-0 by the SC translators, then result is evaluated by Common Lisp. If the translated into C by the SC compiler. Be- return value eqls nil or 0, list 2 is spliced fore each translation phase with a transforma- there. Otherwise list 1 is spliced. tion rule-set is applied, preprocessing by the ² (%error string) SC preprocessor is performed. Extension im- interrupts the compilation with an error plementers can develop a new translation phase message string. simply by writing new transformation rules. ² (%cinclude ¯le-name) 2.2 The SC Preprocessor ¯le-name speci¯es a C header ¯le. The C The SC preprocessor handles the following code is compiled into SC-0 and the result SC preprocessing directives to transform SC is spliced there. The SC programmers can programs: use library functions and most of macros ² (%include ¯le-name) such as printf, NULL declared/#defined corresponds to an #include directive in C. in C header ¯les☆. The ¯le ¯le-name is included. ☆ ² In some cases such a translation is not obvious. (%defmacro macro-name lambda-list In particular, it is sometimes impossible to trans- S-expression1¢ ¢ ¢S-expressionn) late #define macro de¯nitions into %defmacro or Vol. 47 No. SIG 0(PRO 29) A Transformation-Based Implementation of Lightweight Nested Functions 3 2.3 The SC Translator and Transfor- uation result of (every #'function-name mation Rules list) is non-nil. A transformation rule for the SC translator The function function-name can be what is de- is given by the syntax: ¯ned as a list of transformation rules or an usual Common Lisp function (a built-in function or (function-name pattern parm2 ¢ ¢ ¢ parmn) what is de¯ned separately from transformation -> expression rules). In evaluating expression, the special variable where a function function-name is de¯ned as x is bound to the whole matched S-expression an usual Lisp function. When the function is and, in the cases except (1), symbol is bound to called, the ¯rst argument is tested whether it the matched part in the S-expression. matches to pattern. If matched, expression is An example of such a function de¯nition can evaluated by the Common Lisp system, then its be given as follows☆: value is returned as the result of the function call. The parameters parm2 ¢ ¢ ¢ parmn, if any, (EX (,a[numberp] ,b[numberp])) are treated as usual arguments. -> `(,a ,b ,(+ a b)) A list of transformation rules may include two (EX (,a ,b)) or more rules with the same function name. In -> `(,a ,b ,a ,b) these cases, the ¯rst argument is tested whether (EX (,a ,b ,@rem)) it matches to each pattern in written order, and -> rem the result of the function call is the value of (EX ,otherwise) expression if matched. -> '(error) It is permitted to abbreviate The application of the function EX can be ex- ¢ ¢ ¢ (function-name pattern1 parm2 parmn) empli¯ed as follows: -> expression ::: (EX '(3 8)) ! (3 8 11) ¢ ¢ ¢ ! (function-name patternm parm2 parmn) (EX '(x 8)) (x 8 x 8) -> expression (EX 8) ! (error) (EX '(3)) ! (error) (all the expressions are identical and only pat- (EX '(x y z)) ! (z) terns are di®erent from each other) to Each set of transformation rules de¯nes one or ¢ ¢ ¢ (function-name pattern1 parm2 parmn) more (in most cases) function(s). A piece of ::: extended SC code is passed to one of the func- ¢ ¢ ¢ (function-name patternm parm2 parmn) tions, which generates transformed code as the -> expression. result. Internally, transformation rules for a func- The pattern is an S-expression consisted of one tion are compiled into an usual Common Lisp of the following elements: function de¯nition (defun).

A Transformation-Based Implementation of Lightweight Nested Functions

Compiling Sandboxes: Formally Verified Software Fault Isolation

Using the GNU Compiler Collection (GCC)

Bringing GNU Emacs to Native Code

Hop Client-Side Compilation

Practical Control-Flow Integrity

Objective-C Internals Realmac Software

Using and Porting the GNU Compiler Collection

Using the GNU Compiler Collection

Memory-Efficient Tail Calls in the JVM with Imperative Functional Objects

Jekyll on Ios: When Benign Apps Become Evil

Selective Tail Call Elimination

Mocfi: a Framework to Mitigate Control-Flow Attacks on Smartphones