A System for Rapid Implementation of Domain Specific Languages

The following paper was originally published in the Proceedings of the Conference on Domain-Specific Languages Santa Barbara, California, October 1997 KHEPERA: A System for Rapid Implementation of Domain Specific Languages Rickard E. Faith, Lars S. Nyland, Jan F. Prins University of North Carolina, Chapel Hill For more information about USENIX Association contact: 1. Phone: 510 528-8649 2. FAX: 510 548-5738 3. Email: [email protected] 4. WWW URL:http://www.usenix.org Khepera: A System for Rapid Implementation of Domain Sp eci c Languages Rickard E. Faith Lars S. Nyland Jan F. Prins Department of Computer Science University of North Carolina CB 3175, Sitterson Hal l Chapel Hil l NC 27599-3175 ffaith,nyland,[email protected] Abstract ample a C compiler is attractive b ecause it provides p ortabilityover a large class of architectures, while achieving p erformance through the near uni- The Khepera system is a to olkit for the rapid im- versal availability of architecture-sp eci c optimizing plementation and long-term maintenance of domain C compilers. sp eci c languages DSLs. Our viewp oint is that Yet there are some drawbacks to this approach. DSLs are most easily implemented via source-to- While DSLs are often simpler than general purp ose source translation from the DSL into another lan- programming languages, the domain-sp eci c infor- guage and that this translation should b e based on mation available may result in a generated program simple parsing, sophisticated tree-based analysis and that can b e much larger and substantially di erent manipulation, and source generation using pretty- in structure than the original co de written in the printing techniques. Khepera emphasizes the use DSL. This can make debugging very dicult: an of familiar, pre-existing to ols and provides supp ort exception raised on some line of an incomprehensi- for transformation replay and debugging for the DSL ble C program generated by the DSL pro cessor is a pro cessor and end-user programs. In this pap er, we long way removed in abstraction from the DSL input presentanoverview of our approach, including im- program. plementation details and a short example. Since the DSL pro cessor is comp osed with a native high-level compiler, and do es not have to p erform machine-co de generation or optimization, we b elieve 1 Intro duction that there are some basic di erences between the construction of a compiler for a general purp ose programming language and the construction of a trans- Domain sp eci c languages DSL can often be im- lator for a DSL. Our view is that DSL translation is plemented as a source-to-source translator comp osed most simply expressed as with a pro cessor for another language. For example, PIC [8], a classic \little language" for typ esetting gures, is translated into troff, a general-purp ose 1. simple parsing of input into an abstract syntax typ esetting language. Language comp osition can b e tree ast, extended in either direction: the CHEM language 2. translation via sophisticated tree-based analysis [1], a DSL used for drawing chemical structures, is and manipulation, and translated into PIC, while troff is commonly translated into PostScript. 3. output source generation using versatile pretty- Other DSLs translate into general-purp ose high-level printing techniques. programming languages. For example, ControlH, a DSL for the domain of real-time Guidance, Naviga- We add the additional caveat that the translation tion, and Control GN&C software, translates into pro cess retain enough information to supp ort the in- Ada [5]; and Risla, a DSL for nancial engineering, verse mapping problem, i.e., given a lo cus in the out- translates into COBOL [18]. put source, determine the tree manipulations and in- The comp osition of a DSL pro cessor with for ex- put source elements that are resp onsible for it. This @ @ @ @ @ @ 2 1 ` C C B C C B C C B C C C C C C H B B B H H B B B H B B B H ' $ ' $ T T T 0 1 ` Final Original Source Source Co de Co de & & 0 P P Figure 1: Transformation Pro cess facilitywould b e useful b oth for the DSL develop er 1.1 Goals for a DSL Implementation to trace erroneous translation and for the DSL user To olkit to trace run-time errors back to the input source. For the translation step we advo cate the use of arbi- The implementation of a DSL translator can require trary ast traversals and transformations. We be- considerable overhead, b oth for the initial implemen- lieve that this approach is simpler for source-to- tation and as the DSL evolves. A to olkit should source translation than the use of attribute gram- leverage existing, familiar to ols as much as p ossi- mars, since it decouples the ast analysis and pro- ble. Use of such to ols takes advantage of previous gram synthesis from the grammar of the input and implementor knowledge and the availability of com- output languages. Further, this approach minimizes prehensive resources explaining these to ols which the need for parsing \heroics", since simple gram- may not b e widely available for a DSL to olkit. mars, close or identical to the natural sp eci cation of the DSL syntax, can b e used to generate an ast A transformational mo del for DSL design ts in well that is sp ecialized in subsequent analysis. By decou- with these high-level goals. Consider the problem of pling the input grammar, translation pro cess, and translating a program, P , written in the domain sp e- output grammar, this approach is b etter able to ac- ci c language, L. In Figure 1, T is an ast which 0 commo date changes during the evolution of the DSL represents P after the parsing phase, . T is the ` syntax and semantics. 0 nal transformed ast, and P isavalid program in 0 the output language, L , constructed from T during Throughout this pap er, we will use \ast" to refer ` the pretty-printing phase, . The transformation to abstract syntax tree derived from parsing the in- pro cess is viewed as a sequential application of var- put le, and to any intermediate tree-based repre- ious transformations functions, T =T ,to sentations derived from this original ast, even if k +1 k k+1 the ast. The determination of which transforma- those representations do not strictly represent an tion function to apply next may require extensive \abstract syntax". analysis of the ast. Once the transformation func- In our own work we use the DSL paradigm in the tions are determined, however, they can b e rapidly compilation of parallel programs. We are particu- applied for replay or debugging. larly interested in the translation into HPF of ir- Within a transformational mo del, a DSL-building regular computations expressed in the Proteus [12] to olkit can simplify the implementation pro cess by language, a DSL providing sp ecialized notation. Our providing sp ecialized to ols where pre-existing to ols observation was that wewere sp ending a disprop or- are not already available, and to transparently inte- tionate amount of e ort working on a custom trans- grate supp ort for debugging within this framework. lator implementation to incorp orate changes in Pro- teus syntax and improvements in the translation The Khepera system facilitates b oth the problem scheme|thus wewere motivated to investigate gen- of rapid DSL prototyping and the problem of long- eral to ol supp ort for DSL translation to simplify this term DSL maintenance through the following sp e- pro cess. ci c design goals: Familiar, mo dularized parsing comp onents. Khepera supp orts the use of familiar scanning and parsing to ols e.g., the traditional lex and yacc,or the newer PCCTS [11] for implementation of a DSL Khepera library as discussed in Section 4.6 and pro cessor. Because Khepera concentrates on pro- shown in Figure 8 and Figure 9; or the transforma- viding the \missing pieces" that help with rapid im- tions are written using explicit calls to the Khepera plementation of DSLs, previous knowledge can be library tree manipulation functions. In either case, utilized, thereby decreasing the slop e of the learning low-level ho oks in the Khepera library track debug- curve necessary for the rapid implementation of a ging information when no des or subtrees are created, DSL. destroyed, copied, or replaced. This low-level information can b e analyzed to provide the abilityto Familiar, exible, and ecient semantic anal- navigate through intermediate versions of the trans- ysis. Khepera uses the source-to-source transfor- formed program, and the abilityto answer sp eci c mational mo del outlined in Figure 1. This mo del queries that supp ort the debugging of the nal trans- uses tree-pattern matching for ast manipulation, formed output: analysis, and attribute calculation. For tedious but common tasks, such as tree-pattern match- setting breakp oints ing, sub-tree creation, and sub-tree replacement, Khepera provides a little language for describ- determining current execution lo cation e.g., in ing tree matches and for building trees. For un- resp onse to a breakp oint or program exception predictable or language-sp eci c tasks, such as attribute manipulation or analysis, the Khepera little rep orting a pro cedure traceback language provides an escap e to a familiar general- purp ose programming language C. Standard tree displaying values of variables traversal algorithms are supp orted e.g., b ottom up, top down, as well as arbitrarily complicated syntax- These tracking and debugging capabilities are the directed sequencing.

Load more