Rules for Block Structured Languages: The Block Language

João Saraiva

[email protected]

27 May 2005

This document presents an attribute grammar to specify the scope rules for block-structured languages. These rules are based on the name analysis rules of the programming language Algol 68. This document has the following objectives:

• A brief introduction to the LRC system,

• A concise introduction to Attribute Grammars,

• The graphical representation of Abstract Syntax Trees (as XML trees),

• The specification of GUIs via attribute grammars.

The PDF version of this document is available at http://www.di.uminho.pt/~jas/Research/LRC/examples/Block/block.pdf.

1 The Block Language

Consider a very simple language that deals with the scope rules of a block-structured language: a definition of an identifier x is visible in the smallest enclosing block, with the exception of local blocks that also contain a definition of x. In this latter case, the definition of x in the local scope hides the definition in the global one. We shall analyse these scope rules via a toy language: the Block language. One sentence in Block consists of a block, and a block is a (possibly empty) list of statements. A statement is one of the following three things: a declaration of an identifier (such as decl a), the use of an identifier (such as use a), or a nested block. Statements are separated by the punctuation symbol ";" and blocks are surrounded by square brackets. A concrete sentence in this language looks as follows:

sentence = [ use x ; use y ; decl x ; [ decl y ; use y ; use w ] ; decl y ; decl x ]

This language does not require that declarations of identifiers occur before their first use. Note that this is the case in the first two applied occurrences of x and y: they refer to their (later) definitions in the outermost block. Note also that the local block defines a second identifier y. Consequently, the second applied occurrence of y (in the local block) refers to the inner definition and not to the outer definition. In a block, however, an identifier may be declared at most once. So, the second definition of identifier x in the outermost block is invalid. Furthermore, the Block language requires that only defined identifiers may be used. As a result, the applied occurrence of w in the local block is invalid, since w has no binding occurrence at all. We aim to develop a program that analyses Block programs and computes a list containing the identifiers which do not obey the rules of the language. Thus, this program, called block, is a static semantic analyser for the Block language. It has the following type:

block :: Prog -> [Name]

where Name is the type of the Block identifiers. In order to make the problem more interesting, and also to make it easier to detect which identifiers are being incorrectly used in a Block program, we require that the list of invalid identifiers follows the sequential structure of the input program. Thus, the semantic meaning of processing the example sentence is [w,x], i.e.:

block sentence = [w,x]

Next, we shall describe the program block in the traditional attribute grammar paradigm. First, we define the concrete and the abstract syntax of Block via two context-free gram- mars. After that, we define the semantics of the language by extending the grammar with attributes and attribute equations.

2 Regular Expressions for Scanning

In order to define the concrete syntax of the Block language, we need to define its set of terminal symbols. Usually, terminal symbols are easily and concisely defined by using regular expressions. Next, we present the regular expressions defining the reserved keywords of the language, and the notation used for its identifiers. In LRC, regular expressions are named, so that they can be referenced later. The notation used for regular expressions is the Unix one.

USE : < "use" >; DECL : < "decl" >; IDENTIFIER : < [a-zA-Z][a-zA-Z0-9_]* >;

Punctuation symbols, like ; or [, are also terminal symbols. We omit their definition via regular expressions here, since they consist of a single character only and can be defined directly in the context-free grammar. Traditionally, lexical analysers remove "spaces" from the input. The LRC specification language provides a special named regular expression, WHITESPACE, whose matched characters are not returned to the parser.

WHITESPACE : < [\ \t\n]+ >;
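To make the lexical rules concrete, here is a minimal hand-written tokenizer sketch in Haskell. It is an illustration only (LRC generates the real scanner from the regular expressions above): brackets and ";" are single-character tokens, whitespace is discarded, and the keywords use and decl take precedence over IDENTIFIER.

```haskell
import Data.Char (isAlpha, isAlphaNum)

-- Token kinds, mirroring the USE, DECL and IDENTIFIER rules,
-- plus the single-character punctuation symbols.
data Token = TUse | TDecl | TIdent String | TOpen | TClose | TSemi
  deriving (Show, Eq)

tokenize :: String -> [Token]
tokenize []       = []
tokenize (c : cs)
  | c `elem` " \t\n" = tokenize cs            -- WHITESPACE: discarded
  | c == '['         = TOpen  : tokenize cs
  | c == ']'         = TClose : tokenize cs
  | c == ';'         = TSemi  : tokenize cs
  | isAlpha c        = let (w, rest) = span isIdent (c : cs)
                       in keyword w : tokenize rest
  | otherwise        = error ("lexical error at: " ++ [c])
  where isIdent x = isAlphaNum x || x == '_'

-- Keywords take precedence over IDENTIFIER, as in the generated scanner.
keyword :: String -> Token
keyword "use"  = TUse
keyword "decl" = TDecl
keyword w      = TIdent w
```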

3 Context-free grammar for syntactic analysis

The axiom defines a block surrounded by square brackets and the non-terminal stats defines the statements of the blocks.

prog { syn P ast; };
prog ::= (stats)
         { $$.ast = Root(stats.ast); }
     ;

The body of a block is either an empty statement, defined by the empty production, or a non-empty statement, in which case the body of the block is defined by the non-terminal lststs.

stats { syn Its ast; };
stats ::= (lststs)
          { $$.ast = lststs.ast ; }
      | ()
          { $$.ast = NilIts(); }
      ;

Non-terminal lststs defines a sequence of one or more non-terminal symbols stat separated by the literal terminal ;.

lststs { syn Its ast; };
lststs ::= (stat)
           { $$.ast = ConsIts (stat.ast,NilIts()); }
       | (stat ';' lststs)
           { $$.ast = ConsIts (stat.ast,lststs$2.ast); }
       ;

The non-terminal stat defines the three statements of Block: definitions, uses and nested blocks.

stat { syn It ast; };
stat ::= (USE name)
         { $$.ast = Use(name.ast); }
     | (DECL name)
         { $$.ast = Decl(name.ast); }
     | ('[' stats ']')
         { $$.ast = Block(stats.ast); }
     ;

name { syn Id ast ; } ;
name ::= (IDENTIFIER)
         { $$.ast = Ident(IDENTIFIER); }
     ;

The Block concrete grammar is used for the syntactic analysis of the Block language. It guides the derivation process of each concrete sentence of the language. It also assigns a unique concrete syntax tree to each syntactically valid sentence of the language. These trees are automatically generated by the LRC system, see chapter ???.

However, syntactically valid sentences may fail to correspond to semantically valid sentences of the language. Such sentences violate the semantic rules of the language. Consider, for example, the sentence sentence: it is syntactically correct because, as we said before, it can be generated by the previous CFG, but it is semantically incorrect, because two identifiers (x and w) of sentence violate the rules of the language. To describe the semantic rules of the language we focus on its abstract structure. So, before we extend our grammar with attributes and attribute equations, we have to define the abstract syntax of the Block language. Let us start by informally defining the abstract structure of a Block sentence: an abstract sentence in Block is a list of statements, where each statement is the declaration or the use of an identifier, or a nested block. The syntactic property according to which statements are separated by a punctuation symbol is irrelevant for the abstract structure of the language. Formally, we define such an abstract language by the following productions:

P   : Root (Its)
    ;
Its : NilIts ()
    | ConsIts (It Its)
    ;
It  : Use (Id)
    | Decl (Id)
    | Block (Its)
    ;
Id  : Ident (STR)
    ;

As expected, the literal symbols, e.g., decl, use, [, ] and ;, are not mentioned in the abstract grammar. Observe also that in the concrete grammar we have used two non-terminal symbols to specify the fact that the body of a block is a possibly empty list of statements separated by the character ";". As we have said above, this is irrelevant for the abstract structure of the language, where non-terminal Its simply defines a list of statements.
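The abstract grammar maps directly onto algebraic data types: one type per non-terminal, one constructor per production. The following Haskell sketch (Haskell is used here only as executable notation; the names mirror the SSL productions, with STR rendered as String) also builds the example sentence as a term:

```haskell
type Name = String

-- One data type per abstract non-terminal; one constructor per production.
data P   = Root Its               deriving (Show, Eq)
data Its = NilIts
         | ConsIts It Its         deriving (Show, Eq)
data It  = Use Id
         | Decl Id
         | Block Its              deriving (Show, Eq)
data Id  = Ident Name             deriving (Show, Eq)

-- Convenience: build an Its list from a Haskell list.
its :: [It] -> Its
its = foldr ConsIts NilIts

-- The example sentence as an abstract syntax tree.
sentence :: P
sentence = Root (its
  [ Block (its
      [ Use (Ident "x"), Use (Ident "y"), Decl (Ident "x")
      , Block (its [Decl (Ident "y"), Use (Ident "y"), Use (Ident "w")])
      , Decl (Ident "y"), Decl (Ident "x")
      ])
  ])
```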

3.1 Root Symbol(s)

In the LRC specification language we have to explicitly define the root symbol of the context-free grammars defining the concrete and the abstract syntax of the underlying language. Let us start with the abstract grammar. To define the root symbol of this grammar we explicitly declare the root non-terminal with the keyword root. In the Block (abstract) language we have the following root:

root P;

This can also be seen as the type of the abstract syntax tree. In other words, the result of parsing a sentence is a tree of type P. Let us now consider the concrete grammar. In the concrete grammar we have to define not only the root symbol, but also the type of the result of the parser and the name of the attribute where that tree is computed. In LRC this is defined as follows:

P ~ prog.ast;

We read it as follows: we (may) start deriving sentences from the non-terminal symbol prog, and we construct an abstract syntax tree of type P in attribute ast. In LRC we can have several root symbols, meaning that we will be able to parse sub-languages of Block as well. Next, we declare the so-called parser entry points.

Its ~ stats.ast;
It  ~ stat.ast;
Id  ~ name.ast;

The possibility of having several "root symbols" will be an important feature of the parsers when we build programming environments (see chapter ??). For batch tools (e.g., compilers) this feature is not so relevant. Note that other systems, like for example Yacc, can use a default rule which assumes that the first non-terminal occurring in the specification is the root symbol.

4 Pretty Printing/Unparsing Rules

To pretty print (or unparse) the abstract syntax tree we have three different approaches:

• We may use a predefined LRC notation for unparsing.

• We may use the xml4free tool (developed with LRC), which automatically produces attribution rules to compute XML trees (and their graphical representation).

• We may use an attribute grammar for (generic) pretty printing. This pretty printing attribute grammar is distributed in the AGLIB.

Next, we will present how to use the predefined unparsing rules for the Block language. In chapter ?? we will show how to use xml4free to produce the attribute grammar fragment that builds an instance of the AST as an XML tree. We will also show how to visualize such a tree graphically. The use of the generic and powerful pretty printing algorithm will be discussed in ...

4.1 Unparsing Rules in LRC

The LRC specification language includes a pre-defined and fixed notation to define unparsing rules (also called pretty printing rules). For each production of the abstract grammar we have to define its unparsing rule; that is to say, for each tree node constructor we have to define how it is shown textually. Next, we present the unparsing rules for each production/constructor of the abstract grammar.

P : Root [ @ ::= "[ " @ "%n]" ] ;

Its : ConsIts [ @ ::= @ [" ; "] @] /* list elements separated by ; */ ;

It : Use [ @ ::= "use " @ ] | Decl [ @ ::= "decl " @ ] | Block [ @ ::= "%n [ " @ " ] %n%b" ] ;

Id : Ident [ ^ ::= ^ ] ;
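As an illustration of what these rules mean (this is not LRC output), they correspond roughly to the following Haskell rendering function. The tree types are repeated from the abstract grammar, with identifiers as plain strings for brevity; %n is rendered as a newline, and the %b indentation control is ignored for simplicity.

```haskell
type Name = String
data P   = Root Its
data Its = NilIts | ConsIts It Its
data It  = Use Name | Decl Name | Block Its

unparse :: P -> String
unparse (Root body) = "[ " ++ unparseIts body ++ "\n]"    -- Root: "[ " @ "%n]"

unparseIts :: Its -> String                               -- elements separated by " ; "
unparseIts NilIts              = ""
unparseIts (ConsIts it NilIts) = unparseIt it
unparseIts (ConsIts it rest)   = unparseIt it ++ " ; " ++ unparseIts rest

unparseIt :: It -> String
unparseIt (Use n)   = "use "  ++ n                        -- Use:  "use " @
unparseIt (Decl n)  = "decl " ++ n                        -- Decl: "decl " @
unparseIt (Block b) = "\n [ " ++ unparseIts b ++ " ] \n"  -- Block: "%n [ " @ " ] %n%b"
```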

5 Running LRC

Having defined the regular expressions (for scanning the input), the concrete context-free grammar (for parsing), the abstract grammar (for constructing the abstract syntax tree) and the unparsing rules, we can use the LRC system to produce a front-end for the Block language. Let us consider that these specifications are included in the file block.ssl, available at http://www.di.uminho.pt/~jas/Research/LRC/examples/Block/block.ssl. The LRC system includes a Perl script to run the different tools included in the system. The easiest way to run LRC is to define a Makefile similar to the one below.

• Filename: Makefile

RUNLRC=perl -S -w /usr/local/lrc/bin/runlrc.pl

SSL_FILES = block.ssl

RUNLRC_OPTIONS = -verbose -v2c_code -DTRACE

eva : code.ivr $(LIBS)
	$(RUNLRC) +4 -4 $(RUNLRC_OPTIONS)

code.ivr : $(SSL_FILES)
	$(RUNLRC) -3 $(SSL_FILES) $(RUNLRC_OPTIONS)

clean :
	$(RUNLRC) -clean

This Makefile is available at http://www.di.uminho.pt/~jas/Research/LRC/examples/Block/Makefile.

6 Running the block processor

[jas@localhost doc]$ ./eva -h
Usage: eva [option] file
Valid options are:
  -T  show trees that are processed
  -i  use indentation when printing trees
  -f  load etf file
  -F  emit etf file
  -P  show computed output packets
  -Y  trace parsing
  -V  trace visits
  -O  write tree or attribute of root node to file
      -O<attr_name>
      -O<attr_name>:<view_name>
      -O<attr_name>::<file_name>
      -O<attr_name>:<view_name>:<file_name>
      -O:<view_name>
      -O:<view_name>:<file_name>
  -M  define memoization strategy
      -M <strategy>,<number of collection-free evaluations>,<number of caches>

  -S  define memoization tree-size
      -S <use tree size>,<minimal tree size>
  -h  print this message

eva -Y input

[jas@localhost doc]$ ./eva -Y input
/**** processing file input ****/
SCANNED: <[> (code=91; token length=1)
SCANNED: <use> (code=257; token length=3)
SCANNED: <x> (code=259; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <use> (code=257; token length=3)
SCANNED: <y> (code=259; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <decl> (code=258; token length=4)
SCANNED: <x> (code=259; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <[> (code=91; token length=1)
SCANNED: <decl> (code=258; token length=4)
SCANNED: <y> (code=259; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <use> (code=257; token length=3)
SCANNED: <y> (code=259; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <use> (code=257; token length=3)
SCANNED: <w> (code=259; token length=1)
SCANNED: <]> (code=93; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <decl> (code=258; token length=4)
SCANNED: <y> (code=259; token length=1)
SCANNED: <;> (code=59; token length=1)
SCANNED: <decl> (code=258; token length=4)
SCANNED: <x> (code=259; token length=1)
SCANNED: <]> (code=93; token length=1)
SCANNED: <> (code=0; token length=1)

Next, we present the concrete syntax tree (or parse tree) and the abstract syntax tree assigned by the concrete and the abstract grammar, respectively, for the example sentence.

eva -T -i input

[jas@localhost doc]$ ./eva -T -i input
...
abstract tree=
Root(
  ConsIts(
    Block(
      ConsIts(
        Use( Ident("x") ),
        ConsIts(
          Use( Ident("y") ),
          ConsIts(
            Decl( Ident("x") ),
            ConsIts(
              Block(
                ConsIts(
                  Decl( Ident("y") ),
                  ConsIts(
                    Use( Ident("y") ),
                    ConsIts(
                      Use( Ident("w") ),
                      NilIts()
                    )
                  )
                )
              ),
              ConsIts(
                Decl( Ident("y") ),
                ConsIts(
                  Decl( Ident("x") ),
                  NilIts()
                )
              )
            )
          )
        )
      )
    ),
    NilIts()
  )
)

eva -O input

[jas@localhost doc]$ ./eva -O input
/**** processing file input ****/
Unparse decorated term
[ [ use x ; use y ; decl x ; [ decl y ; use y ; use w ] ; decl y ; decl x ]
]

7 The Block Attribute Grammar

The Block language does not force a declare-before-use discipline. Consequently, a conventional implementation of the required analysis naturally leads to a program that traverses each block twice: once for processing the declarations of identifiers and constructing an environment, and a second time to process the uses of identifiers (using the computed environment) in order to check for the use of non-declared identifiers.

The uniqueness of identifiers is checked in the first traversal: for each newly encountered identifier declaration it is checked whether that identifier has already been declared at the same lexical level. In that case, the identifier has to be added to a list reporting the detected errors. The algorithm for the processor of the Block language looks as follows:

• 1st Traversal
  – Collect the list of local definitions
  – Detect duplicate definitions (using the collected definitions)

• 2nd Traversal
  – Use the list of definitions as the global environment
  – Detect uses of non-defined names
  – Combine "both" errors

As a consequence, semantic errors resulting from duplicate definitions are computed during the first traversal and errors resulting from missing declarations in the second one. According to the above algorithm, we have to define rules in order to compute (i.e., to synthesize) the list of declared identifiers in a block. This list is the environment needed to detect invalid uses. The computation of the list is graphically shown next.
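The two-traversal algorithm can be sketched directly in Haskell. This is an executable illustration of the algorithm, not the evaluator LRC generates: identifiers are plain strings (the Ident wrapper is elided), and the names dclo, env and lev mirror the attributes used in the grammar fragments.

```haskell
type Name = String
type Env  = [(Name, Int)]            -- names paired with their lexical level

data P   = Root Its
data Its = NilIts | ConsIts It Its
data It  = Use Name | Decl Name | Block Its

-- Only a declaration extends the accumulated environment.
step :: Int -> Env -> It -> Env
step lev dcli (Decl n) = (n, lev) : dcli
step _   dcli _        = dcli

-- First traversal: thread the inherited declarations (dcli) through the
-- statements, synthesizing the block's total declarations (dclo).
dclo :: Int -> Env -> Its -> Env
dclo _   dcli NilIts            = dcli
dclo lev dcli (ConsIts it rest) = dclo lev (step lev dcli it) rest

-- Second traversal: check every statement against the total environment,
-- re-threading dcli so duplicates are reported at their second occurrence.
errors :: Int -> Env -> Env -> Its -> [Name]
errors _   _   _    NilIts            = []
errors lev env dcli (ConsIts it rest) =
  here ++ errors lev env (step lev dcli it) rest
  where
    here = case it of
      Decl n  -> mNBIn n lev dcli    -- duplicate declaration at this level?
      Use n   -> mBIn n env          -- used name must be declared somewhere
      Block b -> let total = dclo (lev + 1) env b
                 in errors (lev + 1) total env b

mBIn :: Name -> Env -> [Name]
mBIn n env = if any ((== n) . fst) env then [] else [n]

mNBIn :: Name -> Int -> Env -> [Name]
mNBIn n lev dcli = if (n, lev) `elem` dcli then [n] else []

block :: P -> [Name]
block (Root body) = errors 0 (dclo 0 [] body) [] body
```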

Next, we shall define the attribution rules of the attribute grammar. According to the above semantic rules, for every block we compute three things: its environment, its lexical level and its invalid identifiers. The environment defines the context where the block occurs. It consists of all the identifiers that are visible in the block. The lexical level indicates the nesting level of a block. Observe that we have to distinguish between the same identifier declared at different levels, which is a valid declaration (e.g., “decl y” in sentence), and the same identifier declared at the same level, which is an invalid declaration (e.g., “decl x” in sentence). Finally, we have to compute the list of identifiers that are incorrectly used, i.e., the list of errors. Those three things correspond to three attributes. Attributes have a type. The type of the lexical level is the primitive LRC type INT. Apart from the primitive types of LRC, we also assume that the type of the pseudo-terminal symbol Name exists (it is provided externally). In other words, the type of Block identifiers is Name. We define two new types: the type of the environment, denoted by Env, that is an association list from identifiers to lexical levels, and the type Error that represents the type of the list of invalid identifiers. It is common in attribute grammars to use additional non-terminal symbols to define new data types. These types are defined in LRC as follows:

/*
 * Environment: finite function mapping Names to lexical levels
 */
list Env;
Env    : EmptyEnv ()
       | ConsEnv (ATuple Env)
       ;
ATuple : Tuple (Id INT)
       ;

And the type of the list of errors is:

/*
 * Result: List of (invalid) Names
 */
list Error;
Error : NilError()
      | Cons_Error (Id Error)
      ;

Let us now describe the semantic domains of the Block language. In order to focus on each semantic domain individually, we split the attribute grammar specification into fragments. Every fragment corresponds to a particular semantic domain of our language. Basically, we have four semantic domains: the list of declarations, the environment, the lexical level and the errors. Therefore we will have four AG fragments, each of which contains the attribution rules defining the respective semantic domain.

7.1 Collecting the List of Definitions

We start by defining the construction of the list of definitions. We have two alternatives to construct such a list: either we synthesize the local declarations using a bottom-up strategy and "add" these declarations to the environment of the outer block in order to obtain the required environment, or we thread the environment of the outer block through the body of the local block and accumulate its local declarations. Although both alternatives seem to follow the algorithm, only the second one produces the desired results. Observe that the duplicated declarations are detected during the construction of the environment. The first alternative detects those duplications, but points out the wrong occurrence of a duplicated

declaration: since it performs a bottom-up strategy (in other words, it traverses the Block sentences from right to left), it is the first occurrence of the declaration of an identifier that is marked as duplicated, instead of the second one. This is shown in the next figure.

Thus, we adopt the second alternative, since it detects the duplications following the sequential structure of the input. In functional programming this technique is used for improving programs and is known as accumulating parameters. Thus, we associate an inherited attribute dcli of type Env with the non-terminal symbols Its and It that define a block. The inherited list of declarations is threaded through the block in order to accumulate the local definitions and in this way synthesize the total list of declarations of the block. We associate a synthesized attribute dclo, also of type Env, with the non-terminal symbols Its and It, which defines the newly computed list.

The only production that contributes to the synthesized environment of a block is Decl. The single semantic equation of this production makes use of the semantic function :: (written in infix notation) to build the environment. Note that we are using the list type definition presented previously. The constructor Tuple is used to bind an identifier to its lexical level. The single occurrence of the pseudo-terminal Name is syntactically referenced in the equation, since it is used as a normal value by the semantic function. All the other semantic equations of this fragment simply pass the environment to the left-hand side and right-hand side symbols within the respective productions. The semantic function is the identity function, which we omit from the copy rule.

/*
 * Fragment: Collecting the list of definitions
 */

Its , It { inh Env dcli ;
           syn Env dclo; } ;

P   : Root    { Its.dcli = EmptyEnv (); }
    ;
Its : NilIts  { $$.dclo = $$.dcli; }
    | ConsIts { It.dcli    = $$.dcli;
                Its$2.dcli = It.dclo;
                $$.dclo    = Its$2.dclo; }
    ;
It  : Decl    { $$.dclo = Tuple(Id,$$.lev) :: $$.dcli; }
    | Use , Block
              { $$.dclo = $$.dcli; }
    ;

7.2 Computing the Lexical Level

Every block has a lexical level. Thus, we introduce an inherited attribute lev indicating the nesting level of a block. The LRC primitive function + is used to increment the value of the lexical level passed to the inner blocks.

/*
 * Fragment: Computing the lexical level
 */

Its, It { inh INT lev; } ;

P   : Root    { Its.lev = 0; }
    ;
Its : ConsIts { It.lev    = $$.lev;
                Its$2.lev = $$.lev; }
    ;
It  : Block   { Its.lev = $$.lev + 1; }
    ;

7.3 Distributing the Environment

Now that the total environment of a block is defined, we pass that context down to the body of the block in order to detect applied occurrences of undefined identifiers. Every block inherits the environment of its outer block. The environment of a local block includes the environment of its outer block together with its own local declarations. This is shown in the next figure.

Thus, we define a second inherited attribute of type Env, called env, to distribute the total environment. The first semantic equation of Block specifies that the inner blocks inherit the environment of their outer ones. As a result, only after computing the environment of a block is it possible to process its nested blocks. That is, inner blocks will be processed in the second traversal of the outer one. The total environment of the inner blocks, however, is the synthesized environment (i.e., attribute dclo), as defined in the second equation. It is also worthwhile to note that the equation

Its.env = Its.dclo

induces a dependency from a synthesized to an inherited attribute of the same symbol Its. Although such dependencies are natural in attribute grammar specifications, they may lead to complex (functional) implementations.
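In a lazy functional language, such a synthesized-to-inherited dependency can nevertheless be expressed in a single traversal as a circular program. A minimal standalone illustration (a small tree type, not the Block grammar itself) is the classic repmin: the minimum of all leaves is synthesized by the traversal, yet is also passed back down, as an inherited value, into the tree being rebuilt.

```haskell
-- repmin replaces every leaf by the minimum of all leaves in ONE traversal.
data Tree = Leaf Int | Fork Tree Tree
  deriving (Show, Eq)

repmin :: Tree -> Tree
repmin t = t'
  where
    -- m is both a result of `go` and an input to the tree it builds:
    -- exactly the kind of cycle that lazy evaluation resolves.
    (m, t') = go t
    go (Leaf n)   = (n, Leaf m)          -- uses m before it is "finished"
    go (Fork l r) = (min ml mr, Fork l' r')
      where (ml, l') = go l
            (mr, r') = go r
```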

/*
 * Fragment: Distributing the environment
 */

Its, It { inh Env env; } ;

P   : Root    { Its.env = Its.dclo; }
    ;
Its : ConsIts { It.env    = $$.env;
                Its$2.env = $$.env; }
    ;
It  : Block   { Its.dcli = $$.env;
                Its.env  = Its.dclo; }
    ;

7.4 Computing the list of errors

/*
 * Fragment: Computing the list of errors
 */

P , Its , It { syn Error errors; };

P   : Root    { $$.errors = Its.errors; }
    ;
Its : NilIts  { $$.errors = NilError(); }
    | ConsIts { $$.errors = It.errors @ Its$2.errors; }
    ;
It  : Decl    { $$.errors = mNBIn (Id,$$.lev,$$.dcli); }
    | Use     { $$.errors = mBIn (Id,$$.env); }
    | Block   { $$.errors = Its.errors; }
    ;

7.5 Semantic Functions

/*
 * Fragment: Semantic Functions
 */

Error mBIn (Id name , Env e)
{
  with(e)
  ( EmptyEnv()    : name :: NilError()
  , ConsEnv(t,es) : with(t)
                    ( Tuple(n,l) : (n == name) ? NilError()
                                               : mBIn (name,es) )
  )
};

Error mNBIn (Id name , INT lev , Env e)
{
  with(e)
  ( EmptyEnv()    : NilError()
  , ConsEnv(t,es) : with(t)
                    ( Tuple(n,l) : (n == name) && (l == lev)
                                   ? n :: NilError()
                                   : mNBIn (name,lev,es) )
  )
};

8 Constructing and Visualizing XML

In order to compute a graphical (XML-based) visual representation of the input sentence under consideration, we can use the tool xml4free. This tool accepts as input an SSL specification (where the abstract grammar is defined) and produces an SSL fragment. The produced SSL code contains the attribution rules needed to synthesize the XML representation of the tree. The generated SSL file can be directly included in the specification of the tool. Next, we present the makefile (available at http://www.di.uminho.pt/~jas/Research/LRC/examples/Block/Makefile2) where:

• The tool xml4free is called, producing the file block_xml.ssl

• Several AGs from the AG_LIB are added to the specification of the Block language. These AGs are used in the generated file block_xml.ssl and are:

– $(AG_LIB)/W3C/XML/XmlAST.ssl and $(AG_LIB)/W3C/XML/XmlPP.ssl: define the XML abstract syntax and pretty printing
– $(AG_LIB)/Languages/DOT/Dot.ssl: defines the DOT abstract syntax. This is the language used by the GraphViz tools.
– $(AG_LIB)/W3C/XML/Xml2Dot.ssl: AG that maps XML into DOT.

• Filename: Makefile2

RUNLRC=perl -S -w /usr/local/lrc/bin/runlrc.pl

## AG_LIB modules needed by the Xml4Free

AG_LIB = ../../../AG_LIB

# XML abstract grammar, XML pretty printing, Dot abstract grammar,
# and the AG mapping XML to Dot:
SSL_LIBS = $(AG_LIB)/W3C/XML/XmlAST.ssl \
           $(AG_LIB)/W3C/XML/XmlPP.ssl \
           $(AG_LIB)/Languages/DOT/Dot.ssl \
           $(AG_LIB)/W3C/XML/Xml2Dot.ssl

## LRC Files produced by Xml4Free

GEN_SSL_XML_FILE = block_xml.ssl

## All the files of the attribute grammar

SSL_FILES = block.ssl $(GEN_SSL_XML_FILE) $(SSL_LIBS)

## LRC options

RUNLRC_OPTIONS = -verbose -v2c_code

eva : code.ivr
	$(RUNLRC) +4 -4 $(RUNLRC_OPTIONS)

code.ivr : $(SSL_FILES)
	$(RUNLRC) -3 $(SSL_FILES) $(RUNLRC_OPTIONS)

## Producing the LRC fragment to build XML trees.
block_xml.ssl : block.ssl
	../../Xml4Free/bin/xml4free -X block.ssl > block_xml.ssl

## Inducing the Schema for the Block abstract grammar
block_xml.scm : block.ssl
	../../Xml4Free/bin/xml4free -S block.ssl > block_xml.scm

clean :
	$(RUNLRC) -clean
	rm -f bcode.* code.* Save.*

Compiling the Block specification with this makefile, we produce a tool that computes (synthesizes) two more results: the XML representation (in attribute _lrc_xml_) and the DOT representation (in attribute xml_dot). See the generated file block_xml.ssl. Next, we run the produced tool with the example sentence and display the results.

eva -O_lrc_xml_ input

[jas@p98 doc]$ ./eva -O_lrc_xml_ input
entering output descriptor: attr='_lrc_xml_' view='BASEVIEW' file='(null)'
/**** processing file input ****/

...

eva -Oxml_dot::block.dot input
dot -Tjpg -Gsize="7,7" block.dot > block_xml.jpg

9 Constructing a Programming Environment

A set of visual objects is pre-defined in the LRC prelude that can easily be included in the AG specification. Those objects correspond to standard graphical user interface objects, like menus, buttons, etc. A "textual object" that displays the unparsing of a syntax tree is used to offer more traditional textual editing. The unparsed representation of the syntax tree is induced by the unparse rules defined in the AG. Those rules are written in SSL code. The graphical objects can be combined horizontally or vertically in order to form the interface of the tool. Moreover, a set of properties can be defined for each of the objects, which specifies its size, colour, fonts, etc. In the next section, we describe some of these visual objects.

To illustrate how modern language-based environments are easily defined in LRC, we shall extend the specification of the Block language with the advanced graphical objects included in the LRC prelude and supported by the runtime system. To show the notation of the LRC specification language, we present the textual code exactly as written in LRC.

We start by defining a traditional language-based editor for the Block language. Such an editor displays a "pretty printed" version of the syntax tree that represents the input. The prelude of LRC contains the constructor Unparse to provide this facility. Thus, to obtain a language-based editor for Block, we simply need to extend our grammar with a

new attribute that synthesizes the visual object, where the constructor Unparse is used to compute values of this visual object. Next, we present the LRC specification (left) and the synthesized visual object (right) that is presented to the user as one of the results of processing the Block example sentence.

Prog { syn lrc_visuals guiObjects };
Prog : Root
{
  Prog.guiObjects =
    let editor_frame = Unparse(&Stats.ast) ;
    in (Toplevel(editor_frame,"editor","Block Editor")::LrcVisuals0);
};

The root symbol Prog synthesizes an attribute that represents the list of graphical user interface objects defined in the attribute grammar. The type lrc_visuals is the pre-defined type for the visual objects to be displayed by LRC. The constructor Unparse takes as argument a reference to the abstract syntax tree to be displayed. This tree is displayed according to the unparse rules defined for each of its productions. As a result of using the Unparse constructor, the editor of the Block language allows the user to perform traditional language-based editing. A Toplevel constructor displays a frame in a window. It takes three arguments: the frame, a name and the title. The title is displayed at the top of the window. The name must be unique within a list of toplevels for one tree. We can also specify the size of the frame where the syntax tree is to be displayed. We define a frame with 40 columns and 20 lines with the Size constructor as follows:

editor_frame = Size(Unparse(&Stats.ast),40,20)

The root symbol of the Block AG synthesizes the attribute errs. So, let us define a visual object to display this attribute as well. We have two alternatives here. Either we

display the errors within the abstract syntax tree, or we display them in a different frame. In the first case, the language-based environment displays the value of the attribute in the program being manipulated, i.e., it points the user to where, in the input text, the error occurs. This is the approach we follow in the editor that we will present in Figure ???. In the example we present here, we take the second approach, i.e., we define a new Unparse constructor to display the unparsing of attribute errs. Furthermore, we combine both unparsing frames into one by using the VList frame combinator. Next, we show the let expression only:

let editor_frame = Unparse(&Stats.ast) ;
    error_frame  = Unparse(&Prog.errs) ;
    vcomb_frame  = VList (editor_frame :: error_frame :: Frames0);
in (Toplevel(vcomb_frame,"edit","Block Editor")::LrcVisuals0);

We proceed now to define a more advanced visual object. We add a push button to the user interface in order to allow the user to add statements to the input Block program by pressing a simple button. We use, in this case, the constructor PushButton of the LRC prelude. This constructor has a single argument: the string to be displayed in the button. Next, we extend the specification with a local attribute defining a push-button. The Block environment shown on the right is the result of combining vertically the push-button with the previous two unparsing frames.

Prog : Root
{
  local frame button_add_entry;
  button_add_entry = PushButton("Add Statement");
};

The PushButton constructor displays a push button only. To assign an action to the displayed button, we have to specify such an action in the AG. The LRC prelude includes a set of pre-defined event-handler constructors to specify how user interactions with the graphical objects are translated into changes in the object being edited. The constructor ButtonPress is the event-handler associated with PushButton. Next, we show a possible action associated with this event-handler.

Its : NilIts
{
  bind on ButtonPress("Add Statement") :
    &Its -> ConsIts (Decl("a"),NilIts);
};

The LRC bind expression is used to specify how user interactions are handled by the language-based environment. In this case it simply defines that every time the push button "Add Statement" is pressed, the subtree rooted at Its is transformed into ConsIts(Decl("a"),NilIts). Note that this event-handler constructor is defined in the context of a NilIts production. Thus, a new declaration is added at the end of the input program. We have presented two ways to specify the user interaction with the environments produced by LRC, i.e., by traditional editing within the Unparse constructor, and by direct manipulation of pre-defined visual objects. There is a third way to specify user interactions: by defining tree transformations. Tree transformations are a feature provided by the SSL language. With each production of the grammar (i.e., constructor) we can associate an expression defining how the production is transformed. Next, we associate several transformation rules with non-terminal It. The first rule, for example, defines that if the user selects the transformation named "USE" in the context of a Decl production, then the production is transformed into a Use production. The identifier declared/used in the statement is represented by the pattern variable n and does not change. In the environments produced by LRC the list of possible transformations is presented to the user in a pop-up menu, as shown in the environment on the right.

transform It
  on "USE " Decl(n): Use(n),
  on "BLOCK" Decl(n): Block(ConsIts(Decl(n),NilIts)),
  on "DECL " Use(n) : Decl(n),
  on "BLOCK" Use(n) : Block(ConsIts(Use(n),NilIts))
  ;

The visual objects defined in the LRC prelude, and the different ways of interacting with the tools generated by LRC, make it possible to easily define powerful interactive systems with LRC.

10 Bibliography

• João Saraiva and Matthijs Kuiper. Lrc - A Generator for Incremental Language-Oriented Tools. 7th International Conference on Compiler Construction, CC/ETAPS'98, Kai Koskimies (ed.), LNCS 1383, 1998.

• Doaitse Swierstra, Pablo Azero and João Saraiva. Designing and Implementing Combinator Languages. Third Summer School on Advanced Functional Programming, LNCS 1608, 1998.

• João Saraiva and Doaitse Swierstra. Data Structure Free Compilation. 8th International Conference on Compiler Construction, CC/ETAPS'99, Stefan Jähnichen (ed.), LNCS 1575, 1999.

• João Saraiva, Doaitse Swierstra and Matthijs Kuiper. Functional Incremental Attribute Evaluation. 9th International Conference on Compiler Construction, CC/ETAPS'00, David Watt (ed.), LNCS 1781, 2000.

• João Saraiva and Pablo Azero. Component-based Programming for Attribute Grammars. Joint Conference on , Luís M. Pereira and Paulo Quaresma (eds.), 2001.

• Maarten Pennings. Generating Incremental Evaluators. 1994.

• João Saraiva. Purely Functional Implementation of Attribute Grammars. 1999. [PS] ftp://ftp.cs.uu.nl/pub/RUU/CS/phdtheses/Saraiva/
