
Chapter IV

ULTRAMAN ARCHITECTURE

Introduction

In previous chapters, we have introduced and motivated the ideas of a transformational approach to generating user interfaces. Throughout this dissertation we will discuss interfaces as views. For us, a view is an interface which is either the source or result of a transformation. In terms of granularity, we expect a view to be roughly equivalent to a window in a traditional desktop-metaphor window system. This chapter will introduce the mechanisms that this research has developed to perform these view transformations and give the reader a broad overview of the architecture of our technology. Future chapters will take up the details of the various subsystems.

FIGURE 4-1 Television Character

The system we have constructed is called Ultraman. This name was chosen for two reasons. First, Ultraman was the favorite television show of the author as a child. Second, the main character in the television show underwent a transformation from a normal person to the superhero, Ultraman. (It has subsequently been revealed that the intent of the show’s creators was that the normal person conjured Ultraman, but this was never made clear to the author as an eight-year-old. This confusion was widely shared by others.) As has been explained previously and will be explained in more detail later, the notion of transformation is crucial to our system. A picture of the television character Ultraman can be found in Figure 4-1. We will refer to the overall system of technology as Ultraman and specify its parts, such as the tool, when necessary to avoid confusion.

FIGURE 4-2 Timeline Of Events When Using Ultraman

In thinking about Ultraman’s internal architecture, there are two distinct “times” that are important: generate-time and run-time. In having these two times, Ultraman is very similar to most other systems which generate code for use by an end-programmer in his or her application, for example a compiler. When we refer to actions the user takes or events that occur within Ultraman at generate-time, we mean the time when the end-programmer is designing the set of interface transformations. We call this generate-time because after the user has expressed his or her transformations to the Ultraman tool, the tool generates substantial amounts of code. In contrast, run-time is the time when the end-programmer’s application actually executes and the transformations occur. Clearly, run-time follows generate-time, since for an application to run using Ultraman’s technology, Ultraman must first generate code for use at run-time.

A timeline of the events that precede the completion of an application using Ultraman is shown in Figure 4-2. The timeline in this figure is not drawn to scale; the events are simply shown in their proper order. (The events involved with the design of an application on the left can take weeks, while the time to regenerate an interface on the right is only milliseconds.) In the generate-time portion of Figure 4-2 there are several decisions that are made by the programmer. The details of these decisions are in Chapter VIII; at this point what is important to understand is that these decisions are made by the programmer, concern how to use the Ultraman tool to achieve his or her ends, and occur before the application is run. On the right-hand side of the figure, it is worth noting that the process of rebuilding the interface occurs as many times as the user makes modifications, even though only one rebuild is pictured.

Generate-Time

Without a terrible loss of precision, generate-time can be considered to be the time when the end-programmer is manipulating the Ultraman tool, shown in Figure 4-3. A better definition of generate-time would almost certainly include other activities on the part of the end-programmer, such as determining what view should be transformed and into what new view, deciding what information will be necessary to perform the transformations, writing transformation actions (see “Writing A Transformation Action” on page 86), etc. These actions on the part of the programmer will be considered in more detail in Chapter VIII, but from Ultraman’s perspective there are two events occurring:

• The user is manipulating the user interface of the tool to inform Ultraman of the transformation of interest.

• The tool is generating code that implements the user’s desired transformation.

In upcoming chapters we will frequently refer to a pattern or a user-defined pattern. Such a pattern is a subtree like the one at the top of the screen dump shown in Figure 4-3. At run-time, Ultraman will search for patterns within the interface tree of some source view, as explained in “A More Formal Introduction” on page 12. These patterns are used to extract information from the source view. When using the Ultraman tool, these patterns are always expressed as a tree of nodes, where each node bears the label of a type of interactor.
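To make the notion of a pattern concrete, the following is a minimal sketch, assuming a hypothetical `InteractorNode` class whose only data is an interactor type label and an ordered list of children. The class names, the type labels, and the naive recursive matcher are all assumptions made for illustration; Ultraman itself locates patterns with a generated lexer and parser rather than with a direct tree comparison like this one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node type (an assumption): a label naming a type of interactor
// plus an ordered list of child nodes.
final class InteractorNode {
    final String type;
    final List<InteractorNode> children = new ArrayList<>();

    InteractorNode(String type, InteractorNode... kids) {
        this.type = type;
        for (InteractorNode k : kids) {
            children.add(k);
        }
    }
}

// Naive matcher used only for illustration; it is not the parsing-based
// search that the generated code performs.
final class PatternSearch {
    // True when the pattern has the same label as the node and the same
    // children, in the same order, all the way down.
    static boolean matchesAt(InteractorNode pattern, InteractorNode node) {
        if (!pattern.type.equals(node.type)) return false;
        if (pattern.children.size() != node.children.size()) return false;
        for (int i = 0; i < pattern.children.size(); i++) {
            if (!matchesAt(pattern.children.get(i), node.children.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Counts every node of the source view at which the pattern matches.
    static int countMatches(InteractorNode pattern, InteractorNode root) {
        int count = matchesAt(pattern, root) ? 1 : 0;
        for (InteractorNode child : root.children) {
            count += countMatches(pattern, child);
        }
        return count;
    }
}
```

A pattern and a source view are both built from the same node type, so a pattern is simply a small tree that is searched for inside a larger one.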

Generated code

After the end-programmer has completed his or her work defining the patterns of interest, the Ultraman tool goes to work generating code. The code generated is primarily two things: a specification for a lexical analyzer and a specification for a parser. Although Chapter V and Chapter VI will cover the approach used to do this in some detail, it is worth understanding at this point the overall direction of Ultraman’s approach. In principle, our approach is to utilize off-the-shelf parsing and lexing technology to achieve the goal of finding the patterns of interactors specified by the user within a tree of interactors. Toward this end, the Ultraman tool generates a particular specification of a lexer and a parser based on the set of patterns defined by the user. These specifications are given to the ANTLR [27] tool, which generates a Java-based lexer and parser for use with the end-programmer’s application. (We expect the end-programmer to link his or her application with the generated parser and lexer.) Although in truth ANTLR is generating the lexer and parser that are actually used, Ultraman ultimately controls the lexer and parser produced by writing the specification that ANTLR reads. Because of this, we will frequently refer to the “Ultraman-generated” lexer or parser to mean the lexer or parser generated by feeding the specification generated by Ultraman into ANTLR.

FIGURE 4-3 Screen Shot Of The Ultraman Tool In Use

When Ultraman generates a lexer or parser specification, there are parts of the specification that refer to user-defined code. Through the Ultraman tool, the end-programmer can express that certain actions (in the form of code) should be taken at various points in the transformation process as it occurs at run-time. Ultraman takes care to ensure that these events actually occur at the proper time by crafting the specification it generates for the intended effect. For example, if a designated piece of user code should run when a particular pattern is found, Ultraman emits a parser specification which includes calls to the user code at the proper points. This implies that the specifications Ultraman generates are tied fairly closely to user code, and that at run-time user code is being invoked during (or, perhaps better stated, intermingled with) the execution of the generated parser and lexer.

In the chapters of this dissertation dealing with lexical analysis and parsing, we will explain in detail how the code generator of Ultraman works. However, a philosophical point is worth noting here: we believe that the simplest way to engineer a code generator is to generate code in an object-oriented fashion “to” some existing, library-written class supplied with the tool. “To” here means that we intend to generate code which is a subclass of some well-known library class. In the case of Ultraman, the implication is that the generated lexer and parser are subclasses of classes that are part of the Ultraman library.
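The idea of generating code “to” a library class can be illustrated with a small sketch. Everything named here is hypothetical: `LibraryParserBase` stands in for a hand-written class shipped with a tool, and `GeneratedParser` stands in for code a generator would emit. The real Ultraman class names and the real division of work between base class and subclass are not given in this chapter, so this shows only the general shape of the technique.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical library class (an assumption): written once by hand and
// shared by every application.
abstract class LibraryParserBase {
    // Library-owned driver loop. The generator never has to emit this logic.
    final List<String> run(List<String> tokens) {
        List<String> log = new ArrayList<>();
        for (String token : tokens) {
            if (isPatternStart(token)) {
                onPatternFound(token, log);
            }
        }
        return log;
    }

    // Hooks that the generated subclass fills in.
    protected abstract boolean isPatternStart(String token);

    protected abstract void onPatternFound(String token, List<String> out);
}

// Stands in for generated code: only the application-specific pieces appear
// here, so the generator's output stays small.
final class GeneratedParser extends LibraryParserBase {
    @Override
    protected boolean isPatternStart(String token) {
        return token.equals("Row");
    }

    @Override
    protected void onPatternFound(String token, List<String> out) {
        // In a real system this is where user-supplied code would be called.
        out.add("user action for " + token);
    }
}
```

Under this arrangement a change to the shared driver logic can be made in the library class without regenerating any application code, which is one reading of the flexibility argument made above.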

Run-Time

FIGURE 4-4 Function Of The Ultraman Run-Time

Figure 4-4 represents the overall function of Ultraman at run-time. Ultraman’s built-in library plus the lexer and parser produced at generate-time form the curved arrow seen in the middle of this figure. This arrow converts a view, expressed as the tree on the left, into another view, expressed as the tree on the right. These trees are the programmer’s view of an interface, that is, the mechanism by which interfaces are built with code. The small insets behind each tree represent the end-user’s view of the application: a display that appears on the screen.

Let us consider Figure 4-4 at some point during run-time. Both interfaces, the source interface on the left and the generated interface on the right, are present on the screen and not currently changing. As soon as the end-user makes a change to the source interface, Ultraman swings into action. First, the lexical analysis part of the system converts the left tree into a stream of input for the generated parser. Then the parser searches the tree for the patterns of interest and invokes the proper segments of the end-programmer’s code as each pattern is found. Other parts of the input tree are ignored. In the end-programmer’s code, he or she will be creating subtrees which will form the basis for the new generated interface on the right. The end-programmer’s code will, from his or her point of view, be adding these new subtrees to the point in the right-hand tree where they belong. Again, from the end-programmer’s point of view, the old contents of these subtrees are removed before parsing begins. Once parsing is completed, Ultraman’s state preservation code is invoked to attempt to preserve state in the old version of the right-hand tree. By applying some novel techniques to the programmer’s output nodes, we perform a merge of the old and new trees, yielding a new result for the right-hand tree. It is this tree that is finally updated on the screen in response to the end-user’s change to the left-hand tree.

At run-time a significant amount of specialized code is being run, that is, code that only makes sense for the given application. This includes the generated lexer and parser as well as any code that the end-programmer writes that is specific to the semantics of his or her application. However, there is also code that we call the “library code,” which is part of the Ultraman system and is the same for all applications. Again, although these two types of run-time code are different in spirit, during the actual execution of the program the library code makes calls into the user code and vice versa.

Library Code

As we explained in “Generated code” on page 57, we believe on philosophical grounds that the library of objects supplied with Ultraman should include base classes on which the generated code is based. This code is clearly part of the library code used by an application to do its transformation(s), and it is the same throughout all Ultraman applications, but it can be effectively ignored by the user. It is provided simply to make lexer and parser generation easier for the code generator and to allow more flexibility if changes are needed later.

However, much more important to the “library” shipped with Ultraman is the code involved with state preservation. The problem of state preservation is fully explored in the chapter “State Preservation” on page 73 and is previewed in the introductory chapter of this document. Briefly, it can be understood as the problem of preserving state between successive runs of the Ultraman transformation system. In other words, if we are transforming view A into view B repeatedly, how does any state currently encoded in B survive the next transformation? Will such state be lost? Ultraman employs a sophisticated mechanism to ensure that in most cases the state will be preserved.

Ultraman’s library code includes a special class designed for the purpose of dealing with the state preservation problem, the ShadowObject. One of the ShadowObject’s jobs is to mimic the API of an interior node of the interactor tree. In particular, the ShadowObject is mimicking a particular object in the interactor tree which the end-programmer has designated previously (at the time the application is initialized). From the end-programmer’s point of view, the ShadowObject is the interior object; it has the same API and is virtually indistinguishable from the object it is “shadowing.” Because operations which modify the interactor tree are being done to the ShadowObject (although to the end-programmer it might as well be the actual interior node), the ShadowObject has the opportunity to become aware of all changes that are being made and, in effect, keep an “extra” copy of part of the interactor tree. This is an extra copy because the actual interactor subtree is kept by the actual interior node. By using its knowledge of the interactor tree changes and having two copies of the tree temporarily (the extra is destroyed after each transformation), the ShadowObject has an opportunity to implement state-preservation algorithms.
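The shadowing arrangement described above can be sketched roughly as follows. The `InteriorNode` interface, its three methods, the use of plain strings as children, and the equality-based notion of a surviving child are all assumptions made for illustration; the actual ShadowObject API and its merge algorithm are the subject of the state preservation chapter and are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interior-node API (an assumption, not the real toolkit API).
interface InteriorNode {
    void addChild(String child);
    void removeAll();
    List<String> children();
}

// The actual interior node, which holds what is currently on the screen.
final class RealInterior implements InteriorNode {
    private final List<String> kids = new ArrayList<>();
    public void addChild(String child) { kids.add(child); }
    public void removeAll() { kids.clear(); }
    public List<String> children() { return new ArrayList<>(kids); }
}

// Exposes the same API as the node it shadows, but records the new contents
// in an extra copy instead of touching the real node immediately.
final class ShadowSketch implements InteriorNode {
    private final InteriorNode real;
    private final List<String> pending = new ArrayList<>();

    ShadowSketch(InteriorNode real) { this.real = real; }

    public void addChild(String child) { pending.add(child); }

    // The caller believes the old contents are gone; the real node keeps them.
    public void removeAll() { pending.clear(); }

    // During a transformation the caller sees only the new contents.
    public List<String> children() { return new ArrayList<>(pending); }

    // Called once per transformation, while old and new copies both exist.
    // Returns the new children that also appeared in the old contents, which
    // is a stand-in for the real state-preserving merge.
    List<String> commit() {
        List<String> old = real.children();
        List<String> survivors = new ArrayList<>();
        for (String child : pending) {
            if (old.contains(child)) survivors.add(child);
        }
        real.removeAll();
        for (String child : pending) real.addChild(child);
        pending.clear(); // the extra copy is destroyed after each transformation
        return survivors;
    }
}
```

The point of the sketch is the timing: between `removeAll()` and `commit()` both the old subtree (in the real node) and the new subtree (in the shadow) are available, which is the window in which a merge can be computed.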

Technical Chapters Preview

There are three primary technical chapters in this document, Chapter V, Chapter VI, and Chapter VII. These three chapters explore the issues of lexical analysis, parsing, and state preservation respectively. The lexical analysis chapter, beginning on page 61, deals with the process of converting a tree of interactors into a suitable input for a parser. This problem requires that the two-dimensional nature of a tree be converted into a linear stream of input for the parser to consume. This chapter details the particular algorithms used to produce the stream of input to the parser. Finally, this chapter concludes by considering the issue of using the Ultraman technology with a user-defined model. In this circumstance, the end-programmer must supply a lexer capable of converting his or her model into a form suitable for the parser and the lexical analysis chapter gives an explanation of this procedure and an example of how this is done.
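One common way to turn a tree into a linear stream is a preorder walk that emits explicit markers on entering and leaving a node's children. The sketch below uses that scheme with hypothetical `ViewNode` and `TreeLinearizer` classes and with the marker names `DOWN` and `UP`; these are assumptions for illustration only, and the algorithms actually used by Ultraman are those detailed in the lexical analysis chapter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tree node (an assumption): an interactor type label and
// an ordered list of children.
final class ViewNode {
    final String type;
    final List<ViewNode> children = new ArrayList<>();

    ViewNode(String type, ViewNode... kids) {
        this.type = type;
        for (ViewNode k : kids) {
            children.add(k);
        }
    }
}

final class TreeLinearizer {
    // Flattens a tree into a token list using a preorder walk.
    static List<String> tokens(ViewNode root) {
        List<String> out = new ArrayList<>();
        emit(root, out);
        return out;
    }

    private static void emit(ViewNode node, List<String> out) {
        out.add(node.type);
        if (node.children.isEmpty()) {
            return; // leaves need no markers
        }
        out.add("DOWN"); // entering this node's children
        for (ViewNode child : node.children) {
            emit(child, out);
        }
        out.add("UP"); // leaving this node's children
    }
}
```

The markers preserve the nesting that a bare list of labels would lose, so a parser reading the stream can still distinguish a child from a sibling.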

The chapter on parsing begins on page 72 and has two primary points of focus. First, given that the run-time lexer has converted the input tree into a one-dimensional stream of tokens, how does the parser actually find the patterns of interest while ignoring the rest of the input? This problem is complicated by the fact that the user has significant power in expressing the patterns of interest, including using powerful operators such as the Kleene star. Underlying the workings of all this machinery is the grammar which drives the parser, and that grammar is generated by the Ultraman tool. This chapter explores the algorithms used to turn the user patterns expressed in the tool’s graphical interface into a grammar. Finally, this chapter addresses the issue of application “linkage,” or how the particular application’s semantics, in the form of user code, is called and controlled by Ultraman’s parser generator.

One of the most intriguing aspects of Ultraman is its ability to preserve state across multiple runs of the system’s transformation engine. The chapter which explores the techniques and algorithms used to accomplish this, both in terms of Ultraman’s technology and the user’s approach, starts on page 73. This chapter explores the crucial idea of value numbering, which is the process of assigning labels, each referred to as a “number,” to particular nodes in a tree. At run-time, we utilize this technique to find “like” nodes in successive generations of Ultraman-generated trees. The process of performing this value numbering and related algorithms is manifested in the idea of the ShadowObject, which was discussed above.
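As a rough illustration of value numbering on trees, the sketch below assigns the same number to any two subtrees that have the same label and the same sequence of child numbers. This is the textbook form of the technique with hypothetical class names (`VnNode`, `ValueNumbering`); it is an assumption that it resembles Ultraman's variant, which is described in the state preservation chapter and may number nodes differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tree node (an assumption): a type label and ordered children.
final class VnNode {
    final String type;
    final List<VnNode> children = new ArrayList<>();

    VnNode(String type, VnNode... kids) {
        this.type = type;
        for (VnNode k : kids) {
            children.add(k);
        }
    }
}

final class ValueNumbering {
    // Maps a structural key (label plus child numbers) to its number.
    private final Map<String, Integer> table = new HashMap<>();

    // Returns the value number of the subtree rooted at the given node.
    // Subtrees with equal labels and equal child numbers share a number.
    int number(VnNode node) {
        StringBuilder key = new StringBuilder(node.type).append('(');
        for (VnNode child : node.children) {
            key.append(number(child)).append(',');
        }
        key.append(')');
        Integer existing = table.get(key.toString());
        if (existing != null) {
            return existing;
        }
        int fresh = table.size(); // next unused number
        table.put(key.toString(), fresh);
        return fresh;
    }
}
```

Numbering an old tree and a newly generated tree with the same `ValueNumbering` instance gives matching numbers to structurally identical subtrees, which is one way “like” nodes in successive generations could be paired up.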