Proceedings of the 7th USENIX Tcl/Tk Conference

Austin, Texas, USA, February 14–18, 2000

U S I N G T L T O B U I L A B U Z Z W O R D * C O M P L I A N T E N V I R O N M E N T T H AT G L U E S T O G E T H E R L E G A C Y A N A L Y S I S P R O G R A M S

Carsten H. Lawrenz and Rajkumar C. Madhuram

THE ADVANCED COMPUTING SYSTEMS ASSOCIATION

© 2000 by The USENIX Association. All Rights Reserved. For more information about the USENIX Association: Phone: 1 510 528 8649; FAX: 1 510 548 5738; Email: [email protected]; WWW: http://www.usenix.org. Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes.This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. Using Tcl To Build A Buzzword* Compliant Environment That Glues Together Legacy Analysis Programs

Carsten H. Lawrenz, Rajkumar C. Madhuram, Siemens Westinghouse Power Corporation, Orlando, Florida {Carsten.Lawrenz,Rajkumar.Madhuram}@swpc.siemens.com

Abstract. The Siemens Integrated Design (SID) Environment is a system that allows engineers to link together many legacy computer programs. This capability provides significant reduction in effort for defining the conceptual design of electrical generators. The SID environment is a generic tool for running all types of analysis programs (methods) as well as managing their associated data. Methods are plugged into the environment in a simplified fashion by using a well-defined interface. Any features that are added to the environment immediately benefit all methods. Data can be shared between remote sites through an in-house developed, java based, replication server. This paper discusses how Tcl was used to develop the SID Environment and why it was the best choice for our application.

* Buzzwords: Scalable, Multi User, Client Server, Distributed, Cross Platform, Customizable.

1. Introduction platforms were limited. We evaluated a third party GUI library (Galaxy) which was C based. We found its use 1.1 Business Case of coding too difficult to develop a prototype quickly. Java was still only known as Coffee. By chance, the Siemens Westinghouse relies on various complex authors discovered Tcl/Tk. Although Tcl was not computer calculations for designing electrical officially available on Windows at the time, various generators. Various computer programs (methods) Windows versions did exist. written in and other languages have been used over time. Because each method addressed Siemens Westinghouse designed an architecture that specific aspects of generator design, they existed as enabled methods to be easily plugged in, instantly standalone entities. Running each method required the sharing data among other methods. Using Tcl lent itself manual creation of input files and the manual extraction to fast prototyping and development. Six months into of output data to prepare it as the input for other development, engineers were able to start using the methods. More complications arose when trying to newly created environment. incorporate the more contemporary functionality of spreadsheets. This process, and its many iterations, Processes when performed by several different design teams, Distributed Data resulted in several manual hand-offs of information. TCL Code Server JAVA Code Furthermore, this created an immense paper trail, data SQL Server RPC Calls Tcl blend integrity issues, and time and effort spent trying to keep Remote CORBA Replication Distributed the design teams synchronized. SID Environment Server Data Servers JDBC Mini SQL There was an effort to reduce the time required to Desktop

Variables conceptually design an electrical generator from 180 Run Server Server Days and 10 engineers to 18 Days and 3 engineers. The File Server

Xess Server File Agent project was incrementally funded; therefore quick turn Fortran codes Xess around time was required. Variables Data Method Output Although most of our user platform was UNIX based, we knew we may want to use the Windows platform some time in the future. The computer languages with Figure 1: SID Architecture GUI support that ran at the time (1995) on both Figure 2: Snapshots of the SID Environment (Left: Case Manager, Input Screen, and Variable table. Right: Dynamic Input Screen, Data Flow and Process Maps)

1.2 Basic Usage updates the variables within the scope of the case. This transfer of data is accomplished by the use of input and The SID Environment simplifies the conceptual design output forms which are detailed below. process by automating and linking manual tasks. Users attend a one day training class during which theyÕre The environment also provides many utilities for IDÕs are registered in the SID database. Once viewing the various output types (PDF, postscript, registered, typing ÔsidÕ in a command window launches plots, HTML) created by the methods. All output the environment. viewing is launched from a data viewer tool. This allows the less computer savvy users to be very Each user works in their own unique data set that we productive. call a case. These cases are collected in folders and are presented in a hierarchical format similar to a file 2. Architecture manager. The folders and cases also mimic UNIX type file access permissions, which the user can modify. The architecture of SID Environment is outlined in Any a number of methods are associated with each Figure 1. The SID environment is a collection of case. The user opens a case by selecting a method. The servers and the desktop client. The desktop client case is then locked to disable access by other users. (shown in Figure 2) handles most of the user interactions with the system. It communicates with The case is displayed to the user as a window. Within several other servers for performing various functions. this window the input screens for each associated method may be displayed. A method screen, by default, is a table of variable names. The environment allows customized input screens for each method as well. Users can select on any variableÕs entry box and change its value. Validations of input data for each variable are provided. A popup menu provides descriptions for each variable. The variablesÕ descriptions, units and type are all defined in a global variable file.

The scope of each variable is global within the case, which allows methods to communicate. The user may select to run one method or several methods in series. As each method runs, the environment automatically with the data server to access the replication server objects.

metho case folder It was also required to port the system to Windows NT. In order to give uniform access to the case data, a file name oid type parent agent was required. It handles data movement to and É name from a case on behalf of the user into the central data É oid = parent repository. This mechanism also provides an added level of security, since all the data is owned by the file agent and access is only allowed through this file agent.

case metho 3. Software Development oid caseid = oid mname = mname caseid 3.1 Debugging É The fact that Tcl is as fully interpreted language lends oid = oid oid = oid itself well to software development. Many fault Tcl because it does not provide syntax checking like other languages such as C. object However, in the Tcl mode of programming the oid É developer is able (with tools like TkInspect) to dynamically modify the code while the program is running. Furthermore, modified code can easily be reloaded into the interpreter by re-sourcing the code Figure 3: E-R diagram for case and without having to exit the program. Also, bug fixes can method tables be copied out to the production area without having to take the whole environment down. We found this approach to programming to be much quicker than the Variables are the most widely used logical entities in code, compile and debug method. the system. They are stored in a data file in key-value ordered pairs. Each method takes input from a set of TclÕs error handling is more graceful than C or Java variables and generates output that is returned. For also. Most errors arenÕt fatal, i.e. the environment every design calculation or analysis, engineers use a set usually does not crash when errors are encountered. of methods, which we group as a case. In essence, a We overrode the error handler with a dialog box, that case is the set of methods and their related variables. allows the user to email the developer a description of what caused the bug and a stack trace. This usually Cases have to be managed in a reliable and fail-safe provides the developer enough information to manner, especially in a multi-user environment. Hence, determine the cause of the error. we employed a mini (mSQL) database [Hugh99] to hold the information about the cases, methods, users 3.2 Coding and the relationships between them. Cases are organized in the fashion of a UNIX file system, with a SID requires relatively few lines of Tcl code. This is hierarchy of folders and permissions to control access. advantageous because the development team is only Each case and each method in a case are uniquely allotted 1_ man years per year for environment identified by an object id (oid) and its parents oid. The maintenance. The SID environment contains over 115 relations between these entities are shown in Figure 3. thousand lines of code and is maintained by only two As the need for exchanging data between engineers in developers. This smaller body of code also lends itself remote sites surfaced, we wanted some of the folders to to utilities such as Concurrent Versioning System be shared between all the sites. In order to implement (CVS). Much of the coding is almost self- such a distributed system, we decided to use CORBA documenting; understanding code logic comes quickly. because it has established itself as a reliable and robust It's conceivable that to achieve this same functionality way to build distributed applications [Wolf98]. A using C or Java could require about ten times as many replication server was coded in Java, which uses JDBC lines of code. to communicate with the database. Tcl blend was used The Tcl auto loading of methods also helps developers organize their source files in a comprehensive manner either by components or functionality. In the SID environment, our source directories are several levels Some methods are preprocessors to complex Finite deep. Element Analysis (FEA) models. These methods have customized input screens with simplified graphics of 3.3 Ease of Learning the various model components. These graphic elements may be adjusted manually to modify variables or they The development team relied on contract labor to help can redraw themselves to reflect the value of variables meet the demands of the project in the early stages. (See graphic input screen in Figure 2). The developers that were hired to program the environment did not have any prior knowledge of Tcl/Tk. However, it was easy to get motivated programmers up to speed quickly and start MDI development. One observation is that experienced C programmers do not necessarily make good Tcl create, tile, cascade, programmers. Strings and lists manipulations are so wallpaper, É powerful and easily handled in Tcl. Tcl programming requires users to accept a paradigm shift while solving 1 problems.

3.4 Reusability * MDIWindow Adding a method to the environment requires insertion of a record in the method table (Figure 3) and the close, iconify, restore, print, creation of a file (method.screen) that lists all of the move, É input and output variables used by that method. SID parses the list, and by default, a generic input screen is created for that method (See input screen in Figure 2). Several other applications source this same file. For Figure 4: Tix MDI classes used in SID example, there is a web server script that reads this file to create an input/output dictionary the methods online documentation. There is also an administrative application that reads every methods screen file and We also created our own Multiple Document Interface creates a matrix of all variables and how they are used (MDI) widgets on top of the Tix framework. The MDI by each method. This matrix allows SID to determine system consists of the MDI widget and the if other method's data has been invalidated due to the MDIWindow widget (Figure 4). The MDI widget is a change of a variable's value. This simplifies method container widget that contains MDIWindow widgets. administration because all information is kept All the tools within SID environment are built using the consistent and concurrent. MDIWindow widget. The advantages of such a scheme are many: 1. It enforces uniformity in all the windows and provides access to features provided by the MDI 4. Most used Features of Tcl system and 2. It provides a compact environment where all SID related components are held together. 4.1 Graphical User interface Users can create new customized input screens. By The SID environment relies heavily on GUI placing local versions of the screen files in a pre- components. The environment creates a large number defined directory. These files override the production of entry widgets, which are used for data entry. There version of the files. This allows developers to test are also a lot of bindings attached to these widgets to functionality without having to check out a entire local handle data validation. The GUI commands blend well copy of the environment. Once the new screen has been with the source code because of their simplicity tested it can be copied out to the production area to be compared to other languages such as Java [WeFr97]. shared by all users. Very complicated and large input screens are easily handled by the Tk extension, as also witnessed in other The user can customize the look and feel of the large applications [Angel98] [DeCl97]. Our own environment. There are many configurable options like experience has shown that dynamically creating an wallpaper, fonts, background color etc. In addition to input screen with over 100 entries in Java required minutes compared to seconds in Tcl. GUI, many options like viewers, sound etc., can be first substitute markers of the form @@Mark[nn]@@ customized. instead of newline characters, where nn stands for the current line number. We also handle line continuations 4.2 Strings and List Processing by introducing special markers. In the next phase, all the markers of the above form are substituted with a We use the string processing capabilities of Tcl in SID. macroPhase2 command and a check to see if it was It is easy to create meta-languages for various purposes. halted. The command macroPhase2 is called with the For example, we developed a "forms" language, which current line number and a pointer to the macro is Tcl with a few abstractions that provide a generic environment. It handles things like highlighting the interface to the legacy codes. An input form (iform) is a current line in the editor, handling break points and also template for creating input (Figure 5) and an output acts as a state machine to put the macro engine to the form (oform) is used to get the output values back into next state based on the user action (stop, reset, run etc.). SID. It is an easy and yet very powerful method of Since a macro is typically a small script (less than 100 parsing the input and output. lines), the pre-processing is fast (~0.5 secs/100 lines of code) and hardly noticeable. Tcl enabled us to create a Another example where the string processing fairly robust macro system within a matter of days, capabilities were heavily used was in creation of a which would not be possible had we chosen a different language.

SIDIFORM Input File Large lists are handled efficiently in Tcl from version

var PF 8.0 onwards. Consequently, we found a tremendous MVA=150 var 0.89 increase in performance of SID. The number of PF=0.89 Ðcomma\ 150,173.22 Q=173.22 MVA Q 89.22 variables in a typical case can go above 5000. We store V(1)=89.22 varray V 67.27 V(2)=67.27 É. É these variables in global arrays using array set É. command, which is many times faster than iterating through the list and storing them individually.

Figure 5: Mapping of SID data to Input File 4.3 Global Variables

Many of the routines in SID rely on pointers, which are macro language. A macro is a sequence of SID actions actually references to global arrays. Pointers are useful that an engineer can use for design work. We created an in creating complex data structures. They also provide environment where macros can be edited and run, an extremely convenient and clean way of keeping complete with debugging options like stepping, watch track of state information within an environment as and breakpoints. The language of choice for the macro large as SID. We use a single global variable GV that was obviously Tcl and we supplemented it with some provides pointers to all the information about the commands like RUN, GRAPH, CALL etc., to provide current state of the environment. This makes it easy to some higher level abstraction. For this, we used regular organize the data in a hierarchical structure. For substitutions (regsub command). We avoided using example, in order to determine if a module named slave interpreters since a tight integration with the data tgs8000 has valid data in the active case, we could in the environment was necessary. traverse like this

In order to perform dynamic highlighting when a macro deref $GV(active_case) case; is run and also for debugging, a parser is needed. deref $case(module_ptr) module_list Conventionally, one would use lex and yacc to specify deref $module_list(TGS8000) module the grammar and then compile it with C to get a parser. if $module(valid) { However, we decided to use the powerful regsub É... command in an iterative fashion. Whenever a macro is } run, it is first pre-processed in a two-phase method. We The pointer mechanism comprises commands struct, The canvas widget is the only one that provides alloc, free and deref. The struct command can drawing capabilities. Nevertheless, it is very powerful be used to create data types similar to that in C. Whenever the alloc routine is called, it creates a global variable of the form _mem where nn stands for a sequence number generated using a counter. A pointer is returned, which is of the form #_mem. The deref command creates an alias for the global variable referenced by the pointer. In the absence of object oriented constructs (i.e, without Itcl), the pointer mechanism provided a way of neatly packaging different components inside SID. Also, we wrote a simple script that would go through all the struct definitions and create html documents. It serves as a good reference document to the internal data structures of the system.

Y (0,1)

(0.5,0.65)

(0.425,0.5) (0.575,0.5)

(0.5,0.35) (0,0) X deref [set OBJ(junc) [alloc array]] JUNC ÉÉÉ set JUNC(connPoints) {{0.5 0.35 v} {0.425 0.5 h} {0.5 0.65 v} {0.575 0.5 h}} set JUNC(connRules) { Figure 7: Snapshot of Graphical Input Screen {{in <= 3} {a maximum of only 3 inputs are allowed}} {{out <= 1} {you can have only one output from a junction box}} } and we used it in components like process maps, data flow diagrams and custom methods. Process maps are Figure 6: Specification of a process chart primitive basically flowcharts that represent an engineering process. The goal was to create and represent these processes inside the SID environment. First we investigated commercial charting programs, but these Variable tracing is another aspect of Tcl that we used in did not suit our requirements. Also, the Slate package SID. We use it in macros to associate the pseudo [RL98] was not available at that time. Therefore, we variable names with the real variables so that the decided to use the canvas widget to create one. We internal mechanisms are hidden from the user. It is used developed a set of primitives such as decision box, in several places when we need to fill certain lists so process box, sub-process box, etc. The whole process that it remains consistent with the context of the chart is represented as a flow chart structure. The sub- application. One interesting lesson that we learned was process primitives have pointers to the graphs of the to avoid variable tracing if it is needed only in certain corresponding sub-processes. Thus, the result was a instances when the variable in question is accessed. hierarchy of graphs for a given process. All of these Trying to have a global variable that flags the tracing on were neatly handled by the pointer mechanism that was and off creates problems that are hard to debug. One discussed earlier. instance is when an error causes the interpreter to spiral out of a routine before the flag is not restored to its Another interesting aspect of the process charts is the proper state. links that connect the different primitives. We introduced the notion of connection points, which are 4.4 Graphics Capability pre-defined positions around the primitive from which a link could be drawn. When a user drags the mouse to create a link, we search for the nearest connection point that is available. Also, every primitive has constraints Communication between the various networked on the connection points. For example, the start/end components is done using the Tcl-Dp package. The box could have only one output or one input and the RPC mechanism is robust and provides a simple decision box can have only one input and two outputs. interface for servers and clients. Again, the scripting nature of Tcl Òcame to the rescueÓ in describing and checking the constraints. For The servers within the SID environment are spawned example, the junction box object has a description on every invocation. It is not possible to assign fixed similar to the one shown in Figure 6. port numbers to each server since more than one user can be using SID (with remote displays) on a same machine. In order to solve this problem, whenever a The co-ordinates of the connection points are relative to server is spawned, it searches for a free port and stores a hypothetical 1.0x1.0 bounding box of the primitive. the port number information in a registry file. This file The v and h indicates that only a vertical or horizontal serves as a directory for the various network connection is allowed at that point respectively. While components. parsing the rules, "in" is substituted with the number of total inputs (+1 if the current connection point is being Interactions between the servers can be quite complex. considered for an input) and "out" is substituted Figure 8 shows the events that take place when the user correspondingly. The rule engine goes through each clicks on the RUN button. Communications between rule and evaluates the rule expression. If it fails, it servers take place through RPC and the status of other displays the corresponding error message and returns. processes are monitored by the UNIX signal trapping Tcl makes the process very generic and simple. One commands provided by extended Tcl. could even have rules such as "in+out <= 1" (in case of start/end box). The data server provides replication services. Because, replicating all the cases involves huge network traffic and is largely unnecessary, we identified a subset of Each of the primitives and links has a pointer to an folders that would be replicated. There are two array that contains information about the object. The tags for each are also named accordingly. The tag for a primitive may look like obj_0#_mem72 and that of a connection, like con_0#mem66. Using regexp and deref commands, we get a quick access to the Desktop properties of the object when it is selected for various 8. Fetch output variables reasons like cut/paste, delete etc. The editor also 7. Done provides powerful features like cut/copy/paste and 1. Run tgs8000 undo/redo operations.

Once we got the process map editor in place, Run Server incorporating it into the environment was simple. Next, 2. Spawn run we were also able to make the process "run". A flashing file for tgs8000 circle beside the primitive indicates that it is currently executing. The handling of multiple charts in each case 6. Done was done using our MDI window widget.

Run File We use the BLT graph widgets for plotting graphs Variable Server 5. Process within the environment. The package PlPlot [PL99] is Output used to generate high quality engineering graphs variables without having the need to display on the screen. 3. Run method 4. Done (exec) (SIGTERM, We also provide graphical input screens, which are SIGKILL,,,) entry widgets overlaid on a gif image of a drawing tgs8000 (Figure 7). This provides an intuitive interface, especially for the novice users to enter values for various input variables in the method. Figure 8: Sequence of events during a method run 4.5 Networking/Communications replication modes, synchronous and asynchronous. In our goals. We could create prototypes quickly and synchronous replication, the action has to be completed incorporate them into the environment. Relying on in all sites before proceeding. For example, when many extensions can be a setback at times, especially someone tries to open a case, it needs to be locked. The when migrating to new versions of Tcl. But it certainly sql command to lock the record is passed on to all the was not a showstopper since source code was freely sites. In each site, the command is supplied to the msql available. SID was started with version 7.3 and now daemon, via JDBC. Only when it is successful in all runs under 8.0.4. If we were to embark on another sites, the locking is valid. On the other hand, project of this scale today, itÕs our opinion that we asynchronous replication passes on the command and would use Tcl again as opposed to the current trend simply returns. Each site holds a command queue (java towards JAVA. layer) which keeps trying until the operation succeeds. Replication of case data is handled using FTP. We Acknowledgements tested the replication between three sites (Orlando and Charlotte in the U.S and Muelheim in Germany). We found that the network bandwidth, which we hope will We thank the open source community, the various tcl improve, was the only bottleneck. It works extension authors and Scriptics, who laid the exceptionally well and as planned, a testimony to Tcl's foundations on which we could build a successful claim as a glue language. We use xess spreadsheets project of this magnitude. [Ais99] for some of our methods. There is a Tcl interface for the xess spreadsheets that we use to References communicate between the spreadsheets and the SID environment in the unix platform. We communicate [Ais99] Applied Information Systems home page with Microsoft Excel spreadsheets on NT using the http://www.ais.com package tcom that exposes COM interfaces. We use a Valid as of 09/01/1999. Tcl implementation of SMTP that we found in the newsgroup for all our mail purposes within the system. [Angel98] Angelovich, Kenny and Sarachan, "NBCs It works across platforms and is reliable. Genesis Broadcast Automation System: From Prototype to Production", Proc. Sixth 4.6 Cross Platform Deployment Annual Tcl/Tk Conference, pp. 1-9, San Diego, Calif.:USENIX, 1998. We recently started porting the SID environment to NT after all of the required extensions were available. [DeCl97] De Clarke, "Dashboard: A Knowledge- Surprisingly, there were few coding changes required to Based Real-Time Control Panel", Proc. bring up the environment. All GUI components Fifth Annual Tcl/Tk Workshop, pp. 9-18, worked extremely well on the first attempt. Early in the Boston, Mass.:USENIX, 1997. development, we made a conscious effort to remove as many exec commands with in the Tcl code. The [Hugh99] The mSQL Home Page, majority of changes were related to file permissions and http://www.hughes.com.au sharing between UNIX and NT. This problem was Valid as of 09/01/1999. tackled by using SAMBA and by writing a small File Agent that handles most file duties on behalf of the [PL99] The PLPlot Home Page, user. This gave us an added benefit, the File Agent http://emma.la.asu.edu/plplot becomes the owner of all files in the repository thereby Valid as of 09/01/1999. restricting access by general users. [RL98] Reekie H.J. and Lee E.A, "The Tycho We did notice some reduced performance when Slate: Complex Drawing and Editing in creating large input screens running on a Pentium II Tcl/Tk", Proc. Sixth Annual Tcl/Tk Windows machine. Also, some of our input forms Conference, pp. 37-46, San Diego, would not display properly unless we added a few well Calif.:USENIX, 1998. placed update commands. [WeFr97] Webster T. and Francis A., "Tcl/Tk for 5. Conclusion Dummies", pp. 324, IDG Books, 1997. Our experience has been that weÕre able to add any desired feature into SID with Tcl. Most of our design was incremental and Tcl provided the flexibility to meet [Wolf98] Hans K.Wolf, "Java, CORBA, and Archit- ecture", Component Strategies, September 1998, pp. 58-64.