• RAID Array Controllers • Workflow Models • PC LAN and System Management Tools

Digital Technical Journal Digital Equipment Corporation

Volume 6 Number 4

Fall 1994 Editorial Advisory Board Jane C. Blake, Managing Editor Samuel H. Fuller, Chairman Kathleen M. Stetson, Editor Richard W Beane Helen L. Patterson, Editor Donald Z. Harbert Circulation William R. Hawe Richard]. Hoi I ingsworth Catherine M. Phillips, Administrator Richard Lary Dorothea B. Cassady, Secretary F Alan G. Nemeth Production Jean A. Proulx Terri Autieri, Production Editor Robert M. Supnik Anne S. Katzeff, Typographer Gayn B. Winters Peter R. Woodbury, Illustrator The Digital Technicaljournal is a refereed journal published quarterly by Digital Equipment Corporation, 30 Porter Road '"102/D 10, Littleton, Massachusetts 01460. Subscriptions to the journal are $40.00 (non-U.S. $60) for four issues and $75.00 (non-U.S. $115) for eight issues and must be prepaid in U.S. funds. University and college professors and Ph.D. students in the electrical engineering and science fields receive complimentary subscriptions upon request. Orders, inquiries, and address changes should be sent to the Digital Technicatjournal at the published­ by address. Inquiries can also be sent electronjcally to [email protected]. Single copies and back issues are available for $16.00 each by calling DECdirect at 1-800-DIGITAL (1-800-344-4825). Recent back issues of thejou·rnal are also available on the Internet at http://www.digital.com/info/DTJ/home.html. Complete Digital internet listings can be obtained by sending an electronic mail message to [email protected].

Digital employees may order subscriptions through Readers Choice by entering vrx PROFILE at the system prompt. Comments on the content of any paper are welcomed and may be sent to the managing editor at the published-by or network address.

Copyright© 1995 Digital Equipment Corporation. Copying without fee is permitted provided that such copies are made for use in educational institutions by faculty mem­ bers and are not distributed for commercial advantage. Abstracting with credit of Digital Equipment Corporation's authorship is permitted. All rights reserved. The information in the journal is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation or by the companies herein represented. Digital Equipment Corporation assumes no responsibility for any errors that may appear in the journal. ISSN 0898-901X

Documentation Number EY-Tll8E-TJ

The following are trademarks of Digital Equipment Corporation: AXP, Cl, DEC, DEC OSF/1, DECmcc, DECmodel, DECnet, DECwindows, Digital, the DIGITAL logo, HSC, HSC50, HSC60, HSC70, HSC90, HSJ, HSZ, InfoServer, KDM, ManageWORKS, Object Flow, OpenVMS, PATHWORKS, POLYCENTER, Storage Works, ULTRIX, VAX, VAXcluster, VAXstation, VMS, and VMScluster. Apple and AppleShare are registered trademarks of Apple Computer, Inc.

dBase IV is a registered trademark of Borland International, Inc.

Hewlett-Packard is a registered trademark of Hewlett-Packard Company. Cover Design Our cover design is inspired by a system man­ i960 is a trademark of Intel Corporation. agement topic in this issue. Manage WORKS IBM and NetView are registered trademarks of International Business Machines Corporation. software is a system and network manage­ Knowledge Craft is a registered trademark of Carnegie Group, Inc. ment tool that presents an object-oriented, Microsoft and Visual C++ are registered trademarks and Windows and Windows NT graphical view of a heterogeneous LAN envi­ are trademarks of Microsoft Corporation. ronment. The multi color circles on the cover NFS is a registered trademark of Sun Microsystems, Inc. represent the diverse objects, or entities, on NetWare and Novell are registered trademarks of Novell, Inc. the networks among which a system adminis­ OSF/ I is a registered trademark of the Open Software Foundation, Inc. trato-r "navigates" using the integrated com­ Sun M icrosystems is a registered trademark of Sun Microsystems, Inc. ponents of the tool. UNIX is a registered trademark in the United States and other countries, licensed The cover was designed by Lucinda O'Neill exclusively byX/Open Company Ltd. andjoe Pozerycki,jr:, of Digital's Design is a trademark of the Massachusetts Institute ofTechnology. Group. Book production was done by Quantic Communications, lnc. Contents I

RAID Array Controllers

5 The Architecture andDesign of HS-series StorageWorks Array Controllers Stephen]. Sicola

Wo rkflow Models

26 Policy Resolution in Workflow Management Systems Christoph ]. BuBier

50 TheDesign ofDECmodel for Windows Stewart V Hoover and Gary L. Kratkiewicz

PC LAN and System Management Tools

63 TheDesign of ManageWORKS: A User Interface Framework Dennis G. Giokas and John C. Rokicki

75 The Structure of the OpenVMS Management Station James E. Johnson

89 Automatic, Network-directed Operating System Software Upgrades: A Platform-independent Approach John R. Lawson, Jr. Editor's Introduction

I explain the reasoning behind the creation of a pre­ sentation layer in DECmoc.lel that provides a graphi­ cal view of the business process while hiding the technical details of the model. The authors also cover implementation details, including the deci­ sions to move from the original !.IS!' environment to a C:++ programming environment and to imple­ ment the knowledge base fo r DECmodel in ROCK, a frame-based .knowledge representation. We then shift the fo cus to ManageWORKS and POLYCENTER tools that have been developed to simplify the increasingly complicated job of system Jane C. Blake Ma naging Editor management. The first of three papers describes the development of the ManageWORKS Workgroup Administrator software. Dennis Giokas ami John Three computing topics are presented in this issue Rokicki discuss the design principles adopted for of the journal: a storage array controller fo r open this product that enables system and network man­ system environments, workflow architectures and agement of heterogeneous L.ANs from a single PC tools, ancl PC andlAN system management products. running Microsoft Windows. Key design elements The opening paper, by Steve Sicola, describes are plug-in, customizable modules fo r system Digital's new HS series of StorageWorks array con­ navigation ancl management, ami the user inter­ trollers. Designed fo r open systems, the control­ fa ce framework, which controls the flow between lers interface to host by means of the modules. The authors offer scenarios to illustrate industry-standard SCSI-2 interconnect, as well as interactions between components. Digital's CI and OSSI host interconnects. Equally Managing OpenVMS systems from a PC running important to designers as openness were controller the Microsoft Windows operating system can be availability and perfo rmance. Innovative fe atures accomplished with the OpenVMS Management were introduced, including dual-redundant con­ Station, of which ManageWORKS is a key compo­ trol lers and Parity RAID firmware to ensure high nent. Jim Johnson defines the need fo r this scalable availability, and a write-back cache that significantly and secure client-server tool in OpenVMS envi­ improves performance. The paper concludes with ronments, which can be clustered, distributed, a description of the common controller processing expanded, and networked extensively. After a dis­ core fo r the SCSI, CI, and DSSI controller variants. cussion of design alternatives, Jim describes the Workflow is the subject of two papers with dif­ fu nctions of the Station's client, communication, fe ring perspectives. Christoph BuBier opens his and server components. paper with introductory definitions and impI ica­ The final paper is about an initial system load tions of workflow concepts. He argues that a \ovork­ (ISL) capability fo r automatic, network-directed. flow that uses roles fo r task assignment is limited, operating system software upgrades. John Lawson especially in large. international enterprises. He reviews goals for the POLYCENTER Software Distri­ states that by adding the dimension of organiza­ bution layered product, compares the POLYCENTER tional dependencies fo r task assignment a complex ISL process with the OpenV:VIS ISL process. and workflow is more precisely expressed. Using the steps through the requirements for expanding the example of a travel expense reimbursement work­ l'OLYCENTER Sofnvare Distribution capability to flow, Christoph shows how the Policy Resolution other platforms and operating systems.

Architecture design principles support enterprise­ Our next issue will celebrate the Journal's tenth level workflow deployment-reusability, security, anniversary of publishing the technical achieve­ generality, dynamics, and distribution. He also dis­ ments of Digital's engineers and partners. The issue cusses the Policy Definition Language that formally will feature database technologies and new Alpha describes workJlow elements. ancl high-end server systems. A second paper about workflow presents a tool, called DECmodel fo r Windows, fo r the development of business process models and their graphical presentation. Stew Hoover and Gary Kratkiewicz

2 Biographies I

Christoph J. BuBier Christoph BuGler is a faculty member at the Technical University of Darmstadt, Germany, where he is pursuing a Ph.D. degree. His research is in workflow and organization modeling, with a fo cus on organizational embec.lcling of workflow management, ancJ in architectures fo r enterprise-wic.le deployment of workflow management systems. While at Digital from 1991 to 1994, Christoph developecJ the Policy Resolution Architecture and its prototype imple­ mentation. He holds an M.C.S (1990) from the Technical University of Munich ancJ has published many papers on workflow management and enterprise modeling.

Dennis G. Giokas Dennis Giokas is currently a senior associate with Symmetrix, Inc. While at Digital from 1984 to 1995, he was a consulting engineer in the PATHWORKS group. He co-led PA1HWORKS VS.O and architected the user interface ami system management tools. He was also architect and manager fo r the PC DECwindows program. Previously, Dennis worked at Arco Oil & Gas and The Foxboro Company developing process control software. He holds a Bachelor of Music from the University of Massachusetts at Lowell, a Master of Music from the New England Conservatoq', and an M.S.C.S. from Boston University.

Stewart V. Hoover EmployecJ at Digital Equipment Corporation between 1984 and 1994, Stew Hoover is currently an indepenc.Jentconsu ltant specializing in modeling ancJsimulation. Before joining Digital, he was an associate professor of industrial engineering ancJ info rmation systems at Northeastern University. Stew contributed to the development of the DECalc-PLUS application, Statistical Process Control Software (SPCS), and the DECwindows version of Symmod. He has written many papers and articles on simulation and is coauthor of Simulation, A Problem-Soluing Approach, published by Addison-Wesley.

James E. Johnson A consulting software engineer, Jim Johnson has worked in the Open VMS Engineering Group since joining Digital in 1984. He is currently a member of the OpenVMS Engineering team in Scotland, where he is a technical consultant fo r transaction processing and file services. His work bas spanned several areas across OpenVMS, incluc.l ing RMS, the DECc.ltm transaction services, the port of Open VMS to the Alpha architecture, and Open VMS system manage­ ment. Jim holds one patent on commit protocol optimizations. He is a member of the ACYl.

3 Biographies

Gary L. Kratkiewicz Gary Kratkiewicz is currently a scientist in the Intel li­ gent Systems R&D Group at Bolt Beranek and Newman Inc. As a principal engi­ neer in Digital's DECmodel engineering group from 1991 to 1994, Gary coordinated the architecture and high-level design specifications, and c.lcvel­ oped the knowledge base, script engine, API, and several user interface modules. Earlier at Digital, be developed an expert system fo r shipping and was project leader fo r a knowledge-based logistics system. Gary holds an S.B.M.E. from ,\'liT and an M.S. in manufacturing systems engineering from Stanford University.

john R. Lawson, Jr. John Lawson joined Digital in 1984. He has been a mem­ ber of the OpenW

john C. Rokicki John Rokicki, the project leader fo r ManageWORKS Wo rkgroup Administrator, is a principal software engineer within Digital's Network Operating Systems engineering organization. His rrimary responsibility is the design and implementation of the base services of the ManageWORKS product. Before joining Digital in 1990, he was employed by Data General Corp. and Sytron Inc. John holds a B.S. (1989) in computer science from Wo rcester Polytechnic Institute.

Stephen). Sicola Consulting engineer Stephen Sicola is a member of the the Array Controller Group in the Storage Business Unit. He is working on the next generation of controllers and was the technical leader fo r the current StorageWorks controller product set. In earlier work, Steve developed software and hardware fo r such products as the 1-JSC, KDi'v\70, and advanced development controller projects. Steve joined Digital in 1979 after receiving a B.S.E.E from Stanford University. He received an M.S.C.E. from the National Technological University in 1992.

4 StepbenJ Sicola I

The Architecture and Design ofHS-series StorageWorks Array Controllers

Th e HS series ofStorag eW!orks array controllers is a new fami�J' of Digital products that includes models for both open systems and systems that use Digital's propri­ etary buses. The HS-series controllers combine performance, auailabili(J!, and relia­ bili�)J in total storage subsystem solutions that use industry-standard storage devices. Th e architecture and design ofStora ge Works array controllers represents a balance between the market requirements and the available technology Th e engineering trade-ojfs led to an innovative design that incorporates product fea­ tures such as a dual-active controller configuration, write-back caching, Pari�)! RAIDtechnotog11, and SC\'f-2 device handling

The HS series of StorageWorks array controllers, a devices would provide investment protection new addition to Digital's storage subsystem fa mily, fo r customers because they would not have to

supports an open systems environment by allowing change devices when a new controller was intro­ the attachment of industry-standard Small Computer duced or when they changed controller modules Systems Interface (SCSJ-2) devices to the controller.1 to use a different host interconnect. Industry­ Moreover, these controller products yield high avail­ standard devices would also reduce overall sub­ ability and high performance. This paper describes system cost because of the competitive nature of the architecture and the design of the HSJ30, HSJ40, the storage device industry The long-term use of HSD30, and HSZ40 StorageWorks array control lers. both Digital and non-Digital devices was desired These controllers interface to host computers by to provide a wide variety of device choices fo r means of existing Digital interconnects, i.e., the customers. The use of an industry-standard host Computer Interconnect (CI) and the Digital Storage interconnect would allow StorageWorks con­ System Interconnect (DSSJ), as weJ l as a SCSI-2 host trollers to be used with Digital and non-Digital interconnect to VAX , Al pha, and most other com­ host computers, further expanding the open sys­ puters in the industry. The paper documents the tems capability. The SCSI-2 interconnect was cho­ design and development trade-offs and describes sen as the device interface and the host interface the resulting control lers and their features. over other industry-standard interconnects fo r StorageWorks array controllers represent a sig­ cost and strategic reasons. nificant change from Digital's original Hierarchical Storage Controller (HSC) subsystem, the HSC)O con­ 2. High availability. The goa ls fo r high availability troller, which was designed in the late 1970s, and included both controller fa ult tolerance and also from other Digital controllers such as the storage (disk configuration) fa ult tolerance. HSC60, HSC70, HSC90, and KDM70 controllers. The StorageWorks controllers discussed in this paper Controller fault tolerance was achieved by devel­ oping a dual-redundant controller configuration were designed to meet the fo llowing product goals: in combination with new StorageWorks enclo­ I. Open systems capability The goals fo r open sys­ sures that provide redundant power supplies tems capability were to use industry-standard and cooling fa ns. The goal of the dual-redundant storage devices attached to the controllers and configuration was to have the surviving con­ to use an industry-standard host interconnect fo r troller automatically assume control of the fa iled one controller model. Using industry-standard controller's devices and provide 110 service to

Digital Teclmicnl]ournnl H,f 6 N". 4 Fall 1994 RAID Array Controllers

them. As a side benefit, such a configuration development as a result of RAID technology would provide Joacl balancing of controiJer investigati<>ns. resources across shared device ports. Another approach to reduce latency was to The storage fault-tolerance goal was to develop develop controller-based disk striping, i.e., firmware support for controller-based redundant implement the HAU) level 0 technology.2 Specific

array of inexpensive disks (RAID). 2 The initial goals were to achieve parallel access to all RAID Parity RAID implementation incorporated the level 0 array members for read and write opera­ best attributes of RAJDlevels .3 and ). The design tions ancl to streamline firmware to increase provided the basis for later implementations of RAJDlevel 0 pert()rmance. other forms of RAJD technology, notably mirror­ The Parity RAID performance goal was to over­ ing. Parity RAJDsupports the goal of storage fault come the well-known weaknesses of RAID level tolerance by providing tor continued l/0 service 3 (i.e., poor transaction throughput) and RAID from an array of several disks in the event that level 5 (poor small-write performance) and to one disk fails. StorageWorks packaging that pro­ approach RAID level 0 striped array performance vides redundant power supplies and cooling for both small and large read and write requests.2 should be combined with the Parity RAJD tech­ A combination of hardware-assisted parity nology to extend storage fault tolerance. computations and write-back caching helped .3. High performance. The goals for high perfor­ achieve this goal. Parity calculations in hardware mance were to specify controller throughput reduced firmware overhead to complete RAID 5 (the number of I/O operations per unit of time), level write operations. Write-back caching latency (responsiveness). and data transfer rate minimized the effects of the RAID level ) small­ (controller bandwidth) for each of the three con­ write penalty.; To meet the needs of customers troller platforms: Cl, DSSI, and SCSI. The through­ who require high data transfer rates \Vith RAID, RAID level 3-style algorithms must be added for put was specified in the maximum number of read and write requests executed per second. the Parity RAIDdesign. The controllers had to speed up the response A common controller processing core had to I/O time for host operations and thus clel.iver data be architected and designed to meet the perfor­ with lower command latency than the HSC con­ mance nee(ls of all the planned StorageWorks trollers. StorageWorks controllers had to achieve controllers (based on host interface ca pabili­ the highest possible data transfer rate and were ties). The platform had to execute the same base to do so on a common platform. firmware, coupling new host interface firmware to the specific platforms. A common platform The platform-specific controller throughput was believed to ease product development and goals were as follows. The initial goal for the Cl­ to maximize reuse of firmware for the same to-SCSI controller was 1,100 read requests per "look and feel" in all products. second; the long-term goal was 1,500 to 1,700 read requests per second. The initial goal for the DSSI-to-SCSI controller was 800 read requests per Open Sy stems Capability

second; the long-term goal was I ,300 read For Storage Works controllers to enter the open sys­ requests per second. The initial goal for the SCSI­ tems market, product designers had to consicler to-SCSI controller was 1,400 read requests per the following aspects of open systems in the con­ second; the long-term goal was 2,000 read troller definition: the use of industry-stanJarc.l requests per second. The controller throughput device interconnects and industry-standard devices t()r write operations was slightly lower. attached to the controller, and the use of inc.lustry­ standard and Digital host interconnects. To reduce latency, the controller hardware and firmware implemented controller l/0 re

(i 6 Vol. No. . J /·{1/l 1'.!'.!4 Digital Tee/mien/ jo urnal Tb e A1·c!Jitecture and Design of HS-series StorageWorks Array Controllers

concurrently designing and building storage device and other industry system environments. The basic enclosures ca lled shelves that would house up to differences are the number of hosts connected seven 3.5 -inch devices or two 5.25-inch devices. and whether or not other storage devices can be These shelves, connected to the controller, would on the same host interconnect as the controller and allow a wide variety of SCSI-2 devices to be incorpo­ the other hosts. rated and would do so at a low cost because of the The CI configuration supports up to 32 nodes per widespread use of SCSI-2 as a device interconnect. bus. Each node may be either a storage controller StorageWorks control lers were designed to sup­ (i.e., an HSJ30, an HSJ40, or an HSC device) or a host port the following types of SCSI-2 devices: computer (i.e., a VAX or an Alpha system) . The DSSI configuration supports up to 8 nodes • Disk-rotating spindle disk drives and solid­ per bus. Each node may be either a storage con­ state disks troller (i .e., an HSD30 or an HSD05), a storage ele­ • Ta pe-individuaJ tape drives, tape loaders, and ment (e.g., an RF73 device), or a VAX or an Alpha jukeboxes that contain robotic access to multi­ host computer. ple drives from a media library The SCSI configuration supports up to 8 targets per bus. The HSZ40 controller, with its standard • CD-ROM SCSI-2 host interface, may be connected to Digital • Optical-ind ividual disks and jukeboxes that Alphacomputers (i.e., DEC 3000 and DEC 7000/10000 co ntain robotic access to mu .ltiple drives from computers running the DEC OSF/1 operating sys­ a media library tem), Sun Microsystems computers, Hewlett­ Packard computers, and IBM computers. Digital qualifies the HSZ40 controller for operation with StorageWorks Controllers in System additional vendors' systems accord ing to market Environments demand. The desire to produce a controller with an open system host interconnect was coupled with a com­ mitment to protect the investments of existing High Availability Digital customers who currently use CI and DSSI To meet the goals of controller and storage fa ult tol­ host interconnects. The strategy was to produce CI, erance, the designers of StorageWorks controllers DSSJ, ami SCSl variants of the StorageWorks array developed a number of scenarios from which the control ler, all based on a common platform. As in controller can be fa ult tolerant with respect to fa il­ the selection of the device interconnect, the SCSI-2 ures in control ler or attached storage components. host interconnect variant was chosen because of its The first aspect of fault tolerance considered is that widespread use and low cost. of controller fault tolerance; the second is configu­ The controllers fo r the CI, DSSI, and SCSI intercon­ ration fa ult tolerance. nects were named the HSJ30/HSJ40, the HSD30, and the HSZ40, respectively. The designations of "30'' Controller Fault Tolerance and "40" represent a code for the number of device Designers achieved controller fault tolerance by ports attached to the controller. The HSJ30 and investigating the common fa ults that the controller HSD30 controllers have three device ports each, could tolerate without requiring extreme design whereas the HS.J40 and HSZ40 have six device ports measuresand incurring high costs. The results of this each. The number of device ports selected for each investigation drove the design of what became the controller type was based on (1) the overall capabil­ dual-redundant HS-series controller configuration. ity of the host port interconnect to support the This configuration incorporates several patented aggregate capability of a number of device ports hardware and firmware features (patent pending). and (2) the desire to amortize controller cost The fo llowing fau lts can exist within a against as many attached devices as possible. StorageWorks array controller and the attached StorageWorks controller configurations depend Storage Works packaging and do not make host data on the controller host interface. Marked differ­ unavailable: ences exist in the configurations supported by CI-based OpenVJYlS YAXcluster configurations, DSSJ­ • Controller failure. In a dual-redundant configu­ based OpenVMS VAXcluster configurations, and ration, if one controller fa ils, all attached storage SCSI-based configurations in OpenVMS, DEC OSF/1, devices continue to be served. This is ca lled

Digital Te cbnicaljounltll Vol. (, No. 4 Fall 1994 7 RAlD Array Controllers

fa ilover. Failover occurs because the controllers during which a controller configuration must oper­ in a dual-redundant configuration share SCSI-2 ate with a fa ilure present. device ports and therefore access to all attached Another requ irement of fault-tolerant systems storage devices. If failover is to be achieved, the is the ability to "hot swap" or "bot plug" compo­ surviving controller should not require access to nents, i.e., to replace components while the system the fa iled controller. is still operating and thus to not ca use the system to shut down during repairs. The designers made the • Partial memory fa ilure. If portions of the control­ control ler and its associated cache module hot ler bu ffer and cache memories fa il, the controller swappable. That is, one controller in the dual con­ continues normal operation. Hardware error cor­ figuration can be replaced wi thout shutting clown rection in controller memory, coupled with the second controller, and the second controller aclvancecl diagnostic firmware, allows the con­ continues to service the requests of the attached troller to survive dynamic and static memory hosts. This feature, coupled with the hot-swap fa ilures. In fact, the controller will continue to capability of StorageWorks devices, creates highly operate even if a cache module fails. available systems. • Power supply or fan fa ilure. StorageWo rks pack­ aging supports dual power supplies and dual Dual-redundant Controller Configuration Like fans. HS- series controllers can therefore be con­ all StorageWorks components, HS-series con­ figured to survive a failure of either of these trollers are packaged in StorageWorks shelves. The components. Storage Works controller shelf contains a backplane that accommodates one or two controllers and • SCSI-2 device port failure. A failure in a single their associated cache modules, as well as SCSI-2 SCSI-2 device port does not cause a controller device port connectors. The packaging is common to fa il. The controller continues to operate on to all system environments. HS-series controllers the remaining device ports. mounted in a single shelf may be combined in pairs The controller must be able to sense the fail­ to fo rm a dual-redundant control ler configuration ures just listed in order to notify the host of a fa ult­ (shown in Figure 1) in which both controllers can tolerant failure and then to continue to operate access the same set of devices. normally until the fault is repaired. The designers Figure 2 shows two HS-series control. lers deemed this fe ature vital to reducing the time installed in a StorageWorks controller shelf in

HOST INTERFACE

HS-SERIES HS-SERIES CONTROLLER CONTROLLER MAINTENANCE MAINTENANCE TERMINAL FAILOVER COMMUNICATION TERMINAL EIA-423 PORT EIA-423 PORT

SCSI DEVICE PORTS SHARED BETWEEN CONTROLLERS

Figure 1 StorageWorks Controllers: System Block Dia�ram

lt1l. G No. 4 Fall 1')')4 Digital Te chnical jourua/ The Architecture and Design of HS-series Storage Wo rks Arrc�y Controllers

MAINTENANCE PROGRAM CARD CONTROLLER TERMINAL (PCMCIA) HSJ40 CONNECTION

PORT BUTTONS

POWER SUPPLIES RESET CONTROLLER A (1 MANDATORY, BUTTON 1 OPTIONAL FOR FAULT TOLERANCE)

Figure 2 StorageWorks Controller Shelf

a du al-redu nd ant configuration. Figure 3 shows the confi guration inform ation. The dual-active con­ two dual-redundant control ler configurations troller op tion requires that each controller have mounted in a StorageWorks cabi net with several detailed knowledge ab out the other controller and device shelves. The controllers connect to storage the device state; it does not require that the con­ devices with cables that emerge from the controller trollers share a memory. The des igners chos e the shel f and attach to the device sh elves. seconcl option because it provided loacl balancing The designers had to decide how the dual­ and therefore potentially greater performance. redu ndant control ler conf iguration could ach ieve However, they faced the challenge of designing a high availability through faul t tolerance . To meet backplane and an interface between the controllers the high-availability goals, the team addressed the that woulcl achieve the dual-active configur ation but concept of contro ller failover early in the des ign would not requi re a shared mem01yThe result of the process. One fault-tolerant opti on considered was design effon was the StorageWorks controller shelf. to run with a "hot-standby " controller that would become orerational only if the main controller Storage Wo rks Controller Shelf The StorageWo rks were to fail. A second opti on was to design a dual­ controller shelf is an archi tected enclosure th at active controller configuration in which two con­ allows a pair of Storage Works controllers and their trol lers wou ld oper ate simultaneously. They would respective cache memory modules to be pl aced share and concurrently use device port buses (not imo the dual-redundant configuration, as shown in devices), thus balanci ng the I/O load fr om host Figure 4. A cache module is attached to each con­ compu ters. troller for per formance purposes. The controller Hoth options allow for direct failover of devices shelf contains a backplane th at includes intercon­ wi thou t manual intervention. Th e hot-standby con­ troller communication, control lines between the troller op tion requ ires either au tomatic configura­ controllers. and shared SCSI-2 device ports . Si nce tion of the attached devices when the hot-standby the two controllers share scsr-2 device ports, the contro ller becomes operational or nonvolatile (i .e., design enable� continued device availab ility if one impervi ous to rower loss) shared memory to hold controller fails.

Digital Te chnical ]Otwn al Vol. 6 No. ·J hill I'J'Il 9 RAID Array Controllers

TAPE DRIVE

CONTROLLER ..!;-'-;.---.,.---- SHELF

HOST INTERFACE CABLES

SCSI DEVICE PORT CABLES (6)

DEVICE SHELF

Figure 3 StorageWorks Ca binet

SLOT 0 SLOT 1

' KILL B '

KILL A

COAL BUS

CACHE B LOCK

CACHE A CACHE B CONTROLLER A CONTROLLER B LOCK * LOCK J CACHE A CACHE1 B L CDAL BUS I I COAL BUS t CACHE A LOCK 1 COAL BUS

t FAILOVER UART COMMUNICATION LINE j 316 SHARED SCSI DEVICE BUSES

�NOTE: Controller and Cache Present signals to each controller are not shown. s

Figure 4 StorageWurks Co ntroller !JackjJ!m ze: Controllers in a Dual-redundant Co nj(r?,umtion

]() Vo l. 6 No. i hill /')')4 Digital Te chnical journal Th e Architecture and Design of HS-series StorageWorks Array Controllers

The backplane contains a direct communica­ or absence of the other controller in a dual­ tion path between the two controllers by means redundant configuration. This assists in control­ of a serial communication universal asynchronous ler setup of dual-controller operation as well receiver/transmitter (UART) on each controller. The as general controller initialization of the dual­ controllers use this communication link to inform redundant configuration. one another about • Status about the presence of the second coo­

• Controller initialization status. In a dual-redun­ troller's cache. Each control ler can sense the dant configuration, a controller that is initializ­ presence or absence of the other controller's ing or reinitializing sends information about the cache fo r dual-controller setup purposes. process to the other controller. • The "KILL" signal. In a dual-redundant config­

• "Keep alive " communication. Controllers send uration, each controller has the capability to use KILL keep alive messages to each other at timed the control signal to cause a hardware reset intervals. The cessation of communication by of the other controller. However, once one coo­ KILL one controller causes a failover to occur once troller asserts the signal, the other control­ KILL the surviving controller has disabled the other ler loses the capability. The signal ensures controller. that a fa iled or fail ing controller will not create the possibility of data corruption to or from • Configuration information. StorageWorks con­ attached storage devices. trollers in a dual- redundant configuration have KILL the same configuration information at all times. The signal denotes that failover to the surviv­ When configuration information is entered ing controller should occur. A controller asserts KILL into one controller, that controller sends the the signal when the other controller sends new information to the other controller. Each a message that it is fa iling or when normally controller stores this information in a controller­ scheduled keep alive communication from the KILL resident nonvolatile memory. If one control­ other controller ceases. The signal is also ler fails, the surviving controller continues to used when both controllers decide to reset one serve the fa iled controller's devices to host com­ another, e.g., when the communication path has puters, thus obviating shared memory access. failed. The controller resolves any discrepancies by The designers had to ensure that only one con­ using the newest information. trol ler could succeed in the KILL operation, i.e., that no window existed where both controllers • Synchronized operations between controllers. cou ld use the KJLL signal. After firmware on Specific firmware components within a control­ a controller asserts the KILL signal to its dual­ ler can communicate with the other control ler redundant partner, the KILL recognition cir­ to synchronize special events between the hard­ cuitry within the controller that asserted the ware on both controllers. Some examples of signal is disabled. The probability of true simul­ these special events are SCSI bus resets, cache taneous KILL signal assertion was estimated at state changes, ami diagnostic tests. w-20, based on hardware timing and the possi­ The other signals on the backplane pertain to bility of synchronous dual-control ler operation. the current state of the configuration within the • The cache LOCK signals. The cache LOCK signals controller shelf and to specific control lines that control access to the cache modules. The dual­ determine the operation of the dual-redundant controller architecture had to prevent one con­ controller configuration. The backplane state and troller from gaining access to a cache module that control signals include was being used by the other controller and had to • Status about the presence of a control ler's cache al low the surviving controller to access the failed module. Each control ler can sense the presence controller's cache. The access control had to be or absence of its cache to set up fo r cache diag­ implemented in either firmware or hardware. nostics and cache operations. A firmware solution would involve a software • Status about the presence of a second controller, locking mechanism that the controllers would which indicates a dual-redundant configura­ recognize and cooperatively use to limit cache tion. Each controller can sense the presence module access to the associated controller. This

Fa ll Digital Te clmicaljournal Vol. 6 o. 4 1994 11 RAID Array Controllers

method had an inherent problem: firmware that the designers considered valuable in a alone may not prevent inadvertent cache access dual-active control ler configuration. Load bal­ by a fa iling controller. The designers therefore aiKing is static in the control ler, i.e., devices are had to implement a hardware lock mechanism allocated to one controller or to the other, not to prevent such inadvertent access. shared dynamically. To change allocation, the system manager must change the preferred path The hardware lock mechanism was imple­ of device access. This is accomplished by access­ mented with control signals from each control­ ing either the maintenance port of the controller ler. The signals are utilized by hardware to or the config uration firmware through the host prevent inadvertent access ami by firmware interface (e.g., the diagnostics ancl utilities pro­ to limit cache module access to the associated tocol fo r U and DSSI systems). control.ler. From each controller, the designers

implemented two LOCK signals that extend imli­ • The cache module battery is low or has failed. vidual ly to each cache module and are visible to This special case of failover is useLI in conjunc­ both controllers. The cache LOCK signals are tion with Parity RAID operations fo r the reasons illustrated in Figure 4. described in the Parity RAID technology portion of the h>llowing section. The main issue is to con­ The LOCK signals allow a control ler to achieve tinue to provide as much data protection as possi­ exclusive access to a specific cache module to ble fo r Parity RAID disk configurations when the ensure data integrity. LOC K signals from a con­ battery on the write-back cache is low or bad. troller that has been "killed" by its dual-redundant

partner are reset so that the partner may fail over • The controller is unable to communicate with any unwritten cache data in the write-hack cache. the devices to which it is currently allocated fo r host operations. 'T'his situation can occur if

Fa ilouer Controller faiJover is a feature of the a device port on a controller fails. dual-redundant configuration fo r StorageWorks controllers. Failover of a controller's devices and Storage Fa ult To lerance cache to the other controller occurs when Storage fault tolerance is achieved by ensuring that power or environmental factors do not cause • A control ler fails to send the keep alive message. devices to be unavailable for host access and by This situation can occur because of a controller using firmware to prevent a device failure from failure in the dual UART (DlJART) or in any other non-fault-tolerant portion of the controller mod­ affecting host accessibility. ule. In this scenario, the surviving controller uses the KILl. signal to disable the other controller, Enuironmenta/ Fa ctors StorageWorks enclosures communicates to the failed controller's devices. provide for optional redundant power supplies and and then serves the failed controller's devices to cooling fans to prevent power or fan fa ilures from hosts. making devices unavailable. The SCSI-2 cables that connect device shelves to the controller shelf carry The failover of a control ler's cache occurs only if extra signals to alert the controller to power supply write-back caching was in use before the con­ or fan failures so that these conditions may be troller failure was detected . In this case. the sur­ reported to host computers. 'T'he enclosures must viving controller uses the failed controller's contain light -emitting diodes (LEOs) to allow a con­ cache to write any previously unwritten data to troller to identify failed devices. In addition, a the failed controller's disks before serving these cache module can fail, and the controller will con­ dis ks to hosts. When the surviving control ler has tinue to O]Jerate. written the data to disks (i.e., flushed the data), it releases the cache to await the fa iled con­ RAID Te chnology Tb prevent a device failure troller's return to the dual-redu ndant configura­ from affecting host access to data, the designers tion through reinitialization or replacement. introduced a combined fi rmware and hardware

• A customer desires to change the load balance of implementation of RAID technology. 2 The designers one or more devices attached to one controller had to decide which RAID level to choose and what to the other controller. This specia lized use tvpe of hardware (if any) was required fo r the of fa ilover provides a load-balancing feature implementation.

12 llri/. 6 ;Vu. i /-{Iff !')') 1 Dip, ita/ Te cbuica/ jo urnal Th e Architecture and Design of HS-series StorageWorks Array Controllers

The designers considered RAJDlevels 1 through 5 be protected implies an inherent cost of one as options fo r solving the problem of disk fail­ additional housed , powered, and attached disk. ures that affect data availability. RAID level 1 (disk RAID level 1 was also discounted because software­ mirroring, which is depictecJ in Figure Sa) was based solutions were available fo r many of the rejected because of its higher cost, i.e., the cost of hosts fo r which the HS-series controllers were ini­ parts to implement the mirroring 2 Each disk to tially targeted .

DATA DISKS

CHECK DISKS

(a) Mapp ing fu r a RAID Level 1 Array (b) Mapp ing fo r a RA ID Level 2 Array

(c) Mapp ing fo r a RAID Leve/ 3 Array (d) Mapping fo r a RAID Leue/ 4 Array

(e) A Ty pical Mapping fo r a RAID Level 5 Array

Figure 5 Mapp ing fo r RAID Levels 1 through 5

Digital Te chnical jo urnal Vo l. 6 No. 4 Fa l/ 1994 13 RAlDArray Controllers

RAID levels 2 through 4, illustrated in Figures 5b • A controller fa ilure occurs with a host's write through 5d, were rejected because they do not pro­ request outstanding. vide good performance over the entire range of Either the updated data or the updated parity t()r 1/0 workJoads fo r which the controllers were tar­ • the host's write request bas been written to disk geted. 1 In general , these RAID levels provide high, but not both. single-stream data transfer rates but relatively poor transaction processing performance. • A fa ilure of a differentdi sk occurs after the con­ RAID level 5 i n its pure fo rm was rejected because troller fa ilure has been re paired . of its poor write performance, especially for small write operations.! The designers ultimately chose To eliminare this write hole, designers bad to RAID level 5 data mapping (i .e., data striping with develop a method of preserving information about interleaved parity, as illustrated in Figure 5e) cou­ ongoing RA ID write operations across power fa il­ pled with dynamic update algorithms and write­ ures such that it could be conveyed between part­ back caching to overcome the small-write penalty. ner controllers in a dual-redundant configuratio n . This implementation is called Parity RAID. Designers decided to use nonvolatile caching of IWD An HS-series Parity RAIDarray appears to hosts as write operations in progress 5 Three alterna­ an economical, fault-tolerant virtual disk unit. tives were considered : A Parity RAID virtual disk unit with a storage capac­ 1. An uninterruptible power supply (UPS) fo r the i ty equ ivalent to that of n disks requ ires n + 1 phys­ control ler, cache, and all attached disk devices. ical disks to implement . Data and parity are This choice was deemed to be a costly and distributed (striped) across all disk members in the unwieldy solution bec<�useof the range of possi­ array, primarily to equalize the overbe

DATAOE!JDATA 1E!l DATA 2 Ell PARITY REGENERATED APPLICATION DATA FROM MEMBER 3

"" "" I r r I IIi... � MEMBER MEMBER MEMBER MEMBER DISK 4 DISK 0 DISK 1 DISK 2 3 (PARITY) PARITY RAID ARRAY�

F(!{ure 6 Regenerating Data in a Pa ri(y RAID Arrc�y tl 'it/1 a Fa iled J'vlember Disk

14 Vol. G No. 4 1-lr/1 /')')4 Digital Te cbnical Jo urnal The Architecture and Design of HS-series StorageWorks Arr�y Controllers

supplies and a battery. This confl ict between 100 hours of battery backup fo r a 32 -MB cache fa ult-tolerant StorageWo rks shelf configurations module. The batteries eliminated the need fo r with dual power supplies and the desire to add a special (and costly) nonvolatile memory write a battery for write-back caching was unaccept­ cache and allowed data hold-up after power failure. able to the designers because of the loss of power The designers chose lead-acid batteries over NiCAD redundancy to gain write-back cache integrity. batteries because of their steady power retention and output over time. This option protects against 3. A controller-based nonvolatile cache. The options most major power outages (of five minutes to five fo r controller-based nonvolatile caching included days) and al l minor power outages (of Jess than five (a) a battery-protected cache fo r write data, (b) an minutes). Most power outages (according to stud­ additional nonvolatile random-access memory ies within Digital) last less than five minutes and are (NVRAL\1)on the controller to journai !WD writes, handled in the same manner as major outages, that and (c) a battery-protected cache fo r both read is, by flushing write data immediately after power and write data. has been restored to the controller configuration.

With a battery-protected write cache, data must Battery status is provided to firmware, which uses be copied if it is to be cached fo r subsequent this information to make policy decisions about read requests. Designers deemed the potential RAID arrays and other virtual disk units with write­ performance penalty unacceptable. back caching enabled. For an HS-series controller to support Parity RAJD, Using controller NVRAM as a IV\J D write journal its cache module must have batteries installed. The not only closes the RAID level 5 write hole but batteries make the cache nonvolatile and enable also provides a small write cache fo r data. This the algorithms that close the RAlD level 5 write hole approach also requires data copying and creates and permit the use of the write-back cache as a per­ an l\TVRALVJ access problem for the surviving con­ formance assist, both vital fo r proper Parity IWD troller to the failed controLler NVRA.t\1 to resolve operation. If the controller firmware detects a low­ any outstanding IWD write requests. or bad-battery condition, write-back caching is dis­ abled. The control ler that detects the condition The third controller-based nonvolatile cache tries to fail over Parity RAID units to the other con­ option, to battery-backup the entire cache, troller in the dual-redundant configuration to keep solved the copy issue of option 3a and the the units available to hosts. If the other controller fa ilover issue of option 3b. cache module has a low- or bad-battery condition, The designers chose option 3c, the battery­ the Parity RAIDunit is made unavailable to hosts to protected read/write cache module. A totally non­ protect against data loss or data corruption should vo latile cache had the advantage of not requiring a power fa ilure occur. When the batteries are no write-cache flushing, i.e., copying data between longer low, Parity RAID units are again made avail­ the write cache and the read cache after the write able to hosts. Any Parity RAlDun its that had been data has been written to devices. Segregated cache failed over to the other controller would fail back, approaches (part nonvolatile, part volatile) wo uld i.e., return, to the controller that originally con­ have required either copying or discarding data trolled them. The module hardware and fi rmware after write-cache flushing. Such approaches would support read caching regardless of the presence of have resulted in a loss of part of the value of using a battery. the caching algorithm by all owing only read caching After solving the RAlD level 5 write-hole problem, of read data already read. Another benefit of a non­ the designers decided to automate the Parity RAJD volatile read/write cache is fa ilover of the cache recovery process wherever possible. This goal was module in the event of a controller fa ilure. This fu r­ adopted so that customers would not have to under­ ther reduces the risk associated with a RAID level 5 stand the technology details in order to use the write hole because unwritten write operations to technology in the event of a fa ilure. StorageWorks Parity RAlD arrays can be completed by the surviv­ controller firmware developers, therefore, chose to ing controller after fa ilover. add automatic Parity RA.ID management fe atures To achieve a total nonvolatile cache, the design­ rather than require manual intervention after fa il­ ers opted fo r the battery solution, using two 3-by-5- ures. Controller-based automatic array management by-0.125-inch lead-acid batteries that supply up to is superior to manual techniques because the

Digital Te clmical jo urnal Vo l. 6 No. 4 Fa ll 1994 15 RAID Array Controllers

controLler has the best visibility into array problems rime it takes to reconstruct one Parity RAID unit.

and can best manage any situation given proper Mean time between data loss (NITBDL) is a combina­ guide! ines fo r operation. tion of the mean time to repair (MTIR) and the fa il­ An important fe ature of Parity RAm is the ability ure rate of a second device in a Parity RAID unit. to automatically bring a predesignated disk into ser­ The automatic sparing feature reduces the NlTTR vice to restore data protection as quickly as possi­ and thus increases the MTBDL. Data loss can occur ble when a fa ilure occurs. Other controllers in the only in the highly unlikeJy event that a fa ilure occurs industry mandate configurations with a hot-standby in another RAID set member before the reconstruc­ disk, i.e., a spare disk, dedicated to each Parity RAm tion completes on the chosen spare. During Parity unit. A hot-standby disk is powered and ready fo r RAID reconstruction, the controller immediately firmware use if an active member disk of its Parity makes the host read or write request to the recon­ RAlD unit fa ils. This concept is potentially wasteful structing member redundant by updating parity since the probability that multiple Parity RAIDunits and data on rt1e spare after the host read or write will have simultaneous single-member disk fa ilures operation. Parity IWD firmware quick ly recon­ is low. The designers therefore had the options of structs the rest of the Parity RAI D unit as a back­ making spare disks available on a per-Parity RAID ground task in the controller. If the member that unit basis or having a pool of spares, i.e., a spare set, is being reconstructed happens to fail and other that any configured Parity RAJDunit could access. spare set members remain, reconstruction on a The designers chose the pool of spares option new spare begins immediately, further reducing the because it was simpler to implement and less costly probability of data loss. fo r rhe customer, and it offered the opportunity to Parity RA[[) member disk fa ilure declaration is key add selection criteria fo r spare set usage and thus to the efficient use of spares and to preventing maximize either performance or capacity efficiency. unwarranted use of spares. If a write command to a To allow more flexibility in choosing spare set RAID set member fails, RAIDfirmwar e assumes that members, designers made two spare selection the SCSI-2 disk drive has exhausted a!J internal meth­ options available: best fit and best performance. ods to recover from the error. SCSI-2 disk drives auto­ The best fit option allows fo r disk devices of diffe r­ matically perform bad block replacement on write ent sizes to compose the pool of spares. When a operations as long as there is space available within spare disk is needed after a member of a Parity RAID the disk drive revector area (the area where spare unit fa ils, the device with the best fit, that is, whose data blocks can be mapped to a fa iled block). The size most closely matches that of the failed disk designers chose this method over more complex (typically of the same size but possibly of greater retry algorithms that might encounter intermittent capacity), is chosen. The best performance option fa ilure scenarios. Empi rical information related to can reduce the need fo r physical reconfiguration previous storage devices showed that local ized after a spare is utilized if a spare attached to the write fa ilures are rare and that this strategy was same device port as the failed array member can be souml for the majority of disk access fa ilures. allocated. The best performance option maintains When read failures occur, data is regenerated operational parallel ism by spreading array mem­ from the remaining array members, and a fo rced bers across the controller device ports after a fa il­ bad block replacement is performed. Metadata on ure and subsequent spare utilization. the disk is used to perfo rm this function atomically, These features allow automatic sparing of fa iled that is, to perform rbe bad block replacement even devices in Parity RALD units and automatic recon­ if a power fa ilure occurs during the replacement.- If struction after a spare device has been added to the rhe disk cannot replace the block, then the Parity Parity RA ID unit (• Furthermore, any drive of at least RAID member disk is failed out and an attempt is the size of the smallest member of a Parity RAID unit made to choose a suitable spare from the spare set. is a candidate spare, which reduces tl1e need fo r If no spare is available, the Parity RAID unit operates like devices to be used as spares. (Typically, how­ in reduced mode, regenerating data from the fa iled ever, spare set members are like members.) member when requested by the hosts. ' A Parity RAID unit with a fa iled member will Parity RAIDfirmware uses rhe metadata to detect become unavailable and lose data ifa second fa ilure a loss of data due to catastrophic cache failure. inap­ occurs. The HS-series automatic sparing feature propriate device removal, or cache replacement red uces the windo·w of possible data loss to the without prior flush of write data. The designers

16 Vol. (J No. 4 Fa ll 1')94 Digital Te cbuicnf jou runl TheAr cbitecture and Design of HS-series StomgeWorks Array Controllers

considered it important that the controller queuing to all attached SCSI-2 storage devices for firmware he able to detect these data loss condi­ greater performance. 1 tions and report them to the host computers. Discussions of the RAil) level 0 technology and The fa ilure scenarios just described occur infre­ of how the designers used Parity RAI D technology quently, and the StorageWorks Parity RAID firm­ to overcome some of the performance bottlenecks ware is able to recover after such fa ilures. During fo llow. a typical normal operation, the main challenge for Parity RAID firmware is to achieve a high level of Striping-RAID Level 0 performance during write operations and a high Digital has used RAJ[) level 0 technology, that is, level of controller performance in general. striping, in systems fo r at least five years, in its host computers using software as well as in its control­ Hig h Performance lers. Striping allows a set of disks to be treated as As discussed earlier, the performance goals for the one virtual unit. Device data blocks are interleaved Storage Wo rks controllers were in the areas of in strips, i.e., contiguous sets of blocks, across all throughput ami latency Bandwidth goals were disks, which provides high-speed parallel data based on the architecture and technology of the access. Figure 7 illustrates the mapping fo r a RAID controller platform. The designers met the perfor­ level 0 array. ' Since a striped disk unit inherently mance goals by producing a controller that bad Jacks fau lt tolerance (i.e., if one device in the set a low command overhead and that processed fails, data is lost), controller-based striping is typi­ requests with a high degree of parallelism. The cally used in conjunction with host-based mirror­ firmware design achieves low overhead by means ing or in cases where data can be easily reproduced. of the algorithms running on the controller, cou­ Stripe sets achieve high performance because of pled with RAID and caching technology. The hard­ the potential fo r parallelism by means of the device ware design that allows fo r low command overhead and data organization. The key difference between and high data transfer rates (bandwidth) is dis­ RAJD level 0 and RAID levels 3 and higher is that cussed in the section Common Hardware Platform. striping results in the interdependence of data writ­ ten to different devices. Command Processing The StorageWorks designers maximized the num­ Controller Caching ber of requests the controller can process per sec­ Caching with StorageWorks controllers was origi­ ond by reducing the command processing latency nally read caching only When the designers within the controller firmware. The firmware uti­ decided to use a nonvolatile cache to eliminate the lizes controller-based caching and also streamlined RAID level 5 write hole, write-back caching on the command processing that al lows multiple out­ controller became a viable option. standing commands to be present in the controller. To meet the varying needs of customer a ppl ica­ Controller Read Caching Read caching was tions, the controller supports both Parity RAID and intended to reduce latency in the controller by min­ Rr\lD level 0. The designers decicled to include RAID imizing the need to access devices continuously fo r level 0 as a controller feature because of its inherent repeated host read requests to the same locations on parallel ism, even though RATD level 0 is not fault tol­ attached devices. Read caching must also address erant without externalre dundancy. the issue of how to handle write data fo r later use. Storage\'Vo rks controllers service all device The design could have incorporated on-board con­ types, but the designers fe lt that disk device per­ troller memory to hold write data. However, this fo rmance was the key metric fo r determining would require copying the write data to the read how well a controller supports RAID technology. cache after the write data had been written to the The controller firmware was designed w etficiently devices and would result in inefficient use of the control individual devices (commonly referred read cache. Therefore, the designers decided to to as "just a bunch of devices" [.JBOD]) and Parity have the read cache serve as a write-through cache RAID, prioritizing requests to each of the SCSI-2 as well. Read caching would be disabled/enabled device ports on the controller. StorageWorks per logical unit presented to the host instead of hav­ controllers comply with SC:SI-2 protocols and per­ ing global read caching, where a logical unit is one form advanced SC:SI-2 functions, such as tagged or more devices configured as one virtual device.

Digital Te chnical jo urual Vo l. 6 No.. j ht/1 /')')4 17 RAID Array Controllers

r'igure 7 MajJjJ ingfor a RAID Lel'el 0 Arm)'

Thus, customers can specify fo r which virtual Ta ble 1 StorageWo rks Controller 1/0 devices they want caching enabled. Performance with Read Caching The read and write-through caching firmware Read Requests Write Requests receives requests from other parts of the controller Controller per Second per Second fi rmware (e.g., a host port, a device port, and the Parity IWD firmware) and proceeds as f·()l lows. HSJ30/HSJ40 1,550 1,050 For reads requests, the caching firmware provides HSD30 1,000 800 HSZ40 2,250 1,500 I. The data pointers to the cached request, i.e., the cache hit

2. The data pointers for part of the request, i.e., Wr ite-back Cachinp,-lbjonnance Aspects As a partial cache hit, which means that the remain­ noted earl i er, when the cache modu le contains ing data must be retrieved from tbe device or batteries, the memory is nonvolatile for up to 100 devices being requested hours. The Storage\Xforks control ler can use the nonvolatile cache to increase controller perfor­ 3. A status response of cache miss, which means mance by reducing latency fo r write request Parity that storage management must retrieve the data RAIDper fo rmance to a level similar to that of RAID from the device or devices being requested level 0 (simple disk striping). The controller can For write requests, the caching firmware offers also utilize the write-back cache to reduce the the cache manager data from the request. The cache latency of JllOD and RA ID leve l 0 disk configura­ manager places the previous data pointers into the tions. As with read cach ing, write-back caching is read cache tables after the data is written through disabled/enabled per logical unit. the cache to the devices. Fi rmware tel ls the device The write-back caching firmware controls the port hardware where to find wr ite data, and the usage of both a surviving controller's cache module port hardware transfers the data. and a failed controller's cache· module. When it Read caching in the first version of the controller receives a write request, the controller places the fi rmware allowed the controller to achieve tbe ini­ data in the cache, marks the request as complete, tial throughput goals across the three co ntroller and writes the data based on internal control ler platforms. The current software version, version firmware policies (write-back cachi ng). To provide 2.0, was shipped in October 1994 and exhibits even greater performance during Parity RAID operations greater throughput performance. Table l shows the than simple write-back caching could provide, the

e 1/0 performance fo r the three Storage\Xforks con­ write-back cac h firmware is also tied to the Parity troller platforms with read caching enabled. lWOfirm ware .

IR Vol. (, No. 4 htll I'J'J4 Digital Te cbnicaljournal Th e Architecture and Design of H'i-series StorageWo rks Array Controllers

ln addition to dealing with the continual prob­ enhance small write operations. The designers lem of controller latency on write commands, intentionally chose to overcome both the small­ designers had to overcome the RAID level 5 small­ write penalty and the inherent lack of high band­ write penalty with parity updates to RAID set mem­ width that Parity RAID delivers. bers. Write-back caching was chosen over RAID The nonvolatile write-back cache module level 3 hardware assists as a Parity RAID strategy afforded the firmware designers more choices fo r because RAID level 3 does not provide a wide range Parity RAI D write request processing and data flush of benefits fo r all customer workloads. Write-back algorithms. The designers pursued techniques caching provides latency reductions fo r RAID and to speed up all write operations by performing non-IWD configurations. Write-back caching also write aggregations (i.e., combining data from mul­ increases write-request throughput. For example, tiple write requests and read cache data) in three the published performance numbers for write dimensions: throughput with write-back caching enabled in ver­ 1. Contiguous aggregation, in which the firmware sion 2.0 firmware appear in Ta ble 2. looks fo r consecutive block requests and ties The use of write-back caching resulted in a 20 to them together into one device request, thus 30 percent increase in write throughput fo r all plat­ eliminating separate device requests. fo rms as compared to write-through caching. Before discussing the effect of write-back caching on 2. Ve rtical aggregation, in which the firmware can latency for individual devices and fo r Parity RAID detect two write operations to the same block, arrays, the paper describes how the write-back thus eliminating one write operation. cache firmware was designed and tied directly to Parity RAID firmware. 3. Horizontal aggregation (for Parity IWD opera­ The fe atures chosen fo r write-back caching were tions only). This type of aggregation occurs extensively benchmarked against data integrity when all data blocks within a Parity RAJD strip issues. The addition of settable timers allows cus­ are present in the write-back cache. In such tomers to flush write data destined for devices that cases, the firmware can write to all RAID set are idle for a specific length of time. To reduce the members at once, in combination with the FX number of read/modify/writes required to update chip (discussed later in this section) on-the-fly parity on small write operations, designers tied hardware XOR operations during the RAID set flush algorithms to RAID.Flush algorithms fo r write­ member writes. The original request can cause back caching are vital to customer data integrity horizontal aggregation to take place if all blocks ancl to latency reduction. The flush algorithms actu­ within a strip are part of the first write request. ally allow Parity lWD to simulate RAID level 3 oper­ The firmware can also perform horizontal aggre­ ations because of the nonvolatile write-back cache. gation after processing several write requests. In As mentioned earlier, Parity RAID configurations this way, the parity write operation directly fo l­ suffe r a penalty on small write operations that lows the data write operations. Horizontal write includes a series of read and write operations and aggregation potentially cuts physical device XOR operations on blocks of data to update RAI D access in half when compared to normal RAI D parity. The write-back cache firmware was write operations that require data members to designed with specific attributes to enhance Parity be read 2.B The resu lt is pseudo-RAJ D level 3 oper­ RAID write operations in ge neral, and not just to ation, because the write-back cache is combined with the horizontal aggregation cache policy.

The performance gain for individual disks and fo r Ta ble 2 StorageWorks Controller Parity RAID arrays from using write-back caching is Write Request Throughput dramatic, resulting in higher write throughput and with Write-back Caching low latency The write-back cache actually smoothes Write Requests out differences in performance that are typical of Controller per Second workloads that have different read/write ratios,

HSJ30/HSJ40 1,350 whetller or not Parity RAID is utilized. Figure 8 shows the relative latency fo r a controller HSD30 900 with and without write-back caching enabled. The HSZ40 1,850 configurations tested comprised individual devices

Vo l. 6 4 Fa ll Digital Te chnical jo urnal No. 1994 19 RAIDArray Controllers

50 The designers determined that hardware acceler­ XOR RAID (j) ation of operations t< >r Parity firmware 40 wou ld speed up RAID parity calculations and thus 0§? 0 further improve Parity RAID latency and through­ w � 30 put. The firmware also supports FX compare opera­ tions, which eliminates the need fo r SCSI-2 devices that have implemented compare commands and fo r speeding up compare requests from hosts.

Cmnmon Ha rdware Platfo rm WORKLOAD 1 WORKLOAD 2 WORKLOAD 3 To produce a high-performance controller in all KEY three performance dimensions-latency, through­ JBOD ARRAY MODEL PARITY RAID ARRAY MODEL put, and clara transfer rate-the designers of D READ CACHE • READ CACHE StorageWorks controllers faced the challenge of 0 WRITE-BACK CACHE 0 WRITE-BACK CACHE creating a new controJier architecture ami using new technology. In addition, they bad to do so at Figure 8 H�/40 Art'C�J'Latenq Comparisons a reasonable cost. Although each has its own specific host interface hardware, the Cl, DSS!, ami SCSI controller variants and Parity RAID units (in a five-plus-one configura­ share a common hardware core. Commonality tion). The performance measurements were taken was desired to control the llevelopment costs and from a version 2.0 I-JSJ40 array controller. schedules for such large engineering projects. To Workload 1 has a read/write ratio of 70/30. i.e., del iver high performance and commonality, the 70 percent of the requests were read requests and designers investigated several cont roller architec­ :)0 percent were write requests. Wo rkload 2 has ture alternatives. The first architecture considered a read/write ratio of84/ l(>. Wo rkload 3 has a ratio of was similar to Digital's HSC'iO-':)'i controller, incor­ 20/80 In al l workloa(IS, the latency for individual porating similar bus structures, processing ele­ devices and fo r Parity RAm units is lower when ments, and memories, but newer technology. write-back caching is enabled than when only read Figure ':) shows the HSC architecture.0 caching is enabled. In fact, when write operations The HSC architecture is a true mult iprocessor sys­ dominate the 1/0 mix, latency fo r Parity RAJDun its tem. It contains a private memory fo r its policy pro­ is the same as for the workloads in which read oper­ cessor, which manages the work that is coming ations are predominant' from the host port interface ami queues this wo rk to the device interface modules. Data then flows RAID/C ompare Ha rdware between the host port and device modules to ami Storage Wo rks controllers contain a hardware Parity from hosts. The modules have two interfaces RAID ami data compare accelerator called FX, a gate (buses) fo r access to command processing and data array that performs on-the-fly XOR operations on movement. These buses are called the control mem­ data buffers. Parity RAIDami data compare firm­ ory interface ami the data memory interface. The ware use this gate array to accelerate Parity RAID policy processor queues work to the host port and parity calculations ami host data compare requests. device modules through the control memory inter­ 'T'he FX chip is programmed to (1) observe the bus, fa ce, and then the modules process the clara over (2) "snoop" the bus fo r specific addresses, (3) per­ the data memory interface. fo rm the XOR operation to compare the associated Using this architecture would have been too data on-the-fly with data in a private memory called expensive. The control ler cost had to be competi­ XBliF memory, and (4) write the data back into tive with other products in the industry, most the XBUF memory. of which currently cost considerably less than the XOR operations can take place as data is moving HSC controller. The HSC bus architecture required from buffe r or cache memory to device ports or three diffe rent memory interfaces, which would vice versa. The FX can also perform direct memory require three different, potentiallv large memories. access (Di'vlA) operations to move the contents of The designers had to pursue other options that buffer orcache memory to or from XBUF memory. met the cost goals but did not significantly reduce

20 \1J/ G No. 4 /·all 1')94 Digital Te e/mica/ jourua/ The Architecture and Design of HS-series StorageWorks ArrayCon trollers

CONTROL CONTROL BUS (6.6 MB/S) MEMORY .··I I L POLICY LOAD DISK INTERFACE HOST INTERFACE PROCESSOR DEVICE OR r-- TAPE INTERFACE CI BUS (UP TO 8 TOTAL) < > TERMINAL 1- r-- ....:..... SOl OR STl 8 8 8 {4 PER INTERFA CE) DATA I MEMORY DATA BUS (13.3 MB/S)

Figure 9 Block Diagramof the HSC Architecture performance. They considered single internal bus icy processor to access one memory while a device architectures, but during simulation, these options or host port processor accesses the other memory. were unable to meet either the initial or the long­ The architecture achieves a lower overall cost term cost goals. than the HSC architecture yet ach ieves similar Figure 10 shows the controller architecture performance. The new architecture, with fe wer option that became the common hardware base fo r memories, does not significantly reduce the perfor­ StorageWorks controllers. This architecture con­ mance, while the newer technology chosen to tains three buses ami two memories. A third small implement the controller enhances performance. memory is used fo r Parity RAJDand data compare The bus bandwidth of the new controller is much operations but does not drastically increase con­ higher than that of the HSC controller. Conse­ troller cost. The architectural design aJJows the pol- quently, a more cost-effective solution that uses

,.------,32-KB ,------,PCMCIA ,.------,,.-----...,;--- INSTRUCTION 32-KB i960 PROGRAM DUAL TIMER CONTROL AND DATA CACHE NV RAM CARD (2 MB) UART HARDWARE REGISTERS

I I I I IBUS BUS I I

16- OR 32-MB READ OR BUFFER MDAL BUS BUS )C===C=D=A=L=B=U=S=::::)I BATTERY CACHE MEMORY DRAB EXCHANGER DRAB BACKED UP MODULE (8 MB) WRITE-BACK CACHE NBUS BUS

I DEVICE PORT DEVICEI PORT DEVICE PORT DEVICEI PORT DEVICEI PORT DEVICEI PORT HOST PORT 53C71 0 53C710 53C710 53C710 53C710 53C710 {CI, DSSI, SCSI) PROCESSOR PROCESSOR PROCESSOR PROCESSOR PROCESSOR PROCESSOR

Figure 10 HSx40 Controller Architecture

4 1994 Digital Te clm ica ljournal Vo l. 6 Nn. Fa ll 21 RAIDArray Controllers

a less-costly architecture can attain similar to better module, the designers felt that the benefits of such performance. a design outweighed the additional cost. The extreme integration of hardware to the very On each initializatio n, the controller reads the large-scale integration (VLSI) level allowed fo r a firmware image on the program card and copies the much smaller enclosure than that of the HSC control­ image to the shared memory. The firmware exe­ ler, even with a dual-redundant controller configura­ cutes from the shared buffer memory. tion (see Figure 3). A StorageWorks dual-controller configuration measures 56.5 by 20.9 by 43.2 centi­ Dual UA RT (IJ UART) The DUART is used for two meters (22 by 8 by 17 inches), which is approxi­ reasons: mately one-tenth the size of the HSC: controller. 1. Maintenance term inal connection. The main­ tenance terminal is a means of entering con­ Common Co ntroller Platfo rm The common con­ troller system management commands (with the troller platform consists of the control ler without com mand line interpreter, which is the user the associated host port. The common core of hard­ interface fo r controller configuration manage­ ware consists of the policy processor hardware, the ment) and is also a status and error reporting SCSI-2 device port hardware, and the cache module. interface. Designers made extensive use of this The controller-specific host port interface hardware interface fo r debugging controller hardware and includes either the CI, the DSSI, or the SCSI interface. firmware. Use of the maintenance terminal con­ nection is optional. The interface remains on the PoliLy Processor Hmdware The StorageWorks controller so that users can direct control l. er controller policy processor is Intel's 2)-MHz i960CA management and status reporti ng, ifdesired . microprocessor, which contains an internal instruc­ tion cache and is augmented by a secondary cache 2. Failover communication between two control­ external to the processor. The secondary cache lers in a dual-redundant configuration. The com­ relieves the potential bottleneck created by shared munication path is used to share configuration memory between the policy processor and host/ and status information between the controllers. device port processors. The designers had to make trade-offs in two Shared Buff er and Cache ;ll/emOIJ' The dynamic areas: the memory speed/cost and the number of random-access memory (DRAM) buffe r (or shared buses. After simulation, the external instruction memory) has at its heart the (lynamic RANI and arbi­ and data cache showed a significant performance tration (DRAB) chip. This chip supports the buffe r improvement, given the chosen shared-memory and cache memory accesses from the policy pro­ architecture. The cache covers the first 2 MB of cessor and from the host and device ports. The data buffer memory, where policy processor instruc­ transfer rate supported by the shared memory is tions and local processor data structures reside and approximately :)') megabytes per second (MB/s) where most of the performance gain fo r the policy The DRAB chip contains error-correcting code processor would be achieved . (ECC) hardware to correct single-bit memorv, to The pol icy processor uses the JBUS exclusively to detect multibit errors, and to check and generate fetch instructions and to access the program stor­ bus parity. This fe ature allows the controller to age card, the NVRfu\1 , the DUART, and the timers. survive partial memory fa ilures, which wr quick code design rather than static random-access memory upgrades and to eliminate the need fo r a boot read­ (SRA1Vl ) chips led to the use of ECC. DRA.i\'ls were only memory (ROM) on the controller. The program chosen because of their cost and power savings card is a PCMCIA, 2-MB flash electrically erasable, over equivalent SRAM. However, because the programm able , read-only memory (EEPROM) card designers expected large amounts of DRAM (as that contains the firmware image . Designers chose much as 40 MB) to be present on a controller and its the PCMCIA card to fa cilitate code updates in the associated cache module, the statistical error prob­ field, where host-based downline loading of abilities were high enough to warrant the use of firmware was not supported. Although the PC:MCIA ECC on the memory. The combination ofDRAM and card cost more than EEPR0.\1 chips attached to the ECC was less costly than an equivalent amount of

22 Vo l. (> No.4 hill I'.I'J4 Digital Te cbuical jo urnal The Architecture and Design of HS-series StorageWorks Array Controllers

more reliable SRAM. The use of parity on the buses Microelectronic Products Division of AT&T Global is a standard fe ature in all Storage Wo rks controllers. Information Solutions Company) 53C710 SCSI-2 The bus parity fe ature provides further error detec­ processor chips. The SCSI-2 processor chips reside tion capability outside the bounds of the memory on the NBLJS and access the shared-memory cache because it covers the path from memory to or from fo r data structure and data buffer access. These pro­ external host or device interfaces. cessors receive their work from data structures in The DRAB chip also controls access to the cache buffer memory and perform commands on their module in conjunction with slave DRAB chips on specific SCSI-2 bus fo r read or write operations. the cache module associated with the controller. The Sym bios Logic chip provided the most pro­

These DRAB chips provide refresh signals for the cessing power, when compared to the other chips DRAM buffe r or cache memory that they control; available when the controllers were designed. The whereas, the master DRAB on the controller module designers fe lt that direct control of SCSJ-2 interfaces provides arbitration fo r cache accesses that origi­ by the policy processor or a separate processor nate from the various sources on the controller was too costly in terms of processor utilization module. Slave DRAB chips can also be accessed by ancl capital expense. The Symbios Logic chips do the dual-redundant partner control ler, depending require some policy processor utilization, but the on the two controller LOCK signal states. designers considered this acceptable because high­ The controller firmware uses 8 MB of shared performance architectural fe atures in the policy buffe r memory to execute the program image, to processor hardware compensated fo r the extra pro­ hold the firmware data structures, and to read and cessor utilization. write-through cache data (if no cache module is The SCSJ-2 device port supports the SCSI fast, present). The i960CA policy processor and the host single-ended, 8-bit interface. t The data transfer and device data processing elements on the NBUS rate supported by this interface is 10 MB/s. can all access buffe r memory. Host Po rt Hardware The host port hardware Ca che Jl!l emory Each cache memory module is either a Cl, a DSSI, or a SCSI interface imple­ contains one slave DRAB chip and 16 or 32 MB of mented with gate arrays or Symbios Logic 53C720 DRAM, and also two ports into the module (one SCSI-2 processors. The host port hardware, the only from each controller) fo r use in fa ilover. Each cache noncommon hardware on a StorageWorks con­ module optionally contains batteries to supply troller, requires a separate platform to support each power to the DRA..l\1 chips in the event of power host interface. failure fo r write-back caching and Parity RAID use. The CI interface is made up of a gate array and The cache modules are interchangeable between CI interface hardware that perfo rms DJVIA write controller types. or read operations from shared memory or cache memory over the NBLJS. The maximum data transfer

Pa rity RAID XOR and Compare Ha rdware The rate supported by the Cl hardware is approximately Parity RAID XOR and compare hardware consists of 8 MB/S. the FX gate array and 256 kilobytes (KB) of fast Tbe DSSI interface utilizes a Symbios Logic SRAM . The FX allows concurrent access by SCSI-2 53C720 chip coupled with a gate array and DSSI device port hardware and the policy processor. The drivers to receive and transmit data to or from the FX compares the XOR of a data buffe r (512 bytes of DSSI bus. The DSSI interface is 8 bits wide, and the data) that is entering or exiting an attached device maximum data transfer rate supported by the DSSI with the XOR buffe rs in the fa st SRA..M. The policy hardware is 4.5 MB/s. processor uses the FX to perform compare opera­ The SCSI interface also uses a Symbios Logic tions at the request of a host and perform DMA 53C720 chip couplecl with differential drivers to operations to move data to and from memories. provide a SCSJ-2, fa st-wide (i.e., 16-bit) diffe rential This hardware is common across all the controller interface to hosts. The maximum data transfer rate platforms fo r Parity RAID and compare firmware. supported by the SCSI-2 interface is 20 MB/s fo r fa st-wide operations. SCSI-2 Device Port Ha rdware The device ports Ta ble 3 shows tl1e current (version 2.0) maxi­ (three or six, depending on the controller model) mum measured (at the host) data transfer rate per­ are controlled by Symbios Logic (the fo rmer NCR fo rmance numbers fo r StorageWorks controllers.

Digital Tecbnicaljourrwl Vol. 6 No. 4 Fa ll 19')4 23 RAJD Array Controllers

Table 3 SCSI-2 Host Interface Performance

Read Data Tra nsfer Rate Write Data Tra nsfer Rate Controller (Megabytes per Second) (Megabytes per Second)

HSJ30/HSJ40* 6.7 4.4 HSD30 3.2 2.8

HSZ40** 14 8.0

' In a multihost environment

" Measured for the HSZ40-B controller

Summary in the even t of a fa ilure and utilizes spare device

The Storage\XIorks HS-series array controllers were pools in conjunction with user-defined Paritv designed to meet the storage subsystem needs of RAID configuration management policies. The both Digital and non-Digital systems. thereby enter­ SrorageWorks Parity RAID implementation ing the world of open systems. The architecture for exceeds the requirements of the HAll) Advisory the HSJ:)O, HS]40, HSD30, and HSZ40 cont rollers has Board ti >r RAID availability features. achieved the initial project goals and provides 3. High pe rform ance. The HSJ30/HS.J40, HSD:)O, and

HSZ 'iO contro lle rs achieved the respective initial 1. Open systems capability A SCS!-2 device interface performance goa ls of 1, 100, HOO, and 1,400 r;os allows many types of disk, tape, and optical per second . The control lers mer the low request devices to he attached to the HSJ;){), HSJ40. and laten cy goals by streamlini ng firmware \v here fJSD30 control lers. The HSZ40 control ler. which is possible ami by introducing write-back caching. currently a disk-only controller, provides a scsr-2 Wr ite-back caching firmware drama tically host interface that al lows the contro.l ler to be reduces latency on all write requests, and write­ attached to Digital and non-Digital computers. back cache hardware provides battery backup for 2. High availability. Co ntro l le r fault tolerance and data integrity across power fa ilures. Further­ RAID firmware yielded a highly available more. the write-back cache overcomes the RAID StorageWorks storage su bsystem. level ') small-write penalty ami high data transfer

The dual-red umlant controll.er configuration rate inefficiencies ancl thus provides high perfor­ RAI D allows each of a pair of active controllers to mance with Pa rity disk configurations. StorageWorks Parity RAID firmwMe implements operate independently with host systems. while many of the l{r\ 11) Advisory Bo�ml opriona I perfor­ sharing device ports, configuration information, mance fe atures to produce a high-performance and status. Th is (lesign al lows both control lers RAIDsolution. to achieve maximum performance. The dual­

redundant configuration also provides fa ult A com mon controller processing co re was tolerance if one controller fa ils, because the successfully developed for lhe fiSJ50/J lSJ40, surviving control ler serves the fa iled control­ HSD:�O. and HSZ40 controllers. ,VI ore than H5 per­

ler's devices to the host computers. The dual­ cent ot the fi rmware is com mon to ;i ll three con­ controller configuration, combinnl with trol l er platforms, which allows fo r ease ot StorageWorks controller packaging, results in maintenance and fo r the same look and feel for a highly available controller configuration with customers. The architecture and the t echnology

built-in fa ult t olera nce , error recovery, and bat­ used resulted in a core controller clesign that tery backup features. support s a high clara transfer rate fo r all Sto ragcWorks controller platforms Parity RAID controller firmware, combined with

StorageWorks device packaging, allows t

24 l!rJ/. (, ,\'u. i !-it// !')':)4 Di,�ital Te cllllica/ jo urnal Th e Architecture and Design of HS-series StorageWo1·ks Array Controllers

platforms that utilize the knowledge and experi­ 4. P Massiglia, ed ., Th e RAJDbook: A Source Book ence acquired during the development of the fo r Disk Array Te chnology, 4th ed. (St. Peter, Storage Wo rks l-IS-series array controllers. Minnesota: The RAID Advisory Board, September 1994) Acknowledgments 5. P Biswas, K. Ramakrishnan, D. Tow sley, and The StorageWorks array controller project was the C. Krishna, "Performance Analysis of Dis­ cooperative effort of a large number of engineers tributed File Systems with Non-Volatile Caches," who sacrificed considerable personal time to ACM Sigmetrics (1993). achieve the project goals. The fo llowing people and groups contributed to the success of the product: 6. Parity !WDunit reconstruction of data and parity Bob Blackledge, Diana Sl1en, Don Anders, Richard from a fa iled array member means regenerating Woerner, El len Lary, Jim Pherson, Richard Brame, the data block-by-block from the remaining array Jim jackson, Ron McLean, Bob Ellis, Clark Lubbers, members (see Figure 6) and writing the regen­ Susan Elkington, Wayne Umland, Bruce Sardeson, erated data onto a spare disk. Re construction Randy Marks, Randy Roberson, Diane Edmonds, restores data redundancy in a Parity RAID unit. Roger Oa key, Rod Lilak, Randy Fuller, joe Ke ith, 7 Metadata is information written in a reserved Mary Ruden, lVl ike Richard, To m Lawlor, jim area of a disk. The information, which takes up Pu lsipher, Jim Va gais, Ray Massie, Dan Watt, Jesse approximately 0.01 percent of the total disk Ya ndell, Jim Zahrobsky, Mike Wa lker, To m Fava, capacity, describes the disk's configuration and Jerry Va nderwaall, Dave Mozey, Brian Schow, Mark state with respect to its use in a Parity IWD unit. Lyon, Bob Pemberton, Mike Leavitt, Brenda Lieber, P. Mark Lewis, Reuben Martinez ,John Panneton, Jerry 8. Biswas and K. Ramakrishnan, "Trace Driven Lucas, Richie Lary, Dave Clark, Brad Morgan, Ken Analysis of Write Caching Pol icies fo r Disks," Bates, Pau l Massiglia, To m Adams, Jill Gramlich, AC.M Sigmetrics (1993). Leslie Rivera, Dave Dyer, Joe Krantz, Ke lly Ta ppan, 9. R. Lary and R. Bean, "The Hierarchical Storage Charlie Zu llo, Keith Wo estehoff, Rachel Zhou, ControJJer, A Tightly Coupled Multiprocessor Kathy Meinzer, and Laura Hagar. Thanks to the CAD as Storage Server," Digital Te chnical jo urnal, team, the Storage Wo rks packaging and manufactur­ vol. 1, no. 8 (February 1989): 8-24. ing team, the software verification team, and the problem management team. A final thanks to our OpenVNIS and DEC OSt/1 operating system partners and to the corporate test groups, all of whom worked with our engineering team to ensure inter­ operability between the operating systems and the controllers.

References and No tes

1. information Sy stems-Small Co mputer Sy stems ln teJface-2 ('>C�1-2), A.t'I SI X1.131-1994 (New Yo rk: American National Standards Institute, 1994).

2. D. Patterson, G. Gibson, and R. Katz, "A Case fo r Redundant Arrays of Inexpensive Disks (RAID)," Report No. UCB/CSD 87/391 (Berkeley: University of California, December 1987).

3. The RAID level 5 smal l-write penalty re sults when a small write operation does not write all the blocks associated with a parity block. This situation requires disk reads to recalculate parity that must then be written back to the RAID level 5 unit to achieve data redundancy.

Digital Te chuica/ ]o unwl V!Jf. () No. 4 Fa ll I'J'J4 25 Cbristop b J Bufller I

Po licy Resolution in Wo rkflow Management Sy stems

One crucial Ju nction of a Ll'orkflow management .�) 'Stem (W/�11.\) is to assign tasks to users wbo are eligible to cany them out. Except in simple u•orkj lou· scenarios, roles such as secretary and nwrwger are not a sujf icient basisfor determining eligi­ bility AdditionallJ; wr, I!Ss are deployed not only in group settings by small compa­ nies but also worldll'ide by large ente1}Jrises. Since local lau·s and business policies bave to be fo llowed, task assignmentpolicies fo r tbe same tasl? geueml�J' diff erfrom country to counti�J' and, therej(Jre, must be sp ecified locally The Po licy Resolution Architecture (PRA) 1nodel provides more generality and e:xpressil'eness than role models do and at the same time supports the independent .ljJeufication of task

assignment policies in diffe rent parts of an enterprise. PRA can be used to model arbitrary organization structures and to define realistic task assignment (eligibil­ ity) rules by means of precise�v defined OJganizational policies. Tlm �� PRA provides real-world organizations witb a precise, silnple means of e"ip ressing tbeir complex task assignmentpolicie s.

A workflow management system (WF;\1S) is a soft­ success ful completion of a task. however, often ware system that manages the flow of work requ ires that crucia l, irreversi ble decisions be made between participants or users according to h>rmaJ by a person who is responsible fo r the task. Making s specifications of business processe cal led work­ the right decisions and then carefully and responsi­ flows. A workflow specifies tasks to be performed bly carrying out the task is essential to conducting ami their execution order. Additional ly, a workflow business successful ly. specification defines the internal flow of data The criteria used to determine an eligible user t<>r between tasks as well as all applications required to a task are manit<>ld. A user must have a specific set carry our the tasks. For example, a travel exrense of capabilities to he able to carry out the task. reimbursement workflow specifies the tasks of fill­ Additionally, the position of a user in the organiza­ ing , checking and signing a form. and reimbursing tion hierarchy and/or the reporting structure of the an amount. This wo rkJlow specifies that the fo rm organization can determine if the user is resronsi­ must be signed before an amount is rei mbursed . ble for the task. furthermore, limits rlaced on The workflow specification also tlefines the flow of a user's tlecision-making authoritv can affe ct eligi­ the expense form between tasks and the required bility. For example, not every salesperson is autho­ spreadsheet application. Final ly, for each task of a rized to accept an order that leads to a significant workflow, some rule bas to be in p lace that speci­ increase in manufact11ring output. Such an order fies the users who are el igible to carry out the task. requires special attention and internal coordi­ This set of eligible users is determined at ru n time, nation by a senior sales representative. When and the task is subsequently assigned to them. cost-optimized task assignments are made, the One of the key issues in successfull y deploying experience of the user as well as the user's skill set WFtviSs in an enterprise is the correct assignment has to be taken into consideration. Highly experi­ of a given task to eligible users. An eligible user is enced users are in most cases expensive resources, one who is capab.le of and respo nsible fo r carrying hut usually they can complete tasks faster than out an assigned task. This distinction is impor­ users with average ex perience. Although users tant because not every user who is capable of per­ with either level of experience may have sufficient fo rming a task is necessarily respons i ble fo r it. The experience to carry out a specifi c task, if deadlines

(, Vo l. No. 4 !'o ff f'J'J4 Digital Te cbuical jo urnal Po licy Resolution in Wo rkflow Management Sy stems

are involved or extreme caution with respect to Before explaining PRA in detail and providing the quality is necessary, a highly experienced user rationale for its development, the paper introduces might be appropriate. ln such cases, the additional the key concepts of workflow management. This cost would be justified . introduction presents a seemingly simple workflow The previous discussion demonstrates the neces­ that specifies travel expense reimbursement, which sity of a precise definition of eligible users fo r a is later used to introduce the design objectives of given task. Such a definition, i.e., set of task assign­ PRA.Note that a real travel expense reimbursement ment rules, should contain all the criteria used to workflow fo r production is by far more complex determine eligible users fo r the task. Early in the than the example used in this paper. A large dis­ development of Digital's Objectflow WFMS prod­ tributed enterprise endeavors to reuse the same uct, the concept of roles was considered sufficient workflow in all of its parts because reuse faci litates to model the assignment of tasks to users.1 How­ administration and leverages the development ever, an analysis of distributed enterprise-wide pro­ investment. At the same time, such an enterprise duction workflows clearly showed that using roles probably sponsors numerous business trips, which as the only assignment mechanism has limited makes the travel ex pense reimbursement workflow value in determining eligibil ity2 The need fo r a far an excellent candidate to use as an example. more expressive, general, and flexible approach became obvious. The analysis also revealed that Wo rkflow Management workflows are often reused in different parts of This section introduces a model of workflow man­ an enterprise. A prominent example is the travel agement. The discussion begins with a survey of expense reimbursement workflow, which is dis­ preliminary work. The survey suggests the motiva­ cussed throughout this paper. Although a work­ tion fo r workflow management and enumerates flow is reused, however, the task assignment some areas in which workflow management is policies may diffe r greatly in the various parts of an deployed. The key concepts of the workflow model enterprise. This diffe rence is due to the need to are then used to model a workflow example, i.e., adhere to local laws and/or to business-related devi­ the travel expense reimbursement workflow. The ations from the general rules. section concludes with a definition of workflow Based on the requirements derived from several management systems. case studies of complex workJiows, the Policy Resolution Architecture (PRA) was developed to Historical Survey provide a comprehensive way of specifying task Looking back in history reveals that workflow man­ assignment rules. 2 To support the fact that di.fferent agement has many roots. The most important are parts of an organization may require different office automation, software process management, assignment rules, PRAand its implementation were manufacturing, and transaction processing. The fo l­ designed as separate components. PRA incorpo­ lowing short survey of achieved results is given rates three major elements and thus provides to help the reader understand the motivation for workflow management. The discussion also • Concepts that enable the modeling of any orga­ explains the choice of workflow management con­ nization structure (not just roles and groups) cepts. The list of previous and related works indi­ without prescribing structures that are applica­ cates the range of literature that exists. tion dependent.

Offi ce Automation One of the primary roots of • Ta sk assignment rules as entities in themselves, separate from a workflow specification. This workflow management is undoubtedly office makes it possible fo r each of the different parts automation. Early research led to the development :1-Y of an enterprise to have its own set of task assign­ of models and tools to support office workers. mem rules fo r the same workflow. What emerged were not only desktop applications that imitate concepts such as in basket, out basket, • A language that enables the explicit specification fo rms, and documents but also models of the pro­ of organization schemas and task assignment cedures that the office workers fo llow while doing rules. Specifications are processed by a compo­ their jobs. 1o· 11 Furthermore, systems were devel­ nent called the policy resolution engine during oped that execute the office procedures to actively workflow execution. manage the flow of work within offices. Il. l�

Fa /1 1994 Digital Te chllicaljournal Vol 6 No. 4 27 Wo rkflow Models

Software Process fl!lodeling A second major root modeling language. and execution mechanism, or of workflow management is soft ware process mod­ is it p oss ible to IJrovide general concepts that need eling anll execution. 1� -25 The fo cus of research in this to be customized o nly fo r a specific area of applica­ area is the automated support of software develop­ tion' This question triggered the development of ment processes. Concepts com prise process mod­ the co ncept of work flow, whose goal it is to serve els like the waterfall m odel or th e s pira l model, as the ge neral and custom izable concept. del iverable colle, installation and operation manu­ a ls, requirements documents, and test cases 21•.r Workflow Ma nagement Concepts

JV!a nufacturing Traditiona ll y, fo rmalized proce­ After the speci fic applica tio n semant ics (e .g.. docu­ ments. office workers, r lea e pro res (lures that are executed re peated ly are inl1erent to e s cedu , and CAD clrawings) have been abstracted, the basic concepts manufacturing, another root of workflow m anage­ ment. Manufacturing involves not only product ion of workflow management can be distilled from the processes but also preproduction procedures start­ various appro aches mentioned above. Al.thougll w k l w m nage en s ind e e e of specific ing from, for example, the release of computer­ or f o a m t i p nd nt application semant ics, it does support all the appli­ aided d esign (CAD) drawings to the p repara ti on of

'fransaction Processing Another imp ortant area to model the semantics of each appl ication area. that in fl uenced the devel opmen t of workflow Wo rkflow manage ment is anal ogous to relational management is transaction processing. After the database systems. Such systems know how to concept of atomicity, consistency. isolation, a nd model and implement tables and how to process e llurabi l i ty (ACID) transactions was d evel oped . queries: h owev r, they do not know about t he resea rche rs p rop osed more advanced transaction specific concepts of an appl i cation area that are models for processing several interdepend ent tasks implemented h�· user-defined tables, e.g.. addresses c - and orders. that must be transa t ional and recoverahfe .-<� .w The fol lowing list introduces the basic concepts Coordination Th eory, Entetprise Modelil lf!., and of workflow management by e numerating the major aspects that make up a workflow specification: 1·• Sp eech Act Tb e01y Another area of research that con tributed to the idea of wo rkfl ow management is • a coordination theor y. ;o. '1 Th i s area looks at p ro­ Functional aspect. The function l aspect cesses as one form of coordination and tries to describes what has to be done , without sa)'­ apply interdisciplinary research res ults to it. l'.h e ing hovv, by whom, and with which data. The research area of en terp rise model ing fo cuses on fu nctional aspect provides two concepts: ele­ the model ing of the who.le multifaceted enter­ m entary wo rkflows and compos ite workflows. prise. •2 •'> Enterprise activities are one part of an Elementary workflows are tasks that can he ca r­ enterprise that d rives the enterprise processes. The ried out by one person. program, or machine. s peech act theory is an attempt to m odel the con­ For b revit y. elementary workflows are called versation between humans. '" Some research fo l­ steps. Com posite workflows bu ndle e i ther elementary workflows or other composite lows the direction that a workflow is an interwoven workflows to higher- level asks . In thi s way. chain of speec h acts. '1 t a re use hie ra rchy is bu ilt. since the bund led

Ear�J' AjJjJii cation-indej>e nde n t Appmachcs In workflows may very well stand by themselves. addition to the application-specific roots of work­ Generally. these higher-level tasks can no longer flow manage ment, early approaches th at modeled be ach ieved by a s i ngle person, progra m. p rocesses independent of appl ication areas pro­ or machine h ut require several such entities. vided motivation fo r workflow management.'�-' i A workflow that hund les other workflows refer­ The term process appears in all the areas of work ences them. As a naming convention, a work­ mentioned above. Al so, all these research areas deal flow that is referenced by some other workflow with data, e.g., documen ts, Cr\D d rawings, and is called a subworkflow. The referencing work­ orders. Most approaches have some notion of sub­ flow is ca l led the supe rwo rkflow. The topmost ject or age nt. The question arose among researchers, workflow of a reuse hiera rchy is called the top­ Does each area need its own definition of terms. level wo rkflow.

l'rJ/. i Digital Te cbuicnl jou rnal 2H (, No. i 1-ii/1 I')'J Po licy Resolution in Wo rkflow Ma nagement Sy stems

• Behavioral aspect. The behavioral aspect Tr avel Expense Reimbursement Worliflow describes the execution order of the subwork­ Figure 1 shows the graphical representation of flows of a workflow. Constructs that describe a simplified workflow for the reimbursement the order include sequence, conditional branch­ of travel expenses. (Examples of workflow lan­ ing, parallel branching, ami the looping and/or guage can be fo und in the literature.'''(') The work­ joining of parallel or conditional execution paths. flow consists of four steps: (1) fi ll, (2) check, (3) sign, and (4) reimburse. The graphical represen­ • Informational aspect. The informational aspect tation shows the functional aspect (task strucmre) is twofold: first, it describes the local variables of as ovals and the behavioral aspect (control flow) as a workflow and the external data referenced; solid arrows. The informational aspect (data flow) second, it describes the flow of data fro m sub­ is displayed as fo rms; dotted arrows indicate the workflow to subworkflow. direction of the flow of data. The organizational • Organizational aspect. The organizational aspect aspect is omitted since the paper will fo cus later on describes who is eligible tO carry out a step. The this topic. The technological aspect is represented "who" can be a human (e .g. , an office worker) , by icons of the software applications that are avail­ a program (e.g., a compiler in a software pro­ able to carry out the steps. The historical aspect is cess), or a machine (e.g., a cell in a shop floor). represented by icons that symbolize logs in which The term user was chosen to represent all three. information must be recorded. Most available WFMSs offer the concept of roles to Step 1 of the travel expense reimbursement model the organizational aspect. A role usually workflow, the fill step, enables a user to enter the groups a set of users. At run time, tasks are relevant expenses incurred during a business trip assigned to roles and all users grouped by these into an electronic travel expense form. Aftera user roles are assigned the task. Although this method has finished entering the data, validation must take of task assignment is adequate fo r certain work­ place. The check step enables a user to look at the flows such as departmental workflows, as shown contents of the travel expense fo rm. This user is later in the section Ta sk Assignment in a Travel prompted to val idate the contents but cannot Expense Re imbursement Wo rkflow, roles are not change entries. If the user who checks the fo rm sutficient to handle workflows that are deployed detects an error, the fo rm is sent back to the user in an enterprise-wide or international setting. who in itially filled it out, with a note that explains the reason fo r rejection. Otherwise, the fo rm is for­ The literature discusses additional aspects, e.g., warded to the next user who has to sign the fo rm to a historical aspect and a technological aspect."" approve the amount. After the sign step is com­ The historical aspect is used to specify the kind of plete, the amount can be reimbursed. The last step, information to be stored in a historical database reimburse, enables a user to add the amount spent during the execution of a workflow, e.g., starting to the next paycheck of the user who requested times or values of variables. Instead of having the reimbursement. default strategy of saving all data, the workflow Th is sample workflow is intentionally kept sim­ specifies in the historical aspect only the important ple because beginning with the next section, the data that must be stored . The technological aspect paper fo cuses solely on task assignment rules. In allows the definition of which application program a real organizational setting, the workflow would or programs are available to carry out a step. At run involve more steps and additional execution paths. time, these application programs are made avail­ For example, a user who has to sign the fo rm might able to the user. In principle, it is not possible to detect an error. In this case, as in the checl< step, the enumerate all necessary aspects completely in fo rm would be sent back to the user who initially advance. Depending on the appl ication area to be filled it out. modeled, add itional aspects might appear and require support. The paper now shows how the key concepts Workflow Management Systems of workflow management can be applied, i.e., cus­ Managing the flow of work among users is done by tomized, to model a specific workflow type. The a software system called a workflow management example used is a sample travel expense reimburse­ system (WFMS). A WFMSconta ins all the specifica­ ment workflow. tions of the workflow types (e.g., a travel expense

1994 Digital Te clmicaljournal Vo l. 6 No. 4 Fa ll 29 Workflow Models

Travel Expense Reimbu rsemenl n�OTE :------u------: ' '

' ' ' ' ' ' Q Q QQ Q KEY

C) TASK CONTROL FLOW DATA FLOW l'""ioCJ looioCJ

ELECTRONIC FORM SPREADSHEET MONEY TRANSFER APPLICATION APPLICATION [[§]Q INFORMATION LOG

Figure 1 Tra vel E\pense Reimbursement Wo rkflow

reimbursement or a capital equipment order) that is not assigned to users and the I ist of actions is are modeled and released for production. If a user applied to each of the subworkflows. issues a request to start a workflow (e.g., if, after Each user who can potentially be involved in a business trip, a traveler starts a travel expense a workflow is connected to a WFMS by a private reimbursement workflow), the WFJVIS creates an work I ist, which is a graphical representation of instance of the requested workflowtype. Of course, a I ist of steps assigneel to the user. Each en try in a more than one instance of the same workflow type user's worklist represents a task the user is eligible can exist simultaneously. A WFMS assigns the steps to carry out. A user can participate in more than of a workflow to users according to the specified one workflow at the same time. Normally, the user order of the behavioral, fu nctional, and organiza­ is free to choose from the worklist any item on tional aspects. which to start. In well-designed systems, the WFMS In general, a WFJVIS performs the fo llowing automatically starts the application programs that actions to execute a workflow instance: the user will require to accomplish the work. In this way, the user can begin work immediately. • Determine the next steps to be executed. Almost all prototype implementations or prod­

• Determine the eligible users for these steps. uct developments allow the modeling of the fo ur main aspects described previously. The Jist of work­ • Assign steps to el igible users. flow management systems is growing rapidly, and

• Wa it fo r the result of each step. references to relevant literature are readily avail­ 17'7-!' able. 1 References to literature that describes • Transfer the result back to the step's superwork­ the deployment of workflow management systems flow and record the step as complete. in an application area are rare, however.'J.(,J.(,)-(," The WFMS repeats these actions until all steps of The reminder of the paper focuses on the orga­ a workflow are executed."' ''-)� This list of actions nizational aspect of workflow management. The has to be slightly modified if. in addition to steps, paper discusses the derivation of the requirements a workflow contains composite workflows in its that concepts of this aspect must meet and then list of subworkflows. In this case, the subworkflow introduces PRA as the model whose concerts

30 \In/. 6 Nn. 4 Fa ll 1994 Digital Te clmical jou rnal Po licy Resolution in Wo rkfl ow Management Sys tems

normally has to approve spending by subordi­ address the requirements. An analysis of the travel expense reimbursement workflow illustrates some nates. Usually, there is only one user to whom of these requirements. Add itional requirements are the applicant reports and who is able to play the also described to provide a more complete set. role of manager. If there are two such users, either can be responsible fo r signing the fo rm Ta sk Assignment in a Travel Expense and onJy one has to sign it. Reimbursement Wo rkflaw The overall task assignment rule is: Everyone The requirements that must be fulfilled by the con­ who is able to play the rol.e of manager and cepts of the organizational aspect were derived to whom the applicant reports is eligible to exe­ from the travel expense reimbursement workflow cute the sign step. example, the author's project work experiences, • Reimburse. The reimburse step must be exe­ and Marshak's "Characteristics of a Wo rkflow cuted by a financial clerk who is responsible fo r System-Mind Yo ur P's and R's."68 Tbe fo llowing the grou p to which the applicant belongs. list describes task assignment rules fo r each step of the travel expense reimbursement workflow: The overall task assignment rule is: Everyone who is able to play the role of financial clerk and • Fill. The fi ll step can be executed by anyone in who is responsible fo r the applicant's group is an organization who has the potential to travel. eligible to execute the reimburse step. This assignment rule enables an employee to fill in a travel expense reimbursement fo rm after The requirements thus far derived from the a business trip. (An employee who did not travel example are can also fill in a fo rm and claim expenses; how­ ever, the check and sign steps are intended to • Organization structure dependencies. To select detect such misbehavior and to reject the fo rm.) one user relative to another (e.g., a user playing The user who fills in the fo rm is referred to as the role of secretary reporting to a user playing the applicant and is known at run time. the role of manager) requires describing the users, the roles, and the dependencies (relation­ • Check. The check step must be executed by ships). This description is called an organization a user who is able to play the role of secretary. structure. An organization structure contains all To be able to va lidate the contents of the fo rm, a organizational object types like "user," "group," user in this role is expected to know how a travel or "role," and the relationships among them like expense reimbursement fo rm is structured and "reports to" or "supervises." Given such a struc­ how to correctly fill in the fo rm. This user is also ture, users can be selected based on their rela­ expected to know the destination and the travel tionships to others. Users can also be selected <.lates,and ifthe travel actual ly took place. Not a!J based on attributes such as their absence status secretaries in an enterprise have this knowledge, (i.e., whether they are on vacation or on a busi­ but the secretary of the applicant's manager can ness trip) or theirworkload . be expected to know the information. This sec­ retary usually plans the trip and often the meet­ • Historical access. In some cases, the eligible user ings of the traveler. If the user who is able to play fo r a step cannot be <.leterminecl local ly, and his­ the role of secretary <.letermines that the con­ torical information is required. For example, tents of the travel expense reimbursement fo rm determining the user who can play the role of are sound, the fo rm is fo rwarded to the next manager in one step might require knowing step; otherwise it is sent back to the applicant. which user started the workflow. Therefore, it must be possible to query a log of the history of The overall task assignment rule is therefore: a workflow to derive the information necessary Everyone who is able to play the role of secretary to make task assignments. and reports to the same manager as the applicant is eligible to execute the check step. (Note that The fo llowing are additional requirements: the term manager means a user who is able to • Data dependency. In the travel expense reim­ play the role of manager.) bursement example used in this paper, the man­ • Sign. The sign step has to be executed by a man­ ager to whom the sign step is assigned can sign ager of the applicant because the manager fo r any amount. In other cases, however, this

4 Fa ll Digital Te chnical jo urnal Vn l. G No. 1994 31 Wo rkflow Models

signatory power may have limitations. For the development plan, which is based on a mar­ instance, if the amount exceeds a certain value, a ket ana lysis. A manager or a vice president is usu­ vice president and not the manager of the appli­ ally responsible for these three complex tasks cant must sign the travel expense reimbursement (market analysis, development plan, investment fo rm. As this last example shows, task assignment plan) but not involved in the detailed work. In may depend on data in the workflow. a WFMS, this situation would be modeled as a workflow called Product Development Start, • Delegation. A manager who is out of the office which contains the three complex tasks as sub­ may want to delegate his/her tasks to keep busi­ workflows. The Product Development Start ness operations running smoothly. The appro­ workflow could then be assignee! to a manager priate task assignment rule would then have to or a vice president to model responsibility. The be extended to incorporate the Llelegation of assignment to this user means only that the user tasks. Depending on the status of the manager must acknowledge the start of the assigned (e.g., on a business trip or on vacation), the work workflow ami therefore accept responsibility would be assigned to someone else (i.e .. dele­ fo r it. The assignment does not imply that the gated). However, task assignment rules that user has to perform the cletailed work. Thus. incorporate delegation can be complex. Con­ a WFMS must be able to assign not only steps to siller the situation in which a manager leaves users but also composite workflows. on a business trip after work has already been assigned. In this situation (ami also in the case • Early/late ;� ! location. Often, the application where a manager has an excessive amount of semantics clearly indicates the single user who work to accomplish), the manager must be able should execute a task. In such cases, the related to dynamically delegate some or all of the already task assignment rule (e.g., the role of manager assigned tasks. Further consider that a manager of applicant) passes to this user at run time. In may want to delegate different types of t:Jsks not other scenarios. hO\vever, successful execution to the same user but to different users. depend­ of a task requires some capabil itv that more than ing on the type of task. To avoid leaking informa­ one user possesses. This capability is often tion or making an inexped ient assignment, the expressed through a role (e.g., financial clerk, task assignment rule must make sure that the tar­ which is a ro le usually played by more than one get users are eligible to receive the delegated user in large enterprises). In the single-user case. task assignment. the task is assigned to that user regardless of the user's workload; this process is called early allo­ • Separation of duty. Some scenarios require a sep­ cation. The user must carry out the task unless it aration of duty, i.e., two tasks must be per­ is fe asible to delegate it. In the mu.ltip.lc-user fo rmed by different users. For example, in the case, the task appears on the work list of all users transfer of a large amount of money, two man­ able to plav the ro.lc. One user starts the task; in agers must sign the transfer fo rm to double­ most cases, this user wou ld not have tbe bighest check the transaction. Regarding the travel workload. Therefore, the fi nal allocation of the expense reimbursement workflow, a user who task is made not by the Wf\!S but by the set of fi lls out the claim fo rm should not also sign it. eligible users themselves. This process is called Ta sk assignment rules must ensure that there is late aUocarion. In this case. if one user starts a separation of duty. work on a ster. the other users are no longer allowed to begin the task.' '� Subsequent.ly, their • Responsibility. As previous.ly stated, a subwork­ assignment must be revoked. "Im plementing flow can be either a step or a group of steps that Agent Coordination fo r Wo rkflow Management may be a reuse of building blocks fo r larger Systems Using Active Database Systems" describes workflows. A second use of a composite work­ a general mechanism fo r handling the revoca­ flow is to explicitly express responsibility tRA, which are then discussed .

:)2 Vo l. (i No. -i f·i1ff f')'l! Digital Te chnical jo urnal Po licy Resolution in Wo rkflow Management Sy stems

Roles As Ta sk Assignment Rules is assigned to a capable user. The assignment As stated earlier, roles have limited use as task does not fulfill the additional requirement that assignment rules. Applying the role concept to the this user must also be responsible fo r the group task assignment rules introduced above illustrates to which the applicant belongs. the limitations. Certainly, the term role has many The discussion of the last three task assignment definitions. In this paper, a role is an abstraction of rules demonstrates two tightly coupled limitations a set of users. The abstraction criteria are the set of using roles to model requirements. of capabilities of a user. Whether or not a particular user belongs to the set of users abstracted by a role 1. The concept of roles cannot express organiza­ is defined by an explicit relationship between tional dependencies, such as relationships a user and a role called the "plays" relationship. A between users (e.g., "reports to" and "responsi­ user who has a plays relationship with a role has the ble fo r"). It only relates users to roles by a plays capabilities defined by that role, i.e., the user is able relationship. Furthermore, roles do not provide to play the role. For example, if both Ann and Joe a means of introducing additional objects of are users who are able to play the role of clerk, then organization structures like "group" and "depart­ each one has the capabilities defined by this role ment." The only two objects the concept of and each is capable of executing the task. A user roles provides are "role" and "user." might have a wide range of capabilities and be able 2. The concept of roles, therefore, does not pro­ to play several roles at the same time. For example, vide a sufficiently sophisticated language to a user might be able to play the role of employee express, for instance, that a user not only has and the role of manager simultaneously. Although to play a certain role but also has to relate to this definition of role is not the only one, it is very some other user in a particular way (e.g., 1".'H o2.626un.ct common and often applied (, "reports to" a particular user). For each task assignment rule that was intro­ duced in the travel expense reimbursement exam­ In addition, the other requirements like historical ple, a discussion foJlows about the extent to which access, delegation, and separation of duty cannot roles support the requirements. be modeled at all using roles. To overcome these limitations, PRA introduces • Fill. The task assignment rule fo r the fill step is the concepts of organization schema and organiza­ the only rule of the example that can be modeled tional policy and the Policy Definition Language. completely with a role. Assume that every user is A brief introduction fo llows. Details are presentee! able to play the role of employee. If the fill step in the section Policy Resolution Architecture. is assigned to the role of employee, every user can execute the step, thus modeling exactly the task assignment rule of the fill step. Organization Schema One of the fundamental concepts of PRA is a freely • Check. Assigning the check step to the role of definable organization schema. An organization secretary does not model the fu ll semantics schema contains aU types of organizational objects of the desired task assignment rule. Such an and re.lationships that are availab1e for modeling assignment models only the requirement that a particular organization. Figure 2a gives an exam­ a user has to be able to play the role of secretary ple of an organization schema. Jf a defined schema to carry out the step. The assignment does not is instantiated, it contains an organization struc­ model the additional requirement that only ture. Since other objects besides roles are required those users who report to the same manager as to model an organization, relationships other than the applicant are eligible. "plays" must be available. Some necessary addi­ tional relationships are "reports to," which relates • Sign. Analogous to the situation in the check two users, and "is responsible fo r" and "belongs to," step, assigning the sign step to the role of man­ which relate a user and a group. A freely definable ager does not model that only a user to whom organization schema, such as the one provided by the applicant reports is eligible but that any man­ PRA, allows designers to define roles as required ager is eligible. by the workflow application.

• Reimburse. Assigning the reimburse step to the Such a freely definable organization schema may role of financial clerk ensures only that the step seem to be a luxury, and a fixed organization

Digital Te chnical journal Vo l. 6 No. 4 Fa ll 1994 33 Wo rkflow Models

ROLE

PLAYS

REPORTS RESPONSIBLE TO //- - - ', FOR GROUP I \ D - - BELONGS ' , ___ , TO I (, 6• USER

(a) Sample Organization Schema

SALES MANUFACTURING ENGINEERING ADMINISTRATION

AL NINA KEN SUSAN MATT CHARLES MIKE

8 8(b) Sample Organ izatio n Structure8

Figure2 Sample OrganizationSc hema and Organization Structure fo r the Travel Expense Reilnbursement Exa mple

schema that provides the most relevant objects and semantics completely. Consequently, such a relationships may seem sufficient. An analysis of schema does not meet enterprise-specific needs. various organization structures in different enter­ Figure 2a shows a graphical representation of a prises clearly shows, however, that a single organi­ sample schema for the travel expense reimburse­ zation schema is not adequate fo r all situations ment example. Although this schema may appear in which WFMSs can be deployed. An enterprise general and an adequate alternative tO an all­ that deploys a schema in which the semantics of embracing schema, it does not contain required the modeled objects are fixed has to fo llow the organizational objects such as task fo rces with

34 Vo l. 6 No. 4 Fall 1994 Digital Te e/mica/ jourrml Policy Resolution in Wo rkflow Management Systems

a limited life span, committees, and departments. Figure 3a shows an example of an informal orga­ Also, this sample schema does not consider objects nizational policy fo r the sign step. This organiza­ or relationships necessary fo r modeling delegation tional policy specifies that if the WFMS is to assign and relocation of employees. Figure 2b displays a the sign step, it will assign the step to the manager superficial organization structure, i.e., an instantia­ of the applicant if the amount is less than $1,000. tion of the schema. Objects like user and role are Otherwise, it will assign the step to the vice presi­ depicted as icons, and relationships are depicted as dent responsible fo r the applicant's group. A more arcs and solid, dashed, and dotted lines between advanced rule would not fix the amount at $1,000 the icons. but would make this amount dependent on the Approaches that go beyond using roles as a basis authorization level of the manager, as illustrated in fo r task assignment common.ly provide organiza­ Figure 3b. tional objects in addition to roles and users, usually The Policy Definition Language is PRA's fo rmal group and/or department objects U•.So9.n The litera­ language for specifying organizational policies. ture contains evidence that the schemas and the Policies written in this language are precise and task assignment rules are fixed and have to be used executable by an execution engine called the pol­ as they are. Additionally, these approaches do not icy resolution engine. Each time the WFMS is about separate the workflow from the workflow specifi­ to assign a step, the system evaluates the corre­ cation, which makes the reuse of a workflow in a sponding organizational policy to determine the diffe rent organizational setting very difficult. set of users who can execute the task.

Organizational Po licies As Task Policy Resolution Architecture Assignment Rules \V FMSs operate in global, open, and distributed A second fundamental PRA concept is that of an environments and in group, department, enter­ organizational policy, which up to this point has prise, and multiple-enterprise settings. The been called a task assignment rule. An organiza­ enterprise-level deployment of workflows is pos­ tional pol icy specifies all the eligible users fo r a sible only if the underlying concepts and sys­ task by stating the criteria a user must meet. These tems are developed appropriately. PRA is therefore criteria can include a role or roles that a user has based on several design principles that ensure a to be able to play and relationships that a user has to general approach that supports enterprise-level .have with other users or groups. deployment .

(a}

WORKFLOW TravelExpenseReimbursement STEP sign CRITERIA IF amount < 1000 THEN manager of applicant ELSE VP responsible for applicant 's group ENDIF

(b)

WORKFLOW TravelExpenseReimbursement STEP sign CRITERIA IF amount < author ization leve l of applicant 's manager THEN manager of applicant ELSE VP responsible for applicant 's group ENDIF

Figure 3 Informal Orga nizational Po licies fo r the Sign Step of the Tra vel Expense Reimbursement Wo rkflow

Digital Te chnicaljow·nal 6 Vo l. No. 4 Fa /1 1994 35 Workflow Models

Design Principles to make organizational policies objects in them­ The PRA design principles are reusability, security. selves, independent of a workflow specification. general ity, dynamics, ami distribution. Organizational policies name not only the step in which they are used but also the surrounding work­ Reusability In the travel expense reimbursement flow. The design of organizational policies fo r a example, the sign step was modeled to approve step depends on the context in which the step is to travel expenses. Other workflows, like capital he reused. equipment orders, can reuse the sign step for simi­ As mentioned earlier, making organizational poli­ lar tasks, e.g., to approve an order. If an organiza­ cies independent objects allows diffe rent organi­ tional policy were attached to the step type itself, zation stn�ctures to reuse a workflow. To achieve this assignment rule would serve to determine eJ igi­ such reuse, each organizational setting has its own ble users independent of the workflow in which set of organizational policies for the workflow to he the step is reused. Viewed from an organizationa l reused. These organizational policies are tailored perspective, however, the reuse of steps in differ­ to the specific needs and circumstances of the orga­ ent workflows requires several pol icies. For exam­ nizational setting. ple, the signing of a travel expense reimbursement Organizational policies can themselves be reused . fo rm is carriecl out by a manager of the applicant. Diffe re nt steps may require the same set of eligible whereas the signing of a capital equipment order users, and, therefore, one policy would be suffi­ fo r an amount that exceecls a certain value is carried cient fo r more than one kind of step (e.g., sign ami out hy an appropriate vice president. Therefore, fi ll) or t( Jr more than one use of the same kind of the sign step in the context of a travel expense reim­ step. For example, a manager signs not only travel bursement workflow has an organizational policy expense fo rms but also capital equipment orders. that defines the manager of the applicant to be el igi­ In both workflows, the organizational pol icy that ble, whereas the sign step in the context of the cap­ defines the manager of the applicant depends on ital equipment order workflow has a diffe rent the authorization level. Both workflows can reuse policy, one that defines an appropriate vice presi­ the sign step, as can be seen in the policy shown dent as eligible fo r the task. in Pigure 4a. If the authorization level depends on 'T"he observation that a policy for a step depends the workflow, the policy changes to take into con­ not only on the step itself but also on the workfl ow sicleration the specific kind of workflow, as shown in which the step is reused led to the decision in Figure 4b.

(a)

WORKFLOW TravelExpenseReimbursement I CapitalEquipmentOrder STEP sign CRITERIA IF amount < authorization level of applicant 's manager THEN manager of applicant

ELSE VP responsible for applicant 's group ENDIF

(b)

WORKFLOW TravelExpenseReimbursement I CapitalEquipmentOrder STEP sign CRITERIA IF amount < authorization level of applicant 's manager depending on workflow type THEN manager of applicant

ELSE VP responsible for applicant 's group ENDIF

Figure 4 Informal Organizational Po licies Showing Reuse of the Sign Step

l'<1l. (> .Vo . .:f Fa ll 1')')4 Digital Te cbuical jo urual Po licy Resolution in Wo rkf!o·w Management Sy stems

Security Because changing an organizational pol­ Fo llowing this ge neral approach, it is apparent icy may affect daily business operations, aJ I users that a fixed set of assignment ru les is also inade­ should not be able to make changes at will. For quate. The PRA design hence provides a language example, a user (applicant) should not be able to that enables users to define task assignment ru les approve his/her own travel request. Organ izational (organizational policies) as required by the work­ pol icies are therefore objects that must be properly flows of an enterprise. secured to prevent users from performing unau­ thorized tasks. The decision to design organiza­ Dy namics Organizations change fo r many rea­ tional pol icies as objects makes it easier to secure sons, e.g., employee numbers fluctuate, restructur­ the pol icies, because security mechanisms such as ing takes place, groups join or split because of new access control lists (ACLs) can be applied directly product strategies, etc. Business operations and to objects. '5 therefore workflows, however, must continue unin­ Designers considered and rejected the alter­ terrupted. To do so, the organization structure and native approach of securing the workflow specifi­ the organizational policies of a WFMS must change cation and, consequently, the organizational to reflect the changes in the real organization. The policies included in the specification. Wo rkflow decision to separate workflows from organization types do have to be secured to prevent unautho­ structures and organizational pol icies enables users rized changes; however, securing the workflow to change versions independently. For example, an specification would al low those who are eligible organizational policy can change while a workflow to change the workflow type to also change the that uses it is running. If the change takes place associated organizational po.licies. Such an all­ before the WFMS assigns the step to a user, the encompassing security design inhibits the separa­ WFMS wi ll use the new version of the organiza­ tion of duty between workflow designers who care tional policy instead of the old version. Policy about how a business process is implemented by changes result in neither the shutting down of the a workflow and organization designers who care WFMS nor the stopping and restarting (from the about the organization structure and the user capa­ beginning) of the workflow. This independence bilities and responsibilities. Protecting workflows allows WFMSs to deal with the dynamics of an orga­ independently of organ izational policies allows nization and make correct task assignments while users to modify a workflow without allowing them changes are taking place. to modify organizational policies and thus gain or grant unauthorized eligibility. Similarly, organiza­ Distribution Not only are enterprises becom­ tion schemas and organization structures must be ing more distributed, but they are also increasing secured independently to prevent users from their worldwide operation. Nations have diffe rent changing roles or relationships to gain or grant local Jaws and policies because they decide unauthorized authority. autonomously on these issues. A local subsidiary has to adhere to local law, even though it belongs Generality Although several standard orga niza­ to a company that operates worldwide. for exam­ tion structures prevail-strong hierarchical, matrix­ ple, U.S. companies have a position called vice shaped, function-oriented, and networked-hybrid president. A U.S. company may have the rule organ ization structures exist, which contain a myr­ that contracts with external suppl iers of manu­ iad of anomalies and exceptions. Independent of fa cturing parts must be signed by the vice presi­ their organization structure, most enterprises have dent of manufacturing. If the U.S. company has a business processes that are potential candidates fo r German subsidiary, by German law, this subsidiary a WFiviS implementation. A WFMS that claims to be is a company in itself and must have a person called able to implement business processes in all kinds of Geschdftsfiihrer who is responsible fo r the opera­ enterprises must therefore be able to support all tions of the company. If the subsidiary wa nts to possible organ ization structures. A fixed organiza­ enter into a contract with a supplier, German law tion schema is inadequate fo r such a universal requires the Geschdftsfiih-ret· to sign the contract implementation capability. Consequent.ly, PRA even though the U.S. corporate organizational pol­ supports the modeling of arbitrary organization icy requires the vice president of manufacturing schemas and allows WFMSs to implement any orga­ to sign. Although the same type of workflow is nization that might exist. running in both countries, e.g., the contract with

6 Fa ll Digital Te cfmicaljounwl Vo l. No. 4 1994 37 Workflow Models

external supplier workflow, the organizational internal data, such as local variables, and thus be policies for the approval step differ. The U.S. dependent on the workflow state. Consequently, version of the organizational policy specifies the the set of users for the same step in two different vice president of manufacturing is the only eligible instances of the same workflow might be differ­ user, and the German version specifies that the ent. Consider, for example, the travel expense reim­ Geschdftsfii hrer is the only eligible user. bursement workflow, with the user selection for Domains were introduced to deal with the issue the sign step dependent on the authorization level. of autonomous pol icies. A domain is an abstract In two instances of the workflow, the amounts to entity of management. Organizational pol icies as be reimbursed might differsuch that diffe rent peo­ well as workflows are related to domains. The pre­ ple, e.g., the manager ami the vice president, must vious example might involve two domains: "USA" execute the two sign steps. and "GERMANY." (The domains could be fu rther To provide a general mechanism for determining subdivided.) a set of eligible users for a workflow, PRA organiza­ The principles just discussed guided the PRA tional policies accommodate operations in addi­ design. As mentioned in the previous section, tion to executing a step or taking responsibility for PRAdefi nes the concepts of organization schema, a composite workflow. Delegating a workflow and organizational policy, and a fo rmal language to undoing a workflow are two examples. To delegate model policies. In addition, I'RA defines interfaces a workflow, an organizational policy has to ensure fo r an execution engine and their use by a WF!VIS. A that both the person who delegates the workflow detailed discussion of the PRAcomponents fo llows. and the person to whom the workflow is assigned are el igible users. The operation of undoing a work­ Orga nization Schema and flow (i.e., to undo the results achieved thus far) and Organization Structure starting again can result in wasted effort and unre­ coverable work. Therefore, a WFMS must carefu lly The PRA organization schema is a set of objects and choose eligible users fo r this operation. relationships that can be freely defined, thus To deal with various workflow operations, a PRA enabling users to model arbitrary organizations. organizational policy relates a workflow type and Each member of the set can be instantiated to popu­ one of its operations in a given tlomain tO an organi­ late an organization schema, that is. to produce an zational expression. An organizational policy is organization structure. PRA allows users to define defined as tl1e tuple . For example, erroneous structures. For example, if an enterprise the organizational policy fo r the fill step in has the policy that an employee must not report the travel expense reimbursement example is to more than two people, PRA enables the user to . Since can be related to only two others through a "reports an applicant should be able to undo the step and to" relationship. If a modeler adds a third reporting start again, the WFMS must also specify the organi­ I ine, the system detects the violated constraint. zational policy

undo, liSA, ·the user who started fill'>. (The next Organiza tional Policy section describes PRA's formal language fo r specify­ An organizational policy specifies a set of eligible ing organizational policies.) users for a given workflow, which can be either ele­ When a WFMS determines that a workflow in mentary (a step) or composite. A set of users is not a particular domain is to be executed, it calls stable and therefore fixed but specified through an the policy resolution engine, which looks for the expression called an organizational expression. An appropriate organizational policy and evaluates organizational expression specifies tile selection of its organizational expression. 'fhe engine returns users with particular properties from an organiza­ the results of the evaluation, i.e., the set of eligible tion structure. For example, an expression might users, to the WFMS, which subsequently assigns the enumerate users, select all users able to play a par­ workflow to those users. One organizational policy ticular role, or select a user related to some other in can be reused fo r several workflow types, domains, a specific way. Additionally, organizational expres­ etc., by entering a set in tlw appropriate element sions can refer to the history of a workflow or to its of the tuple. For example, if the organizational

:38 Vo l. 6 No. 4 Fa ll /')')4 Digital Te chnical jo urnal Po licy Resolution in Wo rkflow Management Systems

pol icy fo r the fill step of the travel expense reim­ tiona! expressions. Finally, the third part supports bursement workflow is the same in the U.S. the definition of organizational policies. as it is in Europe, the policy could be modeled as The fo llowing figures illustrate the POL fo r a sam­ . organizational policies fo r the travel expense reim­ bu rsement workflow. Figure 5 shows the POL for Po licy Definition Language the organization schema displayed in Figure 2a. The From the organizational viewpoint, the fo llowing POL fo r the instantiation displayed in Figure 2b elements are necessary to run a workflow: an organi­ appears in Figure 6. zation schema together with its instantiation, the The organization schema definition part of the organizational policies fo r this workflow, and the rel­ POL looks like a data definition language (DOL) in a evant organizational expressions. To describe these relational database. Two diffe rences exist, though: elements in a fo rmal way, PRA defines a language (1) PDL distinguishes organizational object types called the Policy Definition Language (PDL), which from organizational relationship types, and (2) POL consists of several parts. The first part enables the allows complex data types (e.g., sets as attributes). definition of an organization schema and its popula­ If a policy resolution engine is bu ilt on top of a rela­ tion. The second part is concerned with organiza- tional database, a compiler or a translator within

ORGANIZATION_TYPE Role ATTRIBUTES name : String authorization_level: set ( task, amount ); KEYS name;

ORGANIZATION_TYPE Group ATTRIBUTES name : String KEYS name;

ORGANIZATION_TYPE User ATTRIBUTES name : String office_tel_# : String e_mail : String absence: {vacation, ill, business, avai labl e} KEYS name ;

RELATIONSHIP_TYPE Reports_to FROM User TO User ATTRIBUTES kind : {line , functional , none }

RELAT IONSHIP_TYPE Plays FROM User TO Role ATTRIBUTES duration_from: date duration_to: date

RELATIONSHIP_TYPE Responsible_for FROM User TO Group

RELATIONSHIP_TYPE Belongs_to FROM User TO Group

Note that , for simplicity, we assume user names to be unique . In reality, this is not the case and the modeling must deal with nonunique names .

Figure 5 Po licy Definition Language fo r the Sample Organization Schema Shown in Figure 2a

Digital Te chnical jo urnal Vo l. 6 No. 4 Fall 1994 39 Wo rkflow Models

Role "Employee" , {} "Manager", { (TravelExpenseReimbursernent . Sign, 1000) , (Capi talEquipmentOrder. Sign, 5000) } "FinancialClerk" , {} "Secretary" , {} "Engineer" , {}

"VP" , {(TravelExpenseReimbursement .Sign, *), (CapitalEquipmentOrder .Sign, *) }

Group "Sales" "Manufacturing" "Engineering" "Administration"

User "Al", "[1) 125-5589", "al@center .com" , available "Nina" , "[1) 125-5590", "nina@center . com" , available , "Ken' , "(1] 125-5601", "[email protected]" , available "Susan" , "[1) 125-5609", "susan@center . com" , bu siness "Matt" , "(1] 125-4499", "matt@center . com" , available "Charles" ,"[l] 125-4580", "charles@center .com" , available "Mike" , "[1) 125-0101", "mike@center . com" , available

Repcrts_to "Al", "Nina", line "Ken" , "Nina" , line "Nina", "Mike", line

"Susan", "Matt'' 1 line "Charles", "Matt" , line "Matt", \\Mike", line "Mike", none

Plays "Al", "Employee" , 01-02-88, 0-0-0 ( * open ended . ) "Al", "FinancialClerk" , 01-02-88, 0-0-0 "Nina", "Employee" , 01-02-90, 0-0-0 "Nina" , "Manager", 01-02-90, 0-0-0 "Ken" , "Employee", 01-02-91, 0-0-0 "Ken" , "Secretary" , 01-02-91, 0-0-0 "SUsan" , "Employee" , 01-02-92, 0-0-0 "SUsan" , "Secretary" , 01-02-92, 0-0-0 "Matt" , "Employee", 01-02-88, 0-0-0 "Matt", "Manager" , 01-02-88, 0-0-0 "Charles", "Employee", 01-02-88, 0-0-0 "Charles" , "Engineer", 01-02-88, 0-0-0 "Mike", "Employee", 01-02-90, 0-0-0

"Mike", "'fPII , 01-02-93, 12-31-97

Responsible_for "Al", "Sales" "Al", "Manufacturing" "Al", "Engineering" "Mike", "Sales" "Mike", "Manufacturing" "Mike" , "Engineering"

Belongs_to "Al", "Administration" "Nina", "Engineering" "Ken" , "Administration" "SUsan" , "Administration" "Matt", "Engineering" "Charles", "Engineering" "Mike" ,

Figure 6 Policy Definition Language.:fo r the Sample Organization Structure (Instantiation) Sbott'JI in fl�r.;ure 2b

...j() lfJI. () .\'u. 4 Ta ll 199-1 Digital Te cbuical jo urnal Policy Resolution in Wo rkflow Ma nagement Sys tems

the engi ne translates the organization schema defi­ role of secretary and who report to the same user nition part of PDL into a set of DDL statements. as the applicanr. Figure 7 lists the organizational expressions Independent from the travel expense reimburse­ required to fo rmulate the organizational pol icies ment example are the sample separation of duty fo r the travel expense re imbursement workflow. and delegation pol icies shown in Figures 9 and 10. Note that the organizational expression for employ­ The organizational policy that specifies separation ees selects all users who play the role of employee. of duty ensures that the user who signs the expense The RETURNS statement ind icates the search for fo rm is different from the user who fi lls out the users. The definition of the plays re lationship type form. The policy that models the delegation opera­ in Figure 5 indicates that the employee is of the tion contains a parameter that specifies to which type role. This information is sufficie nt to fo rmu­ person the sign step is to be delegated. Only the late a query to the underlying database system in an manager of the applicant can call this operation implementation of a policy resolution engine. and then only if the parameter specifies either the The PDL fo r the organizational policies fo r the next higher manager or the responsible vice presi­ travel expense reimbursement example appears in dent. The step can be delegated only to one of these Figure 8. The WFMS applies the first organizational two users. pol icy when assigning the fill step in a travel expense Since the PDL is wel l defined, it can be used not re imbursement workflow. The policy is valid in only by designers to model organizations and poli­ three domains, USA, EUROPE, and ASLA, fo r the exe­ cies bm also by developers of graphics-oriented cute operation, which has no parameters. The pol­ tools. Such tools could present graphical symbols to icy engine returns a set of all users who are able to users to be manipu lated. When a user decides to play the role of employee. The second policy listed commit the changes, the tool generates a PDL script, in Figure 8 returns a set of alI users who play the which is fe d into the policy resolution engine.

ORGANIZATIONAL_EXPRESSION employees () RETURNS User : user user plays employee

ORGANIZATIONAL_EXPRESSION secretaries () RETURNS User: user user plays secretary

ORGANIZATIONAL_EXPRESSION rnanager_of (User : a_user) RETURNS User: user a_user report s_to user

ORGANIZATIONAL_EXPRESSION subordinates_of (User : a_ user) RETURNS User: user user reports_to a_user

ORGANIZATIONAL_EXPRESSION group_of (User : a_u ser) RETURNS Group : group a_user belongs_to group

ORGANI ZATIONAL_EXPRESSION VP_responsible_for_group_of (User: a_user) RETURNS User : user user plays VP INTERSECTION user responsible_for group_o f(a_us er)

ORGANI ZATIONAL_EXPRESSION executing_agent (Workflow: a_workflow) RETURNS User (* provided by the historical services of WFMS *)

Fig ure 7 Orga nizational Expressions fo r the Tr avel Expense Reimbursement Exa mple

Digital Te chnicaljourual Vol. 6 No. 4 Fa ll 1994 41 Workflow Models

ORGAN IZATIONAL_POL ICY WORKFLOW Trave lExpenseReimbursement .Fill OPERATION Execute () DOMA IN USA, EUROPE , ASIA ORGANIZATIONAL_EXPRESSION employees ()

ORGANIZATIONAL_POLICY WORKFLOW TravelExpenseReimbursement .Check OPERATION Execute () DOMAIN USA, EUROPE , ASIA ORGANIZATIONAL_EXPRESSION secretaries () INTERSECTION subordinates_of ( manager_of( executing_agent ( TravelExpenseReimbursement .Fil l) ))

ORGANIZATIONAL_POLICY WORKFLOW TravelExpenseReimbursement .Sign OPERATION Execute () DOMAIN USA, EUROPE, ASIA ORGANIZATIONAL_EXPRESSION manager_of ( executing_agent ( TravelExpenseReimbursement .Fill) )

ORGANIZATIONAL_POLICY WORKFLOW TravelExpenseReimbursernent . Reimburse OPERATION Execute () DOMAIN USA, EUROPE, ASIA ORGANIZATIONAL_EXPRESSION financial_clerks () INTERSECTION User : user responsible_for group_o f( executing_agent ( Trave lExpenseReimbursement .Fi ll ))

8 Figure Organizational Po liciesjor t!Je Tr emel Expense Reimbursement Exa nzple

ORGANIZATIONAL_POLICY WORKFLOW TravelExpenseReimbursement . Sign OPERATION Execute () DOMAIN USA, EUROPE, ASIA ORGANIZATIONAL_EXPRESSION manager_of( executing_agent ( TravelExpenseReimbursement .Fill) ) DIFFERENCE execut ing_agent ( TravelExpenseReimbursement .Fill)

Figure 9 Organiza tional Po lhy fo r the Sepa ration of Duty

Approaches like the ones mentioned earlier in users fo r a workflow. None of these approaches the paper provide a fixed set of types fo r modeling provides a language like PDL that can freely define an organization or a fixed set of functions, such as tbe organizational aspect as the appl ication seman­

·' role player " or '·supervisor," from which to select tics requires.

No. 42 Vol. 6 4 Fall 1994 Digital Te e/mica/ jo urnal Po licy Resolution in Wo rkflow Management Sy stems

ORGANIZATIONAL_POLICY WORKFLOW TravelExpenseReimbursement .Sign OPERATION Delegate {User: a_user) OOMAIN USA, EUROPE, ASIA ORGANI ZATIONAL_EXPRESSION IF a_user IN {manager_of { manager_of { executing_agent { TravelExpenseReimbur sement .Fill) )) OR VP_responsible_for_group_of{ execut ing_agent ( TravelExpenseReimbursement .Fill) )) THEN manager_of ( executing_agent ( TravelExpenseReirnbursement .Fill) )

F(f?ure 10 Orga nizational Policy fo r the Delegate Operation

Po licyResolution Engine set of results, the conform to operation returns the The policy resolution engine is a mechanism that va lue "true"; otherwise it returns the va lue "false." evaluates organizational policies fo r a WFMS.Serving Policy resolution engine clients use this operation as a base service, the policy resoiution engine to validate a request by a user to execute a certain manages organizational policies and organizational task of a workflow. expressions, as we ll as the organization schema and The resolve and confo rm to operations fo r orga­ its population. The engine also provides interfaces nizational expressions work analogously. Instead fo r the definition, modification, and evaluation of a workflow reference, the operations expect of these objects. The interfaces are distinguished the name of an organ izational expression as input. by the kind of service they provide. There are basi­ The operations evaluate the named organizational cally two kinds of interfaces: evaluation interfaces expression and return the set of results, which and management interfaces. is used if the resolve operation is called. The con­ fo rm to operation returns true and false values as Evaluation Inte1jaces Policy resolution engine described in the previous paragraph. clients use evaluation interfaces to evaluate organi­ zational policies or organizational expressions Management Interfa ces Management interfaces when necessary. The engine provides four evalua­ are used to define, modify, or delete organizational tion interfaces: two fo r organizational policies policies, organizational expressions, or organiza­ ("resolve" and "conform to") and two fo r organi­ tion schemas and their populations. These inter­ zational expressions (also "resolve" and "conform faces look like the following operations that are to"). The resolve operation for organizational poli­ provided fo r organizational policies: create, delete, cies expects a workflow reference and one of its modify, list, get. The create operation creates an operations as input values. This operation selects organizational policy; the delete operation deletes an appropriate organizational policy, evaluates it, a policy; the modify operation allows users to and returns a set of users eligible to execute the change an organizational policy to adjust to new given task of the workflow. The conform to opera­ requirements; the list operation returns the identi­ tion for organizational policies expects a workflow fiers of all policies; and the get operation returns reference, one of its operations, and a user as input the complete description of a policy. values. This operation resolves the appropriate Designers do not cal l these management inter­ organizational policy fo r the workflow and checks faces directly, since they communicate their whether the user is contained in the set of re sults changes through user-friendly interfaces or tools. fo r that organizational policy (i.e., if the user con­ These tools are either graphics oriented or language fo rms to the policy). If the user is contained in the oriented. In a graphics-oriented tool, a designer

Digital Te chuical]ou1·nal Vo l. 6 No. 4 Fa ll 1994 43 Workfl ow Models

manipulates icons and graphical symbols, which in If the relevant data is stored in several databases, turn re sults in calls to the appro priate management the querying interface must be built in such a way interfaces. Al ternatively, a graphics tool can ge ner­ that the policy resolution engine can issue the nec­

ate a PDL script according to the manipulations of a essary queries, which might span several databases. user and submit this script to the policy resolution Furthermore, semantics issues have to be dea lt engine. In this case, the engine interprets the sub­ with in heterogeneous environments_-,_-(, mitted script and changes its internalstate accord­ ingly. Language-oriented tools enable a designer to Arcbitec tuml Considemtiolls-Clients of a Po liLy Resolution Engine F m an architectural point of directl y express changes using PDI.. These tools take ro specifications and translate them into management view, there are two possible ways to design a pol icy interface calls. Of course, they can also submit the resolution engine: language specifications directly as scripts to 1. Ill)!. Incorporate the p ol icy reso lu ti on engine into the pol icy resolution engine, as described above. \V F,\1S a The engine wou ld be a module, whose operations are hidden by the exported inter­ L geny Databases Many large enterprises have e WFMS. faces of the Allc a ll s to the engine opera­ developed databases that contain some or all of the tions would he made through the interface of organizational data the policy resolution engine the WI'NJS. needs to evaluate organizational pol icies . These databases, called legaqr databases, m ight he self­ 2. Make the policy resol ution engine an i ndepen­ implemented or based on standards efforts like dent COI11[10nent. The engine would be a server those related to provid ing directory services on with a WF�IS system as one of its cl ients. Al l networks, i . e . , X.500.-' In ge neraL organizati ons clients of the engine, including the WF�IS, would must deal with one of the fo llowing scenarios: be able to directly access the exported opera­ tions of the engine. • No legacy da t abase exists. No existing dat abase

, a has to he consi dered and the pol icy res olution PRA recommends the implementation of pol icy engine can use its own database to build up orga­ resolution engi ne as an independent base service . nizational knowledge. which can be used by clients other than a WF.\1S

For example, an electronic mail system can be • Legacy databases contain all re levant data. To a client of the pol icy resolution engine. Since elec­ use the pol iq' resolution engine, the database tronic mail is sem to users, rather than enu merate must provide a sufficiently expressive query the electronic mail add resses of the recipients hy interface, on top of which queries issued from hand, organizational expressions can provide the tile engine can be evaluated. The only add itional addresses. For example, a manager could send an information that has to be stored is organiza­ electronic mail message to "all my subordinates " or tional pol icies and organizational express io ns . an engineer could send an electronic mail message The organization has to choose whether to to ·'all my colleagues who are engineers." The sam­ extend the legacy databases or to use the pl e operational expression shown in Figure 11 database within the pol icv resolution engine. retu rns all electron ic mail add resses of aJI subord i­ • A legacy database contains some relevant data. nates of a given user.

ln addition to organizational policies and o rga­ Another possible client is a transaction procl'ss­ nizational expressions, organizational objects ing monitor, which incorporates workflow man­

and relationships must be stored in either the age ment.-- Dayal er al. refe rence a service ca lled legacy database or the database of the policy res­ role resolution, which is an earlier development of olution engine. policy resolution.-x

ORGANI ZATI ONAL_EXPRE SSION subordinates (User : a_user) RETURNS String: user . e_mail user reports_to a_user

Figure 11 0Jgwzizationa/ £.\pression fo r Electmnic .liuil

44 �bl. (i No. -1 hli/ I'J'J4 Digital Te chnical jo urnal Po licy Resolution in Wo rkflow Management Sys tems

Figure 12 shows a schematic representation of paper. Susan Thomas assisted me by improving my a policy resolution engine with three clients-a English and thus making the paper more readable.

WFMS, a transaction processing monitor, and an Kathy Stetson was always very helpful in coordinat­ electronic mail system. ing the writing and revision processes.

Summary References BYTE The sample workflow discussed in this paper, that 1. T. May, "Know Yo ur Wo rk-Flow To ols," is, the travel expense reimbursement workflow, (July 1994). illustrates that roles are sufficient as task assign­ 2. Kreifelts and Seuffert, "Addressing in ment rules fo r only the simplest scenarios. Since T. P an Office Procedure System," Message Ha nd­ workflow management systems are deployed in sit­ ling Sys tems, State uf the Art and Future uations where complex workflows are modeled Directions: Proceedings WG 65 Wo rking and executed, a more general and powerful model lf1P Co nference on Message Ha ndling Systems, cal led the Policy Resolution Architecture (PRA)was R. Speth, ed. (Amsterdam: North-Holland, developed. PRA provides the concept of an organi­ 1987) zational policy. An organizational policy is more general than a role in that it relates a workflow type 3. S. Chang and W Chan , "Transformation to an organizational expression that determines the and Ve rification of Office Procedures," IEEE set of eligible users fo r the workflow. Because they Tra nsactions on Software Engineering, vol. state all criteria a user has to fu lfill and do not limit SE-1 1, no. 8 (August 1985). the selection based on their properties or interrela­ 4. W Croft and L. Lefkowitz, "Task Support in an tionships, organizational policies specify all el igible Office System," ACM Tra nsactions on Office users. Since an organizational expression is related Information Sy stems, vol. 2, no. 3 (July 1984). to a workflow type by an organizational policy, task assignment through organizational policies is a very 5. C. Ellis and G. Nutt, "Office Info rmation Sys­ general approach. Organizational policies are eval­ tems and Computer Science," Comjmting 12, uated based on organization schema and their Surveys, vol. no. 1 (March 1980). populations (organization structures). Since PRA 6. C. Ellis and M. Bernal, "Officetalk-D: An provides a way to model arbitrary complex organi­ Experimental Office Information System;· zation schemas, arbitrary organizations can be mod­ First SIGOA Co nference on Offi ce Informa­ eled and subsequently populated. This generali ty, tion Sy stems (1982). in conjunction with organizational policies, pro­ 7 C. Ellis, "Formal and Info rmal Models of vides a powerful and flexible approach to task Office Activity," /nfonnation Processing 83, assignment in workflow management. R. Mason, eel . (Amsterdam: North-HoJland, Ackrwwledgments 1983). I want to thank the anonymous referees whose 8. B. Karbe and N. Ramsperger, "Concepts and remarks helped me a great deal in revising this Implementation of Migrating Office Pro­ cesses," Ve rteilte Kii.nstlicbe Jntelligenz und

Kooperatives Arbeiten: 4. Jnternationaler WORKFLOW TRANSACTION ELECTRONIC GI-Kongt-ejs Wissensbasierte Sys teme, lnfo r­ MANAGEMENT PROCESSING MAIL SYSTEM SYSTEM MONITOR matik Fa chberichte 291, W Brauer and D. Hernandez, eds. (Munich: Springer-Verlag, October 1991 ).

9. T. Kreifelts, "Coordination of Distributed POLICY Wo rk: From Office Procedures to Custom­ RESOLUTION izable Activities," Ve rteilte Kunstliche ENGINE Jntelligenz und Kooperatives Arbeiten: 4. Jnternationaler GT-Kongrejs Wissensbasierte Sy steme, 1nformatik Fa chberichte 291,

Figure 12 Client-server Structure W Brauer and D. Hernandez, eds. (Munich: uf a Pu li<-J'Re solution Engine Springer-Verlag, October 1991).

Digital Te chnical jo urnal Vo l. 6 No. 4 Fa ll 1994 45 Workflow Models

10. C. Cook, "Streamlining Office Procedures­ 20. T Katayama, "A Hierarchical and Functional An Analysis Using the Information Control Software Process Description and Its Enac­ Net Model," Af"'PS Conference Proceedings of tion,'' Proceedings of the E/e"enth Interna­ the 1980 National Computer Conference, tional Conference on Sojhuare Engineering Anaheim, California (May 1980). (May 1989).

11. I. Lacld and D Ts ichritis, ''An Office Form Flow 21. P Mi and W Scacchi, Op erational Semantics Model," AF!PS Conference Proceedings of the of Process Enactrnent and Its Prototype 1980 Na tional Computer Conference, Ana­ Implementations (Los Angeles: University heim, California (May 1980). of Southern California, Computer Science Department, April 1991). 12. L Baumann and R. Coop, "Automated Work­

flow Control: A Key to Office Productivity," 22. P Mi and W Scacchi, Modeling Articulation 1980 AF!PS Conference Proceedings of the Wo rk in Software Engineering Processes (Los Na tional Computer Conference, Anaheim, Angeles: University of Southern California, California (May 1980). Computer Science Department, April 1991)

13. M. Zisman, "Representation, Specification 23. P Mi and W Scacchi, "A Knowledge-Based and Automation of Office Procedures," Ph.D. Environment fo r Modeling and Simulating dissertation (Philadelphia: University of Penn­ Software Engineering Processes," 1/:.I::t:: Trans­ sylvania, Wharton School, 1977) actions 011 Knowledge and Oatu Engineer­ ing, voL 2, no. (September 1990). 14. B. Curtis, M. Kellner, and J Over, "Process 3 Modeling," Co mmunications of the AC!H, voL 24. L. Osterweil, "Software Processes Are Soft­ 35, no. 9 (September 1992). ware To o," Proceedings of the Ninth lnternct­

15. W Deiters and V Gruhn, "The Funsoft Net tional Conference on Softll'are Engineering Approach to Software Process Management," (March/April 1987). Internationaljo urnal of Sojtware Engineer­ 25. L Williams, "Software Process Modeling: ing and Knowledge Engineering, vol. 4, no. 2 A Behavioral Approach," Proceedings of the (1994) Te nth International Conference on Software 16. W Deiters, V Gruhn, and H. We ber, ''Software Engineering (1988). Process Evolution in MELMAC," Th e Impact of W CA SE Te chnology on Software Processes 26. Royce. "Managing the Development of Series on Software Engineering and Knowl­ Large Software Systems," Proceedings of the edge Engineering, voL 3, D. Cooke, ed. (Singa­ Ninth international Con/erence on Software pore: Wo rld Scientific Publishing, 1994). Engineering (March/April l987).

17. D. Harel et al., "STATE!YIATE: A Wo rking Envi­ 27. B. Boehm, "A Spiral Model of Software Devel­ ronment fo r the Development of Complex opment and Enhancement," A<:i11 Sojtware Reactive Systems," Proceedings of the Te nth Engineering Noles, voL 11, no. 4 (August 1986). International Conference on Sojtware Engi­ neering (1988). 28. C. Hegarty and L. Rowe, "Control Loops ami Dynamic Run Modifications Using the Berke­ 18. W Humphrey and M. Kellner, "Software Pro­ ley Process-Flow Language," Proceedings of cess Modeling: Principles of Entity Process the Th ird International Conference on Data Models," Proceedings of the Eleventh Inter­ and Knowledge Sy stems ji Jr kl anufacturing national Conference on Software Engineer­ and Engineering, Lyons, France (1992). ing (May 1989). 29. S. Jablonski, "Data Flow Management in Dis­ 19. M. Jaccheri and R. Conradi, "Techniques for tributed OM Systems;· Proceedings of the " Process Model Evolution in EPOS, fEEt. Trans­ Th ird International Conference (111 Datcl actions on Software Engineering (December and KnouAedge Systerns fo r Jlthmufacturing 1993) and Engineering, Lyons, France ( 1992).

46 Vo l. 6 No. 4 Fa ll 1994 Digital Te cbnicalJournal Po licy Resolution in Wo rkfl(YWMana gement Sy stems

30. Proceedings of the Third International Con­ 41. T. Malone, K. Crowston,J. Lee, and B. Pentland, fe rence on Data and Knowledge Sy stems fo r "Tools fo r Inventing Organizations: To ward a Ma nufacturing and Engineering, Lyo ns, Handbook of Organizational Processes," CCS

France (1992). WP #141, Sloan School WP #3562-93 (Cam­ bridge: Massachusetts Institute of Technology, 31. H. Yo shikawa and). Goossenaerts, eds. , Info r­ Center fo r Coordination Science, May 1993). mation InfrastructureSy stems fo r Manufa c­ tu ring (Amsterdam: North-Holland, 1993). 42. R. Burkhart, "Process-based Definition of Enterprise Models," Proceedings of the First

32. T. Harder and A. Reuter, "Principles of International Conference on Enterprise Inte­ Tra nsaction-oriented Database Recovery," gration 1\llodeling Te chnology (ICEIM T), ACM ComjJuting Surveys, vol. 15, no. 4 Hilton Head, South Carolina (June 1992). (December 1983). 43. C. BuGler, "Enterprise Process Modeling and 33. P At tie, M. Singh, A. Shet, and M. Rusinkiewicz, Enactment in GERAM," Proceedings of the "Specifying and Enforcing Intertask Depen­ International Co nference on Automation, dencies,'' Proceedings of the Nineteenth Robotics and Computer Vision (ICA RCV '94), International Conference on Ve ry Large Singapore (November 1994) Databases (VLDB), Dublin, Ireland (1993). 44. M. Fox , "The TOVE Project: Towards a Com­ mon-Sense Model of the Enterprise," Proceed­ 34. Y. Breitbart, A. Deacon, H. Schek, and G. We ikum, "Merging Appl ication-centric ings of the Fi rst International Conference on and Data-centric Approaches to Support Entetprise In tegration Modeling Te chnology (ICEIMT), Transaction-oriented Multi-system Wo rk­ Hilton Head, South Carol ina Qune 1992). flows," S!Gtl'IOD Record, vo l. 22, no. 3 (Sep­ 1993). tember 45. Proceedings of the First International Con­ fe rence on Enterprise In tegra tion Modeling 35. U. Dayal, M. Hsu, and R. Laclin, "A Transac­ Te chnology (ICElt'vlT), Hilton Head, South Car­ tional Model fo r Long-Running Activities," 1992). Proceedings of the Seventeenth Interna­ olina (June tional Conference on Ve ry Large Databases 46. R. Katz, "Business/enterprise Modeling," IBM (VLDB), Barcelona, Spain (September 1991). Systemsjo urnal, vol. 29, no. 4 (1990).

36. H. Garcia-Molina and K. Salem, ·'Sagas," Pro­ 47. ). Sowa and ]. Zachman, "Extending and For­ ceedings of the 1993 ACM SIGL'viOD Interna­ malizing the Framework fo r Information Sys­ tional Conference on Management of Data tems Architecture," IBM Sy stems jo urnal, vol. (1987). 31 , no. 3 (1992).

37. Bulletin of the Te chnical Committee on Data 48. F Ve rnadat, "Business Process and Enterprise Engineering, vol. 16, no. 2 (June 1993). Activity Modelling: CIMOSA Contribution to a General Enterprise Reference Architecture 38. S. Jablonski, "Transaction Support for Activity ancl Methodology (GERAM)," Proceedings Management," Proceedings of the Wo rkshop of the International Conference on Automa­ on High Perforrnance Tra nsaction Processing tion, Robotics and Computer Vision (JCA RCV Sy stems (H PTS), Asilomar, Califo rnia (1993). '94), Singapore (November 1994).

39. H. W�ichter and A. Reuter, "The ConTract 49. T. Williams, "Architectures fo r Integrating Model," in Transaction Models fo r Advanced Manufacturing Activities and Enterprises," Database Applications, A. Elmagarmid, ed. Information Infrastructure Sy stems fo r Ma n­ (San Mateo, California: Morgan Kaufmann, ufacturing, H. Yo shikawa andJ. Goossenaerts, 1992). ells. (Amsterdam: North-Holland, 1993).

40. T. Malone and K. Crowston, "The Interdisci­ 50. F Flores and T. Winograd , Un derstanding plinary Study of Coordination," ACM Comput­ Co mp uters and Cognition (Reading, .MA: ing Surveys, vol. 26, no. 1 (March 1994). Addison-Wesley, 1987).

Digital Te cb11.icaljo urnal 4 1994 Vol. 6 No. Fa ll 47 Workflow Models

51. R. Medina-Mores, R. Winograd, T. Flores, and 63. S. Sarin, K. Abbot, and D. McCarthy, "A Process F Flores, "The Action Wo rkflow Approach to Model and System fo r Supporting Collabora­ Workflow Management Te chnology," Proceed­ tive Wo rk," Proceedings of the AOH SIGWS ings of the AC!l-'1 i992 Conference on Com­ Conference on Organiza tional C01nputing puter Supported Cooperatil'e Wo rk (C'\CW), Sy stems (November 1991). Toronto, Ontario, Canada (November 1992). 64. M. Shan, "Pegasus Architecture and Design 52. T. Danielsen and U. Pankoke-Babatz, "T'he Principles," Proceedings of the 1993 ACM SJG­ Amigo Activity Model," in Research into Net' MOD international Conference on lVianage­ works and Distributed Applications, R. Speth, Jn ent of Data. Was hington, D.C. (May 1993) ed. (Munich: North-Holland, Elsevier Science Publ ishers B.V, 1988). 65. M. Ansari. L. Ness, M. Rusinkiewicz, and A. Sheth, "Using Flexible Transactions to Sup­ 53 R. Fehling, K. Joerger, and D. Sagalowicz, port Multi-System Te lecommunication Appl i­ Knowledge Sy stems fo r Process Mmwgement cations," Proceedings of the Eighteenth CA: 9 ). (Palo Alto, Teknowledge Inc., 1 86 International Co nference on Ve ry Lmge Databases (VI.IJR). Va ncouver, British 54. J Guyot, "A Process Model t()r Data Bases," SJG­ Columbia, Canada (1992). MOD Record, val. 17, no. 4 (December 19R8) .

55. C. BuSier and S. Jablonski, "An Approach to 66. D. Evans, "Putting Elves to ·work: Workflow Integrate Workflow Modeling and Organiza­ Technology in a Law Firm," Proceedings of

' 93 tion Model ing in an Enterprise," Proceedings the Groupware Conference, San Jose, California(1 99:3) of the Third IEEE international Wo rkshop 011 Enabling Te chnologies: infrastructure fo r 67 D. Sng, "A National Information InJrastructure Collaborative EnteJ1Jrises (WLT JCT;), Mor­ for the 21st Century Collaborative Enter­ gantown, West Virginia (April 1994). prise." Proceedings of the International Con­

56. S. Jablonski, ";\•IOBILE: A Modular Wo rkflow fe rence on A utomation, Robotics and '94), Model and Architecture," Proceedings of the Computer Vi sion (ICARCV Singapore Fo urth Wo rk ing Conference on Dynmnic (November 1994) Jl!iodelling and Information .�)'stems, Noord­ 68. R. Marshak, "Characteristics of a Workflow wijkerhout, Netherlands (September 1994} System-Mind Yo ur P's and R's," Proceedings 57. M. Hsu, A. Ghoneimy, and C. Kleissner, "An of the Groupware '93 Conference, San Jose, Execution Model fo r an Activity Management California( 1993).

System," Proceedings of the Work�shojJ on High 69. C. Bugler ami S. Jablonski, " Implementing Perfornumce Tra nsaction .s:vstems (1991). Agent Coordination for Workflow lvianage­ 58 M. Hsu and .M . Howard. "Work-Flow and ment Systems Using Active Database Systems," Legacy Systems," BYTE (July 1994). Proceedings of the Fo urth intenzationol ·worl<.s/.JojJ on Research Iss ues in Data Engi­ 59. F Leymann and W AJ tenhuber. "Managing Busi­ neering: Actit•e Database S)'stems (RIDE-ADS ness Processes as an Int()rrnation Resource," '94), Houston, Texas (February 1994). IBM .�ystetnsjournal, vo l. 33. no. 2 (1994). 70. C. EJJ is, S. Gibbs, and G. Rein, "Groupware­ 60. Wo rkflow Mmwgement Soflwa re: The Communica­ Business Opportunity (Ovum Reports, Some Issues and Experiences," ACM, :'l4, 1991) December 1991). tions of the vol. no. 1 (January 71. (J l. T. White and L. Fischer, "New 7()()/s fo r Ne w L. Lawrence, "The Role of Roles,'' Cmnputers l (1993). Times: The Wo rkflow Pa radigm (Alameda: and Securit_g vol. 12, no. Future Strategies Inc., Book Division, 1994). 72. L. Aiello, D. Nardi. and iv i. Panti, "Modeling 62. J Bair. "Contrasting Wo rkflow Models: Get­ the Office Structure: A First Step towards the ting to the Roots of Three Vendors," Proceed­ Office Expert System," Second AC\1 S!GOA

ings of the Groupware '93 Confere nce, San Cm�terence 011 Ojjice information .S) •stems _lose, California(1 993). (A C.II S/C,OA). vol. 5, nos. 1 and 2 (l9R4)

48 ViJ/. (J No. ·I hill !')') 1 Digital Te chnical jo urnal Policy Resolution in Wo rkflow Management Sys tems

73. D. Denning, Cry ptography and Data Security Oriented Database Programming Language," (Reading, MA: Addison-Wesley, 1983). Proceedings of the Seventeenth In terna­ tional Conference on Ve ry Large Databases 74 . Blue Book, Vo lume Vl!I, Fa scicle VJIJ. 8, Data (VLDB), Barcelona, Spain (September 1991). Communication Ne tworks: Directory, Rec­ ommendations X.500-X.521 (Study Group 77. U. Dayal et al., "Third Generation TP Moni­ VII), Comite Consultatif lnternationale de tors: A Database Challenge," Proceedings of Tell·graphique et Te�lephonique. tbe 1993 ACM S/G/'vJOD International Confe r­ ence on Nlanagernent of Data, Was hington, 7). S. Ceri and .J. Widom, ''Managing Semant ic Heterogeneity with Production Rules ami Per­ D.C. (May 1993). sistent Queues." Proceedings of the Nine­ 78. C. BuBier, "Capability Based Modeling," tee1ltb Conference on Ve ry Lorge Databases Proceedings of the First International Con­ (VUJN), Dublin, Ireland ( 1993). fe rence on En te11Jrise Integration Modeling 7(J. W Kent, "Solving Domain Mismatch and Te chnology (JCEIMT), Hilton Head, South Schema Mismatch Problems with an Object- Carolina (June 1992).

Digital Te chnical jou rnal Vu l. (j No. 4 hill 1')<)4 49 Stewart V.Ho over GaryL. Kratkiewicz

Th e Design of DECm odel fo r Wi ndows

The DECm odel fo r Windou•s softlmre tool represents a significant admnce in the development of business process models. Tbe DECmodel tool allows rap id devel­ op ment of models and graphical representations of business processes by prot•id­ ing a laborator)' enl'ironment fo r testing processes befo re propaga ting t!Jem into workflows. Such an app mach can signifi cantly reduce tbe risk associated u·itb large inuestments in information technology The DECIJlodel dest�f{ll i11corporates knowledge-based, simulation, and graphical user intetjace technology on a PC plat­ fo rm based on the L'vficrosoft Windml's op erating �Tstem. Un ique to the design is the manner in whicb it separates the model of the business processes Fum tbe uiews or presentations of the model.

Many approaches have been developed fo r under­ base component aml the presentation component standing, specifying, testing, and val idating busi­ as separate processes. A primitive mailbox system 1980s, ness processes. In the late Digital began ro was used fo r interprocess communication. To serve reengineer some of its most complex and mission­ the needs of nontech nical business users and to critical business processes. It soon became appar­ achieve the necessary product qu ality, Sym mod ent that modeling methodologies and tools were needed to be completely redesigned and rebuilt.

needed to document, test, ami va I idate the reengi­ In early 1991 . the Modeling and Visualization neered processes before they were implemented, Group decided to build a product version of the as well as to provide a high-level specification fo r Sym mod application, which wo uld be released as their design and implementation. Consequently, the DECmoclel tool. The team drafted requirements, Digital decided to provide the business process specifications, ami an architecture. The DECm odel engineer with tools similar to those used by archi­ product was initially targeted at two platforms: tects, mechanical designers, and computer and soft­ VA Xstation workstations running under the ware engineers. DECwindows operating system and personal com­ The first implementation of Digital\ drnamic puters (PCs) running under the Windows NT oper­ business modeling technology, Symbolic Modeling, ating system. As users were interviewed and was developed at Digital's Artificial Intel l igence requirements were accumul.ated . it became clear. Tec hnology Center. The technology was embodied however. that by far the most important platform 1991 in an application called Symmod. wh ich in ran for DECmouel users was the PC platform based only on a VA Xstation system. 1 Symm()(J'skn owledge on the Wi n dows operating system. Consequently,

base and simulation engine were implemented the DECmodel development eftort shifted to this using the LISP progra mming language and the platform. Knowledge Craft product, a fra me-based knowl­ During 1991, the team en hance(! the existing ver­ edge representation package with modeling and sion of Symmod so that it would meet user needs simulation fe atures.l Because models were written until the release of the pro(luct version for l'Cs. The in LISP code, users had to be computer program­ most sign ificant enhancement was the develop­ mers as we ll as business consultants . The appl ica­ ment of an X Window System interface fo r building tion contained a graphical presentation builder and and editing models. A second important enhance­ \'iewer implemented in the C programming lan­ ment was a graphica l shell program that tra ns­

guage that used a relational database fo r presenta­ parently started up the knowledge base and tion storage. The user had to start the knowledge presentation components to r the user.

')() Vol. o No. 1 h1ll 1')')4 Digital Te cbnical jo urnal The Design of DEC model fo r Windows

In March 1992, Digital officially announced Phase Designers believed that while simulating the 0 (the strategy and requirements determination business, the user should be able to interact with phase) of the DECmodel for Windows product. the model and thereby select and test more than one scenario. The DECmodel tool was intended to be a working scale model of the business, giving the Design and Development Goals user a sense of how the business process would The DECmodel product design team had the fo llow­ work as different choices were made. The tool, by ing goa ls: design, neither predicts congestion and through­ put as a fu nction of resource constraints nor pro­ • Provide a modeling tool that maps directly to business processes vides information through statistical reports. The DECmodel product was designed to provide a slow, • Allow the modeling of both the static and the deliberate simulation of the business, not to com­ dynamic characteristics of the business process press weeks or years of activities into a few sec­ onds, leaving behind only a statistical summary. • Allow multiple views of the business process The team's development goals fo r the DECmodel model by separating the model from the presen­ product were to tation of the business process during simulation

• Provide a tool that runs on a popular hardware • Allow the user to interact with the tool and to platform used by business consultants make decisions while the business process is being simulated in order to let the user "test­ • Achieve a short time-to-market, i.e., delivery drive" the business process within one year

• Provide a tool that is easy to use fo r business con­ • Utilize a widely accepted software base technol­ sultants and that requires no programming ogy (for maintainability) Note that the designers intentionally omitted the fo llowing goals from the DECmodel design: The DECmodel Wo rld Vlew • Include resource constraints and queuing Every modeling and simulation tool is based on a predefined view of the world 5 In the DECmodel • Allow the user to perform a statistical analysis world view, a business process is composed of of the behavior of the business process aggregate centers capable of perfo rm ing one or By far the most important go al fo r the DECmodel more tasks or work steps. Each aggregation is design was the first one listed, an obvious mapping referred to as a process, and the tasks that can occur between elements of the model and business pro­ in a process are called activities. Processes commu­ cesses. The anticipated users of the DECmodel tool nicate through the exchange of messages, which were business analysts and consultants, not system are sent by activities and received by another pro­ designers and software engineers. The designers cess or other processes or by the same process that felt that adding levels of abstractions to a modeling contains the activity.4 tool would make it less acceptable to the intended This view differs significantlyfrom the one taken users. A notable corollary to providing an obvious by the typical workflow model in which work steps mapping was modeling both the static and the are directly linked. In the DECmodel model, an dynamic characteristics of the business process. activity that sends a message to a process has no To engage the user in interacting with the model knowledge of what work steps will occur next. For and test-driving the business process required example, when a customer (a process) sencls an a graphical interface that was separate from the order (a message) to a supplier (another process), model. This "presentation" layer of the DECmodel the customer does not know what work steps tool provides a layout and graphical appearance (activities) the supplier will initiate when it that has the look and fe el of the actual business pro­ receives the order. It is invisible to the customer cess, hiding the irrelevant technical details of the whether or not the supplier decides to change its model. The presentation enables the user to step work rules, for instance, by sending the order to a through the business, watching information and second source because materials are not available. material flows occur, and thus see where the Similarly, when the supplier's activities have been dependencies and concurrencies exist. completed and the material that was ordered has

Digital Te chnict,l]ournal Vo l. 6 No. 4 Fa ll 1994 51 Wo rkflow Models

been sent to the customer, the supplier has neither Activities can send messages to processes only. knowledge of nor dependencies on the work steps The receiving process makes the message known to that the customer undertakes next. In contrast, in every activity that uses the message in its receive a workflow model each task is directly linked to rule. Messages are universal to the model, and the another task. Changes in the suppi ier's way of cloing same message type can be sent by activities in dit� business fo rce changes in how the customer's tasks ferent processes. connect to the supplier's tasks. More succinctly, the Processes can have state knowledge (attributes) DECmodel tool encapsulates the behaviors and that can be assigned values as a side effect of an work rules of each individual process in the larger activity being completed. 'The activity can use business process. This differencebetw een the pro­ a process attribute value to decide what messages cess and workflow models is shown in Figure 1. to send out ancl where to send them. 'That is, pro­ cesses have a state that can be altered to change the behavior of the model. Processes, Activities, and Messages Like processes, messages can contain infor­ As described above, the DECmodel model repre­ mation, which is stored in their attribu tes. When sents a business process as a collection of smaller a process receives a message and passes it on to encapsulated processes. The behavior of each pro­ an activity, information in the message can be used cess is defined by the activities that it contains. The in both the receive rule and the send rule of the DECmodel tool provides three general types of activity. Aclditionally, the information in a received activities: generating activities, processing activi­ message can be copied into the attributes of ties, and terminating activities. Generating and ter­ any message that an activity sends. In this way, the minating activities represent the boundaries of the DECmodel tool supports information propagation. model; processing activities represent the work The DECmoclel representation of business bor­ steps in the business process. rows heavily from both the stochastic-timed Petri An activity is characterized by (1) a receive rule, net (STPN) model and the object paradigm fo und in which defines the messages that the activity neecls object-oriented design. '1' fo r initiation, (2) a duration, and (3) a send rule, which defines the messages that the activity semis out at the end of its duration. Generating activities TIJ e Stochastic-timed Petri Net Model uersus the have only send rules, and terminating activities DECrnodel ;lJodel An STPN model represents a have only receive rules. system as a col lection of places, transitions, arcs. and tokens. Places contain tokens and act as inputs to transitions. A transition results in the movement PROCESS A PROCESS 8 of a token to another place if an arc exists between the transition ancl the place. Before a transition can occur, a token must be present at each place that is connected to the transition by an arc. Associated with each transition is an exponentially distributed random variable that expresses the delay between the enabling of the transition and the firing of the transition. The DECmodel model welds the STPN place, tran­ (a ) Process Jl!J odel sition, ancl arc elements into a single object cal led an activity. The analogous elements of the ST'PN and DECmodel models are

STPN DECmodel Place Activity receive rule (b) Wo rkjlou• ii!J odel Transition Activity duration To ken Message Figure I Tbe Process ;Ho del uersus Arc Activity send rule the Wo rkj? ou• Model Process

52 Vii/. 6 ;\(). cj Fa ll I'J')1 Digital 1ecbuical jo urnal The Des(f.!.11 of DECmodelfor Windows

The DEC:model mode.l goes beyond the STPN 5. Not restricting durations to being exponentially model by disrribllted random variables.

STPN l. Adding the process object between the activity Like an model, a DECmodel model does not send rules (arcs) and the activity receive rules explicitly have resources but can represent the (places). Each process can have multiple activity availability of a resource by sending a message to a send rules. As the process object receives mes­ process when the resource is av ailable. 2 sages (tokens), it dispatches them to the appro­ Figure shows the workflow system from priate activity receive rule (place). Figure 1 as both an STPN model and a DECmodel model with the process receiving messages from 2. Al lowing more than one type of message (token) the activities. to exist.

:;. Storing information in bo th the processes and The DECmodel !Vl odel and Object-oriented Design the messages (tokens). The elements of object-oriented design that the DECmodel model fu lly draws upon are encapsu la­ 4. Using AND, OR, and message-m atching receive tion of information and the message-method para­ ru les in the activity receive rules (places). digm . Information is encapsulated within DECmodel

ACTIVITY 1 ACTIVITY 2 ACTIVITY 3 ACTIVITY 4 V-UVl KEY: ' PLACE ARC 0 J TRANSITION • TOKEN

u ' (a) Stochastic-timed Petri Net Model of a Fo ur-acti iz) Workflow

KEY: c==::> PROCESS JACTIVITY SEND RULE ACTIVITY • MESSAGE 0 ACTIVITY RECEIVE RULE

(b) A DECmodel Model of a Fo ur-activiZJ' Workflow with a Process Dispatching Messages between Activities

Fip,ure 2 The Stocbastic-tirnedPetri Net Nlode/ versus the DECmode/ Process-activityJ\1/o del

No. Digital Te chnical jo urnal Vol. () Fa(( J'J94 Workflow Models

objects and is not available globally. However, an planning fu nction without traveling down through important difference exists between DECmodel sys­ the various levels of the manufacturing organiza­ tems and object-oriented systems. Tn DEC:moclel tion. In a DECmodel model, as in most businesses, systems, a number of messages may by required to when an activity is completed, a message can be trigger a behavior; whereas, in classical object­ sent directly to any process in the business. oriented systems, each message triggers a method. The DECmodel design feature that allows pro­ The DECmodel tool supports polymorphism, in cesses to receive messages ami then pass them on to that the same message can be sent to differentpro­ subprocesses and activ ities can result in multiple cesses, which can result in different behaviors. message receipts h>r a single send operation. That Developers investigated going beyond standard is, one activity can send a single message that is polymorphism by using one message to trigger dif­ received by every activity in the model that includes ferent activities within the same process. The the message in its receive rule. Modeling experts dis­ approach considered was to use process "filters" to agree about bow well this phenomenon maps to examine the information in a message and then real business processes. The DECmodel user can decide which activity or activities in the process avoid this effect, if desired, by using uniquely named should receive it. This feature was not completely messages in the send rules of activities. developed because of time constraints and a less­ than-clear mapping between the concept and the The Presentation actual practices in most business. Further, using activity send rules that utilize the int<>rmation con­ The first DECmoclel design goal was supported by the tained in messages can provide a similar capability. modeling paradigm of processes, activities, and mes­ The DECmodel tool does not support inheri­ sages. The presentation aspect of the DECmodel tool tance, but the underlying technology of the prod­ supports the goals of a strong separation between uct does support this feature. As in the case of the model and the graphical representation of the nonstandard polymorphism, time-to-market pres­ business process and the need to support user inter­ sures and the lack of clear evidence that the fe ature action ami decisions during moclel simulation. would be used in business processes drove the The presentation of the model is based on views decision not to include inheritance support. Also, that contain networked nodes. Each node in a view the DECmodel product does not currently support can represent zero or more processes in the model; class types beyond the built-in classes of the rro­ however, no process can be represented by more cess and the three activity types. than one node in a single view. This mapping between the processes in the model and the nodes in a view allows the user to develop ancl animate Process Hierarchies multiple views of the model simultaneously. For Tb address the goal of having a strong mapping example, one view may show the model at its low­ between the model ami real business processes, the est level of detail, with each process in the model DEC:model model supports processes within pro­ mapped to a single node. Another view may show cesses. Processes can receive messages in two ways: a higher level of mapping, with multiple processes hierarchical routing ami peer-to-reer routing. mapped to the same node. A third view may map In a business process, a message sent to a high­ processes based on attributes such as geographic level process should travel through the process hier­ location, the organizational chart, or technology. archy to the activity that is to act upon the message. The construction of the views is left to the creativ­ For example, an activity in the sales process should ity of the analyst building the model. be able to send a message to the manufacturing pro­ During model simulation, the DECmoclel tool cess and not be concerned that manufacturing con­ uses animation to show the movement of messages tains several subprocesses. The knowledge of how from one process to another. The user can also to relay a message should be in the receiving pro­ view the messages and their attributes. cess, not the sending process. To accommodate user interaction, the DECmodel In business, however, much communication tool provides a menu send rule in the definition of occurs on a peer-to-peer basis, with information an activity. If an activity uses the menu sencl rule, seldom routed up and down the organization hier­ just before the activ ity fires, a menu appears that archy. For example, the results of a marketing allows the user to make a choice that determ ines research activity go directly to the manufacturing what messages are to be sent by the activity and

'54 Vo l. 6 No. 'l Fa ll I')'Yi Digital Te cbnical jou rnal Th e Design oj DECmodel fo r Windows

which processes are to receive them. The user is builder. The user interface is designed as a set of unaware of the actual send rule; the choice made classes specialized from the Microsoft Foundation fo rces one of a set of send ru les to be selected. The Classes. Interaction between the two layers is use of menus, animation of messages moving achieved with an internal application programming between processes, and user-control led stepping interface (API). through the simulation gives the user the fe eling of This architecture was chosen fo r both technical test-driving the business process. and pragmatic organizational reasons. The parti­ tioning into two layers al lowed the internal engine Architecture and Development Process to be built using state-of-the-art knowledge repre­ sentation tech nology and the user interface to be The overall DECmodel architecture, shown in Figure built using state-of-the-art graphical user interface 3, contains two layers. The inner layer of the architec­ technology. The disadvantages in this separation ture is the internal engine, which provides the means were the necessity of designing an internal API and for representing, storing, and executing (simulating) the need to duplicate some data (nominally stored models. The internal engine is designed using ROCK, in the knowledge base) in the user interface. a frame-based , object-oriented knowledge repre­ The partitioning mapped well to the human sentation system, and AMP, a modeling and simula­ resources available in the DECmodeJ team. The tion frame-class library implemented in ROCK 7 The DECmodeJ engineers had strong skills in developing outer layer of the architecture is the user interface, LISP, knowledge-based, and X Window System appli­ which provides the means fo r all user interaction cations but little experience in developing C++, with the DECmodel model and has two major com­ ROCK, or Microsoft Windows applications. With the ponents: the model builder and the presentation architectural separation, one team was able to focus on the internal engine using C++ and ROCK r------, and, therefore, did not have to learn much about Windows programming. The other team was able to

MODEL PRESENTATION fo cus on the user interface using C++ and Windows BUILDER BUILDER programming tools and did not have to learn any­ thing about ROCK. The engineering team fe lt that GENERIC USER�� "' I' NTERFACE CLASSES I the efficient use of human resources in the develop­ ment process overcame the technical disadvan­ MICROSOFT FOUNDATION CLASSES I tages of the approach. USER INTERFACE DECmodel development proceeded with the two teams. Since the bulk of their development work (A� was completed first, the members of the knowl­ SCRIPT ENGINE edge base team also worked on the user interface team toward the end of the development process. SI MULATION KNOWLEDGE BASE ENGINE Design and Implementation ANALYSIS This section describes the design of the two I AMP DECmodel layers: the internal engine and the user l ROCK interface. INTERNAL ENGINE

DECMODEL APPLICATION InternalEn gine ----1------The internal engine represents models of dynamic r ------business processes in a knowledge base and exe­

DECMODEL cutes these models using discrete event simulation. REPORT MODELING LANGUAGE FILES This layer provides a set of services for interacting I I FILES with the knowledge base. These services are PERSISTENT STORAGE accessed through the DECmodel tool's internal API. l ll ______The internal engine contains the DECmodel knowl­ edge base, simulation engine, and means of persis­ Figure 3 DECm odel Architecture tent storage. Using the DECmodel methodology to

Digital Te cbnicaljournal Vo l. 6 No. 4 Fa l/ 1994 55 Wo rkflow Models

represent and execute business process models, executed when the frame receives the approp­ the internal engine riate message from the application rrogram. Class frames represent object types or categories. • Represents the structure, attributes, and behavior Instance frames rep resent particular members of descriptions of the business processes in a knowl­ a class. ROCK requires frame classes to be organized edge base. (This representation is the model.) in a class hierarchy. Attribute slots and message

• Represents the structure, attributes, and behav­ slots can inherit values and methods from classes at ior descriptions of the animated visualization a higher level in the hierarchy. This mechanism can of the model in a knowledge base. (This repre­ be used to define default values f( >r frame classes. sentation is the presentation.) Both frame classes and frame instances are objects. ancl both can be dynamically created, operated on. • Represents the connections between the model and deleted during run time. With respect to the and the presentation in a knowledge base. C++ language, all frames appear to have the same (This representation is the model-presentation data type. Slots are objects, whose behavior is mapping.) defined independent of the frames.

• Represents the dynamic behavior of the business Portions of the knowledge base are built using processes by al lowing fo r discrete event simula­ AMP, a modeling and simulation frame-class library tion of the knowledge base. implemented in HO CK. A!YJ I' contains objects fo r representing process models that contain entity

Knou•lec(f!,e Base The DECmodel knowledge base flow, for constructing ami running (I iscrete-even t contains the frame-based, object-oriented rep­ simulations, and fo r generating, collecting, and resentation of the model, the presentation, and reducing statistical data. the connections between them. It also main­ The DECmodel frame classes are subclasses of tains the model relations, attributes, and methods. ROCK and AMP classes ami contain relations, The knowledge base contains both classes and attributes, and methods that describe the content instances. The classes specify DECmodel objects; and behavior of DECmodel objects. Some DECmollel sets of instances make up specific models and pre­ frame classes are abstract classes used only as sentations. In addition to containing all the infor­ a basis for more specific subclasses; others are used mation about model and presentation behavior and for instantiation of DECmodel objects. The structure, the knowledge base contains all the DECmodel tool contains three types of h·ame graphical information usecl by the model builder classes: model objects. presentation objects. and ami the presentation builder. This int

56 I vi. 6 ,\o. 4 Fo ii i'J'J·I Digital Te clmical jo urnal The Design of DECmodelfo r Windows

error hand I ing capabilities. The DECmodel tool Anazysis and Reporting Services The knowledge contains its own DML code generator. base contains services that al low the user to ana­ lyze models and presenrations in the knowledge Simulation Engine The simu lation engine exe­ base and to generate reports. cutes a discrete event simulation of the model in The consistency advisor checks models, presenta­ the knowledge base. This simulation can be per­ tions, and mappings for inconsistencies and poten­ fo rmed either interactively or in a batch mode. The tial problems at any point in the model development simulation engine was designed to be so robust process. This check is analogous to the syntax check that a model can be simulated at any stage of its performed by a compiler. The consistency advisor development, regard less of inconsistencies or check is the primary model-building debugging aid undefined elements. fo r users. Inconsistencies in the model do not pre­ The simulation engine interacts with the presen­ vent a model from being simulated. tation builder to control simulation, animation, and The model description report lists the descrip­ graphics. The user controls simulation through tion, messages sent, and messages received fo r each the presentation builder. The presentation builder activity and process. The model table report con­ calls simulation engine API fu nctions to perform tains the basic model information in a table format the requested actions, such as starting, step­ fo r easy access by another application, database, or ping through, pausing, ending, and reinitializing spreadsheet. The simulation summary report con­ the simulation. tains information on simulation performance.

Script Engine and CmnjJiler Scripts provide Design and Implementation Decisions The inter­ a means of specifying user-defined actions to cus­ nal engine fo r the first DECmodel product release, tomize model animation and to perform spe­ DECmodel fo r Windows version 1.0, was imple­ cial presentation actions during simulation. The mented as a Windows dynamic link library (DLL) DECmodel tool contains a language fo r defining using the Windows version of ROCK version 1.0, the

scripts, a script compiler, and a script engine fo r Windows version of AMPve rsion 1.0, and Microsoft executing the scripts. Although the DECmodel team C!C++ version 70. For DECmodel fo r Windows wanted to avoid requ iring any programming in the version 1.1, developers ported the internal engine tool, developers decided that a script language was to Microsoft Visual C++ version 1.0. the only way to implement these fe atures in the Several options existed fo r implementing the available time frame. DECmodel knowledge base. The knowledge base of The script language contains fu nctions fo r the Sy mmod application, the precursor to the

DECmoclel product, was implemented in a LISP envi­ • Annotating, hiding, showing, flashing, moving, ronment. The DECmodel engineering team wanted highlighting, and sealing presentation icons to move to a more standard programming environ­ ment and, therefore, fo cused on C++ and C++-based • Playing sounds and sound loops tools. However, a straight C++ implementation

• Animating connections between nodes would have required the reimplementation of knowledge representation, simulation, and model­ • Showing, hiding, and clearing certain kinds of ing technology available in other tools. windows Another modeling and simulation technology, the Modeling and Simulation System (MSS), had • Starting other applications been developed fo r Digital's Artificial Intelligence • Te mporarily stopping execution Te chnology Center by the Carnegie Group, Inc. (CGf )H This graphical tool was designed at a lower • Loading a new project level than Sym mod. It used a model ing simulation language and was developed to implement the next • Starting and pausing the simulation version of Symmod. However, the MSS modeling

• Displaying files paradigm was not compatible with that of the DECmodel tool. • Displaying a list of DECmodel development team IMKA had also been recently developed by members CGl, fu nded by a consortium of companies, as a

Digital Te e/mica/ ]o urual l,bl. o No. 4 Fa ll I'J'Yi 57 Wo rkflow Models

repl acement for the Knowledge Craft product . and thus yieklecl a higher- qual ity produ ct and JMKA's implementation , ROCK, lacked some of the a fa ster implemen tatio n. class libraries included in Knowl edge Craft fo r sim­ ulation and process model ing but ran s ignificantly Us er In terface faster than Knowledge Craft. The engineering team The user interface provides the means fo r all user decided to use ROCK to i mplement the knowledge interaction with the DECmodel tooL It has two base because of its knowledge represen tation major components: the model builder and the pre­ power and its C++ compatibility Digital contracted sentation builder. with CGl to port the class li braries to ROCK. The The user interface is designed as a set of classes team, therefore, had a head start in designing ancl special ized from the Microsoft Foundation Classes. implementing the imernal engine. The portabi l ity M ost of t hese speci al DECmodel user interface of ROCK was also an advantage; switch ing to th � classes co rrespond to frame cl asses in the know l­ Windows p latform from the DECwimlows platform edge base : the rema inder are necessa ry for i mple­ caused no d isru ption in development. menting the user interface. The three main types of The origi nal intent of the engineering team was user interface classes-windows, graphic obj ec t s, to i mplement the DECmodel tool as a single exe­ and d ialog boxes-are u sed by both the m()(lel cutable file. The know ledge base contains much builder and the presentation bu i lder. global data, however, and restrictions on the number of data segmen ts required developers to i mplem ent the intern al engi ne as a DLL. Th is encap­ lflindow Classes The user interface contains sev­ sulation of the internal engine al lows it to be used eral types of window classes: gra ph ics winduws, in other appl ications and enables easy porting to text windows, and a frame window. gr h c re e iv d from other platforms. The DECmodel team de veloped The ap i s window classes a all d r e tbe c r ic window class. a set of i nternal API functions and structures to generi DECmodel g aph s allow interactions between the DLL-based internal Graphics windows contain graphic obj ects, such as engine and the executable-based user interface. boxes or lines. Users act upon these windows g m us through the i d s The Sym mod application had a model ing throu h menu com an or W n ow language based on liSP for persisten t storage of messages generated by the mouse ancl mouse but­ models and used a rel ati onal database f(>r persistent tons. The graphics windows are the model window, the view windows, ami the paJ ttes. Menu com­ storage of presentat ions. Consideration was given e mands specific to each graphics window are han­ to developing a model ing language specific to the DECmodel tool. Instead, the engineering team dled by message hand lers within the wi ndow cl ass. decided to use the ROCK frame definition lan­ The text window classes are derived from the guage, since it was already com pletely defined and gen eric DECmodel text window class. Text win­ debugged and had an interpreter. 'fhe team used dows are genera l ly read-o n ly and display various this l anguage fo r persistent storage of both models types of textual information, such as descriptions, the files. a d and presentati ons to allow easy sharing of proj ects text of n clock information. As in the between users and to simplify debugging by users case of graphics windmvs, menu commands spe­ m sa and DECmo

Vo l. (i No. .f F( {// I'.I'Ii Digital Te cbuical jo urnal Th e Design of DECmodel fo r Windows

Dialog Box Classes The DECmodeJ tool contains Appearance of the Us er Interjcu;e Figure 4 shows a large number of dialog boxes derived from the a small but typical DECmodel model. The figure dis­ CModaiDialog Microsoft Foundation Class. The tool plays each process and its member activities. Note uses these dialog boxes to define the information that each of the three activity types is denoted by and relationships contained in the DECmoclel a different icon. Lines indicate the potential flow objects. of messages. Figure 5 shows the DECmodel presen­ tation fo r the model that appears in Figure 4. The Menus The DECmodel tool uses a set of menus presentation contains both a view and the support­ individualized to match the capabilities of the ing windows, e.g., the simulation clock and the window currently in use. When a user starts the description windows. DECmodel app.lication, the tool presents a reduced menu that allows the user to start a new project Design and Implementation Decisions The team or to load an existing one. Once a project is in implemented the user interface fo r DECmodel fo r memory, the menu changes as the user switches Windows version 1.0 using Microsoft C/C++ ver­ between the Model Editing Window, the views, and sion 70 and Microsoft Foundation Classes version the other windows. Menu com mands activate mes­ 1.0. For DECmodel fo r Windows version 1.1, devel­ sage hand ler fu nctions within the window classes. opers ported the user interfa ce to Microsoft Visual

-= ...- odc9 r Patien't· a /1/ \\; Ins ranee Carrier I Ill Make Dia n4�is ,·a 1 ' f t Home Treatment IJ hedul� l-abTests Contact �' �pita! ConfirmD Coverage Visit �octor \ D_ II ' Evaluat� Lab Tests 12 Notify Ratient Prepare for Hospital Evaluate aSymptoms Arrange aDischarge

Begin TieatmenttJ IJ Pay Bill --- Hos ital - p -..� Pat ent Billing ...... Ho pital Ad missions --=------' . I 12 I Laboratory Set u 1 Billing I I I IJ ..=1 Schedu e Room I ' Hos�ital Records d IJ Contac Insurer Perform Lab Tests Admit12atient \ III Close illing ( Locate or Createa Records IJ NotifyD Billing Bill Patient

Figure 4 Ty pical DECmodel Model

Digital Te chnical jo urnal Vo l. () 1'vi J. 4 Fa ll 199·-i 59 Wo rkflow Models

Patient is Not Well

Admissions

Message sent by : Hospital Admissions message name : Record Request received by : Hospital Records

Figure 5 T)1Jica/ IJL"C:IIlode/ Presentation (fo r the Model Sbou •n in Figure 4)

C++ version 1.0 and Microsoft foundation Classes During the design and development of the version 1.'5. DECmodel product, the team debated how graphical As stated at the beginning of the paper, the to make the user interface, that is, to what extent dia­ DECmodel product was initially targeted at both log boxes should be used. Although the goal was to VA Xstation workstations running under the make the user interface as graphical as possible, the DECwindows operating system and l'Cs running tight schedule h>rced the team to postpone plans fo r under the Windows NT operating system. Conse­ graphical editors in favor of dialog boxes, which

quently, when developers decided to fo cus solely were faster to i mplement. for example. the team had

on the l'C platform r u nning under the st:mdard initiallv planned to implement an Activity Editing Windows operating system, the user interface Window and had parrial.ly developed it. This window development effort was disrupted . Engineers had was to provide a complete view of an activity and done a significant amount of design work toward allow graphical editing of its info rmation. Schedule achieving a DEC:winclows implementation. constraints required the ream to abandon this plan The DECmodel engineering team considered and to develop a set of dialog boxes that were nor as other class .l ibraries and user interface implementa­ easy to use but were faster to implement. tion packages (such as A.'V T), but most were defi­ The user interface design was not specified or cient in Windows fe atures or in the look and feel. committed to storyboards in any detail at the begin­ Since the Windows operating system was the only ning of the project, partially to save time after platform fo r the foreseeable fu ture, the engineering the disruptions in the development work. This deci­ team fe lt that using Microsoft Foundation Classes sion Jed to more lost time later in the project, was the best choice. However. they made this deci­ though, because user interface featu res were sion after they had performed a significant amount designed q uickly and sometimes incompatibly. ami of development work with one of the tools. Much consequently required reworking. Jn addition, the of rhe work had to be redone. which contributed to resulting user intt>rface was nor as easy to use as it the schedule delav. could have been if better planned.

60 l'r;/ {> .\'u. i hill I'J'N Digital Te chnical jo urnal The Design of DECmodel fo r Windows

External review of the user interface design was goal of requ iring no programming, and some users not performed until late in the project. The review fo und scripts hard to use. However, many users have yielded some ideas that would have resulted in requested that a future DECmodel version provide a more usable product; however, there was not more script functions and extend the script language enough time left in the schedule to implement them. to be more like the BASIC programming language. Also, to enhance the use of the DECmodel tool in Delivery the design of business processes, a future version should support classes to make generic processes A discussion of the released product and the team's available as building blocks of a business process. success in achieving the design and development goals fo llows. Development Successes and Lessons Release The DECmodel engineering team successfully released a software product on the Microsoft Digital released version 1.0 of the DECmodel fo r Windows platform, the one most popular with busi­ Windows product in November 1993 and version ness consultants. This achievement was significant 1.1 in April 1994. Ve rsion 1.0 contained the basic because the group of engineers began the project capabilities fo r building models and presentations with no PC experience. The team did not meet its of business processes; version 1.1 added a set of one-year delivery goal, and the goal slipped to one minor enhancements and bug fixes. Because of its ancl one-half years after the Phase 0 announcement. small, focused market and the large cost savings However, this time frame was still extremely short that can result from its use, the DECmodel tool was fo r developing a complex PC product from scratch. introduced as a low-volume, high-priced product. The product retained the ex isting Symbolic The product includes the software, example mod­ Modeling paradigm (i.e., a process-activity-message els, documentation, and a week of hands-on train­ model and a strong distinction between model and ing. The DECmodel tool is an integral part of Digital presentation) and exhibited performance an order Consulting's reengineering practice. of magnitude better than that of the Symmod prod­ uct, which it replaced. The product utilized the Success of Design Choices most widely accepted modern programming tech­ The separation of the model from the presentation nology base (C/C++ ), which simplified maintain­ is the single most important element of the prod­ ability and reduced the need fo r special training uct's success. This feature, along with animation, of maintainers. distingu ishes the DECmodel tool from its competi­ Splitting the development team into two sub­ tion. Some users have even requested the capability teams worked we ll. It distributed the amount of of building the presentation first and then gener­ learning about new technologies requ ired by the ating the corresponding model. Such capability engineers and minimized the overall development would require considerable investigation. time. Key factors in the success of this approach The paradigm of process-activity encapsulation were the detailed object and internal API specifica­ is difficult for some users to become accustomed tions that were kept up-to-date throughout devel­ to. Many still prefer to build a model using a work­ opment and thus provided a rei iable interface flow approach, which the DECmodel tool can sup­ between the two parts of the project. port, rather than by defining each process and its After the product was released, the DECmodel behavior independently. team identified certain factors that could have The exclusion of resource constraints has limited made the team and the product even more success­ the application of the DECmodel tool to system fu l. The entire engineering team would have bene­ design, thus preventing its use in modeling sys­ fited from Windows training at the onset of the tem performance. AJthough the capability was orig­ project. The Windmvs design of the user interface inally not a product goal, many users would like shou ld have been specified and committed to story­ a fu ture version of the DECmodel product to pro­ board in much greater detail much earl ier in the vide this feature. project. In addition, the team shou ld have arranged To perform special user-defined actions during fo r Windows experts to review the design . These the simulation, a script language was included in changes in the engineering process would have the DECmodel tool. This design fe ature violated the helped the team produce a cleaner, easier-to-use,

1-b/. 61 Digital Te cbuicnl]ourual 6 No. 4 Fa ll I'J')4 Workflow Models

more maintainable user interface and would have Laurel Drummond, Peter Floss, Amal Kassatly, Mike reduced implementation time. The project sched­ Kiskiel, Kip Landingham, and Janet Rothstein. ule should have been created using a bottom-up rather than a top-down process. The initial one-year References schedule was based on an unrealistic, management­ 1. Sy mmod Us er's Guide (Maynard, MA: Digital imposed release date. W11en the engineering team Equipment Corporation, 1990). revised the schedule and calculated a release date based on their detailed estimates, the team met the 2. Knowledge Craft Reference Manual (Pittsburgh, new date. PA : Carnegie Group, 1988).

Simulation, A Problem Summary 3. S. Hoover and R. Perry, Solving Approach (Reading, MA: Addison­ Modeling and simulating business processes is an We sley, 1989). important part of business process reengineering. Digital developed the DECmodel tool specifica!Jy 4. DECmodel fo r Wi ndows: Modeler's Guide (May­ fo r this type of simulation. Although it borrows nard, MA: Digital Equipment Corporation, 1994). many ideas from other discipl ines of modeling and 5. ]. Peterson, Petri Ne t Th eory and Modeling of simulation, as well as from object-oriented design, Systems (Englewood Cl iffs, N.J: Prentice-Hall, the DECmodel product is unique in the way it mod­ 1981). els business processes, separates the model from the presentation, and represents the model as 6. G. Booch, Object Oriented Design (Redwood frames in a knowledge base. City, CA: Benjamin-Cummings, 1991).

7. ROCK Software Functional Sp ecification, Ve r­ Acknowledgments sion 2. 0 (Pittsburgh, PA : Carnegie Group, 1991). The authors would like to acknowledge the fo llow­ ing people who also contributed to the design of 8. Modeling and Simulation System Us er's Guide the DECmodel product: Ty Chaney, David Choi, (Pittsburgh, PA : Carnegie Group, 1991).

62 Vo l. 6 No. 4 Fa ll 1994 Digital Te chnical journal Dennis G. Giokas jo hn C. Rokicki I

Th e Design of ManageWORKS: A Us er Interface Framework

The Managelf!ORKS Wo rkg ro up Administrator fo r Windows software product is Digital's in tegration platform fo r sy stem and network management of heteroge­ neous local area networks. Th e Manage WORKS product enables multiple, heteroge­ neous network op erating sys te m and network interconnect device management fr om a single PC running under the Microsoft Windows op erating sys tem. Th e Nf anageiVORKS software is a user interfa ce fra mework; that is, the services it pro­ vides are primarily targeted at the integ ration of the user interfa ce elements of management applications. It manife sts the orga nizational, navigational, andJu nc­ tio nal elements of sy stem and network management in a coherent whole. Vi ewers, such as the hierarchical outline viewer and the topological relationships viewer that are components of the Manage WORKS softwa·re, provide the organiza tional and navigational elements of the sy stem. Management applications developed by Digital and by third parties through the Manage WORKS Software Developer's Kit provide the fu nctional elements to manage network entities. This paper discusses the user intetfa ce desig n that implements these three elements and the software sys­ tem design that supports the user intetfa ce fra mework.

The ManageWORKS Wo rkgroup Administrator fo r This paper focuses on how the ManageWORKS Windows software product is Digital's strategic software presents and integrates its fu nctionality tool fo r providing system and network manage­ to the end user. Specifical ly, the paper presents ment of heterogeneous local area networks (LANs). details of the user interface paradigm and discusses It serves as Digital's platform fo r the integration the design rationale and the design methods of PC LAN management. From the perspective employed. The paper also discusses the design of of the end user, i.e., the LAN system administrator ManageWORKS software in support of the user and network manager, the ManageWORKS product interface framework. comprises a suite of modules that integrates a diverse set of management activities into one Driving Fo rces behind the Design workspace. From the perspective of tlJe developer The ManageWORKS software was first released of system and network management applications, as a component of the PATHWORKS version 5.0 the ManageWORKS product is an extensible and fo r DOS and Windows product. The fo ci fo r flexible software framework fo r the rapid develop­ that PATHWORKS release set the tone fo r the ment of integrated management modules, all of ManageWORKS design. The PAT H\'VORKSversion 5.0 which presents a consistent user int erface. design objectives were to The design of the management system was user centric, i.e., usability was the top priority. Thus, l. Enhance the usability of the PATHWO RKS prod­ we began the design work without any precon­ uct. Since the PATHWORKS system was rooted in ceived notions about the management software sys­ a command line-based user interface, the goal tem design. The design that emerged and that is was to develop a graphical user interface for the documented in this paper was driven solely by the system that was based on the Microsoft Windows user interface paradigm developed and tested with operating system. Such a user interface would be our customers. contemporary, easier to learn, and easier to use.

Vo l. 6 No. 4 Fa ll 1994 Digital Te clmicaljournal 63 PC LAN and System Management Tools

2. Enhance the manageability of the PAT H\VORKS The team employed software usability testing product. The goal was to reduce the cost of own­ throughout the development life cycle. Two usabil­ ership by improving the installation, configura­ ity tests were performed with early design proto­ tion, and administration of the system. types; the final test was performed with our first pass at a detailed concept design. We performed The ManageWORKS design team used two voice­ the usability testing with customers to test user ot�the-customer techniques to provide more depth interface and fu nctional element design concepts and detail fo r the two high-level product design that we developed as a result of the Contextual objectives. First, the team used Contextual Inquiry Inquiry. The user thus served as a design partici­ to determine a customer profile and to develop pant. With each iteration of the fo rmal testing, we a clearer statement of the user's work.' Then, the tested specific functional concepts in three key team tested user interface prototypes with cus­ areas: (1) mechanisms to navigate among the man­ tomers by means of formal usability testing. From aged entities, (2) mechanisms to organize these 15 to 20 customers and users participated in each entities, and (3) the fllnctional capability inherent of three rounds of usability testing. in the management directives supported. (Note Early in the investigation, Contextual lnqu iry that, in this paper, the servers, services, and revealed that the profile of the PATHWORKS system resources managed by means of the ManageWORKS administrator had changed drastically during the software are collectively referred to as managed five years since the PATH\X'ORKS product was first entities.) The major lessons that we learned from released. A typical system administrator in the era this testing effo rt and then applied to the user inter­ of PATHWORKS version 1.0 had been a VA X/VMS sys­ face and software designs are as fo llows: tem manager who inherited the responsibility of installing and managing a PC file and print-sharing 1. The ManageWORKS software had to provide product. The interface into the system was a VT-class mechanisms to navigate among a diverse set of terminal running command line-based utilities. managed entities on the LAN or in some user­ To day, a system administrator is usually a PC user defined management domain. Users want to be who is quite familiar with graphical user interfaces. able to view and thus "discover" the entities that Such an administrator is more likely to be trainee! in are to be managed. The system had to present the installation, configuration, and management of the managed entities in graphical display formats PCs and PC networking software than his/her pre­ that were familiar ami enticing to users. Users clecessors. This change in the profile encouraged welcome the ability to support different styles us to shift the PATHWORKS focus from using host­ of presentation. Final ly, users need easy mecha­ basecl command line utilities to manage the system nisms to navigate through the hierarchy of to using client-based graphical utilities. an entity. We also profiled the customer network configu­ 2. Navigation mechanisms, as just described, work ration. During the same five years, it changed from well for novice users but become tedious and a very simple and homogeneous environment with constraining for more experienced users, as we just a few PA TI-fW'ORKS servers to a medium-to-large could attest to after our experience with the pro­ heterogeneous PC LA N. At present, configurations totypes. The solution that we presented to users comprise network operating systems that consist allowed them to create custom views of their of Novell NetWare, Microsoft LAN Manager, and managed entities, i.e., to organize their manage­ Apple AppleShare file and print services, as wel l ment domains. This concept was well received as other services that are emerging in the PC LAN by users during usability testing. environment. The network operating systems are deployed on their native platforms and by Digital 3. The ManageWORKS product had to provide on the OpenVMS and DEC OSF/1platf orms. Each sys­ mechanisms that consistently performed the tem has its own tools to manage the clients and functions that were common among a diverse the servers. Each has a diffe rent user i nterface that set of management applications. The product reslllts in a long learning curve and thus high train­ design presents users with an object-oriented ing costs or low productivity fo r system administra­ view of the managed environment. The building tors. Customers reported that they desired tools block of this design is the object, an abstraction with a consistent user interface to manage this of a manageable entity such as a server or a net­ diversity. work router. Each object is a member of a single

64 Vo l. (, No. 4 fa ll 1994 Digital Te cbuical jo urnal The Design of JIIJcmag eWONKS: A Us er Interfa ce Fra mework

object class that describes the set of object The ManageWORKS team adopted the concept of instances within it. The ManageWOHKS appli­ plug-in modules, a software design that is supported cation renders objects to the user as icons in a by the Windows Dynamic Link Library (DLL) archi­ viewer. Fo r example, fo r a LAN that contains tecture." The design is also in common use by many three NerWare servers, the object class called Windows applications incl uding tl1e Windows NerWare Servers would contain three objects, Control Panel, the utility that manages the local each of which represents one of the three imli­ cl esktop's configuration and user preferences. 1 vidual NetWare servers on the LAN. When users The next challenge was to decide how much focus on an object, the tool reveals which constraint to impose on the design of the actions are val id in the object's current context. ManageWORKS' plug-in modules and how consis­ This approach diffe rs from the traditional com­ tent the modules must he. Digital's extensible enter­ mand line approach in which the user first prise management director, the DECmcc product, selects the uti! ity (action) and then specifies incorporated some excellent concepts. ' In particu­ the objects upon which to act. Interestingly, lar, our design was influenced by the way in which whereas novice users fo und this object-focused DECmcc layered the management responsibility concept easy to grasp, those who considered into presentation modules. fu nctional modules, themselves strong users of the traditional com­ and access modules. Early in the design process, we mand line management utilities experienced dif decided to separate the navigation and presenta­ ficulty in grasping the new concept. tion of managed entities from the access and func­ tional management of the entities. 4. The typical customer has a diverse and large Another DECmcc concept, which is used, fo r (200 to 1,000) number of entities to manage. To example, in the access module layer, was the pre­ address this need, the prototype testing pre­ sentation of a consistent view to the layers above. ' sented users with the ability w manage more This concept, however, was not suitabl.e for the than one entity at the same time and the ability ManageWORKS design because it would have placed to manage many entities as one. Users liked constraints on the user interface design, in particu­ being able to view and modify the properties lar, on the presentation of the attri butes of man­ of multiple entities at the same time as well as aged entities. The design team was not willing to being able to modify the same property across compromise on this aspect of the design. a set of like entities. Thus, we decided on a ManageWORKS design that '5. In addition to providing a consistent user inter­ can best be described as a user interface frame­ fa ce, the ManageWORKS product should integrate work. The initial release, which was a component the management tools into one workspace. User of PATHWORKS version '5.0for DOS and Windows, feedback Jed to the design of the user interface offered few services other than to tie together the framework as the delivery vehicle fo r a diverse user interface elements required for system and set of management applications. network management. The user interface services needed were dictated by the five user interface The Key SoftwareDesi gn Principles requirements previously described. At this point in the development cycle, the design The Manage WORKS design incorporates two types fo cus shifted from developing and testing user of plug-in modules: navigation modules, referred to interface and fu nctionality concepts to designing in the ManageWORKS product as Object Navigation the ManageWORKS software itself. With what we Modules (ONMs), and application modules, referred considered to be a good understanding of the user's to as Object Management Modules (OMMs). The needs, we proceeded to design a software architec­ Manage\VORKS framework controls the control ture to support those requirements. flow and messaging between the modules. Prior architectures that were fa miliar to the ONMs allow fo r any number of navigation models design team served as starting points fo r the design. to be supported and used singly or simultaneously The fo llowing two examples represent sources of by the user. Although, by design, ONMs possess no design concepts that we employed and adapted to knowledge of the managed entities or entity rela­ suit our objectives. Each represents an opposing tionships they display, they do possess the ability end of the spectrum with respect to design objec­ to display entities with the relationships inherent tives and implementation. in them. ONI'

Digital Te cbuical jou rnal �'Ill. 6 No. · Fall I'J94 6'5 PC LAN and System Management Tools

browsing and navigating through the management can develop new ONMs and OMMs and simply enroll hierarchy. In addition to navigation capabilities, them into the system. Users have the additional ONMs provide the user interface fo r organizing enti­ benefit of being able to customize the product to ties into a user-defined management domain. support only the ONMs and OMMs that are useful in The OMMs are responsible for managing the enti­ their environment. ties. The OMM design has three key components.

l. OMMs provide the methods usecl to manage the The Us er Interfa ce of ONMs and OMMs entities. These methods include the fu nctions of Given the key software design elements presented discover, create, view, modify, and delete. The in the previous section, the focus of the paper now OMMs also have the option of presenting to the returns to the user interface. This section describes user additional methods. That is, since each OMM what was implemented to support the customer knows bow to manage the entities for which it is requirements and the design framework. responsible, it knows which actions can be The user interface framework manifests the orga­ applied to an entity based on the entity's current nizational, navigational, and fu nctional elements of state and the user's context. system and network management in a coherent 2. orv!Ms provide access to the managed entities. whole. For example, the first three menus on the An OMM can use any interprocess communica­ ManageWORKS menu bar-Viewer, Edit Viewer, and tion mechanism to access or to manage an entity Actions-are all the tools the user needs to manage Examples include the task-to-task, remote pro­ entities. A discussion of the Viewer and Edit Viewer cedure call, and object request broker mecha­ menus fo llows. nisms. Since a PC LAN environment affords no By means of the ManageWORKS Viewer menu, common way for a management director to com­ ONMs present display elements, called viewers, to municate with all the types of devices present, the user. Each instance of a window that an ONM the design team decided to leave the choice of creates is considered a viewer. A ManageWORKS access mechanism up to the OMM. viewer is one of the organizational elements fo r the user and is the root-level object fo r navigation. Each 3. OMMs provide the user interfaces required fo r viewer is a viewport into a set of managed entities managing the entities. This design component that the user may be browsing and navigating allows developers to present an interface that through. A viewer is analogous to a word proces­ best suits the needs of the user and best maps sor's document, i.e., a viewer is a ManageWORKS to the entity being managed. It also allows fo r "document." Just as you can create new documents flexibility, evolution, and innovation in the user and open, close, or edit existing documents when interface of OM.Ms. The ManageWORKS design you use a word processing application, you can per­ team did not want to impose a user interface form the same fu nctions on viewers when using the style or present a user interface that was com­ ManageWORKS software. promised by the diversity of applications that we ManageWORKS ONMs are responsible for the nav­ envisioned running within the context of the igational and organizational display properties. The framework, e.g., by being the least common current ManageWORKS release comes with two denominator. Even though one of the key prod­ ONMs. One ON;YI supports a hierarchical display of uct design goals was a consistent user interface, managed entities. This display is rendered in a sin­ we fe lt that it was important to allow the OMMs gle viewer window p,.raphically as a tree or textually to control the user interfaces. First, we thought as an outline. The other available ONM supports the the design benefits outweighed the risk of any relational display of managed entities, rendered as inconsistency. Second, we encouraged, but did a map. The map ONM can also support a hierarchy; not enforce, consistency by means of a user each map is rendered in a new viewer instance. interface style guide and common libraries that Figure 1 shows ManageWORKS with two hierarchi­ implemented those guidelines.".r' cal viewer styles and a map viewer. The hierarchical The plug-in modules also have a residual benefit. views are the Outline view (shown in the Browser Because these modules can easily be added to or viewer) and the Outl ine Tree view (shown in the IP removed from the environment, they provide Hierarchical View viewer). In aclc!ition to the map an easy way to extend and to customize the viewer (shown in the II' Discovery viewer), note the ManageWORKS product. Digital and third parties navigation window fo r the map viewer (shown in

66 Vo l. (i Nu. 4 Fa l/ 1994 Digital Te chnical journal Tbe Design of Manage WO RKS: A Us er Interfa ce Framework

LARRYAUG '!li!MO SAIC 16 124 1 44 1 3 -!li! MWORKS

12�4.144.16 7 lG --·A 1l(j CD RIVE 1liOCLIENTS 16 124� 1 44 1611 - �DOWNLOAD 16 124 144 0 II!!J EDRIVE

I( 1 612�4144 250 I( II> c::> til ' I( ' ·-· · - c:> I CI C>

Figure 1 ManageWORKS Vi ewers

the IP Discovery (Navigator) viewer). This view user interfaces and in bow the user interfaces inter­ shows a scaled map; the entire contents of the map relate to the ONMs. viewer appears in a rectangular outline, which rep­ The consistency starts with the ManageWORKS resents the user's current viewport into the data. Actions menu. The basic management directives on The user can use the PC pointing device to drag and managed entities originate from this menu. The reposition the viewport. major challenge in designing this menu was to avoid Because the ONM maintains context when the using too many menu items, menu items that would user "edits," i.e., modifies, the contents of a viewer, change constantly (i.e., by addition or deletion), the user can customize or organize the managed menu items that bad three or fo ur levels of hierar­ entities as desired . By means of the Edit Viewer, chy, and menu items that were not context sensitive ONMs allow user customization within a viewer to what the user was doing. The objective was to with the support of user-definable hierarchies. For find a small set of words that conveyed the manage­ example, each instance of a viewer can represent ment fu nctions the user would most often perform. a different management domain fo r the user. The We fe lt that these words should always be present benefit is that the user can find objects and then in the Actions menu, but to eliminate confusion fo r arrange them into hierarchies that are most useful. the user, they should be rendered inactive when As stated earlier, OMMs control the user inter­ not valid. On the other hand, we realized that this faces fo r displaying and modifying managed entity small set of menu choices could never fu lly support properties. The ManageWORKS framework pro­ the actions on managed entities; therefore, the soft­ vides fo r consistency in how the OMMs invoke the ware had to provide an extensibility mechanism.

Digital Te cht�icaljournal Vo l. 6 Nu. 4 Fa ll 1994 67 PC LAl'l and System Management Tools

We began the design process by developing an select multiple objects of the same class from entity/action matrix. One axis contained a list of a viewer ami to invoke an OMM method. The list of the entities that we envisioned being managed selected objects is contained within this drop­ from the ManageWORKS software. The other axis down list box. The user can easily view the contained a list of the actions that could be per­ attributes of diffe rent objects from the same dialog fo rmed on the entities. We marked the intersec­ box. The dialog box displays various sets of man­ tions of the axes. In forming the list of actions, we aged entity properties. The user can select the chose words that were used in existing products desired set of properties from the View or Modify that managed the same entities, words that we drop-down I ist boxes. thought should be considered in a good user inter­ Figure 2 demonstrates that two dialog boxes can face, and finally, synonyms to those words already be active at the same time. This featuresupports the listed. This approach gave us a clear picture of the ManageWORKS design requirement that the user be common actions and also provided a thesaurus of able to manage more than one entity at a time. The words from which to choose. The common actions ManageWORKS product supports, in effect, threads on managed entities that emerged from this exer­ of execution to allow multiple OMivls to be active cise were simultaneously. Support for the design principle of managing many entities as easily as one is not I. Make a new entity of some type. a function of the ManageWORKS software but of 2. Display all the managed entities. the OMMs, since OM.Ms control the methods used to manage entities. 3. View and modifythe entity's properties.

4. Eliminate the entity. Th e Software Sy stem Design of ManageWORKS The ManageWORKS software supports these The focus of the paper now shifts to the common actions through the following Action ManageWOHKS internals that support the design menu choices: principles and user interface just described. 1. Create. Choose Create to make a new entity. Th e Application Framework 2. Expand. Choose Expand to view all the entities As an application, the ManageWORKS product is that the lVIanageWORKS software is managing. merely a software framework fo r integrating its top­ 3. Properties. Choose Properties to display a dialog level user interface with the user interfaces of the box that manifests all the entity's properties. The OMMs and ONMs. The ManageWORKS application user can then view the properties and make consists of two main components: (1) the user inter­ modifications, as appropriate. face shell and (2) the dispatcher. Figure 3 depicts the relationship between these ManageWORKS com­ 4. Delete. Choose Delete to eliminate the entity. ponents and the OMMs and ONMs. The design of the Properties dialog box is one The user interface shell is a standard Microsoft of the key user interface style elements of the Windows application that supports the top-level ManageWORKS product: however, ManageWORKS Windows user interface componen ts-the main does not enforce or provide for this element. application window and its menu bar, tool ribbon, Rather, the consistency is a function of a user inter­ and status bar. The user interface shell translates all face style guide fo r OMMs and some common user interaction by means of the menus, tool rib­ library routines that support this user interface bon, and mouse actions into OMM ami ONM appli­ style.oC• Figure 2 shows the dialog boxes of two cation programming interfaces (AP!s) to perfo rm of the three OMMs that come with the current work fo r the end user. The shell is also responsible ManageWOHKS product: the Simple Network fo r initializing and terminating the application, Management Protocol (SNMP) Manager OMM and including the dispatcher. the LAN Manager (LM) server management OMM. The dispatcher is responsible fo r maintammg (The third OMM, for NetWa re servers, is not shown.) a link between the user interface shell and all Note the Selected Objects field in the SNMP <.Jialog the OMMs, as well as fo r providing service routines. box. The ManageWORKS software allows the user to The dispatcher loads and initializes all OMMs

68 ViJI. 6 No. 4 Fa ll 1')5J4 Digital Te cbuicaljourual TIJ e Desig n of Ma nage WORKS: A u,·er lntetja ce Fra mework

111! j ManageWORKS Workgroup Adnlinls1rator -1 ons '!'l'indow .tiel

Selected Objecls: [1 6.121.111.251 l!JI OK Cancel Properties: IG ene ral lnforma•ion l!ll Apply Typ e: llltlSNt.AP Aouler.IP l!JI System Cont8ct I s�contca.ct not specilied

System Description; I OECNIS 600 software version V2.3-4

Group·

modifice.tion IP Auto-discovery Allow by co :t:es r ]'io Redirector SeiV

Poll Interval: (sees) 60 ·W59Zicl Time out (sees) 1 0 107'i2fiJtJ Monitor this node via polling: Connections Protocol used to monitor this host 10 SNt.AP ,� ICt.AP Mode: Foiled: SNt.APCommunity Nome•

Set: ,------·------Help Buffers

Big: Get: t.AIB... Request:

� ------��--- - �------��

Figure 2 !'vl anageWORKS 0;�1111 Properties Dialog Boxes

present based on an initialization file that the end user configures at installation time (or, if sub­ sequent modules are added , by means of the OBJECT NAVIGATION Management Module Setup program). To enable MODULE this routing to occur, the dispatcher maintains a Jist

,------1 of all OMMs loaded and the object classes that they 1 USER INTERFACE SHELL 1 I I support. I I I I One service that the dispatcher provides fo r I DISPATCHER� I I I OMMs and ONMs is the ability to modify the menu I I I I bar. OMMs and ONMs may add and set menu items I but only through rhe APJs. The ManageWORKS soft­ ------1------i-----J ware ultimately controls what gets displayed in the OBJECT OBJECT OBJECT MANAGEMENT MANAGEMENT MANAGEMENT mem1s based on what objects are selected in a MODULE MODULE MODULE viewer, which prevents the modules from directly manipulating the menu bar.

MANAGEDt MANAGED t MANAGEDt ENTITY ENTITY ENTITY TheAp plication Programming Interfaces Once we had defined the concepts of the I I I ManageWORKS user interface ancl object classes, we DATABASE designed a common set of AP!s that all OMM and ONM developers would employ. The AP!s that emerged 1- fo cused primarily on the object-both its class Figure 3 ManageWORKS Application and its instance. Because the current set of object­ ArciJitectu·re oriented languages and tools does not map wel l to

Digital Te cbnical jo urnal Vo l. 6 No. 4 htll 1')94 PC LA.t'J and System Management Tools

the services supplied by the Windows system, these Each class- or object-based API requires an OlD AP!s are in a more conventional C/Pascal program­ or list of OJDs on which to perform the opera­ ming language style rather than in a C++ style. tion. When called, each class API acts on a single The Al'ls that an OMM mLISt support fall into three object class. The caller manages all memory needed categories based upon their scope of operation: for the successful completion of an API, i.e., no (1) module based , (2) class based , and (3) object API returns a pointer to data. APis that can return based. AllAPls have parameters that contain infor­ a variable amount of information use a two-step mation pertinent to the API ca ll, including the cal ling convention. The first call determines the object identifier (OlD), which identifies the object buffer size required to hold all the data; the second on which to perform the operation. call retrieves that data. This two-call approach Module-based APls perform initialization, termi­ requires OMMs to efficiently gather informa­ nation, and information reponing fo r the entire tion using OMM-specific information caches to OMM. The initialization includes determin ing how store information retrieved from the managed many object classes an OMM supports. This fu nc­ entity. tion is important because an OMM can support ONMs contain all the module-, class-, and object­ more than one class, e.g., a hierarchy of classes. By based APis that exist in a standard OMM but also checking fo r software dependencies on the operat­ contain some viewer-specific APls. These APls ing system or support libraries, the OMM can also include functions to display viewers, select dis­ make sure that the computer environment is capa­ played objects, expand objects, update objects, and ble of supporting the OMM. For example, Digital's retrieve displayed objects. New ONMs can be devel­ implementation of the OMM that manages Net Ware oped using these APTs. servers requires that the NetWare client be installed and configured on the PC. Module termination Th e Object Identifier occurs before the ManageWORKS software termi­ To represent objects within the ManageWORKS soft­ nates, which allows OMMs to clean up any ware, we chose the approach of assigning an OlD to resources they may have used. The information each object in the system. This number embodies function provides information such as the module's the information of the class to which the object name and copyright information. belongs as well as the uniqueness of the individual Class-based APis support the actions that apply to instance of an object within the class. all objects within a class. These fu nctions include The assignment of an OlD to an object is the initialization, termination, configuration, and responsibility of the OMM. The ManageWORKS soft­ reporting information about what actions and ware dynamically assigns to an object class an OlD properties can be accessed by the end user in the that represents the class, and the OMM is responsi­ ManageWORKS user interface. A class-based config­ ble for creating the unique instance values within uration API presents a configuration window fo r the context of that class. This approach allows each class to the user; the user can then change the OMMs the flexibility of using any strategy to assign behavior of the object class. For example, the user these values, e.g. , sequential assignment or map­ can indicate whether or not files on a disk with hid­ ping to a particular technology, such as an external den or system attributes or hidden LAN Manager file database record. services should be displayed. Each OlD is a 32-bit number; the high 12 bits con­ Object-based APis provide the ability to manipu­ tain information that identifies the class to which late individual objects within the ManageWORKS the object belongs. This bit arrangement places a software. With these APis, OMMs can accomplish all limit, 212-1, i.e., 4095 (a value of 0 is invalid), on the base actions and those operations provided for the number of classes that can be active with in the user interface. These APis include fu nctions ManageWORKS at any one time. The low 20 bits pro­ to create, delete, insert, remove, copy, get and set vide the uniqueness fo r each object instance within properties, display a properties dialog box, main­ the class, providing fo r up to 220- 1, i.e., more than tain containership relationships (e.g., technology­ 1 million, individual instances within a single class. based hierarchies), and maintain classes that can be The advantages to using an OlD lie in allowing created and inserted into an object. Approximately objects to store information in any fo rmat they 30 APis (a small manageable set) must be imple­ wish and using access functions to get at that info r­ mented to be ManageWORKS compliant. mation in a consistent manner.

70 Vo l. 6 No 4 Fa ll 1994 Digital Te cbuical jou rnal The Design of Manage WO RKS: A Us er Inteiface Fra mework

Storing Jnfonn ation about Objects character string would resu lt in Jess conflict among Although the OMMs are responsible fo r assigning OMM developers than assigning hard-coded OlD val­ OlDs to objects within a class and fo r storing infor­ ues to each customer that wanted to develop an mation about each object that can be managed, we OMM. By not hard-coding the values, we ensured clid not want every OMM under development to that each newly created object class would receive have to create its own mechanism to accomplish the next OlD value. Thus, diffe rent end users who these tasks. We decided to create an object database are using diffe rent sets of OMMs may have diffe rent that would store info rmation about objects and gen­ OlD values assigned to each of the object classes. erate new OIDs fo r the OM Ms. OMMs can use this object database to create Jnitial designs of this object database were to object classes or objects within those classes, ancl support multiple users and thus al low the sharing to store any amount of info rmation with each of information between multiple ManageWORKS object. Most objects store enough information users and other applications. Because the schedule to get to another data source, thereby prevent­ fo r the first release of the ManageWO RKS software ing information in the database from becoming did not give us ample time to employ a commer­ inconsistent with the managed entity. For example, cially ava ilable database, we decided to create our a NetWare Server OMM saves only the server own database to support the management of name in the database because with that name the object classes and object instances. This database OMM can make NetWare API calls to retrieve other supports only a single user and consists of indexed information. files fo r (l) object information, (2) class infor­ When the object database creates an object, it mation, ancl (3) containership information. The assigns the object an OlD within the space of that existence of these files is hidden under a database object class. Thus, OMMs can rely on the database fo r API, which supports all the management aspects creating unique O!Ds fo r each object in the system. of objects, from creating and deleting classes Another feature of the object database is the anc.l objects to reading and modifying attributes of concept of transient and permanent objects. The those objects. object database DLL writes transient objects not To aUow future changes in the underlying tech­ into the database files but rather to global system nology of the database, we placed the database memory in the Windows operating system. Having code into a DLL. For the second release, we created the objects in memory creates a large performance a new database DLI., with the same APis, that works gain and avoids the problems associated with disk

with Borland's dBase rv database implementation. thrashing. To indicate the type of object that is By simply replacing the database DLL, all OMMs can created, the object database reserves bit 19 of now take advantage of having information shared the OlD to use as a flag. If the bit is set by the OMM between ManageWORKS users across the network. or ONM, the object is transient. When an object is

This design allows fo r comanagement of the LAN by created in the database, the OlD fo r the class is multiple network administrators who have the passed to the database DLL with or without bit 19 same information available. The OMMs do not have set, thus determining whether the object is tran­ to make any source code changes to work with this sient or permanent. new database DLL, but additional A Pis are present to In our initial development work, we quickly dis­ allow for the use of advanced database fe atures. covered that creating all the OlD entries in a Before an OMM can create objects in the data­ database file diminished performance. This prob­ base, the object class itself must be created in lem was most evident in the development of the the database. Because it dynamically assigns OIDs, DOS file system OMM. This OMM enumerates direc­ the object database must store unique information tories, which causes a disk seek operation and about the class along with the Ol D. Each OMM must a disk read operation fo r the enumeration. Next a register an object class, where each class has a name write of the object to the database file on the same that can be presented to the user in the user inter­ disk causes another disk seek/write operation. This face, and a class tag. The class tag is a 64-byte char­ resu lted in tremendous disk thrashing. We envi­ acter string that must be unique among all OMMs. sioned that many OMMs would enumerate and cre­ The database dynamically assigns an OlD to a newly ate a list of contained objects each time an object is created class and maintains that mapping to the expanded, so we wanted this operation to be fast class tag. We decided that using a unique 64-byte and efficient.

Digital Te chnical ]o unwf Fall 1994 Vo l. 6 No. 4 71 PC IAN and System Management Tools

In troducing Ne w OMMs and ONMs into the set of selected objects in a viewer, and the valid tbe Ma nageWORKS Software managed entity actions in the Action menu. The In traditional software development, the addition application uses the dispatcher to call a particular of new fu nctionality into an application generally API to the correct OMM for the class of object being requires source code modification and recom­ operated upon. In this section, we walk through pilation. Clearly, this approach would not allow three typical user interaction scenarios. For each J\l!anageWORKS developers to meet the goal of scenario, we describe key elements of control flow providing an extensible application framework. between the user interface shell, the dispatcher, Developers needed a way to write software that the ONM involved, and the OMM involved. These could become part of the ManageWORKS applica­ scenarios illustrate how the ManageWORKS elements tion without requiring changes to the application. fit and work together to achieve our primary objec­ Since the ManageWORKS software runs in the tive, i.e., to design a user interface framework with Microsoft Windows operating system environment, consistent mechanisms to display, organize, and software developers were able to take advantage of navigate through management entities for the pur­ many features of the Windows system. We used pose of managing one or more of those entities. DLLs to provide an extensible framework fo r the ManageWORKS product. Scenario I This scenario outlines the process of By creating a DLL that conforms to the set of displaying the properties dialog box of the selected APis needed to manage an object or to implement object(s) in a viewer. a viewer, we can add new DLL'i at any time to add 1. .functionality to the ManageWORKS software. There­ The user has selected one or more objects of the fore, all OMMs and ONMs must be implemented as same class in a viewer by clicking with the mouse. DLLs. The registration process needed to be simple 2. The user then chooses the Properties menu item and dynamic for these DLLs. Using a Windows appli­ from the Actions menu. As a reminder, this action cation initialization (IN!) file, the dispatcher reads invokes the properties dialog box, which by style the Jist of entries in the file and attempts to load and guide convention, supports the viewing and initialize all OMMs and ONl\,Is defined. End users can modification of a managed entity's properties. add new OMMs by running the ManageWORKS Management Module Setup program, which simpli­ 3 The ManageWORKS software queries the selected viewer fo r the list of selected objects and obtains fies the installation of any ON!Ms provided by either the OIDs of the objects from the viewer. Digital or a third-party vendor. When an OMM is introduced, the ManageWORKS 4. The ManageWORKS dispatcher decodes the software needs to assign an OlD to each object object class portion of the oro. class that the OMM handles. This is accomplished 5. The ManageWORKS software tells the OMM of by asking the dispatcher for an OlD fo r the class that object class to display the properties dialog based upon a supplied class tag. The dispatcher box fo r the list of objects (OIDs) supplied. then uses the object database to have the OID assigned. The dispatcher's use of the object data­ 6. The OMM displays a properties dialog box that base ensures that the OlD for the class is unique contains all the supplied objects The OMM has to that class. OMMs can ask the object database complete control of the user interface fo r this directly, but this is merely a side effect of the window and complete control over the access to dispatcher's use of the object database and is not the managed entity mechanism to get and set the recommended. properties from the managed entities.

Interactions between Ma nage WORKS Scenario 2 This scenario outlines the process of Components expanding a selected set of objects in a hierarchical Most ManageWORKS events occur when the user viewer. Expanding an object results in the display interacts with the user interface, although OMMs of the object's descendants within the hierarchy and ONMs can generate events that cause commu­ defined by the OMM. The user may render this dis­ nication to occur between the components of play in a hierarchical fashion with one of the hierar­ the system. The usual flow of control through chical view styles or as a descendant portion of the ManageWORKS software begins with a viewer, a topological view.

72 Vo l. 6 No. 4 ht/1 1994 Digital Te e/m ica/ jo urnal The Design of Ma nage WORKS: A Us er Interfa ce Fra mework

1. The user has selected one or more objects in 4. If it is over a viewer, the mouse tells the target a viewer by cl icking with the mouse. The objects viewer what objects the user is dragging over it. may be of the same class or of different classes. The source ONM sends a ManageWORKS-defined Windows message to the target viewer window 2. The user then chooses the Expand menu item with the list of O!Ds being dragged. from the Actions menu. 5. The target viewer determines what object the 3. The ManageWORKS software queries the selected mouse is over and if that object is selected. The viewer fo r the I ist of selected objects and obtains set of objects targeted to receive the dropped the oms of the objects from the viewer. object comprises either the individual object, or if selected, all the selected objects in the viewer. 4. The ManageWORKS software tells the selected viewer to expand the list of objects supplied (the 6. The target viewer queries the ONIM of each target selected objects from the last call). object about what class of object can be dropped on it. If all the target objects can accept the 5 For each selected object to be expanded, the dragged objects, the cursor changes shape to viewer queries the object by means of the dis­ reflect a potentially successful drop. Otherwise, patcher fo r the list of contained objects within the cursor changes to reflect that the drop that object. The dispatcher calls the OMM that would not succeed at this mouse location. supports the object to get the Jist of contained objects. The viewer repeats this process for all 7. When the user drops the objects, the same verifi­ OIDs to be expanded. cation occurs as during the drag operation. If the drop is not going to be successful, the viewer 6. For a hierarchical view, the viewer places the list that initiated the drag operation returns the of objects into the viewer in a hierarchical fash­ mouse cursor to the original location. ion. For a topological map view, the viewer 8. If the drop operation passes the verification either creates a new window or replaces the cur­ step, each object that the user is dragging is rent window, depending on the choice the user copied by the OMM to each target object. This has indicated through the customization dialog is done iteratively fo r each dragged object, and box. The window shows the descendant set of each copy has the potential fo r fa ilure. For exam­ objects with their topological relationships. ple, a DOS file can be dragged to a DOS disk class 7. For each of the contained objects, the viewer object, but when the copy is attempted, the disk queries the object's OMM by means of the dis­ may not have enough free space to successfu lly patcher fo r its name and bitmap, and to deter­ copy the file. \Vhen each dragged object is mine whether it can potentially be expanded by copied, the OMM of the target object is told that it the user. The viewer repeats this process fo r should now contain the new object. This causes each contained object to be displayed and then the hierarchy to be properly updated. A drag­ renders each item. and-drop operation that is intended to move an object is implemented as a copy fo llowed by a removal of the original. Scenario 3 This scenario outlines the process of dragging and dropping an object onto another object in a viewer. The OMM of the target object Conclusions controls the semantics of this operation. We feel that we have been successful at building a unique user interface framework that integrates a The viewer controls drag-and-drop operations. I. diverse set of applications; the design essentially meets all but one of the objectives we established. 2. The viewer determines the O!Ds of the object(s) Because by design we limited the scope of services that the user is dragging. provided by the framework, we could not meet all 3. As the user moves the mouse, the viewer ofour end-user objectives. Specifically, the respon­ receives mouse move messages from the sibility of allowing the user to manage many enti­ Windows system and determines if the mouse is ties as though they were one fell on the OMMs and over a viewer. The window messages are sent not on the framework itself. Although we would directly to the viewer window. have liked the framework to provide this service,

Digital Te cbuical]our11al Vu /. 6 No. 4 Fa ll 1994 73 PC IAN and System Management Tools

such a design was not fe asible, given that the OMM organization to which they belong, namely, controlled both the access to the managed entity Business Management, Marketing, Human Factors and the user interface to view and modify entity Engineering, Systems Quality Engineering, Doctl­ properties. mentation, Release Engineering, Field Te st Admin­ The reader should observe that the first two istration, and, of course, Software Development major releases of the ManageWORKS software pro­ Engineering. vide few core services. The core services include the user interface shell, the viewers, and the object References database that ship with the ManageWORKS product 1. K. Holtzblat and S. Jones, "Contextual Inquiry: and the ManageWORKS Software Developer's Kit. A Participatory Te ch nique fo r System Design" in These components serve as a unifying framework Pa rticipatOJ]' Design: Principles and Practice, fo r the fu nctional modules, which provide the user A. Namioka and D. Schuler, eds. (Hillsdale, N.J : with tools to manage entities and are thus the "heart Lawrence Erlbaum Associates, Inc., 1993). and soul" of the environment. Future development 2. Microsoft Windows Guide to Programming of core framework services is under consideration. (Redmond, Microsoft Press, 1990). Among the areas under active consideration are WA : Windows Object Linking and Embedding (OLE) sup­ 3. Windows 3.1 Software Developer's Kit, Control port and scripting support fo r inter- and intra-OMM Panel Applets in Online Help (Redmond, \X-1\. : control. Such services would make ONMs and Microsoft Press, 1992). Ol'vlMs more consistent, useful, and powerful for the 4. C. Strutt and D. Shurtleff, "Architecture fo r an end user. At the same time, these services would Integrated, Extensible Enterprise Management free the individual developer from writing this Director" in Integrated Ne twork JI!J anagement, code and thus provide the developer the freedom vol. I, B. Meandzja and J. We stcott, eels. (Amster­ to fo cus on the value-adclecl functionality. dam: North-Holland, Elsevier, 1989): 61-72. Acknowledgments 5. Mc mageWORKS Programming Guide (Maynard, MA: Digital Equipment Corporation, Order No. Many people contributed a great deal to the design AA-QADFB-TE, 1994) and implementation of the ManageWORKS product. Although the contributors are too numerous to men­ 6. ManageWORKS Programmer's Reference (May­ tion individually, we would like to acknowledge nard, MA: Digital Equipment Corporation, Order the fu nctional groups within the PATHWORKS No. AA-QADGB-TE, 1994).

6 74 Vo l. No. 4 Fa ll 1994 Digital Te chnical jo urnal ja mes E.johnson I

The Structure of the Op enVMS Ma nagement Station

Th e Op en VMS Ma nagement Sta tion software provides a robust client-server application between a PC running the iv/icrosoft Windows op erating sy stem and several Op en VMS cluster systems. The initial ve-rsion of the Op enVi\115 Management Station software concentrated on allowing customers to handle the sy stem man­ agementfu nctionality associated with user account management. To achieve these

attributes, the Open V/VIS Ma nagement Sta tion software uses the data-sharing aspects of Op en Vi'YIS cluster sys tems, a communications design that is secure and that scales well with additional target sys tems, and a management display that is gearedfo r the simultaneous management of multiple similar systems.

Overview Data from customer surveys ind icated the need The OpenVMS Management Station version 1.0 soft­ fo r a quick response to the problems of managing ware provides a robust, scalable, and secure client­ OpenVMS systems. For this reason, the project team server application between a personal computer chose a phased delivery approach, in which a series (PC) running the Microsoft Windows operating of frequent releases would be shipped, with sup­ system and several OpenVMS systems. This man­ port fo r a small number of system management agement tool was developed to solve some very tasks provided in an individual release. specific problems concerning the management of The initial version of the OpenVMS Management multiple systems. At the same time, the project Station software concentrated on providing the engineers strove fo r a release cycle that co uld bring system management fu nctionality associated with timely relief to customers in installments. user account management. Th is goal was achieved Before the advent of this new software, all by using a project infrastructure that supported OpenVMS base system management tools have either frequent product releases. This paper describes executed against one system, such as AUTHORIZE, the OpenVMS Management Station software, con­ or against a set of systems in sequence, such as centrating on the cl ient-server structure. It also SYSMAN. Furthermore, the existing tools that do presents the issues and trade-offs that needed to be provide some primitive support for the manage­ faced for successful delivery. ment of multiple systems either do not take advan­ tage of or do not take into account the inherent Managing Op enVMS Us er Accounts structure of a VMScluster system. Managing user accounts on an Open VMS operating In contrast, the OpenVMS Management Station system is a relatively complicated task.1 The man­ product was designed from the outset fo r efficient ner in which the user is represented to the system execution in a distributed, multiple system configu­ manager is the cause of much complexity. The ration. The OpenVMS Management Station tool attributes that define a user are not located in one supports parallel execution of system manage­ place, nor is much emphasis placed on ensuring ment requests against several target OpenVMS consistency between the various attributes. systems or VMScluster systems. Furthermore, the For example, Table 1 gives the attributes of an software incorporates several fe atures that make OpenVMS user stored in various files, including the such multiple target requests natural and easy fo r user authorization file (SYSUAF.DAT), the rightslist the system manager. file (RIGHTSLIST.DAT), and the DECnet network

Digital Te chnical ]o urnaf Vo f. 6 No. 4 Fa ll f994 75 PC LAN and System Management Tools

Ta ble 1 Breakdown of Data Stores and Management Utilities for OpenVMS Users

Data Store Attributes Management Utility

SYSUAF. DAT Username, AUTHORIZE Authorization data (e.g., passwords), process quotas, login device, and directory

RIGHTSLIST. DAT Rights identifiers AUTHORIZE

NET$PROXY.DAT Remote<->local user AUTHORIZE DECnet proxy mappings

VMS$MAIL_PROFILE.DAT User's mail profile MAIL

QUOTA.SYS (per disk) User's disk quota DISKQUOTA

Login directory User's home directory CREAT E/D IRECTORY

TNT$UADB.DAT User's location, phone number,

proxy file (NET$PROA.'Y.DAT). Prior to the OpenVMS deployed to users who need dedicated processing Management Station product, these files were man­ power or graphics support, and personal computers aged by a collection of low-level utilities, such as are deployed in other departments fo r data access AUTHOIUZE. Although these utilities provide the and storage. FinaUy, the table lists some groups of ability to manipulate the individual user attributes, users who need access to multiple systems, some­ they offer little support fo r ensuring that the overal.l times with changed attributes. The system manager collection of user attributes is consistent. For fo r th is type of configuration would repeated ly per­ instance, none of these utilities would detect that fo rm many tasks across several targets, such as sys­ a user's account had been created with the user's tems or users, with small variations from target to home directory located on a disk to which the user target. The OpenVMS Management Station product had no access. was designed to operate well in configurations All of these factors create additional complexity such as this. fo r an OpenVlVIS system manager. This complexity is A distributed system is clearly necessary to sup­ compounded when a number of loosely related port effective and efficient systems management for Open VMS systems must be managed at various sites. configurations such as the one shown in Figure 1. The user account management features of the A distributee! system should support parallel execu­ Open VMS Management Station product are designed tion of requests, leverage the clllsterwide attributes to alleviate or remove these additional complexi­ of some system management operations, and pro­ ties for the Open VMS system manager. vide fo r wicle area support. These characteristics are expanded in the remainder of this section. Op enVMS Sy stem Configurations rll1e OpenVMS operating system can be used in many Supporting Pa rallel Execution ways. The features of the VMScluster method allow Support of parallel execution has two different systems to expand by incrementally adding storage implications. First, the execution time should rise or processing capacity. In addition, the OpenVMS slowly, or preferably remain constant, as systems operating system is frequently used in networked are added. This implies that the execution against configurations. Its inherent richness leads to a large any given target system should be overlapped by and diverse range in the possible Open VMS configu­ the execution against the other target systems. rations. The skill and etfort required to manage the Second, fo r parallel execution to be usable in a wider larger configurations is considerable. range of cases, it shou ld be easy and straightforward For instance, Figure 1 shows a possible customer to make a request that will have similar, but not iden­ configuration, in which a number of VMScluster tical, behavior on the target systems. For instance, systems extend across a primary and a backup site. consider adding a user fo r a new member of the Each cluster has a somewhat diffe rent purpose, as development staff in the configuration shown in given in Ta ble 2. Here OpenVMS workstations are Figure 1. The new user would be privileged on the

76 Vo l. 6 No. 4 Fa ll 1994 Digital Te chnical Jo urnal The Structure of the Op enVMS Management Station

ETHERNET

DISK DISK DISK CLUSTER A DISK DISK CLUSTER C

DISK DISK DISK CLUSTER B

Figure 1 Distributed Op enVMS System Configuration

development VMScluster system, but unprivileged Jn the first case, the scope of the resource on the production cluster. It should be straightfor­ extends throughout the VMScluster system. Here, it ward to express this as a single request, rather than is desirable (ancl when the operation is not idempo­ as two disparate ones. tent, it is necessary) fo r the operation to execute once within the VMScluster system. In the latter Leveraging VM Scluster Attributes case, the operation must execute against every sys­ OpenVMS system management tasks operate tem within the cluster that the system manager against some resources and attributes that are wants to affect. Also, the set of resources that fa lls shared clusterwide, such as modifications to the into the first case or the second is not fixed. In the user authorization file, and some that are not OpenV1viS operating system releases, the ongoing shared, such as the system parameter settings. trend is to share resources that were node-specific

Ta ble 2 Usage and User Community for Sample Configuration

Name Usage User Community

A Main production cluster Operations group Production group Development group (unprivileged)

B Development cluster Operations group Development group (fu ll development privileges) c Backup production cluster and Operations group main accounting cluster Development group (unprivileged) Production group Accounting group

Workstations owner Some of operations group

Digital Te clmicaljourual Vo l. 6 No. 4 Fa ll 19':)4 77 PC LAN and System Management Tools

throughout a VMScluster system. The OpenVMS components were specified in the design, but were Management Station software must handle unnecessary for the initial release. resources that have different scopes on diffe rent The client software on the PC uses the systems that it is managing at the same time. ManageWORKS management framework from Digital's PAT HWORKS product. This extensible Wide Area Support framework provides hierarchical navigation and presentation support, as well as a local configura­ Management of a group of Open VMS systems is not tion database. 2 The framework dispatches to necessarily limited to one site or to one local area Object Management Modules (OMMs) to manage network (LAN). Frequently there are remote backup individual objects. OpenVMS Management Station systems, or the development site is remote from the has three OMMs that are used to organize the system production site. Almost certainly, the system man­ manager's view of the managed systems. These are ager needs to be able to perform some management Management Domains, VJV!Scluster Systems, and tasks remotely (from home). Therefore, any solu­ OpenVMS Nodes. In addition , two OMMs manage tion must be able to operate outside of the LAN user accounts: OpenVMS Accounts and OpenVMS environment. It should also be able to function rea­ User. The first OMM is used to retrieve the user sonably in bandwidth-limited networks, regardless accounts and to create subordinate OpenVMS User of whether or not the slower speed lines are to objects in the ManageWORKS framework hierarchy. a few remote systems, or between the system man­ The second contains the client portion of the ager and all the managed systems. OpenVMS user account management support. Underlying the last two OMMs is the client commu­ Op enVMS Management nications layer. This provides authenticated com­ Station Structure munications to a server. The resulting structure fo r the OpenViviS Man­ The server software on the OpenVMS systems agement Station software is shown in Figure 2. The consists of a message-dispatching mechanism and components contained within the dashed box are a collection of server OMMs that enact the various present in the final version 1.0 product. The other management requests. The dispatcher is also

�------

NETVIEW PC CLIENT SYSTEM I I I �:;; LOCAL I I MANAGEWORKS FRAMEWORK H J CONFIGURATION I - DATA I API USER PROXY I AGENT OMM \ I � I I I LOCAL COMMUNICATION LAYER CLIENT I I l ----I I r- -- I

I SERVER INFRASTRUCTURE SERVER INFRASTRUCTURE I I I UASERVER UASERVER I OMM OMM I v FORWARDING COMMUNICATION LAYER FORWARDING COMMUNICATION LAYER I L------1

Figure 2 Op enVM S Management Station Structure

78 Vo l. 6 No. 4 Fa ll 1994 Digital Te cbnicaljournal Th e Structure of the Op enVMS Management Station

responsible fo r fo rwarding the management the system manager is remotely located across request to all target VMScluster systems and inde­ slower speed lines from the managed systems. pendent systems, and fo r gathering the responses Finally, they require that the client understand the and returning them to the client. The version 1.0 scope of a management resource fo r all possible tar­ server contains two OMMs: UAServer and Spook. get OpenVMS systems that it may ever manage. The former implements the server support for both These various difficulties led the project team to the OpenVMS Accounts and OpenVMS User OMMs. place the data gathering, reduction, and display The Spook OMM implements the server component logic in the client. The client communicates to one of the authentication protocol. of the managed systems, which then fo rwards the Other clients were not built fo r version 1.0 but requests to all affected independent systems or were planned into the design. Specifically, these VMScluster systems. Similarly, replies are passed items are (l) a local client to provide a local applica­ through the forwarding system and sent back to the tion programming interface (API) to the fu nctions client. The chosen system is one that the system in the server, and (2) a proxy agent to provide manager has determined is a reasonable choice as a mapping between Simple Network Management a forwarding hub. Protocol (SN\11') requests and server functions. Note that the fo rwarding system sends a request to one system in a VMScluster. That system must Design Alternatives determine if the request concerns actions that Before this structure was accepted, the designers occur throughout the VMScluster or if the request considered a number of alternatives. The two areas needs to be fo rwarded further within the with many variables to consider were the place­ VMScluster. In the second case, that node then ment of the communications layer and the use of acts as an intermediate fo rwarding system. a management protocol. This structure allows the client to scale rea­ sonably with increasing numbers of managed sys­ Communications Layer Placement The first tems. The number of active communication links major structural question concerned the place­ is constant, although the amount of data that is ment of the communications layer in the overall transferred on the replies increases with the num­ application. ber of targeted managed systems. The amount of At one extreme, tl1e client could have been a dis­ local state information increases similarly. Although play engine, with all the application knowledge in it is not a general routing system, its responsiveness the servers. This design is simi.lar to the approach is affected less by either a system manager remote used fo r the X Window System and is sufficient fo r from all the managed systems, or by the manage­ the degenerate case of a single managed system. ment of a few systems at a backup site. Final ly, it Without application knowledge in the client, how­ allows the managed VMScluster system to deter­ ever, there was no opportunity fo r reduction of mine which management requests do or do not data, or fo r the simpl ification of its display, when need to be propagated to each individual node. attempting to perform management requests to several target systems. Use of Standard Protocols The second major At the other extreme, the appl ication knowl.edge structural question concerned the use of de facto or could have been wholly contained within the de jure standard enterprise management protocols, client. The server systems would have provided such as SNMP or Common Management Information simple file or disk services, such as Distributed Protocol (CMIP).-14 Both protocols are sufficient Computing Environment (DCE) distributed fi le to name the various management objects and to server (DFS) or Sun's Network File Service (NFS). encode their attribu tes. Neither can direct a request Since application knowledge would be in the to multiple managed systems. Also, neither can han­ client, these services would provide management dle the complexities of determining ifman agement requests to either a single managed system or to operations are inherently clusterwide or not. The multiple systems. However, they scale poorly. For project team could have worked around the short­ instance, in the case of user account management, comings by using additional logic within the man­ seven active file service connections would be agement objects. This alternative would have required for each managed system' Furthermore, reduced the management software's use of either these services exhibit very poor responsiveness if protocol to little more than a message encoding

Digital Te chnical Jo urnal Vo l. 6 No. 4 Fa l/ 1994 79 PC L'\N and System Management Tools

scheme. However, it was not clear that the result Note that these hierarchies do not imply any would have been useful and manageable to clients fo rm of close coupling. Their only purpose is to aid of other management systems, such as Net View. the system manager in organization. Several diffe r­ On a purely pragmatic level, an SNMP engine was ent hierarchies may be used. For a given set of sys­ not present on the OpenVM S operating system. The tems, a system manager may have one hierarchy CMIP-basecl extensible agent that was available that reflects physical location and another that exceeded the management software's limits for reflects organization boundaries. resource consumption and responsiveness. For Figure 3 shows a typical hierarchy. In the figure, instance, with responsiveness, a simple operation the system manager has grouped the VMScluster using AUTHORIZE, suchas "show account attributes," systems, PSWAPM and PCAPT, into a domain called typically takes a second to list the first user account !V ly Management Domain. The display also shows and is then limited by display bandwidth. For suc­ the results of a "list users" request at the domain cessful adoption by system managers, the project level of the hierarchy. A "list users" request can also team fe lt that any operation needed to be close to be executed against a single system. For instance, to that level of responsiveness. Early tests using the obtain the list of users on the PCAPT VMScluster sys­ CMIP-based common agent showed response times tem, the system manager need only expand the for equivalent operations varied from 10 to 30 sec­ ''OpenVM S Accounts" item directly below it. onds before the first user was displayed. Remaining user accounts were also displayed more slowly, but Pa rticipation in the not as noticeably. Authentication Protocol In the final analysis, the project engineers could have either ported an SNMP engine or corrected It was an essential requirement from the start fo r the resource and responsiveness issues with the the OpenVMS Management Station software to be at least as secure as the traditional OpenVMS system CMIP-based common agent. However, either choice would have required diverting considerable project management tools. Furthermore, due to the rela­ tively insecure nature of PCs, the product could not resources for questionable payback. As a result, the safely store sensitive data on the client system. For product developers chose to use a simple, private usability, however, the product had to limit the request-response protocol, encoding the man­ amount and frequency of authentication data agement object attributes as type-length-value the system manager needed to present. sequences (TLVs). As a result, two OMMs, the VMScluster ami the OpenVMS Node, store the OpenVNIS username that Client Component the system manager wishes to use when access­ With the OpenVMS Management Station, the client ing those systems. For a given session within the is the component that directly interacts with the lVl anageWORKS software, the first communication system manager. As such, it is primarily responsible attempt to the managed system results in a request fo r structuring the display of management infor­ fo r a password fo r that username. Once the pass­ mation and for gathering input to update such man­ word is entered, the client and the server perform agement information. This specifically includes a challenge-response protocol. The protocol estab­ capabilities fo r grouping the various OpenVMS lishes that both the client and the server know the systems according to the needs of the system man­ same password without exchanging it in plain text ager, fo r participating in the authentication pro­ across the network. Only after this authentication tocol, and for displaying and updating user account exchange has successfully completed, does the information. server process any management requests. The hashed password is stored in memory at the Grouping Op enVJ.l15 Systemsfo r client and used for two further purposes. First, if Management Op erations the server connection fails, the client attempts to The system manager is able to group individual sys­ silently reconnect at the next request (if a request is tems and VMScluster systems into loose associa­ outstanding when the fa ilure occurs, that request tions called domains. These domains themselves reports a failure). This reconnection attempt also may be grouped together to produce a hierarchy. undergoes the same authentication exchange. If the The system manager uses hierarchies to indicate hashed password is still valid, however, the recon­ the targets for a request. nection is made without apparent interruption or

80 vh/. 6 No . 4 Fa ll l'J(J4 Digital Te chnical jo urnal The Structure of the Op enVMS Management Station

ManageWORKS Ia aa

-@ My management domain -,:.»Open VMS Accounts ® DANA on SYSMGT A living legend [at least in his mind) [DANA) ,:.'1/4 DAVIDSON on SYSMGT Stu Davidson [DAVIDSON] ® DCESSERVER on PCAPT DCE Services [DCESSERVER) ® DEFAULT on PCAPT OpenVMS provided account template (USER) ® DEFAULT on SYSMGT OpenVMS provided account template (DEFAULT) tin�DEV ARAJAN on SYSMGT R. Devarajan [DEVARAJANJ ,:i:(� DFSSFS_RESD on SYSMGT DFS serveraccess [DECNET) ,&;1ftDNSSSER VER on SYSMGT DNSSSERVER [DNSSSERVER) ® DPLSSERVER on SYSMGT DECplan Server [DPLSSERVERJ ,:n'VftDQSSS ERVER on PCAPT DOSSSERVER default account [DOSSSERVER) ,lfJtDOSSSER VER on SYSMGT DQSSSERVER default account [DOS SSE RVER) � DUTKO on PCAPT Nestor Dutko [CLIENT meister) [DUTKO) � DUTKO on SYSMGT Nestor Dutko (CLIENT meister) [DUTKO) ,:n"¥¥DZIEDZIC on SYSMGT Tony Dziedzic [DZIEDZIC) tin�DZIE DZIC_N on SYSMGT Tony Dziedzic [DZIEDZIC_N] ·� SYSMGT -� PCAPT • ,:.!}Open VMS Accounts

Figure 3 Management Domain View

requests fo r input from the system manager. For instance, all the mail profile information is in Second, the hashed password is used as a key to one window. encrypt particularly sensitive data, such as new The first window to be displayed is the character­ passwords fo r user accounts, prior to their trans­ istics display, which is shown in Figure 4. This win­ mission to the server. dow contains general information about the user The resulting level of security is quite high. It cer­ that was fo und during usability testing to be needed tainly exceeds the common practice of remotely frequently by the system manager. Occasionally, logging in to OpenVMS systems to manage them. information was needed in places that did not match its internal structure. For instance, the "new Display and Up date of User mail count" was found to have two windows: the Account Information user flags display, which l1ad the login display The OpenVMS Management Station version 1.0 attributes, and the mail profile display. cl ient software primari ly supports user account The OpenVMS User OMM and the zoom display management. This support is largely contained in organize the attribu tes into logical groupings, the OpenVM S User OMM. This module presents the simplify the display and modification of those attri­ OpenVMS user account attributes in a consistent, butes, and provide fairly basic attribute consistency unified view. enforcement. The project team did encounter one The main view from the OpenVMS User OMM is case in which no standard text display proved suffi­ called the zoom display. This series of related win­ ciently usable. This was in the area of access time dows displays and allows modification to the user restrictions. All attempts to list the access times account attributes. The displays are organized so as text proved too confusing during usability test­ that related attributes appear in the same window. ing. As a result, the project developers produced

Digital Te chnical jo urnal Vo l. 6 No . 4 Fa /1 1991 81 PC LAl'l/ and System Management Tools

.!!se.name: jJJOHNSON on PSWAPhl liJ OK I A!llibute(s): Jr:haracteristics liJ !;ancel I APPIJ I I A!UIIll to AnJ I l!elp I · contact Information 0)!rler: jJim Johnson r Location: r------1 Phone: II!; i�------���------�--�:---� state-- r 1_ @ f.nabled 0 Oisa_b_led I I AccountExpiration j200. j4ii0 Expife_ On: ro loI io Advanced._ . . · IX: No el!Jiiration [----- Accounting§.roup: ARGUSJ P!iority. f4 � I Login disk l

Di� I $USERS: I .....J Direclorv; [ JJOHNSON fHome j I I d __ _Jo ______b s __ l o_nt_. _�_uo_�ta_: j _o_o_o_o__ ���S_P_�__ u � _l_m_c_k__l:_9_� l_I_Jl_%_1�----� �

Fl�r;;;ure 4 User Characteristics Disp lay a specialized screen control that displayed the time manager cbanges these settings is to accommodate range directly, as shown in the Time Restrictions a new version of an application that needs increased section of Figure 5. Later, system managers who values to fu nction correctly. Prior to the develop­ participated in the usability testing fo und this to be ment of the OpenViVIS Management Station tool, the very usable. system manager had to locate all the users of this The display and presentation work fo r the application, examine each account, and increase Open VMS User OMM was necessary for usability. the resource quotas if they were below the appli· However, this does not directly address the need cation's needs. Conversely, with the OpenVMS to support requests against multiple simultaneous Management Station product, the system manager targets. For the OpenVMS User OMtvl , these targets selects the users of the application in the domain may be either multiple VMScJuster systems or inde· display (Figure 3), and requests the zoom display pendent systems, multiple users, or a combination fo r the entire set. The system manager then of either configuration with multiple users. proceeds to the user quota display and selects the At its simplest, this support consisted of simply quotas to change. The selection takes the form of triggering a request to have multiple targets. This is a conditional request-in this case an At Least done through the Apply to Al l button on any of the condition-and the value to set. The system man­ zoom windows. By pressing this button, the system ager then presses the Apply to All button, and manager directs the updates to be sent to all user the changes are carried out fo r all selected users. accounts on all target systems I isted in the Figure 6 shows the user quota display. user name field. This action is sufficient if the sys­ tem manager is performing a straightforward task, Communications Component such as "set these users' accounts to disabled." It is The communications component is responsible fo r not sufficient in a number of cases. managing communications between the client and For example, one interesting case involve:; user servers. It provides support fo r transport-indepen­ account resource quotas. One reason a system dent, request-response communications, automated

82 Vo l. 6 No. 4 Fa ll 1994 Digital Te chnical jo urnal Th e Structure of the Op enVMS JV/anagement Station

ll.serna-: IJJOHNSON on PS\IIAPM l!J Altribute(s): I Ti'!'e Restrictions liJ Cancel

AllJ)Iy to I Alii 11� I PriiRart. Days S-econdary DaysI Monday Saturday Tuesday Sunday Wednesday Thursday Friday

.. ·---·- ..----·--- -- r- R e stricrrona ------·-·---- Processing Mode - . Primary Day Access:

�) .!..ocal 0 Jl.atch �� .... -=�..;. =� �-+ : I 0 fletwork S.a.condary Day!!"i Accen:'';f" ''"'!�l Djalup �! I I I I I I 3AM $AM 9AM Noon 3PM 6PIVI 9PM 0 Remote I I I . I

Figure 5 User Time Restrictions Disp lay

jJJOHNSON on PS\IIAPM L!J I Albilute{s): jQ uotas

Awly to Ali I i I l!elp I Quota Calegoor.

A£T Linlit: lEqual to 325 · Virtual Memory liJ ( 11 l.uffered 10: 1 �ual t() 100 : 11 Bt.tta lE qual to liJJ 95536 l!J J 11 12_irect 10: I Equ�l to 100 1 liJ I 11 Lillit: �Equal to 2ooo i'EM.Q. liJ J 11 Openfile Lilllil: JEqual to l!J I 200 11 I lob Logicals: Equal to 4096 1 t!J I ·11 TQE Limit: Jr-E -q u_a_l t_o--..;:l!J::::; j 12a 11

Figure 6 Us e1· Quota Disp lay

Digital Te clmicaljourual Vo l. 6 No. 4 Fa l/ 1')94 83 PC LAN and System Management Tools

reconnection on failure, and support routines fo r updates the collection of systems that are to receive fo rmatting and decoding attributes in messages. any fu ture management requests. Assuming this Because of the request-response nature of the was successful, the OMM then begins the request communications, the project team's first approach processing by retrieving the version number fo r the was to use remote procedure calls for communica­ current fo rwarding server. Based on that, the OMM tions, using DCE's remote procedure call (RPC) then fo rmats and issues the request. Once the mechanism.5 This matches the message traffic fo r request has been issued, the OMM perioclically the degenerate case of a single managed system. checks to see if either the response has arrived or Management of mul tiple systems can easily be mod­ the system manager has canceled the request. Upon eled by adding a continuation routine fo r any given arrival of the response, it is retrieved and the mes­ management service. This routine returns the next sage data decoded. response, or a "no more" indication. To perfo rm this messaging sequence, the OMM The RPC mechanism also handles much of the uses a pair of interfaces. The first is used to estab­ basic data type encoding and clecocling. A fo rm of lish and maintain the collection of systems that are version support allows the services to evolve over to receive any management requests. The second time and still interoperate with previous versions. interface, which is compatible with X/Open's XTI The project team's eventual decision not to use standard, is used to issue the request, determine if DCE's RPC was not due to technical concerns. The the response is available, and to retrieve it when technology was, and is, a good match fo r the needs it is 6 A third interface that supports the encoding of the OpenVMS Management Station software. and decoding of message data is described in a fo l­ Instead, the decision was prompted by concerns lowing section. fo r system cost and project risk. At the time, both the Open VMS Management Station product and the Reconnection on Fa ilure OpenVMS DCE port were under development. The The OpenVMS Management Station product DCE on OpenVMS product has since been delivered, attempts to recover from communications failures and many of the system cost concerns, such as with little disruption to the system manager the license fees fo r the RPC run time and the need through the use of an automated reconnection fo r non-OpenVMS name and security server sys­ mechanism. This mechanism places constraints on tems, have been corrected. the behavior of the request and response messages. In the end, the OpenVMS Management Station Each request must be able to be issued after a recon­ software contained a communications layer that nection. Therefore, each request is marked as either hid many of the details of the underlying implemen­ an initial request, which does not depend on server tation, offering a simpJe request-response para­ state from previous requests, or as a continuation digm. The only difference with an RPC-style model request, which is used to retrieve the second or is that the data encoding and decoding operations later responses from a multiple target request and were moved into support routines called directly does depend on existing server state. by the sender or receiver, rather than by the com­ If an existing communications link fails, that link munications layer itself. In future versions, the goal is marked as unconnected. If a response were fo r this layer is to support add itional transports, outstanding, an error would be returned instead of such as simple Transmission Control Protocol/ a response message. \Vhen the communications Internet Protocol (TCP/l P) messages or DCE's RPC. layer is next called to send a request across the An investigation into providing additional trans­ unconnected link, an automated reconnection is ports is currently under war attempted. This involves establishing a network The remainder of this section describes the com­ connection to a target system in the request. Once munications layer in more detail, including the the connection has been established, the authenti­ mechanisms provided to the client OMMs, how cation protocol is executed, using the previously reconnection on failure operates, and the message supplied authentication data. If authentication encoding and decoding support routines. succeeds, the request is sent. If it is a continuation request, and the target server has no existing state Client Request-response Mechanisms fo r that request, an error response is returned. The OMMs in the client system call the communica­ At most, the resulting behavior fo r the system tions layer directly. To make a request, an OMM first manager is to return an error on a management

84 Vo l. G No. 4 Fa ll 1994 Digital Te cbnicaljournal Th e Structure of the Op enVMS Management Station

request, indicating that communication was lost a status code indicating if the attribute was present during that request's execution. If no request was in the message buffer. The client uses this routine in progress, then there is no apparent disruption of to locate data in a response. The third routine takes service. a message buffer, a table listing the attribute codes that are of interest to the caller, and an action rou­ tine that is called fo r each message item that has an Message Encoding and Decoding attribute code fo und in the table. The server OMMs Messages from the OpenVMS Management Station use this routine to process incoming requests. tool are divided into three sections. The first sec­ tion contains a message header that describes the length of the message, the protocol version number Ha ndling of Complex Data Types in use, and the name of the target OMM. The second In general, the interpretation of data between the section contains the collection of target systems fo r client and server systems did not pose a significant the request. The third section contains the data concern. There was no floating-point data, and the fo r the OMM. This last section forms the request and integer and string data types were sufficiently simi­ is the only section of the message that is visible to lar not to require special treatment. However, the the OMMs. OpenVMS Management Station software did need The OMM data fo r a request is typically con­ a few data types to process that were not simple structed as a command, fo llowed by some number atomic values. These required special processing to of attributes and command qualifiers. For instance, handle. This processing typically consisted of fo r­ a request to list all known users on a system, return­ matting the data type into some intermediate fo rm ing their usernames and last login time, could be that both client and server systems could deal with described as this: equally well. For instance, one such data type is the time­ COMMAND LIST USERS stamp. In the OpenVMS operating system, times MODIFIER USERNAME = "*" ATTRIBUTES USERNAME, are stored as 64-bit quadword values that are LAST LOG IN TIME 100-nanosecond offsets from midnight, November 18, 1858. This is not a natural fo rmat fo r a Microsoft The OMM data fo r a response is typically a status Windows client. Date and time display fo rmats vary code, the list of attributes from the request, and the greatly depending on localization options, so the attributes' associated values. There may be many data needed to be fo rmatted on the local client. The responses fo r a single request. Using the LIST_USERS developers used an approach that decomposed the example from above, the responses would each native OpenVMS time into a set of components, sim­ look like a sequence of: ilar to the output from the $!\T lJMTIM system or the

STATUS SUCCESS UNIX tm structure. This decomposed time struc­ ATTRIBUTES USERNAME () ture was the fo rmat used to transmit timestamp LAST LOGIN TIME ( ) information between the client and server.

There are many possible attributes for an OpenVMS user. To make later extensions easier and to limit ServerCom ponent the number of attributes that must be retrieved With the OpenYMS Management Station product, or updated by a request, the OMM data fields are the server component is responsible fo r enacting self-describing. They consist of a sequence of mes­ management requests that target its local system. sage items that are stored as attribute code/item The server also must fo rward requests to all other length/item value. The base data type of each VMScluster systems or independent systems that any attribute is known and fixed. incoming request may target. The server is a multi­ Message encoding is supported by a set of rou­ threaded, privileged application running on the tines. The first accepts an attribute code and its managed OpenVMS systems. It consists of an infra­ associated data item. It appends the appropriate structure layer that receives incoming requests and message item at the end of the currem message. dispatches them, the server OMMs that enact the This is used to encode both requests and responses. management requests fo r the local system, and a fo r­ The second routine takes a message buffer and an warding layer that routes management requests to attribute code, returning the attribute's value and other target systems and returns their responses.

Digital Te clmicaljournal Vo l. 6 No. 4 Fa ll 1994 85 PC LAN and System Management Tools

Server Infrastructure development and testing phases of the project, and

The server infrastructure, shown in Figure 7, is the number of wo rker threads was kept at five. responsible for dispatching incoming requests to In addition to the basic thread structure , the the server OMMs and the fo rwarding layer. It has a infras tructure is responsible fo r participating in the set of threads, one fo r each inbound connection, a authentication exchange fo r inbound connections. pair of work queues that buffe r individual requests This is accomplished through the use of a special­ and responses, and a limited set of worker threads ized server OMM, called Spook. The Spook OMM that either call the appropriate Oi\'I M or fo rward the uses the basic server infrastructure to ensure that request. authentication requests are fo rwarded to the appro­ The inbound connection threads are responsible priate target system . This mechanism reduced the fo r ensuring that the request identifies a known amount of specialized logic needed fo r the authen­ OMM and meets its message requ irem ents. The tication protocol: fo r this reason, the server OMMs must declare if they require an authenticated link connection threads must also ensure that the OM M version number is within an acceptable range and before accepting an incoming request. that the link is sufficiently authenticated . The inbound threads are then responsible fo r repli­ Server OMM Structure cating the request and placing requests that have The server o:vi:VJs are at the heart of the server.

only one target system in the request work queue. These Oi'vl. 'vls are loaded dynamically when the Once a response appears in the response wo rk server initializes. H queue, these threads return the response to the Figure shows the structure of the liAServer 0;\1i'vl client system. in Openv:vJS Management Station version 1.0. 'fhe A fixed number of worker threads are responsi­ server OMM consists of the main application mod­ ble fo r taking messages from the request work ule, the preprocessing routine, and the postprocess­ queue and either fo rwarding them or call ing the ing routine. The interfaces are synchronous, passing

appropriate local OMM. Each result is placed in the OMi'vl (lata sections from the request and response response queue as a response message. A fixed message buffers. In addition, the main appl ication

number of five worker threads was chosen to module executes in the security context, called a ensure that messages with many targets could not persona, of the authenticated caller. Th is allows nor­ exhaust the server's resources. Re sponsiveness and mal access checking and auditing in the OpenVMS resource usage were acceptable throughout the operating system to work transparently.

OUTGOING FORWARD [ INCOMING REO =�T I _ R_EQU_ES_T _ REQUEST � - ______IS TARGET LOOK UP OMM NAME ""'11[---I --- I LOCAL? ATIACH AUTHENTICATION I REQUEST WORK QUEUE CONTEXT TO REQUEST 1 YES/ I I '\ L T E / I I ��� ��� iA���� iJi� M I 111111111 I I WAIT FOR r---,SERVER / GET NEXT"' STALL WAIT FOR RESPONSE OMM REQUEST 1 l THREAD NEXT /I lllllllll R � MATCH RES>ONSE TO � O,NG _ _.. _ __ REQUEST ______RESPONSE WORK QUEUE RESPONSE INCOMING .., / SEND RESPONSE --I RESPONSET Ir-- ::::- .- I I I FIVE WORKER THREADS ONE THREAD PER CONNECTION [ [ L ______j

Figure 7 Seruer Infrastructure and il1essage Flou•

J H>i. (j 4 R No. Fa ll /')';)4 Digital Te cbuical jo urnal The Structure of the Op en VMS Ma nagement Station

YES APPLICATION

w ::2: u (/) w ID f- >- z NO RESOURCE LL f- (/) X - z MANAGERS

Figure S UA Server OMM

The preprocessing and postprocessing routines knows what resources are affected by a given man­ are used ro ease interoperation of multiple ver­ agement request. Each resource manager knows sions. They are cal led if the incoming request has how to perform requested modifications to the a diffe rent, but supported, OMM version number specific resource that it manages. than the one fo r the local OMM. The resulting OMM For instance, the UAServer application layer data section is at the local OM M's ve rsion. These knows that the creation of a new user involves routines hide any version differences in the OMM's several resource managers, including the authoriza­ data items and free the main application from the tion file and file system resource managers. How­ need to hand le out-of-version data items. If the pre­ ever, it does not specifically know how to perform processing routine is called, the server infrastruc­ low-level operations such as creating a home direc­ ture always calls the postprocessing routine, even if tory or modifying a disk quota entry. In comparison, an error occurred that prevented the main OMM the file system resource manager knows how to do application from being called (for instance, by a these low-level operations, but it does not recognize link fa ilure during fo rwarding). This al lows the two the higher level requests, such as user creation. routines to work in tandem, with shared state. The application layer fo r all OMMs offers an inter­ The actual management operations take place in face and a buffe r. The request message passes the the main appl ication portion of the server OMM. It OMM data section to the interface, and the buffe r is structured with an application layer that provides holds the OMM data section fo r the response mes­ the interface to the management object, such as the sage. Similarly, all resource managers accept an user account. This uses underlying resource man­ OMM data section fo r input and output parameters, agers that encapsulate the primitive data stores, ignoring any OMM data items fo r attributes outside such as the authorization file. The application layer their specific resource. Because of the loose

Digital Te chnical journal Vo l. 6 Nu. 4 Fa ll 1994 87 PC LAN and System Management To ols

coupling between the resource managers and the support and encouragement to the team through­ application layer, the resource managers can be eas­ out the project, and for their further encourage­ ily reused by server OMMs developed later. ment in writing about this experience.

Summary References The Open VMS Management Station tool has demon­ 1. Op en AXP Guide to Sy stem Security strated a robust client-server solution to the manage­ VMS (Maynard, MA: Digital Equipment Corporation, ment of user accounts fo r the OpenVMS operating May 1993): 5-1 to 5-37 system. It provides increases in fu nctionality and data consistency over system management tools pre­ 2. D. Giokas and ]. Rokicki, "The Design of viously avail able on the Open VMS operating system. ManageWORKS: A User Interface Framework," In addition, the OpenVMS Management Station soft­ Digital Te chnical jo urnal, val. 6, no. 4 (Fall ware is fo cused on the management of several 1994, this issue): 63 -74. loosely associated VMScluster systems and indepen­ 3. ]. Case, M. Fedor, M. Schoffstall, and J Davin, dent systems. It has addressed the issues concern­ Network Wo rking Group, Internet Engineering ing performance, usability, and functionality that Ta sk Force RFC 1157 (May 1990). arose from the need to issue management requests to execute on several target systems. 4. DECnet Digital Network Architecture, Co mmon Management Information Protocol (CMIP) , Ver­ Ack ledgments now sion 1.0.0 (Maynard , MA: Digital Equipment Cor­ I wish to thank the Argus project team of Gary poration, Order No. EK-DNA01-FS-001 , July 1991). Allison, Lee Barton, George Claborn, Nestor Dutko, 5. ]. Shirley, Guide to Wr iting DCE Applications To ny Dziedzic, Bill Fisher, Sue George, Keith (Sebastopol, CA: O'Reilly & Associates, Inc., 1992) Griffin, Dana Joly, Kevin McDonough, and Connie Pawelczak fo r giving me a chance to work on such 6. X/Open CA E Sp ecification, X/Open Tramport an interesting and exciting project. I also wish to Interfa ce (X TI) , ISBN 1-872630-29-4 (Reading, thank Rich Marcello and Jack Fallon for providing U. K. X/Open Company Ltd., January 1992).

88 Vo l. 6 11-iJ. 4 Fat/ 1994 Digital Te clmicaljournal jo hn R. Lawson, Jr. I

Automatic, Ne twork-directed Op erating System Software Up grades: A Pl atform­ independent Approach

Th e initial system load (ISL) capability of Digital's layered-product POLYCENTER Software Distribution (formerly known asRS1J!J) version 3.0 provides Op enVJ\1/S sys­ tem managers with a network-directed tool fo r performing automatic operating system software upg rades. Th e design of the POLYCENTER Software Distribution product integrates a number of new and varied software architectures to perform the ISL. A description of the POLYCENTER Software Distribution implementation of the ISL fo r the Op en VMS op erating system details the step s of the ISL process. The soft­ ware's modular ISL mechanism can be expanded fo r use on other Digital and non­ Digital op erating systems and hardware platforms.

The POLYCENTER Software Distribution version 3.0 The POLYCENTER Software Distribution product product provides automatic, centrally delivered, can be extended fo r use with non-Digital operating network-directed operating system software systems and hardware platforms all controlled upgrades through a process called the initial system from its one user interface. In most cases, no back­ load.1 The term initial system load (ISL) has existed ups need be performed on the clients' system disks. fo r a number of years in various fo rms and has come A halted client system, of course, must be launched to describe the act of loading the operating system into the first step manually. software onto brand-new (virgin) systems without This paper describes the POLYCENTER Software the need for locally attached tape drives or other Distribution version 3.0 product. It begins by dis­ removable media devices. This term, more loosely cussing the software environment and the software applied, may also be used to describe other opera­ technologies used by the ISL process. It then states tions such as operating system upgrades. the project team's goals fo r the product. The paper The !SL technology provides many advantages next relates the ISL scheme implemented by over the traditional means of perfo rming upgrades. the POLYCENTER Software Distribution version 3.0 Typically, upgrades are performed one system at product fo r the OpenVMS operating system. The a time, at each console by the system manager, who paper concludes with a discussion of the details of must maintain the correct set of installation media expanding the ISL to other platforms. It is assumed fo r each client system's unique set of peripherals that candidate operating systems are capable of at and answer each question as the upgrade pro­ least simple task-to-task communication through cedure prompts. In a network managed by the the DECnet network (or some emulation), but other POLYCENTER Software Distribution product, operat­ communication mechanisms could be devisee! ing system upgrades are performed simultaneously. instead. Any number of !SL operations can be invoked by using a single installation medium and often by issu­ Software Environment ing a single command. In addition, the ISL mecha­ The POLYCENTER Software Distribution product nism can be used fo r system disk maintenance defines operating system software as everything on operations, such as upgrade, replacement, replica­ the volume that is typically cal led the system disk. tion, backup, or compression. This includes boot files, data files, configuration

Digital Te cbnicaljour11al Vo l. 6 No. 4 Fa ll 1994 89 PC LAN and System Management Tools

files, utilities, compilers, layered products, and cus­ • Maintenance Operations Protocol (MOP) is a net­ tom izations. It even includes user directories, if work protocol used to download system software they exist on that system disk. into the memory of adjacent network nodes. The POLYCENTER Software Distribution product • Remote triggering enables one client system to allows the individual operating system to deter­ cause another client system to reboot. The client mine the definition of upgrading the operating sys­ system tnliSt have triggering enabled and have tem software. For the OpenYMS operating system, a triggering password defined and known to the upgrading could mean the complete re placement server node. of the contents of the system disk volume, includ­ ing the utilities, compilers, database, and custom­ • The load assist agent is a shareable image run­ izations. For the OpenVMS AXP operating system, ning in the context of the maintenance opera­ it cou ld also mean the installation, without touch­ tions monitor (iVIOM) process on the server ing the rest of the volume, of only newly acquired system. This code permits a server to control OpenYMS operating system files and images. For and customize (if necessary) the system soft­ other platforms, it could mean any one of several ware being downloaded to the client. other techniques. Each platform can define what is • The local area disk (LAD) protocol allows locally needed to perform operating system upgrades or attached disks or container files on server sys­ installations. tems to be presented to the local area network A network managed by the POLYCENTER Soft­ (LAN) for use as virtual disks on client systems. ware Distribution product consists of one or more The InfoServer is the most common server of the centrally located server systems; each server is protocol. OpenVMS systems running the responsible fo r performing certain operating LAD POLYCENTER Software Distribution product can system maintenance fu nctions on its assigned set also act as LAD servers. system can be a LAD of cJ ient systems. The server system runs the A virtual disk. OpenViviS operating system. Client systems run any of a number of operating systems. The POI.YC ENTER • The processor-specific primary bootstrap is Software Distribution version 3.0 software sup­ a low-level program that is loaded into the mem­ ports backups, user authorizations, and layered­ ory of a booting client. This program can be product installations fo r clients ru nning the VA X loaded from a disk drive, a tape drive, the net­ VMS, OpenYNIS VA X, OpenVMS AXP, and ULTRIX work interconnect (NI), or read-only memory operating systems. The software includes support (ROM). A small but self-contained program, it is for ISL procedures to clients running the VA X VMS, capable of communicating with the machine's OpenVMS VA X, and OpenVMS AXP operating console subsystem, most of the machine's inter­ systems. The software's lSL architecture, however, nal resources, and the system disk from which it supports expansion to other operating systems loads a secondary bootstrap or the fu ll operating and platforms. system.

Design of the POL YCENTER Software Note that some operating systems (OpenVMS Distribution Product included) claim that their bootstrap programs are processor-independent. However, if the The design of the POLYCENTER Software Distri­ operating system is under development, support bution version 3.0 product integrates a number of will be added eventually to this bootstrap fo r new and varied software architectures. The soft­ new CPU models and/or hardware variations. ware development required the cooperation and Thus the processor-independent bootstrap pro­ synchronization of two layered-product and two gram from an earlier version of an operating sys­ operating system development groups. tem may not support all the processor types supported by a later version of this same pro­ Software Technologies gram. Therefore, processor independence is tied No single software technology is capable of auto­ to the set of processors supported by that par­ matically upgrading system disks. Several must be ticular version of the operating system. For this used in combination. A brief description of a num­ reason, the POLYCENTEH Software Distribution ber of such technologies that could be used to product specifically stores an image of the boot­ implement the ISL process fo llows. strap program in a private directory alongside

Vo l. 6 1994 Digital Te chnical jo urnal 90 No 4 Fa ll Automatic, Network-directed Op erating Sy stem Software Up grades

the container file that houses a virtual system • The software library must store several operat­ disk (a bootable snapshot of the version of the ing system images. They can be images of differ­ operating system). ent operating systems and/or diffe rent versions of the same operating systems. • The system start-up command procedure is a

command script that is responsible for bringing • Client systems must not be restricted to specific the recently booted operating system to its full peripheral hardware. configuration. In an ISL, this command proce­ • The software must make no assumptions about dure is tailored to configure only the resources the hardware to which it is delivering software. needed to perform the ISL. Sometimes, limited \Vhateverconfigur ations are legal to a particular command-level access is allowed, but seldom is operating system must also be supported by the fu ll user access or timesharing permitted. POLYCENTER Software Distribution product.

• In the OpenVMS operating system, the BACKUP/ • The client software must make no assump­ IMAGE command can duplicate a system disk tions about the server system directing the ISL. either directly from disk to disk, or indirectly Therefore, it would be inappropriate to store from a saveset file (which might be located on operating system images in BACKUP saveset tape or across the network) to disk. files, which are unique to the OpenVMS operat­ • Standalone BACKUP is a self-contained, diskless ing system. operating system capable of executing BACKUP! • Client software, including temporary system !J'vlAGE commands but not capable of network disks, must be taken from the clients themselves. operations. Prepackaged operating system software is dis­ • The SYS$UPDATE:VMSKJTBLO.COM procedure in couraged because it becomes obsolete as new the OpenVMS operating system is used to create versions of the operating system are developed, a generic system disk, using the current system and because it is rarely capable of being cus­ disk software as a modeL tomized by the user.

• The POLYCENTER Software Installation (PCSl) • The JSL process should be expandable to other utility is capable of creating or upgrading a sys­ operating systems and hardware platforms with­ tem disk from a configuration file and a descrip­ out changes to the current product. tion file (possibly located at a remote network • The POLYCENTER Software Distribution product location) or the current system disk. should be able to use Digital-supplied distribu­ Allthese software technologies are utilized in the tion media as operating system images, such as POLYCENTER Software Distribution ISL, except stand­ the PCSI-based OpenVMS AXP version 6.1 CO-ROM. alone BACKUP and SYS$UPDATE:VMSK!TBLD.COM, • The operating system image should occupy as because the fo rmer cannot be used remotely, and little disk space as possible. the latter produces uninteresting system disks.

Anyone expanding the ISL, however, can use what­ • The ISL process should work over all valid ever techniques they choose, including those two. DECnet network configurations. This require­ ment was only partially achieved: the LAO proto­ Goals fo r the ISL Process col works over the LA.t"' only. The development team had the fo llowing goals fo r From these requirements, two achievements were the POLYCENTER Software Distribution implemen­ gained: the totally modular organization and the tation of the ISL process. compatibility with the OpenVMS AXP operating system distribution CD-ROM. The fo rmer permits • The process must be totally automatic; only support for other operating system and hardware halted client systems are permitted to require platforms to be added incrementally. The latter human intervention. enables system managers to simply load the latest

• Multiple ISL processes must run concurrently. copy of the distribution media, invoke the PCSI util­ No specific limits should be placed on the num­ ity to record their configuration choices, and then ber running in parallel (except fo r practical per­ enter a single command to upgrade all their client fo rmance reasons). systems at once.

Digital Te chnica.ljourna.l Vo l. 6 No. 4 Fa ll 1994 91 PC LAN and System Management Tools

Description of the ISL Steps utility. The PCSI-based upgrade very closely resem­ Regardless of the operating system or hardware bles the ISL process of the POLYC ENTER Software platform, the ISL process requires the fo llowing Distribution version 3.0 product. simple steps: The modular layout of the POLYCENTER Software Distribution implementation and the extension of • Load a processor-specific primary bootstrap the lSL to other operating systems are discussed in into the memory of the client system. the section Platform Independence.

• Boot a (usually read-only) version of the operat­ ing system from some form of temporary system Processor-specifi c Primary Bootstrap disk. The primary bootstrap is responsible for establish­ ing the connection needed between the client sys­ • Determine the parameters of the ISL to be tem and the temporary (or virtual) system disk, performed. wherever that might be maintained.

• Move the operating system software to the target system disk. Op enVMS Implementation If the distribution medium is local, then the ROM bootstrap or a boot­ • Initiate the cleanup, configuration, tuning, and strap file on the medium is sufficient to boot the reboot of the target system disk (which would operating system contained there. If the distribu­ then contain the new version of the operating tion medium is served to the LAN by an lnfoServer system software). system, then the bootstrap image must be clown­ Figure 1 shows the steps of the ISL process in the loaded from an adjacent DEC:net node with service POLYCENTER Software Distribution Installation. enabled on its NI circuit common to that client, using This section provides a description of each MOP. In OpenVM S AXP, this image is cal led APB.SYS; step in the ISL process, contrasting the OpenVMS in OpenVJ\115 VAX, it is ISL_SVAX.SYS or ISL_LVAX.SYS implementation with the POLYC ENTER Software (for small or large VA X systems, respectively). Distribution implementation of the ISL for the The system manager requests the image to be OpenVMS operating system. The discussion downloaded (at the console of each client system) includes both the traditional standalone installa­ by entering special processor-dependent boot com­ tion based on the BACKUP command and the mands. The MOM process on the adjacent node OpenVi'

- POL YCENTER Software Distribution lmplenzen­ . ------, 1 PROCESSOR- I 1 tation The client system boots from its Nl ,..1 SPECIFIC - 1 BOOTSTRAP �-: adapter, generating a MOP load request. The server IMAGE I ------· keeps the client's hardware NJ address in its database so it can detect and process this request. This activates the load assist agent (LAA) under

1 __ ...,. the MOM process. The LAA retrieves the various INSIDE THE BOOTSTRAP IMAGE, A WORKSPACE answers to the operating system configura­ CONTAINS THE ISL tion questions from the POLYCENTER Software PARAMETERS TARGET Distribution library. It then passes those answers SYSTEM plus the operating system version-specific pri­ DISK EJ mary bootstrap (APB.EXE for OpenVMS AXP or ESS$JSL_VMSLO AD.EXE for OpenVMS VA X) back to Figure 1 JSL Process in the POLYCENTER the MOM process to be downloaded to the client's Software Distribution Installation memory. Among these configuration answers are

92 Vo f. (, No. 4 Fa ll 1994 Digital Te chnical jou rnal Automatic, Ne twork-directed Op erating Sy stem Sojtware Up grades

processor requires a new minimum version of the the name ancl password of a LAD service, presented by the server in a file containing the temporary sys­ OpenYMS operating system. The system disks cap­ tem disk. This primary bootstrap establishes the tured in the boot container could not be easily logical connection between the client system and upgraded in the field. An engineering change order this virtual system disk. was required for the POLYCENTER Software Distri­ bution product each time support was added to the Temporary Sy stem Disk OpenVMS system for a new processor. These previous versions also stored the operat­ The temporary system disk is a tailored copy of ing system image in a BACKUP saveset file. This the operating system to be installed. This system method could be more space-efficient (page, swap, disk is usually customized in such a way that its and dump files consume no space in a saveset file), only purpose is to perform the !SL. but it violates one of the design goals. In version 3.0, the software developers eliminated Op en VMS Implementation If it is mounted locally, the concept of separate BACKUP saveset files and the temporary system disk is the distribution boot containers. Since the operating system support medium. If the medium is presented to the client by fo r the new processors exists in the software saved an InfoServer service, then the temporary system in the operating system image, the cl ients can be disk is a virtual disk bound to that LAD service. In booted directly from that image. The POLYCENTER either case, the temporary system disk is mounted Software Distribution version 3.0 product stores the as read-only. image of the model system disk directly into a con­ The operating system booted from that medium tainer fi le. This approach produced an interesting is either standalone BACKUP (for traditional installa­ side effect. If a particular processor is not supported tions) or Open VMS (for PCSI-bascd ISL installations by the version of Open VMS saved in the operating or upgrades). Since standalone BACKUP can per­ system image, it is not possible to boot that proces­ form only BACKUP operations, there is an extra, sor into the ISL. As a resu lt, an older version of the time-consuming step. The system manager must OpenYMS operating system cannot be instal led on enter an appropriate BACKCP/!MAGE command to hardware that requires a newer version. move a portion of the operating system software (the so-cal led REQUIRED saveset file) to the target system disk and then boot onto the target system Pa ra meters of the JS L disk (containing this partial OpenVMS operating When operating system software is being installed, system) to continue the installation. system configuration choices must be selected from a number of variables. At a minimum, the POLYCEN T/:.1? Sojtware Distribution Implemen­ name of the target system disk must be known. tation The temporary system disk always con­ Answers might also be needed fo r questions such tains the fu ll operating system to be installed. In as: which subsets of the operating system files are most cases, this temporary system disk is actually a to be installed' The JSL procedure must be capable ful ly fu nctional image of a model system disk taken of obtaining these answers, either by prompting from another client system by an earlier FETCH a user at the console of the cl ient system or by some OPERATING_SYSTEM com mancl.l The fe tch process automatic means. (discussed later) has replaced this temporary sys­ tem disk's system start-up command procedure Ope nVMS Implementation At this point, the with a script that runs the remaining steps of the ISL OpenYMS operating system is running, and a spe­ process. cial system start-up command procedure has con­ Previous versions of this product (also known as trol. The system manager now answers a series of RSM) included a prepackaged temporary system prompts at the console. Only rare.ly does the disk with a fixed contents that was built by hand. upgrade procedure ask all its questions at once Software developers routinely captured the latest (and state that it is finished asking questions) before versions of Open VMS system disks inside boot con­ commencing any time-consu ming tasks. If it did, tainer files as small as 14,000 blocks' Although the system manager could leave the console of one an interesting academic achievement, this proved machine to move to the console of the next to be an impractical approach. Digital releases machine and so on. In this way, multiple upgrades new processors from time to time, and each new could be performed concurrently

Vo l. Digital Te chnical journal 6 No. 4 Fa ll 1994 93 PC LAN and System Management Tools

POLYCENTER Software Distribution Implemen­ ates duplicate basic system disks), the BACKUP/ ta tion The parameters of the IS.L were down­ IMAGE command (which duplicates system disks in loaded along with the primary bootstrap image. their entirety), and the PCSI utility (which upgrades The system start-up procedure of the ISL executes a system disks in place) could be utilized at this point. program that locates the Jist of parameters in mem­ The BACKUP/IMAGE command moves the image of ory and returns them as logical names, which are the temporary system disk to the target system disk. easier fo r command procedures to manipulate. The PCSI utility replaces the operating system files (Other operating systems wou ld use their own eas­ on the target system disk with the new operating ily accessible data storage mechanisms.) system files from the temporary system disk. In the The system starr-up procedure starts the DECnet BACKUP/IMAGEcase, any system-specific customiza­ networking software and establishes a network con­ tions or layered-product files that were saved into nection with the POLYCENTER Software Distribution the container file by the FETCH OPERATING_SYSTEM server system, permitting access to larger amounts process are now in place. In the l'CSI case, how­ of data than might fit into the bootstrap image. The ever, a.ll system-specific customizations or layered­ BACKUP saveset file used by previous versions of product files are left undisturbed. the POLYCENTER Software Distribution product was accessed through this DECnet connection. Cleanup, Configuration, Tu ning, and Reboot Move the Op erating Sy stem Software Anyfinal changes needed before allowing the client Each operating system has specific requirements to use its new system disk are performed during the fo r creating or duplicating system disks. This step cleanup, configuration, tuning, and reboot phase. uses the client operating system's standard proce­ The client now boots from its newly upgraded dure to duplicate or upgrade the target system disk, target system disk, and the temporary system disk generalJy using the temporary system disk as its is no longer needed. source (or model). However, another means, such as network files or library files, may be used. Op enVMS Implementation As a final step, the AUTOGEN procedure tunes the operating sys­ Op enVM S Implementation From this point, there tem parameters to the hardware on which it is is no difference between an upgrade and an installa­ intended to run. Any other configuration issues tion using the traditional standalone OpenVMS (such as the network node name and address) mechanisms. remain as exercises fo r the system manager to per­ The OpenVMS mechanism now performs a series fo rm at some later time. The system reboots one

of complex fi le replacements in a peculiar order, last time. For traditional instaUations, this reboot which requires several reboots to complete. This may have been the fifth or sixth. Some of these may maximizes the existing free space on the target sys­ have been manual reboots, which require the sys­ tem disk. After all the reboots have completed, the tem manager to issue nonstandard, processor­ old operating system files will have been deleted, specific console commands. and the new files will have been delivered. The PCSl-based upgrade doesnot need to perform POLYCENTER Software Distribution Implemen­ the several reboots, since the target system disk is tation When the BACKUP/IMAGE com mancl is treated as a data disk. Its operating system fi les are used, customizations specific fo r the ISL, which are simply replaced with new versions taken from the all stored under the [RSMO.] directory tree, must be temporary system disk. This is one reason that removed from the target system disk, which, before the PCSI-based OpenVMS upgrade is faster than the this step, is a perfect image of the temporary system traditional OpenVMS upgrade. disk. In addition, the DECnet software must be reconfigured. The DECnet databases still contain POLYCENTER Software Distribution Implemen­ the configurations saved in the temporary system ta tion Since full OpenVMS (including DECnet) is disk; these must be updated to reflect the hardware running, all the resources of the OpenVIviS operat­ on this client. As a final step, the cl ient reboots ing system are availab.le fo r manipulating the target onto the target system disk, and the temporary sys­ system disk, which is also treated as a data disk. tem disk is no longer required. With the PCSI-based Alternatives such as VMSKITBLD.COM (which ere- ISL, no cleanup is required.

94 !7rJ l. 6 No. 4 Fa /1 1()94 Digital Te clmicaljournctl Automatic, Ne twork-directed Op erating Sy stem Software Up grades

Fe tching and Installing Op erating \Xt'hen processing an INSTALL OPERATING_SYSTEM System Software AV MS com mand, the POLYCENTER Software Dis­ The POLYCENTER Software Distribution product tribution product uses the OpenVMS system keeps images of model operating systems in its run-time library ro utine LIBSFIND_liVlAGE_SYMBOL private software library. The act of placing soft­ in order to dynamically activate the shareable image ware into the library is called a fe tch. The act of SYSSSHARE:RSMSISL_INSTA LL-AVMS.EXE. Th is image delivering that software to a client system is called is called the ISL Director. It is used fo r both fetch an install. The operating system commands are and install operations. The POLYCENTER Software FETCH OPERATINC;_SYSTEM and INSTALL or UPGRADE Distribution product calls the ISL Director rou tine OPERATING_SYSTEM .1 A model system disk cannot RSM$ISL_FETCH and passes to it a context data stmc­ be instal led without first being fetched from a ture (described in the section Platform Indepen­ client system or suitable distribution medium. dence). This routine uses the software's remote command execution agent (CEA) to issue Digital command language (DCL) commands on the cl ient The Fetch Op eration system. Non-OpenVMS cl ients wou ld need to imple­ The FETCH OPERATING SYSTEM command takes ment th eir own communications mechanism, so a parameter that is the symbolic name of the operat­ that the server system could direct the client to per­ ing system to be fe tched. The POLYCENTER Software fo rm any required actions. Distribution product uses and records this sym­ These DCL commands cause the client to mount bolic name because it is the key to a naming scheme the LAD virtual disk presented from the fetch toolkit used to activate program modules later. container file RSMSSDS_DATA:RSMSFETCH-AVMS.DSK. Table 1 lists several symbolic names and the The client executes the command procedure operating systems they might represent. It is impor­ [RSMV3.0]RSM$ISL_BOOT-AVMS.COM from the fe tch tant to remember that there is no built-in mapping toolkit virtual disk. This command procedure between these names and the operating systems to which they are mapped. This list is a theoretical • Determines the size of the client's system disk sampling of what mappings could be configured on a particular server system. • Reports that system disk size to the ISL Director

Ta ble 1 Required Files for Sample Operating Systems

Symbolic Name Operating System Required ISL Files

VMS OpenVMS VAX SYS$SHARE:RSM$1SL_I NSTALL-VMS.EXE or VAX VMS RSM$SDS_DATA:RSM$FETCH-VMS.DSK SYS$SHARE: RSM$1SL_LAA-VMS.EXE

AV MS OpenVMS AXP SYS$S HARE:RSM$1SL_I NSTALL-AVMS.EXE RSM$SDS_DATA: RSM$FETCH-AVMS.DSK SYS$SHARE:RSM$1SL_LAA-AVMS.EXE

ULTRIX ULTRIX SYS$SHARE:RSM$1SL_INSTA LL-ULTRIX.EXE RSM$SDS_DATA:RSM$FETCH-ULTRIX.DSK SYS$SHARE:RSM$1SL_LAA- ULTRIX.EXE OSF1 OSF/1 SYS$SHAR E: RSM$1SL_INSTA LL-OSF1. EXE RSM$SDS_DATA: RSM$FETCH-OSF1.DSK SYS$SHARE: RSM$1SL_LAA-OSF1. EXE VMS5 OpenVMS VAX SYS$SHARE:RSM$1SL_INSTA LL-VMS5.EXE with DECnet Phase V RSM$SDS_DATA:RSM$FETCH-VMS5.DSK SYS$SHARE:RSM$1SL_LAA-VMS5.EXE

AV MS5 OpenVMS AXP SYS$SHARE:RSM$1SL_INSTA LL-AVMS5.EXE with DEC net Phase V RSM$SDS_DATA:RSM$FETCH-AVMS5.DSK SYS$SHARE:RSM$1SL_LAA-AVMS5.EXE

WINDOWS MS-DOS ru nning SYS$SHAR E:RSM$1SL_INSTALL-WIN DOWS.EXE Microsoft Windows RSM$SDS_DATA:RSM$FETCH-WINDOWS.DSK SYS$SHARE:RSM$1SL_LAA-WIN DOWS.EXE

Digital Te cht�icaljournal Vo l. 6 No. 4 Pa ll 19')4 95 PC LAN and System Management Tools

The server creates an appropriately sized LAD that the cl ient is halted and that the system manager container file to receive the snapshot of the will launch the client into the ISL manually. client's system disk and serves it to the cl ient. When the MOM process detects the client's NI address, it activates the LAA and passes control to • Mounts the new virtual disk the routine at offset 0000 in the image. The parame­

• Issues a BACKUP/IiVLAGE command to copy the ters to this procedure call (which are described in system disk to the virtual disk the section Platform Independence) include the node name of the client system and the address of • Provides the server with access to the processor­ a callback routine used to deliver the bytes of the specific primary bootstrap image (APB.EXE) bootstrap image to the client. The callback routine The server saves the APB.EXE image in its library • Reads the RSM$SDS_WO RK:ISL_client.DAT file alongside the newly created container file. (described in the section Platform Independence)

• Customizes the virtual disk so it can be used as • Retrieves the processor-specific bootstrap the temporary system clisk during an ISL image (APB.EXE) from the I ibrary The boot command procedure uses programs • Locates and writes the parameters of the ISL into and command procedures from the fe tch toolkit the bootstrap image's work space virtual disk to accomplish this step. In a FETCH

OPERATING_SYSTEM AV JVIS, this final step includes • Releases these bytes to MOM for delivery to the creating a special system root [RSMO.SYSEXE] , client placing a private system start-up command pro­ Once this is downloaded, the server system cedure [RSMV3.0]RSM$ISL_STARTUP-AVMS.COM, assumes a passive role, waiting for the cl ient to installing a program to retrieve the parameters announce its own completion. of the ISL [RSMV3.0]RSM$JSL_CLIENT-AVMS.EXE, The processor-specific bootstrap image has and installing a command procedure to remove control of the client system. It locates the LAD ser­ these customizations [RSMV3.0]RSM$1SL_CLEANUP­ vice name and password in the parameters of the AV MS.C01Vl. lSI. to establish the connection to the temporary vir­ The two virtual disks are then dismounted, and the tual system disk (which is being presented by the server closes the container file and makes it avail­ server system) and boots the Open VMS A.X.P operat­ able, write-protected, fo r ISL operations. These LAD ing system. services can be accessed with binary passwords The system start-up command procedure known only to POLYC ENTER Software Distribution (RS!vl $1SL_STARTUP-AVMS.COM) then receives con­ servers, so no casual access to the data contained trol and within is ever allowed. • Starts enough of the OpenVMS operating system Th e Install Op eration to mount local disks and start the DECnet net­ working software The POLYCENTER Software Distribution product

retrieves the symbolic name of the operating • Executes the program RSMSISI._CI.IENT-AVMS.EXE system (e.g., AV MS) from the database. The software to retrieve the ISL parameters product uses the symbolic name to activate the With the parameters of the ISL stored in logical JSL Director image (SYS$SHARE:RSM$TSL_INSlALL­ names, the system start-up procedure then AV MS.EXE) and passes control to its universal rou­

tine RSM$ISL_INSTA LL. This routine enables the LAA • Configures the target system disk (SYS$SHAHE:RSM$ISL_LAA-AVMS.EXE) and prepares • Initializes the target system disk if necessary a data file RSM$SDS_WO RK:ISL_client.DAT for use

by the LAA after the client system requests it to be • Starts the DECnet networking software downloaded. • Solicits further instructions (if any) from the If a DECnet connection is possible between the server system server system and the client system, then the com­ mand execution agent issues appropriate shutdown • Issues a BACKUP/IMAGE command to move the and reboot commands to launch the ISL. If not, the operating system software from the temporary POLYCENTER Software Distribution process assumes system disk to the target system disk

96 Vo l. 6 No. 4 Fu /1 1994 Digital Te clmicaljournal Automatic, Ne twork-directed Op erating Sy stem Software Upg rades

• Executes the RSM$1SL_CLEANUP-AYMS.COM com­ • The operating system must be bootable from a mand procedure to remove the customizations read-only LAD virtual disk. (Am ong others, the specific fo r the ISL MS-DOS, ULTRIX, and DEC OSF/ 1 operating sys­ tems are known to have this capability.) The target system disk now appears to be identi­ cal to the model system disk just before the fe tch • The hardware platform must be MOP down­ operation. loadable. (Most Digital processors have this capability.) The UPGRADE OPERATING_S YS TEM and • FE TCH CONFIGURATION Commands The operating system's processor-specific boot­ strap image must have an LAA-writeable scratch The contents of the PCSI-installable distribution area for the parameters of the ISL. medium fo r OpenVMS AXP bears a striking resem­ blance to a POLYCENTER Software Distribution tem­ • The parameters of the ISL must be retrievable porary system disk. This is no coincidence. The by the operating system's system start-up com­ Open VMS AXP development team modeled the distri­ mand procedure. bution medium after the POLYCENTER Software • Distribution boot container, so the product would be The operating system must have a mechanism plug-compatible. The obvious difference, however, is fo r moving the contents of the temporary sys­ tem disk to the target system disk, which will that the system start-up procedure invokes the PCSI utility instead of the BACKUP/L\1AGE command. never be identical media. (Most operating sys­ The client system boots from the distribution tems have this capability.) medium under the direction of the POLYCENTER • The 1St Director shareable image (SYS$SHARE: Software Distribution product. Next the procedure RSM$ISL_INSTALL-opera .EXE), containing entry starts the DECnet network software using the points RSM$ISL_FETCH and RSM$ISL_INSTALL, parameters of the ISL. Then the PCSI configuration must be active on the server system (running answers are taken from the server system rather OpenVMS). than being prompted manually at the console. Everything else is the same. • The contents of the fe tch toolkit container file Before any of this is possible, however, the sys­ (RSM$SDS_DATA :RSM$FETCH-opem.DSK) need be tem manager invokes the PCSI mil ity to record the known only to the ISL Director. This file resides answers to all the configuration questions using on the server system (running OpenVMS) but is the RSi\II$TRIAL_INSTA LL.COM command procedure. read only by the client system and only during The PCS! configuration file is then inserted in the a fetch operation. POLYCENTER Software Distribution library using • The load assist agent (SYS$SHARE:RSM$1SL_LAA­ the FETCH CONFIGURATION command. op era .EXE) must be capable of delivering the Note that when recording configuration files, the operating system's processor-specific primary PCSl utility permits users to defe r answers until bootstrap image (plus the parameters of the ISL) installation time. Unfortunately, because of the to the client system, which runs on the server product's stipulation that no human intervention system (running OpenVMS). be required, such deferrals cause the ISL to fail. Table 1 lists the names of the files required to Platform Independence support various operating systems. Note the nam­ The fo llowing section details how the ISL process ing scheme fo r the files. Each set of three files, can be expanded to other platforms and operating which compose a single ISL kit, implements the systems. Ta ble 1 gives a sample list of symbolic entire ISL fe tch and install functionality. The ISL names and their corresponding operating systems. Director routine RSM$ISL_FETCH works in conjunc­ The POLYC ENTER Software Distribution version 3.0 tion with the fe tch toolkit. The ISL Director routine kit provides only the VMS (for the VA X VMS and the RSM$1SL_INSTALL works in conjunction with the Open VMS VAX operating systems) and the AV MS (fo r load assist agent. Table 2 gives the naming conven­ the OpenVMS AXP operating system) ISL kits. tion used fo r all resources shared between these To add ISL support fo r other operating systems three files. The term op era identifies the symbolic and/or hardware platforms, the fo llowing require­ name of the operating system. The term server ments must be met. identifies the DECnet node name of the server

Digital Te chuical]ournal Vo l. (> .Vo . 4 F(l l/ 199 1 97 PC LAN and System Management Tools

Ta ble 2 Naming Conventions Used by ISL Resou rces

Name Description

RSM$1SL_INSTA LL-opera. EXE Shareable image containing the ISL Director routines, which runs on the POLYC ENTER Software Distribution server (running OpenVMS). RS M$1SL_LAA-opera. EXE Shareable image containing the load assist agent, which runs on the POLYC ENTER Software Distribution server (running OpenVMS).

RS M$FETC H-opera. DSK Container file containing the fetch toolkit, which resides on the POLYCENTER Software Distribution server syste m (running OpenVMS) but is read only by the client system.

RSM$FETCH_server-opera LAD service name for the fetch toolkit, which is served by the POLYC ENTER Software Distribution server (running OpenVMS) to the client.

RSM$1SL_BOOT-opera. COM Command procedure responsible for actually performing the save of the operating system software from the client's system disk to the virtual disk, which runs on the client system.

RSM$SDS_OS_LIB RARY: Container file for the fetched operating system, which resides on [opsys.OPERSYS]SYSO.DS K the PO LYC ENTER Software Distribution server system (running OpenVMS).

(T his directory may also be used to store the bytes of the processor­ specific bootstrap image so the load assist agent has easy access.)

RSM$1SL_server-opsys LAD service name of the te mporary system disk containing the fetched operating system image, which is served from the POLYC ENTER Software Distribution server syste m (r unning OpenVMS) to the client syste m.

RSM$1SL_STARTUP-opera.COM Command procedure responsible for actually delivering the operat­ ing system software from the temporary system disk to the target system disk. It runs on the client system but is booted from the temporary system disk.

RSM$1SL_CLEANU P-opera .COM Command procedure for removing customizations specific to the initial system load from a temporary system disk. It runs on the client system but is booted from the temporary system disk.

RSM$1SL_server LAD service name of the VA X "boot container'' for operating systems fetched prior to version 3.0, which is served from the POLYC ENTER Software Distribution server system (running OpenVMS) to the client system (which in this case must be running OpenVMS VA X or VAX VMS).

RSM$1SL_server_EVMS LAD service name of the AXP "boot container" for operating systems fetched prior to version 3.0, which is served from the POLYCENTER Software Distribution server system (running OpenVMS) to the client system (which in this case must be ru nning OpenVMS AXP).

system, and the term opsys identifies the user­ The pertinent fields of the QENTRY data structure defined pseudonym fo r the fe tched operating sys­ passed to RSM$ISL_FETCH are tem image.

The IS L Director char pseudonym[64J; The ISL Director is a shareable image activated by LIB$FIND_IMAGE_SYJVI BOL; therefore it need not char client_no de[128J; have transfer vectors, as long as the two required entry points are declared UNIVERSAl. These two char Libr ary_n o de[1 28J; routines are called in user mode. They are passed a single parameter, the address of a data structure char ope ra_ house [8J; called the QENTRY.

98 Vo l. 6 No. 4 Fa ll 1994 Digital Te c!Jnicaljo urnal Automatic, Ne twork-directed Op erating Sy stem Softwa re Up grades

and the pertinent fields of the QENTRY data struc­ store the bytes of the operating system's primary ture passed to RSMSISL_INSTA LL are bootstrap image fo r fu ture access by the LAA.

char ethernet[19J; The Load Assist Agent The LA A delivers the bytes of the processor-specific char client_n ode[128J; primary bootstrap image to the client system. The MOM process activates this shareable image dynam­ char Libra ry_node[128J; ically, but not using LIB$FIND_IMAGE_SYMBOL. Therefore, the one requ ired entry point to this char opera_h ou se[8J; image must occur at offset 0000 in the image. (The name of the entry point is unimportant.) This is best accomplished using a single transfer vector. In both routines, the field called opera_house con­ tains the symbolic name of the operating system This routine is caJJed in user mode with three (e.g., AV MS). parameters, the addresses of three data structures: MOMIDB, MOMODB. RSM$ISL_F ETCI-I is responsible fo r copying a the the MOMARB, and the MOMIDB$A_PARAM_DSC bootable snapshot of the client's system disk into The offset contains any NCP the LAD container file SYSO. DSK. The LAD virtual text from the load assist parameter field. This RSM$JSL_!NSTALL disk should be organized into the native fo rmat of field contains arbitrary text that the operating system being fe tched. The server sys­ placed there. Normally, this field contains a handle RSM$SDS_WO RK:ISL_ tem will never attempt to read these files. To the used to retrieve the file client.DAT . A server system, tbis container is simply a large series good handle is the DECnet node name of bytes, whose meaning (to the client system) is of the client system. unimportant. This routine is responsible for obtain­ The offset MOMARB$A_SEND_DATA is the address ing the size of the container file to be created, of a routine to deliver data to the client. The LAA creating that container file, and then serving it, need only collect and/or generate the data to be writeable, to the LAD. Once the fetch operation has delivered to the client; this callback routine deliv­ concluded, the container should be served again in ers it to the client. Its two parameters are a string read-only fo rmat. descriptor identifying which and how many bytes RSMSISL_INSTALL is responsible fo r enabling the are to be delivered, and the relative address in the LAA for the new client system. Since the LAA ru ns client's memory to place these bytes. This callback under the MOM process, which is a non-POLYCENTER routine may be ca.l led repetitively. MOMODB$L_TRANSFER_ADORESS Software Distribution environment, this routine The offset must should also collect any and all in.formation (such be filled with the relative transfer address of the pro­ as the DECnet node name and address of the server cessor-specific bootstrap image that was loaded into system) needed by the LAA, and store that infor­ the client's memory by MOMARB$A_SEND_DATA . For mation in the file RSJ'-•1$SDS_WOR K:ISL_client.DAT. OpenVMS VA X, this offset is traditionally zero, The conten.t of this fi le is shared only between because certain older VAX processors are not capa­ RSM$!SL_INSTALL and the LAA; therefore, the format ble of using any other value. That is one reason why of the file is implementation-dependent. the transfer address fo r ISL_SVAX.SYS is always zero.

The Fetch Toolkit Summary The fe tch toolkit is also a LAD virtual disk organized The ISL mechanism installs, maintains, and in a fo rmat that is native to the client's operating upgrades operating system software. These simple system. Again, the server system wi ll never read descriptions provide the framework fo r expaneling this virtual volume. This virtual volume contains the ISL process implemented in the POLYC ENTER the native operating system pieces necessary to Software Distribution version 3.0 product to plat­ save a snapshot of the model system disk, make it forms other than Open VMS VA X and OpenVMS AXP bootable as the temporary system disk, and restore operating systems. This expansion can make work it to its original state These are usually three sepa­ easier fo r system managers of multiple platforms rate command procedures. The command proce­ and may even start a de facto standard fo r perform­ dure that saves the system disk image must also ing operating system upgrades.

Digital Te e/mica/journal Vo l. 6 No. 4 Fall 1994 99 PC LAN and System Management To ols

Acknowledgments product. The installed software continues to use I would like to thank Richard Bishop and CharJie its traditional acronym RSM. Hammond in the OpenVMS AXP Development Group for al lowing me to unify the POLYCENTER 2. POLYCENTER Software Distribution Ma nage­ Software Distribution version 3.0 ISL and the PCST­ ment Guide (Maynard, MA: Digital Equipment based Open VMS AXP version 6.1 upgrade. Corporation, Order No. AA-J G05E-TE, May 1994).

No te and References 3. POl.YCENTER Soft ware Distribution Command

I. POLYCENTER Software Distribution is the new Reference (Maynard, MA Digital Equipment name for Digital 's Remote System Manager Corporation, Order No. AA-JG03E-TE, May 1994).

Digital Te cbnical jo urual 100 Vo l. 6 No. 4 Fa ll 1994 I Fu rtherReadings The fo llowing technical papers were written by A. john, "Dynamic Vnodes: Design and Imple­ Digital authors: mentation," USENIX 1995 Te chnical Conference on UNIX and Advanced Co mputing Sy stems R. Abugov and K. Zin ke, "WaferLevel Tracking (January 1995). Enhances Particle Source Isolation in a Manu­ IEEE!SEMI facturing Environment,'' Fifth Annual D. jones and V Murthy, "Ad vancing Reliability Advanced Semiconductor kla nujacturing with State of the Art Software To ols," University Conference and Wo rkshop (November 1994). of Manchester School of Engineering Th ird Reliability Software Seminar and Wo rkshop .J. Card, A. McGowa n, and C. Reed, "Neural Network (December 1994) . Approach to Automated Wirebond Defect Classifi­ cation," ASJlt!EProceedings of the Artificial Ne ural N. Khalil, ). Faricelli, and D. Bell, "The Extraction Ne tworks in Engineering (ANNIE '94) Conference of Two-Dimensional MOS Transistor Doping via (November 1994). Inverse Modeling,'' IEEE Electron Device Letters (January 1995). S. Cheung, D. jensen, and G. Mooney, "Ultra-High Purity Gas Distribution Systems fo r Sub O.Sum ULSI A. Labun, "Profile Simulation of Electron Cyclotron Manufacturing," Fifth Annual iEEE/SEMI Advanced Resonance Planarization of an Interlevel Dielectric," Semiconductor Manufacturing Conference and jo urnal of Va cuum Science and Te chnologyB Wo rkshop (November 1994). (JVST B) (November/December 1994).

R. Col lica, B. Cantell, and J Ramirez, "Statistical K. Mistry and B. Doyle, "How Do Hot Carriers An alysis of Particle/Defe ct Data Experiment Using Degrade N-Channel MOSFETs? ," IEEE Circuits and Poisson and Logistic Regression," IEEE Interna­ Devices (January 1995). tional Wo rkshop on Deje ct and Fa ult To lerance in VLSI Sy stems (October 1994). C. Ozveren, R. Simcoe, and G. Va rghese, "Reliable and Efficient Hop-by-Hop Flow Control," SIG­ B. Doyle, K. Mistry, and C-L. Huang, ·'Analysis of AC111 COJHM 94 (October 1994). Gate Oxide Th ickness Hot Carrier Effects in Surface Channel P-MOSFETs;· iEEE Transactions on R. Razdan and M. Smith, "A High-Performance Electron Devices (January 1995). Microarchitecture with Hardware-Programmable Functional Units," Proceedings of the Twenty­ ). Edmondson, "Internal Organization of the Alpha seventh Annual International Sy mposium on 21164," iEEE Fi rst International Sy mposium on (M IG"R0-27) High-jJelt'ormcmce Computer Architecture (HPCA) Microarchitecture (December 1994). (January 1995). R. Rios and N. Arora, "Determination of Ultra-Thin L. Elliott, D. Paine, and ). Rose, "The Microstructure Gate Oxide Thickness fo r CMOS Structures Using and Electromigration Behaviour of AJ-0.35%Pd Quantum Effects," IEEE International Electmn Interconnects,'' Ma terials Research Society Devices Meeting/IEDM Te chnical Digest Sy mjJosium Proceedings: Ma terials Reliability (December 1994). in Microelectronics IV Sy mposium (April 1994). N. Sullivan, "Semiconductor Pattern Overlay," C. Gordon and K. Roselle, "An Efficient and Accu­ Proceedings of the InternationalSociet y of rate Method fo r Estimating Crosstalk in Multicon­ Ph oto-Optical Instrumentation Engineers (SPIE) ductor Coupled Transmission Lines," IEEE Th ird Microelectronic Processing: Integrated Circuit Top ical Meeting on the Electrical Performance Metrology and Process Control (Critical Review) of Electronic Pa ckaging (November 1994). (September 1994).

D. Heimann, "Using Complexity-Tracking in the B. Thomas, "OpenVMS I/O Concepts: CSR Access;' Software Development Process,'' Th irty-second Digital Sy stemsjo urnal (November/December Annual Sp ring Reliability Sy mposium (April 1994) 1994).

VrJI. 6 No. 4 Fa l/ 1994 Digital Te chllical]ow-nal 101 Recent Digital US. Pa tents

The fo llowingI patents were recent�v issued to Digital Equipment Corporation. Titles and names supplied to us by the US. Pa tent and Trademark Offi ce are reproduced exactly as they appear on the original pub­ lished patent.

5,208,692 D. McMahon High Bandwidth Network Based on Wavelength Division Multiplexing 5,208,768 E. Simoudis Expert System Including Arrangement fo r Acquiring Redesign Knowledge 5,210,854 A. Beaverson and T. Hunt System fo r Updating Program Stored in EEPROM by Storing New Ve rsion into New Location and Updating Second Transfer Ve ctor to Contain Starting Address of New Version 5,210,865 S. Davis, W Goleman, and Transferring Data between Storage Media While Maintaining D. Thiel Host Processor Access fo r 1/0 Operations 5,217, 198 V Samarov, W Pauplis, and Uniform Spatial Action Shock Mount G. Doumani 5,218,678 B. Kelleher and S-S. Chow System and Method fo r Atomic Access to an Input/Output Device with Direct Memory Access 5,220,661 ]. Wray, A. Mason, P Karger, System and Method fo r Reducing Timing Channels in Digital P Robinson, W-M. Hu, and Data Processing Systems C. Kahn 5,224,206 E. Simoudis System and Method fo r Retrieving .Justifiably Relevant Cases from a Case Library 5,224,884 M. Singer, R. Noffke, and High Current, Low Voltage Drop Separable Connector D. Gilmour 5,233,684 R. Ulichney Method and Apparatus fo r Mapping a Digital Color Image from a First Color Space to a Second Color Space 5,235,697 S. Steely and]. Zurawski Set Prediction Cache Memory System Using Bits of the Main Memory Address 5,239,634 B. Buch and C. MacGregor Memoqr Controller fo r Engineering/Dequeuing Process 5,239,637 S. Davis, W Goleman, and Digital Data Management System fo r Maintaining Consistency D. Thiel of Data in a Shadow Set 5,241,564 J. Tang and J.L. Ya ng Low Noise, High Performance Data Bus System and Method 5,242,761 Y Uchiyama Magnetic Recording Medium and Method of Manufacture Thereof 5,243,241 C-H. Wa ng To tally Magnetic Fine Tracking Miniature Galvanometer Actuator 5, 247,464 R. Curtis Node Location by Differential Time Measurements 5,247,618 S. Davis, W Goleman, D. Thiel, Transferring Data in a Digital Data Processing System R. Bean, and]. Zahrobsky 5,251,147 ]. Finnerty Minimizing the Interconnection Cost of Electronically Linked Objects 5,251,227 T. Bissett, W Bruckert, Resets for a Fault To lerant, Dual Zone Computer System ]. Munzer, D. Kovalcin, and M. Norcross 5,253,249 ]. Fitzgerald and D. Shuda Bidirectional Transceiver for High Speed Data System 5,253,353 ].C. Mogul System and Method fo r Efficiently Supporting Access to 1/0 Devices through Large Direct-mapped Data Caches 5,257, 264 H. Yang and Automatically Deactivated No-owner Frame Removal K. K. Ramakrishnan Mechanism fo r To ken Ring Networks 5,261 ,077 J.R. Duval, K.R. Peterson, Configurable Data Path Arrangement fo r Resolving Data Ty pe and T.E. Hunt Incompatibility (This case is related to PD89-0300.)

102 Vo l. 6 No 4 Fa ll 1994 Digital Te chnical journal I '),261 ,OH'5 LB. Lamport Fau lt-tolerant System and Method fo r Implementing a Distributed State Machine '5, 26'5,092 S. R. Soloway, A.G. Lauck, and Synchronization Mechanism fo r Link State Packet Routing G. Ve rghese '5,26'5,2'57 R.J. Simcoe aml R. E. Thomas Fast Arbiter Having Easy Scaling fo r Large Numbers of Requesters, Large Numbers of Resource Typ es with Multipl.e Instances of Each Ty pe and Selectable Queueing Disciplines '),274,Hll A. Borg and D.W Wa ll Method for Quickly Acquiring and Using Ve ry Long Traces of Mixed System and User Memory References '5,276, 712 ].D. Pearson Method and Apparatus fo r Clock Recovery in Digital Com munication Systems '5, 276,809 .f. K. Grooms, R.L. Sites, Method and Apparatus fo r Capturing Real-time Data Bus L.A. Chisvin, and OW Smelser Cycles in a Data Processing System 5,276,828 J Dion Methods of Maintaining Cache Coherence and Processor Synchronization in a Multiprocessor System Using Send ami Receive Instructions '5, 276, 851 C. Thacker and D. Conroy Automatic Writeback and Storage Limit in a High-performance Frame Buffe r and Cache Memory System '5,276,874 R.G. Thomson Method for Creating a Directory Tree in Main Memory Using an Index File in Secondary Memory 5,278,974 R. Ramanujan, PJ Lemmon, Method and Apparatus fo r the Dynamic Adjustment of Data and .J.C. Stickney Transfer Timing to Equalize the Bandwidths of Two Buses in a Computer System Having Different Bandwidths 5,2H0.47H H. Ya ng, PW Ciarfella, No-owner Frame and Multiple To ken Removal Mechanism fo r K.K Ramakrishnan To ken Ring Networks '), 280, ')75 C. A. Yo ung and N.F jacobson Arraratus fo r Cell Format Control in a Spreadsheet 5, 280,'582 H. Ya ng, K.K. Ramakrishnan. No-owner Frame and Multiple To ken Removal fo r To ken Ring and A. Lauck Networks '5,280,627 j. E. Flaherty and A. Abrahams Re mote Bootstrapping a Node over Communication Link by Initially Requesting Remote Storage Access Program Which Emulates Local Disk to Load Other Programs '),28:1,857 E. Simoudis Expert System Including Arrangement for Acquiring Redesign Knowledge S.C. Steely and D.J Sager Next Line Prediction Apparatus fo r a Pipelined Computer System '),287, 1:)8 B.iVI. Kelleher System and Method fo r Drawing Anrialiased Polygons 5, 287,48') L. Umina and R. Anselmo Digital Processing System Incl.uding Plural Memory Devices and Data Transfer Circuitry 5, 287,5:)4 T Reuther Correcting Crossover Distortion Produced When Analog Signal Thresholds Are Used to Remove Noise from Signal '),291,494 T. Bissett, W Rruckert, and Method of Handling Errors in Software .J. Melvin '5,293,620 W Barabash and Method and Apparatus fo r Scheduling Tas ks in Repeated WS. Ye razunis Iterations in a Digital Data Processing System Having Multiple Processors '5,296,:)92 G.J. Gnlla ancl W.C. Metz Method for Forming Trench Isolated Regions with Sidewall Doping '5,297. 269 D. Donaldson, M. Howard, Cache Coherency Protocol fo r Multi Processor Computer D. Orbits, ). Parchem, System D. Robinson, and D. WiLliams

Digital Te chnical jo urnal Vo l. 6 No. 4 Fa ll 1994 103 Recent Digital US. Patents

5,298,464 RW Doe, R.D. Gates, Method of Manufacturing Tape Automated Bonding D.P. Goddard, S.C. Hsu, and Semiconductor Package R.L. Schlesinger

5,301,327 W McKeeman and S. Aki Virtual Memory Management fo r Source-code Development System 5,303,382 B. Buch and C. MacGregor Arbiter with Programmable Dynamic Request Prioritization 5,303,391 R . .J. Simcoe and R.E. Thomas Fast Arbiter Having Easy Scaling fo r Large Numbers of Requesters, Large Numbers of Resource Types with Multiple Instances of Each Type and Selectable Queueing Disciplines 5,313,387 WM. McKeeman ami S. Aki Re-execution of Edit-compile-run Cycles for Changed Lines of Source Code, with Storage of Associated Data in Buffers 5,313,464 F. H. Reiff Fault To lerant Memory Using Bus Bit AJigned Reed-Solomon Error Correction Code Symbols

5,313,641 R.]. Simcoe and R.E. Thomas Fast Arbiter Having Easy Scaling fo r Large Numbers of Requesters, Large Numbers of Resource Types with Multiple Instances of Each Type and Selectable Queueing Disciplines

5,315,480 V Samarov, G. Doumani, and Conformal Heat Sink fo r Electronic Module R. Larson 5,317, 708 R. Edgar Content Addressable Memory 5,319,651 R. Hel liwell, R. Lary, B. Edem, Data Integrity Features for a Sort Accelerator andJ.Johnston

5,321,724 B. Long and M.]. Hynes Interference Suppression System

5,321 ,841 M.C. Ozur, S.M. Jenness, Server Impersonation of Client Processes in an Object-based ).W Kel ly, J). Wa lker, and Computer Operating System ).A. East

5,325,531 WM. McKeeman and S. Aki Incremental Compiler fo r Source Code Development System

5,327,368 R.A. Eustace and].S. Leonard Chunky Binary Multiplier and Method of Operation

5,327,557 J.P.Emmo nd Single-keyed Indexed File fo r TP Queue Repository

5,330,881 A. Sidman and S. Fung Microlithographic Method for Producing Thick Ve rtically Wa lled Photoresist Patterns

5,337, 404 P. Beaudelaire, M. Gangnet, Process for Making Computer-aided Drawings J Herve, T. Puder, and J.VThong

5,339.449 P.A. Karger, A. H. Mason, System and JVI ethod fo r Reducing Storage Channels in Disk JC.R. Wray, P.T. Robinson, Systems A.L. Priborsky, C.E. Kahn, and T. E. Leonard

5,345,587 L.G. Fehskens, C. Strutt, Extensible Entity Management System Including a Dispatching S. Wo ng, J.F.Callande r, Kernel and Modules Which Independently Interpret and P.H.Burgess, K.J Nelson, Execute Commands M.]. Guertin, D.L. Smith, MW Sylor, KW Chapman, R.C. Schuchard, S. I. Goldfarb, R.R.N. Ross, LB. O'Brien, P.). Trasatti, D.O. Rogers, B.M. England, ].L. Lemmon, R.L. Rosenbaum, and additional inventors

5,345,588 R. Peterson, B. Schreiber, Thread Private Memory Storage fo r Multithread Digital Data and S. Greenwood Processing

104 Vo l. G No . 4 Fa ll 1994 Digital Te chnical jo urnal Call fo r Authors fromDigital Press

Digital Press has become an imprint of Butterworth-Heinemann, a major inter­ national publisher of professional books and a member of the Reed Elsevier group. Digital Press remains the authorized publisher fo r Digital Equipment Corporation: the two companies are working in partnership to identify and pub­ lish new books under the Digital Press imprint and create opportunities fo r authors to publish their work.

Digital Press remains committed to publishing high-quality books on a wide variety of subjects. We would like to hear from you ifyou are writing or thinking about writing a book.

Contact: Frank Satlow Publisher Digital Press 313 Wa shington Street Newton, MA 02158 Te l: (617) 928-2649 Fax: (617) 928-2640 fp [email protected] ISSN 0898-90lX

Reserved. Corporation. All Rights © Digital Equipment l l8E-Tj/95 04 14 14.5 Copyright Printed in U.S.A. EY-T