OS/2® SAS Server Using SAS® Software in a Networked Client-Server System

Allan E. Williams, David West
Fred Hutchinson Cancer Research Center, Seattle, WA

ABSTRACT

For users of the SAS System 6.04 on PC networks, the release of the SAS System 6.06 for the IBM OS/2 brought both exciting opportunities and accompanying problems. The SAS System 6.08 for OS/2 is much faster, as well as allowing larger jobs and new capabilities. However, upgrading all SAS users to OS/2 is impractical for many environments.

This paper discusses one solution that enables all SAS users on a PC network to take advantage of the new features and throughput of the SAS System 6.08 for OS/2 without having to convert operating systems.

A simple approach to implementing an OS/2 SAS server on a network is described. Examples of the use of the SAS server are described and future enhancements and possibilities are discussed. Extensions from a single OS/2 SAS server to multiple servers on the network are straightforward. Multiple batch servers create a 'cluster' of processors that appear to the user like one large server.

INTRODUCTION:

The environment that provided the impetus for consideration of a SAS server was a large PC Novell® network serving a group of social scientists, epidemiologists, and biostatisticians involved in cancer prevention research. The investigators, programmers, and analysts use the PC SAS System 6.04 for all analyses. Many survey data sets have over 500 variables and SAS jobs tend to run out of computer memory. In addition, some new features of the SAS System 6.06/6.08, particularly PROC SQL, were very important to several projects.

It was not practical to convert all SAS users to OS/2 so that they could use the SAS System 6.06, since many of the other important computer tools were not available on OS/2. Converting a few machines to OS/2 was problematic because each person worked cooperatively on multiple projects, and all people on a project had to access the same SAS system files.

Most PC SAS users in our environment tend to run SAS in batch mode, since it reduces the out-of-memory problems and because most prefer using their own editor. We reasoned that if it were possible to submit a batch SAS job to a remote PC running SAS on OS/2, then we could provide them access to the SAS System 6.06 features without a change of operating system. This would also free up their computer for other tasks while large jobs were running on the remote SAS server.

Requirements:

To be effective, a SAS Server needed to satisfy the following objectives. First, the server should use the same network SAS system files as the SAS System 6.04, without conversion or transportation to a different location. This is not possible if the files are to be processed on a computer on the network. Sharing files between the PC SAS System 6.04 and the UNIX SAS system requires conversion to intermediate transport files via PROC CPORT and PROC CIMPORT. Files would then have to be copied between UNIX network disk partitions and DOS network disk partitions and then converted by a DOS to UNIX utility. In contrast, the SAS System 6.06 on OS/2 can read and write PC SAS System 6.04 system files directly.

A second requirement was that use of the SAS Server require minimal conversion, if any, of SAS jobs. In addition, it was important to allow a user to decide where to run a particular job, since the SAS Server may have a long queue at times.

Third, the user should be able to use the same habits when using the SAS server or the PC SAS System 6.04. Thus if the job is run in batch mode, the SAS '.LOG' and '.LST' files should be left in the same location, regardless of where the job is run.

A final but very important design element was to use a simple and modular approach to develop the SAS Server system.

Solution:

Consideration was given to implementing the SAS Server in SAS MACRO or in a batch programming language such as OS/2 REXX. The latter was chosen, because the approach would then work on any batch job that could be run under OS/2, not just SAS. (REXX is part of OS/2.) Further, developing the pilot program in an interpreted batch programming language allowed for the most flexibility and simplicity.

Network communication between all users was a major concern, since not all users have read or write permission on all parts of the network. A common communications directory, to which all users had access, was used for the SAS Server queue and log files.

The main function of the OS/2 Server (DOSAS.CMD) is to read from a queue file the name of a SAS job to be run and the directory in which the job resides, to run the job, and to write the time spent to a log file after the job is finished.

Two Pascal programs were written for the MS-DOS side. The first program, OS2SAS, submits a SAS job to the queue and/or shows the list of jobs in the queue. Another similar program, DELSAS, deletes a job from the queue.

One implication of requiring the SAS Server to read SAS jobs submitted from any directory is that the SAS Server 'user' must have very broad permissions on the network. A special Novell user, named 'SASSERV', was created to run the SAS Server. Because of the broad permissions required, access to the SAS Server computer is limited, and only a few key personnel can bring up the SAS Server and log into the network.

A 25 MHz 486 level PC (IBM PS/2) was chosen as the initial SAS Server computer. OS/2 version 1.3 was installed with Novell network access. The computer has 8 MB of memory, which was adequate for most SAS jobs but proved to be too limited for some large SAS/IML® jobs or for running simultaneous SAS jobs. A larger server (50 MHz 486 PC clone with 24 MB of memory, OS/2 2.0) was then installed and no limitations were observed. Many of our large jobs are I/O limited and the access speed to the network is critical. Our network is a 16 Mbit/sec IBM Token Ring, and the file server uses striped disk arrays for high speed file access. Since all data is shared between users, the data is always stored on the network. Software is also obtained from the network server for site licensing control.

Starting the OS/2 SAS Server:

The following steps are followed to bring up the SAS Server:

• Start the OS/2 SAS Server computer
• Log into Novell as SASSERV
• Bring up an OS/2 batch window
• Start the DOSAS REXX program.

What happens on the OS/2 SAS Server?

The DOSAS program reads the queue file for jobs submitted by users. A 'job' consists of the name of the SAS program to be run and the full name of the network directory from which the job was submitted. When a 'job' is found, the DOSAS program changes to the directory indicated and starts the indicated SAS job.

When the job finishes, control returns to the DOSAS REXX program. The DOSAS program writes the date, the job execution time in seconds, and the directory and SAS file name to the log file. The program then returns to the communications area and continues looking for more jobs to run. If no jobs are found in the queue, a period is written to the window, indicating that the program is active and available.

The User's View on MS-DOS:

In order to submit a job (for example, test1.sas) to the OS/2 SAS Server the user types:

    OS2SAS TEST1.SAS

This starts the OS2SAS program and passes the name of the SAS job to the queue. The OS2SAS program writes the name of the SAS job (TEST1.SAS) and the full directory name to the queue file in the shared communications area.

The result of submitting this job is shown below. Note that there are several jobs already in the queue, and that the job TEST1.SAS shows up as the last line, but with '.TMP' as the extension. This is because a copy of the SAS job is made before submission.

    M:\SASEX> os2sas test1.sas
            1 file(s) copied
    Submitted M:\SASEX\TEST1.SAS for SAS/OS2 processing

    CONTENTS OF QUEUE FILE:
    M: \BER\CODE MAKEBASE.TMP
    M: \BER\CODE MAKEIBAS.TMP
    M: \BER\CODE FIXIBASE.TMP
    M: \BER\CODE FIXBASE.TMP
    M: \TEST\CODE CONTENT.TMP
    M: \SASEX TEST1.TMP

If the user leaves off the name of the SAS file, the program displays the contents of the queue. This is shown below.

    M:\SASEX> os2sas

    CONTENTS OF QUEUE FILE:
    M: \BER\CODE MAKEBASE.TMP
    M: \BER\CODE MAKEIBAS.TMP
    M: \BER\CODE FIXIBASE.TMP
    M: \BER\CODE FIXBASE.TMP
    M: \TEST\CODE CONTENT.TMP
    M: \SASEX TEST1.TMP

The user can check the .LOG and .LST files to see if the job is finished, just as would be the case when submitting the job as a batch PC SAS System 6.04 job. Our convention is to issue a Novell 'SEND' using the SAS 'CALL SYSTEM' routine in conjunction with the NOXWAIT option to alert the user that the job is finished. NOTE: This does not work effectively if the user is in Windows.

If the user wishes to delete his/her job from the queue, there is a second program, DELSAS. To delete the job TEST1.SAS from the queue, the user types:

    DELSAS TEST1.SAS
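
The three queue operations described in this section (OS2SAS appending a job, DELSAS removing one, and the server taking the first job) can be sketched as follows. This is a minimal illustration in Python, not the authors' Pascal or REXX code. The function names submit, delete, take_next, and demo are hypothetical, and the sketch assumes that each queue line holds a drive, a directory, and a '.TMP' job name separated by single spaces, with no spaces inside the directory name.

```python
from pathlib import Path
import tempfile


def submit(queue: Path, drive: str, directory: str, job: str) -> str:
    """Append one 'drive directory JOB.TMP' line to the queue file."""
    entry = f"{drive} {directory} {Path(job).stem.upper()}.TMP"
    with queue.open("a") as f:
        f.write(entry + "\n")
    return entry


def delete(queue: Path, entry: str) -> bool:
    """Rewrite the queue without lines equal to entry; True if one was removed."""
    if not queue.exists():
        return False
    lines = queue.read_text().splitlines()
    kept = [ln for ln in lines if ln != entry]
    queue.write_text("".join(ln + "\n" for ln in kept))
    return len(kept) != len(lines)


def take_next(queue: Path):
    """Remove and return the first job as (drive, directory, job), or None."""
    if not queue.exists():
        return None
    lines = [ln for ln in queue.read_text().splitlines() if ln.strip()]
    if not lines:
        return None
    # Put the remaining jobs back, as the server does after reading the first.
    queue.write_text("".join(ln + "\n" for ln in lines[1:]))
    drive, directory, job = lines[0].split()
    return drive, directory, job


def demo():
    """Submit two jobs, delete the second, then drain the queue."""
    with tempfile.TemporaryDirectory() as tmp:
        q = Path(tmp) / "job.q"
        submit(q, "M:", r"\BER\CODE", "makebase.sas")
        entry = submit(q, "M:", r"\SASEX", "test1.sas")
        deleted = delete(q, entry)
        return deleted, take_next(q), take_next(q)
```

The sketch does no file locking, so two programs rewriting the queue at the same moment could lose an entry; any real deployment would need to consider that.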

A typical delete request is shown below. Note that the job TEST1 is no longer shown in the queue.

    M:\SASEX> delsas test1.sas
    Deleted M:\SASEX\TEST1

    CONTENTS OF QUEUE FILE:
    M: \BER\CODE MAKEBASE.TMP
    M: \BER\CODE MAKEIBAS.TMP
    M: \BER\CODE FIXIBASE.TMP
    M: \BER\CODE FIXBASE.TMP
    M: \TEST\CODE CONTENT.TMP

What is different about submitting SAS 6.06 jobs from a MS-DOS system?

There is a major decision that must be made by the user: whether the SAS system files are to be usable by both the SAS System 6.04 (on the user's PC) and the SAS System 6.06, or only by the SAS System 6.06 (on the OS/2 SAS Server). If the former, then the files must remain in SAS 6.04 format. If the latter, then the user can use the default SAS 6.06 format for SAS system files.

Since the SAS LIBNAME must specify the correct version of the SAS system engine, a macro was written to include the correct SAS engine specification. If the user wants to be able to run the SAS job with either the SAS System 6.04 or the SAS System 6.06, they need 'V604' before the libname directory location. This instructs SAS that all files using this libname are in SAS 6.04 format and should remain that way.

If a format library is necessary for the job, then it must be created as a SAS version 6.06 format library. The SAS System 6.06 cannot use SAS 6.04 format libraries. Our preferred convention is to have a file called FORMATS.SAS in a project directory that can be used to create either a SAS version 6.04 or 6.06 format library. The SAS 6.06 format libname thus requires a 'V606' prior to the directory location.

    * Macro to add SAS engine to SAS LIBNAME's;
    %macro setlib(libnayme, libpath, dsversn);
       %local versn;
       %if &dsversn= %then %let dsversn=&sysver;
       %if &sysver=6.04
          %then %str(libname &libnayme "&libpath";);
       %else %if &sysver=6.06 or &sysver=6.08 %then
          %do;
             %if %upcase(&libnayme)=LIBRARY
                %then %let versn=;
                %else %let versn=V%substr(&dsversn,1,1)%substr(&dsversn,3,2);
             %str(libname &libnayme &versn "&libpath";);
          %end;
    %mend setlib;
    %setlib(datalib, m:\mydir\mysubdir, 6.04);
    %setlib(library, m:\myfmts);

There is an undocumented SORTSIZE option in PROC SORT in the SAS System 6.04. This option is used to reduce the OUT OF MEMORY problem so common in SAS 6.04 with more than 500 variables or a large number of observations. This option MUST NOT BE USED as part of the PROC SORT procedure in the SAS System 6.06. If the option appears in a SAS System 6.06 PROC SORT step, the job will either hang the server or use excessive server resources and force a reboot of the SAS server computer.

CONCLUSION

The OS/2 SAS server has been in operation for almost two years. A number of projects would not have been possible without it. Some SAS programs were converted from version 5 on an IBM mainframe and would not run within the memory constraints of the PC SAS System 6.04. Others required PROC SQL. Other projects developed SAS/IML models that strain the memory limitations of the current server. However, the primary use of the SAS Server is to offload large jobs from the user's machine to the SAS Server.

The SAS Server has had an average use of over 5 CPU hours per work day, with peaks of 16-22 hours of daily use. New users usually need only a 10-15 minute introduction before using it successfully. Some users have only needed it occasionally for jobs that were too large to run on the PC SAS System 6.04. Others have become regular daily users and have started developing programs specifically for the SAS Server. There is a nightly queue of large jobs at the end of most days. Some users' approach to multi-tasking is to send their large jobs to the SAS Server and then run shorter jobs on their own machine in the SAS System 6.04.

The use of the initial SAS server prompted installation of a second server which serves a second queue. This is primarily used for one project that has very long jobs, leaving the initial server for shorter jobs. Creation of a second queue involves modification of the OS/2 REXX command file to look to a different queue file and conversion of the Pascal programs to write to the correct files. The new queue was initially set up to test the beta version of the SAS System 6.08 for OS/2, but the SAS System 6.08 is so stable that it quickly came to be used daily for production jobs.

Planned Enhancements:

The beta release of the SAS System 6.08 for OS/2 proved to be over twice as fast as the beta SAS System 6.08 for Windows for most of our jobs. This throughput enhancement, combined with the release of OS/2 2.0 and the updated Novell network drivers, makes it more attractive to convert a selected group of programmers and statisticians to OS/2. These users are also able to set up the SAS Server as a background process, thus becoming secondary SAS Servers themselves. A number of servers can all be running at the same time, each looking for the next job to start from the queue. However, questions of network permissions and access would have to be resolved. The initial approach will be to have project-specific queues dedicated to certain computers. This is called the 'supermarket-line' queue model. This is the approach used by the NQS (Network Queueing System) developed for NASA AMES UNIX networks. The other approach is to have all machines read the same queue, referred to as the 'bank-teller' queue model, used by the DNQS (Distributed Network Queueing System) developed by the Super Computing Research Institute at Florida State University for UNIX (DeRoest, 1992). Both approaches have advantages. SAS does not allow two simultaneous jobs to be run from the same directory if they would both use the same PROFILE catalog. This makes the multiple queue model safer, since jobs that run from the same directory could be submitted in sequence. The 'bank-teller' model is more efficient, however, since multiple queues can leave some machines overbooked and some free. During the day, users tend to check and move their jobs to the best queue. When one queue is advertised for long jobs, users are reluctant to place jobs there if there is a job in the queue and they need the results quickly.

The release of the SAS System 6.08 for Windows will remove many of the large job limitations involved with using the PC SAS System 6.04. However, using SAS for Windows on a 386 class machine makes it difficult to accomplish other work at the same time. Thus our users would still like to be able to offload large SAS jobs to a fast SAS server while testing and running some jobs locally. The expected availability of faster processors, combined with the possibility of new operating systems such as Windows/NT®, makes it essential that new options be implemented in a cost effective manner and in a way that minimizes the complexity to the user community. Isolating new enhancements to the SAS Server allows major productivity enhancements to be made while allowing the user to maintain a consistent environment. Windows/NT supports multiple processor systems. These will be natural platforms for the SAS Server when SAS for Windows/NT is released.

Future Possibilities for Distributed Processing:

Large networks have an enormous computing capacity that is underutilized much of the time. If more computers on the network were able to run a multi-tasking system like OS/2 (or UNIX on a UNIX network), then each user could start up the SAS Server program to run in the background. The priority could be adjusted to low CPU use when the user was active and raised when the user was going to be away from the machine for a period of time. Every machine would then be looking for work to do from the input queue. This is called a 'cluster' approach and is an active area of research in the supercomputer and UNIX community (DeRoest, 1992). There is minimal cluster research activity in the networked OS/2 community but the advantages are the same.

Such an approach is being offered in the UNIX world by Hewlett-Packard and was demonstrated at SUGI 17 (Goss, 1992). Each networked workstation is constantly checking how busy it is. When a user requests that a job be done by some other workstation, each workstation 'bids' on the job, with the least used machines (and probably the most powerful) being able to bid the highest. The high bidder gets that job, thus reducing its ability to bid high until its job is done. This general approach will allow large networks to more efficiently use their tremendous computing capabilities in a coordinated manner.

The next step will be to have multiple computers work on parts of a large problem, rather than on separate problems. This is not currently an option under SAS, but distributed computing capabilities such as the Network LINDA language (Gelernter, 1991) may one day be integrated into SAS, allowing large, iterative problems such as statistical simulations to be run on networks faster than on the fastest mainframes.

REFERENCES

DeRoest, Jim, (1992), RS/Magazine, January, 1, 32-35.

Gelernter, David, (1991), Mirror Worlds, Oxford University Press.

Goss, Lori, (1992), 'Using task broker to enhance SAS in a UNIX environment', Proceedings of the Seventeenth Annual SAS Users Group International Conference, 17, 623-629.

OS/2 2.0 Technical Library: Procedures Language 2/REXX Reference, (1991), IBM.

ACKNOWLEDGMENTS

SAS and SAS/IML are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. IBM and OS/2 are registered trademarks or trademarks of International Business Machines Corporation. ® indicates USA registration.

All other brands and product names are trademarks or registered trademarks of their respective companies.

Author Contact:

Comments and Suggestions are welcome.

Allan E. Williams
Fred Hutchinson Cancer Research Center
Cancer Prevention Research Program
1124 Columbia St., MP 702
Seattle, WA 98104

Phone: (206) 667-5231
FAX: (206) 667-5977
INTERNET: [email protected]

Appendix 1
The OS/2 REXX DOSAS.CMD Program

    /*********************************************
     * Program name: DOSAS.CMD
     * Code location: C:\SAS on OS/2 SAS Server
     *
     * Purpose: Implement the SAS Server on OS/2.
     *   This reads 'jobs' from the queue file 'job.q'
     *   in the M:\HOME\TRANSFER network
     *   directory, moves to the job directory, starts a
     *   batch SAS job, and writes the job and
     *   execution time to the log file 'job.log' when finished.
     *
     * Project: FHCRC, Cancer Prevention, SRSS general use
     * Author:  David West
     * Date:    06/91
     * Revised: 04/92, D.W., to add log file
     * System:  SRSS PS/2 - OS/2 - SAS 6.06/6.08
     *********************************************/
    /* This program runs on the OS/2 SAS 'SERVER' */
    /* trace ?all */
    /* Name the queue file and the log file */
    q = 'M:\home\transfer\job.q'
    log = 'M:\home\transfer\job.log'

    do forever
       /* Check to see if any lines in the queue file */
       if lines(q)
       then do
          /* If so, open the file as a generic stream */
          x = stream(q,'c','open')
          if x \= 'READY:'
          then say q 'not ready'
          else do
             /* the first request will be the only one executed */
             parse value linein(q) with drive dir name
             /* pull remaining lines into program internal queue */
             do while lines(q)
                queue linein(q)
             end
             /* close the file and delete it */
             x = stream(q,'c','close')
             'del 'q
             x = stream(q,'c','open')
             /* put remaining lines back into file */
             do while queued() > 0
                pull text
                x = lineout(q,text)
             end
             x = stream(q,'c','close')
             if drive \= ''
             then do
                /* concat into full name */
                fullName = drive''Dir'\'Name
                /* change to request's directory */
                call directory drive''Dir
                /* give readout of current date and time */
                SAY 'started on 'date()' 'time()
                /* reset timer */
                call Time('R')
                /* run SAS */
                'SAS 'fullName' -config c:\sas\Config.sas'
                /* write elapsed time to log file, reset timer */
                'echo 'date('S')' 'right(trunc(time('R')),5)' 'fullname' >>'log
                say 'wrote job to log file'
             end /* then */
          end /* else */
       end /* then */
       else call charout ,'.'
       x = stream(q,'c','close')

       /********************************************************
        Loop for a while before checking input queue
        This could be made more efficient by using the OS/2 2.0
        function: call SysSleep seconds, in conjunction with
        the RxFuncAdd to link the additional REXX function library
        See OS/2 2.0 REXX Reference manual, Pgs 4-47 and 4-65
       ********************************************************/
       j = 0
       do i = 1 to 5000
          j = j + 1
       end
    end /* forever */
    return

Appendix 2
OS2SAS.PAS: Submit jobs to SAS Server Queue

    {$M $4000,0,4000 }
    program OS2SAS ;
    { Submit a SAS job to the SAS server
      Syntax is: OS2SAS sasjob.sas
      Parameters may be added after the SAS program name.
         OS2SAS sasjob.sas p1 p2 p3
      Any parameters replace %1, %2, %3, etc. in the SAS
      program as they would in a DOS batch file.
      If no SAS program is typed, the input queue is listed. }

    uses Dos;

    { The queue is on a shared access network directory }
    const q = 'm:\home\transfer\job.q';

    var i, position : integer;
        ok : Boolean;
        Fileinfo : SearchRec;
        DirName, FileNm, commstring,
        tmpname, SASFileName, Drive, line,
        searchstring : string;
        p : array [1..9] of string;
        qfile, sasfile, tmpsasfile : text;

    begin
      assign(qfile,q);
      ok := False;
      IF ParamCount = 0
      THEN writeln('No SAS code filename was given.')
      ELSE
      BEGIN
        { assign command line params to variables }
        SASFileName := ParamStr(1);
        for i := 1 to 9 do p[i] := '';
        for i := 2 to ParamCount do p[i-1] := ParamStr(i);
        { get full name }
        GetDir(0,DirName);
        Drive := Copy(DirName,1,2);
        DirName := Copy(DirName,3,Length(DirName));
        { get the full name of the sas file }
        FindFirst(SASFileName, Anyfile, Fileinfo);
        IF DosError <> 0
        THEN
        BEGIN
          writeln('Incorrect filename or format.');
          writeln('Correct usage: OS2SAS filename.SAS p1 ... p9');
        END
        ELSE
        BEGIN
          { make file name and tmp file name }
          FileNm := Fileinfo.Name;
          tmpname := Copy(FileInfo.Name,1,Pos('.',FileNm))+'TMP';
          assign(sasfile,FileNm);
          reset(sasfile);
          assign(tmpsasfile,tmpname);
          rewrite(tmpsasfile);
          { substitute params for %1 - %9 if listed }
          WHILE not EOF(sasfile) DO
          BEGIN
            readln(sasfile,line);
            IF (p[1] <> '') AND (line <> '')
            THEN
              FOR i := 1 to 9 DO
              BEGIN
                { make search string }
                Str(i,searchstring);
                searchstring := '%' + searchstring;
                position := Pos(searchstring,line);
                WHILE position > 0 DO
                BEGIN
                  Delete(line,position,2);
                  Insert(p[i],line,position);
                  position := Pos(searchstring,line);
                END; { while }
              END; { for }
            writeln(tmpsasfile,line);
          END; { while }
          close(sasfile);
          close(tmpsasfile);
          SwapVectors;
          Exec(getenv('comspec'),' /c echo '+Drive+' '
               +DirName+' '+TmpName+' >> '+q);
          SwapVectors;
          writeln('Submitted '+FileNm+' for SAS/OS2 processing');
          writeln;
        END;
      END;
      writeln('Contents of SAS/OS2 queue:');
      SwapVectors;
      Exec(getenv('comspec'),' /c type '+q);
      SwapVectors;
    END.

Appendix 3
DELSAS.PAS: Delete jobs from the SAS Server Queue

    {$M $4000,0,4000 }
    program DELSAS ;

    uses Dos;

    const q = 'm:\home\transfer\job.q';
          tmp_q = 'm:\home\transfer\garbg.tmp';

    var i, position : integer;
        Fileinfo : SearchRec;
        DirName, fullname, commstring,
        tmpname, SASFileName, Drive, line,
        searchstring : string;
        p : array [1..9] of string;
        qfile, outfile : text;
        found : Boolean;

    BEGIN
      found := false;
      IF ParamCount = 0
      THEN writeln('No SAS code filename was given.')
      ELSE
      BEGIN
        { assign command line params to variables }
        SASFileName := ParamStr(1);
        { get full path name }
        GetDir(0,DirName);
        Drive := Copy(DirName,1,2);
        DirName := Copy(DirName,3,Length(DirName));
        { get the full name of the sas file }
        FindFirst(SASFileName, Anyfile, Fileinfo);
        IF DosError <> 0
        THEN
        BEGIN
          writeln('Incorrect filename or format.');
          writeln('Correct usage: DELSAS filename.SAS p1 ... p9');
        END
        ELSE
        BEGIN
          searchstring := Drive+' '+DirName+' '
             +Copy(FileInfo.Name,1,(Pos('.SAS',FileInfo.Name)-1))
             +'.TMP';
          { open the q file }
          assign(qfile,q);
          FindFirst(q, Anyfile, Fileinfo);
          IF DosError <> 0
          THEN writeln('Queue file is empty, SAS program not deleted')
          ELSE
          BEGIN
            { if q file already exists, copy to tmp file,
              removing desired filename in process. }
            reset(qfile);
            assign(outfile,tmp_q);
            rewrite(outfile);
            WHILE not EOF(qfile) DO
            BEGIN
              readln(qfile,line);
              IF searchstring <> line
              THEN writeln(outfile,line)
              ELSE found := true;
            END;
            close(outfile);
            close(qfile);
            Erase(qfile);
            Rename(outfile,q);
          END { else }
        END; { else }
        IF found THEN
        BEGIN
          writeln;
          writeln('Deleted '+searchstring);
        END;
      END; { else }
      writeln;
      writeln('Contents of SAS/OS2 queue:');
      SwapVectors;
      Exec(getenv('comspec'),' /c type '+q);
      SwapVectors;
    END.
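
The parameter substitution that OS2SAS applies to each line of a SAS program (Appendix 2) can be illustrated with a short Python sketch. This is not the authors' code: the function name substitute_params is hypothetical, and the sketch makes a single pass per parameter with str.replace, whereas the Pascal loop rescans the line after every replacement.

```python
from typing import Sequence


def substitute_params(line: str, params: Sequence[str]) -> str:
    """Replace %1..%9 in one line of SAS source with positional parameters.

    As in the Pascal program, nothing is substituted unless a first
    parameter was given, and missing parameters become empty strings.
    """
    if not line or not params or not params[0]:
        return line
    # Pad to exactly nine slots so that %1..%9 are all handled.
    padded = (list(params) + [""] * 9)[:9]
    for i, value in enumerate(padded, start=1):
        line = line.replace(f"%{i}", value)
    return line
```

For example, with the parameters survey and 1992, the line "set in.%1; where year=%2;" becomes "set in.survey; where year=1992;". Because the tokens are plain text, a literal "%1" through "%9" elsewhere in the program would be replaced as well.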
