An Overview of the SAS/C® Compiler and Library

Tim Hunter, SAS Institute Inc., Cary, NC

ABSTRACT

This paper describes the major features of the SAS/C® compiler and library and discusses the development history of the product. The enhancements in the current production release are outlined. Some possible enhancements for future releases are described.

INTRODUCTION

The SAS/C product has had four production releases in its five years of development. Each release included many new features and enhancements of existing features. Not only is the compiler used exclusively for Version 6 of the SAS® System on IBM® mainframe hosts, it is the leading C compiler in this market. Although the compiler and library are heavily oriented to use in large software systems, they are an efficient tool for any software project that is written in the C language.

The primary elements of the SAS/C product are the compiler and run-time library. However, the product includes a number of utility programs, as well as several configurations of the run-time library for specialized environments. Because the documentation is as important as the software itself, the SAS/C product is accompanied by a complete and detailed set of manuals and technical reports.

This description begins with a history of SAS/C development that highlights the features of the first three releases. Following this is an overview of the product as it is today. The last part is a discussion of what directions the product may take in the future. Throughout, features exclusive to the SAS/C product are highlighted, as are those features that were implemented in response to requests from our users.

HISTORY OF THE SAS/C COMPILER AND LIBRARY

In mid-1984, SAS Institute Inc. decided to develop its own C compiler. This project was assigned to the Institute's Language Systems Department. The goal of this group was to make Lattice, Inc.'s compiler work on the IBM 370 architecture under the MVS/370, MVS/XA, and VM/CMS operating systems. A little over a year later, the progenitor of the SAS/C product was put into production.

The following sections describe each production release of the SAS/C product. Specific features are mentioned in the section covering the release in which the features appeared. Of course, all of the features have continued to be supported in newer releases.

Release 2.10C

The first production release was known as the Lattice Native Compiler as Modified by SAS Institute Inc. for IBM 370 Systems. The compiler was based on Version 2.00 of the Lattice C compiler. The run-time library was a new implementation of the standard C library, with emphasis on both IBM suitability and compatibility with UNIX® implementations.

The Compiler

The compiler implemented the C language as described in the then definitive book on C, Kernighan and Ritchie's The C Programming Language. Because the compiler was implemented for IBM 370's, it included many features that were thought necessary for that architecture and for programmers who were used to working in that environment. The following list describes some of these features:

• The generated code is fully reentrant, even allowing modification of external variables.

• The compiler supports a number of built-in functions. These are functions for which the compiler generates instructions directly into the instruction stream, rather than generating a call to a separately linked version of the function. In this release, the following functions were implemented as built-in functions:

    strlen    memcmpp    abs
    strcpy    memcpyp    ceil
    memcpy    memxlt     fabs
    memset    strxlt     floor
    memcmp    memfil     ldexp

• The compiler produces a source listing file, including macro expansion and cross-reference.

• C programs can call, and be called by, programs written in IBM 370 assembly language using a very simple interface. Many assembler subroutines require no modification.

• The INDEP compiler option generates code that allows programs to execute without the need for the run-time library, and that can be called from other high-level languages such as PL/I, COBOL, and FORTRAN. (Exclusive.)

The compiler also supports several extensions that were deemed important to the usability of the C language in the IBM 370 environment. Some of these extensions are described in the following list:

• Some characters, such as { and the circumflex, are commonly used in C programming, but are not always available on IBM terminals and printers. Therefore, the compiler supports digraphs (two-character sequences) as substitute representations of these characters. You can specify up to four representations of each character, two for use in the source file and two for use in the listing file. The alternate representations may be modified on-site.

• In order to allow the generation of call-by-address parameter lists, the compiler supports the @ special operator. When applied to a function argument, this operator, like the & operator, returns the address of the argument. Unlike the & operator, the @ operator accepts an argument that is not an lvalue, in which case the @ operator forces creation of a temporary copy of the argument and returns the address of the copy. (Exclusive.)

• In order for C programs to call assembly language functions that expect a VL-format parameter list, the compiler supports the __asm prefix for function names. When a function whose name is prefixed by __asm is called, the compiler creates a VL-format parameter list for the function. (Exclusive.)

The Run-Time Library

As mentioned earlier, the design of the run-time library had two goals: suitability for IBM 370 operating systems and compatibility with UNIX implementations. Most of the library functions are standard functions; that is, they are functions that most C implementations support. These functions can be grouped as follows:

• memory allocation functions

• character-type functions

• program control functions

• string functions

• mathematical functions

• date and time functions

• variable argument list functions.

Because the UNIX I/O model (widely used in the existing implementations of the C library) is so different from the IBM 370 model, the library I/O subsystem was implemented only after much thought and hard work. The final result was three basically separate sets of I/O functions: standard, augmented, and UNIX-style. Standard I/O functions are those commonly used, such as fopen, printf, and scanf. Augmented I/O functions (exclusive) are closer to IBM 370 I/O models and include afopen, afread, and afwrite. The UNIX-style I/O functions are close approximations of the UNIX operating system-level I/O routines open, read, write, and lseek.

The library also contains some functions designed specifically for use in the IBM 370 environment, such as the dynamic loading functions loadm and unloadm. These functions allow C programs to be implemented as several distinct load modules, each of which can be loaded into memory as required and unloaded when it is no longer needed. Other functions are present that follow common IBM 370 idioms (for example, the strxlt and xltable functions for string translation).

In addition, a version of the library that allows execution in other operating systems is provided. This version, called the Generalized Operating System environment, does not use operating system services directly. Instead, the library invokes one of several exits. Each exit is coded for use in the target operating system, and, as such, may take whatever action is appropriate to fill or deny the library's request. (Exclusive.)

Finally, most of the run-time library was packaged as a set of transient load modules. This packaging allowed individual C programs to take up less space on disk, simplified system maintenance, and allowed the library to be installed in a shared [...]

[...] references to an external object refer to the same object. OMD370 is an object code disassembler that, given an object deck produced by the compiler, produces a listing, similar to that produced by an assembler, optionally interspersing C language source lines at the appropriate points in the listing.

Release 3.00F

Shortly after Release 2.10C went into production, the Language Systems Department began converting Lattice's Version 3.00 compiler. As shown below, the new release contained several important enhancements. Release 3.00F, the first release to use the SAS/C name, went into production 13 months later, December 1986.

The Compiler

Both Lattice, Inc., and the Institute had been members of the ANSI X3J11 committee (the committee responsible for producing a standard for the C language and library) for some time. Therefore, Release 3.00F contained several new language elements. Some of these elements are

• the void data type

• structure assignment, structure arguments, and functions returning structures

• the enum data type

• function prototypes

• the __LINE__ and __FILE__ pre-defined macro names.

Also, the compiler implemented the register storage class. Up to six general purpose registers may be assigned to auto integer and pointer variables, and up to two floating-point registers may be assigned to floating-point variables.

The Run-Time Library

The run-time library gained new functions in several areas. These functions are

• the string functions memupr, memlwr, strupr, strlwr, strstr, and 14 others.

• a set of system interface functions including cuserid and functions that allow access to operating system information such as the name and release numbers.

• two new utility functions, bsearch and qsort.

This release also defined a subset of the library informally called purelib. This group of functions, including the math and string functions, are independent of operating system services. As such, they were intended for use in programs that execute without the full library.

Utilities

The GENCSEG utility was added in Release 3.00F, also.
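The language elements and library functions listed for Release 3.00F can be illustrated with a short sketch in present-day standard C. This is not taken from the SAS/C product or its manuals; every type and function name below (release, feature, make_feature, copy_feature, trace, sum_to, sorted_index_of) is hypothetical and chosen only for illustration. The register keyword is a hint to the compiler, and a modern compiler may ignore it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* The enum data type. */
enum release { REL_2_10C, REL_3_00F };

/* A structure that is assigned, passed, and returned by value. */
struct feature {
    enum release first_seen;
    const char  *name;
};

/* Function prototypes: argument types are checked at each call. */
struct feature make_feature(enum release rel, const char *name);
struct feature copy_feature(struct feature src);
void trace(char *buf, size_t buflen, const char *msg);
int sum_to(int n);
int sorted_index_of(const char *names[], size_t count, const char *key);

/* A function returning a structure. */
struct feature make_feature(enum release rel, const char *name)
{
    struct feature f;
    f.first_seen = rel;
    f.name = name;
    return f;
}

/* A structure argument and a structure assignment. */
struct feature copy_feature(struct feature src)
{
    struct feature dst;
    dst = src;
    return dst;
}

/* The void data type, plus the __FILE__ and __LINE__ macro names. */
void trace(char *buf, size_t buflen, const char *msg)
{
    snprintf(buf, buflen, "%s:%d: %s", __FILE__, __LINE__, msg);
}

/* The register storage class applied to auto integer variables. */
int sum_to(int n)
{
    register int i;
    register int total = 0;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Comparison callback shared by qsort and bsearch. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sorts names in place with qsort, then looks up key with bsearch.
   Returns the index of key in the sorted array, or -1 if absent. */
int sorted_index_of(const char *names[], size_t count, const char *key)
{
    const char **hit;
    qsort(names, count, sizeof names[0], cmp_names);
    hit = bsearch(&key, names, count, sizeof names[0], cmp_names);
    return hit ? (int)(hit - names) : -1;
}
```

Note that sorted_index_of reorders the caller's array as a side effect, because qsort sorts in place and bsearch requires sorted input.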