2005 was a busy year for me as of POSIX was published (the the USENIX standards represen- Shell and Utilities volume), and tative. There are three major it became a second ISO standard. NICHOLAS M. STOUGHTON standards that I watch carefully: Amendments to these standards I POSIX, which also incorpo- were also under development, USENIX rates the Single UNIX Specifi- and led to the addition of real- cation time interfaces, including Standards I ISO-C pthreads, to the core system call I The Standard Base (LSB) set. Many of the other projects Activities died away as the people involved In order to do that, USENIX lost interest or hit political road- funds my participation in the blocks (most of which were Nick is the USENIX Standards committees that develop and reported in ;login: at the time). Liaison and represents the maintain these standards. Association in the POSIX, ISO C, Throughout 2005, the Free Until the end of the twentieth and LSB working groups. He is century, POSIX was developed the ISO organizational repre- Standards Group (FSG) also sentative to the Austin Group, helped fund these activities. For and maintained by IEEE exclu- a member of INCITS commit- each of these, let’s look at the his- sively. At the same time, the tees J11 and CT22, and the Open Group (also known as Specification Authority sub- tory of the standards, then at group leader for the LSB. what has happened over the past X/Open) had an entirely separate but 100% overlapping standard, [email protected] 12 months or so, and, finally, what is on the agenda for this known as the Single UNIX year. Each of these standards is Specification. This specification critical to a large proportion of started from the same place in our members. Without these history, and many of the partici- standards, open source software pants around the table at an as we know it today would be X/Open meeting were the exact very, very different! same people who had met a few weeks before at an IEEE POSIX meeting to discuss the same set POSIX of issues! The POSIX family of standards This duplication of effort became was first developed by the IEEE, so annoying that a new, collabo- arising from earlier work from rative, group was formed to pro- /usr/group and the System V duce a single document that Interface Definition (SVID), would have equal standing for and was published as a “trial use” each of ISO, IEEE, and the Open standard in 1986. In 1988, the Group. That group held its first first full-use standard was pub- meeting in Austin, Texas, in lished. The difference between 1998, and was therefore named “trial” and “full” use is principal- the “Austin Group.” The Austin ly in the use of the term “should” Group published a full revision rather than “shall” in the require- of the POSIX and Single UNIX ments for any interface. specifications as a single docu- ment in 2001. It was adopted by In 1990, the 1988 API standard all three organizations and is was revised, clarifying a number maintained by the same team, of areas and expanding them. At which represents the interests of the same time, the API standard all three member organizations. became an ISO standard. At this point in history, there were about Since the 2001 revision, work 10 separate POSIX projects under has been steadily progressing development, ranging from the maintaining this 3762-page mas- basic OS system calls and libra- terpiece. Every week, there is a ries, through commands and util- steady stream of “defect reports,” ities, to security, remote file which range from typos in the access, super-computing, and HTML version (the document is more. In 1992, the second part freely available in HTML on the

72 ;LO GIN: V OL. 31, NO. 2 Web; see http://www.unix.org nal files, getline and getdelim, ISO C /single_unix_specification), some multibyte string-handling through major issues with functions, robust mutexes, and The first version of the ISO C ambiguous definitions, and so versions of functions that take standard, then known as ANSI- on. Some of these defects can be pathnames relative to a directory C, was published in 1989. It quickly and cleanly fixed, and file descriptor rather than plain took the original language from two “Technical Corrigenda” pathnames (this helps avoid cer- Kernighan and Ritchie’s book documents have been approved, tain race conditions and helps and tightened it up in a num- which alter the wording for with really long pathnames). ber of places. It added function some of the interfaces to clarify I would expect to see official prototypes and considerably their meanings. drafts of this new revision this improved on the standard C Every ISO standard (and every summer, and the final version in . The first versions of IEEE standard, too) has a five- 2008. POSIX used this language as the underlying way to describe inter- year “reaffirm/revise/withdraw” POSIX has long had support process, where the document faces, and included a c89 com- beyond the C language world. mand to invoke the compiler. is examined to see if it is still There are Ada and Fortran offi- relevant, whether it needs revi- cial “bindings” to POSIX. How- Between 1989 and 1999, the C sion to meet current needs, or ever, there has never been a real committee added wide character whether it is now outdated and connection between the C++ support and addressed several should be withdrawn. For world and the POSIX world; language “defects”—internal POSIX, the Austin Group has C++ programs can use C to call discrepancies in the way various elected to revise the specifica- POSIX functions. But this leads features were described. The tion during 2006. to all sorts of complications for committee included a number Under the Austin Group rules, C++ programmers and, more of compiler vendors, who were the group as a whole cannot seriously, to much reinvention also keen to have the language invent new material. One of its of the wheel in providing map- permit ways to guide an opti- sponsor groups (IEEE, ISO, and pings between C++ constructs mizer: features such as con- the Open Group) must have and those of POSIX. The Austin stants, volatile variables and prepared a document and had it Group has received several restrict pointers were added to adopted under its own organiza- defects from C++ programmers the language for this purpose. tion rules before it can be pre- who want to know why they In 1999, a new revision came sented to the group as a whole. can’t do x, to which the tradi- out which included several new Therefore, the Open Group has tional answer has been “don’t features such as these, along been developing, and is now in use C++, use C”! And to make with major rework for floating the final stages of approving, a matters worse, the C++ language point support (including things number of documents which committee is also going through such as complex numbers). include new to become a a revision at present, and they At this point, the committee is possible future part of a UNIX want to add all sorts of features fairly happy with the state of branding program. Once ap- to the language that might make the core language and is fight- proved, these documents can it harder to access some of the ing back against proposals to then be examined by the Austin fine-grained features of POSIX change it. However, they have Group (OK, so it’s still the same (since they want the language not stopped working! They are group of people who developed to work on other platforms, currently preparing several tech- the set in the first place) for they deliberately try to be OS- nical reports that optionally inclusion into the POSIX neutral). extend the C language in a revision. All that may change soon. A number of directions. Of these, The new interfaces under con- study group has recently been by far the most significant to sideration are ones that have formed to look into the need most USENIX members is the been popular in the GNU-C for, and desire to build, a C++ report formerly known as the library (glibc) and Solaris for binding to POSIX. USENIX is “Security TR.” I say formerly some time, but have not been hosting the wiki for this group, because the term “Security” formally standardized before. and you are welcome to join: (and it turns out, many other They include support for stan- http://standards.usenix.org/posix related words) are so overloaded dard I/O functions to operate on ++wiki. and charged with meaning that memory buffers as well as exter-

;LOGIN: APRI L 2006 USENIX STANDARDS ACTIVITIES 73 by far the most objections to the to prevent buffer overflows, but buffers, and the application document were to its title. it hasn’t! slowly leaks memory. However, I The report formerly called the It will also likely change the ABI believe this is a smaller problem “Security TR” actually attempts of third-party libraries that want than the use of static buffers to deal with the fairly common to use this; they must now have with guessed sizes. problem of buffer overflow. It a way of receiving the size to Back to the name of this report: does so in a very simple fashion: check against. This suggests to as I said , “Secure Library” got every interface in the ISO-C me that this library will have lit- a ringing “no” vote. This report standard library that takes a tle uptake as it stands, though does not address any of what buffer has a secure variant which Microsoft has implemented it many people regard as security includes the size of the buffer. and has updated all of its core issues. The name “Safer Library” Now, while that is the meat of programs to use it (is this a was suggested, but the owners the original concept, it isn’t all good thing?). of a product called “Safer-C” that the report currently propos- The core of the problem is that objected. In the end it has come es. The report introduces the memory handling in C is com- down to “Extensions to the C concept of runtime constraints, plicated and error-prone. Library—Part 1—Bounds that is, various things that must Nobody will doubt that Checking Functions.” hold true when an interface is improvements in the supporting invoked. The original standard APIs are useful, but the existing THE LINUX STANDARD BASE library simply had undefined APIs already provide the means behavior when you passed a null to write correct programs. It is The LSB is an Application pointer to an interface that just cumbersome to do so. The Binary Interface (ABI), rather expected a pointer to a buffer. So proposed interfaces won’t than an Application char *p = malloc(10); change that; on the contrary, Programming Interface (API). gets(p); they could make programs even As such, it covers details of the could fail in a variety of ways, more complex. An alternative binary interfaces found on a despite being well-formed, legal approach is to take as much of given platform, providing a con- C. the memory handling away from tract between a compiled binary the programmer as possible. application and the runtime The new “secure” library ver- environment that it will execute sion of this, To that end, I am preparing a second part to this technical on. The first version was pub- char *p = malloc(10); report that uses dynamic memo- lished in 2000 and has devel- gets_s(p, 10); ry allocation instead of static oped rapidly since then. It now will invoke a runtime exception buffers. For new programs consists of a Core specification handler (analogous to a signal (rather than retrofits of old (including ELF, Libraries, handler) if p is null (because the code), this approach leads to a Commands & Utilities, and malloc failed) or if there are cleaner, more robust application, Packaging), a Graphics module more than 10 characters on the with fewer possibilities for prob- (including several core X11 next line of standard input. lems. For example, instead of libraries), and a C++ module. According to its current stats, reading data from an input with Each specification has a generic this document proposes a gets into a static buffer (that portion that describes interfaces library that might be of benefit might be too small), the getline that are common across all to someone going over thou- function allocates a buffer big architectures and seven architec- sands or millions of lines of enough to hold the entire input ture-specific add-ons that spell existing code and trying to find line, however long it was (or out the differences between the and plug all of the possible returns NULL if there was in- architectures. buffer overflow spots. It is likely sufficient memory). The only For the past year or more, I to end up obfuscating some of problem with such an interface have been acting on behalf of the code. It is also possible that is that the programmer must the as the if the buffer size is not well remember to release the memo- technical editor for the ISO ver- known, it could end up hiding ry when he or she is done with sion of this standard. ISO 23360 bugs where the programmer it, by means of a call to free. was unanimously approved last simply guesses at a buffer size Some have argued that this, too, September by the national bod- but is wrong; now the code can lead to unexpected bugs, as ies that contribute to the sub- looks as if it has been retrofitted programmers forget to free these committee responsible for pro-

74 ;LO GIN: V OL. 31, NO. 2 gramming languages and their symbol you should be using to difficult for end users and Linux runtime environments. get the promised behavior. vendors alike. With the LSB, all We have had to jump through a Additionally, the LSB Core parties—distribution vendors, few hoops in the final publica- Specification is mostly a super- ISVs, and end users—benefit as tion phase, but now it looks as set of the POSIX APIs. How- it becomes easier and less costly though the document is ready. ever, there is a small handful of for software vendors to target You will soon be able to buy a places where the two specifica- Linux, resulting in more appli- CD from ISO with the LSB on it tions are at odds. For the most cations being made available for (see http://www.iso.org), or you part, these differences won’t the Linux platform. can just download the PDF for bother most programmers most It is important for the LSB free from the Free Standards of the time, but there are corner workgroup not to slip into the Group (though the copyright cases you can creep into where comfortable feeling that the job notice is subtly different, as are you’ll find your application isn’t is now done. If the workgroup the running headers and foot- portable between an LSB-con- does not remain focused on the ers—see http://refspecs forming platform and a POSIX- core document, that core docu- .freestandards.org. conforming platform. For exam- ment will quickly become irrele- What now for the LSB? Are we ple, POSIX requires the error vant, overtaken by the pressures done? Of course not! The LSB EPERM if you attempt to unlink of distribution vendors to have workgroup has a new chair, Ian a directory, while the LSB their product be the de facto Murdock (the Ian of ). A requires this error to be EISDIR. standard in the absence of a new subgroup is developing a A document describing these good de jure base. desktop specification, with an differences is now available from My work with the LSB over the increased focus on libraries ISO as Technical Report 24715. past year has not just been as needed by desktop applications During the revision of POSIX the technical editor of the ISO such as GTK, Qt, PNG, XML, this year, and as a part of any standard, although this has been more X, imaging, etc. future LSB development work, a major part of my work. I have And with the pace of develop- we will review these changes to also been one of the principal ment in the open source com- see if there is any way that technical editors of the specifi- munity, it is necessary to contin- either specification can accom- cation as a whole. With the ually revise the specification to modate the behavior of the completion of the submission of match current practice. For other in some deterministic the initial core specification to example, until recently the plug- fashion. ISO, the sources of funding for gable authentication modules this critical project have largely A well-supported standard for dried up. I end with a plea: if (PAM library) had no symbol Linux is a necessary component versioning, but the upstream your organization believes that of Linux’s continued success. standards for UNIX, Linux, and maintainers have now decided Without a commonly adopted to add that (which makes main- C are important, consider donat- standard, Linux will fragment, ing money to USENIX to help taining an ABI possible). The thus proving costly for ISVs to LSB now has to be updated to fund the development and port their applications to the maintenance of these standards. discuss which version of which operating system and making it

ANNUAL MEETING OF THE USENIX BOARD OF DIRECTORS

The Annual Meeting of the USENIX Board of Directors will take place at the Boston Marriott Copley Place during the week of the 2006 USENIX Annual Technical Conference, May 30–June 3, 2006. The exact loca- tion and time will be announced on the USENIX Web site.

;LOGIN: APRI L 2006 USENIX STANDARDS ACTIVITIES 75