#ifdef Considered Harmful, or Portability Experience With C News

Henry Spencer - Zoology Computer Systems, University of Toronto
Geoff Collyer - Software Tool & Die

ABSTRACT

We believe that a C programmer's impulse to use #ifdef in an attempt at portability is usually a mistake. Portability is generally the result of advance planning rather than trench warfare involving #ifdef. In the course of developing C News on different systems, we evolved various tactics for dealing with differences among systems without producing a welter of #ifdefs at points of difference. We discuss the alternatives to, and occasional proper use of, #ifdef.

Introduction

With UNIX running on many different computers, vaguely UNIX-like systems running on still more, and C running on practically everything, many people are suddenly finding it necessary to port C software from one machine to another. When differences among systems cause trouble, the usual first impulse is to write two different versions of the code, one per system, and use #ifdef to choose the appropriate one. This is usually a mistake.

Simple use of #ifdef works acceptably well when differences are localized and only two versions are present. Unfortunately, as software using this approach is ported to more and more systems, the #ifdefs proliferate, nest, and interlock. After a while, the result is usually an unreadable, unmaintainable mess. Portability without tears requires better advance planning.

When we wrote C News [Coll87a], we put a high priority on portability, since we ran several different systems ourselves, and expected that the software would eventually be used on many more. Planning for future adaptations saved us (and others) from trying to force changes into an uncooperative structure when we later encountered new systems. Porting C News generally involves writing a few small primitives. There have been surprises, but in the course of maintaining and improving the code and its portability, we insisted that the software remain readable and fixable. And we were not prepared to sacrifice performance, since one of C News's major virtues is that it is far faster than older news software. We evolved several tactics that should be widely applicable.

The Nature of the Problem

Consider what happens when #ifdef is used carelessly. The first #ifdef probably doesn't cause much trouble. Unfortunately, they breed. Worse, they nest, and tend to become more deeply nested with time. #ifdefs pile on top of #ifdefs as portability problems are repeatedly worked around rather than solved. The result is a tangled and often impenetrable web. Here's a noteworthy example from a popular newsreader.¹ See Figure 1. Observe that, not content with merely nesting #ifdefs, the author has #ifdef and ordinary if statements (plus the mysterious IF macros) interweaving. This makes the structure almost impossible to follow without going over it repeatedly, one case at a time.

¹ To quote from the old UNIX kernel: "you are not expected to understand this".

Furthermore, given worst case elaboration and nesting (each #ifdef always has a matching #else), the number of alternative code paths doubles with each extra level of #ifdef. By the time the depth reaches 5 (not at all rare in the work of #ifdef enthusiasts), there are potentially 32 alternate code paths to consider. How many of those paths have been tested? Probably two or three. How many of the possible combinations even make sense? Often not very many. Figure 2 is another wonderful example, the Leaning Tower Of Hostnames. It's most unlikely that anyone understands this code any more.

In such situations, maintenance is reduced to hit-or-miss patching. If you find and fix a bug, how many other branches does it need to be fixed on? If you discover a performance bottleneck and work out a way to fix it, will you have to apply the fix separately to each branch? Now envision what happens when hurried or careless maintainers don't apply their fixes in all the places where they are relevant.

Philosophical Aspects

The key step in avoiding such messes is to realize that portability requires planning. There is an abundance of bad examples to show that portability cannot be added onto or patched into unportable software. Many of the problems we discuss stem from the "never mind good, we want it next week"

Summer '92 USENIX - June 8-June 12, 1992 - San Antonio, TX

approach to software.

Even the best planning cannot anticipate all problems, but it is important to retain the emphasis on planning even into ongoing maintenance. When a new portability problem surfaces, it is important to step back and think about the problem and its solution. Is this a unique problem, or the harbinger of a whole new class of them? Usually it's the latter, which makes planning all the more crucial: how can the solution deal with all of them, not just the current one? Failure to think leads to the patch-upon-patch approach to portability, rapidly producing unreadable and unmaintainable code.

Once the problem (class) and the solution are understood, then and only then it is time to start work on the code. Typically this will mean re-implementing parts of it, not just hacking up the old code to work somehow. This highlights another issue: to revise the code, you must understand it... and that means not making an incomprehensible mess this time to interfere with maintenance next time.

All of this is typically more work than just hacking in a quick fix. Sometimes a quick fix may be necessary, or later thought may show that an earlier "solution" was really a quick fix and needs generalizing. In such cases, it is important to go back and fix the kludges. The time is not wasted; it is an investment in the future.

More generally, portability requires time and thought. Nobody gets everything right the first time; getting the code right means taking the time to think about what went wrong, decide what the mistakes were, and go back and fix them.

The alert reader may notice that almost all the remarks in this section could also be applied to achieving high performance, high reliability, etc., and that no specific boundary between development and maintenance was mentioned. We've really discussed how to achieve high-quality software. In our experience, this approach works; we can't imagine any other that would.

Portable Interfaces

Systems do, unfortunately, differ. It's often possible to avoid system-dependent areas well enough that the same code will run on all systems; we'll discuss that later. But sometimes multiple variants are inevitable. Even within the UNIX family, there are significant variations between systems.

[Code listing for Figure 1 is not recoverable from this copy: a newsreader routine for dealing with bogus newsgroups, in which #ifdef VERBOSE and #ifdef TERSE blocks are interleaved with ordinary if statements.]

Figure 1: Example of overuse of #ifdef

#ifdef, or something similar, ultimately is unavoidable. It can be managed, however, to minimize problems.

Among the basic principles of good software engineering are clean interfaces and information hiding: when faced with a decision that might change, hide it in one module, with a simple outside-world interface defined independently of exactly how the decision is made inside. One would think that well-educated modern programmers would not need to be taught the virtues of this technique. Unfortunately, #ifdef doesn't hide anything, and the interface it creates is arbitrarily complex and almost never documented.

The best method of managing system-specific variants is to follow those same basic principles: define a portable interface to suitably-chosen primitives, and then implement different variants of the primitives for different systems. The well-defined interface is the important part: the bulk of the software, including most of the complexity, can be written as a single version using that interface, and can be read and understood in portable terms. It is common wisdom² that localizing system dependencies in this way eases porting in cases where the code must actually be rewritten. Our point is that it makes the code simpler, cleaner, and more manageable even when no rewrite is expected.

² Something that is widely known but usually ignored. (UNIX programmer's definition.)

As a small case in point, when part of C News wishes to arrange that a file descriptor associated with a stdio stream be closed at exec time, to avoid passing it to unprepared children, this is done by

    fclsexec(fp);

(where fp is the stdio structure pointer) rather than by some complex invocation of ioctl or something similar. Only the implementation of fclsexec needs to be cluttered with the details. (As others have noted in the past [ODel87a, Spen88a] in other contexts, one paradoxical problem of UNIX's not-too-complex system interfaces is that they have discouraged the development of libraries with cleaner, higher-level interfaces.)

This confines #ifdef, but at first glance doesn't seem to eliminate it. Sometimes several system-specific primitives will be compiled from the same source, with portions selected by #ifdef. Note that even limiting the damage can be very important. However, in our experience, it's much more usual for the different variants to be completely different code, compiled from different source files; in essence, the parts outside the #ifdef disappear. The individual source files are generally small and comprehensible, since they implement only the primitives and are uncluttered with the complexities of the main-line logic. Out of 50 such source files in C

    /* name of this site */
    #ifdef GETHOSTNAME
    char *hostname;
    #   undef SITENAME
    #   define SITENAME hostname
    #else /* !GETHOSTNAME */
    #   ifdef DOUNAME
    #       include <sys/utsname.h>
    struct utsname utsn;
    #       undef SITENAME
    #       define SITENAME utsn.nodename
    #   else /* !DOUNAME */
    #       ifdef PHOSTNAME
    char *hostname;
    #           undef SITENAME
    #           define SITENAME hostname
    #       else /* !PHOSTNAME */
    #           ifdef WHOAMI
    #               undef SITENAME
    #               define SITENAME sysname
    #           endif /* WHOAMI */
    #       endif /* PHOSTNAME */
    #   endif /* DOUNAME */
    #endif /* GETHOSTNAME */

Figure 2: The Leaning Tower of Hostnames
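
By way of contrast with Figure 2, the main-line code could call a single primitive and leave the choice of mechanism to whichever small source file is compiled in. What follows is a minimal sketch of that idea, not the C News code: the function name sitename and the file name in the comment are hypothetical, and the variant shown assumes a system that provides POSIX uname().

```c
/*
 * sitename_uname.c (hypothetical file name): one variant of a
 * hypothetical sitename() primitive, for systems with uname().
 * A system lacking uname() would compile a different small file
 * defining the same function; callers never see the difference.
 */
#include <stddef.h>
#include <sys/utsname.h>

/* Return this site's name, or "unknown" if the system won't say. */
const char *
sitename(void)
{
    static struct utsname uts;  /* static: the result outlives the call */

    if (uname(&uts) < 0)
        return "unknown";
    return uts.nodename;
}
```

Under these assumptions the build procedure, rather than the preprocessor, selects the variant, so no caller needs to mention GETHOSTNAME, DOUNAME, or their relatives.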

News, half are less than 25 lines, most are under 50, and only a few are over 100. As an example, Figure 3 and Figure 4 are two implementations of fclsexec. There is hardly anything to be gained by trying to combine these two files into one file with #ifdefs every second line.

There are, of course, things that cannot conveniently be encapsulated as functions, for reasons of either interface or efficiency. But a "primitive" is not necessarily a function. Types and macros defined in a header file are also useful ways of hiding system-specific detail. Programmers often use such facilities on a small scale, e.g. the use of off_t as the system-supplied type for a size of a file or an offset within it, but they don't write such header files nearly as often as they should.

Although C's limited macro facilities hamper large-scale use of header-file encapsulation, more ambitious applications can be useful despite occasional clumsiness. As an example, consider our STRCHR primitive, which generates in-line code except on machines with compilers clever enough to do so automatically (see Figure 5). This is a bit awkward: what is being defined here is not exactly a function, but C preprocessor macros nevertheless force it to look like one. In the absence of a standard way to force inline expansion of normal functions, it remains a powerful technique for portable performance engineering despite its flaws: this and similar portable optimizations sped up major

    /*
     * set close on exec (on UNIX)
     */

    #include <stdio.h>
    #include <sgtty.h>

    void
    fclsexec(fp)
    FILE *fp;
    {
        (void) ioctl(fileno(fp), FIOCLEX, (struct sgttyb *)NULL);
    }

Figure 3: One implementation of fclsexec

    /*
     * set close on exec (on System V)
     */

    #include <stdio.h>
    #include <fcntl.h>

    void
    fclsexec(fp)
    FILE *fp;
    {
        (void) fcntl(fileno(fp), F_SETFD, 1);
    }

Figure 4: Another implementation of fclsexec

    #ifdef FASTSTRCHR
    #define STRCHR(src, chr, dest) (dest) = strchr(src, chr)
    #else
    #define STRCHR(src, chr, dest) \
        for ((dest) = (src); *(dest) != '\0' && *(dest) != (chr); ++(dest)) \
            ; \
        if (*(dest) == '\0') \
            (dest) = NULL    /* N.B.: missing semi-colon */
    #endif

Figure 5: To inline or not to inline

components of C News by 40% without serious loss of clarity.

If one must use #ifdef, and it cannot be confined to header files and the like, one good rule of thumb is: use #ifdef only in declarations (where "declarations" is understood to include macro definitions). This at least encourages some thought about defining an interface, rather than just hacking in something that somehow seems to work.

Finally, when defining interfaces, it is important to document them. The biggest reason for doing this is that it is important discipline that forces you to think about the issues and fill in fuzzy spots. The resulting documentation is also very valuable for maintenance. Perhaps somewhat surprisingly, it's also valuable for development, even if the project is not an army-of-ants operation using buildings full of people. We found it very important to document crucial interfaces like our configuration primitives, even though only two people were involved, to make sure things were being done consistently and we understood each other.³

³ Indeed, places where internal interfaces weren't completely documented were fruitful sources of misunderstandings.

Standard Interfaces

Of course, good interface design is not simple, especially given the limitations of existing programming languages. Often the best way to solve this problem is to avoid it instead. If an interface is needed, there is much to be said for choosing one that is already standard.

There are several sources of reasonably decent standard interfaces, notably ANSI C [Inst89a] and POSIX 1003.1 [Engi90a]. Since these standards are quite recent, many of the systems of interest do not implement them fully. This doesn't preclude using the interfaces, however: you can supply your own implementation(s) for use on outdated systems. An example is the ANSI function strerror (shown in Figure 6).

This approach does impose a few constraints, since the standard interfaces sometimes are a bit ugly, and often aren't ideal for every program. It's tempting to come up with customized ones instead. But the standard ones have major advantages. For one thing, people understand (or will understand) them without having to decipher your code. For another, on systems which do implement the standard interfaces, the system-provided ones can be used. (This is particularly significant for primitives like memcpy, where system-specific tuning can produce major improvements in efficiency [Spen88a]. If you define your own customized interface, you must do your own customized implementation, which denies you the opportunity to benefit from the work of others.) For a third, while the standard interfaces may not be ideal, by and large they contain no grievous mistakes, and avoiding disasters is usually more important than achieving a precisely optimal solution. Finally, a standard interface saves endless puzzling, not to mention uncomplimentary speculation, by later maintainers: "did he have some deep subtle reason for using a non-standard interface, or was he just stupid?".

Reimplementing a standard

    /*
     * strerror - map error number to descriptive string
     *
     * This version is obviously somewhat UNIX-specific.
     */
    char *
    strerror(errnum)
    int errnum;
    {
        extern int sys_nerr;
        extern char *sys_errlist[];

        if (errnum > 0 && errnum < sys_nerr)
            return(sys_errlist[errnum]);
        else if (errnum != 0)
            return("unknown error");
        else
            return("no details given");
    }

Figure 6: strerror

poorly. A version which is faster but compatible can solve performance problems while leaving the door open to the possibility that the system implementations will improve someday. The stdio library is a particular case in point: old implementations of functions like fgets and fread are extremely inefficient, and even modern ones often can be improved on. This particular case gets tricky, because doing better means relying on ill-documented and somewhat variable internal interfaces,⁴ but the performance wins for C News are so massive that we nevertheless did it.

⁴ The standard build procedure for C News runs a test program to check compatibility with the local stdio implementation.

Pitfalls that need careful attention when using standard interfaces are error checking and boundary conditions. It is important not to make assumptions that aren't in the standard. For example, a depressing amount of UNIX software assumes that close never returns any interesting status. Unfortunately, as networked file systems get more common and other complications are introduced, it is not at all unthinkable for an I/O error to be discovered only at close time. Meticulous error checking is important [Darw85a]. For example, see Figure 7.

Finally, note that standard interfaces exist on more than just the C level. By including an "override" directory early in the shell's search path, it becomes trivial to substitute reimplemented programs for standard ones that are missing or defective. We have a remarkably large (and steadily growing) list of known portability problems that arise from defective implementations of standard UNIX programs.⁵

⁵ Lest it seem that we are disparaging porters and resellers only, we should comment that AT&T is as guilty as anyone else. For example, several releases of System V make have violated the System V Interface Definition in their handling of command lines like "test -s file" in makefiles. (Makefile command lines are specified to be executed just as if by the shell, but if test is a shell built-in and there is no actual program by that name, make often chokes and dies on this line.)

Inside-Out Interfaces

Sometimes there simply isn't any way to provide a necessary primitive on some systems. For example, most modern UNIXes permit setting the real userID to equal the effective userID, but some old systems allow only root to change the real IDs... and it is necessary to change the real IDs to create directories with proper ownerships. Given that many people will be⁶ reluctant to let a large and complex program written by a stranger run as root, there doesn't seem to be any easy way out.

⁶ Or should be!!!

In this case, there is: turn the interface inside out, and have the dirty work done by caller rather than callee. Specifically, have the complex program invoked by a simple setuid-root program which sets things up properly on uncooperative systems.

    /*
     * nfclose(stream) - flush the stream, fsync its file descriptor and
     * fclose the stream, checking for errors at all stages.  This dance
     * is needed to work around the lack of UNIX file system semantics
     * in Sun's NFS.  Returns EOF on error.
     */

    #include <stdio.h>

    int
    nfclose(stream)
    register FILE *stream;
    {
        register int ret = 0;

        if (fflush(stream) == EOF)
            ret = EOF;
        if (fsync(fileno(stream)) < 0)    /* may get delayed error here */
            ret = EOF;
        if (fclose(stream) == EOF)
            ret = EOF;
        return ret;
    }

Figure 7: Necessary error checking
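
The discipline in Figure 7 only pays off if callers also act on the result. The following is a minimal caller-side sketch under stated assumptions: save_text is a hypothetical helper (not part of C News), it uses ANSI prototypes rather than the old-style definitions of the figures, and it assumes a POSIX system that supplies fileno() and fsync().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L     /* ask for fileno() and fsync() */
#endif

#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * save_text (hypothetical helper): write text to path and report
 * failure from any stage, including errors that surface only when
 * the data is flushed, synced, or closed.
 * Returns 0 on success, -1 on any error.
 */
int
save_text(const char *path, const char *text)
{
    FILE *fp;
    int status = 0;

    assert(path != NULL && text != NULL);

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fputs(text, fp) == EOF)
        status = -1;

    /* Same sequence as Figure 7; keep going so the stream is always closed. */
    if (fflush(fp) == EOF)
        status = -1;
    if (fsync(fileno(fp)) < 0)
        status = -1;
    if (fclose(fp) == EOF)
        status = -1;

    return status;
}
```

A caller would treat a nonzero return as "the data may not be on disk", whichever of the four steps reported the problem.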

A more mundane example is the problem of reading directories. Thanks to the lack of a library package for directory-reading in the oldest UNIXes, there isn't any standard way to do it. Raw reads don't work on 4.2+BSD systems (and increasingly-many others), the Berkeley directory library works well but has stupid name clashes with many old systems, and the POSIX library isn't widespread yet. Worse, because the insides of a directory-reading library are system-specific, it's difficult to provide a portable reimplementation of the POSIX functions.

The simplest way around this one is to move the problem out to a higher level of abstraction. The ls utility does the job, portably, so wrap the invocation of the program in a shell file, with the list of names generated by ls and fed into the application as arguments or on standard input. The performance impact is rarely significant, and the alternative currently involves at least six different variants of the code, with more surfacing daily.

A less happy example of this technique is C News's spacefor program, used to check disk space so activity can be curtailed when it runs short. Its interface is simple and clean, and it is used everywhere in C News. Making it a shell program offered the possibility of exploiting the df command, which encapsulates the ugly complications of finding out how much space is available (and, sometimes, the root privileges needed to do so). Unfortunately, df is often relatively costly to invoke; worse, the only portable way to do 32-bit arithmetic from shell scripts is to use awk, which likewise tends to have considerable startup overhead. With some care, the performance impact was tolerable, although not entirely pleasant.

What we had not anticipated was that every little UNIX variant has its own different, incompatible df output format. Even "consider it standard" System V has at least three. The importance of program output being useful as program input [Ritc78a] has been disregarded completely. In the end, we found that while the df version remains useful (people with really odd systems can customize it easily), it was best to also provide C variants that use the three or four commonest space-determining system calls, improving (!) portability within a fairly large subset of UNIX variants.

Levels of Abstraction

In general, avoiding problems is better than solving them. The best way to solve portability problems is not to get involved with them. Sometimes they can't be avoided, but often a bit of ingenuity suffices to find a way around them.

The most powerful way of avoiding problems is to choose a level of abstraction where they don't show up. The ls example earlier was a case in point. The standard UNIX shell is a very powerful programming language, sufficiently removed from the lower levels of the system that shell programs are often highly portable. (Gratuitous differences in utility programs do get in the way, as do attempts to "improve" the shell that result in subtle or not-so-subtle incompatibilities, but this is usually a manageable problem.)

The usual objection to shell programming is the inefficiency of the result, but a careful division of labor between the shell and the programs it invokes is all that is needed. Most of the C News batching subsystem is written in shell, but it remains highly efficient, because most of its time is spent in the "batcher | compress | uux" pipeline, and those are all C programs.

Intermediate levels of abstraction, although harder to find, do exist. Substantial pieces of C News are coded in awk [Spen91a] where efficiency is not crucial and requirements permit.

One situation where high-level abstractions are particularly beneficial is when one must step outside "common base UNIX". Common base UNIX is essentially Version 7 [Labo82a], though the later V7 innovations have taken a while to find their way into System V (and some have never done so). POSIX 1003.1 [Engi90a] is mostly an attempt to codify common base UNIX. Unfortunately, common base UNIX did not address some issues at all, notably dealing with real-time networks like the Internet. Attempts to define interfaces for real-time networks [Divi83a, ATT86a] have generally resulted in complex and ugly messes.⁷ Worse, there is no consensus on which one to use, and the quality of the designs can be judged by the rate at which they are being redesigned to deal with unexpected problems. Although higher-level abstractions for networking are not as common or as well-designed as they should be, networked file systems and shell invocation of programs like rsh can provide limited networking functionality without having to deal with the underlying mess.

⁷ There are exceptions like V10 Research UNIX [Cent90a] that are useful sources of interface ideas.

A side benefit of high-level abstractions is that the resulting programs are generally far easier to modify and customize. This is a particularly important consideration for software intended to be run on many systems with varying administrative policies. Many system administrators who are not up to deciphering a 5000-line C program can cope quite well with modifying a 50-line shell script. We have made a conscious effort to put policy decisions in shell scripts, not in C code, wherever possible, and have had extensive and loud positive feedback on this.
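
To make the ls technique from the directory-reading discussion concrete, here is a minimal sketch of the C side of such a division of labor: the program never opens a directory, it simply consumes names, one per line, from a stream that a shell wrapper would connect to ls. The names for_each_name, read_name, and add_length are hypothetical and this is not the C News code; the reader grows its buffer as needed rather than imposing a fixed line length.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Read one newline-terminated name of any length from in.
 * Returns a malloc'd string (caller frees), or NULL at end of
 * input or if memory runs out.
 */
static char *
read_name(FILE *in)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (len + 1 >= cap) {           /* keep room for the '\0' */
            char *bigger = realloc(buf, cap * 2);

            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) {         /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/*
 * for_each_name (hypothetical): call fn once per name read from in,
 * e.g. the output of ls.  Returns the number of names processed.
 */
long
for_each_name(FILE *in, void (*fn)(const char *name, void *arg), void *arg)
{
    long count = 0;
    char *name;

    assert(in != NULL && fn != NULL);
    while ((name = read_name(in)) != NULL) {
        fn(name, arg);
        free(name);
        count++;
    }
    return count;
}

/* Sample callback: add up the lengths of the names seen. */
void
add_length(const char *name, void *arg)
{
    *(size_t *)arg += strlen(name);
}
```

A shell wrapper along the lines of "ls somedir | program" would then supply the names on standard input. One limitation of this sketch is that a file name containing a newline cannot be represented in the one-name-per-line format.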

There is one negative aspect to moving to a higher level of abstraction: the resulting programs depend on a larger and perhaps more fragile set of underlying abstractions. Porting C News to a radically non-UNIX-like operating system reportedly typically involves little change to the C code, since the

    #ifdef SYSLOG
    #ifdef BSD_42
        openlog("nntpxfer", LOG_PID);
    #else
        openlog("nntpxfer", LOG_PID, SYSLOG);
    #endif
    #endif

    #ifdef DBM
        if (dbminit(HISTORY_FILE) < 0) {
    #ifdef SYSLOG
            syslog(LOG_ERR, "couldn't open history file: %m");
    #else
            perror("nntpxfer: couldn't open history file");
    #endif
            exit(1);
        }
    #endif
    #ifdef NDBM
        if ((db = dbm_open(HISTORY_FILE, O_RDONLY, 0)) == NULL) {
    #ifdef SYSLOG
            syslog(LOG_ERR, "couldn't open history file: %m");
    #else
            perror("nntpxfer: couldn't open history file");
    #endif
            exit(1);
        }
    #endif
        if ((server = get_tcp_conn(argv[1], "nntp")) < 0) {
    #ifdef SYSLOG
            syslog(LOG_ERR, "could not open socket: %m");
    #else
            perror("nntpxfer: could not open socket");
    #endif
            exit(1);
        }
        if ((rd_fp = fdopen(server, "r")) == (FILE *) 0) {
    #ifdef SYSLOG
            syslog(LOG_ERR, "could not fdopen socket: %m");
    #else
            perror("nntpxfer: could not fdopen socket");
    #endif
            exit(1);
        }

    #ifdef SYSLOG
        syslog(LOG_DEBUG, "connected to nntp server at %s", argv[1]);
    #endif
    #ifdef DEBUG
        printf("connected to nntp server at %s\n", argv[1]);
    #endif
        /*
         * ok, at this point we're connected to the nntp daemon
         * at the distant host.
         */

Figure 8: A truly awful style
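
For comparison with Figure 8, here is a minimal sketch of the same kind of error reporting hidden behind a small interface, so that no call site needs an #ifdef. It is an illustration under stated assumptions, not code from C News or from the program in the figure: the names log_open, log_error, and open_history are hypothetical, and only a stderr-based variant is shown. A syslog-based variant would be a second small source file defining the same two functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* ---- interface (would live in a header; names are hypothetical) ---- */
void log_open(const char *progname);
void log_error(const char *what);   /* reports what plus the errno text */

/* ---- one variant: report on stderr, for systems without syslog ---- */
static const char *log_prog = "unknown";

void
log_open(const char *progname)
{
    assert(progname != NULL);
    log_prog = progname;
}

void
log_error(const char *what)
{
    int saved = errno;      /* capture before any other call can change it */

    assert(what != NULL);
    fprintf(stderr, "%s: %s: %s\n", log_prog, what, strerror(saved));
}

/* ---- main-line code: no #ifdef at any call site ---- */
int
open_history(const char *path)
{
    FILE *fp = fopen(path, "r");

    if (fp == NULL) {
        log_error("couldn't open history file");
        return -1;
    }
    (void) fclose(fp);
    return 0;
}
```

With this arrangement, each of the SYSLOG tests in Figure 8 would collapse into a single log_error call, and the choice of reporting mechanism is made once, when the program is built.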

UNIX and C programming interfaces are widespread even on non-UNIX systems, but substantial shell files relying on dozens of major UNIX utilities are more of a challenge. There is also the problem, mentioned earlier, of UNIX suppliers breaking formerly-working utilities.

Low-Level Portability

We assume that everyone reading this has had exposure to elementary notions of portability like using typedef names, avoiding stupid assumptions about the sizes of integers and/or pointers, being careful about byte order in interchange formats, etc. There are nevertheless a good many fine points that deserve some illumination, particularly in the area of how to use #ifdef safely.

As mentioned earlier, if #ifdef is needed at all, it is best confined to declarations, to try to preserve some explicit notion of interfaces. Such declarations, in turn, preferably should be confined to header (.h) files, to minimize the temptation to introduce #ifdef into main-line code.

An optional feature such as debugging assistance or logging can be defined as a macro or function that does nothing when not needed, else the full-blown function can be defined (perhaps in one of several system-specific ways, e.g. using a syslog daemon or not). At worst, this requires one #ifdef per such feature rather than the now-notorious style, seen in various bits of popular software, of clustering #ifdefs at the site of each call of said function(s); see Figure 8.

One awkward area⁸ is functions with variable numbers of arguments. There is no way to write a C macro that can take a variable number of arguments, which makes it awkward to provide such an interface while still being able to hide the innards. Various tricks are in use, none of them entirely satisfactory; perhaps the least objectionable is an extra level of parentheses:

    DEBUG(("oops: %s %d\n", b, c));

which lets a header file decide to either pass or discard the whole argument list:

    #ifdef NDEBUG
    # define DEBUG(list) /* nothing */
    #else
    # define DEBUG(list) printf list
    #endif

⁸ Actually, it's awkward in a great many ways, this being only one.

A related problem is that definition of a variable-arguments function pretty well invariably involves some #ifdefing to cope with the unfortunate differences between ANSI C stdarg.h and the traditional (although less portable) varargs.h.

Although macros cannot take variable numbers of arguments, it is still possible to have them pick and choose among a fixed number of arguments. For example, the VERBOSE-TERSE business in one of our first exhibits, an attempt to avoid compiling in unneeded strings, can be handled with a macro:

    MSG(short_form, long_form, iostream)

A short-form-only definition of the macro simply doesn't use the long_form argument. The choice can even be made at run time using if or the '?' operator, all by changing only the definition of the macro.

One valid use of #ifdef, particularly in header files, is the idiom

    #ifndef COPYSIZE
    #define COPYSIZE 8192    /* unit of copying */
    #endif

to supply a default value that can be overridden at compilation time (with cc -DCOPYSIZE=4096). One could wish for a shorter form (e.g., #ifndefdef), or even a compiler option allowing one to specify a value that overrides the first one defined in the program, since this idiom is common and very useful.

However, the first question to ask about such numeric parameters is whether they should be there at all. Consider:

    #ifdef pdp11
    #define LBUFLEN 512     /* line buffer length */
    #else
    #define LBUFLEN 1024    /* line buffer length */
    #endif

This code presumes that people on small machines (or at least PDP-11s) prefer their programs to crash earlier than people on large machines. Any code using such (unchecked) fixed-sized buffers is prone to falling over and dying (or at best mysteriously truncating or wrapping long lines) anyway; the #ifdefs tip us off that these limits should be abolished and replaced with code that deals with dynamically-sized strings.

Another legitimate use of #ifdef, in fact required by the ANSI C standard in standard header files, is in protecting header files against multiple inclusion. In complex programs it can be quite difficult to ensure that a needed header file is included once and only once, and including it more than once typically causes problems with duplicate typedefs, structure tags, etc. Ignoring some issues of name-space control, the usual idiom for defending header files against multiple inclusion goes something like this:

    #ifndef FOO_H
    #define FOO_H 1
    /* interface to the foo module */

#ifdef Considered Harmful ... Spencer & Collyer, Summer '92 USENIX - June 8-June 12, 1992 - San Antonio, TX

	typedef struct {
		char *foo_a;
		char *foo_b;
	} foo;
	extern foo *mkfoo( );
	extern int rmfoo( );
	#endif

(Some compiler implementors have invented bizarre special-purpose constructs, typically using ANSI C's #pragma, to avoid having the compiler re-scan the header file on later inclusions. That is not necessary. It suffices to have the compiler remember that the entire text of the file is inside the #ifndef, and hence need not be rescanned if FOO_H is still defined.)

	#ifdef vax
		f(*ptr);
	#endif
	#ifdef pyr
		/*
		 * darned Pyramid is so picky
		 * about null pointers
		 */
		if (ptr != NULL)
			f(*ptr);
	#endif
	#ifdef sparc
		/* the Sun 4 is just as bad! */
		if (ptr != NULL)
			f(*ptr);
	#endif
		/* ... */

	Figure 9: Protecting broken code

#ifdef is often used to protect broken code in the style shown in Figure 9. The solution here is to face realities and write the code in a correct and portable manner:

	/* avoid dereferencing null */
	if (ptr != NULL)
		f(*ptr);

A related point, also illustrated by that example, is that if one must use #ifdef, one should test for specific features or characteristics (typically indicated to the compiler by symbols defined in a header file or on a command line), not for specific machines. There will almost always be another machine with the same problem. Consider the interesting bit of code shown in Figure 10. Rather mysterious, isn't it? What is so odd about Crays, and is it only Crays that are affected?

If testing for particular machines is unavoidable, perhaps because of some highly machine-specific operation, consider what happens if no machine is specified (or if the machine is one you've never heard of and hence didn't bother to list). Don't assume there is a default machine. It is much kinder to produce a syntax error than silently inappropriate code.

Occurrences of #include inside #ifdef should always be viewed with suspicion. There are better ways. Consider:

	#ifdef NDIR
	#ifdef M_XENIX
	#include
	#else
	#include
	#endif
	#else
	#include
	#endif
	#ifdef USG
	#include
	#else
	#include
	#endif

This clutter could be avoided via judicious use of cc -I/usr/include/sys and consistent use of dirent.h, providing a fake one if necessary:

	#include
	#define dirent direct

Arranging, typically via a makefile, to put an "override" directory in the search path for header files is a tremendously powerful way of fixing botches in a site's header files without #ifdef.

When one uses #ifdef, one should base the tests on individual features:

	#include <signal.h>	/* may define SIGTSTP */

	#ifdef SIGTSTP
		(void) signal(SIGTSTP, SIG_IGN);	/* no suspension, thanks */
	#endif

and not on (inaccurate) generalisations:

	#ifdef cray
		while ([...] != '\r') ;	/* till a newline (not echoed) */
	#else
		while ([...] != '\n') ;	/* till a newline (not echoed) */
	#endif

	Figure 10: Mysterious code


	#ifndef SYSV
		(void) signal(SIGTSTP, SIG_IGN);	/* no suspension, thanks */
	#endif

or this example of the reverse problem (generalising from the specific) from a newsreader:

	/* Things we can figure out */
	#ifdef SIGTSTP
	# define BERKELEY	/* include job control signals? */
	#endif

This particular point is worth emphasizing: the UNIX world is not cleanly split into System V and 4BSD camps, particularly with the advent of System V Release 4. Hybrid UNIXes are the rule, not the exception, nowadays.

Pragmatic Aspects of Portability

In practice, one encounters all manner of breakage in vendor-supplied system software: compilers, utilities (notably the shell and awk), libraries, kernels. Optimizers may need to be turned off if they are broken.[9] Installers may have to pick up working commands from other sources (e.g. the group comp.sources.unix or the GNU [Founa] project). Sometimes it is worth supplying simple but correct versions of small things (e.g. library functions) when a large class of machines is known to have broken ones. We ultimately decided that we could not provide complete replacements, or even workarounds, for all potentially-broken system software. Sometimes the problems are horrific enough that the right response is not to contort one's code but to get the customers to complain about the breakage until it is fixed.

[9] The single most frequently reported "bug" in C News is actually a bug in a popular 386 C compiler's optimizer.

Given all these potential problems, it is important to detect breakage as well as avoiding it or coping with it. We think very highly of regression tests, prepackaged tests that exercise the basic functionality of the software and check that the results are correct. They are very useful during development, both for bug-hunting in new code[10] and for confidence testing before release.[11] Of equal importance, though, is that they give the installer reasonable confidence that the software is actually working on his system, and that no glaring portability problems have escaped his notice.

[10] One of us (HS) observes: "When I set up a regression test for software that has never had one before, I always find bugs. Always. Every time."

[11] One very useful trick is to add a regression-test item looking for each bug that is found. This avoids the classic syndrome of having "fixed" bugs reappear in a later release.

Similarly, internal consistency checks, such as validated magic numbers in structures passed between user code and libraries, can save one's sanity by detecting breakage in system software early, before corruption spreads everywhere. Trying to debug a core dump by mail on an unfamiliar machine is not fun.

To a greater extent than we had anticipated, one learns about portability by porting. The system call variations among UNIX systems are fairly well documented and understood. The variations in commands were less well understood, at least by us, and the variations in programming environments were still more surprising. There is no substitute for trying your software on several seriously-different[12] machines before release. It's also worth making an effort to pick your beta-testers for maximum diversity of environments: we found a lot of unexpected problems that way.

[12] In our university environment, it was quite difficult to find System V machines. When we actually tried one, not long before our first real release, there were some unpleasant surprises.

Finally, a plea: if you find portability problems, document them. You can't expect everyone to actually read the documentation (we frequently respond to queries with "please read section so-and-so in document such-and-such, it'll tell you all about it"), but the more careful and conscientious installers benefit greatly from an advance look at known problems, especially when a truly weird system is involved.

Configuration

Given the senseless diversity in existing systems, some way to configure software for a new system is needed. Given that #ifdef can't do the whole job, how should we proceed? C News currently has an interactive build script that interrogates the installer about his system and then constructs a few shell scripts, which when run will use make to build the software. We intend to push most of the shell scripts into the makefiles, so that casual use of make works as people expect,[13] but the general approach seems to be a good one: ask which emulation routines and header files are necessary, rather than trying to guess. This strategy even allows cross-configuration and some degree of cross-compilation, which autoconfiguration schemes generally don't. It is also more trustworthy than autoconfiguration schemes, which can be fooled by some new innovation.

[13] The main reason for not doing this from the start was the lack of a standard #include mechanism in make.

Almost all of build's configuration questions[14] turn into choices of files rather than values for #ifdef

to examine. The few exceptions are mostly historical relics, and will be revised or deleted as time permits.

[14] [...] compilation at all; some questions are decisions affecting setup of control files for the compiled software to use.

Statistics

A snapshot of current C News working sources shows 955 lines of header files and 19,762 lines of C files, split between 5,640 lines from libraries (including alternate versions of primitives), and 14,122 lines of mainline C code. Here is a breakdown of the #ifdef usage in that code:

	reason      .h   main .c   dbz   rna   total
	ifndefdef   13      40       6     0      59
	comment      4      21       0     0      25
	config.      6      25      19     7      57
	protect.h    5       0       0     0       5
	__STDC__     3       3       1     0       7
	pdp11        2       0       0     0       2
	lint         1       1       2     0       4
	sccsid       0       1       0     0       1
	STATS        0       5       0     0       5
	other        0       1       0     0       1
	total       34      97      28     7     166

The .h column represents header files. The main .c column represents all .c files other than those in the dbz and rna (Australian readnews) directories. The ifndefdef row represents the 'if not defined, define' idiom. The comment row represents uses of #ifdef to comment out obsolete, futuristic or otherwise unwanted code. The config. row represents uses of #ifdef to configure the software.

rna is presented separately because we inherited it rather than writing it. dbz is presented separately because it uses #ifdef heavily for configuration, for backward compatibility and to attempt to stand independently of C News. The main C files' use of #ifdef for "configuration" is misleading; in fact this is mostly vestigial code, superseded but not yet deleted from our current working copies.

Conclusions

Despite problems along the way, C News is outstandingly portable. It comes up easily on an amazing variety of UNIX systems. Other people have reported porting C News relatively easily to environments that we had considered too hostile, or at least too different from UNIX, to even consider as possible target systems: notably VMS, MS-DOS and Amiga DOS. The only major operating system known to present serious obstacles is VM/CMS.

In our experience, #ifdef is usually a bad idea (although we do use it in places). Its legitimate uses are fairly narrow, and it gets abused almost as badly as the notorious goto statement. Like the goto, the #ifdef often degrades modularity and readability (intentionally or not). Given some advance planning, there are better ways to be portable.

Acknowledgements

Thanks to Rob Kolstad for helpful comments on a draft of this paper. Thanks to James Clark for grefer (and the rest of groff). And thanks to the authors of our bad examples; you know who you are.

References

ATT86a. AT&T, System V Interface Definition, 2, 1986.

Cent90a. Computing Science Research Center, AT&T Bell Laboratories, Murray Hill, New Jersey, UNIX Research System Programmer's Manual, Tenth Edition, Saunders College Publishing, 1990.

Coll87a. Geoff Collyer and Henry Spencer, "News Need Not Be Slow," Proc. Winter Usenix Conf. Washington 1987, pp. 181-190, January 1987.

Darw85a. Ian Darwin and Geoff Collyer, "Can't Happen or /* NOTREACHED */ or Real Programs Dump Core," Proc. Winter Usenix Conf. Dallas 1985, pp. 136-157, January 1985.

Divi83a. Computer Science Division, Dept. of E.E. and C.S., UCB, UNIX Programmer's Manual, 4.2 Berkeley Software Distribution, August, 1983.

Engi90a. Institute of Electrical and Electronics Engineers, Portable Operating System Interface (POSIX), Part 1: System Application Program Interface (API) [C Language] (IEEE Std 1003.1-1990) = ISO/IEC 9945-1:1990, IEEE, New York, 1990.

Founa. Free Software Foundation, GNU software, anonymous ftp from prep.ai.mit.edu:/pub/gnu.

Inst89a. American National Standards Institute, X3J11 committee, American National Standards Institute X3.159-1989 - Programming Language C, = ISO/IEC 9899:1990, ANSI, New York, 1989.

Labo82a. Bell Laboratories, UNIX Programmer's Manual, Holt, Rinehart and Winston, 1982.

ODel87a. Mike O'Dell, "UNIX: The World View," Proc. Winter Usenix Conf. Washington 1987, pp. 35-45, January 1987.

Ritc78a. D. M. Ritchie, "UNIX Time-Sharing System: A Retrospective," Bell Sys. Tech. J., vol. 57, no. 6, pp. 1947-1969, 1978. Also in Proc. Hawaii International Conference on Systems Science, Honolulu, Hawaii, Jan. 1977.


Spen88a. Henry Spencer, "How To Steal Code," Proc. Winter Usenix Conf. Dallas 1988, pp. 335-345, January 1988.

Spen91a. Henry Spencer, "Awk As A Major Systems Programming Language," Proc. Winter Usenix Conf. Dallas 1991, pp. 137-143, January 1991.
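As an illustration of the feature-test advice given earlier (test for SIGTSTP itself rather than for SYSV or BERKELEY, and keep the test in one place), the following is a minimal sketch, not code from C News. The function name nosuspend and its return convention are assumptions invented for this example. The single #ifdef lives inside the function body, so call sites need none; the assert is a small internal consistency check of the kind recommended above.

```c
#include <assert.h>
#include <signal.h>	/* may define SIGTSTP */

/*
 * nosuspend (hypothetical name): the one place that knows whether
 * job-control suspension exists on this system.
 * Returns 1 if SIGTSTP is now ignored, 0 if the feature is absent.
 */
int
nosuspend(void)
{
#ifdef SIGTSTP
	void (*old)(int) = signal(SIGTSTP, SIG_IGN);

	/* consistency check: SIGTSTP may be ignored, so this cannot fail */
	assert(old != SIG_ERR);
	return 1;
#else
	return 0;	/* feature absent: nothing to do */
#endif
}
```

A call site is then just `(void) nosuspend();` with no conditional compilation around it, and a system lacking SIGTSTP gets the do-nothing version automatically.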

Author Information

Henry Spencer is head of Zoology Computer Systems at the University of Toronto. He is known for his regular expression and string libraries, and as a co-author of the C News netnews software. Reach him via Canada Post at Zoology Computer Systems, 25 Harbord St., University of Toronto, Toronto, Ont. M5S 1A1 Canada. His electronic mail address is utzoo!henry or henry@zoo.toronto.edu.

Geoff Collyer leads C News development at Software Tool & Die. He is senior author of the C News netnews software. His interests include simple, small, fast, elegant and powerful system software. Reach him via U.S. Mail at Software Tool & Die, 1330 Beacon St. #215, Brookline, MA 02146. His electronic mail address is world!geoff or geoff@world.std.com.
