
Chapter 3: Using the GNU Compiler Collection

In This Chapter

• Features of GNU CC
• Tutorial Example
• Common Command-Line Options
• Optimization Options
• Debugging Options
• Architecture-Specific Options
• GNU Extensions
• PGCC: The Pentium Compiler

GNU CC, more commonly known as GCC, is the GNU project's compiler suite. It compiles programs written in C, C++, and Objective C. GCC speaks the various C dialects, such as ANSI C and traditional (Kernighan and Ritchie) C, fluently. It also compiles Fortran (under the auspices of g77). Front ends for Pascal, Modula-3, Ada 9X, and other languages are in various stages of development. Because GCC is the cornerstone of almost all Linux development, I will discuss it in some depth. The examples in this chapter and throughout the book, unless noted otherwise, are based on GCC version 2.91.66.

Note - If you kick around the Linux development community long enough, you will eventually hear or read about another compiler, egcs, the Experimental (or Enhanced) GNU Compiler Suite. egcs was intended to be a more actively developed and more efficient compiler than GCC. It was based on the GCC code base and closely tracked GCC releases. To make a long story short, in April 1999, the Free Software Foundation, maintainers of GCC, appointed the egcs steering committee as GCC's official maintainers. At the same time, GCC was renamed from the GNU C Compiler to the GNU Compiler Collection. In addition, the egcs and GCC code bases merged, ending a long fork in GCC's code base and incorporating many bug fixes and enhancements. So, egcs and GCC are, for all intents and purposes, the same program.

Features of GNU CC

GCC gives the programmer extensive control over the compilation process. The compilation process includes up to four stages:

1. Preprocessing

2. Compilation Proper

3. Assembly

4. Linking

You can stop the process after any of these stages to examine or use the compiler's output. You can control the amount and type of debugging information, if any, to embed in the resulting binary and, like most compilers, GCC can also perform code optimization. GCC allows you to mix debugging information and optimization. I strongly discourage doing so, however, because optimized code is hard to debug: static variables may vanish or loops may be unrolled, so that the optimized program does not correspond line-for-line with the original source code.

GCC includes over 30 individual warnings and three general warning levels. GCC is also a cross-compiler, so you can develop code on one processor architecture that will be run on another. Finally, GCC sports a long list of extensions to C and C++. Most of these extensions enhance performance, assist the compiler's efforts at code optimization, and make your job as a programmer easier. The price is portability, however. You will look at a few of the most common extensions because you will encounter them in the kernel header files, but I suggest you avoid them in your own code.

Tutorial Example

Before beginning an in-depth look at GCC, a short example will help you start using GCC productively right away. For the purposes of this tutorial, we will use the program in Listing 3.1.

Listing 3.1 Program to Demonstrate GCC Usage

/*
 * hello.c - Canonical "Hello, world!" program
 */
#include <stdio.h>

int main(void)
{
    printf("Hello, Linux programming world!\n");
    return 0;
}

To compile and run this program, type

$ gcc hello.c -o hello
$ ./hello
Hello, Linux programming world!

The first command tells GCC to compile and link the source file hello.c and create an executable named hello, specified using the -o argument. The second command executes the program, resulting in the output shown on the third line.

The whole process is straightforward, but a lot took place under the hood that you did not see. GCC first ran hello.c through the preprocessor, cpp, to expand any macros and insert the contents of #included files. Next, it compiled the preprocessed source code to object code. Finally, the linker, ld, created the hello binary. Figure 3.1 depicts the compilation process graphically.

Figure 3.1 The compilation process.

You can re-create these steps manually, stepping through the compilation process. The first step is to run the preprocessor. To tell GCC to stop compilation after preprocessing, use GCC’s -E option:

$ gcc -E hello.c -o hello.cpp

Examine hello.cpp and you will see that the contents of stdio.h have indeed been inserted into the file, along with other preprocessing tokens. Figure 3.2 shows some of the contents of hello.cpp, starting at line 894.

Note - The exact location of this text may vary slightly on your system.

The next step is to compile hello.cpp to object code. Use GCC’s -c option to accomplish this:

$ gcc -x cpp-output -c hello.cpp -o hello.o

Figure 3.2 hello.c after preprocessing.

In this case, the -o hello.o argument is optional, because by default the compiler creates the object filename by replacing the source file's extension (normally .c) with .o. The -x option tells GCC to begin compilation at the indicated step, in this case, with cpp-output, the preprocessed source code.

How does GCC know how to deal with a particular kind of file? It relies upon file extensions to determine how to process a file correctly. The most common extensions and their interpretation are listed in Table 3.1.

Table 3.1 How GCC Interprets Filename Extensions

Extension    Type
.c           C language source code
.C, .cc      C++ language source code
.i           Preprocessed C source code
.ii          Preprocessed C++ source code
.S, .s       Assembly language source code
.o           Compiled object code
.a, .so      Compiled library code

Linking the object file, finally, creates a binary:

$ gcc hello.o -o hello

Hopefully, you will see that it is far simpler to use the abbreviated syntax I used above, gcc hello.c -o hello. The point of the step-by-step example was to demonstrate how to stop and start compilation at any step, should the need arise. One situation in which you would not want to complete the compilation process is when you are creating libraries. In this case, you only want to create object files, so the final link step is unnecessary. Another circumstance in which you would want to walk through the compilation process is when an #included file introduces conflicts with your own code or another #included file. Stepping through the process makes it clearer which file is introducing the conflict, so that you can identify where the problem occurs and then fix it.

Most C programs consist of multiple source code files, so each source file must be compiled to object code before the final link step. This requirement is easily met. Suppose, for example, that hello.c uses code from helper.c (see Listings 3.2 and 3.3). Listing 3.4 shows the source code for the modified hello program, howdy.c.

Listing 3.2 Helper Code for howdy.c

/*
 * helper.c - Helper code for howdy.c
 */
#include <stdio.h>

void msg(void)
{
    printf("This message sent from Jupiter.\n");
}

Listing 3.3 Header File for helper.c

/*
 * helper.h - Header for helper.c
 */
void msg(void);

Listing 3.4 The Modified hello Program

/*
 * howdy.c - Modified "Hello, World!" program
 */
#include <stdio.h>
#include "helper.h"

int main(void)
{
    printf("Hello, Linux programming world!\n");
    msg();
    return 0;
}

To compile howdy.c properly, use the following command line:

$ gcc howdy.c helper.c -o howdy

GCC goes through the same preprocess-compile-link steps as before. This time it creates object files for each source file, howdy.c and helper.c, before creating the binary, howdy, in the link stage.

Typing long commands like this does become tedious. In Chapter 4, "Project Management Using GNU make," you learn how to solve this problem. The next section will begin introducing you to the multitude of GCC's command-line options.

Common Command-Line Options

The list of command-line options GCC accepts runs to several pages, so Table 3.2 only lists the most common ones.

Table 3.2 GCC Command-Line Options

Option            Description
-o FILE           Specifies the output filename; not necessary when compiling to object code. If FILE is not specified, the default name is a.out.
-c                Compiles without linking.
-DFOO=BAR         Defines a preprocessor macro named FOO with a value of BAR on the command line.
-IDIRNAME         Prepends DIRNAME to the list of directories searched for include files.
-LDIRNAME         Prepends DIRNAME to the list of directories that are searched for library files.
-static           Links against static libraries. By default, GCC links against shared libraries.
-lFOO             Links against libFOO.
-g                Includes standard debugging information in the binary.
-ggdb             Includes lots of debugging information in the binary that only the GNU debugger, gdb, can understand.
-O                Optimizes the compiled code.
-ON               Specifies an optimization level N, 0 <= N <= 3. The default level is 1 if N is not specified.
-ansi             Supports the ANSI/ISO C standard, turning off GNU extensions that conflict with the standard (this option does not guarantee ANSI-compliant code).
-pedantic         Emits all warnings required by the ANSI/ISO C standard.
-pedantic-errors  Emits all errors required by the ANSI/ISO C standard.
-traditional      Supports the Kernighan and Ritchie C language syntax (such as the old-style function definition syntax). If you don't understand what this means, don't worry about it.
-w                Suppresses all warning messages. In my opinion, using this switch is a very bad idea!
-Wall             Emits all generally useful warnings that GCC can provide. Specific warnings can also be flagged using -W{warning}.
-Werror           Converts all warnings into errors, which will stop the compilation.
-MM               Outputs a make-compatible dependency list.
-v                Shows the commands used in each step of compilation.

You have already seen how -c works, but -o needs a bit more discussion. -o FILE tells GCC to place output in the file FILE regardless of the output being produced. If you do not specify -o, the defaults for an input file named FILE.SUFFIX are to put an executable in a.out, object code in FILE.o, and assembler code in FILE.s. Preprocessor output goes to standard output.

Working with Libraries and Include Files

As you saw in Table 3.2, the -I{DIRNAME} option allows you to add directories to GCC’s search path for include files. For example, if you store custom header files in /home/fred/include, then, in order for GCC to find them, you would use the -I option as shown in the next example:

$ gcc myapp.c -I /home/fred/include -o myapp

The -L option works for libraries the way that the -I option works for header files. If you use library files that reside in non-standard locations, the -L{DIRNAME} option tells GCC to add DIRNAME to the library search path and ensures that DIRNAME is searched before the standard locations.

Suppose you are testing a new programming library, libnew.so, currently stored in /home/fred/lib. (.so is the normal extension for shared libraries—more on this subject in Chapter 10, "Using Libraries.") To link against this library, your GCC command line would be something like this:

$ gcc myapp.c -L/home/fred/lib -lnew -o myapp

The -L/home/fred/lib construct will cause GCC to look in /home/fred/lib before looking in its default library search path. The -l option tells the linker to pull in object code from the specified library. In this example, I wanted to link against libnew.so. A long-standing UNIX convention is that libraries are named lib{something}, and GCC, like most compilers, relies on this convention. If you fail to use the -l option when linking against libraries, the link step will fail and GCC will complain about undefined references to function_name.

Naturally, you can use all of these together—in fact, doing so is quite common (and usually necessary) for all but the most trivial programs. That is, the command line

$ gcc myapp.c -L/home/fred/lib -I/home/fred/include -lnew -o myapp

instructs GCC to link against libnew.so, to look in /home/fred/lib for libnew.so, and to search in /home/fred/include for any non-standard header files.

By default, GCC links with shared libraries, so if you must link against static libraries, you have to use the -static option. This means that only static libraries will be used during the link stage. The following example creates an executable linked to the static ncurses library (Chapter 23, "Getting Started with Ncurses," and Chapter 24, "Advanced Ncurses Programming," discuss user interface programming with ncurses):

$ gcc cursesapp.c -lncurses -static -o cursesapp

When you link against static libraries, the binary that results is much larger than the one you get if you used shared libraries. Why use a static library then? One common reason is to guarantee that users can run your program. In the case of shared libraries, the code your program needs to run is linked dynamically at runtime, rather than statically at compile time. If the shared library your program requires is not installed on the user's system, she will get errors and will not be able to run your program.

The Netscape Web browser is a perfect example of this. Netscape relies heavily on Motif, an X programming toolkit. Before Motif's re-release as Open Motif (under a more open license), most Linux users could not afford to install Motif on their system. To get around this difficulty, Netscape actually installed two versions of its browser on your system: one that was linked against shared libraries, netscape-dynMotif, and one that was statically linked, netscape-statMotif. The netscape "executable" itself was actually a script that checked to see if you had the Motif shared library installed and launched one or the other of the binaries as necessary.

Warning and Error Message Options

GCC boasts a whole class of error-checking, warning-generating, command-line options. These include -ansi, -pedantic, -pedantic-errors, and -Wall. To begin with, -pedantic tells GCC to issue all warnings demanded by strict ANSI/ISO standard C. Any program using forbidden extensions, such as those supported by GCC, will draw a warning. -pedantic-errors behaves similarly, except that it emits errors rather than warnings and stops compilation. -ansi, finally, turns off GNU extensions that do not comply with the standard. None of these options, however, guarantees that your code, when compiled without error using any or all of these options, is 100% ANSI/ISO-compliant.

Consider Listing 3.5, an example of very bad programming form. It declares main as returning void, when in fact main returns int, uses the GNU extension long long to declare a 64-bit integer, and does not call return before terminating.

Listing 3.5 Non-ANSI/ISO Source Code

/*
 * pedant.c - use -ansi, -pedantic or -pedantic-errors
 */
#include <stdio.h>

void main(void)
{
    long long int i = 0l;
    printf("This is a non-conforming C program\n");
}

Using gcc pedant.c -o pedant, the compiler warns you about main's invalid return type:

$ gcc pedant.c -o pedant
pedant.c: In function 'main':
pedant.c:7: warning: return type of 'main' is not 'int'

Now, add -ansi to the GCC invocation:

$ gcc -ansi pedant.c -o pedant
pedant.c: In function 'main':
pedant.c:7: warning: return type of 'main' is not 'int'

Again, GCC issued the same warning and ignored the invalid data type. The lesson here is that -ansi only turns off GNU extensions that conflict with the standard. It does not make GCC emit every diagnostic message the standard requires, and it does not ensure that your code is ANSI C-compliant. The program compiled despite the deliberately incorrect declaration of main and the illegal data type.

Now, use -pedantic:

$ gcc -pedantic pedant.c -o pedant
pedant.c: In function 'main':
pedant.c:8: warning: ANSI C does not support 'long long'
pedant.c:7: warning: return type of 'main' is not 'int'

The code still compiles, despite the emitted warning. This time, however, the compiler at least noticed the invalid data type. With -pedantic-errors, however, it does not compile. GCC stops after emitting the error diagnostic:

$ gcc -pedantic-errors pedant.c -o pedant
pedant.c: In function 'main':
pedant.c:8: ANSI C does not support 'long long'
pedant.c:7: return type of 'main' is not 'int'
$ ls
hello.c helper.c helper.h howdy.c pedant.c

To reiterate, the -ansi, -pedantic, and -pedantic-errors compiler options do not ensure ANSI/ISO-compliant code. They merely help you along the road. It is instructive to point out the sarcastic remark in the info file for GCC on the use of -pedantic:

"This option is not intended to be useful; it exists only to satisfy pedants who would otherwise claim that GNU CC fails to support the ANSI standard. Some users try to use '-pedantic' to check programs for strict ANSI C conformance. They soon find that it does not do quite what they want: it finds some non-ANSI practices, but not all--only those for which ANSI C requires a diagnostic."

In addition to the -ansi, -pedantic, and -pedantic-errors compiler options, gcc boasts a number of other options that issue helpful warnings. The most useful of these is the -Wall option, which causes GCC to issue a number of warnings about code that is not outright wrong, but that is potentially dangerous or that looks like it might be a mistake. The next example shows -Wall's behavior when used on pedant.c:

$ gcc -Wall pedant.c -o pedant
pedant.c:7: warning: return type of 'main' is not 'int'
pedant.c: In function 'main':
pedant.c:8: warning: unused variable 'i'

Note that this time, GCC flags the variable i, which is never used. Clearly, this is not an error, but it does demonstrate a poor programming practice.

On the other end of the spectrum from -Wall is -w, which turns off all warning messages. -W{warning} has the effect of turning on a particular warning in which you are interested, indicated by warning, such as implicit function declarations (-Wimplicit-function-declaration) or functions that have an implicitly declared return type (-Wreturn-type). The former is useful because it suggests you used a function without first declaring it or that you forgot to include the appropriate header file. The latter warning indicates that you may have declared a function without specifying its return type, in which case the return type defaults to int. Table 3.3 lists a number of GCC warnings that are useful for catching common programming mistakes.

Tip - If you want to check your program’s syntax without actually doing any compilation, call GCC with the -fsyntax-only option.

Table 3.3 GCC Warning Options

Option                  Description
-Wcomment               Warns if nested comments have been used (a second /* appears after a first /*)
-Wformat                Warns if arguments passed to printf and related functions do not match the type specified by the corresponding format string
-Wmain                  Warns if main's return type is not int or if main is called with the incorrect number of arguments
-Wparentheses           Warns if an assignment is made (for example, (n=10)) in a context in which a comparison was expected (for example, (n==10)), or if parentheses would resolve an operator precedence problem
-Wswitch                Warns if a switch statement is missing a case for one or more of its enumerated possibilities (only applies if the index is of type enum)
-Wunused                Warns if a variable is declared but not used or if a function is declared static but never defined
-Wuninitialized         Warns if an automatic variable is used without first being initialized
-Wundef                 Warns if an undefined identifier gets evaluated in a #if directive
-Winline                Warns if a function cannot be inlined
-Wmissing-declarations  Warns if a global function is defined but not declared in any header file
-Wlong-long             Warns if the long long type is used
-Werror                 Converts all warnings into errors

As you can see, GCC has the capability to catch many common and frustrating programming blunders. Listing 3.6 illustrates a number of typical coding mistakes; the sample GCC command lines that follow the listing show the -W{warning} option at work.

Listing 3.6 Common Programming Mistakes

/*
 * blunder.c - Mistakes caught by -W{warning}
 */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int i, j;

    printf("%c\n", "not a character"); /* -Wformat */

    if(i = 10) /* -Wparentheses */
        printf("oops\n");

    if(j != 10) /* -Wuninitialized */
        printf("another oops\n");

    /* /* */ /* -Wcomment */

    no_decl(); /* -Wmissing-declarations */

    return(EXIT_SUCCESS);
}

void no_decl(void)
{
    printf("no_decl\n");
}

The expected warnings GCC will issue are indicated in the comments. The first attempt to compile this program uses a simple command line that does not invoke any warning options. It results in the following:

$ gcc blunder.c -o blunder
blunder.c:27: warning: type mismatch with previous implicit declaration
blunder.c:21: warning: previous implicit declaration of 'no_decl'
blunder.c:27: warning: 'no_decl' was previously implicitly declared to return 'int'

As you can see, in its default error-checking mode, GCC only issues warnings related to the implicit declaration of the no_decl function. It ignored the other potential errors, which include

• The type of the argument passed to printf (a string) does not match the format specifier (a char). This will cause a -Wformat warning.

• Neither i nor j is initialized when it is declared, and j is tested before it has been given a value. This will generate a -Wuninitialized warning.

• i is assigned a value in a context in which a comparison is intended. This should result in a -Wparentheses warning.

• The beginning of a nested comment should generate a -Wcomment warning.

First, see if gcc catches the type mismatch in the printf statement:

$ gcc -Wformat blunder.c -o blunder
blunder.c: In function 'main':
blunder.c:11: warning: int format, pointer arg (arg 2)
blunder.c: At top level:
blunder.c:27: warning: type mismatch with previous implicit declaration
blunder.c:21: warning: previous implicit declaration of 'no_decl'
blunder.c:27: warning: 'no_decl' was previously implicitly declared to return 'int'

As you can see, the first three lines of diagnostic output show that GCC caught the type mismatch in the printf call. Next, the -Wparentheses and -Wcomment options:

$ gcc -Wparentheses -Wcomment blunder.c -o blunder
blunder.c:19: warning: '/*' within comment
blunder.c: In function 'main':
blunder.c:13: warning: suggest parentheses around assignment used as truth value
blunder.c: At top level:
blunder.c:27: warning: type mismatch with previous implicit declaration
blunder.c:21: warning: previous implicit declaration of 'no_decl'
blunder.c:27: warning: 'no_decl' was previously implicitly declared to return 'int'

As anticipated, GCC emitted warnings about the apparent nested comment on line 19 and about the possibly mistaken assignment on line 13.

Finally, test -Wuninitialized:

$ gcc -O -Wuninitialized blunder.c -o blunder
blunder.c: In function 'main':
blunder.c:9: warning: 'j' might be used uninitialized in this function
blunder.c: At top level:
blunder.c:27: warning: type mismatch with previous implicit declaration
blunder.c:21: warning: previous implicit declaration of 'no_decl'
blunder.c:27: warning: 'no_decl' was previously implicitly declared to return 'int'

Interestingly, GCC did not warn that i was being used uninitialized, although it did for j. This is because the mistaken assignment on line 13, if(i = 10), gives i a value before it is ever read, so only j is truly used uninitialized. If you want to catch all of these warnings, and many more, use the -Wall option mentioned earlier. It is much shorter to type.

Note - The last example used the -O (optimization) option. This was necessary because -Wuninitialized requires its use, although this is not evident in GCC's info page.

This section demonstrated GCC's ability to catch real and potential programming errors. The next section explores another of GCC's capabilities: code optimization.

Optimization Options

Code optimization is an attempt to improve performance. The trade-off is lengthened compile times, increased memory usage during compilation, and, in some cases, a larger disk footprint for the resulting binary. While some optimizations are general in nature and can be applied in any circumstance, other optimizations are designed to exploit features of a given CPU or CPU family. This section looks at both classes of optimization options.

The bare -O option tells GCC to reduce both code size and execution time. It is equivalent to -O1. The types of optimization performed at this level depend on the target processor, but always include at least thread jumps and deferred stack pops. Thread jump optimizations attempt to reduce the number of jump operations; deferred stack pops occur when the compiler lets arguments accumulate on the stack as functions return and then pops them simultaneously, rather than popping the arguments piecemeal as each called function returns.

-O2 level optimizations include all first-level optimizations plus additional tweaks that involve processor instruction scheduling. At this level, the compiler takes care to make sure the processor has instructions to execute while waiting for the results of other instructions or while waiting for data to be retrieved from second-level cache or main memory. The implementation of these optimizations, however, is highly processor-specific. -O3 options include all -O2 optimizations, loop unrolling, and other processor-specific features.

Depending on the amount of low-level knowledge you have about a given CPU family, you can use the -f{flag} option to request specific optimizations you want performed. Table 3.4 lists eight -f optimization flags that are often useful.

Table 3.4 GCC Optimization Flags

Flag                   Effect
-ffloat-store          Suppresses storing the value of floating-point variables in CPU registers. This will save CPU registers for other uses and prevent unnecessarily precise floating-point numbers from being generated.
-ffast-math            Generates floating-point math optimizations that are faster but that violate IEEE and/or ANSI/ISO standards. If your program does not need strict IEEE adherence, consider using this flag when compiling programs that are floating-point intensive.
-finline-functions     Expands all simple functions in place inside their callers. The compiler decides what constitutes a simple function. Reducing the processor overhead associated with function calls is a basic optimization technique.
-funroll-loops         Unrolls all loops having a fixed number of iterations that can be determined at compile time. Unrolling loops saves several CPU instructions per loop iteration, which can decrease execution time.
-fomit-frame-pointer   Discards a frame pointer stored in a CPU register if the function does not need one. This speeds up processing because the instructions necessary to set up, save, and restore frame pointers are eliminated.
-fschedule-insns       Reorders instructions that may stall because they are waiting for data that is not in the CPU.
-fschedule-insns2      Performs a second round of instruction reordering (similar to -fschedule-insns).
-fmove-all-movables    Moves all invariant calculations occurring inside a loop outside of the loop. This eliminates unnecessary operations from the loop, speeding up its overall operations.

Inlining and loop unrolling can greatly improve a program's execution speed because they avoid the overhead of function calls and variable lookups, but the cost is usually a large increase in the size of the object or binary files. You will need to experiment to see if faster execution time, if any, is worth the increased file size. In general, when playing with compiler options of the kind listed in Table 3.4, experimentation and code profiling, or performance analysis, are necessary to confirm that a given optimization has the desired effect.

As an experiment, the following program, pisqrt.c (see Listing 3.7), calculates the square root of pi 10,000,000 times. Table 3.5 lists the optimization or processor flag used to compile the program and the average execution time of ten runs of pisqrt on a Pentium II 260 MHz CPU with 128MB RAM.

Listing 3.7 Calculate the Square Root of pi

/*
 * pisqrt.c - Calculate the square root of PI 10,000,000
 * times.
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double pi = M_PI; /* Defined in <math.h> */
    double pisqrt;
    long i;

    for(i = 0; i < 10000000; ++i) {
        pisqrt = sqrt(pi);
    }
    return 0;
}

Table 3.5 pisqrt Execution Times

Flag/Optimization    Average Execution Time
<none>               5.43 seconds
-O1                  2.74 seconds
-O2                  2.83 seconds
-O3                  2.76 seconds
-ffloat-store        5.41 seconds
-ffast-math          5.46 seconds
-funroll-loops       5.44 seconds
-fschedule-insns     5.45 seconds
-fschedule-insns2    5.44 seconds

This not terribly rigorous experiment shows that, at least for this program, letting the compiler choose the right set of optimizations, using -O1, -O2, and -O3, results in the greatest performance gains. The lesson to take away from this demonstration is that unless you know a great deal about processor architecture or know that a particular optimization will have a specific effect your program needs, stick with the -O optimization options.

Tip - In general, Linux programmers seem to use -O2 optimization, although this is based more on habit than on empirical testing. Even on small programs, like the hello.c program introduced at the beginning of this chapter, you will see small reductions in code size and in execution time. As Table 3.5 shows, -O1 optimization resulted in the best performance increase for the program tested. The moral? Try different optimization levels to see which one has the best results!

Debugging Options

Bugs, alas, are as inevitable as death and taxes. To accommodate this inescapable reality, you can use GCC’s -g and -ggdb options to insert debugging information into your compiled programs to facilitate debugging sessions. In addition, GCC also has a number of options to make code-profiling sessions easier and more productive.

The -g option can be qualified with a 1, 2, or 3 to specify how much debugging information to generate. The default level is 2 (-g2), which generates extensive symbol tables, line numbers, and information about local and external variables. All of this information is stored inside the binary. Level 3 debugging information includes all of the level 2 information plus all of the macros defined in the source code. Level 1, in contrast, generates just enough information to create backtraces and stack dumps. A backtrace is the history of the function calls that a program makes. A stack dump is a listing, usually in raw hexadecimal format, of the contents of a program’s execution environment, primarily the CPU registers and the memory allocated to it. Note that level 1 debugging information does not generate debugging information for local variables or line numbers.

If you intend to use the GNU Debugger, gdb (covered in Chapter 8, "Debugging"), using the -ggdb option creates extra information that eases the debugging chore under gdb. However, it will also likely make the program impossible to debug using other debuggers, such as the debugger common on Solaris systems. -ggdb accepts the same level specifications as -g, and they have the same effects on the debugging output.

Using either of the two debug-enabling options will, however, dramatically increase the size of your binary. Simply compiling and linking the simple hello.c program I used earlier in this chapter resulted in a binary of 4089 bytes on my system. The resulting sizes when I compiled it with the -g and -ggdb options may surprise you:

$ gcc -g hello.c -o hello
$ ls -l hello
-rwxr-xr-x 1 kwall users 10275 May 21 23:27 hello

$ gcc -ggdb hello.c -o hello
$ ls -l hello
-rwxr-xr-x 1 kwall users 8135 May 21 23:28 hello

As you can see, the -g option increased the binary’s size to roughly two and a half times the original, while the -ggdb option doubled it! Despite the size increase, I recommend shipping binaries with standard debugging symbols (created using -g) in them in case someone encounters a problem and wants to try to debug your code for you.

Additional debugging options include the -p and -pg options, which embed profiling information into the binary. This information is useful for tracking down performance bottlenecks in your code and for developing a general picture of a program’s performance. -p adds profiling symbols that the prof program can read, and -pg adds symbols that the GNU project’s prof incarnation, gprof, can interpret. The -a option counts how many times each block of code (such as functions) is entered.

-save-temps saves the intermediate files, such as the object and assembler files, generated during compilation. These files can be useful if you suspect that the compiler is doing something unusual with your code or if you want to examine the generated code to see if it can be hand-tuned for better performance.

If you are interested in seeing how long the compiler takes to do its work, consider using the -Q option, which causes GCC to display each function as it compiles along with some statistics about how long each compiler pass takes. For example, here is the output when compiling the trusty hello.c program:

$ gcc -Q hello.c
main
time in parse: 0.020000
time in integration: 0.000000
time in jump: 0.000000
time in cse: 0.000000
time in loop: 0.000000
time in cse2: 0.000000
time in branch-prob: 0.000000
time in flow: 0.000000
time in combine: 0.000000
time in regmove: 0.000000
time in sched: 0.000000
time in local-alloc: 0.000000
time in global-alloc: 0.000000
time in sched2: 0.000000
time in shorten-branch: 0.000000
time in stack-reg: 0.000000
time in final: 0.000000
time in varconst: 0.000000
time in symout: 0.000000
time in dumpt: 0.000000

The displayed times may vary on your system. This information is mostly of interest to compiler writers, but, if you ever get curious about what the compiler is doing, you know how to find out.

Finally, as mentioned at the beginning of this chapter, GCC allows you simultaneously to optimize your code and to insert debugging information. Optimized code presents a debugging challenge, however, because variables you declare and use may not be used in the optimized program, flow control may branch to unexpected places, statements that compute constant values may not execute, and statements inside loops will execute elsewhere because the loop was unrolled. My personal preference, though, is to debug a program thoroughly before worrying about optimization. Your mileage may vary.
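To make these effects concrete, here is a minimal sketch of a function that an optimizer is free to rearrange. The function sum_scaled and its variables are hypothetical, and exactly what GCC does with them depends on the compiler version and the optimization level, so treat the comments as typical outcomes rather than guarantees.

```c
/*
 * Hypothetical example: add up the elements of values, each multiplied
 * by a constant factor.
 */
int sum_scaled(const int *values, int count)
{
    int scale = 2 * 3;      /* constant expression: typically computed at
                               compile time, so no statement runs for it */
    int unused_total = 0;   /* written but never read: an optimizer may
                               remove the variable entirely */
    int sum = 0;
    int i;

    for (i = 0; i < count; i++) {
        unused_total += i;          /* dead store, a candidate for removal */
        sum += values[i] * scale;   /* may be reordered or unrolled */
    }

    return sum;
}
```

Stepping through sum_scaled in a binary built with both -g and -O2, you may find that the debugger cannot print scale or unused_total, or that the current line appears to jump around inside the loop.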

Tip - "Optimize later" does not mean "ignore efficiency during the design process." Optimization, in the context of this chapter, refers to the compiler magic I have discussed in this section. Good design and efficient algorithms have a far greater impact on overall performance than any compiler optimization ever will. A highly optimized bubble sort, for example, will never be as fast as a quick sort, except for very small data sets. If you take the time up front to create a clean design and use fast algorithms, you may not need to optimize, although it never hurts to try.

Architecture-Specific Options

In addition to the optimization options discussed in the previous section, GCC can generate code specific to each CPU type. To do so, use the -m{value} option. Table 3.6 shows a number of the supported options for the Intel i386 processor family.

Table 3.6 Architecture-Specific GCC Options

Option               Meaning
-mcpu=CPU_TYPE       Uses the default CPU instruction schedule for CPU_TYPE when compiling. The choices for CPU_TYPE are i386, i486, i586, pentium, i686, and pentiumpro.
-m386                Synonym for -mcpu=i386.
-m486                Synonym for -mcpu=i486.
-mpentium            Synonym for -mcpu=pentium.
-mpentiumpro         Synonym for -mcpu=pentiumpro.
-march=CPU_TYPE      Generates instructions for CPU_TYPE. The choices for CPU_TYPE are i386, i486, pentium, and pentiumpro. -march=CPU_TYPE implies -mcpu=CPU_TYPE.
-mieee-fp            Uses IEEE standards for floating-point comparisons.
-mno-ieee-fp         Doesn’t use IEEE standards for floating-point comparisons.
-malign-double       Aligns double, long double, and long long variables on a two word boundary, resulting in faster code.
-mno-align-double    Doesn’t align double, long double, and long long variables on a two word boundary.
-mrtd                Forces functions that take a fixed number of arguments to return with the ret NUM instruction, saving one instruction in the caller.

A few notes on these options are necessary. Using -mcpu=CPU_TYPE schedules instructions appropriately for the indicated CPU. However, unless you also use -march=CPU_TYPE, the compiler will not generate any code that cannot run on i386 CPUs. To generate both instructions and instruction scheduling specific to a given processor, use the -march=CPU_TYPE option, which implies -mcpu=CPU_TYPE. This is the best way to customize compiled code for a given processor. The -malign-double option results in slightly faster code, as noted in Table 3.6, but only on Pentium CPUs. The -mrtd option overrides the -fdefer-stack-pops discussed earlier.

GNU C Extensions

The GNU Compiler Collection extends the ANSI C standard in a variety of ways. If you don’t mind writing blatantly non-standard code, some of these extensions can be convenient and very useful.

About Portability

The problem with non-standard code, however, is that it is not terribly portable—code that takes advantage of GNU extensions doubtless will not compile with a non-GNU compiler. One of the reasons the C language has been so resilient and persistent, besides its fundamental power and flexibility, is that it is standardized and has been ported to every major computing architecture in existence, and probably to most of the minor platforms, too. This standardization makes it extremely portable.

Note - GCC also sports a number of extensions for C++.

So, dauntless reader, you have to choose between the convenience of the extensions and writing ANSI/ISO standard C. My recommendation is to write standard C. When you diverge from the standard, as will happen when using POSIX functions (such as reading and writing file descriptors, covered in Chapter 11, "Input and Output," and Chapter 12, "Working with Files and Directories"), isolate the code in question in a single module and take pains to make sure that your program will compile on a strict ANSI/ISO system. The usual way to accomplish this is to bracket the non-standard code in #ifdef directives. The following code snippet illustrates this approach. It will compile in a strictly ANSI C environment and in a looser, GNU-friendly environment:

#ifdef __STRICT_ANSI__
/* use ANSI/ISO C only here */
#else
/* use GNU extensions here */
#endif

The macro __STRICT_ANSI__, if defined either by the user or an ANSI-compatible compiler, indicates that an ANSI-compatible environment is being enforced and code in the first part of the #ifdef block will be compiled. Otherwise, code following the #else directive will be compiled.
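As one possible application of this pattern, the sketch below hides the inline keyword (discussed later in this chapter) behind a macro, because gcc -ansi does not recognize inline as a keyword. The macro INLINE and the functions square and sum_of_squares are hypothetical names chosen for illustration.

```c
#ifdef __STRICT_ANSI__
/* Strict mode: expand to nothing, so square is an ordinary function. */
#define INLINE
#else
/* GNU-friendly mode: ask the compiler to consider inlining. */
#define INLINE inline
#endif

static INLINE int square(int x)
{
    return x * x;
}

/* Hypothetical caller, so that the fragment does some work. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

The same source compiles either way; only the expansion of INLINE changes, and the rest of the program never needs to know which branch was taken.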

For all of the gory details, I will direct the curious reader to GCC's info pages. The extensions covered in this section are ones frequently seen in Linux’s system headers, source code, and in many Linux applications.

Note - The ANSI/ISO C standard does not define the behavior of a compiler and the programs it builds when source code uses certain constructs, such as main returning void. At least in theory, anything can happen. An ancient comp.lang.c convention holds that demons will fly out of your nose if you invoke undefined behavior. The point of the quote is that a program is not correct or portable just because it compiles with your compiler on your system.

GNU Extensions

To provide 64-bit storage units, for example, GCC offers the long long type:

long long long_int_var;

On the x86 platform, this definition results in a 64-bit memory location named long_int_var.

Note - The long long type exists in the new draft of the ISO C standard.
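A brief sketch of why the extra width matters follows. It assumes a platform where int is 32 bits wide, and wide_product is a hypothetical helper name.

```c
/*
 * Multiply two ints using 64-bit arithmetic. Converting the operands to
 * long long before multiplying keeps the product from overflowing int.
 */
long long wide_product(int a, int b)
{
    return (long long)a * (long long)b;
}
```

For example, wide_product(100000, 100000) yields 10000000000, a value too large for a 32-bit int. To print such a value with printf, use the %lld conversion.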

Inline Functions

Another GCC-ism you will encounter in Linux header files is the use of inline functions. Provided it is short enough, an inline function expands in your code much as a macro does, thus eliminating the cost of a function call. Inline functions are better than macros, however, because the compiler type-checks them at compile time.

To use inline functions, insert the keyword inline in front of the function’s return type, as shown in the excerpt that follows, then compile with at least -O optimization.

inline void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

This short function implements the well-known swap function as an inline function. Note, however, that the inline keyword is only a suggestion to the compiler, not a directive. The compiler decides, using an internal heuristic, whether or not a given function can or will be inlined.
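The following sketch puts swap to work. Note one deliberate departure from the excerpt above: the function is declared static inline, an assumption made here so that the fragment also links when the compiler declines to inline the call (for instance, when compiling without optimization under the C99 inline rules). sort_pair is a hypothetical caller.

```c
/* static keeps a private copy available if the call is not inlined. */
static inline void swap(int *a, int *b)
{
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Hypothetical caller: put two values into ascending order. */
void sort_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        swap(lo, hi);   /* a candidate for inline expansion at -O and above */
}
```

Whether the call to swap is actually expanded in place remains the compiler's decision, as described above.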

Function and Variable Attributes

The __attribute__ keyword enables you to tell GCC more about your code and helps the code optimizer do a better job. Consider, for example, the standard library functions exit and abort, which never return to their caller. The compiler can generate slightly more efficient code if it knows that they do not return. Of course, user programs may also define functions that do not return. GCC allows you to specify the noreturn attribute for such functions, which acts as a hint to the compiler when it optimizes the function.

Suppose, for example, you have a function named die_on_error that never returns. To use a function attribute, append __attribute__ ((attribute_name)) after the closing parenthesis of the function declaration. Thus, the declaration of die_on_error would look like

void die_on_error(void) __attribute__ ((noreturn));

The function would then be defined normally:

#include <stdlib.h>

void die_on_error(void)
{
    /* your code here */
    exit(EXIT_FAILURE);
}
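To show where the hint pays off, here is a minimal sketch of a caller. checked_divide is a hypothetical function, and die_on_error is given a small body purely for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

void die_on_error(void) __attribute__ ((noreturn));

void die_on_error(void)
{
    fputs("fatal error\n", stderr);
    exit(EXIT_FAILURE);
}

/*
 * Because die_on_error is marked noreturn, GCC knows that control never
 * continues past the call when den is zero.
 */
int checked_divide(int num, int den)
{
    if (den == 0)
        die_on_error();

    return num / den;
}
```

A call such as checked_divide(10, 2) returns 5, while a zero divisor ends the program with a failure status instead of reaching the division.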

You can also apply attributes to variables. The aligned attribute, for example, instructs the compiler to align the variable’s memory location on a specified byte boundary. The statement

int int_var __attribute__ ((aligned (16))) = 0;

will cause GCC to align int_var on a 16-byte boundary. The packed attribute tells GCC to use the minimum amount of space required for variables or structs. Used with structs, packed causes GCC to remove any padding that it would ordinarily insert for alignment purposes.
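The effect of both attributes can be observed with sizeof and a little pointer arithmetic. In this sketch, the struct names, aligned_var, and the helper functions are all hypothetical, and the sizes noted in the comments assume a typical platform with a 1-byte char and a 4-byte int.

```c
#include <stddef.h>     /* size_t */

struct padded {         /* usually 8 bytes: padding follows tag */
    char tag;
    int  value;
};

struct tight {          /* 5 bytes: packed removes the padding */
    char tag;
    int  value;
} __attribute__ ((packed));

int aligned_var __attribute__ ((aligned (16))) = 0;

size_t padded_size(void) { return sizeof(struct padded); }
size_t tight_size(void)  { return sizeof(struct tight); }

/* Returns 1 if aligned_var sits on a 16-byte boundary, 0 otherwise. */
int aligned_var_is_aligned(void)
{
    return ((size_t)&aligned_var % 16) == 0;
}
```

Packing saves space at a price: members of a packed struct may be misaligned, which can slow access on some processors.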

If you want to turn off warnings about unused variables, apply the unused attribute to the variable, which informs the compiler that the variable is intended to be unused. This will silence the warning that would otherwise be issued:

float big_salary __attribute__ ((unused));

Comment Delimiters

GCC permits the use of the C++ comment delimiter, //, in C programs unless the -ansi or -traditional compiler options are used. Many other compilers also permit this and it may become part of the new C standard currently being developed. This feature is a real convenience for a new generation of programmers who emerge from university having used C++ much more extensively than C and who are thus more accustomed to typing // rather than /* and */. Consider this extension a convenience.

Using Case Ranges

A terrifically useful extension is case ranges. The syntax looks like this:

case LOWVAL ... HIVAL:

Note that the spaces preceding and following the ellipsis are required. Case ranges are used in switch statements to specify values that fall between LOWVAL and HIVAL inclusive. An example follows:

switch(int_var) {
case 0 ... 2:
    /* your code here */
    break;
case 3 ... 5:
    /* more code here */
    break;
default:
    /* default code here */
}

The fragment above is equivalent to

switch(int_var) {
case 0:
case 1:
case 2:
    /* your code here */
    break;
case 3:
case 4:
case 5:
    /* more code here */
    break;
default:
    /* default code here */
}

Case ranges are just a shorthand notation, syntactic sugar, for the traditional switch statement syntax. As you can see, it makes the first code fragment shorter and slightly improves its readability (although veteran C programmers find the idiomatic approach used in the second fragment easier to read).
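Case ranges also work with character constants, which is where they tend to earn their keep. The sketch below assumes an ASCII-compatible character set, in which the digits and the letters of each case are contiguous. The enum and the function classify_char are hypothetical names.

```c
enum char_kind { KIND_DIGIT, KIND_LOWER, KIND_UPPER, KIND_OTHER };

/* Classify a character using GNU case ranges (not standard C). */
enum char_kind classify_char(int c)
{
    switch (c) {
    case '0' ... '9':
        return KIND_DIGIT;
    case 'a' ... 'z':
        return KIND_LOWER;
    case 'A' ... 'Z':
        return KIND_UPPER;
    default:
        return KIND_OTHER;
    }
}
```

Written in the traditional style, the same switch would need 62 separate case labels, which is where the readability argument for ranges becomes more persuasive.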

Constructing Function Names

One GNU extension that can dramatically simplify debugging is to use function names as strings. GCC predefines the variable __FUNCTION__ to be the name of the current function (where flow of control is currently located) as it is written in the source code. Listing 3.8 illustrates how this feature works.

Listing 3.8 Using the __FUNCTION__ Variable

/*
 * showit.c - Illustrate using the __FUNCTION__ variable
 */
#include <stdio.h>

void foo(void);

int main(void)
{
    printf("The current function is %s\n", __FUNCTION__);
    foo();
    return 0;
}

void foo(void)
{
    printf("The current function is %s\n", __FUNCTION__);
}

The output of this program follows:

$ ./showit
The current function is main
The current function is foo

As you can see, showit obligingly produced the name of the function that was currently executing. This can be quite useful during a debugging session if you are having difficulty locating where a program is running into trouble. Simply insert a few printf statements that use __FUNCTION__ and you will quickly narrow it down.

PGCC: The Pentium Compiler

Before ending this chapter, it is worthwhile mentioning another, lesser known compiler, pgcc, the Pentium GCC. Maintained by the Pentium Compiler Group (http://www.goof.com/pcg/), pgcc was created to address the different optimization features of the Pentium processor architecture at a time when GCC did a poor job of Pentium-specific optimization. While it does represent a fork in GCC’s code base, the maintainers closely track GCC’s releases. In fact, pgcc is released as a set of patches to egcs, now the official GNU compiler.

pgcc’s chief benefit is better optimization for Pentium CPUs. It was originally based on a version of GCC that a team of Intel engineers created for the Pentium. While the Intel team produced benchmarks showing a 30% improvement in certain applications, the Pentium Compiler Group cautions that a performance increase of 5% is more likely in real-world situations.

Why bother with pgcc, especially now that egcs incorporates sophisticated Pentium optimizations? Well, in the first place, you need not. This is Linux, after all, and you are free to do as you see fit. However, pgcc is used as the compiler for both the Stampede and Enoch Linux distributions, so it has some merit. It also represents an alternative to GCC. Further, it might be an interesting experiment to see if pgcc can produce faster and/or smaller binaries on your system. Finally, it could be just plain fun to play with another piece of software.

Summary

This chapter introduced you to GCC, the GNU compiler collection. After a brief tutorial, it covered many GCC features, including options for using libraries and header files, generating compile-time warnings, adding debugging symbols to your programs, and optimization. In reality, it has only scratched the surface, though; GCC’s own documentation runs to several hundred pages. Nevertheless, you know enough about GCC’s features and capabilities to enable you to start using it in your own development projects. With this basic competency level, you are ready to start coding. First, though, the next chapter, "Project Management Using GNU make," adds another key tool to your developing Linux programming toolchest.

© Copyright Macmillan USA. All rights reserved.
