Chapter 13. Programming Languages

Chapter 13. Programming Languages There's much more to Linux than simply using the system. One of the benefits of free software is that you can modify it to suit your needs. This applies equally to the many free applications available for Linux and to the Linux kernel itself. Linux supports an advanced programming interface, using GNU compilers and tools, such as the gcc compiler, the gdb debugger, and so on. A number of other programming languages, including Perl, Python, and LISP, are also supported. Whatever your programming needs, Linux is a great choice for developing Unix applications. Because the complete source code for the libraries and Linux kernel is provided, programmers who need to delve into the system internals are able to do so.1 Linux is an ideal platform for developing software to run under the X Window System. The Linux X distribution, as described in Chapter 10, is a complete implementation with everything you need to develop and support X applications. Programming for X is portable across applications, so the X-specific portions of your application should compile cleanly on other Unix systems. In this chapter, we'll explore the Linux programming environment and give you a five-cent tour of the many facilities it provides. Half of the trick to Unix programming is knowing what tools are available and how to use them effectively. Often the most useful features of these tools are not obvious to new users. Since C programming has been the basis of most large projects (even though it is nowadays being replaced more and more by C++) and is the language common to most modern programmers — not only on Unix, but on many other systems as well — we'll start out telling you what tools are available for that. The first few sections of the chapter assume you are already a C programmer. But several other tools are emerging as important resources, especially for system administration. We'll examine one in this chapter: Perl. Perl is a scripting language like the Unix shells, taking care of grunt work like memory allocation, so you can concentrate on your task. But Perl offers a degree of sophistication that makes it more powerful than shell scripts and, therefore, appropriate for many programming tasks. Lots of programmers are excited about trying out Java , the new language from Sun Microsystems. While most people associate Java with interactive programs (applets) on web pages, it is actually a general-purpose language with many potential Internet uses. In a later section, we'll explore what Java offers above and beyond older programming languages, and how to get started. 13.1 Programming with gcc The C programming language is by far the most often used in Unix software development. Perhaps this is because the Unix system was originally developed in C; it is the native tongue 1 On a variety of Unix systems, the authors have repeatedly found available documentation to be insufficient. With Linux, you can explore the very source code for the kernel, libraries, and system utilities. Having access to source code is more important than most programmers think. of Unix. Unix C compilers have traditionally defined the interface standards for other languages and tools, such as linkers, debuggers, and so on. Conventions set forth by the original C compilers have remained fairly consistent across the Unix programming board. The GNU C compiler, gcc, is one of the most versatile and advanced compilers around. Unlike other C compilers (such as those shipped with the original AT&T or BSD distributions, or those available from various third-party vendors), gcc supports all the modern C standards currently in use — such as the ANSI C standard — as well as many extensions specific to gcc. Happily, however, gcc provides features to make it compatible with older C compilers and older styles of C programming. There is even a tool called protoize that can help you write function prototypes for old-style C programs. gcc is also a C++ compiler. For those who prefer the more modern object-oriented environment, C++ is supported with all the bells and whistles — including most of the C++ introduced when the C++ standard was released, such as method templates. Complete C++ class libraries are provided as well, such as the Standard Template Library (STL). For those with a taste for the particularly esoteric, gcc also supports Objective-C, an objectoriented C spinoff that never gained much popularity but may see a second spring due to its usage in Mac OS X. And there is gcj, which compiles Java code to machine code. But the fun doesn't stop there, as we'll see. In this section, we're going to cover the use of gcc to compile and link programs under Linux. We assume you are familiar with programming in C/C++, but we don't assume you're accustomed to the Unix programming environment. That's what we'll introduce here. The latest gcc version at the time of this writing is Version 3.0.4. However, the 3.0 series has proven to be still quite unstable, which is why Version 2.95.3 is still considered the official standard version. We suggest sticking with that one unless you know exactly what you are doing. 13.1.1 Quick Overview Before imparting all the gritty details of gcc, we're going to present a simple example and walk through the steps of compiling a C program on a Unix system. Let's say you have the following bit of code, an encore of the much-overused "Hello, World!" program (not that it bears repeating): #include <stdio.h> int main( ) { (void)printf("Hello, World!\n"); return 0; /* Just to be nice */ } Several steps are required to compile this program into a living, breathing executable. You can accomplish most of these steps through a single gcc command, but we've left the specifics for later in the chapter. First, the gcc compiler must generate an object file from this source code. The object file is essentially the machine-code equivalent of the C source. It contains code to set up the main( ) calling stack, a call to the printf( ) function, and code to return the value of 0. The next step is to link the object file to produce an executable. As you might guess, this is done by the linker. The job of the linker is to take object files, merge them with code from libraries, and spit out an executable. The object code from the previous source does not make a complete executable. First and foremost, the code for printf( ) must be linked in. Also, various initialization routines, invisible to the mortal programmer, must be appended to the executable. Where does the code for printf( ) come from? Answer: the libraries. It is impossible to talk for long about gcc without mentioning them. A library is essentially a collection of many object files, including an index. When searching for the code for printf( ), the linker looks at the index for each library it's been told to link against. It finds the object file containing the printf( ) function and extracts that object file (the entire object file, which may contain much more than just the printf( ) function) and links it to the executable. In reality, things are more complicated than this. Linux supports two kinds of libraries: static and shared. What we have described in this example are static libraries: libraries where the actual code for called subroutines is appended to the executable. However, the code for subroutines such as printf( ) can be quite lengthy. Because many programs use common subroutines from the libraries, it doesn't make sense for each executable to contain its own copy of the library code. That's where shared libraries come in.2 With shared libraries, all the common subroutine code is contained in a single library "image file" on disk. When a program is linked with a shared library, stub code is appended to the executable, instead of actual subroutine code. This stub code tells the program loader where to find the library code on disk, in the image file, at runtime. Therefore, when our friendly "Hello, World!" program is executed, the program loader notices that the program has been linked against a shared library. It then finds the shared library image and loads code for library routines, such as printf( ), along with the code for the program itself. The stub code tells the loader where to find the code for printf( ) in the image file. Even this is an oversimplification of what's really going on. Linux shared libraries use jump tables that allow the libraries to be upgraded and their contents to be jumbled around, without requiring the executables using these libraries to be relinked. The stub code in the executable actually looks up another reference in the library itself — in the jump table. In this way, the library contents and the corresponding jump tables can be changed, but the executable stub code can remain the same. Shared libraries also have another advantage: their upgradability. When someone fixes a bug in printf() (or worse, a security hole), you only need to upgrade the one library. You don't have to relink every single program on your system. But don't allow yourself to be befuddled by all this abstract information. In time, we'll approach a real-life example and show you how to compile, link, and debug your programs. It's actually very simple; the gcc compiler takes are of most of the details for you.

Chapter 13. Programming Languages

Details

Download

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

Support