Embedding with GNU:

Bill Gatliff 12/28/2001 10:36 AM EST

Embedding with GNU: Newlib newlib is often the most appropriate choice for a runtime library in an . Read on to find out why.

In previous articles in this series, I discussed how to use the GNU compiler and linker (February 2000) and the GNU debugger (September 1999), for embedded development. In this installment, I will introduce newlib, a C runtime library that is often an appropriate choice for built using GNU tools.

Technically, the name newlib refers to a collection of source code assembled by Cygnus Solutions (now Red Hat) from several different origins. This point is interesting only for heritage and licensing purposes (the latter we will discuss momentarily), because the code itself is released, managed, and used as a single functional unit, just as you would expect from any collection of C code distributed under a single name and version number.

Newlib is one of several options for small, GNU-friendly C runtime environments (see sidebar "Other Runtimes"). It is unique, however, in that it is a mature and well-supported product with an active developer and user base. And its architecture addresses the needs of deeply embedded systems quite well.

Other Embedded C Runtime Libraries

Clibc (cvs.uclinux.org/) This is the C runtime library behind CLinux, a derivative of the that runs on CPUs without MMUs. This library has a long and convoluted history, with its origins in a version of the GNU C library that predates Linux itself.

Clibc is a surprisingly small yet effective library-one can produce a 4KB "Hello, world!" application with CLibc, where the latest full GNU libc requires over 200KB to accomplish the same thing (although, honestly, this isn't exactly an apples-to-apples comparison). In addition, CLibc supports POSIX threads, a useful feature that newlib currently lacks. On the downside, CLibc is still very much under development, and it does not support most of targets currently supported by newlib. In fact, as of December 1, 2000, Clibc only supported x86-the m68k and arm targets were broken (external-lists.valinux.com/archives//sglibc/2000- December/000017.html). Clibc is also a library very much dedicated to a CLinux runtime environment, which makes it difficult to apply to other systems. sglibc (http://external-lists.valinux.com/lists/listinfo/sglibc) sglibc is a very recent attempt to reduce the size of the GNU glibc system. In fact, this project is so new, they don't even have a web page established yet. sglibc attacks the code bloat in glibc by using patches to discard major sections of unneeded or expensive functionality in the glibc source tree. There is considerable discussion underway on the mailing lists as to whether this is a reasonable direction or not, whether the results will be of any use to anyone other than projects needing a thrifty glibc (single- floppy Linux distributions, mostly), and whether the project's principal maintainer even wants to support embedded systems at all!

Assuming they can work all of their differences out, sglibc could be the next big thing for embedded Linux and BSD systems, because the GNU C library is the largest single component in today's embedded Linux systems-even larger than the kernel itself. Any gains made here will make Linux increasingly attractive in small, resource-constrained devices.

Licensing Because newlib is a collection of source code, it is distributed under the terms of several different licenses. All of the licensing is either public domain or BSD-like, which means that even proprietary applications can adopt newlib because its use does not require distribution of the end work's source code. For convenience, all of newlib's licenses are gathered up into the file COPYING.NEWLIB, which is included with the source code. Features Some of the library's features are useful enhancements to a "typical" embedded setting (whatever that is), while others allow newlib to be about as POSIX-like as a compact C runtime setting can be. A POSIX-like API can be a big help when porting applications to embedded hardware. printf vs. iprintf Newlib contains a complete implementation of the standard printf() and family. By the implementation's own admission, "this code is large and complicated" (vfprintf.c:144), but it is essential for systems that need full ANSI C input and output support, including capabilities for representing and parsing floating-point numbers.

Many embedded systems do not use floating-point math, however, and great pains are taken in most embedded runtime libraries to cull this code-bloating functionality whenever possible. Newlib actually approaches this problem in two ways:

A FLOATING_POINT macro that allows you to selectively disable floating-point support in each of the stdio library functions that can offer it A new iprintf() function that only knows how to display integer objects

If an embedded system needs floating-point support in only a few of the standard input and output functions, newlib can be rebuilt to exclude floating point from places where it isn't needed. You can omit floating point for everything except scanf(), for example, by either undefining the FLOATING_POINT macro everywhere except in the scanf source file, or by modifying newlib's build process to do the same thing, in a more automated fashion. I'll have more to say on these options later, when we look at what it takes to port newlib.

For situations where only integer output is required, newlib provides the iprintf() function: a version of printf built without FLOATING_POINT enabled. It behaves exactly like printf, except that it does not understand the %f, %F, %g, or %G type specifiers, and therefore has a much smaller code footprint than its full-featured big brother. More on stdio Newlib's standard input and output facilities are surprisingly complete. The complete C file API is also provided, complete with read and write buffering, seeking, and stream flushing capabilities. Variations like sprintf, fprintf, and vfprintf (takes va_list arguments) are also included, which makes a newlib environment look strikingly similar to one you' expect to see in a more workstation-oriented programming environment.

An unfortunate limitation of newlib's stdio library is that it requires at least a minimal malloc() for proper operation. When the printf() machinery detects a floating-point format specifier, it allocates a few bytes of memory to use as a workspace for constructing the text representation of the number. It's actually a poor design decision, because in other places, newlib statically allocates memory to accomplish the same thing. But that's the way it is. On the plus side, newlib includes a good dynamic memory allocator that is straightforward to set up and use. I have also built a malloc based on a fixed-size memory block allocator, to eliminate fragmentation worries in systems where this was a concern. That code appears in the porting section.

Unix API Newlib also includes a lot of the familiar Unix API functions like open() and write(). Most of them map directly to the code stubs newlib uses to invoke external system resources, so their simplicity eliminates the need for malloc(). These stubs will be discussed later, in the section on porting. libm Newlib contains a complete IEEE math library called libm. In addition to offering the standard math functions like exp(), sin(), and pow(), this code also provides matherr(), a modifiable math error handler that the library invokes whenever a serious math-related error, such as an underflow or loss of precision, is detected. By customizing this function, you can handle these situations in whatever way is appropriate for your system.

As a surprising bonus, libm also includes functions that take float parameters, instead of double. These extensions are named after their full-precision equivalents. For example, sinf() is the single-precision version of the sin() function. The reduced-precision functions have a considerable speed advantage over their IEEE-compliant double-precision conterparts, which can put some floating-point operations within reach of hardware that is too weak for full double-precision computations. Reentrancy Newlib's C and math libraries are reentrant when properly integrated into a multithreaded environment. The implementation is not obvious at first glance, so I'll spend a few paragraphs describing how it works. Once you know the details, it will be clear how to make sure you set it up properly in your system.

The ANSI C standard specifies a global integer called errno, that the runtime library sets when an error occurs. Once set, errno's value persists until the application clears it, which simplifies error notification by the library but can create reentrancy problems when multiple execution threads are working in the library at the same time-neither thread knows who generated the error.

Newlib addresses this problem by redefining errno as a macro that references a global pointer to a structure of type struct _reent (the pointer is called _impure_ptr). (Actually, the errno macro calls the function __errno(), which in turn references __impure_ptr[errno.c:11]. But you get the idea.) This structure contains the traditional errno value specified by ANSI, but it also contains a lot of other stuff, including signal handler pointers and file handles for standard input, output, and error streams.

Newlib's errno macro hides all of this from the user, of course. To check for errors, an application only needs to refer to the value of errno, just like it would in any ANSI environment. To determine why an fopen() failed, for example, you still do this: fp = fopen( "myfile.txt", "rw" ); if (fp == NULL) { switch( errno ) { case EACCESS: /* we don't have permissions */ ... } } just like ANSI C says to. The only difference is that the reference to errno is a macro that returns _impure_ptr->errno, instead of the value of a global integer.

Newlib declares one _reent structure and aims _impure_ptr at it during initialization, so everything starts out correctly for situations where only one thread of execution is in the library at a time. To facilitate multiple contexts, you must take two additional steps: you must provide one _reent structure for each execution thread, and you must move _impure_ptr between these structures during context switches. Simple enough, no?

If you also want a reentrant malloc(), you must provide the functions __malloc_lock() and __malloc_unlock() to protect your memory pool from corruption during allocations. This step usually isn't necessary if your memory pool is built from memory management components supplied by an RTOS, as the functions there are often reentrant already, and need no additional protection. Hooks called __env_lock and __env_unlock are also provided, so that you can implement a reentrant environment variable pool as well. Designed for portability All of newlib's functionality builds on a set of 17 stub functions that newlib uses to hook into the execution environment. By replacing these stubs, you can integrate newlib with just about any system imaginable, from one with no operating system at all to one based on an RTOS to one with a complete POSIX OS like Linux. Newlib's documentation provides details on which stubs are needed for each library function, so you only need to provide stubs for the portions of newlib that you intend to use. For example, newlib has a stub called _fork, but you don't need to do anything with it unless your application will call system() or fork().

Newlib does not include a filesystem, though it may seem like it requires one for proper operation, particularly when you consider that it provides file-oriented functions like fprintf() and fseek(). As you dig deeper, however, you will find that while newlib likes to think that there is a stream-oriented filesystem working behind the scenes, its integration layer has been conveniently organized to not require this. Building newlib Building newlib for a supported target is a straightforward process that follows the conventions adhered to by most open source projects. After you download and decompress the source code, you simply configure and build it using a previously constructed like gcc. In other words, you type: tar xzvf newlib-1.8.2.tar.gz mkdir build-newlib cd build-newlib ../newlib-1.8.2/configure target=#&60;target-name> make all install info install-info and then go get a glass of your favorite beverage. When you get back, the build process will have produced the files libc.a, libg.a (a debugging-enabled libc), and libm.a, in the directory /usr/local/#&60;target-name>. If the target you specified has several variants, the build process will produce multiple files, each with optimizations specific to that variant. Link one or more of these files with your application, and there you have it-a free C runtime environment.

Newlib does not require use of the GNU compiler, but if you use something else, you'll have to make adjustments to newlib's makefiles after the configuration step.

Newlib's build process also produces documentation, in the files libc.info and libm.info. By default these files go into /usr/local/info, and they can be browsed using info, a documentation browser included with most Linux distributions. Just change to the directory containing these files, and type: info -f ./libc.info

Newlib's configuration utility supports several options, not the least of which are the definition of the target system (what CPU and OS the library will run under) and where to put the files generated during the build. As an example, the following command will set up a build for the Hitachi SH CPU using the elf output file format, and put the resulting libraries in /home/bgat/newlib-install. Since I don't specify an operating system, the build scripts will assume that the target system does not have one, with the idea that I will provide code stubs with my application.

../newlib-1.8.2/configure -target=sh-elf -prefix=/home/ bgat/newlib-install

Use the -help option to see a complete list of command line options.

Table 1 lists some of the that newlib supports. The text in parentheses is the text to use in the -target option during configuration. Table 1: Supported CPUs (ARM)

AMD 29K family (a29k) Intel 386 (i386) Hitachi H8-300 (h8300) Hitachi H8-500 (h8500) Hitachi SH-2 (sh) Motorola 68K and CPU32 (m68k) Mitsubishi MN10300 (mn10300) NEC V70 (necv70) Zilog Z8000 (z8k )

Since the library is implemented almost entirely in C, newlib can also be used on unsupported microprocessors simply by replacing its startup code and implementations of memcpy() and setjmp()/longjmp(), and by obtaining a cross compiler for the target system.

Newlib can be built in both Unix-like environments, and under , Red Hat's freely available POSIX-compatibility environment that makes Win32 hosts more Unix-like. (You don't even need to modify the previous example-Cygwin's shell makes the Win32 directory structure look enough like Unix that newlib will build without problems. More information about Cygwin is available at sources.redhat.com/cygwin/.) Tweaks Newlib's source code has a few configuration points, and you will want to use them to eliminate unneeded stubs, to optimize for code size instead of speed, or to remove floating-point support. You won't want to spend much time with this in the early stages of a project, before you have enough of a system in place to evaluate the benefits of your changes, but do not be afraid to tinker once you can test your refinements-that's part of what having source code is all about, after all.

The best way to go about tweaking newlib is to change the values in the Makefile generated by the configure command, before you type make. Look for the variable CFLAGS_FOR_TARGET, and add flags there like:

-DINTEGER_ONLY to build an integer-only library, or:

-DPREFER_SIZE_OVER_SPEED to enable a few small changes that reduce library code size.

You can also adjust the value of the CFLAGS variable, to affect the way the library is compiled. For example, if you are using the GNU C compiler then you could use:

-Os (instead of -O2) to tell it to optimize for code size over performance, or:

-O3 (instead of -O2) to tell the compiler to optimize for raw performance over everything else. You could also add:

-fomit-frame-pointer to tell the compiler to not build stack frames for functions in the library that don't need them, which saves some space and boosts performance slightly. Don't eliminate stack frames if you intend to step through code inside of newlib itself, however, because it likely won't work-like most debuggers, the GNU debugger needs a valid stack frame at all times.

You can discover additional, minor source code configuration points by using the find program on the library sources, to locate sections of conditionally compiled code. Here's how I do it, from the top of the newlib source tree: find . -name "*[ch]" -type f | xargs grep "#if"

One other thing: if you decide you don't like your changes, and want to try again, you don't need to repeat the entire configuration process. Instead, simply edit the makefile and then do a clean rebuild, like this: make clean all install

That's all for now. In a continuation of this article, I'll show you how to integrate newlib into a multithreaded runtime environment that features Jean Labrosse's C/OS, a well-written and thrifty embedded RTOS that doesn't provide a filesystem API, but can do everything that newlib needs it to do anyway.

Go to the continuation of this article, Embedding with GNU: Newlib Part 2

Bill Gatliff is an independent embedded systems consultant and a contributing editor to ESP. He welcomes questions and comments at [email protected].

References

Barr, Michael. "Know Your Rights," ESP, September 2000, p. 80. sources.redhat.com/newlib

Return to January 2002 Table of Contents