Alp-Ch08-Linux-System-Calls.Pdf

10 0430 Ch08 5/22/01 10:33 AM Page 167

8 Linux System Calls

So far, we’ve presented a variety of functions that your program can invoke to perform system-related functions, such as parsing command-line options, manipulating processes, and mapping memory. If you look under the hood, you’ll find that these functions fall into two categories, based on how they are implemented.

- A library function is an ordinary function that resides in a library external to your program. Most of the library functions we’ve presented so far are in the standard C library, libc. For example, getopt_long and mkstemp are functions provided in the C library. A call to a library function is just like any other function call. The arguments are placed in processor registers or onto the stack, and execution is transferred to the start of the function’s code, which typically resides in a loaded shared library.

- A system call is implemented in the Linux kernel. When a program makes a system call, the arguments are packaged up and handed to the kernel, which takes over execution of the program until the call completes. A system call isn’t an ordinary function call, and a special procedure is required to transfer control to the kernel. However, the GNU C library (the implementation of the standard C library provided with GNU/Linux systems) wraps Linux system calls with functions so that you can call them easily. Low-level I/O functions such as open and read are examples of system calls on Linux.

The set of Linux system calls forms the most basic interface between programs and the Linux kernel. Each call presents a basic operation or capability. Some system calls are very powerful and can exert great influence on the system.
For instance, some system calls enable you to shut down the Linux system or to allocate system resources and prevent other users from accessing them. These calls have the restriction that only processes running with superuser privilege (programs run by the root account) can invoke them. These calls fail if invoked by a nonsuperuser process.

Note that a library function may invoke one or more other library functions or system calls as part of its implementation.

Linux currently provides about 200 different system calls. A listing of system calls for your version of the Linux kernel is in /usr/include/asm/unistd.h. Some of these are for internal use by the system, and others are used only in implementing specialized library functions. In this chapter, we’ll present a selection of system calls that are likely to be the most useful to application and system programmers. Most of these system calls are declared in <unistd.h>.

8.1 Using strace

Before we start discussing system calls, it will be useful to present a command with which you can learn about and debug system calls. The strace command traces the execution of another program, listing any system calls the program makes and any signals it receives.

To watch the system calls and signals in a program, simply invoke strace, followed by the program and its command-line arguments. For example, to watch the system calls that are invoked by the hostname [1] command, use this command:

    % strace hostname

This produces a couple of screens of output. Each line corresponds to a single system call. For each call, the system call’s name is listed, followed by its arguments (or abbreviated arguments, if they are very long) and its return value. Where possible, strace conveniently displays symbolic names instead of numerical values for arguments and return values, and it displays the fields of structures passed by a pointer into the system call. Note that strace does not show ordinary function calls.
In the output from strace hostname, the first line shows the execve system call that invokes the hostname program: [2]

    execve("/bin/hostname", ["hostname"], [/* 49 vars */]) = 0

The first argument is the name of the program to run; the second is its argument list, consisting of only a single element; and the third is its environment list, which strace omits for brevity. The next 30 or so lines are part of the mechanism that loads the standard C library from a shared library file.

[1] hostname invoked without any flags simply prints out the computer’s hostname to standard output.
[2] In Linux, the exec family of functions is implemented via the execve system call.

Toward the end are system calls that actually help do the program’s work. The uname system call is used to obtain the system’s hostname from the kernel:

    uname({sys="Linux", node="myhostname", ...}) = 0

Observe that strace helpfully labels the fields (sys and node) of the structure argument. This structure is filled in by the system call: Linux sets the sys field to the operating system name and the node field to the system’s hostname. The uname call is discussed further in Section 8.15, “uname.”

Finally, the write system call produces output. Recall that file descriptor 1 corresponds to standard output. The third argument is the number of characters to write, and the return value is the number of characters that were actually written.

    write(1, "myhostname\n", 11) = 11

This may appear garbled when you run strace because the output from the hostname program itself is mixed in with the output from strace.

If the program you’re tracing produces lots of output, it is sometimes more convenient to redirect the output from strace into a file. Use the option -o filename to do this.

Understanding all the output from strace requires detailed familiarity with the design of the Linux kernel and execution environment.
Much of this is of limited interest to application programmers. However, some understanding is useful for debugging tricky problems or understanding how other programs work.

8.2 access: Testing File Permissions

The access system call determines whether the calling process has access permission to a file. It can check any combination of read, write, and execute permission, and it can also check for a file’s existence.

The access call takes two arguments. The first is the path to the file to check. The second is a bitwise OR of R_OK, W_OK, and X_OK, corresponding to read, write, and execute permission. The return value is 0 if the process has all the specified permissions. If the file exists but the calling process does not have the specified permissions, access returns -1 and sets errno to EACCES (or EROFS, if write permission was requested for a file on a read-only file system).

If the second argument is F_OK, access simply checks for the file’s existence. If the file exists, the return value is 0; if not, the return value is -1 and errno is set to ENOENT. Note that errno may instead be set to EACCES if a directory in the file path is inaccessible.

The program shown in Listing 8.1 uses access to check for a file’s existence and to determine read and write permissions. Specify the name of the file to check on the command line.

Listing 8.1 (check-access.c) Check File Access Permissions

    #include <errno.h>
    #include <stdio.h>
    #include <unistd.h>

    int main (int argc, char* argv[])
    {
      char* path;
      int rval;

      /* Require the file name as the only command-line argument.  */
      if (argc != 2) {
        fprintf (stderr, "usage: %s FILE\n", argv[0]);
        return 1;
      }
      path = argv[1];

      /* Check file existence.  */
      rval = access (path, F_OK);
      if (rval == 0)
        printf ("%s exists\n", path);
      else {
        if (errno == ENOENT)
          printf ("%s does not exist\n", path);
        else if (errno == EACCES)
          printf ("%s is not accessible\n", path);
        return 0;
      }

      /* Check read access.  */
      rval = access (path, R_OK);
      if (rval == 0)
        printf ("%s is readable\n", path);
      else
        printf ("%s is not readable (access denied)\n", path);

      /* Check write access.  */
      rval = access (path, W_OK);
      if (rval == 0)
        printf ("%s is writable\n", path);
      else if (errno == EACCES)
        printf ("%s is not writable (access denied)\n", path);
      else if (errno == EROFS)
        printf ("%s is not writable (read-only filesystem)\n", path);
      return 0;
    }

For example, to check access permissions for a file named README on a CD-ROM, invoke it like this:

    % ./check-access /mnt/cdrom/README
    /mnt/cdrom/README exists
    /mnt/cdrom/README is readable
    /mnt/cdrom/README is not writable (read-only filesystem)

8.3 fcntl: Locks and Other File Operations

The fcntl system call is the access point for several advanced operations on file descriptors. The first argument to fcntl is an open file descriptor, and the second is a value that indicates which operation is to be performed. For some operations, fcntl takes an additional argument. We’ll describe here one of the most useful fcntl operations, file locking. See the fcntl man page for information about the others.

The fcntl system call allows a program to place a read lock or a write lock on a file, somewhat analogous to the mutex locks discussed in Chapter 5, “Interprocess Communication.” A read lock is placed on a readable file descriptor, and a write lock is placed on a writable file descriptor. More than one process may hold a read lock on the same file at the same time, but only one process may hold a write lock, and the same file may not be both locked for read and locked for write.
