Interlude: Process API

ASIDE: INTERLUDES
Interludes will cover more practical aspects of systems, including a particular focus on operating system APIs and how to use them. If you don’t like practical things, you could skip these interludes. But you should like practical things, because, well, they are generally useful in real life; companies, for example, don’t usually hire you for your non-practical skills.

In this interlude, we discuss process creation in UNIX systems. UNIX presents one of the most intriguing ways to create a new process with a pair of system calls: fork() and exec(). A third routine, wait(), can be used by a process wishing to wait for a process it has created to complete. We now present these interfaces in more detail, with a few simple examples to motivate us. And thus, our problem:

CRUX: HOW TO CREATE AND CONTROL PROCESSES
What interfaces should the OS present for process creation and control? How should these interfaces be designed to enable powerful functionality, ease of use, and high performance?

5.1 The fork() System Call

The fork() system call is used to create a new process [C63]. However, be forewarned: it is certainly the strangest routine you will ever call¹. More specifically, you have a running program whose code looks like what you see in Figure 5.1; examine the code, or better yet, type it in and run it yourself!

¹ Well, OK, we admit that we don’t know that for sure; who knows what routines you call when no one is looking? But fork() is pretty odd, no matter how unusual your routine-calling patterns are.
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        printf("hello world (pid:%d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {
            // fork failed
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // child (new process)
            printf("hello, I am child (pid:%d)\n", (int) getpid());
        } else {
            // parent goes down this path (main)
            printf("hello, I am parent of %d (pid:%d)\n",
                   rc, (int) getpid());
        }
        return 0;
    }

Figure 5.1: Calling fork() (p1.c)

When you run this program (called p1.c), you’ll see the following:

    prompt> ./p1
    hello world (pid:29146)
    hello, I am parent of 29147 (pid:29146)
    hello, I am child (pid:29147)
    prompt>

Let us understand what happened in more detail in p1.c. When it first started running, the process prints out a hello world message; included in that message is its process identifier, also known as a PID. The process has a PID of 29146; in UNIX systems, the PID is used to name the process if one wants to do something with the process, such as (for example) stop it from running. So far, so good.

Now the interesting part begins. The process calls the fork() system call, which the OS provides as a way to create a new process. The odd part: the process that is created is an (almost) exact copy of the calling process. That means that to the OS, it now looks like there are two copies of the program p1 running, and both are about to return from the fork() system call.

The newly-created process (called the child, in contrast to the creating parent) doesn’t start running at main(), like you might expect (note, the “hello, world” message only got printed out once); rather, it just comes into life as if it had called fork() itself.
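To make the "comes into life as if it had called fork() itself" point concrete, here is a minimal sketch, assuming a POSIX system. The helper value_seen_by_child() is a hypothetical name of our own, not something from p1.c: it sets a local variable, calls fork(), and has the child report that variable through its exit status. Because the child starts at the return of fork() with a copy of the parent's state, it should see the value assigned before the call. The sketch collects the status with waitpid(), a sibling of the wait() routine mentioned in the introduction, and the child uses _exit() so that it does not flush any stdio buffers it inherited.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (our own name, not from the text): fork a child and
 * report the value of a local variable as the child sees it just after
 * fork() returns. Exit statuses carry only 8 bits, so this sketch assumes
 * 0 <= value <= 255. Returns the value seen by the child, or -1 on failure.
 */
int value_seen_by_child(int value) {
    int x = value;                 /* assigned before fork() */
    pid_t rc = fork();
    if (rc < 0) {
        return -1;                 /* fork failed */
    } else if (rc == 0) {
        /* Child: execution resumes here, not at main(), with a copy of x. */
        _exit(x);
    }
    /* Parent: wait for this particular child and decode its exit status. */
    int status;
    if (waitpid(rc, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling value_seen_by_child(42) from main() should hand back 42, since the child never re-runs the assignment; it simply inherits the result of it.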
OPERATING SYSTEMS WWW.OSTEP.ORG [VERSION 1.01]

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(int argc, char *argv[]) {
        printf("hello world (pid:%d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {          // fork failed; exit
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {  // child (new process)
            printf("hello, I am child (pid:%d)\n", (int) getpid());
        } else {               // parent goes down this path (main)
            int rc_wait = wait(NULL);
            printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
                   rc, rc_wait, (int) getpid());
        }
        return 0;
    }

Figure 5.2: Calling fork() And wait() (p2.c)

You might have noticed: the child isn’t an exact copy. Specifically, although it now has its own copy of the address space (i.e., its own private memory), its own registers, its own PC, and so forth, the value it returns to the caller of fork() is different. Specifically, while the parent receives the PID of the newly-created child, the child receives a return code of zero. This differentiation is useful, because it is simple then to write the code that handles the two different cases (as above).

You might also have noticed: the output (of p1.c) is not deterministic. When the child process is created, there are now two active processes in the system that we care about: the parent and the child. Assuming we are running on a system with a single CPU (for simplicity), then either the child or the parent might run at that point. In our example (above), the parent did and thus printed out its message first.
In other cases, the opposite might happen, as we show in this output trace:

    prompt> ./p1
    hello world (pid:29146)
    hello, I am child (pid:29147)
    hello, I am parent of 29147 (pid:29146)
    prompt>

The CPU scheduler, a topic we’ll discuss in great detail soon, determines which process runs at a given moment in time; because the scheduler is complex, we cannot usually make strong assumptions about what it will choose to do, and hence which process will run first. This nondeterminism, as it turns out, leads to some interesting problems, particularly in multi-threaded programs; hence, we’ll see a lot more nondeterminism when we study concurrency in the second part of the book.

© 2008–20, ARPACI-DUSSEAU, THREE EASY PIECES

5.2 The wait() System Call

So far, we haven’t done much: just created a child that prints out a message and exits. Sometimes, as it turns out, it is quite useful for a parent to wait for a child process to finish what it has been doing. This task is accomplished with the wait() system call (or its more complete sibling waitpid()); see Figure 5.2 for details.

In this example (p2.c), the parent process calls wait() to delay its execution until the child finishes executing. When the child is done, wait() returns to the parent.

Adding a wait() call to the code above makes the output deterministic. Can you see why? Go ahead, think about it.

(waiting for you to think .... and done)

Now that you have thought a bit, here is the output:

    prompt> ./p2
    hello world (pid:29266)
    hello, I am child (pid:29267)
    hello, I am parent of 29267 (rc_wait:29267) (pid:29266)
    prompt>

With this code, we now know that the child will always print first. Why do we know that? Well, it might simply run first, as before, and thus print before the parent. However, if the parent does happen to run first, it will immediately call wait(); this system call won’t return until the child has run and exited².
Thus, even when the parent runs first, it politely waits for the child to finish running, then wait() returns, and then the parent prints its message.

5.3 Finally, The exec() System Call

A final and important piece of the process creation API is the exec() system call³. This system call is useful when you want to run a program that is different from the calling program. For example, calling fork() in p2.c is only useful if you want to keep running copies of the same program.

² There are a few cases where wait() returns before the child exits; read the man page for more details, as always. And beware of any absolute and unqualified statements this book makes, such as “the child will always print first” or “UNIX is the best thing in the world, even better than ice cream.”

³ On Linux, there are six variants of exec(): execl(), execlp(), execle(), execv(), execvp(), and execvpe(). Read the man pages to learn more.

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/wait.h>

    int main(int argc, char *argv[]) {
        printf("hello world (pid:%d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {          // fork failed; exit
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {  // child (new process)
            printf("hello, I am child (pid:%d)\n", (int) getpid());
            char *myargs[3];
            myargs[0] = strdup("wc");    // program: "wc" (word count)
            myargs[1] = strdup("p3.c");  // argument: file to count
            myargs[2] = NULL;            // marks end of array
            execvp(myargs[0], myargs);   // runs word count
            printf("this shouldn't print out");
        } else {               // parent goes down this path (main)
            int rc_wait = wait(NULL);
            printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
                   rc, rc_wait, (int) getpid());
        }
        return 0;
    }

Figure 5.3: Calling fork(), wait(), And exec() (p3.c)
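The fork-then-exec-then-wait pattern of Figure 5.3 can be packaged as a small function. What follows is only a sketch under stated assumptions, not the book's code: run_and_wait() is a hypothetical name of our own, it uses waitpid() (the sibling of wait() mentioned in Section 5.2) so that it collects this particular child, and the exit code 127 for a failed execvp() is borrowed from shell convention rather than required by anything in the API.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (our own name, not part of any library): run the
 * program named by argv[0] in a child process and return its exit status.
 * argv must end with a NULL entry, as in Figure 5.3. Returns -1 if fork()
 * or waitpid() fails, or if the child did not exit normally.
 */
int run_and_wait(char *const argv[]) {
    pid_t rc = fork();
    if (rc < 0) {
        return -1;                  /* fork failed */
    } else if (rc == 0) {
        /* Child: replace this program with the requested one. */
        execvp(argv[0], argv);
        /* Reached only if execvp() failed; 127 is an assumed convention. */
        _exit(127);
    }
    /* Parent: block until this child finishes, then decode its status. */
    int status;
    if (waitpid(rc, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing an array holding "wc", "p3.c", and a terminating NULL would mirror what the child does in Figure 5.3, with the parent receiving the exit status of wc instead of just the child's PID.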
