Lecture 05: Understanding Execvp

Lecture 05: Understanding execvp Principles of Computer Systems Spring 2019 Stanford University Computer Science Department Lecturer: Chris Gregg PDF of this presentation 1 Lecture 05: Question from yesterday: where is the cursor stored for a file? Diagram from last lecture: In the last lecture, the question came up about multiple processes pointing to the open file table, and what happens to the cursor. Every time there is an open() system call, a new entry is generated for the open file table. If a process forks, both the parent and child share a pointer to the same open file entry, and therefore their cursors will be the same (i.e., if one reads a line, the other will read the next line). This is also true with the dup and dup2 system calls, which we will learn about (briefly) in lab and in more detail next week. 2 Assignment 2: The Unix v6 Filesystem Your second assignment of the quarter is to write a program in C (not C++) that can read from a 1970s-era Unix version 6 filesystem. The test data you are reading from are literally bit-for-bit representations of a Unix v6 disk. You will leverage all of the information covered in the file system lecture from Lecture 3 to write the program, and for more detailed information, see Section 2.5 of the Salzer and Kaashoek textbook. You will primarily be writing code in four different files (and we suggest you tackle them in this order): inode.c file.c directory.c pathname.c Because the program is in C, you will have to rely on arrays of structs, and low-level data manipulation, as you don't have access to any C++ standard template library classes. 3 Assignment 2: The Unix v6 Filesystem The basic idea of the assignment is to write code that will be able to locate and read files in the file system. You will, for example, be using a function we've written for you to read sectors from the disk image: /** * Reads the specified sector (e.g. block) from the disk. Returns the number of bytes read, * or -1 on error. */ int diskimg_readsector(int fd, int sectorNum, void *buf); Sometimes, buf will be an array of inodes, sometimes it will be a buffer that holds actual file data. In any case, the function will always read DISKIMG_SECTOR_SIZE number of bytes, and it is your job to determine the relevance of those bytes. It is critical that you carefully read through the header files for this assignment (they are actually relatively short). There are key constants (e.g., ROOT_INUMBER, struct direntv6, etc.) that are defined for you to use, and reading them will give you an idea about where to start for parts of the assignment. 4 Assignment 2: The Unix v6 Filesystem One function that can be tricky to write is the following: /** * Gets the location of the specified file block of the specified inode. * Returns the disk block number on success, -1 on error. */ int inode_indexlookup(struct unixfilesystem *fs, struct inode *inp, int blockNum); The unixfilesystem struct is defined and initialized for you. The inode struct will be populated already The blockNum is the number, in linear order, of the block you are looking for in a particular file. For example: Let's say the inode indicates that the file it refers to has a size of 180,000 bytes. And let's assume that blockNum is 302. This means that we are looking for the 302nd block of data in the file referred to by inode. Recall that blocks are 512 bytes long. How would you find the block index (i.e., sector index) of the 302nd block in the file? 5 Assignment 2: The Unix v6 Filesystem For example: Let's say the inode indicates that the file it refers to has a size of 180,000 bytes. And let's assume that blockNum is 302. This means that we are looking for the 302nd block of data in the file referred to by inode. Recall that blocks are 512 bytes long How would you find the block index (i.e., sector index) of the 302nd block in the file? 1. Determine if the file is large or not 2. If it isn't large, you know you only have direct addressing. 3. If it is large (this file is), then you have indirect addressing. 4. The 302nd block is going to fall into the second indirect block, because each block has 256 block numbers (each block number is an unsigned short, or a uint16_t). 5. You, therefore, need to use diskimg_readsector to read the sector listed in the 2nd block number (which is in the inode struct), then extract the (302 % 256)th short from that block, and return the value you find there. 6. If the block number you were looking for happened to fall into the 8th inode block, then you would have a further level of indirection for a doubly-indirect lookup. 6 Assignment 2: The Unix v6 Filesystem For the assignment, you will also have to search through directories to locate a particular file. You do not have to follow symbolic links (you can ignore them completely) You do need to consider directories that are longer than 32 files long (because they will take up more than two blocks on the disk), but this is not a special case! You are building generic functions to read files, so you can rely on them to do the work for you, even for directory file reading. Don't forget that a filename is limited to 14 characters, and if it is exactly 14 characters, there is not a trailing '\0' at the end of the name (this to conserve that one byte of data!) So...you might want to be careful about using strcmp for files (maybe use strncmp, instead?) This is a relatively advanced assignment, with a lot of moving parts. Start early! Come to office hours or ask Piazza questions. Remember: CAs won't look at your code, so you must formulate your questions to be conceptual enough that they can be answered. 7 Lecture 05: More on Multiprocessing Third example: Synchronizing between parent and child using waitpid waitpid can be used to temporarily block one process until a child process exits. pid_t waitpid(pid_t pid, int *status, int options); The first argument specifies the wait set, which for the moment is just the id of the child process that needs to complete before waitpid can return. The second argument supplies the address of an integer where termination information can be placed (or we can pass in NULL if we don't care for the information). The third argument is a collection of bitwise-or'ed flags we'll study later. For the time being, we'll just go with 0 as the required parameter value, which means that waitpid should only return when a process in the supplied wait set exits. The return value is the pid of the child that exited, or -1 if waitpid was called and there were no child processes in the supplied wait set. 8 Lecture 05: More on Multiprocessing Third example: Synchronizing between parent and child using waitpid Consider the following program, which is more representative of how fork really gets used in practice (full program, with error checking, is right here): int main(int argc, char *argv[]) { printf("Before.\n"); pid_t pid = fork(); printf("After.\n"); if (pid == 0) { printf("I am the child, and the parent will wait up for me.\n"); return 110; // contrived exit status } else { int status; waitpid(pid, &status, 0) if (WIFEXITED(status)) { printf("Child exited with status %d.\n", WEXITSTATUS(status)); } else { printf("Child terminated abnormally.\n"); } return 0; } } 9 Lecture 05: More on Multiprocessing Third example: Synchronizing between parent and child using waitpid The output is the same every single time the above program is executed. myth60$ ./separate Before. After. After. I am the child, and the parent will wait up for me. Child exited with status 110. myth60$ The implementation directs the child process one way, the parent another. The parent process correctly waits for the child to complete using waitpid. The parent lifts child exit information out of the waitpid call, and uses the WIFEXITED macro to examine some high-order bits of its argument to confirm the process exited normally, and it uses the WEXITSTATUS macro to extract the lower eight bits of its argument to produce the child return value (which we can see is, and should be, 110). The waitpid call also donates child process-oriented resources back to the system. 10 Lecture 05: More on Multiprocessing Synchronizing between parent and child using waitpid This next example is more of a brain teaser, but it illustrates just how deep a clone the process created by fork really is (full program, with more error checking, is right here). int main(int argc, char *argv[]) { printf("I'm unique and just get printed once.\n"); pid_t pid = fork(); bool parent = pid != 0; if ((random() % 2 == 0) == parent) sleep(1); // force exactly one of the two to sleep if (parent) waitpid(pid, NULL, 0); // parent shouldn't exit until child has finished printf("I get printed twice (this one is being printed from the %s).\n", parent ? "parent" : "child"); return 0; } The code emulates a coin flip to seduce exactly one of the two processes to sleep for a second, which is more than enough time for the child process to finish. The parent waits for the child to exit before it allows itself to exit.

Load more