Ben-Gurion University of the Negev, Operating Systems 2012

OPERATING SYSTEMS, ASSIGNMENT 4: FILE SYSTEM

SUBMISSION DATE: 26/06/2012 22:00

In this assignment you are to extend the file system of xv6. xv6 implements a Unix-like file system, and when running on top of QEMU, stores its data on a virtual IDE disk (fs.img) for persistence. To get familiar with xv6’s file system design and capabilities, it is recommended to read Chapter 5 of the xv6 book.

Assignment overview

The assignment consists of the following parts:

1. “Hacking” the xv6 file system. This part includes expanding the maximal file size supported by the file system, and adding support for symbolic links.

2. File tagging (bonus). In this part you are to add support for adding key-value pairs to files.

3. Find application. In this part you are to implement a find application, to search the file system for files matching specified criteria.

Task 0: Running xv6

Begin by downloading our revision of xv6 from the os122 svn repository:

- Open a shell and traverse to the desired working directory.
- Check out the project files using svn by calling: svn checkout http://bgu-os-122-xv6-rev6-1.googlecode.com/svn/ass4
- Build xv6 by calling: make
- Run xv6 on top of QEMU by calling: make qemu
- When working through a remote connection to the department’s computers use: screen make qemu-nox

PART 1: "HACKING" THE XV6’S FILE SYSTEM

Xv6, like other modern operating systems, supports a fixed maximum file size. Xv6's implementation was inspired by that of UNIX; both operating systems use an i-node architecture for their file systems. In xv6, the i-node structure contains pointers to 12 direct data blocks, each of size 512 bytes (totaling 6KB), as well as a single indirect pointer; the latter points to a block which maps an additional 128 blocks. This one level of indirection gives access to 64KB of additional data, allowing files of up to 70KB.

Expanding the maximal file size

Extend the i-node structure to support files of size up to 8MB, by adding a double indirection layer to the i-nodes. This will require changes to the i-node structure, as well as to the disk representation of i-nodes (dinode).
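The block arithmetic behind this extension can be sketched as follows. This is only an illustration that compiles on a host machine, not xv6 kernel code: BSIZE, NDIRECT, NINDIRECT and MAXFILE follow the names used in xv6's fs.h, while NDINDIRECT, struct blkpos and locate_block() are hypothetical names introduced here. It assumes 4-byte block addresses, so one 512-byte block holds 128 of them.

```c
#include <stdint.h>

#define BSIZE      512u                         /* block size in bytes */
#define NDIRECT    12u                          /* direct pointers in the i-node */
#define NINDIRECT  (BSIZE / 4u)                 /* 128 addresses per indirect block */
#define NDINDIRECT (NINDIRECT * NINDIRECT)      /* hypothetical name: 16384 blocks */
#define MAXFILE    (NDIRECT + NINDIRECT + NDINDIRECT)  /* 8MB + 70KB in blocks */

enum level { LVL_DIRECT, LVL_INDIRECT, LVL_DINDIRECT, LVL_OUT_OF_RANGE };

/* Hypothetical helper type: where a logical block number lives. */
struct blkpos {
    enum level level;
    uint32_t outer;  /* slot in the double-indirect block (LVL_DINDIRECT only) */
    uint32_t inner;  /* slot in the direct array or in an indirect block */
};

/* Map a logical block number of a file to its position in the i-node tree. */
struct blkpos locate_block(uint32_t bn)
{
    struct blkpos p = { LVL_OUT_OF_RANGE, 0, 0 };

    if (bn < NDIRECT) {
        p.level = LVL_DIRECT;
        p.inner = bn;
        return p;
    }
    bn -= NDIRECT;

    if (bn < NINDIRECT) {
        p.level = LVL_INDIRECT;
        p.inner = bn;
        return p;
    }
    bn -= NINDIRECT;

    if (bn < NDINDIRECT) {
        p.level = LVL_DINDIRECT;
        p.outer = bn / NINDIRECT;  /* which second-level block */
        p.inner = bn % NINDIRECT;  /* which slot inside it */
        return p;
    }
    return p;  /* beyond the maximal file size */
}
```

In a real implementation the same case analysis would presumably sit in the block-mapping routine of fs.c, with each level allocating its block on first use.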

Hint: the file mkfs.c is written using standard C libraries and is built and executed by the makefile outside of xv6 to create the virtual drive, fs.img. One of its tasks is to write the superblock, which contains metadata characterizing the file system. You will have to modify the contents of the superblock to be consistent with the changes you make. The virtual disk, fs.img, should contain at least 2^15 blocks (totaling 16MB).

Hint: the size of the dinode structure should be a divisor of the block size, i.e. an integer number of dinodes should fit in a block. You may want to add padding for this purpose.
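One possible way to apply this hint is sketched below. The struct is a host-side illustration and not the required layout: the name dinode_sketch and the pad field are hypothetical, and the field list assumes the stock xv6 dinode (four 16-bit fields, a 32-bit size, and an address array) with one extra address for the double indirect block. Under those assumptions the fields occupy 68 bytes, which does not divide 512, so the sketch pads to 128 bytes and lets the compiler verify the divisibility.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

#define BSIZE   512
#define NDIRECT 12

/* Hypothetical on-disk i-node with a double indirect pointer added. */
struct dinode_sketch {
    int16_t  type;                /* file type */
    int16_t  major;               /* major device number */
    int16_t  minor;               /* minor device number */
    int16_t  nlink;               /* number of links to the i-node */
    uint32_t size;                /* file size in bytes */
    uint32_t addrs[NDIRECT + 2];  /* 12 direct + single + double indirect */
    uint8_t  pad[60];             /* 68 bytes of fields padded up to 128 */
};

/* Fail the build if an integer number of dinodes does not fit in a block. */
static_assert(BSIZE % sizeof(struct dinode_sketch) == 0,
              "dinode size must divide the block size");
```

With this padding four dinodes fit in each block. The padding wastes space, so a design that puts the spare bytes to use is equally valid.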

Sanity test

Write a simple user application which creates a file of size 1MB and writes a notification message to the screen after writing the first 12 direct blocks, after writing the single indirect blocks, and after writing the double indirect blocks. The purpose of this application is to test your implementation and, in case it contains bugs, to help you identify where the bugs are. You are free to change the output or add additional printouts as you see fit.

The output should look something like:

Finished writing 6KB (direct)
Finished writing 70KB (single indirect)
Finished writing 1MB
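The control flow of such a test can be sketched on a host machine as shown below. This is an assumption-laden illustration: it uses the standard C stdio API, whereas an xv6 user program would use xv6's own headers, open(), write() and its printf(). The function name write_sanity_file() and the TOTAL_BLOCKS constant are hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define BSIZE        512
#define NDIRECT      12
#define NINDIRECT    128
#define TOTAL_BLOCKS (1024 * 1024 / BSIZE)   /* 1MB = 2048 blocks */

/*
 * Write 1MB to f one block at a time and report each milestone.
 * Returns the number of milestones reported, or -1 on a write error.
 */
int write_sanity_file(FILE *f)
{
    char buf[BSIZE];
    int reported = 0;

    memset(buf, 'x', sizeof buf);
    for (int b = 1; b <= TOTAL_BLOCKS; b++) {
        if (fwrite(buf, 1, BSIZE, f) != BSIZE)
            return -1;  /* on xv6 this is where a mapping bug would show up */

        if (b == NDIRECT) {
            printf("Finished writing 6KB (direct)\n");
            reported++;
        } else if (b == NDIRECT + NINDIRECT) {
            printf("Finished writing 70KB (single indirect)\n");
            reported++;
        } else if (b == TOTAL_BLOCKS) {
            printf("Finished writing 1MB\n");
            reported++;
        }
    }
    return reported;
}
```

Writing one block per call keeps the milestones aligned with the boundaries between the direct, single indirect and double indirect regions, which makes a failure easy to attribute to one level.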

Adding support for symbolic links

Xv6 supports hard links via the user space program ln. Hard links allow different file names to reference the same actual file by using the same i-node number. For example, when a hard link named "b.txt" is created for a file named "a.txt", both "a.txt" and "b.txt" refer to the same file (data) on disk. Changes made to “a.txt” are reflected in “b.txt” and vice versa. Deleting one will not affect the other (it will only decrease the link count). In this task you will expand the program ln to support symbolic links, also referred to as soft links. When a new symbolic link is created, it doesn't share the same i-node number as the pointed file (target). Instead, a new file is created and a new i-node number is assigned to it. The contents of the new symbolic link file will be the path to the target.

Symbolic links are a special type of files, and should be assigned a unique enumeration value in the file type enum.

A symbolic link can point to any type of file: regular file, directory and even another symbolic link.

A symbolic link can be either absolute or relative. Naturally, moving a relative link to a different location will result in a broken link.

The following syntax should be used to create a symbolic link: ln -s old_path new_path

Example: ln -s /home/os/a.txt /home/algo/b.txt

This creates the new symbolic link /home/algo/b.txt, which points to the existing file /home/os/a.txt.

The following system calls should be implemented to support symbolic links:

int symlink(const char *oldpath, const char *newpath);

symlink() creates a symbolic link, whose name is specified by the parameter newpath, and which points to the file whose name is specified by the parameter oldpath. The latter file can be of any type, and may not even exist. symlink() returns 0 upon success, and a negative integer upon failure.

int readlink(const char *pathname, char *buf, size_t bufsiz);

readlink() reads the name of the file to which the symbolic link points. The name of the link is specified by the parameter pathname. The target path is stored in the buffer buf, whose size is specified by bufsiz. readlink() returns the number of bytes which have been placed in the buffer buf, or the value -1 upon failure. readlink() should dereference any symbolic links encountered in the path.

Protection from loops

Symbolic links to directories may cause infinite loops. We shall tackle this issue in the same manner as it is done in UNIX, by limiting the length of a chain of links to 16. That is, when retrieving the target of a link which points to a link, which points to a link, and so on, the maximum number of jumps we will allow is 16. Longer chains will be considered loops.
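The jump limit can be illustrated with a small host-side simulation. Nothing here is xv6 code: the table of entries stands in for the file system, and the names struct entry, lookup(), resolve() and MAX_LINK_DEPTH are hypothetical. A real implementation would presumably read each link's target from its i-node instead of from a table.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LINK_DEPTH 16   /* maximal number of jumps along a chain of links */

/* Hypothetical stand-in for a file system entry. */
struct entry {
    const char *path;    /* name of the entry */
    const char *target;  /* link target, or NULL if the entry is not a link */
};

static const struct entry *lookup(const struct entry *tab, size_t n,
                                  const char *path)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(tab[i].path, path) == 0)
            return &tab[i];
    return NULL;
}

/*
 * Follow a chain of links starting at path.
 * Returns the path of the final non-link entry, or NULL if the chain is
 * broken or needs more than MAX_LINK_DEPTH jumps (treated as a loop).
 */
const char *resolve(const struct entry *tab, size_t n, const char *path)
{
    int hops = 0;

    for (;;) {
        const struct entry *e = lookup(tab, n, path);
        if (e == NULL)
            return NULL;            /* broken link or missing file */
        if (e->target == NULL)
            return e->path;         /* reached a regular file or directory */
        if (hops == MAX_LINK_DEPTH)
            return NULL;            /* too many jumps: considered a loop */
        hops++;
        path = e->target;
    }
}
```

Counting jumps instead of remembering visited i-nodes keeps the check constant-space, at the cost of also rejecting legitimate chains longer than 16 links.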

Extending user applications

Extend the user applications cat, grep, sh and wc to handle a symbolic link as if it were the target file. For example, cat some_link should display the contents of the target file to which the symbolic link some_link points.

The deletion of a symbolic link shouldn’t affect the target file, i.e. the user application rm, when applied to a symbolic link, only deletes the link.

The shell application, sh, should handle traversal to directories pointed to by symbolic links (cd some_link), as well as execution of programs pointed to by symbolic links. Don’t forget to handle broken links, by displaying an appropriate error message.

PART 2: FILE TAGGING

This part is a bonus worth 15 points.

In this part of the assignment, you will add a new feature to the xv6 file system in order to allow users to tag files with arbitrary key-value pairs they define. This tagging feature can be very useful, and exists in many real world applications already. For example, you can add tags to PDF, JPG, MP3 and many other media files. However, these application-specific implementations are ad hoc and must be re-implemented for new applications. You'll provide a general implementation in the xv6 file system so that it applies to all file types.

Hint: You can extend the i-node structure to include a pointer to a block which will contain the key-value pairs.

You'll need to implement the following three system calls:

int ftag(int fd, char *key, char *val);

System call ftag() tags a file identified by fd with a new key-value pair, with the key being key and value val. Argument key is a null-terminated string with a maximum length of 10 and minimum length of 2, counting the ending null byte. Argument val specifies the value in the key-value pair, and is a null-terminated string as well, with a maximum length of 30 and minimum length of 2, counting the ending null byte. If key already exists in file fd's tags, this system call overwrites the old tag with the new one.

int funtag(int fd, char *key);

System call funtag() removes a tag with key key from the file fd. If there is no key-value pair matching key, funtag() should return an error code.

int gettag(int fd, char *key, char *buf);

System call gettag() retrieves the value mapped to key from the file fd. The retrieved value is stored in buf. gettag() should return the length of the value if it exists and -1 otherwise.

Implementation notes:

To implement these system calls, you'll need to change the on-disk representation of an i-node and add code to fs.c. Depending on your design, you may need to change mkfs.c so that the layout of the new file system created by mkfs is consistent with the file system layout expected by fs.c.

Make sure you use proper locking and unlocking when working with i-nodes and blocks (examine the xv6’s code to see how it should be done).

You may support in your implementation the addition of tags only to regular files and directories (no need to support tags for symbolic links).

You may define a special character (which one is up to you) to use as a delimiter.

You may assume that all tags of a file fit within one 512-byte disk block. You never have to allocate more than a single disk block to store the tags for each file.
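The semantics of the three calls can be sketched against an in-memory copy of the tag block. This is one possible design and not the required one: it uses fixed-size slots instead of the delimiter-based layout mentioned above, and all names (struct tagslot, struct tagblock, tag_set(), tag_remove(), tag_get(), KEY_MAX, VAL_MAX, NSLOTS) are hypothetical. It assumes the block is zero-filled when first allocated, and it omits reading and writing the block through the buffer cache as well as i-node locking.

```c
#include <stddef.h>
#include <string.h>

#define BSIZE   512
#define KEY_MAX 10   /* maximal key length, counting the null byte */
#define VAL_MAX 30   /* maximal value length, counting the null byte */
#define NSLOTS  (BSIZE / (KEY_MAX + VAL_MAX))   /* 12 slots per block */

struct tagslot  { char key[KEY_MAX]; char val[VAL_MAX]; };
struct tagblock { struct tagslot slot[NSLOTS]; };  /* 480 bytes, fits a block */

/* A string is valid if it has 1 to max-1 characters before the null byte. */
static int valid(const char *s, size_t max)
{
    size_t n = 0;
    while (n < max && s[n] != '\0')
        n++;
    return n >= 1 && n < max;
}

/* Find the slot holding key; an empty key marks a free slot. */
static struct tagslot *find_slot(struct tagblock *b, const char *key)
{
    for (int i = 0; i < NSLOTS; i++)
        if (b->slot[i].key[0] != '\0' && strcmp(b->slot[i].key, key) == 0)
            return &b->slot[i];
    return NULL;
}

static struct tagslot *free_slot(struct tagblock *b)
{
    for (int i = 0; i < NSLOTS; i++)
        if (b->slot[i].key[0] == '\0')
            return &b->slot[i];
    return NULL;
}

/* ftag() semantics: add the pair, or overwrite the value of an existing key. */
int tag_set(struct tagblock *b, const char *key, const char *val)
{
    if (!valid(key, KEY_MAX) || !valid(val, VAL_MAX))
        return -1;
    struct tagslot *s = find_slot(b, key);
    if (s == NULL)
        s = free_slot(b);
    if (s == NULL)
        return -1;                  /* block is full */
    strcpy(s->key, key);
    strcpy(s->val, val);
    return 0;
}

/* funtag() semantics: remove the pair, error if the key is absent. */
int tag_remove(struct tagblock *b, const char *key)
{
    struct tagslot *s = valid(key, KEY_MAX) ? find_slot(b, key) : NULL;
    if (s == NULL)
        return -1;
    memset(s, 0, sizeof *s);
    return 0;
}

/* gettag() semantics: copy the value into buf, return its length or -1. */
int tag_get(struct tagblock *b, const char *key, char *buf)
{
    struct tagslot *s = valid(key, KEY_MAX) ? find_slot(b, key) : NULL;
    if (s == NULL)
        return -1;
    strcpy(buf, s->val);
    return (int)strlen(s->val);
}
```

Fixed-size slots trade some capacity (12 tags per file) for simple in-place overwrite and removal; a delimiter-separated layout can hold more short tags but needs compaction on removal.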

PART 3: FIND APPLICATION

The find application is required by POSIX, and is found in many UNIX-like systems. It is used to search the file system for files matching user-specified criteria, and apply a user-specified action on each matching file. It is a very useful application; we will implement a (very) simplistic version of it.

The syntax of find which you are to implement is: find <path> <options> <tests>

The path argument, which is the only mandatory argument, specifies the location where the search should begin and descend from. In case the provided path is a file and not a directory, then only the specified file will be tested for the specified criteria. The rest of the arguments are optional and are explained below. The output of find should be the full paths of all matches.

Options

-follow: Dereference symbolic links. If a symbolic link is encountered, apply tests to the target of the link. If a symbolic link points to a directory, then descend into it.

-help: Print a summary of the command-line usage of find and exit.

Tests

-name <file name>: All files named exactly <file name> (no wildcards).

-size (+/-)n: File is of size n (exactly), +n (more than n), -n (less than n).

-type c: File is of type c:
    d    directory
    f    regular file
    s    soft (symbolic) link

-tag key=value: (Bonus) File is tagged by the specified key-value pair. If value equals “?” then all files having the key key are matched, regardless of the value.
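As an illustration of how one of these tests might be evaluated, the sketch below handles the argument of -size. It is written for a host machine and uses strtoul() from the standard C library; xv6's user library offers atoi() instead, so a port would need its own parsing. The function name size_matches() and its return convention are hypothetical.

```c
#include <stdlib.h>

/*
 * Evaluate a -size argument of the form n, +n or -n against a file size.
 * Returns 1 if the file matches, 0 if it does not, -1 if arg is malformed.
 */
int size_matches(const char *arg, unsigned long file_size)
{
    char mode = '=';                     /* exact match unless prefixed */

    if (*arg == '+' || *arg == '-')
        mode = *arg++;
    if (*arg < '0' || *arg > '9')
        return -1;                       /* no digits after the prefix */

    char *end;
    unsigned long n = strtoul(arg, &end, 10);
    if (*end != '\0')
        return -1;                       /* trailing garbage */

    switch (mode) {
    case '+': return file_size > n;      /* more than n */
    case '-': return file_size < n;      /* less than n */
    default:  return file_size == n;     /* exactly n */
    }
}
```

For example, size_matches("+1000000", st_size) would implement the test used in find / -type f -size +1000000, where st_size stands for the size reported by stat for the candidate file.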

Examples

find / -type d -name xv6
Search the whole file system for directories named “xv6”.

find /src -name xv6
Search /src and all its subdirectories recursively for files, directories and links named “xv6”.

find / -type f -size +1000000
Find all files having size greater than 1000000 bytes.

find / -tag artist=Mozart
Find all files and directories which have the tag artist=Mozart.

find / -type d -tag genre=?
Find all directories which have a tag with the key “genre”.

Submission guidelines

Assignment due date: 26/06/2012 22:00

Make sure that your Makefile is properly updated and that your code compiles with no warnings whatsoever. We strongly recommend documenting your code changes with comments; these are often handy when discussing your code with the graders.

Due to our constrained resources, assignments are only allowed in pairs. Please note this important point and try to match up with a partner as soon as possible.

Submissions are only allowed through the submission system. To avoid submitting a large number of xv6 builds you are required to submit an svn patch (i.e. a file which patches the original xv6 and applies all your changes).

Tip: although graders will only apply your latest patch file, the submission system supports multiple uploads. Use this feature often and make sure you upload patches of your current work even if you haven’t completed the assignment.

Finally, note that graders are instructed to examine your code on lab computers only (!). Test your code on lab computers prior to submission.

GOOD LUCK!
