Systems Programming/ C and UNIX
Alice E. Fischer
September 9, 2015
Outline
1 Compile and Run
2 Unix Topics: System Calls, The Unix File System, Directories and Files, I-Nodes
3 Directory Operations
4 Summary
Coding Standards
Part of every program grade will be based on these coding standards:
Keep all functions short. 30 lines is normally too long.
Global variables and use of goto are permanently forbidden.
Use class member variables and local variables appropriately.
Test data should be submitted and should test all parts of the program.
Write your code so that it conforms to the style standards on the next slide.
Style Standards
Keep your code and your comments within 80 columns.
Eliminate unnecessary words from your comments. Do not repeat the obvious. Comments are for ideas that are not obvious.
Use appropriate whitespace, not too little, not too much.
Put a space after every comma, semicolon, //, and anywhere else that will help to break up the words in your statement.
Do not put blank lines randomly all over the code. Do not put multiple consecutive blank lines in your file. If you need a visual divider, use a line of //----------------
Spelling and grammar need to be correct. Abbreviations are OK. Outright misspellings are not.
Indentation must be consistent. Learn how to use your IDE to do this.
The indentation style should be one of the two nationally recognized styles for C.
Indent 3 or 4 spaces at every level. Not less, not more.
Using the Tools
Attached to the website are two files, tools.cpp and tools.hpp. Download and save the pair. Please use the functions in the tools module as follows:
In tools.hpp, put your own name on line 9.
Call banner() or fbanner() or both at the beginning of each program you write. This will put a standard header on the output.
Call fatal( format, output-list ); to exit the program after a fatal error. It flushes the streams and exits properly.
Use hold() if you want to pause execution so you can read the output screen. Do not use a system call or anything non-standard.
Study the code in the fatal() function until you understand how variable-length argument lists work. Please read the relevant section of the handout from the first week (Chapter 20).
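The tools module itself is not reproduced in these notes, so the following is only a sketch of how a variable-length argument list can be handled in C. The names format_message() and fatal_sketch() are invented for illustration and are not the functions in tools.cpp; the calls va_start, va_end, vsnprintf, and vfprintf are standard C.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not the tools module's code: formats a message
   into buf from a printf-style format and a variable-length argument list. */
int format_message(char* buf, size_t size, const char* format, ...)
{
    va_list args;
    va_start(args, format);        /* begin after the last named parameter */
    int n = vsnprintf(buf, size, format, args);
    va_end(args);                  /* always pair va_start with va_end */
    return n;                      /* length the complete message needs */
}

/* Hypothetical fatal-style function built the same way:
   print the message, flush every open output stream, and exit. */
void fatal_sketch(const char* format, ...)
{
    va_list args;
    va_start(args, format);
    vfprintf(stderr, format, args);
    va_end(args);
    fflush(NULL);
    exit(EXIT_FAILURE);
}
```

A call such as format_message(buf, sizeof buf, "%s has %d entries", "dir", 3) passes two extra arguments after the format; the format string is what tells the function how many to expect and of what types.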
The Command Shell
Stages of Compilation Conditional Compilation Linking
Stages of Compilation
Between submitting your source code to the compiler and running your program, these things must happen:
Lexical analysis: the words and symbols in your code are identified.
Preprocessing: conditional compilation blocks, include files, macros, and constant definitions are removed from the code and replaced by compilable code.
Compilation: your code is parsed and converted to object code.
Linking: modules of your program are combined into a load module and linked to the system libraries.
Loading: the executable file is loaded into main memory and its process is put on the system's run queue.
The Preprocessor
Both C and C++ have a preprocessor, with its own language that is quite separate from the programming language. The preprocessor is used in four ways: To include header files for code modules: #include
Conditional Compilation
Conditional compilation enables us to “make” a single source code module in multiple ways that are appropriate for different OS and hardware architectures.
When you download source code for a major software project, it will be full of conditional compilation blocks. The top-level blocks will include other files that are also full of conditional compilation blocks. It soon becomes difficult to figure out what code is being compiled!
Conditional Compilation Directives
These directives include or exclude a block of code from the current module during the current compile. Excluded code is not compiled.
#ifdef SYMBOL
If the SYMBOL is defined, the following code block will be compiled.
#ifndef SYMBOL
If the SYMBOL is not defined, the following code block will be compiled.
#if followed by a constant expression
The if-condition is tested. When true, the lines of code up to the matching #endif are included in the program. When false, they are skipped (not compiled).
#if can be used with the operator "defined" or "!defined":
#if defined (SYMBOL) and #if !defined (SYMBOL)
#elif constant-expression and #else
Any number of #elif and one final #else can follow the #if.
#endif is used with #ifdef, #ifndef, and #if.
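As an illustration of the directives above, here is a minimal sketch that selects code by platform and by a debugging symbol. The macros __linux__, __APPLE__, and _WIN32 are predefined by common compilers on those systems; PLATFORM_NAME, TRACE, DEBUG, and platform_name() are names invented for this example.

```c
#include <stdio.h>

/* Choose a platform name at compile time; only one branch is compiled. */
#if defined(__linux__)
    #define PLATFORM_NAME "Linux"
#elif defined(__APPLE__)
    #define PLATFORM_NAME "OS X"
#elif defined(_WIN32)
    #define PLATFORM_NAME "Windows"
#else
    #define PLATFORM_NAME "unknown"
#endif

/* DEBUG is a hypothetical symbol; define it with -DDEBUG when compiling. */
#ifdef DEBUG
    #define TRACE(msg) fprintf(stderr, "trace: %s\n", msg)
#else
    #define TRACE(msg) ((void)0)   /* tracing compiles to nothing */
#endif

const char* platform_name(void)
{
    TRACE("platform_name called");
    return PLATFORM_NAME;
}
```

Compiling the same file with and without -DDEBUG produces two different programs from one source module, which is the point of conditional compilation.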
Conditional Compilation, Continued
Include guards are the most common use of conditional compilation.
An include guard is a preprocessor command (or command combination) that includes or excludes a block of code based on what has previously been included.
Include guards are used to ensure that no block of code is included twice in the same compile module.
Every header file should be protected by include guards. Three preprocessor commands are needed:
    #ifndef MODULE_NAME
    #define MODULE_NAME
    ... code for the module ...
    #endif
An alternative that is supported by some, but not all, systems is
    #pragma once
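To see what the guard buys you, the sketch below pastes the contents of a hypothetical header (point.h, with the invented guard symbol POINT_H) twice into one file, which imitates two different headers both including it. Without the guard, the second definition of Point would be a compile error.

```c
/* First inclusion: POINT_H is not yet defined, so the block is compiled. */
#ifndef POINT_H
#define POINT_H
typedef struct { int x; int y; } Point;
#endif

/* Second inclusion: POINT_H is already defined, so the block is skipped
   and Point is not defined a second time. */
#ifndef POINT_H
#define POINT_H
typedef struct { int x; int y; } Point;
#endif

/* Ordinary code that uses the type exactly once defined above. */
int point_sum(Point p)
{
    return p.x + p.y;
}
```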
Linking
At link time, system and local libraries are searched for functions that were called by the program. (See bottom of log.txt.)
User functions are linked statically: the entry point of the function is stored in the transfer vector of the module that calls it.
System functions are linked dynamically: a stub function is hard-linked into the code. The actual function code is brought in at load time, if it is not already there.
If some other process has already loaded the needed function, your process will be linked to the same copy.
To make this work, system code must be reentrant, that is, the code must not modify itself at run time.
Unix Topics
System Calls vs. Library Function Calls The Unix File System Directories and Files Directory Operations
System Calls vs. Library Function Calls
A system call is a call to a function that executes in supervisor (kernel) mode, with full system privileges. System calls are documented in section 2 of the Unix manual.
Library function calls execute with user privileges. (Therefore, they are less dangerous.)
A library function sometimes calls a system function, causing a context switch:
The user's process (in the OS) goes from RUN state to BLOCKED state.
Control is transferred to the system function, which has its own stack and environment, and runs in supervisor mode.
The system function reaches back into the user's stack to get its parameters.
When execution is finished, the user's process moves into the READY state.
System calls cause two context swaps.
System calls may not be portable because systems are different.
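The difference can be seen by writing the same text two ways. This is a sketch only; say_with_library() and say_with_syscall() are names invented here. fputs() and fflush() are library functions (manual section 3), while write() is a system call (manual section 2).

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Library route: fputs() collects the text in a user-space buffer;
   the kernel is entered only when the buffer is flushed. */
int say_with_library(const char* text)
{
    if (fputs(text, stdout) == EOF) return -1;
    return fflush(stdout) == 0 ? 0 : -1;
}

/* System-call route: every call to write() enters the kernel.
   Returns the number of bytes written, or -1 with errno set. */
ssize_t say_with_syscall(const char* text)
{
    return write(STDOUT_FILENO, text, strlen(text));
}
```

Because the library can batch many small outputs into one write(), a program that prints through stdio usually makes far fewer system calls than one that calls write() for every line.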
The OS Process Queue
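[Figure: process state diagram. Starting to READY: login or execve. READY to RUN: my turn. RUN to READY: time is up. RUN to BLOCKED: system call or I/O. BLOCKED to READY: completion. RUN to Done: termination.]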
A system call takes your process out of RUN. It stays in BLOCKED until the system call finishes. Frequent system calls will affect performance.
The Unix File System
Mounted file systems
The UNIX root and bin directories
Directories, files, and pathnames
Links and I-nodes
System libraries for directory processing
Hardware – Unix System – Program
[Figure: I-node 3451716 (2 links, 6 data blocks), reached through John's file and through Mary's link to John's file.]
A Fully Mounted File System
Windows pathnames all start with a device code, such as C: Unix pathnames don't.
All Unix pathnames start at the root directory, which is written as: /
A removable device that stores files (disk, CD-ROM, memory stick) must be mounted before use.
For this purpose, several mount points (empty directories) are built into the Unix file system. For example, /media and /mnt.
My main disk, backup disk, and memory stick are all mounted (automatically) on the /Volumes directory.
mount() is a system call. Happily, it is not something you will be using.
Mounting a Removable Device
Modern systems automatically mount removable devices when they are inserted and unmount them when they are removed. (Many years ago, users issued the mount command themselves.)
To mount a device, a Unix system must go to the root directory of the file system on the device.
That directory is grafted onto the file tree as a subdirectory of the mount point.
Thereafter, the files on it can be reached from anywhere else in the file system using an ordinary pathname.
Thus, a pathname might start on one device, then go through a mount point, and onto another device.
The Fedora Root directory
The Organization of a Linux File System:

/
  boot/        Where the kernel files live.
  dev/         Devices and device types
  etc/         Configuration files
  home/        Everybody's home directories
               alice/ bob/ mike/ lost+found/
  media/       For mounting media
  root/        The home directory of the root user.
* run/media/mike/stickname/
               Root directory from stick.
  tmp/         Things that should disappear on power off.
  usr/         Where non-kernel system files live.
  usr/bin/     Executable essential system commands
  usr/sbin/    Executables for sys administration
  usr/lib/     System libraries (C, etc.)
  usr/lib64/
  var/         System log files, other files that change

Also listed on the slide (apparently the contents of usr/): bin/ games/ kerberos/ local/ share/ etc/ include/ lib/ libexec/ sbin/ src/

* The run/ and tmp/ directories exist only at run time.
What is in the bin directory?
The bin directory stores the executable code and scripts for commands that are necessary for basic system administration. Here are the most familiar and useful:

bash      cat    df     ps
tcsh      echo   ln     pwd
date      chmod  ls     rm
hostname  chown  mkdir  rmdir
kill      cp     mv     unlink
Some things are in /bin in any Unix-like system; the things listed here are the same in OS-X and Linux. However, Linux has important commands in /bin that are found in /usr/bin in OS-X.
In /bin or in /usr/bin: More Basic Commands
Languages:   emacs ed vi awk sed c99 gcc g++ java python ruby
Directories: cd find mount stat tar
Files:       cat diff less more touch
Utilities:   ftp grep gzip gunzip make man svn rdiff-backup rsync sort uniq umask which
Shell:       alias chsh echo logout ping passwd path rehash slogin ssh su sudo whoami
Directories and Path Names
A pathname is a sequence of directory names, separated by slashes, and optionally ending with a file name.
A pathname can be either absolute or relative.
An absolute pathname begins with slash: / (slash, the root directory)
Avoid absolute pathnames in your code, since they generally fail when an application is moved to another directory or another machine.
We can write a relative pathname starting with these special directories:
~/ (tilde, the user's home directory)
./ (dot, the current working directory)
../ (dot dot, the parent of the current working directory)
Any other pathname is interpreted relative to the current working directory.
Types of Directory Entries
Files, devices, directories, links, and inter-process connections are treated uniformly.
The first two entries are always . and ..
The type of the entry is stored in the I-node, not in the directory.
Directories are treated just like files.
A soft link, or symbolic link, is a short file that stores the pathname of another file.
A hard link is a second directory entry that points to the same I-node.
Devices (block- and character-oriented) and communication channels (pipes and sockets) are treated like files.
The directory is the only place where the file name is stored.
File Types
The entries in a directory can be any of the following types:

Symbol      Val  Meaning
DT_UNKNOWN    0  just in case
DT_FIFO       1  pipe
DT_CHR        2  character device
DT_DIR        4  directory
DT_BLK        6  block device
DT_REG        8  regular file
DT_LNK       10  symbolic link
DT_SOCK      12  socket
DT_WHT       14  system use only: to hide files.
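A small sketch of how these symbols might be used: the function below, whose name type_label() is invented for this example, turns a d_type code into a readable label. The DT_ constants come from dirent.h; they are a BSD/Linux extension rather than part of the core POSIX interface, and some file systems report DT_UNKNOWN for every entry.

```c
#define _DEFAULT_SOURCE   /* assumption: asks glibc to expose the DT_* constants */
#include <dirent.h>

/* Translate a d_type code from struct dirent into a short label. */
const char* type_label(unsigned char d_type)
{
    switch (d_type) {
        case DT_FIFO: return "pipe";
        case DT_CHR:  return "character device";
        case DT_DIR:  return "directory";
        case DT_BLK:  return "block device";
        case DT_REG:  return "regular file";
        case DT_LNK:  return "symbolic link";
        case DT_SOCK: return "socket";
        default:      return "unknown";   /* DT_UNKNOWN and anything else */
    }
}
```

When the answer is "unknown", the reliable fallback is to look in the I-node itself, since that is where the type is actually stored.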
I-Nodes
The POSIX standard for the I-node for a regular file requires:
The length of the file in bytes.
Device ID (this identifies the device containing the file).
The User ID of the file's owner.
The Group ID of the file.
The file mode, including the file type and u-g-o access privileges.
Additional system and user flags to further protect the file.
Timestamps telling when the I-node itself was last changed (ctime, change time), the file content last modified (mtime, modification time), and last accessed (atime, access time).
A link count telling how many hard links point to the I-node.
One or a few file-content block pointers or indirect pointers.
I-Node Information
The stat system call retrieves a file's I-node number and some of the information in its I-node. Example:

bash-3.2$ stat Elephant.pdf
234881026                  // device ID number
3451716                    // I-node number
-rw-r--r--                 // type, permissions
1                          // # of hard links to file
alice staff                // owner, group
0                          // device type
87339                      // file size in bytes
"Jan 10 02:38:05 2010" "Jan 3 14:05:56 2010"   // accessed, modified
"Jan 4 18:32:36 2010" "Jan 3 14:05:56 2010"    // I-node changed, born
4096                       // block size
176 0                      // # of blocks in file, ?
Elephant.pdf               // Name of file
Links: Hard and Soft
A hard link is a second pointer to the same I-node. It is exactly like a file; one cannot distinguish the original from the link.
Hard links only work within one physical disk partition.
A file will not be deleted until the last hard link is deleted.
You lose control of your file if you give a hard link to another user.
A soft link, or symbolic link, is a file that contains a single string, the pathname of another file. That pathname can be either absolute or relative.
Moving a file or a directory breaks soft links that point to it from the outside.
Deleting a file breaks soft links to it, but the broken link will not be discovered until someone tries to use it.
Soft links are used more often than hard links.
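The behavior described above can be checked with a short experiment. This sketch uses the POSIX calls link(), symlink(), and lstat(); the function name link_demo() and the three pathnames passed to it are invented for illustration, and the names must not already exist.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a file `orig`, a hard link `hard`, and a soft link `soft`.
   Returns the hard-link count stored in the shared I-node, or -1 on error. */
long link_demo(const char* orig, const char* hard, const char* soft)
{
    FILE* f = fopen(orig, "w");
    if (f == NULL) return -1;
    fputs("data\n", f);
    fclose(f);

    if (link(orig, hard) != 0) return -1;      /* second name, same I-node */
    if (symlink(orig, soft) != 0) return -1;   /* new file holding a pathname */

    struct stat a, b, s;
    if (lstat(orig, &a) != 0 || lstat(hard, &b) != 0 || lstat(soft, &s) != 0)
        return -1;
    if (a.st_ino != b.st_ino) return -1;       /* hard link shares the I-node */
    if (!S_ISLNK(s.st_mode)) return -1;        /* soft link has its own I-node */
    return (long)a.st_nlink;
}
```

After the call, the original file's link count is 2: the soft link does not count, because it points to a pathname rather than to the I-node.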
The I-Node is the Actual File
[Figure: I-node 3451716 (2 links, 6 data blocks), reached through John's file and through Mary's link to John's file.]
The directory entry points to an I-node. The I-node contains the file info and points to data blocks.
For longer files, the I-node holds a single-indirect pointer to a block that is full of data-block pointers.
For huge files, there are double- and triple-indirect pointers, each pointing to a block full of single- or double-indirect pointers.
Soft link.
A Soft link is a separate file containing only a pathname.
[Figure: John's soft link has its own I-node, 3451729 (1 link, 1 block), whose data block holds the pathname to John's file. That pathname leads to the directory entry for his original file.]
Directory Operations
Library Functions for Directory Processing Error Codes Structure of a Directory Entry Testing the Type of the Entry
Library Functions for Directory Processing
To use the directory functions, you must include #include
These libraries are documented in Section 3 of the Unix manual. For further details, check the Unix man pages.
You will need to capture the return value from these functions, then test it and handle error codes appropriately.
The direntDemoJoined program gives a skeleton program with two classes for dealing with directories.
For Program 5, you will need to call the library functions on the next few slides. The error return values for each one are given.
The Current Working Directory
Get the absolute path name of the current working directory. #include
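The declaration on this slide did not survive, so the sketch below assumes the standard POSIX function getcwd() from unistd.h, which serves this purpose. The wrapper name current_directory() is invented for illustration.

```c
#include <stdio.h>
#include <unistd.h>

/* Copy the absolute pathname of the current working directory into buf.
   Returns 0 on success; on failure prints the reason and returns -1. */
int current_directory(char* buf, size_t size)
{
    if (getcwd(buf, size) == NULL) {   /* NULL means failure; errno is set */
        perror("getcwd");
        return -1;
    }
    return 0;
}
```

A buffer that is too small for the pathname is one of the ways getcwd() can fail, so the return value should always be tested before the buffer is used.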
Directory Access
Open a directory stream. #include
Close the directory stream and free the associated memory: int closedir(DIR* dirp); A return value of 0 means success. On failure, -1 is returned and the global variable errno is set to indicate the error.
Directory Processing
Read a directory entry.
struct dirent* readdir(DIR* dirp);
The return value is NULL if there are no more entries in this directory or an error occurred.
In the event of an error, errno may be set to any of these values:
EBADF   fd is not a valid file descriptor open for reading.
EFAULT  Either buf or basep point outside the allocated address space.
EIO     An I/O error occurred while reading from or writing to the file system.
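Putting the open, read, and close steps together gives a loop like the one sketched here. The opendir() declaration was lost from the earlier slide, so its use is an assumption based on the standard POSIX interface in dirent.h; list_directory() is a name invented for this example. Because readdir() returns NULL both at the end of the directory and on error, the sketch clears errno before each call to tell the two cases apart.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>

/* Print the name of every entry in the directory `path`.
   Returns the number of entries, or -1 if an error occurred. */
int list_directory(const char* path)
{
    DIR* dir = opendir(path);
    if (dir == NULL) {                 /* errno says why the open failed */
        perror(path);
        return -1;
    }

    int count = 0;
    for (;;) {
        errno = 0;                     /* so NULL with errno == 0 means "done" */
        struct dirent* entry = readdir(dir);
        if (entry == NULL) break;
        printf("%s\n", entry->d_name);
        ++count;
    }
    int read_error = errno;            /* nonzero only if readdir failed */

    if (closedir(dir) == -1 || read_error != 0) return -1;
    return count;
}
```

Calling list_directory(".") lists the current working directory, including the . and .. entries.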
Structure of a Directory Entry
Basically, the directory lists only the name of the file. All other information is in the I-node.
This is one possible implementation of the directory entry:
    struct dirent {
        ino_t      d_ino;              // file or I-node number
        __uint16_t d_reclen;           // length of this record
        __uint8_t  d_type;             // file type, see below
        __uint8_t  d_namlen;           // strlen( d_name )
        char       d_name[255 + 1];    // name must be <= 255
    };
This is platform dependent; always use the standard interface functions to process directory entries. Use the member names d_name, d_type, and d_ino in your code.
Testing the Entry Type
The header file sys/stat.h defines macros that let you test the type of a directory entry. Apply them to the st_mode field of the entry's stat structure, and use them as if they were functions.
    #define S_ISDIR(m)  (((m) & S_IFMT) == S_IFDIR)   // directory
    #define S_ISREG(m)  (((m) & S_IFMT) == S_IFREG)   // regular file
    #define S_ISLNK(m)  (((m) & S_IFMT) == S_IFLNK)   // symbolic link
    #define S_ISSOCK(m) (((m) & S_IFMT) == S_IFSOCK)  // socket
    #define S_ISFIFO(m) (((m) & S_IFMT) == S_IFIFO)   // pipe (FIFO)
You will need the first three for program 3; we will use pipes and sockets later in the term.
System Calls for Directory Processing
System calls are documented in Section 2 of the Unix manual. For Program 3, you will need to make the following system call:
Get the stats out of the I-node for a regular file.
int lstat(const char* path, struct stat* buf);
A return value of 0 means success. If the function fails, it sets a global variable, errno, to the code for the error and returns -1.
The stat type definition follows:
Reading the I-node Stats.
    struct stat {
        dev_t     st_dev;      // device
        ino_t     st_ino;      // I-node number
        mode_t    st_mode;     // file type and protection
        nlink_t   st_nlink;    // number of hard links
        uid_t     st_uid;      // user ID of owner
        gid_t     st_gid;      // group ID of owner
        dev_t     st_rdev;     // device type (if inode device)
        off_t     st_size;     // total size, in bytes
        blksize_t st_blksize;  // blocksize for filesystem I/O
        blkcnt_t  st_blocks;   // number of blocks allocated
        time_t    st_atime;    // time of last access
        time_t    st_mtime;    // time of last modification
        time_t    st_ctime;    // time I-node info last changed
    };
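A minimal sketch of calling lstat() and reading a few of these fields follows. The function name print_stats() is invented for this example. The casts are there because the exact integer types behind ino_t, nlink_t, and off_t differ from system to system.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Print a few I-node fields for `path`. Returns 0 on success, -1 on error. */
int print_stats(const char* path)
{
    struct stat info;
    if (lstat(path, &info) == -1) {    /* errno holds the error code */
        perror(path);
        return -1;
    }
    printf("I-node:     %llu\n", (unsigned long long)info.st_ino);
    printf("Hard links: %lu\n", (unsigned long)info.st_nlink);
    printf("Size:       %lld bytes\n", (long long)info.st_size);
    printf("Directory?  %s\n", S_ISDIR(info.st_mode) ? "yes" : "no");
    return 0;
}
```

Unlike stat(), lstat() does not follow a symbolic link: given the name of a soft link, it reports on the link's own I-node.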
Summary
Tonight's topics include:
How compilation and linking work.
What is a system call?
The Unix file system and I-Nodes
Directories and files
Directory operations
Error return values and error codes