
Inter-Process Communication Mechanisms (IPC)
- Signals
- Pipes
- Message Queues
- Semaphores

Signals
- Signal: a mechanism to notify a process of system events
  - Asynchronous notification
  - Synchronous errors or exceptions
- Invoking signals
  - Processes may send each other signals, e.g. with kill()
  - The kernel may send signals to a process
- A process may react to a signal by
  - Ignoring the signal
  - Handling the signal itself
    - Asynchronously executes a specified procedure (the signal handler)

Signals (Cont.)
- A set of defined signals
     1) SIGHUP      2) SIGINT      3) SIGQUIT
     4) SIGILL      5) SIGTRAP     6) SIGIOT
     7) SIGBUS      8) SIGFPE      9) SIGKILL
    10) SIGUSR1    11) SIGSEGV    12) SIGUSR2
    13) SIGPIPE    14) SIGALRM    15) SIGTERM
    17) SIGCHLD    18) SIGCONT    19) SIGSTOP
    20) SIGTSTP    21) SIGTTIN    22) SIGTTOU
    23) SIGURG     24) SIGXCPU    25) SIGXFSZ
    26) SIGVTALRM  27) SIGPROF    28) SIGWINCH
    29) SIGIO      30) SIGPWR

Signals (Cont.)
- The kernel handles signals using default actions
  - Terminate the process
  - Core dump and terminate the process
    - E.g., SIGFPE (floating-point exception): core dump and terminate
  - Ignore the signal
  - Suspend the process
  - Resume the process's execution, if it was stopped
- Signal-related fields in the task_struct data structure
  - signal (32 bits): pending signals
  - blocked: a mask of blocked signals
  - sigaction array: address of the handling routine, or a flag to let the kernel handle the signal

Signals
- A normal process can only send signals to processes with the same uid and gid, or to processes in the same process group

Note: Process Group
- The notion of a process group is introduced to represent a job
  - $ ls | sort | more
  - A process group consists of three processes: ls, sort, more
- Login session
  - All processes that are descendants of the login shell process

Signal Handling
- Signal delivery
  - Delivery is deferred until the receiving process is scheduled
    - Signals are not presented to the process immediately after they are generated
  - Every time a process exits from a system call, its signal and blocked fields are checked
    - If there are any unblocked signals, they can now be delivered

Signal Handling (Cont.)

- The Linux signal-processing code looks at the sigaction structure for each of the currently unblocked signals

Pipes
- Half-duplex
  - Data flows in only one direction
- The writer and the reader communicate using the standard read/write functions

[Figure: Task A and Task B connected by a communication pipe]

Pipe
- $ ls | pr | lpr
- Linux implements a pipe using two file data structures
  - Both point to the same temporary VFS inode (which points to a physical page in memory)
- Uses the standard read/write library functions

Restriction of Pipes and Signals
- Signal
  - The only information transported is a simple number
    - This renders signals unsuitable for transferring data
- Pipe
  - It is impossible for an arbitrary process to read or write a pipe unless it is the child of the process that created it
  - Named pipes (also known as FIFOs)
    - Also a one-way flow of data
    - Allow unrelated processes to access a single FIFO

System V IPC Mechanisms
- System V IPC mechanisms
  - Message queues
    - Send a message to other processes or receive messages from them
  - Semaphores
    - Synchronize with other processes by means of semaphores
  - Shared memory
    - Share a memory area with other processes

System V IPC Mechanisms
- First appeared in UNIX System V in 1983
- They allow unrelated processes to communicate with each other
  - Including those that do not share the same ancestor

Key Management
- Each IPC resource is identified by
  - A 32-bit key: similar to a file pathname
    - Freely chosen by the programmer
  - A 32-bit identifier: similar to a file descriptor
    - Assigned by the kernel

System V IPC
- Create IPC resources: semget(), msgget(), shmget()
  - Derive the IPC identifier from the IPC key
  - The process then uses the identifier to access the resource
- If two independent processes want to share a common IPC resource
  - The processes agree on some fixed, predefined IPC key
- Access to System V IPC objects is checked using access permissions
  - Similar to file access permissions

Message Queues
- Message queue
  - A linked list of messages stored within the kernel
  - Identified by a message queue identifier
  - Lets processes send messages asynchronously to each other

[Figure: a sender S puts messages on a queue that a receiver R reads]

Message Queues (Cont.)
- New messages are always added to the end of a queue
- But they can be removed from anywhere in the queue

    ssize_t msgrcv(int msgid, void *ptr, size_t nbytes, long type, int flag)

  - The type argument lets us specify which message we want
    - type == 0: the first message is returned
    - type > 0: the first message whose type equals type is returned

Message Queue Data Structures

[Figure: the messages of a queue linked through their m_list fields]

ipc_ids structure
- Where the kernel keeps its message queues

    struct ipc_ids {
        int size;                   /* number of ipc_id elements */
        int in_use;                 /* number of in-use ipc_id elements */
        int max_id;
        unsigned short seq;
        unsigned short seq_max;
        struct semaphore sem;       /* protecting the ipc_ids DS */
        spinlock_t ary;
        struct ipc_id *entries;     /* list of message queues */
    };

msg_queue structure
- A single message queue

    struct msg_queue {
        struct kern_ipc_perm q_perm;    /* access permission */
        time_t q_stime;                 /* last msgsnd time */
        time_t q_rtime;                 /* last msgrcv time */
        time_t q_ctime;                 /* last change time */
        unsigned long q_cbytes;         /* current number of bytes on queue */
        unsigned long q_qnum;           /* number of messages in queue */
        unsigned long q_qbytes;         /* max number of bytes on queue */
        pid_t q_lspid;                  /* pid of last msgsnd */
        pid_t q_lrpid;                  /* last receive pid */
        struct list_head q_messages;    /* list of messages in queue */
        struct list_head q_receivers;
        struct list_head q_senders;
    };

Message Structures
- Each message is broken into one or more pages
  - Dynamically allocated
- The first page stores
  - The message header, of type msg_msg
  - The message text
    - Starts right after the next field
- If the message text is longer than 4072 bytes
  - The second, third (and so on) pages are used, with type msg_msgseg

Message Structures
- Message header

    struct msg_msg {
        struct list_head m_list;    /* pointers for message lists */
        long m_type;                /* message type */
        int m_ts;                   /* message text size */
        struct msg_msgseg *next;    /* next portion of the message */
        /* the actual message follows immediately */
    };

  - m_list maintains the list of linked messages in a single queue
- Message segment

    struct msg_msgseg {
        struct msg_msgseg *next;
        /* the next part of the message follows immediately */
    };

Structure of large messages

[Figure: a large message split into chunks: struct msg_msg (first chunk) -> next -> struct msg_msgseg (second chunk) -> next -> struct msg_msgseg (third chunk)]

Message queue related system calls
- msgget()
  - Returns an id either for the existing queue with the given key, or for a new message queue with that key
- msgsnd()
  - Sends a message to a message queue
- msgrcv()
  - Receives a message from a message queue
- msgctl()
  - Performs a set of administrative operations on a message queue

Semaphores
- Semaphores
  - A semaphore is a location in memory whose value can be tested and set atomically by more than one process
  - Can be used to implement critical regions

Semaphore Data Structure

[Figure: struct sem_array pointing to the semaphore array and to the list of pending semaphore operations]

Semaphore Data Structure
- sem and sem_array structures

    struct sem_array {
        struct kern_ipc_perm sem_perm;  /* permissions */
        time_t sem_otime;               /* last semop time */
        time_t sem_ctime;               /* last change time */
        struct sem *sem_base;           /* ptr to first semaphore in array */
        struct sem_queue *sem_pending;  /* pending operations */
        /* ... */
        unsigned short sem_nsems;       /* number of semaphores in array */
    };

    struct sem {
        int semval;                     /* current value */
        int sempid;                     /* pid of last operation */
    };

- sem_queue structure

    struct sem_queue {
        struct sem_queue *next;         /* next entry in the queue */
        struct sem_queue **prev;        /* previous entry in the queue */
        struct task_struct *sleeper;    /* pointer to the sleeping process */
        int pid;                        /* process id of requesting process */
        int status;                     /* completion status of operation */
        struct sem_array *sma;          /* semaphore array for operations */
        /* ... */
        struct sembuf *sops;            /* array of pending operations */
        int nsops;                      /* number of pending operations */
        int alter;                      /* 1 => operation will alter semaphore */
    };

Semaphore related system calls
- semget()
  - The counterpart of msgget(); semget() creates a semaphore set
- semop()
  - Performs an operation on semaphores
- semctl()
  - Like msgctl(), semctl() performs administrative operations on a semaphore set

Shared Memory
- Shared memory
  - Allows processes to communicate via memory that appears in all of their virtual address spaces
  - Controlled via keys and access-rights checking
  - Relies on other mechanisms (e.g. semaphores) to synchronize access to the memory

Shared Memory (Cont.)
- Fastest, easiest of the IPC services
  - Because data does not need to be copied between the client and server
- Mutual exclusion problem
  - Access to a given region must be synchronized among multiple processes
  - Semaphores are often used to synchronize shared memory access

[Figure: shared memory: Process 1 and Process 2 share the same physical page]

Shared memory system calls
- shmget()
  - Returns a unique id for a shared memory region
- shmat()
  - Attaches the calling process to a shared memory region
- shmdt()
  - Detaches the calling process from the shared memory region
- shmctl()
  - Performs administrative operations on a shared memory region

Linux File System
- Linux supports different file system structures at the same time
  - e.g., ISO 9660, ufs, FAT-16, VFAT, ...
- Hierarchical file system structure
  - Linux adds each new file system into this single file system tree as it is mounted
- The real file systems are separated from the OS by an interface layer: the Virtual File System (VFS)
- VFS allows Linux to support many different file systems, each presenting a common software interface to the VFS

Hierarchical File System Structure

[Figure: directory tree]

    /
      bin
        ls
        cp
      dev
      etc
      lib
      sbin
      usr
        bin
          cc
        include
        lib
        man
        sbin

Mounting of Filesystems

[Figure: mounting operation. The root filesystem (/ with bin, dev, etc, lib, sbin, usr) and the /usr filesystem (/ with bin, include, lib, man, sbin) are combined. In the complete hierarchy after mounting /usr, the directories bin, include, lib, man, sbin appear under /usr.]

The Layers in the File System

[Figure: layers, top to bottom. User mode: Process 1, Process 2, ..., Process n. System mode: Virtual File System; file systems (ext2, msdos, minix, proc); buffer cache; device drivers.]

Linux File System
- File systems supported in Linux
  - ext, ext2, xia, minix, umsdos, msdos, vfat, proc, smb, ncp, iso9660, sysv, hpfs, affs, ufs, ...
- The separate file systems are combined into a single hierarchical tree structure
  - Each is mounted on a directory (the mount point)

Virtual File System
- Disks are initialized into logical partitions
- Each partition may hold a single file system
  - e.g. ext2fs
- The real file systems are separated from the operating system by an interface layer: the Virtual File System (VFS)

The Second Extended File System (EXT2)
- File
  - A continuous set of data blocks
- Inode
  - Describes which blocks the data within a file occupies, access rights, modification time, ...
- Directory
  - A special file that contains pointers to inodes

Physical Layout of EXT2

[Figure: physical layout of an EXT2 partition, starting with the boot block]

- The first block is never managed by the ext2 file system
  - It is reserved for the partition boot sector
- The rest is split into block groups
  - All groups are the same size
  - The data blocks of a file tend to be in the same group

The EXT2 Superblock
- Keeps file-system-wide information
  - Each group has a copy of the superblock
    - For error recovery if the superblock in the first group is destroyed
  - Fields
    - Total number of inodes
    - Free blocks counter
    - Free inodes counter
    - Block size
    - Number of blocks per group
    - Number of inodes per group
    - Block group number of this superblock
    - ...

Group Descriptor
- Keeps information belonging to the block group
- Fields
  - Block number of the "block bitmap" block
  - Block number of the "inode bitmap" block
  - Block number of the "inode table" block
  - Number of free blocks in the group
  - Number of free inodes in the group
  - Number of directories in the group

Block & Inode Bitmap
- A sequence of bits, each representing the allocation status of the corresponding disk block or inode
  - 0: free
  - 1: used
- Fast for block/inode allocation

Inode Table
- Consists of a series of consecutive blocks
  - Each block contains a number of inodes
- All inodes have the same size
  - 128 bytes

Inode Table and Inode
[Figure: the inode table is a list of inodes]

The EXT2 Inode
- Mode
  - What this inode describes, and the permissions
    - Regular file
    - Directory
    - Character device
    - Block device
    - Socket
- Owner information
  - User and group ids of the owners
- Size
- Timestamps
  - Creation and modification
- Pointers to data blocks

EXT2 Directory
- Ext2 implements directories as a special kind of file
  - The data blocks store the names of the files in this directory together with the corresponding inode numbers
  - Example:
    - Home1: inode = 21
    - usr: inode = 53

Block Allocation
- Lock the EXT2 superblock
- Check whether there are preallocated blocks
  - Ext2 preallocates 8 adjacent blocks for future use
- EXT2 allocates a new block, trying in order
  - A data block near the last allocated block of the file
  - A data block within the next 64 blocks
  - A data block in the same block group
  - A data block from the other block groups

Block Allocation (Cont.)
- Update the block group's block bitmap
- Allocate a data buffer in the buffer cache
  - For writing the dirty data
- Mark the in-memory superblock as "dirty"
  - Linux periodically copies all dirty superblocks to disk
- Unlock the superblock

The Virtual File System (VFS)
- A common file model capable of representing all supported filesystems
  - Manages the kernel-level file abstraction in one format for all file systems
  - Strictly mirrors the file model provided by the traditional Unix filesystem
    - Since Linux wants to run its native filesystem with minimum overhead
  - Thus, each specific filesystem implementation must translate its physical organization into the VFS's common file model
    - The specific filesystem's functions are called indirectly

The Virtual File System (VFS)
- Receives system calls from user level
  - e.g., write, open, ...
- Interacts with a specific file system based on mount point traversal

The Virtual File System (Cont.)
- The VFS common file model consists of
  - Superblock object
    - Stores information concerning a mounted filesystem
    - Corresponds to a filesystem control block stored on disk
  - Inode object
    - Stores general information about a specific file
    - Corresponds to a file control block stored on disk
  - File object
    - Stores information about the interaction between an open file and a process
  - Dentry object
    - Stores information about the linking of a directory entry with the corresponding file
- Note: an object is a software construct that defines
  - A data structure and the methods that operate on it

Interaction between Process and VFS Objects (Cont.)
- See the following slide
  - Three processes have opened the same file
  - Two of them use the same link
- Each process has its own file object
- But only two dentry objects are required
  - One for each link
- Both dentry objects refer to the same inode object
  - Which identifies the superblock and the common disk file
- Speedup scheme
  - The most recently used dentry objects are kept in the dentry cache

Interaction between Process and VFS Objects
[Figure: file object -> dentry object (mapping a name to an inode) -> inode object (i_mode, i_dev, i_size)]

The VFS Superblock Object
- Device: device identifier
- Blocksize: block size in bytes
- File system type
- Whether the superblock is dirty
- Superblock operations
  - read_inode()
  - write_inode()
  - delete_inode()
  - write_super()
  - ...

The VFS Inode Object
- All information needed by the filesystem to handle a file is included in the inode
- An inode is unique to a file once the file is created
  - However, the file name may be changed
- Each VFS inode duplicates some of the data included in the disk inode
  - And adds some additional information for run-time processing
    - inode_operations: inode operations
    - superblock: pointer to the superblock object

The VFS Inode Object (Cont.)
- Device
- Inode number
- Mode
  - File type and access rights
- Times
  - Time of last access
- Count: usage count
- Dirty
  - Pointer to the modified data buffers
- File-system-specific information
- Inode operations
  - create()
  - delete()
  - mkdir()
  - rmdir()
  - rename()
  - link()
  - unlink()
  - ...

VFS File Objects
- A file object describes how a process interacts with a file it has opened
  - Created when a file is opened
- Fields
  - Count: file object's usage counter
  - Position: current file offset (file pointer)
  - UserID
  - File operations
    - read()
    - write()
    - llseek()
    - open()
    - ioctl()
    - ...

VFS Dentry Objects
- VFS considers each directory as a file
  - It contains a list of files and other directories
- However, other file systems may not use such an approach
  - FAT stores the position of each file in the directory tree, and directories are not files
- Once a directory entry is read into memory
  - It is transformed into a dentry object

VFS Dentry Objects (Cont.)
- Example
  - When looking up the /tmp/test pathname, the kernel creates
    - A dentry object for the / root directory
    - A second dentry object for the tmp entry
    - A third dentry object for the test entry

VFS Dentry Objects (Cont.)
- Fields
  - d_count: dentry object usage counter
  - d_subdirs: for directories, the list of dentry objects of subdirectories
  - d_inode: inode associated with the filename
  - ...

Filesystem Registration
- The basic operation that must be performed before using a filesystem type
- Two methods
  - Included in the kernel image
    - Build the supported file systems into the kernel
  - Dynamically loaded as a module
    - Build file systems as modules
    - Loaded with insmod

Registering the File Systems
- All filesystem-type objects are inserted into a linked list pointed to by the file_systems variable
  - read_super(): method for reading the superblock
    - Reads the superblock from the disk drive and copies it into the corresponding superblock object
  - fs_supers: points to the head of the list of superblock objects

[Figure: the file_systems variable points to a linked list of file_system_type objects, each with the fields name, *read_super(), fs_supers and next]

Mounting a File System
- Each file system has its own root directory
  - Ex: an Ext2 filesystem stored on the /dev/fd0 floppy disk
- Root filesystem
  - The filesystem whose root directory is the root of the system's directory tree
- Other filesystems are then mounted on the system's directory tree

Mounting a File System (Cont.)
- Ex1: an Ext2 filesystem stored on /dev/fd0 is mounted on /flp
  - mount -t ext2 /dev/fd0 /flp
- In Linux 2.4, the same filesystem can be mounted several times
  - Ex2: mount -t ext2 -o ro /dev/fd0 /flp-ro
  - The Ext2 filesystem stored on the floppy disk is mounted both on /flp and /flp-ro
    - However, they are the same filesystem
    - Only one superblock object is allocated

Mounting a File System (Cont.)
- When a filesystem is mounted, the kernel must save
  - The mount point, the mount flags, ...
- Such information is stored in a data structure
  - Called a mounted filesystem descriptor
  - Each descriptor is a data structure of type vfsmount
  - The descriptors of all mounted filesystems are kept in a circular doubly linked list
    - The head of the list is represented by the vfsmntlist variable

A Mounted File System

[Figure: a mounted file system and its superblock object]

Mounting a File System (Cont.)
- When a filesystem is mounted
  - Search the list of file system types (e.g. iso9660)
    - Return the address of the corresponding file_system_type descriptor
  - See "Registering the File Systems"
  - We need the fields of the descriptor, such as read_super()
  - Allocate a new mounted filesystem descriptor, i.e., a vfsmount
  - Allocate a VFS superblock object

Unmounting a File System
- Check whether anyone is using the FS
- Check whether the FS is dirty
  - If so, write it back
- Return the VFS superblock to the kernel's pool
- The vfsmount is unlinked from vfsmntlist

Speedup Access
- Dentry cache: includes in-use and unused dentry objects
  - Stores the mapping between directory names and their inode numbers
  - Because reading a directory entry from disk and constructing the dentry object is time consuming
- Inode cache
  - The inodes associated with unused dentry objects are not discarded but kept in the inode cache
- Replacement policy for unused dentry objects: LRU

Speedup Access (Cont.)
- Buffer cache
  - All Linux file systems use a common buffer cache to cache data buffers from the underlying devices

The States of the Buffer Cache
- Clean: unused, new buffers
- Locked
  - Buffers that are locked, waiting to be written
- Dirty
  - Dirty buffers. These contain new, valid data and will be written, but so far have not been scheduled for writing

bdflush & update Kernel Daemons
- The bdflush kernel daemon
  - Provides a dynamic response to the system having too many dirty buffers (default: 60%)
  - Tries to write a reasonable number of dirty buffers out to their owning disks (default: 500)
- The update daemon
  - Periodically flushes all older dirty buffers out to disk

The /proc File System
- It does not really exist on disk
- Provides information on the current status of the Linux kernel and the running processes
  - Allows you to request the kernel status by reading files in the hierarchy
The /proc File System
- System information
  - Process-specific subdirectories
  - Kernel data
  - IDE devices in /proc/ide
  - Networking info in /proc/net, SCSI info
  - Parallel port info in /proc/parport
  - TTY info in /proc/tty