Direct Memory Access (DMA)

Drive

1. CPU programs DMA Disk Main Operating Systems CPU the DMA controller controller Buffer

Input/Output Address Count Control 4. Ack dr. Tomasz Jordan Kruk when 2. DMA requests [email protected] done transfer to memory 3. Data transferred

Institute of Control & Computation Engineering √ DMA modules control data exchange between main mamory and Warsaw University of Technology external devices, √ usage of free bus time or (usually) delaying of the CPU for a one cycle from time to time to send a word by bus (cycle stealing and burst mode), √ only one interrupt after the whole transfer, avoidance of context switches,

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 1/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 3/24

Interrupts Revisited (hardware level) Input/Output Handling

Interrupt 1. Device is finished controller Division of I/O devices: CPU 3. CPU acks interrupt Disk Keyboard 11 12 1 block devices, read/write of each block possible independently, 10 2 √ 9 3 8 4 Clock 7 5 2. Controller 6 Printer issues √ character devices, deliver stream of characters without division into interrupt blocks, not addressable, without seek operation, Bus √ communication/network devices sometimes distinguished as a √ when an I/O device has finished the work given to it, it causes an interrupt by asserting a separate group because of their specificity, signal on a bus line that it has been assigned, √ some devices, like timers, do not fit in to this classification scheme, √ signal detected by the interrupt controller chip. If no other pending, the interrupt controller processes the interrupt immediately. If another one in progress or there is a simultaneous request on a higher-priority interrupt request line, continues to assert until serviced by the CPU. √ the controller puts a number on the address lines and asserts a signal that interrupts the CPU, √ that number used as an index into a table called the interrupt vector to start a corresponding interrupt service procedure, √ the service procedure in certain moment acknowledges the interrupt by sending some value to some controller’s port. That enables the controller to issue other interrupts. Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 2/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 4/24 Differences in Input/Output Handling Communication with External Devices (I)

Differencies in I/O handling: How processor communicates with control registers and how accesses external devices buffers. Two communication techniques: √ complexity of service, 1. I/O ports, with each control register some port with established number √ additional hardware support requirement, associated. Communication with special instructions: √ priorization of services, IN REG, PORT √ throughput unit, OUT PORT, REG √ data representation, 2. memory-mapped I/O √ device response type, √ driver may be completely written in C, without any assembly code √ error handling, pieces, because access only via standard read/write calls, √ programming method. √ no need for separate special protection mechanism, √ faster testing of the contents of control registers, but √ must be disabled for mapped region, √ complicates the architecture with different type buses.

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 5/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 7/24

Input/Output Programming Goals Communication with External Devices (II)

√ device independence, Two address One address space Two address spaces

√ uniform naming, 0xFFFF… Memory √ error handling – the closer the hardware the better, √ transfer type – synchronous/ asynchronous, I/O ports √ buffering. 0 (a) (b) (c)

a. separate I/O and memory space, b. memory-mapped I/O, c. hybrid solution, i.e. Pentium architecture: 640kB - 1MB addresses reserved for external devices still having I/O ports space 0 – 64K.

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 6/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 8/24 Principles of I/O Software Interrupt-Driven I/O

Three ways of I/O communication/ programming: copy_from_user(buffer, p, count); if (count == 0) { 1. programmed I/O (with polling, busy waiting behaviour), enable_interrupts( ); unblock_user( ); while ( printer_status_reg != READY) ; } else { 2. interrupt-driven I/O, * *printer_data_register = p[0]; *printer_data_register = p[i]; 3. I/O using DMA. scheduler( ); count = count − 1; i = i + 1; } acknowledge_interrupt( ); return_from_interrupt( );

(a) (b) An example of an interrupt-driven I/O: writing a string to the printer.

a. code executed when the print system call is made, b. interrupt service procedure.

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 9/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 11/24

Programmed I/O I/O Using DMA

copy_from_user(buffer, p, count); /* p is the kernel bufer */ for (i = 0; i < count; i++) { /* loop on every character */ copy_from_user(buffer, p, count); acknowledge_interrupt( ); while (*printer_status_reg != READY) ; /* loop until ready */ set_up_DMA_controller( ); unblock_user( ); *printer_data_register = p[i]; /* output one character */ scheduler( ); return_from_interrupt( ); } return_to_user( ); (a) (b) An example: of I/O using DMA: printing a string. An example of programmed I/O: steps in printing a string. a. code executed when the print system call is made, b. interrupt service procedure. √ advantage: reduction of number of interrupts from one per character to one per buffer printed, √ not always the best method – aspects of transfer scope size and relative speed of CPU and DMA controller.

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 10/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 12/24 I/O Software Layers Disk Access Performance

I/O √ seek time the time ti move the arm to the proper track, Layer reply I/O functions √ rotational delay/latency the time for the proper sector to rotate under I/O User processes Make I/O call; format I/O; spooling the head, request √ access time seek time + rotational latency, Device-independent Naming, protection, blocking, buffering, allocation software √ seek time decides about performance, Device drivers Set up device registers; check status √ important role of the cache memory (replacement algorithms LRU, LFU).

Interrupt handlers Wake up driver when I/O completed

Hardware Perform I/O operation

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 13/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 15/24

Device-Independent I/O Software Disk Arm Scheduling Algorithms

√ uniform interfacing for device drivers, Because of requester:

√ under Unix: naming devices with usage of major and minor numbers, RSS random scheduling, √ protection against unauthorized access, FIFO the most fair one, √ providing a device-independent block size, PRI with priorities, √ buffering mechanisms (example: double buffering), LIFO Last In First Out, locality and resource usage maximization, √ management of accessability of devices, Because of service: √ handling of allocation and releasing of devices, SSTF √ management of resource to user allocation, shortest service time first with the smallest move of arm, SCAN elevator algorithm, the arm moves alternately in two directions (up √ part of errors handling. and down) servicing all requests, C-SCAN cyclic SCAN, servicing during the move only in one direction with a fast return to the start position.

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 14/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 16/24 Strip 0 Strip 1 Strip 2 Strip 3 Redundancy in Disk Service (a) Strip 4 Strip 5 RAIDStrip 6 SolutionsStrip 7 RAID level 0(RAID 1) Strip 8 Strip 9 Strip 10 Strip 11

RAID Redundant Array of Independent Disks – (formerly: Inexpensive) the name and classification originating from Berkeley University. Strip 0 Strip 1 Strip 2 Strip 3 Strip 0 Strip 1 Strip 2 Strip 3 RAID Strip 4 Strip 5 Strip 6 Strip 7 Strip 4 Strip 5 Strip 6 Strip 7 √ a technique of creation of virtual disks (with logical volumes), with some (b) level 1 features related to reliability, efficiency and serviceability, from a group Strip 8 Strip 9 Strip 10 Strip 11 Strip 8 Strip 9 Strip 10 Strip 11 of disks, √ data distributed over the matrix of disks, √ mirroringBit 1 ,Bitfull 2 dataBitr edundancy3 Bit 4 , Bit 5 Bit 6 Bit 7 √ redundancy used to improve fault tolerance, especially tolerance to √ from the point of fault tolerance the best solution, physical medium damage. (c) RAID level 2 √ expensive solution. √ RAID as opposed to (before) SLED (Single Large Expensive Disk) or (now) JBOD (Just a Bunch of Disks).

Bit 1 Bit 2 Bit 3 Bit 4 Parity

(d) Strip 0 Strip 1 Strip 2 Strip 3 RAID level 3 (a) Strip 4 Strip 5 Strip 6 Strip 7 RAID level 0 Strip 8 Strip 9 Strip 10 Strip 11

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 17/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 19/24 Strip 0 Strip 1 Strip 2 Strip 3 P0-3

(e) Strip 04 Stripip 15 Strip 26 Strip 37 StrP4-7ip 0 RAIDStrip le 1vel 4 Strip 2 Strip 3 Strip 8 Strip 9 Strip 10 Strip 11 P8-11 RAID RAID Solutions (RAID 0) (b) Strip 4 Strip 5 RAIDStrip 6 SolutionsStrip 7 Strip 4 (RAIDStrip 5 Str2)ip 6 Strip 7 level 1 Strip 8 Strip 9 Strip 10 Strip 11 Strip 8 Strip 9 Strip 10 Strip 11

Strip 0 Strip 1 Strip 2 Strip 3 P0-3 Strip 0 Strip 1 Strip 2 Strip 3 StrBitip 1 4 StrBitip 2 5 StrBitip 3 6 P4-7Bit 4 StrBitip 5 7 Bit 6 Bit 7 (a) Strip 4 Strip 5 Strip 6 Strip 7 RAID level 0 (c)(f) Strip 8 Strip 9 P8-11 Strip 10 Strip 11 RAID level 5 RAID level 2 Strip 8 Strip 9 Strip 10 Strip 11 Strip 12 P12-15 Strip 13 Strip 14 Strip 15 P16-19 Strip 16 Strip 17 Strip 18 Strip 19

√ noStrdataip 0 redundancyStrip 1 Str,ip 2 Strip 3 Strip 0 Strip 1 Strip 2 Strip 3 √ correction code computed from data bits, RAID Bit 1 Bit 2 Bit 3 Bit 4 Parity √(b)divisionStrip 4 intStroipconcat 5 Stripenation 6 Stripand 7 stripingStrip 4 ,Strip 5 Strip 6 Strip 7 level 1 √ usage of correction-detection codes (Hamming’s code), (d) RAID level 3 √ perStrformaneip 8 Stripand 9 fleStrxibilityip 10 Strimip pr11ovement,Strip 8 loStrwipcos 9 t Strsolutionip 10 Strwitip 11h lack of √ expensive solution which requires many disks. fault tolerance.

Bit 1 Bit 2 Bit 3 Bit 4 Bit 5 Bit 6 Bit 7 Strip 0 Strip 1 Strip 2 Strip 3 P0-3 (c) RAID level 2 (e) Strip 4 Strip 5 Strip 6 Strip 7 P4-7 RAID level 4 Strip 8 Strip 9 Strip 10 Strip 11 P8-11

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 18/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 20/24 Bit 1 Bit 2 Bit 3 Bit 4 Parity Strip 0 Strip 1 Strip 2 Strip 3 P0-3 (d) RAID level 3 Strip 4 Strip 5 Strip 6 P4-7 Strip 7

(f) Strip 8 Strip 9 P8-11 Strip 10 Strip 11 RAID level 5 Strip 12 P12-15 Strip 13 Strip 14 Strip 15 P16-19 Strip 16 Strip 17 Strip 18 Strip 19 Strip 0 Strip 1 Strip 2 Strip 3 P0-3 (e) Strip 4 Strip 5 Strip 6 Strip 7 P4-7 RAID level 4 Strip 8 Strip 9 Strip 10 Strip 11 P8-11

Strip 0 Strip 1 Strip 2 Strip 3 P0-3 Strip 4 Strip 5 Strip 6 P4-7 Strip 7

(f) Strip 8 Strip 9 P8-11 Strip 10 Strip 11 RAID level 5 Strip 12 P12-15 Strip 13 Strip 14 Strip 15 P16-19 Strip 16 Strip 17 Strip 18 Strip 19 Strip 0 Strip 1 Strip 2 Strip 3 (a) Strip 4 Strip 5 Strip 6 Strip 7 RAID level 0 Strip 8 Strip 9 Strip 10 Strip 11

Strip 0 Strip 1 Strip 2 Strip 3 Strip 0 Strip 1 Strip 2 Strip 3 RAID (b) Strip 4 Strip 5 Strip 6 Strip 7 Strip 4 Strip 5 Strip 6 Strip 7 level 1 Strip 8 Strip 9 Strip 10 Strip 11 Strip 8 Strip 9 Strip 10 Strip 11

Strip 0 Strip 1 Strip 2 Strip 3 Bit 1 Bit 2 Bit 3 Bit 4 Bit 5 Bit 6 Bit 7 (a) Strip 4 Strip 5 Strip 6 Strip 7 RAID level 0 (c) RAID level 2 Strip 8 Strip 9 Strip 10 Strip 11

Strip 0 Strip 1 Strip 2 Strip 3 Strip 0 Strip 1 Strip 2 Strip 3 RAID Bit 1 Bit 2 Bit 3 Bit 4 Parity (b) Strip 4 Strip 5 Strip 6 Strip 7 Strip 4 Strip 5 Strip 6 Strip 7 level 1 (d) RAID level 3 Strip 8 Strip 9 Strip 10 Strip 11 Strip 8 Strip 9 Strip 10 Strip 11

Bit 1 Bit 2 Bit 3 Bit 4 Bit 5 Bit 6 Bit 7 Strip 0 Strip 1 Strip 2 Strip 3 P0-3 (c) RAID Solutions (RAID 3) RAID level 2 RAID Solutions (RAID 5) (e) Strip 4 Strip 5 Strip 6 Strip 7 P4-7 RAID level 4 Strip 8 Strip 9 Strip 10 Strip 11 P8-11

Bit 1 Bit 2 Bit 3 Bit 4 Parity Strip 0 Strip 1 Strip 2 Strip 3 P0-3 (d) Strip 0 Strip 1 Strip 2 Strip 3 RAID level 3 Strip 4 Strip 5 Strip 6 P4-7 Strip 7 (a) Strip 4 Strip 5 Strip 6 Strip 7 RAID level 0 (f) Strip 8 Strip 9 P8-11 Strip 10 Strip 11 RAID level 5 Strip 8 Strip 9 Strip 10 Strip 11 Strip 12 P12-15 Strip 13 Strip 14 Strip 15 √ analogoues to RAID 2, with parity bits instead of correction-detection P16-19 Strip 16 Strip 17 Strip 18 Strip 19 codes,Strip 0 Strip 1 Strip 2 Strip 3 P0-3 Strip 4 Strip 5 Strip 6 Strip 7 P4-7 RAID level 4 √(e)goodStrip t0hrouhputStrip 1 inStrdataip 2 sizeStripper 3 time,Strip 0poorStrperip 1 formanceStrip 2 inStrnumberip 3 of striping with added parity bits, RAID √ serStrvicedip 48 rStreqipipues 59 tsStrStrinipip 10 time.6 StrStripip 11 7 StrP8-11ip 4 Strip 5 Strip 6 Strip 7 (b) level 1 √ economical solution - redundancy costs exactly one disk, Strip 8 Strip 9 Strip 10 Strip 11 Strip 8 Strip 9 Strip 10 Strip 11 √ good read performance, noticable degradation of write performance,

Strip 0 Strip 1 Strip 2 Strip 3 P0-3 √ quality and efficiency of solution determined by the parameters tuning process. StrBitip 1 4 StrBitip 2 5 StrBitip 3 6 P4-7Bit 4 StrBitip 5 7 Bit 6 Bit 7 (c)(f) Strip 8 Strip 9 P8-11 Strip 10 Strip 11 RAID level 5 RAID level 2 Strip 12 P12-15 Strip 13 Strip 14 Strip 15 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 21/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 23/24 P16-19 Strip 16 Strip 17 Strip 18 Strip 19

Bit 1 Bit 2 RAIDBit 3 SolutionsBit 4 Parity (RAID 4) RAID - Additional Aspects (d) RAID level 3

√ RAID 6, as RAID 5 with two independent parity bits (stripes), √ RAID 10 = 1 + 0 Strip 0 Strip 1 Strip 2 Strip 3 P0-3 √ hardware RAID and software RAID, (e) Strip 4 Strip 5 Strip 6 Strip 7 P4-7 RAID level 4 Strip 8 Strip 9 Strip 10 Strip 11 P8-11 √ in common use RAID: 0, 1, 5, 1+0, 0+1, √ typical server configuration: √ RAID 4 – RAID 6, independent access to disks, independent requests ? RAID 1 for small critical data (i.e. disks with operating system), may be serviced in parallel, better performance in number of serviced Strip 0 Strip 1 Strip 2 Strip 3 P0-3 ? RAID 5 for huge databases (i.e. disks in some external matrix). requests in time, Strip 4 Strip 5 Strip 6 P4-7 Strip 7

√(f)stripingStrip 8 witStrhipbig 9 stripes,P8-11 Strip 10 Strip 11 RAID level 5 √ parityStrip 12comP12-15puter onStripbit 13basisStrip s14till rStreqipuir 15es read of a block. P16-19 Strip 16 Strip 17 Strip 18 Strip 19

Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 22/24 Faculty of E&IT, Warsaw University of Technology Operating Systems / Input/Output – p. 24/24