Systems Structure Input/Output Peripherals Computer

Central Main Processing Memory Unit

Computer Systems Interconnection

Input Output Communication lines

CS 160 Ward 1 CS 160 Ward 2

Examples of I/O Devices

• Human readable (Communicating with user) – Screen, printer, keyboard, etc. Basic I/O Concepts & • Machine readable (Communicating with Terminology equipment) – Magnetic disk, tape systems, etc. – Cameras, audio speakers, etc. – Sensors, actuators, etc. • Communication (Communicating with remote devices) – Modems, Network Interface Card (NIC), etc.

CS 160 Ward 3 CS 160 Ward 4 Illustration of Early Devices Input/Output Problems

• Wide variety of peripherals – Delivering different amounts of data – At different speeds – In different formats • All slower than CPU and RAM

• Independent of processor – Separate circuitry & power • Connected by digitals signals

CS 160 Ward 5 CS 160 Ward 6

Typical I/O Data Rates (~Year 2000) Input/Output Problems

• Wide variety of peripherals – Delivering different amounts of data – At different speeds – In different formats • All slower than CPU and RAM

Need I/O controllers with interfaces to effectively handle.

CS 160 Ward 7 CS 160 Ward 8 Illustration of Modern Interface Controller I/O Controller Functions

• Control & Timing • CPU Communication • Device Communication • Data Buffering • Error Detection • Needed at each end of a physical connection • Allows arbitrary voltage and signal on connection

CS 160 Ward 9 CS 160 Ward 10

I/O Controller Illustration Two Types of Interfaces

• Parallel interface – Composed of many wires – Each wire carries one bit at any time – Width is number of wires • Serial interface – Single signal wire (also need a ground) – Bits sent one-at-a-time – Slower than parallel interface

CS 160 Ward 11 CS 160 Ward 12 Clock(s) Duplex Technology

• Ends of connection typically use • Full-duplex separate clocks, and controllers manage – Simultaneous, bi-directional transfer differences – Example: disk drive supports simultaneous read and write • Transmission is self-clocking if signal • Half-duplex encoded in such a way that receiving – Transfer in only one direction at a time controller can determine boundary of – Interfaces must negotiate access before bits transmitting

CS 160 Ward 13 CS 160 Ward 14

Latency and Throughput Data Multiplexing

• Fundamental idea • The latency of an interface is a measure • Arises from hardware limits on parallelism of the time required to perform a single (pins or wires) bit transfer • Allows sharing of hardware • The throughput of an interface is a • Multiplexor – Accepts input from many sources measure of the data that can be – Sends small amount from one source before transferred per unit time accepting another • Demultiplexor – Receives transmission of pieces – Sends each piece to appropriate destination

CS 160 Ward 15 CS 160 Ward 16 Illustration of Mutiplexing Multiplexing and I/O Interfaces

• Multiplexing is used to construct an I/O interface that can transfer arbitrary amounts of data over a fixed number of parallel wires • Multiplexing hardware divides the data into blocks, and transfers each block independently • 64 bits of data multiplexed over 16-bit path

CS 160 Ward 17 CS 160 Ward 18

Multiple Devices per External Interface Processor View of I/O

• Cannot afford separate interface per • Processor does not access external device devices directly – Too many wires • Instead, processor uses a programming interface to pass requests to an – Not enough pins on processor chip interface controller • Example • Programming interface translates the – I/O devices, memory, etc. sharing a requests into the appropriate external common bus. signals

CS 160 Ward 19 CS 160 Ward 20 Definition of a Bus

• Digital interconnection mechanism • Allows two or more functional units to transfer data Buses & Bus Architectures • Typical use: connect processor to – Memory – I/O Devices • Design can be – Proprietary (owned by one company) – Standardized (available to many companies)

CS 160 Ward 21 CS 160 Ward 22

Illustration of a Bus Sharing

• Most buses are shared by multiple devices • Need an access protocol – Determines which device can use the bus at any time – All attached devices must follow the protocol • Note: can have multiple buses in one computer

CS 160 Ward 23 CS 160 Ward 24 Characteristics of a Bus Physical Bus Connections

• Parallel data transfer • Several possibilities – Can transfer multiple bits at the same time – Wires on a circuit board or chip – Typical width is 32 or 64 bits – Sockets on boards • Passive – Bus does not contain many electronic components – Combinations – Attached devices handle communication • Conceptual view: think of a bus as parallel wires • Bus may have arbiter that handles sharing

CS 160 Ward 25 CS 160 Ward 26

Illustration of Bus on a Motherboard Illustration of Circuit Board and Corresponding Sockets

CS 160 Ward 27 CS 160 Ward 28 Bus Interface Conceptual Design of a Bus

• Nontrivial • Need three functions • Controller circuit required – Control – Address specification – Data being transferred • Conceptually three separate groups of wires (lines)

CS 160 Ward 29 CS 160 Ward 30

Illustration of Lines in a Bus Bus Access

• Bus only supports two operations paradigm – fetch (also called read) – store (also called write) • Access paradigm known as fetch-store

• Obvious for memory access • Surprise: all operations, including I/O, must be performed using fetch-store paradigm

CS 160 Ward 31 CS 160 Ward 32 Fetch-Store Over a Bus Width of a Bus

• Fetch • Larger width – Place an address on the address lines – Higher performance – Use control line to signal fetch operation – Higher cost – for control line to indicate operation complete – Requires more pins • Store • Smaller width – Place an address on the address lines – Lower cost – Place data items on the data lines – Lower performance – Use control lines to signal store operation – Requires fewer pins – Wait for control line to indicate operation complete • Compromise: multiplex transfers to reduce width

CS 160 Ward 33 CS 160 Ward 34

Multiplexing Illustration of Multiplexing on a Bus

• Reuse lines for multiple purposes • Extreme case – Serial bus has one line • Typical case – Bus has K lines – Address and data are K bits wide • Transfer takes longer with multiplexing • Controller hardware is more sophisticated

CS 160 Ward 35 CS 160 Ward 36 Effect of Bus Multiplexing on Design Illustration of Memory Bus

• Addresses and data are multiplexed over a bus • To optimize performance of the hardware, an architect chooses a single size for both data items and addresses

• Address over bus used to activate desired memory unit

CS 160 Ward 37 CS 160 Ward 38

Control Hardware and Addresses Steps an Interface Takes

• Although all interfaces receive all Let R be the range of addresses assigned to the memory requests that pass across the bus, an Repeat forever { interface only responds to requests that Monitor the bus until a request appears; contain an address for which the if (the request specifies an address in R) { interface has been configured respond to the request } else { ignore the request } }

CS 160 Ward 39 CS 160 Ward 40 Potential Errors on a Bus Address Configuration and Sockets

• Two options for address configuration • Address conflict – Configure each interface with the set of addresses – Two devices attempt to respond to a given – Arrange sockets so that wiring limits each address socket to a range of addresses • Unassigned address • Latter avoids misconfiguration: owner can – No device responds to a given address plug in additional boards without configuring the hardware • Note: some systems allow MMU to detect Bus hardware reports a bus error. and configure boards automatically

CS 160 Ward 41 CS 160 Ward 42

Using Fetch-Store with Devices Example: Meaning Assigned to Addresses

• Example Address Operation Meaning – Imaginary status light controller Nonzero data value turns the display 100-103 store on, and a zero data value turns the – Connected to 32-bit bus display off – Contains N separate lights Returns zero if display is currently off, 100-103 fetch and nonzero if display is currently on – Desired functions are Change brightness. Low-order four bits • Turn display on of the data value specify brightness 104-107 store • Turn display off value from zero (dim) through sixteen (bright) • Set display brightness Low order sixteen bits of data value • Turn status light i on or off each controls a status light, where zero 108-111 store sets the corresponding light off and one sets it on

CS 160 Ward 43 CS 160 Ward 44 Example: Interpretation of Operations Unified Memory & Device Addressing

• Semantics are if (address == 100 && op == store && data != 0) • Single bus can attach turn_on_display; – Multiple memories and if (address == 100 && op == store && data == 0) – Multiple devices turn_off_display; • Bus address space includes all units

• Circuits actually test the address, operation, and data values in parallel and take appropriate action

CS 160 Ward 45 CS 160 Ward 46

Example: Bus with Memories & Devices Example: Bus Addressing

Bus Address Space Address Assignments

• Two memories and two I/O devices

Address space may be contiguous or may have holes

CS 160 Ward 47 CS 160 Ward 48 Address Map Bridge Connecting Two Buses • Specifies types of hardware that can be used for different addresses • Part of bus specification • Example on the right – 16-bit bus – Bus can support up to 32,768 bytes • An interconnection device • In a typical computer, the part of the address space • Maps range of addresses available to devices is • Forwards operations and replies from one bus sparsely populated – only a to the other small percentage of address • Especially useful for adding an auxiliary bus are used.

CS 160 Ward 49 CS 160 Ward 50

Bridge Address Mapping Switching Fabric • Alternative to bus – Bus • only one pair of attached units can communicate at any given time • : (1) obtain exclusive use of bus, (2) transfer data, and (3) release bus • Switching fabric connects multiple devices – Allows multiple attached units to communicate simultaneously • Sender supplies data and destination device • Fabric delivers data to specified destination

CS 160 Ward 51 CS 160 Ward 52 Crossbar Switch

Processor-I/O Interaction Techniques

• Solid dot indicates a connection

CS 160 Ward 53 CS 160 Ward 54

Input-Output Techniques Programmed I/O: Detail

Three principle I/O techniques • CPU requests I/O operation – Programmed I/O • I/O device performs operation I/O occurs under the direct and continuous control • I/O device sets status bits of the CPU – Interrupt-driven I/O • CPU checks status bits periodically (polling) CPU issues an I/O command, then continues to • I/O device does not inform CPU directly execute, until interrupted by the I/O hardware signaling completion of the I/O operation • I/O device does not interrupt CPU – Direct Memory Access (DMA) • CPU may wait or come back later Specialized I/O processor takes over control of an I/O operation from the CPU CPU may waste considerable time

CS 160 Ward 55 CS 160 Ward 56 Programmed I/O: Example Interrupt-Driven I/O

• Print a new line of text on a printer • Overcomes CPU waiting Operation: Cause printer to advance the paper • No repeated CPU checking of device Poll: Determine when paper has advanced Operation: Move print head to beginning of line • I/O device interrupts when ready Poll: Determine when print head reaches beginning of line Operation: Specify character to print Major improvement in CPU performance. Poll: Determine when character locked in place Operation: Cause hammer to strike the character Poll: Determine when hammer is finished striking

CS 160 Ward 57 CS 160 Ward 58

Interrupt-Driven I/O: Example Interrupt-Driven I/O: Interrupts

• Print a new line of text on a printer CPU issues command to device for printer to advance the • Issues I/O command paper • Does other work CPU continues with other execution until receives interrupt • Checks for interrupt at end of each instruction from the I/O device cycle (recall basic Instruction Cycle – next CPU issues command to move print head to beginning of line slide) CPU continues with other execution until receives interrupt from the I/O device CPU issues command to specify character to print CPU continues with other execution until receives interrupt from the I/O device . . .

CS 160 Ward 59 CS 160 Ward 60 Basic Instruction Cycle States Handling an Interrupt

• Save the current execution state – Values in registers – Program counter – Condition code • Determine which device issued the interrupt • Call the procedure that handles the device – Runs code for the specific interrupt (e.g., fetch & store) • Clear the interrupt signal from the bus • Restore the current execution state

CS 160 Ward 61 CS 160 Ward 62

Direct Memory Access DMA Operation

• CPU tells DMA controller: • Interrupt driven and programmed I/O require active CPU intervention – Read/Write – Transfer rate is limited – Device address • CPU saves process state information – Starting address of memory block for data – CPU is tied up – Amount of data to be transferred • CPU carries on with other work DMA is the solution. • DMA controller has necessary digital logic to deal with transfer • DMA controller sends interrupt when finished

CS 160 Ward 63 CS 160 Ward 64 DMA Transfer Effect of Cache

• DMA controller requests bus – Bus must allow multiple units to access the bus • What effect does a system with caching without interference memory have on DMA? • When control of bus given, DMA controller – Cache reduces the number of memory begins transfer of data accesses, thus bus is available more often • CPU can request bus for its operations and is for DMA use given higher priority • Slows down CPU but not as much as CPU doing transfer

CS 160 Ward 65 CS 160 Ward 66

DMA Configurations [1] DMA Configurations [2]

• Single Bus, Detached DMA controller • Each transfer uses bus twice – I/O to DMA then DMA to memory • Single Bus, Integrated DMA controller • Twice the potential interference with the CPU • Controller may support >1 device • Each transfer uses bus once – DMA to memory

CS 160 Ward 67 CS 160 Ward 68 DMA Configurations [3] I/O Processors

• I/O devices getting more sophisticated – e.g. 3D graphics cards • I/O Module enhanced to become a processor (with memory) • CPU instructs I/O controller to do transfer • I/O controller does entire transfer • Separate I/O Bus • Improves speed • Bus supports all DMA enabled devices – Takes off CPU • Each transfer uses bus once – Dedicated processor is faster – DMA to memory

CS 160 Ward 69 CS 160 Ward 70

Evolution of I/O

• CPU directly controlled peripheral device • I/O module (controller) added using programmed I/O • Interrupts employed for notification Standard I/O Interfaces • I/O module given direct access to memory (called DMA) • I/O module enhanced to become a processor (called I/O channel or processor) • I/O module adds local memory

CS 160 Ward 71 CS 160 Ward 72 Computer System & Different Interfaces Bus Standards • Industry Standard Architecture (ISA) No Standard – A de facto standard due to IBM PC – Basically the PC/AT bus running at 8.33 MHz with 16-bit transfer

• Other three widely used bus standards: – PCI (Peripheral Component Interconnect) – SCSI (Small Computer System Interface) – USB (Universal Serial Bus)

CS 160 Ward 73 CS 160 Ward 74

PCI [1] PCI [2]

• Peripheral Component – plug-and-play Interconnect (PCI) – a standard promoted by Intel – extremely popular, used by the Pentium – supports functions found on a and the Sun UltraSPARC Iii processor bus but in standardized Host format – uses a centralized bus arbiter, mostly is – appear to processor to be connected Main built into one of the bridge chips. to processor bus PCI bridge memory – supports high-speed disks and graphic and video devices (64-bit PCI bus transfers, 66 MHz → 528 MB/sec) – processor independent Ethernet Disk Printer • The PCI bridge acts as a data interface buffer to keep the PCI independent of the processor speed.

CS 160 Ward 75 CS 160 Ward 76 SCSI [1] SCSI [2]

• Small Computer System Interface (SCSI) is a • There are two types of controllers connected to standard for interfaces to I/O devices defined by a SCSI bus. American National Standards Institute (ANSI) under the designation X3.131. – An initiator (such as the SCSI controller) has – 8 – 16 data lines the ability to select a particular target and to – 5 MB/sec to 160 MB/sec send commands specifying the operations. – maximum capacity: 8 – 16 devices – A target (such as the disk controller) carries • The SCSI bus is connected to the processor bus out the commands it receives from the through a SCSI controller that uses DMA for data transfer. initiator.

CS 160 Ward 77 CS 160 Ward 78

SCSI [3] SCSI [4]

• There are 4 phases involved in a SCSI bus – 2) When the initiator wins the arbitration (distributed operations : arbitration, selection, reselection, arbitration), it selects the target (selection) and hands and data transfer. over control of the bus to the target (logical connection • Example: The processor sends a command to the established). SCSI controller to read 2 non-contiguous disk – 3) The target requests an input from initiator; the initiator sends a command specifying the read operation. sectors from a disk drive. – 4) The target suspends the connection, releases the bus; – 1) The initiator (SCSI controller) contends for bus then performs the disk seek operation (may be several control (arbitration). ms long delay).

CS 160 Ward 79 CS 160 Ward 80 SCSI [5] SCSI [6]

– 5) The target sends a seek command to the disk drive • The data transfers are always controlled to read the first sector; then requests control of the bus; by the target controller. wins the arbitration; then reselects the initiator to restore the connection (reselection). • While a particular connection is – 6) The target transfers the first sector to the initiator suspended, other devices can use the (data transfer), then suspends the connection again. bus. This ability to overlap data – 7) The target sends a seek command to the disk drive transfer requests leads to its high to read the second sector, then transfers it to the performance. initiator as before. The logical connection is then terminated.

CS 160 Ward 81 CS 160 Ward 82

USB [1] USB [2]

• The Universal Serial Bus (USB) – plug-and-play – developed by collaborative efforts of computer and • when a new I/O device is plugged in, the root hub communications companies, including Compaq, detects this event and interrupts the OS Hewlett-Packard, Intel, Lucent, Microsoft, Nortel • The OS queries the device to find out what it is and Networks, and Philips how much USB bandwidth it needs – provide a simple, low-cost, and easy to use • If the OS decides that there is enough bandwidth, it interconnection system assigns the new device a unique address and – accommodate a wide range of data transfer downloads this address and other information to characteristics for I/O devices, including Internet connections (low-speed: 1.5Mbits/s, full-speed: configuration registers inside the device 12Mbits/s, high-speed: 480Mbits/s (USB 2.0))

CS 160 Ward 83 CS 160 Ward 84 USB [3] Architecture of a Typical Pentium II System • A USB system consists of a root hub that usually plugs Host computer into the main bus; the root Root hub can be connected to I/O hub devices or to expansion hubs – a tree topology. • A message sent by the host Hub Hub computer is broadcast to all I/O devices. I/O I/O I/O I/O Hub device device device device • A message from an I/O device is sent only upstream towards I/O I/O the root of the tree. device device Bridge is used to • USB has its own 7-bit address translate the signals and protocols of one bus into space those of the other.

CS 160 Ward 85 CS 160 Ward 86