
DMA -

Part of course, Sharif U of Tech.

This report can be used as a source for the DMA topic I introduced in the last session of the class.

In a modern computer system, data transfer between a device (for example a hard disk or another peripheral card) and system memory is accomplished in two ways: programmed IO and DMA.

1. Programmed IO

The first, less efficient method uses programmed IO transfers. The device generates an interrupt to inform the CPU that it needs data transferred. The device's interrupt service routine (ISR) causes the CPU to read from the device into one of its own registers, and then to write from that register to memory. Similarly, if data is to be moved from memory to the device, the ISR has the CPU read from memory into a register and then write from that register to the device. This process is inefficient for two reasons. First, the CPU generates two bus cycles (a load and a store instruction) for every data transfer, one to memory and one to the device. Second, the CPU is busy transferring data rather than performing its primary function of executing application code.
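The two bus cycles per byte can be made visible with a minimal sketch. The `Bus` class and `pio_read` function are hypothetical names used only for illustration; they model the ISR copying each byte through a CPU register and do not touch real hardware.

```python
class Bus:
    """Hypothetical stand-in for the system bus; it only counts cycles."""
    def __init__(self):
        self.cycles = 0


def pio_read(device_bytes, memory, dest, bus):
    """Copy bytes from a device into memory through a CPU register."""
    for offset, value in enumerate(device_bytes):
        register = value                   # bus cycle 1: CPU reads the device
        bus.cycles += 1
        memory[dest + offset] = register   # bus cycle 2: CPU writes to memory
        bus.cycles += 1


memory = bytearray(16)
bus = Bus()
pio_read(b"DMA!", memory, 4, bus)
print(bytes(memory[4:8]), bus.cycles)
```

Four bytes cost eight bus cycles in this model, and the CPU executes the whole loop itself.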

2. DMA

The second, more efficient method of transferring data is the DMA (direct memory access) technique. Here a DMA controller replaces the CPU and handles access to both the I/O device and the memory, which gives much faster data transfers. This special hardware writes to and reads from memory directly, without CPU intervention, and so saves the time the CPU would otherwise spend on op-code fetch and decoding and on incrementing and testing the source and destination addresses. As a result, on ordinary PCs a programmed IO transfer of one data byte takes up to 29 clock cycles, whereas a DMA transfer requires only 5 clock cycles. On the first PCs, the circuit in charge of such transfers was a separate DMA controller chip, the 8237. In modern PCs, this circuit is embedded in the south bridge chip.
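As a rough illustration of what the 29-cycle and 5-cycle figures mean for a whole buffer, the sketch below multiplies them out. The 512-byte block size is an assumption chosen for the example, and the estimate ignores the one-time cost of programming the DMA controller.

```python
PIO_CYCLES_PER_BYTE = 29   # upper bound quoted in the text
DMA_CYCLES_PER_BYTE = 5    # figure quoted in the text


def transfer_cycles(num_bytes, cycles_per_byte):
    """Clock cycles needed to move num_bytes at a fixed per-byte cost."""
    return num_bytes * cycles_per_byte


block = 512  # assumed block size, for illustration only
pio = transfer_cycles(block, PIO_CYCLES_PER_BYTE)
dma = transfer_cycles(block, DMA_CYCLES_PER_BYTE)
print(pio, dma, round(pio / dma, 1))
```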

To access memory while the CPU is also making its normal accesses, the DMA controller may either (1) stop the CPU and access the memory (cycle stealing DMA) or (2) use the bus while the CPU is not using it (hidden cycle DMA).
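The difference between the two schemes can be sketched with a toy model. It assumes, purely for illustration, that the CPU's activity is a string of cycles marked `B` (CPU uses the bus) or `I` (CPU works internally), and that each DMA byte needs exactly one bus cycle. Real bus timing is more involved.

```python
def cycle_stealing(cpu_pattern, dma_bytes):
    """DMA takes one bus cycle per byte; the CPU waits during each of them.

    Returns (elapsed cycles, bytes moved).
    """
    return len(cpu_pattern) + dma_bytes, dma_bytes


def hidden_cycle(cpu_pattern, dma_bytes):
    """DMA moves a byte only in cycles where the CPU leaves the bus idle.

    Returns (elapsed cycles, bytes moved).
    """
    idle = sum(1 for cycle in cpu_pattern if cycle == "I")
    return len(cpu_pattern), min(dma_bytes, idle)


pattern = "BIBBIBIB"   # B = CPU uses the bus, I = CPU works internally
print(cycle_stealing(pattern, 4))
print(hidden_cycle(pattern, 4))
```

In this model cycle stealing always moves every byte but lengthens the CPU's run, while hidden cycle DMA never delays the CPU but can only move as many bytes as there are idle bus cycles.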

The DMA controller has some control lines (used to handshake with the CPU when negotiating to become bus master, and to emulate the CPU's behaviour while accessing memory), an address register that is auto-incremented (or auto-decremented) at each memory access, and a counter that tracks the byte (or word) count to make sure the required number of bytes is transferred. The DMA controller must be programmed before it can perform a memory access. For reading data from a device, it is programmed with (at least):

- initial memory address
- number of bytes to be input
- address of the source

The sequence of events is:

- Bus (CPU-to-memory bus) request from DMA controller to CPU (can I use the bus, please?)
- Bus grant from CPU to DMA controller (yes, you can, at the end of this bus cycle)
- Bus grant acknowledge from DMA controller to CPU (thank you. Here I go!)
- DMA controller reads device
- DMA controller writes to memory
- DMA controller increments counter
- DMA controller checks for End-of-Count
- At the End-of-Count the DMA controller generates an interrupt request to tell the CPU: 'I have done a DMA operation. New data is available'.
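The sequence of events above can be sketched as a small software model. The class and attribute names are hypothetical, the source device is represented by a Python iterator rather than a programmed source address, and one bus handshake per byte is an assumption of this sketch.

```python
class DmaController:
    """Toy model of a DMA controller reading from a device into memory."""

    def __init__(self, memory):
        self.memory = memory
        self.address = 0              # address register, auto-incremented
        self.limit = 0                # number of bytes to be input
        self.counter = 0              # bytes transferred so far
        self.interrupt_request = False
        self.events = []

    def program(self, initial_address, num_bytes):
        """Done by the CPU before any transfer can take place."""
        self.address = initial_address
        self.limit = num_bytes
        self.counter = 0
        self.interrupt_request = False

    def run(self, device):
        """Transfer bytes until End-of-Count, then raise an interrupt request."""
        while self.counter < self.limit:          # check for End-of-Count
            self.events += ["bus request", "bus grant", "bus grant acknowledge"]
            value = next(device)                  # DMA controller reads device
            self.memory[self.address] = value     # DMA controller writes memory
            self.address += 1                     # auto-increment the address
            self.counter += 1                     # increment the counter
        self.interrupt_request = True             # 'New data is available'


memory = bytearray(8)
dma = DmaController(memory)
dma.program(initial_address=2, num_bytes=5)
dma.run(iter(b"HELLO"))
print(bytes(memory), dma.interrupt_request)
```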

3. DMA channels and programming

A DMA controller may have several DMA channels, each of which can be programmed to perform a sequence of DMA transfers. Devices, usually I/O peripherals, that acquire data that must be read (or devices that must output data and be written to) signal the DMA controller to perform a DMA transfer by asserting a hardware DMA request signal. A DMA request signal for each channel is routed to the DMA controller. This signal is monitored and responded to in much the same way that a processor handles interrupts. When the DMA controller sees a DMA request, it responds by performing one or many data transfers from that I/O device into system memory or vice versa. Channels must be enabled and programmed by the processor for the DMA controller to respond to DMA requests. The number of transfers performed, the transfer modes used, and the memory locations accessed depend on how the DMA channel is programmed.

A DMA controller typically shares the system memory and I/O bus with the CPU and has both bus master and bus slave capability. In bus master mode, the DMA controller acquires the system bus (address, data, and control lines) from the CPU to perform the DMA transfers. In bus slave mode, the DMA controller is accessed by the CPU, which programs the DMA controller's internal registers to set up DMA transfers. The internal registers consist of source and destination address registers and transfer count registers for each DMA channel, as well as control and status registers for initiating, monitoring, and sustaining the operation of the DMA controller.

Each channel is programmed by the CPU (i.e. by the driver of the device). The CPU must specify the direction (memory-to-I/O or I/O-to-memory) and the address and length of the transfer. Once all the settings are ready, the DMA channel can be used. The channel is now considered to be "armed", and will respond when the corresponding device asks for a DMA transfer.
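A minimal sketch of per-channel registers and arming follows. The names `Channel` and `MultiChannelDma` are hypothetical, and the rule that the lowest-numbered requesting channel wins is an assumed fixed-priority scheme, not something the text specifies.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    address: int = 0
    count: int = 0
    to_memory: bool = True    # True: I/O-to-memory, False: memory-to-I/O
    enabled: bool = False


class MultiChannelDma:
    def __init__(self, num_channels=4):
        self.channels = [Channel() for _ in range(num_channels)]

    def program(self, ch, address, count, to_memory):
        """Bus slave side: the CPU (device driver) writes the channel registers."""
        self.channels[ch] = Channel(address, count, to_memory, enabled=True)

    def is_armed(self, ch):
        channel = self.channels[ch]
        return channel.enabled and channel.count > 0

    def select(self, drq_lines):
        """Pick the channel to service from the set of asserted DRQ lines.

        Unarmed channels are ignored; lowest number wins (assumed priority).
        """
        for ch in sorted(drq_lines):
            if self.is_armed(ch):
                return ch
        return None


dma = MultiChannelDma()
print(dma.select({2}))                        # channel 2 not programmed yet
dma.program(2, address=0x3456, count=512, to_memory=True)
print(dma.select({2}))                        # channel 2 is now armed
```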

A Sample DMA transfer (IBM-PC)

For example, assume that a hard disk controller (HDC) has just read a byte from a disk and wants the DMA controller to place it in memory at location 0x00123456. Also assume that DMA channel 2 is assigned to the HDD. The process begins with the HDC asserting the DRQ2 signal (the DRQ line for DMA channel 2) to alert the DMA controller.

The DMA controller will note that the DRQ2 signal is asserted. The DMA controller will then make sure that DMA channel 2 has been programmed by CPU and is enabled. The DMA controller also makes sure that none of the other DMA channels are active or want to be active and have a higher priority. Once these checks are complete, the DMA asks the CPU to release the bus so that the DMA may use the bus. The DMA requests the bus by asserting the HRQ signal which goes to the CPU.

The CPU detects the HRQ signal and completes the instruction it is currently executing. Once the processor has reached a state where it can release the bus, it does so. All of the signals normally generated by the CPU (-MEMR, -MEMW, -IOR, -IOW and a few others) are placed in a tri-stated condition (to allow the DMA controller to drive them), and the CPU then asserts the HLDA signal, which tells the DMA controller that it is now in charge of the bus. -MEMR, -MEMW, -IOR, and -IOW are the signals used for memory read, memory write, IO read, and IO write, respectively.

Depending on the processor, the CPU may be able to execute a few additional instructions now that it no longer has the bus, but the CPU will eventually have to wait when it reaches an instruction that must read something from memory that is not in the processor's internal cache or pipeline.

Now that the DMA "is in charge", the DMA activates the appropriate ones of its -MEMR, -MEMW, -IOR, -IOW output signals (-IOR and -MEMW in this case, to copy data from disk to memory), and the address outputs from the DMA are set to 0x3456 (the low 16 bits of the target address 0x00123456), which will be used to direct the byte that is about to be transferred by the HDC to a specific memory location.
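The value 0x3456 on the DMA address outputs is the low 16 bits of the target address 0x00123456. The sketch below shows that split; it assumes the remaining upper bits are held in a separate page register, as on the IBM PC, and `split_address` is a hypothetical helper name.

```python
def split_address(physical):
    """Split a physical address into (page, offset) with a 16-bit offset."""
    return physical >> 16, physical & 0xFFFF


def join_address(page, offset):
    """Rebuild the physical address from the page and 16-bit offset."""
    return (page << 16) | offset


page, offset = split_address(0x00123456)
print(hex(page), hex(offset))
```

With this split, the offset is what the DMA controller's own address register holds and increments.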

The DMA will then let the device that requested the DMA transfer know that the transfer is commencing. This is done by asserting the -DACK signal; in the case of the hard disk controller, -DACK2 is asserted.

The HDC is now responsible for placing the byte to be transferred on the bus data lines. The DMA will wait one DMA clock, and then de-assert the -MEMW and -IOR signals so that the memory will latch and store the byte that was on the bus, and the HDC will know that the byte has been transferred.

Since the DMA cycle only transfers a single byte at a time, the HDC now drops the DRQ2 signal, so the DMA knows that it is no longer needed. The DMA will de-assert the -DACK2 signal, so that the HDC knows it must stop placing data on the bus. The DMA will now check to see if any of the other DMA channels have any work to do. If none of the channels have their DRQ lines asserted, the DMA controller has completed its work and will now tri-state the -MEMR, -MEMW, -IOR, -IOW and address signals.

Finally, the DMA will de-assert the HRQ signal. The CPU sees this and de-asserts the HLDA signal. The CPU then activates its -MEMR, -MEMW, -IOR, -IOW and address lines, and resumes executing instructions and accessing main memory and the peripherals.
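The single-byte handshake described in this section can be written down as an ordered trace and checked for consistency. This is only a sketch: the tuple layout and function names are hypothetical, and bus tri-stating and the address outputs are left out for brevity.

```python
def single_byte_transfer(channel=2):
    """Ordered (actor, action, signal) steps for one device-to-memory byte."""
    drq, dack = f"DRQ{channel}", f"-DACK{channel}"
    return [
        ("HDC", "assert", drq),        # device requests service
        ("DMA", "assert", "HRQ"),      # DMA asks the CPU for the bus
        ("CPU", "assert", "HLDA"),     # CPU has released the bus
        ("DMA", "assert", "-IOR"),     # read the byte from the device...
        ("DMA", "assert", "-MEMW"),    # ...and write it to memory
        ("DMA", "assert", dack),       # tell the device the transfer has begun
        ("DMA", "release", "-MEMW"),   # memory latches the byte
        ("DMA", "release", "-IOR"),
        ("HDC", "release", drq),       # device no longer needs service
        ("DMA", "release", dack),
        ("DMA", "release", "HRQ"),     # DMA gives the bus back
        ("CPU", "release", "HLDA"),    # CPU drives the bus again
    ]


def still_asserted(trace):
    """Return the set of signals left asserted after replaying the trace."""
    active = set()
    for _actor, action, signal in trace:
        if action == "assert":
            active.add(signal)
        else:
            active.discard(signal)
    return active


trace = single_byte_transfer()
print(still_asserted(trace))       # every signal is released by the end
```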

For a typical HDD, the above process is repeated once for each byte to be transferred. Each time a byte is transferred, the address register in the DMA is incremented and the counter in the DMA that shows how many bytes remain to be transferred is decremented.

When the counter reaches zero, the DMA asserts the EOP signal, which indicates that the counter has reached zero and no more data will be transferred until the DMA controller is reprogrammed by the CPU. This event is also called the Terminal Count (TC).

If a peripheral wants to generate an interrupt when the transfer of a buffer is complete, it can assert one of the interrupt signals to get the processor's attention. In the PC architecture, the DMA chip itself is not capable of generating an interrupt. The peripheral and its associated hardware are responsible for generating any interrupt that occurs.

It is important to understand that although the CPU always releases the bus to the DMA when the DMA makes the request, this action is invisible to both applications and the operating system, except for slight changes in the amount of time the processor takes to execute instructions when the DMA is active. Consequently, the processor must poll the peripheral, poll the registers in the DMA chip, or receive an interrupt from the peripheral to know for certain when a DMA transfer has completed.
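Polling the DMA chip for completion can be sketched as follows. The `ChannelStatus` class is a hypothetical software model of a channel's count register and terminal-count flag, not the register layout of any real chip, and `device_step` stands in for whatever causes the next byte to be transferred.

```python
class ChannelStatus:
    """Toy model of one channel's remaining count and Terminal Count flag."""

    def __init__(self, count):
        if count <= 0:
            raise ValueError("count must be positive")
        self.remaining = count
        self.terminal_count = False

    def transfer_byte(self):
        """Account for one transferred byte; ignored after Terminal Count."""
        if self.terminal_count:
            return False               # needs reprogramming by the CPU
        self.remaining -= 1
        if self.remaining == 0:
            self.terminal_count = True  # counter reached zero (EOP / TC)
        return True


def poll_until_done(status, device_step, max_polls=1000):
    """CPU side: poll the status flag until the transfer has completed."""
    polls = 0
    while not status.terminal_count:
        if polls >= max_polls:
            raise TimeoutError("DMA transfer did not complete")
        device_step()                  # stands in for time passing
        polls += 1
    return polls


status = ChannelStatus(3)
print(poll_until_done(status, status.transfer_byte))
```

The `max_polls` limit is a design choice of this sketch so that a stalled device cannot hang the polling loop.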