Computer Organization (chapter 4) sample problems

In the lecture, we heard about some of the status flag bits. Some others not discussed in the lecture are found in the Intel architecture and include the adjust flag, the trap flag, the interrupt enable flag and the direction flag. What does each of these represent?

Adjust flag – indicates a carry out of the low-order 4 bits, that is, a carry between BCD digits.

Trap flag – if set, the processor operates in single-step mode so that a debugger can watch the results of each instruction as it is executed.

Interrupt flag – if set, the CPU is allowed to be interrupted. If not set, the CPU is handling something of a high enough priority that any maskable interrupt will be postponed.

Direction flag – used to determine if processing a string should be handled left-to-right or right-to-left.
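As a rough illustration of the four flags above, the sketch below tests each flag bit in a flags value and computes the adjust flag for an 8-bit addition. The bit positions (AF = 4, TF = 8, IF = 9, DF = 10) are those of the x86 FLAGS register; the function names decode_flags and adjust_flag_after_add are hypothetical and chosen only for this example.

```python
# Bit positions of the four flags in the x86 FLAGS register.
FLAG_BITS = {"AF": 4, "TF": 8, "IF": 9, "DF": 10}


def decode_flags(flags):
    """Return a dict saying whether each of the four flags is set."""
    return {name: bool((flags >> bit) & 1) for name, bit in FLAG_BITS.items()}


def adjust_flag_after_add(a, b):
    """True if adding a and b carries out of the low-order 4 bits."""
    return ((a & 0xF) + (b & 0xF)) > 0xF


# 0x0202 has bit 9 set, so only the interrupt flag reads as set here.
print(decode_flags(0x0202))
# 0x09 + 0x08: the low nibbles sum to 17, which carries into the next BCD digit.
print(adjust_flag_after_add(0x09, 0x08))
```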

A word-addressable computer has 64KB of memory and a 16-bit word size. Its memory is built out of 4Kx4 chips. The computer uses low-order interleave. Answer the following questions.
a. How many chips make up memory?
b. How many banks make up memory?
c. How many bits make up an address?
d. Of the bits from part c, which make up the bank select and which make up the address sent to the bank?

64KB of 16-bit words means 64KB / (16 bits / 8 bits per byte) = 64KB / 2 bytes per word = 32K words.
a. (32K x 16) / (4K x 4) = 32 chips
b. Each bank is 16/4 = 4 chips wide, so 32 chips / 4 = 8 banks
c. log2 32K = 15-bit address
d. 8 banks require 3 bits, and low-order interleave means the rightmost 3 bits select the bank, so the format is: 12 bits (address to the bank) | 3 bits (bank select)
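The arithmetic above can be checked with a short sketch. The function name memory_layout and its parameter names are assumptions made for this example; it also assumes that every size involved is a power of two.

```python
def memory_layout(memory_bytes, word_bits, chip_words, chip_bits):
    """Compute chip count, bank count, and address fields for an interleaved memory.

    Assumes all sizes are powers of two.
    """
    words = memory_bytes * 8 // word_bits                     # addressable words
    chips = (words * word_bits) // (chip_words * chip_bits)   # total chips
    chips_per_bank = word_bits // chip_bits                   # chips side by side per word
    banks = chips // chips_per_bank
    address_bits = (words - 1).bit_length()                   # log2(words)
    bank_bits = (banks - 1).bit_length()                      # log2(banks)
    return {
        "words": words,
        "chips": chips,
        "banks": banks,
        "address_bits": address_bits,
        "bank_select_bits": bank_bits,
        "in_bank_address_bits": address_bits - bank_bits,
    }


# 64KB of memory, 16-bit words, 4Kx4 chips
print(memory_layout(64 * 1024, 16, 4 * 1024, 4))
```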

A word addressable computer has 1GB of memory and the word size is 64 bits. Assuming there are 80 control lines and 40 interrupt lines, how large is the system bus?

1GB / (64 bits / 8 bits per byte) = 1GB / 8 bytes per word = 128M words, so the address size is log2 128M = 27 bits, and the address bus is 27 bits wide.

The word size is 64 bits, so the data bus is 64 bits

The 80 control lines plus the 40 interrupt lines give 120 lines on the control bus.

System bus size is 27 + 64 + 120 = 211 lines
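The same calculation as a minimal sketch. The function name system_bus_lines is hypothetical, and the code follows the solution's assumption that the interrupt lines are counted as part of the control bus.

```python
def system_bus_lines(memory_bytes, word_bits, control_lines, interrupt_lines):
    """Return (address, data, control, total) line counts for the system bus.

    Assumes a word-addressable memory whose word count is a power of two.
    """
    words = memory_bytes * 8 // word_bits
    address_lines = (words - 1).bit_length()      # log2(words)
    data_lines = word_bits                        # one line per bit of a word
    control_bus = control_lines + interrupt_lines
    total = address_lines + data_lines + control_bus
    return address_lines, data_lines, control_bus, total


# 1GB of memory, 64-bit words, 80 control lines, 40 interrupt lines
print(system_bus_lines(2**30, 64, 80, 40))
```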

You are comparing two processors. Processor A has a system clock speed of 1.667 GHz and Processor B has a system clock speed of 2.5 GHz. Assume Processor A has an 8-stage fetch-execute cycle. Processor B has a 20-stage fetch-execute cycle. We are running the instruction Add R1, R2, #1 on both processors (this is like doing X = Y + 1 where X and Y are registers). Based on what you know, which processor will run this instruction faster? Assume the instruction was instead Load R1, X where X is a memory location. What other factors come into play that might influence which processor is faster?

Processor A will run the instruction faster. Its clock speed is only 2/3 that of Processor B (1.667 GHz versus 2.5 GHz), but its pipeline is only 40% the length of Processor B's (8 stages versus 20). Assuming one stage per clock cycle, the instruction takes about 8 / 1.667 GHz = 4.8 ns on A and 20 / 2.5 GHz = 8 ns on B. Since the instruction does not involve memory or the system bus, A should be faster. If we do the Load instruction instead, it involves memory and the system bus, so other factors might influence the speed of execution, such as whether X is in cache memory (which is influenced by the size and location of the cache) and how wide the bus is, among other possible factors.
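A small sketch of the comparison for the Add instruction. It assumes each stage takes exactly one clock cycle and that nothing stalls; the function name instruction_latency_ns is hypothetical.

```python
def instruction_latency_ns(stages, clock_ghz):
    """Time in nanoseconds for one instruction to pass through every stage.

    Assumes one clock cycle per stage and no stalls.
    """
    return stages / clock_ghz   # cycles divided by cycles per nanosecond


latency_a = instruction_latency_ns(8, 1.667)   # Processor A, about 4.8 ns
latency_b = instruction_latency_ns(20, 2.5)    # Processor B, 8 ns
print(round(latency_a, 2), round(latency_b, 2))
print("A is faster" if latency_a < latency_b else "B is faster")
```

This covers only the latency of a single register-to-register instruction; it says nothing about the Load case, where memory and bus behavior matter.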

Provide a sample of code in which prefetching data from memory would be advantageous and then explain whether high or low order interleave would be preferred.

sum = a[i] + a[i+1] + a[i+2] + a[i+3];

We need to fetch a[i] and a[i+1] and add them, then we need to fetch a[i+2] and a[i+3] and add them to the previous sum and store the result. When fetching a[i], we can go ahead and start fetching a[i+1], a[i+2], and a[i+3] as well. If we use low-order interleave, these four data items will be stored in four different banks (assuming we have at least four banks) and so we can access them simultaneously to provide for prefetching. With high-order interleave, consecutive addresses usually fall in the same bank and must be accessed one at a time, so low-order interleave is preferred.
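To see why consecutive array elements spread across banks under low-order interleave, here is a minimal sketch. The bank count (8), memory size (32K words), and starting address (100) are assumptions picked for illustration, and the function names are hypothetical.

```python
def low_order_bank(address, num_banks):
    """Low-order interleave: the rightmost address bits select the bank."""
    return address % num_banks


def high_order_bank(address, num_banks, total_words):
    """High-order interleave: the leftmost address bits select the bank."""
    words_per_bank = total_words // num_banks
    return address // words_per_bank


NUM_BANKS = 8            # assumed bank count
TOTAL_WORDS = 32 * 1024  # assumed memory size in words
base = 100               # assumed word address of a[i]

addresses = [base + k for k in range(4)]   # a[i] .. a[i+3]
low = [low_order_bank(a, NUM_BANKS) for a in addresses]
high = [high_order_bank(a, NUM_BANKS, TOTAL_WORDS) for a in addresses]

print(low)    # four different banks, so the fetches can overlap
print(high)   # the same bank four times, so the fetches are serialized
```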