CISC 360, Fall 2008 Example of Exam Exercises

Exercise 1:

Question 1. Suppose we have un-pipelined computer hardware (Figure 1) in which, on each 620 ps cycle, the system spends 600 ps evaluating a combinational logic function and 20 ps storing the results in an output register:

[Figure 1: Un-pipelined computer hardware. A combinational logic block (600 ps) feeds an output register (20 ps).]

1.a What is the delay (or latency) of this system?

1.b What is the throughput of this system?
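The arithmetic behind Question 1 can be checked with a short sketch. It assumes the common convention of reporting throughput in giga-instructions per second (GIPS), that is, 1000 divided by the cycle time in picoseconds. The function names are illustrative and not part of the exercise.

```python
# Illustrative helpers; the names are assumptions, not part of the exercise.

def cycle_time_ps(logic_ps, reg_ps):
    """Clock period: combinational delay plus register load time."""
    return logic_ps + reg_ps


def throughput_gips(cycle_ps):
    """One instruction completes per cycle; 1000 ps per ns converts to GIPS."""
    return 1000.0 / cycle_ps


# With a single stage, the latency of one instruction equals one clock cycle.
cycle = cycle_time_ps(600, 20)
print(f"latency    = {cycle} ps")
print(f"throughput = {throughput_gips(cycle):.2f} GIPS")
```

In an un-pipelined design the latency and the cycle time coincide, so throughput is simply the reciprocal of the latency.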

Question 2. Suppose we can separate the combinational logic into six stages or blocks named A, B, C, D, E, and F, having delays (or latencies) of 80, 30, 60, 50, 70, and 10 ps respectively, as shown in Figure 2:

[Figure 2: Un-pipelined computer hardware divided into blocks A to F (80 ps, 30 ps, 60 ps, 50 ps, 70 ps, and 10 ps), followed by a 20 ps register.]

We can create pipelined versions of this design by inserting pipeline registers between pairs of these blocks. We get different combinations of pipeline depth (number of stages) and throughput, depending on where we insert the pipeline registers.

2.a Inserting a single pipeline register gives a two-stage pipeline. Where should the register be inserted to maximize throughput? What would the throughput and latency be?

2.b Where should two registers be inserted to maximize the throughput of a three-stage pipeline? What would the throughput and latency be?

2.c Where should three registers be inserted to maximize the throughput of a four-stage pipeline? What would the throughput and latency be?

2.d What is the minimum number of stages that would yield a design with the maximum achievable throughput? Give the throughput and the latency.
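One way to check answers to Question 2 is to enumerate every possible register placement. The sketch below assumes that the clock period equals the slowest stage plus the 20 ps register delay, that latency is the number of stages times the clock period, and that throughput is reported in GIPS (1000 divided by the cycle time in ps). The function name `best_partition` is hypothetical.

```python
from itertools import combinations

BLOCKS = [80, 30, 60, 50, 70, 10]  # delays of blocks A..F in ps (Question 2)
REG_PS = 20                        # pipeline register delay in ps


def best_partition(n_stages, blocks=BLOCKS, reg_ps=REG_PS):
    """Try every way to place n_stages - 1 registers between adjacent blocks.

    Returns (cycle_ps, cuts) for the shortest clock period, where a cut at
    position c means a register between blocks[c - 1] and blocks[c].
    """
    best = None
    for cuts in combinations(range(1, len(blocks)), n_stages - 1):
        bounds = (0, *cuts, len(blocks))
        # The slowest stage sets the clock period.
        slowest = max(sum(blocks[a:b]) for a, b in zip(bounds, bounds[1:]))
        cycle = slowest + reg_ps
        if best is None or cycle < best[0]:
            best = (cycle, cuts)
    return best


for n in range(1, len(BLOCKS) + 1):
    cycle, cuts = best_partition(n)
    where = ["ABCDEF"[c - 1] + "|" + "ABCDEF"[c] for c in cuts]
    print(f"{n} stages: registers at {where}, cycle {cycle} ps, "
          f"latency {n * cycle} ps, throughput {1000 / cycle:.2f} GIPS")
```

Because block A cannot be subdivided, no placement can bring the clock period below 80 + 20 = 100 ps. The loop shows the stage count at which that floor is first reached.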

Question 3. Suppose we can take the system in Figure 1 and divide it into an arbitrary number of pipeline stages, all having the same delay. What would be the ultimate limit on the throughput, given pipeline register delays of 20ps?
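The limit asked for in Question 3 can be explored numerically. The sketch below assumes the 600 ps of logic from Figure 1 splits evenly into n stages, each followed by a 20 ps register, so the cycle time is 600/n + 20 ps. The function name is illustrative.

```python
def even_split_throughput_gips(n_stages, logic_ps=600.0, reg_ps=20.0):
    """Throughput in GIPS when the logic is split into n equal stages."""
    cycle_ps = logic_ps / n_stages + reg_ps
    return 1000.0 / cycle_ps


# As n grows, the logic share of each cycle shrinks toward zero and the
# register delay alone bounds the clock period from below.
for n in (1, 2, 10, 100, 10_000):
    print(f"{n:>6} stages: {even_split_throughput_gips(n):.3f} GIPS")
```

The printed values approach, but never reach, 1000 / 20 GIPS, since the register delay does not shrink however finely the logic is divided.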

Question 4. Assume a single-cycle, un-pipelined machine. Given the following program:

         mrmovl 0(edx), ebx
         mrmovl 4(edx), eax
         xor ebx, eax
         je Label 1          # not taken
         addl ebx, edx
         rmmovl 8(edx), edx
Label 1: addl eax, edx

4.a How many cycles will it take to execute this code? ______ cycle(s)

4.b What is going on during the 6th cycle of execution?

4.c What is going on during the 11th cycle?

Question 5. Assume a five-stage pipelined machine without forwarding. Given the following program:

         mrmovl 0(edx), ebx
         mrmovl 4(edx), eax
         xor ebx, eax
         je Label 1          # not taken
         addl ebx, edx
         rmmovl 8(edx), edx
Label 1: addl eax, edx

5.a How many cycles will it take to execute this code? ______ cycle(s)

5.b What is going on during the 6th cycle of execution?

5.c What is going on during the 11th cycle?
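A common way to sanity-check cycle counts for a pipelined machine is the fill-and-drain formula sketched below. It assumes one instruction enters the pipeline per cycle unless a bubble is inserted. The number of stall cycles is left as an input, because it depends on the hazard and branch-handling rules of the specific pipeline, which are not stated here. The function name is hypothetical.

```python
def pipeline_cycles(n_instructions, n_stages=5, stall_cycles=0):
    """Cycles until the last instruction leaves the final stage.

    The first instruction needs n_stages cycles; each later instruction
    finishes one cycle after its predecessor, plus any stall cycles
    (bubbles) inserted along the way.
    """
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1) + stall_cycles


# The program above executes 7 instructions (the branch is not taken).
print(pipeline_cycles(7))              # ideal case with no stalls
print(pipeline_cycles(7, n_stages=1))  # single-cycle machine for comparison
```

To use it for Questions 5 and 6, count the bubbles your pipeline model inserts for each data or control hazard and pass the total as `stall_cycles`.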

Question 6. Now assume a five-stage pipelined machine with forwarding. Given the following program:

         mrmovl 0(edx), ebx
         mrmovl 4(edx), eax
         xor ebx, eax
         je Label 1          # not taken
         addl ebx, edx
         rmmovl 8(edx), edx
Label 1: addl eax, edx

6.a How many cycles will it take to execute this code? ______ cycle(s)

6.b What is going on during the 6th cycle of execution?

6.c What is going on during the 11th cycle?

Question 7. Which stages are NOT idle for each of these instructions? Mark with an X the stages in which the combinational logic is performing some work.

addl ra, rb          F  D  E  M  W

rmmovl rb, 8(ra)     F  D  E  M  W

mrmovl 8(ra), rb     F  D  E  M  W

If you use a single-cycle implementation, which of the three instructions determines the length of the clock cycle? Why?