0901/159.233 Sc MAN Internal/Extramural

MASSEY UNIVERSITY MANAWATU CAMPUS

EXAMINATION FOR 159.233 COMPUTER SYSTEMS

Semester One - 2009

Time allowed: THREE (3) hours

THIS IS A CLOSED BOOK EXAMINATION

ANSWER ALL QUESTIONS

SECTION A Fourteen multi-choice questions – each worth 2 marks 28 marks

Record your answers to the questions in Section A on the Scantron Card provided.

SECTION B Three questions – each worth 24 marks 72 marks

Write your answers to the questions in Section B in the Blue Answer Booklet provided.

Total: 100 marks

Marks for each question are shown in brackets after the question, like this [8 marks]

Note that in some of the questions in this exam, the abbreviations hi and lo may have been used to signify logic high (usually 5V) and logic low (0V) signals respectively.

SECTION A

14 Multi-choice Questions – each worth 2 marks

1 Which of the following comments would you expect someone to make about CISC computers but not about RISC computers?

(a) They have instructions that operate only on registers, and they have a large number of registers.
(b) They have instructions that operate only on registers, and they have a small number of registers.
(c) They have instructions that operate on memory, and they have a small number of registers.
(d) They have instructions that operate on memory, and they have a large number of registers.
[2 marks]

2 In the MIPS computer, branches

(a) occur after every instruction.
(b) are conditional.
(c) should not be used as a way of loading the Program Counter with a new value.
(d) are generally only used in loops, because they are suitable for transferring control to an earlier instruction but not to a later one.
[2 marks]

3 Amdahl's Law

(a) states that parallel processing and serial processing must be implemented as separate processes if any speedup is to occur.
(b) states that the speedup from serial processing and the speedup from parallel processing have increased by 100% every 18 months since the mid-1960s.
(c) states that speedup from parallel processing is limited by the serial processing component.
(d) states that speedup from serial processing is limited by the parallel processing component.
[2 marks]

4 The "three-address architecture" of the MIPS computer

(a) is violated by multiplication instructions, which produce double-precision (64-bit) results, and therefore have 4 registers, not 3.
(b) operates directly on words in memory. It is thus inefficient and is only used rarely; most instructions operate on registers.
(c) is very confusing for postmen.
(d) is designed to suit an operator that operates on two operands and produces a result.
[2 marks]

5 2's complement numbers

(a) are a way of representing negative numbers and there is only one representation of zero.
(b) are a way of representing negative numbers, but there are two representations of zero (+0 and -0).
(c) can represent both negative and positive numbers, and there are two representations of zero (+0 and -0).
(d) can represent both negative and positive numbers, and there is only one representation of zero.
[2 marks]

6 The "Sticky Bit"

(a) is a bias that is added to negative exponents in the sign position to convert them to positive numbers so that it is easier to determine the relative magnitude of numbers.
(b) is used to produce the correct result in some calculations whose result requires rounding up, when the digit that causes the rounding has a very low precision that would normally be ignored.
(c) is always a 0, as a result of the normalisation process, and thus it is not necessary to store it.
(d) is always a 1, as a result of the normalisation process, and thus it is not necessary to store it.
[2 marks]

7 In a single-clock-cycle architecture

(a) no resources can be shared between different instruction types (e.g., calculating the address of memory reference instructions and the result of register operations).
(b) it is not possible to use a single ALU for producing the results of an R-type calculation (for an addition operation, say) and for determining the destination address of a branch instruction.
(c) no resources can be shared between the different operations that belong to an instruction.
(d) a single ALU can be used for R-type calculations (for an addition operation, say) and for calculating the address of the next instruction (when incrementing the Program Counter).
[2 marks]

8 In a microprogrammed controller

(a) the address of the next microinstruction is always calculated by a dedicated ALU and the values for the signals that are used to control the architecture are produced by a Boolean circuit, based on the address of the current microinstruction.
(b) the address of the next microinstruction is always calculated by a dedicated ALU and the values for the signals that are used to control the architecture are stored as bit values in micromemory.
(c) the address of the next microinstruction is often stored as a field in the current microinstruction, and the values for the signals that are used to control the architecture are produced by a Boolean circuit, based on the address of the current microinstruction.
(d) the address of the next microinstruction is often stored as a field in the current microinstruction, and the values for the signals that are used to control the architecture are stored as bit values in micromemory.
[2 marks]

9 In the diagram below, which represents the architecture of a pipelined MIPS computer

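[Diagram: pipelined MIPS datapath. Stages, left to right: IF (instruction fetch), ID (instruction decode, register read), EX (execute, address calculation), MEM (memory access), WB (write back), with blocks labelled IF/ID, ID/EX, EX/MEM and MEM/WB between them. Components labelled: PC, instruction memory, registers (read register 1, read register 2, write register, write data, read data 1, read data 2), sign extend, shift left 2, sum, ALU (zero), data memory (read address, write address, write data, read data).]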

(a) data progresses from stage to stage through the pipeline without the need for any clocking, as the stages are designed to have the same delays, and the diagram portrays a von Neumann architecture.
(b) the paler vertical rectangles from top to bottom of the diagram represent registers between the pipeline stages, and the diagram portrays a von Neumann architecture.
(c) data progresses from stage to stage through the pipeline without the need for any clocking, as the stages are designed to have the same delays, and the diagram portrays a Harvard architecture.
(d) the paler vertical rectangles from top to bottom of the diagram represent registers between the pipeline stages, and the diagram portrays a Harvard architecture.
[2 marks]

10 In the MIPS assembler sequences shown below

A                    B
LW R1,45,(R2)        LW R1,45,(R2)
DADD R5,R1,R7        DADD R5,R6,R7
DSUB R8,R6,R7        DSUB R8,R1,R7
OR R9,R6,R7          OR R9,R6,R7

(a) sequence A does not have a hazard and sequence B has a hazard.
(b) sequence A does not have a hazard and sequence B does not have a hazard.
(c) sequence A has a hazard and sequence B has a hazard.
(d) sequence A has a hazard and sequence B does not have a hazard.
[2 marks]

11 Cache is generally implemented as

(a) slow static RAM.
(b) slow dynamic RAM.
(c) fast static RAM.
(d) fast dynamic RAM.
[2 marks]

12 If X and Y are unsigned eight bit numbers, the following 8051 code will

mov A,X
clr C
subb A,Y
jz kkk
jc kkk

(a) jump to the label kkk if X does not equal C
(b) jump to the label kkk if X <= Y
(c) jump to the label kkk if X >= Y
(d) never jump to label kkk as the "clr c" instruction has cleared the carry bit.
[2 marks]

13 After the following 8051 code has executed, what value is in the accumulator?

mov 40h,#$F8
mov r0,#$40
mov a,@r0
anl a,#$1F

(a) 0
(b) $40
(c) 8
(d) $18
[2 marks]

14 Both polling and interrupts can be used for controlling I/O devices. Which of the following statements is correct?

(a) Polling systems always have slower response times than interrupt-driven systems.
(b) The stack must be initialised before polled I/O control can be used.
(c) The hardware status bits necessary for polled I/O are sometimes necessary when interrupts are used for device control.
(d) Polling systems are the most suitable when a large number of devices need to be controlled as the simplicity of the software makes a predictable and rapid response time easy to achieve.
[2 marks]

SECTION B (long answers) Three Questions – each worth 24 marks

Answer questions 15 to 17 in the Blue Answer Booklet provided.

Write the numbers of the questions you have answered from this Section on the front cover of your blue answer booklet.

Do NOT tie the Scantron Card into your Blue Answer Booklet.

15 (a) Describe (or draw a diagram showing) how the fields in an IEEE 754 single-precision (32-bit) floating point number are allocated, and explain how the sign, exponent and significand (mantissa) are combined to produce the numerical value of the number. [6 marks]

(b) Draw a diagram showing the general format of an ASM (Algorithmic State Machine) circuit, and explain briefly how this type of circuit operates. [6 marks]

(c) Draw a block diagram that shows the datapath associated with a MIPS register-type instruction, and describe, with the help of the diagram, the data transfers that occur. You are not required to work at the bit-slice level (so don’t draw the components of any ALUs, for example), and parts of the architecture that are required for other activities such as address calculation, memory access instructions, instruction fetching are not required and their inclusion may result in a reduced mark. [6 marks]

(d) A pipelined architecture without any pipeline optimisation has 8 stages and a 20ns clock. Without the pipeline, a single-cycle version of the architecture takes 170ns per instruction. It has been determined that when the pipeline is in operation, it is able to process, on average, 10 instructions before having to be flushed.

What is the speedup of the pipelined architecture over the non-pipelined architecture?

Notes: You may find that drawing a diagram will clarify the timing. The calculation is very simple; but if you wish, you may express the result as a ratio, without loss of marks. [6 marks]

16 (a) The diagram below shows how cache miss rate varies as the block size increases. There are four separate traces.

Explain why the general trend from left to right is downwards, and explain why there is an upwards tilt at the right hand end of the traces for the smallest caches.

[Figure: miss rate (vertical axis, 0% to 40%) plotted against block size in bytes (horizontal axis: 4, 16, 64, 256), with one trace per cache size. Cache-size labels on the plot: 1 KB, 8 KB, 16 KB, 64 KB, 256 KB.]

[6 marks]

(b) Write down two implications of the fact that page faults cause delays of millions of clock cycles. [6 marks]

(c) You're responsible for the design of cache for a new processor.

Specify, giving reasons, which combination of the following caching techniques would be most suitable for level 1 cache and for level 3 cache.

Placement policy: direct mapping and set-associative mapping.
Replacement update policy: FIFO and LRU.
[6 marks]

(d) Explain, using a diagram to illustrate your explanation, how a page table register, a page table and a virtual address are used in a Virtual Memory system to produce a physical address. [6 marks]

17 (a) The 8051 status register contains the following bits.

CY - AC - F0 - RS1 - RS0 - OV - F1 - P

Briefly describe the purpose of each bit and when it is set and/or reset. [4 marks]

(b) Show how the following code fragments could be translated into 8051 assembler. A literal translation is not required, just code that implements the same functionality.

(i) The variables X and Y are one byte variables …

X = X * 29 - Y; [2 marks]

(ii) Assume byte is a data type and the array M is stored in a data memory starting at location $50 …

for (byte x = 4; x <= $1F; x++) M[x] = $FF; [3 marks]

(iii) The following is a pseudo-code fragment to build a string from characters entered by the user. Assume:
- a getch() subroutine exists and returns a character in the accumulator.
- the character array myString starts at location $50 in data memory.

byte ch = getch();
byte i = 0;
repeat
    myString[i] = ch; // myString starts at Data Memory location $50
    i = i+1;
    ch = getch();
until ((ch == '.') || (ch == ';'));
[5 marks]

(c) Describe what interrupts are, why they might be useful, and what (if any) setup is necessary before they can be used. [4 marks]


(d) Write a subroutine to add two multi-byte numbers (N1 and N2). On entry to the subroutine:

r0 contains the address of the first (the least significant) byte of the first number (N1)
r1 contains the address of the first byte of the second number (N2)
r7 contains the number of bytes in each number (they're the same)

The numbers are stored so that the least significant byte is stored at the specified memory address and higher-order bytes are stored in consecutive memory locations in ascending address order.

The multi-byte result of the addition is stored in the same memory as N2.

Demonstrate the use of your subroutine by showing the instructions necessary to add two 3-byte numbers starting at $20 and $30. [6 marks]

8051 Architecture Reference

Program Memory (Code)

Address            Contents
07FFFH (8051)      Top of program memory
07FFH (89C2051)    Top of program memory
0023H              Serial Port      (interrupt vector)
001BH              Timer 1          (interrupt vector)
0013H              Ext Interrupt 1  (interrupt vector)
000BH              Timer 0          (interrupt vector)
0003H              Ext Interrupt 0  (interrupt vector)
0000H              Reset

Interrupt vectors are the locations jumped to on interrupt.

Internal Memory (Data)

Address      Contents
above 7FH    Special Function Registers: I/O registers accessed via memory locations
20H - 2FH    Directly addressable bits 0-7F
18H - 1FH    RB3*  (Bank Select Bits in PSW = 11)
10H - 17H    RB2*  (Bank Select Bits in PSW = 10)
08H - 0FH    RB1*  (Bank Select Bits in PSW = 01)
00H - 07H    RB0*  (Bank Select Bits in PSW = 00)

07H is the reset value of the Stack Pointer.
* Four banks of Registers addressable as R0-R7

Program Counter: 16 bit register restricted to 0000H -> 07FFFH

Special Function Registers (SFR) Space:

Byte address  Name  Description          Bits ("-" = NOT bit addressable)
80H           P0    Port 0               bit addressable: P0.7 -> P0.0
81H           SP    Stack Pointer        -
82H           DPL   Low byte of DPTR     -
83H           DPH   High byte of DPTR    -
87H           PCON  Power control        -
88H           TCON  Timer control        TF1-TR1-TF0-TR0-IE1-IT1-IE0-IT0
89H           TMOD  Timer mode control   -
8AH           TL0   Timer 0 low byte     -
8BH           TL1   Timer 1 low byte     -
8CH           TH0   Timer 0 high byte    -
8DH           TH1   Timer 1 high byte    -
90H           P1    Parallel port 1      Bit Addressable P1.7 -> P1.0
98H           SCON  Serial control       SM0-SM1-SM2-REN-TB8-RB8-TI -RI
99H           SBUF  Serial buffer        -
A0H           P2    Port 2               Bit addressable: P2.7-P2.0
A8H           IE    Interrupt Enable     EA - - -ES -ET1-EX1-ET0-EX0
B0H           P3    Parallel port 3      Bit addressable: P3.7 -> P3.0
B8H           IP    Interrupt priority   - - -PS -PT1-PX1-PT0-PX0
D0H           PSW   Program Status Word  CY -AC -F0 -RS1-RS0-OV -F1 -P
E0H           ACC   Accumulator          ACC.7 -> ACC.0
F0H           B     B register           B.7 -> B.0

Interrupt control register

IE:
  EA    Global bit to enable interrupts
  ES    Serial interrupt (either RI or TI)
  ETx   Clock interrupt on overflow

Timer control and mode registers - 2 timers 0 and 1

TCON:
  TF0/TF1   Timer overflow flag, timers 0/1
  TR0/TR1   Timer run control bit. Set by software to switch timer ON

TMOD: mode0-mode1. Two 4-bit nibbles: Timer 1 high order nibble, Timer 0 low order.
  mode = 0   13 bit timer
  mode = 1   16 bit timer
  mode = 2   8 bit auto-reload timer. THx -> TLx on overflow. Used by Serial I/O as bit rate (*32). 0FDH in THx gives 9600bps for 11.059MHz clock

Serial control register

SCON: SM0-SM1-SM2-REN should be set to 0111 for normal operation
  TI   set when the character has been transmitted
  RI   set when a character is received

Power control register

PCON: set to 2 will stop the processor
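The mode 2 note above (0FDH in THx gives 9600bps) can be checked with a short calculation. The Python sketch below is illustrative only and is not part of the examination: the function name is hypothetical, the crystal is assumed to be the common 11.0592 MHz part (the sheet rounds it to 11.059 MHz), and the SMOD doubling bit in PCON, which this sheet does not describe, is assumed to be 0.

```python
def mode2_baud_rate(th1, f_osc_hz=11_059_200, smod=0):
    """Serial bit rate when Timer 1 runs in mode 2 (8-bit auto-reload).

    Assumptions (not stated on the reference sheet): the timer counts once
    per machine cycle of 12 oscillator clocks, and smod is the baud-rate
    doubling bit in PCON.
    """
    if not 0 <= th1 <= 0xFF:
        raise ValueError("TH1 must be an 8-bit value")
    timer_tick_hz = f_osc_hz / 12               # one count per machine cycle
    overflow_hz = timer_tick_hz / (256 - th1)   # reload value sets the overflow rate
    return (2 ** smod) * overflow_hz / 32       # the serial port divides overflows by 32


print(mode2_baud_rate(0xFD))  # prints 9600.0
```

With the rounded 11.059 MHz figure the same call gives a value within a fraction of a bit per second of 9600, which is why the sheet quotes 9600bps.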

Addressing Modes:

Rn        Register R0 - R7 of the currently selected register bank.
direct    8-bit internal data location's address. This could be an internal Data RAM location (0-127) or a SFR.
@Ri       8-bit internal Data RAM location addressed indirectly via register R0 or R1
#data     8-bit constant included in instruction.
#data16   16-bit constant included in instruction.
addr11    11-bit destination address. Used by ACALL and AJMP. The branch will be within the same 2K byte page of Program Memory as the first byte of the following instruction.
addr16    16-bit destination address. Used by LCALL and LJMP. A branch can be anywhere within the 2K byte Program Memory address space.
rel       Signed (two's complement) 8-bit offset byte. Used by SJMP and all conditional jumps. Range is -128 to +127 bytes relative to first byte of the next instruction.
bit       Direct addressed bit in internal Data RAM or SFR.

Arithmetic operations:

Instruction         Byte  Cycle  C  OV  AC  Description
ADD A,Rn            1     1      X  X   X   Add register to Accumulator
ADD A,direct        2     1      X  X   X   Add direct byte to Accumulator
ADD A,@Ri           1     1      X  X   X   Add indirect RAM to Accumulator
ADD A,#data         2     1      X  X   X   Add immediate data to Accumulator
ADDC A,Rn           1     1      X  X   X   Add register to Acc. with Carry
ADDC A,direct       2     1      X  X   X   Add direct byte to Acc. with Carry
ADDC A,@Ri          1     1      X  X   X   Add indirect RAM to Acc. with Carry
ADDC A,#data        2     1      X  X   X   Add immediate data to Acc. / Carry
SUBB A,Rn           1     1      X  X   X   Subtract reg. from Acc. with borrow
SUBB A,direct       2     1      X  X   X   Sub. direct byte from Acc. / borrow
SUBB A,@Ri          1     1      X  X   X   Sub. indirect RAM from Acc./ borrow
SUBB A,#data        2     1      X  X   X   Sub. imm. data from Acc. / borrow
INC A               1     1                 Increment Accumulator
INC Rn              1     1                 Increment register
INC direct          2     1                 Increment direct byte
INC @Ri             1     1                 Increment indirect RAM
DEC A               1     1                 Decrement Accumulator
DEC Rn              1     1                 Decrement register
DEC direct          2     1                 Decrement direct byte
DEC @Ri             1     1                 Decrement indirect RAM
INC DPTR            1     2                 Increment Data Pointer
MUL AB              1     4      0  X       Multiply A and B
DIV AB              1     4      0  X       Divide A by B
DA A                1     1      X          Decimal adjust Accumulator

Logical operations:

Instruction         Byte  Cycle  C  OV  AC  Description
ANL A,Rn            1     1                 AND register to Accumulator
ANL A,direct        2     1                 AND direct byte to Accumulator
ANL A,@Ri           1     1                 AND indirect RAM to Accumulator
ANL A,#data         2     1                 AND immediate data to Accumulator
ANL direct,A        2     1                 AND Accumulator to direct byte
ANL direct,#data    3     2                 AND immediate data to direct byte
ORL A,Rn            1     1                 OR register to Accumulator
ORL A,direct        2     1                 OR direct byte to Accumulator
ORL A,@Ri           1     1                 OR indirect RAM to Accumulator
ORL A,#data         2     1                 OR immediate data to Accumulator
ORL direct,A        2     1                 OR Accumulator to direct byte
ORL direct,#data    3     2                 OR immediate data to direct byte
XRL A,Rn            1     1                 Exc-OR register to Accumulator
XRL A,direct        2     2                 Exc-OR direct byte to Accumulator
XRL A,@Ri           1     1                 Exc-OR indirect RAM to Accumulator
XRL A,#data         2     1                 Exc-OR immediate data to Acc.
XRL direct,A        2     1                 Exc-OR Accumulator to direct byte
XRL direct,#data    3     2                 Exc-OR imm. data to direct byte
CLR A               1     1                 Clear Accumulator
CPL A               1     1                 Complement Accumulator
RL A                1     1                 Rotate Accumulator left
RLC A               1     1      X          Rotate Acc. left through Carry
RR A                1     1                 Rotate Accumulator right
RRC A               1     1      X          Rotate Acc. right through Carry
SWAP A              1     1                 Swap nibbles within the Accumulator

Data transfer:

Instruction         Byte  Cycle  C  OV  AC  Description
MOV A,Rn            1     1                 Move register to Accumulator
MOV A,direct        2     1                 Move direct byte to Accumulator
MOV A,@Ri           1     1                 Move indirect RAM to Accumulator
MOV A,#data         2     1                 Move immediate data to Accumulator
MOV Rn,A            1     1                 Move Accumulator to register
MOV Rn,direct       2     2                 Move direct byte to register
MOV Rn,#data        2     1                 Move immediate data to register
MOV direct,A        2     1                 Move Accumulator to direct byte
MOV direct,Rn       2     2                 Move register to direct byte
MOV direct,direct   3     2                 Move direct byte to direct byte
MOV direct,@Ri      2     2                 Move indirect RAM to direct byte
MOV direct,#data    3     2                 Move immediate data to direct byte
MOV @Ri,A           1     1                 Move Accumulator to indirect RAM
MOV @Ri,direct      2     2                 Move direct byte to indirect RAM
MOV @Ri,#data       2     1                 Move immediate data to indirect RAM
MOV DPTR,#data16    3     2                 Load Data Pointer with 16-bit const
MOVC A,@A+DPTR      1     2                 Move Code byte rel. to DPTR to Acc.
MOVC A,@A+PC        1     2                 Move Code byte rel. to PC to Acc.
PUSH direct         2     2                 Push direct byte onto stack
POP direct          2     2                 Pop direct byte from stack
XCH A,Rn            1     1                 Exchange register with Accumulator
XCH A,direct        2     1                 Exchange direct byte with Acc.
XCH A,@Ri           1     1                 Exchange indirect RAM with Acc.
XCHD A,@Ri          1     1                 Exchange low order digit indirect RAM with Accumulator

Number and String Formats:

Numbers:
  Decimal       34
  Binary        01110101B
  Hexadecimal   a leading $ or a trailing h or H, e.g. $7F, 7Fh, 0FFH, $FF, 0A8H
  Note: if not preceded by $, hex constants must start with 0-9, e.g. 0C7h

Characters: 'A' - 'Abc' - 'A',00DH,00AH (mixed mode), "T"
Strings: Only with DB directive for putting strings into CODE memory: 'abc' or "abc"

Operators: () + - / * MOD SHR SHL NOT AND OR XOR

Boolean variable manipulation:

Instruction         Byte  Cycle  C  OV  AC  Description
CLR C               1     1      0          Clear Carry
CLR bit             2     1                 Clear direct bit
SETB C              1     1      1          Set Carry
SETB bit            2     1                 Set direct bit
CPL C               1     1      X          Complement Carry
CPL bit             2     1                 Complement direct bit
ANL C,bit           2     2      X          AND direct bit to Carry
ANL C,/bit          2     2      X          AND complement of dir. bit to Carry
ORL C,bit           2     2      X          OR direct bit to Carry
ORL C,/bit          2     2      X          OR complement of dir. bit to Carry
MOV C,bit           2     1      X          Move direct bit to Carry
MOV bit,C           2     2                 Move Carry to direct bit
JC rel              2     2                 Jump if Carry is set
JNC rel             2     2                 Jump if Carry not set
JB bit,rel          3     2                 Jump if direct bit is set
JNB bit,rel         3     2                 Jump if direct bit is not set
JBC bit,rel         3     2                 Jump if dir. bit is set & clear bit

Program Branching:

Instruction         Byte  Cycle  C  OV  AC  Description
ACALL addr11        2     2                 Absolute subroutine call
LCALL addr16        3     2                 Long subroutine call
RET                 1     2                 Return from subroutine
RETI                1     2                 Return from interrupt
AJMP addr11         2     2                 Absolute jump
LJMP addr16         3     2                 Long jump
SJMP rel            2     2                 Short jump (relative address)
JMP @A+DPTR         1     2                 Jump indirect relative to the DPTR
JZ rel              2     2                 Jump if Accumulator is zero
JNZ rel             2     2                 Jump if Accumulator is not zero
CJNE A,direct,rel   3     2      X          Compare direct byte to Accumulator and jump if not equal
CJNE A,#data,rel    3     2      X          Compare immediate data to Accumulator and jump if not equal
CJNE Rn,#data,rel   3     2      X          Compare immediate data to register and jump if not equal
CJNE @Ri,#data,rel  3     2      X          Compare immediate data to indirect RAM and jump if not equal
DJNZ Rn,rel         2     2                 Decr. register and jump if not zero
DJNZ direct,rel     3     2                 Decrement direct byte and jump if not zero
NOP                 1     1                 No operation

Assembler Directives and Controls

;        Everything after a semicolon (;) on the same line is a comment
Label:   Must start in column 1. Defines a new Label; the colon is optional.

Controlling Memory Spaces and Code location

ORG 56H     Specify a value for the current segment's location counter.
USE IRAM    Makes the data space the currently selected segment
USE ROM     Makes the code space the currently selected segment

Defining Byte and Bit values

TEN EQU 10       EQUates 10 to symbol TEN, like #define in C, CONST in Delphi
ON_FLAG BIT 6    Assigns BIT 6 (in data or SFR space) to the symbol ON_FLAG

Allocating Memory

SP_BUFFER: RMB 6    Reserves Memory Byte: reserves 6 bytes of storage in current memory space (affected by most recent USE IRAM/ROM).
Message: DB 'Hi'    Define Byte(s): Store byte constants in code space.

The following are all equivalent: the string hello followed by a newline and a null.

newline EQU 13
DB "H","E","L","L","O",13,0
DB "Hello",13,0
DB "Hello",newline,0