Computer Architecture

Faculty Of Computers And Information Technology

Second Term 2019- 2020

Dr.Khaled Kh. Sharaf

CHAPTER 6:

MACHINE INSTRUCTIONS AND PROGRAMS Computer Architecture

Objectives

• Machine instructions and program execution, including branching and subroutine call and return operations. • Number representation and addition/subtraction in the 2’s- complement system. • Addressing methods for accessing register and memory operands. • Assembly language for representing machine instructions, data, and programs. • Program-controlled Input/Output operations. Computer Architecture

NUMBER, ARITHMETIC OPERATIONS, AND CHARACTERS Computer Architecture

Signed Integer

• 3 major representations: Sign and magnitude One’s complement Two’s complement

• Assumptions: 4-bit machine word 16 different values can be represented Roughly half are positive, half are negative

Computer Architecture

Sign and Magnitude Representation

-7 +0 -6 1111 0000 +1 1110 0001 -5 +2 + 1101 0010

-4 1100 0011 +3 0 100 = + 4

-3 1011 0100 +4 1 100 = - 4 1010 0101 -2 +5 - 1001 0110 -1 1000 0111 +6 -0 +7 High order bit is sign: 0 = positive (or zero), 1 = negative Three low order bits is the magnitude: 0 (000) thru 7 (111) Number range for n bits = +/-2n-1 -1 Two representations for 0 Computer Architecture

One’s Complement Representation

-0 +0 -1 1111 0000 +1 1110 0001 -2 +2 + 1101 0010

-3 1100 0011 +3 0 100 = + 4

-4 1011 0100 +4 1 011 = - 4 1010 0101 -5 +5 - 1001 0110 -6 1000 0111 +6 -7 +7 • Subtraction implemented by addition & 1's complement • Still two representations of 0! This causes some problems • Some complexities in addition Computer Architecture

Two’s Complement Representation

-1 +0 -2 1111 0000 +1 1110 0001 -3 +2 + like 1's comp 1101 0010 except shifted -4 1100 0011 +3 0 100 = + 4 one position clockwise -5 1011 0100 +4 1 100 = - 4 1010 0101 -6 +5 - 1001 0110 -7 1000 0111 +6 -8 +7

• Only one representation for 0 • One more negative number than positive number Computer Architecture

Binary, Signed-Integer Representations

B Values represented

Sign and b 3 b 2 b 1 b 0 magnitude 1' s complement 2' s complement

0 1 1 1 + 7 + 7 + 7 0 1 1 0 + 6 + 6 + 6 0 1 0 1 + 5 + 5 + 5 0 1 0 0 + 4 + 4 + 4 0 0 1 1 + 3 + 3 + 3 0 0 1 0 + 2 + 2 + 2 0 0 0 1 + 1 + 1 + 1 0 0 0 0 + 0 + 0 + 0 1 0 0 0 - 0 - 7 - 8 1 0 0 1 - 1 - 6 - 7 1 0 1 0 - 2 - 5 - 6 1 0 1 1 - 3 - 4 - 5 1 1 0 0 - 4 - 3 - 4 1 1 0 1 - 5 - 2 - 3 1 1 1 0 - 6 - 1 - 2 1 1 1 1 - 7 - 0 - 1

Figure 2.1. Binary, signed-integer representations. Computer Architecture

MEMORY LOCATIONS, ADDRESSES, AND OPERATIONS Computer Architecture

Memory Location, Addresses, and Operation n bits

• Memory consists of many first word millions of storage cells, second word each of which can store 1 bit. • • • • Data is usually accessed in n-bit groups. n is called word length. i th word

• • •

last word

Memory words. Computer Architecture

Memory Location, Addresses, and Operation

• 32-bit word length example

32 bits

• • • b31 b30 b1 b0

Sign bit: b31 = 0 for positive numbers b31 = 1 for negative numbers

(a) A signed integer

8 bits 8 bits 8 bits 8 bits

ASCII ASCII ASCII ASCII character character character character

(b) Four characters Computer Architecture

Memory Location, Addresses, and Operation

• To retrieve information from memory, either for one word or one byte (8-bit), addresses for each location are needed. • A k-bit address memory has 2k memory locations, namely 0 – 2k-1, called memory space. • 24-bit memory: 224 = 16,777,216 = 16M (1M=220) • 32-bit memory: 232 = 4G (1G=230) • 1K(kilo)=210 • 1T(tera)=240 Computer Architecture

Memory Location, Addresses, and Operation

• It is impractical to assign distinct addresses to individual bit locations in the memory. • The most practical assignment is to have successive addresses refer to successive byte locations in the memory – byte-addressable memory. • Byte locations have addresses 0, 1, 2, … If word length is 32 bits, they successive words are located at addresses 0, 4, 8,… Computer Architecture Memory Location, Addresses, and Operation • Address ordering of bytes • Word alignment • Words are said to be aligned in memory if they begin at a byte addr. that is a multiple of the num of bytes in a word. • 16-bit word: word addresses: 0, 2, 4,…. • 32-bit word: word addresses: 0, 4, 8,…. • 64-bit word: word addresses: 0, 8,16,…. • Access numbers, characters, and character strings Computer Architecture

Memory Operation

• Load (or Read or Fetch) Copy the content. The memory content doesn’t change. Address – Load Registers can be used • Store (or Write) Overwrite the content in memory Address and Data – Store Registers can be used Computer Architecture

ADDRESSING MODES Computer Architecture

Generating Memory Addresses

• How to specify the address of branch target? • Can we give the memory operand address directly in a single Add instruction in the loop? • Use a register to hold the address of NUM1; then increment by 4 on each pass through the loop. Computer Architecture Computer Architecture Addressing Modes

Opcode Mode ... • Implied • AC is implied in “ADD M[AR]” in “One-Address” instr. • TOS is implied in “ADD” in “Zero-Address” instr. • Immediate • The use of a constant in “MOV R1, 5”, i.e. R1 ← 5 • Register • Indicate which register holds the operand Computer Architecture

Addressing Modes • Register Indirect • Indicate the register that holds the number of the register that holds the operand MOV R1, (R2) R1 • Autoincrement / Autodecrement R2 = 3 • Access & update in 1 instr.

• Direct Address R3 = 5 • Use the given address to access a memory location Computer Architecture Addressing Modes

• Indirect Address • Indicate the memory location that holds the address of the memory location that holds the data

Memory AR = 101

100 101 0 1 0 4 102 103 104 1 1 0 A Computer Architecture Addressing Modes

Memory • Relative Address • EA = PC + Relative Addr 0 1 PC = 2 2

+

100 AR = 100 101 102 1 1 0 A Could be Positive or 103 Negative 104 (2’s Complement) Computer Architecture Addressing Modes

• Indexed Memory • EA = Index Register + Relative Addr

Useful with XR = 2 “Autoincrement” or “Autodecrement” +

100 AR = 100 101 Could be Positive or Negative 102 1 1 0 A (2’s Complement) 103 104 Computer Architecture Addressing Modes

• Base Register Memory • EA = Base Register + Relative Addr

Could be Positive or AR = 2 Negative (2’s Complement) +

100 0 0 0 5 BR = 100 101 0 0 1 2 102 0 0 0 A Usually points to 103 0 1 0 7 the beginning of 104 0 0 5 9 an array Computer Architecture

Addressing Modes

Name Assembler syntax Addressing function • The different ways in which Immediate #Value Op erand = Value the location of an operand is Register R i EA = Ri specified in an instruction are Absolute (Direct) LOC EA = LOC referred to as Indirect (R i ) EA = [R i ] addressing (LOC) EA = [LOC] modes. Index X(R i ) EA = [R i ] + X

Base with index (R i ,R j ) EA = [R i ] + [R j ] Base with index X(R i ,R j ) EA = [R i ] + [R j ] + X and offset

Relative X(PC) EA = [PC] + X

Autoincrement (R i )+ EA = [R i ] ; Increment R i

Autodecrement  (R i ) Decrement R i ; EA = [R i ] Computer Architecture

Indexing and Arrays

• Index mode – the effective address of the operand is generated by adding a constant value to the contents of a register. • Index register

• X(Ri): EA = X + [Ri] • The constant X may be given either as an explicit number or as a symbolic name representing a numerical value. • If X is shorter than a word, sign-extension is needed. Computer Architecture

Indexing and Arrays

• In general, the Index mode facilitates access to an operand whose location is defined relative to a reference point within the data structure in which the operand appears. • Several variations:

(Ri, Rj): EA = [Ri] + [Rj] X(Ri, Rj): EA = X + [Ri] + [Rj] Computer Architecture

Relative Addressing

• Relative mode – the effective address is determined by the Index mode using the program in place of the general-purpose register. • X(PC) – note that X is a signed number • Branch>0 LOOP • This location is computed by specifying it as an offset from the current value of PC. • Branch target may be either before or after the branch instruction, the offset is given as a singed num. Computer Architecture

Additional Modes

• Autoincrement mode – the effective address of the operand is the contents of a register specified in the instruction. After accessing the operand, the contents of this register are automatically incremented to point to the next item in a list.

• (Ri)+. The increment is 1 for byte-sized operands, 2 for 16-bit operands, and 4 for 32-bit operands.

• Autodecrement mode: -(Ri) – decrement first Move N,R1 Move #NUM1,R2 Initialization Clear R0 LOOP Add (R2)+,R0 Decrement R1 Branch>0 LOOP Move R0,SUM

Figure 2.16. The Autoincrement used in the program of Figure 2.12. Computer Architecture

ASSEMBLY LANGUAGE Computer Architecture Levels of Programming Languages

1) Machine Language • Consists of individual instructions that will be executed by the CPU one at a time 2) Assembly Language (Low Level Language) • Designed for a specific family of processors (different groups/family has different Assembly Language) • Consists of symbolic instructions directly related to machine language instructions one-for-one and are assembled into machine language. 3) High Level Languages • e.g. : C, C++ and Vbasic • Designed to eliminate the technicalities of a particular computer. • Statements compiled in a high level language typically generate many low-level instructions. Computer Architecture

Advantages of Assembly Language

1. Shows how program interfaces with the processor, operating system, and BIOS. 2. Shows how data is represented and stored in memory and on external devices. 3. Clarifies how processor accesses and executes instructions and how instructions access and data. 4. Clarifies how a program accesses external devices. Computer Architecture

Reasons for using Assembly Language

1. A program written in Assembly Language requires considerably less memory and execution time than one written in a high –level language. 2. Assembly Language gives a programmer the ability to perform highly technical tasks that would be difficult, if not impossible in a high-level language. 3. Although most software specialists develop new applications in high-level languages, which are easier to write and maintain, a common practice is to recode in assembly language those sections that are time-critical. 4. Resident programs (that reside in memory while other program execute) and interrupt service routines (that handle input and output) are almost always develop in Assembly Language. Computer Architecture

The Computer Organization - INTEL PC

In this course, only INTEL assembly language will be learnt. Below is a brief description of the development of a few INTEL model. (i) 8088 • Has 16-bit registers and 8-bit data • Able to address up to 1 MB of internal memory • Although registers can store up to 16-bits at a time but the data bus is only able to transfer 8 bit data at one time (ii) 8086 • Is similar to 8088 but has a 16-bit data bus and runs faster. Computer Architecture

(iii) 80286 • Runs faster than 8086 and 8088 • Can address up to 16 MB of internal memory • multitasking => more than 1 task can be ran simultaneously (iv) 80386 • has 32-bit registers and 32-bit data bus • can address up to 4 billion bytes. of memory • support “virtual mode”, whereby it can swap portions of memory onto disk: in this way, programs running concurrently have space to operate. (v) 80486 • has 32-bit registers and 32-bit data bus • the presence of Computer Architecture

(vi) Pentium • has 32-bit registers, 64-bit data bus • has separate caches for data and instruction • the processor can decode and execute more than one • instruction in one clock cycle (pipeline)

(vii) Pentium II & III • has different paths to the cache and main memory Computer Architecture In performing its task, the processor (CPU) is partitioned into two logical units: 1) An (EU) 2) A Bus Interface Unit (BIU)

EU • EU is responsible for program execution • Contains of an (ALU), a (CU) and a number of registers BIU • Delivers data and instructions to the EU. • manage the bus control unit, segment registers and instruction queue. • The BIU controls the buses that transfer the data to the EU, to memory and to external input/output devices, whereas the segment registers control memory addressing. Computer Architecture

EU and BIU work in parallel, with the BIU keeping one step ahead. The EU will notify the BIU when it needs to data in memory or an I/O device or obtain instruction from the BIU instruction queue.

When EU executes an instruction, BIU will fetch the next instruction from the memory and insert it into to instruction queue.

Computer Architecture

EU : Execution Unit BIU : Bus Interface Unit

AX AH AL BX BH BL CX CH CL Program Control DX DH DL SP CS BP DS SI SS DI ES

Bus Bus Control Unit ALU 1 CU 2 Flag register 3 Instruction 4 Queue

Instruction Pointer n () Computer Architecture

Addressing Data in Memory

• Intel Personal Computer (PC) addresses its memory according to bytes. (Every byte has a unique address beginning with 0) • Depending to the model of a PC, CPU can access 1 or more bytes at a time • Processor (CPU) keeps data in memory in reverse byte sequence (reverse-byte sequence: low order byte in the low memory address and high-order byte in the high memory address) Computer Architecture

Example : consider value 052916 (0529H)

2 bytes  05 and 29

register 05 29

memory 29 05

Address 04A2616 Address 04A2716 (low-order/least significant byte) (high-order/most significant byte) • When the processor takes data (a word or 2 bytes), it will re-reverse the byte to its actual order 052916 Computer Architecture Segment And Addressing

• Segments are special areas in the memory that is defined in a program, containing the code, data, and stack. • The segment position in the memory is not fixed and can be determined by the programmer • 3 main segments for the programming process:

(i) Code Segment (CS) • Contains the machine instructions that are to execute. • Typically, the first executable instruction is at the start of this segment, and the operating system links to that location to begin program execution. • CS register will hold the beginning address of this segment Computer Architecture

(ii) Data Segment (DS) • Contains program’s defined data, constants and works areas. • DS register is used to store the starting address of the DS

(iii) Stack Segmen (SS) • Contains any data or address that the program needs to save temporarily or for used by your own “called”subroutines. • SS register is used to hold the starting address of this segment Computer Architecture

Contains the beginning address of each segment

SS Register Address Stack segment DS Register Address Data segment CS Register Address Code segment

Segment register (in CPU)

memory (MM)

Computer Architecture

Segment Offsets

• Within a program, all memory locations within a segment are relative to the segment’s starting address. • The distance in bytes from the segment address to another location within the segment is expressed as an offset (or displacement). • Thus the first byte of the code segment is at offset 00, the second byte is at offset 01 and so forth. • To reference any memory location in a segment (the actual address), the processor combines the segment address in a segment register with the offset value of that location.  actual address = segment address + offset Computer Architecture

Eg: A starting address of data segment is 038E0H, so the value in DS register is 038E0H. An instruction references a location with an offset of 0032H bytes from the start of the data segment.

 the actual address = DS segment address + offset = 038E0H + 0032H = 03912H Computer Architecture Registers

• Registers are used to control instructions being executed, to handle addressing of memory, and to provide arithmetic capability • Registers of Intel Processors can be categorized into: 1. Segment register 2. Pointer register 3. General purpose register 4. Index register 5. Flag register Computer Architecture i) Segment register

There are 6 segment registers :

(a) CS register • Contains the starting address of program’s code segment. • The content of the CS register is added with the content in the Instruction Pointer (IP) register to obtain the address of the instruction that is to be fetched for execution. (Note: common name for IP is PC (Program Counter))

(b) DS register • Contains the starting address of a program’s data segment. • The address in DS register will be added with the value in the address field (in instruction format) to obtain the real address of the data in data segment. Computer Architecture

(c) SS Register • Contains the starting address of the stack segment. • The content in this register will be added with the content in the Stack Pointer (SP) register to obtain the required word. (d) ES (Extra Segment) Register • Used by some string (character data) operations to handle memory addressing • ES register is associated with the Data Index (DI) register.

(e) FS and GS Registers • Additional extra segment registers introduced in 80386 for handling storage requirement. Computer Architecture

(ii) Pointer Registers

• There are 3 pointer registers in an Intel PC :

(a) Instruction Pointer register • The 16-bit IP register contains the offset address or displacement for the next instruction that will be executed by the CPU • The value in the IP register will be added into the value in the CS register to obtain the real address of an instruction Computer Architecture

Example : The content in CS register = 39B40H The content in IP register = 514H

next instruction address: 39B40H + 514H . 3A054H • Intel 80386 introduced 32-bit IP, known as EIP (Extended IP) Computer Architecture

(b) Stack Pointer Register (Stack Pointer (SP)) • The 16-bit SP register stores the displacement value that will be combined with the value in the SS register to obtain the required word in the stack • Intel 80386 introduced 32-bit SP, known as ESP (Extended SP) Example: Value in register SS = 4BB30H Value in register SP = + 412H 4BF42H

(c) Base Pointer Register • The 16-bit BP register facilitates referencing parameters, which are data and addresses that a program passes via a stack • The processor combines the address in SS with the offset in BP Computer Architecture (iii) General Purpose Registers There are 4 general-purpose registers, AX, BX, CX, DX:

(a) AX register • Acts as the accumulator and is used in operations that involve input/output and arithmetic • The diagram below shows the AX register with the number of bits. 32 bits

8 bit 8 bit

AH AL

AX

EAX EAX : 32 bit AX : 16 bit (rightmost 16-bit portion of EAX) AH : 8 bit => leftmost 8 bits of AX (high portion) AL : 8 bit => rightmost 8 bit of AX (low portion) Computer Architecture

(b) BX Register o Known as the base register since it is the only this general purpose register that can be used as an index to extend addressing. o This register also can be used for computations o BX can also be combined with DI and SI register as a base registers for special addressing like AX, BX is also consists of EBX, BH and BL 32 bits

8 bit 8 bit

BH BL

BX

EBX Computer Architecture

(c) CX Register • known as count register • may contain a value to control the number of times a loops is repeated or a value to shift bits left or right • CX can also be used for many computations • Number of bits and fractions of the register is like below : 32 bits

8 bit 8 bit

CH CL

CX

ECX Computer Architecture

(d) DX Register • Known as data register • Some I/O operations require its use • Multiply and divide operations that involve large values assume the use of DX and AX together as a pair to hold the data or result of operation. • Number of bits and the fractions of the register is as below : 32 bits

8 bit 8 bit

DH DL

DX

EDX Computer Architecture

(iv) Index Register There are 2 index registers, SI and DI

(a) SI Register o Needed in operations that involve string (character) and is always usually associated with the DS register o SI : 16 bit o ESI : 32 bit (80286 and above)

(b) DI Register o Also used in operations that involve string (character) and it is associated with the ES register o DI : 16 bit o EDI : 32 bit (80386 and above)

Computer Architecture

(v) FLAG Register o Flags register contains bits that show the status of some activities o Instructions that involve comparison and

arithmetic will change the flag status where some instruction will refer to the value of a specific bit in the flag for next subsequent action

O D I T S Z A P C

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 - 9 of its 16 bits indicate the current status of the computer and the results of processing - the above diagram shows the stated 9 bits Computer Architecture

O D I T S Z A P C

15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0

OF (overflow): indicate overflow of a high-order (leftmost) bit following arithmetic DF (direction): Determines left or right direction for moving or comparing string (character) data IF (interrupt): indicates that all external interrupts such as keyboard entry are to be processed or ignored TF (trap): permits operation of the processor in single-step mode. Usually used in “debugging” process SF (sign): contains the resulting sign of an arithmetic operation (0 = +ve, 1 = -ve) ZF (zero): indicates the result of an arithmetic or comparison operation (0 = non zero; 1 = zero result) AF (auxillary carry): contains a carry out of bit 3 into bit 4 in an arithmetic operation, for specialized arithmetic PF (parity): indicates the number of 1-bits that result from an operation. An even number of bits causes so-called even parity and an odd number causes odd parity CF (parity): contains carries from a high-order (leftmost) bit following an arithmetic operation; also, contains the content of the last bit of a shift or rotate operation. Computer Architecture

Instruction Format

• The operation of CPU is determined by the instructions it executes (machine or computer instructions) • CPU’s instruction set – the collection of different instructions that CPU can execute • Each instruction must contain the information required by CPU for execution :- 1. Operation code (opcode) -- specifies the operation to be performed (eg: ADD, I/O) do this 2. Source operand reference -- the operation may involve one or more source operands (input for the operation)  to this 3. Result operand reference -- the operation may produce a result  put the answer here 4. Next instruction reference -- to tell the CPU where to fetch the next instruction after the execution of this instruction is complete  do this when you have done that

opcode

2 3 4 1 Computer Architecture

• Operands (source & result) can be in one of the 3 areas:- • Main or • CPU register • I/O device • It is not efficient to put all the information required by CPU in a machine instruction • Each instruction is represented by sequence of bits & is divided into 2 fields; opcode & address

opcode address

• Processing become faster if all information required by CPU in one instruction or one instruction format

opcode address for Operand 1 address for Operand 2 address for Result address for Next instruction

• Problems  instruction become long (takes a few words in main memory to store 1 instruction) • Solution  provide a few instruction formats (format instruction); 1, 2, 3 and addressing mode. • Instruction format with 2 address is always used; INTEL processors Computer Architecture

• Opcodes are represented by abbreviations, called mnemonics, that indicate the operation. Common examples: ADD Add SUB Subtract DIV Divide LOAD Load data from memory STOR Store data to memory Computer Architecture

Instruction-3-address opcode address for Result address for Operand 1 address for Operand 2

• Example : SUB Y A B Y = A - B • Result  Y • Operand 1  A • Operand 2  B • Operation = subtracts • Address for Next instruction?  program counter (PC) Instruction-2-address opcode address for Operand 1 & Result address for Operand 2 • Example : SUB Y B Y = Y - B – Operand 1  Y – Operand 2  B – Result  replace to operand 1 – Address for Next instruction?  program counter (PC) Computer Architecture Instruction-1-address

opcode address for Operand 2

• Example : LOAD A ADD B AC = AC + B or SUB B AC = AC - B – Operand 1 & Result  in AC (accumulator), a register – SUB B  B subtracts from AC, the result stored in AC – Address for Next instruction?  program counter (PC) • Short instruction (requires less bit) but need more instructions for arithmetic problem Computer Architecture

Y = (A-B) / (C+D x E)

Instruction Comment • SUB Y, A, B • Y A - B • MPY T, D, E • T D x E • ADD T, T, C • T T + C • DIV Y, Y, T • Y Y / T

Three-address instructions Computer Architecture Y = (A-B) / (C+D x E)

• MOVE Y, A • SUB Y, B • MOVE T, D • MPY T, E • ADD T, C • DIV Y, T

Two-address instructions Computer Architecture Y = (A-B) / (C+D x E) INSTRUCTIONS Comment • LOAD D • AC D • MPY E • AC AC x E • ADD C • AC AC + C • STOR Y • LOAD A • Y AC • SUB B • AC A • DIV Y • AC AC – B • STOR Y • AC AC / Y

• Y AC One-address Instructions Computer Architecture

Utilization of Instruction Addresses

** Zero-address instructions are applicable to a special memory organisation, called a stack. Computer Architecture Types of Instructions

• Data Transfer Instructions Name Mnemonic Data value is Load LD not modified Store ST Move MOV Exchange XCH Input IN Output OUT Push PUSH Pop POP Computer Architecture Data Transfer Instructions

Mode Assembly Register Transfer Direct address LD ADR AC ← M[ADR] Indirect address LD @ADR AC ← M[M[ADR]] Relative address LD $ADR AC ← M[PC+ADR] Immediate operand LD #NBR AC ← NBR Index addressing LD ADR(X) AC ← M[ADR+XR] Register LD R1 AC ← R1 Register indirect LD (R1) AC ← M[R1] Autoincrement LD (R1)+ AC ← M[R1], R1 ← R1+1 Computer Architecture

Data Manipulation 1. Name Mnemonic Increment INC Instructions Decrement DEC Add ADD 1. Arithmetic Subtract SUB 2. Logical & Bit Manipulation Multiply MUL Divide DIV 3. Shift Add with carry ADDC Subtract with borrow SUBB 2. Name Mnemonic Negate NEG Clear CLR Complement COM 3. Name Mnemonic AND AND Logical shift right SHR OR OR Logical shift left SHL Exclusive-OR XOR Arithmetic shift right SHRA Clear carry CLRC Arithmetic shift left SHLA Set carry SETC Rotate right ROR Complement carry COMC Rotate left ROL Enable interrupt EI Rotate right through carry RORC Disable interrupt DI Rotate left through carry ROLC Computer Architecture Program Control Instructions

Name Mnemonic Branch BR Jump JMP Skip SKP Subtract A – B but Call CALL don’t store the result Return RET Compare CMP (Subtract) 1 0 1 1 0 0 0 1 Test (AND) TST 0 0 0 0 1 0 0 0

Mask 0 0 0 0 0 0 0 0 Computer Architecture

Conditional Branch Instructions

Mnemonic Branch Condition Tested Condition BZ Branch if zero Z = 1 BNZ Branch if not zero Z = 0 BC Branch if carry C = 1 BNC Branch if no carry C = 0 BP Branch if plus S = 0 BM Branch if minus S = 1 BV Branch if overflow V = 1 BNV Branch if no overflow V = 0 Computer Architecture

BASIC INPUT/OUTPUT OPERATIONS Computer Architecture

I/O

• The data on which the instructions operate are not necessarily already stored in memory. • Data need to be transferred between processor and outside world (disk, keyboard, etc.) • I/O operations are essential, the way they are performed can have a significant effect on the performance of the computer. Computer Architecture

Program-Controlled I/O Example

• Read in character input from a keyboard and produce character output on a display screen. Rate of data transfer (keyboard, display, processor) Difference in speed between processor and I/O device creates the need for mechanisms to synchronize the transfer of data. A solution: on output, the processor sends the first character and then waits for a signal from the display that the character has been received. It then sends the second character. Input is sent from the keyboard in a similar way. Computer Architecture

Bus

Processor DATAIN DATAOUT

SIN SOUT

Key board Display

Figure 2.19 Bus connection for processor, keyboard, and display. Program-Controlled I/O Example

- Registers - Flags - Device interface Computer Architecture

Program-Controlled I/O Example

• Machine instructions that can check the state of the status flags and transfer data: READWAIT Branch to READWAIT if SIN = 0 Input from DATAIN to R1

WRITEWAIT Branch to WRITEWAIT if SOUT = 0 Output from R1 to DATAOUT Computer Architecture

Program-Controlled I/O Example

• Memory-Mapped I/O – some memory address values are used to refer to peripheral device buffer registers. No special instructions are needed. Also use device status registers.

READWAIT Testbit #3, INSTATUS Branch=0 READWAIT MoveByte DATAIN, R1 Computer Architecture

Program-Controlled I/O Example

• Assumption – the initial state of SIN is 0 and the initial state of SOUT is 1. • Any drawback of this mechanism in terms of efficiency? • Two wait loopsprocessor execution time is wasted • Alternate solution? • Interrupt Computer Architecture

STACKS Computer Architecture Stack Organization

Current Top of Stack • LIFO TOS 0 Last In First Out 1 2 3 4 5 SP 6 0 1 2 3 7 0 0 5 5 FULL EMPTY 8 0 0 0 8 9 0 0 2 5 Stack Bottom 10 0 0 1 5 Stack Computer Architecture Stack Organization

Current 1 6 9 0 Top of Stack • PUSH TOS 0 SP ← SP – 1 1 M[SP] ← DR 2 If (SP = 0) then (FULL ← 1) 3 EMPTY ← 0 4 5 1 6 9 0 SP 6 0 1 2 3 7 0 0 5 5 FULL EMPTY 8 0 0 0 8 9 0 0 2 5 Stack Bottom 10 0 0 1 5 Stack Computer Architecture Stack Organization

Current Top of Stack • POP TOS 0 DR ← M[SP] 1 SP ← SP + 1 2 If (SP = 11) then (EMPTY ← 1) 3 FULL ← 0 4 5 1 6 9 0 SP 6 0 1 2 3 7 0 0 5 5 FULL EMPTY 8 0 0 0 8 9 0 0 2 5 Stack Bottom 10 0 0 1 5 Stack Computer Architecture Stack Organization

• Memory Stack Memory • PUSH PC 0 SP ← SP – 1 1 M[SP] ← DR 2 • POP DR ← M[SP] AR 100 101 SP ← SP + 1 102

200 SP 201 202 Computer Architecture Reverse Polish Notation

• Infix Notation A + B • Prefix or Polish Notation + A B • Postfix or Reverse Polish Notation (RPN) A B +

(2) (4)  (3) (3)  + RPN (8) (3) (3)  + A  B + C  D A B  C D  + (8) (9) + 17 Computer Architecture Reverse Polish Notation

• Example (A + B)  [C  (D + E) + F]

(A B +) (D E +) C  F +  Computer Architecture Reverse Polish Notation

• Stack Operation (3) (4)  (5) (6)  +

PUSH 3 PUSH 4 6 MULT PUSH 5 4530 PUSH 6 31242 MULT ADD Computer Architecture

ADDITIONAL INSTRUCTIONS Computer Architecture

Logical Shifts

• Logical shift – shifting left (LShiftL) and shifting right (LShiftR) C R0 0

before: 0 0 1 1 1 0 . . . 0 1 1

after: 1 1 1 0 . . . 0 1 1 0 0

(a) Logical shift left LShiftL #2,R0

0 R0 C

before: 0 1 1 1 0 . . . 0 1 1 0

after: 0 0 0 1 1 1 0 . . . 0 1

(b) Logical shift ightr LShiftR #2,R0 Computer Architecture

Arithmetic Shifts

R0 C

before: 1 0 0 1 1 . . . 0 1 0 0

after: 1 1 1 0 0 1 1 . . . 0 1

(c) Ar ithmetic shift right AShiftR #2,R0 Computer Architecture

C R0

before: 0 0 1 1 1 0 . . . 0 1 1

Rotate after: 1 1 1 0 . . . 0 1 1 0 1

(a) Rotate left without carry RotateL #2,R0

C R0

before: 0 0 1 1 1 0 . . . 0 1 1

after: 1 1 1 0 . . . 0 1 1 0 0

(b) Rotate left with carry RotateLC #2,R0

R0 C

before: 0 1 1 1 0 . . . 0 1 1 0

after: 1 1 0 1 1 1 0 . . . 0 1

(c) Rotate ightr without carry RotateR #2,R0

R0 C

before: 0 1 1 1 0 . . . 0 1 1 0

after: 1 0 0 1 1 1 0 . . . 0 1

(d) Rotate ightr with carry RotateRC #2,R0

Figure 2.32. Rotate instructions. Computer Architecture

Multiplication and Division

• Not very popular (especially division)

• Multiply Ri, Rj Rj ← [Ri] х [Rj] • 2n-bit product case: high-order half in R(j+1)

• Divide Ri, Rj Rj ← [Ri] / [Rj] Quotient is in Rj, remainder may be placed in R(j+1) Computer Architecture

ENCODING OF MACHINE INSTRUCTIONS Computer Architecture

Encoding of Machine Instructions

• Assembly language program needs to be converted into machine instructions. (ADD = 0100 in ARM instruction set) • In the previous section, an assumption was made that all instructions are one word in length. • OP code: the type of operation to be performed and the type of operands used may be specified using an encoded binary pattern • Suppose 32-bit word length, 8-bit OP code (how many instructions can we have?), 16 registers in total (how many bits?), 3-bit addressing mode indicator. • Add R1, R2 8 7 7 10 • Move 24(R0), R5 • LshiftR #2, R0 OP code Source Dest Other info • Move #$3A, R1 • Branch>0 LOOP (a) One-word instruction Computer Architecture

Encoding of Machine Instructions

• What happens if we want to specify a memory operand using the Absolute addressing mode? • Move R2, LOC • 14-bit for LOC – insufficient • Solution – use two words

OP code Source Dest Other info

Memory address/Immediate operand

(b) Two-word instruction Computer Architecture

Encoding of Machine Instructions

• Then what if an instruction in which two operands can be specified using the Absolute addressing mode? • Move LOC1, LOC2 • Solution – use two additional words • This approach results in instructions of variable length. Complex instructions can be implemented, closely resembling operations in high-level programming languages – Complex Instruction Set Computer (CISC) Computer Architecture

Encoding of Machine Instructions

• If we insist that all instructions must fit into a single 32-bit word, it is not possible to provide a 32-bit address or a 32-bit immediate operand within the instruction. • It is still possible to define a highly functional instruction set, which makes extensive use of the processor registers. • Add R1, R2 ----- yes • Add LOC, R2 ----- no • Add (R3), R2 ----- yes