Concordia University Electrical & Computer Engineering

COEN 311 Computer Organization & Software

Chapter 3 Principal Components of a Computer

(Prof. Sofiène Tahar)

CPU (Central Processing Unit)

• Components of a Computer
  – CPU
  – Memory
  – I/O devices

• Architectures
  – Accumulator
  – General Purpose Register (GPR)

• Instruction Execution Steps
  – Fetch (read instruction from memory)
  – Decode (interpret instruction)
  – Execute (fetch operands and execute operation)
  – Increment PC (prepare to fetch next instruction)

• Instruction Execution Flow
  – Microsteps
  – Time analysis

• CPU-Memory Interface
  – Data Bus
  – Address Bus
  – Control Bus (memory and I/O control)

• Internal Architecture
  – Data buses (width of data/registers/buffers)
  – Address buses (width of address/registers/buffers)
  – Control signals (select/enable/read/write/op.)

• Internal Organization
  – Registers (RF/Accu, PC, IR)
  – Internal buses (address and data)
  – Buffers (MBR, MAR, Temp)
  – Units (ALU, decoder, etc.)
  – Control (muxes, enable, read/write, ALU ops, etc.)

CPU Internal Organization: General Purpose Register (GPR) Machine

[Figure: GPR machine with a 10-bit data bus and a 6-bit address bus. CPU: register file RF (R0 to R3, selected by the 2-bit fields a and b), ALU, IR, Interpreter, PC with incrementer INC, MBR, and MAR. Byte-organized memory with addresses 0 to 63 = 2^6 - 1, connected to the CPU by the data bus, address bus, and control bus; I/O control.]

GPR (General Purpose Register) CPU

[Figure: the same organization with a 16-bit data bus and an 8-bit address bus. Byte-organized memory with addresses 0 to 2^8 - 1 = 255; memory control and I/O control.]

RF: Register File
MBR: Memory Buffer Register
MAR: Memory Address Register
IR: Instruction Register
Interpreter: Control Unit
PC: Program Counter
INC: Incrementer

Execution steps for “Add” Instruction

ADD R1, R2, R3 ; R3 ← R1 + R2

Fetch:         MAR ← PC
               MBR ← M[MAR]
               IR ← MBR
Decode:        Interpreter decodes the instruction (add, or else another instruction)
Addition:      R3 ← R1 + R2
Increment PC:  PC ← PC + 1

Accumulator Machine

[Figure: accumulator machine. CPU: ACC, ALU, IR, Interpreter, PC with incrementer INC, MBR, and MAR, with 16-bit data paths and 14-bit addresses. Memory: 8-bit locations at addresses 0 to 2^14 - 1, connected to the CPU by the data bus, the address bus, and the R/W and E control lines.]

C = 2^14 Bytes = 16 KB

ACC: Accumulator
IR: Instruction Register
ALU: Arithmetic & Logic Unit
Interpreter: Control Unit
MBR: Memory Buffer Register
PC: Program Counter
MAR: Memory Address Register
INC: Incrementer

Execution steps for “Add” instruction in Accumulator Machine

ADD Y ; ACC ← ACC + M[Y]

Fetch:             MAR ← PC
                   MBR ← M[MAR]
                   IR ← MBR
Decode/Interpret:  Interpreter decodes the instruction (add, or else another instruction)
Load operand:      MAR ← IR[Y]
                   MBR ← M[MAR]
Addition:          ACC ← ACC + MBR
PC increment:      PC ← PC + 2

Instruction Execution in GPR Machine

ADD R1, R2, R3 ; R3 ← R1 + R2

Memory contents:
$1000  9103
$1002  9202
$1004  A123  = 1010 0001 0010 0011 = ADD R1, R2, R3

1) Fetch
2) Decode
3) Execute
4) Inc PC

[Figure: CPU with MAR, MDR, Control, PC, IR, registers R0 to R15, and ALU, connected to Memory]

CPU Timing Analysis of Instruction Execution ADD R1, R2, R3

Processor: 3.3 GHz, clock cycle = 1/(3.3 x 10^9) = 0.3 ns
Memory: 333 MHz, cycle = 1/(333 x 10^6) = 3 ns

Phase     Step            Clock cycles   Time
fetch     MAR ← PC        1              0.3 ns
fetch     MBR ← M[MAR]    11             3 + 0.3 ns
fetch     IR ← MBR        1              0.3 ns
decode    decode          1              0.3 ns
execute   R3 ← R1 + R2    3              3 x 0.3 ns
Inc PC    PC ← PC + 2     2              2 x 0.3 ns
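As a rough check, the per-step cycle counts listed for ADD R1, R2, R3 can be tallied in a few lines of Python. This is only a sketch: the step list and the rounded 0.3 ns cycle time come from the slide, while the variable names are illustrative.

```python
# Clock-cycle counts per microstep of ADD R1, R2, R3, as listed on the slide.
CYCLE_NS = 0.3  # one processor clock cycle at 3.3 GHz, rounded as on the slide

steps = [
    ("MAR <- PC", 1),
    ("MBR <- M[MAR]", 11),  # 3 ns memory access (10 cycles) + 0.3 ns (1 cycle)
    ("IR <- MBR", 1),
    ("decode", 1),
    ("R3 <- R1 + R2", 3),
    ("PC <- PC + 2", 2),
]

total_cycles = sum(cycles for _, cycles in steps)
total_ns = total_cycles * CYCLE_NS
print(f"{total_cycles} clock cycles = {total_ns:.1f} ns")  # 19 clock cycles = 5.7 ns
```

The single memory read accounts for 11 of the 19 cycles, which is why memory accesses dominate the execution time.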

Total: 19 clock cycles = 5.7 ns

Complex (multiple words) instruction

Assembly:   ADD ($2000), ($2002), R4

Operation:  R4 ← M[$2000] + M[$2002]

Encoding:

ADD      R4       M[$2000]   M[$2002]
1010     0100     $2000      $2002
4 bits   4 bits   16 bits    16 bits

40 bits = 5 Bytes!!

With 16-bit data, the opcode and register fields are each padded to 8 bits (X = unused bit):

1010 XXXX 0100 XXXX   $2000     $2002
16 bits               16 bits   16 bits

48 bits = 6 Bytes

With the unused bits set to 0:

1010 0000 0100 0000   $2000     $2002
16 bits               16 bits   16 bits

ADD ($2000), ($2002), R4

48 bits = 3 Words
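The three-word encoding can be reproduced with a short Python sketch. It assumes the layout shown above (4-bit opcode 1010, 4 unused bits set to 0, 4-bit register number, 4 unused bits set to 0, then the two 16-bit addresses); the function name is hypothetical.

```python
def encode_add_mem_mem_reg(addr1, addr2, reg):
    """Return the three 16-bit words for ADD (addr1), (addr2), R<reg>."""
    OPCODE_ADD = 0b1010  # 4-bit ADD opcode from the slide
    # Opcode in bits 15..12, register number in bits 7..4, unused bits left at 0.
    first_word = (OPCODE_ADD << 12) | ((reg & 0xF) << 4)
    return [first_word, addr1 & 0xFFFF, addr2 & 0xFFFF]

words = encode_add_mem_mem_reg(0x2000, 0x2002, 4)
print(" ".join(f"${w:04X}" for w in words))  # $A040 $2000 $2002
```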

Memory contents:

Instruction:  $1000  A040
              $1002  2000
              $1004  2002
Data:         $2000  0710
              $2002  0311

[Figure: 16 bit data, 16 bit address architecture. The CPU holds PC, a 48-bit IR, MAR, MBR, Temp 1, Temp 2, and R4; the adder combines $0710 and $0311.]

ADD ($2000), ($2002), R4

Fetch:
  MAR ← PC            $1000
  MBR ← M[MAR]        $A040
  IR ← MBR            $A040

Decode:
  interpret the opcode: add mem1, mem2, reg (or another opcode)
  PC ← PC + 2         $1002

Fetch address of memory operand 1:
  MAR ← PC            $1002
  MBR ← M[MAR]        $2000
  Temp1 ← MBR         $2000
  PC ← PC + 2         $1004

Fetch address of memory operand 2:
  MAR ← PC            $1004
  MBR ← M[MAR]        $2002
  Temp2 ← MBR         $2002

Fetch memory operand 1:
  MAR ← Temp1         $2000
  MBR ← M[MAR]        $0710
  Temp1 ← MBR         $0710

Fetch memory operand 2:
  MAR ← Temp2         $2002
  MBR ← M[MAR]        $0311

Execution:
  R4 ← MBR + Temp1    ($0311 + $0710)

Increment PC:
  PC ← PC + 2         $1006

ADD ($2000), ($2002), R4

Processor: 3.3 GHz
Memory: 333 MHz
Memory access: 10 cc
Register transfer: 1 cc
Logical operation: 1 cc
Arithmetic operation: 2 cc

Step                 Clock cycles   Time
MAR ← PC             1              0.3 ns
MBR ← M[MAR]         11             3.3 ns
IR ← MBR             1              0.3 ns
decode               1              0.3 ns
PC ← PC + 2          3              0.9 ns
MAR ← PC             1              0.3 ns
MBR ← M[MAR]         11             3.3 ns
Temp1 ← MBR          1              0.3 ns
PC ← PC + 2          3              0.9 ns
MAR ← PC             1              0.3 ns
MBR ← M[MAR]         11             3.3 ns
Temp2 ← MBR          1              0.3 ns
MAR ← Temp1          1              0.3 ns
MBR ← M[MAR]         11             3.3 ns
Temp1 ← MBR          1              0.3 ns
MAR ← Temp2          1              0.3 ns
MBR ← M[MAR]         11             3.3 ns
R4 ← MBR + Temp1     3              0.9 ns
PC ← PC + 2          3              0.9 ns

Total: 77 cc = 23.1 ns

71% memory traffic!! (5 memory reads x 11 cc = 55 of 77 cc)

[Figure: MC68000 Computer]

[Figure: Steps required for execution of add instruction]

Your First Assembly Program

We want to compute a ← (x+y) * (x-y), where x, y, and a are data in memory.

1) Write an assembly program for the above task using:
   a) Accumulator machine
   b) GPR machine (4 registers)
2) Assume that all accumulator instructions take 50 ns, and that all GPR instructions take 30 ns, except memory-to-register and register-to-memory moves (MOVE M-R/R-M), which take 50 ns. Compute the total execution time of the programs in 1a) and 1b).

Instruction Sets

1) Accumulator Machine

ADD X ; ACC ← ACC + M[X]
SUB X ; ACC ← ACC - M[X]
MUL X ; ACC ← ACC * M[X]
LD X  ; ACC ← M[X]
ST X  ; M[X] ← ACC

X: address in memory; x: data stored in memory at that address (M[X] = x)

2) GPR Machine

Operand order: src, src, dest

ADD Ri, Rj, Rk ; Rk ← Ri + Rj
SUB Ri, Rj, Rk ; Rk ← Ri - Rj
MUL Ri, Rj, Rk ; Rk ← Ri * Rj
MOVE Ri, Rj    ; Rj ← Ri
MOVE Ri, M[X]  ; M[X] ← Ri
MOVE M[X], Rj  ; Rj ← M[X]

The Assembly Programs

Memory layout (address: data):  X: x,  Y: y,  A: a,  T: temp

1) Accumulator Machine

LD X   ; ACC ← M[X]
ADD Y  ; ACC ← x + y
ST T   ; temp ← x + y
LD X   ; ACC ← x
SUB Y  ; ACC ← x - y
MUL T  ; ACC ← (x - y) * temp
ST A   ; a ← (x - y) * (x + y)

2) GPR Machine

MOVE x, R0      ; R0 ← x (= M[X])
MOVE y, R1      ; R1 ← y (= M[Y])
ADD R0, R1, R2  ; R2 ← R0 + R1
SUB R0, R1, R3  ; R3 ← R0 - R1
MUL R2, R3, R0  ; R0 ← R2 * R3
MOVE R0, a      ; a ← R0 = R2 * R3 = (x + y) * (x - y)
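One way to check both programs, and to answer the timing question, is to trace them in Python. This is a sketch under stated assumptions: the sample values x = 7 and y = 3 and all function and variable names are illustrative, and the instruction costs (50 ns and 30 ns) are the ones given in the exercise.

```python
def run_accumulator(mem):
    """Trace the accumulator program; return its execution time in ns."""
    program = [("LD", "X"), ("ADD", "Y"), ("ST", "T"),
               ("LD", "X"), ("SUB", "Y"), ("MUL", "T"), ("ST", "A")]
    acc = 0
    for op, addr in program:
        if op == "LD":
            acc = mem[addr]
        elif op == "ST":
            mem[addr] = acc
        elif op == "ADD":
            acc += mem[addr]
        elif op == "SUB":
            acc -= mem[addr]
        elif op == "MUL":
            acc *= mem[addr]
    return 50 * len(program)  # every accumulator instruction takes 50 ns

def run_gpr(mem):
    """Trace the GPR program; return its execution time in ns."""
    r = [0, 0, 0, 0]
    time_ns = 0
    r[0] = mem["X"];     time_ns += 50  # R0 <- M[X]   (memory-to-register move)
    r[1] = mem["Y"];     time_ns += 50  # R1 <- M[Y]   (memory-to-register move)
    r[2] = r[0] + r[1];  time_ns += 30  # R2 <- R0 + R1
    r[3] = r[0] - r[1];  time_ns += 30  # R3 <- R0 - R1
    r[0] = r[2] * r[3];  time_ns += 30  # R0 <- R2 * R3
    mem["A"] = r[0];     time_ns += 50  # M[A] <- R0   (register-to-memory move)
    return time_ns

mem_acc = {"X": 7, "Y": 3, "A": 0, "T": 0}
mem_gpr = {"X": 7, "Y": 3, "A": 0}
t_acc = run_accumulator(mem_acc)
t_gpr = run_gpr(mem_gpr)
print(mem_acc["A"], t_acc)  # 40 350
print(mem_gpr["A"], t_gpr)  # 40 240
```

Under these costs the accumulator program takes 7 x 50 ns = 350 ns, and the GPR program takes 3 x 50 ns + 3 x 30 ns = 240 ns. The GPR version is faster because x and y are each read from memory only once.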

Main Memory

• Technologies, RAM, ROM etc.
• Hardware (modules, buses, etc.)
• Addresses, Organization and Capacity
• Content
  – Instructions (machine code)
  – Data
    • Integer Data
    • Floating point
    • Char Code
      • ASCII
      • EBCDIC

CPU-Memory Interface

[Figure: CPU-memory interface with data and address buses of 8 bits each]

Memory Hierarchy

[Figure: CPU, Level 1 Cache, Level 2 Cache, Main Memory (addresses 0 to 2^n - 1), Hard Disk]

Memory Technologies

• Volatile Memory: RAM (Random Access Memory)
• Non-Volatile Memory: ROM (Read Only Memory)
  – PROM: Programmable ROM
  – EPROM: Erasable PROM (erased with UV light; deletes ALL locations)
  – EEPROM: Electrically Erasable PROM
  – EAPROM: Electrically Alterable PROM
    (EEPROM and EAPROM are erased electrically and can delete SELECTED locations → Flash Memory)

SRAM vs. DRAM

• Static RAM (SRAM)
  – Fast but expensive
  – 5-transistor flip-flop (more chip area)

• Dynamic RAM (DRAM)
  – Relatively slower but less expensive
  – 1 transistor + 1 capacitor (less chip area, higher density)

DRAM Technologies

• SDRAM: Synchronous DRAM
• SDR SDRAM: Single Data Rate SDRAM
• DDR SDRAM: Double Data Rate SDRAM
• DDR2 SDRAM: an evolution over DDR SDRAM
• DDR3 SDRAM: improvement over DDR2
• DDR4 SDRAM: improvement over DDR3
• RLDRAM: Reduced-latency DRAM

Memory Capacity

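C = 2^m x n   (m address lines, n data lines)

[Figure: memory with locations 0 to 31, each 8 bits wide, connected by an 8-bit data bus, a 5-bit address bus, and a 2-line control bus]

EXAMPLE: C = 2^5 x 8 = 32 x 8 bit = 32 Bytes

WORD ORGANIZATION

Address bus: m bits; data bus: n bits

C = 2^m x n bit
  = 2^m x n/8 Byte

[Figure: memory with locations 0 to 2^m - 1, each n bits wide, connected by an n-bit data bus, an m-bit address bus, and a 2-line control bus]

Example: m = 32, n = 16
C = 2^32 x 16 bit
  = 2^32 x 16/8 Byte
  = 8 GB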
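The capacity formula C = 2^m x n can be turned into a small helper to check the two examples. This is a minimal sketch; the function name is hypothetical.

```python
def capacity_bytes(m, n):
    """Capacity in bytes of a memory with m address lines and n-bit locations."""
    return (2 ** m) * n // 8

print(capacity_bytes(5, 8))               # 32 (Bytes)
print(capacity_bytes(32, 16) // 2 ** 30)  # 8 (GB)
```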

Note: 2^10 = 1 K; 2^20 = 1 M; 2^30 = 1 G; 2^40 = 1 T

BYTE ORGANIZATION

Address bus: m bits; data bus: n bits; each address holds 8 bits

C = 2^m x 8 bit
  = 2^m Byte

[Figure: memory with locations 0 to 2^m - 1, each 8 bits wide, connected by an n-bit data bus, an m-bit address bus, and a 2-line control bus]

Example: m = 32, n = 16
C = 2^32 x 8 bit
  = 2^32 Byte
  = 4 GB

Example of Byte Organized Memory

[Figure: first 24 bytes (a) and first 12 16-bit words (b)]

Example of Byte Organized Memory

[Figure: first 12 16-bit words (a) and three 32-bit longwords with addresses 0, 2, and 6 (b)]

“Real” Memory-CPU Interface

[Figure: CPU connected by the address bus (lines 16-11, 10, 9, 8, and 7-1), the RD and WR lines, and the data bus to a decoder (outputs 3, 2, 1, 0), four 128 x 8 RAM chips (RAM 1 to RAM 4, each with CS1, CS2, RD, WR, and 7 address inputs AD7), and a ROM chip (CS1, CS2, and 9 address inputs AD9)]

Memory Organization: Logical View

[Figure: logical view of addresses 0 to 1023 as four 256 B modules; column labels: 2 bits, 8 bits, 16 bits]

Module   Select (2 bits)   Address in module (8 bits)   Addresses
0        00                00..0 to 11..1               0 to 255
1        01                00..0 to 11..1               256 to 511
2        10                00..0 to 11..1               512 to 767
3        11                00..0 to 11..1               768 to 1023

Memory Organization (Internal Structure) [Word Organized Memory]

Memory System (8 bits address & 16 bits data)
Memory Module (8 bits address & 8 bits data): 256 B

[Figure: two 256 B modules share the 8-bit address bus and the R/W and CS lines; together they drive the 16-bit data bus, 8 bits each]

Capacity (word organized): C = 2^8 x 16 bits = 2^8 x 2 Bytes = 512 B

Memory Organization (Internal Structure) [Byte Organized Memory]

Memory System (10 bits address & 16 bits data)
Memory Module (8 bits address & 16 bits data): 256 B

[Figure: four 256 B modules on the 16-bit data bus; 8 bits of the 10-bit address bus go to the modules, and the remaining 2 bits select a module through CS]

Capacity (byte organized): C = 2^10 Bytes = 1 KB

Memory Organization: Another Structure [Byte Organized Memory]

Memory System (10 bits address & 16 bits data)
Memory Module (8 bits address & 16 bits data): 256 B

[Figure: alternative arrangement of the modules, each with 16-bit data, R/W, and CS; the 10-bit address is split into 8 bits and 2 bits]

Capacity (byte organized): C = 2^10 Bytes = 1 KB

Complex Memory Organization [Word Organized Memory]

System: 25 bits address, 64 bits data
C = 2^25 x 64/8 Bytes = 256 MB

Module: 23 bits address, 8 bits data (8 MB modules)

[Figure: four rows of eight 8 MB modules; the eight modules of a row together supply the 64 data bits, 23 address bits go to every module, and the 2 remaining address bits (00, 01, 10, 11) select the row]

C = 256 MB = 32 x 8 MB

Memory Content

• Contents: strings of bits (bytes, 16-bit words, 32-bit words, ...)
• Interpretation
  – Instruction (code with fixed format, e.g. MC68000)
  – Data
    • Signed Integer
    • Unsigned Integer
    • Floating Point number
    • BCD number
    • Character: ASCII, EBCDIC

BCD: Binary Coded Decimal
ASCII: American Standard Code for Information Interchange (7-bit code, max. 128 characters)
EBCDIC: Extended Binary Coded Decimal Interchange Code (8-bit code, max. 256 characters)

[Table: EBCDIC and ASCII codes (in Hex) for selected characters]

[Figure: Example: two possible representations of “0018” depending on the interpretation: EBCDIC character code or signed integer]

[Figure: Example: Instruction Encoding]

Class Quiz

Give an interpretation of the following string of bits, 0011001000001011, assuming it is:
• an Unsigned Integer
• a Signed Integer
• a BCD number
• a String of ASCII characters
• an IEEE 754 Floating Point number
• an MC68000 Instruction

Unsigned Number

0011 0010 0000 1011 = $320B = 12811

Signed Number

0011 0010 0000 1011 = $320B = +12811 (the sign bit is 0)

BCD Number

0011 0010 0000 1011 → 3 2 0 NA

Not a BCD number! (1011 is not a decimal digit.)

IEEE Float Number

0011001000001011

Fields: SIGN (S), EXPONENT (E), SIGNIFICAND (F)

S = 0, E = 0110010 = 50, F = 00001011

(-1)^S x 2^(E-127) x 1.F
= (-1)^0 x 2^(50-127) x 1.00001011
= 2^-77 x 1.00001011
= 2^-77 x (1 + 2^-5 + 2^-7 + 2^-8)
= 6.901 x 10^-24
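The arithmetic above can be checked numerically. This is only a sketch: it assumes the field split implied by the slide's numbers (1 sign bit, 7 exponent bits giving E = 50, 8 fraction bits giving F = 00001011) together with the slide's bias of 127, and the variable names are illustrative.

```python
bits = "0011001000001011"
word = int(bits, 2)

# Assumed field split, matching the values used on the slide.
s = (word >> 15) & 0x1   # sign bit
e = (word >> 8) & 0x7F   # 7 exponent bits: 0110010 = 50
f = word & 0xFF          # 8 fraction bits: 00001011

value = (-1) ** s * 2.0 ** (e - 127) * (1 + f / 2 ** 8)
print(f"{value:.2e}")  # 6.90e-24
```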

ASCII Character

0011 0010 0000 1011 = $32 $0B = “2” “VT”

VT: Vertical Tab

[Table: MC68000 Instructions Encoding (subset)]

MC68000 Instruction

0011001000001011

MOVE Ai, Dj (Ai: source, Dj: dest.)
MOVE A3, D1
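The integer, BCD, ASCII, and instruction interpretations of the quiz string can be reproduced with a short Python sketch. Two assumptions are made here: signed integers use 16-bit two's complement, and the instruction is split according to the MC68000 MOVE layout (00, size, destination register, destination mode, source mode, source register). The variable names are illustrative.

```python
bits = "0011001000001011"
word = int(bits, 2)

unsigned = word  # $320B
# Assumption: signed integers are 16-bit two's complement.
signed = word - (1 << 16) if word & 0x8000 else word

# BCD: each 4-bit group must be a decimal digit (0 to 9).
nibbles = [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]
is_bcd = all(n <= 9 for n in nibbles)

# ASCII: two 8-bit character codes ($32 is "2", $0B is VT).
ascii_codes = [word >> 8, word & 0xFF]

# MC68000 MOVE fields, from the most significant bits down.
size = (word >> 12) & 0x3       # 3 means a 16-bit (word) move
dest_reg = (word >> 9) & 0x7
dest_mode = (word >> 6) & 0x7   # 0 means data register direct (Dn)
src_mode = (word >> 3) & 0x7    # 1 means address register direct (An)
src_reg = word & 0x7

print(unsigned, signed, nibbles, is_bcd)   # 12811 12811 [3, 2, 0, 11] False
print([f"${c:02X}" for c in ascii_codes])  # ['$32', '$0B']
if dest_mode == 0 and src_mode == 1:
    print(f"MOVE A{src_reg}, D{dest_reg}")  # MOVE A3, D1
```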