CO200 - Computer Organization & Architecture

Basavaraj Talawar [email protected]

Course Syllabus
● Basics – CPU organization, data representation, and instruction sets
● Design – Fixed-point arithmetic (adders, subtracters, multipliers, dividers), ALU, floating-point arithmetic
● Control Design – Hardwired control, microprogrammed control, pipeline control
● Memory Organization – Serial vs. random access memories, caches
● Principles of Pipelining
● Principles of ...

Course Structure

● Textbooks
  – J. P. Hayes, Computer Architecture and Organization, 3rd ed., McGraw Hill.
  – Hwang and Briggs, Computer Architecture and Parallel Processing, McGraw Hill.
  – D. Patterson and J. Hennessy, Computer Organization and Design, 3rd ed., Morgan Kaufmann.
● Other References
  – NPTEL course on “High Performance Computing” by Matthew Jacob, IISc.
● Guest Lectures

● About the Course – Surprise Quizzes: 15%, Assignments: 10%, Mid Sem: 25%, Final Exam: 50%

Course Objectives

● To understand how a computer works ● To know the architecture and working of the components inside a computer – Processor, Control Unit, ALU, Memory, I/O

Course Objectives – Expanded

● How is a machine language program executed by a computer?

● How does the software instruct the hardware to perform a desired action? How does the hardware instruct a desired unit to perform its corresponding operation?

● Why study all of this? – To gain insight into the setting in which our programs execute – To improve the setting in which our programs execute – to improve the performance of the system

What is a Computer?

● An electronic device which is capable of receiving information (data) in a particular form and of performing a sequence of operations in accordance with a predetermined but variable set of procedural instructions (program) to produce a result in the form of information or signals.

Basic Computer Organization

● Machine instructions – Description of a primitive operation that the machine hardware is able to understand – Expressed in binary – Example of a 32-bit machine language instruction:

00110011101100000100001110101011

Basic Computer Organization

● Instruction Set – Complete specification of all the kinds of instructions that the processor hardware was built to execute – E.g.: ADD, SUB, XOR, JUMP, … ● How are programs written in high-level languages such as C translated into a language that the machine understands?

The Computer Program

● Description of algorithms and data structures to achieve a specific objective ● A compiler translates the high-level language program into assembly language. ● An assembler translates the assembly into machine code.

Basic Computer Organization

● Processor – Executes programs ● Main Memory – Holds program and data ● I/O – For communication and data

Processor (CPU)

[Figure: the CPU (ALU, registers, control) connected to memory and I/O devices over a bus.]

Inside the Processor

● Control Hardware: Hardware to manage instruction execution ● ALU: Arithmetic and Logical Unit (hardware to do arithmetic and logic operations) ● Registers: Small units of memory to hold data/instructions temporarily during execution ● Memory: Stores information being processed by the CPU ● Input: Allows the user to supply information to the computer ● Output: Allows the user to receive information from the computer

Physics in the Real World

Computer Architecture

Application
Algorithm
Programming Language
Operating System / Virtual Machines
Instruction Set Architecture
Organization / Register-Transfer Level
Gates
Circuits
Devices
Physics

Computer architecture is the design of the abstraction/implementation layers that allow us to execute information processing applications efficiently using manufacturing technologies.

Architecture vs. Organization

● Architecture / Instruction Set Architecture (ISA)
  – Programmer-visible state (memory & registers)
  – Operations (instructions and how they work)
  – Input/Output
  – Data representation – types/sizes
● Microarchitecture / Organization
  – The way a given ISA is implemented in a particular processor

Same Architecture, Different Organizations

● AMD Athlon II X4
  – ISA: x86 instruction set
  – Quad core, 2.9 GHz, 125 W
  – 3 instructions/cycle/core
  – 64 KB L1 cache, 512 KB L2 cache
● Intel Atom
  – ISA: x86 instruction set
  – Single core, 1.6 GHz, 2 W
  – 2 instructions/cycle/core
  – 32 KB/24 KB L1 I/D cache, 512 KB L2 cache

Different Architectures, Organizations
● AMD Vishera
  – x86 ISA
  – 8 cores, 4.7 GHz, 125 W
  – 64 KB L1 cache, 2 MB L2 cache, 8 MB L3 cache
● IBM POWER8
  – Power ISA
  – 12 cores, 4.5 GHz, 250 W
  – 64 KB L1 cache, 512 KB L2 cache, 8 MB L3 cache

Recap

● What is a Computer? ● Computer Organization and Architecture – Registers, Control Unit, ALU, Memory, I/O ● ISA, Machine language ● Organization vs. Architecture

Coming up …

● Processor Performance ● Machine Models

Concept of Time and Speed

● Frequency: Number of occurrences of a repeating event per unit time.
  – SI unit: Hertz (Hz)
● The period is the duration of one cycle in a repeating event.
  – Period = Cycle Time = 1 / Frequency
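● Example: a 2 GHz clock has a cycle time of 1 / (2 × 10^9 Hz) = 0.5 ns.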

On Processor Performance

● How is frequency related to performance?

Program Execution Time = Execution Time per Instruction × Total Program Instructions

CPU Time = Execution Time per Instruction × Instruction Count (IC)

Execution Time per Instruction = Cycles spent per Instruction × Cycle Time

CPU Time = IC × CPI × Cycle Time

Example

What is the execution time of a program containing a million instructions, each occupying 4 cycles, in a 2 GHz processor?

Iron Law of Processor Performance

CPU Time = IC × Cycles per Instruction (CPI) × Cycle Time

Time per Cycle = 1 / Frequency

CPU Time = (IC × CPI) / Frequency

CPU Time = (Instructions / Program) × (Clock Cycles / Instruction) × (Seconds / Clock Cycle)

On Processor Performance

CPU Time = (Instructions / Program) × (Clock Cycles / Instruction) × (Seconds / Clock Cycle)
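A minimal C sketch of the Iron Law (an illustration, not from the slides), plugging in the earlier example's numbers – 10^6 instructions, 4 cycles per instruction, 2 GHz clock – which works out to 2 ms:

    #include <stdio.h>

    /* Iron Law of processor performance:
     * CPU time = instruction count * cycles per instruction * cycle time */
    int main(void) {
        double ic   = 1e6;     /* instruction count (example from the slides) */
        double cpi  = 4.0;     /* cycles per instruction                      */
        double freq = 2e9;     /* clock frequency in Hz (2 GHz)               */

        double cycle_time = 1.0 / freq;            /* seconds per clock cycle */
        double cpu_time   = ic * cpi * cycle_time;

        printf("CPU time = %g s\n", cpu_time);     /* prints 0.002 s = 2 ms   */
        return 0;
    }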

THE COMPILER, ARCHITECTURE AND ORGANIZATION

The GNU C Compiler

● $gcc hello.c
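As a sketch of the compile–assemble–link chain behind this command (the flags shown are standard gcc options):

    $ gcc -S hello.c        # compile only: produces assembly in hello.s
    $ gcc -c hello.s        # assemble: produces machine code in hello.o
    $ gcc hello.o -o hello  # link: produces the executable hello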

The compiler and its working: guest lecture by Dr. Janakiraman, IBM, August 2.

Operations and Operands

● C = A + B ● Operation: Addition. Operands: A & B. Result: C. ● Instruction: ADD C, A, B

Where do operands come from and where do results go? This is an architectural decision.

Memory – Toy Example

● Individually addressable locations, with addresses from 0x0000 to 0xFFFF
● Linearly increasing addresses
● Memory is 'growing down' in the figure
● Any location can be read from / written into
● How many locations does this example memory have?

Recap

● Processor performance ● Abstract view of Memory

Example

Your desktop has a 4 GB memory. How long (in bits) is its address?
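● Worked answer (assuming byte-addressable memory): 4 GB = 2^32 bytes, so an address needs log2(2^32) = 32 bits.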

Operations and Operands

[Figure: the processor (control unit + ALU) connected to memory, with operands i1, i2 and a result R.]

Machine Model – Stack

● Stack is a form of memory
● Top of the Stack (TOS) is pointed to by the Stack Pointer
● Operations: Push and Pop
[Figure: a stack occupying addresses 0x00–0xFF, with the TOS marker partway up.]
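A minimal C sketch of a stack with a top-of-stack index (names and sizes are illustrative, not from the slides):

    #include <stdio.h>

    #define STACK_SIZE 256          /* addresses 0x00 .. 0xFF, as in the toy figure */

    static int stack[STACK_SIZE];
    static int tos = -1;            /* index of the top of the stack; -1 = empty    */

    /* Push a value onto the top of the stack. */
    static void push(int value) {
        stack[++tos] = value;
    }

    /* Pop the value at the top of the stack. */
    static int pop(void) {
        return stack[tos--];
    }

    int main(void) {
        push(10);
        push(12);
        printf("popped %d\n", pop());   /* prints 12 */
        return 0;
    }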

Stack

[Figure: a sequence of frames showing the stack (addresses 0x00–0xFF, initially holding 10, 94, 71 at the bottom) and a small memory map (255 at 0x07, 77 at 0x10, 44 at 0x12, 172 at 0x13) as the operations PUSH 10, PUSH 12, POP 13, PUSH 7 execute; each frame shows the value moved and the TOS pointer going up or down.]

Machine Model – Stack

[Figure: the processor (control + ALU) operating on a stack with a TOS pointer, alongside main memory.]

Where do operands come from and where do results go?

Machine Model – Stack
● The operands are always TOS and TOS – 1.
● The result always goes into TOS – 1, which becomes the new TOS.
● Implicit operands.
● Instruction: ADD
● Example: d = (a + b) * c

Postfix Expressions

a + b ab+

(a + b)*c

X*c Xc* where X = (a + b) postfix form of (a + b) is ab+

ab+c*

Postfix Expressions

a + (b*c) abc*+

(a + b)* (c - d)

X * (c – d) where X = (a + b)

X * Y XY* where Y = (c – d) replace Y with its postfix form

Xcd-* replace X with its postfix form

(a + b)* (c - d) ab+cd-*

(((a + b)*c)+d)*e

((X*c)+d)*e where X = (a + b)

(Y+d)*e where Y = (X*c)

Z*e Ze* where Z = (Y+d) replace Z with its postfix form

Yd+e* replace Y with its postfix form

Xc*d+e* replace X with its postfix form

ab+c*d+e*

Reverse Polish Notation
● A way of expressing arithmetic expressions that avoids the use of brackets.
● Evaluated left-to-right. Natural on a stack.
● Devised by the Polish philosopher and mathematician Jan Łukasiewicz (1878–1956)

Infix Notation          RPN
a+b                     ab+
(a+b)*c                 ab+c*
a+(b*c)                 abc*+
(a+b)*(c-d)             ab+cd-*
(((a+b)*c)+d)*e         ab+c*d+e*
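A small C sketch of left-to-right RPN evaluation on a stack (illustrative only; it assumes single-digit operands):

    #include <stdio.h>
    #include <ctype.h>

    /* Evaluate a postfix (RPN) string of single-digit operands, e.g. "34+5*". */
    static int eval_rpn(const char *expr) {
        int stack[64];
        int tos = -1;                         /* top-of-stack index */

        for (; *expr; expr++) {
            if (isdigit((unsigned char)*expr)) {
                stack[++tos] = *expr - '0';   /* operand: push its value */
            } else {
                int b = stack[tos--];         /* right operand (old TOS) */
                int a = stack[tos--];         /* left operand (TOS - 1)  */
                int r = 0;
                switch (*expr) {
                    case '+': r = a + b; break;
                    case '-': r = a - b; break;
                    case '*': r = a * b; break;
                    case '/': r = a / b; break;
                }
                stack[++tos] = r;             /* push the result back */
            }
        }
        return stack[tos];
    }

    int main(void) {
        printf("%d\n", eval_rpn("34+5*"));    /* (3 + 4) * 5 = 35 */
        return 0;
    }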

RPN Example

Postfix form: ab+
[Figure: step-by-step stack frames – push a, push b, then apply +; the stack ends up holding a + b, matching the infix form a + b.]

RPN Example

Postfix form: ab+c*
[Figure: step-by-step stack frames – push a, push b, apply + (the stack holds a + b), push c, apply *; the stack ends up holding (a + b) * c.]

RPN Example

Postfix form: ab*cde/-*
[Figure: stack frames for the evaluation; the corresponding infix form is (a*b) * (c - (d/e)).]

Machine Model – Stack

● d = (a + b) * c
● RPN: d = ab+c*
● Sequence of instructions:
    PUSH a
    PUSH b
    ADD
    PUSH c
    MULTIPLY
    POP d
[Figure: a sequence of frames showing the stack and memory after each instruction – a and b are pushed, ADD replaces them with a + b, c is pushed, MULTIPLY leaves (a + b) * c on top of the stack, and POP d stores it into memory location d.]
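A minimal C sketch of a stack machine running this sequence against a toy memory (the values a = 2, b = 3, c = 4 are made up for illustration):

    #include <stdio.h>

    /* Toy memory: a, b, c, d live at fixed addresses 0..3. */
    enum { A, B, C, D };
    static int mem[4] = { 2, 3, 4, 0 };   /* a = 2, b = 3, c = 4, d = ? */

    static int stack[16];
    static int tos = -1;

    static void push(int addr)   { stack[++tos] = mem[addr]; }              /* PUSH x   */
    static void pop_to(int addr) { mem[addr] = stack[tos--]; }              /* POP x    */
    static void add(void)        { int b = stack[tos--]; stack[tos] += b; } /* ADD      */
    static void multiply(void)   { int b = stack[tos--]; stack[tos] *= b; } /* MULTIPLY */

    int main(void) {
        /* d = (a + b) * c, as in the slide's instruction sequence */
        push(A);
        push(B);
        add();
        push(C);
        multiply();
        pop_to(D);
        printf("d = %d\n", mem[D]);   /* (2 + 3) * 4 = 20 */
        return 0;
    }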

Stack based Machines

● Burroughs B5000 (1960) ● Forth machine ● JVM, Intel x87 floating-point unit

Accumulator Based Machine Model

● One operand is implicit – the accumulator.
● The other operand is brought in from memory.
● The result of an operation is always stored in the accumulator.
● Instruction: ADD x
● Example: d = (a + b) * c
[Figure: the accumulator feeding the ALU, with the second operand x coming from memory.]

Accumulator Based Machine Model

● d = (a + b) * c
● Sequence of instructions:
    LOAD a
    ADD b
    MULTIPLY c
    STORE d
● LOAD: transfer data from memory into the processor; the accumulator is the implicit destination.
● STORE: transfer data from the processor into memory; the accumulator is the implicit source.
[Figure: frames showing the accumulator holding a, then a + b, then (a + b) * c, which STORE d finally writes into memory location d.]
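A matching C sketch of an accumulator machine running LOAD/ADD/MULTIPLY/STORE against the same toy memory (again with illustrative values a = 2, b = 3, c = 4):

    #include <stdio.h>

    /* Toy accumulator machine: one implicit operand (the accumulator),
     * the other operand comes from memory. */
    enum { A, B, C, D };
    static int mem[4] = { 2, 3, 4, 0 };   /* a = 2, b = 3, c = 4, d = ? */
    static int acc;                        /* the accumulator */

    static void load(int addr)     { acc = mem[addr]; }    /* LOAD x     */
    static void add(int addr)      { acc += mem[addr]; }   /* ADD x      */
    static void multiply(int addr) { acc *= mem[addr]; }   /* MULTIPLY x */
    static void store(int addr)    { mem[addr] = acc; }    /* STORE x    */

    int main(void) {
        /* d = (a + b) * c, as in the slide's sequence */
        load(A);
        add(B);
        multiply(C);
        store(D);
        printf("d = %d\n", mem[D]);   /* (2 + 3) * 4 = 20 */
        return 0;
    }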

Accumulator Based Machines

● IBM 701 (1952) ● PDP-8, PDP-12 ● Intel 4004, 8008, 8080, 8086 … ● Intel x86 processors still use primary accumulator EAX and secondary accumulator EDX for multiplication and division of large numbers (MUL ECX)

Register–Memory Machine Models

● Small units of memory that hold data/instructions temporarily during execution
● Each register is identified by a number – R0, R1, …, R31
● All the registers together make up a register file
[Figure: a register file with registers R0 through R31.]

Register–Memory Machine Models

● The register file supplies one operand.
● Memory supplies the other operand.
● The result is stored back in the register file.
● No implicit operands.
● d = (a + b) * c
[Figure: the register file and memory both feeding the ALU, with the result written back to the register file.]

Register–Memory Machine Models

● d = (a + b) * c
● Sequence of instructions:
    LOAD R1, a
    ADD R2, R1, b
    MULTIPLY R3, R2, c
    STORE R3, d
● LOAD R1, a – source in memory (a), destination in the register file (R1).
● STORE R3, d – source in the register file (R3), destination in memory (d).
[Figure: frames showing R1 = a, R2 = a + b, R3 = (a + b) * c, and finally (a + b) * c stored into memory location d.]

Register–Register Machine Model

● No implicit operands.
● Both operands are supplied from the register file.
● Memory is accessed only through LOAD and STORE instructions.
● d = (a + b) * c
[Figure: the ALU reading both operands from the register file and writing the result back to it; memory is reached only via loads and stores.]
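A possible instruction sequence for d = (a + b) * c in this load–store style, written in the same notation as the earlier slides (the exact mnemonics are an assumption):

    LOAD R1, a            ; R1 <- Mem[a]
    LOAD R2, b            ; R2 <- Mem[b]
    ADD R3, R1, R2        ; R3 <- R1 + R2
    LOAD R4, c            ; R4 <- Mem[c]
    MULTIPLY R5, R3, R4   ; R5 <- R3 * R4
    STORE R5, d           ; Mem[d] <- R5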

Machine Models – Comparison

● Number of explicitly named operands ● Number of instructions that can access data from memory ● Code size ● Amount of data transferred between memory and processor ● Complexity of hardware ● Ease of compilation (ease of generation of machine code).

Machine Models – Memory Operands

Number of memory addresses    Max. operands allowed    Type of architecture    Examples
0                             3                        Load-store              Alpha, ARM, MIPS, PowerPC, SPARC, SuperH, TM32
1                             2                        Register-memory         IBM 360/370, Intel x86, Motorola 68000, TI TMS320C54x
2                             2                        Memory-memory           VAX
3                             3                        Memory-memory           VAX

Machine Models – Memory Operands

Register-Register (0, 3)
  Advantages: Simple, fixed-length instruction encoding. Simple code-generation model. Instructions take similar numbers of clocks to execute.
  Disadvantages: Higher instruction count than architectures with memory references in instructions. More instructions and lower instruction density lead to larger programs.

Register-Memory (1, 2)
  Advantages: Data can be accessed without a separate load. Instruction format is easy to encode. Good density.
  Disadvantages: A source operand is destroyed. Encoding a register number and a memory address in each instruction may restrict the number of registers. Clocks per instruction vary.

Memory-Memory (2, 2) or (3, 3)
  Advantages: Most compact. Doesn't waste registers for temporaries.
  Disadvantages: Large variations in instruction size, especially for three-operand instructions. Large variation in work per instruction. Memory accesses create a bottleneck.

C = A + B

STACK          ACCUMULATOR    REGISTER-MEMORY    REGISTER-REGISTER

[Figure: the datapath for each model – the stack (with TOS), the accumulator, the register-memory machine, and the register-register machine, each built around an ALU.]

Push A         Load A         Load R1, A         Load R1, A
Push B         Add B          Add R3, R1, B      Load R2, B
Add            Store C        Store R3, C        Add R3, R1, R2
Pop C                                            Store R3, C