EECS 470 Lecture 2 Instruction Set Architecture
© Wenisch 2007 -- Portions © Austin, Brehob, Falsafi, Hill, Hoe, Lipasti, Martin, Roth, Shen, Smith, Sohi, Tyson, Vijaykumar

Fall 2007
Prof. Thomas Wenisch
http://www.eecs.umich.edu/courses/eecs470/

Slides developed in part by Profs. Austin, Brehob, Falsafi, Hill, Hoe, Lipasti, Shen, Smith, Sohi, Tyson, Vijaykumar, and Wenisch of Carnegie Mellon University, Purdue University, University of Michigan, and University of Wisconsin.

Announcements

Reminders:
- HW #1 due Friday 9/14: hand in at start of discussion
- Programming assignment #1 due Friday 9/14: electronic hand-in by midnight

Readings

For today:
- Task of the Referee. A.J. Smith
- H & P Chapter 1
- H & P Appendix B
For Wednesday:
- Cramming More Components onto ICs. G.E. Moore
- H & P Chapter A.1-A.6

Performance – Key Points

Amdahl’s law:
  $S_{overall} = \frac{1}{(1-f) + f/S}$

Iron law:
  $\frac{Time}{Program} = \frac{Instructions}{Program} \times \frac{Cycles}{Instruction} \times \frac{Time}{Cycle}$

Averaging Techniques:
  Mean        Use for  Formula
  Arithmetic  Time     $\frac{1}{n}\sum_{i=1}^{n} Time_i$
  Harmonic    Rates    $\frac{n}{\sum_{i=1}^{n} \frac{1}{Rate_i}}$
  Geometric   Ratios   $\sqrt[n]{\prod_{i=1}^{n} Ratio_i}$

Digital System Cost

- Cost is also a key design constraint
  - Architecture is about trade-offs
  - Cost plays a major role
- Huge difference between Cost & Price
  - E.g., Higher Price → Lower Volume → Higher Cost → Higher Price
  - Direct Cost
  - Gross Margin
  - List vs. Selling Price
- Price also depends on the customer
  - College student vs. US Government
- From $ to $$$$$$$: Embedded, Portables, Desktops, Servers, Supercomputer

Direct Cost

Cost distribution for a $1K PC:
  Processor board  37%  (CPU, memory)
  I/O devices      37%  (Hard disk, DVD, monitor, …)
  Software         20%
  Tower/cabinet     6%
Integrated systems account for a substantial fraction of cost

IC Cost Equation

  $\text{IC cost} = \frac{\text{Die cost} + \text{Test cost} + \text{Packaging cost}}{\text{Final test yield}}$

  $\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies/wafer} \times \text{Die yield}}$

  $\text{Die yield} = \text{Wafer yield} \times (1 + f(\text{defect density}, \text{die area}))$

Power & Energy

Why do we care?
- Electricity cost (energy)
- IC packaging cost (power)
- Battery life (energy)
- Noise from fans (power)
- Environment impact (both)
- Chips melting (power)
Trends
- More transistors per die
- Higher transistor density
- Increased leakage current
- Higher server density

Other Design Objectives

Current circuit trends → lots & lots of lousy transistors ca. 2010
Complexity
- 1G transistor designs too big (10x-100x longer to verify)
- Wire latency dominates large designs → slower clock growth
Reliability
- Smaller transistors => storage prone to particle radiation
- e.g., recently a vendor recalled large servers w/o ECC in L2!
- soon, 1 radiation-induced error per cross-country flight!
Portability, Programmability, Utility, …
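
As a worked illustration of the formulas on the "Performance – Key Points" slide, the following is a minimal Python sketch of Amdahl's law, the iron law, and the three averaging techniques. The function and parameter names are hypothetical (they do not come from the slides), and the sample numbers in the usage section are made up purely for demonstration.

```python
from math import prod


def amdahl_speedup(f, s):
    """Amdahl's law: overall speedup when a fraction f of the
    execution time is sped up by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)


def execution_time(instructions, cycles_per_instruction, seconds_per_cycle):
    """Iron law: time/program =
    instructions/program * cycles/instruction * time/cycle."""
    return instructions * cycles_per_instruction * seconds_per_cycle


def arithmetic_mean(times):
    """Arithmetic mean, appropriate for averaging times."""
    return sum(times) / len(times)


def harmonic_mean(rates):
    """Harmonic mean, appropriate for averaging rates."""
    return len(rates) / sum(1.0 / r for r in rates)


def geometric_mean(ratios):
    """Geometric mean, appropriate for averaging ratios."""
    return prod(ratios) ** (1.0 / len(ratios))


if __name__ == "__main__":
    # Illustrative inputs only; none of these values come from the lecture.
    print("Amdahl speedup:", amdahl_speedup(f=0.5, s=2.0))
    print("Execution time (s):", execution_time(1e9, 2.0, 1e-9))
    print("Arithmetic mean of times:", arithmetic_mean([2.0, 4.0]))
    print("Harmonic mean of rates:", harmonic_mean([1.0, 3.0]))
    print("Geometric mean of ratios:", geometric_mean([2.0, 8.0]))
```

Note how Amdahl's law caps the benefit: with f = 0.5, no value of s can push the overall speedup past 2.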

In Summary: Performance & Cost

- It doesn’t make sense to talk about performance alone
- Computer design/engineering is about trade-offs
- Complex relationship between different performance and cost metrics
  - performance: power, reliability, portability, programmability cost, lifetime ...
  - cost: NRE cost, unit cost, list vs. selling cost

Engineering ≡ the profession in which a knowledge of the mathematical and natural sciences, gained by study, experience, and practice, is applied with judgment to develop ways to utilize, economically, the materials and forces of nature for the benefit of mankind.
  -- Accreditation Board for Engineering and Technology

Instruction Set Architecture

“Instruction set architecture (ISA) is the structure of a computer that a machine language programmer (or a compiler) must understand to write a correct (timing independent) program for that machine”
  -- IBM, introducing the 360 in 1964

- IBM 360 is a family of binary-compatible machines with distinct microarchitectures and technologies, ranging from Model 30 (8-bit datapath, up to 64KB memory) to Model 70 (64-bit datapath, 512KB memory) and later Model 360/91 (the Tomasulo).
- IBM 360 replaced 4 concurrent, but incompatible lines of IBM architectures developed over the previous 10 years

ISA: a contract between HW and SW

- Programmer visible states
  - program counter, GPR, memory, status/cntrl
- Programmer visible behaviors (state transitions)
  - what to perform?
  - where are the operands?
  - what to perform next?

    if imem[pc] == “add rd, rs, rt” then
        pc ⇐ pc + 1
        gpr[rd] ⇐ gpr[rs] + gpr[rt]

- A binary encoding
Expected lifetime ~25 years (because of SW cost)

Typical Instructions (Opcodes)

  Type                    Example Instruction
  Arithmetic and logical  and, add
  Data transfer           move, load
  Control                 branch, jump, call, return
  System                  trap, rett
  Floating point          add, mul, div, sqrt
  Decimal                 addd, convert
  String                  move, compare

What operations are necessary? {sub, ld & st, conditional br.}
Too little or too simple → not expressive enough
- difficult to program (by hand)
- programs tend to be bigger
Too much or too complex → most of it won’t be used
- too much “baggage” for implementation
- difficult choices during compiler optimization

ALU Instructions

ALU instructions combine operands
Number of explicit operands
- two, ri = ri op rj
- three, rk = ri op rj
- What about zero, one and four?
Orthogonality of operands: registers or memory (+ addressing modes)
- any combo, orthogonal, e.g., VAX
- at least one register, not orthogonal, e.g., IBM 360/370
- all registers, orthogonal but loads/stores, e.g., Cray, RISCs
Why is orthogonality good?
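
The state-transition rule on the "ISA: a contract between HW and SW" slide above can be sketched as a tiny interpreter. This is a minimal illustration, not a real ISA: the `MachineState` class, the `step` function, the tuple encoding `("add", rd, rs, rt)`, and the choice of 8 registers are all assumptions made for this example. Only the `add` behavior itself comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class MachineState:
    """Programmer-visible state: program counter, general-purpose
    registers, and instruction memory (hypothetical layout)."""
    pc: int = 0
    gpr: list = field(default_factory=lambda: [0] * 8)  # 8 registers assumed
    imem: list = field(default_factory=list)


def step(state):
    """Apply one state transition for the instruction at imem[pc]."""
    op, rd, rs, rt = state.imem[state.pc]
    if op == "add":
        # gpr[rd] <= gpr[rs] + gpr[rt], as on the slide
        state.gpr[rd] = state.gpr[rs] + state.gpr[rt]
    else:
        raise ValueError(f"unsupported opcode in this sketch: {op}")
    # pc <= pc + 1: answers "what to perform next?"
    state.pc += 1


if __name__ == "__main__":
    machine = MachineState(imem=[("add", 3, 1, 2)])
    machine.gpr[1] = 5
    machine.gpr[2] = 7
    step(machine)
    print("pc =", machine.pc, "gpr[3] =", machine.gpr[3])
```

Nothing in `step` mentions timing, pipelines, or caches, which is the point of the contract: any implementation that produces the same visible state transitions is a correct implementation of the ISA.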

Operand Model: Memory Only

Where (other than memory) can operands come from? And how are they specified?
Example: A = B + C
Several options
Memory only
  add B,C,A       mem[A] = mem[B] + mem[C]

Operand Model: Accumulator

Accumulator: implicit single element storage
  load B          ACC = mem[B]
  add C           ACC = ACC + mem[C]
  store A         mem[A] = ACC

Operand Model: Stack

Stack: TOS implicit in instructions
  push B          stk[TOS++] = mem[B]
  push C          stk[TOS++] = mem[C]
  add             stk[TOS++] = stk[--TOS] + stk[--TOS]
  pop A           mem[A] = stk[--TOS]

Operand Model: Registers

General-purpose register: multiple explicit accumulators
  load B,R1       R1 = mem[B]
  add C,R1        R1 = R1 + mem[C]
  store R1,A      mem[A] = R1
Load-store: GPR and only loads/stores access memory
  load B,R1       R1 = mem[B]
  load C,R2       R2 = mem[C]
  add R1,R2,R1    R1 = R1 + R2
  store R1,A      mem[A] = R1

Operand Model Comparison

Why registers?
- faster access
- shorter/simpler address
Accumulator: a legacy from the “adding” machine days
  ✓ less hardware
  ✗ high memory traffic
  ✗ likely bottleneck
Stack: an intuitive programming construct
  ✓ code density
  ✗ bottleneck while pipelining (why?)
  e.g., Burroughs's Stack Machine (ALGOL), x86 floating-point, JAVA VM

Operand Storage

General Purpose Registers (8 to 256 words): ✓