
University of Wisconsin-Madison CS/ECE 752 Advanced Architecture I

Professor Matt Sinclair

Unit 1: Technology, Cost, Performance, & Power

Slides developed by Milo Martin & Amir Roth at the University of Pennsylvania with sources that included University of Wisconsin slides by Mark Hill, Guri Sohi, Jim Smith, and David Wood.

Slides enhanced by Milo Martin, Mark Hill, and David Wood with sources that included Profs. Asanovic, Falsafi, Hoe, Lipasti, Sankaralingam, Shen, Smith, Sohi, Vijaykumar, and Wood

Announcements

• Manually enrolled everyone in the course on Piazza
• Using this for all announcements and questions moving forward
• Poll: did everyone receive the email sent to the class mailing list?
• Poll: who is interested in a lecture from the Condor team?
  • Will post poll on Piazza
• Lecture slides posted – see Course Schedule
• Reminder: HW0 due next Wednesday

This Unit

• What is a computer, and what is computer architecture?

• Forces that shape computer architecture
  • Applications (covered last time)
  • Technology

• Evaluation metrics: parameters and technology basis
  • Cost
  • Performance
  • Power
  • Reliability

Readings

• H&P
  • 1.4 – 1.7 (technology)

• Paper
  • G. Moore, “Cramming More Components onto Integrated Circuits”

What is Computer Architecture? (review)

• Design of interfaces and implementations…
• …under a constantly changing set of external forces…
  • Applications: change from above (discussed last time)
  • Technology: changes characteristics from below
  • Inertia: resists changing all levels of system at once
• …to satisfy different constraints
  • This course: mostly about performance
  • Cost
  • Power
  • Reliability

• Iterative process, driven by empirical evaluation
• The art/science of tradeoffs (“it depends”)

Abstraction and Layering

• Abstraction: the only way of dealing with complex systems
  • Divide world into objects, each with an…
    • Interface: knobs, behaviors, knobs → behaviors
    • Implementation: “black box” (ignorance + apathy)
  • Specialists deal with implementations; everyone else deals only with interfaces
  • Example: car drivers vs. mechanics

• Layering: abstraction discipline makes life even easier
  • Removes need to even know interfaces of most objects
  • Divide objects in system into layers
    • Layer X objects are implemented in terms of interfaces of layer X−1 objects
    • Don’t even need to know interfaces of layer X−2 objects

Abstraction, Layering, & Computers

• Computers are complex systems, built in layers
  • Applications
  • O/S, compiler
  • Firmware, device drivers
  • CPU, memory, raw I/O devices
  • Digital circuits, digital/analog converters
  • Gates
  • Transistors
• 99% of users don’t know the implementation of the hardware layers
• 90% of users don’t know the implementation of any layer
• That’s OK – computers still work just fine
  • But unfortunately, the layers sometimes break down
  • Someone needs to understand what’s “under the hood”

Gray box: peering through the layers

• Layers of abstraction in a car
  • Interface (drivers): steering wheel, clutch, shift, brake
  • Implementation (mechanic): engine, fuel injection, transmission
  • But high-performance drivers know the torque curve
    • Achieve maximum performance: keep RPM in the range where torque is maximized

• Similar examples for computers
  • Cache organization/locality
  • Pipeline scheduling/interlocks
  • Power users peek across layers

A Computer Architecture Picture

[Figure: system stack – Application, OS, Compiler, Firmware (software); Instruction Set Architecture (ISA); CPU, I/O, Memory, Digital Circuits, Gates & Transistors (hardware)]

• Computer architecture
  • Definition of the ISA to facilitate implementation of software layers
• This course is mostly about computer micro-architecture
  • Design of CPU, memory, and I/O to implement the ISA…

Technology Basis of Clock Frequency

Semiconductor Technology Background

• The transistor (1947): a key invention of the 20th century
• Fabrication

[Figure: system stack with Digital Circuits and Gates & Transistors highlighted]

Shaping Force: Technology

• Basic technology element: MOSFET
  • MOS: metal-oxide-semiconductor
    • Conductor, insulator, semi-conductor
  • FET: field-effect transistor
    • Solid-state component that acts like an electrical switch
    • Channel conducts source → drain when voltage is applied to gate

[Figure: MOSFET – gate over channel, between source and drain]

• Channel length: characteristic parameter (short → fast)
  • Aka “feature size” or “technology”
  • Currently: 10–12 nm (0.010–0.012 micron)
  • Continued miniaturization (scaling) known as “Moore’s Law”
    • Won’t last forever; physical limits approaching (or are they?)

Transistor implementations are complex

Simple CMOS Interface

• Voltage as values
  • Power (VDD) = 1, Ground = 0

• Two kinds of transistors
  • N-transistors
    • Conduct when gate voltage is 1
    • Good at passing 0s
  • P-transistors
    • Conduct when gate voltage is 0
    • Good at passing 1s

[Figure: CMOS inverter – p-transistor between Power (1) and the output “node”, n-transistor between the output and Ground (0), input driving both gates]

• CMOS: complementary n-/p-networks form boolean logic

CMOS Examples

• Example I: inverter
  • Case I: input = 0
    • P-transistor closed, n-transistor open
    • Power charges output (1)
  • Case II: input = 1
    • P-transistor open, n-transistor closed
    • Output discharges to ground (0)

• Example II: look at the truth table
  • 0, 0 → 1; 0, 1 → 1
  • 1, 0 → 1; 1, 1 → 0
  • Result: this is a NAND (NOT AND)
  • NAND is universal (can build any logic function)
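To make the complementary-network idea concrete, here is a minimal Python sketch (an illustration, not from the slides) that models a 2-input NAND’s pull-up network (two p-transistors in parallel) and pull-down network (two n-transistors in series) and reproduces the truth table above:

```python
# Toy model of a 2-input CMOS NAND gate (a sketch, not a circuit simulator).
def cmos_nand(a: int, b: int) -> int:
    pull_up = (a == 0) or (b == 0)     # parallel p-transistors conduct on gate = 0
    pull_down = (a == 1) and (b == 1)  # series n-transistors conduct on gate = 1
    assert pull_up != pull_down        # complementary: exactly one network conducts
    return 1 if pull_up else 0         # pull-up connects output to power (1)

for a in (0, 1):
    for b in (0, 1):
        print(f"{a}, {b} -> {cmos_nand(a, b)}")  # 0,0->1  0,1->1  1,0->1  1,1->0
```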

Transistor Speed, Power, and Reliability

• Transistor characteristics and scaling impact
  • Switching speed
  • Power
  • Reliability

• “Undergrad” gate delay model for architecture
  • Each NOT, NAND, NOR, AND, OR gate has a delay of “1”
  • Reality: much more complex
    • Full answer requires 2nd-order differential equations
    • Need something easier to reason about

Simple RC Delay Model

• Switching time is modeled as an RC circuit (charge or discharge)
  • R – Resistance: slows rate of current flow
    • Depends on material, length, cross-section area
  • C – Capacitance: electrical charge storage
    • Depends on material, area, distance
  • Voltage affects speed, too

[Figure: one inverter driving another; as the input switches 0→1, current I discharges the output node 1→0]

Resistance

• Transistor channel resistance
  • Function of Vg (gate voltage)

• Wire resistance
  • Negligible for short wires

Capacitance

• Source/drain capacitance
• Gate capacitance
• Wire capacitance
  • Negligible for short wires

RC Delay Model Ramifications

• Want to reduce resistance
  • Wide drive transistors (width specified per device)
  • Short gate length
  • Short wires

• Want to reduce capacitance
  • Fewer connected devices
  • Less-wide transistors (gate capacitance of the next stage)
  • Short wires
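As a rough illustration of the model (the R and C values below are assumed, order-of-magnitude numbers, not figures from the slides), first-order delay is just the product of resistance and capacitance, which makes the sizing tradeoff easy to see:

```python
# Back-of-the-envelope RC gate delay with assumed, order-of-magnitude values.
R_CHANNEL = 10e3   # ohms: assumed effective on-resistance of a small transistor
C_LOAD    = 1e-15  # farads: assumed gate + wire capacitance of the next stage

tau = R_CHANNEL * C_LOAD                         # one RC time constant
print(f"RC time constant: {tau * 1e12:.1f} ps")  # -> 10.0 ps

# Tradeoff from the slide: a 2x-wide driver halves R, but if the next stage
# is also made 2x wide, its gate capacitance doubles and delay is unchanged.
print(f"Both 2x wide: {(R_CHANNEL / 2) * (C_LOAD * 2) * 1e12:.1f} ps")  # 10.0 ps
```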

Improving RC Delay

• Exploit good effects of scaling
• Fabrication technology improvements
  • Use copper instead of aluminum for wires (ρ↓ → Resistance ↓)


Transistors and Wires (© IBM)
From slides © Krste Asanović, MIT

Transistor Geometry: Width

[Figure: transistor geometry – gate, source, drain on bulk Si; width and length labeled. Diagrams © Krste Asanovic, MIT]

• Transistor width: set by the designer for each transistor
• Wider transistors:
  • Lower channel resistance (increases drive strength) – good!
  • But increased gate/source/drain capacitance – bad!
• Result: set width to balance these conflicting effects

Transistor Geometry: Length & Scaling

[Figure: same transistor geometry diagram. Diagrams © Krste Asanovic, MIT]

• Transistor length: characteristic of the “process generation”
  • “45nm” refers to the transistor gate length, the same for all transistors
• Shrink transistor length:
  • Lower channel resistance (shorter) – good!
  • Lower gate/source/drain capacitance – good!
• Result: switching speed improves linearly as gate length shrinks

Wire Geometry

[Figure: wire geometry – length, width, height, and pitch labeled; IBM CMOS7, 6 layers of copper wiring]

• Transistors are 1-dimensional for design purposes: width
• Wires are 4-dimensional: length, width, height, “pitch”
  • Longer wires have more resistance
  • “Thinner” wires have more resistance
  • Closer wire spacing (“pitch”) increases capacitance

From slides © Krste Asanovic, MIT

Increasing Problem: Wire Delay

• RC delay of wires
  • Resistance proportional to: resistivity × length / (cross-section area)
    • Wires with smaller cross sections have higher resistance
    • Resistivity depends on the type of metal (copper vs. aluminum)
  • Capacitance proportional to length
    • And to wire spacing (closer wires have larger capacitance)
    • And to the permittivity or “dielectric constant” (of the material between wires)

• Result: delay of a wire is quadratic in its length
  • Insert “inverter” repeaters for long wires
    • Why? To bring delay back to linear… but the repeaters themselves add delay
• Trend: wires are getting slow relative to transistors
  • And take relatively longer to cross relatively larger chips
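A small sketch of why this happens (the per-mm R and C and the repeater delay below are arbitrary illustrative units, not process data): total wire R and C both grow with length, so un-repeated delay grows as length squared, while repeaters restore linearity at the cost of a fixed overhead per segment:

```python
# Un-repeated wire delay is quadratic in length; repeaters make it linear.
R_PER_MM = 1.0   # assumed wire resistance per mm (arbitrary units)
C_PER_MM = 1.0   # assumed wire capacitance per mm (arbitrary units)
T_REP    = 0.5   # assumed fixed delay added by each repeater

def unrepeated_delay(length_mm: float) -> float:
    return (R_PER_MM * length_mm) * (C_PER_MM * length_mm)  # ~ length^2

def repeated_delay(length_mm: float, segment_mm: float = 1.0) -> float:
    n_segments = int(length_mm / segment_mm)
    return n_segments * (unrepeated_delay(segment_mm) + T_REP)  # ~ length

for length in (1, 2, 4, 8):
    print(length, unrepeated_delay(length), repeated_delay(length))
# Un-repeated: 1, 4, 16, 64 (quadratic); repeated: 1.5, 3.0, 6.0, 12.0 (linear)
```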

More About CMOS and Technology

• Two different CMOS families
  • SRAM (logic): used to make processors
    • Storage implemented as inverter pairs
    • Optimized for speed
  • DRAM (memory): used to make memory
    • Storage implemented as capacitors
    • Optimized for density, cost, power

• Flash is also a technology, but we’ll discuss it later in this course
• Disk is also a “technology”, but isn’t transistor-based

Fabrication & Cost

Cost

• Metric: $

• In the grand scheme: the CPU accounts for a fraction of total cost
  • Some of that is profit (Intel’s, Dell’s)

                 Desktop      Laptop       Netbook     Phone
    $            $100–$300    $150–$350    $50–$100    $10–$20
    % of total   10–30%       10–20%       20–30%      20–30%
    Other costs: memory, display, power supply/battery, storage, software

• We are concerned with chip cost
  • Unit cost: cost to manufacture individual chips
  • Startup cost: cost to design the chip and build the manufacturing facility

Cost versus Price

• Cost: cost to the manufacturer, cost to produce
• What is the relationship of cost to price?
  • Complicated; it has to do with volume and competition

• Commodity: high-volume, un-differentiated, un-branded
  • “Un-differentiated”: copper is copper, wheat is wheat
  • “Un-branded”: consumers aren’t allied to a manufacturer brand
  • Commodity prices track costs closely
  • Example: DRAM (used for main memory) is a commodity
    • Do you even know who manufactures DRAM?

• Microprocessors are not commodities
  • Specialization, compatibility, different cost/performance/power
  • Complex relationship between price and cost

Manufacturing Steps

Source: P&H

Manufacturing Steps

• Multi-step photo-/electro-chemical process
  • More steps, higher unit cost
  • Plus a fixed cost for mass production ($1 million+ for a “mask set”)

Manufacturing Defects

• Defects can arise
  • Under-/over-doping
  • Over-/under-dissolved insulator
  • Mask mis-alignment
  • Particle contaminants

• Try to minimize defects
  • Process margins
  • Design rules
    • Minimal transistor size, separation

• Or, tolerate defects
  • Redundant or “spare” memory cells
  • Can substantially improve yield

[Figure: example dies labeled correct, defective, and slow]

Unit Cost: Integrated Circuit (IC)

• Chips built in multi-step chemical processes on wafers
  • Cost per wafer is roughly constant: f(wafer size, number of steps)
• Chip (die) cost is related to area
  • Larger chips mean fewer of them per wafer
  • Cost is more than linear in area
    • Why? Random defects: larger chips mean fewer working ones
  • Chip cost ~ (chip area)^α, with α = 2 to 3 (see the sketch below)
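A toy cost model makes the superlinearity visible (all constants below – wafer cost, usable area, defect density – are assumed for illustration, not figures from the slides):

```python
import math

# Toy die-cost model: bigger dies are both fewer per wafer and less likely
# to be defect-free, so cost per working die grows superlinearly in area.
WAFER_COST = 10_000.0    # assumed cost per processed wafer, $
WAFER_AREA = 70_000.0    # assumed usable area of a 300 mm wafer, mm^2
DEFECT_DENSITY = 0.005   # assumed random defects per mm^2

def die_cost(area_mm2: float) -> float:
    dies_per_wafer = WAFER_AREA / area_mm2             # fewer dies per wafer
    yield_frac = math.exp(-DEFECT_DENSITY * area_mm2)  # simple Poisson yield
    return WAFER_COST / (dies_per_wafer * yield_frac)  # $ per *working* die

for area in (100, 200, 400):
    print(f"{area:3d} mm^2 die: ${die_cost(area):7.2f}")
# ~ $23.55, $77.66, $422.23: quadrupling the area raises cost ~18x here,
# consistent with the slide's cost ~ area^alpha for alpha between 2 and 3.
```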

• Wafer yield: % of the wafer that becomes chips
• Die yield: % of chips that work
  • Yield is increasingly non-binary – fast vs. slow chips

Additional Unit Cost

• After manufacturing, there are additional unit costs
  • Testing: how do you know the chip is working?
  • Packaging: high-performance packages are expensive
    • Determined by maximum operating temperature
    • And by the number of external pins (off-chip bandwidth)
  • Burn-in: stress-test the chip (detects unreliable chips early)
  • Re-testing: how do you know packaging/burn-in didn’t damage the chip?

Fixed Costs

• For a new chip design
  • Design & verification: ~$100M (500 person-years @ $200K per)
  • Amortized over “proliferations”, e.g., Core i3, i5, i7 variants

• For a new (smaller) technology generation
  • ~$3B for a new fab
  • Amortized over multiple designs
  • Amortized by “rent” from companies that don’t fab themselves

– Moore’s Law generally increases startup cost
  • More expensive fabrication equipment
  • More complex chips take longer to design and verify

All Roads Lead To Multi-Core

+ Multi-cores reduce unit costs
  • Higher yield than same-area single-cores
    • Why? A defect on one of the cores? Sell the remaining cores for less
  • IBM manufactures the CBE (“Cell processor”) with eight cores
    • But PlayStation 3 software is written for seven cores
    • Yield for eight working cores is too low
  • Sun manufactures Niagaras (UltraSPARC T1) with eight cores
    • Also sells six- and four-core versions (for less)

+ Multi-cores can reduce design costs too
  • Replicate existing designs rather than re-design larger single-cores

Technology Scaling

In-Class Activity

• With a partner, answer the following questions:
  • If you were reading this article in 1965, would you believe in Moore’s Law? Why?
  • What does the publication source say about the paper?
  • Why did Moore’s Law change from every year to every 18–24 months?

• In 5 minutes we’ll come back together and discuss as a class

Moore’s Law: Technology Scaling

[Figure: MOSFET cross-section – gate over channel, between source and drain]

• Moore’s Law: aka “technology scaling”
  • Continued miniaturization (esp. reduction in channel length)
  + Improves switching speed, power/transistor, area (cost)/transistor
  – Reduces transistor reliability
• Literally: DRAM density (transistors/area) doubles every 18 months
• Public interpretation: performance doubles every 18 months
  • Not quite right, but scaling helps performance in three ways

Moore’s Effect #1: Transistor Count

• Linear shrink in each dimension
  • 180nm, 130nm, 90nm, 65nm, 45nm, 32nm, …
  • Each generation is a 1.414x linear shrink
  • Shrinking each dimension (2D) results in 2x more transistors (1.414 × 1.414)

• Reduces cost per transistor

• More transistors can increase performance
  • Job of a computer architect: use the ever-increasing number of transistors
  • Examples: caches, exploiting parallelism at all levels
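The per-generation arithmetic from this slide, checked in a few lines of Python:

```python
# Each process generation shrinks both dimensions by ~1.414 (sqrt(2)),
# which doubles the number of transistors that fit in the same area.
nodes_nm = [180, 130, 90, 65, 45, 32]

for prev, cur in zip(nodes_nm, nodes_nm[1:]):
    linear_shrink = prev / cur          # per-dimension shrink factor
    density_gain = linear_shrink ** 2   # 2D shrink -> transistor density gain
    print(f"{prev}nm -> {cur}nm: {linear_shrink:.2f}x linear, "
          f"{density_gain:.2f}x transistors per area")
# Every step is ~1.4x linear, i.e. ~2x transistors in the same die area.
```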

Moore’s Effect #2: RC Delay

• First-order: speed scales proportionally to gate length
  • Has provided much of the performance gain in the past
• Scaling helps wire delays in some ways…
  + Transistors become shorter (Resistance ↓) and narrower (Capacitance ↓)
  + Wires become shorter (Length ↓ → Resistance ↓)
  + Wire “surface areas” become smaller (Capacitance ↓)
• …and hurts in others
  – Transistors become narrower (Resistance ↑)
  – Gate insulator thickness becomes smaller (Capacitance ↑)
  – Wires become thinner (Resistance ↑)
• What to do?
  • Take the good; use wire/transistor sizing & repeaters to mitigate the bad
  • Exploit new materials: Aluminum → Copper, metal gate, high-K

Moore’s Effect #3: Cost

• Mixed impact on unit integrated circuit cost
  + Either lower cost for the same functionality…
  + Or the same cost for more functionality
  – Difficult to achieve high yields

– Increases startup cost
  • More expensive fabrication equipment
  • Takes longer to design, verify, and test chips

– Process variation across the chip is increasing
  • Some transistors slow, some fast
  • Increasingly active research area: dealing with this problem

Moore’s Effect #4: Psychological

• Moore’s Curve: common interpretation of Moore’s Law
  • “CPU performance doubles every 18 months”
  • Self-fulfilling prophecy: 2X every 18 months is ~1% per week
  • Q: Would you add a feature that improved performance 20% if it would delay the chip 8 months?
• Processors under Moore’s Curve (arrive too late) fail spectacularly
  • E.g., Intel’s Itanium, Sun’s Millennium

Moore’s Law in the Future

• Won’t last forever; approaching physical limits
  • “If something must eventually stop, it can’t go on forever”
  • But betting against it has proved foolish in the past
  • Perhaps it will “slow” rather than stop abruptly

• Transistor count will likely continue to scale
  • “Die stacking” is on the cusp of becoming mainstream
  • Uses the third dimension to increase transistor count

• But transistor performance scaling?
  • Running into physical limits
  • Example: gate oxide is less than 10 silicon atoms thick!
    • Can’t decrease it much further
  • Power is becoming a limiting factor

Summary

Technology Summary

• Technology has a first-order impact on computer architecture
  • Cost (die area)
  • Performance (transistor delay, wire delay)
  • Changing rapidly
• Most significant trends for architects (and thus 752) – the rest of the semester:
  • More and more transistors
    • What to do with them? → integration → parallelism
  • Logic is improving faster than memory & cross-chip wires
    • “Memory wall” → caches, more integration
  • Power and energy
    • Voltage vs. frequency, parallelism, special-purpose hardware
• This unit: a quick overview, just scratching the surface

Performance


• Two definitions
  • Latency (execution time): time to finish a fixed task
  • Throughput (bandwidth): number of tasks in a fixed time
  • Very different: throughput can exploit parallelism, latency can’t
    • Ex: baking bread
  • Often contradictory
  • Choose the definition of performance that matches your goals

• Example: move people from A → B, 10 miles
  • Car: capacity = 5, speed = 60 miles/hour
  • Bus: capacity = 60, speed = 20 miles/hour
  • Latency: car = 10 min, bus = 30 min
  • Throughput: car = 15 PPH (counting the return trip), bus = 60 PPH

Performance Improvement

• Processor A is X times faster than processor B if
  • Latency(P, A) = Latency(P, B) / X
  • Throughput(P, A) = Throughput(P, B) × X
• Processor A is X% faster than processor B if
  • Latency(P, A) = Latency(P, B) / (1 + X/100)
  • Throughput(P, A) = Throughput(P, B) × (1 + X/100)

• Car/bus example
  • Latency? The car is 3 times (and 200%) faster than the bus
  • Throughput? The bus is 4 times (and 300%) faster than the car
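The same definitions as a few lines of Python, applied to the car/bus numbers from the previous slide:

```python
# Speedup helpers for the two metrics: latency (lower is better) and
# throughput (higher is better).
def times_faster(metric_b: float, metric_a: float, lower_is_better: bool) -> float:
    return metric_b / metric_a if lower_is_better else metric_a / metric_b

car_lat, bus_lat = 10.0, 30.0      # latency in minutes
car_tput, bus_tput = 15.0, 60.0    # throughput in people per hour (PPH)

x = times_faster(bus_lat, car_lat, lower_is_better=True)
print(f"Latency: car is {x:.0f}x ({(x - 1) * 100:.0f}%) faster")      # 3x, 200%

x = times_faster(car_tput, bus_tput, lower_is_better=False)
print(f"Throughput: bus is {x:.0f}x ({(x - 1) * 100:.0f}%) faster")   # 4x, 300%
```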

What is ‘P’ in Latency(P, A)?

• Program
  • Latency(A) alone makes no sense; a processor executes some program
  • But which one?
• Actual target workload?
  + Accurate
  – Not portable/repeatable, overly specific, hard to pinpoint problems
• Some representative program(s)?
  + Portable/repeatable, pretty accurate
  – Hard to pinpoint problems, may not be exactly what you run
• Some small kernel benchmarks (micro-benchmarks)?
  + Portable/repeatable, easy to run, easy to pinpoint problems
  – Not representative of the complex behaviors of real programs

SPEC Benchmarks

• SPEC (Standard Performance Evaluation Corporation)
  • http://www.spec.org/
  • Consortium that collects, standardizes, and distributes benchmarks
  • Posts SPECmark results for different processors
    • 1 number that represents performance for the entire suite
  • Benchmark suites for CPU, Java, I/O, Web, Mail, etc.
  • Updated every few years, so companies don’t target benchmarks

• SPEC CPU 2006
  • 12 “integer”: bzip2, gcc, perl, hmmer (genomics), h264, etc.
  • 17 “floating point”: wrf (weather), povray, sphinx3 (speech), etc.
  • Written in C/C++ and Fortran
  • SPEC CPU 2017 recently released

SPECmark 2006

• Reference machine: Sun Ultra Enterprise II (@ 296 MHz)
• Latency SPECmark
  • For each benchmark
    • Take an odd number of samples on both machines
    • Choose the median
    • Take the latency ratio (Sun UltraSPARC / your machine)
  • Take the GMEAN of the ratios over all benchmarks
• Throughput SPECmark
  • Run multiple benchmarks in parallel on a multiple-processor system

• Not-so-recent (latency) leaders
  • SPECint: Intel 3.3 GHz Xeon W5590 (34.2)
  • SPECfp: Intel 3.2 GHz Xeon W3570 (39.3)

Other CPU Benchmarks

• Parallel benchmarks
  • SPLASH2: Stanford Parallel Applications for Shared Memory
  • NAS: another parallel benchmark suite
  • SPECopenMP: parallelized versions of SPECfp (2000)
  • SPECjbb: Java multithreaded database-like workload

• Transaction Processing Council (TPC)
  • TPC-C: on-line transaction processing (OLTP)
  • TPC-H/R: decision support systems (DSS)
  • TPC-W: e-commerce database backend workload
  • Have parallelism (intra-query and inter-query)
  • Heavy I/O and memory components

• Recent years: benchmark suites for GPUs (e.g., Rodinia)

Adding/Averaging Performance Numbers

• You can add latencies, but not throughputs
  • Latency(P1+P2, A) = Latency(P1, A) + Latency(P2, A)
  • Throughput(P1+P2, A) != Throughput(P1, A) + Throughput(P2, A)
  • Example: 1 mile @ 30 miles/hour + 1 mile @ 90 miles/hour
    • The average is not 60 miles/hour
    • 0.033 hours at 30 miles/hour + 0.011 hours at 90 miles/hour
    • The average is only 45 miles/hour! (2 miles / (0.033 + 0.011) hours)
  • Throughput(P1+P2, A) = 1 / [(1/Throughput(P1, A)) + (1/Throughput(P2, A))]

• The same goes for means (averages):
  • Arithmetic: (1/N) × Σ_{P=1..N} Latency(P)
    • For units that are proportional to time (e.g., latency)
  • Harmonic: N / Σ_{P=1..N} (1/Throughput(P))
    • For units that are inversely proportional to time (e.g., throughput)
  • Geometric: (∏_{P=1..N} Speedup(P))^(1/N)
    • For unitless quantities (e.g., speedup ratios)
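The three means from this slide in executable form, using the 30/90 mph example to show why the harmonic mean is the right one for rates:

```python
import math

def arithmetic_mean(xs):   # for time-proportional units (e.g., latency)
    return sum(xs) / len(xs)

def harmonic_mean(xs):     # for inverse-time units (e.g., throughput)
    return len(xs) / sum(1 / x for x in xs)

def geometric_mean(xs):    # for unitless ratios (e.g., speedups)
    return math.prod(xs) ** (1 / len(xs))

speeds = [30.0, 90.0]                   # two equal 1-mile legs
print(arithmetic_mean(speeds))          # 60.0 -- NOT the trip's average speed
print(harmonic_mean(speeds))            # 45.0 -- correct: 2 mi / (1/30 + 1/90) h
print(geometric_mean([1.2, 3.0, 0.8]))  # ~1.42 -- e.g., GMEAN of speedup ratios
```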

CPU Performance

Iron Law of Performance

• Multiple aspects to performance: it helps to isolate them
• Latency(P, A) = seconds/program =
  (instructions/program) × (cycles/instruction) × (seconds/cycle)
  • Instructions/program: dynamic insn count
    • Function of program, compiler, instruction set architecture (ISA)
  • Cycles/instruction: CPI
    • Function of program, compiler, ISA, micro-architecture
  • Seconds/cycle: clock period
    • Function of micro-architecture, technology parameters

• For low latency (better performance), minimize all three!
  – Difficult: they often pull against one another
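The Iron Law as a one-line function, evaluated on assumed example numbers (the 1 billion instructions, CPI of 1.5, and 2 GHz clock below are illustrative, not from the slides):

```python
# seconds/program = (insns/program) * (cycles/insn) * (seconds/cycle)
def latency_seconds(insns: float, cpi: float, clock_hz: float) -> float:
    return insns * cpi / clock_hz

t = latency_seconds(insns=1e9, cpi=1.5, clock_hz=2e9)
print(f"{t:.3f} s")  # 0.750 s: 1e9 insns * 1.5 cycles/insn / 2e9 cycles/s
```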

Pitfalls of Partial Performance Metrics

Danger: Partial Performance Metrics

• (Micro-)architects often ignore dynamic instruction count
  • Typically work with one ISA/one compiler → treat it as fixed
  • Not always accurate for multithreaded workloads!

• The CPU performance equation becomes
  • Latency: seconds/insn = (cycles/insn) × (seconds/cycle)
  • Throughput: insns/second = (insns/cycle) × (cycles/second)

• MIPS (millions of instructions per second)
  • (Instructions/second) × 10^-6
  • Cycles/second: clock frequency (in MHz)
  • Example: CPI = 2, clock = 500 MHz → 0.5 insns/cycle × 500 MHz = 250 MIPS

• Pitfall: MIPS may vary inversely with actual performance
  – Compiler removes insns, program gets faster, MIPS goes down
  – Work per instruction varies (e.g., multiply vs. add, FP vs. integer)

MIPS and MFLOPS (MegaFLOPS)

• Problem: MIPS may vary inversely with performance
  • Some optimizations actually add instructions
  • Work per instruction varies (e.g., FP multiply vs. integer add)
  • ISAs are not equivalent
• MFLOPS: like MIPS, but counts only FP ops, because…
  + FP ops can’t be optimized away
  + FP ops have the longest latencies
  + FP ops are the same across machines
• May have been valid in 1980, but now…
  – Many CPU programs are “integer” – light on FP
  – Loads from memory take much longer than an FP divide
  – Even FP instruction sets are not equivalent
• Upshot: neither MIPS nor MFLOPS is broadly useful

Danger: Partial Performance Metrics II

• (Micro-)architects often ignore dynamic instruction count…
• …but the general public (mostly) also ignores CPI
  • Equates clock frequency with performance!

• Which processor would you buy?
  • Processor A: CPI = 2, clock = 5 GHz
  • Processor B: CPI = 1, clock = 3 GHz
  • Probably A, but B is faster (assuming the same ISA/compiler)

• Classic example
  • 800 MHz Pentium III faster than 1 GHz Pentium 4!
  • More recent example: Core i7 faster clock-for-clock than Core 2
  • Same ISA and compiler!
• Meta-point: the danger of partial performance metrics!
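Checking the A-versus-B question above with the per-instruction latency formula:

```python
# seconds/insn = CPI / clock frequency, so compare CPI/f, not frequency alone.
def seconds_per_insn(cpi: float, clock_hz: float) -> float:
    return cpi / clock_hz

a = seconds_per_insn(cpi=2, clock_hz=5e9)   # Processor A: 0.40 ns per insn
b = seconds_per_insn(cpi=1, clock_hz=3e9)   # Processor B: 0.33 ns per insn
print(f"A: {a * 1e9:.2f} ns/insn, B: {b * 1e9:.2f} ns/insn, "
      f"B is {a / b:.1f}x faster")          # B wins despite the slower clock
```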

Cycles per Instruction (CPI)

• This course is mostly about improving CPI
  • Cycles/instruction for the average instruction
  • IPC = 1/CPI
    • Used more frequently than CPI, but harder to compute with
  • Different instructions have different cycle costs
    • E.g., an integer add typically takes 1 cycle; an FP divide takes > 10
  • Assumes you know something about instruction frequencies

• CPI example
  • A program executes equal numbers of integer, FP, and memory operations
  • Cycles per instruction type: integer = 1, memory = 2, FP = 3
  • What is the CPI? (0.33 × 1) + (0.33 × 2) + (0.33 × 3) = 2
  • Caveat: this sort of calculation ignores dependencies completely
    • Still useful for back-of-the-envelope arguments

Another CPI Example

• Assume a processor with the following instruction frequencies and costs:
  • Integer ALU: 50%, 1 cycle
  • Load: 20%, 5 cycles
  • Store: 10%, 1 cycle
  • Branch: 20%, 2 cycles
• Which change would improve performance more?
  • A: branch prediction to reduce branch cost to 1 cycle?
  • B: a bigger data cache to reduce load cost to 3 cycles?

• Compute the CPI (see the sketch below)
  • Base: 0.5×1 + 0.2×5 + 0.1×1 + 0.2×2 = 2.0
  • A: 0.5×1 + 0.2×5 + 0.1×1 + 0.2×1 = 1.8
  • B: 0.5×1 + 0.2×3 + 0.1×1 + 0.2×2 = 1.6 (winner!)
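The same weighted-CPI comparison as a small helper (instruction mix and cycle counts taken directly from the slide):

```python
# CPI = sum over instruction classes of (frequency * cycles per instruction).
def cpi(mix: dict) -> float:
    return sum(freq * cycles for freq, cycles in mix.values())

base  = {"alu": (0.5, 1), "load": (0.2, 5), "store": (0.1, 1), "branch": (0.2, 2)}
opt_a = {**base, "branch": (0.2, 1)}   # branch prediction: branches cost 1 cycle
opt_b = {**base, "load": (0.2, 3)}     # bigger data cache: loads cost 3 cycles

print(cpi(base), cpi(opt_a), cpi(opt_b))   # 2.0 1.8 1.6 -> option B wins
```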

Performance Rules of Thumb

• Design for actual performance, not peak performance
  • Peak performance: “performance you are guaranteed not to exceed”
  • Greater than “actual”, “average”, or “sustained” performance
    • Why? Cache misses, branch mispredictions, limited ILP, etc.
  • For actual performance X, machine capability must be > X

• It is easier to “buy” bandwidth than latency
  • Which is easier: designing a train that (1) holds twice as much cargo, or (2) goes twice as fast?
  • Use bandwidth to reduce latency

• Build a balanced system
  • Don’t over-optimize 1% to the detriment of the other 99%
  • System performance is often determined by the slowest component

Improving CPI

• This course focuses on improving CPI instead of frequency
  • Historically, clock frequency accounted for 70%+ of performance improvement
    • Achieved via deeper pipelines
  • That has been forced to change
    • Deep pipelining is not power-efficient
    • Physical speed limits are approaching
    • 1 GHz: 1999, 2 GHz: 2001, 3 GHz: 2002, 3.8 GHz: 2004, 5 GHz: 2008
    • Core 2: 1.8 – 3.2 GHz in 2008
  • Techniques we’ll look at:
    • Caching, speculation, multiple issue, out-of-order issue
    • Vectors, more…

• Moore helps because CPI reduction requires transistors
  • The definition of parallelism is “more transistors”
  • But the best example is caches

Moore’s Effect on Performance

• Moore’s Curve: common interpretation of Moore’s Law
  • “CPU performance doubles every 18(-24) months”
  • Self-fulfilling prophecy
    • 2X every 18–24 months is ~1% per week
    • Q: Would you add a feature that improved performance 20% if it took 8 months to design and test?
  • Processors under Moore’s Curve (arrive too late) fail spectacularly
    • E.g., Intel’s Itanium, Sun’s Millennium

Performance Rules of Thumb

• Make the common case fast
  • “Amdahl’s Law” (see the sketch below):
    Speedup_overall = 1 / ((1 − fraction_x) + fraction_x / Speedup_x)
  • Corollary: don’t optimize 5% to the detriment of the other 95%:
    Speedup_overall = 1 / ((1 − 5%) + 5%/∞) = 1.05
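Amdahl’s Law as code, reproducing the 5% corollary above:

```python
# Speedup_overall = 1 / ((1 - fraction_x) + fraction_x / speedup_x)
def amdahl_speedup(fraction_x: float, speedup_x: float) -> float:
    return 1.0 / ((1.0 - fraction_x) + fraction_x / speedup_x)

print(amdahl_speedup(0.05, float("inf")))  # 1.0526...: ~1.05x at best
print(amdahl_speedup(0.95, 2.0))           # 1.9048: the common case pays off
```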

• Build a balanced system
  • Don’t over-engineer capabilities that cannot be utilized
  • Try to be “bound” by the most expensive resource (if not everywhere)

• Design for actual, not peak, performance
  • For actual performance X, machine capability must be > X

Amdahl’s Law Graph

Source: Wikipedia

Little’s Law

• Key relationship between latency and bandwidth:
  • Average number in system = arrival rate × average holding time

• Example: how big a wine cellar should I build?
  • My family drinks (and buys) an average of 4 bottles per week
  • On average, I want to age my wine 5 years
  • # bottles in cellar = 4 bottles/week × 52 weeks/year × 5 years = 1040 bottles (!)

More Little’s Law

• How many outstanding cache misses do we need to support?
  • Want to sustain 5 GB/s of bandwidth
  • 64-byte blocks
  • 100 ns miss latency
• Requests in system = arrival rate × time in system
  = (5 GB/s / 64-byte blocks) × 100 ns ≈ 8 misses
• That’s an AVERAGE
  • Need to support many more if we hope to sustain this bandwidth
  • Rule of thumb: 2X
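Both Little’s Law examples, computed directly:

```python
# Average number in system = arrival rate * average time in system.
def littles_law(arrival_rate: float, holding_time: float) -> float:
    return arrival_rate * holding_time

# Wine cellar: 4 bottles/week arriving, each held for 5 years.
print(littles_law(4 * 52, 5))     # 1040 bottles

# Cache misses: 5 GB/s of 64-byte blocks, each miss outstanding for 100 ns.
misses = littles_law(5e9 / 64, 100e-9)
print(round(misses, 1))           # ~7.8 -> about 8 misses in flight on average
```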

Power & Energy

Another Constraint: Power and Energy

• Power (Watts, or Joules/second): a short-term (peak/max) concern
  • Used to be mostly a dissipation (heat) concern; now a $$$ concern too
  • Power density (Watts/cm²): important related metric
    • Thermal cycle: power dissipation ↑ → power density ↑ → temperature ↑ → resistance ↑ → power dissipation ↑ …
  • Cost (and form factor): packaging, heat sink, fan, etc.
• Energy (Joules, or Watt-seconds):
  • Mostly a consumption concern
  • Primary issue(s): battery life, electric bill, environmental impact
  • Low power implies low energy, but not vice versa

• Two sources:
  • Dynamic power: active switching of transistors
  • Static power: leakage of transistors even while inactive

Moore’s Effect on Power

• Scaling has largely good effects on local power
  + Shorter wires/smaller transistors (Length ↓ → Capacitance ↓)
  + Shorter transistor length (Resistance ↓, Capacitance ↓)
  – Wires closer together (Capacitance ↑)
  – Global effects largely undone by increased transistor counts

• Scaling has a largely negative effect on power density
  + Transistor/wire power decreases linearly
  – Transistor/wire density increases quadratically
  – Power density increases linearly
    • Thermal cycle

• Controlled somewhat by reduced VDD (5 → 3 → 1.6 → 1.3 → 1.1)
  • Reduced VDD sacrifices some switching speed

Reducing Power

• Dynamic power is proportional to C × VDD² × f

• Reduce supply voltage (VDD)
  + Reduces dynamic power quadratically and static power linearly
  • But poses a tough choice regarding VT (the threshold voltage)
    – Constant VT slows circuit speed → clock frequency → performance
    – Reduced VT increases static power exponentially

• Reduce clock frequency (f)
  • Reduces dynamic power linearly
  • Doesn’t reduce static power
  • Reduces performance linearly
  • Generally doesn’t make sense without reducing VDD too…
  • Except that frequency can be adjusted cycle-to-cycle and locally
    • Dynamic voltage scaling (DVS) – more on this later
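The payoff of scaling voltage and frequency together, using the P = C·VDD²·f relation (the capacitance and frequency values below are assumed for illustration, not figures from the slides):

```python
# Dynamic power: P = C * VDD^2 * f.
def dynamic_power(c_farads: float, vdd: float, f_hz: float) -> float:
    return c_farads * vdd ** 2 * f_hz

base = dynamic_power(c_farads=1e-9, vdd=1.1, f_hz=3e9)       # ~3.63 W
scaled = dynamic_power(c_farads=1e-9, vdd=1.1 * 0.8, f_hz=3e9 * 0.8)
print(f"baseline {base:.2f} W, scaled {scaled:.2f} W, "
      f"ratio {scaled / base:.2f}")   # ~0.51: V and f together give a cubic win
```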

Reliability


• Mean Time Between Failures (MTBF)
  • How long before you have to reboot or buy a new one

• CPU reliability is small in the grand scheme
  • Software is the most unreliable component in a system
    • Much more difficult to specify & test
    • Much more of it
  • Most unreliable hardware component: the disk
    • Subject to mechanical wear

Moore’s Bad Effect on Reliability

• Wasn’t a problem until 5–10 years ago…
  • Except for transient errors on chips in orbit (satellites)
• …now it is a problem already, and getting worse all the time
  – Transient faults:
    • Small (low-charge) transistors are more easily flipped
    • Even low-energy particles can now flip a bit
  – Permanent faults:
    • Small transistors and wires deform and break more quickly
    • Higher temperatures accelerate the process
• Progression of transient faults
  • Memory (DRAM) was hit first: denser, smaller devices than SRAM
  • Then on-chip memory (SRAM)
  • Logic is starting to have problems…

Moore’s Good Effect on Reliability

• The key to providing reliability is redundancy
  • The same scaling that makes devices less reliable…
  • …also increases device density, enabling redundancy

• Examples
  • Error correcting codes for memory (DRAM) and caches (SRAM)
  • Core-level redundancy: paired execution, hot spares, etc.
  • Intel’s Core i7 (Nehalem) uses 8-transistor SRAM cells
    • Versus the standard 6-transistor cells

• Big open questions
  • Can we protect logic efficiently (without 2x or 3x overhead)?
  • Can architectural techniques help hardware reliability?
  • Can software techniques help?

Summary

Summary: A Global Look at Moore

• Device scaling (Moore’s Law)
  • Increases performance
    • Reduces transistor/wire delay
    • Gives us more transistors with which to reduce CPI
  • Reduces local power consumption
    • Which is quickly undone by increased integration
  • Aggravates power-density and temperature problems
  • Aggravates reliability problems
    • But gives us the transistors to solve them via redundancy
  • Reduces unit cost
    • But increases startup cost

• (When) will we fall off Moore’s Cliff (for real, this time)?
• What’s next: nanotubes, quantum dots, optical, spintronics, DNA?

Summary

• What is computer architecture?
  • Abstraction and layering: interface and implementation, ISA
• Shaping forces: applications and semiconductor technology
  • Moore’s Law
• Cost
  • Unit and startup
• Performance
  • Latency and throughput
  • CPU performance equation: insn count × CPI × clock period
• Power and energy
  • Dynamic and static power
• Reliability
