Administrative

Lecture 1: Course Introduction, Technology Trends, Performance

Professor Alvin R. Lebeck
Compsci 220 / ECE 252, Fall 2004

• Office Hours
  – Office: D308 LSRC
  – Hours: Mon 3:00-4:00, Thurs 1:00-2:00, or by appointment (email)
  – email: [email protected], Phone: 660-6551
• Teaching Assistant: Shobana Ravi
  – Office: D330
  – Hours: TBD
  – email: [email protected], Phone: 660-6589

Slides based on those of: Sorin, Roth, Hill, Wood, Sohi, Smith, Vijaykumar, Lipasti, Katz

© 2004 Lebeck, Sorin, Roth, Hill, Wood, Sohi, Smith, Vijaykumar, Lipasti, Katz CompSci 220 / ECE 252 2

Administrative (Grading)

• 30% Homeworks
  – 4 to 6 homeworks
  – Late < 1 day = 50%; late > 1 day = zero
• 40% Examinations (Midterm 15% + Final 25%)
  – Midterm exam: in class (75 min), closed book
  – Final exam: 3 hours, closed book
• 30% Research Project (work in groups of 2 or 3)
  – No late term projects
• CS Graduate Students: this is a "Quals" course
  – Quals pass based on the midterm and final exams only

Administrative (Continued)

• Academic misconduct
  – University policy will be followed strictly
  – Zero tolerance for cheating and/or plagiarism
• This course requires hard work.

Administrative (Continued)

• Course Web Page: http://www.cs.duke.edu/courses/cps220/fall04
  – Lectures posted there shortly before class (pdf)
  – Homework posted there
  – General information about the course
• Course News Group: duke.cs.cps220
  – Use it to:
    1. read announcements/comments on class or homework,
    2. ask questions (help),
    3. communicate with each other.

SPIDER: Systems Seminar

• Systems & Architecture Seminar
  – Wednesdays 4:00-5:00 in D344
  – duke.cs.os-research (spider newsgroup)
• Presentations on current work
  – Practice talks for conferences
  – Discussion of recent papers
  – Your own research
• Why you should go
  – If you want to work in systems/architecture…
  – Good time to practice public speaking in front of a friendly crowd
  – Learn about current topics


Homework #0

• Need a Duke CS account? Email me ([email protected]) your:
  1. Duke ID
  2. ACPUB account name
• Read Chapters 1 & 2

What is This Course All About?

• State-of-the-art computer hardware design
• Topics
  – Uniprocessor architecture (i.e., the CPU)
  – Memory architecture
  – I/O architecture
  – Brief look at multithreading and multiprocessors
• Fundamentals, current systems, and future systems
• Will read from the textbook, classic papers, and brand-new papers

Course Goals and Expectations

• Course goals
  – Understand how current processors work
  – Understand how to evaluate/compare processors
  – Learn how to use a simulator to perform experiments
  – Learn research skills by performing a term project
• Course expectations
  – Will loosely follow the text
  – Major emphasis on cutting-edge issues
  – Students will read a list of research papers
  – Term project

CPS 220 Course Focus

Understanding the design techniques, machine structures, technology factors, and evaluation methods that will determine the form of computers in the 21st century.

(figure: computer architecture, i.e., instruction set design (ISA), organization, and hardware, at the center of interacting concerns: applications, programming languages, compilers, operating systems, interface design, technology, parallelism, power, measurement & evaluation, and history)


Expected Background

• Basic architecture (ECE 152 / CPS 104)
• Basic OS (ECE 153 / CPS 110)
• Other useful and related courses:
  – Digital system design (ECE 251)
  – VLSI systems (ECE 261)
  – Multiprocessor architecture (ECE 259 / CPS 221)
  – Fault tolerant computing (ECE 254 / CPS 225)
  – Computer networks and systems (CPS 114 & 214)
  – Programming languages & compilers (CS 106 & 206)
  – Advanced OS (CPS 210)

Course Components

Reading materials:
• Computer Architecture: A Quantitative Approach by Hennessy and Patterson, 3rd Edition
• Readings in Computer Architecture by Hill, Jouppi, Sohi
• Recent research papers (online)

Computer Architecture Is …

"…the attributes of a [computing] system as seen by the programmer, i.e., the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation."
  – Amdahl, Blaauw, and Brooks, IBM Journal of R&D, April 1964

Computer Architecture Topics

• Input/output and storage: disks, WORM, tape, RAID, emerging technologies
• Memory hierarchy: DRAM, interleaving, bus protocols, L2 cache, L1 cache, coherence, bandwidth, latency, VLSI
• Instruction set architecture: addressing, protection, exception handling
• Pipelining and instruction-level parallelism: pipelining, hazard resolution, superscalar, reordering, prediction, speculation


Architecture and Other Disciplines Levels of Computer Architecture

architecture Application Software – functional appearance to immediate user » opcodes, addressing modes, architected registers Operating Systems, Compilers, Networking implementation (microarchitecture) – logical structure that performs the architecture Computer Architecture » pipelining, functional units, caches, physical registers realization (circuits) Circuits, Wires, Devices, Network Hardware – physical structure that embodies the implementation » gates, cells, transistors, wires

• Architecture interacts with many other fields • Can’t be studied in a vacuum

Role of the Computer Microarchitect

• Architect: defines the hardware/software interface
• Microarchitect: defines the hardware implementation
  – Usually the same person
• Decisions based on
  – applications
  – performance
  – cost
  – reliability
  – power …

Computer Engineering Methodology

(figure: an iterative design cycle driven by technology trends)


Computer Engineering Methodology (Continued)

An iterative cycle:
1. Evaluate existing systems for bottlenecks (using benchmarks)
2. Simulate new designs and organizations (using workloads, guided by technology trends)
3. Implement the next generation system (constrained by implementation complexity)
… and repeat.

Applications -> Requirements -> Designs

• Scientific: weather prediction, molecular modeling
  – Need: large memory, floating-point arithmetic
  – Examples: CRAY-1, T3E, IBM DeepBlue, BlueGene
• Commercial: inventory, payroll, web serving, e-commerce
  – Need: integer arithmetic, high I/O
  – Examples: clusters, SUN SPARCcenter, Enterprise
• Desktop: multimedia, games, entertainment
  – Need: high data bandwidth, graphics
  – Examples: Intel Pentium4, IBM Power4, Motorola PPC 620
• Mobile: laptops
  – Need: low power (battery), good performance
  – Examples: Intel Mobile Pentium III, Transmeta TM5400
• Embedded: cell phones, automobile engines, door knobs
  – Need: low power (battery + heat), low cost
  – Examples: Compaq/Intel StrongARM, X-Scale, Transmeta TM3200


Why Study Computer Architecture?

• Answer #1: requirements are always changing
  – Aren't computers fast enough already? Are they?
  – Fast enough to do everything we will EVER want? (AI, VR, protein sequencing, ????)
  – Is speed the only goal?
    » Power: heat dissipation + battery life
    » Cost
    » Reliability
    » etc.
  – Designs change even if requirements are fixed, but requirements are not fixed
• Answer #2: the technology playing field is always changing
  – Annual technology improvements (approximate)
    » SRAM (logic): density +25%, speed +20%
    » DRAM (memory): density +60%, speed +4%
    » Disk (magnetic): density +25%, speed +4%
    » Fiber: ??
  – Parameters change, and change relative to one another!

Examples of Changing Designs

Example I: caches
• 1970: 10K transistors, DRAM faster than logic -> bad idea
• 1990: 1M transistors, logic faster than DRAM -> good idea
• Will caches ever be a bad idea again?

Example II: out-of-order execution
• 1985: 100K transistors + no precise interrupts -> bad idea
• 1995: 2M transistors + precise interrupts -> good idea
• 2005: 100M transistors + 10 GHz clock -> bad idea?

• Semiconductor technology is an incredible driving force

Moore's Law

"Cramming More Components onto Integrated Circuits"
  – G. E. Moore, Electronics, 1965

• Observation: (DRAM) transistor density doubles annually
  – Became known as "Moore's Law"
  – Wrong: density doubles every 18 months (he had only 4 data points)
• Corollaries
  – Cost / transistor halves annually (18 months)
  – Power per transistor decreases with scaling
  – Speed increases with scaling
  – Reliability increases with scaling (depends how small!)


Moore's Law

• "Performance doubles every 18 months"
  – The common interpretation of Moore's Law, not its original intent
  – Wrong! "Performance" doubles every ~2 years
• Self-fulfilling prophecy (Moore's Curve)
  – 2X every 2 years = ~3% increase per month
  – 3% per month used to judge performance features
  – If a feature adds 9 months to the schedule, it should add at least 30% to performance (1.03^9 = 1.30 -> 30%)
  – …: under Moore's Curve in a big way

Technology Trends: Capacity

(figure: transistors per chip for Intel and Digital processors, 1971-2006, log scale, with a "graduation window" marked; e.g., Pentium Pro: 5.5 million, Sparc Ultra: 5.2 million, PowerPC 620: 6.9 million, Pentium III: 28 million, Pentium 4: 42 million, Alpha 21464: 250 million)

• CMOS improvements:
  – Die size: 2X every 3 years
  – Line width: halves every 7 years

Processor Performance

(figure: performance in SPECmarks, 1987-1995: Sun-4/260, MIPS M/120, MIPS M2000, IBM RS6000/540, HP 9000/750, DEC AXP 3000, IBM Power 2/590, DEC 21064a, Sun UltraSparc; growth of 1.35X/yr early on, then 1.54X/yr)

Alpha SPECint and SPECfp

(figure: Alpha integer and floating-point SPEC performance, 1995-2004)


Chip Area Reachable in One Clock Cycle

(figure: fraction of chip reached in one cycle vs. feature size, 250 nm down to 35 nm, for clock-scaling assumptions f16, f8, and fSIA; the reachable fraction falls steeply as features shrink)

Power Density

(figure: power density in W/cm^2 vs. feature size, 1.5 microns down to 0.1 microns, log scale; processor power density passes that of a hot plate and heads toward that of a laser diode)

Measurement and Evaluation

Architecture is an iterative process:
• Searching the space of possible designs
• At all levels of computer systems
• Design -> Analysis -> Creativity -> Cost / Performance Analysis -> Design …
• Ideas sort into good ideas, bad ideas, and mediocre ideas

• How do I evaluate an idea?
• Question: what is "better", the Boeing 747 or the Concorde?

Measurement Tools

• Performance, cost, die area, power estimation
• Benchmarks, traces, mixes
• Simulation (many levels)
  – ISA, RT, gate, circuit
• Queuing theory
• Rules of thumb
• Fundamental laws


The Bottom Line: Performance (and Cost)

  Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
  Boeing 747         6.5 hours     610 mph    470          286,700
  BAD/Sud Concorde   3 hours       1350 mph   132          178,200

• Time to run the task (ExTime)
  – Execution time, response time, latency
  – e.g., speed (latency) of the Concorde vs. the Boeing 747
• Tasks per day, hour, week, sec, ns … (Performance)
  – Throughput, bandwidth
  – e.g., throughput of the Boeing 747 vs. the Concorde

"X is n times faster than Y" means:

  ExTime(Y) / ExTime(X) = Performance(X) / Performance(Y) = n

Performance Terminology

"X is n% faster than Y" means:

  ExTime(Y) / ExTime(X) = Performance(X) / Performance(Y) = 1 + n/100

  n = 100 × (Performance(X) − Performance(Y)) / Performance(Y)

Example: Y takes 15 seconds to complete a task, X takes 10 seconds. What % faster is X?

  ExTime(Y) / ExTime(X) = 15 / 10 = 1.5
  n = 100 × (1.5 − 1.0) / 1.0
  n = 50%
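The "n% faster" definition can be checked directly in code (a minimal Python sketch; the helper name is mine, not from the slides):

```python
def percent_faster(time_y, time_x):
    # n such that "X is n% faster than Y": n = 100 * (ExTime(Y)/ExTime(X) - 1)
    return 100.0 * (time_y / time_x - 1.0)

# Y takes 15 s, X takes 10 s:
print(percent_faster(15.0, 10.0))  # 50.0
```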


Amdahl's Law

Speedup due to enhancement E:

  Speedup(E) = ExTime(without E) / ExTime(with E) = Performance(with E) / Performance(without E)

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. Then:

  ExTime(E) = ExTime(without E) × ((1 − F) + F/S)
  Speedup(E) = 1 / ((1 − F) + F/S)

In the book's notation:

  ExTime_new = ExTime_old × ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

  Speedup_overall = ExTime_old / ExTime_new = 1 / ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

Amdahl's Law: Example

• Floating point instructions improved to run 2X, but only 10% of actual instruction execution time is FP:

  ExTime_new = ExTime_old × (0.9 + 0.1/2) = 0.95 × ExTime_old

  Speedup_overall = 1 / 0.95 = 1.053
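The worked example above generalizes to a one-line helper (a Python sketch of the formula; the function name is my choice):

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    # Speedup_overall = 1 / ((1 - F) + F / S)
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)

# FP sped up 2x, but FP is only 10% of execution time:
print(round(amdahl_speedup(0.10, 2.0), 3))  # 1.053
```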


Corollary: Make The Common Case Fast

• The simple case is usually the most frequent and the easiest to optimize!
• Do simple, fast things in hardware and be sure the rest can be handled correctly in software

Occam's Toothbrush

• All instructions require an instruction fetch; only a fraction require a data fetch/store
  – Optimize instruction access over data access
• Programs exhibit locality
  – Spatial locality and temporal locality
• Access to small memories is faster
  – Provide a storage hierarchy such that the most frequent accesses are to the smallest (closest) memories
  – Reg's -> Cache -> Memory -> Disk / Tape

Metrics of Performance

• Application: answers per month
• Programming language, compiler: operations per second
• ISA: millions of instructions per second (MIPS), millions of (FP) operations per second (MFLOP/s)
• Datapath, control, function units: megabytes per second
• Transistors, wires, pins: cycles per second (clock rate)

Aspects of CPU Performance

  CPU time = Seconds/Program = (Instructions/Program) × (Cycles/Instruction) × (Seconds/Cycle)

• Instruction count is determined by the program, compiler, and ISA
• CPI is determined by the ISA and the organization (datapath, control, function units)
• Clock rate is determined by the organization and technology (transistors, wires, pins)
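The CPU time equation above multiplies out directly (a Python sketch; parameter names are mine):

```python
def cpu_time_s(instructions, cpi, clock_hz):
    # CPU time = (instructions/program) * (cycles/instruction) * (seconds/cycle)
    return instructions * cpi / clock_hz

# e.g., 1e9 instructions at CPI 1.5 on a 1 GHz clock:
print(cpu_time_s(1e9, 1.5, 1e9))  # 1.5
```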


Marketing Metrics

  MIPS = Instruction Count / (Time × 10^6) = Clock Rate / (CPI × 10^6)

• Machines with different instruction sets?
• Programs with different instruction mixes?
  – Dynamic frequency of instructions
• Uncorrelated with performance

  MFLOPS = FP Operations / (Time × 10^6)

• Machine dependent
• Often not where time is spent
• Normalized FP operation counts: add, sub, compare, mult = 1; divide, sqrt = 4; exp, sin, … = 8

Cycles Per Instruction

"Average cycles per instruction":

  CPI = (CPU time × Clock Rate) / Instruction Count = Cycles / Instruction Count

  CPU time = Cycle Time × Σ_i (CPI_i × I_i)

"Instruction frequency":

  CPI = Σ_i (CPI_i × F_i), where F_i = I_i / Instruction Count

Invest resources where time is spent!
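The two MIPS expressions above agree because CPU time = IC × CPI / clock rate; a quick numeric check (a Python sketch, helper names mine):

```python
def mips_from_time(instruction_count, exec_time_s):
    return instruction_count / (exec_time_s * 1e6)

def mips_from_cpi(clock_hz, cpi):
    return clock_hz / (cpi * 1e6)

ic, cpi, clock = 1e9, 2.0, 1e9
t = ic * cpi / clock                                      # 2.0 seconds
print(mips_from_time(ic, t), mips_from_cpi(clock, cpi))   # 500.0 500.0
```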

Organizational Trade-offs

• Application, programming language, compiler -> instruction mix
• ISA, datapath, control, function units -> CPI
• Datapath, control, function units, transistors, wires, pins -> cycle time

Example: Calculating CPI

Base machine (Reg / Reg), typical mix:

  Op      Freq   Cycles   CPI_i   (% Time)
  ALU     50%    1        0.5     (33%)
  Load    20%    2        0.4     (27%)
  Store   10%    2        0.2     (13%)
  Branch  20%    2        0.4     (27%)
                 Total CPI = 1.5
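The CPI in the table above is just the frequency-weighted sum CPI = Σ F_i × CPI_i (a small Python sketch):

```python
# (frequency, cycles) for the base Reg/Reg machine's typical mix
mix = {"ALU": (0.50, 1), "Load": (0.20, 2), "Store": (0.10, 2), "Branch": (0.20, 2)}

cpi = sum(freq * cycles for freq, cycles in mix.values())
print(round(cpi, 2))  # 1.5
```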


Example

Add register / memory operations to a traditional RISC:
  – One source operand in memory
  – One source operand in register
  – Cycle count of 2
  – Branch cycle count increases to 3

What fraction of the loads must be eliminated for this to pay off?

Base machine (Reg / Reg):

  Op      Freq   Cycles
  ALU     50%    1
  Load    20%    2
  Store   10%    2
  Branch  20%    2

Next Time

• Benchmarks
• Performance metrics
• Cost
• Instruction set architectures

TODO
• Read Chapters 1 & 2
• Email me if you need a CS account
• HW #1 will be up Wednesday, due Sep 7

Administrative

Lecture 2: Benchmarks, Performance Metrics, Cost, Instruction Set Architecture

Professor Alvin R. Lebeck
Compsci 220 / ECE 252, Fall 2004

• Read Chapter 2, Wulf, Transmeta
• Homework #1 due September 7
  – SimpleScalar: read some of the documentation first
  – See the web page for details
  – Questions: contact Shobana ([email protected])
• After that, pipelining… Appendix A + papers


Review: Trends

• Technology trends are one driving force in architectural innovation
  – Moore's Law
  – Chip area reachable in one clock
  – Power density

The Danger of Extrapolation

• Dot-com stock value
• Technology trends
  – Power dissipation?
  – Cost of new fabs?
• Alternative technologies?
  – Carbon nanotubes
  – Optical

Review: Performance

  CPU time = Seconds/Program = (Instructions/Program) × (Cycles/Instruction) × (Seconds/Cycle)

Amdahl's Law:

  ExTime_new = ExTime_old × ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

  Speedup_overall = ExTime_old / ExTime_new = 1 / ((1 − Fraction_enhanced) + Fraction_enhanced / Speedup_enhanced)

"Average cycles per instruction":

  CPI = (CPU time × Clock Rate) / Instruction Count = Cycles / Instruction Count

  CPU time = Cycle Time × Σ_i (CPI_i × I_i)

"Instruction frequency":

  CPI = Σ_i (CPI_i × F_i), where F_i = I_i / Instruction Count

Invest resources where time is spent!


Example Solution

Add register / memory operations:
  – One source operand in memory
  – One source operand in register
  – Cycle count of 2; branch cycle count increases to 3

What fraction of the loads must be eliminated for this to pay off?

  Exec Time = Instr Cnt × CPI × Clock

              Old                   New
  Op       Freq  Cycles  CPI     Freq    Cycles  CPI
  ALU      .50   1       .5      .5 − X  1       .5 − X
  Load     .20   2       .4      .2 − X  2       .4 − 2X
  Store    .10   2       .2      .1      2       .2
  Branch   .20   2       .4      .2      3       .6
  Reg/Mem                        X       2       2X
           1.00          1.5     1 − X           (1.7 − X)/(1 − X)

CPI_new must be normalized to the new instruction frequency.

Example Solution (Continued)

  Instr Cnt_old × CPI_old × Clock_old = Instr Cnt_new × CPI_new × Clock_new
  1.00 × 1.5 = (1 − X) × (1.7 − X)/(1 − X)
  1.5 = 1.7 − X
  X = 0.2

ALL loads must be eliminated for this to be a win!

Actually Measuring Performance

• How are execution time & CPI actually measured?
  – Execution time: time (Unix cmd): wall-clock, CPU, system
  – CPI = CPU time / (clock frequency × # instructions)
  – More useful: a CPI breakdown (compute, memory stall, etc.), so we know what the performance problems are (what to fix)
• Measuring the CPI breakdown
  – Hardware event counters (PentiumPro, Alpha DCPI)
    » Calculate CPI using instruction frequencies / event costs
  – Cycle-level microarchitecture simulator (e.g., SimpleScalar)
    » Measure exactly what you want
    » Model the microarchitecture faithfully (at least the parts of interest)
    » Method of choice for many architects (yours, too!)
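The break-even algebra can be checked numerically with the example's frequencies (a Python sketch; x is the fraction of all instructions that were loads folded into reg/mem ops, so each reg/mem op replaces one ALU op plus one load):

```python
def new_exec_time(x):
    # Cycles per *original* instruction after converting a fraction x of
    # instructions into reg/mem ops (2 cycles each); branches now take 3 cycles.
    # Equals 1.7 - x; the instruction count shrinks to (1 - x), clock unchanged.
    return (0.5 - x) * 1 + (0.2 - x) * 2 + 0.1 * 2 + 0.2 * 3 + x * 2

old = 0.50 * 1 + 0.20 * 2 + 0.10 * 2 + 0.20 * 2   # 1.5 cycles per instruction
# Break-even: 1.7 - x = 1.5  =>  x = 0.2, i.e. ALL loads must be eliminated
print(round(old, 6), round(new_exec_time(0.2), 6))  # 1.5 1.5
```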


Benchmarks and Benchmarking

• "Program" as unit of work
  – Millions of them, many different kinds; which to use?
• Benchmarks: standard programs for measuring/comparing performance
  – Represent programs people care about
  – Repeatable!!
• Benchmarking process
  » Define workload
  » Extract benchmarks from workload
  » Execute benchmarks on candidate machines
  » Project performance on new machine
  » Run workload on new machine and compare
  » Not close enough -> repeat

Benchmarks: Instruction Mixes

• Instruction mix: instruction type frequencies
• Ignores dependences
• OK for a non-pipelined, scalar processor without caches
  – The way all processors used to be
• Example: Gibson Mix, developed in the 1950s at IBM
  – Load/store: 31%, branches: 17%
  – Compare: 4%, shift: 4%, logical: 2%
  – Fixed add/sub: 6%, float add/sub: 7%
  – Float mult: 4%, float div: 2%, fixed mul: 1%, fixed div: <1%
• Qualitatively, these numbers are still useful today!

Benchmarks: Toys, Kernels, Synthetics

Benchmarks: Real Programs

• Toy benchmarks: little programs that no one really runs
  – e.g., Fibonacci, 8 queens
  – Little value; what real programs do these represent?
  – Scary fact: used to prove the value of RISC in the early 80s
• Kernels: important (frequently executed) pieces of real programs
  – e.g., Livermore loops, Linpack (inner product)
  – Good for focusing on individual features, not the big picture
  – Over-emphasize the target feature (for better or worse)
• Synthetic benchmarks: programs made up for benchmarking
  – e.g., Whetstone, Dhrystone
  – Toy kernels++; which programs do these represent?

• Real programs: the only accurate way to characterize performance
  – Requires considerable work (porting)
• Standard Performance Evaluation Corporation (SPEC)
  – http://www.spec.org
  – Collects, standardizes, and distributes benchmark suites
  – Consortium made up of industry leaders
  – SPEC CPU (CPU-intensive benchmarks)
    » SPEC89, SPEC92, SPEC95, SPEC2000
  – Other benchmark suites
    » SPECjvm, SPECmail, SPECweb
• Other benchmark suite examples: TPC-C, TPC-H for databases


SPEC CPU2000

• 12 integer programs (C, C++)
  – gcc (compiler), perl (interpreter), vortex (database)
  – bzip2, gzip (replace compress), crafty (chess, replaces go)
  – eon (rendering), gap (group theoretic enumerations)
  – twolf, vpr (FPGA place and route)
  – parser (grammar checker), mcf (network optimization)
• 14 floating point programs (C, FORTRAN)
  – swim (shallow water model), mgrid (multigrid field solver)
  – applu (partial diffeq's), apsi (air pollution simulation)
  – wupwise (quantum chromodynamics), mesa (OpenGL library)
  – art (neural network image recognition), equake (wave propagation)
  – fma3d (crash simulation), sixtrack (accelerator design)
  – lucas (primality testing), galgel (fluid dynamics), ammp (chemistry)

Benchmarking Pitfalls

• Benchmark properties mismatched with the features studied
  – e.g., using SPEC for large cache studies
• Careless scaling
  – Using only the first few million instructions (initialization phase)
  – Reducing program data size
• Choosing performance from the wrong application space
  – e.g., in a realtime environment, choosing troff
  – Others: SPECweb, TPC-W (amazon.com)
• Using old benchmarks
  – "Benchmark specials": benchmark-specific optimizations
• Benchmarks must be continuously maintained and updated!


Common Benchmarking Mistakes

• Not validating measurements
• Collecting too much data but doing too little analysis
• Only average behavior represented in test workload
• Loading level (other users) controlled inappropriately
• Caching effects ignored
• Buffer sizes not appropriate
• Inaccuracies due to sampling ignored
• Ignoring monitoring overhead
• Not ensuring same initial conditions
• Not measuring transient (cold start) performance
• Using device utilizations for performance comparisons

Reporting Average Performance

• Averages: one of the things architects frequently get wrong
  – Pay attention now and you won't get them wrong on exams
• Important things about averages (i.e., means)
  – Ideally proportional to execution time (the ultimate metric)
    » Arithmetic mean (AM) for times
    » Harmonic mean (HM) for rates (IPCs)
    » Geometric mean (GM) for ratios (speedups)
  – There is no such thing as the average program
  – Use averages only when absolutely necessary


What Does the Mean Mean?

• Arithmetic mean (AM), or weighted arithmetic mean, tracks execution time:
  Σ_{i=1..N} Time_i / N, or Σ_i (W_i × Time_i)
• Harmonic mean (HM), or weighted harmonic mean, of rates (e.g., MFLOPS) tracks execution time:
  N / Σ_{i=1..N} (1/Rate_i), or 1 / Σ_i (W_i / Rate_i)
  – The arithmetic mean cannot be used for rates (e.g., IPC)
  – 30 MPH for 1 mile + 90 MPH for 1 mile != avg 60 MPH
• Geometric mean (GM): average speedups of N programs
  (Π_{i=1..N} speedup_i)^(1/N)
• The geometric mean of ratios is not proportional to total time!

Geometric Mean Weirdness

• What about averaging ratios (speedups)?
  – HM / AM change depending on which machine is the base

              Machine A   Machine B   B/A    A/B
  Program 1       1          10        10    0.1
  Program 2     1000         100       0.1    10

  AM:  (10 + 0.1)/2 = 5.05, and (0.1 + 10)/2 = 5.05: each base makes the other machine look 5.05x slower!
  HM:  2/(1/10 + 1/0.1) = 0.198 either way: the same contradiction, reversed
  GM:  sqrt(10 × 0.1) = sqrt(0.1 × 10) = 1: GM says they are equal

• But if we take total execution time, B is 9.1 times faster
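The weirdness above is easy to reproduce (a Python sketch using the two programs' execution times):

```python
import math

times_a = [1.0, 1000.0]    # machine A's times on programs 1 and 2
times_b = [10.0, 100.0]    # machine B's times on programs 1 and 2
ratios = [a / b for a, b in zip(times_a, times_b)]    # speedup of B per program

am = sum(ratios) / len(ratios)                   # 5.05: "B is 5.05x faster"?
hm = len(ratios) / sum(1.0 / r for r in ratios)  # ~0.198: "B is 5x slower"?
gm = math.prod(ratios) ** (1.0 / len(ratios))    # 1.0: "they're equal"?
total = sum(times_a) / sum(times_b)              # 9.1: by total time, B wins
print(am, hm, gm, total)
```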

Little's Law

• Key relationship between latency and bandwidth:
  Average number in system = arrival rate × mean holding time
• Example: how big a wine cellar should we build?
  – We drink (and buy) an average of 4 bottles per week
  – On average, I want to age my wine 5 years
  – Bottles in cellar = 4 bottles/week × 52 weeks/year × 5 years = 1040 bottles

System Balance

• Each system component produces & consumes data
• Make sure data supply and demand is balanced
• X demand >= X supply ⇒ computation is "X-bound"
  – e.g., memory-bound, CPU-bound, I/O-bound
• Goal: be bound everywhere at once (why?)
• X can be bandwidth or latency
  – X is bandwidth ⇒ buy more bandwidth
  – X is latency ⇒ a much tougher problem
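The wine-cellar example in code (trivial, but it highlights that the rate and holding-time units must agree; a Python sketch with names of my choosing):

```python
def littles_law(arrival_rate, mean_holding_time):
    # N = lambda * T: average number in the system
    return arrival_rate * mean_holding_time

bottles = littles_law(4, 52 * 5)   # 4 bottles/week, held for 260 weeks
print(bottles)  # 1040
```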


Tradeoffs

"Bandwidth problems can be solved with money. Latency problems are harder, because the speed of light is fixed and you can't bribe God." – David Clark (MIT)

• Can convert some latency problems to bandwidth problems
  – Solve those with money
• The famous "bandwidth/latency tradeoff"
• Architecture is the art of making tradeoffs

Bursty Behavior

• Q: to sustain 2 IPC, how many instructions should the processor be able to
  – fetch per cycle?
  – execute per cycle?
  – complete per cycle?
• A: NOT 2 (more than 2)
  – Dependences will cause stalls (under-utilization)
  – If the desired performance is X, peak performance must be > X
• Programs don't always obey "average" behavior
  – Can't design a processor to handle only average behavior

Cost

• Very important to real designs
  – Startup cost
    » One large investment per chip (or family of chips)
    » Increases with time
  – Unit cost
    » Cost to produce individual copies
    » Decreases with time
  – Only loose correlation to price and profit
• Moore's corollary: the price of a high-performance system is constant
  – Performance doubles every 18 months
  – Cost per function (unit cost) halves every 18 months
  – Assumes startup costs are constant (they aren't)

Startup and Unit Cost

• Startup cost: manufacturing
  – Fabrication plant, clean rooms, lithography, etc. (~$3B)
  – Chip testers/debuggers (~$5M apiece, typically ~200)
  – Few companies can play this game (Intel, IBM, Sun)
  – Equipment gets more expensive as devices shrink
• Startup cost: research and development
  – 300-500 person-years, mostly spent in verification
  – Need more people as designs become more complex
• Unit cost: manufacturing
  – Raw materials, chemicals, process time ($2-5K per wafer)
  – Decreased by improved technology & experience


Unit Cost and Die Size

• Unit cost is most strongly influenced by the physical size of the chip (die)
  » Semiconductors are built on silicon wafers (8")
  » Chemical + photolithographic steps create transistor/wire layers
  » Typical number of metal layers (M) today is 6 (α = ~4)
• Cost per wafer is roughly constant: C0 + C1 × α (~$5000)
• Basic cost per chip is proportional to chip area (mm²)
  » Typical: 150-200 mm²; 50 mm² (embedded) to 300 mm² (Itanium)
  » Typical: 300-600 dies per wafer
• Yield (% of working chips) is inversely proportional to area and α
  » Non-zero defect density (manufacturing defects per unit area)
  » P(working chip) = (1 + (defect density × die area)/α)^(−α)
  – Typical defect density: 0.005 per mm²
  – Typical yield: (1 + (0.005 × 200) / 4)^(−4) = 40%
  – Typical cost per chip: $5000 / (500 × 40%) = $25

Unit Cost -> Price

• If a chip costs $25 to manufacture, why does it cost $500 to buy?
  – Integrated circuit: $25
  – Must still be tested, packaged, and tested again
  – Testing (time == $): $5 per working chip
  – Packaging (ceramic + pins): $30
    » More expensive for more pins or if the chip dissipates a lot of heat
    » Packaging yield < 100% (but high)
  – Post-packaging test: another $5
  – Total for packaged chip: ~$65
  – Spread startup cost over volume ($100-200 per chip)
    » Proliferations (i.e., shrinks) are startup free (help profits)
  – Intel needs to make a profit…
  – … and so does Dell
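The yield and cost formulas above plug in directly (a Python sketch; note the exact yield is closer to 41%, which the slide rounds to 40%):

```python
def die_yield(defect_density, die_area_mm2, alpha=4.0):
    # P(working die) = (1 + defect_density * area / alpha) ** -alpha
    return (1.0 + defect_density * die_area_mm2 / alpha) ** -alpha

def cost_per_good_die(wafer_cost, dies_per_wafer, yield_frac):
    return wafer_cost / (dies_per_wafer * yield_frac)

y = die_yield(0.005, 200)                         # ~0.41
print(round(cost_per_good_die(5000, 500, y), 2))  # 24.41
```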

Reading

• H&P Chapter 1

Summary: Performance

• Next: Instruction Sets
