ECE 5745 Complex Digital ASIC Design Course Overview Christopher Batten

School of Electrical and Computer Engineering
Cornell University
http://www.csl.cornell.edu/courses/ece5745

Complex Digital ASIC Design
• Course goal, structure, motivation
  - What is the goal of the course?
  - Why should students want to take this course?
  - How is the course structured?
• Activity 1: Evaluation of Integer Multiplier
• Case Study: Scalar vs. Vector Processors
  - Example design-space exploration
  - Example real ASIC chips
• Activity 2: Brainstorming for Sorting Accelerator

The Computer Engineering Stack
The gap between Application and Technology is too large to bridge in one step (but there are exceptions, e.g., magnetic compass).
[Figure: the computer engineering stack, top to bottom: Application, Algorithm, Programming Language (PL), Operating System (OS), Instruction Set Architecture (ISA), Microarchitecture (μArch), Register-Transfer Level (RTL), Gate Level, Circuits, Devices, Technology; "Computer Architecture" is marked alongside the stack]
In its broadest definition, computer architecture is the design of the abstraction/implementation layers that allow us to execute information processing applications efficiently using available manufacturing technologies.

What is Computer Architecture?
Application Requirements
• Suggest how to improve architecture
• Provide revenue to fund development

Technology Constraints
• Restrict what can be done efficiently
• New technologies make new arch possible

Architecture provides feedback to guide application and technology research directions.

Key Metrics in Computer Architecture
• Primary Metrics
  - Execution time (cycles/task)
  - Energy (Joules/task)
  - Cycle time (ns/cycle)
  - Area (µm²)
• Secondary Metrics
  - Performance (ns/task)
  - Average power (Watts)
  - Peak power (Watts)
  - Cost ($)
  - Design complexity
  - Reliability
  - Flexibility
[Figure: block diagram with labels Network, I$, P, D$]
Discuss qualitative first-order analysis from ECE 4750 on board.

Unanswered Questions from ECE 4750
• How can we quantitatively evaluate area, cycle time, and energy?
• How do we actually implement processors, memories, and networks in a real chip?
• How should we implement/analyze application-specific accelerators?
  - Very loosely coupled memory-mapped accelerators
  - More tightly coupled co-processor accelerators
  - Specialized instructions and functional units
[Figure: block diagram with labels Network, I$, Accelerated Instructions, P, Xcel, D$]
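
The secondary metrics on the Key Metrics slide follow from the primary ones by simple arithmetic: performance (ns/task) is execution time (cycles/task) times cycle time (ns/cycle), and average power is energy per task divided by time per task. The following is a minimal Python sketch of that relationship. The `DesignPoint` class, its field names, and the numbers are hypothetical and are not taken from the course material.

```python
from dataclasses import dataclass


@dataclass
class DesignPoint:
    """Primary metrics for one design (hypothetical container, not from the slides)."""
    cycles_per_task: float    # execution time, in cycles/task
    cycle_time_ns: float      # cycle time, in ns/cycle
    energy_j_per_task: float  # energy, in Joules/task

    def time_per_task_ns(self) -> float:
        # Performance (ns/task) = (cycles/task) * (ns/cycle)
        return self.cycles_per_task * self.cycle_time_ns

    def average_power_w(self) -> float:
        # Average power (Watts) = (Joules/task) / (seconds/task)
        seconds_per_task = self.time_per_task_ns() * 1e-9
        return self.energy_j_per_task / seconds_per_task


# Made-up numbers, chosen only to illustrate the arithmetic
design = DesignPoint(cycles_per_task=1000.0, cycle_time_ns=2.0,
                     energy_j_per_task=4e-6)
print(design.time_per_task_ns())  # ns/task
print(design.average_power_w())   # Watts (about 2 W for these numbers)
```

Keeping the primary metrics as the stored fields and deriving the rest mirrors the slide's split: a design with fewer cycles per task but a longer cycle time can still be slower overall, which only shows up once the two are multiplied.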
ASIC: Application-Specific Integrated Circuit
[Figure: energy (Joules per task) vs. performance (tasks per second), with a power constraint. Design points labeled A through F; design labels: simple processor, superpipelined, superscalar w/ deeper pipelines, out-of-order superscalar, multicore, specialized accelerators]

ASIC: Application-Specific Integrated Circuit
[Figure: energy efficiency (tasks per Joule) vs. performance (tasks per second), with design performance and design power constraints. Labels: simple processor, embedded architectures, high-performance architectures, more flexible accelerator, less flexible accelerator, custom ASIC; flexibility vs. specialization]

Goal for ECE 5745 is to answer these questions!
• How can we quantitatively evaluate area, cycle time, and energy?
• How do we actually implement processors, memories, and networks in a real chip?
• How should we implement/analyze application-specific accelerators?
  - Very loosely coupled memory-mapped accelerators
  - More tightly coupled co-processor accelerators
  - Specialized instructions and functional units

Full Custom Design vs. Standard-Cell Design
• Full-Custom Design (ECE 4740)
  - Designer is free to do anything, anywhere, though the team usually imposes some design discipline
  - Most time-consuming design style; reserved for very high performance or very high volume chips (Intel microprocessors, RF power amps for cellphones)
• Standard-Cell Design (ECE 5745)
  - Fixed library of "standard cells" and SRAM memory generators
  - Register-transfer-level description is automatically mapped to this library of standard cells, then these cells are placed and routed automatically
  - Enables agile hardware design methodology
[Figure: piece of full-custom multiplier array; full-custom layout in 1.0µm w/ 2 metal layers. The embedded slide (6.884, Spring 2005, L01 Introduction, 2 Feb 2005) also notes that full custom requires complete customization of all layers of the wafer]

Standard Cell ASICs
• Also called Cell-Based ICs (CBICs)
• Fixed library of cells plus memory generators
• Cells can be synthesized from HDL, or entered in schematics
• Cells placed and routed automatically
• Requires complete set of custom masks for each design
• Currently most popular hard-wired ASIC type (6.884 will use this)

Standard Cell Design
• Cells arranged in rows
• Cells have standard height but vary in width
• Designed to connect power, ground, and wells by abutment
[Figure: standard-cell layout with generated memory arrays (Mem 1, Mem 2). Labels: well contact under power rail, clock rail (not typical), VDD rail, GND rail, power rails in M1, cell I/O on M2, NAND2, flip-flop, ripple carry adder with carry chain highlighted. Source: 6.884, Spring 2005, L01 Introduction, 2 Feb 2005]

Standard-Cell Design Methodology
[Figure: flow diagram. Labels: Design in HDL, HDL Simulator, Switching Activity, Standard Cells, Synthesis, Gate-Level Model, Place&Route, Layout, Power Analysis. Metrics shown: execution time (cycles/task), area (μm²), cycle time (ns), energy (J/task)]

Example Standard-Cell Chip Plot
[Figure: single-lane vector-thread unit w/ 256 registers]

What is Complex Digital ASIC Design?
Complex digital ASIC design is the process of quantitatively exploring the area, cycle time, execution time, and energy trade-offs of various application-specific accelerators (and general-purpose proc+mem+net) using automated standard-cell CAD tools, and then transforming the most promising design into layout ready for fabrication.

Complex Digital ASIC Design
• Course goal, structure, motivation
  - What is the goal of the course?
  - Why should students want to take this course?
  - How is the course structured?
• Activity 1: Evaluation of Integer Multiplier
• Case Study: Scalar vs. Vector Processors
  - Example

