Design and Implementation of a Multithreaded Associative SIMD Processor
DESIGN AND IMPLEMENTATION OF A MULTITHREADED ASSOCIATIVE SIMD PROCESSOR

A dissertation submitted to Kent State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

by Kevin Schaffer
December, 2011

Dissertation written by Kevin Schaffer
B.S., Kent State University, 2001
M.S., Kent State University, 2003
Ph.D., Kent State University, 2011

Approved by
Robert A. Walker, Chair, Doctoral Dissertation Committee
Johnnie W. Baker, Kenneth E. Batcher, Eugene C. Gartland, Members, Doctoral Dissertation Committee

Accepted by
John R. D. Stalvey, Administrator, Department of Computer Science
Timothy Moerland, Dean, College of Arts and Sciences

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 INTRODUCTION
1.1. Architectural Trends
  1.1.1. Wide-Issue Superscalar Processors
  1.1.2. Chip Multiprocessors (CMPs)
1.2. An Alternative Approach: SIMD
1.3. MTASC Processor
1.4. Dissertation Organization

CHAPTER 2 ASSOCIATIVE COMPUTING
2.1. Background
  2.1.1. Associative Memories
  2.1.2. Associative Processors
  2.1.3. STARAN
  2.1.4. Massively Parallel Processor (MPP)
2.2. The Associative Computing Model (ASC)
  2.2.1. Associative Search
  2.2.2. Responder Detection
  2.2.3. Responder Selection/Iteration
  2.2.4. Maximum/Minimum Search
  2.2.5. PE Interconnection Network
  2.2.6. MASC
2.3. ASC Processor Prototypes
  2.3.1. First Processor
  2.3.2. Scalable Processor
  2.3.3. Pipelined Processor
  2.3.4. MASC Processor
  2.3.5. MTASC and Previous ASC Processors
2.4. Summary

CHAPTER 3 PIPELINING
3.1. Background
  3.1.1. Pipelining Basics
  3.1.2. Hazards
3.2. Pipelining a SIMD Processor
  3.2.1. SIMD Instruction Types
  3.2.2. Pipelining Instruction Execution
  3.2.3. Pipelining the Broadcast and Reduction Networks
  3.2.4. SIMD-Specific Pipeline Hazards
3.3. MTASC Pipeline
  3.3.1. Unified Pipeline
  3.3.2. Hazards in the Unified Pipeline
  3.3.3. Diversified Pipeline
  3.3.4. Hazards in the Diversified Pipeline
  3.3.5. Comparison of Unified and Diversified Pipelines
3.4. Summary

CHAPTER 4 MULTITHREADING
4.1. Background
4.2. Thread Scheduling
  4.2.1. Simple Scheduler
  4.2.2. Dependency-Aware Scheduler
  4.2.3. Semaphore-Aware Scheduler
4.3. Summary

CHAPTER 5 MTASC: A MULTITHREADED ASSOCIATIVE SIMD PROCESSOR
5.1. Instruction Set Architecture
  5.1.1. Registers
  5.1.2. Instruction Formats
  5.1.3. ALU Instructions
  5.1.4. Load/Store Instructions
  5.1.5. Branch and Jump Instructions
  5.1.6. Enable Stack Instructions
  5.1.7. Reduction Instructions
  5.1.8. Semaphore Instructions
5.2. Organization
  5.2.1. Control Unit
  5.2.2. Scalar Execution Unit
  5.2.3. Processing Elements (PEs)
  5.2.4. Broadcast/Reduction Network
  5.2.5. PE Interconnection Network
5.3. MTASC Assembler
5.4. MTASC Simulator
5.5. Summary

CHAPTER 6 MULTITHREADED ASSOCIATIVE BENCHMARKS
6.1. Algorithmic Conventions
6.2. Associativity
6.3. Benchmarks
  6.3.1. Jarvis March
  6.3.2. Minimum Spanning Tree
  6.3.3. Matrix-Vector Multiplication