EXPLOITING FINE-GRAIN CONCURRENCY:

ANALYTICAL INSIGHTS IN

SUPERSCALAR PROCESSOR DESIGN

Pradeep K. Dubey

George B. Adams III

Michael J. Flynn *

School of Electrical Engineering

Purdue University

West Lafayette, Indiana 47907

Purdue University

TR-EE 91-31

August 1991

* Department of Electrical Engineering, Stanford University

ACKNOWLEDGEMENTS

The authors thank Prof. Henry Dietz, Prof. Jose Fortes, Prof. Mike Atallah, Prof. Arup Bose, and Raymond Kamin, all of Purdue University.

We also thank Dr. James T. Kuehn at the Supercomputing Research Center in Bowie, Maryland, and the management at the University Computing Center at the California State University, Sacramento, for making their facilities available for this research. Finally, we thank Mr. Kent G. Fielden of Intel Corporation, Santa Clara, for making available technical documentation crucial to the experiments.

TABLE OF CONTENTS

Page

LIST OF TABLES...... vi

LIST OF FIGURES ...... viii

LIST OF SYMBOLS ...... xvii

ABSTRACT...... xxi

CHAPTER 1 INTRODUCTION ...... 1

1.1 Motivation ...... 1
1.2 Review of Concurrency Representation, Detection and Scheduling Techniques ...... 2
1.2.1 Representing Concurrency ...... 3
1.2.2 Dependencies ...... 5
1.2.3 Detecting, Dispatching and Scheduling Concurrent Operations ...... 9
1.2.4 Implementation Tradeoffs ...... 20
1.2.5 Summary ...... 24
1.3 Dissertation Overview ...... 30

CHAPTER 2 OPTIMAL PIPELINING ...... 32

2.1 Introduction ...... 32
2.1.1 Previous Research ...... 33
2.2 A Generic Model ...... 34
2.3 Inferences ...... 38
2.3.1 Correspondence with Previously Published Experimental Results ...... 53
2.4 Potential Improvements to the Model ...... 53
2.5 Summary ...... 55

CHAPTER 3 BRANCH STRATEGIES: MODELLING AND OPTIMIZATION ...... 59
3.1 Introduction ...... 59
3.1.1 Previous Research ...... 59
3.2 The Model ...... 60
3.3 Classification of Branch Strategies ...... 62
3.4 Branch Prediction ...... 65
3.5 Results ...... 70
3.5.1 Inferences ...... 70

3.6 Hybrid Strategies ...... 76
3.6.1 Inferences ...... 77
3.7 Summary ...... 81

CHAPTER 4 SUPERPIPELINED VERSUS SUPERSCALAR ...... 82

4.1 Introduction ...... 82
4.2 Superpipeline/Superscalar Tradeoff Model ...... 82
4.3 Performance Limits ...... 85
4.4 Modelling Resource Utilization ...... 89
4.5 Summary ...... 90

CHAPTER 5 SIZE TRADEOFFS AND CHARACTERIZATION OF PROGRAM PARALLELISM.... . 91

5.1 Introduction ...... 91
5.2 The Analytic Performance Model ...... 92
5.3 Cost of Branches ...... 98
5.3.1 Calculating Misprediction Delay Resulting from ...... 99
5.3.2 Alternate Computation for Pω ...... 101
5.3.3 Dynamic Scheduling with Finite Lookahead ...... 103
5.4 Experimental Results ...... 103
5.5 Potential Improvements to the Model ...... 113
5.6 Summary ...... 120

CHAPTER 6 SPECTRUM OF CHOICES: SUPERPIPELINED, SUPERSCALAR OR MULTIPROCESSOR? ...... 122

6.1 Introduction ...... 122
6.2 Delays Associated with Multiprocessors ...... 122
6.2.1 Dependency Delay ...... 124
6.2.2 Operand Fetch Delay ...... 126
6.3 Utilization Constraints ...... 126
6.3.1 Characteristics of Utilization Curves ...... 127
6.3.2 Alternate Characterization of Program Parallelism ...... 129
6.4 Results ...... 132
6.5 Combined Systems ...... 139
6.6 Summary ...... 142

CHAPTER 7 CONCLUSIONS ...... 143

7.1 Summary ...... 143
7.2 Contributions ...... 144

CHAPTER 8 FUTURE RESEARCH...... 146

8.1 Out-of-sequence Execution Versus Locality of Operand References ...... 146
8.2 Cost/Performance Tradeoffs for Concurrency Detection in Different Execution Phases ...... 153
8.3 Other Measures for Distance between Instruction Pairs ...... 154
8.4 Recursive Performance Modelling ...... 154

LIST OF REFERENCES...... 157

APPENDICES
Appendix A ...... 163
Appendix B ...... 182
Appendix C (158 page source code listing; not included) ...... 194

LIST OF TABLES

Table Page

1.1 Comparison of concurrency detection and scheduling strategies ...... 11

2.1 Nomenclature and nominal values of model parameters...... 37

2.2 Normalized throughput (Gnorm) versus static overhead (c)...... 39

2.3 Normalized throughput (Gnorm) versus dynamic overhead (k)..... 39

2.4 Normalized throughput (Gnorm) versus constant term of the utilization model (Mmax) ...... 40

2.5 Normalized throughput (Gnorm) versus first-order coefficient of the utilization model (v) ...... 40

2.6 Normalized throughput (Gnorm) versus second-order coefficient of the utilization model (r)...... 41

2.7 Throughput gain (AG) versus static overhead (c)...... 45

2.8 Throughput gain (AG) versus dynamic overhead (k) ...... 45

2.9 Throughput gain (AG) versus constant term of the utilization model (Mmax) ...... 46

2.10 Throughput gain (AG) versus first-order coefficient of the utilization model (v) ...... 46

2.11 Throughput gain (AG) versus second-order coefficient of the utilization model (r)...... 47

2.12 Normalized throughput (Gnorm) versus branch frequency (b)...... 56

2.13 Normalized throughput (Gnorm) versus segment slowdown frequency (x)...... 56

3.1 Classification of branch strategies ...... 63

3.2 Table of definitions...... 68

3.3 Nominal values of model parameters ...... 71

5.1 Benchmarks used in this study...... 104

6.1 Nominal values of model parameters describing hardware and program characteristics ...... 133

6.2 Optimum number of pipelines versus ratio of memory access delay, dc, to network access delay factor, dcn...... 138

Appendix Table

A.1 Commonly used symbols ...... 166

B.1 Additional benchmarks used in this study ...... 182

LIST OF FIGURES

Figure Page

1.1 Computation graph (b), Precedence matrix (c) and Petri net (d) for the sample code sequence in (a) ...... 6

1.2 AND/OR graph (b) for the code sequence in (a). Assuming a machine with two add/subtract and two multiply/divide units. Input dependence ignored...... 8

1.3 Available design choices for superscalar processors...... 26

1.4 Classification of scheduling strategies ...... 26

1.5 Speedup from out-of-order execution relative to in-order execution as a function of depth ...... 27

1.6 Multiple instruction issue with out-of-order execution and with scope limited to within the basic block; assuming single-cycle functional unit processor (a) and multiple-cycle functional unit processor (b). These graphs are derived from results reported in [AKT86] ...... 29

1.7 Architectural framework used for this research...... 31

2.1 Normalized throughput (Gnorm) versus static overhead (c)...... 42

2.2 Normalized throughput (Gnorm) versus dynamic overhead (k)...... 42

2.3 Normalized throughput (Gnorm) versus constant term of the utilization model (u max) ...... 43

2.4 Normalized throughput (Gnorm) versus first-order coefficient of the utilization model (v) ...... 43

2.5 Normalized throughput (Gnorm) versus second-order coefficient of the utilization model (r)...... 44

2.6 Throughput gain (AG) versus static overhead (c) ...... 49


2.7 Throughput gain (AG) versus dynamic overhead (k) ...... 49

2.8 Throughput gain (AG) versus constant term of the utilization model (Mmax) ...... 50

2.9 Throughput gain (AG) versus first-order coefficient of the utilization model (v) ...... 50

2.10 Throughput gain (AG) versus second-order coefficient of the utilization model (r)...... 51

2.11 Optimal throughput gain (AGopt) versus static overhead (c)...... 52

2.12 Normalized throughput (Gnorm) versus branch frequency (b) ...... 57

2.13 Normalized throughput (Gnorm) versus segment slowdown frequency (x)...... 57

3.1 Instruction dependency in a pipeline...... 61

3.2 An instruction pipeline ...... 61

3.3 A loop buffer ...... 64

3.4 Predict branch always taken with target copy (PTTC) ...... 66

3.5 A branch target buffer ...... 67

3.6 Average branch delay versus successful branch probability for PBNT, LB, PTTC, DB, TNTD, and BTB strategies...... 72

3.7 Average branch delay versus successful branch probability for PBNT, PTA, FTOF, PBAT, and FBP strategies ...... 72

3.8 Average number of wasted instruction fetches per branch versus successful branch probability for PBNT, LB, PTTC, DB, TNTD, and BTB strategies...... 73

3.9 Average number of wasted instruction fetches per branch versus successful branch probability for PBNT, PTA, FTOF, PBAT, and FBP strategies...... 73

3.10 Merit ratio versus successful branch probability for PBNT, LB, PTTC, DB, TNTD, and BTB strategies...... 74

3.11 Merit ratio versus successful branch probability for PBNT, PTA, FTOF, PBAT, and FBP strategies ...... 74


3.12 Average branch delay versus successful branch probability for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 78

3.13 Average number of wasted instruction fetches per branch versus successful branch probability for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 78

3.14 Merit Ratio versus successful branch probability for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 79

3.15 Average branch delay versus number of stages for conditional branch resolution for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 79

3.16 Average branch delay versus Loop/Target buffer hit probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies ...... 80

3.17 Average branch delay versus target fetch freeze probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies ...... 80

4.1 Normalized throughput versus number of pipelines, with the following nominal assumptions: data reference probability = 0.5, data cache miss probability = 0.05, data cache miss duration = 0.5 * operation delay, branch probability = 0.2, and branch delay = 0.15 * operation delay ...... 86

4.2 Utilization versus number of pipelines (parameter values same as in Figure 4.1) ...... 87

5.1 Illustration of dependencies determining conditional independence probability, pδ. Each single arc indicates a pair of instructions that are given to be independent. The double arc denotes the dependence in question for pδ ...... 94

5.2 Probability of scheduling k instructions for various values of pδ and a fixed instruction window size of 16 ...... 96

5.3 Probability of scheduling k instructions for various instruction window sizes and pδ = 0.7 ...... 97

5.4 Probability of scheduling k instructions for various instruction window sizes and pδ = 0.8 ...... 97

5.5 Illustration of a program tree, a scheduled trace of execution, and the assembly of wide instruction words with beyond-basic-block scheduling ...... 100


5.6 Average misprediction delay versus program tree depth for branch frequency, b = 0.2, average cost of damage undoing per percolation, μ = 1, and various percolation-distance distribution parameter, q, values. The parameter q is a measure of beyond-basic-block scheduling probability ...... 102

5.7 Measured instruction scheduling probability versus distance for the Stanford, spice, fpppp, and tair benchmarks...... 106

5.8 Measured instruction scheduling probability versus distance for the applu, cgm, fftpde, and mgrid benchmarks ...... 106

5.9 Measured instruction scheduling probability versus distance for the mdg, mg3d, and bdna benchmarks...... 107

5.10 Measured beyond-basic-block instruction scheduling probability versus distance for the Stanford, spice, fpppp, and tair benchmarks...... 109

5.11 Measured beyond-basic-block instruction scheduling probability versus distance for the applu, cgm, fftpde, and mgrid benchmarks...... 109

5.12 Measured beyond-basic-block instruction scheduling probability versus distance for the mdg, mg3d, and bdna benchmarks...... 110

5.13 Predicted misprediction delay based on the empirically collected pω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks for the Stanford, spice, fpppp, and tair benchmarks ...... 111

5.14 Predicted misprediction delay based on the empirically collected pω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks for the applu, cgm, fftpde, and mgrid benchmarks ...... 111

5.15 Predicted misprediction delay based on the empirically collected pω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks for the mdg, mg3d, and bdna benchmarks ...... 112

5.16 Throughput under resource and scope constraints for the Stanford benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 114

5.17 Throughput under resource and scope constraints for the spice benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 114


5.18 Throughput under resource and scope constraints for the fpppp benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 115

5.19 Throughput under resource and scope constraints for the tair benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 115

5.20 Throughput under resource and scope constraints for the applu benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 116

5.21 Throughput under resource and scope constraints for the cgm benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 116

5.22 Throughput under resource and scope constraints for the fftpde benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 117

5.23 Throughput under resource and scope constraints for the mgrid benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 117

5.24 Throughput under resource and scope constraints for the mdg benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource Constraints that are not part of the model...... 118

5.25 Throughput under resource and scope constraints for the mg 3d benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 118

5.26 Throughput under resource and scope constraints for the bdna benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 119

6.1 Combined system architecture assumed by the models ...... 123


6.2 Inter-iteration dependency...... 125

6.3 Sample code sequences with same a (= 0.1) but different amounts of parallelism...... 128

6.4 Assumed utilization curves ...... 130

6.5 (a) Utilization versus instruction word width measured on the Multiflow TRACE 28/200 computer and (b) utilization curves derived from tables on pp. 214-217 of [Pol86] ...... 131

6.6 Impact of utilization on throughput for superpipelines...... 135

6.7 Maximum throughput (a) and optimum number of pipelines (b) as a function of the fraction of code that must be executed on a single function unit (pipeline)...... 136

6.8 Maximum throughput (a) and optimum number of pipelines (b) versus ratio of memory access delay (dc) to network access delay factor (dn)', dc values shown are 0.05 to 0.65 in increments of 0.1 ...... 137

6.9 Maximum throughput (a) and optimum number of processors (b) versus inter-iteration dependency distance; I ranges from 0 to 0.30 in increments of 0.05...... 140

6.10 Throughput plot of combined system performance...... 141

8.1 Two execution scenarios ...... 147

8.2 Reference pattern for a memory location bound to certain logical operand...... 151

8.3 Entries and exits out of data cache for a memory location bound to certain logical operand ...... 151

8.4 Dependence across iterations ...... 155

Appendix Figure

A.1 Average branch delay versus number of substages in the instruction fetch stage for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies ...... 174

A.2 Average number of wasted instruction fetches per branch versus number of substages in the instruction fetch stage for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies ...... 174


A.3 Merit ratio versus number of substages in the instruction fetch stage for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies ...... 175

A.4 Average branch delay versus number of stages for conditional branch resolution for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies...... 175

A.5 Average number of wasted instruction fetches per branch versus number of stages for conditional branch resolution for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies ...... 176

A.6 Merit ratio versus number of stages for conditional branch resolution for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies ...... 176

A.7 Average branch delay versus number of substages in the instruction fetch Stage for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 177

A.8 Average number of wasted instruction fetches per branch versus number of substages in the instruction fetch stage for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 177

A.9 Merit ratio versus number of substages in the instruction fetch stage for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 178

A. 10 Average number of wasted instruction fetches per branch versus number of stages for conditional branch resolution for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 178

A. 11 Merit ratio versus number of stages for conditional branch resolution for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 179

A. 12 Average number of wasted instruction fetches per branch versus Loop/Target buffer hit probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies...... 179

A. 13 Merit ratio versus Loop/Target buffer hit probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies...... 180

A. 14 Average number of wasted instruction fetches per branch versus target fetch freeze probability for LB, BTB, TTCDB, TTDLB, TNTLB, and TNBTB strategies...... 180

A.15 Merit ratio versus target fetch freeze probability for LB, BTB, TTCDB, TTDLB, TNTLB, and TNBTB strategies ...... 181


B.l Measured instruction scheduling probability versus distance for whetstone, tomcatv, appbt, appsp, and buk benchmarks...... 183

B.2 Measured instruction scheduling probability versus distance for adm, qcd,track, and ocean benchmarks...... 183

B.3 Measured instruction scheduling probability versus distance for dyfesm, flo52, trfd, and spec77 benchmarks...... '...... 184

B.4 Measured beyond-basic-block instruction scheduling probability versus distance for whetstone, tomcatv, appbt, appsp, and buk benchmarks...... 184

B.5 Measured beyond-basic-block instruction scheduling probability versus distance for adm, qcd, track, and ocean benchmarks...... 185

B.6 Measured beyond-basic-block instruction scheduling probability versus distance for dyfesm, flo52, trfd, and spec77 benchmarks...... 185

B.7 Predicted misprediction delay based on the empirically collected pω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks for whetstone, tomcatv, appbt, appsp, and buk benchmarks ...... 186

B.8 Predicted misprediction delay based on the empirically collected pω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks for adm, qcd, track, and ocean benchmarks ...... 186

B.9 Predicted misprediction delay based on the empirically collected pω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks for dyfesm, flo52, trfd, and spec77 benchmarks ...... 187

B.10 Throughput under resource and scope constraints for the whetstone benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 187

B.11 Throughput under resource and scope constraints for the tomcatv benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 188

B.12 Throughput under resource and scope constraints for the appbt benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 188


B.13 Throughput under resource and scope constraints for the appsp benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 189

B.14 Throughput under resource and scope constraints for the buk benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... ;...... 189

B.15 Throughput under resource and scope constraints for the adm benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 190

B.16 Throughput under resource and scope constraints for the qcd benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 190

B.17 Throughput under resource and scope constraints for the track benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 191

B.18 Throughput under resource and scope constraints for the ocean benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 191

B.19 Throughput under resource and scope constraints for the dyfesm benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 192

B.20 Throughput under resource and scope constraints for the flo52 benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 192

B.21 Throughput under resource and scope constraints for the trfd benchmark; resources varied with instruction word width equal to 2.3.4.6, and 12. The compiler output was influenced by resource constraints that are not part of the model...... 193

B.22 Throughput under resource and scope constraints for the spec77 benchmark; resources varied with instruction word width equal to 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model ...... 193

LIST OF SYMBOLS

Symbol          Explanation

T Latency of the instruction execution logic tree with no pipelining (in time units)

Δt Clock period with pipelining (in time units)

g Latency of the instruction execution logic tree with no pipelining (in units of gate delays), also referred to as operation gate delay

gpipelined Latency of the logic tree with pipelining (in units of gate delays)

n Total number of operations

Tn Time to execute n operations

G Throughput in terms of number of operations per time unit

s Number of stages in a pipeline

k Number of pipelines

N Number of processors in a multiprocessor system

W Size of an instruction window

K Dynamic overhead coefficient for a pipeline

c Constant overhead for a pipeline

Sopt Optimum number of segments for maximizing pipeline throughput

Gnominal Pipeline throughput at nominal values of all parameters

Gnorm Pipeline throughput, normalized relative to Gnominal

Pipeline throughput at s = Sopt (normalized with respect to Gnominal)

Pipeline throughput at s = Ssubop (normalized with respect to Gnominal)

Pipeline throughput at s = Sovrop (normalized with respect to Gnominal)

Pipeline throughput at s = Snom

Pipeline throughput gain at s = Sopt (normalized with respect to Gnom)

Pipeline throughput gain at s = Ssubop (normalized with respect to Gnom)

Pipeline throughput gain at s = Sovrop (normalized with respect to Gnom)

Resource utilization

Maximum possible resource utilization

First-order coefficient in the utilization equation

Second-order coefficient in the utilization equation

Utilization factor for the parallelizable code for pipeline stages

Utilization factor for the parallelizable code for complete pipelines

Utilization factor for the parallelizable code for processors

Average branch delay

Average number of wasted instructions per branch

Fraction of code that must be serially executed on one pipeline

Fraction of code that must be serially executed on one processor

Maximum degree of operation (pipeline) level parallelism

Maximum degree of iteration (processor) level parallelism

Distance between dependent instructions as a fraction of the size of the loop body

Fraction of branch delays overlapped with execution delays

Fraction of operand fetch delays overlapped with execution delays

b Branch frequency

x Two-clock instruction frequency

(1-h) Cache miss probability

m Frequency of memory reference per operation

Fraction of data accesses to shared variables

db Fraction of operation gate delay required for branch resolution

dm Multiple of operation gate delay required for cache miss processing

dc Multiple of operation gate delay required for on-chip cache access

Dl Average delay for wrongfully executed instructions per incorrect branch prediction

μ Number of cycles for undoing the damage of a wrongfully executed instruction

Event that instructions Ii and Ij are mutually independent

P(x) Probability of the event x

P(I0, Ij) Scheduling probability of instruction Ij with I0

Pk(I1, I2, ..., IW) Probability of having exactly k instructions dispatchable with I0, in a window of size W

P>=k(I1, I2, ..., IW) Probability of having k or more instructions dispatchable with I0, in a window of size W

gk Throughput of a k-pipeline processor ignoring any dependency constraints

Gk Throughput of a k-pipeline processor under dependency constraints

Dynamic distance between dependent instructions, also used for inter-iteration dependency distance

pδ Probability of conditional independence of instructions at a distance of δ

pω Probability of scheduling instructions past ω basic blocks

ABSTRACT

Dubey, Pradeep Kumar. Ph.D., Purdue University, August 1991. Exploiting Fine-Grain Concurrency: Analytical Insights in Superscalar Processor Design. Major Professor: George B. Adams III.

This dissertation develops analytical models to provide insight into various design issues associated with superscalar-type processors, i.e., processors capable of executing multiple instructions per cycle. A survey of the existing machines and literature has been completed with a proposed classification of various approaches for exploiting fine-grain concurrency. Optimization of a single pipeline is discussed based on an analytical model. The model-predicted performance curves are found to be in close proximity to published results using simulation techniques. A model is also developed for comparing different branch strategies for single-pipeline processors in terms of their effectiveness in reducing branch delay. The additional instruction fetch traffic generated by certain branch strategies is also studied and is shown to be a useful criterion for choosing between equally well performing strategies. Next, processors with multiple pipelines are modelled to study the tradeoffs associated with deeper pipelines versus multiple pipelines. The model developed can reveal the cause of a performance bottleneck: insufficient resources to exploit discovered parallelism, insufficient instruction stream parallelism, or insufficient scope of concurrency detection. The cost associated with speculative (i.e., beyond basic block) execution is examined via probability distributions that characterize the inherent parallelism in the instruction stream. The throughput prediction of the analytic model is shown, using a variety of benchmarks, to be close to the measured static throughput of the compiler output under resource and scope constraints. Further experiments provide misprediction delay estimates for these benchmarks under scope constraints, assuming beyond-basic-block, out-of-order execution and run-time scheduling. These results were derived using traces generated by the Multiflow TRACE SCHEDULING™ compacting C and FORTRAN 77 compilers.

TRACE SCHEDULING is a trademark of Multiflow Computer, Inc.

A simplified extension to the model to include multiprocessors is also proposed. The extended model is used to analyze combined systems, such as superpipelined multiprocessors and superscalar multiprocessors. It is shown that the number of pipelines (or processors) at which the maximum throughput is obtained is increasingly sensitive to the ratio of memory access time to network access delay as memory access time increases. Further, as a function of inter-iteration dependency distance, optimum throughput is shown to vary nonlinearly, whereas the corresponding optimum number of processors varies linearly. The predictions from the analytical model agree with published results based on simulations.

CHAPTER 1 INTRODUCTION

1.1 Motivation

Ever since the advent of the first computer, one group of designers has concentrated on achieving equivalent performance at a lower cost, while another has endeavored to deliver higher performance at an affordable cost. In the world of microprocessors, some of the latter group of designers are trying to gain a better understanding of the performance achievable by concurrent execution of scalar instructions. Processors capable of such execution are referred to as superscalar in recent literature. Because most current microprocessors (CISC or RISC) achieve an execution rate of one assembly level instruction per clock, there is a keen interest in exceeding this rate by executing multiple instructions per clock. Contribution to this ongoing research is the primary motivation of this dissertation.

Performance studies can be broadly classified as either simulation-based or analytical. Most of the work on processor performance has concentrated on the simulation-based approach. The research presented in this dissertation seeks to complement previous work by providing an alternative approach to studying processor performance based on relatively simple analytical models. Such models, when validated through correlations with existing simulation-based performance predictions and with empirical data when available, can provide valuable additional insight into performance potential at a fraction of the time needed to run typical simulations or conduct experiments to gather performance measurements.

Architects of next generation processors are often required to provide as accurate a performance estimate of the proposed design as possible. Common performance estimates these days for microprocessors are numbers such as SPECmark, which is the geometric mean of the SPEC ratios for the 10 CPU-intensive benchmarks that comprise the SPEC suite. A SPEC ratio is the ratio of execution time for a given benchmark relative to the execution time of that benchmark on a VAX 11/780 running the ULTRIX 3.1B operating system. (SPEC stands for the Systems Performance Evaluation Cooperative.) The traditional performance modelling approach is to run traces of the benchmarks on a low level model of the proposed next generation machine. While this low level model of the new machine must be detailed enough to provide a sufficiently accurate prediction, it has to be abstract enough to keep simulation run time feasible. The simulation predicament can be resolved by noting that in moving to a next generation machine design it is the machine architecture that is being redefined, while the benchmark traces being used as input may be unchanged. Therefore, if the benchmarks can be abstracted or characterized, there would be no need to go through tedious clock-by-clock simulation. There are several ways to characterize a machine architecture, such as the number of pipelines and processors, cache size, branch delay, and so on, that can be useful for estimating its performance potential. An analogous set of program features that can serve as an indicator of a program's performance potential has been almost absent in the published literature. The need for program characterization has also served as a counterpoint motivation for this work.
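For reference, the SPECmark figure just mentioned can be written out explicitly; this is simply the standard geometric mean over the ten SPEC ratios and is not a formula specific to this dissertation:

    \mathrm{SPECmark} \;=\; \Bigl(\,\prod_{i=1}^{10} r_i \Bigr)^{1/10}

where r_i is the SPEC ratio of benchmark i as defined above. Because the mean is geometric, a doubling of speed on any single benchmark improves SPECmark by the same factor of 2^{1/10}, regardless of which benchmark it is.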

1.2 Review of Concurrency Representation, Detection and Scheduling Techniques

Given a certain end-user task, the most obvious performance measure is the amount of real time spent in performing the task. Consider the often-repeated question: where does the time go? This total time is clearly the basis of performance as perceived by the end-user and is spent in the following transformation stages:

(User level) Algorithm -> HLL -> Assembly -> Micro-instruction (Implementation level)

The sequential nature of the above transformation clearly implies that time lost at any stage is lost forever. In other words, an inefficiency introduced at the algorithm level can never be recovered at a lower level. Also, the representation at each level is mostly sequential in nature (such as the line-by-line program representation in FORTRAN or any assembly language). Most of the time this sequentiality is not essential from a correctness point of view but is simply imposed by representational syntax or resource constraints. Before considering ways of detecting and exploiting the hidden concurrency, consider a concurrency representation framework that is not only generic enough to be common to all these levels, but also specific enough to account for the major time-consuming phases at any given level of program transformation. Assume that at any given stage, a program description consists of a set, {φ}, of uniquely numbered operations. Each operation φ is a member of a set {ψ} which defines the instruction set architecture at that level. An interface space between any two levels consists of the complete set of parameter values and storage contents shared between the two levels for communicating the program transformation. For example, at the interface of assembly and microcode, the interface space consists of the set of visible machine registers, the shared flag bits used to implement assembly level conditional jumps, and the modifiable user memory space.

Program representation at each level can be described in terms of a generalized AND/OR graph. Such graphs have been used in the past for search space representation [Nil80]. The graph consists of nodes, i ∈ {φ}, such that there is a node corresponding to each program operation. There is a directed arc from node i to node j if the operation corresponding to node j is dependent on that corresponding to node i. Simply stated, this implies that operation j cannot be initiated until the completion of operation i. Assume that given any two operations, i and j, the boolean relation D(i,j), which is true if j is dependent on i, can always be evaluated. Two nodes m and n are considered mutually independent if neither m is dependent on n nor n is dependent on m. Nodes with two or more children are classified into AND nodes and OR nodes. A parent node with mutually independent children nodes is called an AND node. Each OR node represents a choice point, i.e., a conditional branch is made to one of the arcs depending on the user input available at run time. For simplicity, define a node with a single child as an AND node also. A solution subgraph is defined as a graph consisting of the (unique) start node, such that if a node is an AND node, all the arcs originating from it are part of the graph, whereas, if a node is an OR node, only one of the arcs originating from it is part of the graph. Thus, at any level of representation there are multiple solution subgraphs. A node at a given level can be considered a compact representation for a similar graph at the following level. All the operations linked by an AND arc from some node i can be executed concurrently following the evaluation of the parent node. Exploiting such concurrency at higher levels (such as the algorithm level or the HLL level) is referred to as coarse grain parallelism, whereas that at the lower levels (such as the assembly or microcode level) is referred to as fine grain parallelism. The time spent in a path is simply the cumulative sum of the time spent in evaluating each node along the path. The time spent at the user level is the time spent in the longest path in the corresponding solution subgraph. This longest path is referred to as the critical path in the discussion to follow. The representation permits backward arcs for loop identification. Consequently, the critical path in the solution subgraph is not necessarily a simple path. In the next section, a wide variety of techniques available for exploiting fine grain parallelism are examined. A brief introduction of data structures commonly used for concurrency representation is discussed next, followed by a survey of the broad spectrum of available design choices and implementation tradeoffs.
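As a concrete illustration of the critical path notion, the following sketch computes the longest (time-weighted) path through a small dependence graph. The five-node example, its arcs, and the per-node evaluation times are hypothetical, and back arcs arising from loops are ignored so that the graph is acyclic; this is a sketch of the idea only, not of any machine model developed later.

    /* Sketch: critical (longest) path length of an acyclic solution subgraph. */
    #include <stdio.h>

    #define N 5                                    /* number of operation nodes */
    static const int t[N] = {1, 1, 2, 1, 3};       /* evaluation time per node  */
    static const int dep[N][N] = {                 /* dep[i][j]=1 if j needs i  */
        {0,1,1,0,0},
        {0,0,0,1,0},
        {0,0,0,1,0},
        {0,0,0,0,1},
        {0,0,0,0,0}
    };

    static int longest[N];                         /* memoized path time ending at node */
    static int done[N];

    static int path_to(int j)                      /* longest cumulative time ending at j */
    {
        if (done[j]) return longest[j];
        int best = 0;
        for (int i = 0; i < N; i++)
            if (dep[i][j] && path_to(i) > best)
                best = path_to(i);
        done[j] = 1;
        return longest[j] = best + t[j];
    }

    int main(void)
    {
        int critical = 0;
        for (int j = 0; j < N; j++)
            if (path_to(j) > critical)
                critical = path_to(j);
        printf("critical path length = %d time units\n", critical);
        return 0;
    }

For the hypothetical graph above this prints a critical path of 7 time units, the sum of the node times along the longest dependence chain.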

1.2.1 Representing Concurrency

A variety of data structures have been suggested for concurrency representation, that is, for modelling the inherent concurrency in a given computation sequence. All such models must satisfy the conditions of determinacy and termination [KaM66]. Informally stated, determinacy implies that the results that appear in the interface space are invariant under the particular sequence (if any) in which the concurrent operations are executed. In other words, a machine that is capable of simultaneously executing some or all of the mutually independent operations should yield the same result as a purely sequential machine executing the independent operations in some order. Termination means that there are identifiable terminating conditions that occur after a finite number of steps. Next consider some commonly used data structures.

Computation Graphs. Directed graphs, in spite of their irregular structures, have received considerable attention because of their theoretical properties. Computation graphs refer to labeled directed graphs that were first devised by Karp and Miller [KaM66]. Each node of the graph represents an operation, whereas the interpretation of each edge is extended to represent a first-in first-out queue of data directed from one node to another. A node evaluation involves taking a certain number of operands off the incoming edge(s) and placing a certain number of results on the outgoing edge(s). Thus, a node can be fired (evaluated) only if the expected number of operands are available. To keep track of the queue of data on each edge, a specific parameter tuple is associated with each edge. This simplified model was later expanded [BBE70] to include conditional branching facilities.

Precedence Matrices. Another data structure that has been studied for concurrency representation is the matrix [TjF73]. Unlike graphs, matrices have a regular structure that makes them better suited for VLSI implementation. However, the number of matrix elements required for a certain task representation grows as the square of the number of operations in the task. Graphs, on the other hand, have a more space efficient implementation. Each operation i in a task corresponds to the i-th row and i-th column of a matrix M. This matrix M is defined as a precedence matrix if it is a boolean matrix such that M_ij = true if and only if (D(i,j) OR D(j,i)) is true, i.e., if the instructions i and j are not mutually independent. The precedence matrix is symmetric and hence can be considered a triangular matrix with a zero diagonal, because an instruction does not depend upon itself. In the case of graphs, a chain of dependency (for example, operation k depends on j which in turn depends on i) is represented by a path from node i to node k through node j, rather than a direct edge from i to k. Similarly, the precedence matrix also does not contain entries to represent such dependency chains. For example, in this case, although the entries M_ij and M_jk are set true, there is no entry for M_ik. To get a complete picture of the dependencies between, say, n operations, one needs to use the matrix M_n, which is given by M^1 + M^2 + .... + M^n, where M^r indicates the matrix M raised to the power r and + refers to the boolean OR operation.

Petri Nets. A Petri net is also a graphical representation of program behavior but with directed edges between two different types of nodes. A node drawn as a circle is called a place and a node drawn as a bar is called a transition. The places having edges directed into a transition are called input places and those having edges directed out of the transition are called output places. The places have the ability to hold tokens. A transition having a token on each of its input places is considered active and can fire (i.e., be evaluated). The firing results in the removal of a token from each input place and the addition of a token to each output place. A Petri net in which every place has exactly one transition entering the place and only one transition leaving the place is called a marked graph and is directly representable as a computation graph. This is illustrated in Figure 1.1 by using these data structures to model the concurrency available in the assembly code sequence given in Figure 1.1.a. While the graph representation in Figure 1.1.b is concise, the corresponding matrix representation in Figure 1.1.c has more elements but is quite regular in shape. A comparison of computation graphs and Petri nets can be found in [Mil73]. This paper also discusses in detail a more general model called parallel program schemata, which includes the notion of a random access memory accessible to the operation nodes for reading and writing.

The presence of an arc between two nodes in the graph representation indicates the absence of concurrency, i.e., a dependency between the operations. Therefore, maximizing concurrency implies minimizing dependency, which in turn may result in reducing the critical path of the solution subgraph, and lowering the execution time.
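The boolean sum M^1 + M^2 + .... + M^n above need not be formed power by power; a Warshall-style transitive closure over the direct-dependence relation yields the same complete dependence picture in one pass. The following is a minimal sketch of that computation; the four-operation input matrix is hypothetical and the code is purely illustrative.

    /* Sketch: complete dependence matrix via transitive closure.
       close[][] starts as the direct-dependence matrix M and ends up
       equal to the boolean sum M^1 + M^2 + ... + M^n.                */
    #include <stdio.h>

    #define N 4

    int main(void)
    {
        /* Hypothetical direct dependencies: op1 -> op2 -> op3, and op1 -> op4 */
        int close[N][N] = {
            {0, 1, 0, 1},
            {0, 0, 1, 0},
            {0, 0, 0, 0},
            {0, 0, 0, 0}
        };

        /* Warshall's algorithm: add arc i->j whenever a chain i->k->j exists. */
        for (int k = 0; k < N; k++)
            for (int i = 0; i < N; i++)
                for (int j = 0; j < N; j++)
                    if (close[i][k] && close[k][j])
                        close[i][j] = 1;

        /* The chained dependence op1 -> op3 now appears as a direct entry. */
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++)
                printf("%d ", close[i][j]);
            printf("\n");
        }
        return 0;
    }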

1.2.2 Dependencies

An operation can be defined as a function, φ(.), with source operands s1, s2, ..., also known as the function domain, and the result destinations φ(1), φ(2), ..., also referred to as the range of the function. Thus, a typical operation evaluation consists of reading the source operands, applying the specified function, and writing the result. Consider two operations, φi and φj, where φi precedes φj in a purely sequential execution. Data dependency between these operations results from overlapping ranges and/or domains. If the range of φi is the same as that of φj, then φj is said to be output dependent on φi. It is also called Write after Write dependency. If the range of φi is the same as the domain of φj, and there is no operation φk which follows φi but precedes φj such that φk is output dependent on φi, then φj is said to be essentially dependent on φi. Such a dependency is also referred to as Read after Write dependency or flow dependence. On the other hand, if the range of φj is the same as the domain of φi, φj is said to be order dependent on φi, since the dependency exists only if the order specified by the proper sequential execution is reversed. This is also known as Write after Read dependency or anti-dependence. Finally, if the domain of φj is the same as the domain of φi, φj is said to be input dependent on φi. Input dependence does not imply a lack of concurrency unless there is a limit on simultaneous distribution of input to a number of different operation evaluations.

    1. ADD R1, R2, R3    ; R3 := R1 + R2
    2. ADD R1, R4, R5    ; R5 := R1 + R4
    3. ADD R3, R5, R6    ; R6 := R3 + R5
                    (a)

Figure 1.1 Computation graph (b), Precedence matrix (c) and Petri net (d) for the sample code sequence in (a). [The graph, matrix, and Petri net drawings of parts (b)-(d) are not reproduced here.]

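The four kinds of data dependency just defined can be decided mechanically from the domain and range of an instruction pair. The sketch below does this for a hypothetical three-address format with two source registers and one destination register per operation; it examines only the pair itself, so the extra condition on essential dependence (no intervening output-dependent operation) is not modelled.

    /* Sketch: classify the dependence of a later operation j on an earlier
       operation i, assuming two source registers and one destination each. */
    #include <stdio.h>

    struct op { int src1, src2, dst; };

    static void classify(struct op i, struct op j)
    {
        if (j.src1 == i.dst || j.src2 == i.dst)
            printf("essential (Read after Write, flow) dependence\n");
        if (j.dst == i.src1 || j.dst == i.src2)
            printf("order (Write after Read, anti) dependence\n");
        if (j.dst == i.dst)
            printf("output (Write after Write) dependence\n");
        if (j.src1 == i.src1 || j.src1 == i.src2 ||
            j.src2 == i.src1 || j.src2 == i.src2)
            printf("input (Read after Read) dependence\n");
    }

    int main(void)
    {
        struct op i = { 1, 2, 3 };   /* instruction 1 of Figure 1.1: R3 := R1 + R2 */
        struct op j = { 3, 5, 6 };   /* instruction 3 of Figure 1.1: R6 := R3 + R5 */
        classify(i, j);              /* reports the flow dependence through R3     */
        return 0;
    }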

Resource dependency, which is sometimes referred to as operational dependency, among operations φi and φj may exist if both correspond to the same function φ. Thus, data-independent operations may not be executed concurrently if they must use the same resource for evaluation. Concurrent evaluation of resource dependent operations is possible if the resource can be shared by multiple operations with the same function or if there are multiple resources. In other words, resource dependency implies lack of concurrency only when the number of permissible concurrent operations is exceeded. Generally speaking, resource dependency also includes the lack of requisite interconnection paths to transmit source operands and/or results. An operation node having an OR node as an ancestor may never need to be evaluated if it is not part of the solution subgraph. In other words, an operation is considered procedurally or control dependent on all the preceding conditional branch instructions. Thus, while data and resource independence of two operations implies that the operations are executable concurrently, procedural dependence tells whether they even need to be executed. Figure 1.2.a lists a typical assembly code sequence, whereas the corresponding AND/OR graph is given in Figure 1.2.b illustrating the dependencies explained above.

To further explore the nature of dependencies, consider an ideal machine with infinite resources. On such a machine, it should always be possible to remap the range of an operation φj such that it does not overlap the domain of any of the preceding operations. For example, in Figure 1.2.a, R2 can be mapped to, say, R11 so that X3 is no longer order dependent on X1. Similarly, it should always be possible to remap the range of any operation such that it does not overlap the range of any of its followers. For example, the destination register R2 can be mapped to, say, R12 so that X5 is no longer output dependent on X3. Resource dependency becomes a non-issue in such an ideal environment. Further, assume that whenever a choice point is encountered, resources can be replicated such that different OR arcs can be concurrently explored until the control dependency is resolved, at which time the incorrect paths can be discarded. As a result, for an environment with infinite resources, procedural dependency does not imply lack of concurrency either. In this environment only one type of dependency remains: essential dependency. Hence, all dependencies other than the essential data dependency can be considered different variations of resource dependency. A solution subgraph may have cycles resulting from loop structures in the program; consequently the critical path may consist of one or more iterations of certain sets of nodes. Operations that belong to the same cycle but are data independent in different iterations of the cycle are called cyclically independent. These operations may still be resource or control dependent across different iterations and they may be data-dependent in the same iteration. Next take a look at some important design choices for concurrency detection and scheduling.

    X0:  BEGIN
    X1:  MUL R1, R2, R3    ; R3 := R1 * R2
    X2:  MUL R4, R5, R6    ; R6 := R4 * R5
    X3:  ADD R7, R8, R2    ; R2 := R7 + R8
    X4:  SUB R6, R7, R9    ; R9 := R6 - R7
    X5:  SUB R2, R10, R2   ; R2 := R2 - R10
    X6:  CMP R9, 0         ; set zero flag if R9 = 0
    X7:  JMPZ X9           ; jump if zero flag set
    X8:  INC R10           ; R10 := R10 + 1
    X9:  DIV R1, R2, R3    ; R3 := R1 / R2
    X10: ADD R4, R3, R5    ; R5 := R4 + R3
    X11: ADD R6, R3, R7    ; R7 := R6 + R3
    X12: ADD R8, R3, R9    ; R9 := R8 + R3

                    (a)

Figure 1.2 AND/OR graph (b) for the code sequence in (a). Assuming a machine with two add/subtract and two multiply/divide units. Input dependence ignored. [The graph drawing of part (b) is not reproduced here.]
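The remapping argument above can be mechanized as register renaming: every write to an architectural register is given a fresh name, so order and output dependencies disappear while essential (flow) dependencies are preserved through the new names. The following is a minimal sketch under the same two-source, one-destination assumption; the physical register pool and its size are hypothetical.

    /* Sketch: rename destination registers so that only essential (flow)
       dependencies remain.  The two-instruction sequence corresponds to
       X3 and X5 of Figure 1.2; everything else is hypothetical.          */
    #include <stdio.h>

    #define REGS  16      /* architectural registers              */
    #define PREGS 64      /* pool of physical (renamed) registers */

    struct op { int src1, src2, dst; };

    int main(void)
    {
        struct op code[] = {
            { 7,  8, 2 },            /* X3: ADD R7, R8, R2  */
            { 2, 10, 2 }             /* X5: SUB R2, R10, R2 */
        };
        int n = sizeof code / sizeof code[0];

        int map[REGS];               /* architectural -> current physical name */
        for (int r = 0; r < REGS; r++) map[r] = r;
        int next = REGS;             /* next free physical register            */

        for (int i = 0; i < n && next < PREGS; i++) {
            int s1 = map[code[i].src1];      /* sources read the latest name   */
            int s2 = map[code[i].src2];
            map[code[i].dst] = next++;       /* each write gets a fresh name   */
            printf("op%d: P%d, P%d -> P%d\n", i, s1, s2, map[code[i].dst]);
        }
        return 0;
    }

In the output, X5 still reads the physical register written by X3 (the flow dependence through R2 survives), but the two writes to R2 now target different physical registers, so the output dependence is gone.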

1.2.3 Detecting, Dispatching, and Scheduling Concurrent Operations

Concurrency detection implies examining a certain number of operations to isolate the data-dependent pairs, whereas dispatching refers to issuing some or all of the operations detected to be concurrent for scheduling. For example, 16 instructions at a time may be examined for concurrency detection but only the first 4 of them that are found independent may be issued for scheduling. The distinction between detection and dispatch is subtle, and the distinction does not exist if all detected concurrent operations can always be sent for scheduling. Scheduling is the process of assigning specific operation functions and the corresponding source and result operands to designated resources at designated times under the constraints imposed by data dependency and the limited number of resources. An optimal schedule is defined as one that minimizes the number of distinct execution time slots and, ignoring resource limitations, it corresponds to a schedule constrained solely by essential data dependency. This can be achieved by scheduling together all operations that are essentially independent of all incomplete operations. Thus, an optimal schedule minimizes the critical path length for the corresponding solution subgraph, leading to a compact (less deep) graph representation. Viewing each operation as a task and each resource as a processor, the problem of resource constrained processor scheduling can be mapped to the optimal scheduling problem. The resource constrained processor scheduling problem is a known NP-complete problem [GaJ79]. Therefore, optimal scheduling is NP-complete also. Although the term instruction is sometimes used to refer to multiple operations that have been scheduled together, the two terms are used interchangeably in this section except when the distinction is critical.

Scope of concurrency detection. Concurrency detection begins by examining a consecutive set of operations from the serial execution sequence. The number of such operations simultaneously examined for detecting a concurrent subset defines the scope of concurrency detection. The larger the scope, the greater the probability of detecting a larger subset of concurrent operations. Identifying a stream of instructions that would necessarily be executed in sequence in a purely sequential execution is complicated by the presence of conditional branches. Such branches direct the execution sequence along one of the multiple solution paths and the choice is known only at run time, whereas concurrency detection must precede execution. A basic block is defined as a maximal sequence of instructions containing a single conditional jump, which is the last instruction in the sequence. There may be one or more unconditional jumps within a basic block. (A single-pass sketch of how such block boundaries can be located follows below.) Concurrency detection techniques can be classified into two categories depending on whether the scope is limited to within the basic blocks or stretched across basic blocks. The speedup achievable in the second category can be significantly more than that obtainable in the first category, as evident from the comparison given in Table 1.1. Experiments done by Tjaden and Flynn [TjF70] and Riseman and Foster [RiF72] are some of the earliest results reported in the area of concurrency detection. In the first published report on concurrent execution, Tjaden and Flynn predicted an average speedup of about 1.86 through simulations in an IBM 7090 environment and limiting the scope to be within basic blocks.
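The following sketch locates basic block boundaries in one pass over a short instruction listing, ending a block at each conditional jump. The opcode strings and the rule for recognizing a conditional jump are hypothetical placeholders drawn loosely from Figure 1.2, not the mechanism of any machine discussed here.

    /* Sketch: split a straight-line instruction listing into basic blocks,
       closing a block at each conditional jump.                            */
    #include <stdio.h>
    #include <string.h>

    static const char *code[] = {
        "MUL R1,R2,R3", "ADD R7,R8,R2", "CMP R9,0", "JMPZ X9",
        "INC R10", "DIV R1,R2,R3", "ADD R4,R3,R5"
    };

    static int is_cond_jump(const char *insn)
    {
        /* In this toy listing the only conditional jump opcode is JMPZ;
           a real front end would consult the full opcode table.           */
        return strncmp(insn, "JMPZ", 4) == 0;
    }

    int main(void)
    {
        int n = sizeof code / sizeof code[0];
        int block = 1;
        printf("basic block %d:\n", block);
        for (int i = 0; i < n; i++) {
            printf("  %s\n", code[i]);
            if (is_cond_jump(code[i]) && i + 1 < n)
                printf("basic block %d:\n", ++block);
        }
        return 0;
    }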
Riseman and Foster estimated an average speedup of 51 in an ideal environment with infinite resources. This showed that the speedup achievable with scope not limited to basic blocks can be an order of magnitude better than that possible with scope held within a basic block. The wide range of experiments reported so far (refer to Table 1.1) confirms the range of speedups reported by Tjaden/Flynn and Riseman/Foster. Another important factor influencing the scope of concurrency detection is whether the detection is being done at run time (dynamically) or at compile time (statically). At run time only a fixed size instruction window can be used to examine a set of instructions, whereas at compile time potentially the entire program can be examined. While static detection techniques can afford a larger scope, they are limited to compile time information only. Dynamic techniques, with run time information available, are limited to a much smaller scope. The complete machine state is known only at run time; hence, an instruction sequence such as Multiply/Load/Multiply may appear to create a resource dependence at compile time that turns out to be non-existent if the Load causes a cache miss that sufficiently delays the following Multiply. Wedig [Wed82] provides an analysis of the complementary nature of static and dynamic concurrency detection. Static techniques restricted to basic blocks are also known as techniques for local code compaction, whereas those extended across basic blocks are considered aimed at global code compaction.

Level of Concurrency Detection. Four different levels (or stages) of program specification (transformation) were outlined at the beginning of Section 1.2. Attempts have been made to detect concurrency at all four levels. Systolic architectures [Kun82] exploit concurrency at the algorithm level. Data flow architectures (such as [PHS85]) and recent VLIW architectures attempt concurrency detection at the level of microcode. Processors capable of concurrent execution of multiple scalar operations at the assembly level have been referred to as superscalars (the earliest reference to this term is found in [AgC87]). Imagine concurrency detection at the assembly level with a scope limited to adjacent instructions only. Consider two adjacent divide operations:

    DIV R1, R2, R3    ; R3 := R1 / R2
    DIV R1, R4, R5    ; R5 := R1 / R4

If the machine organization is restricted to a single divide unit, then these two divides would be serialized. However, suppose each divide corresponds to several lines of microcode and the concurrency detection is attempted at the microcode level. Now it is certainly feasible to overlap certain micro operations corresponding to the two divides.

Table 1.1 Comparison of concurrency detection and scheduling strategies

[Table 1.1 column headings: Strategy; Scope (within/beyond basic block); Level; Issue; Execution; Scheduling; Reported speedup. The table body was not recoverable in this copy.]

Note: Unless otherwise indicated, all control-flow strategies are based on the dynamic instruction stream. Issue strategies marked with an asterisk (*) are limited to a maximum of one instruction per clock. Other issue strategies can issue more than one instruction per clock. Speed-ups given without a range are best-case speedups. Speed-ups reported should be taken with caution, as they are not relative to the same baseline processor.

For example, it may be possible to load the source operands to the divider inputs or do the divide-by-zero check on the second divide before the first one finishes. But in order to be able to detect such opportunities, the scope of detection must go beyond adjacent microcode lines. Thus, as the program transformation moves closer towards the machine level and away from the user level, potential parallelism goes up along with the scope required for its detection (which also makes the detection a harder task). In an ideal sense, concurrency detection done at the microcode level with an infinite scope has maximum potential. For example, in an extreme sense, this would even explore the possible microcode overlap of two different sort operations specified at the highest algorithmic level. But this also implies exposing the lowest level machine resources at the highest level of specification, which may not be desirable. While most of the concurrency detection experiments have been attempted at the assembly level, certain techniques are aimed at concurrency detection at the lowest level. Dynamic techniques of this type fall into the classical data flow category [DeM74, ArG82], whereas some recently emerging static techniques are referred to as the VLIW approach [NiF84]. Very Long Instruction Word (VLIW) machines are characterized by a central control unit issuing each cycle a single wide instruction word consisting of independent operations. Note that these instruction words are machine instructions (microcode lines) and, hence, there is no additional level of interpretation involved. Similar to earlier vector machines, VLIW machines carry out many fine grained and tightly coupled operations simultaneously, whereas, in contrast to the vector machines, these concurrent operations are dissimilar and logically unrelated. Typical instruction word lengths for some implementations ([Fis83], [CNO88]) range from 512 to 1K bits. This horizontal format leads to poor code density in case the available parallelism is limited. In an attempt to improve code density, iWARP [CGL89] relies on two instruction formats. It uses a short instruction format in case of limited parallelism and a long format otherwise. The Multiflow TRACE [CNO88] uses a variable length memory representation that eliminates NOPs from the fixed length machine instruction format to improve the code storage efficiency. There are two other important implications of such a design strategy. Firstly, considering the fact that about 5 to 10 percent of operations at the lower levels (assembly or microcode) are conditional branches, a wide instruction word would contain multiple independent conditional branches. Therefore, such machines must be capable of performing tests for multi-way jumps to separate targets. Secondly, the memory system should be capable of supporting multiple memory references, which in turn should be scattered among different memory banks. Although such concerns are current implementation barriers for VLIW machines, the problems they represent are applicable to almost any approach to exploiting high fine grain concurrency. Responding to interrupts with a restartable machine state poses another challenge for VLIWs and is discussed later.
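The NOP-elimination idea mentioned for the Multiflow TRACE can be pictured as storing, for each wide word, a presence mask plus only the occupied operation fields. The sketch below illustrates such a variable-length memory representation in the abstract; the eight-slot word, the field encodings, and the mask layout are all hypothetical and are not taken from the TRACE itself.

    /* Sketch: compress a wide instruction word by storing a presence mask
       plus only the non-NOP operation fields (field layout hypothetical). */
    #include <stdio.h>

    #define WIDTH 8                      /* operations per wide word        */
    #define NOP   0

    struct packed {
        unsigned char mask;              /* bit k set => slot k is occupied */
        unsigned int  field[WIDTH];      /* occupied fields, in slot order  */
        int           count;
    };

    static struct packed pack(const unsigned int word[WIDTH])
    {
        struct packed p = { 0, {0}, 0 };
        for (int k = 0; k < WIDTH; k++)
            if (word[k] != NOP) {
                p.mask |= 1u << k;
                p.field[p.count++] = word[k];
            }
        return p;
    }

    int main(void)
    {
        unsigned int word[WIDTH] = { 0x11, NOP, NOP, 0x22, NOP, NOP, NOP, 0x33 };
        struct packed p = pack(word);
        printf("mask = 0x%02x, %d of %d slots stored\n",
               (unsigned)p.mask, p.count, WIDTH);
        return 0;
    }

When available parallelism is low, most slots are NOPs and only the mask plus a few fields need to be stored, which is exactly where the fixed-length horizontal format wastes the most space.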

Directly Executable Language machines. A machine architecture that retains all the information of the high level language, allowing greater possibility of concurrency detection (done at the language level), has been proposed by Flynn and Hoevel [FlH79]. This representation, referred to as a Directly Executable Language, provides a one-to-one correspondence between the states in the high level language and the machine states. Although at lower levels potentially more parallelism can be detected, this parallelism, besides being fine grain (for example, overlapping micro-ops as opposed to overlapping, say, FOR loop iterations), is also harder to detect. This is because at a machine level of representation that does not bear direct correspondence with the high level representation, most of the coarse grain concurrency information is not preserved in an easily recognizable form. For example, a FOR loop iteration count is more easily recognizable for concurrency at the language level than at the assembly level, and is still more difficult to recognize at the microcode level. Wedig [Wed82] provides details of concurrency detection at the language level.

Static Concurrency Detection and Scheduling. These techniques are based on information available prior to run time. One of the earliest and most extensive works in this area was done by Kuck and his colleagues [KMC72] for concurrency exploitation in a serial language such as FORTRAN. Their work explores height reduction techniques for program graphs, semantic analysis, and branch elimination to extract a significant amount of hidden parallelism. While some of the static optimizations are strictly aimed at reducing non-essential data dependencies and/or procedural dependencies and are thus machine independent, others also rely on explicit information about machine resources to resolve other dependencies.

Sometimes the range or domain of operations may be indirectly specified. For example, instead of a direct specification, like R2 or A[2], the domain of the j-th operation may be A[l], whereas the range of a preceding i-th operation may be A[m]. These two operations are data independent if the array indices m and l are not the same. Because m and l may be arbitrary expressions, this anti-aliasing or disambiguation may be difficult or even impossible to perform at compile time. In case of any doubt, the only safe option is to assume dependence. At times, a simple analysis may reveal the independence. For example, if m = 2x+1 and l = 2y, then there are no integer values of x and y such that m = l, since m is always odd and l is always even. Banerjee [Ban79] has developed efficient algorithms for determining whether m and l, where each is a polynomial, may imply a reference to the same variable, and hence a conflict. A sketch of a simpler test in the same spirit follows.
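To illustrate the flavor of such compile time index analysis, the sketch below shows the well-known GCD test rather than Banerjee's algorithm; the function name is ours. Two affine indices a1*x + b1 and a2*y + b2 can refer to the same element only if gcd(a1, a2) divides b2 - b1.

```python
from math import gcd

def may_conflict(a1, b1, a2, b2):
    """GCD test for affine indices a1*x + b1 and a2*y + b2 over the integers.

    Returns False only when a1*x + b1 == a2*y + b2 has no integer solution,
    i.e., the references are provably independent; True means a dependence
    must be assumed (the safe default).
    """
    g = gcd(a1, a2)
    if g == 0:                      # both coefficients zero: compare constants
        return b1 == b2
    return (b2 - b1) % g == 0

# The example from the text: m = 2x + 1 and l = 2y can never be equal,
# so the two operations may be treated as data independent.
assert may_conflict(2, 1, 2, 0) is False
```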
Nicolau [Nic89] proposes an alternative solution to the disambiguation problem. This technique, known as run time disambiguation, relies on assumptions about the run time behavior of memory references to allow compile time code restructuring to extract available concurrency. For example, based on run time statistics, suppose m and l are most likely unequal. Given this information, the compiler is allowed to extract any potential for concurrency resulting from this disambiguation, conditioned on the fact that m ≠ l. This conditional (IF m ≠ l) is evaluated at run time and, if found to be untrue, sequential execution proceeds as if no optimization was done. On the other hand, in the more likely case of m ≠ l, the introduced optimization results in increased parallelism. The overhead of the additional condition evaluation may be nil if it can be overlapped with some previous operation. A similar approach can be taken for evaluating conditional branches. On the basis of run time statistics, the compiler can be made to pick the most likely branch path, resulting in a larger basic block size, which in turn implies larger potential for concurrent evaluation.

Trace Scheduling, developed by Fisher [Fis81], replaces block-by-block local code compaction with simultaneous code compaction of a trace across many basic blocks. A trace is defined as a loop free sequence of instructions which might be executed sequentially for some choice of data. Improved performance is obtained by optimizing along the trace most likely to be followed at run time. Heuristic or profile-based branch predictions are used for picking the trace along the solution path with highest probability. Such an approach is quite likely to result in schedules that will not correctly preserve the semantics in case the less likely off-trace path is taken at run time. A post-processing phase inserts compensation code into the program graph on the off-trace branch edges to undo these inconsistencies, thereby restoring program correctness. Such concurrency detection has typically been attempted at the microcode level, and the large block size at this level implies wide machine instruction word formats. For data-dependent conditional branches, the fundamental assumption that there exists a most frequently executed solution path is questionable. In such cases the overhead of compensation code can offset any speedup in the off-trace paths. After picking the most likely trace, trace scheduling generally does not distinguish the off-trace paths on the basis of their probabilities. As a result, the schedule generated is not very sensitive to the actual path probabilities.

Hsu and Davidson [HsD86] propose a refined heuristic that addresses this issue. This technique, decision tree scheduling, while much more sensitive to actual path probabilities, is intended for code reordering to make efficient use of guarded store and jump instructions. These guarded instructions make efficient use of the delayed part of a conditional branch instruction (the time slots taken for branch resolution). Each guarded instruction is accompanied by a guard expression, which is just a boolean valued expression. Whenever a guard expression evaluates to false, it inhibits writing of the final result, i.e., the update of the interface space. This effectively converts the guarded instruction into a NOP; a small sketch of this behavior follows. The performance potential of this strategy is a function of how many time slots are available during branch resolution. The speedup reported for this technique in Table 1.1 is based on a pipelined uniprocessor model for the scalar portion of the CRAY-1 computer with branches taking a constant 14 cycles.
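A minimal sketch of guarded execution as described above (the instruction representation and register-file update are ours, purely for illustration): the operation is computed speculatively during the branch-resolution slots, but the interface space is updated only if the guard turns out to be true.

```python
def execute_guarded(guard, op, dest, regs):
    """Execute 'op' speculatively; commit to the register file only if guard holds.

    A false guard turns the instruction into a NOP as far as the interface
    space (here, the regs dictionary) is concerned.
    """
    result = op(regs)            # work done in the delay slots of the branch
    if guard(regs):              # guard resolved along with the branch
        regs[dest] = result      # update the interface space
    return regs

regs = {"r1": 7, "r2": 5, "r3": 0}
# Guarded store of r1 + r2 into r3, committed only if r1 > r2.
execute_guarded(lambda r: r["r1"] > r["r2"],
                lambda r: r["r1"] + r["r2"],
                "r3", regs)
assert regs["r3"] == 12
```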

Iterative constructs, or loops, are very common in numerical applications, and so deserve special attention. Two techniques have been most commonly used for loop optimization: loop unrolling and software pipelining. Loop unrolling consists of replicating the loop body n times, where n is the degree of unrolling. All conditional branches are removed from the replicated blocks except for the last one, and the increment is removed from all but the last replicated block. An advantage of this approach is elimination of some conditional branches, resulting in larger basic block size and hence the possibility for more speedup ([Nic89], [WeS87]). A disadvantage is that unrolling expands object code size.

Software pipelining refers to successive initiation of iterations of a loop at constant intervals, even before the preceding iteration completes, so that the loop throughput is improved. Using this approach, unlike with other techniques, pipeline stages of the functional units are not emptied at the iteration boundaries or at some fixed multiple of iteration boundaries (as is the case with loop unrolling). The objective is to minimize the interval at which the initiations take place. Such techniques are certainly not new and have been explored in a generalized sense (for example, initiations do not have to be at constant intervals and can instead follow a fixed pattern of intervals) for hardware pipeline scheduling [PaD76]. However, Lam [Lam88] proposes software pipelining as an effective and viable scheduling technique for VLIW processors. Because the problem of finding an optimal schedule is NP-complete, static scheduling techniques rely on heuristics to restrict the search space. A hierarchical reduction scheme is proposed in [Lam88] to make software pipelining applicable to all innermost loops, including those with conditional statements. Conditional OR nodes of the program graph are reduced to a single node with scheduling constraints representing the union of the scheduling constraints of its children. In addition to cyclical data dependency constraints, such a scheduling technique must also take into consideration resource constraints. Assume the initiation interval is m. If a resource is in use by an operation in some i-th iteration in some cycle s, it will also be in use by successive iterations in cycles s+m, s+2m, and so on. Therefore, another operation belonging to the i-th iteration may not use the same resource in any cycle congruent to s modulo m. This is known as the modulo constraint [RaG81]; a small sketch of this check appears below. Software pipelining has been used for compile time concurrency detection and scheduling on the iWARP machine [CGL89].

Another technique for detecting parallelizable loop iterations, similar to run time disambiguation, is run time dependence checking [Nic89]. Unlike the former, probabilistic estimates are not used; instead, loops are prepared for run time dependence checking. This is achieved by inserting appropriate code that helps perform automatic dependence checking on different loop iterations and simultaneously schedules independent iterations.
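A minimal sketch of the modulo constraint check described above (the reservation-table representation is ours): with initiation interval m, an operation of the same iteration may not use a resource in any cycle that is congruent, modulo m, to a cycle in which that resource is already reserved.

```python
def violates_modulo_constraint(reserved_cycles, candidate_cycle, m):
    """True if a use of the resource at candidate_cycle would clash, modulo the
    initiation interval m, with an existing reservation."""
    return any(candidate_cycle % m == c % m for c in reserved_cycles)

# Initiation interval m = 4; the resource is already used in cycle 2 of the loop body.
m = 4
reserved = [2]
assert violates_modulo_constraint(reserved, 6, m)       # 6 mod 4 == 2 mod 4 -> conflict
assert not violates_modulo_constraint(reserved, 5, m)   # 5 mod 4 == 1 -> no conflict
```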
The scheduling techniques mentioned above are mostly applied at lower levels of program transformation. Percolation scheduling can be used for program graph compaction to extract both fine grained and coarse grained parallelism. The technique is based on certain core transformations which, when applied to adjacent nodes, help percolate them towards the top of the program graph. The goal of such transformations is to compact the program graph by moving operation nodes from the bottom of the graph for grouping with independent operation nodes towards the top of the graph. These transformations consist of various dependency checks and can be combined with a variety of guidance rules to direct the optimization process. Details of these transformations can be found in [Nic85]. At the micro-operation level, as the nodes percolate up, nodes grouped together can be treated as a long instruction word with independent operation fields. Thus, percolation scheduling offers another alternative for code generation for VLIW machines. While trace scheduling explores the program graph in a top down fashion along a trace, percolation scheduling searches the graph in a bottom up manner. If possible, operations belonging to different branch paths (traces), along with the condition for branch resolution, are evaluated simultaneously and the undesired result discarded.

Dynamic Concurrency Detection and Scheduling. Dynamic concurrency detection techniques have the advantage of precise run time information for resolving dependencies related to conditional branches and indirect memory references. Compile time techniques, such as run time checking, although used during the compilation phase, provide run time support for concurrency detection, and it is generally believed that a combination of static and dynamic support has performance potential exceeding either technique in isolation. A comparison of different dynamic techniques is given in [AKT86]. The order in which compiled instructions are executed during a purely sequential execution forms the dynamic instruction stream. The order in which instructions are generated by the compiler, which is the same as the order in which they appear in system memory, forms the static instruction stream. Dynamic concurrency detection is performed either on the dynamic instruction stream or on the static instruction stream. The advantage of static stream analysis is reduced memory traffic, because instead of a memory load of each instruction to be executed, static stream detection works with a single load of the static sequence. Furthermore, static stream analysis can achieve the same amount of concurrency as that using dynamic stream analysis [Wed82].

Scheduling can either be done centrally at the time of decode, or in a distributed manner in the functional units themselves. The former approach is called control flow scheduling, the latter is called data flow scheduling. There is a global station for control flow scheduling that receives information from the functional units to detect and dispatch independent instructions. One of the earliest implementations of this idea is found in the CDC 6600, where the central station is called a scoreboard [Tho70]. Under this scheme, an instruction to be executed on a functional unit can be issued even if its source operands are not available. The unavailability of an operand is indicated using a ready bit. As soon as the operand becomes available, the functional unit producing it notifies the central scoreboard, which updates the corresponding register and its associated ready bit. This updated information is also conveyed to the waiting functional units.
Some important features of this algorithm are: 1) There is no direct communication path among the functional units; they communicate via the central station, the scoreboard. 2) Instruction dispatch (issue) logic blocks when it encounters an instruction that is resource dependent or output dependent on a pending instruction. 3) An instruction Ij that is order dependent on an instruction Ii is allowed concurrent execution with Ii, but the functional unit associated with Ij stays busy until the execution of Ii completes. 4) Dispatch logic is limited to one instruction per cycle.

Tjaden and Flynn [TjF70] suggest a lookahead scheme capable of more than one instruction issue per cycle using a predecode stack as the central station. This stack stores instructions in a modified format that explicitly encodes their dependency information. This approach further relies on a technique to reduce dependencies. An instruction that is independent of all instructions above it on the stack is dispatchable. Simultaneous bit-by-bit compare is used to detect and dispatch independent instructions on the stack. A stack size of around eight is found to be enough to extract all available concurrency.

Acosta, et al. [AKT86] present another variation of this idea using a dispatch stack, which reduces the associative compare overhead associated with the former approach. In this case, the instruction format is further expanded to contain counters for its source and destination registers. There is a counter associated with each source register indicating how often it is designated as a destination register in the preceding, incomplete instructions on the stack. Two separate counters are used to track how many times the destination register is designated as a source and as a destination register in previous incomplete instructions. These counters are added to compute an issue index for each instruction; this computation simplifies the issue logic, and a small sketch of it follows. All instructions with a null issue index are simultaneously dispatchable. As instructions complete, the stack is properly updated. Although the stack update requires content addressability, the comparison hardware required overall is likely to be less than that of the previous approach.
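An illustrative sketch of the issue-index computation described above (the instruction encoding and counter bookkeeping are ours, not the exact hardware of [AKT86]): each instruction carries counts of outstanding register conflicts with earlier, incomplete instructions; their sum is the issue index, and every instruction whose index is zero may be dispatched in the same cycle.

```python
def issue_index(instr, pending):
    """Sum of outstanding register conflicts with earlier, incomplete instructions.

    instr and the entries of pending are (dests, srcs) tuples of register names;
    pending holds the earlier instructions still in the dispatch stack.
    """
    dests, srcs = instr
    index = 0
    for p_dests, p_srcs in pending:
        index += sum(1 for r in srcs if r in p_dests)     # true (flow) dependence
        index += sum(1 for r in dests if r in p_srcs)     # order (anti) dependence
        index += sum(1 for r in dests if r in p_dests)    # output dependence
    return index

stack = [(("r1",), ("r2", "r3")),    # I1: r1 <- r2 op r3   (oldest, still pending)
         (("r4",), ("r1", "r5")),    # I2: r4 <- r1 op r5   (needs r1 from I1)
         (("r6",), ("r2", "r7"))]    # I3: r6 <- r2 op r7   (independent of I1)

indices = [issue_index(ins, stack[:k]) for k, ins in enumerate(stack)]
assert indices == [0, 1, 0]          # I1 and I3 are dispatchable together
```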

The techniques described so far all work with the dynamic instruction stream. One of the first experiments using the static instruction stream for dynamic concurrency detection and scheduling was reported by Tjaden [Tja72] using ordering matrices for concurrency representation. This work was further extended by Wedig [Wed82]. An important problem with static stream analysis is the complexity of representing machine state at any time during execution. Instructions that complete execution have their dependencies deactivated, so that they are excluded from further analysis. For instructions that are executed multiple times, Tjaden associates a flag which is set on instruction execution and allowed to be reset if the instruction is to be re-executed. This avoids multiple loading of such instructions for concurrency detection. An alternative representation is proposed by Wedig, which associates an execution vector with the task being analyzed. The elements of these vectors keep track of the number of times different instructions have executed.

Unlike the control flow approach, dependency resolution for data flow scheduling is distributed across the different functional units. The IBM 360/91 [AST67] was the first system to use data flow scheduling. Although the original algorithm was devised by Tomasulo [Tom67] for the floating point unit of this machine, it can be easily generalized to any system with multiple functional units. Under this scheme, each functional unit contains a set of reservation stations, where instructions are held pending execution. Each station contains a field for each of its source operands and the result. Each operand field either contains the operand value (if available) or it contains a tag indicating the functional unit that is supposed to produce that value. Each machine register is augmented by a busy bit. If this bit is clear, the register contents are valid; else the register contains a pointer, or a tag, to the functional unit that is expected to produce the result as its next output. When a result is produced, a common data bus simultaneously relays it to all reservation stations as well as to the machine register file, which use associative comparison with their tags to read the result off the bus. As compared to Thornton's algorithm: 1) There is no central scoreboard, and thus dependency resolution, where precedence is controlled by means of tags, takes place in a distributed manner. 2) A common data bus provides a direct communication path between functional units. 3) Automatic register renaming reduces order dependency. For example, in the sequence shown in Figure 1.2.a, register R2 in instructions X1 and X3 would be mapped to two reservation stations. As a result, not only can the execution of X1 and X3 be overlapped, but, unlike in Thornton's approach, X3 can finish before X1 because the original contents of R2 have already been copied to the reservation station of the functional unit executing X1. This resolves order dependencies. 4) Output dependency does not block instruction dispatching either, since the register tag is always updated to point to the most recent functional unit from which the result is expected. 5) Instruction issuing is blocked when there are no available reservation stations for the desired functional unit. Issue rate is still limited to one instruction per cycle. A minimal sketch of this tag-based renaming scheme is given below.
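The following sketch (data structures and names are ours; it abstracts away timing, bus arbitration, and multiple functional units) shows the core of the tag mechanism: registers hold either a value or a tag, reservation stations capture values or producer tags at issue time, and a broadcast result is matched against tags everywhere.

```python
class Register:
    def __init__(self, value):
        self.value, self.tag = value, None   # tag is None when the value is valid

def issue(station, srcs, dest, regs, tag):
    """Copy available operands (or producer tags) into the station; rename dest to tag."""
    station["ops"] = [(r.value if r.tag is None else r.tag, r.tag is None)
                      for r in (regs[s] for s in srcs)]
    station["tag"] = tag
    regs[dest].tag = tag                     # dest now waits on this station's result

def broadcast(result_tag, value, stations, regs):
    """Common data bus: every waiting station and register compares its tag."""
    for st in stations:
        st["ops"] = [(value, True) if (not ready and op == result_tag) else (op, ready)
                     for op, ready in st["ops"]]
    for r in regs.values():
        if r.tag == result_tag:
            r.value, r.tag = value, None

regs = {"R1": Register(4), "R2": Register(6)}
st = {}
issue(st, ["R1", "R2"], "R2", regs, tag="FU1")   # X1: R2 <- R1 op R2, result tag FU1
# A later instruction reading R2 would now capture the tag FU1, not the stale value.
broadcast("FU1", 10, [st], regs)
assert regs["R2"].value == 10 and regs["R2"].tag is None
```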

Weiss and Smith [WeS84] simulated the performance of the CRAY-1 scalar architecture using a variation of Tomasulo's algorithm. The scheme proposed uses a tag pool consisting of a finite set of tags for assigning destination tags. Therefore, unlike Tomasulo's algorithm, the tags are not in one-to-one correspondence with the reservation stations, and instruction issue can also be blocked if there are no tags available in the pool.

This variation was based on the observation that only a subset of all possible reservation station fields may be active simultaneously; hence, the common tag pool would reduce the associative search overhead. In another experiment, they propose a tag search table to eliminate the need for an associative tag search. This restricts a particular tag to be used with only one reservation station. The search table is indexed by tag value and contains the address of the corresponding reservation station. A used bit associated with every tag in the pool blocks its multiple usage. They found that even this restricted version of Tomasulo's algorithm retains much of the performance gain of the associative version.

The work of Weiss and Smith was further extended by Sohi and Vajapeyam [SoV87]. In addition to a separate tag unit, responsible for managing the tag pool, this scheme contains another common pool for the reservation stations. As indicated earlier, under Tomasulo's algorithm instruction issue would block if there are no reservation stations available for the desired functional unit, even though there may be unused stations with other units. Therefore, a common pool of reservation stations that are dynamically assigned can be expected to provide improved performance. At this point it can be seen that the machine organization starts to resemble that of data flow computers, which are characterized by token issuing and matching units similar to the tag pools described above. In the data flow model [ArG82], execution of an operation is contingent only upon the availability of its input operands and a free functional unit. Thus, when implemented at a fine grain level, data flow tends to expose all available concurrency in the program graph. Complete concurrency exploitation at the machine level is facilitated by high level program specification using a functional language. This has traditionally been met with reluctance in the user community, since it implies setting aside a vast amount of software built over the years in traditional languages like FORTRAN, PASCAL, or C. Furthermore, programs written in functional languages tend to consume large amounts of memory space due to the single assignment rule and the copying of data arrays.

Some recent experiments have relied on some of the properties of the data flow model to utilize fine grain concurrency, while keeping the traditional program specification at the high level. Patt, et al. [PHS85] report a variation of data flow referred to as restricted data flow architecture. Unlike classical data flow machines, the data flow graph of only a small subset of the program is kept in the machine at one time. Thus, fine grain parallelism present in this active instruction window is utilized. A merger unit takes the data flow graph of each instruction from the dynamic instruction stream, resolves any existing data dependency using a variation of Tomasulo's algorithm, and merges it into the data flow graph resident in the active instruction window. An instruction that completes execution is retired from the active window when all the preceding instructions have also retired. Directed data flow, coined in connection with the Cydra-5 architectural design [RYY89], refers to an architecture that supports the data flow model of computation in a compiler directed fashion.
The compiler support is similar in many ways to the VLIW approach. In addition, it provides hardware support for overlapping execution of different loop iterations. It combines the register storage and the functional unit inputs and outputs into a single entity referred to as the context register matrix. This provides a new context for each new loop iteration. These iteration frames are dynamically allocated at run time. Control dependencies are handled at the micro-operation level by associating a predicate with each operation. This predicate determines whether the corresponding operation needs to be evaluated at all. As soon as all the control dependencies are resolved, the predicate is set, which makes the operation immediately schedulable.

In-sequence or out-of-sequence detection and scheduling. In-sequence detection implies that concurrent instructions are restricted to be in monotonically increasing number sequence, or in the sequence of a purely serial execution. As a result, the instructions are examined in sequence, and detection and scheduling block every time a dependent pair is encountered. Out-of-sequence detection and scheduling means that out of order concurrent instructions are allowed to be simultaneously executed. While the former is simpler to implement, the latter may have significantly higher concurrency potential in a large scope. Foster and Riseman [FoR72] describe a preprocessing algorithm that generates a reordered code sequence which has the property that if the instruction at the top of the dispatch stack is found dependent and, hence, not immediately dispatchable, there will be no instruction below it that is ready for dispatching. Independent of whether instructions are issued in-order or out-of-order, they may be allowed to complete in-order or out-of-order. The least restrictive option, that is, allowing out-of-order instruction issue and out-of-order completion, exploits maximum concurrency. A major difficulty with out-of-order execution is restoration of machine state for restartability in case of interrupts.

1.2.4 Implementation Tradeoffs

Interrupt Handling. Interrupts pose a special challenge to architectures that overlap executions of elementary operations. Interrupts can be defined as normally unexpected events that are detected at run time and require modifications in the current execution sequence. The unexpected nature of interrupts means that the program corresponding to an interrupt service routine must be inserted arbitrarily into the executing program, during its execution. This does not deserve any special attention for machines with purely sequential execution, since it is accomplished simply by inserting a branch to the service routine immediately after the execution of the current operation. However, for machines that overlap operation execution, an operation φi that is found independent of all incomplete preceding operations, and hence scheduled together with some operation φj, may be dependent on a newly inserted (and hence incomplete) operation φk that belongs to the interrupt service routine and precedes φi in purely sequential execution. Furthermore, on machines that permit out-of-order execution, φi may have completed execution long before this dependence is detected, which can happen only after φk is inserted. As a result, any schedule can potentially be rendered incorrect at run time if it does not include the possibility of a branch to service routines at arbitrary points of the sequential order of execution.

A possible solution to this problem can be to insert an OR node, corresponding to a possible branch to different service routines, after every operation in the purely sequential model of execution. While it would guarantee robust schedules under all combinations of interrupts, it reduces the basic block size to one instruction. Such a solution is unacceptable. An alternate solution is to start with a program graph that excludes the OR nodes corresponding to different interrupt possibilities. When an interrupt is detected, the schedules are modified to incorporate the newly added nodes. This approach is similar to trace scheduling, in that the emphasis is on concurrency exploitation along the most likely program trace, which is the one with no interrupts or exceptions. As a result, the additional overhead of providing compensation for any damage caused by the wrong guess is incurred when an interrupt is detected. This need to compensate further implies a delay between interrupt detection and recognition, where recognition refers to the start of execution of the corresponding service routine. This delay is sometimes known as interrupt latency. If, at the time of interrupt recognition, the state of the interface space is the same as that during a purely sequential execution, the corresponding interrupt is called a precise interrupt. Precise interrupts have a long latency, since recognition takes place only after the effects on the interface space of all the operations following the interrupt detection point have been undone. Not only is this repair expensive to implement, it has an unavoidable adverse side effect on performance because the operations undone need to be reexecuted. Not all interrupts require a complete restoration of machine state to that of a strictly sequential execution for program correctness. There are cases when a partial or even zero recovery would be acceptable, which leads to the concept of the imprecise interrupt. Such interrupts allow the state of the interface space at the time of recognition to be different from that during a strictly sequential execution.
An imprecise interrupt can have its own degree of impreciseness depending on the difference between the two interface spaces. Interrupts can also be classified into two categories depending on whether they are caused by an event internal or external to the program execution. Examples of internal interrupts (also known as exceptions) include events such as divide by zero, a page fault on an operand fetch, or an incorrect branch prediction. External interrupts are events such as an I/O interrupt or a request to relinquish a shared bus.

Implementation of precise interrupts. Smith and Pleszkun [SmP85] present implementation details for different strategies aimed at implementing precise interrupts in pipelined processors. The options suggested there can be broadly classified into two categories:

1) Reorder buffer. This strategy employs buffers to reorder the updates to the interface space such that an operation is allowed to update the space only if all the preceding operations have completed without any exceptions. The results waiting in the reorder buffer are unavailable for further computation; hence, although simpler to implement, this scheme forces in-order completion of instructions and the associated performance degradation. Variations on this scheme try to provide associative search ability in the reorder buffer to be able to bypass the interface space for items waiting in the buffer to be committed [SmP85]. The implementation cost of such variations is high. (A small sketch of the basic reorder buffer appears at the end of this discussion.)

2) History buffer. These buffers are used to keep a copy of the original contents of the registers and memory locations that get updated. When an instruction successfully completes execution, the corresponding history buffer entry is deleted. On detecting an interrupt, the offending instruction is used to index the buffer for restoring the machine state as if none of the following instructions had any effect on the interface space. Sohi and Vajapeyam [SoV87] and Hwu and Patt [HwP87] have suggested further variations of the above ideas.

Implementation of imprecise interrupts. Although these are much simpler to implement, there are tradeoffs. Once an interrupt is detected, either all operations in progress can be aborted, or they can be allowed to run to completion before interrupt recognition, or the entire machine state can be saved so that the program execution can be restarted at the point of interrupt after servicing the interrupt. The first option has minimum latency, but suffers from the performance loss of reexecuting the partially executed operations. The third option suffers from the cost of saving the entire machine state. Thus, the second option offers a good compromise. In the case of VLIW machines, where each instruction word issued consists of several independent operations, sometimes not all the information needed to carry an operation to completion is issued at the same time [RYY89]. In such cases, care has to be taken to selectively mask operations which would initiate new operations and allow only operations which are needed for completing the pending ones.
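A minimal sketch of the reorder buffer idea from option 1 above (the entry format and commit loop are ours, and bypassing of buffered results is omitted): results are written to the interface space strictly in program order, and only when no earlier entry is still pending or has raised an exception.

```python
from collections import deque

class ReorderBuffer:
    """Entries are appended in program order; commit stops at the first
    instruction that has not finished (or that raised an exception)."""
    def __init__(self):
        self.entries = deque()          # each entry: dict(dest, value, done, fault)

    def allocate(self, dest):
        entry = {"dest": dest, "value": None, "done": False, "fault": False}
        self.entries.append(entry)
        return entry

    def commit(self, regs):
        while self.entries and self.entries[0]["done"] and not self.entries[0]["fault"]:
            e = self.entries.popleft()
            regs[e["dest"]] = e["value"]    # in-order update of the interface space

rob, regs = ReorderBuffer(), {"r1": 0, "r2": 0}
e1, e2 = rob.allocate("r1"), rob.allocate("r2")
e2.update(value=9, done=True)        # younger instruction finishes first ...
rob.commit(regs)
assert regs == {"r1": 0, "r2": 0}    # ... but cannot commit ahead of e1
e1.update(value=3, done=True)
rob.commit(regs)
assert regs == {"r1": 3, "r2": 9}
```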

Concurrent and overlapped execution. The methods discussed so far have emphasized detection and concurrent execution of independent operations. Consider two operations φi and ψj at a given level (say, the assembly level) of program transformation, and the corresponding sets of sub-operations φi1, ..., φik, ... and ψj1, ..., ψjl, ... after transformation to a lower level (say, the microcode level) of representation. It is quite possible that φi and ψj are detected as dependent but some sub-operations φik and ψjl are found independent, and hence executed concurrently. The simultaneous execution of φik and ψjl amounts to an overlapped, rather than concurrent, execution of φi and ψj at the higher level. If a certain set of sub-operations is repeatedly used during the transformation in a regular sequence (for example, fetch the instruction, fetch the operand, execute the operation), it lends itself very naturally to a pipelined form of implementation.

Resource utilization has not been an explicit concern so far. It has been implicitly assumed that care has been taken to ensure proper utilization of resources at all levels. In fact, it is this very concern that manifests itself in the form of resource dependency. In order to achieve a cost-performance balance that is globally optimum, system resources at every level should be locally optimized, too. At times when concurrent execution is sacrificed in favor of better resource utilization, the performance penalty resulting from the resource dependency can be mitigated by using an optimized pipelined design for the existing resource. Thus, a combination of concurrent and overlapped (pipelined) execution holds the key to an optimal design. Machines with wider instruction words tend to have a slower clock period to be able to generate control signals for the additional hardware. A pipeline with twice the number of segments can potentially have about the same throughput as two pipelines of half the size. However, under less than ideal conditions, doubling the number of segments does not mean halving the clock period. Only if the optimal performance obtainable from one pipeline is significantly less than that from two would resource duplication be justified. Experiments done by Sohi and Vajapeyam [SoV89] report the performance potential of machines with restricted instruction word width and deeper pipelines.

Instruction fetch limitations and branches. The adverse impact of conditional branches in introducing control dependencies, and thereby limiting the ability to look ahead, has been discussed in detail in the previous sections. There are two other overheads that are associated with both conditional and unconditional branches: target address calculation and target fetch. In order to sustain a steady rate of multiple instruction issue per clock, the system must also be capable of fetching multiple instructions at a time, to avoid starvation. This is easily done with a wide memory (cache) interface when the instructions to be fetched form a contiguous block of code with a known starting address. This is not the case when a branch is encountered. For example, suppose a system is capable of simultaneously fetching six contiguous instructions and the third instruction in a group happens to be an unconditional branch. This means the last three instructions are incorrectly fetched and would need to be refetched after calculating the branch target. Compile time preprocessing can be used to alleviate the situation. A technique known as target copying modifies the instruction sequence by copying several lines of code from the branch target to the address locations immediately following the branch instruction. In the preceding example, the compiler can copy three lines of code from the branch target to the locations right after the branch instruction. As a result, all six instructions fetched simultaneously from contiguous locations are valid.
This further allows the overlap of target address calculation with processing of the instructions copied from the target, so that contiguous fetches can continue without any interruption from the target address. A similar approach can also be taken for conditional branches, using static branch prediction techniques and target copying for branches predicted to be taken. A drawback of this technique is the resulting code expansion. Smith, et al. [SJH89] and Johnson [Joh91] report a variety of experiments exploring the impact of such instruction fetch limitations on the potential speedup using dynamic detection and scheduling techniques. Instruction fetch inefficiencies caused by branch delays and instruction misalignment are reported to be the primary performance impediments.

Operand fetch limitations. A typical instruction requires more than one operand. A steady rate of, say, x instruction executions per clock can only be supported if the system is also capable of supplying operands in excess of x per clock (more likely 2x operands per clock). Unlike instruction fetches, operand fetches are from non-contiguous memory locations, and a significant number of operand fetches refer to recently computed results still residing in a local register file. The approaches taken to address this problem, either at the main memory interface or at the register file interface, fall into two categories: a multi-ported shared bank or single-ported multiple banks. A bank can be either a main memory (or cache) module or a register file. A multi-ported shared bank has better utilization than single-ported multiple banks, but is normally a costlier implementation. This tradeoff has been studied in detail for evaluating alternatives for a shared memory implementation for a multiprocessor system, and a recent study of this tradeoff applied to register file implementation appears in [SoV89].

Johnson [Joh91] provides a quantitative comparison of four major hardware features for exploiting instruction level parallelism at the assembly level: out-of-order execution, register renaming, branch prediction, and a four-instruction decoder. The conclusions are derived from trace driven simulation in a general purpose environment. The incremental speedup due to out-of-order execution, given the other three features, is found to be in the range of 1.5; that due to register renaming and branch prediction is reported to be in the range of 1.3. These features are interdependent. As a result, these incremental speedups cannot be considered in isolation.

1.2.5 Summary

This section surveys the wide variety of options available for representing, detecting, and scheduling concurrent instructions. Data structures for concurrency representation, including dependency graphs, ordering matrices, and Petri nets, were described.

Dependencies are grouped into data dependency, control dependency, and resource dependency. The available design choices have been classified in categories related to the scope (within or beyond basic blocks), level (high level, assembly, or microcode level), time (compile time or run time), and order (in-order or out-of-order) of detection, scheduling, and completion (Figure 1.3). Run time scheduling techniques are further classified into centralized control flow and distributed data flow approaches. This set of available options is illustrated in Figure 1.4. Certain important implementation tradeoffs were analyzed, including those related to interrupt handling and to instruction and operand fetch limitations. The data presented in Table 1.1 show a broad range of potential speedup; however, if the reports of Riseman and Foster [RiF72] and Nicolau and Fisher [NiF84] are excluded, the picture is more limited. These two studies can be interpreted as performance limits without implementation constraints. Exclusive of these two reports, the reported speedup varies from about 1.8 to 8 times that of execution on conventional pipelined systems. The results of some of these studies can be summarized as follows:

In-order versus out-of-order execution. The advantage of out-of-order execution is strongly dependent on the depth of the execution pipeline. The longer the delay in executing various operations, the more significant the advantage of out-of-order execution. If all instructions execute in unit time and they are decoded every time unit in order, then there will be no advantage to their execution out of order. The longer the delay in execution, the greater the advantage realized from the additional overlap provided by out-of-order execution. In other words, deeper pipelines are more effective in utilizing the additional parallelism exposed by out-of-order execution. A similar observation is made by Sohi and Vajapeyam [SoV89], in a somewhat different context. They report the effectiveness of deeper pipelines in utilizing the parallelism exposed by loop unrolling or multiple operation issue. Based on reported results [AKT86, SoV87], a performance plot of the type shown in Figure 1.5 can be expected. Out-of-order execution, when applied to a pipeline of depth about three, achieves roughly 20 percent speedup relative to machines with the same pipeline latency and in-order execution. A pipeline as deep as 8 or 10 stages may achieve a relative speedup as high as 50 percent. A deeper pipeline has disadvantages associated with its poorer utilization due to longer flush times during branches. As a result, the low latency processor will still have superior performance.

Multiple instruction issue with out-of-order execution. In this situation there are two variables: the number of instructions that are inspected for independence during the decode stage, and the number of detected independent instructions that actually can be issued for execution. Again, the results of Acosta [AKT86] seem representative.

[Figure 1.3 diagram: design choices in scope (within or beyond the basic block), level (high level, assembly, microcode), time (compile time or run time), and order (in-order or out-of-order) of scheduling and completion.]

Figure 1.3 Available design choices for superscalar processors.

[Figure 1.4 diagram: scheduling strategies divide into static (trace scheduling, percolation scheduling, software pipelining) and dynamic (control flow, data flow); data flow further divides into classical, restricted, and directed data flow.]

Figure 1.4 Classification of scheduling strategies.


Figure 1.5 Speedup from out-of-order execution relative to in-order execution as a function of pipeline depth.

Figure 1.6.a shows the maximum available speedup given an ideal processor. For the ideal processor the baseline execution (speedup = 1) represents either in-order or out-of-order execution, since all the execution units execute with a unit delay. Issuing two instructions indicates a potential speedup of slightly less than 1.75 over the baseline. Issuing four instructions provides a speedup of almost 2.5. The limit of the speedup is 2.8. Note that potentially almost all of the advantage is gained by inspecting eight instructions and issuing four. For Acosta [AKT86], the speedup is limited by basic block size, because the scope of concurrency detection is limited to within the basic block. This is generally consistent with most earlier studies showing maximum speedup potential somewhere in the range of 1.5 to 3. When more reasonable hardware constraints are placed on the processor model [AKT86], as shown in Figure 1.6.b, the relative speedups remain much the same as in the ideal case. However, some of the relative advantages shift. Now, whenever issue is limited to a single instruction per clock, out-of-order issue and execution achieves a 40 percent performance improvement over the baseline case of in-order issue and execution. An intermediate performance point (not shown in the figure) would be sequential issue with out-of-order completion, as in the CRAY-1. Speedup in this case is typically around 20 percent [AKT86]. Still, the overall speedup potential is limited to about 2.5, and most of the gain is again achievable with an instruction window of about eight instructions.

Software Assistance. Achieving significant speedup requires techniques that allow concurrency detection beyond the basic block. This can be done in hardware, in software, or both. Hardware alone, because of the complexity involved, seems to have limited potential. Using combinations of hardware and software techniques, it may be possible to achieve speedups of four to eight times [Wed82]. Parallelism and speedup uncovered by software duplicate in part the parallelism uncovered by the hardware. In one experiment in this area, Wedig [Wed82] reports an overall speedup from hardware plus software detection of concurrency of three, where the hardware or software alone would have accomplished a speedup of two. Thus, software detection of concurrency is potentially complementary to hardware, but overlap is present. The system designer should carefully partition the problem of concurrency detection lest duplicate effort detect the same events.

Finally, branches pose a significant bottleneck to concurrent execution. Future research needs to be directed at compile time and run time efforts to reduce the branch overhead and at implementations that can simultaneously resolve multiple branches to independent targets. Memory system design would also be considerably affected by concurrent execution. The discussion in this chapter has ignored the details of the memory system enhancements essential to sustaining instruction and operand throughput. Mapping simultaneous memory requests to independent memory banks would be vital to achieving overall performance improvement. On the other hand, loops (especially for scientific applications) hold significant speedup potential due to their regular dependency and control structure.

[Figure 1.6 plots: relative speedup versus the maximum number of instructions issued per cycle (baseline through unlimited), with curves for window sizes of 8 and 32; panel (a) assumes single-cycle functional units and panel (b) multiple-cycle functional units.]

Figure 1.6 Multiple instruction issue with out-of-order execution and with scope limited to within the basic block, relative to a processor with in-order execution and single instruction issue per cycle; assuming a single-cycle functional unit processor (a) and a multiple-cycle functional unit processor (b). These graphs are derived from results reported in [AKT86].

A combination of complementary compile time and run time support may be the key to concurrency extraction and to resolving most of the performance barriers.

1.3 Dissertation Overview

The research described in subsequent chapters assumes a common architectural framework, shown in Figure 1.7. Under ideal conditions, the organization is capable of providing a throughput of k results per cycle. For k = 1, the system reduces to the classical single pipeline architecture. Chapter 2 concentrates on optimizing the performance of a single pipeline. Optimization of a pipeline here refers to partitioning the pipeline into the number of segments such that overall throughput is maximized. Architectures that opt for a deeper single pipeline, as opposed to multiple pipelines, have been referred to as superpipelined architectures. Branches pose a serious performance bottleneck for such pipelined machines, as they interrupt the sequential flow of the instruction stream. Chapter 3 builds a common analytical platform for comparing different branch strategies in use for single-pipeline processors. Commonly used branch strategies reduce the branch delay by predicting a certain execution path and continuing to fetch along the predicted path. In case of incorrect prediction, the instructions fetched along the predicted path are wasted. Chapter 3 also examines this associated cost of wasted instruction fetches. Chapter 4 considers the cost-performance tradeoffs between the superscalar and superpipelined architectures. Performance of superscalars is critically dependent on the utilization of the multiple resources. The essential data and control dependencies in the instruction stream are the primary limiting factor against the perfect utilization of the k pipelines. Chapter 5 proposes an analytical model for these program dependencies. The inputs to the proposed model also provide a characterization of the inherent parallelism in program traces. A number of benchmarks are characterized using the Multiflow TRACE compiler. This characterization is used for predicting the attainable speedup under resource and scope constraints. The predicted speedup is close to the actual measured throughput of the compiler generated traces. Chapter 6 discusses a simplified extension of the model to include multiprocessors. The extended model is used to analyze combined systems, such as a superpipelined multiprocessor and a superscalar multiprocessor. Chapter 7 summarizes the work and the contributions of the dissertation. Finally, Chapter 8 outlines some ideas for future work in this area.

[Figure 1.7 block diagram: main memory supplies an i-cache and a d-cache at 1 instruction/cycle and 1 operand/cycle; the i-cache, with the help of a branch target buffer, delivers k instructions/cycle to an i-window of W instructions; the i-window issues k instructions/cycle to k pipelines of s stages each, which produce k results/cycle.]

Figure 1.7 Architectural framework used for this research.

CHAPTER 2
OPTIMAL PIPELINING

2.1 Introduction

Pipelining is one of the most attractive and widely used design alternatives in high-speed computer systems, as it offers a potential speedup of s when s pipeline stages are used. This chapter is an attempt to understand the tradeoffs and overhead that limit this theoretical speedup. A mathematical model is developed to provide insight into the effective roles played by the different parameters involved. The following are the main practical constraints that limit the performance of pipelined processors:

i) Instruction dependencies. An instruction may be dependent on previous instructions for either data or control. This may cause less than full utilization of the pipeline.

ii) Resource conflicts. An instruction may require the use of a certain pipeline resource during the same period as an earlier instruction, thus necessitating a delay in its start. This can also limit utilization of the pipeline [KuS86].

iii) Latch overhead. This places some constraints on the maximum clock frequency that can be used. There are three main components of this overhead [KuS86]: a) propagation delay through the latch, b) data skew resulting from the difference between the minimum and maximum signal propagation times through various logic paths, and c) clock skew between the different stages of the pipeline.

iv) Partitioning overhead. A pipeline stage must consist of an integer number of gate levels; hence the propagation delay of a pipeline stage is quantized, which may reduce the maximum clock frequency used for the entire pipeline.

v) Setup, or flush time, overhead. The larger the pipeline, the more time is required to fill it and flush it. This time can have a significant effect on the overall throughput. Note that apart from the initial setup time, additional flushes result from instruction dependency.

vi) Control path limitations. The time required to generate control signals for the pipeline stages also determines a minimum data path delay within any pipeline segment [KuS86].

The above constraints may be classified as those that limit the full utilization of the pipeline and those that limit the maximum clock frequency. Besides the constraints mentioned above, insufficient utilization of a pipeline can also result from not having enough data to keep the pipeline full. Such a restriction arises frequently in systems where full utilization of a computational resource is limited by, for example, insufficient I/O bandwidth.

2.1.1 Previous Research

A significant body of work has been reported on detecting pipeline hazards resulting from instruction dependency or resource conflict, and on optimal scheduling [Sha77, TjF70]. However, most of these studies assumed no restriction on the clock period. In the area of latch timing, Cotten [Cot65] and Hallin and Flynn [HaF72] developed some basic latch timing constraints. Hallin and Flynn's work was extended by Fawcett [Faw75]. Kunkel and Smith [KuS86] further analyzed Fawcett's constraints and also provided some CRAY-1S simulation results to illustrate the effect of different overheads. They simulated the specific case of the polarity-hold latch with a single-phase clock. The following latch timing constraints [Faw75, HaF72] form the basis for analyzing and modelling the latch overhead: i) Minimum clock high time: the clock pulse must be wide enough to ensure that valid data is latched; ii) Maximum clock high time: the clock pulse must be shorter than the minimum propagation delay from the input of one latch to the input of the next; and iii) Minimum clock period: the minimum clock period must be longer than the maximum propagation delay from the input of one latch to the input of the next, to ensure valid data is latched.

Kunkel and Smith [KuS86] begin with performance measurements assuming no latch overhead. Next, they include the data skew component of latch overhead. Finally, clock skew is incorporated, first assuming two-level fanout and then assuming four-level fanout circuitry. In each case scalar, vector, and combined loops are used as the three kinds of inputs. Based on these results they conclude that 8 to 10 levels of gate delay per segment yields optimum combined (scalar and vector) performance. Kunkel and Smith do not provide much insight into the factors governing the nature of the performance curves, i.e., the reasons behind certain characteristics displayed by the performance measurements. This omission provides the motivation for this chapter: "Can a theoretical model be developed that will include the different overheads associated with a generic pipeline and provide insights which will help predict the nature of modulations in the performance curve and the optimal performance?" The next section presents a theoretical model aimed at a better understanding of the behavior of a single-pipeline architecture.

2.2 A Generic Model

Let T be the latency of a logic tree without any pipelining. If the tree is divided into s segments, without considering any kind of overhead, the clock period with pipelining is Δt = T/s. Considering full utilization, the throughput G is G = 1/Δt. Pipeline utilization can be quantified in terms of a utilization parameter u, defined as

u = Sav/s ,

where Sav is the average number of segments active at a time. Thus, u = 0 for unutilized pipelines and u = 1 for fully utilized pipelines. Therefore, the actual throughput can be written

G = u/Δt . {2.1}

Equation (2.1) represents the effect of pipeline limitations that result in inefficient utilization of the pipeline. The actual clock period would not simply be inversely proportional to the number of segments, but rather involves certain overhead components. Pipeline overheads can be grouped into the following two categories:

a) Static overhead (c): This overhead is associated with each pipeline stage and is independent of the number of partitions of the pipeline. Propagation delay overhead (through the stage latch) and the clock-skew overhead fall in this category.

b) Dynamic overhead (κ): This overhead is a function of the number of partitions of the pipeline, i.e., it is a function of s. Data skew overhead, setup and flush time overhead, and partitioning overhead belong to this category.

With static and dynamic overheads included, the clock period Δt becomes

Δt = κ(T/s) + c . {2.2}

Under ideal conditions, i.e., without any overhead, κ = 1 and c = 0. For example, consider some of the timing constraints developed by Kunkel and Smith [KuS86] on the basis of earlier work in this area by Fawcett [HaF72]. Assuming polarity-hold latches and a single-phase clock, after satisfying the constraints mentioned in Section 2.1.1, the minimum clock period with pipelining, as derived in [KuS86], is

Δt = (n + 2) tmax , {2.3}

where n is the number of gate levels between latches (excluding two levels of gate delay in the latch itself) and tmax is the maximum gate delay. Using the terminology developed in this section, Equation (2.3) can be written as Δt = T/s + 2 tmax after separating the constant overhead term. In order to satisfy the lower bound on the clock period (third constraint in Section 2.1.1), there must be a minimum number of gate levels between latches for proper operation. Thus, if s is large, delay padding may be required. Again repeating the result derived by Kunkel and Smith [KuS86], assuming delay padding, the clock period in this range is

Δt = (1 - ρ) T/s + (6 - 4ρ) tmax , {2.4}

where ρ is the ratio of minimum gate delay to maximum gate delay. This indicates a dynamic overhead κ = 1 - ρ and a static overhead c = (6 - 4ρ) tmax. Interestingly, κ is less than 1 in this example. Recall that Equation (2.4) is valid only when using delay padding to satisfy the constraint on the minimum number of gate levels between latches. Since s is typically large in this circumstance, any apparent reduction in Δt due to reduced κ is more than offset by a larger constant overhead term c, as compared to Equation (2.2) under ideal conditions. Combining Equations (2.1) and (2.2),

G = u / (κ(T/s) + c) . {2.5}

The number of segments which maximizes the throughput can be obtained by solving

——=Q=cus2 + kT us + ukT , {2.6} os where u is the first order derivative of u with respect to s. The above equation does not presume any specific utilization pattern. Hence, it can be used for any known utilization pattern. Clearly pipeline utilization is a function of the number of pipeline segments. In Only the simplest problem, a linear function u=b-as (where a and b are arbitrary 36 constants) can be expected. Normally, shorter pipelines are easily filled and hence result in higher utilization. As the number of segments starts to go up, utilization starts to drop in a nonlinear manner. There is an upper limit to pipeline Utilization independent of the number of segments which can be set, for example, by the maximum memory bandwidth. This maximum utilization, umax, is independent of s. As loading in a program environment is likely to cause at least a second order term, in this chapter a second order utilization pattern is assumed for the purpose of simulation, thus,

u = u_max - r s² - v s . {2.7}

Therefore,

u' = -2 r s - v , {2.8}

where the coefficients r and v are constants for any given program environment. These can be empirically determined and depend upon the amount of vectorization and instruction dependency in a given program, in addition to other factors. In Equation (2.7), r represents the effect of increasing dependency and issuing delay between instructions. For example, a two-segment pipeline can only have a single-stage dependency, but with an increasing number of segments, utilization would tend to drop due to increased dependency. Variable v represents the first-order coefficient in the utilization model. This is one of the simpler possible models for program utilization. The second-order equation has been chosen only so that, as the number of segments changes, the utilization changes at a varying rate. Any equation of order two or more can capture this effect. Alternative and more complicated approaches to modelling pipeline utilization are discussed in the following chapters. Using Equations (2.7) and (2.8), Equation (2.6) can be simplified to

e s³ + f s² + g s + h = 0 , {2.9}

where

e = 2 r c
f = c v + 3 κ T r
g = 2 κ T v
h = -u_max κ T .

Equation (2.9) can be solved to obtain the optimal number of partitions, s_opt, under different conditions of utilization and overhead parameters. This equation is used to generate performance tables (Tables 2.2 - 2.6) and corresponding graphs (Figures 2.1 - 2.5) to illustrate sensitivity with respect to each parameter. Utilization coefficients have been varied over a range such that the utilization given by Equation (2.7) is between 0 and 1.

Table 2.1 Nomenclature and nominal values of model parameters.

Explanation                                                          Symbol      Nominal value
Latency of the logic tree                                            T           100 ns
Nominal number of segments                                           s_nom       5
Throughput of the pipeline                                           G
Optimum number of segments                                           s_opt
Constant term in the utilization equation                            u_max       0.6
First-order coefficient in the utilization equation                  v           0, 0.01
Second-order coefficient in the utilization equation                 r           0.004
Constant overhead term                                               c           10 ns
Dynamic overhead term                                                κ           1.3
Branch frequency                                                     b           0.1
Two-clock instruction frequency                                      x           0.1
Throughput at nominal values of all parameters                       G_nominal
Throughput, normalized relative to G_nominal                         G_norm
Throughput at s = s_opt, normalized relative to G_nominal            G_opt
Throughput at s = s_subop = 1, normalized relative to G_nominal      G_subop
Throughput at s = s_ovrop = 10, normalized relative to G_nominal     G_ovrop
Throughput at s = s_nom                                              G_nom
Throughput gain at s = s_opt, relative to G_nom                      ΔG_opt
Throughput gain at s = s_subop = 1, relative to G_nom                ΔG_subop
Throughput gain at s = s_ovrop = 10, relative to G_nom               ΔG_ovrop

A given partitioning of a pipeline can be considered sub-optimal or over-optimal depending on whether the number of segments in the pipeline is less than or more than the optimal number of segments, respectively. Performance measurements have been taken at sub-optimal, optimal, and over-optimal points. All the throughput measurements are normalized with respect to G_nominal, which represents the throughput at certain nominal values of all parameters, as listed in Table 2.1. There are clearly other options for normalization. The chosen option is preferred assuming an interest in estimating throughput with respect to an existing (nominal) computer design. Under the nominal conditions here, static overhead is assumed to be about one tenth of the period without pipelining. Dynamic overhead is assumed to be 1.3, as compared to 1 in the ideal case. The assumed nominal values of the utilization coefficients result in about half utilization of the pipeline. Suppose there is an existing pipeline design with a certain number of segments, s_nom, and corresponding throughput, G_nom. Now, if the number of partitions is changed to s with corresponding throughput G, then the throughput gain in moving from s_nom to s pipeline segments can be defined as the ratio

ΔG = G / G_nom . {2.10}

Tables 2.7 through 2.11 and the corresponding graphs, Figures 2.6 through 2.10, show the sensitivity of this gain as a function of the overhead and utilization coefficients. All the gain calculations are with respect to a reference pipeline having s_nom = 5 segments.
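As a concreteness check (a sketch added here, not part of the original study), the following Python fragment evaluates the model numerically at the nominal parameter values of Table 2.1, using a numerical root-finder in place of an analytic solution of Equation (2.9). It reproduces the c = 10 ns row of Table 2.2 (s_opt of about 5.5 and a normalized optimal throughput of about 1.01).

import numpy as np

# Nominal model parameters from Table 2.1 (time values in ns)
T, c, kappa = 100.0, 10.0, 1.3        # logic-tree latency, static overhead, dynamic overhead
u_max, v, r = 0.6, 0.01, 0.004        # utilization-model coefficients, Equation (2.7)
s_nom = 5                             # reference (nominal) number of segments

def utilization(s):
    # Second-order utilization model, Equation (2.7)
    return u_max - r * s**2 - v * s

def throughput(s):
    # G = u / (kappa*(T/s) + c), Equations (2.1) and (2.2)
    return utilization(s) / (kappa * T / s + c)

# Cubic of Equation (2.9): e*s^3 + f*s^2 + g*s + h = 0
e, f, g, h = 2*r*c, c*v + 3*kappa*T*r, 2*kappa*T*v, -u_max*kappa*T
s_opt = max(root.real for root in np.roots([e, f, g, h]) if abs(root.imag) < 1e-9)

G_nominal = throughput(s_nom)
print(f"s_opt   = {s_opt:.2f}")                          # ~5.5 (cf. Table 2.2, c = 10 ns)
print(f"G_opt   = {throughput(s_opt)/G_nominal:.2f}")    # ~1.01, normalized
print(f"G_subop = {throughput(1)/G_nominal:.2f}")        # s = 1
print(f"G_ovrop = {throughput(10)/G_nominal:.2f}")       # s = 10

Varying c, κ, u_max, v, or r in the same fragment regenerates the sensitivity trends shown in Tables 2.2 through 2.6.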

2.3 Inferences

The effects of the overhead and the utilization coefficients on the actual (normalized) throughput can be summarized as:

i) As the static overhead (c) increases, the optimum throughput (G_opt) and the optimum number of partitions (s_opt) decrease. From Figure 2.1, it can be seen that for small values of c, changes in c have a predominant effect. This indicates the possibility of dramatic change in the optimal throughput (G_opt), as well as optimal partitioning (s_opt), if a balanced clock (negligible unintended skew) is disturbed.

ii) As the dynamic overhead (κ) increases, G_opt decreases whereas, unlike the previous case, s_opt increases. From Figure 2.2, it can be seen that, similar to the earlier case, a given Δκ has more effect when κ is small than when it is large. In a typical system where κ is close to 1 and c is close to 0, on comparing Figures 2.1 and 2.2, it can be concluded that a given Δc would normally have a stronger impact on system performance than an equivalent Δκ.

Table 2.2 Normalized throughput (G_norm) versus static overhead (c).

Static overhead,   Optimal number of     Normalized throughput
c (ns)             segments, s_opt       G_subop   G_opt   G_ovrop
0                  6.29                  0.36      1.47    0.62
10                 5.51                  0.33      1.01    0.35
20                 5.03                  0.31      0.78    0.24
30                 4.69                  0.29      0.64    0.19
40                 4.44                  0.28      0.55    0.15
50                 4.23                  0.26      0.48    0.13
60                 4.05                  0.25      0.43    0.11
70                 3.91                  0.23      0.39    0.10
80                 3.78                  0.22      0.35    0.09
90                 3.66                  0.21      0.32    0.08
100                3.56                  0.20      0.30    0.07

Table 2.3 Normalized throughput (G_norm) versus dynamic overhead (κ).

Dynamic overhead,   Optimal number of     Normalized throughput
κ                   segments, s_opt       G_subop   G_opt   G_ovrop
1.00                5.34                  0.43      1.20    0.40
1.10                5.40                  0.39      1.13    0.38
1.20                5.46                  0.36      1.07    0.36
1.30                5.51                  0.33      1.01    0.35
1.40                5.55                  0.31      0.96    0.33
1.50                5.59                  0.29      0.91    0.32
1.60                5.62                  0.28      0.87    0.31
1.70                5.65                  0.26      0.83    0.30
1.80                5.68                  0.25      0.79    0.29
1.90                5.71                  0.23      0.76    0.28
2.00                5.73                  0.22      0.73    0.27

Table 2.4 Normalized throughput (G_norm) versus constant term of the utilization model (u_max).

Constant term of the   Optimal number of     Normalized throughput
util. model, u_max     segments, s_opt       G_subop   G_opt   G_ovrop
1.00                   7.08                  0.56      2.06    1.74
0.95                   6.91                  0.53      1.92    1.57
0.90                   6.73                  0.51      1.78    1.39
0.85                   6.54                  0.48      1.64    1.22
0.80                   6.35                  0.45      1.51    1.04
0.75                   6.15                  0.42      1.38    0.87
0.70                   5.95                  0.39      1.25    0.70
0.65                   5.73                  0.36      1.13    0.52
0.60                   5.51                  0.33      1.01    0.35
0.55                   5.27                  0.31      0.89    0.17
0.50                   5.02                  0.28      0.78    0.00

Table 2.5 Normalized throughput (G_norm) versus first-order coefficient of the utilization model (v).

First-order coeff. of   Optimal number of     Normalized throughput
the util. model, v      segments, s_opt       G_subop   G_opt   G_ovrop
0.0010                  6.09                  0.34      1.14    0.66
0.0030                  5.96                  0.34      1.11    0.59
0.0050                  5.82                  0.34      1.08    0.52
0.0070                  5.69                  0.34      1.05    0.45
0.0090                  5.57                  0.34      1.02    0.38
0.0110                  5.45                  0.33      1.00    0.31
0.0130                  5.33                  0.33      0.97    0.24
0.0150                  5.21                  0.33      0.95    0.17
0.0170                  5.09                  0.33      0.92    0.10
0.0190                  4.98                  0.33      0.90    0.03

Table 2.6 Normalized throughput (G_norm) versus second-order coefficient of the utilization model (r).

Second-order coeff. of   Optimal number of     Normalized throughput
the util. model, r       segments, s_opt       G_subop   G_opt   G_ovrop
0.0005                   11.03                 0.34      1.57    1.57
0.0010                   9.02                  0.34      1.40    1.39
0.0015                   7.90                  0.34      1.29    1.22
0.0020                   7.14                  0.34      1.21    1.04
0.0025                   6.58                  0.34      1.15    0.87
0.0030                   6.15                  0.34      1.09    0.70
0.0035                   5.80                  0.34      1.05    0.52
0.0040                   5.51                  0.33      1.01    0.35
0.0045                   5.26                  0.33      0.97    0.17
0.0050                   5.04                  0.33      0.94    0.00

Figure 2.1 Normalized throughput (G_norm) versus static overhead (c).

Figure 2.2 Normalized throughput (G_norm) versus dynamic overhead (κ).

Figure 2.3 Normalized throughput (G_norm) versus constant term of the utilization model (u_max).

Figure 2.4 Normalized throughput (G_norm) versus first-order coefficient of the utilization model (v).

Figure 2.5 Normalized throughput (G_norm) versus second-order coefficient of the utilization model (r).

Table 2.7 Throughput gain (ΔG) versus static overhead (c).

Static overhead,   Optimal number of     Throughput gain
c (ns)             segments, s_opt       ΔG_subop   ΔG_opt   ΔG_ovrop
0                  6.29                  0.26       1.06     0.44
10                 5.51                  0.33       1.01     0.35
20                 5.03                  0.40       1.00     0.31
30                 4.69                  0.46       1.00     0.29
40                 4.44                  0.51       1.01     0.28
50                 4.23                  0.55       1.02     0.27
60                 4.05                  0.59       1.02     0.26
70                 3.91                  0.63       1.03     0.26
80                 3.78                  0.66       1.04     0.25
90                 3.66                  0.69       1.05     0.25
100                3.56                  0.71       1.05     0.25

Table 2.8 Throughput gain (ΔG) versus dynamic overhead (κ).

Dynamic overhead,   Optimal number of     Throughput gain
κ                   segments, s_opt       ΔG_subop   ΔG_opt   ΔG_ovrop
1.00                5.34                  0.36       1.00     0.33
1.10                5.40                  0.35       1.01     0.34
1.20                5.46                  0.34       1.01     0.34
1.30                5.51                  0.33       1.01     0.35
1.40                5.55                  0.33       1.01     0.35
1.50                5.59                  0.33       1.01     0.36
1.60                5.62                  0.32       1.01     0.36
1.70                5.65                  0.32       1.01     0.36
1.80                5.68                  0.32       1.02     0.37
1.90                5.71                  0.31       1.02     0.37
2.00                5.73                  0.31       1.02     0.37

Table 2.9 Throughput gain (ΔG) versus constant term of the utilization model (u_max).

Constant term of the   Optimal number of     Throughput gain
util. model, u_max     segments, s_opt       ΔG_subop   ΔG_opt   ΔG_ovrop
1.00                   7.08                  0.30       1.09     0.92
0.95                   6.91                  0.30       1.08     0.88
0.90                   6.73                  0.30       1.07     0.83
0.85                   6.54                  0.31       1.06     0.78
0.80                   6.35                  0.31       1.05     0.72
0.75                   6.15                  0.32       1.04     0.65
0.70                   5.95                  0.32       1.03     0.57
0.65                   5.73                  0.33       1.02     0.47
0.60                   5.51                  0.33       1.01     0.35
0.55                   5.27                  0.34       1.00     0.20
0.50                   5.02                  0.36       1.00     0.00

Table 2.10 Throughput gain (ΔG) versus first-order coefficient of the utilization model (v).

First-order coeff. of   Optimal number of     Throughput gain
the util. model, v      segments, s_opt       ΔG_subop   ΔG_opt   ΔG_ovrop
0.0010                  6.09                  0.31       1.03     0.60
0.0030                  5.96                  0.31       1.03     0.55
0.0050                  5.82                  0.32       1.02     0.49
0.0070                  5.69                  0.33       1.02     0.44
0.0090                  5.57                  0.33       1.01     0.38
0.0110                  5.45                  0.34       1.01     0.32
0.0130                  5.33                  0.34       1.00     0.25
0.0150                  5.21                  0.35       1.00     0.18
0.0170                  5.09                  0.36       1.00     0.11
0.0190                  4.98                  0.37       1.00     0.04

Table 2.11 Throughput gain (ΔG) versus second-order coefficient of the utilization model (r).

Second-order coeff. of   Optimal number of     Throughput gain
the util. model, r       segments, s_opt       ΔG_subop   ΔG_opt   ΔG_ovrop
0.0005                   11.03                 0.28       1.32     1.31
0.0010                   9.02                  0.29       1.20     1.19
0.0015                   7.90                  0.30       1.13     1.07
0.0020                   7.14                  0.30       1.09     0.94
0.0025                   6.58                  0.31       1.06     0.80
0.0030                   6.15                  0.32       1.03     0.66
0.0035                   5.80                  0.33       1.02     0.51
0.0040                   5.51                  0.33       1.01     0.35
0.0045                   5.26                  0.34       1.00     0.18
0.0050                   5.04                  0.35       1.00     0.00

iii) As utilization (u) increases, G_opt as well as s_opt increase. A study of Figures 2.3 - 2.5 leads to the conclusion that higher-order coefficients have a stronger effect on optimal partitioning (s_opt) and on the optimal throughput (G_opt) as compared to the lower-order coefficients. In other words, higher-order coefficients would require tighter control in order to maintain a certain level of performance. Also, for large s the slope of the performance curves (i.e., ∂G/∂s) in Figure 2.3 becomes highly insensitive to variations in u_max, the constant term in the utilization model. A mathematical explanation for this, although not presented here, can be derived from the expression for dG/ds given in a later section.

The following conclusions can be made from the gain plots:

i) As the static overhead (c) increases, gain increases for less than the existing nominal number of segments (s_nom, see Table 2.1). For more than s_nom segments, gain decreases with increasing c (Figure 2.6). A gain can be considered an incentive if it is greater than 1. Thus, with increasing static overhead, there is a higher incentive to modify an existing (reference) pipeline to have a smaller number of segments. This behavior is seen as a result of s_opt going down with increasing c.

ii) As the dynamic overhead (κ) increases, gain decreases for fewer than the reference number, s_nom, of segments. For more than the existing number of segments, gain increases with increasing κ (Figure 2.7). Therefore, with increasing κ, there is less incentive to change an existing pipeline to a smaller number of segments and vice versa.

iii) As the utilization (u) increases, gain decreases for less than the existing number of segments. For more than the reference number of segments, gain increases with increasing utilization (Figures 2.8 - 2.10). Again, because s_opt increases with higher utilization, there is increasing incentive to redesign an existing pipeline to have a larger number of segments. Note that a variation in any of the utilization coefficients has a stronger performance impact on a system with a large number of segments than on a system with fewer segments.

An interesting but not obvious property of the throughput gain plots is the non-monotonicity of the optimal gain. Look at the plot for gain versus static overhead (Figure 2.6). It can be seen that the optimal (maximal) gain is at its minimum when c is at 20 ns. Figure 2.11 shows the effect of a utilization change on the optimal gain minimum. If the utilization is increased, the minimum shifts to where c is at 50 ns. Therefore, for a certain change in the static overhead c, say from c = 25 ns to 45 ns, scalar code (smaller utilization) can show an increase in optimal gain, whereas vector code (higher utilization) continues to show a decrease in the optimal gain.

Figure 2.6 Throughput gain (ΔG) versus static overhead (c).

Figure 2.7 Throughput gain (ΔG) versus dynamic overhead (κ).

Figure 2.8 Throughput gain (ΔG) versus constant term of the utilization model (u_max).

Figure 2.9 Throughput gain (ΔG) versus first-order coefficient of the utilization model (v).

Figure 2.10 Throughput gain (ΔG) versus second-order coefficient of the utilization model (r).

Figure 2.11 Optimal throughput gain (ΔG_opt) versus static overhead (c), for second-order utilization coefficients r = 0.0040 and r = 0.0025.

A similar effect is observed in Kunkel and Smith's simulation results: as a result of moving from 2-level fanout clock skew overhead (i.e., clock distribution logic with 2 levels of gate delay) to 4-level fanout clock skew overhead (in other words, moving to higher constant overhead), the scalar optimal gain increases, whereas the vector optimal gain decreases. The optimal gain minimum occurs when s_opt drops to s_nom. Since s_opt is greater for a vector environment, with an increase in c it drops to s_nom later than in the case of a scalar environment.

2.3.1 Correspondence with Previously Published Experimental Results

As illustrated in Section 2.2, the minimum clock period expressions in Kunkel and Smith's paper [KuS86] can be rearranged to highlight static and dynamic overhead terms. In the range where s is large, inclusion of data skew overhead decreases the dynamic overhead (κ), while the static overhead (c) increases. The decrease in κ and the increase in c both result in the reduced throughput gain reported by Kunkel and Smith. Also, there is a dramatic reduction in actual throughput because of a change in c from zero to a positive non-zero value. The increase in static overhead (c) also reduces the optimum number of segments, s_opt. This reduction in s_opt becomes noticeable for the scalar code in their study. But for the vector code, because of higher utilization than the scalar code, s_opt still stays high enough to be unnoticed. Inclusion of clock skew overhead further increases the constant overhead (c) and hence s_opt continues to move towards a smaller number of segments. With increasing static overhead, throughput gain continuously increases in the suboptimal region, whereas it decreases in the over-optimal region of all the performance tables obtained in [KuS86].

2.4 Potential Improvements to the Model

The definition of the utilization parameter, u, as given in relation to its role in Equation (2.1), best fits the case of pipelines where each segment always takes only one clock period to perform its operation. Such pipelines are typically at the subsystem level, e.g., a floating point multiplier pipeline. If pipelines with variable delay, where a segment may take more than one clock to complete, are included, then u as defined earlier does not fit the need. For example, if every stage takes 2 clocks, then although the utilization as defined may be 1 (or 100 percent), the throughput would only be one result every 2 clocks and not one result every clock, as given by Equation (2.1). Such pipelines are typically at the system level, e.g., an instruction fetch-decode-execute pipeline.

In Equation (2.1), in a more generic sense, pipeline utilization (u) should be thought of as the factor by which the maximum possible throughput (1/Δt) is modified to yield the actual throughput (G). Actual throughput is strictly determined by the rate at which the outputs are available from the last segment. If any data item takes more than one clock in the last segment, a decrease in throughput results. Similarly, if any data item takes more than one clock in any segment, say segment i, the effect of this slowdown of segment i will ripple through the pipeline and result in the same slowdown at the last segment, segment s, after (s - i) clocks. Assume that while the effect of the slowdown of one stage is rippling to the final stage, there is no other stage that slows down. Under these conditions, the following equation provides a model for u,

u = 1 / [1 + (s-1) b + Σ_{i=1..s} Σ_{j=2..J} (j-1) x_{i,j}] , {2.11}

where b is the average number of setup and flush sequences per data item, J is the maximum number of cycles any data item spends in any segment, and x_{i,j} is the probability that a data item takes j cycles in the i-th segment. For example, consider a 5-stage instruction pipeline in an environment such that an average of 1 out of every 10 instructions takes 2 clocks in the execution stage (last segment) and 1 out of every 10 instructions is a branch instruction. Then s = 5, b = 0.1, J = 2, x_{5,2} = 0.1, and x_{1,2} = x_{2,2} = x_{3,2} = x_{4,2} = 0. This gives u = 10/15. Any random sequence of 10 instructions would be expected to lose 4 clocks during a branch and 1 clock due to a 2-clock instruction and, hence, take 15 clocks. For further analysis, assume a simplified view of an instruction pipeline such that J = 2 and, for any segment i, x_{i,2} = x. As an example, if the i-th segment refers to the operand fetch stage, x refers to the fraction of instructions spending an additional clock during operand fetch. Typically, each stage would have its own independent fraction of instructions which occupy the stage for more than one clock; because this is not critical to the current discussion, additional variables are not introduced. From Equation (2.11),

u = 1 / [1 + (s-1) b + s x] . {2.12}

Therefore, throughput can be written

G = [1 / (1 + (s-1) b + s x)] * [1 / (κ (T/s) + c)] . {2.13}

For maximum throughput,

s_opt = sqrt[ (1-b) κ T / ((x+b) c) ] . {2.14}
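As a quick numerical illustration (a sketch, not part of the original report), the fragment below evaluates Equations (2.11) through (2.14) in Python: first the worked five-stage example, which gives u = 10/15, and then the closed-form optimum of Equation (2.14) at the nominal values of Table 2.1.

# Worked example following Equation (2.11): s = 5, b = 0.1, J = 2, x_{5,2} = 0.1
s, b, J = 5, 0.1, 2
x = {(5, 2): 0.1}          # x[(i, j)]: probability a data item takes j cycles in segment i
u = 1.0 / (1.0 + (s - 1) * b +
           sum((j - 1) * x.get((i, j), 0.0)
               for i in range(1, s + 1) for j in range(2, J + 1)))
print(u)                   # 0.666..., i.e., 10/15 as in the text

# Optimal segmentation from Equation (2.14) at nominal values (Table 2.1);
# x_slow is the per-segment slowdown fraction of the simplified model of Equation (2.12)
T, c, kappa, x_slow = 100.0, 10.0, 1.3, 0.1
s_opt = ((1 - b) * kappa * T / ((x_slow + b) * c)) ** 0.5
print(s_opt)               # ~7.65, matching Tables 2.12 and 2.13 at b = x = 0.1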

As static overhead (c) approaches zero, the optimum number of segments (s_opt) approaches infinity. As observed before, for small values, c has a dominant effect. Tables 2.12 and 2.13 and the corresponding graphs, Figures 2.12 and 2.13, show the variation in normalized throughput as a function of the number of pipeline segments. As branch frequency (b) decreases, s_opt and the corresponding optimal throughput both increase. In other words, for a fixed partitioning, the pipeline becomes suboptimal as b decreases. The same observation holds for segment slowdown frequency (x).

Effect of Buffering: So far, additional buffering at a segment output has been ignored. In the presence of such buffers, slowdown of an intermediate segment does not necessarily slow down the final segment. Although such buffering is quite often used in system-level pipelines (e.g., an instruction FIFO), its inclusion would considerably complicate the model. The solution to a general model of this type can be derived using queueing theory techniques.

Second-order Effects: The utilization model assumed in Section 2.2 also hides certain second-order details. For example, consider the rate of change of throughput with respect to s. From Equations (2.1) and (2.2),

dG/ds = u'/Δt + (u κ T)/(s² Δt²) . {2.15}

Considering the given utilization model, with increasing s, the first term in Equation (2.15) becomes more and more negative, whereas the second term becomes less and less positive. In other words, dG/ds monotonically decreases with increasing s, leading to diminishing return (reduced throughput improvement) with increasing s. In an actual environment this may not be the case. Utilization of a pipeline does not necessarily go down with an increasing number of segments. For example, if a larger number of partitions leads to better mapping of the reservation tables (i.e., better resource allocation), utilization might even go up. Also, for the same number of segments, utilization may change in a statistical and/or periodic fashion. If these possibilities are incorporated into the utilization model, throughput curves could potentially have multiple maxima and minima and there would be points of maximum and minimum return from an incremental change in partitioning.

2.5 Summary

This chapter provides an approximate model of the behavior of a pipeline and an understanding of the factors involved in determining optimal performance. In spite of its simplicity, the model can be considered a useful first-order tool for comparative study or sensitivity analysis of the performance of a pipeline in different environments with different overheads.

Table 2.12 Normalized throughput (G_norm) versus branch frequency (b).

Branch frequency,   Optimal number of     Normalized throughput
b                   segments, s_opt       G_subop   G_opt   G_ovrop
0.02                10.30                 0.44      1.36    1.36
0.04                9.44                  0.44      1.26    1.26
0.06                8.74                  0.44      1.18    1.17
0.08                8.15                  0.44      1.10    0.09
0.10                7.65                  0.44      1.04    1.03
0.12                7.21                  0.44      0.99    0.97
0.14                6.83                  0.44      0.94    0.91
0.16                6.48                  0.44      0.90    0.86
0.18                6.17                  0.44      0.86    0.82
0.20                5.89                  0.44      0.83    0.78

Table 2.13 Normalized throughput (G_norm) versus segment slowdown frequency (x).

Segment slowdown   Optimal number of     Normalized throughput
frequency, x       segments, s_opt       G_subop   G_opt   G_ovrop
0.02               9.87                  0.48      1.42    1.42
0.04               9.14                  0.47      1.30    1.29
0.06               8.55                  0.46      1.20    1.19
0.08               8.06                  0.45      1.11    1.10
0.10               7.65                  0.44      1.04    1.03
0.12               7.29                  0.44      0.98    0.96
0.14               6.98                  0.43      0.93    0.90
0.16               6.71                  0.42      0.88    0.85
0.18               6.46                  0.41      0.84    0.80
0.20               6.24                  0.41      0.80    0.76

Figure 2.12 Normalized throughput (G_norm) versus branch frequency (b).

Figure 2.13 Normalized throughput (G_norm) versus segment slowdown frequency (x).

Pipeline utilization models were presented for both sub-system as well as system-level pipelines. Effects of branching and segment slowdown were also considered in the case of simple system-level pipelines. Small changes in the constant overhead term were shown to have a large impact on optimal pipeline behavior. Increasing dynamic overhead increases the optimal number of segments, whereas increasing static overhead requires fewer segments for optimal performance. The results obtained are found to be in very close agreement with the CRAY-1 simulation results obtained by Kunkel and Smith [KuS86], providing an analytical basis for their results as well as additional insight into the pipeline optimization problem. It is fair to conclude at this point that there are constraints that limit the speedup attainable through a single pipeline. One way to move beyond the optimum throughput of a single pipeline may be to add several such pipelines. Architectures adopting such an approach are referred to as superscalar architectures, and they form the basis of discussion in Chapter 4, which models multiple pipelines. As alluded to in Section 2.4, branches pose a significant threat to high pipeline utilization. The drop in utilization due to the inability to fetch instructions, arising from the uncertainty due to conditional branches, is further magnified on systems with multiple pipelines. A wide variety of branch strategies have been proposed to reduce the branch delay. The next chapter analyzes these strategies through a probability-based model in the context of single-pipeline systems. Chapters 4 and 5 extend this analysis to superscalars with speculative execution.

CHAPTER 3 BRANCH STRATEGIES: MODELLING AND OPTIMIZATION

3.1 Introduction

Instruction dependency introduced by conditional branch instructions, which are resolved only at run time, can have a severe performance impact on pipelined machines. A variety of strategies are in wide use to minimize this impact. Additional instruction traffic generated by these branch strategies can also have an adverse effect on system performance. Therefore, in addition to the reduction a branch prediction strategy offers in average branch delay, the resulting excess instruction traffic can be an important parameter in evaluating overall strategy effectiveness. The objective of this chapter is two-fold: to develop a model for different approaches to the branch problem and to help select an optimal strategy after taking into account the additional instruction traffic generated by branch strategies. The first section presents the details of the model, which also forms the basis of a new classification of the different branch strategies commonly employed. The following sections derive certain inferences from the results obtained and lead to some hybrid strategies.

3.1.1 Previous Research

Throughput in a pipeline environment is obtained by overlapping different instructions in different stages of execution. This implies an ability to predict and issue successive instructions before the complete execution of a given instruction. Dependence of an instruction on the result of a predecessor instruction limits this ability. Tjaden and Flynn [TjF70] provide an early framework in the area of formalizing the concept of instruction dependency. The effect of conditional branches on system performance was further substantiated by Riseman and Foster [RiF72]. Interest in different branch strategies for minimizing performance impact has been renewed with the advent of new RISC machines. Most of the recent work in this area has concentrated on specific branch strategies and on improving prediction accuracy. Smith [Smi81] discusses in detail different strategies for improving prediction accuracy. Lee and Smith [LeS84] and McFarling and Hennessy [McH86] examine a range of schemes for reducing branch penalty. DeRosa and Levy [DeL87] provide a quantitative comparison of different design alternatives for the branch instruction. Hsu and Davidson [HsD86] suggest a scheme whereby a large number of branch delay slots may be filled with guarded instructions, on machines such as the CRAY-1, where conditional branch resolution may take 14 clocks. These instructions are considered "guarded" because if branch resolution is not as expected, they are effectively treated as NOPs. Ditzel and McLellan [DiM87] and Grohoski et al. [GKT90] discuss branch strategies as implemented on the Clipper and RS6000 processors, respectively.

3.2 The Model

Consider a pipeline with s segments (Figure 3.1) executing an instruction J, which enters the pipeline the very next clock after instruction I. Assume a pipeline segment delay equivalent to the system clock period. Suppose the instruction J at the start of its p_J-th stage of execution requires the result available at the completion of the q_I-th stage of execution of instruction I. The degree of dependency in such a case is defined as d_{I,J} = q_I - p_J, where q_I > p_J. Suppose that instead of entering the pipeline the very next clock after I, J followed after an additional delay of x_{I,J} clocks. Thus, if instruction I entered the pipeline at clock i and J entered at clock j, then x_{I,J} = j - i - 1. The degree of dependency is now reduced to

d_{I,J} = q_I - p_J - x_{I,J} , {3.1}

where a segment freeze possibility, i.e., the possibility that a data item may spend more than one clock in a certain pipeline segment, is ignored. If d_{I,J} < 0, I and J have null pipeline dependency, which means this dependency has no impact on pipeline throughput. On the other hand, if d_{I,J} > 0, I and J have positive pipeline dependency, which means this dependency costs d_{I,J} clocks of pipeline throughput. In other words, there is no pipeline output for d_{I,J} clocks. The degree of dependency is maximum when p_J = p_J(min) = 1, q_I = q_I(max) = s, and x_{I,J} = x_{I,J}(min) = 0, i.e., d_{I,J}(max) = s - 1. Next consider instruction dependency due to branch instructions. Let I represent a conditional branch instruction. In that case, the following instruction J cannot be fetched until the execution of I is complete. Assuming that the instruction fetch (IF) stage is the first stage of the pipeline (p_J = 1) and the execution (E) stage, which tests the condition code, is the last pipeline stage (q_I = s), this leads to a maximum pipeline dependency of s - 1. Although condition code testing by the branch instruction I can typically be done in a stage prior to the execution stage, normally it can only be done after the previous instruction I-1 clears the execution stage and sets the condition code. So, branch instructions can potentially result in the maximum possible slowdown of s - 1 clocks.

Figure 3.1 Instruction dependency in a pipeline: instruction J in its p-th stage requires the result available from the q-th stage of instruction I, so d_{I,J} = q_I - p_J - x_{I,J}.

Figure 3.2 An instruction pipeline, where s_f is the number of sub-stages in the instruction fetch stage, s_bu is the pipeline stage that resolves unconditional branches, and s_bc is the pipeline stage that resolves conditional branches.

In general, branch instructions need not wait until the last pipeline stage for their resolution, especially unconditional branches. A pipeline stage is considered frozen if it cannot accept a new data item at the end of the current clock period. Such a situation arises when some unexpected condition is encountered, such as a cache miss or a branch. A freeze implies delay at the subsequent pipeline stages as they wait for the frozen stage output. A successful branch instruction involves the fetch and execution of an out-of-sequence instruction. Fetching the branch target instruction consists of i) a target address calculation and ii) a target fetch. Each of these steps can cause a freeze. In this chapter other possible freeze conditions are deliberately ignored.

3.3 Classification of Branch Strategies

Branch strategies can be classified based on how they attempt to reduce the branch penalties, as shown in Table 3.1. The names of most of the strategies are self-descriptive. The unobvious ones are briefly described below.

The Loop Buffer strategy is based on a high-speed memory in the instruction fetch stage of the processor. Some CDC machines (6600, 7600, and Star 100) as well as the CRAY-1 have used this idea. These buffers (Figure 3.3) can detect whether the branch target (forward or backward) lies within the environment captured by the buffer and, if so, the instruction fetch delay and the possible freeze delay are eliminated. Since a hit in the loop buffer avoids any external memory access, it also reduces extra instruction traffic in case of incorrect prediction. Although loop buffers may appear to be similar to instruction caches, they are much smaller in size and, hence, lower in implementation cost. This strategy further assumes that branches are not likely to be taken.

Usually branch instruction execution does not require any operand fetch. Some IBM machines (370 series) use the operand fetch (OF) slot of the pipeline for fetching from the branch target path. The branch is still assumed not likely to be taken. The Fetch Target in OF-slot strategy is based on this technique. The Fetch Both Paths strategy, also used on some IBM machines (370/168, 3033), uses the brute-force approach of fetching (not decoding) both the sequential and non-sequential instruction streams in case a branch is decoded.

The Delayed Branch [McH86] and Predict Branch Always Taken with Target Copy strategies modify the instruction sequence at compile time. The former delays the entry of the dependent branch instruction by inserting instructions that are common to both the sequential and non-sequential paths. In the latter strategy, a portion of target code, as dictated by the effective pipeline length for branch resolution, is copied (Figure 3.4) following the branch instruction.

Table 3.1 Classification of branch strategies.

                                              Reduce dependency by                 Reduce freeze of
Strategy                               Label  incr. p_J  decr. q_I  incr. x_{I,J}  target fetch  address calc
Predict Never Taken                    PBNT   X
Loop Buffer                            LB     X                                    X
Pre-calculate Target Address           PTA    X                                                  X
Fetch Target in OF-slot                FTOF   X                                    X             X
Predict Always Taken                   PBAT   X
Predict Always Taken with Target Copy  PTTC   X                     X              X             X
Fetch Both Paths                       FBP    X                                    X             X
Delayed Branch                         DB     X          X          X              X             X
Taken/Not-taken in the Decode Stage    TNTD              X
Branch Target Buffer                   BTB               X                         X             X

Note: X indicates how the strategy attempts to reduce branch cost.

Figure 3.3 A loop buffer (256 bytes): the most significant address bits are compared to determine a hit, and in case of a hit the instruction to be decoded is supplied from the buffer.

This strategy is also assumed to predict branches as always taken. Note that the Delayed Branch and Target Copying strategies also indirectly reduce the address calculation and target fetch freezes by delaying reliance on the target code and thereby offering time to calculate the address and fetch the target. The last two strategies in Table 3.1 are based on active branch prediction [LeS84]. This prediction information can be obtained and improved for accuracy in many different ways [Smi81]. Branch Target Buffer (BTB) refers to a small associative memory in the instruction fetch stage of the processor. Instruction fetch addresses are associatively matched with the buffer contents and, in the case of a hit, it predicts the most likely branch outcome as well as the most recent target address (Figure 3.5). As a result, target fetch does not need to wait for the branch decode and target address calculation. In case of a miss in the BTB, branch instructions are handled in a manner similar to the Predict Branch Never Taken strategy.

3.4 Branch Prediction

Branch strategies do not eliminate branch delay; they reduce it with a certain probability. An implicit assumption about the most likely branch outcome and a commitment to the sequential or to the branch path is made to varying degrees. This commitment normally reduces the penalty associated with the chosen path but may increase the penalty of taking the discarded path in case of incorrect prediction. As a result, overall performance improvement becomes critically dependent on the probability of correct prediction. Table 3.2 defines and explains the terms associated with the model. Note that for K = 0 or b = 0, the performance throughput, G, is assumed to be at its peak rate of one instruction per cycle. Thus, all other pipeline overheads (discussed in Chapter 1 and also in [DuF90]) are ignored.

Cost of Branch Prediction. The discussion above has centered around assessing the performance of different branch strategies. Consider the two primary costs involved: i) implementation cost and ii) operational cost. Implementation cost refers to the hardware/software costs involved in implementing the branch strategy. Since such costs are variable with technology, this cost is ignored. On the other hand, operational cost refers to the added run time cost, for example, the additional instruction traffic that results on the system bus with every incorrect branch prediction. Although incorrect predictions are the primary source of extra instruction traffic, even delayed correct prediction can cause wasted instruction fetches. For architectures that allow machine state update by instructions in the predicted path, there is an additional run time overhead of shadowing the original machine state to be able to recover from an incorrect prediction.

      CMP R1, R2
      JZ  xx
    * ADD R3, R4
    * SUB R3, R5
    * INC R4
    * ADD R3, R4
      MOV R6, R7
      ADD R6, R2
      MOV R1, mem

xx:   ADD R3, R4
      SUB R3, R5
      INC R4
      ADD R3, R4
xx+4: MOV R6, R3

Instructions marked with an asterisk (*) are the instructions copied from the target (xx) at compile time. In the case of a successful branch, control transfers to the label xx+4, after executing the marked (*) target instructions via sequential fetch. In case the branch is not taken, marked instructions are discarded without execution after fetch and decode.

Figure 3.4 Predict branch always taken with target copy (PTTC).

Figure 3.5 A branch target buffer: each entry holds a branch instruction address, a predicted target address, and a branch prediction.

Table 3.2 Table of definitions.

Predicted    Actual       Probability   Branch Penalty   Instruction Traffic Penalty

no branch    no branch    p_{n,n}       K_{n,n}          I+_{n,n}
no branch    branch       p_{n,b}       K_{n,b}          I+_{n,b}
branch       no branch    p_{b,n}       K_{b,n}          I+_{b,n}
branch       branch       p_{b,b}       K_{b,b}          I+_{b,b}

Av. Branch Penalty, K = p_{n,n} K_{n,n} + p_{n,b} K_{n,b} + p_{b,n} K_{b,n} + p_{b,b} K_{b,b}

Average Throughput, G = 1 / (1 + K b)

Av. Wasted Instruction Traffic, I+ = p_{n,n} I+_{n,n} + p_{n,b} I+_{n,b} + p_{b,n} I+_{b,n} + p_{b,b} I+_{b,b}

Merit Ratio, MR = 1 / [(1 + K b)(1 + I+ b)]

Notes:

All four probabilities, p_{n,n}, p_{n,b}, p_{b,n}, and p_{b,b}, can be expressed in terms of the probability of a branch-to-be-taken prediction and the probability of correct prediction (refer to Appendix A). Variable b denotes branch frequency.

For the sake of simplicity, this cost is not included in the calculations, and it is not expected to alter the conclusions. The only operational cost studied is that of the additional instruction traffic. Refer to Table 3.2 for the terms associated with this cost of wasted instruction fetches. An ideal machine which can always correctly predict the branch outcome and, if needed, can start fetching the target path right after the branch instruction fetch would have K = I+ = 0 and, hence, G = 1, resulting in a unit merit ratio, MR, irrespective of the branch frequency, b. Interestingly, freeze conditions, which tend to increase the branch delay, reduce the average additional instruction traffic. When a certain path is predicted, freeze situations reduce the number of instructions that can be fetched, which reduces the number of wasted instruction fetches in case of incorrect prediction. This reduction has been taken into account in the calculation details provided in Appendix A (also in [DuF89]). The following simplifying assumptions have been made (Figure 3.2):

a) The instruction fetch stage is assumed to consist of s_f slots (each containing a prefetched instruction) followed by the decode stage.

b) Let s_b refer to the pipeline length up to the stage that resolves a pending branch instruction. Unconditional branches are assumed to be resolved as soon as they are decoded; therefore, s_b = s_bu = s_f + 1. For conditional branches, s_b = s_bc, and is dependent on the pipeline stage that sets the condition code.

c) Each instruction is assumed to make a common trip through the pipeline stages. For pipelines with functional-level stages, such as fetch and execute stages, this should be a reasonable assumption.

d) Additional instruction traffic during freeze handling, e.g., in software page fault handling, is ignored.

e) For the sake of simplicity, handling of multiple pending branches in the pipeline is restricted. If a branch is predicted as likely to be taken, it is assumed that additional branches are not encountered before resolving the first branch. This assumption can be a source of some significant inaccuracy only for very long pipelines with prediction schemes which allow this possibility.

f) Finally, any on-chip instruction cache has been ignored in the discussion, as it has no impact on the relative nature of the branch delay and additional instruction traffic performance curves.
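To make the Table 3.2 bookkeeping concrete, the short Python sketch below combines the four (predicted, actual) outcomes into K, I+, G, and MR. The probabilities and penalties used here are hypothetical placeholders chosen only to exercise the formulas; the strategy-specific values follow from the Appendix A derivations, which are not reproduced in this chapter.

b = 0.25   # branch frequency (Table 3.3)

# (predicted, actual) -> (probability, branch penalty in clocks, wasted instruction fetches)
# Placeholder numbers, not taken from the report.
cases = {
    ("n", "n"): (0.30, 0, 0),
    ("n", "b"): (0.10, 4, 4),
    ("b", "n"): (0.10, 2, 2),
    ("b", "b"): (0.50, 1, 0),
}

K      = sum(p * k for p, k, _ in cases.values())    # average branch penalty, K
I_plus = sum(p * t for p, _, t in cases.values())    # average wasted fetches per branch, I+
G      = 1.0 / (1.0 + K * b)                         # average throughput
MR     = 1.0 / ((1.0 + K * b) * (1.0 + I_plus * b))  # cost-performance merit ratio
print(K, I_plus, G, MR)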

3.5 Results

The model described above can be used to obtain the average branch delay (K), the average number of wasted instruction fetches per branch (I+), and the overall merit ratio (MR) once the variables defining the system environment are defined. Certain nominal values are assumed for some of these variables (Table 3.3); e.g., branch frequency b = 0.25, where 80 percent of the branches are conditional. The probability of a freeze during target address calculation is assumed to be 0.5 with a freeze duration of 2 cycles. The probability of a freeze during target fetch is ignored. For the delayed branch approach, an average of one useful common instruction (i.e., M = 1) is assumed. One such machine employing the delayed branch approach, MIPS [McH86, GrH86], reported use of a single delay slot about 70 percent of the time. There may be special cases, such as when using guarded instructions [HsD86], where a significant number of delayed branch slots may be utilized. Based on Smith [Smi81], a correct prediction probability of 0.85 is assumed for conditional branches. For the Branch Target Buffer, the probability of correct target address prediction is optimistically set at 0.9, assuming stable branch targets [Smi81]. The probability of a BTB hit for non-branch instructions in writable code segments is assumed very low at 0.05. Assume a nominal loop buffer hit ratio, p_lh = 0.6, and a nominal BTB hit ratio, p_th = 0.8. Peuto and Shustek [PeS77] report a hit ratio of 0.6 for a loop buffer of 256 entries, whereas Lee and Smith [LeS84] report a hit ratio of around 0.8 for a target buffer with 256 entries and a set size of 4 or 8. Set size refers to the degree of associativity in contrast to the fully associative BTB search. The initial focus is on the input parameter p_sb, the successful branch probability (conditional and unconditional combined). Results are obtained for the three performance parameters: average branch delay, K; average number of wasted instruction fetches, I+; and the cost-performance merit ratio, MR, as shown in Figures 3.6 through 3.11. Appendix A provides details of these calculations. While p_sb is varied, other parameters are kept at their nominal values.

3.5.1 Inferences

The following inferences can be made regarding the three performance parameters as a function of the successful branch probability (p_sb):

a) The BTB outperforms the others over the entire typical operating range (0.55 < p_sb < 0.7).

b) The predict-branch-always-taken scheme with target copy (PTTC) emerges as a good second choice around p_sb of 0.65 or more. Interestingly, even without any active branch prediction support, it exhibits better performance potential than BTB around p_sb > 0.75.

Table 3.3 Nominal values of model parameters.

Average branch frequency, b                                              0.25
Average fraction of conditional branches                                 0.8
Overall fraction of successful branches
  (conditional/unconditional combined), p_sb                             0.6
Number of pipeline stages until unconditional branch resolution, s_bu    2
Number of buffer sub-stages in the instruction fetch stage, s_f          1
Number of pipeline stages until conditional branch resolution, s_bc      5
Probability of freeze during target address formation                    0.5
Duration of target-address-calculation freeze                            2 cycles
Probability of freeze during target fetch, p_f                           0
Duration of target-fetch freeze                                          10 cycles
Probability of loop-buffer hit, p_lh                                     0.6
Probability of BTB hit, p_th                                             0.8
Probability of correct address prediction from BTB                       0.9
Probability of BTB hit for non-branch instruction                        0.05
Average number of delay slots filled in delayed branch approach          1

For cases with active prediction schemes (TNTD, BTB, TNTLB, TNBTB):
Correct prediction probability for unconditional branches                1.0
Correct prediction probability for conditional branches                  0.85


Figure 3.6 Average branch delay versus successful branch probability for PBNT, LB, PTTC, DB, TNTD, and BTB strategies.


Figure 3.7 Average branch delay versus successful branch probability for PBNT, PTA, FTOF, PBAT, and FBP strategies.


Figure 3.8 Average number of wasted instruction fetches per branch versus successful branch probability for PBNT, LB, PTTC, DB, TNTD, and BTB strategies.


Figure 3.9 Average number of wasted instruction fetches per branch versus successful branch probability for PBNT, PTA, FTOF, PBAT, and FBP strategies.


Figure 3.10 Merit ratio versus successful branch probability for PBNT, LB, PTTC, DB, TNTD, and BTB strategies.


Figure 3.11 Merit ratio versus successful branch probability for PBNT, PTA, FTOF, PBAT, and FBP strategies.

This advantage stems primarily from the fact that this scheme does not have to pay the delay penalty of incorrect target address prediction; BTB has a cost for incorrect target address prediction even with correct branch prediction. As a cautionary note, PTTC also exhibits the steepest slope in terms of all three performance parameters, as opposed to the relatively stable performance curves of the active prediction schemes like the Branch Taken/Not-taken Switch in the Decode Stage and BTB.

c) In terms of excess instruction fetches, the loop buffer scheme performs almost as well as the BTB. Loop buffers can significantly reduce the cost of excess instruction traffic resulting from incorrect predictions.

d) At nominal p_sb (0.6), both Predict Branch Never Taken and Predict Branch Always Taken have the same branch delay. Which of the two should be the preferred scheme? A look at the additional instruction traffic cost can help resolve the issue. Predict Branch Always Taken has a lower cost of wasted instruction fetches and hence a better merit ratio (MR). In the absence of any address calculation freeze (or target fetch freeze), Predict Branch Never Taken on average wastes more instructions during misprediction than Predict Branch Always Taken. A similar dilemma between the Predict Always Taken with Target Copy strategy and Delayed Branch can be resolved in favor of Delayed Branch, due to its lower added instruction traffic cost. For both schemes the implementation costs are almost identical; hence, for the two strategies in question, the excess instruction traffic is the important decisive factor. Interestingly, at p_sb = 0.5, three different strategies, predict branch never taken, target fetch in the OF-slot, and the scheme to fetch both paths, show almost identical merit ratios. Here implementation cost can probably be the only decisive factor.

e) Not only does the excess instruction traffic cost help choose between two almost equally performing strategies, it can also caution us about otherwise very well performing strategies. FBP (fetch both paths) provides an interesting example in this regard. In terms of average branch delay (K) it performs almost as well as the BTB strategy. But, after considering the cost of wasted instruction fetches, in terms of the overall merit ratio (MR), FBP is not much better than the worst performing Predict Branch Never Taken strategy. Thus, the average branch delay alone does not determine overall performance; a conclusion based solely on the average branch delay, K, may be misleading as far as overall system performance is considered. Garcia and Huynh [GaH80] discuss the efforts made to reduce the resulting high contention on the system bus in an early IBM 370 implementation using FBP.

The variation in system performance as a function of the number of buffer stages in the instruction fetch stage has also been computed. Again BTB outperforms every other strategy, followed by Predict Always Taken with Target Copy, in terms of average branch delay (K) for any amount of buffering in the fetch stage. All the strategies are seen to have almost identical performance slopes on the merit ratio curve and show identical sensitivity with respect to s_f. Figures A.1 to A.3 in Appendix A contain these plots. Finally, performance curves were generated as a function of s_bc, i.e., the total number of pipeline stages required for conditional branch resolution. BTB continued to be the first choice for any number of segments in terms of overall merit ratio. But for long pipelines (s_bc > 6) it slipped; instead, fetching both paths (FBP) finally won with its constant branch delay with respect to s_bc. Note that just a branch taken/not-taken switch in the decode stage (the TNTD scheme) significantly reduces the branch delay. The additional reduction in branch delay obtainable through BTB rapidly decreases with larger s_bc; Figures A.4 to A.6 in Appendix A contain these plots. Therefore, in the typical operating range (0.6 < p_sb < 0.75) there are three competing strategies: Loop Buffer (LB), Predict Branch Always Taken with Target Copy (PTTC), and Branch Target Buffer (BTB). The branch delay numbers for Predict Branch Never Taken, Delayed Branch, and BTB under nominal conditions come quite close (within 30 percent) to those reported by McFarling and Hennessy [McH86], even though the nominal conditions, while close, are not exactly the same as theirs. Assuming branch frequency b = 0.2, the results indicate a throughput (G) improvement of around 10 percent in the above-mentioned operating range of p_sb. This is also in close agreement with MIPS simulation results [Gro83] of around 9 percent and the analysis of DeRosa and Levy [DeL87], suggesting an improvement of around 8 percent. In the following section some hybrid strategies are discussed that are based primarily on these three strategies. Delayed Branch and Taken/Not-taken Switch in the Decode Stage also show good performance potential in possible combinations with the above strategies.

3.6 Hybrid Strategies

The following hybrid strategies are considered:

a) Predict Branch Always Taken with target copy and delayed branch (TTCDB): This is the only hybrid strategy considered with almost no additional implementation cost and only some software (compiler) cost.

b) Predict Branch Always Taken with target copy, delayed branch, and loop buffer (TTDLB).

c) Taken/Not-taken Switch in the Decode Stage with loop buffer (TNTLB).

d) Taken/Not-taken Switch in the Decode Stage with branch target buffer (TNBTB). This strategy is a combination of TNTD and BTB: for a miss in the BTB, instead of falling back on the default Predict Branch Never Taken case, it assumes a branch taken/not-taken switch in the decode stage similar to TNTD.

3.6.1 Inferences

Sensitivity plots of the performance parameters K, I+, and MR are obtained with respect to p_sb and s_bc (Figures 3.12 - 3.15).

a) Around the nominal values of the system parameters, the minimum-implementation-cost hybrid strategy, TTCDB, performs better than every non-hybrid strategy except BTB. For p_sb around 0.7, it even outperforms BTB in terms of average branch delay (K) as well as merit ratio (MR).

b) Around the nominal conditions, the last three hybrid strategies, TTDLB, TNTLB, and TNBTB, are almost equally competitive. For shorter pipelines (s_bc < 5), TTDLB has a slight edge over the other two. For longer pipelines, active branch prediction becomes more important and TNTLB and TNBTB perform better than the rest and continue to follow each other closely. Therefore, on a system with a branch-taken/not-taken prediction switch in the instruction decode stage, if one were to choose between the addition of either the loop buffer or the branch target buffer, careful consideration should be given to implementation cost issues, which may tilt the balance slightly in favor of the loop buffer based TNTLB scheme.

c) For p_sb of 0.7 or more, at nominal s_bc, the TTDLB strategy outperforms the others and emerges as the first choice in terms of all three performance parameters. Around p_sb = 0.7, TTDLB reduces the branch delay to less than one third of that of the Predict Branch Never Taken strategy.

Effect of loop (target) buffer hit probability. Target buffer based strategies show more sensitivity to the hit ratio, p_th, than the loop buffer based strategies in terms of average branch delay (Figure 3.16). Loop buffer based strategies are more sensitive than the target buffer based strategies in terms of the excess instruction traffic cost with respect to the corresponding hit ratio, p_lh (see Figure A.12). As a result, both classes of strategies exhibit almost identical slope on the merit ratio performance curve (see Figure A.13).

Effect of target-fetch freeze probability. The discussion so far has ignored any potential for a freeze (due to, say, a cache miss or page fault) while attempting to fetch the branch target. Assuming a fetch freeze duration of 10 clocks, the performance sensitivity with respect to the fetch-freeze probability (p_f) is analyzed next.


Figure 3.12 Average branch delay versus successful branch probability for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.


Figure 3.13 Average number of wasted instruction fetches per branch versus successful branch probability for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.


Figure 3.14 Merit ratio versus successful branch probability for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.


Figure 3.15 Average branch delay versus number of stages for conditional branch resolution for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.


Figure 3.16 Average branch delay versus Loop/Target buffer hit probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies.


Figure 3.17 Average branch delay versus target fetch freeze probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies.

Loop buffer based strategies are at an advantage in such a case because a hit in the loop buffer also eliminates any page fault potential associated with an external memory access. As a result, loop buffer based strategies show more performance stability with respect to p_f than the BTB-based strategies. For example, if p_f increases from 0 to 0.1, the average branch delay for the loop buffer based TNTLB strategy increases by 20 percent, whereas that of the BTB-based TNBTB strategy increases by more than 75 percent (Figure 3.17).

3.7 Summary

A common analytical platform, based on certain system and program parameters, can be developed for classifying and comparing different branch strategies. Such an approach has the advantage of being far less time consuming and more flexible compared to simulation-based approaches. Excess instruction traffic caused by different branch strategies has been overlooked in the past. Additional instruction traffic helped to distinguish an overall performance difference between some apparently equally well performing strategies. A branch strategy using a branch-taken/not-taken switch in the decode stage is found to be almost as effective in combination with the loop buffer as with the branch target buffer. In a typical environment with less than five segments and a successful branch probability around 0.6, a branch strategy based on a default prediction of branch always taken, along with compiler support for target copy and delayed branch, is shown to provide performance potential comparable to a branch strategy based on the Branch Target Buffer. Finally, certain components of branch delay have been ignored in this chapter. For example, some machines [SJH89] have an added delay during branches if the target is misaligned. Also, some compilers, such as the trace scheduling compiler [Fis81], have an additional overhead of patch-up code if their compile-time prediction of a branch is found incorrect at run time. These delay components are analyzed in detail in Chapter 5.

CHAPTER 4 SUPERPIPELINED VERSUS SUPERSCALAR

4.1 Introduction

Recent advances in technology have made it feasible to put multiple execution pipelines on the same chip. Previous chapters have explored some of the issues associated with the optimal design of single-pipeline systems. This chapter extends the analysis into the realm of superscalar processors. An analytical model is proposed as an alternative tool for analyzing the tradeoff between superpipelined and superscalar processors. The factors that contribute to performance limits are analyzed. The duality of superpipelines and superscalars is examined in detail and certain imperfections of this duality are described. Jouppi and Wall [JoW89] studied tradeoffs of superpipelined and superscalar machines via simulations. Smith et al. [SJH89] investigated the performance limits of such machines as a result of instruction fetch inefficiencies.

4.2 Superpipeline/Superscalar Tradeoff Model

Consider performing an operation using a circuit having g gate-levels of propagation delay. The quantity g is the operation gate delay. Suppose there are n such operations and a set of k pipelines, each s stages deep, to support execution (see Fig. 1.7). Each pipeline latency is assumed to be s clocks. If inter-stage buffers are assumed to have one gate-level of delay, then g_pipelined = g + s - 1 + ε, where ε is the smallest integer such that s evenly divides g_pipelined. The quantity s - 1 + ε is the overhead due to pipelining. If the circuit is not pipelined, i.e., s = 1, then ε = 0 and the pipelining overhead is null. At most, ε = s - 1, so the worst case pipeline overhead is 2(s - 1) gate delays. For n a multiple of k·s, the first set of k results from the k pipelines is produced after s clocks, and the remaining n - k results take (n - k)/k clocks. Let T_n be the time to execute n operations. Then,

T_n = ( s + (n - k)/k )·( g_pipelined / s )  gate delays

    = ( s + (n - k)/k )·( (g + 2(s - 1)) / s )  gate delays (worst case). {4.1}
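To make the use of Equation (4.1) concrete, the following Python sketch (not part of the original report; the function and parameter names are illustrative) evaluates the worst-case pipelined execution time:

    def pipelined_time(n, k, s, g):
        """Worst-case time, in gate delays, to execute n operations on k
        pipelines of s stages each, where one operation takes g gate levels
        (Equation 4.1 with the worst-case overhead of 2(s-1) gates)."""
        gate_delays_per_clock = (g + 2 * (s - 1)) / s
        clocks = s + (n - k) / k      # pipeline fill plus drain of the remaining results
        return clocks * gate_delays_per_clock

    # Example: 1000 operations, 2 pipelines, 5 stages, a 20-gate operation
    print(pipelined_time(1000, 2, 5, 20))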

The ideal throughput represented by Equation (4.1) is difficult to achieve in practice due to additional delays that can be grouped into the following categories:
a) Scheduling Delays:
   i) Instruction Fetch Delay: delay due to restricted main memory bandwidth on an instruction cache miss,
   ii) Branch Delay: instruction fetch delay due to uncertainties in the execution path,
   iii) Dependency Check Delay: delay due to run time dependency check in an instruction window, and
   iv) Dependency Delay: delay scheduled to satisfy dependency constraints.
b) Execution Delays:
   i) Operand Fetch Delay: delay in fetching the operand(s) from memory, and
   ii) Multiple Cycle Operations: delay due to operations that take more than one clock in the execution stage.

Although the delays listed above are fairly independent of each other, there is some overlap. For example, consider an instruction I that takes multiple cycles to execute. Also assume that the following instruction J is data-dependent on I. In such a case, the execution delay of instruction I can also be viewed as the scheduling delay for instruction J. Alternatively stated, the dependency of an instruction J on an instruction I lingers for multiple clocks if the execution of I takes more than one clock. For single pipelines, lingering dependency can be modelled the same way as the effect of multiple cycle operations in Section 2.4. However, in this chapter the delays due to multiple cycle operations and restricted memory bandwidth for instruction fetch are ignored on the premise that the hardware is designed to deliver a throughput of k operations per cycle on a sustained basis. Delays due to run time dependency checking are ignored on the assumption that either compilation was done conservatively to eliminate the possibility of run time dependencies or that sufficient hardware is provided to do dependency checking without incurring any delay. Let the branch probability be b, with each branch taking g·d_b gate delays for resolution. For example, d_b = 0.5 implies that the branch delay is half the operation gate delay, or roughly half the pipeline length because of pipeline overhead.

Operand fetch delay is modelled assuming that the on-chip cache has no access delay and that the internal bandwidth (cache to pipelines) is k operands at a time. Further assume that the external bandwidth (main memory to cache) is limited to one operand at a time; hence, miss processing is sequential. The operand-miss probability is ε = w·(1 - h), where w is the probability of operand reference for an operation and h is the cache hit probability. Assume cache miss processing takes g·d_m gate delays. These further assumptions are made:
1) Instructions are issued simultaneously to the k pipelines, and there is no inter-stage buffering of intermediate results. This means that if there is a pending branch in any one of the pipelines that delays its following instruction fetch, then all k pipelines freeze. Similarly, any cache miss on an operand fetch for one pipeline delays all pipelines. Thus, pipelines are synchronized. To do otherwise is a complicated hardware task of dubious cost effectiveness.
2) In the absence of any branch delay, assume that k operations are always issued to the k pipelines. Thus, issuing constraints imposed due to data dependency are ignored. This is the weakest of all the assumptions and it is further addressed in following chapters, where utilization constraints due to such dependencies are discussed in detail.
3) Delays as a result of instruction cache misses are ignored. Note that instruction cache misses on branches can be assumed included in the branch delay.

Assuming only one cache miss at a time, the additional delay term to be added to Equation (4.1) is

n·[ b·g·d_b + ε·g·d_m - b·ε·min(d_b, d_m)·g ] ,

where the last term accounts for the overlap of the instruction fetch and operand fetch delays due to branches and cache misses, respectively.

The instructions undergoing simultaneous execution must be independent in order to have been scheduled together. Simultaneous cache requests are then independent random variables, and are assumed to be identically distributed. Thus, multiple cache misses follow a binomial distribution. Allowing one operand fetch per pipeline per clock, there can be up to k simultaneous cache misses. Allowing for multiple simultaneous cache misses, the additional delay term becomes

n·[ b·g·d_b + ε·g·d_m - (1/k)·Σ_{i=1}^{k} g·b·ε_i·min(d_b, i·d_m) ] , {4.2}

where ε_i is the probability that i cache misses occur simultaneously,

ε_i = [ k! / (i!·(k - i)!) ]·ε^i·(1 - ε)^{k - i} ,

and the mean value of the distribution is kε. Assuming further that d_b < d_m, Equation (4.2) becomes

n·g·(b·d_b + ε·d_m) - n·g·b·d_b·(1 - ε_0)/k . {4.3}

Combining Equations (4.1) and (4.3), the total time (in terms of gate delays) for n operations is

T_n = ( s + (n - k)/k )·( (g + 2(s - 1)) / s ) + n·g·(b·d_b + ε·d_m) - n·g·b·d_b·(1 - ε_0)/k . {4.4}

Equation (4.4) is useful in deciding whether or not an additional pipeline will yield a significant throughput improvement justifying its additional cost. Figure 4.1 plots throughput as a function of the number of pipelines (k). Resource utilization, u, is

u = 1 - (wasted time slots in units of gate delay) / (total available time slots in units of gate delay) ,

where the numerator is the delay from Equation (4.2) and the denominator is the sum of the delays from Equations (4.1) and (4.2). Figure 4.2 shows utilization versus number of pipelines. The duality of superscalar and superpipeline systems, i.e., that any throughput achieved using a pipeline of a certain depth can also be achieved using a corresponding number of pipelines of depth one, is evident. The throughput benefit for a given increase in pipeline number or depth decreases for greater initial pipeline number or depth. Because pipeline replication is more area-intensive than additional segmentation, and segmentation becomes more and more difficult to obtain, the guiding rule of design should be: segment pipelines to the extent feasible, then replicate pipelines.
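A small Python sketch of Equation (4.4) follows; it is not from the report, and the parameter values in the example call merely mirror the nominal assumptions listed for Figure 4.1 (w = 0.5, h = 0.95, d_m = 0.5, b = 0.2, d_b = 0.15), with g and s chosen arbitrarily:

    def total_time(n, k, s, g, b, d_b, w, h, d_m):
        """Total execution time in gate delays (Equation 4.4): ideal pipelined
        time plus branch and operand-miss delays, minus the branch/miss
        overlap saving apportioned over the k pipelines."""
        eps = w * (1 - h)              # operand-miss probability per operation
        eps0 = (1 - eps) ** k          # probability that none of k fetches misses
        ideal = (s + (n - k) / k) * (g + 2 * (s - 1)) / s
        extra = n * g * (b * d_b + eps * d_m) - n * g * b * d_b * (1 - eps0) / k
        return ideal + extra

    def throughput(n, k, s, g, b, d_b, w, h, d_m):
        """Normalized throughput G = n*g/T_n, in instructions per unpipelined
        operation gate delay (the quantity plotted in Figure 4.1)."""
        return n * g / total_time(n, k, s, g, b, d_b, w, h, d_m)

    for k in range(1, 9):
        print(k, round(throughput(n=10000, k=k, s=4, g=50,
                                  b=0.2, d_b=0.15, w=0.5, h=0.95, d_m=0.5), 2))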

4.3 Performance Limits

Looking at the throughput curves in Figure 4.1, it would be reasonable to ask: What are the performance limits as another segment or another pipeline is added? First consider the throughput limit when using additional pipelines. Rearrange Equation (4.4) by grouping terms that are functions of k and those that remain. The rearranged form can be written


Figure 4.1 Normalized throughput versus number of pipelines, with the following nominal assumptions: data cache reference probability = 0.5, data cache miss probability = 0.05, data cache miss duration = 0.5 * operation delay, branch probability = 0.2, and branch delay = 0.15 * operation delay.


Figure 4.2 Utilization versus number of pipelines (parameter values the same as in Figure 4.1).

G = n·g / T_n = ( A/k + B )^(-1) , {4.5}

where

A = 1/s + 2/g - 2/(g·s) - b·d_b·(1 - ε_0)

and

B = b·d_b + ε·d_m + (s - 1)·(g + 2(s - 1)) / (s·g·n) .

For a continuous instruction stream, n can be assumed to be large. Therefore,

B ≈ b·d_b + ε·d_m .

As k → ∞,

G → B^(-1) ≈ ( b·d_b + ε·d_m )^(-1) . {4.6}

This limit is independent of s and is simply a function of the branch penalty, which limits instruction fetch, and the cache miss penalty, which limits the execution time. For the set of parameter values used in Figure 4.1, the above limit evaluates to G = 23.53. Considering the fact that normally b ≫ ε, the above limit has been referred to as the fetch bottleneck, also sometimes known as Flynn's bottleneck [Fly72]. Now consider the performance limit when deepening the pipelines. Equation (4.4) can again be rearranged to yield

G = n·g / T_n = ( C/s + D·s + E )^(-1) , {4.7}

where

C = 1/k - 2/(g·k) - 1/n + 2/(g·n) ,

D = 2/(g·n) ,

and

E = 1/n - 4/(g·n) + 2/(g·k) + b·d_b + ε·d_m - b·d_b·(1 - ε_0)/k .

Unlike the previous case, as s → ∞, G → 0. This is because the overhead of additional buffers grows with additional segments. Therefore, beyond a point, this overhead overtakes the gain due to segmentation. The issue of optimal pipelining was studied in detail in Chapter 2 and in [DuF90]. Ignoring synchronization overhead (which is a reasonable assumption for superscalar-type processors, unlike multiprocessors), there is no reason why throughput should decrease due to the addition of a pipeline. Thus, the duality of superpipelines and superscalars is not perfect. As s grows, g must remain at least of the same order as s, so for large n,

C ≈ 1/k ,  D ≈ 0 ,  and  E ≈ b·d_b + ε·d_m - b·d_b·(1 - ε_0)/k .

Therefore, as s increases,

G → ( b·d_b + ε·d_m - b·d_b·(1 - ε_0)/k )^(-1) . {4.8}

This limit is the same as that given by Equation (4.6) except for the last term, which vanishes for large k. Recall that this last term represents the saving due to hiding some branch delay when overlapped with data-cache miss processing for one of the pipelines. This saving is apportioned over the k pipelines and hence becomes negligible for large k. For the set of parameter values used in Figure 4.1, the limit on throughput given by Equation (4.8) evaluates to 23.92.
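The two limits are easy to check numerically. The short sketch below (illustrative only; the choice k = 4 for Equation (4.8) is an assumption, since the report does not state which k was used) reproduces the 23.53 figure and a value close to 23.92:

    b, d_b = 0.2, 0.15            # branch probability and normalized branch delay
    w, h, d_m = 0.5, 0.95, 0.5    # operand reference prob., cache hit prob., miss duration
    eps = w * (1 - h)

    # Fetch-bottleneck limit as k grows (Equation 4.6)
    print(1 / (b * d_b + eps * d_m))                              # 23.53

    # Limit as s grows, for a fixed number of pipelines (Equation 4.8)
    k = 4
    eps0 = (1 - eps) ** k
    print(1 / (b * d_b + eps * d_m - b * d_b * (1 - eps0) / k))   # roughly 23.9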

4.4 Modelling Resource Utilization

The scheduling and execution delays listed in Section 4.2, although different in their original causes, have a common impact. They introduce unwanted bubbles (pipeline stalls) in the system, which finally ripple through different stages to cause a loss of net system throughput. As one stage is delayed in delivering its intermediate result to its successor, the successor stage waits idly, and hence the system utilization drops. It is relatively easy to predict system performance in an ideal setting, assuming no such loss. In other words, modelling this drop in system utilization in a non-ideal, real-time environment is the key to an accurate performance prediction. Consider the generic drop in utilization as system resources of a certain kind are added, such as increasing the number of pipeline stages, increasing the number of pipelines, or adding more processors. If these added resources are fully utilized and if any overhead is ignored, system throughput should increase in an easily predicted manner. For example, if two pipelines are always busy, the throughput should be twice that of the single-pipeline system. But the added resources are often accompanied by a reduction in overall utilization. There may be different approaches to utilization modelling:
a) A purely empirical approach would be to experimentally collect the utilization data as a function of the number of resources for the chosen set of benchmarks being

used for performance measurement. This data from a certain machine can be used in the future as a guide in performance prediction for a similar machine. This approach has been used in Chapter 6 for generating some utilization curves.
b) One problem with the previous approach is its inability to predict the utilization beyond the range of experimentation. A formal approach to alleviate this drawback would be to characterize the nature of the empirically collected utilization curves with the aim of extracting some key components that might aid in predicting beyond the range of experimentation. For example, the rate of decrease in utilization might exhibit a simple relationship with the number of resources. In a strict mathematical sense, the collected utilization curve can be approximated by a polynomial (using standard approximation procedures). This was the approach adopted in Chapter 2, which assumes a generic polynomial utilization model that is empirically derived.
c) One major drawback with both previous approaches is that they do not offer any useful insight to the system designer. Often a system designer is faced with the question of whether it would be more profitable (in terms of improved throughput) to reduce the dependency delays in the instruction stream and thereby increase the system utilization, or to simply replicate system resources with reduced utilization. Neither of the previous approaches can resolve such tradeoffs. An alternative approach would be to model such specific utilization-related tradeoffs based on some characteristic, empirically collected distributions. This approach is illustrated in the following chapter.

4.5 Summary

The analytical model developed in this chapter allows easy, comparative evaluation of superpipelines and superscalars. It is extended in Chapter 6 to include multiprocessors. The parameters contributing to throughput are given in units of gate delays, facilitating model use by IC designers. When validated by measurements on actual systems, the model allows evaluation of the possible benefits to be obtained by modest modifications of the basic parameters of circuit organization. With respect to the superpipeline/superscalar tradeoff, the model supports the following design rule of thumb: segment a pipeline to the extent possible to improve throughput, then replicate the pipeline for further throughput increases. The performance limit for these systems has been derived and it supports the fetch bottleneck observation of previous researchers. The next chapter provides an analytical model for the dependency delays that were ignored in this chapter.

CHAPTER 5 INSTRUCTION-WINDOW SIZE TRADEOFFS AND CHARACTERIZATION OF PROGRAM PARALLELISM

5.1 Introduction

Identifying independent operations that can be scheduled for execution in parallel has always been a key to execution speed enhancement. At the instruction level, detection of concurrent operations begins by examining a consecutive set of instructions from a serial execution sequence, or instruction stream. The instruction stream can be analyzed either at compile time or at execution time. The number of instructions simultaneously examined for detecting a concurrent subset is the scope of concurrency detection. On computers that do run time concurrency detection, the instruction window comprises the set of instructions examined for possible scheduling for simultaneous execution. The larger the scope, the greater the probability of detecting a subset of instructions of a given size that can be scheduled for concurrent execution. The conditional branch instructions of a program partition it into a collection of basic blocks, or instruction stream segments each ending with a conditional branch. A conditional branch directs the execution sequence along one of two or more possible paths and the direction taken is known only at run time. Yet, a consecutive set of instructions of size equal to the desired scope must be available and concurrency detection must precede execution. This dilemma can be overcome by using branch prediction to identify the most likely execution sequence. Concurrency detection can then proceed using the instruction stream as assumed by the branch prediction method. Conditional branch predictions will err occasionally. Therefore, any concurrency detection scheme that groups operations across conditional branches (i.e., beyond basic blocks) must also have a mechanism to undo the effect of executed operations, if any, that do not lie on the actual execution path.
A variety of studies have been done to assess the performance potential of concurrency detection techniques. While some studies, based on idealistic hardware resource assumptions, report a speedup potential in the range of 50 to 100 [RiF72, NiF84], others report a speedup potential of only 1.5 to 10 for specific architectures and specific sets of applications. In the latter category, studies considering only within-basic-block concurrency detection, such as those by Tjaden and Flynn [TjF70], Weiss and Smith [WeS84], Acosta et al. [AKT86], and Sohi and Vajapeyam [SoV87], report speedup of about 1.5 to 2.5. Wedig [Wed82] and Smith et al. [SJH89] assume beyond-basic-block concurrency detection and find potential speedup of about 2 to 4. Acosta et al. [AKT86] and Smith et al. [SJH89] have also reported the performance impact of instruction window size on dynamic concurrency detection through simulation-based techniques. Finally, the studies [KMC72, Lam88, HsD86, CGL89] rely on compile time support to enhance speedup potential and have reported speedups in the range of 4 to 8.
The following section describes the analytic model used to study the performance tradeoffs associated with instruction window size, and introduces a measure of the available amount of parallelism in an instruction stream. In Section 5.3, different costs associated with conditional branches are introduced and a measure of the cost of extracting the available parallelism is defined. Experimental results are presented in Section 5.4. Finally, Section 5.5 describes issues involved in improving performance prediction accuracy.

5.2 The Analytic Performance Model

Consider an instruction window of size W + 1 consisting of a stream of instructions labeled I_0, I_1, I_2, ..., where I_0 is the first instruction in the window. A necessary condition for two instructions I_i and I_k to be schedulable for simultaneous execution is that they have no dependencies. An instruction is dependent on another if it uses the result of the other, or if it overwrites a value to be read by the other, or if it overwrites the result of the other. An instruction dependent on another cannot be executed prior to or simultaneously with the instruction it depends upon without changing the meaning of the program. The sufficient condition for scheduling I_i and I_k together is that there must also be no instruction I_j in the instruction stream between I_i and I_k on which I_k depends. If such an I_j exists, then I_k must execute after that I_j, and by implication, after I_i. Let I_{i,j} represent the event that instructions I_i and I_j are mutually independent, and let I_{i,...,k} represent the event that instructions I_i through I_k in the instruction stream are pairwise independent. P(I_{i,...,k}) denotes the probability of the event I_{i,...,k}.

Because instruction I_0 is dispatched unconditionally, consider the remaining instructions in terms of whether they are scheduled together with I_0 or not. Let I_i:y represent the event that instruction I_i is scheduled with I_0, and let I_i:n represent the event that instruction I_i is not scheduled with I_0. Because I_j can be scheduled with I_0 only if I_j is independent of all the preceding instructions between I_0 and I_j,

P(I_j:y) = P(I_{j-1,j}, I_{j-2,j}, I_{j-3,j}, ..., I_{0,j})

         = P(I_{j-1,j})·P(I_{j-2,j} | I_{j-1,j})·P(I_{j-3,j} | I_{j-1,j}, I_{j-2,j}) ⋯ P(I_{0,j} | I_{j-1,j}, ..., I_{1,j}) . {5.1}

Let p_{i,k} = P(I_{i,k} | I_{j,k} for all j such that i < j < k).

Assume the instruction stream is a stationary random process, that is, the probability of instruction independence is independent of the instruction window position with respect to the instruction stream; then p_{i,k} is a function only of the distance between I_i and I_k. Hence, p_{i,k} may be written p_δ, where δ = k - i. Then,

P(I_j:y) = Π_{δ=1}^{j} p_δ . {5.2}

If p_δ is constant, then

P(I_j:y) = Π_{δ=1}^{j} p_δ = p·p ⋯ p = p^j . {5.3}

Note that a constant p_δ does not mean that any two instructions, say I_1 and I_10, are equally likely to be independent of a third instruction, say I_0. Rather, it means that I_1 and I_10 are equally likely to be independent of I_0, provided that I_10 is not already dependent on an intermediate instruction I_m, for 0 < m < 10. (There is no instruction between I_0 and I_1.) P(I_j:y) reflects the influence of both compiler design and hardware resources on inherent program character. However, p_δ is more purely representative of a given, fixed program sequence. Here, the program level is assembly language. The probability that exactly k - 1 instructions in the window are dispatchable along with I_0 is

P_{k-1}(I_1, ..., I_W) = P_{k-2}(I_1, I_2, ..., I_{W-1})·P(I_W:y) + P_{k-1}(I_1, I_2, ..., I_{W-1})·P(I_W:n) . {5.4}

At run time, the probability of being able to dispatch at least k instructions is of more interest than the probability of having exactly k dispatchable instructions. The probability of having at least k - 1 dispatchable instructions in addition to I_0 is

P_{≥k-1}(I_1, I_2, ..., I_W) = Σ_{j=k-1}^{W} P_j(I_1, I_2, ..., I_W) .

The above equation can also be written in the following form, which may be more computationally efficient:


Figure 5.1 Illustration of dependencies determining the conditional independence probability p_{i,k}. Each single arc indicates a pair of instructions that are given to be independent. The double arc denotes the dependence in question for p_{i,k}.

P_{≥0}(I_1, I_2, ..., I_W) = 1 ;

P_{≥k-1}(I_1, I_2, ..., I_W) = 1 - Σ_{j=0}^{k-2} P_j(I_1, I_2, ..., I_W) , for k > 1 .

Figures 5.2 through 5.4 depict P_{≥k-1} as a function of p_δ and W. The following observations can be made:
1) Figure 5.2 shows that a given variation in p_δ, for higher (lower) values of p_δ, becomes increasingly more (less) crucial as k grows. For example, a compile time effort to increase p_δ (say, by register renaming) from 0.55 to 0.65, while quite noticeable when there are three dispatchable instructions (k = 3), is almost unnoticeable for k = 5. An increase in p_δ from 0.75 to 0.85, although unnoticed for k = 2, is significant when k = 4.
2) A larger window size can only be justified with an accompanying compile time effort to reduce dependence by increasing the conditional independence probability, as is evident from Figures 5.3 and 5.4.

Plots such as those in Figures 5.2 through 5.4 can be useful in isolating execution performance bottlenecks. Based on the operating point, the plots reveal whether the bottleneck is insufficient inherent parallelism in the stream (low p_δ, suggesting more compiler effort for reducing instruction dependencies), or not examining enough instructions (suggesting increased window size, which might further indicate the need to do beyond-basic-block scheduling), or insufficient resources for utilizing available parallelism (suggesting good payoff for additional hardware). The operating point for a new processor design can be determined at an early stage, so Figures 5.2 through 5.4 can be useful in guiding the design effort. Sometimes loop unrolling is used at compile time to increase the scope of concurrency detection. There is no performance gain in unrolling beyond the point where the scope of concurrency detection is no longer a performance bottleneck. Therefore, the information from figures such as Figures 5.2 through 5.4 can also be used to limit the amount of unrolling. Figures 5.2 through 5.4 are based on the assumption that p_δ is constant. This may be an inaccurate assumption for many programs. Near successors of an instruction are more likely to be dependent on it than instructions further removed. Thus, p_δ is expected to rise with δ for small values of δ. But, beyond the immediate vicinity, i.e., for large values of δ, p_δ may be fairly constant. Consider a 4-pipeline superscalar processor system. If, due to dependency constraints, only two operations can be issued 30 percent of the time, then the system behaves effectively as a 2-pipeline system 30 percent of the time. Thus, the effective throughput, G_k, under dependency constraints for a k-pipeline processor is


Figure 5.2 Probability of scheduling k instructions for various values of p_δ and a fixed instruction window size of 16.


Figure 5.3 Probability of scheduling k instructions for various instruction window sizes and p_δ = 0.7.


Figure 5.4 Probability of scheduling k instructions for various instruction window sizes and p_δ = 0.8.

G_k = Σ_{i=0}^{k-2} g_{i+1}·P_i(I_1, I_2, ..., I_W) + g_k·P_{≥k-1}(I_1, I_2, ..., I_W) , {5.5}

where g_j, for 1 ≤ j ≤ k, is the throughput for a j-pipeline processor calculated ignoring any dependency constraint (always having j schedulable instructions), but including such delays as cache misses and conditional branch resolutions. This is the same as the throughput computed using Equation (4.4) of Chapter 4.
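A Python sketch of this window model follows. It is illustrative rather than part of the original study: it assumes a constant p_δ (Equation 5.3), evaluates the recurrence of Equation (5.4), and combines the result with caller-supplied dependency-free throughputs g_j as in Equation (5.5); the g_ideal values in the example are hypothetical.

    def dispatch_distribution(p_delta, W):
        """P[j]: probability that exactly j of the W instructions following I0
        are dispatchable along with I0 (recurrence of Equation 5.4), assuming a
        constant conditional independence probability so P(I_j:y) = p_delta**j."""
        P = [1.0] + [0.0] * W                 # window of zero instructions: exactly 0 dispatchable
        for w in range(1, W + 1):
            p_y = p_delta ** w                # probability that I_w is scheduled with I0
            new = [0.0] * (W + 1)
            for j in range(w + 1):
                new[j] = P[j] * (1.0 - p_y) + (P[j - 1] * p_y if j > 0 else 0.0)
            P = new
        return P

    def effective_throughput(p_delta, W, g_ideal):
        """Effective throughput G_k under dependency constraints (Equation 5.5).
        g_ideal[j-1] is the dependency-free throughput of a j-pipeline machine,
        e.g. computed from Equation (4.4)."""
        k = len(g_ideal)
        P = dispatch_distribution(p_delta, W)
        at_least = sum(P[k - 1:])             # P_{>=k-1}
        return sum(g_ideal[i] * P[i] for i in range(k - 1)) + g_ideal[-1] * at_least

    # Hypothetical example: 4 pipelines, window of 16, p_delta = 0.8
    print(effective_throughput(0.8, 16, g_ideal=[1.0, 1.9, 2.7, 3.4]))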

5.3 Cost of Branches

Let b be the probability that an instruction is a branch instruction. On every clock cycle an instruction packet consisting of k instructions is fetched. Let the cache line size be a multiple of k. If instruction words have fixed length, then in the absence of any branches, all instruction references will be aligned and a fetch bandwidth of k instructions per cycle will be sustained. The cost of branches may be categorized as follows:
a) Misprediction delay. Every time a branch prediction is incorrect, a certain delay is incurred. Let D_l be the average delay associated with each misprediction. In the case of out-of-sequence, beyond-basic-block execution, a branch misprediction means that more than just the execution pipeline may contain incorrect execution; instructions from much earlier may need to be undone. So, misprediction delay may be significant.
b) Wasted fetch delay. Every time a branch is detected, the remaining instructions that are part of the packet of k instructions may be wasted. (Delayed branching may reduce this waste.) The wasted execution bandwidth corresponding to these instructions is added as a delay to the branch instruction. Since the total execution bandwidth of a packet of k instructions is 1 clock, the wasted bandwidth of the last j instructions in a packet is j/k. This delay has been studied with respect to a variety of branch strategies for the single pipeline case in Chapter 3 (also in [DuF89, DuF91]).
c) Misalignment delay. For normal execution, assume that a cache line can be fetched from the instruction cache every cycle. An instruction reference is considered misaligned, and hence requiring an additional fetch, if the group of k instructions spans a line boundary. Every time a branch is predicted to a target such that the corresponding group of k instructions spans a line boundary, a delay of an extra clock results because only one cache line can be read at a time.
d) Cache miss delay. The cache miss probability for the branch target may be somewhat more than the typical cache miss probability on instruction fetches.

5.3.1 Calculating Misprediction Delay Resulting from Speculative Execution

Scheduling techniques using a concurrency detection scope extending beyond a basic block incur additional delay for an incorrect prediction because they need to undo the damage, if any, caused by execution of instructions outside the current basic block. The cost of undoing a wrongfully executed instruction depends on the specific implementation support for damage undoing, and can be considered independent of the specific instruction type. For example, instructions that allow updates to user memory (interface space) before branch resolution would need to restore the incorrectly updated locations. Let the time cost of undoing the damage of a wrongfully executed instruction be μ cycles. Consider a program tree where each node represents a basic block, and imagine following a program trace using some branch prediction mechanism (see Figure 5.5). Let p_ω be the probability that an instruction is scheduled with an instruction from a basic block that is ω levels up the program tree. Assume p_ω is independent of depth in the program tree. Also assume that the scope of concurrency detection extends to the end of the program (feasible at compile time but not at run time). The branch misprediction cost, D_j, associated with the basic block that is j levels deep is

D_j = Σ_{n=j+1}^{N} Σ_{ω=n-j}^{n-1} p_ω·μ / b , {5.6}

where N is the average program tree depth and 1/b is the expected size of a basic block. For example, suppose the conditional branch prediction associated with the basic block five levels deep is in error. This implies that all the instructions scheduled from the basic blocks at depth six and below to basic blocks at depths 5, 4, 3, 2, and 1 need to be undone. Assuming conditional branch prediction at any depth is equally likely to be in error, the average cost of a misprediction is

D_l = (1/N) Σ_{j=1}^{N} Σ_{n=j+1}^{N} Σ_{ω=n-j}^{n-1} p_ω·μ / b . {5.7}

Because the probability of scheduling an instruction ω levels up cannot exceed that of scheduling ω - 1 levels up, p_ω must be a monotonically decreasing function of ω. Also, since Σ_{ω=0}^{N} p_ω = 1, p_ω must be a nonlinear function of ω. Assume p_ω can be expressed as


Figure 5.5 Illustration of a program tree, a scheduled trace of execution, and the assembly of wide instruction words with beyond-basic-block scheduling.

p_ω = K·q^ω·(1 - q) , where K = 1 / Σ_{ω=0}^{N} q^ω·(1 - q) .

Assuming K = 1 (as would be the case for large N), intuitively, (1 - q) represents the probability of within-basic-block scheduling for an instruction. The experimentally collected results for p_ω for the set of benchmarks exhibit similar distribution characteristics. Thus, Equation (5.7) can be rewritten as

D_l = (1/N) Σ_{j=1}^{N} Σ_{n=j+1}^{N} Σ_{ω=n-j}^{n-1} K·q^ω·(1 - q)·μ / b .

In closed form,

D_l = ( K·μ·q / (N·b·(1 - q)) )·[ N·(1 + q^N) - (1 + q)·(1 - q^N)/(1 - q) ] . {5.8}

Figure 5.6 illustrates D_l as a function of the program tree depth, N. For values of q in the range of 0.1 to 0.6 the average misprediction delay, D_l, is essentially constant for N > 20. Even for q as large as 0.8, D_l is nearly stable for N ≥ 50. Since the scope of concurrency detection in Equation (5.7) is assumed to be infinite (extending up to the end of the program), D_l as computed above is an upper bound. Even in the worst case the average branch misprediction delay is about the same as a typical cache miss processing delay, 10 to 20 clocks.
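The following sketch (not from the report) evaluates the average misprediction delay by summing Equation (5.7) directly with the geometric form of p_ω; the parameter values in the example follow the assumptions quoted for Figure 5.6 (b = 0.2, μ = 1):

    def avg_misprediction_delay(q, N, b=0.2, mu=1.0):
        """Average misprediction delay D_l in clocks, from the triple sum of
        Equation (5.7) with p_omega = K * q**omega * (1 - q)."""
        K = 1.0 / sum(q ** w * (1 - q) for w in range(N + 1))   # normalizes p_omega to sum to 1
        p = [K * q ** w * (1 - q) for w in range(N + 1)]
        total = 0.0
        for j in range(1, N + 1):              # depth of the mispredicted basic block
            for n in range(j + 1, N + 1):      # deeper blocks whose instructions may need undoing
                for w in range(n - j, n):      # percolation distances that cross the mispredicted branch
                    total += p[w] * mu / b
        return total / N

    for q in (0.1, 0.4, 0.6, 0.8):
        print(q, round(avg_misprediction_delay(q, N=50), 2))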

5.3.2 Alternate Computation for p_ω

Assuming fixed-size basic blocks of size B = ⌊1/b + 0.5⌋, p_ω can be computed from the P(I_j:y):

p_ω = (1/B) Σ_{i=1}^{B} Σ_{j=i+(ω-1)·B}^{i+ω·B-1} P(I_j:y) .

For example, assume b = 0.2. Consider the second to last instruction (i = 2) in a basic block. If it is scheduled at a distance of 7 to 11, it has been scheduled past two unresolved branches. Hence, the contribution to p_ω for ω = 2 by the second to last instruction is

0.2 Σ_{j=2+5}^{2+10-1} P(I_j:y) .

The parameter p_ω computed as above would not be as accurate as that empirically collected, because the above calculation is based on the very simplistic assumption of a fixed basic block size. This can be improved by using the basic block size distribution instead. Assuming p_δ (and hence P(I_j:y)) to be independent of the size of basic blocks,


Figure 5.6 Average misprediction delay versus program tree depth for branch frequency b = 0.2, average cost of damage undoing per percolation μ = 1, and various values of the percolation-distance distribution parameter q. The parameter q is a measure of the beyond-basic-block scheduling probability.

p_ω computed using the basic block size distribution should be approximately the same as that empirically collected. Thus an alternate characterization of parallelism can be in terms of p_δ and the basic block size distribution instead of p_ω.
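As a sketch of the fixed-block-size approximation of Section 5.3.2 (illustrative only; the scheduling probabilities here are a made-up geometric sequence rather than measured data):

    def p_omega_fixed_blocks(P_sched, b, omega):
        """Approximate p_omega from scheduling probabilities P_sched[j] = P(I_j:y),
        assuming fixed-size basic blocks of B = round(1/b) instructions."""
        B = int(1.0 / b + 0.5)
        total = 0.0
        for i in range(1, B + 1):                            # position within the basic block
            for j in range(i + (omega - 1) * B, i + omega * B):
                if j < len(P_sched):
                    total += P_sched[j]
        return total / B

    P_sched = [0.9 ** j for j in range(64)]                  # hypothetical P(I_j:y) values
    print(p_omega_fixed_blocks(P_sched, b=0.2, omega=2))     # uses distances 7 through 11, as in the text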

5.3.3 Dynamic Scheduling with Finite Lookahead

Since a machine can only dedicate a finite amount of chip area to keeping the history of speculatively executed operations, there is a limit to the amount of lookahead in terms of basic blocks. Let L be the scope of lookahead measured in number of basic blocks (L = 0 for within-basic-block scheduling). Then the average misprediction delay is approximately

D_l ≈ Σ_{j=1}^{L} Σ_{ω=j}^{L} p̂_ω·μ / b , {5.9}

where p̂_ω represents the p_ω distribution truncated at a distance of L basic blocks. A finite lookahead in terms of basic blocks also implies that the size of the instruction window, W, is a variable, as it gets truncated to the size of L basic blocks whenever the combined size of the L pending basic blocks is less than W instructions. The distribution for W in such a case can be computed using the basic block size distribution.

5.4 Experimental Results

There are two key input parameters in the model developed in the previous two sections: p_δ and p_ω. The parameter p_δ provides a measure of how often two instructions at positions δ apart in the instruction stream are found to be independent. As noted, two independent instructions at a fixed distance δ may have a very different cost for simultaneous scheduling, depending on their distance as measured in basic blocks. The parameter p_ω captures this additional cost, giving a more realistic performance estimate. Experiments have been conducted on a set of benchmarks (see Table 5.1) using the Multiflow Trace Scheduling compacting C and Fortran 77 compiler on a TRACE computer. This compiler [Fis81] does out-of-order, beyond-basic-block scheduling. The goal of these experiments was two-fold: first, to establish the nature of the p_δ and p_ω parameters and to determine their capability for characterizing program parallelism; second, to show that this characterization can be used to predict program performance under scope and resource constraints.

Table 5.1 Benchmarks used in this study.

Benchmark   Description
Stanford    Collection of various application programs, also known as the Stanford Integer Suite
spice       Analog circuit simulation package
fpppp       Quantum chemistry benchmark that measures performance on a two-electron integral derivative computation
tair        Transonic airfoil analysis program
applu       Coupled partial differential equations
cgm         Conjugate gradient solver
fftpde      3-D FFT PDE
mgrid       Simple multigrid solver
mdg         Driver for molecular dynamic simulation of flexible water molecule
mg3d        Nonlinear algebraic systems solvers and ODE solvers for signal processing
bdna        ODE solvers for chemical and physical models

The hardware of the TRACE 28/200 [CNO88] includes four processor boards, each containing two integer ALUs, one floating point adder, and one floating point multiplier. It can initiate 28 operations per instruction. Thus, the collected values for p_δ and p_ω are not resource constrained when considering less than the available number of resources of the TRACE 28/200. The Multiflow compiler provides the ability to generate detailed trace schedules, such that the operations being grouped are also tagged to indicate their position in the original sequential instruction stream. This information is used to calculate the scheduling probability, P(I_j:y) (see Section 5.2). Collected traces for the benchmarks are post-processed to simulate a run time scheduling environment. The target environment assumes that on every scheduling cycle W instructions from the dynamic stream of instructions are examined for dependency. Those found independent, say k (< W), are scheduled together and dispatched, and another W - k instructions are moved into the instruction window. The post-processing consists of the following phases: 1) renumbering instructions in a trace to represent a continuous dynamic sequence, 2) dynamically adjusting distances between instructions as the execution proceeds, and 3) weighing data from each routine in proportion to the fraction of run time spent in that routine. Figures 5.7 through 5.9 plot P(I_j:y) for the chosen set of benchmarks. Note the distinctive nature of the fpppp benchmark. Unlike the others, it has a relatively small scheduling probability for adjacent instructions (δ = 1) and has a small but non-negligible probability of scheduling even at distances of more than 512. The p_δ values can be obtained from the scheduling probabilities in the following manner. Restating Equation (5.1),

P(I_j:y) = P(I_{j-1,j})·P(I_{j-2,j} | I_{j-1,j}) ⋯ P(I_{1,j} | I_{j-1,j}, ..., I_{2,j})·P(I_{0,j} | I_{j-1,j}, I_{j-2,j}, ..., I_{1,j}) .

Assuming a stationary distribution yields

P(I_j:y) = P(I_{j-2,j-1})·P(I_{j-3,j-1} | I_{j-2,j-1}) ⋯ P(I_{0,j-1} | I_{j-2,j-1}, ..., I_{1,j-1})·P(I_{0,j} | I_{j-1,j}, I_{j-2,j}, ..., I_{1,j})

         = P(I_{j-1}:y)·P(I_{0,j} | I_{j-1,j}, I_{j-2,j}, ..., I_{1,j}) .

Therefore,

P(I_j:y) / P(I_{j-1}:y) = P(I_{0,j} | I_{j-1,j}, I_{j-2,j}, ..., I_{1,j}) {5.10}

                        = p_δ , for δ = j .
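In practice this means p_δ can be read off from successive measured scheduling probabilities. A minimal sketch (the input values are hypothetical, not measured data from the study):

    def p_delta_from_sched(P_sched):
        """Conditional independence probabilities p_delta from measured
        scheduling probabilities, via Equation (5.10):
        p_delta = P(I_delta:y) / P(I_{delta-1}:y)."""
        # P_sched[0] corresponds to P(I_0:y) = 1, since I_0 is always dispatched
        return [P_sched[d] / P_sched[d - 1] for d in range(1, len(P_sched))]

    P_sched = [1.0, 0.78, 0.63, 0.52, 0.44]      # hypothetical measured values
    print([round(p, 3) for p in p_delta_from_sched(P_sched)])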


Figure 5.7 Measured instruction scheduling probability versus distance for the Stanford, spice, fpppp, and tair benchmarks.


Figure 5.8 Measured instruction scheduling probability versus distance for the applu, cgm, fftpde, and mgrid benchmarks.


Figure 5.9 Measured instruction scheduling probability versus distance for the mdg, mg3d, and bdna benchmarks.

Figures 5.10 through 5.12 plot the p_ω distribution for the chosen set of benchmarks. The p_δ plot alone is insufficient justification for increased scope for a benchmark; the associated misprediction delay cost given by the p_ω distribution must also be considered. Figures 5.13 through 5.15 provide an estimate of this cost for the chosen set of benchmarks using the empirically computed branch probability, b, and assuming a cost of damage undoing μ = 1, in Equation (5.9). The worst case misprediction delay is around 20 clocks, as estimated earlier from the analytical calculations in the previous section. Note that for fpppp more than 90 percent of the time the scheduled instructions are in the same basic block, resulting in a very small misprediction cost. Therefore, this benchmark would benefit most from a large scope. On the other hand, although one might be tempted to increase the scope for tair by only looking at the P(I_j:y) plot, the misprediction delay estimate (Figure 5.13) for this program would be a strong deterrent to such a decision. Thus, p_δ, the conditional instruction independence probability, and p_ω, a measure of the cost of speculative execution, together provide a complete picture in terms of the amount of available parallelism and the cost of its extraction, respectively. For programs where the Multiflow compiler generates many short traces, probability calculations for larger distances become inaccurate. For example, assume there is one instance when an instruction at a distance of 128 was examined for dependence and it was found independent. If such a small sample is used to calculate the scheduling probability at this distance, then the scheduling probability at a distance of 128 would be assigned a probability of one, which is obviously an erroneous conclusion. A simple fix for this problem is to ignore probabilities calculated from too small a number of sample data points. To incorporate this fix, of all the SPEC benchmarks, the NAS Parallel benchmark suite [BBL91], and the Perfect benchmarks [CKP90], only the benchmarks that had at least 200 sample data points in their growing throughput range have been selected. (This is why not all of the SPEC, NAS Parallel, and Perfect benchmarks are members of the chosen set of benchmarks.) Figures 5.16 through 5.26 plot the throughput calculated with the model and the measured static throughput (average width of instruction word) from the compiler output as a function of the scope of concurrency detection and the instruction word width. Model throughput is calculated using the p_δ values input to the analytical model developed in Section 5.2. Measured static throughput is estimated as the average width of the schedules (wide instruction words) in the traces output by the Multiflow compiler. The compiler throughput estimate does not take into account run time delays, such as memory delays,

SPEC is a trademark of the Systems Performance Evaluation Cooperative. Purdue University SPEC License No. 310. The SPEC benchmarks and NAS Parallel benchmarks were compiled using version 1.6.1 of the TRACE C and FORTRAN 77 compilers. The Perfect Club benchmarks were compiled using version 2.2 of the TRACE FORTRAN 77 compiler.


Figure 5.10 Measured beyond-basic-block instruction scheduling probability versus distance for the Stanford, spice, fpppp, and tair benchmarks.


Figure 5.11 Measured beyond-basic-block instruction scheduling probability versus distance for the applu, cgm, fftpde, and mgrid benchmarks.


Figure 5.12 Measured beyond-basic-block instruction scheduling probability versus distance for the mdg, mg3d, and bdna benchmarks.


Figure 5.13 Predicted misprediction delay based on the empirically collected p_ω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks, for the Stanford, spice, fpppp, and tair benchmarks.


Figure 5.14 Predicted misprediction delay based on the empirically collected p_ω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks, for the applu, cgm, fftpde, and mgrid benchmarks.


Figure 5.15 Predicted misprediction delay based on the empirically collected p_ω distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks, for the mdg, mg3d, and bdna benchmarks.

which are not part of the analytic model either. An important reason for the discrepancy between the model prediction and the compiler output is the basic difference between superscalar and VLIW machines. The model is based on a superscalar architecture. Consequently, two instructions that are data-independent of each other are always assumed schedulable on two available resources (pipelines). But in a VLIW environment, such as that of Multiflow, there are additional resource restrictions, as each functional unit is not a complete execution pipeline. For example, on a VLIW machine, two independent floating point adds may be forced to wait if only integer adders are available. Such resource constraints are not part of the analytical model. Finally, note that for a window size of 32, the average difference between the model and the measured compiler output is 20 percent and the worst case difference is 43 percent; whereas, for a window size of 1024, the average and worst case differences are 47 percent and 79 percent, respectively. More importantly, the throughput curves for both the model and the measured values have very similar shapes. Experience with these benchmarks confirms that the longer the traces, the more data points, and hence the more credible the probabilities and the better the performance prediction. For example, the tair and fpppp benchmarks gave relatively longer traces and had better performance prediction. Almost all the benchmarks (an important exception being fpppp) attain almost all of the speedup with a scope of about 64 instructions and an instruction word width of 6. Figures 5.7 through 5.26 have been plotted with distances up to 1024 instructions and for 32 basic blocks of lookahead. Current microprocessors, such as the 80x86 or RS6000, however, are just beginning to explore the tradeoffs associated with beyond-basic-block (speculative) execution. Hence, the scope used by near-future generations of such machines is likely to be limited to a few basic blocks and an instruction window size of at most 16 to 32 instructions. With this in mind, Appendix B contains performance plots for several other benchmarks (Table B.1) for a reduced range of lookahead. Although the benchmarks in Appendix B did not have more than 200 sample data points in their entire speedup range (the previous selection criterion), they all have more than 200 sample data points for a distance of 32 instructions or less. The graphs for these benchmarks are limited in scope to 32 instructions and eight basic blocks (Figures B.1 to B.22).

5.5 Potential Improvements to the Model

There are three predominant sources of inaccuracy in the performance predictions of the analytic model.


Figure 5.16 Throughput under resource and scope constraints for the Stanford benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.17 Throughput under resource and scope constraints for the spice benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.18 Throughput under resource and scope constraints for the fpppp benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.19 Throughput under resource and scope constraints for the tair benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.20 Throughput under resource and scope constraints for the applu benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.21 Throughput under resource and scope constraints for the cgm benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.22 Throughput under resource and scope constraints for the fftpde benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.23 Throughput under resource and scope constraints for the mgrid benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.24 Throughput under resource and scope constraints for the mdg benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.25 Throughput under resource and scope constraints for the mg3d benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.


Figure 5.26 Throughput under resource and scope constraints for the bdna benchmark; resources varied with instruction word width equal to 2,3,4,6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

First, a trace scheduling compiler must schedule subject to the available hardware resources in the target computer. The analytic model depends on experimentally gathered data for its p_δ probabilities. Because the target of the Multiflow compiler is a finite resource machine, the p_δ probabilities inferred from the scheduled VLIW instruction stream omit some cases of instructions schedulable with instruction I_0. Thus, the measured p_δ values are a lower bound on the actual p_δ. We do not know of a parallelizing compiler that assumes an infinite resource environment and provides a detailed schedule, with a mapping between the original sequential stream and the new compacted schedule. Second, the collected set of statistics is a subset of the total statistics due to missing dependency information from the compiler. For example, when any compiler schedules I_8 with I_0, it can be concluded that independent instruction pairs were detected at distances of 8, 7, 6, ..., and 1, because I_8 can be scheduled with I_0 only if I_8 is independent of not only I_0 but also of I_7, I_6, ..., and I_1. However, if I_8 is not scheduled with I_0, it is not possible to determine the responsible dependent instruction pair(s). Therefore, the collected statistics for scheduled instructions as a function of distance are a subset of the statistics corresponding to all possible schedulable instruction pairs as a function of their distance. Finally, the analytic performance prediction model assumes a continuous instruction stream (the dynamic instruction stream) with the given p_δ characteristics, whereas the Multiflow compiler output produces several, potentially many, disjoint traces. It is quite reasonable for the compiler to do this. The analysis combines the p_δ values for all the traces, weighted by the estimated time spent executing each trace. This is an approximation of the dynamic p_δ values. Fixes for the first and second issues will require dependency analysis tools specially designed for collecting p_δ values. A remedy for the third problem requires generating combined traces or collecting p_δ using dependency analysis on dynamic instruction streams. Analyzing dynamic instruction streams would also fix the problem of short traces mentioned in the previous section.

5.6 Summary

An analytic model for optimizing instruction window size has been presented. The value of this model is its ability to establish whether a performance bottleneck is (1) not having enough resources (number of pipelines), (2) not having enough parallelism in the instruction stream, or (3) not examining enough instructions to extract the inherent parallelism. The proposed model has been validated by demonstrating that the predicted throughput for the chosen set of benchmarks is close to the measured throughput from the compiler output. The cost of speculative execution, in terms of the delay required to undo the damage due to wrongfully executed instructions, has been quantified for the set of benchmarks. The parameters p_δ and p_ω provide a means for characterizing the inherent parallelism in an application instruction stream. Intuitively, p_δ corresponds to the inherent parallelism in the application program, and p_ω corresponds to the cost of extracting that parallelism. Although the performance potential of machine architectures can be compared in terms of parameters such as number of pipelines or processors, branch delay, cache miss delay, and so forth, the only common way of comparing two programs such as spice and fpppp has been in terms of their run time on a certain machine. The parameters p_δ and p_ω are a way of comparing the performance potential of programs in terms of a quantitative measure of their inherent parallelism. The combination of p_δ and p_ω provides quantitative insights (such as the cited difference between fpppp and tair in Section 5.4) into the cost-performance tradeoffs associated with exploiting fine-grain program parallelism. In the absence of this insight, such tradeoffs have to be postponed to a much later stage of the design process and cost expensive simulation cycles.
One needs to be cautious in comparing the throughput plots of different benchmarks. A better comparison of the performance potential of two benchmarks should be in terms of the speedup, i.e., the throughput ratio with respect to single-pipeline sequential execution, rather than in terms of individual throughputs. The baseline throughput might vary considerably with the benchmarks. For example, consider the Stanford and fpppp benchmarks. The former consists of all integer arithmetic; the latter is a floating-point intensive benchmark and, hence, is very likely to have a baseline throughput of much less than one. Therefore, although Stanford and fpppp may both have a throughput of around 1.6 for a window size of 16, the latter implies a much higher speedup than the former. Finally, the performance prediction approach presented in this chapter is meant primarily for actual applications (including “dusty decks”) as opposed to kernels or small benchmarks. Full applications yield longer traces. The longer the traces, the more credible and meaningful the probability calculations, and hence the better the performance prediction. Predictions about the speed of real applications rather than kernels is an advantage of this approach.

CHAPTER 6 SPECTRUM OF CHOICES: SUPERPIPELINED, SUPERSCALAR OR MULTIPROCESSOR?

6.1 Introduction

This final chapter extends the model developed in Chapter 4 to include multiprocessors. The utilization of all three system types, as affected by the inherent parallelism in an instruction stream, is examined. Recent simulation-based studies suggest that while superpipelines and superscalars are equally capable of exploiting fine-grain concurrency, multiprocessors are better at exploiting coarse-grain parallelism. Lilja and Yew [LiY90] used trace-driven simulations and concluded that the best performance is obtained using a coarse-grain multiprocessor configuration where each individual processor has a parallelism of two to four.

6.2 Delays Associated with Multiprocessors

Assume a system of N processors, each processor having k pipelines, where the pipelines are s stages deep (see Figure 6.1). Consider performing y iterations of a program loop, each consisting of n/y identical operations, each operation taking g gate levels of delay. These operations are pipelined in response to the instructions being scheduled on each pipeline, which in turn is a consequence of source-code-level iterations being assigned to each processor. For n/y a multiple of s·k and y a multiple of N, the first set of k results in each iteration completes after s clocks, and then the remaining n/y − k results finish in (n/y − k)/k clocks in groups of k. Thus a total of

(y/N) · [ s + (n/y − k)/k ]  clocks

[Figure 6.1: Combined system architecture assumed by the models. N processors are connected through a shared-memory interconnection network; within each processor, k pipelines share an on-chip memory through an on-chip interconnection network.]

are needed. Assuming the clock period is determined by the pipeline hardware, the above expression in terms of clocks can be re-written in terms of gate delays as

(y/N) · [ (g + 2(s − 1)) / s ] · [ s + (n/y − k)/k ]  gate-delays   {6.1}
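A minimal numeric sketch of Equation 6.1, assuming the reconstruction above (in particular, that each of the s − 1 inter-stage buffers costs two gate delays), is given below; the parameter values are arbitrary examples:

    def total_gate_delays(n, y, N, k, s, g):
        """Gate delays to execute y iterations of n/y operations each on
        N processors, each with k pipelines of depth s; g is the gate-level
        delay of one operation (Equation 6.1 as reconstructed above)."""
        clocks = (y / N) * (s + (n / y - k) / k)
        clock_period = (g + 2 * (s - 1)) / s   # logic plus inter-stage buffer overhead
        return clocks * clock_period

    # Example: 6400 operations spread over 64 iterations, run on 4 processors
    # with 4 pipelines each, 5-stage pipelines, 30 gate delays per operation.
    print(total_gate_delays(n=6400, y=64, N=4, k=4, s=5, g=30))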

The modelling details of dependency delay and operand fetch delay are discussed next. These models are similar to the one used by Cytron [Cyt86] to explain Doacross and to that used by Lilja and Yew [LiY90].

6.2.1 Dependency Delay

Suppose iterations are statically scheduled such that processor 1 executes loop iterations 1, 1+N, 1+2N, ...; processor 2 executes iterations 2, 2+N, 2+2N, ...; and so on. Further assume that the parallel iterations have a lexically backwards dependence of distance one, i.e., a certain statement Si in an iteration must be executed after a statement Sj (j > i) of the preceding iteration. The resulting delay is

dependency delay = max [ (N − 1) · (l·n)/(y·k) − ((1 − l)·n)/(2·y·k) ,  0 ]  clocks.

For an inter-iteration dependency distance of δ iterations, the above equation becomes

dependency delay = max [ ((N − δ)/δ) · (l·n)/(y·k) − ((1 − l)·n)/(2·y·k) ,  0 ]  clocks.   {6.2}
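The following sketch evaluates the reconstructed Equation 6.2, assuming the max with zero simply prevents a negative delay when the remaining loop body fully hides the wait; the inputs are illustrative and not taken from any benchmark:

    def dependency_delay(N, delta, l, n, y, k):
        """Clocks a processor stalls before a new iteration can pass its
        dependency point (Equation 6.2 as reconstructed above).
        N: processors, delta: inter-iteration dependency distance,
        l: dependency-point position as a fraction of the loop body,
        n/y: operations per iteration, k: pipelines per processor."""
        wait = ((N - delta) / delta) * (l * n) / (y * k)   # prior iterations to wait for
        hide = ((1 - l) * n) / (2 * y * k)                 # work available to hide the wait
        return max(wait - hide, 0.0)

    # Illustrative values (l = 0.3 as in Figure 6.2), showing the effect of delta:
    print(dependency_delay(N=10, delta=1, l=0.3, n=4000, y=100, k=2))
    print(dependency_delay(N=10, delta=2, l=0.3, n=4000, y=100, k=2))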

      DO i = 1, N
S1:     ...
S2:     ...
S3:     B(i) = A(i) + C(i-2)
S4:     ...
S5:     C(i) = D(i) * E(i)
S6:     ...
        ...
S10:  ENDDO

l = 3/10 = 0.3,  δ = 2

[Schedule on Processors 1 through 4: S3 of a new iteration is delayed until S5 of the iteration two earlier has completed. Si(j) denotes the j-th iteration of the instruction Si.]

Figure 6.2 Inter-iteration dependency

6.2.2 Operand Fetch Delay

Assume a multistage interconnection network between the multi-ported, on-chip cache and the pipelines, and another such network between the processors and the off-chip, shared, global memory. Then operand fetch delay can be modelled as

operand fetch delay = b0 + b1·log2 k + f(log2 k, utilization),   for N = 1   {6.3}

operand fetch delay = (1 − γ) · [ b0 + b1·log2 k + f(log2 k, utilization) ]
                    + γ · [ c0 + c1·log2 N + f′(log2 N, utilization) ],   for N > 1   {6.4}

Here, γ is the probability of accessing the shared global memory. For more than one processor, assume that all references to shared variables go to the off-chip shared memory. Further, b0 is the delay encountered in accessing the on-chip cache, c0 is the delay required for accessing the off-chip shared global memory, and b1·log2 k and c1·log2 N are the network access delays for the cache and the global memory, respectively. The functions f and f′ represent delays due to interconnection network contention. Unlike the superpipeline/superscalar model, two simplifications are made: on-chip data cache misses are ignored, and the total cache access time is no longer assumed constant. Let the on-chip cache access time be b0 = g·dc and the off-chip access time be c0 = g·dm. Based on the reported experience with the Cedar system [LiY90], contention delays are assumed to be 50 percent of the network delays, i.e., (b1/2)·log2 k and (c1/2)·log2 N respectively for the cache and the shared memory. Let the network delay factor for the on-chip implementation be b1 = g·dcn and that for the off-chip memory be c1 = g·dmn. Branch delays and operand fetch delays are further reduced by overlap factors of πb and πo (both less than one), respectively. These factors are determined by how often the compiler is able to hide these delays behind execution delays for machines with multi-cycle operations.
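A sketch of Equations 6.3 and 6.4 with the substitutions above follows; it assumes the contention terms f and f′ equal 50 percent of the respective network delays (as stated above) and that the overlap factor πo removes a πo fraction of the resulting delay, which is one plausible reading of the text:

    from math import log2

    def operand_fetch_delay(N, k, g, dc, dm, dcn, dmn, gamma, pi_o):
        """Operand fetch delay in gate delays (Equations 6.3 and 6.4 as
        reconstructed above); contention is taken as 50% of network delay."""
        b0, c0 = g * dc, g * dm          # cache and shared-memory access times
        b1, c1 = g * dcn, g * dmn        # on-chip and off-chip network delay factors
        cache = b0 + b1 * log2(k) + 0.5 * b1 * log2(k)
        if N == 1:
            delay = cache
        else:
            shared = c0 + c1 * log2(N) + 0.5 * c1 * log2(N)
            delay = (1 - gamma) * cache + gamma * shared
        return (1 - pi_o) * delay        # assume pi_o of the delay is hidden by the compiler

    # Nominal values from Table 6.1, with g = 30 gate delays per operation assumed.
    print(operand_fetch_delay(N=8, k=4, g=30, dc=0.1, dm=0.4,
                              dcn=1/45, dmn=1/25, gamma=0.2, pi_o=0.3))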

6.3 Utilization Constraints

The preceding discussion assumes that there are k instructions available to be scheduled on the k pipelines and N iterations available to be assigned to the N processors. This is not always true due to data dependency constraints. Similarly, pipeline interlocks can cause freezes of pipeline segments. Each of these effects results in a drop in the utilization of the resources. This decrease can be modelled as a scalar factor reducing utilization or as a modification of the utilization distribution, via a vector product.

6.3.1 Characteristics of Utilization Curves

The performance decrease due to utilization constraints is simply a manifestation of the scheduling delay of Section 4.2. Sometimes this delay is best modelled using an additional delay term, such as when modelling fetch delay due to branches or when modelling the dependency delay for scheduling different iterations on a multiprocessor. At times when the available information is less precise or less regular, the delay is best modelled as a utilization factor, u. This is distinct from the overall resource utilization introduced in Section 4.2. Let u be the utilization factor; then

sav = u·s,   kav = u·k,   Nav = u·N,

where sav refers to the average number of active pipeline segments, kav to the average number of active pipelines for superscalar processors, and Nav to the average number of active processors for multiprocessor computers. The utilization factor, u, should have the following characteristics:

a) Let α be the dynamic fraction of code that must be executed in strict order on a single pipeline, and similarly, let β be the dynamic fraction of code that must be executed serially on a single processor. Two code sequences with the same α (or β) can have different amounts of inherent parallelism. Although α and β place an upper limit on the utilization, they are not a measure of the actual amount of parallelism. For example, the two code sequences shown in Figure 6.3 have the same α of 0.1, but while code sequence (b) would be at its peak performance with 10 processors, code sequence (a) would require 28 processors for peak performance. Lilja and Yew [LiY90] report different speedups for programs in different α-categories and the same range of speedups for programs in the same α-category. We believe that for the chosen set of programs, different α-categories appear to correspond to different utilization categories, and the reported correlation of actual speedups with α-categories is coincidental. For N processors, u can be expressed as

u = β·(1/N) + (1 − β)·uns.

Next, look at the characteristics of uns, which refers to the utilization in the parallelizable or non-scalar portion of the code.

[Figure 6.3: Sample code sequences with the same α (= 0.1) but different amounts of parallelism. Sequence (a) places its parallelizable code in a single loop of 28 iterations, while sequence (b) spreads the same fraction of parallelizable code over loops of 10 iterations each, so (b) peaks at 10 processors while (a) needs 28.]

b) For a small number of processors (or pipelines or pipeline segments), Nav ≈ N (or kav ≈ k, or sav ≈ s); that is, the utilization decrease with an additional processor (or pipeline or pipeline segment) is very small. But as the number of processors grows, utilization decreases more significantly with additional processors. For a large number of processors, additional processors do little to increase the average number of active processors. The above holds true analogously for large numbers of pipelines and pipeline segments. Hence, as N increases, utilization tends to the curve given by the function 1/N.

c) Another important characteristic of any utilization curve can be stated as: if uns (for N = x) = y, then uns (for N = z > x) ≥ (y·x)/z. Stated simply, the number of active processors cannot decrease with the addition of a new processor. Such a restriction makes intuitive sense in the case of superscalars and multiprocessors. The utilization equation for pipelines in Chapter 2 does not impose this restriction because additional segmentation is not as straightforward as the addition of another pipeline or processor, and conceivably the average number of active segments can decrease with increasing segmentation of a pipeline. This distinction is ignored in this chapter.

d) Finally, in the case of superscalars and multiprocessors, kav ≤ kmax and Nav ≤ Nmax, respectively, which are determined by the maximum degree of fine-grain (operation-level) parallelism and coarse-grain (iteration-level) parallelism, respectively.
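Characteristic (c) can be checked mechanically. The sketch below takes a hypothetical raw utilization curve and clamps it so that the implied number of active processors never decreases as processors are added:

    def enforce_nondecreasing_active(u_ns):
        """Characteristic (c): given raw utilization samples u_ns[N] indexed
        by processor count N = 1, 2, ..., clamp them so that the implied
        number of active processors (u_ns[N] * N) never decreases with N."""
        adjusted, active_prev = [], 0.0
        for N, u in enumerate(u_ns, start=1):
            active = max(u * N, active_prev)   # active processors cannot drop
            adjusted.append(active / N)
            active_prev = active
        return adjusted

    # Hypothetical raw curve that dips too fast; the clamp restores characteristic (c).
    print(enforce_nondecreasing_active([1.0, 0.9, 0.55, 0.50, 0.38]))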

6.3.2 Alternate Characterization of Program Parallelism

In Chapter 5, inherent parallelism in a program was characterized using the pδ and pω statistics. The parameters introduced above provide an alternate characterization of the inherent parallelism in application programs. While α and β determine the portion of code that lacks any parallelism, kmax and Nmax limit the maximum parallelism that can be extracted at any instant. These parameters together put an upper limit on the utilization of superscalars and multiprocessors. For example, in the case of multiprocessors, the upper limit is

u ≤ (1 − β) · min(1, Nmax/N).   {6.5}
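A sketch of the upper limit of Equation 6.5, as reconstructed above, follows; β and Nmax take the nominal values of Table 6.1:

    def utilization_upper_limit(N, beta, N_max):
        """Upper limit on multiprocessor utilization (Equation 6.5 as
        reconstructed): the serial fraction beta contributes nothing to the
        parallelizable part, and at most N_max processors can ever be active."""
        return (1 - beta) * min(1.0, N_max / N)

    # With beta = 0.1 and N_max = 50 (nominal values of Table 6.1):
    for N in (10, 50, 100):
        print(N, utilization_upper_limit(N, beta=0.1, N_max=50))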

Let uns^P, uns^S, and uns^M = {low, average, high} depending on whether the parallelism available for the superpipeline, the superscalar, or the multiprocessor, respectively, is low, average, or high. This implies the corresponding utilization curve from Figure 6.4. The bounding curves of Figure 6.4 are the lower and upper bounds of utilization, given by 1/N (in the case of multiprocessors) and 1, respectively. Moving from applications with a large amount of inherent parallelism to those with little or no parallelism, the parameters uns^P, uns^S, and uns^M decrease.

[Figure 6.4: Assumed utilization curves. Utilization factor (uns) versus number of processors (up to 50); the curves for low, average, and high parallelism lie between a lower bound of 1/N and an upper bound of 1.]

[Figure 6.5: (a) Utilization versus instruction word width (0 to 50) measured on the Multiflow TRACE 28/200 computer for fpppp (solid), nasa7 (dotted), and daxpy (dashed); (b) utilization versus number of processors (2 to 4K) derived from tables on pp. 214-217 of [Pol86].]

The level at which the parallelism is available is important. Loop-level parallelism is reflected by the value of uns^M. The parameter uns^S indicates instruction-level parallelism. A program that is strictly serial would force uns^S and uns^M to their lower bounds, but uns^P may still be very high. Inter-iteration dependencies that limit the utilization of multiprocessors are characterized by l, the fraction of code in a loop body that exhibits dependence, and δ, the iteration distance of the dependence.

In order to get a realistic idea of the nature of the utilization curves, data was collected from machines relying on both fine-grain parallelism and iteration-level, coarse-grain parallelism. Figure 6.5 (a) presents the utilization data for the Multiflow TRACE 28/200 machine. Figure 6.5 (b) presents utilization inferred from the speedup results published by Polychronopoulos using guided self-scheduling techniques on certain loops [Pol86]. The nature of these empirical curves conforms with characteristics (b) and (c) above. Based on this combination of experimental and analytical insights, a family of utilization curves has been used that is considered representative of low, average, and high amounts of parallelism in the non-scalar portion of the code, as depicted in Figure 6.4. For the discussion to follow, the sole purpose of these curves is to assess the performance impact of a change in the available amount of parallelism as the program transformation moves from coarse-grain to fine-grain, or as different applications are executed.

6.4 Results

A nominal set of values (see Table 6.1) is assumed to describe the hardware performance characteristics and program characteristics for a hypothetical but realistic environment. The nominal value of 0.4 for dm implies a main memory access time of about two to three clocks for a five- to six-stage-deep pipeline, which is typical of current microprocessors. The on-chip cache is assumed to be four times faster than its off-chip counterpart. The on-chip network delay factor, dcn, is chosen such that the network delay is at most twice the access time to the cache. The off-chip network is assumed to be about two times slower than its on-chip counterpart. Also, 20 percent of memory accesses are assumed to be to shared variables. This fraction may be much higher on some systems due to the main memory traffic needed to maintain cache consistency, in which case this fraction would be a function of the number of processors and the particular consistency algorithm in use. Finally, 30 percent of the branch and operand fetch delays are assumed to be overlapped with execution delays.

Table 6.1 Nominal values of model parameters describing hardware and program characteristics.

Hardware Characteristics:
  Fraction of operation gate delay required for branch resolution, db          0.15
  Fraction of operation gate delay required for on-chip cache access, dc       0.1
  Fraction of operation gate delay required for off-chip memory access, dm     0.4
  On-chip network access delay factor, dcn                                     1/45
  Off-chip network access delay factor, dmn                                    1/25

Program Characteristics:
  Fraction of code that must be serially executed on one pipeline, α           0.1
  Fraction of code that must be serially executed on one processor, β          0.1
  Branch instruction probability, b                                            0.1
  Probability of memory reference per operation, w                             0.2
  Fraction of data accesses to shared variables, γ                             0.2
  Maximum degree of operation (pipeline) level parallelism, kmax               50
  Maximum degree of iteration (processor) level parallelism, Nmax              50
  Across-iteration dependency distance, δ
  Distance between dependent instructions as a fraction of the
    size of the loop body, l
  Utilization factors for the parallelizable code for pipeline stages
    (uns^P), complete pipelines (uns^S), and processors (uns^M)                follow the average
                                                                               utilization curve of Figure 6.4
  Fraction of branch delays overlapped with execution delays, πb               0.3
  Fraction of operand-fetch delays overlapped with execution delays, πo        0.3

This is consistent with the reported figures from compilers for typical RISC machines.

Figure 6.6 demonstrates the impact of utilization on the throughput of superpipelined systems. Recall that because delay is measured in units of gate delays, operand fetch delay is a constant overhead in the absence of any growing network delay. The only growing overhead is that due to the inter-segment buffers. In the analysis range, this is noticeable only for poorly utilized pipelines, which show a very small drop in throughput with an increasing number of segments. For better utilized pipelines the throughput keeps growing, although at a slower rate, as observed in the analysis of superpipelines in Chapter 4.

Turning attention to superscalars, there are two major differences with respect to the superpipelined systems. First, the addition of the interconnection network for shared memory access results in a growing operand fetch overhead. Second, utilization for superscalars has an additional factor, α, the fraction of code that must be executed on a single pipeline. Figures 6.7 (a) and (b) plot the maximum throughput attained and the corresponding number of pipelines for various combinations of α and the utilization factor uns^S. The effect of α becomes noticeable only for larger values, say α > 0.1. Also, for the same value of α, different levels of throughput can be achieved depending on the utilization factor. The optimum number of pipelines shows even more insensitivity towards α, except that for α > 0.1 there may be a slight increase in the number of pipelines required to achieve the optimum throughput. The optimum throughput, as expected, grows with a better utilization factor; so does the number of pipelines required to achieve this optimum. This is in accord with the findings of Lilja and Yew [LiY90] as visible in their category-2 performance plots, where the better utilized multiprocessors require a larger number of pipelines or processors than the less utilized superscalars to achieve a higher level of speedup.

The impact of memory delay is shown in Figures 6.8 (a) and (b). The two major components are the memory access time dc and the network access delay factor dcn. The number of pipelines where maximum throughput is attained becomes increasingly dependent on the ratio dc/dcn as dc increases. Since dcn controls the rate of growth of operand fetch delay, its impact on how long it takes before the operand fetch delay overruns the advantage of an additional pipeline is to be expected. The stepwise nature of the curves in Figure 6.8 (b) (which results from the log terms in Equation 6.3) is difficult to follow. Hence, the corresponding data is also presented in tabular form in Table 6.2.

Ignoring dependency overhead, Figures 6.7 and 6.8 also represent multiprocessor performance, except that the optimum throughputs would be somewhat less due to the slower off-chip memory interface. Such graphs can be useful in deciding the incremental benefit of adding a processor (or pipeline). Suppose the curves in Figures 6.7 and 6.8 were used for estimating multiprocessor performance (read uns^M in place of uns^S, and processors in place of pipelines).

[Figure 6.6: Impact of utilization on throughput for superpipelines. Throughput (per op-delay) versus number of pipeline stages (up to 50), for different values of the utilization factor uns^P.]

[Figure 6.7: Maximum throughput (a) and optimum number of pipelines (b) as a function of the fraction of code that is single-function-unit sequential (α, plotted from 0.001 to 1), for uns^S = average and uns^S = high.]

[Figure 6.8: Maximum throughput (a) and optimum number of pipelines (b) versus the ratio of memory access delay (dc) to network access delay factor (dcn); dc values shown are 0.05 to 0.65 in increments of 0.1.]

Table 6.2 Optimum number of pipelines versus ratio of memory access delay, dc, to network access delay factor, dcn.

dc/dcn    dc=0.05  dc=0.15  dc=0.25  dc=0.35  dc=0.45  dc=0.55  dc=0.65
   2         21       19       18       17       15       14       13
   3         21       20       19       18       17       16       15
   4         21       20       20       19       18       18       17
   5         21       20       20       19       19       18       18
   6         21       21       20       20       19       19       18
   7         21       21       20       20       20       19       19
   8         21       21       20       20       20       19       19
   9         21       21       20       20       20       20       19
  10         21       21       21       20       20       20       20
  11         21       21       21       20       20       20       20
  12         21       21       21       20       20       20       20
  13         21       21       21       20       20       20       20
  14         21       21       21       21       20       20       20
  15         21       21       21       21       20       20       20
  16         21       21       21       21       20       20       20
  17         21       21       21       21       20       20       20
  18         21       21       21       21       21       20       20
  19         21       21       21       21       21       20       20
  20         21       21       21       21       21       20       20

If a program environment offers a higher level of coarse-grain parallelism such that uns^M increases from average to high, this implies an approximate increase of about 45 percent in optimum throughput but an increase of about 125 percent in the number of processors required to attain that throughput.

Dependency overhead has two important variables: δ, the distance in number of iterations between dependent instructions, and l, the distance between the dependent instructions in the same iteration as a fraction of the loop body length. Before a new iteration on a processor executes, all the previous iterations on which it is dependent must have completed up to the point of dependency (Figure 6.2). An increase in δ has an effect in two ways. First, the number of prior iterations that must complete to the point of the dependency is reduced. For example, on a 10-processor system, if δ = 1 a new iteration has to wait for the nine previous iterations, whereas if δ = 2 there are only four iterations to wait for. Second, an increase in δ retards the onset of dependency delay, as there is no dependency delay for fewer than δ processors. The distance between dependent instructions, l, also has a two-fold impact. As l grows, the larger dependency region yields a longer wait for initiation of a new iteration. A larger l also implies that there is, on average, a smaller remaining portion of the loop body to hide the dependency delay. Figures 6.9 (a) and (b) plot the optimum performance and the corresponding number of processors for varying combinations of l and δ. As a function of δ, notice the nonlinear nature of the optimum throughput curves in Figure 6.9 (a), whereas the optimum number of processors changes almost linearly (Figure 6.9 (b)).
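The count of prior iterations in the 10-processor example above can be checked with a few lines; this sketch covers only the wait count and the onset condition, not the full delay of Equation 6.2:

    def iterations_to_wait_for(N, delta):
        """Number of prior iterations that must reach their dependency point
        before a new iteration can proceed; zero until N exceeds delta."""
        return (N - delta) // delta if N > delta else 0

    for delta in (1, 2, 5):
        print(f"N=10, delta={delta}: wait for {iterations_to_wait_for(10, delta)} iterations")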

6.5 Combined Systems

Finally, consider the performance issues of combined systems, such as superpipelined multiprocessors and superscalar multiprocessors, which are obtained by using clusters of superpipelined and superscalar processors, respectively. Assume a single-processor system with a pipeline that is 32 stages deep. (This implies an issuing capacity of 32 instructions during the length of the pipeline.) Assume that the application stream has a significant amount of coarse-grain parallelism. Therefore, trading off some pipeline stages for additional processors is expected to improve the performance. Consider Figure 6.10 and assume uns^P = low and uns^M = high. This means significantly more coarse-grain parallelism (resulting in a better utilized multiprocessor configuration) than fine-grain parallelism (causing a poorly utilized pipeline configuration). Keeping a constant issuing capacity of 32, add more processors. Initially, performance improves significantly. As yet more processors are added, multiprocessor performance starts to level off and begins to approach the lower limits of the superpipeline performance curves, where the performance loss due to the reduced number of stages becomes increasingly significant.

[Figure 6.9: Maximum throughput (a) and optimum number of processors (b) versus inter-iteration dependency distance δ (1 to 10); l ranges from 0 to 0.30 in increments of 0.05.]

[Figure 6.10: Throughput plot of combined system performance. Throughput (per operation delay) versus number of processors N for superpipelined multiprocessors (dotted) and superscalar multiprocessors (dashed) with combined issuing capacities of 16, 32, and 48 (uns^P = uns^S = low, uns^M = high); the single-stage, single-pipeline multiprocessor curve is shown for reference.]

This results in an optimum at 2-stage pipelines and 16 processors. Performance plots for issuing capacities of 16 and 48 are also shown. This is repeated with superscalar multiprocessor systems. It is interesting that in all these cases the optimum performance is obtained when the combined system is a multiprocessor with processors using 2- or 3-stage pipelines. This observation agrees well with the trace-driven, simulation-based findings of Lilja and Yew [LiY90]. Also, as the difference between the utilization factors (uns^P or uns^S, and uns^M) shrinks, the optimum shifts more in favor of pipelines, as observed by Lilja and Yew [LiY90].
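The Figure 6.10 experiment keeps the product of pipeline depth and processor count fixed. The sketch below only enumerates the candidate (depth, processors) splits; the throughput at each split would come from the full model of this chapter:

    def depth_processor_splits(issuing_capacity):
        """Enumerate (pipeline depth s, processor count N) pairs with a fixed
        combined issuing capacity s * N, as in the Figure 6.10 experiment."""
        return [(s, issuing_capacity // s)
                for s in range(1, issuing_capacity + 1)
                if issuing_capacity % s == 0]

    # For a capacity of 32: (1,32), (2,16), ..., (32,1); the model of this
    # chapter is then evaluated at each point to locate the optimum.
    print(depth_processor_splits(32))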

6.6 Summary

The analytical models developed in previous chapters and extended here allow easy, comparative evaluation of superpipelines, superscalars, and multiprocessors. The extended model, although simplistic in nature, is shown to be capable of deriving some useful results. It is shown that maximum throughput is not sensitive to the ratio of memory access time to network access delay. However, the number of pipelines (or processors) at which the maximum throughput is obtained is increasingly sensitive to this ratio as the memory access time increases. As a function of inter-iteration dependency distance, optimum throughput varies nonlinearly, whereas the corresponding optimum number of processors varies almost linearly. Finally, for programs with more coarse-grain parallelism, optimum performance is obtained in the multiprocessor configuration where each processor has hardware supporting fine-grain parallelism of degree two to four.

CHAPTER 7 CONCLUSIONS

This research has presented analytical approaches to optimal processor design. The model developed during this research is primarily targeted at, although not limited to, superscalar processors with dynamic scheduling. Starting with single-pipeline optimization, the model was gradually refined to gain insights into various performance tradeoffs associated with multiple-pipeline systems. Special attention was paid to the understanding of delays associated with different branch strategies, misprediction delay during beyond-basic-block execution, and the loss of throughput due to inherent dependencies in the source code. Throughout the dissertation, the model development and/or enhancement consists of three generic steps. First, a model is proposed based on known and/or expected performance characteristics of the system. Second, the proposed model is validated by correlating its predictions with published results and/or experimentally gathered performance measurements. Third, the validated model is used to gain new insights into performance limiting factors.

7.1 Summary

First, a survey of existing machines and literature was presented with a proposed classification of various approaches for exploiting fine-grain concurrency. Optimization of a single pipeline is discussed based on an analytical model. The predicted nature of the performance curves is found to be in close agreement with published results using simulation techniques. A model is also developed for comparing different branch strategies for single-pipeline processors in terms of their effectiveness in reducing the branch delay. Additional instruction traffic generated by the different branch strategies is also studied and is shown to be a useful criterion for choosing between equally well performing strategies. Such analytical techniques are extended to processors with multiple pipelines to study the tradeoffs associated with deeper versus multiple pipelines. An analytical model is developed for optimizing the size of an instruction window for machines with dynamic scheduling.

The cost associated with beyond-basic-block execution is examined via probability distributions that characterize the inherent parallelism in the instruction stream. The throughput prediction of the analytic model under resource and scope constraints is shown to be close to the measured static throughput of the compiler output for 24 benchmarks chosen from the SPEC, NAS, and Perfect benchmark suites. Further experiments provide misprediction delay estimates for these benchmarks under scope constraints, assuming beyond-basic-block, out-of-order execution and run-time scheduling. These results were derived using traces from the Multiflow TRACE SCHEDULING™ compacting C and FORTRAN 77 compilers. A simplified extension to the model to include multiprocessors is also proposed. The extended model is used to analyze combined systems, such as superpipelined multiprocessors and superscalar multiprocessors. It is shown that the number of pipelines (or processors) at which the maximum throughput is obtained is increasingly sensitive to the ratio of memory access time to network access delay as memory access time increases. Further, as a function of inter-iteration dependency distance, optimum throughput is shown to vary nonlinearly, whereas the corresponding optimum number of processors varies almost linearly. The predictions from the analytical model agree with similar results published using simulation-based techniques.

7.2 Contributions

The contributions of this research can be summarized as follows:

a) Simulation-based performance predictions, from single-pipeline optimizations to combined-system optimizations, have been analytically correlated. Relative to the previous simulation-based studies, this analytical approach is less time consuming, more flexible, and offers additional insights into the performance issues.

b) A comparative analysis of different branch strategies has been presented on a common analytical platform. Also, the additional instruction traffic associated with the branch strategies has been analyzed on a comparative basis. This aspect of branch strategies has not been reported in the published literature to date.

c) A validated model has been presented for optimizing the size of an instruction window for superscalar processors with beyond-basic-block, dynamic scheduling. Window sizes as large as 1024 instructions or more can be analyzed quickly. The published material on this tradeoff [AKT86, SJH89, Joh91] has been based solely on simulation-derived findings. The model developed can also offer insights into where a performance bottleneck might be: insufficient resources to exploit discovered parallelism, insufficient instruction stream parallelism, or insufficient scope of concurrency detection.

d) Although the performance potential of machine architectures can be compared in terms of parameters such as number of pipelines or processors, branch delay, cache miss delay, and so forth, the only common way of comparing programs has been in terms of their run time on a certain machine. This research proposes certain parameters, pδ and pω, as a way of comparing the performance potential of programs in terms of a quantitative measure of their inherent parallelism. The combination of pδ and pω provides quantitative insights into the cost-performance tradeoffs associated with exploiting fine-grain program parallelism.

e) Throughput estimates for a variety of benchmarks have been provided under resource and scope constraints, assuming out-of-sequence, beyond-basic-block execution. For some of the benchmarks, data has been provided for a scope as large as 1024 instructions. Previous studies have either been limited to within basic blocks [AKT86] or limited to simulations up to a window size of 32 instructions [SJH89, Joh91]. Assuming dynamic scheduling, the research also provides misprediction delay estimates for the analyzed benchmarks up to a lookahead of 32 basic blocks.

Most of the published work on processor performance has been based on simulations, which are a valuable tool for providing accurate performance estimates for the simulated program traces. The research presented in this dissertation seeks to complement previous work by providing an approach based on relatively simple, validated analytical models. We hope the contributions of this research will be appealing enough for some processor architect to try out some of the models for a future processor design.

CHAPTER 8 FUTURE RESEARCH

This chapter offers extensions of some of the ideas presented in previous chapters for future research work in this area.

8.1 Out-of-sequence Execution Versus Locality of Operand References

A sequence of successive memory requests that come from logically related, dependent operations exhibits locality of reference, both spatial and temporal. This locality is obscured when independent, logically unrelated operations are grouped together for simultaneous execution. This was ignored in Chapter 4, since the cache miss rate, 1 − h, was assumed unchanged as more and more operations from different pipelines were issued together. If the scope of concurrency detection is small, independent operations grouped together are likely to be of the same working set [Den70]. For example, during loop unrolling, if A[i] is in the cache then A[i+1] is likely to be in the cache also. Conversely, if the scope is large enough to group independent operations belonging to different working sets, it is unreasonable to expect a simultaneous cache hit for all of these references. Consider, for example, the two execution scenarios listed in Figure 8.1. The sequential case corresponds to purely sequential execution, whereas the parallel case allows out-of-sequence execution. For the sake of simplicity, the latter differs from the former only in that it permits simultaneous fetch of two unrelated operands A and D. In the parallel case, Fetch D can result in a cache miss and displace the line containing C. Subsequently, a miss on Store C may displace the line containing D, causing another miss during Store D. Both Store C and Store D could have been cache hits in the sequential case. Let Ns be the number of misses in the sequential code and Np be the number of misses in the parallel code. The discussion below is divided into three steps.

a) First, the impact of moving Fetch D on its own cache hit/miss probability, i.e., the hit/miss probability of operand D, is explained. Call it Impact-A.

Sequential:                          Parallel:

X:  Fetch A                          X:  Fetch A / Fetch D
    Fetch B                              Fetch B
    compute C := f(A,B)                  compute C := f(A,B)
    Store C                              Store C
Y:  Fetch D
    compute D := f(C,D)                  compute D := f(C,D)
    Store D                              Store D

Figure 8.1 Two execution scenarios

b) Second, the impact of removing Fetch D on the operand fetches in the vicinity following Y is analyzed, where Y is the program location associated with Fetch D in the sequential code. This impact is referred to as Impact-B in the following discussion.

c) Finally, the impact of introducing Fetch D on the cache miss probability of operand fetches in the vicinity following X is discussed, where X refers to the program location to which Fetch D is moved in the parallel code. This effect is referred to as Impact-C in the following discussion.

To keep things simple, it is also assumed that the references moved up during parallelization do not influence the cache hit/miss probability of each other. In other words, they are independent of each other. There may be some references to an operand that were scattered in the sequential code but get grouped together in the parallel code; after parallelization, these may change from cache miss to cache hit due to mutual influence. This effect is ignored.

Impact-A. The following possibilities exist regarding Fetch D, which is moved up in the parallel code: (a) cache miss in sequential, cache miss in parallel; (b) cache hit in sequential, cache hit in parallel; (c) cache hit in sequential, cache miss in parallel; and (d) cache miss in sequential, cache hit in parallel.

Cases (a) and (b) do not change the number of misses, Np, with respect to Ns. Therefore only the remaining two cases need to be examined. Considering Impact-A alone,

Np = Ns + R · [ Prob(hit in sequential but miss in parallel code)
              − Prob(hit in parallel but miss in sequential code) ]   {8.1}

where R = average number of operand references that are scheduled out-of-sequence. Before proceeding further, the following observation may be useful.

An observation: Consider the following more generalized version of the code sequences given above:

Sequential:               Parallel:

Xi:                       Xi:    op D
Xi+1:                     Xi+1:
 ...                       ...
Xi+n:  op D               Xi+n:

Let op = Fetch. In the parallelized code, since the read reference to operand D is moved up from Xi+n to Xi, it implies that program locations Xi+n-1 through Xi+1 do not contain any write reference to the operand D. Otherwise, the move would violate an essential data dependency. These intermediate locations are not likely to contain any read references to operand D either, for the following reasons: (a) if there is a preceding Fetch D, the compiler is expected to move that reference instead of the one at Xi+n; (b) if there are some intermediate read references to the operand D, it would not be a smart register allocation scheme, since the register being used in the parallelized code to hold the operand from Xi to Xi+n could also have been used in the sequential code to get rid of the intermediate read references.

Let op = Store. Again, program locations Xi+1 through Xi+n-1 cannot contain any read or write references to operand D, because that would imply that the move in the parallelized code violates order and output dependencies, respectively. Note that renaming techniques for bypassing these dependencies are being ignored here. Therefore, it can be concluded that when a particular operand reference gets moved up during parallelization, there are no additional references in the sequential code to that operand during the scheduling distance. Also note that if one were limited solely by data dependencies, an operand reference being moved up could only be stopped by another reference to the same operand and therefore would always be a cache hit. In other words, if scheduling were restricted only by data dependencies and not by resource or control dependencies, all the relocated references would be cache hits due to their grouping with their previous references. This can be used to calculate a lower bound for Np.

Assume that a particular memory location is bound to a specific operand all through the program. Refer to Figure 8.2. A quiet-period is defined as the time period between successive references to the same location. Typically, the minimum quiet period would be determined by the compiler's inability to hold on to a temporary (intermediate) result, which in turn is a function of the number of available registers and so on. Hence it is likely to show an identical distribution on a given system (at least for the same type of variables, such as global or local). On the other hand, the maximum or actual quiet-period would be a function of the specific program context and hence may not have a distribution invariant across different programs. Now consider a main memory location's multiple entries into and exits from the cache, as depicted in Figure 8.3. An operand reference is considered a virgin-hit if t0 > 0. In other words, the first hit to a prefetched variable is called a virgin-hit. Note that t0 = 0 for a variable fetched on a cache miss. Assume that an average distribution exists for the parameters shown in Figures 8.2 and 8.3. Using the terminology developed so far, the probability terms associated with Equation (8.1) can be calculated.

Prob(hit in seq. but miss in para.) =

Prob(miss in para.) · Prob(hit in seq. | miss in para.)   {8.2}

Prob(miss in para.) = Prob(scheduling distance > E[t0])   {8.3}

where the conditional probability in Equation (8.2) is given by the virgin-hit probability,

ε = n · E[cache-period / cache-cycle]quiet-period ,

and n = expected number of cache-cycles in the time period = (scheduling distance − t0). The quiet-period subscript above implies that the ratio statistics for cache-period to cache-cycle should preferably be collected during the quiet period of the variable.

Prob(miss in sequential but hit in parallel) =

Prob(hit in parallel) · Prob(miss in sequential | hit in parallel).   {8.4}

Unlike the previous case, a hit in the parallel code does not qualify the sequential miss in any particular way, so the conditional probability in Equation (8.4) above can be considered the same as the normal cache miss probability, whereas

Prob(hit in parallel) = ε with t0 = 0.   {8.5}

Impact-B and Impact-C. Assume an LRU (least recently used) replacement policy for the cache. Cache hits can be classified into those to the most recently used line (MRU) and those to a not most recently used line (NMRU). If the hit in the sequential case (the parallel case) is an MRU hit, the original operand reference (the reference that is scheduled out-of-sequence) has no impact on its immediate surroundings. On the other hand, both an NMRU hit and a miss can have an impact on the following surroundings.

[Figure 8.2: Reference pattern for a memory location bound to a certain logical operand. Read (R) and write (W) references are shown along a time axis (in instructions), from the first to the last reference (the life-span of the operand), with quiet periods between successive references.]

[Figure 8.3: Entries into and exits from the data cache for a memory location bound to a certain logical operand, shown along a time axis (in instructions). Each stay in the cache is a cache-period within a cache-cycle; t0 marks the time of the first reference to the location after the cache-line entry, and the cache-period ends with the last reference before the cache-line exit.]

Every reference that is scheduled out-of-sequence adds the following number of misses:

γ1 = Prob(hit in para.) · α1 + Prob(miss in para.) · α2
   − Prob(hit in seq.) · α1 − Prob(miss in seq.) · α2 ,   {8.6}

where

α1 = Prob(NMRU hit) · (fraction of dead lines among NMRU lines) · (average number of misses per dead line)

α2 = (fraction of misses that fetch dead lines) · (average number of misses per dead line)

Note that the hit/miss probabilities for the sequential and the parallel cases are the same as those calculated during the Impact-A calculations. A cache line is considered dead if it is going to be flushed out before its next reference; otherwise it is called a live line. Also note that the probability of a dead line in the sequential case becoming a live line after the relocated reference hits in the parallel case has been ignored. To calculate the additional number of hits, note that an NMRU hit to a live line does not add any hits, since the line would have become most recently used anyway on its subsequent hit. On the other hand, a cache miss that results in fetching a live line does add hits. Therefore, the additional number of hits per relocated reference is given by

γ2 = Prob(miss in para.) · α3 − Prob(miss in seq.) · α3 ,   {8.7}

where

α3 = (fraction of misses that fetch live lines) · (average number of hits per live line).

As a result, considering impacts B and C only,

Np=Ns+R*(Yi-“fe) {8.8} where, Yi and Y2 are as given by Equations (8.6) and (8.7). Finally, Equations (8.1) and (8.8) can be combined to yield the complete picture. Note that the scheduling distance probability ih Equation (8.3) can be computed using the approach described in Chapter 5. The regaining probability terms in Equations (8.3) through (8.7) can be computed using a cache simulator, modified to compute the virgin hit probability also. 153

8.2 Cost/Performance Tradeoffs for Concurrency Detection in Different Execution Phases

As explained in Chapter 1, before execution any end-user task goes through stages of transformation from the level of algorithm formulation to high-level language specification, followed by assembly and possibly microcode translations. The available amount of parallelism increases as the transformation proceeds from the algorithm level to the microcode level. But detection of the additional amount of available parallelism at the later stages of transformation has additional cost associated with it. This implies a cost-performance design tradeoff aimed at extracting a large amount of parallelism without incurring prohibitive cost. Machines have been built with parallelism detection and scheduling at all of these stages of transformation (refer to Chapter 1). The ability to quantify the amount of parallelism and the cost of its extraction, as discussed in Chapter 5 (using the pδ and pω distributions), can be a useful tool in analyzing the cost-performance tradeoff for the proper level of concurrency detection. Chapter 5 provides the plots of these statistics for a VLIW machine (Multiflow), and efforts are underway to collect the same data at the level of high-level language specification.

There are two other factors that influence the available amount of parallelism and the associated cost. First, the language used for specification at the high level or at the assembly level. There may be built-in dependencies in the specification syntax. For example, if separate instructions are used for setting the condition code and branching, then almost invariably the condition code setting would be followed by a branch that is dependent on the preceding instruction that set the condition code. Such built-in dependencies would limit the available amount of parallelism for a given scope. By collecting the pδ and pω distributions for a variety of languages on a common set of application tasks, a quantitative comparison can be made on the basis of the amount of parallelism exposed and the associated cost. Future research is being targeted at comparing different high-level languages (such as FORTRAN and C) as well as some assembly-level instruction sets (such as the Intel, the Motorola 68x, the IBM RS6000, and the Sun Sparc).

Second, whether the concurrency detection is done at compile time or run time has an impact on the amount of available parallelism and the cost of its extraction. For example, at compile time, even a scope as large as several hundred instructions is feasible without a very high cost in terms of space and compilation time. But at run time, there is a significant cost associated with a large scope, in terms of the number of instructions that simultaneously need to be examined and the number of pending branches.

8.3 Other Measures for Distance Between Instruction Pairs

This dissertation is primarily aimed at analyzing the instruction window size tradeoffs for a machine with dynamic (run time) scheduling and speculative execution. The typical input for such run-time schedulers is the dynamic instruction sequence. Consequently, the density of available parallelism and the cost of its extraction (as measured using pδ and pω, respectively) have been estimated as a function of the number of such instructions being examined (W) and the number of pending branches (L). There is another reason for describing pδ and pω in terms of the number of intervening instructions. As indicated in Chapter 5, intuitively one would assume that the output of an instruction is more likely to be consumed in the immediate vicinity than much farther away. At the assembly or microcode level, this immediate vicinity can be quantified in terms of the number of following instructions. But at the level of high-level languages, this may not be a good measure of immediate vicinity. For example, consider a machine capable of directly executing instructions specified in a high-level language, such as the two pieces of code in Figure 8.4. Intuitively one would assume B[i,j] to be equally likely to be dependent on A[i-1,j] and A[i,j-1]. But in terms of the number of run-time intervening instructions, the dependent instructions in Example (i) are separated by 6 instructions, whereas those in Example (ii) are separated by 51 instructions. Although different iteration instances of an instruction may be at varying distances, the probability of dependence for any pair should be expected to be close if they are equidistant along any one of the dimensions. Thus, at the level of high-level language specification, when multi-dimensional references are involved, the immediate vicinity may be better characterized using some measure that treats every dimension equally. Such measures of parallelism can also be used as heuristics in choosing the dimensions that would be most profitable to unroll in the loop quantization techniques discussed in [Nic88]. Finally, for machines doing static scheduling, a better distance measure between two instructions may be the number of arcs in the uncompacted program flow graph.
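The 6-versus-51 separation quoted above can be reproduced by linearizing the dynamic statement stream of the Figure 8.4 loops; the statement numbering (five statements per iteration, producer at statement 3, consumer at statement 4) follows the figure:

    def dynamic_separation(di, dj, size=10, body=5):
        """Run-time distance (in statements) between statement 3 of iteration
        (i-di, j-dj), which produces an element of A, and statement 4 of
        iteration (i, j), which consumes it, for the row-major double loops of
        Figure 8.4 (`size` iterations per loop level, `body` statements per iteration).
        Positions are measured relative to the consumer's iteration at (0, 0):
        statement s of iteration (i, j) sits at (i*size + j)*body + s."""
        producer_pos = ((-di) * size + (-dj)) * body + 3
        consumer_pos = 4
        return consumer_pos - producer_pos

    print(dynamic_separation(0, 1))   # Example (i):  B[i,j] uses A[i,j-1] -> 6
    print(dynamic_separation(1, 0))   # Example (ii): B[i,j] uses A[i-1,j] -> 51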

8.4 Recursive Performance Modelling

As mentioned earlier in Chapter 1, given a certain end-user task, the most obvious performance measure is the amount of real time spent in performing the task, as measured (or perceived) by the end user. Assume a synchronous computer system with a global clock. The frequency of this clock is determined by the peak rate at which the system is designed to deliver results. Imagine the user monitoring the system output every clock cycle to compute the actual throughput. During every cycle, either a result is available or just a bubble (implying no result).

    for i = 1 to 10                   for i = 1 to 10
      for j = 1 to 10                   for j = 1 to 10
    1:                                1:
    2:                                2:
    3:   A[i,j] := C[i,j]             3:   A[i,j] := C[i,j]
    4:   B[i,j] := A[i,j-1]           4:   B[i,j] := A[i-1,j]
    5:                                5:
      end;                              end;
    end;                              end;

        Example (i)                       Example (ii)

Figure 8.4 Dependence across iterations

The frequency of these bubbles at the system output is enough to compute the actual system throughput. The performance modelling approach described here offers some suggestions for recursively computing the probability of receiving bubbles at the system output. The probability of bubble transmission, pt, from a system stage to its successor can be computed using the probability of bubble generation, pg, and that of bubble reception, pr. A stage is said to receive a bubble if the preceding stage does not provide any intermediate result during a cycle. A stage is said to generate a bubble if the duration of its computation on some input from the preceding stage exceeds the clock cycle. A stage transmits a bubble either if it generates one or if it receives one while not generating one. Mathematically,

pt = pg + (1 − pg) · pr .

One can compute pg using a detailed model for that stage, and pr is the same as pt of the preceding stage. Thus, recursively, the probability of bubbles being transmitted to the user (which determines the user-perceived system throughput) can be computed. For example, the model developed in Chapter 2 can be used as the basis for computing pg for individual pipelines, or the pδ information from Chapter 5 can be used for computing the bubbles generated by the scheduler. One advantage of such an approach lies in the fact that while a low-level detailed model can be used for computing pg for a given stage, the low-level model can then be abstracted using the pt information passed to the next stage. As a result, an analytical model for the entire system (i.e., processor, memory, and I/O combined) may also be feasible. The alternative approach would be a simulation-based low-level model for the entire system, which would be many times slower (most likely, prohibitively slow) than this proposed analytic approach.
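A minimal sketch of the recursion follows; the per-stage bubble-generation probabilities are hypothetical placeholders for values that would come from detailed stage models such as those of Chapters 2 and 5:

    def end_to_end_bubble_probability(p_generate):
        """Chain pt = pg + (1 - pg) * pr through a series of stages, where each
        stage's pr is the transmission probability of its predecessor; returns
        the bubble probability seen at the system output by the user."""
        p_r = 0.0                       # the first stage receives no bubbles
        for p_g in p_generate:
            p_r = p_g + (1.0 - p_g) * p_r
        return p_r

    # Hypothetical per-stage generation probabilities (fetch, schedule, execute, memory):
    p_bubble = end_to_end_bubble_probability([0.02, 0.10, 0.05, 0.08])
    print("bubble probability:", p_bubble, "user-perceived throughput:", 1.0 - p_bubble)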

LIST OF REFERENCES

[AKT86] R. D. Acosta, J. Kjelstrup, and H. C. Torng, "An instruction issuing approach to enhancing performance in multiple functional unit processors", IEEE Trans. on Computers, vol. C-35, Sep. 1986, pp. 815-828.

[AgC87] T. Agerwala and J. Cocke, "High performance reduced instruction set processors", Technical Report RC12434 (455845), IBM Thomas J. Watson Research Center, Yorktown Heights, NY, Jan. 1987.

[ArG82] Arvind and K. P. Gostelow, "The U interpreter", IEEE Computer, vol. 15, Feb. 1982, pp. 42-50.

[AST67] D. W. Anderson, F. J. Sparacio, and R. M. Tomasulo, "The IBM System/360 Model 91: Machine philosophy and instruction handling", IBM Journal of Research and Development, vol. 11, Jan. 1967, pp. 8-24.

[BBE70] J. L. Baer, D. P. Bovet, and G. Estrin, "Legality and other properties of graph models of computations", J. Ass. Comput. Mach., vol. 17, July 1970, pp. 543-552.

[BBL91] D. Bailey, J. Barton, T. Lasinski, and H. Simon, "The NAS parallel benchmarks", Report RNR-91-002, NASA Ames Research Center, Jan. 1991.

[Ban79] U. Banerjee, Speedup of Ordinary Programs, Ph.D. Dissertation, Dept. of Computer Science, Univ. of Illinois, Oct. 1979.

[CGL89] R. Cohn, T. Gross, M. Lam, and P. S. Tseng, "Architecture and compiler tradeoffs for a long instruction word microprocessor", Proc. of ASPLOS III, April 1989, pp. 2-14.

[CKP90] G. Cybenko, L. Kipp, L. Pointer, and D. Kuck, "Supercomputer performance evaluation and the Perfect benchmarks", CSRD Rept. No. 965, University of Illinois, Urbana, IL, March 1990.

[CNO88] R. P. Colwell, R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman, "A VLIW architecture for a trace scheduling compiler", IEEE Trans. on Computers, vol. C-37, Aug. 1988, pp. 967-979.

[Cot65] L. W. Cotten, "Circuit implementations of high-speed pipeline systems", AFIPS Fall Joint Computer Conference, 1965, pp. 489-504.

[Cyt86] R. Cytron, "Doacross: Beyond vectorization for multiprocessors (Extended Abstract)", 1986 International Conference on Parallel Processing, pp. 836-844.

[DeL87] J. A. DeRosa and H. M. Levy, "An evaluation of branch architectures", Proc. 14th Annual Symposium on Computer Architecture, June 1987, pp. 10-16.

[Den70] P. J. Denning, "Virtual memory", Computing Surveys, vol. 2, Sep. 1970, pp. 153-188.

[DeM74] J. B. Dennis and D. P. Misunas, "A preliminary architecture for a basic data-flow processor", Proc. 2nd Annual Symposium on Computer Architecture, 1974, pp. 126-132.

[DiM87] D. R. Ditzel and H. R. McLellan, "Branch folding in the CRISP microprocessor", Proc. 14th Annual Symposium on Computer Architecture, June 1987, pp. 2-9.

[DuF89] P. Dubey and M. J. Flynn, "Branch strategies: Modelling and optimization", Technical Report No. CSL TR 90-411, Computer Systems Laboratory, Stanford University, Feb. 1990.

[DuF90] P. K. Dubey and M. J. Flynn, "Optimal pipelining", Journal of Parallel and Distributed Computing, Jan. 1990, pp. 10-19.

[DuF91] P. K. Dubey and M. J. Flynn, "Branch strategies: modelling and optimization", IEEE Trans. on Computers, to appear.

[Faw75] B. K. Fawcett, Maximal Clocking Rates for Pipelined Digital Systems, M.S. Thesis, Dept. of Elec. Eng., University of Illinois at Urbana-Champaign, 1975.

[Fis81] J. A. Fisher, "Trace scheduling: A technique for global microcode compaction", IEEE Trans. on Computers, vol. C-30, July 1981, pp. 478-490.

[Fis83] J. Fisher, "VLIW architectures and the ELI-512", Proc. 10th Annual Symposium on Computer Architecture, June 1983, pp. 140-150.

[Fly72] M. J. Flynn, "Some computer organizations and their effectiveness", IEEE Trans. on Computers, vol. C-21, no. 9, Sep. 1972, pp. 948-960.

[FlH79] M. J. Flynn and W. L. Hoevel, "A theory of interpretive architectures: ideal language machines", Technical Report 170, Computer Systems Laboratory, Stanford University, Feb. 1979.

[FoR72] C. C. Foster and E. M. Riseman, "Percolation of code to enhance parallel dispatching and execution", IEEE Trans. on Computers, vol. C-21, Dec. 1972, pp. 1411-1415.

[GaH80] L. C. Garcia and T. Huynh, "Storage fetch contention reduction using instruction branch prediction", IBM Technical Disclosure Bulletin, vol. 23, no. 6, 1980.

[GaJ79] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman Publishing Co., 1979.

[GrH86] T. R. Gross and J. Hennessy, "Optimizing delayed branches", Proc. 15th Workshop on Microprogramming, 1986.

[GKT90] G. F. Grohoski, J. A. Kahle, L. E. Thatcher, and C. R. Moore, "Branch and fixed-point instruction execution units", IBM RISC System/6000 Technology, Publication No. SA23-2619, IBM Corporation, 1990.

[Gro83] T. Gross, "Code optimization of pipeline constraints", Technical Report No. CSL TR 83-255, Computer Systems Laboratory, Stanford University, Dec. 1983.

[HaF72] T. G. Hallin and M. J. Flynn, "Pipelining of arithmetic functions", IEEE Trans. on Computers, vol. C-21, no. 8, Aug. 1972, pp. 880-886.

[HwP87] W-M. Hwu and Y. N. Patt, "Checkpoint repair for high performance out-of-order execution machines", Proc. 14th Annual Symposium on Computer Architecture, June 1987, pp. 18-26.

[HsD86] P. Y. T. Hsu and E. S. Davidson, "Highly concurrent scalar processing", Proc. 13th Annual Symposium on Computer Architecture, June 1986, pp. 386-395.

[Joh91] W. M. Johnson, Superscalar Microprocessor Design, Prentice Hall, 1991.

[JoW89] N. P. Jouppi and D. W. Wall, "Available instruction-level parallelism for superscalar and superpipelined machines", Proc. of ASPLOS III, April 1989, pp. 272-282.

[KaM66] R. M. Karp and R. E. Miller, "Properties of a model for parallel computations: Determinacy, termination, queueing", SIAM J. of Applied Math., Nov. 1966, pp. 1390-1411.

[KMC72] D. Kuck, Y. Muraoka, and S. Chen, "On the number of operations simultaneously executable in Fortran-like programs and their resulting speedup", IEEE Trans. on Computers, vol. C-21, Dec. 1972, pp. 1293-1310.

[Kun82] H. T. Kung, “Why systolic architectures ?” , IEEE Computer, Jan. 1982, pp. V 37-46. :

[KuS86] S. R. Kunkel and J. E. Smith, “Optimal pipelining in ” , Proc. 13th Annual Symposium on Computer Architecture, 1986, pp. 404-411.

[Lam88] M. Lam, “Software pipelining: An effective scheduling technique for VLIW machines” , Proc. SIGPLAN '88 Conf. Prog. Lang. Design and Implementation, June 1988, pp. 318-328.

[LeS84] J. K, Lee and A. J. Smith, “Branch prediction strategies and branch target buffer design” , IEEE Computer, vol. 17, Jan. 1984, pp. 6-22.

[LiY90] D. J. Lilja and P. C. Yew, “Comparing parallelism extraction techniques: Superscalar processors, pipelined processors, and multiprocessors”, International Conference on Parallel Processing, 1990, pp. 1-563 -1-564, and “The performance potential of fine-Grain and coarse-grain parallel architectures” , Report No. 954, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, June 1990.

[McH86] S. McFarling and J. Hennessey, “ Reducing the cost of branches” , Proc. 13th Annual Symposium on Computer Architecture, June 1986, pp. 396-403.

[Mil73] R. E. Miller, “A comparison of some theoretical models of parallel computation” , IEEE Trans, on Computers, Aug. 1973, pp. 710-717.

[NIF84] A. Nicolau and J. Fisher, “Measuring the parallelism available for very long instruction word architectures” , IEEE Trans, on Computers, vol. C-33, Nov. 1984, pp. 968-976.

[Nic85] A. Nicolau, “Uniform parallelism exploitation in ordinary programs” , Proc. International Conference on Parallel Processing, Aug. 1985, pp. 614-618.

[Nic88] A. Nicolau “Loop Quantization: A generalized loop unwinding technique” Journal of Parallel and Distributed Computing, vol. 5, Oct. 1988, pp. 568- 586. / [Nic89] A. Nicolau, “ Run-time disambiguation: coping with statically unpredictable dependencies” , IEEE Tirahs. on Computers; vbl. C-38, May 1989, pp. 663- 678. tNil80] N. J. Nilsson, Fundamentals of Artificial Intelligence, Tioga Publishing Cd., 1980.

[PeS77] B. L. Peuto and L. J. Shustek, "Current issues in the architecture of microprocessors", IEEE Computer, Feb. 1977, pp. 20-25.

[PHS85] Y. N. Patt, W-M. Hwu, and M. Shebanow, "HPS, a new microarchitecture: Rationale and introduction", Proc. 18th Annual Workshop on Microprogramming, Dec. 1985, pp. 103-108.

[PaD76] J. H. Patel and E. S. Davidson, "Improving the throughput of a pipeline by insertion of delays", Proc. 3rd Annual Symposium on Computer Architecture, June 1976, pp. 159-164.

[Pol86] C. D. Polychronopoulos, On Program Restructuring, Scheduling and Communication for Parallel Processor Systems, Ph.D. dissertation, Dept. of Computer Science, Univ. of Illinois, Aug. 1986.

[RaG81] B. R. Rau and C. D. Glaeser, "Some scheduling techniques and an easily schedulable horizontal architecture for high performance scientific computing", Proc. 14th Annual Workshop on Microprogramming, Oct. 1981, pp. 183-198.

[RYY89] B. Rau, D. Yen, W. Yen, and R. A. Towle, "The Cydra 5 departmental supercomputer", Computer, vol. 22, Jan. 1989, pp. 12-35.

[RiF72] E. M. Riseman and C. C. Foster, "The inhibition of potential parallelism", IEEE Trans. on Computers, vol. C-21, Dec. 1972, pp. 1405-1411.

[Sha77] H. D. Shapiro, "A comparison of various methods for detecting and utilizing parallelism in a single instruction stream", Proc. International Conference on Parallel Processing, Aug. 1977, pp. 67-76.

[Smi81] J. E. Smith, "A study of branch prediction strategies", Proc. 8th Annual Symposium on Computer Architecture, May 1981, pp. 135-148.

[SmP85] J. E. Smith and A. R. Pleszkun, "Implementation of precise interrupts in pipelined processors", Proc. 12th Annual Symposium on Computer Architecture, June 1985, pp. 36-47.

[SJH89] M. D. Smith, M. Johnson, and M. A. Horowitz, "Limits on multiple instruction issue", Proc. of ASPLOS III, Boston, MA, April 1989, pp. 290-302.

[SoV87] G. S. Sohi and S. Vajapeyam, "Instruction issue logic in high-performance interruptible pipelined processors", Proc. 14th Annual Symposium on Computer Architecture, June 1987, pp. 27-34.

[SoV89] G. S. Sohi and S. Vajapeyam, "Tradeoffs in instruction format design for horizontal architectures", Proc. of ASPLOS III, Boston, MA, April 1989, pp. 15-25.

[TjF70] G. S. Tjaden and M. J. Flynn, "Detection and parallel execution of independent instructions", IEEE Trans. on Computers, vol. C-19, Oct. 1970, pp. 889-895.

[Tja72] G. S. Tjaden, Representation and Detection of Concurrency Using Ordering Matrices, Ph.D. dissertation, Johns Hopkins Univ., Baltimore, MD, 1972.

[TjF73] G. S. Tjaden and M. Flynn, "Representation of concurrency with ordering matrices", IEEE Trans. on Computers, Aug. 1973, pp. 752-761.

[Tho70] J. E. Thornton, Design of a Computer: The Control Data 6600, Glenview, IL, Scott, Foresman and Co., 1970.

[Tom67] R. M. Tomasulo, "An efficient algorithm for exploiting multiple arithmetic units", IBM Journal of Research and Development, vol. 11, Jan. 1967, pp. 25-33.

[Uht86] A. K. Uht, "An efficient hardware algorithm to extract concurrency from general-purpose code", Proc. of the Nineteenth Annual Hawaii International Conference on System Sciences, 1986, pp. 41-50.

[Wed82] R. G. Wedig, Detection of Concurrency in Directly Executed Language Instruction Streams, Ph.D. dissertation, Stanford University, Stanford, CA, June 1982.

[WeS84] S. Weiss and J. E. Smith, "Instruction issue logic in pipelined supercomputers", Proc. 11th Annual Symposium on Computer Architecture, June 1984, pp. 110-118.

[WeS87] S. Weiss and J. E. Smith, "A study of scalar compilation techniques for pipelined supercomputers", Proc. of ASPLOS II, Palo Alto, CA, Oct. 1987, pp. 105-109.

Appendix A

A.1 Computation of Probabilities in Table 3.2

The four probabilities in Table 3.2, p_{n,n}, p_{n,b}, p_{b,n}, and p_{b,b}, can be computed in terms of the probability of a branch-to-be-taken prediction, p_t, and the probability of a correct prediction, p_c, using the following equations:

p_{n,n} = (1 - p_t) * p_c

p_{n,b} = (1 - p_t) * (1 - p_c)

p_{b,n} = p_t * (1 - p_c)

p_{b,b} = p_t * p_c

Consider a branch strategy which, on average, predicts 6 out of every 10 branches as likely to be taken and for which 2 out of every 10 predictions are incorrect. This yields p_t = 0.6 and p_c = 0.8, which leads to p_{n,n} = 0.32, p_{n,b} = 0.08, p_{b,n} = 0.12, and p_{b,b} = 0.48. This means that, on average, out of every 100 branches, 32 are not taken as predicted, 8 are taken in spite of a not-to-be-taken prediction, 12 are not taken though predicted as likely to be taken, and 48 are taken in accordance with the prediction. Therefore, 56 out of every 100 branches are taken. The number of actually taken branches is independent of the branch strategy employed. It is a characteristic of the program environment under execution and can be expressed as

p_sb = (1 - p_t) * (1 - p_c) + p_t * p_c .
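The worked example above can be checked with a short script (a minimal sketch in Python; the function name and print formatting are illustrative, not part of the model):

```python
# Probabilities of the four prediction/outcome cases (Table 3.2 notation).
def outcome_probabilities(p_t, p_c):
    """p_t: probability of a to-be-taken prediction; p_c: probability it is correct."""
    p_nn = (1 - p_t) * p_c          # predicted not taken, actually not taken
    p_nb = (1 - p_t) * (1 - p_c)    # predicted not taken, actually taken
    p_bn = p_t * (1 - p_c)          # predicted taken, actually not taken
    p_bb = p_t * p_c                # predicted taken, actually taken
    return p_nn, p_nb, p_bn, p_bb

p_nn, p_nb, p_bn, p_bb = outcome_probabilities(p_t=0.6, p_c=0.8)
print(round(p_nn, 2), round(p_nb, 2), round(p_bn, 2), round(p_bb, 2))  # 0.32 0.08 0.12 0.48

# Fraction of branches actually taken; independent of the prediction strategy.
p_sb = p_nb + p_bb
print(round(p_sb, 2))  # 0.56
```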

For a branch prediction strategy (such as BTB) with a certain correct prediction probability, p_c, the probability of a to-be-taken prediction can be written using the above equation as

p_t = (p_sb - (1 - p_c)) / (2 * p_c - 1) .

A.2 Additional Instruction Traffic Calculation Under Freeze Conditions

Let

D_a = average number of clocks spent during a target-address-calculation freeze
D_f = average number of clocks spent in case of a page fault during target fetch
p_a = probability of a target-address-calculation freeze
p_f = probability of a target-fetch freeze
N = maximum possible instruction fetches assuming no freeze
pos(l) = l for l > 0
       = 0 for l <= 0

The additional instruction traffic is

I^+ = m_1 + m_2 + m_3 + m_4 , where

m_1 = wasted instruction fetches, assuming an address-calculation freeze as well as a target-fetch freeze
    = pos(N - D_a - D_f) * p_a * p_f

m_2 = wasted instruction fetches, assuming an address-calculation freeze but no target-fetch freeze
    = pos(N - D_a) * p_a * (1 - p_f)

m_3 = wasted instruction fetches, assuming no address-calculation freeze but a target-fetch freeze
    = pos(N - D_f) * (1 - p_a) * p_f

m_4 = wasted instruction fetches, assuming no address-calculation freeze and no target-fetch freeze
    = N * (1 - p_a) * (1 - p_f)

For the sake of brevity, in the following sections the above calculation will be written as

I^+ = N - p_a * D_a - p_f * D_f ,

where - refers to the probability-based reduction in N explained above. This calculation assumes that no additional instruction traffic is generated during the freeze conditions. Excess instruction traffic would be generated if a freeze condition requires any instruction fetches (for example, during page-fault handling). Incorrect predictions not only result in wasted instruction fetches but may also result in unnecessary operand fetches. For this analysis, the increase in data traffic and any interference of operand fetches with instruction fetches are ignored.
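The shorthand just introduced can be checked against the full four-case enumeration. The sketch below (Python; the function and parameter names are illustrative) evaluates m_1 through m_4 directly; whenever N is large enough that none of the pos() clamps is active, the result equals N - p_a*D_a - p_f*D_f:

```python
def pos(x):
    """pos(x) = x if x > 0, and 0 otherwise (as defined above)."""
    return x if x > 0 else 0

def wasted_fetches(N, p_a, D_a, p_f, D_f):
    """Expected wasted instruction fetches, enumerating the four freeze combinations."""
    m1 = pos(N - D_a - D_f) * p_a * p_f        # address-calculation and target-fetch freezes
    m2 = pos(N - D_a) * p_a * (1 - p_f)        # address-calculation freeze only
    m3 = pos(N - D_f) * (1 - p_a) * p_f        # target-fetch freeze only
    m4 = N * (1 - p_a) * (1 - p_f)             # no freeze
    return m1 + m2 + m3 + m4

# Example: N = 8 fetch slots, a 10% chance of a 2-clock address-calculation freeze,
# and a 5% chance of a 3-clock target-fetch freeze.
print(wasted_fetches(N=8, p_a=0.1, D_a=2, p_f=0.05, D_f=3))
# No pos() clamp is active here, so this equals N - p_a*D_a - p_f*D_f = 8 - 0.2 - 0.15 = 7.65
```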

A.3 Calculation of Branch Delay and Wasted Instruction Traffic

Table A.1 contains the symbols used frequently in the calculations that follow. Some of these symbols were introduced in Chapter 3 and are reproduced here for easy reference.

Predict branch never taken (PBNT):

p_t = 0

K_{n,n} = 0

K_{n,b} = (s_b - 1) + p_a * D_a + p_f * D_f

I^+_{n,n} = 0

I^+_{n,b} = s_b - 1
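Since PBNT never predicts a taken branch (p_t = 0), only the n,b case contributes, and the fraction of branches that are actually taken is p_sb. A minimal sketch follows (Python; names are illustrative), assuming that the average branch delay and wasted traffic are the taken-branch fraction times the corresponding n,b terms above:

```python
def pbnt_cost(s_b, p_sb, p_a, D_a, p_f, D_f):
    """Average branch delay K and excess fetches I+ for predict-branch-never-taken."""
    K_nb = (s_b - 1) + p_a * D_a + p_f * D_f   # delay paid when the branch is actually taken
    I_nb = s_b - 1                             # sequential fetches discarded when taken
    K = p_sb * K_nb                            # not-taken branches contribute nothing (K_nn = 0)
    I_plus = p_sb * I_nb
    return K, I_plus

# Example: 4 stages until branch resolution, 56% of branches taken, modest freeze probabilities.
print(pbnt_cost(s_b=4, p_sb=0.56, p_a=0.1, D_a=2, p_f=0.05, D_f=3))
```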

Loop buffers (LB):

p_t = 0

K_{n,n} = 0

K_{n,b} = ((s_b - 1) + p_a * D_a + p_f * D_f) * (1 - p_lh) + ((s_b - 2) + p_a * D_a) * p_lh

I^+_{n,n} = 0

I^+_{n,b} = (s_b - 1) * (1 - p_lh)

Table A.1 Commonly used symbols.

Average branch frequency                                           b
Overall fraction of successful branches
  (conditional and unconditional combined)                         p_sb
Number of pipeline stages until branch resolution                  s_b
Number of pipeline stages until unconditional branch resolution    s_bu
Number of pipeline stages until conditional branch resolution      s_bc
Number of buffer substages in the instruction fetch stage          s_f
Probability of freeze during target address formation              p_a
Duration of target-address-calculation freeze                      D_a
Probability of freeze during target fetch                          p_f
Duration of target-fetch freeze                                    D_f
Probability of loop buffer hit                                     p_lh
Probability of BTB hit                                             p_th
Probability of correct target address prediction from BTB          p_ct
Probability of BTB hit for a non-branch instruction                p_w
Average number of delay slots filled in delayed branch approach    u
Correct prediction probability                                     p_c
Branch-to-be-taken prediction probability                          p_t

Pre-calculate target address (PTA):

p_t = 0

K_{n,n} = 0

K_{n,b} = (s_b - 1) + p_a * pos(D_a - (s_b - 1 - s_f)) + p_f * D_f

I^+_{n,n} = 0

I^+_{n,b} = s_b - 1

Target fetch in the OF-slot (FTOF):

p_t = 0

K_{n,n} = (1 - p_a) * (1 - p_f) + (1 - p_a) * p_f

K_{n,b} = α_1 * p_a * p_f + α_2 * p_a * (1 - p_f) + α_3 * (1 - p_a) * p_f + α_4 * (1 - p_a) * (1 - p_f) , where

α_1 = s_b - 1 + D_a + D_f

α_2 = s_b - 1 + D_a

α_3 = s_b - 1 + D_f

α_4 = s_b - 2

I^+_{n,n} = (1 - p_a) * (1 - p_f)

I^+_{n,b} = β_1 * p_a * p_f + β_1 * p_a * (1 - p_f) + β_2 * (1 - p_a) * p_f + β_2 * (1 - p_a) * (1 - p_f) , where β_1 = s_b - 1

Predict branch always taken (PBAT):

p_t = 1

K_{b,n} = s_b - 1 - s_f

K_{b,b} = s_f + p_a * D_a + p_f * D_f

I^+_{b,n} = (s_b - 1 - s_f) - p_a * D_a - p_f * D_f

I^+_{b,b} = s_f

Predict branch always taken with target copy (PTTC):

p_t = 1

K_{b,n} = s_b - 1

K_{b,b} = p_a * pos(D_a - (s_b - 1 - s_f)) + p_f * D_f

I^+_{b,n} = s_b - 1

I^+_{b,b} = 0

Fetch both paths (FBP):

p_t = 1

K_{b,n} = 0

K_{b,b} = s_f + p_a * D_a + p_f * D_f

I^+_{b,n} = (s_b - 1 - s_f) - p_a * D_a - p_f * D_f

I^+_{b,b} = s_b - 1

Delayed branch (DB):

p_t = 0

K_{n,b} = pos(s_b - 1 - u) + p_a * D_a + p_f * D_f

K_{n,n} = 0

I^+_{n,b} = pos(s_b - 1 - u)

I^+_{n,n} = 0

Taken/Not-taken switch in the decode stage (TNTD):

K_{b,n} = s_b - 1 - s_f

K_{b,b} = s_f + p_a * D_a + p_f * D_f

K_{n,b} = (s_b - 1) + p_a * D_a + p_f * D_f

K_{n,n} = 0

I^+_{b,n} = (s_b - 1 - s_f) - p_a * D_a - p_f * D_f

I^+_{b,b} = s_f

I^+_{n,b} = s_b - 1
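For a strategy that predicts in both directions, such as TNTD, all four cases can occur. The sketch below (Python; names are illustrative) combines the per-case delays above with the case probabilities of Section A.1, assuming that the average branch delay is their probability-weighted sum, as in Table 3.2:

```python
def tntd_branch_delay(s_b, s_f, p_t, p_c, p_a, D_a, p_f, D_f):
    """Average branch delay for the taken/not-taken switch in the decode stage."""
    # Per-case delays from the expressions above.
    K_bn = s_b - 1 - s_f
    K_bb = s_f + p_a * D_a + p_f * D_f
    K_nb = (s_b - 1) + p_a * D_a + p_f * D_f
    K_nn = 0
    # Case probabilities in terms of the prediction behaviour (Section A.1).
    p_nn = (1 - p_t) * p_c
    p_nb = (1 - p_t) * (1 - p_c)
    p_bn = p_t * (1 - p_c)
    p_bb = p_t * p_c
    return p_nn * K_nn + p_nb * K_nb + p_bn * K_bn + p_bb * K_bb

print(tntd_branch_delay(s_b=4, s_f=1, p_t=0.6, p_c=0.8, p_a=0.1, D_a=2, p_f=0.05, D_f=3))
```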

Under the branch target buffer (BTB) scheme, instruction fetch addresses are associatively matched with the buffer contents, and in case of a hit the BTB predicts the most likely branch outcome as well as the most recent target address (see Fig. 3.5). As a result, the target fetch does not need to wait for the branch decode and target address calculation. If the branch is likely to be taken, the first target instruction fetch immediately follows the branch instruction fetch. Following branch decode, at the completion of the actual target address calculation, a comparison is made with the predicted target address. A mismatch here flushes any fetches made from the incorrect target, aborts any freeze in the incorrect target path, and restarts the target fetch at the calculated address. It is also assumed that this comparison output is available along with the actual target, without any additional clock overhead. The correct target prediction probability, p_ct, depends on the frequency of branch target changes. A hit in the branch target buffer (BTB) means that the fetch address contains a branch instruction. In the case of writable code segments, there is a small likelihood, p_w, that a non-branch instruction gets predicted as a branch instruction. To make things worse, if such an instruction is predicted as a branch likely to be taken, then it has an impact on the system throughput even in the absence of any branch instruction, as it blocks the sequential address fetch during the following cycle until the actual instruction decode. This throughput deterioration is modelled using the following modified version of the throughput equation in Table 3.2:

throughput = 1 / (1 + K * b + s_f * p_w * p_t) .

Similarly, this probability of a BTB hit with a branch-to-be-taken prediction for a non-branch instruction also modifies the computation for wasted instruction traffic in Table 3.2, which so far included additional instruction traffic only due to branch instructions.

The following equation reflects an additional wasted instruction fetch in case a non-branch instruction is predicted as a to-be-taken branch and there is no target fetch freeze:

I^+ = I^+_b * (1 - p_w) + (s_f - D_f * p_f) * p_t * p_w ,

where I^+_b refers to the excess instruction traffic due to branch instructions given by the equation in Table 3.2 and - refers to the probability-based reduction explained in Section A.2. In case of a miss in the BTB, branch instructions are handled in a manner similar to the PBNT strategy. In other words, in case of a BTB miss the branch is by default assumed as not likely to be taken. The overhead involved in BTB updates is ignored. Therefore, equivalently, this strategy can be considered as a combination of two strategies: one as described above, with BTB hit probability p_th, and the other the same as in the case of the PBNT scheme, with the BTB miss probability. The calculation of the different excess instruction traffic parameters is involved in this case and, hence, their derivation is described below in qualitative terms before giving the mathematical details.

i) If a branch is predicted as likely to be taken, target fetch begins immediately from the predicted address and target address calculation starts soon after the decode. If the calculated target address does not match the predicted address, the instructions fetched so far from the incorrect address are wasted and instruction fetches begin from the calculated (actual) target address.

ii) In the previous case, if the branch is not taken then, in addition to the fetches made from the predicted target, instructions fetched from the actual target address are also wasted.

iii) If the branch is predicted as not likely to be taken, it is assumed that no attempt is made to calculate the target address and instruction fetch continues from the sequential path. If this prediction turns out to be false, the sequential instructions fetched are discarded, target fetch immediately begins at the predicted address, and the actual target calculation starts simultaneously. If the calculated target does not match the predicted target, this fetched sequence is also wasted.

Finally, the above set of parameters is used to calculate the average branch delay, K_h, and excess instruction traffic, I^+_h, where the subscript refers to the BTB-hit case. In the case of a BTB miss, the corresponding parameters K_m and I^+_m are calculated from the components given for the PBNT case. Combining these cases gives

K = K_h * p_th + K_m * (1 - p_th)  and

I^+_b = I^+_h * p_th + I^+_m * (1 - p_th) .
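A small sketch of this hit/miss combination (Python). The numeric values of K_h, I^+_h, K_m, and I^+_m below are placeholders rather than results from the model, and the throughput form follows the modified equation given earlier in this section:

```python
def combine_btb_cases(K_h, I_h, K_m, I_m, p_th):
    """Weight the BTB-hit and BTB-miss parameters by the hit probability p_th."""
    K = K_h * p_th + K_m * (1 - p_th)
    I_plus = I_h * p_th + I_m * (1 - p_th)
    return K, I_plus

def throughput(K, b, s_f, p_w, p_t):
    """Instructions per cycle, including the penalty for non-branches falsely hit in the BTB."""
    return 1.0 / (1 + K * b + s_f * p_w * p_t)

# Placeholder hit/miss parameters; in the model they come from the expressions below
# (hit case) and from the PBNT case (miss case).
K, I_plus = combine_btb_cases(K_h=1.2, I_h=0.9, K_m=2.8, I_m=1.7, p_th=0.8)
print(K, I_plus, throughput(K, b=0.2, s_f=2, p_w=0.02, p_t=0.6))
```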

The computation of the different parameters follows.

K_{b,n} = s_b - 1

K_{b,b} = (p_f * D_f) * p_ct + (s_f + p_a * D_a + p_f * D_f) * (1 - p_ct)

K_{n,b} = ((s_b - 1) + p_f * D_f) * p_ct + ((s_b - 1) + p_a * D_a + p_f * D_f) * (1 - p_ct)

K_{n,n} = 0

I^+_{n,n} = 0

The additional instruction traffic, when the branch is predicted as to-be-taken and is actually taken, is

I^+_{b,b} = (σ_1 + σ_2 + σ_3 + σ_4) * (1 - p_ct) .

Assume min(a,b) = a if a < b
              = b otherwise.

σ_1 = min(pos(s_f + D_a - D_f), (s_b - 1)) * p_a * p_f

σ_2 = min((s_f + D_a), (s_b - 1)) * p_a * (1 - p_f)

σ_3 = pos(s_f - D_f) * (1 - p_a) * p_f

σ_4 = s_f * (1 - p_a) * (1 - p_f)

Consider the additional instruction traffic when the branch is predicted as to-be-taken but the prediction turns out to be incorrect. Let p_fp and D_fp refer to the freeze probability and freeze duration, respectively, at the predicted target, and let p_fc and D_fc refer to the same at the actual (calculated) target. This distinction is made only for better understanding of the following details; numerically, p_fp = p_fc = p_f and D_fp = D_fc = D_f. The discarded instruction fetches in this case would be

I^+_{b,n} = ((s_b - 1) - p_f * D_f) * p_ct + δ * (1 - p_ct) , where

δ = δ_1 + δ_2 + δ_3 + δ_4 + δ_5 + δ_6 + δ_7 + δ_8

δ_1 = (pos(s_f - D_fp) + s_b - 1 - s_f) * p_fp * (1 - p_fc) * (1 - p_a)

δ_2 = (pos(s_f - D_fp) + pos(s_b - 1 - s_f - D_fc)) * p_fp * p_fc * (1 - p_a)

δ_3 = (s_b - 1) * (1 - p_fp) * (1 - p_fc) * (1 - p_a)

δ_4 = (s_f + pos(s_b - 1 - s_f - D_fc)) * (1 - p_fp) * p_fc * (1 - p_a)

δ_5 = (min(pos(s_f + D_a - D_fp), (s_b - 1)) + pos(s_b - 1 - s_f - D_a)) * p_fp * (1 - p_fc) * p_a

δ_6 = (min(pos(s_f + D_a - D_fp), (s_b - 1)) + pos(s_b - 1 - s_f - D_a - D_fc)) * p_fp * p_fc * p_a

δ_7 = (s_b - 1) * (1 - p_fp) * (1 - p_fc) * p_a , and

δ_8 = (min((s_f + D_a), (s_b - 1)) + pos(s_b - 1 - s_f - D_a - D_fc)) * (1 - p_fp) * p_fc * p_a
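The δ terms are mechanical to evaluate; the following is a direct transcription of δ_1 through δ_8 (Python; pos() as defined in Section A.2, and the argument values in the example are arbitrary):

```python
def pos(x):
    return x if x > 0 else 0

def delta_sum(s_b, s_f, D_a, D_fp, D_fc, p_a, p_fp, p_fc):
    """Sum of the eight delta terms for a taken-predicted branch that is not taken."""
    d1 = (pos(s_f - D_fp) + s_b - 1 - s_f) * p_fp * (1 - p_fc) * (1 - p_a)
    d2 = (pos(s_f - D_fp) + pos(s_b - 1 - s_f - D_fc)) * p_fp * p_fc * (1 - p_a)
    d3 = (s_b - 1) * (1 - p_fp) * (1 - p_fc) * (1 - p_a)
    d4 = (s_f + pos(s_b - 1 - s_f - D_fc)) * (1 - p_fp) * p_fc * (1 - p_a)
    d5 = (min(pos(s_f + D_a - D_fp), s_b - 1) + pos(s_b - 1 - s_f - D_a)) * p_fp * (1 - p_fc) * p_a
    d6 = (min(pos(s_f + D_a - D_fp), s_b - 1) + pos(s_b - 1 - s_f - D_a - D_fc)) * p_fp * p_fc * p_a
    d7 = (s_b - 1) * (1 - p_fp) * (1 - p_fc) * p_a
    d8 = (min(s_f + D_a, s_b - 1) + pos(s_b - 1 - s_f - D_a - D_fc)) * (1 - p_fp) * p_fc * p_a
    return d1 + d2 + d3 + d4 + d5 + d6 + d7 + d8

# Numerically p_fp = p_fc = p_f and D_fp = D_fc = D_f, as noted above.
print(delta_sum(s_b=5, s_f=2, D_a=2, D_fp=3, D_fc=3, p_a=0.1, p_fp=0.05, p_fc=0.05))
```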

The additional instruction traffic when the branch is predicted as not-to-be-taken but is actually taken is

I^+_{n,b} = (s_b - 1) + γ * (1 - p_ct) , where

γ = γ_1 * p_a * p_f + γ_2 * p_a * (1 - p_f) + γ_3 * (1 - p_a) * p_f + γ_4 * (1 - p_a) * (1 - p_f) , and

γ_1 = min(pos(D_a - D_f), s_b - 1)

γ_2 = min(D_a, s_b - 1) , and

γ_3 = γ_4 = 0

Finally, calculations for the four hybrid cases follow.

Predict branch always taken with target-copy and delayed branch (TTCDB):

p_t = 1

K_{b,n} = pos(s_b - 1 - u)

K_{b,b} = p_a * pos(D_a - (s_b - 1 - s_f)) + p_f * D_f

I^+_{b,n} = pos(s_b - 1 - u)

I^+_{b,b} = 0

Predict branch always taken with target-copy, delayed branch and loop buffer (TTDLB):

p_t = 1

K_{b,n} = pos(s_b - 1 - u) * (1 - p_lh) + pos(s_b - 1 - u) * p_lh

K_{b,b} = (p_a * pos(D_a - (s_b - 1 - s_f))) * p_lh + (p_a * pos(D_a - (s_b - 1 - s_f)) + p_f * D_f) * (1 - p_lh)

I^+_{b,b} = 0

Taken/Not-taken switch in the decode stage with loop buffer (TNTLB):

K_{b,n} = s_b - 1 - s_f

K_{b,b} = ((s_f - 1) + p_a * D_a) * p_lh + (s_f + p_a * D_a + p_f * D_f) * (1 - p_lh)

K_{n,b} = ((s_b - 1) + p_a * D_a + p_f * D_f) * (1 - p_lh) + ((s_b - 2) + p_a * D_a) * p_lh

K_{n,n} = 0

I^+_{b,n} = ((s_b - 1 - s_f) - p_a * D_a - p_f * D_f) * (1 - p_lh)

I^+_{b,b} = s_f * (1 - p_lh)

I^+_{n,b} = (s_b - 1) * (1 - p_lh)

I^+_{n,n} = 0

The model parameters for the case of the taken/not-taken switch in the decode stage with branch target buffer (TNBTB) are the same as those in the case of BTB, except that the average branch delay and excess instruction traffic parameters K_m and I^+_m in the case of a BTB miss are calculated using the TNTD case.

A.4 Some Additional Performance Plots

Figure A.1 Average branch delay versus number of substages in the instruction fetch stage for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies.

Figure A.2 Average number of wasted instruction fetches per branch versus number of substages in the instruction fetch stage for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies.

Figure A.3 Merit ratio versus number of substages in the instruction fetch stage for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies.

Figure A.4 Average branch delay versus number of stages for conditional branch resolution for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies.

Figure A.5 Average number of wasted instruction fetches per branch versus number of stages for conditional branch resolution for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies.

Figure A.6 Merit ratio versus number of stages for conditional branch resolution for PBNT, LB, PTTC, FBP, TNTD, and BTB strategies.

Figure A.7 Average branch delay versus number of substages in the instruction fetch stage for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.8 Average number of wasted instruction fetches per branch versus number of substages in the instruction fetch stage for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.9 Merit ratio versus number of substages in the instruction fetch stage for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.10 Average number of wasted instruction fetches per branch versus number of stages for conditional branch resolution for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.11 Merit ratio versus number of stages for conditional branch resolution for PBNT, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.12 Average number of wasted instruction fetches per branch versus loop/target buffer hit probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.13 Merit ratio versus loop/target buffer hit probability for LB, BTB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.14 Average number of wasted instruction fetches per branch versus target fetch freeze probability for LB, BTB, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Figure A.15 Merit ratio versus target fetch freeze probability for LB, BTB, TTCDB, TTDLB, TNTLB, and TNBTB strategies.

Appendix B

B.1 Additional benchmarks used in this study.

Benchmark    Description

whetstone    Synthetic mix of floating-point and integer arithmetic
tomcatv      Vectorizable floating-point Fortran benchmark that does little I/O
appbt        Coupled partial differential equations
appsp        Coupled partial differential equations
buk          Bucket sort
adm          FFTs
dyfesm       ODE solvers, nonlinear algebraic systems, and sparse linear systems solver
flo52        Multigrid schemes, ODE solvers
ocean        FFTs
qcd          Monte Carlo schemes
spec77       FFTs, rapid elliptic problem solvers
track        Convolution
trfd         Integral transforms

Figure B.1 Measured instruction scheduling probability versus distance (number of dynamic instructions apart) for whetstone, tomcatv, appbt, appsp, and buk benchmarks.

Figure B.2 Measured instruction scheduling probability versus distance for adm, qcd, track, and ocean benchmarks.

Figure B.3 Measured instruction scheduling probability versus distance for dyfesm, flo52, trfd, and spec77 benchmarks.

Figure B.4 Measured beyond-basic-block instruction scheduling probability versus distance (number of basic blocks apart) for whetstone, tomcatv, appbt, appsp, and buk benchmarks.


Figure B.5 Measured beyond-basic-block instruction scheduling probability versus distance for adm, qcd, track, and ocean benchmarks.

Figure B.6 Measured beyond-basic-block instruction scheduling probability versus distance for dyfesm, flo52, trfd, and spec77 benchmarks.

Figure B.7 Predicted misprediction delay based on the empirically collected p_a distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks, for whetstone, tomcatv, appbt, appsp, and buk benchmarks.

Figure B.8 Predicted misprediction delay based on the empirically collected p_a distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks, for adm, qcd, track, and ocean benchmarks.

Figure B.9 Predicted misprediction delay based on the empirically collected p_a distribution as a function of the amount of dynamic lookahead, in terms of number of basic blocks, for dyfesm, flo52, trfd, and spec77 benchmarks.

Figure B.10 Throughput under resource and scope constraints for the whetstone benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.11 Throughput under resource and scope constraints for the tomcatv benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.12 Throughput under resource and scope constraints for the appbt benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.13 Throughput under resource and scope constraints for the appsp benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.14 Throughput under resource and scope constraints for the buk benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.15 Throughput under resource and scope constraints for the adm benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.16 Throughput under resource and scope constraints for the qcd benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.17 Throughput under resource and scope constraints for the track benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.18 Throughput under resource and scope constraints for the ocean benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.19 Throughput under resource and scope constraints for the dyfesm benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. Bars compare model-predicted throughput with measured compiler output. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.20 Throughput under resource and scope constraints for the flo52 benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. Bars compare model-predicted throughput with measured compiler output. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.21 Throughput under resource and scope constraints for the trfd benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. Bars compare model-predicted throughput with measured compiler output. The compiler output was influenced by resource constraints that are not part of the model.

Figure B.22 Throughput under resource and scope constraints for the spec77 benchmark; resources varied with instruction word width = 2, 3, 4, 6, and 12. Bars compare model-predicted throughput with measured compiler output. The compiler output was influenced by resource constraints that are not part of the model.