Design and Implementation of the Instruction Set Architecture for Data Lars

University of Kentucky UKnowledge University of Kentucky Master's Theses Graduate School 2010 DESIGN AND IMPLEMENTATION OF THE INSTRUCTION SET ARCHITECTURE FOR DATA LARS Kalyan Ponnala University of Kentucky, [email protected] Right click to open a feedback form in a new tab to let us know how this document benefits ou.y Recommended Citation Ponnala, Kalyan, "DESIGN AND IMPLEMENTATION OF THE INSTRUCTION SET ARCHITECTURE FOR DATA LARS" (2010). University of Kentucky Master's Theses. 58. https://uknowledge.uky.edu/gradschool_theses/58 This Thesis is brought to you for free and open access by the Graduate School at UKnowledge. It has been accepted for inclusion in University of Kentucky Master's Theses by an authorized administrator of UKnowledge. For more information, please contact [email protected]. ABSTRACT OF THE THESIS DESIGN AND IMPLEMENTATION OF THE INSTRUCTION SET ARCHITECTURE FOR DATA LARS The ideal memory system assumed by most programmers is one which has high capacity, yet allows any word to be accessed instantaneously. To make the hardware approximate this performance, an increasingly complex memory hierarchy, using caches and techniques like automatic prefetch, has evolved. However, as the gap between processor and memory speeds continues to widen, these programmer-visible mechanisms are becoming inadequate. Part of the recent increase in processor performance has been due to the introduction of programmer/compiler-visible SWAR (SIMD Within A Register) parallel processing on increasingly wide DATA LARs (Line Associative Registers) as a way to both improve data access speed and increase efficiency of SWAR processing. Although the base concept of DATA LARs predates this thesis, this thesis presents the first instruction set architecture specification complete enough to allow construction of a detailed prototype hardware design. This design was implemented and tested using a hardware simulator. KEYWORDS: Line Associative Registers, DATA LARs, SIMD Within a Register (SWAR), Cache Registers (CRegs), Associativity. KALYAN PONNALA (03/09/10) DESIGN AND IMPLEMENTATION OF THE INSTRUCTION SET ARCHITECTURE FOR DATA LARS By Kalyan Ponnala Dr. Hank Dietz Director of Thesis Dr. Stephen D. Gedney Director of Graduate Studies (03/09/2010) RULES FOR THE USE OF THESES Unpublished theses submitted for the Master’s degree and deposited in the University of Kentucky Library are as a rule open for inspection, but are to be used only with due regard to the rights of the authors. Bibliographical references may be noted, but quotations or summaries of parts may be published only with the usual scholarly acknowledgements. Extensive copying or publication of the thesis in whole or in part also requires the consent of the Dean of the Graduate School of the University of Kentucky. A library that borrows this thesis work for use by its patrons is expected to secure the signature of each user. Name Date THESIS Kalyan Ponnala The Graduate School University of Kentucky 2010 DESIGN AND IMPLEMENTATION OF THE INSTRUCTION SET ARCHITECTURE FOR DATA LARS ________________________________________________________________ THESIS _________________________________________________________________ A thesis submitted in the partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering in the College of Engineering at the University of Kentucky By Kalyan Ponnala Lexington, Kentucky Director: Dr. Hank Dietz, Professor of Electrical Engineering Lexington, Kentucky 2010 Copyright © Kalyan Ponnala 2010 DEDICATION To my extraordinary parents And All my friends at the University of Kentucky ACKNOWLEDGEMENTS I sincerely thank my academic advisor and thesis Director, Dr. Hank Dietz from the bottom of my heart for his guidance and support throughout my thesis. I am very thankful to Dr. Robert Heath for allowing me to work in his lab and helping me with my thesis. I would also like to acknowledge Dr. Meikang Qiu for his willingness to serve on my thesis committee. I would like to thank my mom, dad and brother for their emotional support and belief in me. Finally, I would like to thank all my friends at the University of Kentucky who have encouraged me during my work at this university. iii TABLE OF CONTENTS ACKNOWLEDGEMENTS ............................................................................................... iii LIST OF TABLES ............................................................................................................. vi LIST OF FIGURES .......................................................................................................... vii CHAPTER 1 ....................................................................................................................... 1 1. INTRODUCTION .......................................................................................................... 1 1.1 MOTIVATION ......................................................................................................... 1 1.2 BACKGROUNG ...................................................................................................... 2 1.2.1 SWAR ................................................................................................................ 2 1.2.2 CREGS ............................................................................................................... 2 1.2.3 CACHE .............................................................................................................. 4 CHAPTER 2 ....................................................................................................................... 7 2. LARS .............................................................................................................................. 7 2.1 DATA TYPES .......................................................................................................... 8 2.1.1 Fundamental Data Types ................................................................................... 8 2.1.2 PACKED SIMD DATA TYPES ....................................................................... 9 2.2 SIGNED AND UNSIGNED INTEGERS .............................................................. 10 2.3 Memory Alignment ................................................................................................. 11 2.4 DATA LARS REGISTER SET .............................................................................. 11 CHAPTER 3 ..................................................................................................................... 14 3. INSTRUCTION SET ARCHITECTURE .................................................................... 14 3.1 DATA TRANSFER INSTRUCTIONS .................................................................. 14 3.2 TYPE CASTING INSTRUCTIONS ...................................................................... 17 3.3 ARITHMETIC AND LOGICAL INSTURCTIONS .............................................. 20 3.3.1 TYPE CONVERSIONS................................................................................... 22 3.3.2 SCALAR ALU OPERATIONS ....................................................................... 25 3.4 NO-OP .................................................................................................................... 27 3.5 LOADDUMMY ...................................................................................................... 27 3.5 SUMMARY OF INSTRUCTION SET ARCHITECTURE................................... 28 CHAPTER 4 ..................................................................................................................... 30 4. DATA LARs ARCHITECTURE ................................................................................. 30 iv 4.1 Interrupts ................................................................................................................. 33 4.2 TRADE OFFs OF DATA LARs............................................................................. 33 4.2.1 DATA HAZARDS........................................................................................... 34 4.2.2 STRUCTURAL HAZARD .............................................................................. 36 4.2.3 ASSOCIATIVE SEARCH OF LOAD INSTRUCTIONS .............................. 39 CHAPTER 5 ..................................................................................................................... 40 5. RESULTS ..................................................................................................................... 40 5.1 TRIVIAL EXAMPLE ............................................................................................. 40 5.1.1 EXECUTION OF ALIAS ANALISIS EXAMPLE ON DATA LARs SIMULATOR ........................................................................................................... 42 5.1.2 EXECUTION OF LAZY STORE EXAMPLE ON DATA LARs SIMULATOR ........................................................................................................... 46 5.2 DEVICE UTILIZATION SUMMARY .................................................................. 48 CHAPTER 6 ..................................................................................................................... 49 6. CONCLUSION AND FUTURE WORK ..................................................................... 49 REFERENCES ................................................................................................................. 51 Vita ...................................................................................................................................

Design and Implementation of the Instruction Set Architecture for Data Lars

Elementary Functions: Towards Automatically Generated, Efficient

Run-Time Adaptation to Heterogeneous Processing Units for Real-Time Stereo Vision

A Compiler Target Model for Line Associative Registers

High Performance Motion Detection: Some Trends

Low Power Image Processing: Analog Versus Digital Comparison Jacques-Olivier Klein, Lionel Lacassagne, Herve Mathias, Sebastien Moutault, Antoine Dupret

Low Overhead Memory Subsystem Design for a Multicore Parallel DSP Processor

FULLTEXT01.Pdf

Scalar Arithmetic Multiple Data: Customizable Precision for Deep Neural Networks

The Design of Scalar AES Instruction Set Extensions for RISC-V

An Instruction Set Extension to Support Software-Based Masking

ESA IP Core Extensions for LEON2FT

Software Component Template V4.2.0 R4.0 Rev 3