Background Optimization in Full System Binary Translation

Total Page:16

File Type:pdf, Size:1020Kb

Background Optimization in Full System Binary Translation Background Optimization in Full System Binary Translation Roman A. Sokolov Alexander V. Ermolovich MCST CJSC Intel CJSC Moscow, Russia Moscow, Russia Email: [email protected] Email: [email protected] Abstract—Binary translation and dynamic optimization are architecture codes fully utilizing all architectural features widely used to provide compatibility between legacy and promis- introduced to support binary translation. Besides, dynamic ing upcoming architectures on the level of executable binary optimization can benefit from utilization of actual information codes. Dynamic optimization is one of the key contributors to dynamic binary translation system performance. At the same about executables behavior which static compilers usually time it can be a major source of overhead, both in terms of don’t possess. CPU cycles and whole system latency, as long as optimization At the same time dynamic optimization can imply sig- time is included in the execution time of the application under nificant overhead as long as optimization time is included translation. One of the solutions that allow to eliminate dynamic in the execution time of application under translation. Total optimization overhead is to perform optimization simultaneously with the execution, in a separate thread. In the paper we present optimization time can be significant but will not necessarily implementation of this technique in full system dynamic binary be compensated by the translated codes speed-up if application translator. For this purpose, an infrastructure for multithreaded run time is too short. execution was implemented in binary translation system. This Also, the operation of optimizing translator can worsen the allowed running dynamic optimization in a separate thread latency (i.e., increase pause time) of interactive application or independently of and concurrently with the main thread of execution of binary codes under translation. Depending on the operating system under translation. By latency is meant the computational resources available, this is achieved whether by time of response of computer system to external events such interleaving the two threads on a single processor core or by as asynchronous hardware interrupts from attached I/O devices moving optimization thread to an underutilized processor core. and interfaces. This characteristic of a computer system is as In the first case the latency introduced to the system by a important for the end user, operation of hardware attached or computational intensive dynamic optimization is reduced. In the second case overlapping of execution and optimization threads other computers across network as its overall performance. also results in elimination of optimization time from the total Full system dynamic binary translators have to provide low execution time of original binary codes. latency of operation as well. Binary translation systems of this class target to implement all the semantics and behavior I. INTRODUCTION model of source architecture and execute the entire hierar- Technologies of binary translation and dynamic optimiza- chy of system-level and application-level software including tion are widely used in modern software and hardware com- BIOS and operating systems. They exclusively control all the puting systems [1]. In particular, dynamic binary translation computer system hardware and operation. Throughout this systems (DBTS) comprising the two serve as a solution to paper we will also refer this type of binary translation systems provide compatibility between widely used legacy and promis- as virtual machine level (or VM-level) binary translators (as ing upcoming architectures on the level of executable binary opposed to application-level binary translators). codes. In the context of binary translation these architectures One recognized technique to reduce dynamic optimization are usually referred to as source and target, correspondingly. overhead is to perform optimization simultaneously (con- DBTSs execute binary codes of source architecture on currently) with the execution of original binary codes by top of instruction set (ISA) incompatible target architecture utilizing unemployed computational resources or free cycles. hardware. They perform translation of executable codes incre- It was utilized in a number of dynamic binary translation and mentally (as opposed to whole application static compilation) optimization systems [2], [3], [4], [5], [6], [7], [8]. We will interleaving it with execution of generated translated codes. refer this method as background optimization (as opposed to One of the key requirements that every DBTS has to meet consequent optimization, when optimizing translation inter- is that the performance of execution of source codes through rupts execution and utilizes processor time exclusively unless binary translation is to be comparable or even outperform the it completes). performance of native execution (when executing them on top The paper describes implementation of background opti- of source architecture hardware). mization in a VM-level dynamic binary translation system. Optimizing translator is usually employed to achieve higher This is achieved by separating of optimizing translation from DBTS performance. It allows to generate highly efficient target execution flow into an independent thread which can then con- currently share available processing resources with execution thread. Backgrounding is implemented whether by interleaving the two threads in case of a single-core (single processor) system or by moving optimization thread to an unemployed Elbrus CPU processor core in case of a dual-core (dual processor) system. (IA-32 incompatible) In the first case the latency introduced to the system by the ”heavy” phase of optimizing translation is reduced. In the second case, overlapping of execution and optimization VM-level dynamic binary translation threads also eliminates the time spent in dynamic optimization system LIntel phase from the total run time of the original application under translation. IA-32 BIOS, OS, drivers The specific contributions of this work are as follows: and libraries • implementation of multithreaded infrastructure in a VM- IA-32 applications level dynamic binary translation system; • single processor system targeted implementation of back- ground optimization technique where processor time shar- Fig. 1. VM-level dynamic binary translation system LIntel. ing is implemented by interleaving optimizing translation with execution of original binary codes; Adaptive • dual processor system targeted implementation of back- retranslation IA-32 binaries ground optimization technique where optimizing trans- lation is being completely offloaded onto underutilized processor core. Interpretation Non-optimizing trace and profiling translation Translation cache: The solutions described in the paper were implemented execution of translated codes in the VM-level dynamic binary translation system LIntel, and profiling Optimizing region which provides full system-level binary compatibility with translation Intel IA-32 architecture on top of Elbrus architecture [9], [10] hardware. Fig. 2. Adaptive binary translation. II. LINTEL A. Adaptive binary translation Elbrus is a VLIW (Very Long Instruction Word) micropro- LIntel follows adaptive, profile-directed model of translation cessor architecture. It has several special features including and execution of binary codes (Fig. 2). It includes four levels hardware support for full compatibility with IA-32 architecture of translation and optimization varying by the efficiency of the on the basis of transparent dynamic binary translation. resulting Elbrus code and the overhead implied, namely: inter- LIntel is a dynamic binary translation system developed for preter, non-optimizing translator of traces and two optimizing high performance emulation of Intel IA-32 architecture sys- translators of regions. LIntel performs dynamic profiling to tem through dynamic translation of source IA-32 instructions identify hot regions of source code and to apply reasonable into wide instructions of target Elbrus architecture (the two level of optimization depending on executable codes behavior. architectures are ISA-incompatible). It provides full system- Translation cache is employed to store and reuse generated level compatibility meaning that it is capable of translating translations throughout execution. Run-time support system the entire hierarchy of source architecture software (including controls the overall binary translation and execution process. BIOS, operating systems and applications) transparently for When the system starts, interpreter is used to carefully the end user (Fig. 1). As is noted above, LIntel is a co- decode and execute IA-32 instructions sequentially, with at- designed system (developed along with the architecture, with tention to memory access ordering and precise exception hardware assistance in mind) and heavily utilizes all the handling. Non-optimizing translation is launched if execution features of architecture introduced to support efficient IA-32 counter of a particular basic block exceeds specified threshold. compatibility. Non-optimizing translator builds a trace which is a seman- In its general structure LIntel is similar to many other binary tically accurate mapping of one or several contiguous basic translation and optimization systems described before [11], blocks (following one path of control) into the target code. The [12], [13] and is very close to Transmeta’s Code Morphing
Recommended publications
  • Computer Architectures an Overview
    Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements.
    [Show full text]
  • Open Access Proceedings Journal of Physics: Conference Series
    IOP Conference Series: Earth and Environmental Science PAPER • OPEN ACCESS Issues of compatibility of processor command architectures To cite this article: T R Zmyzgova et al 2020 IOP Conf. Ser.: Earth Environ. Sci. 421 042006 View the article online for updates and enhancements. This content was downloaded from IP address 85.143.35.46 on 13/02/2020 at 06:53 AGRITECH-II-2019 IOP Publishing IOP Conf. Series: Earth and Environmental Science 421 (2020) 042006 doi:10.1088/1755-1315/421/4/042006 Issues of compatibility of processor command architectures T R Zmyzgova, A V Solovyev, A G Rabushko, A A Medvedev and Yu V Adamenko Kurgan State University, 62 Proletarskaya street, Kurgan, 640002, Russia E-mail: [email protected] Abstract. Modern computers and computing devices are based on the principle of open architecture, according to which the computer consists of several sufficiently independent devices that perform a certain function. These devices must meet certain standards of interaction with each other. Existing standards relate to both the technical characteristics of the devices and the content of the signals exchanged between them. The article considers the issue of creating a universal architecture of processor commands. The brief analysis of the components of devices and interrelations between them (different hardware features of processors, architecture of memory models and registers of peripheral devices, mechanisms of operand processing, the number of registers and processed data types, interruptions, exceptions, etc.) is carried out. The problem of architecture standardization, which generalizes the capabilities of the most common architectures and is suitable for high-performance emulation on most computer architectures, is put.
    [Show full text]
  • Oral History of Boris Babayan
    Oral History of Boris Babayan Interviewed by: Alex Bochannek Recorded: May 16, 2012 Moscow, Russia CHM Reference number: X6507.2012 © 2013 Computer History Museum Oral History of Boris Babayan Boris Babayan, May 16, 2012 Alex Bochannek: I’m Alex Bochannek; Curator and Senior Manager at the Computer History Museum in Mountain View, California. Today is Wednesday, May 16th and we are at Intel in Moscow, Russia to conduct the oral history with Boris Babayan. Also present in the room are Lubov Gladkikh his assistant and Yuri Merling [ph?] the videographer. Thank you both for agreeing to do this oral history for the archive at the Computer History Museum today. Let’s start about talking about your childhood and would you give us your full name, where you were born, and tell us a little bit about your parents, if there are any siblings and what your childhood was like. Boris Babayan: I was born in 1933 in Baku, Azerbaijan, now it’s an independent state. My father was a technician, he was an electrical engineer, and my mother worked in the kindergarten. I finished 10 year secondary school in Baku and then moved in Moscow, where I entered the Moscow Institute of Physics and Technology. It seems to me that in Russia I definitely was the first student in computer science. Bochannek: Now, what made you want to go to Moscow to that institute? Were you interested in technical things? Babayan: Because I was interested in the technical education and some people told me that the Moscow Institute of Physics and Technology is a very good institute, like MIT in the United States; it was established just after World War II, specially for education in the high tech.
    [Show full text]
  • Experience of Building and Deployment Debian on Elbrus Architecture
    Experience of Building and Deployment Debian on Elbrus Architecture Andrey Kuyan, Sergey Gusev, Andrey Kozlov, Zhanibek Kaimuldenov, Evgeny Kravtsunov Moscow Center of SPARC Technologies (ZAO MCST) Vavilova street, 24, Moscow, Russia fkuyan a, gusev s, kozlov a, kajmul a, kravtsunov [email protected] Abstract—This article describe experience of porting Debian II. Debian package managment system Linux distribution on Elbrus architecture. Authors suggested Debian is built from a large number of open-source projects effective method of building Debian distribution for architecture which is not supported by community. which maintained by different groups of developers around the world. Debian uses the package term. There are 2 types of packages: source and binary. Common source package I. Introduction consists of *.orig.tar.gz file, *.diff.gz file and *.dsc file. *.orig.tar.gz file contains upstream code of a project, MCST (ZAO ”MCST”) is a Russian company specializing maintained by original developers. *.diff.gz file contains in the development of general purpose CPU with Elbrus- a Debian patch with some information about project, such 2000 (e2k) ISA [1] and computing platforms based on it as build-dependencies, build rules, etc. *.dsc file holds an [2].Also in the company are being developed optimizing information about *.orig.tar.gz and *.diff.gz. Some and binary compilers, operating systems. General purpose of source packages, maintained by Debian developers (for ex- microprocessors and platforms assume that users have the ample dpkg) may not comprise *.diff.gz file because they ability to solve any problems of system integration with its already have a Debian information inside.
    [Show full text]
  • Chronologie De L'informatique Scientifique
    Chronologie de l’Informatique Scientifique Gérard Sookahet (août 2021) Voici une chronologie simplifiée des différents faits qui balisent le parcours de l'activité de l'Informatique Scientifique . S'entrecroisent à la fois les progrès réalisés dans les domaines tels que l'Analyse Numérique, l'Informatique, l'Algorithmique, l'Infographie, etc .... ndlr: Certains événements ont parfois un rapport assez lointain avec l'Informatique Scientifique, mais ils permettent de mieux situer le contexte scientifique de l'époque. Certains autres événements ont une importance toute relative selon votre grille de lecture. Nomenclature Mathématiques, analyse numérique, calcul numérique, algorithmique Eléments Finis, calcul de structure, mécanique, CFD CAO, infographie, cartographie Informatique, calcul intensif Programmation, calcul formel et symbolique, intelligence artificielle, cryptographie - 2000 Tables numériques babyloniennes de carrés et cubes - 1800 Algorithme de Babylone (approximation des racines carrées) - 220 Calcul de la circonférence de la Terre par Eratosthène de Cyrène - 225 Approximation de π par Archimèdes de Syracuse - 150 Hipparque de Rhodes utilise des interpolations linéaires pour calculer la position des corps célestes - 87 Machine d'Anticythère pour calculer les positions astronomiques (Grèce) 200 Abaques chinois 263 Première méthode d'élimination de Gauss par Liu Hui (Chine) 450 Calcul de π avec 6 décimales par Tsu Chung-Chih et Tsu Keng-Chih 550 Apparition du zéro et de la numération de position (Inde) 600 Liu
    [Show full text]
  • Error Handling and Energy Estimation Framework for Error Resilient Near-Threshold Computing Rengarajan Ragavan
    Error Handling and Energy Estimation Framework For Error Resilient Near-Threshold Computing Rengarajan Ragavan To cite this version: Rengarajan Ragavan. Error Handling and Energy Estimation Framework For Error Resilient Near- Threshold Computing. Embedded Systems. Rennes 1, 2017. English. tel-01636803 HAL Id: tel-01636803 https://hal.inria.fr/tel-01636803 Submitted on 17 Nov 2017 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. No d’ordre : ANNÉE 2017 THÈSE / UNIVERSITÉ DE RENNES 1 sous le sceau de l’Université Bretagne Loire pour le grade de DOCTEUR DE L’UNIVERSITÉ DE RENNES 1 Mention : Traitement du Signal et Télécommunications École doctorale MATISSE présentée par Rengarajan RAGAVAN préparée à l’unité de recherche UMR6074 IRISA Institut de recherche en informatique et systèmes aléatoires - CAIRN École Nationale Supérieure des Sciences Appliquées et de Technologie Thèse soutenue à Lannion le 22 septembre 2017 devant le jury composé de : Edith BEIGNÉ Error Handling and Directrice de Recherche à CEA Leti / rapporteur Energy Estimation Patrick GIRARD Directrice de Recherche à CNRS, LIRMM / rappor- teur Framework For Lorena ANGHEL Error Resilient Professeur à Grenoble INP, TIMA / examinateur Daniel MÉNARD Near-Threshold Professeur à INSA Rennes, IETR / examinateur Olivier SENTIEYS Computing Professeur à l’Université de Rennes 1 / directeur de thèse Cédric KILLIAN Maître de Conférences à l’Université de Rennes 1 / co-directeur de thèse To My Parents, My Brother, and My Teachers iii Acknowledgements I sincerely express my gratitude to my thesis supervisor Prof.
    [Show full text]
  • International Journal of Soft Computing and Engineering
    International Journal of Innovative Technology and Exploring Engineering (IJITEE) ISSN: 2278-3075, Volume-8 Issue-8S2, June 2019 Analysis of Various Computer Architectures Kulvinder Singh, Pushpendra Kumar Pateriya, Devinder Kaur, Mritunjay Rai Abstract: With the tremendous acceptance of computers globally in the different fields, there comes the challenge of managing II. ISA instructions, processing, and improving the efficiency of computer systems. Ever since the inception of computers from Instruction Set Architecture (ISA) is a mechanism which earliest ENIAC (developed at the University of Pennsylvania) to governs the interaction of hardware and software. ISA modern supercomputer Sunway TaihuLight (developed at comprises of hardware implemented instructions, defined National Supercomputing Center, China) a broad research along addressing modes, types, and sizes of different operands used with great ideas in the field of computer architecture has laid its as well as registers and input-output control [10]. Broadly way. It has lead to the advances which has made the computers capable to perform computations at lightning speeds. For an speaking, an ISA is a set of operations which defines example, the computing speed of TaihuLight exceeds 93 everything a machine language programmer must be familiar petaflops. It uses 40,960 individual RISC processors to reach such with to be able to program a computer [2]. Every ISA has its high speeds. In this, the architecture plays an indispensable role own set of supported operations and valid instructions format right from accessing memory, allocating instruction to processors which are decoded by the control unit of the processor while for execution, and exploiting the architecture to increase the execution is performed by ALU.
    [Show full text]
  • Thesis Title
    No d’ordre : ANNÉE 2017 THÈSE / UNIVERSITÉ DE RENNES 1 sous le sceau de l’Université Bretagne Loire pour le grade de DOCTEUR DE L’UNIVERSITÉ DE RENNES 1 Mention : Traitement du Signal et Télécommunications École doctorale MATISSE présentée par Rengarajan RAGAVAN préparée à l’unité de recherche UMR6074 IRISA Institut de recherche en informatique et systèmes aléatoires - CAIRN École Nationale Supérieure des Sciences Appliquées et de Technologie Thèse soutenue à Lannion le 22 septembre 2017 devant le jury composé de : Edith BEIGNÉ Error Handling and Directrice de Recherche à CEA Leti / rapporteur Energy Estimation Patrick GIRARD Directrice de Recherche à CNRS, LIRMM / rappor- teur Framework For Lorena ANGHEL Error Resilient Professeur à Grenoble INP, TIMA / examinateur Daniel MÉNARD Near-Threshold Professeur à INSA Rennes, IETR / examinateur Olivier SENTIEYS Computing Professeur à l’Université de Rennes 1 / directeur de thèse Cédric KILLIAN Maître de Conférences à l’Université de Rennes 1 / co-directeur de thèse To My Parents, My Brother, and My Teachers iii Acknowledgements I sincerely express my gratitude to my thesis supervisor Prof. Olivier Sentieys who gave me the opportunity to work him in prestigious INRIA/IRISA labs. His guidance and confidence in me gave extra boost and cultivated more interest in me to do this research work in a better way. His feedback and comments refined the quality of my research contributions and this thesis dissertation. I am very grateful to my thesis co-supervisor Dr. C´edricKillian for his encour- agement and support. He was inspirational to me throughout my Ph.D. tenure. His suggestions and critical evaluation helped me to innovatively address various bottlenecks in this thesis work.
    [Show full text]
  • Proceedings of the 7Th Spring/Summer Young Researchers’ Colloquium on Software Engineering
    SYRCoSE 2013 Editors: Alexander Kamkin, Alexander Petrenko, Andrey Terekhov Proceedings of the 7th Spring/Summer Young Researchers’ Colloquium on Software Engineering Kazan, May 30-31, 2013 2013 Proceedings of the 7th Spring/Summer Young Researchers’ Colloquium on Software Engineering (SYRCoSE 2013), May 30-31, 2013 – Kazan, Russia: The issue contains the papers presented at the 7th Spring/Summer Young Researchers’ Colloquium on Software Engineering (SYRCoSE 2013) held in Kazan, Russia on 30th and 31st of May, 2013. Paper selection was based on a competitive peer review process being done by the program committee. Both regular and research-in-progress papers were considered acceptable for the colloquium. The topics of the colloquium include modeling of computer systems, software testing and verification, parallel and distributed systems, information search and data mining, image and speech processing and others. Труды 7-ого весеннего/летнего коллоквиума молодых исследователей в области программной инженерии (SYRCoSE 2013), 30-31 мая 2013 г. – Казань, Россия: Сборник содержит статьи, представленные на 7-ом весеннем/летнем коллоквиуме молодых исследователей в области программной инженерии (SYRCoSE 2013), проводимом в Казани 30 и 31 мая 2013 г. Отбор статей производился на основе рецензирования материалов программным комитетом. На коллоквиум допускались как полные статьи, так и краткие сообщения, описывающие текущие исследования. Программа коллоквиума охватывает следующие темы: моделирование компьютерных систем, тестирование и верификация
    [Show full text]
  • Computer  Pc Components, Features and Control  Components of System and Its Feature  Types of Computer  Software Types  Operating System
    Contents at a Glance 1 Introduction and Development of the PC 2 Pc components, features and Control 3 Basic Electronics 4 SMPS 5 Motherboards and Add-on Cards 6 Processor Types and Specifications 7Memory 8Hard Disk Drive 9Optical Storage 10Input Devices 11 BIOS 12 Video Hardware 13 Printers and Scanners 14 Operating System 15 PC Diagnostics, Testing and Maintenance Lesson -1 Introduction and Development of the PC Characteristic and features of computer Pc components, features and Control Components of System and its feature Types of Computer Software Types Operating System Computer:- The word computer comes from the word ―compute‖, which means, ―to calculate‖. Computer is a programmable machine that receives input, stores and manipulates data, and provides output in a useful format. A computer is also called as data processor because it can store, process, and retrieve data whenever requires. The activity of processing data using a computer is called data processing. ―Data‖ is raw material used as input and ―Information‖ is processed data obtained as output of data processing. Processing: -Manipulation of data in the computer. Manipulation means calculations, comparisons, sorting of data. Characteristic of computers:- 1. Automatic 2. Speed 3. Accuracy 4. Diligence 5. Versatility 6. Power of remembering 7. No I. Q. 8. No feelings Basic Function of Computer: - 1. Inputting: -Processing of entering of data and instructions into a computer system. 2. Storing: -Saving data and instructions to make them readily for initial or additional processing and when required. 3. Processing: -Performing arithmetic operation (add, subtract, multiply, divide, etc) or logical operations (comparisons like equal to, less than, greater than, etc) on data to convert them to useful information.
    [Show full text]
  • Design and Implementation of a Multimedia Extension for a RISC Processor
    Design and implementation of a Multimedia Extension for a RISC Processor Eduardo Jonathan Martínez Montes Facultat d’Informàtica de Barcelona (FIB) Universitat Politècnica de Catalunya - BarcelonaTech A thesis submitted for the degree of Master in Innovation and Research in Informatics (MIRI-HPC) 2 July 2015 Advisor: Oscar Palomar, CAPP Junior Researcher at BSC-CNS Advisor: Marco Antonio Ramírez Salinas, Researcher at CIC-IPN Tutor: Adrian Cristal Kestelman, CAPP Senior Researcher Professor at BSC-CNS To my parents and my sister for their support during all this years To Oscar, my advisor, for his patience and his support. Acknowledgements I want to thank to all people and friends that helping me to do this possible. I also want to thank to my colleagues at MICROSE Lab and its leader Marco Ramírez. Also, it would not have been possible without the support of the BSC and its leader Mateo Valero. This work has been supported by the CONACYT and its project SIP: 20150957 "Desarrollo de Procesadores de Alto Desempeño para Sistemas en Chips Programables" Special thanks Adrián Cristal Kestelman Cesar Hernández Calderón Luis Villa Vargas Osman Sabri Ünsal Patricia Cantú Velasco Ramón Canal Corretger Santiago Reyes Ulises Cortes Víctor Hugo Ponce Ponce Abstract Nowadays, multimedia content is everywhere. Many applications use audio, images, and video. Over the last 20 years significant advances have been made in compilation and microarchitecture technologies to improve the instruction-level parallelism (ILP). As more demanding applications have appeared, a number of different microarchitectures have emerged to deal with them. A number of microarchitectural techniques have emerged to deal with them: pipelining, superscalar, out-of-order, multithreading, etc.
    [Show full text]
  • Architecture 1 CISC, RISC, VLIW, Dataflow Contents
    Architecture 1 CISC, RISC, VLIW, Dataflow Contents 1 Computer architecture 1 1.1 History ................................................ 1 1.2 Subcategories ............................................. 1 1.3 Roles .................................................. 2 1.3.1 Definition ........................................... 2 1.3.2 Instruction set architecture .................................. 2 1.3.3 Computer organization .................................... 3 1.3.4 Implementation ........................................ 3 1.4 Design goals .............................................. 3 1.4.1 Performance ......................................... 3 1.4.2 Power consumption ...................................... 4 1.4.3 Shifts in market demand ................................... 4 1.5 See also ................................................ 4 1.6 Notes ................................................. 4 1.7 References ............................................... 5 1.8 External links ............................................. 5 2 Complex instruction set computing 6 2.1 Historical design context ........................................ 6 2.1.1 Incitements and benefits .................................... 6 2.1.2 Design issues ......................................... 7 2.2 See also ................................................ 8 2.3 Notes ................................................. 8 2.4 References ............................................... 8 2.5 Further reading ............................................ 8 2.6 External
    [Show full text]