RevAnC: A Framework for Reverse Engineering Hardware Page Table Caches

Stephan van Schaik, Kaveh Razavi, Ben Gras, Herbert Bos, Cristiano Giuffrida
Vrije Universiteit Amsterdam

EuroSec '17, April 23, 2017, Belgrade, Serbia. © 2017 ACM.

ABSTRACT

Recent hardware-based attacks that compromise systems with Rowhammer or bypass address-space layout randomization rely on how the processor's memory management unit (MMU) interacts with page tables. These attacks often need to reload page tables repeatedly in order to observe changes in the target system's behavior. To speed up the MMU's page table lookups, modern processors make use of multiple levels of caches such as translation lookaside buffers (TLBs), special-purpose page table caches and even general data caches. A successful attack needs to flush these caches reliably before accessing page tables. To flush these caches from an unprivileged process, the attacker needs to create specialized memory access patterns based on the internal architecture and size of these caches as well as how they interact with each other. While information about TLBs and data caches is often reported in processor manuals released by the vendors, there is typically little or no information about the properties of page table caches on different processors. In this paper, we describe RevAnC, an open-source framework for reverse engineering the internal architecture, size and behavior of these page table caches by retrofitting a recently proposed EVICT+TIME attack on the MMU. RevAnC can automatically reverse engineer page table caches on new architectures while providing a convenient interface for flushing these caches on the 23 different microarchitectures from Intel, ARM and AMD that we evaluated.

1. INTRODUCTION

As software is becoming harder to compromise due to the plethora of advanced defenses [1, 7, 11, 18, 19], attacks on hardware are instead becoming an attractive alternative. These attacks range from compromising the system using the Rowhammer vulnerability [4, 24, 25, 26] to using side channels for breaking address space layout randomization (ASLR) [10, 12, 13, 14, 17], leaking cryptographic keys [27, 6] and even tracking mouse movements [21].

Many of these attacks abuse how modern processors interact with memory. At the core of any processor today is a memory management unit (MMU) that simplifies the management of the available physical memory by virtualizing it between multiple processes. The MMU uses a data structure called the page table to perform the translation from virtual memory to physical memory. Page tables are an attractive target for hardware-based attacks. For example, a single bit flip in a page table page caused by Rowhammer could grant an attacker control over a physical memory location that she does not own, enough to gain root privileges [25, 26]. Further, security defenses such as ASLR, and others that bootstrap using ASLR [5, 8, 19, 20], rely on randomizing where code or data is placed in virtual memory. Since this (secret) information is embedded in page tables, attackers can perform various side-channel attacks on the interaction of the MMU with page tables to leak this information [12].

The virtual-to-physical memory translation is slow, as it requires multiple additional memory accesses to resolve the original virtual address. To improve performance, modern processors make use of multiple levels of caches such as translation lookaside buffers (TLBs), special-purpose page table caches and even general data caches. To mount a successful attack on page tables, attackers often need to repeatedly flush these caches [12, 13, 25, 26] to observe the system's behavior when manipulating page tables. The details of TLBs and data caches are easy to find by examining processor manuals [15, 16]. Information on page table caches, such as their size and behavior, however, is often lacking. Without this information, attackers need to resort to trial and error, and it becomes difficult to build robust attacks that work across different architectures.

In this paper, we retrofit AnC [12], an existing EVICT+TIME side-channel attack on the MMU, to build RevAnC, a reverse engineering framework for determining the size and internal architecture of page table caches, as well as how they interact with other caches, on 23 microarchitectures comprising various generations of Intel, ARM and AMD processors. RevAnC relies on the fact that page table lookups by the MMU are stored in the last level cache (LLC) in order to speed up the next required translation. By flushing parts of the LLC and timing the page table lookup, RevAnC can identify which parts of the LLC store page tables. On top of flushing the LLC, RevAnC needs to flush the TLB as well as the page table caches. Since information on the size of the TLB and the LLC is available, we can use RevAnC to reverse engineer the properties of the page table caches that are of interest to attackers, such as their internal architecture and size.
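As an illustration of the EVICT+TIME primitives that such a measurement relies on, the following minimal sketch (not the RevAnC implementation; the helper names, the cache-line constant and the reliance on the __rdtscp intrinsic are our simplifying assumptions) evicts a candidate LLC region and then times an access that, once the corresponding TLB entry is gone, forces a page table walk:

```c
#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>   /* __rdtscp */

#define CACHE_LINE 64

/* Walk a large buffer at cache-line granularity so that the accesses
 * (probabilistically) evict the LLC region under test. */
static void evict(volatile uint8_t *buf, size_t size)
{
    for (size_t i = 0; i < size; i += CACHE_LINE)
        buf[i] += 1;
}

/* Time one access to `target` in cycles. If the TLB entry for `target`
 * has also been evicted, the access triggers a page table walk; a high
 * latency then suggests that the evicted LLC region was caching the
 * page table entries used for the walk. */
static uint64_t time_access(volatile uint8_t *target)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*target;
    uint64_t end = __rdtscp(&aux);
    return end - start;
}

/* Usage sketch: evict() the slice of an eviction buffer that maps to
 * the cache sets under test, make sure the TLB entry for `target` is
 * gone (e.g. by touching many other pages), then compare
 * time_access(target) against a baseline with warm page tables. */
```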
Summarizing, we make the following contributions:

• We describe a novel technique to reverse engineer the undocumented page table caches commonly found in modern processors.

• We evaluate our technique on 23 different microarchitectures from recent Intel, ARM and AMD processors.

• We release RevAnC as open-source software. RevAnC provides a convenient interface to flush page table caches on the microarchitectures that we tested and can automatically detect page table caches on new processors.

More information about AnC and RevAnC can be found at: https://www.vusec.net/projects/anc

We provide some necessary background in Section 2. We then discuss our design and implementation on multiple architectures in Section 3 before evaluating it on 23 different processors in Section 4. We discuss related work in Section 5 and conclude in Section 6.

2. BACKGROUND AND MOTIVATION

In this section, we discuss the paging memory management scheme and its implementation on most modern processors. Furthermore, we look at how the MMU performs virtual address translation, and at the caches that are used to improve the performance of this translation.

2.1 Paging and the MMU

Paging has become integral to modern processor architectures as it simplifies the management of physical memory by virtualizing it: the operating system no longer needs to relocate the entire memory of applications due to a limited address space, and it no longer needs to deal with fragmentation of physical memory. Furthermore, the operating system can limit the memory to which a process has access, preventing malicious or malfunctioning code from interfering with other processes.

As a direct consequence, many modern processor architectures use an MMU, a hardware component responsible for the translation of virtual addresses to the corresponding physical addresses.

[Figure 1: MMU's page table walk to translate 0x644b321f4000 to its corresponding memory page on the x86_64 architecture. CR3 points to the level-1 page table; entry 200 of level 1, entry 300 of level 2, entry 400 of level 3 and entry 500 of level 4 lead to the target physical page.]
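To make the walk in Figure 1 concrete, the following sketch (plain C, not taken from the paper; it assumes the standard x86_64 four-level format with 9-bit indices per level and a 12-bit page offset) recovers the per-level page table indices from the virtual address 0x644b321f4000:

```c
#include <stdint.h>
#include <stdio.h>

/* Split an x86_64 virtual address into its four 9-bit page table
 * indices and the 12-bit page offset (levels numbered from the root,
 * as in Figure 1). */
int main(void)
{
    uint64_t va = 0x644b321f4000ULL;

    for (int level = 1; level <= 4; level++) {
        unsigned shift = 12 + 9 * (4 - level);   /* 39, 30, 21, 12 */
        unsigned index = (va >> shift) & 0x1ff;  /* 9 bits per level */
        printf("level %d index: %u\n", level, index);
    }
    printf("page offset: %llu\n", (unsigned long long)(va & 0xfff));

    /* Prints indices 200, 300, 400 and 500 with offset 0,
     * matching the entries shown in Figure 1. */
    return 0;
}
```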
2.2 Caching MMU's Operations

The performance of memory accesses improves greatly if the MMU can avoid having to resolve a virtual address that it has already resolved recently. For this purpose, CPUs store the resolved address mappings in the TLB. Hence, a hit in the TLB removes the need for an expensive page table walk. In addition, to improve the performance of a TLB miss, the processor stores the page table data in the data caches.

Modern processors may improve the performance of a TLB miss even further by means of page table caches or translation caches that store PTEs for different page table levels [2, 3]. While page table caches use the physical address and the PTE index for indexing, translation caches use the partially resolved virtual address instead. With translation caches, the MMU can look up the virtual address and select the page table with the longest matching prefix. Hence, translation caches skip the upper levels of the page table and continue the translation with the lowest-level page table present within the cache for the given virtual address. While this allows the MMU to skip part of the page table walk, the implementation of translation caches also comes with additional complexity. Furthermore, these caches can be implemented as split dedicated caches for different page table levels, as one single unified cache for different page table levels, or even as a TLB that is also capable of caching PTEs. Using RevAnC, we discover that, for instance, AMD's Page Walking Caches, as found in the AMD K8 and AMD K10 microarchitectures, and Intel's Page-Structure Caches are examples of unified page table caches and split translation caches, respectively.
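To make the longest-prefix lookup of a translation cache concrete, the following simplified model (our illustration, not vendor documentation; the entry layout and the 9-bits-per-level split of a 48-bit virtual address are assumptions) returns the deepest cached page table level that matches a given virtual address, which is where the MMU can resume the walk. A page table cache, by contrast, would be indexed with the physical address of a page table page and the PTE index within it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a (simplified) unified translation cache: it caches the
 * physical address of the page table at `level` serving every virtual
 * address that shares the prefix `va_prefix`. */
struct tc_entry {
    bool     valid;
    int      level;      /* 1 (root) .. 3; level-4 results live in the TLB */
    uint64_t va_prefix;  /* virtual address bits covering the cached levels */
    uint64_t pt_phys;    /* physical address of the next page table */
};

/* Bits of a 48-bit virtual address resolved by a hit at `level`:
 * a deeper level matches a longer prefix (9 bits per level). */
static uint64_t prefix_mask(int level)
{
    return ~((1ULL << (48 - 9 * level)) - 1);
}

/* Longest-prefix match: return the entry covering the deepest page
 * table level for `va`, letting the MMU skip the upper levels of the
 * walk; NULL means a full walk starting from CR3 is needed. */
static const struct tc_entry *tc_lookup(const struct tc_entry *tc,
                                        size_t n, uint64_t va)
{
    const struct tc_entry *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (!tc[i].valid)
            continue;
        if ((va & prefix_mask(tc[i].level)) != tc[i].va_prefix)
            continue;
        if (best == NULL || tc[i].level > best->level)
            best = &tc[i];
    }
    return best;
}
```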