Proceedings of the Bsdcon 2002 Conference
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
Fax86: an Open-Source FPGA-Accelerated X86 Full-System Emulator
FAx86: An Open-Source FPGA-Accelerated x86 Full-System Emulator by Elias El Ferezli A thesis submitted in conformity with the requirements for the degree of Master of Applied Science (M.A.Sc.) Graduate Department of Electrical and Computer Engineering University of Toronto Copyright c 2011 by Elias El Ferezli Abstract FAx86: An Open-Source FPGA-Accelerated x86 Full-System Emulator Elias El Ferezli Master of Applied Science (M.A.Sc.) Graduate Department of Electrical and Computer Engineering University of Toronto 2011 This thesis presents FAx86, a hardware/software full-system emulator of commodity computer systems using x86 processors. FAx86 is based upon the open-source IA-32 full-system simulator Bochs and is implemented over a single Virtex-5 FPGA. Our first prototype uses an embedded PowerPC to run the software portion of Bochs and off- loads the instruction decoding function to a low-cost hardware decoder since instruction decode was measured to be the most time consuming part of the software-only emulation. Instruction decoding for x86 architectures is non-trivial due to their variable length and instruction encoding format. The decoder requires only 3% of the total LUTs and 5% of the BRAMs of the FPGA's resources making the design feasible to replicate for many- core emulator implementations. FAx86 prototype boots Linux Debian version 2.6 and runs SPEC CPU 2006 benchmarks. FAx86 improves simulation performance over the default Bochs by 5 to 9% depending on the workload. ii Acknowledgements I would like to begin by thanking my supervisor, Professor Andreas Moshovos, for his patient guidance and continuous support throughout this work. -
SYSCALL, IRET ● Opportunistic SYSRET Failed, Do IRET
entry_*.S A carefree stroll through kernel entry code Borislav Petkov SUSE Labs [email protected] Reasons for entry into the kernel ● System calls (64-bit, compat, 32-bit) ● Interrupts (NMIs, APIC, timer, IPIs... ) – software: INT 0x0-0xFF, INT3, … – external (hw-generated): CPU-ext logic, async to insn exec ● Architectural exceptions (sync vs async) – faults: precise, reported before faulting insn => restartable (#GP,#PF) – traps: precise, reported after trapping insn (#BP,#DB-both) – aborts: imprecise, not reliably restartable (#MC, unless MCG_STATUS.RIPV) 2 Intr/Ex entry ● IDT, int num index into it (256 vectors); all modes need an IDT ● If handler has a higher CPL, switch stacks ● A picture is always better: 3 45sec guide to Segmentation ● Continuous range at an arbitrary position in VA space ● Segments described by segment descriptors ● … selected by segment selectors ● … by indexing into segment descriptor tables (GDT,LDT,IDT,...) ● … and loaded by the hw into segment registers: – user: CS,DS,{E,F,G}S,SS – system: GDTR,LDTR,IDTR,TR (TSS) 4 A couple more seconds of Segmentation ● L (bit 21) new long mode attr: 1=long mode, 0=compat mode ● D (bit 22): default operand and address sizes ● legacy: D=1b – 32bit, D=0b – 16bit ● long mode: D=0b – 32-bit, L=1,D=1 reserved for future use ● G (bit 23): granularity: G=1b: seg limit scaled by 4K ● DPL: Descriptor Privilege Level of the segment 5 Legacy syscalls ● Call OS through gate descriptor (call, intr, trap or task gate) ● Overhead due to segment-based protection: – load new selector + desc into -
X86 Memory Protection and Translation
2/5/20 COMP 790: OS Implementation COMP 790: OS Implementation Logical Diagram Binary Memory x86 Memory Protection and Threads Formats Allocators Translation User System Calls Kernel Don Porter RCU File System Networking Sync Memory Device CPU Today’s Management Drivers Scheduler Lecture Hardware Interrupts Disk Net Consistency 1 Today’s Lecture: Focus on Hardware ABI 2 1 2 COMP 790: OS Implementation COMP 790: OS Implementation Lecture Goal Undergrad Review • Understand the hardware tools available on a • What is: modern x86 processor for manipulating and – Virtual memory? protecting memory – Segmentation? • Lab 2: You will program this hardware – Paging? • Apologies: Material can be a bit dry, but important – Plus, slides will be good reference • But, cool tech tricks: – How does thread-local storage (TLS) work? – An actual (and tough) Microsoft interview question 3 4 3 4 COMP 790: OS Implementation COMP 790: OS Implementation Memory Mapping Two System Goals 1) Provide an abstraction of contiguous, isolated virtual Process 1 Process 2 memory to a program Virtual Memory Virtual Memory 2) Prevent illegal operations // Program expects (*x) – Prevent access to other application or OS memory 0x1000 Only one physical 0x1000 address 0x1000!! // to always be at – Detect failures early (e.g., segfault on address 0) // address 0x1000 – More recently, prevent exploits that try to execute int *x = 0x1000; program data 0x1000 Physical Memory 5 6 5 6 1 2/5/20 COMP 790: OS Implementation COMP 790: OS Implementation Outline x86 Processor Modes • x86 -
Virtual Memory in X86
Fall 2017 :: CSE 306 Virtual Memory in x86 Nima Honarmand Fall 2017 :: CSE 306 x86 Processor Modes • Real mode – walks and talks like a really old x86 chip • State at boot • 20-bit address space, direct physical memory access • 1 MB of usable memory • No paging • No user mode; processor has only one protection level • Protected mode – Standard 32-bit x86 mode • Combination of segmentation and paging • Privilege levels (separate user and kernel) • 32-bit virtual address • 32-bit physical address • 36-bit if Physical Address Extension (PAE) feature enabled Fall 2017 :: CSE 306 x86 Processor Modes • Long mode – 64-bit mode (aka amd64, x86_64, etc.) • Very similar to 32-bit mode (protected mode), but bigger address space • 48-bit virtual address space • 52-bit physical address space • Restricted segmentation use • Even more obscure modes we won’t discuss today xv6 uses protected mode w/o PAE (i.e., 32-bit virtual and physical addresses) Fall 2017 :: CSE 306 Virt. & Phys. Addr. Spaces in x86 Processor • Both RAM hand hardware devices (disk, Core NIC, etc.) connected to system bus • Mapped to different parts of the physical Virtual Addr address space by the BIOS MMU Data • You can talk to a device by performing Physical Addr read/write operations on its physical addresses Cache • Devices are free to interpret reads/writes in any way they want (driver knows) System Interconnect (Bus) : all addrs virtual DRAM Network … Disk (Memory) Card : all addrs physical Fall 2017 :: CSE 306 Virt-to-Phys Translation in x86 0xdeadbeef Segmentation 0x0eadbeef Paging 0x6eadbeef Virtual Address Linear Address Physical Address Protected/Long mode only • Segmentation cannot be disabled! • But can be made a no-op (a.k.a. -
Protected Mode - Wikipedia
2/12/2019 Protected mode - Wikipedia Protected mode In computing, protected mode, also called protected virtual address mode,[1] is an operational mode of x86- compatible central processing units (CPUs). It allows system software to use features such as virtual memory, paging and safe multi-tasking designed to increase an operating system's control over application software.[2][3] When a processor that supports x86 protected mode is powered on, it begins executing instructions in real mode, in order to maintain backward compatibility with earlier x86 processors.[4] Protected mode may only be entered after the system software sets up one descriptor table and enables the Protection Enable (PE) bit in the control register 0 (CR0).[5] Protected mode was first added to the x86 architecture in 1982,[6] with the release of Intel's 80286 (286) processor, and later extended with the release of the 80386 (386) in 1985.[7] Due to the enhancements added by protected mode, it has become widely adopted and has become the foundation for all subsequent enhancements to the x86 architecture,[8] although many of those enhancements, such as added instructions and new registers, also brought benefits to the real mode. Contents History The 286 The 386 386 additions to protected mode Entering and exiting protected mode Features Privilege levels Real mode application compatibility Virtual 8086 mode Segment addressing Protected mode 286 386 Structure of segment descriptor entry Paging Multitasking Operating systems See also References External links History https://en.wikipedia.org/wiki/Protected_mode -
The Evolution of an X86 Virtual Machine Monitor
The Evolution of an x86 Virtual Machine Monitor Ole Agesen, Alex Garthwaite, Jeffrey Sheldon, Pratap Subrahmanyam {agesen, alextg, jeffshel, pratap}@vmware.com ABSTRACT While trap-and-emulate virtualization on mainframes was Twelve years have passed since VMware engineers first virtu- well understood at the time, it was – unsurprisingly – not a alized the x86 architecture. This technological breakthrough design goal for the 4004, whose first application was in fact kicked off a transformation of an entire industry, and virtu- a Busicom calculator [7]. However, as x86 CPUs expanded alization is now (once again) a thriving business with a wide their reach, the case for virtualization gradually emerged. range of solutions being deployed, developed and proposed. There was just one problem, seemingly fundamental: the But at the base of it all, the fundamental quest is still the generally accepted wisdom held that x86 virtualization was same: running virtual machines as well as we possibly can impossible, or at best merely impractical, due to inherited on top of a virtual machine monitor. architectural quirks and a requirement to retain compatibil- ity with legacy code. We review how the x86 architecture was originally virtual- ized in the days of the Pentium II (1998), and follow the This impasse was broken twelve years ago when VMware evolution of the virtual machine monitor forward through engineers first virtualized the x86 architecture entirely in the introduction of virtual SMP, 64 bit (x64), and hard- software. By basing the virtual machine monitor (VMM) ware support for virtualization to finish with a contempo- on binary translation, the architectural features that pre- rary challenge, nested virtualization. -
AMD 64-Bit Technology
Preliminary Information AMD 64-Bit Technology The AMD x86-64™ Architecture Programmers Overview Publication # 24108 Rev: A Issue Date: August 2000 Preliminary Information © 2000 Advanced Micro Devices, Inc. All rights reserved. The contents of this document are provided in connection with Advanced Micro Devices, Inc. (“AMD”) products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD’s Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right. AMD’s products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other applica- tion in which the failure of AMD’s product could create a situation where per- sonal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice. Trademarks AMD, the AMD logo, x86-64, AMD Athlon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. MMX is a trademark of Intel Corporation. -
X86 Memory Protection and Translation
x86 Memory Protection and Translation Don Porter CSE 506 Lecture Goal ò Understand the hardware tools available on a modern x86 processor for manipulating and protecting memory ò Lab 2: You will program this hardware ò Apologies: Material can be a bit dry, but important ò Plus, slides will be good reference ò But, cool tech tricks: ò How does thread-local storage (TLS) work? ò An actual (and tough) Microsoft interview question Undergrad Review ò What is: ò Virtual memory? ò Segmentation? ò Paging? Two System Goals 1) Provide an abstraction of contiguous, isolated virtual memory to a program 2) Prevent illegal operations ò Prevent access to other application or OS memory ò Detect failures early (e.g., segfault on address 0) ò More recently, prevent exploits that try to execute program data Outline ò x86 processor modes ò x86 segmentation ò x86 page tables ò Software vs. Hardware mechanisms ò Advanced Features ò Interesting applications/problems x86 Processor Modes ò Real mode – walks and talks like a really old x86 chip ò State at boot ò 20-bit address space, direct physical memory access ò Segmentation available (no paging) ò Protected mode – Standard 32-bit x86 mode ò Segmentation and paging ò Privilege levels (separate user and kernel) x86 Processor Modes ò Long mode – 64-bit mode (aka amd64, x86_64, etc.) ò Very similar to 32-bit mode (protected mode), but bigger ò Restrict segmentation use ò Garbage collect deprecated instructions ò Chips can still run in protected mode with old instructions Translation Overview 0xdeadbeef Segmentation 0x0eadbeef Paging 0x6eadbeef Virtual Address Linear Address Physical Address Protected/Long mode only ò Segmentation cannot be disabled! ò But can be a no-op (aka flat mode) x86 Segmentation ò A segment has: ò Base address (linear address) ò Length ò Type (code, data, etc). -
Ramp: Research Accelerator for Multiple Processors
..................................................................................................................................................................................................................................................... RAMP: RESEARCH ACCELERATOR FOR MULTIPLE PROCESSORS ..................................................................................................................................................................................................................................................... THE RAMP PROJECT’S GOAL IS TO ENABLE THE INTENSIVE, MULTIDISCIPLINARY INNOVATION John Wawrzynek THAT THE COMPUTING INDUSTRY WILL NEED TO TACKLE THE PROBLEMS OF PARALLEL David Patterson PROCESSING. RAMP ITSELF IS AN OPEN-SOURCE, COMMUNITY-DEVELOPED, FPGA-BASED University of California, EMULATOR OF PARALLEL ARCHITECTURES. ITS DESIGN FRAMEWORK LETS A LARGE, Berkeley COLLABORATIVE COMMUNITY DEVELOP AND CONTRIBUTE REUSABLE, COMPOSABLE DESIGN Mark Oskin MODULES. THREE COMPLETE DESIGNS—FOR TRANSACTIONAL MEMORY, DISTRIBUTED University of Washington SYSTEMS, AND DISTRIBUTED-SHARED MEMORY—DEMONSTRATE THE PLATFORM’S POTENTIAL. Shih-Lien Lu ...... In 2005, the computer hardware N Prototyping a new architecture in Intel industry took a historic change of direction: hardware takes approximately four The major microprocessor companies all an- years and many millions of dollars, nounced that their future products would even at only research quality. Christoforos Kozyrakis be single-chip multiprocessors, and that N Software -
Computer Architectures an Overview
Computer Architectures An Overview PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information. PDF generated at: Sat, 25 Feb 2012 22:35:32 UTC Contents Articles Microarchitecture 1 x86 7 PowerPC 23 IBM POWER 33 MIPS architecture 39 SPARC 57 ARM architecture 65 DEC Alpha 80 AlphaStation 92 AlphaServer 95 Very long instruction word 103 Instruction-level parallelism 107 Explicitly parallel instruction computing 108 References Article Sources and Contributors 111 Image Sources, Licenses and Contributors 113 Article Licenses License 114 Microarchitecture 1 Microarchitecture In computer engineering, microarchitecture (sometimes abbreviated to µarch or uarch), also called computer organization, is the way a given instruction set architecture (ISA) is implemented on a processor. A given ISA may be implemented with different microarchitectures.[1] Implementations might vary due to different goals of a given design or due to shifts in technology.[2] Computer architecture is the combination of microarchitecture and instruction set design. Relation to instruction set architecture The ISA is roughly the same as the programming model of a processor as seen by an assembly language programmer or compiler writer. The ISA includes the execution model, processor registers, address and data formats among other things. The Intel Core microarchitecture microarchitecture includes the constituent parts of the processor and how these interconnect and interoperate to implement the ISA. The microarchitecture of a machine is usually represented as (more or less detailed) diagrams that describe the interconnections of the various microarchitectural elements of the machine, which may be everything from single gates and registers, to complete arithmetic logic units (ALU)s and even larger elements. -
Selected Project Reports, Spring 2005 Advanced OS & Distributed Systems
Selected Project Reports, Spring 2005 Advanced OS & Distributed Systems (15-712) edited by Garth A. Gibson and Hyang-Ah Kim Jangwoo Kim††, Eriko Nurvitadhi††, Eric Chung††; Alex Nizhner†, Andrew Biggadike†, Jad Chamcham†; Srinath Sridhar∗, Jeffrey Stylos∗, Noam Zeilberger∗; Gregg Economou∗, Raja R. Sambasivan∗, Terrence Wong∗; Elaine Shi∗, Yong Lu∗, Matt Reid††; Amber Palekar†, Rahul Iyer† May 2005 CMU-CS-05-138 School of Computer Science Carnegie Mellon University Pittsburgh, PA 15213 †Information Networking Institute ∗Department of Computer Science ††Department of Electrical and Computer Engineering Abstract This technical report contains six final project reports contributed by participants in CMU’s Spring 2005 Advanced Operating Systems and Distributed Systems course (15-712) offered by professor Garth Gibson. This course examines the design and analysis of various aspects of operating systems and distributed systems through a series of background lectures, paper readings, and group projects. Projects were done in groups of two or three, required some kind of implementation and evalution pertaining to the classrom material, but with the topic of these projects left up to each group. Final reports were held to the standard of a systems conference paper submission; a standard well met by the majority of completed projects. Some of the projects will be extended for future submissions to major system conferences. The reports that follow cover a broad range of topics. These reports present a characterization of synchronization behavior -
The Memory Sinkhole : an Architectural Privilege Escalation Vulnerability { Domas // Black Hat 2015 Christopher Domas
The Memory Sinkhole : An architectural privilege escalation vulnerability { domas // black hat 2015 Christopher Domas Battelle Memorial Institute ./bio x86 architectural vulnerability Hidden for 20 years A new class of exploits Overview (demonstration) Ring 3 Ring 2 Ring 1 Ring 0 The x86 Rings Virtualization The Negative Rings… Some things are so important… Ring 0 should not have access to them The Negative Rings… Ring -1 The hypervisor The Negative Rings… CIH The Negative Rings… Some things are so important… Ring -1 should not have access to them The Negative Rings… Ring -2 System Management Mode (SMM) The Negative Rings… What if… A mode invisible to the OS SMM Power management SMM … in the beginning System safety Power button Century rollover Error logging Chipset errata Platform security SMM … evolution Cryptographically authenticated variables Signature verifications Firmware, SecureBoot Hardware locks TPM emulation and communication Platform Lock Box Interface to the Root of Trust SMM … pandora’s box Whenever we have anything … So important, we don’t want the kernel to screw it up So secret, it needs to hide from the OS and DMA So sensitive, it should never be touched … it gets tossed into SMM SMM … pandora’s box Ring 3 (Userland) Ring 0 (Kernel) Ring -1 (Hypervisor) Ring -2 (SMM) Processor If you think you own a system when you get to ring 0 … … you're wrong On modern systems, ring 0 is not in control Ring -2 has the hardware the firmware all the most critical security checks The deepest