Virtual Breakpoints for X86/64

Total Page:16

File Type:pdf, Size:1020Kb

Virtual Breakpoints for X86/64 Virtual Breakpoints for x86/64 Gregory Price1∗y August 22, 2019 Abstract Debuggers [1] and other instrumentation tools [3][4][7][11] are becoming increasingly complex as Efficient, reliable trapping of execution in a pro- malware deploys equally complex anti-analysis gram at the desired location is a linchpin tech- techniques. Many techniques require special mem- nique for dynamic malware analysis. The pro- ory allocators, compilers, or complex systems to gression of debuggers and malware is akin to a control the \view" of memory [10, 2]. In some game of cat and mouse - each are constantly in cases, the efficiency or robustness of these tech- a state of trying to thwart one another. At the niques rely on architecture-level quirks that are core of most efficient debuggers today is a com- more of a happy accident, rather than intentional bination of virtual machines and traditional bi- design. For example, SPIDER breakpoints [2] nary modification breakpoints (int3). In this pa- rely on a caching quirk of Intel CPU's to remain per, we present a design for Virtual Breakpoints efficient, and many dynamic binary instrumenta- | a modification to the x86 MMU which brings tion systems [10] are a mess of trampolines and breakpoint management into hardware alongside runtime hooking. page tables. In this paper we demonstrate the Despite the growing complexity, all of these fundamental abstraction failures of current trap- systems must trade between efficiency, reliabil- ping methods, and design a new mechanism from ity, and transparency [4]. If a trapping mech- the hardware up. This design incorporates lessons anism introduces significant overhead, analysis learned from 50 years of virtualization and de- becomes tedious. Traps that can be bypassed bugger design to deliver fast, reliable trapping via detection or eviction inherently unreliable. without the pitfalls of traditional binary modifi- Traps that modify a target program's memory cation. can cause corruption. These trade-offs create a game of whack-a-mole, which suggests we are 1 Introduction simply treating symptoms, rather than address- ing the disease. Trapping solutions such as single-stepping, arXiv:1801.09250v3 [cs.OS] 20 Aug 2019 In the modern age of malware analysis and \fuzzing for dollars", security researchers still seek a ro- debug registers, and full system emulation have bust, transparent, efficient trapping system. The been tried, but all fail to meet the flexibility field heavily leverages virtualization technologies and efficiency requirements to reach wide-spread [2, 5, 6, 7, 8, 9, 14, 10]. Unfortunately, these tech- adoption. Single stepping a large, complex pro- nologies inherit the failings of traditional trap- gram (like an OS) is extremely inefficient [1][2]. ping methods while adding the complexity of Usage of debug registers can be detected[2]. Fi- architecture-specific implementations. nally, pure-emulation is simply no longer suffi- cient to run and debug modern operating sys- ∗ *This work was supported by Raytheon CSI tems. y1Gregory Price is price.gr at husky.neu.edu Researchers have made significant headway formation is stored on a byte-per-byte basis with in solving reliability and efficiency problems by the data frame. leveraging virtualization[4, 2, 3, 8], but most still For each byte read from a breakpointed page, rely on some form of binary modification. The an 8-bit value is retrieved from the breakpoint difficulty with binary modification for debugging buddy-frame and used to determine if an inter- hostile programs is the lack of assurance the mod- rupt is generated. This byte implements stan- ifications are not a) detected, b) bypassed, or dard read/write/execute breakpoint settings, and c) introducing undefined behavior. This is a generates a debug break prior to executing the result of these trapping mechanisms being de- target instruction if the conditions are met. The signed prior to the advent of modern security remaining bits remain open for future develop- and reverse engineering requirements. ment. \Cutting edge" x86/64 debugging systems such By doing away with binary modification as as SPIDER [2] may accomplish (to an extent) the de facto standard of breakpointing, we gain the goal of stealth and efficiency, but still fail reliability, transparency, and guaranteed correct- to mitigate the corruption and reliability prob- ness of target program execution | all without lem with binary modification breakpoints. Bi- sacrificing flexibility or efficiency. nary modification systems carry a presumption of well-behaved programs and lack of user error. 2 Background For example, if a 5-byte instruction is located at memory address 0x10, and a trap instruction is Debuggers and dynamic analysis tools tradition- inserted at memory address 0x12, the resulting ally implement breakpoints in one of three ways: behavior could be a range of unintended con- single-stepping, debug registers, and binary mod- sequences (Illegal Instructions, Jumps to Ran- ification [1]. Each mechanism is not without dom Memory, Memory Corruption, etc). SPI- their flaws, and much research [3, 4, 5, 6, 7, 8, 11] DER cannot handle this modification or user- has been done on the topic of mitigating the ver- error case directly, handling the degenerate case itable list of issues. by simply evicting the trap and falling back to In a review of debugging technology published emulation (trading reliability for efficiency). in 1990[16], Vernon Paxson lays out a list of In this paper, we propose a memory man- techniques for debugging software that is eerily agement unit extension for virtual machine de- familiar. Despite almost 30 years of research bugging that addresses each of these core issues. since then, few if any new hardware-supported We claim that the corruption issue is the criti- debugging capabilities have been developed by cal piece of the puzzle that, if solved, will lead major hardware manufacturers. Even the \new" to robust, efficient debuggers at both the system ARMv8.5 Memory Tagging Extensions (MTE) and virtual machine level. The current field of are not truly new, with Paxson discussing fully debuggers attempt to fix this fundamentally un- tagged architectures existing as far back as 1982 solvable problem by way of building \a better (possibly earlier). mousetrap", but the problem they are attempt- Despite modern tools trying to mitigate the ing to solve reduces to the halting problem. flaws of dated trapping mechanisms, each subse- Our proposed solution introduces a \break- quent system increases complexity, reduces effi- point buddy-frame" and adds a \breakpoint bit" ciency, or narrows in applicability. All still fall to page table entries. When a byte on a page prey to anti-debugging techniques such as timing with this bit set is accessed, the breakpoint frame attacks, code integrity checks, and just-in-time is referenced (in hardware) to determine if that compilation or code relocation. address has a breakpoint set. This breakpoint in- 2.1 Traditional Trapping Methods ical headaches, namely that it is easily detected and may cause corruption. Much research has Each established method of trapping exhibits their been done to hide these breakpoints, but only own unique failures. Each battles with trying to recently has an efficient solution emerged [2]. achieve flexibility, efficiency, transparency, and When Intel and AMD released virtualized Page reliability - but none seem to solve for all four. Table support (known as Extended or Nested Single-stepping approaches make executing Page Tables [12, 13]), the idea of transparent bi- large, complex programs unbearably slow. Typi- nary modification came to fruition with SPIDER cally implemented via the use of the eflags/rflags [2]. SPIDER introduced a binary modification register, a full context switch between debug- mechanism that made use of extended page ta- ger and debuggee is required on each instruc- bles to split the \read-write" view of guest data tion. This can push execution time to be orders from the \execute" view of data. By doing so, a of magnitude longer than the original program guest cannot view a trap set via binary modifi- [2]. Even emulation approaches, which can be cation, because the guest may only read from a viewed as a form of binary interpreter, are sim- sanitized view of memory. ply too slow for general use. Unfortunately, this transparent breakpoint- Debug Registers are an efficient but finite re- ing system still fails to be sufficiently flexible source that are not practical nor easily virtu- and reliable. First, it falls victim to the \Crit- alized. On x86 there are only 4 debug regis- ical Byte Problem" which will be described in ters (DR0-DR4), limiting a debugger to 4 to- the next section. Second, because the host must tal watch/break points. Further, because this is maintain consistency between data and execu- a physical register limitation, the debuggee can tion views, the guest still has a mechanism (writ- typically detect whether these registers are being ing to its code pages) with which to for a trap used [2][10]. eviction. Finally, while binary modification meets the Moving forward, we will demonstrate that requirement of efficient execution, it is easily de- systems relying on binary modification cannot tected by a debuggee which monitors the integrity achieve perfect flexibility and reliability. If we of its own code [2, 10, 5, 6]. On x86, these traps hope to accomplish truly transparent and effi- are implemented by placing an \int3" (debug cient breakpointing, then we must design it from break) instruction at the given address. To see the hardware up. the problem with this technique, imagine a de- buggee that periodically hashes its entire read- only codebase. It would be able to detect this 3 Overview change, and modify its behavior. It is still not apparent whether these tradi- In this section we discuss the goals of an \opti- tional breakpointing methods can meet the re- mal" trapping system.
Recommended publications
  • KDE 2.0 Development, Which Is Directly Supported
    23 8911 CH18 10/16/00 1:44 PM Page 401 The KDevelop IDE: The CHAPTER Integrated Development Environment for KDE by Ralf Nolden 18 IN THIS CHAPTER • General Issues 402 • Creating KDE 2.0 Applications 409 • Getting Started with the KDE 2.0 API 413 • The Classbrowser and Your Project 416 • The File Viewers—The Windows to Your Project Files 419 • The KDevelop Debugger 421 • KDevelop 2.0—A Preview 425 23 8911 CH18 10/16/00 1:44 PM Page 402 Developer Tools and Support 402 PART IV Although developing applications under UNIX systems can be a lot of fun, until now the pro- grammer was lacking a comfortable environment that takes away the usual standard activities that have to be done over and over in the process of programming. The KDevelop IDE closes this gap and makes it a joy to work within a complete, integrated development environment, combining the use of the GNU standard development tools such as the g++ compiler and the gdb debugger with the advantages of a GUI-based environment that automates all standard actions and allows the developer to concentrate on the work of writing software instead of managing command-line tools. It also offers direct and quick access to source files and docu- mentation. KDevelop primarily aims to provide the best means to rapidly set up and write KDE software; it also supports extended features such as GUI designing and translation in con- junction with other tools available especially for KDE development. The KDevelop IDE itself is published under the GNU Public License (GPL), like KDE, and is therefore publicly avail- able at no cost—including its source code—and it may be used both for free and for commer- cial development.
    [Show full text]
  • Using the GNU Debugger (Gdb) a Debugger Is a Very Useful Tool for Finding Bugs in a Program
    Using the GNU debugger (gdb) A debugger is a very useful tool for finding bugs in a program. You can interact with a program while it is running, start and stop it whenever, inspect the current values of variables and modify them, etc. If your program runs and crashes, it will produce a ‘core dump’. You can also use a debugger to look at the core dump and give you extra information about where the crash happened and what triggered it. Some debuggers (including recent versions of gdb) can also go backwards through your code: you run your code forwards in time to the point of an error, and then go backwards looking at the values of the key variables until you get to the start of the error. This can be slow but useful sometimes! To use a debugger effectively, you need to get the compiler to put extra ‘symbol’ information into the binary, otherwise all it will contain is machine code level – it is much more useful to have the actual variable names you used. To do this, you use: gfortran –g –O0 mycode.f90 –o mybinary where ‘-g’ is the compiler option to include extra symbols, -O0 is no optimization so the code is compiled exactly as written, and the output binary is called ’mybinary’. If the source files and executable file is in the same directory, then you can run the binary through the debugger by simply doing: gdb ./mybinary This will then put you into an interactive debugging session. Most commands can be shortened (eg ‘b’ instead of ‘break’) and pressing ‘enter’ will repeat the last command.
    [Show full text]
  • Tricore™ Tricore™ V1.6 Microcontrollers User Manual
    TriCore™ 32-bit TriCore™ V1.6 Core Architecture 32-bit Unified Processor Core User Manual (Volume 1) V1.0, 2012-05 Microcontrollers Edition 2012-05 Published by Infineon Technologies AG 81726 Munich, Germany © 2012 Infineon Technologies AG All Rights Reserved. Legal Disclaimer The information given in this document shall in no event be regarded as a guarantee of conditions or characteristics. With respect to any examples or hints given herein, any typical values stated herein and/or any information regarding the application of the device, Infineon Technologies hereby disclaims any and all warranties and liabilities of any kind, including without limitation, warranties of non-infringement of intellectual property rights of any third party. Information For further information on technology, delivery terms and conditions and prices, please contact the nearest Infineon Technologies Office (www.infineon.com). Warnings Due to technical requirements, components may contain dangerous substances. For information on the types in question, please contact the nearest Infineon Technologies Office. Infineon Technologies components may be used in life-support devices or systems only with the express written approval of Infineon Technologies, if a failure of such components can reasonably be expected to cause the failure of that life-support device or system or to affect the safety or effectiveness of that device or system. Life support devices or systems are intended to be implanted in the human body or to support and/or maintain and sustain and/or protect human life. If they fail, it is reasonable to assume that the health of the user or other persons may be endangered.
    [Show full text]
  • PRE LIM INARY KDII-D Processor Manual (PDP-Ll/04)
    PRE LIM I N A R Y KDII-D Processor Manual (PDP-ll/04) The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation. Digital Equipment Corporation assumes no responsibility for any errors that may appear in this manual. Copyright C 1975 by Digital Equipment Corporation Written by PDP-II Engineering PREFACE OVE~ALt DESCRIPTION 3 8 0 INsT~UCTION SET 3~1 Introduction 1~2 Addressing ModIs 3.3 Instruction Timing l@4 In~truet1on Dtlcriptlonl le!5 Dlffer~ncei a,tween KOlle and KOllS ]0 6 Programming Diftereneel B~twlen pepita ).7 SUI L~tency Tim@1 CPu OPERlTING 5PECIFICATID~S 5 e 0 DETAILED HARDWARE DESC~ltTION 5.1 IntrOduction 5 m2 Data path Circuitry 5.2g1 General D~serlpt1on 5.2,,2 ALU 5,2 .. 3 Scratch Pad Mtmory 5.2,.3.1 Scratch P~d Circuitry 5.2.3.2 SCrateh P~d Addr~11 MUltlplex.r 5.2.3.3 SCrateh Pad Regilter 5 0 2 .. 4 B J:H'!qilil ter 5.2 .. 4.1 B ReQister CircuItry 5.2 .. 4.2 8 L~G MUltl~lex@r 5,2.5 AMUX 5.2.6 Pfoeellof statuI Word (PSW) 5.3 Con<Htion COd@i 5.3.1 G~neral De~erl~tlon 5.3.2 Carry and overflow Decode 5.3.3 Ayte Multiplexing 5.4 UNIBUS Addf@$5 and Data Interfaces S.4 g 1 U~IBUS Dr1v~rl and ~lee1Vtrl 5.4@2 Bus Addr~s~ Generation 5 8 4111 Internal Addrf$$ D@coder 5 0 4,.4 Bus Data Line Interface S.5 Instruet10n DeCoding 5.5,,1 General De~er1pt1nn 5;5.2 Instruction Re91lt~r 5.5,,3 Instruetlon Decoder 5.5.3 0 1 Doubl~ O~erand InltrYetions 5.5.3.2 Single Operand Inltruetlons 5,5.3.3 Branch In$truetionl 5 8 5.3 6 4 Op~rate rn~truet1ons 5.6 Auil11ary ALU Control 5.6.1 G~ntral Delcrlptlon 5.6.2 COntrol Circuitry 5.6.2.1 Double Op@rand Instructions 5.6.2.2 Singl@ nDef~nd Instruet10ns PaQe 3 5.7 Data Transfer Control 5,7,1 General Descr1ption 5,7,2 Control CIrcuItry 5,7,2,1 Procelsor Cloek Inhibit 5,7,2.2 UNIBUS Synchronization 5,7.2,3 BUs Control 5.7,2.4 MSyN/SSYN Timeout 5,7.2.5 Bus Error.
    [Show full text]
  • Debugging a Program
    Debugging a Program SunPro A Sun Microsystems, Inc. Business 2550 Garcia Avenue Mountain View, CA 94043 U.S.A. Part No: 801-5106-10 Revision A December 1993 1993 Sun Microsystems, Inc. 2550 Garcia Avenue, Mountain View, California 94043-1100 U.S.A. All rights reserved. This product and related documentation are protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or related documentation may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Portions of this product may be derived from the UNIX® and Berkeley 4.3 BSD systems, licensed from UNIX System Laboratories, Inc. and the University of California, respectively. Third-party font software in this product is protected by copyright and licensed from Sun’s Font Suppliers. RESTRICTED RIGHTS LEGEND: Use, duplication, or disclosure by the United States Government is subject to the restrictions set forth in DFARS 252.227-7013 (c)(1)(ii) and FAR 52.227-19. The product described in this manual may be protected by one or more U.S. patents, foreign patents, or pending applications. TRADEMARKS Sun, Sun Microsystems, the Sun logo, SunPro, SunPro logo, Sun-4, SunOS, Solaris, ONC, OpenWindows, ToolTalk, AnswerBook, and Magnify Help are trademarks or registered trademarks of Sun Microsystems, Inc. UNIX and OPEN LOOK are registered trademarks of UNIX System Laboratories, Inc., a wholly owned subsidiary of Novell, Inc. All other product names mentioned herein are the trademarks of their respective owners. All SPARC trademarks, including the SCD Compliant Logo, are trademarks or registered trademarks of SPARC International, Inc.
    [Show full text]
  • Exceptions and Processes
    Exceptions and Processes! Jennifer Rexford! The material for this lecture is drawn from! Computer Systems: A Programmerʼs Perspective (Bryant & O"Hallaron) Chapter 8! 1 Goals of this Lecture! •#Help you learn about:! •# Exceptions! •# The process concept! … and thereby…! •# How operating systems work! •# How applications interact with OS and hardware! The process concept is one of the most important concepts in systems programming! 2 Context of this Lecture! Second half of the course! Previously! Starting Now! C Language! Application Program! language! service! levels! Assembly Language! levels! Operating System! tour! tour! Machine Language! Hardware! Application programs, OS,! and hardware interact! via exceptions! 3 Motivation! Question:! •# How does a program get input from the keyboard?! •# How does a program get data from a (slow) disk?! Question:! •# Executing program thinks it has exclusive control of CPU! •# But multiple programs share one CPU (or a few CPUs)! •# How is that illusion implemented?! Question:! •# Executing program thinks it has exclusive use of memory! •# But multiple programs must share one memory! •# How is that illusion implemented?! Answers: Exceptions…! 4 Exceptions! •# Exception! •# An abrupt change in control flow in response to a change in processor state! •# Examples:! •# Application program:! •# Requests I/O! •# Requests more heap memory! •# Attempts integer division by 0! •# Attempts to access privileged memory! Synchronous! •# Accesses variable that is not$ in real memory (see upcoming $ “Virtual Memory” lecture)! •# User presses key on keyboard! Asynchronous! •# Disk controller finishes reading data! 5 Exceptions Note! •# Note:! ! !Exceptions in OS % exceptions in Java! Implemented using! try/catch! and throw statements! 6 Exceptional Control Flow! Application! Exception handler! program! in operating system! exception! exception! processing! exception! return! (optional)! 7 Exceptions vs.
    [Show full text]
  • Providing Memory Performance Feedback in Modern Processors
    Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors Mark Horowitz Margaret Martonosi Todd C. Mowry Michael D. Smith Computer Systems Department of Department of Electrical Division of Laboratory Electrical Engineering and Computer Engineering Applied Sciences Stanford University Princeton University University of Toronto Harvard University [email protected] [email protected] [email protected] [email protected] Abstract other purely hardware-based mechanisms (e.g., stream buffers [Jou90]) are complete solutions. Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several prom- In addition to hardware mechanisms, a number of promising soft- ising software techniques have been shown to address this problem ware techniques have been proposed to avoid or tolerate memory successfully in specific situations. However, the generality of these latency. These software techniques have resorted to a variety of software approaches has been limited because current architectures different approaches for gathering information and reasoning about do not provide a fine-grained, low-overhead mechanism for memory performance. Compiler-based techniques, such as cache observing and reacting to memory behavior directly. To fill this blocking [AKL79,GJMS87,WL91] and prefetching [MLG92, need, we propose a new class of memory operations called inform- Por89] use static program analysis to predict which references are ing memory operations, which essentially consist of a memory likely to suffer misses. Memory performance tools have relied on operation combined (either implicitly or explicitly) with a condi- sampling or simulation-based approaches to gather memory statis- tional branch-and-link operation that is taken only if the reference tics [CMM+88,DBKF90,GH93,LW94,MGA95].
    [Show full text]
  • Intel® 64 and IA-32 Architectures Software Developer's Manual
    Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B: System Programming Guide, Part 2 NOTE: The Intel® 64 and IA-32 Architectures Software Developer's Manual consists of five volumes: Basic Architecture, Order Number 253665; Instruction Set Reference A-M, Order Number 253666; Instruction Set Reference N-Z, Order Number 253667; System Programming Guide, Part 1, Order Number 253668; System Programming Guide, Part 2, Order Number 253669. Refer to all five volumes when evaluating your design needs. Order Number: 253669-029US November 2008 INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANT- ED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR IN- TENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUA- TION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "unde- fined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
    [Show full text]
  • Chapter 3 Protected-Mode Memory Management
    CHAPTER 3 PROTECTED-MODE MEMORY MANAGEMENT This chapter describes the Intel 64 and IA-32 architecture’s protected-mode memory management facilities, including the physical memory requirements, segmentation mechanism, and paging mechanism. See also: Chapter 5, “Protection” (for a description of the processor’s protection mechanism) and Chapter 20, “8086 Emulation” (for a description of memory addressing protection in real-address and virtual-8086 modes). 3.1 MEMORY MANAGEMENT OVERVIEW The memory management facilities of the IA-32 architecture are divided into two parts: segmentation and paging. Segmentation provides a mechanism of isolating individual code, data, and stack modules so that multiple programs (or tasks) can run on the same processor without interfering with one another. Paging provides a mech- anism for implementing a conventional demand-paged, virtual-memory system where sections of a program’s execution environment are mapped into physical memory as needed. Paging can also be used to provide isolation between multiple tasks. When operating in protected mode, some form of segmentation must be used. There is no mode bit to disable segmentation. The use of paging, however, is optional. These two mechanisms (segmentation and paging) can be configured to support simple single-program (or single- task) systems, multitasking systems, or multiple-processor systems that used shared memory. As shown in Figure 3-1, segmentation provides a mechanism for dividing the processor’s addressable memory space (called the linear address space) into smaller protected address spaces called segments. Segments can be used to hold the code, data, and stack for a program or to hold system data structures (such as a TSS or LDT).
    [Show full text]
  • Definition of Terms Introduction Trap Are a Mechanism To
    Eidgenössische Definition of Terms Technische Hochschule Zürich Interrupt: Asynchronous interruption of the instruction flow Programming in Systems Caused by external event! (37-023) Hit program execution in Programming in Assembler «between» 2 Instructions Basics of Operating Systems Exception: Unexpected event during Models of Computer Architecture program execution (division by zero, page fault) Lecturer today: Prof. Thomas M. Stricker Hits «during» an instruction Text-/Reference-Books: Trap: Software generated Inter- R.P.Paul: SPARC Architecture... and C rupt Sun SPARC V8 Manual and K&R C Reference Caused by an exception or by an explicit trap instruction Topics of Today: (e.g. for a system call) • Traps, Exceptions and Supervisor Mode Supervisor Privileged execution mode • SPARC V8 Trap Model Mode: in contrast to User Mode • Traps in tkisem System Calls, Interrupt-, • Register Window Trap-Handlers Exception- and Traphandler execute in supervisor mode. 9/14.1.02 - 1 37-023 Systemprogrammierung © Alonso/Stricker 9/14.1.02 - 2 37-023 Systemprogrammierung © Alonso/Stricker Introduction Trap are a mechanism to... Standard library in C (libc.a) provides ... avoid programming mistakes with sys- two different kind of functions: tem resources. • pure library functions ... share common system resources in a fair way under OS control. • self contained code (sin, printf...) ... prevent access to protected memory • «visible» execution in user mode. segments. • examination in single-step mode of ... catch Instructions which would lead to the debugger possible. error conditions and inconsistent sys- • glue code to system calls tem state (z.B. Division by 0, Window- (read, write, open, close...) Overflow). • «invisible» execution ➜ Two modes of execution: Supervisor and User Mode • call some code in the operating system (Trap) ➜ Traps: • execute in supervisor mode Mechanism to get into/return from supervisor mode.
    [Show full text]
  • Design of the RISC-V Instruction Set Architecture
    Design of the RISC-V Instruction Set Architecture Andrew Waterman Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2016-1 http://www.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-1.html January 3, 2016 Copyright © 2016, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor David Patterson, Chair Professor Krste Asanovi´c Associate Professor Per-Olof Persson Spring 2016 Design of the RISC-V Instruction Set Architecture Copyright 2016 by Andrew Shell Waterman 1 Abstract Design of the RISC-V Instruction Set Architecture by Andrew Shell Waterman Doctor of Philosophy in Computer Science University of California, Berkeley Professor David Patterson, Chair The hardware-software interface, embodied in the instruction set architecture (ISA), is arguably the most important interface in a computer system. Yet, in contrast to nearly all other interfaces in a modern computer system, all commercially popular ISAs are proprietary.
    [Show full text]
  • Low-Overhead Call Path Profiling of Unmodified, Optimized Code
    Low-Overhead Call Path Profiling of Unmodified, Optimized Code Nathan Froyd John Mellor-Crummey Rob Fowler [email protected] [email protected] [email protected] Rice University, Houston, TX ABSTRACT tiple places. To understand the performance implications of Call path profiling associates resource consumption with data copying in such applications, it can thus be necessary the calling context in which resources were consumed. We to attribute costs at all of these levels. To do this in general describe the design and implementation of a low-overhead requires an efficient data collection mechanism that tracks call path profiler based on stack sampling. The profiler costs along entire call paths. Since understanding and im- uses a novel sample-driven strategy for collecting frequency proving the on-node performance of MPI programs is a key counts for call graph edges without instrumenting every pro- factor in overall performance, context-sensitive profiling is cedure’s code to count them. The data structures and al- of particular importance for large-scale parallel systems. gorithms used are efficient enough to construct the com- An ideal profiler should have the following properties. plete calling context tree exposed during sampling. The First, it should collect performance measurements with low, profiler leverages information recorded by compilers for de- controllable overhead; this makes it feasible to collect pro- bugging or exception handling to record call path profiles file information during production runs as well as during even for highly-optimized code. We describe an implemen- development. Second, it should identify differences in the tation for the Tru64/Alpha platform.
    [Show full text]