OS and Runtime Support for Efficiently Managing Cores in Parallel


OS and Runtime Support for Efficiently Managing Cores in Parallel Applications

by

Kevin Alan Klues

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley

Committee in charge:

Professor Eric Brewer, Chair
Professor Krste Asanović
Professor Brian Carver

Summer 2015

Copyright 2015 by Kevin Alan Klues

Abstract

OS and Runtime Support for Efficiently Managing Cores in Parallel Applications

by Kevin Alan Klues

Doctor of Philosophy in Computer Science

University of California, Berkeley

Professor Eric Brewer, Chair

Parallel applications can benefit from the ability to explicitly control their thread scheduling policies in user-space. However, modern operating systems lack the interfaces necessary to make this type of "user-level" scheduling efficient. The key component missing is the ability for applications to gain direct access to cores and keep control of those cores even when making I/O operations that traditionally block in the kernel. A number of former systems provided limited support for these capabilities, but they have been abandoned in modern systems, where all efforts are now focused exclusively on kernel-based scheduling approaches similar to Linux's. In many circles, the term "kernel" has actually become synonymous with Linux, and its Unix-based model of process and thread management is often assumed as the only possible choice.

In this work, we explore the OS and runtime support required to resurrect user-level threading as a viable mechanism for exploiting parallelism on modern systems. The idea is that applications request cores, not threads, from the underlying system and employ user-level scheduling techniques to multiplex their own set of threads on top of those cores. So long as an application has control of a core, it will have uninterrupted, dedicated access to it. This gives developers more control over what runs where (and when), so they can design their algorithms to take full advantage of the parallelism available to them. They express their parallelism via core requests and their concurrency via user-level threads.

We frame our work with a discussion of Akaros, a new, experimental operating system we developed, whose "Many-Core Process" (MCP) abstraction is built specifically with user-level scheduling in mind. The Akaros kernel provides low-level primitives for gaining direct access to cores, and a user-level library, called parlib, provides a framework for building custom threading packages that make use of those cores. From a parlib-based version of pthreads to a port of Lithe and the popular Go programming language, the combination of Akaros and parlib proves itself as a powerful medium to help bring user-level scheduling back from the dark ages.

Contents

Contents
List of Figures
List of Tables

1 Introduction
  1.1 Background and Motivation
  1.2 Akaros
    1.2.1 The Many-Core Process
    1.2.2 Resource Provisioning and Allocation
  1.3 Contributions and Roadmap

2 Parlib
  2.1 Abstraction Overview
  2.2 Thread Support Library
    2.2.1 Vcores
    2.2.2 Uthreads
    2.2.3 Synchronization Primitives
  2.3 Asynchronous Events
    2.3.1 Akaros vs. Linux
    2.3.2 The Event Delivery API
    2.3.3 Event Notifications
  2.4 Asynchronous Services
    2.4.1 System Calls
    2.4.2 The Alarm Service
    2.4.3 POSIX Signal Support
  2.5 Other Abstractions
    2.5.1 Memory Allocators
    2.5.2 Wait-Free Lists
    2.5.3 Dynamic TLS
  2.6 Evaluation
    2.6.1 The upthread Library
    2.6.2 The kweb Webserver
    2.6.3 Experiments
  2.7 Discussion

3 Lithe
  3.1 Architectural Overview
  3.2 The Lithe API
    3.2.1 Lithe Callbacks
    3.2.2 Lithe Library Calls
    3.2.3 Lithe Synchronization Primitives
  3.3 The Fork-Join Scheduler
  3.4 Evaluation
    3.4.1 SPQR Benchmark
    3.4.2 Image Processing Server
  3.5 Discussion

4 Go
  4.1 Go Overview
  4.2 Porting Go to Akaros
  4.3 Results
  4.4 Discussion and Future Work

5 Related Work
  5.1 Threading Models
  5.2 Concurrency vs. Parallelism
  5.3 Contemporary Systems

6 Conclusion

Bibliography

List of Figures

1.1 A high-level comparison between a traditional 1:1 process and an MCP

1.2 Three different core allocation schemes, based on (a) allocating cores strictly for future use, (b) allocating cores based solely on real-time demand, and (c) provisioning cores for future use, but allowing others to make use of them during periods of low demand

2.1 The collection of abstractions provided by parlib

2.2 Parlib and the APIs it exposes in support of building parallel runtimes based on user-level scheduling

2.3 This figure shows a direct comparison of threads running under a traditional 1:1 threading model, compared to parlib's decoupled threading model. Under parlib, all threading is done in user-space, and processes request cores (not threads) from the kernel, which pop up on a transition context represented by the vcore abstraction

2.4 This figure shows the process of passing control from uthread context to vcore context and back. Parlib provides the mechanisms to easily pass from one context to another. Scheduler writers use these mechanisms to implement their desired scheduling policies

2.5 This figure shows the layering of the vcore and uthread abstractions with respect to the bi-directional API they provide to scheduler writers

2.6 Average context switch latency/core of upthreads vs. the Linux NPTL. For upthreads, we show the various components that make up the total context switch latency. For the Linux NPTL, we show the latency when context switching a single thread vs. two. Error bars indicate the standard deviation of the measurements across cores. In all cases, this standard deviation is negligible

2.7 Total context switches per second of upthreads vs. the Linux NPTL for an increasing number of cores and a varying number of threads per core

2.8 Thread completion times of a fixed-work benchmark for different scheduling algorithms. The benchmark runs 1024 threads with 300 million loop iterations each, averaged over 50 runs

2.9 Speedup and average runtime of the OpenMP-based benchmarks from the NAS Parallel Benchmark Suite on upthreads vs. the Linux NPTL. Figure (a) shows the average runtime over 20 runs when configured for both 16 and 32 cores. Figure (b) shows the speedup. The red bars represent the speedup when all benchmarks are run on 16 cores. The blue bars represent the speedup when all benchmarks are run on 32 cores

2.10 Webserver throughput with 100 concurrent connections of 1000 requests each. Requests are batched to allow 100 concurrent requests/connection on the wire at a time. After each request is completed, a new one is placed on the wire. After each connection completes all 1000 requests, a new connection is started

2.11 Results from running the kweb throughput experiment on Akaros. The results are much worse due to a substandard networking stack on Akaros. The Linux NPTL results are shown for reference

3.1 Representative application of the type targeted by Lithe. Different components are built using distinct parallel runtimes depending on the type of computation they wish to perform. These components are then composed together to make up a full application

3.2 The evolution of parallel runtime scheduling from traditional OS scheduling in the kernel (a), to user-level scheduling with parlib (b), to user-level scheduling with Lithe (c). Lithe bridges the gap between providing an API similar to parlib's for building customized user-level schedulers, and allowing multiple parallel runtimes to seamlessly share cores among one another

3.3 Lithe interposing on parlib to expose its own API for parallel runtime development. Figure (a) shows a single parallel runtime sitting directly on parlib. Figure (b) shows Lithe interposing on the parlib API to support the development of multiple, cooperating parallel runtimes

3.4 An example scheduler hierarchy in Lithe. Figure (a) shows an example program invoking nested schedulers. Figure (b) shows the resulting scheduler hierarchy from this program. The Graphics and AI schedulers originate from the same point because they are both spawned from the same context managed by the upthread scheduler

3.5 The process of requesting and granting a hart in Lithe. A child process first puts in a request through lithe_hart_request(), which triggers a hart_request() callback on its parent. The parent uses this callback to register the pending request and grant a new hart to its child at some point in the future. The initial request is initiated in uthread context, but all callbacks and the subsequent lithe_hart_grant() occur in vcore context

3.6 The software architecture of the SPQR benchmark on Lithe. Cores are shared between OpenMP and TBB at different levels of the SPQR task tree

3.7 SPQR performance across a range of different input matrices. We compare its performance when running on the Linux NPTL (for both an 'out-of-the-box' configuration and a hand-tuned configuration based on optimal partitioning of cores) to running directly on parlib-based upthreads and Lithe. Lithe outperforms the others in all cases

3.8 Integrating the Flickr-like image processing server into kweb. Static file requests serviced by pthreads must share cores with thumbnail requests serviced by a mix of pthreads and OpenMP