A Shared Library and Dynamic Linking Monitor

Total Page:16

File Type:pdf, Size:1020Kb

A Shared Library and Dynamic Linking Monitor MONDO: A Shared Library and Dynamic Linking Monitor C. Donour Sizemore, Jacob R. Lilly, and David M. Beazley Department of Computer Science University of Chicago Chicago, Illinois 60637 {donour,jrlilly,beazley}@cs.uchicago.edu March 15, 2003 Abstract ically loadable extension modules. Similarly, dy- namic modules are commonly used to extend web Dynamic modules are one of the most attractive servers, browsers, and other large programming en- features of modern scripting languages. Dynamic vironments. modules motivate programmers to build their ap- Although dynamic linking offers many benefits plications as small components, then glue them to- such as increased modularity, simplified mainte- gether. They rely on the runtime linker to assemble nance, and extensibility, it has also produced a con- components and shared libraries as the application siderable amount of programmer confusion. Few runs. MONDO, a new debugging tool, provides pro- programmers would claim to really understand how grammers with the ability to monitor in real-time dynamic linking actually works. Moreover, the run- the dynamic linking of a program. MONDO sup- time linking process depends heavily on the system plies programmers with a graphical interface show- configuration, environment variables, subtle com- ing library dependencies, listing symbol bindings, piler and linker options, and the flags passed to low- and providing linking information used to uncover level dynamic loader. The interaction of these pieces subtle programming errors related to the use of tends to produce a very obscured picture of what shared libraries. The use of MONDO requires no is actually going on inside an application that uses modification to existing code or any changes to the shared libraries. Unfortunately, traditional debug- dynamic linker. MONDO can be used with any ap- ging tools are of little assistance since they are pri- plication that uses shared libraries. marily concerned with errors in program logic rather than errors that arise from the way in which a pro- gram is constructed and linked. 1 Introduction To address this limitation, we present a special purpose debugger, MONDO (Monitor of Dynamic For the past 10-15 years, shared libraries and dy- Objects), that is designed to help programmers dis- namic linking have changed the way application pro- cover problems related to the run-time linking of grammers write software. Instead of writing huge an application. MONDO works by watching real- monolithic applications, it is quite common to exe- time debug traces generated by the run-time linker cute smaller software components, extension mod- (ld.so.1) and using the output information to con- ules, and plugins. For example, when programmers struct a global view of an application, its libraries, use scripting languages like Python or Tcl, they also and dynamically loaded modules (if any). Using this tend to use a collection of loosely-coupled dynam- information, it is possible examine symbol bindings, library dependencies, as well as to uncover subtle % ldd a.out programming problems related to linking that are libsocket.so.1 => /usr/lib/libsocket.so.1 otherwise hidden from the programmer by the lim- libthread.so.1 => /usr/lib/libthread.so.1 itations of classic debugging tools. libdl.so.1 => /usr/lib/libdl.so.1 In the first part of this paper, we illustrate dy- libc.so.1 => /usr/lib/libc.so.1 namic linking and some of the common program- ming errors that arise. Next, we describe how run- When a dynamically linked program runs, con- time linking information can be extracted from the trol is first passed to a special runtime loader (often loader. This information is used to construct the called ld.so.1). The loader is then responsible for MONDO debugger and how it can be used by pro- locating, loading, and binding library dependencies grammers. to the the executable–a process that occurs at run- time. To improve performance, most of the runtime binding occurs lazily. Library symbols are not actu- 2 An Overview of Dynamic Linking ally bound until they are used for the first time by a program. This binding is managed by routing proce- To create an executable, a “linker” is used to dure calls through a special procedure linking table assemble the appropriate object files and libraries (PLT), a jump table use to match procedures with into a runnable program [2]. Linking is nothing memory addresses. Initially, the PLT is set up to more than the binding of symbolic names to mem- pass control back to the dynamic loader when a pro- ory addresses. For example, if you have some proce- cedure is called for the first time. When this occurs, dure foo() in your application, the linker calculates the dynamic linker locates the symbol in a library where in memory foo() will reside and patches all and patches the PLT to point to the newly bound references to foo() making sure they point to the symbol. Many programmers are unaware that link- memory address of foo(). Most programmers view ing occurs throughout the execution of a program. linking as merely the final step of building a pro- For instance, when a user selects certain applica- gram. That is,can one simply runs the linker and it tion features, the linker quietly runs underneath the collects all of the object files and library functions program–binding all of the newly used symbols as into some huge “bundle of bits” that is the program. they are needed. A popular misconceptions of program state is that In addition to automatically binding libraries, once linked the program runs statically without any the runtime loader can also be used to explicitly dynamic changes (i.e., one just executes the pro- load new program modules through a special dy- gram). namic loading API. Using calls such as dlopen() In reality, the situation is much more compli- and dysym(), programmers can load shared object cated and convoluted than the previously described files, search for symbols, and invoke functions. This model. Most modern systems use dynamic linking– API is the basis for systems that support dynami- a technique in which much of the actual linking is cally loadable extension modules and plugins. deferred until runtime. For example, when you cre- ate a program like this, 3 Enter MONDO % ld $(OBJS) -lsocket -lthread -ldl Unlike in conventional programming langauges The linker merely records the names of libraries in (C, C++, etc), users routinely change the ”struc- the executable instead of linking them. A command ture” of their program when they import modules. such as ldd can be used to list these library and their In scripting languages errors in program construc- dependencies. For example: tion can be considered programming errors. A single module may introduce many libraries with no clear 4 Data Collection way of watching what is happening to a process as they are linked. Here a mistake in program logic As previously mentioned, MONDO collects infor- results in an incorrectly built program. mation from ld.so.1 using its debugging mode. The runtime linker (ld.so.1) on many systems can This is enabled by setting the LD DEBUG and supply real-time updates as it binds symbols from LD DEBUG OUTPUT environment variables of a dynamic objects. The LD DEBUG environment process. Debugging data is redirected to a file where variable allows the user to collect information about it is read and parsed by MONDO as the process exe- libraries, dependencies, bindings, and relocations as cutes. Information of interest includes library search they are processed by ld.so.1 during program exe- paths, library loading, library dependencies, symbol cution. Figure 1 illustrates some of this output for bindings, and relocations. In addition to collecting Solaris. the trace data, MONDO also collects information from the library files themselves. This is used to The ld.so.1 debugging trace contains a wealth build a model of what libraries have been loaded, of information about almost everything the runtime what symbols are contained in those libraries, and linker is doing during program execution. Moreover, which symbols have actually been bound by the run- this information is presented in real-time–making time linker. it qualitatively different than the information pro- MONDO can simultaneously monitor any number vided by static library diagnostic tools such as ldd, of programs and is limited only by the underlying nm, or elfdump. However, the debug trace is often operating system (number of process, open file han- unreadable and contains too much unordered infor- dles, etc.). Since tracing is enabled by the environ- mation to be easily usable by programmers. ment (and environments are preserved after calls to MONDO is a tool that synthesizes real-time data exec()), any process spawned by a traced program from the dynamic loader with the information that will be traced itself. Process ID is provided inline is available from existing library diagnostic tools with traces data (first column in fig.1) and ld.so.1 (e.g., nm, ldd, etc.). Moreover, it tries to present is able to dump trace data to a file corresponding this information to the user in a coherent manner the PID. MONDO can follow the growth of a pro- using an interactive graphical user interface. The cess tree without the need to do system call tracing. system consists of two components; a real-time data Function overloading, Namespaces, etc. do not collector that extracts data from the ld.so.1 trace exists at the object level. To support these fea- and an analysis tool that assembles this data, com- tures C++ compiles mangle type names. The man- bines it with static library information, and presents gled names reflect typing and scope information. it to the user. MONDO uses the demangingle code from GCC to To use MONDO, a programmer simply types demangle these symbols names.
Recommended publications
  • Program-Library-HOWTO.Pdf
    Program Library HOWTO David A. Wheeler version 1.20, 11 April 2003 This HOWTO for programmers discusses how to create and use program libraries on Linux. This includes static libraries, shared libraries, and dynamically loaded libraries. Program Library HOWTO Table of Contents 1. Introduction.....................................................................................................................................................1 2. Static Libraries................................................................................................................................................2 3. Shared Libraries.............................................................................................................................................3 3.1. Conventions......................................................................................................................................3 3.1.1. Shared Library Names.............................................................................................................3 3.1.2. Filesystem Placement..............................................................................................................4 3.2. How Libraries are Used....................................................................................................................4 3.3. Environment Variables.....................................................................................................................5 3.3.1. LD_LIBRARY_PATH............................................................................................................5
    [Show full text]
  • Linux Standard Base Development Kit for Application Building/Porting
    Linux Standard Base Development Kit for application building/porting Rajesh Banginwar Nilesh Jain Intel Corporation Intel Corporation [email protected] [email protected] Abstract validating binaries and RPM packages for LSB conformance. We conclude with a couple of case studies that demonstrate usage of the build The Linux Standard Base (LSB) specifies the environment as well as the associated tools de- binary interface between an application and a scribed in the paper. runtime environment. This paper discusses the LSB Development Kit (LDK) consisting of a build environment and associated tools to assist software developers in building/porting their applications to the LSB interface. Developers 1 Linux Standard Base Overview will be able to use the build environment on their development machines, catching the LSB porting issues early in the development cycle The Linux* Standard Base (LSB)[1] specifies and reducing overall LSB conformance testing the binary interface between an application and time and cost. Associated tools include appli- a runtime environment. The LSB Specifica- cation and package checkers to test for LSB tion consists of a generic portion, gLSB, and conformance of application binaries and RPM an architecture-specific portion, archLSB. As packages. the names suggest, gLSB contains everything that is common across all architectures, and This paper starts with the discussion of ad- archLSBs contain the things that are specific vantages the build environment provides by to each processor architecture, such as the ma- showing how it simplifies application develop- chine instruction set and C library symbol ver- ment/porting for LSB conformance. With the sions. availability of this additional build environment from LSB working group, the application de- As much as possible, the LSB builds on ex- velopers will find the task of porting applica- isting standards, including the Single UNIX tions to LSB much easier.
    [Show full text]
  • Today's Big Adventure
    Today’s Big Adventure f.c gcc f.s as f.o ld a.out c.c gcc c.s as c.o • How to name and refer to things that don’t exist yet • How to merge separate name spaces into a cohesive whole • More information: - How to write shared libraries - Run “nm,” “objdump,” and “readelf” on a few .o and a.out files. - The ELF standard - Examine /usr/include/elf.h 3 / 45 How is a program executed? • On Unix systems, read by “loader” compile time run time ld loader cache - Reads all code/data segments into buer cache; Maps code (read only) and initialized data (r/w) into addr space - Or...fakes process state to look like paged out • Lots of optimizations happen in practice: - Zero-initialized data does not need to be read in. - Demand load: wait until code used before get from disk - Copies of same program running? Share code - Multiple programs use same routines: share code 4 / 45 x86 Assembly syntax • Linux uses AT&T assembler syntax – places destination last - Be aware that intel syntax (used in manual) places destination first • Types of operand available: - Registers start with “%”– movl %edx,%eax - Immediate values (constants) prefixed by “$”– movl $0xff,%edx - (%reg) is value at address in register reg – movl (%edi),%eax - n(%reg) is value at address in (register reg)+n – movl 8(%ebp),%eax - *%reg in an indirection through reg – call *%eax - Everything else is an address – movl var,%eax; call printf • Some heavily used instructions - movl – moves (copies) value from source to destination - pushl/popl – pushes/pops value on stack - call – pushes next
    [Show full text]
  • Meson Manual Sample.Pdf
    Chapter 2 How compilation works Compiling source code into executables looks fairly simple on the surface but gets more and more complicated the lower down the stack you go. It is a testament to the design and hard work of toolchain developers that most developers don’t need to worry about those issues during day to day coding. There are (at least) two reasons for learning how the system works behind the scenes. The first one is that learning new things is fun and interesting an sich. The second one is that having a grasp of the underlying system and its mechanics makes it easier to debug the issues that inevitably crop up as your projects get larger and more complex. This chapter aims outline how the compilation process works starting from a single source file and ending with running the resulting executable. The information in this chapter is not necessary to be able to use Meson. Beginners may skip it if they so choose, but they are advised to come back and read it once they have more experience with the software build process. The treatise in this book is written from the perspective of a build system. Details of the process that are not relevant for this use have been simplified or omitted. Entire books could (and have been) written about subcomponents of the build process. Readers interested in going deeper are advised to look up more detailed reference works such as chapters 41 and 42 of [10]. 2.1 Basic term definitions compile time All operations that are done before the final executable or library is generated are said to happen during compile time.
    [Show full text]
  • 2D-Compiler Libraries.Pdf
    Laboratorio di Tecnologie dell'Informazione Ing. Marco Bertini marco.bertini@unifi.it http://www.micc.unifi.it/bertini/ lunedì 17 marzo 14 How the compiler works Programs and libraries lunedì 17 marzo 14 The compiler "In C++, everytime someone writes ">> 3" instead of "/ 8", I bet the compiler is like, "OH DAMN! I would have never thought of that!" - Jon Shiring (Call of Duty 4 / MW2) lunedì 17 marzo 14 What is a compiler ? • A compiler is a computer program (or set of programs) that translate source code from a high-level programming language to a lower level language (e.g., assembly language or machine code). • A compiler typically performs: lexical analysis (tokenization), preprocessing, parsing, semantic analysis, code generation, and code optimization. lunedì 17 marzo 14 From source code to a running program text editor Source file compiler Object file Libraries linker Running program loader Executable lunedì 17 marzo 14 From source to object file • Lexical analysis: breaks the source code text into small pieces source file Lexical analyzer called tokens. Each token is a single atomic unit of the language, for instance a keyword, identifier or symbol name. • Preprocessing: in C/C++ macro substitution and conditional Preprocessor Analysis compilation • Syntax analysis: token sequences are parsed to identify the Syntax analyzer syntactic structure of the program. • Semantic analysis: semantic checks such as type checking (checking for type errors), or object binding (associating Semantic analyzer variable and function references with their definitions), or definite assignment (requiring all local variables to be initialized before use), rejecting incorrect programs or issuing warnings • Code generation: translation into the output language, usually Code generator Synthesis the native machine language of the system.
    [Show full text]
  • Dependency Management in C++ Xavier Bonaventura BMW AG
    Dependency management in C++ Xavier Bonaventura BMW AG. xbonaventurab limdor limdor xavierbonaventura code::dive 2019 – Wrocław, Poland – 20th November 2019 About me • I… • studied software engineering • created a 3D model visualization tool in C++ and OpenGL during my PhD https://github.com/limdor/quoniam (~10.000 LOC) (2010 - 2015) • moved to Munich to work in 2015 BMW AG – • started attending the C++ User Group Munich (MUC++) to realize that I knew nothing about C++ • decided to do this presentation about dependencies after 4 years attending to C++ meetups every month Xavier Bonaventura Bonaventura Xavier • remembered that I developed the 3D model visualization tool when I knew nothing about C++ 2 Goal Awareness and better understanding of the BMW BMW AG dependencies in your project – Xavier Bonaventura Bonaventura Xavier 3 What will we see? • Basic dependency concepts • Difference between declaration and definition • Building process BMW AG – • Execution sequence of a process • Implications of the design of a library Xavier Bonaventura Bonaventura Xavier • Examples with code 4 What are dependencies? 5 Basic concepts Build and runtime phase Library design Examples Dependencies in real life Apple pie Apples Pastry Sugar Flour Butter Egg white Lemon juice Spices Apple tree BMW AG – Butter Flour Salt Sugar Egg Lemon Milk Wheat Chicken Nutmeg Lemon tree Cow Egg Ginger Bonaventura Xavier Cinnamon 6 Chicken Basic concepts Build and runtime phase Egg Library design Examples Dependencies in real life digraph G { "apple pie" -> apples apples
    [Show full text]
  • Boost.Build V2 User Manual
    Boost.Build V2 User Manual XML to PDF by RenderX XEP XSL-FO Formatter, visit us at http://www.renderx.com/ Boost.Build V2 User Manual Copyright © 2006-2009 Vladimir Prus Distributed under the Boost Software License, Version 1.0. (See accompanying ®le LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) XML to PDF by RenderX XEP XSL-FO Formatter, visit us at http://www.renderx.com/ Table of Contents How to use this document ........................................................................................................................................ 1 Installation ............................................................................................................................................................ 2 Tutorial ................................................................................................................................................................ 3 Hello, world .................................................................................................................................................. 4 Properties ..................................................................................................................................................... 5 Project Hierarchies ......................................................................................................................................... 7 Dependent Targets .........................................................................................................................................
    [Show full text]
  • TASKING VX-Toolset for C166 User Guide TASKING VX-Toolset for C166 User Guide
    TASKING VX-toolset for C166 User Guide TASKING VX-toolset for C166 User Guide Copyright © 2006 Altium Limited. All rights reserved.You are permitted to print this document provided that (1) the use of such is for personal use only and will not be copied or posted on any network computer or broadcast in any media, and (2) no modifications of the document is made. Unauthorized duplication, in whole or part, of this document by any means, mechanical or electronic, including translation into another language, except for brief excerpts in published reviews, is prohibited without the express written permission of Altium Limited. Unauthorized duplication of this work may also be prohibited by local statute. Violators may be subject to both criminal and civil penalties, including fines and/or imprisonment. Altium, TASKING, and their respective logos are trademarks or registered trademarks of Altium Limited or its subsidiaries. All other registered or unregistered trademarks referenced herein are the property of their respective owners and no trademark rights to the same are claimed. Table of Contents 1. C Language .................................................................................................................. 1 1.1. Data Types ......................................................................................................... 1 1.2. Changing the Alignment: __unaligned and __packed__ ............................................... 3 1.3. Accessing Memory .............................................................................................
    [Show full text]
  • C++ ABI for the ARM Architecture
    C++ ABI for the ARM architecture C++ ABI for the ARM® Architecture Document number: ARM IHI 0041D, current through ABI release 2.09 Date of Issue: 30th November 2012 Abstract This document describes the C++ Application Binary Interface for the ARM architecture. Keywords C++ ABI, generic C++ ABI, exception handling ABI How to find the latest release of this specification or report a defect in it Please check the ARM Information Center (http://infocenter.arm.com/) for a later release if your copy is more than one year old (navigate to the ARM Software development tools section, ABI for the ARM Architecture subsection). Please report defects in this specification to arm dot eabi at arm dot com. Licence THE TERMS OF YOUR ROYALTY FREE LIMITED LICENCE TO USE THIS ABI SPECIFICATION ARE GIVEN IN SECTION 1.4, Your licence to use this specification (ARM contract reference LEC-ELA-00081 V2.0). PLEASE READ THEM CAREFULLY. BY DOWNLOADING OR OTHERWISE USING THIS SPECIFICATION, YOU AGREE TO BE BOUND BY ALL OF ITS TERMS. IF YOU DO NOT AGREE TO THIS, DO NOT DOWNLOAD OR USE THIS SPECIFICATION. THIS ABI SPECIFICATION IS PROVIDED “AS IS” WITH NO WARRANTIES (SEE SECTION 1.4 FOR DETAILS). Proprietary notice ARM, Thumb, RealView, ARM7TDMI and ARM9TDMI are registered trademarks of ARM Limited. The ARM logo is a trademark of ARM Limited. ARM9, ARM926EJ-S, ARM946E-S, ARM1136J-S ARM1156T2F-S ARM1176JZ-S Cortex, and Neon are trademarks of ARM Limited. All other products or services mentioned herein may be trademarks of their respective owners. ARM IHI 0041D Copyright © 2003-2009, 2012 ARM Limited.
    [Show full text]
  • Creating Pascal Bindings for C V1.0
    Creating Pascal bindings for C v1.0 W Hunter August 2016 Contents 1 Purpose of document 3 2 Prerequisites 3 3 Development environments 4 3.1 Windows 7/8.1/10 ......................................... 4 3.2 Linux (Ubuntu) ............................................ 4 3.3 Compiler versions .......................................... 5 3.4 Conventions .............................................. 5 4 Linking to Object iles 6 4.1 The sum of two numbers ...................................... 6 4.1.1 C source iles ........................................ 6 4.1.2 Pascal source iles ..................................... 8 4.2 Printing to screen .......................................... 10 4.2.1 C source iles ........................................ 10 4.2.2 Pascal source iles ..................................... 12 4.3 Simple array functions ....................................... 14 4.3.1 C source iles ........................................ 14 4.3.2 Pascal source iles ..................................... 17 5 Linking to Static and Shared Object C libraries 19 1 5.1 C source iles for calculating Fibonacci numbers ....................... 19 5.2 Creating a Static C library (aka an archive) ........................... 20 5.2.1 Using the Static Library from Pascal .......................... 20 5.3 Creating a Shared C library (aka a Shared Object ile) .................... 23 5.3.1 Using the Shared Library from Pascal ......................... 23 2 1 Purpose of document The purpose of this document is to show how to use C libraries from Object Pascal, more speciically using Free Pascal (FPC). It is inspired by [1], but showing a more up to date approach as per the Free Pascal Programmer’s Guide http://www.freepascal.org/docs-html/prog/progse28.html, and with help from ‘Thaddy’ on the FPC forum. I suggest you read everything in this document (that means the whole document), including the source iles, because I’ve also put explanations and hints in them.
    [Show full text]
  • Linking and Loading
    Linking and Loading CS61, Lecture 16 Prof. Stephen Chong October 25, 2011 Announcements •Midterm exam in class on Thursday •80 minute exam •Open book, closed note. No electronic devices allowed •Please be punctual, so we can start on time •CS 61 Infrastructure maintenance •SEAS Academic Computing need to do some maintainance •All CS 61 VMs will be shut down on Thursday •You will need to re-register (i.e., run “seas-cloud register”) next time you use the infrastructure •If this will cause problems for you, please contact course staff Stephen Chong, Harvard University 2 Today •Getting from C programs to executables •The what and why of linking •Symbol relocation •Symbol resolution •ELF format •Loading •Static libraries •Shared libraries Stephen Chong, Harvard University 3 Linking •Linking is the process of combining pieces of data and code into a single file than can be loaded into memory and executed. •Why learn about linking? •Help you build large programs •Avoid dangerous programming errors •Understand how to use shared libraries •Linking can be done •at compile time (aka statically) •at load time •at run time Stephen Chong, Harvard University 4 Static linking •Compiler driver coordinates translation and linking •gcc is a compiler driver, invoking C preprocessor (cpp), C compiler (cc1), assembler (as), and linker (ld) main.c swap.c Source files gcc -c gcc -c main.o swap.o Separately compiled relocatable object files gcc -o myprog ... Fully linked executable object file myprog Contains code and data from main.o and swap.o Stephen Chong, Harvard University 5 Example program with two C files Definition of buf[] Declaration of buf[] main.c swap.c int buf[2] = {1, 2}; extern int buf[]; int main() { void swap() { swap(); int temp; return 0; } temp = buf[1]; buf[1] = buf[0]; Reference to swap() buf[0] = temp; } Definition of main() References to buf[] Definition of swap() Stephen Chong, Harvard University 6 The linker •The linker takes multiple object files and combines them into a single executable file.
    [Show full text]
  • DDL: Extending Dynamic Linking for Program Customization, Analysis, and Evolution
    DDL: Extending Dynamic Linking for Program Customization, Analysis, and Evolution Sumant Tambe Navin Vedagiri Naoman Abbas Jonathan E. Cook Department of Computer Science New Mexico State University Las Cruces, NM 88003 USA [email protected] Abstract methods, global data) to the runtime of the program. While new software languages and environments An extra module, the dynamic linker, is loaded with have moved towards richer introspective and manip- the program, and accomplishes the dynamic link- ulatable runtime environments, there is still much ing necessary for the program to complete its execu- traditional software that is compiled into platform- tion. Since symbols are not yet bound, a dynamic specific executables and runs in a context that does linker could allow a tool builder access to the bind- not easily offer such luxuries. Yet even in these ing operation, and allow the tool to take a variety environments, mechanisms such as dynamic link li- of actions. Inserting wrappers for tracing, security, braries do offer the potential of building more con- assertion checks, and many other uses ought to be trol over the deployed and running system, and can possible. Redirecting a binding to another symbol offer opportunities for supporting dynamic evolu- should be supported. Even run-time modification tion of such systems. Indeed, the basic dynamic of bindings should be possible. All of this is feasible linking mechanisms can be exploited to offer a rich without any modification or translation of the object component-based execution environment. code of the application. In this paper, we present our initial explorations Current dynamic linkers do allow rudimentary into building such support.
    [Show full text]