Exploring the Barrier to Entry - Incremental Generational Garbage Collection for Haskell

Total Page:16

File Type:pdf, Size:1020Kb

Exploring the Barrier to Entry - Incremental Generational Garbage Collection for Haskell Exploring the Barrier to Entry - Incremental Generational Garbage Collection for Haskell A.M. Cheadle & A.J. Field S. Marlow & S.L. Peyton Jones Imperial College London Microsoft Research, Cambridge famc4,[email protected] fsimonmmar,[email protected] R.L. While The University of Western Australia, Perth [email protected] ABSTRACT Eventually, however, the region(s) containing long-lived We document the design and implementation of a \produc- objects (the \older" generation(s)) will fill up and it will be tion" incremental garbage collector for GHC 6.2. It builds necessary to perform a so-called major collection. on our earlier work (Non-stop Haskell) that exploited GHC's Major collections are typically expensive operations be- dynamic dispatch mechanism to hijack object code pointers cause the older generations are usually much larger than so that objects in to-space automatically scavenge them- the young one. Furthermore, collecting an old generation selves when the mutator attempts to \enter" them. This requires the collection of all younger generations so, regard- paper details various optimisations based on code speciali- less of the actual number of generations, the entire heap sation that remove the dynamic space, and associated time, will eventually require collection. Thus, although genera- overheads that accompanied our earlier scheme. We de- tional collection ensures a relatively small mean pause time, tail important implementation issues and provide a detailed the pause time distribution has a \heavy tail" due to the in- evaluation of a range of design alternatives in comparison frequent, but expensive, major collections. This renders the with Non-stop Haskell and GHC's current generational col- technique unsuitable for applications that have real-time re- lector. We also show how the same code specialisation tech- sponse requirements, for example certain interactive or real- niques can be used to eliminate the write barrier in a gen- time control systems. erational collector. The traditional way to reduce the variance of the pause times is to perform the garbage collection incrementally: Categories and Subject Descriptors: D.3.4 [Program- rather than collect the whole heap at once, a small amount ming Languages]: Processors|Memory management of collection is performed periodically. In this way the activ- (garbage collection) ities of the executing program (referred to as the mutator) and garbage collector are interleaved. General Terms: Algorithms, Design, Experimentation, One way to achieve incremental collection in the context Measurement, Performance, Languages of copying garbage collectors [8] is to use a read barrier to prevent the mutator from accessing live objects that have 1 Keywords: Incremental garbage collection, yet to be copied. This is the basis of Baker's algorithm [2] . Non-stop Haskell In our earlier paper [7] we described a low-cost mechanism for supporting the read barrier in the GHC implementa- 1. INTRODUCTION tion of Haskell with a single-generation heap. This exploits Generational garbage collection [24] is a well-established the dynamic dispatch mechanism of GHC and works by in- technique for reclaiming heap objects no longer required by tercepting calls to object evaluation code (in GHC this is a program. Short-lived objects are reclaimed quickly and called the object's entry code) in the case where the garbage efficiently, and long-lived objects are promoted to regions of collector is on and where the object has been copied but the heap which are subject to relatively infrequent collec- not yet scavenged. This `hijacking' mechanism directs the tions. It is therefore able to manage large heap spaces with call to code that automatically scavenges the object (\self- generally short pause times, these predominantly reflecting scavenging" code) prior to executing its entry code; this the time to perform minor collections. eliminates the need for an explicit read barrier. In implementing this \barrierless" scheme we had to mod- ify the behaviour of objects copied during garbage collec- tion so that an attempt to enter the object lands in the Permission to make digital or hard copies of all or part of this work for self-scavenging code. At the same time, we had to devise a personal or classroom use is granted without fee provided that copies are mechanism for invoking the object's normal entry code after not made or distributed for profit or commercial advantage and that copies 1 bear this notice and the full citation on the first page. To copy otherwise, to An alternative is to use replication [13] which replaces the republish, to post on servers or to redistribute to lists, requires prior specific read barrier with a write barrier, but at the expense of addi- permission and/or a fee. tional work in maintaining a log of outstanding writes. This ISMM'04, October 24–25, 2004, Vancouver, British Columbia, Canada. paper focuses on Baker's approach, although the replication Copyright 2004 ACM 1-58113-945-4/04/0010 ...$5.00. approach is the subject of current work. execution of the self-scavenging code. We chose to do this by and compare the approach with that proposed by Ro- augmenting copied objects with an extra word which retains jemo in [16] (Section 5). a reference to the original object entry code after the entry code pointer has been hijacked. This extra word comes at 2. BACKGROUND a price: in addition to the overhead of managing it, it also influences the way memory is used. For example, in a fixed- We assume that the reader is familiar with Baker's in- size heap a space overhead reduces the amount of memory cremental collection algorithm [2] and the fundamentals of available for new object allocation which can reduce the av- generational garbage collection [24]. However, we review erage time between garbage collections and increase the total some notation. number of collections. In this paper we assume that all collection is performed An alternative approach is to build, at compile time, spe- by copying live objects from a from-space to a to-space. In cialised entry code for each object type that comes in two Baker's algorithm copying is synonymous with evacuation. flavours: one that executes normal object entry code and Evacuated objects are scavenged incrementally and the set one that first scavenges the object and then executes its en- of objects that have been evacuated but not scavenged con- try code. This eliminates the need for the extra word as stitute the collector queue. we can simply flip from one to the other when the object is A read barrier is required on each object access to ensure copied during garbage collection. The cost is an increase in that objects in from-space are copied to to-space before the the size of the static program code (code bloat). mutator is allowed to access them. A significant component The main objective of this paper is to detail how this of this barrier code is a test that first determines whether code specialisation can be made to work in practice and to the garbage collector is on (GC-ON) before applying the evaluate the effect of removing the dynamic space overhead from-space test. on both the mutator and garbage collector. Although the In generational collection we shall assume in the discus- idea is simple in principle it interacts in various subtle ways sions that follow that there are just two generations: the with GHC. young generation and the old generation, although our im- Specialisation also facilitates other optimisations, for ex- plementations can be configured to handle an arbitrary num- ample the elimination of the write barrier associated with ber of generations. We assume that objects are aged in ac- generational collectors. We show how the basic scheme can cordance with the number of collections they survive. We be extended to do this, although in the end we chose not to distinguish object tenuring, the process of ageing an object implement the extension as the average performance gain is within a generation, from object promotion, whereby an ob- shown to be small in practice. However, for collectors with ject is deemed sufficiently old for it to be relocated to the a more expensive write barrier, or write-intensive applica- next oldest generation. tions, one might consider the optimisation worthwhile. The remainder of this section focuses on the implementa- Experimental evaluation shows that a modest increase in tion platform that is the subject of this paper. static code size (25% over and above stop-and-copy and an 2.1 GHC and the STG-machine additional 15% over our earlier implementation of Non-stop Haskell) buys us an incremental generational collector that GHC (the Glasgow Haskell Compiler) implements the increases average execution time by a modest 4.5% and re- Spineless Tagless G-machine (STG) [21, 18, 19], which is duces it by 3.5% compared with stop-and-copy and Non-stop a model for the compilation of lazy functional languages. Haskell respectively, for our chosen benchmarks. Our solu- In the STG-machine every program object is represented tion is a compromise that exploits specialisation to remove as a closure. The first field of each closure is a pointer to the expensive read barrier but retains the write barrier to statically-allocated entry code, above which sits an info ta- yield an acceptable overhead in the static code size. ble that contains static information relating to the object's The paper makes the following contributions: type, notably its layout. An example is shown in Figure 1 for an object with four fields, the first two of which are • We describe how to eliminate the read barrier in GHC pointers (pointers always precede non-pointers). The layout using object specialisation and compare it to our previ- information is used by the collector when evacuating and ous scheme which pays a space overhead on all copied scavenging the closure.
Recommended publications
  • Memory Diagrams and Memory Debugging
    CSCI-1200 Data Structures | Spring 2020 Lab 4 | Memory Diagrams and Memory Debugging Checkpoint 1 (focusing on Memory Diagrams) will be available at the start of Wednesday's lab. Checkpoints 2 and 3 focus on using a memory debugger. It is highly recommended that you thoroughly read the instructions for Checkpoint 2 and Checkpoint 3 before starting. Memory debuggers will be a useful tool in many of the assignments this semester, and in C++ development if you work with the language outside of this course. While a traditional debugger lets you step through your code and examine variables, a memory debugger instead reports memory-related errors during execution and can help find memory leaks after your program has terminated. The next time you see a \segmentation fault", or it works on your machine but not on Submitty, try running a memory debugger! Please download the following 4 files needed for this lab: http://www.cs.rpi.edu/academics/courses/spring20/csci1200/labs/04_memory_debugging/buggy_lab4. cpp http://www.cs.rpi.edu/academics/courses/spring20/csci1200/labs/04_memory_debugging/first.txt http://www.cs.rpi.edu/academics/courses/spring20/csci1200/labs/04_memory_debugging/middle. txt http://www.cs.rpi.edu/academics/courses/spring20/csci1200/labs/04_memory_debugging/last.txt Checkpoint 2 estimate: 20-40 minutes For Checkpoint 2 of this lab, we will revisit the final checkpoint of the first lab of this course; only this time, we will heavily rely on dynamic memory to find the average and smallest number for a set of data from an input file. You will use a memory debugging tool such as DrMemory or Valgrind to fix memory errors and leaks in buggy lab4.cpp.
    [Show full text]
  • Automatic Detection of Uninitialized Variables
    Automatic Detection of Uninitialized Variables Thi Viet Nga Nguyen, Fran¸cois Irigoin, Corinne Ancourt, and Fabien Coelho Ecole des Mines de Paris, 77305 Fontainebleau, France {nguyen,irigoin,ancourt,coelho}@cri.ensmp.fr Abstract. One of the most common programming errors is the use of a variable before its definition. This undefined value may produce incorrect results, memory violations, unpredictable behaviors and program failure. To detect this kind of error, two approaches can be used: compile-time analysis and run-time checking. However, compile-time analysis is far from perfect because of complicated data and control flows as well as arrays with non-linear, indirection subscripts, etc. On the other hand, dynamic checking, although supported by hardware and compiler tech- niques, is costly due to heavy code instrumentation while information available at compile-time is not taken into account. This paper presents a combination of an efficient compile-time analysis and a source code instrumentation for run-time checking. All kinds of variables are checked by PIPS, a Fortran research compiler for program analyses, transformation, parallelization and verification. Uninitialized array elements are detected by using imported array region, an efficient inter-procedural array data flow analysis. If exact array regions cannot be computed and compile-time information is not sufficient, array elements are initialized to a special value and their utilization is accompanied by a value test to assert the legality of the access. In comparison to the dynamic instrumentation, our method greatly reduces the number of variables to be initialized and to be checked. Code instrumentation is only needed for some array sections, not for the whole array.
    [Show full text]
  • Memory Debugging
    Center for Information Services and High Performance Computing (ZIH) Memory Debugging HRSK Practical on Debugging, 03.04.2009 Zellescher Weg 12 Willers-Bau A106 Tel. +49 351 - 463 - 31945 Matthias Lieber ([email protected]) Tobias Hilbrich ([email protected]) Content Introduction Tools – Valgrind – DUMA Demo Exercise Memory Debugging 2 Memory Debugging Segmentation faults sometimes happen far behind the incorrect code Memory debuggers help to find the real cause of memory bugs Detect memory management bugs – Access non-allocated memory – Access memory out off allocated bounds – Memory leaks – when pointers to allocated areas get lost forever – etc. Different approaches – Valgrind: Simulation of the program run in a virtual machine which accurately observes memory operations – Libraries like ElectricFence, DMalloc, and DUMA: Replace memory management functions through own versions Memory Debugging 3 Memory Debugging with Valgrind Valgrind detects: – Use of uninitialized memory – Access free’d memory – Access memory out off allocated bounds – Access inappropriate areas on the stack – Memory leaks – Mismatched use of malloc and free (C, Fortran), new and delete (C++) – Wrong use of memcpy() and related functions Available on Deimos via – module load valgrind Simply run program under Valgrind: – valgrind ./myprog More Information: http://www.valgrind.org Memory Debugging 4 Memory Debugging with Valgrind Memory Debugging 5 Memory Debugging with DUMA DUMA detects: – Access memory out off allocated bounds – Using a
    [Show full text]
  • GC Assertions: Using the Garbage Collector to Check Heap Properties
    GC Assertions: Using the Garbage Collector to Check Heap Properties Edward Aftandilian Samuel Z. Guyer Tufts University Tufts University 161 College Ave 161 College Ave Medford MA Medford MA [email protected] [email protected] ABSTRACT The goal of this work is to develop a general mechanism, which GC assertion ex- This paper introduces GC assertions, a system interface that pro- we call a , that allows the programmer to express pected grammers can use to check for errors, such as data structure in- properties of data structures and convey this information to variant violations, and to diagnose performance problems, such as the garbage collector. The key is that many useful, non-trivial prop- memory leaks. GC assertions are checked by the garbage collec- erties can be easily checked by the garbage collector at GC time, tor, which is in a unique position to gather information and answer imposing little or no run-time overhead. The collector triggers the questions about the lifetime and connectivity of objects in the heap. assertion if it finds that the expected properties have been violated. We introduce several kinds of GC assertions, and we describe how While GC assertions require extra programmer effort, they cap- they are implemented in the collector. We also describe our report- ture application-specific properties that programmers want to check ing mechanism, which provides a complete path through the heap and can easily express. For example, we provide a GC assertion to to the offending objects. We show results for one type of asser- verify that a data structure is reclaimed at the next collection when tion that allows the programmer to indicate that an object should the programmer expects it to be garbage.
    [Show full text]
  • Heapviz: Interactive Heap Visualization for Program Understanding and Debugging
    Heapviz: Interactive Heap Visualization for Program Understanding and Debugging 1 Abstract Understanding the data structures in a program is crucial to understanding how the program works, or why it doesn’t work. Inspecting the code that implements the data structures, however, is an arduous task and often fails to yield insights into the global organization of a program’s data. Inspecting the actual contents of the heap solves these problems but presents a significant challenge of its own: finding an effective way to present the enormous number of objects it contains. In this paper we present Heapviz, a tool for visualizing and exploring snapshots of the heap obtained from a running Java program. Unlike existing tools, such as tra- ditional debuggers, Heapviz presents a global view of the program state as a graph, together with powerful interactive capabilities for navigating it. Our tool employs sev- eral key techniques that help manage the scale of the data. First, we reduce the size and complexity of the graph by using algorithms inspired by static shape analysis to aggregate the nodes that make up a data structure. Second, we implement a power- ful visualization component whose interactive interface provides extensive support for exploring the graph. The user can search for objects based on type, connectivity, and field values; group objects; and color or hide and show each group. The user may also inspect individual objects to see their field values and neighbors in the graph. These interactive abilities help the user manage the complexity of these huge graphs. By applying Heapviz to both constructed and real-world examples, we show that Heapviz provides programmers with a powerful and intuitive tool for exploring program behavior.
    [Show full text]
  • Efficient and Programmable Support for Memory Access Monitoring And
    Appears in the Proceedings of the 13th International Symposium on High-Performance Computer Architecture (HPCA-13), February 2007. MemTracker: Efficient and Programmable Support for Memory Access Monitoring and Debugging ∗ Guru Venkataramani Brandyn Roemer Yan Solihin Milos Prvulovic Georgia Tech Georgia Tech North Carolina State University Georgia Tech [email protected] [email protected] [email protected] [email protected] Abstract many of these tasks, performance overheads of such tools are prohibitive, especially in post-deployment monitoring of live Memory bugs are a broad class of bugs that is becoming (production) runs where end users are directly affected by the increasingly common with increasing software complexity, overheads of the monitoring scheme. and many of these bugs are also security vulnerabilities. Un- One particularly important and broad class of program- fortunately, existing software and even hardware approaches ming errors is erroneous use or management of memory for finding and identifying memory bugs have considerable (memory bugs). This class of errors includes pointer arith- performance overheads, target only a narrow class of bugs, metic errors, use of dangling pointers, reads from unini- are costly to implement, or use computational resources in- tialized locations, out-of-bounds accesses (e.g. buffer over- efficiently. flows), memory leaks, etc. Many software tools have been This paper describes MemTracker, a new hardware sup- developed to detect some of these errors. For example, Pu- port mechanism that can be configured to perform different rify [8] and Valgrind [13] detect memory leaks, accesses to kinds of memory access monitoring tasks. MemTracker as- unallocated memory, reads from uninitialized memory, and sociates each word of data in memory with a few bits of some dangling pointer and out-of-bounds accesses.
    [Show full text]
  • Secure Virtual Architecture: Using LLVM to Provide Memory Safety to the Entire Software Stack
    Secure Virtual Architecture: Using LLVM to Provide Memory Safety to the Entire Software Stack John Criswell, University of Illinois Andrew Lenharth, University of Illinois Dinakar Dhurjati, DoCoMo Communications Laboratories, USA Vikram Adve, University of Illinois What is Memory Safety? Intuitively, the guarantees provided by a safe programming language (e.g., Java, C#) Array indexing stays within object bounds No uses of uninitialized variables All operations are type safe No uses of dangling pointers Control flow obeys program semantics Sound operational semantics Benefits of Memory Safety for Commodity OS Code Security Memory error vulnerabilities in OS kernel code are a reality1 Novel Design Opportunities Safe kernel extensions (e.g. SPIN) Single address space OSs (e.g. Singularity) Develop New Solutions to Higher-Level Security Challenges Information flow policies Encoding security policies in type system 1. Month of Kernel Bugs (http://projects.info-pull.com/mokb/) Secure Virtual Architecture Commodity OS Virtual ISA Compiler + VM Hardware Native ISA Compiler-based virtual machine underneath software stack Uses analysis & transformation techniques from compilers Supports commodity operating systems (e.g., Linux) Typed virtual instruction set enables sophisticated program analysis Provide safe execution environment for commodity OSs Outline SVA Architecture SVA Safety Experimental Results SVA System Architecture Applications Safety Checking Compiler OS Kernel Drivers OS Memory Allocator SVA ISA Safety Verifier
    [Show full text]
  • Debugging Memory Problems on Cray XT Supercomputers with Totalview Debugger
    Debugging Memory Problems on Cray XT Supercomputers with TotalView Debugger Chris Gottbrath, Ariel Burton TotalView Technologies Robert Moench, Luiz DeRose Cray Inc. ABSTRACT: The TotalView Source Code Debugger gives scientists and engineers a way to debug memory problems on the Cray XT series supercomputer. Memory problems such as memory leaks, array bounds violations, and dangling pointers are difficult to track down with conventional tools on even a simple desktop architecture – they can be much more vexing when encountered on a distributed parallel architecture like that of the XT. KEYWORDS: Programming Environment Tools, Debuggers, Memory Debuggers, Cray XT Series 1. Introduction 2. Classes Of Memory Errors The purpose of this paper is to highlight the Programs typically make use of several different availability of memory debugging on the Cray XT series categories of memory that are managed in different supercomputer. Scientists and engineers taking ways. These include stack memory, heap memory, shared advantage of the unique capabilities of the Cray XT can memory, thread private memory and static or global now identify problems like memory leaks and heap memory. Programmers have to pay special attention, allocation bounds violations – problems that are though, to memory that is allocated out of the Heap extremely difficult and tedious to track down without the Memory. This is because the management of heap kinds of capabilities described here. memory is done explicitly in the program rather than implicitly at compile or run time. We begin by discussing and categorizing memory errors then provide a brief overview of both TotalView There are quite a number of ways that a program can Source Code Debugger and the Cray XT Series.
    [Show full text]
  • Detecting Memory Related Errors Using a Valgrind Tool Suite and Detecting Data Races Using Thread Sanitizer Clang
    Volume 6, No. 2, March-April 2015 International Journal of Advanced Research in Computer Science RESEARCH PAPER Available Online at www.ijarcs.info Detecting Memory Related Errors using a Valgrind Tool Suite and Detecting Data Races using Thread Sanitizer Clang Saroj. A.Shambharkar Kavikulguru Institute of Technology and Science Nagpur, India Abstract: It is important for a programmer to detect data races when a programming is done under concurrent systems or using shared memory or parallel programming. Many researchers are implementing their applications using the parallel programming and even for simple c programs ,they are not much concentrating on memory issues,accuracy of a result obtained, speed. It is important and one must take care it. There may found data races in their application code which has to be resolve and for that some data race detectors are required. There is also a need of detecting the memory related problems that may occur with shared memory. As a result of such memory related bugs and data races, the system may slow down the speed of the execution of a program .To improve the speed,the valgrind tool suite is used for memory related problems and ThreadSanitizer is used for detecting data races and some authors used Intel Thread Checker,ADAT,on-the-fly techniques for detecting them. Some tools are reducing the false positives but removing all false positives. Valgrind is providing a number of debugging and profiling tools that helps in improving the accuracy and to enhance the performance of system. This paper is demonstrating and showing the different results obtained by using the mentioned tool suite.
    [Show full text]
  • Introduction to C++ for Java Users
    Introduction Memory management Standard Template Library Classes and templates Introduction to C++ for Java users Leo Liberti DIX, Ecole´ Polytechnique, Palaiseau, France I semester 2006 Leo Liberti Introduction to C++ for Java users Introduction Programming style issues Memory management Main differences Standard Template Library Development of the first program Classes and templates Preliminary remarks Teachers Leo Liberti: [email protected]. TDs: Pietro Belotti [email protected], Claus Gwiggner [email protected] Aim of the course Introducing C++ to Java users Means Develop a simple C++ application which performs a complex task http://www.lix.polytechnique.fr/~liberti/teaching/c++/ fromjava/ Leo Liberti Introduction to C++ for Java users Introduction Programming style issues Memory management Main differences Standard Template Library Development of the first program Classes and templates Course structure Timetable Lecture: Wednesday 20/9/06 14:00-15:30, Amphi Pierre Curie TDs: 21-22/9: 8-10, 10:15-12:15, 13:45-15:45, 16-18, S.I. 34-35 25/9/06: 8-10, 10:15-12:15, S.I. 34-35 Course material (optional) Bjarne Stroustrup, The C++ Programming Language, 3rd edition, Addison-Wesley, Reading (MA), 1999 Stephen Dewhurst, C++ Gotchas: Avoiding common problems in coding and design, Addison-Wesley, Reading (MA), 2002 Herbert Schildt, C/C++ Programmer’s Reference, 2nd edition, Osborne McGraw-Hill, Berkeley (CA) Leo Liberti Introduction to C++ for Java users Introduction Programming style issues Memory management Main differences
    [Show full text]
  • Thread & Memory Debugger
    Thread & Memory Debugger Klaus-Dieter Oertel Intel IAGS HLRN User Workshop, 3-6 Nov 2020 Optimization Notice Copyright © 2020, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others. Debug Memory & Threading Errors Intel® Inspector Find and eliminate errors ▪ Memory leaks, invalid access… ▪ Races & deadlocks ▪ C, C++ and Fortran (or a mix) Simple, Reliable, Accurate ▪ No special recompiles Use any build, any compiler1 Clicking an error instantly displays source ▪ Analyzes dynamically generated or linked code code snippets and the call stack ▪ Inspects 3rd party libraries without source ▪ Productive user interface + debugger integration Fits your existing ▪ Command line for automated regression analysis process Optimization Notice 1That follows common OS standards. Copyright © 2020, Intel Corporation. All rights reserved. 3 *Other names and brands may be claimed as the property of others. Race Conditions Are Difficult to Diagnose They only occur occasionally and are difficult to reproduce Correct Incorrect Shared Shared Thread 1 Thread 2 Thread 1 Thread 2 Counter Counter 0 0 Read count 0 Read count 0 Increment 0 Read count 0 Write count ➔ 1 Increment 0 Read count 1 Increment 0 Increment 1 Write count ➔ 1 Write count ➔ 2 Write count ➔ 1 Optimization Notice Copyright © 2020, Intel Corporation. All rights reserved. 4 *Other names and brands may be claimed as the property of others. Deliver More Reliable Applications Intel® Inspector and Intel® Compiler Intel® Inspector Memory Errors Threading Errors ▪ Dynamic instrumentation ▪ No special builds ▪ Any compiler1 • Invalid Accesses • Races ▪ Source not required • Memory Leaks • Deadlocks • Uninit. Memory Accesses • Cross Stack References Intel® Compiler Pointer Errors ▪ Pointer checker • Out of bounds accesses • Dangling pointers ▪ Run time checks ▪ C, C++ Find errors earlier with less effort 1That follows common OS standards.
    [Show full text]
  • SCARPE: a Technique and Tool for Selective Capture and Replay of Program Executions∗
    SCARPE: A Technique and Tool for Selective Capture and Replay of Program Executions∗ Shrinivas Joshi Alessandro Orso Advanced Micro Devices Georgia Institute of Technology [email protected] [email protected] Abstract used. For another example, capture and replay would also allow for performing dynamic analyses that impose a high Because of software’s increasing dynamism and the het- overhead on the execution time. In this case, we could capture erogeneity of execution environments, the results of in-house executions of the uninstrumented program and then perform testing and maintenance are often not representative of the the expensive analyses off-line, while replaying. way the software behaves in the field. To alleviate this prob- State of the Art. Most existing capture-replay techniques lem, we present a technique for capturing and replaying par- and tools (e.g., WinRunner [17]) are defined to be used in- tial executions of deployed software. Our technique can be house, typically during testing. These techniques cannot used for various applications, including generation of test be used in the field, where many additional constraints ap- cases from user executions and post-mortem dynamic anal- ply. First, traditional techniques capture complete executions, ysis. We present our technique and tool, some possible appli- which may require to record and transfer a huge volume of cations, and a preliminary empirical evaluation that provides data for each execution. Second, existing techniques typi- initial evidence of the feasibility of our approach. cally capture inputs to an application, which can be difficult and may require ad-hoc mechanisms, depending on the way 1 Introduction the application interacts with its environment (e.g., via a net- Today’s software is increasingly dynamic, and its complexity work, through a GUI).
    [Show full text]