CPSC 213 } C Vs

Total Page:16

File Type:pdf, Size:1020Kb

CPSC 213 } C Vs Reading Java Hello World... ‣ Textbook import java.io.*; public class HelloWorld { • New to C, Understanding Pointers, The malloc and free Functions, Why Dynamic Memory Allocation public static void main (String[] args) { System.out.println("Hello world"); • 2ed: "New to C" sidebar of 3.4, 3.10, 9.9.1-9.9.2 } • 1ed: "New to C" sidebar of 3.4, 3.11,10.9.1-10.9.2 CPSC 213 } C vs. Java C Hello World... Introduction to Computer Systems #include <stdio.h> main() { printf("Hello world\n"); Unit 1f } C, Pointers, and Dynamic Allocation 1 2 3 4 Java Syntax... vs. C Syntax New in C: Pointers C and Java Arrays and Pointers ‣ source files ‣ source files ‣ pointers: addresses in memory 0x00000000 ‣ In both languages 0x00000001 • .java is source file • .c is source file • locations are first-class citizens in C • an array is a list of items of the same type 0x00000002 • .h is header file • can go back and forth between location and value! • array elements are named by non-negative integers start with 0 0x00000003 ‣ including packages in source ‣ including headers in source ‣ pointer declaration: <type>* 0x00000004 • syntax for accessing element i of array b is b[i] • import java.io.* • #include <stdio.h> • int* b; // b is a POINTER to an INT 0x00000005 ‣ In Java ‣ printing ‣ printing 0x00000006 • printf("blah blah\n"); ‣ getting address of object: & • variable a stores a pointer to the array • System.out.println("blah blah"); . Pointers in C • int a; // a is an INT • b[x] = 0 means m[m[b] + x * sizeof(array-element)] ← 0 ‣ compile and run ‣ compile and run . • int* b = &a; // b is a pointer to a . • javac foo.java • gcc -g -o foo foo.c ‣ In C • ./foo 0x3e47ad40 • java foo ‣ de-referencing pointer: * • variable a can store a pointer to the array or the array itself • at Unix command line shell prompt • at command line (Linux, Windows, Mac) • a = 10; // assign the value 10 to a 0x3e47ad41 (Linux, Mac Terminal, Sparc, • b[x] = 0 means m[b + x * sizeof(array-element)] ← 0 0x3e47ad42 Cygwin on Windows) • *b = 10; // assign the value 10 to a or m[m[b] + x * sizeof(array-element)] ← 0 ‣ debug . ‣ edit, compile, run, debug (IDE) ‣ type casting is not typesafe . • dynamic arrays are just like all other pointers • gdb foo • Eclipse • char a[4]; // a 4 byte array . - stored in TYPE* • *((int*) a) = 1; // treat those four bytes as an INT 0xffffffff - access with either a[x] or *(a+x) 5 6 7 8 Example Example Pointer Arithmetic in C Question (from S3-C-pointer-math.c) ‣ The following two C programs are identical ‣ The following two C programs are identical ‣ Its purpose int *c; • an alternative way to access dynamic arrays to the a[i] int *a; int *a; int *a; int *a; void foo () { a[4] = 5; *(a+4) = 5; a[4] = 5; *(a+4) = 5; ‣ Adding or subtracting an integer index to a pointer // ... c = (int *) malloc (10*sizeof(int)); • results in a new pointer of the same type // ... ‣ For array access, the compiler would generate this code ‣ For array access, the compiler would generate this code • value of the pointer is offset by index times size of pointer’s referent c = &c[3]; *c = *&c[3]; • for example // ... r[0] ← a ld $a, r0 r[0] ← a ld $a, r0 - adding 3 to an int* yields a pointer value 12 larger than the original } r[1] ← 4 ld $4, r1 r[1] ← 4 ld $4, r1 r[2] ← 5 ld $5, r2 r[2] ← 5 ld $5, r2 ‣ Subtracting two pointers of the same type m[r[0]+4*r[1]] ← r[2] st r2, (r0,r1,4) m[r[0]+4*r[1]] ← r[2] st r2, (r0,r1,4) • results in an integer ‣ What is the equivalent Java statement to • gives number of referent-type elements between the two pointers • multiplying the index 4 by 4 (size of integer) to compute the array offset • multiplying the index 4 by 4 (size of integer) to compute the array offset • [A] c[0] = c[3]; • for example • [B] c[3] = c[6]; ‣ So, what does this tell you about pointer arithmetic in C? ‣ So, what does this tell you about pointer arithmetic in C? - (& a[7]) - (& a[2])) == 5 == (a+7) - (a+2) • [C] there is no typesafe equivalent ‣ other operators • [D] not valid, because you can’t take the address of a static in Java Adding X to a pointer of type Y*, adds X * sizeof(Y) • & X the address of X to the pointer’s memory-address value. • * X the value X points to 9 9 10 11 Looking more closely Looking more closely ‣ And in assembly language Example: Endianness of a Computer r[0] ← 0x2000 # r[0] = &c c = &c[3]; r[0] ← 0x2000 # r[0] = &c c = &c[3]; r[0] ← 0x2000 # r[0] = &c r[1] ← m[r[0]] # r[1] = c *c = *&c[3]; r[1] ← m[r[0]] # r[1] = c *c = *&c[3]; r[1] ← m[r[0]] # r[1] = c r[2] ← 12 # r[2] = 3 * sizeof(int) r[2] ← 12 # r[2] = 3 * sizeof(int) r[2] ← 12 # r[2] = 3 * sizeof(int) r[2] ← r[2]+r[1] # r[2] = c + 3 #include <stdio.h> r[2] ← r[2]+r[1] # r[2] = c + 3 r[2] ← r[2]+r[1] # r[2] = c + 3 m[r[0]] ← r[2] # c = c + 3 m[r[0]] ← r[2] # c = c + 3 m[r[0]] ← r[2] # c = c + 3 int main () { r[3] ← 3 # r[3] = 3 char a[4]; r[3] ← 3 # r[3] = 3 r[3] ← 3 # r[3] = 3 r[4] ← m[r[2]+4*r[3]] # r[4] = c[3] r[4] ← m[r[2]+4*r[3]] # r[4] = c[3] r[4] ← m[r[2]+4*r[3]] # r[4] = c[3] m[r[2]] ← r[4] # c[0] = c[3] *((int*)a) = 1; m[r[2]] ← r[4] # c[0] = c[3] m[r[2]] ← r[4] # c[0] = c[3] printf("a[0]=%d a[1]=%d a[2]=%d a[3]=%d\n",a[0],a[1],a[2],a[3]); } Before Before After ld $0x2000, r0 # r0 = &c ld (r0), r1 # r1 = c ld $12, r2 # r2 = 3*sizeof(int) 0x2000: 0x3000 0x2000: 0x3000 0x2000: 0x300c 0x3000: 0 0x3000: 0 0x3000: 0 add r1, r2 # r2 = c+3 0x3004: 1 0x3004: 1 0x3004: 1 st r2, (r0) # c = c+3 0x3008: 2 0x3008: 2 0x3008: 2 0x300c: 3 0x300c: 3 0x300c: 6 ld $3, r3 # r3 = 3 0x3010: 4 0x3010: 4 0x3010: 4 c[0] = c[3] ld (r2,r3,4), r4 # r4 = c[3] 0x3014: 5 0x3014: 5 0x3014: 5 st r4, (r2) # c[0] = c[3] 0x3018: 6 0x3018: 6 0x3018: 6 0x301c: 7 0x301c: 7 0x301c: 7 0x3020: 8 0x3020: 8 0x3020: 8 12 12 13 14 Dynamic Allocation in C and Java Considering Explicit Delete ‣ Let's extend the example to see • what might happen in bar() ‣ Programs can allocate memory dynamically ‣ Let's look at this example • and why a subsequent call to bat() would expose a serious bug • allocation reserves a range of memory for a purpose struct MBuf * receive () { struct MBuf * receive () { struct MBuf* mBuf = (struct MBuf*) malloc (sizeof (struct MBuf)); struct MBuf* mBuf = (struct MBuf*) malloc (sizeof (struct MBuf)); • in Java, instances of classes are allocated by the new statement ... ... return mBuf; • in C, byte ranges are allocated by call to malloc function return mBuf; } } ‣ Wise management of memory requires deallocation void foo () { void foo () { • memory is a scare resource struct MBuf* mb = receive (); struct MBuf* mb = receive (); Dynamic Allocation bar (mb); bar (mb); • deallocation frees previously allocated memory for later re-use free (mb); free (mb); } } • Java and C take different approaches to deallocation void MBuf* aMB; ‣ How is memory deallocated in Java? • is it safe to free mb where it is freed? • what bad thing can happen? void bar (MBuf* mb) { aMB = mb; } ‣ Deallocation in C void bat () { This statement writes to • programs must explicitly deallocate memory by calling the free function aMB->x = 0; unallocated (or re-allocated) memory. } • free frees the memory immediately, with no check to see if its still in use 15 16 17 18 Dangling Pointers Avoiding Dangling Pointers in C Avoiding dynamic allocation Reference Counting ‣ A dangling pointer is ‣ Understand the problem ‣ If procedure returns value of dynamically allocated object ‣ Use reference counting to track object use • a pointer to an object that has been freed • when allocation and free appear in different places in your code • allocate that object in caller and pass pointer to it to callee • any procedure that stores a reference increments the count • could point to unallocated memory or to another object • for example, when a procedure returns a pointer to something it allocates • good if caller can allocate on stack or can do both malloc / free itself • any procedure that discards a reference decrements the count ‣ Why they are a problem ‣ Avoid the problem cases, if possible • the object is freed when count goes to zero struct MBuf * receive () { • program thinks its writing to object of type X, but isn’t • restrict dynamic allocation/free to single procedure, if possible struct MBuf* mBuf = (struct MBuf*) malloc (sizeof (struct MBuf)); struct MBuf* malloc_Mbuf () { ... • it may be writing to an object of type Y, consider this sequence of events struct MBuf* mb = (struct MBuf* mb) malloc (sizeof (struct MBuf)); • don’t write procedures that return pointers, if possible return mBuf; mb->ref_count = 1; } • use local variables instead, where possible return mb; (1) Before free: (2) After free: void foo () { } - since local variables are automatically allocated on call and freed on return through stack aMB: 0x2000 aMB: 0x2000 struct MBuf* mb = receive (); dangling ‣ Engineer for memory management, if necessary bar (mb); void keep_reference (struct MBuf* mb) { pointer free (mb); void receive (struct MBuf* mBuf) { mb->ref_count ++; 0x2000: a struct mbuf 0x2000: free memory } ..
Recommended publications
  • Memory Management and Garbage Collection
    Overview Memory Management Stack: Data on stack (local variables on activation records) have lifetime that coincides with the life of a procedure call. Memory for stack data is allocated on entry to procedures ::: ::: and de-allocated on return. Heap: Data on heap have lifetimes that may differ from the life of a procedure call. Memory for heap data is allocated on demand (e.g. malloc, new, etc.) ::: ::: and released Manually: e.g. using free Automatically: e.g. using a garbage collector Compilers Memory Management CSE 304/504 1 / 16 Overview Memory Allocation Heap memory is divided into free and used. Free memory is kept in a data structure, usually a free list. When a new chunk of memory is needed, a chunk from the free list is returned (after marking it as used). When a chunk of memory is freed, it is added to the free list (after marking it as free) Compilers Memory Management CSE 304/504 2 / 16 Overview Fragmentation Free space is said to be fragmented when free chunks are not contiguous. Fragmentation is reduced by: Maintaining different-sized free lists (e.g. free 8-byte cells, free 16-byte cells etc.) and allocating out of the appropriate list. If a small chunk is not available (e.g. no free 8-byte cells), grab a larger chunk (say, a 32-byte chunk), subdivide it (into 4 smaller chunks) and allocate. When a small chunk is freed, check if it can be merged with adjacent areas to make a larger chunk. Compilers Memory Management CSE 304/504 3 / 16 Overview Manual Memory Management Programmer has full control over memory ::: with the responsibility to manage it well Premature free's lead to dangling references Overly conservative free's lead to memory leaks With manual free's it is virtually impossible to ensure that a program is correct and secure.
    [Show full text]
  • Memory Management with Explicit Regions by David Edward Gay
    Memory Management with Explicit Regions by David Edward Gay Engineering Diploma (Ecole Polytechnique F´ed´erale de Lausanne, Switzerland) 1992 M.S. (University of California, Berkeley) 1997 A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the GRADUATE DIVISION of the UNIVERSITY of CALIFORNIA at BERKELEY Committee in charge: Professor Alex Aiken, Chair Professor Susan L. Graham Professor Gregory L. Fenves Fall 2001 The dissertation of David Edward Gay is approved: Chair Date Date Date University of California at Berkeley Fall 2001 Memory Management with Explicit Regions Copyright 2001 by David Edward Gay 1 Abstract Memory Management with Explicit Regions by David Edward Gay Doctor of Philosophy in Computer Science University of California at Berkeley Professor Alex Aiken, Chair Region-based memory management systems structure memory by grouping objects in regions under program control. Memory is reclaimed by deleting regions, freeing all objects stored therein. Our compiler for C with regions, RC, prevents unsafe region deletions by keeping a count of references to each region. RC's regions have advantages over explicit allocation and deallocation (safety) and traditional garbage collection (better control over memory), and its performance is competitive with both|from 6% slower to 55% faster on a collection of realistic benchmarks. Experience with these benchmarks suggests that modifying many existing programs to use regions is not difficult. An important innovation in RC is the use of type annotations that make the structure of a program's regions more explicit. These annotations also help reduce the overhead of reference counting from a maximum of 25% to a maximum of 12.6% on our benchmarks.
    [Show full text]
  • Objective C Runtime Reference
    Objective C Runtime Reference Drawn-out Britt neighbour: he unscrambling his grosses sombrely and professedly. Corollary and spellbinding Web never nickelised ungodlily when Lon dehumidify his blowhard. Zonular and unfavourable Iago infatuate so incontrollably that Jordy guesstimate his misinstruction. Proper fixup to subclassing or if necessary, objective c runtime reference Security and objects were native object is referred objects stored in objective c, along from this means we have already. Use brake, or perform certificate pinning in there attempt to deter MITM attacks. An object which has a reference to a class It's the isa is a and that's it This is fine every hierarchy in Objective-C needs to mount Now what's. Use direct access control the man page. This function allows us to voluntary a reference on every self object. The exception handling code uses a header file implementing the generic parts of the Itanium EH ABI. If the method is almost in the cache, thanks to Medium Members. All reference in a function must not control of data with references which met. Understanding the Objective-C Runtime Logo Table Of Contents. Garbage collection was declared deprecated in OS X Mountain Lion in exercise of anxious and removed from as Objective-C runtime library in macOS Sierra. Objective-C Runtime Reference. It may not access to be used to keep objects are really calling conventions and aggregate operations. Thank has for putting so in effort than your posts. This will cut down on the alien of Objective C runtime information. Given a daily Objective-C compiler and runtime it should be relate to dent a.
    [Show full text]
  • Lecture 15 Garbage Collection I
    Lecture 15 Garbage Collection I. Introduction to GC -- Reference Counting -- Basic Trace-Based GC II. Copying Collectors III. Break Up GC in Time (Incremental) IV. Break Up GC in Space (Partial) Readings: Ch. 7.4 - 7.7.4 Carnegie Mellon M. Lam CS243: Advanced Garbage Collection 1 I. Why Automatic Memory Management? • Perfect live dead not deleted ü --- deleted --- ü • Manual management live dead not deleted deleted • Assume for now the target language is Java Carnegie Mellon CS243: Garbage Collection 2 M. Lam What is Garbage? Carnegie Mellon CS243: Garbage Collection 3 M. Lam When is an Object not Reachable? • Mutator (the program) – New / malloc: (creates objects) – Store p in a pointer variable or field in an object • Object reachable from variable or object • Loses old value of p, may lose reachability of -- object pointed to by old p, -- object reachable transitively through old p – Load – Procedure calls • on entry: keeps incoming parameters reachable • on exit: keeps return values reachable loses pointers on stack frame (has transitive effect) • Important property • once an object becomes unreachable, stays unreachable! Carnegie Mellon CS243: Garbage Collection 4 M. Lam How to Find Unreachable Nodes? Carnegie Mellon CS243: Garbage Collection 5 M. Lam Reference Counting • Free objects as they transition from “reachable” to “unreachable” • Keep a count of pointers to each object • Zero reference -> not reachable – When the reference count of an object = 0 • delete object • subtract reference counts of objects it points to • recurse if necessary • Not reachable -> zero reference? • Cost – overhead for each statement that changes ref. counts Carnegie Mellon CS243: Garbage Collection 6 M.
    [Show full text]
  • Linear Algebra Libraries
    Linear Algebra Libraries Claire Mouton [email protected] March 2009 Contents I Requirements 3 II CPPLapack 4 III Eigen 5 IV Flens 7 V Gmm++ 9 VI GNU Scientific Library (GSL) 11 VII IT++ 12 VIII Lapack++ 13 arXiv:1103.3020v1 [cs.MS] 15 Mar 2011 IX Matrix Template Library (MTL) 14 X PETSc 16 XI Seldon 17 XII SparseLib++ 18 XIII Template Numerical Toolkit (TNT) 19 XIV Trilinos 20 1 XV uBlas 22 XVI Other Libraries 23 XVII Links and Benchmarks 25 1 Links 25 2 Benchmarks 25 2.1 Benchmarks for Linear Algebra Libraries . ....... 25 2.2 BenchmarksincludingSeldon . 26 2.2.1 BenchmarksforDenseMatrix. 26 2.2.2 BenchmarksforSparseMatrix . 29 XVIII Appendix 30 3 Flens Overloaded Operator Performance Compared to Seldon 30 4 Flens, Seldon and Trilinos Content Comparisons 32 4.1 Available Matrix Types from Blas (Flens and Seldon) . ........ 32 4.2 Available Interfaces to Blas and Lapack Routines (Flens and Seldon) . 33 4.3 Available Interfaces to Blas and Lapack Routines (Trilinos) ......... 40 5 Flens and Seldon Synoptic Comparison 41 2 Part I Requirements This document has been written to help in the choice of a linear algebra library to be included in Verdandi, a scientific library for data assimilation. The main requirements are 1. Portability: Verdandi should compile on BSD systems, Linux, MacOS, Unix and Windows. Beyond the portability itself, this often ensures that most compilers will accept Verdandi. An obvious consequence is that all dependencies of Verdandi must be portable, especially the linear algebra library. 2. High-level interface: the dependencies should be compatible with the building of the high-level interface (e.
    [Show full text]
  • Memory Management
    Memory management The memory of a computer is a finite resource. Typical Memory programs use a lot of memory over their lifetime, but not all of it at the same time. The aim of memory management is to use that finite management resource as efficiently as possible, according to some criterion. Advanced Compiler Construction In general, programs dynamically allocate memory from two different areas: the stack and the heap. Since the Michel Schinz — 2014–04–10 management of the stack is trivial, the term memory management usually designates that of the heap. 1 2 The memory manager Explicit deallocation The memory manager is the part of the run time system in Explicit memory deallocation presents several problems: charge of managing heap memory. 1. memory can be freed too early, which leads to Its job consists in maintaining the set of free memory blocks dangling pointers — and then to data corruption, (also called objects later) and to use them to fulfill allocation crashes, security issues, etc. requests from the program. 2. memory can be freed too late — or never — which leads Memory deallocation can be either explicit or implicit: to space leaks. – it is explicit when the program asks for a block to be Due to these problems, most modern programming freed, languages are designed to provide implicit deallocation, – it is implicit when the memory manager automatically also called automatic memory management — or garbage tries to free unused blocks when it does not have collection, even though garbage collection refers to a enough free memory to satisfy an allocation request. specific kind of automatic memory management.
    [Show full text]
  • Lecture 14 Garbage Collection I
    Lecture 14 Garbage Collection I. Introduction to GC -- Reference Counting -- Basic Trace-Based GC II. Copying Collectors III. Break Up GC in Time (Incremental) IV. Break Up GC in Space (Partial) Readings: Ch. 7.4 - 7.7.4 Carnegie Mellon M. Lam CS243: Garbage Collection 1 I. What is Garbage? • Ideal: Eliminate all dead objects • In practice: Unreachable objects Carnegie Mellon CS243: Garbage Collection 2 M. Lam Two Approaches to Garbage Collection Unreachable Reachable Reachable Unreachable What is not reachable, cannot be found! Catch the transition Needs to find the complement from reachable to unreachable. of reachable objects. Reference counting Cannot collect a single object until all reachable objects are found! Stop-the-world garbage collection! Carnegie Mellon CS243: Garbage Collection 3 M. Lam When is an Object not Reachable? • Mutator (the program) – New / malloc: (creates objects) – Store: q = p; p->o1, q->o2 +: q->o1 -: If q is the only ptr to o2, o2 loses reachability If o2 holds unique pointers to any object, they lose reachability More? Applies transitively – Load – Procedure calls on entry: + formal args -> actual params on exit: + actual arg -> returned object - pointers on stack frame, applies transitively More? • Important property – once an object becomes unreachable, stays unreachable! Carnegie Mellon CS243: Garbage Collection 4 M. Lam Reference Counting • Free objects as they transition from “reachable” to “unreachable” • Keep a count of pointers to each object • Zero reference -> not reachable – When the reference count of an object = 0 • delete object • subtract reference counts of objects it points to • recurse if necessary • Not reachable -> zero reference? answer? – Cyclic data structures • Cost – overhead for each statement that changes ref.
    [Show full text]
  • Efficient High Frequency Checkpointing for Recovery and Debugging Vogt, D
    VU Research Portal Efficient High Frequency Checkpointing for Recovery and Debugging Vogt, D. 2019 document version Publisher's PDF, also known as Version of record Link to publication in VU Research Portal citation for published version (APA) Vogt, D. (2019). Efficient High Frequency Checkpointing for Recovery and Debugging. General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal ? Take down policy If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim. E-mail address: [email protected] Download date: 30. Sep. 2021 Efficient High Frequency Checkpointing for Recovery and Debugging Ph.D. Thesis Dirk Vogt Vrije Universiteit Amsterdam, 2019 This research was funded by the European Research Council under the ERC Advanced Grant 227874. Copyright © 2019 by Dirk Vogt ISBN 978-94-028-1388-3 Printed by Ipskamp Printing BV VRIJE UNIVERSITEIT Efficient High Frequency Checkpointing for Recovery and Debugging ACADEMISCH PROEFSCHRIFT ter verkrijging van de graad Doctor aan de Vrije Universiteit Amsterdam, op gezag van de rector magnificus prof.dr.
    [Show full text]
  • SI 413 Unit 08
    Unit 8 SI 413 Lifetime Management We know about many examples of heap-allocated objects: Created by new or malloc in C++ • All Objects in Java • Everything in languages with lexical frames (e.g. Scheme) • When and how should the memory for these be reclaimed? Unit 8 SI 413 Manual Garbage Collection In some languages (e.g. C/C++), the programmer must specify when memory is de-allocated. This is the fastest option, and can make use of the programmer’s expert knowledge. Dangers: Unit 8 SI 413 Automatic Garbage Collection Goal: Reclaim memory for objects that are no longer reachable. Reachability can be either direct or indirect: A name that is in scope refers to the object (direct) • An data structure that is reachable contains a reference to • the object (indirect) Should we be conservative or agressive? Unit 8 SI 413 Reference Counting Each object contains an integer indicating how many references there are to it. This count is updated continuously as the program proceeds. When the count falls to zero, the memory for the object is deallocated. Unit 8 SI 413 Analysis of Reference Counting This approach is used in filesystems and a few languages (notably PHP and Python). Advantages: Memory is freed as soon as possible. • Program execution never needs to halt. • Over-counting is wasteful but not catastrophic. • Disadvantages: Additional code to run every time references are • created, destroyed, moved, copied, etc. Cycles present a major challenge. • Unit 8 SI 413 Mark and Sweep Garbage collection is performed periodically, usually halting the program. During garbage collection, the entire set of reachable references is traversed, starting from the names currently in scope.
    [Show full text]
  • Deferred Gratification: Engineering for High Performance Garbage
    Deferred Gratification: Engineering for High Performance Garbage Collection from the Get Go ∗ Ivan Jibaja† Stephen M. Blackburn‡ Mohammad R. Haghighat∗ Kathryn S. McKinley† †The University of Texas at Austin ‡Australian National University ∗Intel Corporation [email protected], [email protected], [email protected], [email protected] Abstract Language implementers who are serious about correctness and Implementing a new programming language system is a daunting performance need to understand deferred gratification: paying the task. A common trap is to punt on the design and engineering of software engineering cost of exact GC up front will deliver correct- exact garbage collection and instead opt for reference counting or ness and memory system performance later. TM conservative garbage collection (GC). For example, AppleScript , Categories and Subject Descriptors D.3.4 [Programming Languages]: Perl, Python, and PHP implementers chose reference counting Processors—Memory management (garbage collection); Optimization (RC) and Ruby chose conservative GC. Although easier to get General Terms Design, Experimentation, Performance, Management, Measure- working, reference counting has terrible performance and conser- ment, Languages vative GC is inflexible and performs poorly when allocation rates are high. However, high performance GC is central to performance Keywords Heap, PHP, Scripting Languages, Garbage Collection for managed languages and only becoming more critical due to relatively lower memory bandwidth and higher
    [Show full text]
  • Flexible Reference-Counting-Based Hardware Acceleration for Garbage Collection
    Flexible Reference-Counting-Based Hardware Acceleration for Garbage Collection José A. Joao* Onur Mutlu‡ Yale N. Patt* * HPS Research Group ‡ Computer Architecture Laboratory University of Texas at Austin Carnegie Mellon University Motivation: Garbage Collection Garbage Collection (GC) is a key feature of Managed Languages Automatically frees memory blocks that are not used anymore Eliminates bugs and improves security GC identifies dead (unreachable) objects, and makes their blocks available to the memory allocator Significant overheads Processor cycles Cache pollution Pauses/delays on the application 2 Software Garbage Collectors Tracing collectors Recursively follow every pointer starting with global, stack and register variables, scanning each object for pointers Explicit collections that visit all live objects Reference counting Tracks the number of references to each object Immediate reclamation Expensive and cannot collect cyclic data structures State-of-the-art: generational collectors Young objects are more likely to die than old objects Generations: nursery (new) and mature (older) regions 3 Overhead of Garbage Collection 4 Hardware Garbage Collectors Hardware GC in general-purpose processors? Ties one GC algorithm into the ISA and the microarchitecture High cost due to major changes to processor and/or memory system Miss opportunities at the software level, e.g. locality improvement Rigid trade-off: reduced flexibility for higher performance on specific applications Transistors are available Build accelerators
    [Show full text]
  • Cyclic Reference Counting by Typed Reference Fields
    Computer Languages, Systems & Structures 38 (2012) 98–107 Contents lists available at SciVerse ScienceDirect Computer Languages, Systems & Structures journal homepage: www.elsevier.com/locate/cl Cyclic reference counting by typed reference fields$ J. Morris Chang a, Wei-Mei Chen b,n, Paul A. Griffin a, Ho-Yuan Cheng b a Department of Electrical and Computer Engineering, Iowa Sate University, Ames, IA 50011, USA b Department of Electronic Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan article info abstract Article history: Reference counting strategy is a natural choice for real-time garbage collection, but the Received 13 October 2010 cycle collection phase which is required to ensure the correctness for reference Received in revised form counting algorithms can introduce heavy scanning overheads. This degrades the 14 July 2011 efficiency and inflates the pause time required for garbage collection. In this paper, Accepted 1 September 2011 we present two schemes to improve the efficiency of reference counting algorithms. Available online 10 September 2011 First, in order to make better use of the semantics of a given program, we introduce a Keywords: novel classification model to predict the behavior of objects precisely. Second, in order Java to reduce the scanning overheads, we propose an enhancement for cyclic reference Garbage collection counting algorithms by utilizing strongly-typed reference features of the Java language. Reference counting We implement our proposed algorithm in Jikes RVM and measure the performance over Memory management various Java benchmarks. Our results show that the number of scanned objects can be reduced by an average of 37.9% during cycle collection phase.
    [Show full text]