The memory

int • • Address • 1036 0 0 1 0 0 1 0 0 1 1 1 0 0 0 1 0 1 0 1 0 0 1 0 0 1 0 1 0 0 0 1 0 1040 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0190 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 1044 0 0 1 0 0 1 0 0 0 1 1 1 0 0 0 0 0 1 1 0 1 0 0 1 0 1 0 1 0 0 0 0 1048 1 0 0 1 0 0 1 1 0 0 0 byte0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1052 0 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 1 0 0 1056 0 0 1 0 0 1 0 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 1 1060 1 1 0 1H1 0 1 0 0 0 0 1e0 0 1 0 0 0 0 0l1 0 0 0 1 0 0 1l0 0 1 0 1064 0 1 0 1o0 1 0 1 0 1 0 0 0 1 1 1 1 1 0 1w0 0 0 0 0 0 0 0o1 0 0 1 1068 0 0 1 0r1 0 0 0 1 0 0 1l0 1 1 0 0 0 1 0d0 1 0 0 0 1 0 1 0 1 1 0 1072 0 1 1 1 0 0 1 1 0 1 0 0 1 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 1076 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1080 0 0 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 1 1 1 1 1 0 0 1 0 0 0 1 0 0 1084 0 0 1 0 0 0 1 0 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 • Jordi Cortadella and Jordi Petit • • Department of Computer Science string (“Hello world”)

Memory management © Dept. CS, UPC 2 Pointers Pointers Memory • Given a type T, T* is another type called “pointer to T”. • It holds the memory address of an object of type T. Address of a variable (reference operator &): i int i; j • Examples: int* pi = &i; // &i means: “the address of i” int* pi; pi myClass* pC; • Access to the variable (dereference operator *) int j = *pi; • Pointers are often used to manage the memory space // j gets the value of the variable pointed by pi of dynamic data structures. • (points to nowhere; useful for initialization) • In some cases, it might be convenient to obtain the address of a variable during runtime. int* pi = nullptr;

Memory management © Dept. CS, UPC 3 Memory management © Dept. CS, UPC 4 Dynamic Object Creation/Destruction Access to members through a pointer • The new operator returns a pointer to a newly created object: The members of a class can be accessed through a pointer to the class via the -> operator: myClass* = new myClass( ); myClass* c = new myClass{ }; // C++11 myClass* c = new myClass; vector* vp = new vector (100); … vp->push_back(5); • The delete operator destroys an object through a … pointer (deallocates the memory space associated to int n = vp->size(); the object):

delete c; // c must be a pointer

Memory management © Dept. CS, UPC 5 Memory management © Dept. CS, UPC 6 Allocating/deallocating arrays References new and delete can also create/destroy arrays of objects • A reference defines a new name for an existing value (a synonym).

c myClass[0] // c is a pointer to an myClass[1] // array of objects • References are not pointers, although they are myClass[2] myClass* c = new myClass[10]; usually implemented as pointers (memory myClass[3] // c[i] refers to one of the references). myClass[4] // elements of the array myClass[5] c[i].some_method_of_myClass(); myClass[6] • Typical uses: myClass[7] // deallocating the array delete [] c; – Avoiding copies (e.g., parameter passing) myClass[8] myClass[9] – Aliasing long names – Range for loops

Memory management © Dept. CS, UPC 7 Memory management © Dept. CS, UPC 8 References: examples Pointers vs. references • A pointer holds the memory address of a variable. A auto & = x[getIndex(a, b)].val; reference is an alias (another name) for an existing variable. ... r.defineValue(n); // avoids a long name for r ... • In practice, references are implemented as pointers.

// avoids a copy of a large data structure • A pointer can be re-assigned any number of times, whereas bigVector & V = myMatrix.getRow(i); a reference can only be assigned at initialization. ... • Pointers can be NULL. References always refer to an object. // An alternative for the following loop: // for (int i =0; i < a.size(); ++i) ++a[i]; • You can have pointers to pointers. You cannot have references to references. for (auto x: arr) ++x; // does not work (why?)

for (auto & x: arr) ++x; // it works! • You can use pointer arithmetic (e.g., &object+3). Reference arithmetic does not exist.

Memory management © Dept. CS, UPC 9 Memory management © Dept. CS, UPC 10 The Vector class • The natural replacement of C-style arrays.

• Main advantages: – It can dynamically grow and shrink. The Vector class – It is a class (can handle any type T) – No need to take care of the allocated memory. – Data is allocated in a contiguous memory area. (an approximation to the STL vector class) • We will implement a Vector class, a simplified version of STL’s vector.

• Based on Weiss’ book (4th edition), see Chapter 3.4.

Memory management © Dept. CS, UPC 12 The Vector class The Vector class

template template theSize: 3 class Vector { class Vector { theCapacity: 5 a[0] public: public: objects: a[1] … … a[2] private: private: … int theSize; }; int theCapacity; Object* objects; a pointer ! • What is a class template? }; – A generic abstract class that can handle various datatypes. – The template parameters determine the genericity of the class. • A Vector may allocate more memory than needed (size vs. capacity).

• Example of declarations: • The memory must be reallocated when there is no enough capacity in the storage area. – Vector V; – Vector Vp; • The pointer stores the base memory address (location of objects[0]). Pointers – Vector> M; can be used to allocate/free memory blocks via new/delete operators.

Memory management © Dept. CS, UPC 13 Memory management © Dept. CS, UPC 14 The Vector class Memory management

theSize: a[0] 3 theSize: 4 a[0] B’s A’s C’s A C B theCapacity: 5 a[1] theCapacity: 5 a[1] storage storage storage objects: a[2] objects: a[2] a[3]

After resizing B’s storage (e.g., B.push_back(…)) a.push_back(…) A’s B’s C’s A C B storage storage storage a.push_back(…)

• Data structures usually have a descriptor (fixed size) and a storage area (variable size). theSize: 5 a[0] Memory blocks cannot always be resized. They need to be reallocated and initialized theCapacity: 5 a[1] with the old block. After that, the old block can be freed. objects: a[2] a[3] • Programmers do not have to take care of memory allocation, but it is convenient to a.push_back(…) a[4] know the of memory management.

• Beware of memory leaks, fragmentation, … Memory management © Dept. CS, UPC 15 Memory management © Dept. CS, UPC 16 The Vector class The Vector class public: public: // Returns the size of the vector // Returns a const ref to the last element theSize: 3 int size() const const Object& back() const theSize: 3 theCapacity: 5 a[0] { return objects[theSize – 1]; } { return theSize; } theCapacity: 5 a[0] objects: a[1] objects: a[1] a[2] // Returns a ref to the i-th element // Is the vector empty? a[2] bool empty() const Object& operator[](int i) { return size() == 0; } { return objects[i]; }

// Adds an element to the back of the vector // Returns a const ref to the i-th element void push_back(const Object & x) { const Object& operator[](int i) const if (theSize == theCapacity) reserve(2*theCapacity + 1); { return objects[i]; } objects[theSize++] = x; } // Modifies the size of the vector (destroying // elements in case of shrinking) // Removes the last element of the array void resize(int newSize) { void pop_back() if (newSize > theCapacity) reserve(newSize); { theSize--; } see later theSize = newSize; }

Memory management © Dept. CS, UPC 17 Memory management © Dept. CS, UPC 18 The Vector class Constructors, copies, assignments

// Reserves space (to increase capacity) void reserve(int newCapacity) { if (newCapacity <= theCapacity) return; MyClass a, b; // Constructor is used

// Allocate the new memory block MyClass c = a; // Copy constructor is used Object* newArray = new Object[newCapacity]; b = c; // Assignment operator is used // Copy the old memory block for (int k = 0; k < theSize; ++k) newArray[k] = objects[k]; // Copy constructor is used when passing // the argument to a function (or returning theCapacity = newCapacity; // the value from a function) // Swap pointers and free old memory block std::swap(objects, newArray); theSize: void foo(Myclass x); delete [] newArray; 3 theCapacity: a[0] } 5 … objects: a[1] foo(c); // Copy constructor is used a[2]

Memory management © Dept. CS, UPC 19 Memory management © Dept. CS, UPC 20 The Vector class Memory layout of a program

// Default constructor with initial size Vector(int initSize = 0) : theSize{initSize}, Code + theCapacity{initSize + SPARE_CAPACITY} Heap free Stack { objects = new Object[theCapacity]; } static data (fragmented) memory (compact) // Copy constructor (binary file) Vector(const Vector& rhs) : theSize{rhs.theSize}, theCapacity{rhs.theCapacity}, objects{nullptr} { objects = new Object[theCapacity]; for (int k = 0; k < theSize; ++k) objects[k] = rhs.object[k]; low address (0) high address } theSize: 3 // Assignment operator theCapacity: 5 a[0] Vector& operator=(const Vector& rhs) { objects: a[1] if (this != &rhs) { // Avoid unnecessary copy Region Type of data Lifetime theSize = rhs.theSize; a[2] theCapacity = rhs.theCapacity; Static Global data Lifetime of the program delete [] objects; objects = new Object[theCapacity]; for (int k = 0; k < theSize; ++k) objects[k] = rhs.object[k]; Stack Local variables of a function Lifetime of the function } return *this; Heap Dynamic data Since created (new) until destroyed (delete) }

// Destructor ~Vector() { delete [] objects; }

Memory management © Dept. CS, UPC 21 Memory management © Dept. CS, UPC 22 Memory layout of a program Memory management models int g(vector& X) { • Programmer-controlled management: int i; Activation records vector Y; – The programmer decides when to allocate (new) and … deallocate (delete) blocks of memory. } i Y’s data – Example: C++. Y.theSize – Pros: efficiency, memory management can be optimized. void f() { g Y.theCapacity – int z; Y.objects Cons: error-prone (dangling references and memory leaks) vector A; X … • Automatic management: z = g(A); z … A.theSize – The program periodically launches a garbage collector that } f A.theCapacity frees all non-referenced blocks of memory. A.objects – Examples: Java, Python, R. int main() { A’s data double r; – Pros: the programmer does not need to worry about … memory management. f(); … main r – Cons: cannot optimize memory management, less control } over runtime. Stack Heap Memory management © Dept. CS, UPC 23 Memory management © Dept. CS, UPC 24 Dangling references and memory leaks Pointers and memory leaks

for (i = 0; i < 1000000; ++i) { myClass* A = new myClass; myClass* A = new Myclass; // 1Kbyte myClass* B = new myClass; ... // We have allocated space for two objects do_something(A, i); ... A = B; // forgets to delete A // A and B point at the same object! } // Possible (unreferenced memory space) // This loop produces a 1-Gbyte memory leak!

delete A; Recommendations: // Now B is a dangling reference • Do not use pointers, unless you are very desperate. // (points at free space) • If you have to use pointers, hide their usage inside a class and define consistent constructors/destructors. delete B; // Error • Use valgrind (valgrind.org) to detect memory leaks. • Remember: no pointers  no memory leaks.

Memory management © Dept. CS, UPC 25 Memory management © Dept. CS, UPC 26 Pointers and references to dynamic data How much memory space does an object take?

vector a; Objects ... // do some push_back’s to a Class ... const myClass& x = a[0]; // ref to a[0] Attributes Attributes Attributes Attributes (data) (data) (data) … (data) // x contains a memory address pointing at a[0] Obj Obj Objn a.push_back(something); 1 2 // the address of a[0] might have changed! // x might be pointing at garbage data • The methods (code) are shared by all the objects of the same class. Methods Recommendation: don’t trust on pointers/references to dynamic data. (code) • Each object has its own copy of attributes (data).

Possible solution: use indices/keys to access data in containers.

Memory management © Dept. CS, UPC 27 Memory management © Dept. CS, UPC 28 About C (the predecessor of C++) C: parameters by reference

• Developed by Dennis Ritchie at Bell Labs (1972) and used to // a is received by value and b by reference re-implement Unix. // (using a pointer) int f(int a, int* b) { • It was designed to be easily mappable to machine instructions *b = *b + a; and provide low-level memory access. return a*a; } • Today, it is still necessary to use some C libraries, designed by skilled experts, that have not been rewritten in other languages (the same happens with some libraries). int main() { int x, y, z; • Some aspects that must be known to interface with C libraries: … – No references (only pointers). // pointers used to pass by reference – No object-oriented support (no STL, no vectors, no maps, etc). z = f(x, &y); – Vectors must be implemented as arrays. … }

Memory management © Dept. CS, UPC 29 Memory management © Dept. CS, UPC 30 C: arrays C: arrays int a[100]; // an array of 100 int’s // Equivalent declaration: double sum(double* v, int n) double m[30][40]; // a matrix of size 30x40 // The size of the array must be indicated explicitly. double sum(double v[], int n) { int c[]; // also int* c: an array of unknown size double s = 0; for (int i = 0; i < n; ++i) s += v[i]; char* s; // a string (array of unknown size) return s; } … // memory allocation for n integers // Or we can use a sentinel as a terminator of an array. // Example: strings are usually terminated by a 0. c = malloc(n*sizeof(int)); int length(char* s) { int len; // memory allocation for m chars for (len = 0; s[len] != 0; ++len); s = malloc(m*sizeof(char)); return len; } … // Anything that is allocated must be freed int main() { // (or memory leaks will occur) char hello[] = “Hello, world!”; free(c); // arrays are implicitly passed by reference int l = length(hello); free(s); … }

Memory management © Dept. CS, UPC 31 Memory management © Dept. CS, UPC 32 C arrays and C++ vectors Assertions vs. error handling double sum(double v[], int n) { • Assertions: double s = 0; – Runtime checks of properties for (int i = 0; i < n; ++i) s += v[i]; (e.g., invariants, pre-/post-conditions). return s; – Useful to detect internal errors. } – They should always hold in a bug-free program. // Any C++ vector is an object that contains – They should give meaningful error messages to the // a private array. This array can be accessed programmers. // using the ‘data()’ method (C++11). int main() { • Error handling: vector a; … – Detect improper use of the program double s = sum(a.data(), a.size()); (e.g., incorrect input data). } – The internal array The program should not halt unexpectedly. of the vector – They should give meaningful error messages to the users.

Functions using the internal array of a vector should NEVER resize the array !

Memory management © Dept. CS, UPC 33 Containers: Stacks © Dept. CS, UPC 34 Assertions vs. error handling Summary

assert (x >= 0); • Memory is a valuable resource in computing devices. double y = sqrt(x); It must be used efficiently.

assert (i >= 0 and i < A.size()); • Languages are designed to make memory management transparent to the user, but a lot of inefficiencies may arise unintentionally (e.g., int v = A[i]; copy by value).

assert (p != nullptr); • Pointers imply the use of the heap and all the problems associated int n = p->number; to memory management (memory leaks, fragmentation).

• Recommendation: do not use pointers unless you have no other string id; choice. Not using pointers will save a lot of debugging time. cin >> id; if (isdigit(name[0])) { • In case of using pointers, try to hide the pointer manipulation and cerr << "Invalid identifier" << endl; memory management (new/delete) inside the class in such a way return false; that the user of the class does not need to “see” the pointers. }

Containers: Stacks © Dept. CS, UPC 35 Memory management © Dept. CS, UPC 36 Constructors Consider two versions of the program below, each one using a different definition of the class Point. Comment on the behavior of the program at compile time and runtime. class Point { class Point { int x, y; int x, y; public: public: Point(const Point &p) { Point(int i=0, int j=0) { x = p.x; y = p.y; x = i; y = j; } } int getX() { return x; } int getX() { return x; } int getY() { return y; } int getY() { return y; } }; };

EXERCISES int main() { Point p1(10); Point p2 = p1; cout << "x = " << p2.getX() << endl; cout << "y = " << p2.getY() << endl; }

Memory management © Dept. CS, UPC 37 Memory management © Dept. CS, UPC 38 Constructors and destructors List with pointers L int c = 0; // Global variable What is the output of this program? name name name name name Explain why. marks marks marks marks marks next next next next next class A { nullptr int id; void f(const A& x, A y) { struct Student { Consider the following definition of a list public: A z = x; of students organized as shown in the A() : id(++c) { cout << "N"; } A w; string name; picture. } vector marks; A(const A& x) { Student* next; You can assume that the vector of marks id = x.id; cout << "C"; int main() { }; is never empty. } A v1, *v2, v3; A v4 = v3; string BestStudent(Student* L); The last student in the list points at A& operator=(const A& x) { v2 = new A(); “null” (nullptr). id = x.id; cout << "A"; v1 = *v2; } f(v1, v4); Design the function BestStudent with the following specification: delete v2; ~A() { cout << id; } A v5; L points at the first student of the list. BestStudent returns the name of the student }; } with the best average mark. In case no student has an average mark greater than or equal to 5, the function must return the string “Bad Teacher”.

Memory management © Dept. CS, UPC 39 Memory management © Dept. CS, UPC 40