Data Structures I, II, III, and IV

I. Amortized Analysis
II. Binary and Binomial Heaps
III. Fibonacci Heaps
IV. Union–Find

Lecture slides by Kevin Wayne
http://www.cs.princeton.edu/~wayne/kleinberg-tardos
Last updated on 11/13/19

Data structures

Static problems. Given an input, produce an output.
Ex. Sorting, FFT, edit distance, shortest paths, MST, max-flow, ...

Dynamic problems. Given a sequence of operations (given one at a time), produce a sequence of outputs.
Ex. Stack, queue, symbol table, union–find, ...

Algorithm. Step-by-step procedure to solve a problem.

Data structure. Way to store and organize data.
Ex. Array, linked list, binary heap, hash table, ...

Appetizer

Goal. Design a data structure to support all operations in O(1) time.
・INIT(n): create and return an initialized array (all zero) of length n.
・READ(A, i): return element i in array.
・WRITE(A, i, value): set element i in array to value.

Assumptions.
・Can MALLOC an uninitialized array of length n in O(1) time.  [true in C or C++, but not Java]
・Given an array, can read or write element i in O(1) time.

Remark. An array does INIT in Θ(n) time and READ and WRITE in Θ(1) time.

Appetizer

Data structure. Three arrays A[1.. n], B[1.. n], and C[1.. n], and an integer k.
・A[i] stores the current value for READ (if initialized).
・k = number of initialized entries.
・C[j] = index of j th initialized element for j = 1, …, k.
・If C[j] = i, then B[i] = j for j = 1, …, k.

Theorem. A[i] is initialized iff both 1 ≤ B[i] ≤ k and C[B[i]] = i.
Pf. Ahead.

        1    2    3    4    5    6    7    8
A[ ]    ?   22   55   99    ?   33    ?    ?
B[ ]    ?    3    4    1    ?    2    ?    ?
C[ ]    4    6    2    3    ?    ?    ?    ?

k = 4
A[4]=99, A[6]=33, A[2]=22, and A[3]=55 initialized in that order

Appetizer

INIT (A, n)
   k ← 0.
   A ← MALLOC(n).
   B ← MALLOC(n).
   C ← MALLOC(n).

READ (A, i)
   IF (IS-INITIALIZED (A[i]))
      RETURN A[i].
   ELSE
      RETURN 0.

WRITE (A, i, value)
   IF (IS-INITIALIZED (A[i]))
      A[i] ← value.
   ELSE
      k ← k + 1.
      A[i] ← value.
      B[i] ← k.
      C[k] ← i.

IS-INITIALIZED (A, i)
   IF (1 ≤ B[i] ≤ k) and (C[B[i]] = i)
      RETURN true.
   ELSE
      RETURN false.

Theorem. A[i] is initialized iff both 1 ≤ B[i] ≤ k and C[B[i]] = i.
Pf. ⇒
・Suppose A[i] is the j th entry to be initialized.
・Then C[j] = i and B[i] = j.
・Thus, C[B[i]] = i. ▪
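The pseudocode above translates almost line for line into code. A minimal Java sketch follows (class and method names are my own, not from the slides); note that Java zero-initializes arrays, so the O(1) uninitialized-MALLOC assumption does not literally apply there, and the sketch only illustrates the bookkeeping, using 0-based indices.

// Sketch of the A/B/C three-array scheme (illustrative names, not from the slides).
// Java zero-initializes arrays, so the trick is not needed in Java itself;
// this only demonstrates the logic of the slides' pseudocode.
public class InitArray {
    private final int[] a, b, c;   // a[i] = value, b[i] = 1-based slot in c, c[j-1] = j-th initialized index
    private int k = 0;             // number of initialized entries

    public InitArray(int n) {                 // INIT(n)
        a = new int[n]; b = new int[n]; c = new int[n];
    }

    private boolean isInitialized(int i) {    // IS-INITIALIZED(A, i)
        return 1 <= b[i] && b[i] <= k && c[b[i] - 1] == i;
    }

    public int read(int i) {                  // READ(A, i)
        return isInitialized(i) ? a[i] : 0;
    }

    public void write(int i, int value) {     // WRITE(A, i, value)
        if (!isInitialized(i)) {
            k++;
            b[i] = k;                         // record that slot i is the k-th initialized entry
            c[k - 1] = i;
        }
        a[i] = value;
    }
}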

Appetizer

Theorem. A[i] is initialized iff both 1 ≤ B[i] ≤ k and C[B[i]] = i.
Pf. ⇐
・Suppose A[i] is uninitialized.
・If B[i] < 1 or B[i] > k, then A[i] is clearly uninitialized.
・If 1 ≤ B[i] ≤ k by coincidence, then we still can't have C[B[i]] = i, because none of the entries C[1.. k] can equal i. ▪

AMORTIZED ANALYSIS
‣ binary counter
‣ multi-pop stack
‣ dynamic table


Amortized analysis

Worst-case analysis. Determine worst-case running time of a data structure operation as a function of the input size n.
[can be too pessimistic if the only way to encounter an expensive operation is when there were lots of previous cheap operations]

Amortized analysis. Determine worst-case running time of a sequence of n data structure operations.

Ex. Starting from an empty stack implemented with a dynamic table, any sequence of n push and pop operations takes O(n) time in the worst case.

Amortized analysis: applications
・Splay trees.
・Dynamic table.
・Fibonacci heaps.
・Garbage collection.
・Move-to-front list updating.
・Push–relabel algorithm for max flow.
・Path compression for disjoint-set union.
・Structural modifications to red–black trees.
・Security, databases, distributed computing, ...

SIAM J. Alg. Disc. Meth., Vol. 6, No. 2, April 1985

AMORTIZED COMPUTATIONAL COMPLEXITY. Robert Endre Tarjan.

Abstract. A powerful technique in the complexity analysis of data structures is amortization, or averaging over time. Amortized running time is a realistic but robust complexity measure for which we can obtain surprisingly tight upper and lower bounds on a variety of algorithms. By following the principle of designing algorithms whose amortized complexity is low, we obtain "self-adjusting" data structures that are simple, flexible and efficient. This paper surveys recent work by several researchers on amortized complexity.

AMS(MOS) subject classifications. 68C25, 68E05

[The slides reproduce the first page of Tarjan's paper; its introduction motivates amortization with the same push/pop stack example analyzed later in this lecture.]

Binary counter

Goal. Increment a k-bit binary counter (mod 2^k).
Representation. A[j] = j th least significant bit of counter.

[CLRS Figure 17.2] An 8-bit binary counter as its value goes from 0 to 16 by a sequence of 16 INCREMENT operations. Bits that flip to achieve the next value are shaded in the original figure; the running cost for flipping bits is shown at the right. Notice that the total cost is always less than twice the total number of INCREMENT operations.

counter value    A[7..0]    total cost
      0         00000000         0
      1         00000001         1
      2         00000010         3
      3         00000011         4
      4         00000100         7
      5         00000101         8
      6         00000110        10
      7         00000111        11
      8         00001000        15
      9         00001001        16
     10         00001010        18
     11         00001011        19
     12         00001100        22
     13         00001101        23
     14         00001110        25
     15         00001111        26
     16         00010000        31

Cost model. Number of bits flipped.
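For concreteness, here is a small Java sketch (my own, not from the slides) of INCREMENT under this cost model, counting bit flips; after 16 increments on an 8-bit counter it reports 31 flips, matching the last row of the table above.

// Illustrative sketch (not from the slides): k-bit binary counter that
// counts bit flips, the "number of bits flipped" cost model.
public class BinaryCounter {
    private final int[] a;     // a[j] = j-th least significant bit
    private long flips = 0;    // total actual cost so far

    public BinaryCounter(int k) { a = new int[k]; }

    // INCREMENT (mod 2^k): flip trailing 1s to 0, then the rightmost 0 to 1.
    public void increment() {
        int j = 0;
        while (j < a.length && a[j] == 1) {
            a[j] = 0; flips++; j++;
        }
        if (j < a.length) { a[j] = 1; flips++; }
    }

    public long flips() { return flips; }

    public static void main(String[] args) {
        BinaryCounter c = new BinaryCounter(8);
        for (int i = 1; i <= 16; i++) c.increment();
        System.out.println(c.flips());   // prints 31, which is < 2 * 16
    }
}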

Theorem. Starting from the zero counter, a sequence of n INCREMENT operations flips O(nk) bits.  [overly pessimistic upper bound]
Pf. At most k bits flipped per increment. ▪

Binary counter: aggregate method

Aggregate method. Analyze cost of a sequence of operations.

Theorem. Starting from the zero counter, a sequence of n INCREMENT operations flips O(n) bits.
Pf.
・Bit j flips ⌊n / 2^j⌋ times.
・The total number of bits flipped is

     Σ_{j=0..k−1} ⌊n / 2^j⌋  <  n · Σ_{j≥0} (1/2)^j  =  2n.  ▪

Remark. Theorem may be false if initial counter is not zero.

Accounting method (banker’s method)

Assign (potentially) different charges to each operation.  [amortized cost can be more or less than actual cost]
・Di = data structure after i th operation.
・ci = actual cost of i th operation.
・ĉi = amortized cost of i th operation = amount we charge operation i.
・When ĉi > ci, we store credits in data structure Di to pay for future ops;
  when ĉi < ci, we consume credits in data structure Di.
・Initial data structure D0 starts with 0 credits.

Credit invariant. The total number of credits in the data structure ≥ 0.
[our job is to choose suitable amortized costs so that this invariant holds]

     Σ_{i=1..n} ĉi − Σ_{i=1..n} ci  ≥  0

Theorem. Starting from the initial data structure D0, the total actual cost of any sequence of n operations is at most the sum of the amortized costs.
Pf. The amortized cost of the sequence of operations is  Σ_{i=1..n} ĉi ≥ Σ_{i=1..n} ci.  ▪  [credit invariant]

Intuition. Measure running time in terms of credits (time = money).

Binary counter: accounting method

Credits. One credit pays for a bit flip.
Invariant. Each 1 bit has one credit; each 0 bit has zero credits.

Accounting.
・Flip bit j from 0 to 1: charge 2 credits (use one and save one in bit j).
・Flip bit j from 1 to 0: pay for it with the 1 credit saved in bit j.

increment demo (bit positions 7 6 5 4 3 2 1 0):
   before:  0 1 0 0 1 1 1 1
   after:   0 1 0 1 0 0 0 0

Theorem. Starting from the zero counter, a sequence of n INCREMENT operations flips O(n) bits.
Pf.
・Each INCREMENT operation flips at most one 0 bit to a 1 bit (the rightmost 0 bit, unless the counter overflows), so the amortized cost per INCREMENT ≤ 2.
・Invariant ⇒ number of credits in data structure ≥ 0.
・Total actual cost of n operations ≤ sum of amortized costs ≤ 2n. ▪  [accounting method theorem]

Potential method (physicist’s method)

Potential function. Φ(Di) maps each data structure Di to a real number such that:
・Φ(D0) = 0.
・Φ(Di) ≥ 0 for each data structure Di.

Actual and amortized costs.
・ci = actual cost of i th operation.
・ĉi = ci + Φ(Di) – Φ(Di–1) = amortized cost of i th operation.
[our job is to choose a potential function so that the amortized cost of each operation is low]

Theorem. Starting from the initial data structure D0, the total actual cost of any sequence of n operations is at most the sum of the amortized costs.
Pf. The amortized cost of the sequence of operations is

     Σ_{i=1..n} ĉi  =  Σ_{i=1..n} (ci + Φ(Di) − Φ(Di−1))
                    =  Σ_{i=1..n} ci + Φ(Dn) − Φ(D0)
                    ≥  Σ_{i=1..n} ci.  ▪

Binary counter: potential method

Potential function. Let Φ(D) = number of 1 bits in the binary counter D.
・Φ(D0) = 0.
・Φ(Di) ≥ 0 for each Di.

increment demo (bit positions 7 6 5 4 3 2 1 0):
   before:  0 1 0 0 1 1 1 1
   after:   0 1 0 1 0 0 0 0
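As a quick sanity check of this potential function, the following small Java snippet (illustrative, not from the slides) computes ĉi = ci + Φ(Di) − Φ(Di−1) for each INCREMENT and confirms it never exceeds 2, which is exactly what the theorem below proves.

// Illustrative check (not from the slides): with Φ(D) = number of 1 bits,
// the amortized cost of each INCREMENT is ci + Φ(Di) − Φ(Di−1) ≤ 2.
// Run with "java -ea PotentialCheck" to enable the assertion.
public class PotentialCheck {
    public static void main(String[] args) {
        int k = 8;
        int[] a = new int[k];
        int phiPrev = 0;                        // Φ(D0) = 0 for the zero counter
        for (int i = 1; i <= 1000; i++) {
            int cost = 0, j = 0;
            while (j < k && a[j] == 1) { a[j] = 0; cost++; j++; }
            if (j < k) { a[j] = 1; cost++; }
            int phi = 0;
            for (int bit : a) phi += bit;       // Φ(Di) = number of 1 bits
            int amortized = cost + phi - phiPrev;
            assert amortized <= 2 : "amortized cost exceeded 2";
            phiPrev = phi;
        }
        System.out.println("amortized cost per INCREMENT never exceeded 2");
    }
}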


Binary counter: potential method

Potential function. Let Φ(D) = number of 1 bits in the binary counter D.
・Φ(D0) = 0.
・Φ(Di) ≥ 0 for each Di.

Theorem. Starting from the zero counter, a sequence of n INCREMENT operations flips O(n) bits.
Pf.
・Suppose that the i th INCREMENT operation flips ti bits from 1 to 0.
・The actual cost ci ≤ ti + 1.  [operation flips at most one bit from 0 to 1; no bits flipped to 1 when counter overflows]
・The amortized cost ĉi = ci + Φ(Di) – Φ(Di–1)
                        ≤ ci + 1 – ti   [potential decreases by 1 for each of the ti bits flipped from 1 to 0 and increases by 1 for the bit flipped from 0 to 1]
                        ≤ 2.
・Total actual cost of n operations ≤ sum of amortized costs ≤ 2n. ▪  [potential method theorem]

Famous potential functions

Fibonacci heaps.   Φ(H) = 2 trees(H) + 2 marks(H)
Splay trees.       Φ(T) = Σ_{x ∈ T} ⌊log2 size(x)⌋
Move-to-front.     Φ(L) = 2 inversions(L, L*)
Preflow–push.      Φ(f) = Σ_{v : excess(v) > 0} height(v)
Red–black trees.   Φ(T) = Σ_{x ∈ T} w(x), where w(x) ∈ {0, 1, 2} depends on the color of x and the colors of its children

AMORTIZED ANALYSIS
‣ binary counter
‣ multi-pop stack
‣ dynamic table

SECTION 17.4

Multipop stack

Goal. Support operations on a set of elements:
・PUSH(S, x): add element x to stack S.
・POP(S): remove and return the most-recently added element.
・MULTI-POP(S, k): remove the most-recently added k elements.

MULTI-POP (S, k)
   FOR i = 1 TO k
      POP(S).

Exceptions. We assume POP throws an exception if stack is empty.
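A minimal Java sketch of these three operations follows (class name and the use of java.util.LinkedList are my own choices, not from the slides); MULTI-POP is implemented exactly as the pseudocode above, so it inherits POP's empty-stack exception.

import java.util.Deque;
import java.util.LinkedList;
import java.util.NoSuchElementException;

// Illustrative sketch (not from the slides): PUSH / POP / MULTI-POP.
public class MultipopStack<T> {
    // The slides use a singly linked list; java.util.LinkedList also gives O(1) push/pop.
    private final Deque<T> stack = new LinkedList<>();

    public void push(T x) {                 // PUSH(S, x): O(1)
        stack.push(x);
    }

    public T pop() {                        // POP(S): O(1); throws if the stack is empty
        if (stack.isEmpty()) throw new NoSuchElementException("pop from empty stack");
        return stack.pop();
    }

    public void multiPop(int k) {           // MULTI-POP(S, k): FOR i = 1 TO k, POP(S)
        for (int i = 1; i <= k; i++) pop(); // throws if fewer than k elements remain
    }
}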

Multipop stack

Theorem. Starting from an empty stack, any intermixed sequence of n PUSH, POP, and MULTI-POP operations takes O(n²) time.  [overly pessimistic upper bound]
Pf.
・Use a singly linked list.
・POP and PUSH take O(1) time each.
・MULTI-POP takes O(n) time. ▪

[figure: stack stored as a singly linked list with a top pointer]

Multipop stack: aggregate method

Theorem. Starting from an empty stack, any intermixed sequence of n PUSH, POP, and MULTI-POP operations takes O(n) time.
Pf.
・An element is popped at most once for each time that it is pushed.
・There are ≤ n PUSH operations.
・Thus, there are ≤ n POP operations (including those made within MULTI-POP). ▪

Multipop stack: accounting method

Credits. 1 credit pays for either a PUSH or POP.
Invariant. Every element on the stack has 1 credit.

Accounting.
・PUSH(S, x): charge 2 credits.
  - use 1 credit to pay for pushing x now
  - store 1 credit to pay for popping x at some point in the future
・POP(S): charge 0 credits.
・MULTI-POP(S, k): charge 0 credits.

Theorem. Starting from an empty stack, any intermixed sequence of n PUSH, POP, and MULTI-POP operations takes O(n) time.
Pf.
・Invariant ⇒ number of credits in data structure ≥ 0.
・Amortized cost per operation ≤ 2.
・Total actual cost of n operations ≤ sum of amortized costs ≤ 2n. ▪  [accounting method theorem]

Multipop stack: potential method

Potential function. Let Φ(D) = number of elements currently on the stack.
・Φ(D0) = 0.
・Φ(Di) ≥ 0 for each Di.

Theorem. Starting from an empty stack, any intermixed sequence of n PUSH, POP, and MULTI-POP operations takes O(n) time.

Pf. [Case 1: push]
・Suppose that the i th operation is a PUSH.
・The actual cost ci = 1.
・The amortized cost ĉi = ci + Φ(Di) – Φ(Di–1) = 1 + 1 = 2.

Pf. [Case 2: pop]
・Suppose that the i th operation is a POP.
・The actual cost ci = 1.
・The amortized cost ĉi = ci + Φ(Di) – Φ(Di–1) = 1 – 1 = 0.

Pf. [Case 3: multi-pop]
・Suppose that the i th operation is a MULTI-POP of k objects.
・The actual cost ci = k.
・The amortized cost ĉi = ci + Φ(Di) – Φ(Di–1) = k – k = 0. ▪

Pf. [putting everything together]
・Amortized cost ĉi ≤ 2.  [2 for push; 0 for pop and multi-pop]
・Sum of amortized costs ĉi of the n operations ≤ 2n.
・Total actual cost ≤ sum of amortized costs ≤ 2n. ▪  [potential method theorem]

AMORTIZED ANALYSIS
‣ binary counter
‣ multi-pop stack
‣ dynamic table

SECTION 17.4

Dynamic table

Goal. Store items in a table (e.g., for hash table, binary heap).
・Two operations: INSERT and DELETE.
  - too many items inserted ⇒ expand table.
  - too many items deleted ⇒ contract table.
・Requirement: if table contains m items, then space = Θ(m).

Theorem. Starting from an empty dynamic table, any intermixed sequence of n INSERT and DELETE operations takes O(n²) time.  [overly pessimistic upper bound]
Pf. Each INSERT or DELETE takes O(n) time. ▪

Dynamic table: insert only

・When inserting into an empty table, allocate a table of capacity 1.
・When inserting into a full table, allocate a new table of twice the capacity and copy all items.
・Insert item into table.

insert   old capacity   new capacity   insert cost   copy cost
  1            0              1             1            –
  2            1              2             1            1
  3            2              4             1            2
  4            4              4             1            –
  5            4              8             1            4
  6            8              8             1            –
  7            8              8             1            –
  8            8              8             1            –
  9            8             16             1            8
  ⋮            ⋮              ⋮             ⋮            ⋮

Cost model. Number of items written (due to insertion or copy).
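Here is a small Java sketch of the insert-only table (illustrative, not from the slides) that doubles on overflow and counts item writes, i.e., the cost model above; for n inserts the count stays below 3n, matching the aggregate bound proved next.

// Illustrative sketch (not from the slides): insert-only dynamic table with doubling.
// "writes" counts items written due to insertion or copy.
public class DynamicTable {
    private int[] table = null;   // no table allocated yet
    private int size = 0;         // number of items
    private long writes = 0;      // total actual cost

    public void insert(int x) {
        if (table == null) {
            table = new int[1];                        // empty table: allocate capacity 1
        } else if (size == table.length) {
            int[] bigger = new int[2 * table.length];  // full table: double the capacity
            for (int i = 0; i < size; i++) {
                bigger[i] = table[i];                  // copy all items
                writes++;
            }
            table = bigger;
        }
        table[size++] = x;                             // write the inserted item itself
        writes++;
    }

    public long writes() { return writes; }

    public static void main(String[] args) {
        DynamicTable t = new DynamicTable();
        int n = 1000;
        for (int i = 0; i < n; i++) t.insert(i);
        System.out.println(t.writes() + " writes for " + n + " inserts (bound: " + 3 * n + ")");
    }
}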

Dynamic table: insert only (aggregate method)

Theorem. [via aggregate method] Starting from an empty dynamic table, any sequence of n INSERT operations takes O(n) time.

Pf. Let ci denote the cost of the i th insertion:

     ci = i   if i − 1 is an exact power of 2
     ci = 1   otherwise

Starting from an empty table, the cost of a sequence of n INSERT operations is:

     Σ_{i=1..n} ci  ≤  n + Σ_{j=0..⌊lg n⌋} 2^j  <  n + 2n  =  3n.  ▪

Dynamic table demo: insert only (accounting method)

Insert. Charge 3 credits (use 1 credit to insert; save 2 with new item).
Invariant. 2 credits with each item in right half of table; none in left half.

[demo: table of capacity 16 containing items A–M; the items in the right half each hold 2 credits]

Dynamic table: insert only (accounting method) Dynamic table: insert only (potential method)

Insert. Charge 3 credits (use 1 credit to insert; save 2 with new item). Theorem. [via potential method] Starting from an empty dynamic table, any sequence of n INSERT operations takes O(n) time. Invariant. 2 credits with each item in right half of table; none in left half.

Pf. [induction] Pf. Let Φ(Di) = 2 size(Di) – capacity(Di). slight cheat if table capacity = 1 ・Each newly inserted item gets 2 credits. (can charge only 2 credits for first insert) number of capacity of ・When table doubles from k to 2k, k / 2 items in the table have 2 credits. elements array - these k credits pay for the work needed to copy the k items ・Φ(D0) = 0. immediately after doubling - now, all items are in left half of table (and have 0 credits) for each . k ・Φ(Di) ≥ 0 Di capacity(Di) = 2 size(Di)

Theorem. [via accounting method] Starting from an empty dynamic table, any sequence of n INSERT operations takes O(n) time. 1 2 3 4 5 6 Pf. ・Invariant ⇒ number of credits in data structure ≥ 0. size = 6 capacity = 8 ・Amortized cost per INSERT = 3. Φ = 4 ・Total actual cost of n operations ≤ sum of amortized cost ≤ 3n. ▪

accounting method theorem 43 44 Dynamic table: insert only (potential method) Dynamic table: insert only (potential method)

Case 0. [first insertion]
・Actual cost c1 = 1.
・Φ(D1) – Φ(D0) = (2 size(D1) – capacity(D1)) – (2 size(D0) – capacity(D0)) = 1.
・Amortized cost ĉ1 = c1 + (Φ(D1) – Φ(D0)) = 1 + 1 = 2.

Case 1. [no array expansion]  capacity(Di) = capacity(Di–1).
・Actual cost ci = 1.
・Φ(Di) – Φ(Di–1) = (2 size(Di) – capacity(Di)) – (2 size(Di–1) – capacity(Di–1)) = 2.
・Amortized cost ĉi = ci + (Φ(Di) – Φ(Di–1)) = 1 + 2 = 3.

Case 2. [array expansion]  capacity(Di) = 2 capacity(Di–1).
・Actual cost ci = 1 + capacity(Di–1).
・Φ(Di) – Φ(Di–1) = (2 size(Di) – capacity(Di)) – (2 size(Di–1) – capacity(Di–1))
                  = 2 – capacity(Di) + capacity(Di–1)
                  = 2 – capacity(Di–1).
・Amortized cost ĉi = ci + (Φ(Di) – Φ(Di–1))
                    = 1 + capacity(Di–1) + (2 – capacity(Di–1))
                    = 3.

[putting everything together]
・Amortized cost per operation ĉi ≤ 3.
・Total actual cost of n operations ≤ sum of amortized costs ≤ 3n. ▪  [potential method theorem]

Dynamic table: doubling and halving

Thrashing.
・INSERT: when inserting into a full table, double capacity.
・DELETE: when deleting from a table that is ½-full, halve capacity.

Efficient solution.
・When inserting into an empty table, initialize table size to 1;
  when deleting from a table of size 1, free the table.
・INSERT: when inserting into a full table, double capacity.
・DELETE: when deleting from a table that is ¼-full, halve capacity.

Memory usage. A dynamic table uses Θ(n) memory to store n items.
Pf. Table is always between 25% and 100% full. ▪

Dynamic table demo: insert and delete (accounting method)

Insert. Charge 3 credits (1 to insert; save 2 with item if in right half).
Delete. Charge 2 credits (1 to delete; save 1 in empty slot if in left half).
Invariant 1. 2 credits with each item in right half of table.
Invariant 2. 1 credit with each empty slot in left half of table.

[demo: delete M from a table of capacity 16 containing items A–M]
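A Java sketch of the efficient doubling-and-halving rule follows (illustrative, not from the slides; it deletes the most recently inserted item for simplicity, and it halves when the table becomes at most ¼-full after a deletion, one reasonable reading of the rule above). Either way, the table stays between 25% and 100% full.

// Illustrative sketch (not from the slides): dynamic table that doubles when
// full and halves when at most ¼-full after a deletion.
public class HalvingDynamicTable {
    private int[] table = null;
    private int size = 0;

    public void insert(int x) {
        if (table == null) table = new int[1];                         // empty table: capacity 1
        else if (size == table.length) resize(2 * table.length);       // full: double
        table[size++] = x;
    }

    public int delete() {                                              // removes the last inserted item
        if (size == 0) throw new IllegalStateException("delete from empty table");
        int x = table[--size];
        if (size == 0) table = null;                                   // table of size 1 emptied: free it
        else if (size <= table.length / 4) resize(table.length / 2);   // at most ¼-full: halve
        return x;
    }

    private void resize(int capacity) {
        int[] copy = new int[capacity];
        for (int i = 0; i < size; i++) copy[i] = table[i];             // copy all items
        table = copy;
    }
}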

Dynamic table: insert and delete (accounting method)

Insert. Charge 3 credits (1 to insert; save 2 with item if in right half).
Delete. Charge 2 credits (1 to delete; save 1 in empty slot if in left half).
[discard any existing or extra credits when the table is resized]
Invariant 1. 2 credits with each item in right half of table.  [to pay for expansion]
Invariant 2. 1 credit with each empty slot in left half of table.  [to pay for contraction]

Theorem. [via accounting method] Starting from an empty dynamic table, any intermixed sequence of n INSERT and DELETE operations takes O(n) time.
Pf.
・Invariants ⇒ number of credits in data structure ≥ 0.
・Amortized cost per operation ≤ 3.
・Total actual cost of n operations ≤ sum of amortized costs ≤ 3n. ▪  [accounting method theorem]

Dynamic table: insert and delete (potential method)

Theorem. [via potential method] Starting from an empty dynamic table, any intermixed sequence of n INSERT and DELETE operations takes O(n) time.

Pf sketch.
・Let α(Di) = size(Di) / capacity(Di).
・Define
     Φ(Di) = 2 size(Di) − capacity(Di)      if α(Di) ≥ 1/2
     Φ(Di) = ½ capacity(Di) − size(Di)      if α(Di) < 1/2
・Φ(D0) = 0, Φ(Di) ≥ 0.  [a potential function]
・When α(Di) = 1/2, Φ(Di) = 0.  [zero potential after resizing]
・When α(Di) = 1, Φ(Di) = size(Di).  [can pay for expansion]
・When α(Di) = 1/4, Φ(Di) = size(Di).  [can pay for contraction]
...
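To see why this two-case potential works, a tiny Java helper (illustrative, not from the slides) evaluates Φ at the three load factors highlighted in the proof sketch.

// Illustrative helper (not from the slides): the two-case potential for the
// insert-and-delete table, evaluated at a few load factors α = size/capacity.
public class TwoCasePotential {
    static double phi(int size, int capacity) {
        double alpha = (double) size / capacity;
        return (alpha >= 0.5) ? 2.0 * size - capacity    // α ≥ 1/2
                              : 0.5 * capacity - size;   // α < 1/2
    }

    public static void main(String[] args) {
        System.out.println(phi(8, 16));   // α = 1/2  ->  0   (zero potential after resizing)
        System.out.println(phi(16, 16));  // α = 1    -> 16   (= size, can pay for expansion)
        System.out.println(phi(4, 16));   // α = 1/4  ->  4   (= size, can pay for contraction)
    }
}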