arXiv:2012.03996v1 [cs.DS] 7 Dec 2020 tpriual elsie osrigdt whose data to well-suited particularly it co preferred pr in previously the over Hence, . algorithm sorting and made standard Python as the such as languages programming adopted was and data, neeti h td fsrigalgorithms. sorting of study the in interest eg sorts. merge opnns i ain fa neto ot hc sused is which , insertion an of variant a (i) components: NaturalMergeSort NaturalMergeSort efracs sarsl,tegnrlsrcueof structure general the result, testin a extensive As through engineered, performances. carefully were which runs n oraitcdsrbtoso aa npriua,tev the particular, In data. of distributions ( realistic computers to of architecture and the to adapted well is algorithm ot,wihalwdrvn h aeuprbud.Atrusi After bounds. upper same the deriving allow which sorts, ult h oinof notion the to dual oprsn efre fuigti u-otn nta of instead sub-routine this using of performed comparisons n20,TmPtr,asfwr nier rae e sorti new a created engineer, software TimSort a Peters, Tim 2002, In ewrsadphrases and Keywords Classification Subject ACM 2012 Iran Tehran, Technology, of University Sharif ESIE Khalighinejad ENPC, Ghazal CNRS, 8049), (UMR LIGM Eiffel, Gustave Université Jugé Vincent sorts merge natural in Galloping u-ras eefe called hereafter sub-arrays, hc predated Knuth by used which already were decompositions Such simple. nlsso of Analysis esuytealgorithm the study We comparisons opeiymaue eut htaesmlrt hs alrea those to induce similar those are to that dual results are measure, that complexity measures complexity induces It u-nue esrs lhuhornwrslsd o edt lead not do results of new our Although measures. run-induced Introduction 1 lmn moves element mn h etietfidraosbhn h ucs of success the behind reasons best-identified the Among nadto obigantrlmresort, merge natural a being to addition In nti ril,w nrdc e betfrmauigtec the measuring for object new a introduce we article, this In nodrt os,w nrdc e oin of notions new introduce we so, do to order In n nmrigteern oehr hs l loihssh algorithms all Thus, together. runs these merging on and , Abstract epoeta hycnb ple oawat fvrat of variants of wealth a to applied be can they that prove we , [ 10 .Ti loih meitl eosrtdisecec fo efficiency its demonstrated immediately algorithm This ]. efre ytealgorithm. the by performed efre,te a edt rmtcipoeet ntenum the on improvements dramatic to lead may they performed, TimSort r localled also are sbsdo pitn rasit oooi usqecs a subsequences, monotonic into arrays splitting on based is runs TimSort otn loihs eg otn loihs iSr,Sh TimSort, algorithms, sorting Merge algorithms, Sorting n dpe h traditional the adapted and , nwhich on runs oepeiey elo tteipc ntenme felemen of number the on impact the at look we precisely, More . n h u-otn tue omremntnc(non-decreas monotonic merge to uses it sub-routine the and natural hoyo computation of Theory TimSort eg sorts. merge optimal ul t ucs ofr ec ecl it call we hence far, so success its built TimSort u decompositions run fast- loihscnrbtdt h eanof regain the to contributed algorithms TimSort and ykonwe osdrn standard considering when known dy av routine. naive a → NaturalMergeSort loicue ayoptimisations, many includes also r ocpinof conception ery .. o eln ihcceissues) with dealing for e.g., MergeSort yrn.W rv,frti new this for prove, We runs. by d n mrvmn ntenumber the on improvement any o ,t ffrtebs complexity best the offer to g, otn n searching and Sorting middle-growth gteentosscesul on successfully notions these ng gagrtm hc a called was which algorithm, ng mlxt fary.Ti notion This arrays. of omplexity TimSort oda with deal to a esltit he main three into split be can mnneo uhacustom- a such of ominence elbaiso wide-spread of libraries re ai,UPEM Paris, E TimSort [ 3 loih sfollows: as algorithm , r h atta this that fact the are rn hsfaueof feature this aring 5 seFigure (see ] o aua merge natural for [ n te natural other and small 7 otn actual sorting r eto 5.2.4], Section , TimSort e of ber us(e.g., runs s called lso ulruns dual iversSort, element makes 1 are ) ing) t . 2 Galloping in natural merge sorts

S =( 12, 7, 6, 5, 5, 7, 14, 36, 3, 3, 5, 21, 21, 20, 8, 5, 1 ) |first{z run} |second{z run} | third{z run } |fourth{z run} Figure 1 A sequence and its run decomposition computed by a greedy algorithm: for each run, the first two elements determine if it is non-decreasing or decreasing, then it continues with the maximum number of consecutive elements that preserves the monotonicity.

runs of length less than 32), (ii) a simple policy for choosing which large runs to merge, (iii) a sub-routine for merging these runs. The second component has been subject to an intense scrutiny these last few years, thereby giving birth to a great variety of TimSort-like algorithms. On the contrary, the first and third components, which seem more complicated and whose effect may be harder to quantify, have often been used as black boxes when studying TimSort or designing variants thereof.

Context and related work

The success of TimSort has nurtured the interest in the quest for sorting algorithms that would be adapted to arrays with few runs. However, the ad hoc conception of TimSort made its complexity analysis less easy than what one might have hoped, and it is only in 2015, a decade after TimSort had been largely deployed, that Auger et al. proved that TimSort required O(n log(n)) comparisons for sorting arrays of length n [2]. This is optimal in the model of sorting by comparisons, if the input array can be an arbitrary array of length n. However, taking into account the run decompositions of the input array allows using finer-grained complexity classes, as follows. First, one may consider only arrays whose run decomposition consists of ρ monotonic runs. On such arrays, the best worst-case one may hope for is O(n log(ρ)) [8]. Second, we may consider even more restricted classes of input arrays, and focus only on those arrays that consist of ρ runs of lengths r1,...,rρ. In that case, the best worst-case time complexity is O(n + nH), ρ where H is defined as H = H(r1/n,...,rρ/n) and H(x1,...,xρ)= − Pi=1 xi log2(xi) is the general entropy function [3, 6]. TimSort enjoys such a O(nH) time complexity [1]. In fact, since TimSort was invented, several natural merge sorts have been proposed, all of which were meant to offer easy- to-prove complexity guarantees. Such algorithms include ShiversSort [11], which runs in time O(n log(n)); α-StackSort [2], which, like NaturalMergeSort, runs in time O(n log(ρ)); α-MergeSort [4], PeekSort and PowerSort [9], and the most recent adaptive ShiversSort [6], which, like TimSort, run in time O(n + nH). Except TimSort, these algorithms are, in fact, described only as policies for merging runs, the actual sub-routine used for merging runs being left implicit. In practice, choosing a naive merging sub-routine does not harm the worst-case time complexities considered above. As a consequence, all authors identified the cost of merging two runs of lengths m and n with the sum m + n, and the complexity of the algorithm with the sum of the costs of the merges processed. One notable exception is that of [9], whose authors compare the running times of TimSort and of TimSort’s variant where the merging sub-routine is replaced by a naive routine. While the arrays on which the performance comparisons are performed are chosen to have a low entropy H, the authors did not try to identify which arrays could benefit the most from TimSort’s sub-routine. As a result, they unsurprisingly observed that TimSort’s complex sub- routine seemed less efficient than the naive one, but their work suggests another approach: V. Jugé and G. Khalighinejad 3

finding distributions on arrays for which TimSort’s merging sub-routine will actually be helpful.

Contributions

We study the time complexity of TimSort and its variants in a context where we refine the family of arrays we consider. This refinement is based on notion of complexity about the input arrays that is dual to the decomposition of arrays into monotonic runs. Consider an array A whose values A[1], A[2],...,A[n] are pairwise distinct integers between 1 and n: we identify A with a permutation of the set {1, 2,...,n}. In the standard literature, we subdivide A into distinct increasing runs, i.e., we partition the set {1, 2,...,n} into intervals R1, R2,...,Rρ such that the function x 7→ A[x] is increasing on each interval Ri. A dual approach would consist in partitioning that set into intervals S1,S2,...,Sσ, which we call dual runs below, such that the function x 7→ A−1[x] is increasing on each interval ∗ Sj . It would then be feasible to sort the array A in time O(n log(σ)), or even O(n + nH ), ∗ where our intervals have lengths s1,s2,...,sσ and H = H(s1/n,...,sσ/n). In this preliminary version, we prove that, thanks to its merging sub-routine, TimSort requires O(n log σ) comparisons. In a subsequent version of this paper, we will further prove that TimSort requires only O(nH∗) comparisons.

2 TimSort and its sub-routine for merging runs

In this section, we describe the TimSort and the components it consists of. As mentioned in Section 1, TimSort is based on decomposing its input array into monotonic (non-decreasing) runs, which are then merged together into larger monotonic runs. Following the literature on this subject, we first focus on the policy used by TimSort to decide which runs to merge. In order to do so, each monotonic run is identified to a pair of pointers to its first and last entries in the array. TimSort discovers its runs on-the-fly, from left to right, makes them non-decreasing, and inserts them into a stack, which is used to decide which runs should be merged. These actions are performed as described in Algorithm 1. This description of TimSort relies on two black-box sub-routines for (i) finding a new run R and making it non-decreasing, and (ii) merging two runs. The first sub-routine works as follows. New runs in the input array are found by using a naive (yet optimal) algorithm, where r comparisons are used to detect a new run R of length r. If r is smaller than a given threshold, the run R is made artificially longer by absorbing elements to its right. All these steps require a total of O(n) comparisons and element moves. Therefore, and although diving into details might give us some insight on the constant hidden in the O notation, we choose not to focus on these steps. The second sub-routine is a blend between a naive merging algorithm, which requires a + b − 1 comparisons to merge runs of lengths a and b, and a dichotomy-based algorithm, which requires O(log(a + b)) comparisons in the best case, and O(a + b) comparisons in the worst case. TimSort’s merging sub-routine includes both algorithms because the constant hidden in the O notations is quite larger than 1, which might slow down TimSort significantly if the dichotomy-based algorithm were used alone. This sub-routine roughly works as described in Algorithm 2. In practice, the parameter t is initially set to a constant (t = 7 in Java), and may change during the algorithm. In this paper, we will abusively assume that t is constant throughout 4 Galloping in natural merge sorts

the algorithm. Deciding whether our results remain valid when t is updated like in TimSort remains an open question so far. One readily observes that, if we are lucky enough, for instance if each element in B is smaller than each element in A, only O(log(a + b)) element comparisons will be performed. However, we should not expect to perform less than Θ(a + b) element moves. Consequently, using this merging sub-routine instead of a naive one may be relevant only if the cost of element comparisons is significantly larger than the cost of element (or pointer) moves.

3 Merging arrays with few distinct values

This section is devoted to proving our main result, which is stated in Theorem 1.

◮ Theorem 1. TimSort requires O(n log(σ)) element comparisons to sort arrays of length n with σ dual runs.

TimSort is a comparison-based sorting algorithm. Thus, its behaviour, i.e., the element comparisons and element moves it performs, are invariant under the following transformation: starting from an array A of length n with σ dual runs S1,S2,...,Sσ, we create a new array B of length n with σ distinct values, such that B[i] = j whenever A[i] belongs to the interval Sj . In other words, for comparison-based sorting algorithms such as TimSort, arrays with σ dual runs are indistinguishable from arrays with σ distinct values. Thus, Theorem 1 is in fact equivalent with the following result, which we focus on proving.

◮ Theorem 2. TimSort requires O(n log(k)) element comparisons to sort arrays of length n with k distinct values.

Our proof is decomposed into three parts, which follow the decomposition of TimSort’s structure into separate components used for deciding which large runs to merge and merging

Algorithm 1: TimSort Input : Array A to sort Result : The array A is sorted into a single run. That run remains on the stack. th Note : We denote the height of the stack S by h, and its i bottom-most run by Ri. The length of Ri is denoted by ri. When two consecutive runs of S are merged, they are replaced, in S, by the run resulting from the merge. 1 S ← an empty stack 2 while true : 3 if h > 3 and rh−2 2 and rh−1 6 rh : 6 merge the runs Rh−1 and Rh ⊲ case #2 7 else if h > 3 and rh−2 6 rh−1 + rh : 8 merge the runs Rh−1 and Rh ⊲ case #3 9 else if h > 4 and rh−3 6 rh−2 + rh−1 : 10 merge the runs Rh−1 and Rh ⊲ case #4 11 else if the end of the array has not yet been reached : 12 find a new monotonic run R, make it non-decreasing, and push it onto S 13 else: 14 break 15 while h > 2 : 16 merge the runs Rh−1 and Rh V. Jugé and G. Khalighinejad 5 these runs. In subsequent sections, we will then be able to adapt our proofs more easily to derive similar results to other sorting algorithms. Below, we first focus on the cost of merging two adjacent runs. Then, we will prove structural properties of the merge tree induced by TimSort; we call these properties the fast- growth and middle-growth properties. Finally, using these properties and our upper bounds about the cost of individual merges, we will derive Theorem 2. Before diving into this three-step proof, however, let us first introduce some notation about runs and their lengths, which we will frequently use, and which follows notations used in Algorithm 1. We will commonly denote runs with capital letters, possibly with some index or some adornment, and we will then denote the length of such a run with the same ′ small-case letter and the same index of adornment. For instance, runs named R, Q, Ri, R

Algorithm 2: TimSort’s merging sub-routine Input : Sorted arrays A and B of lengths a and b, and parameter t Result : Array C of length a + b, obtained by sorting the concatenation of arrays A and B Note : For each array T of length ℓ and each integer x ∈ {1, 2,...,ℓ}, we denote by T [x] the element of the array T in position x. If x>ℓ, we abusively set T [x]=+∞. 1 (x,y) ← (1, 1) 2 while x 6 a and y 6 b : 3 if A[x] 6 B[y] : ′ 4 x ← Search(A,B,C,x,y, t) ′ 5 Fill(A,C,x,x ,y) ′ 6 x ← x 7 else: ′ 8 y ← Search(B,A,C,y,x, t) ′ 9 Fill(B,C,y,y ,x) ′ 10 y ← y 11 Fill(A,C,x,a,b) 12 Fill(B,C,y,b,a)

′ 13 Function Fill(A,C,x,x ,y): ′ 14 while x

′ ′ ′ 17 Function Search(A,B,C,x,y, t): ⊲ find some x >x s.t. A[x − 1] 6 B[y] B[y] : ⊲ naive approach 19 x+ ← x + 1 20 else: ⊲ dichotomic approach 21 u ← 0 u 22 while A[x + 2 ] 6 B[y] : 23 u ← u + 1 − u 24 (x ,x+) ← (x,x + 2 ) − 25 while x+ > x + 2 : ◦ − 26 x ← (x + x+)/2 ◦ 27 if A[x ] 6 B[y] : − ◦ 28 x ← x 29 else: ◦ 30 x+ ← x 31 return x+ 6 Galloping in natural merge sorts

′ and R will have respective lengths r, q, ri, r and r.

3.1 Merging runs with few distinct values In the worst case, when merging runs of lengths a and b, TimSort’s merging sub-routine requires Θ(a + b) element comparisons and element moves. Moreover, as mentioned in Section 1, if two runs have only few distinct values, merging them can require many fewer comparisons. Both these statements are proved now. ◮ Proposition 3. Let A and B be two sorted arrays of lengths a and b, whose values are elements of the set {1, 2,...,k}. TimSort’s sub-routine performs O(min{a + b, k log(a + b)}) element comparisons to merge these two arrays. Proof. TimSort’s merging sub-routine goes at most 2kt times in the if branch of line 18, and at most 2k times in the else branch of line 20. Thus it performs at most 2k(t + 1) times the comparison in line 3. Finally, whenever the sub-routine goes through the else branch of line 20, it performs O(log(a + b)) element comparisons. Consequently, TimSort performs O(k log(a + b)) element comparisons. Second, we perform at most a + b times the comparisons of lines 3 and 18. Furthermore, whenever we enter the else branch of line 20, we return some integer x+ = x + δ for some

δ > t, after having performed 2⌈log2(δ)⌉ +1= O(δ) comparisons. Immediately afterwards, the sum of the counters x and y maintained in the core of Algorithm 2 increases by δ. Thus, whenever we enter the while loop of line 2, we have already performed O(x + y) comparisons within the else branch of line 20. Adding the comparisons of lines 3 and 18, TimSort performs a total of O(a + b) comparisons, which completes the proof. ◭

3.2 A fast-growth property Like its many variants, TimSort is based on merging adjacent monotonic (often, increasing) runs, either by the means of a naive merging sub-routine or by using Algorithm 2. However, this merging sub-routine can be used as a black box along with the procedure for choosing which runs to merge, as described in Algorithm 1. When an array A is sorted, its elements belong to runs which will, two by two, be merged with each other. Consequently, each element may undergo several successive merge operations. A convenient way to represent the succession of runs that ever occur while A is sorted is the use of a merge tree [3, 9, 6]. ◮ Definition 4. The merge tree induced by a natural merge sort algorithm on an array A is the binary rooted tree T defined as follows. The nodes of T are all the runs that were present in the initial array A or that resulted from merging two runs. The runs of the initial array are the leaves of T , and when two adjacent runs R1 and R2 are merged with each other (the run R1 spanning positions to the left of those of R2) into a new run R, they respectively form the left and right children of the node R. Such trees ease the task of referring to several runs that might not have occurred simultaneously. In particular, we will often refer to the ith ancestor or a run R, which is just R itself if i = 0, or the parent (in the tree T ) of the (i − 1)th ancestor of R if i > 1. Let us introduce now two notions of fast growth. We will then prove that TimSort matches these two notions. ◮ Definition 5. Let A be a natural merge sort algorithm. We say that A has the fast-growth property if it satisfies the following statement: V. Jugé and G. Khalighinejad 7

There exist an integer ℓ > 1 and a real number ε> 1 such that, for every merge tree T induced by A and every pair of runs R and R of T , if R is the ℓth ancestor of R, then r > εr.

We also say that A has the middle-growth property if it satisfies the following statement:

There exists a real number ε > 1 such that, for every merge tree T induced by A, every integer h > 0 and every run R of height h in the tree T , we have r > εh.

Since every node of height h in a merge tree is a run of length at list h, it comes at once that every algorithm with the fast-growth property also has the middle-growth property. However, we still wanted to introduce these two notions simultaneously. Indeed, we will reuse the notion of fast-growth in subsequent sections, with stronger results than for algorithm that only enjoy the middle-growth property. In addition, our proof that TimSort has the middle-growth property is based on proving that it has the fast-growth property, due to the following result.

◮ Proposition 6. Let T be a merge sort generated by TimSort, and let R and R be two runs in T . If R is the 6th ancestor of R, then r > 4r/3.

In order to prove Proposition 6, we will need do distinguish merges according to the case (from #1 to #4) that triggered them. In particular, although cases #2 to #4 all lead to merging the runs Rh−1 and Rh, we still distinguish these merges. Below, we say that a merge triggered by case #x, for x ∈{1, 2, 3, 4}, is an #x-merge. It may also be convenient 2 to group some classes of merges. For instance, we say that a merge m is a #3 4-merge if it 1 was triggered by one of the cases #2 to #4, or that m is a #2 4-merge if it was triggered by one of the cases #1, #2 or #4. For each type of merge #x, we also say that R is a #x-left (respectively, a #x-right) run if R is the left (respectively, right) child of its parent in the tree T , and was merged during a #x-merge. If knowing #x is irrelevant, we may also refer directly to left and right runs. We split the proof of Proposition 6 into several lemmas, all of which take place in the following setting. Borrowing notations from [1] and from Algorithm 1, we will look at the runs simultaneously present in a stack S = (R1, R2,...,Rh) just before the run R is merged. th Here, Ri is the i bottom-most run of the stack and ri is the length of the run Ri. Similarly, we denote by q the length of a run named Q, and so on. We will also denote by iR the ith ancestor or the run R, so that R = 0R and R = 6R, and by ir the length of iR. Let us now mention a series of results that will pave the way towards proving Proposition 6. Lemma 7 is a mere rephrasing of [1, Lemma 5], hence we omit its proof, while the results of subsequent Lemmas 8 to 10 are novel.

◮ Lemma 7. We have (i) ri > ri+1 + ri+2 for all i ∈ {1, 2,...,h − 4}, (ii) 3rh−2 > rh−1, (iii) rh−3 > rh−2 and (iv) rh−3 + rh−2 > rh−1.

From Lemma 7 follow immediately the inequalities (v) r1 > r2 >...>rh−2.

◮ Lemma 8. If 1R is a right run, then 2r > 4r/3.

Proof. Let R• be the left sibling of the run 1R, and let i ∈{h−2,h−1,h} be the integer such that R = Ri. If i = h − 2, then (iii) shows that ri−1 > ri. If i = h − 1, then (ii) shows that 2 2 ri−1 > ri/3. In both cases, the run Ri−1 descends from R, and thus r > r + ri−1 > 4r/3. 2 Finally, if i = h, then R is a #3 4-right run, which means both that rh−2 > r and that • 2 Rh−2 descends from R . It follows that r > r + rh−2 > 2r. ◭ 8 Galloping in natural merge sorts

1 2 ◮ Lemma 9. If R is a #2 4-left run, then r > 4r/3. Proof. We treat three cases independently, depending on whether R is a #1-, #2- or #4- left run. In each case, we assume that 1R is a left run, unless what Lemma 8 proves that 2r > 4r/3.

1 ⊲ If R is a #1-left run, then R = Rh−2 is merged with Rh−1 and rh−2 < rh. Since R is a 2 2 left run, Rh descends from R, and thus r > r + rh > 2r. ⊲ If R is a #2-left run, then R = Rh−1 is merged with Rh and rh−1 6 rh. It follows that 2 1 r > r = r + rh > 2r. 1 ⊲ If R is a #4-left run, then R = Rh−1 is merged with Rh, and we both know that rh−2 > r 1 and that rh−3 6 rh−2 + rh−1 6 rh−2 + r. Due to the latter inequality, our next journey 1 through the loop of lines 2 to 14 must trigger another merge. Since r < rh−2 < rh−3, that new merge cannot be a #1-merge, and thus 1R is a right run, contradicting our initial assumption. ◭

◮ Lemma 10. If R, 1R, 2R and 3R are #3-left runs, then 3r > 4r/3. Proof. Let 1P and 1Q be the two runs placed to the left of 1R (1P is the left neighbour of 1Q, which is the left neighbour of 1R) when 1R is merged, and let u be the largest integer 1 2 2 such that Ru descends from P . We define similarly the runs P and Q, and denote by v 2 the largest integer such that Rv descends from P . 1 Since R is a #3-left run, we know that R = Rh−1 and that rh−2 6 rh−1 +rh = r. Thus, our next journey through the loop of lines 2 to 14 must trigger another merge, which must 1 1 be a #1-merge since R is a left run. This means that rh−3 < r and that Rh−3 descends from 1Q, so that u 6 h − 4. Let us now repeat this argument twice. Since 1R and 2R are #3-left runs, we know that 1p< 2r, that 1P descends from 2Q, and thus that v 6 u−1 6 h−5. Then, since 2R and 3R are #3-runs, we know again that 2p< 3r. Thus, we derive from the inequalities (i) to (v) that 3 2 r> p > rv > rh−5 > rh−4 + rh−3 > 2rh−3 + rh−2 > rh−2 + rh−1 > 4rh−1/3=4r/3. ◭

Proof of Proposition 6. We only need to distinguish three (non mutually exclusive) cases: ⊲ If a run iR is a right run for some integer i ∈ {1, 2, 3, 4}, then Lemma 8 shows that 6r > i+1r > 4/3 × i−1r > 4r/3. i 1 ⊲ If a run R is a #2 4-left run for some integer i ∈ {1, 2, 3, 4}, then Lemma 9 shows that 6r > i+2r > 4/2 × ir > 4r/3. ⊲ If all runs iR such that i ∈ {1, 2, 3, 4} are #3-left runs, then Lemma 10 shows that 6r > 4r > 4/3 × 1r > 4r/3. ◭

3.3 Sorting arrays with few element comparisons We prove below that every natural merge sort algorithm with the middle-growth property requires few element comparisons to sort arrays with few values, if they use TimSort’s merging sub-routine. This assertion is more precisely stated in the following result, which we are then devoted to proving. ◮ Proposition 11. Let A be a natural merge sort algorithm with the middle-growth property that uses TimSort’s merging sub-routine. This algorithm requires O(n log(k)) element comparisons to sort arrays of length n with k distinct values. Note that, since we proved in Section 3.2 that TimSort has the middle-growth property, Theorem 2 will immediately follow from Proposition 11. V. Jugé and G. Khalighinejad 9

◮ Lemma 12. Let T be a merge tree induced by a natural merge sort algorithm on some array A of length n. Let R = {R1, R2,...,Rℓ} be a family of runs in T that are pairwise incomparable for the ancestor relation, i.e., such that no run Ri is the ancestor of another run Rj . It holds that r1 + r2 + . . . + rℓ 6 n.

Proof. The runs of T to which a given position in the array A belongs form a chain for the ancestor relation. Thus, the runs in R cover pairwise disjoint intervals of the array A. This completes the proof. ◭

Proof of Proposition 11. Let ε > 1 be the real number mentioned in the definition of the statement “A has the middle-growth property” such as described in Definition 5. Let A be an array of length n with k distinct values, and let T be the merge tree induced by A when sorting A. Then, for all h > 0, let Rh be the family of runs at height h in T . We associate each run R of height h > 1 with the number of element comparisons, noted c(R), performed by TimSort’s merging sub-routine during the merge that resulted in the run R. Proposition 3 states that c(R) = O(fk(r)), where the function fk is defined by fk : x 7→ min{x, k logε(x)}. We extend this notation and, for all h > 1, we define the number c(h) as the sum of the numbers fk(r) for those numbers R that belong to Rh. Observe that, if T is a tree of height h⋆, then the algorithm A must have performed O(c(1) + c(2) . . . + c(h⋆)) comparisons to sort A. It remains to prove that c(1) + c(2) . . . + c(h⋆)= O(n log(k)). We do so by distinguishing to cases:

⊲ If h 6 log (k +3), we simply have c(h) 6 r, and Lemma 12 proves that the latter ε PR∈Rh sum is not larger than n.

⊲ If h > logε(k + 3), by definition of ε, we know that every run R ∈ Rh is of length h r > ε > k +3 > 3. Since the function g : x 7→ logε(x)/x is decreasing the interval h h [e, +∞), it follows that fk(r) 6 k logε(r) = krg(r) 6 krg(ε )= krh/ε for all R ∈ Rh. Hence, Lemma 12 proves that c(h) 6 knh/εh.

Setting ℓ = logε(k + 3), we conclude that

⋆ h h kn h + ℓ h c(h) 6 ℓn + kn = ℓn + 6 ℓn + (ℓ + 1)n . X X εh k +3 X εh X εh h=1 h>ℓ h>0 h>0

h c c c ⋆ Since the series Ph>0 h/ε is convergent, this shows that (1) + (2) . . . + (h ) is O(ℓn), i.e., is O(n log(k)), thereby completing the proof. ◭

4 Fast- or middle-growth property in other natural merge sorts

Once equipped with Propositions 3 and 11, we can readily adapt our proof of Theorems 1 and 2 to a wide range of natural merge sort algorithms. Indeed, it suffices to replicate the results of Section 3.2 and to prove that these algorithms have the middle-growth property. In this section, we focus on the algorithms we mentioned in Section 1, thereby advocating for the use of TimSort’s merging sub-routine in all these algorithms. In each case, we identify the algorithm with its strategy for deciding which runs to merge, implicitly using TimSort’s merging sub-routine to perform these merges. th Below, and depending on the context, we may denote by Ri the i left-most run of an array A to be sorted, or the ith bottom-most run of a stack. In each case, we will make our convention explicit, and we will denote the length of the run Ri by ri. In addition, when 10 Galloping in natural merge sorts

Algorithm 3: α-MergeSort Input : Array A to sort, parameter α> 1 Result : The array A is sorted into a single run. That run remains on the stack. th Note : We denote the height of the stack S by h, and its i bottom-most run by Ri. The length of Ri is denoted by ri. When two consecutive runs of S are merged, they are replaced, in S, by the run resulting from the merge. 1 S ← an empty stack 2 while true : 3 if h > 3 and rh−2 2 and rh−1 < αrh : 6 merge the runs Rh−1 and Rh ⊲ case #2 7 else if h > 3 and rh−2 < αrh−1 : 8 merge the runs Rh−1 and Rh ⊲ case #3 9 else if the end of the array has not yet been reached : 10 find a new monotonic run R, make it non-decreasing, and push it onto S 11 else: 12 break 13 while h > 2 : 14 merge the runs Rh−1 and Rh

focusing on a given array R, we will denote by iR the ith ancestor of the run R, and we will denote the length of the run iR by ir.

4.1 Algorithms with the fast-growth property 4.1.1 The algorithm α-MergeSort

The algorithm α-MergeSort is presented in Algorithm 3. Our proof follows a path similar to the one used in Section 3.2: it relies on variants of Lemmas 7 to 10, and culminates with proving the following result.

◮ Proposition 13. Consider some real parameter α > 1. Let T be a merge sort generated by α-MergeSort, and let R and R be two runs in T . If R is the mth ancestor of R, then ⋆ ⋆ r > α r, where we set α = min{α, 1+1/α} and m = max{2, 4 −⌊logα(α − 1)⌋}.

First, Lemma 14 strengthens a result that already appears in the proof of [4, Theorem 14], but that contains only the inequalities (vi).

◮ Lemma 14. Let S = (R1, R2,...,Rh) be a stack obtained during an execution of α-MergeSort. We have (vi) ri > αri+1 whenever 1 6 i 6 h−3 and (vii) rh−2 > (α − 1)rh−1.

Proof. We prove, by a direct induction on the number of (push or merge) operations performed before obtaining the stack S, that S satisfies (vi) and (vii). First, (vi) and (vii) trivially hold if h 6 2, as is the case when the algorithm starts. Then, when a stack S = (R1, R2,...,Rh) obeying (vi) and (vii) is transformed into a stack

S = (R1, R2,..., Rh)

⊲ by inserting a run, then h = h + 1, and ri = ri > αri+1 > ri+1 for all i 6 h − 1= h − 2, which is actually even stronger than required; V. Jugé and G. Khalighinejad 11

⊲ by merging two runs, then h = h−1, and ri = ri > αri+1 > ri+1 for all i 6 h−4= h−3, so that S satisfies (vi), whereas S also satisfies (vii) because

> > > rh−2 = rh−3 αrh−2 = (α − 1)rh−2 + rh−2 (α − 1)rh−2 + (α − 1)rh−1 (α − 1)rh−1.

In both cases, S satisfies (vi) and (vii), which completes the induction and the proof. ◭

Lemmas 15 to 17 focus on inequalities involving the lengths of a given run R and its ancestors. In each case, we denote by S = (R1, R2,...,Rh) the stack at the moment where the run R is merged, and we will use this naming convention when proving these Lemmas.

◮ Lemma 15. If 1R is a right run, then 2r > α⋆r.

Proof. Let R• be the left sibling of the run 1R, and let i ∈{h−2,h−1,h} be the integer such that R = Ri. If i = h−2, then (vi) shows that ri−1 > αri. If i = h−1, then (vii) shows that 2 2 ri−1 > (α−1)ri. In both cases, the run Ri−1 descends from R, and thus r > r +ri−1 > αr. 2 Finally, if i = h, then R is a #3-right run, which means both that rh−2 > r and that • 2 Rh−2 descends from R . It follows that r > r + rh−2 > 2r > (1+1/α)r. ◭

1 2 ⋆ ◮ Lemma 16. If R is a #2-left run, then r > α r.

Proof. We treat three cases independently, depending on whether R is a #1- or #2-left run. In each case, we assume that 1R is a left run, unless what Lemma 15 proves that 2r > α⋆r.

1 ⊲ If R is a #1-left run, then R = Rh−2 is merged with Rh−1 and rh−2 < rh. Since R is a 2 2 left run, Rh descends from R, and thus r > r + rh > 2r > (1+1/α)r.

⊲ If R is a #2-left run, then R = Rh−1 is merged with Rh, r < αrh, and therefore 2 1 r > r = r + rh > (1+1/α)r. ◭

◮ Lemma 17. If all of R, 1R, 2R, ..., m−3R are #3-left runs, then m−3r > α⋆r.

Proof. Our proof is entirely similar to that of Lemma 10. For all i > 1, let iP and iQ be the two runs placed to the left of iR when iR is merged, and let u(i) be the largest integer i such that Ru(i) descends from R. 1 Since R is a #3-left run, we know that R = Rh−1 and that rh−2 < αrh−1 < α r. Thus, our next journey through the loop of lines 2 to 14 must trigger another merge, which must 1 1 be a #1-merge since R is a left run. This means that rh−3 < r and that Rh−3 descends from 1Q, so that u(1) 6 h − 4. Repeating the same argument over and over, we find that u(i) 6 h − 3 − i and that ip < i+1r whenever i 6 m − 4. Thus, we derive from the inequalities (vi) and (vii) that m−3 m−4 m−3 m−3 r> p > ru(m−4) > rh−m+1 > α × rh−2 > α (α − 1)r > αr. ◭

Proof of Proposition 13. We only need to distinguish three (non mutually exclusive) cases:

⊲ If a run iR is a right run for some integer i ∈ {1, 2,...,m − 2}, then Lemma 15 shows that mr > i+1r > α⋆ × i−1r > α⋆r. i 1 ⊲ If a run R is a #2-left run for some integer i ∈{1, 2,...,m − 2}, then Lemma 16 shows that mr > i+2r > α⋆ × ir > α⋆r. ⊲ If all runs iR such that i ∈{1, 2,...,m − 2} are #3-left runs, then Lemma 17 shows that mr > m−2r > α⋆ × 1r > α⋆r. ◭ 12 Galloping in natural merge sorts

4.1.2 The algorithm PeekSort

The algorithm PeekSort is not stack-based, and the description that its authors provide is top-down [9, Section 3.1] instead of bottom-up like Algorithms 1 and 3. From their description, we derive immediately the following characterisation of the merge trees induced by PeekSort, and which could be considered as an alternative (abstract, under-specified) definition.

◮ Definition 18. Let T be a merge sort generated by PeekSort, and let R1, R2,...,Rρ be its leaves, ordered from left to right. Let R be an internal node (i.e., run) of T , and let Rleft and Rright be its left and right children. Finally, let Ri be the rightmost leaf of T that descends from Rleft, and let Ri+1 be the leftmost leaf of T that descends from Rright. It holds that rright − ri+1 6 rleft 6 rright + ri.

◮ Proposition 19. Let T be a merge sort generated by PeekSort, and let R and R be two runs in T . If R is the 3th ancestor of R, then r > 2r.

Proof. Since left-to-right and right-to-left orientations play symmetric roles in Definition 18, we assume without loss of generality that a majority of the runs R, 1R and 2R are left runs. Let i < j be the two smallest integers such that iR and jR are left runs, and let iQ and j Q be their right siblings. The rightmost leaf of T that descends from jR also descends from iQ, and thus its length is at most iq. Hence, it follows from Definition 18 that j r 6 j q + iq, and thus 3r > j+1r = j r + j q > i+1r + j q = ir + iq + j q > ir + j r > 2r. ◭

4.1.3 The algorithm PowerSort

Like its sibling PeekSort, the algorithm PowerSort is not stack-based, and the description that its authors provide is top-down [9, Section 3.2]. This algorithm is based on the ad hoc notion of power of a run, which is defined as follows.

◮ Definition 20. Let T be a merge sort generated by PeekSort when sorting an array of length n (which spans positions numbered from 1 to n), and let R1, R2,...,Rρ be its leaves, ordered from left to right. Let R be an internal node (i.e., run) of T , and let Rleft and Rright be its left and right children. Finally, let Ri be the rightmost leaf of T that descends from Rleft, and let Ri+1 be the leftmost leaf of T that descends from Rright. The power of R is the largest integer ℓ such that there exists an integer k satisfying the ℓ inequalities 2(r1 + . . . + ri−1)+ ri < 2 kn 6 2(r1 + . . . + ri)+ ri+1.

Note that the power of a leaf is not defined. Based on the notion of power, we use now [9, Lemma 4] as our alternative (abstract, and quite under-specified) definition of PowerSort.

◮ Definition 21. Let T be a merge sort generated by PowerSort, and let R and R be two runs in T , with powers p and p. If R is the parent of R, then p > p + 1.

◮ Proposition 22. Let T be a merge sort generated by PowerSort, and let R and R be two runs in T . If R is the 4th ancestor of R, then r > 2r.

Proving Proposition 22 is slightly more complicated than proving Proposition 19. Our proof is based on the following result.

◮ Lemma 23. Let R be a run of power p. It holds that r 6 2p+1n and that 1r > 2pn. V. Jugé and G. Khalighinejad 13

Proof. In what follows, we always denote by sx the sum r1 + r2 + . . . + rx, and by tx the p number (2sx−1 + rx)/(2 n). Then, we prove the two desired inequalities separately. First, let Ri, Ri+1,...,Rj be the leaves of T that descend from T , and let Rℓ be the rightmost leaf that descends from the left child of R. By maximality of p in Definition 20, we know that ⌊tℓ⌋ < ⌊tℓ+1⌋ and that ⌊tℓ/2⌋ = ⌊tℓ+1/2⌋. This means that there exists an odd integer k such that ⌊tℓ⌋ = k − 1 and ⌊tℓ+1⌋ = k. Now, consider some integer u ∈{i,i +1,...,ℓ − 1}. Let R be the least common ancestor of Ru and Ru+1, and let p be its power. Since R descends from the left child of R, we know that p 6 p − 1. By maximality of p, this means that ⌊tu⌋ = ⌊tu+1⌋. This equality is valid whenever i 6 u 6 ℓ − 1, and thus ⌊ti⌋ = ⌊tℓ⌋ = k − 1. When u ∈ {ℓ,ℓ +1,...,j − 1}, the same reasoning proves that ⌊tℓ+1⌋ = ⌊tj⌋ = k. It follows that the run R has length

r = ri + ri+1 + . . . + rj = sj − si−1 6 (sj − si−1) + (ri+1 + ri+2 + . . . + rj−1) p p p+1 6 (2sj−1 + sj ) − (2si−1 + ri)=2 n(tj − ti) < 2 n((k + 1) − (k − 1)) = 2 n.

1 1 Second, let Rm be the rightmost leaf that descends from the left child of R. Since R is 1 a run of power p > p + 1, we know that ⌊tm/2⌋ < ⌊tm+1/2⌋, and therefore there exists an ′ ′ ′ even integer k such that tm < k < tm+1. Since k and k have opposite parities, they are ′ ′ distinct from each other, and thus tmin{m,u} < min{k, k } < max{k, k } 6 tmax{m,u}+1. By 1 construction, both runs Rmin{m,u} and Rmax{m,u}+1 descend from R. It follows that the run 1R has length

1 r > rmin{m,u} + . . . + rmax{m,u}+1 = smax{m,u}+1 − smin{m,u}−1

> (smax{m,u} + rmax{m,u}+1/2) − (smin{m,u}−1 + rmin{m,u}/2) p−1 p−1 > 2 n(tmax{m,u}+1 − tmin{m,u}) > 2 n. ◭

Proof of Proposition 22. Let 1p and p be the respective powers of the runs 1R (the parent 1 of R) and R. Since p > 1p + 3, Lemma 23 proves that r > 2p−1n > 2p +2n > 2r1 > 2r. ◭

4.1.4 The algorithm adaptive ShiversSort

The algorithm adaptive ShiversSort is presented in Algorithm 4. Our proof follows the same path as those used in Sections 3.2 and 4.1.1, and culminates with the following result.

◮ Proposition 24. Let T be a merge sort generated by adaptive ShiversSort, and let R and R be two runs in T . If R is the 6th ancestor of R, then r > 2r.

As indicated in Algorithm 4, adaptive ShiversSort is based on an ad hoc tool, which we call level of run : the level of run of length r is defined as the number ℓ = ⌊log2(r)⌋. In practice, and following the naming conventions set up at the beginning of Section 3, we ′ i ′ i denote by ℓ , ℓ and ℓi the respective levels of the runs R , R and Ri, and so on. Our proof is based on the following result, which was stated and proved in [6, Lemma 7].

◮ Lemma 25. Let S = (R1, R2,...,Rh) be a stack obtained during an execution of adaptive ShiversSort. We have (viii) ℓi > ℓi+1 +1 whenever 1 6 i 6 h − 3 and (ix) ℓh−3 > ℓh−1 +1.

Lemmas 26 and 27 focus on inequalities involving the lengths of a given run R and its ancestors. In each case, we denote by S = (R1, R2,...,Rh) the stack at the moment where the run R is merged, and we will use this naming convention when proving these Lemmas. 14 Galloping in natural merge sorts

Algorithm 4: adaptive ShiversSort Input : Array A to sort Result : The array A is sorted into a single run. That run remains on the stack. th Note : We denote the height of the stack S by h, and its i bottom-most run by Ri.

The length of Ri is denoted by ri, and we set ℓi = ⌊log2(ri)⌋. When two consecutive runs of S are merged, they are replaced, in S, by the run resulting from the merge. 1 S ← an empty stack 2 while true : 3 if h > 3 and ℓh−2 6 max{ℓh−1,ℓh} : 4 merge the runs Rh−2 and Rh−1 5 else if the end of the array has not yet been reached : 6 find a new monotonic run R, make it non-decreasing, and push it onto S 7 else: 8 break 9 while h > 2 : 10 merge the runs Rh−1 and Rh

◮ Lemma 26. If 1R is a right run, then 2ℓ > ℓ +1.

1 Proof. The run R coincides with either Rh−2 or Rh−1, and thus R is the parent of the runs 1 2 Rh−2 and Rh−1. Hence, the run Rh−3 is the left sibling of R, and descends from R. Thus, 2 it follows from inequalities (viii) and (xi) that ℓ > ℓh−3 > max{ℓh−2,ℓh−1} +1 > ℓ + 1. ◭

◮ Lemma 27. If the runs R and 1R are left runs, then 2ℓ > ℓ +1.

Proof. Since R is a left run, it coincides with Rh−2, and ℓh−2 6 max{ℓh−1,ℓh}. Then, since 1 2 2 R is a left run, the run Rh descends from R. It follows that r > r + max{rh−1, rh} > 2ℓ +2max{ℓh−1,ℓh} > 2ℓ+1, and thus that 2ℓ > ℓ + 1. ◭

◮ Lemma 28. We have 3ℓ > ℓ +1.

Proof. We only need to distinguish three (non mutually exclusive) cases:

⊲ If a run iR is a right run for some integer i ∈ {1, 2}, then Lemma 26 shows that ℓ 3ℓ > i+1 > i−1ℓ +1 > ℓ + 1. ⊲ If both 1R and 2R are left runs, then Lemma 27 shows that 3ℓ > 1ℓ +1 > ℓ + 1. ◭

Proof of Proposition 24. Lemma 28 states that ℓ = 6ℓ > 3ℓ +1 > ℓ + 2, and therefore r > 2ℓ > 2 × 2ℓ+1 > 2r. ◭

4.2 Algorithms with the middle-growth property 4.2.1 The algorithm NaturalMergeSort The algorithm NaturalMergeSort consists in a plain binary merge sort, whose unit slices of data to be merged are runs instead of being single elements. Thus, we identify NaturalMergeSort with the fundamental property that describes those merge trees it induces.

◮ Definition 29. Let T be a merge sort generated by NaturalMergeSort, and let R and R be two runs that are siblings of each other in T . Denoting by n and n the number of leaves of T that respectively descend from R and from R, we have n − 1 6 n 6 n + 1. V. Jugé and G. Khalighinejad 15

◮ Proposition 30. Let T be a merge sort generated by NaturalMergeSort, let R be a run in T , and let h be its height. We have r > 2h. Proof. Let us prove the following stronger statement by induction on h: every run R of height h is an ancestor of at least 2h + 1. First, this is the case if h = 0. Then, if h > 1, let R1 and R2 be the two children of R. One of them, say R2, has height h − 1. Let us denote by n, n1 and n2 the number of leaves that respectively descend from R, R1 and R2. The induction hypothesis shows that h−1 h n = n1 + n2 > 2n1 − 1 > 2 × (2 + 1) − 1=2 − 1, which completes the proof. ◭

4.2.2 The algorithm ShiversSort The algorithm ShiversSort is presented in Algorithm 5. Like its variant adaptive ShiversSort, it relies on the notion of level of a run. Below, we prove the following result. ◮ Proposition 31. Let T be a merge sort generated by ShiversSort, let R be a run in T , and let h be its height. We have r > 2h/4.

Algorithm 5: ShiversSort Input : Array A to sort Result : The array A is sorted into a single run. That run remains on the stack. th Note : We denote the height of the stack S by h, and its i bottom-most run by Ri.

The length of Ri is denoted by ri, and we set ℓi = ⌊log2(ri)⌋. When two consecutive runs of S are merged, they are replaced, in S, by the run resulting from the merge. 1 S ← an empty stack 2 while true : 3 if h > 1 and ℓh−1 6 ℓh : 4 merge the runs Rh−1 and Rh 5 else if the end of the array has not yet been reached : 6 find a new monotonic run R, make it non-decreasing, and push it onto S 7 else: 8 break 9 while h > 2 : 10 merge the runs Rh−1 and Rh

Our proof is based on the following result, which appears in the proof of [4, Theorem 11].

◮ Lemma 32. Let S = (R1, R2,...,Rh) be a stack obtained during an execution of ShiversSort. We have (x) ℓi > ℓi+1 +1 whenever 1 6 i 6 h − 2. Like when the other stack-based algorithms α-MergeSort and adaptive ShiversSort, Lemmas 26 and 27 focus on inequalities involving the lengths of a given run R and its ancestors. In each case, we denote by S = (R1, R2,...,Rh) the stack at the moment where the run R is merged, and we will use this naming convention when proving these Lemmas. ◮ Lemma 33. Let R be a run in T , and let k and m be two integers. If k of the ancestors ancestors 1R, 2R,..., mR of R are right runs, then m+1ℓ > k − 1. Proof. Let a(1) u(1) >u(2) >...>u(k), m+1 it follows from (x) that ℓ > ℓu(k) > ℓh−k > ℓh−1 + (k − 1) > k − 1. ◭ 16 Galloping in natural merge sorts

Algorithm 6: α-StackSort Input : Array A to sort, parameter α> 1 Result : The array A is sorted into a single run. That run remains on the stack. th Note : We denote the height of the stack S by h, and its i bottom-most run by Ri. The length of Ri is denoted by ri. When two consecutive runs of S are merged, they are replaced, in S, by the run resulting from the merge. 1 S ← an empty stack 2 while true : 3 if h > 2 and rh−1 6 αrh : 4 merge the runs Rh−1 and Rh 5 else if the end of the array has not yet been reached : 6 find a new monotonic run R, make it non-decreasing, and push it onto S 7 else: 8 break 9 while h > 2 : 10 merge the runs Rh−1 and Rh

◮ Lemma 34. Let R be a run in T . If R is a left run, then 1ℓ > ℓ +1.

Proof. Since R is a left run, it coincides with Rh−1, and ℓh−1 6 ℓh. It follows that 1 ℓh−1 ℓh ℓ+1 1 r = rh−1 + rh > 2 +2 > 2 , and thus ℓ > ℓ + 1. ◭

◮ Lemma 35. Let i be an integer, and let R and R be two runs in T , such that let R is the ith ancestor of R. We have ℓ > (i − 1)/2.

Proof. We distinguish two cases:

⊲ If at least i/2 + 1 of the runs R, 1R, 2R,..., i−1R are right runs, then Lemma 33 shows that ℓ > i/2. ⊲ If at least i/2 − 1 of the runs R, 1R, 2R,..., i−1R are left runs, let k = ⌈i/2⌉− 1, and let a(1) ℓ + j for all j 6 k. It follows that that ℓ > a(k)+1ℓ > ℓ + k > k > (i − 1)/2. ◭

Proof of Proposition 31. If R is of height h 6 1, the result is immediate. Hence, we assume that h > 2. Let R• be a run T whose hth ancestor is the run R. Lemma 35 proves that ℓ > (h − 1)/2 > h/4, and therefore r > 2ℓ > 2h/4. ◭

4.2.3 The algorithm α-StackSort

The algorithm α-StackSort, which predated and inspired its variant α-MergeSort, is presented in Algorithm 6.

◮ Proposition 36. Consider some real parameter α > 1. Let T be a merge sort generated by ShiversSort, let R be a run in T , and let h be its height. We have r > min{α, 1+1/α}h/4.

We begin with the following result, which appears in [2, Lemma 2].

◮ Lemma 37. Let S = (R1, R2,...,Rh) be a stack obtained during an execution of α-StackSort. We have ri > αri+1 whenever 1 6 i 6 h − 2. V. Jugé and G. Khalighinejad 17

Based on that result, our proof follows the tracks of that of Section 4.2.2. Lemmas 38 and 39 focus on inequalities involving the lengths of a given run R and its ancestors. In each case, we denote by S = (R1, R2,...,Rh) the stack at the moment where the run R is merged, and we will use this naming convention when proving these Lemmas.

◮ Lemma 38. Let R be a run in T , and let k and m be two integers. If k of the ancestors ancestors 1R, 2R,..., mR of R are right runs, then m+1r > αk−1.

Proof. Let a(1) u(1) >u(2) >...>u(k), m+1 h−1−u(k) k−1 k−1 it follows from (x) that r > ru(k) > α × rh−1 > α × rh−1 > α . ◭

◮ Lemma 39. Let R be a run in T . If R is a left run, then 1r > (1+1/α)r.

Proof. Since R is a left run, it coincides with Rh−1, and rh−1 6 αrh. It follows immediately 1 that r = rh−1 + rh > (1+1/α)r. ◭

◮ Lemma 40. Let i be an integer, and let R and R be two runs in T , such that let R is the ith ancestor of R. We have r > min{α, 1+1/α}(i−1)/2.

Proof. We distinguish two cases:

⊲ If at least i/2 + 1 of the runs R, 1R, 2R,..., i−1R are right runs, then Lemma 38 shows that r > αi/2. ⊲ If at least i/2 − 1 of the runs R, 1R, 2R,..., i−1R are left runs, let k = ⌈i/2⌉− 1, and let a(1) (1+1/α)j ×r for all j 6 k, and thus r > a(k)+1r > (1+1/α)k × r > (1+1/α)k > (1+1/α)(i−1)/2. ◭

Proof of Proposition 36. If R is of height h 6 1, and since 2 > 1+1/α, the result is immediate. Hence, we assume that h > 2. Let R• be a run T whose hth ancestor is the run R. Lemma 35 proves that r > min{α, 1+1/α}(h−1)/2 > min{α, 1+1/α}h/4. ◭

5 Concluding remarks and open questions

In the above, we have introduced a new measure of complexity of arrays, based on the number and lengths of dual runs which an input array consists of. Then, due to the new notions of fast- and middle-growth of natural merge sort algorithms, we have proved that numerous sorting algorithms could sort arrays of length n and σ dual runs in O(n log(σ)) element comparisons, provided that they use TimSort’s merging sub-routine. In subsequent versions of this article or in future work, we plan to address the following questions:

1. We will improve our O(n log(σ)) upper bound to the optimal O(nH⋆). 2. We will reprove that every algorithm with the fast-growth property sorts arrays of length n with runs of length r1, r2,...,rρ in time O(nH), regardless of whether the algorithm uses TimSort’s merging sub-routine or its naive variant. 3. A next step would be to study the effects of the procedure used in TimSort’s merging sub-routine to update the parameter t mentioned in Algorithm 2, and which we decided to forget in this paper. In particular, we do not yet know whether the O(n log(σ)) or O(nH⋆) upper bounds still hold. 18 Galloping in natural merge sorts

4. It was proved in [6] that the constant hidden in the O(nH) upper bound on the number of comparisons performed by adaptive ShiversSort with the naive merging sub-routine is 1, which is optimal. This does not hold with our current version of TimSort’s merging sub-routine, where the parameter t is fixed. Does this hold again with TimSort’s actual sub-routine? And what about the constant in the O(nH⋆) upper bound?

References 1 Nicolas Auger, Vincent Jugé, Cyril Nicaud, and Carine Pivoteau. On the worst-case complexity of Timsort. In 26th Annual European Symposium on Algorithms (ESA 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2018. Extended version available at: https://arxiv.org/abs/1805.08612. 2 Nicolas Auger, Cyril Nicaud, and Carine Pivoteau. Merge strategies: From Merge Sort to TimSort. Research Report hal-01212839, hal, 2015. 3 Jérémy Barbay and Gonzalo Navarro. On compressing permutations and adaptive sorting. Theor. Comput. Sci., 513:109–123, 2013. 4 Sam Buss and Alexander Knop. Strategies for stable merge sorting. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1272–1290. Society for Industrial and Applied Mathematics, 2019. 5 Vladmir Estivill-Castro and Derick Wood. A survey of adaptive sorting algorithms. ACM Computing Surveys, vol. 24, issue 4, pages 441-476, 1992. 6 Vincent Jugé. Adaptive Shivers Sort: An Alternative Sorting Algorithm. In In Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms, pages 1639–1654. SIAM, 2020. 7 . The Art of Computer Programming, Volume 3: (2nd Ed.) Sorting and Searching. Addison Wesley Longman Publish. Co., Redwood City, CA, USA, 1998. 8 Heikki Mannila. Measures of presortedness and optimal sorting algorithms. IEEE Trans. Computers, 34(4):318–325, 1985. 9 J. Ian Munro and Sebastian Wild. Nearly-optimal mergesorts: Fast, practical sorting methods that optimally adapt to existing runs. In 26th Annual European Symposium on Algorithms (ESA 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, pages 63:1– 63:15, 2018. 10 Tim Peters. Timsort description. https://github.com/python/cpython/blob/master/Objects/listsort.txt. 11 Olin Shivers. A simple and efficient natural merge sort. Technical report, Georgia Institute of Technology, 2002.