arXiv:2004.02375v2 [math.CO] 2 Feb 2021 rva oe on f(1 of bound lower trivial esyta permutation a that say We Introduction 1 npr ySS rjc 743adNFAadDMS-1953990. Award NSF and 178493 project SNSF by part in ieetsbeune adteeoea otti aypatter many this most at therefore (and subsequences different oeerywr ntecase the in work early some n bevdta any that observed and oinb rai n19 5.Artarie ehp h otfundam most the perhaps raised a Arratia can [5]. short 1999 how in Arratia by notion usto indices of subset on rssfo h rva inequality trivial the from arises bound h ok 1,6,tesres[1 2 4,adTenner’s and 24], 22, [21, become surveys has the patterns 6], permutation [12, of books subject the the [14], Knuth to later order-isomorphic is that subsequence ny(1 only osrcindet ilr[6 civsa Eriksso achieves Eriksson, [16] by Miller work to earlier Following due construction bound. upper the to made lgtyt ( to slightly rai’ ojcue ht(1 that conjecture) Arratia’s ut aua.Gnrlysekn,amteaia tutr ssaid is structure mathematical a speaking, Generally natural. quite setal pia.A ipeeape o n oiieintegers positive any for example, bound simple these a and bound As bound, lower optimal. superpattern trivial essentially Arratia’s are ha to way to there similar universality, seems very of concept notions (this different sense many specified some in substructures, sequence ∗ † ‡ fapermutation a If rai’ ojcuehsrmie pnfrtels wnyyears twenty last the for open remained has conjecture Arratia’s rai’ ojcuemya rtse iteumtvtd u whe but unmotivated, little a seem first at may conjecture Arratia’s ascuet nttt fTcnlg,Cmrde A0214 MA Cambridge, Technology, of Institute Massachusetts eateto ahmtc,Safr nvriy Stanford University, Stanford Mathematics, of Department ascuet nttt fTcnlg,Cmrde A0213 MA Cambridge, Technology, of Institute Massachusetts ueptenol if only superpattern usqec of subsequence hsb einn necetecdn ceefrtepattern the for scheme encoding efficient an bo designing trivial by the this improve we conjecture, Arratia’s Disproving ne n atro rbe ocrig( concerning problem a probl on universality-type Vatter other and to Engen applicable is and flexible /e permutation A oe onsfrspratrsaduieslsequences universal and superpatterns for bounds Lower 2 ihteeprmtr,wihi tigoe size- a over string a is which parameters, these with + k 2 o (1)) 1) + k t ahr Chroman Zachary k 1 spratr e egv eysml osrcino a of construction simple very a gave He be? -superpattern / 2 yEgnadVte 8.Atal,teatoso 9 aeaconjec a made [9] of authors the Actually, [8]. Vatter and Engen by 2 σ σ . < otisall contains hti re-smrhcto order-isomorphic is that · · · σ k /e n -superpattern ∈ 2 ≥ t < k + S / (1 σ ySmo n cmd 2] uepten eefis nrdcda introduced first were superpatterns [20], Schmidt and Simion by 3 = n + 2 o k /e (1)) ssi obe to said is ∈ uhthat such 2 o S k + (1)) k n atrsin patterns ! o 2 (1)) contains antb mrvd enn htteeare there that meaning improved, be cannot k ∗ σ 2 k π stetu iiu egho a of length minimum true the is n k ∈ 2 k .Satn ihsm onainlwr yMcao 1]and [15] MacMahon by work foundational some with Starting ). σ n rai ojcue htti oe on sbest-poss is bound lower this that conjectured Arratia and , spratr flnt ( length of -superpattern k ( S ≥ -universal t n i ) permutation a k ate Kwan Matthew k uthv egha least at length have must S π σ < ,wihhlsbcueapermutation a because holds which !, )aysqecswihcnanall contain which sequences 1)-ary + Abstract k ipecutn ruetsosthat shows argument counting simple A . ti adt be to said is it , ( A935 Email: 94305. CA , 1 t j ra or fadol if only and if ) .Email: 2. .Email: 9. aaaeo emtto atr Avoidance Pattern Permutation of Database m;freape eas mrv on by bound a improve also we example, for ems; n yasalcntn atr eaccomplish We factor. constant small a by und k -superpattern q π s flength of ns) htapa in appear that s lhbtcnann vr osbelength- possible every containing alphabet zacchro@mit mihirs@mit ebe rtcniee yRd 1].For [18]). Rado by considered first been ve ∈ k † k ut atoe h er;sefrexample for see years; the over vast quite r eyotnkono upce obe to suspected or known often very are s S -universal and obe to rsn rmcutn ruet na in arguments counting from arising s k k u aiu mrvmnshv been have improvements various but , mattkwan@stanford 2 ,Lnso n atud[] clever a [9], Wastlund and Linusson n, sapattern a as π + q na usinaotsuperpatterns: about question ental ( hr skont xs a exist to known is there , i universal lcdi ie otx,i is it context, wider a in placed n k . ii Singhal Mihir ffrevery for if edu ) k n . ) edu -superpattern. / k π < ≥ ,adti a ae improved later was this and 2, rai ojcue htthis that conjectured Arratia . . ra or . σ k (1 hsapoc squite is approach This . ( spratr flength of -superpattern j /e ta st say, to is (that ) k k -superpattern fi otisalpossible all contains it if k spratrso length of -superpatterns 2 hnteeeit some exists there when -permutations. π + . edu ∈ σ o ue(contradicting ture (1)) σ S ∈ eerhsupported Research . ‡ k a ea be can hr sa is there , S k n 2 hslower This . a only has Following . general a s ible. eBruijn de σ k - [23]. a a has k n k 2 k , string exactly once. A much less simple example is the case of universality for finite graphs, a subject that has received a large amount of attention over recent years due in part to applications in computer science (see for example [4]). A graph is said to be k-induced-universal if it contains an induced copy of all possible k (2) n graphs on k vertices. There are at least 2 /k! graphs on k vertices, and an n-vertex graph has at most k induced subgraphs with k vertices, so a straightfoward computation shows that every k-induced-universal graph must have at least (1 + o(1))2(k−1)/2 vertices (this was first observed by Moon [17]). In a recent breakthrough, Alon [2] proved that this trivial lower bound is essentially optimal, by showing that a random graph with (1 + o(1))2(k−1)/2 vertices typically contains almost all k-vertex graphs as induced subgraphs, and then proving that the remaining “exceptional” graphs can be handled separately without adding too many vertices. It is very tempting to try to adapt Alon’s proof strategy to permutations, to prove Arratia’s conjecture. First, one might try to prove that a random permutation of length n = (1/e2 + o(1))k2 typically contains n almost all permutations of length k (one imagines that the k subsequences of a random permutation tend to yield mostly distinct patterns, so the trivial counting bound should not be too far from the truth). Second, one hopes to be able to deal with the few remaining permutations in some more ad-hoc way, without substantially increasing the length of the permutation. In this paper we prove that not only is Arratia’s conjecture false, but even the first of these two steps fails. 2 2 Theorem 1.1. Suppose that n< (1.000076/e )k . Then every σ Sn contains only o(k!) different patterns π S . In particular, there is no k-superpattern of length less than∈ (1.000076/e2)k2. ∈ k We have made some effort to optimize the constant in Theorem 1.1, where it would not negatively affect the readability of the proof. However, it would be quite complicated to fully squeeze the utmost out of our proof idea (see Section 6 for further discussion). In any case, obtaining a constant that is substantially larger than 1/e2 seems quite out of reach. To very briefly describe the approach in our proof of Theorem 1.1, it is instructive to reinterpret Arratia’s trivial bound in an “information-theoretic” way. Namely, for each pattern π Sk that appears in σ, note that σ gives a way to encode π by a set of indices t < < t . Since there∈ are at most n possible 1 ··· k k outcomes of this encoding, there are at most n different patterns that we could have encoded, yielding k Arratia’s bound. The key observation towards improving this bound is that the aforementioned encoding is often slightly wasteful. For example, suppose that we are encoding a pattern π that appears in σ at indices t < < t , and suppose that t t − > k for some i. Now, having specified all the indices except t , 1 ··· k i+1 − i 1 i there are more than k possibilities for ti. This means it would be “cheaper” simply to specify the relative value π(i) itself, rather than to specify the index ti (note that there are only k choices for this relative value, and π(i) together with t1,...,ti−1,ti+1,...,tk still fully determine π). More generally, we can design an encoding scheme that decides for each i whether to specify the index ti or the relative value π(i) depending 2 on the size of t t − . Note that the average value of t t − is about 2k/e (which is substantially i+1 − i 1 i+1 − i 1 less than k), so for most i we will end up just specifying ti, as in the trivial encoding scheme. Our small gain comes from the fact that for almost all choices of t1,...,tk, there are a small number of i for which the difference t t − is much larger than average (if the t ,...,t are chosen randomly, then the consecutive i+1 − i 1 1 k differences ti+1 ti are well-approximated by independent exponential random variables). The details of the proof are in Sections− 2 and 3. At first glance the approach sketched above may seem to be an extremely general way to improve trivial counting-based lower bounds for basically any universality-type problem. However, for many interesting problems this approach does not seem to give us even tiny (lower-order) improvements. We discuss this in the concluding remarks (Section 6). We were, however, able to find another problem closely related to Arratia’s conjecture, for which our approach is useful. To describe this, we generalize to the setting where σ is a sequence of (not necessarily distinct) elements of [r] := 1,...,r . We define containment of a pattern identically: if σ [r]n is a sequence and π S is a pattern,{ then} σ contains π if there exist indices ∈ ∈ k t1 < < tk such that σ(ti) < σ(tj ) if and only if π(i) < π(j). Again, σ is said to be k-universal if it contains··· all k-patterns. Note that any k-universal sequence yields a k-superpattern of the same length, since it is possible to find a permutation with the same relative ordering (breaking ties arbitrarily). Engen and Vatter, in their survey paper [8], considered the question of finding the shortest possible
2 k-universal sequence in the cases r = k, k +1. In the case r = k, it is known that the answer must be (1 + o(1))k2, as proved by Kleitman and Kwiatkowski [13]. The case r = k + 1 is of particular interest since Miller’s construction of a k-superpattern of length n = (k2 +k)/2 actually comes from a k-universal sequence in [k + 1]n. Meanwhile, until now the best lower bound was the trivial one obtained as follows. Writing am for the number of occurrences of each m [r] in σ, the number of subsequences of σ with elements ∈ k m1,m2, ,mk is equal to am1 am2 ...amk . This expression is at most (n/k) by convexity. Thus, the total ··· r k number of subsequences that don’t contain repeated elements is at most k (n/k) . This must be at least k! for σ to be a k-universal sequence, so as long as r = (1+ o(1))k we must have n (1/e + o(1))k2. We improve upon this trivial bound as follows. ≥ Theorem 1.2. Suppose that n< (1 + e−600)k2/e. Let σ [(1 + e−1000)k]n be a sequence of length n. Then, ∈ σ can only contain o(k!) patterns in Sk.
The proof of Theorem 1.2 appears in Section 4. We made no effort to optimize the constants.
2 Lower-bounding the length of a superpattern
In this section we will prove Theorem 1.1, modulo a simple probabilistic lemma that will be proved in the next section. We assume, as we may, that k is odd, and we will generally omit floor symbols in quantities that may not be integers. 2 2 Let σ Sn be a permutation of length n< 1.000076k /e . Our objective is to prove that σ contains o(k!) patterns.∈ As outlined in the introduction, the idea is to define an efficient encoding; namely, an injective function φ from the set of all patterns in σ into a set of size o(k!). Our encoding is always going to specify the positions (in σ) of the odd-indexed entries of π, but for for the even indexed-entries, we are going to encode either the position or the relative value, depending on which of the two “requires less information” (unless this encoding scheme does not provide significant savings, in which case we just encode the positions of all entries of π). To describe our encoding more formally, let π be a pattern of length k that is contained in σ. Fix a set of indices T = t ,...,t , where t < < t , such that the values of σ on t ,...,t form the pattern π. { 1 k} 1 ··· k 1 k For even 1