arXiv:2008.08417v2 [cs.DS] 3 Nov 2020
Modular Subset Sum, Dynamic Strings, and Zero-Sum Sets

Jean Cardinal∗ John Iacono†

∗Université libre de Bruxelles. Supported by the Fonds de la Recherche Scientifique-FNRS under CDR Grant J.0146.18.
†Université libre de Bruxelles and New York University. Supported by NSF grants CCF-1533564 and by the Fonds de la Recherche Scientifique-FNRS under Grant n° MISU F 6001 1.

Abstract

The modular subset sum problem consists of deciding, given a modulus $m$, a multiset $S$ of $n$ integers in $0, \ldots, m-1$, and a target integer $t$, whether there exists a subset of $S$ with elements summing to $t \pmod{m}$, and to report such a set if it exists. We give a simple algorithm for the modular subset sum problem that runs in $O(m \log m)$ time with high probability (w.h.p.). This builds on and improves a previous $O(m \log^7 m)$ w.h.p. algorithm from Axiotis, Backurs, Jin, Tzamos, and Wu (SODA 19). Our method utilizes the ADT of the dynamic strings structure of Gawrychowski et al. (SODA 18). However, as this structure is rather complicated, we present a much simpler alternative which we call the Data Dependent Tree. As an application, we consider the computational version of a fundamental theorem in zero-sum Ramsey theory. The Erdős-Ginzburg-Ziv Theorem states that a multiset of $2n-1$ integers always contains a subset of cardinality exactly $n$ whose values sum to a multiple of $n$. We give an algorithm for finding such a subset in time $O(n \log n)$ w.h.p., which improves on an $O(n^2)$ algorithm due to Del Lungo, Marini, and Mori (Disc. Math. 09).

1 Introduction

In SODA 2019 [ABJ+19], Axiotis, Backurs, Jin, Tzamos, and Wu gave an algorithm for modular subset sum with runtime $O(m \log^7 m)$ that returns the correct answer with high probability. This improved upon an earlier $\tilde{O}(m^{5/4})$ algorithm of Koiliaris and Xu [KX19], which first appeared in SODA 2017.

In §2 we improve upon this with a very simple algorithm running in $O(m \log m)$ time with high probability (w.h.p.). Our method is a straightforward implementation of the naïve dynamic programming approach, sped up using a recent data structure of Gawrychowski, Karczmarz, Kociumaka, Łącki, and Sankowski [GKK+18] that supports a number of operations on a persistent collection of strings, notably split and concatenate, as well as finding the longest common prefix (LCP) of two strings, all in at most logarithmic time, with the bounds for updates holding with high probability. We inform the reader that the independent work of Axiotis et al. [ABB+20], which contains many of the same results and ideas for modular subset sum as presented here, has been accepted to appear along with this paper in the proceedings of the SIAM Symposium on Simplicity in Algorithms (SOSA21).

As the dynamic string data structure of [GKK+18] is quite complex, we provide in §4 a new and far simpler alternative, which we call the Data Dependent Tree (DDT) structure. Our general approach is to create a tree with the string stored in the leaves, where the shape of the tree is a function of the data in the leaves and a random seed; in common with other simple string algorithms, we use a hash function to compute fingerprints of strings, a method pioneered by Karp and Rabin [KR87]. The result is a structure with almost identical runtimes to [GKK+18] but which is only slightly more complex than a skip list and is easy to visualize (see Figure 1) and reason about. We say almost identical because [GKK+18] supports LCP queries in constant time whereas we do so in logarithmic time; this makes no overall difference in applications such as ours, where the number of LCP queries does not asymptotically dominate the number of update operations, which take logarithmic time in both structures. None of the structures for this problem that predate [GKK+18] (Sundar and Tarjan [ST94], Mehlhorn, Sundar and Uhrig [MSU97], and Alstrup, Brodal and Rauhe [ABR00]) match the DDT's logarithmic time for all operations.

As an application, we consider the computational version of a fundamental theorem in zero-sum Ramsey theory (see [Car96, Bia93, GG06] for surveys). The Erdős-Ginzburg-Ziv Theorem [EZ61] states that a multiset of $2n-1$ integers always contains a subset of cardinality exactly $n$ whose values sum to a multiple of $n$. We give an algorithm in §3 for finding such a subset in time $O(n \log n)$ w.h.p., which improves on an $O(n^2)$ algorithm due to Del Lungo, Marini, and Mori [LMM09].

2 Modular subset sum via dynamic strings

The input to the modular subset sum problem is a positive integer modulus $m$, a multiset $S = s_1, s_2, \ldots, s_n$ of $n$ elements of $\mathbb{Z}_m$, and a target value $t$ in $\mathbb{Z}_m$. The multiset thus has at most $m$ distinct items, and multiplicities are represented in compact form so that $S$ takes space $O(m)$. The problem is to decide whether there exists a subset of $S$ whose sum of elements is congruent to $t \pmod{m}$. Our solution, in common with [ABJ+19], solves the problem for all values of $t$ in $\mathbb{Z}_m$ simultaneously.

Solution overview. Our solution is based on the classic dynamic programming approach, which we now describe. We use the notation $S + x := \{y + x \pmod{m} \mid y \in S\}$ and $[n] = \{1, 2, \ldots, n\}$. Let $S_i \subseteq \mathbb{Z}_m$ be the set of residues of all the sums of subsets of the first $i$ numbers of $S$:

$$S_i := \left\{ \sum_{j \in X} s_j \pmod{m} \;\middle|\; X \subseteq [i] \right\}$$

Given this definition, the problem is simply to determine if $t \in S_n$. We can construct $S_i$ recursively as follows:

$$S_i = \begin{cases} \{0\} & \text{if } i = 0 \\ S_{i-1} \cup (S_{i-1} + s_i) & \text{otherwise.} \end{cases}$$

If we wish to obtain the actual subset that adds to a given target $j$, call it $T_j$, then, as is typical for dynamic programming, another table is needed to record the choices made. Here it is sufficient to record for each target $j$ the index of the $S_i$ where it was first realized, which we call $a_j$:

$$a_j := \begin{cases} \min_{j \in S_i} i & \text{if } j \in S_n \\ \infty & \text{otherwise.} \end{cases}$$

Given the $a_j$s, $T_j$ may be easily computed with at most $m-1$ recursions:

$$T_j = \begin{cases} \emptyset & \text{if } j = 0 \\ \{s_{a_j}\} \cup T_{j - s_{a_j} \pmod{m}} & \text{if } j > 0 \text{ and } a_j \neq \infty \\ \text{No Subset} & \text{otherwise.} \end{cases} \tag{2.1}$$

A string encoding. We encode the set $S_i$ in a binary string $\sigma_i \in \{0,1\}^m$, such that $j \in S_i$ if and only if the $j$th bit of $\sigma_i$, $\sigma_i[j]$, is 1. The recurrence restated for strings is:

$$\sigma_i = \begin{cases} 1\,\overbrace{00 \cdots 0}^{m-1 \text{ times}} & \text{if } i = 0 \\ \sigma_{i-1} \vee (\sigma_{i-1} \gg s_i) & \text{otherwise.} \end{cases}$$

Theorem 2.1. [GKK+18] There exists a data structure for maintaining a collection of strings and supporting the following operations, which we call the dynamic strings ADT:
• $D \leftarrow$ New($x$): Creates a new string containing a single character, $x$.
• $b \leftarrow D$.Equal($D'$): Returns whether $D$ and $D'$ are equal.
• $\ell \leftarrow D$.Lcp($D'$): Returns the length of the longest common prefix of $D$ and $D'$.
• $D$.Set($i, x$): Sets the $i$th character of the string to $x$.
• $x \leftarrow D$.Get($i$): Returns the $i$th character of the string.
• $D' \leftarrow D$.Split($i$): The string beyond the $i$th character is removed from this structure and is placed in a new one, which is returned.
• $D$.Concatenate($D'$): Concatenates the string represented by $D'$ to the end of $D$; $D'$ becomes invalid.
The first three operations are constant time, the rest take time $O(\log n)$ where $n$ is the total size of strings stored, and the runtimes of update operations are with high probability.

Now observe that we can implement the circular shift operation $\gg$ using one split and one concatenate operation, in time $O(\log m)$. Let $\sigma'_{i-1} := (\sigma_{i-1} \gg s_i)$. Finally, the bitwise disjunction $\sigma_{i-1} \vee \sigma'_{i-1}$ is implemented as follows: First find the index of the first bit that differs, which is the length of the LCP incremented by one. Then change this bit in one of the two strings so that they both have a 1 at this position. Then iterate until both strings are identical.

There is a twist, however: after performing the splits and concatenates that shift $\sigma_{i-1}$ to obtain $\sigma_{i-1} \gg s_i$, we still need the data structure for $\sigma_{i-1}$ to compute $\sigma_{i-1} \vee (\sigma_{i-1} \gg s_i)$, and we cannot afford to make a copy. Fortunately, the technique of persistence exists exactly for this purpose. In its simplest form, known as partial persistence, read-only access to previous versions of a data structure is supported. The dynamic string structure of [GKK+18] supports persistent access to the strings, and our alternative DDT structure can be made partially persistent via the general transformation of [DSST89]. To summarize:

Theorem 2.2. Given an integer modulus $m > 0$, a target $t$, and a compact representation of a multiset $S$ of integers in $\mathbb{Z}_m$, the following algorithm determines if there is a subset of $S$ that sums to $t$ in time $O(m \log m)$ with high probability, and reports it:

1.