arXiv:2107.14527v1 [cs.DS] 30 Jul 2021 muto eoy infiatysalrta hti eddt needed [ Szegedy is and what Matias, Alon, than of work smaller the with d significantly Starting large memory, processing of for amount algorithms are algorithms Streaming Introduction 1 o fatninfo ayaesi optrsine uha such science, computer in processing. areas language many natural from attention of lot a for ucinmgtcuttenme fdsic lmnsi h s the in 1.1. elements Definition distinct of number the count might function napoiainfrtevleo oe(rdfie)real-val (predefined) to some referred sometimes of are value algorithms streaming the Such for approximation an temn loih o h function the for algorithm streaming ftefloighlsfreeysequence every for holds following the if hr h rbblt stknoe h on falgorithm of coins the over taken is probability the where h nu stream input the ahrsadr.Nt htti ento sue httei the that assumes definition this that Note standard. rather rmwr o desra temn i ieeta Priv Differential via Streaming Adversarial for Framework A nti okw ou nsraigagrtm htamt main to aim that algorithms streaming on focus we work this In hl ento . scranyntteol osbedefini possible only the not certainly is 1.1 Definition While m rva a)it igehbi rmwr htgisfo ohappr both from gains streams. that turnstile framework for hybrid the combine single performances We a fram weaknesses. into and suggested way) strengths recently trivial own These its with (2021). each Zhou ideas, and fram Woodruff suggested by recently and two hybrids that streaming adversarial uhsraigagrtm r adt be to ch said is are stream st algorithms input in streaming the interest Such when growing even a guarantees is provable there provide Recently, that advance. t in under fixed operate is algorithms stream streaming Classical memory. of amount ons ban neeetfrom element an obtains rounds, temn loihsaeagrtm o rcsiglredt stre data large processing for algorithms are algorithms Streaming dnAta dt oe oh hcnrUiStemmer Uri Shechner Moshe Cohen Edith Attias Idan ~u Let n eoeteasesgvnby given answers the denote and , X eadmi n let and domain a be r[ Pr ∀ i ∈ n ieec Estimators Difference and [ m : ] F z i ihaccuracy with X ∈ ~u uy3,2021 30, July (1 n upt elnme.Algorithm number. real a outputs and adversarially-robust ( = F Abstract ± : u α X 1 1 ) u , . . . , ∗ F · → AMS99 A ( togtrackers strong α u R alr probability failure , m 1 as u , . . . , eafnto.Let function. a be ) A ~ z ∈ . t tem hl sn nyalimited a only using while streams ata ,sraigagrtm aeattracted have algorithms streaming ], ( = epooeanvlfaeokfor framework novel a propose We . ptstream nput X ra.Formally, tream. hoy aaae,ntokn,and networking, databases, theory, s wrsb asdme l (2020) al. et Hassidim by eworks i e ucino h nu stream. input the of function ued m )] z ino temn loihs tis it algorithms, streaming of tion 1 osdra xcto of execution an Consider . snb naatv adversary. adaptive an by osen ≥ z , . . . , tr h niedt stream. data entire the store o easmto htteinput the that assumption he wrsrl nvr different very on rely eworks etofaeok i non- a (in frameworks two se digsraigalgorithms streaming udying o xml,ti predefined this example, For . 1 m,uigol limited a only using ams, ahst bansuperior obtain to oaches − an taypiti time, in point any at tain, m β, ) ~u Then, . β A sfie navne In advance. in fixed is n temlength stream and , ea loih that, algorithm an be A ssi ob a be to said is acy A on m particular, it is assumed that the choice for the elements in the stream is independent from the internal randomness of . This assumption is crucial for the analysis (and correctness) of many of A the existing streaming algorithms. We refer to algorithms that utilize this assumption as oblivious streaming algorithms. In this work we are interested in the setting where this assumption does not hold, often called the adversarial setting.

1.1 The adversarial streaming model The adversarial streaming model, in various forms, was considered by [MNS11,GHR+12,GHS+12, AGM12a,AGM12b,HW13,BY19,BJWY20,HKM+20,KMNS21,BHM+21]. We give here the formu- lation presented by Ben-Eliezer et al. [BJWY20]. The adversarial setting is modeled by a two-player game between a (randomized) StreamingAlgorithm and an Adversary. At the beginning, we fix a function : X∗ R. Then the game proceeds in rounds, where in the ith round: F → 1. The Adversary chooses an update u X for the stream, which can depend, in particular, i ∈ on all previous stream updates and outputs of StreamingAlgorithm.

2. The StreamingAlgorithm processes the new update ui and outputs its current response z R. i ∈

The goal of the Adversary is to make the StreamingAlgorithm output an incorrect response zi at some point i in the stream. For example, in the distinct elements problem, the adversary’s goal is that at some step i, the estimate zi will fail to be a (1 + α)-approximation of the true current number of distinct elements. In this work we present a new framework for transforming an oblivious into an adversarially-robust streaming algorithm. Before presenting our framework, we first elaborate on the existing literature and the currently available frameworks.

1.2 Existing framework: Ben-Eliezer et al. [BJWY20] To illustrate the results of [BJWY20], let us consider the distinct elements problem, in which the function counts the number of distinct elements in the stream. Observe that, assuming that there F are no deletions in the stream, this quantity is monotonically increasing. Furthermore, since we are aiming for a multiplicative (1 α) error, even though the stream is large (of length m), the number ± 1 of times we actually need to modify the estimates we release is quite small (roughly α log m times). Informally, the idea of [BJWY20] is to run several independent copies of an oblivious algorithm (in parallel), and to use each algorithm to release answers over a part of the stream during which the estimate remains constant. In more detail, the generic transformation of [BJWY20] (applicable not only to the distinct elements problem) is based on the following definition.

Definition 1.2 (Flip number [BJWY20]). Given a function , the (α, m)-flip number of , de- F F noted as λ (or simply λ in short), is the maximal number of times that the value of can α,m F change (increase or decrease) by a factor of (1 + α) during a stream of length m.

The generic construction of [BJWY20] for a function is as follows. F 1. Instantiate λ independent copies of an oblivious streaming algorithm for the function , and F set j = 1.

2 2. When the next update ui arrives:

(a) Feed ui to all of the λ copies. (b) Release an estimate using the jth copy (rounded to the nearest power of (1+ α)). If this estimate is different than the previous estimate, then set j j + 1. ← Ben-Eliezer et al. [BJWY20] showed that this can be used to transform an oblivious streaming algorithm for into an adversarially robust streaming algorithm for . In addition, the overhead F F in terms of memory is only λ, which is small for many problems of interest. The simple, but powerful, observation of Ben-Eliezer et al. [BJWY20], is that by “using every copy at most once” we can break the dependencies between the internal randomness of our algorithm and the choice for the elements in the stream. Intuitively, this holds because the answer is always computed using a “fresh copy” whose randomness is independent from the choice of stream items.

1.3 Existing framework: Hassidim et al. [HKM+20] Hassidim et al. [HKM+20] showed that, in fact, we can use every copy of the oblivious algorithm much more than once. In more detail, the idea of Hassidim et al. is to protect the internal random- ness of each of the copies of the oblivious streaming algorithm using differential privacy [DMNS06]. Hassidim et al. showed that this still suffices in order to break the dependencies between the in- ternal randomness of our algorithm and the choice for the elements in the stream. Informally, the framework