arXiv:1809.03626v3 [math.AG] 20 Jun 2021

SMOOTHED ANALYSIS FOR THE CONDITION NUMBER OF STRUCTURED REAL POLYNOMIAL SYSTEMS

ALPEREN A. ERGÜR, GRIGORIS PAOURIS, AND J. MAURICE ROJAS

Abstract. We consider the sensitivity of real zeros of structured polynomial systems to perturbations of their coefficients. In particular, we provide explicit estimates for condition numbers of structured random real polynomial systems and extend these estimates to the smoothed analysis setting.

A.E. was partially supported by Einstein Foundation, Berlin and by Pravesh Kothari of CMU. G.P. was partially supported by Simons Foundation Collaboration grant 527498 and NSF grants DMS-1812240 and CCF-1900881. J.M.R. was partially supported by NSF grants CCF-1409020, DMS-1460766, and CCF-1900881.

1. Introduction

Efficiently finding real roots of real polynomial systems is one of the main objectives of computational algebraic geometry. There are numerous algorithms for this task, but the core steps of these algorithms are easy to outline: they are some combination of algebraic manipulation, a discrete/polyhedral computation, and a numerical iterative scheme. From a computational complexity point of view, the cost of numerical iteration is much less transparent than the cost of algebraic or discrete computation. This paper constitutes a step toward understanding the complexity of numerically solving structured real polynomial systems. Our main results are Theorems 1.14, 1.16, and 1.18 below, but we will first need to give some context for our results.

1.1. How to control accuracy and complexity of numerics in real algebraic geometry? In the numerical linear algebra tradition, going back to von Neumann and Turing, condition numbers play a central role in the control of accuracy and the measurement of speed of algorithms (see, e.g., [3, 6] for further background). Subsequently, Shub and Smale initiated the use of condition numbers for polynomial system solving over the field of complex numbers [40, 41]. Condition numbers played a central role in the solution of Smale's 17th problem [11, 6, 28].

The numerics of solving polynomial systems over the real numbers is more subtle than the complex case: small perturbations can even cause the solution set to go from having no real zero to having many real zeros. This behaviour does not appear over the complex numbers, as one has theorems (such as the Fundamental Theorem of Algebra) proving that root counts are "generically" constant. So an arbitrarily small change in the coefficients can change the cardinality of the real zero set. Luckily, a condition number theory that captures these subtleties was developed by Cucker [12]. We now set up the notation and present Cucker's definition.

Definition 1.1 (Bombieri-Weyl Norm). We set $x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ where $\alpha := (\alpha_1, \ldots, \alpha_n)$, and let $P = (p_1, \ldots, p_{n-1})$ be a system of homogeneous polynomials with degree pattern $d_1, \ldots, d_{n-1}$. Let $c_{i,\alpha}$ denote the coefficient of $x^\alpha$ in $p_i$. We define the Weyl-Bombieri norms of $p_i$ and $P$ to be, respectively,
\[ \|p_i\|_W := \sqrt{\sum_{\alpha_1 + \cdots + \alpha_n = d_i} \frac{|c_{i,\alpha}|^2}{\binom{d_i}{\alpha}}} \]
and
\[ \|P\|_W := \sqrt{\sum_{i=1}^{n-1} \|p_i\|_W^2}. \qquad ⋄ \]

The following is Cucker's condition number definition [12].

Definition 1.2 (Real Condition Number). For a system of homogeneous polynomials $P = (p_1, \ldots, p_{n-1})$ with degree pattern $(d_1, \ldots, d_{n-1})$, let $\Delta_{n-1}$ be the diagonal matrix with entries $\sqrt{d_1}, \ldots, \sqrt{d_{n-1}}$ and let
\[ DP(x)|_{T_x S^{n-1}} : T_x S^{n-1} \longrightarrow \mathbb{R}^m \]
denote the linear map between tangent spaces induced by the Jacobian matrix of the polynomial system $P$ evaluated at the point $x \in S^{n-1}$. The local condition number of $P$ at a point $x \in S^{n-1}$ is
\[ \tilde{\kappa}(P, x) := \frac{\|P\|_W}{\sqrt{\left\| DP(x)|_{T_x S^{n-1}}^{-1} \Delta_{n-1} \right\|^{-2} + \|P(x)\|_2^2}} \]
and the global condition number is
\[ \tilde{\kappa}(P) := \sup_{x \in S^{n-1}} \tilde{\kappa}(P, x). \qquad ⋄ \]

An important feature of Cucker's real condition number is the following geometric fact [14].

Theorem 1.3 (Real Condition Number Theorem). We use $H_D$ to denote the vector space of homogeneous polynomial systems with degree pattern $(d_1, \ldots, d_{n-1})$, and equip this space with the metric $\rho(\cdot, \cdot)$ induced by the Bombieri-Weyl norm. We define the set of ill-posed problems to be
\[ \Sigma := \{ P \in H_D : P \text{ has a singular zero in } S^{n-1} \}. \]
Then we have
\[ \tilde{\kappa}(P) = \frac{\|P\|_W}{\rho(P, \Sigma)}. \]

Cucker's condition number is used in the design and analysis of a numerical algorithm for real zero counting [13, 14, 15], in the series of papers on computing homology groups of semialgebraic sets [16, 7, 8], and more recently in the analysis of a well-known algorithm for meshing curves and surfaces (the Plantinga-Vegter algorithm) [10].

One important observation is that the complexity of a numerical algorithm over the real numbers (imagine using bisection to find the real zeros of a given univariate polynomial) varies depending on the geometry of the input, and not just on the bit complexity of its vector representation. It is therefore more natural to go beyond worst-case analysis and seek quantitative bounds for "typical" inputs. We now explain the existing attempts at mathematically modeling the intuitive phrase "typical input".

Random and adversarial random models. Worst-case complexity theory, spearheaded by the P vs. NP question, has been a driving force behind many algorithmic breakthroughs in the last five decades. However, it has become clear that worst-case complexity theory fails to capture the practical performance of algorithms. The unreasonable effectiveness of everyday statistical methods is a case in point: the Spotify app on cell phones solves instances of an NP-hard problem all the time!

Two dominant paradigms for going beyond the worst-case analysis of algorithms are as follows. Assume an algorithm $T$ operates on the input $x \in \mathbb{R}^k$, with the cost of output $T(x)$ bounded from above by $C(x)$. One then equips the input space $\mathbb{R}^k$ with a probability measure $\mu$ and considers either the average cost $\mathbb{E}_{x \sim \mu} C(x)$, or the smoothed analysis of the cost with parameter $\delta > 0$:
\[ \sup_{x \in \mathbb{R}^k} \mathbb{E}_{y \sim \mu} C(x + \delta \|x\| y). \]
Clearly, as $\delta \to 0$, smoothed complexity recovers worst-case complexity, and as $\delta \to \infty$ we recover average-case complexity. It is also clear that to have a realistic complexity analysis, one should have a probability measure $\mu$ that somehow reflects one's context, and use theorems that allow a broad class of measures $\mu$. The idea of smoothed analysis originated in work of Spielman and Teng [42].

1.2. Existing results for average and smoothed analysis. Existing results for the average analysis of the real condition number from [15] can be roughly summarized as follows.

Theorem 1.4 (Cucker, Krick, Malajovich, Wschebor). Suppose
\[ p_i(x) := \sum_{\alpha_1 + \cdots + \alpha_n = d_i} c_\alpha^{(i)} x^\alpha, \qquad i = 1, \ldots, n-1, \]
are random polynomials where the $c_\alpha^{(i)}$ are centered Gaussian random variables with variances $\binom{d_i}{\alpha}$. Then, for the random polynomial system $P = (p_1, \ldots, p_{n-1})$ and for all $t \geq 1$, we have
\[ \mathbb{P}\{ \tilde{\kappa}(P) \geq t \} \leq 8\, d^{\frac{n+4}{2}}\, n^{5/2}\, N^{1/2}\, \frac{(1 + \log(t))^{\frac{1}{2}}}{t}, \]
where $d = \max_i d_i$ and $N = \sum_{i=1}^{n-1} \binom{n + d_i - 1}{d_i}$.

Recall the following smoothed analysis type result from [14]:

Theorem 1.5 (Cucker, Krick, Malajovich, Wschebor). Let $Q$ be an arbitrary polynomial system with degree pattern $(d_1, \ldots, d_{n-1})$, and let $P = (p_1, \ldots, p_{n-1})$ be a random polynomial system as defined above. For a parameter $0 < \delta < 1$ we define a random perturbation of $Q$ with $(P, \delta)$ as follows: $G := Q + \delta \|Q\|_W P$. Then we have
\[ \mathbb{P}\{ \tilde{\kappa}(G) \geq t \} \leq \frac{13\, n^2 d^{2n+2} N}{\delta} \cdot \frac{1}{t}. \]

Remark 1.6. The randomness model considered in these seminal results has the following restriction: the induced probability measure is invariant under the action of the orthogonal group $O(n)$ on the space of polynomials. The proof techniques used in these papers seem to be applicable only when one has this group invariance property. This creates an obstruction to analysis on spaces of structured polynomials: such spaces are not necessarily closed under the action of $O(n)$, and hence do not support an $O(n)$-invariant probability measure.

1.3. What about structured polynomials? Let $H_{d_i}$ be the vector space of homogeneous polynomials of degree $d_i$ in $n$ variables, and let $H_D$ be the vector space of polynomial systems with degree pattern $D = (d_1, \ldots, d_{n-1})$. Let $E_i \subset H_{d_i}$ be linear subspaces for $i = 1, \ldots, n-1$, and let $E = (E_1, \ldots, E_{n-1})$ be the corresponding vector space of polynomial systems. For virtually any application of real root finding algorithms, the user has a polynomial system with a particular structure rather than a generic polynomial system with $N = \sum_{i=1}^{n-1} \binom{n + d_i - 1}{d_i}$ many coefficients. Suppose a user has identified the linear structure $E$ that is present in the target equations, and would like to know how much precision is expected for round-off errors in the space $E$. One could induce a probability measure $\mu$ on $E$ and use $\mathbb{E}_{P \sim \mu} \log(\tilde{\kappa}(P))$ to determine the expected precision of numerical solutions. What could go wrong?

Example 1.7. Let $u, v \in S^{n-1}$ be two vectors with $u \perp v$, and define the following subspaces:
\[ E_i := \{ p \in H_{d_i} : p(u) = \langle \nabla p(u), v \rangle = 0 \}, \qquad i = 1, \ldots, n-1, \]
where $\nabla p(u)$ denotes the gradient of $p$ evaluated at $u$. The $E_i$ are codimension 2 linear subspaces of $H_{d_i}$. Now consider the space of polynomial systems $E := (E_1, \ldots, E_{n-1})$; any polynomial system in the space $E$ has a singular real zero at $u$. Hence, for all $P \in E$ the condition number $\tilde{\kappa}(P)$ is infinite.

The preceding example illustrates that, for certain linear spaces $E$, the probabilistic analysis of condition numbers is meaningless: it is possible that all inhabitants of $E$ have infinite condition number. We will rule out these degenerate cases as follows.

Definition 1.8 (Non-degenerate linear space). We call a linear space $E_i \subset H_{d_i}$ non-degenerate if for all $v \in S^{n-1}$ there exists an element $p_i \in E_i$ with $p_i(v) \neq 0$. In other words, $E_i$ is non-degenerate if there is no base point $v \in S^{n-1}$ at which all the elements of $E_i$ vanish simultaneously. We call a space of polynomial systems $E = (E_1, \ldots, E_{n-1})$ non-degenerate if $E_i$ is non-degenerate for all $i = 1, \ldots, n-1$. ⋄

An easy corollary of Theorem 1.14 shows that the expected precision is finite for any non-degenerate space $E$.

Corollary 1.9. Let $E \subset H_D$ be a non-degenerate linear space of polynomials. Let $\mu$ be a probability measure supported on the space $E$ that satisfies the assumptions listed in Section 1.5. Then $\mathbb{E}_{P \sim \mu} \log(\tilde{\kappa}(P))$ is finite.

This is clearly not the end of the story: a non-degenerate linear structure $E$ may still be close to being degenerate, and this would make every element in the space $E$ ill-conditioned. So we need to somehow quantify the numerical conditioning of a linear structure $E$. Next, we introduce the notion of dispersion as a rough measure of the conditioning of a linear structure.

1.4. The dispersion constant of a linear space. Suppose a linear subspace $F \subset H_d$ is given for some $d > 1$, together with an orthonormal basis $u_j(x)$, $j = 1, \ldots, m$, with respect to the Bombieri-Weyl norm. Now suppose that for a particular point $v_0 \in S^{n-1}$ all the basis elements satisfy $|u_j(v_0)| < \varepsilon$, where $\varepsilon > 0$ is small. What kind of behavior would one expect from elements of $F$ at the point $v_0$? This point $v_0$ would behave like a base point (as if all elements of $F$ vanished at $v_0$) unless one employs rather high precision. This motivates the following definition.

Definition 1.10 (Dispersion constant of a linear space of polynomials). Let $F \subset H_d$ be given for some $d > 1$, and let $u_j(x)$, $j = 1, \ldots, m$, be an orthonormal basis of $F$ with respect to the Bombieri-Weyl norm. We define the following two quantities
\[ \sigma_{\min}(F) := \min_{v \in S^{n-1}} \Big( \sum_j u_j(v)^2 \Big)^{\frac{1}{2}}, \qquad \sigma_{\max}(F) := \max_{v \in S^{n-1}} \Big( \sum_j u_j(v)^2 \Big)^{\frac{1}{2}}, \]
and the dispersion constant $\sigma(F)$ is their ratio:
\[ \sigma(F) := \frac{\sigma_{\max}(F)}{\sigma_{\min}(F)}. \qquad ⋄ \]

The quantity $\sigma_{\max}$ is introduced to make things scale invariant. We generalize the definition to polynomial systems in a straightforward manner.

Definition 1.11 (Dispersion constant of a linear space of polynomial systems). Let $E_i \subset H_{d_i}$ be linear spaces for $i = 1, \ldots, n-1$, and let $E = (E_1, \ldots, E_{n-1})$. We define the dispersion constant $\sigma(E)$ as follows: $\sigma(E) := \max_i \sigma(E_i)$. ⋄

Our estimates replace the dimension $N$ in earlier results with the (potentially much smaller) dimension of $E$, at the expense of involving the new quantity $\sigma(E)$. So if a user has a fixed structure $E$ with small dimension and tame dispersion constant, then the expected conditioning on $E$ admits a much better bound than what earlier results suggest. On the other hand, if one has a sparse but highly sensitive structure, the resulting average-case conditioning could be a lot worse than the average over the entire space $H_D$.

How big is the dispersion constant? To better understand the dispersion constant, let us consider two examples at opposite extremes.

Example 1.12 (A subspace with minimal dispersion constant). Consider subspaces of polynomials $F_i \subset P_{n,2d_i}$ defined as the span of
\[ u_{kl}^{(i)} = (x_1^2 + \cdots + x_n^2)^{d_i - 1} x_k x_l \quad \text{for } 1 \leq k, l \leq n, \]
and let $F = (F_1, \ldots, F_n)$. It is easy to show that $\sigma(F) = 1$.
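As a rough numerical illustration of Definition 1.10, the following Python sketch estimates the dispersion constant of a space spanned by monomials. It relies on the fact that distinct monomials are orthogonal for the Bombieri-Weyl inner product, so the scaled monomials $\sqrt{\binom{d}{\alpha}}\, x^\alpha$ form an orthonormal basis of their span. The function names (`kernel_norm`, `dispersion_estimate`), the sample size, and the choice $n = 3$, $d = 4$ are assumptions made for illustration only; random sampling of the sphere can only underestimate the true ratio $\sigma_{\max}/\sigma_{\min}$.

```python
import itertools
import math
import random


def multinomial(d, alpha):
    """Multinomial coefficient d! / (alpha_1! * ... * alpha_n!)."""
    out = math.factorial(d)
    for a in alpha:
        out //= math.factorial(a)
    return out


def kernel_norm(exponents, d, v):
    """(sum_j u_j(v)^2)^(1/2) for the basis u_alpha = sqrt(multinomial) * x^alpha."""
    total = 0.0
    for alpha in exponents:
        mono = math.prod(x ** a for x, a in zip(v, alpha))
        total += multinomial(d, alpha) * mono ** 2
    return math.sqrt(total)


def random_unit_vector(n, rng):
    """Uniform random point on the unit sphere S^{n-1}."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]


def dispersion_estimate(exponents, d, n, samples=2000, seed=0):
    """Sampled max/min of kernel_norm over the sphere; never exceeds the true sigma."""
    rng = random.Random(seed)
    values = [kernel_norm(exponents, d, random_unit_vector(n, rng))
              for _ in range(samples)]
    return max(values) / min(values)


n, d = 3, 4  # illustrative sizes (assumption)

# Full space H_d: every monomial of degree d in n variables.
all_monomials = [a for a in itertools.product(range(d + 1), repeat=n) if sum(a) == d]
# Sparse structure: only the pure powers x_1^d, ..., x_n^d.
pure_powers = [tuple(d if i == j else 0 for i in range(n)) for j in range(n)]

sigma_full = dispersion_estimate(all_monomials, d, n)
sigma_sparse = dispersion_estimate(pure_powers, d, n)

# For the pure powers the extremal points are explicit: a coordinate vector
# maximizes sum_i v_i^(2d), and the uniform vector minimizes it.
e1 = [1.0] + [0.0] * (n - 1)
uniform = [1.0 / math.sqrt(n)] * n
exact_sparse = kernel_norm(pure_powers, d, e1) / kernel_norm(pure_powers, d, uniform)

print(sigma_full, sigma_sparse, exact_sparse, n ** ((d - 1) / 2))
```

For the full space the sampled ratio equals 1 up to rounding, since $\sum_\alpha \binom{d}{\alpha} v^{2\alpha} = \|v\|_2^{2d}$. For the pure powers the ratio at the two explicit points equals $n^{(d-1)/2}$, and the sampled estimate approaches this value from below.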

Example 1.13 (A sparse but highly sensitive structure). Let $E \subset P_{n,d}$ be the subspace of polynomials spanned by the monomials $x_1^d, \ldots, x_n^d$. Then we have $\sigma(E) = n^{\frac{d-1}{2}}$.

One may wonder how big the dispersion constant of a "typical" linear space $E$ is, say for $E$ of dimension around $n^2 \log d$. Would a typical low-dimensional space look like the second example or the first example? We address this question in the Appendix. For our main theorems, we will allow $E$ to be arbitrary and give bounds depending explicitly on the dispersion constant $\sigma(E)$.

1.5. A general model of randomness for structured polynomial systems. In our precursor paper [20] we obtained probabilistic condition number estimates for general measures (without any group invariance assumption). In this paper we present probabilistic results for the same general family of measures, but this time supported on a structured space $E$ instead of $H_D$. Note that here the structured space $E$ will be fixed by the user, and our results will give estimates for a random element of $E$. First, we introduce our general model of randomness.

We say a random vector $X \in \mathbb{R}^n$ satisfies the Centering, sub-Gaussian, and Small Ball properties, with constants $K$ and $c_0$, if the following hold true:
1. (Centering) For any $\theta \in S^{n-1}$ we have $\mathbb{E}\langle X, \theta \rangle = 0$ (equivalently, $\mathbb{E}X = 0$).
2. (Sub-Gaussian) There is a $K > 0$ such that for every $\theta \in S^{n-1}$ we have $\mathrm{Prob}(|\langle X, \theta \rangle| \geq t) \leq 2e^{-t^2/K^2}$ for all $t > 0$.
3. (Small Ball) There is a $c_0 > 0$ such that for every vector $a \in \mathbb{R}^n$ we have $\mathrm{Prob}(|\langle a, X \rangle| \leq \varepsilon \|a\|_2) \leq c_0 \varepsilon$ for all $\varepsilon > 0$.

We note that these three assumptions directly yield a relation between $K$ and $c_0$: we in fact have $K c_0 \geq \frac{1}{4}$ (see [20], just before Section 3.2). Moreover, for a random variable $X$ that satisfies the above assumptions with constants $K$ and $c_0$, and a scalar $\lambda > 0$, the random variable $\lambda X$ satisfies the above assumptions with constants $\lambda K$ and $\lambda^{-1} c_0$. In other words, $K c_0$ is invariant under scaling, hence one can hope for a universal lower bound of $\frac{1}{4}$.

Random vectors that satisfy these three properties form a large family of distributions, including standard Gaussian vectors and uniform measures on a large family of convex bodies called $\Psi_2$-bodies (such as uniform measures on $\ell_p$-balls for all $p \geq 2$). We refer the reader to the book of Vershynin [44] for more details. Discrete sub-Gaussian distributions, such as the Bernoulli distribution, also satisfy an inequality similar to the small-ball inequality in our assumptions. However, the small-ball type inequality satisfied by such discrete distributions depends not only on the norm of the deterministic vector $a$ but also on the arithmetic structure of $a$. It is possible that our methods, combined with the work of Rudelson and Vershynin on the Littlewood-Offord problem [35], can extend our main results to discrete distributions such as the Bernoulli distribution. In this work, we will content ourselves with continuous distributions.

The preceding examples of random vectors do not necessarily have independent coordinates. This provides important extra flexibility. There are also interesting examples of random vectors with independent coordinates. In particular, if $X_1, \ldots, X_m$ are independent centered random variables that each satisfy both the sub-Gaussian inequality with constant $K$ and the Small Ball condition with constant $c_0$, then the random vector $X = (X_1, \ldots, X_m)$ also satisfies the sub-Gaussian and Small Ball inequalities with constants $C_1 K$ and $C_2 c_0$, where $C_1$ and $C_2$ are universal constants. This is a relatively new result of Rudelson and Vershynin [37]. The best possible universal constant $C_2$ is discussed in [31, 34].
To create a random variable satisfying the Small Ball and sub-Gaussian properties one can, for instance, start by fixing any $p \geq 2$ and then considering a random variable with density function $f(t) := c_p e^{-|t|^p}$, for suitably chosen positive $c_p$.

1.6. Our Results. We present estimates for random structured polynomial systems, where the randomness model is the one introduced in the preceding section.

Average-case condition number estimates for structured polynomial systems

Theorem 1.14. Let $E_i \subseteq H_{d_i}$ be non-degenerate linear subspaces, and let $E = (E_1, \ldots, E_{n-1})$. Assume $\dim(E) \geq n \log(ed)$ and $n \geq 3$. Let $p_i \in E_i$ be independent random elements of $E_i$ that satisfy the Centering property, the sub-Gaussian property with constant $K$, and the Small Ball property with constant $c_0$, each with respect to the Bombieri-Weyl inner product. We set $d := \max_i d_i$ and
\[ M := nK \sqrt{\dim(E)} \left( c_0 C K d^2 \log(ed)\, \sigma(E) \right)^{2n-2}, \]
where $C \geq 4$ is a universal constant. Then for the random polynomial system $P = (p_1, \ldots, p_{n-1})$, we have
\[ \mathrm{Prob}(\tilde{\kappa}(P) \geq tM) \leq \begin{cases} 3 t^{-\frac{1}{2}} & \text{if } 1 \leq t \leq e^{2n \log(ed)}, \\ (e^2 + 1)\, t^{-\frac{1}{2} + \frac{1}{4 \log(ed)}} & \text{if } e^{2n \log(ed)} \leq t. \end{cases} \]
Moreover, for 0

Let $\langle \cdot, \cdot \rangle$ denote the standard inner product on $\mathbb{R}^n$. We then say a random vector $X \in \mathbb{R}^n$ satisfies the Anti-Concentration Property with constant $c_0$ if we have $F(\langle X, \theta \rangle, \varepsilon) \leq c_0 \varepsilon$ for all $\theta \in S^{n-1}$. ⋄

It is easy to check that if the random variable $Z$ has bounded density $f$ then $F(Z, t) \leq \|f\|_\infty t$. Moreover, the Lebesgue Differentiation Theorem states that upper bounds for the function $t^{-1} F(Z, t)$ for all $t$ imply upper bounds for $\|f\|_\infty$. See [36] for the details.

Theorem 1.16. Let $E \subseteq H_D$ be a non-degenerate linear subspace for $D = (d_1, \ldots, d_{n-1})$. Assume $\dim(E) \geq n \log^2(ed)$ and $n \geq 3$. Let $Q \in E$ be a fixed (deterministic) polynomial system, and let $G \in E$ be a random polynomial system given by the same model of randomness as in Theorem 1.14, but with the Small Ball Property replaced by the Anti-Concentration Property. Set $d := \max_i d_i$, and
\[ M := nK \sqrt{\dim(E)} \left( c_0 d^2 C K \log(ed)\, \sigma(E) \right)^{2n-2} \left( 1 + \frac{\|Q\|_W}{\sqrt{n}\, K \log(ed)} \right)^{2n-1}, \]
where $C \geq 4$ is a universal constant. Then for the randomly perturbed polynomial system $P = Q + G$, we have
\[ \mathrm{Prob}(\tilde{\kappa}(P) \geq tM) \leq \begin{cases} 3 t^{-\frac{1}{2}} & \text{if } 1 \leq t \leq e^{2n \log(ed)}, \\ (e^2 + 1)\, t^{-\frac{1}{2} + \frac{1}{4 \log(ed)}} & \text{if } e^{2n \log(ed)} \leq t. \end{cases} \]
Moreover, for 0

Corollary 1.17. Let $E \subseteq H_D$ be a non-degenerate linear subspace for $D = (d_1, \ldots, d_{n-1})$. Assume $\dim(E) \geq n \log(ed)^2$ and $n \geq 3$. Let $Q \in E$ be a fixed (deterministic) polynomial system, and let $G \in E$ be a random polynomial system given by the model of randomness as in Theorem 1.16, but with fixed $K = 1$. Now let $0 < \delta < 1$ be a parameter and consider the polynomial system
\[ P := Q + \delta \|Q\|_W G. \]
We set $d := \max_i d_i$, and
\[ M := n \sqrt{\dim(E)} \left( c_0 C d^2 \log(ed)\, \sigma(E) \right)^{2n-2} \delta \|Q\|_W \left( 1 + \frac{1}{\delta \sqrt{n} \log(ed)} \right)^{2n-1}, \]
where $C \geq 4$ is a universal constant. Then, we have
\[ \mathrm{Prob}(\tilde{\kappa}(P) \geq tM) \leq \begin{cases} 3 t^{-\frac{1}{2}} & \text{if } 1 \leq t \leq e^{2n \log(ed)}, \\ (e^2 + 1)\, t^{-\frac{1}{2} + \frac{1}{4 \log(ed)}} & \text{if } e^{2n \log(ed)} \leq t. \end{cases} \]

An interesting consequence

As a corollary of the smoothed analysis estimate in Theorem 1.16, we derive the following structural result.

Theorem 1.18. Let $E_i \subseteq H_{d_i}$ be non-degenerate linear subspaces, let $E = (E_1, \ldots, E_{n-1})$, and let $Q \in E$. Then, for every $0 < \varepsilon < 1$, there is a polynomial system $P_\varepsilon \in E$ with the following properties:
\[ \|P_\varepsilon - Q\|_W \leq \varepsilon \|Q\|_W \left( \frac{\sqrt{\dim(E)}}{\log(ed) \sqrt{n}} \right) \]
and
\[ \tilde{\kappa}(P_\varepsilon) \leq \sqrt{n} \sqrt{\dim(E)} \left( \frac{d^2 C \log(ed)\, \sigma(E)}{\varepsilon} \right)^{2n-2} \]
for a universal constant $C$.

One can view this result as a metric entropy statement as follows. Suppose we are given a bounded set $T \subset E$ with $\sup_{P \in T} \|P\|_W \leq 1$, and we would like to cover $T$ with balls of radius $\delta$, i.e., $T = \bigcup_i B(p_i, \delta)$. Moreover, suppose we want the ball-centers $p_i$ to have a controlled condition number. We can start with an arbitrary $\frac{\delta}{2}$-covering $T = \bigcup_i B(p_i, \frac{\delta}{2})$, and use Theorem 1.18 with $\varepsilon = \frac{\delta \sqrt{n}}{2 \sqrt{\dim(E)}}$ to find a point $\tilde{p}_i$ with controlled condition number in each one of the balls $B(p_i, \frac{\delta}{2})$. Then $T = \bigcup_i B(\tilde{p}_i, \delta)$ gives a $\delta$-covering of $T$ in which every center $\tilde{p}_i$ has controlled condition number.

2. Background and Basic Estimates

We first present a simple lemma for a single random polynomial.

Lemma 2.1. Let $F \subset H_d$ be a non-degenerate linear subspace of degree $d$ homogeneous polynomials. We equip $F$ with the Bombieri-Weyl norm. Suppose $p \in F$ is a random element that satisfies the Centering property, the sub-Gaussian property with constant $K$, and the Small Ball property with constant $c_0$, each with respect to the Bombieri-Weyl inner product. Then for all $w \in S^{n-1}$ the following estimates hold:
\[ \mathrm{Prob}\left( |p(w)| \geq t \sigma_{\max}(F) \right) \leq \exp\left( 1 - \frac{t^2}{K^2} \right), \]
\[ \mathrm{Prob}\left( |p(w)| \leq \varepsilon \sigma_{\min}(F) \right) \leq c_0 \varepsilon. \]

Proof. Suppose $u_1, \ldots, u_m$ is an orthonormal basis of $F$ with respect to the Bombieri-Weyl inner product. Let $f \in F$ be a polynomial with $f(x) = \sum_i a_i u_i(x)$; then for any $v \in S^{n-1}$, clearly $f(v) = \sum_i a_i u_i(v)$. In other words, if we set $q_v := \sum_i u_i(v) u_i(x)$ then we have $f(v) = \langle f, q_v \rangle_W$. Also note that, since the $u_i$ form an orthonormal basis with respect to the Bombieri-Weyl norm, we have $\|q_v\|_W = \left( \sum_i u_i(v)^2 \right)^{\frac{1}{2}}$.

Now let $p \in F$ be the random element described above. The reasoning in the preceding paragraph gives us the following estimates for any fixed point $v \in S^{n-1}$:
\[ \mathrm{Prob}\left( |p(v)| \geq t \|q_v\|_W \right) \leq \exp\left( 1 - \frac{t^2}{K^2} \right), \]
\[ \mathrm{Prob}\left( |p(v)| \leq \varepsilon \|q_v\|_W \right) \leq c_0 \varepsilon. \]
By the definition of $\sigma_{\max}(F)$ and $\sigma_{\min}(F)$, these pointwise estimates yield the desired result.
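The identity $f(v) = \langle f, q_v \rangle_W$ used in the proof can be checked numerically. The sketch below does this in Python for the full space $F = H_d$, taking the scaled monomials $\sqrt{\binom{d}{\alpha}}\, x^\alpha$ as the orthonormal basis. The representation of a form as a dictionary from exponent vectors to coefficients, the helper names (`bw_inner`, `evaluate`, `kernel_at`), and the sizes $n = d = 3$ are assumptions made for illustration.

```python
import itertools
import math
import random


def multinomial(d, alpha):
    """Multinomial coefficient d! / (alpha_1! * ... * alpha_n!)."""
    out = math.factorial(d)
    for a in alpha:
        out //= math.factorial(a)
    return out


def monomial(v, alpha):
    """Value of x^alpha at the point v."""
    return math.prod(x ** a for x, a in zip(v, alpha))


def bw_inner(f, g, d):
    """Bombieri-Weyl inner product of two degree-d forms stored as {exponent: coeff}."""
    return sum(c * g.get(alpha, 0.0) / multinomial(d, alpha) for alpha, c in f.items())


def evaluate(f, v):
    """Evaluate the form f at the point v."""
    return sum(c * monomial(v, alpha) for alpha, c in f.items())


def kernel_at(exponents, d, v):
    """q_v = sum_alpha u_alpha(v) u_alpha, with u_alpha = sqrt(multinomial) * x^alpha."""
    return {alpha: multinomial(d, alpha) * monomial(v, alpha) for alpha in exponents}


rng = random.Random(1)
n, d = 3, 3  # illustrative sizes (assumption)
exponents = [a for a in itertools.product(range(d + 1), repeat=n) if sum(a) == d]

# A random form f in H_d and a random point v on the unit sphere.
f = {alpha: rng.uniform(-1.0, 1.0) for alpha in exponents}
g = [rng.gauss(0.0, 1.0) for _ in range(n)]
length = math.sqrt(sum(x * x for x in g))
v = [x / length for x in g]

q_v = kernel_at(exponents, d, v)
point_value = evaluate(f, v)          # f(v)
pairing = bw_inner(f, q_v, d)         # <f, q_v>_W
kernel_norm = math.sqrt(bw_inner(q_v, q_v, d))  # ||q_v||_W

print(point_value, pairing, kernel_norm)
```

The two printed values of $f(v)$ agree up to rounding, and $\|q_v\|_W = 1$ because $\sum_\alpha \binom{d}{\alpha} v^{2\alpha} = \|v\|_2^{2d}$; for the full space $H_d$ this gives $\sigma_{\min} = \sigma_{\max} = 1$.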

The following is the generalization of Lemma 2.1 to systems of polynomials.

Lemma 2.2. Let $D = (d_1, \ldots, d_{n-1}) \in \mathbb{N}^{n-1}$. For all $i \in \{1, \ldots, n-1\}$ let $E_i \subseteq H_{d_i}$ be non-degenerate linear subspaces, and let $E := (E_1, \ldots, E_{n-1})$. For each $i$, let $p_i$ be chosen from $E_i$ via a distribution satisfying the Centering Property, the Sub-Gaussian Property with constant $K$, and the Small Ball Property with constant $c_0$ (each with respect to the Bombieri-Weyl inner product). Then, for the random polynomial system $P = (p_1, \ldots, p_{n-1})$ and all $v \in S^{n-1}$, the following estimates hold:
\[ \mathrm{Prob}\left( \|P(v)\|_2 \geq t \sigma_{\max}(E) \sqrt{n-1} \right) \leq \exp\left( 1 - \frac{a_1 t^2 (n-1)}{K^2} \right) \]
and
\[ \mathrm{Prob}\left( \|P(v)\|_2 \leq \varepsilon \sigma_{\min}(E) \sqrt{n-1} \right) \leq (a_2 c_0 \varepsilon)^{n-1}, \]
where $a_1$ and $a_2$ are absolute constants.

For the proof of Lemma 2.2 we need to recall some theorems from probability theory and some basic tools developed in our earlier work [20]. These basic lemmata will also be used throughout the paper. We start with a theorem which is reminiscent of Hoeffding's classical inequality [21].

Theorem 2.3. [46, Prop. 5.10] There is an absolute constant $\tilde{c}_1 > 0$ with the following property: if $X_1, \ldots, X_n$ are centered, sub-Gaussian random variables with constant $K$, $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ and $t \geq 0$, then
\[ \mathrm{Prob}\left( \Big| \sum_i a_i X_i \Big| \geq t \right) \leq 2 \exp\left( \frac{-\tilde{c}_1 t^2}{K^2 \|a\|_2^2} \right). \]

We will also need the following standard lemma (see, e.g., [35, Lemma 2.2]).

Lemma 2.4. Assume $Z_1, \ldots, Z_n$ are independent random variables that have the property that $F(Z_i, t) \leq c_0 t$ for all $t > 0$. Then for $t > 0$ we have $F(W, t\sqrt{n}) \leq (c c_0 t)^n$, where $W := \|(Z_1, \ldots, Z_n)\|_2$. Moreover, if $\xi_1, \ldots, \xi_k$ are independent random variables such that, for every $\varepsilon > 0$, we have $\mathrm{Prob}(|\xi_i| \leq \varepsilon) \leq c_0 \varepsilon$, then there is a universal constant $\tilde{c} > 0$ such that for every $\varepsilon > 0$ we have
\[ \mathrm{Prob}\left( \sqrt{\xi_1^2 + \cdots + \xi_k^2} \leq \varepsilon \sqrt{k} \right) \leq (\tilde{c} c_0 \varepsilon)^k. \]

Now that we have our basic probabilistic tools, we proceed to deriving some deterministic inequalities.

The lemma below was proved in our earlier paper [20], generalizing a classical theorem of Kellogg [22]. To state the lemma we need a bit of terminology. For any system of homogeneous polynomials $P := (p_1, \ldots, p_{n-1}) \in (\mathbb{R}[x_1, \ldots, x_n])^{n-1}$ define
\[ \|P\|_\infty := \sup_{x \in S^{n-1}} \sqrt{\sum_{i=1}^{n-1} p_i(x)^2}. \]
Let $DP(x)$ denote the Jacobian matrix of the polynomial system at the point $x$, let $DP(x)(u)$ denote the image of the vector $u$ under the linear operator $DP(x)$, and set
\[ \|D^{(1)}P\|_\infty := \sup_{x, u \in S^{n-1}} \|DP(x)(u)\|_2. \]
(Alternatively, the last quantity can be written $\sup_{x, u \in S^{n-1}} \sqrt{\sum_{i=1}^{n-1} \langle \nabla p_i(x), u \rangle^2}$.)

Lemma 2.5. Let $P := (p_1, \ldots, p_{n-1}) \in (\mathbb{R}[x_1, \ldots, x_n])^{n-1}$ be a polynomial system with $p_i$ homogeneous of degree $d_i$ for each $i$, and set $d := \max_i d_i$. Then:
(1) We have $\|D^{(1)}P\|_\infty \leq d^2 \|P\|_\infty$ and, for any mutually orthogonal $x, y \in S^{n-1}$, we also have $\|DP(x)(y)\|_2 \leq d \|P\|_\infty$.
(2) If $\deg(p_i) = d$ for all $i \in \{1, \ldots, n-1\}$ then we also have $\|D^{(1)}P\|_\infty \leq d \|P\|_\infty$.

The final lemma we need is a discretization tool for homogeneous polynomial systems that was developed in [20] based on Lemma 2.5. We need a bit of terminology to state the lemma.

Definition 2.6. Let $K$ be a compact set in a metric space $(X, d)$. A set $A \subseteq K$ with finitely many elements is called a $\delta$-net if for every $x \in K$ there exists $y \in A$ with $d(x, y) \leq \delta$. ⋄

For the unit sphere in $\mathbb{R}^n$, equipped with the standard Euclidean metric, there are known bounds for the size of a $\delta$-net. We recall one such bound below.

Lemma 2.7. Let $S^{n-1}$ be the unit sphere in $\mathbb{R}^n$ with respect to the standard Euclidean metric. Then for every $\delta > 0$ there exists a $\delta$-net $\mathcal{N} \subset S^{n-1}$ with size at most $2n \left( 1 + \frac{2}{\delta} \right)^{n-1}$.

Lemma 2.7 is almost folklore: a proof appears in Proposition 2.1 of [36].
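To make Definition 2.6 and the bound of Lemma 2.7 concrete, the following Python sketch builds a net on the circle $S^1$ (the case $n = 2$) greedily and compares its size with $2n(1 + 2/\delta)^{n-1}$. This is only an illustration under stated assumptions: the greedy construction, the function names, and the use of a fine grid as a stand-in for the circle are not taken from the paper, and the resulting set is a $\delta$-net of the grid (hence a net of the circle with radius $\delta$ plus the grid spacing).

```python
import math


def net_size_bound(n, delta):
    """Upper bound 2n(1 + 2/delta)^(n-1) from Lemma 2.7."""
    return 2 * n * (1 + 2 / delta) ** (n - 1)


def circle_grid(grid):
    """Equally spaced points on the unit circle S^1."""
    return [(math.cos(2 * math.pi * k / grid), math.sin(2 * math.pi * k / grid))
            for k in range(grid)]


def greedy_net_on_circle(delta, grid=5000):
    """Keep a grid point whenever it is farther than delta from all points kept so far."""
    net = []
    for p in circle_grid(grid):
        if all(math.dist(p, q) > delta for q in net):
            net.append(p)
    return net


def covers(net, delta, grid=5000):
    """Check that every grid point lies within distance delta of the net."""
    return all(any(math.dist(p, q) <= delta for q in net) for p in circle_grid(grid))


results = {}
for delta in (0.5, 0.1):
    net = greedy_net_on_circle(delta)
    results[delta] = (len(net), net_size_bound(2, delta), covers(net, delta))
    print(delta, results[delta])
```

In both cases the greedy net is smaller than the bound of Lemma 2.7, which is not expected to be tight; its role in the paper is only to control the union bound over the net.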

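Lemma 2.8. Let $P = (p_1, \ldots, p_{n-1})$ be a system of homogeneous polynomials $p_i$ in $n$ variables with $\deg(p_i) = d_i$. Let $\mathcal{N}$ be a $\delta$-net on $S^{n-1}$. Let $\max_{\mathcal{N}}(P) = \sup_{y \in \mathcal{N}} \|P(y)\|_2$ and $\|P\|_\infty = \sup_{x \in S^{n-1}} \|P(x)\|_2$. Similarly, let us define
\[ \max_{\mathcal{N}^{k+1}}(D^{(k)}P) = \sup_{x, u_1, \ldots, u_k \in \mathcal{N}} \left\| D^{(k)}P(x)(u_1, \ldots, u_k) \right\|_2 \]
and
\[ \|D^{(k)}P\|_\infty = \sup_{x, u_1, \ldots, u_k \in S^{n-1}} \left\| D^{(k)}P(x)(u_1, \ldots, u_k) \right\|_2. \]
Then:
(1) When $\deg(p_i) = d$ for all $i \in \{1, \ldots, m\}$ we have $\|P\|_\infty \leq \frac{\max_{\mathcal{N}}(P)}{1 - d\delta}$ and $\|D^{(k)}P\|_\infty \leq \frac{\max_{\mathcal{N}^{k+1}}(D^{(k)}P)}{1 - \delta d \sqrt{k+1}}$.
(2) When $\max_i \{\deg(p_i)\} \leq d$ we have $\|P\|_\infty \leq \frac{\max_{\mathcal{N}}(P)}{1 - d^2 \delta}$ and $\|D^{(k)}P\|_\infty \leq \frac{\max_{\mathcal{N}^{k+1}}(D^{(k)}P)}{1 - \delta d^2 \sqrt{k+1}}$.

Proof of Lemma 2.2: We begin with the first claim. Using Lemma 2.1 and the fact that $\sigma_{\max}(E) \geq \sigma_{\max}(E_i)$ for all $i$, we get the following estimate for any $p_i \in E_i$ and $w \in S^{n-1}$:
\[ \mathrm{Prob}\left( |p_i(w)| \geq s \sigma_{\max}(E) \right) \leq \exp\left( 1 - \frac{s^2}{K^2} \right). \]
Now let $a = (a_1, \ldots, a_{n-1}) \in \mathbb{R}^{n-1}$ with $\|a\|_2 = 1$, and apply Theorem 2.3 to the sub-Gaussian random variables $\frac{p_i(w)}{\sigma_{\max}(E)}$ and the vector $a$. We then get
\[ \mathrm{Prob}\left( \Big| \sum_i a_i p_i(w) \Big| \geq s \sigma_{\max}(E) \right) \leq \exp\left( 1 - \frac{\tilde{c}_1 s^2}{K^2} \right). \]
Observe that $\|P(w)\|_2 = \max_{a \in S^{n-2}} |\langle a, P(w) \rangle|$. For any fixed point $w \in S^{n-1}$ and a free variable $a \in \mathbb{R}^{n-1}$, we have that $\langle a, P(w) \rangle$ is a linear polynomial in $a$. We then use Lemma 2.8 on this linear polynomial, which gives us the following estimate:
\[ \mathrm{Prob}\left( \|P(w)\|_2 \geq \frac{s \sigma_{\max}(E)}{1 - \delta} \right) \leq |\mathcal{N}| \exp\left( 1 - \frac{\tilde{c}_1 s^2}{K^2} \right). \]
We then use Lemma 2.7 to control the cardinality of the $\delta$-net and get
\[ |\mathcal{N}| \leq 2n \left( 1 + \frac{2}{\delta} \right)^{n-1} \leq e^{(n-1) \tilde{c} \log(\frac{1}{\delta})}, \]
for some absolute constant $\tilde{c}$. So we set $\delta = \frac{1}{2}$ and $s = \frac{t \sqrt{n-1}}{2}$, so that $\frac{s}{1 - \delta} = t\sqrt{n-1}$, and obtain the following estimate for some universal constant $a_1$:
\[ \mathrm{Prob}\left( \|P(w)\|_2 \geq t \sigma_{\max}(E) \sqrt{n-1} \right) \leq \exp\left( 1 - \frac{a_1 t^2 (n-1)}{K^2} \right). \]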

We continue with the proof of the second claim. Using Lemma 2.1 and the fact that $\sigma_{\min}(E) \leq \sigma_{\min}(E_i)$ for all $i$, we deduce the following estimate for all $p_i$ and for any $\varepsilon > 0$:
\[ \mathrm{Prob}\left( \frac{|p_i(w)|}{\sigma_{\min}(E)} \leq \varepsilon \right) \leq c_0 \varepsilon. \]
Using Lemma 2.4 on the random variables $\frac{p_i(w)}{\sigma_{\min}(E)}$ gives the following estimate:
\[ \mathrm{Prob}\left( \|P(w)\|_2 \leq \varepsilon \sigma_{\min}(E) \sqrt{n-1} \right) \leq (\tilde{c}_2 c_0 \varepsilon)^{n-1}. \]

3. Operator Norm Type Estimates

In this section we will estimate the absolute maximum norm of a random polynomial system on the sphere. Recall that for a homogeneous polynomial system $P = (p_1, \ldots, p_{n-1})$ the sup-norm is defined as $\|P\|_\infty = \sup_{x \in S^{n-1}} \|P(x)\|_2$. The following lemma is our sup-norm estimate for a random polynomial system $P$.

Lemma 3.1. Let $D = (d_1, \ldots, d_{n-1})$ be a vector with positive integer coordinates, let $E_i \subseteq H_{d_i}$ be full linear subspaces, and let $E = (E_1, \ldots, E_{n-1})$. Let $p_i \in E_i$ be independent random elements of $E_i$ that satisfy the Centering Property, the Sub-Gaussian Property with constant $K$, and the Small Ball Property with constant $c_0$, each with respect to the Bombieri-Weyl inner product. Let $\mathcal{N}$ be a $\delta$-net on $S^{n-1}$. Then for $P = (p_1, \ldots, p_{n-1})$ we have
\[ \mathrm{Prob}\left( \max_{x \in \mathcal{N}} \|P(x)\|_2 \geq t \sigma_{\max}(E) \sqrt{n} \right) \leq |\mathcal{N}| \exp\left( 1 - \frac{a_1 t^2 n}{K^2} \right), \]
where $a_1$ is a universal constant. In particular, for $d = \max_i \deg(p_i)$, $\delta = \frac{1}{3d^2}$, and $t = s \log(ed)$ with $s \geq 1$, this gives us the following estimate:
\[ \mathrm{Prob}\left( \|P\|_\infty \geq s \sigma_{\max}(E) \sqrt{n} \log(ed) \right) \leq \exp\left( 1 - \frac{a_3 s^2 n \log(ed)^2}{K^2} \right), \]
where $a_3$ is a universal constant.

Proof. The first statement is proven by taking a union bound over $\mathcal{N}$ and using Lemma 2.2. The second part of the statement follows immediately from the first part and Lemma 2.8.

4. Small Ball Type Estimates

We define the following quantity for later convenience:
\[ \mathcal{L}(x, y) := \sqrt{ \left\| \Delta_m^{-1} D^{(1)}P(x)(y) \right\|_2^2 + \|P(x)\|_2^2 }. \]
It follows directly that
\[ \frac{\|P\|_W}{\tilde{\kappa}(P, x)} = \sqrt{ \|P\|_W^2\, \tilde{\mu}_{\mathrm{norm}}(P, x)^{-2} + \|P(x)\|_2^2 } = \inf_{\substack{y \perp x \\ y \in S^{n-1}}} \mathcal{L}(x, y). \]
So we set $L(P, x) = \frac{\|P\|_W}{\tilde{\kappa}(P, x)}$ and $L(P) = \min_{x \in S^{n-1}} L(P, x)$. We then have the following equalities:
\[ L(P, x) = \inf_{\substack{y \perp x \\ y \in S^{n-1}}} \mathcal{L}(x, y), \qquad \tilde{\kappa}(P, x) = \frac{\|P\|_W}{L(P, x)}, \]
and, finally,
\[ \tilde{\kappa}(P) = \frac{\|P\|_W}{L(P)}. \]
In this section, we prove a small-ball type estimate to control the behavior of the denominator $L(P)$. We first need to recall a technical lemma from our earlier paper [20], which builds on an idea of Nguyen [33].

Lemma 4.1. Let $n \geq 2$, let $P := (p_1, \ldots, p_{n-1})$ be a system of $n$-variate homogeneous polynomials, and assume $\|P\|_\infty \leq \gamma$. Let $x, y \in S^{n-1}$ be mutually orthogonal vectors with $\mathcal{L}(x, y) \leq \alpha$, and let $r \in [-1, 1]$. Then for every $w$ with $w = x + \beta r y + \beta^2 z$ for some $z \in B_2^n$, we have the following inequalities:
(1) If $d := \max_i d_i$ and $0 < \beta \leq d^{-4}$ then $\|P(w)\|_2^2 \leq 8(\alpha^2 + (2 + e^4) \beta^4 d^4 \gamma^2)$.
(2) If $\deg(p_i) = d$ for all $i \in [n-1]$ and $0 < \beta \leq d^{-2}$ then $\|P(w)\|_2^2 \leq 8(\alpha^2 + (2 + e^4) \beta^4 d^4 \gamma^2)$.

We also need to state and prove the following simple lemma for the clarity of the succeeding proofs.

Lemma 4.2. Let $n \geq 1$ be an integer. Then for $0 \leq x \leq \frac{1}{n}$ we have $(1 + x)^n \leq 1 + 3nx$.

Proof. For every $0 \leq y \leq 1$ we have $1 + 3y \geq e^y$. This can be seen by setting $f(y) = 1 + 3y - e^y$ and observing that $f(0) = 0$ and $f'(y) = 3 - e^y > 0$ for all $0 \leq y \leq 1$. With a similar reasoning one can prove $e^x \geq 1 + x$, and hence $e^{nx} \geq (1 + x)^n$ for all $0 \leq x \leq 1$. Using $y = nx$ completes the proof.
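Lemma 4.2 is easy to sanity-check numerically. The short Python sketch below evaluates the gap $(1 + 3nx) - (1 + x)^n$ on a grid of admissible pairs $(n, x)$ with $0 \leq x \leq 1/n$; the function name `lemma_4_2_gap` and the grid ranges are arbitrary choices made for illustration, and a finite grid check is of course not a proof.

```python
def lemma_4_2_gap(n, x):
    """Return (1 + 3 n x) - (1 + x)^n, which Lemma 4.2 asserts is >= 0 for 0 <= x <= 1/n."""
    return (1 + 3 * n * x) - (1 + x) ** n


# Smallest gap over n = 1, ..., 199 and x = k / (100 n), k = 0, ..., 100.
worst_gap = min(lemma_4_2_gap(n, (k / 100) / n)
                for n in range(1, 200)
                for k in range(101))

# Outside the admissible range the inequality can fail, e.g. n = 10 and x = 1.
outside_gap = lemma_4_2_gap(10, 1.0)

print(worst_gap, outside_gap)
```

The smallest gap on the grid is attained at $x = 0$, where both sides equal 1, while the example with $x = 1 > 1/n$ shows that the restriction $x \leq 1/n$ cannot simply be dropped.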

Theorem 4.3. Let $D = (d_1, \ldots, d_{n-1})$ be a vector with positive integer coordinates, let $E_i \subseteq H_{d_i}$ be full linear subspaces, and let $E = (E_1, \ldots, E_{n-1})$. Let $p_i \in E_i$ be independent random elements of $E_i$ that satisfy the Centering Property, the Sub-Gaussian Property with constant $K$, and the Small Ball Property with constant $c_0$, each with respect to the Bombieri-Weyl inner product. Let $\gamma \geq 1$, $d := \max_i d_i$, and assume $\alpha \leq \min\{ d^{-8}, n^{-1} \}$. Then for $P = (p_1, \ldots, p_{n-1})$ we have
\[ \mathrm{Prob}(L(P) \leq \alpha) \leq \mathrm{Prob}\left( \|P\|_\infty \geq \gamma \right) + c\, \alpha^{\frac{1}{2}} \sqrt{n} \left( \frac{c_0 d^2 \gamma C}{\sigma_{\min}(E) \sqrt{n}} \right)^{n-1}, \]
where $C$ is a universal constant.

The proof of Theorem 4.3 is similar to a proof in our earlier paper [20]. We reproduce the proof here due to the importance of Theorem 4.3 in the flow of our current paper.

Proof. We assume the hypotheses of Assertion (1) in Lemma 4.1: Let $\alpha, \gamma > 0$ and $\beta \leq d^{-4}$. Let $\mathbf{B} := \{P \mid \|P\|_\infty \leq \gamma\}$ and let
$$\mathbf{L} := \{P \mid L(P) \leq \alpha\} = \{P \mid \text{There exist } x, y \in S^{n-1} \text{ with } x \perp y \text{ and } \mathcal{L}(x,y) \leq \alpha\}.$$
Let $\Gamma := 8(\alpha^2 + (2+e^4)\beta^4 d^4 \gamma^2)$ and let $B_2^n$ denote the unit $\ell_2$-ball in $\mathbb{R}^n$. Lemma 4.1 implies that if the event $\mathbf{B} \cap \mathbf{L}$ occurs then there exists a non-empty set
$$V_{x,y} := \{w \in \mathbb{R}^n : w = x + \beta r y + \beta^2 z,\ x \perp y,\ |r| \leq 1,\ z \perp y,\ z \in B_2^n\} \setminus B_2^n$$
such that $\|P(w)\|_2^2 \leq \Gamma$ for every $w$ in this set. Let $V := \operatorname{Vol}(V_{x,y})$. Note that for $w \in V_{x,y}$ we have $\|w\|_2^2 = \|x + \beta^2 z\|_2^2 + \|\beta r y\|_2^2 \leq 1 + 4\beta^2$. Hence we have $\|w\|_2 \leq 1 + 2\beta^2$. Since $V_{x,y} \subseteq (1+2\beta^2)B_2^n \setminus B_2^n$, we have shown that
$$\mathbf{B} \cap \mathbf{L} \subseteq \{P \mid \operatorname{Vol}(\{x \in (1+2\beta^2)B_2^n \setminus B_2^n \mid \|P(x)\|_2^2 \leq \Gamma\}) \geq V\}.$$

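Using Markov's Inequality, Fubini's Theorem, and Lemma 2.2, we can estimate the probability of this event. Indeed,
$$\begin{aligned}
\operatorname{Prob}\bigl(\operatorname{Vol}(\{x \in (1+2\beta^2)B_2^n \setminus B_2^n : \|P(x)\|_2^2 \leq \Gamma\}) \geq V\bigr)
&\leq \frac{1}{V}\,\mathbb{E}\operatorname{Vol}\bigl(\{x \in (1+2\beta^2)B_2^n \setminus B_2^n : \|P(x)\|_2^2 \leq \Gamma\}\bigr) \\
&\leq \frac{1}{V}\int_{(1+2\beta^2)B_2^n \setminus B_2^n} \operatorname{Prob}\bigl(\|P(x)\|_2^2 \leq \Gamma\bigr)\,dx \\
&\leq \frac{\operatorname{Vol}((1+2\beta^2)B_2^n \setminus B_2^n)}{V}\max_{x \in (1+2\beta^2)B_2^n \setminus B_2^n}\operatorname{Prob}\bigl(\|P(x)\|_2^2 \leq \Gamma\bigr).
\end{aligned}$$
Now recall that $\operatorname{Vol}(B_2^n) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2}+1\right)}$. Then $\frac{\operatorname{Vol}(B_2^n)}{\operatorname{Vol}(B_2^{n-1})} \leq \frac{c'}{\sqrt{n}}$ for some constant $c' > 0$. If we assume that $\beta^2 \leq \frac{1}{2n}$, then Lemma 4.2 implies $(1+2\beta^2)^n \leq 1 + 6n\beta^2$, and we obtain
$$\frac{\operatorname{Vol}((1+2\beta^2)B_2^n \setminus B_2^n)}{V} \leq \frac{\operatorname{Vol}(B_2^n)\bigl((1+2\beta^2)^n - 1\bigr)}{\beta(\beta^2)^{n-1}\operatorname{Vol}(B_2^{n-1})} \leq c\sqrt{n}\,\beta\,\beta^{2-2n},$$
for some absolute constant $c > 0$. Note that here, for a lower bound on $V$, we used the fact that $V_{x,y}$ contains more than half of a cylinder with base having radius $\beta^2$ and height $2\beta$. Writing $\tilde{x} := \frac{x}{\|x\|_2}$ for any $x \neq 0$ we then obtain, for $z \notin B_2^n$, that
$$\|P(z)\|_2^2 = \sum_{j=1}^{m} |p_j(z)|^2 = \sum_{j=1}^{m} |p_j(\tilde{z})|^2\,\|z\|_2^{2d_j} \geq \sum_{j=1}^{m} |p_j(\tilde{z})|^2 = \|P(\tilde{z})\|_2^2,$$
where $m = n-1$ is the number of polynomials in the system. This implies, via Lemma 2.2, that for every $w \in (1+2\beta^2)B_2^n \setminus B_2^n$ we have
$$\operatorname{Prob}\bigl(\|P(w)\|_2^2 \leq \Gamma\bigr) \leq \operatorname{Prob}\bigl(\|P(\tilde{w})\|_2^2 \leq \Gamma\bigr) \leq \left(c\,c_0\sqrt{\frac{\Gamma}{n\,\sigma_{\min}(E)^2}}\right)^{n-1}.$$
So we conclude that $\operatorname{Prob}(\mathbf{B} \cap \mathbf{L}) \leq c\sqrt{n}\,\beta^{3-2n}\left(c\,c_0\sqrt{\frac{\Gamma}{n\,\sigma_{\min}(E)^2}}\right)^{n-1}$. Since $\operatorname{Prob}(L(P) \leq \alpha) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + \operatorname{Prob}(\mathbf{B} \cap \mathbf{L})$ we then have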

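$$\operatorname{Prob}(L(P) \leq \alpha) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + c\sqrt{n}\,\beta^{3-2n}\left(c\,c_0\sqrt{\frac{\Gamma}{n\,\sigma_{\min}(E)^2}}\right)^{n-1}.$$
Recall that $\Gamma = 8(\alpha^2 + (2+e^4)\beta^4 d^4 \gamma^2)$. We set $\beta^2 := \alpha$. Our choice of $\beta$ and the assumption that $\gamma \geq 1$ then imply that $\Gamma \leq C\alpha^2\gamma^2 d^4$ for some constant $C$. So we obtain
$$\operatorname{Prob}(L(P) \leq \alpha) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + c\sqrt{n}\,\alpha^{\frac{3}{2}-n}\left(\frac{c_0 C \alpha d^2 \gamma}{\sigma_{\min}(E)\sqrt{n}}\right)^{n-1}$$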

Collecting the powers of $\alpha$ gives
$$\operatorname{Prob}(L(P) \leq \alpha) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + c\sqrt{n}\,\alpha^{1/2}\left(\frac{c_0 d^2 \gamma C}{\sigma_{\min}(E)\sqrt{n}}\right)^{n-1}$$
and our proof is complete.
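The proof above relies on the bound $\operatorname{Vol}(B_2^n)/\operatorname{Vol}(B_2^{n-1}) \leq c'/\sqrt{n}$. The following sketch checks this numerically from the volume formula $\pi^{n/2}/\Gamma(n/2+1)$; the function names are ours, and the observation that $c' = \sqrt{2\pi}$ suffices is our own remark (it follows from Gautschi's inequality for ratios of Gamma functions) rather than a statement from the paper.

```python
import math

def log_ball_volume(n):
    """Logarithm of Vol(B_2^n) = pi^(n/2) / Gamma(n/2 + 1)."""
    return 0.5 * n * math.log(math.pi) - math.lgamma(0.5 * n + 1.0)

def scaled_ratio(n):
    """sqrt(n) * Vol(B_2^n) / Vol(B_2^(n-1)); the proof needs this to stay bounded in n."""
    return math.sqrt(n) * math.exp(log_ball_volume(n) - log_ball_volume(n - 1))

for n in (2, 3, 10, 100, 1000):
    # The values increase toward sqrt(2*pi) without reaching it.
    print(n, scaled_ratio(n))
```

Working with logarithms via `math.lgamma` avoids overflow of the Gamma function for large $n$.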

5. Proof of Theorem 1.14

We first need to estimate the Bombieri norm of a random polynomial system. The following lemma is more or less standard, and it follows from Lemma 2.3.

Lemma 5.1. Let $D = (d_1,\ldots,d_{n-1})$ be a vector with positive integer coordinates, let $E_i \subseteq H_{d_i}$ be full linear subspaces, and let $E = (E_1,\ldots,E_{n-1})$. Let $p_i \in E_i$ be random elements of $E_i$ that satisfy the Centering Property and the Sub-Gaussian Property with constant $K$, each with respect to the Bombieri-Weyl inner product. Then for all $t \geq 1$, we have
$$\operatorname{Prob}\left(\|p_i\|_W \geq t\sqrt{\dim(E_i)}\right) \leq \exp\left(1 - \frac{t^2\dim(E_i)}{K^2}\right)$$
and for the random polynomial system $P = (p_1,\ldots,p_{n-1})$ we have
$$\operatorname{Prob}\left(\|P\|_W \geq t\sqrt{\dim(E)}\right) \leq \exp\left(1 - \frac{t^2\dim(E)}{K^2}\right).$$
Now we have all the necessary tools to prove our probabilistic condition number theorem. We will prove the following statement:

Theorem 5.2. Let $D = (d_1,\ldots,d_{n-1})$ be a vector with positive integer coordinates, let $E_i \subseteq H_{d_i}$ be non-degenerate linear subspaces, and let $E = (E_1,\ldots,E_{n-1})$. We assume that $\dim(E) \geq n\log(ed)^2$ and $n \geq 3$. Let $p_i \in E_i$ be independent random elements of $E_i$ that satisfy the Centering Property, the Sub-Gaussian Property with constant $K$, and the Small Ball Property with constant $c_0$, each with respect to the Bombieri-Weyl inner product. We set $d := \max_i d_i$, and

$$M := nK\sqrt{\dim(E)}\,\bigl(c_0 d^2 C K \log(ed)\,\sigma(E)\bigr)^{2n-2}$$

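where $C \geq 4$ is a universal constant. Then for $P = (p_1,\ldots,p_{n-1})$, we have
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \begin{cases} \dfrac{3}{\sqrt{t}} & \text{if } 1 \leq t \leq e^{2n\log(ed)}, \\[2ex] \dfrac{e^2+1}{\sqrt{t}}\left(\dfrac{\log t}{2n\log(ed)}\right)^{n/2} & \text{if } e^{2n\log(ed)} \leq t. \end{cases}$$
For notational simplicity we set $m = \dim(E)$. To start the proof we observe the following:
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \operatorname{Prob}\left(\|P\|_W \geq sK\sqrt{m}\right) + \operatorname{Prob}\left(L(P) \leq \frac{sK\sqrt{m}}{tM}\right).$$
The first probability on the right-hand side will be controlled by Lemma 5.1, and the second will be controlled by Theorem 4.3. Theorem 4.3 states that for any $\gamma \geq 1$ and for $\frac{sK\sqrt{m}}{tM} \leq \min\{d^{-8}, n^{-1}\}$, we have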

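$$\operatorname{Prob}\left(L(P) \leq \frac{sK\sqrt{m}}{tM}\right) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + \left(\frac{sK\sqrt{m}}{tM}\right)^{1/2}\sqrt{n}\left(\frac{c_0 C \gamma d^2}{\sigma_{\min}(E)\sqrt{n}}\right)^{n-1}.$$
To have $\frac{sK\sqrt{m}}{tM} \leq \min\{d^{-8}, n^{-1}\}$ is equivalent to $tM\min\{d^{-8}, n^{-1}\} \geq sK\sqrt{m}$. We will check this condition at the end of the proof. Now, for $\gamma = u\,\sigma_{\max}(E)\sqrt{n}\log(ed)K$ with $u \geq 1$, from Lemma 3.1 we have $\operatorname{Prob}\bigl(\|P\|_\infty \geq u\,\sigma_{\max}(E)\sqrt{n}\log(ed)K\bigr) \leq \exp(1 - a_3 u^2 n\log(ed)^2)$. That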

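is, for $\gamma = u\,\sigma_{\max}(E)\sqrt{n}\log(ed)K$, we have the following estimate:
$$\operatorname{Prob}\left(L(P) \leq \frac{sK\sqrt{m}}{tM}\right) \leq \exp(1 - a_3 u^2 n\log(ed)^2) + \sqrt{n}\left(\frac{sK\sqrt{m}}{tM}\right)^{1/2}\left(\frac{c_0 C u\,\sigma_{\max}(E)\log(ed)\,d^2 K}{\sigma_{\min}(E)}\right)^{n-1}.$$
Since $\sigma(E) = \frac{\sigma_{\max}(E)}{\sigma_{\min}(E)}$ and $M = nK\sqrt{m}\,\bigl(c_0 C\log(ed)\,d^2 K\sigma(E)\bigr)^{2n-2}$, we have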

$$\operatorname{Prob}\left(L(P) \leq \frac{sK\sqrt{m}}{tM}\right) \leq \exp(1 - a_3 u^2 n\log(ed)) + \left(\frac{s}{t}\right)^{1/2}u^{n-1}.$$
Using Lemma 5.1 and the assumption that $m \geq n\log(ed)^2$ we then obtain
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \exp(1 - s^2 n\log(ed)^2) + \exp(1 - a_3 u^2 n\log(ed)) + \left(\frac{s}{t}\right)^{1/2}u^{n-1}.$$
If $t \leq e^{2n\log(ed)}$ then setting $s = u = 1$ gives the desired inequality. If $t \geq e^{2n\log(ed)}$ then we set $s = u^2 = \frac{\log(t)}{2\tilde{a}n\log(ed)}$, where $\tilde{a} > a_3 > 0$ is a constant greater than 1. We then obtain
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \exp\left(2 - \frac{1}{2}\log(t)\right) + \left(\frac{\log(t)}{2n\log(ed)}\right)^{n/2}\frac{1}{\sqrt{t}}.$$
Observe that $\exp\left(2 - \frac{1}{2}\log t\right) = \frac{e^2}{\sqrt{t}}$. So we have $\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \left(\frac{\log(t)}{2n\log(ed)}\right)^{n/2}\frac{e^2+1}{\sqrt{t}}$. To finalize our proof we need to check if $tM\min\{d^{-8}, n^{-1}\} \geq sK\sqrt{m}$. So we check the following:
$$tKn\sqrt{m}\,\bigl(c_0 C\log(ed)\,d^2 K\sigma(E)\bigr)^{2n-2}\min\{d^{-8}, n^{-1}\} \overset{?}{\geq} \frac{\log(t)}{2n\log(ed)}K\sqrt{m}.$$
For $n \geq 3$ we have $(d^2\log(ed))^{2n-2} > d^8$. Since $Kc_0 \geq \frac{1}{4}$, $C \geq 4$, and $\sigma(E) \geq 1$, we have $\bigl(c_0 C\log(ed)\,d^2 K\sigma(E)\bigr)^{2n-2} > d^8$. Hence, it suffices to check if $t \geq \frac{\log(t)}{2n\log(ed)}$, which is clear.

We would like to complete the proof of Theorem 1.14 as it was stated in the introduction, for which the following easy observation suffices.

Lemma 5.3. For $t \geq e^{2n\log(ed)}$, we have $\left(\frac{\log(t)}{2n\log(ed)}\right)^{n/2} \leq t^{\frac{1}{4\log(ed)}}$.

Proof. Let $t = x\,e^{2n\log(ed)}$ where $x \geq 1$. Then
$$\left(\frac{\log(t)}{2n\log(ed)}\right)^{n/2} = \left(1 + \frac{\log(x)}{2n\log(ed)}\right)^{n/2} \leq e^{\frac{\log(x)}{4\log(ed)}} = x^{\frac{1}{4\log(ed)}}.$$
Since $x \leq t$, we are done.

We now state the resulting bounds on the expectation of the condition number.

Corollary 5.4. Under the assumptions of Theorem 5.2, 0

$$\mathbb{E}(\tilde{\kappa}(P)^q) \leq M^q + qM^q\int_{t=1}^{\infty}\mathbb{P}\{\tilde{\kappa}(P) \geq tM\}\,t^{q-1}\,dt.$$

For $t \geq e^{2n\log(ed)}$, we have
$$\mathbb{P}\{\tilde{\kappa}(P) \geq tM\}\,t^{q-1} \leq t^{\,q + \frac{1}{4\log(ed)} - \frac{3}{2}} \leq t^{-1-\frac{1}{4\log(ed)}}.$$
For $t \leq e^{2n\log(ed)}$ we have even stronger tail bounds, so
$$\mathbb{E}(\tilde{\kappa}(P)^q) \leq M^q\left(1 + q\int_{t=1}^{\infty}t^{-1-\frac{1}{4\log(ed)}}\,dt\right).$$
This proves the first claim. The second claim follows by sending $q \to 0$ and using Jensen's inequality.
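The integral in the last display can be evaluated in closed form: with $a = \frac{1}{4\log(ed)}$ we have $\int_1^\infty t^{-1-a}\,dt = 1/a$, so the factor multiplying $M^q$ equals $1 + 4q\log(ed)$. The sketch below compares this closed form with a simple numerical quadrature after the substitution $t = e^s$. The function names, the quadrature parameters, and the sample values $q = 0.25$, $d = 3$ are illustrative assumptions of ours, not values taken from the paper.

```python
import math

def moment_factor(q, d):
    """Closed form of 1 + q * integral_1^inf t^(-1-a) dt with a = 1/(4 log(ed))."""
    a = 1.0 / (4.0 * math.log(math.e * d))
    return 1.0 + q / a

def moment_factor_numeric(q, d, steps=200000, s_max=2000.0):
    """Same quantity via t = exp(s): the integrand becomes exp(-a*s) on [0, inf).

    A midpoint rule on [0, s_max] is used; the tail beyond s_max is negligible
    for the moderate values of d used here.
    """
    a = 1.0 / (4.0 * math.log(math.e * d))
    h = s_max / steps
    integral = sum(math.exp(-a * (k + 0.5) * h) for k in range(steps)) * h
    return 1.0 + q * integral

print(moment_factor(0.25, 3), moment_factor_numeric(0.25, 3))
```

In particular the factor grows only logarithmically in the degree $d$.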

6. Proof of Theorem 1.16

Let $E_i \subseteq H_{d_i}$ be non-degenerate linear spaces, and let $E = (E_1,\ldots,E_{n-1})$. Suppose $Q \in E$ is a fixed polynomial system. Let $g_i \in E_i$ be independent random elements of $E_i$ that satisfy the Centering Property, the Sub-Gaussian Property with constant $K$, and the Anti-Concentration Property with constant $c_0$, each with respect to the Bombieri-Weyl inner product. Let $G := (g_1,\ldots,g_{n-1})$ be the corresponding polynomial system. We define a random perturbation of $Q$ as follows: $P := Q + G$. We will use this notation for $P$, $Q$ and $G$ for the rest of this section.

Lemma 6.1. Let $Q \in E$ be a polynomial system, let $G$ be a random polynomial system in $E$ that satisfies the Centering, sub-Gaussian, and Anti-Concentration hypotheses, and let $P = Q + G$. Then we have
$$\operatorname{Prob}\left(\|P\|_\infty \geq s\,\sigma_{\max}(E)\sqrt{n}\log(ed) + \|Q\|_\infty\right) \leq \exp\left(1 - \frac{a_3 s^2 n\log(ed)}{K^2}\right)$$
where $a_3$ is an absolute constant.

Proof. The triangle inequality implies $\|P\|_\infty \leq \|Q\|_\infty + \|G\|_\infty$. We complete the proof by using Lemma 3.1 for the random system $G$.

Lemma 6.2. Let $Q \in E$ be a polynomial system, let $G$ be a random polynomial system in $E$ that satisfies the Centering, sub-Gaussian, and Anti-Concentration hypotheses, and let $P = Q + G$. Then, for all $\varepsilon > 0$, and for any $w \in S^{n-1}$ we have
$$\operatorname{Prob}\left(\|P(w)\|_2 \leq \varepsilon\,\sigma_{\min}(E)\sqrt{n-1}\right) \leq (a_2 c_0\varepsilon)^{n-1}$$
where $a_2$ is an absolute constant.

Proof. By the Anti-Concentration Property, for all $1 \leq i \leq n-1$, we have
$$\operatorname{Prob}\{|g_i(w) + q_i(w)| \leq c_0\varepsilon\,\sigma_{\min}(E_i)\} \leq c_0\varepsilon.$$
We then use Lemma 2.4 with the random variables $g_i(w) + q_i(w)$.

Lemma 6.3. Let $Q \in E$ be a polynomial system, let $G$ be a random polynomial system in $E$ that satisfies the Centering, sub-Gaussian, and Anti-Concentration hypotheses, and let $P = Q + G$. Then for all $t \geq 1$, we have
$$\operatorname{Prob}\left(\|P\|_W \geq tK\sqrt{\dim(E)} + \|Q\|_W\right) \leq \exp(1 - t^2 m),$$
where $m = \dim(E)$.

Proof. For all $1 \leq i \leq n-1$, by the triangle inequality, $\|p_i\|_W \leq \|q_i\|_W + \|g_i\|_W$. So using the first claim of Lemma 5.1 gives
$$\operatorname{Prob}\left(\|p_i\|_W \geq t\sqrt{\dim(E_i)} + \|q_i\|_W\right) \leq \exp\left(1 - \frac{t^2\dim(E_i)}{K^2}\right).$$
Note that $\|P\|_W = \max_{\|w\|_2 = 1}\bigl|\langle w, (\|p_1\|_W,\ldots,\|p_{n-1}\|_W)\rangle\bigr|$. So proceeding as in the proof of Lemma 2.2 completes the proof.

Theorem 6.4. Let $Q \in E$ be a polynomial system, let $G$ be a random polynomial system in $E$ that satisfies the Centering, sub-Gaussian, and Anti-Concentration hypotheses, and let $P = Q + G$. Now let $\gamma \geq 1$, $d := \max_i d_i$, and assume $\alpha \leq \min\{d^{-8}, n^{-1}\}$. Then
$$\operatorname{Prob}(L(P) \leq \alpha) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + c\,\alpha^{1/2}\sqrt{n}\left(\frac{c_0 d^2\gamma C}{\sigma_{\min}(E)\sqrt{n}}\right)^{n-1}$$
where $C$ is a universal constant.

The proof of Theorem 6.4 is identical to that of Theorem 4.3, so we skip it. Now we are ready to state the main theorem of this section.

Theorem 6.5. Let $Q \in E$ be a polynomial system, let $G$ be a random polynomial system in $E$ that satisfies the Centering, sub-Gaussian, and Anti-Concentration hypotheses, and let $P = Q + G$. Also let $d := \max_i d_i$, and set
$$M = nK\sqrt{\dim(E)}\,\bigl(c_0 d^2 CK\log(ed)\,\sigma(E)\bigr)^{2n-2}\left(1 + \frac{\|Q\|_W}{\sqrt{n}K\log(ed)}\right)^{2n-1}$$
where $C \geq 4$ is a universal constant. Assume also that $\dim(E) \geq n\log(ed)^2$ and $n \geq 3$. Then
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \begin{cases} \dfrac{3}{\sqrt{t}} & \text{if } 1 \leq t \leq e^{2n\log(ed)}, \\[2ex] \dfrac{e^2+1}{\sqrt{t}}\left(\dfrac{\log t}{2n\log(ed)}\right)^{n/2} & \text{if } e^{2n\log(ed)} \leq t. \end{cases}$$

Proof. We need a quick observation before we start our proof: For any $Q \in E$ and $w \in S^{n-1}$, we have $\|Q(w)\|_2^2 \leq \sum_{i=1}^{n-1}\|q_i\|_W^2\,\sigma_{\max}(E_i)^2 \leq \|Q\|_W^2\,\sigma_{\max}(E)^2$. So we have
$$\|Q\|_\infty \leq \|Q\|_W\,\sigma_{\max}(E).$$
Using this upper bound on $\|Q\|_\infty$ and the assumption that $\dim(E) \geq n\log(ed)^2$, we deduce
$$M \geq nK\sqrt{\dim(E)}\,\bigl(c_0 d^2 CK\log(ed)\,\sigma(E)\bigr)^{2n-2}\left(1 + \frac{\|Q\|_W}{nK\sqrt{\dim(E)}}\right)\left(1 + \frac{\|Q\|_\infty}{\sqrt{n}\log(ed)K\sigma_{\max}(E)}\right)^{2n-2}.$$
We will use this lower bound on $M$ later in our proof. Now let $m = \dim(E)$. We start our proof with the following observation:
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \operatorname{Prob}\left(\|P\|_W \geq sK\sqrt{m} + \|Q\|_W\right) + \operatorname{Prob}\left(L(P) \leq \frac{sK\sqrt{m} + \|Q\|_W}{tM}\right).$$
Lemma 6.3 states that
$$\operatorname{Prob}\left(\|P\|_W \geq sK\sqrt{m} + \|Q\|_W\right) \leq \exp(1 - s^2 m).$$
Theorem 6.4 states that for $\frac{sK\sqrt{m} + \|Q\|_W}{tM} \leq \min\{d^{-8}, n^{-1}\}$ we have
$$\operatorname{Prob}\left(L(P) \leq \frac{sK\sqrt{m} + \|Q\|_W}{tM}\right) \leq \operatorname{Prob}(\|P\|_\infty \geq \gamma) + c\left(\frac{sK\sqrt{m} + \|Q\|_W}{tM}\right)^{1/2}\sqrt{n}\left(\frac{c_0 d^2\gamma C}{\sigma_{\min}(E)\sqrt{n}}\right)^{n-1}.$$
We set $\gamma = u\,\sigma_{\max}(E)\sqrt{n}\log(ed)K + \|Q\|_\infty$. From Lemma 6.1, we have
$$\operatorname{Prob}\left(\|P\|_\infty \geq u\,\sigma_{\max}(E)\sqrt{n}\log(ed)K + \|Q\|_\infty\right) \leq \exp(1 - a_3 u^2 n\log(ed)).$$

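We also have
$$\left(\frac{c_0 d^2\gamma C}{\sigma_{\min}(E)\sqrt{n}}\right)^{n-1} = \bigl(c_0 u d^2 CK\log(ed)\,\sigma(E)\bigr)^{n-1}\left(1 + \frac{\|Q\|_\infty}{u\sqrt{n}\log(ed)K\sigma_{\max}(E)}\right)^{n-1}.$$
Using $u \geq 1$, $s \geq 1$, $m \geq n\log(ed)^2$, and the lower bound obtained on $M$, we obtain
$$\operatorname{Prob}(\tilde{\kappa}(P) \geq tM) \leq \exp(1 - s^2 n\log(ed)) + \exp(1 - a_3 u^2 n\log(ed)) + \left(\frac{s}{t}\right)^{1/2}u^{n-1}.$$
The rest of the proof is identical to the proof of Theorem 5.2.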
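To get a feeling for the two regimes of the tail estimate shared by Theorems 5.2 and 6.5, one can evaluate its right-hand side directly. The sketch below is only an illustration: the function name `tail_bound` and the sample parameters are our own, and the comparison is done on a logarithmic scale to avoid overflowing $e^{2n\log(ed)}$.

```python
import math

def tail_bound(t, n, d):
    """Right-hand side of the estimate for Prob(kappa >= t*M) in Theorems 5.2 and 6.5."""
    if t < 1.0:
        raise ValueError("the bound is stated for t >= 1")
    log_threshold = 2.0 * n * math.log(math.e * d)  # log of e^{2 n log(ed)}
    log_t = math.log(t)
    if log_t <= log_threshold:
        return 3.0 / math.sqrt(t)
    return (math.e ** 2 + 1.0) / math.sqrt(t) * (log_t / log_threshold) ** (n / 2.0)

# Sample evaluation with n = 3, d = 2 (illustrative values only).
for t in (1.0, 9.0, 36.0, 1e4, math.exp(40.0)):
    print(t, tail_bound(t, 3, 2))
```

Note that the bound is only informative once $t > 9$, since $3/\sqrt{t} \geq 1$ before that; the value $t = 36$ gives exactly $1/2$.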

7. Proof of Theorem 1.18

Define a random polynomial system $F_\varepsilon = Q + G$ where $G$ is a Gaussian random polynomial system with $K = \frac{\varepsilon\|Q\|_W}{\sqrt{n}\log(ed)}$ and $c_0 K = \frac{1}{\sqrt{2\pi}}$. Using Lemma 5.1 with $t = 1$, we have with probability at least $1 - \exp(1 - \dim(E))$ that
$$\|F_\varepsilon - Q\|_W = \|G\|_W \leq \frac{\varepsilon\|Q\|_W\sqrt{\dim(E)}}{\sqrt{n}\log(ed)}.$$
For the condition estimate we will use Theorem 6.5: First note that with $K = \frac{\varepsilon\|Q\|_W}{\sqrt{n}\log(ed)}$ and $c_0 K = \frac{1}{\sqrt{2\pi}}$, the quantity $M$ in Theorem 6.5 is the following:
$$M = \frac{\varepsilon\sqrt{n}\sqrt{\dim(E)}}{\log(ed)}\left(\frac{d^2 C\log(ed)\,\sigma(E)}{\sqrt{2\pi}}\right)^{2n-2}\left(1 + \frac{1}{\varepsilon}\right)^{2n-1}.$$
So we have $M \leq 2\sqrt{n}\sqrt{\dim(E)}\;2^{2n-2}\left(\frac{d^2 C\log(ed)\,\sigma(E)}{\varepsilon\sqrt{2\pi}}\right)^{2n-2}$. Using Theorem 6.5 with $t = 36$ we deduce that with probability greater than $\frac{1}{2}$ we have
$$\tilde{\kappa}(F_\varepsilon) \leq 2\sqrt{n}\sqrt{\dim(E)}\left(\frac{d^2 C\log(ed)\,\sigma(E)}{\varepsilon}\right)^{2n-2}.$$
Since the union of the complements of these two events has measure less than $\frac{1}{2} + \exp(1 - \dim(E))$, their intersection has positive measure, and the proof is completed.

Remark 7.1. The proof of Theorem 6.5 actually works for
$$M = nK\sqrt{\dim(E)}\,\bigl(c_0 d^2 CK\log(ed)\,\sigma(E)\bigr)^{2n-2}\,(1 + \|Q\|_W)\left(1 + \frac{\|Q\|_\infty}{\sqrt{n}\log(ed)K\sigma_{\max}(E)}\right)^{2n-2},$$
which is often much smaller than the $M$ used in the theorem statement. ⋄

8. Appendix: The Dispersion Constants of Random Subspaces of Polynomial Systems

Here we address the question of how big the dispersion constant is for a "typical" low-dimensional linear space. Imagine we have fixed a dimension $m \sim n\log d$ and wish to consider subspaces of dimension $m$ inside $H_d$ (the vector space of homogeneous polynomials of degree $d$). How does the dispersion constant vary among these subspaces? We know that some of these subspaces will be degenerate and have infinite dispersion constant. Can we argue that high dispersion constants are rare? To address this problem, we represent the space of $m$-dimensional linear subspaces of $H_d$ by the Grassmannian variety, $\mathrm{Gr}(m, \dim(H_d))$, which comes equipped with a Haar measure.

We will analyze the Haar measure of the set of subspaces in $\mathrm{Gr}(m, \dim(H_d))$ that yield a high dispersion constant (see Corollary 8.4 below). We will first need to introduce the following notion from high-dimensional probability.

Definition 8.1 (Gaussian Complexity). Let $X \subseteq \mathbb{R}^n$ be a set. Then the Gaussian complexity of $X$, denoted by $\gamma(X)$, is defined as follows:

$$\gamma(X) := \mathbb{E}\sup_{x \in X}|\langle G, x\rangle|$$
where $G$ is distributed according to the standard normal distribution $\mathcal{N}(0, I)$ on $\mathbb{R}^n$. ⋄

The use of the term complexity in Definition 8.1 might look unorthodox to readers with a computational complexity theory background. The rationale behind this standard terminology in high-dimensional probability is that the Gaussian complexity of a set $X$ is known to control the complexity of stochastic processes indexed on the set $X$ (see, e.g., [44]). A corollary of Lemma 2.1 and Lemma 2.8 is the following.

Corollary 8.2 (Gaussian Complexity of the Veronese Embedding). Let $H_d$ be the vector space of degree $d$ homogeneous polynomials in $n$ variables. Let $u_i$, $i = 1,\ldots,\binom{n+d-1}{d}$, be an orthonormal basis for the vector space $H_d$ with respect to the Bombieri-Weyl norm. For every $v \in S^{n-1}$, we define the following polynomial $q_v$:
$$q_v(x) := \sum_i u_i(v)\,u_i(x)$$
and the following set created out of the $q_v$:
$$B_d := \{q_v : v \in S^{n-1}\}.$$
Then we have $\gamma(B_d) \leq c\sqrt{n}\log(ed)$ for a universal constant $c$.

Proof of Corollary 8.2: We need to consider a Gaussian element $G$ in the vector space $H_d$. Note that for $G \sim \mathcal{N}(0, I)$ in $H_d$ we have $\left\langle G, \sqrt{\binom{d}{\alpha}}\,x^\alpha\right\rangle_W \sim \mathcal{N}(0,1)$ since the $\sqrt{\binom{d}{\alpha}}\,x^\alpha$ form an orthonormal basis with respect to the Bombieri-Weyl inner product. This means Gaussian elements of $H_d$ are included in our model of randomness for the special case $K = 1$. Since $\sigma_{\max}(H_d) = 1$, Lemma 2.1 gives us the following estimate for pointwise evaluations of the Gaussian element $G \sim \mathcal{N}(0, I)$ in $H_d$:
$$\operatorname{Prob}\{|G(v)| \geq t\} \leq \exp\left(1 - \frac{t^2}{2}\right).$$
Note that $\|G\|_\infty = \max_{v \in S^{n-1}}|G(v)| = \max_{q_v \in B_d}|\langle G, q_v\rangle|$. So to estimate the Gaussian complexity of the Veronese embedding $B_d$, we need to estimate $\mathbb{E}\|G\|_\infty$. Let $\mathcal{N}$ be a $\delta$-net on the sphere $S^{n-1}$. Using a union bound, we then have
$$\operatorname{Prob}\left\{\max_{v \in \mathcal{N}}|G(v)| \geq t\right\} \leq |\mathcal{N}|\exp\left(1 - \frac{t^2}{2}\right).$$
Setting $\delta = \frac{1}{d}$ and using Lemma 2.8 for $t \geq a_1\sqrt{n}\log(ed)$ then gives the following:
$$\operatorname{Prob}\left\{\|G\|_\infty \geq a_1 t\sqrt{n}\log(ed)\right\} \leq |\mathcal{N}|\exp\left(1 - \frac{a_1^2 t^2 n\log(ed)}{2}\right).$$

It is known that $|\mathcal{N}| \leq \exp(a_0 n\log d)$. So we have
$$|\mathcal{N}|\exp\left(1 - \frac{a_1^2 t^2 n\log(ed)}{2}\right) \leq \exp(1 - a_2 t^2 n\log(ed))$$
for some constant $a_2$. So $\operatorname{Prob}\{\|G\|_\infty \geq a_1 t\sqrt{n}\log(ed)\} \leq \exp(1 - a_2 t^2 n\log(ed))$. Using this inequality one can routinely derive the estimate for $\mathbb{E}\|G\|_\infty$.

Since Talagrand proved his celebrated "majorizing measure theorem" (see [43]) it has been observed that for a set $X$ and a random $k \times n$ sub-Gaussian matrix $A$, the deviation $\mathbb{E}\sup_{x \in X}\bigl|\|Ax\|_2 - \mathbb{E}\|Ax\|_2\bigr|$ is controlled by the Gaussian complexity $\gamma(X)$. We will use a variant established in [25] but not stated explicitly:

Theorem 8.3. Let $F$ be a random $m$-dimensional subspace of $\mathbb{R}^n$ drawn from the Haar measure on $\mathrm{Gr}(m,n)$, and let $P_F$ be the orthogonal projection map onto $F$. Let $X \subseteq \mathbb{R}^n$ be a set. Then there is a universal constant $C$ such that

$$\sup_{x \in X}\bigl|\sqrt{n}\,\|P_F(x)\| - \sqrt{m}\,\|x\|\bigr| \leq Ct\gamma(X), \qquad t \geq 1,$$
with probability greater than $1 - e^{-t^2}$.

There is a series of papers that established several variants of the preceding two deviation bounds, mainly [38, 25] and, more recently, [19, 30]. Vershynin devoted the 9th chapter of his recent book [44] to these results and their applications. Theorem 8.3 follows easily upon combining some statements and exercises from [44, Ch. 9]. We include a sketch of the proof below for the interested reader.
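Before turning to the proof, here is a small numerical illustration of the two quantities that Theorem 8.3 compares. Everything in this sketch is an assumption made for illustration: we take $X$ to be the standard basis of $\mathbb{R}^n$ (not the set $B_d$ used later), we sample a Haar-distributed subspace by applying Gram-Schmidt to Gaussian vectors, and we estimate $\gamma(X)$ by Monte Carlo. The function names are hypothetical and the universal constant $C$ is not determined by such an experiment.

```python
import math
import random

def random_subspace_basis(m, n, rng):
    """Orthonormal basis of a random m-dimensional subspace of R^n.

    Gram-Schmidt applied to i.i.d. Gaussian vectors; the resulting span is
    rotation invariant, hence Haar distributed on Gr(m, n).
    """
    basis = []
    while len(basis) < m:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for b in basis:
            dot = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - dot * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        if norm > 1e-12:
            basis.append([vi / norm for vi in v])
    return basis

def projection_norm(basis, x):
    """Euclidean norm of the orthogonal projection of x onto span(basis)."""
    return math.sqrt(sum(sum(bi * xi for bi, xi in zip(b, x)) ** 2 for b in basis))

def gaussian_complexity(points, n, rng, trials=2000):
    """Monte Carlo estimate of E sup_{x in X} |<G, x>| for a finite set X."""
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += max(abs(sum(gi * xi for gi, xi in zip(g, x))) for x in points)
    return total / trials

rng = random.Random(0)
n, m = 40, 10
X = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]  # unit vectors
basis = random_subspace_basis(m, n, rng)
# Left-hand side of Theorem 8.3 for this X (all points have norm 1).
deviation = max(abs(math.sqrt(n) * projection_norm(basis, x) - math.sqrt(m)) for x in X)
gamma_X = gaussian_complexity(X, n, rng)
print(deviation, gamma_X)
```

For this choice of $X$ the Gaussian complexity is the expected maximum of $n$ absolute standard normals, which is at most $\sqrt{2\log(2n)}$.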

Proof of Theorem 8.3. Let $x \in X$ and consider the random process $W_x := \sqrt{n}\,\|Px\| - \sqrt{m}\,\|x\|$. By [25, Lemma 4.2] and [38] we have that $W_x$ is a subgaussian process on $X$, i.e.,
$$\mathbb{P}\bigl(\bigl|\|Px\| - \|Py\|\bigr| \geq s\|x - y\|\bigr) \leq 2e^{-cs^2}, \qquad s > 0,$$
where $c > 0$ is an absolute constant and $x, y \in X$, or equivalently
$$\bigl\|\,\|Px\| - \|Py\|\,\bigr\|_{\psi_2} \leq c\,\|x - y\|.$$
In [25, Lemma 4.2], the above inequality is stated for $x, y \in S^{n-1}$. To extend it to every $x, y$ is straightforward, and we explain the idea below (see, e.g., the proof of Lemma 9.1.4 in [44] or [30] for details). By scaling, without loss of generality we may assume that $\|x\| = 1$, $\|y\| \geq 1$. Set $\bar{y} = \frac{y}{\|y\|}$. Note that
$$\|W_y - W_{\bar{y}}\|_{\psi_2} = \|y - \bar{y}\|\,\|W_{\bar{y}}\|_{\psi_2} \leq C\|y - \bar{y}\|$$
for a universal constant $C$. Using all the above and the triangle inequality we get that
$$\|W_x - W_y\|_{\psi_2} \leq C\bigl(\|x - \bar{y}\| + \|y - \bar{y}\|\bigr) \leq \sqrt{2}\,C\|x - y\|.$$
Now that we have established that $W_x$ for $x \in X$ is a subgaussian process we may apply [19, Theorem 3.2] or [43, Theorem 2.2.27] to conclude the proof. For example the latter states that
$$\mathbb{P}\left(\sup_{x,y \in X}|W_x - W_y| \geq C\bigl(\gamma_2(X, \|\cdot\|) + s\operatorname{diam}(X)\bigr)\right) \leq 2e^{-s^2}.$$
Here $\operatorname{diam}(X) := \max_{x,y \in X}\|x - y\|_2$ and $\gamma_2$ is Talagrand's functional (see [44], Definition 8.5.1 for details). By Talagrand's majorizing measure theorem (see e.g. [44], Theorem

8.6.1) it is known that $\gamma_2(X, \|\cdot\|) \simeq \gamma(X)$. Using the triangle inequality and the fact that $\operatorname{diam}(X) \leq 2\gamma(X)$, we conclude that
$$\mathbb{P}\left(\sup_{x \in X}|W_x| \geq cs\gamma(X)\right) \leq 2e^{-s^2}, \qquad s \geq 1.$$
A simple consequence of Theorem 8.3 is the following estimate on the dispersion constant of a random subspace of polynomial systems:

Corollary 8.4. Let $F$ be a random $m$-dimensional subspace of $H_d$ drawn from the Haar measure on $\mathrm{Gr}(m, \dim(H_d))$, where $m \geq 16Cn\log(ed)^2$. Then
$$\sigma(F) \leq \frac{\sqrt{m} + Ct\sqrt{n}\log(ed)}{\sqrt{m} - Ct\sqrt{n}\log(ed)}$$
with probability greater than $1 - e^{-t^2}$, where $C$ is the absolute constant from Theorem 8.3.

Proof of Corollary 8.4: Since $\|q_v\|_W = 1$ for all $v \in S^{n-1}$, applying Theorem 8.3 to the set $B_d$ implies that
$$\sup_{x \in B_d}\left|\binom{n+d-1}{d}^{1/2}\|\Pi_F(x)\| - \sqrt{m}\right| \leq Ct\sqrt{n}\log(ed)$$
with probability greater than $1 - e^{-t^2}$ for all $t \geq 1$. Since $\sigma_{\min}(F) = \min_{x \in B_d}\|\Pi_F(x)\|$ and $\sigma_{\max}(F) = \max_{x \in B_d}\|\Pi_F(x)\|$, we have
$$\frac{\sqrt{m} - Ct\sqrt{n}\log(ed)}{\binom{n+d-1}{d}^{1/2}} \leq \sigma_{\min}(F) \leq \sigma_{\max}(F) \leq \frac{\sqrt{m} + Ct\sqrt{n}\log(ed)}{\binom{n+d-1}{d}^{1/2}}$$
with probability greater than $1 - e^{-t^2}$.

References

[1] Franck Barthe and Alexander Koldobsky, “Extremal slabs in the cube and the Laplace transform,” Adv. Math. 174 (2003), pp. 89–114.
[2] Carlos Beltrán and Luis-Miguel Pardo, “Smale’s 17th problem: Average polynomial time to compute affine and projective solutions,” Journal of the American Mathematical Society 22 (2009), pp. 363–385.
[3] Lenore Blum, Felipe Cucker, Mike Shub, and Steve Smale, Complexity and Real Computation, Springer-Verlag, 1998.
[4] Jean Bourgain, “On the isotropy-constant problem for ψ2-bodies,” in Geometric Aspects of Functional Analysis, Lecture Notes in Mathematics, vol. 1807, pp. 114–121, Springer Berlin Heidelberg, 2003.
[5] Peter Bürgisser and Felipe Cucker, “On a problem posed by Steve Smale,” Annals of Mathematics, Vol. 174 (2011), no. 3, pp. 1785–1836.
[6] Peter Bürgisser and Felipe Cucker, Condition, Grundlehren der mathematischen Wissenschaften, no. 349, Springer-Verlag, 2013.
[7] Peter Bürgisser, Felipe Cucker, Pierre Lairez, “Computing the homology of basic semialgebraic sets in weakly exponential time,” Journal of the ACM (JACM) 66, 2018, pp. 1–30.
[8] Peter Bürgisser, Felipe Cucker, Josué Tonelli-Cueto, Computing the homology of semialgebraic sets.
I: Lax formulas, Foundations of Computational Mathematics, 2018.
[9] Peter Bürgisser, Felipe Cucker, Josué Tonelli-Cueto, “Computing the Homology of Semialgebraic Sets. II: General formulas,” arXiv preprint arXiv:1903.10710.
[10] Felipe Cucker, Alperen A. Ergür, Josué Tonelli-Cueto, “Plantinga-Vegter algorithm takes average polynomial time,” in Proceedings of the 2019 on International Symposium on Symbolic and Algebraic Computation, pp. 114–121, 2019.

[11] D. Castro, Juan San Martín, Luis M. Pardo, “Systems of Rational Polynomial Equations have Polynomial Size Approximate Zeros on the Average,” Journal of Complexity 19 (2003), pp. 161–209.
[12] Felipe Cucker, “Approximate zeros and condition numbers,” Journal of Complexity 15 (1999), no. 2, pp. 214–226.
[13] Felipe Cucker, Teresa Krick, Gregorio Malajovich, and Mario Wschebor, “A numerical algorithm for zero counting I. Complexity and accuracy,” J. Complexity 24 (2008), no. 5–6, pp. 582–605.
[14] Felipe Cucker, Teresa Krick, Gregorio Malajovich, and Mario Wschebor, “A numerical algorithm for zero counting II. Distance to ill-posedness and smoothed analysis,” J. Fixed Point Theory Appl. 6 (2009), no. 2, pp. 285–294.
[15] Felipe Cucker, Teresa Krick, Gregorio Malajovich, and Mario Wschebor, “A numerical algorithm for zero counting III: Randomization and condition,” Adv. in Appl. Math. 48 (2012), no. 1, pp. 215–248.
[16] Felipe Cucker, Teresa Krick, and Mike Shub, “Computing the Homology of Real Projective Sets,” Foundations of Computational Mathematics, 2018, pp. 929–970.
[17] Jean-Pierre Dedieu, Mike Shub, “Newton’s Method for Overdetermined Systems Of Equations,” Math. Comp. 69 (2000), no. 231, pp. 1099–1115.
[18] James Demmel, Benjamin Diament, and Gregorio Malajovich, “On the Complexity of Computing Error Bounds,” Found. Comput. Math. (2001), pp. 101–125.
[19] S. Dirksen, “Tail bounds via generic chaining,” Electronic J. Probab. 20 (2015), no. 53, 29 pp.
[20] Alperen A. Ergür, Grigoris Paouris, J. Maurice Rojas, “Probabilistic Condition Number Estimates For Real Polynomial Systems I: A Broader Family Of Distributions,” Foundations of Computational Mathematics, to appear. DOI: 10.1007/s10208-018-9380-5.
[21] Wassily Hoeffding, “Probability inequalities for sums of bounded random variables,” Journal of the American Statistical Association, 58 (301):13–30, 1963.
[22] O. D.
Kellogg, “On bounded polynomials in several variables,” Mathematische Zeitschrift, December 1928, Volume 27, Issue 1, pp. 55–64.
[23] Bo’az Klartag and Emanuel Milman, “Centroid bodies and the logarithmic Laplace Transform – a unified approach,” J. Func. Anal., 262(1):10–34, 2012.
[24] Bo’az Klartag and Shahar Mendelson, “Empirical processes and random projections,” J. Functional Analysis, Vol. 225, no. 1 (2005), pp. 229–245.
[25] Bo’az Klartag and Shahar Mendelson, “Empirical processes and random projections,” J. Funct. Anal. Vol. 225, no. 1 (2005), pp. 229–245.
[26] Alexander Koldobsky and Alain Pajor, “A Remark on Measures of Sections of Lp-balls,” Geometric Aspects of Functional Analysis, Lecture Notes in Mathematics 2169, pp. 213–220, Springer-Verlag, 2017.
[27] Eric Kostlan, “On the Distribution of Roots of Random Polynomials,” Ch. 38 (pp. 419–431) of From Topology to Computation: Proceedings of Smalefest (M. W. Hirsch, J. E. Marsden, and M. Shub, eds.), Springer-Verlag, New York, 1993.
[28] Pierre Lairez, “A deterministic algorithm to compute approximate roots of polynomial systems in polynomial average time,” Foundations of Computational Mathematics, DOI 10.1007/s10208-016-9319-7.
[29] Michel Ledoux, The Concentration of Measure Phenomenon, Mathematical Surveys & Monographs, Book 89, AMS Press, 2005.
[30] C. Liaw, A. Mehrabian, Y. Plan, and R. Vershynin, “A simple tool for bounding the deviation of random matrices on geometric sets,” Geometric Aspects of Functional Analysis: Israel Seminar (GAFA) 2014–2016, Lecture Notes in Mathematics 2169, Springer, 2017, pp. 277–299.
[31] G. Livshyts, Grigoris Paouris and P. Pivovarov, “Sharp bounds for marginal densities of product measures,” Israel Journal of Mathematics, Vol. 216, Issue 2, pp. 877–889.
[32] Assaf Naor and Artem Zvavitch, “Isomorphic embedding of $\ell_p^n$, 1

[36] Mark Rudelson and Roman Vershynin, “The Smallest Singular Value of Rectangular Matrix,” Communications on Pure and Applied Mathematics 62 (2009), pp. 1707–1739.
[37] Mark Rudelson and Roman Vershynin, “Small ball Probabilities for Linear Images of High-Dimensional Distributions,” Int. Math. Res. Not. (2015), no. 19, pp. 9594–9617.
[38] G. Schechtman, “Two observations regarding embedding subsets of Euclidean spaces in normed spaces,” Adv. Math. 200 (2006), pp. 125–135.
[39] Igor R. Shafarevich, Basic Algebraic Geometry 1: Varieties in Projective Space, 3rd edition, Springer-Verlag (2013).
[40] Mike Shub and Steve Smale, “Complexity of Bezout’s Theorem I. Geometric Aspects,” J. Amer. Math. Soc. 6 (1993), no. 2, pp. 459–501.
[41] Steve Smale, “Mathematical Problems for the next Century,” The Mathematical Intelligencer, 1998, 20.2, pp. 7–15.
[42] D. A. Spielman and S.-H. Teng, “Smoothed analysis of algorithms,” Proc. Int. Congress Math. (Beijing, 2002), Volume I, 2002, pp. 597–606.
[43] M. Talagrand, The generic chaining. Upper and lower bounds of stochastic processes. Springer Monographs in Mathematics. Springer-Verlag, Berlin, 2005.
[44] Roman Vershynin, High Dimensional Probability: An Introduction with Application in Data Science, Cambridge Series in Statistical and Probabilistic Mathematics (Book 47), Cambridge University Press, 2018.
[45] Roman Vershynin, “Four Lectures on Probabilistic Methods for Data Science,” available at https://www.math.uci.edu/ rvershyn/papers/four-lectures-probability-data.pdf
[46] Roman Vershynin, “Introduction to the Non-Asymptotic Analysis of Random Matrices,” Compressed sensing, pp. 210–268, Cambridge Univ. Press, Cambridge, 2012.

University of Texas at San Antonio, One UTSA Circle, San Antonio, TX, 78249, USA
Email address: [email protected]

Department of Mathematics, Texas A&M University TAMU 3368, College Station, Texas 77843-3368, USA.
Email address: [email protected]

Department of Mathematics, Texas A&M University TAMU 3368, College Station, Texas 77843-3368, USA.
Email address: [email protected]