Numerical Condition in Polynomial Equation Solving

Peter Bürgisser, Technische Universität Berlin

Meeting on Algebraic Vision 2015, Berlin, October 8, 2015

Part I: Definition of condition and general phenomena

Condition number as norm of derivative

▶ Numerical computation problem ("solution map")

  f : R^p → R^q,  x ↦ y = f(x).

  Fix norms ‖·‖ on R^p and R^q.

▶ A relative error ‖Δx‖/‖x‖ in the input x causes a relative error ‖Δy‖/‖y‖ in the output y.

▶ The condition number κ(f, x) of f at x satisfies

  ‖Δy‖/‖y‖ ≲ κ(f, x) · ‖Δx‖/‖x‖.

▶ Formal definition if f is differentiable:

  κ(f, x) := ‖Df(x)‖ · ‖x‖ / ‖f(x)‖,

  where ‖Df(x)‖ denotes the operator norm of the derivative of f at x.
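As an illustration (not from the talk), the following Python sketch evaluates this definition for the toy map f(x) = x_1 - x_2 near a point where the two coordinates nearly cancel, and compares the predicted error amplification with the error actually observed under a small perturbation; the map and the numbers are chosen only for illustration.

```python
import numpy as np

def condition_number(f, Df, x):
    """kappa(f, x) = ||Df(x)|| * ||x|| / ||f(x)|| (operator / 2-norms)."""
    return np.linalg.norm(Df(x), 2) * np.linalg.norm(x) / np.linalg.norm(f(x))

# Toy map: subtraction of nearly equal numbers is ill-conditioned.
f  = lambda x: np.array([x[0] - x[1]])
Df = lambda x: np.array([[1.0, -1.0]])

x = np.array([1.0, 1.0 - 1e-8])
kappa = condition_number(f, Df, x)

dx = 1e-10 * np.random.randn(2)            # small input perturbation
rel_in  = np.linalg.norm(dx) / np.linalg.norm(x)
rel_out = np.linalg.norm(f(x + dx) - f(x)) / np.linalg.norm(f(x))

print(f"kappa          = {kappa:.3e}")
print(f"rel_out/rel_in = {rel_out / rel_in:.3e}  (<= kappa, to first order)")
```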

Condition number of matrix inversion

▶ Consider matrix inversion

  f : GL(m, R) → R^{m×m},  A ↦ A^{-1}.

  We measure errors with the spectral norm.

▶ Theorem. The condition number of f at A equals

  κ(A) := κ(f, A) = ‖A‖ ‖A^{-1}‖.

  This is well known as "the condition number of the matrix A" (see the numerical check below).

▶ κ(A) was first introduced by A. Turing in 1948.

▶ Warning: a different computational problem related to A has a different condition number.

▶ For computing the eigenvalues λ_1, ..., λ_n of A we have n condition numbers κ(A, λ_1), ..., κ(A, λ_n) (Wilkinson).
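A quick numerical check (illustration only): for a random matrix, ‖A‖ ‖A^{-1}‖ computed from the singular values agrees with NumPy's built-in spectral condition number.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

sigma = np.linalg.svd(A, compute_uv=False)      # singular values, descending
kappa = sigma[0] / sigma[-1]                    # ||A|| * ||A^{-1}|| in the spectral norm

print(kappa, np.linalg.cond(A, 2))              # the two values coincide
```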

Distance to ill-posedness

▶ We call the set Σ ⊆ R^{m×m} of singular matrices the set of ill-posed instances for matrix inversion. Clearly, A ∈ Σ ⇔ det A = 0.

▶ The Eckart-Young Theorem from 1936 states that

  κ(A) = ‖A‖ ‖A^{-1}‖ = ‖A‖ / dist(A, Σ),

  where dist refers either to the operator norm or to the Frobenius norm (the Euclidean norm on R^{m×m}); a numerical check follows below.

▶ This is a prototype of a Condition Number Theorem; see Demmel.
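A hedged sketch of the Eckart-Young statement in code: the distance from A to the nearest singular matrix (in spectral or Frobenius norm) is the smallest singular value of A, so ‖A‖ / dist(A, Σ) = σ_max/σ_min = κ(A). The rank-one update below constructs a nearest singular matrix explicitly.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

U, s, Vt = np.linalg.svd(A)
dist_to_sigma = s[-1]                      # = dist(A, Sigma) in spectral and Frobenius norm

# A nearest singular matrix: subtract the smallest rank-one component of the SVD.
A_sing = A - s[-1] * np.outer(U[:, -1], Vt[-1, :])

print("rank of nearest singular matrix:", np.linalg.matrix_rank(A_sing))   # 3
print("||A - A_sing||_2 :", np.linalg.norm(A - A_sing, 2))                 # equals s[-1]
print("||A|| / dist     :", s[0] / dist_to_sigma)
print("kappa(A)         :", np.linalg.cond(A, 2))                          # same value
```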

Role of condition numbers

Obvious: Condition numbers are a crucial issue for designing "numerically stable" algorithms.

Less obvious, but true: Even when assuming infinite precision arithmetic, the condition of an input often dominates the running time of iterative algorithms.

Three important examples of this phenomenon (a numerical illustration of the first one follows after this list):

▶ the conjugate gradient method for solving linear equations (Hestenes and Stiefel)

▶ interior point methods for linear optimization (Renegar)

▶ Newton homotopy methods for solving systems of polynomial equations (Shub and Smale)
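The sketch below (illustration only, not from the talk) runs a plain conjugate gradient iteration on two symmetric positive definite matrices that differ only in their condition number; the well-conditioned system converges in far fewer iterations, even though both computations are done in the same floating point precision.

```python
import numpy as np

def cg_iterations(A, b, tol=1e-10, maxiter=10_000):
    """Plain conjugate gradient; returns the number of iterations to reach tol."""
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    for k in range(1, maxiter + 1):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol * np.linalg.norm(b):
            return k
        p = r + (rs_new / rs) * p
        rs = rs_new
    return maxiter

rng = np.random.default_rng(2)
n = 200
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
b = rng.standard_normal(n)

for kappa in (1e2, 1e6):                       # target condition numbers
    eigs = np.geomspace(1.0, kappa, n)         # spectrum spread over [1, kappa]
    A = Q @ np.diag(eigs) @ Q.T                # SPD matrix with cond(A) = kappa
    print(f"kappa = {kappa:.0e}:  {cg_iterations(A, b)} CG iterations")
```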

Probabilistic analysis of Turing's condition number

▶ Assume a slight perturbation of A due to noise, round-off, etc.

▶ We model this with an isotropic Gaussian distribution with mean Ā ∈ R^{n×n} and covariance matrix σ²I, which has the density

  ρ(A) = (σ√(2π))^{-n²} exp( -‖A - Ā‖_F² / (2σ²) ).

  If Ā = 0 and σ = 1: Ginibre ensemble.

▶ Theorem. (Wschebor, improving Sankar, Spielman, and Teng)

  sup_{‖Ā‖=1}  Prob_{A ~ N(Ā, σ²I)} { κ(A) ≥ t }  =  O( n / (σt) ).

  This is a prototype of a smoothed analysis (a Monte Carlo illustration follows below).

▶ Tao and Vu (2010) have results for general distributions.
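A Monte Carlo sketch of the smoothed-analysis statement (illustration only): fix a mean matrix Ā with ‖Ā‖ = 1, add Gaussian noise of size σ, and estimate the tail probability Prob{κ(A) ≥ t}, which should decay roughly like n/(σt); the dimensions and noise level are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma, trials = 20, 0.05, 2000

A_bar = rng.standard_normal((n, n))
A_bar /= np.linalg.norm(A_bar, 2)            # normalize so that ||A_bar|| = 1

kappas = np.array([
    np.linalg.cond(A_bar + sigma * rng.standard_normal((n, n)), 2)
    for _ in range(trials)
])

for t in (1e2, 1e3, 1e4):
    empirical = np.mean(kappas >= t)
    print(f"t = {t:.0e}:  Prob[kappa >= t] ~ {empirical:.4f}   (n/(sigma*t) = {n/(sigma*t):.4f})")
```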

Part II: Polynomial Equations

Complexity of Bezout's Theorem (Shub and Smale 1993–1996)

Smale's 17th problem

Guiding question: Can a zero of n complex polynomial equations in n unknowns be found approximately, on the average, in polynomial time with a uniform algorithm?

The answer is yes!

Very recently, the problem was completely solved by Pierre Lairez at TU Berlin (2015), building on work by Shub, Smale, Beltrán, Pardo, Bürgisser, and Cucker.

The framework

▶ For a degree vector d = (d_1, ..., d_n) we define

  H_d := { f = (f_1, ..., f_n) | f_i ∈ C[X_0, ..., X_n] homogeneous of degree d_i }.

▶ The input size is N := dim_C H_d.

▶ We look for zeros ζ of f in complex projective space P^n: f(ζ) = 0.

▶ The Bombieri-Weyl hermitian inner product ⟨·,·⟩ on H_d is invariant under the natural action of the unitary group U(n+1) on H_d and allows us to define ‖f‖ := ⟨f, f⟩^{1/2}.

▶ We have a standard Gaussian distribution on H_d with density

  ρ(f) = (1/√(2π))^{2N} exp( -‖f‖²/2 ).
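The following sketch (illustration only; the weighting convention is the usual one for the Bombieri-Weyl basis, stated here as an assumption) samples one component f_i of such a random system: each monomial coefficient is a standard complex Gaussian scaled by the square root of the corresponding multinomial coefficient, so the resulting polynomial is standard Gaussian with respect to the Bombieri-Weyl norm.

```python
import numpy as np
from itertools import combinations_with_replacement
from math import factorial

def bw_gaussian_polynomial(n, d, rng):
    """Sample a random homogeneous polynomial of degree d in X_0,...,X_n,
    standard Gaussian w.r.t. the Bombieri-Weyl norm.
    Returns a dict: exponent tuple alpha -> complex coefficient."""
    coeffs = {}
    for combo in combinations_with_replacement(range(n + 1), d):
        alpha = tuple(combo.count(i) for i in range(n + 1))        # exponent vector
        multinom = factorial(d) / np.prod([factorial(a) for a in alpha])
        g = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2)
        coeffs[alpha] = np.sqrt(multinom) * g                      # Bombieri-Weyl weighting
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the polynomial at a point x in C^{n+1}."""
    return sum(c * np.prod(x ** np.array(alpha)) for alpha, c in coeffs.items())

rng = np.random.default_rng(4)
f0 = bw_gaussian_polynomial(n=2, d=3, rng=rng)      # one equation of a random system
print(evaluate(f0, np.array([1.0, 0.5, -0.3 + 0.2j])))
```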

Condition number

▶ Let f(ζ) = 0. How much does ζ change when we perturb f a little?

▶ Consider the solution variety V := { (f, ζ) | f(ζ) = 0 } ⊆ H_d × P^n, which is a smooth Riemannian submanifold.

▶ We have a solution map

  G : H_d → P^n,  f̃ ↦ ζ̃,

  defined locally around the simple zero ζ (implicit function theorem).

▶ We consider the derivative D_f G : T_f H_d → T_ζ P^n.

▶ According to our general principles, we define

  μ'(f, ζ) := ‖f‖ · ‖D_f G‖,

  the condition number at (f, ζ).

▶ One can show that ‖D_f G‖ = ‖Df(ζ)†‖ if ‖ζ‖ = 1.

▶ For the sake of elegance: rather use ‖Df(ζ)† diag(√d_i)‖ in place of ‖Df(ζ)†‖, resulting in the condition number μ(f, ζ) for polynomial equation solving.
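A sketch (illustration only; the toy system, its zero, and the helper names are mine, and ‖ζ‖ = 1 is assumed) of how μ(f, ζ) could be evaluated numerically: build the Jacobian Df(ζ), take its Moore-Penrose pseudoinverse, rescale by diag(√d_i), and multiply by ‖f‖ in the Bombieri-Weyl norm.

```python
import numpy as np
from math import factorial

# Toy system f = (f1, f2) with f1 = X1^2 - X0*X2 (degree 2),
#                              f2 = X2^3 - X0*X1*X2 (degree 3);
# zeta = (1, 1, 1)/sqrt(3) is a zero of f on the unit sphere.
degrees = np.array([2, 3])
zeta = np.ones(3) / np.sqrt(3)

def jacobian(x):
    x0, x1, x2 = x
    return np.array([
        [-x2,      2*x1,    -x0            ],
        [-x1*x2,  -x0*x2,   3*x2**2 - x0*x1],
    ])

def bw_norm_sq(coeffs, d):
    """Bombieri-Weyl norm^2 of one polynomial given as {alpha: coefficient}."""
    return sum(abs(c)**2 * np.prod([factorial(a) for a in alpha]) / factorial(d)
               for alpha, c in coeffs.items())

f1 = {(0, 2, 0): 1.0, (1, 0, 1): -1.0}
f2 = {(0, 0, 3): 1.0, (1, 1, 1): -1.0}
norm_f = np.sqrt(bw_norm_sq(f1, 2) + bw_norm_sq(f2, 3))

J = jacobian(zeta)                                   # 2 x 3 Jacobian Df(zeta)
M = np.linalg.pinv(J) @ np.diag(np.sqrt(degrees))    # Df(zeta)^dagger diag(sqrt(d_i))
mu = norm_f * np.linalg.norm(M, 2)
print("mu(f, zeta) =", mu)
```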

Newton iteration and approximate zeros

▶ We have a projective Newton iteration

  z_{k+1} = N_f(z_k)

  with Newton operator N_f : P^n → P^n and starting point z_0.

▶ Let d refer to the geodesic distance on the Riemannian manifold P^n (Fubini-Study metric). Think of d as an angle.

▶ Theorem. (Smale) The condition controls the radius of the basin of attraction of the Newton iteration. Let (f, ζ) ∈ V and put D := max_i d_i. If z_0 ∈ P^n satisfies

  d(z_0, ζ) ≤ 0.3 / (D^{3/2} μ(f, ζ)),

  then for all i ∈ N

  d(z_i, ζ) ≤ (1 / 2^{2^i - 1}) · d(z_0, ζ).

  We call z_0 an approximate zero of f associated with the zero ζ.
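A minimal sketch of one projective Newton step for a homogeneous square system (n equations in n+1 unknowns), in the spirit of the projective Newton operator: the update direction is found in the hyperplane orthogonal to the current iterate and the result is renormalized. It is written for the same toy system as above; the function names are mine.

```python
import numpy as np

def projective_newton_step(F, DF, z):
    """One projective Newton step: solve DF(z) w = F(z) subject to <w, z> = 0,
    then update z <- (z - w) / ||z - w||."""
    J = DF(z)                                      # n x (n+1) Jacobian
    M = np.vstack([J, np.conj(z)[None, :]])        # stack the orthogonality constraint
    rhs = np.append(F(z), 0.0)
    w = np.linalg.solve(M, rhs)
    z_new = z - w
    return z_new / np.linalg.norm(z_new)

# Toy system from before: f1 = X1^2 - X0*X2, f2 = X2^3 - X0*X1*X2, zero at (1, 1, 1).
F  = lambda x: np.array([x[1]**2 - x[0]*x[2], x[2]**3 - x[0]*x[1]*x[2]])
DF = lambda x: np.array([[-x[2], 2*x[1], -x[0]],
                         [-x[1]*x[2], -x[0]*x[2], 3*x[2]**2 - x[0]*x[1]]])

z = np.array([1.0, 1.1, 0.9]); z /= np.linalg.norm(z)     # start near the zero (1,1,1)/sqrt(3)
for _ in range(5):
    z = projective_newton_step(F, DF, z)
print(z, F(z))          # z converges to a representative of the projective zero
```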

Homotopy path continuation: a topological idea

▶ Given a start system (g, ζ) ∈ V and an input f ∈ H_d.

▶ Connect g and f by the line segment [g, f] consisting of

  q_t := (1 - t) g + t f  for t ∈ [0, 1].

▶ If [g, f] does not meet the discriminant variety (none of the q_t has a multiple zero), then there exists a unique lifting to a solution path

  γ : [0, 1] → V,  t ↦ (q_t, ζ_t),

  such that (q_0, ζ_0) = (g, ζ).

▶ Choosing adaptive step sizes is dictated by a condition number theorem that characterizes μ(f, ζ)^{-1} as the distance of f to the set Σ_ζ of systems f̃ that have a multiple zero at ζ.

Discretization

▶ Follow γ numerically: Let t_0 = 0 < t_1 < ... < t_k = 1 and write q_i := q_{t_i}.

▶ Successively compute approximations z_i of ζ_i := ζ_{t_i} by Newton's method,

  z_{i+1} := N_{q_{i+1}}(z_i),

  starting with z_0 := ζ.

(Figure: the lifted solution path over the segment from g to f in H_d; the Newton iterates z_i in P^n track the exact zeros ζ_i of the systems q_i.)

Complexity

▶ Compute t_{i+1} adaptively from t_i such that

  d(q_{i+1}, q_i) = c / (D^{3/2} μ(q_i, z_i)²).

  This defines the Adaptive Linear Homotopy algorithm (a simplified tracking sketch follows below).

▶ We denote by K(f, g, ζ) the number k of Newton continuation steps that are needed to follow the homotopy.

▶ Theorem. (Shub & Smale) z_i is an approximate zero of ζ_i for all i. Moreover,

  K(f, g, ζ) ≤ 217 D^{3/2} ∫_0^1 μ(γ(t))² ‖γ̇(t)‖ dt.
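A self-contained sketch of linear homotopy continuation (illustration only, not the talk's algorithm): it tracks a zero of a start system with known zero (1, 1, 1) to a randomly perturbed target system of the same degree pattern, applying one projective Newton correction per step. For simplicity it uses uniform steps in t; the adaptive rule of the slide would instead choose t_{i+1} from an estimate of μ(q_i, z_i). All names and the example systems are mine.

```python
import numpy as np

# Homogeneous polynomials as {exponent tuple: coefficient}; a system is a list of dicts.
def evaluate(poly, x):
    return sum(c * np.prod(x ** np.array(a)) for a, c in poly.items())

def gradient(poly, x):
    g = np.zeros(len(x), dtype=complex)
    for a, c in poly.items():
        for j, aj in enumerate(a):
            if aj > 0:
                a_minus = np.array(a); a_minus[j] -= 1
                g[j] += c * aj * np.prod(x ** a_minus)
    return g

def F(system, x):  return np.array([evaluate(p, x) for p in system])
def DF(system, x): return np.array([gradient(p, x) for p in system])

def newton_step(system, z):
    """Projective Newton step (update orthogonal to z, then renormalize)."""
    M = np.vstack([DF(system, z), np.conj(z)[None, :]])
    w = np.linalg.solve(M, np.append(F(system, z), 0.0))
    z_new = z - w
    return z_new / np.linalg.norm(z_new)

def track(g_sys, f_sys, zeta, steps=200):
    """Follow the linear homotopy q_t = (1-t) g + t f from the known zero zeta of g.
    Uniform steps for simplicity; the adaptive rule would pick the next t from mu(q_i, z_i)."""
    z = zeta / np.linalg.norm(zeta)
    for t in np.linspace(0.0, 1.0, steps + 1)[1:]:
        q_t = [{a: (1 - t) * g_p.get(a, 0) + t * f_p.get(a, 0)
                for a in set(g_p) | set(f_p)}
               for g_p, f_p in zip(g_sys, f_sys)]
        z = newton_step(q_t, z)          # one Newton correction per step
    return z

# Start system g (degrees 2 and 3) with the known zero (1, 1, 1).
g_sys = [{(2, 0, 0): -1.0, (0, 2, 0): 1.0},          # X1^2 - X0^2
         {(3, 0, 0): -1.0, (0, 0, 3): 1.0}]          # X2^3 - X0^3
zeta = np.array([1.0, 1.0, 1.0], dtype=complex)

# Target system f: same degree pattern, with small random complex monomials added.
rng = np.random.default_rng(5)
def perturbed(p, d):
    out = dict(p)
    for a in ([(1, 1, 0), (1, 0, 1)] if d == 2 else [(1, 1, 1), (2, 0, 1)]):
        out[a] = out.get(a, 0) + 0.3 * (rng.standard_normal() + 1j * rng.standard_normal())
    return out
f_sys = [perturbed(g_sys[0], 2), perturbed(g_sys[1], 3)]

z_final = track(g_sys, f_sys, zeta)
print("residual at t = 1:", np.linalg.norm(F(f_sys, z_final)))   # should be tiny
```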

Average expected polynomial time

▶ Suppose the start system (g, ζ) ∈ V is chosen "at random": choose g ∈ H_d from the standard Gaussian and choose one of the d_1 ··· d_n many zeros ζ of g uniformly at random. Efficient sampling of (g, ζ) is possible (Beltrán & Pardo).

▶ Further assume that the input system f ∈ H_d is standard Gaussian.

▶ Theorem. (Beltrán and Pardo) The average number of homotopy steps is bounded by

  E_{(g,ζ), f} K(f, g, ζ) ≤ O(D^{3/2} N n),

  where D := max_i d_i and N = dim_C H_d.

▶ When allowing randomized algorithms, this is a solution to Smale's 17th problem.

▶ "Derandomizations": Bürgisser & Cucker, Lairez

For more details I refer to my new monograph (Springer 2013) with Felipe Cucker.

Part III: Condition of intersecting a projective variety with a varying linear subspace

Setting of intersection problem

▶ Z ⊆ P^n is a fixed m-dimensional irreducible projective variety.

▶ L ⊆ P^n is a varying linear subspace of complementary dimension s = n - m.

▶ Bezout: #(Z ∩ L) = deg Z if L is in sufficiently general position.

▶ Goal: Define and study the condition to numerically compute elements of Z ∩ L.

▶ Motivated by a problem in algebraic vision: S. Agarwal, H. L. Lee, R. Thomas, and B. Sturmfels.

I Motivated by a problem in algebraic vision: S. Agarwal, H. L. Lee, R. Thomas, and B. Sturmfels m×(n+1) n I Consider the derivative DAG : TAC → Tz P of the solution map G and its operator norm

kDAGk := sup kDAG(A˙ )k, kA˙ kF =1

defined with respect to the Frobenius norm kA˙ kF .

I The intersection condition number of A at z (with respect to the variety Z) is defined as

kercondZ (A, z) := kAk · kDAGk

Numerical Condition in Polynomial Equation Solving Condition of intersecting

The kernel intersection condition number

m×(n+1) I Let L be given as the kernel of a full rank matrix A ∈ C I Suppose L has a transversal intersection with Z at the smooth point z of Z. For Ae ≈ A, let G(Ae) be the intersection point of ker Ae ∩ Z close to L Numerical Condition in Polynomial Equation Solving Condition of intersecting

The kernel intersection condition number

m×(n+1) I Let L be given as the kernel of a full rank matrix A ∈ C I Suppose L has a transversal intersection with Z at the smooth point z of Z. For Ae ≈ A, let G(Ae) be the intersection point of ker Ae ∩ Z close to L m×(n+1) n I Consider the derivative DAG : TAC → Tz P of the solution map G and its operator norm

kDAGk := sup kDAG(A˙ )k, kA˙ kF =1

defined with respect to the Frobenius norm kA˙ kF .

I The kernel intersection condition number of A at z (with respect to the variety Z) is defined as

The intrinsic intersection condition number

▶ The complex Grassmann manifold G := G(P^n, s) is the set of s-dimensional projective linear subspaces L of P^n. One can define a unitarily invariant hermitian metric on G.

▶ We now have a solution map L̃ ↦ γ(L̃) ∈ P^n with derivative D_L γ : T_L G → T_z P^n, and we define the (intrinsic) intersection condition number of L at z (with respect to the variety Z) as

  κ_Z(L, z) := ‖D_L γ‖.

▶ Theorem. If L ∈ G is the kernel of A ∈ C^{m×(n+1)} and z ∈ L, then

  κ_Z(L, z) ≤ kercond_Z(A, z) ≤ κ(A) · κ_Z(L, z),

  where κ(A) := ‖A‖ · ‖A†‖.

where κ(A) := kAk · kA†k. I Theorem. We have κZ (L, z) = 1/ sin α, where α be the minimum angle between Tz Z and Tz L.

I The Schubert varietyΣ z (Z) of Z at z consists of the L ∈ Gz having a nontransversal intersection with Z at z. The geodesic distance dg and projection distance dp both define a metric on the subspace Gz of G.

I Condition Number Theorem. We have dp(L, Σz ) = sin dg (L, Σz ) and 1 κZ (L, z) = . dp(L, Σz )

Numerical Condition in Polynomial Equation Solving Condition of intersecting

Geometric characterizations

I Let z ∈ Z ∩ L be a smooth point of Z.

I A small minimum angle between L and Z at z means a “glancing intersection” of L and Z, which should result in a large intersection condition. This can be made formal: I The Schubert varietyΣ z (Z) of Z at z consists of the L ∈ Gz having a nontransversal intersection with Z at z. The geodesic distance dg and projection distance dp both define a metric on the subspace Gz of G.

I Condition Number Theorem. We have dp(L, Σz ) = sin dg (L, Σz ) and 1 κZ (L, z) = . dp(L, Σz )

Numerical Condition in Polynomial Equation Solving Condition of intersecting

Geometric characterizations

I Let z ∈ Z ∩ L be a smooth point of Z.

I A small minimum angle between L and Z at z means a “glancing intersection” of L and Z, which should result in a large intersection condition. This can be made formal:

I Theorem. We have κZ (L, z) = 1/ sin α, where α be the minimum angle between Tz Z and Tz L. I Condition Number Theorem. We have dp(L, Σz ) = sin dg (L, Σz ) and 1 κZ (L, z) = . dp(L, Σz )

Numerical Condition in Polynomial Equation Solving Condition of intersecting

Geometric characterizations

I Let z ∈ Z ∩ L be a smooth point of Z.

I A small minimum angle between L and Z at z means a “glancing intersection” of L and Z, which should result in a large intersection condition. This can be made formal:

I Theorem. We have κZ (L, z) = 1/ sin α, where α be the minimum angle between Tz Z and Tz L.

I The Schubert varietyΣ z (Z) of Z at z consists of the L ∈ Gz having a nontransversal intersection with Z at z. The geodesic distance dg and projection distance dp both define a metric on the subspace Gz of G. Numerical Condition in Polynomial Equation Solving Condition of intersecting

Geometric characterizations

I Let z ∈ Z ∩ L be a smooth point of Z.

I A small minimum angle between L and Z at z means a “glancing intersection” of L and Z, which should result in a large intersection condition. This can be made formal:

I Theorem. We have κZ (L, z) = 1/ sin α, where α be the minimum angle between Tz Z and Tz L.

I The Schubert varietyΣ z (Z) of Z at z consists of the L ∈ Gz having a nontransversal intersection with Z at z. The geodesic distance dg and projection distance dp both define a metric on the subspace Gz of G.
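A small sketch (illustration only; the "tangent spaces" here are arbitrary made-up subspaces of R^n, not those of a specific variety) of the quantity 1/sin α: the smallest principal angle between two subspaces can be read off from the singular values of the product of orthonormal bases, and the condition number blows up as the subspaces approach a glancing position.

```python
import numpy as np

def min_angle(U, W):
    """Smallest principal angle between the column spans of U and W."""
    Qu, _ = np.linalg.qr(U)
    Qw, _ = np.linalg.qr(W)
    s = np.linalg.svd(Qu.T @ Qw, compute_uv=False)
    c = min(s[0], 1.0)                      # cosine of the smallest principal angle
    return np.arccos(c)

rng = np.random.default_rng(6)
n, m = 6, 3                                  # ambient dimension, dimension of "T_z Z"
TzZ = rng.standard_normal((n, m))

for eps in (1.0, 1e-2, 1e-4):
    # "T_z L": complementary dimension n - m; one direction drifts towards T_z Z as eps -> 0.
    TzL = rng.standard_normal((n, n - m))
    TzL[:, 0] = TzZ[:, 0] + eps * rng.standard_normal(n)
    alpha = min_angle(TzZ, TzL)
    print(f"eps = {eps:.0e}:  alpha = {alpha:.2e} rad,  1/sin(alpha) = {1/np.sin(alpha):.2e}")
```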

Towards a probabilistic analysis

▶ The Hurwitz variety Σ(Z) of Z is defined as the Zariski closure of the union of the local Schubert varieties Σ_z(Z), taken over all regular points z of Z. It consists of all L touching Z (and the limits thereof).

▶ Σ(Z) is a hypersurface in G if Z is not linear (Sturmfels 2014).

▶ Define κ_Z(L) := max_{z ∈ Z∩L} κ_Z(L, z). Then

  κ_Z(L) ≥ ε^{-1}  ⟺  d_p(L, Σ(Z)) ≤ ε.

▶ For uniformly random L ∈ G, bounding the probability that κ_Z(L) is large means bounding the volume of the tube around Σ(Z).

▶ One can express the volume of Σ(Z) in terms of its degree.

Thank you