
CDM: Semirings and Rings

Klaus Sutner
Carnegie Mellon University

1 Semirings and Rings
2 Polynomials: Applications
3 Polynomials: Definition
4 Roots

1 Semirings and Rings

Semirings 3

Definition
A semiring is a structure ⟨X, ⊕, ⊗, 0, 1⟩ that satisfies the following conditions:

1  ⟨X, ⊕, 0⟩ and ⟨X, ⊗, 1⟩ are monoids, the former commutative.

2  ⊗ distributes over ⊕ on both sides:

       x ⊗ (y ⊕ z) = (x ⊗ y) ⊕ (x ⊗ z)   and   (y ⊕ z) ⊗ x = (y ⊗ x) ⊕ (z ⊗ x).

3  0 is a null element (or annihilator) with respect to ⊗: x ⊗ 0 = 0 ⊗ x = 0.

A semiring is commutative if x ⊗ y = y ⊗ x, and idempotent if x ⊕ x = x.

Example
Some standard examples of semirings are the following. The structure ⟨Z, +, ·, 0, 1⟩ is a semiring under the standard operations of addition and multiplication. More generally, any ring is a semiring; the additive monoid here is even a group.

Example
The Boolean semiring B = ⟨{0, 1}, ∨, ∧, 0, 1⟩ where the operations are logical 'or' and 'and'.

Example (The Relation Semiring)
The semiring R of all binary relations over a set A has set union as the additive operation and relational composition as the multiplicative operation. 0 is the empty relation, and 1 is the identity relation.

Example (The Language Semiring)
The semiring L(Σ) of all languages over Σ is the structure ⟨P(Σ*), ∪, ·, ∅, ε⟩. It is easy to check that all the equations for a semiring are satisfied by L(Σ).

Note that one can introduce a metric on L(Σ) by setting dist(L, K) := 2^{-n}, where n is minimal such that L ∩ Σ^n ≠ K ∩ Σ^n, for L ≠ K, and dist(L, L) = 0. It is not hard to see that L(Σ) is a complete metric space with respect to this distance function.

Example (Tropical Semiring)
The tropical semiring is defined by TS = ⟨N ∪ {∞}, min, +, ∞, 0⟩. It is used for example in Dijkstra's algorithm for minimal cost paths. We will encounter it in chapter ?? in our discussion of the finite power property.

Example (Matrix Semirings)
Let S = ⟨S, ⊕, ⊗, 0, 1⟩ be a semiring. The operations ⊕ and ⊗ naturally extend to n by n matrices over S. One can show that ⟨S^{n,n}, ⊕, ⊗, 0, 1⟩ is again a semiring. Here 0 and 1 are the appropriate null and identity matrices over S. Matrix semirings will be of importance to us later.

To describe the behavior of a finite state machine one can often use monoid homomorphisms of the form h : Σ* → S^{n,n}, where S^{n,n} stands for the monoid of n × n matrices over S with matrix multiplication. Since Σ* is free, any such homomorphism can be specified by a list of matrices h(a), a ∈ Σ.

Exercise
Verify that all the examples given above are indeed semirings.

Exercise (Fusion Semiring)
Recall the partial operation of fusion on words. We have extended fusion to languages as usual (note that fusion on languages is a total operation). Show that the structure ⟨P(Γ*), ∪, •, ∅, Γ⟩ is a semiring, the fusion semiring over Γ.

Exercise
Show that all the following structures are semirings.

1  ⟨R ∪ {+∞}, min, +, +∞, 0⟩
2  ⟨R ∪ {−∞}, max, +, −∞, 0⟩
3  ⟨{x ∈ R | 0 ≤ x ≤ 1}, max, ·, 0, 1⟩
4  ⟨R≥0 ∪ {+∞}, max, min, 0, ∞⟩
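To make the tropical and matrix semiring examples concrete, here is a small Python sketch (not part of the original notes; the function names are my own). Multiplying a weight matrix by itself in the tropical semiring computes cheapest paths, which is the connection to shortest-path problems mentioned above.

    # Sketch: the tropical semiring N ∪ {∞} with ⊕ = min, ⊗ = +, and the
    # induced matrix semiring. The additive identity of the semiring is ∞.
    INF = float("inf")

    def t_add(x, y):          # ⊕ = min
        return min(x, y)

    def t_mul(x, y):          # ⊗ = +, with ∞ absorbing
        return x + y

    def mat_mul(A, B):
        """Matrix product over the tropical semiring."""
        n = len(A)
        return [[min(t_mul(A[i][k], B[k][j]) for k in range(n))
                 for j in range(n)] for i in range(n)]

    # Entry (i, j) is the cost of edge i -> j in a small weighted graph.
    W = [[0, 3, INF],
         [INF, 0, 1],
         [2, INF, 0]]

    W2 = mat_mul(W, W)        # cheapest paths using at most two edges
    print(W2[0][2])           # 4 = 3 + 1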

Star Semirings and Closed Semirings 9

Consider again the semiring R of all binary relations on a set A. The multiplicative operation here is composition. As we have seen, there is an important operation closely related to composition: transitive closure. How about adding a unary operation * to the semiring that denotes transitive closure?

This additional unary operation also occurs in other semirings. For example, in the semiring L(Σ) of all languages over some alphabet, there is the Kleene star operation, which is of central importance in language theory. As it turns out, the star operation will allow us to solve linear equations of the form x = a · x + b.

We can define the star operation in terms of an infinitary operation:

    x* = Σ_{i≥0} x^i = x^0 + x^1 + x^2 + ...

Note that this approach does not quite fit into the equational framework used to describe the other operations. Here is a more precise definition. Suppose S is an idempotent semiring with an additional infinitary operation Σ_{i∈I} a_i which is defined for any family (a_i | i ∈ I) of elements in S, where I is an arbitrary index set. S is a closed semiring iff this infinite summation operation behaves properly with respect to finite sums, distributivity and reordering of the arguments. More precisely, we require

    Σ_{i∈[n]} a_i = a_1 + ... + a_n,
    (Σ_{i∈I} a_i)(Σ_{j∈J} b_j) = Σ_{(i,j)∈I×J} a_i b_j,
    Σ_{i∈I} a_i = Σ_{j∈J} ( Σ_{i∈I_j} a_i ),

where in the last equation I is the disjoint union of the sets I_j. The star operation in a closed semiring is then defined as above. We can then verify equations such as

    (x + y)* = (x* y)* x*,
    (x y)* = 1 + x (y x)* y.

We will denote semirings with a star operation by ⟨X, +, ·, *, 0, 1⟩. For a more detailed description of closed semirings and star semirings see [?, ?].

It is easy to check that L(Σ) is in fact a closed semiring, since plus here corresponds to set-theoretic union and can naturally be expanded to infinitely many arguments. Note that in this case we also have the super idempotency property: Σ_{i∈I} a_i = a if a_i = a for all i ∈ I. It follows that (x*)* = x* and 1* = 1.

Exercise
Show that the collection of all Boolean n by n matrices forms a closed semiring. Think about the case n = 1 first.

Exercise
Verify the equations given above in L(Σ). Solve the equation x = a · x + b over L({a, b}).

Exercise
Show that in a closed semiring with super idempotency we have (x*)* = x* and 1* = 1. Also show that these equations hold in any closed semiring satisfying (x + y)* = (x* y)* x* and (x y)* = 1 + x (y x)* y.
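As a warm-up for the first exercise, here is a hedged Python sketch (my own, under the assumption that ∨ and ∧ are the semiring operations) of the star operation in the semiring of Boolean n × n matrices: A* = I + A + A^2 + ... is the reflexive-transitive closure of the relation described by A, computed by a Warshall-style update.

    # Sketch: star operation over the Boolean matrix semiring.
    def boolean_star(A):
        """Return A* = I + A + A^2 + ... with or as + and and as *."""
        n = len(A)
        R = [[1 if (A[i][j] or i == j) else 0 for j in range(n)]
             for i in range(n)]                      # start from I + A
        for k in range(n):
            for i in range(n):
                for j in range(n):
                    if R[i][k] and R[k][j]:
                        R[i][j] = 1
        return R

    A = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
    print(boolean_star(A))    # the pair (0, 2) appears in the closure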

Rings 13

Definition
A ring is a structure of the form R = ⟨R, +, ·, 0, 1⟩ where

1  ⟨R, +, 0⟩ is a commutative (Abelian) group,

2  ⟨R, ·, 1⟩ is a monoid (not necessarily commutative),

3  multiplication distributes over addition:

       x · (y + z) = x · y + x · z
       (y + z) · x = y · x + z · x

Rings 14

In other words, a ring is a semiring where the additive monoid is an Abelian group. A field is a ring where in addition the multiplicative monoid is commutative and is essentially a group: ⟨R − {0}, ·, 1⟩ is a group.

Note that one cannot give a reasonable definition of 0^{-1}, which is why 0 is excluded from the multiplicative group of a field.

Thus, in a field one has a good chance to solve equations such as x^2 − 5x + 3 = 0. Note, though, that we cannot hope to solve all such equations in all fields: over the real numbers, x^2 + 1 = 0 has no solution (but it has a solution over C, the algebraic completion of the reals).

Commutative Rings 15

Note that we need two distributive laws since multiplication is not assumed to be commutative. If multiplication is commutative, the ring itself is called commutative.

One can relax the conditions a bit and deal with rings without a 1: for example, 2Z is a ring without 1. Instead of a multiplicative monoid one then has a semigroup. For our purposes there is no need for this.

Examples: Rings 16

Example (Integers)
The integers ⟨Z, +, ·, 0, 1⟩ with the usual addition and multiplication form a ring.

Example (Modular Numbers)
The integers modulo n, ⟨Z_n, +, ·, 0, 1⟩, with the usual modular addition and multiplication form a ring. If n is prime, this ring is actually a field. In particular there is a two-element field consisting just of 0 and 1. Note that these fields are finite.

Example (Standard Fields)
The rationals Q, the reals R, the complex numbers C.

Examples: Rings 17

Example (Univariate Polynomials)
Given a ring R we can construct a new ring by considering all polynomials with coefficients in R, written R[x], where x indicates the "unknown" or "variable". For example, Z[x] is the ring of all polynomials with integer coefficients.

Example (Matrix Rings)
Another important way to construct rings is to consider matrices with coefficients in a ground ring R. For example, R^{n,n} denotes the ring of all n by n matrices with real coefficients. Note that this ring is not commutative unless n = 1.

Annihilators, Inverses and Units 18

Definition
A ring element a is an annihilator if for all x: xa = ax = a.
An inverse u' of a ring element u is any element such that uu' = u'u = 1. A ring element u is called a unit if it has an inverse u'.

Proposition
0 is an annihilator in any ring.

Proof. Note that a0 = a(0 + 0) = a0 + a0, done by cancellation in the additive group. □

Inverses 19

Note that an annihilator cannot be a unit: suppose a is an annihilator with aa' = 1. Since a is an annihilator we also have aa' = a, so a = 1, and then x = x · 1 = x · a = a = 1 for all x, contradiction (in any ring with more than one element).

The multiplicative identity 1 in a ring is uniquely determined: 1 = 1 · 1' = 1'.

Proposition
If u is a unit, then its inverse is uniquely determined.

Proof.
Suppose uu' = u'u = 1 and uu'' = u''u = 1. Then

    u' = u'1 = u'uu'' = 1u'' = u''.

As usual, lots of equational reasoning. □

And we can write the inverse as u^{-1}.

Zero Divisors 20

We are interested in rings that have lots of units. One obstruction to having a unit is described in the next definition.

Definition
A ring element a ≠ 0 is a zero divisor if there exist b, c ≠ 0 such that ab = ca = 0.

A commutative ring is an integral domain if it has no zero-divisors.

Consider R* = R − {0}, so all units are located in R*. Then ⟨R*, ·, 1⟩ is a monoid in any integral domain.

Proposition (Multiplicative Cancellation)
In an integral domain, ab = ac with a ≠ 0 implies b = c.

Proof. ab = ac iff a(b − c) = 0, done. □

Examples: Integral Domains 21

Example (Standard Integral Domains)
The integers Z, the rationals Q, the reals R, the complex numbers C are all integral domains.

Example (Modular Numbers)
The ring of modular numbers Z_m is an integral domain iff m is prime.

Example (Non-ID)
The ring of 2 × 2 real matrices is not an integral domain:

    ( 0 1 )   ( 1 0 )   ( 0 0 )
    ( 0 0 ) · ( 0 0 ) = ( 0 0 )

A Strange Ring 22

Arithmetic structures provide the standard examples for rings, but rings are much more general than that. Here is a warning not to over-interpret the ring axioms.

Let A be an arbitrary set and let P = P(A) be its powerset. For x, y ∈ P define

    x + y = (x − y) ∪ (y − x)
    x · y = x ∩ y

Thus addition is symmetric difference and multiplication is plain set-theoretic intersection. In terms of logic, addition is "exclusive or," and multiplication is "and."

Proposition
⟨P(A), +, ·, ∅, A⟩ is a commutative ring.

Exercise
Prove the proposition.

Fields 23

Definition
A field F is a ring in which the multiplicative monoid ⟨F*, ·, 1⟩ forms a commutative group.

In other words, every non-zero element is already a unit. As a consequence, in a field we can always solve linear equations

    a · x + b = 0

provided that a ≠ 0: the solution is x_0 = −a^{-1} b. In fact, we can solve systems of linear equations using the standard machinery from linear algebra.

As we will see, this additional condition makes fields much more constrained than arbitrary rings. By the same token, they are also much more manageable.

Examples: Fields 24

Example
In calculus one always deals with the classical fields: the rationals Q, the reals R, the complex numbers C.

Example
The modular numbers Z_m form a field iff m is prime.

We can use the Extended Euclidean Algorithm to compute multiplicative inverses: obtain two cofactors x and y such that xa + ym = 1. Then x is the multiplicative inverse of a modulo m.

Note that we can actually compute quite well in this type of finite field: the elements are trivial to implement and there is a reasonably efficient way to realize the field operations.

Axiomatization 25

Note that one can axiomatize monoids and groups in a purely equational fashion, using a unary function symbol ^{-1} to denote an inverse when necessary.

Alas, this does not work for fields: the inverse operation is partial and we need to guard against argument 0:

    x ≠ 0  ⇒  x · x^{-1} = 1

One can try to pretend that the inverse is total and explore the corresponding axiomatization; this yields a structure called a "meadow" which does not quite have the right properties.

Products Fail 26

One standard method in algebra that produces more complicated structures from simpler ones is to form a product (operations are performed componentwise).

This works fine for structures with an equational axiomatization: semigroups, monoids, groups, and rings.

Unfortunately, for fields this approach fails. For let

    F = F_1 × F_2

where F_1 and F_2 are two fields. Then F is a ring, but never a field: the element (0, 1) ∈ F is not (0, 0), and so would have to have an inverse (a, b). But (0, 1)(a, b) = (0, b) ≠ (1, 1), so this does not work.

2 Polynomials: Applications

Polynomials 28

Informally, a (univariate) polynomial is an expression of the form

    x^3 − 2x^2 + 3x − 1

There is an unknown or variable x, and we form powers x^i of the variable, multiply them by elements of some ground ring (the coefficients), and add several such terms. Of course, division is not allowed.

The ground ring supplies the coefficients of the polynomial. In the example, it is presumably Z.

Coefficient Representation 29

Suppose we have a univariate polynomial p(x) over a ring R. We can represent the polynomial by the vector of its coefficients:

    a = (a_0, a_1, ..., a_{n-1})

represents

    p(x) = a_0 + a_1 x + a_2 x^2 + ... + a_{n-1} x^{n-1}

If a_{n-1} ≠ 0 then n − 1 is the degree of p. In general, p(x) has degree bound n (note that this is a strict bound; this notion will be useful later).

Evaluation 30

We can replace the unknown x by elements in R and evaluate to obtain another element in the ring. For example,

    p(x) = x^3 − 2x^2 + 3x − 1

produces

    p(2) = 2^3 − 2 · 2^2 + 3 · 2 − 1 = 5.

Hence we can associate the polynomial p with a polynomial function

    p̂ : R → R,   a ↦ p(a)

This may seem like splitting hairs, but sometimes it is important to keep the two notions apart.

Come on . . . 31

The polynomial and the associated polynomial function really are two different objects. Consider the ground ring Z_2. The polynomial

    p(x) = x^2 + x

has the associated function

    p̂(a) = 0

for all a ∈ Z_2. In fact, any polynomial p(x) = Σ_{i∈I} x^i produces the identically 0 map as long as I ⊆ N^+ has even cardinality.

Exercise
Describe all polynomial functions over the ground rings Z_2 and Z_3.

Horner's Rule 32

If the polynomial is given in coefficient form

    a = (a_0, a_1, ..., a_d)

we can easily evaluate it at any point b ∈ R:

    f(b) = ((...(a_d b + a_{d-1}) b + a_{d-2}) b + ...) b + a_0

Proposition
A polynomial of degree d can be evaluated in d ring additions and d ring multiplications.
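A minimal Python sketch of Horner's rule (the helper name is mine); it evaluates the running example and reproduces p(2) = 5.

    # Sketch: Horner evaluation of a polynomial given by its coefficients.
    def horner(a, b):
        """Evaluate a_0 + a_1 b + ... + a_d b^d with d additions and d multiplications."""
        result = 0
        for coeff in reversed(a):
            result = result * b + coeff
        return result

    # p(x) = x^3 - 2x^2 + 3x - 1, coefficient vector (-1, 3, -2, 1).
    print(horner([-1, 3, -2, 1], 2))   # 5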

Interpolation 33

Suppose we wish to construct a polynomial f that evaluates to given target values at certain points. Say we want f(a_i) = b_i for i = 0, ..., n−1.

Lagrange Functions 34

Define the Lagrange interpolant

    L_i^n(x) = ∏_{j≠i} (x − a_j) / (a_i − a_j)

Proposition
L_i^n(a_i) = 1 and L_i^n(a_j) = 0 for i ≠ j.

Hence we can choose

    f(x) = Σ_i b_i L_i^n(x)
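Here is a small Python sketch of interpolation via the Lagrange interpolants, using exact rational arithmetic; the function and variable names are my own, not from the notes.

    # Sketch: evaluate the Lagrange interpolating polynomial at a point.
    from fractions import Fraction

    def lagrange_eval(points, x):
        """Evaluate at x the interpolating polynomial through the (a_i, b_i) pairs."""
        x = Fraction(x)
        total = Fraction(0)
        for i, (ai, bi) in enumerate(points):
            Li = Fraction(1)
            for j, (aj, _) in enumerate(points):
                if j != i:
                    Li *= (x - aj) / (ai - aj)   # factor of the interpolant L_i
            total += bi * Li
        return total

    # f(i) = i-th prime for i = 0,...,5 (the example on the next slide):
    primes = list(enumerate([2, 3, 5, 7, 11, 13]))
    print(lagrange_eval(primes, 3))   # 7
    print(lagrange_eval(primes, 6))   # extrapolation, not necessarily prime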

Example 35

Suppose we want f(i) = the i-th prime for i = 0, ..., 5. The Lagrange interpolation looks like

    f(x) = 2 L_0^6 + 3 L_1^6 + 5 L_2^6 + 7 L_3^6 + 11 L_4^6 + 13 L_5^6

which, after expansion and simplification, produces

    1/120 (240 − 286 x + 735 x^2 − 425 x^3 + 105 x^4 − 9 x^5)

Secret Sharing 36

Suppose you have a "secret" a, a natural number, that you want to distribute over n people in such a way that no proper subset of the n persons can access the secret but the whole group can.

More generally, we may want to distribute the secret so that k out of n persons can access it, but not any subgroup of size k − 1.

Here is a simple idea for the n = k problem. We may safely assume that a is an m-bit number. Generate n − 1 random m-bit numbers a_i and give number a_i to person i, i = 1, ..., n−1. Person n receives

    a_n = a ⊕ a_1 ⊕ a_2 ⊕ ... ⊕ a_{n-1}

where ⊕ is bit-wise xor.

Clearly all n secret sharers together can compute a, but if one is missing they are stuck with a random number.
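A short Python sketch of the n-out-of-n xor scheme just described (the names and the 128-bit default are my own choices).

    # Sketch: xor secret sharing, all n shares required.
    import secrets

    def xor_share(secret, n, bits=128):
        """Split an m-bit secret into n shares; all n are needed to recover it."""
        shares = [secrets.randbits(bits) for _ in range(n - 1)]
        last = secret
        for s in shares:
            last ^= s
        return shares + [last]

    def xor_recover(shares):
        a = 0
        for s in shares:
            a ^= s
        return a

    shares = xor_share(0xC0FFEE, 5)
    print(hex(xor_recover(shares)))        # 0xc0ffee
    print(hex(xor_recover(shares[:-1])))   # one share missing: just a random number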

Using Polynomials 37

A more flexible approach is built on the following idea.

Construct a polynomial f that has a as its lowest coefficient, so f(0) = a. All other coefficients are chosen at random.

The secret sharers are given not the coefficients of f but pieces of the point-value description of f, obtained by repeated evaluation.

If all n agree, they can use their information to reconstruct the coefficient representation of f by interpolation. Once f is reconstructed, a can be found by a simple evaluation. But for any proper subset, f will be underdetermined, and a cannot be recovered.

Shamir's Method 38

More precisely, pick a prime p > a, n. We will use the ground ring Z_p. Generate random numbers 0 < a_i < p for i = 1, ..., n−1, and define the polynomial

    f(x) = a + a_1 x + ... + a_{n-1} x^{n-1}

f is completely determined by the n point-value pairs (i, b_i), i = 1, ..., n. By interpolation we can retrieve f from the point-value pairs, hence we can determine a = a_0.

An important point is that n − 1 persons can obtain no information about the zero coefficient: every value of the coefficient is equally likely. It is not hard to generalize this method to k < n.
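And a sketch of Shamir's n-out-of-n variant over Z_p; the particular prime and the helper names are assumptions of mine. Recovery is Lagrange interpolation at 0, with all arithmetic mod p.

    # Sketch: Shamir secret sharing over Z_p (requires Python 3.8+ for pow(x, -1, p)).
    import random

    def shamir_share(secret, n, p):
        """Shares are the point-value pairs (i, f(i)) of a random polynomial f
        of degree n-1 with f(0) = secret, all arithmetic mod p."""
        coeffs = [secret] + [random.randrange(1, p) for _ in range(n - 1)]
        def f(x):
            return sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p
        return [(i, f(i)) for i in range(1, n + 1)]

    def shamir_recover(shares, p):
        """Reconstruct f(0) = secret by Lagrange interpolation at x = 0."""
        secret = 0
        for i, (xi, yi) in enumerate(shares):
            li = 1
            for j, (xj, _) in enumerate(shares):
                if j != i:
                    li = li * (-xj) * pow(xi - xj, -1, p) % p
            secret = (secret + yi * li) % p
        return secret

    p = 2**61 - 1                       # a convenient prime, larger than the secret
    shares = shamir_share(123456789, 5, p)
    print(shamir_recover(shares, p))    # 123456789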

3 Polynomials: Definition

Defining Polynomials 40

So polynomials are associated with functions, we can evaluate them efficiently and we can reconstruct them from point-value pairs. But how do we give a precise definition of polynomials?

The definition should pin down all the essential properties and should make the algebraic structure perfectly clear. For example, we can add and multiply polynomials.

There are (at least) two ways we could turn:

Define an algebraic structure R[x], the so-called polynomial ring over R.

Define a collection of formal expressions representing polynomials (implicit representation).

Pro and Con 41

The algebraic approach, unsurprisingly, has the advantage that it brings out the algebraic properties of polynomials very clearly. Surprisingly, it also provides some ideas for implementation. It is a bit abstract, though.

The formal expression approach is more intuitive and closely follows actual practical use. E.g., an expression like

    (x + y)^5 − 1

will be a polynomial in x and y by definition. To evaluate, we perform a substitution followed by some evaluations. Unfortunately, the algebra gets more tricky when dealing with purely syntactic structures.

Polynomial Rings and Coproducts 42

We start with a moderately strict algebraic definition of polynomial. Let R be a commutative ring with 1 throughout.

Definition
Given a ring R, the N-coproduct of R is defined by

    ∐_N R = { (a_i) ∈ R^N | only finitely many a_i ≠ 0 }

The notation (a_i) ∈ R^N is a bit clumsy since only finitely many terms are non-zero. One usually writes suggestively

    a_0 + a_1 x + ... + a_n x^n

where a_n is the last non-zero element in the sequence (or n = 0 if they all are 0).

Note that the "unknown" x is nothing but syntactic sugar; all we really have is a sequence with finite support.

Operations on Coproducts 43

Again, there is no "unknown" in the definition of the coproduct. That makes it easier to give clean definitions of the algebraic structure. Addition is easy:

    (a_i) + (b_i) = (a_i + b_i)

The sum (a_i) + (b_i) is again an element of the coproduct, and it is not too hard to check that this operation is associative and commutative.

But multiplication is somewhat more complicated (Cauchy product):

    (a_i) · (b_i) = ( Σ_{r+s=i} a_r · b_s )

Proposition
The product (a_i) · (b_i) is an element of the coproduct.

More Arithmetic 44

Write 0 and 1 for the sequences (0, 0, 0, ...) and (1, 0, 0, ...), respectively. We have a + 0 = a, so that

    ⟨∐R, +, 0⟩

is a commutative monoid, and even a group.

Likewise 1 · a = a · 1 = a, and

    ⟨∐R, ·, 1⟩

is also a commutative monoid (assuming that R is commutative).
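A quick Python sketch (mine) of the two coproduct operations, with finitely supported sequences represented simply as coefficient lists.

    # Sketch: coproduct arithmetic, index i of the list holds a_i.
    def poly_add(a, b):
        n = max(len(a), len(b))
        a = a + [0] * (n - len(a))
        b = b + [0] * (n - len(b))
        return [x + y for x, y in zip(a, b)]

    def poly_mul(a, b):
        """Cauchy product: c_i = sum over r+s=i of a_r * b_s."""
        c = [0] * (len(a) + len(b) - 1)
        for r, ar in enumerate(a):
            for s, bs in enumerate(b):
                c[r + s] += ar * bs
        return c

    # (1 + x)^2 = 1 + 2x + x^2
    print(poly_mul([1, 1], [1, 1]))   # [1, 2, 1]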

Exercise
Prove that coproducts are closed under multiplication. Which properties of the natural numbers are important?

Exercise
Prove that ⟨∐R, ·, 1⟩ is a commutative monoid.

The Unknown 46

Here is a much more interesting element: let

    x = (0, 1, 0, 0, 0, ...)

Then x^2 = (0, 0, 1, 0, 0, ...), x^3 = (0, 0, 0, 1, 0, ...) and so forth.

This justifies the notation

    a_0 + a_1 x + ... + a_n x^n

for the sequence

    (a_0, a_1, ..., a_n, 0, 0, ...)

Note that one can actually embed all of R:

    r ↦ (r, 0, 0, ...)

Moreover, this map is (trivially) a ring homomorphism.

The Polynomial Ring 47

Lemma
⟨∐R, +, ·, 0, 1⟩ is a ring. This ring is commutative whenever R is.

This is unsurprising, but note that a proof requires a bit of work: we have to verify, e.g., that multiplication as defined above really is associative. We ignore the details.

Definition
The ring ⟨∐R, +, ·, 0, 1⟩ is the polynomial ring with coefficients in R and is usually written R[x].

In calculus one studies R[x]. For our purposes, Q[x], Z[x], Z_m[x] or F[x] where F is a finite field will be more important.

More Abstraction 48

The last definition may seem fairly abstract, but one can push even further ahead.

Note that N was not just used to define the coproduct (a sequence is a map N → R); addition on N was used in the definition of multiplication:

    (a_i) · (b_i) = ( Σ_{r+s=i} a_r · b_s )

What we are really using here is not just any old countable set but the monoid ⟨N, +, 0⟩. The fact that the equation r + s = i has only finitely many solutions in this monoid was crucial.

So? Everybody knows kindergarten arithmetic. Why make a fuss about it?

Generalization 49

Suppose we have a commutative monoid

    ⟨M, +, 0⟩

where equations x + y = m have only finitely many solutions. Then we can define a ring R[M] on the carrier set

    ∐_M R = { a : M → R | only finitely many non-zeros }

in the exact same way as before.

We need at least one example for M (other than N) that makes this construction interesting. It is probably good to stay close to N. Z won't work since it violates the finiteness condition. How about M = N × N?

Bivariate Polynomials 50

Instead of sequences we now have a grid of coefficients (a two-dimensional array instead of a one-dimensional one):

    a_00  a_01  a_02  ...  a_0j  ...
    a_10  a_11  a_12  ...  a_1j  ...
    a_20  a_21  a_22  ...  a_2j  ...
     .                      .
    a_i0  a_i1  a_i2  ...  a_ij  ...

Addition in R[N × N] is essentially the same as before (since it does not depend on the algebraic structure of M, just the carrier set). But multiplication becomes more interesting.

Multiplication 51

Let x be the coefficient grid with a single 1 in position (1, 0) and y the grid with a single 1 in position (0, 1), all other entries 0. Then, for R = Z_7, products such as (x + y)^5 and (1 + x)^5 (2 + y)^5 have coefficient grids that can be displayed in the same way (again with all other entries 0).

Multivariate Polynomials 52

One can easily show that any element in R[N × N] has the form

    a = Σ_{i,j ≥ 0} a_{i,j} x^i y^j

where only finitely many coefficients are non-zero, and addition and multiplication work as one would expect. So we really have a rock-solid definition of bivariate polynomials.

And, of course, using the monoid

    M = ⟨N^n, +, 0⟩

we can get multivariate polynomials in general.

Definition
The ring of polynomials over R in n variables is defined by

    R[x_1, ..., x_n] = R[N^n]

Normal Form 53

There is a natural way to write down a multivariate polynomial, analogous to the univariate and bivariate case. A monomial is a term of the form

    x^e = x_1^{e_1} x_2^{e_2} ... x_n^{e_n}

where e = (e_1, ..., e_n) ∈ N^n. Then any multivariate polynomial can be written as a sum of monomials, multiplied by the appropriate coefficients:

    p(x) = Σ_e a_e x^e

And arithmetic works as expected.

Note that the sum is naturally ordered (use the natural lexicographic order on the exponents).

Note: Some authors consider the coefficient to be part of the monomial.

Hold It 54

Why not simply define, say, bivariate polynomials by saying they are expressions like

    p(x, y) = x^2 y^2 + 3xy − 1

Or, if you want more precision, since the operation R ↦ R[x] works for any ring R, we can simply repeat it to get multivariate polynomials:

    R[x_1, ..., x_n] = R[x_1, ..., x_{n-1}][x_n]

All true, but these "definitions" obscure a lot of things.

What is R[x][x], what is R[x, x]?
What is the difference between R[x], R[X] and R[y]?
What is the difference between R[x, y] and R[y, x]?
What is the role of evaluation?

Clarity 55

One might argue that R[x][x] and R[x, x] simply make no sense. R[x] and R[y] are isomorphic, so in a sense they are the same.

For a human being (a mathematician) these difficulties are usually irrelevant: one can determine from context what is meant, or even correct the author if need be.

But if one wants to implement polynomial algebra, things are more problematic. For example, if evaluation is done by rewrite rules, the name of the variable does play a huge role. Here R[x] and R[y] can behave very differently.

Also note that from the implementation point of view, ∐_{N^n} R is probably a better model than definitions using terms.

Evaluation 56

So what exactly is the role of evaluation? The point is that once we fix the value of the unknown(s), everything else is determined, too.

More precisely, suppose f : R → S is some ring homomorphism. Consider the set of all ring homomorphisms from R[x] to S that agree with f on R:

    H = { h : R[x] → S | h ↾ R = f }

Then the map

    H → S,   h ↦ h(x)

is a bijection.

Universal Property 57

Another way of saying the same thing:

For any ring homomorphism f : R → S and any element a ∈ S there is exactly one homomorphism h : R[x] → S such that

    h ↾ R = f   and   h(x) = a.

Exercise
Give a similar result for multivariate polynomials.

Exercise
Show that this property in fact characterizes polynomial rings.

Degrees 58

Definition
The degree of a monomial x^e is Σ_i e_i. The degree of a polynomial is the largest degree of any of its monomials. If the polynomial is zero, its degree is −∞.

Lemma
Let R be an integral domain and p, q ∈ R[x]. Then deg(pq) = deg(p) + deg(q).

Note that if R is an integral domain, then by the lemma R[x] is again an integral domain.

The lemma fails without the integral domain assumption: let R = Z_4 and consider p(x) = q(x) = 2x.

Roots 59

Definition
A ring element a ∈ R is a root of p(x) ∈ R[x] if p(a) = 0.

In other words, a root is any solution of the equation p(x) = 0.

Finding roots of polynomial equations is often very difficult, in particular when several variables are involved. For univariate polynomials over the reals good numerical methods exist, but over other rings things are problematic.

For example, computing square roots, i.e. solving x^2 − a = 0, over Z_m is surprisingly difficult. Of course there is a brute-force algorithm, but think of a modulus m having thousands of digits.

And for Z[x_1, x_2, ..., x_n] it is even undecidable whether a root exists.

Polynomials and Division 60

Definition
Given two polynomials f and g, g divides f if q · g = f for some polynomial q.

For the integers, the most important algorithm associated with the notion of divisibility is the Division Algorithm: we can compute quotient q and remainder r such that a = qb + r, 0 ≤ r < b. The situation for polynomials is very similar.

Theorem (Division Algorithm)
Assume that F is a field. Let f and g be two univariate polynomials over F, g ≠ 0. Then there exist polynomials q and r such that

    f = q · g + r   where deg(r) < deg(g).

Moreover, q and r are uniquely determined.

Proof Sketch 61

For existence, consider the set of possible remainders

    S = { f − qg | q ∈ F[x] }.

If 0 ∈ S we are done, so suppose otherwise.

Trick: let r ∈ S be any element of minimal degree, say r = f − qg. Write m = deg(r) and n = deg(g); we need m < n. Assume m ≥ n and define

    r' = r − (a_m / b_n) x^{m−n} g

where a_m and b_n are the leading coefficients of r and g, respectively. But then deg(r') < deg(r) and r' ∈ S, contradicting minimality.

Uniqueness is left as an exercise.

The Actual Algorithm 62

Though the theorem is often referred to as the "Division Algorithm," it is just an existence and uniqueness result. However, with a little work one can turn the proof into an algorithm.

Write lc(h) for the leading coefficient of a polynomial h. Suppose g is monic, so that lc(g) = 1. Here is an abstract version, to be called on f:

    remainder(f, g)
        r = f;
        while( deg(r) >= deg(g) )
            c = lc(r);
            k = deg(r) - deg(g);
            r = r - c * x^k * g;
        return r;
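Here is a hedged Python version of the division step and, anticipating the GCD application two slides ahead, of the resulting Euclidean algorithm, for univariate polynomials over Z_p. The coefficient-list representation (lowest degree first) and all helper names are my own choices, not the lecture's.

    # Sketch: polynomial division with remainder and GCD over Z_p, p prime.
    def trim(f):
        while f and f[-1] == 0:
            f.pop()
        return f

    def poly_divmod(f, g, p):
        """Return (q, r) with f = q*g + r and deg(r) < deg(g), over Z_p."""
        f = trim([c % p for c in f])
        g = trim([c % p for c in g])
        q = [0] * max(len(f) - len(g) + 1, 1)
        r = f[:]
        inv = pow(g[-1], -1, p)                  # inverse of the leading coefficient
        while trim(r) and len(r) >= len(g):
            k = len(r) - len(g)                  # deg(r) - deg(g)
            c = r[-1] * inv % p
            q[k] = c
            for i, gi in enumerate(g):           # r = r - c * x^k * g
                r[i + k] = (r[i + k] - c * gi) % p
        return trim(q), r

    def poly_gcd(f, g, p):
        while trim(g):
            f, g = g, poly_divmod(f, g, p)[1]
        return f

    # gcd(x^2 - 1, x^2 + x) over Z_7 is x + 1, up to a unit:
    print(poly_gcd([-1, 0, 1], [0, 1, 1], 7))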

Another Version 63

Often one needs to compute both q and r; here is an array-based version which does that. Assume deg(f) = n and deg(g) = m.

    r = f;
    for i = n-m downto 0 do
        if( deg(r) = m + i )
            q[i] = lc(r);
            r = r - q[i] * x^i * g;
        else
            q[i] = 0;
    return q[];

Note that sparseness is a tricky issue here: divide x^{n+1} − 1 by x − 1.

Application: GCD 64

An important application of the Division Algorithm for integers is the Euclidean algorithm for the GCD. Likewise, we can obtain a polynomial GCD algorithm from the Division Algorithm for polynomials. In fact, essentially the same algorithm works; just replace Z by F[x].

For example, we can obtain cofactors s and t such that

    gcd(f, g) = s f + t g.

Local Views 65

Consider the polynomial

    p(x) = x^4 − 8x^3 + 23x^2 − 28x + 12

We can read off immediately that p(0) = 12 and that the tangent at that point has slope −28.

But what about the polynomial around x = 1? Here it is convenient to use Taylor expansion to rewrite p in the form

    p(x) = (x − 1)^4 − 4(x − 1)^3 + 5(x − 1)^2 − 2(x − 1)

We can see that x = 1 is a root and that the tangent there has slope −2.

More Local Views 66

Similarly,

    p(x) = (x − 2)^4 − (x − 2)^2
    p(x) = (x − 3)^4 + 4(x − 3)^3 + 5(x − 3)^2 + 2(x − 3)

so it is clearly useful to consider polynomials in non-expanded form.

Exercise
What conclusions can you draw from the various representations of p(x)?
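The rewritten forms above can be computed mechanically: repeatedly dividing by (x − c), one Horner pass per coefficient, yields the coefficients of p in powers of (x − c). A small Python sketch, with names of my own:

    # Sketch: Taylor shift by repeated synthetic division by (x - c).
    def taylor_shift(coeffs, c):
        """coeffs holds a_0,...,a_n of p; return b_0,...,b_n with
        p(x) = sum of b_k (x - c)^k."""
        a = list(coeffs)
        out = []
        for _ in range(len(coeffs)):
            rem = 0
            for i in reversed(range(len(a))):    # divide a by (x - c)
                a[i], rem = rem, rem * c + a[i]
            out.append(rem)                      # remainder = next coefficient
            a.pop()                              # quotient has one degree less
        return out

    # p(x) = x^4 - 8x^3 + 23x^2 - 28x + 12 around x = 1:
    print(taylor_shift([12, -28, 23, -8, 1], 1))   # [0, -2, 5, -4, 1]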

Bounds 67

On rare occasions one can also easily establish bounds, given the "right" representation of a polynomial. For instance,

    p(x) = x^4 − 4x^3 + 7x^2 − 10x + 10

appears to be positive over R.

One can prove this by tediously checking the behavior over the intervals with endpoints −∞, −1, 1, 3/2, 2, ∞. Alternatively, one can use differentiation to find the global minimum of p.

Or one can note that

    p(x) = (x − 1)^4 + (x − 3)^2.

4 Roots

Roots and Divisibility 69

Lemma
Let a be a root of f ∈ F[x]. Then (x − a) divides f(x).

Proof.
Write f = q (x − a) + r where deg(r) < 1, so r is a constant. Evaluating at a shows r = f(a) = 0, done. □

Lemma
Any non-zero polynomial f ∈ F[x] has at most deg(f) many roots.

Proof.
Use the last lemma and induction on the degree. □

Decomposition 70

So if deg(f) = n and f has n roots, we can decompose f completely into linear terms:

    f = c (x − a_1)(x − a_2) ... (x − a_n)

Of course, there may be fewer roots, even over a rich field such as R: f = x^2 + 2 has no roots.

This problem can be fixed by enlarging R to the field of complex numbers C (the so-called algebraic completion of R).

More Roots 71

Note that over arbitrary rings more roots may well exist. For example, over R = Z_15 the equation x^2 − 4 = 0 has four roots: {2, 7, 8, 13}. But of course

    (x − 2)(x − 7)(x − 8)(x − 13) = 1 + 7x^2 + x^4 ≢ x^2 − 4

Exercise
Using the Chinese Remainder theorem, explain why there are four roots in the example above. Can you generalize?

Point-Value Representation 72

The fact that a non-zero polynomial of degree n can have at most n roots can be used to show that the interpolating polynomial

    f(x) = Σ_i b_i ∏_{j≠i} (x − a_j) / (a_i − a_j)

is unique: suppose g is another interpolating polynomial, so that g(a_i) = b_i. Then f − g vanishes at all the interpolation points, which is more roots than its degree allows, so it is identically zero.

Hence we have an alternative representation for polynomials: we can give a list of point-value pairs rather than a list of coefficients.

To the naked eye this proposal may seem absurd: why bother with a representation that is clearly more complicated? As we will see, there are occasions when the point-value representation is computationally superior to the coefficient list.

Multiplying Polynomials 73

Suppose we have two univariate polynomials f and g of degree bound n.

Using the brute force algorithm (i.e., literally implementing the definition of multiplication in ∐R) we can compute the product f g in Θ(n^2) ring operations.

Now suppose we are dealing with real polynomials. There is a bizarre way to speed up multiplication:

1  Convert f and g into point-value representation, where the support points are carefully chosen.
2  Multiply the values pointwise to get h.
3  Convert h back to coefficient representation.

FFT 74

It may seem absurd to spend all the effort to convert between coefficient representation and point-value representation. Surprisingly, it turns out that the conversions can be handled in Θ(n log n) steps using a technique called the Fast Fourier Transform. The pointwise multiplication is linear in n, so the whole algorithm is just Θ(n log n).

Theorem
Two real polynomials of degree bound n can be multiplied in Θ(n log n) steps.

Take a look at CLR for details.
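A sketch of the three-step pipeline using numpy's FFT; this is an illustration of mine, not the algorithm as presented in CLR, and the final rounding merely removes floating-point noise.

    # Sketch: multiply real polynomials via evaluation at roots of unity.
    import numpy as np

    def fft_multiply(f, g):
        """Coefficient vectors in, coefficient vector of f*g out."""
        n = len(f) + len(g) - 1                 # degree bound of the product
        size = 1 << (n - 1).bit_length()        # next power of two
        F = np.fft.rfft(f, size)                # evaluation (point-value form)
        G = np.fft.rfft(g, size)
        h = np.fft.irfft(F * G, size)[:n]       # pointwise product, then interpolate
        return np.rint(h).astype(int)

    # (1 + 2x + 3x^2)(4 + 5x) = 4 + 13x + 22x^2 + 15x^3
    print(fft_multiply([1, 2, 3], [4, 5]))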

Vandermonde Matrices 75

Here is another look at conversions between coefficient and point-value representation, i.e., between evaluation and interpolation.

Definition
Define the n by n Vandermonde matrix by

    V(x_0, x_1, ..., x_{n-1}) =
        ( 1   x_0       x_0^2       ...   x_0^{n-1}     )
        ( 1   x_1       x_1^2       ...   x_1^{n-1}     )
        ( 1   x_2       x_2^2       ...   x_2^{n-1}     )
        ( ...                                           )
        ( 1   x_{n-1}   x_{n-1}^2   ...   x_{n-1}^{n-1} )

Lemma
|V(x)| = ∏_{i<j} (x_j − x_i)

Evaluation versus Interpolation 76

It follows that the Vandermonde matrix is invertible iff all the x_i are distinct.

Now consider a polynomial

    f(x) = c_0 + c_1 x + ... + c_{n-1} x^{n-1}

To evaluate f at points a = (a_0, ..., a_{n-1}) we can use matrix-by-vector multiplication:

    b = V(a) · c

But given the values b we can obtain the coefficient vector by

    c = V(a)^{-1} · b
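Numerically, both directions are one line of numpy; a small sketch (the particular points and coefficients are just an example of mine):

    # Sketch: evaluation and interpolation via the Vandermonde matrix.
    import numpy as np

    a = np.array([0.0, 1.0, 2.0, 3.0])            # support points
    c = np.array([12.0, -28.0, 23.0, -8.0])       # coefficients c_0,...,c_3

    V = np.vander(a, increasing=True)             # rows (1, a_i, a_i^2, a_i^3)
    b = V @ c                                     # evaluation: b = V(a) c
    c_back = np.linalg.solve(V, b)                # interpolation: c = V(a)^{-1} b
    print(b)
    print(np.allclose(c_back, c))                 # True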

Implicit Representation 77

Recall our alternative way of defining polynomials: formal expressions involving the variables x_1, ..., x_n, elements in the ground ring, and addition, subtraction and multiplication.

More precisely, we could consider polynomials to be arbitrary expressions built from ring elements and the variables using the operations plus and times.

We have used this implicit representation in several places already, without any protest from the audience.

Example
The expressions c(x − a_1)(x − a_2) ... (x − a_n) encountered in the root decomposition.

Example
The description of the determinant of a Vandermonde matrix: ∏_{i<j} (x_j − x_i).

So? 78

Obviously we can recover the explicit polynomial (i.e. the coefficient list) from these implicit representations. We just have to expand (multiply out) to get the "classical form."

But what exactly is meant by "expanding" a polynomial?

Expanding Polynomials 79

We want to bring a multivariate polynomial f(x_1, x_2, ..., x_n) into normal form.

First we apply rewrite rules to push multiplication to the bottom of the expression until we have a sum of products:

    α(β + γ) ↦ αβ + αγ
    (β + γ)α ↦ βα + γα

Then we collect terms with the same monomial and adjust the coefficient:

    ... + c x^e + ... + d x^e + ...  ↦  ... + (c + d) x^e + ...

Some terms may cancel here (we don't keep monomials with coefficient 0). Computationally it is probably best to first sort the terms (rather than trying to do pattern matching at a distance).

The problem is that it may take exponential time to perform the expansion: there may be exponentially many terms in the expanded polynomial.

Polynomial Identities 80

A task one encounters frequently in symbolic computation is to check whether two polynomials are equivalent, i.e., whether an equation between polynomials

    f(x_1, x_2, ..., x_n) = g(x_1, x_2, ..., x_n)

holds for all x_1, x_2, ..., x_n ∈ R.

Definition
Two polynomials f(x) and g(x) are equivalent if for all a ∈ R: f(a) = g(a). In particular, f is identically zero if f(x) and 0 are equivalent.

In other words, two polynomials f(x) and g(x) are equivalent iff f̂ = ĝ. Notation: f(x) ≡ g(x).

Zero Testing 81

Note that the polynomial identity f ≡ g can be rewritten as f − g ≡ 0.

Problem:
How can we check whether a multivariate polynomial f ∈ F[x] is identically zero?

Assume that the ground ring is a field F.

Note: The polynomial may not be given in normal form but, as in the examples above, in a much shorter, parenthesized form. We want a method that is reasonably fast without having to expand out the polynomial first.

Using Roots 82

Suppose f has degree d. As we have seen, in the univariate case there are at most d roots, and d roots determine the polynomial (except for a multiplicative constant).

So we could simply check whether the polynomial vanishes at, say, a = 0, 1, 2, ..., d: f will vanish on all d + 1 points iff it is identically zero. This requires only d + 1 evaluations of the polynomial (in any form).

Unfortunately, the roots of multivariate polynomials are a bit more complicated.

A Quadratic 83

For example, consider the bivariate quadratic

    x^2 − y^2 + 1

[plot of x^2 − y^2 + 1 over a region of the plane]

Its roots form a whole curve (a hyperbola) in the real plane, so counting roots the way we did in the univariate case no longer works directly.

The Key Lemma 84

Lemma
Let f ∈ F[x_1, ..., x_n] be of degree d and S ⊆ F a set of cardinality s. If f is not identically zero, then f has at most d s^{n-1} roots in S^n.

Proof.
The proof is by induction on n. □

The set S here could be anything. For example, over Q we might choose S = {0, 1, 2, ..., s − 1}.

Schwartz's Method 85

The main application of the lemma is a probabilistic algorithm to check whether a polynomial is zero.

Suppose f is not identically zero and has degree d. Choose a point a ∈ S^n uniformly at random and evaluate f(a). Then

    Pr[ f(a) = 0 ] ≤ d / s

So by selecting S of cardinality 2d the error probability is at most 1/2. Note that the number of variables plays no role in the error bound.

A Real Algorithm 86

To lower the error bound, we repeat the basic step k times.

    // is f identically zero?
    do k times
        pick a uniformly at random in S^n
        if( f(a) != 0 ) return false;
    od
    return true;

Note that the answer false is always correct. The answer true is correct with error bound ε provided that

    k = ⌈ log_2 (1/ε) ⌉
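A Python sketch of the test, with the polynomial treated as a black box so no expansion is needed; the sample polynomials and names are mine.

    # Sketch: randomized zero test (Schwartz's method).
    import random

    def probably_zero(f, n, S, k):
        """Return False as soon as a nonzero value is found; otherwise report
        'probably zero' after k random evaluations over S^n."""
        for _ in range(k):
            a = [random.choice(S) for _ in range(n)]
            if f(*a) != 0:
                return False
        return True

    # (x + y)^2 - x^2 - 2xy - y^2 is identically zero without being obviously so.
    f = lambda x, y: (x + y) ** 2 - x ** 2 - 2 * x * y - y ** 2
    g = lambda x, y: (x + y) ** 2 - x ** 2 - y ** 2     # not identically zero

    S = list(range(8))    # degree d = 2, so each round errs with probability <= 2/8
    print(probably_zero(f, 2, S, 20))   # True
    print(probably_zero(g, 2, S, 20))   # almost certainly False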

Certainty 87

With more work we can make sure that f really vanishes.

Corollary
Let f ∈ F[x_1, ..., x_n] be of degree d and S ⊆ F a set of cardinality s > d. If f vanishes on S^n, then f is identically zero.

But note that for finite fields we may not be able to select a set of cardinality higher than d.

Recall the example over Z_2: in this case we can essentially only choose S = Z_2, so only degree 1 polynomials can be tackled by the lemma. That is fine for univariate polynomials (since any monomial x^i simplifies to x) but useless for multivariate polynomials.

Application 88

Recall that a perfect matching in a bipartite graph is a subset M of the edges such that the edges do not overlap and every vertex is incident upon exactly one edge in M.

There is a polynomial time algorithm to check whether a perfect matching exists, but using Schwartz's lemma one obtains a faster and less complicated algorithm.

Suppose the vertices are partitioned into u_i and v_i, i = 1, ..., n. Define an n × n matrix A by

    A(i, j) = x_{ij}   if (u_i, v_j) ∈ E,
              0        otherwise.

Note that the determinant of A is a polynomial whose non-zero terms look like

    ± x_{1π(1)} x_{2π(2)} ... x_{nπ(n)}

Testing Perfect Matchings 89

Proposition
The graph has a perfect matching iff the determinant of A is not identically zero.

Proof.
If the graph has no perfect matching, then all the terms in the determinant are 0 (they all involve at least one non-edge).

But if the graph has a perfect matching, it must have the form M = { (u_i, v_{π(i)}) | i ∈ [n] } where π is a permutation. But then the determinant cannot be 0, since the corresponding monomial cannot be canceled out. □
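As a sketch of how the test is used in practice (my own illustration, not from the notes): instead of expanding the symbolic determinant, substitute random values for the x_{ij} in the matrix A above and compute a single numeric determinant. By the lemma, a nonzero value certifies a perfect matching, and a zero value is wrong only with small probability.

    # Sketch: randomized perfect-matching test via a single determinant.
    import random
    from fractions import Fraction

    def det(M):
        """Exact Gaussian elimination over the rationals."""
        M = [row[:] for row in M]
        n, d = len(M), Fraction(1)
        for i in range(n):
            pivot = next((r for r in range(i, n) if M[r][i] != 0), None)
            if pivot is None:
                return Fraction(0)
            if pivot != i:
                M[i], M[pivot] = M[pivot], M[i]
                d = -d
            d *= M[i][i]
            for r in range(i + 1, n):
                factor = M[r][i] / M[i][i]
                for c in range(i, n):
                    M[r][c] -= factor * M[i][c]
        return d

    def probably_has_matching(edges, n, bound=10**9):
        A = [[Fraction(random.randrange(1, bound)) if (i, j) in edges else Fraction(0)
              for j in range(n)] for i in range(n)]
        return det(A) != 0

    edges = {(0, 0), (0, 1), (1, 1), (2, 2)}      # pairs (u_i, v_j)
    print(probably_has_matching(edges, 3))         # True: (0,0), (1,1), (2,2)
    print(probably_has_matching({(0, 0), (1, 0), (2, 0)}, 3))   # False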