How to Multiply: Integers, Matrices, and Polynomials

Slides by Kevin Wayne. Copyright © 2005 Pearson-Addison Wesley. All rights reserved.

Complex Multiplication

Complex multiplication. (a + bi)(c + di) = x + yi.

Grade-school. x = ac - bd, y = bc + ad. [4 multiplications, 2 additions]

Q. Is it possible to do with fewer multiplications?

A. Yes. [Gauss] x = ac - bd, y = (a + b)(c + d) - ac - bd. [3 multiplications, 5 additions]

Remark. Improvement if no hardware multiply.
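Gauss's trick trades one multiplication for three extra additions; a minimal Python sketch (the function name is mine, not from the slides):

```python
def complex_mult_gauss(a, b, c, d):
    """Compute (a + bi)(c + di) = x + yi with 3 real multiplications."""
    ac = a * c
    bd = b * d
    x = ac - bd                       # real part: ac - bd
    y = (a + b) * (c + d) - ac - bd   # imaginary part: equals bc + ad
    return x, y
```

For example, complex_mult_gauss(1, 2, 3, 4) returns (-5, 10), matching (1 + 2i)(3 + 4i) = -5 + 10i.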

5.5 Integer Multiplication

Integer Addition

Addition. Given two n-bit integers a and b, compute a + b. Grade-school. Θ(n) bit operations.

  carries   1 1 1 1 1 1 0 1
              1 1 0 1 0 1 0 1
        +     0 1 1 1 1 1 0 1
            -----------------
            1 0 1 0 1 0 0 1 0

Remark. Grade-school addition algorithm is optimal.

Integer Multiplication

Multiplication. Given two n-bit integers a and b, compute a × b. Grade-school. Θ(n²) bit operations.

[figure: grade-school multiplication of 11010101 × 01111101: one shifted partial product per 1-bit of the multiplier, all summed]

Q. Is grade-school multiplication algorithm optimal?
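For reference, the grade-school method the question refers to can be sketched as binary shift-and-add, which does Θ(n) bit operations for each of the n multiplier bits (function name is mine):

```python
def grade_school_multiply(a, b):
    """Binary long multiplication: one shifted copy of a per 1-bit of b."""
    result, shift = 0, 0
    while b:
        if b & 1:                  # current bit of the multiplier is 1:
            result += a << shift   # add a, shifted into position
        b >>= 1
        shift += 1
    return result
```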

Divide-and-Conquer Multiplication: Warmup

To multiply two n-bit integers a and b:

• Multiply four ½n-bit integers, recursively.

• Add and shift to obtain result.

a = 2^(n/2) · a1 + a0
b = 2^(n/2) · b1 + b0
ab = (2^(n/2) a1 + a0)(2^(n/2) b1 + b0) = 2^n · a1b1 + 2^(n/2) · (a1b0 + a0b1) + a0b0

Ex. a = 10001101 (a1 = 1000, a0 = 1101), b = 11100001 (b1 = 1110, b0 = 0001).

T(n) = 4T(n/2) [recursive calls] + Θ(n) [add, shift]  ⇒  T(n) = Θ(n²)

Recursion Tree

T(n) = 0 if n = 0; T(n) = 4T(n/2) + n otherwise.

⇒ T(n) = n · Σ_{k=0}^{lg n} 2^k = n · (2^(1+lg n) - 1) / (2 - 1) = 2n² - n

[recursion tree: the root costs n; level k has 4^k subproblems of size n/2^k, for a per-level cost of 4^k (n/2^k) = 2^k n; the bottom level has 4^(lg n) = n² leaves of cost 1 each]

Karatsuba Multiplication

To multiply two n-bit integers a and b:

• Add two ½n-bit integers.

• Multiply three ½n-bit integers, recursively.

• Add, subtract, and shift to obtain result.

a = 2^(n/2) · a1 + a0
b = 2^(n/2) · b1 + b0
ab = 2^n · a1b1 + 2^(n/2) · (a1b0 + a0b1) + a0b0
   = 2^n · a1b1 + 2^(n/2) · ((a1 + a0)(b1 + b0) - a1b1 - a0b0) + a0b0

Theorem. [Karatsuba-Ofman 1962] Can multiply two n-bit integers in O(n^1.585) bit operations.

T(n) ≤ T(⌊n/2⌋) + T(⌈n/2⌉) + T(1 + ⌈n/2⌉) [recursive calls] + Θ(n) [add, subtract, shift]
⇒ T(n) = O(n^(lg 3)) = O(n^1.585)

Karatsuba: Recursion Tree

T(n) = 0 if n = 0; T(n) = 3T(n/2) + n otherwise.

⇒ T(n) = n · Σ_{k=0}^{lg n} (3/2)^k = n · ((3/2)^(1+lg n) - 1) / ((3/2) - 1) = 3n^(lg 3) - 2n

[recursion tree: the root costs n; level k has 3^k subproblems of size n/2^k, for a per-level cost of 3^k (n/2^k) = (3/2)^k n; the bottom level has 3^(lg n) = n^(lg 3) leaves]

Fast Integer Division Too (!)

Integer division. Given two n-bit (or less) integers s and t, compute quotient q = ⌊s / t⌋ and remainder r = s mod t.

Fact. Complexity of integer division is the same as integer multiplication.

To compute quotient q:

• Approximate x = 1 / t using Newton's method: x_{i+1} = 2x_i - t·x_i².

• After log n iterations, either q = ⌊s·x⌋ or q = ⌈s·x⌉. [using fast multiplication]
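The Newton iteration is easy to try in floating point; a toy sketch (a real implementation works in fixed point with fast multiplication, and the initial guess must satisfy |1 - t·x0| < 1 for convergence):

```python
def reciprocal_newton(t, x0=0.1, iters=10):
    """Approximate 1/t by iterating x <- 2x - t*x^2 (error squares each step)."""
    x = x0
    for _ in range(iters):
        x = 2 * x - t * x * x
    return x
```

With x = reciprocal_newton(7), the quotient ⌊s/7⌋ is then ⌊s·x⌋ up to the one-off rounding check mentioned above.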

Matrix Multiplication

Dot Product

Dot product. Given two length-n vectors a and b, compute c = a ⋅ b. Grade-school. Θ(n) arithmetic operations.

a ⋅ b = Σ_{i=1}^{n} a_i b_i

Ex. a = [.70 .20 .10], b = [.30 .40 .30].
a ⋅ b = (.70 × .30) + (.20 × .40) + (.10 × .30) = .32

Remark. Grade-school dot product algorithm is optimal.

Matrix Multiplication

Matrix multiplication. Given two n-by-n matrices A and B, compute C = AB. Grade-school. Θ(n³) arithmetic operations.

c_ij = Σ_{k=1}^{n} a_ik · b_kj

[ c11 c12 … c1n ]   [ a11 a12 … a1n ]   [ b11 b12 … b1n ]
[ c21 c22 … c2n ] = [ a21 a22 … a2n ] × [ b21 b22 … b2n ]
[  ⋮   ⋮  ⋱  ⋮  ]   [  ⋮   ⋮  ⋱  ⋮  ]   [  ⋮   ⋮  ⋱  ⋮  ]
[ cn1 cn2 … cnn ]   [ an1 an2 … ann ]   [ bn1 bn2 … bnn ]

Ex.
[ .59 .32 .41 ]   [ .70 .20 .10 ]   [ .80 .30 .50 ]
[ .31 .36 .25 ] = [ .30 .60 .10 ] × [ .10 .40 .10 ]
[ .45 .31 .42 ]   [ .50 .10 .40 ]   [ .10 .30 .40 ]

Q. Is grade-school matrix multiplication algorithm optimal?

Block Matrix Multiplication

[  152  158  164  170 ]   [  0  1  2  3 ]   [ 16 17 18 19 ]
[  504  526  548  570 ] = [  4  5  6  7 ] × [ 20 21 22 23 ]
[  856  894  932  970 ]   [  8  9 10 11 ]   [ 24 25 26 27 ]
[ 1208 1262 1316 1370 ]   [ 12 13 14 15 ]   [ 28 29 30 31 ]

C11 = A11 × B11 + A12 × B21 = [ 0 1 ] × [ 16 17 ] + [ 2 3 ] × [ 24 25 ] = [ 152 158 ]
                              [ 4 5 ]   [ 20 21 ]   [ 6 7 ]   [ 28 29 ]   [ 504 526 ]
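The block identity can be checked against the slide's numbers with a few lines of Python (helper names are mine):

```python
def matmul(X, Y):
    """Grade-school matrix product of lists-of-lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Entrywise matrix sum."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

# Blocks of the 4-by-4 example above.
A11, A12 = [[0, 1], [4, 5]], [[2, 3], [6, 7]]
B11, B21 = [[16, 17], [20, 21]], [[24, 25], [28, 29]]
C11 = matadd(matmul(A11, B11), matmul(A12, B21))
# C11 == [[152, 158], [504, 526]], the top-left block of the 4-by-4 product
```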

Matrix Multiplication: Warmup

To multiply two n-by-n matrices A and B:

• Divide: partition A and B into ½n-by-½n blocks.

• Conquer: multiply 8 pairs of ½n-by-½n matrices, recursively.

• Combine: add appropriate products using 4 matrix additions.

[ C11 C12 ] = [ A11 A12 ] × [ B11 B12 ]
[ C21 C22 ]   [ A21 A22 ]   [ B21 B22 ]

C11 = (A11 × B11) + (A12 × B21)
C12 = (A11 × B12) + (A12 × B22)
C21 = (A21 × B11) + (A22 × B21)
C22 = (A21 × B12) + (A22 × B22)

T(n) = 8T(n/2) [recursive calls] + Θ(n²) [add, form submatrices]  ⇒  T(n) = Θ(n³)

Fast Matrix Multiplication

Key idea. Multiply 2-by-2 blocks with only 7 multiplications.

[ C11 C12 ] = [ A11 A12 ] × [ B11 B12 ]
[ C21 C22 ]   [ A21 A22 ]   [ B21 B22 ]

P1 = A11 × (B12 - B22)
P2 = (A11 + A12) × B22
P3 = (A21 + A22) × B11
P4 = A22 × (B21 - B11)
P5 = (A11 + A22) × (B11 + B22)
P6 = (A12 - A22) × (B21 + B22)
P7 = (A11 - A21) × (B11 + B12)

C11 = P5 + P4 - P2 + P6
C12 = P1 + P2
C21 = P3 + P4
C22 = P5 + P1 - P3 - P7

• 7 multiplications.

• 18 = 8 + 10 additions and subtractions.

Fast Matrix Multiplication

To multiply two n-by-n matrices A and B: [Strassen 1969]

• Divide: partition A and B into ½n-by-½n blocks.

• Compute: 14 ½n-by-½n matrices via 10 matrix additions.

• Conquer: multiply 7 pairs of ½n-by-½n matrices, recursively.

• Combine: 7 products into 4 terms using 8 matrix additions.

Analysis.

• Assume n is a power of 2.

• T(n) = number of arithmetic operations.

T(n) = 7T(n/2) [recursive calls] + Θ(n²) [add, subtract]  ⇒  T(n) = Θ(n^(log₂ 7)) = O(n^2.81)
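Strassen's seven products translate directly into code; a recursive sketch for lists-of-lists, assuming n is a power of 2 (pad with zeros otherwise):

```python
def strassen(A, B):
    """Strassen's algorithm on n-by-n matrices, n a power of 2."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    def quad(M):   # split M into its four h-by-h blocks
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])
    def add(X, Y): return [[x + y for x, y in zip(r, s)] for r, s in zip(X, Y)]
    def sub(X, Y): return [[x - y for x, y in zip(r, s)] for r, s in zip(X, Y)]
    A11, A12, A21, A22 = quad(A)
    B11, B12, B21, B22 = quad(B)
    P1 = strassen(A11, sub(B12, B22))
    P2 = strassen(add(A11, A12), B22)
    P3 = strassen(add(A21, A22), B11)
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A22), add(B11, B22))
    P6 = strassen(sub(A12, A22), add(B21, B22))
    P7 = strassen(sub(A11, A21), add(B11, B12))
    C11 = add(sub(add(P5, P4), P2), P6)     # P5 + P4 - P2 + P6
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P5, P1), P3), P7)     # P5 + P1 - P3 - P7
    return ([r1 + r2 for r1, r2 in zip(C11, C12)] +
            [r1 + r2 for r1, r2 in zip(C21, C22)])
```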


Fast Matrix Multiplication: Practice

Implementation issues.

• Sparsity.

• Caching effects.

• Numerical stability.

• Odd matrix dimensions.

• Crossover to classical algorithm around n = 128.

Common misperception. “Strassen is only a theoretical curiosity.”

• Apple reports 8x speedup on G4 Velocity Engine when n ≈ 2,500.

• Range of instances where it's useful is a subject of controversy.

Remark. Can "Strassenize" Ax = b, determinant, eigenvalues, SVD, ….

Fast Matrix Multiplication: Theory

Q. Multiply two 2-by-2 matrices with 7 scalar multiplications?
A. Yes! [Strassen 1969] Θ(n^(log₂ 7)) = O(n^2.807)

Q. Multiply two 2-by-2 matrices with 6 scalar multiplications?
A. Impossible. [Hopcroft and Kerr 1971] Θ(n^(log₂ 6)) = O(n^2.59)

Q. Two 3-by-3 matrices with 21 scalar multiplications?
A. Also impossible. Θ(n^(log₃ 21)) = O(n^2.77)

Begun, the decimal wars have. [Pan, Bini et al., Schönhage, …]

• Two 20-by-20 matrices with 4,460 scalar multiplications. O(n^2.805)

• Two 48-by-48 matrices with 47,217 scalar multiplications. O(n^2.7801)

• A year later. O(n^2.7799)

• December, 1979. O(n^2.521813)

• January, 1980. O(n^2.521801)

Fast Matrix Multiplication: Theory

Best known. O(n^2.376) [Coppersmith-Winograd, 1987]

Conjecture. O(n^(2+ε)) for any ε > 0.

Caveat. Theoretical improvements to Strassen are progressively less practical.

5.6 Convolution and FFT

Fourier Analysis

Fourier theorem. [Fourier, Dirichlet, Riemann] Any (sufficiently smooth) periodic function can be expressed as the sum of a series of sinusoids.

y(t) = 2 Σ_{k=1}^{N} sin(kt) / k

[figure: partial sums of the series for increasing N, converging to a sawtooth wave]

Euler's Identity

Sinusoids. Sum of sines and cosines.

eix = cos x + i sin x

Euler's identity

Sinusoids. Sum of complex exponentials.

Time Domain vs. Frequency Domain

Signal. [touch tone button 1] y(t) = ½ sin(2π · 697 t) + ½ sin(2π · 1209 t)

Time domain. [plot: sound pressure vs. time (seconds)]

Frequency domain. [plot: amplitude vs. frequency (Hz), with spikes of amplitude 0.5 at 697 Hz and 1209 Hz]

Reference: Cleve Moler, Numerical Computing with MATLAB

Time Domain vs. Frequency Domain

Signal. [recording, 8192 samples per second]

Magnitude of discrete Fourier transform.

Reference: Cleve Moler, Numerical Computing with MATLAB

Fast Fourier Transform

FFT. Fast way to convert between time-domain and frequency-domain.

Alternate viewpoint. Fast way to multiply and evaluate polynomials. [we take this approach]

"If you speed up any nontrivial algorithm by a factor of a million or so the world will beat a path towards finding useful applications for it." - Numerical Recipes

Fast Fourier Transform: Applications

Applications.

• Optics, acoustics, quantum physics, telecommunications, radar, control systems, signal processing, speech recognition, data compression, image processing, seismology, mass spectrometry, …

• Digital media. [DVD, JPEG, MP3, H.264]

• Medical diagnostics. [MRI, CT, PET scans, ultrasound]

• Numerical solutions to Poisson's equation.

• Shor's quantum factoring algorithm.

• …

The FFT is one of the truly great computational developments of [the 20th] century. It has changed the face of science and engineering so much that it is not an exaggeration to say that life as we know it would be very different without the FFT. -Charles van Loan

Fast Fourier Transform: Brief History

Gauss (1805, 1866). Analyzed periodic motion of asteroid Ceres.

Runge-König (1924). Laid theoretical groundwork.

Danielson-Lanczos (1942). Efficient algorithm, x-ray crystallography.

Cooley-Tukey (1965). Monitoring nuclear tests in Soviet Union and tracking submarines. Rediscovered and popularized FFT.

Importance not fully realized until advent of digital computers.

Polynomials: Coefficient Representation

Polynomial. [coefficient representation]

A(x) = a0 + a1 x + a2 x² + … + a_(n-1) x^(n-1)
B(x) = b0 + b1 x + b2 x² + … + b_(n-1) x^(n-1)

Add. O(n) arithmetic operations.

A(x) + B(x) = (a0 + b0) + (a1 + b1) x + … + (a_(n-1) + b_(n-1)) x^(n-1)

Evaluate. O(n) using Horner's method.

A(x) = a0 + x(a1 + x(a2 + … + x(a_(n-2) + x a_(n-1)) … ))

Multiply (convolve). O(n²) using brute force.

A(x) × B(x) = Σ_{i=0}^{2n-2} c_i x^i, where c_i = Σ_{j=0}^{i} a_j b_(i-j)
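The three operations are each a few lines of Python (function names mine):

```python
def poly_add(a, b):
    """Coefficient-wise sum; O(n)."""
    if len(a) < len(b):
        a, b = b, a
    return [x + (b[i] if i < len(b) else 0) for i, x in enumerate(a)]

def horner(a, x):
    """Evaluate a0 + a1*x + ... + a_(n-1)*x^(n-1); O(n)."""
    result = 0
    for coef in reversed(a):
        result = result * x + coef
    return result

def poly_mult(a, b):
    """Brute-force convolution c_i = sum_j a_j * b_(i-j); O(n^2)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c
```

For example, poly_mult([1, 2], [3, 4]) gives [3, 10, 8], i.e. (1 + 2x)(3 + 4x) = 3 + 10x + 8x².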

A Modest PhD Dissertation Title

"New Proof of the Theorem That Every Algebraic Rational Integral Function In One Variable can be Resolved into Real Factors of the First or the Second Degree."

- Gauss's PhD dissertation, 1799, University of Helmstedt

Polynomials: Point-Value Representation

Fundamental theorem of algebra. [Gauss, PhD thesis] A degree n polynomial with complex coefficients has exactly n complex roots.

Corollary. A degree n-1 polynomial A(x) is uniquely specified by its evaluation at n distinct values of x.

[figure: a degree n-1 polynomial passing through n points (x_j, y_j), where y_j = A(x_j)]

Polynomials: Point-Value Representation

Polynomial. [point-value representation]

A(x): (x0, y0), …, (x_(n-1), y_(n-1))
B(x): (x0, z0), …, (x_(n-1), z_(n-1))

Add. O(n) arithmetic operations.

A(x) + B(x): (x0, y0 + z0), …, (x_(n-1), y_(n-1) + z_(n-1))

Multiply (convolve). O(n), but need 2n-1 points.

A(x) × B(x): (x0, y0 × z0), …, (x_(2n-2), y_(2n-2) × z_(2n-2))

Evaluate. O(n²) using Lagrange's formula.

A(x) = Σ_{k=0}^{n-1} y_k · Π_{j≠k} (x - x_j) / Π_{j≠k} (x_k - x_j)

Converting Between Two Polynomial Representations

Tradeoff. Fast evaluation or fast multiplication. We want both!

representation   multiply   evaluate
coefficient      O(n²)      O(n)
point-value      O(n)       O(n²)

Goal. Efficient conversion between the two representations ⇒ all ops fast.

coefficient representation a0, a1, …, a_(n-1)  ⇄  point-value representation (x0, y0), …, (x_(n-1), y_(n-1))

Converting Between Two Representations: Brute Force

Coefficient ⇒ point-value. Given a polynomial a0 + a1 x + … + a_(n-1) x^(n-1), evaluate it at n distinct points x0, …, x_(n-1).

[ y0      ]   [ 1  x0       x0²       …  x0^(n-1)      ]   [ a0      ]
[ y1      ]   [ 1  x1       x1²       …  x1^(n-1)      ]   [ a1      ]
[ y2      ] = [ 1  x2       x2²       …  x2^(n-1)      ] × [ a2      ]
[ ⋮       ]   [ ⋮   ⋮        ⋮        ⋱   ⋮            ]   [ ⋮       ]
[ y_(n-1) ]   [ 1  x_(n-1)  x_(n-1)²  …  x_(n-1)^(n-1) ]   [ a_(n-1) ]

Running time. O(n²) for matrix-vector multiply (or n applications of Horner's method).

Converting Between Two Representations: Brute Force

Point-value ⇒ coefficient. Given n distinct points x0, …, x_(n-1) and values y0, …, y_(n-1), find the unique polynomial a0 + a1 x + … + a_(n-1) x^(n-1) that has the given values at the given points.

[ y0      ]   [ 1  x0       x0²       …  x0^(n-1)      ]   [ a0      ]
[ y1      ]   [ 1  x1       x1²       …  x1^(n-1)      ]   [ a1      ]
[ y2      ] = [ 1  x2       x2²       …  x2^(n-1)      ] × [ a2      ]
[ ⋮       ]   [ ⋮   ⋮        ⋮        ⋱   ⋮            ]   [ ⋮       ]
[ y_(n-1) ]   [ 1  x_(n-1)  x_(n-1)²  …  x_(n-1)^(n-1) ]   [ a_(n-1) ]

[The Vandermonde matrix is invertible iff the x_i are distinct.]

Running time. O(n³) for Gaussian elimination (or O(n^2.376) via fast matrix multiplication).

Divide-and-Conquer

Decimation in frequency. Break up polynomial into low and high powers.

• A(x) = a0 + a1x + a2x² + a3x³ + a4x⁴ + a5x⁵ + a6x⁶ + a7x⁷.

• A_low(x) = a0 + a1x + a2x² + a3x³.

• A_high(x) = a4 + a5x + a6x² + a7x³.

• A(x) = A_low(x) + x⁴ A_high(x).

Decimation in time. Break up polynomial into even and odd powers.

• A(x) = a0 + a1x + a2x² + a3x³ + a4x⁴ + a5x⁵ + a6x⁶ + a7x⁷.

• A_even(x) = a0 + a2x + a4x² + a6x³.

• A_odd(x) = a1 + a3x + a5x² + a7x³.

• A(x) = A_even(x²) + x A_odd(x²).

Coefficient to Point-Value Representation: Intuition

Coefficient ⇒ point-value. Given a polynomial a0 + a1 x + … + a_(n-1) x^(n-1), evaluate it at n distinct points x0, …, x_(n-1). [we get to choose which ones!]

Divide. Break up polynomial into even and odd powers.

• A(x) = a0 + a1x + a2x² + a3x³ + a4x⁴ + a5x⁵ + a6x⁶ + a7x⁷.

• A_even(x) = a0 + a2x + a4x² + a6x³.

• A_odd(x) = a1 + a3x + a5x² + a7x³.

• A(x) = A_even(x²) + x A_odd(x²).

• A(-x) = A_even(x²) - x A_odd(x²).

Intuition. Choose two points to be ±1.

• A(1) = A_even(1) + 1 · A_odd(1).

• A(-1) = A_even(1) - 1 · A_odd(1).

Can evaluate a polynomial of degree ≤ n at 2 points by evaluating two polynomials of degree ≤ ½n at 1 point.

Coefficient to Point-Value Representation: Intuition

Coefficient ⇒ point-value. Given a polynomial a0 + a1 x + … + a_(n-1) x^(n-1), evaluate it at n distinct points x0, …, x_(n-1). [we get to choose which ones!]

Divide. Break up polynomial into even and odd powers.

• A(x) = a0 + a1x + a2x² + a3x³ + a4x⁴ + a5x⁵ + a6x⁶ + a7x⁷.

• A_even(x) = a0 + a2x + a4x² + a6x³.

• A_odd(x) = a1 + a3x + a5x² + a7x³.

• A(x) = A_even(x²) + x A_odd(x²).

• A(-x) = A_even(x²) - x A_odd(x²).

Intuition. Choose four complex points to be ±1, ±i.

• A(1) = A_even(1) + 1 · A_odd(1).

• A(-1) = A_even(1) - 1 · A_odd(1).

• A(i) = A_even(-1) + i · A_odd(-1).

• A(-i) = A_even(-1) - i · A_odd(-1).

Can evaluate a polynomial of degree ≤ n at 4 points by evaluating two polynomials of degree ≤ ½n at 2 points.

Discrete Fourier Transform

Coefficient ⇒ point-value. Given a polynomial a0 + a1 x + … + a_(n-1) x^(n-1), evaluate it at n distinct points x0, …, x_(n-1).

Key idea. Choose x_k = ω^k where ω is a principal nth root of unity.

[ y0      ]   [ 1  1        1           1           …  1              ]   [ a0      ]
[ y1      ]   [ 1  ω        ω²          ω³          …  ω^(n-1)        ]   [ a1      ]
[ y2      ]   [ 1  ω²       ω⁴          ω⁶          …  ω^(2(n-1))     ]   [ a2      ]
[ y3      ] = [ 1  ω³       ω⁶          ω⁹          …  ω^(3(n-1))     ] × [ a3      ]
[ ⋮       ]   [ ⋮   ⋮        ⋮           ⋮          ⋱   ⋮             ]   [ ⋮       ]
[ y_(n-1) ]   [ 1  ω^(n-1)  ω^(2(n-1))  ω^(3(n-1))  …  ω^((n-1)(n-1)) ]   [ a_(n-1) ]

This map is the discrete Fourier transform (DFT); the matrix is the Fourier matrix F_n.

Roots of Unity

Def. An nth root of unity is a complex number x such that xⁿ = 1.

Fact. The nth roots of unity are ω⁰, ω¹, …, ω^(n-1), where ω = e^(2πi/n).
Pf. (ω^k)ⁿ = (e^(2πik/n))ⁿ = (e^(πi))^(2k) = (-1)^(2k) = 1.

Fact. The ½nth roots of unity are ν⁰, ν¹, …, ν^(n/2-1), where ν = ω² = e^(4πi/n).

[figure: the 8th roots of unity on the unit circle (n = 8), with ω⁰ = ν⁰ = 1, ω² = ν¹ = i, ω⁴ = ν² = -1, ω⁶ = ν³ = -i]

Fast Fourier Transform

Goal. Evaluate a degree n-1 polynomial A(x) = a0 + … + a_(n-1) x^(n-1) at its nth roots of unity: ω⁰, ω¹, …, ω^(n-1).

Divide. Break up polynomial into even and odd powers.

• A_even(x) = a0 + a2x + a4x² + … + a_(n-2) x^(n/2-1).

• A_odd(x) = a1 + a3x + a5x² + … + a_(n-1) x^(n/2-1).

• A(x) = A_even(x²) + x A_odd(x²).

Conquer. Evaluate A_even(x) and A_odd(x) at the ½nth roots of unity: ν⁰, ν¹, …, ν^(n/2-1).

Combine. [using ν^k = (ω^k)² = (ω^(k+½n))² and ω^(k+½n) = -ω^k]

• A(ω^k) = A_even(ν^k) + ω^k A_odd(ν^k), 0 ≤ k < n/2.

• A(ω^(k+½n)) = A_even(ν^k) - ω^k A_odd(ν^k), 0 ≤ k < n/2.

FFT Algorithm

fft(n, a0, a1, …, a_(n-1)) {
   if (n == 1) return a0
   (e0, e1, …, e_(n/2-1)) ← FFT(n/2, a0, a2, a4, …, a_(n-2))
   (d0, d1, …, d_(n/2-1)) ← FFT(n/2, a1, a3, a5, …, a_(n-1))
   for k = 0 to n/2 - 1 {
      ω^k ← e^(2πik/n)
      y_k       ← e_k + ω^k d_k
      y_(k+n/2) ← e_k - ω^k d_k
   }
   return (y0, y1, …, y_(n-1))
}
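In Python the pseudocode above becomes (a sketch; len(a) must be a power of 2):

```python
import cmath

def fft(a):
    """Evaluate the polynomial with coefficients a at the n-th roots of unity."""
    n = len(a)
    if n == 1:
        return list(a)
    e = fft(a[0::2])                            # even-index coefficients
    d = fft(a[1::2])                            # odd-index coefficients
    y = [0] * n
    for k in range(n // 2):
        w = cmath.exp(2j * cmath.pi * k / n)    # omega^k
        y[k] = e[k] + w * d[k]
        y[k + n // 2] = e[k] - w * d[k]
    return y
```

Comparing against the direct O(n²) DFT (y_k = Σ_j a_j ω^(jk)) confirms the values agree.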

FFT Summary

Theorem. The FFT algorithm evaluates a degree n-1 polynomial at each of the nth roots of unity in O(n log n) steps. [assumes n is a power of 2]

Running time. T(n) = 2T(n/2) + Θ(n) ⇒ T(n) = Θ(n log n)

[diagram: coefficient representation a0, a1, …, a_(n-1) → O(n log n) → point-value representation (ω⁰, y0), …, (ω^(n-1), y_(n-1)); reverse direction: ???]

Recursion Tree

a0, a1, a2, a3, a4, a5, a6, a7

perfect shuffle

a0, a2, a4, a6 a1, a3, a5, a7

a0, a4 a2, a6 a1, a5 a3, a7

a0 a4 a2 a6 a1 a5 a3 a7

000 100 010 110 001 101 011 111

"bit-reversed"

Inverse Discrete Fourier Transform

Point-value ⇒ coefficient. Given n distinct points x0, …, x_(n-1) and values y0, …, y_(n-1), find the unique polynomial a0 + a1 x + … + a_(n-1) x^(n-1) that has the given values at the given points.

[ a0      ]   [ 1  1        1           1           …  1              ]⁻¹   [ y0      ]
[ a1      ]   [ 1  ω        ω²          ω³          …  ω^(n-1)        ]     [ y1      ]
[ a2      ]   [ 1  ω²       ω⁴          ω⁶          …  ω^(2(n-1))     ]     [ y2      ]
[ a3      ] = [ 1  ω³       ω⁶          ω⁹          …  ω^(3(n-1))     ]  ×  [ y3      ]
[ ⋮       ]   [ ⋮   ⋮        ⋮           ⋮          ⋱   ⋮             ]     [ ⋮       ]
[ a_(n-1) ]   [ 1  ω^(n-1)  ω^(2(n-1))  ω^(3(n-1))  …  ω^((n-1)(n-1)) ]     [ y_(n-1) ]

This is the inverse DFT; the matrix is the Fourier matrix inverse (F_n)⁻¹.

Inverse DFT

Claim. The inverse of the Fourier matrix F_n is given by the following formula.

G_n = (1/n) ×
[ 1  1           1            1            …  1               ]
[ 1  ω⁻¹         ω⁻²          ω⁻³          …  ω^(-(n-1))      ]
[ 1  ω⁻²         ω⁻⁴          ω⁻⁶          …  ω^(-2(n-1))     ]
[ 1  ω⁻³         ω⁻⁶          ω⁻⁹          …  ω^(-3(n-1))     ]
[ ⋮   ⋮           ⋮            ⋮           ⋱   ⋮              ]
[ 1  ω^(-(n-1))  ω^(-2(n-1))  ω^(-3(n-1))  …  ω^(-(n-1)(n-1)) ]

[(1/√n) F_n is unitary]

Consequence. To compute the inverse FFT, apply the same algorithm but use ω⁻¹ = e^(-2πi/n) as the principal nth root of unity (and divide by n).

Inverse FFT: Proof of Correctness

Claim. F_n and G_n are inverses.
Pf.

(F_n G_n)_(k,k') = (1/n) Σ_{j=0}^{n-1} ω^(kj) · ω^(-jk') = (1/n) Σ_{j=0}^{n-1} ω^((k-k')j) = { 1 if k = k'; 0 otherwise }

[by the summation lemma]

Summation lemma. Let ω be a principal nth root of unity. Then

Σ_{j=0}^{n-1} ω^(kj) = { n if k ≡ 0 mod n; 0 otherwise }

Pf.

• If k is a multiple of n, then ω^k = 1 ⇒ the series sums to n.

• Each nth root of unity ω^k is a root of xⁿ - 1 = (x - 1)(1 + x + x² + … + x^(n-1)).

• If ω^k ≠ 1, then 1 + ω^k + ω^(2k) + … + ω^((n-1)k) = 0 ⇒ the series sums to 0. ▪

Inverse FFT: Algorithm

ifft(n, a0, a1, …, a_(n-1)) {
   if (n == 1) return a0
   (e0, e1, …, e_(n/2-1)) ← IFFT(n/2, a0, a2, a4, …, a_(n-2))
   (d0, d1, …, d_(n/2-1)) ← IFFT(n/2, a1, a3, a5, …, a_(n-1))
   for k = 0 to n/2 - 1 {
      ω^k ← e^(-2πik/n)
      y_k       ← (e_k + ω^k d_k) / 2
      y_(k+n/2) ← (e_k - ω^k d_k) / 2
   }
   return (y0, y1, …, y_(n-1))
}

[dividing by 2 at each of the lg n levels of recursion yields the overall 1/n factor]
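A Python version of this pseudocode; dividing by 2 per level accumulates the 1/n factor over the lg n levels (a sketch, len(y) a power of 2):

```python
import cmath

def ifft(y):
    """Recover coefficients from values at the n-th roots of unity."""
    n = len(y)
    if n == 1:
        return list(y)
    e = ifft(y[0::2])
    d = ifft(y[1::2])
    a = [0] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)    # omega^{-k}
        a[k] = (e[k] + w * d[k]) / 2             # /2 per level => /n overall
        a[k + n // 2] = (e[k] - w * d[k]) / 2
    return a
```

Feeding in the DFT of a coefficient vector returns (to within floating-point error) the original coefficients.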

Inverse FFT Summary

Theorem. The inverse FFT algorithm interpolates a degree n-1 polynomial given its values at each of the nth roots of unity in O(n log n) steps. [assumes n is a power of 2]

[diagram: coefficient representation a0, a1, …, a_(n-1) ⇄ point-value representation (ω⁰, y0), …, (ω^(n-1), y_(n-1)), O(n log n) in each direction]

Polynomial Multiplication

Theorem. Can multiply two degree n-1 polynomials in O(n log n) steps. [pad with 0s to make n a power of 2]

[diagram:
coefficient representation a0, a1, …, a_(n-1) and b0, b1, …, b_(n-1)
  → 2 FFTs, O(n log n) →
point-value A(ω⁰), …, A(ω^(2n-1)) and B(ω⁰), …, B(ω^(2n-1))
  → pointwise multiplication, O(n) →
C(ω⁰), …, C(ω^(2n-1))
  → inverse FFT, O(n log n) →
coefficient representation c0, c1, …, c_(2n-2)]

FFT in Practice?

FFT in Practice

Fastest Fourier transform in the West. [Frigo and Johnson]

• Optimized C library.

• Features: DFT, DCT, real, complex, any size, any dimension.

• Won 1999 Wilkinson Prize for Numerical Software.

• Portable, competitive with vendor-tuned code.

Implementation details.

• Instead of executing a predetermined algorithm, it evaluates your hardware and uses a special-purpose compiler to generate an optimized algorithm catered to the "shape" of the problem.

• Core algorithm is a nonrecursive version of Cooley-Tukey.

• O(n log n), even for prime sizes.

Reference: http://www.fftw.org

Integer Multiplication, Redux

Integer multiplication. Given two n-bit integers a = a_(n-1) … a1 a0 and b = b_(n-1) … b1 b0, compute their product a ⋅ b.

Convolution algorithm.

• Form two polynomials A(x) = a0 + a1 x + a2 x² + … + a_(n-1) x^(n-1) and B(x) = b0 + b1 x + b2 x² + … + b_(n-1) x^(n-1). [Note: a = A(2), b = B(2).]

• Compute C(x) = A(x) ⋅ B(x).

• Evaluate C(2) = a ⋅ b.

• Running time: O(n log n) complex arithmetic operations.
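The reduction is easy to demonstrate; this sketch uses the brute-force O(n²) convolution in place of the FFT so it stays self-contained:

```python
def int_mult_via_poly(a, b):
    """Multiply integers by convolving their binary-digit polynomials."""
    da = [(a >> i) & 1 for i in range(a.bit_length())]   # so A(2) = a
    db = [(b >> i) & 1 for i in range(b.bit_length())]   # so B(2) = b
    c = [0] * (len(da) + len(db) - 1)
    for i, x in enumerate(da):                           # C(x) = A(x) * B(x)
        for j, y in enumerate(db):
            c[i + j] += x * y
    return sum(ci << i for i, ci in enumerate(c))        # C(2) = a * b
```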

Theory. [Schönhage-Strassen 1971] O(n log n log log n) bit operations.
Theory. [Fürer 2007] O(n log n 2^(O(log* n))) bit operations.

Integer Multiplication, Redux

Integer multiplication. Given two n-bit integers a = a_(n-1) … a1 a0 and b = b_(n-1) … b1 b0, compute their product a ⋅ b.

"the fastest bignum library on the planet"

Practice. [GNU Multiple Precision Arithmetic Library] It uses brute force, Karatsuba, and FFT, depending on the size of n.

Integer Arithmetic

Fundamental open question. What is the complexity of arithmetic?

operation        upper bound                 lower bound
addition         O(n)                        Ω(n)
multiplication   O(n log n 2^(O(log* n)))    Ω(n)
division         O(n log n 2^(O(log* n)))    Ω(n)

Factoring

Factoring. Given an n-bit integer, find its prime factorization.

2773 = 47 × 59

2⁶⁷ - 1 = 147573952589676412927 = 193707721 × 761838257287

[a disproof of Mersenne's conjecture that 2⁶⁷ - 1 is prime]

740375634795617128280467960974295731425931888892312890849 362326389727650340282662768919964196251178439958943305021 275853701189680982867331732731089309005525051168770632990 72396380786710086096962537934650563796359

RSA-704 ($30,000 prize if you can factor)

Factoring and RSA

Primality. Given an n-bit integer, is it prime? Factoring. Given an n-bit integer, find its prime factorization.

Significance. Efficient primality testing ⇒ can implement RSA. Significance. Efficient factoring ⇒ can break RSA.

Theorem. [AKS 2002] Poly-time algorithm for primality testing.

Shor's Algorithm

Shor's algorithm. Can factor an n-bit integer in O(n³) time on a quantum computer. [the algorithm uses the quantum Fourier transform]

Ramification. At least one of the following is wrong:

• RSA is secure.

• Textbook quantum mechanics.

• The extended Church-Turing thesis.

Shor's Factoring Algorithm

Period finding.

2^i         1  2  4  8  16  32  64  128  …
2^i mod 15  1  2  4  8  1   2   4   8    …   period = 4
2^i mod 21  1  2  4  8  16  11  1   2    …   period = 6

Theorem. [Euler] Let p and q be prime, and let N = p q. Then, the following sequence repeats with a period divisible by (p-1) (q-1):

x mod N, x2 mod N, x3 mod N, x4 mod N, …

Consequence. If we can learn something about the period of the sequence, we can learn something about the divisors of (p-1) (q-1).

by using random values of x, we get the divisors of (p-1) (q-1), and from this, can get the divisors of N = p q
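The periods in the table can be reproduced by brute force (a classical sketch; Shor's speedup comes from finding the period with the quantum Fourier transform instead):

```python
def period(x, N):
    """Smallest r > 0 with x^r = 1 (mod N); requires gcd(x, N) == 1."""
    v, r = x % N, 1
    while v != 1:
        v = (v * x) % N
        r += 1
    return r
```

For example, period(2, 15) == 4, which divides (3-1)·(5-1) = 8, and period(2, 21) == 6, which divides (3-1)·(7-1) = 12.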

Extra Slides

Fourier Matrix Decomposition

F_n =
[ 1  1        1           1           …  1              ]
[ 1  ω        ω²          ω³          …  ω^(n-1)        ]
[ 1  ω²       ω⁴          ω⁶          …  ω^(2(n-1))     ]
[ 1  ω³       ω⁶          ω⁹          …  ω^(3(n-1))     ]
[ ⋮   ⋮        ⋮           ⋮          ⋱   ⋮             ]
[ 1  ω^(n-1)  ω^(2(n-1))  ω^(3(n-1))  …  ω^((n-1)(n-1)) ]

     [ 1 0 0 0 ]        [ ω⁰ 0  0  0  ]        [ a0 ]
I4 = [ 0 1 0 0 ]   D4 = [ 0  ω¹ 0  0  ]    a = [ a1 ]
     [ 0 0 1 0 ]        [ 0  0  ω² 0  ]        [ a2 ]
     [ 0 0 0 1 ]        [ 0  0  0  ω³ ]        [ a3 ]

y = F_n a = [ I_(n/2)   D_(n/2) ] × [ F_(n/2) a_even ]
            [ I_(n/2)  -D_(n/2) ]   [ F_(n/2) a_odd  ]