Sophomoric Matrix Multiplication
Carl C. Cowen
IUPUI
(Indiana University Purdue University Indianapolis)
September 12, 2016, Taylor University

Linear algebra students learn, for m × n matrices A, B, and C, that
matrix addition is A + B = C if and only if a_ij + b_ij = c_ij.

They expect matrix multiplication to be AB = C if and only if
a_ij b_ij = c_ij, but the professor says "No! It is much more
complicated than that!"

Today, I want to explain why this kind of multiplication not only is
sensible but also is very practical, very interesting, and has many
applications in mathematics and related subjects.
Definition. If A and B are m × n matrices, the Schur (or Hadamard or
naive or sophomoric) product of A and B is the m × n matrix C = A•B
with c_ij = a_ij b_ij.

These ideas go back more than a century to Moutard (1894), who didn't
even notice he had proved anything(!), Hadamard (1899), and Schur (1911).
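Aside (not in the original talk): in NumPy this definition is simply the elementwise `*` operator on arrays, in contrast to `@` for ordinary matrix multiplication. A minimal sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

# Schur (Hadamard) product: c_ij = a_ij * b_ij
C = A * B
print(C)      # [[ 5. 12.] [21. 32.]]
print(A @ B)  # the ordinary matrix product, for comparison
```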
Hadamard considered analytic functions f(z) = Σ_{n=0}^∞ a_n z^n and
g(z) = Σ_{n=0}^∞ b_n z^n that have singularities at {α_i} and {β_j}
respectively.

He proved that if h(z) = Σ_{n=0}^∞ a_n b_n z^n has singularities {γ_k},
then {γ_k} ⊂ {α_i β_j}.

This seems a little less surprising when you consider convolutions:
Let f and g be 2π-periodic functions on R and

    a_k = ∫_0^{2π} e^{-ikθ} f(θ) dθ/2π   and   b_k = ∫_0^{2π} e^{-ikθ} g(θ) dθ/2π

so that

    f ∼ Σ_k a_k e^{ikθ}   and   g ∼ Σ_k b_k e^{ikθ}

If h(θ) = ∫_0^{2π} f(θ − t) g(t) dt/2π, then h ∼ Σ_k a_k b_k e^{ikθ},
and f ≥ 0 and g ≥ 0 implies h ≥ 0.
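Aside (not in the original talk): this convolution fact is easy to spot-check numerically, using the discrete Fourier transform as a stand-in for the Fourier coefficients. A sketch, with two arbitrarily chosen nonnegative periodic functions:

```python
import numpy as np

N = 64
theta = 2 * np.pi * np.arange(N) / N
f = 2 + np.cos(theta)        # a nonnegative 2π-periodic function
g = 3 + np.sin(2 * theta)    # another nonnegative one

# Discrete analogue of h(θ) = ∫ f(θ − t) g(t) dt / 2π: a circular convolution
h = np.real(np.fft.ifft(np.fft.fft(f) * np.fft.fft(g))) / N

a = np.fft.fft(f) / N        # ≈ Fourier coefficients a_k of f
b = np.fft.fft(g) / N        # ≈ Fourier coefficients b_k of g
c = np.fft.fft(h) / N        # ≈ Fourier coefficients of h

print(np.allclose(c, a * b))  # True: h ∼ Σ a_k b_k e^{ikθ}
print(h.min() >= 0)           # True: f ≥ 0 and g ≥ 0 force h ≥ 0
```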
Schur's name is most often associated with the matrix product because he
published the first theorem about this kind of matrix multiplication.

Definition. A real (or complex) n × n matrix A is called positive (or
positive semidefinite) if
  • A = A^*
  • ⟨Ax, x⟩ ≥ 0 for all x in R^n (or C^n)
Properties of positivity of matrices:
  • For any m × n matrix A, both AA^* and A^*A are positive.
  • Conversely, if B is positive, then B = AA^* for some A.
  • In statistics, every variance-covariance matrix is positive.
Examples:

  • A = [ 1  2 ; 2  3 ] is NOT positive:

        ⟨ [ 1 2 ; 2 3 ] (2, −1)^t , (2, −1)^t ⟩ = ⟨ (0, 1)^t , (2, −1)^t ⟩ = −1

  • B = [ 1  0 ; 0  2 ] and C = [ 5  −4 ; −4  5 ] are positive,

    but BC = [ 1 0 ; 0 2 ] [ 5 −4 ; −4 5 ] = [ 5 −4 ; −8 10 ] is not.

Schur Product Theorem (1911). If A and B are positive n × n matrices,
then A•B is positive also.
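Aside (not in the original talk): these examples, and the theorem itself, can be checked numerically. A minimal sketch that tests positivity of a real matrix via symmetry plus eigenvalues:

```python
import numpy as np

def is_positive(M, tol=1e-10):
    """Positive (semidefinite): M = M^t and all eigenvalues ≥ 0."""
    return bool(np.allclose(M, M.T) and np.linalg.eigvalsh(M).min() >= -tol)

A = np.array([[1.0, 2.0], [2.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
C = np.array([[5.0, -4.0], [-4.0, 5.0]])

print(is_positive(A))                  # False: e.g. <A(2,-1)^t, (2,-1)^t> = -1
print(is_positive(B), is_positive(C))  # True True
print(is_positive(B @ C))              # False: BC is not even symmetric
print(is_positive(B * C))              # True: the Schur product B•C is positive
```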
Applications:

Experimental design: If A and B are variance-covariance matrices, then
A•B is positive and a variance-covariance matrix also.
P.D.E.'s: Let Ω be a domain in R^2 and let L be the differential operator

    Lu = a_11 ∂²u/∂x² + 2 a_12 ∂²u/∂x∂y + a_22 ∂²u/∂y² + b_1 ∂u/∂x + b_2 ∂u/∂y + cu

L is called elliptic if [ a_11  a_12 ; a_21  a_22 ] is positive definite.
Weak Minimum Principle (Moutard, 1894). If L is elliptic, c < 0, and
Lu ≡ 0 in Ω, then u cannot have a negative minimum value in Ω.

Fejér's Uniqueness Theorem. If L is elliptic on Ω and c < 0, then there
is at most one solution to the boundary value problem

    Lu = f in Ω
    u = g on ∂Ω

such that u is continuous on the closure of Ω and smooth in Ω.
Proof:
If u_1 and u_2 are both solutions with u_1 ≠ u_2, then since
u_1 = g = u_2 on ∂Ω and u_1 − u_2 = 0 = u_2 − u_1 on ∂Ω, either
u_1 − u_2 or u_2 − u_1 must have a negative minimum value in Ω.
But Lu_1 = f = Lu_2 in Ω, so L(u_1 − u_2) ≡ 0 ≡ L(u_2 − u_1) in Ω.
By Moutard's principle, neither can have a negative minimum value in Ω,
so we must have u_1 ≡ u_2.

Goal: Prove Schur's theorem
Recall (AB)^t = B^t A^t and (AB)^* = B^* A^*.

We do the case of real scalars: the complex case is the same, but reals
are more comfortable for most math students than complex numbers.

Use column vectors:

    ⟨x, y⟩ = x_1 y_1 + x_2 y_2 + x_3 y_3 + ··· + x_n y_n = x^t y

But(!) the outer product x y^t is an n × n matrix:

    x y^t = [ x_1 y_1  x_1 y_2  ···  x_1 y_n
              x_2 y_1  x_2 y_2  ···  x_2 y_n
                 ···      ···   ···     ···
              x_n y_1  x_n y_2  ···  x_n y_n ]

We use ⟨x, y⟩ = x^t y and this formula for x y^t to prove

Lemma. An n × n matrix A is positive if and only if

    A = v_1 v_1^t + v_2 v_2^t + v_3 v_3^t + ··· + v_k v_k^t

for vectors v_1, v_2, v_3, ···, v_k and k ≤ n.
Proof:
(⇒) Let x be in R^n and A = v_1 v_1^t + v_2 v_2^t + ··· + v_k v_k^t for
vectors v_1, v_2, ···, v_k. Then

    ⟨Ax, x⟩ = ⟨(v_1 v_1^t + v_2 v_2^t + ··· + v_k v_k^t) x, x⟩
            = Σ_j ⟨v_j v_j^t x, x⟩ = Σ_j (v_j v_j^t x)^t x
            = Σ_j x^t v_j (v_j^t x) = Σ_j (v_j^t x)^t (v_j^t x)
            = Σ_j ⟨v_j, x⟩ ⟨v_j, x⟩ = Σ_j |⟨v_j, x⟩|² ≥ 0
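Aside (not in the original talk): the (⇒) computation can be sanity-checked numerically. A minimal sketch with random vectors v_j:

```python
import numpy as np

rng = np.random.default_rng(0)
vs = [rng.standard_normal(4) for _ in range(3)]
A = sum(np.outer(v, v) for v in vs)        # A = v_1 v_1^t + v_2 v_2^t + v_3 v_3^t

x = rng.standard_normal(4)
lhs = x @ A @ x                            # <Ax, x>
rhs = sum(np.dot(v, x) ** 2 for v in vs)   # sum of <v_j, x>^2
print(np.isclose(lhs, rhs), lhs >= 0)      # True True
```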
(⇐) Let A be positive. Since A = A^t, there is an orthonormal basis for
R^n that consists of eigenvectors for A, call it w_1, w_2, ···, w_n.
Then, for each j, let α_j be the eigenvalue of A with Aw_j = α_j w_j.

Because A is positive, α_j ≥ 0 for all j. We suppose the eigenvalues
have been numbered so that α_j > 0 for 1 ≤ j ≤ k and α_j = 0 for
j ≥ k + 1.

For 1 ≤ j ≤ k, choose β_j > 0 with α_j = β_j² and let v_j = β_j w_j.
Then, we show that

    A = v_1 v_1^t + v_2 v_2^t + ··· + v_k v_k^t
      = α_1 w_1 w_1^t + α_2 w_2 w_2^t + ··· + α_k w_k w_k^t
To show that

    A = α_1 w_1 w_1^t + α_2 w_2 w_2^t + ··· + α_k w_k w_k^t

we will show that for each x in R^n, Ax and
(α_1 w_1 w_1^t + α_2 w_2 w_2^t + ··· + α_k w_k w_k^t) x are the same.

If x is a vector in R^n, then x is a linear combination of the w_j's,
say x = x_1 w_1 + x_2 w_2 + ··· + x_n w_n. Then Ax is given by

    Ax = A(x_1 w_1 + x_2 w_2 + ··· + x_n w_n)
       = x_1 Aw_1 + x_2 Aw_2 + ··· + x_n Aw_n
which is

    Ax = x_1 Aw_1 + x_2 Aw_2 + ··· + x_n Aw_n
       = x_1 α_1 w_1 + x_2 α_2 w_2 + ··· + x_k α_k w_k + ··· + x_n α_n w_n
       = x_1 α_1 w_1 + x_2 α_2 w_2 + ··· + x_k α_k w_k + ··· + x_n · 0 · w_n
       = x_1 α_1 w_1 + x_2 α_2 w_2 + ··· + x_k α_k w_k

Notice that w_i^t w_j = ⟨w_i, w_j⟩ = δ_ij.

Similar to the above calculation,

    (v_1 v_1^t + v_2 v_2^t + ··· + v_k v_k^t) x
      = (α_1 w_1 w_1^t + α_2 w_2 w_2^t + ··· + α_k w_k w_k^t) x
      = (Σ_i α_i w_i w_i^t)(Σ_j x_j w_j)
      = Σ_{i,j} (α_i w_i w_i^t) x_j w_j = Σ_{i,j} α_i x_j w_i (w_i^t w_j)
      = Σ_i α_i x_i w_i = x_1 α_1 w_1 + x_2 α_2 w_2 + ··· + x_k α_k w_k

Thus, for each x, Ax = (v_1 v_1^t + v_2 v_2^t + ··· + v_k v_k^t) x, so

    A = v_1 v_1^t + v_2 v_2^t + ··· + v_k v_k^t
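Aside (not in the original talk): the (⇐) construction is exactly what a symmetric eigendecomposition delivers. A sketch, assuming a random positive matrix built as X X^t:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 3))
A = X @ X.T                          # a positive 4x4 matrix (rank <= 3)

alphas, W = np.linalg.eigh(A)        # eigenvalues alpha_j, orthonormal eigenvectors w_j
# v_j = beta_j w_j with beta_j = sqrt(alpha_j), keeping only strictly positive alpha_j
vs = [np.sqrt(a) * W[:, j] for j, a in enumerate(alphas) if a > 1e-10]

recon = sum(np.outer(v, v) for v in vs)     # sum of v_j v_j^t
print(len(vs) <= 4, np.allclose(recon, A))  # True True
```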
Lemma. If u and v are vectors in R^n, then (u u^t)•(v v^t) = (u•v)(u•v)^t.

Proof:
If u = (u_1, u_2, ···, u_n)^t and v = (v_1, v_2, ···, v_n)^t, then u u^t
has (i, j) entry u_i u_j and v v^t has (i, j) entry v_i v_j. Thus
(u u^t)•(v v^t) has (i, j) entry

    u_i u_j v_i v_j = (u_i v_i)(u_j v_j)

which is exactly the (i, j) entry of (u•v)(u•v)^t.
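Aside (not in the original talk): a one-line numerical check of this lemma with random vectors:

```python
import numpy as np

rng = np.random.default_rng(2)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

lhs = np.outer(u, u) * np.outer(v, v)   # (u u^t) • (v v^t)
rhs = np.outer(u * v, u * v)            # (u•v)(u•v)^t
print(np.allclose(lhs, rhs))            # True
```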
Schur Product Theorem (1911). If A and B are positive n × n matrices,
then A•B is positive also.

Proof:
There are vectors u_1, u_2, ···, u_k so that A = u_1 u_1^t + u_2 u_2^t +
··· + u_k u_k^t and vectors v_1, v_2, ···, v_ℓ so that B = v_1 v_1^t +
v_2 v_2^t + ··· + v_ℓ v_ℓ^t. Now,

    A•B = (Σ_i u_i u_i^t) • (Σ_j v_j v_j^t)
        = Σ_{i,j} (u_i u_i^t)•(v_j v_j^t) = Σ_{i,j} (u_i•v_j)(u_i•v_j)^t ≥ 0

since each (u_i•v_j)(u_i•v_j)^t is positive by the first lemma and a
sum of positive matrices is positive.
Corollary. If A = (a_ij) is a positive n × n matrix, then (a_ij²),
(a_ij³), and (e^{a_ij}) are positive also.

For example, [ 3  −2 ; −2  2 ] is positive, so

    [ 9  4 ; 4  4 ],   [ 27  −8 ; −8  8 ],   and   [ e³  e⁻² ; e⁻²  e² ]

are also positive!
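Aside (not in the original talk): the corollary's example, checked via eigenvalues. Note these are entrywise powers and the entrywise exponential, not matrix powers or the matrix exponential:

```python
import numpy as np

A = np.array([[3.0, -2.0], [-2.0, 2.0]])
print(np.linalg.eigvalsh(A).min() > 0)   # True: A is positive

# Entrywise square, cube, and exponential -- NOT matrix powers or expm
for M in (A**2, A**3, np.exp(A)):
    print(np.linalg.eigvalsh(M).min() >= 0)   # True each time
```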
Application: Matrix completion problems

A matrix completion problem is a problem in which certain entries in a
matrix are known and others are unknown. The goal is to fill in the rest
of the entries so that certain conditions are met.

Usually, a starting point is to discover whether it is possible to fill
in the rest of the entries so that the conditions are met. Then, if it
is possible, the task is to find a way, or create an algorithm, to fill
in the entries.
A Famous Matrix Completion Problem: The Netflix Prize
In 2006, Netflix offered a $1,000,000 prize for an algorithm to predict
customers' preferences for films, payable for the first algorithm to
beat Netflix's own by more than 10%. The prize was won about 3 years
after the start.
This can be viewed as a matrix completion problem: each customer has an
incomplete matrix in which rows represent dates and columns represent
films, and the entry in a particular day/film spot is the preference
number, an integer 1 to 5, with no entry in the spot if the customer did
not rate the film on that day.

Goal: "correctly" predict the rating in each spot. A plausible criterion
would be to make the rank of the prediction matrix relatively small
compared to the rank of the data matrix.
Are there numbers a, b, and c so that

    A = [ 15   2   a   b
           2   7   2   c
           a   2  17  −3
           b   c  −3  13 ]

is positive?

In this case, yes: a = 7, b = 5, and c = 8.
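Aside (not in the original talk): the claimed completion can be verified directly, with positivity checked via eigenvalues:

```python
import numpy as np

a, b, c = 7, 5, 8
A = np.array([[15,  2,  a,  b],
              [ 2,  7,  2,  c],
              [ a,  2, 17, -3],
              [ b,  c, -3, 13]], dtype=float)

print(np.linalg.eigvalsh(A).min() > 0)   # True: this completion is positive
```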
The standard algorithm for finding solutions uses Schur products.

THANK YOU!
These slides will be posted on my website:
http://www.math.iupui.edu/~ccowen/

General Problem. For a fixed n × n matrix B, compute the Schur
multiplier norm of B, that is, find the smallest constant K_B such that

    ‖X•B‖ ≤ K_B ‖X‖

for all n × n matrices X. Moreover, we want to have a computationally
effective way to find K_B.

Schur (1911). If B is a positive n × n matrix, then its Schur multiplier
norm is its largest diagonal entry.
Proof:
If β is the largest diagonal entry of B, then ‖B•I‖ = β, so K_B ≥ β.

Note that ‖A‖ ≤ α if and only if

    [ αI   A
      A^*  αI ] ≥ 0.

Schur's Theorem implies

    0 ≤ [ B  B ; B  B ] • [ I  A ; A^*  I ] = [ B•I  B•A ; B•A^*  B•I ]

and, since B•A^* = (B•A)^* because B = B^*,

    [ B•I  B•A ; (B•A)^*  B•I ] ≤ [ βI  B•A ; (B•A)^*  βI ]

so whenever ‖A‖ ≤ 1 the right-hand block matrix is positive, that is,
‖B•A‖ ≤ β. Hence K_B ≤ β, and K_B = β.
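As a numerical epilogue (not from the slides): a sketch testing ‖X•B‖ ≤ β‖X‖ on random matrices, with β the largest diagonal entry of a positive B and ‖·‖ the operator (spectral) norm:

```python
import numpy as np

rng = np.random.default_rng(3)
Y = rng.standard_normal((5, 3))
B = Y @ Y.T                          # a positive 5x5 matrix
beta = np.diag(B).max()              # beta: largest diagonal entry of B

opnorm = lambda M: np.linalg.norm(M, 2)   # operator norm = largest singular value

ok = True
for _ in range(100):
    X = rng.standard_normal((5, 5))
    ok = ok and opnorm(B * X) <= beta * opnorm(X) + 1e-9
print(ok)                                       # True: ||X•B|| <= beta ||X||
print(np.isclose(opnorm(B * np.eye(5)), beta))  # True: the bound is attained at X = I
```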