EE263 Autumn 2015 S. Boyd and S. Lall

QR factorization

I Gram-Schmidt procedure, QR factorization

I orthogonal decomposition induced by a

1 Gram-Schmidt procedure

m given independent vectors a1, . . . , an ∈ R , G-S procedure finds orthonormal vec- tors q1, . . . , qn s.t.

span(a1, . . . , ar) = span(q1, . . . , qr) for r ≤ n

I thus, q1, . . . , qr is an for span(a1, . . . , ar)

I rough idea of method: first orthogonalize each vector w.r.t. previous ones; then normalize result to have norm one

2 I step 1a. q˜1 := a1

I step 1b. q1 :=q ˜1/kq˜1k (normalize)

T I step 2a. q˜2 := a2 − (q1 a2)q1 (remove q1 component from a2)

I step 2b. q2 :=q ˜2/kq˜2k (normalize)

T T I step 3a. q˜3 := a3 − (q1 a3)q1 − (q2 a3)q2 (remove q1, q2 components)

I step 3b. q3 :=q ˜3/kq˜3k (normalize)

etc. a2 T I q˜2 = a2 − (q1 a2)q1

q2

for i = 1, 2, . . . , n we have q1 q˜1 = a1

T T T ai = (q1 ai)q1 + (q2 ai)q2 + ··· + (qi−1ai)qi−1 + kq˜ikqi

= r1iq1 + r2iq2 + ··· + riiqi

(note that the rij ’s come right out of the G-S procedure, and rii 6= 0) 3 QR decomposition

m×n m×n n×n written in matrix form: A = QR, where A ∈ R , Q ∈ R , R ∈ R :

 r11 r12 ··· r1n   0 r22 ··· r2n   a a ··· a  =  q q ··· q    1 2 n 1 2 n  . . .. .  | {z } | {z }  . . . .  A Q 0 0 ··· rnn | {z } R

T I Q Q = I, and R is upper triangular & invertible

I called QR decomposition (or factorization) of A

I usually computed using a variation on Gram-Schmidt procedure which is less sensitive to numerical (rounding) errors

I columns of Q are orthonormal basis for range(A)

4 General Gram-Schmidt procedure

m I in basic G-S we assume a1, . . . , an ∈ R are independent

I if a1, . . . , an are dependent, we find q˜j = 0 for some j, which means aj is linearly dependent on a1, . . . , aj−1

I modified algorithm: when we encounter q˜j = 0, skip to next vector aj+1 and continue: r = 0 for i = 1, . . . , n Pr T a˜ = ai − j=1 qj qj ai if a˜ 6= 0 r = r + 1 qr =a/ ˜ ka˜k

5 Staircase form on exit,

I q1, . . . , qr is an orthonormal basis for range(A) (hence r = (A))

I each ai is linear combination of previously generated qj ’s

T r×n in matrix notation we have A = QR with Q Q = I and R ∈ R in upper staircase form × × possibly nonzero entries × × × × zero entries × × ×

‘corner’ entries (shown as ×) are nonzero

6 Applications

I directly yields orthonormal basis for range(A)

m×r r×n I yields factorization A = BC with B ∈ R , C ∈ R , r = Rank(A)   I to check if b ∈ span(a1, . . . , an), apply Gram-Schmidt to a1 ··· an b

I staircase pattern in R shows which columns of A are dependent on previous ones

  works incrementally: one G-S procedure yields QR factorizations of a1 ··· ap for p = 1, . . . , n     a1 ··· ap = q1 ··· qs Rp   where s = Rank( a1 ··· ap ) and Rp is leading s × p submatrix of R

7 ‘Full’ QR factorization

with A = Q1R1 the QR factorization as above, write  R  A =  Q Q  1 1 2 0

  m×(m−r) where Q1 Q2 is orthogonal, i.e., columns of Q2 ∈ R are orthonormal, orthogonal to Q1 to find Q2:

˜   ˜ I find any matrix A s.t. A A˜ has rank m (e.g., A = I)   I apply general Gram-Schmidt to A A˜

I Q1 are orthonormal vectors obtained from columns of A ˜ I Q2 are orthonormal vectors obtained from extra columns (A)

m i.e., any set of orthonormal vectors can be extended to an orthonormal basis for R

8 Permutation

can permute columns with × to front of matrix:

 R R  A =  Q Q  11 12 P 1 2 0 0

where:

T I Q Q = I

r×r I R11 ∈ R is upper triangular and invertible

n×n I P ∈ R is a (which moves forward the columns of a which generated a new q)

9 Complementary subspaces

  if Q = Q1 Q2 and Q is orthogonal then range(Q1) and range(Q2) are called complementary subspaces, because

⊥ range(Q2) = range(Q1)

I they are orthogonal i.e., every vector in the first subspace is orthogonal to every vector in the second subspace

m I every vector in R can be expressed as a sum of two vectors, one from each subspace

I each subspace is the orthogonal complement of the other

10 Orthogonal decomposition induced by A

range(A)⊥ = null(AT)

 T  T  T  Q1 I from A = R1 0 T we see that Q2

T T A z = 0 ⇐⇒ Q1 z = 0 ⇐⇒ z ∈ range(Q2)

T so range(Q2) = null(A )

T I the columns of Q2 are an orthonormal basis for null(A )

m m×n I called orthogonal decomposition (of R ) induced by A ∈ R

n I every y ∈ R can be written uniquely as y = z + w, with z ∈ range(A), w ∈ null(AT) (we’ll soon see what the vector z is . . . )

11