EE263 Autumn 2015 S. Boyd and S. Lall
QR factorization
- Gram-Schmidt procedure, QR factorization
- orthogonal decomposition induced by a matrix
Gram-Schmidt procedure

given independent vectors $a_1, \ldots, a_n \in \mathbb{R}^m$, the G-S procedure finds orthonormal vectors $q_1, \ldots, q_n$ s.t.

$$\operatorname{span}(a_1, \ldots, a_r) = \operatorname{span}(q_1, \ldots, q_r) \quad \text{for } r \le n$$

- thus, $q_1, \ldots, q_r$ is an orthonormal basis for $\operatorname{span}(a_1, \ldots, a_r)$
- rough idea of method: first orthogonalize each vector w.r.t. previous ones; then normalize result to have norm one
- step 1a. $\tilde q_1 := a_1$
- step 1b. $q_1 := \tilde q_1 / \|\tilde q_1\|$ (normalize)
- step 2a. $\tilde q_2 := a_2 - (q_1^T a_2) q_1$ (remove $q_1$ component from $a_2$)
- step 2b. $q_2 := \tilde q_2 / \|\tilde q_2\|$ (normalize)
- step 3a. $\tilde q_3 := a_3 - (q_1^T a_3) q_1 - (q_2^T a_3) q_2$ (remove $q_1$, $q_2$ components)
- step 3b. $q_3 := \tilde q_3 / \|\tilde q_3\|$ (normalize)
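The steps above follow one pattern, which can be sketched as a loop. This is a minimal illustration in plain Python, assuming the input vectors are independent; the helper names `dot` and `gram_schmidt` are hypothetical and not part of the slides.

```python
import math

def dot(u, v):
    """Inner product of two vectors given as lists of floats."""
    return sum(x * y for x, y in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize independent vectors in the order given."""
    qs = []
    for a in vectors:
        # step (a): remove the components of a along the previous q's
        q_tilde = list(a)
        for q in qs:
            c = dot(q, a)
            q_tilde = [x - c * y for x, y in zip(q_tilde, q)]
        # step (b): normalize; independence is assumed, so the norm is nonzero
        norm = math.sqrt(dot(q_tilde, q_tilde))
        qs.append([x / norm for x in q_tilde])
    return qs

a1, a2, a3 = [1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]
q1, q2, q3 = gram_schmidt([a1, a2, a3])
```

Here `q1` is `a1` scaled to unit norm, and `q2`, `q3` are obtained exactly as in steps 2a through 3b.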
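the procedure continues in the same way for $a_4, \ldots, a_n$

[Figure: vectors $a_2$, $q_1$, $q_2$, with $\tilde q_1 = a_1$ and $\tilde q_2 = a_2 - (q_1^T a_2) q_1$]

for $i = 1, 2, \ldots, n$ we have

$$\begin{aligned} a_i &= (q_1^T a_i) q_1 + (q_2^T a_i) q_2 + \cdots + (q_{i-1}^T a_i) q_{i-1} + \|\tilde q_i\| q_i \\ &= r_{1i} q_1 + r_{2i} q_2 + \cdots + r_{ii} q_i \end{aligned}$$

(note that the $r_{ij}$'s come right out of the G-S procedure, and $r_{ii} \neq 0$)

QR decomposition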
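written in matrix form: $A = QR$, where $A \in \mathbb{R}^{m \times n}$, $Q \in \mathbb{R}^{m \times n}$, $R \in \mathbb{R}^{n \times n}$:

$$\underbrace{\begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} q_1 & q_2 & \cdots & q_n \end{bmatrix}}_{Q} \underbrace{\begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ 0 & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_{nn} \end{bmatrix}}_{R}$$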
- $Q^T Q = I$, and $R$ is upper triangular & invertible
- called QR decomposition (or factorization) of $A$
- usually computed using a variation on the Gram-Schmidt procedure which is less sensitive to numerical (rounding) errors
- columns of $Q$ are an orthonormal basis for $\operatorname{range}(A)$
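The slides do not say which variation is meant. As one common choice (an assumption here), the sketch below uses modified Gram-Schmidt, which projects the partially reduced vector rather than the original $a_i$, and collects the coefficients $r_{ji}$ into $R$. The function name `qr_mgs` and the example matrix are hypothetical.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def qr_mgs(cols):
    """QR of a matrix with independent columns, via modified Gram-Schmidt.

    cols: the columns a_1..a_n of A. Returns (Q, R), where Q is the list of
    columns q_1..q_n and R is an n x n list of rows.
    """
    n = len(cols)
    R = [[0.0] * n for _ in range(n)]
    Q = []
    for i, a in enumerate(cols):
        v = list(a)
        for j, q in enumerate(Q):
            # modified G-S: use the current v, not the original a
            R[j][i] = dot(q, v)
            v = [x - R[j][i] * y for x, y in zip(v, q)]
        R[i][i] = math.sqrt(dot(v, v))  # r_ii is the norm of the remainder
        Q.append([x / R[i][i] for x in v])
    return Q, R

# A is 4 x 3, stored by columns
A_cols = [[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 1.0, 1.0]]
Q, R = qr_mgs(A_cols)

# rebuild column i of A as r_1i q_1 + ... + r_ii q_i
m, n = len(A_cols[0]), len(A_cols)
A_rebuilt = [[sum(R[j][i] * Q[j][k] for j in range(n)) for k in range(m)]
             for i in range(n)]
```

In exact arithmetic this gives the same $Q$ and $R$ as the basic procedure; the two differ only in how rounding errors accumulate.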
General Gram-Schmidt procedure

- in basic G-S we assume $a_1, \ldots, a_n \in \mathbb{R}^m$ are independent
- if $a_1, \ldots, a_n$ are dependent, we find $\tilde q_j = 0$ for some $j$, which means $a_j$ is linearly dependent on $a_1, \ldots, a_{j-1}$
- modified algorithm: when we encounter $\tilde q_j = 0$, skip to next vector $a_{j+1}$ and continue:

    r = 0
    for i = 1, ..., n
        $\tilde a = a_i - \sum_{j=1}^{r} q_j q_j^T a_i$
        if $\tilde a \neq 0$
            r = r + 1
            $q_r = \tilde a / \|\tilde a\|$
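A runnable version of this loop might look as follows. In floating point the test "$\tilde a \neq 0$" has to become a comparison against a small threshold; the parameter `tol` and its default value are assumptions made for this sketch, as is the function name.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def general_gram_schmidt(vectors, tol=1e-10):
    """Gram-Schmidt that skips vectors dependent on the previous ones."""
    qs = []
    for a in vectors:
        # a_tilde = a_i minus its components along q_1..q_r
        a_tilde = list(a)
        for q in qs:
            c = dot(q, a)
            a_tilde = [x - c * y for x, y in zip(a_tilde, q)]
        norm = math.sqrt(dot(a_tilde, a_tilde))
        if norm > tol:  # floating-point stand-in for "a_tilde != 0"
            qs.append([x / norm for x in a_tilde])
    return qs

# the second vector is a multiple of the first, so it generates no new q
vectors = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
qs = general_gram_schmidt(vectors)
r = len(qs)  # r == 2 for this example
```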
Staircase form

on exit,

- $q_1, \ldots, q_r$ is an orthonormal basis for $\operatorname{range}(A)$ (hence $r = \operatorname{Rank}(A)$)
- each $a_i$ is a linear combination of previously generated $q_j$'s

in matrix notation we have $A = QR$ with $Q^T Q = I$ and $R \in \mathbb{R}^{r \times n}$ in upper staircase form

[Figure: staircase pattern of $R$, showing a region of possibly nonzero entries, a region of zero entries, and corner entries marked ×]

'corner' entries (shown as ×) are nonzero
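To make the staircase pattern concrete, the following sketch records the coefficients produced by general Gram-Schmidt as the columns of $R$ and prints which entries are nonzero. The names `staircase_qr` and `tol` are hypothetical, and the small example matrix is chosen so that the arithmetic is exact.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def staircase_qr(cols, tol=1e-10):
    """General G-S returning Q (r columns) and R (r rows, n columns)."""
    n = len(cols)
    Q, R_cols = [], []
    for a in cols:
        coeffs = [dot(q, a) for q in Q]  # components along existing q's
        a_tilde = list(a)
        for c, q in zip(coeffs, Q):
            a_tilde = [x - c * y for x, y in zip(a_tilde, q)]
        norm = math.sqrt(dot(a_tilde, a_tilde))
        if norm > tol:
            Q.append([x / norm for x in a_tilde])
            coeffs.append(norm)  # nonzero corner entry for the new q
        R_cols.append(coeffs)
    r = len(Q)
    # pad each column of R with zeros below the entries that were recorded
    R = [[R_cols[i][j] if j < len(R_cols[i]) else 0.0 for i in range(n)]
         for j in range(r)]
    return Q, R

cols = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 3.0, 0.0]]
Q, R = staircase_qr(cols)
pattern = ["".join("x" if abs(v) > 1e-10 else "0" for v in row) for row in R]
print("\n".join(pattern))
# xxx0
# 00xx
```

The corners sit in columns 1 and 3, the two columns that generated a new $q$; columns 2 and 4 depend on earlier ones.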
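Applications

- directly yields orthonormal basis for $\operatorname{range}(A)$
- yields factorization $A = BC$ with $B \in \mathbb{R}^{m \times r}$, $C \in \mathbb{R}^{r \times n}$, $r = \operatorname{Rank}(A)$
- to check if $b \in \operatorname{span}(a_1, \ldots, a_n)$, apply Gram-Schmidt to $\begin{bmatrix} a_1 & \cdots & a_n & b \end{bmatrix}$
- staircase pattern in $R$ shows which columns of $A$ are dependent on previous ones

works incrementally: one G-S procedure yields QR factorizations of $\begin{bmatrix} a_1 & \cdots & a_p \end{bmatrix}$ for $p = 1, \ldots, n$:

$$\begin{bmatrix} a_1 & \cdots & a_p \end{bmatrix} = \begin{bmatrix} q_1 & \cdots & q_s \end{bmatrix} R_p$$

where $s = \operatorname{Rank}(\begin{bmatrix} a_1 & \cdots & a_p \end{bmatrix})$ and $R_p$ is the leading $s \times p$ submatrix of $R$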
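The span test in the list above can be sketched as follows: $b$ lies in the span exactly when appending it generates no new $q$. The function names and the tolerance `tol` are assumptions for illustration.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def general_gram_schmidt(vectors, tol=1e-10):
    """Gram-Schmidt that skips vectors dependent on the previous ones."""
    qs = []
    for a in vectors:
        a_tilde = list(a)
        for q in qs:
            c = dot(q, a)
            a_tilde = [x - c * y for x, y in zip(a_tilde, q)]
        norm = math.sqrt(dot(a_tilde, a_tilde))
        if norm > tol:
            qs.append([x / norm for x in a_tilde])
    return qs

def in_span(vectors, b, tol=1e-10):
    """True if b generates no new q when appended to the vectors."""
    r_without_b = len(general_gram_schmidt(vectors, tol))
    r_with_b = len(general_gram_schmidt(vectors + [b], tol))
    return r_with_b == r_without_b

a = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
inside = in_span(a, [3.0, -2.0, 0.0])   # True: third component is zero
outside = in_span(a, [0.0, 0.0, 1.0])   # False: needs a third direction
```

Running the procedure twice keeps the sketch short; since G-S works incrementally, a single pass that checks whether the last vector is skipped would do the same job.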
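'Full' QR factorization

with $A = Q_1 R_1$ the QR factorization as above, write

$$A = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_1 \\ 0 \end{bmatrix}$$

where $\begin{bmatrix} Q_1 & Q_2 \end{bmatrix}$ is orthogonal, i.e., columns of $Q_2 \in \mathbb{R}^{m \times (m-r)}$ are orthonormal, orthogonal to $Q_1$

to find $Q_2$:

- find any matrix $\tilde A$ s.t. $\begin{bmatrix} A & \tilde A \end{bmatrix}$ has rank $m$ (e.g., $\tilde A = I$)
- apply general Gram-Schmidt to $\begin{bmatrix} A & \tilde A \end{bmatrix}$
- $Q_1$ are orthonormal vectors obtained from columns of $A$
- $Q_2$ are orthonormal vectors obtained from extra columns ($\tilde A$)

i.e., any set of orthonormal vectors can be extended to an orthonormal basis for $\mathbb{R}^m$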
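As an illustration of this construction with $\tilde A = I$, the sketch below appends the identity columns to $A$, runs general Gram-Schmidt, and splits the result into $Q_1$ and $Q_2$. The helper names, the tolerance `tol`, and the example matrix are assumptions.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def general_gram_schmidt(vectors, tol=1e-10):
    """Gram-Schmidt that skips vectors dependent on the previous ones."""
    qs = []
    for a in vectors:
        a_tilde = list(a)
        for q in qs:
            c = dot(q, a)
            a_tilde = [x - c * y for x, y in zip(a_tilde, q)]
        norm = math.sqrt(dot(a_tilde, a_tilde))
        if norm > tol:
            qs.append([x / norm for x in a_tilde])
    return qs

def full_q(A_cols):
    """Return (Q1, Q2) obtained by applying general G-S to [A I]."""
    m = len(A_cols[0])
    identity = [[1.0 if i == k else 0.0 for k in range(m)] for i in range(m)]
    Q = general_gram_schmidt(A_cols + identity)
    r = len(general_gram_schmidt(A_cols))  # q's generated by A itself
    return Q[:r], Q[r:]

A_cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]  # A is 3 x 2 with rank 2
Q1, Q2 = full_q(A_cols)
```

For this example `Q1` has two columns and `Q2` has one, so together they form an orthonormal basis for $\mathbb{R}^3$.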
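Permutation

can permute columns with × to front of matrix:

$$A = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R_{11} & R_{12} \\ 0 & 0 \end{bmatrix} P$$

where:

- $Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}$ satisfies $Q^T Q = I$
- $R_{11} \in \mathbb{R}^{r \times r}$ is upper triangular and invertible
- $P \in \mathbb{R}^{n \times n}$ is a permutation matrix (which moves forward the columns of $A$ which generated a new $q$)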
Complementary subspaces

if $Q = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix}$ and $Q$ is orthogonal then $\operatorname{range}(Q_1)$ and $\operatorname{range}(Q_2)$ are called complementary subspaces, because

$$\operatorname{range}(Q_2) = \operatorname{range}(Q_1)^\perp$$

- they are orthogonal, i.e., every vector in the first subspace is orthogonal to every vector in the second subspace
- every vector in $\mathbb{R}^m$ can be expressed as a sum of two vectors, one from each subspace
- each subspace is the orthogonal complement of the other
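Orthogonal decomposition induced by A

$$\operatorname{range}(A)^\perp = \operatorname{null}(A^T)$$

- from $A^T = \begin{bmatrix} R_1^T & 0 \end{bmatrix} \begin{bmatrix} Q_1^T \\ Q_2^T \end{bmatrix}$ we see that

$$A^T z = 0 \iff Q_1^T z = 0 \iff z \in \operatorname{range}(Q_2)$$

  (the first equivalence holds because $R_1^T$ has independent columns), so $\operatorname{range}(Q_2) = \operatorname{null}(A^T)$
- the columns of $Q_2$ are an orthonormal basis for $\operatorname{null}(A^T)$
- called orthogonal decomposition (of $\mathbb{R}^m$) induced by $A \in \mathbb{R}^{m \times n}$
- every $y \in \mathbb{R}^m$ can be written uniquely as $y = z + w$, with $z \in \operatorname{range}(A)$, $w \in \operatorname{null}(A^T)$ (we'll soon see what the vector $z$ is . . . )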
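A small numerical check of this decomposition is sketched below. It relies on the identity $y = Q_1 Q_1^T y + Q_2 Q_2^T y$, which follows from $\begin{bmatrix} Q_1 & Q_2 \end{bmatrix}$ being orthogonal, and takes $z = Q_1 Q_1^T y$; that formula is not stated in this section and is used here as an assumption, as are the helper names and the tolerance `tol`.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def orthonormal_basis(cols, tol=1e-10):
    """General Gram-Schmidt: orthonormal basis q_1..q_r for span of cols."""
    qs = []
    for a in cols:
        a_tilde = list(a)
        for q in qs:
            c = dot(q, a)
            a_tilde = [x - c * y for x, y in zip(a_tilde, q)]
        norm = math.sqrt(dot(a_tilde, a_tilde))
        if norm > tol:
            qs.append([x / norm for x in a_tilde])
    return qs

def decompose(cols, y):
    """Split y into z in range(A) and w = y - z."""
    z = [0.0] * len(y)
    for q in orthonormal_basis(cols):
        c = dot(q, y)  # coefficient of y along q
        z = [zi + c * qi for zi, qi in zip(z, q)]
    w = [yi - zi for yi, zi in zip(y, z)]
    return z, w

A_cols = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
y = [1.0, 2.0, 3.0]
z, w = decompose(A_cols, y)
residuals = [dot(a, w) for a in A_cols]  # entries of A^T w, close to zero
```

The entries of `residuals` being numerically zero confirms that `w` lies in $\operatorname{null}(A^T)$ for this example.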