
Given any matrix A of size m×n we have either a collection of n column vectors (a1, ··· , an) in Rm or a collection of m row vectors in Rn.¹ As we shall see later it is enough to discuss column vectors and to consider the rows of A (correctly one should use the conjugates of the rows, in the sense of complex conjugation) as the columns of A′, where the mapping A ↦ B = A′ just interchanges rows and columns and applies complex conjugation. In other words, b_{k,l} = conj(a_{l,k}).

As a consequence we have a subspace C_A ⊆ Rm (the column space) and a subspace R_A ⊆ Rn (the row space) associated with A. By interpreting the coefficients in each of the equations constituting the homogeneous linear system of equations as a normal vector to some (hyper-)plane, one easily finds that the general solution of the homogeneous system equals the orthogonal complement of the row space. Hence each element of Rn can be split into two summands which are orthogonal to each other: the orthogonal projection onto the row space and the orthogonal projection onto the null-space of A, which we will denote by N_A from now on. In a similar way the "target space" Rm can be decomposed as an orthogonal sum of the column space and the null-space of the adjoint matrix A′.

It is one of the early results in linear algebra (and in fact an easy exercise using the linearity of the mapping x ↦ A ∗ x) that the solution set of a given equation A ∗ x = b - assuming there exists some solution at all - can be described as a particular solution² of the inhomogeneous equation plus the whole null-space of A. In other words, the elements x which solve a particular linear equation A ∗ x = b for some fixed b differ by elements of N_A.

The next step is to observe that for each right hand side b in the column space C_A the solution set S_b = {x | A ∗ x = b} has exactly one element within R_A. Indeed, due to the orthogonality of N_A and R_A the elements of S_b all have the same orthogonal projection onto R_A, and this projection r is the unique element of R_A such that A ∗ r = b.
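This orthogonal splitting can be checked numerically; a minimal sketch using NumPy, where the matrix A and the vector x below are made-up examples (not taken from the text), and A+ ∗ A is used as the projection onto the row space:

```python
import numpy as np

# Hypothetical example: a 2x3 matrix of rank 2, so its null-space is 1-dimensional.
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])

# A+ * A is the orthogonal projection onto the row space R_A ...
P_row = np.linalg.pinv(A) @ A
# ... and Id - P_row projects onto the null-space N_A.
P_null = np.eye(3) - P_row

x = np.array([1., 0., 0.])          # an arbitrary vector in R^3
r, n = P_row @ x, P_null @ x        # the two orthogonal summands

assert np.isclose(r @ n, 0)         # the summands are orthogonal
assert np.allclose(A @ n, 0)        # n lies in the null-space of A
assert np.allclose(A @ x, A @ r)    # A only "sees" the row-space part
```

The last assertion illustrates why the solutions of A ∗ x = b differ only by elements of N_A.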
Note that this particular element r ∈ R_A can be characterized by a minimal norm property: in fact, any other element solving A ∗ x = b differs from this specific solution by an element n ∈ N_A, i.e. is of the form x = r + n. But for such an element one has ‖x‖² = ‖r‖² + ‖n‖² according to the Pythagorean theorem.

We are now left with the question of how to "solve" the standard linear equation A ∗ x = b in the case that there is no solution at all. One should not give up, because - for example - there might be a solvable equation "nearby". Imagine for example the problem of fitting a cubic polynomial to data obtained by measurements at more than 4 positions, say 7 or 10 locations. Then it is natural to look for the optimal fitting cubic polynomial, i.e. the curve represented by a cubic polynomial with the property that the total sum of the squared (absolute) errors is minimal, i.e. that Σ_k |p(x_k) − d_k|² is minimal, where d_k are the data given at the points x_k, 1 ≤ k ≤ m. Back to our equation A ∗ x = b this means: if the right hand side happens not to be in the column space of A, then one should solve the equation A ∗ x = b̃ :=

¹ For the discussion here it does not make a difference whether we have real or complex values, or even values from another "field", such as a finite field.
² Typically one may guess a specific solution of the inhomogeneous problem; at least it is easier to find one specific solution than the general solution.


P_{C_A}(b), the orthogonal projection of b onto C_A, the range space of x ↦ A ∗ x. The element b̃ is characterized by the least squares property: b̃ is the element which minimizes Σ_{k=1}^m |b_k − b̃_k|² among all possible elements of C_A.

Altogether we can now find the MINIMAL NORM LEAST SQUARES SOLUTION (Methode der kleinsten Quadrate) for each linear equation A ∗ x = b: it is defined as the minimal norm solution to A ∗ x = b̃.

It is not difficult to show that the mapping b̃ ↦ r is indeed a bijection between C_A and R_A, and in fact a linear mapping. Since b ↦ b̃ is linear as well, we have found a geometric way of defining a kind of inverse mapping to the linear mapping x ↦ A ∗ x. This linear mapping maps Rm back into Rn and is called the pseudo-inverse of the linear mapping defined by the matrix A. Correspondingly it has the format of an n × m matrix. Usually the symbol A+ is used for this mapping. Of course the matrix description of A+ can be obtained by solving the minimal norm least squares problem several times, for the right hand sides b = e_k, 1 ≤ k ≤ m.

The roles of A and A+ are symmetric. The row space of A turns out to be the range, hence the column space, of A+, and the row space of A+ is just the column space of the original matrix A. Moreover, the null-space of A+ is the same as that of A′. It is also not difficult to check that (A+)+ = A and other, similar properties.

It is also obvious (not just from the terminology) that A+ = A^{-1} in case the matrix A is an invertible n × n matrix. In fact, the range space equals Rn in that case, hence b̃ = b. On the other hand, the matrix has only a trivial null-space, and therefore r = A^{-1} ∗ b.

If we ask ourselves to which extent A+ can take the role of an inverse matrix we come to the following observation: if x ↦ A ∗ x is a surjective mapping (or equivalently, if the columns of A form a generating system for Rm) then A ∗ A+ = Id_{Rm}.
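The cubic-fitting problem described above can be carried out directly with the pseudo-inverse; a small sketch in NumPy, where the 7 sample positions and data values are made up purely for illustration:

```python
import numpy as np

# 7 sample positions (more than the 4 coefficients of a cubic polynomial)
xk = np.linspace(-1., 1., 7)
dk = np.array([0.9, 0.1, -0.3, 0.0, 0.4, 1.2, 2.1])   # hypothetical data

# Vandermonde-type matrix: columns are 1, x, x^2, x^3 evaluated at the xk
A = np.vander(xk, 4, increasing=True)

# Minimal norm least squares solution r = A+ * b
coeffs = np.linalg.pinv(A) @ dk

# It minimizes the sum of squared errors over all cubic polynomials;
# any perturbation of the coefficients can only increase the residual.
residual = np.sum((A @ coeffs - dk) ** 2)
perturbed = coeffs + np.array([0.01, 0., 0., 0.])
assert residual <= np.sum((A @ perturbed - dk) ** 2)
```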
On the other hand, if x ↦ A ∗ x is injective, or equivalently if the columns of A are linearly independent, then the null-space of A is trivial and we have A+ ∗ A = Id_{Rn}. In fact, it is not difficult to show that A ∗ A+ = P_{C_A} while A+ ∗ A = P_{R_A} in the general case. We will provide a method to calculate pinv(A) explicitly below.

This last case occurs for the interpolation problem mentioned above. If the data are not within the four-dimensional space of data which occur as samples of cubic polynomials, it may be necessary to look for the solution of the "replacement problem" A ∗ x = b̃. On the other hand, the columns of the Vandermonde matrix are linearly independent, hence the solution to that problem is uniquely determined and equals r = A+ ∗ b.

There are some observations that can be made in connection with those four spaces (C_A, R_A, N_A, N_{A′}): the restriction of x ↦ A ∗ x to R_A establishes an isomorphism between R_A and C_A. In fact, this confirms our knowledge that those two spaces have equal dimension (the row rank of a matrix equals its column rank).

Exercise: All the matrices A, A+, A′, A′ ∗ A, A ∗ A′, pinv(A ∗ A′) etc. have their range and null-space in one of the two spaces Rn and Rm respectively. It turns out that all the null-spaces and all the range spaces occurring within Rn resp. within Rm are equal!!!
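The identities just stated can be verified numerically; a minimal sketch with NumPy, again using a made-up matrix A with linearly independent rows:

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [0., 1., 1.]])        # hypothetical 2x3 example, full row rank
Ap = np.linalg.pinv(A)              # the pseudo-inverse A+, an n x m = 3x2 matrix

# Surjective case: the rows are independent, so A * A+ = Id on R^m ...
assert np.allclose(A @ Ap, np.eye(2))

# ... while A+ * A is only the projection P_{R_A}: idempotent and symmetric
P = Ap @ A
assert np.allclose(P @ P, P) and np.allclose(P, P.T)

# (A+)+ = A
assert np.allclose(np.linalg.pinv(Ap), A)
```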

MATLAB provides a simple way to obtain pinv(A). Another way to understand the calculation of the pseudo-inverse is the following reasoning: assume we want to solve A ∗ x = b. If x is a solution of this equation, then also (A′ ∗ A) ∗ x = A′ ∗ b. Assume now that the columns of A are linearly independent, or equivalently, that N_A = {0}. We claim that this implies that A′ ∗ A is invertible (cf. below) and hence it is natural to go for the solution x̃ = [(A′ ∗ A)^{-1} ∗ A′] ∗ b. We claim that A+ = (A′ ∗ A)^{-1} ∗ A′.

First of all the Lemma: The columns of A are linearly independent if and only if the Gramian matrix A′ ∗ A is invertible.

Proof: It is clear that the invertibility of A′ ∗ A implies that N_A = {0}, i.e. that the columns of A are linearly independent (otherwise there exists some non-zero x with A ∗ x = 0, hence A′ ∗ A ∗ x = 0). Conversely, assume that A′ ∗ A ∗ x = 0. Then ⟨A′ ∗ A ∗ x, x⟩ = 0, or equivalently ‖A ∗ x‖² = 0, i.e. x ∈ N_A = {0}, hence x = 0 and A′ ∗ A is invertible.
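The claimed formula A+ = (A′ ∗ A)^{-1} ∗ A′ for the full-column-rank case can be checked against a numerically computed pseudo-inverse; a minimal sketch (the 3×2 example matrix is hypothetical):

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])            # 3x2, linearly independent columns

G = A.T @ A                         # the Gramian matrix A' * A
assert np.linalg.matrix_rank(G) == 2   # invertible, as the Lemma states

# A+ = (A'A)^{-1} A' agrees with the numerically computed pseudo-inverse
Ap_formula = np.linalg.inv(G) @ A.T
assert np.allclose(Ap_formula, np.linalg.pinv(A))
```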

Note that the least squares solution obtained in this way should be the same as the solution of the normal equations A′ ∗ A ∗ x = A′ ∗ b.

1. ad: "By interpreting the coefficients in each of the equations constituting the homogeneous linear system of equations as a normal vector to some (hyper-)plane one easily finds that the general solution of the homogeneous system equals the orthogonal complement of the row space."
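That the least squares solution satisfies the normal equations can also be confirmed numerically; a small sketch, with a made-up right hand side b that deliberately lies outside the column space:

```python
import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])            # hypothetical 3x2 example
b = np.array([0., 1., 3.])          # not in the column space C_A

x_ls = np.linalg.pinv(A) @ b        # minimal norm least squares solution

# x_ls satisfies the normal equations A' * A * x = A' * b
assert np.allclose(A.T @ A @ x_ls, A.T @ b)
```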

Here I would perhaps give a small example, e.g. the intersection of the xy-plane with the yz-plane. The solution is clearly the y-axis. The normal vector of the 1st equation (xy-plane) is a vector in the z direction, and the normal vector of the 2nd equation (yz-plane) is a vector in the x direction; thus the row space spans the xz-plane, which is the orthogonal complement of the solution of the system of equations (the y-axis).

2. You could also state the explicit formula for pinv for the case that rank(A) = n, i.e. that the columns are linearly independent.
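The suggested plane-intersection example can be written out in NumPy as follows (the rows of A are the two normal vectors e_z and e_x):

```python
import numpy as np

# z = 0 (xy-plane) and x = 0 (yz-plane): normal vectors e_z and e_x
A = np.array([[0., 0., 1.],
              [1., 0., 0.]])

# The general solution of A * x = 0 is the y-axis:
y_axis = np.array([0., 1., 0.])
assert np.allclose(A @ y_axis, 0)

# The row space is the xz-plane, the orthogonal complement of the y-axis:
for row in A:
    assert np.isclose(row @ y_axis, 0)
```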
