
Optimization Models EECS 127 / EECS 227AT

Laurent El Ghaoui

EECS Department, UC Berkeley

Fall 2018

LECTURE 3

Matrices and Linear Maps

The Matrix is everywhere. It is all around us.

Morpheus

Outline

1. Introduction
   Basics

2. Matrices as linear maps
   Range, rank, and nullspace
   Eigenvalues and eigenvectors
   PageRank
   Matrices with special structure

3. Matrix concepts
   Matrix factorizations
   Matrix norms
   Matrix functions

Introduction

A matrix is a collection of numbers, arranged in columns and rows in a tabular format. Suitably defining operations such as sum, product, and norms on matrices, we can treat matrices as elements of a vector space. A matrix defines a linear map between an input and an output space. This leads to the introduction of concepts such as range, rank, nullspace, eigenvalues, and eigenvectors, which permit a complete analysis of (finite-dimensional) linear maps. Matrices are a ubiquitous tool for organizing and manipulating data. They constitute the fundamental building block of numerical computation methods.

A data matrix

Figure: Votes of US Senators, 2002-2004.

Is there anything beyond just an array of numbers?

Basics

We shall mainly deal with matrices whose elements are real (or sometimes complex) numbers, that is, with arrays of the form

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}.$$

This matrix has $m$ rows and $n$ columns. In the case of real elements we say that $A \in \mathbb{R}^{m,n}$, resp. $A \in \mathbb{C}^{m,n}$ in the case of complex elements.

The $i$-th row of $A$ is the (row) vector $[a_{i1} \cdots a_{in}]$; the $j$-th column of $A$ is the (column) vector $[a_{1j} \cdots a_{mj}]^\top$. Transposition works on matrices by exchanging rows and columns, that is, $[A^\top]_{ij} = [A]_{ji}$, where the notation $[A]_{ij}$ (or sometimes simply $A_{ij}$) refers to the element of $A$ positioned in row $i$ and column $j$.

Example: Matrices for networks

A network can be represented as a graph of m nodes connected by n directed arcs. Here, we assume that arcs are ordered pairs of nodes, with at most one arc joining any two nodes; we also assume that there are no self-loops (arcs from a node to itself).

We can fully describe such a network via the so-called (directed) arc-node incidence matrix, which is an $m \times n$ matrix defined as follows:

$$A_{ij} = \begin{cases} 1 & \text{if arc } j \text{ starts at node } i, \\ -1 & \text{if arc } j \text{ ends at node } i, \\ 0 & \text{otherwise,} \end{cases} \qquad 1 \le i \le m,\ 1 \le j \le n. \tag{1}$$

Example: Matrices for networks (continued)

[Figure: a directed graph on 6 nodes with 8 labeled arcs.]

A network with $m = 6$ nodes and $n = 8$ arcs, with (directed) arc-node incidence matrix

$$A = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\ -1 & 0 & 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & -1 & -1 & -1 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 \end{bmatrix}.$$
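As a sanity check, we can enter this incidence matrix in NumPy (an illustrative snippet, not part of the original slides) and verify a structural property: since every arc starts at exactly one node and ends at exactly one node, each column contains one $+1$ and one $-1$:

```python
import numpy as np

# Arc-node incidence matrix of the 6-node, 8-arc network above.
A = np.array([
    [ 1,  1,  0,  0,  0,  0,  0, -1],
    [-1,  0,  1,  0,  0,  0,  0,  1],
    [ 0, -1, -1, -1,  1,  1,  0,  0],
    [ 0,  0,  0,  1,  0,  0, -1,  0],
    [ 0,  0,  0,  0,  0, -1,  1,  0],
    [ 0,  0,  0,  0, -1,  0,  0,  0],
])

# Every column sums to zero: one +1 (start node) and one -1 (end node) per arc.
assert (A.sum(axis=0) == 0).all()
```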

Basics: Matrix products

Two matrices can be multiplied if conformably sized, i.e., if $A \in \mathbb{R}^{m,n}$ and $B \in \mathbb{R}^{n,p}$, then the matrix product $AB \in \mathbb{R}^{m,p}$ is defined as a matrix whose $(i,j)$-th entry is

$$[AB]_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}.$$

The matrix product is non-commutative, meaning that, in general, $AB \neq BA$.

The $n \times n$ identity matrix (often denoted $I_n$, or simply $I$, depending on context) is a matrix with all zero elements, except for the elements on the diagonal (that is, the elements with row index equal to the column index), which are equal to one. This matrix satisfies $A I_n = A$ for every matrix $A$ with $n$ columns, and $I_n B = B$ for every matrix $B$ with $n$ rows.
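A quick NumPy illustration of these two facts (a sketch added for these notes, not from the original slides):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))
B = rng.standard_normal((2, 2))

# Non-commutativity: AB and BA differ for generic square matrices.
print(np.allclose(A @ B, B @ A))  # False (almost surely)

# The identity matrix is neutral for the product.
I = np.eye(2)
print(np.allclose(A @ I, A), np.allclose(I @ B, B))  # True True
```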

Basics: Matrix-vector product

Let $A \in \mathbb{R}^{m,n}$ be a matrix with columns $a_1, \ldots, a_n \in \mathbb{R}^m$, and let $b \in \mathbb{R}^n$ be a vector. We define the matrix-vector product by

$$Ab = \sum_{k=1}^{n} a_k b_k, \qquad A \in \mathbb{R}^{m,n},\ b \in \mathbb{R}^n.$$

That is, $Ab$ is a vector in $\mathbb{R}^m$ obtained by forming a linear combination of the columns of $A$, using the elements in $b$ as coefficients.

Similarly, we can multiply a matrix $A \in \mathbb{R}^{m,n}$ on the left by (the transpose of) a vector $c \in \mathbb{R}^m$ as follows:

$$c^\top A = \sum_{k=1}^{m} c_k \alpha_k^\top, \qquad A \in \mathbb{R}^{m,n},\ c \in \mathbb{R}^m.$$

That is, $c^\top A$ is a vector in $\mathbb{R}^{1,n}$ obtained by forming a linear combination of the rows $\alpha_k^\top$ of $A$, using the elements in $c$ as coefficients.
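The column and row interpretations can be checked numerically (an illustrative NumPy sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
b = rng.standard_normal(4)
c = rng.standard_normal(3)

# Ab is a linear combination of the columns of A with coefficients b_k.
col_combo = sum(b[k] * A[:, k] for k in range(A.shape[1]))
print(np.allclose(A @ b, col_combo))  # True

# c^T A is a linear combination of the rows of A with coefficients c_k.
row_combo = sum(c[k] * A[k, :] for k in range(A.shape[0]))
print(np.allclose(c @ A, row_combo))  # True
```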

Matrix-vector product: For a network incidence matrix

[Figure: the 6-node, 8-arc network from the previous example.]

We describe a flow (of goods, traffic, charge, information, etc.) across the network as a vector $x \in \mathbb{R}^n$, where the $j$-th component of $x$ denotes the amount flowing through arc $j$. By convention, we use positive values when the flow is in the direction of the arc, and negative ones in the opposite case.

The total flow leaving a given node $i$ is then

$$\sum_{j=1}^{n} A_{ij} x_j = [Ax]_i,$$

where $[Ax]_i$ denotes the $i$-th component of the vector $Ax$.

Basics: Matrix products

A matrix $A \in \mathbb{R}^{m,n}$ can also be seen as a collection of columns, each column being a vector, or as a collection of rows, each row being a (transposed) vector:

 >  α1 >  α2  A =  a a ··· a  , or A =   , 1 2 n  .   .  > αm

where $a_1, \ldots, a_n \in \mathbb{R}^m$ denote the columns of $A$, and $\alpha_1^\top, \ldots, \alpha_m^\top$ (with $\alpha_i \in \mathbb{R}^n$) denote the rows of $A$.

If the columns of $B$ are given by the vectors $b_i \in \mathbb{R}^n$, $i = 1, \ldots, p$, so that $B = [b_1 \cdots b_p]$, then $AB$ can be written as

$$AB = A \begin{bmatrix} b_1 & \cdots & b_p \end{bmatrix} = \begin{bmatrix} Ab_1 & \cdots & Ab_p \end{bmatrix}.$$

In other words, $AB$ results from transforming each column $b_i$ of $B$ into $Ab_i$.

Basics: Matrix products

The matrix-matrix product can also be interpreted as an operation on the rows of $A$. Indeed, if $A$ is given by its rows $\alpha_i^\top$, $i = 1, \ldots, m$, then $AB$ is the matrix obtained by transforming each one of these rows into $\alpha_i^\top B$, $i = 1, \ldots, m$:

$$AB = \begin{bmatrix} \alpha_1^\top \\ \vdots \\ \alpha_m^\top \end{bmatrix} B = \begin{bmatrix} \alpha_1^\top B \\ \vdots \\ \alpha_m^\top B \end{bmatrix}.$$

Finally, the product $AB$ can be interpreted as a sum of so-called dyadic matrices (matrices of rank one) of the form $a_i \beta_i^\top$, where the $a_i$ are the columns of $A$ and the $\beta_i^\top$ are the rows of $B$:

$$AB = \sum_{i=1}^{n} a_i \beta_i^\top, \qquad A \in \mathbb{R}^{m,n},\ B \in \mathbb{R}^{n,p}.$$

For any two conformably sized matrices $A, B$, it holds that

$$(AB)^\top = B^\top A^\top.$$

Matrices as linear maps

We can interpret matrices as linear maps (vector-valued functions), or “operators,” acting from an “input” space to an “output” space. We recall that a map $f : \mathcal{X} \to \mathcal{Y}$ is linear if any points $x$ and $z$ in $\mathcal{X}$ and any scalars $\lambda, \mu$ satisfy $f(\lambda x + \mu z) = \lambda f(x) + \mu f(z)$.

Any linear map $f : \mathbb{R}^n \to \mathbb{R}^m$ can be represented by a matrix $A \in \mathbb{R}^{m,n}$, mapping input vectors $x \in \mathbb{R}^n$ to output vectors $y \in \mathbb{R}^m$:

$$y = Ax.$$

Affine maps are simply linear functions plus a constant term; thus any affine map $f : \mathbb{R}^n \to \mathbb{R}^m$ can be represented as

$$f(x) = Ax + b,$$

for some $A \in \mathbb{R}^{m,n}$, $b \in \mathbb{R}^m$.

Range, rank, and nullspace

Consider an $m \times n$ matrix $A$, and denote by $a_i$, $i = 1, \ldots, n$, its $i$-th column, so that $A = [a_1 \cdots a_n]$.

The set of vectors $y$ obtained as linear combinations of the $a_i$'s are of the form $y = Ax$ for some vector $x \in \mathbb{R}^n$. This set is commonly known as the range of $A$, and is denoted $\mathcal{R}(A)$:

$$\mathcal{R}(A) = \{ Ax : x \in \mathbb{R}^n \}.$$

By construction, the range is a subspace. The dimension of $\mathcal{R}(A)$ is called the rank of $A$, denoted $\operatorname{rank}(A)$; by definition, the rank represents the number of linearly independent columns of $A$. The rank is also equal to the number of linearly independent rows of $A$; that is, the rank of $A$ is the same as that of its transpose $A^\top$ (proof: https://en.wikipedia.org/wiki/Rank_(linear_algebra)). As a consequence, for any nonzero $A$ we have the bounds

$$1 \le \operatorname{rank}(A) \le \min(m, n).$$
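A numerical illustration (NumPy, added for these notes): build a matrix of known rank and check that the rank equals that of its transpose.

```python
import numpy as np

rng = np.random.default_rng(2)
# A 5x4 matrix of rank 2 (almost surely), built from thin factors.
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

print(np.linalg.matrix_rank(A))    # 2
print(np.linalg.matrix_rank(A.T))  # 2: rank(A) == rank(A^T)
```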

Range, rank, and nullspace

The nullspace of a matrix $A \in \mathbb{R}^{m,n}$ is the set of vectors in the input space that are mapped to zero, and is denoted $\mathcal{N}(A)$:

$$\mathcal{N}(A) = \{ x \in \mathbb{R}^n : Ax = 0 \}.$$

This set is again a subspace. $\mathcal{R}(A^\top)$ and $\mathcal{N}(A)$ are mutually orthogonal subspaces, i.e., $\mathcal{N}(A) \perp \mathcal{R}(A^\top)$. The direct sum of a subspace and its orthogonal complement equals the whole space; thus,

$$\mathbb{R}^n = \mathcal{N}(A) \oplus \mathcal{N}(A)^\perp = \mathcal{N}(A) \oplus \mathcal{R}(A^\top).$$

Fundamental theorem of linear algebra

Theorem 1

For any given matrix $A \in \mathbb{R}^{m,n}$, it holds that $\mathcal{N}(A) \perp \mathcal{R}(A^\top)$ and $\mathcal{R}(A) \perp \mathcal{N}(A^\top)$, hence

$$\mathcal{N}(A) \oplus \mathcal{R}(A^\top) = \mathbb{R}^n, \qquad \mathcal{R}(A) \oplus \mathcal{N}(A^\top) = \mathbb{R}^m.$$

Consequently, we can decompose any vector $x \in \mathbb{R}^n$ as the sum of two vectors orthogonal to each other, one in the range of $A^\top$, and the other in the nullspace of $A$:

$$x = A^\top \xi + z, \qquad z \in \mathcal{N}(A).$$

Similarly, we can decompose any vector $w \in \mathbb{R}^m$ as the sum of two vectors orthogonal to each other, one in the range of $A$, and the other in the nullspace of $A^\top$:

$$w = A\varphi + \zeta, \qquad \zeta \in \mathcal{N}(A^\top).$$
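The first decomposition can be computed with a least-squares solve, since $A^\top \xi$ is the orthogonal projection of $x$ onto $\mathcal{R}(A^\top)$ (a NumPy sketch, added for these notes):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))
x = rng.standard_normal(5)

# x = A^T xi + z, with A^T xi the projection of x onto range(A^T).
xi, *_ = np.linalg.lstsq(A.T, x, rcond=None)
z = x - A.T @ xi

print(np.allclose(A @ z, 0))          # z lies in the nullspace of A
print(np.isclose(z @ (A.T @ xi), 0))  # the two components are orthogonal
```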

Fundamental theorem of linear algebra: Geometry

Figure: Illustration of the fundamental theorem of linear algebra in $\mathbb{R}^3$. Here, $A = [a_1\ a_2]$. Any vector in $\mathbb{R}^3$ can be written as the sum of two orthogonal vectors, one in the range of $A$, the other in the nullspace of $A^\top$.

Determinant

The determinant of a generic (square) matrix $A \in \mathbb{R}^{n,n}$ can be computed by defining $\det a = a$ for a scalar $a$, and then applying the following inductive formula (Laplace's determinant expansion):

$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{(i,j)},$$

where $i$ is any row, chosen at will (e.g., one may choose $i = 1$), and $A_{(i,j)}$ denotes the $(n-1) \times (n-1)$ submatrix of $A$ obtained by eliminating row $i$ and column $j$ from $A$.

$A \in \mathbb{R}^{n,n}$ is singular $\Leftrightarrow \det A = 0 \Leftrightarrow \mathcal{N}(A) \neq \{0\}$.

For any square matrices $A, B \in \mathbb{R}^{n,n}$ and scalar $\alpha$:

$$\det A = \det A^\top, \qquad \det AB = \det BA = \det A \det B, \qquad \det \alpha A = \alpha^n \det A.$$
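For illustration only, the Laplace expansion translates directly into a recursive function (exponential-time, so purely didactic; np.linalg.det uses an LU factorization instead). This is a sketch added for these notes:

```python
import numpy as np

def det_laplace(A: np.ndarray) -> float:
    """Determinant via Laplace expansion along the first row (O(n!) cost)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        # Submatrix obtained by deleting row 0 and column j.
        sub = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_laplace(sub)
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(det_laplace(A), np.linalg.det(A))  # both approximately 8.0
```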

Determinant: Geometry

Figure: Linear mapping of the square. The absolute value of the determinant equals the area of the transformed unit square.

Matrix inverses

If $A \in \mathbb{R}^{n,n}$ is nonsingular (i.e., $\det A \neq 0$), then we define the inverse matrix $A^{-1}$ as the unique $n \times n$ matrix such that

$$A A^{-1} = A^{-1} A = I_n.$$

If $A, B$ are square and nonsingular, then for the inverse of the product it holds that $(AB)^{-1} = B^{-1} A^{-1}$. If $A$ is square and nonsingular, then

$$(A^\top)^{-1} = (A^{-1})^\top, \qquad \det A = \det A^\top = \frac{1}{\det A^{-1}}.$$

For a generic matrix $A \in \mathbb{R}^{m,n}$, a generalized inverse (or pseudoinverse) can be defined:

if $m \ge n$, then $A^{\mathrm{li}}$ is a left inverse of $A$ if $A^{\mathrm{li}} A = I_n$;
if $n \ge m$, then $A^{\mathrm{ri}}$ is a right inverse of $A$ if $A A^{\mathrm{ri}} = I_m$.

In general, a matrix $A^{\mathrm{pi}}$ is a pseudoinverse of $A$ if $A A^{\mathrm{pi}} A = A$.
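The Moore-Penrose pseudoinverse, computed by np.linalg.pinv, is one particular pseudoinverse; for a tall full-column-rank matrix it is a left inverse (an illustrative sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))  # tall; full column rank almost surely

A_pi = np.linalg.pinv(A)
print(np.allclose(A_pi @ A, np.eye(3)))  # left inverse: A_pi A = I_3
print(np.allclose(A @ A_pi @ A, A))      # defining property: A A_pi A = A
```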

Similar matrices

Two matrices $A, B \in \mathbb{R}^{n,n}$ are said to be similar if there exists a nonsingular matrix $P \in \mathbb{R}^{n,n}$ such that

$$B = P^{-1} A P.$$

Similar matrices are different representations of the same linear map, under a change of basis in the underlying space.

Consider the linear map $y = Ax$ mapping $\mathbb{R}^n$ into itself. Since $P \in \mathbb{R}^{n,n}$ is nonsingular, its columns are linearly independent, hence they represent a basis for $\mathbb{R}^n$. Vectors $x$ and $y$ can thus be represented in this basis as linear combinations of the columns of $P$; that is, there exist vectors $\tilde{x}, \tilde{y}$ such that

$$x = P\tilde{x}, \qquad y = P\tilde{y}.$$

Writing the relation $y = Ax$ and substituting the representations of $x, y$ in the new basis, we obtain

$$P\tilde{y} = AP\tilde{x} \ \Rightarrow\ \tilde{y} = P^{-1}AP\tilde{x} = B\tilde{x},$$

that is, the matrix $B = P^{-1}AP$ represents the linear map $y = Ax$ in the new basis defined by the columns of $P$.

Eigenvalues/eigenvectors

We say that $\lambda \in \mathbb{C}$ is an eigenvalue of a matrix $A \in \mathbb{R}^{n,n}$, and $u \in \mathbb{C}^n$ is a corresponding eigenvector, if it holds that

$$Au = \lambda u, \qquad u \neq 0,$$

or, equivalently, $(\lambda I_n - A)u = 0$, $u \neq 0$.

Eigenvalues can be characterized as those real or complex numbers that satisfy the equation

$$p(\lambda) \doteq \det(\lambda I_n - A) = 0,$$

where $p(\lambda)$ is a polynomial of degree $n$ in $\lambda$, known as the characteristic polynomial of $A$.

Any matrix $A \in \mathbb{R}^{n,n}$ has $n$ eigenvalues $\lambda_i$, $i = 1, \ldots, n$, counting multiplicities. To each distinct eigenvalue $\lambda_i$, $i = 1, \ldots, k$, there corresponds a whole subspace $\phi_i \doteq \mathcal{N}(\lambda_i I_n - A)$ of eigenvectors associated with this eigenvalue, called the eigenspace.

Diagonalizable matrices

Theorem 2

Let $\lambda_i$, $i = 1, \ldots, k \le n$, be the distinct eigenvalues of $A \in \mathbb{R}^{n,n}$, let $\mu_i$, $i = 1, \ldots, k$, denote the corresponding algebraic multiplicities, let $\phi_i = \mathcal{N}(\lambda_i I_n - A)$, and let $U^{(i)} = [u_1^{(i)} \cdots u_{\nu_i}^{(i)}]$ be a matrix containing by columns a basis of $\phi_i$, where $\nu_i \doteq \dim \phi_i$. It holds that $\nu_i \le \mu_i$ and, if $\nu_i = \mu_i$, $i = 1, \ldots, k$, then $U = [U^{(1)} \cdots U^{(k)}]$ is invertible, and

$$A = U \Lambda U^{-1}, \quad \text{where} \quad \Lambda = \begin{bmatrix} \lambda_1 I_{\mu_1} & 0 & \cdots & 0 \\ 0 & \lambda_2 I_{\mu_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda_k I_{\mu_k} \end{bmatrix}.$$
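A numerical check of the factorization (NumPy sketch, added for these notes; a symmetric matrix is used since it is guaranteed diagonalizable):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
A = A + A.T  # symmetric, hence diagonalizable with real eigenvalues

lam, U = np.linalg.eig(A)
Lam = np.diag(lam)

# Reconstruct A from its eigendecomposition A = U Lam U^{-1}.
print(np.allclose(U @ Lam @ np.linalg.inv(U), A))  # True
```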

Eigenvectors and Google’s PageRank

The effectiveness of Google’s search engine largely relies on its PageRank algorithm (so named after Google’s founder Larry Page), which quantitatively ranks the importance of each page on the web, allowing Google to present to the user the most important (and typically most relevant and helpful) pages first.

If the web of interest is composed of $n$ pages, each labelled with an integer $k$, $k = 1, \ldots, n$, we can model this web as a directed graph, where pages are the nodes of the graph, and a directed edge points from node $k_1$ to node $k_2$ if page $k_1$ contains a link to $k_2$.

We denote by $x_k$, $k = 1, \ldots, n$, the importance score of page $k$.

Eigenvectors and Google’s PageRank

Each page $j$ has a score $x_j$ and $n_j$ outgoing links; as an assumption, we do not allow links from a page to itself, and we do not allow dangling pages (pages with no outgoing links), therefore $n_j > 0$ for all $j$.

The score $x_j$ represents the total “voting” power of node $j$, which is to be evenly subdivided among the $n_j$ outgoing links; each outgoing link thus carries $x_j / n_j$ units of vote.

Let $B_k$ denote the set of labels of the pages that point to page $k$, i.e., $B_k$ is the set of backlinks for page $k$. Then, the score of page $k$ is computed as

$$x_k = \sum_{j \in B_k} \frac{x_j}{n_j}, \qquad k = 1, \ldots, n.$$

Eigenvectors and Google’s PageRank

For the example in the figure, we have $n_1 = 3$, $n_2 = 2$, $n_3 = 1$, $n_4 = 2$, hence

$$x_1 = x_3 + \tfrac{1}{2} x_4, \qquad x_2 = \tfrac{1}{3} x_1, \qquad x_3 = \tfrac{1}{3} x_1 + \tfrac{1}{2} x_2 + \tfrac{1}{2} x_4, \qquad x_4 = \tfrac{1}{3} x_1 + \tfrac{1}{2} x_2.$$

Eigenvectors and Google’s PageRank

We can write this system of equations in compact form, exploiting the matrix-vector product, as follows:

 1    0 0 1 2 x1 1  3 0 0 0   x2  x = Ax, A =  1 1 1  , x =   .  3 2 0 2   x3  1 1 3 2 0 0 x4

Computing the web pages’ scores thus amounts to finding $x$ such that $Ax = x$: this is an eigenvalue/eigenvector problem and, in particular, $x$ is an eigenvector of $A$ associated with the eigenvalue $\lambda = 1$. $A$ is called the link matrix of the network.

In this example, the eigenspace $\phi_1 = \mathcal{N}(I_n - A)$ associated with the eigenvalue $\lambda = 1$ has dimension one, and it is given by

$$\phi_1 = \mathcal{N}(I_n - A) = \operatorname{span} \begin{bmatrix} 12 & 4 & 9 & 6 \end{bmatrix}^\top.$$

Page 1 thus appears to be the most relevant, according to the PageRank scoring.
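The scores can be computed by power iteration, i.e., repeatedly applying $A$ to an initial guess; for this link matrix the iteration converges to the eigenvector associated with $\lambda = 1$ (a NumPy sketch, added for these notes):

```python
import numpy as np

# Link matrix of the 4-page example above.
A = np.array([
    [0,   0,   1, 0.5],
    [1/3, 0,   0, 0  ],
    [1/3, 0.5, 0, 0.5],
    [1/3, 0.5, 0, 0  ],
])

x = np.ones(4) / 4  # start from uniform scores
for _ in range(200):
    x = A @ x
    x /= x.sum()    # normalize so the scores sum to one

print(x * 31)  # approx [12, 4, 9, 6]: proportional to the eigenvector above
```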

Matrices with special structure

- Square, diagonal, triangular (upper or lower)
- Symmetric: a square matrix $A$ such that $A = A^\top$
- Orthogonal: a square matrix $A$ such that $AA^\top = A^\top A = I$
- Dyad: a rank-one matrix $A$, which can be written as $A = uv^\top$, where $u, v$ are vectors
- Block-structured matrices: block diagonal, block triangular, etc.

Matrix factorizations

Given a matrix $A \in \mathbb{R}^{m,n}$, the goal is to write this matrix as the product of two or more matrices with special structure. Usually, once a matrix is suitably factorized, several quantities of interest become readily accessible, and subsequent computations are greatly simplified. In terms of the linear map defined by a matrix $A$, a factorization can be interpreted as a decomposition of the map into a composition of successive stages.

Matrix factorizations (more on this later)

Orthogonal-triangular decomposition (QR). Any square $A \in \mathbb{R}^{n,n}$ can be decomposed as

$$A = QR,$$

where $Q$ is an orthogonal matrix, and $R$ is an upper triangular matrix. If $A$ is nonsingular, then the factors $Q, R$ are uniquely defined if the diagonal elements of $R$ are imposed to be positive.

Singular value decomposition (SVD). Any non-zero $A \in \mathbb{R}^{m,n}$ can be decomposed as

$$A = U \tilde{\Sigma} V^\top,$$

where $V \in \mathbb{R}^{n,n}$ and $U \in \mathbb{R}^{m,m}$ are orthogonal matrices, and

$$\tilde{\Sigma} = \begin{bmatrix} \Sigma & 0_{r, n-r} \\ 0_{m-r, r} & 0_{m-r, n-r} \end{bmatrix}, \qquad \Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_r),$$

where $r$ is the rank of $A$, and the scalars $\sigma_i > 0$, $i = 1, \ldots, r$, are called the singular values of $A$. The first $r$ columns $u_1, \ldots, u_r$ of $U$ (resp. $v_1, \ldots, v_r$ of $V$) are called the left (resp. right) singular vectors, and satisfy

$$A v_i = \sigma_i u_i, \qquad A^\top u_i = \sigma_i v_i, \qquad i = 1, \ldots, r.$$
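Both factorizations are available in NumPy (np.linalg.qr and np.linalg.svd); here is a sketch checking the defining relations of the singular vectors (added for these notes):

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((5, 3))

# Full SVD: U is 5x5, s holds the singular values, rows of Vt are the v_i^T.
U, s, Vt = np.linalg.svd(A)

for i in range(len(s)):
    assert np.allclose(A @ Vt[i], s[i] * U[:, i])    # A v_i = sigma_i u_i
    assert np.allclose(A.T @ U[:, i], s[i] * Vt[i])  # A^T u_i = sigma_i v_i
```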

Matrix norms

A function $f : \mathbb{R}^{m,n} \to \mathbb{R}$ is a matrix norm if, analogously to the vector case, it satisfies three standard axioms. Namely, for all $A, B \in \mathbb{R}^{m,n}$ and all $\alpha \in \mathbb{R}$:

- $f(A) \ge 0$, and $f(A) = 0$ if and only if $A = 0$;
- $f(\alpha A) = |\alpha| f(A)$;
- $f(A + B) \le f(A) + f(B)$.

Many of the popular matrix norms also satisfy a fourth condition, called sub-multiplicativity: for any conformably sized matrices $A, B$,

$$f(AB) \le f(A) f(B).$$

Matrix norms: Frobenius norm

The Frobenius norm $\|A\|_F$ is nothing but the standard Euclidean ($\ell_2$) vector norm applied to the vector formed by all elements of $A \in \mathbb{R}^{m,n}$:

$$\|A\|_F = \sqrt{\operatorname{trace} AA^\top} = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2}.$$

The Frobenius norm also has an interpretation in terms of the eigenvalues of the matrix $AA^\top$:

$$\|A\|_F = \sqrt{\operatorname{trace} AA^\top} = \sqrt{\sum_{i=1}^{m} \lambda_i(AA^\top)}.$$

For any $x \in \mathbb{R}^n$, it holds that $\|Ax\|_2 \le \|A\|_F \|x\|_2$ (this is a consequence of the Cauchy-Schwarz inequality applied to $|a_i^\top x|$, for each row $a_i^\top$ of $A$).

The Frobenius norm is sub-multiplicative: for any $B \in \mathbb{R}^{n,p}$, it holds that

$$\|AB\|_F \le \|A\|_F \|B\|_F.$$
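These properties are easy to check numerically (an illustrative NumPy sketch, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
x = rng.standard_normal(4)

fro = np.linalg.norm(A, 'fro')
print(np.isclose(fro, np.sqrt(np.trace(A @ A.T))))                     # trace formula
print(np.linalg.norm(A @ x) <= fro * np.linalg.norm(x))                # gain bound
print(np.linalg.norm(A @ B, 'fro') <= fro * np.linalg.norm(B, 'fro'))  # sub-multiplicativity
```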

Matrix norms: Operator norms

The so-called operator norms give a characterization of the maximum input-output gain of the linear map $u \mapsto y = Au$. Choosing to measure both inputs and outputs in terms of a given $\ell_p$ norm, with typical values $p = 1, 2, \infty$, leads to the definition

$$\|A\|_p \doteq \max_{u \neq 0} \frac{\|Au\|_p}{\|u\|_p} = \max_{\|u\|_p = 1} \|Au\|_p.$$

By definition, for every $u$, $\|Au\|_p \le \|A\|_p \|u\|_p$. From this property it follows that any operator norm is sub-multiplicative; that is, for any two conformably sized matrices $A, B$, it holds that $\|AB\|_p \le \|A\|_p \|B\|_p$. This fact is easily seen by considering the product $AB$ as the composition of the two operators $B$ and $A$:

$$\|Bu\|_p \le \|B\|_p \|u\|_p, \qquad \|ABu\|_p \le \|A\|_p \|Bu\|_p \le \|A\|_p \|B\|_p \|u\|_p.$$

Matrix norms: Operator norms

For the typical values $p = 1, 2, \infty$, we have the following results:

The $\ell_1$-induced norm corresponds to the largest absolute column sum:

$$\|A\|_1 = \max_{\|u\|_1 = 1} \|Au\|_1 = \max_{j = 1, \ldots, n} \sum_{i=1}^{m} |a_{ij}|.$$

The $\ell_\infty$-induced norm corresponds to the largest absolute row sum:

$$\|A\|_\infty = \max_{\|u\|_\infty = 1} \|Au\|_\infty = \max_{i = 1, \ldots, m} \sum_{j=1}^{n} |a_{ij}|.$$

The $\ell_2$-induced norm (sometimes referred to as the spectral norm) corresponds to the square root of the largest eigenvalue $\lambda_{\max}$ of $A^\top A$:

$$\|A\|_2 = \max_{\|u\|_2 = 1} \|Au\|_2 = \sqrt{\lambda_{\max}(A^\top A)}.$$

The latter identity follows from the variational characterization of the eigenvalues of a symmetric matrix; we revisit this in Lecture 5.
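NumPy's np.linalg.norm implements the three induced norms; a sketch comparing them against the formulas above (added for these notes):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((3, 4))

print(np.isclose(np.linalg.norm(A, 1), np.abs(A).sum(axis=0).max()))       # max column sum
print(np.isclose(np.linalg.norm(A, np.inf), np.abs(A).sum(axis=1).max()))  # max row sum

lam_max = np.linalg.eigvalsh(A.T @ A).max()
print(np.isclose(np.linalg.norm(A, 2), np.sqrt(lam_max)))                  # spectral norm
```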

Spectral radius

The spectral radius $\rho(A)$ of a matrix $A \in \mathbb{R}^{n,n}$ is defined as the maximum modulus of the eigenvalues of $A$, that is,

$$\rho(A) \doteq \max_{i = 1, \ldots, n} |\lambda_i(A)|.$$

Clearly, $\rho(A) \ge 0$ for all $A$, and $A = 0$ implies $\rho(A) = 0$. However, the converse is not true: $\rho(A) = 0$ does not necessarily imply that $A = 0$, hence $\rho(A)$ is not a matrix norm. For example, the nonzero nilpotent matrix $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ has both eigenvalues equal to zero, hence spectral radius zero.

However, for any induced matrix norm $\| \cdot \|_p$, it holds that

$$\rho(A) \le \|A\|_p.$$

It follows, in particular, that $\rho(A) \le \min(\|A\|_1, \|A\|_\infty)$; that is, $\rho(A)$ is no larger than the maximum row or column sum of $|A|$ (the matrix whose entries are the absolute values of the entries of $A$).
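A quick numerical check of these bounds (NumPy sketch, added for these notes):

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((4, 4))

rho = np.abs(np.linalg.eigvals(A)).max()
print(rho <= np.linalg.norm(A, 1))       # rho(A) <= ||A||_1
print(rho <= np.linalg.norm(A, np.inf))  # rho(A) <= ||A||_inf

# A nonzero matrix can have spectral radius zero, so rho is not a norm.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
print(np.abs(np.linalg.eigvals(N)).max())  # 0.0
```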

Matrix functions: Matrix powers and polynomials

The integer power function

$$f(X) = X^k, \qquad k = 0, 1, \ldots$$

can be quite naturally defined via the matrix product, by observing that $X^k = X X \cdots X$ ($k$ times; we take the convention that $X^0 = I_n$). Similarly, negative integer power functions can be defined over nonsingular matrices as integer powers of the inverse:

$$f(X) = X^{-k} = (X^{-1})^k, \qquad k = 0, 1, \ldots$$

A polynomial matrix function of degree $m \ge 0$ can hence be naturally defined as

$$p(X) = a_m X^m + a_{m-1} X^{m-1} + \cdots + a_1 X + a_0 I_n,$$

where $a_i$, $i = 0, 1, \ldots, m$, are the scalar coefficients of the polynomial.

Matrix functions: Diagonal factorization of a matrix polynomial

Let $X \in \mathbb{R}^{n,n}$ admit a diagonal factorization

$$X = U \Lambda U^{-1},$$

where $\Lambda$ is a diagonal matrix containing the eigenvalues of $X$, and $U$ is a matrix containing by columns the corresponding eigenvectors. Let $p(t)$, $t \in \mathbb{R}$, be a polynomial

$$p(t) = a_m t^m + a_{m-1} t^{m-1} + \cdots + a_1 t + a_0.$$

Then

$$p(X) = U p(\Lambda) U^{-1}, \qquad p(\Lambda) = \operatorname{diag}(p(\lambda_1), \ldots, p(\lambda_n)).$$

More generally, if $(\lambda, u)$ is an eigenvalue/eigenvector pair for $X$, then

$$p(X) u = p(\lambda) u.$$

Matrix functions: Non-polynomial matrix functions

Let $f : \mathbb{R} \to \mathbb{R}$ be an analytic function, that is, a function which is locally representable by a power series $f(t) = \sum_{k=0}^{\infty} a_k t^k$, convergent for all $t$ such that $|t| \le R$, $R > 0$. If $\rho(X) < R$ (where $\rho(X)$ is the spectral radius of $X$), then the value of the matrix function $f(X)$ can be defined as the sum of the convergent series

$$f(X) = \sum_{k=0}^{\infty} a_k X^k.$$

Moreover, if $X$ is diagonalizable, then $X = U \Lambda U^{-1}$, and

$$f(X) = \sum_{k=0}^{\infty} a_k X^k = U f(\Lambda) U^{-1}.$$

This equation states in particular that the spectrum (i.e., the set of eigenvalues) of $f(X)$ is the image of the spectrum of $X$ under $f$. This fact is known as the spectral mapping theorem.

Matrix functions: Examples

The matrix exponential: the function $f(t) = e^t$ has a power series representation which is globally convergent,

$$e^t = \sum_{k=0}^{\infty} \frac{1}{k!} t^k,$$

hence, for any diagonalizable $X \in \mathbb{R}^{n,n}$, we have

$$e^X \doteq \sum_{k=0}^{\infty} \frac{1}{k!} X^k = U \operatorname{diag}\left(e^{\lambda_1}, \ldots, e^{\lambda_n}\right) U^{-1}.$$

Another example is given by the geometric series

$$f(t) = (1 - t)^{-1} = \sum_{k=0}^{\infty} t^k, \qquad \text{for } |t| < 1 = R,$$

from which we obtain that

$$f(X) = (I - X)^{-1} = \sum_{k=0}^{\infty} X^k, \qquad \text{for } \rho(X) < 1.$$
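Both examples can be verified numerically; the sketch below (added for these notes, using SciPy's expm for comparison) computes $e^X$ via the eigendecomposition and sums a truncated Neumann series:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(10)
X = rng.standard_normal((3, 3))
X = X + X.T  # symmetric, hence diagonalizable with an orthogonal U

# e^X via the eigendecomposition X = U diag(lam) U^T.
lam, U = np.linalg.eigh(X)
print(np.allclose(U @ np.diag(np.exp(lam)) @ U.T, expm(X)))  # True

# Geometric (Neumann) series: (I - X)^{-1} = sum_k X^k when rho(X) < 1.
X = 0.4 * X / np.abs(lam).max()  # rescale so that rho(X) = 0.4
S = sum(np.linalg.matrix_power(X, k) for k in range(80))
print(np.allclose(S, np.linalg.inv(np.eye(3) - X)))  # True
```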
