Introduction to Numerical Linear Algebra II

Introduction to Numerical Linear Algebra II
Petros Drineas
These slides were prepared by Ilse Ipsen for the 2015 Gene Golub SIAM Summer School on RandNLA.

Overview
We will cover this material in relatively more detail, but will still skip a lot.
1. Norms {measuring the length/mass of mathematical quantities}: general norms, vector p-norms, matrix norms induced by vector p-norms, Frobenius norm
2. Singular Value Decomposition (SVD): the most important tool in Numerical Linear Algebra
3. Least squares problems: linear systems that do not have a solution

Norms

General Norms
How to measure the mass of a matrix or the length of a vector.
A norm $\|\cdot\|$ is a function $\mathbb{R}^{m \times n} \to \mathbb{R}$ with
1. Non-negativity: $\|A\| \ge 0$, and $\|A\| = 0 \iff A = 0$
2. Triangle inequality: $\|A + B\| \le \|A\| + \|B\|$
3. Scalar multiplication: $\|\alpha A\| = |\alpha| \, \|A\|$ for all $\alpha \in \mathbb{R}$

Properties
Minus signs: $\|{-A}\| = \|A\|$
Reverse triangle inequality: $\big|\, \|A\| - \|B\| \,\big| \le \|A - B\|$

New norms
For a norm $\|\cdot\|$ on $\mathbb{R}^{m \times n}$ and nonsingular $M \in \mathbb{R}^{m \times m}$,
$\|A\|_M \stackrel{\text{def}}{=} \|MA\|$ is also a norm.

Vector p-Norms
For $x \in \mathbb{R}^n$ and integer $p \ge 1$,
$$\|x\|_p \stackrel{\text{def}}{=} \Big( \sum_{i=1}^n |x_i|^p \Big)^{1/p}.$$
One norm: $\|x\|_1 = \sum_{j=1}^n |x_j|$
Euclidean (two) norm: $\|x\|_2 = \sqrt{\sum_{j=1}^n |x_j|^2} = \sqrt{x^T x}$
Infinity (max) norm: $\|x\|_\infty = \max_{1 \le j \le n} |x_j|$

Exercises
1. Determine $\|\mathbf{1}\|_p$ for the all-ones vector $\mathbf{1} \in \mathbb{R}^n$ and $p = 1, 2, \infty$.
2. For $x \in \mathbb{R}^n$ with $x_j = j$, $1 \le j \le n$, determine closed-form expressions for $\|x\|_p$ for $p = 1, 2, \infty$.

Inner Products and Norm Relations
For $x, y \in \mathbb{R}^n$:
Cauchy-Schwarz inequality: $|x^T y| \le \|x\|_2 \, \|y\|_2$
Hölder inequality: $|x^T y| \le \|x\|_1 \, \|y\|_\infty$ and $|x^T y| \le \|x\|_\infty \, \|y\|_1$
Relations:
$\|x\|_\infty \le \|x\|_1 \le n \, \|x\|_\infty$
$\|x\|_2 \le \|x\|_1 \le \sqrt{n} \, \|x\|_2$
$\|x\|_\infty \le \|x\|_2 \le \sqrt{n} \, \|x\|_\infty$

Exercises
1. Prove the norm relations on the previous slide.
2. For $x \in \mathbb{R}^n$ show $\|x\|_2 \le \sqrt{\|x\|_1 \, \|x\|_\infty}$.

Vector Two Norm
Theorem of Pythagoras: for $x, y \in \mathbb{R}^n$,
$$x^T y = 0 \iff \|x \pm y\|_2^2 = \|x\|_2^2 + \|y\|_2^2.$$
The two norm does not care about orthonormal matrices: for $x \in \mathbb{R}^n$ and $V \in \mathbb{R}^{m \times n}$ with $V^T V = I_n$,
$$\|Vx\|_2 = \|x\|_2, \quad \text{since} \quad \|Vx\|_2^2 = (Vx)^T (Vx) = x^T V^T V x = x^T x = \|x\|_2^2.$$

Exercises
For $x, y \in \mathbb{R}^n$ show:
1. Parallelogram equality: $\|x + y\|_2^2 + \|x - y\|_2^2 = 2 \big( \|x\|_2^2 + \|y\|_2^2 \big)$
2. Polarization identity: $x^T y = \tfrac{1}{4} \big( \|x + y\|_2^2 - \|x - y\|_2^2 \big)$

Matrix Norms (Induced by Vector p-Norms)
For $A \in \mathbb{R}^{m \times n}$ and integer $p \ge 1$,
$$\|A\|_p \stackrel{\text{def}}{=} \max_{x \ne 0} \frac{\|Ax\|_p}{\|x\|_p} = \max_{\|y\|_p = 1} \|Ay\|_p.$$
One norm (maximum absolute column sum): $\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^m |a_{ij}| = \max_{1 \le j \le n} \|A e_j\|_1$
Infinity norm (maximum absolute row sum): $\|A\|_\infty = \max_{1 \le i \le m} \sum_{j=1}^n |a_{ij}| = \max_{1 \le i \le m} \|A^T e_i\|_1$

Matrix Norm Properties
Every norm is realized by some vector $y \ne 0$:
$$\|A\|_p = \frac{\|Ay\|_p}{\|y\|_p} = \|Az\|_p \quad \text{where} \quad z \equiv \frac{y}{\|y\|_p}, \quad \|z\|_p = 1.$$
{The vector $y$ is different for every $A$ and every $p$.}
Submultiplicativity:
Matrix-vector product: for $A \in \mathbb{R}^{m \times n}$ and $y \in \mathbb{R}^n$, $\|Ay\|_p \le \|A\|_p \, \|y\|_p$
Matrix product: for $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times l}$, $\|AB\|_p \le \|A\|_p \, \|B\|_p$

More Matrix Norm Properties
For $A \in \mathbb{R}^{m \times n}$ and permutation matrices $P \in \mathbb{R}^{m \times m}$, $Q \in \mathbb{R}^{n \times n}$:
Permutation matrices do not matter: $\|PAQ\|_p = \|A\|_p$
Submatrices have smaller norms than the parent matrix: if $PAQ = \begin{pmatrix} B & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$ then $\|B\|_p \le \|A\|_p$.
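As a numerical sanity check of the induced-norm formulas above, here is a small NumPy sketch (not part of the original slides; the matrix size and sample count are arbitrary choices). It verifies that the maximum absolute column and row sums agree with NumPy's induced 1- and infinity-norms, and that the ratio $\|Ax\|_p / \|x\|_p$ for random $x$ never exceeds $\|A\|_p$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

col_sum = np.abs(A).sum(axis=0).max()   # maximum absolute column sum
row_sum = np.abs(A).sum(axis=1).max()   # maximum absolute row sum

assert np.isclose(col_sum, np.linalg.norm(A, 1))        # ||A||_1
assert np.isclose(row_sum, np.linalg.norm(A, np.inf))   # ||A||_inf

# Random directions never beat the induced norm ||A||_p = max ||Ax||_p / ||x||_p
for p in (1, 2, np.inf):
    norm_p = np.linalg.norm(A, p)
    for _ in range(1000):
        x = rng.standard_normal(A.shape[1])
        assert np.linalg.norm(A @ x, p) / np.linalg.norm(x, p) <= norm_p + 1e-12
```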
Exercises
1. Different norms are realized by different vectors: find $y$ and $z$ so that $\|A\|_1 = \|Ay\|_1$ and $\|A\|_\infty = \|Az\|_\infty$ when
$$A = \begin{pmatrix} 1 & 4 \\ 0 & 2 \end{pmatrix}.$$
2. If $Q \in \mathbb{R}^{n \times n}$ is a permutation matrix then $\|Q\|_p = 1$.
3. Prove the two types of submultiplicativity.
4. If $D = \mathrm{diag}(d_{11}, \ldots, d_{nn})$ then $\|D\|_p = \max_{1 \le j \le n} |d_{jj}|$.
5. If $A \in \mathbb{R}^{n \times n}$ is nonsingular then $\|A\|_p \, \|A^{-1}\|_p \ge 1$.

Norm Relations and Transposes
For $A \in \mathbb{R}^{m \times n}$:
Relations between different norms:
$$\tfrac{1}{\sqrt{n}} \, \|A\|_1 \le \|A\|_2 \le \sqrt{m} \, \|A\|_1, \qquad \tfrac{1}{\sqrt{m}} \, \|A\|_\infty \le \|A\|_2 \le \sqrt{n} \, \|A\|_\infty.$$
Transposes: $\|A^T\|_1 = \|A\|_\infty$, $\|A^T\|_\infty = \|A\|_1$, $\|A^T\|_2 = \|A\|_2$.

Proof: Two Norm of Transpose
{True for $A = 0$, so assume $A \ne 0$.}
Let $\|A\|_2 = \|Az\|_2$ for some $z \in \mathbb{R}^n$ with $\|z\|_2 = 1$.
Reduce to a vector norm:
$$\|A\|_2^2 = \|Az\|_2^2 = (Az)^T (Az) = z^T A^T A z = z^T \underbrace{(A^T A z)}_{\text{vector}}.$$
The Cauchy-Schwarz inequality and submultiplicativity imply
$$z^T A^T A z \le \|z\|_2 \, \|A^T A z\|_2 \le \|A^T\|_2 \, \|A\|_2.$$
Thus $\|A\|_2^2 \le \|A^T\|_2 \, \|A\|_2$ and $\|A\|_2 \le \|A^T\|_2$.
Reversing the roles of $A$ and $A^T$ gives $\|A^T\|_2 \le \|A\|_2$.

Matrix Two Norm
$A \in \mathbb{R}^{m \times n}$.
Orthonormal matrices do not change anything: if $U \in \mathbb{R}^{k \times m}$ with $U^T U = I_m$ and $V \in \mathbb{R}^{l \times n}$ with $V^T V = I_n$, then $\|U A V^T\|_2 = \|A\|_2$.
Gram matrices: for $A \in \mathbb{R}^{m \times n}$, $\|A^T A\|_2 = \|A\|_2^2 = \|A A^T\|_2$.
Outer products: for $x \in \mathbb{R}^m$, $y \in \mathbb{R}^n$, $\|x y^T\|_2 = \|x\|_2 \, \|y\|_2$.

Proof: Norm of Gram Matrix
Submultiplicativity and the transpose result give
$$\|A^T A\|_2 \le \|A^T\|_2 \, \|A\|_2 = \|A\|_2 \, \|A\|_2 = \|A\|_2^2,$$
so $\|A^T A\|_2 \le \|A\|_2^2$.
Let $\|A\|_2 = \|Az\|_2$ for some $z \in \mathbb{R}^n$ with $\|z\|_2 = 1$. The Cauchy-Schwarz inequality and submultiplicativity imply
$$\|A\|_2^2 = \|Az\|_2^2 = (Az)^T (Az) = z^T \underbrace{(A^T A z)}_{\text{vector}} \le \|z\|_2 \, \|A^T A z\|_2 \le \|A^T A\|_2.$$
Thus $\|A\|_2^2 \le \|A^T A\|_2$.

Exercises
1. Infinity norm of outer products: for $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$, show $\|x y^T\|_\infty = \|x\|_\infty \, \|y\|_1$.
2. If $A \in \mathbb{R}^{n \times n}$ with $A \ne 0$ is idempotent then (i) $\|A\|_p \ge 1$, and (ii) $\|A\|_2 = 1$ if $A$ is also symmetric.
3. Given $A \in \mathbb{R}^{n \times n}$: among all symmetric matrices, $\tfrac{1}{2}(A + A^T)$ is a (the?) matrix that is closest to $A$ in the two norm.

Frobenius Norm
The true mass of a matrix. For $A = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix} \in \mathbb{R}^{m \times n}$,
$$\|A\|_F \stackrel{\text{def}}{=} \sqrt{\sum_{j=1}^n \|a_j\|_2^2} = \sqrt{\sum_{j=1}^n \sum_{i=1}^m |a_{ij}|^2} = \sqrt{\mathrm{trace}(A^T A)}.$$
Vector: if $x \in \mathbb{R}^n$ then $\|x\|_F = \|x\|_2$
Transpose: $\|A^T\|_F = \|A\|_F$
Identity: $\|I_n\|_F = \sqrt{n}$

More Frobenius Norm Properties
$A \in \mathbb{R}^{m \times n}$.
Orthonormal invariance: if $U \in \mathbb{R}^{k \times m}$ with $U^T U = I_m$ and $V \in \mathbb{R}^{l \times n}$ with $V^T V = I_n$, then $\|U A V^T\|_F = \|A\|_F$.
Relation to the two norm: $\|A\|_2 \le \|A\|_F \le \sqrt{\mathrm{rank}(A)} \, \|A\|_2 \le \sqrt{\min\{m, n\}} \, \|A\|_2$.
Submultiplicativity: $\|AB\|_F \le \|A\|_2 \, \|B\|_F \le \|A\|_F \, \|B\|_F$.

Exercises
1. Show the orthonormal invariance of the Frobenius norm.
2. Show the submultiplicativity of the Frobenius norm.
3. Frobenius norm of outer products: for $x \in \mathbb{R}^m$ and $y \in \mathbb{R}^n$ show $\|x y^T\|_F = \|x\|_2 \, \|y\|_2$.
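The Frobenius norm identities and inequalities above are easy to check numerically. The sketch below is an illustration only (it assumes NumPy and an arbitrary random example, not anything from the slides); it verifies $\|A\|_F = \sqrt{\mathrm{trace}(A^T A)}$, the relation to the two norm, orthonormal invariance, and submultiplicativity.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((3, 4))

fro = np.linalg.norm(A, 'fro')
two = np.linalg.norm(A, 2)                  # largest singular value of A
svals = np.linalg.svd(A, compute_uv=False)

assert np.isclose(fro, np.sqrt(np.trace(A.T @ A)))   # ||A||_F^2 = trace(A^T A)
assert np.isclose(fro, np.sqrt((svals**2).sum()))    # ||A||_F^2 = sum of sigma_j^2
assert two <= fro + 1e-12
assert fro <= np.sqrt(np.linalg.matrix_rank(A)) * two + 1e-12

# Orthonormal invariance: multiply by orthogonal factors obtained from QR
U, _ = np.linalg.qr(rng.standard_normal((5, 5)))
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
assert np.isclose(np.linalg.norm(U @ A @ V.T, 'fro'), fro)

# Submultiplicativity: ||AB||_F <= ||A||_2 ||B||_F <= ||A||_F ||B||_F
assert np.linalg.norm(A @ B, 'fro') <= two * np.linalg.norm(B, 'fro') + 1e-12
```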
Singular Value Decomposition (SVD)

Full SVD
Given $A \in \mathbb{R}^{m \times n}$.
Tall and skinny ($m \ge n$): $A = U \begin{pmatrix} \Sigma \\ 0 \end{pmatrix} V^T$ where $\Sigma \equiv \mathrm{diag}(\sigma_1, \ldots, \sigma_n) \ge 0$.
Short and fat ($m \le n$): $A = U \begin{pmatrix} \Sigma & 0 \end{pmatrix} V^T$ where $\Sigma \equiv \mathrm{diag}(\sigma_1, \ldots, \sigma_m) \ge 0$.
$U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal matrices.

Names and Conventions
$A \in \mathbb{R}^{m \times n}$ with SVD $A = U \begin{pmatrix} \Sigma \\ 0 \end{pmatrix} V^T$ or $A = U \begin{pmatrix} \Sigma & 0 \end{pmatrix} V^T$.
Singular values (svalues): the diagonal elements $\sigma_j$ of $\Sigma$
Left singular vector matrix: $U$
Right singular vector matrix: $V$
Svalue ordering: $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_{\min\{m,n\}} \ge 0$
Svalues of matrices $B$, $C$: $\sigma_j(B)$, $\sigma_j(C)$

SVD Properties
$A \in \mathbb{R}^{m \times n}$.
Number of svalues equals the small dimension: $A \in \mathbb{R}^{m \times n}$ has $\min\{m, n\}$ singular values $\sigma_j \ge 0$.
Orthogonal invariance: if $P \in \mathbb{R}^{m \times m}$ and $Q \in \mathbb{R}^{n \times n}$ are orthogonal matrices then $PAQ$ has the same svalues as $A$.
Gram product: the nonzero svalues (= eigenvalues) of $A^T A$ are the squares of the svalues of $A$.
Inverse: $A \in \mathbb{R}^{n \times n}$ is nonsingular $\iff \sigma_j > 0$ for $1 \le j \le n$. If $A = U \Sigma V^T$ then $A^{-1} = V \Sigma^{-1} U^T$ is an SVD of the inverse.

Exercises
1. Transpose: $A^T$ has the same singular values as $A$.
2. Orthogonal matrices: all singular values of $A \in \mathbb{R}^{n \times n}$ are equal to 1 $\iff$ $A$ is an orthogonal matrix.
3. If $A \in \mathbb{R}^{n \times n}$ is symmetric and idempotent then all singular values are 0 and/or 1.
4. For $A \in \mathbb{R}^{m \times n}$ and $\alpha > 0$ express the svalues of $(A^T A + \alpha I)^{-1} A^T$ in terms of $\alpha$ and the svalues of $A$.
5. If $A \in \mathbb{R}^{m \times n}$ with $m \ge n$ then the singular values of $\begin{pmatrix} I_n \\ A \end{pmatrix}$ are equal to $\sqrt{1 + \sigma_j^2}$ for $1 \le j \le n$.

Singular Values
$A \in \mathbb{R}^{m \times n}$ with svalues $\sigma_1 \ge \cdots \ge \sigma_p$, where $p \equiv \min\{m, n\}$.
Two norm: $\|A\|_2 = \sigma_1$
Frobenius norm: $\|A\|_F = \sqrt{\sigma_1^2 + \cdots + \sigma_p^2}$
Well conditioned in the absolute sense: $|\sigma_j(A) - \sigma_j(B)| \le \|A - B\|_2$, $1 \le j \le p$
Product: $\sigma_j(AB) \le \sigma_1(A) \, \sigma_j(B)$, $1 \le j \le p$

Matrix Schatten Norms
For $A \in \mathbb{R}^{m \times n}$ with nonzero singular values $\sigma_1 \ge \cdots \ge \sigma_\rho > 0$ and integer $p \ge 1$, the family of Schatten p-norms is defined as
$$\|A\|_p \stackrel{\text{def}}{=} \Big( \sum_{i=1}^{\rho} \sigma_i^p \Big)^{1/p}.$$
These are different from the vector-induced matrix p-norms.
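As an illustration (not from the original slides), the following NumPy sketch computes Schatten p-norms from the nonzero singular values: $p = 1$ gives the nuclear (trace) norm, $p = 2$ recovers the Frobenius norm, and large $p$ approaches the spectral norm $\sigma_1$. The helper name and tolerance are arbitrary choices.

```python
import numpy as np

def schatten_norm(A, p, tol=1e-12):
    """(sum of sigma_i^p)^(1/p) over the nonzero singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    s = s[s > tol]
    return (s**p).sum() ** (1.0 / p)

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))

assert np.isclose(schatten_norm(A, 2), np.linalg.norm(A, 'fro'))  # p = 2: Frobenius norm
assert np.isclose(schatten_norm(A, 1), np.linalg.norm(A, 'nuc'))  # p = 1: nuclear norm
print(schatten_norm(A, 100), np.linalg.norm(A, 2))                # large p: close to sigma_1
```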