What Is a Tensor? (Part II)

What is a tensor? (Part II)
Lek-Heng Lim, University of Chicago

goals of these tutorials
• very elementary introduction to tensors
• through the lens of linear algebra and numerical linear algebra
• highlight their roles in computations
• trace chronological development through three definitions
  ① a multi-indexed object that satisfies tensor transformation rules
  ② a multilinear map
  ③ an element of a tensor product of vector spaces
• last week: ①, the trickiest one to discuss
• today: ② and ③, the much easier modern definitions

recap from part I
• take-home idea: a tensor is defined by its transformation rule
• example: is
  [1 2 3; 2 3 4; 3 4 5]
  a tensor?
• the question makes no sense
• the definition requires a context

recap from part I
• if it comes from a measurement of stress
  [σ11 σ12 σ13; σ21 σ22 σ23; σ31 σ32 σ33]
  then it is a contravariant 2-tensor
• if we are interested in its eigenvalues and eigenvectors
  (XAX⁻¹)(Xv) = λ(Xv)
  then it is a mixed 2-tensor
• if we are interested in its Hadamard product
  [a11 a12 a13; a21 a22 a23; a31 a32 a33] ∘ [b11 b12 b13; b21 b22 b23; b31 b32 b33]
  = [a11b11 a12b12 a13b13; a21b21 a22b22 a23b23; a31b31 a32b32 a33b33]
  then it is not a tensor

recap from part I
• indices tell us nothing
• A ∈ ℝ^(m×n) has two indices, but if the transformation rule is
  A′ = XA = [Xa1, …, Xan]  or  A′ = X⁻ᵀA = [X⁻ᵀa1, …, X⁻ᵀan]
  then it is a covariant or contravariant 1-tensor respectively
• e.g., Householder QR algorithm, equivariant neural networks: all covariant 1-tensors no matter how many indices
• want a coordinate-free definition that does not depend on indices

tensors via multilinear maps

multilinear maps
• V1, …, Vd and W vector spaces
• a multilinear map or d-linear map is Φ : V1 × ⋯ × Vd → W with
  Φ(v1, …, λvk + λ′vk′, …, vd) = λΦ(v1, …, vk, …, vd) + λ′Φ(v1, …, vk′, …, vd)
  for v1 ∈ V1, …, vk, vk′ ∈ Vk, …, vd ∈ Vd, λ, λ′ ∈ ℝ
• write Mᵈ(V1, …, Vd; W) for the set of all such maps
• write M¹(V; W) = L(V; W) for linear maps
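The defining property can be checked numerically. A minimal sketch, assuming NumPy; the 3-way array A and the map Φ(u, v, w) = Σ aijk ui vj wk are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# A trilinear map Phi : R^2 x R^3 x R^4 -> R represented by a 3-way array A,
# Phi(u, v, w) = sum_{i,j,k} A[i,j,k] u_i v_j w_k
A = rng.standard_normal((2, 3, 4))

def Phi(u, v, w):
    return np.einsum("ijk,i,j,k->", A, u, v, w)

u, v, w = rng.standard_normal(2), rng.standard_normal(3), rng.standard_normal(4)
v2 = rng.standard_normal(3)
lam, mu = 1.7, -0.3

# linearity in the second argument, with all other arguments held fixed
lhs = Phi(u, lam * v + mu * v2, w)
rhs = lam * Phi(u, v, w) + mu * Phi(u, v2, w)
assert abs(lhs - rhs) < 1e-10
```

The same check in each of the other two arguments confirms trilinearity.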
why important
• linearity principle: almost any natural process is linear in small amounts almost everywhere
• multilinearity principle: if we keep all but one factor constant, the varying factor obeys the principle of linearity
• Hooke's and Ohm's laws are both linear, but not if we pass a current through the spring or stretch the resistor

d = 1
• B = {v1, …, vm} basis of V, B∗ = {v1∗, …, vm∗} dual basis
• any v ∈ V is uniquely represented by a ∈ ℝ^m:
  V ∋ v = a1v1 + ⋯ + amvm ⟷ [v]B = (a1, …, am)ᵀ ∈ ℝ^m
• any φ ∈ V∗ is uniquely represented by b ∈ ℝ^m:
  V∗ ∋ φ = b1v1∗ + ⋯ + bmvm∗ ⟷ [φ]B∗ = (b1, …, bm)ᵀ ∈ ℝ^m

d = 1
• if C = {v1′, …, vm′} is another basis and X ∈ ℝ^(m×m) is given by
  vj′ = x1j v1 + ⋯ + xmj vm
• change-of-basis theorem:
  [v]C = X⁻¹[v]B,  [φ]C∗ = Xᵀ[φ]B∗
• recover transformation rules for contravariant and covariant 1-tensors:
  a′ = X⁻¹a,  b′ = Xᵀb
• vectors ⟶ contravariant 1-tensors
• linear functionals ⟶ covariant 1-tensors

d = 2
• bases A = {u1, …, un} on U, B = {v1, …, vm} on V
• linear operator Φ : U → V has matrix representation [Φ]A,B = A ∈ ℝ^(m×n) where
  Φ(uj) = a1j v1 + ⋯ + amj vm
• new bases A′ and B′ give [Φ]A′,B′ = A′ ∈ ℝ^(m×n)
• change-of-basis theorem: if X ∈ GL(m) is the change-of-basis matrix on V and Y ∈ GL(n) the change-of-basis matrix on U, then
  A′ = X⁻¹AY

d = 2
• special case U = V with A = B and A′ = B′:
  A′ = X⁻¹AX
• recover transformation rules for mixed 2-tensors
• bilinear functional β : U × V → ℝ with
  β(λu + λ′u′, v) = λβ(u, v) + λ′β(u′, v),
  β(u, λv + λ′v′) = λβ(u, v) + λ′β(u, v′)
  for all u, u′ ∈ U, v, v′ ∈ V, λ, λ′ ∈ ℝ
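The d = 1 rules above can be verified numerically: under a′ = X⁻¹a and b′ = Xᵀb, the value φ(v) = bᵀa does not depend on the basis. A minimal sketch, assuming NumPy; the random X, a, b are illustrative (a Gaussian X is invertible with probability one):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
X = rng.standard_normal((m, m))   # change-of-basis matrix, assumed invertible
a = rng.standard_normal(m)        # coordinates of v (contravariant 1-tensor)
b = rng.standard_normal(m)        # coordinates of phi (covariant 1-tensor)

a_new = np.linalg.solve(X, a)     # a' = X^{-1} a
b_new = X.T @ b                   # b' = X^T b

# phi(v) = b . a is basis-independent
assert abs(b @ a - b_new @ a_new) < 1e-8
```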
• matrix representation of β:
  [β]A,B = A ∈ ℝ^(m×n) given by aij = β(ui, vj)

d = 2
• change-of-basis theorem: if [β]A′,B′ = A′ ∈ ℝ^(m×n), then
  A′ = XᵀAY
• special case U = V, A = B, A′ = B′:
  A′ = XᵀAX
• recover transformation rules for covariant 2-tensors
• linear operators ⟶ mixed 2-tensors
• bilinear functionals ⟶ covariant 2-tensors

1- and 2-tensor transformation rules
  contravariant 1-tensor:  a′ = X⁻¹a      a′ = Xa
  covariant 1-tensor:      a′ = Xᵀa       a′ = X⁻ᵀa
  covariant 2-tensor:      A′ = XᵀAX      A′ = X⁻ᵀAX⁻¹
  contravariant 2-tensor:  A′ = X⁻¹AX⁻ᵀ   A′ = XAXᵀ
  mixed 2-tensor:          A′ = X⁻¹AX     A′ = XAX⁻¹
  contravariant 2-tensor:  A′ = X⁻¹AY⁻ᵀ   A′ = XAYᵀ
  covariant 2-tensor:      A′ = XᵀAY      A′ = X⁻ᵀAY⁻¹
  mixed 2-tensor:          A′ = X⁻¹AY     A′ = XAY⁻¹

d = 3
• bilinear operator B : U × V → W with
  B(λu + λ′u′, v) = λB(u, v) + λ′B(u′, v),
  B(u, λv + λ′v′) = λB(u, v) + λ′B(u, v′)
• bases A = {u1, …, un}, B = {v1, …, vm}, C = {w1, …, wp},
  B(ui, vj) = aij1 w1 + ⋯ + aijp wp
• change-of-basis theorem: if [B]A,B,C = A and [B]A′,B′,C′ = A′ ∈ ℝ^(m×n×p), then
  A′ = (Xᵀ, Yᵀ, Z⁻¹) · A

d = 3
• trilinear functional τ : U × V × W → ℝ with
  τ(λu + λ′u′, v, w) = λτ(u, v, w) + λ′τ(u′, v, w),
  τ(u, λv + λ′v′, w) = λτ(u, v, w) + λ′τ(u, v′, w),
  τ(u, v, λw + λ′w′) = λτ(u, v, w) + λ′τ(u, v, w′)
• bases A = {u1, …, un}, B = {v1, …, vm}, C = {w1, …, wp},
  τ(ui, vj, wk) = aijk
• change-of-basis theorem: if [τ]A,B,C = A and [τ]A′,B′,C′ = A′ ∈ ℝ^(m×n×p), then
  A′ = (Xᵀ, Yᵀ, Zᵀ) · A

extends to arbitrary order
• bilinear operators ⟶ mixed 3-tensors of covariant order 2, contravariant order 1
• trilinear functionals ⟶ covariant 3-tensors
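The multilinear action (Xᵀ, Yᵀ, Zᵀ) · A in the change-of-basis theorem is concisely written with einsum. A minimal sketch, assuming NumPy; `multilinear_mult` is a hypothetical helper name, and the check confirms that τ(u, v, w) is basis-independent when coordinate vectors transform contravariantly:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, p = 3, 4, 5

def multilinear_mult(M1, M2, M3, A):
    # (M1, M2, M3) . A : multiply mode k of the 3-way array A by Mk
    return np.einsum("ia,jb,kc,abc->ijk", M1, M2, M3, A)

A = rng.standard_normal((n, m, p))           # [tau] in the old bases
X = rng.standard_normal((n, n))              # change-of-basis matrices,
Y = rng.standard_normal((m, m))              # assumed invertible
Z = rng.standard_normal((p, p))

A_new = multilinear_mult(X.T, Y.T, Z.T, A)   # A' = (X^T, Y^T, Z^T) . A

# coordinate vectors transform contravariantly: u' = X^{-1} u, etc.
u, v, w = rng.standard_normal(n), rng.standard_normal(m), rng.standard_normal(p)
u2, v2, w2 = np.linalg.solve(X, u), np.linalg.solve(Y, v), np.linalg.solve(Z, w)

old = np.einsum("abc,a,b,c->", A, u, v, w)
new = np.einsum("abc,a,b,c->", A_new, u2, v2, w2)
assert abs(old - new) < 1e-8                 # tau(u, v, w) is basis-independent
```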
• recover all transformation rules in definition ①
  ▸ covariant d-tensor:     A′ = (X1ᵀ, X2ᵀ, …, Xdᵀ) · A
  ▸ contravariant d-tensor: A′ = (X1⁻¹, X2⁻¹, …, Xd⁻¹) · A
  ▸ mixed d-tensor:         A′ = (X1⁻¹, …, Xp⁻¹, Xp+1ᵀ, …, Xdᵀ) · A
• definition ② is the easiest definition of a tensor

adopted in books c. 1980s

issue
• many more multilinear maps than there are types of tensors
• d = 2:
  ▸ linear operators: Φ : U∗ → V, Φ : U → V∗, Φ : U∗ → V∗
  ▸ bilinear functionals: β : U∗ × V → ℝ, β : U × V∗ → ℝ, β : U∗ × V∗ → ℝ
• d = 3:
  ▸ bilinear operators: B : U∗ × V → W, B : U × V∗ → W, …, B : U∗ × V∗ → W∗
  ▸ trilinear functionals: τ : U∗ × V × W → ℝ, τ : U × V∗ × W → ℝ, …, τ : U∗ × V∗ × W∗ → ℝ
  ▸ more complicated maps: Φ1 : U → L(V; W), Φ2 : L(U; V) → W, β1 : U × L(V; W) → ℝ, β2 : L(U; V) × W → ℝ

issue
• possibilities increase exponentially with order d
• there ought to be only as many as there are types of transformation rules:
  covariant 2-tensor:     Φ : U → V∗;  β : U × V → ℝ
  contravariant 2-tensor: Φ : U∗ → V;  β : U∗ × V∗ → ℝ
  mixed 2-tensor:         Φ : U → V;   β : U × V∗ → ℝ;  Φ : U∗ → V∗;  β : U∗ × V → ℝ
• definition ③ accomplishes this without reference to the transformation rules

imperfect fix
• only allow W = ℝ
• a d-tensor of contravariant order p and covariant order d − p is a multilinear functional
  φ : V1∗ × ⋯ × Vp∗ × Vp+1 × ⋯ × Vd → ℝ
• excludes vectors, by far the most common 1-tensors
• excludes linear operators, by far the most common 2-tensors
• excludes bilinear operators, by far the most common 3-tensors
• e.g., instead of talking about v ∈ V, need to talk about linear functionals f : V∗ → ℝ
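The three d-tensor rules above differ only in which factors act by X⁻¹ and which by Xᵀ. A minimal sketch, assuming NumPy; `transform` is a hypothetical helper, shown for a 3-way array:

```python
import numpy as np

rng = np.random.default_rng(3)

def transform(A, Xs, contravariant_order):
    # A' = (X1^{-1}, ..., Xp^{-1}, Xp+1^T, ..., Xd^T) . A  for a mixed d-tensor
    Ms = [np.linalg.inv(X) if k < contravariant_order else X.T
          for k, X in enumerate(Xs)]
    out = A
    for k, M in enumerate(Ms):
        # multiply mode k of the d-way array by M
        out = np.moveaxis(np.tensordot(M, out, axes=(1, k)), 0, k)
    return out

A = rng.standard_normal((2, 3, 4))
Xs = [rng.standard_normal((s, s)) for s in A.shape]   # assumed invertible
A_mixed = transform(A, Xs, contravariant_order=1)     # (X1^{-1}, X2^T, X3^T) . A
A_cov = transform(A, Xs, contravariant_order=0)       # (X1^T, X2^T, X3^T) . A
```

Setting `contravariant_order` to 0, d, or p in between reproduces the covariant, contravariant, and mixed rules respectively.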
multilinear operators are useful
• favorite example: higher derivatives of multivariate functions
• but need a norm on Mᵈ(V1, …, Vd; W)
• Mᵈ(V1, …, Vd; W) is itself a vector space
• if V1, …, Vd and W are endowed with norms, then
  ‖Φ‖_σ := sup over v1, …, vd ≠ 0 of ‖Φ(v1, …, vd)‖ / (‖v1‖ ⋯ ‖vd‖)
  defines a norm on Mᵈ(V1, …, Vd; W)
• slightly abused notation: the same ‖·‖ denotes norms on different spaces

higher-order derivatives
• V, W normed spaces; Ω ⊆ V open
• derivative of f : Ω → W at v ∈ Ω is the linear operator Df(v) : V → W with
  lim as h → 0 of ‖f(v + h) − f(v) − [Df(v)](h)‖ / ‖h‖ = 0
• since Df(v) ∈ L(V; W), apply the same definition to Df : Ω → L(V; W)
• get D²f(v) : V → L(V; W) as D(Df),
  lim as h → 0 of ‖Df(v + h) − Df(v) − [D²f(v)](h)‖ / ‖h‖ = 0
• apply recursively to get derivatives of arbitrary order:
  Df(v) ∈ L(V; W),  D²f(v) ∈ L(V; L(V; W)),
  D³f(v) ∈ L(V; L(V; L(V; W))),  D⁴f(v) ∈ L(V; L(V; L(V; L(V; W))))

higher-order derivatives
• how to avoid nested spaces of linear maps?
• use multilinear maps:
  L(V; Mᵈ⁻¹(V, …, V; W)) = Mᵈ(V, …, V; W)
• if Φ : V → Mᵈ(V, …, V; W) is linear, then [Φ(h)](h1, …, hd) is linear in h for fixed h1, …, hd, and d-linear in h1, …, hd for fixed h
• Dᵈf(v) : V × ⋯ × V → W may be regarded as a multilinear operator
• Taylor's theorem:
  f(v + h) = f(v) + [Df(v)](h) + (1/2)[D²f(v)](h, h) + ⋯ + (1/d!)[Dᵈf(v)](h, …, h) + R(h)
  with remainder ‖R(h)‖/‖h‖ᵈ → 0 as h → 0

why not hypermatrices
• why not choose bases and just treat them as hypermatrices A ∈ ℝ^(n1×⋯×nd)?
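D²f(v) as a bilinear map can be made concrete by taking the Hessian as its matrix and checking the second-order Taylor expansion. A minimal sketch, assuming NumPy; the particular f is an illustrative choice, not from the slides:

```python
import numpy as np

# f : R^2 -> R, f(x, y) = x^2 y + sin(y)
def f(v):
    x, y = v
    return x**2 * y + np.sin(y)

def Df(v):
    # gradient: Df(v) as a linear map R^2 -> R
    x, y = v
    return np.array([2 * x * y, x**2 + np.cos(y)])

def D2f(v):
    # Hessian: D^2 f(v) as a bilinear map R^2 x R^2 -> R
    x, y = v
    H = np.array([[2 * y, 2 * x],
                  [2 * x, -np.sin(y)]])
    return lambda h1, h2: h1 @ H @ h2

v = np.array([0.7, -1.2])
h = np.array([1e-3, -2e-3])

# second-order Taylor: f(v + h) = f(v) + [Df(v)](h) + (1/2)[D^2 f(v)](h, h) + R(h)
taylor2 = f(v) + Df(v) @ h + 0.5 * D2f(v)(h, h)
# remainder is O(||h||^3), so this should hold comfortably
assert abs(f(v + h) - taylor2) < 10 * np.linalg.norm(h)**3
```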
• may not be computable: writing down a hypermatrix given bases is in general #P-hard
• may not be possible: multilinear maps extend to modules, which may not have bases
• may not be useful: even for d = 1, 2, writing down a matrix is unhelpful when the vector spaces have special structures
• may not be meaningful: 'indices' may be continuous

writing down a hypermatrix is #P-hard
• 0 ≤ d1 ≤ d2 ≤ ⋯ ≤ dn, generalized Vandermonde matrix
  [x1^d1 x2^d1 … xn^d1;
   x1^d2 x2^d2 … xn^d2;
   ⋮
   x1^dn x2^dn … xn^dn]
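The generalized Vandermonde matrix itself is easy to form. A minimal sketch, assuming NumPy; the particular exponents and points are illustrative:

```python
import numpy as np

def generalized_vandermonde(x, d):
    # entry (i, j) is x_j ** d_i, with 0 <= d_1 <= ... <= d_n
    x, d = np.asarray(x), np.asarray(d)
    return x[np.newaxis, :] ** d[:, np.newaxis]

x = np.array([1.0, 2.0, 3.0])
d = np.array([0, 1, 3])
V = generalized_vandermonde(x, d)
# rows are x_j^0, x_j^1, x_j^3
```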
