
10. Euclidean Spaces

Most of the spaces used in traditional consumer, producer, and general equilibrium theory will be Euclidean spaces—spaces where Euclid’s geometry rules. At this point, we have to start being a little more careful how we write things. We will start with the space Rn, the space of n-vectors, n-tuples of real numbers. When we are being picky, we write them vertically, as if they are n × 1 matrices.

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix},$$

just as we did when writing linear systems as products, Ax = b. The xi are referred to as the coordinates of the vector.

10.1 Vectors Vertical and Horizontal

Sometimes, especially in text, we will be informal and write vectors horizontally, but this is not strictly correct. I will sometimes attach a symbol to remind you they should be vertical. Truly horizontal “vectors” are called co-vectors or covariant vectors. They differ from vectors in how they transform if you change coordinates. Ordinary vectors are sometimes called contravariant vectors.

We can see this distinction when thinking of commodity vectors and their associated price vectors. Suppose we decide to measure milk in quarts rather than gallons. The quantities of all the different types of milk (skim, 2% milkfat, whole, chocolate, etc.) would all have to be multiplied by 4 to reflect the change in measurement because there are 4 quarts in a gallon. This is not how this measurement change affects prices. No, no, no! Price “vectors” are actually covectors, and if milk costs $2.40 per gallon, that’s $0.60 per quart. We have to divide the prices by 4 rather than multiplying by 4. This is the essence of the distinction between vectors and covectors.

If we were being really picky, which we won’t, the vectors would use superscripts for their coordinates and the covectors would use subscripts. Although that is a useful convention in geometry and physics, it conflicts with other useful conventions in economics. One such is to use superscripts to indicate ownership—which consumer a commodity vector belongs to or which firm is using inputs and producing outputs.

10.2 Vectors

In R2, a vector x = (1, 2)ᵀ looks like this:

Figure 10.2.1: The vector x in R², drawn as a point in the plane (left panel) and as an arrow indicating a direction (right panel).

Although we will mostly treat vectors as points in the plane, as in the left panel of Figure 10.2.1, it is often useful to think of them as indicating a direction, which x does in the right panel of Figure 10.2.1. In mathematics this is sometimes indicated by using vector bundles. A vector bundle indicates the starting point as well as the direction. We will not make explicit use of vector bundles.

10.3 Vector Addition

Algebraically, vector addition is just matrix addition. We add the components.

$$\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix}.$$

When we add vectors on a diagram, we add them nose to tail, placing the starting point of the second vector at the end of the first.


Figure 10.3.1: Two ways to add vectors: Here we add a = (1, 2)ᵀ to b = (2, 1)ᵀ. The upper combination is a + b and the lower b + a. Of course, both end at the same point, (3, 3)ᵀ, because matrix and thus vector addition are commutative. The red vector is the sum (a + b). In fact, we can read off the diagram that a = (1, 2) and b = (2, 1).

10.4 Multiplication

We can also multiply vectors by scalars. We still use the rules of matrix algebra, so

$$\alpha \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \alpha x_1 \\ \alpha x_2 \\ \vdots \\ \alpha x_n \end{pmatrix}.$$

The diagram illustrates this graphically.


Figure 10.4.1: Vector Multiplication: Here a = (−1.5, 1)ᵀ is multiplied by 2. Multiplying by a larger number would extend the line further. Multiplying by a smaller number would shrink toward the origin. Finally, multiplying by a negative number goes in the opposite direction, as illustrated by −a.

10.5 Coordinate Vectors in Rn

The standard vectors in Rn, ek, are defined by ek = (δik), so

$$e_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad e_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \cdots \quad e_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$$

These vectors are also referred to as the canonical basis vectors. We can write any vector x as a sum of basis vectors,

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n = \sum_{i=1}^{n} x_i e_i.$$

The sum can also be written as a matrix product.

$$x = \begin{pmatrix} e_1 & e_2 & \cdots & e_n \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = I_n x.$$

10.6 Linear Transformations

We are now ready to show that any linear transformation T : Rn → Rm can be written in matrix form.

Theorem 10.6.1. Let T : Rn → Rm be a linear transformation. Then there is an m × n matrix A with T = TA.

Proof. We can write x = Σⱼ xj ej. Then T(x) = Σⱼ xj T(ej) for any vector x. The vector T(ej) is in Rm. We denote its ith component by T(ej)i, i = 1, ..., m.

Define the m × n matrix A by setting aij = T(ej)i. Thus A = (T(ej)i).

With this definition of A, T(x)i = Σⱼ aij xj, so T(x) = Ax. This shows that T = TA.

This means that the matrix

$$A = \begin{pmatrix} T(e_1) & T(e_2) & \cdots & T(e_n) \end{pmatrix}$$

represents T.

10.7 Matrix Representations of Linear Transformations

◮ Example 10.7.1: Linear Transformation as Matrix. Suppose T : R2 → R3 is a linear transformation with

$$T(e_1) = \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix} \quad\text{and}\quad T(e_2) = \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}.$$

Then it can be represented by the matrix

$$A = \begin{pmatrix} 1 & -1 \\ 0 & 1 \\ 3 & 2 \end{pmatrix}$$

so that

$$Ax = \begin{pmatrix} x_1 - x_2 \\ x_2 \\ 3x_1 + 2x_2 \end{pmatrix}.$$ ◭

10.8 Row Vectors vs. Column Vectors

I’m insisting on writing vectors as columns. One reason I gave earlier was that we can then write our linear systems with coefficient matrix A as Ax = b. But you might object that you could just take the transpose, obtaining xᵀAᵀ = bᵀ, and redefine the coefficient matrix to be Aᵀ. That may seem slightly unnatural because we would always put the variable on the left, but math is sometimes done that way, particularly in some areas of algebra.

The real issue is that we need to maintain a distinction between the row vectors and column vectors. Suppose we had a linear function f that maps Rn into R. Such a linear function is called a linear functional. As we just saw, that kind of linear function can be represented by a 1 × n matrix

$$A = \begin{pmatrix} f(e_1) & f(e_2) & \cdots & f(e_n) \end{pmatrix}.$$

If we write the vectors as columns, the linear functionals become row vectors. Thus the row (covariant) vector (1, 3, 0, 1) defines a linear functional on (contravariant) vectors in R4 by

$$\begin{pmatrix} 1 & 3 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = x_1 + 3x_2 + x_4.$$

Which one is which is completely arbitrary. But the weight of mathematical tradition, stretching back at least to the 19th century, is to make the regular vectors columns, and the linear functionals rows. It’s implicit in the Ricci calculus that Einstein used to develop the general theory of relativity.

In economics, vectors of goods should be represented as columns, but price vectors are properly represented as rows. Thus each set of prices defines a linear functional on the space of goods. If we purchase two different commodity bundles, the cost of them is the sum of the costs, and if we purchase a multiple of a particular commodity bundle under the same price system, its cost is the same multiple of the original. The cost of a bundle is both additive and multiplicative, so it is a linear functional on the commodity space.

10.9 Vector Spaces: Definition

Both Rn and Cn, the set of complex n-tuples or n-vectors, are examples of vector spaces. But what are vector spaces? They are sets of things called vectors that can be added together and multiplied by scalars (numbers). The addition and multiplication have to obey certain rules.

Vector Space. A vector space (V, +, ·) over a number field F (the set of scalars) is a set of vectors V with two operations: vector addition, which defines x + y ∈ V for all vectors x, y ∈ V; and scalar multiplication, which defines αx ∈ V for any scalar α ∈ F and any vector x ∈ V. Scalar multiplication and vector addition obey the following properties:

1. For all x, y ∈ V, x + y = y + x (addition commutes).
2. For all x, y, z ∈ V, (x + y) + z = x + (y + z) (addition associates).
3. There exists a unique vector 0 ∈ V with x + 0 = 0 + x = x (additive identity).
4. For all x ∈ V, there is a unique vector −x ∈ V with x + (−x) = 0 (additive inverse).
5. For all x ∈ V, 1x = x (multiplicative identity).
6. For all α, β ∈ F and x ∈ V, α(βx) = (αβ)x (scalar multiplication associates).
7. For all α ∈ F and x, y ∈ V, α(x + y) = αx + αy (distributive law I).
8. For all α, β ∈ F and x ∈ V, (α + β)x = αx + βx (distributive law II).
A vector space can be defined over any number field F, such as the rational numbers, the real numbers, or the complex numbers.1 Vector spaces over the real numbers are called real vector spaces and vector spaces over the complex numbers are called complex vector spaces. The vector space Rn is the set of all n-tuples of real numbers, x = (x1, x2, ..., xn)ᵀ. Here vector addition is defined componentwise, with x + y = (x1 + y1, x2 + y2, ..., xn + yn)ᵀ, and scalar multiplication is given by αx = (αx1, αx2, ..., αxn)ᵀ.
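As a quick numerical illustration (a sketch added here, not part of the original notes), here is a spot-check of a few of these axioms for Rn with the componentwise operations just described; the particular vectors and scalars are arbitrary choices.

```python
import numpy as np

# Spot-check several vector space axioms for R^5 with componentwise operations.
rng = np.random.default_rng(0)
x, y, z = rng.normal(size=(3, 5))      # three vectors in R^5
alpha, beta = 2.0, -3.5                # two real scalars

assert np.allclose(x + y, y + x)                          # addition commutes
assert np.allclose((x + y) + z, x + (y + z))              # addition associates
assert np.allclose(alpha * (x + y), alpha*x + alpha*y)    # distributive law I
assert np.allclose((alpha + beta) * x, alpha*x + beta*x)  # distributive law II
assert np.allclose(x + (-x), np.zeros(5))                 # additive inverse
```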

1 There are many other number fields and many other types of vector spaces.

10.10 More Vector Spaces

Vector spaces do not have to be n-tuples of numbers. They can be anything that we can add and multiply by a scalar in a way that obeys the vector space axioms.

◮ Example 10.10.1: Spaces of Continuous Functions. If A is a set, the set of real-valued continuous functions on A is a vector space with the usual addition of functions and multiplication by real numbers: (f + g)(x) = f(x) + g(x) and (αf)(x) = αf(x). Interestingly, the set of complex-valued functions on A can be regarded as either a real or complex vector space. If A is an open set, we can consider the vector space of continuously differentiable, or even k-times continuously differentiable functions. These are denoted Ck(A). The space of continuous functions is denoted C0 or sometimes just C. ◭

Sequence spaces are also commonly used in economics.

◮ Example 10.10.2: A Sequence Space. Another commonly used vector space is s, the space of sequences of real numbers. Elements of s are of the form (x1, x2, x3, ...) where each xi ∈ R. Vectors are added componentwise, and scalar multiplication is defined componentwise. This and similar spaces often appear in optimal growth models and the dynamic general equilibrium models used in macroeconomics. ◭

10.11 Vector Subspaces

We say that W is a vector subspace of V if W ⊂ V is closed under the vector space operations of vector addition and scalar multiplication (that is, x + y ∈ W when x, y ∈ W and αx ∈ W when x ∈ W and α ∈ F).

When W is a vector subspace of V it can also be regarded as a vector space in its own right as it inherits all the vector space properties of V. Thus W = {x ∈ Rn : x1 + 3x2 = 0} is a real vector subspace of Rn and hence a real vector space. Many of our vector spaces will be subspaces of Rn.

Both requirements to be in a subspace can be shown simultaneously.

Theorem 10.11.1. Let V be a vector space over F. If αx + y ∈ W for any x, y ∈ W ⊂ V and any α ∈ F, then W is a vector subspace of V.

Proof. We need only show that any x + y ∈ W and any αx ∈ W for α ∈ F and x, y ∈ W. For the first, set α = 1 in the hypothesis. For the second, we must first show that 0 ∈ W. Set x = y and α = −1 to find that 0 = −x + x ∈ W. Then set y = 0 in the hypothesis.

10.12 Subspaces of Sequence Spaces

◮ Example 10.12.1: Subspaces of Sequence Spaces. Some vector subspaces of s include c, the space of convergent sequences, c0, the space of sequences that converge to 0 (limₙ xₙ = 0), and c00, the space of sequences that are zero except for finitely many terms (given x ∈ c00, there is some N with xₙ = 0 for n ≥ N). It is straightforward to verify these are all subspaces of s.

The vectors

{1, 8, 27, ..., n³, ...} and {1, e, e², ..., eⁿ, ...}

are in s, but not c. The possibility of arbitrary growth rates is why we cannot find a norm for s. The sequences

{2, 3/2, 4/3, ..., 1 + 1/n, ...}, {1, 1/4, 1/9, ..., 1/n², ...}, and {1, 2, 3, 4, 0, ..., 0, ...}

are respectively in c, but not c0; in c0, but not c00; and in c00. These spaces are nested, so

c00 ⊂ c0 ⊂ c ⊂ s.

The set sb of sequences that are bounded, but not necessarily convergent, is another sequence space that sits between c and s. All of the spaces except for s can be normed by ∥x∥∞ = supₙ |xₙ|, where sup denotes the supremum, the least upper bound. ◭

10.13 The Null Space or Kernel of a Linear System

We start by considering a homogeneous linear system defined by an m × n coefficient matrix A, Ax = 0. Recall that TA : Rn → Rm is given by TA(x) = Ax. The solution set of the homogeneous system, {x : TA(x) = 0}, is a vector subspace of Rn, called the null space or kernel of A.2 We denote it by ker A.

Theorem 10.13.1. For any m × n coefficient matrix A, consider the homogeneous system Ax = 0. The set of solutions to this system forms a vector subspace of Rn, ker A.

Proof. Let the solution set be V. Suppose x, y ∈ V and α ∈ R. Then A(αx + y) = αAx + Ay = 0, so any αx + y ∈ V, showing that V ⊂ Rn is a vector subspace of Rn.

Theorem 10.13.1 also gives us information about the solutions to the system Ax = b. If a solution exists, call it x0. Then Ax = b = Ax0. Subtracting, we find A(x − x0) = 0. In other words, if x0 is a solution to Ax = b, then every other solution can be written x0 + x where x ∈ ker A. The solution set is {x0} + ker A.
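A small numerical sketch of this structure (added for illustration; the 2 × 3 system below is made up): any particular solution of Ax = b plus any kernel vector is again a solution.

```python
import numpy as np

# A hypothetical 2x3 system Ax = b, a particular solution x0, and a kernel vector k.
A = np.array([[1., 1., 0.],
              [0., 0., 1.]])
b = np.array([2., 3.])

x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # one particular solution of Ax = b
k  = np.array([1., -1., 0.])                # A @ k = 0, so k is in ker A

assert np.allclose(A @ k, 0)
assert np.allclose(A @ x0, b)
assert np.allclose(A @ (x0 + 5 * k), b)     # x0 plus anything in ker A also solves Ax = b
```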

2 Simon and Blume prefer to call it the nullspace and use the symbol N(A).

10.14 The Range of a Linear Transformation

The kernel is not the only subspace associated with a linear transformation. Another is its range.

Theorem 10.14.1. Let T : Rn → Rm be a linear transformation. Then ran T is a vector subspace of Rm.

Proof. Clearly ran T ⊂ Rm. Let y, y′ ∈ ran T. Then there are x, x′ ∈ Rn with T(x) = y and T(x′) = y′. Let α be a scalar. Then T(αx + x′) = αT(x) + T(x′) = αy + y′, showing that αy + y′ ∈ ran T. This shows that ran T is a subspace of Rm.

In particular, this applies to linear transformations defined by matrices, TA(x) = Ax.

Corollary 10.14.2. Let A be an m × n matrix and define the linear function TA : Rn → Rm by TA(x) = Ax. Then ran TA is a vector subspace of Rm.

10.15 The Geometry of Vector Spaces: Norms

Euclidean space is distinguished by its geometry. Part of geometry is measuring distances between points. In vector spaces this entails measuring vectors to determine their length. This can often be accomplished by a norm. There are three basic properties a norm must have.

Norm. A norm on a real or complex vector space V is a mapping from V to R, denoted ∥x∥, that has three properties:

1. Positive Definite. For all x ∈ V, ∥x∥ ≥ 0 and ∥x∥ = 0 if and only if x = 0.
2. Absolutely Homogeneous of Degree One. For all α ∈ F and x ∈ V, ∥αx∥ = |α| ∥x∥.
3. Triangle Inequality. For all x, y ∈ V, ∥x + y∥ ≤ ∥x∥ + ∥y∥.

Normed Vector Space. A normed vector space (V, +, ·, ∥·∥) is a vector space (V, +, ·) together with a norm ∥·∥ defined on that space. We will usually use the abbreviated notation (V, ∥·∥) for (V, +, ·, ∥·∥).

We measure the distance between two points by the length of the vector between them, ∥x − y∥, which is the same as ∥y − x∥ by absolute homogeneity.

One thing we can do with norms is create unit vectors, vectors with norm one, in any given direction. If x ≠ 0, we define the unit vector u in direction x by u = x/∥x∥. If we compute

$$\|u\| = \left\| \frac{x}{\|x\|} \right\| = \frac{\|x\|}{\|x\|} = 1,$$

by absolute homogeneity. Since it has norm one, u is indeed a unit vector.

Once we define some more norms, Figure 10.17.1 will illustrate all the unit vectors for three different norms.

10.16 ℓpⁿ Spaces

One family of norms on Rn are the ℓp-norms.3 The ℓp norm on Rn is defined for 1 ≤ p < ∞ by

$$\|x\|_p = \left( \sum_{i=1}^{n} |x_i|^p \right)^{1/p},$$

while

$$\|x\|_\infty = \max_i |x_i|.$$

We use the notation ℓpⁿ to denote Rn with the ℓp norm. Thus ℓpⁿ means (Rn, ∥·∥p).

The ℓp norm can also be defined on sequences of real numbers, when ℓp is defined as the set of sequences x = {x1, x2, ...} such that Σᵢ |xi|^p converges. We use the norm

$$\|x\|_p = \left( \sum_{i=1}^{\infty} |x_i|^p \right)^{1/p}.$$

To see that distances change under different norms, consider the distance between (1, 3) and (4, 7). It is 7 in ℓ1², 5 in ℓ2², approximately 4.17 in ℓ5², and 4 in ℓ∞². In contrast, the distances between (1, 3) and (1, 4) are 1 in each of the ℓp norms. The fact that some distances change while others don’t tells us that the geometry itself has changed.
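A short computational check of the distances just quoted (added as an illustration):

```python
import numpy as np

# Reproduce the ℓp distances between (1, 3) and (4, 7) mentioned in the text.
x = np.array([1., 3.])
y = np.array([4., 7.])
d = y - x                                    # (3, 4)

for p in (1, 2, 5):
    print(p, np.sum(np.abs(d)**p)**(1/p))    # 7.0, 5.0, about 4.17
print("inf", np.max(np.abs(d)))              # 4.0
```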

3 Pronounced “little el pee”.

10.17 Shapes in ℓp

We can get a clue about how ℓp geometry changes with p by considering the vectors of length one. Figure 10.17.1 shows all unit vectors x, vectors with ∥x∥ = 1, for three different values of p. From left to right, the norms used are ℓ1, ℓ2, and ℓ∞. Although the distances along the coordinate axes are the same in all cases, points in other directions get closer as p increases. As a result, the sets themselves expand in the off-axis directions as p gets larger.

Figure 10.17.1: The left diagram shows the unit vectors, the vectors of length 1 in the ℓ1 norm. The center diagram shows the unit vectors in the ℓ2 norm. The right diagram shows the unit vectors in the ℓ∞ norm. Although the distances along the axes stay the same, the other vectors get farther out for the same ℓp length as p increases.

The ℓ1 norm is sometimes called the taxicab norm. When streets are laid out on a grid, it gives the distance via street between any two locations (no shortcuts through buildings!). The ℓ2 norm is the Euclidean norm. It measures distance according to Euclidean geometry, and the points at distance one from the origin form a circle.

10.18 More Normed Spaces

Norms can be defined on other vector spaces. Consider the set of bounded real-valued continuous functions on a space X, denoted Cb(X). This space is a vector space when vector addition and scalar multiplication are defined pointwise. That is, (f + g)(x) = f(x) + g(x) and (tf)(x) = tf(x) for all x ∈ X. Because the functions in Cb(X) are bounded, the supremum norm (or sup norm) defined by ∥f∥∞ = supₓ∈X |f(x)| will always be finite. It is easy to show that the sup norm is positive definite, absolutely homogeneous of degree one, and obeys the triangle inequality on Cb(X).

Another function space with a norm is L2(A), the vector space of square integrable functions on A, with norm4

$$\|f\|_2 = \left( \int_A |f(x)|^2 \, dx \right)^{1/2}.$$

It will turn out that this space is also Euclidean in the sense that the Pythagorean identity is true. In fact, it is a generalization of the ℓ2 norm.
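A rough numerical sketch of the two function-space norms just defined, using f(x) = x on A = [0, 1] as an illustrative choice (the sup norm is 1 and the L2 norm is (∫ x² dx)^(1/2) = 1/√3 ≈ 0.577):

```python
import numpy as np

# Approximate the sup norm and L2 norm of f(x) = x on [0, 1] on a fine grid.
x = np.linspace(0.0, 1.0, 10_001)
f = x

sup_norm = np.max(np.abs(f))            # exactly 1 for this f
l2_norm = np.sqrt(np.mean(f**2))        # Riemann-sum approximation of (∫ f^2 dx)^(1/2)

print(sup_norm)   # 1.0
print(l2_norm)    # approximately 0.5774
```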

4 This space is called “big el two” when we need to distinguish it from one of the ℓ2 spaces.

10.19 Lp Spaces

This generalization can be extended to p, 1 ≤ p < ∞, by

$$\|f\|_p = \left( \int_A |f(x)|^p \, dx \right)^{1/p}.$$

◮ Example 10.19.1: Vectors in Lp Spaces. The function f(x) = x^(−1/2) is in L1 when A = [0, 1] because

$$\int_0^1 x^{-1/2} \, dx = 2x^{1/2} \Big|_0^1 = 2,$$

but it is not in L2(0, 1) because

$$\int_0^1 \left( x^{-1/2} \right)^2 dx = \int_0^1 \frac{1}{x} \, dx = \ln x \Big|_0^1 = +\infty.$$ ◭

10.20 ℓpⁿ norms and Pythagoras

The ℓ2 norm is called the Euclidean norm because it measures vectors according to Euclidean geometry. The other ℓp norms are not Euclidean. The geometry is different. We can see this by measuring a right triangle in R2.

Euclidean geometry requires that the Pythagorean identity hold. We will use the right triangle defined by 0 = (0, 0), a = (a, 0), and b = (0, b) with a, b > 0 to check this. The two sides of the right angle are 0–a and 0–b while the hypotenuse is a–b. In all of the ℓp norms the a side has length (|a|^p)^(1/p) = a for p < ∞, and max{0, a} = a for p = ∞. Not surprisingly, the b side has length b. This is illustrated in the diagram.


For the Pythagorean identity to be true, the hypotenuse a − b = (a, −b) must have length √(a² + b²). It works fine in the ℓ2 norm, ∥(a, −b)∥2 = √(a² + b²). For the ℓ∞ norm, we have ∥(a, −b)∥∞ = max{a, b} < √(a² + b²), failing the test.

For p ≠ ∞, ∥(a, −b)∥p = (a^p + b^p)^(1/p), which will not be √(a² + b²) unless p = 2. For example, if a = b = 2, then the terms are 2^((p+1)/p) and 2^(3/2).

Similar calculations apply to all the ℓpⁿ for n ≥ 2 and 1 ≤ p ≤ ∞. We don’t worry about n = 1 because there is no room for right triangles there. That means that ℓpⁿ is not Euclidean when p ≠ 2.

10.21 Inner Product Spaces

The ℓ2 norm is special, and one thing that makes it special is that it is based on an inner product.

Inner Product Space. An inner product space (V, ·) is a real or complex vector space V together with an inner product x·y on V. The inner product is a mapping from V × V to R or C, denoted x·y, which is:

1. (Conjugate) Symmetric. For all x, y ∈ V, $x \cdot y = \overline{y \cdot x}$.5
2. Linear. For all vectors x, y, z and scalars α, β, x·(αy + z) = α(x·y) + (x·z).6
3. Positive Definite. For all x ∈ V, x·x ≥ 0 and x·x = 0 if and only if x = 0.

The inner product is also known as the dot product or scalar product. Various notations are used for the inner product, including x·y, (x, y), ⟨x, y⟩, or ⟨x|y⟩. We will generally use the dot notation, x·y.

When we use a notation such as ⟨x|y⟩ we are writing the inner product as a bilinear form or a sesquilinear form, where ⟨x|y⟩ is separately linear in both x and in y (bilinear) or linear in one and conjugate linear in the other (sesquilinear). In Rn,

$$(\alpha x + y) \cdot z = \alpha\,(x \cdot z) + y \cdot z,$$

but in Cn,

$$(\alpha x + y) \cdot z = \bar{\alpha}\,(x \cdot z) + y \cdot z.$$

The presence of the conjugate is why the inner product is sesquilinear in Cn, not bilinear.

5 The conjugate has no effect when V is a real vector space. There we just have ordinary symmetry.
6 In the complex case, this implies conjugate linearity in the first term. Some authors use the opposite convention with conjugate linearity in the second term.

10.22 Euclidean Inner Product on Rn

On Rn, we define the Euclidean inner product by

$$x \cdot y = \sum_{i=1}^{n} x_i y_i.$$

Another way to write the inner product on Rn is as a matrix product, x·y = xᵀy. In Cn, the transpose must be replaced by the Hermitian conjugate, so x·y = x*y, which can also be written

$$x \cdot y = \sum_{i=1}^{n} \bar{x}_i y_i.$$

10.23 More Inner Products

There are other inner products on Rn. In fact, whenever A is a symmetric positive definite matrix, we can define an inner product by x·y = xᵀAy. Linearity in the first argument is clear. The symmetry of A ensures that the resulting inner product is symmetric. The fact that A is positive definite makes the inner product positive definite. This also works in Cn provided that A is Hermitian.

10.24 Creating a Norm from the Inner Product

Any time we have an inner product x·y on a vector space V, we can define an associated norm by

$$\|x\| = \sqrt{x \cdot x}.$$

If we use the Euclidean inner product, this becomes the Euclidean norm

$$\|x\| = \sqrt{\sum_{i=1}^{n} |x_i|^2}.$$

When we use the Euclidean norm on Rn, the resulting space is called n-dimensional Euclidean space, ℓ2ⁿ.
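A numerical sketch tying this together with section 10.23 (added for illustration; the matrix A below is an arbitrary symmetric positive definite matrix): any such A gives an inner product x·y = xᵀAy, the associated norm √(x·x), and that norm satisfies the Cauchy-Schwartz and triangle inequalities proved in the next sections.

```python
import numpy as np

# Build an arbitrary symmetric positive definite A, the inner product x.T A y,
# and its associated norm, then check Cauchy-Schwartz and the triangle inequality.
rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
A = M.T @ M + 4 * np.eye(4)          # symmetric positive definite

inner = lambda x, y: x @ A @ y
norm = lambda x: np.sqrt(inner(x, x))

for _ in range(1000):
    x, y = rng.normal(size=(2, 4))
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12   # Cauchy-Schwartz
    assert norm(x + y) <= norm(x) + norm(y) + 1e-12        # triangle inequality
```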

10.25 Cauchy-Schwartz Inequality 9/14/21

An inner product and its associated norm are closely related. One aspect of this is the Cauchy-Schwartz inequality, which will help us prove that the associated norm obeys the triangle inequality.

Cauchy-Schwartz Inequality. Let x and y be vectors in an inner product space (V, ·). Then

$$|x \cdot y| \le \|x\|\,\|y\|$$

for all vectors x and y in V. Moreover, if |x·y| = ∥x∥∥y∥ for non-zero x and y, then x and y are proportional.

10.26 Proof of Cauchy-Schwartz Inequality

Cauchy-Schwartz Inequality. Let x and y be vectors in an inner product space (V, ·). Then

$$|x \cdot y| \le \|x\|\,\|y\|$$

for all vectors x and y in V. Moreover, if |x·y| = ∥x∥∥y∥ for non-zero x and y, then x and y are proportional.

Proof. The inequality clearly holds if x = 0 as both sides are then zero. We restrict our attention to the case x ≠ 0. Since we wish to include the complex case, keep in mind that x·y = x*y and $y \cdot x = y^* x = \overline{x \cdot y}$.

$$0 \le \left\| y - \frac{x \cdot y}{\|x\|^2}\, x \right\|^2 \qquad (10.26.1)$$

$$= \left( y - \frac{x \cdot y}{\|x\|^2}\, x \right)^{*} \left( y - \frac{x \cdot y}{\|x\|^2}\, x \right)$$

$$= \|y\|^2 - \frac{(x \cdot y)\,\overline{(x \cdot y)}}{\|x\|^2} - \frac{\overline{(x \cdot y)}\,(x \cdot y)}{\|x\|^2} + \frac{\overline{(x \cdot y)}\,(x \cdot y)}{\|x\|^4}\,\|x\|^2$$

$$= \|y\|^2 - \frac{|x \cdot y|^2}{\|x\|^2}. \qquad (10.26.2)$$

Since ∥·∥ is positive definite,

$$|x \cdot y|^2 \le \|x\|^2\,\|y\|^2,$$

establishing the Cauchy-Schwartz Inequality.

If |x·y| = ∥x∥∥y∥ for non-zero x and y, eq. 10.26.2 implies that the right-hand side of eq. 10.26.1 is zero, meaning that

$$y = \frac{x \cdot y}{\|x\|^2}\, x.$$

But then y is proportional to x. Since both x and y are non-zero, they are proportional to each other.

10.27 The Inner Product defines a Norm

Now consider the norm derived from the inner product, ∥x∥ = (x·x)^(1/2). This is obviously absolutely homogeneous and positive definite. But is it a norm? Does the triangle inequality apply? The Cauchy-Schwartz inequality tells us it does.

Proposition 10.27.1. Let (V, ·) be an inner product space over F = C or F = R and set ∥x∥ = (x·x)^(1/2). Then ∥·∥ obeys:

1. For all α ∈ F and x ∈ V, ∥αx∥ = |α| ∥x∥ (absolute homogeneity of degree one).
2. ∥x∥ ≥ 0 and ∥x∥ = 0 if and only if x = 0 (positive definite).
3. For all x, y ∈ V, ∥x + y∥ ≤ ∥x∥ + ∥y∥ (triangle inequality).

This shows that ∥·∥ is a norm.

Proof. Now (αx)·(αx) = (ᾱα)(x·x) = |α|²∥x∥²; taking the positive square root shows ∥αx∥ = |α| ∥x∥, proving (1).

For (2), ∥x∥ ≥ 0 by definition. If ∥x∥ = 0, then x·x = 0. Since the inner product is positive definite, x = 0. This proves (2).

For (3),

$$\|x + y\|^2 = \|x\|^2 + x \cdot y + y \cdot x + \|y\|^2$$
$$= \|x\|^2 + 2\,\mathrm{Re}\,x \cdot y + \|y\|^2$$
$$\le \|x\|^2 + 2|x \cdot y| + \|y\|^2$$
$$\le \|x\|^2 + 2\|x\|\,\|y\| + \|y\|^2$$
$$= \left( \|x\| + \|y\| \right)^2,$$

where the third line uses the Cauchy-Schwartz inequality, |x·y| ≤ ∥x∥∥y∥. This proves (3).

10.28 Polarization Identities

If you have a norm derived from an inner product on Rn, it is possible to reconstruct the inner product from the norm using the polarization identity

$$x \cdot y = \frac{1}{4} \left( \|x + y\|^2 - \|x - y\|^2 \right).$$

Expanding the right-hand side gives

$$\frac{1}{4} \left( \|x\|^2 + 2x \cdot y + \|y\|^2 - \|x\|^2 + 2x \cdot y - \|y\|^2 \right) = x \cdot y.$$

Although the expression defined by the polarization identity is always symmetric and positive definite, it will fail to be separately linear in x and y unless the norm obeys the parallelogram law

$$2\|x\|^2 + 2\|y\|^2 = \|x + y\|^2 + \|x - y\|^2,$$

which states that the sum of squares of the lengths of the four sides of a parallelogram is equal to the sum of squares of the lengths of the two diagonals. In Euclidean geometry, it follows from the law of cosines. In inner product spaces the parallelogram law follows immediately after expanding the right-hand terms.

In Cn, the polarization identity takes a somewhat more complicated form:

$$x \cdot y = \frac{1}{4} \left( \|x + y\|^2 - \|x - y\|^2 \right) - \frac{i}{4} \left( \|ix - y\|^2 - \|ix + y\|^2 \right).$$

The first term is the real part of x·y, given by

$$\mathrm{Re}\,x \cdot y = \frac{1}{2} \left( x \cdot y + \overline{x \cdot y} \right).$$

The second term is i times the imaginary part of x·y,

$$i\,\mathrm{Im}\,x \cdot y = \frac{1}{2} \left( x \cdot y - \overline{x \cdot y} \right).$$

10.29 Perpendicular Vectors

One important fact about the Euclidean inner product is that perpendicular vectors have a dot product of zero.

Theorem 10.29.1. Let x and y be non-zero vectors in Euclidean Rn. Then x is perpendicular to y if and only if x·y = 0.

Proof. If case (⇐): Suppose x·y = 0. Then

$$\|y - x\|^2 = \|x\|^2 - 2\,y \cdot x + \|y\|^2 = \|x\|^2 + \|y\|^2,$$

which is the Pythagorean identity. That means that x, y, and y − x form a right triangle. As y − x is the hypotenuse, x and y are perpendicular.

Only if case (⇒): Suppose x is perpendicular to y. Consider the right triangle with sides x, y, and y − x. By the Pythagorean identity

$$\|x\|^2 + \|y\|^2 = \|y - x\|^2 = \|y\|^2 - 2\,y \cdot x + \|x\|^2.$$

It follows immediately that x·y = 0.

10.30 Orthogonal and Orthonormal Vectors

The standard basis vectors are perpendicular unit vectors because

$$e_i \cdot e_j = e_j \cdot e_i = \sum_k \delta_{ki}\,\delta_{kj} = \delta_{ij}.$$

So ei·ej = 0 if i ≠ j. The basis vectors are perpendicular to one another. A set of vectors that are mutually perpendicular are referred to as orthogonal vectors.

Also, ei·ei = ∥ei∥² = 1. The basis vectors are also unit vectors. A set of unit vectors that are also orthogonal are called orthonormal vectors.

The standard (or canonical) basis makes it easy to write the coordinates of x in terms of the inner product: xi = x·ei = ei·x.

◮ Example 10.30.1: Orthonormal Basis for R3. Orthonormal bases don’t have to look anything like the standard basis vectors.

$$b_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}, \quad b_2 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad b_3 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}.$$

Then {b1, b2, b3} is an orthonormal basis for R3. Easy calculations show the vectors are perpendicular to each other, and that they all have norm 1. ◭

10.31 Angles and Inner Products I

Theorem 10.31.1. If x and y are non-zero vectors in Euclidean Rn, and θ is the angle between them, then

$$\cos\theta = \frac{x \cdot y}{\|x\|\,\|y\|}, \qquad (10.31.3)$$

where x·y is the Euclidean inner product.

Proof. We will write y = w + z with w a multiple of x and z perpendicular to x. The diagrams below show how that works when the angle θ between x and y is acute (Figure 10.31.2) and obtuse (Figure 10.31.3). Notice that we always measure the angle the short way round (or π for 180°).

Figure 10.31.2: The acute case is shown in the left-hand diagram. In the right-hand diagram y = w + z, where z is perpendicular to x and w is parallel to x. Here ∥w∥ = ∥y∥ |cos θ|. As cos θ > 0, w points in the same direction as x.

Figure 10.31.3: The obtuse case is shown in the left-hand diagram. In the right-hand diagram y = w + z, where z is perpendicular to x and w is parallel to x. Here ∥w∥ = ∥y∥ |cos θ|. As cos θ < 0, w points in the opposite direction as x.

10.32 Angles and Inner Products II

Proof continues. Now y = w + z where w is parallel to x and z is perpendicular to x. Euclidean geometry tells us that w has signed length ∥y∥ cos θ, we multiply that by the unit vector in the x direction to find

$$w = \|y\| \cos\theta \, \frac{x}{\|x\|}.$$

Of course, cos θ is positive when the angle is acute and negative when it is obtuse. As a result, the formula works for any θ, 0 ≤ θ ≤ π. Since z is perpendicular to x, x·z = 0 by Theorem 10.29.1. Then

$$x \cdot y = x \cdot w + x \cdot z$$
$$= x \cdot w$$
$$= x \cdot \left( \|y\| \cos\theta \, \frac{x}{\|x\|} \right)$$
$$= \|y\| \cos\theta \, \frac{\|x\|^2}{\|x\|}$$
$$= \|x\|\,\|y\| \cos\theta.$$

Divide by ∥x∥∥y∥ to obtain equation (10.31.3).

In any inner product space, we can use equation (10.31.3) to define the angle between any two non-zero vectors.

$$\theta = \arccos \frac{x \cdot y}{\|x\|\,\|y\|}.$$

10.33 Metric Spaces

A metric is a way of measuring the distance between two points that is more general than a norm. There are several basic criteria it must satisfy.

Metric. Given a set X, a metric d on X is a mapping from X × X into R+ that satisfies:

1. Symmetry. d(x, y) = d(y, x).
2. Positive Definite. d(x, y) ≥ 0 and d(x, y) = 0 if and only if x = y.
3. Triangle Inequality. For all x, y, z ∈ X, d(x, z) ≤ d(x, y) + d(y, z).

Compared to a norm, we have lost the absolute homogeneity. The geometry can change with distance. A metric space is a set with a metric on it.

Metric Space. A metric space (X, d) is a set X together with a metric d defined on X.

$$d(x, z) = \|x - z\| \le \|x - y\| + \|y - z\| = d(x, y) + d(y, z).$$

10.35 Discrete Metric

◮ Example 10.35.1: Discrete Metric. A metric that does not require that X be a normed vector space, or a vector space, or that there be any structure at all on the space X, is the discrete metric. It is defined by

$$d(x, y) = \begin{cases} 1 & \text{if } x \ne y \\ 0 & \text{if } x = y. \end{cases}$$

The discrete metric is defined on every set X. It is positive definite and symmetric. It also obeys the triangle inequality because the left-hand side is either 0 or 1. If it is one, either d(x, y) = 1 or d(y, z) = 1, or both, so the right-hand side is either 1 or 2, satisfying the triangle inequality. ◭

Unless otherwise noted, we will use the Euclidean norm on Rn and its subsets. If there is any ambiguity about which metric to use with a space, it should be specified. Metrics will be important later on when we examine limits, open and closed sets, and continuity (Chapters 12, 29, and 13).

10.36 Bounded Metrics

Unlike a norm, a metric may be bounded. The following metric on the real line is bounded by one:

$$d(x, y) = \frac{|x - y|}{1 + |x - y|} < 1.$$

That d is symmetric and positive definite is pretty obvious. It takes a little work to show the triangle inequality, but it is based on the fact that if a ≤ b + c, for a, b, c ≥ 0, then

$$\frac{a}{1 + a} \le \frac{b}{1 + b} + \frac{c}{1 + c}.$$

Here’s how it works:

$$a \le b + c$$
$$a \le b + c + 2bc + abc$$
$$a + ab + ac + abc \le b + ab + bc + abc + c + ac + bc + abc$$
$$a(1 + b)(1 + c) \le b(1 + a)(1 + c) + c(1 + a)(1 + b)$$
$$\frac{a}{1 + a} \le \frac{b}{1 + b} + \frac{c}{1 + c}.$$

We divided by (1 + a)(1 + b)(1 + c) in the last line.7 Finish by setting a = |x − z|, b = |x − y|, and c = |y − z| to obtain the triangle inequality.
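A quick numerical spot-check of the triangle inequality for this bounded metric (added for illustration; the sample points are random):

```python
import numpy as np

# Check d(x, y) = |x - y| / (1 + |x - y|) satisfies the triangle inequality on R.
d = lambda x, y: abs(x - y) / (1 + abs(x - y))

rng = np.random.default_rng(2)
for _ in range(10_000):
    x, y, z = rng.uniform(-100, 100, size=3)
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12
```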

7 If you’re wondering how anyone came up with such a calculation, there’s a trick. Work backwards from the end.

10.37 A Metric for the Sequence Space

Although it is not possible to define a norm on the sequence space s, we can define a metric on it by

$$d(x, y) = \sum_{i=1}^{\infty} \frac{1}{2^i}\, \frac{|x_i - y_i|}{1 + |x_i - y_i|}.$$

Since each term is no more than 2⁻ⁱ, the sum converges uniformly to a number less than one. Although the metric is bounded, this doesn’t really translate into bounds on the xi. The following diagram applies this to R², with the same weighting of 1/2ⁱ. This means

$$d(x, y) = \frac{1}{2}\, \frac{|x_1 - y_1|}{1 + |x_1 - y_1|} + \frac{1}{4}\, \frac{|x_2 - y_2|}{1 + |x_2 - y_2|}.$$

Figure 10.37.1: Although the sequence metric is bounded, that cannot be said about the points at distance 1/4 from zero. The diagram illustrates points x with d(x, 0) = 1/4 in R². The entire vertical axis has d(x, 0) < 1/4, with d((0, x2), 0) → 1/4 as x2 → ±∞. The geometry is obviously quite different from any of the ℓp, some of which were illustrated in Figure 10.17.1.

As noted in Figure 10.37.1, the entire vertical axis ends up being distance less than 1/4 from the origin. Here’s the calculation:

$$d\bigl((0, 0), (0, y)\bigr) = \frac{1}{4}\, \frac{|y|}{1 + |y|} < \frac{1}{4}.$$

As a result, when applied to the sequence space, the set

{x ∈ s : d(0, x) < r}

is unbounded. Even for r < 1 it will typically include the vertical axes for large values of i. When r ≥ 1, it is all of s.

10.38 Lines in Euclidean Space: Slope-Intercept Form

Now that we have measurement of distances and angles under control, via the inner product (angles and length), ℓ2ⁿ norm (length), and associated metric (distance function), we turn our attention to other aspects of the geometry of Rn. Until perpendicular angles become involved, this will apply to Rn in general, not just Euclidean Rn, ℓ2ⁿ.

We start with lines in R2. As you well know, there’s more than one way to write the equation of a line. We will start with the slope-intercept form, where y = mx + b. The coordinates are (x, y), the slope is m and b is the vertical intercept. Writing the equation in terms of coordinates, we have

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ mx + b \end{pmatrix} = x \begin{pmatrix} 1 \\ m \end{pmatrix} + \begin{pmatrix} 0 \\ b \end{pmatrix}.$$

We can think of this as a one-parameter family of coordinates. So that we can remove the link between a particular coordinate system and the equation for the line, we rewrite this in terms of a parameter t ∈ R:

$$x(t) = \begin{pmatrix} 0 \\ b \end{pmatrix} + t \begin{pmatrix} 1 \\ m \end{pmatrix}. \qquad (10.38.4)$$

The line is specified by a point on the line, (0, b)ᵀ, and a direction (1, m)ᵀ.

10.39 Lines in Euclidean Space: Parametric Form

We can easily generalize the form in equation (10.38.4) to Rn. We can write a line in Rn as the points of the form

$$x(t) = x_0 + t x_1 \qquad (10.39.5)$$

where x0 is a point on the line, and x1 is the direction of the line. This is the parametric form of a line. Coordinates have been eliminated from the definition. When we need them, we can write the equation using any coordinates we wish. If L is the line, we can now write

$$L = \{\, x(t) : x(t) = x_0 + t x_1,\ t \in R \,\}.$$

One advantage of representing the line this way is that we can write equations for vertical lines. In R2, just set x1 = (0, 1)ᵀ (or even (0, −12)ᵀ) to get a vertical line through x0. This is one advantage of a coordinate-free definition. We are not tied to writing y as a function of x; x can be a function of y, or both can be functions of some other variable, such as the parameter t.

Before moving on, let’s consider lines through the origin, where 0 ∈ L. These can be written in a special form.

Theorem 10.39.1. A line through the origin, L, can be written L = {x : x = tx1, t ∈ R} if and only if 0 ∈ L.

Proof. If case (⇐): Since 0 ∈ L, there is a t0 with x0 + t0x1 = 0. Then x0 + tx1 = (t − t0)x1 for any t ∈ R. Since t′ = t − t0 can be any real number, L = {x : x = tx1, t ∈ R}.

Only if case (⇒): If the line L has the specified form, set t = 0 to find 0 ∈ L.

10.40 Perpendiculars, Lines, and Hyperplanes

We return to the equation y = mx + b, this time restricting our attention to 2-dimensional Euclidean space. We previously converted this to an equation involving the direction (1, m). Suppose we think of the line as being defined by its perpendicular direction. Since the line runs in the direction (1, m), we need (x1, x2) with 0 = (x1, x2)·(1, m) = x1 + mx2. One such vector is (x1, x2) = (−m, 1) (any scalar multiple would do).

Now rewrite y = mx + b as y − mx = b, or (−m, 1)·(x, y) = b. In ℓ2ⁿ, we can generalize this to a·x = b for a ≠ 0. Define H(a, b) = {x : a·x = b}. Consider the equation defining H,

$$\sum_{i=1}^{n} a_i x_i = b.$$

Since at least one ai ≠ 0, the coefficient matrix of this linear system has rank one, so there are (n − 1) free variables. If n = 2, there is one free variable and we have a line.

If n = 3, there are two free variables, and the equation a1x1 + a2x2 + a3x3 = b defines a plane. When n = 4, we have a three-dimensional surface in 4-space. We refer to H(a, b) as a hyperplane. It has the most free variables possible without including the whole space Rn.

10.41 Hyperplanes and Half-Spaces

The hyperplane H(a, b) cuts Rn into two parts whose intersection is H(a, b): H⁺(a, b) = {x ∈ Rn : a·x ≥ b} and H⁻(a, b) = {x ∈ Rn : a·x ≤ b}. These two sets are referred to as closed half-spaces. The term closed means that the boundary, H(a, b) itself, is included (we will formalize terms such as closed and boundary in Chapter 12).

◮ Example 10.41.1: Hyperplane in ℓ2². We examine H(e1, 2) and its two closed half-spaces.

Figure 10.41.2: Here the heavy line/hyperplane H(e1, 2) separates R² into two half-spaces, H⁺(e1, 2) right of the hyperplane, and H⁻(e1, 2) left of the hyperplane. ◭

10.42 Hyperplanes, Half-Spaces, and the Budget Set

One place hyperplanes and half-spaces appear in economics is in the budget set.

◮ Example 10.42.1: The Budget Set. The budget set is defined by

$$B(p, m) = \{\, x \in R^n_+ : p \cdot x \le m \,\}$$

for p ≥ 0 and m ≥ 0. Here Rn₊ = {x ∈ Rn : xi ≥ 0 for all i = 1, ..., n}. We can think of this as an intersection of half-spaces. There is the half-space H⁻(p, m), and there are also the half-spaces H⁺(ei, 0). Thus

$$B(p, m) = H^-(p, m) \cap \left( \bigcap_{i=1}^{n} H^+(e_i, 0) \right).$$
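A minimal sketch of this half-space description in code (added for illustration; the prices and income below are made-up numbers):

```python
import numpy as np

p = np.array([2.0, 5.0])     # price covector (hypothetical)
m = 20.0                     # income (hypothetical)

def in_budget_set(x, p, m):
    # x is in B(p, m) iff it lies in H^-(p, m) and in every H^+(e_i, 0),
    # i.e. p.x <= m and every coordinate is non-negative.
    return p @ x <= m and np.all(x >= 0)

print(in_budget_set(np.array([5.0, 2.0]), p, m))    # True:  2*5 + 5*2 = 20 <= 20
print(in_budget_set(np.array([9.0, 1.0]), p, m))    # False: 2*9 + 5*1 = 23 > 20
print(in_budget_set(np.array([-1.0, 1.0]), p, m))   # False: negative consumption
```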


Figure 10.42.2: The figure illustrates a budget set. The price vector is perpendicular to the budget line. The intercepts, where all income is spent either on good one or good two, are m/p1 and m/p2, respectively.

◭

10.43 Hyperplanes, Half-Spaces, and the Probability Simplex

A second example of hyperplanes and half-spaces in economics is the probability simplex.

◮ Example 10.43.1: Probability Simplex. The probability simplex ∆ is defined by

$$\Delta = \{\, p \in R^n_+ : p \cdot e = 1 \,\}$$

where e = Σᵢ ei = (1, ..., 1)ᵀ. Thus

$$\Delta = \Bigl\{\, p \in R^n_+ : \sum_{i=1}^{n} p_i = 1 \,\Bigr\}.$$

The idea is that i = 1, ..., n are mutually exclusive possible events and that pi is the probability of each event. Since 0 ≤ pi ≤ 1, we can think of the pi as percentages. The fact that Σᵢ pi = 1 tells us that the probabilities add up to 100%. One of the events must happen. ◭
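A short membership test for the simplex, mirroring the two defining conditions (added as an illustration):

```python
import numpy as np

# p is in the probability simplex iff p >= 0 (componentwise) and p . e = 1.
def in_simplex(p, tol=1e-12):
    return np.all(p >= -tol) and abs(np.sum(p) - 1.0) <= tol

print(in_simplex(np.array([0.2, 0.5, 0.3])))    # True
print(in_simplex(np.array([0.6, 0.6, -0.2])))   # False: a negative "probability"
print(in_simplex(np.array([0.4, 0.4, 0.4])))    # False: sums to 1.2
```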

Figure 10.43.2: The probability simplex ∆ is the yellow triangle in R³ with vertices (1, 0, 0), (0, 1, 0), and (0, 0, 1).

11. Linear Independence

This section focuses on bases for vector spaces. We’ve seen one basis already, the standard basis for Rn. It allowed us to write any linear transformation from Rn to Rm in terms of a matrix. Bases allow us to define the dimension of a vector space. In the context of solutions to linear systems, the dimension is the number of free variables. It tells us what the solution set looks like. Finally, the judicious choice of a basis can simplify linear systems, allowing easier interpretation of results. In a dynamic context, this allows us to better understand both short and long-run dynamics of the system.

Let L be a line through the origin. We say L = {x : x = tx1} is the line generated by x1, or the line spanned by x1. It’s the set of all scalar multiples of x1. What if we have more than one generator? What do we get? Let’s try it. Let x1,..., xk be vectors in a vector space V. A sum of the form

$$t_1 x_1 + t_2 x_2 + \cdots + t_k x_k = \sum_{j=1}^{k} t_j x_j$$

for tj ∈ R is called a linear combination of x1, ..., xk.
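Computationally, a linear combination is just a matrix–vector product, as the next section spells out. Here is a minimal sketch (the vectors and weights below are made up):

```python
import numpy as np

# The linear combination t1*x1 + t2*x2 equals X @ t, where X has x1, x2 as columns.
x1 = np.array([1., 1., 0.])
x2 = np.array([0., 1., 1.])
X = np.column_stack([x1, x2])       # 3 x 2 matrix with columns x1, x2
t = np.array([2., -1.])

assert np.allclose(X @ t, 2*x1 - 1*x2)
print(X @ t)                        # [ 2.  1. -1.]
```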

We define the span of a set {x1, ..., xk}, L[x1, ..., xk], as the set of linear combinations of x1, ..., xk. In other words,

$$L[x_1, \ldots, x_k] = \Bigl\{\, x : x = \sum_{j=1}^{k} t_j x_j \ \text{for some } t_j \in R \,\Bigr\}.$$

Theorem 11.2.1. Suppose x1, ..., xk are vectors in V and x, y ∈ L[x1, ..., xk]. Then for α ∈ F, αx + y ∈ L[x1, ..., xk], so the span is a vector subspace of V.

Proof. In this case there are si, ti ∈ F with x = Σᵢ si xi and y = Σᵢ ti xi. Then αx + y = Σᵢ (αsi + ti) xi is also in L[x1, ..., xk].

When V = Rn, we can write the span using a matrix. Form an n × k matrix X from the vectors x1, ..., xk ∈ Rn by taking xj as the jth column of X. The linear combinations of the x1, ..., xk can be written

$$Xt = \sum_{j=1}^{k} t_j x_j = t_1 \begin{pmatrix} x_{11} \\ x_{21} \\ \vdots \\ x_{n1} \end{pmatrix} + \cdots + t_k \begin{pmatrix} x_{1k} \\ x_{2k} \\ \vdots \\ x_{nk} \end{pmatrix}.$$

Then L[x1, ..., xk] = {Xt : t ∈ Rk}. When writing x this way, we can think of the tj as coordinates of x in the coordinate system X = (x1, ..., xk).

11.3 Spanning Examples

◮ Example 11.3.1: Standard Basis Vectors. If k = n and xj = ej, then the matrix formed from the standard basis vectors ej is the identity matrix and the coordinates of x are Ix = x, meaning that the jth coordinate is just xj. ◭

Spans need not resemble the standard basis vectors.

◮ Example 11.3.2: Span of Vectors. Consider the case of three vectors in R4 given by the columns of

$$X = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix}.$$

Then

$$L[x_1, x_2, x_3] = \left\{ \begin{pmatrix} t_1 + t_2 \\ t_2 + t_3 \\ t_1 + t_2 + t_3 \\ t_1 \end{pmatrix} : t_j \in R \right\}.$$

Since the rank of X is three, there are vectors that cannot be written x = Xt. These vectors are not in the span, showing that L[x1, x2, x3] is a proper subspace of R4. ◭

11.4 When are Linear Combinations Unique? 9/16/21

Linear combinations allow us to write vectors in terms of a particular set of vectors. It lets us set up a coordinate system. Does a vector have a single set of coordinates? Or are there multiple ways to write it in terms of our set of vectors?

Let X be an n × k matrix whose columns define a coordinate system and suppose x = Xt and x = Xt′, so t and t′ are both coordinates for x. When can we conclude that t = t′? Alternately, when can we conclude that a vector has only one set of coordinates?

By subtracting, we find 0 = X(t − t′), so the question is really whether this homogeneous linear system has a unique solution. By Corollary 7.28.1, this will have multiple solutions if and only if k > rank X, which is equivalent to saying there are free variables. This is connected to the idea of linear dependence.

Linear Dependence. Non-zero vectors x1, ..., xk are linearly dependent if there are t1, ..., tk, not all zero, with t1x1 + ⋯ + tkxk = 0.

In other words, the vectors are linearly dependent if and only if Xt = 0 has a non-zero solution t.

Theorem 11.5.1. Suppose x1, ..., xk are linearly dependent vectors. Then there is h so that

$$x_h = -\frac{1}{t_h} \sum_{j \ne h} t_j x_j.$$

Proof. By linear dependence, we can find t1, ..., tk, not all zero, with t1x1 + ⋯ + tkxk = 0. Take h with th ≠ 0. Then

$$t_h x_h = -\sum_{j \ne h} t_j x_j,$$

implying that xh is a linear combination of the other xj’s. Dividing by th, we obtain

$$x_h = -\frac{1}{t_h} \sum_{j \ne h} t_j x_j.$$

11.6 Examples of Linear Dependence

The vectors

$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 6 \\ 6 \\ 0 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 0 \\ 0 \\ 7 \end{pmatrix}$$

are linearly dependent as 7x1 − (7/6)x2 − x3 = 0.

Another set of linearly dependent vectors is

$$x_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad x_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad x_3 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

Here

$$x_1 - \frac{1}{\sqrt{2}} \left( x_2 + x_3 \right) = 0.$$

11.7 Linear Dependence and Independence

Linear Independence. We call non-zero vectors x1,..., xk linearly independent if they are not linearly dependent.

Equivalently, a set of non-zero vectors X = {x1, ..., xk} is linearly independent if t1x1 + ⋯ + tkxk = 0 implies t1 = t2 = ⋯ = tk = 0. Linear independence implies there is at most one vector t with x = Xt, where X is the matrix formed by setting the jth column of X equal to xj ∈ X. The vectors

$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}$$

are linearly independent. Suppose

$$Xt = t_1 x_1 + t_2 x_2 + t_3 x_3 = \begin{pmatrix} t_1 + t_3 \\ t_1 + t_2 \\ t_2 + t_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Then t1 = −t3, t1 = −t2, and t2 = −t3. Combining these, we find t = 0. Since the only linear combination of x1, x2, and x3 that is zero is the zero linear combination, the vectors are linearly independent.

11.8 Too Many Vectors Imply Linear Dependence

If there are too many vectors, they must be linearly dependent.

Theorem 11.8.1. Suppose x1, ..., xk are non-zero vectors in Rn with k > n. Then x1, ..., xk are linearly dependent.

Proof. Consider the equation Xt = 0. Since there are more variables than equations, there is at least one free variable. It follows that Xt = 0 has infinitely many solutions, establishing linear dependence. Alternatively, we could quote Corollary 7.25.1.

◮ Example 11.8.2: More than n Vectors are Linearly Dependent in Rn. For example, suppose that in R3, we have

$$x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad \text{and} \quad x_4 = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}.$$

These vectors are linearly dependent because there are too many of them. In fact, x1 is a linear combination of the others: x1 = (1/2)(x2 + x3 + x4). ◭

11.9 Spanning Sets

A second issue concerning coordinate systems is whether our standard set of vectors is big enough to encompass all possible vectors as linear combinations. If so, every vector will have coordinates in our system. If not, there will be vectors outside the coordinate system.

Span. A set of non-zero vectors X = {x1, ..., xk} ⊂ V spans a vector space V if t1x1 + ⋯ + tkxk = x has a solution for every x ∈ V.

Equivalently, x1, ..., xk span V if every vector in V is a linear combination of x1, ..., xk. If the vectors we are using to build a coordinate system span V, then every vector can be written using our coordinate system.

11.10 Spanning Sets in Rn

Any set that spans Rn must contain at least n vectors.

Theorem 11.10.1. If X = {x1, ..., xk} is a set of non-zero vectors that spans Rn, then k ≥ n.

Proof. If X spans Rn, construct X as above. Corollary 7.29.1 tells us that rank X = n, which implies k ≥ n.

More generally, if V is a vector subspace of Rn, x1, ..., xk span V if Xt = x has a solution for every x ∈ V. The set

$$x_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad x_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad x_3 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

spans R2. To see it, suppose

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = t_1 x_1 + t_2 x_2 + t_3 x_3 = \begin{pmatrix} t_1 + t_2 + t_3 \\ t_1 - t_2 \end{pmatrix}.$$

This system has infinitely many solutions.

$$t = \begin{pmatrix} x_2 \\ 0 \\ x_1 - x_2 \end{pmatrix} \quad\text{and}\quad t' = \begin{pmatrix} x_2 + 1 \\ 1 \\ x_1 - x_2 - 2 \end{pmatrix}$$

are two of them.

Larger sets that span will involve some redundancy. Theorem 11.8.1 says they will be linearly dependent, so by Theorem 11.5.1 at least one can be written as a linear combination of the others. So every time it appears in a linear combination, it can be replaced. It is redundant. It is not needed to span the set.

11.11 Basis of a Vector Space

This brings us to the concept of a basis. A set that both spans and is linearly independent.

Basis. A set of non-zero vectors {x1, ..., xk} ⊂ V are a basis for a vector space V if

1. x1, ..., xk are linearly independent.
2. x1, ..., xk span V.

Bases are ideal for building coordinate systems. They are neither too big nor too small. A basis is just right. We can write any vector as a linear combination of the basis vectors, and there is only one way to do it, only one set of coordinates for each vector.

Theorem 11.11.1. Every basis for Rn has exactly n elements.

Proof. Suppose x1, ..., xk is a basis for Rn. By Theorem 11.10.1, k ≥ n, and by Theorem 11.8.1, k ≤ n. Thus k = n.

11.12 Creating a Vector Space

We can make a basis for a vector space out of anything.

◮ Example 11.12.1: Free Vector Space. Although our vector spaces usually have meaningful scalar products, it is not necessary. Given a set B, we define the free vector space over B as the set of formal linear combinations of elements of B. Here “formal” means that we don’t try to interpret what x = 1.5 red + 2.4 Einstein actually means. Sometimes, the scalar multiples are written in the form (α, red) to emphasize this. Moreover, we don’t need to know what addition means either. We just do calculations concerning vectors according to the rules of vector arithmetic. For example, if B = {red, water, Einstein}, then V = {x1 red + x2 water + x3 Einstein : xi ∈ R}. ◭

We can even make a basis out of nothing, or more precisely the empty set.

◮ Example 11.12.2: Free Vector Space Using the Empty Set. Let

$$B' = \bigl\{\, \varnothing,\ \{\varnothing\},\ \{\{\varnothing\}\},\ \{\{\{\varnothing\}\}\} \,\bigr\}$$

and form the free vector space over B′. ◭

11.13 Bases and Independent Sets

Theorem 11.13.1. Let B = {b1, ..., bn} be a basis for a vector space V. Suppose S ⊂ V has m > n elements. Then the vectors in S are linearly dependent.

Proof. Let S be as described. We can write S = {x1, ..., xm}. Since B is a basis for V, and S ⊂ V, we can write each xi as a linear combination of the basis vectors B. That means there are aij, for i = 1, ..., m and j = 1, ..., n, with

$$x_i = \sum_{j=1}^{n} a_{ij} b_j.$$

To examine linear independence of the xi, we consider the equation

$$\sum_{i=1}^{m} t_i x_i = 0.$$

We will show it has non-zero solutions. We start by rewriting it:

$$0 = \sum_{i=1}^{m} t_i \left( \sum_{j=1}^{n} a_{ij} b_j \right) = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} t_i a_{ij} \right) b_j.$$

As the bj are linearly independent, this implies their coefficients are zero. That means

$$\sum_{i=1}^{m} t_i a_{i1} = 0, \quad \sum_{i=1}^{m} t_i a_{i2} = 0, \quad \ldots, \quad \sum_{i=1}^{m} t_i a_{in} = 0.$$

We have n equations in m unknowns, which we can write in matrix form as Aᵀt = 0. This homogeneous system not only has a solution, but must have infinitely many solutions because there are more unknowns (m) than equations (n). See Corollary 7.28.1. It follows that there are t1, ..., tm, not all zero, with

$$\sum_{i=1}^{m} t_i x_i = 0.$$

In other words, the {xi} must be linearly dependent.

11.14 Testing for a Basis

We can use our various results on solving equations to construct a test to see if {b1, ..., bn} form a basis for Rn. Item (4) of the theorem is the test.

Theorem 11.14.1. Let {b1, ..., bn} be a collection of vectors in Rn. Form the n × n matrix B whose columns are the bj. Then the following are equivalent.

1. b1, ..., bn are linearly independent.
2. b1, ..., bn span Rn.
3. b1, ..., bn form a basis for Rn.
4. det B is non-zero.

Proof. (1) implies (2). Linear independence means Bx = 0 has at most one solution, so rank B = #cols = n by Corollary 7.29.1. As B is n × n, this is also the number of rows, so Bx = y always has a solution by Corollary 7.30.2, showing that the vectors span.

(2) implies (3). We do this by showing (2) implies (1). Just use the same arguments in the opposite order. Then the vectors {b1, ..., bn} span and are linearly independent, so they are a basis.

(3) clearly implies (1) and (2). So (1), (2), and (3) are equivalent.

(1)-(3) are equivalent to (4). As we saw above, (1), (2), and (3) are equivalent to rank B = n = #cols = #rows, which is equivalent to B being non-singular (Corollary 7.31.1). Finally, det B is non-zero if and only if B is non-singular, completing the proof.

11.15 The Dimension of a Vector Space

A consequence of Theorem 11.13.1 is that every basis of a vector space must be the same size, provided the size is finite. More precisely:

Basis Theorem. Suppose a vector space V has a basis B with n elements where n is finite. Then every other basis of that vector space must also have n elements.

Proof. Any other basis must be a linearly independent set, so by Theorem 11.13.1, it cannot have more than n elements. If there was a basis with fewer than n elements, we could apply Theorem 11.13.1 to determine that B is not a linearly independent set, and so not a basis. This contradicts our hypothesis, so it is impossible. We conclude that any basis has exactly n elements.

The Basis Theorem lets us define the dimension of a vector space, at least when the dimension is finite.1 Suppose a vector space V has a finite basis B. The Basis Theorem tells us that any basis for V will have the same number of elements as B.

Dimension. For vector spaces with a finite basis, we define the dimension of V as the number of elements of that basis. By the Basis Theorem, the dimension does not depend on which basis we use. We denote the dimension of V by dim V.
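A quick sketch of the determinant test in item (4) (added for illustration; the vectors below are made up):

```python
import numpy as np

# Stack the candidate vectors as the columns of B and check whether det B != 0.
b1 = np.array([1., 1., 0.])
b2 = np.array([0., 1., 1.])
b3 = np.array([1., 0., 1.])
B = np.column_stack([b1, b2, b3])

print(np.linalg.det(B))          # 2.0, non-zero, so {b1, b2, b3} is a basis for R^3
print(np.linalg.matrix_rank(B))  # 3

# A failing case: the third column is a linear combination of the first two.
B_bad = np.column_stack([b1, b2, b1 + b2])
print(np.linalg.det(B_bad))      # 0.0 (up to rounding): not a basis
```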

1 When the vector space is infinite, we have two choices. We can continue to use finite linear combinations (Hamel basis), or we can allow infinite sums. If we use the metric we previously defined on s, it is possible to show that the partial sums x1e1 + ⋯ + xnen converge to x ∈ s for every x. This is an example of a Schauder basis.

11.16 Is the Dimension of a Vector Space Always Finite?

Although we started with a basis with a finite number of elements, there are vector spaces with infinite bases. The arguments become trickier then, and we will not consider that case further other than to give an example.

◮ Example 11.16.1: Attempted Basis for the Sequence Space. The sequence space s contains infinite linearly independent sets. For s, define the vectors ej, j = 1, 2, 3, ... by (ej)i = δij. Then e1 = (1, 0, 0, ...), e2 = (0, 1, 0, 0, ...), etc. We now have an infinite set E = {e1, e2, e3, ...} that seems like a possible basis. The set E is linearly independent in s.

However, E is not a basis for s because it does not span s. The problem is that linear combinations involve finite sums, and vectors such as (1, 1, 1, ...) cannot be written as a finite sum of the ej. ◭
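A rough numerical sketch of the footnote's point (added as an illustration, with the sequences truncated to 60 terms for computation): under the sequence-space metric of section 10.37, the partial sums x1e1 + ⋯ + xnen do approach x = (1, 1, 1, ...), even though no finite sum equals it.

```python
import numpy as np

# Truncated version of the sequence-space metric d(x, y) = sum 2^-i |xi - yi| / (1 + |xi - yi|).
def d(x, y, terms=60):
    i = np.arange(1, terms + 1)
    diff = np.abs(x[:terms] - y[:terms])
    return np.sum(2.0**-i * diff / (1 + diff))

x = np.ones(60)                                    # stands in for (1, 1, 1, ...)
for n in (1, 5, 10, 20):
    s_n = np.where(np.arange(60) < n, 1.0, 0.0)    # partial sum x1*e1 + ... + xn*en
    print(n, d(x, s_n))                            # distances shrink toward 0 (about 2^-(n+1))
```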

27. Subspaces Attached to a Matrix

This chapter draws on Chapter 27 of Simon and Blume.

27.1 The Column Space of a Matrix

Let A be an m × n matrix. The column space of the matrix A, Col(A), is the set of linear combinations of the columns of A. Denote column j by aj, so

$$A = \begin{pmatrix} a_1 & \cdots & a_n \end{pmatrix}.$$

Any linear combination of the columns of A can be written

$$\sum_{j=1}^{n} x_j a_j = Ax,$$

where x = (x1, ..., xn)ᵀ ∈ Rn. Alternatively, since the column space of A is the set of linear combinations of the columns of A, we have Col(A) = L[a1, ..., an].

There are multiple ways to see that Col(A) is a vector subspace of Rm. One is to invoke Theorem 11.2.1, which tells us the span of any set of vectors in Rm is a subspace of Rm. Another is to realize that Col(A) = ran TA where the linear transformation TA is defined by TA(x) = Ax. By Corollary 10.14.2, ran TA is a vector subspace of Rm.

27.2 The Row Space of a Matrix

We define the row space of A, Row(A), as the set of all linear combinations of the rows of A. It is the span of the rows of A. Of course, these are row vectors in Rn, not column vectors. Now let ai denote the ith row of A, so we can write

$$A = \begin{pmatrix} a_1 \\ \vdots \\ a_m \end{pmatrix}.$$

Any linear combination of the rows of A can be written

$$\sum_{i=1}^{m} x_i a_i = xA,$$

where x = (x1, ..., xm) is an m-dimensional row vector. We can turn it into a column vector by taking the transpose (which was conspicuously absent). That yields Aᵀxᵀ, so we can consider the row space as being the range of Aᵀ.

27.3 Elementary Operations and

We now have three vector spaces derived from matrices: the kernel (null space), row space, and column space of a matrix A. Each of these spaces has a well-defined dimension. It’s not hard to show that elementary column operations do not affect dim Col(A) and elementary row operations do not affect dim Row(A).

Theorem 27.3.1. Let A be an m × n matrix. The three elementary row operations do not affect the row space of A and the three elementary column operations do not affect the column space of A.

Proof. We will prove the row case. The column case differs only in notation. Let a1, ..., am be the rows of A. The row space is the set of their linear combinations, L[a1, ..., am]. We now consider the three elementary row operations in turn.

(1) If we interchange two rows, the set of vectors is completely unchanged, so the span is also unchanged.

(2) If we multiply row i by r ≠ 0, the linear combination

$$t_1 a_1 + \cdots + t_i a_i + \cdots + t_m a_m$$

is the same as

$$t_1 a_1 + \cdots + \frac{t_i}{r}\,(r a_i) + \cdots + t_m a_m,$$

so the span remains the same.

(3) If we add r ≠ 0 times row i to row j, we can rewrite the linear combination

$$t_1 a_1 + \cdots + t_i a_i + \cdots + t_j a_j + \cdots + t_m a_m$$
$$= t_1 a_1 + \cdots + (t_i - t_j r) a_i + \cdots + t_j (a_j + r a_i) + \cdots + t_m a_m.$$

Since this can be read either way, the span again remains the same. None of the elementary row operations affect the span.

27.4 Dimensions of Row and Column Spaces

Recall that the rank of a matrix A is the number of basic variables, which is the number of leading ones in the reduced row-echelon form.

Theorem 27.4.1. Let A be an m × n matrix. Then dim Row(A) = rank A = dim Col(A).

Proof. Since each row in reduced row-echelon form has a leading one that is a pivot, the non-zero rows are linearly independent. They also span the row space, which is unchanged by row reduction, and thus form a basis. It follows that

dim Row(A) = rank A.

The column case is similar.

Since dim Row(A) (respectively dim Col(A)) is unaffected by elementary row (column) operations, the rank is independent of the row (column) reduction used.

27.5 Rank of Matrix Products

One consequence of Theorem 27.4.1 is that the rank of the product AB is no more than the rank of A, and that the ranks are equal when B is invertible.

Theorem 27.5.1. Let A and B be conformable matrices. Then rank(AB) ≤ rank A. Moreover, if B is invertible, rank(AB) = rank A.

Proof. We know that rank A = dim Col(A). Now the columns of AB are linear combinations of the columns of A. In fact, if a1, ..., an are the columns of A, then the jᵗʰ column of AB is $\sum_{i=1}^n b_{ij} a_i$. But then

$$\operatorname{rank}(AB) = \dim \operatorname{Col}(AB) \le \dim \operatorname{Col}(A) = \operatorname{rank} A.$$

Now suppose B is invertible. Apply the previous result to (AB)B⁻¹, obtaining

$$\operatorname{rank}\bigl((AB)B^{-1}\bigr) \le \operatorname{rank}(AB).$$

But (AB)B⁻¹ = A(BB⁻¹) = A, so rank A ≤ rank(AB). Combining the two results shows rank(AB) = rank A when B is invertible.

27.6 The Kernel of a Matrix

Since the solution set of a linear system is not affected by row operations, neither is the dimension of the kernel. We also saw in section 10.13 that the solution set to Ax = b is a translate of ker A, so the dimension of the kernel tells us what the solution set looks like in general. It is just a translate of the kernel.

So what is the dimension of the kernel? Let's look at an example. Suppose the reduced row-echelon form of the coefficient matrix is

$$R = \begin{pmatrix} 1 & 1 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 0 & 1 \end{pmatrix}.$$

There are two basic variables (x1 and x3) and four free variables (x2, x4, x5, and x6). The reduced matrix now gives us two equations that define the kernel:

0 = x1 + x2 + x4 + x5 + x6

0 = x3 + x4 + x6.

We can find a basis for the kernel by successively setting all of the free variables but one to zero. The other can be anything we want. We choose +1. Here are the solutions.

$$b_1 = \begin{pmatrix} -1\\ 1\\ 0\\ 0\\ 0\\ 0 \end{pmatrix},\quad b_2 = \begin{pmatrix} -1\\ 0\\ -1\\ 1\\ 0\\ 0 \end{pmatrix},\quad b_3 = \begin{pmatrix} -1\\ 0\\ 0\\ 0\\ 1\\ 0 \end{pmatrix},\quad b_4 = \begin{pmatrix} -1\\ 0\\ -1\\ 0\\ 0\\ 1 \end{pmatrix}.$$

Each of the bi is in the kernel, and they are all linearly independent. The point is that each row i = 2, 4, 5, 6 is non-zero in only one of the vectors. This happens because each vector is generated by considering a case where only one of the free variables is non-zero, which forces the corresponding row to be non-zero. The result of all this is that

dim ker A = #free vars.

27.7 The Dimension of the Kernel

The example was all very nice, but it is no substitute for a proof. Fortunately, all the proof needs is to add some words of explanation.

Theorem 27.7.1. Let A be an m × n matrix. Then

dim ker A = #free vars.

Proof. Row reduce A to a reduced row-echelon form R. Both A and R will have the same kernel because elementary row operations do not affect the solution set. Write down the homogeneous equations corresponding to the reduced row-echelon form R. For each free variable i, set xi = 1 and all the other free variables to zero, and solve for the basic variables. This defines a vector bi for each free variable i.

Each bi solves the homogeneous equations, so bi ∈ ker A. Suppose x ∈ ker A. Then x solves the reduced row-echelon system for some values xi of the free variables. We can write

$$x = \sum_{i \in \text{free vars}} x_i b_i,$$

showing that the bi span ker A. As bi is the only one of the bj where row i is non-zero, the bj are linearly independent. It follows that they form a basis for ker A, so

dim ker A = #free vars.

27.8 Fundamental Theorem of Linear Algebra

We know that the number of free variables and the number of basic variables add to the number of variables (n). Combining that with Theorem 27.7.1, we obtain

#basic vars = n − dim ker A.

Since rank A is the number of basic variables, we can sum this up as follows.

Fundamental Theorem of Linear Algebra. Let A be an m × n matrix. Then

n = rank A + dim ker A.
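As a quick check on the example and the theorem, the sketch below (ours, not from the notes) uses the symbolic library sympy; nullspace() returns one basis vector per free variable, exactly as in the construction of the bi above.

    # Verifying Theorem 27.7.1 and the Fundamental Theorem on the matrix R of section 27.6.
    import sympy as sp
    R = sp.Matrix([[1, 1, 0, 1, 1, 1],
                   [0, 0, 1, 1, 0, 1]])
    rank = R.rank()                  # number of basic variables (leading ones): 2
    kernel_basis = R.nullspace()     # one basis vector per free variable: 4 vectors
    print(rank, len(kernel_basis))
    print(rank + len(kernel_basis) == R.cols)   # True: n = rank A + dim ker A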

31. Transformations and Coordinates

You'll notice there's no Chapter 31 in Simon and Blume. Some of this material is not in Simon and Blume; some is scattered in the text.

31.1 Isomorphic Vector Spaces

It's often helpful to analyze vector space problems based on general considerations rather than being tied to a specific vector space characterized by a particular basis. We can change bases of vector spaces regardless of whether we think of them as being subspaces of Rⁿ or something else entirely. However, if two vector spaces have the same finite dimension, there will always be a mapping that will allow us to treat them as identical, as far as all vector space constructions are concerned.

So when are two vector spaces the same? If only the vector space properties matter, the answer is that they are the same if they are isomorphic.

Isomorphic Vector Spaces. Two vector spaces V and W are isomorphic if there is a linear transformation T : V → W that is one-to-one and onto. Such a mapping is called a linear isomorphism.

The fact that the transformation is linear tells us that it preserves the vector space operations. Because it is one-to-one and onto, it is also invertible. To see that, for any y ∈ W, let T⁻¹(y) denote the unique x ∈ V with T(x) = y. Here such x exist because T maps onto W, and x is unique because T is one-to-one.

31.2 Matrix Representation of Isomorphisms

One important result is that a linear isomorphism T from Rⁿ to Rᵐ can be written T(x) = Ax for an invertible matrix A.

Theorem 31.2.1. Let T : Rⁿ → Rᵐ be a linear isomorphism. Then m = n and there is an invertible matrix A with T(x) = Ax for every x ∈ Rⁿ.

Proof. Let T be as above. Because the mapping is linear, we can use Theorem 10.6.1 to find an m × n matrix A with T(x) = Ax. Since T is a linear isomorphism, it must be both one-to-one and onto. The first requires that rank A = n by Corollary 7.28.1. By Corollary 7.30.2, the fact that T is onto tells us that rank A = m. Combining these shows m = n = rank A, which means that A is non-singular, that is, it must be invertible.

31.3 Isomorphic Vector Spaces have the Same Dimension

In fact, all isomorphic vector spaces, not just Rⁿ, must have the same dimension. It's also easy to show that two vector spaces of the same finite dimension are isomorphic.

Theorem 31.3.1. Let V and W be finite-dimensional vector spaces. Then dim V = dim W if and only if there is an isomorphism T : V → W.

Proof. Only if case (⇒): Let V = {v1, ..., vn} be a basis for V and W = {w1, ..., wn} a basis for W, and let dim V = n (which is also dim W in this part). Then set T(vi) = wi and use linearity to define T on all of V. The resulting transformation T maps V into W. We will show it is bijective.

(1) The linear mapping T maps onto W, because if x = Σj xj wj ∈ W, then x = T(Σj xj vj).

(2) The mapping T is one-to-one because if T(x) = T(y) for some x, y ∈ V, then Σj xj wj = Σj yj wj. As W is linearly independent, xj = yj for all j = 1, ..., n, showing that x = y.

It follows that T is an isomorphism.

If case (⇐): Using the same notation, consider the image of the span, T(L[V]) = L[T(V)]. Since T maps onto W, L[T(V)] = W, showing that dim W ≤ dim V.

We next show that the image of the V-basis V, T(V), is a linearly independent set. Suppose there are tj, j = 1, ..., n with Σj tj T(vj) = 0. Then T(Σj tj vj) = Σj tj T(vj) = 0. The fact that T is an isomorphism, and so one-to-one, implies Σj tj vj = 0. But V is a basis, so tj = 0 for all j, showing that T(V) is linearly independent. Since T(V) is a linearly independent set in W, dim W ≥ n. Combining the results shows dim W = n = dim V.

You can even use this to show that any n-dimensional vector space is isomorphic to any free vector space of dimension n.

31.4 Isomorphism: One and a half Examples

◮ Example 31.4.1: An Isomorphism. Let W = {x ∈ R³ : x1 + x2 + x3 = 0}. This is a two-dimensional subspace and should be isomorphic with R². We start by finding a basis for W. The vectors

$$b_1 = \begin{pmatrix} 1\\ 0\\ -1 \end{pmatrix},\qquad b_2 = \begin{pmatrix} 0\\ 1\\ -1 \end{pmatrix}$$

will do (as a linearly independent set in a two-dimensional space, they must be a basis). Define

$$T\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = x_1 b_1 + x_2 b_2 = x_1\begin{pmatrix} 1\\ 0\\ -1 \end{pmatrix} + x_2\begin{pmatrix} 0\\ 1\\ -1 \end{pmatrix} = \begin{pmatrix} x_1\\ x_2\\ -x_1 - x_2 \end{pmatrix}.$$

Because b1 and b2 are linearly independent, T is one-to-one, and because {b1, b2} spans W, it is onto. That makes it an isomorphism. The inverse map is

$$T^{-1}\begin{pmatrix} y_1\\ y_2\\ -y_1 - y_2 \end{pmatrix} = \begin{pmatrix} y_1\\ y_2 \end{pmatrix}.$$

Keep in mind that the components of y must sum to zero by the definition of W. So once we know y1 and y2, the value of y3 is already determined by the need to be in W. ◭

We could just as easily use the free vector space with basis {red, water} and set

$$T(x_1\,\text{red} + x_2\,\text{water}) = \begin{pmatrix} x_1\\ x_2\\ -x_1 - x_2 \end{pmatrix}.$$

31.5 Isomorphic Normed Spaces

Normed spaces and inner product spaces have to meet higher standards for isomorphism because they have structure beyond their vector space structure that must be preserved.

Isometric Normed Spaces. An isomorphism T between normed spaces (V, ‖·‖₁) and (W, ‖·‖₂) is a vector space isomorphism between V and W that preserves the norm, ‖T(x)‖₂ = ‖x‖₁. Such isomorphisms are also called linear isometries or isometric isomorphisms.

Isometries are often linear. To state this more precisely, define the midpoint of two vectors x and y as (x + y)/2. We state the following theorem of Mazur and Ulam without proof.

Mazur-Ulam Theorem. Let V and W be normed spaces and T a mapping of V onto W with ‖Tx‖ = ‖x‖ and T(0) = 0. Then T maps midpoints to midpoints and is linear as a map over R.

This means that T is an isometric isomorphism between V and W. The result can fail in complex vector spaces.

You'll notice that although T maps midpoints to midpoints, that is not generally true of the norm ‖·‖. What happens is that

$$\left\| T\Bigl(\tfrac{1}{2}(x + y)\Bigr) \right\| = \tfrac{1}{2}\,\| T(x) + T(y) \| \le \tfrac{1}{2}\bigl(\| T(x) \| + \| T(y) \|\bigr).$$

However, the triangle inequality is often strict, so ‖·‖ usually doesn't map midpoints to midpoints.
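A quick numerical sketch (ours, not from the notes) illustrates the isometry idea: an orthogonal map of Euclidean space is a linear isometry, so it preserves norms and, consistent with Mazur-Ulam, sends midpoints to midpoints. The rotation angle and test vectors below are arbitrary choices for illustration.

    import numpy as np
    theta = 0.7
    T = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])          # a rotation, so T.T @ T = I
    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    print(np.isclose(np.linalg.norm(T @ x), np.linalg.norm(x)))   # ||T(x)|| = ||x||
    print(np.allclose(T @ ((x + y) / 2), (T @ x + T @ y) / 2))    # midpoints to midpoints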

31.6 Isomorphic Inner Product Spaces

Inner product spaces have an even higher standard to uphold for isomorphism. The inner product must be preserved, meaning that angles between vectors remain the same. We will use the notation ⟨x, y⟩ᵢ to distinguish the inner products.

Isomorphic Inner Product Spaces. An isomorphism T between inner product spaces (V, ⟨·,·⟩₁) and (W, ⟨·,·⟩₂) is a vector space isomorphism between V and W that preserves the inner product, ⟨T(x), T(y)⟩₂ = ⟨x, y⟩₁.

Using less precise notation, we may write the inner product condition as

T(x) · T(y) = x · y.

31.7 Characterizing Inner Product Isomorphisms

Suppose T : Rⁿ → Rⁿ is an inner product isomorphism when both Rⁿ's have the Euclidean inner product. Such a mapping is also an isometry. The image of the standard basis is not only a basis (guaranteed by the vector space isomorphism), but must be an orthonormal basis. To see this, we compute

T(ei) · T(ej) = ei · ej = δij.

The mapping of the standard basis vectors to an orthonormal set means that T is either a rotation, or a rotation together with a reflection.

To see this in R², the fact that T(e1) is perpendicular to T(e2) means that T(e2) lies on the line perpendicular to T(e1). Since T(e2) is a unit vector, there are only two possible places to put it. One is a rotation. The other involves a reflection together with a rotation.

In R³, T(e1) determines a plane that the other T(ei) lie in. Once we know where T(e2) goes, there are only two choices for T(e3). One involves a rotation of {e1, e2, e3}, the other a reflection and rotation. That the latter case is possible can be seen by letting your thumb, forefinger, and middle finger represent the basis vectors. Your right hand cannot be rotated to be a left hand, and vice-versa. However, a mirror can turn a right hand into a left hand, which is why a reflection might be needed.

31.8 Rotations in R²

Consider a counter-clockwise rotation of the canonical basis vectors in R² by an angle θ. This rotates the black vectors ei into the red ones bi.

Figure 31.8.1: Effect of Rotation by θ. The standard coordinate axes are rotated counter-clockwise by an angle θ, carrying e1 and e2 into b1 and b2.

A little trigonometry gives us the coordinates in the old system. Since each bi is a unit vector, we can read the coordinates in terms of the sine and cosine of θ. Specifically,

b1 = (cos θ, sin θ)ᵀ and b2 = (−sin θ, cos θ)ᵀ.

As shown in the diagram, we have the following mapping:

$$e_1 = \begin{pmatrix} 1\\ 0 \end{pmatrix} \mapsto \begin{pmatrix} \cos\theta\\ \sin\theta \end{pmatrix},\qquad e_2 = \begin{pmatrix} 0\\ 1 \end{pmatrix} \mapsto \begin{pmatrix} -\sin\theta\\ \cos\theta \end{pmatrix}.$$

Note that the new vectors are still an orthonormal basis, as both vectors are rotated by the same angle. Also, the determinant of the new basis matrix is

$$\det\begin{pmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{pmatrix} = +1.$$

In fact, rotations always have a determinant of +1.

31.9 Example: A Rotation and its Inverse

◮ Example 31.9.1: 45° Rotation: Done and Undone. The matrix

$$B = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1\\ 1 & 1 \end{pmatrix}$$

rotates the coordinates of R² by 45°, taking (1, 0)ᵀ to (1/√2, 1/√2)ᵀ and (0, 1)ᵀ to (−1/√2, 1/√2)ᵀ. Since this is a rotation, its transpose is also its inverse, and we have

$$B^{-1} = B^{T} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ -1 & 1 \end{pmatrix}.$$
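The properties just claimed are easy to check numerically; the short sketch below (ours, not part of the example) uses numpy.

    import numpy as np
    B = np.array([[1., -1.],
                  [1.,  1.]]) / np.sqrt(2)
    print(np.allclose(B.T @ B, np.eye(2)))     # B^T B = I, so B^{-1} = B^T
    print(np.isclose(np.linalg.det(B), 1.0))   # determinant +1: a pure rotation
    print(B @ np.array([1., 0.]))              # e1 -> (1/sqrt(2), 1/sqrt(2))
    print(B @ np.array([0., 1.]))              # e2 -> (-1/sqrt(2), 1/sqrt(2))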

Figure 31.9.2 (left panel: Effect of B; right panel: Effect of Bᵀ): Here the standard coordinate axes are rotated counter-clockwise by 45° in the left panel; b1 and b2 are the columns of B. In the right panel, we have the inverse transformation, where a 45° clockwise rotation gives us the new coordinate axes; b1ᵀ and b2ᵀ are the columns of the matrix Bᵀ.

◭

31.10 Matrices with AᵀA = I

Reflections also obey AᵀA = I, but the determinant is −1. For example, reflecting in the horizontal axis maps (e1, e2) ↦ (e1, −e2). The new basis matrix is

$$B = \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}$$

which has determinant −1.

Notice that multiplying again by B returns us to our original coordinates. Also notice that this transformation can't possibly be a rotation. We saw that both diagonal elements must be the same for a rotation, but they aren't here. Also, rotations have determinant +1, while this has determinant −1.

It is easy to see that matrices obeying AᵀA = I come in two types: those that are purely rotations, with det A = +1, and those that involve a reflection, with det A = −1. Nothing else is possible.

31.11 Automorphisms on ℓ₂ⁿ

An automorphism on a vector space V is an isomorphism from V to itself, T : V → V. Here we will consider automorphisms that preserve the inner product on ℓ₂ⁿ, Euclidean Rⁿ.

Before proceeding, we prove a useful lemma.

Lemma 31.11.1. Let V be an inner product space and suppose x and x′ obey x·y = x′·y for every y ∈ V. Then x = x′.

Proof. Now (x − x′)·y = 0 for every y ∈ V. Set y = x − x′ to obtain ‖x − x′‖ = 0. It follows that x = x′.

Let T be an automorphism on ℓ₂ⁿ. Use the standard basis to find a matrix representation A of T. Then (Ax)·(Ay) = x·y for all x, y ∈ Rⁿ. Now

$$x \cdot y = (Ax)\cdot(Ay) = (Ay)^{T}(Ax) = y^{T}(A^{T}A)x = \bigl((A^{T}A)x\bigr)\cdot y.$$

Since this holds for every x, y ∈ Rⁿ, Lemma 31.11.1 implies that AᵀAx = x, so AᵀA = I. Then A⁻¹ = Aᵀ. Consideration of the standard basis showed these are rotations and reflections. In fact, the pure rotations all have det A = +1, while those involving reflections have det A = −1. As in R², an even number of reflections amounts to a rotation. I think the main reason it is less clear in dimensions higher than 3 is that our intuition doesn't work so well there.

When the vector space is Cⁿ, we use $x \cdot y = x^{*}y = \sum_{j=1}^n \bar{x}_j y_j$. In that case, we find that the inner product is preserved when the basis matrix U obeys U*U = I. Such matrices are called unitary matrices and are the complex version of rotations and reflections. We can still use det U = ±1 to sort out which is which.
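The determinant test is easy to see in a small sketch (ours, for illustration): both matrices below satisfy AᵀA = I, and the sign of the determinant separates the rotation from the reflection.

    import numpy as np
    rotation   = np.array([[0., -1.],
                           [1.,  0.]])     # 90-degree rotation
    reflection = np.array([[1.,  0.],
                           [0., -1.]])     # reflection in the horizontal axis
    for A in (rotation, reflection):
        print(np.allclose(A.T @ A, np.eye(2)), np.linalg.det(A))
    # both obey A^T A = I; the determinants are +1.0 and -1.0 respectively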

31.12 Bases and Coordinates

We've used two sets of bases for several pages. It's time to approach the different coordinate systems more systematically.

Let B = {b1, ..., bn} be a basis for Rⁿ. Define the basis matrix by lining up the basis vectors in order: B = (b1 b2 ··· bn). Since B both spans and is a linearly independent set of n vectors in Rⁿ, B is an n × n matrix with rank n. That means it is invertible. Given a vector x ∈ Rⁿ, we find its vector of coordinates tB in the B basis by solving the equation

x = B tB.

Because B is invertible, tB = B⁻¹x.

This all applies to the standard basis E = {e1, ..., en}. In that case, the basis matrix E is the n × n identity matrix. Nonetheless, we use a special name for it to emphasize that we are doing basis calculations. Given a vector x, expressed in the standard coordinates, we find that tE = E⁻¹x = Ix = x, meaning that the coordinates are what we think they are.

31.13 Example: Coordinates in R³

Let's see how this works in R³. The basis B = {b1, b2, b3}, defined by

$$b_1 = \begin{pmatrix} 1\\ 2\\ 3 \end{pmatrix},\quad b_2 = \begin{pmatrix} 2\\ 0\\ 2 \end{pmatrix},\quad b_3 = \begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix},$$

gives us the basis matrix

$$B = \begin{pmatrix} 1 & 2 & 1\\ 2 & 0 & 1\\ 3 & 2 & 1 \end{pmatrix}.$$

We now use the formula x = B tB to change from B coordinates to standard coordinates. The vectors with coordinates tB = (1, 1, 0)ᵀ and t′B = (1, 0, 3)ᵀ yield the vectors x = B tB = (3, 2, 5)ᵀ and x′ = B t′B = (4, 5, 6)ᵀ in standard coordinates.

Let's see how it works the other way, going from standard basis E coordinates to B coordinates. For that, we use the formula

tB = B⁻¹x.

The inverse of B is

$$B^{-1} = \begin{pmatrix} -1/2 & 0 & 1/2\\ 1/4 & -1/2 & 1/4\\ 1 & 1 & -1 \end{pmatrix}.$$

The vector x = (3, 2, 1)ᵀ then has coordinates tB = B⁻¹x = (−1, 0, 4)ᵀ in the basis B, meaning that x = −b1 + 4b3. The vector x′ = (−1, −1, +5)ᵀ has coordinates t′B = B⁻¹x′ = (3, 3/2, −7)ᵀ in the B basis, so that x′ = 3b1 + (3/2)b2 − 7b3 = (−1, −1, +5)ᵀ in the standard basis.
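These calculations can be reproduced in a few lines; the sketch below (ours, not part of the example) uses numpy and solves the linear system rather than forming B⁻¹ explicitly.

    import numpy as np
    B = np.array([[1., 2., 1.],
                  [2., 0., 1.],
                  [3., 2., 1.]])          # columns are b1, b2, b3
    t = np.array([1., 1., 0.])
    print(B @ t)                          # B-coordinates (1,1,0) -> standard (3,2,5)
    x = np.array([3., 2., 1.])
    print(np.linalg.solve(B, x))          # standard (3,2,1) -> B-coordinates (-1,0,4)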

31.14 Changing Coordinate Systems

Consider two different bases (aka coordinate systems), B = {b1, ..., bn} and B′ = {b′1, ..., b′n}, and form the corresponding basis matrices B and B′. Given a vector x, we can write it in the two coordinate systems as x = B tB and x = B′ tB′. Then B tB = B′ tB′. Solving for tB and tB′, we derive the change of coordinates formulas:

$$t_B = B^{-1}B'\, t_{B'} \qquad\text{and}\qquad t_{B'} = (B')^{-1}B\, t_B.$$

Starting with the B′ coordinates, we multiply by B′ to get the actual vector x, and then multiply by B⁻¹ to put it into the B coordinate system. Conversely, to convert the B coordinates to B′ coordinates, we reverse the process, multiplying first by B, and then by (B′)⁻¹.

31.15 Example: Changing Coordinates in R²

To see how this works, suppose

$$B = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix} \qquad\text{and}\qquad B' = \begin{pmatrix} 1 & 2\\ 2 & 1 \end{pmatrix}$$

are the basis matrices. Then

$$B^{-1} = \begin{pmatrix} 1 & -1\\ 0 & 1 \end{pmatrix} \qquad\text{and}\qquad (B')^{-1} = \frac{1}{3}\begin{pmatrix} -1 & 2\\ 2 & -1 \end{pmatrix}.$$

Consider the vector with B′ coordinates tB′ = (1, 4)ᵀ. Using the formula, we obtain the B coordinates

$$t_B = B^{-1}B'\, t_{B'} = \begin{pmatrix} 1 & -1\\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2\\ 2 & 1 \end{pmatrix}\begin{pmatrix} 1\\ 4 \end{pmatrix} = \begin{pmatrix} 3\\ 6 \end{pmatrix}.$$

Let's check it. Now tB′ = (1, 4)ᵀ corresponds to

$$x = 1\begin{pmatrix} 1\\ 2 \end{pmatrix} + 4\begin{pmatrix} 2\\ 1 \end{pmatrix} = \begin{pmatrix} 9\\ 6 \end{pmatrix}$$

and

$$x = 3\begin{pmatrix} 1\\ 0 \end{pmatrix} + 6\begin{pmatrix} 1\\ 1 \end{pmatrix} = \begin{pmatrix} 9\\ 6 \end{pmatrix}.$$

This shows that both expressions refer to the same vector x, whose standard coordinates are

$$x = \begin{pmatrix} 9\\ 6 \end{pmatrix}.$$
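The same check in code (ours, for illustration): compute B′tB′, then solve B tB = B′tB′ for the B coordinates.

    import numpy as np
    B  = np.array([[1., 1.],
                   [0., 1.]])
    Bp = np.array([[1., 2.],
                   [2., 1.]])
    t_Bp = np.array([1., 4.])
    t_B = np.linalg.solve(B, Bp @ t_Bp)    # t_B = B^{-1} B' t_{B'}
    print(t_B)                             # (3, 6)
    print(B @ t_B, Bp @ t_Bp)              # both give the same vector x = (9, 6)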

31.16 Linear Transformations and Bases

So how do these coordinate changes affect linear transformations? Take a linear transformation on Rⁿ, T : Rⁿ → Rⁿ. As we saw in section 10.6, we can use the standard basis to represent this in matrix form, so that T(x) = Ax.

So what happens if we want to use a different basis? One reason to do such a thing would be to write the transformation in a more convenient form, one that is easier to calculate or interpret.

Let B be a basis matrix for B. We will write the matrix for T as AE when it is in standard (E) coordinates and AB when it is in B coordinates. To find AB from AE, we start with B coordinates tB, then convert them to standard coordinates, x = B tB. Then we feed this to the matrix in standard coordinates, obtaining AE B tB. As this is in standard coordinates, we have to convert the result back to the B coordinates. We do this by multiplying on the left by B⁻¹. That gives us (B⁻¹AE B)tB as the B coordinates of the transformed vector. Thus

$$A_B = B^{-1}A_E B \qquad\text{or}\qquad B A_B B^{-1} = A_E \tag{31.16.1}$$

is the matrix for T in B coordinates. The type of transformation used in Equation 31.16.1 is sometimes called a similarity transformation.

31.17 Coordinate Change with Arbitrary Bases

Things are a bit more complicated if we had originally used a basis other than the standard basis. If the transformation had been written in B′ coordinates, we would multiply by (B′)⁻¹B to convert B coordinates to B′ coordinates, apply AB′, then convert back. The result is:

$$A_B = B^{-1}B'\, A_{B'}\, (B')^{-1}B.$$

Another way to write this, which may make the method clearer, is to transform everything into standard coordinates:

$$B A_B B^{-1} = A_E = B' A_{B'} (B')^{-1}.$$

31.18 Example: Linear Transformation Basis Change

Suppose our new basis is

$$\mathcal{B} = \left\{ \begin{pmatrix} 1\\ 1 \end{pmatrix}, \begin{pmatrix} 1\\ -1 \end{pmatrix} \right\},$$

with basis matrix

$$B = \begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}.$$

This is an orthogonal basis since the vectors are perpendicular, but it is not orthonormal because the vectors have length √2. This means that B⁻¹ = (1/2)Bᵀ. Suppose our linear transformation T has matrix

$$A = \frac{1}{2}\begin{pmatrix} 3 & 1\\ 1 & 3 \end{pmatrix}$$

in the standard basis. To find its representation AB in the B basis, we first compute

$$B^{-1} = \frac{1}{2}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}.$$

Then the new matrix is

$$A_B = B^{-1}AB = \frac{1}{4}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix}\begin{pmatrix} 3 & 1\\ 1 & 3 \end{pmatrix}\begin{pmatrix} 1 & 1\\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 2 & 0\\ 0 & 1 \end{pmatrix}.$$

As you can see, the transformation has taken a particularly simple form: the transformed matrix AB is diagonal. This reflects the fact that T(b1) = 2b1 and T(b2) = b2. In Chapter 23, you will learn how we can sometimes find such a basis from the original matrix.
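The similarity transformation is easy to verify numerically; the sketch below (ours, not part of the example) confirms both the diagonal form and the fact that the basis vectors are simply scaled.

    import numpy as np
    A_E = np.array([[3., 1.],
                    [1., 3.]]) / 2
    B   = np.array([[1.,  1.],
                    [1., -1.]])
    A_B = np.linalg.inv(B) @ A_E @ B
    print(A_B)                               # [[2, 0], [0, 1]]
    print(A_E @ B[:, 0], A_E @ B[:, 1])      # T(b1) = 2*b1, T(b2) = b2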

31.19 Example: Transformations with Complex Numbers

Consider the matrix

$$A = \begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}.$$

In section 8.31, we saw that A is a square root of −I, the negative of the identity matrix. By using a complex basis, we can see just how true that is. There is a complex basis B where AB is purely imaginary in the weak sense that all of its non-zero elements are purely imaginary. Indeed, the non-zero elements are square roots of −1. Consider the basis with basis matrix

$$B = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ i & -i \end{pmatrix}.$$

It is a unitary matrix. Its inverse is its Hermitian conjugate

$$B^{-1} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -i\\ 1 & i \end{pmatrix}.$$

Now

$$A_B = B^{-1}AB = \frac{1}{2}\begin{pmatrix} 1 & -i\\ 1 & i \end{pmatrix}\begin{pmatrix} 0 & 1\\ -1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1\\ i & -i \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 1 & -i\\ 1 & i \end{pmatrix}\begin{pmatrix} i & -i\\ -1 & -1 \end{pmatrix} = \begin{pmatrix} i & 0\\ 0 & -i \end{pmatrix},$$

revealing that the matrix A is more closely connected to the imaginary numbers than we first realized.

If you've had a comprehensive linear algebra course, you may have seen such transformations before. If you had a differential equations class that covered linear differential systems, you may have seen them there too. As was true of the previous page, you will learn how to find transformations like this in Chapter 23.
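The complex change of basis also checks out numerically; the sketch (ours, for illustration) uses the Hermitian conjugate of B as its inverse.

    import numpy as np
    A = np.array([[0., 1.],
                  [-1., 0.]])
    B = np.array([[1,  1],
                  [1j, -1j]]) / np.sqrt(2)
    A_B = np.conj(B).T @ A @ B        # B is unitary, so B^{-1} is its Hermitian conjugate
    print(np.round(A_B, 10))          # [[1j, 0], [0, -1j]]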

31.20 The Dual Space

In section 10.8, we defined linear functionals, linear functions from a real vector space Rⁿ to R. More generally, let F = R or C. A linear function f from a finite-dimensional vector space V over F to F is called a linear functional. We saw earlier that any linear functional on Fⁿ can be represented by a 1 × n matrix, a horizontal vector, a covector.1 The dual space of V is the set of all linear functionals on V and is denoted V*. Since the set of 1 × n matrices is a vector space of dimension n = dim V, dim V* = dim V.

The most common duality in economics involves prices and quantities. We think of quantities x ∈ Rⁿ and prices in (Rⁿ)*, writing px for cost. Some problems are better studied using functions of quantity (e.g., utility, production), while others are better studied using dual functions of price (cost, expenditure, indirect utility).

To see how duality relates to bases, we treat f as we do any other linear transformation. We express x using the standard basis and use the linearity of f to write:

$$f(x) = f\Bigl(\sum_{j=1}^n x_j e_j\Bigr) = \sum_{j=1}^n x_j f(e_j) = \bigl(f(e_1)\ \cdots\ f(e_n)\bigr)\begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}. \tag{31.20.2}$$

So any linear functional f on V defines a 1 × n matrix vf by

$$v_f = \bigl(f(e_1)\ \cdots\ f(e_n)\bigr).$$

Reading equation (31.20.2) up from the bottom makes it clear that any 1 × n matrix defines a linear functional on V, and vice-versa. We can think of the linear functionals as 1 × n matrices.

In fact, if V is an inner product space, we can identify x ∈ V with the dual element x*, since y ↦ x·y = x*y is a linear functional on V. However, this mapping is only linear if V is a real vector space. If it is a complex vector space, the mapping is conjugate linear.

1 When dealing with infinite-dimensional spaces, a distinction is made between linear functions from V to R, sometimes called linear forms, and continuous linear functions from V to R, called linear functionals. We are avoiding these technical issues by restricting ourselves to finite-dimensional spaces.

31.21 The Dual of a Basis

Given a basis B = {b1, ..., bn} for V, we can define a corresponding dual basis B* for V* by setting bi*(bj) = δij. It is easy to verify that this is a basis for V*.

Theorem 31.21.1. Let V be a finite-dimensional vector space. The dual basis is a basis for V*.

Proof. Let f be a linear functional on V. We write x in the basis B as x = Σj xj bj. Then

$$b_i^*(x) = \sum_{j=1}^n x_j b_i^*(b_j) = \sum_{j=1}^n x_j \delta_{ij} = x_i.$$

Now expand vf x = f(x):

$$v_f x = f(x) = f\Bigl(\sum_{j=1}^n x_j b_j\Bigr) = \sum_{j=1}^n x_j f(b_j) = \sum_{j=1}^n f(b_j)\, b_j^*(x).$$

This shows that f = vf = Σj f(bj) bj*, meaning that B* spans V*.

Next, we consider linear independence. Suppose f = vf = Σⱼ xj bj* = 0. Then for each bi, i = 1, ..., n,

$$0 = f(b_i) = v_f(b_i) = \sum_{j=1}^n x_j b_j^*(b_i) = \sum_{j=1}^n x_j \delta_{ij} = x_i.$$

But then xi = 0 for i = 1, ..., n, showing that B* is a linearly independent set, and therefore a basis for V*.

31.22 The Standard Dual Basis

We can now define the standard dual basis {e1*, ..., en*} for (Rⁿ)* by ei*(ej) = δij. Thus the dual basis vectors ei* are the row vectors e1* = (1, 0, 0, ..., 0), e2* = (0, 1, 0, ..., 0), ..., en* = (0, ..., 0, 1). This allows us to write any f ∈ V* as

$$f = \sum_{i=1}^n f(e_i)\, e_i^* = \bigl(f(e_1), \dots, f(e_n)\bigr).$$

The following corollary is similar to Lemma 31.11.1, but applies to any dual space and doesn't require that V be an inner product space.

Corollary 31.22.1. Let V be a finite-dimensional vector space. Suppose f(x) = f(y) for all f ∈ V*. Then x = y.

Proof. Let B be a basis for V. We can write x = Σj xj bj and y = Σj yj bj. Since bi* ∈ V*, xi = bi*(x) = bi*(y) = yi for every i = 1, ..., n. Then x = y by linear independence of the bi.

31.23 Coordinate Change in the Dual Space

Although we have a formula for the dual basis, we still need to fully identify it. Since the dual basis consists of covectors (row vectors), we form the dual basis matrix B̂ by stacking the rows.2

$$\hat{B} = \begin{pmatrix} b_1^*\\ b_2^*\\ \vdots\\ b_n^* \end{pmatrix}.$$

Then the ij coordinate of B̂B is bi*(bj) = δij, meaning that B̂B = I. The dual basis matrix is simply B⁻¹, keeping in mind that we are using the rows of B⁻¹, not the columns. We formalize this as the following theorem.

Theorem 31.23.1. Let B be a basis for an n-dimensional real vector space V and B = (b1, ..., bn) be the basis matrix. Then the dual basis B* = {b1*, ..., bn*} has basis matrix B⁻¹.

Of course, to change coordinates, tB = B⁻¹x in V. The dual space works a little differently, as the basis matrix must multiply the coordinate vector on the right. Thus if we have a linear functional f defined by the covector vf in the standard basis, the coordinates in the basis B* are tB* = vf(B̂)⁻¹ = vf B. Since the coordinates vary directly with the basis matrix, the vector is called covariant. With ordinary vectors, we use the inverse of the basis matrix, and call them contravariant as a result. It follows that

$$t_{B^*}(t_B) = (v_f B)\bigl(B^{-1}x\bigr) = v_f\bigl(BB^{-1}\bigr)x = v_f x = f(x),$$

showing that f(x) is unaffected by this double change of coordinates, which is what we need.

◮ Example 31.23.2: Gallons vs. Quarts. Going back to the beginning of the Chapter 10 notes, this means that if we measure milk in quarts rather than gallons, and milk is good k, then the coordinate change for quantities is given by B⁻¹ = diag(1, ..., 1, 4, 1, ..., 1), where 4 is in the kth row. Then B = diag(1, ..., 1, 1/4, 1, ..., 1), so the corresponding (dual) price vector must be multiplied by 1/4. ◭
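The content of Theorem 31.23.1 is easy to see numerically; the sketch below (ours, not from the notes) reuses the basis matrix from section 31.13.

    import numpy as np
    B = np.array([[1., 2., 1.],
                  [2., 0., 1.],
                  [3., 2., 1.]])               # columns are the basis vectors b_j
    B_hat = np.linalg.inv(B)                   # rows of B^{-1} are the dual basis covectors
    print(np.allclose(B_hat @ B, np.eye(3)))   # b_i^*(b_j) = delta_ij
    v_f = np.array([1., 0., 0.])               # a covector in the standard dual basis
    x   = np.array([3., 2., 1.])               # a vector in the standard basis
    print(np.isclose((v_f @ B) @ (B_hat @ x), v_f @ x))   # f(x) unchanged by the change of basis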

2 I don't use B* because of potential confusion with the Hermitian conjugate.

31.24 Another Duality Example

Rotations and reflections are a little different from the other transformations of bases. For starters, B⁻¹ = Bᵀ. Now when we apply B⁻¹ = Bᵀ to the coordinates of a vector, we are taking sums of the columns of Bᵀ. When we apply B to a covector, we obtain sums of the rows of B, which are the columns of Bᵀ. The action is the same on both the vectors and covectors.

Let's see how this works with an orthonormal basis.

◮ Example 31.24.1: 45° Rotation and Duality. The matrix

$$B = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1\\ 1 & 1 \end{pmatrix}$$

rotates the coordinates of R² by 45°, mapping

$$\begin{pmatrix} 1\\ 0 \end{pmatrix} \mapsto \frac{1}{\sqrt{2}}\begin{pmatrix} 1\\ 1 \end{pmatrix} \qquad\text{and}\qquad \begin{pmatrix} 0\\ 1 \end{pmatrix} \mapsto \frac{1}{\sqrt{2}}\begin{pmatrix} -1\\ 1 \end{pmatrix}.$$

Since this is a rotation, its transpose is also its inverse, and we have

$$B^{-1} = B^{T} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1\\ -1 & 1 \end{pmatrix}.$$

Now what happens to the dual basis? It is also rotated by 45°, as the covector (1, 0) ↦ (1/√2, 1/√2) and (0, 1) ↦ (−1/√2, 1/√2).

Figure 31.24.2: Here the standard coordinate axes are rotated counter-clockwise by 45°. The standard dual basis lines up with the standard basis itself. Because the new basis is an orthonormal basis, the dual basis must rotate to match.

◭

31.25 Rⁿ Geometry Puzzle

Consider a square with 2-foot sides. Divide that square into 4 quadrants, with sides 1-foot each. Then inscribe a circle into each quadrant. Finally, inscribe a small circle in the middle of the circles (shown in red below).

Suppose we try an analogous construction in R³, R⁴, .... In R³, we start with a cube with 2-foot sides. We bisect each side with planes, inscribe the 1-foot spheres, then inscribe the red 2-sphere in the middle. In R⁴, we have a tesseract with 2-foot sides; we bisect each side with 3-d "planes", inscribe the 1-foot 3-spheres, then the red 3-sphere in the middle. We do this for each Rⁿ with n ≥ 2.

Problem: What does the diameter of the red sphere converge to as n → ∞?

(A) 0.  (B) 1.  (C) 2.  (D) ∞.

31.26 The Answer to the Puzzle

As shown in the diagram, we draw the diagonal of the 2-foot square (cube, tesseract, etc.). We are in Euclidean space, so in R² the diagonal has length L₂ = √(2² + 2²) = 2√2. In Rⁿ, the length is Lₙ = 2√n.

Examining the diagonal more closely, we find that after subtracting the diameters of the two 1-foot circles it passes through, we have 2√n − 2 left over. This includes both the diameter of the red circle and the part sticking out of the 1-foot circles at either end. By symmetry, the portion sticking out of the 1-foot circle has the same length as the radius, so the leftover portion (2√n − 2) is 4 times the radius, or twice the diameter, of the red circle.

That means the red circle has diameter dₙ = √n − 1. When n = 2, d₂ ≈ 0.414. When n = 4, d₄ = 2 − 1 = 1. When n = 9, d₉ = 3 − 1 = 2. At that point the red hypersphere in the middle touches the sides of the large enclosing box. For n > 9, the red hypersphere actually pokes out.

We can see now that the correct answer was (D) ∞. The inside gets much roomier as the number of dimensions increases, which allows the red hypersphere to partly escape the containment by the other hyperspheres.
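The formula dₙ = √n − 1 is simple enough to tabulate; the sketch (ours, for illustration) makes the divergence explicit.

    import math
    for n in (2, 4, 9, 100, 10_000):
        print(n, math.sqrt(n) - 1)
    # 0.414..., 1.0, 2.0 (touching the box), then growing without bound: answer (D)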

32. Tensors and Tensor Products

Here we build up some of the basic facts concerning tensors. We will later use tensors to express the multidimensional version of Taylor's Theorem. They are also useful background for exterior products (a special kind of tensor) and their relation to multidimensional integrals.

32.1 Outer Products

We've spent some time on inner products of vectors. So if vectors have an inner product, do they also have an outer product? Yes they do!

Given x ∈ Rᵐ and y ∈ Rⁿ, the outer product x ⊗ y is an m × n matrix defined by (x ⊗ y)ij = xi yj. This can also be written x ⊗ y = xyᵀ. An immediate consequence is that (x ⊗ y)ᵀ = yxᵀ = y ⊗ x. The outer product is not to be confused with the exterior product, which we will encounter later.

Thus, if x ∈ R³ and y ∈ R²,

$$x \otimes y = \begin{pmatrix} x_1 y_1 & x_1 y_2\\ x_2 y_1 & x_2 y_2\\ x_3 y_1 & x_3 y_2 \end{pmatrix}.$$

More generally, x ⊗ y is an m × n matrix with ij element xi yj.

32.2 Examples of Outer Products

Suppose x = (1, 2, 3)ᵀ and y = (10, −1)ᵀ. Then

$$x \otimes y = \begin{pmatrix} 10 & -1\\ 20 & -2\\ 30 & -3 \end{pmatrix}.$$

Another example is x = (−1, 0, 3)ᵀ and y = (−1, 5, 10)ᵀ, when

$$x \otimes y = \begin{pmatrix} 1 & -5 & -10\\ 0 & 0 & 0\\ -3 & 15 & 30 \end{pmatrix}.$$

32.3 Some Properties of the Outer Product

There are two important relations:

$$(\alpha x) \otimes y = x \otimes (\alpha y) = \alpha(x \otimes y)$$

and

$$x \otimes (y + z) = x \otimes y + x \otimes z, \qquad (x + z) \otimes y = x \otimes y + z \otimes y.$$

You can see that some sort of bilinearity has been built into the outer product. In fact, you can combine the above to show that for x, y ∈ Rᵐ and u ∈ Rⁿ,

$$(\alpha x + y) \otimes u = (\alpha x) \otimes u + y \otimes u = \alpha(x \otimes u) + y \otimes u,$$

showing that the outer product is linear in the first coordinate. Similarly, for x ∈ Rᵐ and u, v ∈ Rⁿ,

$$x \otimes (\alpha u + v) = \alpha(x \otimes u) + x \otimes v,$$

showing that the outer product is also linear in the second coordinate, and so is bilinear.

Outer products can be used to represent tensor products, but there is an important difference. The outer product obeys (x ⊗ y)ᵀ = y ⊗ x, a formula that often doesn't make sense for tensor products.

32.4 Outer Products and Bilinear Functions

If T is a linear function from the m × n matrices (considered as a vector space) into R, then T(x ⊗ y) is a bilinear function of (x, y).

Theorem 32.4.1. Let T be a linear function on the set of m × n matrices, x ∈ Rᵐ and y ∈ Rⁿ. Then the function f(x, y) = T(x ⊗ y) is a bilinear function on Rᵐ × Rⁿ.

Proof. To see this, note that

$$T\bigl((\alpha x) \otimes y\bigr) = T\bigl(x \otimes (\alpha y)\bigr) = T\bigl(\alpha(x \otimes y)\bigr) = \alpha\, T(x \otimes y)$$

and

$$T\bigl((x + y) \otimes z\bigr) = T(x \otimes z + y \otimes z) = T(x \otimes z) + T(y \otimes z),$$
$$T\bigl(x \otimes (y + z)\bigr) = T(x \otimes y + x \otimes z) = T(x \otimes y) + T(x \otimes z).$$

Together, these imply that f(x, y) = T(x ⊗ y) is bilinear in (x, y).

According to Theorem 10.6.1, we can always represent a linear transformation between vector spaces by a matrix. Here T maps the vector space of m × n matrices into the real numbers. All we need is a basis for the matrices to help us write the transformation in matrix form. We will take the basis of the m × n real matrices defined by letting bij be the m × n matrix with 1 in position ij and zero elsewhere. It is easy to see this spans the m × n matrices and is linearly independent. In fact, the ij basis element is ei ⊗ ej.

Define an m × n matrix A by aij = T(ei ⊗ ej). We can represent f and T by the matrix A as

$$f(x, y) = T(x \otimes y) = \sum_{i=1}^m \sum_{j=1}^n a_{ij}\, x_i y_j,$$

which is a bilinear form (2-tensor).

32.5 The Vector Space of Outer Products

We can define a basis on the set of m × n outer products. One way is to use the standard basis of ei. Then e1 ⊗ e1, e1 ⊗ e2, ..., e1 ⊗ en, e2 ⊗ e1, ..., em ⊗ en is a basis for the space of m × n outer products.

The way this is normally done is to take the free vector space generated by the pairs (ei, ej) and then take equivalence classes so that the result obeys the rules:

$$u \otimes (v + w) = u \otimes v + u \otimes w$$
$$(u + v) \otimes w = u \otimes w + v \otimes w$$
$$\alpha(v \otimes w) = (\alpha v) \otimes w = v \otimes (\alpha w)$$

with x ⊗ y representing an equivalence class. Fortunately, we only need to know that we have a vector space with the extra operation, the outer or tensor product.1

The use of the free vector space has purged such notions as taking the transpose of a tensor product, as there is no rule for handling it. However, it does allow us to define the tensor product of linear maps. Suppose V, W, X, Y are vector spaces and we have linear transformations T : V → X and S : W → Y. Then we can define T ⊗ S : V ⊗ W → X ⊗ Y by

$$(T \otimes S)(v \otimes w) = T(v) \otimes S(w).$$
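In numerical work the outer product is available directly; the sketch below (ours, not from the notes) reproduces the first example of section 32.2 and checks two of the rules above.

    import numpy as np
    x = np.array([1., 2., 3.])
    y = np.array([10., -1.])
    print(np.outer(x, y))                                        # the 3 x 2 example from 32.2
    print(np.allclose(np.outer(2 * x, y), 2 * np.outer(x, y)))   # scalars move through
    print(np.allclose(np.outer(x, y).T, np.outer(y, x)))         # (x ⊗ y)^T = y ⊗ x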

1 The outer product is a way of representing the tensor product, but the two are isomorphic in this case.

32.6 Pure 2-Tensors

Tensors of the form v ⊗ w ∈ V ⊗ W are called pure tensors or simple tensors. Not all tensors can be written that way. Some must be written as linear combinations of pure tensors.

◮ Example 32.6.1: Not All Tensors are Pure. Let V = W = R²,

$$e_1 \otimes e_1 + e_2 \otimes e_2 = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}.$$

Suppose there are v and w with

$$v \otimes w = \begin{pmatrix} v_1 w_1 & v_1 w_2\\ v_2 w_1 & v_2 w_2 \end{pmatrix} = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}.$$

Then v1w1 = 1, so neither v1 nor w1 is zero, and v2w2 = 1, so neither v2 nor w2 is zero. But w1v2 = 0 and w2v1 = 0, so at least one of the vi or wj must be zero, which is impossible. We can only conclude that e1 ⊗ e1 + e2 ⊗ e2 is not a simple tensor. ◭

32.7 Compound Tensors

In fact, there is a rich supply of tensors that are not pure if the dimensions of both vector spaces are greater than one.

Theorem 32.7.1. Suppose both {v1, v2} and {w1, w2} are linearly independent sets. Then v1 ⊗ w1 + v2 ⊗ w2 is not a pure tensor.

Proof. We use the outer product to see this. Suppose there are v3 and w3 with

$$v_1 \otimes w_1 + v_2 \otimes w_2 = v_3 \otimes w_3.$$

Since each of the outer products is m × n, this equation can be written as n equations, one for each column. Let wij denote the jᵗʰ component of wi.

$$\begin{aligned}
w_{11}v_1 + w_{21}v_2 &= w_{31}v_3\\
w_{12}v_1 + w_{22}v_2 &= w_{32}v_3\\
&\;\;\vdots\\
w_{1n}v_1 + w_{2n}v_2 &= w_{3n}v_3
\end{aligned}$$

If any of the w3i = 0, the corresponding w1i and w2i must be zero due to the linear independence of v1 and v2. If all of the w3i = 0, then w1 = w2 = 0, contradicting the linear independence of the wj. If only one of the w3i ≠ 0, both wkj = 0 for j ≠ i, so either both wi are zero, or one is proportional to the other, again violating linear independence. Finally, if at least two w3i ≠ 0, either the non-zero equations are proportional, violating linear independence, or they are not proportional. Then we can write a non-trivial equation of the form αw1 + βw2 = 0, again contradicting linear independence.

32.8 The Rank of Pure Tensors

A simpler way to see that many tensors are not pure is to consider the rank of the matrix x ⊗ y. Since x ⊗ y = xyᵀ and the rank of the vector x is either zero or one, the rank of any pure tensor must be zero or one by the product rule for rank (Theorem 27.5.1). In fact, the rank of x ⊗ y is only zero when it is the zero tensor (aka the zero matrix). Summing up,

Theorem 32.8.1. Tensors with rank larger than one are not pure tensors.

Proof. As shown above, non-zero pure tensors have rank one, so tensors with rank larger than one are not pure tensors.
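Example 32.6.1 can be redone via ranks; the sketch below (ours, for illustration) checks that e1 ⊗ e1 + e2 ⊗ e2 has rank 2, so by Theorem 32.8.1 it cannot be a pure tensor, while any single outer product has rank at most one.

    import numpy as np
    e1, e2 = np.array([1., 0.]), np.array([0., 1.])
    T = np.outer(e1, e1) + np.outer(e2, e2)
    print(T)                                   # the 2 x 2 identity
    print(np.linalg.matrix_rank(T))            # 2 > 1, so T is not of the form v ⊗ w
    print(np.linalg.matrix_rank(np.outer(np.array([1., 2.]), np.array([3., 4., 5.]))))  # 1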

32.9 Tensor Products

To get a taste of how we can free tensors from the use of coordinates, suppose V is a vector space. We can write Vᵏ as a tensor product of vector spaces:

$$\bigotimes_{i=1}^{k} V = V^{\otimes k} = \underbrace{V \otimes V \otimes \cdots \otimes V}_{k \text{ times}}.$$

The tensor product has a natural vector space structure. We write the tensor product of vectors by

$$x_1 \otimes x_2 \otimes \cdots \otimes x_k$$

where each xi ∈ Rⁿ. The tensor product itself is k-linear, and elements of the tensor product are said to have order k. Now

$$\begin{aligned}
(\alpha x_1 + y_1) \otimes x_2 \otimes \cdots \otimes x_k &= \alpha(x_1 \otimes x_2 \otimes \cdots \otimes x_k) + y_1 \otimes x_2 \otimes \cdots \otimes x_k,\\
x_1 \otimes (\alpha x_2 + y_2) \otimes \cdots \otimes x_k &= \alpha(x_1 \otimes x_2 \otimes \cdots \otimes x_k) + x_1 \otimes y_2 \otimes \cdots \otimes x_k,\\
&\;\;\vdots\\
x_1 \otimes x_2 \otimes \cdots \otimes (\alpha x_k + y_k) &= \alpha(x_1 \otimes x_2 \otimes \cdots \otimes x_k) + x_1 \otimes x_2 \otimes \cdots \otimes y_k.
\end{aligned}$$

This shows that the tensor product is k-linear.

Finally, if A is linear on $\bigotimes_{i=1}^k \mathbb{R}^n$, then A(x1 ⊗ ··· ⊗ xk) is a k-linear function of (x1, ..., xk) because the tensor product is k-linear.

If we need to use coordinates, we can write $x_i = \sum_{j=1}^n x_{ij} e_j$. Expanding A, we obtain

$$A(x_1 \otimes \cdots \otimes x_k) = \sum_{j_1 \cdots j_k} x_{1 j_1} \cdots x_{k j_k}\, A(e_{j_1} \otimes \cdots \otimes e_{j_k}) = \sum_{j_1 \cdots j_k} a_{j_1 \cdots j_k}\, x_{1 j_1} \cdots x_{k j_k},$$

where each $a_{j_1 \cdots j_k} = A(e_{j_1} \otimes \cdots \otimes e_{j_k})$.

32.10 Covariance and Contravariance

We will touch briefly on covariant and contravariant tensors, in case you encounter them in the future. Suppose V is a vector space and V* its dual. A tensor of order (p, q) is in the tensor product

$$\underbrace{V \otimes \cdots \otimes V}_{p \text{ times}} \otimes \underbrace{V^* \otimes \cdots \otimes V^*}_{q \text{ times}} = V^{\otimes p} \otimes (V^*)^{\otimes q}.$$

It is p times contravariant and q times covariant, with total order p + q.

Among the things that can be done with a (1, 1)-tensor is to form an evaluation map, mapping x ⊗ f ↦ f(x). It just evaluates the linear functional f at the vector x and is a special case of a more general operation called contraction, which reduces a (p, q)-tensor to a (p−1, q−1)-tensor.

Of course, tensors in V⊗p ⊗ (V*)⊗q are often written as arrays with p superscripts (for contravariant coordinates) and q subscripts (for covariant coordinates), and summation occurs over all repeated indices. Thus $x^i f_j$ denotes x ⊗ f, while $x^i f_i$ is the evaluation map. A more complicated (2, 3)-tensor might have the form $T^{ij}_{k\ell m}$, and a contraction on it could be written $T^{ij}_{j\ell m}$, meaning $\sum_j T^{ij}_{j\ell m}$.
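Index notation and contraction are exactly what numpy's einsum implements; the sketch below (ours, not from the notes) forms the (1, 1)-tensor x ⊗ f and contracts the repeated index to recover the evaluation f(x).

    import numpy as np
    x = np.array([1., 2., 3.])                  # a vector (contravariant index)
    f = np.array([4., 5., 6.])                  # a covector (covariant index)
    tensor = np.einsum('i,j->ij', x, f)         # x ⊗ f, entries x^i f_j
    print(np.einsum('ii->', tensor))            # contraction x^i f_i = 32.0
    print(f @ x)                                # the same evaluation f(x)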

September 23, 2021

Copyright © 2021 by John H. Boyd III: Department of Economics, Florida International University, Miami, FL 33199