A Review of Linear Algebra, Vector Calculus, and Systems Theory

Throughout this book, methods and terminology from the area of mathematics known as linear algebra are used to facilitate analytical and numerical calculations. Linear algebra is concerned with objects that can be scaled and added together (i.e., vectors), the properties of sets of such objects (i.e., vector spaces), and special relationships between such sets (i.e., linear mappings expressed as matrices). This appendix begins by reviewing the most relevant results from linear algebra that are used elsewhere in the book. Section A.1 reviews the definition and properties of vectors, vector spaces, inner products, and norms. Section A.2 reviews matrices, norms, traces, determinants, etc. Section A.3 reviews the eigenvalue–eigenvector problem. Section A.4 reviews matrix decompositions. The theory of matrix perturbations is reviewed in Section A.5, the matrix exponential is reviewed in Section A.6, and Kronecker products and sums of matrices are reviewed in Section A.7. Whereas the emphasis in this appendix (and throughout the book) is on real vector spaces, Section A.8 discusses the complex case. Basic linear systems theory is reviewed in Section A.9. The concept of a product integral, which is important for defining Brownian motions in Lie groups, is covered in Section A.10. Building on linear-algebraic foundations, concepts from vector and matrix calculus are reviewed in Section A.11. Section A.12 presents exercises.

A.1 Vectors

The n-dimensional Euclidean space, Rn = R × R × ... × R (n times), can be viewed as the set of all “vectors” (i.e., column arrays consisting of n real numbers, xi ∈ R) of the form
\[
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.
\]
A very special vector is the zero vector, 0, which has entries that are all equal to the number zero.

A.1.1 Vector Spaces

When equipped with the operation of vector addition for any two vectors, x, y ∈ Rn,
\[
\mathbf{x} + \mathbf{y} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix},
\]
and scalar multiplication by any c ∈ R,
\[
c \cdot \mathbf{x} = \begin{bmatrix} c \cdot x_1 \\ c \cdot x_2 \\ \vdots \\ c \cdot x_n \end{bmatrix},
\]
it can be shown that eight properties hold. Namely:

\[
\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x} \quad \forall\ \mathbf{x}, \mathbf{y} \in \mathbb{R}^n \qquad (A.1)
\]
\[
(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z}) \quad \forall\ \mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbb{R}^n \qquad (A.2)
\]
\[
\mathbf{x} + \mathbf{0} = \mathbf{x} \quad \forall\ \mathbf{x} \in \mathbb{R}^n \qquad (A.3)
\]
\[
\exists\, {-\mathbf{x}} \in \mathbb{R}^n \ \text{for each}\ \mathbf{x} \in \mathbb{R}^n \ \text{s.t.}\ \mathbf{x} + (-\mathbf{x}) = \mathbf{0} \qquad (A.4)
\]
\[
\alpha \cdot (\mathbf{x} + \mathbf{y}) = \alpha \cdot \mathbf{x} + \alpha \cdot \mathbf{y} \quad \forall\ \alpha \in \mathbb{R}\ \text{and}\ \mathbf{x}, \mathbf{y} \in \mathbb{R}^n \qquad (A.5)
\]
\[
(\alpha + \beta) \cdot \mathbf{x} = \alpha \cdot \mathbf{x} + \beta \cdot \mathbf{x} \quad \forall\ \alpha, \beta \in \mathbb{R}\ \text{and}\ \mathbf{x} \in \mathbb{R}^n \qquad (A.6)
\]
\[
(\alpha \cdot \beta) \cdot \mathbf{x} = \alpha \cdot (\beta \cdot \mathbf{x}) \quad \forall\ \alpha, \beta \in \mathbb{R}\ \text{and}\ \mathbf{x} \in \mathbb{R}^n \qquad (A.7)
\]
\[
1 \cdot \mathbf{x} = \mathbf{x} \quad \forall\ \mathbf{x} \in \mathbb{R}^n. \qquad (A.8)
\]

Here the symbol ∃ means “there exists” and ∀ means “for all.” The “·” in the above equations denotes scalar–scalar and scalar–vector multiplication. The above properties each have names: (A.1) and (A.2) are respectively called commutativity and associativity of vector addition; (A.3) and (A.4) are respectively called the existence of an additive identity element and an additive inverse for each element; (A.5), (A.6), and (A.7) are three different kinds of distributive laws; and (A.8) refers to the existence of a scalar that serves as a multiplicative identity. These properties make (Rn, +, ·) a real vector space. Moreover, any abstract set, X, that is closed under the operations of vector addition and scalar multiplication and satisfies the above eight properties is a real vector space (X, +, ·). If the field1 over which properties (A.5)–(A.7) hold is extended to include complex numbers, then the result is a complex vector space. It is often convenient to decompose an arbitrary vector x ∈ Rn into a weighted sum of the form
\[
\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \ldots + x_n\mathbf{e}_n = \sum_{i=1}^{n} x_i \mathbf{e}_i.
\]

Here the scalar–vector multiplication, ·, is implied. That is, xiei = xi · ei. It is often convenient to drop the dot, because the scalar product of two vectors (which will be defined shortly) is also denoted with a dot. In order to avoid confusion, the dot in the scalar–vector multiplication is henceforth suppressed. Here the natural basis vectors ei are
\[
\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}; \quad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}; \quad \ldots \quad \mathbf{e}_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.
\]

1 This can be thought of as the real or complex numbers. More generally a field is an algebraic structure that is closed under addition, subtraction, multiplication, and division. For example, the rational numbers form a field.

This (or any) basis is said to span the whole vector space. And “the span” of this basis is Rn.

A.1.2 Linear Mappings and Isomorphisms

In some situations, it will be convenient to take a more general perspective. For example, when considering the tangent planes at two different points on a two-dimensional surface in three-dimensional Euclidean space, these two planes are not the same plane since they sit in space in two different ways, but they nonetheless have much in common. It is clear that scalar multiples of vectors in either one of these planes can be added together and the result will remain within that plane, and all of the other rules in (A.1)–(A.8) will also follow. By attaching an origin and coordinate system to each plane at the point where it meets the surface, all vectors tangent to the surface at that point form a vector space. If two of these planes are labeled as V1 and V2, it is clear that each one is “like” R2. In addition, given vectors in coordinate systems in either plane, it is possible to describe those two-dimensional vectors as three-dimensional vectors in the ambient three-dimensional space, E = R3, in which the surfaces sit. Both transformations between planes and from a plane into three-dimensional space are examples of linear transformations between two vector spaces of the form f : V → U, which are defined by the property

\[
f(a\mathbf{v}_1 + b\mathbf{v}_2) = a\, f(\mathbf{v}_1) + b\, f(\mathbf{v}_2) \qquad (A.9)
\]
for all a, b ∈ R (or C if V is complex), and vi ∈ V. Most linear transformations f : V1 → V2 that will be encountered in this book will be of the form f(x) = Ax where the dimensions of the matrix A are dim(V2) × dim(V1). (Matrices, as well as matrix–vector multiplication, are defined in Section A.2.) The concept of two planes being equivalent is made more precise by defining the more general concept of a vector-space isomorphism. Specifically, two vector spaces, V1 and V2, are isomorphic if there exists an invertible linear transformation between them. And this is reflected in the matrix A being invertible. (More about matrices will follow.) When two vector spaces are isomorphic, the notation V1 ≅ V2 is used.

A.1.3 The Scalar Product and Vector Norm

The scalar product (also called the inner product) of two vectors x, y ∈ Rn is defined as
\[
\mathbf{x} \cdot \mathbf{y} \doteq x_1 y_1 + x_2 y_2 + \ldots + x_n y_n = \sum_{i=1}^{n} x_i y_i.
\]
Sometimes it is more convenient to write x · y as (x, y). The comma in this notation is critical. With this operation, it becomes clear that xi = x · ei. Note that the inner product is linear in each argument. For example,

\[
\mathbf{x} \cdot (\alpha_1\mathbf{y}_1 + \alpha_2\mathbf{y}_2) = \alpha_1(\mathbf{x} \cdot \mathbf{y}_1) + \alpha_2(\mathbf{x} \cdot \mathbf{y}_2).
\]
Linearity in the first argument follows from the fact that the inner product is symmetric: x · y = y · x. The vector space Rn together with the inner-product operation is called an inner-product space. The norm of a vector can be defined using the inner product as
\[
\|\mathbf{x}\| \doteq \sqrt{(\mathbf{x}, \mathbf{x})}. \qquad (A.10)
\]

If x ≠ 0, this will always be a positive quantity, and for any c ∈ R,

\[
\|c\mathbf{x}\| = |c|\,\|\mathbf{x}\|. \qquad (A.11)
\]

The triangle inequality states that

\[
\|\mathbf{x} + \mathbf{y}\| \leq \|\mathbf{x}\| + \|\mathbf{y}\|. \qquad (A.12)
\]

This is exactly a statement (in vector form) of the ancient fact that the sum of the lengths of any two sides of a triangle can be no less than the length of the third side. Furthermore, the well-known Cauchy–Schwarz inequality states that

\[
(\mathbf{x}, \mathbf{y}) \leq \|\mathbf{x}\| \cdot \|\mathbf{y}\|. \qquad (A.13)
\]

This is used extensively throughout the rest of the book, and it is important to know where it comes from. The proof of the Cauchy–Schwarz inequality is actually quite straightforward. To start, define f(t)as

\[
f(t) = (\mathbf{x} + t\mathbf{y},\, \mathbf{x} + t\mathbf{y}) = \|\mathbf{x} + t\mathbf{y}\|^2 \geq 0.
\]

Expanding out the inner product results in a quadratic equation in t:

\[
f(t) = (\mathbf{x}, \mathbf{x}) + 2(\mathbf{x}, \mathbf{y})t + (\mathbf{y}, \mathbf{y})t^2 \geq 0.
\]

Since the minimum of f(t) occurs when f′(t) = 0 (i.e., when t = −(x, y)/(y, y)), the minimal value of f(t) is

\[
f(-(\mathbf{x}, \mathbf{y})/(\mathbf{y}, \mathbf{y})) = (\mathbf{x}, \mathbf{x}) - (\mathbf{x}, \mathbf{y})^2/(\mathbf{y}, \mathbf{y}) \quad \text{when} \quad \mathbf{y} \neq \mathbf{0}.
\]
Since f(t) ≥ 0 for all values of t, the Cauchy–Schwarz inequality follows. In the case when y = 0, the Cauchy–Schwarz inequality reduces to the equality 0 = 0. Alternatively, the Cauchy–Schwarz inequality is obtained for vectors in Rn from Lagrange’s equality [2]:
\[
\left(\sum_{k=1}^{n} a_k^2\right)\left(\sum_{k=1}^{n} b_k^2\right) - \left(\sum_{k=1}^{n} a_k b_k\right)^2 = \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} (a_i b_j - a_j b_i)^2 \qquad (A.14)
\]
by observing that the right-hand side of the equality is always non-negative. Lagrange’s equality can be proved by induction. The norm in (A.10) is often called the “2-norm” to distinguish it from the more general vector “p-norm”
\[
\|\mathbf{x}\|_p \doteq \left(\sum_{i=1}^{n} |x_i|^p\right)^{\frac{1}{p}}
\]
for 1 ≤ p ≤ ∞, which also satisfies (A.11) and (A.12). The vector space Rn together with the norm operation is called a normed vector space. Furthermore, if instead of vectors with real-valued entries, we consider vectors with complex-valued entries, then the inner product
\[
(\mathbf{x}, \mathbf{y}) \doteq \sum_{i=1}^{n} x_i \overline{y_i}
\]
can be defined, where for any c = a + b√−1, the notation c̄ = a − b√−1 defines the complex conjugate of c. In doing so, (A.10)–(A.13) all still hold, with |c| = √(c c̄) = √(a² + b²) replacing the absolute value in (A.11).

A.1.4 The Gram–Schmidt Orthogonalization Process

Let V = {v1, ..., vn} be a nonorthogonal basis for Rn. That is, any vector x ∈ Rn can be expressed as x = Σⁿᵢ₌₁ χi vi for an appropriate choice of real numbers χ1, χ2, ..., χn. Any collection of m ≤ n of these basis vectors, {v1, ..., vm}, spans an m-dimensional subspace of Rn. The Gram–Schmidt orthogonalization process constructs from V an orthonormal basis {u1, ..., un}, i.e., one satisfying ui · uj = δij. To begin, normalize v1:
\[
\mathbf{u}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}.
\]

Then define u2 by removing the part of v2 that is parallel to u1 and normalizing what remains:
\[
\mathbf{u}_2 = \frac{\mathbf{v}_2 - (\mathbf{v}_2 \cdot \mathbf{u}_1)\mathbf{u}_1}{\|\mathbf{v}_2 - (\mathbf{v}_2 \cdot \mathbf{u}_1)\mathbf{u}_1\|}.
\]

It is easy to verify that u1 · u1 = u2 · u2 = 1 and u1 · u2 = 0. The process can then be performed recursively by removing the parts of vi that are parallel to each of the {u1, ..., ui−1}. Then ui is defined as the normalized version of what remains. In other words, the following formula is used recursively for i = 2, 3, ..., n:

\[
\mathbf{u}_i = \frac{\mathbf{v}_i - \sum_{k=1}^{i-1} (\mathbf{v}_i \cdot \mathbf{u}_k)\mathbf{u}_k}{\left\|\mathbf{v}_i - \sum_{k=1}^{i-1} (\mathbf{v}_i \cdot \mathbf{u}_k)\mathbf{u}_k\right\|}. \qquad (A.15)
\]

This process is repeated until a full set of orthonormal basis vectors {u1, ..., un} is constructed. Gram–Schmidt orthogonalization works equally well on inner product spaces other than Rn, such as the space of square-integrable functions L2(R), where the dot product is replaced with the inner product defined for that space.
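As a concrete illustration of the recursion (A.15), the following minimal sketch (not from the original text; it assumes the NumPy library and the standard dot product on Rn) orthonormalizes the columns of a matrix whose columns are the vi:

```python
import numpy as np

def gram_schmidt(V):
    """Orthonormalize the columns of V using the recursion (A.15)."""
    n = V.shape[1]
    U = np.zeros_like(V, dtype=float)
    for i in range(n):
        # Remove the components of v_i parallel to u_1, ..., u_{i-1}.
        w = V[:, i].copy()
        for k in range(i):
            w -= (V[:, i] @ U[:, k]) * U[:, k]
        # Normalize what remains.
        U[:, i] = w / np.linalg.norm(w)
    return U

V = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U = gram_schmidt(V)
print(np.round(U.T @ U, 12))   # should print the 3x3 identity matrix
```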

A.1.5 Dual Spaces

Given the space of n-dimensional real column vectors, each of which is described by x ∈ Rn, it is possible to define the linear mapping
\[
a: \mathbb{R}^n \to \mathbb{R} \quad \text{where} \quad a(\mathbf{x}) \doteq \mathbf{a} \cdot \mathbf{x}
\]
for some fixed a ∈ Rn. This linear mapping can be written as a(x) = aT x where aT is an n-dimensional row vector, called the transpose of a. The fact that the functional a(x) is linear is clear, since

a(αx1 + βx2)=αa(x1)+βa(x2).

Furthermore, given two such functionals, a(x) and b(x), together with scalars α and β, it is possible to define the functional

(αa + βb)(x)=αaT x + βbT x.

That is, linear functionals can be scaled and added like vectors and the space of linear functionals “acts like” Rn. This is not surprising, because each linear functional a(x) is defined by an element a ∈ Rn. The space of all linear functionals is a vector space called the dual space of Rn, and can be thought of intuitively as the collection of all n-dimensional row vectors. If V = Rn and the dual space is denoted as V∗, then the inner product of two vectors in V instead can be thought of as a product between one vector in V and one in V∗. And if V has the basis {ei}, then V∗ has the basis {e∗i} such that e∗i ej = δij. In the present context when everything is real, the ∗ has no meaning other than transpose, but when the discussion is broadened to include vectors with complex entries, or infinite-dimensional spaces of functions, the value of the dual space concept becomes more apparent. For more, see [12, 23].

A.1.6 The Vector Product in R3

Given two vectors a, b ∈ R3, the vector product (or cross product) is defined as
\[
\mathbf{a} \times \mathbf{b} \doteq \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}.
\]

A real matrix is called skew-symmetric (or anti-symmetric) if its transpose is equal to its negative. Any 3 × 3 skew-symmetric matrix, S = −ST, can be written as
\[
S = \begin{pmatrix} 0 & -s_3 & s_2 \\ s_3 & 0 & -s_1 \\ -s_2 & s_1 & 0 \end{pmatrix}, \qquad (A.16)
\]

3 where s1, s2, and s3 can be viewed as the components of a vector s ∈ R , called the dual vector of S. The relationship between skew-symmetric matrices and the cross product is

Sx = s × x. (A.17)

The scalar triple product of three vectors, a, b, c ∈ R3, is
\[
\det[\mathbf{a}, \mathbf{b}, \mathbf{c}] \doteq \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}).
\]

This has the geometric meaning of the volume of the region in R3 defined by all vectors of the form x(u, v, w) = ua + vb + wc where (u, v, w) ∈ [0, 1] × [0, 1] × [0, 1]. The above concepts only apply to vectors in R3. R3 (viewed as a vector space), when augmented with the cross-product operation, becomes a new kind of space with richer structure. This is similar to the way in which an inner-product space or normed vector space is richer than a vector space.

A Lie algebra is a special kind of vector space, V, with an additional operation [·, ·], such that for any x, y, z ∈ V, [x, y] ∈ V and for any α ∈ C the following properties are satisfied:
\[
[\mathbf{x} + \mathbf{y}, \mathbf{z}] = [\mathbf{x}, \mathbf{z}] + [\mathbf{y}, \mathbf{z}] \qquad (A.18)
\]
\[
[\mathbf{z}, \mathbf{x} + \mathbf{y}] = [\mathbf{z}, \mathbf{x}] + [\mathbf{z}, \mathbf{y}] \qquad (A.19)
\]
\[
[\alpha\mathbf{x}, \mathbf{y}] = [\mathbf{x}, \alpha\mathbf{y}] = \alpha[\mathbf{x}, \mathbf{y}] \qquad (A.20)
\]
\[
[\mathbf{x}, \mathbf{y}] = -[\mathbf{y}, \mathbf{x}] \qquad (A.21)
\]
\[
\mathbf{0} = [\mathbf{x}, [\mathbf{y}, \mathbf{z}]] + [\mathbf{y}, [\mathbf{z}, \mathbf{x}]] + [\mathbf{z}, [\mathbf{x}, \mathbf{y}]]. \qquad (A.22)
\]
The first three of these properties, (A.18)–(A.20), are true for any “algebra,” whereas property (A.21) (which is called anti-symmetry) and (A.22) (which is called the Jacobi identity) turn the algebra into a Lie algebra. To distinguish a Lie algebra from a generic vector space, V, it is sometimes written as (V, [·, ·]). The operation [x, y] is called the Lie bracket of the vectors x and y. The property [x, x] = 0 follows automatically. Note that Lie algebras are not generally associative with respect to the Lie bracket operation. For example, R3 together with the cross product [a, b] = a × b makes R3 a Lie algebra. (See Exercise A.9(a).)
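A quick numerical spot-check of (A.16), (A.17), and the Jacobi identity (A.22) for the cross product on R3; this sketch assumes NumPy, and the function name skew is illustrative rather than standard:

```python
import numpy as np

def skew(s):
    """Skew-symmetric matrix S with dual vector s, as in (A.16)."""
    return np.array([[0.0, -s[2], s[1]],
                     [s[2], 0.0, -s[0]],
                     [-s[1], s[0], 0.0]])

rng = np.random.default_rng(0)
s, x, y, z = rng.standard_normal((4, 3))

# (A.17): S x = s x x
print(np.allclose(skew(s) @ x, np.cross(s, x)))

# Jacobi identity (A.22) for the bracket [a, b] = a x b
jac = (np.cross(x, np.cross(y, z))
       + np.cross(y, np.cross(z, x))
       + np.cross(z, np.cross(x, y)))
print(np.allclose(jac, 0.0))
```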

A.2 Matrices

An m × n matrix A is an array of real or complex numbers:
\[
A = [a_{ij}] \doteq \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.
\]
The numbers m and n are called the dimensions of A. The element (or entry) in the ith row and jth column of the m × n matrix A is denoted as aij. Likewise, the elements of any matrix denoted with an upper case letter are generally written as subscripted lower case letters. Sometimes it is convenient to write this as an array of n column vectors:
\[
A = [\mathbf{a}_1, \ldots, \mathbf{a}_n] \quad \text{where} \quad \mathbf{a}_i = \begin{pmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{pmatrix} \in \mathbb{R}^m \ \text{for} \ i = 1, 2, \ldots, n.
\]
Addition of two matrices with the same dimensions is defined by the scalar addition of elements with the same indices:

\[
A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}].
\]
Multiplication of a scalar and a matrix is defined as the matrix
\[
c \cdot A = [c \cdot a_{ij}] \doteq \begin{pmatrix} c \cdot a_{11} & c \cdot a_{12} & \cdots & c \cdot a_{1n} \\ c \cdot a_{21} & c \cdot a_{22} & \cdots & c \cdot a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c \cdot a_{m1} & c \cdot a_{m2} & \cdots & c \cdot a_{mn} \end{pmatrix}.
\]

The complex conjugate of a matrix A is the matrix consisting of the complex conjugate of all of its entries: Ā = [āij]. The transpose of a matrix A, denoted as AT, is the matrix resulting from interchanging the role of the rows and the columns:
\[
A^T \doteq \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}. \qquad (A.23)
\]

In other words, [aij]T = [aji]. The Hermitian conjugate of a matrix is the complex conjugate and transpose:

\[
A^* = [a_{ij}]^* \doteq \bar{A}^T = [\bar{a}_{ji}].
\]

A.2.1 Matrix Multiplication and the Trace

Given an m × n matrix A, and an n × p matrix B, the (i, j)th element of the product AB is defined as
\[
(AB)_{ij} \doteq \sum_{k=1}^{n} a_{ik} b_{kj}.
\]
The particular label for k is unimportant because it is summed over all values, i.e., k in the above equation can be replaced with l (or any other letter not already being used), and the meaning will be the same. When three square n × n matrices are multiplied, the (i, j)th element of the result is
\[
(ABC)_{ij} = \sum_{k=1}^{n} \sum_{l=1}^{n} a_{ik} b_{kl} c_{lj}.
\]
This can be broken up in two ways as
\[
(ABC)_{ij} = \sum_{k=1}^{n} a_{ik} \left( \sum_{l=1}^{n} b_{kl} c_{lj} \right) = \sum_{l=1}^{n} \left( \sum_{k=1}^{n} a_{ik} b_{kl} \right) c_{lj}.
\]
In terms of the original matrices, this is the associative law

ABC = A(BC)=(AB)C.

Note that the order of the matrices when written from left to right stays the same and so there is no need to compute a product between A and C. But there is nonetheless some choice in the way that the matrices can be grouped together when performing the constituent pairwise matrix products. When A is n × n, the trace of A is defined as

\[
\mathrm{trace}(A) \doteq \sum_{i=1}^{n} a_{ii}. \qquad (A.24)
\]

A convenient shorthand for this is tr(A). Note that for square matrices A and B with the same dimensions and a scalar, c,

\[
\mathrm{tr}(c \cdot A) = c \cdot \mathrm{tr}(A), \quad \mathrm{tr}(A + B) = \mathrm{tr}(A) + \mathrm{tr}(B), \quad \mathrm{tr}(AB) = \mathrm{tr}(BA), \quad \mathrm{tr}(A^T) = \mathrm{tr}(A). \qquad (A.25)
\]

Given n × n matrices A = [aij] and B = [bij], computing the product C = AB by the definition
\[
c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk}
\]
uses n multiplications and n − 1 additions for each fixed pair (i, k). Doing this for all i, k ∈ [1, n] then uses n3 scalar multiplications and n2(n − 1) additions. However, if the matrices have special structure, then this computational cost can be reduced tremendously. For example, the product of n × n diagonal matrices (i.e., A with aij = 0 when i ≠ j) can be computed using only n multiplications and no additions. Other methods for reducing the complexity of matrix multiplication even when they have no special structure also exist [20, 21].
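The trace identities (A.25) and the associativity of matrix multiplication are easy to verify numerically; a short sketch assuming NumPy (random matrices are used purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C = rng.standard_normal((3, 4, 4))
c = 2.5

print(np.isclose(np.trace(c * A), c * np.trace(A)))
print(np.isclose(np.trace(A + B), np.trace(A) + np.trace(B)))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))     # cyclic property
print(np.isclose(np.trace(A.T), np.trace(A)))
print(np.allclose(A @ (B @ C), (A @ B) @ C))            # associativity (up to round-off)
```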

A.2.2 The Determinant

A determinant of an n × n matrix, A = [a1, a2, ..., an], is a scalar-valued function, det A (also denoted as det(A) and |A|), with the following properties:

1. Multilinearity:
\[
\det[\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}, \alpha\mathbf{v} + \beta\mathbf{w}, \mathbf{a}_{i+1}, \ldots, \mathbf{a}_n] = \alpha \det[\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}, \mathbf{v}, \mathbf{a}_{i+1}, \ldots, \mathbf{a}_n] + \beta \det[\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}, \mathbf{w}, \mathbf{a}_{i+1}, \ldots, \mathbf{a}_n].
\]

2. Anti-symmetry

\[
\det[\mathbf{a}_1, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_n] = -\det[\mathbf{a}_1, \ldots, \mathbf{a}_j, \ldots, \mathbf{a}_i, \ldots, \mathbf{a}_n].
\]

3. Normalization:
\[
\det I = \det[\mathbf{e}_1, \ldots, \mathbf{e}_n] = 1.
\]

Here I = [δij] is the identity matrix, consisting of diagonal entries with a value of unity and all other entries with a value of zero. Similarly, O denotes the matrix with entries that are all zeros. It can be shown (see, e.g., [14]) that these three properties are satisfied by a single unique function which exists for every square matrix. Therefore, we refer to “the” determinant rather than “a” determinant. Furthermore, the above conditions could have been stated by decomposing A into rows rather than columns. It then becomes clear that row operations in which a multiple of one row is added to another (which correspond to the row version of #1 above) leave the determinant unchanged. The determinant function satisfying the above three properties can be defined by the Leibniz formula

\[
\det A \doteq \sum_{\sigma \in \Pi_n} \mathrm{sgn}(\sigma) \prod_{i=1}^{n} a_{i,\sigma(i)} = \sum_{j=1}^{n!} \mathrm{sgn}(\sigma_j) \prod_{i=1}^{n} a_{i,\sigma_j(i)} \qquad (A.26)
\]
where σ is a permutation2 of the numbers (1, 2, ..., n), and the sign (or signature) sgn(σ) = +1 for even permutations, and sgn(σ) = −1 for odd permutations.3 An even permutation is one defined by an even number of pairwise swaps, and an odd permutation is one defined by an odd number of pairwise swaps. Every permutation is either even or odd, but cannot be both. For example, the cyclic permutation

\[
(1, 2, \ldots, n) \to (n, 1, 2, \ldots, n-1)
\]
will be even when n is odd, and it will be odd when n is even, since it can be realized as the product (i.e., composition) of n − 1 pairwise swaps. As examples of permutations, the elements of Π3 are denoted as
\[
\sigma_0 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}; \quad \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}; \quad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix};
\]
\[
\sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}; \quad \sigma_4 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}; \quad \sigma_5 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}.
\]
These are not matrices. They represent assignments of the upper numbers to the lower ones. For example, σ1(1) = 2 and σ4(3) = 1. The signs of the permutations listed above are

sgn(σ0) = sgn(σ1) = sgn(σ2)=+1

sgn(σ3) = sgn(σ4) = sgn(σ5)=−1.
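For small n, the Leibniz formula (A.26) can be evaluated directly by enumerating all n! permutations and counting inversions to obtain sgn(σ). The following sketch (assuming NumPy and the Python standard library; leibniz_det is an illustrative name) compares the result against a library determinant:

```python
import itertools
import numpy as np

def leibniz_det(A):
    """Determinant via the Leibniz formula (A.26); only practical for small n."""
    n = A.shape[0]
    total = 0.0
    for sigma in itertools.permutations(range(n)):
        # sgn(sigma) = (-1)^(number of inversions)
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
        sign = -1.0 if inversions % 2 else 1.0
        total += sign * np.prod([A[i, sigma[i]] for i in range(n)])
    return total

A = np.random.default_rng(2).standard_normal((4, 4))
print(leibniz_det(A), np.linalg.det(A))   # the two values should agree
```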

The formula (A.26) is independent of the way we label the n! elements of Πn. Due to the factorial growth of |Πn|, (A.26) is not a practical method for computing the determinant of large matrices. In practice, properties #1 and #2 from the beginning of this subsection are used together with matrix decompositions and the fact that the determinant of a matrix is equal to the product of its eigenvalues (a fact that will be reviewed in Section A.3). Having said this, (A.26) can be useful. For example, if any column or row of A consists of all zeros, (A.26) indicates that det A = 0. The determinant has several very useful properties, listed below. If A and B are square matrices with the same dimensions,

det(AB) = det(A)det(B). (A.27)

If A is square, det(AT ) = det(A). (A.28) If A−1 exists (see next subsection), then

det(A−1)=1/det(A). (A.29)

If P is invertible, then

2 See Chapter 6 for a more detailed discussion of permutations.
3 The name Πn stands for “permutation group on n elements.” It is also called the “symmetric group” on n elements.

\[
\det(P^{-1}AP) = \det(A). \qquad (A.30)
\]
If A is m × m and B is n × n, then the (m + n) × (m + n) matrix
\[
A \oplus B \doteq \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}
\]

(called the direct sum) of A and B has the determinant

det(A ⊕ B) = det(A)det(B). (A.31)

Note also that trace(A ⊕ B) = trace(A) + trace(B). (A.32)

A.2.3 The Inverse of a Matrix

Given a square matrix A with det(A) ≠ 0, there exists a matrix A−1 called the inverse of A, which is the unique matrix such that

\[
AA^{-1} = A^{-1}A = I
\]
where I is the identity matrix with the same dimensions as A. The set of all invertible n × n matrices with complex (or real) entries is denoted as GL(n, C) (or GL(n, R)). It follows that when the inverse of the product AB exists, it must have the property that (AB)−1(AB) = I. But since det(AB) = det(A) det(B), the necessary and sufficient conditions for (AB)−1 to exist are that det(A) ≠ 0 and det(B) ≠ 0. If these conditions hold, then from the associativity of matrix multiplication,

(B−1A−1)(AB)=(B−1(A−1A)B)=B−1B = I.

From the uniqueness property of the inverse of a matrix, it follows that

(AB)−1 = B−1A−1. (A.33)

A similar-looking rule, called the transpose rule, states that

(AB)T = BT AT and (AB)∗ = B∗A∗. (A.34)

It can be shown that when the inverse of a matrix exists, the transpose and Hermitian conjugate operations commute with the inverse:

(A−1)T =(AT )−1 and (A−1)∗ =(A∗)−1.

Sometimes the abbreviation A−T is used to denote the combination of transpose and inverse. It is useful to compute the inverse of the sum of two matrices. The following identity is derived in [8]:

\[
(A + B^T C)^{-1} = A^{-1} - A^{-1}B^T (I + C A^{-1} B^T)^{-1} C A^{-1} \qquad (A.35)
\]
under the assumption that A + BTC, A, and I + CA−1BT are all invertible. Following [8], (A.35) is proved by first observing that

I =(A + BT C)−1(A + BT C) and expanding the right side so that

I =(A + BT C)−1A +(A + BT C)−1BT C.

Then multiplying both sides on the right by A−1,

A−1 =(A + BT C)−1 +(A + BT C)−1BT CA−1, (A.36) which can be rearranged as

(A + BT C)−1BT CA−1 = A−1 − (A + BT C)−1.

Following [8], and returning to (A.36) and multiplying both sides on the right by BT gives

A−1BT =(A+BT C)−1BT +(A+BT C)−1BT CA−1BT =(A+BT C)−1BT (I+CA−1BT ).

Multiplying the first and last terms in the above double equality by (I+CA−1BT )−1CA−1 on their right sides gives

A−1BT (I + CA−1BT )−1CA−1 =(A + BT C)−1BT CA−1.

But the right side of this expression is the same as the rightmost term in (A.36). Therefore, making this substitution results in (A.35). A similar identity is stated in [15] as

\[
(A + S B T^T)^{-1} = A^{-1} - A^{-1} S (B^{-1} + T^T A^{-1} S)^{-1} T^T A^{-1}, \qquad (A.37)
\]
and is left as an exercise to prove.
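Identities like (A.35) and (A.37) lend themselves to quick numerical checks. The sketch below (NumPy assumed; the diagonal shift is only there to keep the random matrices well conditioned) verifies (A.35):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned random A
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + B.T @ C)
rhs = Ainv - Ainv @ B.T @ np.linalg.inv(np.eye(n) + C @ Ainv @ B.T) @ C @ Ainv
print(np.allclose(lhs, rhs))   # (A.35)
```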

A.2.4 Pseudo-Inverses and Null Spaces

In some applications it happens that a relationship of the form

\[
J\mathbf{x} = \mathbf{b} \qquad (A.38)
\]
is presented, where x ∈ Rn, b ∈ Rm, and J ∈ Rm×n with m ≠ n, and the goal is “to find as good of an approximation to a solution as possible.” If m < n, then in general infinitely many exact solutions will exist; if m > n, then in general no solution will exist. However, in both of these situations it can happen that if x and b are specially chosen vectors, then unique solutions can exist. In the general case when m < n, the solution of (A.38) that minimizes the cost C1 = xTWx, for a symmetric positive-definite weighting matrix W = WT ∈ Rn×n, is

\[
\mathbf{x} = J_W^{+}\mathbf{b} \quad \text{where} \quad J_W^{+} \doteq W^{-1}J^T (J W^{-1} J^T)^{-1}. \qquad (A.39)
\]

The matrix JW+ is called the weighted pseudo-inverse of J with weighting matrix W. In the event that rank(J) < m, the matrix JW−1JT is not invertible, and a damped version of this pseudo-inverse, of the form JW,ε+ = W−1JT(JW−1JT + εI)−1, can be used for some small ε > 0. Typically, the larger the value of ε, the worse the approximate solution will be. The null space projector matrix for J is defined as
\[
N_{J,W} \doteq I - J_W^{+} J. \qquad (A.40)
\]
When J is full rank, then the solution JW+ b can be thought of as rejecting the contribution of any of the columns of NJ,W because
\[
N_{J,W} J_W^{+} = [I - W^{-1}J^T(JW^{-1}J^T)^{-1}J]\,[W^{-1}J^T(JW^{-1}J^T)^{-1}]
\]
\[
= W^{-1}J^T(JW^{-1}J^T)^{-1} - W^{-1}J^T(JW^{-1}J^T)^{-1}(JW^{-1}J^T)(JW^{-1}J^T)^{-1}
\]
\[
= W^{-1}J^T(JW^{-1}J^T)^{-1} - W^{-1}J^T(JW^{-1}J^T)^{-1} = O,
\]
and similarly J NJ,W = O. In the case when m > n, the “best” approximate solution to (A.38) can be obtained as that which minimizes a cost function of the form C2 = (Jx − b)T M (Jx − b) where M = MT ∈ Rm×m is chosen to be positive definite. This result is denoted as

\[
\mathbf{x} = J_M^{\dagger}\mathbf{b} \quad \text{where} \quad J_M^{\dagger} = (J^T M J)^{-1} J^T M. \qquad (A.41)
\]
This will not solve the equation, but will provide an approximation that minimizes the cost C2, as long as J has rank n. In the event that the rank of J is less than n, then a damped version of this pseudo-inverse also exists, and can be written in the form
\[
J_{M,\epsilon}^{\dagger} = (J^T M J + \epsilon I)^{-1} J^T M.
\]
All of the versions of these pseudo-inverses have practical applications. For example, see [4] for applications to the design of robotic manipulator arms.
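A small sketch of the weighted pseudo-inverse (A.39) and the weighted least-squares pseudo-inverse (A.41), assuming NumPy; the particular choices of W and M here are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(4)

# Underdetermined case (m < n): minimum weighted-norm solution, (A.39).
J = rng.standard_normal((2, 5))
b = rng.standard_normal(2)
W = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])
Winv = np.linalg.inv(W)
Jw_plus = Winv @ J.T @ np.linalg.inv(J @ Winv @ J.T)
x = Jw_plus @ b
print(np.allclose(J @ x, b))           # x satisfies Jx = b exactly

# Overdetermined case (m > n): weighted least squares, (A.41).
J2 = rng.standard_normal((6, 3))
b2 = rng.standard_normal(6)
M = np.eye(6)
J_dag = np.linalg.inv(J2.T @ M @ J2) @ J2.T @ M
x2 = J_dag @ b2
print(np.allclose(x2, np.linalg.lstsq(J2, b2, rcond=None)[0]))   # matches least squares when M = I
```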

A.2.5 Special Kinds of Matrices

Many special kinds of matrices are defined in terms of the transpose and Hermitian conjugate. For example, a symmetric matrix is one for which A = AT. A skew-symmetric matrix is one for which A = −AT. An orthogonal matrix is one for which AAT = I. A Hermitian matrix is one for which A = A∗. A skew-Hermitian matrix is one for which A = −A∗. A unitary matrix is one for which AA∗ = I. All of these are examples of normal matrices, which have the property that AA∗ = A∗A. The properties of the determinant immediately indicate that for all unitary matrices

det(AA∗) = det(A)det(A∗) = det(I)=1.

Furthermore, since the determinant is unchanged under transposition, det(A∗) = det(Ā). But since the operation of complex conjugation distributes over scalar multiplication and addition, it must also do so for the determinant (because (A.26) is a combination of products and sums of the scalar elements), so that det(Ā) is the complex conjugate of det(A). Therefore, if A is unitary, det(A) multiplied by its complex conjugate equals unity, indicating that the determinant of a unitary matrix is of the form det(A) = eiθ for some θ ∈ [0, 2π). For real orthogonal matrices, |det(A)|2 = 1, indicating that det(A) = ±1. Two “special” sets of matrices that are encountered frequently are

\[
SO(n) = \{A \mid AA^T = I \ \text{and} \ \det(A) = +1\}
\]
and
\[
SU(n) = \{A \mid AA^* = I \ \text{and} \ \det(A) = +1\}.
\]
These are respectively the “special orthogonal” and “special unitary” matrices. A rotation in n-dimensional space is described with a special orthogonal matrix. In the case of three-dimensional rotation about a fixed axis by an angle φ, the rotation only has one degree of freedom. In particular, for counterclockwise rotations about the e1, e2, and e3 axes:
\[
R_1(\phi) \doteq \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi & \cos\phi \end{pmatrix} \qquad (A.42)
\]
\[
R_2(\phi) \doteq \begin{pmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{pmatrix} \qquad (A.43)
\]
\[
R_3(\phi) \doteq \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad (A.44)
\]
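The elementary rotation matrices (A.42)–(A.44) translate directly into code; a sketch assuming NumPy, with a check that their products lie in SO(3):

```python
import numpy as np

def R1(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def R2(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def R3(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

R = R3(0.3) @ R2(-1.1) @ R1(2.0)
print(np.allclose(R @ R.T, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))  # R is in SO(3)
```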

A.2.6 Matrix Norms

Throughout this text, the Frobenius norm of a square n × n matrix is used:

\[
\|A\| \doteq \left( \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^2 \right)^{\frac{1}{2}} = \left( \mathrm{tr}(AA^*) \right)^{\frac{1}{2}}. \qquad (A.45)
\]

A can have either real or complex entries, and |aij| is interpreted as the absolute value of a real number, or the modulus of a complex number. When A is infinite-dimensional, ‖A‖ is also called the Hilbert–Schmidt norm. In general a matrix norm must satisfy the properties

\[
\|A + B\| \leq \|A\| + \|B\| \qquad (A.46)
\]
\[
\|cA\| = |c|\,\|A\| \qquad (A.47)
\]
\[
\|A\| \geq 0 \qquad (A.48)
\]
\[
\|A\| = 0 \iff A = O. \qquad (A.49)
\]

The last property above is referred to as positive definiteness. In addition, the Frobenius norm has the sub-multiplicative property:

\[
\|AB\| \leq \|A\|\,\|B\|; \qquad (A.50)
\]
it is invariant under Hermitian conjugation:

\[
\|A^*\| = \|A\|;
\]
and is invariant under products with arbitrary unitary matrices of the same dimensions as A:
\[
\|AU\| = \|UA\| = \|A\|.
\]
Throughout the text, ‖·‖ denotes the Frobenius norm. This norm is easy to compute, and has many nice properties such as those mentioned above. However, one desirable property of norms to use in the analysis of limiting behaviors of Fourier transforms of pdfs on Lie groups in Volume 2 is that the norm of the identity matrix should be equal to unity. Unfortunately, the Frobenius norm returns ‖I‖ = √n. And if we define a rescaled norm ‖A‖/√n, then this new norm does not have the sub-multiplicative property. For this reason, in some problems it is useful to use an alternative norm, |·|, that possesses both the sub-multiplicative property |AB| ≤ |A|·|B| and |I| = 1. An infinite number of such norms exist. These are the induced norms:

\[
\|A\|_p \doteq \max_{\mathbf{x} \neq \mathbf{0}} \frac{\|A\mathbf{x}\|_p}{\|\mathbf{x}\|_p} \quad \text{where} \quad \|\mathbf{x}\|_p \doteq \left( \sum_{i=1}^{n} |x_i|^p \right)^{\frac{1}{p}}.
\]
The matrix and vector p-norms are said to be consistent with each other in the sense that ‖Ax‖p ≤ ‖A‖p ‖x‖p. More generally, any norm that remains sub-multiplicative when applied to the product of matrices with any compatible dimensions (including matrix–vector products) is called consistent. Three examples are the induced 1-norm, 2-norm, and ∞-norm:
\[
\|A\|_1 \doteq \max_{1 \leq j \leq n} \sum_{i=1}^{n} |a_{ij}|,
\]

\[
\|A\|_2 \doteq \max_{\mathbf{x} \neq \mathbf{0}} \left( \frac{\mathbf{x}^* A^* A \mathbf{x}}{\mathbf{x}^* \mathbf{x}} \right)^{\frac{1}{2}},
\]
and
\[
\|A\|_\infty \doteq \max_{1 \leq i \leq n} \sum_{j=1}^{n} |a_{ij}|.
\]

The trouble is, with the exception of ‖·‖2, these norms are not invariant under products with unitary matrices and under Hermitian conjugation. Therefore, as a second “special” norm for use in the analysis of probability problems on groups, the following notation will be used:
\[
|A| \doteq \|A\|_2 \qquad (A.51)
\]
which has the properties

|A| = |AU| = |UA| = |A∗| and |U| =1 ∀ U ∈ SU(n). (A.52)

One final set of norms that have some of the desirable properties are
\[
\|A\|_R \doteq \max_{1 \leq i \leq n} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right)^{\frac{1}{2}} = \max_{1 \leq i \leq n} \left( \mathbf{e}_i^T A A^* \mathbf{e}_i \right)^{\frac{1}{2}} \qquad (A.53)
\]
and
\[
\|A\|_L \doteq \max_{1 \leq j \leq n} \left( \sum_{i=1}^{n} |a_{ij}|^2 \right)^{\frac{1}{2}} = \max_{1 \leq j \leq n} \left( \mathbf{e}_j^T A^* A \mathbf{e}_j \right)^{\frac{1}{2}}. \qquad (A.54)
\]

To see that ‖A‖R is a norm, simply evaluate the following:
\[
\|A + B\|_R = \max_{1 \leq i \leq n} \left( \sum_{j=1}^{n} |a_{ij} + b_{ij}|^2 \right)^{\frac{1}{2}}
\leq \max_{1 \leq i \leq n} \left[ \left( \sum_{j=1}^{n} |a_{ij}|^2 \right)^{\frac{1}{2}} + \left( \sum_{j=1}^{n} |b_{ij}|^2 \right)^{\frac{1}{2}} \right]
\]
\[
\leq \max_{1 \leq i \leq n} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right)^{\frac{1}{2}} + \max_{1 \leq i \leq n} \left( \sum_{j=1}^{n} |b_{ij}|^2 \right)^{\frac{1}{2}}
= \|A\|_R + \|B\|_R.
\]
Clearly ‖cA‖R = |c|·‖A‖R, and it is positive definite. And while it does not appear to be sub-multiplicative, it does have some other useful properties under multiplication:
\[
\|AB\|_R = \max_{1 \leq i \leq n} \left( \sum_{j=1}^{n} \left| \sum_{k=1}^{n} a_{ik} b_{kj} \right|^2 \right)^{\frac{1}{2}}
\leq \max_{1 \leq i \leq n} \left( \sum_{j=1}^{n} \left( \sum_{k=1}^{n} |a_{ik}|^2 \right) \left( \sum_{l=1}^{n} |b_{lj}|^2 \right) \right)^{\frac{1}{2}} \qquad (A.55)
\]
\[
= \max_{1 \leq i \leq n} \left( \sum_{k=1}^{n} |a_{ik}|^2 \right)^{\frac{1}{2}} \left( \sum_{j,l=1}^{n} |b_{lj}|^2 \right)^{\frac{1}{2}}
= \|A\|_R \cdot \|B\|.
\]

And so
\[
\|AB\|_R \leq \|A\|_R \cdot \|B\|. \qquad (A.56)
\]
If, in the above derivation at the point where the Cauchy–Schwarz inequality is used at (A.55), the weaker condition

\[
\sum_{l=1}^{n} \sum_{j=1}^{n} |b_{lj}|^2 \leq n \cdot \max_{1 \leq l \leq n} \sum_{j=1}^{n} |b_{lj}|^2
\]
is substituted, then the following inequality results:
\[
\|AB\|_R \leq \sqrt{n}\, \|A\|_R \cdot \|B\|_R. \qquad (A.57)
\]

Similar arguments yield analogous relationships for ‖AB‖L. Note that ‖AU‖R = ‖A‖R is “right” invariant under multiplication by a unitary matrix and ‖UA‖L = ‖A‖L is “left” invariant. Unlike the Frobenius norm ‖·‖ and the induced two-norm |·|, these norms are not bi-invariant. However, they are not as costly to compute as |·|, yet have the nice property that ‖I‖L = ‖I‖R = 1. In fact, there are an infinite number of possible matrix norms (just as there are an infinite number of vector norms), but the additional useful properties (A.52) are usually not satisfied by other norms, making ‖·‖ and |·| particularly useful in the context of the problems addressed in this book and its companion volume. A set of norms that do not have the complete set of desired properties, but are nonetheless useful in some circumstances, is

\[
\|A\|_p \doteq \left( \sum_{i=1}^{n} \sum_{j=1}^{n} |a_{ij}|^p \right)^{\frac{1}{p}} \qquad (A.58)
\]
for 1 ≤ p ≤ ∞.

Finally, note that for any consistent norm,
\[
\|\mathbf{x}\| = \|I\mathbf{x}\| \leq \|I\| \cdot \|\mathbf{x}\| \implies 1 \leq \|I\|.
\]
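The norms discussed in this subsection can be compared on a random matrix; a sketch assuming NumPy (the variable names fro, ind2, nR, and nL are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))

fro = np.linalg.norm(A, 'fro')                         # Frobenius norm (A.45)
ind2 = np.linalg.norm(A, 2)                            # induced 2-norm |A|, (A.51)
nR = np.max(np.sqrt(np.sum(np.abs(A) ** 2, axis=1)))   # ||A||_R, (A.53): largest row length
nL = np.max(np.sqrt(np.sum(np.abs(A) ** 2, axis=0)))   # ||A||_L, (A.54): largest column length

I = np.eye(4)
print(np.linalg.norm(I, 'fro'))   # sqrt(n), not 1
print(np.linalg.norm(I, 2))       # 1
print(ind2 <= fro, nR <= fro, nL <= fro)
```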

A.2.7 Matrix Inequalities

The Cauchy–Schwarz inequality for vectors as presented in (A.13) can be extended to matrices in several ways. For example, if A, B ∈ Rn×n, then [17]

[tr(AT B)]2 ≤ tr(AT A)tr(BT B) (A.59)

(with equality if and only if B = αA for α ∈ R), and

\[
\mathrm{tr}[(A^T B)^2] \leq \mathrm{tr}[(A^T A)(B^T B)] \qquad (A.60)
\]

(with equality if and only if ABT = BAT ). As a consequence,

\[
\mathrm{tr}(A^2) \leq \|A\|^2
\]
with equality holding if and only if A = AT.

A.3 Eigenvalues and Eigenvectors

Given a square matrix, A, with real or complex entries, an eigenvector is any unit vector v such that
\[
A\mathbf{v} = \lambda\mathbf{v} \qquad (A.61)
\]
where λ is a scalar called the eigenvalue. These can be computed as the roots of the characteristic polynomial equation

det(A − λI)=0. (A.62) From the definition of a consistent norm, it follows that

\[
\|A\| \cdot \|\mathbf{v}\| \geq \|A\mathbf{v}\| = \|\lambda\mathbf{v}\| = |\lambda| \cdot \|\mathbf{v}\| \implies \|A\| \geq |\lambda|.
\]

If A is n×n, then there will always be n eigenvalues, though they may not be unique. For example, the identity matrix has the eigenvalues λk = 1 for k =1, ..., n. In contrast, it does not have to be the case that a matrix has n eigenvectors. (For example, see the discussion of Jordan blocks in Section A.4.1.) It can be shown that the trace and determinant of a matrix can be expressed in terms of its eigenvalues as

\[
\mathrm{tr}(A) = \sum_{i=1}^{n} \lambda_i \quad \text{and} \quad \det(A) = \prod_{i=1}^{n} \lambda_i. \qquad (A.63)
\]
In general, the eigenvalues of a matrix will be complex numbers, even if the matrix has real entries. However, in some matrices with special structure, the eigenvalues will be real. Below, the eigenvalues in the case when A = A∗ ∈ Cn×n are examined, and special properties of the eigenvectors are derived. The results presented hold as a special case when A = AT ∈ Rn×n.

Theorem A.1. Eigenvalues of Hermitian matrices are real.

Proof: To show that something is real, all that is required is to show that it is equal to its own complex conjugate. Recall that given a, b ∈ R, a complex number c = a + b√−1 has a conjugate c̄ = a − b√−1. If c = c̄, then b = 0 and therefore c = a is real. The complex conjugate of a vector or matrix is just the complex conjugate of its elements. Furthermore, the complex conjugate of a product is the product of the complex conjugates. Therefore, given Aui = λiui, applying the Hermitian conjugate to both sides yields
\[
\mathbf{u}_i^* A^* = \bar{\lambda}_i \mathbf{u}_i^*.
\]
In this derivation A∗ = A because A is Hermitian, and so making this substitution and multiplying on the right by ui gives
\[
\mathbf{u}_i^* A \mathbf{u}_i = \bar{\lambda}_i \mathbf{u}_i^* \mathbf{u}_i.
\]
In contrast, starting with Aui = λiui and multiplying on the left by u∗i gives
\[
\mathbf{u}_i^* A \mathbf{u}_i = \lambda_i \mathbf{u}_i^* \mathbf{u}_i.
\]
Since the left sides of both of these equations are the same, the right sides can be equated to give
\[
\bar{\lambda}_i \mathbf{u}_i^* \mathbf{u}_i = \lambda_i \mathbf{u}_i^* \mathbf{u}_i.
\]
Dividing by u∗i ui, which is a positive real number, gives λ̄i = λi, which means that the imaginary part of λi is zero, or equivalently λi ∈ R.

Theorem A.2. Eigenvectors of Hermitian matrices corresponding to distinct eigenvalues are orthogonal.

Proof: Given a Hermitian matrix A and two of its eigenvalues λi ≠ λj with corresponding eigenvectors ui and uj, by definition the following is true: Aui = λiui and Auj = λjuj. Multiplying the first of these on the left by u∗j, and multiplying the second one on the left by u∗i, these become
\[
\mathbf{u}_j^* A \mathbf{u}_i = \lambda_i \mathbf{u}_j^* \mathbf{u}_i \quad \text{and} \quad \mathbf{u}_i^* A \mathbf{u}_j = \lambda_j \mathbf{u}_i^* \mathbf{u}_j.
\]

Taking the Hermitian conjugate of the second expression gives

\[
\mathbf{u}_j^* A^* \mathbf{u}_i = \bar{\lambda}_j \mathbf{u}_j^* \mathbf{u}_i.
\]

But since A is Hermitian, A∗ = A, and from Theorem A.1, λ̄j = λj. Therefore, combining the above yields
\[
\lambda_j \mathbf{u}_j^* \mathbf{u}_i = \lambda_i \mathbf{u}_j^* \mathbf{u}_i.
\]
Subtracting one from the other then gives
\[
(\lambda_i - \lambda_j)\, \mathbf{u}_j^* \mathbf{u}_i = 0.
\]
Since λi ≠ λj, division by their difference yields u∗j ui = 0, which is a statement of orthogonality. Any invertible n × n matrix consisting of real or complex entries which has distinct eigenvalues (i.e., none of the eigenvalues are repeated) can be written as

\[
A[\mathbf{u}_1, \ldots, \mathbf{u}_n] = [\lambda_1\mathbf{u}_1, \ldots, \lambda_n\mathbf{u}_n] \quad \text{or} \quad AU = U\Lambda,
\]
where
\[
U = [\mathbf{u}_1, \ldots, \mathbf{u}_n] \quad \text{and} \quad \Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}.
\]
In other words, A can be decomposed as

A = UΛU−1. (A.64)

In fact, this would be true even if there are repeated eigenvalues (as long as there are n linearly independent eigenvectors), but this will not be proven here. In the special case when A = A∗, Theorems A.1 and A.2 above indicate that Λ will have all real entries, and U will be unitary, and so U −1 = U ∗. In the case when A has real entries and A = AT , then U becomes a real orthogonal matrix. An n × n positive-definite Hermitian matrix, A, is one for which

\[
\mathbf{x}^* A \mathbf{x} \geq 0 \quad \text{for all} \quad \mathbf{x} \in \mathbb{C}^n
\]
with equality holding only when x = 0. This is equivalent to all of the eigenvalues being positive.
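The statements of Theorems A.1 and A.2, the decomposition (A.64) with U unitary, and the relations (A.63) can all be spot-checked for a randomly generated Hermitian matrix; a sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(6)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                      # A is Hermitian by construction

lam, U = np.linalg.eigh(A)              # eigh is specialized to Hermitian/symmetric matrices
print(np.allclose(lam.imag, 0.0))       # eigenvalues are real (Theorem A.1)
print(np.allclose(U.conj().T @ U, np.eye(4)))          # orthonormal eigenvectors (Theorem A.2)
print(np.allclose(U @ np.diag(lam) @ U.conj().T, A))   # A = U Lambda U*
print(np.isclose(np.trace(A).real, lam.sum()),
      np.isclose(np.linalg.det(A).real, np.prod(lam)))  # (A.63)
```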

A.4 Matrix Decompositions

A.4.1 Jordan Blocks and the Jordan Decomposition

A Jordan block corresponding to a k-fold repeated eigenvalue is a k × k matrix with the repeated eigenvalue on its diagonal and the number 1 on the superdiagonal. For example,
\[
J_2(\lambda) = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}, \qquad
J_3(\lambda) = \begin{pmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{pmatrix},
\]
etc. Jordan blocks are degenerate in the sense that (to within a sign change) the unit vector e1 ∈ Rk is the only eigenvector of Jk(λ), and the corresponding eigenvalue is λ repeated k times: [Jk(λ)]e1 = λe1. The determinant and trace of a Jordan block are respectively

\[
\det[J_k(\lambda)] = \lambda^k \quad \text{and} \quad \mathrm{tr}[J_k(\lambda)] = k \cdot \lambda.
\]

From this it is clear that Jk(λ) is invertible when λ ≠ 0. The notation niJi stands for the ni-fold direct sum of Ji with itself:
\[
n_i J_i \doteq J_i \oplus J_i \oplus \ldots \oplus J_i = \begin{pmatrix} J_i & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & J_i \end{pmatrix}.
\]

The notation
\[
\bigoplus_{i=1}^{m} A_i \doteq A_1 \oplus A_2 \oplus \ldots \oplus A_m
\]
can be useful shorthand. Note that Jk(λ) is a k × k matrix, or equivalently dim(Jk) = k. Therefore,
\[
\dim\left( \bigoplus_{i=1}^{m} A_i \right) = \sum_{i=1}^{m} \dim A_i.
\]
Every matrix A ∈ Cn×n can be written in the form

\[
A = T J T^{-1} \qquad (A.65)
\]
where T is an invertible matrix and
\[
J = \bigoplus_{j=1}^{q} \bigoplus_{i=1}^{m} n_i^j\, J_i(\lambda_j)
\]
is the direct sum of a direct sum of Jordan blocks, with m being the dimension of the largest Jordan block, q being the number of different eigenvalues, and n_i^j being the number of times Ji(λj) is repeated in the decomposition of A. Note that

\[
\sum_{j=1}^{q} \sum_{i=1}^{m} i \cdot n_i^j = \dim(A).
\]

The matrix J in (A.65) is called the Jordan canonical form of A. For instance, if

J = J1(λ1) ⊕ J1(λ2) ⊕ J2(λ3) ⊕ J2(λ3) ⊕ J3(λ4) ⊕ J5(λ5) ⊕ J6(λ5),

then m = 6, q = 5, and all values of n_i^j are zero except for

\[
n_1^1 = n_1^2 = n_3^4 = n_5^5 = n_6^5 = 1; \qquad n_2^3 = 2.
\]

A.4.2 Decompositions into Products of Special Matrices

When linear algebraic operations are performed on a computer, a number of other matrix decompositions are useful. Several of these are reviewed here. For more details see [11, 13, 22].

(QR Decomposition) [13]: For any n × n matrix, A, with complex entries, it is possible to find an n × n unitary matrix Q such that

\[
A = QR
\]
where R is upper triangular. In the case when A is real, it is possible to take Q ∈ SO(n) and R real.

(Cholesky Decomposition) [13]: Let B be an n × n complex matrix that is decomposable as B = A∗A for some n × n complex matrix A. Then B can be decomposed as B = LL∗ where L is lower triangular with non-negative diagonal entries.

(Schur Triangularization) [13]: For any A ∈ Cn×n, it is possible to find a matrix U ∈ SU(n) such that U∗AU = T where T is upper (or lower) triangular with the eigenvalues of A on its diagonal. Note: This does not mean that for real A that U will necessarily be real orthogonal. For example, if A = −AT, then QTAQ for Q ∈ O(n) will also be skew-symmetric and hence cannot be upper triangular.

(Unitary Diagonalizability of Normal Matrices) [13]: Recall that a normal matrix is one for which AA∗ = A∗A. Any such matrix can be decomposed as A = UΛU∗ where UU∗ = I. Examples of normal matrices include Hermitian, skew-Hermitian, unitary, real orthogonal, and real symmetric matrices.

(Singular Value Decomposition or SVD) [11]: For any real m × n matrix A, there exist orthogonal matrices U ∈ O(m) and V ∈ O(n) such that

\[
A = U \Lambda V^T
\]
where Λ is an m × n matrix with real entries Λij = σi δij. The value σi is called the ith largest singular value of A. If A has complex entries, then the decomposition becomes

\[
A = U \Lambda V^*
\]
with U and V unitary rather than orthogonal.

(Polar Decomposition): Every real square matrix A can be written as the product of a symmetric matrix S1 = UΛUT and the orthogonal matrix

R = UV T , (A.66)

or as the product of R and S2 = VΛVT. Hence it is possible to write

A = S1R = RS2. (A.67)

In the case when det A ≠ 0, R can be calculated as

\[
R = A (A^T A)^{-\frac{1}{2}} \qquad (A.68)
\]

(the negative fractional root makes sense for a symmetric positive definite matrix). Note that (A.66) is always a stable numerical technique for finding R, whereas (A.68) becomes unstable as det(A) becomes small.
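A sketch of the polar decomposition (A.66)–(A.67) computed from the SVD, assuming NumPy; computing R this way corresponds to (A.66) rather than the formula (A.68):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((3, 3))

U, sigma, Vt = np.linalg.svd(A)          # A = U Lambda V^T
R = U @ Vt                               # (A.66): the orthogonal factor
S1 = U @ np.diag(sigma) @ U.T            # symmetric factor in A = S1 R
S2 = Vt.T @ np.diag(sigma) @ Vt          # symmetric factor in A = R S2

print(np.allclose(A, S1 @ R), np.allclose(A, R @ S2))   # (A.67)
print(np.allclose(R @ R.T, np.eye(3)))                  # R is orthogonal
```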

(LU-Decomposition) [11]: For n × n real matrices such that det(Ai) ≠ 0 for i = 1, ..., n,4 it is possible to write A = LU where L is a unique lower triangular matrix and U is a unique upper triangular matrix. See the classic references [10, 29] for other general properties of matrices.

A.4.3 Decompositions into Blocks

It is often convenient to partition a matrix into blocks that are themselves matrices as
\[
M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}.
\]

When M is square, the most common kind of partitioning would be one in which A and D are square. Even when A and D are square, their dimensions will generally be different. In the case when M = MT, it necessarily means that A = AT, D = DT, and C = BT. If M is square and invertible, its inverse can be written in terms of the blocks A, B, C, D. This is accomplished by decomposing M into a product of the form
\[
M = \begin{pmatrix} I & 0 \\ L & I \end{pmatrix} \begin{pmatrix} P & 0 \\ 0 & Q \end{pmatrix} \begin{pmatrix} I & U \\ 0 & I \end{pmatrix} \qquad (A.69)
\]
where L and U are general matrices, P and Q are general invertible matrices, and I is the identity of appropriate dimension. Multiplying this out gives four matrix equations in the unknown matrices that can be solved in terms of the originally given blocks. Explicitly,
\[
A = P; \quad B = PU; \quad LP = C; \quad D = LPU + Q.
\]
Therefore,
\[
P = A; \quad U = A^{-1}B; \quad L = CA^{-1}; \quad Q = D - CA^{-1}B.
\]

4 The notation Ai denotes the i × i matrix formed by the first i rows and i columns of the matrix A.

This means that M−1 can be computed by applying the rule for the inverse of products of matrices to (A.69) as
\[
M^{-1} = \begin{pmatrix} I & U \\ 0 & I \end{pmatrix}^{-1} \begin{pmatrix} P & 0 \\ 0 & Q \end{pmatrix}^{-1} \begin{pmatrix} I & 0 \\ L & I \end{pmatrix}^{-1}
= \begin{pmatrix} I & -U \\ 0 & I \end{pmatrix} \begin{pmatrix} P^{-1} & 0 \\ 0 & Q^{-1} \end{pmatrix} \begin{pmatrix} I & 0 \\ -L & I \end{pmatrix}
= \begin{pmatrix} P^{-1} + U Q^{-1} L & -U Q^{-1} \\ -Q^{-1} L & Q^{-1} \end{pmatrix}.
\]

Then substituting the above expressions for L, U, P, Q in terms of A, B, C, D gives an expression for the inverse of M in terms of its blocks:

\[
\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} A^{-1} + A^{-1}B(D - CA^{-1}B)^{-1}CA^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -(D - CA^{-1}B)^{-1}CA^{-1} & (D - CA^{-1}B)^{-1} \end{pmatrix}. \qquad (A.70)
\]
By recursively using the above procedure, a matrix can be decomposed into smaller blocks, and the inverse of the original matrix can be expressed in terms of the smaller blocks.
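The block-inverse formula (A.70) is easy to verify numerically; a sketch assuming NumPy (the diagonal shifts merely keep A and the Schur complement invertible for the random example):

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2)) + 3 * np.eye(2)

M = np.block([[A, B], [C, D]])
Ai = np.linalg.inv(A)
Q = D - C @ Ai @ B                       # the quantity D - C A^{-1} B appearing in (A.70)
Qi = np.linalg.inv(Q)
Minv = np.block([[Ai + Ai @ B @ Qi @ C @ Ai, -Ai @ B @ Qi],
                 [-Qi @ C @ Ai,              Qi]])
print(np.allclose(Minv, np.linalg.inv(M)))   # matches (A.70)
```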

A.5 Matrix Perturbation Theory

This section reviews an important theorem regarding the norm of the inverse of a matrix.

Theorem A.3. [27] Let ‖·‖ denote a consistent matrix norm. Given A ∈ GL(n, C), and Ã = A + E, then if Ã is non-singular,

\[
\frac{\|\tilde{A}^{-1} - A^{-1}\|}{\|\tilde{A}^{-1}\|} \leq \|A^{-1}\|\,\|E\|. \qquad (A.71)
\]

Furthermore, if ‖A−1‖‖E‖ < 1, then Ã must be non-singular and

\[
\frac{\|\tilde{A}^{-1} - A^{-1}\|}{\|A^{-1}\|} \leq \frac{\|A^{-1}\|\,\|E\|}{1 - \|A^{-1}\|\,\|E\|}. \qquad (A.72)
\]

Proof: Since A˜−1 exists, A˜A˜−1 =(A + E)A˜−1 = I. Multiplying on the left by A−1 results in (I + A−1E)A˜−1 = A−1. Therefore,

A˜−1 − A−1 = −A−1EA˜−1. (A.73)

Taking the norm of both sides gives

\[
\|\tilde{A}^{-1} - A^{-1}\| = \|A^{-1}E\tilde{A}^{-1}\| \leq \|A^{-1}\|\,\|E\|\,\|\tilde{A}^{-1}\|,
\]
and hence (A.71). Instead of assuming that Ã is non-singular, if we assume that ‖A−1‖‖E‖ < 1, then I + A−1E must be non-singular. This is because λ(I + A−1E) = 1 + λ(A−1E) ≠ 0, which follows when ‖·‖ is consistent because consistency implies |λ(A−1E)| ≤ ‖A−1E‖ ≤ ‖A−1‖‖E‖ < 1. Then Ã must be non-singular, since Ã = A(I + A−1E). Again taking the norm of (A.73), but this time grouping terms in a different order,

\[
\|\tilde{A}^{-1}\| = \|A^{-1} - A^{-1}E\tilde{A}^{-1}\| \leq \|A^{-1}\| + \|A^{-1}\|\,\|E\|\,\|\tilde{A}^{-1}\|.
\]

This can be rearranged as
\[
\frac{\|\tilde{A}^{-1}\|}{\|A^{-1}\|} \leq \frac{1}{1 - \|A^{-1}\|\,\|E\|}.
\]
Multiplying the left side of this inequality with the left side of (A.71), and likewise for the right sides, yields (A.72).
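A numerical illustration of Theorem A.3 using the Frobenius norm (which is consistent); this sketch assumes NumPy and uses a small perturbation so that the hypothesis ‖A−1‖‖E‖ < 1 holds:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)
E = 1e-3 * rng.standard_normal((n, n))
A_tilde = A + E

Ainv = np.linalg.inv(A)
kappa = np.linalg.norm(Ainv, 'fro') * np.linalg.norm(E, 'fro')   # ||A^{-1}|| ||E||
lhs = np.linalg.norm(np.linalg.inv(A_tilde) - Ainv, 'fro') / np.linalg.norm(Ainv, 'fro')
print(kappa < 1, lhs <= kappa / (1 - kappa))   # hypothesis and conclusion of (A.72)
```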

A.6 The Matrix Exponential

Given the n × n matrices X and A, where X = X(t) is a function of time and A is constant, the solution to the differential equation
\[
\frac{d}{dt}(X) = AX \qquad (A.74)
\]
subject to the initial conditions X(0) = I is

X(t) = exp(tA).

The matrix exponential of any square matrix B is defined by the Taylor :

\[
\exp B = I + B + B^2/2! + B^3/3! + B^4/4! + \ldots. \qquad (A.75)
\]

The matrix exponential has some very interesting and useful properties. The exponential of the direct sum of two square matrices is the direct sum of their exponentials:

exp(A ⊕ B)=eA ⊕ eB. (A.76)

Below, a proof of the following equality is provided:

det(exp A)=etr(A). (A.77)

Using the notation
\[
\det X = \begin{vmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{vmatrix},
\]
it follows from the product rule for differentiation and the defining properties of the determinant that
\[
\frac{d}{dt}(\det X) = \begin{vmatrix} \frac{dx_{11}}{dt} & \frac{dx_{12}}{dt} & \cdots & \frac{dx_{1n}}{dt} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{vmatrix}
+ \begin{vmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \frac{dx_{21}}{dt} & \frac{dx_{22}}{dt} & \cdots & \frac{dx_{2n}}{dt} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{vmatrix}
+ \ldots +
\begin{vmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n-1,1} & x_{n-1,2} & \cdots & x_{n-1,n} \\ \frac{dx_{n1}}{dt} & \frac{dx_{n2}}{dt} & \cdots & \frac{dx_{nn}}{dt} \end{vmatrix}. \qquad (A.78)
\]
Equation (A.74) is written in component form as

\[
\frac{dx_{ik}}{dt} = \sum_{j=1}^{n} a_{ij} x_{jk}.
\]

After making this substitution, the ith term in (A.78) becomes
\[
\begin{vmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ \sum_{j} a_{ij}x_{j1} & \sum_{j} a_{ij}x_{j2} & \cdots & \sum_{j} a_{ij}x_{jn} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{vmatrix}
= \begin{vmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ \vdots & \vdots & & \vdots \\ a_{ii}x_{i1} & a_{ii}x_{i2} & \cdots & a_{ii}x_{in} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nn} \end{vmatrix} = a_{ii} \det X.
\]

This follows by subtracting aij times the jth row of X from the ith row of the left side for all j ≠ i. The result is then
\[
\frac{d}{dt}(\det X) = \mathrm{trace}(A)(\det X). \qquad (A.79)
\]
Since det X(0) = 1, this implies

det X = exp(trace(A)t).

Evaluation of both sides at t = 1 yields (A.77). A sufficient condition for the equalities

exp(A + B) = exp A exp B = exp B exp A (A.80) to hold is AB = BA. (A.81) This can be verified by expanding both sides in a and equating term by term. What is perhaps less obvious is that sometimes the first and/or the second equality in (A.80) can be true when A and B do not commute. For example, Fr´echet [9] observed that when     02π 10 A = and B = −2π 0 0 −1 AB = BA but it is nonetheless true that

eAeB = eBeA. 340 A Review of Linear Algebra, Vector Calculus, and Systems Theory

In [26] it is shown that for the non-commuting complex matrices
\[
A = \begin{pmatrix} \pi i & 0 \\ 0 & -\pi i \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 0 & 1 \\ 0 & -2\pi i \end{pmatrix},
\]

eA+B = eAeB. Other examples can be found in [9, 31]. Having said this, AB = BA is a necessary condition for exp t(A+B) = exp tA exp tB = exp tB exp tA to hold for all values of t ∈ R>0. Furthermore, (A.81) becomes a necessary condition for (A.80) if A and B are Hermitian [26].
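Several of the statements above, including (A.77), the sufficiency of (A.81) for (A.80), and Fréchet's example, can be checked with SciPy's matrix exponential; a sketch (SciPy and NumPy assumed):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(10)
A = rng.standard_normal((3, 3))

# (A.77): det(exp A) = exp(trace A)
print(np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))

# (A.80) holds when A and B commute, e.g. when B is a polynomial in A
B = 0.5 * A @ A - 2.0 * A
print(np.allclose(expm(A + B), expm(A) @ expm(B)))

# Frechet's example: AB != BA, yet exp(A) exp(B) = exp(B) exp(A)
A2 = np.array([[0.0, 2 * np.pi], [-2 * np.pi, 0.0]])
B2 = np.array([[1.0, 0.0], [0.0, -1.0]])
print(not np.allclose(A2 @ B2, B2 @ A2),
      np.allclose(expm(A2) @ expm(B2), expm(B2) @ expm(A2)))
```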

A.7 Kronecker Products and Kronecker Sums

The Kronecker product can be defined for any two matrices
\[
H = \begin{pmatrix} h_{11} & h_{12} & \cdots & h_{1q} \\ h_{21} & h_{22} & \cdots & h_{2q} \\ \vdots & \vdots & \ddots & \vdots \\ h_{p1} & h_{p2} & \cdots & h_{pq} \end{pmatrix} \in \mathbb{R}^{p \times q}
\quad \text{and} \quad
K = \begin{pmatrix} k_{11} & k_{12} & \cdots & k_{1s} \\ k_{21} & k_{22} & \cdots & k_{2s} \\ \vdots & \vdots & \ddots & \vdots \\ k_{r1} & k_{r2} & \cdots & k_{rs} \end{pmatrix} \in \mathbb{R}^{r \times s}
\]
as [7]
\[
H \otimes K = \begin{pmatrix} h_{11}K & h_{12}K & \cdots & h_{1q}K \\ h_{21}K & h_{22}K & \cdots & h_{2q}K \\ \vdots & \vdots & \ddots & \vdots \\ h_{p1}K & h_{p2}K & \cdots & h_{pq}K \end{pmatrix} \in \mathbb{R}^{pr \times qs}.
\]
It follows immediately that (H⊗K)T = HT⊗KT. Note that this is in the opposite order than the “transpose rule” for the transpose of the matrix product: (HK)T = KTHT. The Kronecker product has the interesting property that for matrices A, B, C, D of compatible dimensions,
\[
(A \otimes B)(C \otimes D) = (AC) \otimes (BD). \qquad (A.82)
\]
Note that in general, H⊗K ≠ K⊗H. However, when H and K are both square, there exists a permutation matrix P such that

\[
H \otimes K = P (K \otimes H) P^T,
\]
which means that λi(K⊗H) = λi(H⊗K) for all values of i. Furthermore, if H and K are invertible,

\[
(K \otimes H)^{-1} = K^{-1} \otimes H^{-1}.
\]
In general, given X ∈ Rq×s, it is possible to write x ∈ Rq·s by sequentially stacking columns of X on top of each other. This operation is denoted as x = (X)∨. It is easy to verify that
\[
(H X K^T)^\vee = (K \otimes H)\,\mathbf{x} \qquad (A.83)
\]
where ⊗ is the Kronecker product. Since an n-dimensional column vector, a, can be viewed as an n × 1 matrix, and its transpose is an n-dimensional row vector, it is a well-defined operation to “take the vector of a vector” as a∨ = (aT)∨ = a. Furthermore, for vectors a and b (not necessarily of the same dimensions) the following equalities hold [17]:5
\[
(\mathbf{a}\mathbf{b}^T)^\vee = \mathbf{b} \otimes \mathbf{a} \qquad (A.84)
\]
and
\[
(A^\vee)^T B^\vee = \mathrm{tr}(A^T B) \qquad (A.85)
\]
where A and B have dimensions such that ATB makes sense. Note that in general (AT)∨ ≠ A∨. For example, in the 2 × 2 case
\[
A^\vee = \begin{pmatrix} a_{11} \\ a_{21} \\ a_{12} \\ a_{22} \end{pmatrix}, \quad \text{whereas} \quad (A^T)^\vee = \begin{pmatrix} a_{11} \\ a_{12} \\ a_{21} \\ a_{22} \end{pmatrix}.
\]

The Kronecker sum of two square matrices A ∈ Rm×m and B ∈ Rn×n is the matrix
\[
A \,\hat{\oplus}\, B = A \otimes I_n + I_m \otimes B
\]
where Im is the m-dimensional identity matrix. This Kronecker sum is not the same as the direct sum A ⊕ B. It does not even have the same dimensions. This should be clear since Im⊗B = B ⊕ B ⊕ ... ⊕ B (the m-fold direct sum of B with itself). An interesting property of the Kronecker sum is that

\[
\exp(A \,\hat{\oplus}\, B) = e^A \otimes e^B.
\]

Note the difference with (A.76). Another useful application of the Kronecker sum is in solving equations of the form

\[
AX + XB^T = C
\]
for given A, B, C and unknown X, all of which are square. Application of the ∨ operator as in (A.83) converts this to

\[
(A \,\hat{\oplus}\, B)\mathbf{x} = \mathbf{c} \implies \mathbf{x} = (A \,\hat{\oplus}\, B)^{-1}\mathbf{c},
\]
from which X can be obtained.

5 Note that abT = a ⊗ b, where ⊗ is the tensor product discussed in Section 6.4. This is a good way to remember that the tensor product and the Kronecker product are not the same operation.
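A sketch of the identity (A.83), the exponential of a Kronecker sum, and the solution of AX + XBT = C by vectorization, assuming NumPy and SciPy. The ∨ operation is implemented as a column-stacking reshape, and the exact ordering of the Kronecker factors in the code reflects that stacking convention:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(11)
m, n = 3, 3
A = rng.standard_normal((m, m))
B = rng.standard_normal((n, n))
X = rng.standard_normal((m, n))

vec = lambda M: M.reshape(-1, order='F')     # the (.)^vee operator: stack columns

# (A.83): (A X B^T)^vee = (B kron A) X^vee
print(np.allclose(vec(A @ X @ B.T), np.kron(B, A) @ vec(X)))

# Kronecker sum and its exponential: exp(A (+) B) = e^A kron e^B
ksum = np.kron(A, np.eye(n)) + np.kron(np.eye(m), B)
print(np.allclose(expm(ksum), np.kron(expm(A), expm(B))))

# Solving A X + X B^T = C by vectorization
C = rng.standard_normal((m, n))
K = np.kron(np.eye(n), A) + np.kron(B, np.eye(m))   # operator on X^vee under column stacking
Xsol = np.linalg.solve(K, vec(C)).reshape((m, n), order='F')
print(np.allclose(A @ Xsol + Xsol @ B.T, C))
```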

A.8 Complex Numbers and Fourier Analysis

Throughout these volumes, complex numbers and Fourier analysis are used extensively. However, the following restrictions in scope apply: (1) only real vector spaces are used; (2) only real-valued functions (in particular, probability densities) are of concern; (3) only real Lie groups are studied. The reason for this limitation in scope is to avoid the intricacies associated with taking Taylor series of functions on complex Lie groups and other more mundane problems that can lead to significant confusion when introducing complex numbers. For example, in mathematical notation, the inner product of two complex-valued functions on the real interval [a, b] is

\[
(f_1, f_2) = \int_a^b f_1(x)\,\overline{f_2(x)}\,dx,
\]
whereas in other conventions the conjugate is over f1(x) rather than f2(x). But how is it possible to address problems of interest without using complex numbers? It is actually quite simple to circumvent the use of complex numbers. The fundamental properties of complex arithmetic revolve around the way complex numbers are added, multiplied, and conjugated. If c1 = a1 + ib1 and c2 = a2 + ib2, then the sum and product of these two numbers, and the conjugation of c = a + ib, are

\[
c_1 + c_2 = (a_1 + a_2) + (b_1 + b_2)i; \quad c_1 \cdot c_2 = (a_1 a_2 - b_1 b_2) + (b_1 a_2 + a_1 b_2)i; \quad \bar{c} = a - ib \qquad (A.86)
\]
where ak, bk ∈ R and i = √−1, and the usual scalar arithmetic operations are followed to produce the equalities in (A.86). In a sense, the simplicity of scalar operations is retained at the expense of adding the abstraction of i = √−1. Of course, on one level the very concept of √−1 is absurd. But the elegance and compactness of expressions such as
\[
(e^{i\theta})^m = e^{im\theta} = \cos m\theta + i \sin m\theta \qquad (A.87)
\]
make it worth accepting the concept. But this does not mean that the concept is necessary when doing the calculations in this book. Rather, in all of the problems addressed in this book, complex numbers are nothing more than a convenient shorthand for things that can be expressed as real quantities in higher dimensions. For example, referring back to the properties in (A.86), rather than using complex numbers, it is possible to introduce the vectors ck = [ak, bk]T ∈ R2 with the properties
\[
\mathbf{c}_1 + \mathbf{c}_2 = \begin{pmatrix} a_1 + a_2 \\ b_1 + b_2 \end{pmatrix}; \quad \mathbf{c}_1 \times \mathbf{c}_2 = \begin{pmatrix} a_1 a_2 - b_1 b_2 \\ b_1 a_2 + a_1 b_2 \end{pmatrix}; \quad \bar{\mathbf{c}} = \begin{pmatrix} a \\ -b \end{pmatrix}. \qquad (A.88)
\]

Or, without introducing the operator ×, real 2 × 2 matrices of the form
\[
M(c) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}
\]
can be defined, and using only real matrix operations,

\[
M(c_1) + M(c_2) = M(c_1 + c_2); \quad M(c_1)M(c_2) = M(c_1 \cdot c_2); \quad [M(c)]^T = M(\bar{c}).
\]

Taking this point of view, (A.87) is equivalent to
\[
[R(\theta)]^m = R(m\theta), \quad \text{where} \quad R(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.
\]
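A sketch (NumPy assumed) of the M(c) representation of complex arithmetic described above:

```python
import numpy as np

def M(c):
    """Represent the complex number c = a + ib as the 2x2 real matrix [[a, b], [-b, a]]."""
    return np.array([[c.real, c.imag], [-c.imag, c.real]])

c1, c2 = 1.5 - 0.7j, -2.0 + 0.3j
print(np.allclose(M(c1) + M(c2), M(c1 + c2)))        # addition
print(np.allclose(M(c1) @ M(c2), M(c1 * c2)))        # multiplication
print(np.allclose(M(c1).T, M(np.conj(c1))))          # transpose corresponds to conjugation
```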

An obvious place where complex notation seems required is in the discussion of unitary groups. After all, in Volume 2, when it comes time to discuss the concepts of inner products on Lie algebras, computation of Jacobians, adjoint matrices, Killing forms, etc., definitions are only provided for real Lie groups. Fortunately, elements of SU(n) can be identified with a subgroup of SO(2n). Given the unitary matrix U,

U = A + iB ∈ Cn×n where A, B ∈ Rn×n, the constraint

UU∗ = I =⇒ AAT + BBT = I; BAT − ABT = O.

If
\[
R(U) = \begin{pmatrix} A & B \\ -B & A \end{pmatrix},
\]
then
\[
R(U_1 U_2) = R(U_1) R(U_2); \quad R(I) = I \oplus I; \quad R(U^*) = [R(U)]^T,
\]
and so it is easy to see that a unitary matrix can be represented as a higher-dimensional orthogonal matrix. But what about the Lie algebras and the exponential map? For example, if S = −S∗ is a skew-Hermitian matrix, then U = exp S will be special unitary. Letting S = W + iV where W and V are both real matrices, it becomes clear that W = −WT and V = VT. Therefore,
\[
\Omega(S) = \begin{pmatrix} W & V \\ -V^T & W \end{pmatrix}
\]
is skew-symmetric, and
\[
\exp \Omega(S) = R(\exp(S)). \qquad (A.89)
\]
In fact, most of the Lie groups of practical interest consist of elements that are n × n complex matrices that can be viewed instead as a group with elements that are (2n) × (2n) real matrices. This means that, for example, if it is desired to compute Jacobian matrices or invariant integration measures for groups such as SU(2), this can be done without introducing an inner product on a complex vector space. Rather, these groups can be viewed as being equivalent to higher-dimensional groups of real matrices, and the computations demonstrated for the evaluation of Jacobians for real matrix groups can be applied here as well. In summary, while complex numbers are used extensively throughout the text in the context of both classical and non-commutative Fourier expansions, the functions being expanded as well as the arguments of those functions can always be viewed as real, rather than complex, quantities. And while complex number notation leads to elegant simplifications in the way quantities are written, there is nothing necessary about their use in the class of problems discussed in this book. Taking this point of view, relatively simple tools such as the classical Taylor series for real-valued functions and operations on spaces of real-valued functions on real matrix Lie groups can be understood and applied without the extra effort required to master the theory of complex functions. Of course, this statement does not generalize to other areas of mathematics and its applications.

A.9 Important Inequalities from the Theory of Linear Systems

In this section, three fundamental results from the theory of linear systems of ordinary differential equations are presented. Extensive use of the property

\[
\|A\mathbf{x}\| \leq \|A\| \cdot \|\mathbf{x}\| \qquad (A.90)
\]
is made, where ‖·‖ is any vector norm and the corresponding induced matrix norm. Equation (A.90) also holds when ‖A‖ is the Frobenius norm of A and ‖x‖ is the 2-norm of x.

1. Let A ∈ Rn×n be a constant matrix and x(t), g(t) ∈ Rn be vector-valued functions of time. The solution to
\[
\frac{d\mathbf{x}}{dt} = A\mathbf{x} + \mathbf{g}(t) \quad \text{with} \quad \mathbf{x}(0) = \mathbf{x}_0
\]
is
\[
\mathbf{x}(t) = \exp(At)\mathbf{x}_0 + \int_0^t \exp(A(t - \tau))\mathbf{g}(\tau)\,d\tau. \qquad (A.91)
\]

From the properties of matrix norms, it is easy to see if 0 > −a>Re(λi(A)) for all values of i that for some positive constant scalar c the following holds:

\[
\|\mathbf{x}(t)\| = \left\| \exp(At)\mathbf{x}_0 + \int_0^t \exp(A(t-\tau))\mathbf{g}(\tau)\,d\tau \right\|
\leq \|\exp(At)\mathbf{x}_0\| + \left\| \int_0^t \exp(A(t-\tau))\mathbf{g}(\tau)\,d\tau \right\|
\]
\[
\leq \|\exp(At)\|\,\|\mathbf{x}_0\| + \int_0^t \|\exp(A(t-\tau))\mathbf{g}(\tau)\|\,d\tau
\leq c\,e^{-at}\|\mathbf{x}_0\| + c \int_0^t e^{-a(t-\tau)}\|\mathbf{g}(\tau)\|\,d\tau.
\]
Hence, if ‖g(τ)‖ ≤ γ for some scalar constant γ, it follows that the integral in the above expression will be less than

\[
\gamma \int_0^t e^{-a(t-\tau)}\,d\tau = \gamma e^{-at} \int_0^t e^{a\tau}\,d\tau = \gamma e^{-at}(e^{at} - 1)/a \leq \gamma/a,
\]
and hence ‖x(t)‖ will be bounded. Likewise, if

\[
f(t) = \int_0^t e^{a\tau}\,\|\mathbf{g}(\tau)\|\,d\tau
\]
is bounded by a constant, then ‖x(t)‖ will decay to zero as t → ∞.

2. The Bellman–Gronwall lemma states that for any two functions u(t) and v(t) that are continuous on 0 ≤ t ≤∞, and satisfy the inequality

\[
u(t) \leq \alpha + \int_0^t v(s)u(s)\,ds \qquad (A.92)
\]
for some α > 0 and t > 0, it must be the case that
\[
u(t) \leq \alpha \exp\left( \int_0^t v(s)\,ds \right). \qquad (A.93)
\]
To prove this, first multiply both sides of (A.92) by v(t) and divide by the quantity α + ∫₀ᵗ v(s)u(s) ds, resulting in
\[
\frac{u(t)v(t)}{\alpha + \int_0^t v(s)u(s)\,ds} \leq v(t).
\]
Integrating both sides from 0 to t gives
\[
\log\left( \alpha + \int_0^t v(s)u(s)\,ds \right) - \log \alpha \leq \int_0^t v(s)\,ds.
\]
Adding log α to both sides and exponentiating preserves the inequality and results in (A.93).

3. The “solution” for the system
\[
\frac{d\mathbf{x}}{dt} = (A + B(t))\mathbf{x} \quad \text{with} \quad \mathbf{x}(0) = \mathbf{x}_0
\]
(where B(t) ∈ Rn×n is a matrix-valued function of time) is

\[
\mathbf{x}(t) = \exp(At)\mathbf{x}_0 + \int_0^t \exp(A(t - \tau))B(\tau)\mathbf{x}(\tau)\,d\tau. \qquad (A.94)
\]
(Of course, this is not truly a solution because x appears on both sides of the equation, but it is nonetheless a useful expression.) This can be used to write the following inequalities:

\[
\|\mathbf{x}(t)\| = \left\| \exp(At)\mathbf{x}_0 + \int_0^t \exp(A(t-\tau))B(\tau)\mathbf{x}(\tau)\,d\tau \right\|
\leq \|\exp(At)\mathbf{x}_0\| + \left\| \int_0^t \exp(A(t-\tau))B(\tau)\mathbf{x}(\tau)\,d\tau \right\|
\]
\[
\leq \|\exp(At)\|\,\|\mathbf{x}_0\| + \int_0^t \|\exp(A(t-\tau))\|\,\|B(\tau)\|\,\|\mathbf{x}(\tau)\|\,d\tau.
\]
If a > 0 is a number that bounds from below the absolute value of the real part of all the eigenvalues of A such that ‖exp(At)‖ ≤ c e⁻ᵃᵗ, then
\[
\|\mathbf{x}(t)\| \leq c\,e^{-at}\|\mathbf{x}_0\| + c \int_0^t e^{-a(t-\tau)}\|B(\tau)\|\,\|\mathbf{x}(\tau)\|\,d\tau.
\]
The Bellman–Gronwall lemma can be used as follows. Multiply both sides by eᵃᵗ and let α = c‖x₀‖, u(t) = ‖x(t)‖eᵃᵗ, and v(t) = c‖B(t)‖. The Bellman–Gronwall lemma then indicates that systems for which A has eigenvalues all with negative real parts and, for example,
\[
\int_0^\infty \|B(t)\|\,dt < \infty
\]
or ‖B(t)‖ < β for a sufficiently small real number β, will be stable in the sense that limₜ→∞ ‖x(t)‖ = 0.

A.10 The State-Transition Matrix and the Product Integral

Given a general (possibly non-linear) scalar differential equation with initial conditions of the form
\[ \frac{dx}{dt} = f(x,t) \quad \text{and} \quad x(0) = x_0, \tag{A.95} \]
the simplest numerical integration scheme approximates the solution x(t) at regularly spaced increments of time, Δt, by replacing the time derivative in (A.95) with the finite-difference approximation
\[ \dot{x}(t) \approx \frac{1}{\Delta t}\left[ x(t+\Delta t) - x(t) \right] \]
for 0 < Δt ≪ 1. Substituting this approximation into (A.95) results in a difference equation of the form

\[ x_{n+1} = x_n + \Delta t\, f(x_n, t_n) \quad \text{where} \quad t_n = n\,\Delta t \tag{A.96} \]
and x_n is the approximation to x(t_n) for n = 0, 1, 2, .... This scheme (which is called Euler integration) is known to diverge from the actual solution as n gets large. However, if n is relatively small, and as Δt → 0, the approximation is not bad when f(x, t) is well behaved. And (A.96) has some convenient properties that will be used in the sequel.
In the special case when f(x, t) = a(t)x, and hence (A.95) is a scalar time-varying linear ordinary differential equation (ODE), the exact solution can be written in closed form as
\[ x(t) = x_0 \exp\!\left( \int_0^t a(s)\,ds \right). \tag{A.97} \]
How does the approximate solution in (A.96) compare with this? The statement

\[ x_{n+1} = [1 + a(n\Delta t)\,\Delta t]\,x_n \]
is equivalent to
\[ x_n = x_0 \prod_{k=0}^{n-1} [1 + a(k\Delta t)\,\Delta t]. \]

If n → ∞ and Δt → 0 in such a way that 0 < nΔt = t remains fixed, this product converges to the exact solution (A.97), as the numerical sketch below illustrates.
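The following sketch, assuming NumPy is available, compares the Euler recursion (A.96) with the closed-form solution (A.97) for one illustrative choice of a(t); the coefficient and step sizes are arbitrary.

```python
# Euler scheme (A.96) versus the exact solution (A.97) for dx/dt = a(t) x.
import numpy as np

a = lambda t: -1.0 + 0.5 * np.sin(t)     # time-varying coefficient
x0, T, N = 1.0, 5.0, 1000
dt = T / N

x = x0
for n in range(N):                        # x_{n+1} = [1 + a(t_n) dt] x_n
    x = (1.0 + a(n * dt) * dt) * x

# Exact solution x(T) = x0 exp( int_0^T a(s) ds ), integral by the trapezoid rule.
s = np.linspace(0.0, T, 10001)
x_exact = x0 * np.exp(np.trapz(a(s), s))

print(x, x_exact)   # the two agree more closely as dt -> 0
```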

The matrix analog of this scalar linear ODE is
\[ \frac{dX}{dt} = A\,X \quad \text{with} \quad X(0) = X_0, \tag{A.98} \]
where A is a constant matrix and X(t) is a matrix-valued function of time. The solution to (A.98) is
\[ X(t) = [\exp(At)]\,X_0. \tag{A.99} \]

It is easy to show that this is a solution using the definition of the matrix exponential as a Taylor series, direct substitution into (A.98), and matching each term in the resulting

Taylor series. The fact that it is the only solution follows from existence and uniqueness theorems from the classical theory of ordinary differential equations. In contrast, a numerical approximation of the solution to the system (A.98) can be made in analogy with the scalar case as

Xn+1 = Xn + ΔtAXn =[I + ΔtA]Xn ≈ exp(ΔtA)Xn.

The approximation above becomes accurate as Δt → 0, and so the actual solution is obtained at the discrete sample points in that limit:
\[ X_n = \left[ \prod_{k=1}^{n} \exp(\Delta t\,A) \right] X_0 = \exp\!\left( \Delta t \sum_{k=1}^{n} A \right) X_0 = \exp(t_n A)\,X_0 = X(t_n). \]
Now if A = A(t) in (A.98), the solution will no longer be (A.99). However, if
\[ \left[ A(t),\, \int_0^t A(s)\,ds \right] = 0 \tag{A.100} \]
for all values of t (where in this context the brackets mean [A, B] = AB − BA), then
\[ X(t) = \exp\!\left( \int_0^t A(s)\,ds \right) X_0. \tag{A.101} \]
And the same numerical scheme gives
\[ X_{n+1} = \left[ I + \int_{t_n}^{t_{n+1}} A(s)\,ds \right] X_n \tag{A.102} \]
and
\[ X_n = \left[ \prod_{k=1}^{n} \exp\!\left( \int_{t_{k-1}}^{t_k} A(s)\,ds \right) \right] X_0 = \exp\!\left( \sum_{k=1}^{n} \int_{t_{k-1}}^{t_k} A(s)\,ds \right) X_0 = \exp\!\left( \int_0^{t_n} A(s)\,ds \right) X_0 = X(t_n). \]
In the more general case when (A.100) does not necessarily hold, a unique solution will still exist, and it will be of the form

X(t)=Φ(t, 0)X0 where the state transition matrix can be thought of as the limit

\[ \Phi(t_2, t_1) = \lim_{k\to\infty} \Phi_k(t_2, t_1) \quad \text{where} \quad \Phi_k(t_2, t_1) = I + \int_{t_1}^{t_2} A(\sigma)\,\Phi_{k-1}(\sigma, t_1)\,d\sigma. \tag{A.103} \]
Or, written in a different way [6, 24],
\[ \Phi(t_2, t_1) = I + \int_{t_1}^{t_2} A(\sigma_1)\,d\sigma_1 + \int_{t_1}^{t_2} A(\sigma_1)\!\left( \int_{t_1}^{\sigma_1} A(\sigma_2)\,d\sigma_2 \right) d\sigma_1 + \int_{t_1}^{t_2} A(\sigma_1)\!\left( \int_{t_1}^{\sigma_1} A(\sigma_2)\!\left( \int_{t_1}^{\sigma_2} A(\sigma_3)\,d\sigma_3 \right) d\sigma_2 \right) d\sigma_1 + \cdots. \tag{A.104} \]

In addition to being called the state transition matrix, this is sometimes called the matrizant or matricant. It is easy to see that if the solution from t = t0 to t = t1 is X(t1)=Φ(t1,t0)X(t0) where t1 >t0, and then if X(t1) is used as the initial conditions for a solution evaluated at t2 >t1, then X(t2)=Φ(t2,t1)X(t1). It then follows that

X(t2)=Φ(t2,t1)Φ(t1,t0)X(t0)=⇒ Φ(t2,t0)=Φ(t2,t1)Φ(t1,t0). (A.105)

On the other hand, even when (A.100) does not hold, (A.102) will still hold, and for very small values of Δt it is possible to write
\[ X_{n+1} \approx \exp\!\left( \int_{t_n}^{t_{n+1}} A(s)\,ds \right) X_n \tag{A.106} \]
since (A.102) can be thought of as the first two terms in the Taylor-series expansion of the matrix exponential, with all other terms being insignificant since they involve higher powers of Δt. It follows from back-substituting (A.106) into itself that
\[ X_{n+1} \approx \exp\!\left( \int_{t_n}^{t_{n+1}} A(s)\,ds \right) \exp\!\left( \int_{t_{n-1}}^{t_n} A(s)\,ds \right) \cdots \exp\!\left( \int_{t_0}^{t_1} A(s)\,ds \right) X_0, \]
or equivalently,
\[ X(t_{n+1}) = \left[ \prod_{k=0}^{n} \exp\!\left( \int_{t_k}^{t_{k+1}} A(s)\,ds \right) \right] X_0 \tag{A.107} \]
where the order in the product is understood to be in decreasing values of k written from left to right. In the limit as Δt → 0 and n → ∞ in such a way that their product is the finite value t, the above approximate solution becomes exact, and the corresponding state-transition matrix can be written as the product integral [1, 18, 30]:
\[ \Phi(t, t_0) = \lim_{n\to\infty} \prod_{k=0}^{n} \exp\!\left( \int_{t_k}^{t_{k+1}} A(s)\,ds \right). \tag{A.108} \]

Shorthand that is sometimes used for this is⁶ [19]
\[ \Phi(t, t_0) \doteq \prod_{t_0}^{t} \exp[A(s)\,ds]. \tag{A.109} \]

⁶The most rigorous definition of the product integral summarized by this shorthand is somewhat more involved than described here.

Interestingly, even though Φ(t, t0) cannot be written as a single matrix exponential unless (A.100) holds, (A.79) still holds for the case when A = A(t), and so

\[ \det \Phi(t, t_0) = \exp\!\left( \int_{t_0}^{t} \mathrm{tr}\,A(s)\,ds \right). \]
The interpretations in (A.103) and (A.107) both have certain advantages from the perspective of numerical approximation. For example, if the system has many dimensions, then for small values of time, (A.103) has the advantage that the numerical

evaluation of the matrix exponential is not required. However, in the case when the system of equations describes a process evolving subject to constraints, (A.109) will usually observe those constraints better than will (A.103).
For example, the constraint X^T(t)X(t) = I indicates that X(t) is an orthogonal matrix. Taking the time derivative of both sides means that ẊX^T must be skew-symmetric. Therefore, the system Ẋ = A(t)X, where A(t) is a specified skew-symmetric matrix and X(0) is orthogonal, should produce an X(t) that is orthogonal. However, if (A.103) is truncated at a finite number of nested integrals and used as an approximate solution, or if (A.102) is iterated with a finite value of Δt, errors will add up and the solution will no longer obey the constraint X^T(t_n)X(t_n) = I as the value of t_n increases. In contrast, if A(s) is skew-symmetric, then each integral inside of each exponential in (A.107) will also be skew-symmetric. And since the exponential of a skew-symmetric matrix is always orthogonal, the solution obtained by the product integral will, in principle, obey the orthogonality constraint that the true solution should have, since the product of orthogonal matrices is orthogonal.
Now, just because a numerical approximation observes constraints does not guarantee that it is accurate. That is, it is a necessary property of a solution, but not a sufficient one. And furthermore, the product-integral formulation is not the only way to enforce constraints. For example, it is possible to use one of the other numerical approximations discussed above and incorporate a correction step that first generates an approximate solution, and then modifies that approximation so as to be consistent with the constraints. Another alternative would be to parameterize all possible states that are consistent with the constraints, and replace the original linear system of ODEs with a smaller number of non-linear ODEs in the parameter space, the solution of which will necessarily observe the constraints. A more detailed discussion of this is given in Volume 2.
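The point about constraint preservation can be seen numerically. The sketch below, assuming NumPy/SciPy, propagates Ẋ = A(t)X for an illustrative skew-symmetric A(t) using both the product-of-exponentials update (A.107) and the first-order update (A.102); only the former keeps X orthogonal to machine precision.

```python
# Product-of-exponentials update vs. first-order (Euler-like) update for
# dX/dt = A(t) X with A(t) skew-symmetric and X(0) = I (orthogonal).
import numpy as np
from scipy.linalg import expm

def A(t):                                   # a skew-symmetric A(t)
    w = np.array([np.sin(t), np.cos(2*t), 1.0])
    return np.array([[0.0, -w[2],  w[1]],
                     [w[2],  0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

dt, N = 0.05, 400
X_prod  = np.eye(3)                         # update (A.107)
X_euler = np.eye(3)                         # update (A.102)

for n in range(N):
    t = n * dt
    dA = A(t + 0.5 * dt) * dt               # midpoint rule for the integral; stays skew-symmetric
    X_prod  = expm(dA) @ X_prod
    X_euler = (np.eye(3) + dA) @ X_euler

err = lambda X: np.linalg.norm(X.T @ X - np.eye(3))
print(err(X_prod))    # ~1e-15: orthogonality preserved
print(err(X_euler))   # noticeably larger: the constraint drifts
```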

A.11 Vector Calculus

Vector calculus addresses differentiation and integration of functions and vector fields in analogy with the way calculus in one dimension works. The subsections that follow serve as a review of the basic results of vector calculus. Section A.11.1 begins with a review of basic optimization. Section A.11.2 focuses on differential operations in Euclidean space using Cartesian coordinates, and Section A.11.3 addresses the spatial generalizations of the fundamental theorem of calculus (i.e., the theorems named after Gauss, Green, and Stokes). Section A.11.4 discusses integration by parts in Rⁿ. Section A.11.5 reviews the chain rule. And Section A.11.6 serves as an introduction to matrix calculus.

A.11.1 Optimization in Rn

In this section optimization problems in Rⁿ are reviewed. Given a smooth function f : Rⁿ → R, classical calculus provides the necessary and sufficient conditions for a particular point x₀ to be a critical point or extremum (i.e., a local minimum, local maximum, or saddle point). The necessary condition for a critical point is that the gradient vector evaluated at x₀ is zero:
\[ \left. \frac{\partial f}{\partial x} \right|_{x_0} = 0. \tag{A.110} \]

Computing the matrix of second derivatives, ∂²f/∂x_i∂x_j, and examining its properties determines what kind of critical point x₀ is. In particular, the n × n symmetric matrix
\[ H(x_0) = \left. \frac{\partial^2 f}{\partial x\,\partial x^T} \right|_{x_0}, \tag{A.111} \]
which is called the Hessian, describes the shape of the local landscape in the neighborhood of x₀. And the eigenvalues of H characterize the type of critical point. If all of the eigenvalues of H(x₀) are positive, then x₀ is a local minimum. If all of the eigenvalues of H(x₀) are negative, then x₀ is a local maximum. If they are mixed, then the result is a saddle.
Often in optimization problems, it is desirable to minimize a function subject to constraints. This changes the problem stated above from one of finding the critical points of f(x), where x can take any value in the n-dimensional space Rⁿ, to one of finding constrained extrema that are contained within an extrinsically defined hyper-surface h(x) = 0. Sometimes multiple constraints of the form h_i(x) = 0 for i = 1, ..., m < n are provided. When constraints are present, the goal becomes one of finding the values x₀ that extremize f(x) while exactly satisfying all of these h_i(x) = 0. The combination of m constraint surfaces, when intersected in an n-dimensional space, generally gives an (n − m)-dimensional manifold. If a parametrization x = x(u) can be found, where u ∈ R^{n−m}, such that h_i(x(u)) = 0 for i = 1, ..., m, then the original problem can be reduced to the minimization of the function f̃(u) = f(x(u)). While in general it can be difficult to find such a parametrization, it is still useful to think of the problem in this way. The necessary conditions for finding a constrained extremum are then ∂f̃/∂u_i = 0 for i = 1, ..., n − m. By the chain rule (see (1.33)), this necessary condition is written as

\[ \frac{\partial x^T}{\partial u}\,\frac{\partial f}{\partial x} = 0. \]
If h(x) = [h_1(x), ..., h_m(x)]^T, then since h(x) = 0, it follows that h(x(u)) = 0 and
\[ \frac{\partial x^T}{\partial u}\,\frac{\partial h^T}{\partial x} = O. \]
This says that ∂f/∂x and the columns of ∂h^T/∂x are all orthogonal (or normal) to the rows of ∂x^T/∂u, which describe the tangent to the constraints. Since the tangent space is (n − m)-dimensional, and the original space was n-dimensional, the fact that the m vectors ∂h_i/∂x are normal to the constraints means that they form a basis for all normal vectors. Since ∂f/∂x is one such normal vector, it must be possible to expand it as a linear combination of this basis of normal vectors. Therefore, it must be possible to find constants λ_i for i = 1, ..., m, such that

\[ \frac{\partial f}{\partial x} = \sum_{i=1}^{m} \lambda_i \frac{\partial h_i}{\partial x} \quad \text{or} \quad \frac{\partial f}{\partial x} = \frac{\partial h^T}{\partial x}\,\lambda. \tag{A.112} \]

The constants λ = [λ₁, ..., λ_m]^T are called Lagrange multipliers. The method of Lagrange multipliers, as stated in (A.112), enforces constraints while working in the original coordinates, x. And while (A.112) was derived by assuming that the constraints could be described parametrically, the end result does not require that such a parametrization be found. For other explanations, see [16, 28].
In practice, when seeking to minimize a function f(x) subject to constraints h(x) = 0, Lagrange multipliers are incorporated into a modified cost function, c(x) = f(x) + λ^T h(x). Then simultaneously solving ∂c/∂x = 0 (which is (A.112)) and ∂c/∂λ = h(x) = 0 provides n + m equations to be solved in the n + m variables [x^T, λ^T]^T ∈ R^{n+m}. This may seem wasteful, since doing calculations in an (n + m)-dimensional space is more costly than in the (n − m)-dimensional space that would have resulted if parametric constraints had been obtained, but finding parametric descriptions of constraints is easier said than done. And so, the method of Lagrange multipliers is widely used in practice.
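A minimal sketch of this augmented-system approach follows, assuming SciPy is available; the objective f(x) = x₁² + x₂² and constraint h(x) = x₁ + x₂ − 1 are illustrative choices, not taken from the text.

```python
# Solve grad f + lambda * grad h = 0 and h(x) = 0 simultaneously (n = 2, m = 1).
from scipy.optimize import fsolve

def augmented(z):
    x1, x2, lam = z
    return [2*x1 + lam,        # dc/dx1 = 0
            2*x2 + lam,        # dc/dx2 = 0
            x1 + x2 - 1.0]     # dc/dlambda = h(x) = 0

x1, x2, lam = fsolve(augmented, x0=[0.0, 0.0, 0.0])
print(x1, x2, lam)   # 0.5, 0.5, -1.0: the constrained minimum of f on the line x1 + x2 = 1
```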

A.11.2 Differential Operators in Rn

The gradient of a differentiable scalar function, φ(x) for x ∈ Rⁿ, is defined as the vector grad(φ) ∈ Rⁿ with ith entry
\[ e_i \cdot \mathrm{grad}(\varphi) = \frac{\partial \varphi}{\partial x_i}. \]
Usually the notation grad(φ) = ∇φ is used as shorthand, where the “del” operator, ∇, itself is viewed as a vector of the form

\[ \nabla = \sum_{i=1}^{n} e_i \frac{\partial}{\partial x_i}. \]

The divergence of a vector field (i.e., vector-valued function of vector-valued argument), f(x), is defined as
\[ \mathrm{div}(f) \doteq \sum_{i=1}^{n} \frac{\partial f_i}{\partial x_i} = \nabla \cdot f. \tag{A.113} \]
The Laplacian is defined as

\[ \mathrm{div}(\mathrm{grad}(\varphi)) \doteq \sum_{i=1}^{n} \frac{\partial^2 \varphi}{\partial x_i^2} = \nabla \cdot (\nabla \varphi). \tag{A.114} \]

This is often denoted as ∇²φ or Δφ. In three-dimensional space, we can also define a curl operator,
\[ \mathrm{curl}(f) = \nabla \times f = \left[ \frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3},\;\; \frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1},\;\; \frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2} \right]^T. \tag{A.115} \]

These operators play central roles in mechanics and in the theory of electromagnetism.
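Two identities that follow directly from these definitions, curl(grad(φ)) = 0 and div(curl(f)) = 0, can be verified symbolically; the sketch below assumes SymPy is available and uses generic unspecified functions.

```python
# Symbolic check of curl(grad(phi)) = 0 and div(curl(f)) = 0 in R^3.
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
phi = sp.Function('phi')(x1, x2, x3)
f = [sp.Function('f%d' % i)(x1, x2, x3) for i in (1, 2, 3)]

def grad(p):
    return [sp.diff(p, x1), sp.diff(p, x2), sp.diff(p, x3)]

def curl(F):
    return [sp.diff(F[2], x2) - sp.diff(F[1], x3),
            sp.diff(F[0], x3) - sp.diff(F[2], x1),
            sp.diff(F[1], x1) - sp.diff(F[0], x2)]

def div(F):
    return sp.diff(F[0], x1) + sp.diff(F[1], x2) + sp.diff(F[2], x3)

print([sp.simplify(c) for c in curl(grad(phi))])   # [0, 0, 0]
print(sp.simplify(div(curl(f))))                   # 0
```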

A.11.3 Integral Theorems in R2 and R3

The Fundamental Theorem of Calculus states that for a differentiable real-valued func- tion on the interval [a, b], f :[a, b] → R,

\[ \int_a^b \frac{df}{dx}\,dx = f(b) - f(a). \tag{A.116} \]
A natural question to ask is, “How does this extend to higher dimensions?” Consider a region in the plane bounded by differentiable curves. Each curve can be assigned an orientation defined such that the region is “always on the left” as a curve is traversed, as shown in Figure 5.4. Let f be a vector-valued function (f : R² → R²) defined on the closure of the region (i.e., the union of its interior and the boundary curves). The value of the function on the oriented boundary curves is, in a sense, like the values f(a) and f(b) in (A.116). It should not be surprising, then, that evaluating the integral of “some kind of derivative” of f(x) on the interior of the region should relate to values evaluated on the boundary. In fact, the detailed relationship is written as

\[ \int_{B \subset \mathbb{R}^2} \mathrm{div}(f)\,dx_1\,dx_2 = \sum_i \oint_{C_i} f(x^{(i)}(s)) \cdot n^{(i)}(s)\,ds \tag{A.117} \]
where i = 1, 2, ..., m enumerates the boundary curves, C₁, ..., C_m, each of which has a parametric description x^{(i)}(s) with tangent t^{(i)}(s) pointing along the direction of increase of s (which is a dummy variable of integration that need not be arc length). The unit normal that points away from the region at each boundary point is n^{(i)}(s), which by definition satisfies n^{(i)}(s) · t^{(i)}(s) = 0. Note that the right-hand side of (A.116) can be re-written as (+1)f(b) + (−1)f(a), where +1 denotes the “direction” on the real line pointing away from the interval [a, b] when x = b, and −1 points away from the interval when x = a and a < b.

The spatial (three-dimensional) generalization of (A.117) is
\[ \int_{B} \mathrm{div}(f)\,dx_1\,dx_2\,dx_3 = \sum_i \int_{\partial B_i} f(x^{(i)}) \cdot n^{(i)}\,dS \tag{A.120} \]
where now x^{(i)} ranges over each of the bounding surfaces of the domain (or “body”), B, with n^{(i)} denoting the outward-pointing normal to the surface, and dS is an element of surface area. The spatial generalization of (A.119) is Stokes’ theorem,

\[ \sum_i \int_{S_i} \mathrm{curl}(f) \cdot n^{(i)}\,dS = \sum_j \oint_{C_j} f \cdot dx^{(j)} \tag{A.121} \]
where j can be used to enumerate oriented curves on the exterior surface of the body. Collectively, these curves “carve out” oriented surface patches and strips. In contrast, i runs over all interior surfaces and the part of the exterior surface defined by the curves.
The extension of these theorems to higher dimensions (and even to non-Euclidean spaces such as group manifolds and homogeneous spaces) is possible. This is facilitated by the use of differential forms, which are used in place of the cross product. Recall that the cross product was used to define the normal to a surface, and the element of surface area is also defined using the cross product. And so, as written, (A.120) and (A.121) are limited to three-dimensional space. Differential forms and the generalization of Stokes’ theorem, Green’s theorem, and the divergence theorem are discussed in Chapter 6.
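As a concrete check of (A.120), the following sketch (assuming a recent SciPy) integrates both sides for the illustrative field f(x) = (x₁³, x₂³, x₃³) over the unit ball, whose boundary is the unit sphere; both sides equal 12π/5.

```python
# Numerical check of the divergence theorem (A.120) on the unit ball.
import numpy as np
from scipy.integrate import tplquad, dblquad

# Volume integral of div f = 3(x1^2 + x2^2 + x3^2) = 3 r^2 in spherical coordinates
# (volume element r^2 sin(theta) dr dtheta dphi).
lhs, _ = tplquad(lambda r, th, ph: 3.0 * r**2 * r**2 * np.sin(th),
                 0.0, 2*np.pi,    # phi
                 0.0, np.pi,      # theta
                 0.0, 1.0)        # r

# Surface integral of f . n over the unit sphere, where n = x and dS = sin(theta) dtheta dphi.
def flux(th, ph):
    x = np.array([np.sin(th)*np.cos(ph), np.sin(th)*np.sin(ph), np.cos(th)])
    return float(np.dot(x**3, x)) * np.sin(th)

rhs, _ = dblquad(flux, 0.0, 2*np.pi, 0.0, np.pi)
print(lhs, rhs, 12*np.pi/5)   # all three agree to quadrature accuracy
```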

A.11.4 Integration by Parts in Rn

The inner product of two real-valued functions on the interval [a, b] ⊂ R is defined as

\[ (f, h) = \int_a^b f(x)\,h(x)\,dx. \tag{A.122} \]
Clearly (f, h) = (h, f). Using the notation f′ = df/dx, the familiar integration-by-parts formula,

\[ \int_a^b f(x)\,h'(x)\,dx = f(b)h(b) - f(a)h(a) - \int_a^b h(x)\,f'(x)\,dx, \tag{A.123} \]
can be written compactly as
\[ (f, h') = f h \big|_a^b - (h, f'). \]
Let φ : Rⁿ → R and v : Rⁿ → Rⁿ. Let D ⊂ Rⁿ be a compact (i.e., closed and bounded) domain with smooth boundary ∂D. The integration-by-parts formula (A.123) generalizes to n-dimensional domains D ⊂ Rⁿ as

\[ \sum_{i=1}^{n} \int_D v_i \frac{\partial \varphi}{\partial x_i}\,dV = \sum_{i=1}^{n} \int_{\partial D} \varphi\,v_i\,n_i\,dS - \sum_{i=1}^{n} \int_D \varphi\,\frac{\partial v_i}{\partial x_i}\,dV, \]
which is written more compactly as

\[ \int_D (\mathrm{grad}\,\varphi)\cdot v\,dV = \int_{\partial D} \varphi\,v \cdot n\,dS - \int_D \varphi\,\mathrm{div}(v)\,dV, \tag{A.124} \]
where n ∈ Rⁿ is the unit outward-pointing normal to the bounding surface ∂D. Using the divergence theorem, (A.124) can be restated as

\[ \int_D (\mathrm{grad}\,\varphi)\cdot v\,dV = \int_D \{ \mathrm{div}(\varphi\,v) - \varphi\,\mathrm{div}(v) \}\,dV. \]
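The n-dimensional integration-by-parts formula (A.124) can be verified symbolically on a simple domain; the sketch below assumes SymPy and uses the illustrative choices D = [0,1]², φ = xy, and v = (x², y), for which both sides evaluate to 5/12.

```python
# Symbolic verification of (A.124) on the unit square.
import sympy as sp

x, y = sp.symbols('x y')
phi = x * y
v = sp.Matrix([x**2, y])

grad_phi = sp.Matrix([sp.diff(phi, x), sp.diff(phi, y)])
div_v = sp.diff(v[0], x) + sp.diff(v[1], y)

lhs = sp.integrate(sp.integrate(grad_phi.dot(v), (x, 0, 1)), (y, 0, 1))

# Boundary term over the four edges, with outward normals (+/-1, 0) and (0, +/-1).
bdry = (sp.integrate((phi * v[0]).subs(x, 1), (y, 0, 1))    # x = 1, n = ( 1, 0)
        - sp.integrate((phi * v[0]).subs(x, 0), (y, 0, 1))  # x = 0, n = (-1, 0)
        + sp.integrate((phi * v[1]).subs(y, 1), (x, 0, 1))  # y = 1, n = ( 0, 1)
        - sp.integrate((phi * v[1]).subs(y, 0), (x, 0, 1))) # y = 0, n = ( 0,-1)

vol = sp.integrate(sp.integrate(phi * div_v, (x, 0, 1)), (y, 0, 1))
print(lhs, bdry - vol)   # 5/12 and 5/12
```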

A.11.5 The Chain Rule

Given a scalar-valued function of multiple arguments, f(x1, ..., xn), a vector-valued func- T tion of a single argument, x(t)=[x1(t), ..., xn(t)] , or a vector-valued function of vector- valued argument, classical multi-variate calculus addresses how rates of change of these quantities relate to each other. For example, the chain rule states that

\[ \frac{df}{dt} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,\frac{dx_i}{dt}. \tag{A.125} \]

Given the system of equations yi = fi(x1, ..., xn) for i =1, ..., m,

\[ \frac{dy_i}{dt} = \sum_{j=1}^{n} \frac{\partial f_i}{\partial x_j}\,\frac{dx_j}{dt}. \tag{A.126} \]

Treating the variables xj and yi as entries in a vector, the above can be written as dy dx = J(x) dt dt where the m × n Jacobian matrix J(x) is defined as

\[ J(x) = \frac{\partial y}{\partial x^T} \iff J_{ij} = \frac{\partial f_i}{\partial x_j}. \]
The chain rule can be iterated. If y = f(x) and z = g(y), then
\[ \frac{dz}{dt} = \frac{\partial z}{\partial y^T}\,\frac{dy}{dt} = \frac{\partial z}{\partial y^T}\,\frac{\partial y}{\partial x^T}\,\frac{dx}{dt}. \]
Here the order of multiplication of the Jacobian matrices matters. Using this notation, the gradient of a scalar function f = f(x₁, ..., x_n) (which was defined earlier as a column vector) is written as
\[ \mathrm{grad}\,f = \frac{\partial f}{\partial x} = \left( \frac{\partial f}{\partial x^T} \right)^T. \]
The chain rule is an important tool that allows the differential operations in Section A.11.2 and integral theorems in Section A.11.3 to be expressed in different coordinate systems. For example, the expressions for divergence, curl, etc. take on different forms when expressed in cylindrical or spherical coordinates than in Cartesian coordinates. And likewise, the expressions for volume elements and surface area differ in appearance when expressed in various coordinate systems. The chain rule links these expressions.
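The matrix form of the chain rule can be checked numerically with finite-difference Jacobians; in the sketch below (assuming NumPy) the maps f, g, and the curve x(t) are arbitrary illustrative choices.

```python
# Finite-difference check of dz/dt = J_g(y) J_f(x) dx/dt for z = g(f(x(t))).
import numpy as np

f = lambda x: np.array([x[0]**2 - x[1], np.sin(x[0]) * x[1]])   # f: R^2 -> R^2
g = lambda y: np.array([y[0] * y[1], y[0] + y[1]**2])           # g: R^2 -> R^2
x_of_t = lambda t: np.array([np.cos(t), t**2])                  # a curve x(t)

def jacobian(F, x, h=1e-6):
    """Central-difference Jacobian of F at x."""
    cols = []
    for j in range(len(x)):
        e = np.zeros(len(x)); e[j] = h
        cols.append((F(x + e) - F(x - e)) / (2*h))
    return np.column_stack(cols)

t, h = 0.7, 1e-6
x = x_of_t(t)
dxdt = (x_of_t(t + h) - x_of_t(t - h)) / (2*h)

chain  = jacobian(g, f(x)) @ jacobian(f, x) @ dxdt              # J_g J_f dx/dt
direct = (g(f(x_of_t(t + h))) - g(f(x_of_t(t - h)))) / (2*h)    # d/dt of g(f(x(t)))
print(chain, direct)   # agree to finite-difference accuracy
```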

A.11.6 Matrix Differential Calculus

In some contexts, matrix-valued functions of a scalar, vector, or even another matrix arise in applications. For example, the state-transition matrix in Section A.10 gives the solution X(t) = Φ(t, t₀) for the linear system dX/dt = A(t)X subject to given initial conditions X(t₀). Given two such matrix-valued functions of scalar argument, X(t) and Y(t), and constant matrices A and B such that the products X(t)Y(t) and AX(t)B make sense, matrix differential calculus in one variable gives
\[ \frac{d}{dt}(XY) = \frac{dX}{dt}Y + X\frac{dY}{dt} \quad \text{and} \quad \frac{d}{dt}(AXB) = A\frac{dX}{dt}B. \]
From the first of these expressions, the derivative of the inverse of a matrix can be computed from the fact that XX^{−1} = I and dI/dt = O as

\[ \frac{d}{dt}(XX^{-1}) = \frac{dX}{dt}X^{-1} + X\frac{dX^{-1}}{dt} = O \iff \frac{dX^{-1}}{dt} = -X^{-1}\frac{dX}{dt}X^{-1}. \tag{A.127} \]

In contrast to matrix-valued functions of scalar argument, it is also possible to have scalar-valued functions of matrix argument. For example, the functions tr(X), det X, and ‖X‖ all take in matrices as their arguments and return scalars. The derivative of a real-valued function, φ(X), with respect to the (m × n)-dimensional real matrix argument can be defined in terms of components as
\[ \left( \frac{\partial \varphi}{\partial X} \right)_{ij} = \frac{\partial \varphi}{\partial X_{ij}} \quad \text{for} \quad (i, j) \in [1, ..., m] \times [1, ..., n]. \tag{A.128} \]
Within the context of this notation, it is straightforward to show for constant matrices A and B that
\[ \frac{\partial\,\mathrm{tr}X}{\partial X} = I; \quad \frac{\partial\,\mathrm{tr}(A^T X)}{\partial X} = A; \quad \frac{\partial\,\mathrm{tr}(AX^{-1})}{\partial X} = -(X^{-1}AX^{-1})^T \tag{A.129} \]
and
\[ \frac{\partial\,\mathrm{tr}(XAXB)}{\partial X} = (AXB + BXA)^T; \quad \frac{\partial\,\mathrm{tr}(XAX^T B)}{\partial X} = B^T X A^T + BXA. \tag{A.130} \]
In addition to matrix-valued functions of scalar argument, and scalar-valued functions of matrix argument, the problem of matrix-valued functions of matrix-valued arguments sometimes arises in applications. While the system of scalar equations in multiple scalar variables of the form y_i = f_i(x₁, ..., x_n) for i = 1, ..., m was treated as a vector-valued function of vector-valued argument in Section A.11.5, if n = r·s and m = p·q for positive integers p, q, r, s, then rather than viewing this as the vector expression y = f(x), it could be viewed as the matrix expression Y = F(X), where x = X^∨, y = Y^∨, and f = F^∨ are the long column vectors obtained by stacking columns of the matrices to which the ∨ operator is applied. And if two such expressions exist such that Y = F(X) and Z = G(Y), it is natural to ask what the chain rule looks like. Several possible notations exist. One could define the Jacobian corresponding to the derivative of a matrix with respect to a matrix as a three-dimensional array; or one could treat the operator ∂/∂X as a matrix (in analogy with the way ∇ is treated as a vector) and define a Jacobian as a Kronecker product (∂/∂X) ⊗ F. While these and other possible concepts exist, one that is very convenient in many applications is to convert X ∈ R^{r×s} and Y = F(X) ∈ R^{p×q} back to vectors and then use the definition [17]

\[ DF(X) \doteq \frac{\partial F^{\vee}}{\partial [X^{\vee}]^T} \in \mathbb{R}^{p\cdot q \times r\cdot s}. \]

Then if X = X(t),
\[ \frac{dY^{\vee}}{dt} = DF(X)\,\frac{dX^{\vee}}{dt} \]
and the chain rule for concatenated transformations of the form Z = G(F(X)) can be expressed simply as the matrix product
\[ \frac{dZ^{\vee}}{dt} = DG(F(X))\,DF(X)\,\frac{dX^{\vee}}{dt} \]
without having to worry about how matrices multiply multi-dimensional arrays or any of the other problems associated with other definitions of Jacobians for matrix-valued functions of matrix argument. For a more detailed treatment of matrix differential calculus, see [17].
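Identities such as (A.127) and the first equality in (A.129) are easy to sanity-check numerically; the sketch below assumes NumPy, and the matrices X₀ and V are random illustrative choices.

```python
# Finite-difference checks of (A.127) and of d tr(X)/dX = I from (A.129).
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.standard_normal((3, 3)) + 3*np.eye(3)   # comfortably invertible
V  = rng.standard_normal((3, 3))
h  = 1e-6

# (A.127): central difference of t -> (X0 + tV)^{-1} at t = 0 vs. -X0^{-1} V X0^{-1}
num = (np.linalg.inv(X0 + h*V) - np.linalg.inv(X0 - h*V)) / (2*h)
ana = -np.linalg.inv(X0) @ V @ np.linalg.inv(X0)
print(np.max(np.abs(num - ana)))                 # small (finite-difference accuracy)

# (A.129): the (i,j) entry of d tr(X)/dX is d tr(X)/dX_ij, which should equal delta_ij
grad_tr = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        E = np.zeros((3, 3)); E[i, j] = h
        grad_tr[i, j] = (np.trace(X0 + E) - np.trace(X0 - E)) / (2*h)
print(np.allclose(grad_tr, np.eye(3)))           # True
```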

A.12 Exercises

A.1. Show that: (a) the “triple product” [a, b, c] ≐ det(a, b, c) has the property
\[ a\cdot(b\times c) = c\cdot(a\times b) = b\cdot(c\times a), \tag{A.131} \]
and (b) any vector v ∈ R³ can be expressed in terms of three non-coplanar vectors a, b, c ∈ R³ as
\[ v = \frac{\det(v, b, c)}{\det(a, b, c)}\,a + \frac{\det(a, v, c)}{\det(a, b, c)}\,b + \frac{\det(a, b, v)}{\det(a, b, c)}\,c. \tag{A.132} \]

A.2. Show that for a 2 × 2 matrix, A, the 2-norm is related to the Frobenius norm and determinant of A as [14]
\[ \|A\|_2^2 = \frac{1}{2}\left( \|A\|^2 + \sqrt{\|A\|^4 - 4\,|\det(A)|^2} \right), \]
where ‖A‖ denotes the Frobenius norm.

A.3. Solve the above expression for ‖A‖² as a function of ‖A‖₂² and det A.
A.4. Show that the 2-norm is the square root of the maximal eigenvalue of A*A:
\[ \|A\|_2 = \sqrt{\lambda_{\max}(A^* A)}. \]

A.5. Show that for an n × n matrix, A, with characteristic polynomial
\[ p(\lambda) = \det(\lambda I - A) = \lambda^n - I_1(A)\lambda^{n-1} + \cdots + (-1)^{n-1} I_{n-1}(A)\lambda + (-1)^n I_n(A) = 0, \]
that I₁(A) = tr(A) and I_n(A) = det(A).
A.6. Using the properties of the trace and determinant, prove that if det(P) ≠ 0, then: (a) det(PAP⁻¹) = det(A) and (b) tr(PAP⁻¹) = tr(A). (c) Is this true for all of the scalar invariants I_k(A) in the characteristic polynomial p_A(λ) = 0?⁷
A.7. For the matrices
\[ A_1 = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}, \quad A_3 = \begin{pmatrix} 2 & -1 & 0 & 0 \\ -1 & 2 & -1 & 0 \\ 0 & -1 & 2 & -1 \\ 0 & 0 & -1 & 2 \end{pmatrix} \]
compute the following by hand, without the assistance of a software package: (a) tr(A_k); (b) ‖A_k‖; (c) ‖A_k‖₂; (d) det(A_k); (e) all of the eigenvalues, λ_i(A_k), and eigenvectors, v_i(A_k); (f) show that (v_i(A_k), v_j(A_k)) = 0 if λ_i(A_k) ≠ λ_j(A_k).
A.8. Compute by hand analytically the eigenvalues and eigenvectors of the following matrices: (a) the arbitrary real skew-symmetric matrix in (A.16) and (b) the arbitrary rotation around the e₃ axis denoted as R₃(θ) in (A.44).
A.9. (a) Show that the cross product makes R³ a Lie algebra. (b) Show that the set of skew-symmetric matrices together with the matrix commutator

⁷The characteristic polynomial for any matrix A ∈ C^{n×n} is defined as p_A(λ) = det(λI − A). It can be written in the form p_A(λ) = λⁿ − I₁(A)λ^{n−1} + ⋯ + (−1)^k I_k(A)λ^{n−k} + ⋯ + (−1)ⁿ I_n(A), where I_k(A) is called the kth scalar invariant of A. For example, I₁(A) = tr(A) and I_n(A) = det(A).

\[ [A, B] = AB - BA \tag{A.133} \]
is a Lie algebra. (c) Show that there is a bijective (one-to-one and onto) mapping between these two Lie algebras.
A.10. Prove (A.27). Hint: Start with the case when det A ≠ 0, let f(B) = det(AB)/det A, and show that f(B) satisfies the three properties that are unique to the determinant.
A.11. Prove the sub-multiplicative property (A.50) for the Frobenius and induced 2-norm.
A.12. Determine whether or not the Frobenius and induced 2-norms are invariant under: (a) transformations of the form A → UAV where U and V are unitary; (b) arbitrary similarity transformations of the form A → PAP⁻¹ where det P ≠ 0.

A.13. The norm ‖A‖_∞ is sub-multiplicative and ‖I‖_∞ = 1, but is ‖U‖_∞ ≤ 1 for U ∈ SU(n)?
A.14. Show that: (a) As p → ∞ the norm in (A.58) approaches
\[ \|A\|'_{\infty} = \max_{1\le i,j\le n} |a_{ij}|. \]
(b) The norm ‖A‖′_∞ has the property ‖I‖′_∞ = 1 and ‖U‖′_∞ ≤ 1 for all U ∈ SU(n). (c) This norm lacks the sub-multiplicative property. If we define ‖A‖′′_∞ = n·‖A‖′_∞, will it become sub-multiplicative? Will ‖I‖′′_∞ = 1 and ‖U‖′′_∞ ≤ 1?
A.15. Show that if ‖A‖′ and ‖A‖′′ are matrix norms, then
\[ \|A\| = \max\{\|A\|', \|A\|''\} \tag{A.134} \]
is also a matrix norm.
A.16. Show that if ‖A‖ is a matrix norm, and φ : R → R is a function with the properties

φ(x) ≥ 0 with φ(x)=0⇐⇒ x =0

φ(xy) ≤ φ(x)φ(y) ∀ x, y ∈ R>0 (A.135)

\[ \varphi(ax + by) \le a\varphi(x) + b\varphi(y) \;\; \forall\; a, b \in \mathbb{R}_{>0}, \]
then φ(‖A‖) is a matrix norm.
A.17. Use (A.27) and the fact that AA⁻¹ = I to prove (A.29) and (A.30).
A.18. Use the Jordan decomposition together with the results of Exercises A.10 and A.17 above to prove (A.28).
A.19. Using the proof of (A.35) as a guide, prove (A.37).
A.20. Prove that [14]
\[ \left. \frac{d}{dt}\det(A + tB) \right|_{t=0} = \det A \;\mathrm{tr}(A^{-1}B). \tag{A.136} \]
Hint: First prove it for the case A = I by observing the part of the function f(t) = det(I + tB) that is linear in the parameter t.

A.21. Verify the following vector identities for arbitrary a, b, c, d ∈ R3:

a × (b × c)=(a · c)b − (a · b)c (A.137)

(a × b) × (c × d) = det(a, c, d)b − det(b, c, d)a (A.138)

(a × b) · (c × d)=(a · c)(b · d) − (a · d)(b · c) (A.139)

A.22. Show that for all matrices A, B, C of compatible dimensions, the Kronecker prod- uct satisfies

A ⊗ (B + C) = A ⊗ B + A ⊗ C
(A + B) ⊗ C = A ⊗ C + B ⊗ C
(αA) ⊗ B = A ⊗ (αB) = α(A ⊗ B)
(A ⊗ B) ⊗ C = A ⊗ (B ⊗ C).

A.23. Show that in general if A ∈ Rm×m and B ∈ Rn×n, then

tr(A ⊗ B) = tr(A) tr(B) and det(A ⊗ B) = (det A)ⁿ (det B)ᵐ.

A.24. Verify the following for matrices of compatible dimensions [17]:

(ABC)∨ =(CT ⊗A)B∨

tr(ABCD)=[(DT )∨]T (CT ⊗A)B∨ =[D∨]T (A⊗CT )(BT )∨.

A.25. Assuming A ∈ R^{N×N} and P ∈ GL(N, R) (i.e., P ∈ R^{N×N} and it is invertible), show the following: (a) exp(P⁻¹AP) = P⁻¹ exp(A) P; (b) exp(−At) = [exp(At)]⁻¹.
A.26. The Cayley–Hamilton theorem states that any matrix satisfies its own characteristic polynomial (i.e., if p(λ) = det(λI − A) is the characteristic polynomial of A, then p(A) = O). Use the Cayley–Hamilton theorem to compute by hand a closed-form expression for exp(tA) where
\[ A = \begin{pmatrix} 2 & 1 & 4 \\ 0 & 2 & 0 \\ 0 & 3 & 1 \end{pmatrix}. \]

A.27. Determine the stability of the following system:
\[ \frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{pmatrix} -2 & 1 & 2 \\ 0 & -1 & 6 \\ 0 & 0 & -3 \end{pmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} e^{-2t} \\ \cos 12t \\ 0 \end{bmatrix}. \]

A.28. Let A ∈ R^{n×n} be a constant matrix and x(t), g(t) ∈ Rⁿ be vector-valued functions of time. The solution to
\[ \frac{dx}{dt} = Ax + g(t) \quad \text{with} \quad x(0) = x_0 \]
is
\[ x(t) = \exp(At)\,x_0 + \int_0^t \exp(A(t-\tau))\,g(\tau)\,d\tau. \tag{A.140} \]
Similarly, given
\[ \frac{dx}{dt} = (A + B(t))\,x \quad \text{with} \quad x(0) = x_0 \]
(where B(t) ∈ R^{n×n} is a matrix-valued function of time) it is possible to write

\[ x(t) = \exp(At)\,x_0 + \int_0^t \exp(A(t-\tau))\,B(\tau)\,x(\tau)\,d\tau. \tag{A.141} \]
(a) Prove Equation (A.91). (b) Prove Equation (A.94).
A.29. Use Equation (A.91) and/or Equation (A.94) and/or the Bellman–Gronwall inequality to determine the behavior of x(t) governed by the following equations as t → ∞:
(a) ẍ + ẋ + (1 + e^{−t})x = 0
(b) ẍ + ẋ + (1 + 0.2 cos t)x = 0
(c) ẍ + ẋ + x = cos t
(d) ẍ + ẋ + x = e^{−t}
(e) ẍ + ẋ + x = e^{2t}.
Hint: Rewrite the above second-order differential equations as a system of first-order differential equations in terms of the vector x(t) = [x₁(t), x₂(t)]^T where x₁ = x and x₂ = ẋ.

A.30. Verify the following well-known formulas, assuming φi(x), f(x), and g(x) are sufficiently differentiable [5, 25]:

∇(φ1φ2)=φ1∇φ2 + φ2∇φ1 (A.142)

∇·(φ f)=φ∇·f + f ·∇φ (A.143)

∇×(φ f)=φ∇×f +(∇φ) × f (A.144)

∇×(∇φ)=0 (A.145)

∇·(∇×f) = 0 (A.146)

∇·(∇φ1 ×∇φ2) = 0 (A.147)

∇·(f × g)=g · (∇×f) − f · (∇×g) (A.148)

\[ \nabla^2(\varphi_1\varphi_2) = \varphi_1\nabla^2\varphi_2 + 2(\nabla\varphi_1)\cdot(\nabla\varphi_2) + \varphi_2\nabla^2\varphi_1 \tag{A.149} \]

\[ \nabla\cdot(\varphi_1\nabla\varphi_2) = \varphi_1\nabla^2\varphi_2 + \nabla\varphi_1\cdot\nabla\varphi_2 \tag{A.150} \]

\[ \nabla\times(\nabla\times f) = \nabla(\nabla\cdot f) - \nabla^2 f \tag{A.151} \]

where ∇²f ≐ Σ_{i=1}^{3} e_i ∇²f_i.
A.31. Given the Hermitian positive definite matrix A, prove that

|x∗Ay|2 ≤ (x∗Ax)(y∗Ay) (A.152) with equality if and only if x = cy for some c ∈ C. A.32. Prove the equalities in (A.129). A.33. Prove the equalities in (A.130).

References

1. Birkhoff, G., “On product integration,” J. Math. Phys., 16, pp. 104–132, 1937.
2. Boyce, W.E., DiPrima, R.C., Elementary Differential Equations and Boundary Value Problems, 7th ed., John Wiley & Sons, New York, 2001.
3. Campbell, S.L., Meyer, C.D., Jr., Generalized Inverses of Linear Transformations, Pitman, London, 1979.
4. Chirikjian, G.S., “Kinematic synthesis of mechanisms and robotic manipulators with binary actuators,” ASME J. Mech. Design, 117, pp. 573–580, 1995.
5. Davis, H.F., Snider, A.D., Introduction to Vector Analysis, 7th ed., William C. Brown, Dubuque, IA, 1995.
6. Davis, J.H., Foundations of Deterministic and Stochastic Control, Birkhäuser, Boston, 2002.
7. Edwards, H.M., Linear Algebra, Birkhäuser, Boston, 1995.
8. Elbert, T.F., Estimation and Control of Systems, Van Nostrand Reinhold, New York, 1984.
9. Fréchet, M., “Les solutions non commutables de l’équation matricielle e^X · e^Y = e^{X+Y},” Rend. Circ. Mat. Palermo, 1, pp. 11–21, 1952 (also Vol. 2, pp. 71–72, 1953).
10. Gantmacher, F.R., The Theory of Matrices, Chelsea, New York, 1959.
11. Golub, G.H., Van Loan, C.F., Matrix Computations, 2nd ed., Johns Hopkins University Press, Baltimore, 1989.
12. Halmos, P., Finite Dimensional Vector Spaces, Springer, New York, 1974.
13. Horn, R.A., Johnson, C.R., Matrix Analysis, Cambridge University Press, New York, 1985.
14. Hubbard, J.H., Burke-Hubbard, B., Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach, Prentice-Hall, Upper Saddle River, NJ, 1999.
15. Kanatani, K., Statistical Optimization for Geometric Computation, Dover, New York, 2005.
16. Larson, R., Hostetler, R.P., Edwards, B.H., Calculus, 8th ed., Houghton Mifflin, Boston, 2005.
17. Magnus, J.R., Neudecker, H., Matrix Differential Calculus with Applications in Statistics and Econometrics, 2nd ed., John Wiley & Sons, New York, 1999.
18. Magnus, W., “On the exponential solution of differential equations for a linear operator,” Comm. Pure Appl. Math., 7, pp. 649–673, 1954.
19. McKean, H.P., Jr., Stochastic Integrals, Academic Press, New York, 1969 (reissued in 2005 by the AMS).
20. Pan, V., “How can we speed up matrix multiplication,” SIAM Rev., 26, pp. 393–416, 1984.
21. Pan, V., How to Multiply Matrices Fast, Springer-Verlag, Berlin, 1984.
22. Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T., Numerical Recipes in C, Cambridge University Press, New York, 1988.

23. Rudin, W., Functional Analysis, McGraw-Hill, New York, 1991.
24. Rugh, W.J., Linear Systems Theory, 2nd ed., Prentice-Hall, Englewood Cliffs, NJ, 1995.
25. Schey, H.M., Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, 4th ed., W.W. Norton, New York, 2005.
26. So, W., “Equality cases in matrix exponential inequalities,” SIAM J. Matrix Anal. Appl., 13, pp. 1154–1158, 1992.
27. Stewart, G.W., Sun, J.-Q., Matrix Perturbation Theory, Academic Press, San Diego, 1990.
28. Thomas, G.B., Jr., Finney, R.L., Thomas’ Calculus, 10th ed., Addison Wesley, Reading, MA, 2000.
29. Wedderburn, J.H.M., Lectures on Matrices, Dover, New York, 1964.
30. Wei, J., Norman, E., “On global representations of the solutions of linear differential equations as a product of exponentials,” Proc. Amer. Math. Soc., 15, pp. 327–334, 1964.
31. Wermuth, E.M.E., Brenner, J., Leite, F.S., Queiro, J.F., “Computing matrix exponentials,” SIAM Rev., 31, pp. 125–126, 1989.


Applied and Numerical Harmonic Analysis

J.M. Cooper: Introduction to Partial Differential Equations with MATLAB (ISBN 978-0-8176-3967-9)
C.E. D'Attellis and E.M. Fernández-Berdaguer: Wavelet Theory and Harmonic Analysis in Applied Sciences (ISBN 978-0-8176-3953-2)
H.G. Feichtinger and T. Strohmer: Gabor Analysis and Algorithms (ISBN 978-0-8176-3959-4)
T.M. Peters, J.H.T. Bates, G.B. Pike, P. Munger, and J.C. Williams: The Fourier Transform in Biomedical Engineering (ISBN 978-0-8176-3941-9)
A.I. Saichev and W.A. Woyczyński: Distributions in the Physical and Engineering Sciences (ISBN 978-0-8176-3924-2)
R. Tolimieri and M. An: Time-Frequency Representations (ISBN 978-0-8176-3918-1)
G.T. Herman: Geometry of Digital Spaces (ISBN 978-0-8176-3897-9)
A. Procházka, J. Uhlíř, P.J.W. Rayner, and N.G. Kingsbury: Signal Analysis and Prediction (ISBN 978-0-8176-4042-2)
J. Ramanathan: Methods of Applied Fourier Analysis (ISBN 978-0-8176-3963-1)
A. Teolis: Computational Signal Processing with Wavelets (ISBN 978-0-8176-3909-9)
W.O. Bray and Č.V. Stanojević: Analysis of Divergence (ISBN 978-0-8176-4058-3)
G.T. Herman and A. Kuba: Discrete Tomography (ISBN 978-0-8176-4101-6)
J.J. Benedetto and P.J.S.G. Ferreira: Modern Sampling Theory (ISBN 978-0-8176-4023-1)
A. Abbate, C.M. DeCusatis, and P.K. Das: Wavelets and Subbands (ISBN 978-0-8176-4136-8)
L. Debnath: Wavelet Transforms and Time-Frequency Signal Analysis (ISBN 978-0-8176-4104-7)
K. Gröchenig: Foundations of Time-Frequency Analysis (ISBN 978-0-8176-4022-4)
D.F. Walnut: An Introduction to Wavelet Analysis (ISBN 978-0-8176-3962-4)
O. Bratteli and P. Jorgensen: Wavelets through a Looking Glass (ISBN 978-0-8176-4280-8)
H.G. Feichtinger and T. Strohmer: Advances in Gabor Analysis (ISBN 978-0-8176-4239-6)
O. Christensen: An Introduction to Frames and Riesz Bases (ISBN 978-0-8176-4295-2)
L. Debnath: Wavelets and Signal Processing (ISBN 978-0-8176-4235-8)
J. Davis: Methods of Applied Mathematics with a MATLAB Overview (ISBN 978-0-8176-4331-7)
G. Bi and Y. Zeng: Transforms and Fast Algorithms for Signal Analysis and Representations (ISBN 978-0-8176-4279-2)
J.J. Benedetto and A. Zayed: Sampling, Wavelets, and Tomography (ISBN 978-0-8176-4304-1)
E. Prestini: The Evolution of Applied Harmonic Analysis (ISBN 978-0-8176-4125-2)
O. Christensen and K.L. Christensen: Approximation Theory (ISBN 978-0-8176-3600-5)
L. Brandolini, L. Colzani, A. Iosevich, and G. Travaglini: Fourier Analysis and Convexity (ISBN 978-0-8176-3263-2)
W. Freeden and V. Michel: Multiscale Potential Theory (ISBN 978-0-8176-4105-4)
O. Calin and D.-C. Chang: Geometric Mechanics on Riemannian Manifolds (ISBN 978-0-8176-4354-6)

J.A. Hogan and J.D. Lakey: Time-Frequency and Time-Scale Methods (ISBN 978-0-8176-4276-1)
C. Heil: Harmonic Analysis and Applications (ISBN 978-0-8176-3778-1)
K. Borre, D.M. Akos, N. Bertelsen, P. Rinder, and S.H. Jensen: A Software-Defined GPS and Galileo Receiver (ISBN 978-0-8176-4390-4)
T. Qian, V. Mang I, and Y. Xu: Wavelet Analysis and Applications (ISBN 978-3-7643-7777-9)
G.T. Herman and A. Kuba: Advances in Discrete Tomography and Its Applications (ISBN 978-0-8176-3614-2)
M.C. Fu, R.A. Jarrow, J.-Y. J. Yen, and R.J. Elliott: Advances in Mathematical Finance (ISBN 978-0-8176-4544-1)
O. Christensen: Frames and Bases (ISBN 978-0-8176-4677-6)
P.E.T. Jorgensen, K.D. Merrill, and J.A. Packer: Representations, Wavelets, and Frames (ISBN 978-0-8176-4682-0)
M. An, A.K. Brodzik, and R. Tolimieri: Ideal Sequence Design in Time-Frequency Space (ISBN 978-0-8176-4737-7)
S.G. Krantz: Explorations in Harmonic Analysis (ISBN 978-0-8176-4668-4)
G.S. Chirikjian: Stochastic Models, Information Theory, and Lie Groups, Volume I (ISBN 978-0-8176-4802-2)
C. Cabrelli and J.L. Torrea: Recent Developments in Real and Harmonic Analysis (ISBN 978-0-8176-4531-1)