
4. Euclidean Spaces

In studying the geometry of ℝ in the first three chapters, we took advantage of its algebraic properties. ℝ admits addition and multiplication operations which obey certain rules. We exploited these operations to define the absolute value function and used this function to define the standard metric on ℝ. The standard metric determines the geometry of ℝ. We also defined two types of rigid motions of ℝ (translations and reflections) using addition and multiplication, and proved that every rigid motion of ℝ is of one of these two types. The algebraic rules obeyed by addition and multiplication of real numbers make ℝ into an object which mathematicians call a field. We now state the definition of a field.

Definition. Suppose that F is a set that is equipped with two operations: addition and multiplication. Addition combines two elements x and y of F into a single element of F which is denoted x + y and is called the sum of x and y. Multiplication combines two elements x and y of F into a single element of F which is denoted xy and is called the product of x and y. If these operations obey the following five rules, then F is called a field.

The associativity of addition and multiplication: (x + y) + z = x + (y + z) and (xy)z = x(yz) for all x, y and z ∈ F.

The commutativity of addition and multiplication: x + y = y + x and xy = yx for all x and y ∈ F.

The existence of additive and multiplicative identities: There are elements of F denoted 0 and 1 such that x + 0 = 0 + x = x and x1 = 1x = x for every x ∈ F.

The existence of additive and multiplicative inverses: For every x ∈ F, there is an element of F denoted –x such that x + (–x) = (–x) + x = 0; and for every x ∈ F such that x ≠ 0, there is an element of F denoted x⁻¹ such that x(x⁻¹) = (x⁻¹)x = 1.

The distributivity of multiplication over addition: x(y + z) = xy + xz and (x + y)z = xz + yz for all x, y and z ∈ F.

Since the addition and multiplication operations on ℝ clearly obey these rules, then we know that ℝ is a field. While exploring the geometry of ℝ, we exploited the fact that ℝ is a field without ever explicitly using the term “field”.

We will take a similar approach to the geometry of n-dimensional space ℝn. First we will observe that ℝn is equipped with three algebraic operations. The first two of these operations, vector addition and scalar multiplication, make ℝn into an object that mathematicians call a vector space. The third algebraic operation is a form of multiplication called an inner product. The inner product is used to define a function on ℝn called a norm. The norm on ℝn is a generalization of the absolute value function on ℝ. We then use the norm on ℝn to define the Euclidean metric on ℝn in exactly the same way that we used the absolute value function to define the standard metric on ℝ. When ℝn is equipped with the Euclidean metric, we obtain a metric space called Euclidean n-space. We will spend the remainder of this course studying the geometry of Euclidean spaces.

We begin by defining ℝn.

Definition. Let n ≥ 1 be an integer. An n-tuple (x1, x2, … , xn) is an object with the following property which we call the fundamental property of n-tuples:

(x1, x2, … , xn) = (y1, y2, … , yn) if and only if xi = yi for each i between 1 and n.

The objects x1, x2, … , xn are called the coordinates of the n-tuple (x1, x2, … , xn): x1 is the first coordinate of (x1, x2, … , xn), x2 is the second coordinate of (x1, x2, … , xn), … , and xn is the nth coordinate of (x1, x2, … , xn). Thus, an n-tuple is an object with n coordinates that is uniquely determined by its coordinates.

Definition. Let n ≥ 1 be an integer. Let ℝn denote the set of all n-tuples (x1, x2, … , xn) such that each coordinate xi is a real number:

ℝn = { (x1, x2, … , xn) : xi ∈ ℝ for 1 ≤ i ≤ n }.

ℝn is called Cartesian n-space. Note that ℝ1 = ℝ.

Next we define two algebraic operations on ℝn: vector addition and scalar multiplication.

Definition. For any two elements x = (x1, x2, … , xn) and y = (y1, y2, … , yn) of ℝn, vector addition combines x and y into a single element of ℝn called their vector sum, which is denoted x + y and which is defined by the equation

x + y = (x1 + y1, x2 + y2, … , xn + yn).

For each real number a ∈ ℝ and each element x = (x1, x2, … , xn) of ℝn, scalar multiplication combines a and x into a single element of ℝn called their scalar product, which is denoted ax and which is defined by the equation

ax = (ax1, ax2, … , axn).

Further notation: 0 = (0, 0, … , 0) ∈ ℝn.

For each x = (x1, x2, … , xn) ∈ ℝn,

–x = (–1)x = (–x1, –x2, … , –xn).

For all x = (x1, x2, … , xn) and y = (y1, y2, … , yn) ∈ ℝn,

x – y = x + (–y) = (x1 – y1, x2 – y2, … , xn – yn).
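The coordinatewise definitions above translate almost directly into code. The following is a minimal sketch, assuming elements of ℝn are modeled as Python tuples of exact fractions; the function names vec_add, scalar_mul and vec_sub are hypothetical and the sample vectors are made up for illustration.

```python
from fractions import Fraction as F

def vec_add(x, y):
    """Vector sum x + y: add corresponding coordinates."""
    if len(x) != len(y):
        raise ValueError("both tuples must have the same number of coordinates")
    return tuple(xi + yi for xi, yi in zip(x, y))

def scalar_mul(a, x):
    """Scalar product ax: multiply every coordinate of x by a."""
    return tuple(a * xi for xi in x)

def vec_sub(x, y):
    """Difference x - y, defined as x + (-1)y."""
    return vec_add(x, scalar_mul(-1, y))

# Illustrative vectors in R^3 (not taken from the text).
x = (F(1), F(-2), F(1, 2))
y = (F(3), F(0), F(1, 2))

print(vec_add(x, y))
print(scalar_mul(F(2), x))
print(vec_sub(x, y))
```

Exact fractions are used here only so that the arithmetic matches hand computation; floating-point coordinates would work the same way up to rounding.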

Homework Problem 4.1. Let x = (0, 2, –3), y = (–1/2, 2/3, 0), z = (3/4, 0, –2) and w ∈ ℝ3. a) Compute 3x – 2y + z. b) Compute 2(x – z) – 6(y + x). c) Solve 5x – 3w = y for w.

The geometric interpretation of vector addition and scalar multiplication. If x and y ∈ ℝn, then 0, x, y and x + y form the four vertices of a parallelogram in ℝn.

[Figure: parallelogram with vertices 0, x, y and x + y.]

Also if x ∈ ℝn and a ∈ ℝ, then 0, x and ax are collinear (lie in the same line) and the distance from 0 to ax is | a | times the distance from 0 to x.

[Figure: the collinear points –x, 0 and (1/2)x.]

The observation that vector addition creates a parallelogram inspires the vector interpretation of the elements of ℝn. Physics makes very effective use of this point of view. Let p be a fixed element of ℝn. We create a copy of ℝn based at p which we denote $\mathbb{R}^n_p$. For each x ∈ ℝn, there is a copy of x (also denoted “x”) in $\mathbb{R}^n_p$ which is the directed line segment in ℝn from p to p + x. Thus the line segment joining 0 to x and the directed line segment represented by x in $\mathbb{R}^n_p$ are opposite sides of a parallelogram.

[Figure: the segment from 0 to x and the directed line segment x from p to p + x, drawn as opposite sides of a parallelogram.]

The elements of $\mathbb{R}^n_p$ are called vectors based at p. In summary: if x is a vector based at p (i.e., x ∈ $\mathbb{R}^n_p$), then x is a directed line segment from p to p + x.

From this point of view, if p and q ∈ ℝn, then the directed line segment from p to q can be labeled x if and only if q = p + x. Then x = q – p. (In this case, the directed line segment labeled x is regarded as an element of $\mathbb{R}^n_p$.) In other words, the vector perspective gives us the power to assign the directed line segment from an element p of ℝn to an element q of ℝn the label x ∈ ℝn precisely when p + x = q.

[Figure: the directed line segment from p to q, labeled x = q – p.]

Hence, if p, q and r are elements of ℝn, then we can use x, y and z ∈ ℝn to label the directed sides of the triangle with vertices p, q and r as shown below if and only if x = q – p, y = r – q and z = r – p.

[Figure: triangle with vertices p, q and r whose directed sides are labeled x (from p to q), y (from q to r) and z (from p to r).]

As we remarked above, the operations of vector addition and scalar multiplication make ℝn into an object that mathematicians call a vector space. We now give the definition of “vector space” and observe that ℝn is one.

Definition. Suppose that V is a set that is equipped with two operations: vector addition and scalar multiplication. Vector addition combines two elements x and y of V into a single element of V which is denoted x + y and is called the vector sum of x and y. Scalar multiplication combines a real number a ∈ ℝ and an element x of V into a single element of V which is denoted ax and is called the scalar product of a and x. If these operations obey the following eight rules, then V is called a vector space.

The associativity of vector addition: (x + y) + z = x + (y + z) for all x, y and z ∈ V.

The commutativity of vector addition: x + y = y + x for all x and y ∈ V.

The existence of an identity for vector addition: There is an element of V denoted 0 such that x + 0 = x = 0 + x for every x ∈ V.

The existence of an inverse for vector addition: For every x ∈ V, there is an element of V denoted –x such that x + (–x) = 0 = (–x) + x.

The associativity of scalar multiplication: (ab)x = a(bx) for all a and b ∈ ℝ and every x ∈ V.

The existence of an identity for scalar multiplication: 1x = x for every x ∈ V.

The distributivity of scalar multiplication over vector addition: a(x + y) = ax + ay for every a ∈ ℝ and all x and y ∈ V.

The distributivity of scalar multiplication over scalar addition: (a + b)x = ax + bx for all a and b ∈ ℝ and every x ∈ V.

Theorem 4.1. ℝn with the operations of vector addition and scalar multiplication is a vector space.

Proof. The proof requires us to prove that vector addition and scalar multiplication on ℝn obey the eight properties listed in the definition of “vector space”. We will write out proofs of three of these properties – the associativity of vector addition, the existence of an inverse for vector addition, and the distributivity of scalar multiplication over vector addition. We will leave the proof of the remaining five properties as homework problems.

Proof of the associativity of vector addition. Assume x = (x1, x2, … , xn), y = (y1, y2, … , yn) and z = (z1, z2, … , zn) ∈ ℝn. Then for 1 ≤ i ≤ n, the ith coordinate of (x + y) + z is (xi + yi) + zi and the ith coordinate of x + (y + z) is xi + (yi + zi). Since addition of real numbers is associative, then (xi + yi) + zi = xi + (yi + zi). Hence, the ith coordinate of (x + y) + z equals the ith coordinate of x + (y + z) for 1 ≤ i ≤ n. Therefore, (x + y) + z = x + (y + z) by the fundamental property of n-tuples. ∎

Proof of the existence of an inverse for vector addition. To prove this statement, we must already have proven that 0 = (0, 0, … , 0) is an identity for vector addition. Assume that we have already proven that 0 is the identity for vector addition. Assume x = (x1, x2, … , xn) ∈ ℝn.

For 1 ≤ i ≤ n, each real number xi has an additive inverse –xi ∈ ℝ.

Recall that –x = (–x1, –x2, … , –xn). Then –x ∈ ℝn. For 1 ≤ i ≤ n, the ith coordinate of x + (–x) is xi + (–xi) and the ith coordinate of 0 is 0.

Since –xi is the additive inverse of xi, then xi + (–xi) = 0. Hence, the ith coordinate of x + (–x) equals the ith coordinate of 0 for 1 ≤ i ≤ n. Therefore, x + (–x) = 0 by the fundamental property of n-tuples. ∎

Proof of the distributivity of scalar multiplication over vector addition. Assume a ∈ ℝ and x = (x1, x2, … , xn) and y = (y1, y2, … , yn) ∈ ℝn. Then for 1 ≤ i ≤ n, the ith coordinate of a(x + y) is a(xi + yi) and the ith coordinate of ax + ay is axi + ayi. Since the multiplication of real numbers distributes over the addition of real numbers, then a(xi + yi) = axi + ayi. Hence, the ith coordinate of a(x + y) equals the ith coordinate of ax + ay for 1 ≤ i ≤ n. Therefore, a(x + y) = ax + ay by the fundamental property of n-tuples. ∎
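The coordinatewise arguments above can be spot-checked numerically. The sketch below is not a proof: it only tests the eight vector space rules on a handful of sample vectors and scalars, which are made up for illustration. The helper names add, mul and axioms_hold are hypothetical, and exact fractions are assumed so that equality tests are reliable.

```python
from fractions import Fraction as F
from itertools import product

def add(x, y):
    """Vector addition in R^n, coordinate by coordinate."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def mul(a, x):
    """Scalar multiplication in R^n, coordinate by coordinate."""
    return tuple(a * xi for xi in x)

def axioms_hold(vectors, scalars):
    """Check the eight vector space rules on the given samples only."""
    zero = (F(0),) * len(vectors[0])
    for x, y, z in product(vectors, repeat=3):
        if add(add(x, y), z) != add(x, add(y, z)):        # associativity of +
            return False
    for x, y in product(vectors, repeat=2):
        if add(x, y) != add(y, x):                         # commutativity of +
            return False
    for x in vectors:
        if add(x, zero) != x or add(zero, x) != x:         # identity for +
            return False
        if add(x, mul(F(-1), x)) != zero:                  # inverse for +
            return False
        if mul(F(1), x) != x:                              # identity for scalars
            return False
    for a, b in product(scalars, repeat=2):
        for x in vectors:
            if mul(a * b, x) != mul(a, mul(b, x)):         # (ab)x = a(bx)
                return False
            if mul(a + b, x) != add(mul(a, x), mul(b, x)): # (a + b)x = ax + bx
                return False
    for a in scalars:
        for x, y in product(vectors, repeat=2):
            if mul(a, add(x, y)) != add(mul(a, x), mul(a, y)):  # a(x + y) = ax + ay
                return False
    return True

vectors = [(F(1), F(-2), F(0)), (F(1, 2), F(3), F(-1)), (F(0), F(5, 3), F(2))]
scalars = [F(-1), F(0), F(2, 3), F(4)]
print(axioms_hold(vectors, scalars))
```

Each rule reduces to the matching rule for real numbers in every coordinate, so a failure here would indicate a bug in the helpers rather than in the theorem.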

Homework Problem 4.2. Prove that vector addition and scalar multiplication on ℝn obey the remaining five properties listed in the definition of “vector space”: a) the commutativity of vector addition, b) the existence of an identity for vector addition, c) the associativity of scalar multiplication, d) the existence of an identity for scalar multiplication, and e) the distributivity of scalar multiplication over scalar addition.

Homework Problem 4.3. Prove that the operations of vector addition and scalar multiplication on ℝn satisfy the following three algebraic properties. i) 0x = 0 for every x ∈ ℝn. ii) a0 = 0 for every a ∈ ℝ. iii) (–1)x = –x for every x ∈ ℝn.

Next we define a third algebraic operation on ℝn called the dot product. The dot product is a form of multiplication of two elements of ℝn. Using the dot product we will define a norm on ℝn which we will then use to define the Euclidean metric on ℝn. The Euclidean metric will make ℝn into a metric space called Euclidean n-space. We will then study the geometry of Euclidean n-space.

Definition. The dot product on ℝn is a function which assigns to every pair of elements x = (x1, x2, … , xn) and y = (y1, y2, … , yn) of ℝn a real number denoted x•y which is called the dot product of x and y and which is defined by the equation

x•y = x1y1 + x2y2 + … + xnyn.

The dot product on ℝn is a particular instance of a function on a vector space V called an inner product, which assigns a real number to every pair of elements of V. We will now define the concept of an inner product and then prove that the dot product on ℝn is, indeed, an inner product.

Definition. Let V be a vector space. Suppose there is a function which assigns to every pair of elements x and y ∈ V a real number denoted x∗y. If this function obeys the following three rules, then it is called an inner product on V.

Bilinearity: (x + y)∗z = x∗z + y∗z, x∗(y + z) = x∗y + x∗z and (ax)∗y = a(x∗y) = x∗(ay) for all x, y and z ∈ V and every a ∈ ℝ.

Symmetry: x∗y = y∗x for all x and y ∈ V.

Positivity: x∗x ≥ 0, and x∗x = 0 if and only if x = 0, for every x ∈ V.

Theorem 4.2. The dot product on ℝn is an inner product on ℝn.

Proof. We will verify that the dot product satisfies the first equation of the bilinearity property that is part of the definition of “inner product”. We will leave the proofs that the dot product satisfies the remaining parts of the definition of “inner product” as homework problems.

Let x = (x1, x2, … , xn), y = (y1, y2, … , yn) and z = (z1, z2, … , zn) ∈ ℝn. We will prove (x + y)•z = x•z + y•z.

(x + y)•z = (x1 + y1, x2 + y2, … , xn + yn)•(z1, z2, … , zn)

= (x1 + y1)z1 + (x2 + y2)z2 + … + (xn + yn)zn

= (x1z1 + y1z1) + (x2z2 + y2z2) + … + (xnzn + ynzn)

= (x1z1 + x2z2 + … + xnzn) + (y1z1 + y2z2 + … + ynzn) = x•z + y•z. ∎
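The dot product and the three inner product rules can be tried out on concrete vectors. The following is a small sketch under the assumption that vectors are tuples of exact fractions; the helper names dot, add and mul and the sample values are hypothetical. It checks the rules on one set of samples only and is no substitute for the proofs.

```python
from fractions import Fraction as F

def dot(x, y):
    """Dot product: sum of the products of corresponding coordinates."""
    return sum(xi * yi for xi, yi in zip(x, y))

def add(x, y):
    return tuple(xi + yi for xi, yi in zip(x, y))

def mul(a, x):
    return tuple(a * xi for xi in x)

# Illustrative data in R^3.
x = (F(1), F(2), F(-1))
y = (F(0), F(1, 2), F(3))
z = (F(-2), F(1), F(1))
a = F(3, 4)
zero = (F(0),) * 3

checks = {
    "(x + y).z = x.z + y.z": dot(add(x, y), z) == dot(x, z) + dot(y, z),
    "x.(y + z) = x.y + x.z": dot(x, add(y, z)) == dot(x, y) + dot(x, z),
    "(ax).y = a(x.y) = x.(ay)": dot(mul(a, x), y) == a * dot(x, y) == dot(x, mul(a, y)),
    "x.y = y.x": dot(x, y) == dot(y, x),
    "x.x >= 0": dot(x, x) >= 0,
    "0.0 = 0": dot(zero, zero) == 0,
}
for name, ok in checks.items():
    print(name, ok)
```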

Homework Problem 4.4. Prove that the dot product on ℝn obeys the remaining properties listed in the definition of “inner product”: a) x•(y + z) = x•y + x•z for all x, y and z ∈ ℝn. b) (ax)•y = a(x•y) = x•(ay) for all x and y ∈ ℝn and every a ∈ ℝ. c) x•y = y•x for all x and y ∈ ℝn. d) x•x ≥ 0 for every x ∈ ℝn. e) x•x = 0 if and only if x = 0 for every x ∈ ℝn.

Our next step is to use the dot product on ℝn to define a function from ℝn to the real numbers ℝ called the Euclidean norm. The Euclidean norm, which is like an absolute value function on ℝn, will be used to define the Euclidean metric on ℝn and thereby give ℝn geometry.

Definition. The Euclidean norm on ℝn is a function which assigns to every element x = (x1, x2, … , xn) of ℝn a real number denoted || x || which is called the Euclidean norm of x and which is defined by the equation

$\|x\| = \sqrt{x \bullet x} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.

The Euclidean norm on ℝn is a particular instance of a function from a vector space V to the real numbers which is called a norm on V and behaves somewhat like an absolute value function on V. We will now define the concept of a norm on a vector space and then prove that the Euclidean norm on ℝn is, indeed, a norm.

Definition. Let V be a vector space. Suppose there is a function which assigns to every element x ∈ V a real number denoted || x ||. If this function obeys the following three rules, then it is called a norm on V.

Positivity: || x || ≥ 0, and || x || = 0 if and only if x = 0, for every x ∈ V.

Homogeneity: || ax || = |a| || x || for every x ∈ V and every a ∈ ℝ.

The triangle inequality: || x + y || ≤ || x || + || y || for all x and y ∈ V.

Next we will prove that the Euclidean norm on ℝn, which was defined in terms of the dot product by the formula $\|x\| = \sqrt{x \bullet x}$, is a norm in the sense just defined. However, before embarking on this proof, we observe that other norms besides the Euclidean norm can be defined on ℝn. For instance, we can define the taxicab norm and the maximum norm on ℝn.

Definition. The taxicab norm on ℝn is a function which assigns to every element x = (x1, x2, … , xn) of ℝn a real number denoted || x ||T which is called the taxicab norm of x and which is defined by the equation

|| x ||T = | x1 | + | x2 | + … + | xn |.

The maximum norm on ℝn is a function which assigns to every element x = (x1, x2, … , xn) of ℝn a real number denoted || x ||M which is called the maximum norm of x and which is defined by the equation

|| x ||M = max { | x1 |, | x2 |, … , | xn | }.
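To compare the three norms side by side, here is a minimal sketch that evaluates each of them on a sample vector. The function names euclidean_norm, taxicab_norm and maximum_norm are hypothetical, and the vector (3, –4) is chosen only because its Euclidean norm is a whole number.

```python
import math

def euclidean_norm(x):
    """Square root of the sum of squared coordinates."""
    return math.sqrt(sum(xi * xi for xi in x))

def taxicab_norm(x):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(xi) for xi in x)

def maximum_norm(x):
    """Largest absolute value among the coordinates."""
    return max(abs(xi) for xi in x)

x = (3, -4)
print(euclidean_norm(x))  # 5.0
print(taxicab_norm(x))    # 7
print(maximum_norm(x))    # 4
```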

We can then use the taxicab and maximum norms to define the taxicab metric and the maximum metric on ℝn. Clearly, these norms are not defined using the dot product. Because the Euclidean norm and Euclidean metric derive from the dot product while the taxicab and maximum norm and metric do not, the geometry on ℝn associated with the Euclidean metric has certain nice properties that the geometries coming from the taxicab and maximum metrics do not share. We saw this in ℝ2. For example, in Euclidean geometry, there is only one midpoint between any two points in ℝn; whereas in the taxicab and maximum geometries, there may be infinitely many midpoints between two points in ℝn depending on how the two points are situated.
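The remark about midpoints can be explored by brute force in ℝ2. The sketch below searches a small grid of candidate points, so it can only find midpoints that happen to lie on the grid; the grid bounds, the step 1/2 and the sample endpoints are assumptions made for illustration, and all function names are hypothetical. Squared Euclidean distances are compared so that the arithmetic stays exact.

```python
from fractions import Fraction as F

def taxicab(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def maximum(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def euclid_sq(x, y):
    """Squared Euclidean distance (exact for fractions)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def grid(lo, hi, step):
    """All points (a, b) with a and b in lo, lo + step, ..., hi."""
    count = int((hi - lo) / step) + 1
    ticks = [lo + k * step for k in range(count)]
    return [(a, b) for a in ticks for b in ticks]

def midpoints(d, x, y, candidates):
    """Candidates z with d(x, z) = d(y, z) = (1/2) d(x, y)."""
    half = d(x, y) / 2
    return [z for z in candidates if d(x, z) == half and d(y, z) == half]

def euclid_midpoints(x, y, candidates):
    """Same condition for the Euclidean metric, stated with squared distances."""
    quarter = euclid_sq(x, y) / 4
    return [z for z in candidates if euclid_sq(x, z) == quarter == euclid_sq(y, z)]

pts = grid(F(-1), F(3), F(1, 2))
o, diag, horiz = (F(0), F(0)), (F(2), F(2)), (F(2), F(0))

print(len(midpoints(taxicab, o, diag, pts)))   # several grid midpoints
print(len(midpoints(maximum, o, horiz, pts)))  # several grid midpoints
print(euclid_midpoints(o, diag, pts))          # only (1, 1)
```

On this grid the taxicab metric yields five midpoints between (0, 0) and (2, 2), and the maximum metric yields five between (0, 0) and (2, 0), while the Euclidean search finds a single one. Swapping the two endpoint pairs gives a unique grid midpoint in each case, which matches the comment that the answer depends on how the two points are situated.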

Theorem 4.3. The Euclidean norm on ℝn is a norm on ℝn.

Proof. We will verify that the Euclidean norm satisfies the positivity property that is part of the definition of “norm”. We will leave the proof that the Euclidean norm satisfies the homogeneity property as a homework problem. Proving the triangle inequality for the Euclidean norm is harder than proving it for the taxicab and maximum norms. Our proof that the Euclidean norm satisfies the triangle inequality requires an auxiliary result known as the Cauchy Inequality which is a foundational result of Euclidean geometry. We will state and prove the Cauchy Inequality below in Theorem 4.5. We will then complete the proof that the Euclidean norm is a norm by showing that it satisfies the triangle inequality.

We now prove that the Euclidean norm satisfies the positivity property that is part of the definition of a norm. Let x = (x1, x2, … , xn) ∈ ℝn.

Since the square root function maps [0,∞) to [0,∞), then $\|x\| = \sqrt{x \bullet x} \in [0,\infty)$. Hence, || x || ≥ 0.

Now assume x = 0. Since 0•0 = 0, then $\|x\| = \sqrt{0 \bullet 0} = \sqrt{0} = 0$.

Finally assume || x || = 0. Then $\sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = 0$. Hence, $x_1^2 + x_2^2 + \cdots + x_n^2 = 0$. Since $x_i^2 \ge 0$ for 1 ≤ i ≤ n, then the only way that $x_1^2 + x_2^2 + \cdots + x_n^2$ can be 0 is if each $x_i^2 = 0$. Hence, xi = 0 for 1 ≤ i ≤ n. Therefore, x = (0, 0, … , 0) = 0.

We have now proved that || x || = 0 if and only if x = 0. Thus, we have verified that the Euclidean norm satisfies the positivity property that is part of the definition of a norm. ∎

Homework Problem 4.5. Prove that the Euclidean norm satisfies the homogeneity property from the definition of a norm.

Homework Problem 4.6. a) Prove that the taxicab norm is a norm on ℝn. b) Prove that the maximum norm is a norm on ℝn.

Before we state and prove the fundamental theorem known as the Cauchy Inequality, we present several basic and useful results about the dot product and Euclidean norm.

Lemma 4.4.

a) For all x, y ∈ ℝn, || x ± y ||² = || x ||² ± 2x•y + || y ||².

b) For all x, y ∈ ℝn, ( x – y )•( x + y ) = || x ||² – || y ||².

c) The Parallelogram Law: For all x, y ∈ ℝn, || x + y ||² + || x – y ||² = 2|| x ||² + 2|| y ||².

In-Class Exercise 4.A. Prove Lemma 4.4 a).

Homework Problem 4.7. Prove parts b) and c) of Lemma 4.4.

Homework Problem 4.8. a) Prove that if x, y and z ∈ ℝn, || x || = || y || = || z || and x + y + z = 0, then || x – y || = || x – z || = || y – z || (i.e., x, y and z are the vertices of an equilateral triangle). b) Let w, x, y and z ∈ ℝn. If || w || = || x || = || y || = || z || and w + x + y + z = 0, what if anything can be said about the quadrilateral with vertices w, x, y and z?

We now state and prove the Cauchy Inequality. As we remarked previously, this theorem is a foundational result about the dot product and the Euclidean norm which assists us in establishing many basic geometric properties of Euclidean spaces, including the facts that the Euclidean norm and Euclidean metric satisfy triangle inequalities.

Theorem 4.5. a) The Cauchy Inequality: For all x and y ∈ ℝn, | x•y | ≤ || x || || y ||. b) Equality holds in the Cauchy Inequality (i.e. | x•y | = || x || || y ||) if and only if one of x and y is a scalar multiple of the other. More precisely, x•y = || x || || y || if and only if ( || y || )x = ( || x || )y and x•y = – || x || || y || if and only if ( || y || )x = – ( || x || )y.

Remark. One might ask why, in part b) of this theorem, the equation ( || y || )x = ( || x || )y is written in this form rather than in the form x = ( || x || / || y || )y or y = ( || y || / || x || )x, either of which would reveal more directly that x is a scalar multiple of y or vice versa. The reason is that in the case that either x = 0 or y = 0 (or both), then the first equation remains valid, while either the second or the third equation (or both) becomes meaningless because it contains a fraction with 0 in its denominator. The same comment applies to the equation ( || y || )x = – ( || x || )y.

Historical comment. The Cauchy inequality is a well known and fundamental mathematical fact that is sometimes called the Cauchy-Schwarz inequality or the Cauchy-Schwarz-Bunyakovsky inequality. The great French mathematician Augustin Cauchy discovered and proved this inequality for points in ℝn in 1821. The Russian mathematician Viktor Bunyakovsky who was Cauchy’s doctoral student in the 1820’s proved a version of the Cauchy inequality for integrals in 1859. The integral version of the inequality was rediscovered by the German mathematician Hermann Schwarz in 1888.

Proof of a). We first prove the Cauchy Inequality for all vectors of norm 1. (Observe that for vectors u and v of norm 1, the Cauchy Inequality says: | u•v | ≤ 1.) Assume u, v ∈ ℝn such that || u || = || v || = 1. Then Lemma 4.4.a implies

0 ≤ || u – v ||² = || u ||² – 2u•v + || v ||² = 2 – 2u•v, and

0 ≤ || u + v ||² = || u ||² + 2u•v + || v ||² = 2 + 2u•v.

Adding 2u•v to both sides of the first inequality and subtracting 2 from both sides of the second inequality yields –2 ≤ 2u•v ≤ 2. Hence, –1 ≤ u•v ≤ 1. We have proved: if u, v ∈ ℝn and || u || = || v || = 1, then | u•v | ≤ 1.

To prove the general case of the Cauchy Inequality, assume x and y ∈ ℝn. First consider the case in which either x = 0 or y = 0 (or both). Then x•y = 0, and || x || || y || = 0. Hence, the statement which we are trying to prove, | x•y | ≤ || x || || y ||, is obviously true in this case. So we can focus on the remaining case, and assume that x ≠ 0 and y ≠ 0. Then || x || ≠ 0 and || y || ≠ 0, and we can define

$u = \frac{1}{\|x\|}\,x$ and $v = \frac{1}{\|y\|}\,y$.

Then $\|u\| = \left\| \frac{1}{\|x\|}\,x \right\| = \frac{1}{\|x\|}\,\|x\| = 1$. Similarly || v || = 1. Hence, the result established in the first paragraph implies | u•v | ≤ 1. Therefore,

$\left| \left( \frac{1}{\|x\|}\,x \right) \bullet \left( \frac{1}{\|y\|}\,y \right) \right| \le 1$.

Since

$\left( \frac{1}{\|x\|}\,x \right) \bullet \left( \frac{1}{\|y\|}\,y \right) = \frac{1}{\|x\|\,\|y\|}\,(x \bullet y) = \frac{x \bullet y}{\|x\|\,\|y\|}$,

then we have:

$\frac{|x \bullet y|}{\|x\|\,\|y\|} \le 1$.

Hence, | x•y | ≤ || x || || y ||. This completes the proof of the Cauchy Inequality. ∎

Proof of b). Again we begin by considering vectors of norm 1. Assume u, v ∈ ℝn such that || u || = || v || = 1. We will prove: u•v = 1 if and only if u = v. First assume u•v = 1. Then Lemma 4.4.a implies || u – v ||² = || u ||² – 2u•v + || v ||² = 1 – 2 + 1 = 0. Therefore, || u – v || = 0. Hence, u – v = 0. So u = v. Second assume u = v. Then u•v = u•u = || u ||² = 1. This completes our consideration of norm 1 vectors.

To prove part b) in the general case, assume x and y ∈ ℝn. Observe that if either x = 0 or y = 0, then each of the two statements “x•y = || x || || y || if and only if ( || y || )x = ( || x || )y” and “x•y = – || x || || y || if and only if ( || y || )x = – ( || x || )y” reduces to the true statement “0 = 0 if and only if 0 = 0”. So part b) is true if either x = 0 or y = 0. Thus, we can assume that x ≠ 0 and y ≠ 0. Then || x || ≠ 0 and || y || ≠ 0, and we can again define

$u = \frac{1}{\|x\|}\,x$ and $v = \frac{1}{\|y\|}\,y$.

Then, as before, || u || = || v || = 1. Observe that

$u \bullet v = \frac{1}{\|x\|\,\|y\|}\,(x \bullet y)$.

Now, using the fact that u•v = 1 is equivalent to u = v for norm 1 vectors, we see that the following statements are equivalent:

x•y = || x || || y ||, $\frac{1}{\|x\|\,\|y\|}\,(x \bullet y) = 1$, u•v = 1, u = v, $\frac{1}{\|x\|}\,x = \frac{1}{\|y\|}\,y$, ( || y || )x = ( || x || )y.

Finally, since || y || = || – y ||, then it follows that the following statements are equivalent:

x•y = – || x || || y ||, – (x•y) = || x || || y ||, x•(– y) = || x || || – y ||, ( || – y || )x = ( || x || )( – y), ( || y || )x = – ( || x || )y.

This completes the proof of part b). ∎
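Both parts of Theorem 4.5 can be illustrated with small integer vectors. The sketch below uses the fact that | x•y | ≤ || x || || y || is equivalent to (x•y)² ≤ (x•x)(y•y), since both sides of the first inequality are nonnegative; this lets the inequality be tested without square roots. The names dot, norm and cauchy_gap are hypothetical, and the vectors are chosen so that their norms are whole numbers.

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def cauchy_gap(x, y):
    """(x.x)(y.y) - (x.y)^2; nonnegative exactly when |x.y| <= ||x|| ||y||."""
    return dot(x, x) * dot(y, y) - dot(x, y) ** 2

x = (1, 2, -2)    # norm 3
y = (3, 0, 4)     # norm 5, not a scalar multiple of x
p = (-2, -4, 4)   # p = -2x, norm 6

print(cauchy_gap(x, y))   # 200: strict inequality
print(cauchy_gap(x, p))   # 0: equality, since p is a multiple of x

# Here x.p = -||x|| ||p||, so part b) predicts ||p|| x = -||x|| p.
lhs = tuple(norm(p) * xi for xi in x)
rhs = tuple(-norm(x) * pi for pi in p)
print(lhs == rhs)         # True
```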

Our first application of Theorem 4.5 is to complete the proof of Theorem 4.3 by showing that the Euclidean norm on ℝn satisfies the triangle inequality.

Proof that the Euclidean norm on ℝn satisfies the triangle inequality. Assume x, y ∈ ℝn. The Cauchy inequality implies x•y ≤ | x•y | ≤ || x || || y ||. Hence, by Lemma 4.4.a), || x + y ||² = || x ||² + 2x•y + || y ||² ≤ || x ||² + 2 || x || || y || + || y ||² = ( || x || + || y || )². Since || x + y || and || x || + || y || are both nonnegative, taking square roots preserves the inequality. Therefore, || x + y || ≤ || x || + || y ||. ∎

We are now ready to define the Euclidean metric on ℝn in terms of the Euclidean norm.

Definition. The Euclidean metric d on ℝn is defined by the formula d(x,y) = || x – y ||.

We must verify that the Euclidean metric on ℝn satisfies the three defining conditions for a metric: positivity, symmetry and the triangle inequality.

Theorem 4.6. The Euclidean metric d is a metric on ℝn.

Proof. To prove that d satisfies the positivity condition, let x, y ∈ ℝn. According to Theorem 4.3, the Euclidean norm is a norm and, hence, satisfies a positivity condition. Hence, d(x,y) = || x – y || ≥ 0. Furthermore, the positivity of the Euclidean norm implies that the following statements are equivalent: d(x,y) = 0, || x – y || = 0, x – y = 0 and x = y. Hence, d satisfies the appropriate positivity condition.

To prove that d satisfies the symmetry condition, let x, y ∈ ℝn. According to Theorem 4.3, the Euclidean norm is a norm and, hence, satisfies a homogeneity condition. It follows that d(x,y) = || x – y || = || (–1)( y – x ) || = |–1| || y – x || = || y – x || = d(y,x). Hence, d satisfies the appropriate symmetry condition.

To prove that d satisfies the triangle inequality, let x, y and z ∈ ℝn. According to Theorem 4.3, the Euclidean norm is a norm and, hence, satisfies a triangle inequality. Consequently, d(x,z) = || x – z || = || ( x – y ) + ( y – z ) || ≤ || x – y || + || y – z || = d(x,y) + d(y,z). Hence, d satisfies the appropriate triangle inequality. ∎

Definition. When Cartesian n-space ℝn is endowed with the Euclidean metric, it is called Euclidean n-space and is denoted 𝔼n. Thus, 𝔼1 = ℝ1 = ℝ.

We can define the taxicab and maximum metrics on ℝn from the taxicab and maximum norms, just as we defined the Euclidean metric using the Euclidean norm.

Definition. The taxicab metric dT on ℝn is defined by the formula

dT(x,y) = || x – y ||T.

The maximum metric dM on ℝn is defined by the formula

dM(x,y) = || x – y ||M.
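All three metrics arise from their norms by the same recipe, d(x,y) = || x – y ||, and that recipe is easy to express in code. The sketch below is illustrative only: metric_from and looks_like_metric are hypothetical names, the sample points are made up, and the check covers finitely many points with a small floating-point tolerance, so it is not a proof.

```python
import math
from itertools import product

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

def euclidean_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def taxicab_norm(x):
    return sum(abs(xi) for xi in x)

def maximum_norm(x):
    return max(abs(xi) for xi in x)

def metric_from(norm):
    """Build the metric d(x, y) = norm(x - y) from a norm."""
    def metric(x, y):
        return norm(sub(x, y))
    return metric

d, dT, dM = (metric_from(n) for n in (euclidean_norm, taxicab_norm, maximum_norm))

def looks_like_metric(metric, points, tol=1e-12):
    """Test positivity, symmetry and the triangle inequality on sample points."""
    for x, y in product(points, repeat=2):
        dist = metric(x, y)
        if dist < 0 or (dist == 0) != (x == y):
            return False
        if abs(dist - metric(y, x)) > tol:
            return False
    for x, y, z in product(points, repeat=3):
        if metric(x, z) > metric(x, y) + metric(y, z) + tol:
            return False
    return True

points = [(0, 0), (3, 4), (-1, 2), (2, -5)]
print(d((0, 0), (3, 4)), dT((0, 0), (3, 4)), dM((0, 0), (3, 4)))  # 5.0 7 4
print(all(looks_like_metric(m, points) for m in (d, dT, dM)))
```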

The proof that the Euclidean metric is a metric on ℝn depended only on the fact that the Euclidean norm obeys the positivity and homogeneity conditions and the triangle inequality. Since the taxicab and maximum norms also obey these three conditions, then the same proof shows that the taxicab metric dT and the maximum metric dM are also metrics on ℝn.

We now focus our attention on 𝔼n. The next theorem draws a fundamental connection between a metric relationship among three points in 𝔼n and an algebraic relationship among the three points. This result is quite useful. It will help us prove, among other things, that in 𝔼n the midpoint between each pair of points is unique.

Theorem 4.7. Suppose x, y and z ∈ 𝔼n and x ≠ y. Then d(x,z) + d(z,y) = d(x,y) if and only if

$z = \frac{d(z,y)}{d(x,z) + d(z,y)}\,x + \frac{d(x,z)}{d(x,z) + d(z,y)}\,y$.

One direction of this proof depends essentially on Theorem 4.5.b. We will present this direction of the proof and leave the other direction as a homework problem.

Proof. Assume x, y and z ∈ 𝔼n, x ≠ y and d(x,z) + d(z,y) = d(x,y). Then

( d(x,y) )² = ( d(x,z) + d(z,y) )² = ( d(x,z) )² + 2 d(x,z) d(z,y) + ( d(z,y) )² = ( d(x,z) )² + 2 || x – z || || z – y || + ( d(z,y) )².

Using Lemma 4.4.a, we obtain a different expression for ( d(x,y) )²:

( d(x,y) )² = || x – y ||² = || ( x – z ) + ( z – y ) ||² = || x – z ||² + 2( x – z )•( z – y ) + || z – y ||² = ( d(x,z) )² + 2( x – z )•( z – y ) + ( d(z,y) )².

We equate these two expressions for ( d(x,y) )² to obtain:

( d(x,z) )² + 2 || x – z || || z – y || + ( d(z,y) )² = ( d(x,z) )² + 2( x – z )•( z – y ) + ( d(z,y) )².

Hence, subtracting ( d(x,z) )² + ( d(z,y) )² from both sides of this equation and dividing by 2 yields

( x – z )•( z – y ) = || x – z || || z – y ||.

Thus, equality holds in this particular instance of the Cauchy Inequality, and we can invoke Theorem 4.5.b to obtain

|| z – y || ( x – z ) = || x – z || ( z – y ).

Thus, d(z,y) ( x – z ) = d(x,z) ( z – y ). So

( d(z,y) )x – ( d(z,y) )z = ( d(x,z) )z – ( d(x,z) )y.

Therefore,

( d(z,y) )x + ( d(x,z) )y = ( d(x,z) + d(z,y) )z.

Since x ≠ y, we have d(x,z) + d(z,y) = d(x,y) ≠ 0. Dividing both sides of this equation by d(x,z) + d(z,y), we obtain:

$z = \frac{d(z,y)}{d(x,z) + d(z,y)}\,x + \frac{d(x,z)}{d(x,z) + d(z,y)}\,y$. ∎
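A numerical illustration of Theorem 4.7 may help fix the roles of the two weights. The sketch below computes the weighted combination from the theorem for one point on the segment from x to y and one point off it. The function names, the tolerance and the sample points are all assumptions made for illustration, and floating-point comparisons are done with a tolerance rather than exactly.

```python
import math

def dist(x, y):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def weighted_point(x, y, z):
    """The combination of x and y on the right-hand side of Theorem 4.7."""
    dxz, dzy = dist(x, z), dist(z, y)
    total = dxz + dzy          # nonzero whenever x != y
    return tuple((dzy / total) * a + (dxz / total) * b for a, b in zip(x, y))

def close(p, q, tol=1e-9):
    return all(abs(a - b) <= tol for a, b in zip(p, q))

x = (0.0, 0.0, 0.0)
y = (4.0, 0.0, 3.0)
on_segment = (1.0, 0.0, 0.75)    # x + (1/4)(y - x)
off_segment = (1.0, 2.0, 0.75)   # same point pushed off the segment

for z in (on_segment, off_segment):
    additive = math.isclose(dist(x, z) + dist(z, y), dist(x, y))
    print(additive, close(weighted_point(x, y, z), z))
```

For the point on the segment both tests succeed, and for the point off the segment both fail, which is the "if and only if" of the theorem seen on one example.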

Homework Problem 4.9. Prove the other direction of Theorem 4.7. In other words, prove that if x, y and z ∈ 𝔼n, x ≠ y and

$z = \frac{d(z,y)}{d(x,z) + d(z,y)}\,x + \frac{d(x,z)}{d(x,z) + d(z,y)}\,y$,

then d(x,z) + d(z,y) = d(x,y).

Recall that if x, y and z are points of a metric space X, then z is a midpoint between x and y if d(x,z) = d(y,z) = (1/2)d(x,y).

Theorem 4.8. The uniqueness of midpoints in 𝔼n. If x and y ∈ 𝔼n, x ≠ y and m = (1/2)( x + y ), then m is the unique midpoint between x and y.

Homework Problem 4.10. Prove Theorem 4.8.

Hint. This problem requires you to prove two statements. First, prove m is a midpoint between x and y. In other words, prove d(x,m) = d(y,m) = (1/2)d(x,y). Second, prove that if z is a midpoint between x and y (i.e., z satisfies d(x,z) = d(y,z) = (1/2)d(x,y)), then z = m. Use Theorem 4.7 to prove the second assertion.

Homework Problem 4.11. Suppose x and y ∈ 𝔼n such that x ≠ y. Let m = (1/2)( x + y ) and r = (1/2)d(x,y). Let S[x,y] = S(m,r) = { z ∈ 𝔼n : d(z,m) = r }. Observe that x and y ∈ S[x,y] and d(x,y) = 2r. Call S[x,y] the sphere in 𝔼n that has a diameter with endpoints x and y. Prove: for any z ∈ 𝔼n, z ∈ S[x,y] if and only if ( z – x )•( z – y ) = 0.

Hint: Consider || z – m ||² – || x – m ||².

We devote the rest of this chapter to studying properties of distance preserving functions between Euclidean spaces.

Definition. Let f : 𝔼m → 𝔼n be a function. We say that f is dot-product preserving if f(x)•f(y) = x•y for all x and y ∈ 𝔼m. A dot-product preserving function is also called an orthogonal function. We say that f is norm preserving if || f(x) || = || x || for each x ∈ 𝔼m.

Lemma 4.9. A function f : 𝔼m → 𝔼n is dot-product preserving if and only if f is distance preserving and f(0) = 0.

Proof. Assume f : 𝔼m → 𝔼n is a dot-product preserving function. Then for each x ∈ 𝔼m, || f(x) ||² = f(x)•f(x) = x•x = || x ||². Hence, || f(x) || = || x || for each x ∈ 𝔼m. In other words, f is norm preserving. Now let x and y ∈ 𝔼m. By Lemma 4.4.a, we have:

( d(f(x),f(y)) )² = || f(x) – f(y) ||² = || f(x) ||² – 2f(x)•f(y) + || f(y) ||² and

( d(x,y) )² = || x – y ||² = || x ||² – 2x•y + || y ||².

Since f preserves dot products and norms, then || f(x) || = || x ||, f(x)•f(y) = x•y and || f(y) || = || y ||. It follows that ( d(f(x),f(y)) )² = ( d(x,y) )². So d(f(x),f(y)) = d(x,y). Thus, f is distance preserving. Also, since f is norm preserving, || f(0) || = || 0 || = 0. Hence, f(0) = 0.

Now assume f : 𝔼m → 𝔼n is distance preserving and f(0) = 0. Then for each x ∈ 𝔼m, || f(x) || = || f(x) – 0 || = || f(x) – f(0) || = d(f(x),f(0)) = d(x,0) = || x – 0 || = || x ||. Thus, f is norm preserving. As in the previous paragraph, for all x and y ∈ 𝔼m, Lemma 4.4.a implies:

( d(f(x),f(y)) )² = || f(x) ||² – 2f(x)•f(y) + || f(y) ||² and

( d(x,y) )² = || x ||² – 2x•y + || y ||².

Since f is distance preserving and norm preserving, then d(f(x),f(y)) = d(x,y), || f(x) || = || x || and || f(y) || = || y ||. It follows that 2f(x)•f(y) = 2x•y. Thus, f(x)•f(y) = x•y. Therefore, f is dot-product preserving. ∎

Homework Problem 4.12. If a function from 𝔼m to 𝔼n is norm preserving, must it be distance preserving?

Remark. Since there are distance preserving functions f : ℝᵐ → ℝⁿ such that f(0) ≠ 0, Lemma 4.9 implies that not all distance preserving functions between Euclidean spaces preserve dot products. For example, if a ∈ ℝⁿ and a ≠ 0, and if we define the translation Tₐ : ℝⁿ → ℝⁿ by Tₐ(x) = x + a, then Tₐ is distance preserving, because

d(Tₐ(x),Tₐ(y)) = || Tₐ(x) – Tₐ(y) || = || (x + a) – (y + a) || = || x – y || = d(x,y),

and Tₐ(0) = 0 + a = a ≠ 0. Hence, Lemma 4.9 implies that Tₐ does not preserve dot products. It turns out to be useful to have a property like preservation of dot products (but weaker) that is possessed by all distance preserving functions (not just those that map 0 to itself). To this end, we define the following concept.
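The translation example can be checked numerically. The following Python sketch is an illustration only and is not part of the formal development; the helper names dot, dist and translate, the vector a = (1, 2), and the sample points are our own choices.

```python
import math

def dot(u, v):
    """Dot product of two points given as tuples of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def dist(u, v):
    """Standard Euclidean distance between two points."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

def translate(a):
    """Return the translation T_a, defined by T_a(x) = x + a."""
    return lambda x: tuple(xi + ai for xi, ai in zip(x, a))

a = (1.0, 2.0)                    # a nonzero vector, chosen for illustration
T = translate(a)
x, y = (3.0, -1.0), (0.5, 4.0)    # sample points, chosen for illustration

print(dist(T(x), T(y)), dist(x, y))   # the two distances agree
print(dot(T(x), T(y)), dot(x, y))     # the two dot products differ
print(T((0.0, 0.0)))                  # T_a(0) = a, which is not 0
```

For these sample points the distances coincide while the dot products do not, which is what Lemma 4.9 predicts for a distance preserving function that moves 0.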

Definition. A function f : ℝᵐ → ℝⁿ preserves dot products of differences if for all w, x, y and z ∈ ℝᵐ, ( f(x) – f(w) )•( f(z) – f(y) ) = ( x – w )•( z – y ).

[Figure: points w, x, y and z and their images f(w), f(x), f(y) and f(z) under f, illustrating ( x – w )•( z – y ) = ( f(x) – f(w) )•( f(z) – f(y) ).]

Theorem 4.10. A function f : ℝᵐ → ℝⁿ is distance preserving if and only if f preserves dot products of differences.

Proof. Assume f : ℝᵐ → ℝⁿ is a distance preserving function. Define the function g : ℝᵐ → ℝⁿ by g(x) = f(x) – f(0) for each x ∈ ℝᵐ. We assert that g is distance preserving and g(0) = 0. Indeed, for x and y ∈ ℝᵐ, since f is distance preserving,

d( g(x), g(y) ) = || g(x) – g(y) || = || ( f(x) – f(0) ) – ( f(y) – f(0) ) || = || f(x) – f(y) || = d( f(x), f(y) ) = d(x,y).

Also g(0) = f(0) – f(0) = 0. This proves our assertion. It now follows from Lemma 4.9 that g : ℝᵐ → ℝⁿ is dot-product preserving.

We now prove that f preserves dot products of differences. Let w, x, y and z ∈ ℝᵐ. Then, since g is dot-product preserving,

( f(x) – f(w) )•( f(z) – f(y) )
  = ( ( f(x) – f(0) ) – ( f(w) – f(0) ) )•( ( f(z) – f(0) ) – ( f(y) – f(0) ) )
  = ( g(x) – g(w) )•( g(z) – g(y) )
  = g(x)•g(z) – g(x)•g(y) – g(w)•g(z) + g(w)•g(y)
  = x•z – x•y – w•z + w•y
  = x•( z – y ) – w•( z – y )
  = ( x – w )•( z – y ).

This completes the proof that if f is distance preserving, then f preserves dot products of differences.

Homework Problem 4.13. Complete the proof of Theorem 4.10 by proving that if a function f : ℝᵐ → ℝⁿ preserves dot products of differences, then f is distance preserving. ∎
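The direction of Theorem 4.10 proved above can be illustrated numerically. The following Python sketch is only a spot check on one example, not a proof. The function f below (a rotation of the plane followed by a translation) is our own choice of a distance preserving function with f(0) ≠ 0, and the angle, shift and sample points are arbitrary.

```python
import math

def dot(u, v):
    """Dot product of two points given as tuples of equal length."""
    return sum(ui * vi for ui, vi in zip(u, v))

def sub(u, v):
    """Difference u - v of two points."""
    return tuple(ui - vi for ui, vi in zip(u, v))

def f(p, theta=0.7, shift=(3.0, -2.0)):
    """Rotate p by theta about the origin, then translate by shift.

    This is a hypothetical distance preserving map used for illustration.
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + shift[0], s * p[0] + c * p[1] + shift[1])

w, x, y, z = (1.0, 2.0), (-3.0, 0.5), (4.0, 4.0), (0.0, -1.5)

lhs = dot(sub(f(x), f(w)), sub(f(z), f(y)))   # ( f(x) - f(w) ) . ( f(z) - f(y) )
rhs = dot(sub(x, w), sub(z, y))               # ( x - w ) . ( z - y )

print(math.isclose(lhs, rhs))   # the two values agree up to rounding
```

Because the code uses floating-point arithmetic, the two sides are compared with math.isclose rather than tested for exact equality.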

In-Class Exercise 4.B. Suppose f : ℝᵐ → ℝⁿ is a distance preserving function, and suppose w, x, y and z ∈ ℝᵐ such that w•y = – 2, w•z = – 3, x•y = 4, and x•z = 7. What is the value of ( f(w) – f(x) )•( f(y) – f(z) )?

We finish this chapter by showing that distance preserving functions between Euclidean spaces are affine. Affineness is an algebraic property of a function that will be useful in subsequent developments. We begin by defining this concept.

Definition. A function f : ℝᵐ → ℝⁿ is affine if f( ax + by ) = af(x) + bf(y) for all x and y ∈ ℝᵐ and all a and b ∈ ℝ such that a + b = 1. Equivalently, f : ℝᵐ → ℝⁿ is affine if f( (1 – a) x + a y ) = (1 – a) f(x) + a f(y) for all x and y ∈ ℝᵐ and all a ∈ ℝ.

Remark. If a and b are real numbers such that a + b = 1, then the point z = ax + by lies on the line through x and y. (This fact will be explained further in the next chapter.) Furthermore, the values of a and b determine the position of z relative to x and y. If a is close to 1, then z is close to x, while if b is close to 1, then z is close to y. For example, the point (2/3)x + (1/3)y is on the line through x and y, 1/3 of the way from x to y. Thus, if f : ℝᵐ → ℝⁿ is an affine function and z = ax + by where a + b = 1, then the equation f(z) = af(x) + bf(y) tells us that f(z) lies on the line through f(x) and f(y) and that f(z) has the same position on this line relative to f(x) and f(y) that z has on the line through x and y relative to x and y.

[Figure: the point z = (2/3)x + (1/3)y on the line through x and y, and its image f(z) = (2/3)f(x) + (1/3)f(y) on the line through f(x) and f(y).]
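The example with coefficients 2/3 and 1/3 can be worked out concretely. The following Python sketch uses exact rational arithmetic; the points x and y and the function f are hypothetical choices made only for illustration (f is an affine map of the plane picked by us, not one taken from the text).

```python
from fractions import Fraction

def combine(a, u, b, v):
    """Return a*u + b*v for points u and v given as tuples."""
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))

def f(p):
    """A hypothetical affine map of the plane, chosen only for illustration."""
    return (2 * p[0] + 1, p[0] - p[1] + 3)

x = (Fraction(0), Fraction(0))
y = (Fraction(3), Fraction(6))
a, b = Fraction(2, 3), Fraction(1, 3)   # a + b = 1

z = combine(a, x, b, y)                 # z = (2/3)x + (1/3)y

# z - x equals (1/3)(y - x), so z is 1/3 of the way from x to y.
one_third_along = tuple(xi + Fraction(1, 3) * (yi - xi) for xi, yi in zip(x, y))
print(z == one_third_along)

# f(z) has the same position relative to f(x) and f(y).
print(f(z) == combine(a, f(x), b, f(y)))
```

Using Fraction rather than floating-point numbers lets both comparisons be exact equalities.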

Definition. If x and y ∈ ℝᵐ and a and b ∈ ℝ such that a + b = 1, then ax + by is called an affine combination of x and y. More generally, if x₁, x₂, … , xₖ ∈ ℝᵐ and a₁, a₂, … , aₖ ∈ ℝ such that a₁ + a₂ + … + aₖ = 1, then a₁x₁ + a₂x₂ + … + aₖxₖ is called an affine combination of x₁, x₂, … , xₖ.

Thus, a function f : ℝᵐ → ℝⁿ is affine if it preserves affine combinations of two elements of ℝᵐ. In fact, affine functions preserve affine combinations of any number of elements. This is the content of the next theorem.

Theorem 4.11. Let f : ℝᵐ → ℝⁿ be a function. Then the following statements are equivalent.

a) f : ℝᵐ → ℝⁿ is affine.

b) For each k ≥ 1, for any x₁, x₂, … , xₖ ∈ ℝᵐ and for any a₁, a₂, … , aₖ ∈ ℝ such that a₁ + a₂ + … + aₖ = 1,

f(a₁x₁ + a₂x₂ + … + aₖxₖ) = a₁f(x₁) + a₂f(x₂) + … + aₖf(xₖ).

Proof that b) implies a). Assume f : ℝᵐ → ℝⁿ satisfies the condition stated in b). To prove f is affine, let x and y ∈ ℝᵐ and let a ∈ ℝ. Since (1 – a) + a = 1, then the k = 2 case of the condition stated in b) implies f((1 – a) x + a y) = (1 – a) f(x) + a f(y). This proves f is affine. ∎

Homework Problem 4.14. Complete the proof of Theorem 4.11 by proving that a) implies b).

Hint. Assume f : ℝᵐ → ℝⁿ is affine and use induction on k to establish b).

In-Class Exercise 4.C. Assume the distributive law a(x + y) = ax + ay holds for all real numbers a, x and y. Give an inductive proof that the generalized distributive law

a(x₁ + x₂ + … + xₖ) = ax₁ + ax₂ + … + axₖ

holds for all real numbers a, x₁, x₂, … , xₖ.

Corollary 4.12. If f : ℝᵐ → ℝⁿ is an affine function and x, y and z ∈ ℝᵐ, then f(x – y + z) = f(x) – f(y) + f(z).

Homework Problem 4.15. Prove Corollary 4.12.

We will finish this chapter by proving that all distance preserving functions between Euclidean spaces are affine. In other words, all distance preserving functions between Euclidean spaces preserve affine combinations. This algebraic property of distance preserving functions between Euclidean spaces will turn out to be extremely useful in later chapters. Our proof of this theorem depends on the fact that distance preserving functions preserve dot products of differences (Theorem 4.10). This proof is short and a bit surprising because there is no apparent connection between the property of preserving dot products of differences and affineness.

Theorem 4.13. If the function f : ℝᵐ → ℝⁿ is distance preserving, then f is affine.

Proof. Assume f : ℝᵐ → ℝⁿ is a distance preserving function. To prove f is affine, let x and y ∈ ℝᵐ and let a and b ∈ ℝ such that a + b = 1. Let z = ax + by. We must prove f(z) = af(x) + bf(y). We will accomplish this by using the fact that f preserves dot products of differences to show that d(f(z),af(x) + bf(y)) = 0. Here is the argument.

Since a + b = 1, then using Lemma 4.4.a, we have:

( d(f(z),af(x) + bf(y)) )²
  = || f(z) – ( af(x) + bf(y) ) ||²
  = || ( a + b )f(z) – ( af(x) + bf(y) ) ||²
  = || a( f(z) – f(x) ) + b( f(z) – f(y) ) ||²
  = || a( f(z) – f(x) ) ||² + 2( a( f(z) – f(x) ) )•( b( f(z) – f(y) ) ) + || b( f(z) – f(y) ) ||²
  = a² || f(z) – f(x) ||² + 2ab( ( f(z) – f(x) )•( f(z) – f(y) ) ) + b² || f(z) – f(y) ||²
  = a² ( d(f(z),f(x)) )² + 2ab( ( f(z) – f(x) )•( f(z) – f(y) ) ) + b² ( d(f(z),f(y)) )²

and

( d(z,ax + by) )²
  = || z – ( ax + by ) ||²
  = || ( a + b )z – ( ax + by ) ||²
  = || a( z – x ) + b( z – y ) ||²
  = || a( z – x ) ||² + 2( a( z – x ) )•( b( z – y ) ) + || b( z – y ) ||²
  = a² || z – x ||² + 2ab( ( z – x )•( z – y ) ) + b² || z – y ||²
  = a² ( d(z,x) )² + 2ab( ( z – x )•( z – y ) ) + b² ( d(z,y) )².

Since f preserves distance by hypothesis and f preserves dot products of differences by Theorem 4.10, then d(f(z),f(x)) = d(z,x), d(f(z),f(y)) = d(z,y) and ( f(z) – f(x) )•( f(z) – f(y) ) = ( z – x )•( z – y ). It follows by comparing the above equations for ( d(f(z),af(x) + bf(y)) )² and ( d(z,ax + by) )² that ( d(f(z),af(x) + bf(y)) )² = ( d(z,ax + by) )². Hence, d(f(z),af(x) + bf(y)) = d(z,ax + by). Since z = ax + by, then d(z,ax + by) = 0. Hence, d(f(z),af(x) + bf(y)) = 0. Therefore, f(z) = af(x) + bf(y). We conclude that f is affine. ∎
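The conclusion of Theorem 4.13 can be spot-checked on an example. In the following Python sketch, f is a hypothetical distance preserving function of the plane (a rotation followed by a translation, chosen by us), and the coefficients a = 2.5 and b = –1.5 are an arbitrary pair with a + b = 1. The check computes the distance d(f(z), af(x) + bf(y)) that the proof shows to be 0; it illustrates the theorem but does not replace the proof.

```python
import math

def f(p, theta=1.1, shift=(2.0, -5.0)):
    """Rotate p by theta about the origin, then translate by shift.

    This is a hypothetical distance preserving map used for illustration.
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + shift[0], s * p[0] + c * p[1] + shift[1])

def combine(a, u, b, v):
    """Return the combination a*u + b*v of two points of the plane."""
    return (a * u[0] + b * v[0], a * u[1] + b * v[1])

x, y = (1.0, 4.0), (-2.0, 0.5)
a, b = 2.5, -1.5                      # a + b = 1

z = combine(a, x, b, y)               # z = ax + by
left = f(z)                           # f(z)
right = combine(a, f(x), b, f(y))     # af(x) + bf(y)

gap = math.dist(left, right)          # d( f(z), af(x) + bf(y) )
print(gap)                            # zero up to floating-point rounding
```

Note that a and b need not lie between 0 and 1; here z lies on the line through x and y but outside the segment joining them.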

Remark. Although all distance preserving functions between Euclidean spaces are affine, not all affine functions are distance preserving. For example, the function f : ℝ → ℝ defined by f(x) = 2x is affine but not distance preserving. Indeed, if x and y ∈ ℝ and a and b ∈ ℝ such that a + b = 1, then f(ax + by) = 2(ax + by) = a(2x) + b(2y) = af(x) + bf(y), but d(f(0),f(1)) ≠ d(0,1) because d(f(0),f(1)) = d(2•0,2•1) = d(0,2) = 2 and d(0,1) = 1.

In-Class Exercise 4.D. Suppose f : ℝᵐ → ℝⁿ is a distance preserving function, and suppose x and y ∈ ℝᵐ. Express f(5x + 2y) as a vector sum of scalar multiples of f(x), f(y) and f(0).